OpenStack compute node cleanup

    I've never used OpenStack before, which I imagine is true of many other people out there. It's actually pretty cool, although I encountered a problem the other day that I think is worthy of some more documentation. OpenStack runs virtual machines for users, in much the same manner as Amazon's EC2 system. These instances are started from a base image, and copy on write is then used to record the differences as the instance changes things. This makes sense in a world where a given machine might be running more than one instance of the same image.

    However, I encountered a compute node which was running low on disk. This is because there is currently nothing which cleans up these base images, so even if none of the instances on a machine require an image, and even if the machine is experiencing disk stress, the images still hang around. There are a few blog posts out there about this, but nothing really definitive that I could find. I've filed a bug asking for the Ubuntu package to include some sort of cleanup script, and interestingly that led me to learn that there are plans for a pretty comprehensive image management system. Unfortunately, it doesn't seem that anyone is working on this at the moment. I would offer to lend a hand, but it's not clear to me as an OpenStack n00b where I should start. If you read this and have some pointers, feel free to contact me.

    Anyway, we still need to clean up that node experiencing disk stress. It turns out that nova uses qemu for its copy on write disk images. We can therefore ask qemu which base images are in use. It goes something like this:

      $ cd /var/lib/nova/instances
      $ find . -name "disk*" | xargs -n1 qemu-img info | grep backing | \
        sed -e 's/.*file: //' -e 's/ .*//' | sort -u > /tmp/inuse

    /tmp/inuse will now contain a list of the images in _base that are in use at the moment. Now you can change to the base directory, which defaults to /var/lib/nova/instances/_base, and do some cleanup. I look for large image files which are several days old, check whether they appear in that temporary file, and delete them if they don't.
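
    The manual check described above could be sketched as a dry run along the following lines. This is only a sketch under some assumptions: list_stale_base_images is a made-up helper name, the three day default threshold is arbitrary, and it assumes /tmp/inuse holds one absolute path per line, matching the paths that find prints for the base directory. It only prints candidates and never deletes anything.

```shell
#!/bin/bash
# Dry-run sketch: print files in a base directory that are NOT listed in an
# in-use file and that were last modified more than a given number of days ago.
# list_stale_base_images is a hypothetical helper name, not part of nova.
list_stale_base_images() {
    local base_dir="$1"           # e.g. /var/lib/nova/instances/_base
    local inuse_file="$2"         # e.g. /tmp/inuse, one absolute path per line
    local min_age_days="${3:-3}"  # assumed default threshold, adjust to taste

    # Refuse to continue without an in-use list, since otherwise every
    # old file would be reported as a candidate.
    if [ ! -s "$inuse_file" ]; then
        echo "in-use list $inuse_file is missing or empty; refusing to continue" >&2
        return 1
    fi

    find "$base_dir" -maxdepth 1 -type f -mtime +"$min_age_days" | sort |
    while IFS= read -r image; do
        # -x matches whole lines only, -F treats the path as a fixed string.
        if ! grep -qxF -- "$image" "$inuse_file"; then
            printf '%s\n' "$image"
        fi
    done
}
```

    Something like `list_stale_base_images /var/lib/nova/instances/_base /tmp/inuse 3` would then print the candidates, which can be reviewed by hand (for example with ls -lh to see which are large) before anything is removed.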

    I'm sure that this could be better automated by a simple Python script, but I haven't gotten around to it yet. If I do, I will be sure to mention it here.

posted at: 00:59 | path: /openstack

    ### mathrock

    The image mgmt code just went in:

    https://review.openstack.org/#change,2902
    https://blueprints.launchpad.net/nova/+spec/nova-image-cache-management

    ### Gustavo Randich

    This bash script does the cleanup:

    # Collect the backing files referenced by all instance disks.
    USED=$(find /var/lib/nova/instances -name "disk*" | xargs -n1 qemu-img info | grep backing | sed -e 's/.*file: //' -e 's/ .*//' | sort -u)
    for i in /var/lib/nova/instances/_base/*; do
        USING=0
        for j in $USED; do
            if [ "$i" = "$j" ]; then
                USING=1
            fi
        done
        # Remove only base images that no instance disk refers to.
        if [ "$USING" -eq 0 ]; then
            echo "Removing $i..."
            rm -f "$i"
        fi
    done

    ### Michael Still

    Essex automates this cleanup, so if you're using Essex you probably shouldn't need a script at all any more.
