Create and use OpenStack snapshots

What is a snapshot in OpenStack/OVHcloud Public Cloud?

A snapshot is a mechanism that allows you to create a new image from an existing instance. This mainly serves two purposes:

  1. As a backup mechanism: save the main disk of your instance to an image and later boot a new instance from this image with the saved data.
  2. As a templating mechanism: customise a base image and save it to use as a template for new instances.

Snapshots can be taken of instances while they are either running or stopped. In simple terms, snapshots are images with the following additional properties:

Name             Value
image_type       snapshot
instance_uuid    <uuid of instance that was used for the snapshot>
base_image_ref   <uuid of original image of instance that was used for snapshot>
image_location   snapshot
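
Since snapshots are ordinary images carrying these properties, they can be filtered on them. The sketch below assumes the openstack client is installed and your credentials are loaded; list_snapshots is a hypothetical helper name, not part of the client.

```shell
# Hypothetical helper: list only the images flagged as snapshots.
# Assumes the openstack client is installed and credentials are loaded.
list_snapshots() {
    openstack image list --property image_type=snapshot -f value -c ID -c Name
}
```

Calling list_snapshots should print one "ID Name" pair per line for each snapshot visible to your project.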

How to create a snapshot

Using the CLI

To create a snapshot of an instance using the CLI, use one of the following commands:

# Load your OpenStack credentials
$ source openrc

# Using the openstack client
$ openstack server image create --name <name of the new image> <instance name or uuid>

# Or using the nova client (deprecated)
$ nova image-create <instance name or uuid> <name of the new image>
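
For regular backups, it can help to give each snapshot a dated name and to wait until the image is ready. This is a minimal sketch, assuming credentials are already loaded; snapshot_instance and the naming scheme are illustrative choices, not an OVHcloud convention.

```shell
# Hypothetical helper: snapshot an instance under a dated name and
# print that name once the image has been created.
# $1: instance name or uuid
snapshot_instance() {
    server="$1"
    # UTC timestamp keeps names sortable, e.g. web01-20240131-235959
    name="${server}-$(date -u +%Y%m%d-%H%M%S)"
    # --wait blocks until the image creation has completed
    openstack server image create --wait --name "$name" "$server" >/dev/null &&
        printf '%s\n' "$name"
}
```

For example, snapshot_instance web01 prints the name of the new image, which you can then pass to other commands.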

Using Horizon

Once you’re logged in to Horizon, you can create a snapshot via the Compute → Instances page by clicking on the “Create snapshot” action.

The snapshot’s status and information about it can be found on the Compute → Images page.

You can then select the snapshot when creating a new instance.
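
The same can be done from the CLI. The sketch below assumes the openstack client and loaded credentials; boot_from_snapshot is a hypothetical wrapper, and depending on your project you may also need options such as --network or --key-name.

```shell
# Hypothetical helper: boot a new instance from an existing snapshot.
# $1: snapshot name or uuid, $2: flavor, $3: name of the new instance
boot_from_snapshot() {
    openstack server create --image "$1" --flavor "$2" "$3"
}
```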

Live snapshots and data consistency

We call a snapshot taken against a running instance with no downtime a “live snapshot”.

These snapshots are simply disk-only snapshots, and may be inconsistent if the instance’s OS is not aware of the snapshot being taken.

The inconsistency stems from how the hypervisor takes the snapshot. It briefly freezes the instance to create a “delta” file, then resumes execution so that new writes go to the delta instead of the disk being copied. When the copy is done, the instance is frozen again so the delta can be merged into the instance’s disk, and execution then resumes on the fully merged disk.

Inconsistencies can appear on the first freeze if the instance is not aware that the hypervisor is taking a snapshot, because the applications and the kernel running on the instance are not told to flush their buffers.
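
If some downtime is acceptable, one way to sidestep the problem is to stop the instance before taking the snapshot, so that nothing is left in memory buffers. The following is only a sketch under that assumption; cold_snapshot, the 5-second polling interval and the 60-attempt limit are arbitrary illustrative choices.

```shell
# Hypothetical helper: stop an instance, snapshot it, then start it again.
# $1: instance name or uuid, $2: name of the new image
cold_snapshot() {
    server="$1"
    name="$2"
    openstack server stop "$server" || return 1

    # Poll until Nova reports the instance as SHUTOFF (give up after 60 tries)
    tries=0
    until [ "$(openstack server show -f value -c status "$server")" = "SHUTOFF" ]; do
        tries=$((tries + 1))
        [ "$tries" -lt 60 ] || return 1
        sleep "${POLL_INTERVAL:-5}"
    done

    openstack server image create --wait --name "$name" "$server"
    rc=$?
    # Restart the instance whether or not the snapshot succeeded
    openstack server start "$server"
    return "$rc"
}
```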

Ensuring snapshots are consistent

OpenStack Nova relies on QEMU to manage the virtual machines. QEMU also provides tools to communicate with an agent installed on the instance, in order to allow it to take certain actions before a snapshot.

This communication takes place via a virtual device added to the instance when OpenStack Nova detects that the image used has the following property: hw_qemu_guest_agent set to yes.

On a previously created private image, you can set the property using one of these commands:

# Using the openstack client
$ openstack image set --property hw_qemu_guest_agent=yes <image name or uuid>

# Or using the glance client (deprecated)
$ glance image-update --property hw_qemu_guest_agent=yes <image name or uuid>

To check the properties are indeed set, use the following command:

$ openstack image show -f value -c properties <image name or uuid>
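
To use this check in a script, the output can be piped through a small filter. This sketch assumes the properties are printed either as key='value' pairs or as a Python-style dictionary, which is what different client versions tend to produce; has_guest_agent_property is a hypothetical helper and the pattern may need adjusting for your client.

```shell
# Hypothetical helper: succeed if the properties read on stdin contain
# hw_qemu_guest_agent set to yes.
# Matches both hw_qemu_guest_agent='yes' and 'hw_qemu_guest_agent': 'yes'.
has_guest_agent_property() {
    grep -Eq "hw_qemu_guest_agent'?[=:] ?'?yes"
}

# Usage (requires loaded credentials):
#   openstack image show -f value -c properties <image name or uuid> \
#       | has_guest_agent_property && echo "guest agent enabled"
```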

The following diagram shows the workflow of a snapshot under these conditions:

The specific step that prevents any inconsistencies is #6: the QEMU agent freezes the filesystem.

Configuring the QEMU agent

Linux

The qemu-guest-agent is not installed by default, but once it is installed and started, the filesystem freeze/thaw mechanism will work straight out of the box.

You can check that your instance is set up for communication with the hypervisor by checking the specific device:

$ file /dev/virtio-ports/org.qemu.guest_agent.0
/dev/virtio-ports/org.qemu.guest_agent.0: symbolic link to `../vport2p1'

If this file is not present, the QEMU guest agent will not work. This usually means your image does not have the hw_qemu_guest_agent property set to yes.
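
The same check can be scripted, for instance in a provisioning step. This is a minimal sketch; check_agent_device and the GA_DEVICE variable are hypothetical names, and the default path is the one shown above.

```shell
# Path of the virtio port used by the guest agent (overridable for testing)
GA_DEVICE="${GA_DEVICE:-/dev/virtio-ports/org.qemu.guest_agent.0}"

# Hypothetical helper: report whether the guest agent device exists.
check_agent_device() {
    if [ -e "$GA_DEVICE" ]; then
        echo "guest agent device present"
    else
        echo "guest agent device missing: check hw_qemu_guest_agent on the image" >&2
        return 1
    fi
}
```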

Debian-based distributions (Debian, Ubuntu)

# Install the agent
user@agent:~$ sudo apt-get update
user@agent:~$ sudo apt-get install qemu-guest-agent

# Check the agent is started (it should be automatically started and enabled)
user@agent:~$ sudo service qemu-guest-agent status

Red Hat-based distributions (CentOS, Fedora)

# Install the agent
user@agent:~$ sudo yum install qemu-guest-agent

# Enable the agent
user@agent:~$ sudo chkconfig qemu-guest-agent on

# Start the agent
user@agent:~$ sudo service qemu-guest-agent start

# Check the agent is started
user@agent:~$ sudo service qemu-guest-agent status
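
On releases that use systemd, the chkconfig and service commands are typically replaced by systemctl. The sketch below picks whichever is available; enable_guest_agent is a hypothetical helper, and it assumes the service is named qemu-guest-agent as in the commands above.

```shell
# Hypothetical helper: enable and start the agent with whichever
# service manager is available on the instance.
enable_guest_agent() {
    if command -v systemctl >/dev/null 2>&1; then
        sudo systemctl enable qemu-guest-agent &&
            sudo systemctl start qemu-guest-agent
    else
        sudo chkconfig qemu-guest-agent on &&
            sudo service qemu-guest-agent start
    fi
}
```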

Windows

Download and install the MSI related to your architecture (32 or 64 bit versions, although we recommend 64 bits for Public Cloud) from the Fedora project: https://fedorapeople.org/groups/virt/virtio-win/direct-downloads/latest-qemu-ga/

You can check the service is running by using this PowerShell command:

PS C:\Users\Administrator> Get-Service QEMU-GA

Status   Name               DisplayName
------   ----               -----------
Running  QEMU-GA            QEMU Guest Agent

For more details, see the Fedora documentation on creating Windows images with virtIO drivers. [3]

Advanced usage: QEMU agent hooks

It is possible to add scripts that will run before the filesystem is frozen by the agent and after it is thawed.

The example below was done on Debian 9, so the configuration might need to be adjusted for other distributions.

Also, some distributions already provide the fsfreeze-hook script.

Add the fsfreeze-hook script

First, we need to add and activate the fsfreeze-hook mechanism:

# Create the folders to receive the hooks
debian@agent:~$ sudo mkdir -p /etc/qemu/fsfreeze-hook.d

# Download the fsfreeze-hook from the QEMU repository
debian@agent:~$ sudo wget -O /etc/qemu/fsfreeze-hook https://raw.githubusercontent.com/qemu/qemu/master/scripts/qemu-guest-agent/fsfreeze-hook
debian@agent:~$ sudo chmod +x /etc/qemu/fsfreeze-hook

# Add the configuration of the qemu-guest-agent daemon to use this script
debian@agent:~$ sudo tee /etc/default/qemu-guest-agent > /dev/null <<EOF
DAEMON_ARGS="-F/etc/qemu/fsfreeze-hook"
EOF

# Restart the service to take the modifications into account
debian@agent:~$ sudo service qemu-guest-agent restart

Example hook script

The /etc/qemu/fsfreeze-hook script allows users to add custom scripts, to be run before and after the filesystem freeze.

Let’s add a test hook that writes to a file when the instance is being frozen and thawed:

debian@agent:~$ sudo tee /etc/qemu/fsfreeze-hook.d/test_hook.sh > /dev/null <<EOF
#!/bin/bash

case \$1 in
 freeze)
   echo "I'm frozen" > /tmp/freeze
   ;;
 thaw)
   echo "I'm thawed" >> /tmp/freeze
   ;;
 *)
   exit 1
   ;;
esac
EOF

debian@agent:~$ sudo chmod +x /etc/qemu/fsfreeze-hook.d/test_hook.sh
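
A slightly more defensive variant is sketched below. It flushes buffers before the freeze and records each event with a timestamp, while making sure a logging failure never causes the hook itself to fail. The file name, the LOG_FILE path and the handle_event function are hypothetical choices for illustration.

```shell
#!/bin/sh
# Hypothetical hook, e.g. /etc/qemu/fsfreeze-hook.d/10-log.sh
# Log destination is an assumption; override with LOG_FILE if needed.
LOG_FILE="${LOG_FILE:-/var/log/fsfreeze-hook.log}"

# Append a timestamped event to the log; never fail on logging errors.
log_event() {
    printf '%s %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$1" >> "$LOG_FILE" 2>/dev/null || true
}

handle_event() {
    case "$1" in
        freeze)
            # Ask the kernel to write dirty buffers before the freeze
            sync
            log_event freeze
            ;;
        thaw)
            log_event thaw
            ;;
        *)
            echo "usage: $0 freeze|thaw" >&2
            return 1
            ;;
    esac
}

# The fsfreeze-hook script passes "freeze" or "thaw" as the first argument
[ "$#" -eq 0 ] || handle_event "$1"
```

Keeping the logging non-fatal is a deliberate choice here, since a failing hook affects the snapshot itself.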

Be very careful with custom hook scripts. If one fails, the snapshot will be totally abandoned and destroyed. If you cannot find the image that was supposed to be created with the snapshot, it is probably because one of the scripts has failed. In this case, check the instance’s qemu-agent log.

Take a snapshot of your instance:

$ openstack server image create --name test_snapshot <instance name or uuid>
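
Since a failing hook can leave you without an image, it may be worth confirming that the snapshot exists before going further. This is a minimal sketch, assuming loaded credentials; snapshot_status is a hypothetical helper that prints the image status, or "missing" when the image cannot be found.

```shell
# Hypothetical helper: print the status of an image, or "missing"
# (with a non-zero return code) if the client cannot find it.
# $1: image name or uuid
snapshot_status() {
    if status=$(openstack image show -f value -c status "$1" 2>/dev/null); then
        printf '%s\n' "$status"
    else
        echo "missing"
        return 1
    fi
}
```

For example, snapshot_status test_snapshot should print the image status once the snapshot has been created.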

Check the test hook has been run:

# It works!
debian@agent:~$ sudo cat /tmp/freeze
I'm frozen
I'm thawed

Clean up the test:

debian@agent:~$ sudo rm /etc/qemu/fsfreeze-hook.d/test_hook.sh /tmp/freeze

Why isn’t this enabled by default?

The qemu-guest-agent is not installed by default on most distributions. So why don’t we add it by default on the base images we provide?

The problem is that we are not legally allowed to change the content of the base images provided by Ubuntu, so for the sake of uniformity, we don’t install the package on any distribution.

Sources:

  • Sébastien Han’s “OpenStack: Perform Consistent Snapshots” blog entry [1]
  • The proxmox wiki article on QEMU guest agent [2]
  • The Fedora documentation on creating Windows images with virtIO drivers [3]

DevOps in the Public Cloud teams at OVH, Pierre focuses on pushing new and exciting products out to our customers.