Features/Snapshots

Live Snapshots

This document describes the current design of live snapshots for QEMU. It is a work in progress and may change as development continues.

Overall concept

The idea is to issue a command to QEMU, via the monitor or QMP, that causes QEMU to create a new snapshot image which uses the original image as its backing file, opened read-only. This allows the original image file to be backed up.

Rolling back to a previous version requires booting from the previous backing file, at which point the snapshot file becomes invalid. Unfortunately there is no way to detect that a backing file has been booted, so administrators must take care not to rely on snapshot files being valid after a roll-back.

The snapshot image has to be in a format which supports backing files, i.e. QCOW2 (and QED when the code is integrated). The original image, however, can be of any supported format. For example, it is possible to make a QCOW2 snapshot of a RAW image, or a QED snapshot of a QED image.

Guest Agent

Certain operations in the snapshot process can be optimized or improved through support from within the guest. These features will be implemented in the Guest Agent. Please check the Guest Agent page for design and implementation details.

High level design

Snapshot procedure:

  1. Run the VM
  2. (Optional) Admin tool calls the Agent in the guest, requesting a consistent state
  3. (Optional) Admin tool pauses QEMU (mandatory if more than one device is to be snapshotted simultaneously)
  4. Issue the snapshot command to QEMU: the command specifies which block device is to be snapshotted, as well as the filename of the QCOW2 snapshot image (see below).
  5. QEMU creates the new snapshot image
  6. VM is halted by QEMU (if the management tool did not already pause the VM)
  7. Flush pending I/Os
  8. Switch the VM to the new snapshot image, using the original image as a read-only backing file.
  9. If the VM was not paused at the start of the snapshot command, QEMU will restart the VM
  10. If the admin tool paused the VM first, then the admin tool must restart the VM
  11. (Optional) Admin tool calls the Agent in the guest, with the run command
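
The admin-tool side of the procedure above can be sketched as follows. This is a minimal illustration, not QEMU code: `RecordingMonitor` is a hypothetical stand-in for a real monitor connection that only records the command lines it is given, and `freeze_guest` / `thaw_guest` are hypothetical callbacks standing in for the Guest Agent calls of steps 2 and 11. The sketch assumes the `snapshot_blkdev` syntax described in the next section and uses the standard monitor commands `stop` and `cont` for pausing and resuming. Steps 5 to 9 happen inside QEMU and are not modelled.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Sequence, Tuple


@dataclass
class RecordingMonitor:
    """Hypothetical stand-in for a monitor connection; it only records commands."""
    sent: List[str] = field(default_factory=list)

    def command(self, line: str) -> None:
        self.sent.append(line)


def take_snapshots(
    monitor: RecordingMonitor,
    targets: Sequence[Tuple[str, str]],          # (device, snapshot-file) pairs
    fmt: Optional[str] = None,                   # optional image format argument
    freeze_guest: Optional[Callable[[], None]] = None,  # hypothetical agent call
    thaw_guest: Optional[Callable[[], None]] = None,    # hypothetical agent call
) -> None:
    if not targets:
        raise ValueError("at least one device must be given")

    # Step 3: pausing is mandatory when several devices are snapshotted together.
    pause = len(targets) > 1

    if freeze_guest is not None:
        freeze_guest()                           # step 2 (optional)
    try:
        if pause:
            monitor.command("stop")
        try:
            for device, snapshot_file in targets:
                parts = ["snapshot_blkdev", device, snapshot_file]
                if fmt is not None:
                    parts.append(fmt)
                monitor.command(" ".join(parts))  # step 4; QEMU performs steps 5-9
        finally:
            if pause:
                monitor.command("cont")          # step 10: whoever paused must resume
    finally:
        if thaw_guest is not None:
            thaw_guest()                         # step 11 (optional)
```

With two targets, the recorded sequence is `stop`, two `snapshot_blkdev` lines, then `cont`. With a single target and no explicit pause, only the `snapshot_blkdev` line is sent and QEMU halts and restarts the VM itself.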

Monitor command

The monitor command is designed to be flexible enough to handle both internal and external snapshots, as well as snapshots to different snapshot file formats.

snapshot_blkdev device snapshot-file [format]:

device block device to snapshot
snapshot-file target snapshot file
format format of snapshot image; valid formats are QCOW2 and QED (when merged upstream). If not specified, the image will default to QCOW2.

QMP command

The QMP command matches the behaviour of the human monitor command, except that it is named slightly differently to reflect the fact that the command is synchronous.

blockdev-snapshot-sync device snapshot-file [format]

device device name to snapshot (json-string)
snapshot-file name of new image file (json-string)
format format of new image (json-string, optional)
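
As an illustration, the command could be encoded on the wire as shown below. This is a sketch that assumes the usual QMP envelope of an `execute` name plus an `arguments` object. The helper name, device name and file path are hypothetical. A real client would also need to complete the QMP capabilities negotiation (`qmp_capabilities`) before sending any command.

```python
import json
from typing import Optional


def blockdev_snapshot_sync_message(
    device: str, snapshot_file: str, fmt: Optional[str] = None
) -> str:
    """Build the JSON text for a blockdev-snapshot-sync request (hypothetical helper)."""
    arguments = {"device": device, "snapshot-file": snapshot_file}
    if fmt is not None:
        # 'format' is optional; it is left out entirely when not requested.
        arguments["format"] = fmt
    return json.dumps({"execute": "blockdev-snapshot-sync", "arguments": arguments})


if __name__ == "__main__":
    # Hypothetical device name and path, for illustration only.
    print(blockdev_snapshot_sync_message("virtio0", "/images/snap1.qcow2", "qcow2"))
```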

Future features

Internal snapshots to images which support internal snapshots (QCOW2 & QED) are not expected to be supported initially.

There have been requests and suggestions for a number of alternative and enhanced interfaces for accessing live snapshots as follows:

internal snapshots

Making the snapshot-file argument of the monitor and QMP command optional could serve as a request to make the snapshot internally instead of to an external file. However, without live block migration of an internal snapshot, there is no way to make a backup of an internal snapshot while still leaving the VM running, so this feature is not planned at present. For now, the snapshot-file argument is required, and only external snapshots are implemented.

fd passed as target for snapshot file/device

To get around problems with SELinux, in particular in conjunction with images based on NFS, it has been requested that an already open file descriptor can be passed using the getfd interface.

However, this poses a number of problems. When creating the COW headers for the new image file, QEMU needs to know the file name of the disk image the header points to. On Linux this can be obtained through /proc/self/fd/<X>, but this is not available on all other operating systems.
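
The Linux-only lookup mentioned above can be illustrated with a short sketch. It is written in Python purely for demonstration (QEMU itself is written in C), and the function name is hypothetical. It relies on `/proc/self/fd/<X>` being a symbolic link to the opened path, which holds on Linux but not on every platform.

```python
import os


def filename_for_fd(fd: int) -> str:
    """Return the path an open descriptor refers to, using Linux procfs.

    Hypothetical helper: raises OSError where /proc/self/fd does not exist.
    """
    if not os.path.isdir("/proc/self/fd"):
        raise OSError("/proc/self/fd is not available on this platform")
    # Each entry is a symlink to the file the descriptor was opened on.
    # For a file that has since been unlinked, Linux appends " (deleted)".
    return os.readlink(f"/proc/self/fd/{fd}")
```

The platform check makes the portability problem explicit: on systems without this procfs layout, the file name simply cannot be recovered from the descriptor this way.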

An alternative solution would be to extend the getfd interface to take an optional file name. However, this is ugly and error prone, as it would allow a broken or hostile controller to create an image which points to the wrong place, and this would not be discovered until the image was actually booted from.

Allowing the controlling application to create the COW headers in the new image is not an acceptable solution. It is race prone and would cause problems for COW formats where the new COW headers include state as of when they are created.

Separating into multiple commands

There are suggestions for splitting the snapshot process into multiple monitor/QMP commands. The process would be split as follows, using human monitor style commands as an example:

(qemu) guest-agent-fsfreeze

Call guest agent requesting it to freeze all file systems and flush all I/O requests.

(qemu) freeze-io <blockX>

Instruct QEMU to freeze all I/O processing for block device <blockX>

(qemu) getfd <fd> snapshotfd

Provide file descriptor <fd> and assign it the logical name snapshotfd

(qemu) snapshot-blkdev-async <blockX> fd:snapshotfd <format>

Initiate an asynchronous snapshot of device <blockX> to the newly passed file descriptor snapshotfd. This will write the COW headers to the snapshot device, and pivot the block device <blockX> to point to the new device, using the original file/device as its backing file. It is important to note that it is QEMU which will generate the COW headers in the new snapshot file; creating these externally will not be allowed!

On completion, a notification will be returned to the caller, hence this will require QAPI to be in place for proper async QMP command support.

(qemu) thaw-io <blockX>

Un-freeze I/O processing for device <blockX>

(qemu) guest-agent-fsthaw

Call guest agent requesting it to thaw/unfreeze all file systems within the guest.

(qemu) snapshot-blkdev-status <blockX>

Query the current snapshot status of <blockX>. In addition some form of notification of completion will be required.

Note that the caller can loop over the commands freeze-io, getfd, snapshot-blkdev-async, and thaw-io to snapshot multiple block devices in one guest.
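
The looping described above can be sketched as a function that produces the command sequence for several devices. The command names are the proposed ones from this section and do not exist in QEMU. The per-device fd names (`snapshotfd0`, `snapshotfd1`, ...) and the function name are assumptions made for the illustration. The sketch only shows ordering: a real caller would presumably wait for the completion notification of each snapshot-blkdev-async before issuing thaw-io.

```python
from typing import Iterable, List, Tuple


def proposed_command_sequence(
    devices: Iterable[Tuple[str, int]],   # (block device name, open fd number)
    fmt: str = "qcow2",
) -> List[str]:
    """List the proposed monitor commands, in order, for snapshotting each device."""
    commands = ["guest-agent-fsfreeze"]   # freeze guest file systems once
    for index, (device, fd) in enumerate(devices):
        fd_name = f"snapshotfd{index}"    # hypothetical unique logical name per fd
        commands += [
            f"freeze-io {device}",
            f"getfd {fd} {fd_name}",
            f"snapshot-blkdev-async {device} fd:{fd_name} {fmt}",
            f"thaw-io {device}",
        ]
    commands.append("guest-agent-fsthaw")  # thaw guest file systems once
    return commands
```

The guest-level freeze and thaw bracket the whole loop, so all devices are snapshotted while the guest file systems are in the same frozen state.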

Live merge

See http://wiki.qemu.org/Features/LiveBlockMigration

Other proposed qemu features that solve similar or related problems

Snapshots2 and Livebackup

snapshots2
livebackup