Features/Snapshots
Live Snapshots
This document describes the current design of live snapshots for QEMU. It is a work in progress and details may change as the work proceeds.
Overall concept
The idea is to be able to issue a command to QEMU via the monitor or QMP, which causes QEMU to create a new snapshot image with the original image as the backing file, mounted read-only. This will allow the original image file to be backed up.
Roll-back to a previous version requires one to boot from the previous backing file, at which point the snapshot file becomes invalid. Unfortunately there is no way to detect that a backing file has been booted, making it important for administrators to take care to not rely on snapshot files being valid after a roll-back.
The snapshot image must be in a format that supports backing files, i.e. QCOW2 (and QED once that code is integrated). The original image, however, can be of any supported format. For example, it is possible to make a QCOW2 snapshot of a RAW image.
High level design
Snapshot procedure:
- Run the VM
- (Optional) Admin tool calls Agent in guest, requesting consistent state
- (Optional) Admin tool pauses QEMU (mandatory if more than one device is to be snapshotted simultaneously)
- Issue snapshot command to QEMU: The command will specify which block device is to be snapshot, as well as the filename of the QCOW2 snapshot image (see below).
- QEMU creates new snapshot image
- VM is halted by QEMU (if the management tool did not already pause the VM)
- Flush pending IOs
- Switch the VM to the new snapshot image, using the original image as a read-only backing file.
- If the VM was not paused at the start of the snapshot command, QEMU will restart the VM
- If admin tool paused VM first, then admin tool must restart VM
- (Optional) Admin tool calls Agent in guest, with run command
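The ordering above can be sketched from the admin tool's side. This is a minimal illustration rather than QEMU code: plan_snapshot is a hypothetical helper that only builds the list of QMP messages to send. It assumes that the standard QMP stop and cont commands are used to pause and resume the VM, that the snapshot itself is requested with the blockdev-snapshot-sync command described under "QMP command", and it leaves out the optional guest agent steps.

```python
def plan_snapshot(snapshots, fmt="qcow2"):
    """Return the ordered QMP messages for snapshotting one or more devices.

    snapshots: list of (device, snapshot_file) pairs (names are illustrative).
    """
    if not snapshots:
        raise ValueError("at least one device is required")

    # The procedure makes the pause mandatory when more than one
    # device is snapshotted in the same operation.
    pause_first = len(snapshots) > 1

    plan = []
    if pause_first:
        plan.append({"execute": "stop"})
    for device, snapshot_file in snapshots:
        plan.append({
            "execute": "blockdev-snapshot-sync",
            "arguments": {
                "device": device,
                "snapshot-file": snapshot_file,
                "format": fmt,
            },
        })
    if pause_first:
        # The admin tool paused the VM, so the admin tool must resume it.
        plan.append({"execute": "cont"})
    return plan
```

With a single device the plan contains only the snapshot command, since QEMU halts and restarts the VM itself in that case.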
Monitor command
The monitor command is designed to be flexible enough to handle both internal and external snapshots, as well as snapshots to various different snapshot file formats.
snapshot_blkdev device snapshot-file [format]:
device | block device to snapshot |
snapshot-file | target snapshot file |
format | format of snapshot image; valid formats are QCOW2 and QED (when merged upstream). If not specified, the image will default to QCOW2. |
QMP command
The QMP command matches the behaviour of the human monitor command, except it is named slightly differently to match the fact that the command is synchronous.
blockdev-snapshot-sync device snapshot-file [format]
device | device name to snapshot (json-string) |
snapshot-file | name of new image file (json-string) |
format | format of new image (json-string, optional) |
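As an illustration of the arguments listed above, the following Python sketch serializes a blockdev-snapshot-sync request in the usual QMP message shape (an "execute" name plus an "arguments" object). The helper name, the device name and the file path are made up for the example; only the command name and its three arguments come from this document.

```python
import json


def blockdev_snapshot_sync(device, snapshot_file, fmt=None):
    """Serialize a blockdev-snapshot-sync request as a QMP JSON message."""
    arguments = {"device": device, "snapshot-file": snapshot_file}
    # "format" is optional; when omitted, QEMU defaults to QCOW2.
    if fmt is not None:
        arguments["format"] = fmt
    return json.dumps({"execute": "blockdev-snapshot-sync",
                       "arguments": arguments})


# Hypothetical device name and path, for illustration only.
message = blockdev_snapshot_sync("ide0-hd0", "/images/snap1.qcow2")
```

The resulting string would be written to the QMP socket after the usual capabilities negotiation.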
Future features
Internal snapshots to images which support internal snapshots (QCOW2 & QED) are not expected to be supported initially.
There have been requests and suggestions for a number of alternative and enhanced interfaces for accessing live snapshots as follows:
internal snapshots
By making the snapshot-file argument of the monitor and QMP command optional, its absence could be used as a request to make the snapshot internally instead of to an external file. However, without live block migration of an internal snapshot, there is no way to make a backup of an internal snapshot while still leaving the VM running, so this feature is not planned at present. For now, the snapshot-file argument is required, and only external snapshots are implemented.
fd passed as target for snapshot file/device
To get around problems with SELinux, in particular in conjunction with images based on NFS, there is a wish to be able to pass an already open file descriptor using the getfd interface.
However, this poses a number of problems. When the COW headers for the new image file are created, the COW header needs to know the file name of the disk image it is pointing to. On Linux this can be obtained through /proc/self/fd/<X>, but this is not available on all other operating systems.
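The Linux-only lookup mentioned above can be sketched as follows. This Python fragment only illustrates the mechanism; it is not how QEMU implements it, and filename_for_fd is a hypothetical name. It returns None where /proc is not available, which is the portability problem described.

```python
import os


def filename_for_fd(fd):
    """Return the path an open descriptor refers to, or None if unknown."""
    link = "/proc/self/fd/%d" % fd
    try:
        # On Linux each entry under /proc/self/fd is a symlink to the
        # object the descriptor refers to.
        return os.readlink(link)
    except OSError:
        # No /proc (non-Linux host) or the descriptor is not open.
        return None
```

Even on Linux the result is only meaningful for a regular file that still exists under that name: a deleted file is reported with a " (deleted)" suffix, and sockets or pipes do not resolve to a file system path at all.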
An alternative solution would be to extend the getfd interface to take an optional file name. However this is ugly and error prone, as it would allow a broken/hostile controller to create an image which points to the wrong place, but which wouldn't be discovered until the time where the image was actually being booted from.
Allowing the controlling application to create the COW headers in the new image is not an acceptable solution. It is race prone and would cause problems for COW formats where the new COW headers include state as of when they are created.
Separating into multiple commands
There are suggestions for splitting the snapshot process into multiple monitor/QMP commands. The process would be split as follows, using human monitor style commands as an example:
(qemu) guest-agent-fsfreeze
Call guest agent requesting it to freeze all file systems and flush all I/O requests.
(qemu) freeze-io <blockX>
Instruct QEMU to freeze all I/O processing for block device <blockX>
(qemu) getfd <fd> snapshotfd
Provide file descriptor <fd> and assign it the logical name snapshotfd
(qemu) snapshot-blkdev-async <blockX> fd:snapshotfd <format>
Initiate an asynchronous snapshot of device <blockX> to the newly provided file descriptor snapshotfd. This will write the COW headers to the snapshot device, and pivot the block device <blockX> to point to the new device, using the original file/device as its backing file.
(qemu) thaw-io <blockX>
Un-freeze I/O processing for device <blockX>
(qemu) guest-agent-fsthaw
Call guest agent requesting it to thaw/unfreeze all file systems within the guest.
(qemu) snapshot-blkdev-status <blockX>
Query the current snapshot status of <blockX>. In addition some form of notification of completion will be required.
Note that the caller can loop over the commands freeze-io, getfd, snapshot-blkdev-async, and thaw-io to snapshot multiple block devices in one guest.
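The multi-device loop could look roughly like the following Python sketch, which only generates the human monitor command strings in order. All of the commands are the proposed ones listed in this section, not an existing interface, and the helper name, device names and descriptor numbers are invented for the example.

```python
def multi_device_commands(devices, fmt="qcow2"):
    """Build the proposed command sequence for several block devices.

    devices: list of (block_device, fd) pairs, one target fd per device.
    """
    # Freeze the guest file systems once, before the per-device loop.
    commands = ["guest-agent-fsfreeze"]
    for block, fd in devices:
        commands.append("freeze-io %s" % block)
        commands.append("getfd %d snapshotfd" % fd)
        commands.append("snapshot-blkdev-async %s fd:snapshotfd %s"
                        % (block, fmt))
        commands.append("thaw-io %s" % block)
    commands.append("guest-agent-fsthaw")
    # The snapshots are asynchronous, so query each device afterwards.
    for block, _fd in devices:
        commands.append("snapshot-blkdev-status %s" % block)
    return commands
```

The sketch reuses the logical name snapshotfd for every device, as in the listing above; a real caller might prefer one name per device.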