Livebackup provides the ability for an administrator or a management server to use a livebackup_client program to connect to the qemu process and copy the disk blocks that were modified since the last backup was taken.
Jagane Sundar (jagane at sundar dot org)
The goal of this project is to add the ability to do full and incremental disk backups of a running VM. These backups will be transferred over a TCP connection to a backup server, and the virtual disk images will be reconstituted there. This project does not transfer the memory contents of the running VM, or the device states of emulated devices, i.e. livebackup is not VM suspend.
qemu is enhanced to maintain an in-memory dirty blocks bitmap for each virtual disk. A rudimentary network protocol for transferring the dirty blocks from qemu to another machine, possibly a dedicated backup server, is included. A new command line utility livebackup_client is developed. livebackup_client runs on the backup server, connects to qemu, transfers the dirty blocks, and updates or creates the backup version of all virtual disk images.
Livebackup introduces the notion of a livebackup snapshot that is created by qemu when the livebackup_client connects to qemu and is destroyed when all the dirty blocks have been transfered. When this snapshot is active, VM writes that would overwrite dirty blocks require the original blocks to be read and saved to a COW file.
Livebackup is fully self sufficient, and works with all types of disks and all disk images formats.
Currently, Livebackup is available as an enhancement to both the qemu and qemu-kvm projects. The goal is to get Livebackup accepted by the qemu project, and then flow through to the qemu-kvm project.
Clone git://github.com/jagane/qemu-livebackup.git to get access to a clone of the qemu source tree with Livebackup enhancements.
Clone git://github.com/jagane/qemu-kvm-livebackup.git to get access to a clone of the qemu-kvm source tree with Livebackup enhancements.
Today IaaS cloud platforms such as EC2 provide you with the ability to have two types of virtual disks in VM instances
I think that an efficient disk backup mechanism will enable a third type of virtual disk - one that is backed up, perhaps every hour or so. So a cloud operator using KVM virtual machines can offer three types of VMS:
In the following example, note the new command line parameters - livebackup_port, livebackup_dir and livebackup=on
# ./x86_64-softmmu/qemu-system-x86_64 -drive file=/dev/kvm_vol_group/kvm_root_part,boot=on,if=virtio,livebackup=on -drive file=/dev/kvm_vol_group/kvm_disk1,if=virtio,livebackup=on -m 512 -net nic,model=virtio,macaddr=52:54:00:00:00:01 -net tap,ifname=tap0,script=no,downscript=no -vnc 0.0.0.0:1000 -usb -usbdevice tablet -livebackup_dir /root/kvm/livebackup -livebackup_port 7900
The following is the client used to invoke livebackup client:
# livebackup_client /root/kvm-backup 192.168.1.220 7900
When this command is invoked, livebackup_client connects to the qemu process on machine 192.168.1.220 listening on port 7900 and transfers over the dirty blocks and creates virtual disk images in the local directory /root/kvm-backup
snapshots
snapshots2
In both of these proposals, the original virtual disk is made read-only and the VM writes to a different COW file. After backup of the original virtual disk file is complete, the COW file is merged with the original vdisk file.
Instead, I create an Original-Blocks-COW-file to store the original blocks that are overwritten by the VM everytime the VM performs a write while the backup is in progress. Livebackup copies these underlying blocks from the original virtual disk file before the VM's write to the original virtual disk file is scheduled. The advantage of this is that there is no merge necessary at the end of the backup, we can simply delete the Original-Blocks-COW-file.
Here's Stefan Hajnoczi's feedback (Thanks, Stefan)
Here's what I understand:
1. User takes a snapshot of the disk, QEMU creates old-disk.img backed by the current-disk.img.
2. Guest issues a write A.
3. QEMU reads B from current-disk.img.
4. QEMU writes B to old-disk.img.
5. QEMU writes A to current-disk.img.
6. Guest receives write completion A.
The tricky thing is what happens if there is a failure after Step 5. If writes A and B were unstable writes (no fsync()) then no ordering is guaranteed and perhaps write A reached current-disk.img but write B did not reach old-disk.img. In this case we no longer have a consistent old-disk.img snapshot - we're left with an updated current-disk.img and old-disk.img does not have a copy of the old data.
The solution is to fsync() after Step 4 and before Step 5 but this will hurt performance. We now have an extra read, write, and fsync() on every write.
When this happens, the livebackup_client is forced to do a full backup the next time around. Here's how: livebackup writes out the in-memory dirty bitmap to a dirty bitmap file only at the time of orderly shutdown of qemu. Hence, the mtime of the virtual disk file is later than the mtime of the livebackup dirty bitmap file. This causes livebackup to consider the dirty bitmap invalid, and forces the livebackup_client to do a full backup next time around.
In this case also, the livebackup_client is forced to do a full backup the next time around. The dirty bitmap file, the COW file used to store blocks written while a livebackup is in progress, are all deleted, and the livebackup client is forced to do a full backup next time around.
In this case, a new livebackup_client may be started, and it can redo the last type of backup it was doing - an incremental backup or a full backup. Note all the blocks of the last backup type need to be transferred over again, the qemu livebackup code does not keep track of what block the client was at. It does not need to be a forced full backup.
The basic guarantee that Livebackup provides is this: A backup of all the virtual disks of a VM will be created such that the backup virtual disk images will be a bit for bit match of the virtual disk images at the point in time when the livebackup_client program issues a 'create snapshot' command. Once the Livebackup code was written, I performed a full backup followed by several hundred incremental backups, while the VM was performing intense disk I/O. The backup disk images booted fine, and the data seemed good. However, this is proof of nothing. To actually state with certainty that the images were a bit for bit match, I came up with the following technique:
# /sbin/lvcreate -L1G -s -n kvm_root_part_backup /dev/kvm_vol_group/kvm_root_part
# cmp /dev/kvm_vol_group/kvm_root_part_backup kvm_root_part # cmp /dev/kvm_vol_group/kvm_disk1_backup ./kvm_disk1
It is possible to create a new LVM partition for each virtual disk in the VM. When a VM needs to be backed up, each of these LVM partitions is snapshotted. At this point things get messy - I don't really know of a good way to identify the blocks that were modified since the last backup. Also, once these blocks are identified, we need a mechanism to transfer them over a TCP connection to the backup server. Perhaps a way to export the 'dirty blocks' map to userland and use a deamon to transfer the block. Or maybe a kernel thread capable of listening on TCP sockets and transferring the blocks over to the backup client (I don't know if this is possible).