Features/VMSnapshotEnchancement

From QEMU
 

Latest revision as of 09:21, 25 January 2013

VM Snapshot enhancement

This feature enhances VM snapshot functionality so that internal and external snapshots can be taken live, and so that qemu works better with underlying components such as LVM, which can then take the snapshot instead of qemu. Qemu will then have a complete picture of snapshot methods:

--qemu manages snapshots:

|----internal case

|----external case

--qemu does not manage snapshots:

|----block drain case

  • Name: Wenchao Xia
  • Email: xiawenc@linux.vnet.ibm.com, xiaxia347os@163.com

General Summary

This feature would provide APIs for the following:

  • 1 Take a live block device snapshot as internal or external data, or drain I/O to cooperate with external components; export a sync API for all types.
  • 2 Save vmstate live as internal or external data, with a fixed size; export an async API for external data.
  • 3 Combine the above, and take a screendump at the correct time.

Subtask Details

TBD.

General Goal

First consider what is needed to take a snapshot for a common application, or for qemu:

Snapshot principle.jpg

Pic 1, principle of making a snapshot for an application


As the picture shows, the application generally needs to ensure that the disk data is consistent at the upper level, and then make a clone/mirror of the disk data at that point in time. So the disk job does not have to be done by qemu: qemu can just ensure the data's consistency and let another component complete the remaining work.


Snapshots are often taken frequently to make data recoverable to some point in time. For example, a management program may want the following for each VM, which enables data restore to multiple points in time while keeping the space used small and the VM online with good performance all the time. To make things clear, in this text I'll call a program that wants to form the following an "incremental backup application".

Vmbackup Common goal.jpg

Pic 2, general goal on backup server


Note its key requirements: delta data and base are separated, vmstate data is standalone, and a hidden one: the VM needs good performance.
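
The requirements above can be illustrated with a small in-memory model. This is only a sketch under assumptions: the names `RestorePoint`, `BackupChain` and `restore_disk` are hypothetical and are not part of qemu, and disk data is modeled as a map from cluster index to bytes rather than as image files.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class RestorePoint:
    """One point in time (hypothetical model, not a qemu structure)."""
    name: str
    delta: Dict[int, bytes]          # clusters changed since the previous point
    vmstate: Optional[bytes] = None  # standalone, stored apart from disk data


@dataclass
class BackupChain:
    """A base plus ordered deltas, kept separate as the text requires."""
    base: Dict[int, bytes]
    points: List[RestorePoint] = field(default_factory=list)

    def add_point(self, point: RestorePoint) -> None:
        self.points.append(point)

    def restore_disk(self, name: str) -> Dict[int, bytes]:
        """Rebuild the disk content as of the named restore point."""
        disk = dict(self.base)       # copy, so the base stays untouched
        for point in self.points:
            disk.update(point.delta)  # apply deltas oldest first
            if point.name == name:
                return disk
        raise KeyError(name)
```

Because only changed clusters are stored per restore point, the space used stays small, and keeping vmstate out of the disk data lets a restore choose whether to bring back the runtime state or only the disk.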

Use Cases

As shown above, how deep a recovery is needed determines whether to save vmstate. Who takes the action determines whether qemu writes/reads the snapshot itself. How the action is taken determines the external/internal cases. There are basically three choices, as follows:

Function blocks.jpg

pic3, co-operation relationship in the big picture


Typical cases:

Taking LVM2 as an example third-party tool, and with vmstate save optional, the typical cases are as follows:

Common disadvantage now: vmstate size is not predictable when saving live.

--Type one, qemu manages snapshots:


Common advantages: fewer dependencies, qemu manages everything, workable on most systems.


Common disadvantages: lower-level components have no chance to take over the job.

  • Case 1: external image snapshot data + external vmstate data

This is what qemu 1.3 supports. Steps:

 1 save vmstate live to an external place.
 2 at 95% of the vmstate save, freeze the guest FS via GA (the guest agent).
 3 freeze qemu block I/O by pausing the VM or queuing I/O.
 4 run blockdev-snapshot-sync on each block device to get a read-only image.
 5 restore qemu block I/O by resuming the VM or flushing queued I/O.
 6 thaw the guest FS via GA.
 7 copy out the read-only image.
 8 copy out the vmstate file, if vmstate needs to be stored in another place.
 9 merge the image files.
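
Steps 1 to 6 can be sketched as a sequence of QMP and guest agent messages. This is a rough illustration, not the proposed interface: the helper `case1_commands` is hypothetical, saving vmstate through `migrate` with an `exec:` URI is only one possible way to do step 1, and pausing the VM with `stop`/`cont` is the simpler of the two options named in steps 3 and 5. The sketch does not poll progress, so the "95%" trigger is left to the caller.

```python
import json


def case1_commands(devices, vmstate_path):
    """Build (channel, message) pairs for steps 1-6 of case 1.

    channel is "qmp" for the qemu monitor or "ga" for the guest agent.
    devices maps a block device name to the path of its new overlay file.
    Hypothetical helper: it only builds messages, it sends nothing.
    """
    commands = [
        # Step 1: start saving vmstate to an external file (assumed method).
        ("qmp", {"execute": "migrate",
                 "arguments": {"uri": "exec:cat > " + vmstate_path}}),
        # Step 2: freeze guest filesystems; the caller decides when
        # (the text says at 95%), for example by polling query-migrate.
        ("ga", {"execute": "guest-fsfreeze-freeze"}),
        # Step 3: pause the VM so no block I/O is issued.
        ("qmp", {"execute": "stop"}),
    ]
    # Step 4: one external snapshot per block device.
    for device, overlay in devices.items():
        commands.append(
            ("qmp", {"execute": "blockdev-snapshot-sync",
                     "arguments": {"device": device,
                                   "snapshot-file": overlay,
                                   "format": "qcow2"}}))
    # Steps 5 and 6: resume the VM, then thaw guest filesystems.
    commands.append(("qmp", {"execute": "cont"}))
    commands.append(("ga", {"execute": "guest-fsfreeze-thaw"}))
    return commands


if __name__ == "__main__":
    for channel, message in case1_commands(
            {"virtio0": "/tmp/virtio0-overlay.qcow2"}, "/tmp/vmstate.bin"):
        print(channel, json.dumps(message))
```

After these steps the old image of each device is read-only and can be copied out (step 7) while the guest keeps writing to the overlay.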

Advantages:

 We directly get read-only standalone base / delta image files, which an "incremental backup application" can
 copy out directly to form a chain.

Disadvantages:

 External chains are slower at deleting (merging) and reading than internal snapshots.


  • Case 2: internal image snapshot data + internal vmstate data

This is what qemu 1.3 supports, but not live. Steps:

 1 save vmstate live to the internal qcow2 file.
 2 at 95% of the vmstate save, freeze the guest FS via GA.
 3 freeze qemu block I/O by pausing the VM or queuing I/O.
 4 run blkdev-snapshot-internal-sync on each block device.
 5 restore qemu block I/O by resuming the VM or flushing queued I/O.
 6 thaw the guest FS via GA.
 7 export base/delta internal data through a qemu API.
 8 export vmstate internal data through a qemu API.
 9 delete the internal snapshot.
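
Steps 2 to 6 form a nested freeze/restore pair that must be undone in reverse order even if the snapshot itself fails. A minimal sketch of that ordering follows; the name `quiesced` and its four callback parameters are hypothetical, and the callbacks stand in for whatever GA and qemu calls are actually used.

```python
from contextlib import contextmanager


@contextmanager
def quiesced(freeze_fs, pause_io, resume_io, thaw_fs):
    """Run the body with the guest FS frozen and block I/O paused.

    The four arguments are caller-supplied callables (hypothetical hooks).
    Block I/O is always resumed before the guest FS is thawed, and both
    happen even when the body raises.
    """
    freeze_fs()                # step 2: freeze guest FS
    try:
        pause_io()             # step 3: pause VM or queue I/O
        try:
            yield              # step 4: take the snapshots here
        finally:
            resume_io()        # step 5: resume VM or flush queued I/O
    finally:
        thaw_fs()              # step 6: thaw guest FS
```

A caller would wrap step 4 in `with quiesced(...):` so that a failed snapshot never leaves the guest frozen or paused.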

Advantages:

 Better performance for the running VM, and faster deleting, than in the external case.

Disadvantages:

 The data is not separated; a tool is needed to export it, which does not seem to be available
 in qemu now.


  • Case 3: internal image snapshot data + external vmstate data

Steps:

 1 save vmstate live to an external qcow2 file.
 2 at 95% of the vmstate save, freeze the guest FS via GA.
 3 freeze qemu block I/O by pausing the VM or queuing I/O.
 4 run blkdev-snapshot-internal-sync on each block device.
 5 restore qemu block I/O by resuming the VM or flushing queued I/O.
 6 thaw the guest FS via GA.
 7 export base/delta internal data through a qemu API.
 8 copy out the vmstate file, if vmstate needs to be stored in another place.
 9 delete the internal snapshot.

Advantages:

 Same as case 2, except that vmstate is already standalone.

Disadvantages:

 Same as case 2. In addition, vmstate may not be paired with the block snapshots, but this is not a problem
 if the management stack handles it.


Small summary: Case 3 seems the best, since it has the best performance, but qemu needs to provide a way to convert between internal and external snapshots, something like: qemu_export_internal_delta(int *id, char *buf). However, I haven't confirmed whether it is theoretically possible in qcow2 to export base data live. If not, we may need to combine external/internal steps to form an incremental backup chain with better performance.
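
What such an export would have to compute can be shown on a toy model. This is not qcow2 code and not the proposed C interface: `export_internal_delta` below is a hypothetical Python stand-in in which each internal snapshot is a map from cluster index to data, and discarded clusters are not handled.

```python
def export_internal_delta(snapshots, snap_id):
    """Return the clusters that differ between a snapshot and its predecessor.

    snapshots is a list of dicts (cluster index -> bytes), oldest first.
    For snap_id 0 there is no predecessor, so the whole base is returned.
    Toy model only: clusters dropped since the predecessor are ignored.
    """
    if not 0 <= snap_id < len(snapshots):
        raise IndexError(snap_id)
    current = snapshots[snap_id]
    if snap_id == 0:
        return dict(current)
    previous = snapshots[snap_id - 1]
    # Keep clusters that are new or whose content changed.
    return {cluster: data for cluster, data in current.items()
            if previous.get(cluster) != data}
```

Applying the exported delta on top of the previous snapshot reproduces the current one, which is the property an incremental backup chain relies on.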


--Type two, qemu does not manage snapshots:

  • Case 4: block I/O drain + external vmstate data

Steps:

 1 save vmstate live to an external place.
 2 at 95% of the vmstate save, freeze the guest FS via GA.
 3 drain block I/O, then pause the VM or queue I/O.
 4 third-party components create the snapshot.
 5 restore qemu block I/O by resuming the VM or flushing queued I/O.
 6 thaw the guest FS via GA.
 7 the third-party component gets the delta/base data.
 8 copy out the vmstate file, if vmstate needs to be stored in another place.
 9 the third-party component merges its snapshot.
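
With LVM2 as the third-party component, step 4 could amount to one `lvcreate` call issued while block I/O is drained. The sketch below only builds the argument list and runs nothing; the helper name `lvm_snapshot_argv` and the volume names in the usage line are hypothetical, and the drain in step 3 is assumed to have been done by qemu beforehand.

```python
def lvm_snapshot_argv(vg, lv, snap_name, cow_size):
    """Build the lvcreate command line for a snapshot of vg/lv.

    cow_size is the space reserved for the snapshot's copy-on-write data,
    for example "1G". Hypothetical helper: it returns the argument list
    and leaves execution (and privileges) to the caller.
    """
    return ["lvcreate", "--snapshot",
            "--name", snap_name,
            "--size", cow_size,
            vg + "/" + lv]


if __name__ == "__main__":
    # Example with made-up volume names.
    print(" ".join(lvm_snapshot_argv("vg0", "vm1-disk", "vm1-snap", "1G")))
```

Since the snapshot is created below qemu, qemu does not need to track a backing chain for it; it only has to guarantee that no I/O is in flight when the command runs.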

Advantages:

 Faster; backing chain and block bitmap management can be offloaded from qemu to a lower component, which also
 gives lower-level software/hardware a chance to accelerate it.

Disadvantages:

 Needs extra components.


In summary: for the type where qemu manages snapshots, case 1 is recommended for the first generation of incremental backup, with case 3 implemented to get better performance. For the type where qemu does not manage snapshots, implement the missing parts and let dedicated software/hardware do the work. As an alternative, it is also possible to define an interface (a snapshot ioctl call for a host block device) and let third parties implement it.

API design

TBD.


Progress

TBD.