= Live Migration Road Map =


== Stability ==
* Complete VmState transition (a minimal vmstate sketch follows this list):
** CPU port posted upstream.
** virtio is ported to vmstate; still missing are the virtio-serial lists.
** slirp is ported to vmstate; still missing are the top-level lists, needs some testing.
* Visitors
* Device state automatic code generation - for example using annotations, as Qt does.
* Migration downtime calculation:
** The estimated migration downtime is calculated from the last bandwidth (how much we sent in the last iteration). The bandwidth can fluctuate between iterations, so it could be better to use an average bandwidth (see the averaged-bandwidth sketch after this list). Because QEMU is single-threaded, the actual downtime can be greater if the thread is busy; separating out a migration thread can help in this case.
** We need a mechanism to detect when we exceed the maximal downtime and return to the iteration phase. This can be implemented using a timer.
* Migration speed calculation:
** The default migration speed can be too low; this can prolong the migration and in some cases it never completes (https://bugzilla.redhat.com/show_bug.cgi?id=695394).
** In the current implementation calculating the actual migration speed is very complex: we use a QEMUFileBuffered for the outgoing migration, and it sends data in two cases: 100 milliseconds have passed since the previous packet, or the buffer is full (~3.2 MB). See the buffering sketch after this list.
* Tests:
** autotest for migration (already exists); need to add tests with guest and host load.
** VmState unit test: save to file / load from file.
** VmState sections/subsections testing.
* Bugs!
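As a reminder of what the remaining vmstate conversions look like, here is a minimal sketch in the VMStateDescription style. The HypotheticalDevState structure and its fields are invented for illustration; only the VMStateDescription type and the VMSTATE_* macros are the existing API.

<pre>
#include "hw/hw.h"   /* VMStateDescription and the VMSTATE_* field macros */

/* Hypothetical device state, for illustration only. */
typedef struct HypotheticalDevState {
    uint32_t ctrl;
    uint32_t status;
    int64_t  last_irq_time;
} HypotheticalDevState;

/* Instead of hand-written save/load callbacks, the device describes its
 * fields once and the vmstate core handles both save and load. */
static const VMStateDescription vmstate_hypothetical_dev = {
    .name = "hypothetical-dev",
    .version_id = 1,
    .minimum_version_id = 1,
    .fields = (VMStateField[]) {
        VMSTATE_UINT32(ctrl, HypotheticalDevState),
        VMSTATE_UINT32(status, HypotheticalDevState),
        VMSTATE_INT64(last_irq_time, HypotheticalDevState),
        VMSTATE_END_OF_LIST()
    }
};
</pre>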

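A minimal sketch of the averaged-bandwidth idea from the downtime item above, assuming a simple exponential moving average. The function names, the 0.7/0.3 weights, and the fallback value are hypothetical and not part of the current migration code.

<pre>
#include <stdint.h>

/* Hypothetical helper: estimate downtime from a smoothed bandwidth
 * instead of only the last iteration's bandwidth. */
static double smoothed_bw;   /* bytes per second, exponentially averaged */

static void update_bandwidth(uint64_t bytes_sent, double iter_seconds)
{
    double bw = bytes_sent / iter_seconds;

    /* Exponential moving average: one unusually fast or slow iteration
     * moves the estimate less than taking the last value alone would. */
    smoothed_bw = smoothed_bw ? 0.7 * smoothed_bw + 0.3 * bw : bw;
}

static double expected_downtime(uint64_t remaining_dirty_bytes)
{
    /* Time to transfer what is still dirty at the smoothed rate; compare
     * this against the configured maximum downtime to decide whether to
     * stop the guest or keep iterating. */
    return smoothed_bw ? remaining_dirty_bytes / smoothed_bw : 1e9;
}
</pre>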

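To illustrate why the actual speed is hard to derive, here is a rough sketch of the two flush triggers described in the speed item above. The structure, names, and exact buffer size are illustrative, not the real buffered_file.c implementation.

<pre>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define XFER_BUF_SIZE   3276800   /* ~3.2 MB buffer, as noted above */
#define FLUSH_PERIOD_MS 100       /* flush at least every 100 ms */

/* Illustrative state of the buffered outgoing migration stream. */
typedef struct {
    uint8_t  buf[XFER_BUF_SIZE];
    size_t   used;
    int64_t  last_flush_ms;
} BufferedOut;

/* Data actually hits the socket only when one of the two conditions
 * holds, so the instantaneous "speed" depends on which trigger fired. */
static bool should_flush(const BufferedOut *b, int64_t now_ms)
{
    return b->used >= sizeof(b->buf) ||
           now_ms - b->last_flush_ms >= FLUSH_PERIOD_MS;
}
</pre>
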
== Performance ==
* XBRLE page delta compression - SAP patches: http://lists.gnu.org/archive/html/qemu-devel/2011-07/msg00474.html and http://www.linux-kvm.org/wiki/images/c/cb/2011-forum-kvm_hudzia.pdf (a sketch of the delta-encoding idea follows this list).
* Sending cold pages first, aka page priority - also SAP (see http://www.linux-kvm.org/wiki/images/c/cb/2011-forum-kvm_hudzia.pdf).
* Migration threads - Juan is working on it.
* Splitting the bitmap - Juan is working on it.
* Migration protocol - the protocol should be separated from the device state and data format. It should be a bi-directional protocol, unlike today.
* Post-copy for guests with very large memory - http://www.linux-kvm.org/wiki/images/e/ed/2011-forum-yabusame-postcopy-migration.pdf
* RDMA
* Remove the buffered file - depends on the migration threads.
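As background for the XBRLE item, a minimal sketch of the general XOR + run-length idea of delta-encoding a dirty page against the copy that was already sent. The encoding format and names here are invented for illustration and are not the actual wire format of the SAP patch series.

<pre>
#include <stddef.h>
#include <stdint.h>

/* Encode the difference between the previously sent copy of a page and
 * its current contents as (unchanged-run length, literal byte) pairs.
 * Pages that changed only a little compress to a few bytes instead of a
 * full page retransmission.  Illustrative format only. */
static size_t xor_rle_encode(const uint8_t *old_page, const uint8_t *new_page,
                             size_t page_size, uint8_t *out, size_t out_max)
{
    size_t i = 0, o = 0;

    while (i < page_size) {
        /* Count unchanged bytes (XOR == 0), capped so it fits in one byte. */
        size_t run = 0;
        while (i + run < page_size && run < 255 &&
               (old_page[i + run] ^ new_page[i + run]) == 0) {
            run++;
        }
        if (o + 2 > out_max) {
            return 0;               /* delta too big, send the whole page instead */
        }
        out[o++] = (uint8_t)run;    /* decoder copies this many bytes from its old copy */
        i += run;
        if (i < page_size) {
            out[o++] = new_page[i++];   /* one literal byte that actually changed */
        }
    }
    return o;   /* encoded size; caller compares it against page_size */
}
</pre>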
