Features/PostCopyLiveMigration

Revision as of 10:58, 30 September 2014 by Dgilbert (talk | contribs) (→‎design)

summary

post-copy based live migration

owner

description

A postcopy implementation that allows guests with high page change rates (relative to the available bandwidth) to be migrated in a finite time.

design

This postcopy implementation uses the Linux 'userfault' and 'remap_anon_pages' kernel mechanisms from Andrea Arcangeli. These mechanisms are not specific to postcopy and are designed to work with all of the standard kernel features (such as transparent huge pages, KSM etc).

Mixed pre/post copy is built into the design from the start; a command is sent to switch modes after the migration has been started (as long as postcopy mode has been enabled first by a capability).
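The mode-switch rule above can be sketched as a small state machine. This is only an illustration of the rule as described: the type and function names (MigPhase, MigState, mig_start, mig_switch_to_postcopy) are hypothetical and are not taken from the QEMU source.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical migration phases; the names are illustrative only. */
typedef enum {
    MIG_PHASE_NONE,     /* migration has not been started */
    MIG_PHASE_PRECOPY,  /* migration started, still in precopy mode */
    MIG_PHASE_POSTCOPY  /* switch command has been accepted */
} MigPhase;

typedef struct {
    bool postcopy_capable;  /* assumed: capability set before migration starts */
    MigPhase phase;
} MigState;

/* Start a migration. It begins in precopy mode. */
static bool mig_start(MigState *s)
{
    if (s->phase != MIG_PHASE_NONE) {
        return false;       /* already running */
    }
    s->phase = MIG_PHASE_PRECOPY;
    return true;
}

/*
 * Handle the "switch modes" command. It is only honoured once the
 * migration has been started and only if the postcopy capability
 * was enabled beforehand.
 */
static bool mig_switch_to_postcopy(MigState *s)
{
    if (!s->postcopy_capable || s->phase != MIG_PHASE_PRECOPY) {
        return false;
    }
    s->phase = MIG_PHASE_POSTCOPY;
    return true;
}
```

In this sketch a switch request made before the migration starts, or without the capability, is simply rejected and the phase is left unchanged.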


Postcopyflow.png

Major components

Where possible, the design is built from components that other features can reuse.

  • 'command' section type: used for sending migration commands that don't directly reflect guest state. It carries the messages that move the migration through the different phases of postcopy, and is expandable for use by others.
  • 'return path': a method for the destination to send messages back to the source. It is used for postcopy page requests and allows the destination to signal failure back to the source. It is currently supported on TCP and fd (where the fd is socket backed).
  • 'sent map': a bitmap on the source populated with the set of all pages that have already been transmitted.
  • 'postcopy pagemap inbound (PMI)': a map on the destination holding the state of each page: whether it has been requested from the source and whether it has been received.
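As a rough illustration of the last two components, the sketch below models the sent map as a one-bit-per-page bitmap and the PMI as a per-page state array. All names (SentMap, Pmi, pmi_fault and so on) are hypothetical, and the three-state encoding of the PMI is an assumption made for the example; the actual data layout may differ.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Source side: one bit per guest page, set once the page has been sent. */
typedef struct {
    unsigned long *bits;
    size_t npages;
} SentMap;

static bool sentmap_init(SentMap *m, size_t npages)
{
    size_t nwords = (npages + BITS_PER_WORD - 1) / BITS_PER_WORD;
    m->bits = calloc(nwords ? nwords : 1, sizeof(unsigned long));
    m->npages = npages;
    return m->bits != NULL;
}

static void sentmap_mark(SentMap *m, size_t page)
{
    assert(page < m->npages);
    m->bits[page / BITS_PER_WORD] |= 1UL << (page % BITS_PER_WORD);
}

static bool sentmap_test(const SentMap *m, size_t page)
{
    assert(page < m->npages);
    return (m->bits[page / BITS_PER_WORD] >> (page % BITS_PER_WORD)) & 1UL;
}

/* Destination side: assumed three-state encoding of each page. */
typedef enum { PMI_MISSING = 0, PMI_REQUESTED, PMI_RECEIVED } PmiState;

typedef struct {
    unsigned char *state;   /* one PmiState per page, zero means missing */
    size_t npages;
} Pmi;

static bool pmi_init(Pmi *p, size_t npages)
{
    p->state = calloc(npages ? npages : 1, 1);
    p->npages = npages;
    return p->state != NULL;
}

/*
 * Called when the guest touches a page. Returns true if a page request
 * should be sent over the return path, false if the page has already
 * been requested or received.
 */
static bool pmi_fault(Pmi *p, size_t page)
{
    assert(page < p->npages);
    if (p->state[page] != PMI_MISSING) {
        return false;
    }
    p->state[page] = PMI_REQUESTED;
    return true;
}

/* Called when a page arrives from the source. */
static void pmi_received(Pmi *p, size_t page)
{
    assert(page < p->npages);
    p->state[page] = PMI_RECEIVED;
}
```

Tracking the "requested" state separately from "received" lets the destination avoid sending duplicate requests for a page that is already in flight, while the sent map lets the source tell whether a page has already been transmitted.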

TODO

future enhancement

  • optimization - rate limit the background page transmission to reduce the impact on the latency of postcopy page requests.
  • Integration with RDMA

links