Features/Migration/Visitor

Summary

This patch series implements a QEMUFile Visitor class intended to abstract away direct calls to qemu_put_*/qemu_get_* in save/load functions. Currently this is done by creating a QEMUFileInputVisitor/QEMUFileOutputVisitor pair on every call to qemu_fopen_ops() and maintaining a mapping from each QEMUFile to its input/output Visitors. Save/load functions that are to be converted then simply use lookup routines to get a Visitor based on their QEMUFile arguments. Once all of these save/load functions are converted, the interfaces can be changed in bulk to pass in the Visitor directly.

Migration compatibility is not affected by these changes: the qemu_put_*/qemu_get_* calls map trivially 1-to-1 onto the visit_type_* calls, and the added field names/structure are not sent over the wire when using the QEMUFile visitors. They serve simply as placeholders for future functionality.
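The 1-to-1 mapping can be pictured with a small Python model. This is only a sketch, not QEMU's C code: ToyFile, ToyOutputVisitor and their methods are hypothetical stand-ins for a QEMUFile, the QEMUFile output visitor, the qemu_put_* calls and the visit_type_* calls, and the device fields are made up. It shows a save function before and after conversion producing the same bytes, with the field names never reaching the stream.

```python
import struct


class ToyFile:
    """Hypothetical stand-in for a QEMUFile: an in-memory byte buffer."""

    def __init__(self):
        self.buf = bytearray()

    def put_byte(self, value):
        self.buf += struct.pack(">B", value)

    def put_be32(self, value):
        self.buf += struct.pack(">I", value)


class ToyOutputVisitor:
    """Hypothetical output visitor: each visit forwards to exactly one put call."""

    def __init__(self, f):
        self.f = f

    def type_uint8(self, name, value):
        # The field name is accepted but never written to the stream.
        self.f.put_byte(value)

    def type_uint32(self, name, value):
        self.f.put_be32(value)


def save_legacy(f, state):
    """Pre-conversion save function: direct put calls."""
    f.put_byte(state["irr"])
    f.put_be32(state["irq_state"])


def save_converted(f, state):
    """Converted save function: same fields, same order, through the visitor."""
    v = ToyOutputVisitor(f)  # the real code would look this up from the QEMUFile
    v.type_uint8("irr", state["irr"])
    v.type_uint32("irq_state", state["irq_state"])


state = {"irr": 0x10, "irq_state": 0x0}
old, new = ToyFile(), ToyFile()
save_legacy(old, state)
save_converted(new, state)
print(bytes(old.buf) == bytes(new.buf))  # True: the wire format is unchanged
```

Because the visitor is handed the field names even though this particular visitor discards them, a different visitor could later use them without touching the save function again.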

Owner

  • Name: Dave Gilbert
  • Email: dgilbert@redhat.com
  • Name: Michael Roth
  • Email: mdroth@linux.vnet.ibm.com

Plans (2014)

  - Get it to the point where the infrastructure works and doesn't break anything and is basically right
  - Get it merged
  - Deal with the remaining items that are still stuck as non-vmstate

In particular I don't want to have to wait for every device currently in use to work with visitors, so the code bundles old data up as octet-streams.
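One way to picture bundling old data up as octet-streams is to wrap whatever bytes a not-yet-converted device emits in a single opaque container. The sketch below assumes the container is an ASN.1 BER OCTET STRING (universal tag 0x04 with a definite length); the exact framing used by the patches is not described on this page, and wrap_legacy_blob/unwrap_legacy_blob are hypothetical names.

```python
def ber_length(n):
    """Encode a BER definite length: short form below 128, long form otherwise."""
    if n < 0x80:
        return bytes([n])
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(body)]) + body


def wrap_legacy_blob(data):
    """Wrap opaque legacy device data as a primitive OCTET STRING (tag 0x04)."""
    return b"\x04" + ber_length(len(data)) + data


def unwrap_legacy_blob(encoded):
    """Return the payload of an OCTET STRING produced by wrap_legacy_blob."""
    if encoded[0] != 0x04:
        raise ValueError("not an OCTET STRING")
    first = encoded[1]
    if first < 0x80:
        length, offset = first, 2
    else:
        n = first & 0x7F  # number of length bytes that follow
        length = int.from_bytes(encoded[2:2 + n], "big")
        offset = 2 + n
    payload = encoded[offset:offset + length]
    if len(payload) != length:
        raise ValueError("truncated OCTET STRING")
    return payload


legacy = bytes([0x10, 0xB8, 0x00, 0x08])  # made-up bytes from an unconverted device
wrapped = wrap_legacy_blob(legacy)
print(wrapped.hex())  # 040410b80008
```

The receiving side can skip or pass through such a blob without understanding its contents, which is what lets unconverted devices coexist with converted ones.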

Code/Status

  - Block migration
  - SPAPR (iterative like RAM and block)
  - RDMA
  - Floats
  - Need to improve the visitor interface for the compatibility structures
  - _TEST entries in VMState, and producing a schema that always works with them
  - Clean up places where shims are passed visitors but still need QEMUFiles
  - Test cases
  - Places where the interfaces and structure are still too tied to the old file format
  - Selecting the BER output format
  - Clean up error handling
  - Visitors should share more code.
  

Michael's code from 2011: The latest stable code is available at

git://repo.or.cz/qemu/mdroth.git migration-visitor-v2
git://repo.or.cz/qemu/mdroth.git migration-visitor-conversions-set1-v1

A bit over 1/3 of the conversions are completed, with most of the x86-relevant conversions done.

Old plans

With these patches in place non-vmstate save/load functions can be converted over to using Visitors incrementally.

Short term (1.1 timeframe), the goal is to implement a new migration protocol that is self-describing, such that incompatibilities or migration errors can be more easily detected. Today, a simple change in data types for a particular device can introduce subtle bugs that won't be detected by the target, since the target interprets the data according to its own expectation of what those data types are. The current plan is to achieve this by using ASN.1 BER in place of QEMUFile via a new BERVisitor.
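The failure mode described here, and how a self-describing stream exposes it, can be shown with a toy model. The one-byte type tags below are invented for illustration and are not ASN.1 BER, and put_tagged/get_tagged are hypothetical helpers; the point is only that a reader which checks a type marker fails loudly where an untagged reader silently misinterprets the data.

```python
import struct

# Hypothetical one-byte type tags; real BER identifiers and lengths differ.
TAG_UINT16 = 0x01
TAG_UINT32 = 0x02


def put_tagged(buf, tag, fmt, value):
    """Append a type tag followed by the big-endian value to a bytearray."""
    buf += bytes([tag]) + struct.pack(fmt, value)


def get_tagged(buf, offset, tag, fmt):
    """Read a value, refusing to continue if the type tag is unexpected."""
    if buf[offset] != tag:
        raise TypeError(f"expected tag {tag:#x}, found {buf[offset]:#x}")
    (value,) = struct.unpack_from(fmt, buf, offset + 1)
    return value, offset + 1 + struct.calcsize(fmt)


# Untagged stream: the source writes one uint32, the target reads two uint16.
raw = struct.pack(">I", 0x00010002)
a, b = struct.unpack(">HH", raw)  # succeeds silently, yielding a=1 and b=2

# Tagged stream: the same mismatch is detected on the target.
tagged = bytearray()
put_tagged(tagged, TAG_UINT32, ">I", 0x00010002)
try:
    get_tagged(tagged, 0, TAG_UINT16, ">H")
    detected = False
except TypeError:
    detected = True
print(detected)  # True
```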

Also planned is using the visitor interface to aid in things like device introspection (for both vmstate and non-vmstate devices), likely via the existing QMPOutputVisitor, as well as debugging/testing migration via a well structured "PrettyPrintVisitor" type of visitor that also retains ordering, field names, and type information (unlike with QMP/JSON or ASN.1). I've played around with doing this indirectly to test the conversions by doing some post-processing on the visit_* trace statements:

 ...
 PIIX3.PCIDevice.config(256)(1).[255] = (uint8) 0x0
 PIIX3.PCIDevice.irq_state(4)(4).[0] = (uint32) 0x0
 PIIX3.PCIDevice.irq_state(4)(4).[1] = (uint32) 0x0
 PIIX3.PCIDevice.irq_state(4)(4).[2] = (uint32) 0x0
 PIIX3.PCIDevice.irq_state(4)(4).[3] = (uint32) 0x0
 PIIX3.pci_irq_levels_vmstate(4)(4).[0] = (int32) 0x0
 PIIX3.pci_irq_levels_vmstate(4)(4).[1] = (int32) 0x0
 PIIX3.pci_irq_levels_vmstate(4)(4).[2] = (int32) 0x0
 PIIX3.pci_irq_levels_vmstate(4)(4).[3] = (int32) 0x0
 i8259.last_irr = (uint8) 0x1                                                     
 i8259.irr = (uint8) 0x10
 i8259.imr = (uint8) 0xb8
 i8259.isr = (uint8) 0x0
 i8259.priority_add = (uint8) 0x0
 i8259.irq_base = (uint8) 0x8
 i8259.read_reg_select = (uint8) 0x0
 i8259.poll = (uint8) 0x0
 i8259.special_mask = (uint8) 0x0
 i8259.init_state = (uint8) 0x0
 i8259.auto_eoi = (uint8) 0x0
 ...
 slirp.ip_id = (uint16) 0x2
 slirp.bootp.bootp_clients(16)(8).[0].allocated = (uint16) 0x1
 slirp.bootp.bootp_clients(16)(8).[0].macaddr(6)(1).[0] = (uint8) 0x1
 slirp.bootp.bootp_clients(16)(8).[0].macaddr(6)(1).[1] = (uint8) 0x0
 slirp.bootp.bootp_clients(16)(8).[0].macaddr(6)(1).[2] = (uint8) 0x0
 slirp.bootp.bootp_clients(16)(8).[0].macaddr(6)(1).[3] = (uint8) 0x0
 slirp.bootp.bootp_clients(16)(8).[0].macaddr(6)(1).[4] = (uint8) 0x0
 slirp.bootp.bootp_clients(16)(8).[0].macaddr(6)(1).[5] = (uint8) 0x0
 slirp.bootp.bootp_clients(16)(8).[1].allocated = (uint16) 0x0
 slirp.bootp.bootp_clients(16)(8).[1].macaddr(6)(1).[0] = (uint8) 0x1
 slirp.bootp.bootp_clients(16)(8).[1].macaddr(6)(1).[1] = (uint8) 0x0
 slirp.bootp.bootp_clients(16)(8).[1].macaddr(6)(1).[2] = (uint8) 0x0
 slirp.bootp.bootp_clients(16)(8).[1].macaddr(6)(1).[3] = (uint8) 0x0
 slirp.bootp.bootp_clients(16)(8).[1].macaddr(6)(1).[4] = (uint8) 0x0
 slirp.bootp.bootp_clients(16)(8).[1].macaddr(6)(1).[5] = (uint8) 0x0
 slirp.bootp.bootp_clients(16)(8).[2].allocated = (uint16) 0x0
 ...

It's not completely clear at this point how the latter will fall into place with other plans such as QOM that would provide similar capabilities, but since device serialization will likely be done via automatically-generated visitor patterns, this conversion should at least serve as an incremental step toward that transition.
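The post-processed trace lines shown above have a regular shape (path, then type in parentheses, then a hex value), so they are easy to load back into a script for comparison or coverage checks. The following is a minimal sketch that assumes exactly the line format in the listing; parse_trace is a hypothetical helper and not part of the test branch.

```python
import re

# Matches lines such as " i8259.irr = (uint8) 0x10".
LINE_RE = re.compile(
    r"^\s*(?P<path>\S+) = \((?P<type>\w+)\) (?P<value>0x[0-9a-fA-F]+)\s*$"
)


def parse_trace(text):
    """Turn post-processed trace lines into (path, type, value) tuples, in order."""
    fields = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:  # lines such as "..." carry no field and are skipped
            fields.append((m["path"], m["type"], int(m["value"], 16)))
    return fields


sample = """
 ...
 PIIX3.PCIDevice.irq_state(4)(4).[3] = (uint32) 0x0
 i8259.last_irr = (uint8) 0x1
 i8259.irr = (uint8) 0x10
 slirp.bootp.bootp_clients(16)(8).[0].allocated = (uint16) 0x1
"""

fields = parse_trace(sample)
# The first path component names the device, which gives a rough coverage view.
devices_hit = sorted({path.split(".")[0] for path, _, _ in fields})
print(devices_hit)  # ['PIIX3', 'i8259', 'slirp']
```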


(old) Testing

Conversions are being staged/tested through the testing branch:

git://repo.or.cz/qemu/mdroth.git migration-visitor-test

This branch also includes some helper scripts to aid in testing the conversions under the test-migration-visitor/ directory of the test branch (usage details in test-migration-visitor). There are also trivial modifications to the way qemu_put_buffer/qemu_get_buffer behave to provide a closer match to the trace output of converted save/load functions (mainly, qemu_put_buffer uses qemu_put_byte rather than directly memcpy'ing to the QEMUFile buffer, since this is how the QEMUFileVisitor does it via visit_start_array()/visit_end_array()).

Essentially, the test involves 3 phases. Build one qemu with only the patches up to and including "tests: helper scripts for testing migration visitor conversions", and another qemu with all the conversions you're testing on top of that (both with --enable-trace-backend=stderr). The phases are then run via the test-migration.sh helper script.

The conversions are fairly trivial for the most part, and have been tested using a mostly-automated test framework that traces all visitor-based and qemufile-based puts/gets and does the following:

1. Migrate a converted qemu instance to a converted qemu instance. Check that the visitor and qemufile puts/gets match up and report the same values. This confirms symmetry between the save and load sides.

2. Migrate a pre-converted qemu instance to a converted qemu instance. Check that the qemufile puts/gets match up between the two. This confirms symmetry between old save routines and new load routines.

3. Migrate a converted qemu instance to a pre-converted instance. This confirms symmetry between new save routines and old load routines.
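The symmetry check at the heart of each phase amounts to comparing the ordered sequence of puts on the source with the ordered sequence of gets on the target. The sketch below is an assumption about how that comparison could look, not the logic of test-migration.sh; first_mismatch is a hypothetical helper and the record values are made up.

```python
from itertools import zip_longest


def first_mismatch(puts, gets):
    """Return (index, put, get) for the first differing record, or None if symmetric."""
    for i, (p, g) in enumerate(zip_longest(puts, gets)):
        if p != g:  # also catches one side ending early (padded with None)
            return i, p, g
    return None


# Records are (type, value) pairs in stream order.
source_puts = [("uint8", 0x1), ("uint8", 0x10), ("uint16", 0x2)]
good_gets = [("uint8", 0x1), ("uint8", 0x10), ("uint16", 0x2)]
bad_gets = [("uint8", 0x1), ("uint16", 0x10), ("uint16", 0x2)]

print(first_mismatch(source_puts, good_gets))  # None
print(first_mismatch(source_puts, bad_gets))   # (1, ('uint8', 16), ('uint16', 16))
```

Reporting the index of the first divergence is useful because every record after a mismatch is usually misaligned as well.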

There are also unit tests in test-visitor to confirm that data written/read via qemufile/visitor interfaces match.

What's not covered in the testing:

1. coverage: there's no guarantee all paths will be tested, so I've been checking coverage manually by examining the post-processed visitor-based traces, where field names form unique(-ish) paths that are human readable, to determine what paths are being hit.

2. correctness: all tests involve checking symmetry on source and target during migration, but there's no guarantee the source values are correct. Presumably, serializing the wrong data will cause a break in symmetry or, if not, a mismatch between the converted load side and the pre-converted save side, but care should be taken nonetheless.