Features/Migration/Visitor

Revision as of 17:47, 27 October 2011 by Mdroth

Summary

This patch series implements a QEMUFile Visitor class that's intended to abstract away direct calls to qemu_put_*/qemu_get_* for save/load functions. Currently this is done by always creating a QEMUFileInputVisitor/QEMUFileOutputVisitor pair with each call to qemu_fopen_ops() and maintaining a QEMUFile->I/O Visitor mapping. Save/load functions that are to be converted would then simply use lookup routines to get a Visitor based on their QEMUFile arguments. Once these save/load functions are all converted, we can then change the interfaces in bulk and switch over to passing in the Visitor directly.

This series also converts all of vmstate over to using Visitors, leveraging the existing vmstate hierarchical structure to handle namespacing and structuring of the fields that are fed to the visitor functions, and introduces trace statements for qemu_put_*/qemu_get_* and visitor API calls to aid in testing.

Migration compatibility is not affected by these changes, as the qemu_put_*/qemu_get_* calls have a trivial 1-to-1 mapping with the visit_type_* calls, and the added field names/structure are not sent out over the wire when using the QEMUFile visitors, but serve simply as placeholders for future functionality.
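The 1-to-1 mapping described above can be illustrated with a small, self-contained toy model. This is only a sketch under stated assumptions: ToyFile, ToyVisitor, toy_file_get_output_visitor() and the toy_visit_type_*() functions are hypothetical stand-ins invented for illustration, not the QEMUFile or Visitor APIs from the series. The point it shows is that a visitor call which accepts a field name but never writes it produces exactly the same byte stream as the direct put call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy stand-in for QEMUFile: a small in-memory byte buffer. */
typedef struct ToyFile {
    uint8_t buf[64];
    size_t pos;
} ToyFile;

/* Toy stand-in for an output Visitor bound to one file. */
typedef struct ToyVisitor {
    ToyFile *f;
} ToyVisitor;

/* Direct writers, playing the role of qemu_put_byte()/qemu_put_be32(). */
static void toy_put_byte(ToyFile *f, uint8_t v)
{
    assert(f->pos < sizeof(f->buf));
    f->buf[f->pos++] = v;
}

static void toy_put_be32(ToyFile *f, uint32_t v)
{
    toy_put_byte(f, (uint8_t)(v >> 24));
    toy_put_byte(f, (uint8_t)(v >> 16));
    toy_put_byte(f, (uint8_t)(v >> 8));
    toy_put_byte(f, (uint8_t)v);
}

/* Hypothetical lookup routine: hands back a visitor for a given file. */
static ToyVisitor toy_file_get_output_visitor(ToyFile *f)
{
    ToyVisitor v = { .f = f };
    return v;
}

/* Visitor calls take a field name, but the name never reaches the buffer. */
static void toy_visit_type_uint8(ToyVisitor *v, uint8_t *obj, const char *name)
{
    (void)name;
    toy_put_byte(v->f, *obj);
}

static void toy_visit_type_uint32(ToyVisitor *v, uint32_t *obj, const char *name)
{
    (void)name;
    toy_put_be32(v->f, *obj);
}

typedef struct ToyDevice {
    uint8_t irr;
    uint32_t irq_state;
} ToyDevice;

/* Pre-conversion style: direct put calls on the file. */
static void toy_save_old(ToyFile *f, ToyDevice *d)
{
    toy_put_byte(f, d->irr);
    toy_put_be32(f, d->irq_state);
}

/* Converted style: same signature, but goes through a looked-up visitor. */
static void toy_save_new(ToyFile *f, ToyDevice *d)
{
    ToyVisitor v = toy_file_get_output_visitor(f);
    toy_visit_type_uint8(&v, &d->irr, "irr");
    toy_visit_type_uint32(&v, &d->irq_state, "irq_state");
}

/* Returns 1 if both save styles produce identical bytes, else 0. */
int toy_streams_match(void)
{
    ToyDevice d = { .irr = 0x10, .irq_state = 0xdeadbeef };
    ToyFile a = { .pos = 0 }, b = { .pos = 0 };

    toy_save_old(&a, &d);
    toy_save_new(&b, &d);
    return a.pos == b.pos && memcmp(a.buf, b.buf, a.pos) == 0;
}
```

Keeping the save function's file-based signature and looking the visitor up inside it mirrors the incremental approach in the summary: callers stay unchanged until every function is converted.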

Owner

  • Name: Michael Roth
  • Email: mdroth@linux.vnet.ibm.com

Plans

With these patches in place, non-vmstate save/load functions can be converted over to using Visitors incrementally.

Short term (1.1 timeframe), the goal is to implement a new migration protocol that is self-describing, such that incompatibilities or migration errors can be more easily detected. At present, a simple change in data types for a particular device can introduce subtle bugs that won't be detected by the target, since the target interprets the data according to its own expectation of what those data types are. The current plan is to achieve this by using ASN.1 BER in place of QEMUFile via a new BERVisitor.
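To make the failure mode concrete, here is a minimal sketch of why an untagged stream hides a type change while a self-describing one exposes it. It is not ASN.1 BER and not the planned BERVisitor: the tag values, the ToyStream type and all function names are hypothetical, and the encoding is just a simple tag-length-value scheme in the same spirit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type tags for illustration; these are not ASN.1 tags. */
enum { TOY_TAG_U16 = 1, TOY_TAG_U32 = 2 };

typedef struct ToyStream {
    uint8_t buf[32];
    size_t len;   /* bytes written */
    size_t pos;   /* read cursor */
} ToyStream;

static void toy_put_byte(ToyStream *s, uint8_t v)
{
    assert(s->len < sizeof(s->buf));
    s->buf[s->len++] = v;
}

static uint8_t toy_get_byte(ToyStream *s)
{
    assert(s->pos < s->len);
    return s->buf[s->pos++];
}

/* Untagged big-endian integer of nbytes bytes, as a raw stream would hold. */
static void toy_put_raw_be(ToyStream *s, uint32_t value, size_t nbytes)
{
    for (size_t i = nbytes; i-- > 0; ) {
        toy_put_byte(s, (uint8_t)((value >> (8 * i)) & 0xff));
    }
}

static uint32_t toy_get_raw_be(ToyStream *s, size_t nbytes)
{
    uint32_t value = 0;
    for (size_t i = 0; i < nbytes; i++) {
        value = (value << 8) | toy_get_byte(s);
    }
    return value;
}

/* Self-describing form: tag byte, length byte, then the payload. */
static void toy_put_tagged(ToyStream *s, uint8_t tag, uint32_t value,
                           size_t nbytes)
{
    toy_put_byte(s, tag);
    toy_put_byte(s, (uint8_t)nbytes);
    toy_put_raw_be(s, value, nbytes);
}

/* Returns 0 on success, -1 if the stream disagrees with the expectation. */
static int toy_get_tagged(ToyStream *s, uint8_t want_tag, size_t want_len,
                          uint32_t *out)
{
    uint8_t tag = toy_get_byte(s);
    uint8_t len = toy_get_byte(s);

    if (tag != want_tag || len != want_len) {
        return -1;
    }
    *out = toy_get_raw_be(s, len);
    return 0;
}

/* Source sends two uint16 fields; target wrongly reads one uint32. */
uint32_t toy_untagged_misread(void)
{
    ToyStream s = { .len = 0 };
    toy_put_raw_be(&s, 0x0001, 2);
    toy_put_raw_be(&s, 0x0002, 2);
    return toy_get_raw_be(&s, 4);   /* silently merges both fields */
}

/* Same mistake on the tagged stream is reported instead of accepted. */
int toy_tagged_misread(void)
{
    ToyStream s = { .len = 0 };
    uint32_t out = 0;
    toy_put_tagged(&s, TOY_TAG_U16, 0x0001, 2);
    return toy_get_tagged(&s, TOY_TAG_U32, 4, &out);
}
```

In the untagged case the target ends up with 0x00010002 and no error; in the tagged case the reader returns -1 at the first field, which is the kind of early detection the self-describing protocol is meant to provide.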

Also planned is using the visitor interface to aid in things like device introspection (for both vmstate and non-vmstate devices), likely via the existing QMPOutputVisitor, as well as debugging/testing migration via a well-structured "PrettyPrintVisitor" type of visitor that also retains ordering, field names, and type information (unlike with QMP/JSON or ASN.1). I've played around with doing this indirectly to test the conversions by doing some post-processing on the visit_* trace statements:

 ...
 PIIX3.PCIDevice.config(256)(1).[255] = (uint8) 0x0
 PIIX3.PCIDevice.irq_state(4)(4).[0] = (uint32) 0x0
 PIIX3.PCIDevice.irq_state(4)(4).[1] = (uint32) 0x0
 PIIX3.PCIDevice.irq_state(4)(4).[2] = (uint32) 0x0
 PIIX3.PCIDevice.irq_state(4)(4).[3] = (uint32) 0x0
 PIIX3.pci_irq_levels_vmstate(4)(4).[0] = (int32) 0x0
 PIIX3.pci_irq_levels_vmstate(4)(4).[1] = (int32) 0x0
 PIIX3.pci_irq_levels_vmstate(4)(4).[2] = (int32) 0x0
 PIIX3.pci_irq_levels_vmstate(4)(4).[3] = (int32) 0x0
 i8259.last_irr = (uint8) 0x1
 i8259.irr = (uint8) 0x10
 i8259.imr = (uint8) 0xb8
 i8259.isr = (uint8) 0x0
 i8259.priority_add = (uint8) 0x0
 i8259.irq_base = (uint8) 0x8
 i8259.read_reg_select = (uint8) 0x0
 i8259.poll = (uint8) 0x0
 i8259.special_mask = (uint8) 0x0
 i8259.init_state = (uint8) 0x0
 i8259.auto_eoi = (uint8) 0x0
 ...
 slirp.ip_id = (uint16) 0x2
 slirp.bootp.bootp_clients(16)(8).[0].allocated = (uint16) 0x1
 slirp.bootp.bootp_clients(16)(8).[0].macaddr(6)(1).[0] = (uint8) 0x1
 slirp.bootp.bootp_clients(16)(8).[0].macaddr(6)(1).[1] = (uint8) 0x0
 slirp.bootp.bootp_clients(16)(8).[0].macaddr(6)(1).[2] = (uint8) 0x0
 slirp.bootp.bootp_clients(16)(8).[0].macaddr(6)(1).[3] = (uint8) 0x0
 slirp.bootp.bootp_clients(16)(8).[0].macaddr(6)(1).[4] = (uint8) 0x0
 slirp.bootp.bootp_clients(16)(8).[0].macaddr(6)(1).[5] = (uint8) 0x0
 slirp.bootp.bootp_clients(16)(8).[1].allocated = (uint16) 0x0
 slirp.bootp.bootp_clients(16)(8).[1].macaddr(6)(1).[0] = (uint8) 0x1
 slirp.bootp.bootp_clients(16)(8).[1].macaddr(6)(1).[1] = (uint8) 0x0
 slirp.bootp.bootp_clients(16)(8).[1].macaddr(6)(1).[2] = (uint8) 0x0
 slirp.bootp.bootp_clients(16)(8).[1].macaddr(6)(1).[3] = (uint8) 0x0
 slirp.bootp.bootp_clients(16)(8).[1].macaddr(6)(1).[4] = (uint8) 0x0
 slirp.bootp.bootp_clients(16)(8).[1].macaddr(6)(1).[5] = (uint8) 0x0
 slirp.bootp.bootp_clients(16)(8).[2].allocated = (uint16) 0x0
 ...

It's not completely clear at this point how the latter will fall into place with other plans such as QOM that would provide similar capabilities, but since device serialization will likely be done via automatically-generated visitor patterns, this conversion should at least serve as an incremental step toward that transition.
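As a rough illustration of what a "PrettyPrintVisitor"-style visitor might need to track, the sketch below keeps a dotted path of the enclosing structures and emits one line per field in the same shape as the trace listing above. The listing itself was produced by post-processing trace output, so this is not the method used there; ToyPathVisitor and its functions are hypothetical, and reading the array suffix as (element count)(element size) is an assumption inferred from the listing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

typedef struct ToyPathVisitor {
    char path[128];   /* current dotted prefix, e.g. "PIIX3.PCIDevice" */
    char out[512];    /* accumulated output lines */
} ToyPathVisitor;

/* Enter a struct or array: append its name, return the old path length. */
static size_t toy_push(ToyPathVisitor *v, const char *name)
{
    size_t old = strlen(v->path);
    snprintf(v->path + old, sizeof(v->path) - old, "%s%s",
             old ? "." : "", name);
    return old;
}

/* Leave a struct or array: truncate the path back to its old length. */
static void toy_pop(ToyPathVisitor *v, size_t old)
{
    v->path[old] = '\0';
}

/* Emit one leaf field as "<path>.<name> = (<type>) 0x<value>". */
static void toy_emit(ToyPathVisitor *v, const char *name, const char *type,
                     uint32_t value)
{
    size_t used = strlen(v->out);
    snprintf(v->out + used, sizeof(v->out) - used, "%s.%s = (%s) 0x%x\n",
             v->path, name, type, (unsigned)value);
}

/* Walk a tiny hand-written "device tree" and return the formatted text. */
const char *toy_dump_example(ToyPathVisitor *v)
{
    size_t dev, sub, arr;

    memset(v, 0, sizeof(*v));

    dev = toy_push(v, "i8259");
    toy_emit(v, "irr", "uint8", 0x10);
    toy_emit(v, "imr", "uint8", 0xb8);
    toy_pop(v, dev);

    dev = toy_push(v, "PIIX3");
    sub = toy_push(v, "PCIDevice");
    arr = toy_push(v, "irq_state(4)(4)");   /* assumed: (count)(elem size) */
    toy_emit(v, "[0]", "uint32", 0x0);
    toy_emit(v, "[1]", "uint32", 0x0);
    toy_pop(v, arr);
    toy_pop(v, sub);
    toy_pop(v, dev);

    return v->out;
}
```

Because the path is built from start/end events, ordering, field names and type information all survive in the output, which is the property the text says QMP/JSON and ASN.1 encodings would not retain.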

Code/Status

The latest stable code is available at

git://repo.or.cz/qemu/mdroth.git migration-visitor-v2
git://repo.or.cz/qemu/mdroth.git migration-visitor-conversions-set1-v1

A bit over 1/3 of the conversions are completed, with most of the x86-relevant conversions done.

Testing

Conversions are being staged/tested through the testing branch:

git://repo.or.cz/qemu/mdroth.git migration-visitor-test

This branch also includes some helper scripts to aid in testing the conversions, under the test-migration-visitor/ directory (usage details in test-migration-visitor). There are also trivial modifications to the way qemu_put_buffer/qemu_get_buffer behave to provide a closer match to the trace output of converted save/load functions: mainly, qemu_put_buffer uses qemu_put_byte rather than directly memcpy'ing to the QEMUFile buffer, since this is how the QEMUFileVisitor does it via visit_start_array()/visit_end_array().

Essentially, the test involves 3 phases. Build one qemu with only the patches up to and including "tests: helper scripts for testing migration visitor conversions", and another qemu with all the conversions you're testing applied on top of that (both with --enable-trace-backend=stderr). The test-migration.sh helper script is then used to run the phases.

The conversions are fairly trivial for the most part, and have been tested using a mostly-automated test framework that traces all visitor-based and qemufile-based puts/gets and does the following:

1. Migrate a converted qemu instance to a converted qemu instance. Check that visitor and qemufile puts/gets match up and report the same values. This confirms symmetry between the save and load sides.

2. Migrate a pre-converted qemu instance to a converted qemu instance. Check that the qemufile puts/gets match up between the two. This confirms symmetry between old save routines and new load routines.

3. Migrate a converted qemu instance to a pre-converted instance. Confirm symmetry between new save routines and old load routines.
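The symmetry check at the heart of each phase can be sketched as a comparison of two ordered sequences of traced operations, one from the save side and one from the load side. The actual helper scripts in test-migration-visitor/ are not reproduced here; the ToyTraceRec layout and toy_first_mismatch() are hypothetical and only illustrate the idea.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One traced put or get: how many bytes, and what value. */
typedef struct ToyTraceRec {
    unsigned size;
    uint64_t value;
} ToyTraceRec;

/*
 * Compare the save-side trace against the load-side trace.
 * Returns -1 if they are symmetric, otherwise the index of the first
 * record where they diverge (including one side ending early).
 */
long toy_first_mismatch(const ToyTraceRec *saved, size_t nsaved,
                        const ToyTraceRec *loaded, size_t nloaded)
{
    size_t n = nsaved < nloaded ? nsaved : nloaded;

    for (size_t i = 0; i < n; i++) {
        if (saved[i].size != loaded[i].size ||
            saved[i].value != loaded[i].value) {
            return (long)i;
        }
    }
    return nsaved == nloaded ? -1 : (long)n;
}
```

Reporting the index of the first divergence, rather than a plain pass/fail, makes it easier to map a failure back to a specific field in the trace.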

There are also unit tests in test-visitor to confirm that data written/read via qemufile/visitor interfaces match.

What's not covered in the testing:

1. coverage: there's no guarantee all paths will be tested, so I've been checking coverage manually by examining the post-processed visitor-based traces, where field names form unique(-ish), human-readable paths, to determine what paths are being hit.

2. correctness: all tests involve checking symmetry on source and target during migration, but there's no guarantee the source values are correct. Presumably, serializing the wrong data will cause a break in symmetry or, if not, a mismatch between the converted load side and the pre-converted save side, but care should be taken nonetheless.