Features/Migration/Visitor

Summary

This patch series implements a QEMUFile Visitor class intended to abstract away direct calls to qemu_put_*/qemu_get_* in save/load functions. Currently this is done by creating a QEMUFileInputVisitor/QEMUFileOutputVisitor pair on every call to qemu_fopen_ops() and maintaining a mapping from each QEMUFile to its input/output Visitors. Save/load functions that are to be converted then simply use lookup routines to get a Visitor based on their QEMUFile argument. Once all of these save/load functions are converted, we can change the interfaces in bulk and switch over to passing in the Visitor directly.
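
As a rough illustration of the interim arrangement described above, the following self-contained C sketch models a file-to-visitor mapping and a save function before and after conversion. None of the names (MockFile, MockVisitor, mock_register_file, mock_get_output_visitor, MockDevice) come from the patch series; they are hypothetical stand-ins, and the real QEMUFile and Visitor types are considerably richer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration only; not QEMU's definitions. */
typedef struct MockFile {
    uint8_t buf[64];
    size_t pos;
} MockFile;

typedef struct MockVisitor {
    MockFile *f;
    /* A real visitor has one callback per type; one is enough here. */
    void (*type_uint8)(struct MockVisitor *v, uint8_t *val, const char *name);
} MockVisitor;

/* Old-style direct write, playing the role of qemu_put_byte(). */
void mock_put_byte(MockFile *f, uint8_t val)
{
    if (f->pos < sizeof(f->buf)) {
        f->buf[f->pos++] = val;
    }
}

/* Output visitor callback: the field name is accepted but never written. */
void output_type_uint8(MockVisitor *v, uint8_t *val, const char *name)
{
    (void)name;
    mock_put_byte(v->f, *val);
}

/* File -> visitor mapping, filled in when a file is "opened". */
#define MAX_FILES 8
static struct { MockFile *f; MockVisitor out; } file_map[MAX_FILES];
static size_t file_map_len;

/* Plays the role of creating the visitor when the file is opened. */
int mock_register_file(MockFile *f)
{
    if (file_map_len == MAX_FILES) {
        return -1;
    }
    f->pos = 0;
    file_map[file_map_len].f = f;
    file_map[file_map_len].out.f = f;
    file_map[file_map_len].out.type_uint8 = output_type_uint8;
    file_map_len++;
    return 0;
}

/* Lookup routine a converted save function would call. */
MockVisitor *mock_get_output_visitor(MockFile *f)
{
    for (size_t i = 0; i < file_map_len; i++) {
        if (file_map[i].f == f) {
            return &file_map[i].out;
        }
    }
    return NULL;
}

typedef struct MockDevice { uint8_t irr, imr; } MockDevice;

/* Before conversion: fields are written directly. */
void device_save_old(MockFile *f, MockDevice *d)
{
    mock_put_byte(f, d->irr);
    mock_put_byte(f, d->imr);
}

/* After conversion: same signature, but every field goes via the visitor. */
void device_save_new(MockFile *f, MockDevice *d)
{
    MockVisitor *v = mock_get_output_visitor(f);
    assert(v != NULL);
    v->type_uint8(v, &d->irr, "irr");
    v->type_uint8(v, &d->imr, "imr");
}
```

Calling device_save_old() on one registered file and device_save_new() on another should leave the same two bytes in both buffers, which is the property the conversion relies on.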

This series also converts all of vmstate over to using Visitors, leveraging the existing hierarchical vmstate structure to handle namespacing and structuring of the fields that are fed to the visitor functions. It also introduces trace statements for qemu_put_*/qemu_get_* and visitor API calls to aid in testing.

Migration compatibility is not affected by these changes, as the qemu_put_*/qemu_get_* calls have a trivial 1-to-1 mapping with the visit_type_* calls, and the added field names/structure are not sent out over the wire when using the QEMUFile visitors, but serve simply as placeholders for future functionality.

Owner

  • Name: Michael Roth
  • Email: mdroth@linux.vnet.ibm.com

Plans

With these patches in place, non-vmstate save/load functions can be converted over to using Visitors incrementally.

Short term (1.1 timeframe), the goal is to implement a new migration protocol that is self-describing, so that incompatibilities or migration errors can be detected more easily. Today, a simple change in data types for a particular device can introduce subtle bugs that the target won't detect, since the target interprets the data according to its own expectation of what those data types are. The current plan is to achieve this by using ASN.1 BER in place of QEMUFile via a new BERVisitor.
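
To make the failure mode concrete, here is a minimal sketch in C that contrasts an untagged read with a tagged one. It uses a simplified tag-length-value layout in the spirit of BER, not actual ASN.1 BER encoding; the tag values and all function names are hypothetical and are not part of the planned BERVisitor.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type tags for this sketch; real BER uses ASN.1 tag numbers. */
enum { TAG_UINT16 = 1, TAG_UINT32 = 2 };

/* Untagged read: the target simply trusts its own idea of the field width. */
uint32_t raw_get_be32(const uint8_t *buf)
{
    return ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
           ((uint32_t)buf[2] << 8) | (uint32_t)buf[3];
}

/* Write tag, length, then the value in big-endian order.
 * Returns the number of bytes written. */
size_t tlv_put_uint(uint8_t *buf, uint8_t tag, uint32_t val)
{
    assert(tag == TAG_UINT16 || tag == TAG_UINT32);
    uint8_t len = (tag == TAG_UINT16) ? 2 : 4;
    buf[0] = tag;
    buf[1] = len;
    for (uint8_t i = 0; i < len; i++) {
        buf[2 + i] = (uint8_t)(val >> (8 * (len - 1 - i)));
    }
    return 2 + (size_t)len;
}

/* Tagged read: succeed only if the stream's tag matches what the target
 * expects. Returns 0 on success, -1 on type mismatch or truncated input. */
int tlv_get_uint(const uint8_t *buf, size_t avail, uint8_t expected_tag,
                 uint32_t *val)
{
    if (avail < 2 || buf[0] != expected_tag) {
        return -1;
    }
    uint8_t len = buf[1];
    if (len > 4 || avail < 2 + (size_t)len) {
        return -1;
    }
    uint32_t v = 0;
    for (uint8_t i = 0; i < len; i++) {
        v = (v << 8) | buf[2 + i];
    }
    *val = v;
    return 0;
}
```

If the source writes two uint16 fields (0x0002, 0x0001) and the target expects one uint32, raw_get_be32() returns 0x00020001 without complaint, whereas tlv_get_uint() with TAG_UINT32 fails on the first record because the stream's tag says TAG_UINT16.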

Also planned is using the visitor interface to aid in things like device introspection (for both vmstate and non-vmstate devices), likely via the existing QMPOutputVisitor, as well as debugging/testing migration via a well-structured "PrettyPrintVisitor" that also retains ordering, field names, and type information (unlike QMP/JSON or ASN.1). I've played around with doing this indirectly to test the conversions by doing some post-processing on the visit_* trace statements:

 ...
 PIIX3.PCIDevice.config(256)(1).[255] = (uint8) 0x0
 PIIX3.PCIDevice.irq_state(4)(4).[0] = (uint32) 0x0
 PIIX3.PCIDevice.irq_state(4)(4).[1] = (uint32) 0x0
 PIIX3.PCIDevice.irq_state(4)(4).[2] = (uint32) 0x0
 PIIX3.PCIDevice.irq_state(4)(4).[3] = (uint32) 0x0
 PIIX3.pci_irq_levels_vmstate(4)(4).[0] = (int32) 0x0
 PIIX3.pci_irq_levels_vmstate(4)(4).[1] = (int32) 0x0
 PIIX3.pci_irq_levels_vmstate(4)(4).[2] = (int32) 0x0
 PIIX3.pci_irq_levels_vmstate(4)(4).[3] = (int32) 0x0
 i8259.last_irr = (uint8) 0x1
 i8259.irr = (uint8) 0x10
 i8259.imr = (uint8) 0xb8
 i8259.isr = (uint8) 0x0
 i8259.priority_add = (uint8) 0x0
 i8259.irq_base = (uint8) 0x8
 i8259.read_reg_select = (uint8) 0x0
 i8259.poll = (uint8) 0x0
 i8259.special_mask = (uint8) 0x0
 i8259.init_state = (uint8) 0x0
 i8259.auto_eoi = (uint8) 0x0
 ...
 slirp.ip_id = (uint16) 0x2
 slirp.bootp.bootp_clients(16)(8).[0].allocated = (uint16) 0x1
 slirp.bootp.bootp_clients(16)(8).[0].macaddr(6)(1).[0] = (uint8) 0x1
 slirp.bootp.bootp_clients(16)(8).[0].macaddr(6)(1).[1] = (uint8) 0x0
 slirp.bootp.bootp_clients(16)(8).[0].macaddr(6)(1).[2] = (uint8) 0x0
 slirp.bootp.bootp_clients(16)(8).[0].macaddr(6)(1).[3] = (uint8) 0x0
 slirp.bootp.bootp_clients(16)(8).[0].macaddr(6)(1).[4] = (uint8) 0x0
 slirp.bootp.bootp_clients(16)(8).[0].macaddr(6)(1).[5] = (uint8) 0x0
 slirp.bootp.bootp_clients(16)(8).[1].allocated = (uint16) 0x0
 slirp.bootp.bootp_clients(16)(8).[1].macaddr(6)(1).[0] = (uint8) 0x1
 slirp.bootp.bootp_clients(16)(8).[1].macaddr(6)(1).[1] = (uint8) 0x0
 slirp.bootp.bootp_clients(16)(8).[1].macaddr(6)(1).[2] = (uint8) 0x0
 slirp.bootp.bootp_clients(16)(8).[1].macaddr(6)(1).[3] = (uint8) 0x0
 slirp.bootp.bootp_clients(16)(8).[1].macaddr(6)(1).[4] = (uint8) 0x0
 slirp.bootp.bootp_clients(16)(8).[1].macaddr(6)(1).[5] = (uint8) 0x0
 slirp.bootp.bootp_clients(16)(8).[2].allocated = (uint16) 0x0
 ...
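
A visitor that produced lines like these directly would mostly need to track the current path of nested names. The C sketch below shows one way that could look; PathStack and the format_* helpers are hypothetical names, and reading the two parenthesized numbers as element count and element size is an inference from the sample output above rather than something documented.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Dotted prefix of the structs currently being visited,
 * e.g. "PIIX3.PCIDevice". Hypothetical helper type. */
typedef struct PathStack {
    char path[128];
} PathStack;

/* Enter a nested struct: append ".name" (or just "name" at top level). */
void path_push(PathStack *ps, const char *name)
{
    /* '.' is the separator, so names must not contain it. */
    assert(strchr(name, '.') == NULL);
    size_t len = strlen(ps->path);
    snprintf(ps->path + len, sizeof(ps->path) - len, "%s%s",
             len ? "." : "", name);
}

/* Leave the innermost struct: cut the path at the last '.'. */
void path_pop(PathStack *ps)
{
    char *dot = strrchr(ps->path, '.');
    if (dot) {
        *dot = '\0';
    } else {
        ps->path[0] = '\0';
    }
}

/* Format a scalar field, e.g. "i8259.irr = (uint8) 0x10". */
int format_scalar(const PathStack *ps, const char *name, const char *type,
                  uint64_t value, char *out, size_t outsz)
{
    return snprintf(out, outsz, "%s.%s = (%s) 0x%" PRIx64,
                    ps->path, name, type, value);
}

/* Format one array element as "path.name(count)(elem_size).[index] = ...". */
int format_array_elem(const PathStack *ps, const char *name, unsigned count,
                      unsigned elem_size, unsigned index, const char *type,
                      uint64_t value, char *out, size_t outsz)
{
    return snprintf(out, outsz, "%s.%s(%u)(%u).[%u] = (%s) 0x%" PRIx64,
                    ps->path, name, count, elem_size, index, type, value);
}
```

A start-of-struct callback would call path_push(), the matching end callback would call path_pop(), and each type callback would emit one formatted line.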

It's not completely clear at this point how the introspection and debugging uses will fit with other plans, such as QOM, that would provide similar capabilities. However, since device serialization will likely be done via automatically generated visitor patterns, this conversion should at least serve as an incremental step toward that transition.

Code/Testing

The latest stable code is available at

git://repo.or.cz/qemu/mdroth.git migration-visitor

Conversions are being staged/tested through the testing branch:

git://repo.or.cz/qemu/mdroth.git migration-visitor-test

This branch also includes helper scripts, under the test-migration-visitor/ directory, to aid in testing the conversions (usage details in test-migration-visitor). It also makes trivial modifications to the way qemu_put_buffer/qemu_get_buffer behave so that their trace output more closely matches that of converted save/load functions. Mainly, qemu_put_buffer uses qemu_put_byte rather than directly memcpy'ing to the QEMUFile buffer, since this is how the QEMUFileVisitor does it via visit_start_array()/visit_end_array().

Essentially, the test involves 3 phases. First build two copies of qemu, both with --enable-trace-backend=stderr: an "old" qemu with only the patches up to and including "tests: helper scripts for testing migration visitor conversions", and a "new" qemu with all the conversions you're testing applied on top of that. We then do the following via the test-migration.sh helper script:

1. Migrate "new" qemu to another instance of "new" qemu and look at the processed visit_* traces. These should be identical on source and target. Also use the field names to get an idea of what kind of coverage you're getting in testing the converted devices. You may have to play with command-line options, code, etc. to make sure all the paths are covered in your testing.

2. Migrate "old" qemu to a "new" qemu instance, look at the processed qemu_(put|get)_byte traces, and confirm that they are identical. This means your conversions read load data in the same sequence as before the conversion.

3. Migrate "new" qemu to "old" qemu and do the same analysis. This means your conversions generate save data in the same sequence as before the conversion.
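
The comparison in each phase boils down to checking that two ordered lists of trace records are identical and, if not, finding where they first diverge. The helper below is a hypothetical C sketch of that check, not a reproduction of the scripts in the test branch; it assumes the traces have already been processed into one normalized string per record.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Compare two processed traces record by record.
 * Returns -1 if they are identical, otherwise the index of the first
 * record that differs (or the length of the shorter trace if one is a
 * strict prefix of the other). */
long first_trace_mismatch(const char *const *src, size_t nsrc,
                          const char *const *dst, size_t ndst)
{
    assert(src != NULL && dst != NULL);
    size_t n = nsrc < ndst ? nsrc : ndst;
    for (size_t i = 0; i < n; i++) {
        if (strcmp(src[i], dst[i]) != 0) {
            return (long)i;
        }
    }
    return nsrc == ndst ? -1 : (long)n;
}
```

The returned index points at the first field whose save and load sequences disagree, which is usually the conversion to inspect first.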

A set of unit tests, buildable via make test-visitor, is used to ensure that each qemu_put_*/qemu_get_* call matches its visit_type_* counterpart, for example qemu_put_be32() and visit_type_uint32().
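
The kind of check involved can be sketched as follows. This is not the code behind make test-visitor; ByteSink and the sink_* functions are hypothetical stand-ins for a QEMUFile buffer, qemu_put_be32(), and an output visitor's uint32 callback, written independently of each other so that the comparison is meaningful.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical in-memory stand-in for a QEMUFile buffer. */
typedef struct ByteSink {
    uint8_t buf[16];
    size_t len;
} ByteSink;

void sink_put_byte(ByteSink *s, uint8_t b)
{
    assert(s->len < sizeof(s->buf));
    s->buf[s->len++] = b;
}

/* Stand-in for qemu_put_be32(): most significant byte first. */
void sink_put_be32(ByteSink *s, uint32_t v)
{
    sink_put_byte(s, (uint8_t)(v >> 24));
    sink_put_byte(s, (uint8_t)(v >> 16));
    sink_put_byte(s, (uint8_t)(v >> 8));
    sink_put_byte(s, (uint8_t)v);
}

/* Stand-in for an output visitor's uint32 callback. The name argument is
 * accepted but contributes nothing to the byte stream. */
void sink_visit_uint32(ByteSink *s, uint32_t *v, const char *name)
{
    (void)name;
    for (int shift = 24; shift >= 0; shift -= 8) {
        sink_put_byte(s, (uint8_t)(*v >> shift));
    }
}

/* Returns 1 if both paths emit exactly the same bytes for this value. */
int uint32_paths_match(uint32_t value)
{
    ByteSink direct = { {0}, 0 };
    ByteSink visited = { {0}, 0 };
    sink_put_be32(&direct, value);
    sink_visit_uint32(&visited, &value, "field");
    return direct.len == visited.len &&
           memcmp(direct.buf, visited.buf, direct.len) == 0;
}
```

A test would call uint32_paths_match() over a handful of boundary values (0, 1, 0xffffffff, and a value with four distinct bytes) and repeat the same pattern for each other type pair.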

What's not covered by this testing is a conversion mistake that passes the wrong data into visit_type_*(), so just be careful there :)