Features/Migration/Visitor

From QEMU
 
Latest revision as of 16:21, 24 April 2014

Summary

This patch series implements a QEMUFile Visitor class that's intended to abstract away direct calls to qemu_put_*/qemu_get_* for save/load functions. Currently this is done by always creating a QEMUFileInputVisitor/QEMUFileOutputVisitor pair with each call to qemu_fopen_ops() and maintaining a QEMUFile->I/O Visitor mapping. save/load functions that are to be converted would then simply use lookup routines to get a Visitor based on their QEMUFile arguments. Once these save/load functions are all converted, we can then change the interfaces in bulk and switch over to passing in the Visitor directly.

Migration compatibility is not affected by these changes, as the qemu_put_*/qemu_get_* calls have a trivial 1-to-1 mapping with the visit_type_* calls, and the added field names/structure are not sent out over the wire when using the QEMUFile visitors, but serve simply as placeholders for future functionality.
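
To make the 1-to-1 mapping concrete, here is a minimal toy model in Python, not the actual QEMU C code. The class and function names (QEMUFileModel, FileOutputVisitorModel, open_file, get_output_visitor) are hypothetical stand-ins invented for this sketch. It only illustrates the idea that a converted save function looks up a visitor from its file argument and that field names never reach the byte stream.

```python
import struct


class QEMUFileModel:
    """Toy stand-in for a QEMUFile: an append-only byte buffer."""

    def __init__(self):
        self.buf = bytearray()

    def put_byte(self, v):
        self.buf += struct.pack(">B", v)

    def put_be32(self, v):
        self.buf += struct.pack(">I", v)


class FileOutputVisitorModel:
    """Toy output visitor: each typed visit forwards to one put_* call.

    The field name is accepted but never written, so the stream is
    byte-for-byte the same as calling put_* directly.
    """

    def __init__(self, f):
        self.f = f

    def type_uint8(self, name, v):
        self.f.put_byte(v)

    def type_uint32(self, name, v):
        self.f.put_be32(v)


# Hypothetical file -> visitor mapping, filled in when a file is opened.
_visitors = {}


def open_file():
    f = QEMUFileModel()
    _visitors[f] = FileOutputVisitorModel(f)
    return f


def get_output_visitor(f):
    return _visitors[f]


def save_old(f, state):
    # Pre-conversion style: direct put_* calls.
    f.put_byte(state["irr"])
    f.put_be32(state["count"])


def save_converted(f, state):
    # Converted style: same signature, but goes through the visitor.
    v = get_output_visitor(f)
    v.type_uint8("irr", state["irr"])
    v.type_uint32("count", state["count"])


state = {"irr": 0x10, "count": 7}
old, new = open_file(), open_file()
save_old(old, state)
save_converted(new, state)
print(old.buf == new.buf)
```

Because save_converted keeps the same file-based signature, functions can be converted one at a time; switching to passing the visitor directly can wait until all of them are done.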

Owner

  • Name: Dave Gilbert
  • Email: dgilbert@redhat.com
  • Name: Michael Roth
  • Email: mdroth@linux.vnet.ibm.com

Plans (2014)

  - Get it to the point where the infrastructure works and doesn't break anything and is basically right
  - Get it merged
  - Deal with the remaining items that are still stuck as non-vmstate

In particular I don't want to have to wait for every device currently in use to work with visitors, so the code bundles old data up as octet-streams.

Code/Status

  • (2014) WIP posted to qemu-devel - http://lists.gnu.org/archive/html/qemu-devel/2014-04/msg03579.html
  • Current todo list:

  - Block migration
  - SPAPR (iterative like RAM and block)
  - RDMA
  - Floats
  - Need to improve the visitor interface for the compatibility structures
  - _TEST entries in VMState, and producing a schema that always works with them
  - Clean up places where shims are passed visitors but still need QEMUFiles
  - Test cases
  - Places where the interfaces and structure are still too tied to the old file format
  - Selecting the BER output format
  - Clean up error handling
  - Visitors should share more code.
  

Michael's code from 2011: The latest stable code is available at

git://repo.or.cz/qemu/mdroth.git migration-visitor-v2

git://repo.or.cz/qemu/mdroth.git migration-visitor-conversions-set1-v1

A bit over 1/3 of the conversions are completed, with most of the x86-relevant conversions done.

Old plans

With these patches in place non-vmstate save/load functions can be converted over to using Visitors incrementally.

Short term (1.1 timeframe), the goal is to implement a new migration protocol that is self-describing, such that incompatibilities or migration errors can be more easily detected. Currently, a simple change in data types for a particular device can introduce subtle bugs that won't be detected by the target, since the target interprets the data according to its own expectation of what those data types are. The plan is to achieve this by using ASN.1 BER in place of QEMUFile via a new BERVisitor.
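
The failure mode described here can be shown with a small Python sketch. It is an illustration only: the tag values and the tag/length/value layout below are made up for the example and are not real ASN.1 BER encoding or anything from the patch series. It contrasts a raw stream, where a type change is silently misread, with a tagged stream, where the target can notice the mismatch.

```python
import struct

# Made-up type tags for this sketch; these are NOT real BER tag numbers.
TAG_U16, TAG_U32 = 0x01, 0x02
FMT = {TAG_U16: ">H", TAG_U32: ">I"}


def put_tagged(tag, value):
    """Encode a value as tag byte, length byte, then big-endian payload."""
    payload = struct.pack(FMT[tag], value)
    return bytes([tag, len(payload)]) + payload


def get_tagged(buf, expected_tag):
    """Decode one value, refusing to proceed if the type tag differs."""
    tag, length = buf[0], buf[1]
    if tag != expected_tag:
        raise ValueError(
            "type mismatch: got tag %#x, expected %#x" % (tag, expected_tag)
        )
    (value,) = struct.unpack(FMT[tag], buf[2:2 + length])
    return value


# Raw stream: the source writes a uint32, the target still expects a uint16.
raw = struct.pack(">I", 0x00010002)
(misread,) = struct.unpack(">H", raw[:2])  # wrong value, and no error raised

# Tagged stream: the same disagreement is detected on the target.
tagged = put_tagged(TAG_U32, 0x00010002)
try:
    get_tagged(tagged, TAG_U16)
    detected = False
except ValueError:
    detected = True

print(hex(misread), detected)
```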

Also planned is using the visitor interface to aid in things like device introspection (for both vmstate and non-vmstate devices), likely via the existing QMPOutputVisitor, as well as debugging/testing migration via a well structured "PrettyPrintVisitor" type of visitor that also retains ordering, field names, and type information (unlike with QMP/JSON or ASN.1). I've played around with doing this indirectly to test the conversions by doing some post-processing on the visit_* trace statements:

 ...
 PIIX3.PCIDevice.config(256)(1).[255] = (uint8) 0x0
 PIIX3.PCIDevice.irq_state(4)(4).[0] = (uint32) 0x0
 PIIX3.PCIDevice.irq_state(4)(4).[1] = (uint32) 0x0
 PIIX3.PCIDevice.irq_state(4)(4).[2] = (uint32) 0x0
 PIIX3.PCIDevice.irq_state(4)(4).[3] = (uint32) 0x0
 PIIX3.pci_irq_levels_vmstate(4)(4).[0] = (int32) 0x0
 PIIX3.pci_irq_levels_vmstate(4)(4).[1] = (int32) 0x0
 PIIX3.pci_irq_levels_vmstate(4)(4).[2] = (int32) 0x0
 PIIX3.pci_irq_levels_vmstate(4)(4).[3] = (int32) 0x0
 i8259.last_irr = (uint8) 0x1                                                     
 i8259.irr = (uint8) 0x10
 i8259.imr = (uint8) 0xb8
 i8259.isr = (uint8) 0x0
 i8259.priority_add = (uint8) 0x0
 i8259.irq_base = (uint8) 0x8
 i8259.read_reg_select = (uint8) 0x0
 i8259.poll = (uint8) 0x0
 i8259.special_mask = (uint8) 0x0
 i8259.init_state = (uint8) 0x0
 i8259.auto_eoi = (uint8) 0x0
 ...
 slirp.ip_id = (uint16) 0x2
 slirp.bootp.bootp_clients(16)(8).[0].allocated = (uint16) 0x1
 slirp.bootp.bootp_clients(16)(8).[0].macaddr(6)(1).[0] = (uint8) 0x1
 slirp.bootp.bootp_clients(16)(8).[0].macaddr(6)(1).[1] = (uint8) 0x0
 slirp.bootp.bootp_clients(16)(8).[0].macaddr(6)(1).[2] = (uint8) 0x0
 slirp.bootp.bootp_clients(16)(8).[0].macaddr(6)(1).[3] = (uint8) 0x0
 slirp.bootp.bootp_clients(16)(8).[0].macaddr(6)(1).[4] = (uint8) 0x0
 slirp.bootp.bootp_clients(16)(8).[0].macaddr(6)(1).[5] = (uint8) 0x0
 slirp.bootp.bootp_clients(16)(8).[1].allocated = (uint16) 0x0
 slirp.bootp.bootp_clients(16)(8).[1].macaddr(6)(1).[0] = (uint8) 0x1
 slirp.bootp.bootp_clients(16)(8).[1].macaddr(6)(1).[1] = (uint8) 0x0
 slirp.bootp.bootp_clients(16)(8).[1].macaddr(6)(1).[2] = (uint8) 0x0
 slirp.bootp.bootp_clients(16)(8).[1].macaddr(6)(1).[3] = (uint8) 0x0
 slirp.bootp.bootp_clients(16)(8).[1].macaddr(6)(1).[4] = (uint8) 0x0
 slirp.bootp.bootp_clients(16)(8).[1].macaddr(6)(1).[5] = (uint8) 0x0
 slirp.bootp.bootp_clients(16)(8).[2].allocated = (uint16) 0x0
 ...

It's not completely clear at this point how the latter will fall into place with other plans such as QOM that would provide similar capabilities, but since device serialization will likely be done via automatically-generated visitor patterns, this conversion should at least serve as an incremental step toward that transition.
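
As a rough idea of how path-style lines like the ones in the listing above can be built, here is a minimal Python sketch of a pretty-printing visitor. The PathPrinter class and its method names are hypothetical and only loosely modelled on the struct/array/type visit calls; the real output was produced by post-processing visit_* traces, whose exact format is not shown on this page.

```python
class PathPrinter:
    """Keeps a stack of enclosing names and prints one line per leaf value."""

    def __init__(self):
        self.path = []   # names of the enclosing structs/arrays
        self.lines = []  # formatted output lines, in visit order

    def start_struct(self, name):
        self.path.append(name)

    def end_struct(self):
        self.path.pop()

    def start_array(self, name, count, elem_size):
        # Arrays are shown as name(count)(element size), as in the listing.
        self.path.append("%s(%d)(%d)" % (name, count, elem_size))

    def end_array(self):
        self.path.pop()

    def type_int(self, name, ctype, value):
        full = ".".join(self.path + [name])
        self.lines.append("%s = (%s) 0x%x" % (full, ctype, value))


p = PathPrinter()

p.start_struct("PIIX3")
p.start_struct("PCIDevice")
p.start_array("irq_state", 4, 4)
for i in range(4):
    p.type_int("[%d]" % i, "uint32", 0)  # array elements are named by index
p.end_array()
p.end_struct()
p.end_struct()

p.start_struct("i8259")
p.type_int("irr", "uint8", 0x10)
p.end_struct()

print("\n".join(p.lines))
```

Unlike a JSON-style dump, a printer like this emits values in visit order and keeps the type next to each value, which is what makes the output usable for comparing source and target.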


(old) Testing

Conversions are being staged/tested through the testing branch:

git://repo.or.cz/qemu/mdroth.git migration-visitor-test

This branch also includes some helper scripts to aid in testing the conversions under the test-migration-visitor/ directory of the test branch (usage details in test-migration-visitor). There are also trivial modifications to the way qemu_put_buffer/qemu_get_buffer behave to provide a closer match to the trace output of converted save/load functions (mainly, qemu_put_buffer uses qemu_put_byte rather than directly memcpy'ing to the QEMUFile buffer, since this is how the QEMUFileVisitor does it via visit_start_array()/visit_end_array()).

Essentially, the test involves 3 phases. Build one qemu with only the patches up to and including "tests: helper scripts for testing migration visitor conversions", and another qemu with all the conversions you're testing applied on top of that (both with --enable-trace-backend=stderr). The migrations are then run via the test-migration.sh helper script.

The conversions are fairly trivial for the most part, and have been tested using a mostly-automated test framework that traces all visitor-based and qemufile-based puts/gets and does the following:

1. Migrate a converted qemu instance to a converted qemu instance. Check that visitor and qemufile puts/gets match up and report the same values. This confirms symmetry between the save and load sides.

2. Migrate a pre-converted qemu instance to a converted qemu instance. Check that the qemufile puts/gets match up between the two. This confirms symmetry between old save routines and new load routines.

3. Migrate a converted qemu instance to a pre-converted instance. Confirm symmetry between new save routines and old load routines.
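
The symmetry check in the three phases above boils down to comparing two ordered sequences of traced values. A minimal Python sketch, assuming the traces have already been reduced to plain lists of values (the real helper scripts and trace format are not reproduced here, and first_mismatch is a name invented for this example):

```python
def first_mismatch(src, dst):
    """Return the index of the first differing entry, or None if identical.

    A length difference counts as a mismatch at the end of the shorter list.
    """
    for i, (a, b) in enumerate(zip(src, dst)):
        if a != b:
            return i
    if len(src) != len(dst):
        return min(len(src), len(dst))
    return None


# Hypothetical, already post-processed traces (one value per put/get).
new_save = [0x10, 0xB8, 0x00, 0x08]
new_load = [0x10, 0xB8, 0x00, 0x08]
old_save = [0x10, 0xB8, 0x00, 0x08]
bad_load = [0x10, 0xB8, 0x08]  # e.g. a conversion that skipped a field

print(first_mismatch(new_save, new_load))  # phase 1: new -> new
print(first_mismatch(old_save, new_load))  # phase 2: old -> new
print(first_mismatch(new_save, bad_load))  # index where symmetry breaks
```

Reporting the index of the first mismatch, rather than a plain yes/no, makes it easier to map a failure back to the field that was converted incorrectly.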

There are also unit tests in test-visitor to confirm that data written/read via qemufile/visitor interfaces match.

What's not covered in the testing:

1. coverage: there's no guarantee all paths will be tested, so I've been checking coverage manually by examining the post-processed visitor-based traces, where field names form unique(-ish) paths that are human readable, to determine what paths are being hit.

2. correctness: all tests involve checking symmetry on source and target during migration, but there's no guarantee the source values are correct. Presumably, serializing the wrong data will cause a break in symmetry, or if not, cause a mismatch in the converted load side with pre-converted save side, but care should be taken nonetheless.
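
The manual coverage check from item 1 could be partly automated along these lines. This is a sketch in Python under assumptions: the trace lines are taken to be in the "path = (type) value" form shown earlier on this page, and the list of expected paths is something the tester would have to write by hand. The field_paths helper is invented for this example.

```python
import re


def field_paths(trace_lines):
    """Collect the set of field paths hit, with array indices collapsed."""
    paths = set()
    for line in trace_lines:
        if " = " not in line:
            continue  # skip anything that is not a "path = value" line
        path = line.split(" = ", 1)[0].strip()
        paths.add(re.sub(r"\[\d+\]", "[]", path))
    return paths


trace = [
    "i8259.irr = (uint8) 0x10",
    "PIIX3.PCIDevice.irq_state(4)(4).[0] = (uint32) 0x0",
    "PIIX3.PCIDevice.irq_state(4)(4).[1] = (uint32) 0x0",
    "...",
]

# Hand-written list of paths the tester expects a run to exercise.
expected = {
    "i8259.irr",
    "i8259.imr",
    "PIIX3.PCIDevice.irq_state(4)(4).[]",
}

missing = expected - field_paths(trace)
print(sorted(missing))
```

This only shows which fields were visited at all; it says nothing about whether the values passed to the visitor were the right ones, which is the correctness gap described in item 2.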