Today we only support generating the latest serialization of devices. To increase the probability of the latest version working on older versions of QEMU, we strategically omit fields that we know can safely be omitted with older versions (subsections). More than likely, migrating new to old won't work.
Migrating old to new is more likely to work. We version each section in order to be able to identify when we're dealing with old.
But all of this logic lives in one of two forms: either as a savevm/loadvm callback that takes a QEMUFile and writes a byte serialization to the stream in an open-coded way (usually big endian), or encoded declaratively in a VMState section.
What we need
We need to decompose migration into three different problems:
- serializing device state
- transforming the device model in order to satisfy forwards and backwards compatibility
- encoding the serialized device model on the wire.
We also need a way to future proof ourselves.
What we can do
Add migration capabilities to future proof ourselves. I think the simplest way this would work is to have a 'query-migration-capabilities' command that returns a bitmask of supported migration features. I think we should also introduce a 'set-migration-capabilities' command that can mask some of the supported features.
A management tool would run 'query-migration-capabilities' on the source and destination, take the intersection of the two masks, and set that mask on both the source and destination.
Lack of support for these commands indicates a mask of zero which is the protocol we offer today.
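The negotiation above is just bit arithmetic. Here is a minimal sketch of it; the feature bits and function names are made up for illustration, since nothing in this proposal defines actual capability bits yet.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical feature bits; this proposal does not define any. */
enum {
    MIG_CAP_SECTION_SIZES   = 1u << 0,
    MIG_CAP_SELF_DESCRIBING = 1u << 1,
    MIG_CAP_BIDIRECTIONAL   = 1u << 2,
};

/* A QEMU without the query command is treated as reporting zero,
 * which selects the protocol we offer today. */
#define MIG_CAP_LEGACY 0u

/* What a management tool would compute from the two query results. */
uint32_t mig_caps_negotiate(uint32_t src_caps, uint32_t dst_caps)
{
    return src_caps & dst_caps;
}

/* Sketch of the 'set' side: the caller may only mask bits off,
 * never enable something this QEMU did not advertise. */
uint32_t mig_caps_set(uint32_t supported, uint32_t requested)
{
    assert((requested & ~supported) == 0);
    return requested;
}
```

With this, a source advertising section sizes plus a self-describing format and a destination advertising a self-describing format plus bidirectional transport would both end up with only the self-describing bit set.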
Switch to a visitor model to serialize device state. This involves converting any occurrence of:
qemu_put_be32(f, port->guest_connected);
to:
visit_type_u32(v, "guest_connected", &port->guest_connected, &local_err);
It's 100% mechanical and makes absolutely no logic change. It works equally well with legacy and VMState migration handlers.
Add a Visitor class that operates on QEMUFile.
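To make the idea concrete, here is a toy sketch of a visitor that marshals to and from a byte buffer. It is not QEMU's Visitor API: ToyVisitor, ToyPort, and the toy_* functions are hypothetical, the buffer stands in for a QEMUFile, and error reporting is reduced to an assert. The point is that a device describes its fields once and the visitor decides whether that means save or load.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct ToyVisitor ToyVisitor;
struct ToyVisitor {
    /* One callback per primitive type; only u32 is sketched here. */
    void (*type_u32)(ToyVisitor *v, const char *name, uint32_t *val);
    uint8_t *buf;   /* stand-in for the QEMUFile stream */
    size_t len;
    size_t pos;
};

void toy_visit_u32(ToyVisitor *v, const char *name, uint32_t *val)
{
    v->type_u32(v, name, val);
}

/* Output direction: write big endian; a binary stream ignores the name. */
static void out_u32(ToyVisitor *v, const char *name, uint32_t *val)
{
    (void)name;
    assert(v->pos + 4 <= v->len);
    v->buf[v->pos++] = (uint8_t)(*val >> 24);
    v->buf[v->pos++] = (uint8_t)(*val >> 16);
    v->buf[v->pos++] = (uint8_t)(*val >> 8);
    v->buf[v->pos++] = (uint8_t)(*val);
}

/* Input direction: read the same four bytes back into the field. */
static void in_u32(ToyVisitor *v, const char *name, uint32_t *val)
{
    (void)name;
    assert(v->pos + 4 <= v->len);
    *val = ((uint32_t)v->buf[v->pos] << 24) |
           ((uint32_t)v->buf[v->pos + 1] << 16) |
           ((uint32_t)v->buf[v->pos + 2] << 8) |
           (uint32_t)v->buf[v->pos + 3];
    v->pos += 4;
}

ToyVisitor toy_output_visitor(uint8_t *buf, size_t len)
{
    ToyVisitor v = { out_u32, buf, len, 0 };
    return v;
}

ToyVisitor toy_input_visitor(uint8_t *buf, size_t len)
{
    ToyVisitor v = { in_u32, buf, len, 0 };
    return v;
}

/* Hypothetical device state with two fields. */
typedef struct {
    uint32_t guest_connected;
    uint32_t host_connected;
} ToyPort;

/* One description of the state serves both save and load. */
void toy_port_visit(ToyVisitor *v, ToyPort *port)
{
    toy_visit_u32(v, "guest_connected", &port->guest_connected);
    toy_visit_u32(v, "host_connected", &port->host_connected);
}
```

Because the field names are passed along even though the binary visitor ignores them, a different visitor could build a named data structure from the very same toy_port_visit() call.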
At this stage, we can migrate to data structures. That means we can migrate to QEMUFile, QObjects, or JSON. We could change the protocol at this stage to something that is still binary but has section sizes and things of that nature.
But we shouldn't stop here.
Device Model Transformation
Compatibility logic should be extracted from the savevm functions and VMState functions into separate functions that take a data structure. Basically, we want to have something roughly equivalent to:
QObject *e1000_migration_compatibility(QObject *src, int src_version, int dst_version);
We can have lots of helpers that reuse the VMState declarative stuff to do this, but this should be registered independently of the main serialization handler.
This moves us to a model where we always generate the latest serialization format, and then have specific ways to convert to older mechanisms. It allows us to do very big backwards compatibility steps like convert the state of one device into two separate devices (because we're just dealing with in-memory data structures).
It's this step that lets us truly support compatibility with migration. The good news is, it doesn't have to be all or nothing. Since we already always generate the latest serialization format, the existing code only deals with migrating older versions to the latest, which is something that isn't all that important.
So if we did this in 1.0, we could have a single function that converted the 1.0 device model to 1.1 and vice versa, and we'd be fine. We wouldn't have to touch 200 devices to do this.
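As a rough illustration of what such a compatibility function could look like, here is a toy sketch. It uses a small fixed-size field table in place of a QObject, and the device, its "extra" field, and the version numbers are all invented for the example; only the shape of the function (state in, source and destination version, state out) follows the prototype above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_MAX_FIELDS 8

/* Stand-in for a QObject dictionary: named u32 fields only. */
typedef struct {
    const char *name;
    uint32_t value;
} ToyField;

typedef struct {
    ToyField fields[TOY_MAX_FIELDS];
    int count;
} ToyState;

/* Return the index of a field, or -1 if it is absent. */
int toy_find(const ToyState *s, const char *name)
{
    for (int i = 0; i < s->count; i++) {
        if (strcmp(s->fields[i].name, name) == 0) {
            return i;
        }
    }
    return -1;
}

/* Set a field, appending it if it does not exist yet. */
void toy_set(ToyState *s, const char *name, uint32_t value)
{
    int i = toy_find(s, name);
    if (i < 0) {
        assert(s->count < TOY_MAX_FIELDS);
        i = s->count++;
        s->fields[i].name = name;
    }
    s->fields[i].value = value;
}

/* Remove a field by moving the last entry into its slot. */
void toy_remove(ToyState *s, const char *name)
{
    int i = toy_find(s, name);
    if (i >= 0) {
        s->fields[i] = s->fields[s->count - 1];
        s->count--;
    }
}

/*
 * Hypothetical rule: version 2 of this made-up device added "extra".
 * Downgrading drops it; upgrading supplies a default of 0.
 * The input is passed by value, so the caller's copy is untouched.
 */
ToyState toy_dev_migration_compatibility(ToyState src, int src_version,
                                         int dst_version)
{
    if (src_version >= 2 && dst_version < 2) {
        toy_remove(&src, "extra");
    } else if (src_version < 2 && dst_version >= 2) {
        if (toy_find(&src, "extra") < 0) {
            toy_set(&src, "extra", 0);
        }
    }
    return src;
}
```

Nothing here touches the serialization code or the wire format, which is the whole point: the transformation only sees an in-memory data structure.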
Next Gen Transport
Once we're here, we can implement the next 5-year format. That could be ASN.1 and be bidirectional or whatever makes the most sense. We could support 50 formats if we wanted to. As long as the transport is distinct from the serialization and compat routines, it really doesn't matter.
- Add query-migration-capabilities command and set-migration-capabilities command
- Enhance Visitor framework to support u8, u16, u32, u64 types.
- Add a QEMUFile Visitor that marshals to and from a QEMUFile. All QEMUFile Visitors should be tracked in a linked list, with a function that can look up a Visitor for a given QEMUFile pointer.
- Begin converting occurrences of qemu_put_be* to visit_type_* using the aforementioned function to look up the Visitor * based on a QEMUFile *. Should be possible to test at every stage. Likewise, convert qemu_get_be* to visit_type_*.
- Once fully converted, remove QEMUFile * from the migration path and replace it with Visitor *, then remove the linked list and reverse lookup function.
- Instead of using the QEMUFile Visitor when marshalling, use QmpOutputVisitor to store device state in a data structure. Implement a function that can dump a QObject to a QEMUFile *.
- Allow filtering of marshalling via a registered callback manipulating a QObject.
- Switch to a self describing migration transport like ASN.1 or JSON.
- Build a QObject from incoming migration traffic, use QmpInputVisitor to pass to serialization code.
- Allow filtering of unmarshalling via a registered callback manipulating a QObject.
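The temporary reverse lookup mentioned in the plan (find the Visitor for a given QEMUFile pointer) could be as simple as the following sketch. ToyFile and ToyFileVisitor are stand-ins, not the real QEMUFile or Visitor types, and the function names are made up. Since it only exists to keep the tree testable during the conversion, a plain singly linked list is probably good enough.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for QEMUFile; only its address matters here. */
typedef struct ToyFile {
    int fd;
} ToyFile;

/* Stand-in for a QEMUFile-backed Visitor, chained into a global list. */
typedef struct ToyFileVisitor {
    ToyFile *file;
    struct ToyFileVisitor *next;
} ToyFileVisitor;

static ToyFileVisitor *visitor_list;

/* Track a visitor so legacy code holding only a file can find it. */
void toy_visitor_register(ToyFileVisitor *v, ToyFile *f)
{
    assert(v != NULL && f != NULL);
    v->file = f;
    v->next = visitor_list;
    visitor_list = v;
}

/* Reverse lookup: file pointer to visitor, or NULL if none is tracked. */
ToyFileVisitor *toy_visitor_lookup(const ToyFile *f)
{
    for (ToyFileVisitor *v = visitor_list; v != NULL; v = v->next) {
        if (v->file == f) {
            return v;
        }
    }
    return NULL;
}

/* Unlink a visitor; a no-op if it was never registered. */
void toy_visitor_unregister(ToyFileVisitor *v)
{
    for (ToyFileVisitor **p = &visitor_list; *p != NULL; p = &(*p)->next) {
        if (*p == v) {
            *p = v->next;
            v->next = NULL;
            return;
        }
    }
}
```

A half-converted handler that still receives a file pointer would call the lookup once at the top and use the returned visitor for the fields that have already been converted.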