Features/record-replay

Overview

The record/replay feature is an implementation of deterministic replay for system-level simulation (softmmu mode).

Record/replay functions are used for reverse execution and deterministic replay of QEMU execution. Deterministic replay records a volatile system execution once and replays it as many times as needed for analysis, debugging, logging, and similar tasks. This implementation of deterministic replay can be used for deterministic and reverse debugging of guest code through a gdb remote interface.

One of the aims of deterministic/reverse debugging is eliminating Heisenbugs. Stopping a program in the debugger may cause a timeout in data processing or data transfer, so the behavior of a connected device may change and the bug disappears. Each program run can also expose different behavior, leaving no chance to examine the bug.

Limitations

Record/replay reuses icount to implement deterministic execution. It therefore inherits the icount limitations:

  • It works only in single-CPU TCG mode.
  • Some platforms have an incomplete icount implementation.

The current record/replay implementation is incomplete and cannot be used with:

  • Passthrough USB devices

Patches to support these devices will be added later.

Using record/replay

The record/replay feature has been tested on the i386, x86_64, ARM, and MIPS platforms.

Execution recording is enabled through the icount command-line option: -icount shift=7,rr=record,rrfile=replay.bin

To replay a recording, the icount option should look like this: -icount shift=7,rr=replay,rrfile=replay.bin
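The two icount values differ only in the rr field, so they can be generated from one helper. The following is a minimal sketch, not part of QEMU: the function name rr_icount is hypothetical, qemu-system-i386 is picked only because i386 is among the tested platforms, and the commands are printed rather than executed.

```shell
# Build the value of the -icount option for record/replay.
# $1: mode, "record" or "replay"; $2: log file (defaults to replay.bin).
# rr_icount is a hypothetical helper name used only in this sketch.
rr_icount() {
    case "$1" in
        record|replay)
            printf 'shift=7,rr=%s,rrfile=%s\n' "$1" "${2:-replay.bin}"
            ;;
        *)
            echo "rr_icount: mode must be record or replay" >&2
            return 1
            ;;
    esac
}

# Dry run: print the two command lines instead of starting QEMU.
echo "qemu-system-i386 -icount $(rr_icount record)"
echo "qemu-system-i386 -icount $(rr_icount replay)"
```

Keeping shift and rrfile in one place makes it harder for the record and replay invocations to drift apart.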

To record and replay block operations the drive must be configured as follows:

-drive file=disk.qcow,if=none,id=img-direct
-drive driver=blkreplay,if=none,image=img-direct,id=img-blkreplay
-device ide-hd,drive=img-blkreplay

The blkreplay driver should be inserted between the disk image and the virtual drive controller, so that all disk requests can be recorded and replayed.
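As a sketch of how the drive chain fits into a full invocation, the helper below prints the three options for a given image and then prints a record and a replay command line that share them. The function name blkreplay_opts is hypothetical, the ids are the ones from the example above, and nothing is executed.

```shell
# Print the options that chain: disk image -> blkreplay -> ide-hd.
# $1: disk image file. The ids are the ones used in the example above.
# blkreplay_opts is a hypothetical helper name used only in this sketch.
blkreplay_opts() {
    printf '%s %s %s\n' \
        "-drive file=$1,if=none,id=img-direct" \
        "-drive driver=blkreplay,if=none,image=img-direct,id=img-blkreplay" \
        "-device ide-hd,drive=img-blkreplay"
}

# Dry run: both modes get the identical drive chain; only rr= differs.
for mode in record replay; do
    echo "qemu-system-i386 -icount shift=7,rr=$mode,rrfile=replay.bin $(blkreplay_opts disk.qcow)"
done
```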

Character devices connected to QEMU are recorded/replayed automatically. The record and replay command lines should attach the same number of character devices.

Network interactions are recorded and replayed through a network filter. Each backend must have its own instance of the replay filter, as follows:

-netdev user,id=net1 -device rtl8139,netdev=net1
-object filter-replay,id=replay,netdev=net1
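With more than one backend, the per-backend filter requirement can be met by generating the options in a loop. This is a sketch under assumptions: the function name net_replay_opts and the filter id pattern replay-<netdev> are choices made for this example, and the options are printed, not passed to QEMU.

```shell
# For every backend id given as an argument, print a user-mode backend with
# an rtl8139 NIC and a dedicated filter-replay object attached to it.
# The filter id "replay-<netdev>" is a naming choice made for this sketch.
net_replay_opts() {
    for nd in "$@"; do
        printf '%s\n' \
            "-netdev user,id=${nd} -device rtl8139,netdev=${nd}" \
            "-object filter-replay,id=replay-${nd},netdev=${nd}"
    done
}

# Two backends yield two separate filter instances.
net_replay_opts net1 net2
```

Deriving the filter id from the backend id keeps the object ids unique when several backends are configured.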

Record/replay for audio devices (-soundhw option) is enabled automatically.

Supported inputs

  • Mouse input
  • Keyboard input
  • Host real time clock
  • Character devices
  • Network devices
  • Audio input

Snapshotting

New VM snapshots may be created in replay mode. They can be used later to recover the desired VM state. Every VM state created in replay mode is associated with a moment of time in the replay scenario. After the VM state is recovered, replay starts from that position.

The default starting snapshot name may be specified with the icount field rrsnapshot, as follows:

-icount shift=7,rr=record,rrfile=replay.bin,rrsnapshot=snapshot_name

This snapshot is created at the start of recording and restored at the start of replaying. It can also be loaded while replaying to roll back the execution.
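Because the snapshot is created during recording and restored during replaying, both invocations need the same rrsnapshot name. A minimal sketch follows; the function name rr_icount_snap is hypothetical, and using rrsnapshot with rr=replay is inferred from the description above rather than shown in the example. The commands are printed, not executed.

```shell
# Build an -icount value that also names the starting snapshot.
# $1: record|replay, $2: snapshot name, $3: log file (default replay.bin).
# rr_icount_snap is a hypothetical helper name used only in this sketch.
rr_icount_snap() {
    printf 'shift=7,rr=%s,rrfile=%s,rrsnapshot=%s\n' \
        "$1" "${3:-replay.bin}" "$2"
}

# Dry run: the same snapshot name is used in both modes.
echo "qemu-system-i386 -icount $(rr_icount_snap record snapshot_name)"
echo "qemu-system-i386 -icount $(rr_icount_snap replay snapshot_name)"

# Assumption: during replay the snapshot would be reloaded from the QEMU
# monitor with the loadvm command, e.g. "loadvm snapshot_name".
```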

Features to add

The full version of record/replay will include support for:

  • Automatic VM snapshotting
  • Reverse debugging through GDB

How to get involved

The first version of the record/replay patches was prepared by ISP RAS.

You can mail Pavel Dovgalyuk for information about patches that have not been upstreamed yet.

Links

Papers with description of deterministic replay implementation:

Prior QEMU version with block patches added: https://github.com/Dovgalyuk/qemu/tree/rr-15