Features/tcg-multithread

This feature allows the Tiny Code Generator (TCG) to run one host thread per guest thread or guest vCPU (in system emulation mode). It was first introduced in QEMU 2.9 for Alpha and ARM. Work to enable full multi-threading support in additional system emulations is ongoing.

Overview

QEMU's system emulation mode could always emulate multiple vCPUs, but it scheduled them in a single thread and executed each one in turn in a round-robin fashion. Switching to one host thread per vCPU required a number of changes to the core code as well as explicit support in each guest architecture. The design decisions are documented in docs/devel/multi-thread-tcg.txt.

There was a talk at KVM Forum 2015 (video, slides) which is a little out of date but is a useful primer on the challenges involved.

Controlling MTTCG

Once MTTCG is supported for a guest architecture there should be no need to enable it explicitly. System emulation enables it by default if the following conditions are met:

The guest architecture has defined TARGET_SUPPORTS_MTTCG
The host architecture's TCG_TARGET_DEFAULT_MO supports TCG_GUEST_DEFAULT_MO (i.e. the host's default memory ordering is at least as strong as the guest's)
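
The second condition can be read as a bitmask test: every ordering guarantee the guest relies on must also be provided by the host. The following is a minimal sketch of that idea, not QEMU's actual code. The MO_* values and both function names are made up for illustration; only TARGET_SUPPORTS_MTTCG, TCG_TARGET_DEFAULT_MO and TCG_GUEST_DEFAULT_MO come from the text above.

```c
#include <stdbool.h>

/*
 * Illustrative ordering bits (assumed values for this sketch). Each bit
 * says "an earlier access of the first kind is ordered before a later
 * access of the second kind".
 */
enum {
    MO_LD_LD = 1 << 0, /* load  before load  */
    MO_ST_LD = 1 << 1, /* store before load  */
    MO_LD_ST = 1 << 2, /* load  before store */
    MO_ST_ST = 1 << 3, /* store before store */
    MO_ALL   = MO_LD_LD | MO_ST_LD | MO_LD_ST | MO_ST_ST,
};

/* True when the host guarantees every ordering the guest expects. */
static bool memory_orders_compatible(unsigned guest_mo, unsigned host_mo)
{
    return (guest_mo & ~host_mo) == 0;
}

/*
 * Hypothetical helper: MTTCG is the default only if the guest has opted
 * in (the TARGET_SUPPORTS_MTTCG condition) and the orderings match.
 */
static bool default_mttcg_enabled(bool target_supports_mttcg,
                                  unsigned guest_mo, unsigned host_mo)
{
    return target_supports_mttcg &&
           memory_orders_compatible(guest_mo, host_mo);
}
```

With these assumed bits, a weakly ordered guest (no bits set) passes on any host, while a guest that expects everything except store-before-load ordering fails on a host that guarantees nothing. That second case is the strong-on-weak situation listed under Further Work.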

When this is not the case you can force MTTCG by specifying:

   $QEMU $OPTS --accel tcg,thread=multi

although you are likely to get strange behaviour. If you suspect that guest emulation is incorrect you can revert to single-threaded mode and re-run your test:

   $QEMU $OPTS --accel tcg,thread=single

Incompatibilities

MTTCG is not compatible with -icount; enabling icount forces a single-threaded run.

Developer Details

Porting a guest architecture

Before MTTCG can be enabled for a guest the following changes must be made.

Correctly translate atomic/exclusive instructions (see tcg_gen_atomic_)
Ensure the translation step correctly handles barrier instructions (tcg_gen_mb)
Define TCG_GUEST_DEFAULT_MO
Audit instructions that modify system state
- generally this means taking the BQL (e.g. HELPER(set_cp_reg))
Audit MMU management functions
- cputlb provides an API for various tlb_flush_FOO operations
- updates to the guest's page tables need to be atomic (e.g. dirty bits)
Audit power/reset sequences
- see for example target/arm/arm-powerctl.c
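
To illustrate the point about atomic page table updates: with several vCPU threads walking the same guest page tables, a plain read-modify-write of a page table entry can lose a concurrent update. The sketch below shows one way to set a dirty bit with a compare-and-swap loop using C11 atomics. It is an assumption-laden example, not code from QEMU: the PTE layout, the flag bits and the function name are all hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical PTE flag bits, chosen only for this example. */
#define PTE_VALID (UINT64_C(1) << 0)
#define PTE_DIRTY (UINT64_C(1) << 1)

/*
 * Set the dirty bit without losing concurrent changes to the entry.
 * Returns false if the entry is (or becomes) invalid, in which case the
 * caller would have to redo the page table walk.
 */
static bool pte_mark_dirty(_Atomic uint64_t *pte)
{
    uint64_t old = atomic_load(pte);

    do {
        if (!(old & PTE_VALID)) {
            return false;   /* entry was torn down under us */
        }
        if (old & PTE_DIRTY) {
            return true;    /* another vCPU already set the bit */
        }
        /* On failure 'old' is refreshed with the current value; retry. */
    } while (!atomic_compare_exchange_weak(pte, &old, old | PTE_DIRTY));

    return true;
}
```

The loop re-checks the valid bit on every retry, so a racing invalidation is never overwritten by a stale value.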

The work queue API async_[safe_]run_on_cpu provides a mechanism for one vCPU to queue work on another.
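
The pattern behind such a work queue can be sketched in a few lines of plain C. This is a toy model for illustration only and does not use QEMU's real types or signatures: every name below (toy_cpu, toy_queue_work, toy_process_work, toy_count) is hypothetical. It also models only the plain asynchronous case; the "safe" variant additionally needs the other vCPUs to be stopped while the work runs, which is not shown here.

```c
#include <pthread.h>
#include <stdlib.h>

typedef void (*work_fn)(void *data);

struct work_item {
    work_fn fn;
    void *data;
    struct work_item *next;
};

/* Toy stand-in for a vCPU: just a locked FIFO of pending work. */
struct toy_cpu {
    pthread_mutex_t lock;
    struct work_item *head, *tail;
};

static void toy_cpu_init(struct toy_cpu *cpu)
{
    pthread_mutex_init(&cpu->lock, NULL);
    cpu->head = cpu->tail = NULL;
}

/* Any thread may call this to queue fn(data) for the target vCPU. */
static int toy_queue_work(struct toy_cpu *cpu, work_fn fn, void *data)
{
    struct work_item *wi = malloc(sizeof(*wi));
    if (!wi) {
        return -1;
    }
    wi->fn = fn;
    wi->data = data;
    wi->next = NULL;

    pthread_mutex_lock(&cpu->lock);
    if (cpu->tail) {
        cpu->tail->next = wi;
    } else {
        cpu->head = wi;
    }
    cpu->tail = wi;
    pthread_mutex_unlock(&cpu->lock);
    return 0;
}

/* The target vCPU thread drains its queue; returns the number of items run. */
static int toy_process_work(struct toy_cpu *cpu)
{
    int ran = 0;

    for (;;) {
        pthread_mutex_lock(&cpu->lock);
        struct work_item *wi = cpu->head;
        if (wi) {
            cpu->head = wi->next;
            if (!cpu->head) {
                cpu->tail = NULL;
            }
        }
        pthread_mutex_unlock(&cpu->lock);

        if (!wi) {
            break;
        }
        wi->fn(wi->data);   /* run without holding the queue lock */
        free(wi);
        ran++;
    }
    return ran;
}

/* Example work function: increments the int that data points to. */
static void toy_count(void *data)
{
    ++*(int *)data;
}
```

The key property is that the work always runs on the target vCPU's own thread, so it can touch that vCPU's private state without further locking. The queueing thread only ever touches the queue itself.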

Once this work is done, your final patch can update configure and enable TARGET_SUPPORTS_MTTCG.

Testing

Ideally you'll want a comprehensive set of tests to exercise the corner cases of system emulation behaviour. See Alex's kvm-unit-tests for an example of how the ARM architecture is exercised.

Further Work

Enabling strong-on-weak memory consistency (e.g. emulating x86 on an ARM host)

People

Now that MTTCG is merged, it is supported by the TCG maintainers. However, the following people were involved:

Fred Konrad (Original core MTTCG patch set)
Alex Bennée (ARM testing, base enabling tree)
Alvise Rigo (LL/SC work)
Emilio Cota (QHT, cmpxchg atomics)