Features/CPUModels

Summary

This page originally described the "externally-configurable" CPU models feature, but its scope gradually shifted to discussing the design of the CPU code and the CPU model system. The old "cpudef" config section was deprecated, so the original description no longer applies.

Owner

Roadmap

After QEMU 1.4

  • CPU feature words refactor
  • x86 CPU properties (Igor Mammedov)
    • Being redone to use static properties
  • machine-friendly error reporting of -cpu enforce/check
  • x86 CPU model subclasses
  • Allow changing of Hypervisor CPUIDs (Don Slutz)
  • Probing for CPU features that are supported by the host and can be enabled
  • Probing for the features that are actually enabled on each CPU model

Already done

QEMU 1.4

  • Make CPU a subclass of DeviceState (included)
  • APIC-ID-related topology fixes (ehabkost) (RFC submitted)
  • Fixes for -cpu enforce flag

Before QEMU 1.4

  • Drop "-cpu ?dump" (Peter Maydell)
  • Move CPU models to C code (ehabkost)
  • Eliminate cpudef config section support (ehabkost)
  • "unduplicate feature names" series (ehabkost)
  • -cpu host use GET_SUPPORTED_CPUID (ehabkost)
  • add feature flag name list for CPUID 7

Interfaces/requirements for libvirt

Ensuring predictable set of guest features

Requirement: libvirt needs to ensure all features required on the command-line are present and exposed to the guest.

Current problem: libvirt doesn't use the "enforce" flag so it can't guarantee that a given feature will be actually exposed to the guest.

Solution: use the "enforce" flag on the "-cpu" option.

Limitation: no proper machine-friendly interface to report which features are missing.
Workaround: see "Querying host capabilities" below.

Future plans

Machine-friendly reporting of missing host features/capabilities.

Proposal: add "removed-features" property to X86CPU objects
Libvirt could then use "-cpu <model>,check" instead of "enforce", and check whether "removed-features" is empty before unpausing the guest or starting migration.
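
The check described in this proposal could look roughly like the sketch below. It is only an illustration: "removed-features" is a proposed property that may never exist in this form, the assumption that it reads back as a list of feature names is mine, and the CPU object path passed to the helper is hypothetical. Only the qom-get command and its "path"/"property" arguments are existing QMP.

```python
import json


def removed_features_request(cpu_path):
    """Build a qom-get request for the PROPOSED "removed-features" property.

    cpu_path is a hypothetical QOM path of one CPU object.
    """
    return {
        "execute": "qom-get",
        "arguments": {"path": cpu_path, "property": "removed-features"},
    }


def may_unpause(qom_get_reply):
    """Return True if the reply reports no removed features.

    Assumes (not confirmed by any QEMU version) that the property
    value is a JSON list of feature names.
    """
    reply = json.loads(qom_get_reply)
    if "error" in reply:
        raise RuntimeError(reply["error"].get("desc", "QMP error"))
    return len(reply["return"]) == 0
```

A management tool would send the request once per CPU object and only unpause the guest, or start migration, when every reply passes the check.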


Listing CPU models

Requirement: libvirt needs to know which CPU models are available to be used with the "-cpu" option.

Current problem: libvirt relies on help output parsing for that.

Solution: use QMP qom-list-types command.

Dependency: X86CPU subclasses.
Limitation: needs a live QEMU process for the query.
Example: { "execute": "qom-list-types", "arguments": { "implements": "cpu", "abstract": false } }
Caveat: the CPU class name for a given -cpu model will be in the format model-arch-cpu or model-kvm-arch-cpu.
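
To illustrate the caveat, here is a minimal sketch of how a client might translate between -cpu model names and the class names returned by qom-list-types. It assumes only the two formats named above; the helper names are made up for this example, and the architecture string ("x86_64" in the usage below) is an assumption about what "arch" expands to.

```python
def candidate_class_names(model, arch):
    """Return the two class names a -cpu model may map to."""
    return [f"{model}-{arch}-cpu", f"{model}-kvm-{arch}-cpu"]


def model_from_class_name(class_name, arch):
    """Recover the -cpu model name from a class name, or None.

    The KVM-specific suffix is tried first so that it is not left
    behind as part of the model name.
    """
    for suffix in (f"-kvm-{arch}-cpu", f"-{arch}-cpu"):
        if class_name.endswith(suffix):
            return class_name[: -len(suffix)]
    return None
```

For example, model_from_class_name("Nehalem-kvm-x86_64-cpu", "x86_64") yields "Nehalem". A model whose own name ends in "-kvm" would be ambiguous under this scheme.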

Solution: use QMP query-cpu-definitions command.

Limitation: needs a live QEMU process for the query.
Note: it will probably be deprecated in favor of QOM commands.
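
A minimal sketch of using query-cpu-definitions from a client follows. It assumes the reply is a list of objects that each carry a "name" field; how the request is sent over the QMP socket is left out, and the model names in the usage comment are only examples.

```python
import json

# The QMP request; query-cpu-definitions takes no arguments here.
QUERY_CPU_DEFINITIONS = {"execute": "query-cpu-definitions"}


def model_names(reply_text):
    """Extract the sorted list of CPU model names from a QMP reply."""
    reply = json.loads(reply_text)
    if "error" in reply:
        raise RuntimeError(reply["error"].get("desc", "QMP error"))
    return sorted(entry["name"] for entry in reply["return"])


# Usage with a hand-written sample reply:
#   model_names('{"return": [{"name": "qemu64"}, {"name": "Nehalem"}]}')
```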

Requirements: CPU class/model list should not depend on any other command-line option (e.g. not depend on machine-type)

Open question: we may end up with separate subclasses for KVM and TCG CPU models.

Future plans

It would be useful to be able to list CPU models without having to start a live QEMU process.

Getting information about CPU models

Requirement: libvirt uses the predefined CPU models from QEMU, but it needs to be able to query for CPU model details, to find out how it can create a VM that matches what was requested by the user.

Current problem: libvirt has a copy of the CPU model definitions in its cpu_map.xml file, and the copy can get out of sync when CPU models in QEMU change. libvirt also assumes that the set of features in each model is the same on all machine-types, which is not true.

Benefits of changing: cpu_map.xml and QEMU will no longer need to match exactly. The definitions exposed by libvirt could be completely different from the definitions in QEMU, as long as libvirt probes for CPU model information and uses the right flags on the command line to make QEMU expose what libvirt users expect.

Challenge: the resulting CPU features depend on many factors:

  • Chosen CPU model name (of course)
  • accel=kvm option (CPU models are different in TCG and KVM modes)
  • machine-type
  • Host CPU vendor (unless explicit "vendor" option is used)
  • Host CPU capabilities (not valid anymore, as long as "enforce" is used)
  • Host kernel capabilities (not valid anymore, as long as "enforce" is used)
  • kernel-irqchip option (not valid anymore, as long as "enforce" is used)
Solution: start a paused VM with no devices, but with the right machine-type and right CPU model. Use QMP QOM commands to query for CPU flags (especially the properties starting with the "f-" prefix).
Dependency: X86CPU feature properties ("f-*" properties).
Limitation: requires a live QEMU process with the right machine-type/CPU-model to be started, to make the query.
Limitation: requires starting a new QEMU process for each machine-type/CPU-model pair that is going to be queried.
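
The query step of this solution can be sketched as follows. The sketch assumes that the "return" value of a qom-list reply is a list of objects with a "name" field, and it relies on the "f-" prefix mentioned above; the CPU object path is hypothetical and would have to be discovered on the running QEMU process.

```python
def feature_properties(qom_list_return):
    """Pick the feature property names (those starting with "f-")
    out of the "return" list of a qom-list reply."""
    return sorted(
        prop["name"] for prop in qom_list_return if prop["name"].startswith("f-")
    )


def qom_get_requests(cpu_path, names):
    """Build one qom-get request per feature property.

    cpu_path is a hypothetical QOM path of the CPU object to inspect.
    """
    return [
        {"execute": "qom-get", "arguments": {"path": cpu_path, "property": name}}
        for name in names
    ]
```

Because the result depends on machine-type and CPU model, the whole sequence has to be repeated against a fresh QEMU process for every pair being probed.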

Problem: "qemu -machine <machine> -cpu <model>" will create CPU objects whose CPU features are already filtered based on host capabilities.

  • Using "enforce" wouldn't solve it, because then QEMU would abort, and QMP would be unavailable.
  • Using "check" wouldn't solve it either, because the features are always filtered out when the CPU is created.
Solution: see proposal about adding a "removed-features" property.

Requirement: the resulting CPU features for a given host-CPU-vendor + machine-type + CPU-model combination must never change in any future QEMU version.

This should allow libvirt to safely cache CPU model data, even if the QEMU binary changes.

Requirement: libvirt needs to know if a specific CPU model can be used in the current host.

See "Ensuring predictable set of guest features" above
See "Querying host capabilities" below

Querying host capabilities

Requirement: libvirt needs to know which features can actually be enabled, before it tries to start a VM and before it tries to start a live-migration process.

The set of available capabilities depends on:

  • Host CPU (hardware) capabilities;
  • Kernel capabilities (reported by GET_SUPPORTED_CPUID);
  • QEMU capabilities;
  • Specific configuration options (e.g. in-kernel IRQ chip is required for some features).

Current problem: libvirt uses the CPUID instruction directly and assumes that the presence of a feature in the host CPU means it can be enabled and exposed to the guest. This breaks when virtualization of a feature requires:

  • Additional hardware support (e.g. INVPCID);
  • Additional host kernel code (this applies to _all_ CPU features, because each one needs to be reported as supported by GET_SUPPORTED_CPUID);
  • Additional QEMU-side code;
  • Specific configuration options
    • kernel-irqchip (affects tsc-deadline and x2apic availability)
    • machine-type
    • NOTE: any other option that affects CPU feature availability, MUST:
      • have defaults depending on machine-type, so libvirt versions that don't know about the new option will still work because they already check machine-type
      • be documented as affecting availability of CPU features, so once libvirt starts setting the option explicitly, it will take it into account when probing for host capabilities


Challenge: QEMU doesn't have a generic capability-querying interface, and host capability querying depends on KVM being initialized.

Workaround: start a paused VM using the "host" CPU model, which has every CPU feature supported by the host enabled by default, and query for information about the CPU through QMP, using the QOM commands.

Current solution: start a paused VM with no devices but with "host" CPU model and use QMP QOM commands to query for the enabled CPU features.

Dependency: X86CPU feature properties
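
As a rough sketch, the command line for such a probe process could be assembled as shown below. The particular option set is an assumption, not something prescribed by this page: -S (start with the CPUs paused), -nodefaults, -display none and -qmp stdio are existing QEMU options chosen here to approximate "a paused VM with no devices", and the default binary name is just an example.

```python
def probe_argv(binary="qemu-system-x86_64", machine=None, cpu="host"):
    """Build an argv list for a paused, device-less probe VM.

    machine: optional machine-type name (e.g. "pc-1.1"); when omitted,
             QEMU's default machine-type is used.
    cpu:     CPU model to instantiate; "host" probes host capabilities.
    """
    machine_opt = "accel=kvm" if machine is None else f"{machine},accel=kvm"
    return [
        binary,
        "-S",                 # do not start the CPUs
        "-nodefaults",        # do not create default devices
        "-display", "none",   # no graphical output
        "-machine", machine_opt,
        "-cpu", cpu,
        "-qmp", "stdio",      # talk QMP over stdin/stdout
    ]
```

The resulting list can be handed to subprocess.Popen, after which the QOM queries are sent over the process's standard input.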

Future plans

It would be interesting to have a more generic capability-querying interface that doesn't require starting a whole machine with a live QEMU process.

See also: -query-capabilities RFC series from Anthony

Message-Id: <1332169763-30665-9-git-send-email-aliguori@us.ibm.com>

Solved challenges

Allowing CPU models to be updated

We need a mechanism to allow the existing CPU models in QEMU to be updated without making guest-visible changes to existing virtual machines when they migrate to a new version.

Examples

Examples where CPU model updates are necessary and have to be deployed to users:

  • The Nehalem CPU model currently has the wrong "level" value, making CPU topology information unavailable.
  • The CPUID PMU leaf was added in QEMU 1.1, but it is not supposed to be visible to guests running with -M pc-1.0
  • New features are implemented by KVM and we may want to add them to existing models (e.g. SandyBridge may need to have tsc-deadline added)

Requirements

  • A different CPU will be visible to the guest depending on the machine-type chosen.
    • That means that "-M pc-1.0 -cpu Nehalem" will be different from "-M pc-1.1 -cpu Nehalem"
    • Rationale:
      • The meaning of "-M pc-1.0 -cpu Nehalem" can't be changed or it will change existing guests
      • The meaning of "-M pc-1.1 -cpu Nehalem" needs to be different from the pc-1.0 one, otherwise we would be stuck with a broken "Nehalem" model forever

Status/solution

  • CPU model definitions were moved to C code, so we can easily add compatibility code to them if necessary
  • CPUs are now DeviceState objects
  • CPU models will become separate classes, so per-CPU-model compatibility properties can be used on machine-type definitions
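
The idea of per-machine-type compatibility properties can be pictured with the toy lookup below. This is not QEMU's implementation (which lives in C); the table layout, the "pmu" property name and the function name are invented for illustration, loosely following the PMU leaf example above.

```python
# Current definition of each CPU model (illustrative property only).
BASE_MODELS = {
    "Nehalem": {"pmu": True},
}

# Overrides that keep older machine-types guest-visibly unchanged.
COMPAT_PROPS = {
    "pc-1.0": {"Nehalem": {"pmu": False}},
}


def resolve_model(machine, model):
    """Return the effective properties of a model on a machine-type."""
    props = dict(BASE_MODELS[model])
    props.update(COMPAT_PROPS.get(machine, {}).get(model, {}))
    return props
```

With this layout, resolve_model("pc-1.0", "Nehalem") keeps the old behaviour while resolve_model("pc-1.1", "Nehalem") picks up the updated definition, which is exactly the property the requirements ask for.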

-cpu host and feature probing

See http://article.gmane.org/gmane.comp.emulators.kvm.devel/90035

-cpu host vs -cpu best

Currently we have -cpu host, but the naming and semantics are unclear.

We have 3 possible modes of "try to get the best CPU model":

  1. all-you-can-enable: Enable every single bit that can be enabled, including the ones not present on the host but that can be emulated.
  2. match-host-CPU: Enable all bits that are present in the host CPU that can be enabled.
  3. best-predefined-model: Use the best CPU model available from the pre-defined CPU model list.

Status

  • -cpu host will be the "all-you-can-enable" mode, which enables every bit from GET_SUPPORTED_CPUID on the VCPU
  • We will probably not have a mode for match-host-CPU
  • A "best-predefined-model" mode can be implemented by libvirt.
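
A "best-predefined-model" selection on the management side might be sketched as follows. The page does not define what "best" means, so this example assumes one possible reading: the model with the most features among those whose features are all available on the host. The function name and the feature sets a caller passes in are placeholders.

```python
def best_predefined_model(models, host_features):
    """Pick the predefined model with the most features that the host
    can fully provide, or None if no model is usable.

    models:        dict mapping model name -> set of feature names
    host_features: set of feature names that can be enabled on the host
    """
    usable = {name: feats for name, feats in models.items() if feats <= host_features}
    if not usable:
        return None
    # Iterating in sorted order makes ties resolve to the alphabetically first name.
    return max(sorted(usable), key=lambda name: len(usable[name]))
```

The host feature set would come from the probing described under "Querying host capabilities", not from running CPUID directly.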

Moving CPU model definitions to C code

The old "cpudef" config section was deprecated because QEMU is expected to provide the CPU model list itself and to keep migration compatibility using machine-types. Machine-type compatibility code is inside QEMU C code, so making external config files depend on and/or be affected by internal QEMU C code would be confusing and fragile. Now both CPU model definitions and per-machine-type CPU-model compatibility code are inside the QEMU C code.

check/enforce flags

The pseudo CPUID flag 'check', when it appears in the command-line feature flag list, makes QEMU warn about feature flags (either implicit in a CPU model or explicit on the command line) that would otherwise have been silently unavailable to the guest:

   # qemu-system-x86_64 ... -cpu Nehalem,check
   warning: host cpuid 0000_0001 lacks requested flag 'sse4.2|sse4_2' [0x00100000]
   warning: host cpuid 0000_0001 lacks requested flag 'popcnt' [0x00800000]

A similar 'enforce' pseudo flag exists which, in addition to printing the warnings above, causes QEMU to exit with an error if any requested flag is unavailable.
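
Until a machine-friendly interface exists, a tool that wants to know which flags are missing has to scrape these warnings. The sketch below parses lines of exactly the shape shown in the example output above; it assumes nothing else about the format, which may differ between QEMU versions, and the function name is made up for this example.

```python
import re

# Matches e.g.:
#   warning: host cpuid 0000_0001 lacks requested flag 'popcnt' [0x00800000]
_WARNING_RE = re.compile(
    r"warning: host cpuid (\S+) lacks requested flag '([^']+)' \[(0x[0-9a-fA-F]+)\]"
)


def parse_missing_flags(output):
    """Return one dict per warning line: CPUID leaf, flag names, bit mask."""
    missing = []
    for line in output.splitlines():
        match = _WARNING_RE.search(line)
        if match:
            leaf, names, mask = match.groups()
            missing.append({
                "leaf": leaf,
                "names": names.split("|"),  # aliases are separated by "|"
                "mask": int(mask, 16),
            })
    return missing
```

This kind of text parsing is fragile, which is the motivation for the machine-friendly error reporting item on the roadmap.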