Features/CPUHotplug

Summary

There are 2 approaches to implement CPU hotplug in QEMU:

  • dedicated interface: currently only the cpu-add QMP command
  • device_add/device_del interface for hot-(un)plugging CPUs

Targets are encouraged to (re)design CPU creation so that the device_add/device_del interface can be used for it. However, if a target's design prevents that, or the refactoring needed to use CPUs with device_add/device_del would take a long time, development of the CPU hot-add feature can be sped up by using the cpu-add interface instead.

Owner

  • Name: Igor Mammedov
  • Email: imammedo@redhat.com

cpu-add interface

Summary

{ 'command': 'cpu-add', 'data': {'id': 'int'} }

  • ID - a number in range [0..max-cpus)
  • Available since: 1.5
  • Supported targets: i386-softmmu, x86_64-softmmu

Description

The command is an intermediate solution for CPU hot-add and provides a simplified interface for it. It allows the feature to be implemented for targets that currently can't implement CPU hot-add using the device_add command because of their present design. Targets that implement cpu-add could later rewrite it as a wrapper over device_add, once device_add becomes usable for the target.

Usage example

1. Start QEMU with a QMP socket available and with fewer CPUs at startup than maxcpus

./qemu-system-x86_64 -qmp unix:/tmp/qmp-sock,server,nowait -smp 1,maxcpus=4

2. Connect to QMP socket using qmp-shell command

./QMP/qmp-shell /tmp/qmp-sock

3. Add CPUs by issuing the cpu-add command at the qmp-shell prompt

cpu-add id=1
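
For clients that talk to the QMP socket directly instead of through qmp-shell, step 3 can be sketched as below. This is a minimal sketch, not QEMU code: the helper names (qmp_command, cpu_add_message, read_reply, hot_add_cpu) are hypothetical, and it assumes the monitor emits one JSON message per line. It relies on the QMP handshake, in which the server sends a greeting and the client must issue qmp_capabilities before any other command.

```python
import json
import socket


def qmp_command(name, **arguments):
    """Serialize one QMP command as a single line of JSON."""
    message = {"execute": name}
    if arguments:
        message["arguments"] = arguments
    return json.dumps(message)


def cpu_add_message(cpu_id, max_cpus):
    """Build the cpu-add request, enforcing the documented [0..max-cpus) range."""
    if not 0 <= cpu_id < max_cpus:
        raise ValueError(f"id {cpu_id} is outside [0..{max_cpus})")
    return qmp_command("cpu-add", id=cpu_id)


def read_reply(stream):
    """Return the next command reply, skipping asynchronous events."""
    while True:
        line = stream.readline()
        if not line:
            raise ConnectionError("QMP socket closed")
        message = json.loads(line)
        if "return" in message or "error" in message:
            return message


def hot_add_cpu(sock_path, cpu_id, max_cpus):
    """Hypothetical helper: negotiate capabilities, then send cpu-add."""
    request = cpu_add_message(cpu_id, max_cpus)  # validate before connecting
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(sock_path)
        with sock.makefile("rw") as stream:
            json.loads(stream.readline())  # greeting: {"QMP": {...}}
            reply = None
            for line in (qmp_command("qmp_capabilities"), request):
                stream.write(line + "\n")
                stream.flush()
                reply = read_reply(stream)
                if "error" in reply:
                    break  # stop at the first failed command
            return reply
```

With the QEMU instance from step 1 running, hot_add_cpu("/tmp/qmp-sock", 1, 4) would send the same request that cpu-add id=1 produces in qmp-shell.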

4. Optionally online the newly added CPU inside the guest

The Linux kernel doesn't online hot-added CPUs automatically. Once a CPU is hot-added, it should be onlined using an appropriate udev script or manually by issuing the following command:

echo 1 > /sys/devices/system/cpu/cpu1/online

Sample udev script: Add the following line to /etc/udev/rules.d/99-hotPlugCPU.rules

ACTION=="add", KERNEL=="cpu*", SUBSYSTEM=="cpu", DRIVER=="processor", RUN+="/root/helloCPU.sh %n"

Contents of /root/helloCPU.sh:

#!/bin/bash
echo 1 > /sys/devices/system/cpu/cpu${1}/online
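
As an alternative to onlining one CPU at a time, the same sysfs write can be applied to every CPU that is currently offline. The following is a minimal sketch meant to run as root inside the guest; the function name online_offline_cpus and its sysfs_root parameter are hypothetical, the latter existing only so the logic can be exercised against a scratch directory.

```python
import os
import re


def online_offline_cpus(sysfs_root="/sys/devices/system/cpu"):
    """Write 1 to the 'online' file of every cpuN that currently reads 0.

    Returns the CPU numbers that were onlined, in ascending order.
    """
    matches = (re.fullmatch(r"cpu(\d+)", entry) for entry in os.listdir(sysfs_root))
    numbers = sorted(int(m.group(1)) for m in matches if m)

    onlined = []
    for number in numbers:
        control = os.path.join(sysfs_root, f"cpu{number}", "online")
        if not os.path.isfile(control):
            continue  # some CPUs (often cpu0) expose no 'online' control file
        with open(control) as f:
            if f.read().strip() != "0":
                continue  # already online
        with open(control, "w") as f:
            f.write("1")
        onlined.append(number)
    return onlined
```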

Current limitations

  1. The migration target should be started with an initial CPU count ('-smp XX') that includes the CPUs hot-added on the migration source side.
  2. CPUs shouldn't be hot-plugged during migration.
  3. CPUs should be added in successive order, from lower to higher IDs in the [0..max-cpus) range.
    It's possible to add arbitrary CPUs in random order; however, that would cause migration to fail on the target side.
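
A management layer can enforce limitations 1 and 3 with a little bookkeeping. The sketch below is illustrative only: next_cpu_id and target_smp_option are hypothetical helpers, and it assumes the target is started with the same maxcpus value as the source, which the text above does not state explicitly.

```python
def next_cpu_id(present_ids, max_cpus):
    """Return the only ID that keeps hot-added CPUs in successive order."""
    count = len(present_ids)
    if sorted(present_ids) != list(range(count)):
        raise ValueError("CPU IDs have gaps; migration would fail on the target")
    if count >= max_cpus:
        raise ValueError("all CPUs in [0..max-cpus) are already present")
    return count


def target_smp_option(present_ids, max_cpus):
    """Build the -smp option for the migration target.

    The initial CPU count includes CPUs hot-added on the source side.
    Reusing the source's maxcpus is an assumption of this sketch.
    """
    return f"-smp {len(present_ids)},maxcpus={max_cpus}"
```

For the usage example above, after cpu-add id=1 the present IDs are [0, 1], so the next ID to add is 2 and the target would be started with -smp 2,maxcpus=4.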

device_add/device_del interface

Work in progress/TODOs

  • Conversion of features and other properties into static properties provides the following benefits:
    • global properties for CPU, generalizing the -cpu xxx,features_string template to a set of global properties
    • latest implementation, doing only the conversion to a static properties tree: [1] posted v7 series
    • conversion to global properties is postponed until CPU sub-classes are implemented
  • CPU models as CPU subclasses
    • gives the ability to create CPUs using a CPU subclass name without any ad-hoc calls.
    • there are several implementations on qemu-devel, with the following open questions:
      • if running in KVM mode, kvm_init() should be called before the sub-classes' class_init is called
        • issue 1: in KVM mode QEMU provides the host CPU model. This model depends on KVM being initialized before the CPU can be created, because of the dependency on kvm_arch_get_supported_cpuid(). Thanks to lazy type initialization this usually doesn't cause a problem, because CPUs are created after kvm_init(). However, if QEMU is called with the options -enable-kvm -cpu help, it should display the host model as available. With the introduction of CPU sub-classes, the host CPU model type should be registered only when KVM is enabled, which introduces a dependency on kvm_init() being called first. The same issue applies to type introspection once it becomes available. Consensus appears to have been reached that the 'host' CPU type should be registered at kvm_arch_init() time.
        • issue 2: when QEMU is started with -enable-kvm and the vendor feature is not overridden on the command line, the built-in vendor is replaced with the host's vendor [see commit 8935499831312]. Again, with lazy type initialization this doesn't cause a problem, because kvm_init() is called before the CPU type's class_init(), so class_init() can overwrite the built-in vendor with the host's value. But if type introspection doesn't require an instantiated machine (i.e. works like -cpu help), it could return the built-in vendor values instead of the host's even with -enable-kvm, because there are no option dependencies saying that -enable-kvm should be processed before '-show-types' or a similar request over other interfaces. The 'vendor' issue evolved into a generic problem, and there are several ideas how to proceed:
          • 1. Register all CPU sub-classes at QEMU start-up time and fix them up later, if/when kvm_arch_init() is called, to apply KVM-specific defaults to them
            • keeps the number of CPU sub-classes == cpu_models
            • it requires enumerating all CPU sub-classes to fix up default values, which causes class initialization for types that won't be used.
            • defaults could be deceiving/invalid if class introspection happens before kvm_arch_init(), and there is no sure way to prevent misuse.
            • CPU sub-class defaults are mutable, depending on the -enable-kvm option.
            • makes it possible to get rid of the x86_def_t type/array by embedding it in class_[model]_init() functions
          • 2. Register *-tcg-* and *-kvm-* subclasses at QEMU start-up time, with TCG/KVM defaults hard-coded in the respective class_[tcg|kvm]_init() functions.
            • CPU sub-class defaults are not mutable depending on the -enable-kvm option and (now) don't require kvm_init() to be called first
            • doubles the number of CPU sub-classes
            • exposes *-kvm-* CPU sub-classes to the user and allows creating a *-kvm-* based CPU in TCG mode and vice versa.
            • hard to refactor, since users could start using class names instead of cpu_model names; it also won't allow eliminating the x86_def_t type/array.
          • 3. Register CPU sub-classes after kvm_init() but before machine init, and set defaults in class_init() depending on whether KVM is available/initialized.
            • keeps the number of CPU sub-classes == cpu_models
            • the user sees only one immutable set of classes, but the set has different defaults depending on the mode QEMU was started in (TCG/KVM).
            • makes it possible to get rid of the x86_def_t type/array by embedding it in class_[model]_init() functions
            • requires a global hook in vl.c between kvm_init() and machine_init()
            • CPU classes won't be available before this hook, so QMP, qom-get/set and list_cpus will have to be called after it to get access to CPU sub-classes
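
The ordering hazard behind issue 2 can be illustrated with a toy model of lazy class initialization. This is not QOM code: LazyType, make_cpu_type and the accel dictionary are hypothetical stand-ins, and the vendor strings and the type name are placeholder values. The point is only that defaults computed on first use get frozen with whatever accelerator state existed at that moment.

```python
class LazyType:
    """Toy type whose class defaults are computed once, on first use."""

    def __init__(self, name, class_init):
        self.name = name
        self._class_init = class_init
        self._defaults = None

    def defaults(self):
        if self._defaults is None:  # lazy: class_init runs at most once
            self._defaults = self._class_init()
        return self._defaults


def make_cpu_type(name, builtin_vendor, accel):
    """Register a CPU type whose vendor default depends on accelerator state."""

    def class_init():
        if accel["kvm_inited"]:
            return {"vendor": accel["host_vendor"]}
        return {"vendor": builtin_vendor}

    return LazyType(name, class_init)


accel = {"kvm_inited": False, "host_vendor": "GenuineIntel"}

# Introspection before KVM init: the built-in vendor is cached for good.
early = make_cpu_type("example-cpu", "AuthenticAMD", accel)
vendor_seen_early = early.defaults()["vendor"]

accel["kvm_inited"] = True  # stands in for kvm_init() having run

# First use after KVM init: class_init picks up the host's vendor.
late = make_cpu_type("example-cpu", "AuthenticAMD", accel)
vendor_seen_late = late.defaults()["vendor"]
```

In this model, early keeps reporting the built-in vendor even after KVM is initialized. That mirrors the concern about type introspection running before -enable-kvm is processed. Option 3 avoids it by not registering the classes until after kvm_init().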

Completed dependencies

  • External CPU clean-ups. Move CPU internals inside CPU object
    • move TCG init code into CPU. commits d65e98, 84e3b60, eeec69d, 130a038
    • move CPU reset from board level into CPU. commits 65dee3805, dd673288
    • move APIC creation/initialization into CPU object. commit bdeec8021
  • CPU as Device [commit: 961f839]
    • necessary for converting "CPUID features" into static properties
    • allows using the device_add command once CPU subclasses are implemented.
  • QOM realize, device-only
    • convert CPUs realizefn() to use DeviceRealize [commit: 2b6f294]
  • CPUID features as properties
    • provides the ability to set/get features using the common FEAT_FOO=VAL property interface.
    • Features related clean-ups and code reorganisation:
      • move feature flags fix-ups & checks to realize time
        • in the QOM model, any feature/property can be amended until realize time. Masking out unsupported KVM/TCG features too early could lead to invalid features being exposed to the guest
        • 9b15cd9 target-i386: Sanitize AMD's ext2_features at realize time
        • 4586f15 target-i386: Filter out unsupported features at realize time
        • 5ec01c2 Move kvm_check_features_against_host() check to realize time
      • separate features parsing from setting defaults
        • clean up cpu_x86_parse_featurestr(), leaving in it only the parsing of custom features that should be set on the CPU instance after all defaults are set. Later, defaults initialization should be moved to CPU sub-classes, and the legacy custom features parser cpu_x86_parse_featurestr() could be converted to setting global properties once CPU features are converted into static properties. [commits: 077c68c, fa2db3c, 8ba8a69, 99b88a1, 11acfdd, a91987c, 2c728df]
      • simplify vendor property
        • the current 'vendor' feature implementation has complex semantics, depending on whether QEMU is running in TCG or KVM mode and whether the vendor is overridden on the command line: if (kvm_enabled() == true) vendor = host's vendor; else vendor = built-in vendor; if (custom vendor) vendor = custom vendor [commits: 99b88a1, 11acfdd]
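
The vendor selection rule above can be written out as a small function to make the precedence explicit. This is a sketch of the described semantics only, not the target-i386 implementation; effective_vendor and its parameters are hypothetical names.

```python
def effective_vendor(builtin_vendor, host_vendor, kvm_enabled, custom_vendor=None):
    """Pick the CPU vendor following the precedence described above.

    A vendor given on the command line always wins; otherwise KVM mode
    uses the host's vendor and TCG mode uses the model's built-in vendor.
    """
    vendor = host_vendor if kvm_enabled else builtin_vendor
    if custom_vendor is not None:
        vendor = custom_vendor
    return vendor
```

The command-line override is checked last, so it applies in both TCG and KVM mode.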