Features/DirtyRateCalc
Introduction
QEMU provides several ways to measure the dirty rate of a guest. The result can be used for VM monitoring, and it gives a hint of how hard it would be to migrate the VM with live migration.
Three modes are currently supported for dirty rate calculation:
- Page sampling
- Dirty bitmap
- Dirty ring
The page sampling mode can be used at any time, while the dirty bitmap and dirty ring modes depend on which dirty tracking mechanism is enabled for the specific virtual machine.
For QMP, use the "calc-dirty-rate" command to trigger a measurement with specific parameters, then use "query-dirty-rate" to check the results. The corresponding HMP commands are "calc_dirty_rate" and "info dirty-rate".
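The start-then-poll flow described above can be sketched in Python. This is a minimal sketch, not part of QEMU: `send` is a hypothetical callable that delivers one QMP message (a dict) to the monitor and returns the decoded reply, and `max_polls` and `poll_interval` are illustrative parameters. Only the command names, argument names, and the "measured" status come from this page.

```python
import time


def build_calc_dirty_rate(calc_time, mode="page-sampling", sample_pages=None):
    """Build the QMP message that starts a dirty rate measurement."""
    arguments = {"calc-time": calc_time, "mode": mode}
    if sample_pages is not None:
        # sample-pages only matters for the page sampling mode.
        arguments["sample-pages"] = sample_pages
    return {"execute": "calc-dirty-rate", "arguments": arguments}


def measure_dirty_rate(send, calc_time, mode="page-sampling", sample_pages=None,
                       poll_interval=0.5, max_polls=100, sleep=time.sleep):
    """Start a measurement, then poll query-dirty-rate until it is "measured".

    `send` is a hypothetical transport: it takes a QMP message as a dict
    and returns the decoded reply as a dict.
    """
    reply = send(build_calc_dirty_rate(calc_time, mode, sample_pages))
    if "error" in reply:
        raise RuntimeError(reply["error"])

    for _ in range(max_polls):
        result = send({"execute": "query-dirty-rate", "arguments": {}})["return"]
        if result["status"] == "measured":
            return result
        # Still measuring: wait a little before asking again.
        sleep(poll_interval)
    raise TimeoutError("dirty rate measurement did not finish")
```

Keeping the transport behind `send` lets the same logic be driven by any QMP client, or by a stub when testing.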
Modes of Operation
Page Sampling Mode
Page sampling was the first mode supported for dirty rate calculation. The algorithm is based on hash values of small pages.
When a measurement is triggered, the hypervisor selects a number of pages (specified by the sample-pages= parameter, with a default value of 512 pages per GB), calculates a hash value for each of these pages, and remembers them. The hypervisor then waits for a specific length of time (specified by calc-time=) and redoes the hash calculation. If a page's hash value differs from the stored one, the page has changed during the period.
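The hash-and-compare idea can be illustrated with a toy model. This is a sketch under stated assumptions, not QEMU's implementation: guest memory is modelled as a `bytearray`, the 4096-byte page size and the SHA-256 hash are illustrative choices, and all function names are hypothetical. The final extrapolation from the sampled fraction to a whole-memory rate is likewise an assumption about how such a sample could be scaled up.

```python
import hashlib
import random

PAGE_SIZE = 4096  # assumed page size for this toy model


def pick_sample_pages(total_pages, sample_count, rng):
    """Randomly choose which page indexes to watch."""
    return rng.sample(range(total_pages), min(sample_count, total_pages))


def hash_pages(memory, page_indexes):
    """Return {page index: hash of that page's current contents}."""
    return {
        i: hashlib.sha256(memory[i * PAGE_SIZE:(i + 1) * PAGE_SIZE]).digest()
        for i in page_indexes
    }


def dirty_fraction(before, after):
    """Fraction of sampled pages whose hash changed between two snapshots."""
    changed = sum(1 for i in before if before[i] != after[i])
    return changed / len(before)


def estimated_dirty_bytes_per_second(fraction, total_pages, calc_time):
    """Scale the sampled fraction up to all of memory (an assumed extrapolation)."""
    return fraction * total_pages * PAGE_SIZE / calc_time


# Toy run: 64 pages of "guest memory", 16 of them sampled.
total_pages = 64
memory = bytearray(total_pages * PAGE_SIZE)
pages = pick_sample_pages(total_pages, 16, random.Random(0))

before = hash_pages(memory, pages)
memory[pages[0] * PAGE_SIZE] ^= 0xFF  # the "guest" writes to one sampled page
after = hash_pages(memory, pages)

fraction = dirty_fraction(before, after)  # one of 16 sampled pages changed
```

A write to a page that was not sampled would go unnoticed here, which is why a larger sample-pages value trades extra hashing work for a better estimate.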
An example that starts sampling with 1024 sample pages per GB and a sample period of 3 seconds:
(QEMU) calc-dirty-rate calc-time=3 mode=page-sampling sample-pages=1024
{ "arguments": { "calc-time": 3, "mode": "page-sampling", "sample-pages": 1024 }, "execute": "calc-dirty-rate" }
{ "return": {} }
Before the 3 seconds have elapsed, the query shows that the measurement is still in progress:
(QEMU) query-dirty-rate
{ "arguments": {}, "execute": "query-dirty-rate" }
{ "return": { "calc-time": 3, "mode": "page-sampling", "sample-pages": 1024, "start-time": 59478, "status": "measuring" } }
After that, the status is reported as "measured", together with the measured value:
(QEMU) query-dirty-rate
{ "arguments": {}, "execute": "query-dirty-rate" }
{ "return": { "calc-time": 3, "dirty-rate": 200, "mode": "page-sampling", "sample-pages": 1024, "start-time": 59478, "status": "measured" } }
Dirty Bitmap Mode
The dirty bitmap mode of dirty rate measurement can be used when dirty bitmap based dirty tracking is enabled for the guest (no "-accel kvm,dirty-ring-size=N" specified on the QEMU command line).
To start dirty rate measurement with dirty bitmap mode:
(QEMU) calc-dirty-rate calc-time=3 mode=dirty-bitmap
{ "arguments": { "calc-time": 3, "mode": "dirty-bitmap" }, "execute": "calc-dirty-rate" }
{ "return": {} }
Results:
(QEMU) query-dirty-rate
{ "arguments": {}, "execute": "query-dirty-rate" }
{ "return": { "calc-time": 3, "dirty-rate": 202, "mode": "dirty-bitmap", "sample-pages": 0, "start-time": 60679, "status": "measured" } }
Dirty Ring Mode
Dirty ring mode can provide a finer-grained dirty rate measurement on a per-vCPU basis. It can only be used when the dirty ring is enabled for the specific guest (with "-accel kvm,dirty-ring-size=N" specified on the QEMU command line).
To kick off a dirty-ring based calculation:
(QEMU) calc-dirty-rate calc-time=3 mode=dirty-ring
{ "arguments": { "calc-time": 3, "mode": "dirty-ring" }, "execute": "calc-dirty-rate" }
{ "return": {} }
The result is shown both per VM (in the original "dirty-rate" field) and per vCPU (in the "vcpu-dirty-rate" section):
(QEMU) query-dirty-rate
{ "arguments": {}, "execute": "query-dirty-rate" }
{ "return": { "calc-time": 3, "dirty-rate": 185, "mode": "dirty-ring", "sample-pages": 0, "start-time": 60901, "status": "measured",
    "vcpu-dirty-rate": [
        { "dirty-rate": 0, "id": 0 },
        { "dirty-rate": 0, "id": 1 },
        { "dirty-rate": 0, "id": 2 },
        { "dirty-rate": 200, "id": 3 }
    ] } }
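A monitoring script might use the per-vCPU section to find which vCPUs are doing the dirtying. The following is a minimal sketch: `busiest_vcpus` and its `threshold` parameter are hypothetical names, and the sample data mirrors the reply shown above.

```python
import json

# The "return" payload from the dirty-ring example above.
QUERY_RESULT = json.loads("""
{
    "calc-time": 3, "dirty-rate": 185, "mode": "dirty-ring",
    "sample-pages": 0, "start-time": 60901, "status": "measured",
    "vcpu-dirty-rate": [
        {"dirty-rate": 0, "id": 0},
        {"dirty-rate": 0, "id": 1},
        {"dirty-rate": 0, "id": 2},
        {"dirty-rate": 200, "id": 3}
    ]
}
""")


def busiest_vcpus(query_result, threshold=0):
    """Return vCPU entries with a dirty rate above `threshold`, highest first.

    Replies from other modes carry no "vcpu-dirty-rate" section, in which
    case the result is an empty list.
    """
    entries = query_result.get("vcpu-dirty-rate", [])
    hot = [entry for entry in entries if entry["dirty-rate"] > threshold]
    return sorted(hot, key=lambda entry: entry["dirty-rate"], reverse=True)


hot = busiest_vcpus(QUERY_RESULT)
```

With the sample reply, only vCPU 3 is returned, since the other three vCPUs report a dirty rate of 0.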
Misc
Note that the page sampling mode can be inaccurate, because only a randomly selected portion of the system's pages is sampled. Its benefit is that it needs no KVM dirty tracking intervention: the measurement is done in a single host thread and is fully transparent to the guest (ignoring processor cache pollution). The dirty bitmap and dirty ring modes, on the other hand, can both affect guest workload performance. Starting and stopping guest dirty page tracking can interfere with guest memory accesses, and trapping each write may take a page fault, depending on the host configuration (e.g. whether PML is enabled).