Features/DiskIOLimits/Requirements

From QEMU
Revision as of 09:32, 24 May 2011 by Stefanha (talk | contribs) (Created page with 'Either we can provide fine-grained limits on I/O resources at the expense of user interface and implementation complexity, or we can provide simple I/O limits that do not provide…')

Either we can provide fine-grained limits on I/O resources at the expense of user interface and implementation complexity, or we can provide simple I/O limits that do not provide as much control but are easy to understand and implement. The fine-grained features are marked "(optional)".

Overview

Disk I/O limits cap the amount of disk I/O that a guest can perform. This is important when storage resources are shared between multiple VMs and excessive disk utilization in one VM could impact the performance of other VMs.

Per-disk I/O Limits

It must be possible to control disk I/O limits individually for each disk when multiple disks are attached to a VM. This enables use cases like unlimited local disk access but shared storage access with limits.

Multi-disk I/O Limits (optional)

It must be possible to assign a single disk I/O limit across multiple disks so that I/O on any of the disks is accounted against one shared limit. This enables use cases like setting a single disk I/O limit across several disks that come from the same storage resource.
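One way to model this is for each disk to reference a common accounting structure, so that requests from any of the disks draw down the same budget. The sketch below is illustrative only; the names `shared_limit` and `account_request` are hypothetical and not QEMU code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical sketch: several disks point at one shared limit so
 * their I/O is accounted against a single budget. */
struct shared_limit {
    uint64_t iops_limit; /* shared operations budget per interval */
    uint64_t iops_used;  /* operations accounted this interval */
};

struct disk {
    struct shared_limit *limit; /* may be shared with other disks */
};

/* Account one request against the disk's (possibly shared) limit.
 * Returns false if the shared budget is exhausted. */
static bool account_request(struct disk *disk)
{
    struct shared_limit *l = disk->limit;

    if (l->iops_used >= l->iops_limit) {
        return false;
    }
    l->iops_used++;
    return true;
}
```

Because both disks hold a pointer to the same structure, a request on either disk consumes the single shared budget.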

Fine-grained I/O Limits (optional)

Limits can affect only reads, only writes, or both reads and writes. This allows limits to be set differently for read requests and write requests, if desired by the administrator. The combined reads and writes limit can be used when no distinction between request types is desirable. Each of these three limits can be disabled (unlimited).

struct limit {
    uint64_t rd;   /* reads only, 0 = unlimited */
    uint64_t wr;   /* writes only, 0 = unlimited */
    uint64_t rdwr; /* both reads and writes, 0 = unlimited */
};

bool exceeds_limit(struct limit *limit, uint64_t value, bool is_write)
{
    uint64_t rd_or_wr = is_write ? limit->wr : limit->rd;

    /* A limit of 0 is disabled and can never be exceeded */
    return (rd_or_wr != 0 && value >= rd_or_wr) ||
           (limit->rdwr != 0 && value >= limit->rdwr);
}

Note that if fine-grained I/O limits are not implemented then a single limit that affects both reads and writes must be supported.

Iops Limit

Disk I/O limits must be configurable for iops (I/O operations per second). The iops limit stops a guest from performing many small requests that consume all available request processing capacity.
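As an illustration of how an iops cap might be enforced, the following sketch uses a fixed one-second accounting window; the structure and function names are hypothetical, not QEMU internals:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative sketch: a fixed-window iops accountant. Each second
 * the window resets; requests beyond the configured iops limit in
 * the current window must be queued by the caller. */
struct iops_state {
    uint64_t limit;  /* allowed operations per second, 0 = unlimited */
    uint64_t used;   /* operations issued in the current second */
    uint64_t window; /* the second the counters belong to */
};

/* Returns true if a request may be issued at time now_secs. */
static bool iops_allow(struct iops_state *s, uint64_t now_secs)
{
    if (now_secs != s->window) { /* new second: reset the window */
        s->window = now_secs;
        s->used = 0;
    }
    if (s->limit != 0 && s->used >= s->limit) {
        return false;            /* over the cap: caller must queue */
    }
    s->used++;
    return true;
}
```

A real implementation would likely use a token bucket or leaky bucket to smooth bursts rather than a hard per-second window; this sketch only shows the accounting idea.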

Throughput Limit (optional)

Disk I/O limits must be configurable for throughput (bytes per second). The throughput limit stops a guest from issuing a few very large requests that consume all available bandwidth.

Note that if throughput limits are not implemented then iops can be used to approximate throughput limits by calculating the iops limit for the VM's average request size. For example, a 10 MB/s throughput limit for a VM that performs 8 KB requests would be 1280 iops.
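The calculation above can be expressed directly; the helper name below is illustrative:

```c
#include <assert.h>
#include <stdint.h>

/* Approximate an iops limit from a throughput limit and the VM's
 * average request size, as described above. */
static uint64_t iops_for_throughput(uint64_t bytes_per_sec,
                                    uint64_t avg_request_bytes)
{
    return bytes_per_sec / avg_request_bytes;
}
```

For the example in the text, 10 MB/s (10 * 1024 * 1024 bytes) divided by an 8 KB average request size gives 1280 iops.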

I/O Limiting

Requests that exceed the current limit must be queued, since block I/O is expected to be reliable. No error may be returned to the VM, because the guest application could then see failed I/O.

Queued requests must be issued once I/O resources fall below the limit again.
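A minimal sketch of such queueing, assuming a simple FIFO of deferred requests (names and structure are hypothetical, not QEMU internals):

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative sketch: requests that exceed the limit are queued
 * rather than failed, and drained once capacity is available again. */
struct request {
    struct request *next;
};

struct throttle_queue {
    struct request *head, *tail;
};

/* Defer a request that exceeded the limit; never fail it. */
static void defer_request(struct throttle_queue *q, struct request *req)
{
    req->next = NULL;
    if (q->tail) {
        q->tail->next = req;
    } else {
        q->head = req;
    }
    q->tail = req;
}

/* Called when I/O falls below the limit: dequeue the next deferred
 * request in FIFO order, or NULL if none are waiting. */
static struct request *next_deferred(struct throttle_queue *q)
{
    struct request *req = q->head;
    if (req) {
        q->head = req->next;
        if (!q->head) {
            q->tail = NULL;
        }
    }
    return req;
}
```

FIFO order preserves the guest's request ordering for deferred requests, which keeps throttling transparent to the VM apart from added latency.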

Command-line options

The QEMU -drive option must be extended so that disk I/O limits can be specified on the command-line.
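One possible shape for such options is sketched below; the option names `iops` and `bps` are illustrative, not a committed syntax:

```shell
# Hypothetical example: cap the disk at 1000 iops and 10 MB/s
qemu -drive file=disk.img,if=virtio,iops=1000,bps=10485760
```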

QMP interface

It must be possible to change disk I/O limits at runtime using the QEMU Monitor Protocol interface.
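A hypothetical QMP exchange might look like the following; the command name and argument names are illustrative only:

```json
{ "execute": "block_io_throttle",
  "arguments": { "device": "virtio0", "iops": 1000, "bps": 10485760 } }
```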

Libvirt integration

Domain XML must support disk I/O limits and the libvirt QEMU driver must be able to set them.
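One possible domain XML shape is sketched below; the element names are illustrative, assuming a per-disk tuning block nested under the existing &lt;disk&gt; element:

```xml
<disk type='file' device='disk'>
  <source file='/var/lib/libvirt/images/guest.img'/>
  <target dev='vda' bus='virtio'/>
  <iotune>
    <total_iops_sec>1000</total_iops_sec>
    <total_bytes_sec>10485760</total_bytes_sec>
  </iotune>
</disk>
```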

Enhanced I/O Statistics (optional)

Per-disk average queue depth and request latency must be added to existing block statistics. This will enable administrators to monitor and understand disk performance.
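Such statistics can be derived from simple per-disk accumulators; the sketch below is illustrative and the field names are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative sketch: accumulate per-disk totals from which average
 * request latency can be derived at query time. */
struct blk_stats {
    uint64_t requests;         /* completed requests */
    uint64_t total_latency_ns; /* summed per-request latency */
};

/* Average request latency in nanoseconds, 0 if no requests yet. */
static uint64_t avg_latency_ns(const struct blk_stats *s)
{
    return s->requests ? s->total_latency_ns / s->requests : 0;
}
```

Average queue depth could be tracked the same way, by sampling the number of in-flight requests each time a request is issued or completed.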

TODO

  1. default values
  2. discovery of host limits
  3. device hotplug
  4. notification mechanism of reaching limit
    1. pre-limit watermarks, e.g. within 10% of the limit...
  5. guest agent integration (may be part of (4))
  6. ensure we have the right API for letting a separate policy manager do things like apply a system-wide policy to all VMs