Features/WriteCacheEnable

Summary

Allow the guest to enable or disable the disk write cache at runtime. Currently the guest disk write cache can only be configured on the QEMU command line at startup.

Owner

  • Name: Stefan Hajnoczi
  • Email: stefanha@linux.vnet.ibm.com

virtio-blk guest driver enhancements

  • The virtio-blk guest driver requires a new feature bit indicating the presence of write cache enable/disable.
  • The guest driver can query the current status of the write cache from a new virtio-blk config field. The field is `__u8 wce`.
  • The guest driver can change the write cache status by sending a new request type VIRTIO_BLK_T_SET_WCE. A new request type is used, rather than allowing writes to virtio_blk_config.wce, because a config write is a synchronous instruction from the vcpu's perspective and has no way to report errors back to the guest. A request is asynchronous from the guest's perspective and completes with a status that can indicate any error that occurred.
  • The guest can query and change the write cache status from userspace using a sysfs attribute similar to the cache_type attribute of the SCSI sd driver.
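
As an illustration of the sysfs side, here is a minimal sketch of how a cache_type-style string could be mapped to a write cache enable flag. It assumes the new attribute would accept the same four strings as sd's cache_type; the helper name parse_cache_type is hypothetical and this is not code from any existing driver.

```c
#include <stdbool.h>
#include <string.h>

/* Strings accepted by the SCSI sd driver's cache_type attribute.
 * Assumption: a virtio-blk attribute would reuse the same strings. */
static const char *const cache_types[] = {
    "write through",
    "none",
    "write back",
    "write back, no read (daft)",
};

/*
 * Hypothetical helper: parse a cache_type string written by userspace.
 * On success stores whether the write cache should be enabled in *wce
 * and returns 0. Returns -1 if the string is not recognised.
 */
static int parse_cache_type(const char *buf, bool *wce)
{
    size_t len = strlen(buf);
    size_t i;

    /* Values written with echo end in a newline; ignore it. */
    if (len > 0 && buf[len - 1] == '\n') {
        len--;
    }

    for (i = 0; i < sizeof(cache_types) / sizeof(cache_types[0]); i++) {
        if (strlen(cache_types[i]) == len &&
            strncmp(buf, cache_types[i], len) == 0) {
            /* Bit 1 of the table index selects write-back caching. */
            *wce = (i & 2) != 0;
            return 0;
        }
    }
    return -1;
}
```

With this mapping, "write back" enables the write cache while "write through" and "none" disable it.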

qemu-kvm enhancements

  • The virtio-blk device needs to support the new feature bit indicating the presence of write cache enable/disable.
  • The virtio-blk device needs to expose the block device's cache mode in the new virtio_blk_config.wce field.
  • The virtio-blk device needs to change the block device's cache mode when the VIRTIO_BLK_T_SET_WCE request is issued by the guest.
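
The device-side behaviour listed above can be modelled roughly as follows. This is a sketch under stated assumptions, not QEMU code: struct wce_dev, handle_set_wce and the two stand-in backends are hypothetical names, and only the VIRTIO_BLK_S_* status values are taken from the existing virtio-blk interface. The point it shows is that a failed cache mode change is reported through the request status while the device keeps its previous setting.

```c
#include <stdbool.h>
#include <stdint.h>

/* Request status codes already defined by virtio-blk. */
#define VIRTIO_BLK_S_OK     0
#define VIRTIO_BLK_S_IOERR  1
#define VIRTIO_BLK_S_UNSUPP 2

/* Hypothetical model of the device state; not QEMU's real structures. */
struct wce_dev {
    bool feature_negotiated;   /* guest acked the new feature bit */
    uint8_t wce;               /* value exposed as virtio_blk_config.wce */
    /* Changes the backend cache mode; returns 0 on success. */
    int (*set_cache_mode)(struct wce_dev *dev, bool enable);
};

/* Handle a VIRTIO_BLK_T_SET_WCE request and return its status byte. */
static uint8_t handle_set_wce(struct wce_dev *dev, uint8_t new_wce)
{
    if (!dev->feature_negotiated) {
        return VIRTIO_BLK_S_UNSUPP;
    }

    new_wce = new_wce ? 1 : 0;
    if (new_wce == dev->wce) {
        return VIRTIO_BLK_S_OK;    /* nothing to change */
    }

    if (dev->set_cache_mode(dev, new_wce != 0) != 0) {
        /* Keep the old mode; the guest sees the error in the status. */
        return VIRTIO_BLK_S_IOERR;
    }

    dev->wce = new_wce;            /* config field now reflects new mode */
    return VIRTIO_BLK_S_OK;
}

/* Stand-in backends for illustration only. */
static int backend_ok(struct wce_dev *dev, bool enable)
{
    (void)dev; (void)enable;
    return 0;
}

static int backend_fail(struct wce_dev *dev, bool enable)
{
    (void)dev; (void)enable;
    return -1;
}
```

Updating the config field only after the backend change succeeds keeps virtio_blk_config.wce consistent with the cache mode actually in effect.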

Changing the block device's cache mode must be implemented carefully so that an image file that has been deleted, or that cannot be reopened for some other reason, is handled safely. If reopening the block device fails, the device should keep its existing cache mode and should not start failing I/O requests.

It is not possible to toggle O_SYNC on an open file descriptor on Linux, because fcntl(F_SETFL) cannot change that flag. Here is a sketch of the suggested workaround:

quiesce_and_ensure_no_pending_aio(old_fd);
fsync(old_fd);
new_fd = open("/proc/$pid/fd/$old_fd", new_flags);
if (new_fd < 0) {
    /* fail and keep using old_fd */
} else {
    close(old_fd);
    /* use new_fd from now on */
}
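
The sketch above could be written as a small C function along these lines. This is an illustrative version, not the QEMU implementation: the name reopen_with_flags is hypothetical, /proc/self/fd is used in place of /proc/$pid/fd, and the caller is assumed to have already quiesced the device so that no AIO is pending on the old descriptor.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Reopen the file behind *fd with new_flags (for example O_RDWR | O_SYNC).
 *
 * On success the old descriptor is closed, *fd is replaced by the new
 * descriptor and 0 is returned. On failure *fd is left open and unchanged
 * and -1 is returned, so the caller can keep using it.
 *
 * Assumption: the caller has drained all in-flight I/O on *fd.
 */
static int reopen_with_flags(int *fd, int new_flags)
{
    char path[64];
    int new_fd;

    /* Flush data written under the old cache mode. */
    if (fsync(*fd) < 0) {
        return -1;
    }

    /* Going through /proc avoids looking the image up by name again. */
    snprintf(path, sizeof(path), "/proc/self/fd/%d", *fd);
    new_fd = open(path, new_flags);
    if (new_fd < 0) {
        return -1;             /* fail and keep using the old fd */
    }

    close(*fd);
    *fd = new_fd;              /* use the new fd from now on */
    return 0;
}
```

Note that the new descriptor refers to a new open file description, so its file offset starts at zero; code that relies on the offset should reposition it or use pread()/pwrite().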

Image format drivers in QEMU inspect the cache flags, so care must be taken to keep them updated with the latest cache setting.