ToDo/Block/DmgChunkSizeIndependence


Summary: Refactor the block/dmg.c driver to read ranges within a chunk instead of reading an entire chunk at a time.

Contact: Stefan Hajnoczi <stefanha@redhat.com> for questions

NOTE: This task was started in February 2017 by sporgj_ <sporgj@gmail.com>. Contact Stefan if you want to work on it and it appears to have been abandoned.

The dmg file format is Apple's disk image format. It is sometimes used in place of an ISO file and also for software distribution. The file format is not officially documented but information on how to interpret it is available on Wikipedia and here.

QEMU's block/dmg.c block driver performs I/O in "chunk" units. Chunk information (size, offset, etc.) is read from the file's metadata when the image is opened. Chunk size is variable, so the block driver uses the maximum chunk size as its I/O buffer size. That way the driver can guarantee that any chunk read from disk will fit into the buffer.

The problem with this approach is that chunks in some files are large. QEMU must prevent corrupt or malicious image files from causing it to allocate huge amounts of memory. This is important to prevent denial-of-service in cases where untrusted files are being accessed by the user.

Files seen in the wild have chunk sizes of around 250 MB. This exceeds the hardcoded 64 MB limit and produces the following error message:

qemu-img: sector count 511952 for chunk 0 is larger than max (131072)
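
The figures in this message can be checked by hand. The following is a small sketch that assumes sector counts are expressed in 512-byte sectors; the helper names are made up for illustration and are not taken from block/dmg.c.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: dmg sector counts are in units of 512 bytes. */
#define SECTOR_SIZE 512ULL

/* Convert a sector count to a size in bytes. */
uint64_t sectors_to_bytes(uint64_t sectors)
{
    return sectors * SECTOR_SIZE;
}

/* Convert a size in bytes to whole MiB, rounding down. */
uint64_t bytes_to_mib(uint64_t bytes)
{
    return bytes / (1024ULL * 1024ULL);
}
```

With these helpers, the limit of 131072 sectors is exactly 64 MiB, while the 511952 sectors of chunk 0 come to 262119424 bytes, just under 250 MiB.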

The task is to refactor the block driver to avoid reading an entire chunk into a buffer for each read request. Instead it should seek to the range within the chunk and only read bytes that are actually being requested.
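
For a chunk that is stored uncompressed, seeking to the requested range is plain offset arithmetic. The sketch below shows one way it could look. The RawChunk structure and raw_chunk_map() are hypothetical names used only for illustration, and 512-byte sectors are assumed. The existing driver's data structures may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: guest sectors are 512 bytes. */
#define SECTOR_SIZE 512ULL

/* Hypothetical description of one raw (uncompressed) chunk. */
typedef struct {
    uint64_t first_sector;  /* first guest sector covered by the chunk */
    uint64_t sector_count;  /* number of guest sectors in the chunk */
    uint64_t file_offset;   /* byte offset of the chunk data in the image */
} RawChunk;

/*
 * Map the guest request [sector, sector + count) onto the part of the image
 * file that backs it, clamped to the end of the chunk. Returns false if the
 * request does not start inside the chunk.
 */
bool raw_chunk_map(const RawChunk *c, uint64_t sector, uint64_t count,
                   uint64_t *file_pos, uint64_t *nbytes)
{
    uint64_t skip, avail;

    assert(c && file_pos && nbytes);

    if (sector < c->first_sector ||
        sector - c->first_sector >= c->sector_count) {
        return false;
    }

    skip = sector - c->first_sector;   /* sectors to skip inside the chunk */
    avail = c->sector_count - skip;    /* sectors left until the chunk ends */
    if (count > avail) {
        count = avail;
    }

    *file_pos = c->file_offset + skip * SECTOR_SIZE;
    *nbytes = count * SECTOR_SIZE;
    return true;
}
```

The caller would then read only nbytes bytes at file_pos, so the buffer size depends on the request rather than on the chunk size.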

This is complicated by the fact that chunks may be deflate- or bzip2-compressed, so special logic is needed to seek through compressed data. Study the zlib and libbz2 APIs to find out how to achieve this.
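
One possible approach, sketched below, relies on the fact that a deflate or bzip2 stream generally cannot be entered at an arbitrary position. The code decompresses from the start of the chunk through a small fixed-size scratch buffer, discards output until the requested offset is reached, and then decompresses straight into the caller's buffer. The decode_fn callback is a hypothetical stand-in for a wrapper around zlib's inflate() or libbz2's BZ2_bzDecompress(). The demo decoder exists only so that the sketch can be run without either library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical pull-style decoder: writes up to `cap` decompressed bytes to
 * `out` and returns the number produced, 0 at end of stream, -1 on error.
 */
typedef long (*decode_fn)(void *opaque, uint8_t *out, size_t cap);

/*
 * Read `len` decompressed bytes starting at byte `offset` within a chunk.
 * Returns the number of bytes stored in `dst` (short if the stream ends
 * early), or -1 on decoder error. Memory use is bounded by the scratch
 * buffer regardless of the chunk size.
 */
long read_range(decode_fn decode, void *opaque, uint64_t offset,
                uint8_t *dst, size_t len)
{
    uint8_t scratch[4096];
    size_t copied = 0;

    assert(decode && dst);

    /* Phase 1: decode and throw away everything before `offset`. */
    while (offset > 0) {
        size_t want = offset < sizeof(scratch) ? (size_t)offset
                                               : sizeof(scratch);
        long n = decode(opaque, scratch, want);
        if (n < 0) {
            return -1;
        }
        if (n == 0) {
            return 0;   /* stream ended before the requested offset */
        }
        offset -= (uint64_t)n;
    }

    /* Phase 2: decode the requested bytes directly into the caller's buffer. */
    while (copied < len) {
        long n = decode(opaque, dst + copied, len - copied);
        if (n < 0) {
            return -1;
        }
        if (n == 0) {
            break;
        }
        copied += (size_t)n;
    }
    return (long)copied;
}

/*
 * Stand-in decoder for demonstration only: emits the bytes 0, 1, 2, ...
 * (mod 256) up to `total` bytes, at most 7 bytes per call.
 */
typedef struct {
    uint64_t pos;
    uint64_t total;
} DemoStream;

long demo_decode(void *opaque, uint8_t *out, size_t cap)
{
    DemoStream *s = opaque;
    size_t n = 0;

    while (n < cap && n < 7 && s->pos < s->total) {
        out[n++] = (uint8_t)(s->pos++ & 0xff);
    }
    return (long)n;
}
```

The trade-off is CPU time proportional to the offset on every read. A real driver would probably want to keep the decompressor state between sequential requests so that it does not restart from the beginning of the chunk each time.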