File systems within a guest tend not to reuse blocks. As a result, even if a file system remains small relative to the virtual image size, the actual image size tends to grow until it reaches the maximum size.
This can be mitigated in two ways. Recent guests support the ability to send TRIM commands to a virtual disk indicating that a block is no longer in use by the file system. This can be thought of as equivalent to "I don't care about the contents of this block anymore". For guests that don't support the TRIM command, a common technique to reduce image size is to periodically write out a file that contains nothing but zeros.
We can achieve image size reduction by using the TRIM command or detecting zero writes to free allocated clusters. This spec discusses how to implement free cluster tracking in QED.
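As a minimal sketch of the zero-write case, the following shows one way a cluster-sized write of zeros could release its cluster instead of being stored. The names `CLUSTER_SIZE`, `is_zero_cluster`, and `write_cluster`, and the dictionary used as a mapping table, are hypothetical and not part of QED. The sketch also assumes an image without a backing file, so that an unmapped cluster reads back as zeros.

```python
CLUSTER_SIZE = 64 * 1024  # assumed cluster size, for illustration only


def is_zero_cluster(buf):
    """Return True if buf is exactly one cluster long and contains only zeros."""
    return len(buf) == CLUSTER_SIZE and buf.count(0) == CLUSTER_SIZE


def write_cluster(table, freed, index, buf):
    """Handle a full-cluster guest write (hypothetical helper).

    table maps a guest cluster index to an image offset; freed collects
    image offsets that are no longer referenced. Returns True if the
    write was turned into a release instead of being stored.
    """
    if not is_zero_cluster(buf):
        return False  # caller performs a normal write
    offset = table.pop(index, None)  # unmap; later reads see zeros
    if offset is not None:
        freed.append(offset)  # cluster becomes a free-list candidate
    return True
```

A TRIM request could feed the same path: the guest states that it no longer cares about the contents, so the cluster can be unmapped without inspecting any data.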
- The free list is discoverable at fsck time by searching for orphaned blocks.
- We know an image is stable whenever we issue an fsync(). If we write a bit in the header to indicate the header is stable after one fsync() and then follow that write with another fsync(), we know for sure that the bit is reliable.
- We currently plan to write this bit out during shutdown only.
- If we added a compat feature to QED consisting of a pointer to the first free block, we could chain subsequent free blocks by writing a header to each free cluster.
- If the header dirty bit isn't set, we can rely on the free block pointer to enumerate all of the free clusters in the image; otherwise, we need to search for orphaned blocks during fsck.
- When allocating a new cluster, we can simply consult the free list instead of allocating at the end of file.
- Defragmentation could also consider the free list in choosing candidate defragmentation targets.
- We could aggressively reduce the free list size by moving allocated blocks to the free list using a similar technique to defragmentation.
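The free list described in the points above can be sketched as a singly linked list threaded through the free clusters themselves. This is an illustrative in-memory model, not QED's on-disk format. The 8-byte little-endian "next" header, the `free_head` field, the use of offset 0 as the end-of-list marker (cluster 0 is assumed to hold the image header), and all class and function names are assumptions made for the example.

```python
import struct

CLUSTER_SIZE = 4096  # deliberately small, for illustration only
NEXT_FMT = "<Q"      # hypothetical per-cluster header: offset of next free cluster


class Image:
    """Toy model of an image file with a chained free list."""

    def __init__(self, n_clusters):
        self.data = bytearray(n_clusters * CLUSTER_SIZE)
        self.free_head = 0  # hypothetical header field; 0 means "list empty"

    def free_cluster(self, offset):
        # Write the old head into the freed cluster, then point the head at it.
        struct.pack_into(NEXT_FMT, self.data, offset, self.free_head)
        self.free_head = offset

    def alloc_cluster(self):
        # Prefer a cluster from the free list over growing the file.
        if self.free_head:
            offset = self.free_head
            (self.free_head,) = struct.unpack_from(NEXT_FMT, self.data, offset)
            return offset
        offset = len(self.data)  # fall back to allocating at end of file
        self.data.extend(bytes(CLUSTER_SIZE))
        return offset

    def free_clusters(self):
        # Walk the chain; only trustworthy when the header is known clean.
        out, offset = [], self.free_head
        while offset:
            out.append(offset)
            (offset,) = struct.unpack_from(NEXT_FMT, self.data, offset)
        return out


def rebuild_free_list(img, referenced):
    """Fallback for a dirty header: treat every unreferenced cluster as free.

    referenced is the set of offsets reachable from the image's tables.
    """
    img.free_head = 0
    for offset in range(len(img.data) - CLUSTER_SIZE, 0, -CLUSTER_SIZE):
        if offset not in referenced:
            img.free_cluster(offset)
```

In this model, a clean header lets `free_clusters()` enumerate the list by following pointers, while `rebuild_free_list()` plays the role of the orphaned-block search. Scanning from the highest offset downward leaves the lowest free offset at the head, so new allocations land near the start of the file, which is one possible choice rather than a requirement.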