Features/Sheepdog

Revision as of 06:03, 5 October 2010 by Kazutaka

Summary

A distributed storage system for QEMU

Owner

  • Name: Kazutaka Morita
  • Email: morita.kazutaka@lab.ntt.co.jp

Detailed Summary

Sheepdog is a distributed storage system for QEMU. It provides highly available block-level storage volumes that can be attached to QEMU-based virtual machines. Sheepdog scales to several hundred nodes and supports advanced volume management features such as snapshots, cloning, and thin provisioning.
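To make thin provisioning and snapshots concrete, here is a minimal Python sketch of the general idea. It is not Sheepdog's implementation: the class `ThinVolume`, its methods, and the 4 MiB `OBJECT_SIZE` are all assumptions chosen for illustration. The volume advertises a large virtual size but stores an object only once something is written to it, and a snapshot shares existing objects with its parent until the parent overwrites them.

```python
from typing import Dict, Optional

OBJECT_SIZE = 4 * 1024 * 1024  # assumed fixed object size, for illustration only


class ThinVolume:
    """Toy thin-provisioned volume (hypothetical, not Sheepdog's code)."""

    def __init__(self, size: int, objects: Optional[Dict[int, bytes]] = None):
        self.size = size  # advertised (virtual) size in bytes
        # object index -> immutable object data; unwritten objects are absent
        self.objects: Dict[int, bytes] = dict(objects or {})

    def allocated_bytes(self) -> int:
        """Space actually backing this volume, counted in whole objects."""
        return len(self.objects) * OBJECT_SIZE

    def write(self, offset: int, data: bytes) -> None:
        if offset < 0 or offset + len(data) > self.size:
            raise ValueError("write outside volume")
        pos = 0
        while pos < len(data):
            idx, start = divmod(offset + pos, OBJECT_SIZE)
            chunk = data[pos:pos + OBJECT_SIZE - start]
            old = self.objects.get(idx, bytes(OBJECT_SIZE))
            # Build a new object instead of mutating the old one, so any
            # snapshot that still references the old object is unaffected.
            self.objects[idx] = old[:start] + chunk + old[start + len(chunk):]
            pos += len(chunk)

    def read(self, offset: int, length: int) -> bytes:
        if offset < 0 or offset + length > self.size:
            raise ValueError("read outside volume")
        out = bytearray()
        pos = 0
        while pos < length:
            idx, start = divmod(offset + pos, OBJECT_SIZE)
            n = min(length - pos, OBJECT_SIZE - start)
            obj = self.objects.get(idx)
            # Objects that were never written read back as zeros.
            out += obj[start:start + n] if obj is not None else bytes(n)
            pos += n
        return bytes(out)

    def snapshot(self) -> "ThinVolume":
        """Return a point-in-time copy that shares all existing objects."""
        return ThinVolume(self.size, self.objects)


if __name__ == "__main__":
    vol = ThinVolume(1 << 40)              # 1 TiB virtual size
    vol.write(OBJECT_SIZE - 2, b"abcd")    # touches two objects
    snap = vol.snapshot()
    vol.write(OBJECT_SIZE - 2, b"WXYZ")    # parent diverges from snapshot
    print(vol.allocated_bytes(), snap.read(OBJECT_SIZE - 2, 4))
```

In this sketch a clone would be created the same way as a snapshot and then written to independently; only the objects it modifies consume new space.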

The architecture of Sheepdog is fully symmetric; there is no central node such as a metadata server. This design enables the following features:

  • Linear scalability in performance and capacity: When more performance or capacity is needed, Sheepdog can be grown linearly by simply adding new machines to the cluster.

  • No single point of failure: Even if a machine fails, the data is still accessible through other machines.

  • Easy administration: No configuration file is needed to define each machine's role in the cluster. When administrators launch the Sheepdog programs on a newly added machine, Sheepdog automatically detects the machine and begins to configure it as a member of the cluster.
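One common way for a symmetric cluster to locate data without a metadata server is consistent hashing. The Python sketch below illustrates that general technique only; this page does not describe Sheepdog's placement algorithm, and the names `HashRing`, `locate`, the virtual-node count, and the node names are all hypothetical. Every machine that knows the membership list can compute an object's owners locally, adding a machine moves only the objects that the new machine takes over, and removing a machine leaves the remaining replicas reachable.

```python
import bisect
import hashlib
from typing import List, Tuple


def _position(key: str) -> int:
    """Map a string to a 64-bit integer position on the ring."""
    return int.from_bytes(hashlib.sha1(key.encode()).digest()[:8], "big")


class HashRing:
    """Toy consistent-hash ring (illustrative, not Sheepdog's code)."""

    def __init__(self, vnodes: int = 64):
        self.vnodes = vnodes  # virtual points per machine (assumed value)
        self._points: List[Tuple[int, str]] = []  # sorted (position, node)

    def add_node(self, node: str) -> None:
        for i in range(self.vnodes):
            bisect.insort(self._points, (_position(f"{node}#{i}"), node))

    def remove_node(self, node: str) -> None:
        self._points = [p for p in self._points if p[1] != node]

    def locate(self, object_id: str, copies: int = 3) -> List[str]:
        """Return up to `copies` distinct nodes clockwise from the object."""
        if not self._points:
            return []
        start = bisect.bisect(self._points, (_position(object_id), ""))
        owners: List[str] = []
        for k in range(len(self._points)):
            node = self._points[(start + k) % len(self._points)][1]
            if node not in owners:
                owners.append(node)
                if len(owners) == copies:
                    break
        return owners


if __name__ == "__main__":
    ring = HashRing()
    for name in ("node-a", "node-b", "node-c"):
        ring.add_node(name)
    ids = [f"obj-{i}" for i in range(1000)]
    before = {i: ring.locate(i, copies=1)[0] for i in ids}
    ring.add_node("node-d")  # grow the cluster by one machine
    after = {i: ring.locate(i, copies=1)[0] for i in ids}
    moved = [i for i in ids if before[i] != after[i]]
    print(f"{len(moved)} of {len(ids)} objects moved, all to node-d:",
          all(after[i] == "node-d" for i in moved))
```

Because placement is a pure function of the membership list, no machine has a special role, which matches the "no central node" property described above.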

Getting Started

Status

  • QEMU 0.13 provides built-in support for Sheepdog block devices.

TODOs

generic items

  • add more documentation for users
  • add more documentation for developers
  • add testing tools to avoid regression
  • output better debug and error messages
  • support architectures other than x86_64 and i386
  • support libvirt

sheep

  • update VDI objects atomically
  • handle connection timeout
  • scalability up to several hundred nodes
  • better data re-balancing
  • remove data objects which are no longer used
  • provide different redundancy levels for each VDI
  • handle total node failure
  • handle network partition failure
  • remove the limit on the number of VMs on the same host
  • support VMs running outside the cluster
  • better load balancing and performance

collie

  • provide a machine-parsable output format option
  • provide a command for manual recovery from total node failure
  • show differences between VDIs to enable efficient backups

qemu block driver

  • support live migration
  • support snapshot deletion
  • support a variable object size
  • support shrinking image size

Links