Google Summer of Code 2014

From QEMU

Revision as of 10:36, 7 February 2014

Introduction

QEMU is applying as a mentoring organization for Google Summer of Code 2014. This page contains our ideas list and information for students and mentors.

Note to students: Google has not yet announced participating organizations for 2014. We do not know whether or not QEMU can participate this year, so use your time wisely and don't invest too much effort yet.

Find Us

Please contact the mentor for the project idea you are interested in. IRC is usually the quickest way to get an answer.

For general questions about QEMU in GSoC, please contact the following people:

Important links

Project Ideas

This is the listing of suggested project ideas. Students are free to suggest their own projects by emailing qemu-devel@nongnu.org and (optionally) CCing potential mentors.

QEMU projects

Device driver framework for low-level testing

Summary: Implement bus drivers and example device tests

It is currently very hard to exercise device emulation code and, as a result, most device emulation code does not have tests. The reason for this is that modern hardware is complex and often requires elaborate setup before it can be accessed. When applicable, we also want to test the same devices under multiple emulated boards. The same test should apply to all boards, just like Linux uses the same network card driver on any system where you can plug that card in.

The analogy with Linux is not a coincidence; a good device testing framework is really very similar to the plug-and-play subsystem of a "real" operating system. So far we have not had any of this, which made testing QEMU device emulation code "too hard"... but this project will change that!

QEMU has a mode called 'qtest' where it simulates the machine without running guest code. Instead, qtest exposes a protocol for accessing memory, detecting interrupts, and stepping the clock. Tests use qtest mode to verify that an emulated device operates according to its specification.

For example, a test case can program the timer chip and then step the clock, checking if the timer interrupt has been raised.
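The timer example can be illustrated with a toy model. The sketch below is not the qtest protocol or the libqtest API; `ToyTimer`, `ToyMachine`, and `clock_step` are hypothetical names chosen only to show the shape of such a test: program the device, advance virtual time explicitly, then check the interrupt line.

```python
class ToyTimer:
    """Hypothetical one-shot timer: raises its IRQ once the deadline passes."""

    def __init__(self):
        self.deadline_ns = None
        self.irq_raised = False

    def program(self, now_ns, delay_ns):
        # Arm the timer relative to the current virtual time.
        self.deadline_ns = now_ns + delay_ns
        self.irq_raised = False

    def tick(self, now_ns):
        # Called whenever virtual time advances.
        if self.deadline_ns is not None and now_ns >= self.deadline_ns:
            self.irq_raised = True
            self.deadline_ns = None


class ToyMachine:
    """Hypothetical machine whose clock only moves when the test steps it."""

    def __init__(self):
        self.now_ns = 0
        self.timer = ToyTimer()

    def clock_step(self, ns):
        self.now_ns += ns
        self.timer.tick(self.now_ns)


def test_timer_fires_after_deadline():
    machine = ToyMachine()
    machine.timer.program(machine.now_ns, delay_ns=1000)
    machine.clock_step(999)
    assert not machine.timer.irq_raised  # one nanosecond too early
    machine.clock_step(1)
    assert machine.timer.irq_raised      # deadline reached


test_timer_fires_after_deadline()
```

Because no guest code runs and time moves only when stepped, a test of this shape is deterministic, which is the property that makes this style of testing attractive.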

In order to do this for more complex PCI, I2C, ISA, and USB devices, we need the equivalent of device driver frameworks that operating system kernels have. Right now, QEMU's libqos provides a meager set of device driver APIs but much more is necessary before test cases can be expressed concisely without a lot of setup.

Links:

Details:

  • Skill level: intermediate to advanced
  • Language: C
  • Mentor: Stefan Hajnoczi <stefanha@redhat.com> (stefanha on IRC), Paolo Bonzini <pbonzini@redhat.com>

Disk image fuzz testing

Summary: Implement fuzz testing for image file formats to identify security vulnerabilities before the bad guys do

QEMU supports a range of image file formats including qcow2, VMDK, and VHDX. Image files are often uploaded by untrusted users to cloud providers or shared online as untrusted "demo appliances". Any bug in QEMU that can be triggered by a malicious image file could be used to compromise the host opening the image file.

Fuzz testing is an automated random testing technique that explores a program's code paths by trying random inputs. When the program under test crashes, a bug has been found. These techniques have been used successfully in other areas such as testing the Linux system call interface.

Your task is to implement fuzz tests for qcow2, VMDK, and VHDX. These tests will be merged into qemu.git and become part of the test suite. You can either fix bugs found by your tests yourself, or work with the community to report them and provide fixes.

You should consider different approaches to fuzzing that interest you (e.g. combining with code coverage metrics). Using an existing fuzzing framework may be a good idea too.
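As a rough illustration of the simplest approach, mutation fuzzing, the sketch below overwrites random bytes in a valid input and feeds the result to a parser. Everything here is an assumption made for the example: the 12-byte `TOY1` header is a made-up format, not qcow2, VMDK, or VHDX, and `parse_toy_header`, `mutate`, and `fuzz` are hypothetical helpers. A real test would run QEMU tools against mutated image files and watch for crashes.

```python
import random
import struct

MAGIC = b"TOY1"  # hypothetical magic, not a real image format


def parse_toy_header(data):
    """Parse a made-up 12-byte header: magic, cluster_bits, table_offset."""
    if len(data) < 12:
        raise ValueError("header too short")
    magic, cluster_bits, table_offset = struct.unpack(">4sII", data[:12])
    if magic != MAGIC:
        raise ValueError("bad magic")
    if not 9 <= cluster_bits <= 21:
        raise ValueError("cluster_bits out of range")
    if table_offset > len(data):
        raise ValueError("table offset beyond end of file")
    return cluster_bits, table_offset


def mutate(data, rng, max_flips=4):
    """Return a copy of data with a few randomly chosen bytes overwritten."""
    buf = bytearray(data)
    for _ in range(rng.randint(1, max_flips)):
        buf[rng.randrange(len(buf))] = rng.randrange(256)
    return bytes(buf)


def fuzz(seed_input, iterations, seed=0):
    """Feed mutated inputs to the parser and collect unexpected exceptions."""
    rng = random.Random(seed)
    findings = []
    for _ in range(iterations):
        sample = mutate(seed_input, rng)
        try:
            parse_toy_header(sample)
        except ValueError:
            pass  # a clean rejection is the expected outcome
        except Exception as exc:  # anything else counts as a finding
            findings.append((sample, exc))
    return findings


valid = struct.pack(">4sII", MAGIC, 16, 12) + b"\x00" * 20
findings = fuzz(valid, iterations=1000)
print(len(findings), "unexpected failures")
```

Seeding the random generator makes any finding reproducible, which matters once a failing input has to be reported or turned into a regression test.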

Links:

Details:

  • Skill level: intermediate
  • Language: Python and/or shell
  • Mentor: Stefan Hajnoczi <stefanha@redhat.com> (stefanha on IRC)

Incremental backup of block images

Summary: Implement persistent incremental image backup.

Users want to make regular backups of VM images to protect data from unexpected loss. Incremental backup is a backup strategy that copies out only the "new data" that has changed since the previous backup, to reduce the overhead of backup and improve storage utilization. To track which parts of the guest data have changed, QEMU needs to store the image's "dirty bitmap" on disk alongside the image data itself.

The task is to implement a new block driver (a filter) to load/store this persistent dirty bitmap file, and maintain the dirty bits while the guest writes to the data image. As a prerequisite, you also need to design this bitmap file format. Then, design test cases and write scripts to test the driver.

The persistent bitmap file must contain:

  • Magic bits to identify the format of this file.
  • Bitmap granularity (e.g. 64 KB)
  • The actual bitmap (1 TB disk @ 64 KB granularity = 2 MB bitmap)
  • Flags including a "clean" flag. The "clean" flag is used to tell whether the persistent bitmap file is safe to use again. When QEMU opens the persistent dirty bitmap, it clears the "clean" flag. When QEMU deactivates and finishes writing out the dirty bitmap, it sets the "clean" flag. If the QEMU process crashes it is not safe to trust the dirty bitmap; a full backup must be performed. Make use of this flag in the driver to limit the performance overhead.
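The fields listed above could be laid out as in the following sketch. The layout is purely an assumption for illustration, since designing the real format is part of the project: the `DBMP` magic, the `FLAG_CLEAN` bit position, the field order, and all function names are hypothetical, and the sizes assume binary units (1 TB = 2^40 bytes, 64 KB = 2^16 bytes).

```python
import struct

MAGIC = b"DBMP"                   # hypothetical magic value
FLAG_CLEAN = 1 << 0               # hypothetical "clean" flag bit
HEADER = struct.Struct("<4sIQQ")  # magic, flags, granularity, disk size


def bitmap_bytes(disk_size, granularity):
    """One bit per granularity-sized chunk, rounded up to whole bytes."""
    chunks = -(-disk_size // granularity)  # ceiling division
    return -(-chunks // 8)


def pack_header(disk_size, granularity, clean):
    flags = FLAG_CLEAN if clean else 0
    return HEADER.pack(MAGIC, flags, granularity, disk_size)


def is_safe_to_use(header):
    """The bitmap is trusted only if the magic matches and "clean" is set."""
    magic, flags, _granularity, _disk_size = HEADER.unpack(header)
    return magic == MAGIC and bool(flags & FLAG_CLEAN)


def mark_dirty(bitmap, offset, length, granularity):
    """Set the bit of every chunk touched by a guest write."""
    first = offset // granularity
    last = (offset + length - 1) // granularity
    for chunk in range(first, last + 1):
        bitmap[chunk // 8] |= 1 << (chunk % 8)


size = bitmap_bytes(1 << 40, 64 * 1024)
print(size)  # 2097152 bytes, i.e. 2 MB for a 1 TB disk
```

In this sketch a driver would rewrite the header with the flag cleared when it opens the file, update the in-memory bitmap with `mark_dirty` on each guest write, and set the flag again only after the bitmap has been written out on a clean shutdown.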

Links:

Details:

  • Skill level: intermediate
  • Language: C
  • Mentors: Fam Zheng <famz@redhat.com> (fam on IRC), Stefan Hajnoczi <stefanha@redhat.com> (stefanha on IRC)

Libvirt projects

Introducing job control to the storage driver

Currently, libvirt supports job cancellation and progress reporting on domains. That is, if there is a long-running job on a domain, e.g. migration, libvirt reports how much data has already been transferred to the destination and how much still needs to be transferred. However, libvirt lacks such reporting in the storage area, which libvirt developers refer to as the storage driver. The aim is to report progress on several storage tasks, such as volume wiping, file allocation, and others.

  • Skill level: intermediate

Rewriting VirtualBox driver

If you have ever looked into our VirtualBox driver, you may still experience occasional headaches. I still do. The code is horribly structured, so we would be more than happy to have somebody rewrite it and bring the cleanliness that we strive to keep in the rest of the code.

  • Skill level: beginner

Your own idea

Just catch me (Michal Privoznik) on IRC and we can discuss what interests you.

Links:

Details:

  • Component: libvirt
  • Skill level: (see the description of each item)
  • Language: C
  • Mentor: Michal Privoznik <mprivozn@redhat.com>, mprivozn on IRC (#virt OFTC)
  • Suggested by: Michal Privoznik <mprivozn@redhat.com>

Project idea template

=== TITLE ===
 
 '''Summary:''' Short description of the project
 
 Detailed description of the project.
 
 '''Links:'''
 * Wiki links to relevant material
 * External links to mailing lists or web sites
 
 '''Details:'''
 * Skill level: beginner or intermediate or advanced
 * Language: C
 * Mentor: Email address and IRC nick
 * Suggested by: Person who suggested the idea

Information for mentors

Mentors are responsible for keeping in touch with their student and assessing the student's progress. GSoC has a mid-term evaluation and a final evaluation where both the mentor and student assess each other.

The mentor typically gives advice, reviews the student's code, and has regular communication with the student to ensure progress is being made.

Being a mentor is a significant time commitment; plan for 5 hours per week. Make sure you can make this commitment because backing out during the summer will affect the student's experience.

The mentor chooses their student by reviewing student application forms and conducting IRC interviews with candidates. Depending on the number of candidates, this can be time-consuming in itself. Choosing the right student is critical so that both the mentor and the student can have a successful experience.