Outreachy 2016 May-August

From QEMU
Revision as of 17:36, 15 February 2016 by Stefanha

Introduction

QEMU is participating in Outreachy 2016 May-August. This page contains our ideas list and information for candidates and mentors.

Find Us

  • IRC: #qemu-outreachy on irc.oftc.net
  • IRC (development):
    • QEMU: #qemu on irc.oftc.net
    • KVM: #kvm on chat.freenode.net

Please contact the mentor for the project idea you are interested in. IRC is usually the quickest way to get an answer.

For general questions about QEMU in Outreachy, please contact the following people:

How to get familiar with our software

See what people are developing and talking about on the mailing lists:

Grab the source code or browse it:

Build QEMU and run it: QEMU on Linux Hosts

Project Ideas

This is the listing of suggested project ideas.

QEMU projects

AF_VSOCK packet capture in Linux and Wireshark

Summary: Develop an AF_VSOCK packet capture Linux device driver and Wireshark dissector

Wireshark and Linux's packet capture functionality support more than just Ethernet traffic dumping. USB device traffic and netlink software communication can also be captured.

The AF_VSOCK address family is currently not supported by Wireshark because there is no Linux kernel device driver for packet capture. AF_VSOCK is the socket address family used by the virtio-vsock host/guest communication device, which is currently in development. The aim of this project is to first implement a Linux device driver for AF_VSOCK packet capture and then a Wireshark dissector. Minor changes to tcpdump may be necessary too.

This will allow tcpdump and Wireshark to dump host/guest communication with virtio-vsock (and maybe also VMware VMSockets). Traffic capture is an essential debugging tool for network communication and has not been available to programs using AF_VSOCK.
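To give a feel for what a dissector does, here is a minimal Python sketch that decodes one captured record. It assumes the record starts with the packet header that the virtio specification defines for virtio-vsock (44 bytes, little-endian, packed), followed by the payload. The real capture format is something the project has to design, and the names `dissect` and `summarize` are made up for this illustration. An actual Wireshark dissector would be written in C against Wireshark's own API.

```python
import struct
from collections import namedtuple

# Assumed layout: struct virtio_vsock_hdr from the virtio specification
# (little-endian, packed, 44 bytes). The capture format that the project
# ends up with may wrap or differ from this.
HDR = struct.Struct("<QQIIIHHIII")

VsockHdr = namedtuple(
    "VsockHdr",
    "src_cid dst_cid src_port dst_port length type op flags buf_alloc fwd_cnt",
)

# Operation codes as listed in the virtio specification.
OPS = {1: "REQUEST", 2: "RESPONSE", 3: "RST", 4: "SHUTDOWN",
       5: "RW", 6: "CREDIT_UPDATE", 7: "CREDIT_REQUEST"}


def dissect(record):
    """Split a captured record into a header tuple and its payload bytes."""
    if len(record) < HDR.size:
        raise ValueError("record shorter than header")
    hdr = VsockHdr._make(HDR.unpack_from(record))
    payload = record[HDR.size:HDR.size + hdr.length]
    if len(payload) != hdr.length:
        raise ValueError("truncated payload")
    return hdr, payload


def summarize(record):
    """Render a one-line, tcpdump-style summary of a record."""
    hdr, payload = dissect(record)
    op = OPS.get(hdr.op, "UNKNOWN(%d)" % hdr.op)
    return "%d:%d -> %d:%d %s len=%d" % (
        hdr.src_cid, hdr.src_port, hdr.dst_cid, hdr.dst_port, op, len(payload))
```

For example, a record carrying five payload bytes from CID 3 port 1024 to CID 2 port 80 with operation 5 is summarized as `3:1024 -> 2:80 RW len=5`.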

This project is challenging because you need to work on multiple codebases. You must have experience with device driver development and network programming.

Links:

Details:

  • Skill level: advanced
  • Language: C
  • Mentor: Stefan Hajnoczi <stefanha@redhat.com> (stefanha on IRC)

qemu-img fuzzing using afl-fuzz

Summary: Apply the afl-fuzz fuzz testing tool to qemu-img and submit patches fixing bugs discovered with afl-fuzz.

The qemu-img tool is used to convert between disk image file formats and inspect image files. It supports multiple file formats including qcow2, vmdk, vhdx, and parallels. Since this tool is often used on untrusted inputs (e.g. in a cloud or hosting environment where end-users can upload disk image files), it must not allow arbitrary code execution or other classes of security bugs.

afl-fuzz instruments the program to record codepaths taken for each input test file. This allows afl-fuzz to mutate inputs and choose the ones that explore new codepaths. The amount of prior knowledge that afl-fuzz needs about the input grammar is limited since it learns how inputs affect the codepath. This makes it possible to fuzz various disk image file formats without painstakingly writing grammars for each file format.
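The feedback loop described above can be illustrated with a toy sketch in Python. This is not afl-fuzz's actual algorithm or code: `toy_parser` is a made-up stand-in for the instrumented program, and the mutation strategy (replace one random byte) is far simpler than what afl-fuzz does. It only shows the core idea of keeping mutated inputs that reach new codepaths.

```python
import random


def toy_parser(data):
    """Hypothetical program under test; returns the set of branches it hit."""
    hit = {"start"}
    if data[:1] == b"Q":
        hit.add("magic0")
        if data[1:2] == b"F":
            hit.add("magic1")
            if data[2:3] == b"I":
                hit.add("magic2")
    return hit


def mutate(data, rng):
    """Replace one randomly chosen byte with a random value."""
    buf = bytearray(data)
    buf[rng.randrange(len(buf))] = rng.randrange(256)
    return bytes(buf)


def fuzz(seed_input, rounds, rng):
    """Keep every mutated input that reaches a branch not seen before."""
    corpus = [seed_input]
    seen = set(toy_parser(seed_input))
    for _ in range(rounds):
        candidate = mutate(rng.choice(corpus), rng)
        coverage = toy_parser(candidate)
        if not coverage <= seen:  # at least one new branch was reached
            seen |= coverage
            corpus.append(candidate)
    return corpus, seen


if __name__ == "__main__":
    corpus, seen = fuzz(b"\0\0\0\0", 20000, random.Random(0))
    print(len(corpus), sorted(seen))
```

Because inputs that get past the first magic byte are kept in the corpus, later mutations build on them instead of starting from scratch. That is why no grammar for the file format has to be written by hand.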

In Outreach Program for Women 2014, a qcow2-specific fuzzing tool was developed in Python and several bugs were discovered. This project aims to tackle the other file formats (especially vmdk, vhdx, and parallels).

This project is suitable for candidates interested in software security, software testing, compilers, and disk image file formats.

Links:

Details:

  • Skill level: intermediate
  • Language: C
  • Mentor: Stefan Hajnoczi <stefanha@redhat.com> (stefanha on IRC)
  • Suggested by: Stefan Hajnoczi

qemu-img new subcommand "dd"

Summary: Add "qemu-img dd" subcommand.

dd(1) is a convenient tool for working on binary files, while qemu-img(1) understands many image formats (qcow2, vhdx, vdi, vmdk, etc.) and protocols (nfs, iscsi, gluster, ssh, etc.). Combining them gives us the power of dd on virtual disk images, including the ability to pipe image data through host-side utilities such as grep(1), xxd(1) or xz(1). The idea is to implement a new subcommand in qemu-img, the tool provided by QEMU for manipulating virtual images.

Currently qemu-img has the following subcommands:

Command syntax:
 check [-q] [-f fmt] [--output=ofmt] [-r [leaks | all]] [-T src_cache] filename
 create [-q] [-f fmt] [-o options] filename [size]
 commit [-q] [-f fmt] [-t cache] filename
 compare [-f fmt] [-F fmt] [-T src_cache] [-p] [-q] [-s] filename1 filename2
 convert [-c] [-p] [-q] [-n] [-f fmt] [-t cache] [-T src_cache] [-O output_fmt] [-o options] [-s snapshot_id_or_name] [-l snapshot_param] [-S sparse_size] filename [filename2 [...]] output_filename
 info [-f fmt] [--output=ofmt] [--backing-chain] filename
 map [-f fmt] [--output=ofmt] filename
 snapshot [-q] [-l | -a snapshot | -c snapshot | -d snapshot] filename
 rebase [-q] [-f fmt] [-t cache] [-T src_cache] [-p] [-u] -b backing_file [-F backing_fmt] filename
 resize [-q] filename [+ | -]size
 amend [-q] [-f fmt] [-t cache] -o options filename

You will extend the subcommand set with the new "dd" command, in a syntax that is familiar to *nix "dd" users.

Note that we don't have to mirror the behavior of GNU coreutils' or BSD systems' dd(1), or try to support every operand found there. You will choose and implement a subset of operands (and probably some qemu-img specific ones). It is also your responsibility to write documentation for the new command and options.
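As an illustration of dd-style syntax, the following Python sketch parses a small, hypothetical subset of operands (`if=`, `of=`, `bs=`, `count=`, `skip=`) and computes which byte range of the input they select. The operand set, the defaults and the size suffixes are assumptions made for this example only. The real subcommand would be written in C inside qemu-img, and its operands are for the candidate to decide.

```python
# Hypothetical operand subset and defaults, chosen for illustration only.
SIZE_SUFFIXES = {"": 1, "k": 1024, "M": 1024 ** 2, "G": 1024 ** 3}
DEFAULTS = {"if": None, "of": None, "bs": 512, "count": None, "skip": 0}


def parse_size(text):
    """Parse a number with an optional k/M/G suffix, e.g. '4k' -> 4096."""
    digits = text.rstrip("kMG")
    suffix = text[len(digits):]
    if not digits.isdigit() or suffix not in SIZE_SUFFIXES:
        raise ValueError("invalid size: %r" % text)
    return int(digits) * SIZE_SUFFIXES[suffix]


def parse_operands(argv):
    """Turn ['if=a', 'of=b', 'bs=4k', ...] into a dictionary of options."""
    opts = dict(DEFAULTS)
    for arg in argv:
        name, sep, value = arg.partition("=")
        if not sep or name not in opts:
            raise ValueError("unknown operand: %r" % arg)
        opts[name] = value if name in ("if", "of") else parse_size(value)
    if opts["if"] is None or opts["of"] is None:
        raise ValueError("both if= and of= are required")
    return opts


def byte_range(opts, image_size):
    """Return (offset, length) of the input region the operands select."""
    offset = min(opts["skip"] * opts["bs"], image_size)
    length = image_size - offset
    if opts["count"] is not None:
        length = min(length, opts["count"] * opts["bs"])
    return offset, length
```

With `bs=4k count=2 skip=1` on a 1 MiB input, `byte_range` returns `(4096, 8192)`: one block is skipped and two blocks are copied.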

Links:

Details:

  • Skill level: beginner
  • Language: C
  • Mentor: Fam Zheng <famz@redhat.com> (fam on IRC)

qtest-os: a mini operating system written in Python

Summary: Write a Python library to interact with QEMU's qtest, and then as much as possible of a "mini-OS" written in Python

...

Links:

  • ...

Details:

  • Skill level: medium
  • Language: Python
  • Mentor: Paolo Bonzini <pbonzini@redhat.com> (bonzini on IRC)

Postcopy migration: Recovery from a broken network connection

Summary: Improve the postcopy migration mode so it can cope with a network failure during the migration.

Postcopy migration is a scheme that is good at live migrating large VMs that rapidly change memory, but if the network connection fails during the postcopy phase you're left with an inconsistent VM. I had some ideas on how to fix this by putting both VMs into a paused state and then hunting for the missing pages (see the Links).

Links: https://www.mail-archive.com/qemu-devel@nongnu.org/msg344360.html

Details:

  • Skill level: medium/advanced
  • Language: C
  • Mentor: Dave Gilbert <dgilbert@redhat.com> (davidgiluk on IRC)

Multi-threaded TCG Projects

Summary: Add support for modelling memory ordering between mismatched frontend and backend.

Details:

  • Skill level: advanced
  • Language: C
  • Mentor: Alex Bennee <alex.bennee@linaro.org> (stsquad on IRC)


Event loop profiling tool

Summary: Develop a top(1)-like tool to monitor event loop dispatching

A running QEMU process can have a number of different types of threads. An I/O thread (either the main thread, or a custom iothread for dataplane devices) is a thread that runs a poll-based event loop.

The event loop dispatches I/O events that come from the user interface (e.g. monitor fd), the guest OS (e.g. ioeventfd), or the program's internal sources (e.g. bottom halves or timers). How much host CPU time each of these event sources consumes is often very useful debug/diagnostic information. Ideally the profiling code in QEMU would be in a dedicated thread so it is still usable even when the event loops are stuck.

In this project you will develop a tool for QEMU that is like the top(1) utility for Linux, to monitor QEMU's event loops. As a prerequisite, you need to modify QEMU to expose necessary data that will be collected by the new tool to generate the profiling output.
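The aggregation step of such a tool might look like the Python sketch below. It assumes QEMU has been modified to export one sample per dispatched handler as a `(source, duration_ns)` pair. That sample format and the function names `summarize` and `render` are hypothetical, since defining the exported data is part of the project. A real tool would redraw the table periodically with curses instead of returning a string.

```python
from collections import defaultdict


def summarize(samples, interval_ns):
    """Aggregate (source, duration_ns) samples into rows sorted by busy time.

    Each row is (source, calls, busy_ns, percent of the sampling interval).
    """
    busy = defaultdict(int)
    calls = defaultdict(int)
    for source, duration_ns in samples:
        busy[source] += duration_ns
        calls[source] += 1
    rows = [(name, calls[name], busy[name], 100.0 * busy[name] / interval_ns)
            for name in busy]
    # Busiest source first; ties are broken by name for a stable display.
    rows.sort(key=lambda row: (-row[2], row[0]))
    return rows


def render(rows):
    """Format the rows as a top(1)-like text table."""
    lines = ["%-12s %8s %12s %6s" % ("SOURCE", "CALLS", "BUSY(ns)", "%CPU")]
    for name, count, busy_ns, percent in rows:
        lines.append("%-12s %8d %12d %6.1f" % (name, count, busy_ns, percent))
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up samples covering a 1000 ns interval of one event loop.
    demo = [("timer", 100), ("ioeventfd", 300), ("timer", 50), ("bh", 150)]
    print(render(summarize(demo, 1000)))
```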

You must be familiar with the (n)curses library and multi-threaded programming. You can write the tool in either C or Python.

Links:

Details:

  • Skill level: advanced
  • Language: C, (optional) Python
  • Mentor: Fam Zheng <famz@redhat.com> (fam on IRC)

Information for mentors

Mentors are responsible for keeping in touch with their candidate and assessing the candidate's progress.

The mentor typically gives advice, reviews the candidate's code, and has regular communication with the candidate to ensure progress is being made.

Being a mentor is a significant time commitment; plan for 5 hours per week. Make sure you can make this commitment, because backing out during the summer will affect the candidate's experience.

The mentor chooses their candidate by reviewing candidate application forms, giving out bite-sized tasks so applicants can submit a patch upstream, and conducting IRC interviews with candidates. Depending on the number of candidates, this can be time-consuming in itself. Choosing the right candidate is critical so that both the mentor and the candidate can have a successful experience.