Google Summer of Code 2022: Difference between revisions

From QEMU

Revision as of 09:07, 21 February 2022

Introduction

QEMU is applying to Google Summer of Code 2022. This page contains our ideas list and information for applicants and mentors. Google Summer of Code is an open source internship program offering paid remote work.

Google will announce participating organizations on March 7th. We will not know whether QEMU is participating this year until the announcement.

Application Process

1. Discuss the project idea with the mentor(s)

Read the project ideas list and choose one you are interested in. Read the links in the project idea description and start thinking about how you would approach this. Ask yourself:

  • Do I have the necessary technical skills to complete this project?
  • Will I be able to work independently without the physical presence of my mentor?

If you answer no to these questions, choose another project idea and/or organization that fits your skills.

Once you have identified a suitable project idea, email the mentor(s) your questions about the idea and explain your understanding of the project idea to them to verify that you are on the right track.

2. Fill out the application form

The application form asks for a problem description and an outline of how you intend to implement a solution. You will need to do some background research (looking at source code, browsing relevant specifications, etc.) in order to decide how to tackle the project. The form also asks for an initial project schedule, which you should create by breaking down the project into tasks and estimating how long each will take. The schedule can be adjusted during the summer, so don't worry about getting everything right ahead of time.

3. IRC interview including a coding exercise

You may be invited to an IRC interview. The interview consists of a 30-minute coding exercise, followed by technical discussion and a chance to ask questions you have about the project idea, QEMU, and GSoC. The coding exercise is designed to show fluency in the programming language for your project idea (QEMU projects are typically in C but could also be in Python or Rust).

Here is a C coding exercise we have used in previous years when interviewing applicants: 2014 coding exercise

Try it and see if you can complete it comfortably. We cannot answer questions about the previous coding exercise but hopefully it should be self-explanatory.

If you find the exercise challenging, think about applying to other organizations where you have a stronger technical background and will be more competitive compared with other candidates.

Key Dates

From the timeline

  • March 7 - Organizations and project ideas announced
  • April 4 to 19 - Application period
  • May 20 - Accepted applicants announced
  • June 13 to September 12 - Coding period

Find Us

For general questions about QEMU in GSoC, please contact the following people:

Project Ideas

This is the list of suggested project ideas. Students are free to suggest their own projects; see #How to propose a custom project idea below.

Add zoned device support to QEMU's virtio-blk emulation

Summary:

The goal of this project is to let guests (virtual machines) access zoned storage devices on the host (hypervisor) through a virtio-blk device. This involves extending QEMU's block layer and virtio-blk emulation code.

Zoned devices are a special type of block device (hard-disks or SSDs) that are split into regions called zones. Any sector from any zone can be read in any order (sequentially or randomly) but zones can only be written sequentially and do not accept random writes. The "Links" section below contains more information about zoned devices and how they fit into the software stack.
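
The write rule described above can be illustrated with a small model. This is a minimal sketch, not QEMU or Linux code: the struct zone_model type and the zone_read_ok(), zone_write(), and zone_reset() helpers are hypothetical names invented for illustration. It assumes each zone tracks a write pointer, that a write is accepted only when it starts exactly at that pointer, and that a reset rewinds the pointer so the zone can be rewritten.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one sequential-write zone (not a QEMU or Linux type). */
struct zone_model {
    uint64_t start;     /* first sector of the zone */
    uint64_t capacity;  /* number of writable sectors in the zone */
    uint64_t wp;        /* write pointer: next sector that may be written */
};

/* Reads may target any sector range that lies inside the zone. */
static bool zone_read_ok(const struct zone_model *z, uint64_t sector, uint64_t nr)
{
    return sector >= z->start && nr <= z->capacity &&
           sector - z->start <= z->capacity - nr;
}

/* A write is accepted only at the write pointer, and it advances the pointer. */
static bool zone_write(struct zone_model *z, uint64_t sector, uint64_t nr)
{
    if (sector != z->wp || nr == 0 || nr > z->start + z->capacity - z->wp) {
        return false;   /* random write, empty write, or write past the zone end */
    }
    z->wp += nr;
    assert(z->wp <= z->start + z->capacity);
    return true;
}

/* Resetting the zone rewinds the write pointer to the start of the zone. */
static void zone_reset(struct zone_model *z)
{
    z->wp = z->start;
}
```

In this model a write to any sector other than the write pointer is rejected, which is the behaviour a guest must respect once zoned devices are exposed through virtio-blk.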

QEMU's block layer needs new APIs that call Linux ZBD ioctls when disk images are located on zoned devices. The virtio-blk emulation code then needs to be extended to handle zoned device commands by calling the new block layer APIs to perform zoned device I/O on behalf of the guest. The virtio-blk zoned device command VIRTIO specification is currently being drafted and you will implement it in QEMU.

This project will expose you to device emulation and zoned storage. You will gain experience in systems programming and especially how storage devices work in the context of Linux and QEMU.

The concrete goals are:

  • Add QEMU block layer APIs resembling Linux ZBD ioctls.
  • Extend QEMU virtio-blk emulation to implement zoned device commands using new QEMU block layer zoned storage APIs.
  • Add qemu-iotests test cases covering zoned block devices.

Stretch goals (if there is enough time):

  • Implement zoned storage emulation in QEMU's block/null.c driver so it's easy to run tests without root (needed for Linux null or scsi_debug drivers) or nested guests (needed for QEMU NVMe ZNS).
  • Implement SCSI ZBC support in QEMU's SCSI target to enable zoned devices in QEMU's emulated SCSI HBAs.
  • Implement NVMe ZNS using new QEMU block layer zoned storage APIs (currently it emulates fake zones but doesn't call actual Linux ZBD ioctls).

You do not need to have a physical zoned storage device for this project because there are several ways to simulate zoned devices in software (Linux null_blk, Linux scsi_debug, tcmu-runner, and QEMU NVMe ZNS emulation).

Links:

Details:

  • Project size: 350 hours
  • Difficulty: intermediate
  • Required skills: C programming
  • Mentors: Damien Le Moal <Damien.LeMoal@wdc.com>, Dmitry Fomichev <Dmitry.Fomichev@wdc.com>, Hannes Reinecke <hare@suse.de>, Stefan Hajnoczi <stefanha@redhat.com>

VIRTIO_F_IN_ORDER support for virtio devices

Summary: Implement VIRTIO_F_IN_ORDER in QEMU and Linux (vhost and virtio drivers)

The VIRTIO specification defines a feature bit (VIRTIO_F_IN_ORDER) that devices and drivers can negotiate when the device uses descriptors in the same order in which they were made available by the driver.

This feature can simplify device and driver implementations and increase performance. For example, when VIRTIO_F_IN_ORDER is negotiated, it may be easier to create a batch of buffers and reduce DMA transactions when the device uses a batch of buffers.
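
The batching benefit can be sketched from the driver's point of view. The following is a minimal sketch under stated assumptions, not the Linux or QEMU implementation: struct inorder_queue, inorder_add(), inorder_complete(), and RING_SIZE are hypothetical names, and the sketch assumes the device reports only the last buffer of a batch. Because buffers are used in the order they were made available, the driver can treat every earlier outstanding buffer as complete too.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 8  /* hypothetical queue size for this sketch */

/* Hypothetical driver-side bookkeeping, not a Linux or QEMU data structure. */
struct inorder_queue {
    uint16_t pending[RING_SIZE]; /* buffer ids in the order they were made available */
    size_t head;                 /* index of the oldest outstanding buffer */
    size_t count;                /* number of outstanding buffers */
};

/* Record a buffer id as made available to the device. Returns 0 on success. */
static int inorder_add(struct inorder_queue *q, uint16_t id)
{
    if (q->count == RING_SIZE) {
        return -1;  /* queue full */
    }
    q->pending[(q->head + q->count) % RING_SIZE] = id;
    q->count++;
    return 0;
}

/*
 * The device reported last_id as used. Since buffers are used in order,
 * every buffer made available before last_id is complete as well.
 * Returns the number of buffers completed, or 0 if last_id is not outstanding.
 */
static size_t inorder_complete(struct inorder_queue *q, uint16_t last_id)
{
    assert(q->count <= RING_SIZE);
    for (size_t n = 0; n < q->count; n++) {
        if (q->pending[(q->head + n) % RING_SIZE] == last_id) {
            n++;  /* include the reported buffer itself */
            q->head = (q->head + n) % RING_SIZE;
            q->count -= n;
            return n;
        }
    }
    return 0;
}
```

With three buffers outstanding, a single report for the second one completes two buffers at once, so the device does not need to write one used entry per buffer.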

Currently the devices and drivers available in Linux and QEMU do not support this feature. An implementation is available in DPDK for the virtio-net driver.

Goals:

  • Implement VIRTIO_F_IN_ORDER for a single device/driver in QEMU and Linux (virtio-net or virtio-serial are good starting points).
  • Generalize your approach to the common virtio core code for split and packed virtqueue layouts.
  • If time allows, support for the packed virtqueue layout can be added to Linux vhost, QEMU's libvhost-user, and/or QEMU's virtio qtest code.

Links:

Details:

  • Project size: 350 hours
  • Difficulty: intermediate
  • Required skills: C programming
  • Mentors: Stefano Garzarella <sgarzare@redhat.com>, Eugenio Perez Martin <eperezma@redhat.com>
    • IRC/Matrix nicks: sgarzare, eperezma
  • Suggested by: Jason Wang <jasowang@redhat.com>

Create encrypted storage using VM-based container runtimes

Summary: Extend crun to create encrypted storage by running a libkrun VM

The Linux cryptsetup(8) tool requires root privileges to encrypt storage with LUKS. However, privileged containers are generally discouraged for security reasons. A possible solution to avoid extra privileges is using VM-based container runtimes (e.g. crun with libkrun or kata-containers) and running the storage encryption tool inside the VM.

This internship focusses on a proof-of-concept for integrating and extending the crun container runtime with libkrun in order to create encrypted storage without root privileges. The initial step will focus on creating encrypted images to demonstrate the feasibility and the necessary changes in the software stack. If the timeframe allows it, an interesting follow-up to the first step is the encryption of persistent storage using block-based volumes.

This project will expose you to container runtimes and virtual machines. You must be willing to dig into several codebases such as crun (written in C), libkrun (written in Rust), and possibly podman or other kubernetes/containers projects (written in Go).

Links:

Details:

  • Project size: 350 hours
  • Required skills: C programming
  • Desirable skills: ability to read Go and Rust code, knowledge of containers and virtualization
  • Mentor: Alice Frosi <afrosi@redhat.com>, Co-mentor: Sergio Lopez Pascual <slp@redhat.com>

Improve s390x (IBM Z) emulation with RISU

Summary: Adapt RISU to s390x and fix CPU emulation along the way.

RISU (Random Instruction Sequence generator for Userspace testing) is a tool for testing CPU instructions with randomly generated opcodes. RISU generates random CPU instruction sequences and runs them both on a reference machine and under QEMU. The results are compared between the reference machine and QEMU so that inconsistencies in QEMU's emulation can be detected and fixed.
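
The comparison step can be sketched as follows. This is a minimal illustration and not RISU's actual code or data format: struct cpu_state, NUM_REGS, and first_mismatch() are hypothetical names, and the sketch assumes that the state recorded after each test instruction is just a program counter and a fixed set of general-purpose registers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NUM_REGS 16  /* hypothetical register count for this sketch */

/* Hypothetical CPU state captured after each test instruction. */
struct cpu_state {
    uint64_t pc;
    uint64_t regs[NUM_REGS];
};

/*
 * Compare the trace from the reference machine with the trace from QEMU.
 * Returns the index of the first step whose state differs, or -1 if all
 * steps match.
 */
static long first_mismatch(const struct cpu_state *reference,
                           const struct cpu_state *emulated, size_t steps)
{
    assert(reference != NULL && emulated != NULL);
    for (size_t i = 0; i < steps; i++) {
        if (reference[i].pc != emulated[i].pc ||
            memcmp(reference[i].regs, emulated[i].regs,
                   sizeof(reference[i].regs)) != 0) {
            return (long)i;
        }
    }
    return -1;
}
```

The index of the first mismatch points at the instruction whose emulation should be investigated.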

The goal of this project is to adapt the RISU framework to the IBM Z CPU architecture (a.k.a. s390x) so that it can be used to test QEMU's s390x emulation for correctness. This is likely to expose instruction emulation deficiencies in QEMU, which should also be addressed during this internship.

Goals / tasks include:

  • Get familiar with the RISU framework (i.e. study the code, run it on other architectures like x86)
  • Get familiar with s390x instructions (i.e. study the "z/Architecture Principles of Operation" document)
  • Adapt the RISU framework for s390x
  • Get familiar with the TCG emulation framework of QEMU (see the target/s390x/ folder in the QEMU sources)
  • Fix at least one problem discovered by running RISU on s390x and get the patch accepted in the QEMU project

Links:

Details:

  • Project size: 350 hours
  • Difficulty: intermediate
  • Required skills: C and Perl programming, good basic understanding of assembly (CPU instructions) but not necessarily s390x
  • Mentor: Thomas Huth <thuth@redhat.com> (th_huth on IRC)

Implement a snapshot fuzzing device

Summary: Add a new emulated device for rapid guest-initiated snapshot/restore functionality for fuzzing.

Fuzz testing runs a program with random inputs to find bugs that lead to crashes or other program failures. Fuzz testing is a popular technique for finding security bugs.

Many recent fuzzing projects rely on snapshot/restore functionality [1,2,3,4,5]. For example, tests and fuzzers that target large programs such as OS kernels and browsers benefit from full-VM snapshots where approaches such as manual state cleanup and fork servers are insufficient. Many of the existing solutions are based on QEMU, but there is currently no upstream solution. Furthermore, other hypervisors such as Xen have already incorporated support for snapshot fuzzing.

In this project, you will implement a virtual device for snapshot fuzzing, following a spec agreed upon by the community. The device will implement standard fuzzing APIs that allow fuzzing with engines such as libFuzzer and AFL++. The simple APIs exposed by the device will allow fuzzer developers to build custom harnesses in the VM to request snapshots, restore memory, device, and register state, request new inputs, and report coverage.
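
The snapshot/restore loop at the heart of this approach can be sketched in a few lines. This is a toy model under stated assumptions and not the proposed device interface, whose specification is still to be agreed: struct guest_state, snapshot_take(), snapshot_restore(), and fuzz_one() are hypothetical names, and the "guest" is reduced to a small memory buffer plus a program counter that are copied wholesale.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define GUEST_MEM_SIZE 64  /* tiny stand-in for guest RAM */

/* Hypothetical guest state; a real device would cover RAM, device, and CPU state. */
struct guest_state {
    uint8_t mem[GUEST_MEM_SIZE];
    uint64_t pc;
};

/* Save a full copy of the live state. */
static void snapshot_take(struct guest_state *snap, const struct guest_state *live)
{
    assert(snap != NULL && live != NULL);
    memcpy(snap, live, sizeof(*snap));
}

/* Discard every change made since the snapshot was taken. */
static void snapshot_restore(struct guest_state *live, const struct guest_state *snap)
{
    assert(snap != NULL && live != NULL);
    memcpy(live, snap, sizeof(*live));
}

/* Run one fuzz input against the state, then rewind so the next run starts clean. */
static void fuzz_one(struct guest_state *live, const struct guest_state *snap,
                     const uint8_t *input, size_t len)
{
    if (len > GUEST_MEM_SIZE) {
        len = GUEST_MEM_SIZE;
    }
    memcpy(live->mem, input, len);  /* stand-in for executing the target */
    live->pc += len;
    snapshot_restore(live, snap);
}
```

Each input therefore runs against identical starting state, which is what makes crashes reproducible without manual cleanup code in the harness.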

Project goals include:

  • Adding a new emulated device for snapshot fuzzing into QEMU.
  • Writing documentation and final editing of the hardware interface specification so fuzzer developers can learn how to take advantage of the device from inside a guest.

Links:

  1. https://arxiv.org/pdf/2111.03013.pdf
  2. https://blog.mozilla.org/attack-and-defense/2021/01/27/effectively-fuzzing-the-ipc-layer-in-firefox/
  3. https://www.usenix.org/system/files/sec20-song.pdf
  4. https://github.com/intel/kernel-fuzzer-for-xen-project
  5. https://github.com/quarkslab/rewind

Details:

  • Project size: 350 hours
  • Difficulty: intermediate
  • Required skills: C programming
  • Desirable skills: previous experience with fuzzing and/or device driver development
  • Topic/Skill Areas: Fuzzing, OS/Systems/Drivers
  • Mentor: Alexander Bulekov <alxndr@bu.edu> (a1xndr on IRC)

Coverage-guided disk image fuzzing

Summary: Implement a coverage-guided fuzzer for disk image file formats

Fuzz testing runs a program with random inputs to find bugs that lead to crashes or other program failures. Fuzz testing is a popular technique for finding security bugs.

QEMU has a qcow2 fuzzer (see tests/image-fuzzer). However, this fuzzer is not coverage-guided, is limited to qcow2 images, and does not run on OSS-Fuzz. Therefore the existing fuzzer does not provide a lot of code coverage and a modern coverage-guided fuzzer integrated into OSS-Fuzz is desirable.

Disk image files sometimes come from an untrusted source and this makes QEMU's disk image format code an attack surface. One example is the qemu-img utility that can convert between disk image formats and may be used to import untrusted disk images during virtual machine creation. As such, it is important to fuzz this code effectively.

Your task will be to create a coverage-guided fuzzer for image formats supported by QEMU. Beyond basic image-parsing code (qemu-img info), the fuzzer should be able to find bugs in image-conversion code (qemu-img convert). Combined with a corpus of disk image files, the coverage-guided fuzzer will be able to explore code paths without much built-in knowledge of the disk image file layout.
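
To give a feel for what a coverage-guided harness looks like, here is a minimal sketch using the libFuzzer entry point LLVMFuzzerTestOneInput. It is not QEMU code: toy_probe_qcow2() and be32() are hypothetical helpers standing in for the real image-format code, and the parser checks only the qcow2 magic bytes and version field. A real harness would instead hand the input to QEMU's block layer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a big-endian 32-bit value; the caller guarantees 4 readable bytes. */
static uint32_t be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Toy parser standing in for real image-format code (hypothetical).
 * Returns the image version, or -1 if the header is not accepted.
 */
static int toy_probe_qcow2(const uint8_t *data, size_t size)
{
    assert(data != NULL || size == 0);
    if (size < 8) {
        return -1;                      /* too short to hold magic and version */
    }
    if (be32(data) != 0x514649fbu) {    /* magic bytes 'Q' 'F' 'I' 0xfb */
        return -1;
    }
    uint32_t version = be32(data + 4);
    if (version != 2 && version != 3) {
        return -1;
    }
    return (int)version;
}

/* libFuzzer entry point: called once for every generated input. */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
{
    (void)toy_probe_qcow2(data, size);
    return 0;
}
```

Built with clang's -fsanitize=fuzzer option, the fuzzer mutates inputs and keeps those that reach new branches, so a seed corpus of valid images helps it get past the magic and version checks quickly.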

Project goals include:

  • Implement a fuzzer capable of exploring qemu-img convert and block/qcow2-*.c code.
  • Retarget the fuzzer to VMDK (block/vmdk.c) and VHDX (block/vhdx*.c) image files.
  • Add the new fuzzer to OSS-Fuzz
  • Stretch goal: Support DMG (block/dmg.c), Parallels (block/parallels.c), VDI (block/vdi.c), and VPC (block/vpc.c)

Links:

Details:

  • Project size: 175 hours
  • Difficulty: intermediate
  • Required skills: C programming
  • Topic/Skill Areas: Fuzzing, libFuzzer/AFL
  • Mentor: Alexander Bulekov <alxndr@bu.edu> (a1xndr on IRC)

How to add a project idea

  1. Create a new wiki page under "Internships/ProjectIdeas/YourIdea" and follow #Project idea template.
  2. Add a link from this page like this: {{:Internships/ProjectIdeas/YourIdea}}

Example idea from a previous year: Internships/ProjectIdeas/I2CPassthrough

Project idea template

=== TITLE ===
 
 '''Summary:''' Short description of the project
 
 Detailed description of the project.
 
 '''Links:'''
 * Wiki links to relevant material
 * External links to mailing lists or web sites
 
 '''Details:'''
 * Skill level: beginner or intermediate or advanced
 * Language: C
 * Mentor: Email address and IRC nick
 * Suggested by: Person who suggested the idea

How to propose a custom project idea

Applicants are welcome to propose their own project ideas. The process is as follows:

  1. Email your project idea to qemu-devel@nongnu.org. CC Stefan Hajnoczi <stefanha@gmail.com> and regular QEMU contributors who you think might be interested in mentoring.
  2. If a mentor is willing to take on the project idea, work with them to fill out the "Project idea template" above and email Stefan Hajnoczi <stefanha@gmail.com>.
  3. Stefan will add the project idea to the wiki.

Note that other candidates can apply for newly added project ideas. This ensures that custom project ideas are fair and open.

How to get familiar with our software

See what people are developing and talking about on the mailing lists:

Grab the source code or browse it:

Build QEMU and run it: QEMU on Linux Hosts

Links

Information for mentors

Mentors are responsible for keeping in touch with their intern and assessing progress. GSoC has evaluations where both the mentor and intern assess each other.

The mentor typically gives advice, reviews the intern's code, and has regular communication with the intern to ensure progress is being made.

Being a mentor is a significant time commitment; plan for 5 hours per week. Make sure you can make this commitment because backing out during the summer will affect the intern's experience.

The mentor chooses their intern by reviewing application forms and conducting IRC interviews with applicants. Depending on the number of candidates, this can be time-consuming in itself. Choosing the right intern is critical so that both the mentor and the intern can have a successful experience.