Google Summer of Code 2024

From QEMU

Revision as of 18:37, 6 February 2024

Introduction

QEMU is applying for Google Summer of Code 2024. This page contains our ideas list and information for applicants and mentors. Google Summer of Code is an open source internship program offering paid remote work.

Status: Google will publish the list of accepted GSoC organizations at 18:00 UTC on February 21st. Applicants may get in touch with mentors before that date, but please don't invest too much time before accepted organizations are announced.

Application Process

1. Discuss the project idea with the mentor(s)

Read the project ideas list and choose one you are interested in. Read the links in the project idea description and start thinking about how you would approach this. Ask yourself:

  • Do I have the necessary technical skills to complete this project?
  • Will I be able to work independently without the physical presence of my mentor?

If you answer no to either of these questions, choose another project idea and/or organization that better fits your skills.

Once you have identified a suitable project idea, email the mentor(s) your questions about the idea and explain your understanding of the project idea to them to verify that you are on the right track.

2. Submit your proposal

Upload your proposal PDF file to the Google Summer of Code website and notify your mentor(s) so they can give you feedback. You can make changes and upload the PDF again until the application deadline. Your proposal must include the following:

  • Project idea (title)
  • Your name and email address
  • Outline of your solution
    • Do some background research by looking at source code, browsing relevant specifications, etc., in order to decide how to tackle the project. Discuss any questions with your mentor. This section will explain how your solution will work.
  • Project schedule
    • Create a week-by-week schedule of the coding period. Break down the project into tasks and estimate how many weeks they will take. The schedule can be adjusted during the summer, so don't worry about getting everything right ahead of time.
  • Relevant experience (programming language knowledge, hobby projects, etc)
  • Are you available to work with no other commitments (jobs, university, vacation, etc) for the duration of your project? If not, please give details about the working hours and dates.

3. Contribution task

Once you have submitted your proposal PDF, let your mentor know and request a contribution task. The task will be a real bug or small feature that should not take more than 1 or 2 days to complete. This will allow you to demonstrate your skills in a realistic setting. Your mentor will provide you with the details and help you with any questions.

Key Dates

From the timeline:

  • February 21 18:00 UTC - Organizations and project ideas announced
  • March 18 - April 2 18:00 UTC - Application period
  • April 21 - Contribution task deadline
  • May 1 18:00 UTC - Accepted applicants announced
  • May 27 - August 26 - Standard coding period (an extended timeline is possible depending on your project)

Find Us

For general questions about QEMU in GSoC, please contact the following people:

Project Ideas

This is the listing of suggested project ideas. Students are free to suggest their own projects; see #How to propose a custom project idea below.

RISC-V Vector TCG Frontend Optimization

Summary: Improve QEMU's performance on RISC-V vector instructions.

The RISC-V vector extension has been implemented in QEMU, but we have some performance pathologies mapping it to existing TCG backends. This project aims to improve the performance of the RISC-V vector ISA's mappings to QEMU's TCG just-in-time compiler.

The RISC-V TCG frontend (i.e., decoding RISC-V instructions and emitting TCG calls to emulate them) has some inefficient mappings to TCG. As a result, binaries that use vector instructions frequently perform worse than those that do not, sometimes by as much as 10x. This causes various headaches for users, including when running toolchain regressions and doing distro work. This project's aim would be to bring the performance of vectorized RISC-V code to a similar level as the corresponding scalar code.

This will definitely require changing the RISC-V TCG frontend. It's likely there is some remaining optimization work that can be done without adding TCG primitives, but it may be necessary to do some core TCG work in order to improve performance sufficiently.

Internship tasks:

TODO

Links:

Details

  • Project size: 350 hours
  • Skill level: intermediate
  • Language: C, RISC-V assembly
  • Mentors: Palmer Dabbelt <palmer@dabbelt.com>

GStreamer Backend for vhost-device-sound

Summary: Implement a GStreamer audio backend in rust-vmm's vhost-device-sound crate.

Project Description:

virtio-sound device emulation has recently been developed in the Rust vhost-device-sound crate. The crate currently contains audio backends for the ALSA and PipeWire sound APIs. The aim of this project is to build a new GStreamer audio backend.

Audio backends are written by implementing the AudioBackend trait. Refer to alsa.rs and pipewire.rs for examples of existing backends. The Stream and Buffer structs are used to transfer audio samples between the virtio-sound device and the sound API (e.g. GStreamer).

The backend should be implemented using the GStreamer Rust bindings. Mono and stereo playback and capture should be supported. The GStreamer pipelines for playback and capture will be hardcoded and only Linux needs to be supported.

Application Phase Tasks:

  • Familiarize yourself with the vhost-device-sound crate's AudioBackend trait and Stream and Buffer structs.
  • Familiarize yourself with the GStreamer Rust bindings.

Internship Tasks:

  • Write a skeleton for the GStreamer audio backend in the vhost-device-sound crate. Reviewing how the ALSA and PipeWire backends work is a helpful guide.
  • Implement playback functionality in the GStreamer audio backend for vhost-device-sound.
  • Implement capture functionality in the GStreamer audio backend for vhost-device-sound.
  • Implement automated tests with cargo test.
  • Test the implementation with QEMU, which can act as a vhost-user frontend.
  • As a stretch goal, contribute to the rust-vmm/vhost-device repo by fixing issues.

Links:

Details:

  • Project size: 180 hours
  • Skill level: intermediate
  • Language: Rust
  • Mentors: Dorinda Bassey <dbassey@redhat.com>, Matias Ezequiel Vara Larsen <mvaralar@redhat.com>
  • Suggested by: Dorinda Bassey, Matias Ezequiel Vara Larsen

Add packed virtqueue to Shadow Virtqueue

Summary: Add the packed virtqueue format support to QEMU's Shadow Virtqueue.

To live migrate a guest with a passthrough device, QEMU needs a way to know which memory the device modifies, so that it can migrate that memory again every time it is modified. Otherwise the migrated guest would resume with outdated memory contents after live migration.

This is especially hard with passthrough hardware devices, as transports like PCI impose a few security and performance challenges. As a method to overcome this for VIRTIO devices, QEMU can offer an emulated virtqueue to the device, called a Shadow Virtqueue (SVQ), instead of allowing the device to communicate directly with the guest. SVQ will then forward the writes to the guest, being the effective writer in the guest memory and knowing when a portion of it needs to be migrated again.
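
The dirty-tracking idea behind SVQ can be illustrated with a small, self-contained sketch. It is not QEMU code: guest_ram, dirty_bitmap, svq_write_to_guest() and collect_dirty_pages() are hypothetical names, and the page and RAM sizes are assumptions chosen for the example. The point is only that when one component performs every write on the device's behalf, it can record which pages need to be migrated again.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed values for this sketch; not taken from QEMU. */
#define SKETCH_PAGE_SIZE 4096u
#define SKETCH_PAGES     16u

static uint8_t guest_ram[SKETCH_PAGES * SKETCH_PAGE_SIZE];
static bool dirty_bitmap[SKETCH_PAGES];

/*
 * Hypothetical helper: copy device-produced data into guest memory and
 * mark every page touched by the write as dirty.
 */
static void svq_write_to_guest(size_t offset, const void *data, size_t len)
{
    assert(len > 0 && offset + len <= sizeof(guest_ram));

    memcpy(&guest_ram[offset], data, len);

    size_t first = offset / SKETCH_PAGE_SIZE;
    size_t last = (offset + len - 1) / SKETCH_PAGE_SIZE;
    for (size_t page = first; page <= last; page++) {
        dirty_bitmap[page] = true;
    }
}

/* Count the pages that would have to be sent again, then clear the marks. */
static size_t collect_dirty_pages(void)
{
    size_t count = 0;
    for (size_t page = 0; page < SKETCH_PAGES; page++) {
        if (dirty_bitmap[page]) {
            count++;
            dirty_bitmap[page] = false;
        }
    }
    return count;
}
```

A write that straddles a page boundary marks both pages, so a migration loop built on collect_dirty_pages() would resend both of them.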

Compared with the original Split Virtqueue layout already supported by Shadow Virtqueues, the Packed Virtqueue layout is a more compact representation that uses less memory and allows both devices and drivers to exchange the same amount of information with fewer memory operations.

The task is to complete the packed virtqueue support for SVQ, using the kernel VIRTIO ring driver as a reference. There is already a setup that can be used to test the changes.

Internship tasks:

  • Build the scenarios from the hands-on blogs as a development environment.
  • Understand (at a very high level) the virtqueue handling code, using the virtqueues blogs, the code from QEMU hw/virtio/virtio.c and the kernel drivers/virtio/virtio_ring.c.
  • Develop the basic code of the packed virtqueue in vhost-shadow-virtqueue.c, ignoring features like indirect.
  • Add event_idx code.
  • If there is bandwidth, add the corresponding device code to kernel's drivers/vhost/vringh, following the code of QEMU's device at hw/virtio/virtio.c.

Links:

Details:

  • Project size: 180 hours
  • Skill level: intermediate
  • Language: C
  • Mentors: Eugenio Perez Martin <eperezma@redhat.com>, Stefano Garzarella <sgarzare@redhat.com>

vhost-user memory isolation

Summary: Add a new mode for vhost-user devices that does not expose guest RAM as shared memory.

vhost-user enables VIRTIO devices to be implemented as separate processes outside of QEMU. This allows device emulation code to be written in any programming language, sharing of device emulation code with other emulators besides QEMU, and complex device implementations that would not fit well into the QEMU process. vhost-user achieves good performance by directly accessing guest RAM through shared memory. Exposing guest RAM is not always desirable for security reasons and is sometimes not possible due to lack of host platform support. This project will add an alternative mode for vhost-user devices where guest RAM is not exposed.

Today, QEMU configures the guest in such a way that the vhost-user device is directly notified when I/O requests are ready for processing by the device. Similarly, when the vhost-user device completes I/O requests, it directly notifies the guest. The vhost-user device has full access to guest RAM via shared memory in order to transfer data buffers while processing I/O requests. This project will add a mode where QEMU intercepts I/O requests, copies data buffers between guest RAM and a vhost-user isolated memory area that the vhost-user device can access, and then forwards the notifications between the guest and the vhost-user device.

This approach of intercepting I/O requests is already being used in certain live migration scenarios and is called Shadow Virtqueue. The project will involve reusing the Shadow Virtqueue implementation and integrating it into the vhost-user code. It is important that existing vhost-user devices work with memory isolation and no vhost-user protocol changes are required.
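
The interception flow described above can be sketched in a few lines of plain C. This is only a model under stated assumptions: struct isolation_ctx, forward_request() and forward_completion() are hypothetical names, the buffer sizes are arbitrary, and the two counters stand in for the kick and call notifications that real code would forward through eventfds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed sizes for this sketch. */
#define GUEST_RAM_SIZE 4096u
#define ISOLATED_SIZE  1024u

struct isolation_ctx {
    uint8_t guest_ram[GUEST_RAM_SIZE]; /* never shared with the device */
    uint8_t isolated[ISOLATED_SIZE];   /* the only memory the device sees */
    unsigned kicks_forwarded;          /* stands in for the kick notification */
    unsigned calls_forwarded;          /* stands in for the call notification */
};

/* Guest -> device: copy the request out of guest RAM, then notify the device. */
static void forward_request(struct isolation_ctx *c, size_t guest_off,
                            size_t iso_off, size_t len)
{
    assert(guest_off + len <= GUEST_RAM_SIZE);
    assert(iso_off + len <= ISOLATED_SIZE);

    memcpy(&c->isolated[iso_off], &c->guest_ram[guest_off], len);
    c->kicks_forwarded++;
}

/* Device -> guest: copy the result back into guest RAM, then notify the guest. */
static void forward_completion(struct isolation_ctx *c, size_t guest_off,
                               size_t iso_off, size_t len)
{
    assert(guest_off + len <= GUEST_RAM_SIZE);
    assert(iso_off + len <= ISOLATED_SIZE);

    memcpy(&c->guest_ram[guest_off], &c->isolated[iso_off], len);
    c->calls_forwarded++;
}
```

The extra copy in each direction is the cost of isolation; in exchange, the device process only ever sees the contents of the isolated area.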

You will gain experience with QEMU internals, VIRTIO, and vhost-user.

Internship tasks:

  • Add a bool "memory-isolation" qdev property to QEMU's vhost-user devices.
  • Modify hw/virtio/vhost-user.c to intercept and forward the vhost-user callfd and kickfd eventfds when memory isolation is enabled.
  • Manage an area of memory where I/O requests will be copied.
  • Integrate the existing Shadow Virtqueue (SVQ) code into hw/virtio/vhost-user.c so that vhost-user devices see the SVQ instead of the guest's virtqueue.
  • Extend tests/qtest/vhost-user-test.c to run with memory-isolation=on, proving that the feature works.

Links:

Details:

  • Project size: 350 hours
  • Skill level: intermediate
  • Language: C
  • Mentors: Stefano Garzarella <sgarzare@redhat.com>, Stefan Hajnoczi <stefanha@redhat.com>, Eugenio Perez Martin <eperezma@redhat.com>, Hanna Czenczek <hreitz@redhat.com>

Implement -M nitro-enclave in QEMU

Summary: AWS EC2 provides the ability to create an isolated sibling VM context from within a VM. This project implements the machine model and input data format parsing needed to run these sibling VMs stand alone in QEMU.

Nitro Enclaves are the first widely adopted implementation of hypervisor-assisted compute isolation. Similar to technologies like Intel SGX, they allow spawning a separate context that is inaccessible to the parent operating system. This is implemented by "giving up" resources of the parent VM (CPU cores, memory) to the hypervisor, which then spawns a second VMM to execute a completely separate virtual machine. That new VM has only a vsock communication channel to the parent and a built-in lightweight Trusted Platform Module called NSM.

One big challenge with Nitro Enclaves is that due to its roots in security, there are very few debugging / introspection capabilities. That makes OS bringup, debugging and bootstrapping very difficult. Having a local development and test environment that looks like an Enclave, but is 100% controlled by the developer and introspectable would make life a lot easier for everyone working on them. It also may pave the way to see Nitro Enclaves adopted in VM environments outside of EC2.

This project will consist of adding a new machine model to QEMU that mimics a Nitro Enclave environment, including NSM and the vsock communication channel, and building firmware that loads the special "EIF" file format (which contains kernel, initramfs and metadata) from a -kernel image.

If the student finishes early, we can then proceed to implement the Nitro Enclaves parent driver in QEMU as well to create a full QEMU-only Nitro Enclaves environment.

Tasks:

  • Implement a device model for the NSM device (link to spec and driver code below)
  • Implement a new machine model (-M nitro-enclave)
  • Implement firmware for the new machine model that implements EIF parsing
  • Add tests for the NSM device
  • Add integration test for the machine model executing an actual EIF payload

Links:

Details:

  • Project size: 350 hours
  • Skill level: intermediate - advanced (some understanding of QEMU machine modeling would be good)
  • Language: C
  • Mentor: Alexander Graf (OFTC: agraf, Email: graf@amazon.com)

Binary tracing of TCG

Summary: Right now, most logging for the TCG accelerator can only be produced on stderr: this includes input and output assembly, unoptimized and optimized TCG opcodes, and exceptions/interrupts. Text output is easy to interpret but it is more expensive to produce and harder to filter.

This project will consist of integrating three new kinds of "trace events" into the "simple" trace backend: target assembly (as used by -d in_asm), host assembly (-d out_asm), TCG opcodes (-d op and op_opt). To do so, a few ancillary tasks are required:

  • Support for the various kinds of "-d" output is currently done with code such as
   if (qemu_loglevel_mask(CPU_LOG_TB_IN_ASM) ... {
       FILE *logfile = qemu_log_trylock();
       if (logfile) {
           fprintf(logfile, "----------------\n");
           ops->disas_log(db, cpu, logfile);
           fprintf(logfile, "\n");
           qemu_log_unlock(logfile);
       }
   }
The code within the "if" statement has to be replaced with a function call that calls into the trace backends. While the current code applies to the "log" backend, different logic has to be used for the "simple" backend.
  • A new formatter for simpletrace output files is needed. Instead of using Python, the new formatter will be written in C or Rust in order to use the capstone disassembler. It will be placed in the contrib/ directory and, if Rust is used, it will be built with cargo.
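
To make the idea of a binary dump for assembly fragments more concrete, here is a minimal sketch of one possible record encoding. The layout (8-byte PC, 4-byte length, then the raw instruction bytes, all little-endian) and the names encode_asm_record() and decode_asm_record() are assumptions made for illustration. This is not the existing simpletrace file format, and the real design is left to the project.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record header: 8-byte PC followed by 4-byte payload length. */
#define ASM_RECORD_HEADER 12u

/* Serialize one fragment; returns bytes written, or 0 if it does not fit. */
static size_t encode_asm_record(uint8_t *out, size_t cap, uint64_t pc,
                                const uint8_t *code, uint32_t len)
{
    assert(out != NULL && (code != NULL || len == 0));

    if (cap < ASM_RECORD_HEADER || len > cap - ASM_RECORD_HEADER) {
        return 0;
    }
    for (int i = 0; i < 8; i++) {
        out[i] = (uint8_t)(pc >> (8 * i));
    }
    for (int i = 0; i < 4; i++) {
        out[8 + i] = (uint8_t)(len >> (8 * i));
    }
    if (len > 0) {
        memcpy(out + ASM_RECORD_HEADER, code, len);
    }
    return ASM_RECORD_HEADER + len;
}

/* Parse one record; returns a pointer to the payload, or NULL if truncated. */
static const uint8_t *decode_asm_record(const uint8_t *in, size_t size,
                                        uint64_t *pc, uint32_t *len)
{
    assert(in != NULL && pc != NULL && len != NULL);

    if (size < ASM_RECORD_HEADER) {
        return NULL;
    }
    uint64_t p = 0;
    uint32_t l = 0;
    for (int i = 0; i < 8; i++) {
        p |= (uint64_t)in[i] << (8 * i);
    }
    for (int i = 0; i < 4; i++) {
        l |= (uint32_t)in[8 + i] << (8 * i);
    }
    if (l > size - ASM_RECORD_HEADER) {
        return NULL;
    }
    *pc = p;
    *len = l;
    return in + ASM_RECORD_HEADER;
}
```

Storing raw bytes instead of disassembled text moves the cost of disassembly out of the emulator and into the offline formatter, which can then run the decoded payload through a disassembler such as capstone.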

Tasks:

  • Implement a basic equivalent of scripts/simpletrace.py in C or Rust
  • Implement a binary dump format for assembly fragments, so that "-d in_asm" and "-d out_asm" can be used by both the "log" backend and the "simple" backend
  • Implement a binary dump format for TCG opcodes
  • Implement variable buffer size for

Details:

  • Project size: 350 hours
  • Skill level: intermediate (some understanding of QEMU machine modeling would be good)
  • Language: C
  • Mentor: Paolo Bonzini (OFTC: bonzini, Email: pbonzini@redhat.com)

How to add a project idea

  1. Create a new wiki page under "Internships/ProjectIdeas/YourIdea" and follow #Project idea template.
  2. Add a link from this page like this: {{:Internships/ProjectIdeas/YourIdea}}


Project idea template

=== TITLE ===
 
 '''Summary:''' Short description of the project
 
 Detailed description of the project.
 
 '''Links:'''
 * Wiki links to relevant material
 * External links to mailing lists or web sites
 
 '''Details:'''
 * Skill level: beginner or intermediate or advanced
 * Language: C
 * Mentor: Email address and IRC nick
 * Suggested by: Person who suggested the idea

How to propose a custom project idea

Applicants are welcome to propose their own project ideas. The process is as follows:

  1. Email your project idea to qemu-devel@nongnu.org. CC Stefan Hajnoczi <stefanha@gmail.com> and regular QEMU contributors who you think might be interested in mentoring.
  2. If a mentor is willing to take on the project idea, work with them to fill out the "Project idea template" above and email Stefan Hajnoczi <stefanha@gmail.com>.
  3. Stefan will add the project idea to the wiki.

Note that other candidates can apply for newly added project ideas. This ensures that custom project ideas are fair and open.

How to get familiar with our software

See what people are developing and talking about on the mailing lists:

Grab the source code or browse it:

Build QEMU and run it: QEMU on Linux Hosts

Links

Information for mentors

Mentors are responsible for keeping in touch with their intern and assessing progress. GSoC has evaluations where both the mentor and intern assess each other.

The mentor typically gives advice, reviews the intern's code, and has regular communication with the intern to ensure progress is being made.

Being a mentor is a significant time commitment; plan for 5 hours per week. Make sure you can make this commitment because backing out during the summer will affect the intern's experience.

The mentor chooses their intern by reviewing application forms and conducting IRC interviews with applicants. Depending on the number of candidates, this can be time-consuming in itself. Choosing the right intern is critical so that both the mentor and the intern can have a successful experience.