Google Summer of Code 2021

From QEMU
Revision as of 17:40, 18 February 2021 by Stefanha

Introduction

QEMU is applying to Google Summer of Code 2021. This page contains our ideas list and information for students and mentors. Google Summer of Code is an open source internship program for university students offering 10-week, paid remote work (175 hours) from June to August.

Applicants: You are welcome to think about project ideas and familiarize yourself with QEMU, but please don't invest too much time at this early stage. Google will announce participating organizations on March 9th.

Application Process

1. Discuss the project idea with the mentor(s)

Read the project ideas list and choose one you are interested in. Read the links in the project idea description and start thinking about how you would approach this. Ask yourself:

  • Do I have the necessary technical skills to complete this project in 10 weeks?
  • Will I be able to work independently without the physical presence of my mentor?

If you answer no to these questions, choose another project idea and/or organization that fits your abilities better.

Once you have identified a suitable project idea, email the mentor(s) your questions about the idea and explain your understanding of the project idea to them to verify that you are on track.

2. Fill out the application form

The application form asks for a problem description and an outline of how you intend to implement a solution. You will need to do some background research (looking at source code, browsing relevant specifications, etc.) in order to form an idea of how to tackle the project. The form asks for an initial project schedule, which you should create by breaking down the project into tasks and estimating how long they will take. The schedule can be adjusted during the summer, so don't worry about getting everything right ahead of time.

3. IRC interview including a coding exercise

You may be invited to an IRC interview. The interview consists of a 30-minute coding exercise, followed by technical discussion and a chance to ask questions you have about the project idea, QEMU, and GSoC. The coding exercise is designed to show fluency in the programming language for your project idea (QEMU projects are typically in C but could also be in Python or Rust).

Here is a C coding exercise we have used in previous years when interviewing students: 2014 coding exercise

Try it and see if you can complete it comfortably. We cannot answer questions about the previous coding exercise but hopefully it should be self-explanatory.

If you find the exercise challenging, think about applying to other organizations where you have a stronger technical background and will be more competitive compared with other candidates.

Key Dates

From the timeline

  • March 9 - Organizations and project ideas announced
  • March 29 to April 13 - Student application period
  • May 17 - Accepted students announced
  • June 7 to August 16 - Coding period

Find Us

  • IRC (GSoC specific): #qemu-gsoc on irc.oftc.net
  • IRC (development):
    • QEMU: #qemu on irc.oftc.net
    • KVM: #kvm on chat.freenode.net

For general questions about QEMU in GSoC, please contact the following people:

Project Ideas

This is the list of suggested project ideas. Students are free to suggest their own projects; see #How to propose a custom project idea below.

TCG Plugin Cache Modelling

Summary: Implement a simple cache modelling plugin for QEMU's TCG plugins.

QEMU's TCG emulation has traditionally avoided doing complex modelling of the processor in favor of running fast. However, with the recent introduction of TCG plugins, we can put some simple cache modelling into a plugin which can be optionally loaded when we want to examine how a program works. With such a plugin we could identify areas of code in either a linux-user program or a whole system that may not be cache optimal. The aim would be to write a plugin that allows you to simply model different icache/dcache configurations rather than actually simulate the micro-architecture of a CPU.
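To give a feel for the kind of model meant here, the following is a minimal standalone sketch in Python of a set-associative cache with LRU replacement that counts hits and misses for a list of addresses. It does not use the TCG plugin API; the class name, the parameters and the address trace are made up for illustration, and a real plugin would feed addresses from its instruction and memory callbacks instead.

```python
from collections import OrderedDict


class SetAssociativeCache:
    """Toy set-associative cache with LRU replacement (illustrative only)."""

    def __init__(self, size_bytes, line_bytes, ways):
        self.line_bytes = line_bytes
        self.ways = ways
        self.num_sets = size_bytes // (line_bytes * ways)
        # One LRU-ordered dict of tags per set.
        self.sets = [OrderedDict() for _ in range(self.num_sets)]
        self.hits = 0
        self.misses = 0

    def access(self, addr):
        """Record one access; return True on a hit, False on a miss."""
        line = addr // self.line_bytes
        index = line % self.num_sets
        tag = line // self.num_sets
        entries = self.sets[index]
        if tag in entries:
            entries.move_to_end(tag)       # mark as most recently used
            self.hits += 1
            return True
        if len(entries) >= self.ways:
            entries.popitem(last=False)    # evict the least recently used tag
        entries[tag] = True
        self.misses += 1
        return False


# Hypothetical data-address trace; 1 KiB cache, 64-byte lines, 2 ways (8 sets).
dcache = SetAssociativeCache(size_bytes=1024, line_bytes=64, ways=2)
trace = [0x0, 0x8, 0x40, 0x0, 0x400, 0x800, 0x0]
for addr in trace:
    dcache.access(addr)
print(dcache.hits, dcache.misses)  # 2 5
```

Changing size_bytes, line_bytes or ways and replaying the same trace is the sort of experiment the plugin is meant to make easy.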

Links:

Details:

  • Skill level: intermediate with a good understanding of processor instruction and data caches
  • Language: C, Python
  • Mentor: Alex Bennée (alex.bennee@linaro.org)
  • Suggested by: Alex Bennée

TCG Code Coverage Plugin

Summary: Implement a code coverage plugin for QEMU's TCG plugins.

Generating code coverage of your programs usually involves instrumenting them through your build process. Often this is painful if you also want to work out how much coverage of library functions you are getting. As QEMU's basic block boundaries are naturally at branch points, you could do all of this pretty easily with a TCG plugin. The plugin could then output information that could be ingested by a script to calculate what is needed for tools like gcovr.
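As a rough illustration of the script side, here is a small Python sketch that ingests a list of executed blocks and computes how much of an address range was covered. The input format ("start address in hex, size in bytes", one block per line) is an assumption made up for this example, not an existing plugin output format, and the result is byte coverage only; mapping addresses back to source lines for gcovr would additionally need debug information.

```python
def parse_blocks(lines):
    """Parse 'start_hex size_decimal' records (a hypothetical plugin format)."""
    blocks = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        start, size = line.split()
        blocks.append((int(start, 16), int(size)))
    return blocks


def merge_ranges(blocks):
    """Merge overlapping or adjacent blocks into sorted [start, end) ranges."""
    merged = []
    for start, size in sorted(blocks):
        end = start + size
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(s, e) for s, e in merged]


def coverage(ranges, func_start, func_end):
    """Fraction of the bytes in [func_start, func_end) that were executed."""
    covered = 0
    for start, end in ranges:
        lo, hi = max(start, func_start), min(end, func_end)
        if lo < hi:
            covered += hi - lo
    return covered / (func_end - func_start)


# Hypothetical log: two overlapping blocks and one separate block.
log = ["# start size", "0x1000 16", "0x1008 16", "0x1040 32"]
ranges = merge_ranges(parse_blocks(log))
ratio = coverage(ranges, 0x1000, 0x1080)
print(ranges, ratio)
```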

Links:

Details:

  • Skill level: intermediate with a reasonable understanding of execution flow
  • Language: C, Python
  • Mentor: Alex Bennée (alex.bennee@linaro.org)
  • Suggested by: Alex Bennée

Complete AMD virtualization emulation

Summary: Fix bugs and add extra features for QEMU's emulation of AMD virtualization instructions.

QEMU already includes a basic implementation of the virtualization extensions that are found in AMD processors. The project includes:

  • fixing bugs in the implementations, using the kvm-unit-tests suite
  • adding support for new features such as vGIF or vVMLOAD/vVMSAVE

Links:

Details:

  • Skill level: intermediate, or beginner with a good understanding of the x86 architecture
  • Language: C
  • Mentor: Paolo Bonzini (pbonzini@redhat.com)
  • Suggested by: Paolo Bonzini

MIPS support to RISU

Summary: Push decodetree improvements back to RISU and add support for MIPS architecture.

  • RISU and decodetree

RISU (Random Instruction Sequence generator for Userspace testing) is a tool intended to assist in testing the implementation of models of architectures such as QEMU and Valgrind. In particular it restricts itself to considering the parts of the architecture visible from Linux userspace, so it can be used to test programs which only implement userspace, like Valgrind and QEMU's linux-user mode. RISU generators are written in Perl.

RISU inspired the decodetree specification describing instruction patterns. QEMU uses a script written in Python to generate instruction decoders for some (or all) instruction set architectures (ARM, AVR, HPPA, Microblaze, MIPS, OpenRISC, RISC-V, RX).

The decodetree field extraction logic is more nuanced than RISU. There could be a fair amount of benefit to pushing decodetree improvements back to RISU.
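To make the link between the two tools concrete, here is a simplified Python sketch of the shared idea: an instruction pattern is reduced to a pair of fixed bits and a mask, which can be used both to decode (does this word match?) and, RISU-style, to generate random words that match. It only understands '0', '1' and '.' characters; real decodetree patterns also carry named fields, formats and argument sets, and the function names and the 16-bit pattern below are invented for the example.

```python
import random


def parse_pattern(pattern):
    """Turn a pattern such as '1010 .... 0011 ....' into (fixedbits, fixedmask, width)."""
    bits = pattern.replace(" ", "")
    fixedbits = fixedmask = 0
    for ch in bits:
        fixedbits <<= 1
        fixedmask <<= 1
        if ch in "01":
            fixedmask |= 1
            fixedbits |= int(ch)
        elif ch != ".":
            raise ValueError("unexpected character %r" % ch)
    return fixedbits, fixedmask, len(bits)


def matches(insn, fixedbits, fixedmask):
    """Decoder side: True if the fixed bits of insn agree with the pattern."""
    return (insn & fixedmask) == fixedbits


def random_insn(fixedbits, fixedmask, width, rng):
    """Generator side: random values in the don't-care bits, fixed bits kept."""
    free = rng.getrandbits(width) & ~fixedmask
    return fixedbits | free


# Made-up 16-bit pattern, not taken from any real .decode file.
fixedbits, fixedmask, width = parse_pattern("1010 .... 0011 ....")
rng = random.Random(0)
samples = [random_insn(fixedbits, fixedmask, width, rng) for _ in range(5)]
print(hex(fixedbits), hex(fixedmask), [hex(s) for s in samples])
```

Constraints on field values and addressing modes, mentioned in the roadmap below, are exactly the parts this sketch leaves out.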

  • MIPS

The MIPS architecture has a long trajectory. Some old CPUs are still regularly emulated in QEMU (R4000, VR5432), but very recent models are also added (I7200 with nanoMIPS, Loongson-3A4000). It would be beneficial for the emulation community to run RISU both on the dying MIPS hardware and on the trendy new hardware that is not widely available.

  • Possible Roadmap
  1. Fill gaps in the decodetree format to express the same logic as the .risu format, or generate output in this format. Suggestion: t16.decode -> thumb.risu.
    1. constraints (range of valid values for a field), address mode (used to calculate offsets for ld/st ops)
  2. Write risugen.py based on decodetree.py
  3. Write risugen_mips.py
  4. Write test_mips.s to run on user-land
  5. Test MIPS MSA/SIMD/LoongsonMMI with RISU
  6. Convert QEMU MIPSr6 to decodetree/RISU
  • Possible follow up

If the student is motivated, it is possible to investigate how to test privileged instructions out of user-land, eventually using a JTAG probe (or gdbstub?).

Links:

  • Peter Maydell's RISU repository
  • KVM Forum 2014 presentation by Alex Bennée
  • Decodetree Specification
  • Decodetree script

Details:

  • Skill level: advanced
  • Language: C, Python, Perl
  • Mentor: Philippe Mathieu-Daudé <f4bug@amsat.org> ("f4bug" on IRC)
  • Special requirements: Having MIPS hardware able to run Linux could be helpful, otherwise (slow) remote access will be provided.

Interactive, asynchronous QEMU Machine Protocol (QMP) text user interface (TUI)

Summary: Write an interactive terminal program for issuing and receiving QEMU Machine Protocol (QMP) commands from a running QEMU instance.

QMP is a JSON message-based protocol that serves as the primary method by which QEMU is controlled and managed by other applications. It is designed to be easy to send and parse from a variety of frameworks and languages, but it is not easy to type by hand. We have an existing Python tool called 'qmp-shell' that some developers use, but this tool has several fairly severe shortcomings.
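For readers who have not seen QMP on the wire, the sketch below builds a command message and sorts incoming messages by kind using only Python's json module. The helper names are invented for this example and are not part of the existing QMP libraries; the sample messages are abbreviated, and real replies and events carry more fields.

```python
import json


def make_command(name, arguments=None, cmd_id=None):
    """Serialize a QMP command; 'arguments' and 'id' are optional members."""
    msg = {"execute": name}
    if arguments:
        msg["arguments"] = arguments
    if cmd_id is not None:
        msg["id"] = cmd_id
    return json.dumps(msg)


def classify(line):
    """Return 'greeting', 'return', 'error', 'event' or 'unknown' for one message."""
    msg = json.loads(line)
    if "QMP" in msg:
        return "greeting"
    for kind in ("return", "error", "event"):
        if kind in msg:
            return kind
    return "unknown"


command = make_command("query-status", cmd_id=1)
print(command)

# Abbreviated sample messages, written by hand for illustration.
samples = [
    '{"QMP": {"version": {}, "capabilities": []}}',
    '{"return": {"status": "running", "running": true}, "id": 1}',
    '{"event": "STOP", "timestamp": {"seconds": 1, "microseconds": 0}}',
    '{"error": {"class": "GenericError", "desc": "example failure"}, "id": 2}',
]
kinds = [classify(s) for s in samples]
print(kinds)
```

Even this short command shows why typing QMP by hand gets tedious, and the event message illustrates input that can arrive at any time, independent of what the user is typing.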

qmp-shell is not asynchronous, so it cannot display responses from the server in realtime. Updating the tool to handle asynchronous input will require a fundamental rewrite of the tool to accommodate simultaneous writing of new commands by the human user while new input is received asynchronously from the server. If you are familiar with the console IRC chat program irssi, we are looking to create an interface that is similar. The program would have a message history that updates in realtime (like chat history in irssi), and a text editing bar to type new commands (like the text entry field in a chat room).

We have an existing synchronous QMP library, and an asyncio prototype has been developed to replace it. The focus of this project will be to use and polish that asyncio QMP library and write the actual TUI and interactive elements of the program itself.
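The core difficulty, handling user input and server messages at the same time, can be sketched with plain asyncio and no UI at all. In the example below a fake server and a scripted "user" stand in for the real QMP connection and the text entry bar; all names are invented for illustration and none of them come from the asyncio QMP prototype.

```python
import asyncio


async def server_events(incoming, history):
    """Append each message from the server to the history as it arrives."""
    while True:
        msg = await incoming.get()
        if msg is None:            # sentinel: connection closed
            break
        history.append("<- " + msg)


async def user_commands(commands, history, delay):
    """Stand-in for the text entry bar: 'types' commands with pauses."""
    for cmd in commands:
        await asyncio.sleep(delay)
        history.append("-> " + cmd)


async def fake_server(incoming, messages, delay):
    """Stand-in for QEMU: emits messages on its own schedule."""
    for msg in messages:
        await asyncio.sleep(delay)
        await incoming.put(msg)
    await incoming.put(None)


async def session():
    history = []
    incoming = asyncio.Queue()
    # All three coroutines make progress concurrently on one thread.
    await asyncio.gather(
        server_events(incoming, history),
        user_commands(["query-status", "stop"], history, 0.05),
        fake_server(incoming, ["EVENT A", "EVENT B", "EVENT C"], 0.02),
    )
    return history


# Written without asyncio.run() so that it also works on Python 3.6.
loop = asyncio.new_event_loop()
history = loop.run_until_complete(session())
loop.close()
print("\n".join(history))
```

The shared history list plays the role of the scrolling message log; in the real program a TUI library would redraw it whenever either side appends a line.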


Details:

  • Skill level: Intermediate
  • Language: Python 3.6
  • Topic/Skill areas:
    • asyncio: We will be using Python's asyncio library. Experience with this library isn't required, but familiarity with async programming concepts will help: any of coroutines, cooperative scheduling, user threads, etc. If you've ever written a Discord.py bot, you've already used the asyncio library!
    • gradual typing: We will be using gradually typed Python 3.6, using mypy to statically validate those types. If you have not used types in Python before, it is not hard to learn as you go, and the mypy getting started guide is very approachable.
    • UI programming: We will likely be using urwid, a text console UI library for Python. If you have another toolkit/framework you are skilled with, we can likely use that instead. Some knowledge of UI programming concepts (generally class-based, using widgets and signals) will help you along.
    • Interactive console programs: Some knowledge of text-based interactive programs will help you know what good ideas to copy (or bad ones to avoid). If you've ever used links/lynx, irssi, emacs, vim, nano/pico, or even just bash, you'll have a good sense of TUI design basics.
  • Mentor: John Snow <jsnow@redhat.com>
    • Pronouns: Any of the following; None (use 'jsnow'), he/him, or they/them at your preference. I don't use an honorific (no Mr., Mx., etc).
    • IRC nick: jsnow (OFTC). I am usually reachable here between 11AM EST and 7PM EST, Monday-Friday.
    • About: jsnow is the QEMU maintainer for various Python utilities, libraries and scripts used for testing and debugging in the QEMU codebase, has two cats, and really likes Pokemon.


Links:

  • irssi: a good example of a text user interface with a live history/log and a textbox for inputting commands
  • mitmproxy: A great example of a project that uses the urwid library to create a very effective TUI.
  • urwid: Python TUI library used to implement mitmproxy's interface.
  • asyncio: Python asynchronous library.
  • aioconsole: An async Python REPL for interactively writing async code in Python. It might have good ideas to steal.
  • urwid readline library: Implements readline-like hotkeys for urwid, which may be useful for writing a text entry box.
  • Asynchronous QMP library: Work in progress; this is a prototype for a QMP library written for Python 3.6 using asyncio and mypy type hints.
  • Synchronous QMP library: This is the existing QMP library used for various testing and debug utilities upstream in QEMU today. It is also strictly typed with mypy.
  • qmp-shell: This is the existing interactive utility, and what this project aims to replace.


Recommended Research:

  • Try using irssi to connect to irc.oftc.net and join the #qemu-gsoc channel. Say hello!
  • Try installing mitmproxy and following along the mitmproxy tutorial. This will give you a good idea of the type of interface that inspired this project -- you only need to follow along until "27. You now know basics of mitmproxy's UI and how to control it."
  • Read the mypy getting started guide for learning how gradual typing works in Python if you aren't already familiar.

Style checker for Meson

Summary: Write a style checker for QEMU's Meson-based build system

QEMU is a complex program with a complex build system. The switch to Meson made it possible to access a pre-parsed representation of the build process. We would like to style-check Meson files for occurrences of possible issues:

  • dependencies searched with a method other than "pkg-config" or "system"
  • dependencies lacking "kwargs: static_kwargs"
  • static libraries lacking "build_by_default: false"
  • variables not defined on all paths (Meson accepts undefined variables on the RHS of short-circuiting boolean operators)
  • always-true or always-false conditions

The Meson language is not Turing complete and does not have functions, hence the Meson files have a very simple control-flow graph; complicated dataflow analysis techniques are not necessary. However it is useful to know the basics of what is a CFG and how dataflow analysis works.
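As an example of how little machinery the "variables not defined on all paths" check needs, here is a Python sketch of a forward "definitely defined" analysis over a toy statement tree. The tuple-based representation is made up for this example and is not Meson's real parser output; loops (foreach) and short-circuiting operators are left out.

```python
def report(uses, defined, warnings):
    """Warn about every used name that is not defined on all paths so far."""
    for name in uses:
        if name not in defined:
            warnings.append("'%s' may be undefined" % name)


def check_block(stmts, defined, warnings):
    """Return the set of variables definitely defined after running stmts."""
    defined = set(defined)
    for stmt in stmts:
        kind = stmt[0]
        if kind == "assign":
            _, name, uses = stmt
            report(uses, defined, warnings)
            defined.add(name)
        elif kind == "if":
            _, cond_uses, then_block, else_block = stmt
            report(cond_uses, defined, warnings)
            after_then = check_block(then_block, defined, warnings)
            after_else = check_block(else_block, defined, warnings)
            # Only variables set on both paths are safe after the join point.
            defined = after_then & after_else
        else:
            raise ValueError("unknown statement kind: %r" % (kind,))
    return defined


# Toy equivalent of:
#   have_x = ...
#   if have_x
#     libs = ...
#   endif
#   exe = f(libs)
program = [
    ("assign", "have_x", []),
    ("if", ["have_x"], [("assign", "libs", [])], []),
    ("assign", "exe", ["libs"]),
]
warnings = []
defined_at_end = check_block(program, set(), warnings)
print(warnings)  # ["'libs' may be undefined"]
```

The intersection at the join point is the whole dataflow analysis here; because the control-flow graph has no back edges in this toy form, a single pass is enough.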

Details:

  • Skill level: Intermediate
  • Language: Python 3.6
  • Topic/Skill areas: compilation techniques, Meson build system
  • Mentor: Paolo Bonzini <pbonzini@redhat.com>
    • IRC nick: pbonzini (OFTC). I am usually reachable here between 10AM CET and 6PM CET, Monday-Friday.

vhost-user-scsi device server in Rust

Summary: Implement a vhost-user-scsi device server in Rust

The vhost-user protocol allows driving a virtio device from a separate process. This is better from a security perspective, as it allows implementing a better privilege separation policy, but it also makes it possible to implement the device personality using a different programming language (QEMU is mainly written in C) and to experiment with optimization features that would be hard to deploy inside QEMU itself.

There are already a number of crates in the rust-vmm umbrella project that provide a framework for writing vhost-user device servers in Rust, and some device server implementations such as vhost-user-blk, vhost-user-net and vhost-user-fs.

This project consists of writing a vhost-user-scsi device server in Rust that is able to serve multiple virtual LUNs, each one backed by a different disk image or iSCSI target. If time allows, benchmarks should also be provided comparing its performance against the integrated virtio-scsi implementation.

Links:

Details:

  • Skill level: Intermediate
  • Language: Rust
  • Mentor: Sergio Lopez <slp@redhat.com>

vhost-user-vsock application

Summary: Develop a vhost-user-vsock application in Rust and integrate it with Kata Containers

Kata Containers provides a secure container runtime using lightweight virtual machines that feel and perform like traditional containers. Kata Containers leverages KVM and supports multiple Virtual Machine Monitors, including QEMU. It uses virtio-vsock to create a communication channel between the runtime, running in the host, and the agent running in the guest.

Kata Containers focuses on security, so moving the device emulation into an external user space process is very attractive in order to reduce the attack surface.

This project aims to realize an application (i.e. vhost-user-vsock) that will leverage the vhost-user protocol to emulate the virtio-vsock device in an external process. It will provide the hybrid VSOCK interface over AF_UNIX introduced by Firecracker.
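As we understand Firecracker's hybrid vsock design, a host-side client connects to the AF_UNIX socket, sends a text line "CONNECT <port>" naming the guest vsock port, and receives "OK <host port>" once the connection is established. The Python sketch below only illustrates that handshake, with a socket pair standing in for the real Unix socket and a hand-written reply standing in for the device; the helper names are invented, and the exact wire format should be checked against the Firecracker vsock documentation.

```python
import socket


def connect_request(port):
    """Text command a host-side client sends after connecting to the Unix socket."""
    if not 0 <= port < 2 ** 32:
        raise ValueError("vsock ports are 32-bit unsigned integers")
    return ("CONNECT %d\n" % port).encode("ascii")


def parse_ack(line):
    """Return the host-side port from an 'OK <port>' acknowledgement line."""
    fields = line.decode("ascii").split()
    if len(fields) != 2 or fields[0] != "OK":
        raise ValueError("unexpected reply: %r" % (line,))
    return int(fields[1])


# A socket pair stands in for the AF_UNIX socket exposed by the device backend.
client, backend = socket.socketpair()
client.sendall(connect_request(1234))
request = backend.recv(64)                # what the backend would see
backend.sendall(b"OK 1073741824\n")       # hand-written example reply
host_port = parse_ack(client.recv(64))
client.close()
backend.close()
print(request, host_port)
```

After the acknowledgement, the same stream would carry the payload of the vsock connection; that part, and connections initiated from the guest, are not shown.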

The QEMU part has already been implemented and tested with a proof of concept based on Cloud Hypervisor crates, which can be used as a starting point for this project.

The new application should be written in Rust, reusing the crates available in rust-vmm as much as possible. rust-vmm is an umbrella project that provides a set of virtualization components that can be easily reused to speed up the implementation.

If time allows, we could integrate vhost-user-vsock into Kata Containers.

Possible roadmap:

  • vhost-user-vsock application (Rust)
    • Getting familiar with vsock and tools (ncat, tcpdump, wireshark)
    • Learning vhost-user protocol
    • Trying QEMU with the vhost-user-vsock PoC
    • Rust application development based on vhost-user-vsock PoC
      • Replace Cloud Hypervisor crates with rust-vmm crates (e.g. vhost)
      • Try to move other crates to rust-vmm (e.g. virtio-vsock)
      • Cleanups and tests
      • Publish vhost-user-vsock in the rust-vmm umbrella project
  • Kata Containers integration (Go)
    • Getting familiar with kata-containers and its environment
      • Deploying and using Kata Containers on minikube
      • Being able to modify the content of the projects and run the modified binaries on minikube
    • Runtime side work:
      • start the application daemon (similar to virtio-fs)
      • ensure it's receiving the correct SELinux labels (container_kvm_t label similar to virtio-fs)
    • GOVMM side:
      • add support to "vhost-user-vsock"
    • There may be some work needed on the agent related to this integration, but we hope everything will be transparent on that layer.

Links:

Details:

  • Skill level: intermediate
  • Language: Rust / Go
  • Mentors: Stefano Garzarella <sgarzare@redhat.com>, Fabiano Fidêncio <fidencio@redhat.com>
    • IRC nick: sgarzare (OFTC/freenode), fidencio (OFTC/freenode)

Mocking framework for Virtio Queues

Summary: Implement a mocking framework for virtio queues

Paravirtualized devices (such as those defined by the Virtio standard) are used to provide high performance device emulation. Virtio drivers from a guest VM communicate with the device model using an efficient mechanism based on queues stored in a shared memory area that operate based on a protocol and message format defined by the standard. Various implementations of devices and other virtualization building blocks require mocking the contents that a driver would place into a Virtio queue for validation, testing, and evaluation purposes.

This project aims to lay the foundations of a reusable framework for mocking the driver side of Virtio queue operation, that can be consumed by rust-vmm crates and other projects. At the basic level, this means providing a flexible and easy to use interface for users to set up the underlying memory areas and populate contents (as the driver would do) for the basic split queue format in a generic manner. This can further be extended for the packed format and with device-specific mocking capabilities.
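To show what "populating contents as the driver would do" involves for the split queue format, here is a Python sketch that writes a two-buffer descriptor chain into a byte array standing in for guest memory. The 16-byte descriptor layout (little-endian 64-bit address, 32-bit length, 16-bit flags, 16-bit next index) and the flag values follow the Virtio specification; the helper names, addresses and sizes are made up, and the actual framework would be written in Rust and would also cover the available and used rings.

```python
import struct

VIRTQ_DESC_F_NEXT = 1     # chain continues in the descriptor named by `next`
VIRTQ_DESC_F_WRITE = 2    # buffer is written by the device, read by the driver
DESC_FORMAT = "<QIHH"     # le64 addr, le32 len, le16 flags, le16 next
DESC_SIZE = struct.calcsize(DESC_FORMAT)   # 16 bytes


def write_chain(table, first_index, buffers):
    """Write a descriptor chain into table (a bytearray).

    buffers is a list of (guest_addr, length, device_writable) tuples.
    Descriptors are placed at consecutive indices starting at first_index.
    Returns the head index, which a driver would publish in the available ring.
    """
    for i, (addr, length, writable) in enumerate(buffers):
        index = first_index + i
        last = i == len(buffers) - 1
        flags = VIRTQ_DESC_F_WRITE if writable else 0
        if not last:
            flags |= VIRTQ_DESC_F_NEXT
        next_index = 0 if last else index + 1
        struct.pack_into(DESC_FORMAT, table, index * DESC_SIZE,
                         addr, length, flags, next_index)
    return first_index


def read_desc(table, index):
    """Decode one descriptor as (addr, len, flags, next)."""
    return struct.unpack_from(DESC_FORMAT, table, index * DESC_SIZE)


# Made-up request: a 16-byte header the device reads, then a 512-byte
# buffer the device writes into.
queue_size = 4
table = bytearray(queue_size * DESC_SIZE)
head = write_chain(table, 0, [(0x1000, 16, False), (0x2000, 512, True)])
print(read_desc(table, 0), read_desc(table, 1))
```

A reusable framework would hide these offsets and flags behind a typed interface so that tests can describe chains without repeating the layout.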

Links:

Issue in rust-vmm about reusing the mocking logic: rust-vmm/vm-virtio: https://github.com/rust-vmm/vm-virtio

Details:

  • Skill level: intermediate
  • Language: Rust
  • Mentors: aagch@amazon.com, fandree@amazon.com
  • Suggested by: aagch@amazon.com

Local running rust-vmm-ci

Summary: Run the rust-vmm-ci locally

The rust-vmm-ci provides automation for uniformly running the tests on all rust-vmm repositories. It is built on top of Buildkite, and only allows running the tests in the Buildkite context. To run the same tests as in the CI locally, users need to manually copy the Buildkite pipeline steps.

The scope of this project is to make it possible for the same tests to easily run locally. This project makes it easier to contribute to all rust-vmm repositories.

In order for that to be possible, the following steps are required:

  • The Buildkite pipeline is autogenerated from code instead of being a static list of tests to run. This also allows us to uniformly use the same container version for running all the tests (instead of manually modifying each step in the pipeline).
  • The code for autogenerating the Buildkite pipeline is reused for generating a Python script which can be run locally.
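One possible shape for this, sketched in Python under several assumptions: the test list, the container image name and the Docker plugin version string below are placeholders rather than the real rust-vmm-ci configuration, and the pipeline is emitted in JSON form. The point is only that both outputs are derived from one list of test definitions.

```python
import json

# Hypothetical test definitions; the real rust-vmm-ci test list may differ.
TESTS = [
    {"name": "build", "command": "cargo build --release"},
    {"name": "unittests", "command": "cargo test --all-features"},
    {"name": "style", "command": "cargo fmt --all -- --check"},
]
CONTAINER = "example/dev-image:v1"   # placeholder image name


def buildkite_pipeline(tests, container):
    """Render the tests as a Buildkite pipeline in JSON form."""
    steps = [
        {
            "label": test["name"],
            "command": test["command"],
            # Plugin name and version are assumptions for this sketch.
            "plugins": [{"docker#v3.8.0": {"image": container}}],
        }
        for test in tests
    ]
    return json.dumps({"steps": steps}, indent=2)


def local_script(tests, container):
    """Render the same tests as a standalone Python script.

    Mounting the source tree into the container is omitted for brevity.
    """
    lines = ["import subprocess", "import sys", ""]
    for test in tests:
        argv = ["docker", "run", "--rm", container, "sh", "-c", test["command"]]
        lines.append("print(%r)" % ("running " + test["name"]))
        lines.append("if subprocess.call(%r) != 0:" % (argv,))
        lines.append("    sys.exit(1)")
    return "\n".join(lines) + "\n"


pipeline = buildkite_pipeline(TESTS, CONTAINER)
script = local_script(TESTS, CONTAINER)
print(pipeline)
print(script)
```

Because the container image appears in exactly one place, bumping its version updates every pipeline step and the local script together.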


Links:

Details:

  • Skill level: intermediate
  • Language: Python
  • Mentor: fandree@amazon.com, aagch@amazon.com
  • Suggested by: fandree@amazon.com

How to add a project idea

  1. Create a new wiki page under "Internships/ProjectIdeas/YourIdea" and follow #Project idea template.
  2. Add a link from this page like this: {{:Internships/ProjectIdeas/YourIdea}}

Example idea from a previous year: Internships/ProjectIdeas/I2CPassthrough

Project idea template

=== TITLE ===
 
 '''Summary:''' Short description of the project
 
 Detailed description of the project.
 
 '''Links:'''
 * Wiki links to relevant material
 * External links to mailing lists or web sites
 
 '''Details:'''
 * Skill level: beginner or intermediate or advanced
 * Language: C
 * Mentor: Email address and IRC nick
 * Suggested by: Person who suggested the idea

How to propose a custom project idea

Applicants are welcome to propose their own project ideas. The process is as follows:

  1. Email your project idea to qemu-devel@nongnu.org. CC Stefan Hajnoczi <stefanha@gmail.com> and regular QEMU contributors who you think might be interested in mentoring.
  2. If a mentor is willing to take on the project idea, work with them to fill out the "Project idea template" above and email Stefan Hajnoczi <stefanha@gmail.com>.
  3. Stefan will add the project idea to the wiki.

Note that other candidates can apply for newly added project ideas. This ensures that custom project ideas are fair and open.

How to get familiar with our software

See what people are developing and talking about on the mailing lists:

Grab the source code or browse it:

Build QEMU and run it: QEMU on Linux Hosts

Links

Information for mentors

Mentors are responsible for keeping in touch with their student and assessing the student's progress. GSoC has a mid-term evaluation and a final evaluation where both the mentor and student assess each other.

The mentor typically gives advice, reviews the student's code, and has regular communication with the student to ensure progress is being made.

Being a mentor is a significant time commitment; plan for 5 hours per week. Make sure you can make this commitment because backing out during the summer will affect the student's experience.

The mentor chooses their student by reviewing student application forms and conducting IRC interviews with candidates. Depending on the number of candidates, this can be time-consuming in itself. Choosing the right student is critical so that both the mentor and the student can have a successful experience.