Google Summer of Code 2017

From QEMU

Introduction

QEMU is applying for Google Summer of Code 2017. This page contains our ideas list and information for students and mentors. Google Summer of Code is a program that pays students to work full-time and remotely on open source projects for 12 weeks, from May to August!

Students: Google has not published the list of accepted organizations yet. You are welcome to contribute to QEMU and to contact mentors about project ideas you are interested in, but to avoid disappointment please keep in mind that QEMU may not be accepted into GSoC.

Application Process

After contacting the mentor to discuss the project idea, you should fill out the application form at [1]. The form asks for a problem description and an outline of how you intend to implement a solution. You will need to do some background research (reading source code, browsing relevant specifications, etc.) in order to form an idea of how to tackle the project. The form also asks for an initial 12-week project schedule, which you should create by breaking the project down into tasks and estimating how long each will take. The schedule can be adjusted during the summer, so don't worry about getting everything right ahead of time.

Candidates may be invited to an IRC interview with the mentor. The interview consists of a 30 minute C coding exercise, followed by technical discussion and a chance to ask questions you have about the project idea, QEMU, and GSoC. The coding exercise is designed to show fluency in C programming.

Here is a C coding exercise we have used in previous years when interviewing students: 2014 coding exercise

Try it and see if you are comfortable enough writing C. We cannot answer questions about the previous coding exercise but hopefully it should be self-explanatory.

If you find the exercise challenging, consider applying to other organizations where you have a stronger technical background and will be more competitive with other candidates.

Find Us

  • IRC (GSoC specific): #qemu-gsoc on irc.oftc.net
  • IRC (development):
    • QEMU: #qemu on irc.oftc.net
    • libvirt: #virt on irc.oftc.net
    • KVM: #kvm on chat.freenode.net

Please contact the mentor for the project idea you are interested in. IRC is usually the quickest way to get an answer.

For general questions about QEMU in GSoC, please contact the following people:

Project Ideas

This is the listing of suggested project ideas. Students are free to suggest their own projects, see #How to propose a custom project idea below.

QEMU audio backend

Summary: Rework QEMU audio backend

The audio backend facilitates audio playback and capture using host audio APIs (CoreAudio, PulseAudio, DirectSound, etc). It is used by emulated soundcards and may need to convert between the audio format supported by the emulated soundcard and the format supported by the physical soundcard. This area of the codebase has been stable for a long time but is now due some significant improvements.

The goal of this summer project is to improve the audio/ backend. The preliminary task is to rebase and merge some or all of the GSoC "audio 5.1 patches 00/51" series, which modernizes the audio backend codebase.

Then, add a generic GStreamer audio backend. GStreamer is an open source multimedia framework that is cross-platform and already supports a lot of the functionality that is implemented in QEMU's audio backend.

Finally, try to replace as much of audio/ as possible with custom GStreamer pipelines. This would be a major simplification, significantly reducing code size and making QEMU's audio backend smaller and easier to maintain.
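To illustrate the sort of work the backend does today, here is a minimal sketch of sample format conversion (signed 16-bit PCM to floating point). This is illustrative Python, not QEMU's actual C converters in audio/:

```python
import struct

def s16le_to_f32(raw):
    """Convert interleaved signed 16-bit little-endian PCM samples
    to floating-point samples in the range [-1.0, 1.0)."""
    count = len(raw) // 2
    ints = struct.unpack("<%dh" % count, raw)
    # Dividing by 32768.0 maps the full s16 range onto [-1.0, 1.0)
    return [s / 32768.0 for s in ints]

# Two samples: maximum positive value, then silence
samples = s16le_to_f32(struct.pack("<2h", 32767, 0))
```

With a framework like GStreamer, conversions of this kind (plus resampling and channel mapping) come for free from existing pipeline elements.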

Links:

Details:

  • Skill level: intermediate or advanced
  • Language: C
  • Mentors: marcandre.lureau@redhat.com, kraxel@redhat.com
  • Contact: past gsoc student "Kővágó Zoltán" <dirty.ice.hu@gmail.com>
  • Suggested by: marcandre.lureau@redhat.com

Disk Backup Tool

Summary: Write a tool that performs both full and incremental disk backups

QEMU has added command primitives that can be combined to perform both full and incremental disk backups while the virtual machine is running. A full backup copies the entire contents of the disk. An incremental backup copies only regions that have been modified. Orchestrating a multi-disk backup to local or remote storage is non-trivial and there is no example code showing how to do it from start to finish.

It would be helpful to have a "reference implementation" that performs backups using QEMU's QMP commands. Backup software and management stack developers wishing to add QEMU backup support could look at this tool's code as an example. Users who run QEMU directly could use this tool as their backup software.

You need to be able to read C since that's what most of QEMU is written in. This project will expose you to backup and data recovery, as well as developing command-line tools in Python.

See the links to familiarize yourself with disk image files, backups, and snapshots.

Links:

Details:

  • Skill level: intermediate
  • Language: Python
  • Mentors: John Snow <jsnow@redhat.com> (jsnow on IRC), Stefan Hajnoczi <stefanha@redhat.com> (stefanha on IRC)

Moving I/O throttling and write notifiers into block filter drivers

Summary: Refactor the block layer so that I/O throttling and write notifiers are implemented as block filter drivers instead of being hardcoded into the core code

QEMU's block layer handles I/O to disk image files and now supports flexible configuration through a "BlockDriverState graph". Block drivers can be inserted or removed from the graph to modify how I/O requests are processed.

Block drivers implement read and write functions (among other things). Typically they access a file or network storage but some block drivers perform other jobs like data encryption. These block drivers are called "filter" drivers because they process I/O requests but ultimately forward requests to the file format and protocol drivers in the leaf nodes of the graph.

I/O throttling (rate-limiting the guest's disk I/O) and write notifiers (used to implement backup) are currently hardcoded into the block layer's core code. The goal of this project is to extract this functionality into filter drivers that are inserted into the graph only when a feature is needed. This makes the block layer more modular and reuses the block driver abstraction that is already present.

This project will expose you to QEMU's block layer. It requires refactoring existing code for which there is already some test coverage to aid you.
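QEMU's I/O throttling is based on a token-bucket scheme: conceptually, the filter driver decides how long to delay each request before forwarding it down the graph. A simplified Python sketch of the idea (not QEMU's actual implementation):

```python
class TokenBucket:
    """Simplified token bucket: refills at `rate` tokens/sec up to `burst`."""
    def __init__(self, rate, burst):
        self.rate = float(rate)
        self.burst = float(burst)
        self.tokens = float(burst)
        self.last = 0.0

    def delay(self, cost, now):
        """Seconds a request costing `cost` tokens must wait at time `now`."""
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        self.tokens -= cost
        if self.tokens >= 0:
            return 0.0                       # within budget: forward immediately
        return -self.tokens / self.rate      # queue until the bucket refills

bucket = TokenBucket(rate=100, burst=100)    # e.g. 100 I/O operations per second
```

As a filter driver, this logic would sit in its own BlockDriverState node, inserted into the graph only when throttling is configured.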

Links:

Details:

  • Skill level: intermediate
  • Language: C
  • Mentors: Kevin Wolf <kwolf@redhat.com> (kwolf on IRC), Stefan Hajnoczi <stefanha@redhat.com> (stefanha on IRC), Alberto Garcia (berto on IRC)

PCI Express to PCI bridge

Summary: Code an emulated PCIe-to-PCI bridge for QEMU PCI Express machines

Modern virtual machines and their devices use PCI Express, but a means of supporting existing PCI and PCI-X deployments is still required. Some use cases need legacy PCI or PCI-X devices that were designed for platforms offering only conventional PCI and PCI-X system slots.

QEMU already has a solution, the i82801b11 DMI-to-PCI Bridge Emulation. However, the device has some disadvantages: it cannot be used by ARM guests and it is part of the Root Complex, so it can't be hot-plugged.

The goal of this summer project is to code a generic PCIe-PCI bridge. The bridge should be hot-pluggable into PCI Express Root Ports and be usable across various architectures and Guest Operating Systems.

Once the bridge is merged upstream, and as time permits, the PCI/PCI Express infrastructure will be ported to the QOM model to conform with QEMU standards.
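For orientation, a PCIe-to-PCI bridge identifies itself through a type 1 configuration-space header with base class 0x06 and subclass 0x04 (PCI-to-PCI bridge). Here is a sketch of the first 16 bytes of such a header; the vendor/device IDs below are placeholders for this sketch, not the IDs a merged device would necessarily use:

```python
import struct

def bridge_header_prefix(vendor_id, device_id):
    """First 16 bytes of a PCI config header for a PCIe-to-PCI bridge."""
    return struct.pack("<HHHHBBBBBBBB",
        vendor_id, device_id,
        0x0000,          # command register (cleared at reset)
        0x0010,          # status: capabilities list present
        0x00, 0x00,      # revision ID, programming interface
        0x04, 0x06,      # subclass 0x04, base class 0x06: PCI-to-PCI bridge
        0x00, 0x00,      # cache line size, latency timer
        0x01,            # header type 1: bridge layout (bus numbers, windows)
        0x00)            # BIST

hdr = bridge_header_prefix(0x1b36, 0x000e)   # placeholder Red Hat/QEMU IDs
```

The actual QEMU device would also need a PCI Express capability so that it is accepted when hot-plugged into a Root Port.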

Links:

Details:

  • Skill level: intermediate
  • Language: C
  • Mentor: marcel@redhat.com, marcel_a on IRC
  • Suggested by: Marcel Apfelbaum <marcel@redhat.com>

Add a Hypervisor.framework accelerator

Summary: Add x86 virtualization support on macOS using Hypervisor.framework

QEMU does not yet take advantage of Hypervisor.framework, the API for hypervisors on macOS. Currently one must use the slower TCG just-in-time compiler or the Intel HAXM accelerator module, which relies on a third-party driver.

Hypervisor.framework was added to macOS in Yosemite (10.10). It exposes the Intel VMX CPU feature for running guest code safely at native speed. The main difference from the KVM and HAXM APIs is that a Hypervisor.framework user must implement instruction emulation to handle instructions that vmexit due to I/O accesses. Most of the code will be related to this emulator.

QEMU would be able to run x86 virtual machines with much better performance and without relying on third-party drivers thanks to Hypervisor.framework. This will make QEMU more useful on macOS and encourage more contributions from developers on that platform.

This is an advanced project. You should be familiar with the concept of an emulator. Luckily, there is the Linux KVM code, as well as other code that implements VMX or Hypervisor.framework, to use for inspiration. You will learn about writing the most core part of a hypervisor.
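As a taste of the emulation work: on a port I/O vmexit, the hypervisor must decode the guest instruction itself. A toy sketch decoding two simple forms of the x86 OUT instruction (a real emulator must handle prefixes, operand sizes, and many more opcodes; the register-file dict is invented for this sketch):

```python
def decode_out(code, regs):
    """Decode an x86 OUT instruction at code[0].
    Returns (port, value, instruction_length) or None if unsupported.
    `regs` is a hypothetical register file with 'al' and 'dx' entries."""
    op = code[0]
    if op == 0xE6:                       # OUT imm8, AL
        return (code[1], regs["al"], 2)
    if op == 0xEE:                       # OUT DX, AL
        return (regs["dx"], regs["al"], 1)
    return None                          # punt to a fuller decoder

regs = {"al": 0x41, "dx": 0x3F8}
```

After decoding, the hypervisor forwards the access to the emulated device model and advances the guest's instruction pointer by the instruction length.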

There is an existing QEMU-based Hypervisor.framework implementation in Veertu's hypervisor. This can serve as a reference and one way to approach the project is to take that code and get it merged into QEMU after necessary changes have been made.

Links:

Details:

  • Skill level: advanced
  • Language: C
  • Mentor: Alexander Graf <agraf@suse.de>

Vhost-pci based inter-VM communication extension

Summary: extend the current vhost-pci based inter-VM communication

Existing vhost-pci supports dynamic setup (i.e. the vhost-pci-net device is created and hot-plugged into the VM based on runtime requests) of an asymmetric inter-VM communication channel (i.e. communication between vhost-pci-net and virtio-net). The channel is built by sharing one VM's entire memory with another VM. This yields good inter-VM communication performance and is useful for use cases (e.g. Network Function Virtualization) where security is not a primary concern.

This extension work enables static setup (i.e. creating vhost-pci-net via the QEMU command line) of a symmetric inter-VM communication channel (i.e. vhost-pci-net to vhost-pci-net communication). Instead of sharing a VM's entire memory, the two VMs share a piece of intermediate memory used to transmit network packets.
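The intermediate memory is essentially a ring buffer that both VMs can access. A highly simplified single-producer/single-consumer packet ring sketch; the layout and names are invented for illustration and ignore the real virtio/vhost descriptor formats:

```python
class PacketRing:
    """Single-producer/single-consumer ring over a shared buffer.
    Each slot holds one fixed-size packet; one slot is kept free to
    distinguish a full ring from an empty one."""
    def __init__(self, shared_mem, slot_size):
        self.mem = shared_mem            # stands in for the shared region
        self.slot = slot_size
        self.slots = len(shared_mem) // slot_size
        self.head = 0                    # next slot to write (producer VM)
        self.tail = 0                    # next slot to read (consumer VM)

    def send(self, pkt):
        if (self.head + 1) % self.slots == self.tail:
            return False                 # ring full, caller must retry
        off = self.head * self.slot
        self.mem[off:off + len(pkt)] = pkt
        self.head = (self.head + 1) % self.slots
        return True

    def recv(self):
        if self.tail == self.head:
            return None                  # ring empty
        off = self.tail * self.slot
        pkt = bytes(self.mem[off:off + self.slot])
        self.tail = (self.tail + 1) % self.slots
        return pkt

ring = PacketRing(bytearray(16), 4)      # tiny 4-slot ring for illustration
```

In the real design the head/tail indices live in the shared region too, with memory barriers and an interrupt/eventfd mechanism for notification between the VMs.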

Links:

Details:

  • Skill level: advanced
  • Language: C
  • Mentors: Wei Wang <wei.w.wang@intel.com>, Yuanhan Liu <yuanhan.liu@intel.com>
  • Suggested by: Marc-André Lureau <marcandre.lureau@redhat.com>

Python module for Jailhouse

Summary: Factor out a Python module for the Jailhouse command & control interface

The partitioning hypervisor Jailhouse is controlled from a Linux guest system (the root cell) via a command line tool (jailhouse). This tool consists of a small core written in C, providing the minimally required services to operate the hypervisor. For more advanced features, like configuration file generation or loading additional Linux guests, a set of extension scripts has been written in Python. The scripts are now at a point where they would benefit from a "pyjailhouse" module that implements common functionality only once, e.g. configuration format and system information parsing, as well as interaction with the low-level hypervisor control interface.

The goal of this project is to create such a Python module, transfer existing code from the scripts into it, and make sure that the module is both properly installed and usable from the source directory. As a reuse case and benchmark for a reasonable split-up, jailhouse hardware check shall be enabled to parse the system configuration itself, instead of relying on a config file previously generated by jailhouse config create.

Bonus (with advanced Python experience): further refactoring of the system configuration parser and config file generator to enable reuse on non-x86 architectures or for generating and validating non-root cell configurations.
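As a sketch of the kind of code that could live in pyjailhouse, here is a parser for a hypothetical binary config header. The signature and layout are invented for this sketch and do not match Jailhouse's real cell configuration format:

```python
import struct

# Hypothetical header: 6-byte signature, u16 revision, 32-byte cell name
HEADER = struct.Struct("<6sH32s")

def parse_header(data):
    """Parse a (made-up) cell configuration header into a dict."""
    sig, revision, name = HEADER.unpack_from(data)
    if sig != b"JHCELL":                 # invented signature for this sketch
        raise ValueError("not a cell configuration")
    return {"revision": revision,
            "name": name.rstrip(b"\0").decode("ascii")}

# Round-trip example: pack a header, then parse it back
blob = HEADER.pack(b"JHCELL", 1, b"demo-cell".ljust(32, b"\0"))
```

Centralizing parsing like this lets the config generator, hardware check, and cell loader all share one definition of the format instead of each script carrying its own copy.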

Links:

Details:

  • Skill level: intermediate to advanced
  • Language: Python (basic C knowledge helpful)
  • Mentors: Jan Kiszka <jan.kiszka@web.de>, Valentine Sinitsyn <valentine.sinitsyn@gmail.com>

New configuration format for Jailhouse

Summary: Define a new Jailhouse configuration source file format and implement a compiler for it

So far, the configuration files for the Jailhouse hypervisor are defined in C, by filling structures and arrays that are then translated into their binary representation with standard gcc. In order to simplify this error-prone definition process while keeping the binary output format unchanged, a new source format shall be developed. It could be YAML-based, but alternatives can be discussed as well.

The requirements on the format are

  • human readability (no XML...)
  • easily modifiable by humans (e.g. by using symbolic labels for references, instead of array indexes)
  • unambiguous transformation into the binary format
  • optional: support for including or referencing common fragments (e.g. device resources initially used by root cell, then transferred to a non-root cell)
  • optional: support for grouping of associated resources (e.g. a PCI device with its capabilities and memory regions)

In order to use the format in place of the existing C descriptions, a compiler shall be written (in Python, C, or a mixture) to generate the binary representation of the configuration files, and hooked into the Jailhouse build process. There will surely be a transitional phase before the new format has matured and the old format can be discontinued. This phase will likely last longer than this project.
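Such a compiler essentially loads the source description and packs the unchanged binary structures. A stdlib-only sketch, using a plain Python dict in place of a real YAML parser and an invented record layout (the real Jailhouse structures differ):

```python
import struct

# Invented layout: u64 phys start, u64 size, u32 flags per memory region
REGION = struct.Struct("<QQI")
FLAGS = {"read": 1, "write": 2, "execute": 4}   # symbolic labels -> bits

def compile_regions(config):
    """Translate a parsed source config into its binary representation."""
    out = b""
    for r in config["memory_regions"]:
        flags = 0
        for name in r["flags"]:
            flags |= FLAGS[name]   # humans write names, not magic numbers
        out += REGION.pack(r["start"], r["size"], flags)
    return out

# What a parsed YAML document might look like after loading
cfg = {"memory_regions": [
    {"start": 0x3B000000, "size": 0x100000, "flags": ["read", "write"]},
]}
blob = compile_regions(cfg)
```

The symbolic-label resolution step is where the format requirements above (labels instead of array indexes, fragment inclusion) would be implemented.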

Links:

Details:

  • Skill level: advanced
  • Language: Python, C
  • Mentors: Jan Kiszka <jan.kiszka@web.de>, Valentine Sinitsyn <valentine.sinitsyn@gmail.com>

Vulkan-ize virgl

Summary: accelerated rendering of Vulkan APIs

virgl enables accelerated 3D rendering in a VM. It uses desktop GL on the host and provides OpenGL/GLES in the guest.

This project aims to implement Vulkan-accelerated rendering. There are multiple ways of interpreting this idea. One interesting approach would be to support Vulkan in the VM on a Vulkan-capable host, doing more passthrough.

Links:

Details:

  • Skill level: advanced
  • Language: C
  • Mentors: airlied@redhat.com, marcandre.lureau@redhat.com
  • Suggested by: marcandre.lureau@redhat.com

virgl Windows driver

Summary: accelerated rendering of Windows guests

virgl enables accelerated 3d rendering in a VM. It currently only supports Linux guests.

There is already a working prototype of a virtio-gpu DOD (display-only driver). The goal of this project is to enable 3D rendering: probably by first working on an OpenGL Installable Client Driver (ICD), and then on DirectX support (it could be worth investigating the 'nine' Mesa state tracker).

Links:

Details:

  • Skill level: advanced
  • Language: C
  • Mentors: airlied@redhat.com, vrozenfe@redhat.com
  • Suggested by: marcandre.lureau@redhat.com

virgl on Windows host

Summary: make virgl rendering work on Windows host

virgl enables accelerated 3d rendering in a VM. It requires Desktop GL on the host.

In theory, virgl should work on Windows with a capable host driver. This project aims at making virgl work well with various GPUs on Windows. Since many Windows OpenGL drivers have bad behaviours, it would be worthwhile to support ANGLE/OpenGL ES instead. This would require various modifications to the virgl library. Additionally, it would be a good opportunity to ease cross-compilation and packaging of qemu/virgl with msitools.

Links:

Details:

  • Skill level: intermediate or advanced
  • Language: C
  • Mentors: marcandre.lureau@redhat.com, airlied@redhat.com
  • Suggested by: marcandre.lureau@redhat.com

MTTCG Performance Enhancements

Summary: The MTTCG project converted the TCG engine from single-threaded to multi-threaded execution in order to take advantage of all cores on a modern processor. This conversion exposed several performance bottlenecks when running strongly ordered guests like x86 on weakly ordered hosts like ARM64. The first part of the project is to quantify the identified bottlenecks for TCG performance. Based on this data, you will prioritize one of the following sub-tasks.

  • Measure performance bottlenecks experimentally
    • Reasons for code flushes in the current code execution
    • Re-translation overhead for commonly used translation blocks
    • Consistency overhead caused by generating fence instructions for all loads/stores
  • Place TranslationBlock structures into the same memory block as code_gen_buffer

Consider what happens within every TB:

(1) We have one or more references to the TB address, via exit_tb.

For aarch64, this will normally require 2-4 insns.

 # alpha-softmmu
 0x7f75152114:  d0ffb320      adrp x0, #-0x99a000 (addr 0x7f747b8000)
 0x7f75152118:  91004c00      add x0, x0, #0x13 (19)
 0x7f7515211c:  17ffffc3      b #-0xf4 (addr 0x7f75152028)
 # alpha-linux-user
 0x00569500:  d2800260      mov x0, #0x13
 0x00569504:  f2b59820      movk x0, #0xacc1, lsl #16
 0x00569508:  f2c00fe0      movk x0, #0x7f, lsl #32
 0x0056950c:  17ffffdf      b #-0x84 (addr 0x569488)

We would reduce this to one insn, always, if the TB were close by, since the ADR instruction has a range of 1MB.

(2) We have zero to two references to a linked TB, via goto_tb.

  • Remove the 128MB translation cache size limit on ARM64.

The translation cache size for an ARM64 host is currently limited to 128 MB. This limitation is imposed by using a branch instruction that encodes the jump offset in a limited number of bits. The performance impact of this limitation is severe and can be observed when you try to run large programs, like a browser, in the guest: the cache is flushed several times before the browser starts, and performance is not satisfactory. The limitation can be overcome by generating a branch-to-register instruction and using it when the destination address is outside the range that can be encoded in the current branch instruction.
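The 128 MB figure follows from the AArch64 B instruction encoding: a 26-bit signed immediate counted in 4-byte words, i.e. a reach of ±2^27 bytes. A quick sanity check of that arithmetic:

```python
def fits_in_b(offset):
    """Can `offset` (in bytes) be encoded in an AArch64 B instruction?
    B holds a 26-bit signed immediate counted in 4-byte words."""
    return offset % 4 == 0 and -(1 << 27) <= offset < (1 << 27)

# The direct branch reaches +/-128 MB around the instruction ...
assert fits_in_b(128 * 1024 * 1024 - 4)
# ... and nothing beyond, which caps the usable translation cache:
assert not fits_in_b(128 * 1024 * 1024)
# Past that range the code generator must load the target into a
# register (e.g. adrp+add or a literal load) and emit BR instead.
```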

Based on the previous task of placing the translation structures within the code gen buffer, we can remove this 128 MB cache size limit as follows:

(i) Raise the maximum to 2 GB by aligning an instruction pair, adrp+add, to compute the address; the following insn would branch. The update code would write a new destination by modifying the adrp+add with a single 64-bit store.

(ii) Eliminate the maximum altogether by referencing the destination directly in the TB. This is the !USE_DIRECT_JUMP path. It is normally not used on 64-bit targets because computing the full 64-bit address of the TB is harder, or just as hard, as computing the full 64-bit address of the destination.

However, if the TB is nearby, aarch64 can load the address from TB.jmp_target_addr in one insn, with LDR (literal). This pc-relative load also has a 1MB range.

This has the side benefit that it is much quicker to re-link TBs, both in the computation of the code for the destination as well as re-flushing the icache.

  • Implement an LRU translation block code cache.

With the current mechanism, it is not necessary to know in advance how much code will be generated for a given set of TCG opcodes: when we reach the high-water mark, we flush everything and start over at the beginning of the buffer. We can improve this situation by not flushing recently used TBs, i.e. by implementing an LRU policy for freeing blocks. If you manage the cache with an allocator, you will need to know in advance how much code is going to be generated. This requires generating position-independent code into an external buffer and copying it into the code gen buffer after determining its size. We can then implement an LRU policy for removing unused blocks while preserving the rest of the translation cache.
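The eviction policy itself is simple; as described above, the hard part is knowing block sizes up front. An LRU sketch using an ordered map (a real implementation would manage raw buffer space rather than Python objects):

```python
from collections import OrderedDict

class LRUCodeCache:
    """Evict least-recently-executed TBs once the cache budget is exceeded."""
    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self.blocks = OrderedDict()          # tb_id -> code size in bytes

    def execute(self, tb_id):
        """Mark a TB as most-recently-used; returns False on a cache miss."""
        if tb_id not in self.blocks:
            return False
        self.blocks.move_to_end(tb_id)
        return True

    def insert(self, tb_id, size):
        """Add a freshly translated TB, evicting LRU blocks as needed."""
        while self.used + size > self.capacity and self.blocks:
            _, freed = self.blocks.popitem(last=False)   # drop the LRU block
            self.used -= freed
        self.blocks[tb_id] = size
        self.used += size
```

Note the `size` argument in `insert`: it is exactly the quantity the current one-pass code generator does not know ahead of time, hence the need for position-independent generation into a scratch buffer.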

  • Avoid consistency overhead for strong memory model guests by generating load-acquire and store-release instructions.

To run a strongly ordered guest on a weakly ordered host using MTTCG, for example x86 on ARM64, we have to generate fence instructions for all the guest memory accesses to ensure consistency. The overhead imposed by these fence instructions is significant (almost 3x compared to a run without them). ARM64 provides load-acquire and store-release instructions, which are sequentially consistent and can be used instead of generating fence instructions. Add support for generating these instructions in the TCG runtime to reduce the consistency overhead in MTTCG. You will have to use the memory access auxiliary info tags to generate the appropriate instructions on the host architecture, unlike the current situation where only explicit guest fence instructions are translated.

Further Reading:

Requirements: Working on this will require the student to develop a good understanding of the internals of the Tiny Code Generator (TCG) in QEMU. An understanding of compiler theory or previous knowledge of the TCG would also be beneficial. Finally, familiarity with Git and the ability to frequently rebase work on the upstream master branch would be useful.

Details:

  • Skill level: intermediate
  • Language: C
  • Mentor: Alex Bennée <alex.bennee@linaro.org> (stsquad on IRC)
  • Suggested by: Pranith Kumar, Alex Bennée, and Richard Henderson

Project idea template

=== TITLE ===
 
 '''Summary:''' Short description of the project
 
 Detailed description of the project.
 
 '''Links:'''
 * Wiki links to relevant material
 * External links to mailing lists or web sites
 
 '''Details:'''
 * Skill level: beginner or intermediate or advanced
 * Language: C
 * Mentor: Email address and IRC nick
 * Suggested by: Person who suggested the idea

How to propose a custom project idea

Applicants are welcome to propose their own project ideas. The process is as follows:

  1. Email your project idea to qemu-devel@nongnu.org. CC Stefan Hajnoczi <stefanha@gmail.com> and regular QEMU contributors who you think might be interested in mentoring.
  2. If a mentor is willing to take on the project idea, work with them to fill out the "Project idea template" above and email Stefan Hajnoczi <stefanha@gmail.com>.
  3. Stefan will add the project idea to the wiki.

Note that other candidates can apply for newly added project ideas. This ensures that custom project ideas are fair and open.

How to get familiar with our software

See what people are developing and talking about on the mailing lists:

Grab the source code or browse it:

Build QEMU and run it: QEMU on Linux Hosts

Important links

Information for mentors

Mentors are responsible for keeping in touch with their student and assessing the student's progress. GSoC has a mid-term evaluation and a final evaluation where both the mentor and student assess each other.

The mentor typically gives advice, reviews the student's code, and has regular communication with the student to ensure progress is being made.

Being a mentor is a significant time commitment, plan for 5 hours per week. Make sure you can make this commitment because backing out during the summer will affect the student's experience.

The mentor chooses their student by reviewing student application forms and conducting IRC interviews with candidates. Depending on the number of candidates, this can be time-consuming in itself. Choosing the right student is critical so that both the mentor and the student can have a successful experience.