Google Summer of Code 2020

Revision as of 19:27, 25 February 2020 by F4bug (talk | contribs) (→‎Project Ideas: Add Graphical user interface and Arduino project)

Introduction

QEMU is applying to Google Summer of Code 2020. This page contains our ideas list and information for students and mentors. Google Summer of Code is an open source internship program for university students offering 12-week, full-time, paid remote work from May to August.

Applicants: You are welcome to think about project ideas and familiarize yourself with QEMU, but please don't invest too much time at this early stage. Google will announce participating organizations on February 20.

Application Process

1. Discuss the project idea with the mentor(s)

Read the project ideas list and choose one you are interested in. Read the links in the project idea description and start thinking about how you would approach this. Ask yourself:

  • Do I have the necessary technical skills to complete this project in 12 weeks?
  • Will I be able to work independently without the physical presence of my mentor?

If you answer no to these questions, choose another project idea and/or organization that fits your abilities better.

Once you have identified a suitable project idea, email the mentor(s) your questions about the idea and explain your understanding of the project idea to them to verify that you are on track.

2. Fill out the application form

The application form asks for a problem description and an outline of how you intend to implement a solution. You will need to do some background research (looking at source code, browsing relevant specifications, etc.) in order to form an idea of how to tackle the project. The form asks for an initial 12-week project schedule, which you should create by breaking down the project into tasks and estimating how long they will take. The schedule can be adjusted during the summer, so don't worry about getting everything right ahead of time.

3. IRC interview including a coding exercise

You may be invited to an IRC interview. The interview consists of a 30-minute coding exercise, followed by technical discussion and a chance to ask questions you have about the project idea, QEMU, and GSoC. The coding exercise is designed to show fluency in the programming language for your project idea (QEMU projects are typically in C but could also be in Python or Rust).

Here is a C coding exercise we have used in previous years when interviewing students: 2014 coding exercise

Try it and see if you are comfortable enough writing C. We cannot answer questions about the previous coding exercise but hopefully it should be self-explanatory.

If you find the exercise challenging, think about applying to other organizations where you have a stronger technical background and will be more competitive compared with other candidates.

Key Dates

From the timeline

  • March 16 - 31, 2020 - Student Applications
  • April 27, 2020 - Student Projects Announced

Find Us

  • IRC (GSoC specific): #qemu-gsoc on irc.oftc.net
  • IRC (development):
    • QEMU: #qemu on irc.oftc.net
    • KVM: #kvm on chat.freenode.net

For general questions about QEMU in GSoC, please contact the following people:

Project Ideas

This is the list of suggested project ideas. Students are free to suggest their own projects; see #How to propose a custom project idea below.

Device Emulation

NVMe Emulation Performance Optimization

Summary: QEMU's NVMe emulation uses the traditional trap-and-emulate method to emulate I/Os, thus the performance suffers due to frequent VM-exits. Version 1.3 of the NVMe specification defines a new feature to update doorbell registers using a Shadow Doorbell Buffer. This can be utilized to enhance performance of emulated controllers by reducing the number of Submission Queue Tail Doorbell writes.

Furthermore, it is possible to run emulation in a dedicated thread called an IOThread. Emulating NVMe in a separate thread allows the vcpu thread to continue execution and results in better performance.

Finally, it is possible for the emulation code to watch for changes to the queue memory instead of waiting for doorbell writes. This technique is called polling and reduces notification latency at the expense of another thread consuming CPU to detect queue activity.

The goal of this project is to implement these optimizations so QEMU's NVMe emulation performance becomes comparable to virtio-blk performance.

Tasks include:

  • Add Shadow Doorbell Buffer support to reduce doorbell writes
  • Add Submission Queue Tail Doorbell register ioeventfd support when the Shadow Doorbell Buffer is enabled (see existing patch linked below)
  • Add Submission Queue polling
  • Add IOThread support so emulation can run in a dedicated thread
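
The saving from the Shadow Doorbell Buffer can be illustrated with a toy model. The sketch below is not QEMU code: the class and method names are hypothetical, queue wrap-around at the queue size is ignored, and it assumes the emulated controller publishes the last tail value it has seen in an EventIdx field, so the driver only performs the trapping MMIO doorbell write when the new tail moves past that value.

```python
def need_event(event_idx, new_idx, old_idx):
    """Return True if the tail moved past event_idx (16-bit wrap-safe compare)."""
    return ((new_idx - event_idx - 1) & 0xFFFF) < ((new_idx - old_idx) & 0xFFFF)


class ShadowDoorbellQueue:
    """Toy submission queue that counts trapped doorbell writes."""

    def __init__(self):
        self.shadow_tail = 0  # shadow doorbell entry, plain guest memory
        self.event_idx = 0    # EventIdx entry, written by the emulated controller
        self.mmio_writes = 0  # doorbell register writes, each one a VM-exit

    def submit(self, count=1):
        """Driver side: queue `count` commands and ring the doorbell if needed."""
        old = self.shadow_tail
        new = (old + count) & 0xFFFF
        self.shadow_tail = new  # cheap memory write, no trap
        if need_event(self.event_idx, new, old):
            self.mmio_writes += 1  # expensive trapped register write

    def device_poll(self):
        """Controller side: read the tail from shared memory, no trap involved."""
        self.event_idx = self.shadow_tail
        return self.shadow_tail
```

In this model, eight back-to-back submissions cost a single trapped write; the controller picks up the remaining seven by reading the shadow tail when it next polls the queue.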

Links:

Details:

  • Project size: 350 hours
  • Difficulty: intermediate to advanced
  • Required skills: C programming
  • Desirable skills: knowledge of the NVMe PCI specification, knowledge of device driver or emulator development
  • Mentor: Klaus Jensen <its@irrelevant.dk> (kjensen on IRC), Keith Busch <kbusch@kernel.org>
  • Suggested by: Huaicheng Li <huaicheng@cs.uchicago.edu>, Paolo Bonzini <pbonzini@redhat.com> ("bonzini" on IRC)

BusLogic SCSI adapter emulation

Summary: Port the BusLogic SCSI adapter from VirtualBox to QEMU

QEMU does not emulate the BusLogic BT-958 SCSI adapter. Virtual machines created by VirtualBox may only include the BusLogic driver and therefore be unable to boot under QEMU.

This project is aimed at supporting the BusLogic BT-958 adapter in QEMU. VirtualBox code may be used as a reference. There is no hardware documentation available, however the Linux driver may be used to recover the details of the adapter behavior.

This project will expose you to device emulation and how SCSI Host Bus Adapters (HBAs) work. You will learn in detail how drivers perform disk I/O with the BusLogic BT-958 adapter. Previous experience with device driver development or device emulation will be helpful but is not necessary.

Links:

Details:

  • Skill level: advanced
  • Language: C
  • Mentor: Denis Dmitriev <Denis.Dmitriev@ispras.ru>, Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
  • Suggested by: Pavel Dovgalyuk

Virtual FIDO2/U2F security key

Summary: Write a virtual USB device which presents a FIDO2/U2F security key to the guest.

Possible modes of operation:

  • pass-through: pass any requests to a physical key plugged into the host. Allow parallel usage from host and (multiple) guests.
  • virtual key: fully emulated device.

Links:

Details:

  • Skill level: intermediate/advanced
  • Language: C
  • Mentor: Gerd Hoffmann <kraxel@redhat.com>
  • Suggested by: Gerd Hoffmann <kraxel@redhat.com>

Block layer improvements

Anonymization of virtual disk images

Summary: Extend the qemu-img utility to drop all data from the virtual disk while preserving image metadata.

Virtual disk images like QCOW2, VHDX, or VMDK files may reach a bad state during their lifecycle and require debugging. This typically happens at cloud or hosting providers, and the images contain data that belongs to end users rather than to the provider. European cloud providers now treat this data under GDPR privacy regulations, so these image files cannot easily be sent to developers for investigation.

The idea of this project is to drop all end-user data from images, including data blocks, memory inside internal snapshots, etc. On the other hand, every bit and byte of the original image's metadata should be preserved, including the so-called "in-use" bit and internal metadata state. This will allow problematic image files to be debugged without transmitting the privacy-sensitive data contents of the disk image files.

The task is to implement a "qemu-img anonymize" command for the QCOW2 file format and also add support for the VHDX and VMDK file formats if time permits. This new command will not only help meet GDPR regulations but also make support more convenient for users because anonymized disk image files compress much better.
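
As a starting point for thinking about which parts of an image are metadata, the sketch below parses the first 72 bytes of a QCOW2 header and lists the top-level metadata regions that an anonymizer would have to leave untouched. It is only an illustration under stated assumptions: the function names are hypothetical, only the version 2 header fields are read, and a real "qemu-img anonymize" would additionally have to follow the L1 table to the L2 tables, the refcount table to the refcount blocks, and the snapshot table before deciding which clusters hold guest data.

```python
import struct

QCOW2_MAGIC = b"QFI\xfb"
# Big-endian layout of the first 72 header bytes (version 2 fields).
HEADER_FORMAT = ">4sIQIIQIIQQIIQ"
HEADER_FIELDS = (
    "magic", "version", "backing_file_offset", "backing_file_size",
    "cluster_bits", "size", "crypt_method", "l1_size", "l1_table_offset",
    "refcount_table_offset", "refcount_table_clusters", "nb_snapshots",
    "snapshots_offset",
)


def parse_qcow2_header(raw):
    """Decode the fixed header fields from the start of an image file."""
    header = dict(zip(HEADER_FIELDS, struct.unpack_from(HEADER_FORMAT, raw)))
    if header["magic"] != QCOW2_MAGIC:
        raise ValueError("not a QCOW2 image")
    return header


def metadata_regions(header):
    """Return (name, offset, length) for the top-level metadata areas."""
    cluster_size = 1 << header["cluster_bits"]
    return [
        ("header", 0, cluster_size),
        # Each L1 entry is 8 bytes.
        ("l1_table", header["l1_table_offset"], header["l1_size"] * 8),
        ("refcount_table", header["refcount_table_offset"],
         header["refcount_table_clusters"] * cluster_size),
    ]
```

For example, `parse_qcow2_header(open(path, "rb").read(72))` would decode the header of an existing image, where `path` is whatever image file is being inspected.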

This project will allow you to learn about how disk image file formats work. You will become familiar with the internals of the QCOW2 file format and how data is laid out on disk.

Links:

Details:

  • Skill level: intermediate
  • Language: C
  • Mentor: Denis V. Lunev <den@openvz.org>
  • Suggested by: Denis V. Lunev <den@openvz.org>

TCG Just-in-Time Compiler

TCG Plugin Cache Modelling

Summary: Implement a simple cache modelling plugin for QEMU's TCG plugins.

QEMU's TCG emulation has traditionally avoided doing complex modelling of the processor in favor of running fast. However, with the recent introduction of TCG plugins, we can put some simple cache modelling into a plugin which can be optionally loaded when we want to examine how a program works. With such a plugin we could identify areas of code in either a linux-user program or a whole system that may not be cache optimal. The aim would be to write a plugin that allows you to simply model different icache/dcache configurations rather than actually simulate the micro-architecture of a CPU.
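
To give a feel for what "simply model" could mean here, the following is a minimal sketch of a set-associative cache with LRU replacement that counts hits and misses for a stream of addresses. It is written in Python for brevity and is not tied to the TCG plugin API; the class name, the parameters, and the LRU policy are assumptions chosen for illustration. An actual plugin would feed it the instruction and memory addresses reported by QEMU.

```python
from collections import OrderedDict


class Cache:
    """Set-associative cache model with LRU replacement (tags only, no data)."""

    def __init__(self, size, line_size, assoc):
        self.line_size = line_size
        self.assoc = assoc
        self.num_sets = size // (line_size * assoc)
        # One ordered dict per set; insertion order tracks recency of use.
        self.sets = [OrderedDict() for _ in range(self.num_sets)]
        self.hits = 0
        self.misses = 0

    def access(self, addr):
        """Record one access and return True on a hit, False on a miss."""
        block = addr // self.line_size
        index = block % self.num_sets
        tag = block // self.num_sets
        lines = self.sets[index]
        if tag in lines:
            lines.move_to_end(tag)  # mark as most recently used
            self.hits += 1
            return True
        self.misses += 1
        if len(lines) >= self.assoc:
            lines.popitem(last=False)  # evict the least recently used line
        lines[tag] = True
        return False
```

Different icache/dcache configurations can then be compared by creating several `Cache` objects with different sizes and associativities and replaying the same address stream through each of them.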

Links:

Details:

  • Skill level: intermediate, with a good understanding of processor instruction and data caches
  • Language: C, Python
  • Mentor: Alex Bennée (alex.bennee@linaro.org)
  • Suggested by: Alex Bennée

TCG Continuous Benchmarking

Summary: This project is more about exploration, analysis, and presentation than about coding. The performance of a software product will be examined in great detail. The software under examination is the QEMU emulator: across its modes, across its components, and across time.


QEMU can operate in so-called user mode, where an executable built for one processor (the target, in QEMU parlance) is executed through QEMU emulation on a system with another processor (the host), and in system mode, where a whole system of one kind (target) is emulated on a system of another kind (host). These two modes will be examined separately:


TASK

PART I: (user mode)

  • select around a dozen test programs (resembling components of the SPEC benchmark, but all must be open source, preferably with a license compatible with QEMU), distributed as follows: 4-5 FPU CPU-intensive, 4-5 non-FPU CPU-intensive, 1-2 I/O-intensive;
  • measure execution time and other performance data of all selected test programs across all targets on Intel and possibly other hosts, for the latest QEMU version:
    • try to improve performance if there is an obvious bottleneck;
    • develop tests that protect against future performance regressions;
    • provide automated nightly tests that notify QEMU developers when performance changes.
  • measure execution time of all selected test programs for selected targets for all QEMU versions from the last 5 years (there are approximately 15 such versions):
    • confirm performance improvements and/or detect performance degradation.
  • summarize all results in a comprehensive form, including graphics/data visualization.
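
The measurement and regression-detection steps above could be automated along the following lines. This is a minimal sketch, not a proposed design: the helper names, the use of wall-clock time, the median comparison, and the 5% tolerance are all assumptions, and `SELF_TEST_COMMAND` exists only so the helper can be tried without QEMU installed.

```python
import statistics
import subprocess
import sys
import time

# A trivial command for trying the helper on a machine without QEMU.
SELF_TEST_COMMAND = [sys.executable, "-c", "pass"]


def time_command(argv, runs=5):
    """Run argv `runs` times and return each wall-clock duration in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(argv, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        durations.append(time.perf_counter() - start)
    return durations


def is_regression(baseline, current, tolerance=0.05):
    """Flag a regression when the median slows down by more than `tolerance`."""
    slowest_allowed = statistics.median(baseline) * (1 + tolerance)
    return statistics.median(current) > slowest_allowed
```

A nightly job could call, for instance, `time_command(["qemu-aarch64", "./bench"])`, where `./bench` stands for one of the selected test programs built for that target, and compare the result against the durations stored from the previous run.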

PART II: (system mode)

  • measure execution time and other performance data for boot/shutdown cycle for selected machines for the latest QEMU version:
    • try to improve performance if there is an obvious bottleneck.
  • summarize all results in a comprehensive form.


DELIVERABLES

1) Each target maintainer will be given a list of the top 25 functions in terms of host CPU time spent for each benchmark described in the previous section. Additional information and observations will also be provided if they are judged useful and relevant.

2) Each machine maintainer (for machines with a successful boot/shutdown cycle) will be given a list of the top 25 functions in terms of host time spent during the boot/shutdown cycle. Additional information and observations may also be provided.

3) The community will be given all devised performance measurement methods in the form of easily reproducible step-by-step setup and execution procedures.

Deliverables should be distributed gradually over a wider time interval of around two months.


Links:


Details:

  • Skill level: intermediate
  • Languages:
    • C (for code analysis, performance improvements)
    • Python (for automation)
    • potentially JavaScript (d3.js or similar library; for data visualization)
  • Mentor: Aleksandar Markovic (aleksandar.markovic@rt-rk.com)
  • Suggested by: Aleksandar Markovic

-user mode

Extend linux-user syscalls and ioctls

Summary: Implement new or missing syscalls in linux-user.

Background

Although QEMU is often used to run a full virtual machine with a guest operating system inside, it also supports running individual applications on top of the host operating systems. QEMU's linux-user mode translates a program's CPU instructions and emulates Linux system calls so that an ARM Linux executable can execute on an x86 host, for example.

There are currently 2500+ ioctls defined in the Linux kernel. QEMU linux-user currently supports only several hundred. There is a constant need for expanding ioctl support in QEMU. Users use linux-user mode in a variety of setups (for example, building and testing tools and applications in a chroot environment), and multiple people regularly work to fill in missing support.

Syscall coverage in QEMU linux-user is much better than ioctl coverage. However, the kernel syscall interface continuously develops and grows, and QEMU linux-user support usually lags considerably. New syscalls are usually left unimplemented until an end user reports that one is missing in their usage scenario.

In conclusion, the efforts for supporting ioctls and syscalls in QEMU have usually been done on a piece-by-piece basis, in a limited way covering a particular need. This project will take a more proactive stance by improving QEMU before users try applications that fail due to missing functionality.

The contributions of this project will be mostly to QEMU, but some parts will also extend LTP (Linux Test Project).

PART I:

  1. Add strace support for printing the third argument of ioctl() (be it int, string, structure or array) - limited to selected ioctls that are frequently used.
  2. Add strace support for printing the arguments of selected syscalls that are frequently used, and not covered in QEMU strace module so far.
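
When deciding how to print the third argument of an ioctl, the request number itself already carries useful hints. The sketch below decodes a request number using the generic Linux bit layout (8-bit number, 8-bit type, 14-bit size, 2-bit direction). It is illustrative only: the function name is hypothetical, some architectures use different field widths, and older ioctls such as the terminal ones do not encode size or direction at all.

```python
IOC_NRBITS = 8
IOC_TYPEBITS = 8
IOC_SIZEBITS = 14
IOC_DIRBITS = 2

IOC_TYPESHIFT = IOC_NRBITS
IOC_SIZESHIFT = IOC_TYPESHIFT + IOC_TYPEBITS
IOC_DIRSHIFT = IOC_SIZESHIFT + IOC_SIZEBITS

# Direction is expressed from the point of view of user space.
IOC_DIRECTIONS = {0: "none", 1: "write", 2: "read", 3: "read/write"}


def decode_ioctl(request):
    """Split an ioctl request number into its encoded fields."""
    return {
        "direction": IOC_DIRECTIONS[(request >> IOC_DIRSHIFT) & 0x3],
        "type": chr((request >> IOC_TYPESHIFT) & 0xFF),
        "nr": request & 0xFF,
        "size": (request >> IOC_SIZESHIFT) & 0x3FFF,
    }
```

For example, `decode_ioctl(0x80086601)` (the value of FS_IOC_GETFLAGS on a 64-bit host with the generic layout) reports a "read" of 8 bytes with type 'f' and number 1, which tells a tracer that the third argument points to a buffer the kernel fills in.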

PART II:

  1. Complete support for existing groups of ioctls that are only partially supported (e.g. filesystem ioctls)
  2. Add support for a selected group of ioctls that are not currently supported (e.g. DM ioctls, Bluetooth ioctls, or Radeon DRM ioctls)
  3. Add support for a selected group of syscalls that were recently introduced in the kernel.

PART III:

  1. Within LTP (Linux Test Project), develop unit tests for selected ioctls that are supported in QEMU (including some whose support is developed in PART II).
  2. Within LTP (Linux Test Project), develop unit tests for selected syscalls that are supported in QEMU (including some whose support is developed in PART II).

Deliverables

The deliverables are in the form of source code for each part, intended to be upstreamed to either the QEMU or LTP open source projects. The time needed for the upstreaming process (addressing reviews, etc.) is included in this project. The delivery of results can and should be distributed over a longer period of time (2-3 months).

Links:

Details:

  • Skill level: intermediate
  • Language: C
  • Mentor: Laurent Vivier <laurent@vivier.eu>
  • Suggested by: Aleksandar Markovic <amarkovic@wavecomp.com>

Graphical user interface

QEMU emulated Arduino board visualizer

Execution flow.

Summary

The project will add a visual representation of an Arduino based board. By running the code on the emulated AVR processor, the virtual board is updated and displays the changes. Interacting with the code via external events (widgets) triggers changes on the UI.

For details of the intended implementation, see Internships/ProjectIdeas/ArduinoVisualisation:detail.

Goal

Be able to use a QEMU emulated Arduino as part of a virtual board. Use the virtual board for interaction with the QEMU emulated Arduino and for visualization of the board states. Be able to program the emulated Arduino with the Arduino IDE.

The result should be easily usable by newcomers to the Arduino world.

Deliverables

The project is divided in several deliverables:

IDE Integration

  • Configure QEMU with the Arduino IDE (using chardev UART0).
  • Compile program and upload via serial.
  • The IDE doesn't need modifications.

UI (Python)

  • Connect UART1 (via QMP or chardev), display as textbox (input is not important at this point).

QEMU: GPIO

  • Produce a script to extract the GPIO devices from the netlist.
  • Configure QEMU devices to use the previous names/values.
  • Publish GPIO events (name as a string and voltage as a float) via a QMP socket (JSON form?).
  • Write a test which runs FreeRTOS to generate a stable output.

UI (Python)

  • Connect to the QMP socket and display the GPIO events.
  • Now GPIOs are connected to LEDs. Present graphical LEDs as ON/OFF.
  • Add an oscilloscope representation (matplotlib widget). Each GPIO can be plugged into the oscilloscope channels.
  • Add Switches and PushButtons to the UI, generating QMP events which trigger GPIO input.
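
One possible shape for the GPIO events and their handling in the UI is sketched below. The event name `GPIO_CHANGE`, the `name`/`voltage` fields, and the 2.5 V threshold are assumptions made for illustration, since the event format is still an open question in this proposal; only the outer `event`/`data`/`timestamp` structure follows the way QMP reports asynchronous events.

```python
import json

# Assumption: a pin at or above half of a 5 V supply lights its LED.
LED_ON_THRESHOLD_VOLTS = 2.5


def make_gpio_event(name, volts, seconds=0, microseconds=0):
    """Build one JSON line shaped like a QMP asynchronous event."""
    return json.dumps({
        "event": "GPIO_CHANGE",  # hypothetical event name
        "data": {"name": name, "voltage": float(volts)},
        "timestamp": {"seconds": seconds, "microseconds": microseconds},
    })


def update_leds(leds, line):
    """Update the LED on/off state from one received JSON line."""
    message = json.loads(line)
    if message.get("event") != "GPIO_CHANGE":
        return leds  # ignore unrelated QMP events
    data = message["data"]
    leds[data["name"]] = data["voltage"] >= LED_ON_THRESHOLD_VOLTS
    return leds
```

The UI would call `update_leds` for every line read from the QMP socket and redraw the graphical LEDs from the resulting dictionary.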

QEMU: PWM

  • Modify script to extract PWM devices used from the netlist.
  • Configure QEMU devices to use the previous names/values.
  • Use QEMU sound API to generate a stream of PWM values (as a wav).
  • Add a QMP command to look up the PWM wav stream.
  • Write a FreeRTOS test producing a sinusoid via PWM and verify the waveform.

UI (Python)

  • Look up the wav stream via the QMP socket, connect to it, and display it on the oscilloscope view.
  • Add a graphical representation of the LED intensity.
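
The relation between a PWM stream and the displayed LED intensity can be sketched as follows: averaging the samples of each PWM period recovers the duty cycle, which is what the eye perceives as brightness. The function names, the 100-sample period, and the 0/1 sample encoding are assumptions for illustration, not the format QEMU would emit.

```python
import math


def sine_duties(count):
    """Duty cycles (0.0 to 1.0) tracing one full sine period."""
    return [(1 + math.sin(2 * math.pi * i / count)) / 2 for i in range(count)]


def pwm_stream(duties, period=100):
    """Expand duty cycles into a flat list of 0/1 samples."""
    samples = []
    for duty in duties:
        high = round(duty * period)
        samples.extend([1] * high + [0] * (period - high))
    return samples


def led_intensity(samples, period=100):
    """Average each PWM period to recover its duty cycle."""
    return [sum(samples[i:i + period]) / period
            for i in range(0, len(samples), period)]
```

With these helpers, `led_intensity(pwm_stream(sine_duties(16)))` returns sixteen values that follow the original sine to within the resolution of one sample per period.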

QEMU: ADC

  • Modify the script to extract the ADC devices from the netlist.
  • Similarly to PWM, use the sound wav stream to read ADC samples.

UI (Python)

  • Add a textbox to set the ambient temperature (a thermometer is connected to some ADC pins).
  • Use a slider to set the voltage sampled by the ADC (as if it were a potentiometer).

Materials provided

Boards definition

A specific circuit configuration represented as a netlist.

Arduino code examples

Preset Arduino examples compliant with QEMU limitations:

  • Digital example: "Blink: Turn a LED on and off."
  • Analog example: "Fading: Use an analog output (PWM pin) to dim a LED."
  • Analog example: "Analog Input: Use a potentiometer to control the blinking of a LED."

QMP commands documentation

Extra tasks

Additional tasks are available for applicants who complete the project.

Essential skills required

  • Fluent in C
  • Comfortable programming in Python
  • Knowledge of JavaScript might be useful (Java will *not* be used).
  • Working knowledge of user interfaces

Electrical engineering background is not essential

Details

  • Skill level: intermediate to advanced
  • Language: C
  • Mentor: Philippe Mathieu-Daudé <f4bug@amsat.org> ("f4bug" on IRC)
  • Mentor: Joaquin de Andres <me@xcancerberox.com.ar> ("xcancerberox" on IRC)

How to add a project idea

  1. Create a new wiki page under "Internships/ProjectIdeas/YourIdea" and follow #Project idea template.
  2. Add a link from this page like this: {{:Internships/ProjectIdeas/YourIdea}}

Example idea from a previous year: Internships/ProjectIdeas/I2CPassthrough

Project idea template

=== TITLE ===
 
 '''Summary:''' Short description of the project
 
 Detailed description of the project.
 
 '''Links:'''
 * Wiki links to relevant material
 * External links to mailing lists or web sites
 
 '''Details:'''
 * Skill level: beginner or intermediate or advanced
 * Language: C
 * Mentor: Email address and IRC nick
 * Suggested by: Person who suggested the idea

How to propose a custom project idea

Applicants are welcome to propose their own project ideas. The process is as follows:

  1. Email your project idea to qemu-devel@nongnu.org. CC Stefan Hajnoczi <stefanha@gmail.com> and regular QEMU contributors who you think might be interested in mentoring.
  2. If a mentor is willing to take on the project idea, work with them to fill out the "Project idea template" above and email Stefan Hajnoczi <stefanha@gmail.com>.
  3. Stefan will add the project idea to the wiki.

Note that other candidates can apply for newly added project ideas. This ensures that custom project ideas are fair and open.

How to get familiar with our software

See what people are developing and talking about on the mailing lists:

Grab the source code or browse it:

Build QEMU and run it: QEMU on Linux Hosts

Links

Information for mentors

Mentors are responsible for keeping in touch with their student and assessing the student's progress. GSoC has a mid-term evaluation and a final evaluation where both the mentor and student assess each other.

The mentor typically gives advice, reviews the student's code, and has regular communication with the student to ensure progress is being made.

Being a mentor is a significant time commitment; plan for 5 hours per week. Make sure you can make this commitment because backing out during the summer will affect the student's experience.

The mentor chooses their student by reviewing student application forms and conducting IRC interviews with candidates. Depending on the number of candidates, this can be time-consuming in itself. Choosing the right student is critical so that both the mentor and the student can have a successful experience.