Internships/ProjectIdeas/TCGCodeQuality: Difference between revisions

Revision as of 22:08, 6 July 2019

Measure Tiny Code Generation Quality

Status: Vanderson M. do Rosario <vandersonmr2@gmail.com> (vanderson on #qemu IRC) is working on this project for GSoC.

Mentor: Alex Bennée <alex.bennee@linaro.org> (stsquad on #qemu IRC)

Project GitHub: vandersonmr/gsoc-qemu (https://github.com/vandersonmr/gsoc-qemu)

Summary: In most applications, the majority of the execution time is spent in a very small portion of the code. Regions of code that execute frequently are called hot, while all other regions are called cold. As a direct consequence, emulators also spend most of their execution time emulating these hot regions, so dynamic compilers and translators need to pay extra attention to them. Ensuring that these hot regions are translated into high-quality code is fundamental to achieving high-performance emulation. Thus, one of the most important steps in tuning an emulator's performance is to identify the hot regions and to measure their translation quality.

TBStatistics (TBStats)

Improving the code generation of the TCG backend is a hard task that involves reading through large amounts of text looking for anomalies in the generated code. It would be nice to have tools to more readily extract and parse code generation information. This would include options to dump:

  • The hottest Translation Blocks (TBs) and their execution counts (total and atomic).
  • Translation counters:
    • The number of times a TB has been translated, uncached and spanned.
  • Code quality metrics:
    • The number of TB guest, IR (TCG ops), and host instructions.
    • The number of spills during register allocation.
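As a rough illustration of the first item, the sketch below ranks blocks by their total execution count so that the hottest ones can be dumped first. The type DemoTBStat and the function demo_sort_by_hotness are hypothetical stand-ins invented for this example; they are not QEMU types or APIs.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified per-TB record; NOT the QEMU structure. */
typedef struct {
    uint64_t pc;               /* guest address of the block */
    unsigned long exec_total;  /* all executions of the block */
    unsigned long exec_atomic; /* executions in an atomic context */
} DemoTBStat;

/* qsort comparator: orders records by descending total execution count. */
static int demo_cmp_hotness_desc(const void *a, const void *b)
{
    const DemoTBStat *x = a;
    const DemoTBStat *y = b;

    if (x->exec_total < y->exec_total) {
        return 1;
    }
    if (x->exec_total > y->exec_total) {
        return -1;
    }
    return 0;
}

/* Sort in place so that the hottest blocks come first. */
static void demo_sort_by_hotness(DemoTBStat *stats, size_t n)
{
    qsort(stats, n, sizeof(*stats), demo_cmp_hotness_desc);
}
```

After sorting, printing the first few entries gives a "hottest TBs" listing; the comparison uses explicit branches rather than subtraction to avoid overflow on large counters.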

We therefore collect all this information dynamically, for every TB or for a specific set of TBs, and store it in TBStatistics structures. Every TB can have one TBStatistics linked to it through a new field inserted in the TranslationBlock structure[2]. Moreover, TBStatistics are not flushed during tb_flush; they outlive the TBs themselves and are relinked to retranslated TBs by using their keys (phys_pc, pc, flags, cs_base) to match TBs with their TBStats.
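The relinking step can be pictured with a minimal sketch, assuming a fixed-size table that is simply never cleared when translated code is flushed. All names here (DemoStats, DemoStatsTable, demo_on_translate) are hypothetical, the key fields are widened to plain fixed-width integers, and the linear scan stands in for whatever lookup structure the real code uses (most likely a hash table).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_STATS 64

/* Hypothetical stand-in for a statistics record; NOT the QEMU type. */
typedef struct {
    uint64_t phys_pc;
    uint64_t pc;
    uint64_t cs_base;
    uint32_t flags;
    unsigned long translations; /* how often this key was translated */
} DemoStats;

/* The table lives independently of translated code, so flushing the
 * code cache would leave it untouched. */
typedef struct {
    DemoStats entries[DEMO_MAX_STATS];
    size_t count;
} DemoStatsTable;

/* A record matches only if all four key fields are equal. */
static bool demo_key_matches(const DemoStats *s, uint64_t phys_pc,
                             uint64_t pc, uint32_t flags, uint64_t cs_base)
{
    return s->phys_pc == phys_pc && s->pc == pc &&
           s->flags == flags && s->cs_base == cs_base;
}

/* Called for each (re)translation: reuse the record with the same key,
 * or create one. Returns NULL if the table is full. */
static DemoStats *demo_on_translate(DemoStatsTable *t, uint64_t phys_pc,
                                    uint64_t pc, uint32_t flags,
                                    uint64_t cs_base)
{
    DemoStats *s = NULL;

    for (size_t i = 0; i < t->count; i++) {
        if (demo_key_matches(&t->entries[i], phys_pc, pc, flags, cs_base)) {
            s = &t->entries[i];
            break;
        }
    }
    if (!s) {
        if (t->count == DEMO_MAX_STATS) {
            return NULL;
        }
        s = &t->entries[t->count++];
        *s = (DemoStats){ .phys_pc = phys_pc, .pc = pc,
                          .cs_base = cs_base, .flags = flags };
    }
    s->translations++;
    return s;
}
```

Translating the same key twice returns the same record with its counter incremented, which is the property that lets statistics accumulate across retranslations.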

struct TBStatistics {
    tb_page_addr_t phys_pc;
    target_ulong pc;
    uint32_t     flags;
    /* cs_base isn't included in the hash but we do check for matches */
    target_ulong cs_base;

    /* Translation stats */
    struct {
        unsigned long total;
        unsigned long uncached;
        unsigned long spanning;
        /* XXX: count invalidation? */
    } translations;

    /* Execution stats */
    struct {
        unsigned long total;
        unsigned long atomic;
    } executions;

    struct {
        unsigned num_guest_inst;
        unsigned num_host_inst;
        unsigned num_tcg_inst;
        unsigned spills;
    } code;

    /* HMP information - used for referring to previous search */
    int display_id;
};
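One way such counters could be turned into code quality figures is sketched below: the ratio of host to guest instructions and the number of spills per guest instruction. DemoCodeStats mirrors only the field names of the code sub-structure above; the helper functions and the choice of these two ratios are assumptions made for illustration, not something the project defines.

```c
/* Hypothetical copy of the "code" counters; NOT the QEMU structure. */
typedef struct {
    unsigned num_guest_inst;
    unsigned num_host_inst;
    unsigned num_tcg_inst;
    unsigned spills;
} DemoCodeStats;

/* Host instructions emitted per guest instruction.
 * Returns 0.0 when no guest instructions were counted. */
static double demo_host_per_guest(const DemoCodeStats *c)
{
    if (c->num_guest_inst == 0) {
        return 0.0;
    }
    return (double)c->num_host_inst / c->num_guest_inst;
}

/* Register spills per guest instruction.
 * Returns 0.0 when no guest instructions were counted. */
static double demo_spills_per_guest(const DemoCodeStats *c)
{
    if (c->num_guest_inst == 0) {
        return 0.0;
    }
    return (double)c->spills / c->num_guest_inst;
}
```

A block with 4 guest instructions, 10 host instructions and 2 spills would report an expansion of 2.5 host instructions per guest instruction and 0.5 spills per guest instruction; unusually high values on a hot block would flag it for closer inspection.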




Tools to more readily extract and parse code generation information would include:

Modifying code generator, dumping additional data

  • which blocks are hot (frequently run, hence more important performance-wise)
  • export block JIT information for perf tool (the later version)

Tweaking -d op,out_asm output

  • how many fills/spills in a block (where register contents are moved due to register pressure)
  • number of host instructions for each guest instruction (JIT profiling has a basic version of this)
  • elide or beautify common blocks like softmmu access macros (which are always the same)

Modifying the HMP

  • support interactive exploration of translation state (system emulation)

QEMU currently only translates simple basic blocks with one or two exit paths. This work could be a precursor to supporting Internships/ProjectIdeas/Multi-exit Hot Blocks in the future.

Details:

  • Skill level: intermediate or advanced, understanding of code generation (compilers/JITs)
  • Language: C, Assembly (x86 or preferred host)
  • Mentor: Alex Bennée <alex.bennee@linaro.org> (stsquad on #qemu IRC)
  • Suggested by: Alex Bennée