Features/CoroutineFnCleanup: Difference between revisions

From QEMU
Line 65: Line 65:


The query language is based on regular expressions:
The query language is based on regular expressions:
;<tt>[label1,label2]</tt>
;<tt>[label1]</tt>
:Find function with all the specified <tt>__attribute((annotate))</tt> annotations
:Find function with the specified <tt>__attribute((annotate))</tt> annotation
:Labels can be replaced with <tt>/regex/</tt> (but it's less efficient)
;<tt>funcname</tt>, <tt>/regex/</tt>
;<tt>![label1|label2]</tt>
:Find functions with the given name or matching the regex.
:Find function with none of the specified <tt>__attribute((annotate))</tt> annotations
;<tt>funcname</tt>
:Find functions with the given name
;<tt>...</tt>
;<tt>...</tt>
:Zero or more functions (clearer alternative to <tt>[]*</tt>; same as <tt>.*</tt> in grep)
:Zero or more functions (clearer alternative to <tt>[]*</tt>; same as <tt>.*</tt> in grep)
Line 78: Line 75:


Anything except <tt>...</tt> can be followed by a star to mean 0 or more occurrences.
Anything except <tt>...</tt> can be followed by a star to mean 0 or more occurrences.
<tt>[label1]</tt> is actually its own mini-language to query individual nodes:
;<tt>[label1,label2]</tt>
:Find function with all the specified <tt>__attribute((annotate))</tt> annotations
:Labels can be replaced with <tt>/regex/</tt> (but it's less efficient)
;<tt>![label1|label2]</tt>
:Find function with none of the specified <tt>__attribute((annotate))</tt> annotations
;<tt>["func1"]</tt>, <tt>[/regex/]</tt>
:Match on function name, for example <tt>[coroutine_fn, !"qemu_coroutine_yield"]</tt>.


Example of useful queries:
Example of useful queries:
Line 93: Line 100:
:Find mixed functions called from functions that are not coroutine_fn.  Any coroutine_mixed_fn functions not on the list are candidates for changing to coroutine_fn
:Find mixed functions called from functions that are not coroutine_fn.  Any coroutine_mixed_fn functions not on the list are candidates for changing to coroutine_fn


While the query language currently can only be used from the command line, it is possible to construct the AST from a Python program and therefore build static analysis that can be run at "make check" time.  (That said, the parsing is a bit slow due to the use of Python.  Converting the clang visitor to Cython is a possible future idea).
While the query language currently can only be used from the command line, it is possible to construct the AST from a Python program and therefore build static analysis that can be run at "make check" time.


(An idea for extension is to add "functions" so that coroutine candidates could be written '''<tt>[!coroutine_fn,!coroutine_mixed_fn,coroutine_fn.callee,![!coroutine_fn.callee]]</tt>''')
An idea for extension is to add "functions" so that coroutine candidates could be written '''<tt>[!coroutine_fn,!coroutine_mixed_fn,coroutine_fn.callee,![!coroutine_fn.callee]]</tt>'''

Revision as of 16:49, 29 November 2022

Hand-written wrappers for coroutine functions

  1. bdrv_pread/bdrv_pwrite have different calling convention than bdrv_co_pread/bdrv_co_pwrite [Alberto]
    • Fix using Coccinelle
    • Switch bdrv_pread/bdrv_pwrite to generated_co_wrapper
  2. Missing bdrv_co_pwrite_sync, bdrv_co_pwrite_zeroes
    • Adjust code in block/io.c and introduce generated_co_wrapper
  3. Use generated_co_wrapper for block-backend.c too, removing the need for declarations in block/coroutines.h [Alberto]
  4. Probably more will be found with custom matching tools (see below)

Converting callbacks to coroutines

Conversion of GLOBAL_STATE_CODE() callbacks to coroutines depends on removal of the AioContext lock:

  • from coroutine context, IO_OR_GS_CODE() functions such as "drain_begin" have to be called with the AioContext lock taken
  • to call them with the lock taken, the code has to run in the BDS's home iothread...
  • ... and that is not possible for GLOBAL_STATE_CODE() functions unless the BDS home iothread is the main thread (which cannot be guaranteed)

IO_CODE() callbacks can be converted to coroutines. Even for functions may be GLOBAL_STATE_CODE(), the underlying callbacks can be an IO_CODE() coroutine_fn. For example:

  • bdrv_inactivate, which does I/O, can be converted to a coroutine_fn callback. It is only implemented by qcow2, so it can be called from bdrv_close instead of qcow2_close. The call in bdrv_inactivate_recurse would go through a generated_co_wrapper in block/coroutines.h, e.g. bdrv_do_inactivate_one.
  • It should be possible to convert the bdrv_snapshot_goto callback too. Likewise, bdrv_snapshot_goto would call a generated_co_wrapper, e.g. bdrv_do_snapshot_goto.

Annotations

  1. Add coroutine_mixed_fn annotation; functions that call qemu_in_coroutine(). Leaves are:
    • all generated_co_wrapper functions
    • bdrv_create
    • bdrv_can_store_new_dirty_bitmap, bdrv_remove_persistent_dirty_bitmap
    • bdrv_drain_all_begin, bdrv_do_drained_begin, bdrv_do_drained_end
    • qio_channel_readv_full_all_eof, qio_channel_writev_full
    • qcow2_open, qcow2_has_zero_init
    • bdrv_qed_open
    • qmp_dispatch
    • nvme_get_free_req
    • schedule_next_request
  2. Detect call from non-coroutine_fn/coroutine_mixed_fn to coroutine_fn (libclang/clang-tidy)
  3. Detect call from coroutine_fn to generated_co_wrapper (libclang/clang-tidy)
    • Use _co_ version instead.
    • Possibly add fixit to rewrite
  4. Detect call from non-coroutine_fn to coroutine_mixed_fn (libclang/clang-tidy)
    • Change caller to coroutine_mixed_fn too
    • If there is none, coroutine_mixed_fn can be changed to coroutine_fn

Clang-query examples

Invoking clang-query:

  1. Change definition of coroutine_fn to #define coroutine_fn __attribute__((annotate("coroutine")))
  2. Invoke clang-query -p build-path block/io.c

Example of clang-query uses:

match functionDecl(allOf(unless(hasAttr("attr::Annotate")),hasDescendant(callExpr(callee(hasAttr("attr::Annotate"))).bind("callee"))))
call from non-coroutine_fn to coroutine_fn (doesn't actually check the string in the annotate attribute)
match varDecl(hasType(namedDecl(hasName("BlockDriver"))), hasInitializer(anything()))
declaration of BlockDriver struct, with initializer
match functionDecl(isExpansionInMainFile(), forEachDescendant(callExpr(callee(unless(functionDecl()))).bind("call"))).bind("root")'
indirect call

Clang references

https://devblogs.microsoft.com/cppblog/exploring-clang-tooling-part-1-extending-clang-tidy/

https://devblogs.microsoft.com/cppblog/exploring-clang-tooling-part-2-examining-the-clang-ast-with-clang-query/

https://devblogs.microsoft.com/cppblog/exploring-clang-tooling-part-3-rewriting-code-with-clang-tidy/

VRC / path search

VRC is a tool to query a call graph. The call graph can be obtained from either GCC debug dumps (-fdump-rtl-expand) or clang (experimental).

The query language is based on regular expressions:

[label1]
Find function with the specified __attribute((annotate)) annotation
funcname, /regex/
Find functions with the given name or matching the regex.
...
Zero or more functions (clearer alternative to []*; same as .* in grep)
( ... | ... )
Alternation

Anything except ... can be followed by a star to mean 0 or more occurrences.

[label1] is actually its own mini-language to query individual nodes:

[label1,label2]
Find function with all the specified __attribute((annotate)) annotations
Labels can be replaced with /regex/ (but it's less efficient)
![label1|label2]
Find function with none of the specified __attribute((annotate)) annotations
["func1"], [/regex/]
Match on function name, for example [coroutine_fn, !"qemu_coroutine_yield"].

Example of useful queries:

paths ![coroutine_fn|coroutine_mixed_fn] [coroutine_fn]
Find incorrect calls to coroutine_fn
paths [coroutine_fn] [no_coroutine_fn]
Find incorrect calls to no_coroutine_fn
paths [coroutine_fn] ![coroutine_fn|coroutine_mixed_fn]
paths ![coroutine_fn] ![coroutine_fn|coroutine_mixed_fn]
Find functions called only from functions that are coroutine_fn, and respectively not coroutine_fn. Functions only found in the first set are candidates for labeling as coroutine_fn.
paths [coroutine_fn_candidate] ... qemu_coroutine_yield
Find coroutine_fn candidates (see previous query) that ultimately suspend. These are slam-dunk candidates for labeling as coroutine_fn.
paths ![coroutine_fn] [coroutine_mixed_fn]
Find mixed functions called from functions that are not coroutine_fn. Any coroutine_mixed_fn functions not on the list are candidates for changing to coroutine_fn

While the query language currently can only be used from the command line, it is possible to construct the AST from a Python program and therefore build static analysis that can be run at "make check" time.

An idea for extension is to add "functions" so that coroutine candidates could be written [!coroutine_fn,!coroutine_mixed_fn,coroutine_fn.callee,![!coroutine_fn.callee]]