Requirements/GatingCI

From QEMU

Revision as of 15:26, 11 November 2019

Problem Statement

A gating CI is a prerequisite to having a multi-maintainer model of merging. With a common set of tests that run prior to every merge, we no longer depend on whoever is currently doing merging duties having access to the current set of test machines.

Currently pre-merge testing is done by ad-hoc shell scripts, run on a set of machines using personal accounts of the overall maintainer. We want to replace this ad-hoc system with one which:

  • uses only machines that are usable with generic project role accounts
  • uses a known and maintainable CI system (e.g. GitLab) rather than hand-hacked scripts
  • can be handed over to another person to handle releases

Current Tests

This section describes the current ad-hoc setup. It isn't intended to imply that we want to necessarily carry over all of these tests and host types.

The scripts are kept in:

https://git.linaro.org/people/peter.maydell/misc-scripts.git/tree

though they are best treated as a reference for what we currently do rather than used as a base for anything.

The set of machines I currently test on is:

  • an S390x box (this is provided to the project by IBM's Community Cloud so can be used for the new CI setup)
  • aarch32 (as a chroot on an aarch64 system)
  • aarch64
  • ppc64 (on the GCC compile farm)
  • OSX
  • Windows crossbuilds
  • NetBSD, FreeBSD and OpenBSD using the tests/vm VMs
  • x86-64 Linux with a variety of different build configs (see the 'remake-merge-builds' script for how these are set up)

I also have access to a SPARC box but am not currently testing with it as there are hangs which I did not have time to investigate.

Testing process:

  • I get an email which is a pull request, and I run the "apply-pullreq" script, which takes the git URL and tag/branch name to test.
  • apply-pullreq performs the merge into a 'staging' branch
  • apply-pullreq also performs some simple local tests:
    • does 'git verify-tag' accept the GPG signature?
    • are we trying to apply the pull before reopening the dev tree for a new release?
    • does the pull include commits with invalid UTF-8 or bogus qemu-devel email addresses?
    • submodule updates are only allowed if the --submodule-ok option was specifically passed
  • apply-pullreq then invokes parallel-buildtest to do the actual testing
  • parallel-buildtest is a trivial wrapper around GNU Parallel which invokes 'mergebuild' on each of the test machines
  • if all is OK then the user gets to do the 'git push' to push the staging branch to master
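
As an illustration of the kind of local check apply-pullreq performs on commits, here is a minimal Python sketch. It is not taken from the real script: the function name `commit_problems` is hypothetical, and it assumes that a "bogus qemu-devel email address" means an author line carrying the mailing list's own address rather than the contributor's.

```python
# Assumption: a "bogus" address is the list address appearing as commit author.
LIST_ADDRESS = b"qemu-devel@nongnu.org"


def commit_problems(author_line: bytes, message: bytes) -> list:
    """Return human-readable problems found in one commit (empty if none)."""
    problems = []
    # Both fields must decode cleanly as UTF-8.
    for name, raw in (("author", author_line), ("message", message)):
        try:
            raw.decode("utf-8")
        except UnicodeDecodeError:
            problems.append(f"{name} is not valid UTF-8")
    # Flag commits whose author address is the mailing list itself.
    if LIST_ADDRESS in author_line.lower():
        problems.append("author uses the mailing list address")
    return problems
```

A merge script could refuse to continue if any commit in the pull request yields a non-empty list.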

In almost all cases 'mergebuild' is simply "run 'make -C build' and then 'make -C build check'". The exceptions are:

  • the Windows crossbuilds don't try to run 'make check'
  • the x86-64 host runs the 'pull-buildtest' script, which:
    • does make/make check for multiple configs
    • includes one build from 'make clean' (almost everything else does an incremental build)
    • runs 'make check-tcg' on the all-linux-static config
    • runs a trivial set of 'ls' binaries for a bunch of linux-user guests (this is probably mostly redundant now we have check-tcg)
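
The common case of 'mergebuild' described above can be sketched in Python as follows. This is an illustrative sketch, not the real script: the names `mergebuild_commands`, `run_mergebuild` and the `"windows-cross"` label are hypothetical, and the x86-64 'pull-buildtest' case is not covered.

```python
import subprocess


def mergebuild_commands(host_kind, build_dir="build"):
    """Return the make invocations for one test host, as argv lists."""
    commands = [["make", "-C", build_dir]]
    # The Windows crossbuilds build only; they do not try 'make check'.
    if host_kind != "windows-cross":
        commands.append(["make", "-C", build_dir, "check"])
    return commands


def run_mergebuild(host_kind, build_dir="build", run=subprocess.call):
    """Run each command in order, stopping at the first non-zero exit status."""
    for argv in mergebuild_commands(host_kind, build_dir):
        status = run(argv)
        if status != 0:
            return status
    return 0
```

Passing `run` as a parameter keeps the sequencing testable without invoking make.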

The parallel-buildtest script causes GNU parallel to print a series of lines with the logfiles containing the captured stdout/stderr from each machine. I run the output of those through the 'greplogs' script which looks for things that look like error messages. This is intended to capture warnings/errors which didn't manage to cause the make process to return failure for one reason or another.
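
The idea behind 'greplogs' can be sketched like this in Python. The pattern below is an assumption chosen for illustration; the real script may look for different text.

```python
import re

# Assumed pattern: words that commonly mark a problem in build or test logs.
SUSPICIOUS = re.compile(r"\b(error|warning|fail(ed|ure)?)\b", re.IGNORECASE)


def suspicious_lines(log_text):
    """Return (line_number, line) pairs for lines that look like errors."""
    return [
        (number, line)
        for number, line in enumerate(log_text.splitlines(), start=1)
        if SUSPICIOUS.search(line)
    ]
```

A scan like this acts as a second check on top of the make exit status: a non-empty result flags a log for a human to read even when make reported success.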

Sketch of 'phase one' Solution

To keep the scope of the initial implementation constrained the plan is:

  • identifying pull request emails, performing the actual git merge, and pushing the resulting staging branch to a public location should remain manual (or locally shell-scripted) tasks initially
  • the CI should have some mechanism for "start CI on this git repo + branch-or-tag"
  • that mechanism being scriptable and returning a success/failure code is preferable but not essential in phase one
  • the CI should have a web UI for looking at current status, logs from failed tests, etc
  • pushing the staging branch to master on success should also be locally scripted

(This roughly corresponds to "start by doing just the parts handled by 'parallel-buildtest' in the existing scripts".)
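
One possible shape for a scriptable "start CI on this git repo + branch-or-tag" mechanism that returns a success/failure code is sketched below in Python. Everything here is hypothetical: `start_job` and `get_status` stand in for whatever the chosen CI system provides, and the status strings `"success"` and `"failed"` are assumptions.

```python
import time


def run_ci(repo_url, ref, start_job, get_status,
           poll_seconds=30.0, sleep=time.sleep):
    """Start a CI job for repo_url at ref and wait for a final result.

    Returns 0 on success and 1 on failure, so the result can be used
    directly as a process exit code by a local merge script.
    """
    job = start_job(repo_url, ref)
    while True:
        status = get_status(job)
        if status == "success":
            return 0
        if status == "failed":
            return 1
        # Any other status is treated as "still in progress".
        sleep(poll_seconds)
```

A local script could then push the staging branch to master only when `run_ci` returns 0.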

The initial load here is relatively low as it will only be doing tests of merge builds, and we don't have very many of these. (We will likely want to scale up later, though.)

Ideas for later phases

These are described mostly for context and to give an idea of where we want to go in future.

Folding into patchew

Today we have a CI setup which tests all patchsets sent to the qemu-devel mailing list. This is handled by a robot called 'patchew', which parses emails, creates git trees with the patches applied and dispatches to a variety of platforms to do make/make-check style tests, with feedback via both web UI and email. This obviously has significant overlap with the CI we're doing for merge requests. The suggestion is that we could get rid of the half of patchew which is implementing "dispatch to CI job runners for testing and get back logs and pass/fail indication". The "parse emails and create git trees to be tested" and "web UI" parts would remain, but instead of doing its own dispatch and CI runners it would just invoke the same CI setup as the merge tests.

Automated identification of pull request emails

This is probably most easily done via patchew. We could automate more of the merge request workflow so that tests are kicked off automatically when a merge request email is sent to the list, rather than requiring a human to do this. There would then need to be some way to tell the automated system "ok, actually push that staging branch to master", as we still want a human to eyeball each merge first, especially during releases.

Automated and decentralised application of merges

My personal preference for where we finally end up is to have something similar to the Rust community's automation, where testing of potential merges is automatic, and the pushing of a successful merge to master is done via comments in the web UI. We could have a setup where, for instance, an ack by any two other devs who've successfully submitted merges in the last 4 months is sufficient to permit a merge. (Criteria for acceptance could perhaps be tightened during release freezes.)
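
The example acceptance rule above (acks from any two other devs with a successful merge in the last 4 months) could be expressed as a small Python predicate. This is only a sketch of one possible policy: the function name, the data shapes, and the approximation of "4 months" as 120 days are all assumptions.

```python
from datetime import datetime, timedelta


def merge_permitted(submitter, acks, last_successful_merge, now,
                    window=timedelta(days=120), required=2):
    """Return True if enough qualified developers have acked the merge.

    last_successful_merge maps developer name to the datetime of their
    most recent successful merge. Acks from the submitter do not count.
    """
    qualified = {
        dev
        for dev in acks
        if dev != submitter
        and dev in last_successful_merge
        and now - last_successful_merge[dev] <= window
    }
    return len(qualified) >= required
```

Tightening the criteria during a release freeze would then be a matter of passing a larger `required` or a shorter `window`.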