* [RFC] QEMU Gating CI
@ 2019-12-02 14:05 Cleber Rosa
  2019-12-02 17:00 ` Stefan Hajnoczi
  ` (3 more replies)
  0 siblings, 4 replies; 22+ messages in thread
From: Cleber Rosa @ 2019-12-02 14:05 UTC (permalink / raw)
To: qemu-devel, Peter Maydell, Alex Bennée
Cc: Ademar Reis, Jeff Nelson, Stefan Hajnoczi, Wainer dos Santos Moschetta, Markus Armbruster

RFC: QEMU Gating CI
===================

This RFC attempts to address most of the issues described in
"Requirements/GatingCI"[1]. Another relevant write-up is "State of
QEMU CI as we enter 4.0"[2].

The general approach is to minimize the infrastructure maintenance
and development burden, leveraging as much as possible "other
people's" infrastructure and code. GitLab's CI/CD platform is the
most relevant component dealt with here.

Problem Statement
-----------------

The following is copied verbatim from Peter Maydell's write-up[1]:

"A gating CI is a prerequisite to having a multi-maintainer model of
merging. By having a common set of tests that are run prior to a merge
you do not rely on who is currently doing merging duties having access
to the current set of test machines."

This is a very simplified view of the problem that I'd like to break
down even further into the following key points:

* Common set of tests
* Pre-merge ("prior to a merge")
* Access to the current set of test machines
* Multi-maintainer model

Common set of tests
~~~~~~~~~~~~~~~~~~~

Before we delve any further, let's make it clear that a "common set of
tests" is really a "dynamic common set of tests". My point is that a
set of tests in QEMU may include or exclude different tests depending
on the environment.
The exact tests that will be executed may differ depending on the
environment, including:

* Hardware
* Operating system
* Build configuration
* Environment variables

In "State of QEMU CI as we enter 4.0", Alex Bennée listed some of
those "common sets of tests":

* check
* check-tcg
* check-softfloat
* check-block
* check-acceptance

While Peter mentions that most of his checks are limited to:

* check
* check-tcg

Our current inability to quickly identify a faulty test from test
execution results (and especially in remote environments), and act
upon it (say, quickly disable it on a given host platform), makes me
believe that it's fair to start a gating CI implementation that uses
this rather coarse granularity.

Another benefit is a close or even a 1:1 relationship between a common
test set and an entry in the CI configuration. For instance, the
"check" common test set would map to a "make check" command in a
"script:" YAML entry.

To exemplify my point, if one specific test run as part of "check-tcg"
is found to be faulty on a specific job (say on a specific OS), the
entire "check-tcg" test set may be disabled as a CI-level maintenance
action. Of course a follow-up action to deal with the specific test
is required, probably in the form of a Launchpad bug and patches
dealing with the issue, but without necessarily a CI-related angle to
it.

If/when test result presentation and control mechanisms evolve, we may
feel confident enough to move to finer granularity. For instance, a
mechanism for disabling nothing but "tests/migration-test" on a given
environment would be possible and desirable from a CI management
level.

Pre-merge
~~~~~~~~~

The natural way to have pre-merge CI jobs in GitLab is to send "Merge
Requests"[3] (abbreviated as "MR" from now on). In most projects, a
MR comes from individual contributors, usually the authors of the
changes themselves.
It's my understanding that the current
maintainer model employed in QEMU will *not* change at this time,
meaning that code contributions and reviews will continue to happen on
the mailing list. A maintainer then, having collected a number of
patches, would submit a MR either in addition to or instead of the
Pull Requests sent to the mailing list.

"Pipelines for Merged Results"[4] is a very important feature to
support the multi-maintainer model, and looks, in practice, similar to
Peter's "staging" branch approach, with an "automatic refresh" of the
target branch. It can give a maintainer extra confidence that a MR
will play nicely with the updated status of the target branch. It's
my understanding that it should be the "key to the gates". A minor
note is that conflicts are still possible in a multi-maintainer model
if there is more than one person doing the merges.

A worthy point is that the GitLab web UI is not the only way to create
a Merge Request: a rich set of APIs is also available[5]. This is
interesting for many reasons, and maybe some of Peter's
"apply-pullreq"[6] actions (such as the checks for bad UTF8 or bogus
qemu-devel email addresses) could be performed earlier, as part of a
"send-mergereq"-like script, bringing conformance checks earlier in
the merge process, at the MR creation stage.

Note: It's possible to have CI job definitions that are specific to
MRs, allowing generic non-MR jobs to be kept on the default
configuration. This can be used so individual contributors continue
to leverage some of the "free" (shared) runners made available on
gitlab.com.

Multi-maintainer model
~~~~~~~~~~~~~~~~~~~~~~

The previous section already introduced some of the proposed workflow
that can enable such a multi-maintainer model. With a Gating CI
system, though, it will be natural to have a smaller "Mean time
between (CI) failures", simply because of the expected increased
number of systems and checks. A lot of countermeasures have to be
employed to keep that MTBF in check.
For one, it's imperative that the maintainers for such systems and
jobs are clearly defined and readily accessible. Either the same
MAINTAINERS file or a more suitable variation of such data should be
defined before activating the *gating* rules. This would allow
requests for attention to be routed to the responsible maintainer.

In case of unresponsive maintainers, or any other condition that
causes one or more CI jobs to keep failing for a previously
established amount of time, the job can be demoted with an
"allow_failure" configuration[7]. Once such a change is committed,
the path to promotion would be just the same as for a newly added job
definition.

Note: In a future phase we can evaluate the creation of rules that
look at the paths changed in a MR (similar to "F:" entries in
MAINTAINERS) and trigger specific CI jobs that are the responsibility
of a given maintainer[8].

Access to the current set of test machines
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

When compared to the various CI systems and services already being
employed in the QEMU project, this is the most striking difference in
the proposed model. Instead of relying on shared/public/free
resources, this proposal also deals with privately owned and operated
machines.

Even though the QEMU project operates with great cooperation, it's
crucial to define clear boundaries when it comes to machine access.
Restricted access to machines is important because:

* The results of jobs are often directly tied to the setup and status
  of machines. Even "soft" changes such as removing or updating
  packages can introduce failures in jobs (this is greatly minimized
  but not completely eliminated when using containers or VMs).
  Updating firmware or changing its settings are also examples of
  changes that may change the outcome of jobs.

* If maintainers are to be held accountable for the status of the
  jobs defined to run on specific machines, they must be sure of the
  status of those machines.
* Machines need regular monitoring and will receive required
  maintenance actions which can cause job regressions.

Thus, there needs to be one clear way for machines to be *used* for
running jobs sent by different maintainers, while still prohibiting
any other privileged action that can cause permanent change to the
machine. The GitLab agent (gitlab-runner) is designed to do just
that, and defining what will be executed in a job (in a given system)
should be all that's generally allowed. The job definition itself
will, of course, be subject to code review before a maintainer decides
to send a MR containing such new or updated job definitions.

Still related to machine maintenance, it's highly desirable for jobs
tied to specific host machines to be introduced alongside
documentation and/or scripts that can replicate the machine setup. If
the machine setup steps can be easily and reliably reproduced, then:

* Other people may help to debug issues and regressions if they
  happen to have the same hardware available

* Other people may provide more machines to run the same types of
  jobs

* If a machine maintainer goes MIA, it'd be easier to find another
  maintainer

GitLab Jobs and Pipelines
-------------------------

GitLab CI is built around two major concepts: jobs and pipelines. The
current GitLab CI configuration in QEMU uses jobs only (or, putting it
another way, all jobs in a single pipeline stage).
Consider the following job definition[9]:

build-tci:
  script:
  - TARGETS="aarch64 alpha arm hppa m68k microblaze moxie ppc64 s390x x86_64"
  - ./configure --enable-tcg-interpreter
      --target-list="$(for tg in $TARGETS; do echo -n ${tg}'-softmmu '; done)"
  - make -j2
  - make tests/boot-serial-test tests/cdrom-test tests/pxe-test
  - for tg in $TARGETS ; do
      export QTEST_QEMU_BINARY="${tg}-softmmu/qemu-system-${tg}" ;
      ./tests/boot-serial-test || exit 1 ;
      ./tests/cdrom-test || exit 1 ;
    done
  - QTEST_QEMU_BINARY="x86_64-softmmu/qemu-system-x86_64" ./tests/pxe-test
  - QTEST_QEMU_BINARY="s390x-softmmu/qemu-system-s390x" ./tests/pxe-test -m slow

All the lines under "script" are performed sequentially. It should be
clear that there's the possibility of breaking this down into multiple
stages, so that a build happens first, and then the "common set of
tests" run in parallel. Using the example above, it would look
something like:

 +---------------+------------------------+
 |  BUILD STAGE  |       TEST STAGE       |
 +---------------+------------------------+
 |   +-------+   |  +------------------+  |
 |   | build |   |  | boot-serial-test |  |
 |   +-------+   |  +------------------+  |
 |               |                        |
 |               |  +------------------+  |
 |               |  |    cdrom-test    |  |
 |               |  +------------------+  |
 |               |                        |
 |               |  +------------------+  |
 |               |  |  x86_64-pxe-test |  |
 |               |  +------------------+  |
 |               |                        |
 |               |  +------------------+  |
 |               |  |  s390x-pxe-test  |  |
 |               |  +------------------+  |
 |               |                        |
 +---------------+------------------------+

Of course it would be silly to break down that job into smaller jobs
that would run individual tests like "boot-serial-test" or
"cdrom-test".
Still, the pipeline approach is valid because:

* Common sets of tests would run in parallel, giving a quicker result
  turnaround

* It's easier to determine the possible nature of the problem with
  just the basic CI job status

* Different maintainers could be defined for different "common sets
  of tests", and again, by leveraging the basic CI job status,
  automation for directed notification can be implemented

In the following example, "check-block" maintainers could be left
undisturbed by failures in the "check-acceptance" job:

 +---------------+------------------------+
 |  BUILD STAGE  |       TEST STAGE       |
 +---------------+------------------------+
 |   +-------+   |  +------------------+  |
 |   | build |   |  |   check-block    |  |
 |   +-------+   |  +------------------+  |
 |               |                        |
 |               |  +------------------+  |
 |               |  | check-acceptance |  |
 |               |  +------------------+  |
 |               |                        |
 +---------------+------------------------+

The same logic applies to test sets for different targets. For
instance, combining the two previous examples, there could be
different maintainers defined for the different jobs on the test
stage:

 +---------------+------------------------+
 |  BUILD STAGE  |       TEST STAGE       |
 +---------------+------------------------+
 |   +-------+   |  +------------------+  |
 |   | build |   |  |   x86_64-block   |  |
 |   +-------+   |  +------------------+  |
 |               |                        |
 |               |  +------------------+  |
 |               |  | x86_64-acceptance|  |
 |               |  +------------------+  |
 |               |                        |
 |               |  +------------------+  |
 |               |  |   s390x-block    |  |
 |               |  +------------------+  |
 |               |                        |
 |               |  +------------------+  |
 |               |  | s390x-acceptance |  |
 |               |  +------------------+  |
 +---------------+------------------------+

Current limitations for a multi-stage pipeline
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Because it's assumed that each job will happen in an isolated and
independent execution environment, jobs must explicitly define the
resources that will be shared between stages. GitLab will make sure
the same source code revision will be available on all jobs
automatically.
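A multi-stage layout like the ones pictured above can be expressed
directly in ".gitlab-ci.yml" with the standard "stages:" and "stage:"
keywords. The following is only a minimal sketch, not a proposed final
configuration; the job bodies are abbreviated placeholders:

```yaml
# Minimal sketch of a two-stage pipeline: one build job, then test
# jobs that run in parallel once the build stage has succeeded.
stages:
  - build
  - test

build:
  stage: build
  script:
    - ./configure
    - make -j2

check-block:
  stage: test
  script:
    - make check-block

check-acceptance:
  stage: test
  script:
    - make check-acceptance
```

Jobs in the "test" stage only start once all jobs in the "build" stage
have finished successfully, which gives the parallel test execution
shown in the diagrams.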
Additionally, GitLab supports the concept of artifacts. By defining
artifacts in the "build" stage, jobs in the "test" stage can expect to
have a copy of those artifacts automatically.

In theory, there's nothing that prevents an entire QEMU build
directory from being treated as an artifact. In practice, there are
predefined limits on GitLab that prevent that from being possible,
resulting in errors such as:

 Uploading artifacts...
 build: found 3164 matching files
 ERROR: Uploading artifacts to coordinator... too large archive
        id=xxxxxxx responseStatus=413 Request Entity Too Large
        status=413 Request Entity Too Large token=yyyyyyyyy
 FATAL: too large
 ERROR: Job failed: exit code 1

As far as I can tell, this is an instance-defined limit that's clearly
influenced by storage costs. I see a few possible solutions to this
limitation:

1) Provide our own "artifact"-like solution that uses our own storage
   solution

2) Reduce or eliminate the dependency on a complete build tree

The first solution can go against the general trend of not having to
maintain CI infrastructure. It could be made simpler by using cloud
storage, but there would still be some interaction with another
external infrastructure component.

I find the second solution preferable, given that most tests depend on
having one or a few binaries available. I've run multi-stage
pipelines with some of those binaries (qemu-img,
$target-softmmu/qemu-system-$target) defined as artifacts and they
behaved as expected. But this could require some intrusive changes to
the current "make"-based test invocation.

Job Naming convention
---------------------

Based only on the very simple example jobs above, it should already be
clear that there's a lot of potential for confusion and chaos. For
instance, by looking at the "build" job definition or results, it's
very hard to tell what it's really about. A bit more could be
inferred from the "x86_64-block" job name.
Still, the problem we have to address here is not only the amount of
information easily obtained from a job name, but also how to allow for
very similar job definitions within a global namespace. For instance,
if we add an Operating System component to the mix, we need an extra
qualifier for unique job names.

Some of the possible components in a job definition are:

* Stage
* Build profile
* Test set (a shorter name for what was described in the "Common set
  of tests" section)
* Host architecture
* Target architecture
* Host Operating System identification (name and version)
* Execution mode/environment (bare metal, container, VM, etc)

Stage
~~~~~

The stage of a job (which maps roughly to its purpose) should be
clearly defined. A job that builds QEMU should start with "build" and
a job that tests QEMU should start with "test".

IMO, in a second phase, once multi-stage pipelines are taken for
granted, we could evaluate dropping this component altogether from the
naming convention, and relying purely on the stage classification.

Build profile
~~~~~~~~~~~~~

Different build profiles already abound in QEMU's various CI
configuration files. It's hard to put a naming convention here,
except that it should represent the most distinguishable
characteristics of the build configuration. For instance, we can find
a "build-disabled" job in the current ".gitlab-ci.yml" file that is
aptly named, as it forcefully disables a lot of build options.

Test set
~~~~~~~~

As mentioned in the "Common set of tests" section, I believe that the
make target name can be used to identify the test set that will be
executed in a job. That is, if a job is to be run at the "test"
stage, and will run "make check", its name should start with
"test-check".

QEMU Targets
~~~~~~~~~~~~

Because a given job could, and usually does, involve multiple targets,
I honestly can not think of how to add this to the naming convention.
I'll ignore it for now, and consider the targets to be defined in the
build profile.
Host Architecture
~~~~~~~~~~~~~~~~~

The host architecture naming convention should be an easy pick, given
that QEMU itself employs an architecture naming convention for its
targets.

Host OS
~~~~~~~

The suggestion I have for the host OS name is to follow the
libosinfo[10] convention as closely as possible. libosinfo's "Short
ID" should be well suited here. Examples include: "openbsd4.2",
"opensuse42.3", "rhel8.0", "ubuntu9.10" and "win2k12r2".

Execution Environment
~~~~~~~~~~~~~~~~~~~~~

Distinguishing between running tests in a bare-metal versus a nested
VM environment is quite significant to a number of people. Still, I
think it could probably be optional for the initial implementation
phase, like the naming convention for the QEMU Targets.

Example 1
~~~~~~~~~

Defining a job that will build QEMU with common debug options, on a
RHEL 8.0 system on a x86_64 host:

build-debug-rhel8.0-x86_64:
  script:
  - ./configure --enable-debug
  - make

Example 2
~~~~~~~~~

Defining a job that will run the "qtest" test set on a NetBSD 8.1
system on an aarch64 host:

test-qtest-netbsd8.1-aarch64:
  script:
  - make check-qtest

Job and Machine Scheduling
--------------------------

While the naming convention gives some information to human beings,
and hopefully allows for some order and avoids collisions in the
global job namespace, it's not enough to define where those jobs
should run.

Tags[11] are the available mechanism to tie jobs to specific machines
running the GitLab CI agent, "gitlab-runner". Unfortunately, some
duplication seems unavoidable, in the sense that some of the naming
components listed above are machine specific, and will then also need
to be given as tags.

Note: it may be a good idea to be extra verbose with tags, by having a
qualifier prefix. The justification is that tags also live in a
global namespace, and in theory, at a given point, tags of different
"categories", say a CPU name and an Operating System name, may
collide. Or, it may just be me being paranoid.
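For illustration, a job using such qualifier prefixes on its tags
might look like the following sketch (the "os:" and "arch:" prefixes
are hypothetical, not an established convention):

```yaml
# Hypothetical variant using prefixed tags so that, e.g., an OS name
# can never collide with a CPU name in the global tag namespace.
build-debug-rhel8.0-x86_64:
  tags:
  - os:rhel8.0
  - arch:x86_64
  script:
  - ./configure --enable-debug
  - make
```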
Example 1
~~~~~~~~~

build-debug-rhel8.0-x86_64:
  tags:
  - rhel8.0
  - x86_64
  script:
  - ./configure --enable-debug
  - make

Example 2
~~~~~~~~~

test-qtest-netbsd8.1-aarch64:
  tags:
  - netbsd8.1
  - aarch64
  script:
  - make check-qtest

Operating System definition versus Container Images
---------------------------------------------------

In the previous section and examples, we're assuming that tests will
run on machines that have registered "gitlab-runner" agents with
matching tags. The tags given at gitlab-runner registration time
would of course match the same naming convention defined earlier.
So, if one is registering a "gitlab-runner" instance on a x86_64
machine running RHEL 8.0, the tags "rhel8.0" and "x86_64" would be
given (possibly among others).

Nevertheless, most deployment scenarios will probably rely on jobs
being executed by gitlab-runner's container executor (currently
Docker-only). This means that the tags given to a job *may* drop the
tag associated with the host operating system selection, and instead
provide the ".gitlab-ci.yml" configuration directive that determines
the container image to be used.

Most jobs would probably *not* require matching host operating system
and container images, but there should still be the capability to
make it a requirement. For instance, jobs containing tests that
require the KVM accelerator in specific scenarios may require a
matching host Operating System.

Note: What was mentioned in the "Execution Environment" section under
the naming conventions is also closely related to this requirement,
that is, one may require a job to run under a container, VM or bare
metal.
Example 1
~~~~~~~~~

Build QEMU on a "rhel8.0" image hosted under the "qemuci"
organization, and require the runner to support container execution:

build-debug-rhel8.0-x86_64:
  tags:
  - x86_64
  - container
  image: qemuci/rhel8.0
  script:
  - ./configure --enable-debug
  - make

Example 2
~~~~~~~~~

Run "make check" on QEMU on a "rhel8.0" image hosted under the
"qemuci" organization, and require the runner to support container
execution and to be on a matching host:

test-check-rhel8.0-x86_64:
  tags:
  - x86_64
  - rhel8.0
  - container
  image: qemuci/rhel8.0
  script:
  - make check

Next
----

Because this document is already too long and that can be distracting,
I decided to defer many other implementation level details to a second
RFC, alongside some code. Some complementary topics that I have
prepared include:

* Container images creation, hosting and management
* Advanced pipeline definitions
  - Job dependencies
  - Artifacts
  - Results
* GitLab CI for Individual Contributors
* GitLab runner:
  - Official and Custom Binaries
  - Executors
  - Security implications
  - Helper container images for unsupported architectures
* Checklists for:
  - Preparing and documenting machine setup
  - Proposing new runners and jobs
  - Runners and jobs promotions and demotions

Of course any other topics that spur from this discussion will also be
added to the following threads.
References:
-----------

[1] https://wiki.qemu.org/Requirements/GatingCI
[2] https://lists.gnu.org/archive/html/qemu-devel/2019-03/msg04909.html
[3] https://docs.gitlab.com/ee/gitlab-basics/add-merge-request.html
[4] https://docs.gitlab.com/ee/ci/merge_request_pipelines/pipelines_for_merged_results/index.html
[5] https://docs.gitlab.com/ee/api/merge_requests.html#create-mr-pipeline
[6] https://git.linaro.org/people/peter.maydell/misc-scripts.git/tree/apply-pullreq
[7] https://docs.gitlab.com/ee/ci/yaml/README.html#allow_failure
[8] https://docs.gitlab.com/ee/ci/yaml/README.html#using-onlychanges-with-pipelines-for-merge-requests
[9] https://github.com/qemu/qemu/blob/fb2246882a2c8d7f084ebe0617e97ac78467d156/.gitlab-ci.yml#L70
[10] https://libosinfo.org/
[11] https://docs.gitlab.com/ee/ci/runners/README.html#using-tags

^ permalink raw reply	[flat|nested] 22+ messages in thread
* Re: [RFC] QEMU Gating CI
  2019-12-02 14:05 [RFC] QEMU Gating CI Cleber Rosa
@ 2019-12-02 17:00 ` Stefan Hajnoczi
  2019-12-02 17:08   ` Peter Maydell
  2019-12-02 18:12   ` Cleber Rosa
  2019-12-03 14:07 ` Alex Bennée
  ` (2 subsequent siblings)
  3 siblings, 2 replies; 22+ messages in thread
From: Stefan Hajnoczi @ 2019-12-02 17:00 UTC (permalink / raw)
To: Cleber Rosa
Cc: Peter Maydell, qemu-devel, Wainer dos Santos Moschetta,
	Markus Armbruster, Jeff Nelson, Alex Bennée, Ademar Reis

[-- Attachment #1: Type: text/plain, Size: 18244 bytes --]

On Mon, Dec 02, 2019 at 09:05:52AM -0500, Cleber Rosa wrote:
> RFC: QEMU Gating CI
> ===================

Excellent, thank you for your work on this!

> 
> This RFC attempts to address most of the issues described in
> "Requirements/GatinCI"[1]. An also relevant write up is the "State of
> QEMU CI as we enter 4.0"[2].
> 
> The general approach is one to minimize the infrastructure maintenance
> and development burden, leveraging as much as possible "other people's"
> infrastructure and code. GitLab's CI/CD platform is the most relevant
> component dealt with here.
> 
> Problem Statement
> -----------------
> 
> The following is copied verbatim from Peter Maydell's write up[1]:
> 
> "A gating CI is a prerequisite to having a multi-maintainer model of
> merging. By having a common set of tests that are run prior to a merge
> you do not rely on who is currently doing merging duties having access
> to the current set of test machines."
> 
> This is of a very simplified view of the problem that I'd like to break
> down even further into the following key points:
> 
> * Common set of tests
> * Pre-merge ("prior to a merge")
> * Access to the current set of test machines
> * Multi-maintainer model
> 
> Common set of tests
> ~~~~~~~~~~~~~~~~~~~
> 
> Before we delve any further, let's make it clear that a "common set of
> tests" is really a "dynamic common set of tests".
> My point is that a
> set of tests in QEMU may include or exclude different tests depending
> on the environment.
> 
> The exact tests that will be executed may differ depending on the
> environment, including:
> 
> * Hardware
> * Operating system
> * Build configuration
> * Environment variables
> 
> In the "State of QEMU CI as we enter 4.0" Alex Bennée listed some of
> those "common set of tests":
> 
> * check
> * check-tcg
> * check-softfloat
> * check-block
> * check-acceptance
> 
> While Peter mentions that most of his checks are limited to:
> 
> * check
> * check-tcg
> 
> Our current inability to quickly identify a faulty test from test
> execution results (and specially in remote environments), and act upon
> it (say quickly disable it on a given host platform), makes me believe
> that it's fair to start a gating CI implementation that uses this
> rather coarse granularity.
> 
> Another benefit is a close or even a 1:1 relationship between a common
> test set and an entry in the CI configuration. For instance, the
> "check" common test set would map to a "make check" command in a
> "script:" YAML entry.
> 
> To exemplify my point, if one specific test run as part of "check-tcg"
> is found to be faulty on a specific job (say on a specific OS), the
> entire "check-tcg" test set may be disabled as a CI-level maintenance
> action. Of course a follow up action to deal with the specific test
> is required, probably in the form of a Launchpad bug and patches
> dealing with the issue, but without necessarily a CI related angle to
> it.

I think this coarse level of granularity is unrealistic. We cannot
disable 99 tests because of 1 known failure. There must be a way of
disabling individual tests. You don't need to implement it yourself,
but I think this needs to be solved by someone before a gating CI can
be put into use.

It probably involves adding a "make EXCLUDE_TESTS=foo,bar check"
variable so that .gitlab-ci.yml can be modified to exclude specific
tests on certain OSes.
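A sketch of how that could look in a job definition — note that the
"EXCLUDE_TESTS" variable is hypothetical and does not exist in QEMU's
makefiles today:

```yaml
# Hypothetical: keep "make check" gating everywhere, but skip one
# known-bad test on the affected OS/job only. EXCLUDE_TESTS would
# still need to be implemented in the test harness.
test-check-rhel8.0-x86_64:
  tags:
  - rhel8.0
  - x86_64
  script:
  - make EXCLUDE_TESTS=tests/migration-test check
```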
> 
> If/when test result presentation and control mechanism evolve, we may
> feel confident and go into finer grained granularity. For instance, a
> mechanism for disabling nothing but "tests/migration-test" on a given
> environment would be possible and desirable from a CI management level.
> 
> Pre-merge
> ~~~~~~~~~
> 
> The natural way to have pre-merge CI jobs in GitLab is to send "Merge
> Requests"[3] (abbreviated as "MR" from now on). In most projects, a
> MR comes from individual contributors, usually the authors of the
> changes themselves. It's my understanding that the current maintainer
> model employed in QEMU will *not* change at this time, meaning that
> code contributions and reviews will continue to happen on the mailing
> list. A maintainer then, having collected a number of patches, would
> submit a MR either in addition or in substitution to the Pull Requests
> sent to the mailing list.
> 
> "Pipelines for Merged Results"[4] is a very important feature to
> support the multi-maintainer model, and looks in practice, similar to
> Peter's "staging" branch approach, with an "automatic refresh" of the
> target branch. It can give a maintainer extra confidence that a MR
> will play nicely with the updated status of the target branch. It's
> my understanding that it should be the "key to the gates". A minor
> note is that conflicts are still possible in a multi-maintainer model
> if there are more than one person doing the merges.

The intention is to have only 1 active maintainer at a time. The
maintainer will handle all merges for the current QEMU release and
then hand over to the next maintainer after the release has been made.

Solving the problem for multiple active maintainers is low priority at
the moment.

> A worthy point is that the GitLab web UI is not the only way to create
> a Merge Request, but a rich set of APIs are available[5].
> This is
> interesting for many reasons, and maybe some of Peter's
> "apply-pullreq"[6] actions (such as bad UTF8 or bogus qemu-devel email
> addresses checks could be made earlier) as part of a
> "send-mergereq"-like script, bringing conformance earlier on the merge
> process, at the MR creation stage.
> 
> Note: It's possible to have CI jobs definition that are specific to
> MR, allowing generic non-MR jobs to be kept on the default
> configuration. This can be used so individual contributors continue
> to leverage some of the "free" (shared) runner made available on
> gitlab.com.

I expected this section to say:

1. Maintainer sets up a personal gitlab.com account with a qemu.git
   fork.
2. Maintainer adds QEMU's CI tokens to their personal account.
3. Each time a maintainer pushes to their "staging" branch the CI
   triggers.

IMO this model is simpler than MRs because once it has been set up the
maintainer just uses git push. Why are MRs necessary?

> Multi-maintainer model
> ~~~~~~~~~~~~~~~~~~~~~~
> 
> The previous section already introduced some of the proposed workflow
> that can enable such a multi-maintainer model. With a Gating CI
> system, though, it will be natural to have a smaller "Mean time
> between (CI) failures", simply because of the expected increased
> number of systems and checks. A lot of countermeasures have to be
> employed to keep that MTBF in check.

I expect the CI to be in a state of partial failure all the time.
Previously the idea of Tier 1 and Tier 2 platforms was raised where
Tier 2 platforms can be failing without gating the CI. I think this
is reality for us. Niche host OSes and features fail and remain in
the failing state for days/weeks. The CI should be designed to run in
this mode all the time.

> For once, it's imperative that the maintainers for such systems and
> jobs are clearly defined and readily accessible.
> Either the same
> MAINTAINERS file or a more suitable variation of such data should be
> defined before activating the *gating* rules. This would allow a
> routing to request the attention of the maintainer responsible.
> 
> In case of unresposive maintainers, or any other condition that
> renders and keeps one or more CI jobs failing for a given previously
> established amount of time, the job can be demoted with an
> "allow_failure" configuration[7]. Once such a change is commited, the
> path to promotion would be just the same as in a newly added job
> definition.
> 
> Note: In a future phase we can evaluate the creation of rules that
> look at changed paths in a MR (similar to "F:" entries on MAINTAINERS)
> and the execution of specific CI jobs, which would be the
> responsibility of a given maintainer[8].
> 
> Access to the current set of test machines
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> 
> When compared to the various CI systems and services already being
> employed in the QEMU project, this is the most striking difference in
> the proposed model. Instead of relying on shared/public/free
> resources, this proposal also deals with privately owned and
> operated machines.
> 
> Even though the QEMU project operates with great cooperation, it's
> crucial to define clear boundaries when it comes to machine access.
> Restricted access to machines is important because:
> 
> * The results of jobs are many times directly tied to the setup and
>   status of machines. Even "soft" changes such as removing or updating
>   packages can introduce failures in jobs (this is greatly minimized
>   but not completely eliminated when using containers or VMs).
>   Updating firmware or changing its settings are also examples of
>   changes that may change the outcome of jobs.
> 
> * If maintainers will be accounted for the status of the jobs defined
>   to run on specific machines, they must be sure of the status of the
>   machines.
> 
> * Machines need regular monitoring and will receive required
>   maintainance actions which can cause job regressions.
> 
> Thus, there needs to be one clear way for machines to be *used* for
> running jobs sent by different maintainers, while still prohibiting
> any other privileged action that can cause permanent change to the
> machine. The GitLab agent (gitlab-runner) is designed to do just
> that, and defining what will be excuted in a job (in a given system)
> should be all that's generally allowed. The job definition itself,
> will of course be subject to code review before a maintainer decides
> to send a MR containing such new or updated job definitions.
> 
> Still related to machine maintanance, it's highly desirable for jobs
> tied to specific host machines to be introduced alongside with
> documentation and/or scripts that can replicate the machine setup. If
> the machine setup steps can be easily and reliably reproduced, then:
> 
> * Other people may help to debug issues and regressions if they
>   happen to have the same hardware available
> 
> * Other people may provide more machines to run the same types of
>   jobs
> 
> * If a machine maintainer goes MIA, it'd be easier to find another
>   maintainer

qemu.git has tests/vm for Ubuntu (i386), FreeBSD, NetBSD, OpenBSD,
CentOS, Fedora and tests/docker for Debian cross-compilation. These
are a good starting point for automated/reproducible environments for
running builds and tests. It would be great to integrate with
gitlab-runner.
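For illustration, a job definition along these lines could drive the
existing tests/vm targets — the job name and tags below are
hypothetical:

```yaml
# Hypothetical job reusing qemu.git's tests/vm infrastructure; the
# vm-build-netbsd make target builds and tests QEMU inside a
# disposable NetBSD guest, giving a reproducible environment.
test-vm-netbsd-x86_64:
  tags:
  - x86_64
  - kvm
  script:
  - make vm-build-netbsd
```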
Consider the > folowing job definition[9]: > > build-tci: > script: > - TARGETS="aarch64 alpha arm hppa m68k microblaze moxie ppc64 s390x x86_64" > - ./configure --enable-tcg-interpreter > --target-list="$(for tg in $TARGETS; do echo -n ${tg}'-softmmu '; done)" > - make -j2 > - make tests/boot-serial-test tests/cdrom-test tests/pxe-test > - for tg in $TARGETS ; do > export QTEST_QEMU_BINARY="${tg}-softmmu/qemu-system-${tg}" ; > ./tests/boot-serial-test || exit 1 ; > ./tests/cdrom-test || exit 1 ; > done > - QTEST_QEMU_BINARY="x86_64-softmmu/qemu-system-x86_64" ./tests/pxe-test > - QTEST_QEMU_BINARY="s390x-softmmu/qemu-system-s390x" ./tests/pxe-test -m slow > > All the lines under "script" are performed sequentially. It should be > clear that there's the possibility of breaking this down into multiple > stages, so that a build happens first, and then "common set of tests" > run in parallel. Using the example above, it would look something > like: > > +---------------+------------------------+ > | BUILD STAGE | TEST STAGE | > +---------------+------------------------+ > | +-------+ | +------------------+ | > | | build | | | boot-serial-test | | > | +-------+ | +------------------+ | > | | | > | | +------------------+ | > | | | cdrom-test | | > | | +------------------+ | > | | | > | | +------------------+ | > | | | x86_64-pxe-test | | > | | +------------------+ | > | | | > | | +------------------+ | > | | | s390x-pxe-test | | > | | +------------------+ | > | | | > +---------------+------------------------+ > > Of course it would be silly to break down that job into smaller jobs that > would run individual tests like "boot-serial-test" or "cdrom-test". 
Still, > the pipeline approach is valid because: > > * Common set of tests would run in parallel, giving a quicker result > turnaround > > * It's easier to determine the possible nature of the problem with > just the basic CI job status > > * Different maintainers could be defined for different "common set of > tests", and again by leveraging the basic CI job status, automation > for directed notification can be implemented > > In the following example, "check-block" maintainers could be left > undisturbed by failures in the "check-acceptance" job: > > +---------------+------------------------+ > | BUILD STAGE | TEST STAGE | > +---------------+------------------------+ > | +-------+ | +------------------+ | > | | build | | | check-block | | > | +-------+ | +------------------+ | > | | | > | | +------------------+ | > | | | check-acceptance | | > | | +------------------+ | > | | | > +---------------+------------------------+ > > The same logic applies for test sets for different targets. For > instance, combining the two previous examples, there could be different > maintainers defined for the different jobs on the test stage: > > +---------------+------------------------+ > | BUILD STAGE | TEST STAGE | > +---------------+------------------------+ > | +-------+ | +------------------+ | > | | build | | | x86_64-block | | > | +-------+ | +------------------+ | > | | | > | | +------------------+ | > | | | x86_64-acceptance| | > | | +------------------+ | > | | | > | | +------------------+ | > | | | s390x-block | | > | | +------------------+ | > | | | > | | +------------------+ | > | | | s390x-acceptance | | > | | +------------------+ | > +---------------+------------------------+ > > Current limitations for a multi-stage pipeline > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > Because it's assumed that each job will happen in an isolated and > independent execution environment, jobs must explicitly define the > resources that will be shared between stages.
GitLab will make sure > the same source code revision will be available on all jobs > automatically. Additionally, GitLab supports the concept of artifacts. > By defining artifacts in the "build" stage, jobs in the "test" stage > can expect to have a copy of those artifacts automatically. > > In theory, there's nothing that prevents an entire QEMU build > directory from being treated as an artifact. In practice, there are > predefined limits on GitLab that prevent that from being possible, > resulting in errors such as: > > Uploading artifacts... > build: found 3164 matching files > ERROR: Uploading artifacts to coordinator... too large archive > id=xxxxxxx responseStatus=413 Request Entity Too Large > status=413 Request Entity Too Large token=yyyyyyyyy > FATAL: too large > ERROR: Job failed: exit code 1 > > As far as I can tell, this is an instance-defined limit that's clearly > influenced by storage costs. I see a few possible solutions to this > limitation: > > 1) Provide our own "artifact"-like solution that uses our own storage > solution > > 2) Reduce or eliminate the dependency on a complete build tree > > The first solution can go against the general trend of not having to > maintain CI infrastructure. It could be made simpler by using cloud > storage, but there would still be some interaction with another > external infrastructure component. > > I find the second solution preferable, given that most tests depend > on having one or a few binaries available. I've run multi-stage > pipelines with some of those binaries (qemu-img, > $target-softmmu/qemu-system-$target) defined as artifacts and they > behaved as expected. But this could require some intrusive changes > to the current "make"-based test invocation. I agree. It should be possible to bring the necessary artifacts down to below the limit.
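As an illustration of solution 2, the "build" job could publish only a short list of binaries instead of the whole tree, using GitLab's "artifacts" keyword. This is only a sketch: the job name, binary paths and expiry below are assumptions, not QEMU's actual CI configuration.

```yaml
build:
  stage: build
  script:
    - ./configure --target-list=x86_64-softmmu
    - make -j2
  artifacts:
    # Publish only what the test stage needs, keeping the uploaded
    # archive well below the instance's artifact size limit.
    paths:
      - qemu-img
      - x86_64-softmmu/qemu-system-x86_64
    expire_in: 1 day
```

Jobs in a later "test" stage would then find a copy of these files in their working directory automatically.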
This wasn't a problem for the virtio-fs GitLab CI scripts I wrote that build a Linux kernel, QEMU, and a guest image, so I think it will be possible for QEMU as a whole: https://gitlab.com/virtio-fs/virtio-fs-ci/ [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [RFC] QEMU Gating CI 2019-12-02 17:00 ` Stefan Hajnoczi @ 2019-12-02 17:08 ` Peter Maydell 2019-12-02 18:28 ` Cleber Rosa 2019-12-02 18:12 ` Cleber Rosa 1 sibling, 1 reply; 22+ messages in thread From: Peter Maydell @ 2019-12-02 17:08 UTC (permalink / raw) To: Stefan Hajnoczi Cc: Markus Armbruster, Wainer dos Santos Moschetta, QEMU Developers, Jeff Nelson, Cleber Rosa, Alex Bennée, Ademar Reis On Mon, 2 Dec 2019 at 17:00, Stefan Hajnoczi <stefanha@redhat.com> wrote: > > On Mon, Dec 02, 2019 at 09:05:52AM -0500, Cleber Rosa wrote: > > To exemplify my point, if one specific test run as part of "check-tcg" > > is found to be faulty on a specific job (say on a specific OS), the > > entire "check-tcg" test set may be disabled as a CI-level maintenance > > action. Of course a follow up action to deal with the specific test > > is required, probably in the form of a Launchpad bug and patches > > dealing with the issue, but without necessarily a CI related angle to > > it. > > I think this coarse level of granularity is unrealistic. We cannot > disable 99 tests because of 1 known failure. There must be a way of > disabling individual tests. You don't need to implement it yourself, > but I think this needs to be solved by someone before a gating CI can be > put into use. > > It probably involves adding a "make EXCLUDE_TESTS=foo,bar check" > variable so that .gitlab-ci.yml can be modified to exclude specific > tests on certain OSes. We don't have this at the moment, so I'm not sure we need to add it as part of moving to doing merge testing via gitlab ? The current process is "if the pullreq causes a test to fail then the pullreq needs to be changed, perhaps by adding a patch which disables the test on a particular platform if necessary". Making that smoother might be nice, but I would be a little wary about adding requirements to the move-to-gitlab that don't absolutely need to be there. thanks -- PMM ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [RFC] QEMU Gating CI 2019-12-02 17:08 ` Peter Maydell @ 2019-12-02 18:28 ` Cleber Rosa 2019-12-02 18:36 ` Warner Losh 0 siblings, 1 reply; 22+ messages in thread From: Cleber Rosa @ 2019-12-02 18:28 UTC (permalink / raw) To: Peter Maydell Cc: Stefan Hajnoczi, Markus Armbruster, Wainer dos Santos Moschetta, QEMU Developers, Jeff Nelson, Alex Bennée, Ademar Reis On Mon, Dec 02, 2019 at 05:08:35PM +0000, Peter Maydell wrote: > On Mon, 2 Dec 2019 at 17:00, Stefan Hajnoczi <stefanha@redhat.com> wrote: > > > > On Mon, Dec 02, 2019 at 09:05:52AM -0500, Cleber Rosa wrote: > > > To exemplify my point, if one specific test run as part of "check-tcg" > > > is found to be faulty on a specific job (say on a specific OS), the > > > entire "check-tcg" test set may be disabled as a CI-level maintenance > > > action. Of course a follow up action to deal with the specific test > > > is required, probably in the form of a Launchpad bug and patches > > > dealing with the issue, but without necessarily a CI related angle to > > > it. > > > > I think this coarse level of granularity is unrealistic. We cannot > > disable 99 tests because of 1 known failure. There must be a way of > > disabling individual tests. You don't need to implement it yourself, > > but I think this needs to be solved by someone before a gating CI can be > > put into use. > > > > It probably involves adding a "make EXCLUDE_TESTS=foo,bar check" > > variable so that .gitlab-ci.yml can be modified to exclude specific > > tests on certain OSes. > > We don't have this at the moment, so I'm not sure we need to > add it as part of moving to doing merge testing via gitlab ? > The current process is "if the pullreq causes a test to fail > then the pullreq needs to be changed, perhaps by adding a > patch which disables the test on a particular platform if > necessary". 
Making that smoother might be nice, but I would > be a little wary about adding requirements to the move-to-gitlab > that don't absolutely need to be there. > > thanks > -- PMM > Right, it goes without saying that: 1) I acknowledge the problem (and I can have a long conversation about it :) 2) I don't think it has to be a prerequisite to the "move-to-gitlab" effort Thanks, - Cleber. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [RFC] QEMU Gating CI 2019-12-02 18:28 ` Cleber Rosa @ 2019-12-02 18:36 ` Warner Losh 2019-12-02 22:38 ` Cleber Rosa 0 siblings, 1 reply; 22+ messages in thread From: Warner Losh @ 2019-12-02 18:36 UTC (permalink / raw) To: Cleber Rosa Cc: Peter Maydell, Stefan Hajnoczi, QEMU Developers, Wainer dos Santos Moschetta, Markus Armbruster, Jeff Nelson, Alex Bennée, Ademar Reis [-- Attachment #1: Type: text/plain, Size: 2685 bytes --] On Mon, Dec 2, 2019 at 11:29 AM Cleber Rosa <crosa@redhat.com> wrote: > On Mon, Dec 02, 2019 at 05:08:35PM +0000, Peter Maydell wrote: > > On Mon, 2 Dec 2019 at 17:00, Stefan Hajnoczi <stefanha@redhat.com> > wrote: > > > > > > On Mon, Dec 02, 2019 at 09:05:52AM -0500, Cleber Rosa wrote: > > > > To exemplify my point, if one specific test run as part of > "check-tcg" > > > > is found to be faulty on a specific job (say on a specific OS), the > > > > entire "check-tcg" test set may be disabled as a CI-level maintenance > > > > action. Of course a follow up action to deal with the specific test > > > > is required, probably in the form of a Launchpad bug and patches > > > > dealing with the issue, but without necessarily a CI related angle to > > > > it. > > > > > > I think this coarse level of granularity is unrealistic. We cannot > > > disable 99 tests because of 1 known failure. There must be a way of > > > disabling individual tests. You don't need to implement it yourself, > > > but I think this needs to be solved by someone before a gating CI can > be > > > put into use. > > > > > > It probably involves adding a "make EXCLUDE_TESTS=foo,bar check" > > > variable so that .gitlab-ci.yml can be modified to exclude specific > > > tests on certain OSes. > > > > We don't have this at the moment, so I'm not sure we need to > > add it as part of moving to doing merge testing via gitlab ? 
> > The current process is "if the pullreq causes a test to fail > > then the pullreq needs to be changed, perhaps by adding a > > patch which disables the test on a particular platform if > > necessary". Making that smoother might be nice, but I would > > be a little wary about adding requirements to the move-to-gitlab > > that don't absolutely need to be there. > > > > thanks > > -- PMM > > > > Right, it goes without saying that: > > 1) I acknowledge the problem (and I can have a long conversation > about it :) > Just make sure that any pipeline and mandatory CI steps don't slow things down too much... While the examples have talked about 1 or 2 pull requests getting done in parallel, and that's great, the problem is when you try to land 10 or 20 all at once, one of which causes a failure and you aren't sure which one it actually is... Make sure whatever you design has sane exception case handling to not cause too much collateral damage... I worked one place that would back everything out if a once-a-week CI test ran and had failures... That CI test-run took 2 days to run, so it wasn't practical to run it often, or for every commit. In the end, though, the powers that be implemented an automated bisection tool that made it marginally less sucky.. Warner [-- Attachment #2: Type: text/html, Size: 3466 bytes --] ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [RFC] QEMU Gating CI 2019-12-02 18:36 ` Warner Losh @ 2019-12-02 22:38 ` Cleber Rosa 0 siblings, 0 replies; 22+ messages in thread From: Cleber Rosa @ 2019-12-02 22:38 UTC (permalink / raw) To: Warner Losh Cc: Peter Maydell, Stefan Hajnoczi, QEMU Developers, Wainer dos Santos Moschetta, Markus Armbruster, Jeff Nelson, Alex Bennée, Ademar Reis On Mon, Dec 02, 2019 at 11:36:35AM -0700, Warner Losh wrote: > > Just make sure that any pipeline and mandatory CI steps don't slow things > down too much... While the examples have talked about 1 or 2 pull requests > getting done in parallel, and that's great, the problem is when you try to > land 10 or 20 all at once, one of which causes a failure and you aren't sure > which one it actually is... Make sure whatever you design has sane > exception case handling to not cause too much collateral damage... I worked > one place that would back everything out if a once-a-week CI test ran and > had failures... That CI test-run took 2 days to run, so it wasn't practical > to run it often, or for every commit. In the end, though, the powers that > be implemented an automated bisection tool that made it marginally less > sucky.. > > Warner What I would personally like to see is the availability of enough resources to give a ~2 hour max result turnaround, that is, the complete pipeline finishes within those 2 hours. Of course the exact max time should be decided by consensus. If someone is contributing a new job that is supposed to run on existing hardware, its acceptance should be carefully considered. If more hardware is being added and the job is capable of running in parallel with others, then it shouldn't be an issue (I don't think we'll hit GitLab's scheduling limits anytime soon).
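A turnaround budget like that could be made explicit per job with GitLab's "timeout" keyword (available in recent GitLab releases); the job name and command in this sketch are illustrative only:

```yaml
check-acceptance:
  stage: test
  # A hung or runaway job is killed after 2 hours, so a single job
  # cannot push the complete pipeline past the agreed turnaround.
  timeout: 2 hours
  script:
    - make check-acceptance
```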
With regards to the "1 or 2 pull requests done in parallel", of course there could be a queue of pending jobs, but given that the idea is for these jobs to be run based on maintainers actions (say a Merge Request), the volume should be much lower than if individual contributors were triggering the same jobs on their patch series, and not at all on every commit (as you describe with the ~2 days jobs). Anyway, thanks for the feedback and please do not refrain from further participation in this effort. Your experience seems quite valuable. Thanks, - Cleber. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [RFC] QEMU Gating CI 2019-12-02 17:00 ` Stefan Hajnoczi 2019-12-02 17:08 ` Peter Maydell @ 2019-12-02 18:12 ` Cleber Rosa 2019-12-03 14:14 ` Stefan Hajnoczi 1 sibling, 1 reply; 22+ messages in thread From: Cleber Rosa @ 2019-12-02 18:12 UTC (permalink / raw) To: Stefan Hajnoczi Cc: Peter Maydell, qemu-devel, Wainer dos Santos Moschetta, Markus Armbruster, Jeff Nelson, Alex Bennée, Ademar Reis On Mon, Dec 02, 2019 at 05:00:18PM +0000, Stefan Hajnoczi wrote: > On Mon, Dec 02, 2019 at 09:05:52AM -0500, Cleber Rosa wrote: > > RFC: QEMU Gating CI > > =================== > > Excellent, thank you for your work on this! > > > > > This RFC attempts to address most of the issues described in > > "Requirements/GatinCI"[1]. An also relevant write up is the "State of > > QEMU CI as we enter 4.0"[2]. > > > > The general approach is one to minimize the infrastructure maintenance > > and development burden, leveraging as much as possible "other people's" > > infrastructure and code. GitLab's CI/CD platform is the most relevant > > component dealt with here. > > > > Problem Statement > > ----------------- > > > > The following is copied verbatim from Peter Maydell's write up[1]: > > > > "A gating CI is a prerequisite to having a multi-maintainer model of > > merging. By having a common set of tests that are run prior to a merge > > you do not rely on who is currently doing merging duties having access > > to the current set of test machines." > > > > This is of a very simplified view of the problem that I'd like to break > > down even further into the following key points: > > > > * Common set of tests > > * Pre-merge ("prior to a merge") > > * Access to the current set of test machines > > * Multi-maintainer model > > > > Common set of tests > > ~~~~~~~~~~~~~~~~~~~ > > > > Before we delve any further, let's make it clear that a "common set of > > tests" is really a "dynamic common set of tests". 
My point is that a > > set of tests in QEMU may include or exclude different tests depending > > on the environment. > > > > The exact tests that will be executed may differ depending on the > > environment, including: > > > > * Hardware > > * Operating system > > * Build configuration > > * Environment variables > > > > In the "State of QEMU CI as we enter 4.0" Alex Bennée listed some of > > those "common set of tests": > > > > * check > > * check-tcg > > * check-softfloat > > * check-block > > * check-acceptance > > > > While Peter mentions that most of his checks are limited to: > > > > * check > > * check-tcg > > > > Our current inability to quickly identify a faulty test from test > > execution results (and specially in remote environments), and act upon > > it (say quickly disable it on a given host platform), makes me believe > > that it's fair to start a gating CI implementation that uses this > > rather coarse granularity. > > > > Another benefit is a close or even a 1:1 relationship between a common > > test set and an entry in the CI configuration. For instance, the > > "check" common test set would map to a "make check" command in a > > "script:" YAML entry. > > > > To exemplify my point, if one specific test run as part of "check-tcg" > > is found to be faulty on a specific job (say on a specific OS), the > > entire "check-tcg" test set may be disabled as a CI-level maintenance > > action. Of course a follow up action to deal with the specific test > > is required, probably in the form of a Launchpad bug and patches > > dealing with the issue, but without necessarily a CI related angle to > > it. > > I think this coarse level of granularity is unrealistic. We cannot > disable 99 tests because of 1 known failure. There must be a way of > disabling individual tests. You don't need to implement it yourself, > but I think this needs to be solved by someone before a gating CI can be > put into use. 
> IMO it should be realistic if you look at it from a "CI related angle". The pull request could still be revised and disable a single test because of a known failure, but this would not be necessarily related to the CI. > It probably involves adding a "make EXCLUDE_TESTS=foo,bar check" > variable so that .gitlab-ci.yml can be modified to exclude specific > tests on certain OSes. > I certainly acknowledge the issue, but I don't think this (and many other issues that will certainly come up) should be a blocker to the transition to GitLab. > > > > If/when test result presentation and control mechanism evolve, we may > > feel confident and go into finer grained granularity. For instance, a > > mechanism for disabling nothing but "tests/migration-test" on a given > > environment would be possible and desirable from a CI management level. > > > > Pre-merge > > ~~~~~~~~~ > > > > The natural way to have pre-merge CI jobs in GitLab is to send "Merge > > Requests"[3] (abbreviated as "MR" from now on). In most projects, a > > MR comes from individual contributors, usually the authors of the > > changes themselves. It's my understanding that the current maintainer > > model employed in QEMU will *not* change at this time, meaning that > > code contributions and reviews will continue to happen on the mailing > > list. A maintainer then, having collected a number of patches, would > > submit a MR either in addition or in substitution to the Pull Requests > > sent to the mailing list. > > > > "Pipelines for Merged Results"[4] is a very important feature to > > support the multi-maintainer model, and looks in practice, similar to > > Peter's "staging" branch approach, with an "automatic refresh" of the > > target branch. It can give a maintainer extra confidence that a MR > > will play nicely with the updated status of the target branch. It's > > my understanding that it should be the "key to the gates". 
A minor > > note is that conflicts are still possible in a multi-maintainer model > > if there are more than one person doing the merges. > > The intention is to have only 1 active maintainer at a time. The > maintainer will handle all merges for the current QEMU release and then > hand over to the next maintainer after the release has been made. > > Solving the problem for multiple active maintainers is low priority at > the moment. > Even so, I have the impression that the following workflow: - Look at Merge Results Pipeline for MR#1 - Merge MR #1 - Hack on something else - Look at *automatically updated* Merge Results Pipeline for MR#2 - Merge MR #2 Is better than: - Push PR #1 to staging - Wait for PR #1 Pipeline to finish - Look at PR #1 Pipeline results - Push staging into master - Push PR #2 to staging - Wait for PR #2 Pipeline to finish - Push staging into master But I don't think I'll be a direct user of those workflows, so I'm completely open to feedback on it. > > A worthy point is that the GitLab web UI is not the only way to create > > a Merge Request, but a rich set of APIs are available[5]. This is > > interesting for many reasons, and maybe some of Peter's > > "apply-pullreq"[6] actions (such as bad UTF8 or bogus qemu-devel email > > addresses checks could be made earlier) as part of a > > "send-mergereq"-like script, bringing conformance earlier on the merge > > process, at the MR creation stage. > > > > Note: It's possible to have CI jobs definition that are specific to > > MR, allowing generic non-MR jobs to be kept on the default > > configuration. This can be used so individual contributors continue > > to leverage some of the "free" (shared) runner made available on > > gitlab.com. > > I expected this section to say: > 1. Maintainer sets up a personal gitlab.com account with a qemu.git fork. > 2. Maintainer adds QEMU's CI tokens to their personal account. > 3. Each time a maintainer pushes to their "staging" branch the CI > triggers. 
> > IMO this model is simpler than MRs because once it has been set up the > maintainer just uses git push. Why are MRs necessary? > I am not sure GitLab "Specific Runners" can be used from other accounts/forks. AFAICT, you'd need a MR to send jobs that would run on those machines, because (again AFAICT) the token used to register those gitlab-runner instances on those machines is not shareable across forks. But, I'll double check that. > > Multi-maintainer model > > ~~~~~~~~~~~~~~~~~~~~~~ > > > > The previous section already introduced some of the proposed workflow > > that can enable such a multi-maintainer model. With a Gating CI > > system, though, it will be natural to have a smaller "Mean time > > between (CI) failures", simply because of the expected increased > > number of systems and checks. A lot of countermeasures have to be > > employed to keep that MTBF in check. > > I expect the CI to be in a state of partial failure all the time. > Previously the idea of Tier 1 and Tier 2 platforms was raised where Tier > 2 platforms can be failing without gating the CI. I think this is > reality for us. Niche host OSes and features fail and remain in the > failing state for days/weeks. The CI should be designed to run in this > mode all the time. > The most important tool we'd have at the CI level is to "allow failures" indeed. GitLab CI itself doesn't provide the concept of different tiers, so effectively we'd have to mimic that with jobs that will not be blocking. What I think we should have is a well-defined methodology, and tools, to either promote or demote failing/passing jobs. For example, a newly introduced job will always be in "allow failure" mode (similar to Tier 2), until it proves itself by running reliably for 100 runs or 2 months, whichever comes last. Likewise, a job that is not allowed to fail (similar to a Tier 1) would be demoted if it fails twice and is not repaired within 24 hours.
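The promote/demote mechanics described above map directly onto the "allow_failure" job keyword[7]; a sketch of a hypothetical probationary (Tier 2-like) job:

```yaml
check-acceptance-s390x:
  stage: test
  # Probation / Tier 2: failures are reported, but do not gate the
  # pipeline.  Removing this line is the "promotion"; adding it back
  # (e.g. after repeated unrepaired failures) is the "demotion".
  allow_failure: true
  script:
    - make check-acceptance
```

The job name and test target here are assumptions for illustration; only the "allow_failure" mechanism itself is GitLab-provided.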
> > For once, it's imperative that the maintainers for such systems and > > jobs are clearly defined and readily accessible. Either the same > > MAINTAINERS file or a more suitable variation of such data should be > > defined before activating the *gating* rules. This would allow a > > routing to request the attention of the maintainer responsible. > > > > In case of unresposive maintainers, or any other condition that > > renders and keeps one or more CI jobs failing for a given previously > > established amount of time, the job can be demoted with an > > "allow_failure" configuration[7]. Once such a change is commited, the > > path to promotion would be just the same as in a newly added job > > definition. > > > > Note: In a future phase we can evaluate the creation of rules that > > look at changed paths in a MR (similar to "F:" entries on MAINTAINERS) > > and the execution of specific CI jobs, which would be the > > responsibility of a given maintainer[8]. > > > > Access to the current set of test machines > > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > > > When compared to the various CI systems and services already being > > employed in the QEMU project, this is the most striking difference in > > the proposed model. Instead of relying on shared/public/free > > resources, this proposal also deals with privately owned and > > operated machines. > > > > Even though the QEMU project operates with great cooperation, it's > > crucial to define clear boundaries when it comes to machine access. > > Restricted access to machines is important because: > > > > * The results of jobs are many times directly tied to the setup and > > status of machines. Even "soft" changes such as removing or updating > > packages can introduce failures in jobs (this is greatly minimized > > but not completely eliminated when using containers or VMs). > > Updating firmware or changing its settings are also examples of > > changes that may change the outcome of jobs. 
> > > > * If maintainers will be accounted for the status of the jobs defined > > to run on specific machines, they must be sure of the status of the > > machines. > > > > * Machines need regular monitoring and will receive required > > maintainance actions which can cause job regressions. > > > > Thus, there needs to be one clear way for machines to be *used* for > > running jobs sent by different maintainers, while still prohibiting > > any other privileged action that can cause permanent change to the > > machine. The GitLab agent (gitlab-runner) is designed to do just > > that, and defining what will be excuted in a job (in a given system) > > should be all that's generally allowed. The job definition itself, > > will of course be subject to code review before a maintainer decides > > to send a MR containing such new or updated job definitions. > > > > Still related to machine maintanance, it's highly desirable for jobs > > tied to specific host machines to be introduced alongside with > > documentation and/or scripts that can replicate the machine setup. If > > the machine setup steps can be easily and reliably reproduced, then: > > > > * Other people may help to debug issues and regressions if they > > happen to have the same hardware available > > > > * Other people may provide more machines to run the same types of > > jobs > > > > * If a machine maintainer goes MIA, it'd be easier to find another > > maintainer > > qemu.git has tests/vm for Ubuntu (i386), FreeBSD, NetBSD, OpenBSD, > CentOS, Fedora and tests/docker for Debian cross-compilation. These are > a good starting point for automated/reproducible environments for > running builds and tests. It would be great to integrate with > gitlab-runner. > Yes, the idea is to close the gap as much as possible, and make what we already have on qemu.git available to CI/gitlab-runner and vice-versa. 
> > > > GitLab Jobs and Pipelines > > ------------------------- > > > > GitLab CI is built around two major concepts: jobs and pipelines. The > > current GitLab CI configuration in QEMU uses jobs only (or putting it > > another way, all jobs in a single pipeline stage). Consider the > > folowing job definition[9]: > > > > build-tci: > > script: > > - TARGETS="aarch64 alpha arm hppa m68k microblaze moxie ppc64 s390x x86_64" > > - ./configure --enable-tcg-interpreter > > --target-list="$(for tg in $TARGETS; do echo -n ${tg}'-softmmu '; done)" > > - make -j2 > > - make tests/boot-serial-test tests/cdrom-test tests/pxe-test > > - for tg in $TARGETS ; do > > export QTEST_QEMU_BINARY="${tg}-softmmu/qemu-system-${tg}" ; > > ./tests/boot-serial-test || exit 1 ; > > ./tests/cdrom-test || exit 1 ; > > done > > - QTEST_QEMU_BINARY="x86_64-softmmu/qemu-system-x86_64" ./tests/pxe-test > > - QTEST_QEMU_BINARY="s390x-softmmu/qemu-system-s390x" ./tests/pxe-test -m slow > > > > All the lines under "script" are performed sequentially. It should be > > clear that there's the possibility of breaking this down into multiple > > stages, so that a build happens first, and then "common set of tests" > > run in parallel. 
Using the example above, it would look something > > like: > > > > +---------------+------------------------+ > > | BUILD STAGE | TEST STAGE | > > +---------------+------------------------+ > > | +-------+ | +------------------+ | > > | | build | | | boot-serial-test | | > > | +-------+ | +------------------+ | > > | | | > > | | +------------------+ | > > | | | cdrom-test | | > > | | +------------------+ | > > | | | > > | | +------------------+ | > > | | | x86_64-pxe-test | | > > | | +------------------+ | > > | | | > > | | +------------------+ | > > | | | s390x-pxe-test | | > > | | +------------------+ | > > | | | > > +---------------+------------------------+ > > > > Of course it would be silly to break down that job into smaller jobs that > > would run individual tests like "boot-serial-test" or "cdrom-test". Still, > > the pipeline approach is valid because: > > > > * Common set of tests would run in parallel, giving a quicker result > > turnaround > > > > * It's easier to determine to possible nature of the problem with > > just the basic CI job status > > > > * Different maintainers could be defined for different "common set of > > tests", and again by leveraging the basic CI job status, automation > > for directed notification can be implemented > > > > In the following example, "check-block" maintainers could be left > > undisturbed with failures in the "check-acceptance" job: > > > > +---------------+------------------------+ > > | BUILD STAGE | TEST STAGE | > > +---------------+------------------------+ > > | +-------+ | +------------------+ | > > | | build | | | check-block | | > > | +-------+ | +------------------+ | > > | | | > > | | +------------------+ | > > | | | check-acceptance | | > > | | +------------------+ | > > | | | > > +---------------+------------------------+ > > > > The same logic applies for test sets for different targets. 
For > > instance, combining the two previous examples, there could be different > > maintainers defined for the different jobs on the test stage: > > > > +---------------+------------------------+ > > | BUILD STAGE | TEST STAGE | > > +---------------+------------------------+ > > | +-------+ | +------------------+ | > > | | build | | | x86_64-block | | > > | +-------+ | +------------------+ | > > | | | > > | | +------------------+ | > > | | | x86_64-acceptance| | > > | | +------------------+ | > > | | | > > | | +------------------+ | > > | | | s390x-block | | > > | | +------------------+ | > > | | | > > | | +------------------+ | > > | | | s390x-acceptance | | > > | | +------------------+ | > > +---------------+------------------------+ > > > > Current limitations for a multi-stage pipeline > > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > > > Because it's assumed that each job will happen in an isolated and > > independent execution environment, jobs must explicitly define the > > resources that will be shared between stages. GitLab will make sure > > the same source code revision will be available on all jobs > > automatically. Additionally, GitLab supports the concept of artifacts. > > By defining artifacts in the "build" stage, jobs in the "test" stage > > can expect to have a copy of those artifacts automatically. > > > > In theory, there's nothing that prevents an entire QEMU build > > directory from being treated as an artifact. In practice, there are > > predefined limits on GitLab that prevent that from being possible, > > resulting in errors such as: > > > > Uploading artifacts... > > build: found 3164 matching files > > ERROR: Uploading artifacts to coordinator... 
too large archive > > id=xxxxxxx responseStatus=413 Request Entity Too Large > > status=413 Request Entity Too Large token=yyyyyyyyy > > FATAL: too large > > ERROR: Job failed: exit code 1 > > > > As far as I can tell, this is an instance-defined limit that's clearly > > influenced by storage costs. I see a few possible solutions to this > > limitation: > > > > 1) Provide our own "artifact" like solution that uses our own storage > > solution > > > > 2) Reduce or eliminate the dependency on a complete build tree > > > > The first solution can go against the general trend of not having to > > maintain CI infrastructure. It could be made simpler by using cloud > > storage, but there would still be some interaction with another > > external infrastructure component. > > > > I find the second solution preferable, given that most tests depend > > on having one or a few binaries available. I've run multi-stage > > pipelines with some of those binaries (qemu-img, > > $target-softmmu/qemu-system-$target) defined as artifacts and they > > behaved as expected. But, this could require some intrusive changes > > to the current "make"-based test invocation. > > I agree. It should be possible to bring the necessary artifacts down to > below the limit. This wasn't a problem for the virtio-fs GitLab CI > scripts I wrote that build a Linux kernel, QEMU, and guest image so I > think it will be possible for QEMU as a whole: > https://gitlab.com/virtio-fs/virtio-fs-ci/ Cool, thanks for the pointer and feedback! - Cleber. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [RFC] QEMU Gating CI 2019-12-02 18:12 ` Cleber Rosa @ 2019-12-03 14:14 ` Stefan Hajnoczi 0 siblings, 0 replies; 22+ messages in thread From: Stefan Hajnoczi @ 2019-12-03 14:14 UTC (permalink / raw) To: Cleber Rosa Cc: Peter Maydell, qemu-devel, Wainer dos Santos Moschetta, Markus Armbruster, Jeff Nelson, Alex Bennée, Ademar Reis [-- Attachment #1: Type: text/plain, Size: 8997 bytes --] On Mon, Dec 02, 2019 at 01:12:54PM -0500, Cleber Rosa wrote: > On Mon, Dec 02, 2019 at 05:00:18PM +0000, Stefan Hajnoczi wrote: > > On Mon, Dec 02, 2019 at 09:05:52AM -0500, Cleber Rosa wrote: > > > RFC: QEMU Gating CI > > > =================== > > > > Excellent, thank you for your work on this! > > > > > > > > This RFC attempts to address most of the issues described in > > > "Requirements/GatinCI"[1]. An also relevant write up is the "State of > > > QEMU CI as we enter 4.0"[2]. > > > > > > The general approach is one to minimize the infrastructure maintenance > > > and development burden, leveraging as much as possible "other people's" > > > infrastructure and code. GitLab's CI/CD platform is the most relevant > > > component dealt with here. > > > > > > Problem Statement > > > ----------------- > > > > > > The following is copied verbatim from Peter Maydell's write up[1]: > > > > > > "A gating CI is a prerequisite to having a multi-maintainer model of > > > merging. By having a common set of tests that are run prior to a merge > > > you do not rely on who is currently doing merging duties having access > > > to the current set of test machines." 
> > > > > > This is a very simplified view of the problem that I'd like to break > > > down even further into the following key points: > > > > > > * Common set of tests > > > * Pre-merge ("prior to a merge") > > > * Access to the current set of test machines > > > * Multi-maintainer model > > > > > > Common set of tests > > > ~~~~~~~~~~~~~~~~~~~ > > > > > > Before we delve any further, let's make it clear that a "common set of > > > tests" is really a "dynamic common set of tests". My point is that a > > > set of tests in QEMU may include or exclude different tests depending > > > on the environment. > > > > > > The exact tests that will be executed may differ depending on the > > > environment, including: > > > > > > * Hardware > > > * Operating system > > > * Build configuration > > > * Environment variables > > > > > > In the "State of QEMU CI as we enter 4.0" Alex Bennée listed some of > > > those "common set of tests": > > > > > > * check > > > * check-tcg > > > * check-softfloat > > > * check-block > > > * check-acceptance > > > > > > While Peter mentions that most of his checks are limited to: > > > > > > * check > > > * check-tcg > > > > > > Our current inability to quickly identify a faulty test from test > > > execution results (and especially in remote environments), and act upon > > > it (say quickly disable it on a given host platform), makes me believe > > > that it's fair to start a gating CI implementation that uses this > > > rather coarse granularity. > > > > > > Another benefit is a close or even a 1:1 relationship between a common > > > test set and an entry in the CI configuration. For instance, the > > > "check" common test set would map to a "make check" command in a > > > "script:" YAML entry. > > > > > > To exemplify my point, if one specific test run as part of "check-tcg" > > > is found to be faulty on a specific job (say on a specific OS), the > > > entire "check-tcg" test set may be disabled as a CI-level maintenance > > > action. 
Of course a follow up action to deal with the specific test > > > is required, probably in the form of a Launchpad bug and patches > > > dealing with the issue, but without necessarily a CI related angle to > > > it. > > > > I think this coarse level of granularity is unrealistic. We cannot > > disable 99 tests because of 1 known failure. There must be a way of > > disabling individual tests. You don't need to implement it yourself, > > but I think this needs to be solved by someone before a gating CI can be > > put into use. > > > > IMO it should be realistic if you look at it from a "CI related > angle". The pull request could still be revised and disable a single > test because of a known failure, but this would not be necessarily > related to the CI. That sounds fine, thanks. I interpreted the text a little differently. I agree this functionality doesn't need to be present in order to move to GitLab. > > > It probably involves adding a "make EXCLUDE_TESTS=foo,bar check" > > variable so that .gitlab-ci.yml can be modified to exclude specific > > tests on certain OSes. > > > > I certainly acknowledge the issue, but I don't think this (and many > other issues that will certainly come up) should be a blocker to the > transition to GitLab. > > > > > > > If/when test result presentation and control mechanism evolve, we may > > > feel confident and go into finer grained granularity. For instance, a > > > mechanism for disabling nothing but "tests/migration-test" on a given > > > environment would be possible and desirable from a CI management level. > > > > > > Pre-merge > > > ~~~~~~~~~ > > > > > > The natural way to have pre-merge CI jobs in GitLab is to send "Merge > > > Requests"[3] (abbreviated as "MR" from now on). In most projects, a > > > MR comes from individual contributors, usually the authors of the > > > changes themselves. 
It's my understanding that the current maintainer > > > model employed in QEMU will *not* change at this time, meaning that > > > code contributions and reviews will continue to happen on the mailing > > > list. A maintainer then, having collected a number of patches, would > > > submit a MR either in addition or in substitution to the Pull Requests > > > sent to the mailing list. > > > > > > "Pipelines for Merged Results"[4] is a very important feature to > > > support the multi-maintainer model, and looks, in practice, similar to > > > Peter's "staging" branch approach, with an "automatic refresh" of the > > > target branch. It can give a maintainer extra confidence that a MR > > > will play nicely with the updated status of the target branch. It's > > > my understanding that it should be the "key to the gates". A minor > > > note is that conflicts are still possible in a multi-maintainer model > > > if more than one person is doing the merges. > > > > The intention is to have only 1 active maintainer at a time. The > > maintainer will handle all merges for the current QEMU release and then > > hand over to the next maintainer after the release has been made. > > > > Solving the problem for multiple active maintainers is low priority at > > the moment. > > > > Even so, I have the impression that the following workflow: > > - Look at Merge Results Pipeline for MR#1 > - Merge MR #1 > - Hack on something else > - Look at *automatically updated* Merge Results Pipeline for MR#2 > - Merge MR #2 > > Is better than: > > - Push PR #1 to staging > - Wait for PR #1 Pipeline to finish > - Look at PR #1 Pipeline results > - Push staging into master > - Push PR #2 to staging > - Wait for PR #2 Pipeline to finish > - Push staging into master > > But I don't think I'll be a direct user of those workflows, so I'm > completely open to feedback on it. If the goal is to run multiple trees through the CI in parallel then multiple branches can be used. 
I guess I'm just > > > > A worthy point is that the GitLab web UI is not the only way to create > > > a Merge Request, but a rich set of APIs are available[5]. This is > > > interesting for many reasons, and maybe some of Peter's > > > "apply-pullreq"[6] actions (such as the checks for bad UTF8 or bogus qemu-devel > > > email addresses) could be made earlier as part of a > > > "send-mergereq"-like script, bringing conformance earlier on the merge > > > process, at the MR creation stage. > > > > > > Note: It's possible to have CI job definitions that are specific to > > > MRs, allowing generic non-MR jobs to be kept on the default > > > configuration. This can be used so individual contributors continue > > > to leverage some of the "free" (shared) runners made available on > > > gitlab.com. > > > > I expected this section to say: > > 1. Maintainer sets up a personal gitlab.com account with a qemu.git fork. > > 2. Maintainer adds QEMU's CI tokens to their personal account. > > 3. Each time a maintainer pushes to their "staging" branch the CI > > triggers. > > > > IMO this model is simpler than MRs because once it has been set up the > > maintainer just uses git push. Why are MRs necessary? > > > > I am not sure GitLab "Specific Runners" can be used from other > accounts/forks. AFAICT, you'd need a MR to send jobs that would run > on those machines, because (again AFAICT) the token used to register > those gitlab-runner instances on those machines is not shareable > across forks. But, I'll double check that. Another question: Is a Merge Request necessary in order to trigger the CI or is just pushing to a branch enough? With GitHub + Travis just pushing is enough. Stefan [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [RFC] QEMU Gating CI 2019-12-02 14:05 [RFC] QEMU Gating CI Cleber Rosa 2019-12-02 17:00 ` Stefan Hajnoczi @ 2019-12-03 14:07 ` Alex Bennée 2019-12-04 8:55 ` Thomas Huth 2019-12-06 19:03 ` Cleber Rosa 2019-12-03 17:54 ` Peter Maydell 2020-01-17 14:33 ` Peter Maydell 3 siblings, 2 replies; 22+ messages in thread From: Alex Bennée @ 2019-12-03 14:07 UTC (permalink / raw) To: Cleber Rosa Cc: Peter Maydell, Stefan Hajnoczi, qemu-devel, Wainer dos Santos Moschetta, Markus Armbruster, Jeff Nelson, Ademar Reis Cleber Rosa <crosa@redhat.com> writes: > RFC: QEMU Gating CI > =================== > > This RFC attempts to address most of the issues described in > "Requirements/GatinCI"[1]. An also relevant write up is the "State of > QEMU CI as we enter 4.0"[2]. > > The general approach is one to minimize the infrastructure maintenance > and development burden, leveraging as much as possible "other people's" > infrastructure and code. GitLab's CI/CD platform is the most relevant > component dealt with here. > > Problem Statement > ----------------- > > The following is copied verbatim from Peter Maydell's write up[1]: > > "A gating CI is a prerequisite to having a multi-maintainer model of > merging. By having a common set of tests that are run prior to a merge > you do not rely on who is currently doing merging duties having access > to the current set of test machines." > > This is a very simplified view of the problem that I'd like to break > down even further into the following key points: > > * Common set of tests > * Pre-merge ("prior to a merge") > * Access to the current set of test machines > * Multi-maintainer model > > Common set of tests > ~~~~~~~~~~~~~~~~~~~ > > Before we delve any further, let's make it clear that a "common set of > tests" is really a "dynamic common set of tests". My point is that a > set of tests in QEMU may include or exclude different tests depending > on the environment. 
> > The exact tests that will be executed may differ depending on the > environment, including: > > * Hardware > * Operating system > * Build configuration > * Environment variables > > In the "State of QEMU CI as we enter 4.0" Alex Bennée listed some of > those "common set of tests": > > * check Check encompasses a subset of the other checks - currently: - check-unit - check-qtest - check-block The thing that stops other groups of tests being included is generally whether they are solid on all the various hw/os/config/env setups you describe. For example check-tcg currently fails gloriously on non-x86 with docker enabled as it tries to get all the cross compiler images working. > * check-tcg > * check-softfloat > * check-block > * check-acceptance > > While Peter mentions that most of his checks are limited to: > > * check > * check-tcg > > Our current inability to quickly identify a faulty test from test > execution results (and especially in remote environments), and act upon > it (say quickly disable it on a given host platform), makes me believe > that it's fair to start a gating CI implementation that uses this > rather coarse granularity. > > Another benefit is a close or even a 1:1 relationship between a common > test set and an entry in the CI configuration. For instance, the > "check" common test set would map to a "make check" command in a > "script:" YAML entry. > > To exemplify my point, if one specific test run as part of "check-tcg" > is found to be faulty on a specific job (say on a specific OS), the > entire "check-tcg" test set may be disabled as a CI-level maintenance > action. This would in this example eliminate practically all emulation testing apart from the very minimal boot-codes that get spun up by the various qtest migration tests. And of course the longer a group of tests is disabled the larger the window for additional regressions to get in. It may be a reasonable approach but it's not without consequence. 
> Of course a follow up action to deal with the specific test > is required, probably in the form of a Launchpad bug and patches > dealing with the issue, but without necessarily a CI related angle to > it. > > If/when test result presentation and control mechanism evolve, we may > feel confident and go into finer grained granularity. For instance, a > mechanism for disabling nothing but "tests/migration-test" on a given > environment would be possible and desirable from a CI management > level. The migration tests have found regressions although the problem has generally been that they were intermittent failures and hard to reproduce locally. The last one took a few weeks of grinding to reproduce and get patches together. > Pre-merge > ~~~~~~~~~ > > The natural way to have pre-merge CI jobs in GitLab is to send "Merge > Requests"[3] (abbreviated as "MR" from now on). In most projects, a > MR comes from individual contributors, usually the authors of the > changes themselves. It's my understanding that the current maintainer > model employed in QEMU will *not* change at this time, meaning that > code contributions and reviews will continue to happen on the mailing > list. A maintainer then, having collected a number of patches, would > submit a MR either in addition or in substitution to the Pull Requests > sent to the mailing list. > > "Pipelines for Merged Results"[4] is a very important feature to > support the multi-maintainer model, and looks, in practice, similar to > Peter's "staging" branch approach, with an "automatic refresh" of the > target branch. It can give a maintainer extra confidence that a MR > will play nicely with the updated status of the target branch. It's > my understanding that it should be the "key to the gates". A minor > note is that conflicts are still possible in a multi-maintainer model > if more than one person is doing the merges. 
> > A worthy point is that the GitLab web UI is not the only way to create > a Merge Request, but a rich set of APIs are available[5]. This is > interesting for many reasons, and maybe some of Peter's > "apply-pullreq"[6] actions (such as the checks for bad UTF8 or bogus > qemu-devel email addresses) could be made earlier as part of a > "send-mergereq"-like script, bringing conformance earlier on the merge > process, at the MR creation stage. > > Note: It's possible to have CI job definitions that are specific to > MRs, allowing generic non-MR jobs to be kept on the default > configuration. This can be used so individual contributors continue > to leverage some of the "free" (shared) runners made available on > gitlab.com. > > Multi-maintainer model > ~~~~~~~~~~~~~~~~~~~~~~ > > The previous section already introduced some of the proposed workflow > that can enable such a multi-maintainer model. With a Gating CI > system, though, it will be natural to have a smaller "Mean time > between (CI) failures", simply because of the expected increased > number of systems and checks. A lot of countermeasures have to be > employed to keep that MTBF in check. > > For one, it's imperative that the maintainers for such systems and > jobs are clearly defined and readily accessible. Either the same > MAINTAINERS file or a more suitable variation of such data should be > defined before activating the *gating* rules. This would allow > requests to be routed to the attention of the responsible maintainer. > > In case of unresponsive maintainers, or any other condition that > renders and keeps one or more CI jobs failing for a given previously > established amount of time, the job can be demoted with an > "allow_failure" configuration[7]. Once such a change is committed, the > path to promotion would be just the same as in a newly added job > definition. 
> > Note: In a future phase we can evaluate the creation of rules that > look at changed paths in a MR (similar to "F:" entries on MAINTAINERS) > and the execution of specific CI jobs, which would be the > responsibility of a given maintainer[8]. > > Access to the current set of test machines > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > When compared to the various CI systems and services already being > employed in the QEMU project, this is the most striking difference in > the proposed model. Instead of relying on shared/public/free > resources, this proposal also deals with privately owned and > operated machines. > > Even though the QEMU project operates with great cooperation, it's > crucial to define clear boundaries when it comes to machine access. > Restricted access to machines is important because: > > * The results of jobs are often directly tied to the setup and > status of machines. Even "soft" changes such as removing or updating > packages can introduce failures in jobs (this is greatly minimized > but not completely eliminated when using containers or VMs). > Updating firmware or changing its settings are also examples of > changes that may change the outcome of jobs. > > * If maintainers are to be held accountable for the status of the jobs defined > to run on specific machines, they must be sure of the status of the > machines. > > * Machines need regular monitoring and will receive required > maintenance actions which can cause job regressions. > > Thus, there needs to be one clear way for machines to be *used* for > running jobs sent by different maintainers, while still prohibiting > any other privileged action that can cause permanent change to the > machine. The GitLab agent (gitlab-runner) is designed to do just > that, and defining what will be executed in a job (in a given system) > should be all that's generally allowed. 
The job definition itself > will, of course, be subject to code review before a maintainer decides > to send a MR containing such new or updated job definitions. > > Still related to machine maintenance, it's highly desirable for jobs > tied to specific host machines to be introduced alongside with > documentation and/or scripts that can replicate the machine setup. If > the machine setup steps can be easily and reliably reproduced, then: > > * Other people may help to debug issues and regressions if they > happen to have the same hardware available > > * Other people may provide more machines to run the same types of > jobs > > * If a machine maintainer goes MIA, it'd be easier to find another > maintainer > > GitLab Jobs and Pipelines > ------------------------- > > GitLab CI is built around two major concepts: jobs and pipelines. The > current GitLab CI configuration in QEMU uses jobs only (or putting it > another way, all jobs in a single pipeline stage). Consider the > following job definition[9]: > > build-tci: > script: > - TARGETS="aarch64 alpha arm hppa m68k microblaze moxie ppc64 s390x x86_64" > - ./configure --enable-tcg-interpreter > --target-list="$(for tg in $TARGETS; do echo -n ${tg}'-softmmu '; done)" > - make -j2 > - make tests/boot-serial-test tests/cdrom-test tests/pxe-test > - for tg in $TARGETS ; do > export QTEST_QEMU_BINARY="${tg}-softmmu/qemu-system-${tg}" ; > ./tests/boot-serial-test || exit 1 ; > ./tests/cdrom-test || exit 1 ; > done > - QTEST_QEMU_BINARY="x86_64-softmmu/qemu-system-x86_64" ./tests/pxe-test > - QTEST_QEMU_BINARY="s390x-softmmu/qemu-system-s390x" ./tests/pxe-test -m slow > > All the lines under "script" are performed sequentially. It should be > clear that there's the possibility of breaking this down into multiple > stages, so that a build happens first, and then "common set of tests" > run in parallel. 
Using the example above, it would look something > like: > > +---------------+------------------------+ > | BUILD STAGE | TEST STAGE | > +---------------+------------------------+ > | +-------+ | +------------------+ | > | | build | | | boot-serial-test | | > | +-------+ | +------------------+ | > | | | > | | +------------------+ | > | | | cdrom-test | | > | | +------------------+ | > | | | > | | +------------------+ | > | | | x86_64-pxe-test | | > | | +------------------+ | > | | | > | | +------------------+ | > | | | s390x-pxe-test | | > | | +------------------+ | > | | | > +---------------+------------------------+ > > Of course it would be silly to break down that job into smaller jobs that > would run individual tests like "boot-serial-test" or "cdrom-test". Still, > the pipeline approach is valid because: > > * Common set of tests would run in parallel, giving a quicker result > turnaround check-unit is a good candidate for parallel tests. The others depends - I've recently turned most make check's back to -j 1 on travis because it's a real pain to see what test has hung when other tests keep running. > > * It's easier to determine the possible nature of the problem with > just the basic CI job status > > * Different maintainers could be defined for different "common set of > tests", and again by leveraging the basic CI job status, automation > for directed notification can be implemented > > In the following example, "check-block" maintainers could be left > undisturbed with failures in the "check-acceptance" job: > > +---------------+------------------------+ > | BUILD STAGE | TEST STAGE | > +---------------+------------------------+ > | +-------+ | +------------------+ | > | | build | | | check-block | | > | +-------+ | +------------------+ | > | | | > | | +------------------+ | > | | | check-acceptance | | > | | +------------------+ | > | | | > +---------------+------------------------+ > > The same logic applies for test sets for different targets. 
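[Editor's note: to make the build/test split sketched in the diagrams concrete, a minimal ".gitlab-ci.yml" fragment could look like the following. The job names and the artifact list are illustrative assumptions, not a proposed final configuration, and, as the RFC notes further down, whether the test-stage "make" invocations can consume prebuilt binaries without rebuilding is an open question.]

```yaml
stages:
  - build
  - test

build:
  stage: build
  script:
    - ./configure --target-list=x86_64-softmmu
    - make -j2
  # Share only the binaries the test stage needs, not the whole
  # build tree (see the artifact size limits discussed below).
  artifacts:
    paths:
      - x86_64-softmmu/qemu-system-x86_64
    expire_in: 1 day

test-check-x86_64:
  stage: test
  script:
    - make check

test-acceptance-x86_64:
  stage: test
  script:
    - make check-acceptance
```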
For > instance, combining the two previous examples, there could be different > maintainers defined for the different jobs on the test stage: > > +---------------+------------------------+ > | BUILD STAGE | TEST STAGE | > +---------------+------------------------+ > | +-------+ | +------------------+ | > | | build | | | x86_64-block | | > | +-------+ | +------------------+ | > | | | > | | +------------------+ | > | | | x86_64-acceptance| | > | | +------------------+ | > | | | > | | +------------------+ | > | | | s390x-block | | > | | +------------------+ | > | | | > | | +------------------+ | > | | | s390x-acceptance | | > | | +------------------+ | > +---------------+------------------------+ > > Current limitations for a multi-stage pipeline > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > Because it's assumed that each job will happen in an isolated and > independent execution environment, jobs must explicitly define the > resources that will be shared between stages. GitLab will make sure > the same source code revision will be available on all jobs > automatically. Additionally, GitLab supports the concept of artifacts. > By defining artifacts in the "build" stage, jobs in the "test" stage > can expect to have a copy of those artifacts automatically. > > In theory, there's nothing that prevents an entire QEMU build > directory from being treated as an artifact. In practice, there are > predefined limits on GitLab that prevent that from being possible, > resulting in errors such as: > > Uploading artifacts... > build: found 3164 matching files > ERROR: Uploading artifacts to coordinator... too large archive > id=xxxxxxx responseStatus=413 Request Entity Too Large > status=413 Request Entity Too Large token=yyyyyyyyy > FATAL: too large > ERROR: Job failed: exit code 1 > > As far as I can tell, this is an instance-defined limit that's clearly > influenced by storage costs. 
I see a few possible solutions to this > limitation: > > 1) Provide our own "artifact" like solution that uses our own storage > solution > > 2) Reduce or eliminate the dependency on a complete build tree > > The first solution can go against the general trend of not having to > maintain CI infrastructure. It could be made simpler by using cloud > storage, but there would still be some interaction with another > external infrastructure component. > > I find the second solution preferable, given that most tests depend > on having one or a few binaries available. I've run multi-stage > pipelines with some of those binaries (qemu-img, > $target-softmmu/qemu-system-$target) defined as artifacts and they > behaved as expected. But, this could require some intrusive changes > to the current "make"-based test invocation. It would be nice if the make check could be run with a make install'ed set of binaries. I'm not sure how much hackery would be required to get that to work nicely. Does specifying QEMU and QEMU_IMG prevent make trying to re-build everything in situ? > > Job Naming convention > --------------------- > > Based only on the very simple example jobs above, it should already be > clear that there's a lot of possibility for confusion and chaos. For > instance, by looking at the "build" job definition or results, it's > very hard to tell what it's really about. A bit more could be inferred by > the "x86_64-block" job name. > > Still, the problem we have to address here is not only about the > amount of information easily obtained from a job name, but allowing > for very similar job definitions within a global namespace. For > instance, if we add an Operating Systems component to the mix, we need > an extra qualifier for unique job names. 
> > Some of the possible components in a job definition are: > > * Stage > * Build profile > * Test set (a shorter name for what was described in the "Common set > of tests" section) > * Host architecture > * Target architecture > * Host Operating System identification (name and version) > * Execution mode/environment (bare metal, container, VM, etc) > > Stage > ~~~~~ > > The stage of a job (which maps roughly to its purpose) should be > clearly defined. A job that builds QEMU should start with "build" and > a job that tests QEMU should start with "test". > > IMO, in a second phase, once multi-stage pipelines are taken for > granted, we could evaluate dropping this component altogether from the > naming convention, and relying purely on the stage classification. > > Build profile > ~~~~~~~~~~~~~ > > Different build profiles already abound in QEMU's various CI > configuration files. It's hard to put a naming convention here, > except that it should represent the most distinguishable > characteristics of the build configuration. For instance, we can find > a "build-disabled" job in the current ".gitlab-ci.yml" file that is > aptly named, as it forcefully disables a lot of build options. > > Test set > ~~~~~~~~ > > As mentioned in the "Common set of tests" section, I believe that the > make target name can be used to identify the test set that will be > executed in a job. That is, if a job is to be run at the "test" > stage, and will run "make check", its name should start with > "test-check". > > QEMU Targets > ~~~~~~~~~~~~ > > Because a given job could, and usually does, involve multiple targets, I > honestly cannot think of how to add this to the naming convention. > I'll ignore it for now, and consider the targets are defined in the > build profile. 
I like to think of three groups: Core SoftMMU - the major KVM architectures The rest of SoftMMU - all our random emulation targets linux-user > > Host Architecture > ~~~~~~~~~~~~~~~~~ > > The host architecture name convention should be an easy pick, given > that QEMU itself employs an architecture convention for its targets. > > Host OS > ~~~~~~~ > > The suggestion I have for the host OS name is to follow the > libosinfo[10] convention as closely as possible. libosinfo's "Short > ID" should be well suitable here. Examples include: "openbsd4.2", > "opensuse42.3", "rhel8.0", "ubuntu9.10" and "win2k12r2". > > Execution Environment > ~~~~~~~~~~~~~~~~~~~~~ > > Distinguishing between running tests in a bare-metal versus a nested > VM environment is quite significant to a number of people. > > Still, I think it could probably be optional for the initial > implementation phase, like the naming convention for the QEMU Targets. > > Example 1 > ~~~~~~~~~ > > Defining a job that will build QEMU with common debug options, on > a RHEL 8.0 system on a x86_64 host: > > build-debug-rhel8.0-x86_64: > script: > - ./configure --enable-debug > - make > > Example 2 > ~~~~~~~~~ > > Defining a job that will run the "qtest" test set on a NetBSD 8.1 > system on an aarch64 host: > > test-qtest-netbsd8.1-aarch64: > script: > - make check-qtest > > Job and Machine Scheduling > -------------------------- > > While the naming convention gives some information to human beings, > and hopefully allows for some order and avoids collisions on the > global job namespace, it's not enough to define where those jobs > should run. > > Tags[11] is the available mechanism to tie jobs to specific machines > running the GitLab CI agent, "gitlab-runner". Unfortunately, some > duplication seems unavoidable, in the sense that some of the naming > components listed above are machine specific, and will then need to be > also given as tags. 
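[Editor's note: the composition of job names and tags from the components above could be sketched with a small, purely hypothetical shell helper; the function names are illustrative and not part of any proposed tooling.]

```shell
#!/bin/sh
# Hypothetical sketch: compose a job name from the naming-convention
# components (stage, test set or build profile, OS short ID, host arch),
# joined with "-" in a fixed order.
job_name() {
    echo "$*" | tr ' ' '-'
}

# Derive runner tags from the machine-specific components, with a
# qualifier prefix to avoid collisions in GitLab's global tag namespace.
runner_tags() {
    echo "os:$1,arch:$2"
}

job_name test qtest netbsd8.1 aarch64    # test-qtest-netbsd8.1-aarch64
job_name build debug rhel8.0 x86_64      # build-debug-rhel8.0-x86_64
runner_tags rhel8.0 x86_64               # os:rhel8.0,arch:x86_64
```

The first two outputs match Examples 1 and 2 above, which is the point of encoding the convention once rather than hand-writing each name.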
>
> Note: it may be a good idea to be extra verbose with tags, by having a
> qualifier prefix.  The justification is that tags also live in a
> global namespace, and in theory, at a given point, tags of different
> "categories", say a CPU name and an Operating System name, may
> collide.  Or, it may just be me being paranoid.
>
> Example 1
> ~~~~~~~~~
>
>    build-debug-rhel8.0-x86_64:
>      tags:
>        - rhel8.0
>        - x86_64
>      script:
>        - ./configure --enable-debug
>        - make
>
> Example 2
> ~~~~~~~~~
>
>    test-qtest-netbsd8.1-aarch64:
>      tags:
>        - netbsd8.1
>        - aarch64
>      script:
>        - make check-qtest

Where are all these going to go? Are we overloading the existing
gitlab.yml or are we going to have a new set of configs for the GatingCI
and keep gitlab.yml as the current subset that people run on their own
accounts?

> Operating System definition versus Container Images
> ---------------------------------------------------
>
> In the previous section and examples, we're assuming that tests will
> run on machines that have registered "gitlab-runner" agents with
> matching tags.  The tags given at gitlab-runner registration time
> would of course match the same naming convention defined earlier.
>
> So, if one is registering a "gitlab-runner" instance on an x86_64
> machine running RHEL 8.0, the tags "rhel8.0" and "x86_64" would be
> given (possibly among others).
>
> Nevertheless, most deployment scenarios will probably rely on jobs
> being executed by gitlab-runner's container executor (currently
> Docker-only).  This means that tags given to a job *may* drop the tag
> associated with the host operating system selection, and instead
> provide the ".gitlab-ci.yml" configuration directive that determines
> the container image to be used.
>
> Most jobs would probably *not* require a matching host operating
> system and container image, but there should still be the capability
> to make it a requirement.
For instance, jobs containing tests that
> require the KVM accelerator on specific scenarios may require a
> matching host Operating System.
>
> Note: what was mentioned in the "Execution Environment" section under
> the naming conventions section is also closely related to this
> requirement, that is, one may require a job to run under a container,
> VM or bare metal.
>
> Example 1
> ~~~~~~~~~
>
> Build QEMU on a "rhel8.0" image hosted under the "qemuci" organization
> and require the runner to support container execution:
>
>    build-debug-rhel8.0-x86_64:
>      tags:
>        - x86_64
>        - container
>      image: qemuci/rhel8.0
>      script:
>        - ./configure --enable-debug
>        - make
>
> Example 2
> ~~~~~~~~~
>
> Run the "check" test set on a "rhel8.0" image hosted under the
> "qemuci" organization and require the runner to support container
> execution and to be on a matching host:
>
>    test-check-rhel8.0-x86_64:
>      tags:
>        - x86_64
>        - rhel8.0
>        - container
>      image: qemuci/rhel8.0
>      script:
>        - make check
>
> Next
> ----
>
> Because this document is already too long and that can be distracting,
> I decided to defer many other implementation-level details to a second
> RFC, alongside some code.
>
> Some complementary topics that I have prepared include:
>
>  * Container images creation, hosting and management
>  * Advanced pipeline definitions
>    - Job dependencies
>    - Artifacts
>    - Results
>  * GitLab CI for Individual Contributors
>  * GitLab runner:
>    - Official and Custom Binaries
>    - Executors
>    - Security implications
>    - Helper container images for non-supported architectures
>  * Checklists for:
>    - Preparing and documenting machine setup
>    - Proposing new runners and jobs
>    - Runner and job promotions and demotions
>
> Of course any other topics that spur from this discussion will also
> be added to the following threads.
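[Editor's note: on the runner side, the container-executor setup discussed
above maps to gitlab-runner's config.toml file.  A minimal sketch only; the
runner name and token are hypothetical, and a real registration would be done
with "gitlab-runner register":]

```toml
concurrent = 1

[[runners]]
  name = "rhel8.0-x86_64"      # hypothetical runner name, matching the naming convention
  url = "https://gitlab.com/"
  token = "REDACTED"           # obtained at registration time
  executor = "docker"          # gitlab-runner's container executor
  [runners.docker]
    image = "qemuci/rhel8.0"   # default image; jobs may override it via "image:"
```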
>
> References:
> -----------
>  [1] https://wiki.qemu.org/Requirements/GatingCI
>  [2] https://lists.gnu.org/archive/html/qemu-devel/2019-03/msg04909.html
>  [3] https://docs.gitlab.com/ee/gitlab-basics/add-merge-request.html
>  [4] https://docs.gitlab.com/ee/ci/merge_request_pipelines/pipelines_for_merged_results/index.html
>  [5] https://docs.gitlab.com/ee/api/merge_requests.html#create-mr-pipeline
>  [6] https://git.linaro.org/people/peter.maydell/misc-scripts.git/tree/apply-pullreq
>  [7] https://docs.gitlab.com/ee/ci/yaml/README.html#allow_failure
>  [8] https://docs.gitlab.com/ee/ci/yaml/README.html#using-onlychanges-with-pipelines-for-merge-requests
>  [9] https://github.com/qemu/qemu/blob/fb2246882a2c8d7f084ebe0617e97ac78467d156/.gitlab-ci.yml#L70
> [10] https://libosinfo.org/
> [11] https://docs.gitlab.com/ee/ci/runners/README.html#using-tags

--
Alex Bennée

^ permalink raw reply	[flat|nested] 22+ messages in thread
* Re: [RFC] QEMU Gating CI
  2019-12-03 14:07 ` Alex Bennée
@ 2019-12-04  8:55   ` Thomas Huth
  2019-12-06 19:03   ` Cleber Rosa
  1 sibling, 0 replies; 22+ messages in thread
From: Thomas Huth @ 2019-12-04  8:55 UTC (permalink / raw)
To: Alex Bennée, Cleber Rosa
Cc: Peter Maydell, Stefan Hajnoczi, Markus Armbruster,
    Wainer dos Santos Moschetta, qemu-devel, Jeff Nelson,
    Philippe Mathieu-Daudé, Ademar Reis

On 03/12/2019 15.07, Alex Bennée wrote:
[...]
>> GitLab Jobs and Pipelines
>> -------------------------
>>
>> GitLab CI is built around two major concepts: jobs and pipelines.  The
>> current GitLab CI configuration in QEMU uses jobs only (or, putting it
>> another way, all jobs in a single pipeline stage).

Yeah, the initial gitlab-ci.yml file was one of the very first YAML
files and one of the very first CI files that I wrote, with hardly any
experience in this area ... there is definitely a lot of room for
improvement here!

>> Consider the
>> following job definition[9]:
>>
>>   build-tci:
>>    script:
>>    - TARGETS="aarch64 alpha arm hppa m68k microblaze moxie ppc64 s390x x86_64"
>>    - ./configure --enable-tcg-interpreter
>>      --target-list="$(for tg in $TARGETS; do echo -n ${tg}'-softmmu '; done)"
>>    - make -j2
>>    - make tests/boot-serial-test tests/cdrom-test tests/pxe-test
>>    - for tg in $TARGETS ; do
>>        export QTEST_QEMU_BINARY="${tg}-softmmu/qemu-system-${tg}" ;
>>        ./tests/boot-serial-test || exit 1 ;
>>        ./tests/cdrom-test || exit 1 ;
>>      done
>>    - QTEST_QEMU_BINARY="x86_64-softmmu/qemu-system-x86_64" ./tests/pxe-test
>>    - QTEST_QEMU_BINARY="s390x-softmmu/qemu-system-s390x" ./tests/pxe-test -m slow
>>
>> All the lines under "script" are performed sequentially.  It should be
>> clear that there's the possibility of breaking this down into multiple
>> stages, so that a build happens first, and then the "common set of
>> tests" run in parallel.
Using the example above, it would look something
>> like:
>>
>>   +---------------+------------------------+
>>   |  BUILD STAGE  |       TEST STAGE       |
>>   +---------------+------------------------+
>>   | +-------+     | +------------------+   |
>>   | | build |     | | boot-serial-test |   |
>>   | +-------+     | +------------------+   |
>>   |               |                        |
>>   |               | +------------------+   |
>>   |               | |    cdrom-test    |   |
>>   |               | +------------------+   |
>>   |               |                        |
>>   |               | +------------------+   |
>>   |               | | x86_64-pxe-test  |   |
>>   |               | +------------------+   |
>>   |               |                        |
>>   |               | +------------------+   |
>>   |               | |  s390x-pxe-test  |   |
>>   |               | +------------------+   |
>>   |               |                        |
>>   +---------------+------------------------+
>>
>> Of course it would be silly to break down that job into smaller jobs that
>> would run individual tests like "boot-serial-test" or "cdrom-test".  Still,
>> the pipeline approach is valid because:
>>
>> * Common set of tests would run in parallel, giving a quicker result
>>   turnaround

Ok, full ack for the idea to use separate pipelines for the testing
(Philippe once showed me this idea already, too, he's using it for EDK2
testing IIRC). But the example with the build-tci is quite bad. The
single steps here are basically just a subset of "check-qtest" to skip
the tests that we are not interested in here. If we don't care about
losing some minutes of testing, we can simply replace all those steps
with "make check-qtest" again.

I think what we really want to put into different pipelines are the
sub-steps of "make check", i.e.:

 - check-block
 - check-qapi-schema
 - check-unit
 - check-softfloat
 - check-qtest
 - check-decodetree

And of course also the other ones that are not included in "make check"
yet, e.g. "check-acceptance" etc.

> check-unit is a good candidate for parallel tests. The others depend -
> I've recently turned most make checks back to -j 1 on travis because
> it's a real pain to see what test has hung when other tests keep
> running.
If I understood correctly, it's not about running the check steps in
parallel with "make -jXX" in one pipeline, but rather about running the
different test steps in different pipelines. So you get a separate
output for each test subsystem.

>> Current limitations for a multi-stage pipeline
>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>
>> Because it's assumed that each job will happen in an isolated and
>> independent execution environment, jobs must explicitly define the
>> resources that will be shared between stages.  GitLab will make sure
>> the same source code revision will be available on all jobs
>> automatically.  Additionally, GitLab supports the concept of
>> artifacts.  By defining artifacts in the "build" stage, jobs in the
>> "test" stage can expect to have a copy of those artifacts
>> automatically.
>>
>> In theory, there's nothing that prevents an entire QEMU build
>> directory from being treated as an artifact.  In practice, there are
>> predefined limits on GitLab that prevent that from being possible,
>> resulting in errors such as:
>>
>>   Uploading artifacts...
>>   build: found 3164 matching files
>>   ERROR: Uploading artifacts to coordinator... too large archive
>>          id=xxxxxxx responseStatus=413 Request Entity Too Large
>>          status=413 Request Entity Too Large token=yyyyyyyyy
>>   FATAL: too large
>>   ERROR: Job failed: exit code 1
>>
>> As far as I can tell, this is an instance-defined limit that's clearly
>> influenced by storage costs.  I see a few possible solutions to this
>> limitation:
>>
>>  1) Provide our own "artifact"-like solution that uses our own storage
>>     solution
>>
>>  2) Reduce or eliminate the dependency on a complete build tree
>>
>> The first solution can go against the general trend of not having to
>> maintain CI infrastructure.  It could be made simpler by using cloud
>> storage, but there would still be some interaction with another
>> external infrastructure component.
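[Editor's note: the per-subsystem split that Thomas describes above could be
sketched in ".gitlab-ci.yml" roughly as follows.  The job names, target list
and artifact paths are illustrative assumptions, not a tested configuration:]

```yaml
stages:
  - build
  - test

build-x86_64:
  stage: build
  script:
    - ./configure --target-list=x86_64-softmmu
    - make -j2
  artifacts:
    paths:              # a hypothetical minimal set, per the size-limit
      - x86_64-softmmu/ # discussion above
      - tests/

# Each "make check" sub-step becomes its own test-stage job, so they run
# in parallel and each gets a separate status and log.
test-check-unit-x86_64:
  stage: test
  script:
    - make check-unit

test-check-qtest-x86_64:
  stage: test
  script:
    - make check-qtest
```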
>>
>> I find the second solution preferable, given that most tests depend
>> on having one or a few binaries available.  I've run multi-stage
>> pipelines with some of those binaries (qemu-img,
>> $target-softmmu/qemu-system-$target) defined as artifacts and they
>> behaved as expected.  But, this could require some intrusive changes
>> to the current "make"-based test invocation.

I think it should be sufficient to define a simple set of artifacts
like:

 - tests/*
 - *-softmmu/qemu-system-*
 - qemu-img, qemu-nbd ... and all the other helper binaries
 - Makefile*

... and maybe some more missing files. It's some initial work, but once
we have the basic list, I don't expect to change it much in the course
of time.

 Thomas
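[Editor's note: Thomas's file list above, expressed as a hedged "artifacts"
stanza.  The paths are a sketch only; glob handling and the exact build-tree
layout would need to be checked against a real build:]

```yaml
build:
  stage: build
  script:
    - ./configure
    - make -j2
  artifacts:
    expire_in: 2 days            # keep instance storage costs in check
    paths:
      - tests/
      - "*-softmmu/qemu-system-*"
      - qemu-img
      - qemu-nbd
      - Makefile*
```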
* Re: [RFC] QEMU Gating CI
  2019-12-03 14:07 ` Alex Bennée
  2019-12-04  8:55   ` Thomas Huth
@ 2019-12-06 19:03   ` Cleber Rosa
  1 sibling, 0 replies; 22+ messages in thread
From: Cleber Rosa @ 2019-12-06 19:03 UTC (permalink / raw)
To: Alex Bennée
Cc: Peter Maydell, Stefan Hajnoczi, qemu-devel,
    Wainer dos Santos Moschetta, Markus Armbruster, Jeff Nelson,
    Ademar Reis

[-- Attachment #1: Type: text/plain, Size: 27651 bytes --]

On Tue, Dec 03, 2019 at 02:07:32PM +0000, Alex Bennée wrote:
>
> Cleber Rosa <crosa@redhat.com> writes:
>
> > RFC: QEMU Gating CI
> > ===================
> >
> > This RFC attempts to address most of the issues described in
> > "Requirements/GatingCI"[1].  An also relevant write-up is the "State
> > of QEMU CI as we enter 4.0"[2].
> >
> > The general approach is one to minimize the infrastructure
> > maintenance and development burden, leveraging as much as possible
> > "other people's" infrastructure and code.  GitLab's CI/CD platform is
> > the most relevant component dealt with here.
> >
> > Problem Statement
> > -----------------
> >
> > The following is copied verbatim from Peter Maydell's write-up[1]:
> >
> > "A gating CI is a prerequisite to having a multi-maintainer model of
> > merging.  By having a common set of tests that are run prior to a
> > merge you do not rely on who is currently doing merging duties having
> > access to the current set of test machines."
> >
> > This is a very simplified view of the problem that I'd like to break
> > down even further into the following key points:
> >
> >  * Common set of tests
> >  * Pre-merge ("prior to a merge")
> >  * Access to the current set of test machines
> >  * Multi-maintainer model
> >
> > Common set of tests
> > ~~~~~~~~~~~~~~~~~~~
> >
> > Before we delve any further, let's make it clear that a "common set
> > of tests" is really a "dynamic common set of tests".  My point is
> > that a set of tests in QEMU may include or exclude different tests
> > depending on the environment.
> >
> > The exact tests that will be executed may differ depending on the
> > environment, including:
> >
> >  * Hardware
> >  * Operating system
> >  * Build configuration
> >  * Environment variables
> >
> > In the "State of QEMU CI as we enter 4.0" Alex Bennée listed some of
> > those "common set of tests":
> >
> >  * check
>
> Check encompasses a subset of the other checks - currently:
>
>  - check-unit
>  - check-qtest
>  - check-block
>
> The thing that stops other groups of tests being included is generally
> whether they are solid on all the various hw/os/config/env setups you
> describe.  For example check-tcg currently fails gloriously on non-x86
> with docker enabled as it tries to get all the cross compiler images
> working.
>

Right.

> >  * check-tcg
> >  * check-softfloat
> >  * check-block
> >  * check-acceptance
> >
> > While Peter mentions that most of his checks are limited to:
> >
> >  * check
> >  * check-tcg
> >
> > Our current inability to quickly identify a faulty test from test
> > execution results (and especially in remote environments), and act
> > upon it (say, quickly disable it on a given host platform), makes me
> > believe that it's fair to start a gating CI implementation that uses
> > this rather coarse granularity.
> >
> > Another benefit is a close or even a 1:1 relationship between a
> > common test set and an entry in the CI configuration.  For instance,
> > the "check" common test set would map to a "make check" command in a
> > "script:" YAML entry.
> >
> > To exemplify my point, if one specific test run as part of
> > "check-tcg" is found to be faulty on a specific job (say on a
> > specific OS), the entire "check-tcg" test set may be disabled as a
> > CI-level maintenance action.
>
> This would in this example eliminate practically all emulation testing
> apart from the very minimal boot-codes that get spun up by the various
> qtest migration tests.
And of course the longer a group of tests is
> disabled the larger the window for additional regressions to get in.
>
> It may be a reasonable approach but it's not without consequence.
>

Thanks for the insights.  I agree that there are tradeoffs, I bet that
I'm speculating a lot, and there's just too much to be learned here.

My main point in proposing this very crude rule, though, is trying to
address the operational difficulties when such a Gating CI grows beyond
a single *machine* and *job* maintainer.  A practical example may help.
Let's consider the following jobs are initially (in phase 1) active and
*gating*:

 * test-qtest-ubuntu19.04-x86_64
 * test-qtest-ubuntu19.04-aarch64
 * test-qtest-ubuntu19.04-s390x

Then, because the model has proven successful, a new job that has
already been running for a while with successful results, but with no
influence on gating, is added to the gating group.  This job is being
run on a machine that is managed by a different maintainer:

 * test-qtest-centos8.0-ppc64le

After some time, the test-qtest-centos8.0-ppc64le job starts to fail,
with seemingly no relation to recently merged code.  From a CI
management perspective, disabling the job completely is reasonable if:

 * the machine seems to be faulty
 * the ppc64le machine maintainer is unresponsive
 * there's no mechanism to disable a portion of the job (such as a
   specific test)
 * a bug has been found but there's no short-term fix

This doesn't mean that a lot of tests will be eliminated for good.
Unless the machine is faulty, it's expected that the tests will
continue to run, but without gating powers.  Also, it's expected that
the same (or similar) tests will be running on other machines/jobs.
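[Editor's note: GitLab's "allow_failure" keyword (reference [7] in the RFC)
gives one concrete shape to such a demotion.  A hedged sketch using the
hypothetical job discussed above:]

```yaml
# Demoted job: it still runs and reports its result, but a failure no
# longer blocks the pipeline (i.e. it is no longer gating).
test-qtest-centos8.0-ppc64le:
  stage: test
  tags:
    - centos8.0
    - ppc64le
  script:
    - make check-qtest
  allow_failure: true   # flip back to false (or drop) to re-promote
```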
IMO, it can actually be the opposite, when "skip test y on platform x"
conditions hidden in test code can survive a lot longer than a disabled
job/machine with a frustrated (and engaged) maintainer trying to get it
back to a "green" status, and then to a reliable status for a given
time, so that it can be considered a gating job again.

> > Of course a follow-up action to deal with the specific test
> > is required, probably in the form of a Launchpad bug and patches
> > dealing with the issue, but without necessarily a CI-related angle
> > to it.
> >
> > If/when test result presentation and control mechanisms evolve, we
> > may feel confident and move to finer granularity.  For instance, a
> > mechanism for disabling nothing but "tests/migration-test" on a
> > given environment would be possible and desirable from a CI
> > management level.
>
> The migration tests have found regressions, although the problem has
> generally been that they were intermittent failures and hard to
> reproduce locally.  The last one took a few weeks of grinding to
> reproduce and get patches together.
>

Right.  So I believe we are in sync about the nature of the problem,
that is, that some tests would benefit from being individually pulled
from specific jobs until a permanent solution can be applied to them.

At the same time, if we can't do that (see the conditions that may
render us unable to do it), it would be fair to remove a CI job from
being gating.

> > Pre-merge
> > ~~~~~~~~~
> >
> > The natural way to have pre-merge CI jobs in GitLab is to send
> > "Merge Requests"[3] (abbreviated as "MR" from now on).  In most
> > projects, an MR comes from individual contributors, usually the
> > authors of the changes themselves.  It's my understanding that the
> > current maintainer model employed in QEMU will *not* change at this
> > time, meaning that code contributions and reviews will continue to
> > happen on the mailing list.
A maintainer then, having collected a number of patches, would
> > submit an MR either in addition or in substitution to the pull
> > requests sent to the mailing list.
> >
> > "Pipelines for Merged Results"[4] is a very important feature to
> > support the multi-maintainer model, and looks, in practice, similar
> > to Peter's "staging" branch approach, with an "automatic refresh" of
> > the target branch.  It can give a maintainer extra confidence that
> > an MR will play nicely with the updated status of the target branch.
> > It's my understanding that it should be the "key to the gates".  A
> > minor note is that conflicts are still possible in a
> > multi-maintainer model if more than one person is doing the merges.
> >
> > A worthy point is that the GitLab web UI is not the only way to
> > create a Merge Request, as a rich set of APIs is available[5].  This
> > is interesting for many reasons, and maybe some of Peter's
> > "apply-pullreq"[6] actions (such as the bad UTF8 or bogus qemu-devel
> > email address checks) could be made earlier as part of a
> > "send-mergereq"-like script, bringing conformance earlier into the
> > merge process, at the MR creation stage.
> >
> > Note: it's possible to have CI job definitions that are specific to
> > MRs, allowing generic non-MR jobs to be kept in the default
> > configuration.  This can be used so individual contributors continue
> > to leverage some of the "free" (shared) runners made available on
> > gitlab.com.
> >
> > Multi-maintainer model
> > ~~~~~~~~~~~~~~~~~~~~~~
> >
> > The previous section already introduced some of the proposed
> > workflow that can enable such a multi-maintainer model.  With a
> > Gating CI system, though, it will be natural to have a smaller "mean
> > time between (CI) failures", simply because of the expected
> > increased number of systems and checks.  A lot of countermeasures
> > have to be employed to keep that MTBF in check.
> >
> > For one, it's imperative that the maintainers for such systems and
> > jobs are clearly defined and readily accessible.  Either the same
> > MAINTAINERS file or a more suitable variation of such data should be
> > defined before activating the *gating* rules.  This would allow
> > routing requests for attention to the responsible maintainer.
> >
> > In case of unresponsive maintainers, or any other condition that
> > renders and keeps one or more CI jobs failing for a given previously
> > established amount of time, the job can be demoted with an
> > "allow_failure" configuration[7].  Once such a change is committed,
> > the path to promotion would be just the same as for a newly added
> > job definition.
> >
> > Note: in a future phase we can evaluate the creation of rules that
> > look at changed paths in an MR (similar to "F:" entries in
> > MAINTAINERS) and the execution of specific CI jobs, which would be
> > the responsibility of a given maintainer[8].
> >
> > Access to the current set of test machines
> > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >
> > When compared to the various CI systems and services already being
> > employed in the QEMU project, this is the most striking difference
> > in the proposed model.  Instead of relying on shared/public/free
> > resources, this proposal also deals with privately owned and
> > operated machines.
> >
> > Even though the QEMU project operates with great cooperation, it's
> > crucial to define clear boundaries when it comes to machine access.
> > Restricted access to machines is important because:
> >
> >  * The results of jobs are many times directly tied to the setup and
> >    status of machines.  Even "soft" changes such as removing or
> >    updating packages can introduce failures in jobs (this is greatly
> >    minimized, but not completely eliminated, when using containers
> >    or VMs).  Updating firmware or changing its settings are also
> >    examples of changes that may change the outcome of jobs.
> >
> >  * If maintainers are to be held accountable for the status of the
> >    jobs defined to run on specific machines, they must be sure of
> >    the status of those machines.
> >
> >  * Machines need regular monitoring and will receive required
> >    maintenance actions, which can cause job regressions.
> >
> > Thus, there needs to be one clear way for machines to be *used* for
> > running jobs sent by different maintainers, while still prohibiting
> > any other privileged action that can cause permanent change to the
> > machine.  The GitLab agent (gitlab-runner) is designed to do just
> > that, and defining what will be executed in a job (on a given
> > system) should be all that's generally allowed.  The job definition
> > itself will of course be subject to code review before a maintainer
> > decides to send an MR containing such new or updated job
> > definitions.
> >
> > Still related to machine maintenance, it's highly desirable for jobs
> > tied to specific host machines to be introduced alongside
> > documentation and/or scripts that can replicate the machine setup.
> > If the machine setup steps can be easily and reliably reproduced,
> > then:
> >
> >  * Other people may help to debug issues and regressions if they
> >    happen to have the same hardware available
> >
> >  * Other people may provide more machines to run the same types of
> >    jobs
> >
> >  * If a machine maintainer goes MIA, it'd be easier to find another
> >    maintainer
> >
> > GitLab Jobs and Pipelines
> > -------------------------
> >
> > GitLab CI is built around two major concepts: jobs and pipelines.
> > The current GitLab CI configuration in QEMU uses jobs only (or,
> > putting it another way, all jobs in a single pipeline stage).
Consider the
> > following job definition[9]:
> >
> >   build-tci:
> >    script:
> >    - TARGETS="aarch64 alpha arm hppa m68k microblaze moxie ppc64 s390x x86_64"
> >    - ./configure --enable-tcg-interpreter
> >      --target-list="$(for tg in $TARGETS; do echo -n ${tg}'-softmmu '; done)"
> >    - make -j2
> >    - make tests/boot-serial-test tests/cdrom-test tests/pxe-test
> >    - for tg in $TARGETS ; do
> >        export QTEST_QEMU_BINARY="${tg}-softmmu/qemu-system-${tg}" ;
> >        ./tests/boot-serial-test || exit 1 ;
> >        ./tests/cdrom-test || exit 1 ;
> >      done
> >    - QTEST_QEMU_BINARY="x86_64-softmmu/qemu-system-x86_64" ./tests/pxe-test
> >    - QTEST_QEMU_BINARY="s390x-softmmu/qemu-system-s390x" ./tests/pxe-test -m slow
> >
> > All the lines under "script" are performed sequentially.  It should
> > be clear that there's the possibility of breaking this down into
> > multiple stages, so that a build happens first, and then the "common
> > set of tests" run in parallel.  Using the example above, it would
> > look something like:
> >
> >   +---------------+------------------------+
> >   |  BUILD STAGE  |       TEST STAGE       |
> >   +---------------+------------------------+
> >   | +-------+     | +------------------+   |
> >   | | build |     | | boot-serial-test |   |
> >   | +-------+     | +------------------+   |
> >   |               |                        |
> >   |               | +------------------+   |
> >   |               | |    cdrom-test    |   |
> >   |               | +------------------+   |
> >   |               |                        |
> >   |               | +------------------+   |
> >   |               | | x86_64-pxe-test  |   |
> >   |               | +------------------+   |
> >   |               |                        |
> >   |               | +------------------+   |
> >   |               | |  s390x-pxe-test  |   |
> >   |               | +------------------+   |
> >   |               |                        |
> >   +---------------+------------------------+
> >
> > Of course it would be silly to break down that job into smaller jobs
> > that would run individual tests like "boot-serial-test" or
> > "cdrom-test".  Still, the pipeline approach is valid because:
> >
> >  * Common set of tests would run in parallel, giving a quicker
> >    result turnaround
>
> check-unit is a good candidate for parallel tests.
The others depend -
> I've recently turned most make checks back to -j 1 on travis because
> it's a real pain to see what test has hung when other tests keep
> running.
>

Agreed.  Running tests in parallel while keeping/presenting traceable
results is a real problem, especially in a remote environment such as
in a CI.

FYI, slightly off-topic and Avocado-specific: Avocado keeps each test
result in a separate directory and log file, which helps with that.
I'm bridging the parallel test runner (introduced a few releases back)
to the existing result/reporting infrastructure.  My goal is to have
that running the acceptance tests shortly.

> >  * It's easier to determine the possible nature of the problem with
> >    just the basic CI job status
> >
> >  * Different maintainers could be defined for different "common sets
> >    of tests", and again by leveraging the basic CI job status,
> >    automation for directed notification can be implemented
> >
> > In the following example, "check-block" maintainers could be left
> > undisturbed by failures in the "check-acceptance" job:
> >
> >   +---------------+------------------------+
> >   |  BUILD STAGE  |       TEST STAGE       |
> >   +---------------+------------------------+
> >   | +-------+     | +------------------+   |
> >   | | build |     | |   check-block    |   |
> >   | +-------+     | +------------------+   |
> >   |               |                        |
> >   |               | +------------------+   |
> >   |               | | check-acceptance |   |
> >   |               | +------------------+   |
> >   |               |                        |
> >   +---------------+------------------------+
> >
> > The same logic applies for test sets for different targets.
For
> > instance, combining the two previous examples, there could be
> > different maintainers defined for the different jobs on the test
> > stage:
> >
> >   +---------------+------------------------+
> >   |  BUILD STAGE  |       TEST STAGE       |
> >   +---------------+------------------------+
> >   | +-------+     | +------------------+   |
> >   | | build |     | |   x86_64-block   |   |
> >   | +-------+     | +------------------+   |
> >   |               |                        |
> >   |               | +------------------+   |
> >   |               | | x86_64-acceptance|   |
> >   |               | +------------------+   |
> >   |               |                        |
> >   |               | +------------------+   |
> >   |               | |   s390x-block    |   |
> >   |               | +------------------+   |
> >   |               |                        |
> >   |               | +------------------+   |
> >   |               | | s390x-acceptance |   |
> >   |               | +------------------+   |
> >   +---------------+------------------------+
> >
> > Current limitations for a multi-stage pipeline
> > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >
> > Because it's assumed that each job will happen in an isolated and
> > independent execution environment, jobs must explicitly define the
> > resources that will be shared between stages.  GitLab will make
> > sure the same source code revision will be available on all jobs
> > automatically.  Additionally, GitLab supports the concept of
> > artifacts.  By defining artifacts in the "build" stage, jobs in the
> > "test" stage can expect to have a copy of those artifacts
> > automatically.
> >
> > In theory, there's nothing that prevents an entire QEMU build
> > directory from being treated as an artifact.  In practice, there
> > are predefined limits on GitLab that prevent that from being
> > possible, resulting in errors such as:
> >
> >   Uploading artifacts...
> >   build: found 3164 matching files
> >   ERROR: Uploading artifacts to coordinator...
too large archive
> >          id=xxxxxxx responseStatus=413 Request Entity Too Large
> >          status=413 Request Entity Too Large token=yyyyyyyyy
> >   FATAL: too large
> >   ERROR: Job failed: exit code 1
> >
> > As far as I can tell, this is an instance-defined limit that's
> > clearly influenced by storage costs.  I see a few possible solutions
> > to this limitation:
> >
> >  1) Provide our own "artifact"-like solution that uses our own
> >     storage solution
> >
> >  2) Reduce or eliminate the dependency on a complete build tree
> >
> > The first solution can go against the general trend of not having to
> > maintain CI infrastructure.  It could be made simpler by using cloud
> > storage, but there would still be some interaction with another
> > external infrastructure component.
> >
> > I find the second solution preferable, given that most tests depend
> > on having one or a few binaries available.  I've run multi-stage
> > pipelines with some of those binaries (qemu-img,
> > $target-softmmu/qemu-system-$target) defined as artifacts and they
> > behaved as expected.  But, this could require some intrusive changes
> > to the current "make"-based test invocation.
>
> It would be nice if the make check could be run with a make install'ed
> set of binaries.  I'm not sure how much hackery would be required to
> get that to work nicely.  Does specifying QEMU and QEMU_IMG prevent
> make trying to re-build everything in situ?
>

At this point, I don't know how hard it'd be.  I'll certainly give it a
try.  Thomas has provided some extra info in another response to this
thread too.

> > Job Naming Convention
> > ---------------------
> >
> > Based only on the very simple example jobs above, it should already
> > be clear that there's a lot of potential for confusion and chaos.
> > For instance, by looking at the "build" job definition or results,
> > it's very hard to tell what it's really about.  A bit more could be
> > inferred by the "x86_64-block" job name.
> >
> > Still, the problem we have to address here is not only about the
> > amount of information easily obtained from a job name, but allowing
> > for very similar job definitions within a global namespace.  For
> > instance, if we add an Operating System component to the mix, we
> > need an extra qualifier for unique job names.
> >
> > Some of the possible components in a job definition are:
> >
> >  * Stage
> >  * Build profile
> >  * Test set (a shorter name for what was described in the "Common
> >    set of tests" section)
> >  * Host architecture
> >  * Target architecture
> >  * Host Operating System identification (name and version)
> >  * Execution mode/environment (bare metal, container, VM, etc)
> >
> > Stage
> > ~~~~~
> >
> > The stage of a job (which maps roughly to its purpose) should be
> > clearly defined.  A job that builds QEMU should start with "build"
> > and a job that tests QEMU should start with "test".
> >
> > IMO, in a second phase, once multi-stage pipelines are taken for
> > granted, we could evaluate dropping this component altogether from
> > the naming convention, relying purely on the stage classification.
> >
> > Build profile
> > ~~~~~~~~~~~~~
> >
> > Different build profiles already abound in QEMU's various CI
> > configuration files.  It's hard to put a naming convention here,
> > except that it should represent the most distinguishable
> > characteristics of the build configuration.  For instance, we can
> > find a "build-disabled" job in the current ".gitlab-ci.yml" file
> > that is aptly named, as it forcefully disables a lot of build
> > options.
> >
> > Test set
> > ~~~~~~~~
> >
> > As mentioned in the "Common set of tests" section, I believe that
> > the make target name can be used to identify the test set that will
> > be executed in a job.  That is, if a job is to be run at the "test"
> > stage, and will run "make check", its name should start with
> > "test-check".
> > > > QEMU Targets > > ~~~~~~~~~~~~ > > > > Because a given job could, and usually does, involve multiple targets, I > > honestly cannot think of how to add this to the naming convention. > > I'll ignore it for now, and consider the targets are defined in the > > build profile. > > I like to think of three groups: > > Core SoftMMU - the major KVM architectures > The rest of SoftMMU - all our random emulation targets > linux-user > OK, makes sense. It'd be nice to know if others share the same general idea. I'll check how pervasive this general definition is in the documentation too. > > > > Host Architecture > > ~~~~~~~~~~~~~~~~~ > > > > The host architecture naming convention should be an easy pick, given > > that QEMU itself employs an architecture convention for its targets. > > > > Host OS > > ~~~~~~~ > > > > The suggestion I have for the host OS name is to follow the > > libosinfo[10] convention as closely as possible. libosinfo's "Short > > ID" should be well suited here. Examples include: "openbsd4.2", > > "opensuse42.3", "rhel8.0", "ubuntu9.10" and "win2k12r2". > > > > Execution Environment > > ~~~~~~~~~~~~~~~~~~~~~ > > > > Distinguishing between running tests on bare metal versus in a nested > > VM environment is quite significant to a number of people. > > > > Still, I think it could probably be optional for the initial > > implementation phase, like the naming convention for the QEMU Targets.
> > > > Example 1 > > ~~~~~~~~~ > > > > Defining a job that will build QEMU with common debug options, on > > a RHEL 8.0 system on an x86_64 host: > > > > build-debug-rhel8.0-x86_64: > > script: > > - ./configure --enable-debug > > - make > > > > Example 2 > > ~~~~~~~~~ > > > > Defining a job that will run the "qtest" test set on a NetBSD 8.1 > > system on an aarch64 host: > > > > test-qtest-netbsd8.1-aarch64: > > script: > > - make check-qtest > > > > Job and Machine Scheduling > > -------------------------- > > > > While the naming convention gives some information to human beings, > > and hopefully allows for some order and avoids collisions in the > > global job namespace, it's not enough to define where those jobs > > should run. > > > > Tags[11] are the available mechanism to tie jobs to specific machines > > running the GitLab CI agent, "gitlab-runner". Unfortunately, some > > duplication seems unavoidable, in the sense that some of the naming > > components listed above are machine specific, and will then also need > > to be given as tags. > > > > Note: it may be a good idea to be extra verbose with tags, by having a > > qualifier prefix. The justification is that tags also live in a > > global namespace, and in theory, at a given point, tags of different > > "categories", say a CPU name and an Operating System name, may collide. > > Or, it may just be me being paranoid. > > > > Example 1 > > ~~~~~~~~~ > > > > build-debug-rhel8.0-x86_64: > > tags: > > - rhel8.0 > > - x86_64 > > script: > > - ./configure --enable-debug > > - make > > > > Example 2 > > ~~~~~~~~~ > > > > test-qtest-netbsd8.1-aarch64: > > tags: > > - netbsd8.1 > > - aarch64 > > script: > > - make check-qtest > > Where are all these going to go? Are we overloading the existing > gitlab.yml or are we going to have a new set of configs for the GatingCI > and keep gitlab.yml as the current subset that people run on their own > accounts?
> These will have to go into the existing ".gitlab-ci.yml" file, because current GitLab has no support for multiple pipelines, and no support for multiple "gitlab.yml" files. That's one of the reasons why I took the time to describe the proposal, because normalizing the current file to receive extra jobs is, if not necessary, highly desirable. Thanks for the feedback and insights, - Cleber. [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [RFC] QEMU Gating CI 2019-12-02 14:05 [RFC] QEMU Gating CI Cleber Rosa 2019-12-02 17:00 ` Stefan Hajnoczi 2019-12-03 14:07 ` Alex Bennée @ 2019-12-03 17:54 ` Peter Maydell 2019-12-05 5:05 ` Cleber Rosa 2020-01-17 14:33 ` Peter Maydell 3 siblings, 1 reply; 22+ messages in thread From: Peter Maydell @ 2019-12-03 17:54 UTC (permalink / raw) To: Cleber Rosa Cc: Jeff Nelson, QEMU Developers, Wainer dos Santos Moschetta, Markus Armbruster, Stefan Hajnoczi, Alex Bennée, Ademar Reis On Mon, 2 Dec 2019 at 14:06, Cleber Rosa <crosa@redhat.com> wrote: > > RFC: QEMU Gating CI > =================== > > This RFC attempts to address most of the issues described in > "Requirements/GatinCI"[1]. An also relevant write up is the "State of > QEMU CI as we enter 4.0"[2]. > > The general approach is one to minimize the infrastructure maintenance > and development burden, leveraging as much as possible "other people's" > infrastructure and code. GitLab's CI/CD platform is the most relevant > component dealt with here. Thanks for writing up this RFC. My overall view is that there's some interesting stuff in here and definitely some things we'll want to cover at some point, but there's also a fair amount that is veering away from solving the immediate problem we want to solve, and which we should thus postpone for later (beyond making some reasonable efforts not to design something which paints us into a corner so it's annoyingly hard to improve later). > To exemplify my point, if one specific test run as part of "check-tcg" > is found to be faulty on a specific job (say on a specific OS), the > entire "check-tcg" test set may be disabled as a CI-level maintenance > action. Of course a follow up action to deal with the specific test > is required, probably in the form of a Launchpad bug and patches > dealing with the issue, but without necessarily a CI related angle to > it. 
> > If/when test result presentation and control mechanism evolve, we may > feel confident and go into finer grained granularity. For instance, a > mechanism for disabling nothing but "tests/migration-test" on a given > environment would be possible and desirable from a CI management level. For instance, we don't have anything today for granularity of definition of what tests we run where or where we disable them. So we don't need it in order to move away from the scripting approach I have at the moment. We can just say "the CI system will run make and make check (and maybe in some hosts some additional test-running commands) on these hosts" and hardcode that into whatever yaml file the CI system's configured in. > Pre-merge > ~~~~~~~~~ > > The natural way to have pre-merge CI jobs in GitLab is to send "Merge > Requests"[3] (abbreviated as "MR" from now on). In most projects, a > MR comes from individual contributors, usually the authors of the > changes themselves. It's my understanding that the current maintainer > model employed in QEMU will *not* change at this time, meaning that > code contributions and reviews will continue to happen on the mailing > list. A maintainer then, having collected a number of patches, would > submit a MR either in addition or in substitution to the Pull Requests > sent to the mailing list. Eventually it would be nice to allow any submaintainer to send a merge request to the CI system (though you would want it to have a "but don't apply until somebody else approves it" gate as well as the automated testing part). But right now all we need is for the one person managing merges and releases to be able to say "here's the branch where I merged this pullrequest, please test it". At any rate, supporting multiple submaintainers all talking to the CI independently should be out of scope for now. 
> Multi-maintainer model > ~~~~~~~~~~~~~~~~~~~~~~ > > The previous section already introduced some of the proposed workflow > that can enable such a multi-maintainer model. With a Gating CI > system, though, it will be natural to have a smaller "Mean time > between (CI) failures", simply because of the expected increased > number of systems and checks. A lot of countermeasures have to be > employed to keep that MTBF in check. > > For one, it's imperative that the maintainers for such systems and > jobs are clearly defined and readily accessible. Either the same > MAINTAINERS file or a more suitable variation of such data should be > defined before activating the *gating* rules. This would allow issues > to be routed to the attention of the responsible maintainer. > > In case of unresponsive maintainers, or any other condition that > renders and keeps one or more CI jobs failing for a given previously > established amount of time, the job can be demoted with an > "allow_failure" configuration[7]. Once such a change is committed, the > path to promotion would be just the same as in a newly added job > definition. > > Note: In a future phase we can evaluate the creation of rules that > look at changed paths in an MR (similar to "F:" entries on MAINTAINERS) > and the execution of specific CI jobs, which would be the > responsibility of a given maintainer[8]. All this stuff is not needed to start with. We cope at the moment with "everything is gating, and if something doesn't pass it needs to be fixed or manually removed from the setup". > GitLab Jobs and Pipelines > ------------------------- > > GitLab CI is built around two major concepts: jobs and pipelines. The > current GitLab CI configuration in QEMU uses jobs only (or putting it > another way, all jobs in a single pipeline stage).
Consider the > following job definition[9]: > > build-tci: > script: > - TARGETS="aarch64 alpha arm hppa m68k microblaze moxie ppc64 s390x x86_64" > - ./configure --enable-tcg-interpreter > --target-list="$(for tg in $TARGETS; do echo -n ${tg}'-softmmu '; done)" > - make -j2 > - make tests/boot-serial-test tests/cdrom-test tests/pxe-test > - for tg in $TARGETS ; do > export QTEST_QEMU_BINARY="${tg}-softmmu/qemu-system-${tg}" ; > ./tests/boot-serial-test || exit 1 ; > ./tests/cdrom-test || exit 1 ; > done > - QTEST_QEMU_BINARY="x86_64-softmmu/qemu-system-x86_64" ./tests/pxe-test > - QTEST_QEMU_BINARY="s390x-softmmu/qemu-system-s390x" ./tests/pxe-test -m slow > > All the lines under "script" are performed sequentially. It should be > clear that there's the possibility of breaking this down into multiple > stages, so that a build happens first, and then "common set of tests" > run in parallel. We could do this, but we don't do it today, so we don't need to think about this at all to start with. > In theory, there's nothing that prevents an entire QEMU build > directory from being treated as an artifact. In practice, there are > predefined limits on GitLab that prevent that from being possible, ...so we don't need to worry about somehow defining some cut-down "build artefact" that we provide to the testing phase. Just do a build and test run as a single thing. We can always come back and improve later. Have you been able to investigate and confirm that we can get a gitlab-runner setup that works on non-x86? That seems to me like an important thing we should be confident about early before we sink too much effort into a gitlab-based solution. thanks -- PMM ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [RFC] QEMU Gating CI 2019-12-03 17:54 ` Peter Maydell @ 2019-12-05 5:05 ` Cleber Rosa 0 siblings, 0 replies; 22+ messages in thread From: Cleber Rosa @ 2019-12-05 5:05 UTC (permalink / raw) To: Peter Maydell Cc: Jeff Nelson, QEMU Developers, Wainer dos Santos Moschetta, Markus Armbruster, Stefan Hajnoczi, Alex Bennée, Ademar Reis On Tue, Dec 03, 2019 at 05:54:38PM +0000, Peter Maydell wrote: > On Mon, 2 Dec 2019 at 14:06, Cleber Rosa <crosa@redhat.com> wrote: > > > > RFC: QEMU Gating CI > > =================== > > > > This RFC attempts to address most of the issues described in > > "Requirements/GatinCI"[1]. An also relevant write up is the "State of > > QEMU CI as we enter 4.0"[2]. > > > > The general approach is one to minimize the infrastructure maintenance > > and development burden, leveraging as much as possible "other people's" > > infrastructure and code. GitLab's CI/CD platform is the most relevant > > component dealt with here. > > Thanks for writing up this RFC. > > My overall view is that there's some interesting stuff in > here and definitely some things we'll want to cover at some > point, but there's also a fair amount that is veering away > from solving the immediate problem we want to solve, and > which we should thus postpone for later (beyond making some > reasonable efforts not to design something which paints us > into a corner so it's annoyingly hard to improve later). > Right. I think this is a valid perspective to consider as we define the order and scope of tasks. I'll follow up with a more straightforward suggestion with the bare minimum actions for a first round. > > To exemplify my point, if one specific test run as part of "check-tcg" > > is found to be faulty on a specific job (say on a specific OS), the > > entire "check-tcg" test set may be disabled as a CI-level maintenance > > action.
Of course a follow up action to deal with the specific test > > is required, probably in the form of a Launchpad bug and patches > > dealing with the issue, but without necessarily a CI related angle to > > it. > > > > If/when test result presentation and control mechanism evolve, we may > > feel confident and go into finer grained granularity. For instance, a > > mechanism for disabling nothing but "tests/migration-test" on a given > > environment would be possible and desirable from a CI management level. > > For instance, we don't have anything today for granularity of > definition of what tests we run where or where we disable them. > So we don't need it in order to move away from the scripting > approach I have at the moment. We can just say "the CI system > will run make and make check (and maybe in some hosts some > additional test-running commands) on these hosts" and hardcode > that into whatever yaml file the CI system's configured in. > I absolutely agree. That's why I even considered *if* this will be done, and not only *when*. Because I happen to be biased from working on a test runner/framework, this is something that I had to at least talk about, so that it can be evaluated and maybe turned into a goal. > > Pre-merge > > ~~~~~~~~~ > > > > The natural way to have pre-merge CI jobs in GitLab is to send "Merge > > Requests"[3] (abbreviated as "MR" from now on). In most projects, a > > MR comes from individual contributors, usually the authors of the > > changes themselves. It's my understanding that the current maintainer > > model employed in QEMU will *not* change at this time, meaning that > > code contributions and reviews will continue to happen on the mailing > > list. A maintainer then, having collected a number of patches, would > > submit a MR either in addition or in substitution to the Pull Requests > > sent to the mailing list.
> > Eventually it would be nice to allow any submaintainer > to send a merge request to the CI system (though you would > want it to have a "but don't apply until somebody else approves it" > gate as well as the automated testing part). But right now all > we need is for the one person managing merges and releases > to be able to say "here's the branch where I merged this > pullrequest, please test it". At any rate, supporting multiple > submaintainers all talking to the CI independently should be > out of scope for now. > OK, noted. > > Multi-maintainer model > > ~~~~~~~~~~~~~~~~~~~~~~ > > > > The previous section already introduced some of the proposed workflow > > that can enable such a multi-maintainer model. With a Gating CI > > system, though, it will be natural to have a smaller "Mean time > > between (CI) failures", simply because of the expected increased > > number of systems and checks. A lot of countermeasures have to be > > employed to keep that MTBF in check. > > > > For once, it's imperative that the maintainers for such systems and > > jobs are clearly defined and readily accessible. Either the same > > MAINTAINERS file or a more suitable variation of such data should be > > defined before activating the *gating* rules. This would allow a > > routing to request the attention of the maintainer responsible. > > > > In case of unresposive maintainers, or any other condition that > > renders and keeps one or more CI jobs failing for a given previously > > established amount of time, the job can be demoted with an > > "allow_failure" configuration[7]. Once such a change is commited, the > > path to promotion would be just the same as in a newly added job > > definition. > > > > Note: In a future phase we can evaluate the creation of rules that > > look at changed paths in a MR (similar to "F:" entries on MAINTAINERS) > > and the execution of specific CI jobs, which would be the > > responsibility of a given maintainer[8]. 
> > All this stuff is not needed to start with. We cope at the > moment with "everything is gating, and if something doesn't > pass it needs to be fixed or manually removed from the setup". > OK, I get your point. But, I think it's fair to say though, that one big motivation that we also have for this work, is to be able to provide new machines and jobs into the Gating CI in the very near future. And to do that, we must set common rules so that anyone else can do the same and abide by the same terms. > > GitLab Jobs and Pipelines > > ------------------------- > > > > GitLab CI is built around two major concepts: jobs and pipelines. The > > current GitLab CI configuration in QEMU uses jobs only (or putting it > > another way, all jobs in a single pipeline stage). Consider the > > folowing job definition[9]: > > > > build-tci: > > script: > > - TARGETS="aarch64 alpha arm hppa m68k microblaze moxie ppc64 s390x x86_64" > > - ./configure --enable-tcg-interpreter > > --target-list="$(for tg in $TARGETS; do echo -n ${tg}'-softmmu '; done)" > > - make -j2 > > - make tests/boot-serial-test tests/cdrom-test tests/pxe-test > > - for tg in $TARGETS ; do > > export QTEST_QEMU_BINARY="${tg}-softmmu/qemu-system-${tg}" ; > > ./tests/boot-serial-test || exit 1 ; > > ./tests/cdrom-test || exit 1 ; > > done > > - QTEST_QEMU_BINARY="x86_64-softmmu/qemu-system-x86_64" ./tests/pxe-test > > - QTEST_QEMU_BINARY="s390x-softmmu/qemu-system-s390x" ./tests/pxe-test -m slow > > > > All the lines under "script" are performed sequentially. It should be > > clear that there's the possibility of breaking this down into multiple > > stages, so that a build happens first, and then "common set of tests" > > run in parallel. > > We could do this, but we don't do it today, so we don't need > to think about this at all to start with. > So, in your opinion, this is phase >= 1 material. Noted. > > In theory, there's nothing that prevents an entire QEMU build > > directory, to be treated as an artifact. 
In practice, there are > > predefined limits on GitLab that prevents that from being possible, > > ...so we don't need to worry about somehow defining some > cut-down "build artefact" that we provide to the testing > phase. Just do a build and test run as a single thing. > We can always come back and improve later. > > > Have you been able to investigate and confirm that we can > get a gitlab-runner setup that works on non-x86 ? That seems > to me like an important thing we should be confident about > early before we sink too much effort into a gitlab-based > solution. > I've successfully built gitlab-runner and run jobs on aarch64, ppc64le and s390x. The binaries are available here: https://cleber.fedorapeople.org/gitlab-runner/v12.4.1/ But, with the "shell" executor (given that Docker helper images are not available for those architectures). I don't think we'd have to depend on GitLab providing those images though, it should be possible to create them for different architectures and tweak the gitlab-runner code to use different image references on those architectures. Does this answer this specific question? Best, - Cleber. > thanks > -- PMM > ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [RFC] QEMU Gating CI 2019-12-02 14:05 [RFC] QEMU Gating CI Cleber Rosa ` (2 preceding siblings ...) 2019-12-03 17:54 ` Peter Maydell @ 2020-01-17 14:33 ` Peter Maydell 2020-01-21 20:00 ` Cleber Rosa 2020-02-03 3:27 ` Cleber Rosa 3 siblings, 2 replies; 22+ messages in thread From: Peter Maydell @ 2020-01-17 14:33 UTC (permalink / raw) To: Cleber Rosa Cc: Jeff Nelson, QEMU Developers, Wainer dos Santos Moschetta, Markus Armbruster, Stefan Hajnoczi, Alex Bennée, Ademar Reis On Mon, 2 Dec 2019 at 14:06, Cleber Rosa <crosa@redhat.com> wrote: > > RFC: QEMU Gating CI > =================== > > This RFC attempts to address most of the issues described in > "Requirements/GatinCI"[1]. An also relevant write up is the "State of > QEMU CI as we enter 4.0"[2]. > > The general approach is one to minimize the infrastructure maintenance > and development burden, leveraging as much as possible "other people's" > infrastructure and code. GitLab's CI/CD platform is the most relevant > component dealt with here. Happy New Year! Now we're in 2020, any chance of an update on plans/progress here? I would very much like to be able to hand processing of pull requests over to somebody else after the 5.0 cycle, if not before. (I'm quite tempted to make that a hard deadline and just say that somebody else will have to pick it up for 5.1, regardless...) thanks -- PMM ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [RFC] QEMU Gating CI 2020-01-17 14:33 ` Peter Maydell @ 2020-01-21 20:00 ` Cleber Rosa 2020-02-03 3:27 ` Cleber Rosa 1 sibling, 0 replies; 22+ messages in thread From: Cleber Rosa @ 2020-01-21 20:00 UTC (permalink / raw) To: Peter Maydell Cc: Jeff Nelson, QEMU Developers, Wainer dos Santos Moschetta, Markus Armbruster, Stefan Hajnoczi, Alex Bennée, Ademar Reis [-- Attachment #1: Type: text/plain, Size: 1986 bytes --] On Fri, Jan 17, 2020 at 02:33:54PM +0000, Peter Maydell wrote: > On Mon, 2 Dec 2019 at 14:06, Cleber Rosa <crosa@redhat.com> wrote: > > > > RFC: QEMU Gating CI > > =================== > > > > This RFC attempts to address most of the issues described in > > "Requirements/GatinCI"[1]. An also relevant write up is the "State of > > QEMU CI as we enter 4.0"[2]. > > > > The general approach is one to minimize the infrastructure maintenance > > and development burden, leveraging as much as possible "other people's" > > infrastructure and code. GitLab's CI/CD platform is the most relevant > > component dealt with here. > > Happy New Year! Now we're in 2020, any chance of an update on > plans/progress here? I would very much like to be able to hand > processing of pull requests over to somebody else after the > 5.0 cycle, if not before. (I'm quite tempted to make that a > hard deadline and just say that somebody else will have to > pick it up for 5.1, regardless...) > > thanks > -- PMM > Hi Peter, Happy New Year too! As an status update, I have some work queued up related to this work that I need to do some minor polishing and then post to the mailing list. That has to do with the changes to container definition files, and the most basic changes to the GitLab YAML configuration files to achieve a first stage of implementation. We'd also have to coordinate access to the existing machines in use, so that we can validate that this proposal will work. 
Just to be extra clear, I'm available to do this initial configuration on the machines that are currently running the tests, provided you think it's a good idea. That would also be helpful to me, as I'm learning a lot of this stuff as I go, and there are always some tricky new details in different environments. PS: I'm actually on the road, but I should be settled by tomorrow, and I expect to resume work on this the following day. Best Regards, - Cleber. [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [RFC] QEMU Gating CI 2020-01-17 14:33 ` Peter Maydell 2020-01-21 20:00 ` Cleber Rosa @ 2020-02-03 3:27 ` Cleber Rosa 2020-02-03 15:00 ` Cleber Rosa 2020-02-07 16:42 ` Peter Maydell 1 sibling, 2 replies; 22+ messages in thread From: Cleber Rosa @ 2020-02-03 3:27 UTC (permalink / raw) To: Peter Maydell Cc: Jeff Nelson, Markus Armbruster, Wainer dos Santos Moschetta, QEMU Developers, Stefan Hajnoczi, Alex Bennée, Ademar Reis [-- Attachment #1: Type: text/plain, Size: 5905 bytes --] On Fri, Jan 17, 2020 at 02:33:54PM +0000, Peter Maydell wrote: > On Mon, 2 Dec 2019 at 14:06, Cleber Rosa <crosa@redhat.com> wrote: > > > > RFC: QEMU Gating CI > > =================== > > > > This RFC attempts to address most of the issues described in > > "Requirements/GatinCI"[1]. An also relevant write up is the "State of > > QEMU CI as we enter 4.0"[2]. > > > > The general approach is one to minimize the infrastructure maintenance > > and development burden, leveraging as much as possible "other people's" > > infrastructure and code. GitLab's CI/CD platform is the most relevant > > component dealt with here. > > Happy New Year! Now we're in 2020, any chance of an update on > plans/progress here? I would very much like to be able to hand > processing of pull requests over to somebody else after the > 5.0 cycle, if not before. (I'm quite tempted to make that a > hard deadline and just say that somebody else will have to > pick it up for 5.1, regardless...) > > thanks > -- PMM > Hi Peter, Last time I believe the take was to be as simplistic as possible, and try to focus on the bare minimum necessary to implement the workflow you described[1]. The following lines preceded by ">>>" were extracted from the Wiki and will be used to explain those points.
>>> The set of machines I currently test on are: >>> >>> * an S390x box (this is provided to the project by IBM's Community >>> Cloud so can be used for the new CI setup) >>> * aarch32 (as a chroot on an aarch64 system) >>> * aarch64 >>> * ppc64 (on the GCC compile farm) I've built an updated gitlab-runner version for s390x, aarch64 and ppc64[2]. I've now tested its behavior with the shell executor (instead of docker) on aarch64 and ppc64. I did not get a chance yet to test this new version and executor with s390x, but I'm planning to do it soon. >>> * OSX >>> * Windows crossbuilds >>> * NetBSD, FreeBSD and OpenBSD using the tests/vm VMs gitlab-runner clients are available for Darwin, Windows (native) and FreeBSD. I have *not* tested any of those, though. I've tried a Windows crossbuild, and with the right packages installed, it worked like a charm on a Fedora machine. >>> * x86-64 Linux with a variety of different build configs (see the >>> 'remake-merge-builds' script for how these are set up) This is of course the more standard setup for gitlab-runner, and the bulk of the work that I'm posting here is related to those different build configs. I assumed those x86-64 machines had some version of Ubuntu, so I used 18.04.3 LTS. Hopefully it matches most or all of the current environment. Please refer to messages on the mailing list with $SUBJECT: [RFC PATCH 1/2] GitLab CI: avoid calling before_scripts on unintended jobs [RFC PATCH 2/2] GitLab CI: crude mapping of PMM's scripts to jobs There are a few questions in there which I'd appreciate help with. >>> Testing process: >>> >>> * I get an email which is a pull request, and I run the >>> "apply-pullreq" script, which takes the GIT URL and tag/branch name >>> to test. >>> * apply-pullreq performs the merge into a 'staging' branch >>> * apply-pullreq also performs some simple local tests: >>> * does git verify-tag like the GPG signature?
>>> * are we trying to apply the pull before reopening the dev tree >>> for a new release? >>> * does the pull include commits with bad UTF8 or bogus qemu-devel >>> email addresses? >>> * submodule updates are only allowed if the --submodule-ok option >>> was specifically passed These steps could go unchanged at this point. One minor remark is that the repo hosted at gitlab.com would be used instead. The 'staging' branch can be protected[4] so that only authorized people can push to it (and trigger the pipeline and its jobs). >>> * apply-pullreq then invokes parallel-buildtest to do the actual >>> testing This would be done by GitLab instead. The dispatching of jobs is based on the tags given to jobs and machines. IMO at least the OS version and architecture should be given as tags, and the machine needs proper setup to run a job, such as having the right packages installed. It can start with proper documentation for every type of OS and version (and possibly job type), and evolve into scripts or other types of automation. These are usually identical or very similar to what is defined in "tests/docker/dockerfiles", but need to be done at the machine level because of the "shell" executor. >>> * parallel-buildtest is a trivial wrapper around GNU Parallel which >>> invokes 'mergebuild' on each of the test machines >>> * if all is OK then the user gets to do the 'git push' to push the >>> staging branch to master The central place to check for success or failure would be the pipeline page. Also, there's a configurable notification system that should (I've not tested it thoroughly) send failed and/or successful pipeline results to the pipeline author. IIUC, this means whoever pushed to the 'staging' branch that caused the pipeline to be triggered. Let me know if this makes sense to you, and if so, we can arrange a real-world PoC. FYI, I've run hundreds of jobs in an internal GitLab instance, and GitLab itself (server and runner) seems very stable. Regards, - Cleber.
--- [1] - https://wiki.qemu.org/Requirements/GatingCI [2] - https://cleber.fedorapeople.org/gitlab-runner/v12.7.0/ [3] - https://docs.gitlab.com/runner/install/bleeding-edge.html#download-the-standalone-binaries [4] - https://docs.gitlab.com/ee/user/project/protected_branches.html [5] - https://docs.gitlab.com/ee/user/profile/notifications.html#issue--epics--merge-request-events [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [RFC] QEMU Gating CI 2020-02-03 3:27 ` Cleber Rosa @ 2020-02-03 15:00 ` Cleber Rosa 2020-02-07 16:42 ` Peter Maydell 1 sibling, 0 replies; 22+ messages in thread From: Cleber Rosa @ 2020-02-03 15:00 UTC (permalink / raw) To: Peter Maydell Cc: Jeff Nelson, Markus Armbruster, Wainer dos Santos Moschetta, QEMU Developers, Stefan Hajnoczi, Alex Bennée, Ademar Reis [-- Attachment #1: Type: text/plain, Size: 827 bytes --] On Sun, Feb 02, 2020 at 10:27:12PM -0500, Cleber Rosa wrote: > >>> The set of machine I currently test on are: > >>> > >>> * an S390x box (this is provided to the project by IBM's Community > >>> Cloud so can be used for the new CI setup) > >>> * aarch32 (as a chroot on an aarch64 system) > >>> * aarch64 > >>> * ppc64 (on the GCC compile farm) > > I've built an updated gitlab-runner version for s390x, aarch64 and > ppc64[2]. I've now tested its behavior with the shell executor > (instead of docker) on aarch64 and ppc64. I did not get a chance yet > to test this new version and executor with s390x, but I'm planning > to do it soon. > Just a quick update on s390x. I've run a job and had no issues: https://gitlab.com/cleber.gnu/qemuci/-/jobs/424084346 - Cleber. [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [RFC] QEMU Gating CI 2020-02-03 3:27 ` Cleber Rosa 2020-02-03 15:00 ` Cleber Rosa @ 2020-02-07 16:42 ` Peter Maydell 2020-02-07 20:38 ` Cleber Rosa 1 sibling, 1 reply; 22+ messages in thread From: Peter Maydell @ 2020-02-07 16:42 UTC (permalink / raw) To: Cleber Rosa Cc: Jeff Nelson, Markus Armbruster, Wainer dos Santos Moschetta, QEMU Developers, Stefan Hajnoczi, Alex Bennée, Ademar Reis On Mon, 3 Feb 2020 at 03:28, Cleber Rosa <crosa@redhat.com> wrote > >>> Testing process: > >>> > >>> * I get an email which is a pull request, and I run the > >>> "apply-pullreq" script, which takes the GIT URL and tag/branch name > >>> to test. > >>> * apply-pullreq performs the merge into a 'staging' branch > >>> * apply-pullreq also performs some simple local tests: > >>> * does git verify-tag like the GPG signature? > >>> * are we trying to apply the pull before reopening the dev tree > >>> for a new release? > >>> * does the pull include commits with bad UTF8 or bogus qemu-devel > >>> email addresses? > >>> * submodule updates are only allowed if the --submodule-ok option > >>> was specifically passed > > These steps could go unchanged at this point. One minor remark is > that the repo hosted at gitlab.com would be used instead. The > 'staging' branch can be protected[4] so that only authorized people > can do it (and trigger the pipeline and its jobs). > > >>> * apply-pullreq then invokes parallel-buildtest to do the actual > >>> testing > > This would be done by GitLab instead. The dispatching of jobs is > based on the tags given to jobs and machines. IMO at least the OS > version and architecture should be given as tags, and the machine > needs proper setup to run a job, such as having the right packages > installed. It can start with a proper documentation for every type of > OS and version (and possibly job type), and evolve into scripts > or other type of automation. 
>
> These are usually identical or very similar to what is defined in
> "tests/docker/dockerfiles", but need to be done at the machine level
> because of the "shell" executor.
>
> >>> * parallel-buildtest is a trivial wrapper around GNU Parallel
> >>>   which invokes 'mergebuild' on each of the test machines
> >>> * if all is OK then the user gets to do the 'git push' to push
> >>>   the staging branch to master
>
> The central place to check for success or failure would be the
> pipeline page. Also, there's a configurable notification system that
> should (I've not tested it thoroughly) send failed and/or successful
> pipeline results to the pipeline author. IIUC, this means whoever
> pushed to the 'staging' branch that caused the pipeline to be
> triggered.
>
> Let me know if this makes sense to you, and if so, we can arrange
> a real world PoC. FYI, I've run hundreds of jobs in an internal
> GitLab instance, and GitLab itself (server and runner) seems very
> stable.

This all sounds like the right thing and great progress. So yes,
I agree that the next step would be to get to a point where you
can give me some instructions on how to say "OK, here's my staging
branch" and run it through the new test process and look at the
results.

thanks
-- PMM
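[The tag-based dispatching of jobs to machines discussed in this exchange could be sketched in `.gitlab-ci.yml` roughly as follows. This is only an illustration: the job names, tag values, stage, and build commands are assumptions, not taken from the thread or from QEMU's actual configuration.]

```yaml
# Hypothetical gating jobs. A runner only picks up jobs carrying
# all of its registered tags, so a runner registered with
# "ubuntu-20.04" and "aarch64" would run the first job only.
check-ubuntu-aarch64:
  stage: test
  tags:
    - ubuntu-20.04
    - aarch64
  script:
    - mkdir -p build && cd build
    - ../configure
    - make -j$(nproc)
    - make check

check-freebsd-x86_64:
  stage: test
  tags:
    - freebsd-12.1
    - x86_64
  script:
    - mkdir -p build && cd build
    - ../configure
    - gmake -j$(sysctl -n hw.ncpu)
    - gmake check
```

With the "shell" executor, the packages these scripts need (compilers, make, etc.) must already be present on each machine, which is why the machine-level setup documentation mentioned above matters.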
* Re: [RFC] QEMU Gating CI
  2020-02-07 16:42 ` Peter Maydell
@ 2020-02-07 20:38 ` Cleber Rosa
  2020-02-08 13:08 ` Peter Maydell
  0 siblings, 1 reply; 22+ messages in thread
From: Cleber Rosa @ 2020-02-07 20:38 UTC (permalink / raw)
To: Peter Maydell
Cc: Jeff Nelson, Markus Armbruster, Wainer dos Santos Moschetta,
    QEMU Developers, Stefan Hajnoczi, Alex Bennée, Ademar Reis

[-- Attachment #1: Type: text/plain, Size: 1675 bytes --]

On Fri, Feb 07, 2020 at 04:42:10PM +0000, Peter Maydell wrote:
>
> This all sounds like the right thing and great progress. So yes,
> I agree that the next step would be to get to a point where you
> can give me some instructions on how to say "OK, here's my staging
> branch" and run it through the new test process and look at the
> results.
>

IIUC the point you're describing, we must:

* Have the right jobs defined in .gitlab-ci.yml (there are some
  questions to be answered on that thread)

* Set up machines with:
  - gitlab-runner (with tags matching OS and arch)
  - packages needed for the actual job execution (compilers, etc)

At this point, the "parallel-buildtest" command[1] would be replaced
with something like:

- git push git@gitlab.com:qemu-project/qemu.git staging:staging

Which would automatically generate a pipeline. Checking the results
can be done programmatically using the GitLab APIs[2].

Once the result is validated, you would run "git push publish-upstream
staging:master" as usual (as instructed by the script)[3].

So this leaves us with the "musts" above, and also with creating a
command line tool that uses the GitLab APIs to check on the status of
the pipeline associated with the staging branch.

> thanks
> -- PMM
>

Thanks for the feedback, and please let me know if I got your point.

Cheers,
- Cleber.
[1] - https://git.linaro.org/people/peter.maydell/misc-scripts.git/tree/apply-pullreq#n125
[2] - https://docs.gitlab.com/ee/api/pipelines.html#list-project-pipelines
[3] - https://git.linaro.org/people/peter.maydell/misc-scripts.git/tree/apply-pullreq#n136
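[The command line tool Cleber mentions could start from the "list project pipelines" endpoint referenced in [2]. Below is a minimal sketch: the helper names are hypothetical, and only the endpoint path and JSON shape come from the GitLab API documentation.]

```python
import urllib.parse

GITLAB_API = "https://gitlab.com/api/v4"


def pipelines_url(project_path, ref):
    # GET /projects/:id/pipelines?ref=<branch>. When :id is a path like
    # "qemu-project/qemu" it must be URL-encoded ("/" becomes "%2F").
    encoded = urllib.parse.quote(project_path, safe="")
    return f"{GITLAB_API}/projects/{encoded}/pipelines?ref={ref}"


def latest_status(pipelines):
    # 'pipelines' is the decoded JSON list the endpoint returns; pick
    # the newest pipeline (highest id) and report its status string.
    if not pipelines:
        return None
    return max(pipelines, key=lambda p: p["id"])["status"]
```

[Actually fetching the list would be a `urllib.request.urlopen()` (or curl) call on `pipelines_url(...)`, plus a private-token header for non-public projects.]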
* Re: [RFC] QEMU Gating CI
  2020-02-07 20:38 ` Cleber Rosa
@ 2020-02-08 13:08 ` Peter Maydell
  2020-03-02 15:27 ` Peter Maydell
  0 siblings, 1 reply; 22+ messages in thread
From: Peter Maydell @ 2020-02-08 13:08 UTC (permalink / raw)
To: Cleber Rosa
Cc: Jeff Nelson, Markus Armbruster, Wainer dos Santos Moschetta,
    QEMU Developers, Stefan Hajnoczi, Alex Bennée, Ademar Reis

On Fri, 7 Feb 2020 at 20:39, Cleber Rosa <crosa@redhat.com> wrote:
> On Fri, Feb 07, 2020 at 04:42:10PM +0000, Peter Maydell wrote:
> > This all sounds like the right thing and great progress. So yes,
> > I agree that the next step would be to get to a point where you
> > can give me some instructions on how to say "OK, here's my staging
> > branch" and run it through the new test process and look at the
> > results.
>
> IIUC the point you're describing, we must:
>
> * Have the right jobs defined in .gitlab-ci.yml (there are some
>   questions to be answered on that thread)

For the non-x86 architectures, do we define the jobs to run on those
in the same .yml file? (Generally the non-x86 machines just want to
run a simple make/make check; they don't need to check the wide
variety of different configs x86 does.)

> * Set up machines with:
>   - gitlab-runner (with tags matching OS and arch)
>   - packages needed for the actual job execution (compilers, etc)
>
> At this point, the "parallel-buildtest" command[1] would be replaced
> with something like:
>
> - git push git@gitlab.com:qemu-project/qemu.git staging:staging
>
> Which would automatically generate a pipeline. Checking the results
> can be done programmatically using the GitLab APIs[2].
>
> Once the result is validated, you would run "git push
> publish-upstream staging:master" as usual (as instructed by the
> script)[3].
>
> So this leaves us with the "musts" above, and also with creating a
> command line tool that uses the GitLab APIs to check on the status
> of the pipeline associated with the staging branch.

Yeah, that sounds right.
To start with I'm ok with checking a web page by hand to see what the
job results are, so getting the runners set up so we can test by doing
git push is the place to start, I think. Once we've got the actual
testing going and all the machines and configs we want to test in
place, we can go back and look at improving the UX for the person
doing pullreqs so it's a bit more automated using the GitLab APIs.

thanks
-- PMM
* Re: [RFC] QEMU Gating CI
  2020-02-08 13:08 ` Peter Maydell
@ 2020-03-02 15:27 ` Peter Maydell
  2020-03-05  6:50 ` Cleber Rosa
  0 siblings, 1 reply; 22+ messages in thread
From: Peter Maydell @ 2020-03-02 15:27 UTC (permalink / raw)
To: Cleber Rosa
Cc: Jeff Nelson, Markus Armbruster, Wainer dos Santos Moschetta,
    QEMU Developers, Stefan Hajnoczi, Alex Bennée, Ademar Reis

On Sat, 8 Feb 2020 at 13:08, Peter Maydell <peter.maydell@linaro.org> wrote:
> On Fri, 7 Feb 2020 at 20:39, Cleber Rosa <crosa@redhat.com> wrote:
> > On Fri, Feb 07, 2020 at 04:42:10PM +0000, Peter Maydell wrote:
> > > This all sounds like the right thing and great progress. So yes,
> > > I agree that the next step would be to get to a point where you
> > > can give me some instructions on how to say "OK, here's my
> > > staging branch" and run it through the new test process and look
> > > at the results.

Hi -- any progress on this front? (Maybe I missed an email; if so,
sorry about that...)

thanks
-- PMM
* Re: [RFC] QEMU Gating CI
  2020-03-02 15:27 ` Peter Maydell
@ 2020-03-05  6:50 ` Cleber Rosa
  0 siblings, 0 replies; 22+ messages in thread
From: Cleber Rosa @ 2020-03-05 6:50 UTC (permalink / raw)
To: Peter Maydell
Cc: Jeff Nelson, Markus Armbruster, Wainer dos Santos Moschetta,
    QEMU Developers, Stefan Hajnoczi, Alex Bennée, Ademar Reis

[-- Attachment #1: Type: text/plain, Size: 962 bytes --]

On Mon, Mar 02, 2020 at 03:27:42PM +0000, Peter Maydell wrote:
>
> Hi -- any progress on this front? (Maybe I missed an email; if
> so, sorry about that...)
>
> thanks
> -- PMM
>

Hi Peter,

Yes, I've made some progress on some of the points raised in the last
email exchanges:

1) Jobs on non-Linux OS. I've built/set up gitlab-runner for FreeBSD,
and tested a job:

- https://gitlab.com/cleber.gnu/qemuci/-/jobs/440379169

There are some limitations on a library that gitlab-runner uses to
manage services (and that has no implementation for FreeBSD
"services"), but there are workarounds that work all right.

2) Wrote a script that checks/waits on the pipeline:

- https://gitlab.com/cleber.gnu/qemuci/-/commit/d90c5cf917c43f06c0724dc025205d618521c4cc

3) Wrote machine setup documentation/scripts.

I'm tidying it all up to send a PR in the next day or two.

Thanks for your patience!

- Cleber.
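[A check/wait script like the one mentioned in item 2 could be structured around a small polling loop. This is a sketch under assumptions, not Cleber's actual script: the set of terminal statuses follows the GitLab pipelines API, and `get_status` stands for whatever callable fetches the current pipeline status over the network.]

```python
import time

# Statuses of a pipeline that is no longer running (an assumption
# based on the GitLab pipelines API, not on the script in the thread).
FINISHED = {"success", "failed", "canceled", "skipped"}


def wait_for_pipeline(get_status, timeout=3600, interval=60, sleep=time.sleep):
    """Poll get_status() until it returns a finished status or we time out.

    get_status and sleep are injected so the network layer (and tests)
    can supply them. Returns the final status, or None on timeout.
    """
    waited = 0
    while True:
        status = get_status()
        if status in FINISHED:
            return status
        if waited >= timeout:
            return None
        sleep(interval)
        waited += interval
```

[A pull-request workflow would then gate the final `git push publish-upstream staging:master` on `wait_for_pipeline(...)` returning "success".]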
end of thread, other threads:[~2020-03-05  6:51 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-12-02 14:05 [RFC] QEMU Gating CI Cleber Rosa
2019-12-02 17:00 ` Stefan Hajnoczi
2019-12-02 17:08   ` Peter Maydell
2019-12-02 18:28     ` Cleber Rosa
2019-12-02 18:36       ` Warner Losh
2019-12-02 22:38         ` Cleber Rosa
2019-12-02 18:12   ` Cleber Rosa
2019-12-03 14:14     ` Stefan Hajnoczi
2019-12-03 14:07 ` Alex Bennée
2019-12-04  8:55   ` Thomas Huth
2019-12-06 19:03     ` Cleber Rosa
2019-12-03 17:54 ` Peter Maydell
2019-12-05  5:05   ` Cleber Rosa
2020-01-17 14:33     ` Peter Maydell
2020-01-21 20:00       ` Cleber Rosa
2020-02-03  3:27         ` Cleber Rosa
2020-02-03 15:00           ` Cleber Rosa
2020-02-07 16:42           ` Peter Maydell
2020-02-07 20:38             ` Cleber Rosa
2020-02-08 13:08               ` Peter Maydell
2020-03-02 15:27                 ` Peter Maydell
2020-03-05  6:50                   ` Cleber Rosa