On Tue, Mar 30, 2021 at 12:19:38PM +0100, Daniel P. Berrangé wrote:
> On Mon, Mar 29, 2021 at 03:10:36PM +0100, Stefan Hajnoczi wrote:
> > Hi,
> > I wanted to follow up with a summary of the CI jobs:
> >
> > 1. Containers & Containers Layer2 - ~3 minutes/job x 39 jobs
> > 2. Builds - ~50 minutes/job x 61 jobs
> > 3. Tests - ~12 minutes/job x 20 jobs
> > 4. Deploy - 52 minutes x 1 job
> >
> > The Builds phase consumes the most CI minutes. If we can optimize this
> > phase then we'll achieve the biggest impact.
> >
> > In the short term builds could be disabled. However, in the long term I
> > think full build coverage is desirable to prevent merging code that
> > breaks certain host OSes/architectures (e.g. stable Linux distros,
> > macOS, etc).
>
> The notion of "full build coverage" doesn't really exist in reality.
> The number of platforms that QEMU is targeting, combined with the
> number of features that can be turned on/off in QEMU configure,
> means that the matrix for "full build coverage" is too huge to ever
> contemplate.

Good point. We will never cover the full build matrix. I do think it is
important to cover real-world builds, though, especially ones that tend
to expose issues (e.g. macOS, Windows, stable Linux distros, etc.).

> I think a challenge we have with our incremental approach is that
> we're not really taking into account the relative importance of the
> different build scenarios, and we often don't look at the big picture
> of what a new job adds in terms of quality compared to existing jobs.
>
> e.g. consider we have:
>
>   build-system-alpine:
>   build-system-ubuntu:
>   build-system-debian:
>   build-system-fedora:
>   build-system-centos:
>   build-system-opensuse:
>
>   build-trace-multi-user:
>   build-trace-ftrace-system:
>   build-trace-ust-system:
>
> I'd question whether we really need any of those 'build-trace' jobs.
> Instead, we could have build-system-ubuntu pass
> --enable-trace-backends=log,simple,syslog, build-system-debian pass
> --enable-trace-backends=ust, and build-system-fedora pass
> --enable-trace-backends=ftrace, etc.

Yes, I agree. The trace builds could be collapsed into the various
build-system jobs; a rough sketch is at the end of this mail.

> > Traditionally ccache (https://ccache.dev/) was used to detect
> > recompilation of the same compiler input files. This is trickier to do
> > in GitLab CI since it would be necessary to share and update a cache,
> > potentially between untrusted users. Unfortunately this shifts the
> > bottleneck from CPU to network in a CI-as-a-Service environment since
> > the cached build output needs to be accessed by the linker on the CI
> > runner but is stored remotely.
>
> Our docker containers install ccache already and I could have sworn
> that we use it in gitlab, but now I'm not so sure. We're only saving
> the "build/" directory as an artifact between jobs, and I'm not sure
> that directory holds the ccache cache.

It seems we're not benefiting much from ccache at the moment, given
that a build still takes ~50 minutes. This looks like a good area to
investigate; a possible starting point is also sketched at the end of
this mail.

Stefan
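
To make the trace consolidation concrete, here is an untested sketch of
what the collapsed jobs could look like in the GitLab CI config. The
.native_build_job_template name and the IMAGE/CONFIGURE_ARGS variables
are assumptions modelled on how our build jobs are currently defined,
so the details may need adjusting:

  # Fold the trace-backend coverage into the existing distro builds
  # instead of running separate build-trace-* jobs.
  build-system-ubuntu:
    extends: .native_build_job_template
    variables:
      IMAGE: ubuntu2004
      # absorbs build-trace-multi-user
      CONFIGURE_ARGS: --enable-trace-backends=log,simple,syslog

  build-system-debian:
    extends: .native_build_job_template
    variables:
      IMAGE: debian-amd64
      # absorbs build-trace-ust-system
      CONFIGURE_ARGS: --enable-trace-backends=ust

  build-system-fedora:
    extends: .native_build_job_template
    variables:
      IMAGE: fedora
      # absorbs build-trace-ftrace-system
      CONFIGURE_ARGS: --enable-trace-backends=ftrace

This keeps roughly the same trace-backend coverage while dropping three
~50 minute jobs from the Builds phase.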
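
On the ccache question, a minimal experiment would be to point ccache
at a directory covered by GitLab's cache: support and print the hit
statistics at the end of each job. This is only a sketch (again
assuming the .native_build_job_template name; the compiler-wrapper
path also varies by distro, e.g. /usr/lib64/ccache on Fedora):

  .native_build_job_template:
    cache:
      # separate caches per job so differently-configured builds
      # don't evict each other's objects
      key: "$CI_JOB_NAME"
      paths:
        - ccache/
    variables:
      # keep the cache inside the project dir so GitLab can archive it
      CCACHE_DIR: $CI_PROJECT_DIR/ccache
      CCACHE_MAXSIZE: 500M
    before_script:
      # the containers already install ccache; put its compiler
      # wrappers first in PATH and reset the counters
      - export PATH="/usr/lib/ccache:$PATH"
      - ccache --zero-stats
    after_script:
      # hit/miss stats tell us whether the cache is doing anything
      - ccache --show-stats

The --show-stats output should quickly show whether we get a useful hit
rate, and whether uploading/downloading the cache archive eats the time
saved, i.e. the CPU-to-network shift mentioned above.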