* Two jobs at once on denx-vulcan?
@ 2021-09-18 16:37 Simon Glass
  2021-09-20  8:12 ` Harald Seiler
  0 siblings, 1 reply; 9+ messages in thread
From: Simon Glass @ 2021-09-18 16:37 UTC (permalink / raw)
  To: U-Boot Mailing List; +Cc: Tom Rini, Harald Seiler

Hi,

Is there something screwy with this? It seems that denx-vulcan does
two builds at once?

https://source.denx.de/u-boot/custodians/u-boot-dm/-/jobs/323540

Regards,
Simon


* Re: Two jobs at once on denx-vulcan?
  2021-09-18 16:37 Two jobs at once on denx-vulcan? Simon Glass
@ 2021-09-20  8:12 ` Harald Seiler
  2021-09-20 14:06   ` Simon Glass
  0 siblings, 1 reply; 9+ messages in thread
From: Harald Seiler @ 2021-09-20  8:12 UTC (permalink / raw)
  To: Simon Glass, U-Boot Mailing List; +Cc: Tom Rini

Hi,

On Sat, 2021-09-18 at 10:37 -0600, Simon Glass wrote:
> Hi,
> 
> Is there something screwy with this? It seems that denx-vulcan does
> two builds at once?
> 
> https://source.denx.de/u-boot/custodians/u-boot-dm/-/jobs/323540

Hm, I did some changes to the vulcan runner which might have caused
this... But still, even if it is running multiple jobs in parallel, they
should still be isolated, so how does this lead to a build failure?

-- 
Harald

DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: +49-8142-66989-62  Fax: +49-8142-66989-80   Email: hws@denx.de



* Re: Two jobs at once on denx-vulcan?
  2021-09-20  8:12 ` Harald Seiler
@ 2021-09-20 14:06   ` Simon Glass
  2021-09-24 14:01     ` Harald Seiler
  0 siblings, 1 reply; 9+ messages in thread
From: Simon Glass @ 2021-09-20 14:06 UTC (permalink / raw)
  To: Harald Seiler; +Cc: U-Boot Mailing List, Tom Rini

Hi Harald,

On Mon, 20 Sept 2021 at 02:12, Harald Seiler <hws@denx.de> wrote:
>
> Hi,
>
> On Sat, 2021-09-18 at 10:37 -0600, Simon Glass wrote:
> > Hi,
> >
> > Is there something screwy with this? It seems that denx-vulcan does
> > two builds at once?
> >
> > https://source.denx.de/u-boot/custodians/u-boot-dm/-/jobs/323540
>
> Hm, I did some changes to the vulcan runner which might have caused
> this... But still, even if it is running multiple jobs in parallel, they
> should still be isolated, so how does this lead to a build failure?

I'm not sure that it does, but I do see this at the above link:

Error: Unable to create
'/builds/u-boot/custodians/u-boot-dm/.git/logs/HEAD.lock': File
exists.

Re doing multiple builds, have you set it up so it doesn't take on the
very large builds? I would love to enable multiple builds for the qemu
steps since they mostly use a single CPU, but am not sure how to do
it.

Regards,
Simon


* Re: Two jobs at once on denx-vulcan?
  2021-09-20 14:06   ` Simon Glass
@ 2021-09-24 14:01     ` Harald Seiler
  2021-09-24 14:20       ` Tom Rini
  0 siblings, 1 reply; 9+ messages in thread
From: Harald Seiler @ 2021-09-24 14:01 UTC (permalink / raw)
  To: Simon Glass; +Cc: U-Boot Mailing List, Tom Rini

Hi Simon,

On Mon, 2021-09-20 at 08:06 -0600, Simon Glass wrote:
> Hi Harald,
> 
> On Mon, 20 Sept 2021 at 02:12, Harald Seiler <hws@denx.de> wrote:
> > 
> > Hi,
> > 
> > On Sat, 2021-09-18 at 10:37 -0600, Simon Glass wrote:
> > > Hi,
> > > 
> > > Is there something screwy with this? It seems that denx-vulcan does
> > > two builds at once?
> > > 
> > > https://source.denx.de/u-boot/custodians/u-boot-dm/-/jobs/323540
> > 
> > Hm, I did some changes to the vulcan runner which might have caused
> > this... But still, even if it is running multiple jobs in parallel, they
> > should still be isolated, so how does this lead to a build failure?
> 
> I'm not sure that it does, but I do see this at the above link:
> 
> Error: Unable to create
> '/builds/u-boot/custodians/u-boot-dm/.git/logs/HEAD.lock': File
> exists.

This is super strange... Each build should be running in its own
container so there should never be a way for such a race to occur.  No
clue what is going on here...

> Re doing multiple builds, have you set it up so it doesn't take on the
> very large builds? I would love to enable multiple builds for the qemu
> steps since they mostly use a single CPU, but am not sure how to do
> it.

Actually, this was more of a mistake than an intentional change.  I
updated the runner on vulcan to also take jobs for some other repos and
wanted those jobs to run in parallel.  It looks like I just forgot to
set the `limit = 1` option for the U-Boot runner.

Now, I think doing what you suggest is possible.  We need to tag build
and "test" jobs differently and then define multiple runners with
different limits.  E.g. in `.gitlab-ci.yml`:

	build all 32bit ARM platforms:
	  stage: world build
	  tags:
	    - build

	cppcheck:
	  stage: testsuites
	  tags:
	    - test

And then define two runners in `/etc/gitlab-runner/config.toml`:

	concurrent = 4

	[[runners]]
	  name = "u-boot builder on vulcan"
	  limit = 1
	  ...

	[[runners]]
	  name = "u-boot tester on vulcan"
	  limit = 4
	  ...

and during registration they get the `build` and `test` tags
respectively.  This would allow running (in this example) up to 4 test
jobs concurrently, but only ever one large build job at once.
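
For the registration step, a rough sketch could look like the following.
It assumes the token-based `gitlab-runner register` flow; the URL, token
and image variables are placeholders, and `register_runner` is only a
hypothetical helper name, not something from the vulcan setup:

```shell
#!/bin/sh
# Sketch only: CI_SERVER_URL, REGISTRATION_TOKEN and DEFAULT_IMAGE are
# placeholders that must be set by whoever runs this.

# register_runner NAME TAGS LIMIT
# Registers one docker-executor runner with the given tag list and
# per-runner job limit.  GITLAB_RUNNER can be overridden (e.g. with
# "echo") to see the command without running it.
register_runner() {
    ${GITLAB_RUNNER:-gitlab-runner} register \
        --non-interactive \
        --url "$CI_SERVER_URL" \
        --registration-token "$REGISTRATION_TOKEN" \
        --executor docker \
        --docker-image "$DEFAULT_IMAGE" \
        --name "$1" \
        --tag-list "$2" \
        --limit "$3"
}

# Example calls (commented out so that sourcing this file has no effect):
# register_runner "u-boot builder on vulcan" build 1
# register_runner "u-boot tester on vulcan"  test  4
```

Note that `concurrent` stays a global setting in `config.toml` and caps
the total across both runners, so with `concurrent = 4` a running build
job leaves room for three test jobs.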

-- 
Harald




* Re: Two jobs at once on denx-vulcan?
  2021-09-24 14:01     ` Harald Seiler
@ 2021-09-24 14:20       ` Tom Rini
  2021-09-24 14:38         ` Simon Glass
  0 siblings, 1 reply; 9+ messages in thread
From: Tom Rini @ 2021-09-24 14:20 UTC (permalink / raw)
  To: Harald Seiler; +Cc: Simon Glass, U-Boot Mailing List

On Fri, Sep 24, 2021 at 04:01:21PM +0200, Harald Seiler wrote:
> Hi Simon,
> 
> On Mon, 2021-09-20 at 08:06 -0600, Simon Glass wrote:
> > Hi Harald,
> > 
> > On Mon, 20 Sept 2021 at 02:12, Harald Seiler <hws@denx.de> wrote:
> > > 
> > > Hi,
> > > 
> > > On Sat, 2021-09-18 at 10:37 -0600, Simon Glass wrote:
> > > > Hi,
> > > > 
> > > > Is there something screwy with this? It seems that denx-vulcan does
> > > > two builds at once?
> > > > 
> > > > https://source.denx.de/u-boot/custodians/u-boot-dm/-/jobs/323540
> > > 
> > > Hm, I did some changes to the vulcan runner which might have caused
> > > this... But still, even if it is running multiple jobs in parallel, they
> > > should still be isolated, so how does this lead to a build failure?
> > 
> > I'm not sure that it does, but I do see this at the above link:
> > 
> > Error: Unable to create
> > '/builds/u-boot/custodians/u-boot-dm/.git/logs/HEAD.lock': File
> > exists.
> 
> This is super strange... Each build should be running in its own
> container so there should never be a way for such a race to occur.  No
> clue what is going on here...

I know this from having to track down a different oddball failure with
konsulko-bootbake.  It comes down to something along the lines of
volumes being re-used.  Good in that it means that every job isn't doing
a whole clone of the u-boot tree every time.  Bad in that if the job
gets wedged/killed in a crazy spot you end up with problems like this.
If you run a 'find' on vulcan you'll figure out which overlay has a
problem.  Or you can stop the runner for a moment and tell docker to
purge unused volumes and it'll clear it up.
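
As a rough sketch of both options (the function names are made up for
illustration, `/var/lib/docker` is only the usual default data root, and
whether the runner is controlled through `gitlab-runner stop`/`start`
depends on how it was installed):

```shell
#!/bin/sh
# Sketch only: paths and service handling are assumptions, not a
# description of how vulcan is actually set up.

# list_stale_git_locks ROOT
# Prints every *.lock file that sits inside a .git directory below ROOT,
# e.g. a leftover .git/logs/HEAD.lock from a killed job.
list_stale_git_locks() {
    find "$1" -type f -name '*.lock' -path '*/.git/*' 2>/dev/null
}

# purge_unused_volumes
# Stops the runner, removes volumes that no container uses, then starts
# the runner again.  Newer Docker releases may need "--all" on the prune
# to also drop named volumes.
purge_unused_volumes() {
    gitlab-runner stop || return 1
    docker volume prune -f
    status=$?
    gitlab-runner start
    return "$status"
}

# Example calls (commented out so that sourcing this file has no effect):
# list_stale_git_locks /var/lib/docker
# purge_unused_volumes
```

The first function only reports; deleting a single stale lock by hand is
the less drastic fix, while the purge throws away every cached clone.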

> > Re doing multiple builds, have you set it up so it doesn't take on the
> > very large builds? I would love to enable multiple builds for the qemu
> > steps since they mostly use a single CPU, but am not sure how to do
> > it.
> 
> Actually, this was more a mistake than an intentional change.  I updated
> the runner on vulcan to also take jobs for some other repos and wanted
> those jobs to run in parallel.  It looks like I just forgot setting the
> `limit = 1` option for the U-Boot runner.
> 
> Now, I think doing what you suggest is possible.  We need to tag build
> and "test" jobs differently and then define multiple runners with
> different limits.  E.g. in `.gitlab-ci.yml`:
> 
> 	build all 32bit ARM platforms:
> 	  stage: world build
> 	  tags:
> 	    - build
> 
> 	cppcheck:
> 	  stage: testsuites
> 	  tags:
> 	    - test
> 
> And then define two runners in `/etc/gitlab-runner/config.toml`:
> 
> 	concurrent = 4
> 
> 	[[runners]]
> 	  name = "u-boot builder on vulcan"
> 	  limit = 1
> 	  ...
> 
> 	[[runners]]
> 	  name = "u-boot tester on vulcan"
> 	  limit = 4
> 	  ...
> 
> and during registration they get the `build` and `test` tags
> respectively.  This would allow running (in this example) up to 4 test
> jobs concurrently, but only ever one large build job at once.

Yes, but this would also make it harder for people to use the CI as-is
with their own runners.  For example, the only thing stopping people
from using the free gitlab CI runners on their own is that squashfs
test being broken.

-- 
Tom


* Re: Two jobs at once on denx-vulcan?
  2021-09-24 14:20       ` Tom Rini
@ 2021-09-24 14:38         ` Simon Glass
  2021-09-24 14:55           ` Tom Rini
  0 siblings, 1 reply; 9+ messages in thread
From: Simon Glass @ 2021-09-24 14:38 UTC (permalink / raw)
  To: Tom Rini; +Cc: Harald Seiler, U-Boot Mailing List

Hi Tom,

On Fri, 24 Sept 2021 at 08:20, Tom Rini <trini@konsulko.com> wrote:
>
> On Fri, Sep 24, 2021 at 04:01:21PM +0200, Harald Seiler wrote:
> > Hi Simon,
> >
> > On Mon, 2021-09-20 at 08:06 -0600, Simon Glass wrote:
> > > Hi Harald,
> > >
> > > On Mon, 20 Sept 2021 at 02:12, Harald Seiler <hws@denx.de> wrote:
> > > >
> > > > Hi,
> > > >
> > > > On Sat, 2021-09-18 at 10:37 -0600, Simon Glass wrote:
> > > > > Hi,
> > > > >
> > > > > Is there something screwy with this? It seems that denx-vulcan does
> > > > > two builds at once?
> > > > >
> > > > > https://source.denx.de/u-boot/custodians/u-boot-dm/-/jobs/323540
> > > >
> > > > Hm, I did some changes to the vulcan runner which might have caused
> > > > this... But still, even if it is running multiple jobs in parallel, they
> > > > should still be isolated, so how does this lead to a build failure?
> > >
> > > I'm not sure that it does, but I do see this at the above link:
> > >
> > > Error: Unable to create
> > > '/builds/u-boot/custodians/u-boot-dm/.git/logs/HEAD.lock': File
> > > exists.
> >
> > This is super strange... Each build should be running in its own
> > container so there should never be a way for such a race to occur.  No
> > clue what is going on here...
>
> I know this from having to track down a different oddball failure with
> konsulko-bootbake.  It comes down to something along the lines of
> volumes being re-used.  Good in that it means that every job every time
> isn't doing a whole clone of the u-boot tree.  Bad in that just in case
> the job gets wedged/killed in a crazy spot you end up with problems like
> this.  If you run a 'find' on vulcan you'll figure out which overlay has
> a problem.  Or you can stop the runner for a moment and tell docker to
> purge unused volumes and it'll clear it up.
>
> > > Re doing multiple builds, have you set it up so it doesn't take on the
> > > very large builds? I would love to enable multiple builds for the qemu
> > > steps since they mostly use a single CPU, but am not sure how to do
> > > it.
> >
> > Actually, this was more a mistake than an intentional change.  I updated
> > the runner on vulcan to also take jobs for some other repos and wanted
> > those jobs to run in parallel.  It looks like I just forgot setting the
> > `limit = 1` option for the U-Boot runner.
> >
> > Now, I think doing what you suggest is possible.  We need to tag build
> > and "test" jobs differently and then define multiple runners with
> > different limits.  E.g. in `.gitlab-ci.yml`:
> >
> >       build all 32bit ARM platforms:
> >         stage: world build
> >         tags:
> >           - build
> >
> >       cppcheck:
> >         stage: testsuites
> >         tags:
> >           - test
> >
> > And then define two runners in `/etc/gitlab-runner/config.toml`:
> >
> >       concurrent = 4
> >
> >       [[runners]]
> >         name = "u-boot builder on vulcan"
> >         limit = 1
> >         ...
> >
> >       [[runners]]
> >         name = "u-boot tester on vulcan"
> >         limit = 4
> >         ...
> >
> > and during registration they get the `build` and `test` tags
> > respectively.  This would allow running (in this example) up to 4 test
> > jobs concurrently, but only ever one large build job at once.
>
> Yes, but this would also make it harder for people to use the CI as-is
> with their own runners.  For example, the only thing stopping people
> from using the free gitlab CI runners on their own is that squashfs
> test being broken.

Thanks for the info Harald.

Would it just mean that they would need to add both 'build' and 'test'
tags to their runners? If so that does not sound onerous.

I believe it would speed up CI quite a bit.

Regards,
Simon


* Re: Two jobs at once on denx-vulcan?
  2021-09-24 14:38         ` Simon Glass
@ 2021-09-24 14:55           ` Tom Rini
  2021-09-24 23:36             ` Simon Glass
  0 siblings, 1 reply; 9+ messages in thread
From: Tom Rini @ 2021-09-24 14:55 UTC (permalink / raw)
  To: Simon Glass; +Cc: Harald Seiler, U-Boot Mailing List

On Fri, Sep 24, 2021 at 08:38:49AM -0600, Simon Glass wrote:
> Hi Tom,
> 
> On Fri, 24 Sept 2021 at 08:20, Tom Rini <trini@konsulko.com> wrote:
> >
> > On Fri, Sep 24, 2021 at 04:01:21PM +0200, Harald Seiler wrote:
> > > Hi Simon,
> > >
> > > On Mon, 2021-09-20 at 08:06 -0600, Simon Glass wrote:
> > > > Hi Harald,
> > > >
> > > > On Mon, 20 Sept 2021 at 02:12, Harald Seiler <hws@denx.de> wrote:
> > > > >
> > > > > Hi,
> > > > >
> > > > > On Sat, 2021-09-18 at 10:37 -0600, Simon Glass wrote:
> > > > > > Hi,
> > > > > >
> > > > > > Is there something screwy with this? It seems that denx-vulcan does
> > > > > > two builds at once?
> > > > > >
> > > > > > https://source.denx.de/u-boot/custodians/u-boot-dm/-/jobs/323540
> > > > >
> > > > > Hm, I did some changes to the vulcan runner which might have caused
> > > > > this... But still, even if it is running multiple jobs in parallel, they
> > > > > should still be isolated, so how does this lead to a build failure?
> > > >
> > > > I'm not sure that it does, but I do see this at the above link:
> > > >
> > > > Error: Unable to create
> > > > '/builds/u-boot/custodians/u-boot-dm/.git/logs/HEAD.lock': File
> > > > exists.
> > >
> > > This is super strange... Each build should be running in its own
> > > container so there should never be a way for such a race to occur.  No
> > > clue what is going on here...
> >
> > I know this from having to track down a different oddball failure with
> > konsulko-bootbake.  It comes down to something along the lines of
> > volumes being re-used.  Good in that it means that every job every time
> > isn't doing a whole clone of the u-boot tree.  Bad in that just in case
> > the job gets wedged/killed in a crazy spot you end up with problems like
> > this.  If you run a 'find' on vulcan you'll figure out which overlay has
> > a problem.  Or you can stop the runner for a moment and tell docker to
> > purge unused volumes and it'll clear it up.
> >
> > > > Re doing multiple builds, have you set it up so it doesn't take on the
> > > > very large builds? I would love to enable multiple builds for the qemu
> > > > steps since they mostly use a single CPU, but am not sure how to do
> > > > it.
> > >
> > > Actually, this was more a mistake than an intentional change.  I updated
> > > the runner on vulcan to also take jobs for some other repos and wanted
> > > those jobs to run in parallel.  It looks like I just forgot setting the
> > > `limit = 1` option for the U-Boot runner.
> > >
> > > Now, I think doing what you suggest is possible.  We need to tag build
> > > and "test" jobs differently and then define multiple runners with
> > > different limits.  E.g. in `.gitlab-ci.yml`:
> > >
> > >       build all 32bit ARM platforms:
> > >         stage: world build
> > >         tags:
> > >           - build
> > >
> > >       cppcheck:
> > >         stage: testsuites
> > >         tags:
> > >           - test
> > >
> > > And then define two runners in `/etc/gitlab-runner/config.toml`:
> > >
> > >       concurrent = 4
> > >
> > >       [[runners]]
> > >         name = "u-boot builder on vulcan"
> > >         limit = 1
> > >         ...
> > >
> > >       [[runners]]
> > >         name = "u-boot tester on vulcan"
> > >         limit = 4
> > >         ...
> > >
> > > and during registration they get the `build` and `test` tags
> > > respectively.  This would allow running (in this example) up to 4 test
> > > jobs concurrently, but only ever one large build job at once.
> >
> > Yes, but this would also make it harder for people to use the CI as-is
> > with their own runners.  For example, the only thing stopping people
> > from using the free gitlab CI runners on their own is that squashfs
> > test being broken.
> 
> Thanks for the info Harald.
> 
> Would it just mean that they would need to add both 'build' and 'test'
> tags to their running? If so that does not sound onerous.

Along with not being able to use the gitlab free runners.

> I believe it would speed up CI quite a bit.

I'm not sure?  First, did you upgrade your runners recently?  I started
by looking at
https://source.denx.de/u-boot/u-boot/-/pipelines/9238/builds and all of
the last stage jobs went super quick.  But second, assuming the time
there includes spinning up the runner, sandbox+clang took 2x as long to
run as regular sandbox, to run fewer tests:
https://source.denx.de/u-boot/u-boot/-/jobs/326772
https://source.denx.de/u-boot/u-boot/-/jobs/326773

We might save a minute or two if all of the other much quicker tests
ran to completion sooner, but we'd still be stuck waiting on the
longest running test.

So while I think splitting the job into stages, such that if something
fails early we call it all off, makes sense, a timing test where we just
have a single stage would mean more stuff in parallel and maybe would be
quicker, especially when we have more free runners.  And to me, sadly,
that's our biggest gating factor and the one that can be solved with
money rather than technical wizardry.

-- 
Tom


* Re: Two jobs at once on denx-vulcan?
  2021-09-24 14:55           ` Tom Rini
@ 2021-09-24 23:36             ` Simon Glass
  2021-09-27 13:36               ` Tom Rini
  0 siblings, 1 reply; 9+ messages in thread
From: Simon Glass @ 2021-09-24 23:36 UTC (permalink / raw)
  To: Tom Rini; +Cc: Harald Seiler, U-Boot Mailing List

Hi Tom,

On Fri, 24 Sept 2021 at 08:55, Tom Rini <trini@konsulko.com> wrote:
>
> On Fri, Sep 24, 2021 at 08:38:49AM -0600, Simon Glass wrote:
> > Hi Tom,
> >
> > On Fri, 24 Sept 2021 at 08:20, Tom Rini <trini@konsulko.com> wrote:
> > >
> > > On Fri, Sep 24, 2021 at 04:01:21PM +0200, Harald Seiler wrote:
> > > > Hi Simon,
> > > >
> > > > On Mon, 2021-09-20 at 08:06 -0600, Simon Glass wrote:
> > > > > Hi Harald,
> > > > >
> > > > > On Mon, 20 Sept 2021 at 02:12, Harald Seiler <hws@denx.de> wrote:
> > > > > >
> > > > > > Hi,
> > > > > >
> > > > > > On Sat, 2021-09-18 at 10:37 -0600, Simon Glass wrote:
> > > > > > > Hi,
> > > > > > >
> > > > > > > Is there something screwy with this? It seems that denx-vulcan does
> > > > > > > two builds at once?
> > > > > > >
> > > > > > > https://source.denx.de/u-boot/custodians/u-boot-dm/-/jobs/323540
> > > > > >
> > > > > > Hm, I did some changes to the vulcan runner which might have caused
> > > > > > this... But still, even if it is running multiple jobs in parallel, they
> > > > > > should still be isolated, so how does this lead to a build failure?
> > > > >
> > > > > I'm not sure that it does, but I do see this at the above link:
> > > > >
> > > > > Error: Unable to create
> > > > > '/builds/u-boot/custodians/u-boot-dm/.git/logs/HEAD.lock': File
> > > > > exists.
> > > >
> > > > This is super strange... Each build should be running in its own
> > > > container so there should never be a way for such a race to occur.  No
> > > > clue what is going on here...
> > >
> > > I know this from having to track down a different oddball failure with
> > > konsulko-bootbake.  It comes down to something along the lines of
> > > volumes being re-used.  Good in that it means that every job every time
> > > isn't doing a whole clone of the u-boot tree.  Bad in that just in case
> > > the job gets wedged/killed in a crazy spot you end up with problems like
> > > this.  If you run a 'find' on vulcan you'll figure out which overlay has
> > > a problem.  Or you can stop the runner for a moment and tell docker to
> > > purge unused volumes and it'll clear it up.
> > >
> > > > > Re doing multiple builds, have you set it up so it doesn't take on the
> > > > > very large builds? I would love to enable multiple builds for the qemu
> > > > > steps since they mostly use a single CPU, but am not sure how to do
> > > > > it.
> > > >
> > > > Actually, this was more a mistake than an intentional change.  I updated
> > > > the runner on vulcan to also take jobs for some other repos and wanted
> > > > those jobs to run in parallel.  It looks like I just forgot setting the
> > > > `limit = 1` option for the U-Boot runner.
> > > >
> > > > Now, I think doing what you suggest is possible.  We need to tag build
> > > > and "test" jobs differently and then define multiple runners with
> > > > different limits.  E.g. in `.gitlab-ci.yml`:
> > > >
> > > >       build all 32bit ARM platforms:
> > > >         stage: world build
> > > >         tags:
> > > >           - build
> > > >
> > > >       cppcheck:
> > > >         stage: testsuites
> > > >         tags:
> > > >           - test
> > > >
> > > > And then define two runners in `/etc/gitlab-runner/config.toml`:
> > > >
> > > >       concurrent = 4
> > > >
> > > >       [[runners]]
> > > >         name = "u-boot builder on vulcan"
> > > >         limit = 1
> > > >         ...
> > > >
> > > >       [[runners]]
> > > >         name = "u-boot tester on vulcan"
> > > >         limit = 4
> > > >         ...
> > > >
> > > > and during registration they get the `build` and `test` tags
> > > > respectively.  This would allow running (in this example) up to 4 test
> > > > jobs concurrently, but only ever one large build job at once.
> > >
> > > Yes, but this would also make it harder for people to use the CI as-is
> > > with their own runners.  For example, the only thing stopping people
> > > from using the free gitlab CI runners on their own is that squashfs
> > > test being broken.
> >
> > Thanks for the info Harald.
> >
> > Would it just mean that they would need to add both 'build' and 'test'
> > tags to their running? If so that does not sound onerous.
>
> Along with not being able to use the gitlab free runners.
>
> > I believe it would speed up CI quite a bit.
>
> I'm not sure?  First, did you upgrade your runners recently?  I started
> by looking at
> https://source.denx.de/u-boot/u-boot/-/pipelines/9238/builds and all of
> the last stage jobs went super quick.  But second, assuming the time

They are the same as ever: tui did about 1 build per second on average
and kaki did 0.5 builds per second, but this has slowed by about 15%
recently. They both have quite a few cores. It could just be that
the other two runners were busy so kaki and tui did everything.

> there includes spinning up the runner, sandbox+clang took 2x as long to
> run as regular sandbox, to run less tests:
> https://source.denx.de/u-boot/u-boot/-/jobs/326772
> https://source.denx.de/u-boot/u-boot/-/jobs/326773

Yes but tui is 2x as fast as kaki (both in terms of number of CPUs and
single-threaded performance) so that might explain it.

>
> But we might save a minute, or two, if all of the other much quicker
> tests ran to completion sooner, but we'd still be stuck waiting on the
> longest running test.

Yes, which can be many minutes. But each qemu run takes a good minute
and we have about 30 of them now. Even with all four runners working
on them, that is still about 7 minutes. In parallel it might only take
a minute or two.
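
To put rough numbers on that (assuming every qemu job takes about one
minute and jobs spread evenly over the slots; the 4-jobs-per-runner case
is only an example, and `wall_minutes` is a made-up helper):

```shell
#!/bin/sh
# Rough estimate only: real job times vary and runners are not equal.

# wall_minutes JOBS SLOTS [MINUTES_PER_JOB]
# Prints ceil(JOBS / SLOTS) * MINUTES_PER_JOB, i.e. the number of
# "waves" needed times the length of one wave.
wall_minutes() {
    n_jobs=$1
    n_slots=$2
    per_job=${3:-1}
    echo $(( (n_jobs + n_slots - 1) / n_slots * per_job ))
}

echo "4 runners, 1 job each:  $(wall_minutes 30 4) min"
echo "4 runners, 4 jobs each: $(wall_minutes 30 16) min"
echo "all 30 jobs at once:    $(wall_minutes 30 30) min"
```

This prints 8, 2 and 1 minutes. The first figure is 30/4 = 7.5 rounded
up to whole waves, so it lines up with the 7 or so minutes above.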

>
> So while I think splitting the job in to stages, such that if something
> fails early we call it all off, a time test where we just have a single
> stage would mean more stuff in parallel and maybe would be quicker,
> especially when we have more free runners.  And to me, sadly, that's our
> biggest gating factor and the one that can be solved with money rather
> than technical wizardry.

Makes sense. The other problem is that, to run the tests in parallel,
we might need to clean some of them up (the series I sent is a start
on that). But I think tui could probably run all the qemu jobs in
parallel at once, for example.

So perhaps we can come back to this when we get parallel tests
running. It definitely is not efficient at present, in the second
(qemu) stage.

Regards,
Simon


* Re: Two jobs at once on denx-vulcan?
  2021-09-24 23:36             ` Simon Glass
@ 2021-09-27 13:36               ` Tom Rini
  0 siblings, 0 replies; 9+ messages in thread
From: Tom Rini @ 2021-09-27 13:36 UTC (permalink / raw)
  To: Simon Glass; +Cc: Harald Seiler, U-Boot Mailing List

On Fri, Sep 24, 2021 at 05:36:31PM -0600, Simon Glass wrote:
> Hi Tom,
> 
> On Fri, 24 Sept 2021 at 08:55, Tom Rini <trini@konsulko.com> wrote:
> >
> > On Fri, Sep 24, 2021 at 08:38:49AM -0600, Simon Glass wrote:
> > > Hi Tom,
> > >
> > > On Fri, 24 Sept 2021 at 08:20, Tom Rini <trini@konsulko.com> wrote:
> > > >
> > > > On Fri, Sep 24, 2021 at 04:01:21PM +0200, Harald Seiler wrote:
> > > > > Hi Simon,
> > > > >
> > > > > On Mon, 2021-09-20 at 08:06 -0600, Simon Glass wrote:
> > > > > > Hi Harald,
> > > > > >
> > > > > > On Mon, 20 Sept 2021 at 02:12, Harald Seiler <hws@denx.de> wrote:
> > > > > > >
> > > > > > > Hi,
> > > > > > >
> > > > > > > On Sat, 2021-09-18 at 10:37 -0600, Simon Glass wrote:
> > > > > > > > Hi,
> > > > > > > >
> > > > > > > > Is there something screwy with this? It seems that denx-vulcan does
> > > > > > > > two builds at once?
> > > > > > > >
> > > > > > > > https://source.denx.de/u-boot/custodians/u-boot-dm/-/jobs/323540
> > > > > > >
> > > > > > > Hm, I did some changes to the vulcan runner which might have caused
> > > > > > > this... But still, even if it is running multiple jobs in parallel, they
> > > > > > > should still be isolated, so how does this lead to a build failure?
> > > > > >
> > > > > > I'm not sure that it does, but I do see this at the above link:
> > > > > >
> > > > > > Error: Unable to create
> > > > > > '/builds/u-boot/custodians/u-boot-dm/.git/logs/HEAD.lock': File
> > > > > > exists.
> > > > >
> > > > > This is super strange... Each build should be running in its own
> > > > > container so there should never be a way for such a race to occur.  No
> > > > > clue what is going on here...
> > > >
> > > > I know this from having to track down a different oddball failure with
> > > > konsulko-bootbake.  It comes down to something along the lines of
> > > > volumes being re-used.  Good in that it means that every job every time
> > > > isn't doing a whole clone of the u-boot tree.  Bad in that just in case
> > > > the job gets wedged/killed in a crazy spot you end up with problems like
> > > > this.  If you run a 'find' on vulcan you'll figure out which overlay has
> > > > a problem.  Or you can stop the runner for a moment and tell docker to
> > > > purge unused volumes and it'll clear it up.
> > > >
> > > > > > Re doing multiple builds, have you set it up so it doesn't take on the
> > > > > > very large builds? I would love to enable multiple builds for the qemu
> > > > > > steps since they mostly use a single CPU, but am not sure how to do
> > > > > > it.
> > > > >
> > > > > Actually, this was more a mistake than an intentional change.  I updated
> > > > > the runner on vulcan to also take jobs for some other repos and wanted
> > > > > those jobs to run in parallel.  It looks like I just forgot setting the
> > > > > `limit = 1` option for the U-Boot runner.
> > > > >
> > > > > Now, I think doing what you suggest is possible.  We need to tag build
> > > > > and "test" jobs differently and then define multiple runners with
> > > > > different limits.  E.g. in `.gitlab-ci.yml`:
> > > > >
> > > > >       build all 32bit ARM platforms:
> > > > >         stage: world build
> > > > >         tags:
> > > > >           - build
> > > > >
> > > > >       cppcheck:
> > > > >         stage: testsuites
> > > > >         tags:
> > > > >           - test
> > > > >
> > > > > And then define two runners in `/etc/gitlab-runner/config.toml`:
> > > > >
> > > > >       concurrent = 4
> > > > >
> > > > >       [[runners]]
> > > > >         name = "u-boot builder on vulcan"
> > > > >         limit = 1
> > > > >         ...
> > > > >
> > > > >       [[runners]]
> > > > >         name = "u-boot tester on vulcan"
> > > > >         limit = 4
> > > > >         ...
> > > > >
> > > > > and during registration they get the `build` and `test` tags
> > > > > respectively.  This would allow running (in this example) up to 4 test
> > > > > jobs concurrently, but only ever one large build job at once.
> > > >
> > > > Yes, but this would also make it harder for people to use the CI as-is
> > > > with their own runners.  For example, the only thing stopping people
> > > > from using the free gitlab CI runners on their own is that squashfs
> > > > test being broken.
> > >
> > > Thanks for the info Harald.
> > >
> > > Would it just mean that they would need to add both 'build' and 'test'
> > > tags to their running? If so that does not sound onerous.
> >
> > Along with not being able to use the gitlab free runners.
> >
> > > I believe it would speed up CI quite a bit.
> >
> > I'm not sure?  First, did you upgrade your runners recently?  I started
> > by looking at
> > https://source.denx.de/u-boot/u-boot/-/pipelines/9238/builds and all of
> > the last stage jobs went super quick.  But second, assuming the time
> 
> They are the same as ever: tui did about 1 build per second on average
> and kaki did 0.5 builds per second, but this has slowed by about 15%
> recently. They are both have quite a few cores. It could just be that
> the other two runners were busy so kaki and tui did everything.
> 
> > there includes spinning up the runner, sandbox+clang took 2x as long to
> > run as regular sandbox, to run less tests:
> > https://source.denx.de/u-boot/u-boot/-/jobs/326772
> > https://source.denx.de/u-boot/u-boot/-/jobs/326773
> 
> Yes but tui is 2x as fast as kaki (both in terms of number of CPUs and
> single-threaded performance) so that might explain it.
> 
> >
> > But we might save a minute, or two, if all of the other much quicker
> > tests ran to completion sooner, but we'd still be stuck waiting on the
> > longest running test.
> 
> Yes, which can be many minutes. But each qemu run takes a good minute
> and we have about 30 of them now. Even if all four runners are running
> on them, then that is 7 minutes. In parallel it might only take a
> minute or two.
> 
> >
> > So while I think splitting the job in to stages, such that if something
> > fails early we call it all off, a time test where we just have a single
> > stage would mean more stuff in parallel and maybe would be quicker,
> > especially when we have more free runners.  And to me, sadly, that's our
> > biggest gating factor and the one that can be solved with money rather
> > than technical wizardry.
> 
> Make sense. The other problem is that, to run the tests in parallel,
> we might need to clean some of them up (the series I sent is a start
> on that). But I think tui could probably run all the qemu jobs in
> parallel at once, for example.
> 
> So perhaps we can come back to this when we get parallel tests
> running. It definitely is not efficient at present, in the second
> (qemu) stage.

OK.  And I guess the other part of this would be that you could take
tui/kaki/etc out of general rotation for a bit and run some pipelines to
see what the time change is with your ideas in place.

-- 
Tom
