* [PATCH RFC] module.bbclass: Fix potential do_compile/do_make_scripts race condition
@ 2015-12-06 11:26 Paul Barker
  2015-12-06 11:26 ` Paul Barker
  2015-12-15 14:04 ` Paul Barker
  0 siblings, 2 replies; 7+ messages in thread
From: Paul Barker @ 2015-12-06 11:26 UTC (permalink / raw)
  To: openembedded-core

I ran into a race condition building multiple external modules against a 3.10.y
series kernel using the dylan branch of OpenEmbedded. This is difficult to
reproduce as it requires very specific timing: the do_make_scripts task for one
module was linking the modpost script whilst the do_compile task for another
module was attempting to use the modpost script. This resulted in a permission
error:

ERROR: Function failed: do_compile (see /home/COMMAGILITY/paul.barker/linux-bsp/work-1.0.1/ca-linux-bsp/build/tmp/work/amc_d24a4-oe-linux-gnueabi/ti-hplib-mod/01.01.00.04-r3d/temp/log.do_compile.25434 for further information)
ERROR: Logfile of failure stored in: /home/COMMAGILITY/paul.barker/linux-bsp/work-1.0.1/ca-linux-bsp/build/tmp/work/amc_d24a4-oe-linux-gnueabi/ti-hplib-mod/01.01.00.04-r3d/temp/log.do_compile.25434
Log data follows:
| DEBUG: Executing shell function do_compile
| make -C /home/COMMAGILITY/paul.barker/linux-bsp/work-1.0.1/ca-linux-bsp/build/tmp/sysroots/amc-d24a4/usr/src/kernel M=$PWD clean
| make[1]: Entering directory `/home/COMMAGILITY/paul.barker/linux-bsp/work-1.0.1/ca-linux-bsp/build/tmp/sysroots/amc-d24a4/usr/src/kernel'
| make[1]: Leaving directory `/home/COMMAGILITY/paul.barker/linux-bsp/work-1.0.1/ca-linux-bsp/build/tmp/sysroots/amc-d24a4/usr/src/kernel'
| make -C /home/COMMAGILITY/paul.barker/linux-bsp/work-1.0.1/ca-linux-bsp/build/tmp/sysroots/amc-d24a4/usr/src/kernel M=$PWD modules
| make[1]: Entering directory `/home/COMMAGILITY/paul.barker/linux-bsp/work-1.0.1/ca-linux-bsp/build/tmp/sysroots/amc-d24a4/usr/src/kernel'
|   CC [M]  /home/COMMAGILITY/paul.barker/linux-bsp/work-1.0.1/ca-linux-bsp/build/tmp/work/amc_d24a4-oe-linux-gnueabi/ti-hplib-mod/01.01.00.04-r3d/git/ti/runtime/hplib/module/hplibmod.o
|   Building modules, stage 2.
|   MODPOST 1 modules
| /bin/sh: scripts/mod/modpost: Permission denied
| make[2]: *** [__modpost] Error 126
| make[1]: *** [modules] Error 2
| make[1]: Leaving directory `/home/COMMAGILITY/paul.barker/linux-bsp/work-1.0.1/ca-linux-bsp/build/tmp/sysroots/amc-d24a4/usr/src/kernel'
| make: *** [default] Error 2
| ERROR: Function failed: do_compile (see /home/COMMAGILITY/paul.barker/linux-bsp/work-1.0.1/ca-linux-bsp/build/tmp/work/amc_d24a4-oe-linux-gnueabi/ti-hplib-mod/01.01.00.04-r3d/temp/log.do_compile.25434 for further information)
ERROR: Task 1284 (/home/COMMAGILITY/paul.barker/linux-bsp/work-1.0.1/ca-linux-bsp/meta-mcsdk/meta-arago-extras/recipes-bsp/ti-hplib/ti-hplib-mod_git.bb, do_compile) failed with exit code '1'
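
The "Permission denied" (make error 126) above fits a window in which
scripts/mod/modpost exists but does not yet carry an execute bit. A minimal
Python sketch of that window follows. It assumes, purely for illustration, that
the rebuild writes the file and sets its mode in two separate steps; `relink`
and `is_executable` are hypothetical helpers, not part of the kernel build or
of BitBake.

```python
import os
import tempfile


def is_executable(path):
    """Return True if any execute permission bit is set on path."""
    return bool(os.stat(path).st_mode & 0o111)


def relink(path, observations):
    """Hypothetical stand-in for rebuilding modpost in two steps."""
    # Step 1: write the new file contents without execute permission.
    with open(path, "w") as f:
        f.write("#!/bin/sh\nexit 0\n")
    os.chmod(path, 0o644)
    # A concurrent do_compile looking at the file now sees it as non-executable.
    observations.append(("written", is_executable(path)))
    # Step 2: mark it executable; only from here on is it safe to run.
    os.chmod(path, 0o755)
    observations.append(("chmod", is_executable(path)))


observations = []
with tempfile.TemporaryDirectory() as tmpdir:
    relink(os.path.join(tmpdir, "modpost"), observations)

for step, executable in observations:
    print(step, executable)
```

A task that tries to execute the file at the "written" point would fail in the
same way as the log above, which is why the timing has to be so specific.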

Later kernel versions do not rebuild the modpost script every time that 'make
scripts' is invoked so they should be safe from this particular failure. However
I'm not convinced that running 'make scripts' whilst also building an
out-of-tree module is always safe on later kernels and there is always the
potential for vendor kernels to have different behaviour here.

Although this was seen on the dylan branch the behaviour of master and jethro
looks to be the same here - do_make_scripts is locked so that only one instance
of it may run at one time but there is nothing to prevent one instance of
do_make_scripts running at the same time as an instance of do_compile.

The patch I'm sending attempts to solve this issue by locking the do_compile
task with the same lockfile as the do_make_scripts task in module.bbclass so
that an instance of do_compile can't run at the same time as an instance of
do_make_scripts. I don't know enough about the task locking to guarantee that
this is the right solution or to be able to test that it works as expected so
I'm marking the patch as an RFC.

Please let me know if this is the right approach and if there is any easy way to
test this.
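
As a rough illustration of what sharing one lock file is meant to achieve, here
is a minimal Python sketch. It assumes an exclusive advisory lock taken with
flock on a common file; it does not reproduce BitBake's own lockfiles handling,
and `run_task` together with the thread-based "tasks" are hypothetical
stand-ins for do_make_scripts and do_compile.

```python
import fcntl
import os
import tempfile
import threading
import time


def run_task(name, lock_path, events, work_seconds=0.05):
    """Run a fake task while holding an exclusive lock on lock_path."""
    # Each task opens the lock file itself, so each gets its own open file
    # description and the flock calls exclude one another.
    with open(lock_path, "a") as lock_file:
        fcntl.flock(lock_file, fcntl.LOCK_EX)
        try:
            events.append((name, "start"))
            time.sleep(work_seconds)  # stand-in for the real work
            events.append((name, "end"))
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)


events = []
with tempfile.TemporaryDirectory() as tmpdir:
    lock_path = os.path.join(tmpdir, "kernel-scripts.lock")
    threads = [
        threading.Thread(target=run_task, args=(name, lock_path, events))
        for name in ("do_make_scripts", "do_compile")
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()

for name, phase in events:
    print(name, phase)
```

Whichever task takes the lock first finishes before the other starts, so the
start/end pairs never interleave. The same property is what limits module
recipes to one do_compile at a time.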

Paul Barker (1):
  module.bbclass: Fix potential do_compile/do_make_scripts race
    condition

 meta/classes/module.bbclass | 4 ++++
 1 file changed, 4 insertions(+)

-- 
1.8.3.1




* [PATCH RFC] module.bbclass: Fix potential do_compile/do_make_scripts race condition
  2015-12-06 11:26 [PATCH RFC] module.bbclass: Fix potential do_compile/do_make_scripts race condition Paul Barker
@ 2015-12-06 11:26 ` Paul Barker
  2015-12-15 14:04 ` Paul Barker
  1 sibling, 0 replies; 7+ messages in thread
From: Paul Barker @ 2015-12-06 11:26 UTC (permalink / raw)
  To: openembedded-core

In the Linux 3.10.y series, repeatedly running 'make scripts' causes the modpost
script to be rebuilt every time. This causes a potential race condition when
building multiple external modules against a 3.10.y series kernel: one recipe
may be running do_make_scripts and modifying modpost whilst another is running
do_compile and using modpost. In my case this caused a permission error during
do_compile for one recipe as modpost was not executable at the time it tried to
run it.

In Linux 3.11 this was resolved and multiple invocations of 'make scripts' do
not cause modpost to be rebuilt. However, there are still vendor kernels in use
based on the 3.10.y series. Even on later kernel versions it would require more
investigation to conclude that running 'make scripts' whilst also building an
out-of-tree module is always safe. Therefore we should prevent do_make_scripts
from running at the same time as do_compile.

A side effect of this is that only one out-of-tree module recipe can be running
do_compile at any time. This may affect build time if multiple, large
out-of-tree modules are being built but that should be rare and the impact on
overall build time should be low.

Signed-off-by: Paul Barker <paul.barker@commagility.com>
---
 meta/classes/module.bbclass | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/meta/classes/module.bbclass b/meta/classes/module.bbclass
index 0952c0c..74280f0 100644
--- a/meta/classes/module.bbclass
+++ b/meta/classes/module.bbclass
@@ -4,6 +4,10 @@ addtask make_scripts after do_patch before do_compile
 do_make_scripts[lockfiles] = "${TMPDIR}/kernel-scripts.lock"
 do_make_scripts[depends] += "virtual/kernel:do_shared_workdir"
 
+# Ensure one recipe isn't running do_make_scripts whilst another is using those
+# scripts in do_compile.
+do_compile[lockfiles] = "${TMPDIR}/kernel-scripts.lock"
+
 EXTRA_OEMAKE += "KERNEL_SRC=${STAGING_KERNEL_DIR}"
 
 module_do_compile() {
-- 
1.8.3.1




* Re: [PATCH RFC] module.bbclass: Fix potential do_compile/do_make_scripts race condition
  2015-12-06 11:26 [PATCH RFC] module.bbclass: Fix potential do_compile/do_make_scripts race condition Paul Barker
  2015-12-06 11:26 ` Paul Barker
@ 2015-12-15 14:04 ` Paul Barker
  2015-12-15 14:42   ` Bruce Ashfield
  2017-05-08 10:29   ` Mike Crowe
  1 sibling, 2 replies; 7+ messages in thread
From: Paul Barker @ 2015-12-15 14:04 UTC (permalink / raw)
  To: openembedded-core

On Sun, 6 Dec 2015 11:26:33 +0000
Paul Barker <paul.barker@commagility.com> wrote:

> I ran into a race condition building multiple external modules against a 3.10.y
> series kernel using the dylan branch of OpenEmbedded. This is difficult to
> reproduce as it requires very specific timing: the do_make_scripts task for one
> module was linking the modpost script whilst the do_compile task for another
> module was attempting to use the modpost script. This resulted in a permission
> error:
> 
> ERROR: Function failed: do_compile (see /home/COMMAGILITY/paul.barker/linux-bsp/work-1.0.1/ca-linux-bsp/build/tmp/work/amc_d24a4-oe-linux-gnueabi/ti-hplib-mod/01.01.00.04-r3d/temp/log.do_compile.25434 for further information)
> ERROR: Logfile of failure stored in: /home/COMMAGILITY/paul.barker/linux-bsp/work-1.0.1/ca-linux-bsp/build/tmp/work/amc_d24a4-oe-linux-gnueabi/ti-hplib-mod/01.01.00.04-r3d/temp/log.do_compile.25434
> Log data follows:
> | DEBUG: Executing shell function do_compile
> | make -C /home/COMMAGILITY/paul.barker/linux-bsp/work-1.0.1/ca-linux-bsp/build/tmp/sysroots/amc-d24a4/usr/src/kernel M=$PWD clean
> | make[1]: Entering directory `/home/COMMAGILITY/paul.barker/linux-bsp/work-1.0.1/ca-linux-bsp/build/tmp/sysroots/amc-d24a4/usr/src/kernel'
> | make[1]: Leaving directory `/home/COMMAGILITY/paul.barker/linux-bsp/work-1.0.1/ca-linux-bsp/build/tmp/sysroots/amc-d24a4/usr/src/kernel'
> | make -C /home/COMMAGILITY/paul.barker/linux-bsp/work-1.0.1/ca-linux-bsp/build/tmp/sysroots/amc-d24a4/usr/src/kernel M=$PWD modules
> | make[1]: Entering directory `/home/COMMAGILITY/paul.barker/linux-bsp/work-1.0.1/ca-linux-bsp/build/tmp/sysroots/amc-d24a4/usr/src/kernel'
> |   CC [M]  /home/COMMAGILITY/paul.barker/linux-bsp/work-1.0.1/ca-linux-bsp/build/tmp/work/amc_d24a4-oe-linux-gnueabi/ti-hplib-mod/01.01.00.04-r3d/git/ti/runtime/hplib/module/hplibmod.o
> |   Building modules, stage 2.
> |   MODPOST 1 modules
> | /bin/sh: scripts/mod/modpost: Permission denied
> | make[2]: *** [__modpost] Error 126
> | make[1]: *** [modules] Error 2
> | make[1]: Leaving directory `/home/COMMAGILITY/paul.barker/linux-bsp/work-1.0.1/ca-linux-bsp/build/tmp/sysroots/amc-d24a4/usr/src/kernel'
> | make: *** [default] Error 2
> | ERROR: Function failed: do_compile (see /home/COMMAGILITY/paul.barker/linux-bsp/work-1.0.1/ca-linux-bsp/build/tmp/work/amc_d24a4-oe-linux-gnueabi/ti-hplib-mod/01.01.00.04-r3d/temp/log.do_compile.25434 for further information)
> ERROR: Task 1284 (/home/COMMAGILITY/paul.barker/linux-bsp/work-1.0.1/ca-linux-bsp/meta-mcsdk/meta-arago-extras/recipes-bsp/ti-hplib/ti-hplib-mod_git.bb, do_compile) failed with exit code '1'
> 
> Later kernel versions do not rebuild the modpost script every time that 'make
> scripts' is invoked so they should be safe from this particular failure. However
> I'm not convinced that running 'make scripts' whilst also building an
> out-of-tree module is always safe on later kernels and there is always the
> potential for vendor kernels to have different behaviour here.
> 
> Although this was seen on the dylan branch the behaviour of master and jethro
> looks to be the same here - do_make_scripts is locked so that only one instance
> of it may run at one time but there is nothing to prevent one instance of
> do_make_scripts running at the same time as an instance of do_compile.
> 
> The patch I'm sending attempts to solve this issue by locking the do_compile
> task with the same lockfile as the do_make_scripts task in module.bbclass so
> that an instance of do_compile can't run at the same time as an instance of
> do_make_scripts. I don't know enough about the task locking to guarantee that
> this is the right solution or to be able to test that it works as expected so
> I'm marking the patch as an RFC.
> 
> Please let me know if this is the right approach and if there is any easy way to
> test this.
> 
> Paul Barker (1):
>   module.bbclass: Fix potential do_compile/do_make_scripts race
>     condition
> 
>  meta/classes/module.bbclass | 4 ++++
>  1 file changed, 4 insertions(+)
> 

ping on this.

I've just got bitten by this again so it's not a one-off. Is anyone able to
give me some feedback on the patch, whether this is the right approach to fix
the problem and whether this is applicable to jethro/master?

Thanks,

-- 
Paul Barker
CommAgility Ltd



* Re: [PATCH RFC] module.bbclass: Fix potential do_compile/do_make_scripts race condition
  2015-12-15 14:04 ` Paul Barker
@ 2015-12-15 14:42   ` Bruce Ashfield
  2017-05-08 10:29   ` Mike Crowe
  1 sibling, 0 replies; 7+ messages in thread
From: Bruce Ashfield @ 2015-12-15 14:42 UTC (permalink / raw)
  To: Paul Barker; +Cc: Patches and discussions about the oe-core layer

On Tue, Dec 15, 2015 at 9:04 AM, Paul Barker <paul.barker@commagility.com>
wrote:

> On Sun, 6 Dec 2015 11:26:33 +0000
> Paul Barker <paul.barker@commagility.com> wrote:
>
> > I ran into a race condition building multiple external modules against a 3.10.y
> > series kernel using the dylan branch of OpenEmbedded. This is difficult to
> > reproduce as it requires very specific timing: the do_make_scripts task for one
> > module was linking the modpost script whilst the do_compile task for another
> > module was attempting to use the modpost script. This resulted in a permission
> > error:
> >
> > ERROR: Function failed: do_compile (see /home/COMMAGILITY/paul.barker/linux-bsp/work-1.0.1/ca-linux-bsp/build/tmp/work/amc_d24a4-oe-linux-gnueabi/ti-hplib-mod/01.01.00.04-r3d/temp/log.do_compile.25434 for further information)
> > ERROR: Logfile of failure stored in: /home/COMMAGILITY/paul.barker/linux-bsp/work-1.0.1/ca-linux-bsp/build/tmp/work/amc_d24a4-oe-linux-gnueabi/ti-hplib-mod/01.01.00.04-r3d/temp/log.do_compile.25434
> > Log data follows:
> > | DEBUG: Executing shell function do_compile
> > | make -C /home/COMMAGILITY/paul.barker/linux-bsp/work-1.0.1/ca-linux-bsp/build/tmp/sysroots/amc-d24a4/usr/src/kernel M=$PWD clean
> > | make[1]: Entering directory `/home/COMMAGILITY/paul.barker/linux-bsp/work-1.0.1/ca-linux-bsp/build/tmp/sysroots/amc-d24a4/usr/src/kernel'
> > | make[1]: Leaving directory `/home/COMMAGILITY/paul.barker/linux-bsp/work-1.0.1/ca-linux-bsp/build/tmp/sysroots/amc-d24a4/usr/src/kernel'
> > | make -C /home/COMMAGILITY/paul.barker/linux-bsp/work-1.0.1/ca-linux-bsp/build/tmp/sysroots/amc-d24a4/usr/src/kernel M=$PWD modules
> > | make[1]: Entering directory `/home/COMMAGILITY/paul.barker/linux-bsp/work-1.0.1/ca-linux-bsp/build/tmp/sysroots/amc-d24a4/usr/src/kernel'
> > |   CC [M]  /home/COMMAGILITY/paul.barker/linux-bsp/work-1.0.1/ca-linux-bsp/build/tmp/work/amc_d24a4-oe-linux-gnueabi/ti-hplib-mod/01.01.00.04-r3d/git/ti/runtime/hplib/module/hplibmod.o
> > |   Building modules, stage 2.
> > |   MODPOST 1 modules
> > | /bin/sh: scripts/mod/modpost: Permission denied
> > | make[2]: *** [__modpost] Error 126
> > | make[1]: *** [modules] Error 2
> > | make[1]: Leaving directory `/home/COMMAGILITY/paul.barker/linux-bsp/work-1.0.1/ca-linux-bsp/build/tmp/sysroots/amc-d24a4/usr/src/kernel'
> > | make: *** [default] Error 2
> > | ERROR: Function failed: do_compile (see /home/COMMAGILITY/paul.barker/linux-bsp/work-1.0.1/ca-linux-bsp/build/tmp/work/amc_d24a4-oe-linux-gnueabi/ti-hplib-mod/01.01.00.04-r3d/temp/log.do_compile.25434 for further information)
> > ERROR: Task 1284 (/home/COMMAGILITY/paul.barker/linux-bsp/work-1.0.1/ca-linux-bsp/meta-mcsdk/meta-arago-extras/recipes-bsp/ti-hplib/ti-hplib-mod_git.bb, do_compile) failed with exit code '1'
> >
> > Later kernel versions do not rebuild the modpost script every time that 'make
> > scripts' is invoked so they should be safe from this particular failure. However
> > I'm not convinced that running 'make scripts' whilst also building an
> > out-of-tree module is always safe on later kernels and there is always the
> > potential for vendor kernels to have different behaviour here.
> >
> > Although this was seen on the dylan branch the behaviour of master and jethro
> > looks to be the same here - do_make_scripts is locked so that only one instance
> > of it may run at one time but there is nothing to prevent one instance of
> > do_make_scripts running at the same time as an instance of do_compile.
> >
> > The patch I'm sending attempts to solve this issue by locking the do_compile
> > task with the same lockfile as the do_make_scripts task in module.bbclass so
> > that an instance of do_compile can't run at the same time as an instance of
> > do_make_scripts. I don't know enough about the task locking to guarantee that
> > this is the right solution or to be able to test that it works as expected so
> > I'm marking the patch as an RFC.
> >
> > Please let me know if this is the right approach and if there is any easy way to
> > test this.
> >
> > Paul Barker (1):
> >   module.bbclass: Fix potential do_compile/do_make_scripts race
> >     condition
> >
> >  meta/classes/module.bbclass | 4 ++++
> >  1 file changed, 4 insertions(+)
> >
>
> ping on this.
>

Sorry. I was traveling when this landed .. made a mental note .. and then
never looped around.


>
> I've just got bitten by this again so it's not a one-off. Is anyone able to
> give me some feedback on the patch, whether this is the right approach to fix
> the problem and whether this is applicable to jethro/master?
>

The approach makes sense to me, and it was what I was considering for
generating symbols after do_compile_modules. As long as it isn't serializing
a huge part of the build, the impacts aren't even measurable.

So this change looks sane to me.

Bruce




>
> Thanks,
>
> --
> Paul Barker
> CommAgility Ltd
>



-- 
"Thou shalt not follow the NULL pointer, for chaos and madness await thee
at its end"


* Re: [PATCH RFC] module.bbclass: Fix potential do_compile/do_make_scripts race condition
  2015-12-15 14:04 ` Paul Barker
  2015-12-15 14:42   ` Bruce Ashfield
@ 2017-05-08 10:29   ` Mike Crowe
  2017-05-14  9:07     ` Mike Crowe
  1 sibling, 1 reply; 7+ messages in thread
From: Mike Crowe @ 2017-05-08 10:29 UTC (permalink / raw)
  To: openembedded-core; +Cc: Paul Barker

On Tuesday 15 December 2015 at 14:04:34 +0000, Paul Barker wrote:
> On Sun, 6 Dec 2015 11:26:33 +0000
> Paul Barker <paul.barker@commagility.com> wrote:
> 
> > I ran into a race condition building multiple external modules against a 3.10.y
> > series kernel using the dylan branch of OpenEmbedded. This is difficult to
> > reproduce as it requires very specific timing: the do_make_scripts task for one
> > module was linking the modpost script whilst the do_compile task for another
> > module was attempting to use the modpost script. This resulted in a permission
> > error:
> > 
> > ERROR: Function failed: do_compile (see /home/COMMAGILITY/paul.barker/linux-bsp/work-1.0.1/ca-linux-bsp/build/tmp/work/amc_d24a4-oe-linux-gnueabi/ti-hplib-mod/01.01.00.04-r3d/temp/log.do_compile.25434 for further information)
> > ERROR: Logfile of failure stored in: /home/COMMAGILITY/paul.barker/linux-bsp/work-1.0.1/ca-linux-bsp/build/tmp/work/amc_d24a4-oe-linux-gnueabi/ti-hplib-mod/01.01.00.04-r3d/temp/log.do_compile.25434
> > Log data follows:
> > | DEBUG: Executing shell function do_compile
> > | make -C /home/COMMAGILITY/paul.barker/linux-bsp/work-1.0.1/ca-linux-bsp/build/tmp/sysroots/amc-d24a4/usr/src/kernel M=$PWD clean
> > | make[1]: Entering directory `/home/COMMAGILITY/paul.barker/linux-bsp/work-1.0.1/ca-linux-bsp/build/tmp/sysroots/amc-d24a4/usr/src/kernel'
> > | make[1]: Leaving directory `/home/COMMAGILITY/paul.barker/linux-bsp/work-1.0.1/ca-linux-bsp/build/tmp/sysroots/amc-d24a4/usr/src/kernel'
> > | make -C /home/COMMAGILITY/paul.barker/linux-bsp/work-1.0.1/ca-linux-bsp/build/tmp/sysroots/amc-d24a4/usr/src/kernel M=$PWD modules
> > | make[1]: Entering directory `/home/COMMAGILITY/paul.barker/linux-bsp/work-1.0.1/ca-linux-bsp/build/tmp/sysroots/amc-d24a4/usr/src/kernel'
> > |   CC [M]  /home/COMMAGILITY/paul.barker/linux-bsp/work-1.0.1/ca-linux-bsp/build/tmp/work/amc_d24a4-oe-linux-gnueabi/ti-hplib-mod/01.01.00.04-r3d/git/ti/runtime/hplib/module/hplibmod.o
> > |   Building modules, stage 2.
> > |   MODPOST 1 modules
> > | /bin/sh: scripts/mod/modpost: Permission denied
> > | make[2]: *** [__modpost] Error 126
> > | make[1]: *** [modules] Error 2
> > | make[1]: Leaving directory `/home/COMMAGILITY/paul.barker/linux-bsp/work-1.0.1/ca-linux-bsp/build/tmp/sysroots/amc-d24a4/usr/src/kernel'
> > | make: *** [default] Error 2
> > | ERROR: Function failed: do_compile (see /home/COMMAGILITY/paul.barker/linux-bsp/work-1.0.1/ca-linux-bsp/build/tmp/work/amc_d24a4-oe-linux-gnueabi/ti-hplib-mod/01.01.00.04-r3d/temp/log.do_compile.25434 for further information)
> > ERROR: Task 1284 (/home/COMMAGILITY/paul.barker/linux-bsp/work-1.0.1/ca-linux-bsp/meta-mcsdk/meta-arago-extras/recipes-bsp/ti-hplib/ti-hplib-mod_git.bb, do_compile) failed with exit code '1'
> > 
> > Later kernel versions do not rebuild the modpost script every time that 'make
> > scripts' is invoked so they should be safe from this particular failure. However
> > I'm not convinced that running 'make scripts' whilst also building an
> > out-of-tree module is always safe on later kernels and there is always the
> > potential for vendor kernels to have different behaviour here.
> > 
> > Although this was seen on the dylan branch the behaviour of master and jethro
> > looks to be the same here - do_make_scripts is locked so that only one instance
> > of it may run at one time but there is nothing to prevent one instance of
> > do_make_scripts running at the same time as an instance of do_compile.
> > 
> > The patch I'm sending attempts to solve this issue by locking the do_compile
> > task with the same lockfile as the do_make_scripts task in module.bbclass so
> > that an instance of do_compile can't run at the same time as an instance of
> > do_make_scripts. I don't know enough about the task locking to guarantee that
> > this is the right solution or to be able to test that it works as expected so
> > I'm marking the patch as an RFC.
> > 
> > Please let me know if this is the right approach and if there is any easy way to
> > test this.
> > 
> > Paul Barker (1):
> >   module.bbclass: Fix potential do_compile/do_make_scripts race
> >     condition
> > 
> >  meta/classes/module.bbclass | 4 ++++
> >  1 file changed, 4 insertions(+)
> > 
> 
> ping on this.
> 
> I've just got bitten by this again so it's not a one-off. Is anyone able to
> give me some feedback on the patch, whether this is the right approach to fix
> the problem and whether this is applicable to jethro/master?

We've started seeing the same symptom, but with a v3.14 kernel. We have
several recipes that build out-of-tree modules and I can see
do_make_scripts for one running at the same time as do_compile for the one
that fails.

If I try to reproduce the problem by hand, I cannot. However, I only see
modpost being compiled for one of the tasks in the logs.

I can't really explain why I see the problem with a newer kernel.
Regardless, it seems unwise to even attempt to run do_make_scripts and
do_compile in parallel.

It looks like this patch was reviewed favourably, but it doesn't seem to have
made it into master.

In the meantime, I'll try this patch and see if it makes the problem go
away for us.

Thanks.

Mike.

Original patch at https://patchwork.openembedded.org/patch/109269/ and
thread at
http://lists.openembedded.org/pipermail/openembedded-core/2015-December/113752.html
for those without long-term email archives.



* Re: [PATCH RFC] module.bbclass: Fix potential do_compile/do_make_scripts race condition
  2017-05-08 10:29   ` Mike Crowe
@ 2017-05-14  9:07     ` Mike Crowe
  2018-03-28  8:24       ` Javier Viguera
  0 siblings, 1 reply; 7+ messages in thread
From: Mike Crowe @ 2017-05-14  9:07 UTC (permalink / raw)
  To: openembedded-core; +Cc: Paul Barker

On Tuesday 15 December 2015 at 14:04:34 +0000, Paul Barker wrote:
> > On Sun, 6 Dec 2015 11:26:33 +0000
> > Paul Barker <paul.barker@commagility.com> wrote:
> > 
> > > I ran into a race condition building multiple external modules against a 3.10.y
> > > series kernel using the dylan branch of OpenEmbedded. This is difficult to
> > > reproduce as it requires very specific timing: the do_make_scripts task for one
> > > module was linking the modpost script whilst the do_compile task for another
> > > module was attempting to use the modpost script. This resulted in a permission
> > > error:

[snip]

> > > | /bin/sh: scripts/mod/modpost: Permission denied

[snip]

On Monday 08 May 2017 at 11:29:03 +0100, Mike Crowe wrote:
> We've started seeing the same symptom, but with a v3.14 kernel. We have
> several recipes that build out-of-tree modules and I can see
> do_make_scripts for one running at the same time as do_compile for the one
> that fails.
> 
> If I try to reproduce the problem by hand, I cannot. However, I only see
> modpost being compiled for one of the tasks in the logs.
> 
> I can't really explain why I see the problem with a newer kernel.
> Regardless, it seems unwise to even attempt to run do_make_scripts and
> do_compile in parallel.
> 
> It looks like this patch was reviewed favourably, but it doesn't seem to have
> made it into master.
> 
> In the meantime, I'll try this patch and see if it makes the problem go
> away for us.

The patch does seem to have resolved the problem, although there haven't
yet been enough test runs to be completely sure.

A complete copy of Paul Barker's original patch applied to master follows.

Mike.

--8<--

From 4615110d4f5eb7925c280ed38789fb8aff44379f Mon Sep 17 00:00:00 2001
From: Paul Barker <paul.barker@commagility.com>
Date: Sun, 6 Dec 2015 11:26:34 +0000
Subject: [PATCH] module.bbclass: Fix potential do_compile/do_make_scripts race
 condition

In the Linux 3.10.y series, repeatedly running 'make scripts' causes the modpost
script to be rebuilt every time. This causes a potential race condition when
building multiple external modules against a 3.10.y series kernel: one recipe
may be running do_make_scripts and modifying modpost whilst another is running
do_compile and using modpost. In my case this caused a permission error during
do_compile for one recipe as modpost was not executable at the time it tried to
run it.

In Linux 3.11 this was resolved and multiple invocations of 'make scripts' do
not cause modpost to be rebuilt. However, there are still vendor kernels in use
based on the 3.10.y series. Even on later kernel versions it would require more
investigation to conclude that running 'make scripts' whilst also building an
out-of-tree module is always safe. Therefore we should prevent do_make_scripts
from running at the same time as do_compile.

A side effect of this is that only one out-of-tree module recipe can be running
do_compile at any time. This may affect build time if multiple, large
out-of-tree modules are being built but that should be rare and the impact on
overall build time should be low.

Signed-off-by: Paul Barker <paul.barker@commagility.com>
Tested-by: Mike Crowe <mac@mcrowe.com>
---
 meta/classes/module.bbclass | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/meta/classes/module.bbclass b/meta/classes/module.bbclass
index 802476bc7a..1971f41156 100644
--- a/meta/classes/module.bbclass
+++ b/meta/classes/module.bbclass
@@ -4,6 +4,10 @@ addtask make_scripts after do_prepare_recipe_sysroot before do_compile
 do_make_scripts[lockfiles] = "${TMPDIR}/kernel-scripts.lock"
 do_make_scripts[depends] += "virtual/kernel:do_shared_workdir"
 
+# Ensure one recipe isn't running do_make_scripts whilst another is using those
+# scripts in do_compile.
+do_compile[lockfiles] = "${TMPDIR}/kernel-scripts.lock"
+
 EXTRA_OEMAKE += "KERNEL_SRC=${STAGING_KERNEL_DIR}"
 
 MODULES_INSTALL_TARGET ?= "modules_install"
-- 
2.11.0




* Re: [PATCH RFC] module.bbclass: Fix potential do_compile/do_make_scripts race condition
  2017-05-14  9:07     ` Mike Crowe
@ 2018-03-28  8:24       ` Javier Viguera
  0 siblings, 0 replies; 7+ messages in thread
From: Javier Viguera @ 2018-03-28  8:24 UTC (permalink / raw)
  To: openembedded-core

On 14/05/17 11:07, Mike Crowe wrote:
> 
> On Monday 08 May 2017 at 11:29:03 +0100, Mike Crowe wrote:
>> We've started seeing the same symptom, but with a v3.14 kernel. We have
>> several recipes that build out-of-tree modules and I can see
>> do_make_scripts for one running at the same time as do_compile for the one
>> that fails.
>>

Even though I have not seen this running manually, our buildservers have 
reproduced it using Rocko and a v4.9 kernel, building the external 
imx-gpu kernel module for a NXP IMX6 based hardware.

So yeah, still happening.

-- 
Javier Viguera
Software Engineer

Digi International® Spain S.A.U.


