All of lore.kernel.org
 help / color / mirror / Atom feed
* [bitbake-devel][PATCH] siggen: Fix insufficent entropy in sigtask file names
@ 2022-08-03 14:04 Joshua Watt
  2022-08-04  9:26 ` Luca Ceresoli
  0 siblings, 1 reply; 5+ messages in thread
From: Joshua Watt @ 2022-08-03 14:04 UTC (permalink / raw)
  To: bitbake-devel; +Cc: rasmus.villemoes, Joshua Watt

Signature generation uses mkstemp() to get a file descriptor to a unique
file and then write the signature into it. However, the unique file name
generation in glibc is based on the system timestamp, which means that
with highly parallel builds it is more likely than one might expect
expected that a conflict will occur between two different builder nodes.
When operating over NFS (such as a shared sstate cache), this can cause
race conditions and rare failures (particularly with NFS servers that
may not correctly implement O_EXCL).

The signature generation code is particularly susceptible to races since
a single "sigtask." prefix used for all signatures from all tasks, which
makes collision even more likely.

To work around this, add an internal implementation of mkstemp() that
adds additional truly random entropy to the file name to eliminate
conflicts.

Signed-off-by: Joshua Watt <JPEWhacker@gmail.com>
---
 bitbake/lib/bb/siggen.py |  2 +-
 bitbake/lib/bb/utils.py  | 21 +++++++++++++++++++++
 2 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/bitbake/lib/bb/siggen.py b/bitbake/lib/bb/siggen.py
index 3f3d6df54d..bb80343a97 100644
--- a/bitbake/lib/bb/siggen.py
+++ b/bitbake/lib/bb/siggen.py
@@ -425,7 +425,7 @@ class SignatureGeneratorBasic(SignatureGenerator):
                 bb.error("Taskhash mismatch %s versus %s for %s" % (computed_taskhash, self.taskhash[tid], tid))
                 sigfile = sigfile.replace(self.taskhash[tid], computed_taskhash)
 
-        fd, tmpfile = tempfile.mkstemp(dir=os.path.dirname(sigfile), prefix="sigtask.")
+        fd, tmpfile = bb.utils.mkstemp(dir=os.path.dirname(sigfile), prefix="sigtask.")
         try:
             with bb.compress.zstd.open(fd, "wt", encoding="utf-8", num_threads=1) as f:
                 json.dump(data, f, sort_keys=True, separators=(",", ":"), cls=SetEncoder)
diff --git a/bitbake/lib/bb/utils.py b/bitbake/lib/bb/utils.py
index 19ed68ea62..b8b90df8d3 100644
--- a/bitbake/lib/bb/utils.py
+++ b/bitbake/lib/bb/utils.py
@@ -28,6 +28,8 @@ import signal
 import collections
 import copy
 import ctypes
+import random
+import tempfile
 from subprocess import getstatusoutput
 from contextlib import contextmanager
 from ctypes import cdll
@@ -1754,3 +1756,22 @@ def is_local_uid(uid=''):
             if str(uid) == line_split[2]:
                 return True
     return False
+
+def mkstemp(suffix=None, prefix=None, dir=None, text=False):
+    """
+    Generates a unique filename, independent of time.
+
+    mkstemp() in glibc (at least) generates unique file names based on the
+    current system time. When combined with highly parallel builds, and
+    operating over NFS (e.g. shared sstate/downloads) this can result in
+    conflicts and race conditions.
+
+    This function adds additional entropy to the file name so that a collision
+    is independent of time and thus extremely unlikely.
+    """
+    entropy = "".join(random.choices("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890", k=20))
+    if prefix:
+        prefix = prefix + entropy
+    else:
+        prefix = tempfile.gettempprefix() + entropy
+    return tempfile.mkstemp(suffix=suffix, prefix=prefix, dir=dir, text=text)
-- 
2.33.0



^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [bitbake-devel][PATCH] siggen: Fix insufficent entropy in sigtask file names
  2022-08-03 14:04 [bitbake-devel][PATCH] siggen: Fix insufficent entropy in sigtask file names Joshua Watt
@ 2022-08-04  9:26 ` Luca Ceresoli
  2022-08-09 21:44   ` Joshua Watt
  0 siblings, 1 reply; 5+ messages in thread
From: Luca Ceresoli @ 2022-08-04  9:26 UTC (permalink / raw)
  To: Joshua Watt; +Cc: bitbake-devel, rasmus.villemoes

Hi Joshua,

On Wed,  3 Aug 2022 09:04:41 -0500
"Joshua Watt" <JPEWhacker@gmail.com> wrote:

> Signature generation uses mkstemp() to get a file descriptor to a unique
> file and then write the signature into it. However, the unique file name
> generation in glibc is based on the system timestamp, which means that
> with highly parallel builds it is more likely than one might expect
> expected that a conflict will occur between two different builder nodes.

s/expect expected/expect/

Taken in my testing branch with the above fix.

-- 
Luca Ceresoli, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [bitbake-devel][PATCH] siggen: Fix insufficent entropy in sigtask file names
  2022-08-04  9:26 ` Luca Ceresoli
@ 2022-08-09 21:44   ` Joshua Watt
  2022-08-09 21:47     ` Steve Sakoman
  0 siblings, 1 reply; 5+ messages in thread
From: Joshua Watt @ 2022-08-09 21:44 UTC (permalink / raw)
  To: Luca Ceresoli; +Cc: bitbake-devel, rasmus.villemoes, Steve Sakoman

On Thu, Aug 4, 2022 at 4:26 AM Luca Ceresoli <luca.ceresoli@bootlin.com> wrote:
>
> Hi Joshua,
>
> On Wed,  3 Aug 2022 09:04:41 -0500
> "Joshua Watt" <JPEWhacker@gmail.com> wrote:
>
> > Signature generation uses mkstemp() to get a file descriptor to a unique
> > file and then write the signature into it. However, the unique file name
> > generation in glibc is based on the system timestamp, which means that
> > with highly parallel builds it is more likely than one might expect
> > expected that a conflict will occur between two different builder nodes.
>
> s/expect expected/expect/
>
> Taken in my testing branch with the above fix.

Steve, can you please backport this to kirkstone and dunfell? There
may be a context failure backporting it to dunfell, let me know if you
want me to send a new patch with the correct context.

Thanks

>
> --
> Luca Ceresoli, Bootlin
> Embedded Linux and Kernel engineering
> https://bootlin.com


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [bitbake-devel][PATCH] siggen: Fix insufficent entropy in sigtask file names
  2022-08-09 21:44   ` Joshua Watt
@ 2022-08-09 21:47     ` Steve Sakoman
  2022-08-10 14:03       ` Joshua Watt
  0 siblings, 1 reply; 5+ messages in thread
From: Steve Sakoman @ 2022-08-09 21:47 UTC (permalink / raw)
  To: Joshua Watt; +Cc: Luca Ceresoli, bitbake-devel, rasmus.villemoes

On Tue, Aug 9, 2022 at 11:44 AM Joshua Watt <jpewhacker@gmail.com> wrote:
>
> On Thu, Aug 4, 2022 at 4:26 AM Luca Ceresoli <luca.ceresoli@bootlin.com> wrote:
> >
> > Hi Joshua,
> >
> > On Wed,  3 Aug 2022 09:04:41 -0500
> > "Joshua Watt" <JPEWhacker@gmail.com> wrote:
> >
> > > Signature generation uses mkstemp() to get a file descriptor to a unique
> > > file and then write the signature into it. However, the unique file name
> > > generation in glibc is based on the system timestamp, which means that
> > > with highly parallel builds it is more likely than one might expect
> > > expected that a conflict will occur between two different builder nodes.
> >
> > s/expect expected/expect/
> >
> > Taken in my testing branch with the above fix.
>
> Steve, can you please backport this to kirkstone and dunfell? There
> may be a context failure backporting it to dunfell, let me know if you
> want me to send a new patch with the correct context.

Yes! I'll probably get to that tomorrow after I finish up testing for
the 4.0.3 release.  I'll let you know if I run into any issues I'm not
confident about resolving.

Steve


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [bitbake-devel][PATCH] siggen: Fix insufficent entropy in sigtask file names
  2022-08-09 21:47     ` Steve Sakoman
@ 2022-08-10 14:03       ` Joshua Watt
  0 siblings, 0 replies; 5+ messages in thread
From: Joshua Watt @ 2022-08-10 14:03 UTC (permalink / raw)
  To: Steve Sakoman; +Cc: Luca Ceresoli, bitbake-devel, rasmus.villemoes

On Tue, Aug 9, 2022 at 4:48 PM Steve Sakoman <steve@sakoman.com> wrote:
>
> On Tue, Aug 9, 2022 at 11:44 AM Joshua Watt <jpewhacker@gmail.com> wrote:
> >
> > On Thu, Aug 4, 2022 at 4:26 AM Luca Ceresoli <luca.ceresoli@bootlin.com> wrote:
> > >
> > > Hi Joshua,
> > >
> > > On Wed,  3 Aug 2022 09:04:41 -0500
> > > "Joshua Watt" <JPEWhacker@gmail.com> wrote:
> > >
> > > > Signature generation uses mkstemp() to get a file descriptor to a unique
> > > > file and then write the signature into it. However, the unique file name
> > > > generation in glibc is based on the system timestamp, which means that
> > > > with highly parallel builds it is more likely than one might expect
> > > > expected that a conflict will occur between two different builder nodes.
> > >
> > > s/expect expected/expect/
> > >
> > > Taken in my testing branch with the above fix.
> >
> > Steve, can you please backport this to kirkstone and dunfell? There
> > may be a context failure backporting it to dunfell, let me know if you
> > want me to send a new patch with the correct context.
>
> Yes! I'll probably get to that tomorrow after I finish up testing for
> the 4.0.3 release.  I'll let you know if I run into any issues I'm not
> confident about resolving.

That's fine. I also just realized this relies on Python 3.6
functionality, so hold off on backporting it to dunfell for the time
being; Hardknott should be OK still

>
> Steve


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2022-08-10 14:03 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-08-03 14:04 [bitbake-devel][PATCH] siggen: Fix insufficent entropy in sigtask file names Joshua Watt
2022-08-04  9:26 ` Luca Ceresoli
2022-08-09 21:44   ` Joshua Watt
2022-08-09 21:47     ` Steve Sakoman
2022-08-10 14:03       ` Joshua Watt

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.