bitbake-devel.lists.openembedded.org archive mirror
 help / color / mirror / Atom feed
* [bitbake][kirkstone][2.0][PATCH 0/4] Patch review
@ 2023-02-14 15:28 Steve Sakoman
  2023-02-14 15:28 ` [bitbake][kirkstone][2.0][PATCH 1/4] siggen: Fix inefficient string concatenation Steve Sakoman
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Steve Sakoman @ 2023-02-14 15:28 UTC (permalink / raw)
  To: bitbake-devel

Please review this set of patches for 2.0/kirkstone and have comments back by
end of day Thursday.

Passed a-full on autobuilder:

https://autobuilder.yoctoproject.org/typhoon/#/builders/83/builds/4921

The following changes since commit 86f2fa5261da959cda706c794a0047e5e89d4d6b:

  fetch2/git: Clarify the meaning of namespace (2023-02-01 09:03:36 -1000)

are available in the Git repository at:

  https://git.openembedded.org/bitbake-contrib stable/2.0-nut
  http://cgit.openembedded.org/bitbake-contrib/log/?h=stable/2.0-nut

Etienne Cordonnier (1):
  siggen: Fix inefficient string concatenation

Marius Kriegerowski (1):
  bitbake-diffsigs: Make PEP8 compliant

Mark Hatle (1):
  utils/ply: Update md5 to better report errors with hashlib

Schmidt, Adriaan (1):
  bitbake-diffsigs: break on first dependent task difference

 bin/bitbake-diffsigs | 49 ++++++++++++++++++++++++--------------------
 lib/bb/siggen.py     | 11 +++++-----
 lib/bb/utils.py      |  7 ++++++-
 lib/ply/yacc.py      |  7 +++++++
 4 files changed, 46 insertions(+), 28 deletions(-)

-- 
2.34.1



^ permalink raw reply	[flat|nested] 5+ messages in thread

* [bitbake][kirkstone][2.0][PATCH 1/4] siggen: Fix inefficient string concatenation
  2023-02-14 15:28 [bitbake][kirkstone][2.0][PATCH 0/4] Patch review Steve Sakoman
@ 2023-02-14 15:28 ` Steve Sakoman
  2023-02-14 15:28 ` [bitbake][kirkstone][2.0][PATCH 2/4] utils/ply: Update md5 to better report errors with hashlib Steve Sakoman
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Steve Sakoman @ 2023-02-14 15:28 UTC (permalink / raw)
  To: bitbake-devel

From: Etienne Cordonnier <ecordonnier@snap.com>

As discussed in https://stackoverflow.com/a/4435752/1710392 , CPython
has an optimization for statements in the form "a = a + b" or "a += b".
It seems that this line does not get optimized, because it has a form a = a + b + c:
data = data + "./" + f.split("/./")[1]

For that reason, it does a copy of data for each iteration, potentially copying megabytes
of data for each iteration.

Changing this line causes SignatureGeneratorBasic::get_taskhash to take 0.06 seconds
instead of 45 seconds on my test setup where SRC_URI points to a big directory.

Note that PEP8 recommends explicitely not to use this optimization which is specific to CPython:
"do not rely on CPython’s efficient implementation of in-place string concatenation for statements in the form a += b or a = a + b"

However, the PEP8 recommended form using "join()" also does not avoid the copy and takes 45 seconds in my test setup:
data = ''.join((data, "./", f.split("/./")[1]))

I have changed the other lines to also use += for consistency only, however those were in the form a = a + b
and were optimized already.

Co-authored-by: JJ Robertson <jrobertson@snap.com>
Signed-off-by: Etienne Cordonnier <ecordonnier@snap.com>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
(cherry picked from commit 195750f2ca355e29d51219c58ecb2c1d83692717)
Signed-off-by: Steve Sakoman <steve@sakoman.com>
---
 lib/bb/siggen.py | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/lib/bb/siggen.py b/lib/bb/siggen.py
index 9a20fc8e..cea3a538 100644
--- a/lib/bb/siggen.py
+++ b/lib/bb/siggen.py
@@ -329,19 +329,19 @@ class SignatureGeneratorBasic(SignatureGenerator):
 
         data = self.basehash[tid]
         for dep in self.runtaskdeps[tid]:
-            data = data + self.get_unihash(dep)
+            data += self.get_unihash(dep)
 
         for (f, cs) in self.file_checksum_values[tid]:
             if cs:
                 if "/./" in f:
-                    data = data + "./" + f.split("/./")[1]
-                data = data + cs
+                    data += "./" + f.split("/./")[1]
+                data += cs
 
         if tid in self.taints:
             if self.taints[tid].startswith("nostamp:"):
-                data = data + self.taints[tid][8:]
+                data += self.taints[tid][8:]
             else:
-                data = data + self.taints[tid]
+                data += self.taints[tid]
 
         h = hashlib.sha256(data.encode("utf-8")).hexdigest()
         self.taskhash[tid] = h
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [bitbake][kirkstone][2.0][PATCH 2/4] utils/ply: Update md5 to better report errors with hashlib
  2023-02-14 15:28 [bitbake][kirkstone][2.0][PATCH 0/4] Patch review Steve Sakoman
  2023-02-14 15:28 ` [bitbake][kirkstone][2.0][PATCH 1/4] siggen: Fix inefficient string concatenation Steve Sakoman
@ 2023-02-14 15:28 ` Steve Sakoman
  2023-02-14 15:28 ` [bitbake][kirkstone][2.0][PATCH 3/4] bitbake-diffsigs: Make PEP8 compliant Steve Sakoman
  2023-02-14 15:28 ` [bitbake][kirkstone][2.0][PATCH 4/4] bitbake-diffsigs: break on first dependent task difference Steve Sakoman
  3 siblings, 0 replies; 5+ messages in thread
From: Steve Sakoman @ 2023-02-14 15:28 UTC (permalink / raw)
  To: bitbake-devel

From: Mark Hatle <mark.hatle@kernel.crashing.org>

In the case where hashlib is not available, the try would fail and fall
through resulting in a backtrace on the usage of the 'sig'.  The backtrace
itself was confusing and made it difficult to determine what went wrong.

Update the import to be in it's own try block with an appropriate
message to indicate what went wrong.

Note, the current version of ply all of this code has been restructured
so this is not applicable upstream.

Additionally, some versions of hashlib don't appear to implement the
second FIPS related argument.  Detect this and support both versions.

Signed-off-by: Mark Hatle <mark.hatle@amd.com>
Signed-off-by: Mark Hatle <mark.hatle@kernel.crashing.org>
Signed-off-by: Luca Ceresoli <luca.ceresoli@bootlin.com>
(cherry picked from commit 484ab42f440070c0369b81f5c69da860fa47a798)
Signed-off-by: Steve Sakoman <steve@sakoman.com>
---
 lib/bb/utils.py | 7 ++++++-
 lib/ply/yacc.py | 7 +++++++
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/lib/bb/utils.py b/lib/bb/utils.py
index 66a8a08c..bca4830f 100644
--- a/lib/bb/utils.py
+++ b/lib/bb/utils.py
@@ -545,7 +545,12 @@ def md5_file(filename):
     Return the hex string representation of the MD5 checksum of filename.
     """
     import hashlib
-    return _hasher(hashlib.new('MD5', usedforsecurity=False), filename)
+    try:
+        sig = hashlib.new('MD5', usedforsecurity=False)
+    except TypeError:
+        # Some configurations don't appear to support two arguments
+        sig = hashlib.new('MD5')
+    return _hasher(sig, filename)
 
 def sha256_file(filename):
     """
diff --git a/lib/ply/yacc.py b/lib/ply/yacc.py
index 767c4e46..381b50cf 100644
--- a/lib/ply/yacc.py
+++ b/lib/ply/yacc.py
@@ -2798,7 +2798,14 @@ class ParserReflect(object):
     def signature(self):
         try:
             import hashlib
+        except ImportError:
+            raise RuntimeError("Unable to import hashlib")
+        try:
             sig = hashlib.new('MD5', usedforsecurity=False)
+        except TypeError:
+            # Some configurations don't appear to support two arguments
+            sig = hashlib.new('MD5')
+        try:
             if self.start:
                 sig.update(self.start.encode('latin-1'))
             if self.prec:
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [bitbake][kirkstone][2.0][PATCH 3/4] bitbake-diffsigs: Make PEP8 compliant
  2023-02-14 15:28 [bitbake][kirkstone][2.0][PATCH 0/4] Patch review Steve Sakoman
  2023-02-14 15:28 ` [bitbake][kirkstone][2.0][PATCH 1/4] siggen: Fix inefficient string concatenation Steve Sakoman
  2023-02-14 15:28 ` [bitbake][kirkstone][2.0][PATCH 2/4] utils/ply: Update md5 to better report errors with hashlib Steve Sakoman
@ 2023-02-14 15:28 ` Steve Sakoman
  2023-02-14 15:28 ` [bitbake][kirkstone][2.0][PATCH 4/4] bitbake-diffsigs: break on first dependent task difference Steve Sakoman
  3 siblings, 0 replies; 5+ messages in thread
From: Steve Sakoman @ 2023-02-14 15:28 UTC (permalink / raw)
  To: bitbake-devel

From: Marius Kriegerowski <marius.kriegerowski@gmail.com>

This ignores flake8 rules:
  * E402 module level import not at top of file
  * E501 line too long

Signed-off-by: Marius Kriegerowski <marius.kriegerowski@gmail.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
(cherry picked from commit e8b176de448dc387c7a578c92b52aef28591038f)
Signed-off-by: Steve Sakoman <steve@sakoman.com>
---
 bin/bitbake-diffsigs | 49 ++++++++++++++++++++++++--------------------
 1 file changed, 27 insertions(+), 22 deletions(-)

diff --git a/bin/bitbake-diffsigs b/bin/bitbake-diffsigs
index cf4cc706..fe0f33ee 100755
--- a/bin/bitbake-diffsigs
+++ b/bin/bitbake-diffsigs
@@ -11,6 +11,7 @@
 import os
 import sys
 import warnings
+
 warnings.simplefilter("default")
 import argparse
 import logging
@@ -27,6 +28,7 @@ logger = bb.msg.logger_create(myname)
 
 is_dump = myname == 'bitbake-dumpsig'
 
+
 def find_siginfo(tinfoil, pn, taskname, sigs=None):
     result = None
     tinfoil.set_event_mask(['bb.event.FindSigInfoResult',
@@ -52,6 +54,7 @@ def find_siginfo(tinfoil, pn, taskname, sigs=None):
         sys.exit(2)
     return result
 
+
 def find_siginfo_task(bbhandler, pn, taskname, sig1=None, sig2=None):
     """ Find the most recent signature files for the specified PN/task """
 
@@ -63,10 +66,10 @@ def find_siginfo_task(bbhandler, pn, taskname, sig1=None, sig2=None):
         if not sigfiles:
             logger.error('No sigdata files found matching %s %s matching either %s or %s' % (pn, taskname, sig1, sig2))
             sys.exit(1)
-        elif not sig1 in sigfiles:
+        elif sig1 not in sigfiles:
             logger.error('No sigdata files found matching %s %s with signature %s' % (pn, taskname, sig1))
             sys.exit(1)
-        elif not sig2 in sigfiles:
+        elif sig2 not in sigfiles:
             logger.error('No sigdata files found matching %s %s with signature %s' % (pn, taskname, sig2))
             sys.exit(1)
         latestfiles = [sigfiles[sig1], sigfiles[sig2]]
@@ -88,9 +91,9 @@ def recursecb(key, hash1, hash2):
     recout = []
     if not hashfiles:
         recout.append("Unable to find matching sigdata for %s with hashes %s or %s" % (key, hash1, hash2))
-    elif not hash1 in hashfiles:
+    elif hash1 not in hashfiles:
         recout.append("Unable to find matching sigdata for %s with hash %s" % (key, hash1))
-    elif not hash2 in hashfiles:
+    elif hash2 not in hashfiles:
         recout.append("Unable to find matching sigdata for %s with hash %s" % (key, hash2))
     else:
         out2 = bb.siggen.compare_sigfiles(hashfiles[hash1], hashfiles[hash2], recursecb, color=color)
@@ -110,36 +113,36 @@ parser.add_argument('-D', '--debug',
 
 if is_dump:
     parser.add_argument("-t", "--task",
-            help="find the signature data file for the last run of the specified task",
-            action="store", dest="taskargs", nargs=2, metavar=('recipename', 'taskname'))
+                        help="find the signature data file for the last run of the specified task",
+                        action="store", dest="taskargs", nargs=2, metavar=('recipename', 'taskname'))
 
     parser.add_argument("sigdatafile1",
-            help="Signature file to dump. Not used when using -t/--task.",
-            action="store", nargs='?', metavar="sigdatafile")
+                        help="Signature file to dump. Not used when using -t/--task.",
+                        action="store", nargs='?', metavar="sigdatafile")
 else:
     parser.add_argument('-c', '--color',
-            help='Colorize the output (where %(metavar)s is %(choices)s)',
-            choices=['auto', 'always', 'never'], default='auto', metavar='color')
+                        help='Colorize the output (where %(metavar)s is %(choices)s)',
+                        choices=['auto', 'always', 'never'], default='auto', metavar='color')
 
     parser.add_argument('-d', '--dump',
-            help='Dump the last signature data instead of comparing (equivalent to using bitbake-dumpsig)',
-            action='store_true')
+                        help='Dump the last signature data instead of comparing (equivalent to using bitbake-dumpsig)',
+                        action='store_true')
 
     parser.add_argument("-t", "--task",
-            help="find the signature data files for the last two runs of the specified task and compare them",
-            action="store", dest="taskargs", nargs=2, metavar=('recipename', 'taskname'))
+                        help="find the signature data files for the last two runs of the specified task and compare them",
+                        action="store", dest="taskargs", nargs=2, metavar=('recipename', 'taskname'))
 
     parser.add_argument("-s", "--signature",
-            help="With -t/--task, specify the signatures to look for instead of taking the last two",
-            action="store", dest="sigargs", nargs=2, metavar=('fromsig', 'tosig'))
+                        help="With -t/--task, specify the signatures to look for instead of taking the last two",
+                        action="store", dest="sigargs", nargs=2, metavar=('fromsig', 'tosig'))
 
     parser.add_argument("sigdatafile1",
-            help="First signature file to compare (or signature file to dump, if second not specified). Not used when using -t/--task.",
-            action="store", nargs='?')
+                        help="First signature file to compare (or signature file to dump, if second not specified). Not used when using -t/--task.",
+                        action="store", nargs='?')
 
     parser.add_argument("sigdatafile2",
-            help="Second signature file to compare",
-            action="store", nargs='?')
+                        help="Second signature file to compare",
+                        action="store", nargs='?')
 
 options = parser.parse_args()
 if is_dump:
@@ -157,7 +160,8 @@ if options.taskargs:
     with bb.tinfoil.Tinfoil() as tinfoil:
         tinfoil.prepare(config_only=True)
         if not options.dump and options.sigargs:
-            files = find_siginfo_task(tinfoil, options.taskargs[0], options.taskargs[1], options.sigargs[0], options.sigargs[1])
+            files = find_siginfo_task(tinfoil, options.taskargs[0], options.taskargs[1], options.sigargs[0],
+                                      options.sigargs[1])
         else:
             files = find_siginfo_task(tinfoil, options.taskargs[0], options.taskargs[1])
 
@@ -166,7 +170,8 @@ if options.taskargs:
             output = bb.siggen.dump_sigfile(files[-1])
         else:
             if len(files) < 2:
-                logger.error('Only one matching sigdata file found for the specified task (%s %s)' % (options.taskargs[0], options.taskargs[1]))
+                logger.error('Only one matching sigdata file found for the specified task (%s %s)' % (
+                    options.taskargs[0], options.taskargs[1]))
                 sys.exit(1)
 
             # Recurse into signature comparison
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [bitbake][kirkstone][2.0][PATCH 4/4] bitbake-diffsigs: break on first dependent task difference
  2023-02-14 15:28 [bitbake][kirkstone][2.0][PATCH 0/4] Patch review Steve Sakoman
                   ` (2 preceding siblings ...)
  2023-02-14 15:28 ` [bitbake][kirkstone][2.0][PATCH 3/4] bitbake-diffsigs: Make PEP8 compliant Steve Sakoman
@ 2023-02-14 15:28 ` Steve Sakoman
  3 siblings, 0 replies; 5+ messages in thread
From: Steve Sakoman @ 2023-02-14 15:28 UTC (permalink / raw)
  To: bitbake-devel

From: "Schmidt, Adriaan" <adriaan.schmidt@siemens.com>

compare_sigfiles() recursively calculates differences on all dependent
tasks with changed hashes. This is done in arbitrary/alphabetical order, and
only the last of those results is returned, while everything else is discarded.

This changes the behavior to instead return the first difference and not calculate
any more, which significantly speeds up diffs of tasks with many dependencies.

Signed-off-by: Adriaan Schmidt <adriaan.schmidt@siemens.com>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
(cherry picked from commit ea6a676c9aa2864c2eff40eea41ba09ce903a651)
Signed-off-by: Steve Sakoman <steve@sakoman.com>
---
 lib/bb/siggen.py | 1 +
 1 file changed, 1 insertion(+)

diff --git a/lib/bb/siggen.py b/lib/bb/siggen.py
index cea3a538..0a9ce0ed 100644
--- a/lib/bb/siggen.py
+++ b/lib/bb/siggen.py
@@ -1028,6 +1028,7 @@ def compare_sigfiles(a, b, recursecb=None, color=False, collapsed=False):
                             # If a dependent hash changed, might as well print the line above and then defer to the changes in
                             # that hash since in all likelyhood, they're the same changes this task also saw.
                             output = [output[-1]] + recout
+                            break
 
     a_taint = a_data.get('taint', None)
     b_taint = b_data.get('taint', None)
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2023-02-14 15:29 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-02-14 15:28 [bitbake][kirkstone][2.0][PATCH 0/4] Patch review Steve Sakoman
2023-02-14 15:28 ` [bitbake][kirkstone][2.0][PATCH 1/4] siggen: Fix inefficient string concatenation Steve Sakoman
2023-02-14 15:28 ` [bitbake][kirkstone][2.0][PATCH 2/4] utils/ply: Update md5 to better report errors with hashlib Steve Sakoman
2023-02-14 15:28 ` [bitbake][kirkstone][2.0][PATCH 3/4] bitbake-diffsigs: Make PEP8 compliant Steve Sakoman
2023-02-14 15:28 ` [bitbake][kirkstone][2.0][PATCH 4/4] bitbake-diffsigs: break on first dependent task difference Steve Sakoman

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).