All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 00/11] bitbake-diffsigs fixes/improvements
@ 2017-04-06 21:52 Paul Eggleton
  2017-04-06 21:52 ` [PATCH 01/11] bitbake-diffsigs: fix -t picking wrong files to compare Paul Eggleton
                   ` (12 more replies)
  0 siblings, 13 replies; 22+ messages in thread
From: Paul Eggleton @ 2017-04-06 21:52 UTC (permalink / raw)
  To: bitbake-devel

As part of an effort to add task signature recording and comparison to
buildhistory in OE, I went digging into bitbake-diffsigs and the code
that supports it and discovered that unfortunately it had a number of
bugs - in particular the -t option despite being around for a while and
having numerous band-aids applied in the past still wasn't really
working properly. I have to take a big chunk of the responsibility for
this as I wrote the thing in the first place. This patchset corrects
most of the issues that I found and also makes a number of improvements
to the readability of the output.

NOTE: there are a few corresponding changes required in OE-Core that I
am about to send out, these should be applied at the same time.


The following changes since commit 751c9dc51fd01fa64a1ff37ba2638110335f71af:

  fetch/local: Drop FILESDIR (2017-04-05 09:38:01 +0100)

are available in the git repository at:

  git://git.yoctoproject.org/poky-contrib paule/bb-sigstuff
  http://git.yoctoproject.org/cgit.cgi/poky-contrib/log/?h=paule/bb-sigstuff

Paul Eggleton (11):
  bitbake-diffsigs: fix -t picking wrong files to compare
  lib/bb/siggen: add missing path separator to cleaned paths
  bitbake-diffsigs: properly report which signature is missing
  bitbake-diffsigs: drop naive logic for removing duplicate files
  lib/bb/siggen: show a diff when dumping changes to multi-line values
  lib/bb/siggen: don't show unchanged runtaskdeps list
  bitbake-diffsigs: change to use argparse
  bitbake-diffsigs: add an option to find and compare specific signatures
  lib/bb/siggen: add collapsed mode to compare_sigfiles()
  lib/bb/siggen: show word-diff for single-line values containing spaces
  bitbake-diffsigs: colourise output

 LICENSE                    |   2 +
 bin/bitbake-diffsigs       | 175 +++++++++++++++++++++------------------
 lib/bb/siggen.py           | 166 +++++++++++++++++++++++++++++--------
 lib/simplediff/LICENSE     |  22 +++++
 lib/simplediff/__init__.py | 198 +++++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 453 insertions(+), 110 deletions(-)
 create mode 100644 lib/simplediff/LICENSE
 create mode 100644 lib/simplediff/__init__.py

-- 
2.9.3



^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 01/11] bitbake-diffsigs: fix -t picking wrong files to compare
  2017-04-06 21:52 [PATCH 00/11] bitbake-diffsigs fixes/improvements Paul Eggleton
@ 2017-04-06 21:52 ` Paul Eggleton
  2017-04-06 21:52 ` [PATCH 02/11] lib/bb/siggen: add missing path separator to cleaned paths Paul Eggleton
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 22+ messages in thread
From: Paul Eggleton @ 2017-04-06 21:52 UTC (permalink / raw)
  To: bitbake-devel

We weren't picking the right files to compare here - according to the
order in which the list is sorted (by mtime), we need to be taking the
last two items and not the first two.

Signed-off-by: Paul Eggleton <paul.eggleton@linux.intel.com>
---
 bin/bitbake-diffsigs | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/bin/bitbake-diffsigs b/bin/bitbake-diffsigs
index 1e3de09..b2ebe91 100755
--- a/bin/bitbake-diffsigs
+++ b/bin/bitbake-diffsigs
@@ -84,9 +84,9 @@ def find_compare_task(bbhandler, pn, taskname):
             return recout
 
         # Recurse into signature comparison
-        logger.debug("Signature file (previous): %s" % latestfiles[0])
-        logger.debug("Signature file (latest): %s" % latestfiles[1])
-        output = bb.siggen.compare_sigfiles(latestfiles[0], latestfiles[1], recursecb)
+        logger.debug("Signature file (previous): %s" % latestfiles[-2])
+        logger.debug("Signature file (latest): %s" % latestfiles[-1])
+        output = bb.siggen.compare_sigfiles(latestfiles[-2], latestfiles[-1], recursecb)
         if output:
             print('\n'.join(output))
     sys.exit(0)
-- 
2.9.3



^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 02/11] lib/bb/siggen: add missing path separator to cleaned paths
  2017-04-06 21:52 [PATCH 00/11] bitbake-diffsigs fixes/improvements Paul Eggleton
  2017-04-06 21:52 ` [PATCH 01/11] bitbake-diffsigs: fix -t picking wrong files to compare Paul Eggleton
@ 2017-04-06 21:52 ` Paul Eggleton
  2017-04-06 21:52 ` [PATCH 03/11] bitbake-diffsigs: properly report which signature is missing Paul Eggleton
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 22+ messages in thread
From: Paul Eggleton @ 2017-04-06 21:52 UTC (permalink / raw)
  To: bitbake-devel

Printing "pbzip2pbzip2_1.1.13.bb" is ugly, we need to add a separating
slash so that we get "pbzip2/pbzip2_1.1.13.bb" instead.

Signed-off-by: Paul Eggleton <paul.eggleton@linux.intel.com>
---
 lib/bb/siggen.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/bb/siggen.py b/lib/bb/siggen.py
index 3ceeef1..f47af6d 100644
--- a/lib/bb/siggen.py
+++ b/lib/bb/siggen.py
@@ -355,7 +355,7 @@ def clean_basepath(a):
     mc = None
     if a.startswith("multiconfig:"):
         _, mc, a = a.split(":", 2)
-    b = a.rsplit("/", 2)[1] + a.rsplit("/", 2)[2]
+    b = a.rsplit("/", 2)[1] + '/' + a.rsplit("/", 2)[2]
     if a.startswith("virtual:"):
         b = b + ":" + a.rsplit(":", 1)[0]
     if mc:
-- 
2.9.3



^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 03/11] bitbake-diffsigs: properly report which signature is missing
  2017-04-06 21:52 [PATCH 00/11] bitbake-diffsigs fixes/improvements Paul Eggleton
  2017-04-06 21:52 ` [PATCH 01/11] bitbake-diffsigs: fix -t picking wrong files to compare Paul Eggleton
  2017-04-06 21:52 ` [PATCH 02/11] lib/bb/siggen: add missing path separator to cleaned paths Paul Eggleton
@ 2017-04-06 21:52 ` Paul Eggleton
  2017-04-07 16:25   ` Patrick Ohly
  2017-04-06 21:52 ` [PATCH 04/11] bitbake-diffsigs: drop naive logic for removing duplicate files Paul Eggleton
                   ` (9 subsequent siblings)
  12 siblings, 1 reply; 22+ messages in thread
From: Paul Eggleton @ 2017-04-06 21:52 UTC (permalink / raw)
  To: bitbake-devel

If just one of the two signatures we want to compare aren't available,
report that one rather than misleadingly claiming both are missing.

Signed-off-by: Paul Eggleton <paul.eggleton@linux.intel.com>
---
 bin/bitbake-diffsigs | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/bin/bitbake-diffsigs b/bin/bitbake-diffsigs
index b2ebe91..f84188d 100755
--- a/bin/bitbake-diffsigs
+++ b/bin/bitbake-diffsigs
@@ -75,11 +75,15 @@ def find_compare_task(bbhandler, pn, taskname):
             hashfiles = bb.siggen.find_siginfo(key, None, hashes, bbhandler.config_data)
 
             recout = []
-            if len(hashfiles) == 2:
+            if len(hashfiles) == 0:
+                recout.append("Unable to find matching sigdata for %s with hashes %s or %s" % (key, hash1, hash2))
+            elif not hash1 in hashfiles:
+                recout.append("Unable to find matching sigdata for %s with hash %s" % (key, hash1))
+            elif not hash2 in hashfiles:
+                recout.append("Unable to find matching sigdata for %s with hash %s" % (key, hash2))
+            else:
                 out2 = bb.siggen.compare_sigfiles(hashfiles[hash1], hashfiles[hash2], recursecb)
                 recout.extend(list('  ' + l for l in out2))
-            else:
-                recout.append("Unable to find matching sigdata for %s with hashes %s or %s" % (key, hash1, hash2))
 
             return recout
 
-- 
2.9.3



^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 04/11] bitbake-diffsigs: drop naive logic for removing duplicate files
  2017-04-06 21:52 [PATCH 00/11] bitbake-diffsigs fixes/improvements Paul Eggleton
                   ` (2 preceding siblings ...)
  2017-04-06 21:52 ` [PATCH 03/11] bitbake-diffsigs: properly report which signature is missing Paul Eggleton
@ 2017-04-06 21:52 ` Paul Eggleton
  2017-04-06 21:52 ` [PATCH 05/11] lib/bb/siggen: show a diff when dumping changes to multi-line values Paul Eggleton
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 22+ messages in thread
From: Paul Eggleton @ 2017-04-06 21:52 UTC (permalink / raw)
  To: bitbake-devel

This logic doesn't work in practice, certainly not with current versions
where sigdata files are preserved in the stamps directory and therefore
there will often be multiple sigdata files - you can now easily get
files for the same signature from sstate and the stamps directory with the
result that bitbake-diffsigs reports nothing has changed. Instead, let's
change the find_siginfo function in OE-Core to simply not return
duplicates so we don't have to filter them out here.

Signed-off-by: Paul Eggleton <paul.eggleton@linux.intel.com>
---
 bin/bitbake-diffsigs | 16 ----------------
 1 file changed, 16 deletions(-)

diff --git a/bin/bitbake-diffsigs b/bin/bitbake-diffsigs
index f84188d..5400e5b 100755
--- a/bin/bitbake-diffsigs
+++ b/bin/bitbake-diffsigs
@@ -37,12 +37,6 @@ logger = bb.msg.logger_create('bitbake-diffsigs')
 def find_compare_task(bbhandler, pn, taskname):
     """ Find the most recent signature files for the specified PN/task and compare them """
 
-    def get_hashval(siginfo):
-        if siginfo.endswith('.siginfo'):
-            return siginfo.rpartition(':')[2].partition('_')[0]
-        else:
-            return siginfo.rpartition('.')[2]
-
     if not hasattr(bb.siggen, 'find_siginfo'):
         logger.error('Metadata does not support finding signature data files')
         sys.exit(1)
@@ -59,16 +53,6 @@ def find_compare_task(bbhandler, pn, taskname):
         logger.error('Only one matching sigdata file found for the specified task (%s %s)' % (pn, taskname))
         sys.exit(1)
     else:
-        # It's possible that latestfiles contain 3 elements and the first two have the same hash value.
-        # In this case, we delete the second element.
-        # The above case is actually the most common one. Because we may have sigdata file and siginfo
-        # file having the same hash value. Comparing such two files makes no sense.
-        if len(latestfiles) == 3:
-            hash0 = get_hashval(latestfiles[0])
-            hash1 = get_hashval(latestfiles[1])
-            if hash0 == hash1:
-                latestfiles.pop(1)
-
         # Define recursion callback
         def recursecb(key, hash1, hash2):
             hashes = [hash1, hash2]
-- 
2.9.3



^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 05/11] lib/bb/siggen: show a diff when dumping changes to multi-line values
  2017-04-06 21:52 [PATCH 00/11] bitbake-diffsigs fixes/improvements Paul Eggleton
                   ` (3 preceding siblings ...)
  2017-04-06 21:52 ` [PATCH 04/11] bitbake-diffsigs: drop naive logic for removing duplicate files Paul Eggleton
@ 2017-04-06 21:52 ` Paul Eggleton
  2017-04-06 21:52 ` [PATCH 06/11] lib/bb/siggen: don't show unchanged runtaskdeps list Paul Eggleton
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 22+ messages in thread
From: Paul Eggleton @ 2017-04-06 21:52 UTC (permalink / raw)
  To: bitbake-devel

When dumping changes to signatures e.g. output of bitbake -s printdiff,
if for example a function has changed, it's much more readable to see a
unified diff of the changes rather than just printing the old function
followed by the new function, so use difflib to do that.

Note: I elected to keep to one item in the returned list per change,
rather than one line per line of output, so that the caller can still
look at changes individually if needed. Thus I've added some handling to
bitbake-diffsigs to split the change into lines so that each line is
displayed indented.

Signed-off-by: Paul Eggleton <paul.eggleton@linux.intel.com>
---
 bin/bitbake-diffsigs |  4 +++-
 lib/bb/siggen.py     | 12 +++++++++++-
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/bin/bitbake-diffsigs b/bin/bitbake-diffsigs
index 5400e5b..4ca085f 100755
--- a/bin/bitbake-diffsigs
+++ b/bin/bitbake-diffsigs
@@ -67,7 +67,9 @@ def find_compare_task(bbhandler, pn, taskname):
                 recout.append("Unable to find matching sigdata for %s with hash %s" % (key, hash2))
             else:
                 out2 = bb.siggen.compare_sigfiles(hashfiles[hash1], hashfiles[hash2], recursecb)
-                recout.extend(list('  ' + l for l in out2))
+                for change in out2:
+                    for line in change.splitlines():
+                        recout.append('  ' + line)
 
             return recout
 
diff --git a/lib/bb/siggen.py b/lib/bb/siggen.py
index f47af6d..f497fb9 100644
--- a/lib/bb/siggen.py
+++ b/lib/bb/siggen.py
@@ -5,6 +5,7 @@ import re
 import tempfile
 import pickle
 import bb.data
+import difflib
 from bb.checksum import FileChecksumCache
 
 logger = logging.getLogger('BitBake.SigGen')
@@ -462,7 +463,16 @@ def compare_sigfiles(a, b, recursecb = None):
     changed, added, removed = dict_diff(a_data['varvals'], b_data['varvals'])
     if changed:
         for dep in changed:
-            output.append("Variable %s value changed from '%s' to '%s'" % (dep, a_data['varvals'][dep], b_data['varvals'][dep]))
+            oldval = a_data['varvals'][dep]
+            newval = b_data['varvals'][dep]
+            if newval and oldval and ('\n' in oldval or '\n' in newval):
+                diff = difflib.unified_diff(oldval.splitlines(), newval.splitlines(), lineterm='')
+                # Cut off the first two lines, since we aren't interested in
+                # the old/new filename (they are blank anyway in this case)
+                difflines = list(diff)[2:]
+                output.append("Variable %s value changed:\n%s" % (dep, '\n'.join(difflines)))
+            else:
+                output.append("Variable %s value changed from '%s' to '%s'" % (dep, oldval, newval))
 
     if not 'file_checksum_values' in a_data:
          a_data['file_checksum_values'] = {}
-- 
2.9.3



^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 06/11] lib/bb/siggen: don't show unchanged runtaskdeps list
  2017-04-06 21:52 [PATCH 00/11] bitbake-diffsigs fixes/improvements Paul Eggleton
                   ` (4 preceding siblings ...)
  2017-04-06 21:52 ` [PATCH 05/11] lib/bb/siggen: show a diff when dumping changes to multi-line values Paul Eggleton
@ 2017-04-06 21:52 ` Paul Eggleton
  2017-04-06 21:52 ` [PATCH 07/11] bitbake-diffsigs: change to use argparse Paul Eggleton
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 22+ messages in thread
From: Paul Eggleton @ 2017-04-06 21:52 UTC (permalink / raw)
  To: bitbake-devel

If the runtaskdeps list hasn't actually changed (but the signatures of
some of the tasks did) then it doesn't make sense to print out the old
and new lists as they are both the same and may be very long, e.g. for
do_rootfs in OE.

Signed-off-by: Paul Eggleton <paul.eggleton@linux.intel.com>
---
 lib/bb/siggen.py | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/lib/bb/siggen.py b/lib/bb/siggen.py
index f497fb9..c6b14c2 100644
--- a/lib/bb/siggen.py
+++ b/lib/bb/siggen.py
@@ -506,10 +506,14 @@ def compare_sigfiles(a, b, recursecb = None):
                 changed.append("%s with hash %s\n changed to\n%s with hash %s" % (a, a_data['runtaskhashes'][a], b, b_data['runtaskhashes'][b]))
 
     if changed:
-        output.append("runtaskdeps changed from %s to %s" % (clean_basepaths_list(a_data['runtaskdeps']), clean_basepaths_list(b_data['runtaskdeps'])))
+        clean_a = clean_basepaths_list(a_data['runtaskdeps'])
+        clean_b = clean_basepaths_list(b_data['runtaskdeps'])
+        if clean_a != clean_b:
+            output.append("runtaskdeps changed from %s to %s" % (clean_a, clean_b))
+        else:
+            output.append("runtaskdeps changed:")
         output.append("\n".join(changed))
 
-
     if 'runtaskhashes' in a_data and 'runtaskhashes' in b_data:
         a = a_data['runtaskhashes']
         b = b_data['runtaskhashes']
-- 
2.9.3



^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 07/11] bitbake-diffsigs: change to use argparse
  2017-04-06 21:52 [PATCH 00/11] bitbake-diffsigs fixes/improvements Paul Eggleton
                   ` (5 preceding siblings ...)
  2017-04-06 21:52 ` [PATCH 06/11] lib/bb/siggen: don't show unchanged runtaskdeps list Paul Eggleton
@ 2017-04-06 21:52 ` Paul Eggleton
  2017-04-06 21:52 ` [PATCH 08/11] bitbake-diffsigs: add an option to find and compare specific signatures Paul Eggleton
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 22+ messages in thread
From: Paul Eggleton @ 2017-04-06 21:52 UTC (permalink / raw)
  To: bitbake-devel

Argparse is a bit easier to deal with than optparse, and since we're
about to add some options, migrate this script over.

Signed-off-by: Paul Eggleton <paul.eggleton@linux.intel.com>
---
 bin/bitbake-diffsigs | 64 +++++++++++++++++++++++++++-------------------------
 1 file changed, 33 insertions(+), 31 deletions(-)

diff --git a/bin/bitbake-diffsigs b/bin/bitbake-diffsigs
index 4ca085f..e3f848d 100755
--- a/bin/bitbake-diffsigs
+++ b/bin/bitbake-diffsigs
@@ -3,7 +3,7 @@
 # bitbake-diffsigs
 # BitBake task signature data comparison utility
 #
-# Copyright (C) 2012-2013 Intel Corporation
+# Copyright (C) 2012-2013, 2017 Intel Corporation
 #
 # This program is free software; you can redistribute it and/or modify
 # it under the terms of the GNU General Public License version 2 as
@@ -22,7 +22,7 @@ import os
 import sys
 import warnings
 import fnmatch
-import optparse
+import argparse
 import logging
 import pickle
 
@@ -83,22 +83,27 @@ def find_compare_task(bbhandler, pn, taskname):
 
 
 
-parser = optparse.OptionParser(
-    description = "Compares siginfo/sigdata files written out by BitBake",
-    usage = """
-  %prog -t recipename taskname
-  %prog sigdatafile1 sigdatafile2
-  %prog sigdatafile1""")
+parser = argparse.ArgumentParser(
+    description="Compares siginfo/sigdata files written out by BitBake")
 
-parser.add_option("-D", "--debug",
-        help = "enable debug",
-        action = "store_true", dest="debug", default = False)
+parser.add_argument('-d', '--debug',
+                    help='Enable debug output',
+                    action='store_true')
 
-parser.add_option("-t", "--task",
-        help = "find the signature data files for last two runs of the specified task and compare them",
-        action="store", dest="taskargs", nargs=2, metavar='recipename taskname')
+parser.add_argument("-t", "--task",
+        help="find the signature data files for last two runs of the specified task and compare them",
+        action="store", dest="taskargs", nargs=2, metavar=('recipename', 'taskname'))
 
-options, args = parser.parse_args(sys.argv)
+parser.add_argument("sigdatafile1",
+        help="First signature file to compare (or signature file to dump, if second not specified). Not used when using -t/--task.",
+        action="store", nargs='?')
+
+parser.add_argument("sigdatafile2",
+        help="Second signature file to compare",
+        action="store", nargs='?')
+
+
+options = parser.parse_args()
 
 if options.debug:
     logger.setLevel(logging.DEBUG)
@@ -108,20 +113,17 @@ if options.taskargs:
         tinfoil.prepare(config_only=True)
         find_compare_task(tinfoil, options.taskargs[0], options.taskargs[1])
 else:
-    if len(args) == 1:
-        parser.print_help()
-    else:
-        try:
-            if len(args) == 2:
-                output = bb.siggen.dump_sigfile(sys.argv[1])
-            else:
-                output = bb.siggen.compare_sigfiles(sys.argv[1], sys.argv[2])
-        except IOError as e:
-            logger.error(str(e))
-            sys.exit(1)
-        except (pickle.UnpicklingError, EOFError):
-            logger.error('Invalid signature data - ensure you are specifying sigdata/siginfo files')
-            sys.exit(1)
+    try:
+        if options.sigdatafile1 and options.sigdatafile2:
+            output = bb.siggen.compare_sigfiles(options.sigdatafile1, options.sigdatafile2)
+        elif options.sigdatafile1:
+            output = bb.siggen.dump_sigfile(options.sigdatafile1)
+    except IOError as e:
+        logger.error(str(e))
+        sys.exit(1)
+    except (pickle.UnpicklingError, EOFError):
+        logger.error('Invalid signature data - ensure you are specifying sigdata/siginfo files')
+        sys.exit(1)
 
-        if output:
-            print('\n'.join(output))
+    if output:
+        print('\n'.join(output))
-- 
2.9.3



^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 08/11] bitbake-diffsigs: add an option to find and compare specific signatures
  2017-04-06 21:52 [PATCH 00/11] bitbake-diffsigs fixes/improvements Paul Eggleton
                   ` (6 preceding siblings ...)
  2017-04-06 21:52 ` [PATCH 07/11] bitbake-diffsigs: change to use argparse Paul Eggleton
@ 2017-04-06 21:52 ` Paul Eggleton
  2017-04-06 21:52 ` [PATCH 09/11] lib/bb/siggen: add collapsed mode to compare_sigfiles() Paul Eggleton
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 22+ messages in thread
From: Paul Eggleton @ 2017-04-06 21:52 UTC (permalink / raw)
  To: bitbake-devel

With the -t option which recurses to find the ultimate cause of a
signature change, it was hardcoded to take the last two executions of
the specified task. On the other hand, if you have two specific task
hashes (say from bitbake output, or some other tool) then you'll want to
pick those, so provide an option to specify those as well. (Note, the
new -s option needs to be specified alongside -t rather than instead of
it.)

Signed-off-by: Paul Eggleton <paul.eggleton@linux.intel.com>
---
 bin/bitbake-diffsigs | 95 ++++++++++++++++++++++++++++++++--------------------
 1 file changed, 59 insertions(+), 36 deletions(-)

diff --git a/bin/bitbake-diffsigs b/bin/bitbake-diffsigs
index e3f848d..e9fdb48 100755
--- a/bin/bitbake-diffsigs
+++ b/bin/bitbake-diffsigs
@@ -34,7 +34,7 @@ import bb.msg
 
 logger = bb.msg.logger_create('bitbake-diffsigs')
 
-def find_compare_task(bbhandler, pn, taskname):
+def find_compare_task(bbhandler, pn, taskname, sig1=None, sig2=None):
     """ Find the most recent signature files for the specified PN/task and compare them """
 
     if not hasattr(bb.siggen, 'find_siginfo'):
@@ -44,41 +44,54 @@ def find_compare_task(bbhandler, pn, taskname):
     if not taskname.startswith('do_'):
         taskname = 'do_%s' % taskname
 
-    filedates = bb.siggen.find_siginfo(pn, taskname, None, bbhandler.config_data)
-    latestfiles = sorted(filedates.keys(), key=lambda f: filedates[f])[-3:]
-    if not latestfiles:
-        logger.error('No sigdata files found matching %s %s' % (pn, taskname))
-        sys.exit(1)
-    elif len(latestfiles) < 2:
-        logger.error('Only one matching sigdata file found for the specified task (%s %s)' % (pn, taskname))
-        sys.exit(1)
+    if sig1 and sig2:
+        sigfiles = bb.siggen.find_siginfo(pn, taskname, [sig1, sig2], bbhandler.config_data)
+        if len(sigfiles) == 0:
+            logger.error('No sigdata files found matching %s %s matching either %s or %s' % (pn, taskname, sig1, sig2))
+            sys.exit(1)
+        elif not sig1 in sigfiles:
+            logger.error('No sigdata files found matching %s %s with signature %s' % (pn, taskname, sig1))
+            sys.exit(1)
+        elif not sig2 in sigfiles:
+            logger.error('No sigdata files found matching %s %s with signature %s' % (pn, taskname, sig2))
+            sys.exit(1)
+        latestfiles = [sigfiles[sig1], sigfiles[sig2]]
     else:
-        # Define recursion callback
-        def recursecb(key, hash1, hash2):
-            hashes = [hash1, hash2]
-            hashfiles = bb.siggen.find_siginfo(key, None, hashes, bbhandler.config_data)
-
-            recout = []
-            if len(hashfiles) == 0:
-                recout.append("Unable to find matching sigdata for %s with hashes %s or %s" % (key, hash1, hash2))
-            elif not hash1 in hashfiles:
-                recout.append("Unable to find matching sigdata for %s with hash %s" % (key, hash1))
-            elif not hash2 in hashfiles:
-                recout.append("Unable to find matching sigdata for %s with hash %s" % (key, hash2))
-            else:
-                out2 = bb.siggen.compare_sigfiles(hashfiles[hash1], hashfiles[hash2], recursecb)
-                for change in out2:
-                    for line in change.splitlines():
-                        recout.append('  ' + line)
-
-            return recout
-
-        # Recurse into signature comparison
-        logger.debug("Signature file (previous): %s" % latestfiles[-2])
-        logger.debug("Signature file (latest): %s" % latestfiles[-1])
-        output = bb.siggen.compare_sigfiles(latestfiles[-2], latestfiles[-1], recursecb)
-        if output:
-            print('\n'.join(output))
+        filedates = bb.siggen.find_siginfo(pn, taskname, None, bbhandler.config_data)
+        latestfiles = sorted(filedates.keys(), key=lambda f: filedates[f])[-3:]
+        if not latestfiles:
+            logger.error('No sigdata files found matching %s %s' % (pn, taskname))
+            sys.exit(1)
+        elif len(latestfiles) < 2:
+            logger.error('Only one matching sigdata file found for the specified task (%s %s)' % (pn, taskname))
+            sys.exit(1)
+
+    # Define recursion callback
+    def recursecb(key, hash1, hash2):
+        hashes = [hash1, hash2]
+        hashfiles = bb.siggen.find_siginfo(key, None, hashes, bbhandler.config_data)
+
+        recout = []
+        if len(hashfiles) == 0:
+            recout.append("Unable to find matching sigdata for %s with hashes %s or %s" % (key, hash1, hash2))
+        elif not hash1 in hashfiles:
+            recout.append("Unable to find matching sigdata for %s with hash %s" % (key, hash1))
+        elif not hash2 in hashfiles:
+            recout.append("Unable to find matching sigdata for %s with hash %s" % (key, hash2))
+        else:
+            out2 = bb.siggen.compare_sigfiles(hashfiles[hash1], hashfiles[hash2], recursecb)
+            for change in out2:
+                for line in change.splitlines():
+                    recout.append('  ' + line)
+
+        return recout
+
+    # Recurse into signature comparison
+    logger.debug("Signature file (previous): %s" % latestfiles[-2])
+    logger.debug("Signature file (latest): %s" % latestfiles[-1])
+    output = bb.siggen.compare_sigfiles(latestfiles[-2], latestfiles[-1], recursecb)
+    if output:
+        print('\n'.join(output))
     sys.exit(0)
 
 
@@ -94,6 +107,10 @@ parser.add_argument("-t", "--task",
         help="find the signature data files for last two runs of the specified task and compare them",
         action="store", dest="taskargs", nargs=2, metavar=('recipename', 'taskname'))
 
+parser.add_argument("-s", "--signature",
+        help="With -t/--task, specify the signatures to look for instead of taking the last two",
+        action="store", dest="sigargs", nargs=2, metavar=('fromsig', 'tosig'))
+
 parser.add_argument("sigdatafile1",
         help="First signature file to compare (or signature file to dump, if second not specified). Not used when using -t/--task.",
         action="store", nargs='?')
@@ -111,8 +128,14 @@ if options.debug:
 if options.taskargs:
     with bb.tinfoil.Tinfoil() as tinfoil:
         tinfoil.prepare(config_only=True)
-        find_compare_task(tinfoil, options.taskargs[0], options.taskargs[1])
+        if options.sigargs:
+            find_compare_task(tinfoil, options.taskargs[0], options.taskargs[1], options.sigargs[0], options.sigargs[1])
+        else:
+            find_compare_task(tinfoil, options.taskargs[0], options.taskargs[1])
 else:
+    if options.sigargs:
+        logger.error('-s/--signature can only be used together with -t/--task')
+        sys.exit(1)
     try:
         if options.sigdatafile1 and options.sigdatafile2:
             output = bb.siggen.compare_sigfiles(options.sigdatafile1, options.sigdatafile2)
-- 
2.9.3



^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 09/11] lib/bb/siggen: add collapsed mode to compare_sigfiles()
  2017-04-06 21:52 [PATCH 00/11] bitbake-diffsigs fixes/improvements Paul Eggleton
                   ` (7 preceding siblings ...)
  2017-04-06 21:52 ` [PATCH 08/11] bitbake-diffsigs: add an option to find and compare specific signatures Paul Eggleton
@ 2017-04-06 21:52 ` Paul Eggleton
  2017-04-07  6:37   ` Patrick Ohly
  2017-04-06 21:52 ` [PATCH 10/11] lib/bb/siggen: show word-diff for single-line values containing spaces Paul Eggleton
                   ` (3 subsequent siblings)
  12 siblings, 1 reply; 22+ messages in thread
From: Paul Eggleton @ 2017-04-06 21:52 UTC (permalink / raw)
  To: bitbake-devel

If we just want to drill down to the actual differences then we don't
need to see certain things in the output, e.g. basehash changing or the
signature of dependent tasks. This will be used for comparing signatures
within buildhistory-diff in OE-Core; the default mode as used by
bitbake-diffsigs and bitbake -S printdiff remains unchanged for the
moment.

Signed-off-by: Paul Eggleton <paul.eggleton@linux.intel.com>
---
 lib/bb/siggen.py | 51 ++++++++++++++++++++++++++++-----------------------
 1 file changed, 28 insertions(+), 23 deletions(-)

diff --git a/lib/bb/siggen.py b/lib/bb/siggen.py
index c6b14c2..3c5d862 100644
--- a/lib/bb/siggen.py
+++ b/lib/bb/siggen.py
@@ -375,7 +375,7 @@ def clean_basepaths_list(a):
         b.append(clean_basepath(x))
     return b
 
-def compare_sigfiles(a, b, recursecb = None):
+def compare_sigfiles(a, b, recursecb=None, collapsed=False):
     output = []
 
     with open(a, 'rb') as f:
@@ -443,7 +443,7 @@ def compare_sigfiles(a, b, recursecb = None):
     if a_data['taskdeps'] != b_data['taskdeps']:
         output.append("Task dependencies changed from:\n%s\nto:\n%s" % (sorted(a_data['taskdeps']), sorted(b_data['taskdeps'])))
 
-    if a_data['basehash'] != b_data['basehash']:
+    if a_data['basehash'] != b_data['basehash'] and not collapsed:
         output.append("basehash changed from %s to %s" % (a_data['basehash'], b_data['basehash']))
 
     changed, added, removed = dict_diff(a_data['gendeps'], b_data['gendeps'], a_data['basewhitelist'] & b_data['basewhitelist'])
@@ -495,24 +495,25 @@ def compare_sigfiles(a, b, recursecb = None):
     if not 'runtaskdeps' in b_data:
          b_data['runtaskdeps'] = {}
 
-    if len(a_data['runtaskdeps']) != len(b_data['runtaskdeps']):
-        changed = ["Number of task dependencies changed"]
-    else:
-        changed = []
-        for idx, task in enumerate(a_data['runtaskdeps']):
-            a = a_data['runtaskdeps'][idx]
-            b = b_data['runtaskdeps'][idx]
-            if a_data['runtaskhashes'][a] != b_data['runtaskhashes'][b]:
-                changed.append("%s with hash %s\n changed to\n%s with hash %s" % (a, a_data['runtaskhashes'][a], b, b_data['runtaskhashes'][b]))
-
-    if changed:
-        clean_a = clean_basepaths_list(a_data['runtaskdeps'])
-        clean_b = clean_basepaths_list(b_data['runtaskdeps'])
-        if clean_a != clean_b:
-            output.append("runtaskdeps changed from %s to %s" % (clean_a, clean_b))
+    if not collapsed:
+        if len(a_data['runtaskdeps']) != len(b_data['runtaskdeps']):
+            changed = ["Number of task dependencies changed"]
         else:
-            output.append("runtaskdeps changed:")
-        output.append("\n".join(changed))
+            changed = []
+            for idx, task in enumerate(a_data['runtaskdeps']):
+                a = a_data['runtaskdeps'][idx]
+                b = b_data['runtaskdeps'][idx]
+                if a_data['runtaskhashes'][a] != b_data['runtaskhashes'][b]:
+                    changed.append("%s with hash %s\n changed to\n%s with hash %s" % (a, a_data['runtaskhashes'][a], b, b_data['runtaskhashes'][b]))
+
+        if changed:
+            clean_a = clean_basepaths_list(a_data['runtaskdeps'])
+            clean_b = clean_basepaths_list(b_data['runtaskdeps'])
+            if clean_a != clean_b:
+                output.append("runtaskdeps changed from %s to %s" % (clean_a, clean_b))
+            else:
+                output.append("runtaskdeps changed:")
+            output.append("\n".join(changed))
 
     if 'runtaskhashes' in a_data and 'runtaskhashes' in b_data:
         a = a_data['runtaskhashes']
@@ -540,13 +541,17 @@ def compare_sigfiles(a, b, recursecb = None):
                     output.append("Dependency on task %s was removed with hash %s" % (clean_basepath(dep), a[dep]))
         if changed:
             for dep in changed:
-                output.append("Hash for dependent task %s changed from %s to %s" % (clean_basepath(dep), a[dep], b[dep]))
+                if not collapsed:
+                    output.append("Hash for dependent task %s changed from %s to %s" % (clean_basepath(dep), a[dep], b[dep]))
                 if callable(recursecb):
-                    # If a dependent hash changed, might as well print the line above and then defer to the changes in 
-                    # that hash since in all likelyhood, they're the same changes this task also saw.
                     recout = recursecb(dep, a[dep], b[dep])
                     if recout:
-                        output = [output[-1]] + recout
+                        if collapsed:
+                            output.extend(recout)
+                        else:
+                            # If a dependent hash changed, might as well print the line above and then defer to the changes in 
+                            # that hash since in all likelyhood, they're the same changes this task also saw.
+                            output = [output[-1]] + recout
 
     a_taint = a_data.get('taint', None)
     b_taint = b_data.get('taint', None)
-- 
2.9.3



^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 10/11] lib/bb/siggen: show word-diff for single-line values containing spaces
  2017-04-06 21:52 [PATCH 00/11] bitbake-diffsigs fixes/improvements Paul Eggleton
                   ` (8 preceding siblings ...)
  2017-04-06 21:52 ` [PATCH 09/11] lib/bb/siggen: add collapsed mode to compare_sigfiles() Paul Eggleton
@ 2017-04-06 21:52 ` Paul Eggleton
  2017-04-06 21:52 ` [PATCH 11/11] bitbake-diffsigs: colourise output Paul Eggleton
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 22+ messages in thread
From: Paul Eggleton @ 2017-04-06 21:52 UTC (permalink / raw)
  To: bitbake-devel

If a variable value has changed and either the new or old value contains
spaces, a word diff should be appropriate and may be a bit more readable.
Import the "simplediff" module and use it to show a word diff (in the
style of GNU wdiff and git diff --word-diff).

Also use a similar style diff to show changes in the runtaskhashes list.
I didn't use an actual word-diff here since it's a little different - we
can be sure that the list is a list and not simply a free-format string.

Signed-off-by: Paul Eggleton <paul.eggleton@linux.intel.com>
---
 LICENSE                    |   2 +
 lib/bb/siggen.py           |  38 ++++++++-
 lib/simplediff/LICENSE     |  22 +++++
 lib/simplediff/__init__.py | 198 +++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 259 insertions(+), 1 deletion(-)
 create mode 100644 lib/simplediff/LICENSE
 create mode 100644 lib/simplediff/__init__.py

diff --git a/LICENSE b/LICENSE
index 5d4a4c2..7d4e5f4 100644
--- a/LICENSE
+++ b/LICENSE
@@ -15,3 +15,5 @@ Foundation and individual contributors.
 * QUnit is redistributed under the MIT license.
 
 * Font Awesome fonts redistributed under the SIL Open Font License 1.1
+
+* simplediff is distributed under the zlib license.
diff --git a/lib/bb/siggen.py b/lib/bb/siggen.py
index 3c5d862..d40c721 100644
--- a/lib/bb/siggen.py
+++ b/lib/bb/siggen.py
@@ -6,6 +6,7 @@ import tempfile
 import pickle
 import bb.data
 import difflib
+import simplediff
 from bb.checksum import FileChecksumCache
 
 logger = logging.getLogger('BitBake.SigGen')
@@ -352,6 +353,39 @@ def dump_this_task(outfile, d):
     referencestamp = bb.build.stamp_internal(task, d, None, True)
     bb.parse.siggen.dump_sigtask(fn, task, outfile, "customfile:" + referencestamp)
 
+def worddiff_str(oldstr, newstr):
+    diff = simplediff.diff(oldstr.split(' '), newstr.split(' '))
+    ret = []
+    for change, value in diff:
+        value = ' '.join(value)
+        if change == '=':
+            ret.append(value)
+        elif change == '+':
+            item = '{+%s+}' % value
+            ret.append(item)
+        elif change == '-':
+            item = '[-%s-]' % value
+            ret.append(item)
+    whitespace_note = ''
+    if oldstr != newstr and ' '.join(oldstr.split()) == ' '.join(newstr.split()):
+        whitespace_note = ' (whitespace changed)'
+    return '"%s"%s' % (' '.join(ret), whitespace_note)
+
+def list_inline_diff(oldlist, newlist):
+    diff = simplediff.diff(oldlist, newlist)
+    ret = []
+    for change, value in diff:
+        value = ' '.join(value)
+        if change == '=':
+            ret.append("'%s'" % value)
+        elif change == '+':
+            item = "+'%s'" % value
+            ret.append(item)
+        elif change == '-':
+            item = "-'%s'" % value
+            ret.append(item)
+    return '[%s]' % (', '.join(ret))
+
 def clean_basepath(a):
     mc = None
     if a.startswith("multiconfig:"):
@@ -471,6 +505,8 @@ def compare_sigfiles(a, b, recursecb=None, collapsed=False):
                 # the old/new filename (they are blank anyway in this case)
                 difflines = list(diff)[2:]
                 output.append("Variable %s value changed:\n%s" % (dep, '\n'.join(difflines)))
+            elif newval and oldval and (' ' in oldval or ' ' in newval):
+                output.append("Variable %s value changed:\n%s" % (dep, worddiff_str(oldval, newval)))
             else:
                 output.append("Variable %s value changed from '%s' to '%s'" % (dep, oldval, newval))
 
@@ -510,7 +546,7 @@ def compare_sigfiles(a, b, recursecb=None, collapsed=False):
             clean_a = clean_basepaths_list(a_data['runtaskdeps'])
             clean_b = clean_basepaths_list(b_data['runtaskdeps'])
             if clean_a != clean_b:
-                output.append("runtaskdeps changed from %s to %s" % (clean_a, clean_b))
+                output.append("runtaskdeps changed:\n%s" % list_inline_diff(clean_a, clean_b))
             else:
                 output.append("runtaskdeps changed:")
             output.append("\n".join(changed))
diff --git a/lib/simplediff/LICENSE b/lib/simplediff/LICENSE
new file mode 100644
index 0000000..8242dde
--- /dev/null
+++ b/lib/simplediff/LICENSE
@@ -0,0 +1,22 @@
+Copyright (c) 2008 - 2013 Paul Butler and contributors
+
+This sofware may be used under a zlib/libpng-style license:
+
+This software is provided 'as-is', without any express or implied warranty. In
+no event will the authors be held liable for any damages arising from the use
+of this software.
+
+Permission is granted to anyone to use this software for any purpose, including
+commercial applications, and to alter it and redistribute it freely, subject to
+the following restrictions:
+
+1. The origin of this software must not be misrepresented; you must not claim
+that you wrote the original software. If you use this software in a product, an
+acknowledgment in the product documentation would be appreciated but is not
+required.
+
+2. Altered source versions must be plainly marked as such, and must not be
+misrepresented as being the original software.
+
+3. This notice may not be removed or altered from any source distribution.
+
diff --git a/lib/simplediff/__init__.py b/lib/simplediff/__init__.py
new file mode 100644
index 0000000..57ee3c5
--- /dev/null
+++ b/lib/simplediff/__init__.py
@@ -0,0 +1,198 @@
+'''
+Simple Diff for Python version 1.0
+
+Annotate two versions of a list with the values that have been
+changed between the versions, similar to unix's `diff` but with
+a dead-simple Python interface.
+
+(C) Paul Butler 2008-2012 <http://www.paulbutler.org/>
+May be used and distributed under the zlib/libpng license
+<http://www.opensource.org/licenses/zlib-license.php>
+'''
+
+__all__ = ['diff', 'string_diff', 'html_diff']
+__version__ = '1.0'
+
+
+def diff(old, new):
+    '''
+    Find the differences between two lists. Returns a list of pairs, where the
+    first value is in ['+','-','='] and represents an insertion, deletion, or
+    no change for that list. The second value of the pair is the list
+    of elements.
+
+    Params:
+        old     the old list of immutable, comparable values (ie. a list
+                of strings)
+        new     the new list of immutable, comparable values
+   
+    Returns:
+        A list of pairs, with the first part of the pair being one of three
+        strings ('-', '+', '=') and the second part being a list of values from
+        the original old and/or new lists. The first part of the pair
+        corresponds to whether the list of values is a deletion, insertion, or
+        unchanged, respectively.
+
+    Examples:
+        >>> diff([1,2,3,4],[1,3,4])
+        [('=', [1]), ('-', [2]), ('=', [3, 4])]
+
+        >>> diff([1,2,3,4],[2,3,4,1])
+        [('-', [1]), ('=', [2, 3, 4]), ('+', [1])]
+
+        >>> diff('The quick brown fox jumps over the lazy dog'.split(),
+        ...      'The slow blue cheese drips over the lazy carrot'.split())
+        ... # doctest: +NORMALIZE_WHITESPACE
+        [('=', ['The']),
+         ('-', ['quick', 'brown', 'fox', 'jumps']),
+         ('+', ['slow', 'blue', 'cheese', 'drips']),
+         ('=', ['over', 'the', 'lazy']),
+         ('-', ['dog']),
+         ('+', ['carrot'])]
+
+    '''
+
+    # Create a map from old values to their indices
+    old_index_map = dict()
+    for i, val in enumerate(old):
+        old_index_map.setdefault(val,list()).append(i)
+
+    # Find the largest substring common to old and new.
+    # We use a dynamic programming approach here.
+    # 
+    # We iterate over each value in the `new` list, calling the
+    # index `inew`. At each iteration, `overlap[i]` is the
+    # length of the largest suffix of `old[:i]` equal to a suffix
+    # of `new[:inew]` (or unset when `old[i]` != `new[inew]`).
+    #
+    # At each stage of iteration, the new `overlap` (called
+    # `_overlap` until the original `overlap` is no longer needed)
+    # is built from the old one.
+    #
+    # If the length of overlap exceeds the largest substring
+    # seen so far (`sub_length`), we update the largest substring
+    # to the overlapping strings.
+
+    overlap = dict()
+    # `sub_start_old` is the index of the beginning of the largest overlapping
+    # substring in the old list. `sub_start_new` is the index of the beginning
+    # of the same substring in the new list. `sub_length` is the length that
+    # overlaps in both.
+    # These track the largest overlapping substring seen so far, so naturally
+    # we start with a 0-length substring.
+    sub_start_old = 0
+    sub_start_new = 0
+    sub_length = 0
+
+    for inew, val in enumerate(new):
+        _overlap = dict()
+        for iold in old_index_map.get(val,list()):
+            # now we are considering all values of iold such that
+            # `old[iold] == new[inew]`.
+            _overlap[iold] = (iold and overlap.get(iold - 1, 0)) + 1
+            if(_overlap[iold] > sub_length):
+                # this is the largest substring seen so far, so store its
+                # indices
+                sub_length = _overlap[iold]
+                sub_start_old = iold - sub_length + 1
+                sub_start_new = inew - sub_length + 1
+        overlap = _overlap
+
+    if sub_length == 0:
+        # If no common substring is found, we return an insert and delete...
+        return (old and [('-', old)] or []) + (new and [('+', new)] or [])
+    else:
+        # ...otherwise, the common substring is unchanged and we recursively
+        # diff the text before and after that substring
+        return diff(old[ : sub_start_old], new[ : sub_start_new]) + \
+               [('=', new[sub_start_new : sub_start_new + sub_length])] + \
+               diff(old[sub_start_old + sub_length : ],
+                       new[sub_start_new + sub_length : ])
+
+
+def string_diff(old, new):
+    '''
+    Returns the difference between the old and new strings when split on
+    whitespace. Considers punctuation a part of the word
+
+    This function is intended as an example; you'll probably want
+    a more sophisticated wrapper in practice.
+
+    Params:
+        old     the old string
+        new     the new string
+
+    Returns:
+        the output of `diff` on the two strings after splitting them
+        on whitespace (a list of change instructions; see the docstring
+        of `diff`)
+
+    Examples:
+        >>> string_diff('The quick brown fox', 'The fast blue fox')
+        ... # doctest: +NORMALIZE_WHITESPACE
+        [('=', ['The']),
+         ('-', ['quick', 'brown']),
+         ('+', ['fast', 'blue']),
+         ('=', ['fox'])]
+
+    '''
+    return diff(old.split(), new.split())
+
+
+def html_diff(old, new):
+    '''
+    Returns the difference between two strings (as in stringDiff) in
+    HTML format. HTML code in the strings is NOT escaped, so you
+    will get weird results if the strings contain HTML.
+
+    This function is intended as an example; you'll probably want
+    a more sophisticated wrapper in practice.
+
+    Params:
+        old     the old string
+        new     the new string
+
+    Returns:
+        the output of the diff expressed with HTML <ins> and <del>
+        tags.
+
+    Examples:
+        >>> html_diff('The quick brown fox', 'The fast blue fox')
+        'The <del>quick brown</del> <ins>fast blue</ins> fox'
+    '''
+    con = {'=': (lambda x: x),
+           '+': (lambda x: "<ins>" + x + "</ins>"),
+           '-': (lambda x: "<del>" + x + "</del>")}
+    return " ".join([(con[a])(" ".join(b)) for a, b in string_diff(old, new)])
+
+
+def check_diff(old, new):
+    '''
+    This tests that diffs returned by `diff` are valid. You probably won't
+    want to use this function, but it's provided for documentation and
+    testing.
+
+    A diff should satisfy the property that the old input is equal to the
+    elements of the result annotated with '-' or '=' concatenated together.
+    Likewise, the new input is equal to the elements of the result annotated
+    with '+' or '=' concatenated together. This function compares `old`,
+    `new`, and the results of `diff(old, new)` to ensure this is true.
+
+    Tests:
+        >>> check_diff('ABCBA', 'CBABA')
+        >>> check_diff('Foobarbaz', 'Foobarbaz')
+        >>> check_diff('Foobarbaz', 'Boobazbam')
+        >>> check_diff('The quick brown fox', 'Some quick brown car')
+        >>> check_diff('A thick red book', 'A quick blue book')
+        >>> check_diff('dafhjkdashfkhasfjsdafdasfsda', 'asdfaskjfhksahkfjsdha')
+        >>> check_diff('88288822828828288282828', '88288882882828282882828')
+        >>> check_diff('1234567890', '24689')
+    '''
+    old = list(old)
+    new = list(new)
+    result = diff(old, new)
+    _old = [val for (a, vals) in result if (a in '=-') for val in vals]
+    assert old == _old, 'Expected %s, got %s' % (old, _old)
+    _new = [val for (a, vals) in result if (a in '=+') for val in vals]
+    assert new == _new, 'Expected %s, got %s' % (new, _new)
+
-- 
2.9.3



^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 11/11] bitbake-diffsigs: colourise output
  2017-04-06 21:52 [PATCH 00/11] bitbake-diffsigs fixes/improvements Paul Eggleton
                   ` (9 preceding siblings ...)
  2017-04-06 21:52 ` [PATCH 10/11] lib/bb/siggen: show word-diff for single-line values containing spaces Paul Eggleton
@ 2017-04-06 21:52 ` Paul Eggleton
  2017-04-07  7:56 ` [PATCH 00/11] bitbake-diffsigs fixes/improvements Patrick Ohly
  2017-04-07 12:20 ` Peter Kjellerstedt
  12 siblings, 0 replies; 22+ messages in thread
From: Paul Eggleton @ 2017-04-06 21:52 UTC (permalink / raw)
  To: bitbake-devel

If the output is a TTY, add colour to the output in order to make it
easier to read. At the moment this is fairly basic, just add colour to
the "titles" of each change and to the diff output.

I tried to introduce this without changing the code too much - rather
than moving everything over to the new python formatting style, I've
introduced a color_format() function which takes care of the colour
formatting, either accepting additional format arguments or
alternatively leaving the caller to use the old-style formatting (%) to
insert values.

Signed-off-by: Paul Eggleton <paul.eggleton@linux.intel.com>
---
 bin/bitbake-diffsigs |  18 ++++++---
 lib/bb/siggen.py     | 101 +++++++++++++++++++++++++++++++++++++--------------
 2 files changed, 85 insertions(+), 34 deletions(-)

diff --git a/bin/bitbake-diffsigs b/bin/bitbake-diffsigs
index e9fdb48..884be1e 100755
--- a/bin/bitbake-diffsigs
+++ b/bin/bitbake-diffsigs
@@ -34,7 +34,7 @@ import bb.msg
 
 logger = bb.msg.logger_create('bitbake-diffsigs')
 
-def find_compare_task(bbhandler, pn, taskname, sig1=None, sig2=None):
+def find_compare_task(bbhandler, pn, taskname, sig1=None, sig2=None, color=False):
     """ Find the most recent signature files for the specified PN/task and compare them """
 
     if not hasattr(bb.siggen, 'find_siginfo'):
@@ -79,7 +79,7 @@ def find_compare_task(bbhandler, pn, taskname, sig1=None, sig2=None):
         elif not hash2 in hashfiles:
             recout.append("Unable to find matching sigdata for %s with hash %s" % (key, hash2))
         else:
-            out2 = bb.siggen.compare_sigfiles(hashfiles[hash1], hashfiles[hash2], recursecb)
+            out2 = bb.siggen.compare_sigfiles(hashfiles[hash1], hashfiles[hash2], recursecb, color=color)
             for change in out2:
                 for line in change.splitlines():
                     recout.append('  ' + line)
@@ -89,7 +89,7 @@ def find_compare_task(bbhandler, pn, taskname, sig1=None, sig2=None):
     # Recurse into signature comparison
     logger.debug("Signature file (previous): %s" % latestfiles[-2])
     logger.debug("Signature file (latest): %s" % latestfiles[-1])
-    output = bb.siggen.compare_sigfiles(latestfiles[-2], latestfiles[-1], recursecb)
+    output = bb.siggen.compare_sigfiles(latestfiles[-2], latestfiles[-1], recursecb, color=color)
     if output:
         print('\n'.join(output))
     sys.exit(0)
@@ -103,6 +103,10 @@ parser.add_argument('-d', '--debug',
                     help='Enable debug output',
                     action='store_true')
 
+parser.add_argument('--color',
+        help='Colorize output (where %(metavar)s is %(choices)s)',
+        choices=['auto', 'always', 'never'], default='auto', metavar='color')
+
 parser.add_argument("-t", "--task",
         help="find the signature data files for last two runs of the specified task and compare them",
         action="store", dest="taskargs", nargs=2, metavar=('recipename', 'taskname'))
@@ -125,20 +129,22 @@ options = parser.parse_args()
 if options.debug:
     logger.setLevel(logging.DEBUG)
 
+color = (options.color == 'always' or (options.color == 'auto' and sys.stdout.isatty()))
+
 if options.taskargs:
     with bb.tinfoil.Tinfoil() as tinfoil:
         tinfoil.prepare(config_only=True)
         if options.sigargs:
-            find_compare_task(tinfoil, options.taskargs[0], options.taskargs[1], options.sigargs[0], options.sigargs[1])
+            find_compare_task(tinfoil, options.taskargs[0], options.taskargs[1], options.sigargs[0], options.sigargs[1], color=color)
         else:
-            find_compare_task(tinfoil, options.taskargs[0], options.taskargs[1])
+            find_compare_task(tinfoil, options.taskargs[0], options.taskargs[1], color=color)
 else:
     if options.sigargs:
         logger.error('-s/--signature can only be used together with -t/--task')
         sys.exit(1)
     try:
         if options.sigdatafile1 and options.sigdatafile2:
-            output = bb.siggen.compare_sigfiles(options.sigdatafile1, options.sigdatafile2)
+            output = bb.siggen.compare_sigfiles(options.sigdatafile1, options.sigdatafile2, color=color)
         elif options.sigdatafile1:
             output = bb.siggen.dump_sigfile(options.sigdatafile1)
     except IOError as e:
diff --git a/lib/bb/siggen.py b/lib/bb/siggen.py
index d40c721..25ad3e9 100644
--- a/lib/bb/siggen.py
+++ b/lib/bb/siggen.py
@@ -353,7 +353,23 @@ def dump_this_task(outfile, d):
     referencestamp = bb.build.stamp_internal(task, d, None, True)
     bb.parse.siggen.dump_sigtask(fn, task, outfile, "customfile:" + referencestamp)
 
-def worddiff_str(oldstr, newstr):
+def init_colors(enable_color):
+    """Initialise colour dict for passing to compare_sigfiles()"""
+    # First set up the colours
+    colors = {'color_title':   '\033[1;37;40m',
+              'color_default': '\033[0;37;40m',
+              'color_add':     '\033[1;32;40m',
+              'color_remove':  '\033[1;31;40m',
+             }
+    # Leave all keys present but clear the values
+    if not enable_color:
+        for k in colors.keys():
+            colors[k] = ''
+    return colors
+
+def worddiff_str(oldstr, newstr, colors=None):
+    if not colors:
+        colors = init_colors(False)
     diff = simplediff.diff(oldstr.split(' '), newstr.split(' '))
     ret = []
     for change, value in diff:
@@ -361,17 +377,19 @@ def worddiff_str(oldstr, newstr):
         if change == '=':
             ret.append(value)
         elif change == '+':
-            item = '{+%s+}' % value
+            item = '{color_add}{{+{value}+}}{color_default}'.format(value=value, **colors)
             ret.append(item)
         elif change == '-':
-            item = '[-%s-]' % value
+            item = '{color_remove}[-{value}-]{color_default}'.format(value=value, **colors)
             ret.append(item)
     whitespace_note = ''
     if oldstr != newstr and ' '.join(oldstr.split()) == ' '.join(newstr.split()):
         whitespace_note = ' (whitespace changed)'
     return '"%s"%s' % (' '.join(ret), whitespace_note)
 
-def list_inline_diff(oldlist, newlist):
+def list_inline_diff(oldlist, newlist, colors=None):
+    if not colors:
+        colors = init_colors(False)
     diff = simplediff.diff(oldlist, newlist)
     ret = []
     for change, value in diff:
@@ -379,10 +397,10 @@ def list_inline_diff(oldlist, newlist):
         if change == '=':
             ret.append("'%s'" % value)
         elif change == '+':
-            item = "+'%s'" % value
+            item = '{color_add}+{value}{color_default}'.format(value=value, **colors)
             ret.append(item)
         elif change == '-':
-            item = "-'%s'" % value
+            item = '{color_remove}-{value}{color_default}'.format(value=value, **colors)
             ret.append(item)
     return '[%s]' % (', '.join(ret))
 
@@ -409,9 +427,26 @@ def clean_basepaths_list(a):
         b.append(clean_basepath(x))
     return b
 
-def compare_sigfiles(a, b, recursecb=None, collapsed=False):
+def compare_sigfiles(a, b, recursecb=None, color=False, collapsed=False):
     output = []
 
+    colors = init_colors(color)
+    def color_format(formatstr, **values):
+        """
+        Return colour formatted string.
+        NOTE: call with the format string, not an already formatted string
+        containing values (otherwise you could have trouble with { and }
+        characters)
+        """
+        if not formatstr.endswith('{color_default}'):
+            formatstr += '{color_default}'
+        # In newer python 3 versions you can pass both of these directly,
+        # but we only require 3.4 at the moment
+        formatparams = {}
+        formatparams.update(colors)
+        formatparams.update(values)
+        return formatstr.format(**formatparams)
+
     with open(a, 'rb') as f:
         p1 = pickle.Unpickler(f)
         a_data = p1.load()
@@ -465,33 +500,33 @@ def compare_sigfiles(a, b, recursecb=None, collapsed=False):
         return changed, added, removed
 
     if 'basewhitelist' in a_data and a_data['basewhitelist'] != b_data['basewhitelist']:
-        output.append("basewhitelist changed from '%s' to '%s'" % (a_data['basewhitelist'], b_data['basewhitelist']))
+        output.append(color_format("{color_title}basewhitelist changed{color_default} from '%s' to '%s'") % (a_data['basewhitelist'], b_data['basewhitelist']))
         if a_data['basewhitelist'] and b_data['basewhitelist']:
             output.append("changed items: %s" % a_data['basewhitelist'].symmetric_difference(b_data['basewhitelist']))
 
     if 'taskwhitelist' in a_data and a_data['taskwhitelist'] != b_data['taskwhitelist']:
-        output.append("taskwhitelist changed from '%s' to '%s'" % (a_data['taskwhitelist'], b_data['taskwhitelist']))
+        output.append(color_format("{color_title}taskwhitelist changed{color_default} from '%s' to '%s'") % (a_data['taskwhitelist'], b_data['taskwhitelist']))
         if a_data['taskwhitelist'] and b_data['taskwhitelist']:
             output.append("changed items: %s" % a_data['taskwhitelist'].symmetric_difference(b_data['taskwhitelist']))
 
     if a_data['taskdeps'] != b_data['taskdeps']:
-        output.append("Task dependencies changed from:\n%s\nto:\n%s" % (sorted(a_data['taskdeps']), sorted(b_data['taskdeps'])))
+        output.append(color_format("{color_title}Task dependencies changed{color_default} from:\n%s\nto:\n%s") % (sorted(a_data['taskdeps']), sorted(b_data['taskdeps'])))
 
     if a_data['basehash'] != b_data['basehash'] and not collapsed:
-        output.append("basehash changed from %s to %s" % (a_data['basehash'], b_data['basehash']))
+        output.append(color_format("{color_title}basehash changed{color_default} from %s to %s") % (a_data['basehash'], b_data['basehash']))
 
     changed, added, removed = dict_diff(a_data['gendeps'], b_data['gendeps'], a_data['basewhitelist'] & b_data['basewhitelist'])
     if changed:
         for dep in changed:
-            output.append("List of dependencies for variable %s changed from '%s' to '%s'" % (dep, a_data['gendeps'][dep], b_data['gendeps'][dep]))
+            output.append(color_format("{color_title}List of dependencies for variable %s changed from '{color_default}%s{color_title}' to '{color_default}%s{color_title}'") % (dep, a_data['gendeps'][dep], b_data['gendeps'][dep]))
             if a_data['gendeps'][dep] and b_data['gendeps'][dep]:
                 output.append("changed items: %s" % a_data['gendeps'][dep].symmetric_difference(b_data['gendeps'][dep]))
     if added:
         for dep in added:
-            output.append("Dependency on variable %s was added" % (dep))
+            output.append(color_format("{color_title}Dependency on variable %s was added") % (dep))
     if removed:
         for dep in removed:
-            output.append("Dependency on Variable %s was removed" % (dep))
+            output.append(color_format("{color_title}Dependency on Variable %s was removed") % (dep))
 
 
     changed, added, removed = dict_diff(a_data['varvals'], b_data['varvals'])
@@ -504,11 +539,20 @@ def compare_sigfiles(a, b, recursecb=None, collapsed=False):
                 # Cut off the first two lines, since we aren't interested in
                 # the old/new filename (they are blank anyway in this case)
                 difflines = list(diff)[2:]
-                output.append("Variable %s value changed:\n%s" % (dep, '\n'.join(difflines)))
+                if color:
+                    # Add colour to diff output
+                    for i, line in enumerate(difflines):
+                        if line.startswith('+'):
+                            line = color_format('{color_add}{line}', line=line)
+                            difflines[i] = line
+                        elif line.startswith('-'):
+                            line = color_format('{color_remove}{line}', line=line)
+                            difflines[i] = line
+                output.append(color_format("{color_title}Variable {var} value changed:{color_default}\n{diff}", var=dep, diff='\n'.join(difflines)))
             elif newval and oldval and (' ' in oldval or ' ' in newval):
-                output.append("Variable %s value changed:\n%s" % (dep, worddiff_str(oldval, newval)))
+                output.append(color_format("{color_title}Variable {var} value changed:{color_default}\n{diff}", var=dep, diff=worddiff_str(oldval, newval, colors)))
             else:
-                output.append("Variable %s value changed from '%s' to '%s'" % (dep, oldval, newval))
+                output.append(color_format("{color_title}Variable {var} value changed from '{color_default}{oldval}{color_title}' to '{color_default}{newval}{color_title}'{color_default}", var=dep, oldval=oldval, newval=newval))
 
     if not 'file_checksum_values' in a_data:
          a_data['file_checksum_values'] = {}
@@ -518,13 +562,13 @@ def compare_sigfiles(a, b, recursecb=None, collapsed=False):
     changed, added, removed = file_checksums_diff(a_data['file_checksum_values'], b_data['file_checksum_values'])
     if changed:
         for f, old, new in changed:
-            output.append("Checksum for file %s changed from %s to %s" % (f, old, new))
+            output.append(color_format("{color_title}Checksum for file %s changed{color_default} from %s to %s") % (f, old, new))
     if added:
         for f in added:
-            output.append("Dependency on checksum of file %s was added" % (f))
+            output.append(color_format("{color_title}Dependency on checksum of file %s was added") % (f))
     if removed:
         for f in removed:
-            output.append("Dependency on checksum of file %s was removed" % (f))
+            output.append(color_format("{color_title}Dependency on checksum of file %s was removed") % (f))
 
     if not 'runtaskdeps' in a_data:
          a_data['runtaskdeps'] = {}
@@ -539,18 +583,19 @@ def compare_sigfiles(a, b, recursecb=None, collapsed=False):
             for idx, task in enumerate(a_data['runtaskdeps']):
                 a = a_data['runtaskdeps'][idx]
                 b = b_data['runtaskdeps'][idx]
-                if a_data['runtaskhashes'][a] != b_data['runtaskhashes'][b]:
-                    changed.append("%s with hash %s\n changed to\n%s with hash %s" % (a, a_data['runtaskhashes'][a], b, b_data['runtaskhashes'][b]))
+                if a_data['runtaskhashes'][a] != b_data['runtaskhashes'][b] and not collapsed:
+                    changed.append("%s with hash %s\n changed to\n%s with hash %s" % (clean_basepath(a), a_data['runtaskhashes'][a], clean_basepath(b), b_data['runtaskhashes'][b]))
 
         if changed:
             clean_a = clean_basepaths_list(a_data['runtaskdeps'])
             clean_b = clean_basepaths_list(b_data['runtaskdeps'])
             if clean_a != clean_b:
-                output.append("runtaskdeps changed:\n%s" % list_inline_diff(clean_a, clean_b))
+                output.append(color_format("{color_title}runtaskdeps changed:{color_default}\n%s") % list_inline_diff(clean_a, clean_b, colors))
             else:
-                output.append("runtaskdeps changed:")
+                output.append(color_format("{color_title}runtaskdeps changed:"))
             output.append("\n".join(changed))
 
+
     if 'runtaskhashes' in a_data and 'runtaskhashes' in b_data:
         a = a_data['runtaskhashes']
         b = b_data['runtaskhashes']
@@ -564,7 +609,7 @@ def compare_sigfiles(a, b, recursecb=None, collapsed=False):
                             #output.append("Dependency on task %s was replaced by %s with same hash" % (dep, bdep))
                             bdep_found = True
                 if not bdep_found:
-                    output.append("Dependency on task %s was added with hash %s" % (clean_basepath(dep), b[dep]))
+                    output.append(color_format("{color_title}Dependency on task %s was added{color_default} with hash %s") % (clean_basepath(dep), b[dep]))
         if removed:
             for dep in removed:
                 adep_found = False
@@ -574,11 +619,11 @@ def compare_sigfiles(a, b, recursecb=None, collapsed=False):
                             #output.append("Dependency on task %s was replaced by %s with same hash" % (adep, dep))
                             adep_found = True
                 if not adep_found:
-                    output.append("Dependency on task %s was removed with hash %s" % (clean_basepath(dep), a[dep]))
+                    output.append(color_format("{color_title}Dependency on task %s was removed{color_default} with hash %s") % (clean_basepath(dep), a[dep]))
         if changed:
             for dep in changed:
                 if not collapsed:
-                    output.append("Hash for dependent task %s changed from %s to %s" % (clean_basepath(dep), a[dep], b[dep]))
+                    output.append(color_format("{color_title}Hash for dependent task %s changed{color_default} from %s to %s") % (clean_basepath(dep), a[dep], b[dep]))
                 if callable(recursecb):
                     recout = recursecb(dep, a[dep], b[dep])
                     if recout:
@@ -592,7 +637,7 @@ def compare_sigfiles(a, b, recursecb=None, collapsed=False):
     a_taint = a_data.get('taint', None)
     b_taint = b_data.get('taint', None)
     if a_taint != b_taint:
-        output.append("Taint (by forced/invalidated task) changed from %s to %s" % (a_taint, b_taint))
+        output.append(color_format("{color_title}Taint (by forced/invalidated task) changed{color_default} from %s to %s") % (a_taint, b_taint))
 
     return output
 
-- 
2.9.3



^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH 09/11] lib/bb/siggen: add collapsed mode to compare_sigfiles()
  2017-04-06 21:52 ` [PATCH 09/11] lib/bb/siggen: add collapsed mode to compare_sigfiles() Paul Eggleton
@ 2017-04-07  6:37   ` Patrick Ohly
  2017-04-07  8:26     ` Paul Eggleton
  0 siblings, 1 reply; 22+ messages in thread
From: Patrick Ohly @ 2017-04-07  6:37 UTC (permalink / raw)
  To: Paul Eggleton; +Cc: bitbake-devel

On Fri, 2017-04-07 at 09:52 +1200, Paul Eggleton wrote:
> If we just want to drill down to the actual differences then we don't
> need to see certain things in the output, e.g. basehash changing or the
> signature of dependent tasks. This will be used for comparing signatures
> within buildhistory-diff in OE-Core; the default mode as used by
> bitbake-diffsigs and bitbake -S printdiff remains unchanged for the
> moment.

This sounds interesting. The bitbake-diffsigs output I got for the
yocto-compat-layer had such useless "basehash changed" comments. Should
we add a mode parameter to bitbake-diffsigs that allows choosing this
collapsed mode?

-- 
Best Regards, Patrick Ohly

The content of this message is my personal opinion only and although
I am an employee of Intel, the statements I make here in no way
represent Intel's position on the issue, nor am I authorized to speak
on behalf of Intel on this matter.





^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 00/11] bitbake-diffsigs fixes/improvements
  2017-04-06 21:52 [PATCH 00/11] bitbake-diffsigs fixes/improvements Paul Eggleton
                   ` (10 preceding siblings ...)
  2017-04-06 21:52 ` [PATCH 11/11] bitbake-diffsigs: colourise output Paul Eggleton
@ 2017-04-07  7:56 ` Patrick Ohly
  2017-04-07  8:29   ` Paul Eggleton
  2017-04-07 12:20 ` Peter Kjellerstedt
  12 siblings, 1 reply; 22+ messages in thread
From: Patrick Ohly @ 2017-04-07  7:56 UTC (permalink / raw)
  To: Paul Eggleton; +Cc: bitbake-devel

On Fri, 2017-04-07 at 09:52 +1200, Paul Eggleton wrote:
> As part of an effort to add task signature recording and comparison to
> buildhistory in OE, I went digging into bitbake-diffsigs and the code
> that supports it and discovered that unfortunately it had a number of
> bugs - in particular the -t option despite being around for a while and
> having numerous band-aids applied in the past still wasn't really
> working properly.

I'm not sure whether it is working now, even with your changes. My
test, using poky master and your bitbake branch:

1. bitbake -c cleansstate m4
2. bitbake m4
3. patch m4:
diff --git a/meta/recipes-devtools/m4/m4_1.4.18.bb
b/meta/recipes-devtools/m4/m4_1.4.18.bb
index b12c0adf3ad..0a8e9c10778 100644
--- a/meta/recipes-devtools/m4/m4_1.4.18.bb
+++ b/meta/recipes-devtools/m4/m4_1.4.18.bb
@@ -1,3 +1,7 @@
 require m4-${PV}.inc
 
+do_install_append () {
+    echo hello world
+}
+
 BBCLASSEXTEND = "nativesdk"
4. bitbake m4
5. bitbake-diffsigs -t m4 do_build
=> ERROR: No sigdata files found matching m4 do_build

There is one file for do_build in tmp/stamps/i586-poky-linux/m4 and
several for do_install:

$ ls -l tmp/stamps/i586-poky-linux/m4/*do_build* tmp/stamps/i586-poky-linux/m4/*do_install*
-rw-r--r-- 1 pohly pohly     0 Apr  7 09:53 tmp/stamps/i586-poky-linux/m4/1.4.18-r0.do_build.aa7d33bbb2dfef67ef3ca1b5d3f2cde1
-rw-r--r-- 1 pohly pohly     0 Apr  7 09:53 tmp/stamps/i586-poky-linux/m4/1.4.18-r0.do_install.a0954ae23d8da50c4af22f719255411f
-rw-r--r-- 1 pohly pohly 20573 Apr  7 09:52 tmp/stamps/i586-poky-linux/m4/1.4.18-r0.do_install.sigdata.464e707d80ec982fb5b0c50a6526f05a
-rw-r--r-- 1 pohly pohly 19610 Apr  7 09:53 tmp/stamps/i586-poky-linux/m4/1.4.18-r0.do_install.sigdata.a0954ae23d8da50c4af22f719255411f


bitbake-diffsigs does not fail for do_install, but does not print
anything either (= no output at all from "bitbake-diffsigs -t m4
do_install").

Hmm, does it pick the right files to compare?

$ bitbake-diffsigs -d -t m4 do_install
DEBUG: Signature file (previous): /fast/build/poky/x86/sstate-cache/a0/sstate:m4:i586-poky-linux:1.4.18:r0:i586:3:a0954ae23d8da50c4af22f719255411f_install.tgz.siginfo
DEBUG: Signature file (latest): /fast/build/poky/x86/tmp/stamps/i586-poky-linux/m4/1.4.18-r0.do_install.sigdata.a0954ae23d8da50c4af22f719255411f

Am I using the tool incorrectly or are my expectations wrong?

My expectation was that "bitbake-diffsigs -t m4 do_build" would tell me
that m4 was rebuilt because some tasks that it depends on changed, and
then I would drill down to those tasks, eventually arriving at
do_install.

-- 
Best Regards, Patrick Ohly

The content of this message is my personal opinion only and although
I am an employee of Intel, the statements I make here in no way
represent Intel's position on the issue, nor am I authorized to speak
on behalf of Intel on this matter.





^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH 09/11] lib/bb/siggen: add collapsed mode to compare_sigfiles()
  2017-04-07  6:37   ` Patrick Ohly
@ 2017-04-07  8:26     ` Paul Eggleton
  0 siblings, 0 replies; 22+ messages in thread
From: Paul Eggleton @ 2017-04-07  8:26 UTC (permalink / raw)
  To: Patrick Ohly; +Cc: bitbake-devel

On Friday, 7 April 2017 6:37:49 PM NZST Patrick Ohly wrote:
> On Fri, 2017-04-07 at 09:52 +1200, Paul Eggleton wrote:
> > If we just want to drill down to the actual differences then we don't
> > need to see certain things in the output, e.g. basehash changing or the
> > signature of dependent tasks. This will be used for comparing signatures
> > within buildhistory-diff in OE-Core; the default mode as used by
> > bitbake-diffsigs and bitbake -S printdiff remains unchanged for the
> > moment.
> 
> This sounds interesting. The bitbake-diffsigs output I got for the
> yocto-compat-layer had such useless "basehash changed" comments. Should
> we add a mode parameter to bitbake-diffsigs that allows choosing this
> collapsed mode?

We could, but to be effective the caller needs to handle the output in a 
particular way (see what I've done in the buildhistory-diff patchset I sent 
today).

FYI my plan for 2.4 is to evaluate whether this collapsed mode or something 
like it is better as a default display mode.

Cheers,
Paul

-- 

Paul Eggleton
Intel Open Source Technology Centre


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 00/11] bitbake-diffsigs fixes/improvements
  2017-04-07  7:56 ` [PATCH 00/11] bitbake-diffsigs fixes/improvements Patrick Ohly
@ 2017-04-07  8:29   ` Paul Eggleton
  2017-04-07  8:32     ` Paul Eggleton
  0 siblings, 1 reply; 22+ messages in thread
From: Paul Eggleton @ 2017-04-07  8:29 UTC (permalink / raw)
  To: Patrick Ohly; +Cc: bitbake-devel

Hi Patrick,

On Friday, 7 April 2017 7:56:19 PM NZST Patrick Ohly wrote:
> On Fri, 2017-04-07 at 09:52 +1200, Paul Eggleton wrote:
> > As part of an effort to add task signature recording and comparison to
> > buildhistory in OE, I went digging into bitbake-diffsigs and the code
> > that supports it and discovered that unfortunately it had a number of
> > bugs - in particular the -t option despite being around for a while and
> > having numerous band-aids applied in the past still wasn't really
> > working properly.
> 
> I'm not sure whether it is working now, even with your changes. My
> test, using poky master and your bitbake branch:
> 
> 1. bitbake -c cleansstate m4
> 2. bitbake m4
> 3. patch m4:
> diff --git a/meta/recipes-devtools/m4/m4_1.4.18.bb
> b/meta/recipes-devtools/m4/m4_1.4.18.bb
> index b12c0adf3ad..0a8e9c10778 100644
> --- a/meta/recipes-devtools/m4/m4_1.4.18.bb
> +++ b/meta/recipes-devtools/m4/m4_1.4.18.bb
> @@ -1,3 +1,7 @@
>  require m4-${PV}.inc
> 
> +do_install_append () {
> +    echo hello world
> +}
> +
>  BBCLASSEXTEND = "nativesdk"
> 4. bitbake m4
> 5. bitbake-diffsigs -t m4 do_build
> => ERROR: No sigdata files found matching m4 do_build
> 
> There is one file for do_build in tmp/stamps/i586-poky-linux/m4 and
> several for do_install:
> 
> $ ls -l tmp/stamps/i586-poky-linux/m4/*do_build*
> tmp/stamps/i586-poky-linux/m4/*do_install* -rw-r--r-- 1 pohly pohly     0
> Apr  7 09:53
> tmp/stamps/i586-poky-linux/m4/1.4.18-r0.do_build.aa7d33bbb2dfef67ef3ca1b5d3
> f2cde1 -rw-r--r-- 1 pohly pohly     0 Apr  7 09:53
> tmp/stamps/i586-poky-linux/m4/1.4.18-r0.do_install.a0954ae23d8da50c4af22f71
> 9255411f -rw-r--r-- 1 pohly pohly 20573 Apr  7 09:52
> tmp/stamps/i586-poky-linux/m4/1.4.18-r0.do_install.sigdata.464e707d80ec982f
> b5b0c50a6526f05a -rw-r--r-- 1 pohly pohly 19610 Apr  7 09:53
> tmp/stamps/i586-poky-linux/m4/1.4.18-r0.do_install.sigdata.a0954ae23d8da50c
> 4af22f719255411f
> 
> 
> bitbake-diffsigs does not fail for do_install, but does not print
> anything either (= no output at all from "bitbake-diffsigs -t m4
> do_install").
> 
> Hmm, does it pick the right files to compare?
> 
> $ bitbake-diffsigs -d -t m4 do_install
> DEBUG: Signature file (previous):
> /fast/build/poky/x86/sstate-cache/a0/sstate:m4:i586-poky-linux:1.4.18:r0:i5
> 86:3:a0954ae23d8da50c4af22f719255411f_install.tgz.siginfo DEBUG: Signature
> file (latest):
> /fast/build/poky/x86/tmp/stamps/i586-poky-linux/m4/1.4.18-r0.do_install.sig
> data.a0954ae23d8da50c4af22f719255411f
> 
> Am I using the tool incorrectly or are my expectations wrong?
>
> My expectation was that "bitbake-diffsigs -t m4 do_build" would tell me
> that m4 was rebuilt because some tasks that it depends on changed, and
> then I would drill down to those tasks, eventually arriving at
> do_install.

Hmm, something has gone wrong here because this is the exact scenario it's 
supposed to handle. In my similar tests it did work, so I'm not sure why it 
didn't work in this case - I'll have to give it a try here.

Cheers,
Paul

-- 

Paul Eggleton
Intel Open Source Technology Centre


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 00/11] bitbake-diffsigs fixes/improvements
  2017-04-07  8:29   ` Paul Eggleton
@ 2017-04-07  8:32     ` Paul Eggleton
  2017-04-07  8:54       ` Patrick Ohly
  0 siblings, 1 reply; 22+ messages in thread
From: Paul Eggleton @ 2017-04-07  8:32 UTC (permalink / raw)
  To: Patrick Ohly; +Cc: bitbake-devel

On Friday, 7 April 2017 8:29:18 PM NZST Paul Eggleton wrote:
> Hi Patrick,
> 
> On Friday, 7 April 2017 7:56:19 PM NZST Patrick Ohly wrote:
> > On Fri, 2017-04-07 at 09:52 +1200, Paul Eggleton wrote:
> > > As part of an effort to add task signature recording and comparison to
> > > buildhistory in OE, I went digging into bitbake-diffsigs and the code
> > > that supports it and discovered that unfortunately it had a number of
> > > bugs - in particular the -t option despite being around for a while and
> > > having numerous band-aids applied in the past still wasn't really
> > > working properly.
> > 
> > I'm not sure whether it is working now, even with your changes. My
> > test, using poky master and your bitbake branch:
> > 
> > 1. bitbake -c cleansstate m4
> > 2. bitbake m4
> > 3. patch m4:
> > diff --git a/meta/recipes-devtools/m4/m4_1.4.18.bb
> > b/meta/recipes-devtools/m4/m4_1.4.18.bb
> > index b12c0adf3ad..0a8e9c10778 100644
> > --- a/meta/recipes-devtools/m4/m4_1.4.18.bb
> > +++ b/meta/recipes-devtools/m4/m4_1.4.18.bb
> > @@ -1,3 +1,7 @@
> > 
> >  require m4-${PV}.inc
> > 
> > +do_install_append () {
> > +    echo hello world
> > +}
> > +
> > 
> >  BBCLASSEXTEND = "nativesdk"
> > 
> > 4. bitbake m4
> > 5. bitbake-diffsigs -t m4 do_build
> > => ERROR: No sigdata files found matching m4 do_build
> > 
> > There is one file for do_build in tmp/stamps/i586-poky-linux/m4 and
> > several for do_install:
> > 
> > $ ls -l tmp/stamps/i586-poky-linux/m4/*do_build*
> > tmp/stamps/i586-poky-linux/m4/*do_install* -rw-r--r-- 1 pohly pohly     0
> > Apr  7 09:53
> > tmp/stamps/i586-poky-linux/m4/1.4.18-r0.do_build.aa7d33bbb2dfef67ef3ca1b5d
> > 3
> > f2cde1 -rw-r--r-- 1 pohly pohly     0 Apr  7 09:53
> > tmp/stamps/i586-poky-linux/m4/1.4.18-r0.do_install.a0954ae23d8da50c4af22f7
> > 1
> > 9255411f -rw-r--r-- 1 pohly pohly 20573 Apr  7 09:52
> > tmp/stamps/i586-poky-linux/m4/1.4.18-r0.do_install.sigdata.464e707d80ec982
> > f
> > b5b0c50a6526f05a -rw-r--r-- 1 pohly pohly 19610 Apr  7 09:53
> > tmp/stamps/i586-poky-linux/m4/1.4.18-r0.do_install.sigdata.a0954ae23d8da50
> > c
> > 4af22f719255411f
> > 
> > 
> > bitbake-diffsigs does not fail for do_install, but does not print
> > anything either (= no output at all from "bitbake-diffsigs -t m4
> > do_install").
> > 
> > Hmm, does it pick the right files to compare?
> > 
> > $ bitbake-diffsigs -d -t m4 do_install
> > DEBUG: Signature file (previous):
> > /fast/build/poky/x86/sstate-cache/a0/sstate:m4:i586-poky-linux:1.4.18:r0:i
> > 5
> > 86:3:a0954ae23d8da50c4af22f719255411f_install.tgz.siginfo DEBUG: Signature
> > file (latest):
> > /fast/build/poky/x86/tmp/stamps/i586-poky-linux/m4/1.4.18-r0.do_install.si
> > g
> > data.a0954ae23d8da50c4af22f719255411f
> > 
> > Am I using the tool incorrectly or are my expectations wrong?
> > 
> > My expectation was that "bitbake-diffsigs -t m4 do_build" would tell me
> > that m4 was rebuilt because some tasks that it depends on changed, and
> > then I would drill down to those tasks, eventually arriving at
> > do_install.
> 
> Hmm, something has gone wrong here because this is the exact scenario it's
> supposed to handle. In my similar tests it did work, so I'm not sure why it
> didn't work in this case - I'll have to give it a try here.

I send and then I immediately realised what might be wrong - you said you're 
using poky master which implies that you haven't also applied the patches for 
lib/oe/sstatesig.py that I sent to the OE-Core list today. If you haven't, try 
doing that, it should fix the problem.

Cheers,
Paul

-- 

Paul Eggleton
Intel Open Source Technology Centre


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 00/11] bitbake-diffsigs fixes/improvements
  2017-04-07  8:32     ` Paul Eggleton
@ 2017-04-07  8:54       ` Patrick Ohly
  2017-04-07  9:48         ` Paul Eggleton
  0 siblings, 1 reply; 22+ messages in thread
From: Patrick Ohly @ 2017-04-07  8:54 UTC (permalink / raw)
  To: Paul Eggleton; +Cc: bitbake-devel

On Fri, 2017-04-07 at 20:32 +1200, Paul Eggleton wrote:
> On Friday, 7 April 2017 8:29:18 PM NZST Paul Eggleton wrote:
> > On Friday, 7 April 2017 7:56:19 PM NZST Patrick Ohly wrote:
> > > Am I using the tool incorrectly or are my expectations wrong?
> > > 
> > > My expectation was that "bitbake-diffsigs -t m4 do_build" would tell me
> > > that m4 was rebuilt because some tasks that it depends on changed, and
> > > then I would drill down to those tasks, eventually arriving at
> > > do_install.
> > 
> > Hmm, something has gone wrong here because this is the exact scenario it's
> > supposed to handle. In my similar tests it did work, so I'm not sure why it
> > didn't work in this case - I'll have to give it a try here.
> 
> I send and then I immediately realised what might be wrong - you said you're 
> using poky master which implies that you haven't also applied the patches for 
> lib/oe/sstatesig.py that I sent to the OE-Core list today. If you haven't, try 
> doing that, it should fix the problem.

Indeed, I had missed that part. It's working now for do_install.

What about do_build? Should bitbake-diffsigs be usable with that?

-- 
Best Regards, Patrick Ohly

The content of this message is my personal opinion only and although
I am an employee of Intel, the statements I make here in no way
represent Intel's position on the issue, nor am I authorized to speak
on behalf of Intel on this matter.





^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 00/11] bitbake-diffsigs fixes/improvements
  2017-04-07  8:54       ` Patrick Ohly
@ 2017-04-07  9:48         ` Paul Eggleton
  0 siblings, 0 replies; 22+ messages in thread
From: Paul Eggleton @ 2017-04-07  9:48 UTC (permalink / raw)
  To: Patrick Ohly; +Cc: bitbake-devel

On Friday, 7 April 2017 8:54:38 PM NZST Patrick Ohly wrote:
> On Fri, 2017-04-07 at 20:32 +1200, Paul Eggleton wrote:
> > On Friday, 7 April 2017 8:29:18 PM NZST Paul Eggleton wrote:
> > > On Friday, 7 April 2017 7:56:19 PM NZST Patrick Ohly wrote:
> > > > Am I using the tool incorrectly or are my expectations wrong?
> > > > 
> > > > My expectation was that "bitbake-diffsigs -t m4 do_build" would tell
> > > > me
> > > > that m4 was rebuilt because some tasks that it depends on changed, and
> > > > then I would drill down to those tasks, eventually arriving at
> > > > do_install.
> > > 
> > > Hmm, something has gone wrong here because this is the exact scenario
> > > it's
> > > supposed to handle. In my similar tests it did work, so I'm not sure why
> > > it
> > > didn't work in this case - I'll have to give it a try here.
> > 
> > I send and then I immediately realised what might be wrong - you said
> > you're using poky master which implies that you haven't also applied the
> > patches for lib/oe/sstatesig.py that I sent to the OE-Core list today. If
> > you haven't, try doing that, it should fix the problem.
> 
> Indeed, I had missed that part. It's working now for do_install.
> 
> What about do_build? Should bitbake-diffsigs be usable with that?

If it changed (i.e. there are two signature files to compare) then yes, -t 
should be able to work back from those. For some reason though you don't seem 
to have sigdata files for do_build, which is odd - a quick look in my stamps 
directory here suggests that you should.

Cheers,
Paul

-- 

Paul Eggleton
Intel Open Source Technology Centre


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 00/11] bitbake-diffsigs fixes/improvements
  2017-04-06 21:52 [PATCH 00/11] bitbake-diffsigs fixes/improvements Paul Eggleton
                   ` (11 preceding siblings ...)
  2017-04-07  7:56 ` [PATCH 00/11] bitbake-diffsigs fixes/improvements Patrick Ohly
@ 2017-04-07 12:20 ` Peter Kjellerstedt
  2017-04-07 16:46   ` Richard Purdie
  12 siblings, 1 reply; 22+ messages in thread
From: Peter Kjellerstedt @ 2017-04-07 12:20 UTC (permalink / raw)
  To: Paul Eggleton; +Cc: bitbake-devel

> -----Original Message-----
> From: bitbake-devel-bounces@lists.openembedded.org [mailto:bitbake-
> devel-bounces@lists.openembedded.org] On Behalf Of Paul Eggleton
> Sent: den 6 april 2017 23:52
> To: bitbake-devel@lists.openembedded.org
> Subject: [bitbake-devel] [PATCH 00/11] bitbake-diffsigs
> fixes/improvements
> 
> As part of an effort to add task signature recording and comparison to
> buildhistory in OE, I went digging into bitbake-diffsigs and the code
> that supports it and discovered that unfortunately it had a number of
> bugs - in particular the -t option despite being around for a while and
> having numerous band-aids applied in the past still wasn't really
> working properly. I have to take a big chunk of the responsibility for
> this as I wrote the thing in the first place. This patchset corrects
> most of the issues that I found and also makes a number of improvements
> to the readability of the output.
> 
> NOTE: there are a few corresponding changes required in OE-Core that I
> am about to send out, these should be applied at the same time.

Will you do the same for bitbake-dumpsig as I believe it has many 
problems in common with bitbake-diffsigs?

//Peter



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 03/11] bitbake-diffsigs: properly report which signature is missing
  2017-04-06 21:52 ` [PATCH 03/11] bitbake-diffsigs: properly report which signature is missing Paul Eggleton
@ 2017-04-07 16:25   ` Patrick Ohly
  0 siblings, 0 replies; 22+ messages in thread
From: Patrick Ohly @ 2017-04-07 16:25 UTC (permalink / raw)
  To: Paul Eggleton; +Cc: bitbake-devel

On Fri, 2017-04-07 at 09:52 +1200, Paul Eggleton wrote:
> If just one of the two signatures we want to compare aren't available,
> report that one rather than misleadingly claiming both are missing.

I just noticed that bitbake-diffsigs -t ... do_build recursively dumps
differences in other tasks - cool, I don't know it could do that. It
worked fine for all-arch ca-certificates, but not for 

$ bitbake-diffsigs -d -t gdb-cross-x86_64 do_build -s 68a82197720db0f0dbad8955635bc1c0 5631f484d848541bdbf6a4c1f8117f79
DEBUG: Signature file (previous): /fast/build/nodistro/x86/tmp-glibc/stamps/x86_64-linux/gdb-cross-x86_64/7.12.1-r0.do_build.sigdata.68a82197720db0f0dbad8955635bc1c0
DEBUG: Signature file (latest): /fast/build/nodistro/x86/tmp-glibc/stamps/x86_64-linux/gdb-cross-x86_64/7.12.1-r0.do_build.sigdata.5631f484d848541bdbf6a4c1f8117f79
Hash for dependent task gdb/gdb-cross_7.12.1.bb.do_populate_sysroot changed from 6247000d87570135b704fcfd3e4e7cd4 to 554c1ce69bf0538a91436e5f7c256488
Unable to find matching sigdata for /fast/work/openembedded-core/meta/recipes-devtools/gdb/gdb-cross_7.12.1.bb.do_populate_sysroot with hashes 6247000d87570135b704fcfd3e4e7cd4 or 554c1ce69bf0538a91436e5f7c256488

$ bitbake-diffsigs -d -t gdb-cross-x86_64 do_populate_sysroot -s 6247000d87570135b704fcfd3e4e7cd4 554c1ce69bf0538a91436e5f7c256488
DEBUG: Signature file (previous): /fast/build/nodistro/x86/tmp-glibc/stamps/x86_64-linux/gdb-cross-x86_64/7.12.1-r0.do_populate_sysroot.sigdata.6247000d87570135b704fcfd3e4e7cd4
DEBUG: Signature file (latest): /fast/build/nodistro/x86/tmp-glibc/stamps/x86_64-linux/gdb-cross-x86_64/7.12.1-r0.do_populate_sysroot.sigdata.554c1ce69bf0538a91436e5f7c256488
Hash for dependent task gdb/gdb-cross_7.12.1.bb.do_install changed from 0786db89020b877d443da848548b3d1d to 746c2815e180fa2a4ff28608c7c1b9d5
Unable to find matching sigdata for /fast/work/openembedded-core/meta/recipes-devtools/gdb/gdb-cross_7.12.1.bb.do_install with hashes 0786db89020b877d443da848548b3d1d or 746c2815e180fa2a4ff28608c7c1b9d5

In other words, automatic diff fails to find files, while manual diff finds them?!

This situation can be triggered with a local patch of mine (to be posted soon) and using
yocto-compat-layer.py -n --machines intel-corei7-64 intel-core2-32 qemux86 qemux86-64 -- /fast/work/meta-intel

The test uses MACHINE=... bitbake -S none world to generate the signatures for
different machines and then looks for tasks that are shared between
machines with different signatures.

-- 
Best Regards, Patrick Ohly

The content of this message is my personal opinion only and although
I am an employee of Intel, the statements I make here in no way
represent Intel's position on the issue, nor am I authorized to speak
on behalf of Intel on this matter.





^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 00/11] bitbake-diffsigs fixes/improvements
  2017-04-07 12:20 ` Peter Kjellerstedt
@ 2017-04-07 16:46   ` Richard Purdie
  0 siblings, 0 replies; 22+ messages in thread
From: Richard Purdie @ 2017-04-07 16:46 UTC (permalink / raw)
  To: Peter Kjellerstedt, Paul Eggleton; +Cc: bitbake-devel

On Fri, 2017-04-07 at 12:20 +0000, Peter Kjellerstedt wrote:
> > 
> > -----Original Message-----
> > From: bitbake-devel-bounces@lists.openembedded.org [mailto:bitbake-
> > devel-bounces@lists.openembedded.org] On Behalf Of Paul Eggleton
> > Sent: den 6 april 2017 23:52
> > To: bitbake-devel@lists.openembedded.org
> > Subject: [bitbake-devel] [PATCH 00/11] bitbake-diffsigs
> > fixes/improvements
> > 
> > As part of an effort to add task signature recording and comparison
> > to
> > buildhistory in OE, I went digging into bitbake-diffsigs and the
> > code
> > that supports it and discovered that unfortunately it had a number
> > of
> > bugs - in particular the -t option despite being around for a while
> > and
> > having numerous band-aids applied in the past still wasn't really
> > working properly. I have to take a big chunk of the responsibility
> > for
> > this as I wrote the thing in the first place. This patchset
> > corrects
> > most of the issues that I found and also makes a number of
> > improvements
> > to the readability of the output.
> > 
> > NOTE: there are a few corresponding changes required in OE-Core
> > that I
> > am about to send out, these should be applied at the same time.
> Will you do the same for bitbake-dumpsig as I believe it has many 
> problems in common with bitbake-diffsigs?

bitbake-dumpsig is near enough a symlink to diffsigs, its all the same
code.

Cheers,

Richard


^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2017-04-07 16:48 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-04-06 21:52 [PATCH 00/11] bitbake-diffsigs fixes/improvements Paul Eggleton
2017-04-06 21:52 ` [PATCH 01/11] bitbake-diffsigs: fix -t picking wrong files to compare Paul Eggleton
2017-04-06 21:52 ` [PATCH 02/11] lib/bb/siggen: add missing path separator to cleaned paths Paul Eggleton
2017-04-06 21:52 ` [PATCH 03/11] bitbake-diffsigs: properly report which signature is missing Paul Eggleton
2017-04-07 16:25   ` Patrick Ohly
2017-04-06 21:52 ` [PATCH 04/11] bitbake-diffsigs: drop naive logic for removing duplicate files Paul Eggleton
2017-04-06 21:52 ` [PATCH 05/11] lib/bb/siggen: show a diff when dumping changes to multi-line values Paul Eggleton
2017-04-06 21:52 ` [PATCH 06/11] lib/bb/siggen: don't show unchanged runtaskdeps list Paul Eggleton
2017-04-06 21:52 ` [PATCH 07/11] bitbake-diffsigs: change to use argparse Paul Eggleton
2017-04-06 21:52 ` [PATCH 08/11] bitbake-diffsigs: add an option to find and compare specific signatures Paul Eggleton
2017-04-06 21:52 ` [PATCH 09/11] lib/bb/siggen: add collapsed mode to compare_sigfiles() Paul Eggleton
2017-04-07  6:37   ` Patrick Ohly
2017-04-07  8:26     ` Paul Eggleton
2017-04-06 21:52 ` [PATCH 10/11] lib/bb/siggen: show word-diff for single-line values containing spaces Paul Eggleton
2017-04-06 21:52 ` [PATCH 11/11] bitbake-diffsigs: colourise output Paul Eggleton
2017-04-07  7:56 ` [PATCH 00/11] bitbake-diffsigs fixes/improvements Patrick Ohly
2017-04-07  8:29   ` Paul Eggleton
2017-04-07  8:32     ` Paul Eggleton
2017-04-07  8:54       ` Patrick Ohly
2017-04-07  9:48         ` Paul Eggleton
2017-04-07 12:20 ` Peter Kjellerstedt
2017-04-07 16:46   ` Richard Purdie

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.