All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2] fetch2/checksum/siggen: Fix taskhashes not tracking file directories
@ 2021-11-08 15:11 Richard Purdie
  0 siblings, 0 replies; only message in thread
From: Richard Purdie @ 2021-11-08 15:11 UTC (permalink / raw)
  To: bitbake-devel

Currently if you have something like:

SRC_URI = "file://foobar;subdir=${S}"

and a file like:

foobar/1/somefile

and then move it to:

foobar/2/somefile

the task checksums don't reflect/notice this. The file-checksum fields
encode two pieces of data, the file path and whether or not the file
exists. Changing the code which uses these fields is problematic.

We can however add a "/./" path element which means "include the bit
after the marker in the checksum" which the path walking code can use
to mark which bits of the path are visible to the fetcher.

I'm not convinced this is great design but it does appear to work.

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
---
 lib/bb/checksum.py | 12 +++++++++++-
 lib/bb/siggen.py   | 11 ++++++++++-
 2 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/lib/bb/checksum.py b/lib/bb/checksum.py
index 1d50a26426..fb8a77f6ab 100644
--- a/lib/bb/checksum.py
+++ b/lib/bb/checksum.py
@@ -50,6 +50,7 @@ class FileChecksumCache(MultiProcessCache):
         MultiProcessCache.__init__(self)
 
     def get_checksum(self, f):
+        f = os.path.normpath(f)
         entry = self.cachedata[0].get(f)
         cmtime = self.mtime_cache.cached_mtime(f)
         if entry:
@@ -84,15 +85,24 @@ class FileChecksumCache(MultiProcessCache):
                 return None
             return checksum
 
+        #
+        # Changing the format of file-checksums is problematic as both OE and Bitbake have
+        # knowledge of them. We need to encode a new piece of data, the portion of the path
+        # we care about from a checksum perspective. This means that files that change subdirectory
+        # are tracked by the task hashes. To do this, we do something horrible and put a "/./" into
+        # the path. The filesystem handles it but it gives us a marker to know which subsection
+        # of the path to cache.
+        #
         def checksum_dir(pth):
             # Handle directories recursively
             if pth == "/":
                 bb.fatal("Refusing to checksum /")
+            pth = pth.rstrip("/")
             dirchecksums = []
             for root, dirs, files in os.walk(pth, topdown=True):
                 [dirs.remove(d) for d in list(dirs) if d in localdirsexclude]
                 for name in files:
-                    fullpth = os.path.join(root, name)
+                    fullpth = os.path.join(root, name).replace(pth, os.path.join(pth, "."))
                     checksum = checksum_file(fullpth)
                     if checksum:
                         dirchecksums.append((fullpth, checksum))
diff --git a/lib/bb/siggen.py b/lib/bb/siggen.py
index 578ba5d661..44965c8cca 100644
--- a/lib/bb/siggen.py
+++ b/lib/bb/siggen.py
@@ -328,6 +328,8 @@ class SignatureGeneratorBasic(SignatureGenerator):
 
         for (f, cs) in self.file_checksum_values[tid]:
             if cs:
+                if "/./" in f:
+                    data = data + "./" + f.split("/./")[1]
                 data = data + cs
 
         if tid in self.taints:
@@ -385,7 +387,12 @@ class SignatureGeneratorBasic(SignatureGenerator):
 
         if runtime and tid in self.taskhash:
             data['runtaskdeps'] = self.runtaskdeps[tid]
-            data['file_checksum_values'] = [(os.path.basename(f), cs) for f,cs in self.file_checksum_values[tid]]
+            data['file_checksum_values'] = []
+            for f,cs in self.file_checksum_values[tid]:
+                if "/./" in f:
+                    data['file_checksum_values'].append(("./" + f.split("/./")[1], cs))
+                else:
+                    data['file_checksum_values'].append((os.path.basename(f), cs))
             data['runtaskhashes'] = {}
             for dep in data['runtaskdeps']:
                 data['runtaskhashes'][dep] = self.get_unihash(dep)
@@ -1028,6 +1035,8 @@ def calc_taskhash(sigdata):
 
     for c in sigdata['file_checksum_values']:
         if c[1]:
+            if "./" in c[0]:
+                data = data + c[0]
             data = data + c[1]
 
     if 'taint' in sigdata:
-- 
2.32.0



^ permalink raw reply related	[flat|nested] only message in thread

only message in thread, other threads:[~2021-11-08 15:11 UTC | newest]

Thread overview: (only message) (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-08 15:11 [PATCH v2] fetch2/checksum/siggen: Fix taskhashes not tracking file directories Richard Purdie

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.