All of lore.kernel.org
 help / color / mirror / Atom feed
* SRC_URI Directory Change
@ 2021-10-15 10:03 Chuck Wolber
  2021-10-15 13:58 ` [yocto] " Richard Purdie
  0 siblings, 1 reply; 3+ messages in thread
From: Chuck Wolber @ 2021-10-15 10:03 UTC (permalink / raw)
  To: yocto

[-- Attachment #1: Type: text/plain, Size: 916 bytes --]

Is there a recommended strategy to get do_fetch to invalidate on directory
path changes in paths pointed to by file:// URLs in SRC_URI?

Example:

SRC_URI += "file://src;subdir=${S}"

A file at src/foo/bar/baz is recognized just fine. But then a directory
change to something like src/foo/bar2/baz is not recognized and does not
invalidate any tasks in subsequent builds.

Use case is a recipe that has a fair bit of metadata that is not even
remotely amenable to the typical flat layout expected of a set of patches.

..Ch:W..

P.S. I attempted to do this with an event handler that was run when
bb.event.RecipePreFinalise is is fired. It would compare directory trees
and set do_fetch[nostamp] = "1" to invalidate the fetcher task. But I got
really spotty behavior. It seems like event handlers are cached like tasks.


-- 
*"Perfection must be reached by degrees; she requires the slow hand of
time." - Voltaire*

[-- Attachment #2: Type: text/html, Size: 1338 bytes --]

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [yocto] SRC_URI Directory Change
  2021-10-15 10:03 SRC_URI Directory Change Chuck Wolber
@ 2021-10-15 13:58 ` Richard Purdie
  2021-10-17 21:59   ` Chuck Wolber
  0 siblings, 1 reply; 3+ messages in thread
From: Richard Purdie @ 2021-10-15 13:58 UTC (permalink / raw)
  To: Chuck Wolber, yocto

On Fri, 2021-10-15 at 03:03 -0700, Chuck Wolber wrote:
> Is there a recommended strategy to get do_fetch to invalidate on directory
> path changes in paths pointed to by file:// URLs in SRC_URI?
> 
> Example:
> 
> SRC_URI += "file://src;subdir=${S}"
> 
> A file at src/foo/bar/baz is recognized just fine. But then a directory change
> to something like src/foo/bar2/baz is not recognized and does not invalidate
> any tasks in subsequent builds.
> 
> Use case is a recipe that has a fair bit of metadata that is not even remotely
> amenable to the typical flat layout expected of a set of patches.
> 
> ..Ch:W..
> 
> P.S. I attempted to do this with an event handler that was run when
> bb.event.RecipePreFinalise is is fired. It would compare directory trees and
> set do_fetch[nostamp] = "1" to invalidate the fetcher task. But I got really
> spotty behavior. It seems like event handlers are cached like tasks.

I can see why this breaks and it isn't entirely straightforward to fix since
we'd have to add data to the file-checksums entries which are generated by both
OE and Bitbake.

I have a bit of a horrible idea to do this in master-next which does solve the
problem. We probably need a new bitbake selftest testcase before I could think
about merging it though.

Cheers,

Richard



^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [yocto] SRC_URI Directory Change
  2021-10-15 13:58 ` [yocto] " Richard Purdie
@ 2021-10-17 21:59   ` Chuck Wolber
  0 siblings, 0 replies; 3+ messages in thread
From: Chuck Wolber @ 2021-10-17 21:59 UTC (permalink / raw)
  To: Richard Purdie; +Cc: yocto

[-- Attachment #1: Type: text/plain, Size: 5650 bytes --]

On Fri, Oct 15, 2021 at 6:58 AM Richard Purdie <
richard.purdie@linuxfoundation.org> wrote:

>
> I have a bit of a horrible idea to do this in master-next which does solve
> the
> problem. We probably need a new bitbake selftest testcase before I could
> think
> about merging it though.
>

I tested the patch you posted on IRC against bitbake 1.50.0. It seems to
work
quite well, but I think I found two minor bugs in it. I pasted an updated
patch below.

In a nutshell...

If you have a SRC_URI file:// entry that ends in a "/", then you get a "."
that is
not enclosed in "/./" when running the checksum_dir method, which causes
checksum_file to throw warnings that look like this:

WARNING: Unable to get checksum for gettext-minimal-native SRC_URI entry
.intlmacosx.m4: [Errno 2] No such file or directory:
'/mnt/openembedded-core/meta/recipes-core/gettext/gettext-minimal-0.21/aclocal/.intlmacosx.m4'

I think the simplest fix is to add a "pth = pth.rstrip("/")" in the
checksum_dir method after
the guard statement. But then that exposes a different issue in the patch.

For recipes that trigger the above warning, the rstrip() fix fixes the
warning, but then you
expose a new problem in siggen.py calc_taskhash().

In your patched version, the check for the "/" fails to include the
filename in the hash calculation
for files at the root of a SRC_URI entry (there is no "/" to be found).
This results in a task hash
mismatch error on the first build, but not on subsequent builds for fairly
obvious reasons.

I solved this by adding a third field to the tuple with a True/False value,
which is a much more reliable semaphore (IMHO).

I tested these fixes and it worked perfectly. Here is an updated version of
your patch that takes into
account the fixes I described. I can produce a "patch against your patch"
if these fixes are
considered correct and do not cause even bigger problems that are not
obvious to me.

diff --git a/bitbake/lib/bb/checksum.py b/bitbake/lib/bb/checksum.py
index 1d50a26426..fb8a77f6ab 100644
--- a/bitbake/lib/bb/checksum.py
+++ b/bitbake/lib/bb/checksum.py
@@ -50,6 +50,7 @@ class FileChecksumCache(MultiProcessCache):
         MultiProcessCache.__init__(self)

     def get_checksum(self, f):
+        f = os.path.normpath(f)
         entry = self.cachedata[0].get(f)
         cmtime = self.mtime_cache.cached_mtime(f)
         if entry:
@@ -84,15 +85,24 @@ class FileChecksumCache(MultiProcessCache):
                 return None
             return checksum

+        #
+        # Changing the format of file-checksums is problematic as
both OE and Bitbake have
+        # knowledge of them. We need to encode a new piece of data,
the portion of the path
+        # we care about from a checksum perspective. This means that
files that change subdirectory
+        # are tracked by the task hashes. To do this, we do something
horrible and put a "/./" into
+        # the path. The filesystem handles it but it gives us a
marker to know which subsection
+        # of the path to cache.
+        #
         def checksum_dir(pth):
             # Handle directories recursively
             if pth == "/":
                 bb.fatal("Refusing to checksum /")
+            pth = pth.rstrip("/")
             dirchecksums = []
             for root, dirs, files in os.walk(pth, topdown=True):
                 [dirs.remove(d) for d in list(dirs) if d in localdirsexclude]
                 for name in files:
-                    fullpth = os.path.join(root, name)
+                    fullpth = os.path.join(root, name).replace(pth,
os.path.join(pth, "."))
                     checksum = checksum_file(fullpth)
                     if checksum:
                         dirchecksums.append((fullpth, checksum))
diff --git a/bitbake/lib/bb/siggen.py b/bitbake/lib/bb/siggen.py
index 0d88c6ec68..f649f39ced 100644
--- a/bitbake/lib/bb/siggen.py
+++ b/bitbake/lib/bb/siggen.py
@@ -308,13 +308,14 @@ class SignatureGeneratorBasic(SignatureGenerator):
         return

     def get_taskhash(self, tid, deps, dataCaches):
-
         data = self.basehash[tid]
         for dep in self.runtaskdeps[tid]:
             data = data + self.get_unihash(dep)

         for (f, cs) in self.file_checksum_values[tid]:
             if cs:
+                if "/./" in f:
+                    data = data + f.split("/./")[1]
                 data = data + cs

         if tid in self.taints:
@@ -372,7 +373,12 @@ class SignatureGeneratorBasic(SignatureGenerator):

         if runtime and tid in self.taskhash:
             data['runtaskdeps'] = self.runtaskdeps[tid]
-            data['file_checksum_values'] = [(os.path.basename(f), cs)
for f,cs in self.file_checksum_values[tid]]
+            data['file_checksum_values'] = []
+            for f,cs in self.file_checksum_values[tid]:
+                if "/./" in f:
+
data['file_checksum_values'].append((f.split("/./")[1], cs, True))
+                else:
+
data['file_checksum_values'].append((os.path.basename(f), cs, False))
             data['runtaskhashes'] = {}
             for dep in data['runtaskdeps']:
                 data['runtaskhashes'][dep] = self.get_unihash(dep)
@@ -1017,6 +1023,8 @@ def calc_taskhash(sigdata):

     for c in sigdata['file_checksum_values']:
         if c[1]:
+            if c[2]:
+                data = data + c[0]
             data = data + c[1]

     if 'taint' in sigdata:


..Ch:W..

P.S. The gettext-minimal-native_0.21.bb (from OEC) is a very good example
of something that triggers this behavior. But I found plenty of others as
well.

-- 
*"Perfection must be reached by degrees; she requires the slow hand of
time." - Voltaire*

[-- Attachment #2: Type: text/html, Size: 7094 bytes --]

^ permalink raw reply related	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2021-10-17 22:00 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-10-15 10:03 SRC_URI Directory Change Chuck Wolber
2021-10-15 13:58 ` [yocto] " Richard Purdie
2021-10-17 21:59   ` Chuck Wolber

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.