* [PATCH 01/10] runqueue: Fix corruption issue
@ 2019-08-14 13:53 Richard Purdie
  2019-08-14 13:53 ` [PATCH 02/10] runqueue: Improve setscene task handling logic Richard Purdie
                   ` (8 more replies)
  0 siblings, 9 replies; 10+ messages in thread
From: Richard Purdie @ 2019-08-14 13:53 UTC (permalink / raw)
  To: bitbake-devel

We need to copy this set rather than modify the original, else all kinds
of weird and bad things break, mostly from circular references.
We'll not go into how much sleep I lost tracking down the fallout
from this.
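
To illustrate the aliasing problem with a standalone sketch (toy task names,
nothing bitbake-specific): mutating the set handed back from the task entry
leaks straight into the shared dependency data, whereas a copy does not.

    # Toy illustration of why .copy() matters when the set is shared.
    def build_deps_buggy(task, depends):
        next = depends            # aliases the caller's set
        next.add(task)            # mutates the shared data
        return next

    def build_deps_fixed(task, depends):
        next = depends.copy()     # private working copy
        next.add(task)
        return next

    shared = {"a1:do_compile"}
    build_deps_buggy("a1:do_install", shared)
    print(shared)                 # {'a1:do_compile', 'a1:do_install'} - corrupted

    shared = {"a1:do_compile"}
    build_deps_fixed("a1:do_install", shared)
    print(shared)                 # {'a1:do_compile'} - original left intact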

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
---
 lib/bb/runqueue.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/bb/runqueue.py b/lib/bb/runqueue.py
index bb61087359..2bf19b9778 100644
--- a/lib/bb/runqueue.py
+++ b/lib/bb/runqueue.py
@@ -2124,7 +2124,7 @@ class RunQueueExecute:
     # as most code can't handle them
     def build_taskdepdata(self, task):
         taskdepdata = {}
-        next = self.rqdata.runtaskentries[task].depends
+        next = self.rqdata.runtaskentries[task].depends.copy()
         next.add(task)
         next = self.filtermcdeps(task, next)
         while next:
-- 
2.20.1




* [PATCH 02/10] runqueue: Improve setscene task handling logic
  2019-08-14 13:53 [PATCH 01/10] runqueue: Fix corruption issue Richard Purdie
@ 2019-08-14 13:53 ` Richard Purdie
  2019-08-14 13:53 ` [PATCH 03/10] tests/runqueue: Add further hash equivalence tests Richard Purdie
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Richard Purdie @ 2019-08-14 13:53 UTC (permalink / raw)
  To: bitbake-devel

The previous tasks_covered and tasks_notcovered were basically unstable
data structures. We couldn't always tell whether tasks should be covered
or not when trying to repair the structure after sstate tasks reran.

In the end it's simpler to throw the lists away and rebuild them from
the current data rather than trying to patch them ad hoc. This turns out
to be much more reliable and I have far more confidence in this code.
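
Roughly, the rebuild amounts to the following set algebra (a simplified
sketch using plain sets and hypothetical argument names, not the real
runqueue state):

    # Simplified sketch of rebuilding covered/notcovered from current data.
    def rebuild_coverage(sq_covered, sq_notcovered, cantskip, unskippable,
                         setscene_tids, covered_tasks, scenequeue_done):
        notcovered = set(sq_notcovered) | cantskip
        for tid in sq_notcovered:
            notcovered |= covered_tasks[tid]       # tasks a missing setscene covered
        notcovered |= unskippable - setscene_tids  # unskippable non-setscene tasks
        notcovered &= scenequeue_done              # only tasks already processed

        covered = set(sq_covered)
        for tid in sq_covered:
            covered |= covered_tasks[tid]
        covered -= notcovered                      # notcovered always wins
        covered &= scenequeue_done
        return covered, notcovered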

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
---
 lib/bb/runqueue.py | 109 ++++++++++++++-------------------------------
 1 file changed, 34 insertions(+), 75 deletions(-)

diff --git a/lib/bb/runqueue.py b/lib/bb/runqueue.py
index 2bf19b9778..29786c400b 100644
--- a/lib/bb/runqueue.py
+++ b/lib/bb/runqueue.py
@@ -1730,7 +1730,9 @@ class RunQueueExecute:
         self.tasks_notcovered = set()
         self.scenequeue_notneeded = set()
 
-        self.coveredtopocess = set()
+        # We can't skip specified target tasks which aren't setscene tasks
+        self.cantskip = set(self.rqdata.target_tids)
+        self.cantskip.difference_update(self.rqdata.runq_setscene_tids)
 
         schedulers = self.get_schedulers()
         for scheduler in schedulers:
@@ -2235,12 +2237,12 @@ class RunQueueExecute:
             if not valid:
                 continue
 
+            if tid in self.tasks_scenequeue_done:
+                self.tasks_scenequeue_done.remove(tid)
             for dep in self.sqdata.sq_covered_tasks[tid]:
                 if dep not in self.runq_complete:
-                    if dep in self.tasks_scenequeue_done:
+                    if dep in self.tasks_scenequeue_done and dep not in self.sqdata.unskippable:
                         self.tasks_scenequeue_done.remove(dep)
-                    if dep in self.tasks_notcovered:
-                        self.tasks_notcovered.remove(dep)
 
             if tid in self.sq_buildable:
                 self.sq_buildable.remove(tid)
@@ -2254,6 +2256,8 @@ class RunQueueExecute:
                 self.sqdata.outrightfail.remove(tid)
             if tid in self.scenequeue_notcovered:
                 self.scenequeue_notcovered.remove(tid)
+            if tid in self.scenequeue_covered:
+                self.scenequeue_covered.remove(tid)
 
             (mc, fn, taskname, taskfn) = split_tid_mcfn(tid)
             self.sqdata.stamps[tid] = bb.build.stampfile(taskname + "_setscene", self.rqdata.dataCaches[mc], taskfn, noextra=True)
@@ -2269,41 +2273,8 @@ class RunQueueExecute:
         if changes:
             self.update_holdofftasks()
 
-    def scenequeue_process_notcovered(self, task):
-        if len(self.rqdata.runtaskentries[task].depends) == 0:
-            self.setbuildable(task)
-        notcovered = set([task])
-        while notcovered:
-            new = set()
-            for t in sorted(notcovered):
-                for deptask in sorted(self.rqdata.runtaskentries[t].depends):
-                    if deptask in notcovered or deptask in new or deptask in self.rqdata.runq_setscene_tids or deptask in self.tasks_notcovered:
-                        continue
-                    logger.debug(1, 'Task %s depends on non-setscene task %s so not skipping' % (t, deptask))
-                    new.add(deptask)
-                    self.tasks_notcovered.add(deptask)
-                    if len(self.rqdata.runtaskentries[deptask].depends) == 0:
-                        self.setbuildable(deptask)
-            notcovered = new
-
-    def scenequeue_process_unskippable(self, task):
-        # Look up the dependency chain for non-setscene things which depend on this task
-        # and mark as 'done'/notcovered
-        ready = set([task])
-        while ready:
-            new = set()
-            for t in sorted(ready):
-                for deptask in sorted(self.rqdata.runtaskentries[t].revdeps):
-                    if deptask in ready or deptask in new or deptask in self.tasks_scenequeue_done or deptask in self.rqdata.runq_setscene_tids:
-                        continue
-                    if deptask in self.sqdata.unskippable:
-                        new.add(deptask)
-                        self.tasks_scenequeue_done.add(deptask)
-                        self.tasks_notcovered.add(deptask)
-                        #logger.warning("Up: " + str(deptask))
-            ready = new
-
     def scenequeue_updatecounters(self, task, fail=False):
+
         for dep in sorted(self.sqdata.sq_deps[task]):
             if fail and task in self.sqdata.sq_harddeps and dep in self.sqdata.sq_harddeps[task]:
                 logger.debug(2, "%s was unavailable and is a hard dependency of %s so skipping" % (task, dep))
@@ -2325,39 +2296,30 @@ class RunQueueExecute:
                         continue
                     if self.rqdata.runtaskentries[dep].revdeps.issubset(self.tasks_scenequeue_done):
                         new.add(dep)
-                        #logger.warning(" Down: " + dep)
             next = new
 
-        if task in self.sqdata.unskippable:
-            self.scenequeue_process_unskippable(task)
-
-        if task in self.scenequeue_notcovered:
-            logger.debug(1, 'Not skipping setscene task %s', task)
-            self.scenequeue_process_notcovered(task)
-        elif task in self.scenequeue_covered:
-            logger.debug(1, 'Queued setscene task %s', task)
-            self.coveredtopocess.add(task)
-
-        for task in sorted(self.coveredtopocess.copy()):
-            if self.sqdata.sq_covered_tasks[task].issubset(self.tasks_scenequeue_done):
-                logger.debug(1, 'Processing setscene task %s', task)
-                covered = self.sqdata.sq_covered_tasks[task]
-                covered.add(task)
-
-                # If a task is in target_tids and isn't a setscene task, we can't skip it.
-                cantskip = covered.intersection(self.rqdata.target_tids).difference(self.rqdata.runq_setscene_tids)
-                for tid in sorted(cantskip):
-                    self.tasks_notcovered.add(tid)
-                    self.scenequeue_process_notcovered(tid)
-                covered.difference_update(cantskip)
-
-                # Remove notcovered tasks
-                covered.difference_update(self.tasks_notcovered)
-                self.tasks_covered.update(covered)
-                self.coveredtopocess.remove(task)
-                for tid in sorted(covered):
-                    if self.rqdata.runtaskentries[tid].depends.issubset(self.runq_complete):
-                        self.setbuildable(tid)
+        notcovered = set(self.scenequeue_notcovered)
+        notcovered |= self.cantskip
+        for tid in self.scenequeue_notcovered:
+            notcovered |= self.sqdata.sq_covered_tasks[tid]
+        notcovered |= self.sqdata.unskippable.difference(self.rqdata.runq_setscene_tids)
+        notcovered.intersection_update(self.tasks_scenequeue_done)
+
+        covered = set(self.scenequeue_covered)
+        for tid in self.scenequeue_covered:
+            covered |= self.sqdata.sq_covered_tasks[tid]
+        covered.difference_update(notcovered)
+        covered.intersection_update(self.tasks_scenequeue_done)
+
+        for tid in notcovered | covered:
+            if len(self.rqdata.runtaskentries[tid].depends) == 0:
+                self.setbuildable(tid)
+            elif self.rqdata.runtaskentries[tid].depends.issubset(self.runq_complete):
+                 self.setbuildable(tid)
+
+        self.tasks_covered = covered
+        self.tasks_notcovered = notcovered
+
         self.update_holdofftasks()
 
     def sq_task_completeoutright(self, task):
@@ -2369,7 +2331,6 @@ class RunQueueExecute:
 
         logger.debug(1, 'Found task %s which could be accelerated', task)
         self.scenequeue_covered.add(task)
-        self.tasks_covered.add(task)
         self.scenequeue_updatecounters(task)
 
     def sq_check_taskfail(self, task):
@@ -2390,7 +2351,6 @@ class RunQueueExecute:
         self.sq_stats.taskFailed()
         bb.event.fire(sceneQueueTaskFailed(task, self.sq_stats, result, self), self.cfgData)
         self.scenequeue_notcovered.add(task)
-        self.tasks_notcovered.add(task)
         self.scenequeue_updatecounters(task, True)
         self.sq_check_taskfail(task)
 
@@ -2400,7 +2360,6 @@ class RunQueueExecute:
         self.sq_stats.taskSkipped()
         self.sq_stats.taskCompleted()
         self.scenequeue_notcovered.add(task)
-        self.tasks_notcovered.add(task)
         self.scenequeue_updatecounters(task, True)
 
     def sq_task_skip(self, task):
@@ -2564,6 +2523,7 @@ def build_scenequeue_data(sqdata, rqdata, rq, cooker, stampcache, sqrq):
     for tid in rqdata.runtaskentries:
         if len(rqdata.runtaskentries[tid].revdeps) == 0:
             sqdata.unskippable.add(tid)
+    sqdata.unskippable |= sqrq.cantskip
     while new:
         new = False
         orig = sqdata.unskippable.copy()
@@ -2572,14 +2532,13 @@ def build_scenequeue_data(sqdata, rqdata, rq, cooker, stampcache, sqrq):
                 continue
             if len(rqdata.runtaskentries[tid].depends) == 0:
                 # These are tasks which have no setscene tasks in their chain, need to mark as directly buildable
-                sqrq.tasks_notcovered.add(tid)
-                sqrq.tasks_scenequeue_done.add(tid)
                 sqrq.setbuildable(tid)
-            sqrq.scenequeue_process_unskippable(tid)
             sqdata.unskippable |= rqdata.runtaskentries[tid].depends
             if sqdata.unskippable != orig:
                 new = True
 
+    sqrq.tasks_scenequeue_done |= sqdata.unskippable.difference(rqdata.runq_setscene_tids)
+
     rqdata.init_progress_reporter.next_stage(len(rqdata.runtaskentries))
 
     # Sanity check all dependencies could be changed to setscene task references
-- 
2.20.1




* [PATCH 03/10] tests/runqueue: Add further hash equivalence tests
  2019-08-14 13:53 [PATCH 01/10] runqueue: Fix corruption issue Richard Purdie
  2019-08-14 13:53 ` [PATCH 02/10] runqueue: Improve setscene task handling logic Richard Purdie
@ 2019-08-14 13:53 ` Richard Purdie
  2019-08-14 13:53 ` [PATCH 04/10] cooker: Improve hash server startup code to avoid exit tracebacks Richard Purdie
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Richard Purdie @ 2019-08-14 13:53 UTC (permalink / raw)
  To: bitbake-devel

Add some extra hash equivalence runqueue tests based on recent scenarios
that caused problems during testing.

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
---
 .../tests/runqueue-tests/classes/base.bbclass |  20 ++-
 lib/bb/tests/runqueue-tests/conf/bitbake.conf |   2 +-
 lib/bb/tests/runqueue-tests/recipes/e1.bb     |   1 +
 lib/bb/tests/runqueue.py                      | 155 +++++++++++++++++-
 4 files changed, 166 insertions(+), 12 deletions(-)
 create mode 100644 lib/bb/tests/runqueue-tests/recipes/e1.bb

diff --git a/lib/bb/tests/runqueue-tests/classes/base.bbclass b/lib/bb/tests/runqueue-tests/classes/base.bbclass
index 138edc3fa9..b57650d591 100644
--- a/lib/bb/tests/runqueue-tests/classes/base.bbclass
+++ b/lib/bb/tests/runqueue-tests/classes/base.bbclass
@@ -7,7 +7,7 @@ def stamptask(d):
     thistask = d.expand("${PN}:${BB_CURRENTTASK}")
     stampname = d.expand("${TOPDIR}/%s.run" % thistask)
     with open(stampname, "a+") as f:
-        f.write("\n")
+        f.write(d.getVar("BB_UNIHASH") + "\n")
 
     if d.getVar("BB_CURRENT_MC") != "default":
         thistask = d.expand("${BB_CURRENT_MC}:${PN}:${BB_CURRENTTASK}")
@@ -235,18 +235,28 @@ def sstate_checkhashes(sq_data, d, siginfo=False, currentcount=0, **kwargs):
 
     valid = d.getVar("SSTATEVALID").split()
 
-    for tid in sq_data['hash']:
+    for tid in sorted(sq_data['hash']):
         n = os.path.basename(bb.runqueue.fn_from_tid(tid)).split(".")[0] + ":do_" + bb.runqueue.taskname_from_tid(tid)[3:]
         print(n)
+        stampfile = d.expand("${TOPDIR}/%s.run" % n.replace("do_", ""))
         if n in valid:
             bb.note("SState: Found valid sstate for %s" % n)
             found.add(tid)
-        elif os.path.exists(d.expand("${TOPDIR}/%s.run" % n.replace("do_", ""))):
-            bb.note("SState: Found valid sstate for %s (already run)" % n)
+        elif n + ":" + sq_data['hash'][tid] in valid:
+            bb.note("SState: Found valid sstate for %s" % n)
             found.add(tid)
+        elif os.path.exists(stampfile):
+            with open(stampfile, "r") as f:
+                hash = f.readline().strip()
+            if hash == sq_data['hash'][tid]:
+                bb.note("SState: Found valid sstate for %s (already run)" % n)
+                found.add(tid)
+            else:
+                bb.note("SState: sstate hash didn't match previous run for %s (%s vs %s)" % (n, sq_data['hash'][tid], hash))
+                missed.add(tid)
         else:
             missed.add(tid)
-            bb.note("SState: Found no valid sstate for %s" % n)
+            bb.note("SState: Found no valid sstate for %s (%s)" % (n, sq_data['hash'][tid]))
 
     return found
 
diff --git a/lib/bb/tests/runqueue-tests/conf/bitbake.conf b/lib/bb/tests/runqueue-tests/conf/bitbake.conf
index 1c61f27607..ab0f6bcfac 100644
--- a/lib/bb/tests/runqueue-tests/conf/bitbake.conf
+++ b/lib/bb/tests/runqueue-tests/conf/bitbake.conf
@@ -11,6 +11,6 @@ STAMP = "${TMPDIR}/stamps/${PN}"
 T = "${TMPDIR}/workdir/${PN}/temp"
 BB_NUMBER_THREADS = "4"
 
-BB_HASHBASE_WHITELIST = "BB_CURRENT_MC BB_HASHSERVE"
+BB_HASHBASE_WHITELIST = "BB_CURRENT_MC BB_HASHSERVE TMPDIR TOPDIR SLOWTASKS SSTATEVALID"
 
 include conf/multiconfig/${BB_CURRENT_MC}.conf
diff --git a/lib/bb/tests/runqueue-tests/recipes/e1.bb b/lib/bb/tests/runqueue-tests/recipes/e1.bb
new file mode 100644
index 0000000000..1588bc8a59
--- /dev/null
+++ b/lib/bb/tests/runqueue-tests/recipes/e1.bb
@@ -0,0 +1 @@
+DEPENDS = "b1"
\ No newline at end of file
diff --git a/lib/bb/tests/runqueue.py b/lib/bb/tests/runqueue.py
index fbdacccfa1..1103f905f9 100644
--- a/lib/bb/tests/runqueue.py
+++ b/lib/bb/tests/runqueue.py
@@ -40,10 +40,12 @@ class RunQueueTests(unittest.TestCase):
         except subprocess.CalledProcessError as e:
             self.fail("Command %s failed with %s" % (cmd, e.output))
         tasks = []
-        with open(builddir + "/task.log", "r") as f:
-            tasks = [line.rstrip() for line in f]
-        if cleanup:
-            os.remove(builddir + "/task.log")
+        tasklog = builddir + "/task.log"
+        if os.path.exists(tasklog):
+            with open(tasklog, "r") as f:
+                tasks = [line.rstrip() for line in f]
+            if cleanup:
+                os.remove(tasklog)
         return tasks
 
     def test_no_setscenevalid(self):
@@ -229,7 +231,7 @@ class RunQueueTests(unittest.TestCase):
             self.assertEqual(set(tasks), set(expected))
 
 
-    def test_hashserv(self):
+    def test_hashserv_single(self):
         with tempfile.TemporaryDirectory(prefix="runqueuetest") as tempdir:
             extraenv = {
                 "BB_HASHSERVE" : "localhost:0",
@@ -248,6 +250,147 @@ class RunQueueTests(unittest.TestCase):
             self.assertEqual(set(tasks), set(expected))
             cmd = ["bitbake", "a1", "b1"]
             tasks = self.run_bitbakecmd(cmd, tempdir, sstatevalid, extraenv=extraenv, cleanup=True)
-            expected = ['a1:' + x for x in setscenetasks] + ['b1:' + x for x in setscenetasks] + ['a1:build', 'b1:build']
+            expected = ['a1:populate_sysroot', 'a1:package', 'a1:package_write_rpm_setscene', 'a1:packagedata_setscene',
+                        'a1:package_write_ipk_setscene', 'a1:package_qa_setscene']
             self.assertEqual(set(tasks), set(expected))
 
+    def test_hashserv_double(self):
+        with tempfile.TemporaryDirectory(prefix="runqueuetest") as tempdir:
+            extraenv = {
+                "BB_HASHSERVE" : "localhost:0",
+                "BB_SIGNATURE_HANDLER" : "TestEquivHash"
+            }
+            cmd = ["bitbake", "a1", "b1", "e1"]
+            setscenetasks = ['package_write_ipk_setscene', 'package_write_rpm_setscene', 'packagedata_setscene',
+                             'populate_sysroot_setscene', 'package_qa_setscene']
+            sstatevalid = ""
+            tasks = self.run_bitbakecmd(cmd, tempdir, sstatevalid, extraenv=extraenv, cleanup=True)
+            expected = ['a1:' + x for x in self.alltasks] + ['b1:' + x for x in self.alltasks] + ['e1:' + x for x in self.alltasks]
+            self.assertEqual(set(tasks), set(expected))
+            cmd = ["bitbake", "a1", "b1", "-c", "install", "-fn"]
+            tasks = self.run_bitbakecmd(cmd, tempdir, sstatevalid, extraenv=extraenv, cleanup=True)
+            cmd = ["bitbake", "e1"]
+            tasks = self.run_bitbakecmd(cmd, tempdir, sstatevalid, extraenv=extraenv, cleanup=True)
+            expected = ['a1:package', 'a1:install', 'b1:package', 'b1:install', 'a1:populate_sysroot', 'b1:populate_sysroot',
+                        'a1:package_write_ipk_setscene', 'b1:packagedata_setscene', 'b1:package_write_rpm_setscene',
+                        'a1:package_write_rpm_setscene', 'b1:package_write_ipk_setscene', 'a1:packagedata_setscene']
+            self.assertEqual(set(tasks), set(expected))
+
+
+    def test_hashserv_multiple_setscene(self):
+        # Runs e1:do_package_setscene twice
+        with tempfile.TemporaryDirectory(prefix="runqueuetest") as tempdir:
+            extraenv = {
+                "BB_HASHSERVE" : "localhost:0",
+                "BB_SIGNATURE_HANDLER" : "TestEquivHash"
+            }
+            cmd = ["bitbake", "a1", "b1", "e1"]
+            setscenetasks = ['package_write_ipk_setscene', 'package_write_rpm_setscene', 'packagedata_setscene',
+                             'populate_sysroot_setscene', 'package_qa_setscene']
+            sstatevalid = ""
+            tasks = self.run_bitbakecmd(cmd, tempdir, sstatevalid, extraenv=extraenv, cleanup=True)
+            expected = ['a1:' + x for x in self.alltasks] + ['b1:' + x for x in self.alltasks] + ['e1:' + x for x in self.alltasks]
+            self.assertEqual(set(tasks), set(expected))
+            cmd = ["bitbake", "a1", "b1", "-c", "install", "-fn"]
+            tasks = self.run_bitbakecmd(cmd, tempdir, sstatevalid, extraenv=extraenv, cleanup=True)
+            cmd = ["bitbake", "e1"]
+            sstatevalid = "e1:do_package"
+            tasks = self.run_bitbakecmd(cmd, tempdir, sstatevalid, extraenv=extraenv, cleanup=True, slowtasks="a1:populate_sysroot b1:populate_sysroot")
+            expected = ['a1:package', 'a1:install', 'b1:package', 'b1:install', 'a1:populate_sysroot', 'b1:populate_sysroot',
+                        'a1:package_write_ipk_setscene', 'b1:packagedata_setscene', 'b1:package_write_rpm_setscene',
+                        'a1:package_write_rpm_setscene', 'b1:package_write_ipk_setscene', 'a1:packagedata_setscene',
+                        'e1:package_setscene']
+            self.assertEqual(set(tasks), set(expected))
+            for i in expected:
+                if i in ["e1:package_setscene"]:
+                    self.assertEqual(tasks.count(i), 4, "%s not in task list four times" % i)
+                else:
+                    self.assertEqual(tasks.count(i), 1, "%s not in task list once" % i)
+
+    def test_hashserv_partial_match(self):
+        # e1:do_package matches initial built but not second hash value
+        with tempfile.TemporaryDirectory(prefix="runqueuetest") as tempdir:
+            extraenv = {
+                "BB_HASHSERVE" : "localhost:0",
+                "BB_SIGNATURE_HANDLER" : "TestEquivHash"
+            }
+            cmd = ["bitbake", "a1", "b1"]
+            setscenetasks = ['package_write_ipk_setscene', 'package_write_rpm_setscene', 'packagedata_setscene',
+                             'populate_sysroot_setscene', 'package_qa_setscene']
+            sstatevalid = ""
+            tasks = self.run_bitbakecmd(cmd, tempdir, sstatevalid, extraenv=extraenv, cleanup=True)
+            expected = ['a1:' + x for x in self.alltasks] + ['b1:' + x for x in self.alltasks]
+            self.assertEqual(set(tasks), set(expected))
+            with open(tempdir + "/stamps/a1.do_install.taint", "w") as f:
+               f.write("d460a29e-903f-4b76-a96b-3bcc22a65994")
+            with open(tempdir + "/stamps/b1.do_install.taint", "w") as f:
+               f.write("ed36d46a-2977-458a-b3de-eef885bc1817")
+            cmd = ["bitbake", "e1"]
+            sstatevalid = "e1:do_package:cb47e017ab549d87aab614c0f49dcf969ff6414745909094f0af7e657cedc657"
+            tasks = self.run_bitbakecmd(cmd, tempdir, sstatevalid, extraenv=extraenv, cleanup=True)
+            expected = ['a1:package', 'a1:install', 'b1:package', 'b1:install', 'a1:populate_sysroot', 'b1:populate_sysroot',
+                        'a1:package_write_ipk_setscene', 'b1:packagedata_setscene', 'b1:package_write_rpm_setscene',
+                        'a1:package_write_rpm_setscene', 'b1:package_write_ipk_setscene', 'a1:packagedata_setscene',
+                        'e1:package_setscene'] + ['e1:' + x for x in self.alltasks]
+            expected.remove('e1:package')
+            self.assertEqual(set(tasks), set(expected))
+
+    def test_hashserv_partial_match2(self):
+        # e1:do_package + e1:do_populate_sysroot matches initial built but not second hash value
+        with tempfile.TemporaryDirectory(prefix="runqueuetest") as tempdir:
+            extraenv = {
+                "BB_HASHSERVE" : "localhost:0",
+                "BB_SIGNATURE_HANDLER" : "TestEquivHash"
+            }
+            cmd = ["bitbake", "a1", "b1"]
+            setscenetasks = ['package_write_ipk_setscene', 'package_write_rpm_setscene', 'packagedata_setscene',
+                             'populate_sysroot_setscene', 'package_qa_setscene']
+            sstatevalid = ""
+            tasks = self.run_bitbakecmd(cmd, tempdir, sstatevalid, extraenv=extraenv, cleanup=True)
+            expected = ['a1:' + x for x in self.alltasks] + ['b1:' + x for x in self.alltasks]
+            self.assertEqual(set(tasks), set(expected))
+            with open(tempdir + "/stamps/a1.do_install.taint", "w") as f:
+               f.write("d460a29e-903f-4b76-a96b-3bcc22a65994")
+            with open(tempdir + "/stamps/b1.do_install.taint", "w") as f:
+               f.write("ed36d46a-2977-458a-b3de-eef885bc1817")
+            cmd = ["bitbake", "e1"]
+            sstatevalid = "e1:do_package:cb47e017ab549d87aab614c0f49dcf969ff6414745909094f0af7e657cedc657 e1:do_populate_sysroot:aa6a915229f04af429d3c6c59c303516c500650b7c48da8e07b20a53acd86c5f"
+            tasks = self.run_bitbakecmd(cmd, tempdir, sstatevalid, extraenv=extraenv, cleanup=True)
+            expected = ['a1:package', 'a1:install', 'b1:package', 'b1:install', 'a1:populate_sysroot', 'b1:populate_sysroot',
+                        'a1:package_write_ipk_setscene', 'b1:packagedata_setscene', 'b1:package_write_rpm_setscene',
+                        'a1:package_write_rpm_setscene', 'b1:package_write_ipk_setscene', 'a1:packagedata_setscene',
+                        'e1:package_setscene', 'e1:populate_sysroot_setscene', 'e1:build', 'e1:package_qa', 'e1:package_write_rpm', 'e1:package_write_ipk', 'e1:packagedata']
+            self.assertEqual(set(tasks), set(expected))
+
+
+    def test_hashserv_partial_match3(self):
+        # e1:do_package is valid for a1 but not after b1
+        # In former buggy code, this triggered e1:do_fetch, then e1:do_populate_sysroot to run
+        # with none of the intermediate tasks which is a serious bug
+        with tempfile.TemporaryDirectory(prefix="runqueuetest") as tempdir:
+            extraenv = {
+                "BB_HASHSERVE" : "localhost:0",
+                "BB_SIGNATURE_HANDLER" : "TestEquivHash"
+            }
+            cmd = ["bitbake", "a1", "b1"]
+            setscenetasks = ['package_write_ipk_setscene', 'package_write_rpm_setscene', 'packagedata_setscene',
+                             'populate_sysroot_setscene', 'package_qa_setscene']
+            sstatevalid = ""
+            tasks = self.run_bitbakecmd(cmd, tempdir, sstatevalid, extraenv=extraenv, cleanup=True)
+            expected = ['a1:' + x for x in self.alltasks] + ['b1:' + x for x in self.alltasks]
+            self.assertEqual(set(tasks), set(expected))
+            with open(tempdir + "/stamps/a1.do_install.taint", "w") as f:
+               f.write("d460a29e-903f-4b76-a96b-3bcc22a65994")
+            with open(tempdir + "/stamps/b1.do_install.taint", "w") as f:
+               f.write("ed36d46a-2977-458a-b3de-eef885bc1817")
+            cmd = ["bitbake", "e1", "-DD"]
+            sstatevalid = "e1:do_package:b710f6312ffed900b4b2761cc05538645f4ff3e7e0b70d688c70c0f3bcc2e1a2"
+            tasks = self.run_bitbakecmd(cmd, tempdir, sstatevalid, extraenv=extraenv, cleanup=True, slowtasks="e1:fetch")
+            expected = ['a1:package', 'a1:install', 'b1:package', 'b1:install', 'a1:populate_sysroot', 'b1:populate_sysroot',
+                        'a1:package_write_ipk_setscene', 'b1:packagedata_setscene', 'b1:package_write_rpm_setscene',
+                        'a1:package_write_rpm_setscene', 'b1:package_write_ipk_setscene', 'a1:packagedata_setscene',
+                        'e1:package_setscene']  + ['e1:' + x for x in self.alltasks]
+            expected.remove('e1:package')
+            self.assertEqual(set(tasks), set(expected))
+
+
-- 
2.20.1




* [PATCH 04/10] cooker: Improve hash server startup code to avoid exit tracebacks
  2019-08-14 13:53 [PATCH 01/10] runqueue: Fix corruption issue Richard Purdie
  2019-08-14 13:53 ` [PATCH 02/10] runqueue: Improve setscene task handling logic Richard Purdie
  2019-08-14 13:53 ` [PATCH 03/10] tests/runqueue: Add further hash equivalence tests Richard Purdie
@ 2019-08-14 13:53 ` Richard Purdie
  2019-08-14 13:53 ` [PATCH 05/10] runqueue: Wait for covered tasks to complete before trying setscene Richard Purdie
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Richard Purdie @ 2019-08-14 13:53 UTC (permalink / raw)
  To: bitbake-devel

At exit, the hashserv code was causing tracebacks because join() wasn't
being called from the thread that started the process. Ensure that
the hash server is started from the pre_serve hook, which is the
final thread the cooker runs in. This avoids the traceback at the
expense of some horrific poking into data stores, which will ultimately
need improving through a proper API.
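
Schematically, the startup now happens in the context that will keep
running (a sketch with hypothetical helper names, not the actual cooker
API):

    import multiprocessing

    def start_hash_server(create_server, record_address):
        """Start the server from the context that outlives it, so any later
        cleanup/join() happens where start() was called."""
        server = create_server(('localhost', 0))
        server.process = multiprocessing.Process(target=server.serve_forever)
        server.process.daemon = True
        server.process.start()
        record_address("localhost:" + str(server.server_port))
        return server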

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
---
 lib/bb/cooker.py         | 29 +++++++++++++++++------------
 lib/hashserv/__init__.py |  1 +
 2 files changed, 18 insertions(+), 12 deletions(-)

diff --git a/lib/bb/cooker.py b/lib/bb/cooker.py
index ec1b35d724..0607fcc708 100644
--- a/lib/bb/cooker.py
+++ b/lib/bb/cooker.py
@@ -371,22 +371,10 @@ class BBCooker:
 
         self.data.setVar('BB_CMDLINE', self.ui_cmdline)
 
-        if self.data.getVar("BB_HASHSERVE") == "localhost:0":
-            if not self.hashserv:
-                dbfile = (self.data.getVar("PERSISTENT_DIR") or self.data.getVar("CACHE")) + "/hashserv.db"
-                self.hashserv = hashserv.create_server(('localhost', 0), dbfile, '')
-                self.hashservport = "localhost:" + str(self.hashserv.server_port)
-                self.hashserv.process = multiprocessing.Process(target=self.hashserv.serve_forever)
-                self.hashserv.process.daemon = True
-                self.hashserv.process.start()
-            self.data.setVar("BB_HASHSERVE", self.hashservport)
-
         #
         # Copy of the data store which has been expanded.
         # Used for firing events and accessing variables where expansion needs to be accounted for
         #
-        bb.parse.init_parser(self.data)
-
         if CookerFeatures.BASEDATASTORE_TRACKING in self.featureset:
             self.disableDataTracking()
 
@@ -404,6 +392,22 @@ class BBCooker:
         except prserv.serv.PRServiceConfigError as e:
             bb.fatal("Unable to start PR Server, exitting")
 
+        if self.data.getVar("BB_HASHSERVE") == "localhost:0":
+            if not self.hashserv:
+                dbfile = (self.data.getVar("PERSISTENT_DIR") or self.data.getVar("CACHE")) + "/hashserv.db"
+                self.hashserv = hashserv.create_server(('localhost', 0), dbfile, '')
+                self.hashservport = "localhost:" + str(self.hashserv.server_port)
+                self.hashserv.process = multiprocessing.Process(target=self.hashserv.serve_forever)
+                self.hashserv.process.daemon = True
+                self.hashserv.process.start()
+            self.data.setVar("BB_HASHSERVE", self.hashservport)
+            self.databuilder.origdata.setVar("BB_HASHSERVE", self.hashservport)
+            self.databuilder.data.setVar("BB_HASHSERVE", self.hashservport)
+            for mc in self.databuilder.mcdata:
+                self.databuilder.mcdata[mc].setVar("BB_HASHSERVE", self.hashservport)
+
+        bb.parse.init_parser(self.data)
+
     def enableDataTracking(self):
         self.configuration.tracking = True
         if hasattr(self, "data"):
@@ -1677,6 +1681,7 @@ class BBCooker:
 
     def reset(self):
         self.initConfigurationData()
+        self.handlePRServ()
 
     def clientComplete(self):
         """Called when the client is done using the server"""
diff --git a/lib/hashserv/__init__.py b/lib/hashserv/__init__.py
index 1d5e08ee5a..55966e748a 100644
--- a/lib/hashserv/__init__.py
+++ b/lib/hashserv/__init__.py
@@ -151,6 +151,7 @@ class ThreadedHTTPServer(HTTPServer):
 
         signal.signal(signal.SIGTERM, self.sigterm_exception)
         super().serve_forever()
+        os._exit(0)
 
     def sigterm_exception(self, signum, stackframe):
         self.server_close()
-- 
2.20.1




* [PATCH 05/10] runqueue: Wait for covered tasks to complete before trying setscene
  2019-08-14 13:53 [PATCH 01/10] runqueue: Fix corruption issue Richard Purdie
                   ` (2 preceding siblings ...)
  2019-08-14 13:53 ` [PATCH 04/10] cooker: Improve hash server startup code to avoid exit tracebacks Richard Purdie
@ 2019-08-14 13:53 ` Richard Purdie
  2019-08-14 13:53 ` [PATCH 06/10] runqueue: Fix next_buildable_task performance problem Richard Purdie
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Richard Purdie @ 2019-08-14 13:53 UTC (permalink / raw)
  To: bitbake-devel

If tasks in the covered list for a given setscene task are still running,
the setscene task needs to wait for them to complete before it can start.
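
In simplified form, the condition being added amounts to the following
(hypothetical set names, not the exact runqueue code):

    # Simplified form of the "wait for covered tasks" condition.
    def covered_tasks_still_running(covered_tasks, running, complete):
        """True while any task covered by the setscene task has started
        but not yet finished; the setscene task should be held back."""
        return any(t in running and t not in complete for t in covered_tasks)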

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
---
 lib/bb/runqueue.py | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/lib/bb/runqueue.py b/lib/bb/runqueue.py
index 29786c400b..9acad7af8e 100644
--- a/lib/bb/runqueue.py
+++ b/lib/bb/runqueue.py
@@ -1920,6 +1920,10 @@ class RunQueueExecute:
                             if nexttask in self.sq_deferred:
                                 del self.sq_deferred[nexttask]
                             return True
+                    # If covered tasks are running, need to wait for them to complete
+                    for t in self.sqdata.sq_covered_tasks[nexttask]:
+                        if t in self.runq_running and t not in self.runq_complete:
+                            continue
                     if nexttask in self.sq_deferred:
                         if self.sq_deferred[nexttask] not in self.runq_complete:
                             continue
-- 
2.20.1




* [PATCH 06/10] runqueue: Fix next_buildable_task performance problem
  2019-08-14 13:53 [PATCH 01/10] runqueue: Fix corruption issue Richard Purdie
                   ` (3 preceding siblings ...)
  2019-08-14 13:53 ` [PATCH 05/10] runqueue: Wait for covered tasks to complete before trying setscene Richard Purdie
@ 2019-08-14 13:53 ` Richard Purdie
  2019-08-14 13:53 ` [PATCH 07/10] runqueue: Improve scenequeue debugging Richard Purdie
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Richard Purdie @ 2019-08-14 13:53 UTC (permalink / raw)
  To: bitbake-devel

Looking at the profile information, a lot of time is being spent in
next_buildable_task. This is probably due to the list comprehensions
feeding the emptiness test not scaling well.

The easiest way to improve things is to switch to set manipulations.
We also don't need to update self.buildable the way the original code
did, as nothing else relies on it.
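
For a standalone flavour of the difference (toy data, not a measurement
of bitbake itself):

    import timeit

    buildable = set(range(100000))
    running = set(range(0, 100000, 2))

    def list_style():
        return [x for x in buildable if x not in running]

    def set_style():
        return buildable - running

    print(timeit.timeit(list_style, number=100))  # Python-level loop per call
    print(timeit.timeit(set_style, number=100))   # set difference done in C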

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
---
 lib/bb/runqueue.py | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/lib/bb/runqueue.py b/lib/bb/runqueue.py
index 9acad7af8e..3bcbaee12a 100644
--- a/lib/bb/runqueue.py
+++ b/lib/bb/runqueue.py
@@ -133,7 +133,7 @@ class RunQueueScheduler(object):
 
         self.prio_map = [self.rqdata.runtaskentries.keys()]
 
-        self.buildable = []
+        self.buildable = set()
         self.skip_maxthread = {}
         self.stamps = {}
         for tid in self.rqdata.runtaskentries:
@@ -148,8 +148,10 @@ class RunQueueScheduler(object):
         """
         Return the id of the first task we find that is buildable
         """
-        self.buildable = [x for x in self.buildable if x not in self.rq.runq_running]
-        buildable = [x for x in self.buildable if (x in self.rq.tasks_covered or x in self.rq.tasks_notcovered) and x not in self.rq.holdoff_tasks]
+        buildable = set(self.buildable)
+        buildable.difference_update(self.rq.runq_running)
+        buildable.difference_update(self.rq.holdoff_tasks)
+        buildable.intersection_update(self.rq.tasks_covered | self.rq.tasks_notcovered)
         if not buildable:
             return None
 
@@ -167,7 +169,7 @@ class RunQueueScheduler(object):
                 skip_buildable[rtaskname] = 1
 
         if len(buildable) == 1:
-            tid = buildable[0]
+            tid = buildable.pop()
             taskname = taskname_from_tid(tid)
             if taskname in skip_buildable and skip_buildable[taskname] >= int(self.skip_maxthread[taskname]):
                 return None
@@ -204,7 +206,7 @@ class RunQueueScheduler(object):
             return self.next_buildable_task()
 
     def newbuildable(self, task):
-        self.buildable.append(task)
+        self.buildable.add(task)
 
     def removebuildable(self, task):
         self.buildable.remove(task)
-- 
2.20.1




* [PATCH 07/10] runqueue: Improve scenequeue debugging
  2019-08-14 13:53 [PATCH 01/10] runqueue: Fix corruption issue Richard Purdie
                   ` (4 preceding siblings ...)
  2019-08-14 13:53 ` [PATCH 06/10] runqueue: Fix next_buildable_task performance problem Richard Purdie
@ 2019-08-14 13:53 ` Richard Purdie
  2019-08-14 13:53 ` [PATCH 08/10] runqueue: Recompute holdoff tasks from scratch Richard Purdie
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Richard Purdie @ 2019-08-14 13:53 UTC (permalink / raw)
  To: bitbake-devel

Whilst we had good runqueue failure-mode debugging, it hadn't been adapted
to the scenequeue changes. Run the scenequeue sanity tests at the end of
a build and output the results regardless of whether all setscene tasks
completed. This *massively* improves the ability to debug runqueue
problems.

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
---
 lib/bb/runqueue.py | 61 +++++++++++++++++++++++++++++++---------------
 1 file changed, 42 insertions(+), 19 deletions(-)

diff --git a/lib/bb/runqueue.py b/lib/bb/runqueue.py
index 3bcbaee12a..a1e3285821 100644
--- a/lib/bb/runqueue.py
+++ b/lib/bb/runqueue.py
@@ -1746,9 +1746,9 @@ class RunQueueExecute:
             bb.fatal("Invalid scheduler '%s'.  Available schedulers: %s" %
                      (self.scheduler, ", ".join(obj.name for obj in schedulers)))
 
-        if len(self.rqdata.runq_setscene_tids) > 0:
-            self.sqdata = SQData()
-            build_scenequeue_data(self.sqdata, self.rqdata, self.rq, self.cooker, self.stampcache, self)
+        #if len(self.rqdata.runq_setscene_tids) > 0:
+        self.sqdata = SQData()
+        build_scenequeue_data(self.sqdata, self.rqdata, self.rq, self.cooker, self.stampcache, self)
 
     def runqueue_process_waitpid(self, task, status):
 
@@ -1901,6 +1901,35 @@ class RunQueueExecute:
         self.stats.taskSkipped()
         self.stats.taskCompleted()
 
+    def summarise_scenequeue_errors(self):
+        err = False
+        if not self.sqdone:
+            logger.debug(1, 'We could skip tasks %s', "\n".join(sorted(self.scenequeue_covered)))
+            completeevent = sceneQueueComplete(self.sq_stats, self.rq)
+            bb.event.fire(completeevent, self.cfgData)
+        if self.sq_deferred:
+            logger.error("Scenequeue had deferred entries: %s" % pprint.pformat(self.sq_deferred))
+            err = True
+        if self.changed_setscene:
+            logger.error("Scenequeue had unprocessed changed entries: %s" % pprint.pformat(self.changed_setscene))
+            err = True
+        if self.holdoff_tasks:
+            logger.error("Scenequeue had holdoff tasks: %s" % pprint.pformat(self.holdoff_tasks))
+            err = True
+
+        for x in self.rqdata.runtaskentries:
+            if x not in self.tasks_covered and x not in self.tasks_notcovered:
+                logger.error("Task %s was never moved from the setscene queue" % x)
+                err = True
+            if x not in self.tasks_scenequeue_done:
+                logger.error("Task %s was never processed by the setscene code" % x)
+                err = True
+            if len(self.rqdata.runtaskentries[x].depends) == 0 and x not in self.runq_buildable:
+                logger.error("Task %s was never marked as buildable by the setscene code" % x)
+                err = True
+        return err
+
+
     def execute(self):
         """
         Run the tasks in a queue prepared by prepare_runqueue
@@ -1996,22 +2025,8 @@ class RunQueueExecute:
 
         if not self.sq_live and not self.sqdone and not self.sq_deferred and not self.changed_setscene and not self.holdoff_tasks:
             logger.info("Setscene tasks completed")
-            logger.debug(1, 'We could skip tasks %s', "\n".join(sorted(self.scenequeue_covered)))
 
-            completeevent = sceneQueueComplete(self.sq_stats, self.rq)
-            bb.event.fire(completeevent, self.cfgData)
-
-            err = False
-            for x in self.rqdata.runtaskentries:
-                if x not in self.tasks_covered and x not in self.tasks_notcovered:
-                    logger.error("Task %s was never moved from the setscene queue" % x)
-                    err = True
-                if x not in self.tasks_scenequeue_done:
-                    logger.error("Task %s was never processed by the setscene code" % x)
-                    err = True
-                if len(self.rqdata.runtaskentries[x].depends) == 0 and x not in self.runq_buildable:
-                    logger.error("Task %s was never marked as buildable by the setscene code" % x)
-                    err = True
+            err = self.summarise_scenequeue_errors()
             if err:
                 self.rq.state = runQueueFailed
                 return True
@@ -2107,14 +2122,22 @@ class RunQueueExecute:
             return True
 
         # Sanity Checks
+        err = self.summarise_scenequeue_errors()
         for task in self.rqdata.runtaskentries:
             if task not in self.runq_buildable:
                 logger.error("Task %s never buildable!", task)
+                err = True
             elif task not in self.runq_running:
                 logger.error("Task %s never ran!", task)
+                err = True
             elif task not in self.runq_complete:
                 logger.error("Task %s never completed!", task)
-        self.rq.state = runQueueComplete
+                err = True
+
+        if err:
+            self.rq.state = runQueueFailed
+        else:
+            self.rq.state = runQueueComplete
 
         return True
 
-- 
2.20.1




* [PATCH 08/10] runqueue: Recompute holdoff tasks from scratch
  2019-08-14 13:53 [PATCH 01/10] runqueue: Fix corruption issue Richard Purdie
                   ` (5 preceding siblings ...)
  2019-08-14 13:53 ` [PATCH 07/10] runqueue: Improve scenequeue debugging Richard Purdie
@ 2019-08-14 13:53 ` Richard Purdie
  2019-08-14 13:53 ` [PATCH 09/10] runqueue: Fix event timing race Richard Purdie
  2019-08-14 13:53 ` [PATCH 10/10] runqueue: Drop debug statement causing performance issues Richard Purdie
  8 siblings, 0 replies; 10+ messages in thread
From: Richard Purdie @ 2019-08-14 13:53 UTC (permalink / raw)
  To: bitbake-devel

The changed_setscene variable here is just odd and not needed. Worse,
it could prevent some tasks from being removed from the holdoff tasks
list. The list is rebuilt from scratch and should work as intended from
the other data alone; as far as I can tell this is a leftover from
previous versions of the code.
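
With the change, update_holdofftasks() derives the list purely from the
coverage state and completion data, roughly as follows (simplified sketch,
hypothetical argument names):

    # Simplified sketch of the rebuilt update_holdofftasks() logic.
    def compute_holdoff(setscene_tids, covered, notcovered, covered_tasks, complete):
        holdoff = {tid for tid in setscene_tids
                   if tid not in covered and tid not in notcovered}
        for tid in set(holdoff):
            holdoff |= {dep for dep in covered_tasks[tid] if dep not in complete}
        return holdoff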

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
---
 lib/bb/runqueue.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/bb/runqueue.py b/lib/bb/runqueue.py
index a1e3285821..eb8e342761 100644
--- a/lib/bb/runqueue.py
+++ b/lib/bb/runqueue.py
@@ -2237,7 +2237,7 @@ class RunQueueExecute:
             self.update_holdofftasks()
 
     def update_holdofftasks(self):
-        self.holdoff_tasks = set(self.changed_setscene)
+        self.holdoff_tasks = set()
 
         for tid in self.rqdata.runq_setscene_tids:
             if tid not in self.scenequeue_covered and tid not in self.scenequeue_notcovered:
-- 
2.20.1




* [PATCH 09/10] runqueue: Fix event timing race
  2019-08-14 13:53 [PATCH 01/10] runqueue: Fix corruption issue Richard Purdie
                   ` (6 preceding siblings ...)
  2019-08-14 13:53 ` [PATCH 08/10] runqueue: Recompute holdoff tasks from scratch Richard Purdie
@ 2019-08-14 13:53 ` Richard Purdie
  2019-08-14 13:53 ` [PATCH 10/10] runqueue: Drop debug statement causing performance issues Richard Purdie
  8 siblings, 0 replies; 10+ messages in thread
From: Richard Purdie @ 2019-08-14 13:53 UTC (permalink / raw)
  To: bitbake-devel

The event from a task notifying of hash equivalency should only be processed
when the task completes. Otherwise there is a race where a dependent
task may run before the original task completes, causing various failures.

To make this work reliably, the code had to be restructured quite a bit.
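
In outline, the restructuring queues the reported (tid, unihash) pairs and
only applies them once the reporting task has finished (a simplified
sketch with hypothetical names, not the real runqueue code):

    # Simplified sketch of deferring unihash updates until the task completes.
    updated_taskhash_queue = []          # (tid, unihash) pairs from workers

    def queue_update(tid, unihash):
        updated_taskhash_queue.append((tid, unihash))

    def process_pending_updates(running, complete, apply_update):
        for tid, unihash in list(updated_taskhash_queue):
            if tid in running and tid not in complete:
                continue                 # still executing; keep it queued
            updated_taskhash_queue.remove((tid, unihash))
            apply_update(tid, unihash)   # e.g. recompute dependent task hashes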

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
---
 lib/bb/runqueue.py       | 141 ++++++++++++++++++++-------------------
 lib/bb/tests/runqueue.py |   4 +-
 2 files changed, 74 insertions(+), 71 deletions(-)

diff --git a/lib/bb/runqueue.py b/lib/bb/runqueue.py
index eb8e342761..a04703c870 100644
--- a/lib/bb/runqueue.py
+++ b/lib/bb/runqueue.py
@@ -1696,7 +1696,8 @@ class RunQueueExecute:
         self.sq_running = set()
         self.sq_live = set()
 
-        self.changed_setscene = set()
+        self.updated_taskhash_queue = []
+        self.pending_migrations = set()
 
         self.runq_buildable = set()
         self.runq_running = set()
@@ -1910,8 +1911,8 @@ class RunQueueExecute:
         if self.sq_deferred:
             logger.error("Scenequeue had deferred entries: %s" % pprint.pformat(self.sq_deferred))
             err = True
-        if self.changed_setscene:
-            logger.error("Scenequeue had unprocessed changed entries: %s" % pprint.pformat(self.changed_setscene))
+        if self.updated_taskhash_queue:
+            logger.error("Scenequeue had unprocessed changed taskhash entries: %s" % pprint.pformat(self.updated_taskhash_queue))
             err = True
         if self.holdoff_tasks:
             logger.error("Scenequeue had holdoff tasks: %s" % pprint.pformat(self.holdoff_tasks))
@@ -2023,7 +2024,7 @@ class RunQueueExecute:
             if self.can_start_task():
                 return True
 
-        if not self.sq_live and not self.sqdone and not self.sq_deferred and not self.changed_setscene and not self.holdoff_tasks:
+        if not self.sq_live and not self.sqdone and not self.sq_deferred and not self.updated_taskhash_queue and not self.holdoff_tasks:
             logger.info("Setscene tasks completed")
 
             err = self.summarise_scenequeue_errors()
@@ -2177,45 +2178,66 @@ class RunQueueExecute:
         #bb.note("Task %s: " % task + str(taskdepdata).replace("], ", "],\n"))
         return taskdepdata
 
-    def updated_taskhash(self, tid, unihash):
+    def update_holdofftasks(self):
+        self.holdoff_tasks = set()
+
+        for tid in self.rqdata.runq_setscene_tids:
+            if tid not in self.scenequeue_covered and tid not in self.scenequeue_notcovered:
+                self.holdoff_tasks.add(tid)
+
+        for tid in self.holdoff_tasks.copy():
+            for dep in self.sqdata.sq_covered_tasks[tid]:
+                if dep not in self.runq_complete:
+                    self.holdoff_tasks.add(dep)
+        logger.debug(2, "Holding off tasks %s" % pprint.pformat(self.holdoff_tasks))
+
+
+    def process_possible_migrations(self):
+
         changed = set()
-        if unihash != self.rqdata.runtaskentries[tid].unihash:
-            logger.info("Task %s unihash changed to %s" % (tid, unihash))
-            self.rqdata.runtaskentries[tid].unihash = unihash
-            bb.parse.siggen.set_unihash(tid, unihash)
-
-            # Work out all tasks which depend on this one
-            total = set()
-            next = set(self.rqdata.runtaskentries[tid].revdeps)
-            while next:
-                current = next.copy()
-                total = total |next
-                next = set()
-                for ntid in current:
-                    next |= self.rqdata.runtaskentries[ntid].revdeps
-                    next.difference_update(total)
-
-            # Now iterate those tasks in dependency order to regenerate their taskhash/unihash
-            done = set()
-            next = set(self.rqdata.runtaskentries[tid].revdeps)
-            while next:
-                current = next.copy()
-                next = set()
-                for tid in current:
-                    if not self.rqdata.runtaskentries[tid].depends.isdisjoint(total):
-                        continue
-                    procdep = []
-                    for dep in self.rqdata.runtaskentries[tid].depends:
-                        procdep.append(dep)
-                    orighash = self.rqdata.runtaskentries[tid].hash
-                    self.rqdata.runtaskentries[tid].hash = bb.parse.siggen.get_taskhash(tid, procdep, self.rqdata.dataCaches[mc_from_tid(tid)])
-                    origuni = self.rqdata.runtaskentries[tid].unihash
-                    self.rqdata.runtaskentries[tid].unihash = bb.parse.siggen.get_unihash(tid)
-                    logger.debug(1, "Task %s hash changes: %s->%s %s->%s" % (tid, orighash, self.rqdata.runtaskentries[tid].hash, origuni, self.rqdata.runtaskentries[tid].unihash))
-                    next |= self.rqdata.runtaskentries[tid].revdeps
-                    changed.add(tid)
-                    total.remove(tid)
-                    next.intersection_update(total)
+        for tid, unihash in self.updated_taskhash_queue.copy():
+            if tid in self.runq_running and tid not in self.runq_complete:
+                continue
+
+            self.updated_taskhash_queue.remove((tid, unihash))
+
+            if unihash != self.rqdata.runtaskentries[tid].unihash:
+                logger.info("Task %s unihash changed to %s" % (tid, unihash))
+                self.rqdata.runtaskentries[tid].unihash = unihash
+                bb.parse.siggen.set_unihash(tid, unihash)
+
+                # Work out all tasks which depend on this one
+                total = set()
+                next = set(self.rqdata.runtaskentries[tid].revdeps)
+                while next:
+                    current = next.copy()
+                    total = total |next
+                    next = set()
+                    for ntid in current:
+                        next |= self.rqdata.runtaskentries[ntid].revdeps
+                        next.difference_update(total)
+
+                # Now iterate those tasks in dependency order to regenerate their taskhash/unihash
+                done = set()
+                next = set(self.rqdata.runtaskentries[tid].revdeps)
+                while next:
+                    current = next.copy()
+                    next = set()
+                    for tid in current:
+                        if not self.rqdata.runtaskentries[tid].depends.isdisjoint(total):
+                            continue
+                        procdep = []
+                        for dep in self.rqdata.runtaskentries[tid].depends:
+                            procdep.append(dep)
+                        orighash = self.rqdata.runtaskentries[tid].hash
+                        self.rqdata.runtaskentries[tid].hash = bb.parse.siggen.get_taskhash(tid, procdep, self.rqdata.dataCaches[mc_from_tid(tid)])
+                        origuni = self.rqdata.runtaskentries[tid].unihash
+                        self.rqdata.runtaskentries[tid].unihash = bb.parse.siggen.get_unihash(tid)
+                        logger.debug(1, "Task %s hash changes: %s->%s %s->%s" % (tid, orighash, self.rqdata.runtaskentries[tid].hash, origuni, self.rqdata.runtaskentries[tid].unihash))
+                        next |= self.rqdata.runtaskentries[tid].revdeps
+                        changed.add(tid)
+                        total.remove(tid)
+                        next.intersection_update(total)
 
         if changed:
             for mc in self.rq.worker:
@@ -2223,7 +2245,7 @@ class RunQueueExecute:
             for mc in self.rq.fakeworker:
                 self.rq.fakeworker[mc].process.stdin.write(b"<newtaskhashes>" + pickle.dumps(bb.parse.siggen.get_taskhashes()) + b"</newtaskhashes>")
 
-        logger.debug(1, pprint.pformat("Tasks changed:\n%s" % (changed)))
+            logger.debug(1, pprint.pformat("Tasks changed:\n%s" % (changed)))
 
         for tid in changed:
             if tid not in self.rqdata.runq_setscene_tids:
@@ -2231,31 +2253,12 @@ class RunQueueExecute:
             valid = self.rq.validate_hashes(set([tid]), self.cooker.data, None, False)
             if not valid:
                 continue
-            self.changed_setscene.add(tid)
-
-        if changed:
-            self.update_holdofftasks()
-
-    def update_holdofftasks(self):
-        self.holdoff_tasks = set()
-
-        for tid in self.rqdata.runq_setscene_tids:
-            if tid not in self.scenequeue_covered and tid not in self.scenequeue_notcovered:
-                self.holdoff_tasks.add(tid)
-
-        for tid in self.holdoff_tasks.copy():
-            for dep in self.sqdata.sq_covered_tasks[tid]:
-                if dep not in self.runq_complete:
-                    self.holdoff_tasks.add(dep)
-        logger.debug(2, "Holding off tasks %s" % pprint.pformat(self.holdoff_tasks))
-
-    def process_possible_migrations(self):
-        changes = False
-        for tid in self.changed_setscene.copy():
             if tid in self.runq_running:
-                self.changed_setscene.remove(tid)
                 continue
+            if tid not in self.pending_migrations:
+                self.pending_migrations.add(tid)
 
+        for tid in self.pending_migrations.copy():
             valid = True
             # Check no tasks this covers are running
             for dep in self.sqdata.sq_covered_tasks[tid]:
@@ -2266,6 +2269,8 @@ class RunQueueExecute:
             if not valid:
                 continue
 
+            self.pending_migrations.remove(tid)
+
             if tid in self.tasks_scenequeue_done:
                 self.tasks_scenequeue_done.remove(tid)
             for dep in self.sqdata.sq_covered_tasks[tid]:
@@ -2296,10 +2301,8 @@ class RunQueueExecute:
 
             logger.info("Setscene task %s now valid and being rerun" % tid)
             self.sqdone = False
-            self.changed_setscene.remove(tid)
-            changes = True
 
-        if changes:
+        if changed:
             self.update_holdofftasks()
 
     def scenequeue_updatecounters(self, task, fail=False):
@@ -2854,7 +2857,7 @@ class runQueuePipe():
                     bb.msg.fatal("RunQueue", "failed load pickle '%s': '%s'" % (e, self.queue[7:index]))
                 bb.event.fire_from_worker(event, self.d)
                 if isinstance(event, taskUniHashUpdate):
-                    self.rqexec.updated_taskhash(event.taskid, event.unihash)
+                    self.rqexec.updated_taskhash_queue.append((event.taskid, event.unihash))
                 found = True
                 self.queue = self.queue[index+8:]
                 index = self.queue.find(b"</event>")
diff --git a/lib/bb/tests/runqueue.py b/lib/bb/tests/runqueue.py
index 1103f905f9..493516355d 100644
--- a/lib/bb/tests/runqueue.py
+++ b/lib/bb/tests/runqueue.py
@@ -384,8 +384,8 @@ class RunQueueTests(unittest.TestCase):
             with open(tempdir + "/stamps/b1.do_install.taint", "w") as f:
                f.write("ed36d46a-2977-458a-b3de-eef885bc1817")
             cmd = ["bitbake", "e1", "-DD"]
-            sstatevalid = "e1:do_package:b710f6312ffed900b4b2761cc05538645f4ff3e7e0b70d688c70c0f3bcc2e1a2"
-            tasks = self.run_bitbakecmd(cmd, tempdir, sstatevalid, extraenv=extraenv, cleanup=True, slowtasks="e1:fetch")
+            sstatevalid = "e1:do_package:f9aa46d63cb63d70a09712b6bc7fab57e4966cf8e8b52ff5ad1ba23823aec7d4 e1:do_package:b710f6312ffed900b4b2761cc05538645f4ff3e7e0b70d688c70c0f3bcc2e1a2"
+            tasks = self.run_bitbakecmd(cmd, tempdir, sstatevalid, extraenv=extraenv, cleanup=True, slowtasks="e1:fetch b1:install")
             expected = ['a1:package', 'a1:install', 'b1:package', 'b1:install', 'a1:populate_sysroot', 'b1:populate_sysroot',
                         'a1:package_write_ipk_setscene', 'b1:packagedata_setscene', 'b1:package_write_rpm_setscene',
                         'a1:package_write_rpm_setscene', 'b1:package_write_ipk_setscene', 'a1:packagedata_setscene',
-- 
2.20.1




* [PATCH 10/10] runqueue: Drop debug statement causing performance issues
  2019-08-14 13:53 [PATCH 01/10] runqueue: Fix corruption issue Richard Purdie
                   ` (7 preceding siblings ...)
  2019-08-14 13:53 ` [PATCH 09/10] runqueue: Fix event timing race Richard Purdie
@ 2019-08-14 13:53 ` Richard Purdie
  8 siblings, 0 replies; 10+ messages in thread
From: Richard Purdie @ 2019-08-14 13:53 UTC (permalink / raw)
  To: bitbake-devel

This debug statement could produce a long list of tasks which, when
repeatedly sent over our IPC, slowed builds down immensely. Remove
it in favour of the more targeted debugging added recently, bringing
back some lost performance, particularly on builds with large numbers
of tasks.

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
---
 lib/bb/runqueue.py | 2 --
 1 file changed, 2 deletions(-)

diff --git a/lib/bb/runqueue.py b/lib/bb/runqueue.py
index a04703c870..b571dde4bf 100644
--- a/lib/bb/runqueue.py
+++ b/lib/bb/runqueue.py
@@ -2189,8 +2189,6 @@ class RunQueueExecute:
             for dep in self.sqdata.sq_covered_tasks[tid]:
                 if dep not in self.runq_complete:
                     self.holdoff_tasks.add(dep)
-        logger.debug(2, "Holding off tasks %s" % pprint.pformat(self.holdoff_tasks))
-
 
     def process_possible_migrations(self):
 
-- 
2.20.1




Thread overview: 10+ messages
2019-08-14 13:53 [PATCH 01/10] runqueue: Fix corruption issue Richard Purdie
2019-08-14 13:53 ` [PATCH 02/10] runqueue: Improve setscene task handling logic Richard Purdie
2019-08-14 13:53 ` [PATCH 03/10] tests/runqueue: Add further hash equivalence tests Richard Purdie
2019-08-14 13:53 ` [PATCH 04/10] cooker: Improve hash server startup code to avoid exit tracebacks Richard Purdie
2019-08-14 13:53 ` [PATCH 05/10] runqueue: Wait for covered tasks to complete before trying setscene Richard Purdie
2019-08-14 13:53 ` [PATCH 06/10] runqueue: Fix next_buildable_task performance problem Richard Purdie
2019-08-14 13:53 ` [PATCH 07/10] runqueue: Improve scenequeue debugging Richard Purdie
2019-08-14 13:53 ` [PATCH 08/10] runqueue: Recompute holdoff tasks from scratch Richard Purdie
2019-08-14 13:53 ` [PATCH 09/10] runqueue: Fix event timing race Richard Purdie
2019-08-14 13:53 ` [PATCH 10/10] runqueue: Drop debug statement causing performance issues Richard Purdie
