* [OSSTEST PATCH] host reuse fixes: Properly clear out old static tasks from history
@ 2020-10-23 16:14 Ian Jackson
From: Ian Jackson @ 2020-10-23 16:14 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson

The algorithm for clearing out old lifecycle entries was wrong: it
would delete all entries for non-live tasks.

In practice this would properly remove all the old entries for
non-static tasks, since owned tasks typically don't release things
until the task ends (and it becomes non-live).  And it wouldn't remove
more than it should do unless some now-not-live task had an allocation
overlapping with us, which is not supposed to be possible if we are
doing a host wipe.  But it would not remove static tasks ever, since
they are always live.
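
To illustrate the failure mode described above, here is a minimal sketch
that runs the old DELETE against an in-memory SQLite database.  The
two-table schema, the host name "h0" and the task ids are assumptions
made up for this example; the real osstest schema has more columns and
runs on a different database.  The table alias from the original query
is dropped so that the statement does not depend on alias support in
DELETE.

```python
import sqlite3

# Hypothetical, simplified schema: only the columns the query touches.
old_db = sqlite3.connect(":memory:")
old_db.executescript("""
    CREATE TABLE tasks (taskid INTEGER PRIMARY KEY, live INTEGER NOT NULL);
    CREATE TABLE host_lifecycle (lcseq INTEGER, hostname TEXT, taskid INTEGER);
""")

# Task 1: static task (always live).  Task 2: owned task that has ended.
# Task 3: "us", the task that has just wiped the host.
old_db.executemany("INSERT INTO tasks VALUES (?, ?)",
                   [(1, 1), (2, 0), (3, 1)])
old_db.executemany("INSERT INTO host_lifecycle VALUES (?, ?, ?)",
                   [(1, "h0", 1), (2, "h0", 2), (3, "h0", 3)])

# Old algorithm: delete entries whose task is not live.
old_db.execute("""
    DELETE FROM host_lifecycle
          WHERE hostname = ?
            AND NOT EXISTS (
                SELECT 1
                  FROM tasks t
                 WHERE t.live
                   AND t.taskid = host_lifecycle.taskid
            )
""", ("h0",))

old_remaining = old_db.execute(
    "SELECT lcseq, taskid FROM host_lifecycle ORDER BY lcseq").fetchall()
print(old_remaining)
```

In this sketch the entry for the ended owned task (taskid 2) is deleted,
but the stale entry for the static task (taskid 1) is kept because that
task is still live.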

Change to a completely different algorithm:

 * Check that only we (ie, $ttaskid) have (any shares of) this host
   allocated.  There's a function resource_check_allocated_core which
   already does this and since we're conceptually part of Executive
   it is proper for us to call it.  This is just a sanity check.

 * Delete all lifecycle entries predating the first entry made by
   us.  (We could just delete all entries other than ours, but in
   theory maybe some future code could result in a situation where
   someone else could have had another share briefly at some point.)
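
The second step above can be sketched as follows, again against an
in-memory SQLite database.  The schema, the host names "h0" and "h1",
and the task ids are assumptions invented for the example, and the
sanity check via resource_check_allocated_core is not modelled here.

```python
import sqlite3

# Hypothetical, simplified schema: only the columns the query touches.
new_db = sqlite3.connect(":memory:")
new_db.execute(
    "CREATE TABLE host_lifecycle (lcseq INTEGER, hostname TEXT, taskid INTEGER)")
new_db.executemany("INSERT INTO host_lifecycle VALUES (?, ?, ?)", [
    (1, "h0", 1),   # static task, stale
    (2, "h0", 2),   # ended owned task, stale
    (3, "h0", 3),   # first entry made by "us" (taskid 3)
    (4, "h0", 3),   # later entry made by us
    (1, "h1", 1),   # a different host; must not be touched
])

def clear_old_lifecycle(db, hostname, ttaskid):
    """Delete entries for hostname that predate ttaskid's first entry."""
    db.execute("""
        DELETE FROM host_lifecycle
              WHERE hostname = ?
                AND lcseq < (
                    SELECT min(lcseq)
                      FROM host_lifecycle
                     WHERE hostname = ? AND taskid = ?
                )
    """, (hostname, hostname, ttaskid))

clear_old_lifecycle(new_db, "h0", 3)

new_remaining = new_db.execute(
    "SELECT hostname, lcseq, taskid FROM host_lifecycle"
    " ORDER BY hostname, lcseq").fetchall()
print(new_remaining)

# If the task has no entry for the host, min() yields NULL, the
# comparison is never true, and nothing is deleted.
clear_old_lifecycle(new_db, "h1", 3)
count_h1 = new_db.execute(
    "SELECT count(*) FROM host_lifecycle WHERE hostname = 'h1'").fetchone()[0]
```

Unlike a query keyed on task liveness, this one removes the static
task's stale entry too, because it only compares sequence numbers.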

This removes old junk from the "Tasks that could have affected" in
reports.

Signed-off-by: Ian Jackson <iwj@xenproject.org>
---
 Osstest/JobDB/Executive.pm | 22 +++++++++++++---------
 1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/Osstest/JobDB/Executive.pm b/Osstest/JobDB/Executive.pm
index 1dcf55ff..097c8d75 100644
--- a/Osstest/JobDB/Executive.pm
+++ b/Osstest/JobDB/Executive.pm
@@ -515,15 +515,19 @@ sub jobdb_host_update_lifecycle_info ($$$) { #method
 
     if ($mode eq 'wiped') {
 	db_retry($flight, [qw(running)], $dbh_tests,[], sub {
-            $dbh_tests->do(<<END, {}, $hostname);
-                DELETE FROM host_lifecycle h
-                      WHERE hostname=?
-                        AND NOT EXISTS(
-                SELECT 1
-		  FROM tasks t
-		 WHERE t.live
-		   AND t.taskid = h.taskid
-                );
+            my $cshare = Osstest::Executive::resource_check_allocated_core(
+                "host",$hostname);
+            die "others have this host allocated when we have just wiped it! "
+	      .Dumper($cshare)
+	      if $cshare->{Others};
+	    $dbh_tests->do(<<END, {}, $hostname, $hostname, $ttaskid);
+                DELETE FROM host_lifecycle
+		      WHERE hostname=?
+			AND lcseq < (
+			       SELECT min(lcseq) 
+				FROM host_lifecycle
+			       WHERE hostname=? and taskid=?
+			    )
 END
         });
 	logm("host lifecycle: $hostname: wiped, cleared out old info");
-- 
2.20.1


