* [OSSTEST PATCH 00/16] Bugfixes
@ 2020-10-22 16:44 Ian Jackson
  2020-10-22 16:44 ` [OSSTEST PATCH 01/16] share in jobdb: Break out $checkconstraints and move call Ian Jackson
                   ` (15 more replies)
  0 siblings, 16 replies; 17+ messages in thread
From: Ian Jackson @ 2020-10-22 16:44 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson

Many of these are fixes to host sharing.

I'm still doing a formal dev test but I expect to push these soon.

Ian Jackson (16):
  share in jobdb: Break out $checkconstraints and move call
  share in jobdb: Move out-of-flight special case higher up
  PDU/IPMI: Retransmit, don't just wait
  PDU/MSW: Warn that SNMP status is often not immediately updated
  PDU/MSW: Break out get()
  PDU/MSW: Break out action_value()
  PDU/MSW: Actually implement delayed-*
  PDU/MSW: Make show() return the value from get()
  PDU/MSU: Retransmit on/off until PDU has changed
  host reuse fixes: Fix running of steps adhoc
  host reuse fixes: Fix runvar entry for adhoc tasks
  Introduce guest_mk_lv_name
  Prefix guest LV names with the job name
  reporting: Minor fix to reporting of tasks with no subtask
  host reuse fixes: Do not break host-reuse if no host allocated
  starvation: Do not count more than half a flight as starved

 Osstest/Executive.pm        |  2 +-
 Osstest/JobDB/Executive.pm  | 46 +++++++++++++++++++++++--------------
 Osstest/PDU/ipmi.pm         |  5 ++--
 Osstest/TestSupport.pm      |  9 ++++++--
 pdu-msw                     | 37 +++++++++++++++++++++++++----
 ts-debian-fixup             | 22 ++++++++++++++++++
 ts-debian-install           |  2 +-
 ts-host-reuse               |  2 +-
 ts-hosts-allocate-Executive |  2 +-
 9 files changed, 97 insertions(+), 30 deletions(-)

-- 
2.20.1




* [OSSTEST PATCH 01/16] share in jobdb: Break out $checkconstraints and move call
  2020-10-22 16:44 [OSSTEST PATCH 00/16] Bugfixes Ian Jackson
@ 2020-10-22 16:44 ` Ian Jackson
  2020-10-22 16:44 ` [OSSTEST PATCH 02/16] share in jobdb: Move out-of-flight special case higher up Ian Jackson
                   ` (14 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: Ian Jackson @ 2020-10-22 16:44 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson

This must happen after we introduce our new row or it is not
effective!

Signed-off-by: Ian Jackson <iwj@xenproject.org>
---
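Illustration only, not part of the patch: a minimal sketch of why the
check has to run after the insert.  The in-memory table, insert_row()
and the one-task-per-host rule are made up for this example; the real
constraints query is not visible in this hunk.

```perl
use strict;
use warnings;

# Hypothetical in-memory stand-in for the host lifecycle table.
my @rows = ({ hostname => 'host0', taskid => 1 });

# Hypothetical rule: a host may be used by at most one task.
sub constraints_ok {
    my ($hostname) = @_;
    my %tasks = map { $_->{taskid} => 1 }
                grep { $_->{hostname} eq $hostname } @rows;
    my $ntasks = keys %tasks;
    return $ntasks <= 1 ? 1 : 0;
}

sub insert_row {
    my ($hostname, $taskid) = @_;
    push @rows, { hostname => $hostname, taskid => $taskid };
}

# A check made first cannot see the row we are about to add ...
my $before = constraints_ok('host0');
insert_row('host0', 2);
# ... whereas a check made afterwards does.
my $after = constraints_ok('host0');

printf "check before insert: %s\n", $before ? 'passes' : 'fails';
printf "check after insert:  %s\n", $after  ? 'passes' : 'fails';
```

In the sketch the early check passes even though the insert then
breaks the rule, which is the situation the commit message describes.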
 Osstest/JobDB/Executive.pm | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/Osstest/JobDB/Executive.pm b/Osstest/JobDB/Executive.pm
index f69ce277..071f31f1 100644
--- a/Osstest/JobDB/Executive.pm
+++ b/Osstest/JobDB/Executive.pm
@@ -582,6 +582,11 @@ END
           VALUES (?,        ?,      ?,      ?,   ?,      ?,     ?     )
 END
 
+    my $checkconstraints = sub {
+	$constraintsq->execute($hostname, $ttaskid);
+	$constraintsq->fetchrow_array() or confess "$hostname ?";
+    };
+
     my $ojvn = "$ho->{Ident}_lifecycle";
 
     if (length $r{$ojvn}) {
@@ -654,8 +659,6 @@ END
 		push @lifecycle, "$omarks$otj:$o->{stepno}$osuffix";
 	    }
 	}
-	$constraintsq->execute($hostname, $ttaskid);
-	$constraintsq->fetchrow_array() or confess "$hostname ?";
 
 	if (defined $flight) {
 	    $insertq->execute($hostname, $ttaskid,
@@ -670,6 +673,7 @@ END
 			      undef,
 			      undef,undef);
 	}
+	$checkconstraints->();
     });
 
     if (defined $flight) {
-- 
2.20.1




* [OSSTEST PATCH 02/16] share in jobdb: Move out-of-flight special case higher up
  2020-10-22 16:44 [OSSTEST PATCH 00/16] Bugfixes Ian Jackson
  2020-10-22 16:44 ` [OSSTEST PATCH 01/16] share in jobdb: Break out $checkconstraints and move call Ian Jackson
@ 2020-10-22 16:44 ` Ian Jackson
  2020-10-22 16:44 ` [OSSTEST PATCH 03/16] PDU/IPMI: Retransmit, don't just wait Ian Jackson
                   ` (13 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: Ian Jackson @ 2020-10-22 16:44 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson

This avoids running the runvar computation loop outside flights.
This is good amongst other things because that loop prints warnings
about undef $flight and $job.

Signed-off-by: Ian Jackson <iwj@xenproject.org>
---
 Osstest/JobDB/Executive.pm | 33 ++++++++++++++++++---------------
 1 file changed, 18 insertions(+), 15 deletions(-)

diff --git a/Osstest/JobDB/Executive.pm b/Osstest/JobDB/Executive.pm
index 071f31f1..4fa42e5d 100644
--- a/Osstest/JobDB/Executive.pm
+++ b/Osstest/JobDB/Executive.pm
@@ -587,6 +587,18 @@ END
 	$constraintsq->fetchrow_array() or confess "$hostname ?";
     };
 
+
+    if (!defined $flight) {
+	db_retry($dbh_tests,[], sub {
+	    $insertq->execute($hostname, $ttaskid,
+			      undef,undef,
+			      undef,
+			      undef,undef);
+	    $checkconstraints->();
+	});
+	return;
+    }
+
     my $ojvn = "$ho->{Ident}_lifecycle";
 
     if (length $r{$ojvn}) {
@@ -660,26 +672,17 @@ END
 	    }
 	}
 
-	if (defined $flight) {
-	    $insertq->execute($hostname, $ttaskid,
-			      $flight, $job,
-			      ($mode eq 'selectprep')+0,
+	$insertq->execute($hostname, $ttaskid,
+			  $flight, $job,
+			  ($mode eq 'selectprep')+0,
                 # ^ DBD::Pg doesn't accept perl canonical false for bool!
                 #   https://rt.cpan.org/Public/Bug/Display.html?id=133229
-			      $tident, $tstepno);
-	} else {
-	    $insertq->execute($hostname, $ttaskid,
-			      undef,undef,
-			      undef,
-			      undef,undef);
-	}
+			  $tident, $tstepno);
 	$checkconstraints->();
     });
 
-    if (defined $flight) {
-	push @lifecycle, $newsigil if length $newsigil;
-	store_runvar($ojvn, "@lifecycle");
-    }
+    push @lifecycle, $newsigil if length $newsigil;
+    store_runvar($ojvn, "@lifecycle");
 }
 
 sub current_stepno ($) { #method
-- 
2.20.1




* [OSSTEST PATCH 03/16] PDU/IPMI: Retransmit, don't just wait
  2020-10-22 16:44 [OSSTEST PATCH 00/16] Bugfixes Ian Jackson
  2020-10-22 16:44 ` [OSSTEST PATCH 01/16] share in jobdb: Break out $checkconstraints and move call Ian Jackson
  2020-10-22 16:44 ` [OSSTEST PATCH 02/16] share in jobdb: Move out-of-flight special case higher up Ian Jackson
@ 2020-10-22 16:44 ` Ian Jackson
  2020-10-22 16:44 ` [OSSTEST PATCH 04/16] PDU/MSW: Warn that SNMP status is often not immediately updated Ian Jackson
                   ` (12 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: Ian Jackson @ 2020-10-22 16:44 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson

We have a system for which
   ipmitool -H sabro0m -U root -P XXXX -I lanplus power on
seems to work but doesn't take effect the first time.

Retransmit on each retry.

Signed-off-by: Ian Jackson <iwj@xenproject.org>
---
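Illustration only, not part of the patch: a self-contained sketch of
the reordered loop.  send_power() and get_status() are hypothetical
stand-ins for the ipmitool invocations, and the simulated BMC is
assumed to drop exactly one command.

```perl
use strict;
use warnings;

# Hypothetical BMC that silently ignores the first power command.
my $state   = 'off';
my $ignored = 1;
my $sent    = 0;

sub send_power {
    my ($want) = @_;
    $sent++;
    if ($ignored > 0) { $ignored--; return; }
    $state = $want;
}

sub get_status { return $state }

sub power_with_retransmit {
    my ($onoff, $count) = @_;
    for (;;) {
        last if get_status() eq $onoff;
        send_power($onoff);    # resent on every iteration, not just once
        die "did not power $onoff\n" unless --$count > 0;
        # the loop in the patch sleeps for a second here; omitted in this sketch
    }
    return $sent;
}

my $commands = power_with_retransmit('on', 60);
print "powered on after $commands command(s)\n";
```

With the command sent once before the loop, as in the old code, the
same simulated BMC would stay off until the retry count ran out.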
 Osstest/PDU/ipmi.pm | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/Osstest/PDU/ipmi.pm b/Osstest/PDU/ipmi.pm
index 98e8957f..21c94d98 100644
--- a/Osstest/PDU/ipmi.pm
+++ b/Osstest/PDU/ipmi.pm
@@ -66,11 +66,12 @@ sub pdu_power_state {
 	return;
     }
 
-    system_checked((@cmd, qw(power), $onoff));
-
     my $count = 60;
     for (;;) {
         last if $getstatus->() eq $onoff;
+
+	system_checked((@cmd, qw(power), $onoff));
+
         die "did not power $onoff" unless --$count > 0;
         sleep(1);
     }
-- 
2.20.1




* [OSSTEST PATCH 04/16] PDU/MSW: Warn that SNMP status is often not immediately updated
  2020-10-22 16:44 [OSSTEST PATCH 00/16] Bugfixes Ian Jackson
                   ` (2 preceding siblings ...)
  2020-10-22 16:44 ` [OSSTEST PATCH 03/16] PDU/IPMI: Retransmit, don't just wait Ian Jackson
@ 2020-10-22 16:44 ` Ian Jackson
  2020-10-22 16:44 ` [OSSTEST PATCH 05/16] PDU/MSW: Break out get() Ian Jackson
                   ` (11 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: Ian Jackson @ 2020-10-22 16:44 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson

If you don't know this, it's very confusing.

Signed-off-by: Ian Jackson <iwj@xenproject.org>
---
 pdu-msw | 1 +
 1 file changed, 1 insertion(+)

diff --git a/pdu-msw b/pdu-msw
index d2691567..04b03a22 100755
--- a/pdu-msw
+++ b/pdu-msw
@@ -133,4 +133,5 @@ if (!defined $action) {
     print "was: "; show();
     set();
     print "now: "; show();
+    print "^ note, PDUs often do not update returned info immediately\n";
 }
-- 
2.20.1




* [OSSTEST PATCH 05/16] PDU/MSW: Break out get()
  2020-10-22 16:44 [OSSTEST PATCH 00/16] Bugfixes Ian Jackson
                   ` (3 preceding siblings ...)
  2020-10-22 16:44 ` [OSSTEST PATCH 04/16] PDU/MSW: Warn that SNMP status is often not immediately updated Ian Jackson
@ 2020-10-22 16:44 ` Ian Jackson
  2020-10-22 16:44 ` [OSSTEST PATCH 06/16] PDU/MSW: Break out action_value() Ian Jackson
                   ` (10 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: Ian Jackson @ 2020-10-22 16:44 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson

This is going to be useful in a moment.

Signed-off-by: Ian Jackson <iwj@xenproject.org>
---
 pdu-msw | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/pdu-msw b/pdu-msw
index 04b03a22..58c33952 100755
--- a/pdu-msw
+++ b/pdu-msw
@@ -106,13 +106,18 @@ my @map= (undef, qw(
                     delayed-off
                     delayed-reboot));
 
-sub show () {
+sub get () {
     my $got= $session->get_request($read_oid);
     die "SNMP error reading $read_oid ".$session->error()." " unless $got;
     my $val= $got->{$read_oid};
     die unless $val;
     my $mean= $map[$val];
     die "$val ?" unless defined $mean;
+    return $mean;
+}
+
+sub show () {
+    my $mean = get();
     printf "pdu-msw $dnsname: #%s \"%s\" = %s\n", $useport, $usename, $mean;
 }
 
-- 
2.20.1




* [OSSTEST PATCH 06/16] PDU/MSW: Break out action_value()
  2020-10-22 16:44 [OSSTEST PATCH 00/16] Bugfixes Ian Jackson
                   ` (4 preceding siblings ...)
  2020-10-22 16:44 ` [OSSTEST PATCH 05/16] PDU/MSW: Break out get() Ian Jackson
@ 2020-10-22 16:44 ` Ian Jackson
  2020-10-22 16:44 ` [OSSTEST PATCH 07/16] PDU/MSW: Actually implement delayed-* Ian Jackson
                   ` (9 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: Ian Jackson @ 2020-10-22 16:44 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson

This is going to be useful in a moment.

Signed-off-by: Ian Jackson <iwj@xenproject.org>
---
 pdu-msw | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/pdu-msw b/pdu-msw
index 58c33952..03b0f342 100755
--- a/pdu-msw
+++ b/pdu-msw
@@ -121,13 +121,17 @@ sub show () {
     printf "pdu-msw $dnsname: #%s \"%s\" = %s\n", $useport, $usename, $mean;
 }
 
-sub set () {
+sub action_value () {
     my $delayadd= ($action =~ s/^delayed-// ? 3 : 0);
     my $valset= ($action =~ m/^(?:0|off)$/ ? 2 :
                  $action =~ m/^(?:1|on)$/ ? 1 :
                  $action =~ m/^(?:reboot)$/ ? 3 :
                  die "unknown action $action\n$usagemsg");
-        
+    return $valset;
+}
+
+sub set ($) {
+    my ($valset) = @_;
     my $res= $session->set_request(-varbindlist => [ $write_oid, INTEGER, $valset ]);
     die "SNMP set ".$session->error()." " unless $res;
 }
@@ -135,8 +139,9 @@ sub set () {
 if (!defined $action) {
     show();
 } else {
+    my $valset = action_value();
     print "was: "; show();
-    set();
+    set($valset);
     print "now: "; show();
     print "^ note, PDUs often do not update returned info immediately\n";
 }
-- 
2.20.1




* [OSSTEST PATCH 07/16] PDU/MSW: Actually implement delayed-*
  2020-10-22 16:44 [OSSTEST PATCH 00/16] Bugfixes Ian Jackson
                   ` (5 preceding siblings ...)
  2020-10-22 16:44 ` [OSSTEST PATCH 06/16] PDU/MSW: Break out action_value() Ian Jackson
@ 2020-10-22 16:44 ` Ian Jackson
  2020-10-22 16:44 ` [OSSTEST PATCH 08/16] PDU/MSW: Make show() return the value from get() Ian Jackson
                   ` (8 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: Ian Jackson @ 2020-10-22 16:44 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson

Nothing in our tree uses this but having it here is useful docs for
the protocol so I shan't just delete it.

Signed-off-by: Ian Jackson <iwj@xenproject.org>
---
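Illustration only, not part of the patch: the resulting mapping from
action name to SNMP value, as a standalone sketch.  The action is
passed as an argument here rather than read from a global, and the
first four entries of @map are an assumption (only the last two are
visible in the diff context).

```perl
use strict;
use warnings;

# Assumed status table; index = value reported by the PDU.
my @map = (undef, qw(on off reboot delayed-on delayed-off delayed-reboot));

sub action_value {
    my ($action) = @_;
    # Stripping the prefix selects the "delayed" variant, 3 slots further on.
    my $delayadd = ($action =~ s/^delayed-// ? 3 : 0);
    my $valset = ($action =~ m/^(?:0|off)$/ ? 2 :
                  $action =~ m/^(?:1|on)$/ ? 1 :
                  $action =~ m/^(?:reboot)$/ ? 3 :
                  die "unknown action $action\n");
    return $valset + $delayadd;
}

for my $action (qw(on off reboot delayed-on delayed-off delayed-reboot)) {
    my $v = action_value($action);
    print "$action -> $v ($map[$v])\n";
}
```

Before this patch $delayadd was computed and then dropped, so
delayed-off would have been sent as a plain off.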
 pdu-msw | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/pdu-msw b/pdu-msw
index 03b0f342..196b6c45 100755
--- a/pdu-msw
+++ b/pdu-msw
@@ -127,7 +127,7 @@ sub action_value () {
                  $action =~ m/^(?:1|on)$/ ? 1 :
                  $action =~ m/^(?:reboot)$/ ? 3 :
                  die "unknown action $action\n$usagemsg");
-    return $valset;
+    return $valset + $delayadd;
 }
 
 sub set ($) {
-- 
2.20.1




* [OSSTEST PATCH 08/16] PDU/MSW: Make show() return the value from get()
  2020-10-22 16:44 [OSSTEST PATCH 00/16] Bugfixes Ian Jackson
                   ` (6 preceding siblings ...)
  2020-10-22 16:44 ` [OSSTEST PATCH 07/16] PDU/MSW: Actually implement delayed-* Ian Jackson
@ 2020-10-22 16:44 ` Ian Jackson
  2020-10-22 16:44 ` [OSSTEST PATCH 09/16] PDU/MSU: Retransmit on/off until PDU has changed Ian Jackson
                   ` (7 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: Ian Jackson @ 2020-10-22 16:44 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson

No-one uses this return value yet, so NFC.

Signed-off-by: Ian Jackson <iwj@xenproject.org>
---
 pdu-msw | 1 +
 1 file changed, 1 insertion(+)

diff --git a/pdu-msw b/pdu-msw
index 196b6c45..2d4ec967 100755
--- a/pdu-msw
+++ b/pdu-msw
@@ -119,6 +119,7 @@ sub get () {
 sub show () {
     my $mean = get();
     printf "pdu-msw $dnsname: #%s \"%s\" = %s\n", $useport, $usename, $mean;
+    return $mean;
 }
 
 sub action_value () {
-- 
2.20.1




* [OSSTEST PATCH 09/16] PDU/MSU: Retransmit on/off until PDU has changed
  2020-10-22 16:44 [OSSTEST PATCH 00/16] Bugfixes Ian Jackson
                   ` (7 preceding siblings ...)
  2020-10-22 16:44 ` [OSSTEST PATCH 08/16] PDU/MSW: Make show() return the value from get() Ian Jackson
@ 2020-10-22 16:44 ` Ian Jackson
  2020-10-22 16:45 ` [OSSTEST PATCH 10/16] host reuse fixes: Fix running of steps adhoc Ian Jackson
                   ` (6 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: Ian Jackson @ 2020-10-22 16:44 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson

The main effect of this is that the transcript will actually show the
new PDU state.  Previously we would call show(), but APC PDUs would
normally not change immediately, so the transcript would show the old
state.

This also guards against an unresponsive PDU or a packet getting lost.
I don't think we have ever seen that.

Signed-off-by: Ian Jackson <iwj@xenproject.org>
---
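Illustration only, not part of the patch: the arithmetic behind the
"timeout = 0.05 * max_retries^2" comment.  The loop sleeps
$retries * 0.1 for $retries = 0 .. $max_retries, which sums to
0.05 * N * (N+1).  This assumes each sleep really waits a fractional
second; Perl's builtin sleep takes whole seconds, and whether the
script imports a fractional sleep such as Time::HiRes::sleep is not
visible in these hunks.

```perl
use strict;
use warnings;

# Total time spent sleeping if every attempt up to $max_retries fails.
sub worst_case_sleep {
    my ($max_retries) = @_;
    my $total = 0;
    $total += $_ * 0.1 for 0 .. $max_retries;
    return $total;
}

my $max_retries = 16;
my $exact  = worst_case_sleep($max_retries);    # 0.05 * N * (N+1)
my $approx = 0.05 * $max_retries ** 2;          # the comment's estimate
printf "exact %.1fs, estimate from comment %.1fs\n", $exact, $approx;
```

The estimate in the comment drops the linear term, so it slightly
understates the exact sum.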
 pdu-msw | 21 ++++++++++++++++++---
 1 file changed, 18 insertions(+), 3 deletions(-)

diff --git a/pdu-msw b/pdu-msw
index 2d4ec967..c57f9f7c 100755
--- a/pdu-msw
+++ b/pdu-msw
@@ -41,6 +41,7 @@ while (@ARGV && $ARGV[0] =~ m/^-/) {
 
 if (@ARGV<2 || @ARGV>3 || $ARGV[0] =~ m/^-/) { die "bad usage\n$usagemsg"; }
 
+our ($max_retries) = 16; # timeout = 0.05 * max_retries^2
 our ($dnsname,$outlet,$action) = @ARGV;
 
 my ($session,$error) = Net::SNMP->session(
@@ -142,7 +143,21 @@ if (!defined $action) {
 } else {
     my $valset = action_value();
     print "was: "; show();
-    set($valset);
-    print "now: "; show();
-    print "^ note, PDUs often do not update returned info immediately\n";
+
+    my $retries = 0;
+    for (;;) {
+	set($valset);
+	sleep $retries * 0.1;
+	print "now: "; my $got = show();
+	if ($got eq $map[$valset]) { last; }
+	if ($map[$valset] !~ m{^(?:off|on)$}) {
+	    print
+ "^ note, PDUs often do not update returned info immediately\n";
+	    last;
+	}
+	if ($retries >= $max_retries) {
+	    die "PDU does not seem to be changing state!\n";
+	}
+	$retries++;
+    }
 }
-- 
2.20.1




* [OSSTEST PATCH 10/16] host reuse fixes: Fix running of steps adhoc
  2020-10-22 16:44 [OSSTEST PATCH 00/16] Bugfixes Ian Jackson
                   ` (8 preceding siblings ...)
  2020-10-22 16:44 ` [OSSTEST PATCH 09/16] PDU/MSU: Retransmit on/off until PDU has changed Ian Jackson
@ 2020-10-22 16:45 ` Ian Jackson
  2020-10-22 16:45 ` [OSSTEST PATCH 11/16] host reuse fixes: Fix runvar entry for adhoc tasks Ian Jackson
                   ` (5 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: Ian Jackson @ 2020-10-22 16:45 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson

When a ts script is run by hand, for adhoc testing, there is no
OSSTEST_TESTID variable in the environment and the script does not
know its own step number.  Such adhoc runs are not tracked as steps
in the steps table.

For host lifecycle purposes, treat these as ad-hoc out-of-flight uses,
based only on the taskid (which will usually be a person's personal
static task).

Without this, these adhoc runs fail with a constraint violation when
trying to insert a flight/job/step row into the host lifecycle table: the
constraint requires the step to be specified but it is NULL.

Signed-off-by: Ian Jackson <iwj@xenproject.org>
---
 Osstest/JobDB/Executive.pm | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Osstest/JobDB/Executive.pm b/Osstest/JobDB/Executive.pm
index 4fa42e5d..04555113 100644
--- a/Osstest/JobDB/Executive.pm
+++ b/Osstest/JobDB/Executive.pm
@@ -588,7 +588,7 @@ END
     };
 
 
-    if (!defined $flight) {
+    if (!defined $flight || !defined $tstepno) {
 	db_retry($dbh_tests,[], sub {
 	    $insertq->execute($hostname, $ttaskid,
 			      undef,undef,
-- 
2.20.1




* [OSSTEST PATCH 11/16] host reuse fixes: Fix runvar entry for adhoc tasks
  2020-10-22 16:44 [OSSTEST PATCH 00/16] Bugfixes Ian Jackson
                   ` (9 preceding siblings ...)
  2020-10-22 16:45 ` [OSSTEST PATCH 10/16] host reuse fixes: Fix running of steps adhoc Ian Jackson
@ 2020-10-22 16:45 ` Ian Jackson
  2020-10-22 16:45 ` [OSSTEST PATCH 12/16] Introduce guest_mk_lv_name Ian Jackson
                   ` (4 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: Ian Jackson @ 2020-10-22 16:45 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson

When processing an item from the host lifecycle table into the runvar,
we don't want to do all the processing of flight and job.  Instead, we
should simply put the ?<taskid> into the runvar.

Previously this would produce ?<taskid>: which the flight reporting
code would choke on.

Signed-off-by: Ian Jackson <iwj@xenproject.org>
---
 Osstest/JobDB/Executive.pm | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/Osstest/JobDB/Executive.pm b/Osstest/JobDB/Executive.pm
index 04555113..1dcf55ff 100644
--- a/Osstest/JobDB/Executive.pm
+++ b/Osstest/JobDB/Executive.pm
@@ -649,6 +649,11 @@ END
 	    }
 	    next if $tj_seen{$oisprepmark.$otj}++;
 
+	    if (!defined $o->{flight}) {
+		push @lifecycle, "$omarks$otj";
+		next;
+	    }
+
 	    if (!$omarks && !$olive && defined($o->{flight}) &&
 		$ho->{Shared} &&
 		$ho->{Shared}{Type} =~ m/^build-/ &&
-- 
2.20.1




* [OSSTEST PATCH 12/16] Introduce guest_mk_lv_name
  2020-10-22 16:44 [OSSTEST PATCH 00/16] Bugfixes Ian Jackson
                   ` (10 preceding siblings ...)
  2020-10-22 16:45 ` [OSSTEST PATCH 11/16] host reuse fixes: Fix runvar entry for adhoc tasks Ian Jackson
@ 2020-10-22 16:45 ` Ian Jackson
  2020-10-22 16:45 ` [OSSTEST PATCH 13/16] Prefix guest LV names with the job name Ian Jackson
                   ` (3 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: Ian Jackson @ 2020-10-22 16:45 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson

This changes the way the disk name is constructed but not to any
overall effect.

Signed-off-by: Ian Jackson <iwj@xenproject.org>
---
 Osstest/TestSupport.pm | 9 +++++++--
 ts-debian-install      | 2 +-
 2 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/Osstest/TestSupport.pm b/Osstest/TestSupport.pm
index 5e6b15d9..12aaba79 100644
--- a/Osstest/TestSupport.pm
+++ b/Osstest/TestSupport.pm
@@ -76,7 +76,7 @@ BEGIN {
                       target_jobdir target_extract_jobdistpath_subdir
                       target_extract_jobdistpath target_extract_distpart
 		      target_tftp_prefix
-                      lv_create lv_dev_mapper
+                      lv_create lv_dev_mapper guest_mk_lv_name
 
                       poll_loop tcpconnect await_tcp
                       contents_make_cpio file_simple_write_contents
@@ -2177,6 +2177,11 @@ sub guest_var_commalist ($$) {
     return split /\,/, guest_var($gho,$runvartail,'');
 }
 
+sub guest_mk_lv_name ($$) {
+    my ($gho, $suffix) = @_;
+    return "$gho->{Name}".$suffix;
+}
+
 sub prepareguest ($$$$$$) {
     my ($ho, $gn, $hostname, $tcpcheckport, $mb,
         $boot_timeout) = @_;
@@ -2205,7 +2210,7 @@ sub prepareguest ($$$$$$) {
     # If we have defined guest specific disksize, use it
     $mb = guest_var($gho,'disksize',$mb);
     if (defined $mb) {
-	store_runvar("${gn}_disk_lv", $r{"${gn}_hostname"}.'-disk');
+	store_runvar("${gn}_disk_lv", guest_mk_lv_name($gho, '-disk'));
     }
 
     if (defined $mb) {
diff --git a/ts-debian-install b/ts-debian-install
index f07dd676..8caa9d76 100755
--- a/ts-debian-install
+++ b/ts-debian-install
@@ -100,7 +100,7 @@ END
 
     my $cfg= "/etc/xen/$gho->{Name}.cfg";
     store_runvar("$gho->{Guest}_cfgpath", $cfg);
-    store_runvar("$gho->{Guest}_swap_lv", "$gho->{Name}-swap");
+    store_runvar("$gho->{Guest}_swap_lv", guest_mk_lv_name($gho, "-swap"));
 }
 
 prep();
-- 
2.20.1




* [OSSTEST PATCH 13/16] Prefix guest LV names with the job name
  2020-10-22 16:44 [OSSTEST PATCH 00/16] Bugfixes Ian Jackson
                   ` (11 preceding siblings ...)
  2020-10-22 16:45 ` [OSSTEST PATCH 12/16] Introduce guest_mk_lv_name Ian Jackson
@ 2020-10-22 16:45 ` Ian Jackson
  2020-10-22 16:45 ` [OSSTEST PATCH 14/16] reporting: Minor fix to reporting of tasks with no subtask Ian Jackson
                   ` (2 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: Ian Jackson @ 2020-10-22 16:45 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson

This means that a subsequent test which reuses the same host will not
use the same LVs.  Reusing the same LV names in a subsequent job would
mean relying on the "ad hoc run" cleanup code.  That is a bad idea
because that code is rarely tested, and because, depending on the
situation, the old LVs may even still be in use.  For example, in a
pair test, the guest's LVs will still be set up for use with nbd.

It seems better to fix this by using a fresh LV rather than adding
more teardown code.

The "wear limit" on host reuse is what prevents the disk filling up
with LVs from old guests.

ts-debian-fixup needs special handling, because Debian's xen-tools'
xen-create-image utility hardcodes its notion of LV name construction.
We need to rename the actual LVs (perhaps overwriting old ones from a
previous ad-hoc run) and also update the config.

Signed-off-by: Ian Jackson <iwj@xenproject.org>
---
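Illustration only, not part of the patch: how the config rewrite in
lvnames() treats similar-looking paths.  The VG, guest and job names
and the config fragment are hypothetical; the pattern is the one used
in the patch, without the logging.

```perl
use strict;
use warnings;

# Hypothetical values; in the patch these come from $gho->{Vg},
# $gho->{Name} and $job.
my ($vg, $name, $job) = ('vg0', 'guest1', 'test-job');

my $cfg = join "\n",
    "disk = [ 'phy:/dev/vg0/guest1-disk,xvda2,w',",
    "         'phy:/dev/vg0/guest1-disk2,xvda3,w' ]",
    "";

my $old      = "$name-disk";
my $full_old = "/dev/$vg/$old";
my $full_new = "/dev/$vg/${job}_$old";

# \Q...\E quotes metacharacters in the path.  The negative lookahead
# stops a match that would end in the middle of a longer name, so
# "guest1-disk2" is left alone.
(my $fixed = $cfg) =~ s{\Q$full_old\E(?![0-9a-zA-Z/_.-])}{$full_new}g;

print $fixed;
```

Only the first entry is rewritten; the second path merely shares a
prefix with the old name and is kept as it was.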
 Osstest/TestSupport.pm |  2 +-
 ts-debian-fixup        | 22 ++++++++++++++++++++++
 2 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/Osstest/TestSupport.pm b/Osstest/TestSupport.pm
index 12aaba79..9362a865 100644
--- a/Osstest/TestSupport.pm
+++ b/Osstest/TestSupport.pm
@@ -2179,7 +2179,7 @@ sub guest_var_commalist ($$) {
 
 sub guest_mk_lv_name ($$) {
     my ($gho, $suffix) = @_;
-    return "$gho->{Name}".$suffix;
+    return $job."_$gho->{Name}".$suffix;
 }
 
 sub prepareguest ($$$$$$) {
diff --git a/ts-debian-fixup b/ts-debian-fixup
index a878fe50..810b3aba 100755
--- a/ts-debian-fixup
+++ b/ts-debian-fixup
@@ -37,6 +37,27 @@ sub savecfg () {
     $cfg= get_filecontents("$cfgstash.orig");
 }
 
+sub lvnames () {
+    my $lvs = target_cmd_output_root($ho, "lvdisplay --colon", 30);
+    foreach my $suffix (qw(disk swap)) {
+	my $old = "$gho->{Name}-$suffix";
+	my $new = "${job}_${old}";
+	my $full_old = "/dev/$gho->{Vg}/$old";
+	my $full_new = "/dev/$gho->{Vg}/$new";
+	$cfg =~ s{\Q$full_old\E(?![0-9a-zA-Z/_.-])}{
+            logm "Replacing in domain config \`$&' with \`$full_new'";
+            $full_new;
+        }ge;
+	if ($lvs =~ m{^ *\Q$full_old\E}m) {
+	    if ($lvs =~ m{^ *\Q$full_new\E}m) {
+		# In case we are re-running (eg, adhoc)
+		target_cmd_root($ho, "lvremove -f $full_new", 30);
+	    }
+	    target_cmd_root($ho, "lvrename $full_old $new", 30);
+	}
+    }
+}
+
 sub ether () {
 #    $cfg =~ s/^ [ \t]*
 #        ( vif [ \t]* \= [ \t]* \[ [ \t]* [\'\"]
@@ -207,6 +228,7 @@ sub writecfg () {
 }
 
 savecfg();
+lvnames();
 ether();
 access();
 $console = target_setup_rootdev_console_inittab($ho,$gho,"$mountpoint");
-- 
2.20.1




* [OSSTEST PATCH 14/16] reporting: Minor fix to reporting of tasks with no subtask
  2020-10-22 16:44 [OSSTEST PATCH 00/16] Bugfixes Ian Jackson
                   ` (12 preceding siblings ...)
  2020-10-22 16:45 ` [OSSTEST PATCH 13/16] Prefix guest LV names with the job name Ian Jackson
@ 2020-10-22 16:45 ` Ian Jackson
  2020-10-22 16:45 ` [OSSTEST PATCH 15/16] host reuse fixes: Do not break host-reuse if no host allocated Ian Jackson
  2020-10-22 16:45 ` [OSSTEST PATCH 16/16] starvation: Do not count more than half a flight as starved Ian Jackson
  15 siblings, 0 replies; 17+ messages in thread
From: Ian Jackson @ 2020-10-22 16:45 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson

subtask can be NULL.  If so, do not include it.

This change fixes a warning and a minor cosmetic defect.

Signed-off-by: Ian Jackson <iwj@xenproject.org>
---
 Osstest/Executive.pm | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Osstest/Executive.pm b/Osstest/Executive.pm
index e3ab1dc3..d95d848d 100644
--- a/Osstest/Executive.pm
+++ b/Osstest/Executive.pm
@@ -427,7 +427,7 @@ sub report_rogue_task_description ($) {
     my $info= "rogue task ";
     $info .= " $arow->{type} $arow->{refkey}";
     $info .= " ($arow->{comment})" if defined $arow->{comment};
-    $info .= " $arow->{subtask}";
+    $info .= " $arow->{subtask}" if defined $arow->{subtask};
     $info .= " (user $arow->{username})";
     return $info;
 }
-- 
2.20.1




* [OSSTEST PATCH 15/16] host reuse fixes: Do not break host-reuse if no host allocated
  2020-10-22 16:44 [OSSTEST PATCH 00/16] Bugfixes Ian Jackson
                   ` (13 preceding siblings ...)
  2020-10-22 16:45 ` [OSSTEST PATCH 14/16] reporting: Minor fix to reporting of tasks with no subtask Ian Jackson
@ 2020-10-22 16:45 ` Ian Jackson
  2020-10-22 16:45 ` [OSSTEST PATCH 16/16] starvation: Do not count more than half a flight as starved Ian Jackson
  15 siblings, 0 replies; 17+ messages in thread
From: Ian Jackson @ 2020-10-22 16:45 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson

If host allocation failed, or our dependency jobs failed, then we
won't have allocated a host.  The host runvar will not be set.
In this case, we want to do nothing.

But we forgot to pass $noneok to selecthost.

Signed-off-by: Ian Jackson <iwj@xenproject.org>
---
 ts-host-reuse | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ts-host-reuse b/ts-host-reuse
index e2498bb6..b885a3e6 100755
--- a/ts-host-reuse
+++ b/ts-host-reuse
@@ -165,7 +165,7 @@ sub act_start_test () {
 
 sub act_final () {
     if (!@ARGV) {
-	$ho = selecthost($whhost);
+	$ho = selecthost($whhost, 1);
 	return unless $ho;
 	host_update_lifecycle_info($ho, 'final');
     } elsif ("@ARGV" eq "--post-test-ok") {
-- 
2.20.1




* [OSSTEST PATCH 16/16] starvation: Do not count more than half a flight as starved
  2020-10-22 16:44 [OSSTEST PATCH 00/16] Bugfixes Ian Jackson
                   ` (14 preceding siblings ...)
  2020-10-22 16:45 ` [OSSTEST PATCH 15/16] host reuse fixes: Do not break host-reuse if no host allocated Ian Jackson
@ 2020-10-22 16:45 ` Ian Jackson
  15 siblings, 0 replies; 17+ messages in thread
From: Ian Jackson @ 2020-10-22 16:45 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson

This seems like a sensible rule.

This also prevents the following bizarre behaviour: when a flight has
a handful of jobs that cannot be run at all (eg because it's a
commissioning flight for only hosts of a particular arch), those jobs
can complete quite quickly.  Even with a high X value because only a
smallish portion of the flight has finished, this can lead to a modest
threshold value.  This combines particularly badly with commissioning
flights, where the duration estimates are often nonsense.

Signed-off-by: Ian Jackson <iwj@xenproject.org>
---
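Illustration only, not part of the patch: the new condition in
isolation.  The meanings given to the arguments are assumptions read
off the log format (D = jobs done, W = jobs still waiting), and the
numbers are invented.

```perl
use strict;
use warnings;

# Assumed meanings: $projected_me = our projected completion time,
# $lim = the computed threshold, $d = jobs done, $w = jobs waiting.
sub is_starving {
    my ($projected_me, $lim, $d, $w) = @_;
    return ($projected_me > $lim && $d >= $w) ? 1 : 0;
}

# 3 of 20 jobs finished quickly: over the limit, but most of the
# flight is still waiting, so this is not counted as starved.
my $early = is_starving(5000, 1000, 3, 17);

# 12 of 20 jobs done and still over the limit: counted as starved.
my $late = is_starving(5000, 1000, 12, 8);

print "early: $early, late: $late\n";
```

Under these assumptions the extra $d >= $w term is what caps the
starved portion at half of the flight.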
 ts-hosts-allocate-Executive | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ts-hosts-allocate-Executive b/ts-hosts-allocate-Executive
index b216186a..459b9215 100755
--- a/ts-hosts-allocate-Executive
+++ b/ts-hosts-allocate-Executive
@@ -863,7 +863,7 @@ sub starving ($$) {
 	"D=%d W=%d X=%.3f t_D=%s t_me=%s t_lim=%.3f X'=%.4f (fi.s=%s)",
 	$d, $w, $X, $total_d, $projected_me, $lim, $Xcmp,
 	$fi->{started} - $now;
-    my $bad = $projected_me > $lim;
+    my $bad = $projected_me > $lim && $d >= $w;
     return ($bad, $m);
 }
 
-- 
2.20.1


