* [Xen-devel] [OSSTEST PATCH 00/13] Speed up and restore host history
@ 2019-11-08 18:49 Ian Jackson
  2019-11-08 18:49 ` [Xen-devel] [OSSTEST PATCH 01/13] sg-report-host-history: Improve debugging output Ian Jackson
                   ` (13 more replies)
  0 siblings, 14 replies; 21+ messages in thread
From: Ian Jackson @ 2019-11-08 18:49 UTC (permalink / raw)
  To: xen-devel; +Cc: Jürgen Groß, Ian Jackson

Earlier this week we discovered that sg-report-host-history was running
extremely slowly.  We applied an emergency fix, commit 0fa72b13f5af:
  sg-report-host-history: Reduce limit from 2000 to 200

The main problem is that sg-report-host-history runs once for each
flight, and must generate a relevant history view of the recent
history for each host - including much history that is already in the
old version of the html file.

The slow part is querying the database for information about each job,
including its final step, allocation step, etc.  (The main query which
digs out relevant jobs is also rather time consuming, but it runs all in
one go and takes only a minute or two.)

In this series we introduce a mechanism which caches much of the
historical analysis.

It is not straightforward to reuse old html data as-is because we
would have to do a merge sort with the new data and that would involve
rewriting the alternating background colour (!)

So instead, we stuff the information we got from the database into
comments in the HTML, which we can then scan on future runs.
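
A rough sketch of the reading side, assuming the comment format used
later in this series (key=value tokens, with a bare word starting a
per-query sub-hash).  parse_cache_line is a hypothetical name and the
sample line is made up; this is not the code from the patches:

```perl
use strict;
use warnings;

# Hypothetical helper: decode one cache comment line into a hash.
# Plain key=value pairs go at the top level; a bare word starts a
# sub-hash that is stored under '%<word>'.
sub parse_cache_line {
    my ($line) = @_;
    return undef
        unless $line =~ m{^<!-- osstest-report-reuseable (.*)-->$};
    my $jr  = {};
    my $cur = $jr;
    foreach my $tok (split / /, $1) {
        if ($tok =~ m{^(\w+)$}) {
            $cur = {};
            $jr->{"%$1"} = $cur;
            next;
        }
        my ($k, $v) = $tok =~ m{^(\w+)=(.*)$}
            or die "bad token: $tok";
        # Undo the percent-encoding applied when the entry was written.
        $v =~ s{%([0-9a-f]{2})}{ chr hex $1 }ge;
        $cur->{$k} = $v;
    }
    return $jr;
}

# Made-up sample line, for illustration only.
my $jr = parse_cache_line(
    '<!-- osstest-report-reuseable flight=1234 job=test-a e finished=1573238940 -->');
print "$jr->{flight} $jr->{job} $jr->{'%e'}{finished}\n";
```

Lines that are not cache comments yield undef, so the ordinary table
markup in the same file is simply skipped.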

The overall result is a factor of 10 speedup in my tests, for the
original history limit of 2000.  That is now fast enough that we can put
it back.

(I was not able to reproduce the exceptional case I saw earlier in the
week, where it was apparently taking hours.  I suspect that there
comes a tipping point where db transactions end up being restarted.)

The patches are broken down into small pieces so that I could think
about them clearly and do self-review.

Ian Jackson (13):
  sg-report-host-history: Improve debugging output
  sg-report-host-history: New --no-install option for testing
  sg-report-host-history: Move `computeflightsrange' after hosts
  sg-report-host-history: Actually honour $minflight
  sg-report-host-history: Get job status from mainquery
  sg-report-host-history: Add $cachekey argument to jobquery
  sg-report-host-history: Store per-job query results in %$jr
  sg-report-host-history: Write cache entries
  sg-report-host-history: Write cache entries for tail, too
  sg-report-host-history: Read cache entries
  sg-report-host-history: Move job runvars query later
  sg-report-host-history: Cache runvar queries (power information)
  Revert "sg-report-host-history: Reduce limit from 2000 to 200"

 sg-report-host-history | 189 +++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 160 insertions(+), 29 deletions(-)

-- 
2.11.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel


* [Xen-devel] [OSSTEST PATCH 01/13] sg-report-host-history: Improve debugging output
  2019-11-08 18:49 [Xen-devel] [OSSTEST PATCH 00/13] Speed up and restore host history Ian Jackson
@ 2019-11-08 18:49 ` Ian Jackson
  2019-11-08 18:49 ` [Xen-devel] [OSSTEST PATCH 02/13] sg-report-host-history: New --no-install option for testing Ian Jackson
                   ` (12 subsequent siblings)
  13 siblings, 0 replies; 21+ messages in thread
From: Ian Jackson @ 2019-11-08 18:49 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
---
 sg-report-host-history | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/sg-report-host-history b/sg-report-host-history
index bd7391e0..42def6bf 100755
--- a/sg-report-host-history
+++ b/sg-report-host-history
@@ -101,6 +101,8 @@ END
     $minflight //= 0;
 
     $flightcond = "(flight > $minflight)";
+
+    print DEBUG "MINFLIGHT $minflight\n";
 }
 
 sub jobquery ($$) {
@@ -128,18 +130,22 @@ END
 
     push @params, scalar keys %hosts;
 
+    print DEBUG "MAINQUERY...\n";
     $runvarq->execute(@params);
 
     print DEBUG "FIRST PASS\n";
     while (my $jr= $runvarq->fetchrow_hashref()) {
-	print DEBUG "JOB $jr->{flight}.$jr->{job} ";
+	print DEBUG " $jr->{flight}.$jr->{job} ";
 	push @{ $hosts{$jr->{val}} }, $jr;
     }
+    print DEBUG "\n";
 }
 
 sub reporthost ($) {
     my ($hostname) = @_;
 
+    print DEBUG "HOST $hostname...\n";
+
     die if $hostname =~ m/[^-_.+0-9a-z]/;
 
     my $html_file= "$htmlout/$hostname.html";
@@ -204,7 +210,7 @@ END
 
     my @rows;
     foreach my $jr (@$inrows) {
-	print DEBUG "JOB $jr->{flight}.$jr->{job}\n";
+	print DEBUG "JOB $jr->{flight}.$jr->{job} ";
 
 	my $endedrow = jobquery($endedq, $jr);
 	if (!$endedrow) {
@@ -222,6 +228,7 @@ END
 
     my $alternate = 0;
     foreach my $jr (@rows) {
+        print DEBUG "JR $jr->{flight}.$jr->{job}\n";
 	my $ir = jobquery($infoq, $jr);
 	my $ar = jobquery($allocdq, $jr);
 	my $ident = $jr->{name};
@@ -340,6 +347,7 @@ foreach my $host (@ARGV) {
 END
             $hostsinflightq->execute($flight);
 	    while (my $row = $hostsinflightq->fetchrow_hashref()) {
+                print DEBUG "HR $row->{val}\n";
 		$hosts{$row->{val}} = [ ];
 	    }
 	});
-- 
2.11.0



* [Xen-devel] [OSSTEST PATCH 02/13] sg-report-host-history: New --no-install option for testing
  2019-11-08 18:49 [Xen-devel] [OSSTEST PATCH 00/13] Speed up and restore host history Ian Jackson
  2019-11-08 18:49 ` [Xen-devel] [OSSTEST PATCH 01/13] sg-report-host-history: Improve debugging output Ian Jackson
@ 2019-11-08 18:49 ` Ian Jackson
  2019-11-08 18:49 ` [Xen-devel] [OSSTEST PATCH 03/13] sg-report-host-history: Move `computeflightsrange' after hosts Ian Jackson
                   ` (11 subsequent siblings)
  13 siblings, 0 replies; 21+ messages in thread
From: Ian Jackson @ 2019-11-08 18:49 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson

No change for existing callers.
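
For illustration, the write-then-rename pattern that the option gates
looks roughly like this.  write_report and the temporary directory are
assumptions made for the sketch; the real script writes via its own
filehandle H:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: write $content to "$file.new", then rename it
# over $file only when $doinstall is true.
sub write_report {
    my ($file, $content, $doinstall) = @_;
    open my $h, '>', "$file.new" or die "$file.new: $!";
    print $h $content or die $!;
    close $h or die $!;
    if ($doinstall) {
        rename "$file.new", $file or die "$file: $!";
    }
    return;
}

my $dir = tempdir(CLEANUP => 1);
write_report("$dir/host.html", "<html></html>\n", 0);
my $state = (-e "$dir/host.html") ? "installed" : "left as host.html.new";
print "$state\n";
```

With the install step skipped, the freshly generated file stays next to
the live one, where it can be inspected or diffed.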

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
---
 sg-report-host-history | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/sg-report-host-history b/sg-report-host-history
index 42def6bf..c9f4aaa6 100755
--- a/sg-report-host-history
+++ b/sg-report-host-history
@@ -31,6 +31,7 @@ use Osstest::Executive qw(:DEFAULT :colours);
 our $limit= 200;
 our $flightlimit;
 our $htmlout = ".";
+our $doinstall=1;
 our @blessings;
 
 open DEBUG, ">/dev/null";
@@ -51,6 +52,8 @@ while (@ARGV && $ARGV[0] =~ m/^-/) {
         push @blessings, split ',', $1;
     } elsif (m/^--html-dir=(.*)$/) {
         $htmlout= $1;
+    } elsif (m/^--no-install$/) {
+        $doinstall= 0;
     } elsif (m/^--debug/) {
         open DEBUG, ">&2" or die $!;
         DEBUG->autoflush(1);
@@ -322,7 +325,8 @@ END
     print H "</table></body></html>\n";
 
     close H or die $!;
-    rename "$html_file.new", "$html_file" or die "$html_file $!";
+    rename "$html_file.new", "$html_file" or die "$html_file $!"
+        if $doinstall;
 }
 
 db_retry($dbh_tests, [], sub {
-- 
2.11.0



* [Xen-devel] [OSSTEST PATCH 03/13] sg-report-host-history: Move `computeflightsrange' after hosts
  2019-11-08 18:49 [Xen-devel] [OSSTEST PATCH 00/13] Speed up and restore host history Ian Jackson
  2019-11-08 18:49 ` [Xen-devel] [OSSTEST PATCH 01/13] sg-report-host-history: Improve debugging output Ian Jackson
  2019-11-08 18:49 ` [Xen-devel] [OSSTEST PATCH 02/13] sg-report-host-history: New --no-install option for testing Ian Jackson
@ 2019-11-08 18:49 ` Ian Jackson
  2019-11-08 18:49 ` [Xen-devel] [OSSTEST PATCH 04/13] sg-report-host-history: Actually honour $minflight Ian Jackson
                   ` (10 subsequent siblings)
  13 siblings, 0 replies; 21+ messages in thread
From: Ian Jackson @ 2019-11-08 18:49 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson

This will allow the flights range computation to depend on the hosts
we are interested in.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
---
 sg-report-host-history | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/sg-report-host-history b/sg-report-host-history
index c9f4aaa6..fc51074d 100755
--- a/sg-report-host-history
+++ b/sg-report-host-history
@@ -329,10 +329,6 @@ END
         if $doinstall;
 }
 
-db_retry($dbh_tests, [], sub {
-    computeflightsrange();
-});
-
 foreach my $host (@ARGV) {
     if ($host =~ m/^flight:/) {
 	my $flight=$'; #';
@@ -365,6 +361,10 @@ END
 exit 0 unless %hosts;
 
 db_retry($dbh_tests, [], sub {
+    computeflightsrange();
+});
+
+db_retry($dbh_tests, [], sub {
     mainquery();
 });
 
-- 
2.11.0



* [Xen-devel] [OSSTEST PATCH 04/13] sg-report-host-history: Actually honour $minflight
  2019-11-08 18:49 [Xen-devel] [OSSTEST PATCH 00/13] Speed up and restore host history Ian Jackson
                   ` (2 preceding siblings ...)
  2019-11-08 18:49 ` [Xen-devel] [OSSTEST PATCH 03/13] sg-report-host-history: Move `computeflightsrange' after hosts Ian Jackson
@ 2019-11-08 18:49 ` Ian Jackson
  2019-11-08 18:49 ` [Xen-devel] [OSSTEST PATCH 05/13] sg-report-host-history: Get job status from mainquery Ian Jackson
                   ` (9 subsequent siblings)
  13 siblings, 0 replies; 21+ messages in thread
From: Ian Jackson @ 2019-11-08 18:49 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson

This seriously speeds up some of the queries.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
---
 sg-report-host-history | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/sg-report-host-history b/sg-report-host-history
index fc51074d..d47784d9 100755
--- a/sg-report-host-history
+++ b/sg-report-host-history
@@ -67,6 +67,7 @@ while (@ARGV && $ARGV[0] =~ m/^-/) {
 
 our $restrictflight_cond = restrictflight_cond();
 our $flightcond;
+our $minflight;
 
 sub computeflightsrange () {
     if (!$flightlimit) {
@@ -100,7 +101,7 @@ END
 	  LIMIT 1
 END
     $minflightsq->execute();
-    my ($minflight) = $minflightsq->fetchrow_array();
+    ($minflight,) = $minflightsq->fetchrow_array();
     $minflight //= 0;
 
     $flightcond = "(flight > $minflight)";
@@ -127,10 +128,12 @@ sub mainquery () {
 	   AND ($valcond)
 	   AND $flightcond
            AND $restrictflight_cond
+           AND flight > ?
 	 ORDER BY flight DESC
 	 LIMIT ($limit * 3 + 100) * ?
 END
 
+    push @params, $minflight;
     push @params, scalar keys %hosts;
 
     print DEBUG "MAINQUERY...\n";
-- 
2.11.0



* [Xen-devel] [OSSTEST PATCH 05/13] sg-report-host-history: Get job status from mainquery
  2019-11-08 18:49 [Xen-devel] [OSSTEST PATCH 00/13] Speed up and restore host history Ian Jackson
                   ` (3 preceding siblings ...)
  2019-11-08 18:49 ` [Xen-devel] [OSSTEST PATCH 04/13] sg-report-host-history: Actually honour $minflight Ian Jackson
@ 2019-11-08 18:49 ` Ian Jackson
  2019-11-08 18:49 ` [Xen-devel] [OSSTEST PATCH 06/13] sg-report-host-history: Add $cachekey argument to jobquery Ian Jackson
                   ` (8 subsequent siblings)
  13 siblings, 0 replies; 21+ messages in thread
From: Ian Jackson @ 2019-11-08 18:49 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson

We are going to need this as part of our data reuse cache key, so we
need it this early.  This change hardly slows the query down.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
---
 sg-report-host-history | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/sg-report-host-history b/sg-report-host-history
index d47784d9..81a7a8d8 100755
--- a/sg-report-host-history
+++ b/sg-report-host-history
@@ -122,8 +122,9 @@ sub mainquery () {
     our @params = keys %hosts;
 
     our $runvarq //= db_prepare(<<END);
-	SELECT flight, job, name, val
+	SELECT flight, job, name, val, status
 	  FROM runvars
+          JOIN jobs USING (flight, job)
 	 WHERE $namecond
 	   AND ($valcond)
 	   AND $flightcond
@@ -186,10 +187,9 @@ sub reporthost ($) {
 END
 
     our $infoq //= db_prepare(<<END);
-	SELECT blessing, branch, intended, status
+	SELECT blessing, branch, intended
 	  FROM flights
-	  JOIN jobs USING (flight)
-	 WHERE flight=? AND job=?
+	 WHERE flight=? AND ?!='X'
 END
 
     our $allocdq //= db_prepare(<<END);
-- 
2.11.0



* [Xen-devel] [OSSTEST PATCH 06/13] sg-report-host-history: Add $cachekey argument to jobquery
  2019-11-08 18:49 [Xen-devel] [OSSTEST PATCH 00/13] Speed up and restore host history Ian Jackson
                   ` (4 preceding siblings ...)
  2019-11-08 18:49 ` [Xen-devel] [OSSTEST PATCH 05/13] sg-report-host-history: Get job status from mainquery Ian Jackson
@ 2019-11-08 18:49 ` Ian Jackson
  2019-11-08 18:49 ` [Xen-devel] [OSSTEST PATCH 07/13] sg-report-host-history: Store per-job query results in %$jr Ian Jackson
                   ` (7 subsequent siblings)
  13 siblings, 0 replies; 21+ messages in thread
From: Ian Jackson @ 2019-11-08 18:49 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson

This key will distinguish the results of different queries we do per
job.  Right now it is not used, so no functional change.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
---
 sg-report-host-history | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/sg-report-host-history b/sg-report-host-history
index 81a7a8d8..4c40cbec 100755
--- a/sg-report-host-history
+++ b/sg-report-host-history
@@ -109,7 +109,8 @@ END
     print DEBUG "MINFLIGHT $minflight\n";
 }
 
-sub jobquery ($$) {
+sub jobquery ($$$) {
+    my ($q, $jr, $cachekey) = @_;
     my ($q, $jr) = @_;
     $q->execute($jr->{flight}, $jr->{job});
     return $q->fetchrow_hashref();
@@ -218,7 +219,7 @@ END
     foreach my $jr (@$inrows) {
 	print DEBUG "JOB $jr->{flight}.$jr->{job} ";
 
-	my $endedrow = jobquery($endedq, $jr);
+	my $endedrow = jobquery($endedq, $jr, 'e');
 	if (!$endedrow) {
 	    print DEBUG "no-finished\n";
 	    next;
@@ -235,8 +236,8 @@ END
     my $alternate = 0;
     foreach my $jr (@rows) {
         print DEBUG "JR $jr->{flight}.$jr->{job}\n";
-	my $ir = jobquery($infoq, $jr);
-	my $ar = jobquery($allocdq, $jr);
+	my $ir = jobquery($infoq, $jr, 'i');
+	my $ar = jobquery($allocdq, $jr, 'a');
 	my $ident = $jr->{name};
 	$jrunvarq->execute($jr->{flight}, $jr->{job}, $ident);
         my %runvars;
-- 
2.11.0



* [Xen-devel] [OSSTEST PATCH 07/13] sg-report-host-history: Store per-job query results in %$jr
  2019-11-08 18:49 [Xen-devel] [OSSTEST PATCH 00/13] Speed up and restore host history Ian Jackson
                   ` (5 preceding siblings ...)
  2019-11-08 18:49 ` [Xen-devel] [OSSTEST PATCH 06/13] sg-report-host-history: Add $cachekey argument to jobquery Ian Jackson
@ 2019-11-08 18:49 ` Ian Jackson
  2019-11-08 18:49 ` [Xen-devel] [OSSTEST PATCH 08/13] sg-report-host-history: Write cache entries Ian Jackson
                   ` (6 subsequent siblings)
  13 siblings, 0 replies; 21+ messages in thread
From: Ian Jackson @ 2019-11-08 18:49 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson

jobquery now looks for the subquery results in %$jr, under the
cachekey, and only runs the query if it's not found.  It then stores
the value.

We are going to persist the contents of %$jr across runs, and then
this will avoid rerunning queries needlessly.

No functional change yet.
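
A minimal sketch of the lookup-or-query pattern, with a code ref
standing in for the prepared statement so it runs without a database.
cached_query and the sample values are hypothetical:

```perl
use strict;
use warnings;

my $misses = 0;

# Hypothetical stand-in for jobquery: $fetch is a code ref playing the
# role of the prepared statement, so the sketch needs no database.
sub cached_query {
    my ($fetch, $jr, $cachekey) = @_;
    my $slot = '%' . $cachekey;    # e.g. '%e'; cannot clash with a column name
    if (!$jr->{$slot}) {
        $misses++;
        $jr->{$slot} = $fetch->($jr->{flight}, $jr->{job});
    }
    return $jr->{$slot};
}

my $jr    = { flight => 1234, job => 'test-a' };
my $fetch = sub { return { finished => 1573238940 } };
cached_query($fetch, $jr, 'e') for 1 .. 3;
print "misses: $misses\n";
```

As in the patch, a lookup that returns no row is not remembered, so it
is retried on the next call.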

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
---
 sg-report-host-history | 24 +++++++++++++++++++++---
 1 file changed, 21 insertions(+), 3 deletions(-)

diff --git a/sg-report-host-history b/sg-report-host-history
index 4c40cbec..8767b25d 100755
--- a/sg-report-host-history
+++ b/sg-report-host-history
@@ -109,11 +109,21 @@ END
     print DEBUG "MINFLIGHT $minflight\n";
 }
 
+our $jqcachemisses = 0;
+our $jqtotal = 0;
+
 sub jobquery ($$$) {
     my ($q, $jr, $cachekey) = @_;
-    my ($q, $jr) = @_;
-    $q->execute($jr->{flight}, $jr->{job});
-    return $q->fetchrow_hashref();
+    $jqtotal++;
+    $cachekey = '%'.$cachekey;
+    my $cached = $jr->{$cachekey};
+    if (!$cached) {
+	$jqcachemisses++;
+	$q->execute($jr->{flight}, $jr->{job});
+	$cached = $q->fetchrow_hashref();
+	$jr->{$cachekey} = $cached;
+    }
+    return $cached;
 }
 
 our %hosts;
@@ -215,6 +225,12 @@ END
     my $inrows = $hosts{$hostname};
     print DEBUG "FOUND ", (scalar @$inrows), " ROWS for $hostname\n";
 
+    # Each entry in @$inrows is a $jr, which is a hash
+    # It has keys for the result columns in mainquery
+    # It also has keys '%<letter>' (yes, with a literal '%')
+    # which are the results of per-job queries.
+    # The contents of $jr for each job is cached across runs. (TODO)
+
     my @rows;
     foreach my $jr (@$inrows) {
 	print DEBUG "JOB $jr->{flight}.$jr->{job} ";
@@ -377,3 +393,5 @@ foreach my $host (sort keys %hosts) {
 	reporthost $host;
     });
 }
+
+print DEBUG "JQ CACHE ".($jqtotal-$jqcachemisses)." / $jqtotal\n";
-- 
2.11.0



* [Xen-devel] [OSSTEST PATCH 08/13] sg-report-host-history: Write cache entries
  2019-11-08 18:49 [Xen-devel] [OSSTEST PATCH 00/13] Speed up and restore host history Ian Jackson
                   ` (6 preceding siblings ...)
  2019-11-08 18:49 ` [Xen-devel] [OSSTEST PATCH 07/13] sg-report-host-history: Store per-job query results in %$jr Ian Jackson
@ 2019-11-08 18:49 ` Ian Jackson
  2019-11-08 18:49 ` [Xen-devel] [OSSTEST PATCH 09/13] sg-report-host-history: Write cache entries for tail, too Ian Jackson
                   ` (5 subsequent siblings)
  13 siblings, 0 replies; 21+ messages in thread
From: Ian Jackson @ 2019-11-08 18:49 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson

Write the %$jr contents out in a fairly terse format.  We stuff it
into a parseable SGML/XML comment in the output HTML.

Nothing makes use of this yet - parsing it back in will come later.
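
To show the shape of the output, here is a standalone sketch of the
encoding that returns the comment as a string instead of printing it
to H.  format_cache_entry and the sample record are hypothetical:

```perl
use strict;
use warnings;

# Hypothetical helper: render one job record as a cache comment.
# Top-level scalars come first; each '%<letter>' sub-hash follows,
# introduced by its bare letter.
sub format_cache_entry {
    my ($jr) = @_;
    my $enc = sub {
        my ($h) = @_;
        my $out = '';
        foreach my $k (sort keys %$h) {
            next if $k =~ m/^%/;
            my $v = $h->{$k};
            # Percent-encode anything outside the safe set (space, '%', ...).
            $v =~ s{([^-+=/~:;_.,\w])}{ sprintf "%%%02x", ord $1 }ge;
            $out .= " $k=$v";
        }
        return $out;
    };
    my $line = "<!-- osstest-report-reuseable" . $enc->($jr);
    foreach my $hk (sort keys %$jr) {
        next unless $hk =~ m/^%(.*)$/;
        $line .= " $1" . $enc->($jr->{$hk});
    }
    return "$line -->\n";
}

# Made-up sample record, for illustration only.
print format_cache_entry({
    flight => 1234,
    job    => 'test a',
    '%e'   => { finished => 1573238940 },
});
```

Because spaces inside values are percent-encoded, a reader can split
the comment on single spaces to recover the tokens.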

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
---
 sg-report-host-history | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/sg-report-host-history b/sg-report-host-history
index 8767b25d..335efa1c 100755
--- a/sg-report-host-history
+++ b/sg-report-host-history
@@ -246,6 +246,27 @@ END
 	push @rows, { %$jr, %$endedrow };
     }
 
+    my $write_cache_entry = sub {
+	my ($jr) = @_;
+        print H "<!-- osstest-report-reuseable";
+	my $whash = sub {
+	    my ($h) = @_;
+	    foreach my $k (sort keys %$h) {
+		next if $k =~ m/^\%/;
+		$_ = $h->{$k};
+		s{[^-+=/~:;_.,\w]}{ sprintf "%%%02x", ord $& }ge;
+		printf H " %s=%s", $k, $_;
+	    }
+	};
+	$whash->($jr);
+	foreach my $hk (sort keys %$jr) {
+	    next unless $hk =~ m/^\%/;
+	    print H " $'";
+	    $whash->($jr->{$hk});
+	}
+	print H " -->\n";
+    };
+
     @rows = sort { $b->{finished} <=> $a->{finished} } @rows;
     $#rows = $limit-1 if @rows > $limit;
 
@@ -338,6 +359,8 @@ END
         print H "<td>" if !$any_power;
 	print H "</td>\n";
 
+	$write_cache_entry->($jr);
+
 	print H "</tr>\n\n";
 	$alternate ^= 1;
     }
-- 
2.11.0



* [Xen-devel] [OSSTEST PATCH 09/13] sg-report-host-history: Write cache entries for tail, too
  2019-11-08 18:49 [Xen-devel] [OSSTEST PATCH 00/13] Speed up and restore host history Ian Jackson
                   ` (7 preceding siblings ...)
  2019-11-08 18:49 ` [Xen-devel] [OSSTEST PATCH 08/13] sg-report-host-history: Write cache entries Ian Jackson
@ 2019-11-08 18:49 ` Ian Jackson
  2019-11-08 18:49 ` [Xen-devel] [OSSTEST PATCH 10/13] sg-report-host-history: Read cache entries Ian Jackson
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 21+ messages in thread
From: Ian Jackson @ 2019-11-08 18:49 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson

mainquery deliberately fetches more rows than are needed for the
output limit $limit.  Then, for each host, we sort them by the time of
the last step, which means we must know the last step, and finding it
takes a separate query for each job.  We want to cache this information
even for jobs we do not actually report in the html output.

(There is still nothing which actually reads this cache data.)
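
The split between reported rows and cache-only rows can be sketched as
follows.  partition_rows is a hypothetical helper; the patch does the
same thing inline with a counter rather than building two lists:

```perl
use strict;
use warnings;

# Hypothetical helper: split job records into those shown in the table
# and those kept only as cache entries, newest first.
sub partition_rows {
    my ($limit, @rows) = @_;
    @rows = sort { $b->{finished} <=> $a->{finished} } @rows;
    my (@shown, @cache_only);
    my $wrote = 0;
    foreach my $jr (@rows) {
        if ($wrote++ >= $limit) {
            push @cache_only, $jr;    # no table row, cache entry only
            next;
        }
        push @shown, $jr;             # table row plus cache entry
    }
    return (\@shown, \@cache_only);
}

my @jobs = map { +{ finished => $_ } } (10, 30, 20, 40);
my ($shown, $tail) = partition_rows(2, @jobs);
print "shown: ", join(' ', map { $_->{finished} } @$shown), "\n";
print "tail:  ", join(' ', map { $_->{finished} } @$tail), "\n";
```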

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
---
 sg-report-host-history | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/sg-report-host-history b/sg-report-host-history
index 335efa1c..7dcfac9a 100755
--- a/sg-report-host-history
+++ b/sg-report-host-history
@@ -268,10 +268,15 @@ END
     };
 
     @rows = sort { $b->{finished} <=> $a->{finished} } @rows;
-    $#rows = $limit-1 if @rows > $limit;
 
     my $alternate = 0;
+    my $wrote = 0;
     foreach my $jr (@rows) {
+	if ($wrote++ >= $limit) {
+	    $write_cache_entry->($jr);
+	    next;
+	}
+
         print DEBUG "JR $jr->{flight}.$jr->{job}\n";
 	my $ir = jobquery($infoq, $jr, 'i');
 	my $ar = jobquery($allocdq, $jr, 'a');
-- 
2.11.0



* [Xen-devel] [OSSTEST PATCH 10/13] sg-report-host-history: Read cache entries
  2019-11-08 18:49 [Xen-devel] [OSSTEST PATCH 00/13] Speed up and restore host history Ian Jackson
                   ` (8 preceding siblings ...)
  2019-11-08 18:49 ` [Xen-devel] [OSSTEST PATCH 09/13] sg-report-host-history: Write cache entries for tail, too Ian Jackson
@ 2019-11-08 18:49 ` Ian Jackson
  2019-11-08 18:49 ` [Xen-devel] [OSSTEST PATCH 11/13] sg-report-host-history: Move job runvars query later Ian Jackson
                   ` (3 subsequent siblings)
  13 siblings, 0 replies; 21+ messages in thread
From: Ian Jackson @ 2019-11-08 18:49 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
---
 sg-report-host-history | 57 +++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 56 insertions(+), 1 deletion(-)

diff --git a/sg-report-host-history b/sg-report-host-history
index 7dcfac9a..e67c7346 100755
--- a/sg-report-host-history
+++ b/sg-report-host-history
@@ -31,6 +31,7 @@ use Osstest::Executive qw(:DEFAULT :colours);
 our $limit= 200;
 our $flightlimit;
 our $htmlout = ".";
+our $read_existing=1;
 our $doinstall=1;
 our @blessings;
 
@@ -52,6 +53,8 @@ while (@ARGV && $ARGV[0] =~ m/^-/) {
         push @blessings, split ',', $1;
     } elsif (m/^--html-dir=(.*)$/) {
         $htmlout= $1;
+    } elsif (m/^--regenerate$/) {
+        $read_existing= 0;
     } elsif (m/^--no-install$/) {
         $doinstall= 0;
     } elsif (m/^--debug/) {
@@ -69,6 +72,41 @@ our $restrictflight_cond = restrictflight_cond();
 our $flightcond;
 our $minflight;
 
+our %hcaches;
+
+sub read_existing_logs ($) {
+    my ($hostname) = @_;
+    return unless $read_existing;
+    my $html_file = "$htmlout/$hostname.html";
+    if (!open H, $html_file) {
+        return if $!==ENOENT;
+        die "failed to open $html_file: $!";
+    }
+    my $tcache = { };
+    $hcaches{$hostname} = $tcache;
+    for (;;) {
+        $_ = <H> // last;
+        next unless m{^\<\!-- osstest-report-reuseable (.*)--\>$};
+	my $jr = {};
+	my $ch = $jr;
+	foreach (split / /, $1) {
+	    if (m{^\w+$}) {
+		$ch = { };
+		$jr->{'%'.$&} = $ch;
+		next;
+	    }
+	    s{^(\w+)=}{} or die;
+	    my $k = $1;
+	    s{\%([0-9a-f]{2})}{ chr hex $1 }ge;
+	    $ch->{$k} = $_;
+	    print DEBUG "GOTCACHE $hostname $k\n";
+	}
+	print DEBUG "GOTCACHE $hostname \@ $jr->{flight} $jr->{job} $jr->{status},$jr->{name}\n";
+	$tcache->{$jr->{flight},$jr->{job},$jr->{status},$jr->{name}} = $jr;
+    }
+    close H;
+}
+
 sub computeflightsrange () {
     if (!$flightlimit) {
 	my $flagscond =
@@ -225,16 +263,26 @@ END
     my $inrows = $hosts{$hostname};
     print DEBUG "FOUND ", (scalar @$inrows), " ROWS for $hostname\n";
 
+    my $tcache = $hcaches{$hostname};
+
     # Each entry in @$inrows is a $jr, which is a hash
     # It has keys for the result columns in mainquery
     # It also has keys '%<letter>' (yes, with a literal '%')
     # which are the results of per-job queries.
-    # The contents of $jr for each job is cached across runs. (TODO)
+    # The contents of $jr for each job is cached across runs.
 
     my @rows;
+    my $cachehits = 0;
     foreach my $jr (@$inrows) {
 	print DEBUG "JOB $jr->{flight}.$jr->{job} ";
 
+	my $cacherow =
+	    $tcache->{$jr->{flight},$jr->{job},$jr->{status},$jr->{name}};
+	if ($cacherow) {
+	    $jr = $cacherow;
+	    $cachehits++;
+	}
+
 	my $endedrow = jobquery($endedq, $jr, 'e');
 	if (!$endedrow) {
 	    print DEBUG "no-finished\n";
@@ -246,6 +294,9 @@ END
 	push @rows, { %$jr, %$endedrow };
     }
 
+    print DEBUG "CACHE $hostname $cachehits / ".(scalar @rows)
+	." of ".(scalar %$tcache)."\n";
+
     my $write_cache_entry = sub {
 	my ($jr) = @_;
         print H "<!-- osstest-report-reuseable";
@@ -408,6 +459,10 @@ END
 
 exit 0 unless %hosts;
 
+foreach (keys %hosts) {
+    read_existing_logs($_);
+}
+
 db_retry($dbh_tests, [], sub {
     computeflightsrange();
 });
-- 
2.11.0



* [Xen-devel] [OSSTEST PATCH 11/13] sg-report-host-history: Move job runvars query later
  2019-11-08 18:49 [Xen-devel] [OSSTEST PATCH 00/13] Speed up and restore host history Ian Jackson
                   ` (9 preceding siblings ...)
  2019-11-08 18:49 ` [Xen-devel] [OSSTEST PATCH 10/13] sg-report-host-history: Read cache entries Ian Jackson
@ 2019-11-08 18:49 ` Ian Jackson
  2019-11-08 18:50 ` [Xen-devel] [OSSTEST PATCH 12/13] sg-report-host-history: Cache runvar queries (power information) Ian Jackson
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 21+ messages in thread
From: Ian Jackson @ 2019-11-08 18:49 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson

This query is just used for the power methods.  Put it near there.
Also, indent it in a `do' block.  These changes will make the next
change easier to read.

No functional change.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
---
 sg-report-host-history | 21 ++++++++++++---------
 1 file changed, 12 insertions(+), 9 deletions(-)

diff --git a/sg-report-host-history b/sg-report-host-history
index e67c7346..7c2116d3 100755
--- a/sg-report-host-history
+++ b/sg-report-host-history
@@ -332,11 +332,6 @@ END
 	my $ir = jobquery($infoq, $jr, 'i');
 	my $ar = jobquery($allocdq, $jr, 'a');
 	my $ident = $jr->{name};
-	$jrunvarq->execute($jr->{flight}, $jr->{job}, $ident);
-        my %runvars;
-        while (my ($n, $v) = $jrunvarq->fetchrow_array()) {
-            $runvars{$n} = $v;
-        }
 
 	my $altcolour = report_altcolour($alternate);
 	print H "<tr $altcolour>";
@@ -377,10 +372,18 @@ END
 	print H "<td $ri->{ColourAttr}>$ri->{Content}</td>\n";
 
 	my %powers;
-	foreach my $r (sort keys %runvars) {
-	    next unless $r =~ m{^\Q${ident}\E_power_};
-	    $powers{$'} = $runvars{$r};
-	}
+	do {
+	    $jrunvarq->execute($jr->{flight}, $jr->{job}, $ident);
+	    my %runvars;
+	    while (my ($n, $v) = $jrunvarq->fetchrow_array()) {
+		$runvars{$n} = $v;
+	    }
+
+	    foreach my $r (sort keys %runvars) {
+		next unless $r =~ m{^\Q${ident}\E_power_};
+		$powers{$'} = $runvars{$r};
+	    }
+	};
 	my $skipped = 0;
         my $any_power = 0;
         my $pr_power_colour = sub {
-- 
2.11.0



* [Xen-devel] [OSSTEST PATCH 12/13] sg-report-host-history: Cache runvar queries (power information)
  2019-11-08 18:49 [Xen-devel] [OSSTEST PATCH 00/13] Speed up and restore host history Ian Jackson
                   ` (10 preceding siblings ...)
  2019-11-08 18:49 ` [Xen-devel] [OSSTEST PATCH 11/13] sg-report-host-history: Move job runvars query later Ian Jackson
@ 2019-11-08 18:50 ` Ian Jackson
  2019-11-08 18:50 ` [Xen-devel] [OSSTEST PATCH 13/13] Revert "sg-report-host-history: Reduce limit from 2000 to 200" Ian Jackson
  2019-11-08 19:45 ` [Xen-devel] [OSSTEST PATCH 00/13] Speed up and restore host history Sander Eikelenboom
  13 siblings, 0 replies; 21+ messages in thread
From: Ian Jackson @ 2019-11-08 18:50 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson

This per-job processing was not done with jobquery, so was not cached.
We assign it the cache letter `p'.
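
Roughly, the cached lookup behaves like the sketch below.  job_powers
is a hypothetical helper, $fetch_runvars stands in for the runvars
query, and the runvar names in the example are made up:

```perl
use strict;
use warnings;

# Hypothetical helper: return the power settings for a job, reusing
# $jr->{'%p'} when present.  $fetch_runvars stands in for the runvars
# query and returns a hash ref of name => value.
sub job_powers {
    my ($jr, $ident, $fetch_runvars) = @_;
    return $jr->{'%p'} if $jr->{'%p'};
    my $runvars = $fetch_runvars->($jr->{flight}, $jr->{job});
    my %powers;
    foreach my $r (sort keys %$runvars) {
        next unless $r =~ m{^\Q${ident}\E_power_(.*)$};
        $powers{$1} = $runvars->{$r};
    }
    $jr->{'%p'} = \%powers;
    return $jr->{'%p'};
}

# Made-up runvars, for illustration only.
my $jr    = { flight => 1234, job => 'test-a' };
my $fetch = sub { return { host => 'h1', host_power_method => 'pdu' } };
my $p     = job_powers($jr, 'host', $fetch);
print join(',', map { "$_=$p->{$_}" } sort keys %$p), "\n";
```

An empty hash ref is still true in Perl, so a job with no power
runvars also counts as cached once it has been looked up.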

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
---
 sg-report-host-history | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/sg-report-host-history b/sg-report-host-history
index 7c2116d3..a11b00a0 100755
--- a/sg-report-host-history
+++ b/sg-report-host-history
@@ -322,6 +322,9 @@ END
 
     my $alternate = 0;
     my $wrote = 0;
+    my $runvarq_hits = 0;
+    my $runvarq_misses = 0;
+
     foreach my $jr (@rows) {
 	if ($wrote++ >= $limit) {
 	    $write_cache_entry->($jr);
@@ -372,7 +375,11 @@ END
 	print H "<td $ri->{ColourAttr}>$ri->{Content}</td>\n";
 
 	my %powers;
-	do {
+	if ($jr->{'%p'}) {
+	    %powers = %{ $jr->{'%p'} };
+	    $runvarq_hits++;
+	} else {
+	    $runvarq_misses++;
 	    $jrunvarq->execute($jr->{flight}, $jr->{job}, $ident);
 	    my %runvars;
 	    while (my ($n, $v) = $jrunvarq->fetchrow_array()) {
@@ -383,7 +390,8 @@ END
 		next unless $r =~ m{^\Q${ident}\E_power_};
 		$powers{$'} = $runvars{$r};
 	    }
-	};
+	    $jr->{'%p'} = { %powers };
+	}
 	my $skipped = 0;
         my $any_power = 0;
         my $pr_power_colour = sub {
@@ -429,6 +437,9 @@ END
     close H or die $!;
     rename "$html_file.new", "$html_file" or die "$html_file $!"
         if $doinstall;
+
+    print DEBUG "HOST CACHE RQ $runvarq_hits / ".
+	  ($runvarq_hits+$runvarq_misses)."\n";
 }
 
 foreach my $host (@ARGV) {
-- 
2.11.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel


* [Xen-devel] [OSSTEST PATCH 13/13] Revert "sg-report-host-history: Reduce limit from 2000 to 200"
  2019-11-08 18:49 [Xen-devel] [OSSTEST PATCH 00/13] Speed up and restore host history Ian Jackson
                   ` (11 preceding siblings ...)
  2019-11-08 18:50 ` [Xen-devel] [OSSTEST PATCH 12/13] sg-report-host-history: Cache runvar queries (power information) Ian Jackson
@ 2019-11-08 18:50 ` Ian Jackson
  2019-11-08 19:45 ` [Xen-devel] [OSSTEST PATCH 00/13] Speed up and restore host history Sander Eikelenboom
  13 siblings, 0 replies; 21+ messages in thread
From: Ian Jackson @ 2019-11-08 18:50 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson

This reverts commit 0fa72b13f5af0a544c417fc3c64cda1ea869a0ac.

Now that we have the caching we can put this back and have useful host
histories again.

Some performance figures (individual measurements):

                                   limit=200     limit=2000
  before this series                 3m32          some very long times
  with this series, --regenerate     3m06          13m56 29m05
  with this series, reusing cache    2m22 1m49      3m10  3m36

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
---
 sg-report-host-history | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sg-report-host-history b/sg-report-host-history
index a11b00a0..54738e68 100755
--- a/sg-report-host-history
+++ b/sg-report-host-history
@@ -28,7 +28,7 @@ use POSIX;
 
 use Osstest::Executive qw(:DEFAULT :colours);
 
-our $limit= 200;
+our $limit= 2000;
 our $flightlimit;
 our $htmlout = ".";
 our $read_existing=1;
-- 
2.11.0



* Re: [Xen-devel] [OSSTEST PATCH 00/13] Speed up and restore host history
  2019-11-08 18:49 [Xen-devel] [OSSTEST PATCH 00/13] Speed up and restore host history Ian Jackson
                   ` (12 preceding siblings ...)
  2019-11-08 18:50 ` [Xen-devel] [OSSTEST PATCH 13/13] Revert "sg-report-host-history: Reduce limit from 2000 to 200" Ian Jackson
@ 2019-11-08 19:45 ` Sander Eikelenboom
  2019-11-11 11:00   ` Ian Jackson
  13 siblings, 1 reply; 21+ messages in thread
From: Sander Eikelenboom @ 2019-11-08 19:45 UTC (permalink / raw)
  To: Ian Jackson, xen-devel; +Cc: Jürgen Groß

On 08/11/2019 19:49, Ian Jackson wrote:
> Earlier this week we discovered that sg-report-host-history was running
> extremely slowly.  We applied an emergency fix 0fa72b13f5af
>   sg-report-host-history: Reduce limit from 2000 to 200
> 
> The main problem is that sg-report-host-history runs once for each
> flight, and must generate a relevant history view of the recent
> history for each host - including much history that is already in the
> old version of the html file.
> 
> The slow part is asking the database about information about each job,
> including its final step, allocation step, etc.  (The main query which
> digs out relevant jobs is also rather time consuming, but it runs all in
> one go and takes only a minute or two.)
> 
> In this series we introduce a mechanism which caches much of the
> historical analysis.
> 
> It is not straightforward to reuse old html data as-is because we
> would have to do a merge sort with the new data and that would involve
> rewriting the alternating background colour (!)
> 
> So instead, we stuff the information we got from the database into
> comments in the HTML, which we can then scan on future runs.

Not meant to bikeshed, so just for consideration:
- Have you considered (inline) css for the background colouring, or does
  it have to be html only  ?
- And for caching perhaps a materialized view with aggregated data only
  refreshed at a more convenient time could perhaps help at the database
  level ?

--
Sander


* Re: [Xen-devel] [OSSTEST PATCH 00/13] Speed up and restore host history
  2019-11-08 19:45 ` [Xen-devel] [OSSTEST PATCH 00/13] Speed up and restore host history Sander Eikelenboom
@ 2019-11-11 11:00   ` Ian Jackson
  2019-11-11 12:03     ` Sander Eikelenboom
  0 siblings, 1 reply; 21+ messages in thread
From: Ian Jackson @ 2019-11-11 11:00 UTC (permalink / raw)
  To: Sander Eikelenboom; +Cc: Jürgen Groß, xen-devel

Sander Eikelenboom writes ("Re: [Xen-devel] [OSSTEST PATCH 00/13] Speed up and restore host history"):
> Not meant to bikeshed, so just for consideration:

Suggestions are very welcome.  Be careful, I'm still looking for a
co-maintainer :-).

> - Have you considered (inline) css for the background colouring, or does
>   it have to be html only  ?

There is no particular reason why it shouldn't be CSS.  Is there a
reason why doing it in html causes problems for you ?

The background colours for the cells are made with
  report_altcolour
  report_altchangecolour
in Osstest/Executive.pm.

report_altcolour returns something that can be put into an element
open tag, given a definite indication of whether the colour should be
paler or darker.

report_altchangecolour is used to produce background colours which
change when the value in the cell changes.

I think it would be easy to replace bgcolor= with some appropriate
style= and some CSS.  Patches - even very rough ones - welcome.

> - And for caching perhaps a materialized view with aggregated data only
>   refreshed at a more convenient time could perhaps help at the database
>   level ?

Maybe, but currently the archaeology algorithm is not expressed
entirely in SQL so it couldn't be a materialised view.  And converting
it to SQL would be annoying because SQL is a rather poor programming
language.

It might be possible to, instead, have table(s) containing archaeology
results.  I hadn't really properly considered that possibility.  That
might well have been a better approach.  So thank you for your helpful
prompt.  I will definitely bear this in mind for the future.

I'm not sure I feel like reengineering this particular series at this
time, though.  One reason (apart from that I've done it like this now)
is that the current approach has the advantage that it doesn't need a
DB schema change.  I have a system for doing schema changes but they
add risk and I don't want to do that in the Xen release freeze.

Regards,
Ian.


* Re: [Xen-devel] [OSSTEST PATCH 00/13] Speed up and restore host history
  2019-11-11 11:00   ` Ian Jackson
@ 2019-11-11 12:03     ` Sander Eikelenboom
  2019-11-11 14:00       ` Ian Jackson
  0 siblings, 1 reply; 21+ messages in thread
From: Sander Eikelenboom @ 2019-11-11 12:03 UTC (permalink / raw)
  To: Ian Jackson; +Cc: Jürgen Groß, xen-devel

On 11/11/2019 12:00, Ian Jackson wrote:
> Sander Eikelenboom writes ("Re: [Xen-devel] [OSSTEST PATCH 00/13] Speed up and restore host history"):
>> Not meant to bikeshed, so just for consideration:
> 
> Suggestions are very welcome.  Be careful, I'm still looking for a
> co-maintainer :-).
/me is ducking under the table ;)
Seems to be quite a lot of intricate Perl, I never was a prince of Perl
and that hasn't got any better by not using it actively the past years.

>> - Have you considered (inline) css for the background colouring, or does
>>   it have to be html only  ?
> 
> There is no particular reason why it shouldn't be CSS.  Is there a
> reason why doing it in html causes problems for you ?

Not really, but especially applying style to alternating rows is now
quite simple with pseudo classes:

 tr:nth-child(even){
   background-color: grey;
 }

 tr:nth-child(even){
   background-color: white;
 }

You could stuff this in a <head><style> ... </style></head>,
so you don't have to repeat this every row for the common case.
For any special cases you could overrule based on class.
I happen to find it one of the most useful CSS features.

https://www.w3.org/wiki/CSS/Selectors/pseudo-classes/:nth-child

> The background colours for the cells are made with
>   report_altcolour
>   report_altchangecolour
> in Osstest/Executive.pm.
> 
> report_altcolour returns something that can be put into an element
> open tag, given a definite indication of whether the colour should be
> paler or darker.
> 
> report_altchangecolour is used to produce background colours which
> change when the value in the cell changes.
> 
> I think it would be easy to replace bgcolor= with some appropriate
> style= and some CSS.  Patches - even very rough ones - welcome.
> 
>> - And for caching perhaps a materialized view with aggregated data only
>>   refreshed at a more convenient time could perhaps help at the database
>>   level ?
> 
> Maybe, but currently the archaeology algorithm is not expressed
> entirely in SQL so it couldn't be a materialised view.  And converting
> it to SQL would be annoying because SQL is a rather poor programming
> language.

It is a poor programming language, but it is very good at retrieving and
modifying data. Sometimes it takes some effort to wrap your head around
the way you have to specify what data you want and in what form, without
being too explicit in how it is supposed to be retrieved.

> It might be possible to, instead, have table(s) containing archaeology
> results.  I hadn't really properly considered that possibility.  That
> might well have been a better approach.  So thank you for your helpful
> prompt.  I will definitely bear this in mind for the future.

If I remember correctly Postgres is being used, perhaps there is still
some relatively low hanging fruit when analyzing the performance of the
queries you run against the actual data.

> I'm not sure I feel like reengineering this particular series at this
> time, though.  One reason (apart from that I've done it like this now)
> is that the current approach has the advantage that it doesn't need a
> DB schema change.  I have a system for doing schema changes but they
> add risk and I don't want to do that in the Xen release freeze.

I understand, and I concur that that is probably the best at the moment.

I will take a look at the code sometime this week or next and see if I
can get any familiarity with it and perhaps end up with some contributions.

--
Sander

> Regards,
> Ian.


* Re: [Xen-devel] [OSSTEST PATCH 00/13] Speed up and restore host history
  2019-11-11 12:03     ` Sander Eikelenboom
@ 2019-11-11 14:00       ` Ian Jackson
  2019-11-20 17:54         ` Ian Jackson
  0 siblings, 1 reply; 21+ messages in thread
From: Ian Jackson @ 2019-11-11 14:00 UTC (permalink / raw)
  To: Sander Eikelenboom; +Cc: Jürgen Groß, xen-devel

Sander Eikelenboom writes ("Re: [Xen-devel] [OSSTEST PATCH 00/13] Speed up and restore host history"):
> /me is ducking under the table ;)
> Seems to be quite a lot of intricate Perl, I never was a prince of Perl
> and that hasn't got any better by not using it actively the past years.

Heh.  Although it's generally not supposed to be intricate.  I have
tried to keep it fairly straightforward.

> Not really, but especially applying style to alternating rows is now
> quite simple with pseudo classes:
> 
>  tr:nth-child(even){
>    background-color: grey;
>  }
> 
>  tr:nth-child(even){
>    background-color: white;
>  }
> 
> You could stuff this in a <head><style> ... </style></head>,
> so you don't have to repeat this every row for the common case.
> For any special cases you could overrule based on class.
> I happen to find it one of the most useful CSS features.

Interesting.  Mmm.  (Although your vignette above ought to have an
`odd' in it I think...)

> > Maybe, but currently the archaeology algorithm is not expressed
> > entirely in SQL so it couldn't be a materialised view.  And converting
> > it to SQL would be annoying because SQL is a rather poor programming
> > language.
> 
> It is a poor programming language, but it is very good at retrieving and
> modifying data. Sometimes it takes some effort to wrap your head around
> the way you have to specify what data you want and in what form, without
> being too explicit in how it is supposed to be retrieved.

Indeed so.

> If I remember correctly Postgres is being used, perhaps there is still
> some relatively low hanging fruit when analyzing the performance of the
> queries you run against the actual data.

Yes.  One of the biggest problems is I really want to make an index on
runvar *values*.  But if I do that then routine runvar updates have to
update that index.  So what I want is a partial index but the rows of
runvars which are indexed ought to be controlled by the corresponding
row of the flights table.

I am considering denormalising this by including a `finalised' bit in
the runvars table.  But not now...
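
A minimal sketch of that denormalisation, for illustration only.  It is
written in Python against SQLite (standing in for the Postgres database,
since both support partial indexes) so that it is self-contained.  The
table layout, the index name runvars_val_final and the helper functions
are assumptions, not the osstest schema.

```python
import sqlite3

def make_db():
    """Create an in-memory stand-in for the runvars table (hypothetical layout)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE runvars (
            flight    INTEGER NOT NULL,
            job       TEXT    NOT NULL,
            name      TEXT    NOT NULL,
            val       TEXT    NOT NULL,
            finalised INTEGER NOT NULL DEFAULT 0,
            PRIMARY KEY (flight, job, name)
        )""")
    # Partial index: only rows with finalised = 1 are indexed by value,
    # so updates to runvars of in-progress flights never touch it.
    conn.execute("""
        CREATE INDEX runvars_val_final
            ON runvars (val) WHERE finalised = 1""")
    return conn

def finalise_flight(conn, flight):
    """Set the denormalised bit once a flight's runvars will not change."""
    conn.execute("UPDATE runvars SET finalised = 1 WHERE flight = ?",
                 (flight,))

def jobs_with_value(conn, value):
    """Look up finalised runvars by value.

    The literal "finalised = 1" repeats the index predicate, which is
    what allows the planner to consider the partial index.
    """
    return conn.execute(
        "SELECT flight, job FROM runvars"
        " WHERE val = ? AND finalised = 1 ORDER BY flight, job",
        (value,)).fetchall()
```

The cost of this design is the extra UPDATE when a flight is finalised,
and that queries only see values from finalised flights.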

> I will take a look at the code sometime this week or next and see if I
> can get any familiarity with it and perhaps end up with some contributions.

All contributions and suggestions are welcome.

Regards,
Ian.


* Re: [Xen-devel] [OSSTEST PATCH 00/13] Speed up and restore host history
  2019-11-11 14:00       ` Ian Jackson
@ 2019-11-20 17:54         ` Ian Jackson
  2019-11-21  5:45           ` Jürgen Groß
  0 siblings, 1 reply; 21+ messages in thread
From: Ian Jackson @ 2019-11-20 17:54 UTC (permalink / raw)
  To: Jürgen Groß; +Cc: xen-devel

Hi, I promised you to do a risk/benefit analysis on this series and
here is my report.  With your permission I plan to push it on Sunday
night or Monday morning, if you think that is a convenient time.


Summary:

There are three kinds of risk here:

* There is a non-negligible chance that these changes have a significant
  adverse performance impact on post-flight reporting, so that
  overall throughput is adversely affected.  I have tried to exclude
  it by both reasoning and testing but it remains a risk.

  I propose to deal with this risk by pushing the change to osstest
  pretest at the beginning of the week, so that when it makes it
  through the self-push gate I am around to monitor it.  I will check
  to see that it is DTRT, and, particularly, that the reporting is not
  overly slow.

* I expect a certain amount of additional delay during the
  transitional period, when some flights are using old code and some
  new code.

  I propose to deal with this issue by negotiating a good time to do
  this when we can afford to, effectively, lose a few hours'
  throughput.

* There is a pretty small chance that these changes break everything
  by causing all flights to crash during host reporting.

  This will be obvious, especially if I'm watching it all closely.
  If this happens it will need to be reverted.

If we decide this series is a problem, after it has gone into
production, we can simply revert it.  There is nothing else in the
osstest push gate right now.  The old code will still function and we
could confidently force push it.

The upside of this change is to undo a regression in our ability to
diagnose host problems.  Particularly, if a host has a low probability
or intermittent fault, we will want to be able to look further back
than the current ~200 jobs (not sure how long that is without looking
it up but it is only a few days I think, at least for some hosts).

Ian.


Patch-by-patch notes:


sg-report-host-history: Improve debugging output

This is just additional prints.  If they accidentally refer to wrong
variables, this would generate perl nonfatal warnings in debug mode
(which we do not use in production).


sg-report-host-history: New --no-install option for testing

By inspection and testing this code does nothing if the new option is
not passed.


sg-report-host-history: Move `computeflightsrange' after hosts

I double-checked the global variables used and set by
computeflightsrange.  It uses and sets $flightlimit; nothing else uses
this and it is set by the option parser.  It uses $limit, which is
only set by the option parser.  It sets $minflight and $flightcond;
these are used only by mainquery, which still comes after
computeflightsrange.


sg-report-host-history: Actually honour $minflight

The effect of this is to limit the output from some of
sg-report-host-history's queries.  If this is wrong somehow the worst
case is that information would be missing from the host history
reports.  That information would be for flights earlier than a minimum
flight number, so it would be quite obvious.

In principle the code could have a bug which causes the queries to
fail, for example if the parameters or syntax are wrong.  But the new
syntax is unconditional and such a bug should therefore be spotted
during testing.


sg-report-host-history: Get job status from mainquery

This unconditionally joins the jobs table to the runvars table in the
`mainquery'.  (Unconditionality means the query syntax is right.)

The jobs table is much smaller.  A handful of empirical tests suggest
this change does not slow things down significantly.  It is not
particularly likely, but it is possible that this will be different in
production.

The change to the $infoq is slightly confusing.  There is now a dummy
"AND ?!='X'" condition in the query.  Its purpose is to consume a
redundant job name argument which is not needed any more.  Jobs are
never called X so this condition is always true.  Testing shows this
works.
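
The dummy condition can be illustrated with a small sketch.  This is
Python with SQLite rather than the script's Perl and Postgres, and the
flights table, its columns and the helper names are hypothetical; only
the "AND ?!='X'" idea is taken from the patch.

```python
import sqlite3

# Hypothetical query.  The job name is no longer needed, but callers still
# pass (flight, job), so an always-true term consumes the second argument.
INFOQ = "SELECT branch FROM flights WHERE flight = ? AND ? != 'X'"

def make_db():
    """Create a tiny in-memory flights table (assumed layout)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE flights (flight INTEGER PRIMARY KEY, branch TEXT)")
    conn.execute("INSERT INTO flights VALUES (144000, 'xen-unstable')")
    return conn

def flight_branch(conn, flight, job):
    """Run the query with the unchanged two-argument calling convention."""
    row = conn.execute(INFOQ, (flight, job)).fetchone()
    return row[0] if row else None
```

The condition is only harmless as long as no job is ever literally
named X; such a job would make the query return nothing.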


sg-report-host-history: Add $cachekey argument to jobquery

This patch does nothing but add an unused argument.  Syntax errors and
missed call sites (even on non-taken paths) would be caught by perl.


sg-report-host-history: Store per-job query results in %$jr

This is quite complex.  It stores new data in a hash %$jr which is
about the size of the host history report.  Those host history reports
have limited size so we expect this to be OK from a performance point
of view.  If not, we would see slow sg-report-host-history processes
(see mitigation above).

In principle this code might cause perl errors and cause
sg-report-host-history to crash, maybe because of a wrong or undefined
reference.  But I have tested both the cache hit and cache miss cases.


sg-report-host-history: Write cache entries
sg-report-host-history: Write cache entries for tail, too

This dumps the data out to the HTML.  There is new fiddly quoting code
but it is largely unconditional so has been executed and tested, so it
will probably not crash entirely.  There remains a risk that the
quoting algorithm or something else is wrong and generates corrupted
HTML.  That would not be a crisis for us as users, but it might affect
the program's ability to read it in.  See the next section for that:


sg-report-host-history: Read cache entries

The biggest risk here is that the logfile parser which reads the cache
entries finds something it doesn't like and crashes, refusing to parse
it.

If this occurs it is because of strange data in the osstest database:
weird job names or something, which trigger quoting/unquoting bugs.
But this code has been manually tested on existing recent data.  So
existing data is good.  And we aren't making new changes to osstest.


sg-report-host-history: Move job runvars query later

This is fine because it just sets local (my) variables.  Perl would
notice if we had got things wrong.


sg-report-host-history: Cache runvar queries (power information)

This relies on the changes made so far and does not add significant
risks of its own.


Revert "sg-report-host-history: Reduce limit from 2000 to 200"

This is the purpose of the exercise.

The risk is that the changes are not sufficient to, in practice, give
adequate performance.  During the transition (while some jobs are
using new code and some old) there will be some delays as things are
needlessly regenerated, but afterwards all should be well.
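
For readers who want a concrete picture of the scheme described in the
notes above (per-job results stored in HTML comments, read back on the
next run, with hit and miss counters), here is a rough sketch.  It is
written in Python, not the script's Perl, and everything in it is an
assumption made for illustration: the "hostcache" comment marker, the
JSON payload, the percent-quoting and the function names.  Only the
'%p' cache key and the hit/miss counting come from the series.

```python
import json
import re
from urllib.parse import unquote

# Hypothetical comment marker; the real script's format is not shown here.
CACHE_RE = re.compile(r"<!-- hostcache (\S+) -->")

def quote_payload(text):
    """Percent-encode everything except [A-Za-z0-9_.], so the payload can
    never contain "--" or ">" and so cannot end the HTML comment early."""
    return re.sub(r"[^A-Za-z0-9_.]",
                  lambda m: "%%%02X" % ord(m.group()), text)

def write_cache_entry(row):
    """Render one job row (a dict) as an HTML comment."""
    # json.dumps emits ASCII only by default, so each escaped character
    # fits in a single %XX sequence.
    payload = json.dumps(row, sort_keys=True)
    return "<!-- hostcache %s -->" % quote_payload(payload)

def read_cache_entries(html):
    """Scan a previously generated page; return {(flight, job): row}."""
    cache = {}
    for m in CACHE_RE.finditer(html):
        row = json.loads(unquote(m.group(1)))
        cache[(row["flight"], row["job"])] = row
    return cache

def power_info(row, cache, query_db, stats):
    """Return the '%p' (power) data for a row, querying only on a miss."""
    cached = cache.get((row["flight"], row["job"]), {})
    if "%p" in cached:
        stats["hits"] += 1
        row["%p"] = cached["%p"]
    else:
        stats["misses"] += 1
        row["%p"] = query_db(row["flight"], row["job"])
    return row["%p"]
```

Quoting everything outside a small safe set is cruder than necessary,
but it makes the read side a single regular expression and avoids the
quoting/unquoting mismatches that the notes identify as the main risk.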


* Re: [Xen-devel] [OSSTEST PATCH 00/13] Speed up and restore host history
  2019-11-20 17:54         ` Ian Jackson
@ 2019-11-21  5:45           ` Jürgen Groß
  2019-11-25 14:53             ` Ian Jackson
  0 siblings, 1 reply; 21+ messages in thread
From: Jürgen Groß @ 2019-11-21  5:45 UTC (permalink / raw)
  To: Ian Jackson; +Cc: xen-devel

On 20.11.19 18:54, Ian Jackson wrote:
> Hi, I promised you to do a risk/benefit analysis on this series and
> here is my report.  With your permission I plan to push it on Sunday
> night or Monday morning, if you think that is a convenient time.

TYVM.

I'm fine with your plan.


Juergen



* Re: [Xen-devel] [OSSTEST PATCH 00/13] Speed up and restore host history
  2019-11-21  5:45           ` Jürgen Groß
@ 2019-11-25 14:53             ` Ian Jackson
  0 siblings, 0 replies; 21+ messages in thread
From: Ian Jackson @ 2019-11-25 14:53 UTC (permalink / raw)
  To: Jürgen Groß; +Cc: xen-devel

Jürgen Groß writes ("Re: [Xen-devel] [OSSTEST PATCH 00/13] Speed up and restore host history"):
> On 20.11.19 18:54, Ian Jackson wrote:
> > Hi, I promised you to do a risk/benefit analysis on this series and
> > here is my report.  With your permission I plan to push it on Sunday
> > night or Monday morning, if you think that is a convenient time.
> 
> TYVM.
> 
> I'm fine with your plan.

Thanks.  I have pushed this to osstest pretest.  Coincidentally, we
have what looks like it might be a low-probability hardware problem
with debina0.  Or maybe some other kind of problem.  Hopefully the
longer logs will help diagnose this.

Ian.


end of thread, other threads:[~2019-11-25 14:53 UTC | newest]

Thread overview: 21+ messages
2019-11-08 18:49 [Xen-devel] [OSSTEST PATCH 00/13] Speed up and restore host history Ian Jackson
2019-11-08 18:49 ` [Xen-devel] [OSSTEST PATCH 01/13] sg-report-host-history: Improve debugging output Ian Jackson
2019-11-08 18:49 ` [Xen-devel] [OSSTEST PATCH 02/13] sg-report-host-history: New --no-install option for testing Ian Jackson
2019-11-08 18:49 ` [Xen-devel] [OSSTEST PATCH 03/13] sg-report-host-history: Move `computeflightsrange' after hosts Ian Jackson
2019-11-08 18:49 ` [Xen-devel] [OSSTEST PATCH 04/13] sg-report-host-history: Actually honour $minflight Ian Jackson
2019-11-08 18:49 ` [Xen-devel] [OSSTEST PATCH 05/13] sg-report-host-history: Get job status from mainquery Ian Jackson
2019-11-08 18:49 ` [Xen-devel] [OSSTEST PATCH 06/13] sg-report-host-history: Add $cachekey argument to jobquery Ian Jackson
2019-11-08 18:49 ` [Xen-devel] [OSSTEST PATCH 07/13] sg-report-host-history: Store per-job query results in %$jr Ian Jackson
2019-11-08 18:49 ` [Xen-devel] [OSSTEST PATCH 08/13] sg-report-host-history: Write cache entries Ian Jackson
2019-11-08 18:49 ` [Xen-devel] [OSSTEST PATCH 09/13] sg-report-host-history: Write cache entries for tail, too Ian Jackson
2019-11-08 18:49 ` [Xen-devel] [OSSTEST PATCH 10/13] sg-report-host-history: Read cache entries Ian Jackson
2019-11-08 18:49 ` [Xen-devel] [OSSTEST PATCH 11/13] sg-report-host-history: Move job runvars query later Ian Jackson
2019-11-08 18:50 ` [Xen-devel] [OSSTEST PATCH 12/13] sg-report-host-history: Cache runvar queries (power information) Ian Jackson
2019-11-08 18:50 ` [Xen-devel] [OSSTEST PATCH 13/13] Revert "sg-report-host-history: Reduce limit from 2000 to 200" Ian Jackson
2019-11-08 19:45 ` [Xen-devel] [OSSTEST PATCH 00/13] Speed up and restore host history Sander Eikelenboom
2019-11-11 11:00   ` Ian Jackson
2019-11-11 12:03     ` Sander Eikelenboom
2019-11-11 14:00       ` Ian Jackson
2019-11-20 17:54         ` Ian Jackson
2019-11-21  5:45           ` Jürgen Groß
2019-11-25 14:53             ` Ian Jackson
