All of lore.kernel.org
 help / color / mirror / Atom feed
* [OSSTEST RFC PATCH 00/13] Planner: Performance improvement
@ 2015-09-02 15:45 Ian Jackson
  2015-09-02 15:45 ` [OSSTEST PATCH 01/13] Tcl: Use lshift instead of open-coding with lrange Ian Jackson
                   ` (13 more replies)
  0 siblings, 14 replies; 37+ messages in thread
From: Ian Jackson @ 2015-09-02 15:45 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Campbell

This series addresses a performance problem which we seem to be
experiencing with the existing osstest resource planner in Xen Project
test colo.  The current setup it can leave hosts idle for too long
before getting around to allocating them to test jobs.

The full explanation of the new algorithm is in the commit message for
13/13.

I have not executed this series yet.  Testing it is not all that easy
because this code operates only in infrastructure mode, and most of
the code is in the central queue daemon and not subject to the osstest
self-push-gate.

I intend, after a first round of review, to test this fully on the
Citrix Cambridge osstest instance.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [OSSTEST PATCH 01/13] Tcl: Use lshift instead of open-coding with lrange
  2015-09-02 15:45 [OSSTEST RFC PATCH 00/13] Planner: Performance improvement Ian Jackson
@ 2015-09-02 15:45 ` Ian Jackson
  2015-09-03 10:44   ` Ian Campbell
  2015-09-02 15:45 ` [OSSTEST PATCH 02/13] Planner: docs: Minor fixes Ian Jackson
                   ` (12 subsequent siblings)
  13 siblings, 1 reply; 37+ messages in thread
From: Ian Jackson @ 2015-09-02 15:45 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Ian Campbell

In ms-queuedaemon, and JobDB-Executive, once each.  No functional
change.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
---
 ms-queuedaemon          |    3 +--
 tcl/JobDB-Executive.tcl |    3 +--
 2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/ms-queuedaemon b/ms-queuedaemon
index e15bc79..d6d59ee 100755
--- a/ms-queuedaemon
+++ b/ms-queuedaemon
@@ -210,8 +210,7 @@ proc queuerun-perhaps-step {} {
         return
     }
 
-    set thinking [lindex $queue_running 0]
-    set queue_running [lrange $queue_running 1 end]
+    set thinking [lshift queue_running]
     log-event "queuerun-perhaps-step selected"
 
     set thinking_after [after [expr {$c(QueueThoughtsTimeout) * 1000}] \
diff --git a/tcl/JobDB-Executive.tcl b/tcl/JobDB-Executive.tcl
index 7228712..d61d2a2 100644
--- a/tcl/JobDB-Executive.tcl
+++ b/tcl/JobDB-Executive.tcl
@@ -74,8 +74,7 @@ proc set-flight {} {
         set argv [lrange $argv 2 end]
     }
 
-    set flight [lindex $argv 0]
-    set argv [lrange $argv 1 end]
+    set flight [lshift argv]
     set env(OSSTEST_FLIGHT) $flight
 }
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [OSSTEST PATCH 02/13] Planner: docs: Minor fixes
  2015-09-02 15:45 [OSSTEST RFC PATCH 00/13] Planner: Performance improvement Ian Jackson
  2015-09-02 15:45 ` [OSSTEST PATCH 01/13] Tcl: Use lshift instead of open-coding with lrange Ian Jackson
@ 2015-09-02 15:45 ` Ian Jackson
  2015-09-03 10:47   ` Ian Campbell
  2015-09-02 15:45 ` [OSSTEST PATCH 03/13] Planner: docs: Document set-info command Ian Jackson
                   ` (11 subsequent siblings)
  13 siblings, 1 reply; 37+ messages in thread
From: Ian Jackson @ 2015-09-02 15:45 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Ian Campbell

 * Document the ms-queuedaemon banner
 * Document the argument to the allocation $resourcecall callback fn.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
---
 Osstest/Executive.pm |    2 +-
 README.planner       |    3 +++
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/Osstest/Executive.pm b/Osstest/Executive.pm
index bf968c8..ab015d2 100644
--- a/Osstest/Executive.pm
+++ b/Osstest/Executive.pm
@@ -604,7 +604,7 @@ sub plan_search ($$$$) {
 }
 
 sub alloc_resources {
-    my ($resourcecall) = pop @_;
+    my ($resourcecall) = pop @_; # $resourcecall->($plan);
     my (%xparams) = @_;
     # $resourcecall should die (abort) or return ($ok, $bookinglist)
     #
diff --git a/README.planner b/README.planner
index ec4dce8..34eae97 100644
--- a/README.planner
+++ b/README.planner
@@ -181,6 +181,9 @@ DETAILED PROTOCOL NOTES
 
 ms-queuedaemon commands
 
+        < OK ms-queuedaemon [INFO...]
+                Banner on connection.  INFO should be ignored.
+
 	> wait
 		I want to join the plan
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [OSSTEST PATCH 03/13] Planner: docs: Document set-info command
  2015-09-02 15:45 [OSSTEST RFC PATCH 00/13] Planner: Performance improvement Ian Jackson
  2015-09-02 15:45 ` [OSSTEST PATCH 01/13] Tcl: Use lshift instead of open-coding with lrange Ian Jackson
  2015-09-02 15:45 ` [OSSTEST PATCH 02/13] Planner: docs: Minor fixes Ian Jackson
@ 2015-09-02 15:45 ` Ian Jackson
  2015-09-03 10:48   ` Ian Campbell
  2015-09-02 15:45 ` [OSSTEST PATCH 04/13] Planner: Fix indefinite holdoff Ian Jackson
                   ` (10 subsequent siblings)
  13 siblings, 1 reply; 37+ messages in thread
From: Ian Jackson @ 2015-09-02 15:45 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Ian Campbell

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
---
 README.planner |   10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/README.planner b/README.planner
index 34eae97..b1247e7 100644
--- a/README.planner
+++ b/README.planner
@@ -184,6 +184,16 @@ ms-queuedaemon commands
         < OK ms-queuedaemon [INFO...]
                 Banner on connection.  INFO should be ignored.
 
+        > set-info KEY VALUE
+                There are various informational keys which are used by
+                the planner, provided by clients.  Keys include:
+
+        > set-info preinfo ...                   } used for reporting
+        > set-info job ...                       }
+        > set-info priority ...                      } used for
+        > set-info sub-priority ...                  } queue adjustment
+        > set-info wait-start-adjust ...             }
+
 	> wait
 		I want to join the plan
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [OSSTEST PATCH 04/13] Planner: Fix indefinite holdoff
  2015-09-02 15:45 [OSSTEST RFC PATCH 00/13] Planner: Performance improvement Ian Jackson
                   ` (2 preceding siblings ...)
  2015-09-02 15:45 ` [OSSTEST PATCH 03/13] Planner: docs: Document set-info command Ian Jackson
@ 2015-09-02 15:45 ` Ian Jackson
  2015-09-03 10:51   ` Ian Campbell
  2015-09-02 15:45 ` [OSSTEST PATCH 05/13] Planner: client side: $mayalloc parameter to $resourcecall->() Ian Jackson
                   ` (9 subsequent siblings)
  13 siblings, 1 reply; 37+ messages in thread
From: Ian Jackson @ 2015-09-02 15:45 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Ian Campbell

runneeded-ensure-will would always reset the runneeded_holdoff_after
timer.  So no new queue run would start until no runneeded-ensure-will
has occurred for (currently) 30s.

Instead, only start the timer if it's not already running.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
---
 ms-queuedaemon |   10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/ms-queuedaemon b/ms-queuedaemon
index d6d59ee..1aa526c 100755
--- a/ms-queuedaemon
+++ b/ms-queuedaemon
@@ -86,10 +86,12 @@ proc runneeded-ensure-will {need} {
     log-event "runneeded-ensure-will $need (was $need_queue_run)"
 
     if {$need > $need_queue_run} { set need_queue_run $need }
-    catch { after cancel $runneeded_holdoff_after }
-    set runneeded_holdoff_after \
-        [after [expr {$c(QueueDaemonHoldoff) * 1000}] \
-             runneeded-perhaps-start]
+
+    if {![info exists runneeded_holdoff_after]} {
+	set runneeded_holdoff_after \
+	    [after [expr {$c(QueueDaemonHoldoff) * 1000}] \
+		 runneeded-perhaps-start]
+    }
 }
 
 proc runneeded-perhaps-start {} {
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [OSSTEST PATCH 05/13] Planner: client side: $mayalloc parameter to $resourcecall->()
  2015-09-02 15:45 [OSSTEST RFC PATCH 00/13] Planner: Performance improvement Ian Jackson
                   ` (3 preceding siblings ...)
  2015-09-02 15:45 ` [OSSTEST PATCH 04/13] Planner: Fix indefinite holdoff Ian Jackson
@ 2015-09-02 15:45 ` Ian Jackson
  2015-09-03 10:52   ` Ian Campbell
  2015-09-02 15:45 ` [OSSTEST PATCH 06/13] Planner: client side: New `!OK think noalloc' protocol Ian Jackson
                   ` (8 subsequent siblings)
  13 siblings, 1 reply; 37+ messages in thread
From: Ian Jackson @ 2015-09-02 15:45 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Ian Campbell

Add a new parameter to $resourcecall which allows the alloc_resources
loop in Osstest::Executive to specify to its clients that on this
occasion they should not make any actual allocations.

The callers of alloc_resources are all adjusted to honour this new
parameter:
 * ts-hosts-allocate-Executive avoids allocating unless $mayalloc
 * mg-allocate avoids allocating unless $mayalloc
 * mg-blockage never allocates anyway.

Currently we always pass 1, so no functional change.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
---
 Osstest/Executive.pm        |    4 ++--
 mg-allocate                 |    4 ++--
 mg-blockage                 |    2 +-
 ts-hosts-allocate-Executive |    4 ++--
 4 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/Osstest/Executive.pm b/Osstest/Executive.pm
index ab015d2..f9be0a0 100644
--- a/Osstest/Executive.pm
+++ b/Osstest/Executive.pm
@@ -604,7 +604,7 @@ sub plan_search ($$$$) {
 }
 
 sub alloc_resources {
-    my ($resourcecall) = pop @_; # $resourcecall->($plan);
+    my ($resourcecall) = pop @_; # $resourcecall->($plan, $mayalloc);
     my (%xparams) = @_;
     # $resourcecall should die (abort) or return ($ok, $bookinglist)
     #
@@ -720,7 +720,7 @@ sub alloc_resources {
 		$plan= from_json($jplan);
 	    }, sub {
 		if (!eval {
-		    ($ok, $bookinglist) = $resourcecall->($plan);
+		    ($ok, $bookinglist) = $resourcecall->($plan, 1);
 		    1;
 		}) {
 		    warn "resourcecall $@";
diff --git a/mg-allocate b/mg-allocate
index fc1b394..6dc7ddd 100755
--- a/mg-allocate
+++ b/mg-allocate
@@ -270,7 +270,7 @@ sub showposs ($) {
 sub plan () {
     my @got;
     alloc_resources(sub {
-        my ($plan) = @_;
+        my ($plan, $mayalloc) = @_;
 
 	@got = ();
         my @possmatrix = ([]);
@@ -310,7 +310,7 @@ sub plan () {
 	die unless $planned;
 
         my $allok=0;
-        if (!$planned->{Start}) {
+        if ($mayalloc && !$planned->{Start}) {
             $allok=1;
 
             alloc_prep();
diff --git a/mg-blockage b/mg-blockage
index a21f15b..1f66d8e 100755
--- a/mg-blockage
+++ b/mg-blockage
@@ -40,7 +40,7 @@ sub max { (reverse sort @_)[0]; }
 
 sub plan () {
     alloc_resources(sub {
-	my ($plan) = @_;
+	my ($plan, $mayalloc) = @_;
 
 	my $now = time;
 	if ($now > $end) { return (1, { Bookings => [ ] }); }
diff --git a/ts-hosts-allocate-Executive b/ts-hosts-allocate-Executive
index 7a9411d..6e1391e 100755
--- a/ts-hosts-allocate-Executive
+++ b/ts-hosts-allocate-Executive
@@ -606,7 +606,7 @@ sub alloc_hosts () {
 }
 
 sub attempt_allocation {
-    ($plan) = @_;
+    ($plan, $mayalloc) = @_;
     undef $best;
 
     logm("allocating hosts: ".join(' ', map { $_->{Ident} } @hids));
@@ -632,7 +632,7 @@ sub attempt_allocation {
 
     my $retval=0;
 
-    if (!$best->{Start}) {
+    if ($mayalloc && !$best->{Start}) {
 	$retval= 1;
 	foreach my $hid (@hids) {
 	    my $got= actual_allocation($hid);
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [OSSTEST PATCH 06/13] Planner: client side: New `!OK think noalloc' protocol
  2015-09-02 15:45 [OSSTEST RFC PATCH 00/13] Planner: Performance improvement Ian Jackson
                   ` (4 preceding siblings ...)
  2015-09-02 15:45 ` [OSSTEST PATCH 05/13] Planner: client side: $mayalloc parameter to $resourcecall->() Ian Jackson
@ 2015-09-02 15:45 ` Ian Jackson
  2015-09-03 10:53   ` Ian Campbell
  2015-09-02 15:45 ` [OSSTEST PATCH 07/13] Planner: ms-planner support -w option Ian Jackson
                   ` (7 subsequent siblings)
  13 siblings, 1 reply; 37+ messages in thread
From: Ian Jackson @ 2015-09-02 15:45 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Ian Campbell

Introduce a way for the queue daemon to tell its client that it must
not allocate anything in this planning iteration.

In the client:
 * Advertise the new feature via set-info.
 * Accept the `noalloc' part of `!OK think noalloc';
 * Print that in our log message;
 * Honour it by passing it to $resourcecall.

And document the new protocol.  However, there is no server-side yet,
so this does not yet introduce any overall change to the system.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
---
 Osstest/Executive.pm |    9 ++++++---
 README.planner       |    6 +++++-
 2 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/Osstest/Executive.pm b/Osstest/Executive.pm
index f9be0a0..4f51d70 100644
--- a/Osstest/Executive.pm
+++ b/Osstest/Executive.pm
@@ -674,6 +674,7 @@ sub alloc_resources {
                 $set_info->('priority', $priority);
                 $set_info->('sub-priority',$ENV{OSSTEST_RESOURCE_SUBPRIORITY});
                 $set_info->('preinfo',     $ENV{OSSTEST_RESOURCE_PREINFO});
+		$set_info->('feature-noalloc', 1);
 
                 if (defined $waitstart) {
                     $set_info->('wait-start',$waitstart);
@@ -699,7 +700,9 @@ sub alloc_resources {
 
             logm("resource allocation: awaiting our slot...");
 
-            $_= <$qserv>;  defined && m/^\!OK think\s$/ or die "$_ ?";
+            $_= <$qserv>;
+	    defined && m/^\!OK think( noalloc)?\s$/ or die "$_ ?";
+	    my $noalloc = $1 // '';
 
             opendb_tests();
 
@@ -715,12 +718,12 @@ sub alloc_resources {
 		read($qserv, $jplan, $jplanlen) == $jplanlen or die $!;
 		my $jplanprint= $jplan;
 		chomp $jplanprint;
-		logm("resource allocation: obtained base plan.");
+		logm("resource allocation: obtained base plan$noalloc.");
 		$debugm->("base plan = ", $jplanprint);
 		$plan= from_json($jplan);
 	    }, sub {
 		if (!eval {
-		    ($ok, $bookinglist) = $resourcecall->($plan, 1);
+		    ($ok, $bookinglist) = $resourcecall->($plan, !$noalloc);
 		    1;
 		}) {
 		    warn "resourcecall $@";
diff --git a/README.planner b/README.planner
index b1247e7..24185ce 100644
--- a/README.planner
+++ b/README.planner
@@ -194,11 +194,15 @@ ms-queuedaemon commands
         > set-info sub-priority ...                  } queue adjustment
         > set-info wait-start-adjust ...             }
 
+        > set-info feature-noalloc 1
+	        The client understands `!OK think noalloc'.
+
 	> wait
 		I want to join the plan
 
-	< !OK think
+	< !OK think [noalloc]
 		Now is the time to add yourself to the plan
+                `noalloc' means client must not not actually allocate.
 
 		> get-plan
 		< OK get-plan BYTES
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [OSSTEST PATCH 07/13] Planner: ms-planner support -w option
  2015-09-02 15:45 [OSSTEST RFC PATCH 00/13] Planner: Performance improvement Ian Jackson
                   ` (5 preceding siblings ...)
  2015-09-02 15:45 ` [OSSTEST PATCH 06/13] Planner: client side: New `!OK think noalloc' protocol Ian Jackson
@ 2015-09-02 15:45 ` Ian Jackson
  2015-09-03 11:25   ` Ian Campbell
  2015-09-02 15:45 ` [OSSTEST PATCH 08/13] Planner: ms-queuedaemon: Prep for multiple walkers Ian Jackson
                   ` (6 subsequent siblings)
  13 siblings, 1 reply; 37+ messages in thread
From: Ian Jackson @ 2015-09-02 15:45 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Ian Campbell

We are going to introduce multiple concurrent streams of planning
processing, called `walkers' in ms-queuedaemon.  The work-in-progress
plan is stored, server-side, during planning, in data-plan.pl.  But we
need to have more than one of these.

Update ms-planner and ms-planner-debug to honour a -w option, to
specify a replacement for the word `plan' in `data-plan.pl'.

No overall functional change, since nothing uses these options yet.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
---
 ms-planner       |    6 +++++-
 ms-planner-debug |   13 ++++++++++---
 2 files changed, 15 insertions(+), 4 deletions(-)

diff --git a/ms-planner b/ms-planner
index 1dd01a9..f2070b0 100755
--- a/ms-planner
+++ b/ms-planner
@@ -33,12 +33,16 @@ use Osstest::Executive;
 
 open DEBUG, ">/dev/null" or die $!;
 
+our $walker = 'plan';
+
 while (@ARGV and $ARGV[0] =~ m/^-/) {
     $_= shift @ARGV;
     last if m/^--$/;
     while (m/^-./) {
         if (s/^-D/-/) {
             open DEBUG, ">&STDERR" or die $!;
+        } elsif (s/^-w(.+)/-/) {
+	    $walker = $1;
         } else {
             die "$_ ?";
         }
@@ -49,7 +53,7 @@ csreadconfig();
 
 our ($plan);
 
-my $fn= "data-plan.pl";
+my $fn= "data-$walker.pl";
 
 sub allocations ($$) {
     my ($init, $each) = @_;
diff --git a/ms-planner-debug b/ms-planner-debug
index e277ca6..eac5675 100755
--- a/ms-planner-debug
+++ b/ms-planner-debug
@@ -25,7 +25,14 @@ use Osstest;
 
 csreadconfig();
 
-my $f= sprintf "data-plan-debug-%s.txt", time;
+my $walker = 'plan';
+
+if (@ARGV && $ARGV[0] =~ m/^-w(.+)$/) {
+    $walker = $1;
+    shift @ARGV;
+}
+
+my $f= sprintf "data-$walker-debug-%s.txt", time;
 
 printf "%s\n", $f;
 
@@ -42,9 +49,9 @@ foreach my $arg (@ARGV) {
     }
 }
 
-print "==========data-plan.pl==========\n";
+print "==========data-$walker.pl==========\n";
 
-system 'cat data-plan.pl 2>&1';
+system 'cat data-$walker.pl 2>&1';
 
 print "==========resources==========\n";
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [OSSTEST PATCH 08/13] Planner: ms-queuedaemon: Prep for multiple walkers
  2015-09-02 15:45 [OSSTEST RFC PATCH 00/13] Planner: Performance improvement Ian Jackson
                   ` (6 preceding siblings ...)
  2015-09-02 15:45 ` [OSSTEST PATCH 07/13] Planner: ms-planner support -w option Ian Jackson
@ 2015-09-02 15:45 ` Ian Jackson
  2015-09-03 12:05   ` Ian Campbell
  2015-09-02 15:45 ` [OSSTEST PATCH 09/13] Planner: ms-queuedaemon: Synchronise thinking " Ian Jackson
                   ` (5 subsequent siblings)
  13 siblings, 1 reply; 37+ messages in thread
From: Ian Jackson @ 2015-09-02 15:45 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Ian Campbell

We are going to introduce multiple concurrent streams of planning
processing, called `walkers'.

Prepare the ground for this with some formulaic changes which will
otherwise greatly clutter substantive patches.

(A client will still only think for one walker at once, because that's
what the client protocol expects - and anything else would be far too
confusing.)

General:
 * Introduce the concept of a `walker' to ms-queuedaemon.
 * Provide a list of the walkers which might exist, `walkers'
 * Provide some helper procedures for iterating over these,
   and easily accessing their state.

Queue handling:
 * Add a new `w' argument to many procs: specifically, most of the
   procs in the section `machinery for running the queue'.
 * Log the walker ($w) at the start of all relevant log messages.
 * Pass the -w option to ms-planner and ms-planner-debug.
 * Add safety catches which will crash the ms-queuedaemon if it finds
   it is asking the same client to think for more than one walker.
 * we-are-thinking and check-we-are-thinking tell the caller what
   walker the client is thinking for.
 * In the resource-plan.html filename, replace `plan' with the walker
   filename.

Elsewhere:
 * Teach dequeue-chan to deal with all the walkers, including
   maybe the (one) walker for which the client is thinking.
 * Teach log-state to report on all the walkers.
 * In the runneeded logic, hardcode `plan' as the walker to use.

There is still actually only one walker.

No overall functional change, except to some log messages.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
---
 ms-queuedaemon |  191 +++++++++++++++++++++++++++++++++++---------------------
 1 file changed, 121 insertions(+), 70 deletions(-)

diff --git a/ms-queuedaemon b/ms-queuedaemon
index 1aa526c..53ac655 100755
--- a/ms-queuedaemon
+++ b/ms-queuedaemon
@@ -21,6 +21,23 @@
 
 source ./tcl/daemonlib.tcl
 
+set walkers {plan}
+
+proc walker-globals {w} {
+    # introduces queue_running, thinking[_after] for the specific walker
+    foreach v {queue_running thinking thinking_after} {
+	uplevel 1 [list upvar $w/$v $v]
+    }
+}
+
+proc foreach-walker {walkervar body} {
+    global walkers
+    upvar 1 $walkervar w
+    foreach w $walkers {
+	uplevel 1 walker-globals $w
+	uplevel 1 $body
+    }
+}
 
 proc chan-destroy-stuff {chan} {
     dequeue-chan $chan destroy
@@ -31,13 +48,20 @@ proc chan-destroy-stuff {chan} {
 proc dequeue-chan {chan why} {
     log-event "dequeue-chan $chan $why"
 
-    global queue queue_running thinking
+    global queue
     lremove queue $chan
 
-    if {[info exists queue_running]} { lremove queue_running $chan }
-    if {[info exists thinking] &&
-        ![string compare $thinking $chan]} {
-        queuerun-step-done $why
+    foreach-walker w {
+	if {[info exists queue_running]} { lremove queue_running $chan }
+    }
+
+    # Reentrancy: this next loop can only trigger once, because
+    # the queuerun-step-done won't ever start chan thinking again
+    foreach-walker w {
+	if {[info exists thinking] &&
+	    ![string compare $thinking $chan]} {
+	    queuerun-step-done $w $why
+	}
     }
 }
 
@@ -55,31 +79,35 @@ proc log-event {m} {
 }
 
 proc log-state {m} {
-    global need_queue_run queue queue_running thinking
+    global need_queue_run queue
 
     set lhs [format "N=%d Q=%d (%-11.11s) " \
                  $need_queue_run [llength $queue] $queue]
 
-    if {[info exists queue_running]} {
-        append lhs [format "R=%d " [llength $queue_running]]
-        if {[info exists thinking]} {
-            append lhs [format "T=%s " $thinking]
-        } else {
-            append lhs [format "        "]
-        }
-        append lhs [format "(%-11.11s) " $queue_running]
-    } else {
-        append lhs "                          "
+    foreach-walker w {
+	if {[info exists queue_running]} {
+	    append lhs [format "R=%d " [llength $queue_running]]
+	    if {[info exists thinking]} {
+		append lhs [format "T=%s " $thinking]
+	    } else {
+		append lhs [format "        "]
+	    }
+	    append lhs [format "(%-11.11s) " $queue_running]
+	} else {
+	    append lhs "                          "
+	}
     }
+
     log "$lhs | $m"
 }
 
 #---------- machinery for making sure we run the queue ----------
 #
 # variables:
-#   queue            chans that are waiting for resources
-#   queue_running    unset if not running, list of chans if running
-#   need_queue_run   0: not needed; 1: needed if more resources; 2: force
+#  $w/queue            chans that are waiting for resources
+#  $w/queue_running    unset if not running, list of chans if running
+#  need_queue_run      0: not needed; 1: needed if more resources; 2: force
+# the $w/ variables are generally upvar'd with walker-globals
 
 proc runneeded-ensure-will {need} {
     global runneeded_holdoff_after c need_queue_run
@@ -96,7 +124,8 @@ proc runneeded-ensure-will {need} {
 
 proc runneeded-perhaps-start {} {
     log-event runneeded-perhaps-start
-    global queue queue_running thinking need_queue_run inhibit
+    walker-globals plan
+    global queue need_queue_run inhibit
     global runneeded_holdoff_after
     unset runneeded_holdoff_after
 
@@ -108,8 +137,8 @@ proc runneeded-perhaps-start {} {
     set need_queue_run 0
 
     if {![llength $queue]} {
-        plan-reset
-        report-plan
+        plan-reset plan
+        report-plan plan
         return
     }
 
@@ -140,7 +169,7 @@ proc runneeded-perhaps-start {} {
     log "runneeded-perhaps-start starting cleaned=$cleaned"
 
     runneeded-2-requeue
-    queuerun-start
+    queuerun-start plan
 }
 
 proc runneeded-ensure-polling {} {
@@ -181,26 +210,29 @@ proc runneeded-2-requeue {} {
 #---------- machinery for running the queue ----------
 #
 # variables:
-#    queue             chans waiting, read when we start
-#    queue_running     chans not yet asked
-#    thinking          chan currently asking
-#    thinking_after    timeout
+#  queue                chans waiting, read when we start
+#  $w/queue_running     chans not yet asked
+#  $w/thinking          chan currently asking
+#  $w/thinking_after    timeout
+# all the $w/ are generally upvar'd by walker-globals
 
-proc plan-reset {} {
-    exec ./ms-planner reset < /dev/null
+proc plan-reset {w} {
+    exec ./ms-planner -w$w reset < /dev/null
 }
 
-proc queuerun-start {} {
-    log-event queuerun-start
-    global queue_running queue
-    plan-reset
+proc queuerun-start {w} {
+    global queue
+    walker-globals $w
+    log-event "$w queuerun-start"
+    plan-reset $w
     set queue_running $queue
-    after idle queuerun-perhaps-step
+    after idle queuerun-perhaps-step plan
 }
 
-proc queuerun-perhaps-step {} {
-    log-event queuerun-perhaps-step
-    global thinking queue_running thinking_after c
+proc queuerun-perhaps-step {w} {
+    log-event "$w queuerun-perhaps-step"
+    walker-globals $w
+    global c
 
     if {[info exists thinking]} return
     if {![info exists queue_running]} return
@@ -208,107 +240,126 @@ proc queuerun-perhaps-step {} {
     if {![llength $queue_running]} {
         unset queue_running
         runneeded-ensure-will 0
-        report-plan
+        report-plan $w
         return
     }
 
-    set thinking [lshift queue_running]
-    log-event "queuerun-perhaps-step selected"
+    set next [lindex $queue_running 0]
+    set already [we-are-thinking $next]
+    if {[llength $already]} {
+	error "next $next thinking $already but also want $w"
+    }
+
+    set thinking $next
+    lshift queue_running
+    log-event "$w queuerun-perhaps-step selected"
 
     set thinking_after [after [expr {$c(QueueThoughtsTimeout) * 1000}] \
-                            queue-thoughts-timedout]
+                            queue-thoughts-timedout $w]
     for-chan $thinking {
         puts-chan $thinking "!OK think"
     }
 }
 
-proc report-plan {} {
+proc report-plan {w} {
     global c
     if {[catch {
-        exec ./ms-planner show-html > "$c(WebspaceFile)/resource-plan.html"
+	set outputfile "$c(WebspaceFile)/resource-$w.html"
+	exec ./ms-planner -w$w show-html > $outputfile
     } emsg]} {
-        log "INTERNAL ERROR showing plan html: $emsg"
+        log "INTERNAL ERROR showing $w html: $emsg"
     } else {
-        log "report-plan OK"
+        log "$w report-plan OK"
     }
 }
 
 proc we-are-thinking {chan} {
-    global thinking
-    return [expr {[info exists thinking] && ![string compare $thinking $chan]}]
+    set ws {}
+    foreach-walker w {
+	if {[info exists thinking] && ![string compare $thinking $chan]} {
+	    lappend ws $w
+	}
+    }
+    if {[llength $ws] > 1} {
+	error "arrgh chan $chan thinking $ws !"
+    }
+    return [lindex $ws 0]
 }
 
 proc check-we-are-thinking {chan} {
-    if {![we-are-thinking $chan]} {
+    set w [we-are-thinking $chan]
+    if {![string length $w]} {
         puts-chan $chan "ERROR you are not thinking"
         return -code return
     }
+    return $w
 }
 
-proc queuerun-step-done {why} {
-    log-event "queuerun-step-done $why"
-    global queue_running thinking thinking_after
-    puts-chan-desc $thinking "queuerun-step-done $thinking $why"
+proc queuerun-step-done {w why} {
+    log-event "$w queuerun-step-done $why"
+    walker-vars $w
+    puts-chan-desc $thinking "$w queuerun-step-done $thinking $why"
     if {[info exists thinking_after]} {
         after cancel $thinking_after
         unset thinking_after
     }
     unset thinking
-    after idle queuerun-perhaps-step
+    after idle queuerun-perhaps-step $w
 }
 
-proc queue-thoughts-timedout {} {
-    log-event queue-thoughts-timedout
-    global thinking thinking_after
+proc queue-thoughts-timedout {w} {
+    log-event "$w queue-thoughts-timedout"
+    walker-vars $w
     set chan $thinking
     unset thinking_after
-    queuerun-step-done timeout
+    queuerun-step-done $w timeout
     for-chan $chan {
         puts-chan $chan "!ERROR timed out (too pensive)"
     }
 }
 
 proc cmd/thought-wait {chan desc} {
-    check-we-are-thinking $chan
-    queuerun-step-done thought-wait
+    set w [check-we-are-thinking $chan]
+    queuerun-step-done $w thought-wait
     puts-chan $chan "OK thought"
 }
 
 proc cmd/thought-done {chan desc} {
-    check-we-are-thinking $chan
-    queuerun-step-done thought-done
+    set w [check-we-are-thinking $chan]
+    queuerun-step-done $w thought-done
     dequeue-chan $chan thought-wait
     puts-chan $chan "OK thought"
 }
 
 proc cmd/get-plan {chan desc} {
     global plan
-    set plan [exec -keepnewline ./ms-planner get-plan < /dev/null]
+    set w [check-we-are-thinking $chan]
+    set plan [exec -keepnewline ./ms-planner -w$w get-plan < /dev/null]
     puts-chan-data $chan "OK get-plan" $plan
 }
 
 proc cmd/book-resources {chan desc bytes} {
-    check-we-are-thinking $chan
-    read-chan-data $chan $bytes do-book-resources
+    read-chan-data $chan $bytes do-book-resources $w
 }
 proc do-book-resources {chan desc data} {
     global plan errorInfo
-    check-we-are-thinking $chan
+    set w [check-we-are-thinking $chan]
     set info [chan-get-info $chan {"$info(preinfo) "} ""]
     append info [chan-get-info $chan {"job $info(job)"} $desc]
     if {[catch {
-	exec ./ms-planner book-resources $info << $data
+	exec ./ms-planner -w$w book-resources $info << $data
     } emsg]} {
-	set f [exec ./ms-planner-debug $info $data $plan]
+	set f [exec ./ms-planner-debug -w$w $info $data $plan]
 	error "$f $emsg" $errorInfo
     }
     puts-chan $chan "OK book-resources"
 }
 
 proc cmd/unwait {chan desc} {
-    if {[we-are-thinking $chan]} {
-        queuerun-step-done unwait
-        set res cancel
+    set w [we-are-thinking $chan]
+    if {[string length $w]} {
+        queuerun-step-done $w unwait
+        set res "$w cancel"
     } else {
         set res noop
     }
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [OSSTEST PATCH 09/13] Planner: ms-queuedaemon: Synchronise thinking multiple walkers
  2015-09-02 15:45 [OSSTEST RFC PATCH 00/13] Planner: Performance improvement Ian Jackson
                   ` (7 preceding siblings ...)
  2015-09-02 15:45 ` [OSSTEST PATCH 08/13] Planner: ms-queuedaemon: Prep for multiple walkers Ian Jackson
@ 2015-09-02 15:45 ` Ian Jackson
  2015-09-03 12:06   ` Ian Campbell
  2015-09-02 15:45 ` [OSSTEST PATCH 10/13] Planner: ms-queuedaemon: Break out queuerun-finished/<walker> Ian Jackson
                   ` (4 subsequent siblings)
  13 siblings, 1 reply; 37+ messages in thread
From: Ian Jackson @ 2015-09-02 15:45 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Ian Campbell

If multiple walkers want to ask the same chan, we want to serialise
them.  This is actually straightforward:  Firstly, we arrrange that
each walker finishing a thought will prompt _all_ walkers to
reconsider whether they need to continue.  Then we can simply do
nothing if we want to a chan to think that another walker is already
waiting for; since that other walker will prompt us later.

Still no actual functional change because there is still only one
walker.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
---
 ms-queuedaemon |   14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/ms-queuedaemon b/ms-queuedaemon
index 53ac655..779ede0 100755
--- a/ms-queuedaemon
+++ b/ms-queuedaemon
@@ -216,6 +216,12 @@ proc runneeded-2-requeue {} {
 #  $w/thinking_after    timeout
 # all the $w/ are generally upvar'd by walker-globals
 
+proc walkers-perhaps-queue-steps {} {
+    foreach-walker w {
+	after idle queuerun-perhaps-step $w
+    }
+}
+
 proc plan-reset {w} {
     exec ./ms-planner -w$w reset < /dev/null
 }
@@ -226,7 +232,7 @@ proc queuerun-start {w} {
     log-event "$w queuerun-start"
     plan-reset $w
     set queue_running $queue
-    after idle queuerun-perhaps-step plan
+    walkers-perhaps-queue-steps
 }
 
 proc queuerun-perhaps-step {w} {
@@ -247,7 +253,9 @@ proc queuerun-perhaps-step {w} {
     set next [lindex $queue_running 0]
     set already [we-are-thinking $next]
     if {[llength $already]} {
-	error "next $next thinking $already but also want $w"
+	# $already will wake us via walkers-perhaps-queue-steps
+	log-event "$w queuerun-perhaps-step already $already"
+	return
     }
 
     set thinking $next
@@ -304,7 +312,7 @@ proc queuerun-step-done {w why} {
         unset thinking_after
     }
     unset thinking
-    after idle queuerun-perhaps-step $w
+    walkers-perhaps-queue-steps
 }
 
 proc queue-thoughts-timedout {w} {
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [OSSTEST PATCH 10/13] Planner: ms-queuedaemon: Break out queuerun-finished/<walker>
  2015-09-02 15:45 [OSSTEST RFC PATCH 00/13] Planner: Performance improvement Ian Jackson
                   ` (8 preceding siblings ...)
  2015-09-02 15:45 ` [OSSTEST PATCH 09/13] Planner: ms-queuedaemon: Synchronise thinking " Ian Jackson
@ 2015-09-02 15:45 ` Ian Jackson
  2015-09-03 12:43   ` Ian Campbell
  2015-09-02 15:45 ` [OSSTEST PATCH 11/13] Planner: ms-queuedaemon: Break out notify-to-think Ian Jackson
                   ` (3 subsequent siblings)
  13 siblings, 1 reply; 37+ messages in thread
From: Ian Jackson @ 2015-09-02 15:45 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Ian Campbell

This formalises the queue-completed interface, allowing parts outside
the queuerun machinery to cleanly be notified when a queue is
completed, and relieving the queuerun-perhaps-step of the need to know
what to do for the end of any particular walker's queue.

Currently there is still only one walker, `plan'.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
---
 ms-queuedaemon |    8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/ms-queuedaemon b/ms-queuedaemon
index 779ede0..1def285 100755
--- a/ms-queuedaemon
+++ b/ms-queuedaemon
@@ -172,6 +172,11 @@ proc runneeded-perhaps-start {} {
     queuerun-start plan
 }
 
+proc queuerun-finished/plan {} {
+    runneeded-ensure-will 0
+    report-plan plan
+}
+
 proc runneeded-ensure-polling {} {
     log-event runneeded-ensure-polling
     global polling_after queue c
@@ -245,8 +250,7 @@ proc queuerun-perhaps-step {w} {
 
     if {![llength $queue_running]} {
         unset queue_running
-        runneeded-ensure-will 0
-        report-plan $w
+	queuerun-finished/$w
         return
     }
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [OSSTEST PATCH 11/13] Planner: ms-queuedaemon: Break out notify-to-think
  2015-09-02 15:45 [OSSTEST RFC PATCH 00/13] Planner: Performance improvement Ian Jackson
                   ` (9 preceding siblings ...)
  2015-09-02 15:45 ` [OSSTEST PATCH 10/13] Planner: ms-queuedaemon: Break out queuerun-finished/<walker> Ian Jackson
@ 2015-09-02 15:45 ` Ian Jackson
  2015-09-03 12:44   ` Ian Campbell
  2015-09-02 15:45 ` [OSSTEST PATCH 12/13] Planner: ms-queuedaemon: Make report-plan take output walker name Ian Jackson
                   ` (2 subsequent siblings)
  13 siblings, 1 reply; 37+ messages in thread
From: Ian Jackson @ 2015-09-02 15:45 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Ian Campbell

This is going to want to do something more complicated shortly.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
---
 ms-queuedaemon |   11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/ms-queuedaemon b/ms-queuedaemon
index 1def285..0158048 100755
--- a/ms-queuedaemon
+++ b/ms-queuedaemon
@@ -268,9 +268,8 @@ proc queuerun-perhaps-step {w} {
 
     set thinking_after [after [expr {$c(QueueThoughtsTimeout) * 1000}] \
                             queue-thoughts-timedout $w]
-    for-chan $thinking {
-        puts-chan $thinking "!OK think"
-    }
+
+    notify-to-think $w $thinking
 }
 
 proc report-plan {w} {
@@ -379,6 +378,12 @@ proc cmd/unwait {chan desc} {
     puts-chan $chan "OK unwait $res"
 }
 
+proc notify-to-think {w thinking} { 
+    for-chan $thinking {
+	puts-chan $thinking "!OK think"
+    }
+}
+
 #---------- general stuff ----------
 
 proc banner {chan} {
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [OSSTEST PATCH 12/13] Planner: ms-queuedaemon: Make report-plan take output walker name
  2015-09-02 15:45 [OSSTEST RFC PATCH 00/13] Planner: Performance improvement Ian Jackson
                   ` (10 preceding siblings ...)
  2015-09-02 15:45 ` [OSSTEST PATCH 11/13] Planner: ms-queuedaemon: Break out notify-to-think Ian Jackson
@ 2015-09-02 15:45 ` Ian Jackson
  2015-09-03 12:44   ` Ian Campbell
  2015-09-02 15:45 ` [OSSTEST PATCH 13/13] Planner: ms-queuedaemon: Restart planning when resources become free Ian Jackson
  2015-09-03 11:49 ` [OSSTEST PATCH 14/13] Planning reports: Break out return-plan-to-client Ian Jackson
  13 siblings, 1 reply; 37+ messages in thread
From: Ian Jackson @ 2015-09-02 15:45 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Ian Campbell

We are going to want to process each walker's data into reports which
are not necessarily named after the same walker.

No functional change as yet.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
---
 ms-queuedaemon |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/ms-queuedaemon b/ms-queuedaemon
index 0158048..d2aabf4 100755
--- a/ms-queuedaemon
+++ b/ms-queuedaemon
@@ -138,7 +138,7 @@ proc runneeded-perhaps-start {} {
 
     if {![llength $queue]} {
         plan-reset plan
-        report-plan plan
+        report-plan plan plan
         return
     }
 
@@ -174,7 +174,7 @@ proc runneeded-perhaps-start {} {
 
 proc queuerun-finished/plan {} {
     runneeded-ensure-will 0
-    report-plan plan
+    report-plan plan plan
 }
 
 proc runneeded-ensure-polling {} {
@@ -272,10 +272,10 @@ proc queuerun-perhaps-step {w} {
     notify-to-think $w $thinking
 }
 
-proc report-plan {w} {
+proc report-plan {w wo} {
     global c
     if {[catch {
-	set outputfile "$c(WebspaceFile)/resource-$w.html"
+	set outputfile "$c(WebspaceFile)/resource-$wo.html"
 	exec ./ms-planner -w$w show-html > $outputfile
     } emsg]} {
         log "INTERNAL ERROR showing $w html: $emsg"
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [OSSTEST PATCH 13/13] Planner: ms-queuedaemon: Restart planning when resources become free
  2015-09-02 15:45 [OSSTEST RFC PATCH 00/13] Planner: Performance improvement Ian Jackson
                   ` (11 preceding siblings ...)
  2015-09-02 15:45 ` [OSSTEST PATCH 12/13] Planner: ms-queuedaemon: Make report-plan take output walker name Ian Jackson
@ 2015-09-02 15:45 ` Ian Jackson
  2015-09-03 12:59   ` Ian Campbell
  2015-09-03 11:49 ` [OSSTEST PATCH 14/13] Planning reports: Break out return-plan-to-client Ian Jackson
  13 siblings, 1 reply; 37+ messages in thread
From: Ian Jackson @ 2015-09-02 15:45 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Ian Campbell

This solves a performance problem with the existing planner.

The problem is that with a large installation, and a big queue, a full
plan can take a long time to prepare.  (In our current installation,
perhaps as long as half an hour.)  Any resource which becomes free
during one plan run cannot be allocated to a new job until the next
plan run starts.  This means resources (test machines) are often
sitting around idle.

Fix this by restarting the planning process as soon as any new
resource becomes free.  This means that jobs at the front of the queue
get a chance to allocate it right away, so it will probably be
allocated soon.

If it is only interesting to jobs later in the queue, then there may
be a delay in reallocating it, but presumably the resource is not much
in demand and those later jobs will allocate it when they get a bit
closer to the head.

But, there is a problem with this: it means that the plan is generally
never completed.  So we have no overview any more of when which
flights will finish and what the overall queue is like.  We solve this
problem by running a second instance of the planner algorithm, all the
way to completion, in a `dummy' mode where no actual resource
allocation takes place.  This second `projection' instance comes into
being whenever the main `plan' instance is restarted, and it inherits
the planning state from the main `plan' instance.

Global livelock (where we keep restarting the plan but never manage to
allocate anything) is not possible because each restart involves a new
resource becoming free.  If nothing gets allocated because we can't
get that far before being restarted, then eventually there will be
nothing left allocated to become newly free.

Starvation, of a form, is possible: a late-in-queue job which wants a
resource available right now might have difficulty allocating it
because the planner is spending its effort rescheduling early-in-queue
jobs which want resources which are in greater demand - so that the
late-in-queue job never gets called.  Arguably this is an appropriate
allocation of planning time.

With this arrangemernt we can generate two reports: a `plan' report
containing the short term plan which was used for actual resource
allocation, and which is frequently restarted and therefore not
necessarily complete; and a `projection' report which contains a
complete plan for all work the system is currently aware of, but which
is less-frequently updated.

Because planner clients do not contain the planning algorithm state,
the only client change needed is the ability to run in a `dummy' mode
without actual allocation; this is the `noalloc' feature earlier in
this series.

The main work is in ms-queuedaemon.  We have prepared the ground for
multiple instances of the planning algorithm; from the point of view
of ms-queuedaemon, an instance of the planning algorithm is mainly a
walk over the job queue.  So we call them `walkers'.

Therefore, what we do here is introduce a new `projection' walker,
as follows:

Add `projection' to the global list of possible walkers.

Invent a new section of code, the `restarter', which is responsible
for managing the relationship between the two walkers.  (It uses
direct knowledge of the queue state data structures, etc., to avoid
having to invent a complete formal interface to a walker.)

If we ever finish the plan walker's queue, we update both the
projection report output and the plan report output, from the same
plan.  Finishing the projection walker's queue means we have a
complete projection, but we don't touch the plan.

In principle it might happen that the plan walker might overtake the
projection walker, and then complete, write out a complete and up to
date plan as the projection, and that the projection walker would then
complete and overwrite the projection with less-up-to-date
information.  We don't explicitly exclude this.  Of course such a
result will be rectified soon enough by another planning run.

The restarter can ask the database for the list of currently-available
resources, and can therefore detect when new become newly-free.

The rest of the code remains largely ignorant of the operation of the
restarter.  There are a few hooks:

runneeded-perhaps-start notifies the restarter when we start the
plan; this is used by the restarter to record the set of free
resources at the start of a planning run, so that it can see later
whether any /new/ resources have become free.

restarter-maybe-provoke-restart is called when we get notification
from the the owner daemon that resources may have become idle.  We
look for newly-idle resources, and if there are any, and we are
running the plan walker, we directly edit the plan walker's queue to
put RESTART at the front.

queuerun-perhaps-step spots the special entry RESTART in its queue and
calls into back the restarter when it finds it.  This deferred
approach is necessary because we can't do the restart operation while
a client is thinking (because we would have to change that client's
cogitation from the `live, can allocate' mode to the `dummy, cannot
allocate' mode; and because that would make the code more complex).

The main work is done in the restarter-restart-now hook.  It reports
the current (incomplete) plan, and then checks to see if a projection
walker is running; if it is, it leaves it alone, and simply abandons
the current plan run and arranges for a new run to started.  If a
projection walker is not running it copies all the plan walker's state
(including the data-plan.pl disk file containing the plan-in-progress)
to the projection walker, and sets the projection walker going.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
---
 README.planner |    8 +++++
 ms-queuedaemon |   98 ++++++++++++++++++++++++++++++++++++++++++++++++++++++--
 2 files changed, 104 insertions(+), 2 deletions(-)

diff --git a/README.planner b/README.planner
index 24185ce..c1b2bf6 100644
--- a/README.planner
+++ b/README.planner
@@ -76,6 +76,14 @@ that newly-freed resources are properly offered first to the tasks at
 the front of the queue.  ms-ownerdaemon sets all idle resources to
 allocatable at the start of each planning cycle.
 
+The planner actually sometimes runs two planning cycles: if resources
+become free while the planner is running, it will restart the planning
+cycle in an effort to get those resources into service.  But, it will
+leave the existing planning run going in a projection-only mode (where
+no resources actually get allocated), so that there is a report for
+the administrator showing an idea of what the system thinks may happen
+in the more distant future.
+
 
 ms-ownerdaemon and `ownd' tasks
 -------------------------------
diff --git a/ms-queuedaemon b/ms-queuedaemon
index d2aabf4..425b98f 100755
--- a/ms-queuedaemon
+++ b/ms-queuedaemon
@@ -21,7 +21,7 @@
 
 source ./tcl/daemonlib.tcl
 
-set walkers {plan}
+set walkers {plan projection}
 
 proc walker-globals {w} {
     # introduces queue_running, thinking[_after] for the specific walker
@@ -169,12 +169,19 @@ proc runneeded-perhaps-start {} {
     log "runneeded-perhaps-start starting cleaned=$cleaned"
 
     runneeded-2-requeue
+    restarter-starting-plan-hook
     queuerun-start plan
 }
 
 proc queuerun-finished/plan {} {
     runneeded-ensure-will 0
     report-plan plan plan
+    report-plan plan projection
+}
+
+proc queuerun-finished/projection {} {
+    runneeded-ensure-will 0
+    report-plan projection projection
 }
 
 proc runneeded-ensure-polling {} {
@@ -255,6 +262,11 @@ proc queuerun-perhaps-step {w} {
     }
 
     set next [lindex $queue_running 0]
+    if {![string compare RESTART $next]} {
+	lshift queue_running
+	restarter-restart-now
+    }
+
     set already [we-are-thinking $next]
     if {[llength $already]} {
 	# $already will wake us via walkers-perhaps-queue-steps
@@ -378,9 +390,90 @@ proc cmd/unwait {chan desc} {
     puts-chan $chan "OK unwait $res"
 }
 
+#---------- special magic for restarting the plan ----------
+
+proc for-free-resources {body varname} {
+    jobdb::transaction resources {
+	pg_execute -array free_resources_row dbh {
+		SELECT (restype || '/' || resname || '/' || shareix) AS r
+		  FROM resources
+	     WHERE NOT (SELECT live FROM tasks WHERE taskid=owntaskid)
+	      ORDER BY restype, resname
+	} [list uplevel 1 \
+	       "[list upvar #0 free_resources_row(r) $varname]; $body"]
+    }
+    return $l
+}
+
+proc restarter-starting-plan-hook {} {
+    global wasfree
+    catch { unset wasfree }
+    for-free-resources freeres {
+	set wasfree($freeres) 1
+    }
+}
+
+proc restarter-maybe-provoke-restart {newly_free} {
+    set newly_free {}
+    global wasfree
+    for-free-resources freeres {
+	if {[info exists wasfree($freeres)]} continue
+	lappend newly_free $freeres
+	set wasfree($freeres) 1
+    }
+    if {!$newly_free} {
+	log-event "restarter-maybe-provoke-restart nothing"
+	return
+    }
+ 
+    walker-runvars plan
+
+    if {!([info exists queue_running] && [llength $queue_running]}} {
+	log-event "restarter-maybe-provoke-restart not-running ($newly_free)"
+	return
+    }
+    
+    log-event "restarter-maybe-provoke-restart provoked ($newly_free)"
+
+    if {[string compare RESTART [lindex $queue_running 0]]} {
+	set queue_running [concat RESTART $queue_running]
+    }
+    after idle queuerun-perhaps-step plan
+}
+
+proc restarter-restart-now {} {
+    # We restart the `plan' walker.  Well, actually, if the #
+    # `projection' walker is not running, we transfer the `plan'
+    # walker to it.  At this stage the plan walker is not thinking so
+    # there are no outstanding callbacks to worry about.
+
+    report-plan plan plan
+
+    global projection/queue_running
+    global plan/queue_running
+
+    if {![info exists projection/queue_running]} {
+	log-event "queuerun-restart-now projection-idle continue-as"
+	set projection/queue_running [set plan/queue_running]
+	file rename -force data-plan.pl data-projection.pl
+	after idle queuerun-perhaps-step projection
+    } else {
+	log-event "queuerun-restart-now projection-running"
+    }
+    unset plan/queue_running
+    runneeded-ensure-will 2
+}
+
 proc notify-to-think {w thinking} { 
     for-chan $thinking {
-	puts-chan $thinking "!OK think"
+	switch -glob $w.[info exists info(feature-noalloc)] {
+	    plan.* { puts-chan $thinking "!OK think" }
+	    projection.1 { puts-chan $thinking "!OK think noalloc" }
+	    projection.0 {
+		# oh well, can't include it in the projection; too bad
+		queuerun-step-done $w "!feature-noalloc"
+	    }
+	}
     }
 }
 
@@ -519,6 +612,7 @@ proc await-endings-notified {} {
             error "$owndchan eof"
         }
         runneeded-ensure-will 2
+        restarter-maybe-provoke-restart
     }
 }
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [OSSTEST PATCH 01/13] Tcl: Use lshift instead of open-coding with lrange
  2015-09-02 15:45 ` [OSSTEST PATCH 01/13] Tcl: Use lshift instead of open-coding with lrange Ian Jackson
@ 2015-09-03 10:44   ` Ian Campbell
  2015-09-03 10:50     ` Ian Jackson
  0 siblings, 1 reply; 37+ messages in thread
From: Ian Campbell @ 2015-09-03 10:44 UTC (permalink / raw)
  To: Ian Jackson, xen-devel

On Wed, 2015-09-02 at 16:45 +0100, Ian Jackson wrote:
> In ms-queuedaemon, and JobDB-Executive, once each.  No functional
> change.
> 
> Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>

Acked-by: Ian Campbell <ian.campbell@citrix.com>

(Caveat: I don't speak a lot of tcl, but seems plausible. Same caveat for
all tcl patches here...)

> ---
>  ms-queuedaemon          |    3 +--
>  tcl/JobDB-Executive.tcl |    3 +--
>  2 files changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/ms-queuedaemon b/ms-queuedaemon
> index e15bc79..d6d59ee 100755
> --- a/ms-queuedaemon
> +++ b/ms-queuedaemon
> @@ -210,8 +210,7 @@ proc queuerun-perhaps-step {} {
>          return
>      }
>  
> -    set thinking [lindex $queue_running 0]
> -    set queue_running [lrange $queue_running 1 end]
> +    set thinking [lshift queue_running]
>      log-event "queuerun-perhaps-step selected"
>  
>      set thinking_after [after [expr {$c(QueueThoughtsTimeout) * 1000}] \
> diff --git a/tcl/JobDB-Executive.tcl b/tcl/JobDB-Executive.tcl
> index 7228712..d61d2a2 100644
> --- a/tcl/JobDB-Executive.tcl
> +++ b/tcl/JobDB-Executive.tcl
> @@ -74,8 +74,7 @@ proc set-flight {} {
>          set argv [lrange $argv 2 end]
>      }
>  
> -    set flight [lindex $argv 0]
> -    set argv [lrange $argv 1 end]
> +    set flight [lshift argv]
>      set env(OSSTEST_FLIGHT) $flight
>  }
>  

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [OSSTEST PATCH 02/13] Planner: docs: Minor fixes
  2015-09-02 15:45 ` [OSSTEST PATCH 02/13] Planner: docs: Minor fixes Ian Jackson
@ 2015-09-03 10:47   ` Ian Campbell
  2015-09-03 12:04     ` Ian Jackson
  0 siblings, 1 reply; 37+ messages in thread
From: Ian Campbell @ 2015-09-03 10:47 UTC (permalink / raw)
  To: Ian Jackson, xen-devel

On Wed, 2015-09-02 at 16:45 +0100, Ian Jackson wrote:
>  * Document the ms-queuedaemon banner
>  * Document the argument to the allocation $resourcecall callback fn.
> 
> Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
> ---
>  Osstest/Executive.pm |    2 +-
>  README.planner       |    3 +++
>  2 files changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/Osstest/Executive.pm b/Osstest/Executive.pm
> index bf968c8..ab015d2 100644
> --- a/Osstest/Executive.pm
> +++ b/Osstest/Executive.pm
> @@ -604,7 +604,7 @@ sub plan_search ($$$$) {
>  }
>  
>  sub alloc_resources {
> -    my ($resourcecall) = pop @_;
> +    my ($resourcecall) = pop @_; # $resourcecall->($plan);

Took me a while to work out this means $resourcecall is a function which
should be called as shown, but now I've got that:

Acked-by: Ian Campbell <ian.campbell@citrix.com>

Maybe consider adding "Called as" to the front of the comment though? (Ack
either way)

>      my (%xparams) = @_;
>      # $resourcecall should die (abort) or return ($ok, $bookinglist)
>      #
> diff --git a/README.planner b/README.planner
> index ec4dce8..34eae97 100644
> --- a/README.planner
> +++ b/README.planner
> @@ -181,6 +181,9 @@ DETAILED PROTOCOL NOTES
>  
>  ms-queuedaemon commands
>  
> +        < OK ms-queuedaemon [INFO...]
> +                Banner on connection.  INFO should be ignored.
> +
>  	> wait
>  		I want to join the plan
>  

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [OSSTEST PATCH 03/13] Planner: docs: Document set-info command
  2015-09-02 15:45 ` [OSSTEST PATCH 03/13] Planner: docs: Document set-info command Ian Jackson
@ 2015-09-03 10:48   ` Ian Campbell
  0 siblings, 0 replies; 37+ messages in thread
From: Ian Campbell @ 2015-09-03 10:48 UTC (permalink / raw)
  To: Ian Jackson, xen-devel

On Wed, 2015-09-02 at 16:45 +0100, Ian Jackson wrote:
> Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>

Acked-by: Ian Campbell <ian.campbell@citrix.com>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [OSSTEST PATCH 01/13] Tcl: Use lshift instead of open-coding with lrange
  2015-09-03 10:44   ` Ian Campbell
@ 2015-09-03 10:50     ` Ian Jackson
  0 siblings, 0 replies; 37+ messages in thread
From: Ian Jackson @ 2015-09-03 10:50 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel

Ian Campbell writes ("Re: [OSSTEST PATCH 01/13] Tcl: Use lshift instead of open-coding with lrange"):
> Acked-by: Ian Campbell <ian.campbell@citrix.com>
> 
> (Caveat: I don't speak a lot of tcl, but seems plausible. Same caveat for
> all tcl patches here...)

Thanks...

Ian.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [OSSTEST PATCH 04/13] Planner: Fix indefinite holdoff
  2015-09-02 15:45 ` [OSSTEST PATCH 04/13] Planner: Fix indefinite holdoff Ian Jackson
@ 2015-09-03 10:51   ` Ian Campbell
  0 siblings, 0 replies; 37+ messages in thread
From: Ian Campbell @ 2015-09-03 10:51 UTC (permalink / raw)
  To: Ian Jackson, xen-devel

On Wed, 2015-09-02 at 16:45 +0100, Ian Jackson wrote:
> runneeded-ensure-will would always reset the runneeded_holdoff_after
> timer.  So no new queue run would start until no runneeded-ensure-will
> has occurred for (currently) 30s.
> 
> Instead, only start the timer if it's not already running.
> 
> Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>

Acked-by: Ian Campbell <ian.campbell@citrix.com>

> ---
>  ms-queuedaemon |   10 ++++++----
>  1 file changed, 6 insertions(+), 4 deletions(-)
> 
> diff --git a/ms-queuedaemon b/ms-queuedaemon
> index d6d59ee..1aa526c 100755
> --- a/ms-queuedaemon
> +++ b/ms-queuedaemon
> @@ -86,10 +86,12 @@ proc runneeded-ensure-will {need} {
>      log-event "runneeded-ensure-will $need (was $need_queue_run)"
>  
>      if {$need > $need_queue_run} { set need_queue_run $need }
> -    catch { after cancel $runneeded_holdoff_after }
> -    set runneeded_holdoff_after \
> -        [after [expr {$c(QueueDaemonHoldoff) * 1000}] \
> -             runneeded-perhaps-start]
> +
> +    if {![info exists runneeded_holdoff_after]} {
> +	set runneeded_holdoff_after \
> +	    [after [expr {$c(QueueDaemonHoldoff) * 1000}] \
> +		 runneeded-perhaps-start]
> +    }
>  }
>  
>  proc runneeded-perhaps-start {} {

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [OSSTEST PATCH 05/13] Planner: client side: $mayalloc parameter to $resourcecall->()
  2015-09-02 15:45 ` [OSSTEST PATCH 05/13] Planner: client side: $mayalloc parameter to $resourcecall->() Ian Jackson
@ 2015-09-03 10:52   ` Ian Campbell
  0 siblings, 0 replies; 37+ messages in thread
From: Ian Campbell @ 2015-09-03 10:52 UTC (permalink / raw)
  To: Ian Jackson, xen-devel

On Wed, 2015-09-02 at 16:45 +0100, Ian Jackson wrote:
> Add a new parameter to $resourcecall which allows the alloc_resources
> loop in Osstest::Executive to specify to its clients that on this
> occasion they should not make any actual allocations.
> 
> The callers of alloc_resources are all adjusted to honour this new
> parameter:
>  * ts-hosts-allocate-Executive avoids allocating unless $mayalloc
>  * mg-allocate avoids allocating unless $mayalloc
>  * mg-blockage never allocates anyway.
> 
> Currently we always pass 1, so no functional change.
> 
> Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>

Acked-by: Ian Campbell <ian.campbell@citrix.com>


> ---
>  Osstest/Executive.pm        |    4 ++--
>  mg-allocate                 |    4 ++--
>  mg-blockage                 |    2 +-
>  ts-hosts-allocate-Executive |    4 ++--
>  4 files changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/Osstest/Executive.pm b/Osstest/Executive.pm
> index ab015d2..f9be0a0 100644
> --- a/Osstest/Executive.pm
> +++ b/Osstest/Executive.pm
> @@ -604,7 +604,7 @@ sub plan_search ($$$$) {
>  }
>  
>  sub alloc_resources {
> -    my ($resourcecall) = pop @_; # $resourcecall->($plan);
> +    my ($resourcecall) = pop @_; # $resourcecall->($plan, $mayalloc);
>      my (%xparams) = @_;
>      # $resourcecall should die (abort) or return ($ok, $bookinglist)
>      #
> @@ -720,7 +720,7 @@ sub alloc_resources {
>  		$plan= from_json($jplan);
>  	    }, sub {
>  		if (!eval {
> -		    ($ok, $bookinglist) = $resourcecall->($plan);
> +		    ($ok, $bookinglist) = $resourcecall->($plan, 1);
>  		    1;
>  		}) {
>  		    warn "resourcecall $@";
> diff --git a/mg-allocate b/mg-allocate
> index fc1b394..6dc7ddd 100755
> --- a/mg-allocate
> +++ b/mg-allocate
> @@ -270,7 +270,7 @@ sub showposs ($) {
>  sub plan () {
>      my @got;
>      alloc_resources(sub {
> -        my ($plan) = @_;
> +        my ($plan, $mayalloc) = @_;
>  
>  	@got = ();
>          my @possmatrix = ([]);
> @@ -310,7 +310,7 @@ sub plan () {
>  	die unless $planned;
>  
>          my $allok=0;
> -        if (!$planned->{Start}) {
> +        if ($mayalloc && !$planned->{Start}) {
>              $allok=1;
>  
>              alloc_prep();
> diff --git a/mg-blockage b/mg-blockage
> index a21f15b..1f66d8e 100755
> --- a/mg-blockage
> +++ b/mg-blockage
> @@ -40,7 +40,7 @@ sub max { (reverse sort @_)[0]; }
>  
>  sub plan () {
>      alloc_resources(sub {
> -	my ($plan) = @_;
> +	my ($plan, $mayalloc) = @_;
>  
>  	my $now = time;
>  	if ($now > $end) { return (1, { Bookings => [ ] }); }
> diff --git a/ts-hosts-allocate-Executive b/ts-hosts-allocate-Executive
> index 7a9411d..6e1391e 100755
> --- a/ts-hosts-allocate-Executive
> +++ b/ts-hosts-allocate-Executive
> @@ -606,7 +606,7 @@ sub alloc_hosts () {
>  }
>  
>  sub attempt_allocation {
> -    ($plan) = @_;
> +    ($plan, $mayalloc) = @_;
>      undef $best;
>  
>      logm("allocating hosts: ".join(' ', map { $_->{Ident} } @hids));
> @@ -632,7 +632,7 @@ sub attempt_allocation {
>  
>      my $retval=0;
>  
> -    if (!$best->{Start}) {
> +    if ($mayalloc && !$best->{Start}) {
>  	$retval= 1;
>  	foreach my $hid (@hids) {
>  	    my $got= actual_allocation($hid);

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [OSSTEST PATCH 06/13] Planner: client side: New `!OK think noalloc' protocol
  2015-09-02 15:45 ` [OSSTEST PATCH 06/13] Planner: client side: New `!OK think noalloc' protocol Ian Jackson
@ 2015-09-03 10:53   ` Ian Campbell
  0 siblings, 0 replies; 37+ messages in thread
From: Ian Campbell @ 2015-09-03 10:53 UTC (permalink / raw)
  To: Ian Jackson, xen-devel

On Wed, 2015-09-02 at 16:45 +0100, Ian Jackson wrote:
> Introduce a way for the queue daemon to tell its client that it must
> not allocate anything in this planning iteration.
> 
> In the client:
>  * Advertise the new feature via set-info.
>  * Accept the `noalloc' part of `!OK think noalloc';
>  * Print that in our log message;
>  * Honour it by passing it to $resourcecall.
> 
> And document the new protocol.  However, there is no server-side yet,
> so this does not yet introduce any overall change to the system.
> 
> Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>

Acked-by: Ian Campbell <ian.campbell@citrix.com>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [OSSTEST PATCH 07/13] Planner: ms-planner support -w option
  2015-09-02 15:45 ` [OSSTEST PATCH 07/13] Planner: ms-planner support -w option Ian Jackson
@ 2015-09-03 11:25   ` Ian Campbell
  0 siblings, 0 replies; 37+ messages in thread
From: Ian Campbell @ 2015-09-03 11:25 UTC (permalink / raw)
  To: Ian Jackson, xen-devel

On Wed, 2015-09-02 at 16:45 +0100, Ian Jackson wrote:
> We are going to introduce multiple concurrent streams of planning
> processing, called `walkers' in ms-queuedaemon.  The work-in-progress
> plan is stored, server-side, during planning, in data-plan.pl.  But we
> need to have more than one of these.
> 
> Update ms-planner and ms-planner-debug to honour a -w option, to
> specify a replacement for the word `plan' in `data-plan.pl'.
> 
> No overall functional change, since nothing uses these options yet.
> 
> Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>

Acked-by: Ian Campbell <ian.campbell@citrix.com>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [OSSTEST PATCH 14/13] Planning reports: Break out return-plan-to-client
  2015-09-02 15:45 [OSSTEST RFC PATCH 00/13] Planner: Performance improvement Ian Jackson
                   ` (12 preceding siblings ...)
  2015-09-02 15:45 ` [OSSTEST PATCH 13/13] Planner: ms-queuedaemon: Restart planning when resources become free Ian Jackson
@ 2015-09-03 11:49 ` Ian Jackson
  2015-09-03 11:49   ` [OSSTEST PATCH 15/13] Plan reporting: Provide get-last-plan queuedaemon command Ian Jackson
  2015-09-03 13:01   ` [OSSTEST PATCH 14/13] Planning reports: Break out return-plan-to-client Ian Campbell
  13 siblings, 2 replies; 37+ messages in thread
From: Ian Jackson @ 2015-09-03 11:49 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Ian Campbell

We are going to want to reuse this.  No functional change.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
---
 ms-queuedaemon |    9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/ms-queuedaemon b/ms-queuedaemon
index 425b98f..222b687 100755
--- a/ms-queuedaemon
+++ b/ms-queuedaemon
@@ -354,11 +354,16 @@ proc cmd/thought-done {chan desc} {
     puts-chan $chan "OK thought"
 }
 
+proc return-plan-to-client {chan wf} {
+    set tplan [exec -keepnewline ./ms-planner -w$wf get-plan < /dev/null]
+    puts-chan-data $chan "OK get-plan" $tplan
+    return $tplan
+}
+
 proc cmd/get-plan {chan desc} {
     global plan
     set w [check-we-are-thinking $chan]
-    set plan [exec -keepnewline ./ms-planner -w$w get-plan < /dev/null]
-    puts-chan-data $chan "OK get-plan" $plan
+    set plan [return-plan-to-client $chan $w]
 }
 
 proc cmd/book-resources {chan desc bytes} {
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [OSSTEST PATCH 15/13] Plan reporting: Provide get-last-plan queuedaemon command
  2015-09-03 11:49 ` [OSSTEST PATCH 14/13] Planning reports: Break out return-plan-to-client Ian Jackson
@ 2015-09-03 11:49   ` Ian Jackson
  2015-09-03 13:05     ` Ian Campbell
  2015-09-03 13:01   ` [OSSTEST PATCH 14/13] Planning reports: Break out return-plan-to-client Ian Campbell
  1 sibling, 1 reply; 37+ messages in thread
From: Ian Jackson @ 2015-09-03 11:49 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Ian Campbell

This allows retrieval, by monitoring clients which are not
participating in the planning queue, of the finished projection, or
the unfinished plan as it was at the time of last restart.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
---
 README.planner |   11 +++++++++++
 ms-queuedaemon |    8 ++++++++
 2 files changed, 19 insertions(+)

diff --git a/README.planner b/README.planner
index c1b2bf6..8231812 100644
--- a/README.planner
+++ b/README.planner
@@ -224,6 +224,17 @@ ms-queuedaemon commands
 
 	> thought-done | thought-wait
 
+        > get-last-plan projection|plan
+                Retrieve the last fully completed plan (`projection'),
+		or the most recent short-term information used for
+		actual resource allocation (`plan').
+
+		For monitoring tools, not participants in the resource
+		allocation process.  Response is as for get-plan:
+
+		< OK get-plan BYTES
+		PLAN-DATA
+
 
 Plan is:
 	Start	time_t used for "now" *
diff --git a/ms-queuedaemon b/ms-queuedaemon
index 222b687..97dff85 100755
--- a/ms-queuedaemon
+++ b/ms-queuedaemon
@@ -292,6 +292,7 @@ proc report-plan {w wo} {
     } emsg]} {
         log "INTERNAL ERROR showing $w html: $emsg"
     } else {
+	file copy -force data-$w.pl data-$wo-final.pl
         log "$w report-plan OK"
     }
 }
@@ -366,6 +367,13 @@ proc cmd/get-plan {chan desc} {
     set plan [return-plan-to-client $chan $w]
 }
 
+proc cmd/get-last-plan {chan desc w} {
+    global walkers
+    if {[lsearch -exact $walkers $w] < 0} { error "unknown last-plan ($w)" }
+    if {![file exists data-$w-final.pl]} { error "no last-plan $w" }
+    return-plan-data-to-client $w-final
+}
+
 proc cmd/book-resources {chan desc bytes} {
     read-chan-data $chan $bytes do-book-resources $w
 }
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [OSSTEST PATCH 02/13] Planner: docs: Minor fixes
  2015-09-03 10:47   ` Ian Campbell
@ 2015-09-03 12:04     ` Ian Jackson
  0 siblings, 0 replies; 37+ messages in thread
From: Ian Jackson @ 2015-09-03 12:04 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel, Ian Jackson

Ian Campbell writes ("Re: [OSSTEST PATCH 02/13] Planner: docs: Minor fixes"):
> On Wed, 2015-09-02 at 16:45 +0100, Ian Jackson wrote:
> >  sub alloc_resources {
> > -    my ($resourcecall) = pop @_;
> > +    my ($resourcecall) = pop @_; # $resourcecall->($plan);
> 
> Took me a while to work out this means $resourcecall is a function which
> should be called as shown, but now I've got that:
...
> Maybe consider adding "Called as" to the front of the comment though? (Ack
> either way)

Perl syntax like   $resourcecall->($plan)  always means that
$resourcecall is a subref which is being invoked.  And the variable is
called `call'.

So I think extra commentary would be redundant.

> Acked-by: Ian Campbell <ian.campbell@citrix.com>

Thanks,
Ian.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [OSSTEST PATCH 08/13] Planner: ms-queuedaemon: Prep for multiple walkers
  2015-09-02 15:45 ` [OSSTEST PATCH 08/13] Planner: ms-queuedaemon: Prep for multiple walkers Ian Jackson
@ 2015-09-03 12:05   ` Ian Campbell
  2015-09-03 15:42     ` Ian Jackson
  0 siblings, 1 reply; 37+ messages in thread
From: Ian Campbell @ 2015-09-03 12:05 UTC (permalink / raw)
  To: Ian Jackson, xen-devel

On Wed, 2015-09-02 at 16:45 +0100, Ian Jackson wrote:
> We are going to introduce multiple concurrent streams of planning
> processing, called `walkers'.
> 
> Prepare the ground for this with some formulaic changes which will
> otherwise greatly clutter substantive patches.
> 
> (A client will still only think for one walker at once, because that's
> what the client protocol expects - and anything else would be far too
> confusing.)
> 
> General:
>  * Introduce the concept of a `walker' to ms-queuedaemon.
>  * Provide a list of the walkers which might exist, `walkers'
>  * Provide some helper procedures for iterating over these,
>    and easily accessing their state.
> 
> Queue handling:
>  * Add a new `w' argument to many procs: specifically, most of the
>    procs in the section `machinery for running the queue'.
>  * Log the walker ($w) at the start of all relevant log messages.
>  * Pass the -w option to ms-planner and ms-planner-debug.
>  * Add safety catches which will crash the ms-queuedaemon if it finds
>    it is asking the same client to think for more than one walker.
>  * we-are-thinking and check-we-are-thinking tell the caller what
>    walker the client is thinking for.
>  * In the resource-plan.html filename, replace `plan' with the walker
>    filename.
> 
> Elsewhere:
>  * Teach dequeue-chan to deal with all the walkers, including
>    maybe the (one) walker for which the client is thinking.
>  * Teach log-state to report on all the walkers.
>  * In the runneeded logic, hardcode `plan' as the walker to use.
> 
> There is still actually only one walker.
> 
> No overall functional change, except to some log messages.
> 
> Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>

This mostly seems plausible, in so far as I can speak tcl.

You spotted a bug while explaining the intricacies of upvar/uplevel to me
IRL, but I didn't see anything further...

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [OSSTEST PATCH 09/13] Planner: ms-queuedaemon: Synchronise thinking multiple walkers
  2015-09-02 15:45 ` [OSSTEST PATCH 09/13] Planner: ms-queuedaemon: Synchronise thinking " Ian Jackson
@ 2015-09-03 12:06   ` Ian Campbell
  0 siblings, 0 replies; 37+ messages in thread
From: Ian Campbell @ 2015-09-03 12:06 UTC (permalink / raw)
  To: Ian Jackson, xen-devel

On Wed, 2015-09-02 at 16:45 +0100, Ian Jackson wrote:
> If multiple walkers want to ask the same chan, we want to serialise
> them.  This is actually straightforward:  Firstly, we arrrange that
> each walker finishing a thought will prompt _all_ walkers to
> reconsider whether they need to continue.  Then we can simply do
> nothing if we want to a chan to think that another walker is already
> waiting for; since that other walker will prompt us later.
> 
> Still no actual functional change because there is still only one
> walker.
> 
> Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>

Acked-by: Ian Campbell <ian.campbell@citrix.com>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [OSSTEST PATCH 10/13] Planner: ms-queuedaemon: Break out queuerun-finished/<walker>
  2015-09-02 15:45 ` [OSSTEST PATCH 10/13] Planner: ms-queuedaemon: Break out queuerun-finished/<walker> Ian Jackson
@ 2015-09-03 12:43   ` Ian Campbell
  0 siblings, 0 replies; 37+ messages in thread
From: Ian Campbell @ 2015-09-03 12:43 UTC (permalink / raw)
  To: Ian Jackson, xen-devel

On Wed, 2015-09-02 at 16:45 +0100, Ian Jackson wrote:
> This formalises the queue-completed interface, allowing parts outside
> the queuerun machinery to cleanly be notified when a queue is
> completed, and relieving the queuerun-perhaps-step of the need to know
> what to do for the end of any particular walker's queue.
> 
> Currently there is still only one walker, `plan'.
> 
> Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>

Acked-by: Ian Campbell <ian.campbell@citrix.com>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [OSSTEST PATCH 11/13] Planner: ms-queuedaemon: Break out notify-to-think
  2015-09-02 15:45 ` [OSSTEST PATCH 11/13] Planner: ms-queuedaemon: Break out notify-to-think Ian Jackson
@ 2015-09-03 12:44   ` Ian Campbell
  0 siblings, 0 replies; 37+ messages in thread
From: Ian Campbell @ 2015-09-03 12:44 UTC (permalink / raw)
  To: Ian Jackson, xen-devel

On Wed, 2015-09-02 at 16:45 +0100, Ian Jackson wrote:
> This is going to want to do something more complicated shortly.
> 
> Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>

Acked-by: Ian Campbell <ian.campbell@citrix.com>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [OSSTEST PATCH 12/13] Planner: ms-queuedaemon: Make report-plan take output walker name
  2015-09-02 15:45 ` [OSSTEST PATCH 12/13] Planner: ms-queuedaemon: Make report-plan take output walker name Ian Jackson
@ 2015-09-03 12:44   ` Ian Campbell
  0 siblings, 0 replies; 37+ messages in thread
From: Ian Campbell @ 2015-09-03 12:44 UTC (permalink / raw)
  To: Ian Jackson, xen-devel

On Wed, 2015-09-02 at 16:45 +0100, Ian Jackson wrote:
> We are going to want to process each walker's data into reports which
> are not necessarily named after the same walker.
> 
> No functional change as yet.
> 
> Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>

Acked-by: Ian Campbell <ian.campbell@citrix.com>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [OSSTEST PATCH 13/13] Planner: ms-queuedaemon: Restart planning when resources become free
  2015-09-02 15:45 ` [OSSTEST PATCH 13/13] Planner: ms-queuedaemon: Restart planning when resources become free Ian Jackson
@ 2015-09-03 12:59   ` Ian Campbell
  0 siblings, 0 replies; 37+ messages in thread
From: Ian Campbell @ 2015-09-03 12:59 UTC (permalink / raw)
  To: Ian Jackson, xen-devel

On Wed, 2015-09-02 at 16:45 +0100, Ian Jackson wrote:

/me takes a deep breath...

> With this arrangemernt we can generate two reports: a `plan' report

"arrangement"

> Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
> ---
>  README.planner |    8 +++++
>  ms-queuedaemon |   98 ++++++++++++++++++++++++++++++++++++++++++++++++++++++--

I think you may want to add some stuff to .gitignore, data-projection.pl is
the main one I can think of.

> +proc restarter-maybe-provoke-restart {newly_free} {
> +    set newly_free {}
> +    global wasfree
> +    for-free-resources freeres {
> +	if {[info exists wasfree($freeres)]} continue
> +	lappend newly_free $freeres
> +	set wasfree($freeres) 1
> +    }
> +    if {!$newly_free} {
> +	log-event "restarter-maybe-provoke-restart nothing"
> +	return
> +    }
> + 
> +    walker-runvars plan

ITYM walker-globals?

> +
> +proc restarter-restart-now {} {
> +    # We restart the `plan' walker.  Well, actually, if the #

Comment wrap damage? ("#" at the end)

> +    # `projection' walker is not running, we transfer the `plan'
> +    # walker to it.  At this stage the plan walker is not thinking so
> +    # there are no outstanding callbacks to worry about.
> +
> +    report-plan plan plan
> +
> +    global projection/queue_running
> +    global plan/queue_running
> +
> +    if {![info exists projection/queue_running]} {
> +	log-event "queuerun-restart-now projection-idle continue-as"
> +	set projection/queue_running [set plan/queue_running]

What sort of language names the "get" function "set" :-P

I didn't spot anything else strange.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [OSSTEST PATCH 14/13] Planning reports: Break out return-plan-to-client
  2015-09-03 11:49 ` [OSSTEST PATCH 14/13] Planning reports: Break out return-plan-to-client Ian Jackson
  2015-09-03 11:49   ` [OSSTEST PATCH 15/13] Plan reporting: Provide get-last-plan queuedaemon command Ian Jackson
@ 2015-09-03 13:01   ` Ian Campbell
  1 sibling, 0 replies; 37+ messages in thread
From: Ian Campbell @ 2015-09-03 13:01 UTC (permalink / raw)
  To: Ian Jackson, xen-devel

On Thu, 2015-09-03 at 12:49 +0100, Ian Jackson wrote:
> We are going to want to reuse this.  No functional change.
> 
> Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>

Acked-by: Ian Campbell <ian.campbell@citrix.com>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [OSSTEST PATCH 15/13] Plan reporting: Provide get-last-plan queuedaemon command
  2015-09-03 11:49   ` [OSSTEST PATCH 15/13] Plan reporting: Provide get-last-plan queuedaemon command Ian Jackson
@ 2015-09-03 13:05     ` Ian Campbell
  2015-09-03 15:51       ` Ian Jackson
  0 siblings, 1 reply; 37+ messages in thread
From: Ian Campbell @ 2015-09-03 13:05 UTC (permalink / raw)
  To: Ian Jackson, xen-devel

On Thu, 2015-09-03 at 12:49 +0100, Ian Jackson wrote:
> This allows retrieval, by monitoring clients which are not
> participating in the planning queue, of the finished projection, or
> the unfinished plan as it was at the time of last restart.
> 
> Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
> ---
>  README.planner |   11 +++++++++++
>  ms-queuedaemon |    8 ++++++++
>  2 files changed, 19 insertions(+)
> 
> diff --git a/README.planner b/README.planner
> index c1b2bf6..8231812 100644
> --- a/README.planner
> +++ b/README.planner
> @@ -224,6 +224,17 @@ ms-queuedaemon commands
>  
>  	> thought-done | thought-wait
>  
> +        > get-last-plan projection|plan
> +                Retrieve the last fully completed plan (`projection'),
> +		or the most recent short-term information used for
> +		actual resource allocation (`plan').
> +
> +		For monitoring tools, not participants in the resource
> +		allocation process.  Response is as for get-plan:
> +
> +		< OK get-plan BYTES
> +		PLAN-DATA
> +
>  
>  Plan is:
>  	Start	time_t used for "now" *
> diff --git a/ms-queuedaemon b/ms-queuedaemon
> index 222b687..97dff85 100755
> --- a/ms-queuedaemon
> +++ b/ms-queuedaemon
> @@ -292,6 +292,7 @@ proc report-plan {w wo} {
>      } emsg]} {
>          log "INTERNAL ERROR showing $w html: $emsg"
>      } else {
> +	file copy -force data-$w.pl data-$wo-final.pl

IIRC there was code earlier which did smth like "cp data-plan.pl data
-projection.pl" during the take over and that if no resources come
available during a plan walk then no projection walk would happen.

IOW a user needs to do something like try get-last-plan on projection, then
if that doesn't work try again with plan. 

But with that the last projection walk might be out of data WRT a more
recently completed plan walk.

So I think a client would need to get both and compare the timestamps or
something. It might be nicer to arrange for data-plan-final.pl to be copied
to data-projection-final.pl at the same time as data-plan.pl becomes data
-projection.pl?

>          log "$w report-plan OK"
>      }
>  }
> @@ -366,6 +367,13 @@ proc cmd/get-plan {chan desc} {
>      set plan [return-plan-to-client $chan $w]
>  }
>  
> +proc cmd/get-last-plan {chan desc w} {
> +    global walkers
> +    if {[lsearch -exact $walkers $w] < 0} { error "unknown last-plan 
> ($w)" }
> +    if {![file exists data-$w-final.pl]} { error "no last-plan $w" }
> +    return-plan-data-to-client $w-final
> +}
> +
>  proc cmd/book-resources {chan desc bytes} {
>      read-chan-data $chan $bytes do-book-resources $w
>  }

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [OSSTEST PATCH 08/13] Planner: ms-queuedaemon: Prep for multiple walkers
  2015-09-03 12:05   ` Ian Campbell
@ 2015-09-03 15:42     ` Ian Jackson
  0 siblings, 0 replies; 37+ messages in thread
From: Ian Jackson @ 2015-09-03 15:42 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel

Ian Campbell writes ("Re: [OSSTEST PATCH 08/13] Planner: ms-queuedaemon: Prep for multiple walkers"):
> On Wed, 2015-09-02 at 16:45 +0100, Ian Jackson wrote:
> > We are going to introduce multiple concurrent streams of planning
> > processing, called `walkers'.
...
> This mostly seems plausible, in so far as I can speak tcl.

Thanks.

> You spotted a bug while explaining the intricacies of upvar/uplevel to me
> IRL, but I didn't see anything further...

Indeed; I have fixed that bug in my tree.

Ian.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [OSSTEST PATCH 15/13] Plan reporting: Provide get-last-plan queuedaemon command
  2015-09-03 13:05     ` Ian Campbell
@ 2015-09-03 15:51       ` Ian Jackson
  2015-09-03 15:52         ` Ian Jackson
  0 siblings, 1 reply; 37+ messages in thread
From: Ian Jackson @ 2015-09-03 15:51 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel

Ian Campbell writes ("Re: [OSSTEST PATCH 15/13] Plan reporting: Provide get-last-plan queuedaemon command"):
> On Thu, 2015-09-03 at 12:49 +0100, Ian Jackson wrote:
> > This allows retrieval, by monitoring clients which are not
> > participating in the planning queue, of the finished projection, or
> > the unfinished plan as it was at the time of last restart.
...
> > +	file copy -force data-$w.pl data-$wo-final.pl
> 
> IIRC there was code earlier which did smth like "cp data-plan.pl data
> -projection.pl" during the take over and that if no resources come
> available during a plan walk then no projection walk would happen.

Yes.  After patch 13 we have this:

  proc queuerun-finished/plan {} {
      runneeded-ensure-will 0
      report-plan plan plan
      report-plan plan projection
  }

report-plan X Y generates resource-Y.html from data-Y.pl.

In 15/13 we make it also copy data-Y.pl to data-Y-final.pl.

>  It might be nicer to arrange for data-plan-final.pl to be copied
> to data-projection-final.pl at the same time as data-plan.pl becomes data
> -projection.pl?

So this is already taken care of.

Ian.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [OSSTEST PATCH 15/13] Plan reporting: Provide get-last-plan queuedaemon command
  2015-09-03 15:51       ` Ian Jackson
@ 2015-09-03 15:52         ` Ian Jackson
  2015-09-03 16:12           ` Ian Campbell
  0 siblings, 1 reply; 37+ messages in thread
From: Ian Jackson @ 2015-09-03 15:52 UTC (permalink / raw)
  To: Ian Campbell, xen-devel

Ian Jackson writes ("Re: [OSSTEST PATCH 15/13] Plan reporting: Provide get-last-plan queuedaemon command"):
>   proc queuerun-finished/plan {} {
>       runneeded-ensure-will 0
>       report-plan plan plan
>       report-plan plan projection
>   }
> 
> report-plan X Y generates resource-Y.html from data-Y.pl.
> 
> In 15/13 we make it also copy data-Y.pl to data-Y-final.pl.
                                ^^^^^^^^^
I mean                          data-X.pl

Ian.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [OSSTEST PATCH 15/13] Plan reporting: Provide get-last-plan queuedaemon command
  2015-09-03 15:52         ` Ian Jackson
@ 2015-09-03 16:12           ` Ian Campbell
  0 siblings, 0 replies; 37+ messages in thread
From: Ian Campbell @ 2015-09-03 16:12 UTC (permalink / raw)
  To: Ian Jackson, xen-devel

On Thu, 2015-09-03 at 16:52 +0100, Ian Jackson wrote:
> Ian Jackson writes ("Re: [OSSTEST PATCH 15/13] Plan reporting: Provide 
> get-last-plan queuedaemon command"):
> >   proc queuerun-finished/plan {} {
> >       runneeded-ensure-will 0
> >       report-plan plan plan
> >       report-plan plan projection
> >   }
> > 
> > report-plan X Y generates resource-Y.html from data-Y.pl.
> > 
> > In 15/13 we make it also copy data-Y.pl to data-Y-final.pl.
>                                 ^^^^^^^^^
> I mean                          data-X.pl

I read it as you meant it.

So great, thanks.

^ permalink raw reply	[flat|nested] 37+ messages in thread

end of thread, other threads:[~2015-09-03 16:12 UTC | newest]

Thread overview: 37+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-09-02 15:45 [OSSTEST RFC PATCH 00/13] Planner: Performance improvement Ian Jackson
2015-09-02 15:45 ` [OSSTEST PATCH 01/13] Tcl: Use lshift instead of open-coding with lrange Ian Jackson
2015-09-03 10:44   ` Ian Campbell
2015-09-03 10:50     ` Ian Jackson
2015-09-02 15:45 ` [OSSTEST PATCH 02/13] Planner: docs: Minor fixes Ian Jackson
2015-09-03 10:47   ` Ian Campbell
2015-09-03 12:04     ` Ian Jackson
2015-09-02 15:45 ` [OSSTEST PATCH 03/13] Planner: docs: Document set-info command Ian Jackson
2015-09-03 10:48   ` Ian Campbell
2015-09-02 15:45 ` [OSSTEST PATCH 04/13] Planner: Fix indefinite holdoff Ian Jackson
2015-09-03 10:51   ` Ian Campbell
2015-09-02 15:45 ` [OSSTEST PATCH 05/13] Planner: client side: $mayalloc parameter to $resourcecall->() Ian Jackson
2015-09-03 10:52   ` Ian Campbell
2015-09-02 15:45 ` [OSSTEST PATCH 06/13] Planner: client side: New `!OK think noalloc' protocol Ian Jackson
2015-09-03 10:53   ` Ian Campbell
2015-09-02 15:45 ` [OSSTEST PATCH 07/13] Planner: ms-planner support -w option Ian Jackson
2015-09-03 11:25   ` Ian Campbell
2015-09-02 15:45 ` [OSSTEST PATCH 08/13] Planner: ms-queuedaemon: Prep for multiple walkers Ian Jackson
2015-09-03 12:05   ` Ian Campbell
2015-09-03 15:42     ` Ian Jackson
2015-09-02 15:45 ` [OSSTEST PATCH 09/13] Planner: ms-queuedaemon: Synchronise thinking " Ian Jackson
2015-09-03 12:06   ` Ian Campbell
2015-09-02 15:45 ` [OSSTEST PATCH 10/13] Planner: ms-queuedaemon: Break out queuerun-finished/<walker> Ian Jackson
2015-09-03 12:43   ` Ian Campbell
2015-09-02 15:45 ` [OSSTEST PATCH 11/13] Planner: ms-queuedaemon: Break out notify-to-think Ian Jackson
2015-09-03 12:44   ` Ian Campbell
2015-09-02 15:45 ` [OSSTEST PATCH 12/13] Planner: ms-queuedaemon: Make report-plan take output walker name Ian Jackson
2015-09-03 12:44   ` Ian Campbell
2015-09-02 15:45 ` [OSSTEST PATCH 13/13] Planner: ms-queuedaemon: Restart planning when resources become free Ian Jackson
2015-09-03 12:59   ` Ian Campbell
2015-09-03 11:49 ` [OSSTEST PATCH 14/13] Planning reports: Break out return-plan-to-client Ian Jackson
2015-09-03 11:49   ` [OSSTEST PATCH 15/13] Plan reporting: Provide get-last-plan queuedaemon command Ian Jackson
2015-09-03 13:05     ` Ian Campbell
2015-09-03 15:51       ` Ian Jackson
2015-09-03 15:52         ` Ian Jackson
2015-09-03 16:12           ` Ian Campbell
2015-09-03 13:01   ` [OSSTEST PATCH 14/13] Planning reports: Break out return-plan-to-client Ian Campbell

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.