All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v4 00/35] libxl ao abort request (cancellation)
@ 2015-06-25 17:44 Ian Jackson
  2015-06-25 17:44 ` [PATCH 01/35] libxl: ao internal API docs: Mention synchronous ao completion Ian Jackson
                   ` (35 more replies)
  0 siblings, 36 replies; 59+ messages in thread
From: Ian Jackson @ 2015-06-25 17:44 UTC (permalink / raw)
  To: xen-devel; +Cc: Wei Liu, Euan Harris, Ian Campbell

This is v4 of my series to provide support for abandoning a
long-running libxl operation.

This abandonment is now called `abort' rather than `cancel'; this
conveys the potential destructiveness, and also provides the space for
us to provide a safer (more checked) cancellation, later.

I have in general retained acks despite this change, unless the
adjustments to the individual patch were nontrivial or less than
entirely obvious.  Most of the patches from 16 onwards were affected.

This code has now been exercised, at least for domain create, resume
and destroy, using a test harness provided by Euan Harris of Citrix
the XenServer ring3 team.  The version of the test harness I used is
here:

  http://xenbits.xen.org/gitweb/?p=people/iwj/ring3-xl-test.git;a=summary
  git://xenbits.xen.org/people/iwj/ring3-xl-test.git
     master

IMO the test coverage now is already extensive enough for this series
to go into xen.git#staging.

(AIUI it is intended that the test suite go into xen.git eventually,
and be plumbed into osstest, but that is a separate task and the test
suite is nowhere near ready for that yet.)

 a 01/35  libxl: ao internal API docs: Mention synchronous ao completi
 a 02/35  libxl: suspend: switch_logdirty_done takes rc               
 a 03/35  libxl: suspend: common suspend callbacks take rc            
 a 04/35  libxl: suspend: Return correct error from callbacks         
 a 05/35  libxl: Use libxl__xswait* in libxl__ao_device               
 a 06/35  libxl: xswait/devstate: Move xswait to before devstate      
 a 07/35  libxl: devstate: Use libxl__xswait*                         
N  08/35  libxl: Rename AO_ABORT to AO_CREATE_FAIL                    
N  09/35  libxl: Change some log messages to say `abandoning' rather t
N  10/35  libxl: Change an internal comment to say `bail' rather than 
C  11/35  libxl: New error codes ABORTED etc.                         
 a 12/35  libxl: events: Make timeout and async exec setup take an ao,
Ca 13/35  libxl: events: Make libxl__async_exec_* pass caller an rc   
 a 14/35  libxl: events: Permit timeouts to signal ao abort           
N  15/35  libxl: spawn: Preserve rc in error path                     
Ca 16/35  libxl: domain create: Do not destroy on ao abort            
 a 17/35  libxl: ao: Record ultimate parent of a nested ao            
 a 18/35  libxl: ao: Count the nested progeny of an ao                
 a 19/35  libxl: ao: Provide manip_refcnt                             
C  20/35  libxl: ao abort: Provide public ao abort request API        
 a 21/35  libxl: ao abort: Provide explicit internal abort check API  
C  22/35  libxl: ao abort: Make timeouts abortable                    
 a 23/35  libxl: ao abort: Note that driver domain task cannot be usef
Ca 24/35  libxl: Introduce DOMAIN_DESTROYED error code                
 a 25/35  libxl: ao abort: Support aborting where we spot domain death
 a 26/35  libxl: Introduce FILLZERO                                   
 a 27/35  libxl: ao abort: Preparations for save/restore abort        
 a 28/35  libxl: ao abort: Handle SIGTERM in save/restore helper      
 a 29/35  libxl: ao abort: Abort libxc save/restore                   
 a 30/35  libxl: ao: datacopier callback gets an rc                   
 a 31/35  libxl: ao abort: Make datacopiers abortable                 
N  32/35  libxl: Fix libxl__get_domid error reporting                 
N  33/35  libxl: spawn: Always debug log middle child process death   
N  34/35  libxl: libxl__ev_child pass actual pid to callback          
N  35/35  libxl: When save/restore helper dies, do not overwrite rc   

Keys:
  a  Acked by a maintainer
  N  New in v4
  C  Changed in v4 (other than to rename CANCELLED to ABORTED,
     `cancel' to `abrt', etc.)

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [PATCH 01/35] libxl: ao internal API docs: Mention synchronous ao completion
  2015-06-25 17:44 [PATCH v4 00/35] libxl ao abort request (cancellation) Ian Jackson
@ 2015-06-25 17:44 ` Ian Jackson
  2015-06-25 17:44 ` [PATCH 02/35] libxl: suspend: switch_logdirty_done takes rc Ian Jackson
                   ` (34 subsequent siblings)
  35 siblings, 0 replies; 59+ messages in thread
From: Ian Jackson @ 2015-06-25 17:44 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, Euan Harris, Ian Campbell

This doc comment about ao lifecycle failed to mention the option of
completing the ao during the initiator function.  (Indeed, the most
obvious reading would forbid it.)

Restructure the comment, describe this situation, and generally
improve the wording.

Also, fix a grammar problem (missing word `a').

Reported-by: Koushik Chakravarty <koushik.chakravarty@citrix.com>
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
---
 tools/libxl/libxl_internal.h |   33 +++++++++++++++++++++++----------
 1 file changed, 23 insertions(+), 10 deletions(-)

diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index e96d6b5..b0cc4cf 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -1954,28 +1954,41 @@ _hidden void libxl__egc_cleanup(libxl__egc *egc);
  *        must be copied into the per-operation structure using
  *        libxl__ao_progress_gethow.
  *
- * - If initiation is successful, the initiating function needs
- *   to run libxl__ao_inprogress right before unlocking and
- *   returning, and return whatever it returns (AO_INPROGRESS macro).
- *
  * - If the initiation is unsuccessful, the initiating function must
  *   call libxl__ao_abort before unlocking and returning whatever
  *   error code is appropriate (AO_ABORT macro).
  *
+ * If initiation is successful:
+ *
+ * - The initiating function must run libxl__ao_inprogress right
+ *   before unlocking and returning, and return whatever it returns.
+ *   This is best achieved with the AO_INPROGRESS macro.
+ *
  * - If the operation supports progress reports, it may generate
  *   suitable events with NEW_EVENT and report them with
  *   libxl__ao_progress_report (with the ctx locked).
  *
- * - Later, some callback function, whose callback has been requested
- *   directly or indirectly, should call libxl__ao_complete (with the
- *   ctx locked, as it will generally already be in any event callback
- *   function).  This must happen exactly once for each ao (and not if
- *   the ao has been destroyed, obviously).
+ * - Eventually, some callback function, whose callback has been
+ *   requested directly or indirectly, should call libxl__ao_complete
+ *   (with the ctx locked, as it will generally already be in any
+ *   event callback function).  This must happen exactly once for each
+ *   ao, as the last that happens with that ao.
+ *
+ * - However, it is permissible for the initiating function to call
+ *   libxl__ao_inprogress and/or libxl__ao_complete (directly or
+ *   indirectly), before it uses AO_INPROGRESS to return.  (The ao
+ *   infrastructure will arrange to defer destruction of the ao, etc.,
+ *   until the proper time.)  An initiating function should do this
+ *   if it takes a codepath which completes synchronously.
+ *
+ * - Conversely it is forbidden to call libxl__ao_complete in the
+ *   initiating function _after_ AO_INPROGRESS, because
+ *   libxl__ao_complete requires the ctx to be locked.
  *
  * - Note that during callback functions, two gcs are available:
  *    - The one in egc, whose lifetime is only this callback
  *    - The one in ao, whose lifetime is the asynchronous operation
- *   Usually callback function should use CONTAINER_OF to obtain its
+ *   Usually a callback function should use CONTAINER_OF to obtain its
  *   own state structure, containing a pointer to the ao.  It should
  *   then obtain the ao and use the ao's gc; this is most easily done
  *   using the convenience macro STATE_AO_GC.
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH 02/35] libxl: suspend: switch_logdirty_done takes rc
  2015-06-25 17:44 [PATCH v4 00/35] libxl ao abort request (cancellation) Ian Jackson
  2015-06-25 17:44 ` [PATCH 01/35] libxl: ao internal API docs: Mention synchronous ao completion Ian Jackson
@ 2015-06-25 17:44 ` Ian Jackson
  2015-06-25 17:44 ` [PATCH 03/35] libxl: suspend: common suspend callbacks take rc Ian Jackson
                   ` (33 subsequent siblings)
  35 siblings, 0 replies; 59+ messages in thread
From: Ian Jackson @ 2015-06-25 17:44 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, Euan Harris, Ian Campbell

switch_logdirty_done used to take the value to pass to
libxl__xc_domain_saverestore_async_callback_done (ie, the return value
from the callback).  (This was mistakenly described as "ok" in the
prototype, but in the definition it is "broke" and all the call sites
passed 0 for success or -1 for error.)

Instead, make it take a libxl error code (rc).  Convert this to the
suspend callback value at the end.

No functional change in this patch.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
 tools/libxl/libxl_dom.c |   23 ++++++++++++++---------
 1 file changed, 14 insertions(+), 9 deletions(-)

diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index 600393d..fcfc5d1 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -1166,7 +1166,7 @@ static void switch_logdirty_timeout(libxl__egc *egc, libxl__ev_time *ev,
 static void switch_logdirty_xswatch(libxl__egc *egc, libxl__ev_xswatch*,
                             const char *watch_path, const char *event_path);
 static void switch_logdirty_done(libxl__egc *egc,
-                                 libxl__domain_suspend_state *dss, int ok);
+                                 libxl__domain_suspend_state *dss, int rc);
 
 static void logdirty_init(libxl__logdirty_switch *lds)
 {
@@ -1244,7 +1244,7 @@ static void domain_suspend_switch_qemu_xen_traditional_logdirty
  out:
     LOG(ERROR,"logdirty switch failed (rc=%d), aborting suspend",rc);
     libxl__xs_transaction_abort(gc, &t);
-    switch_logdirty_done(egc,dss,-1);
+    switch_logdirty_done(egc,dss,rc);
 }
 
 static void domain_suspend_switch_qemu_xen_logdirty
@@ -1292,7 +1292,7 @@ static void switch_logdirty_timeout(libxl__egc *egc, libxl__ev_time *ev,
     libxl__domain_suspend_state *dss = CONTAINER_OF(ev, *dss, logdirty.timeout);
     STATE_AO_GC(dss->ao);
     LOG(ERROR,"logdirty switch: wait for device model timed out");
-    switch_logdirty_done(egc,dss,-1);
+    switch_logdirty_done(egc,dss,ERROR_FAIL);
 }
 
 static void switch_logdirty_xswatch(libxl__egc *egc, libxl__ev_xswatch *watch,
@@ -1344,17 +1344,16 @@ static void switch_logdirty_xswatch(libxl__egc *egc, libxl__ev_xswatch *watch,
      */
     libxl__xs_transaction_abort(gc, &t);
 
-    if (!rc) {
-        switch_logdirty_done(egc,dss,0);
-    } else if (rc < 0) {
-        LOG(ERROR,"logdirty switch: failed (rc=%d)",rc);
-        switch_logdirty_done(egc,dss,-1);
+    if (rc <= 0) {
+        if (rc < 0)
+            LOG(ERROR,"logdirty switch: failed (rc=%d)",rc);
+        switch_logdirty_done(egc,dss,rc);
     }
 }
 
 static void switch_logdirty_done(libxl__egc *egc,
                                  libxl__domain_suspend_state *dss,
-                                 int broke)
+                                 int rc)
 {
     STATE_AO_GC(dss->ao);
     libxl__logdirty_switch *lds = &dss->logdirty;
@@ -1362,6 +1361,12 @@ static void switch_logdirty_done(libxl__egc *egc,
     libxl__ev_xswatch_deregister(gc, &lds->watch);
     libxl__ev_time_deregister(gc, &lds->timeout);
 
+    int broke;
+    if (rc) {
+        broke = -1;
+    } else {
+        broke = 0;
+    }
     libxl__xc_domain_saverestore_async_callback_done(egc, &dss->shs, broke);
 }
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH 03/35] libxl: suspend: common suspend callbacks take rc
  2015-06-25 17:44 [PATCH v4 00/35] libxl ao abort request (cancellation) Ian Jackson
  2015-06-25 17:44 ` [PATCH 01/35] libxl: ao internal API docs: Mention synchronous ao completion Ian Jackson
  2015-06-25 17:44 ` [PATCH 02/35] libxl: suspend: switch_logdirty_done takes rc Ian Jackson
@ 2015-06-25 17:44 ` Ian Jackson
  2015-06-25 17:44 ` [PATCH 04/35] libxl: suspend: Return correct error from callbacks Ian Jackson
                   ` (32 subsequent siblings)
  35 siblings, 0 replies; 59+ messages in thread
From: Ian Jackson @ 2015-06-25 17:44 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, Euan Harris, Ian Campbell

Change the following functions to take a libxl error code rather than
a boolean "ok" value, and translate that value to the boolean expected
by libxc at the last moment:
  domain_suspend_callback_common_done        } dss->callback_common_done
  remus_domain_suspend_callback_common_done  }
  domain_suspend_common_done

Also, abolish domain_suspend_common_failed as
domain_suspend_common_done can easily do its job and the call sites
now have to supply the right rc value anyway.

In domain_suspend_common_guest_suspended, change "ret" to "rc"
as it contains a libxl error code.

There is no functional change in this patch: the proper rc value now
propagates further, but is still eventually smashed to a boolean.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v2: Fix a leftover comment referring to domain_suspend_common_failed
---
 tools/libxl/libxl_dom.c |   54 +++++++++++++++++++++--------------------------
 1 file changed, 24 insertions(+), 30 deletions(-)

diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index fcfc5d1..72cee27 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -1149,7 +1149,7 @@ out:
 static void domain_suspend_done(libxl__egc *egc,
                         libxl__domain_suspend_state *dss, int rc);
 static void domain_suspend_callback_common_done(libxl__egc *egc,
-                                libxl__domain_suspend_state *dss, int ok);
+                                libxl__domain_suspend_state *dss, int rc);
 
 /*----- complicated callback, called by xc_domain_save -----*/
 
@@ -1437,11 +1437,9 @@ static void suspend_common_wait_guest_check(libxl__egc *egc,
 static void suspend_common_wait_guest_timeout(libxl__egc *egc,
       libxl__ev_time *ev, const struct timeval *requested_abs);
 
-static void domain_suspend_common_failed(libxl__egc *egc,
-                                         libxl__domain_suspend_state *dss);
 static void domain_suspend_common_done(libxl__egc *egc,
                                        libxl__domain_suspend_state *dss,
-                                       bool ok);
+                                       int rc);
 
 static bool domain_suspend_pvcontrol_acked(const char *state) {
     /* any value other than "suspend", including ENOENT (i.e. !state), is OK */
@@ -1471,6 +1469,7 @@ static void domain_suspend_callback_common(libxl__egc *egc,
         ret = xc_evtchn_notify(CTX->xce, dss->guest_evtchn.port);
         if (ret < 0) {
             LOG(ERROR, "xc_evtchn_notify failed ret=%d", ret);
+            rc = ERROR_FAIL;
             goto err;
         }
 
@@ -1491,6 +1490,7 @@ static void domain_suspend_callback_common(libxl__egc *egc,
         ret = xc_domain_shutdown(CTX->xch, domid, SHUTDOWN_suspend);
         if (ret < 0) {
             LOGE(ERROR, "xc_domain_shutdown failed");
+            rc = ERROR_FAIL;
             goto err;
         }
         /* The guest does not (need to) respond to this sort of request. */
@@ -1505,7 +1505,7 @@ static void domain_suspend_callback_common(libxl__egc *egc,
     libxl__domain_pvcontrol_write(gc, XBT_NULL, domid, "suspend");
 
     dss->pvcontrol.path = libxl__domain_pvcontrol_xspath(gc, domid);
-    if (!dss->pvcontrol.path) goto err;
+    if (!dss->pvcontrol.path) { rc = ERROR_FAIL; goto err; }
 
     dss->pvcontrol.ao = ao;
     dss->pvcontrol.what = "guest acknowledgement of suspend request";
@@ -1515,7 +1515,7 @@ static void domain_suspend_callback_common(libxl__egc *egc,
     return;
 
  err:
-    domain_suspend_common_failed(egc, dss);
+    domain_suspend_common_done(egc, dss, rc);
 }
 
 static void domain_suspend_common_wait_guest_evtchn(libxl__egc *egc,
@@ -1525,8 +1525,8 @@ static void domain_suspend_common_wait_guest_evtchn(libxl__egc *egc,
     STATE_AO_GC(dss->ao);
     /* If we should be done waiting, suspend_common_wait_guest_check
      * will end up calling domain_suspend_common_guest_suspended or
-     * domain_suspend_common_failed, both of which cancel the evtchn
-     * wait.  So re-enable it now. */
+     * domain_suspend_common_done, both of which cancel the evtchn
+     * wait as needed.  So re-enable it now. */
     libxl__ev_evtchn_wait(gc, &dss->guest_evtchn);
     suspend_common_wait_guest_check(egc, dss);
 }
@@ -1591,7 +1591,7 @@ static void domain_suspend_common_pvcontrol_suspending(libxl__egc *egc,
 
  err:
     libxl__xs_transaction_abort(gc, &t);
-    domain_suspend_common_failed(egc, dss);
+    domain_suspend_common_done(egc, dss, rc);
     return;
 }
 
@@ -1615,7 +1615,7 @@ static void domain_suspend_common_wait_guest(libxl__egc *egc,
     return;
 
  err:
-    domain_suspend_common_failed(egc, dss);
+    domain_suspend_common_done(egc, dss, rc);
 }
 
 static void suspend_common_wait_guest_watch(libxl__egc *egc,
@@ -1665,7 +1665,7 @@ static void suspend_common_wait_guest_check(libxl__egc *egc,
     return;
 
  err:
-    domain_suspend_common_failed(egc, dss);
+    domain_suspend_common_done(egc, dss, ERROR_FAIL);
 }
 
 static void suspend_common_wait_guest_timeout(libxl__egc *egc,
@@ -1674,46 +1674,40 @@ static void suspend_common_wait_guest_timeout(libxl__egc *egc,
     libxl__domain_suspend_state *dss = CONTAINER_OF(ev, *dss, guest_timeout);
     STATE_AO_GC(dss->ao);
     LOG(ERROR, "guest did not suspend, timed out");
-    domain_suspend_common_failed(egc, dss);
+    domain_suspend_common_done(egc, dss, ERROR_GUEST_TIMEDOUT);
 }
 
 static void domain_suspend_common_guest_suspended(libxl__egc *egc,
                                          libxl__domain_suspend_state *dss)
 {
     STATE_AO_GC(dss->ao);
-    int ret;
+    int rc;
 
     libxl__ev_evtchn_cancel(gc, &dss->guest_evtchn);
     libxl__ev_xswatch_deregister(gc, &dss->guest_watch);
     libxl__ev_time_deregister(gc, &dss->guest_timeout);
 
     if (dss->hvm) {
-        ret = libxl__domain_suspend_device_model(gc, dss);
-        if (ret) {
-            LOG(ERROR, "libxl__domain_suspend_device_model failed ret=%d", ret);
-            domain_suspend_common_failed(egc, dss);
+        rc = libxl__domain_suspend_device_model(gc, dss);
+        if (rc) {
+            LOG(ERROR, "libxl__domain_suspend_device_model failed ret=%d", rc);
+            domain_suspend_common_done(egc, dss, rc);
             return;
         }
     }
-    domain_suspend_common_done(egc, dss, 1);
-}
-
-static void domain_suspend_common_failed(libxl__egc *egc,
-                                         libxl__domain_suspend_state *dss)
-{
     domain_suspend_common_done(egc, dss, 0);
 }
 
 static void domain_suspend_common_done(libxl__egc *egc,
                                        libxl__domain_suspend_state *dss,
-                                       bool ok)
+                                       int rc)
 {
     EGC_GC;
     assert(!libxl__xswait_inuse(&dss->pvcontrol));
     libxl__ev_evtchn_cancel(gc, &dss->guest_evtchn);
     libxl__ev_xswatch_deregister(gc, &dss->guest_watch);
     libxl__ev_time_deregister(gc, &dss->guest_timeout);
-    dss->callback_common_done(egc, dss, ok);
+    dss->callback_common_done(egc, dss, rc);
 }
 
 static inline char *physmap_path(libxl__gc *gc, uint32_t dm_domid,
@@ -1830,9 +1824,9 @@ static void libxl__domain_suspend_callback(void *data)
 }
 
 static void domain_suspend_callback_common_done(libxl__egc *egc,
-                                libxl__domain_suspend_state *dss, int ok)
+                                libxl__domain_suspend_state *dss, int rc)
 {
-    libxl__xc_domain_saverestore_async_callback_done(egc, &dss->shs, ok);
+    libxl__xc_domain_saverestore_async_callback_done(egc, &dss->shs, !rc);
 }
 
 /*----- remus callbacks -----*/
@@ -1856,9 +1850,9 @@ static void libxl__remus_domain_suspend_callback(void *data)
 }
 
 static void remus_domain_suspend_callback_common_done(libxl__egc *egc,
-                                libxl__domain_suspend_state *dss, int ok)
+                                libxl__domain_suspend_state *dss, int rc)
 {
-    if (!ok)
+    if (rc)
         goto out;
 
     libxl__remus_devices_state *const rds = &dss->rds;
@@ -1867,7 +1861,7 @@ static void remus_domain_suspend_callback_common_done(libxl__egc *egc,
     return;
 
 out:
-    libxl__xc_domain_saverestore_async_callback_done(egc, &dss->shs, ok);
+    libxl__xc_domain_saverestore_async_callback_done(egc, &dss->shs, !rc);
 }
 
 static void remus_devices_postsuspend_cb(libxl__egc *egc,
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH 04/35] libxl: suspend: Return correct error from callbacks
  2015-06-25 17:44 [PATCH v4 00/35] libxl ao abort request (cancellation) Ian Jackson
                   ` (2 preceding siblings ...)
  2015-06-25 17:44 ` [PATCH 03/35] libxl: suspend: common suspend callbacks take rc Ian Jackson
@ 2015-06-25 17:44 ` Ian Jackson
  2015-06-25 17:44 ` [PATCH 05/35] libxl: Use libxl__xswait* in libxl__ao_device Ian Jackson
                   ` (31 subsequent siblings)
  35 siblings, 0 replies; 59+ messages in thread
From: Ian Jackson @ 2015-06-25 17:44 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, Euan Harris, Ian Campbell

If a suspend callback fails, it has a libxl error code in its hand.
However we must return to libxc the values that libxc expects.  So we
stash the libxl error code in dss->rc and fish it out again after
libxc returns from the suspend call.

While we're here, abolish the now-redundant `ok' variable in
remus_devices_postsuspend_cb.

The overall functional change is that libxl_domain_save now completes
with the correct error code as determined when the underlying failure
happened.  (Usually this is, still, ERROR_FAIL.)

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v2: Add cleanup in remus_devices_postsuspend_cb.
---
 tools/libxl/libxl_dom.c      |   22 ++++++++++++++++------
 tools/libxl/libxl_internal.h |    1 +
 2 files changed, 17 insertions(+), 6 deletions(-)

diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index 72cee27..54908fb 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -1261,6 +1261,7 @@ static void domain_suspend_switch_qemu_xen_logdirty
         libxl__xc_domain_saverestore_async_callback_done(egc, shs, 0);
     } else {
         LOG(ERROR,"logdirty switch failed (rc=%d), aborting suspend",rc);
+        dss->rc = rc;
         libxl__xc_domain_saverestore_async_callback_done(egc, shs, -1);
     }
 }
@@ -1283,6 +1284,7 @@ void libxl__domain_suspend_common_switch_qemu_logdirty
     default:
         LOG(ERROR,"logdirty switch failed"
             ", no valid device model version found, aborting suspend");
+        dss->rc = ERROR_FAIL;
         libxl__xc_domain_saverestore_async_callback_done(egc, shs, -1);
     }
 }
@@ -1364,6 +1366,7 @@ static void switch_logdirty_done(libxl__egc *egc,
     int broke;
     if (rc) {
         broke = -1;
+        dss->rc = rc;
     } else {
         broke = 0;
     }
@@ -1826,6 +1829,7 @@ static void libxl__domain_suspend_callback(void *data)
 static void domain_suspend_callback_common_done(libxl__egc *egc,
                                 libxl__domain_suspend_state *dss, int rc)
 {
+    dss->rc = rc;
     libxl__xc_domain_saverestore_async_callback_done(egc, &dss->shs, !rc);
 }
 
@@ -1861,6 +1865,7 @@ static void remus_domain_suspend_callback_common_done(libxl__egc *egc,
     return;
 
 out:
+    dss->rc = rc;
     libxl__xc_domain_saverestore_async_callback_done(egc, &dss->shs, !rc);
 }
 
@@ -1868,16 +1873,17 @@ static void remus_devices_postsuspend_cb(libxl__egc *egc,
                                          libxl__remus_devices_state *rds,
                                          int rc)
 {
-    int ok = 0;
     libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds);
 
     if (rc)
         goto out;
 
-    ok = 1;
+    rc = 0;
 
 out:
-    libxl__xc_domain_saverestore_async_callback_done(egc, &dss->shs, ok);
+    if (rc)
+        dss->rc = rc;
+    libxl__xc_domain_saverestore_async_callback_done(egc, &dss->shs, !rc);
 }
 
 static void libxl__remus_domain_resume_callback(void *data)
@@ -1896,7 +1902,6 @@ static void remus_devices_preresume_cb(libxl__egc *egc,
                                        libxl__remus_devices_state *rds,
                                        int rc)
 {
-    int ok = 0;
     libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds);
     STATE_AO_GC(dss->ao);
 
@@ -1908,10 +1913,12 @@ static void remus_devices_preresume_cb(libxl__egc *egc,
     if (rc)
         goto out;
 
-    ok = 1;
+    rc = 0;
 
 out:
-    libxl__xc_domain_saverestore_async_callback_done(egc, &dss->shs, ok);
+    if (rc)
+        dss->rc = rc;
+    libxl__xc_domain_saverestore_async_callback_done(egc, &dss->shs, !rc);
 }
 
 /*----- remus asynchronous checkpoint callback -----*/
@@ -2029,6 +2036,7 @@ void libxl__domain_suspend(libxl__egc *egc, libxl__domain_suspend_state *dss)
     libxl__srm_save_autogen_callbacks *const callbacks =
         &dss->shs.callbacks.save.a;
 
+    dss->rc = 0;
     logdirty_init(&dss->logdirty);
     libxl__xswait_init(&dss->pvcontrol);
     libxl__ev_evtchn_init(&dss->guest_evtchn);
@@ -2117,6 +2125,8 @@ void libxl__xc_domain_save_done(libxl__egc *egc, void *dss_void,
                          "domain did not respond to suspend request");
         if ( !dss->guest_responded )
             rc = ERROR_GUEST_TIMEDOUT;
+        else if (dss->rc)
+            rc = dss->rc;
         else
             rc = ERROR_FAIL;
         goto out;
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index b0cc4cf..ad779cd 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -2863,6 +2863,7 @@ struct libxl__domain_suspend_state {
     int debug;
     const libxl_domain_remus_info *remus;
     /* private */
+    int rc;
     libxl__ev_evtchn guest_evtchn;
     int guest_evtchn_lockfd;
     int hvm;
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH 05/35] libxl: Use libxl__xswait* in libxl__ao_device
  2015-06-25 17:44 [PATCH v4 00/35] libxl ao abort request (cancellation) Ian Jackson
                   ` (3 preceding siblings ...)
  2015-06-25 17:44 ` [PATCH 04/35] libxl: suspend: Return correct error from callbacks Ian Jackson
@ 2015-06-25 17:44 ` Ian Jackson
  2015-06-25 17:44 ` [PATCH 06/35] libxl: xswait/devstate: Move xswait to before devstate Ian Jackson
                   ` (30 subsequent siblings)
  35 siblings, 0 replies; 59+ messages in thread
From: Ian Jackson @ 2015-06-25 17:44 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, Euan Harris, Ian Campbell

Replace the separate timeout and xenstore watch with use of
libxl__xswait*.

Different control flow, but no ultimate functional change apart from
slight changes to the text of error messages.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
 tools/libxl/libxl_device.c   |   64 ++++++++++++------------------------------
 tools/libxl/libxl_internal.h |    2 +-
 2 files changed, 19 insertions(+), 47 deletions(-)

diff --git a/tools/libxl/libxl_device.c b/tools/libxl/libxl_device.c
index 93bb41e..8352cfb 100644
--- a/tools/libxl/libxl_device.c
+++ b/tools/libxl/libxl_device.c
@@ -450,7 +450,7 @@ void libxl__prepare_ao_device(libxl__ao *ao, libxl__ao_device *aodev)
      * Initialize xs_watch, because it's not used on all possible
      * execution paths, but it's unconditionally destroyed when finished.
      */
-    libxl__ev_xswatch_init(&aodev->xs_watch);
+    libxl__xswait_init(&aodev->xswait);
     aodev->active = 1;
     /* We init this here because we might call device_hotplug_done
      * without actually calling any hotplug script */
@@ -731,13 +731,9 @@ static void device_hotplug_child_death_cb(libxl__egc *egc,
                                           libxl__async_exec_state *aes,
                                           int status);
 
-static void device_destroy_be_timeout_cb(libxl__egc *egc, libxl__ev_time *ev,
-                                         const struct timeval *requested_abs);
-
 static void device_destroy_be_watch_cb(libxl__egc *egc,
-                                       libxl__ev_xswatch *watch,
-                                       const char *watch_path,
-                                       const char *event_path);
+                                       libxl__xswait_state *xswait,
+                                       int rc, const char *data);
 
 static void device_hotplug_done(libxl__egc *egc, libxl__ao_device *aodev);
 
@@ -988,22 +984,14 @@ static void device_hotplug(libxl__egc *egc, libxl__ao_device *aodev)
         if (aodev->action != LIBXL__DEVICE_ACTION_REMOVE)
             goto out;
 
-        rc = libxl__ev_time_register_rel(gc, &aodev->timeout,
-                                         device_destroy_be_timeout_cb,
-                                         LIBXL_DESTROY_TIMEOUT * 1000);
-        if (rc) {
-            LOG(ERROR, "setup of xs watch timeout failed");
-            goto out;
-        }
-
-        rc = libxl__ev_xswatch_register(gc, &aodev->xs_watch,
-                                        device_destroy_be_watch_cb,
-                                        be_path);
-        if (rc) {
-            LOG(ERROR, "setup of xs watch for %s failed", be_path);
-            libxl__ev_time_deregister(gc, &aodev->timeout);
+        aodev->xswait.ao = ao;
+        aodev->xswait.what = "removal of backend path";
+        aodev->xswait.path = be_path;
+        aodev->xswait.timeout_ms = LIBXL_DESTROY_TIMEOUT * 1000;
+        aodev->xswait.callback = device_destroy_be_watch_cb;
+        rc = libxl__xswait_start(gc, &aodev->xswait);
+        if (rc)
             goto out;
-        }
         return;
     }
 
@@ -1101,37 +1089,21 @@ error:
     device_hotplug_done(egc, aodev);
 }
 
-static void device_destroy_be_timeout_cb(libxl__egc *egc, libxl__ev_time *ev,
-                                         const struct timeval *requested_abs)
-{
-    libxl__ao_device *aodev = CONTAINER_OF(ev, *aodev, timeout);
-    STATE_AO_GC(aodev->ao);
-
-    LOG(ERROR, "timed out while waiting for %s to be removed",
-               libxl__device_backend_path(gc, aodev->dev));
-
-    aodev->rc = ERROR_TIMEDOUT;
-
-    device_hotplug_done(egc, aodev);
-    return;
-}
-
 static void device_destroy_be_watch_cb(libxl__egc *egc,
-                                       libxl__ev_xswatch *watch,
-                                       const char *watch_path,
-                                       const char *event_path)
+                                       libxl__xswait_state *xswait,
+                                       int rc, const char *dir)
 {
-    libxl__ao_device *aodev = CONTAINER_OF(watch, *aodev, xs_watch);
+    libxl__ao_device *aodev = CONTAINER_OF(xswait, *aodev, xswait);
     STATE_AO_GC(aodev->ao);
-    const char *dir;
-    int rc;
 
-    rc = libxl__xs_read_checked(gc, XBT_NULL, watch_path, &dir);
     if (rc) {
-        LOG(ERROR, "unable to read backend path: %s", watch_path);
+        if (rc == ERROR_TIMEDOUT)
+            LOG(ERROR, "timed out while waiting for %s to be removed",
+                xswait->path);
         aodev->rc = rc;
         goto out;
     }
+
     if (dir) {
         /* backend path still exists, wait a little longer... */
         return;
@@ -1164,7 +1136,7 @@ static void device_hotplug_clean(libxl__gc *gc, libxl__ao_device *aodev)
 {
     /* Clean events and check reentrancy */
     libxl__ev_time_deregister(gc, &aodev->timeout);
-    libxl__ev_xswatch_deregister(gc, &aodev->xs_watch);
+    libxl__xswait_stop(gc, &aodev->xswait);
     assert(!libxl__async_exec_inuse(&aodev->aes));
 }
 
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index ad779cd..09770b6 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -2210,7 +2210,7 @@ struct libxl__ao_device {
     /* Bodge for Qemu devices */
     libxl__ev_time timeout;
     /* xenstore watch for backend path of driver domains */
-    libxl__ev_xswatch xs_watch;
+    libxl__xswait_state xswait;
     int num_exec;
     /* for calling hotplug scripts */
     libxl__async_exec_state aes;
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH 06/35] libxl: xswait/devstate: Move xswait to before devstate
  2015-06-25 17:44 [PATCH v4 00/35] libxl ao abort request (cancellation) Ian Jackson
                   ` (4 preceding siblings ...)
  2015-06-25 17:44 ` [PATCH 05/35] libxl: Use libxl__xswait* in libxl__ao_device Ian Jackson
@ 2015-06-25 17:44 ` Ian Jackson
  2015-06-25 17:44 ` [PATCH 07/35] libxl: devstate: Use libxl__xswait* Ian Jackson
                   ` (29 subsequent siblings)
  35 siblings, 0 replies; 59+ messages in thread
From: Ian Jackson @ 2015-06-25 17:44 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, Euan Harris, Ian Campbell

Pure code motion.  We are going to make devstate use xswait.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
 tools/libxl/libxl_internal.h |  109 +++++++++++++++++++++---------------------
 1 file changed, 55 insertions(+), 54 deletions(-)

diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 09770b6..62c04c2 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -1116,6 +1116,61 @@ _hidden const char *libxl__device_nic_devname(libxl__gc *gc,
 
 _hidden int libxl__get_domid(libxl__gc *gc, uint32_t *domid);
 
+/*----- xswait: wait for a xenstore node to be suitable -----*/
+
+typedef struct libxl__xswait_state libxl__xswait_state;
+
+/*
+ * rc describes the circumstances of this callback:
+ *
+ * rc==0
+ *
+ *     The xenstore path (may have) changed.  It has been read for
+ *     you.  The result is in data (allocated from the ao gc).
+ *     data may be NULL, which means that the xenstore read gave
+ *     ENOENT.
+ *
+ *     If you are satisfied, you MUST call libxl__xswait_stop.
+ *     Otherwise, xswait will continue waiting and watching and
+ *     will call you back later.
+ *
+ * rc==ERROR_TIMEDOUT
+ *
+ *     The specified timeout was reached.
+ *     This has NOT been logged (except to the debug log).
+ *     xswait will not continue (but calling libxl__xswait_stop is OK).
+ *
+ * rc!=0, !=ERROR_TIMEDOUT
+ *
+ *     Some other error occurred.
+ *     This HAS been logged.
+ *     xswait will not continue (but calling libxl__xswait_stop is OK).
+ *
+ * xswait.path may start with with '@', in which case no read is done
+ * and the callback will always get data==0.
+ */
+typedef void libxl__xswait_callback(libxl__egc *egc,
+      libxl__xswait_state *xswa, int rc, const char *data);
+
+struct libxl__xswait_state {
+    /* caller must fill these in, and they must all remain valid */
+    libxl__ao *ao;
+    const char *what; /* for error msgs: noun phrase, what we're waiting for */
+    const char *path;
+    int timeout_ms; /* as for poll(2) */
+    libxl__xswait_callback *callback;
+    /* remaining fields are private to xswait */
+    libxl__ev_time time_ev;
+    libxl__ev_xswatch watch_ev;
+};
+
+void libxl__xswait_init(libxl__xswait_state*);
+void libxl__xswait_stop(libxl__gc*, libxl__xswait_state*); /*idempotent*/
+bool libxl__xswait_inuse(const libxl__xswait_state *ss);
+
+int libxl__xswait_start(libxl__gc*, libxl__xswait_state*);
+
+
 /*
  * libxl__ev_devstate - waits a given time for a device to
  * reach a given state.  Follows the libxl_ev_* conventions.
@@ -1213,60 +1268,6 @@ _hidden int libxl__device_pci_destroy_all(libxl__gc *gc, uint32_t domid);
 _hidden int libxl__device_dt_add(libxl__gc *gc, uint32_t domid,
                                  const libxl_device_dtdev *dtdev);
 
-/*----- xswait: wait for a xenstore node to be suitable -----*/
-
-typedef struct libxl__xswait_state libxl__xswait_state;
-
-/*
- * rc describes the circumstances of this callback:
- *
- * rc==0
- *
- *     The xenstore path (may have) changed.  It has been read for
- *     you.  The result is in data (allocated from the ao gc).
- *     data may be NULL, which means that the xenstore read gave
- *     ENOENT.
- *
- *     If you are satisfied, you MUST call libxl__xswait_stop.
- *     Otherwise, xswait will continue waiting and watching and
- *     will call you back later.
- *
- * rc==ERROR_TIMEDOUT
- *
- *     The specified timeout was reached.
- *     This has NOT been logged (except to the debug log).
- *     xswait will not continue (but calling libxl__xswait_stop is OK).
- *
- * rc!=0, !=ERROR_TIMEDOUT
- *
- *     Some other error occurred.
- *     This HAS been logged.
- *     xswait will not continue (but calling libxl__xswait_stop is OK).
- *
- * xswait.path may start with with '@', in which case no read is done
- * and the callback will always get data==0.
- */
-typedef void libxl__xswait_callback(libxl__egc *egc,
-      libxl__xswait_state *xswa, int rc, const char *data);
-
-struct libxl__xswait_state {
-    /* caller must fill these in, and they must all remain valid */
-    libxl__ao *ao;
-    const char *what; /* for error msgs: noun phrase, what we're waiting for */
-    const char *path;
-    int timeout_ms; /* as for poll(2) */
-    libxl__xswait_callback *callback;
-    /* remaining fields are private to xswait */
-    libxl__ev_time time_ev;
-    libxl__ev_xswatch watch_ev;
-};
-
-void libxl__xswait_init(libxl__xswait_state*);
-void libxl__xswait_stop(libxl__gc*, libxl__xswait_state*); /*idempotent*/
-bool libxl__xswait_inuse(const libxl__xswait_state *ss);
-
-int libxl__xswait_start(libxl__gc*, libxl__xswait_state*);
-
 /*
  *----- spawn -----
  *
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH 07/35] libxl: devstate: Use libxl__xswait*
  2015-06-25 17:44 [PATCH v4 00/35] libxl ao abort request (cancellation) Ian Jackson
                   ` (5 preceding siblings ...)
  2015-06-25 17:44 ` [PATCH 06/35] libxl: xswait/devstate: Move xswait to before devstate Ian Jackson
@ 2015-06-25 17:44 ` Ian Jackson
  2015-06-25 17:44 ` [PATCH 08/35] libxl: Rename AO_ABORT to AO_CREATE_FAIL Ian Jackson
                   ` (28 subsequent siblings)
  35 siblings, 0 replies; 59+ messages in thread
From: Ian Jackson @ 2015-06-25 17:44 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, Euan Harris, Ian Campbell

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v3: Initialise ds->w.ao
---
 tools/libxl/libxl_device.c   |    4 +--
 tools/libxl/libxl_event.c    |   79 +++++++++++++++++++-----------------------
 tools/libxl/libxl_internal.h |   11 +++---
 3 files changed, 41 insertions(+), 53 deletions(-)

diff --git a/tools/libxl/libxl_device.c b/tools/libxl/libxl_device.c
index 8352cfb..951d125 100644
--- a/tools/libxl/libxl_device.c
+++ b/tools/libxl/libxl_device.c
@@ -758,7 +758,7 @@ void libxl__wait_device_connection(libxl__egc *egc, libxl__ao_device *aodev)
         return;
     }
 
-    rc = libxl__ev_devstate_wait(gc, &aodev->backend_ds,
+    rc = libxl__ev_devstate_wait(ao, &aodev->backend_ds,
                                  device_backend_callback,
                                  state_path, XenbusStateInitWait,
                                  LIBXL_INIT_TIMEOUT * 1000);
@@ -859,7 +859,7 @@ void libxl__initiate_device_remove(libxl__egc *egc,
         if (rc < 0) goto out;
     }
 
-    rc = libxl__ev_devstate_wait(gc, &aodev->backend_ds,
+    rc = libxl__ev_devstate_wait(ao, &aodev->backend_ds,
                                  device_backend_callback,
                                  state_path, XenbusStateClosed,
                                  LIBXL_DESTROY_TIMEOUT * 1000);
diff --git a/tools/libxl/libxl_event.c b/tools/libxl/libxl_event.c
index 9ff4e78..3b45148 100644
--- a/tools/libxl/libxl_event.c
+++ b/tools/libxl/libxl_event.c
@@ -816,68 +816,59 @@ void libxl__ev_evtchn_cancel(libxl__gc *gc, libxl__ev_evtchn *evev)
  * waiting for device state
  */
 
-static void devstate_watch_callback(libxl__egc *egc, libxl__ev_xswatch *watch,
-                                const char *watch_path, const char *event_path)
+static void devstate_callback(libxl__egc *egc, libxl__xswait_state *xsw,
+                              int rc, const char *sstate)
 {
     EGC_GC;
-    libxl__ev_devstate *ds = CONTAINER_OF(watch, *ds, watch);
-    int rc;
+    libxl__ev_devstate *ds = CONTAINER_OF(xsw, *ds, w);
 
-    char *sstate = libxl__xs_read(gc, XBT_NULL, watch_path);
+    if (rc) {
+        if (rc == ERROR_TIMEDOUT)
+            LIBXL__LOG(CTX, LIBXL__LOG_DEBUG, "backend %s wanted state %d "
+                       " timed out", ds->w.path, ds->wanted);
+        goto out;
+    }
     if (!sstate) {
-        if (errno == ENOENT) {
-            LIBXL__LOG(CTX, LIBXL__LOG_DEBUG, "backend %s wanted state %d"
-                       " but it was removed", watch_path, ds->wanted);
-            rc = ERROR_INVAL;
-        } else {
-            LIBXL__LOG_ERRNO(CTX, LIBXL__LOG_ERROR, "backend %s wanted state"
-                             " %d but read failed", watch_path, ds->wanted);
-            rc = ERROR_FAIL;
-        }
+        LIBXL__LOG(CTX, LIBXL__LOG_DEBUG, "backend %s wanted state %d"
+                   " but it was removed", ds->w.path, ds->wanted);
+        rc = ERROR_INVAL;
+        goto out;
+    }
+
+    int got = atoi(sstate);
+    if (got == ds->wanted) {
+        LIBXL__LOG(CTX, LIBXL__LOG_DEBUG, "backend %s wanted state %d ok",
+                   ds->w.path, ds->wanted);
+        rc = 0;
     } else {
-        int got = atoi(sstate);
-        if (got == ds->wanted) {
-            LIBXL__LOG(CTX, LIBXL__LOG_DEBUG, "backend %s wanted state %d ok",
-                       watch_path, ds->wanted);
-            rc = 0;
-        } else {
-            LIBXL__LOG(CTX, LIBXL__LOG_DEBUG, "backend %s wanted state %d"
-                       " still waiting state %d", watch_path, ds->wanted, got);
-            return;
-        }
+        LIBXL__LOG(CTX, LIBXL__LOG_DEBUG, "backend %s wanted state %d"
+                   " still waiting state %d", ds->w.path, ds->wanted, got);
+        return;
     }
-    libxl__ev_devstate_cancel(gc, ds);
-    ds->callback(egc, ds, rc);
-}
 
-static void devstate_timeout(libxl__egc *egc, libxl__ev_time *ev,
-                             const struct timeval *requested_abs)
-{
-    EGC_GC;
-    libxl__ev_devstate *ds = CONTAINER_OF(ev, *ds, timeout);
-    LIBXL__LOG(CTX, LIBXL__LOG_DEBUG, "backend %s wanted state %d "
-               " timed out", ds->watch.path, ds->wanted);
+ out:
     libxl__ev_devstate_cancel(gc, ds);
-    ds->callback(egc, ds, ERROR_TIMEDOUT);
+    ds->callback(egc, ds, rc);
 }
 
-int libxl__ev_devstate_wait(libxl__gc *gc, libxl__ev_devstate *ds,
+int libxl__ev_devstate_wait(libxl__ao *ao, libxl__ev_devstate *ds,
                             libxl__ev_devstate_callback cb,
                             const char *state_path, int state, int milliseconds)
 {
+    AO_GC;
     int rc;
 
-    libxl__ev_time_init(&ds->timeout);
-    libxl__ev_xswatch_init(&ds->watch);
+    libxl__xswait_init(&ds->w);
     ds->wanted = state;
     ds->callback = cb;
 
-    rc = libxl__ev_time_register_rel(gc, &ds->timeout, devstate_timeout,
-                                     milliseconds);
-    if (rc) goto out;
-
-    rc = libxl__ev_xswatch_register(gc, &ds->watch, devstate_watch_callback,
-                                    state_path);
+    ds->w.ao = ao;
+    ds->w.what = GCSPRINTF("backend %s (hoping for state change to %d)",
+                           state_path, state);
+    ds->w.path = state_path;
+    ds->w.timeout_ms = milliseconds;
+    ds->w.callback = devstate_callback;
+    rc = libxl__xswait_start(gc, &ds->w);
     if (rc) goto out;
 
     return 0;
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 62c04c2..b49b10b 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -1189,24 +1189,21 @@ struct libxl__ev_devstate {
     libxl__ev_devstate_callback *callback;
     /* as for the remainder, read-only public parts may also be
      * read by the caller (notably, watch.path), but only when waiting: */
-    libxl__ev_xswatch watch;
-    libxl__ev_time timeout;
+    libxl__xswait_state w;
 };
 
 static inline void libxl__ev_devstate_init(libxl__ev_devstate *ds)
 {
-    libxl__ev_time_init(&ds->timeout);
-    libxl__ev_xswatch_init(&ds->watch);
+    libxl__xswait_init(&ds->w);
 }
 
 static inline void libxl__ev_devstate_cancel(libxl__gc *gc,
                                              libxl__ev_devstate *ds)
 {
-    libxl__ev_time_deregister(gc,&ds->timeout);
-    libxl__ev_xswatch_deregister(gc,&ds->watch);
+    libxl__xswait_stop(gc,&ds->w);
 }
 
-_hidden int libxl__ev_devstate_wait(libxl__gc *gc, libxl__ev_devstate *ds,
+_hidden int libxl__ev_devstate_wait(libxl__ao *ao, libxl__ev_devstate *ds,
                                     libxl__ev_devstate_callback cb,
                                     const char *state_path,
                                     int state, int milliseconds);
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH 08/35] libxl: Rename AO_ABORT to AO_CREATE_FAIL
  2015-06-25 17:44 [PATCH v4 00/35] libxl ao abort request (cancellation) Ian Jackson
                   ` (6 preceding siblings ...)
  2015-06-25 17:44 ` [PATCH 07/35] libxl: devstate: Use libxl__xswait* Ian Jackson
@ 2015-06-25 17:44 ` Ian Jackson
  2015-06-26 14:10   ` Wei Liu
  2015-06-25 17:44 ` [PATCH 09/35] libxl: Change some log messages to say `abandoning' rather than `aborting' Ian Jackson
                   ` (27 subsequent siblings)
  35 siblings, 1 reply; 59+ messages in thread
From: Ian Jackson @ 2015-06-25 17:44 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, Euan Harris, Ian Campbell

We are going to introduce a new meaning for aborting an ao, so rename
this to avoid confusion.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
---
 tools/libxl/libxl.c          |   10 +++++-----
 tools/libxl/libxl_event.c    |    4 ++--
 tools/libxl/libxl_internal.h |   12 ++++++------
 3 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index cfc2623..98b94ee 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -901,7 +901,7 @@ int libxl_domain_remus_start(libxl_ctx *ctx, libxl_domain_remus_info *info,
     return AO_INPROGRESS;
 
  out:
-    return AO_ABORT(rc);
+    return AO_CREATE_FAIL(rc);
 }
 
 static void libxl__remus_setup_done(libxl__egc *egc,
@@ -982,7 +982,7 @@ int libxl_domain_suspend(libxl_ctx *ctx, uint32_t domid, int fd, int flags,
     return AO_INPROGRESS;
 
  out_err:
-    return AO_ABORT(rc);
+    return AO_CREATE_FAIL(rc);
 }
 
 int libxl_domain_pause(libxl_ctx *ctx, uint32_t domid)
@@ -3008,7 +3008,7 @@ out:
 
     if (lock) libxl__unlock_domain_userdata(lock);
 
-    if (rc) return AO_ABORT(rc);
+    if (rc) return AO_CREATE_FAIL(rc);
     return AO_INPROGRESS;
 }
 
@@ -4198,7 +4198,7 @@ out:
         libxl__initiate_device_remove(egc, aodev);                      \
                                                                         \
     out:                                                                \
-        if (rc) return AO_ABORT(rc);                                    \
+        if (rc) return AO_CREATE_FAIL(rc);                                    \
         return AO_INPROGRESS;                                           \
     }
 
@@ -4609,7 +4609,7 @@ int libxl_device_events_handler(libxl_ctx *ctx,
     return AO_INPROGRESS;
 
 out:
-    return AO_ABORT(rc);
+    return AO_CREATE_FAIL(rc);
 }
 
 /******************************************************************************/
diff --git a/tools/libxl/libxl_event.c b/tools/libxl/libxl_event.c
index 3b45148..4b234a3 100644
--- a/tools/libxl/libxl_event.c
+++ b/tools/libxl/libxl_event.c
@@ -1735,10 +1735,10 @@ void libxl__ao__destroy(libxl_ctx *ctx, libxl__ao *ao)
     free(ao);
 }
 
-void libxl__ao_abort(libxl__ao *ao)
+void libxl__ao_create_fail(libxl__ao *ao)
 {
     AO_GC;
-    LOG(DEBUG,"ao %p: abort",ao);
+    LOG(DEBUG,"ao %p: create fail",ao);
     assert(ao->magic == LIBXL__AO_MAGIC);
     assert(ao->in_initiator);
     assert(!ao->complete);
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index b49b10b..cd1dfc3 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -1953,8 +1953,8 @@ _hidden void libxl__egc_cleanup(libxl__egc *egc);
  *        libxl__ao_progress_gethow.
  *
  * - If the initiation is unsuccessful, the initiating function must
- *   call libxl__ao_abort before unlocking and returning whatever
- *   error code is appropriate (AO_ABORT macro).
+ *   call libxl__ao_create_fail before unlocking and returning whatever
+ *   error code is appropriate (AO_CREATE_FAIL macro).
  *
  * If initiation is successful:
  *
@@ -2011,10 +2011,10 @@ _hidden void libxl__egc_cleanup(libxl__egc *egc);
         (ao__rc);                                               \
    })
 
-#define AO_ABORT(rc) ({                                         \
+#define AO_CREATE_FAIL(rc) ({                                   \
         libxl_ctx *ao__ctx = libxl__gc_owner(&ao->gc);          \
         assert(rc);                                             \
-        libxl__ao_abort(ao);                                    \
+        libxl__ao_create_fail(ao);                              \
         libxl__ctx_unlock(ao__ctx); /* gc is now invalid */     \
         EGC_FREE;                                               \
         (rc);                                                   \
@@ -2035,7 +2035,7 @@ _hidden libxl__ao *libxl__ao_create(libxl_ctx*, uint32_t domid,
        const char *file, int line, const char *func);
 _hidden int libxl__ao_inprogress(libxl__ao *ao,
        const char *file, int line, const char *func); /* temporarily unlocks */
-_hidden void libxl__ao_abort(libxl__ao *ao);
+_hidden void libxl__ao_create_fail(libxl__ao *ao);
 _hidden void libxl__ao_complete(libxl__egc *egc, libxl__ao *ao, int rc);
 _hidden libxl__gc *libxl__ao_inprogress_gc(libxl__ao *ao);
 
@@ -2064,7 +2064,7 @@ _hidden void libxl__ao_complete_check_progress_reports(libxl__egc*, libxl__ao*);
  * The returned sub-ao is suitable for passing to gc-related functions
  * and macros such as libxl__ao_inprogress_gc, AO_GC, and STATE_AO_GC.
  *
- * It MUST NOT be used with AO_INPROGRESS, AO_ABORT,
+ * It MUST NOT be used with AO_INPROGRESS, AO_CREATE_FAIL,
  * libxl__ao_complete, libxl__ao_progress_report, and so on.
  *
  * The caller must ensure that all of the sub-ao's are freed before
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH 09/35] libxl: Change some log messages to say `abandoning' rather than `aborting'
  2015-06-25 17:44 [PATCH v4 00/35] libxl ao abort request (cancellation) Ian Jackson
                   ` (7 preceding siblings ...)
  2015-06-25 17:44 ` [PATCH 08/35] libxl: Rename AO_ABORT to AO_CREATE_FAIL Ian Jackson
@ 2015-06-25 17:44 ` Ian Jackson
  2015-06-26 14:12   ` Wei Liu
  2015-06-25 17:44 ` [PATCH 10/35] libxl: Change an internal comment to say `bail' rather than `abort' Ian Jackson
                   ` (26 subsequent siblings)
  35 siblings, 1 reply; 59+ messages in thread
From: Ian Jackson @ 2015-06-25 17:44 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, Euan Harris, Ian Campbell

We are going to introduce application-requested aborts of (ao)
operations, but these suspend failures are something different.
Reword to avoid confusion.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
---
 tools/libxl/libxl_dom.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index 54908fb..ccbcb6e 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -1242,7 +1242,7 @@ static void domain_suspend_switch_qemu_xen_traditional_logdirty
     return;
 
  out:
-    LOG(ERROR,"logdirty switch failed (rc=%d), aborting suspend",rc);
+    LOG(ERROR,"logdirty switch failed (rc=%d), abandoning suspend",rc);
     libxl__xs_transaction_abort(gc, &t);
     switch_logdirty_done(egc,dss,rc);
 }
@@ -1260,7 +1260,7 @@ static void domain_suspend_switch_qemu_xen_logdirty
     if (!rc) {
         libxl__xc_domain_saverestore_async_callback_done(egc, shs, 0);
     } else {
-        LOG(ERROR,"logdirty switch failed (rc=%d), aborting suspend",rc);
+        LOG(ERROR,"logdirty switch failed (rc=%d), abandoning suspend",rc);
         dss->rc = rc;
         libxl__xc_domain_saverestore_async_callback_done(egc, shs, -1);
     }
@@ -1283,7 +1283,7 @@ void libxl__domain_suspend_common_switch_qemu_logdirty
         break;
     default:
         LOG(ERROR,"logdirty switch failed"
-            ", no valid device model version found, aborting suspend");
+            ", no valid device model version found, abandoning suspend");
         dss->rc = ERROR_FAIL;
         libxl__xc_domain_saverestore_async_callback_done(egc, shs, -1);
     }
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH 10/35] libxl: Change an internal comment to say `bail' rather than `abort'
  2015-06-25 17:44 [PATCH v4 00/35] libxl ao abort request (cancellation) Ian Jackson
                   ` (8 preceding siblings ...)
  2015-06-25 17:44 ` [PATCH 09/35] libxl: Change some log messages to say `abandoning' rather than `aborting' Ian Jackson
@ 2015-06-25 17:44 ` Ian Jackson
  2015-06-26 14:13   ` Wei Liu
  2015-06-25 17:44 ` [PATCH 11/35] libxl: New error codes ABORTED etc Ian Jackson
                   ` (25 subsequent siblings)
  35 siblings, 1 reply; 59+ messages in thread
From: Ian Jackson @ 2015-06-25 17:44 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, Euan Harris, Ian Campbell

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
---
 tools/libxl/libxl_internal.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index cd1dfc3..b445a51 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -2380,7 +2380,7 @@ struct libxl__multidev {
  *          any stale entry
  *       for loop -- xs transaction
  *           open xs transaction
- *           check device existence, abort if it exists
+ *           check device existence, bail if it exists
  *           write in-memory json config to disk
  *           commit xs transaction
  *       end for loop
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH 11/35] libxl: New error codes ABORTED etc.
  2015-06-25 17:44 [PATCH v4 00/35] libxl ao abort request (cancellation) Ian Jackson
                   ` (9 preceding siblings ...)
  2015-06-25 17:44 ` [PATCH 10/35] libxl: Change an internal comment to say `bail' rather than `abort' Ian Jackson
@ 2015-06-25 17:44 ` Ian Jackson
  2015-06-26 14:13   ` Wei Liu
  2015-06-25 17:44 ` [PATCH 12/35] libxl: events: Make timeout and async exec setup take an ao, not a gc Ian Jackson
                   ` (24 subsequent siblings)
  35 siblings, 1 reply; 59+ messages in thread
From: Ian Jackson @ 2015-06-25 17:44 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, Euan Harris, Ian Campbell

We introduce ERROR_ABORTED now, so that we can write code to handle
it, and decreee that functions might return it, even though currently
there is nowhere where this error is generated.

While we're here, provide ERROR_NOTFOUND which will also be used
later, but only as part of the public API.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
---
v4: CANCELLED renamed to ABORTED.
    No longer introduce ERROR_NOTIMPLEMENTED.
v2: Rebase means new errors have bigger (more negative) numbers.
---
 tools/libxl/libxl_types.idl |    2 ++
 1 file changed, 2 insertions(+)

diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 23f27d4..9558d52 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -65,6 +65,8 @@ libxl_error = Enumeration("error", [
     (-19, "REMUS_DEVICE_NOT_SUPPORTED"),
     (-20, "VNUMA_CONFIG_INVALID"),
     (-21, "DOMAIN_NOTFOUND"),
+    (-22, "ABORTED"),
+    (-23, "NOTFOUND"),
     ], value_namespace = "")
 
 libxl_domain_type = Enumeration("domain_type", [
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH 12/35] libxl: events: Make timeout and async exec setup take an ao, not a gc
  2015-06-25 17:44 [PATCH v4 00/35] libxl ao abort request (cancellation) Ian Jackson
                   ` (10 preceding siblings ...)
  2015-06-25 17:44 ` [PATCH 11/35] libxl: New error codes ABORTED etc Ian Jackson
@ 2015-06-25 17:44 ` Ian Jackson
  2015-06-30  6:02   ` Wen Congyang
  2015-06-25 17:44 ` [PATCH 13/35] libxl: events: Make libxl__async_exec_* pass caller an rc Ian Jackson
                   ` (23 subsequent siblings)
  35 siblings, 1 reply; 59+ messages in thread
From: Ian Jackson @ 2015-06-25 17:44 UTC (permalink / raw)
  To: xen-devel
  Cc: Wei Liu, Ian Campbell, Wen Congyang, Ian Jackson, Euan Harris,
	Yang Hongyang, Lai Jiangshan

Change the timeout setup functions to take a libxl__ao, not a
libxl__gc.  This is going to be needed for ao abort, because timeouts
are going to be a main hook for ao abort requests - so the timeouts
need to be associated with an ao.

This means that timeouts can only occur as part of a long-running
libxl function (but this is of course correct, as libxl shouldn't have
any global timeouts, and indeed all the call sites have an ao).

Also remove the gc parameter from libxl__async_exec_start.  It can
just use the gc from the ao supplied in the aes.

All the callers follow the obvious patterns and therefore supply the
ao's gc to libxl__async_exec_start and the timeout setup functions.
There is therefore no functional change in this patch.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
CC: Yang Hongyang <yanghy@cn.fujitsu.com>
CC: Wen Congyang <wency@cn.fujitsu.com>
CC: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: Wen Congyang <wency@cn.fujitsu.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>

---
v2: This patch split off from "Permit timeouts to signal cancellation".
    Rebased; consequently, deal with libxl__async_exec_start.
    CC'd authors of the libxl__async_exec_* functions.
---
 tools/libxl/libxl_aoutils.c         |    8 +++++---
 tools/libxl/libxl_device.c          |    4 ++--
 tools/libxl/libxl_dom.c             |    8 ++++----
 tools/libxl/libxl_event.c           |    6 ++++--
 tools/libxl/libxl_internal.h        |    6 +++---
 tools/libxl/libxl_remus_disk_drbd.c |    2 +-
 tools/libxl/libxl_test_timedereg.c  |    9 +++++----
 7 files changed, 24 insertions(+), 19 deletions(-)

diff --git a/tools/libxl/libxl_aoutils.c b/tools/libxl/libxl_aoutils.c
index ef679dd..1502ffe 100644
--- a/tools/libxl/libxl_aoutils.c
+++ b/tools/libxl/libxl_aoutils.c
@@ -46,7 +46,7 @@ int libxl__xswait_start(libxl__gc *gc, libxl__xswait_state *xswa)
 {
     int rc;
 
-    rc = libxl__ev_time_register_rel(gc, &xswa->time_ev,
+    rc = libxl__ev_time_register_rel(xswa->ao, &xswa->time_ev,
                                      xswait_timeout_callback, xswa->timeout_ms);
     if (rc) goto err;
 
@@ -547,16 +547,18 @@ void libxl__async_exec_init(libxl__async_exec_state *aes)
     libxl__ev_child_init(&aes->child);
 }
 
-int libxl__async_exec_start(libxl__gc *gc, libxl__async_exec_state *aes)
+int libxl__async_exec_start(libxl__async_exec_state *aes)
 {
     pid_t pid;
 
     /* Convenience aliases */
+    libxl__ao *ao = aes->ao;
+    AO_GC;
     libxl__ev_child *const child = &aes->child;
     char ** const args = aes->args;
 
     /* Set execution timeout */
-    if (libxl__ev_time_register_rel(gc, &aes->time,
+    if (libxl__ev_time_register_rel(ao, &aes->time,
                                     async_exec_timeout,
                                     aes->timeout_ms)) {
         LOG(ERROR, "unable to register timeout for executing: %s", aes->what);
diff --git a/tools/libxl/libxl_device.c b/tools/libxl/libxl_device.c
index 951d125..b6276f6 100644
--- a/tools/libxl/libxl_device.c
+++ b/tools/libxl/libxl_device.c
@@ -808,7 +808,7 @@ void libxl__initiate_device_remove(libxl__egc *egc,
              * TODO: 4.2 Bodge due to QEMU, see comment on top of
              * libxl__initiate_device_remove in libxl_internal.h
              */
-            rc = libxl__ev_time_register_rel(gc, &aodev->timeout,
+            rc = libxl__ev_time_register_rel(ao, &aodev->timeout,
                                              device_qemu_timeout,
                                              LIBXL_QEMU_BODGE_TIMEOUT * 1000);
             if (rc) {
@@ -1034,7 +1034,7 @@ static void device_hotplug(libxl__egc *egc, libxl__ao_device *aodev)
     aes->stdfds[1] = 2;
     aes->stdfds[2] = -1;
 
-    rc = libxl__async_exec_start(gc, aes);
+    rc = libxl__async_exec_start(aes);
     if (rc)
         goto out;
 
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index ccbcb6e..40a2d79 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -1200,7 +1200,7 @@ static void domain_suspend_switch_qemu_xen_traditional_logdirty
                                 switch_logdirty_xswatch, lds->ret_path);
     if (rc) goto out;
 
-    rc = libxl__ev_time_register_rel(gc, &lds->timeout,
+    rc = libxl__ev_time_register_rel(ao, &lds->timeout,
                                 switch_logdirty_timeout, 10*1000);
     if (rc) goto out;
 
@@ -1480,7 +1480,7 @@ static void domain_suspend_callback_common(libxl__egc *egc,
         rc = libxl__ev_evtchn_wait(gc, &dss->guest_evtchn);
         if (rc) goto err;
 
-        rc = libxl__ev_time_register_rel(gc, &dss->guest_timeout,
+        rc = libxl__ev_time_register_rel(ao, &dss->guest_timeout,
                                          suspend_common_wait_guest_timeout,
                                          60*1000);
         if (rc) goto err;
@@ -1611,7 +1611,7 @@ static void domain_suspend_common_wait_guest(libxl__egc *egc,
                                     "@releaseDomain");
     if (rc) goto err;
 
-    rc = libxl__ev_time_register_rel(gc, &dss->guest_timeout,
+    rc = libxl__ev_time_register_rel(ao, &dss->guest_timeout,
                                      suspend_common_wait_guest_timeout,
                                      60*1000);
     if (rc) goto err;
@@ -1990,7 +1990,7 @@ static void remus_devices_commit_cb(libxl__egc *egc,
      */
 
     /* Set checkpoint interval timeout */
-    rc = libxl__ev_time_register_rel(gc, &dss->checkpoint_timeout,
+    rc = libxl__ev_time_register_rel(ao, &dss->checkpoint_timeout,
                                      remus_next_checkpoint,
                                      dss->interval);
 
diff --git a/tools/libxl/libxl_event.c b/tools/libxl/libxl_event.c
index 4b234a3..217fe97 100644
--- a/tools/libxl/libxl_event.c
+++ b/tools/libxl/libxl_event.c
@@ -332,10 +332,11 @@ static void time_done_debug(libxl__gc *gc, const char *func,
 #endif
 }
 
-int libxl__ev_time_register_abs(libxl__gc *gc, libxl__ev_time *ev,
+int libxl__ev_time_register_abs(libxl__ao *ao, libxl__ev_time *ev,
                                 libxl__ev_time_callback *func,
                                 struct timeval absolute)
 {
+    AO_GC;
     int rc;
 
     CTX_LOCK;
@@ -356,10 +357,11 @@ int libxl__ev_time_register_abs(libxl__gc *gc, libxl__ev_time *ev,
 }
 
 
-int libxl__ev_time_register_rel(libxl__gc *gc, libxl__ev_time *ev,
+int libxl__ev_time_register_rel(libxl__ao *ao, libxl__ev_time *ev,
                                 libxl__ev_time_callback *func,
                                 int milliseconds /* as for poll(2) */)
 {
+    AO_GC;
     struct timeval absolute;
     int rc;
 
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index b445a51..8c38eab 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -791,10 +791,10 @@ static inline void libxl__ev_fd_init(libxl__ev_fd *efd)
 static inline int libxl__ev_fd_isregistered(const libxl__ev_fd *efd)
                     { return efd->fd >= 0; }
 
-_hidden int libxl__ev_time_register_rel(libxl__gc*, libxl__ev_time *ev_out,
+_hidden int libxl__ev_time_register_rel(libxl__ao*, libxl__ev_time *ev_out,
                                         libxl__ev_time_callback*,
                                         int milliseconds /* as for poll(2) */);
-_hidden int libxl__ev_time_register_abs(libxl__gc*, libxl__ev_time *ev_out,
+_hidden int libxl__ev_time_register_abs(libxl__ao*, libxl__ev_time *ev_out,
                                         libxl__ev_time_callback*,
                                         struct timeval);
 _hidden int libxl__ev_time_modify_rel(libxl__gc*, libxl__ev_time *ev,
@@ -2166,7 +2166,7 @@ struct libxl__async_exec_state {
 };
 
 void libxl__async_exec_init(libxl__async_exec_state *aes);
-int libxl__async_exec_start(libxl__gc *gc, libxl__async_exec_state *aes);
+int libxl__async_exec_start(libxl__async_exec_state *aes);
 bool libxl__async_exec_inuse(const libxl__async_exec_state *aes);
 
 /*----- device addition/removal -----*/
diff --git a/tools/libxl/libxl_remus_disk_drbd.c b/tools/libxl/libxl_remus_disk_drbd.c
index afe9b61..5e0c9a6 100644
--- a/tools/libxl/libxl_remus_disk_drbd.c
+++ b/tools/libxl/libxl_remus_disk_drbd.c
@@ -120,7 +120,7 @@ static void match_async_exec(libxl__egc *egc, libxl__remus_device *dev)
     aes->stdfds[1] = -1;
     aes->stdfds[2] = -1;
 
-    rc = libxl__async_exec_start(gc, aes);
+    rc = libxl__async_exec_start(aes);
     if (rc)
         goto out;
 
diff --git a/tools/libxl/libxl_test_timedereg.c b/tools/libxl/libxl_test_timedereg.c
index a44639f..e2cc27d 100644
--- a/tools/libxl/libxl_test_timedereg.c
+++ b/tools/libxl/libxl_test_timedereg.c
@@ -30,12 +30,13 @@ static int seq;
 static void occurs(libxl__egc *egc, libxl__ev_time *ev,
                    const struct timeval *requested_abs);
 
-static void regs(libxl__gc *gc, int j)
+static void regs(libxl__ao *ao, int j)
 {
+    AO_GC;
     int rc, i;
     LOG(DEBUG,"regs(%d)", j);
     for (i=0; i<NTIMES; i++) {
-        rc = libxl__ev_time_register_rel(gc, &et[j][i], occurs, ms[j][i]);
+        rc = libxl__ev_time_register_rel(ao, &et[j][i], occurs, ms[j][i]);
         assert(!rc);
     }    
 }
@@ -52,7 +53,7 @@ int libxl_test_timedereg(libxl_ctx *ctx, libxl_asyncop_how *ao_how)
         libxl__ev_time_init(&et[1][i]);
     }
 
-    regs(gc, 0);
+    regs(ao, 0);
 
     return AO_INPROGRESS;
 }
@@ -71,7 +72,7 @@ static void occurs(libxl__egc *egc, libxl__ev_time *ev,
         assert(ev == &et[0][1]);
         libxl__ev_time_deregister(gc, &et[0][0]);
         libxl__ev_time_deregister(gc, &et[0][2]);
-        regs(gc, 1);
+        regs(tao, 1);
         libxl__ev_time_deregister(gc, &et[0][1]);
         break;
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH 13/35] libxl: events: Make libxl__async_exec_* pass caller an rc
  2015-06-25 17:44 [PATCH v4 00/35] libxl ao abort request (cancellation) Ian Jackson
                   ` (11 preceding siblings ...)
  2015-06-25 17:44 ` [PATCH 12/35] libxl: events: Make timeout and async exec setup take an ao, not a gc Ian Jackson
@ 2015-06-25 17:44 ` Ian Jackson
  2015-06-25 17:44 ` [PATCH 14/35] libxl: events: Permit timeouts to signal ao abort Ian Jackson
                   ` (22 subsequent siblings)
  35 siblings, 0 replies; 59+ messages in thread
From: Ian Jackson @ 2015-06-25 17:44 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, Euan Harris, Ian Campbell

The internal user of libxl__async_exec_start et al now gets an rc as
well as the process's exit status.

For now this is always either 0 or ERROR_FAIL, but with ao
abort requests this will possibly be ABORTED or TIMEDOUT too.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v4: Improve doc comment as suggested by Ian C.
v2: New patch due to rebause; v1 had changes to device_hotplug_*
     scripts instead.
    Callback now gets unambiguous information about error situation:
     previously, if only thing that went wrong was that child died
     badly, rc would be FAILED, which was unambigously; now rc=0.
    Add a comment document the meaning of the rc and status parameters
     to the callback.
---
 tools/libxl/libxl_aoutils.c         |    9 ++++++---
 tools/libxl/libxl_device.c          |   13 +++++++++----
 tools/libxl/libxl_internal.h        |   12 +++++++++++-
 tools/libxl/libxl_netbuffer.c       |   19 ++++++++++---------
 tools/libxl/libxl_remus_disk_drbd.c |    8 +++++---
 5 files changed, 41 insertions(+), 20 deletions(-)

diff --git a/tools/libxl/libxl_aoutils.c b/tools/libxl/libxl_aoutils.c
index 1502ffe..450caae 100644
--- a/tools/libxl/libxl_aoutils.c
+++ b/tools/libxl/libxl_aoutils.c
@@ -534,11 +534,12 @@ static void async_exec_done(libxl__egc *egc,
     libxl__ev_time_deregister(gc, &aes->time);
 
     if (status) {
-        libxl_report_child_exitstatus(CTX, LIBXL__LOG_ERROR,
-                                      aes->what, pid, status);
+        if (!aes->rc)
+            libxl_report_child_exitstatus(CTX, LIBXL__LOG_ERROR,
+                                          aes->what, pid, status);
     }
 
-    aes->callback(egc, aes, status);
+    aes->callback(egc, aes, aes->rc, status);
 }
 
 void libxl__async_exec_init(libxl__async_exec_state *aes)
@@ -557,6 +558,8 @@ int libxl__async_exec_start(libxl__async_exec_state *aes)
     libxl__ev_child *const child = &aes->child;
     char ** const args = aes->args;
 
+    aes->rc = 0;
+
     /* Set execution timeout */
     if (libxl__ev_time_register_rel(ao, &aes->time,
                                     async_exec_timeout,
diff --git a/tools/libxl/libxl_device.c b/tools/libxl/libxl_device.c
index b6276f6..012a9f8 100644
--- a/tools/libxl/libxl_device.c
+++ b/tools/libxl/libxl_device.c
@@ -729,7 +729,7 @@ static void device_hotplug(libxl__egc *egc, libxl__ao_device *aodev);
 
 static void device_hotplug_child_death_cb(libxl__egc *egc,
                                           libxl__async_exec_state *aes,
-                                          int status);
+                                          int rc, int status);
 
 static void device_destroy_be_watch_cb(libxl__egc *egc,
                                        libxl__xswait_state *xswait,
@@ -1052,7 +1052,7 @@ out:
 
 static void device_hotplug_child_death_cb(libxl__egc *egc,
                                           libxl__async_exec_state *aes,
-                                          int status)
+                                          int rc, int status)
 {
     libxl__ao_device *aodev = CONTAINER_OF(aes, *aodev, aes);
     STATE_AO_GC(aodev->ao);
@@ -1061,12 +1061,17 @@ static void device_hotplug_child_death_cb(libxl__egc *egc,
 
     device_hotplug_clean(gc, aodev);
 
-    if (status) {
+    if (status && !rc) {
         hotplug_error = libxl__xs_read(gc, XBT_NULL,
                                        GCSPRINTF("%s/hotplug-error", be_path));
         if (hotplug_error)
             LOG(ERROR, "script: %s", hotplug_error);
-        aodev->rc = ERROR_FAIL;
+        rc = ERROR_FAIL;
+    }
+
+    if (rc) {
+        if (!aodev->rc)
+            aodev->rc = rc;
         if (aodev->action == LIBXL__DEVICE_ACTION_ADD)
             /*
              * Only fail on device connection, on disconnection
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 8c38eab..69983b6 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -2147,7 +2147,16 @@ _hidden const char *libxl__run_dir_path(void);
 typedef struct libxl__async_exec_state libxl__async_exec_state;
 
 typedef void libxl__async_exec_callback(libxl__egc *egc,
-                        libxl__async_exec_state *aes, int status);
+                        libxl__async_exec_state *aes, int rc, int status);
+/*
+ * Meaning of status and rc:
+ *  rc==0, status==0    all went well
+ *  rc==0, status!=0    everything OK except child exited nonzero (logged)
+ *  rc!=0               something else went wrong (status is real
+ *                       exit status; maybe reflecting SIGKILL, and
+ *                       therefore not very interesting, if aes code
+ *                       killed the child).  Logged unless ABORTED.
+ */
 
 struct libxl__async_exec_state {
     /* caller must fill these in */
@@ -2163,6 +2172,7 @@ struct libxl__async_exec_state {
     /* private */
     libxl__ev_time time;
     libxl__ev_child child;
+    int rc;
 };
 
 void libxl__async_exec_init(libxl__async_exec_state *aes);
diff --git a/tools/libxl/libxl_netbuffer.c b/tools/libxl/libxl_netbuffer.c
index edc6843..ff2d6c7 100644
--- a/tools/libxl/libxl_netbuffer.c
+++ b/tools/libxl/libxl_netbuffer.c
@@ -219,10 +219,10 @@ out:
 
 static void netbuf_setup_script_cb(libxl__egc *egc,
                                    libxl__async_exec_state *aes,
-                                   int status);
+                                   int rc, int status);
 static void netbuf_teardown_script_cb(libxl__egc *egc,
                                       libxl__async_exec_state *aes,
-                                      int status);
+                                      int rc, int status);
 
 /*
  * the script needs the following env & args
@@ -327,14 +327,13 @@ out:
  */
 static void netbuf_setup_script_cb(libxl__egc *egc,
                                    libxl__async_exec_state *aes,
-                                   int status)
+                                   int rc, int status)
 {
     libxl__ao_device *aodev = CONTAINER_OF(aes, *aodev, aes);
     libxl__remus_device *dev = CONTAINER_OF(aodev, *dev, aodev);
     libxl__remus_device_nic *remus_nic = dev->concrete_data;
     libxl__remus_devices_state *rds = dev->rds;
     const char *out_path_base, *hotplug_error = NULL;
-    int rc;
 
     STATE_AO_GC(rds->ao);
 
@@ -344,6 +343,11 @@ static void netbuf_setup_script_cb(libxl__egc *egc,
     const char *const vif = remus_nic->vif;
     const char **const ifb = &remus_nic->ifb;
 
+    if (status && !rc)
+        rc = ERROR_FAIL;
+    if (rc)
+        goto out;
+
     /*
      * we need to get ifb first because it's needed for teardown
      */
@@ -411,17 +415,14 @@ out:
 
 static void netbuf_teardown_script_cb(libxl__egc *egc,
                                       libxl__async_exec_state *aes,
-                                      int status)
+                                      int rc, int status)
 {
-    int rc;
     libxl__ao_device *aodev = CONTAINER_OF(aes, *aodev, aes);
     libxl__remus_device *dev = CONTAINER_OF(aodev, *dev, aodev);
     libxl__remus_device_nic *remus_nic = dev->concrete_data;
 
-    if (status)
+    if (status && !rc)
         rc = ERROR_FAIL;
-    else
-        rc = 0;
 
     free_qdisc(remus_nic);
 
diff --git a/tools/libxl/libxl_remus_disk_drbd.c b/tools/libxl/libxl_remus_disk_drbd.c
index 5e0c9a6..fc76b89 100644
--- a/tools/libxl/libxl_remus_disk_drbd.c
+++ b/tools/libxl/libxl_remus_disk_drbd.c
@@ -78,7 +78,7 @@ out:
 /* callbacks */
 static void match_async_exec_cb(libxl__egc *egc,
                                 libxl__async_exec_state *aes,
-                                int status);
+                                int rc, int status);
 
 /* implementations */
 
@@ -133,9 +133,8 @@ out:
 
 static void match_async_exec_cb(libxl__egc *egc,
                                 libxl__async_exec_state *aes,
-                                int status)
+                                int rc, int status)
 {
-    int rc;
     libxl__ao_device *aodev = CONTAINER_OF(aes, *aodev, aes);
     libxl__remus_device *dev = CONTAINER_OF(aodev, *dev, aodev);
     libxl__remus_drbd_disk *drbd_disk;
@@ -143,6 +142,9 @@ static void match_async_exec_cb(libxl__egc *egc,
 
     STATE_AO_GC(aodev->ao);
 
+    if (rc)
+        goto out;
+
     if (status) {
         rc = ERROR_REMUS_DEVOPS_DOES_NOT_MATCH;
         /* BUG: seems to assume that any exit status means `no match' */
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH 14/35] libxl: events: Permit timeouts to signal ao abort
  2015-06-25 17:44 [PATCH v4 00/35] libxl ao abort request (cancellation) Ian Jackson
                   ` (12 preceding siblings ...)
  2015-06-25 17:44 ` [PATCH 13/35] libxl: events: Make libxl__async_exec_* pass caller an rc Ian Jackson
@ 2015-06-25 17:44 ` Ian Jackson
  2015-06-25 17:44 ` [PATCH 15/35] libxl: spawn: Preserve rc in error path Ian Jackson
                   ` (21 subsequent siblings)
  35 siblings, 0 replies; 59+ messages in thread
From: Ian Jackson @ 2015-06-25 17:44 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, Euan Harris, Ian Campbell

The callback functions provided by users must take an rc value.  This
rc value can be ERROR_TIMEDOUT or ERROR_ABORTED.

Users of xswait are now expected to deal correctly with
ERROR_ABORTED.  If they experience this, it hasn't been logged.
And the caller won't log it either since it's not TIMEDOUT.
Luckily this is correct, so we can just change the doc comment.

Currently nothing generates ERROR_ABORTED; in particular the timeouts
cannot in fact signal abort requests.

There should be no publicly visible change except that some error
returns from libxl will change from ERROR_FAIL to ERROR_TIMEDOUT, and
some changes to debugging messages.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v4: ABORTED not CANCELLED.
---
 tools/libxl/libxl_aoutils.c        |   11 ++++++++---
 tools/libxl/libxl_device.c         |    8 +++++---
 tools/libxl/libxl_dom.c            |   29 ++++++++++++++++++++---------
 tools/libxl/libxl_event.c          |    8 ++++----
 tools/libxl/libxl_internal.h       |   12 +++++++-----
 tools/libxl/libxl_test_timedereg.c |    8 +++++---
 6 files changed, 49 insertions(+), 27 deletions(-)

diff --git a/tools/libxl/libxl_aoutils.c b/tools/libxl/libxl_aoutils.c
index 450caae..593a575 100644
--- a/tools/libxl/libxl_aoutils.c
+++ b/tools/libxl/libxl_aoutils.c
@@ -80,12 +80,13 @@ void xswait_xswatch_callback(libxl__egc *egc, libxl__ev_xswatch *xsw,
 }
 
 void xswait_timeout_callback(libxl__egc *egc, libxl__ev_time *ev,
-                             const struct timeval *requested_abs)
+                             const struct timeval *requested_abs,
+                             int rc)
 {
     EGC_GC;
     libxl__xswait_state *xswa = CONTAINER_OF(ev, *xswa, time_ev);
     LOG(DEBUG, "%s: xswait timeout (path=%s)", xswa->what, xswa->path);
-    xswait_report_error(egc, xswa, ERROR_TIMEDOUT);
+    xswait_report_error(egc, xswa, rc);
 }
 
 static void xswait_report_error(libxl__egc *egc, libxl__xswait_state *xswa,
@@ -506,11 +507,15 @@ int libxl__openptys(libxl__openpty_state *op,
 
 static void async_exec_timeout(libxl__egc *egc,
                                libxl__ev_time *ev,
-                               const struct timeval *requested_abs)
+                               const struct timeval *requested_abs,
+                               int rc)
 {
     libxl__async_exec_state *aes = CONTAINER_OF(ev, *aes, time);
     STATE_AO_GC(aes->ao);
 
+    if (!aes->rc)
+        aes->rc = rc;
+
     libxl__ev_time_deregister(gc, &aes->time);
 
     assert(libxl__ev_child_inuse(&aes->child));
diff --git a/tools/libxl/libxl_device.c b/tools/libxl/libxl_device.c
index 012a9f8..8209629 100644
--- a/tools/libxl/libxl_device.c
+++ b/tools/libxl/libxl_device.c
@@ -717,7 +717,7 @@ out:
 
 /* This callback is part of the Qemu devices Badge */
 static void device_qemu_timeout(libxl__egc *egc, libxl__ev_time *ev,
-                                const struct timeval *requested_abs);
+                                const struct timeval *requested_abs, int rc);
 
 static void device_backend_callback(libxl__egc *egc, libxl__ev_devstate *ds,
                                    int rc);
@@ -880,7 +880,7 @@ out:
 }
 
 static void device_qemu_timeout(libxl__egc *egc, libxl__ev_time *ev,
-                                const struct timeval *requested_abs)
+                                const struct timeval *requested_abs, int rc)
 {
     libxl__ao_device *aodev = CONTAINER_OF(ev, *aodev, timeout);
     STATE_AO_GC(aodev->ao);
@@ -888,7 +888,9 @@ static void device_qemu_timeout(libxl__egc *egc, libxl__ev_time *ev,
     char *state_path = GCSPRINTF("%s/state", be_path);
     const char *xs_state;
     xs_transaction_t t = 0;
-    int rc = 0;
+
+    if (rc != ERROR_TIMEDOUT)
+        goto out;
 
     libxl__ev_time_deregister(gc, &aodev->timeout);
 
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index 40a2d79..b081ef7 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -1162,7 +1162,8 @@ static void domain_suspend_callback_common_done(libxl__egc *egc,
  */
 
 static void switch_logdirty_timeout(libxl__egc *egc, libxl__ev_time *ev,
-                                    const struct timeval *requested_abs);
+                                    const struct timeval *requested_abs,
+                                    int rc);
 static void switch_logdirty_xswatch(libxl__egc *egc, libxl__ev_xswatch*,
                             const char *watch_path, const char *event_path);
 static void switch_logdirty_done(libxl__egc *egc,
@@ -1289,7 +1290,8 @@ void libxl__domain_suspend_common_switch_qemu_logdirty
     }
 }
 static void switch_logdirty_timeout(libxl__egc *egc, libxl__ev_time *ev,
-                                    const struct timeval *requested_abs)
+                                    const struct timeval *requested_abs,
+                                    int rc)
 {
     libxl__domain_suspend_state *dss = CONTAINER_OF(ev, *dss, logdirty.timeout);
     STATE_AO_GC(dss->ao);
@@ -1438,7 +1440,7 @@ static void suspend_common_wait_guest_watch(libxl__egc *egc,
 static void suspend_common_wait_guest_check(libxl__egc *egc,
         libxl__domain_suspend_state *dss);
 static void suspend_common_wait_guest_timeout(libxl__egc *egc,
-      libxl__ev_time *ev, const struct timeval *requested_abs);
+      libxl__ev_time *ev, const struct timeval *requested_abs, int rc);
 
 static void domain_suspend_common_done(libxl__egc *egc,
                                        libxl__domain_suspend_state *dss,
@@ -1672,12 +1674,15 @@ static void suspend_common_wait_guest_check(libxl__egc *egc,
 }
 
 static void suspend_common_wait_guest_timeout(libxl__egc *egc,
-      libxl__ev_time *ev, const struct timeval *requested_abs)
+      libxl__ev_time *ev, const struct timeval *requested_abs, int rc)
 {
     libxl__domain_suspend_state *dss = CONTAINER_OF(ev, *dss, guest_timeout);
     STATE_AO_GC(dss->ao);
-    LOG(ERROR, "guest did not suspend, timed out");
-    domain_suspend_common_done(egc, dss, ERROR_GUEST_TIMEDOUT);
+    if (rc == ERROR_TIMEDOUT) {
+        LOG(ERROR, "guest did not suspend, timed out");
+        rc = ERROR_GUEST_TIMEDOUT;
+    }
+    domain_suspend_common_done(egc, dss, rc);
 }
 
 static void domain_suspend_common_guest_suspended(libxl__egc *egc,
@@ -1929,7 +1934,8 @@ static void remus_devices_commit_cb(libxl__egc *egc,
                                     libxl__remus_devices_state *rds,
                                     int rc);
 static void remus_next_checkpoint(libxl__egc *egc, libxl__ev_time *ev,
-                                  const struct timeval *requested_abs);
+                                  const struct timeval *requested_abs,
+                                  int rc);
 
 static void libxl__remus_domain_checkpoint_callback(void *data)
 {
@@ -2004,7 +2010,8 @@ out:
 }
 
 static void remus_next_checkpoint(libxl__egc *egc, libxl__ev_time *ev,
-                                  const struct timeval *requested_abs)
+                                  const struct timeval *requested_abs,
+                                  int rc)
 {
     libxl__domain_suspend_state *dss =
                             CONTAINER_OF(ev, *dss, checkpoint_timeout);
@@ -2016,7 +2023,11 @@ static void remus_next_checkpoint(libxl__egc *egc, libxl__ev_time *ev,
      * (xc_domain_save.c). in order to continue executing the infinite loop
      * (suspend, checkpoint, resume) in xc_domain_save().
      */
-    libxl__xc_domain_saverestore_async_callback_done(egc, &dss->shs, 1);
+
+    if (rc)
+        dss->rc = rc;
+
+    libxl__xc_domain_saverestore_async_callback_done(egc, &dss->shs, !rc);
 }
 
 /*----- main code for suspending, in order of execution -----*/
diff --git a/tools/libxl/libxl_event.c b/tools/libxl/libxl_event.c
index 217fe97..2809743 100644
--- a/tools/libxl/libxl_event.c
+++ b/tools/libxl/libxl_event.c
@@ -406,7 +406,7 @@ void libxl__ev_time_deregister(libxl__gc *gc, libxl__ev_time *ev)
     return;
 }
 
-static void time_occurs(libxl__egc *egc, libxl__ev_time *etime)
+static void time_occurs(libxl__egc *egc, libxl__ev_time *etime, int rc)
 {
     DBG("ev_time=%p occurs abs=%lu.%06lu",
         etime, (unsigned long)etime->abs.tv_sec,
@@ -414,7 +414,7 @@ static void time_occurs(libxl__egc *egc, libxl__ev_time *etime)
 
     libxl__ev_time_callback *func = etime->func;
     etime->func = 0;
-    func(egc, etime, &etime->abs);
+    func(egc, etime, &etime->abs, rc);
 }
 
 
@@ -1207,7 +1207,7 @@ static void afterpoll_internal(libxl__egc *egc, libxl__poller *poller,
 
         time_deregister(gc, etime);
 
-        time_occurs(egc, etime);
+        time_occurs(egc, etime, ERROR_TIMEDOUT);
     }
 }
 
@@ -1274,7 +1274,7 @@ void libxl_osevent_occurred_timeout(libxl_ctx *ctx, void *for_libxl)
 
     LIBXL_TAILQ_REMOVE(&CTX->etimes, ev, entry);
 
-    time_occurs(egc, ev);
+    time_occurs(egc, ev, ERROR_TIMEDOUT);
 
  out:
     CTX_UNLOCK;
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 69983b6..cda122e 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -195,7 +195,8 @@ struct libxl__ev_fd {
 
 typedef struct libxl__ev_time libxl__ev_time;
 typedef void libxl__ev_time_callback(libxl__egc *egc, libxl__ev_time *ev,
-                                     const struct timeval *requested_abs);
+                                     const struct timeval *requested_abs,
+                                     int rc); /* TIMEDOUT or ABORTED */
 struct libxl__ev_time {
     /* caller should include this in their own struct */
     /* read-only for caller, who may read only when registered: */
@@ -1134,13 +1135,13 @@ typedef struct libxl__xswait_state libxl__xswait_state;
  *     Otherwise, xswait will continue waiting and watching and
  *     will call you back later.
  *
- * rc==ERROR_TIMEDOUT
+ * rc==ERROR_TIMEDOUT, rc==ERROR_ABORTED
  *
  *     The specified timeout was reached.
  *     This has NOT been logged (except to the debug log).
  *     xswait will not continue (but calling libxl__xswait_stop is OK).
  *
- * rc!=0, !=ERROR_TIMEDOUT
+ * rc!=0, !=ERROR_TIMEDOUT, !=ERROR_ABORTED
  *
  *     Some other error occurred.
  *     This HAS been logged.
@@ -1180,8 +1181,9 @@ int libxl__xswait_start(libxl__gc*, libxl__xswait_state*);
 typedef struct libxl__ev_devstate libxl__ev_devstate;
 typedef void libxl__ev_devstate_callback(libxl__egc *egc, libxl__ev_devstate*,
                                          int rc);
-  /* rc will be 0, ERROR_TIMEDOUT, ERROR_INVAL (meaning path was removed),
-   * or ERROR_FAIL if other stuff went wrong (in which latter case, logged) */
+  /* rc will be 0, ERROR_TIMEDOUT, ERROR_ABORTED, ERROR_INVAL
+   * (meaning path was removed), or ERROR_FAIL if other stuff went
+   * wrong (in which latter case, logged) */
 
 struct libxl__ev_devstate {
     /* read-only for caller, who may read only when waiting: */
diff --git a/tools/libxl/libxl_test_timedereg.c b/tools/libxl/libxl_test_timedereg.c
index e2cc27d..c464663 100644
--- a/tools/libxl/libxl_test_timedereg.c
+++ b/tools/libxl/libxl_test_timedereg.c
@@ -28,7 +28,7 @@ static libxl__ao *tao;
 static int seq;
 
 static void occurs(libxl__egc *egc, libxl__ev_time *ev,
-                   const struct timeval *requested_abs);
+                   const struct timeval *requested_abs, int rc);
 
 static void regs(libxl__ao *ao, int j)
 {
@@ -59,13 +59,15 @@ int libxl_test_timedereg(libxl_ctx *ctx, libxl_asyncop_how *ao_how)
 }
 
 static void occurs(libxl__egc *egc, libxl__ev_time *ev,
-                   const struct timeval *requested_abs)
+                   const struct timeval *requested_abs, int rc)
 {
     EGC_GC;
     int i;
 
     int off = ev - &et[0][0];
-    LOG(DEBUG,"occurs[%d][%d] seq=%d", off/NTIMES, off%NTIMES, seq);
+    LOG(DEBUG,"occurs[%d][%d] seq=%d rc=%d", off/NTIMES, off%NTIMES, seq, rc);
+
+    assert(!rc);
 
     switch (seq) {
     case 0:
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH 15/35] libxl: spawn: Preserve rc in error path
  2015-06-25 17:44 [PATCH v4 00/35] libxl ao abort request (cancellation) Ian Jackson
                   ` (13 preceding siblings ...)
  2015-06-25 17:44 ` [PATCH 14/35] libxl: events: Permit timeouts to signal ao abort Ian Jackson
@ 2015-06-25 17:44 ` Ian Jackson
  2015-06-26 14:28   ` Wei Liu
  2015-06-25 17:44 ` [PATCH 16/35] libxl: domain create: Do not destroy on ao abort Ian Jackson
                   ` (20 subsequent siblings)
  35 siblings, 1 reply; 59+ messages in thread
From: Ian Jackson @ 2015-06-25 17:44 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, Euan Harris, Ian Campbell

Make spawn provide an rc to its caller, and either pass it through
from the timeout callback, or invent ERROR_FAIL, as applicable.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
---
v4: New patch in this version of the series.
---
 tools/libxl/libxl_dm.c       |    8 +++++---
 tools/libxl/libxl_exec.c     |   36 ++++++++++++++++++++----------------
 tools/libxl/libxl_internal.h |    5 +++--
 3 files changed, 28 insertions(+), 21 deletions(-)

diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index 33f9ce6..317a8eb 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -1370,7 +1370,8 @@ static void stubdom_xswait_cb(libxl__egc *egc, libxl__xswait_state *xswait,
 static void device_model_confirm(libxl__egc *egc, libxl__spawn_state *spawn,
                                  const char *xsdata);
 static void device_model_startup_failed(libxl__egc *egc,
-                                        libxl__spawn_state *spawn);
+                                        libxl__spawn_state *spawn,
+                                        int rc);
 static void device_model_detached(libxl__egc *egc,
                                   libxl__spawn_state *spawn);
 
@@ -1540,10 +1541,11 @@ static void device_model_confirm(libxl__egc *egc, libxl__spawn_state *spawn,
 }
 
 static void device_model_startup_failed(libxl__egc *egc,
-                                        libxl__spawn_state *spawn)
+                                        libxl__spawn_state *spawn,
+                                        int rc)
 {
     libxl__dm_spawn_state *dmss = CONTAINER_OF(spawn, *dmss, spawn);
-    device_model_spawn_outcome(egc, dmss, ERROR_FAIL);
+    device_model_spawn_outcome(egc, dmss, rc);
 }
 
 static void device_model_detached(libxl__egc *egc,
diff --git a/tools/libxl/libxl_exec.c b/tools/libxl/libxl_exec.c
index 478b4c2..85cbde0 100644
--- a/tools/libxl/libxl_exec.c
+++ b/tools/libxl/libxl_exec.c
@@ -238,11 +238,11 @@ err:
 /*
  * Full set of possible states of a libxl__spawn_state and its _detachable:
  *
- *                   detaching failed  mid     timeout      xswatch          
+ *                   detaching rc      mid     timeout      xswatch
  *  - Undefined         undef   undef   -        undef        undef
  *  - Idle              any     any     Idle     Idle         Idle
  *  - Attached OK       0       0       Active   Active       Active
- *  - Attached Failed   0       1       Active   Idle         Idle
+ *  - Attached Failed   0       non-0   Active   Idle         Idle
  *  - Detaching         1       maybe   Active   Idle         Idle
  *  - Partial           any     any     Idle     Active/Idle  Active/Idle
  *
@@ -267,7 +267,7 @@ static void spawn_cleanup(libxl__gc *gc, libxl__spawn_state *ss);
 
 /* Precondition: Attached or Detaching; caller has logged failure reason.
  * Results: Detaching, or Attached Failed */
-static void spawn_fail(libxl__egc *egc, libxl__spawn_state *ss);
+static void spawn_fail(libxl__egc *egc, libxl__spawn_state *ss, int rc);
 
 void libxl__spawn_init(libxl__spawn_state *ss)
 {
@@ -283,7 +283,7 @@ int libxl__spawn_spawn(libxl__egc *egc, libxl__spawn_state *ss)
     int status, rc;
 
     libxl__spawn_init(ss);
-    ss->failed = ss->detaching = 0;
+    ss->rc = ss->detaching = 0;
 
     ss->xswait.ao = ao;
     ss->xswait.what = GCSPRINTF("%s startup", ss->what);
@@ -352,12 +352,13 @@ static void spawn_cleanup(libxl__gc *gc, libxl__spawn_state *ss)
 
 static void spawn_detach(libxl__gc *gc, libxl__spawn_state *ss)
 /* Precondition: Attached or Detaching, but caller must have just set
- * at least one of detaching or failed.
+ * at least one of detaching or rc.
  * Results: Detaching or Attached Failed */
 {
     int r;
 
     assert(libxl__ev_child_inuse(&ss->mid));
+    assert(ss->detaching || ss->rc);
     libxl__xswait_stop(gc, &ss->xswait);
 
     pid_t child = ss->mid.pid;
@@ -373,12 +374,13 @@ void libxl__spawn_initiate_detach(libxl__gc *gc, libxl__spawn_state *ss)
     spawn_detach(gc, ss);
 }
 
-static void spawn_fail(libxl__egc *egc, libxl__spawn_state *ss)
+static void spawn_fail(libxl__egc *egc, libxl__spawn_state *ss, int rc)
 /* Caller must have logged.  Must be last thing in calling function,
  * as it may make the callback.  Precondition: Attached or Detaching. */
 {
     EGC_GC;
-    ss->failed = 1;
+    assert(rc);
+    ss->rc = rc;
     spawn_detach(gc, ss);
 }
 
@@ -391,9 +393,10 @@ static void spawn_watch_event(libxl__egc *egc, libxl__xswait_state *xswa,
     if (rc) {
         if (rc == ERROR_TIMEDOUT)
             LOG(ERROR, "%s: startup timed out", ss->what);
-        spawn_fail(egc, ss); /* must be last */
+        spawn_fail(egc, ss, rc); /* must be last */
         return;
     }
+    LOG(DEBUG, "%s: spawn watch p=%s", ss->what, p);
     ss->confirm_cb(egc, ss, p); /* must be last */
 }
 
@@ -404,7 +407,7 @@ static void spawn_middle_death(libxl__egc *egc, libxl__ev_child *childw,
     EGC_GC;
     libxl__spawn_state *ss = CONTAINER_OF(childw, *ss, mid);
 
-    if ((ss->failed || ss->detaching) &&
+    if ((ss->rc || ss->detaching) &&
         ((WIFEXITED(status) && WEXITSTATUS(status)==0) ||
          (WIFSIGNALED(status) && WTERMSIG(status)==SIGKILL))) {
         /* as expected */
@@ -413,7 +416,7 @@ static void spawn_middle_death(libxl__egc *egc, libxl__ev_child *childw,
         const char *what =
             GCSPRINTF("%s intermediate process (startup monitor)", ss->what);
         libxl_report_child_exitstatus(CTX, loglevel, what, pid, status);
-        ss->failed = 1;
+        ss->rc = ERROR_FAIL;
     } else {
         if (!status)
             LOG(ERROR, "%s [%ld]: unexpectedly exited with exit status 0,"
@@ -432,19 +435,20 @@ static void spawn_middle_death(libxl__egc *egc, libxl__ev_child *childw,
                 LOG(ERROR, "%s [%ld]: died during startup due to unknown fatal"
                     " signal number %d", ss->what, (unsigned long)pid, sig);
         }
-        ss->failed = 1;
+        ss->rc = ERROR_FAIL;
     }
 
     spawn_cleanup(gc, ss);
 
-    if (ss->failed && !ss->detaching) {
-        ss->failure_cb(egc, ss); /* must be last */
+    if (ss->rc && !ss->detaching) {
+        ss->failure_cb(egc, ss, ss->rc); /* must be last */
         return;
     }
     
-    if (ss->failed && ss->detaching)
-        LOG(WARN,"%s underlying machinery seemed to fail,"
-            " but its function seems to have been successful", ss->what);
+    if (ss->rc && ss->detaching)
+        LOG(WARN,"%s underlying machinery seemed to fail (%d),"
+            " but its function seems to have been successful",
+            ss->what, ss->rc);
 
     assert(ss->detaching);
     ss->detached_cb(egc, ss);
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index cda122e..8a142e7 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -1365,7 +1365,8 @@ libxl__spawn_midproc_cb(libxl__gc*, libxl__spawn_state*, pid_t inner);
  * The spawn state will be Idle on entry to the callback (and
  * it may be reused immediately if desired).
  */
-typedef void libxl__spawn_failure_cb(libxl__egc*, libxl__spawn_state*);
+typedef void libxl__spawn_failure_cb(libxl__egc*, libxl__spawn_state*,
+                                     int rc);
 
 /*
  * Called when the xspath watch triggers.  xspath will have been read
@@ -1406,7 +1407,7 @@ struct libxl__spawn_state {
 
     /* remaining fields are private to libxl_spawn_... */
     int detaching; /* we are in Detaching */
-    int failed; /* might be true whenever we are not Idle */
+    int rc; /* might be non-0 whenever we are not Idle */
     libxl__ev_child mid; /* always in use whenever we are not Idle */
     libxl__xswait_state xswait;
 };
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH 16/35] libxl: domain create: Do not destroy on ao abort
  2015-06-25 17:44 [PATCH v4 00/35] libxl ao abort request (cancellation) Ian Jackson
                   ` (14 preceding siblings ...)
  2015-06-25 17:44 ` [PATCH 15/35] libxl: spawn: Preserve rc in error path Ian Jackson
@ 2015-06-25 17:44 ` Ian Jackson
  2015-06-25 17:44 ` [PATCH 17/35] libxl: ao: Record ultimate parent of a nested ao Ian Jackson
                   ` (19 subsequent siblings)
  35 siblings, 0 replies; 59+ messages in thread
From: Ian Jackson @ 2015-06-25 17:44 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, Euan Harris, Ian Campbell

If we aborted the domain creation, do not try to tear it down again
Document this.

This is a backwards-compatible API change since old libxl users will
never abort any operations.

In the current code, there is no functional change, because
ERROR_ABORTED is never generated anywhere yet.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v4: ABORTED not CANCELLED.
    Always write out guest domid on completion.
    Do not trash rc with libxl__set_domain_configuration on preserve path.
---
 tools/libxl/libxl.h        |    4 ++++
 tools/libxl/libxl_create.c |   15 +++++++++------
 2 files changed, 13 insertions(+), 6 deletions(-)

diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index 0a7913b..b113d8c 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -975,6 +975,10 @@ int libxl_ctx_free(libxl_ctx *ctx /* 0 is OK */);
 
 /* domain related functions */
 
+/* If the result is ERROR_ABORTED, the domain may or may not exist
+ * (in a half-created state).  *domid will be valid and will be the
+ * domain id, or -1, as appropriate */
+
 int libxl_domain_create_new(libxl_ctx *ctx, libxl_domain_config *d_config,
                             uint32_t *domid,
                             const libxl_asyncop_how *ao_how,
diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index 86384d2..f366a09 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -1519,7 +1519,9 @@ static void domcreate_complete(libxl__egc *egc,
     if (!rc && d_config->b_info.exec_ssidref)
         rc = xc_flask_relabel_domain(CTX->xch, dcs->guest_domid, d_config->b_info.exec_ssidref);
 
-    if (!rc) {
+    bool retain_domain = !rc || rc == ERROR_ABORTED;
+
+    if (retain_domain) {
         libxl__domain_userdata_lock *lock;
 
         /* Note that we hold CTX lock at this point so only need to
@@ -1530,15 +1532,17 @@ static void domcreate_complete(libxl__egc *egc,
             rc = ERROR_LOCK_FAIL;
         } else {
             libxl__update_domain_configuration(gc, d_config_saved, d_config);
-            rc = libxl__set_domain_configuration(gc, dcs->guest_domid,
-                                                 d_config_saved);
+            int cfg_rc = libxl__set_domain_configuration
+                (gc, dcs->guest_domid, d_config_saved);
+            if (!rc)
+                rc = cfg_rc;
             libxl__unlock_domain_userdata(lock);
         }
     }
 
     libxl_domain_config_dispose(d_config_saved);
 
-    if (rc) {
+    if (!retain_domain) {
         if (dcs->guest_domid) {
             dcs->dds.ao = ao;
             dcs->dds.domid = dcs->guest_domid;
@@ -1608,8 +1612,7 @@ static void domain_create_cb(libxl__egc *egc,
     libxl__app_domain_create_state *cdcs = CONTAINER_OF(dcs, *cdcs, dcs);
     STATE_AO_GC(cdcs->dcs.ao);
 
-    if (!rc)
-        *cdcs->domid_out = domid;
+    *cdcs->domid_out = domid;
 
     libxl__ao_complete(egc, ao, rc);
 }
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH 17/35] libxl: ao: Record ultimate parent of a nested ao
  2015-06-25 17:44 [PATCH v4 00/35] libxl ao abort request (cancellation) Ian Jackson
                   ` (15 preceding siblings ...)
  2015-06-25 17:44 ` [PATCH 16/35] libxl: domain create: Do not destroy on ao abort Ian Jackson
@ 2015-06-25 17:44 ` Ian Jackson
  2015-06-25 17:44 ` [PATCH 18/35] libxl: ao: Count the nested progeny of an ao Ian Jackson
                   ` (18 subsequent siblings)
  35 siblings, 0 replies; 59+ messages in thread
From: Ian Jackson @ 2015-06-25 17:44 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, Euan Harris, Ian Campbell

This will be used by the abort request machinery.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
 tools/libxl/libxl_event.c    |   25 +++++++++++++++----------
 tools/libxl/libxl_internal.h |    3 ++-
 2 files changed, 17 insertions(+), 11 deletions(-)

diff --git a/tools/libxl/libxl_event.c b/tools/libxl/libxl_event.c
index 2809743..47d65a3 100644
--- a/tools/libxl/libxl_event.c
+++ b/tools/libxl/libxl_event.c
@@ -31,6 +31,9 @@
 #define DBG(args, ...) LIBXL__DBG_LOG(CTX, args, __VA_ARGS__)
 
 
+static libxl__ao *ao_nested_root(libxl__ao *ao);
+
+
 /*
  * The counter osevent_in_hook is used to ensure that the application
  * honours the reentrancy restriction documented in libxl_event.h.
@@ -1761,7 +1764,7 @@ void libxl__ao_complete(libxl__egc *egc, libxl__ao *ao, int rc)
     LOG(DEBUG,"ao %p: complete, rc=%d",ao,rc);
     assert(ao->magic == LIBXL__AO_MAGIC);
     assert(!ao->complete);
-    assert(!ao->nested);
+    assert(!ao->nested_root);
     ao->complete = 1;
     ao->rc = rc;
 
@@ -1932,7 +1935,7 @@ void libxl__ao_progress_report(libxl__egc *egc, libxl__ao *ao,
         const libxl_asyncprogress_how *how, libxl_event *ev)
 {
     AO_GC;
-    assert(!ao->nested);
+    assert(!ao->nested_root);
     if (how->callback == dummy_asyncprogress_callback_ignore) {
         LOG(DEBUG,"ao %p: progress report: ignored",ao);
         libxl_event_free(CTX,ev);
@@ -1955,21 +1958,23 @@ void libxl__ao_progress_report(libxl__egc *egc, libxl__ao *ao,
 
 /* nested ao */
 
+static libxl__ao *ao_nested_root(libxl__ao *ao) {
+    libxl__ao *root = ao->nested_root ? : ao;
+    assert(!root->nested_root);
+    return root;
+}
+
 _hidden libxl__ao *libxl__nested_ao_create(libxl__ao *parent)
 {
-    /* We only use the parent to get the ctx.  However, we require the
-     * caller to provide us with an ao, not just a ctx, to prove that
-     * they are already in an asynchronous operation.  That will avoid
-     * people using this to (for example) make an ao in a non-ao_how
-     * function somewhere in the middle of libxl. */
-    libxl__ao *child = NULL;
+    libxl__ao *child = NULL, *root;
     libxl_ctx *ctx = libxl__gc_owner(&parent->gc);
 
     assert(parent->magic == LIBXL__AO_MAGIC);
+    root = ao_nested_root(parent);
 
     child = libxl__zalloc(&ctx->nogc_gc, sizeof(*child));
     child->magic = LIBXL__AO_MAGIC;
-    child->nested = 1;
+    child->nested_root = root;
     LIBXL_INIT_GC(child->gc, ctx);
     libxl__gc *gc = &child->gc;
 
@@ -1980,7 +1985,7 @@ _hidden libxl__ao *libxl__nested_ao_create(libxl__ao *parent)
 _hidden void libxl__nested_ao_free(libxl__ao *child)
 {
     assert(child->magic == LIBXL__AO_MAGIC);
-    assert(child->nested);
+    assert(child->nested_root);
     libxl_ctx *ctx = libxl__gc_owner(&child->gc);
     libxl__ao__destroy(ctx, child);
 }
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 8a142e7..235f630 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -468,7 +468,8 @@ struct libxl__ao {
      * only in libxl__ao_complete.)
      */
     uint32_t magic;
-    unsigned constructing:1, in_initiator:1, complete:1, notified:1, nested:1;
+    unsigned constructing:1, in_initiator:1, complete:1, notified:1;
+    libxl__ao *nested_root;
     int progress_reports_outstanding;
     int rc;
     libxl__gc gc;
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH 18/35] libxl: ao: Count the nested progeny of an ao
  2015-06-25 17:44 [PATCH v4 00/35] libxl ao abort request (cancellation) Ian Jackson
                   ` (16 preceding siblings ...)
  2015-06-25 17:44 ` [PATCH 17/35] libxl: ao: Record ultimate parent of a nested ao Ian Jackson
@ 2015-06-25 17:44 ` Ian Jackson
  2015-06-25 17:44 ` [PATCH 19/35] libxl: ao: Provide manip_refcnt Ian Jackson
                   ` (17 subsequent siblings)
  35 siblings, 0 replies; 59+ messages in thread
From: Ian Jackson @ 2015-06-25 17:44 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, Euan Harris, Ian Campbell

This will detect any "escaped" nested aos.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
 tools/libxl/libxl_event.c    |    8 +++++++-
 tools/libxl/libxl_internal.h |    1 +
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/tools/libxl/libxl_event.c b/tools/libxl/libxl_event.c
index 47d65a3..96efb33 100644
--- a/tools/libxl/libxl_event.c
+++ b/tools/libxl/libxl_event.c
@@ -1765,6 +1765,7 @@ void libxl__ao_complete(libxl__egc *egc, libxl__ao *ao, int rc)
     assert(ao->magic == LIBXL__AO_MAGIC);
     assert(!ao->complete);
     assert(!ao->nested_root);
+    assert(!ao->nested_progeny);
     ao->complete = 1;
     ao->rc = rc;
 
@@ -1975,6 +1976,8 @@ _hidden libxl__ao *libxl__nested_ao_create(libxl__ao *parent)
     child = libxl__zalloc(&ctx->nogc_gc, sizeof(*child));
     child->magic = LIBXL__AO_MAGIC;
     child->nested_root = root;
+    assert(root->nested_progeny < INT_MAX);
+    root->nested_progeny++;
     LIBXL_INIT_GC(child->gc, ctx);
     libxl__gc *gc = &child->gc;
 
@@ -1985,7 +1988,10 @@ _hidden libxl__ao *libxl__nested_ao_create(libxl__ao *parent)
 _hidden void libxl__nested_ao_free(libxl__ao *child)
 {
     assert(child->magic == LIBXL__AO_MAGIC);
-    assert(child->nested_root);
+    libxl__ao *root = child->nested_root;
+    assert(root);
+    assert(root->nested_progeny > 0);
+    root->nested_progeny--;
     libxl_ctx *ctx = libxl__gc_owner(&child->gc);
     libxl__ao__destroy(ctx, child);
 }
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 235f630..6760aba 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -470,6 +470,7 @@ struct libxl__ao {
     uint32_t magic;
     unsigned constructing:1, in_initiator:1, complete:1, notified:1;
     libxl__ao *nested_root;
+    int nested_progeny;
     int progress_reports_outstanding;
     int rc;
     libxl__gc gc;
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH 19/35] libxl: ao: Provide manip_refcnt
  2015-06-25 17:44 [PATCH v4 00/35] libxl ao abort request (cancellation) Ian Jackson
                   ` (17 preceding siblings ...)
  2015-06-25 17:44 ` [PATCH 18/35] libxl: ao: Count the nested progeny of an ao Ian Jackson
@ 2015-06-25 17:44 ` Ian Jackson
  2015-06-26 14:33   ` Wei Liu
  2015-06-25 17:44 ` [PATCH 20/35] libxl: ao abort: Provide public ao abort request API Ian Jackson
                   ` (16 subsequent siblings)
  35 siblings, 1 reply; 59+ messages in thread
From: Ian Jackson @ 2015-06-25 17:44 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, Euan Harris, Ian Campbell

Previously we used in_initiator to stop the ao being freed while we
were still in the initiator function (which would result in the
initiator's call to lixl__ao_inprogress accessing the ao after it had
been freed).

We are going to introduce a new libxl entrypoint which finds, and
operates on, ongoing aos.  This function needs the same protection,
and might even end up running on the same ao multiple times
concurrently.

So do this with reference counting instead, with a new variable
ao->manip_refcnt.

We keep ao->in_initiator because that allows us to keep some useful
asserts about the sequencing of libxl__ao_inprogress, etc.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v3: Add a missing space.
    Mention locking in the comment.
---
 tools/libxl/libxl_event.c    |   43 +++++++++++++++++++++++++++++++++---------
 tools/libxl/libxl_internal.h |    1 +
 2 files changed, 35 insertions(+), 9 deletions(-)

diff --git a/tools/libxl/libxl_event.c b/tools/libxl/libxl_event.c
index 96efb33..6d5f9d8 100644
--- a/tools/libxl/libxl_event.c
+++ b/tools/libxl/libxl_event.c
@@ -33,6 +33,8 @@
 
 static libxl__ao *ao_nested_root(libxl__ao *ao);
 
+static void ao__check_destroy(libxl_ctx *ctx, libxl__ao *ao);
+
 
 /*
  * The counter osevent_in_hook is used to ensure that the application
@@ -1347,8 +1349,7 @@ static void egc_run_callbacks(libxl__egc *egc)
         ao->how.callback(CTX, ao->rc, ao->how.u.for_callback);
         CTX_LOCK;
         ao->notified = 1;
-        if (!ao->in_initiator)
-            libxl__ao__destroy(CTX, ao);
+        ao__check_destroy(CTX, ao);
         CTX_UNLOCK;
     }
 }
@@ -1729,6 +1730,33 @@ int libxl_event_wait(libxl_ctx *ctx, libxl_event **event_r,
  *                              - destroy the ao
  */
 
+
+/*
+ * A "manip" is a libxl public function manipulating this ao, which
+ * has a pointer to it.  We have to not destroy it while that's the
+ * case, obviously.  Callers must have the ctx locked, obviously.
+ */
+static void ao__manip_enter(libxl__ao *ao)
+{
+    assert(ao->manip_refcnt < INT_MAX);
+    ao->manip_refcnt++;
+}
+
+static void ao__manip_leave(libxl_ctx *ctx, libxl__ao *ao)
+{
+    assert(ao->manip_refcnt > 0);
+    ao->manip_refcnt--;
+    ao__check_destroy(ctx, ao);
+}
+
+static void ao__check_destroy(libxl_ctx *ctx, libxl__ao *ao)
+{
+    if (!ao->manip_refcnt && ao->notified) {
+        assert(ao->complete);
+        libxl__ao__destroy(ctx, ao);
+    }
+}
+
 void libxl__ao__destroy(libxl_ctx *ctx, libxl__ao *ao)
 {
     AO_GC;
@@ -1810,8 +1838,8 @@ void libxl__ao_complete_check_progress_reports(libxl__egc *egc, libxl__ao *ao)
         }
         ao->notified = 1;
     }
-    if (!ao->in_initiator && ao->notified)
-        libxl__ao__destroy(ctx, ao);
+    
+    ao__check_destroy(ctx, ao);
 }
 
 libxl__ao *libxl__ao_create(libxl_ctx *ctx, uint32_t domid,
@@ -1826,6 +1854,7 @@ libxl__ao *libxl__ao_create(libxl_ctx *ctx, uint32_t domid,
     ao->magic = LIBXL__AO_MAGIC;
     ao->constructing = 1;
     ao->in_initiator = 1;
+    ao__manip_enter(ao);
     ao->poller = 0;
     ao->domid = domid;
     LIBXL_INIT_GC(ao->gc, ctx);
@@ -1906,11 +1935,7 @@ int libxl__ao_inprogress(libxl__ao *ao,
     }
 
     ao->in_initiator = 0;
-
-    if (ao->notified) {
-        assert(ao->complete);
-        libxl__ao__destroy(CTX,ao);
-    }
+    ao__manip_leave(CTX, ao);
 
     return rc;
 }
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 6760aba..3ffefc3 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -469,6 +469,7 @@ struct libxl__ao {
      */
     uint32_t magic;
     unsigned constructing:1, in_initiator:1, complete:1, notified:1;
+    int manip_refcnt;
     libxl__ao *nested_root;
     int nested_progeny;
     int progress_reports_outstanding;
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH 20/35] libxl: ao abort: Provide public ao abort request API
  2015-06-25 17:44 [PATCH v4 00/35] libxl ao abort request (cancellation) Ian Jackson
                   ` (18 preceding siblings ...)
  2015-06-25 17:44 ` [PATCH 19/35] libxl: ao: Provide manip_refcnt Ian Jackson
@ 2015-06-25 17:44 ` Ian Jackson
  2015-06-26 14:47   ` Wei Liu
  2015-06-25 17:44 ` [PATCH 21/35] libxl: ao abort: Provide explicit internal abort check API Ian Jackson
                   ` (15 subsequent siblings)
  35 siblings, 1 reply; 59+ messages in thread
From: Ian Jackson @ 2015-06-25 17:44 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, Euan Harris, Ian Campbell

Provide libxl_ao_abort.

There is machinery to allow an ao to register an interest in abort
requests, using a libxl__ao_abortable.

This API is not currently very functional: requesting abort will
never have any effect.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
---
v4: Rename from cancel to abort.
    Actually record aos on aos_inprogress.
    (Report from Koushik Chakravarty at Citrix.)
    Do not mark libxl_ao_cancel hidden (!)
    Abolish ERROR_NOTIMPLEMENTED from libxl_ao_cancel.
    All operations are supposed to support cancellation.
v2: Minor comment improvements
---
 tools/libxl/libxl.c          |    3 +
 tools/libxl/libxl.h          |   57 ++++++++++++++++++
 tools/libxl/libxl_event.c    |  132 +++++++++++++++++++++++++++++++++++++++++-
 tools/libxl/libxl_internal.h |   41 ++++++++++++-
 4 files changed, 229 insertions(+), 4 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 98b94ee..54b398b 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -73,6 +73,8 @@ int libxl_ctx_alloc(libxl_ctx **pctx, int version,
     LIBXL_LIST_INIT(&ctx->evtchns_waiting);
     libxl__ev_fd_init(&ctx->evtchn_efd);
 
+    LIBXL_LIST_INIT(&ctx->aos_inprogress);
+
     LIBXL_TAILQ_INIT(&ctx->death_list);
     libxl__ev_xswatch_init(&ctx->death_watch);
 
@@ -174,6 +176,7 @@ int libxl_ctx_free(libxl_ctx *ctx)
     assert(LIBXL_LIST_EMPTY(&ctx->efds));
     assert(LIBXL_TAILQ_EMPTY(&ctx->etimes));
     assert(LIBXL_LIST_EMPTY(&ctx->evtchns_waiting));
+    assert(LIBXL_LIST_EMPTY(&ctx->aos_inprogress));
 
     if (ctx->xch) xc_interface_close(ctx->xch);
     libxl_version_info_dispose(&ctx->version_info);
diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index b113d8c..e52afca 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -688,6 +688,11 @@ typedef struct libxl__ctx libxl_ctx;
  */
 #define LIBXL_HAVE_DEVICE_CHANNEL 1
 
+/*
+ * LIBXL_HAVE_AO_ABORT indicates the availability of libxl_ao_abort
+ */
+#define LIBXL_HAVE_AO_ABORT 1
+
 /* Functions annotated with LIBXL_EXTERNAL_CALLERS_ONLY may not be
  * called from within libxl itself. Callers outside libxl, who
  * do not #include libxl_internal.h, are fine. */
@@ -965,6 +970,58 @@ typedef struct {
     void *for_callback; /* passed to callback */
 } libxl_asyncprogress_how;
 
+/*
+ * It is sometimes possible to abort an asynchronous operation.
+ *
+ * libxl_ao_abort searches for an ongoing asynchronous operation whose
+ * ao_how is identical to *how, and tries to abort it.  The return
+ * values from libxl_ao_abort are as follows:
+ *
+ *  0
+ *
+ *     The operation was found, and attempts are being made to cut it
+ *     short.  However, it may still take some time to stop.  It is
+ *     also possible that the operation will nevertheless complete
+ *     successfully.
+ *
+ *  ERROR_NOTFOUND
+ *
+ *      No matching ongoing operation was found.  This might happen
+ *      for an actual operation if the operation has already completed
+ *      (perhaps on another thread).  The call to libxl_ao_abort has
+ *      had no effect.
+ *
+ *  ERROR_ABORTED
+ *
+ *     The operation has already been the subject of at least one
+ *     call to libxl_ao_abort.
+ *
+ * If the operation was indeed cut short due to the abort request, it
+ * will complete, at some point in the future, with ERROR_ABORTED.  In
+ * that case, depending on the operation it have performed some of the
+ * work in question and left the operation half-done.  Consult the
+ * documentation for individual operations.
+ *
+ * Note that an aborted operation might still fail for other reasons
+ * even after the abort was requested.
+ *
+ * If your application is multithreaded you must not reuse an
+ * ao_how->for_event or ao_how->for_callback value (with a particular
+ * ao_how->callback) unless you are sure that none of your other
+ * threads are going to abort the previous operation using that
+ * value; otherwise you risk aborting the wrong operation if the
+ * intended target of the abort request completes in the meantime.
+ *
+ * It is possible to abort even an operation which is being performed
+ * synchronously, but since in that case how==NULL you had better only
+ * have one such operation, because it is not possible to tell them
+ * apart.  (And, if you want to do this, obviously the abort would
+ * have to be requested on a different thread.)
+ */
+int libxl_ao_abort(libxl_ctx *ctx, const libxl_asyncop_how *how)
+                   LIBXL_EXTERNAL_CALLERS_ONLY;
+
+
 #define LIBXL_VERSION 0
 
 /* context functions */
diff --git a/tools/libxl/libxl_event.c b/tools/libxl/libxl_event.c
index 6d5f9d8..f9daa55 100644
--- a/tools/libxl/libxl_event.c
+++ b/tools/libxl/libxl_event.c
@@ -1776,6 +1776,8 @@ void libxl__ao_create_fail(libxl__ao *ao)
     assert(ao->in_initiator);
     assert(!ao->complete);
     assert(!ao->progress_reports_outstanding);
+    assert(!ao->aborting);
+    LIBXL_LIST_REMOVE(ao, inprogress_entry);
     libxl__ao__destroy(CTX, ao);
 }
 
@@ -1796,7 +1798,7 @@ void libxl__ao_complete(libxl__egc *egc, libxl__ao *ao, int rc)
     assert(!ao->nested_progeny);
     ao->complete = 1;
     ao->rc = rc;
-
+    LIBXL_LIST_REMOVE(ao, inprogress_entry);
     libxl__ao_complete_check_progress_reports(egc, ao);
 }
 
@@ -1869,6 +1871,8 @@ libxl__ao *libxl__ao_create(libxl_ctx *ctx, uint32_t domid,
                "ao %p: create: how=%p callback=%p poller=%p",
                ao, how, ao->how.callback, ao->poller);
 
+    LIBXL_LIST_INSERT_HEAD(&ctx->aos_inprogress, ao, inprogress_entry);
+
     return ao;
 
  out:
@@ -1922,8 +1926,8 @@ int libxl__ao_inprogress(libxl__ao *ao,
                 sleep(1);
                 /* It's either this or return ERROR_I_DONT_KNOW_WHETHER
                  * _THE_THING_YOU_ASKED_FOR_WILL_BE_DONE_LATER_WHEN
-                 * _YOU_DIDNT_EXPECT_IT, since we don't have any kind of
-                 * cancellation ability. */
+                 * _YOU_DIDNT_EXPECT_IT, since we don't have a
+                 * synchronous cancellation ability. */
             }
 
             CTX_UNLOCK;
@@ -1941,6 +1945,128 @@ int libxl__ao_inprogress(libxl__ao *ao,
 }
 
 
+/* abort requests */
+
+static int ao__abort(libxl_ctx *ctx, libxl__ao *parent)
+/* Temporarily unlocks ctx, which must be locked exactly once on entry. */
+{
+    int rc;
+    ao__manip_enter(parent);
+
+    if (parent->aborting) {
+        rc = ERROR_ABORTED;
+        goto out;
+    }
+
+    parent->aborting = 1;
+
+    if (LIBXL_LIST_EMPTY(&parent->abortables)) {
+        LIBXL__LOG(ctx, XTL_DEBUG,
+                   "ao %p: abort requested and noted, but no-one interested",
+                   parent);
+        rc = 0;
+        goto out;
+    }
+
+    /* We keep calling abort hooks until there are none left */
+    while (!LIBXL_LIST_EMPTY(&parent->abortables)) {
+        libxl__egc egc;
+        LIBXL_INIT_EGC(egc,ctx);
+
+        assert(!parent->complete);
+
+        libxl__ao_abortable *abrt = LIBXL_LIST_FIRST(&parent->abortables);
+        assert(parent == ao_nested_root(abrt->ao));
+
+        LIBXL_LIST_REMOVE(abrt, entry);
+        abrt->registered = 0;
+
+        LIBXL__LOG(ctx, XTL_DEBUG, "ao %p: abrt=%p: aborting",
+                   parent, abrt->ao);
+        abrt->callback(&egc, abrt, ERROR_ABORTED);
+
+        libxl__ctx_unlock(ctx);
+        libxl__egc_cleanup(&egc);
+        libxl__ctx_lock(ctx);
+    }
+
+    rc = 0;
+
+ out:
+    ao__manip_leave(ctx, parent);
+    return rc;
+}
+
+int libxl_ao_abort(libxl_ctx *ctx, const libxl_asyncop_how *how)
+{
+    libxl__ao *search;
+    libxl__ctx_lock(ctx);
+    int rc;
+
+    LIBXL_LIST_FOREACH(search, &ctx->aos_inprogress, inprogress_entry) {
+        if (how) {
+            /* looking for ao to be reported by callback or event */
+            if (search->poller)
+                /* sync */
+                continue;
+            if (how->callback != search->how.callback)
+                continue;
+            if (how->callback
+                ? (how->u.for_callback != search->how.u.for_callback)
+                : (how->u.for_event != search->how.u.for_event))
+                continue;
+        } else {
+            /* looking for synchronous call */
+            if (!search->poller)
+                /* async */
+                continue;
+        }
+        goto found;
+    }
+    rc = ERROR_NOTFOUND;
+    goto out;
+
+ found:
+    rc = ao__abort(ctx, search);
+ out:
+    libxl__ctx_unlock(ctx);
+    return rc;
+}
+
+int libxl__ao_abortable_register(libxl__ao_abortable *abrt)
+{
+    libxl__ao *ao = abrt->ao;
+    libxl__ao *root = ao_nested_root(ao);
+    AO_GC;
+
+    if (root->aborting) {
+ DBG("ao=%p: preemptively aborting ao_abortable registration %p (root=%p)",
+            ao, abrt, root);
+        return ERROR_ABORTED;
+    }
+
+    DBG("ao=%p, abrt=%p: registering (root=%p)", ao, abrt, root);
+    LIBXL_LIST_INSERT_HEAD(&root->abortables, abrt, entry);
+    abrt->registered = 1;
+
+    return 0;
+}
+
+_hidden void libxl__ao_abortable_deregister(libxl__ao_abortable *abrt)
+{
+    if (!abrt->registered)
+        return;
+
+    libxl__ao *ao = abrt->ao;
+    libxl__ao *root __attribute__((unused)) = ao_nested_root(ao);
+    AO_GC;
+
+    DBG("ao=%p, abrt=%p: deregistering (root=%p)", ao, abrt, root);
+    LIBXL_LIST_REMOVE(abrt, entry);
+    abrt->registered = 0;
+}
+
+
 /* progress reporting */
 
 /* The application indicates a desire to ignore events by passing NULL
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 3ffefc3..b275ee4 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -193,6 +193,40 @@ struct libxl__ev_fd {
 };
 
 
+typedef struct libxl__ao_abortable libxl__ao_abortable;
+typedef void libxl__ao_abortable_callback(libxl__egc *egc,
+                  libxl__ao_abortable *ao_abortable, int rc /* ABORTED */);
+
+struct libxl__ao_abortable {
+    /* caller must fill this in and it must remain valid */
+    libxl__ao *ao;
+    libxl__ao_abortable_callback *callback;
+    /* remainder is private for abort machinery */
+    bool registered;
+    LIBXL_LIST_ENTRY(libxl__ao_abortable) entry;
+    /*
+     * For nested aos:
+     *  Semantically, abort affects the whole tree of aos,
+     *    not just the parent.
+     *  libxl__ao_abortable.ao refers to the child, so
+     *    that the child callback sees the right ao.  (After all,
+     *    it was code dealing with the child that set .ao.)
+     *  But, the abortable is recorded on the "abortables" list
+     *    for the ultimate root ao, so that every possible child
+     *    abort occurs as a result of the abort of the parent.
+     *  We set ao->aborting only in the root.
+     */
+};
+
+_hidden int libxl__ao_abortable_register(libxl__ao_abortable*);
+_hidden void libxl__ao_abortable_deregister(libxl__ao_abortable*);
+
+static inline void libxl__ao_abortable_init
+  (libxl__ao_abortable *c) { c->registered = 0; }
+static inline bool libxl__ao_abortable_isregistered
+  (const libxl__ao_abortable *c) { return c->registered; }
+
+
 typedef struct libxl__ev_time libxl__ev_time;
 typedef void libxl__ev_time_callback(libxl__egc *egc, libxl__ev_time *ev,
                                      const struct timeval *requested_abs,
@@ -382,6 +416,8 @@ struct libxl__ctx {
     LIBXL_LIST_HEAD(, libxl__ev_evtchn) evtchns_waiting;
     libxl__ev_fd evtchn_efd;
 
+    LIBXL_LIST_HEAD(, libxl__ao) aos_inprogress;
+
     LIBXL_TAILQ_HEAD(libxl__evgen_domain_death_list, libxl_evgen_domain_death)
         death_list /* sorted by domid */,
         death_reported;
@@ -468,12 +504,15 @@ struct libxl__ao {
      * only in libxl__ao_complete.)
      */
     uint32_t magic;
-    unsigned constructing:1, in_initiator:1, complete:1, notified:1;
+    unsigned constructing:1, in_initiator:1, complete:1, notified:1,
+        aborting:1;
     int manip_refcnt;
     libxl__ao *nested_root;
     int nested_progeny;
     int progress_reports_outstanding;
     int rc;
+    LIBXL_LIST_HEAD(, libxl__ao_abortable) abortables;
+    LIBXL_LIST_ENTRY(libxl__ao) inprogress_entry;
     libxl__gc gc;
     libxl_asyncop_how how;
     libxl__poller *poller;
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH 21/35] libxl: ao abort: Provide explicit internal abort check API
  2015-06-25 17:44 [PATCH v4 00/35] libxl ao abort request (cancellation) Ian Jackson
                   ` (19 preceding siblings ...)
  2015-06-25 17:44 ` [PATCH 20/35] libxl: ao abort: Provide public ao abort request API Ian Jackson
@ 2015-06-25 17:44 ` Ian Jackson
  2015-06-25 17:44 ` [PATCH 22/35] libxl: ao abort: Make timeouts abortable Ian Jackson
                   ` (14 subsequent siblings)
  35 siblings, 0 replies; 59+ messages in thread
From: Ian Jackson @ 2015-06-25 17:44 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, Euan Harris, Ian Campbell

Some places in libxl which can't handle abort via a
libxl__ao_abortable callback might nevertheless benefit from being
able to explicitly check whether abort has been requested.

Provide the (fairly trivial) internal API function to do this.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v2: New in this version of the series.
---
 tools/libxl/libxl_event.c    |   11 +++++++++++
 tools/libxl/libxl_internal.h |    2 ++
 2 files changed, 13 insertions(+)

diff --git a/tools/libxl/libxl_event.c b/tools/libxl/libxl_event.c
index f9daa55..5015563 100644
--- a/tools/libxl/libxl_event.c
+++ b/tools/libxl/libxl_event.c
@@ -2033,6 +2033,17 @@ int libxl_ao_abort(libxl_ctx *ctx, const libxl_asyncop_how *how)
     return rc;
 }
 
+int libxl__ao_aborting(libxl__ao *ao)
+{
+    libxl__ao *root = ao_nested_root(ao);
+    if (root->aborting) {
+        DBG("ao=%p: aborting at explicit check (root=%p)", ao, root);
+        return ERROR_ABORTED;
+    }
+
+    return 0;
+}
+
 int libxl__ao_abortable_register(libxl__ao_abortable *abrt)
 {
     libxl__ao *ao = abrt->ao;
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index b275ee4..637b96b 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -226,6 +226,8 @@ static inline void libxl__ao_abortable_init
 static inline bool libxl__ao_abortable_isregistered
   (const libxl__ao_abortable *c) { return c->registered; }
 
+int libxl__ao_aborting(libxl__ao *ao); /* -> 0 or ERROR_ABORTED */
+
 
 typedef struct libxl__ev_time libxl__ev_time;
 typedef void libxl__ev_time_callback(libxl__egc *egc, libxl__ev_time *ev,
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH 22/35] libxl: ao abort: Make timeouts abortable
  2015-06-25 17:44 [PATCH v4 00/35] libxl ao abort request (cancellation) Ian Jackson
                   ` (20 preceding siblings ...)
  2015-06-25 17:44 ` [PATCH 21/35] libxl: ao abort: Provide explicit internal abort check API Ian Jackson
@ 2015-06-25 17:44 ` Ian Jackson
  2015-06-25 17:44 ` [PATCH 23/35] libxl: ao abort: Note that driver domain task cannot be usefully aborted Ian Jackson
                   ` (13 subsequent siblings)
  35 siblings, 0 replies; 59+ messages in thread
From: Ian Jackson @ 2015-06-25 17:44 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, Euan Harris, Ian Campbell

Make libxl__ev_time* register with the abort machinery, so that
libxl_ao_abort can stop any operation which has a timeout.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v4: Do not immediately deregister the abortable (!)
---
 tools/libxl/libxl_event.c    |   28 ++++++++++++++++++++++++++++
 tools/libxl/libxl_internal.h |    3 ++-
 2 files changed, 30 insertions(+), 1 deletion(-)

diff --git a/tools/libxl/libxl_event.c b/tools/libxl/libxl_event.c
index 5015563..ed9e18c 100644
--- a/tools/libxl/libxl_event.c
+++ b/tools/libxl/libxl_event.c
@@ -315,6 +315,8 @@ static int time_register_finite(libxl__gc *gc, libxl__ev_time *ev,
 
 static void time_deregister(libxl__gc *gc, libxl__ev_time *ev)
 {
+    libxl__ao_abortable_deregister(&ev->abrt);
+
     if (!ev->infinite) {
         struct timeval right_away = { 0, 0 };
         if (ev->nexus) /* only set if app provided hooks */
@@ -337,6 +339,23 @@ static void time_done_debug(libxl__gc *gc, const char *func,
 #endif
 }
 
+static void time_aborted(libxl__egc *egc, libxl__ao_abortable *abrt, int rc)
+{
+    libxl__ev_time *ev = CONTAINER_OF(abrt, *ev, abrt);
+    EGC_GC;
+
+    time_deregister(gc, ev);
+    DBG("ev_time=%p aborted", ev);
+    ev->func(egc, ev, &ev->abs, rc);
+}
+
+static int time_register_abortable(libxl__ao *ao, libxl__ev_time *ev)
+{
+    ev->abrt.ao = ao;
+    ev->abrt.callback = time_aborted;
+    return libxl__ao_abortable_register(&ev->abrt);
+}
+
 int libxl__ev_time_register_abs(libxl__ao *ao, libxl__ev_time *ev,
                                 libxl__ev_time_callback *func,
                                 struct timeval absolute)
@@ -349,6 +368,9 @@ int libxl__ev_time_register_abs(libxl__ao *ao, libxl__ev_time *ev,
     DBG("ev_time=%p register abs=%lu.%06lu",
         ev, (unsigned long)absolute.tv_sec, (unsigned long)absolute.tv_usec);
 
+    rc = time_register_abortable(ao, ev);
+    if (rc) goto out;
+
     rc = time_register_finite(gc, ev, absolute);
     if (rc) goto out;
 
@@ -356,6 +378,7 @@ int libxl__ev_time_register_abs(libxl__ao *ao, libxl__ev_time *ev,
 
     rc = 0;
  out:
+    libxl__ao_abortable_deregister(&ev->abrt);
     time_done_debug(gc,__func__,ev,rc);
     CTX_UNLOCK;
     return rc;
@@ -374,6 +397,9 @@ int libxl__ev_time_register_rel(libxl__ao *ao, libxl__ev_time *ev,
 
     DBG("ev_time=%p register ms=%d", ev, milliseconds);
 
+    rc = time_register_abortable(ao, ev);
+    if (rc) goto out;
+
     if (milliseconds < 0) {
         ev->infinite = 1;
     } else {
@@ -388,6 +414,8 @@ int libxl__ev_time_register_rel(libxl__ao *ao, libxl__ev_time *ev,
     rc = 0;
 
  out:
+    if (!libxl__ev_time_isregistered(ev))
+        libxl__ao_abortable_deregister(&ev->abrt);
     time_done_debug(gc,__func__,ev,rc);
     CTX_UNLOCK;
     return rc;
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 637b96b..615c382 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -242,6 +242,7 @@ struct libxl__ev_time {
     LIBXL_TAILQ_ENTRY(libxl__ev_time) entry;
     struct timeval abs;
     libxl__osevent_hook_nexus *nexus;
+    libxl__ao_abortable abrt;
 };
 
 typedef struct libxl__ev_xswatch libxl__ev_xswatch;
@@ -848,7 +849,7 @@ _hidden int libxl__ev_time_modify_abs(libxl__gc*, libxl__ev_time *ev,
                                       struct timeval);
 _hidden void libxl__ev_time_deregister(libxl__gc*, libxl__ev_time *ev);
 static inline void libxl__ev_time_init(libxl__ev_time *ev)
-                { ev->func = 0; }
+                { ev->func = 0; libxl__ao_abortable_init(&ev->abrt); }
 static inline int libxl__ev_time_isregistered(const libxl__ev_time *ev)
                 { return !!ev->func; }
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH 23/35] libxl: ao abort: Note that driver domain task cannot be usefully aborted
  2015-06-25 17:44 [PATCH v4 00/35] libxl ao abort request (cancellation) Ian Jackson
                   ` (21 preceding siblings ...)
  2015-06-25 17:44 ` [PATCH 22/35] libxl: ao abort: Make timeouts abortable Ian Jackson
@ 2015-06-25 17:44 ` Ian Jackson
  2015-06-25 17:44 ` [PATCH 24/35] libxl: Introduce DOMAIN_DESTROYED error code Ian Jackson
                   ` (12 subsequent siblings)
  35 siblings, 0 replies; 59+ messages in thread
From: Ian Jackson @ 2015-06-25 17:44 UTC (permalink / raw)
  To: xen-devel
  Cc: Ian Jackson, Wei Liu, Euan Harris, Ian Campbell, Roger Pau Monné

In practice, aborting this task will cause all subsequent actual
backend operations to fail, but will not actually cause the
libxl_device_events_handler operation to complete.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v2: New in this version of the series.
---
 tools/libxl/libxl.h |    3 +++
 1 file changed, 3 insertions(+)

diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index e52afca..a556374 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -1425,6 +1425,9 @@ libxl_device_pci *libxl_device_pci_list(libxl_ctx *ctx, uint32_t domid,
  * From a libxl API point of view, this starts a long-running
  * operation.  That operation consists of "being a driver domain"
  * and never completes.
+ *
+ * Attempting to abort this operation is not advisable; proper
+ * shutdown of the driver domain task is not supported.
  */
 int libxl_device_events_handler(libxl_ctx *ctx,
                                 const libxl_asyncop_how *ao_how)
-- 
1.7.10.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH 24/35] libxl: Introduce DOMAIN_DESTROYED error code
  2015-06-25 17:44 [PATCH v4 00/35] libxl ao abort request (cancellation) Ian Jackson
                   ` (22 preceding siblings ...)
  2015-06-25 17:44 ` [PATCH 23/35] libxl: ao abort: Note that driver domain task cannot be usefully aborted Ian Jackson
@ 2015-06-25 17:44 ` Ian Jackson
  2015-06-25 17:44 ` [PATCH 25/35] libxl: ao abort: Support aborting where we spot domain death Ian Jackson
                   ` (11 subsequent siblings)
  35 siblings, 0 replies; 59+ messages in thread
From: Ian Jackson @ 2015-06-25 17:44 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, Euan Harris, Ian Campbell

This is currently reported only by the bootloader code, if the domain
is destroyed while the bootloader is running.

In the future it would be nice to return it for other circumstances
where the domain existed when the operation started but subsequently
vanished.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v4: Add a comment about the new error code's semantics
v2: New in this version of the series.
---
 tools/libxl/libxl_bootloader.c |    2 +-
 tools/libxl/libxl_types.idl    |    1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/tools/libxl/libxl_bootloader.c b/tools/libxl/libxl_bootloader.c
index 0263ef9..490ad5c 100644
--- a/tools/libxl/libxl_bootloader.c
+++ b/tools/libxl/libxl_bootloader.c
@@ -613,7 +613,7 @@ static void bootloader_display_copyfail(libxl__egc *egc,
 static void bootloader_domaindeath(libxl__egc *egc, libxl__domaindeathcheck *dc)
 {
     libxl__bootloader_state *bl = CONTAINER_OF(dc, *bl, deathcheck);
-    bootloader_stop(egc, bl, ERROR_FAIL);
+    bootloader_stop(egc, bl, ERROR_DOMAIN_DESTROYED);
 }
 
 static void bootloader_finished(libxl__egc *egc, libxl__ev_child *child,
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 9558d52..e1632fa 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -67,6 +67,7 @@ libxl_error = Enumeration("error", [
     (-21, "DOMAIN_NOTFOUND"),
     (-22, "ABORTED"),
     (-23, "NOTFOUND"),
+    (-24, "DOMAIN_DESTROYED"), # Target domain ceased to exist during op
     ], value_namespace = "")
 
 libxl_domain_type = Enumeration("domain_type", [
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH 25/35] libxl: ao abort: Support aborting where we spot domain death
  2015-06-25 17:44 [PATCH v4 00/35] libxl ao abort request (cancellation) Ian Jackson
                   ` (23 preceding siblings ...)
  2015-06-25 17:44 ` [PATCH 24/35] libxl: Introduce DOMAIN_DESTROYED error code Ian Jackson
@ 2015-06-25 17:44 ` Ian Jackson
  2015-06-25 17:44 ` [PATCH 26/35] libxl: Introduce FILLZERO Ian Jackson
                   ` (10 subsequent siblings)
  35 siblings, 0 replies; 59+ messages in thread
From: Ian Jackson @ 2015-06-25 17:44 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, Euan Harris, Ian Campbell

Make an active libxl__domaindeathcheck contain an active
libxl__ao_abortable.

Consequential changes are:
 * domaindeath callbacks now take an rc value.
 * libxl__domaindeathcheck_start takes an ao, not a gc.
 * bootloader_domaindeath plumbs the rc through to its caller.
 * libxl__domaindeathcheck_init and _stop are not quite trivial any
   more so are moved from (inline functions) in libxl_internal.h, to
   ordinary functions defined in libxl_event.c.
 * libxl__domaindeathcheck_start is not trivial any more, and now has
   the standard error-handling pattern.

The only current user of libxl__domaindeathcheck is the bootloader.
So the result is that now it is possible to effectively abort domain
creation while the bootloader is running.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v2: New in this version of the series.
---
 tools/libxl/libxl_bootloader.c |   11 +++++----
 tools/libxl/libxl_event.c      |   50 ++++++++++++++++++++++++++++++++++++----
 tools/libxl/libxl_internal.h   |   12 +++++-----
 3 files changed, 59 insertions(+), 14 deletions(-)

diff --git a/tools/libxl/libxl_bootloader.c b/tools/libxl/libxl_bootloader.c
index 490ad5c..6db958e 100644
--- a/tools/libxl/libxl_bootloader.c
+++ b/tools/libxl/libxl_bootloader.c
@@ -33,7 +33,8 @@ static void bootloader_keystrokes_copyfail(libxl__egc *egc,
        libxl__datacopier_state *dc, int onwrite, int errnoval);
 static void bootloader_display_copyfail(libxl__egc *egc,
        libxl__datacopier_state *dc, int onwrite, int errnoval);
-static void bootloader_domaindeath(libxl__egc*, libxl__domaindeathcheck *dc);
+static void bootloader_domaindeath(libxl__egc*, libxl__domaindeathcheck *dc,
+                                   int rc);
 static void bootloader_finished(libxl__egc *egc, libxl__ev_child *child,
                                 pid_t pid, int status);
 
@@ -496,7 +497,7 @@ static void bootloader_gotptys(libxl__egc *egc, libxl__openpty_state *op)
     bl->deathcheck.what = "stopping bootloader";
     bl->deathcheck.domid = bl->domid;
     bl->deathcheck.callback = bootloader_domaindeath;
-    rc = libxl__domaindeathcheck_start(gc, &bl->deathcheck);
+    rc = libxl__domaindeathcheck_start(ao, &bl->deathcheck);
     if (rc) goto out;
 
     if (bl->console_available)
@@ -610,10 +611,12 @@ static void bootloader_display_copyfail(libxl__egc *egc,
     bootloader_copyfail(egc, "bootloader output", bl, 1, onwrite, errnoval);
 }
 
-static void bootloader_domaindeath(libxl__egc *egc, libxl__domaindeathcheck *dc)
+static void bootloader_domaindeath(libxl__egc *egc,
+                                   libxl__domaindeathcheck *dc,
+                                   int rc)
 {
     libxl__bootloader_state *bl = CONTAINER_OF(dc, *bl, deathcheck);
-    bootloader_stop(egc, bl, ERROR_DOMAIN_DESTROYED);
+    bootloader_stop(egc, bl, rc);
 }
 
 static void bootloader_finished(libxl__egc *egc, libxl__ev_child *child,
diff --git a/tools/libxl/libxl_event.c b/tools/libxl/libxl_event.c
index ed9e18c..9072df4 100644
--- a/tools/libxl/libxl_event.c
+++ b/tools/libxl/libxl_event.c
@@ -933,6 +933,18 @@ int libxl__ev_devstate_wait(libxl__ao *ao, libxl__ev_devstate *ds,
  * futile.
  */
 
+void libxl__domaindeathcheck_init(libxl__domaindeathcheck *dc)
+{
+    libxl__ao_abortable_init(&dc->abrt);
+    libxl__ev_xswatch_init(&dc->watch);
+}
+
+void libxl__domaindeathcheck_stop(libxl__gc *gc, libxl__domaindeathcheck *dc)
+{
+    libxl__ao_abortable_deregister(&dc->abrt);
+    libxl__ev_xswatch_deregister(gc,&dc->watch);
+}
+
 static void domaindeathcheck_callback(libxl__egc *egc, libxl__ev_xswatch *w,
                             const char *watch_path, const char *event_path)
 {
@@ -941,6 +953,8 @@ static void domaindeathcheck_callback(libxl__egc *egc, libxl__ev_xswatch *w,
     const char *p = libxl__xs_read(gc, XBT_NULL, watch_path);
     if (p) return;
 
+    libxl__domaindeathcheck_stop(gc,dc);
+
     if (errno!=ENOENT) {
         LIBXL__EVENT_DISASTER(egc,"failed to read xenstore"
                               " for domain detach check", errno, 0);
@@ -949,15 +963,43 @@ static void domaindeathcheck_callback(libxl__egc *egc, libxl__ev_xswatch *w,
 
     LOG(ERROR,"%s: domain %"PRIu32" removed (%s no longer in xenstore)",
         dc->what, dc->domid, watch_path);
-    dc->callback(egc, dc);
+    dc->callback(egc, dc, ERROR_DOMAIN_DESTROYED);
+}
+
+static void domaindeathcheck_abort(libxl__egc *egc,
+                                   libxl__ao_abortable *abrt,
+                                   int rc)
+{
+    libxl__domaindeathcheck *dc = CONTAINER_OF(abrt, *dc, abrt);
+    EGC_GC;
+
+    libxl__domaindeathcheck_stop(gc,dc);
+    dc->callback(egc, dc, rc);
 }
 
-int libxl__domaindeathcheck_start(libxl__gc *gc,
+int libxl__domaindeathcheck_start(libxl__ao *ao,
                                   libxl__domaindeathcheck *dc)
 {
+    AO_GC;
+    int rc;
     const char *path = GCSPRINTF("/local/domain/%"PRIu32, dc->domid);
-    return libxl__ev_xswatch_register(gc, &dc->watch,
-                                      domaindeathcheck_callback, path);
+
+    libxl__domaindeathcheck_init(dc);
+
+    dc->abrt.ao = ao;
+    dc->abrt.callback = domaindeathcheck_abort;
+    rc = libxl__ao_abortable_register(&dc->abrt);
+    if (rc) goto out;
+
+    rc = libxl__ev_xswatch_register(gc, &dc->watch,
+                                    domaindeathcheck_callback, path);
+    if (rc) goto out;
+
+    return 0;
+
+ out:
+    libxl__domaindeathcheck_stop(gc,dc);
+    return rc;
 }
 
 /*
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 615c382..2296a5b 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -1263,7 +1263,8 @@ _hidden int libxl__ev_devstate_wait(libxl__ao *ao, libxl__ev_devstate *ds,
 
 typedef struct libxl__domaindeathcheck libxl__domaindeathcheck;
 typedef void libxl___domaindeathcheck_callback(libxl__egc *egc,
-                                         libxl__domaindeathcheck*);
+        libxl__domaindeathcheck*,
+        int rc /* DESTROYED or ABORTED */);
 
 struct libxl__domaindeathcheck {
     /* must be filled in by caller, and remain valid: */
@@ -1271,16 +1272,15 @@ struct libxl__domaindeathcheck {
     uint32_t domid;
     libxl___domaindeathcheck_callback *callback;
     /* private */
+    libxl__ao_abortable abrt;
     libxl__ev_xswatch watch;
 };
 
-_hidden int libxl__domaindeathcheck_start(libxl__gc *gc,
+_hidden int libxl__domaindeathcheck_start(libxl__ao *ao,
                                           libxl__domaindeathcheck *dc);
 
-static inline void libxl__domaindeathcheck_init
- (libxl__domaindeathcheck *dc) { libxl__ev_xswatch_init(&dc->watch); }
-static inline void libxl__domaindeathcheck_stop(libxl__gc *gc,
-  libxl__domaindeathcheck *dc) { libxl__ev_xswatch_deregister(gc,&dc->watch); }
+void libxl__domaindeathcheck_init(libxl__domaindeathcheck *dc);
+void libxl__domaindeathcheck_stop(libxl__gc *gc, libxl__domaindeathcheck *dc);
 
 
 /*
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH 26/35] libxl: Introduce FILLZERO
  2015-06-25 17:44 [PATCH v4 00/35] libxl ao abort request (cancellation) Ian Jackson
                   ` (24 preceding siblings ...)
  2015-06-25 17:44 ` [PATCH 25/35] libxl: ao abort: Support aborting where we spot domain death Ian Jackson
@ 2015-06-25 17:44 ` Ian Jackson
  2015-06-25 17:44 ` [PATCH 27/35] libxl: ao abort: Preparations for save/restore abort Ian Jackson
                   ` (9 subsequent siblings)
  35 siblings, 0 replies; 59+ messages in thread
From: Ian Jackson @ 2015-06-25 17:44 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, Euan Harris, Ian Campbell

FILLZERO is a macro for memset(&foo,0,sizeof(foo)).  It eliminates the
possiblity to make the error memset(&foo,0,sizeof(&foo)).

No callers yet, but document it in CODING_STYLE.  (In accordance with
existing libxl policy, I haven't gone through all existing possible
call sites.)

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v2: New in this version of the series.
---
 tools/libxl/CODING_STYLE     |    1 +
 tools/libxl/libxl_internal.h |    3 +++
 tools/libxl/libxl_utils.h    |    3 +++
 3 files changed, 7 insertions(+)

diff --git a/tools/libxl/CODING_STYLE b/tools/libxl/CODING_STYLE
index f5b5890..a65efb3 100644
--- a/tools/libxl/CODING_STYLE
+++ b/tools/libxl/CODING_STYLE
@@ -62,6 +62,7 @@ whenever they are applicable.  For example:
   libxl__ctx_[un]lock     CTX_LOCK, CTX_UNLOCK
   gc=...; ao=...;         EGC_GC, AO_GC, STATE_AO_GC
   explicit gc creation    GC_INIT, GC_FREE
+  memset(..,0,sizeof..)   FILLZERO
 
 
 ERROR HANDLING
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 2296a5b..54c9e39 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -3291,6 +3291,9 @@ _hidden const char *libxl__device_model_savefile(libxl__gc *gc, uint32_t domid);
     })
 
 
+#define FILLZERO LIBXL_FILLZERO
+
+
 /*
  * All of these assume (or define)
  *    libxl__gc *gc;
diff --git a/tools/libxl/libxl_utils.h b/tools/libxl/libxl_utils.h
index 1c1761d..9b90a44 100644
--- a/tools/libxl/libxl_utils.h
+++ b/tools/libxl/libxl_utils.h
@@ -160,6 +160,9 @@ int libxl_cpumap_to_nodemap(libxl_ctx *ctx,
 
 void libxl_string_copy(libxl_ctx *ctx, char **dst, char **src);
 
+
+#define LIBXL_FILLZERO(object) (memset(&(object), 0, sizeof((object))))
+
 #endif
 
 /*
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH 27/35] libxl: ao abort: Preparations for save/restore abort
  2015-06-25 17:44 [PATCH v4 00/35] libxl ao abort request (cancellation) Ian Jackson
                   ` (25 preceding siblings ...)
  2015-06-25 17:44 ` [PATCH 26/35] libxl: Introduce FILLZERO Ian Jackson
@ 2015-06-25 17:44 ` Ian Jackson
  2015-06-25 17:44 ` [PATCH 28/35] libxl: ao abort: Handle SIGTERM in save/restore helper Ian Jackson
                   ` (8 subsequent siblings)
  35 siblings, 0 replies; 59+ messages in thread
From: Ian Jackson @ 2015-06-25 17:44 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, Euan Harris, Ian Campbell

Two unrelated non-functional changes, broken out into a pre-patch for
easier review:

Break out a function sendsig() in libxl_save_callout.c.

Move io_fd to be a global variable in libxl_save_helper.c.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v2: New in this version of the series.
---
 tools/libxl/libxl_save_callout.c |   10 +++++++---
 tools/libxl/libxl_save_helper.c  |    5 +++--
 2 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/tools/libxl/libxl_save_callout.c b/tools/libxl/libxl_save_callout.c
index 40b25e4..1d584f1 100644
--- a/tools/libxl/libxl_save_callout.c
+++ b/tools/libxl/libxl_save_callout.c
@@ -237,6 +237,12 @@ static void run_helper(libxl__egc *egc, libxl__save_helper_state *shs,
     libxl__carefd_close(childs_pipes[1]);
     helper_failed(egc, shs, rc);;
 }
+static void sendsig(libxl__gc *gc, libxl__save_helper_state *shs, int sig)
+{
+    int r = kill(shs->child.pid, sig);
+    if (r) LOGE(WARN, "failed to kill save/restore helper [%lu] (signal %d)",
+                (unsigned long)shs->child.pid, sig);
+}
 
 static void helper_failed(libxl__egc *egc, libxl__save_helper_state *shs,
                           int rc)
@@ -253,9 +259,7 @@ static void helper_failed(libxl__egc *egc, libxl__save_helper_state *shs,
         return;
     }
 
-    int r = kill(shs->child.pid, SIGKILL);
-    if (r) LOGE(WARN, "failed to kill save/restore helper [%lu]",
-                (unsigned long)shs->child.pid);
+    sendsig(gc, shs, SIGKILL);
 }
 
 static void helper_stdout_readable(libxl__egc *egc, libxl__ev_fd *ev,
diff --git a/tools/libxl/libxl_save_helper.c b/tools/libxl/libxl_save_helper.c
index 74826a1..7514b2e 100644
--- a/tools/libxl/libxl_save_helper.c
+++ b/tools/libxl/libxl_save_helper.c
@@ -85,6 +85,7 @@ static xentoollog_logger logger = {
     tellparent_destroy,
 };
 static xc_interface *xch;
+static int io_fd;
 
 /*----- error handling -----*/
 
@@ -211,7 +212,7 @@ int main(int argc, char **argv)
 
     if (!strcmp(mode,"--save-domain")) {
 
-        int io_fd =                atoi(NEXTARG);
+        io_fd =                    atoi(NEXTARG);
         uint32_t dom =             strtoul(NEXTARG,0,10);
         uint32_t max_iters =       strtoul(NEXTARG,0,10);
         uint32_t max_factor =      strtoul(NEXTARG,0,10);
@@ -234,7 +235,7 @@ int main(int argc, char **argv)
 
     } else if (!strcmp(mode,"--restore-domain")) {
 
-        int io_fd =                atoi(NEXTARG);
+        io_fd =                    atoi(NEXTARG);
         uint32_t dom =             strtoul(NEXTARG,0,10);
         unsigned store_evtchn =    strtoul(NEXTARG,0,10);
         domid_t store_domid =      strtoul(NEXTARG,0,10);
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH 28/35] libxl: ao abort: Handle SIGTERM in save/restore helper
  2015-06-25 17:44 [PATCH v4 00/35] libxl ao abort request (cancellation) Ian Jackson
                   ` (26 preceding siblings ...)
  2015-06-25 17:44 ` [PATCH 27/35] libxl: ao abort: Preparations for save/restore abort Ian Jackson
@ 2015-06-25 17:44 ` Ian Jackson
  2015-06-25 17:44 ` [PATCH 29/35] libxl: ao abort: Abort libxc save/restore Ian Jackson
                   ` (7 subsequent siblings)
  35 siblings, 0 replies; 59+ messages in thread
From: Ian Jackson @ 2015-06-25 17:44 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, Euan Harris, Ian Campbell

During startup of the save/restore helper, set the disposition of
SIGTERM appropriately.

For restore, we can simply die immediately - there is no point trying
to do any kind of cleanup on what is now going to be a trashed domain.

For save, we want to arrange that libxc's cleanup code (eg turning off
logdirty) takes place.  So our signal handler replaces the fd with one
on which writes will fail, causing libxc's own loop to fail next time
it actually tries to do a write.

Currently this has only a minor beneficial effect: we don't send the
helper a SIGTERM ourselves, and if someone else contrives to send our
helper a SIGTERM they have probably sent one to libxl too in which
case things are going to be a bit messy anyway.

But in the next patch libxl itself is going to use SIGTERM to
implement ao abort requests.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v2: New in this version of the series.
---
 tools/libxl/libxl_save_helper.c |   58 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 58 insertions(+)

diff --git a/tools/libxl/libxl_save_helper.c b/tools/libxl/libxl_save_helper.c
index 7514b2e..0be77c9 100644
--- a/tools/libxl/libxl_save_helper.c
+++ b/tools/libxl/libxl_save_helper.c
@@ -40,8 +40,10 @@
 #include <unistd.h>
 #include <assert.h>
 #include <inttypes.h>
+#include <fcntl.h>
 
 #include "libxl.h"
+#include "libxl_utils.h"
 
 #include "xenctrl.h"
 #include "xenguest.h"
@@ -120,6 +122,58 @@ static void *xmalloc(size_t sz)
     return r;
 }
 
+/*----- signal handling -----*/
+
+static int unwriteable_fd;
+
+static void save_signal_handler(int num)
+{
+    /*
+     * We want to be able to interrupt save.  But the code in libxc
+     * which does the actual saving is straight-through, and we need
+     * to execute its error path to put the guest back to sanity.
+     *
+     * So what we do is this: when we get the signal, we dup2
+     * the result of open("/dev/null",O_RDONLY) onto the output fd.
+     *
+     * This is guaranteed to 1. interrupt libxc's write (causing it to
+     * return short, or maybe EINTR); 2. make the next write give
+     * EBADF, so that: 3. at latest, libxc will notice when it next
+     * tries to write data and will then go into its cleanup path.
+     *
+     * We make no effort here to sanitise the resulting errors.
+     * That's libxl's job.
+     */
+    int esave = errno;
+
+    int r = dup2(unwriteable_fd, io_fd);
+    assert(r == io_fd); /* if not we can't write an xtl message because we
+                         * might end up interleaving on our control stream */
+
+    errno = esave;
+}
+
+static void setup_signals(void (*handler)(int))
+{
+    struct sigaction sa;
+    sigset_t spmask;
+    int r;
+
+    unwriteable_fd = open("/dev/null",O_RDONLY);
+    if (unwriteable_fd < 0) fail(errno,"open /dev/null for reading");
+
+    LIBXL_FILLZERO(sa);
+    sa.sa_handler = handler;
+    sigemptyset(&sa.sa_mask);
+    r = sigaction(SIGTERM, &sa, 0);
+    if (r) fail(errno,"sigaction SIGTERM failed");
+
+    sigemptyset(&spmask);
+    sigaddset(&spmask,SIGTERM);
+    r = sigprocmask(SIG_UNBLOCK,&spmask,0);
+    if (r) fail(errno,"sigprocmask unblock SIGTERM failed");
+}
+
 /*----- helper functions called by autogenerated stubs -----*/
 
 unsigned char * helper_allocbuf(int len, void *user)
@@ -229,6 +283,8 @@ int main(int argc, char **argv)
         helper_setcallbacks_save(&helper_save_callbacks, cbflags);
 
         startup("save");
+        setup_signals(save_signal_handler);
+
         r = xc_domain_save(xch, io_fd, dom, max_iters, max_factor, flags,
                            &helper_save_callbacks, hvm);
         complete(r);
@@ -254,6 +310,8 @@ int main(int argc, char **argv)
         unsigned long console_mfn = 0;
 
         startup("restore");
+        setup_signals(SIG_DFL);
+
         r = xc_domain_restore(xch, io_fd, dom, store_evtchn, &store_mfn,
                               store_domid, console_evtchn, &console_mfn,
                               console_domid, hvm, pae, superpages,
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH 29/35] libxl: ao abort: Abort libxc save/restore
  2015-06-25 17:44 [PATCH v4 00/35] libxl ao abort request (cancellation) Ian Jackson
                   ` (27 preceding siblings ...)
  2015-06-25 17:44 ` [PATCH 28/35] libxl: ao abort: Handle SIGTERM in save/restore helper Ian Jackson
@ 2015-06-25 17:44 ` Ian Jackson
  2015-06-25 17:44 ` [PATCH 30/35] libxl: ao: datacopier callback gets an rc Ian Jackson
                   ` (6 subsequent siblings)
  35 siblings, 0 replies; 59+ messages in thread
From: Ian Jackson @ 2015-06-25 17:44 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, Euan Harris, Ian Campbell

Register the the save/restore helper interface with the abort
machinery.  When we are informed that save/restore should be aborted,
we make a note of the that in our rc variable, and send the helper a
SIGTERM.  It will die in due course.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v2: New in this version of the series.
---
 tools/libxl/libxl_internal.h     |    1 +
 tools/libxl/libxl_save_callout.c |   24 ++++++++++++++++++++++++
 2 files changed, 25 insertions(+)

diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 54c9e39..4bdf93a 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -2702,6 +2702,7 @@ typedef struct libxl__save_helper_state {
     int rc;
     int completed; /* retval/errnoval valid iff completed */
     int retval, errnoval; /* from xc_domain_save / xc_domain_restore */
+    libxl__ao_abortable abrt;
     libxl__carefd *pipes[2]; /* 0 = helper's stdin, 1 = helper's stdout */
     libxl__ev_fd readable;
     libxl__ev_child child;
diff --git a/tools/libxl/libxl_save_callout.c b/tools/libxl/libxl_save_callout.c
index 1d584f1..93a884b 100644
--- a/tools/libxl/libxl_save_callout.c
+++ b/tools/libxl/libxl_save_callout.c
@@ -32,6 +32,7 @@ static void run_helper(libxl__egc *egc, libxl__save_helper_state *shs,
                        const unsigned long *argnums, int num_argnums);
 
 static void helper_failed(libxl__egc*, libxl__save_helper_state *shs, int rc);
+static void helper_stop(libxl__egc *egc, libxl__ao_abortable*, int rc);
 static void helper_stdout_readable(libxl__egc *egc, libxl__ev_fd *ev,
                                    int fd, short events, short revents);
 static void helper_exited(libxl__egc *egc, libxl__ev_child *ch,
@@ -166,9 +167,15 @@ static void run_helper(libxl__egc *egc, libxl__save_helper_state *shs,
     shs->rc = 0;
     shs->completed = 0;
     shs->pipes[0] = shs->pipes[1] = 0;
+    libxl__ao_abortable_init(&shs->abrt);
     libxl__ev_fd_init(&shs->readable);
     libxl__ev_child_init(&shs->child);
 
+    shs->abrt.ao = shs->ao;
+    shs->abrt.callback = helper_stop;
+    rc = libxl__ao_abortable_register(&shs->abrt);
+    if (rc) goto out;
+
     shs->stdin_what = GCSPRINTF("domain %"PRIu32" save/restore helper"
                                 " stdin pipe", domid);
     shs->stdout_what = GCSPRINTF("domain %"PRIu32" save/restore helper"
@@ -262,6 +269,22 @@ static void helper_failed(libxl__egc *egc, libxl__save_helper_state *shs,
     sendsig(gc, shs, SIGKILL);
 }
 
+static void helper_stop(libxl__egc *egc, libxl__ao_abortable *abrt, int rc)
+{
+    libxl__save_helper_state *shs = CONTAINER_OF(abrt, *shs, abrt);
+    STATE_AO_GC(shs->ao);
+
+    if (!libxl__ev_child_inuse(&shs->child)) {
+        helper_failed(egc, shs, rc);
+        return;
+    }
+
+    if (!shs->rc)
+        shs->rc = rc;
+
+    sendsig(gc, shs, SIGTERM);
+}
+
 static void helper_stdout_readable(libxl__egc *egc, libxl__ev_fd *ev,
                                    int fd, short events, short revents)
 {
@@ -332,6 +355,7 @@ static void helper_done(libxl__egc *egc, libxl__save_helper_state *shs)
 {
     STATE_AO_GC(shs->ao);
 
+    libxl__ao_abortable_deregister(&shs->abrt);
     libxl__ev_fd_deregister(gc, &shs->readable);
     libxl__carefd_close(shs->pipes[0]);  shs->pipes[0] = 0;
     libxl__carefd_close(shs->pipes[1]);  shs->pipes[1] = 0;
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH 30/35] libxl: ao: datacopier callback gets an rc
  2015-06-25 17:44 [PATCH v4 00/35] libxl ao abort request (cancellation) Ian Jackson
                   ` (28 preceding siblings ...)
  2015-06-25 17:44 ` [PATCH 29/35] libxl: ao abort: Abort libxc save/restore Ian Jackson
@ 2015-06-25 17:44 ` Ian Jackson
  2015-06-25 17:44 ` [PATCH 31/35] libxl: ao abort: Make datacopiers abortable Ian Jackson
                   ` (5 subsequent siblings)
  35 siblings, 0 replies; 59+ messages in thread
From: Ian Jackson @ 2015-06-25 17:44 UTC (permalink / raw)
  To: xen-devel
  Cc: Wei Liu, Ian Campbell, Andrew Cooper, Ian Jackson, Euan Harris,
	David Vrabel

libxl__datacopier_* now provides its caller's callback function with
an rc value.  This relieves the caller of the need to figure out an
appropriate rc value.

Arrange that the `other internal failure' cases now get a valid
positive errno value (EIO).

In a few places, assert that errno is nonzero before passing it to our
caller.

Extend the datacopier callback API to permit the dc to signal ABORTED.
(It doesn't actually do this yet, though.)

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: David Vrabel <david.vrabel@citrix.com>
---
v2: New in this version of the series.
---
 tools/libxl/libxl_aoutils.c    |   24 +++++++++++++-----------
 tools/libxl/libxl_bootloader.c |   20 +++++++++++---------
 tools/libxl/libxl_dom.c        |   10 +++-------
 tools/libxl/libxl_internal.h   |   18 +++++++++++-------
 4 files changed, 38 insertions(+), 34 deletions(-)

diff --git a/tools/libxl/libxl_aoutils.c b/tools/libxl/libxl_aoutils.c
index 593a575..1245828 100644
--- a/tools/libxl/libxl_aoutils.c
+++ b/tools/libxl/libxl_aoutils.c
@@ -121,10 +121,10 @@ void libxl__datacopier_kill(libxl__datacopier_state *dc)
 }
 
 static void datacopier_callback(libxl__egc *egc, libxl__datacopier_state *dc,
-                                int onwrite, int errnoval)
+                                int rc, int onwrite, int errnoval)
 {
     libxl__datacopier_kill(dc);
-    dc->callback(egc, dc, onwrite, errnoval);
+    dc->callback(egc, dc, rc, onwrite, errnoval);
 }
 
 static void datacopier_writable(libxl__egc *egc, libxl__ev_fd *ev,
@@ -142,14 +142,14 @@ static void datacopier_check_state(libxl__egc *egc, libxl__datacopier_state *dc)
             if (rc) {
                 LOG(ERROR, "unable to establish write event on %s"
                     " during copy of %s", dc->writewhat, dc->copywhat);
-                datacopier_callback(egc, dc, -1, 0);
+                datacopier_callback(egc, dc, ERROR_FAIL, -1, EIO);
                 return;
             }
         }
     } else if (!libxl__ev_fd_isregistered(&dc->toread) ||
                dc->bytes_to_read == 0) {
         /* we have had eof */
-        datacopier_callback(egc, dc, 0, 0);
+        datacopier_callback(egc, dc, 0, 0, 0);
         return;
     } else {
         /* nothing buffered, but still reading */
@@ -195,7 +195,7 @@ static int datacopier_pollhup_handled(libxl__egc *egc,
             onwrite ? dc->writewhat : dc->readwhat,
             dc->copywhat);
         libxl__datacopier_kill(dc);
-        dc->callback_pollhup(egc, dc, onwrite, -1);
+        dc->callback_pollhup(egc, dc, ERROR_FAIL, onwrite, -1);
         return 1;
     }
     return 0;
@@ -213,7 +213,7 @@ static void datacopier_readable(libxl__egc *egc, libxl__ev_fd *ev,
         LOG(ERROR,
             "unexpected poll event 0x%x (expected POLLIN and/or POLLHUP)"
             " on %s during copy of %s", revents, dc->readwhat, dc->copywhat);
-        datacopier_callback(egc, dc, -1, 0);
+        datacopier_callback(egc, dc, ERROR_FAIL, -1, EIO);
         return;
     }
     assert(revents & (POLLIN|POLLHUP));
@@ -245,20 +245,21 @@ static void datacopier_readable(libxl__egc *egc, libxl__ev_fd *ev,
         }
         if (r < 0) {
             if (errno == EINTR) continue;
+            assert(errno);
             if (errno == EWOULDBLOCK) {
                 if (revents & POLLHUP) {
                     LOG(ERROR,
                         "poll reported HUP but fd read gave EWOULDBLOCK"
                         " on %s during copy of %s",
                         dc->readwhat, dc->copywhat);
-                    datacopier_callback(egc, dc, -1, 0);
+                    datacopier_callback(egc, dc, ERROR_FAIL, -1, 0);
                     return;
                 }
                 break;
             }
             LOGE(ERROR, "error reading %s during copy of %s",
                  dc->readwhat, dc->copywhat);
-            datacopier_callback(egc, dc, 0, errno);
+            datacopier_callback(egc, dc, ERROR_FAIL, 0, errno);
             return;
         }
         if (r == 0) {
@@ -287,7 +288,7 @@ static void datacopier_readable(libxl__egc *egc, libxl__ev_fd *ev,
                 assert(ferror(dc->log));
                 assert(errno);
                 LOGE(ERROR, "error logging %s", dc->copywhat);
-                datacopier_callback(egc, dc, 0, errno);
+                datacopier_callback(egc, dc, ERROR_FAIL, 0, errno);
                 return;
             }
         }
@@ -315,7 +316,7 @@ static void datacopier_writable(libxl__egc *egc, libxl__ev_fd *ev,
     if (revents & ~POLLOUT) {
         LOG(ERROR, "unexpected poll event 0x%x (should be POLLOUT)"
             " on %s during copy of %s", revents, dc->writewhat, dc->copywhat);
-        datacopier_callback(egc, dc, -1, 0);
+        datacopier_callback(egc, dc, ERROR_FAIL, -1, EIO);
         return;
     }
     assert(revents & POLLOUT);
@@ -332,9 +333,10 @@ static void datacopier_writable(libxl__egc *egc, libxl__ev_fd *ev,
         if (r < 0) {
             if (errno == EINTR) continue;
             if (errno == EWOULDBLOCK) break;
+            assert(errno);
             LOGE(ERROR, "error writing to %s during copy of %s",
                  dc->writewhat, dc->copywhat);
-            datacopier_callback(egc, dc, 1, errno);
+            datacopier_callback(egc, dc, ERROR_FAIL, 1, errno);
             return;
         }
         assert(r > 0);
diff --git a/tools/libxl/libxl_bootloader.c b/tools/libxl/libxl_bootloader.c
index 6db958e..95dde98 100644
--- a/tools/libxl/libxl_bootloader.c
+++ b/tools/libxl/libxl_bootloader.c
@@ -30,9 +30,9 @@
 
 static void bootloader_gotptys(libxl__egc *egc, libxl__openpty_state *op);
 static void bootloader_keystrokes_copyfail(libxl__egc *egc,
-       libxl__datacopier_state *dc, int onwrite, int errnoval);
+       libxl__datacopier_state *dc, int rc, int onwrite, int errnoval);
 static void bootloader_display_copyfail(libxl__egc *egc,
-       libxl__datacopier_state *dc, int onwrite, int errnoval);
+       libxl__datacopier_state *dc, int rc, int onwrite, int errnoval);
 static void bootloader_domaindeath(libxl__egc*, libxl__domaindeathcheck *dc,
                                    int rc);
 static void bootloader_finished(libxl__egc *egc, libxl__ev_child *child,
@@ -579,10 +579,10 @@ static void bootloader_gotptys(libxl__egc *egc, libxl__openpty_state *op)
 
 /* perhaps one of these will be called, but perhaps not */
 static void bootloader_copyfail(libxl__egc *egc, const char *which,
-        libxl__bootloader_state *bl, int ondisplay, int onwrite, int errnoval)
+        libxl__bootloader_state *bl, int ondisplay,
+        int rc, int onwrite, int errnoval)
 {
     STATE_AO_GC(bl->ao);
-    int rc = ERROR_FAIL;
 
     if (errnoval==-1) {
         /* POLLHUP */
@@ -593,22 +593,24 @@ static void bootloader_copyfail(libxl__egc *egc, const char *which,
             LOG(ERROR, "unexpected POLLHUP on %s", which);
         }
     }
-    if (!onwrite && !errnoval)
+    if (!rc) {
         LOG(ERROR, "unexpected eof copying %s", which);
+        rc = ERROR_FAIL;
+    }
 
     bootloader_stop(egc, bl, rc);
 }
 static void bootloader_keystrokes_copyfail(libxl__egc *egc,
-       libxl__datacopier_state *dc, int onwrite, int errnoval)
+       libxl__datacopier_state *dc, int rc, int onwrite, int errnoval)
 {
     libxl__bootloader_state *bl = CONTAINER_OF(dc, *bl, keystrokes);
-    bootloader_copyfail(egc, "bootloader input", bl, 0, onwrite, errnoval);
+    bootloader_copyfail(egc, "bootloader input", bl, 0, rc,onwrite,errnoval);
 }
 static void bootloader_display_copyfail(libxl__egc *egc,
-       libxl__datacopier_state *dc, int onwrite, int errnoval)
+       libxl__datacopier_state *dc, int rc, int onwrite, int errnoval)
 {
     libxl__bootloader_state *bl = CONTAINER_OF(dc, *bl, display);
-    bootloader_copyfail(egc, "bootloader output", bl, 1, onwrite, errnoval);
+    bootloader_copyfail(egc, "bootloader output", bl, 1, rc,onwrite,errnoval);
 }
 
 static void bootloader_domaindeath(libxl__egc *egc,
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index b081ef7..bdc0465 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -2158,7 +2158,7 @@ out:
 }
 
 static void save_device_model_datacopier_done(libxl__egc *egc,
-     libxl__datacopier_state *dc, int onwrite, int errnoval);
+     libxl__datacopier_state *dc, int rc, int onwrite, int errnoval);
 
 void libxl__domain_save_device_model(libxl__egc *egc,
                                      libxl__domain_suspend_state *dss,
@@ -2218,11 +2218,11 @@ void libxl__domain_save_device_model(libxl__egc *egc,
     return;
 
  out:
-    save_device_model_datacopier_done(egc, dc, -1, 0);
+    save_device_model_datacopier_done(egc, dc, rc, -1, EIO);
 }
 
 static void save_device_model_datacopier_done(libxl__egc *egc,
-     libxl__datacopier_state *dc, int onwrite, int errnoval)
+     libxl__datacopier_state *dc, int our_rc, int onwrite, int errnoval)
 {
     libxl__domain_suspend_state *dss =
         CONTAINER_OF(dc, *dss, save_dm_datacopier);
@@ -2230,14 +2230,10 @@ static void save_device_model_datacopier_done(libxl__egc *egc,
 
     /* Convenience aliases */
     const char *const filename = dss->dm_savefile;
-    int our_rc = 0;
     int rc;
 
     libxl__datacopier_kill(dc);
 
-    if (onwrite || errnoval)
-        our_rc = ERROR_FAIL;
-
     if (dc->readfd >= 0) {
         close(dc->readfd);
         dc->readfd = -1;
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 4bdf93a..47e3fbd 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -2620,17 +2620,21 @@ _hidden void libxl__device_disk_local_initiate_detach(libxl__egc *egc,
 typedef struct libxl__datacopier_state libxl__datacopier_state;
 typedef struct libxl__datacopier_buf libxl__datacopier_buf;
 
-/* onwrite==1 means failure happened when writing, logged, errnoval is valid
- * onwrite==0 means failure happened when reading
- *     errnoval==0 means we got eof and all data was written
- *     errnoval!=0 means we had a read error, logged
- * onwrite==-1 means some other internal failure, errnoval not valid, logged
- * If we get POLLHUP, we call callback_pollhup(..., onwrite, -1);
+/* onwrite==1 means problem happened when writing
+ *     rc==FAIL    errnoval >0    we had a write error, logged
+ * onwrite==0 means problem happened when reading
+ *     rc==0       errnoval==0    we got eof and all data was written
+ *     rc==FAIL    errnoval >0    we had a read error, logged
+ * onwrite==-1 means some other internal problem
+ *     rc==FAIL    errnoval==EIO  some other internal failure, logged
+ *     rc==ABORTED errnoval==0    abort requested, not logged
+ * If we get POLLHUP, we call callback_pollhup with
+ *     rc==FAIL    errnoval==-1   POLLHUP signalled
  * or if callback_pollhup==0 this is treated as eof (if POLLIN|POLLHUP
  * on the reading fd) or an internal failure (otherwise), as above.
  * In all cases copier is killed before calling this callback */
 typedef void libxl__datacopier_callback(libxl__egc *egc,
-     libxl__datacopier_state *dc, int onwrite, int errnoval);
+     libxl__datacopier_state *dc, int rc, int onwrite, int errnoval);
 
 struct libxl__datacopier_buf {
     /* private to datacopier */
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH 31/35] libxl: ao abort: Make datacopiers abortable
  2015-06-25 17:44 [PATCH v4 00/35] libxl ao abort request (cancellation) Ian Jackson
                   ` (29 preceding siblings ...)
  2015-06-25 17:44 ` [PATCH 30/35] libxl: ao: datacopier callback gets an rc Ian Jackson
@ 2015-06-25 17:44 ` Ian Jackson
  2015-06-25 17:44 ` [PATCH 32/35] libxl: Fix libxl__get_domid error reporting Ian Jackson
                   ` (4 subsequent siblings)
  35 siblings, 0 replies; 59+ messages in thread
From: Ian Jackson @ 2015-06-25 17:44 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, Euan Harris, Ian Campbell

libxl__datacopier_* can now actually generate a callback with
rc==ABORTED.

This provides abort support during some corner cases, including (at
least) copying the device model data during the end of domain save.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v2: New in this version of the series.
---
 tools/libxl/libxl_aoutils.c  |   16 ++++++++++++++++
 tools/libxl/libxl_internal.h |    1 +
 2 files changed, 17 insertions(+)

diff --git a/tools/libxl/libxl_aoutils.c b/tools/libxl/libxl_aoutils.c
index 1245828..0931eee 100644
--- a/tools/libxl/libxl_aoutils.c
+++ b/tools/libxl/libxl_aoutils.c
@@ -103,6 +103,7 @@ static void xswait_report_error(libxl__egc *egc, libxl__xswait_state *xswa,
 void libxl__datacopier_init(libxl__datacopier_state *dc)
 {
     assert(dc->ao);
+    libxl__ao_abortable_init(&dc->abrt);
     libxl__ev_fd_init(&dc->toread);
     libxl__ev_fd_init(&dc->towrite);
     LIBXL_TAILQ_INIT(&dc->bufs);
@@ -113,6 +114,7 @@ void libxl__datacopier_kill(libxl__datacopier_state *dc)
     STATE_AO_GC(dc->ao);
     libxl__datacopier_buf *buf, *tbuf;
 
+    libxl__ao_abortable_deregister(&dc->abrt);
     libxl__ev_fd_deregister(gc, &dc->toread);
     libxl__ev_fd_deregister(gc, &dc->towrite);
     LIBXL_TAILQ_FOREACH_SAFE(buf, &dc->bufs, entry, tbuf)
@@ -201,6 +203,15 @@ static int datacopier_pollhup_handled(libxl__egc *egc,
     return 0;
 }
 
+static void datacopier_abort(libxl__egc *egc, libxl__ao_abortable *abrt,
+                             int rc)
+{
+    libxl__datacopier_state *dc = CONTAINER_OF(abrt, *dc, abrt);
+    STATE_AO_GC(dc->ao);
+
+    datacopier_callback(egc, dc, rc, -1, 0);
+}
+
 static void datacopier_readable(libxl__egc *egc, libxl__ev_fd *ev,
                                 int fd, short events, short revents) {
     libxl__datacopier_state *dc = CONTAINER_OF(ev, *dc, toread);
@@ -359,6 +370,11 @@ int libxl__datacopier_start(libxl__datacopier_state *dc)
     assert(dc->readfd >= 0 || dc->writefd >= 0);
     assert(!(dc->readbuf && dc->bytes_to_read == -1));
 
+    dc->abrt.ao = ao;
+    dc->abrt.callback = datacopier_abort;
+    rc = libxl__ao_abortable_register(&dc->abrt);
+    if (rc) goto out;
+
     if (dc->readfd >= 0) {
         rc = libxl__ev_fd_register(gc, &dc->toread, datacopier_readable,
                                    dc->readfd, POLLIN);
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 47e3fbd..4126102 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -2657,6 +2657,7 @@ struct libxl__datacopier_state {
                       fd. The buffer should be at least as large as the
                       bytes_to_read parameter, which should not be -1. */
     /* remaining fields are private to datacopier */
+    libxl__ao_abortable abrt;
     libxl__ev_fd toread, towrite;
     ssize_t used;
     LIBXL_TAILQ_HEAD(libxl__datacopier_bufs, libxl__datacopier_buf) bufs;
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH 32/35] libxl: Fix libxl__get_domid error reporting
  2015-06-25 17:44 [PATCH v4 00/35] libxl ao abort request (cancellation) Ian Jackson
                   ` (30 preceding siblings ...)
  2015-06-25 17:44 ` [PATCH 31/35] libxl: ao abort: Make datacopiers abortable Ian Jackson
@ 2015-06-25 17:44 ` Ian Jackson
  2015-06-26 14:54   ` Wei Liu
  2015-06-25 17:44 ` [PATCH 33/35] libxl: spawn: Always debug log middle child process death Ian Jackson
                   ` (3 subsequent siblings)
  35 siblings, 1 reply; 59+ messages in thread
From: Ian Jackson @ 2015-06-25 17:44 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, Euan Harris, Ian Campbell

Make it log something if the xenstore path does not exist.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
---
v4: New patch in this version of the series.
---
 tools/libxl/libxl.c |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 54b398b..c5aad11 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -1962,8 +1962,10 @@ int libxl__get_domid(libxl__gc *gc, uint32_t *domid)
     const char *xs_domid;
 
     rc = libxl__xs_read_checked(gc, XBT_NULL, DOMID_XS_PATH, &xs_domid);
-    if (rc || !xs_domid) {
-        rc = rc ? rc : ERROR_FAIL;
+    if (rc) goto out;
+    if (!xs_domid) {
+        LOG(ERROR, "failed to get own domid (%s)", DOMID_XS_PATH);
+        rc = ERROR_FAIL;
         goto out;
     }
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH 33/35] libxl: spawn: Always debug log middle child process death
  2015-06-25 17:44 [PATCH v4 00/35] libxl ao abort request (cancellation) Ian Jackson
                   ` (31 preceding siblings ...)
  2015-06-25 17:44 ` [PATCH 32/35] libxl: Fix libxl__get_domid error reporting Ian Jackson
@ 2015-06-25 17:44 ` Ian Jackson
  2015-06-26 14:55   ` Wei Liu
  2015-06-25 17:44 ` [PATCH 34/35] libxl: libxl__ev_child pass actual pid to callback Ian Jackson
                   ` (2 subsequent siblings)
  35 siblings, 1 reply; 59+ messages in thread
From: Ian Jackson @ 2015-06-25 17:44 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, Euan Harris, Ian Campbell

Otherwise the debug log is strangely devoid of an explanation for the
spawn completing.

We decorate `what', as otherwise the logged message is rather alarming
(especially if the death is due to us sending SIGKILL, which even
happens on the success path).

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
---
v4: New patch in this version of the series.
---
 tools/libxl/libxl_exec.c |    2 ++
 1 file changed, 2 insertions(+)

diff --git a/tools/libxl/libxl_exec.c b/tools/libxl/libxl_exec.c
index 85cbde0..ecb30cf 100644
--- a/tools/libxl/libxl_exec.c
+++ b/tools/libxl/libxl_exec.c
@@ -411,6 +411,8 @@ static void spawn_middle_death(libxl__egc *egc, libxl__ev_child *childw,
         ((WIFEXITED(status) && WEXITSTATUS(status)==0) ||
          (WIFSIGNALED(status) && WTERMSIG(status)==SIGKILL))) {
         /* as expected */
+        const char *what = GCSPRINTF("%s (dying as expected)", ss->what);
+        libxl_report_child_exitstatus(CTX, XTL_DEBUG, what, pid, status);
     } else if (!WIFEXITED(status)) {
         int loglevel = ss->detaching ? XTL_WARN : XTL_ERROR;
         const char *what =
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH 34/35] libxl: libxl__ev_child pass actual pid to callback
  2015-06-25 17:44 [PATCH v4 00/35] libxl ao abort request (cancellation) Ian Jackson
                   ` (32 preceding siblings ...)
  2015-06-25 17:44 ` [PATCH 33/35] libxl: spawn: Always debug log middle child process death Ian Jackson
@ 2015-06-25 17:44 ` Ian Jackson
  2015-06-26 14:56   ` Wei Liu
  2015-06-25 17:44 ` [PATCH 35/35] libxl: When save/restore helper dies, do not overwrite rc Ian Jackson
  2015-06-26 10:19 ` [PATCH v4 00/35] libxl ao abort request (cancellation) Ian Jackson
  35 siblings, 1 reply; 59+ messages in thread
From: Ian Jackson @ 2015-06-25 17:44 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, Euan Harris, Ian Campbell

The callbacks actually ignore this except for logging, but we should
log the correct pid.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
---
v4: New patch in this version of the series.
---
 tools/libxl/libxl_fork.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tools/libxl/libxl_fork.c b/tools/libxl/libxl_fork.c
index 144208a..4486687 100644
--- a/tools/libxl/libxl_fork.c
+++ b/tools/libxl/libxl_fork.c
@@ -447,9 +447,10 @@ static int perhaps_sigchld_needed(libxl__gc *gc, bool creating)
 static void childproc_reaped_ours(libxl__egc *egc, libxl__ev_child *ch,
                                  int status)
 {
+    pid_t pid = ch->pid;
     LIBXL_LIST_REMOVE(ch, entry);
     ch->pid = -1;
-    ch->callback(egc, ch, ch->pid, status);
+    ch->callback(egc, ch, pid, status);
 }
 
 static int childproc_reaped(libxl__egc *egc, pid_t pid, int status)
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH 35/35] libxl: When save/restore helper dies, do not overwrite rc
  2015-06-25 17:44 [PATCH v4 00/35] libxl ao abort request (cancellation) Ian Jackson
                   ` (33 preceding siblings ...)
  2015-06-25 17:44 ` [PATCH 34/35] libxl: libxl__ev_child pass actual pid to callback Ian Jackson
@ 2015-06-25 17:44 ` Ian Jackson
  2015-06-26 14:56   ` Wei Liu
  2015-06-26 10:19 ` [PATCH v4 00/35] libxl ao abort request (cancellation) Ian Jackson
  35 siblings, 1 reply; 59+ messages in thread
From: Ian Jackson @ 2015-06-25 17:44 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, Euan Harris, Ian Campbell

If we already have an rc (eg from ao abort), keep it.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
---
v4: New patch in this version of the series.
---
 tools/libxl/libxl_save_callout.c |   13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/tools/libxl/libxl_save_callout.c b/tools/libxl/libxl_save_callout.c
index 93a884b..087c2d5 100644
--- a/tools/libxl/libxl_save_callout.c
+++ b/tools/libxl/libxl_save_callout.c
@@ -332,19 +332,22 @@ static void helper_exited(libxl__egc *egc, libxl__ev_child *ch,
 
     if (status) {
         libxl_report_child_exitstatus(CTX, XTL_ERROR, what, pid, status);
-        shs->rc = ERROR_FAIL;
+        if (!shs->rc)
+            shs->rc = ERROR_FAIL;
     }
 
     if (shs->need_results) {
-        if (!shs->rc)
+        if (!shs->rc) {
             LOG(ERROR,"%s exited without providing results",what);
-        shs->rc = ERROR_FAIL;
+            shs->rc = ERROR_FAIL;
+        }
     }
 
     if (!shs->completed) {
-        if (!shs->rc)
+        if (!shs->rc) {
             LOG(ERROR,"%s exited without signaling completion",what);
-        shs->rc = ERROR_FAIL;
+            shs->rc = ERROR_FAIL;
+        }
     }
 
     helper_done(egc, shs);
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* Re: [PATCH v4 00/35] libxl ao abort request (cancellation)
  2015-06-25 17:44 [PATCH v4 00/35] libxl ao abort request (cancellation) Ian Jackson
                   ` (34 preceding siblings ...)
  2015-06-25 17:44 ` [PATCH 35/35] libxl: When save/restore helper dies, do not overwrite rc Ian Jackson
@ 2015-06-26 10:19 ` Ian Jackson
  2015-06-26 13:20   ` NUMA config handling bug on reuse of libxl domain config Ian Jackson
  2015-06-26 15:40   ` [PATCH v4 00/35] libxl ao abort request (cancellation) Ian Jackson
  35 siblings, 2 replies; 59+ messages in thread
From: Ian Jackson @ 2015-06-26 10:19 UTC (permalink / raw)
  To: xen-devel, Ian Campbell, Euan Harris, Wei Liu

Ian Jackson writes ("[PATCH v4 00/35] libxl ao abort request (cancellation)"):
> This is v4 of my series to provide support for abandoning a
> long-running libxl operation.

I've pushed it here

  http://xenbits.xen.org/gitweb/?p=people/iwj/xen.git;a=summary
  git://xenbits.xen.org/people/iwj/xen.git
     base.ao-abort.v4..wip.ao-abort.v4

(base.ao-abort is yesterday's staging).

Ian.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* NUMA config handling bug on reuse of libxl domain config
  2015-06-26 10:19 ` [PATCH v4 00/35] libxl ao abort request (cancellation) Ian Jackson
@ 2015-06-26 13:20   ` Ian Jackson
  2015-06-26 15:57     ` Dario Faggioli
  2015-06-26 15:40   ` [PATCH v4 00/35] libxl ao abort request (cancellation) Ian Jackson
  1 sibling, 1 reply; 59+ messages in thread
From: Ian Jackson @ 2015-06-26 13:20 UTC (permalink / raw)
  To: Dario Faggioli; +Cc: Wei Liu, xen-devel, Euan Harris, Ian Campbell

Ian Jackson writes ("Re: [PATCH v4 00/35] libxl ao abort request (cancellation)"):
> Ian Jackson writes ("[PATCH v4 00/35] libxl ao abort request (cancellation)"):
> > This is v4 of my series to provide support for abandoning a
> > long-running libxl operation.
> 
> I've pushed it here
> 
>   http://xenbits.xen.org/gitweb/?p=people/iwj/xen.git;a=summary
>   git://xenbits.xen.org/people/iwj/xen.git
>      base.ao-abort.v4..wip.ao-abort.v4
> 
> (base.ao-abort is yesterday's staging).

One of the tests failed until I put a workaround in.

The test program:
 - creates and populates a libxl_domain_config
 - calls libxl_domain_create_new
 - destroys the resulting domain with libxl_domain_destroy
 - calls libxl_domain_create_new again with the same config

The result is:

 libxl: error: libxl_dom.c:343:libxl__build_pre: Can run NUMA placement
 only if the domain does not have any NUMA node affinity set already

 libxl: error: libxl_create.c:1174:domcreate_rebuild_done: cannot
 (re-)build domain: -6

My repro case is to use the wip.ao-abort.v4 series, above, with the
following test suite:

  http://xenbits.xen.org/gitweb/?p=people/iwj/ring3-xl-test.git;a=summary
  git://xenbits.xen.org/people/iwj/ring3-xl-test.git
     #t.numa-bug

Ian.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH 08/35] libxl: Rename AO_ABORT to AO_CREATE_FAIL
  2015-06-25 17:44 ` [PATCH 08/35] libxl: Rename AO_ABORT to AO_CREATE_FAIL Ian Jackson
@ 2015-06-26 14:10   ` Wei Liu
  0 siblings, 0 replies; 59+ messages in thread
From: Wei Liu @ 2015-06-26 14:10 UTC (permalink / raw)
  To: Ian Jackson; +Cc: Wei Liu, xen-devel, Euan Harris, Ian Campbell

On Thu, Jun 25, 2015 at 06:44:08PM +0100, Ian Jackson wrote:
> We are going to introduce a new meaning for aborting an ao, so rename
> this to avoid confusion.
> 
> Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>

Acked-by: Wei Liu <wei.liu2@citrix.com>

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH 09/35] libxl: Change some log messages to say `abandoning' rather than `aborting'
  2015-06-25 17:44 ` [PATCH 09/35] libxl: Change some log messages to say `abandoning' rather than `aborting' Ian Jackson
@ 2015-06-26 14:12   ` Wei Liu
  0 siblings, 0 replies; 59+ messages in thread
From: Wei Liu @ 2015-06-26 14:12 UTC (permalink / raw)
  To: Ian Jackson; +Cc: Wei Liu, xen-devel, Euan Harris, Ian Campbell

On Thu, Jun 25, 2015 at 06:44:09PM +0100, Ian Jackson wrote:
> We are going to introduce application-requested aborts of (ao)
> operations, but these suspend failures are something different.
> Reword to avoid confusion.
> 
> Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>

Acked-by: Wei Liu <wei.liu2@citrix.com>

> ---
>  tools/libxl/libxl_dom.c |    6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
> index 54908fb..ccbcb6e 100644
> --- a/tools/libxl/libxl_dom.c
> +++ b/tools/libxl/libxl_dom.c
> @@ -1242,7 +1242,7 @@ static void domain_suspend_switch_qemu_xen_traditional_logdirty
>      return;
>  
>   out:
> -    LOG(ERROR,"logdirty switch failed (rc=%d), aborting suspend",rc);
> +    LOG(ERROR,"logdirty switch failed (rc=%d), abandoning suspend",rc);
>      libxl__xs_transaction_abort(gc, &t);
>      switch_logdirty_done(egc,dss,rc);
>  }
> @@ -1260,7 +1260,7 @@ static void domain_suspend_switch_qemu_xen_logdirty
>      if (!rc) {
>          libxl__xc_domain_saverestore_async_callback_done(egc, shs, 0);
>      } else {
> -        LOG(ERROR,"logdirty switch failed (rc=%d), aborting suspend",rc);
> +        LOG(ERROR,"logdirty switch failed (rc=%d), abandoning suspend",rc);
>          dss->rc = rc;
>          libxl__xc_domain_saverestore_async_callback_done(egc, shs, -1);
>      }
> @@ -1283,7 +1283,7 @@ void libxl__domain_suspend_common_switch_qemu_logdirty
>          break;
>      default:
>          LOG(ERROR,"logdirty switch failed"
> -            ", no valid device model version found, aborting suspend");
> +            ", no valid device model version found, abandoning suspend");
>          dss->rc = ERROR_FAIL;
>          libxl__xc_domain_saverestore_async_callback_done(egc, shs, -1);
>      }
> -- 
> 1.7.10.4

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH 10/35] libxl: Change an internal comment to say `bail' rather than `abort'
  2015-06-25 17:44 ` [PATCH 10/35] libxl: Change an internal comment to say `bail' rather than `abort' Ian Jackson
@ 2015-06-26 14:13   ` Wei Liu
  0 siblings, 0 replies; 59+ messages in thread
From: Wei Liu @ 2015-06-26 14:13 UTC (permalink / raw)
  To: Ian Jackson; +Cc: Wei Liu, xen-devel, Euan Harris, Ian Campbell

On Thu, Jun 25, 2015 at 06:44:10PM +0100, Ian Jackson wrote:
> Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>

Acked-by: Wei Liu <wei.liu2@citrix.com>

> ---
>  tools/libxl/libxl_internal.h |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
> index cd1dfc3..b445a51 100644
> --- a/tools/libxl/libxl_internal.h
> +++ b/tools/libxl/libxl_internal.h
> @@ -2380,7 +2380,7 @@ struct libxl__multidev {
>   *          any stale entry
>   *       for loop -- xs transaction
>   *           open xs transaction
> - *           check device existence, abort if it exists
> + *           check device existence, bail if it exists
>   *           write in-memory json config to disk
>   *           commit xs transaction
>   *       end for loop
> -- 
> 1.7.10.4

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH 11/35] libxl: New error codes ABORTED etc.
  2015-06-25 17:44 ` [PATCH 11/35] libxl: New error codes ABORTED etc Ian Jackson
@ 2015-06-26 14:13   ` Wei Liu
  0 siblings, 0 replies; 59+ messages in thread
From: Wei Liu @ 2015-06-26 14:13 UTC (permalink / raw)
  To: Ian Jackson; +Cc: Wei Liu, xen-devel, Euan Harris, Ian Campbell

On Thu, Jun 25, 2015 at 06:44:11PM +0100, Ian Jackson wrote:
> We introduce ERROR_ABORTED now, so that we can write code to handle
> it, and decreee that functions might return it, even though currently
> there is nowhere where this error is generated.
> 
> While we're here, provide ERROR_NOTFOUND which will also be used
> later, but only as part of the public API.
> 
> Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>

Acked-by: Wei Liu <wei.liu2@citrix.com>

> ---
> v4: CANCELLED renamed to ABORTED.
>     No longer introduce ERROR_NOTIMPLEMENTED.
> v2: Rebase means new errors have bigger (more negative) numbers.
> ---
>  tools/libxl/libxl_types.idl |    2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
> index 23f27d4..9558d52 100644
> --- a/tools/libxl/libxl_types.idl
> +++ b/tools/libxl/libxl_types.idl
> @@ -65,6 +65,8 @@ libxl_error = Enumeration("error", [
>      (-19, "REMUS_DEVICE_NOT_SUPPORTED"),
>      (-20, "VNUMA_CONFIG_INVALID"),
>      (-21, "DOMAIN_NOTFOUND"),
> +    (-22, "ABORTED"),
> +    (-23, "NOTFOUND"),
>      ], value_namespace = "")
>  
>  libxl_domain_type = Enumeration("domain_type", [
> -- 
> 1.7.10.4

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH 15/35] libxl: spawn: Preserve rc in error path
  2015-06-25 17:44 ` [PATCH 15/35] libxl: spawn: Preserve rc in error path Ian Jackson
@ 2015-06-26 14:28   ` Wei Liu
  0 siblings, 0 replies; 59+ messages in thread
From: Wei Liu @ 2015-06-26 14:28 UTC (permalink / raw)
  To: Ian Jackson; +Cc: Wei Liu, xen-devel, Euan Harris, Ian Campbell

On Thu, Jun 25, 2015 at 06:44:15PM +0100, Ian Jackson wrote:
> Make spawn provide an rc to its caller, and either pass it through
> from the timeout callback, or invent ERROR_FAIL, as applicable.
> 
> Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>

Acked-by: Wei Liu <wei.liu2@citrix.com>

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH 19/35] libxl: ao: Provide manip_refcnt
  2015-06-25 17:44 ` [PATCH 19/35] libxl: ao: Provide manip_refcnt Ian Jackson
@ 2015-06-26 14:33   ` Wei Liu
  2015-06-26 15:02     ` Ian Jackson
  0 siblings, 1 reply; 59+ messages in thread
From: Wei Liu @ 2015-06-26 14:33 UTC (permalink / raw)
  To: Ian Jackson; +Cc: Wei Liu, xen-devel, Euan Harris, Ian Campbell

On Thu, Jun 25, 2015 at 06:44:19PM +0100, Ian Jackson wrote:
> Previously we used in_initiator to stop the ao being freed while we
> were still in the initiator function (which would result in the
> initiator's call to lixl__ao_inprogress accessing the ao after it had
> been freed).

Typo lixl.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH 20/35] libxl: ao abort: Provide public ao abort request API
  2015-06-25 17:44 ` [PATCH 20/35] libxl: ao abort: Provide public ao abort request API Ian Jackson
@ 2015-06-26 14:47   ` Wei Liu
  2015-06-26 15:05     ` Ian Jackson
  0 siblings, 1 reply; 59+ messages in thread
From: Wei Liu @ 2015-06-26 14:47 UTC (permalink / raw)
  To: Ian Jackson; +Cc: Wei Liu, xen-devel, Euan Harris, Ian Campbell

On Thu, Jun 25, 2015 at 06:44:20PM +0100, Ian Jackson wrote:
[...]
> + * It is possible to abort even an operation which is being performed
> + * synchronously, but since in that case how==NULL you had better only
> + * have one such operation, because it is not possible to tell them
> + * apart.  (And, if you want to do this, obviously the abort would
> + * have to be requested on a different thread.)
> + */
> +int libxl_ao_abort(libxl_ctx *ctx, const libxl_asyncop_how *how)
> +                   LIBXL_EXTERNAL_CALLERS_ONLY;
> +
> +
>  #define LIBXL_VERSION 0
>  
[...]
> +
> +int libxl_ao_abort(libxl_ctx *ctx, const libxl_asyncop_how *how)
> +{
> +    libxl__ao *search;
> +    libxl__ctx_lock(ctx);
> +    int rc;
> +
> +    LIBXL_LIST_FOREACH(search, &ctx->aos_inprogress, inprogress_entry) {
> +        if (how) {
> +            /* looking for ao to be reported by callback or event */
> +            if (search->poller)
> +                /* sync */
> +                continue;
> +            if (how->callback != search->how.callback)
> +                continue;
> +            if (how->callback
> +                ? (how->u.for_callback != search->how.u.for_callback)
> +                : (how->u.for_event != search->how.u.for_event))
> +                continue;
> +        } else {
> +            /* looking for synchronous call */
> +            if (!search->poller)
> +                /* async */
> +                continue;
> +        }

I interpreted the doc more like "it aborts all sync operations".

This function only aborts the first synchronous operation it finds.
The doc didn't make this clear.

Wei.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH 32/35] libxl: Fix libxl__get_domid error reporting
  2015-06-25 17:44 ` [PATCH 32/35] libxl: Fix libxl__get_domid error reporting Ian Jackson
@ 2015-06-26 14:54   ` Wei Liu
  0 siblings, 0 replies; 59+ messages in thread
From: Wei Liu @ 2015-06-26 14:54 UTC (permalink / raw)
  To: Ian Jackson; +Cc: Wei Liu, xen-devel, Euan Harris, Ian Campbell

On Thu, Jun 25, 2015 at 06:44:32PM +0100, Ian Jackson wrote:
> Make it log something if the xenstore path does not exist.
> 
> Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>

Acked-by: Wei Liu <wei.liu2@citrix.com>

> ---
> v4: New patch in this version of the series.
> ---
>  tools/libxl/libxl.c |    6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
> index 54b398b..c5aad11 100644
> --- a/tools/libxl/libxl.c
> +++ b/tools/libxl/libxl.c
> @@ -1962,8 +1962,10 @@ int libxl__get_domid(libxl__gc *gc, uint32_t *domid)
>      const char *xs_domid;
>  
>      rc = libxl__xs_read_checked(gc, XBT_NULL, DOMID_XS_PATH, &xs_domid);
> -    if (rc || !xs_domid) {
> -        rc = rc ? rc : ERROR_FAIL;
> +    if (rc) goto out;
> +    if (!xs_domid) {
> +        LOG(ERROR, "failed to get own domid (%s)", DOMID_XS_PATH);
> +        rc = ERROR_FAIL;
>          goto out;
>      }
>  
> -- 
> 1.7.10.4

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH 33/35] libxl: spawn: Always debug log middle child process death
  2015-06-25 17:44 ` [PATCH 33/35] libxl: spawn: Always debug log middle child process death Ian Jackson
@ 2015-06-26 14:55   ` Wei Liu
  0 siblings, 0 replies; 59+ messages in thread
From: Wei Liu @ 2015-06-26 14:55 UTC (permalink / raw)
  To: Ian Jackson; +Cc: Wei Liu, xen-devel, Euan Harris, Ian Campbell

On Thu, Jun 25, 2015 at 06:44:33PM +0100, Ian Jackson wrote:
> Otherwise the debug log is strangely devoid of an explanation for the
> spawn completing.
> 
> We decorate `what', as otherwise the logged message is rather alarming
> (especially if the death is due to us sending SIGKILL, which even
> happens on the success path).
> 
> Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>

Acked-by: Wei Liu <wei.liu2@citrix.com>

> ---
> v4: New patch in this version of the series.
> ---
>  tools/libxl/libxl_exec.c |    2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/tools/libxl/libxl_exec.c b/tools/libxl/libxl_exec.c
> index 85cbde0..ecb30cf 100644
> --- a/tools/libxl/libxl_exec.c
> +++ b/tools/libxl/libxl_exec.c
> @@ -411,6 +411,8 @@ static void spawn_middle_death(libxl__egc *egc, libxl__ev_child *childw,
>          ((WIFEXITED(status) && WEXITSTATUS(status)==0) ||
>           (WIFSIGNALED(status) && WTERMSIG(status)==SIGKILL))) {
>          /* as expected */
> +        const char *what = GCSPRINTF("%s (dying as expected)", ss->what);
> +        libxl_report_child_exitstatus(CTX, XTL_DEBUG, what, pid, status);
>      } else if (!WIFEXITED(status)) {
>          int loglevel = ss->detaching ? XTL_WARN : XTL_ERROR;
>          const char *what =
> -- 
> 1.7.10.4

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH 34/35] libxl: libxl__ev_child pass actual pid to callback
  2015-06-25 17:44 ` [PATCH 34/35] libxl: libxl__ev_child pass actual pid to callback Ian Jackson
@ 2015-06-26 14:56   ` Wei Liu
  0 siblings, 0 replies; 59+ messages in thread
From: Wei Liu @ 2015-06-26 14:56 UTC (permalink / raw)
  To: Ian Jackson; +Cc: Wei Liu, xen-devel, Euan Harris, Ian Campbell

On Thu, Jun 25, 2015 at 06:44:34PM +0100, Ian Jackson wrote:
> The callbacks actually ignore this except for logging, but we should
> log the correct pid.
> 
> Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>

Acked-by: Wei Liu <wei.liu2@citrix.com>

> ---
> v4: New patch in this version of the series.
> ---
>  tools/libxl/libxl_fork.c |    3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/libxl/libxl_fork.c b/tools/libxl/libxl_fork.c
> index 144208a..4486687 100644
> --- a/tools/libxl/libxl_fork.c
> +++ b/tools/libxl/libxl_fork.c
> @@ -447,9 +447,10 @@ static int perhaps_sigchld_needed(libxl__gc *gc, bool creating)
>  static void childproc_reaped_ours(libxl__egc *egc, libxl__ev_child *ch,
>                                   int status)
>  {
> +    pid_t pid = ch->pid;
>      LIBXL_LIST_REMOVE(ch, entry);
>      ch->pid = -1;
> -    ch->callback(egc, ch, ch->pid, status);
> +    ch->callback(egc, ch, pid, status);
>  }
>  
>  static int childproc_reaped(libxl__egc *egc, pid_t pid, int status)
> -- 
> 1.7.10.4

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH 35/35] libxl: When save/restore helper dies, do not overwrite rc
  2015-06-25 17:44 ` [PATCH 35/35] libxl: When save/restore helper dies, do not overwrite rc Ian Jackson
@ 2015-06-26 14:56   ` Wei Liu
  0 siblings, 0 replies; 59+ messages in thread
From: Wei Liu @ 2015-06-26 14:56 UTC (permalink / raw)
  To: Ian Jackson; +Cc: Wei Liu, xen-devel, Euan Harris, Ian Campbell

On Thu, Jun 25, 2015 at 06:44:35PM +0100, Ian Jackson wrote:
> If we already have an rc (eg from ao abort), keep it.
> 
> Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>

Acked-by: Wei Liu <wei.liu2@citrix.com>

> ---
> v4: New patch in this version of the series.
> ---
>  tools/libxl/libxl_save_callout.c |   13 ++++++++-----
>  1 file changed, 8 insertions(+), 5 deletions(-)
> 
> diff --git a/tools/libxl/libxl_save_callout.c b/tools/libxl/libxl_save_callout.c
> index 93a884b..087c2d5 100644
> --- a/tools/libxl/libxl_save_callout.c
> +++ b/tools/libxl/libxl_save_callout.c
> @@ -332,19 +332,22 @@ static void helper_exited(libxl__egc *egc, libxl__ev_child *ch,
>  
>      if (status) {
>          libxl_report_child_exitstatus(CTX, XTL_ERROR, what, pid, status);
> -        shs->rc = ERROR_FAIL;
> +        if (!shs->rc)
> +            shs->rc = ERROR_FAIL;
>      }
>  
>      if (shs->need_results) {
> -        if (!shs->rc)
> +        if (!shs->rc) {
>              LOG(ERROR,"%s exited without providing results",what);
> -        shs->rc = ERROR_FAIL;
> +            shs->rc = ERROR_FAIL;
> +        }
>      }
>  
>      if (!shs->completed) {
> -        if (!shs->rc)
> +        if (!shs->rc) {
>              LOG(ERROR,"%s exited without signaling completion",what);
> -        shs->rc = ERROR_FAIL;
> +            shs->rc = ERROR_FAIL;
> +        }
>      }
>  
>      helper_done(egc, shs);
> -- 
> 1.7.10.4

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH 19/35] libxl: ao: Provide manip_refcnt
  2015-06-26 14:33   ` Wei Liu
@ 2015-06-26 15:02     ` Ian Jackson
  0 siblings, 0 replies; 59+ messages in thread
From: Ian Jackson @ 2015-06-26 15:02 UTC (permalink / raw)
  To: Wei Liu; +Cc: xen-devel, Euan Harris, Ian Campbell

Wei Liu writes ("Re: [PATCH 19/35] libxl: ao: Provide manip_refcnt"):
> On Thu, Jun 25, 2015 at 06:44:19PM +0100, Ian Jackson wrote:
> > Previously we used in_initiator to stop the ao being freed while we
> > were still in the initiator function (which would result in the
> > initiator's call to lixl__ao_inprogress accessing the ao after it had
> > been freed).
> 
> Typo lixl.

Fixed, thanks.

Ian.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH 20/35] libxl: ao abort: Provide public ao abort request API
  2015-06-26 14:47   ` Wei Liu
@ 2015-06-26 15:05     ` Ian Jackson
  2015-06-26 15:13       ` Wei Liu
  0 siblings, 1 reply; 59+ messages in thread
From: Ian Jackson @ 2015-06-26 15:05 UTC (permalink / raw)
  To: Wei Liu; +Cc: xen-devel, Euan Harris, Ian Campbell

Wei Liu writes ("Re: [PATCH 20/35] libxl: ao abort: Provide public ao abort request API"):
> On Thu, Jun 25, 2015 at 06:44:20PM +0100, Ian Jackson wrote:
> [...]
> > + * It is possible to abort even an operation which is being performed
> > + * synchronously, but since in that case how==NULL you had better only
> > + * have one such operation, because it is not possible to tell them
> > + * apart.  (And, if you want to do this, obviously the abort would
> > + * have to be requested on a different thread.)
> > + */
...> 
> I interpreted the doc more like "it aborts all sync operations".
> 
> This function only aborts the first synchronous operation it finds.
> The doc didn't make this clear.

I will change it as follows:

   ... you had better only have one such operation, because it is
   not possible to tell them apart
 + (and libxl_ao_abort will abort only the first one it finds)

Thanks,
Ian.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH 20/35] libxl: ao abort: Provide public ao abort request API
  2015-06-26 15:05     ` Ian Jackson
@ 2015-06-26 15:13       ` Wei Liu
  2015-06-26 15:30         ` Wei Liu
  0 siblings, 1 reply; 59+ messages in thread
From: Wei Liu @ 2015-06-26 15:13 UTC (permalink / raw)
  To: Ian Jackson; +Cc: Euan Harris, xen-devel, Wei Liu, Ian Campbell

On Fri, Jun 26, 2015 at 04:05:07PM +0100, Ian Jackson wrote:
> Wei Liu writes ("Re: [PATCH 20/35] libxl: ao abort: Provide public ao abort request API"):
> > On Thu, Jun 25, 2015 at 06:44:20PM +0100, Ian Jackson wrote:
> > [...]
> > > + * It is possible to abort even an operation which is being performed
> > > + * synchronously, but since in that case how==NULL you had better only
> > > + * have one such operation, because it is not possible to tell them
> > > + * apart.  (And, if you want to do this, obviously the abort would
> > > + * have to be requested on a different thread.)
> > > + */
> ...> 
> > I interpreted the doc more like "it aborts all sync operations".
> > 
> > This function only aborts the first synchronous operation it finds.
> > The doc didn't make this clear.
> 
> I will change it as follows:
> 
>    ... you had better only have one such operation, because it is
>    not possible to tell them apart
>  + (and libxl_ao_abort will abort only the first one it finds)
> 

LGTM

> Thanks,
> Ian.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH 20/35] libxl: ao abort: Provide public ao abort request API
  2015-06-26 15:13       ` Wei Liu
@ 2015-06-26 15:30         ` Wei Liu
  0 siblings, 0 replies; 59+ messages in thread
From: Wei Liu @ 2015-06-26 15:30 UTC (permalink / raw)
  To: Ian Jackson; +Cc: Euan Harris, xen-devel, Wei Liu, Ian Campbell

On Fri, Jun 26, 2015 at 04:13:59PM +0100, Wei Liu wrote:
> On Fri, Jun 26, 2015 at 04:05:07PM +0100, Ian Jackson wrote:
> > Wei Liu writes ("Re: [PATCH 20/35] libxl: ao abort: Provide public ao abort request API"):
> > > On Thu, Jun 25, 2015 at 06:44:20PM +0100, Ian Jackson wrote:
> > > [...]
> > > > + * It is possible to abort even an operation which is being performed
> > > > + * synchronously, but since in that case how==NULL you had better only
> > > > + * have one such operation, because it is not possible to tell them
> > > > + * apart.  (And, if you want to do this, obviously the abort would
> > > > + * have to be requested on a different thread.)
> > > > + */
> > ...> 
> > > I interpreted the doc more like "it aborts all sync operations".
> > > 
> > > This function only aborts the first synchronous operation it finds.
> > > The doc didn't make this clear.
> > 
> > I will change it as follows:
> > 
> >    ... you had better only have one such operation, because it is
> >    not possible to tell them apart
> >  + (and libxl_ao_abort will abort only the first one it finds)
> > 
> 
> LGTM
> 

And I should make it clearer:

Acked-by: Wei Liu <wei.liu2@citrix.com>

> > Thanks,
> > Ian.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v4 00/35] libxl ao abort request (cancellation)
  2015-06-26 10:19 ` [PATCH v4 00/35] libxl ao abort request (cancellation) Ian Jackson
  2015-06-26 13:20   ` NUMA config handling bug on reuse of libxl domain config Ian Jackson
@ 2015-06-26 15:40   ` Ian Jackson
  2015-06-26 15:51     ` Ian Campbell
  1 sibling, 1 reply; 59+ messages in thread
From: Ian Jackson @ 2015-06-26 15:40 UTC (permalink / raw)
  To: xen-devel, Ian Campbell, Euan Harris, Wei Liu

Ian Jackson writes ("Re: [PATCH v4 00/35] libxl ao abort request (cancellation)"):
> Ian Jackson writes ("[PATCH v4 00/35] libxl ao abort request (cancellation)"):
> > This is v4 of my series to provide support for abandoning a
> > long-running libxl operation.
> 
> I've pushed it here
> 
>   http://xenbits.xen.org/gitweb/?p=people/iwj/xen.git;a=summary
>   git://xenbits.xen.org/people/iwj/xen.git
>      base.ao-abort.v4..wip.ao-abort.v4
> 
> (base.ao-abort is yesterday's staging).

Thanks to Wei for the reviews.  Everything is now acked.

I just wanted to double-check that we are happy with this going in
now, even though there are no in-tree callers ?

We do have a test suite in development, which we are hoping to get
into xen.git in 4.6 (or early 4.7 if it misses 4.6).

Ian C, are you happy with the interface changes from v3 ?  (Primarily,
I'm referring to the changed semantics surrounding the removal of
NOTIMPLEMENTED.)


If there are no further objections I think this can go into staging.

Thanks,
Ian.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v4 00/35] libxl ao abort request (cancellation)
  2015-06-26 15:40   ` [PATCH v4 00/35] libxl ao abort request (cancellation) Ian Jackson
@ 2015-06-26 15:51     ` Ian Campbell
  2015-06-26 15:56       ` Ian Jackson
  0 siblings, 1 reply; 59+ messages in thread
From: Ian Campbell @ 2015-06-26 15:51 UTC (permalink / raw)
  To: Ian Jackson; +Cc: Wei Liu, xen-devel, Euan Harris

On Fri, 2015-06-26 at 16:40 +0100, Ian Jackson wrote:
> Ian Jackson writes ("Re: [PATCH v4 00/35] libxl ao abort request (cancellation)"):
> > Ian Jackson writes ("[PATCH v4 00/35] libxl ao abort request (cancellation)"):
> > > This is v4 of my series to provide support for abandoning a
> > > long-running libxl operation.
> > 
> > I've pushed it here
> > 
> >   http://xenbits.xen.org/gitweb/?p=people/iwj/xen.git;a=summary
> >   git://xenbits.xen.org/people/iwj/xen.git
> >      base.ao-abort.v4..wip.ao-abort.v4
> > 
> > (base.ao-abort is yesterday's staging).
> 
> Thanks to Wei for the reviews.  Everything is now acked.
> 
> I just wanted to double-check that we are happy with this going in
> now, even though there are no in-tree callers ?
> 
> We do have a test suite in development, which we are hoping to get
> into xen.git in 4.6 (or early 4.7 if it misses 4.6).

Given that I think it's ok.

> Ian C, are you happy with the interface changes from v3 ?  (Primarily,
> I'm referring to the changed semantics surrounding the removal of
> NOTIMPLEMENTED.)

Yes, I think so.


> If there are no further objections I think this can go into staging.

Ack.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v4 00/35] libxl ao abort request (cancellation)
  2015-06-26 15:51     ` Ian Campbell
@ 2015-06-26 15:56       ` Ian Jackson
  0 siblings, 0 replies; 59+ messages in thread
From: Ian Jackson @ 2015-06-26 15:56 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Wei Liu, xen-devel, Euan Harris

Ian Campbell writes ("Re: [PATCH v4 00/35] libxl ao abort request (cancellation)"):
> On Fri, 2015-06-26 at 16:40 +0100, Ian Jackson wrote:
> > Ian Jackson writes ("Re: [PATCH v4 00/35] libxl ao abort request (cancellation)"):
> > If there are no further objections I think this can go into staging.
> 
> Ack.

Now pushed.  I'll spare everyone a v5 repost of the series just for
the added acks and the one added sentence to a comment.

Thanks everyone.

Ian.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: NUMA config handling bug on reuse of libxl domain config
  2015-06-26 13:20   ` NUMA config handling bug on reuse of libxl domain config Ian Jackson
@ 2015-06-26 15:57     ` Dario Faggioli
  0 siblings, 0 replies; 59+ messages in thread
From: Dario Faggioli @ 2015-06-26 15:57 UTC (permalink / raw)
  To: Ian Jackson; +Cc: Wei Liu, xen-devel, Euan Harris, Ian Campbell


[-- Attachment #1.1: Type: text/plain, Size: 1501 bytes --]

On Fri, 2015-06-26 at 14:20 +0100, Ian Jackson wrote:
> Ian Jackson writes ("Re: [PATCH v4 00/35] libxl ao abort request (cancellation)"):
> > Ian Jackson writes ("[PATCH v4 00/35] libxl ao abort request (cancellation)"):
> > > This is v4 of my series to provide support for abandoning a
> > > long-running libxl operation.
> > 
> > I've pushed it here
> > 
> >   http://xenbits.xen.org/gitweb/?p=people/iwj/xen.git;a=summary
> >   git://xenbits.xen.org/people/iwj/xen.git
> >      base.ao-abort.v4..wip.ao-abort.v4
> > 
> > (base.ao-abort is yesterday's staging).
> 
> One of the tests failed until I put a workaround in.
> 
I see...

> The test program:
>  - creates and populates a libxl_domain_config
>  - calls libxl_domain_create_new
>  - destroys the resulting domain with libxl_domain_destroy
>  - calls libxl_domain_create_new again with the same config
> 
> The result is:
> 
>  libxl: error: libxl_dom.c:343:libxl__build_pre: Can run NUMA placement
>  only if the domain does not have any NUMA node affinity set already
> 
>  libxl: error: libxl_create.c:1174:domcreate_rebuild_done: cannot
>  (re-)build domain: -6
> 
Ok, thanks for letting know. I will look into this ASAP.

Regards,
Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)

[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH 12/35] libxl: events: Make timeout and async exec setup take an ao, not a gc
  2015-06-25 17:44 ` [PATCH 12/35] libxl: events: Make timeout and async exec setup take an ao, not a gc Ian Jackson
@ 2015-06-30  6:02   ` Wen Congyang
  2015-06-30  9:14     ` Ian Jackson
  0 siblings, 1 reply; 59+ messages in thread
From: Wen Congyang @ 2015-06-30  6:02 UTC (permalink / raw)
  To: Ian Jackson, xen-devel
  Cc: Wei Liu, Yang Hongyang, Euan Harris, Ian Campbell, Lai Jiangshan

On 06/26/2015 01:44 AM, Ian Jackson wrote:
> Change the timeout setup functions to take a libxl__ao, not a
> libxl__gc.  This is going to be needed for ao abort, because timeouts
> are going to be a main hook for ao abort requests - so the timeouts
> need to be associated with an ao.
> 
> This means that timeouts can only occur as part of a long-running
> libxl function (but this is of course correct, as libxl shouldn't have
> any global timeouts, and indeed all the call sites have an ao).
> 
> Also remove the gc parameter from libxl__async_exec_start.  It can
> just use the gc from the ao supplied in the aes.
> 
> All the callers follow the obvious patterns and therefore supply the
> ao's gc to libxl__async_exec_start and the timeout setup functions.
> There is therefore no functional change in this patch.

Sorry, I just reviewed this patch, and don't do a building test.
libxl__async_exec_start() is also used in libxl_netbuffer.c.

Thanks
Wen Congyang

> 
> Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
> CC: Yang Hongyang <yanghy@cn.fujitsu.com>
> CC: Wen Congyang <wency@cn.fujitsu.com>
> CC: Lai Jiangshan <laijs@cn.fujitsu.com>
> Acked-by: Wen Congyang <wency@cn.fujitsu.com>
> Acked-by: Ian Campbell <ian.campbell@citrix.com>
> 
> ---
> v2: This patch split off from "Permit timeouts to signal cancellation".
>     Rebased; consequently, deal with libxl__async_exec_start.
>     CC'd authors of the libxl__async_exec_* functions.
> ---
>  tools/libxl/libxl_aoutils.c         |    8 +++++---
>  tools/libxl/libxl_device.c          |    4 ++--
>  tools/libxl/libxl_dom.c             |    8 ++++----
>  tools/libxl/libxl_event.c           |    6 ++++--
>  tools/libxl/libxl_internal.h        |    6 +++---
>  tools/libxl/libxl_remus_disk_drbd.c |    2 +-
>  tools/libxl/libxl_test_timedereg.c  |    9 +++++----
>  7 files changed, 24 insertions(+), 19 deletions(-)
> 
> diff --git a/tools/libxl/libxl_aoutils.c b/tools/libxl/libxl_aoutils.c
> index ef679dd..1502ffe 100644
> --- a/tools/libxl/libxl_aoutils.c
> +++ b/tools/libxl/libxl_aoutils.c
> @@ -46,7 +46,7 @@ int libxl__xswait_start(libxl__gc *gc, libxl__xswait_state *xswa)
>  {
>      int rc;
>  
> -    rc = libxl__ev_time_register_rel(gc, &xswa->time_ev,
> +    rc = libxl__ev_time_register_rel(xswa->ao, &xswa->time_ev,
>                                       xswait_timeout_callback, xswa->timeout_ms);
>      if (rc) goto err;
>  
> @@ -547,16 +547,18 @@ void libxl__async_exec_init(libxl__async_exec_state *aes)
>      libxl__ev_child_init(&aes->child);
>  }
>  
> -int libxl__async_exec_start(libxl__gc *gc, libxl__async_exec_state *aes)
> +int libxl__async_exec_start(libxl__async_exec_state *aes)
>  {
>      pid_t pid;
>  
>      /* Convenience aliases */
> +    libxl__ao *ao = aes->ao;
> +    AO_GC;
>      libxl__ev_child *const child = &aes->child;
>      char ** const args = aes->args;
>  
>      /* Set execution timeout */
> -    if (libxl__ev_time_register_rel(gc, &aes->time,
> +    if (libxl__ev_time_register_rel(ao, &aes->time,
>                                      async_exec_timeout,
>                                      aes->timeout_ms)) {
>          LOG(ERROR, "unable to register timeout for executing: %s", aes->what);
> diff --git a/tools/libxl/libxl_device.c b/tools/libxl/libxl_device.c
> index 951d125..b6276f6 100644
> --- a/tools/libxl/libxl_device.c
> +++ b/tools/libxl/libxl_device.c
> @@ -808,7 +808,7 @@ void libxl__initiate_device_remove(libxl__egc *egc,
>               * TODO: 4.2 Bodge due to QEMU, see comment on top of
>               * libxl__initiate_device_remove in libxl_internal.h
>               */
> -            rc = libxl__ev_time_register_rel(gc, &aodev->timeout,
> +            rc = libxl__ev_time_register_rel(ao, &aodev->timeout,
>                                               device_qemu_timeout,
>                                               LIBXL_QEMU_BODGE_TIMEOUT * 1000);
>              if (rc) {
> @@ -1034,7 +1034,7 @@ static void device_hotplug(libxl__egc *egc, libxl__ao_device *aodev)
>      aes->stdfds[1] = 2;
>      aes->stdfds[2] = -1;
>  
> -    rc = libxl__async_exec_start(gc, aes);
> +    rc = libxl__async_exec_start(aes);
>      if (rc)
>          goto out;
>  
> diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
> index ccbcb6e..40a2d79 100644
> --- a/tools/libxl/libxl_dom.c
> +++ b/tools/libxl/libxl_dom.c
> @@ -1200,7 +1200,7 @@ static void domain_suspend_switch_qemu_xen_traditional_logdirty
>                                  switch_logdirty_xswatch, lds->ret_path);
>      if (rc) goto out;
>  
> -    rc = libxl__ev_time_register_rel(gc, &lds->timeout,
> +    rc = libxl__ev_time_register_rel(ao, &lds->timeout,
>                                  switch_logdirty_timeout, 10*1000);
>      if (rc) goto out;
>  
> @@ -1480,7 +1480,7 @@ static void domain_suspend_callback_common(libxl__egc *egc,
>          rc = libxl__ev_evtchn_wait(gc, &dss->guest_evtchn);
>          if (rc) goto err;
>  
> -        rc = libxl__ev_time_register_rel(gc, &dss->guest_timeout,
> +        rc = libxl__ev_time_register_rel(ao, &dss->guest_timeout,
>                                           suspend_common_wait_guest_timeout,
>                                           60*1000);
>          if (rc) goto err;
> @@ -1611,7 +1611,7 @@ static void domain_suspend_common_wait_guest(libxl__egc *egc,
>                                      "@releaseDomain");
>      if (rc) goto err;
>  
> -    rc = libxl__ev_time_register_rel(gc, &dss->guest_timeout,
> +    rc = libxl__ev_time_register_rel(ao, &dss->guest_timeout,
>                                       suspend_common_wait_guest_timeout,
>                                       60*1000);
>      if (rc) goto err;
> @@ -1990,7 +1990,7 @@ static void remus_devices_commit_cb(libxl__egc *egc,
>       */
>  
>      /* Set checkpoint interval timeout */
> -    rc = libxl__ev_time_register_rel(gc, &dss->checkpoint_timeout,
> +    rc = libxl__ev_time_register_rel(ao, &dss->checkpoint_timeout,
>                                       remus_next_checkpoint,
>                                       dss->interval);
>  
> diff --git a/tools/libxl/libxl_event.c b/tools/libxl/libxl_event.c
> index 4b234a3..217fe97 100644
> --- a/tools/libxl/libxl_event.c
> +++ b/tools/libxl/libxl_event.c
> @@ -332,10 +332,11 @@ static void time_done_debug(libxl__gc *gc, const char *func,
>  #endif
>  }
>  
> -int libxl__ev_time_register_abs(libxl__gc *gc, libxl__ev_time *ev,
> +int libxl__ev_time_register_abs(libxl__ao *ao, libxl__ev_time *ev,
>                                  libxl__ev_time_callback *func,
>                                  struct timeval absolute)
>  {
> +    AO_GC;
>      int rc;
>  
>      CTX_LOCK;
> @@ -356,10 +357,11 @@ int libxl__ev_time_register_abs(libxl__gc *gc, libxl__ev_time *ev,
>  }
>  
>  
> -int libxl__ev_time_register_rel(libxl__gc *gc, libxl__ev_time *ev,
> +int libxl__ev_time_register_rel(libxl__ao *ao, libxl__ev_time *ev,
>                                  libxl__ev_time_callback *func,
>                                  int milliseconds /* as for poll(2) */)
>  {
> +    AO_GC;
>      struct timeval absolute;
>      int rc;
>  
> diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
> index b445a51..8c38eab 100644
> --- a/tools/libxl/libxl_internal.h
> +++ b/tools/libxl/libxl_internal.h
> @@ -791,10 +791,10 @@ static inline void libxl__ev_fd_init(libxl__ev_fd *efd)
>  static inline int libxl__ev_fd_isregistered(const libxl__ev_fd *efd)
>                      { return efd->fd >= 0; }
>  
> -_hidden int libxl__ev_time_register_rel(libxl__gc*, libxl__ev_time *ev_out,
> +_hidden int libxl__ev_time_register_rel(libxl__ao*, libxl__ev_time *ev_out,
>                                          libxl__ev_time_callback*,
>                                          int milliseconds /* as for poll(2) */);
> -_hidden int libxl__ev_time_register_abs(libxl__gc*, libxl__ev_time *ev_out,
> +_hidden int libxl__ev_time_register_abs(libxl__ao*, libxl__ev_time *ev_out,
>                                          libxl__ev_time_callback*,
>                                          struct timeval);
>  _hidden int libxl__ev_time_modify_rel(libxl__gc*, libxl__ev_time *ev,
> @@ -2166,7 +2166,7 @@ struct libxl__async_exec_state {
>  };
>  
>  void libxl__async_exec_init(libxl__async_exec_state *aes);
> -int libxl__async_exec_start(libxl__gc *gc, libxl__async_exec_state *aes);
> +int libxl__async_exec_start(libxl__async_exec_state *aes);
>  bool libxl__async_exec_inuse(const libxl__async_exec_state *aes);
>  
>  /*----- device addition/removal -----*/
> diff --git a/tools/libxl/libxl_remus_disk_drbd.c b/tools/libxl/libxl_remus_disk_drbd.c
> index afe9b61..5e0c9a6 100644
> --- a/tools/libxl/libxl_remus_disk_drbd.c
> +++ b/tools/libxl/libxl_remus_disk_drbd.c
> @@ -120,7 +120,7 @@ static void match_async_exec(libxl__egc *egc, libxl__remus_device *dev)
>      aes->stdfds[1] = -1;
>      aes->stdfds[2] = -1;
>  
> -    rc = libxl__async_exec_start(gc, aes);
> +    rc = libxl__async_exec_start(aes);
>      if (rc)
>          goto out;
>  
> diff --git a/tools/libxl/libxl_test_timedereg.c b/tools/libxl/libxl_test_timedereg.c
> index a44639f..e2cc27d 100644
> --- a/tools/libxl/libxl_test_timedereg.c
> +++ b/tools/libxl/libxl_test_timedereg.c
> @@ -30,12 +30,13 @@ static int seq;
>  static void occurs(libxl__egc *egc, libxl__ev_time *ev,
>                     const struct timeval *requested_abs);
>  
> -static void regs(libxl__gc *gc, int j)
> +static void regs(libxl__ao *ao, int j)
>  {
> +    AO_GC;
>      int rc, i;
>      LOG(DEBUG,"regs(%d)", j);
>      for (i=0; i<NTIMES; i++) {
> -        rc = libxl__ev_time_register_rel(gc, &et[j][i], occurs, ms[j][i]);
> +        rc = libxl__ev_time_register_rel(ao, &et[j][i], occurs, ms[j][i]);
>          assert(!rc);
>      }    
>  }
> @@ -52,7 +53,7 @@ int libxl_test_timedereg(libxl_ctx *ctx, libxl_asyncop_how *ao_how)
>          libxl__ev_time_init(&et[1][i]);
>      }
>  
> -    regs(gc, 0);
> +    regs(ao, 0);
>  
>      return AO_INPROGRESS;
>  }
> @@ -71,7 +72,7 @@ static void occurs(libxl__egc *egc, libxl__ev_time *ev,
>          assert(ev == &et[0][1]);
>          libxl__ev_time_deregister(gc, &et[0][0]);
>          libxl__ev_time_deregister(gc, &et[0][2]);
> -        regs(gc, 1);
> +        regs(tao, 1);
>          libxl__ev_time_deregister(gc, &et[0][1]);
>          break;
>  
> 

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH 12/35] libxl: events: Make timeout and async exec setup take an ao, not a gc
  2015-06-30  6:02   ` Wen Congyang
@ 2015-06-30  9:14     ` Ian Jackson
  0 siblings, 0 replies; 59+ messages in thread
From: Ian Jackson @ 2015-06-30  9:14 UTC (permalink / raw)
  To: Wen Congyang
  Cc: xen-devel, Wei Liu, Lai Jiangshan, Euan Harris, Yang Hongyang,
	Ian Campbell

Wen Congyang writes ("Re: [PATCH 12/35] libxl: events: Make timeout and async exec setup take an ao, not a gc"):
> On 06/26/2015 01:44 AM, Ian Jackson wrote:
> > All the callers follow the obvious patterns and therefore supply the
> > ao's gc to libxl__async_exec_start and the timeout setup functions.
> > There is therefore no functional change in this patch.
> 
> Sorry, I just reviewed this patch, and don't do a building test.
> libxl__async_exec_start() is also used in libxl_netbuffer.c.

Right.  This is my fault, not yours.  A reviewer is not normally
expected to do a build test.

I should have done a `git grep' to check I had changed the call sites,
but obviously I either failed to do so or overlooked this call site.

I should remember that libxl is starting to grow quite a few areas of
conditional compilation.

Thanks,
Ian.

^ permalink raw reply	[flat|nested] 59+ messages in thread

end of thread, other threads:[~2015-06-30  9:14 UTC | newest]

Thread overview: 59+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-06-25 17:44 [PATCH v4 00/35] libxl ao abort request (cancellation) Ian Jackson
2015-06-25 17:44 ` [PATCH 01/35] libxl: ao internal API docs: Mention synchronous ao completion Ian Jackson
2015-06-25 17:44 ` [PATCH 02/35] libxl: suspend: switch_logdirty_done takes rc Ian Jackson
2015-06-25 17:44 ` [PATCH 03/35] libxl: suspend: common suspend callbacks take rc Ian Jackson
2015-06-25 17:44 ` [PATCH 04/35] libxl: suspend: Return correct error from callbacks Ian Jackson
2015-06-25 17:44 ` [PATCH 05/35] libxl: Use libxl__xswait* in libxl__ao_device Ian Jackson
2015-06-25 17:44 ` [PATCH 06/35] libxl: xswait/devstate: Move xswait to before devstate Ian Jackson
2015-06-25 17:44 ` [PATCH 07/35] libxl: devstate: Use libxl__xswait* Ian Jackson
2015-06-25 17:44 ` [PATCH 08/35] libxl: Rename AO_ABORT to AO_CREATE_FAIL Ian Jackson
2015-06-26 14:10   ` Wei Liu
2015-06-25 17:44 ` [PATCH 09/35] libxl: Change some log messages to say `abandoning' rather than `aborting' Ian Jackson
2015-06-26 14:12   ` Wei Liu
2015-06-25 17:44 ` [PATCH 10/35] libxl: Change an internal comment to say `bail' rather than `abort' Ian Jackson
2015-06-26 14:13   ` Wei Liu
2015-06-25 17:44 ` [PATCH 11/35] libxl: New error codes ABORTED etc Ian Jackson
2015-06-26 14:13   ` Wei Liu
2015-06-25 17:44 ` [PATCH 12/35] libxl: events: Make timeout and async exec setup take an ao, not a gc Ian Jackson
2015-06-30  6:02   ` Wen Congyang
2015-06-30  9:14     ` Ian Jackson
2015-06-25 17:44 ` [PATCH 13/35] libxl: events: Make libxl__async_exec_* pass caller an rc Ian Jackson
2015-06-25 17:44 ` [PATCH 14/35] libxl: events: Permit timeouts to signal ao abort Ian Jackson
2015-06-25 17:44 ` [PATCH 15/35] libxl: spawn: Preserve rc in error path Ian Jackson
2015-06-26 14:28   ` Wei Liu
2015-06-25 17:44 ` [PATCH 16/35] libxl: domain create: Do not destroy on ao abort Ian Jackson
2015-06-25 17:44 ` [PATCH 17/35] libxl: ao: Record ultimate parent of a nested ao Ian Jackson
2015-06-25 17:44 ` [PATCH 18/35] libxl: ao: Count the nested progeny of an ao Ian Jackson
2015-06-25 17:44 ` [PATCH 19/35] libxl: ao: Provide manip_refcnt Ian Jackson
2015-06-26 14:33   ` Wei Liu
2015-06-26 15:02     ` Ian Jackson
2015-06-25 17:44 ` [PATCH 20/35] libxl: ao abort: Provide public ao abort request API Ian Jackson
2015-06-26 14:47   ` Wei Liu
2015-06-26 15:05     ` Ian Jackson
2015-06-26 15:13       ` Wei Liu
2015-06-26 15:30         ` Wei Liu
2015-06-25 17:44 ` [PATCH 21/35] libxl: ao abort: Provide explicit internal abort check API Ian Jackson
2015-06-25 17:44 ` [PATCH 22/35] libxl: ao abort: Make timeouts abortable Ian Jackson
2015-06-25 17:44 ` [PATCH 23/35] libxl: ao abort: Note that driver domain task cannot be usefully aborted Ian Jackson
2015-06-25 17:44 ` [PATCH 24/35] libxl: Introduce DOMAIN_DESTROYED error code Ian Jackson
2015-06-25 17:44 ` [PATCH 25/35] libxl: ao abort: Support aborting where we spot domain death Ian Jackson
2015-06-25 17:44 ` [PATCH 26/35] libxl: Introduce FILLZERO Ian Jackson
2015-06-25 17:44 ` [PATCH 27/35] libxl: ao abort: Preparations for save/restore abort Ian Jackson
2015-06-25 17:44 ` [PATCH 28/35] libxl: ao abort: Handle SIGTERM in save/restore helper Ian Jackson
2015-06-25 17:44 ` [PATCH 29/35] libxl: ao abort: Abort libxc save/restore Ian Jackson
2015-06-25 17:44 ` [PATCH 30/35] libxl: ao: datacopier callback gets an rc Ian Jackson
2015-06-25 17:44 ` [PATCH 31/35] libxl: ao abort: Make datacopiers abortable Ian Jackson
2015-06-25 17:44 ` [PATCH 32/35] libxl: Fix libxl__get_domid error reporting Ian Jackson
2015-06-26 14:54   ` Wei Liu
2015-06-25 17:44 ` [PATCH 33/35] libxl: spawn: Always debug log middle child process death Ian Jackson
2015-06-26 14:55   ` Wei Liu
2015-06-25 17:44 ` [PATCH 34/35] libxl: libxl__ev_child pass actual pid to callback Ian Jackson
2015-06-26 14:56   ` Wei Liu
2015-06-25 17:44 ` [PATCH 35/35] libxl: When save/restore helper dies, do not overwrite rc Ian Jackson
2015-06-26 14:56   ` Wei Liu
2015-06-26 10:19 ` [PATCH v4 00/35] libxl ao abort request (cancellation) Ian Jackson
2015-06-26 13:20   ` NUMA config handling bug on reuse of libxl domain config Ian Jackson
2015-06-26 15:57     ` Dario Faggioli
2015-06-26 15:40   ` [PATCH v4 00/35] libxl ao abort request (cancellation) Ian Jackson
2015-06-26 15:51     ` Ian Campbell
2015-06-26 15:56       ` Ian Jackson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.