All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] libxl: Fixes from Ian Jackson
@ 2015-03-17 15:30 Jim Fehlig
  2015-03-17 15:30 ` [PATCH 1/3] libxl: In domain death search, start search at first domid we want Jim Fehlig
                   ` (2 more replies)
  0 siblings, 3 replies; 17+ messages in thread
From: Jim Fehlig @ 2015-03-17 15:30 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian.Jackson, wei.liu2

This is a small series of libxl patches I received off-list from
Ian Jackson.  The patches fix a few issues I found when converting
the libvirt libxl driver to use a single libxl_ctx.  Patch 2 has
been modified slightly to address off-list comments from Wei Liu.

Ian Jackson (3):
  libxl: In domain death search, start search at first domid we want
  libxl: Domain destroy: unlock userdata earlier
  libxl: Domain destroy: fork

 tools/libxl/libxl.c          | 77 +++++++++++++++++++++++++++++++++++---------
 tools/libxl/libxl_internal.h |  1 +
 2 files changed, 63 insertions(+), 15 deletions(-)

-- 
1.8.0.1

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 1/3] libxl: In domain death search, start search at first domid we want
  2015-03-17 15:30 [PATCH 0/3] libxl: Fixes from Ian Jackson Jim Fehlig
@ 2015-03-17 15:30 ` Jim Fehlig
  2015-03-17 17:34   ` Wei Liu
  2015-03-18 12:19   ` Ian Campbell
  2015-03-17 15:30 ` [PATCH 2/3] libxl: Domain destroy: unlock userdata earlier Jim Fehlig
  2015-03-17 15:30 ` [PATCH 3/3] libxl: Domain destroy: fork Jim Fehlig
  2 siblings, 2 replies; 17+ messages in thread
From: Jim Fehlig @ 2015-03-17 15:30 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian.Jackson, wei.liu2

From: Ian Jackson <Ian.Jackson@eu.citrix.com>

From: Ian Jackson <Ian.Jackson@eu.citrix.com>

When domain_death_xswatch_callback needed a further call to
xc_domain_getinfolist it would restart it with the last domain it
found rather than the first one it wants.

If it only wants one it will also only ask for one domain.  The result
would then be that it gets the previous domain again (ie, the previous
one to the one it wants), which still doesn't reveal the answer to the
question, and it would therefore loop again.

It's completely unclear to me why I thought it was a good idea to
start the xc_domain_getinfolist with the last domain previously found
rather than the first one left un-confirmed.  The code has been that
way since it was introduced.

Instead, start each xc_domain_getinfolist at the next domain whose
status we need to check.

We also need to move the test for !evg into the loop, we now need evg
to compute the arguments to getinfolist.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Reported-by: Jim Fehlig <jfehlig@suse.com>
Reviewed-by: Jim Fehlig <jfehlig@suse.com>
Tested-by: Jim Fehlig <jfehlig@suse.com>
---
 tools/libxl/libxl.c | 15 +++++++--------
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 088786e..e7eb863 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -1168,22 +1168,20 @@ static void domain_death_xswatch_callback(libxl__egc *egc, libxl__ev_xswatch *w,
                                         const char *wpath, const char *epath) {
     EGC_GC;
     libxl_evgen_domain_death *evg;
-    uint32_t domid;
     int rc;
 
     CTX_LOCK;
 
     evg = LIBXL_TAILQ_FIRST(&CTX->death_list);
-    if (!evg) goto out;
-
-    domid = evg->domid;
 
     for (;;) {
+        if (!evg) goto out;
+
         int nentries = LIBXL_TAILQ_NEXT(evg, entry) ? 200 : 1;
         xc_domaininfo_t domaininfos[nentries];
         const xc_domaininfo_t *got = domaininfos, *gotend;
 
-        rc = xc_domain_getinfolist(CTX->xch, domid, nentries, domaininfos);
+        rc = xc_domain_getinfolist(CTX->xch, evg->domid, nentries, domaininfos);
         if (rc == -1) {
             LIBXL__EVENT_DISASTER(egc, "xc_domain_getinfolist failed while"
                                   " processing @releaseDomain watch event",
@@ -1193,8 +1191,10 @@ static void domain_death_xswatch_callback(libxl__egc *egc, libxl__ev_xswatch *w,
         gotend = &domaininfos[rc];
 
         LIBXL__LOG(CTX, LIBXL__LOG_DEBUG, "[evg=%p:%"PRIu32"]"
-                   " from domid=%"PRIu32" nentries=%d rc=%d",
-                   evg, evg->domid, domid, nentries, rc);
+                   " nentries=%d rc=%d %ld..%ld",
+                   evg, evg->domid, nentries, rc,
+                   rc>0 ? (long)domaininfos[0].domain : 0,
+                   rc>0 ? (long)domaininfos[rc-1].domain : 0);
 
         for (;;) {
             if (!evg) {
@@ -1257,7 +1257,6 @@ static void domain_death_xswatch_callback(libxl__egc *egc, libxl__ev_xswatch *w,
         }
 
         assert(rc); /* rc==0 results in us eating all evgs and quitting */
-        domid = gotend[-1].domain;
     }
  all_reported:
  out:
-- 
1.8.0.1

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 2/3] libxl: Domain destroy: unlock userdata earlier
  2015-03-17 15:30 [PATCH 0/3] libxl: Fixes from Ian Jackson Jim Fehlig
  2015-03-17 15:30 ` [PATCH 1/3] libxl: In domain death search, start search at first domid we want Jim Fehlig
@ 2015-03-17 15:30 ` Jim Fehlig
  2015-03-17 17:34   ` Wei Liu
  2015-03-17 15:30 ` [PATCH 3/3] libxl: Domain destroy: fork Jim Fehlig
  2 siblings, 1 reply; 17+ messages in thread
From: Jim Fehlig @ 2015-03-17 15:30 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian.Jackson, wei.liu2

From: Ian Jackson <ian.jackson@eu.citrix.com>

Unlock the userdata before we actually call xc_domain_destroy.  This
leaves open the possibility that other libxl callers will see the
half-destroyed domain (with no devices, paused), but this is fine.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Jim Fehlig <jfehlig@suse.com>
Tested-by: Jim Fehlig <jfehlig@suse.com>
---

Addressed off-list comments from Wei Liu

 tools/libxl/libxl.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index e7eb863..b6541d4 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -1636,7 +1636,7 @@ static void devices_destroy_cb(libxl__egc *egc,
     uint32_t domid = dis->domid;
     char *dom_path;
     char *vm_path;
-    libxl__domain_userdata_lock *lock = NULL;
+    libxl__domain_userdata_lock *lock;
 
     dom_path = libxl__xs_get_dompath(gc, domid);
     if (!dom_path) {
@@ -1670,6 +1670,8 @@ static void devices_destroy_cb(libxl__egc *egc,
     }
     libxl__userdata_destroyall(gc, domid);
 
+    libxl__unlock_domain_userdata(lock);
+
     rc = xc_domain_destroy(ctx->xch, domid);
     if (rc < 0) {
         LIBXL__LOG_ERRNOVAL(ctx, LIBXL__LOG_ERROR, rc, "xc_domain_destroy failed for %d", domid);
@@ -1679,7 +1681,6 @@ static void devices_destroy_cb(libxl__egc *egc,
     rc = 0;
 
 out:
-    if (lock) libxl__unlock_domain_userdata(lock);
     dis->callback(egc, dis, rc);
     return;
 }
-- 
1.8.0.1

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 3/3] libxl: Domain destroy: fork
  2015-03-17 15:30 [PATCH 0/3] libxl: Fixes from Ian Jackson Jim Fehlig
  2015-03-17 15:30 ` [PATCH 1/3] libxl: In domain death search, start search at first domid we want Jim Fehlig
  2015-03-17 15:30 ` [PATCH 2/3] libxl: Domain destroy: unlock userdata earlier Jim Fehlig
@ 2015-03-17 15:30 ` Jim Fehlig
  2015-03-17 18:04   ` Wei Liu
  2015-03-18 12:27   ` Ian Campbell
  2 siblings, 2 replies; 17+ messages in thread
From: Jim Fehlig @ 2015-03-17 15:30 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian.Jackson, wei.liu2

From: Ian Jackson <ian.jackson@eu.citrix.com>

Call xc_domain_destroy in a subprocess.  That allows us to do so
asynchronously, rather than blocking the whole process calling libxl.

The changes in detail:

 * Provide an libxl__ev_child in libxl__domain_destroy_state, and
   initialise it in libxl__domain_destroy.  There is no possibility
   to `clean up' a libxl__ev_child, but there need to clean it up, as
   the control flow ensures that we only continue after the child has
   exited.

 * Call libxl__ev_child_fork at the right point and put the call to
   xc_domain_destroy and associated logging in the child.  (The child
   opens a new xenctrl handle because we mustn't use the parent's.)

 * Consequently, the success return path from domain_destroy_domid_cb
   no longer calls dis->callback.  Instead it simply returns.

 * We plumb the errorno value through the child's exit status, if it
   fits.  This means we normally do the logging only in the parent.

 * Incidentally, we fix the bug that we were treating the return value
   from xc_domain_destroy as an errno value when in fact it is a
   return value from do_domctl (in this case, 0 or -1 setting errno).

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Reviewed-by: Jim Fehlig <jfehlig@suse.com>
Tested-by: Jim Fehlig <jfehlig@suse.com>
---
 tools/libxl/libxl.c          | 57 ++++++++++++++++++++++++++++++++++++++++----
 tools/libxl/libxl_internal.h |  1 +
 2 files changed, 53 insertions(+), 5 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index b6541d4..b43db1a 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -1481,6 +1481,10 @@ static void domain_destroy_callback(libxl__egc *egc,
 static void destroy_finish_check(libxl__egc *egc,
                                  libxl__domain_destroy_state *dds);
 
+static void domain_destroy_domid_cb(libxl__egc *egc,
+                                    libxl__ev_child *destroyer,
+                                    pid_t pid, int status);
+
 void libxl__domain_destroy(libxl__egc *egc, libxl__domain_destroy_state *dds)
 {
     STATE_AO_GC(dds->ao);
@@ -1567,6 +1571,8 @@ void libxl__destroy_domid(libxl__egc *egc, libxl__destroy_domid_state *dis)
     char *pid;
     int rc, dm_present;
 
+    libxl__ev_child_init(&dis->destroyer);
+
     rc = libxl_domain_info(ctx, NULL, domid);
     switch(rc) {
     case 0:
@@ -1672,17 +1678,58 @@ static void devices_destroy_cb(libxl__egc *egc,
 
     libxl__unlock_domain_userdata(lock);
 
-    rc = xc_domain_destroy(ctx->xch, domid);
-    if (rc < 0) {
-        LIBXL__LOG_ERRNOVAL(ctx, LIBXL__LOG_ERROR, rc, "xc_domain_destroy failed for %d", domid);
+    rc = libxl__ev_child_fork(gc, &dis->destroyer, domain_destroy_domid_cb);
+    if (rc < 0) goto out;
+    if (!rc) { /* child */
+        ctx->xch = xc_interface_open(ctx->lg,0,0);
+        if (!ctx->xch) goto badchild;
+
+        rc = xc_domain_destroy(ctx->xch, domid);
+        if (rc < 0) goto badchild;
+        _exit(0);
+
+    badchild:
+        if (errno > 0  && errno < 126) {
+            _exit(errno);
+        } else {
+            LOGE(ERROR,
+ "xc_domain_destroy failed for %d (with difficult errno value %d)",
+                 domid, errno);
+            _exit(-1);
+        }
+    }
+    LOG(INFO, "forked pid %ld for destroy of domain %d", (long)rc, domid);
+
+    return;
+
+out:
+    dis->callback(egc, dis, rc);
+    return;
+}
+
+static void domain_destroy_domid_cb(libxl__egc *egc,
+                                    libxl__ev_child *destroyer,
+                                    pid_t pid, int status)
+{
+    libxl__destroy_domid_state *dis = CONTAINER_OF(destroyer, *dis, destroyer);
+    STATE_AO_GC(dis->ao);
+    int rc;
+
+    if (status) {
+        if (WIFEXITED(status) && WEXITSTATUS(status)<126) {
+            LOGEV(ERROR, WEXITSTATUS(status),
+                  "xc_domain_destroy failed for %"PRIu32"", dis->domid);
+        } else {
+            libxl_report_child_exitstatus(CTX, XTL_ERROR,
+                                          "async domain destroy", pid, status);
+        }
         rc = ERROR_FAIL;
         goto out;
     }
     rc = 0;
 
-out:
+ out:
     dis->callback(egc, dis, rc);
-    return;
 }
 
 int libxl_console_exec(libxl_ctx *ctx, uint32_t domid, int cons_num,
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 934465a..28d32ef 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -2957,6 +2957,7 @@ struct libxl__destroy_domid_state {
     libxl__domid_destroy_cb *callback;
     /* private to implementation */
     libxl__devices_remove_state drs;
+    libxl__ev_child destroyer;
 };
 
 struct libxl__domain_destroy_state {
-- 
1.8.0.1

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/3] libxl: In domain death search, start search at first domid we want
  2015-03-17 15:30 ` [PATCH 1/3] libxl: In domain death search, start search at first domid we want Jim Fehlig
@ 2015-03-17 17:34   ` Wei Liu
  2015-03-18 12:19   ` Ian Campbell
  1 sibling, 0 replies; 17+ messages in thread
From: Wei Liu @ 2015-03-17 17:34 UTC (permalink / raw)
  To: Jim Fehlig; +Cc: Ian.Jackson, wei.liu2, xen-devel

On Tue, Mar 17, 2015 at 09:30:57AM -0600, Jim Fehlig wrote:
> From: Ian Jackson <Ian.Jackson@eu.citrix.com>
> 
> From: Ian Jackson <Ian.Jackson@eu.citrix.com>
> 
> When domain_death_xswatch_callback needed a further call to
> xc_domain_getinfolist it would restart it with the last domain it
> found rather than the first one it wants.
> 
> If it only wants one it will also only ask for one domain.  The result
> would then be that it gets the previous domain again (ie, the previous
> one to the one it wants), which still doesn't reveal the answer to the
> question, and it would therefore loop again.
> 
> It's completely unclear to me why I thought it was a good idea to
> start the xc_domain_getinfolist with the last domain previously found
> rather than the first one left un-confirmed.  The code has been that
> way since it was introduced.
> 
> Instead, start each xc_domain_getinfolist at the next domain whose
> status we need to check.
> 
> We also need to move the test for !evg into the loop, we now need evg
> to compute the arguments to getinfolist.
> 
> Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
> Reported-by: Jim Fehlig <jfehlig@suse.com>
> Reviewed-by: Jim Fehlig <jfehlig@suse.com>
> Tested-by: Jim Fehlig <jfehlig@suse.com>

Acked-by: Wei Liu <wei.liu2@citrix.com>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/3] libxl: Domain destroy: unlock userdata earlier
  2015-03-17 15:30 ` [PATCH 2/3] libxl: Domain destroy: unlock userdata earlier Jim Fehlig
@ 2015-03-17 17:34   ` Wei Liu
  2015-03-18 12:20     ` Ian Campbell
  0 siblings, 1 reply; 17+ messages in thread
From: Wei Liu @ 2015-03-17 17:34 UTC (permalink / raw)
  To: Jim Fehlig; +Cc: wei.liu2, Ian.Jackson, xen-devel

On Tue, Mar 17, 2015 at 09:30:58AM -0600, Jim Fehlig wrote:
> From: Ian Jackson <ian.jackson@eu.citrix.com>
> 
> Unlock the userdata before we actually call xc_domain_destroy.  This
> leaves open the possibility that other libxl callers will see the
> half-destroyed domain (with no devices, paused), but this is fine.
> 
> Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
> CC: Wei Liu <wei.liu2@citrix.com>
> Reviewed-by: Jim Fehlig <jfehlig@suse.com>
> Tested-by: Jim Fehlig <jfehlig@suse.com>

Acked-by: Wei Liu <wei.liu2@citrix.com>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 3/3] libxl: Domain destroy: fork
  2015-03-17 15:30 ` [PATCH 3/3] libxl: Domain destroy: fork Jim Fehlig
@ 2015-03-17 18:04   ` Wei Liu
  2015-03-18 12:28     ` Ian Campbell
  2015-03-18 12:27   ` Ian Campbell
  1 sibling, 1 reply; 17+ messages in thread
From: Wei Liu @ 2015-03-17 18:04 UTC (permalink / raw)
  To: Jim Fehlig; +Cc: wei.liu2, Ian.Jackson, xen-devel

On Tue, Mar 17, 2015 at 09:30:59AM -0600, Jim Fehlig wrote:
> From: Ian Jackson <ian.jackson@eu.citrix.com>
> 
> Call xc_domain_destroy in a subprocess.  That allows us to do so
> asynchronously, rather than blocking the whole process calling libxl.
> 
> The changes in detail:
> 
>  * Provide an libxl__ev_child in libxl__domain_destroy_state, and
>    initialise it in libxl__domain_destroy.  There is no possibility
>    to `clean up' a libxl__ev_child, but there need to clean it up, as
>    the control flow ensures that we only continue after the child has
>    exited.
> 
>  * Call libxl__ev_child_fork at the right point and put the call to
>    xc_domain_destroy and associated logging in the child.  (The child
>    opens a new xenctrl handle because we mustn't use the parent's.)
> 
>  * Consequently, the success return path from domain_destroy_domid_cb
>    no longer calls dis->callback.  Instead it simply returns.
> 
>  * We plumb the errorno value through the child's exit status, if it
>    fits.  This means we normally do the logging only in the parent.
> 
>  * Incidentally, we fix the bug that we were treating the return value
>    from xc_domain_destroy as an errno value when in fact it is a
>    return value from do_domctl (in this case, 0 or -1 setting errno).
> 
> Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
> Reviewed-by: Jim Fehlig <jfehlig@suse.com>
> Tested-by: Jim Fehlig <jfehlig@suse.com>

Reviewed-by: Wei Liu <wei.liu2@citrix.com>

One nit below.

> ---
[...]
> +        ctx->xch = xc_interface_open(ctx->lg,0,0);
> +        if (!ctx->xch) goto badchild;
> +
> +        rc = xc_domain_destroy(ctx->xch, domid);
> +        if (rc < 0) goto badchild;
> +        _exit(0);
> +
> +    badchild:
> +        if (errno > 0  && errno < 126) {
> +            _exit(errno);
> +        } else {
> +            LOGE(ERROR,
> + "xc_domain_destroy failed for %d (with difficult errno value %d)",

Indentation is wrong.

Wei.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/3] libxl: In domain death search, start search at first domid we want
  2015-03-17 15:30 ` [PATCH 1/3] libxl: In domain death search, start search at first domid we want Jim Fehlig
  2015-03-17 17:34   ` Wei Liu
@ 2015-03-18 12:19   ` Ian Campbell
  2015-03-18 17:47     ` Jim Fehlig
  1 sibling, 1 reply; 17+ messages in thread
From: Ian Campbell @ 2015-03-18 12:19 UTC (permalink / raw)
  To: Jim Fehlig; +Cc: wei.liu2, Ian.Jackson, xen-devel

On Tue, 2015-03-17 at 09:30 -0600, Jim Fehlig wrote:
> From: Ian Jackson <Ian.Jackson@eu.citrix.com>
> 
> From: Ian Jackson <Ian.Jackson@eu.citrix.com>
> 
> When domain_death_xswatch_callback needed a further call to
> xc_domain_getinfolist it would restart it with the last domain it
> found rather than the first one it wants.
> 
> If it only wants one it will also only ask for one domain.  The result
> would then be that it gets the previous domain again (ie, the previous
> one to the one it wants), which still doesn't reveal the answer to the
> question, and it would therefore loop again.
> 
> It's completely unclear to me why I thought it was a good idea to
> start the xc_domain_getinfolist with the last domain previously found
> rather than the first one left un-confirmed.  The code has been that
> way since it was introduced.

Is it because the xc_domain_getinfolist will fetch at most:
        int nentries = LIBXL_TAILQ_NEXT(evg, entry) ? 200 : 1;
entries?

After your change then if the domid we are looking for is the 201st
domain then won't we just keep going round looking at the first 200
(undying) domains?

Ian.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/3] libxl: Domain destroy: unlock userdata earlier
  2015-03-17 17:34   ` Wei Liu
@ 2015-03-18 12:20     ` Ian Campbell
  2015-03-18 17:52       ` Jim Fehlig
  0 siblings, 1 reply; 17+ messages in thread
From: Ian Campbell @ 2015-03-18 12:20 UTC (permalink / raw)
  To: Wei Liu; +Cc: Jim Fehlig, Ian.Jackson, xen-devel

On Tue, 2015-03-17 at 17:34 +0000, Wei Liu wrote:
> On Tue, Mar 17, 2015 at 09:30:58AM -0600, Jim Fehlig wrote:
> > From: Ian Jackson <ian.jackson@eu.citrix.com>
> > 
> > Unlock the userdata before we actually call xc_domain_destroy.  This
> > leaves open the possibility that other libxl callers will see the
> > half-destroyed domain (with no devices, paused), but this is fine.
> > 
> > Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
> > CC: Wei Liu <wei.liu2@citrix.com>
> > Reviewed-by: Jim Fehlig <jfehlig@suse.com>
> > Tested-by: Jim Fehlig <jfehlig@suse.com>
> 
> Acked-by: Wei Liu <wei.liu2@citrix.com>

Acked-by: Ian Campbell <ian.campbell@citrix.com>

I'm not sure if this is safe/sensible to apply without the preceding
patch which I had a comment on.

Ian.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 3/3] libxl: Domain destroy: fork
  2015-03-17 15:30 ` [PATCH 3/3] libxl: Domain destroy: fork Jim Fehlig
  2015-03-17 18:04   ` Wei Liu
@ 2015-03-18 12:27   ` Ian Campbell
  2015-03-20 15:14     ` Jim Fehlig
  1 sibling, 1 reply; 17+ messages in thread
From: Ian Campbell @ 2015-03-18 12:27 UTC (permalink / raw)
  To: Jim Fehlig; +Cc: wei.liu2, Ian.Jackson, xen-devel

On Tue, 2015-03-17 at 09:30 -0600, Jim Fehlig wrote:
> From: Ian Jackson <ian.jackson@eu.citrix.com>
> 
> Call xc_domain_destroy in a subprocess.  That allows us to do so
> asynchronously, rather than blocking the whole process calling libxl.
> 
> The changes in detail:
> 
>  * Provide an libxl__ev_child in libxl__domain_destroy_state, and
>    initialise it in libxl__domain_destroy.  There is no possibility
>    to `clean up' a libxl__ev_child, but there need to clean it up, as
                                               ^is no ?

>    the control flow ensures that we only continue after the child has
>    exited.
> 
>  * Call libxl__ev_child_fork at the right point and put the call to
>    xc_domain_destroy and associated logging in the child.  (The child
>    opens a new xenctrl handle because we mustn't use the parent's.)

Do I read it correctly that this is arranging to use the same logger as
the parent was, IOW the log messages will still go somewhere the
application considers useful?

> 
>  * Consequently, the success return path from domain_destroy_domid_cb
>    no longer calls dis->callback.  Instead it simply returns.
> 
>  * We plumb the errorno value through the child's exit status, if it
>    fits.  This means we normally do the logging only in the parent.

(This makes my question above moot I think).

>  * Incidentally, we fix the bug that we were treating the return value
>    from xc_domain_destroy as an errno value when in fact it is a
>    return value from do_domctl (in this case, 0 or -1 setting errno).
> 
> Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
> Reviewed-by: Jim Fehlig <jfehlig@suse.com>
> Tested-by: Jim Fehlig <jfehlig@suse.com>
> ---
>  tools/libxl/libxl.c          | 57 ++++++++++++++++++++++++++++++++++++++++----
>  tools/libxl/libxl_internal.h |  1 +
>  2 files changed, 53 insertions(+), 5 deletions(-)
> 
> diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
> index b6541d4..b43db1a 100644
> --- a/tools/libxl/libxl.c
> +++ b/tools/libxl/libxl.c
> @@ -1481,6 +1481,10 @@ static void domain_destroy_callback(libxl__egc *egc,
>  static void destroy_finish_check(libxl__egc *egc,
>                                   libxl__domain_destroy_state *dds);
>  
> +static void domain_destroy_domid_cb(libxl__egc *egc,
> +                                    libxl__ev_child *destroyer,
> +                                    pid_t pid, int status);

This seems misplaced, doesn't it want to go before libxl__destroy_domid
just after the prototype of devices_destroy_cb, which is where it lives
in the sequence?

Ian.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 3/3] libxl: Domain destroy: fork
  2015-03-17 18:04   ` Wei Liu
@ 2015-03-18 12:28     ` Ian Campbell
  0 siblings, 0 replies; 17+ messages in thread
From: Ian Campbell @ 2015-03-18 12:28 UTC (permalink / raw)
  To: Wei Liu; +Cc: Jim Fehlig, Ian.Jackson, xen-devel

On Tue, 2015-03-17 at 18:04 +0000, Wei Liu wrote:
> > +        } else {
> > +            LOGE(ERROR,
> > + "xc_domain_destroy failed for %d (with difficult errno value %d)",
> 
> Indentation is wrong.

It is preferable to pull back the indentation of a string literal rather
than wrap it in order to preserve grep-ability.

How far back to pull it is a matter of preference I guess.

Ian.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/3] libxl: In domain death search, start search at first domid we want
  2015-03-18 12:19   ` Ian Campbell
@ 2015-03-18 17:47     ` Jim Fehlig
  2015-03-19 10:22       ` Ian Campbell
  0 siblings, 1 reply; 17+ messages in thread
From: Jim Fehlig @ 2015-03-18 17:47 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Ian.Jackson, wei.liu2, xen-devel

Ian Campbell wrote:
> On Tue, 2015-03-17 at 09:30 -0600, Jim Fehlig wrote:
>   
>> From: Ian Jackson <Ian.Jackson@eu.citrix.com>
>>
>> From: Ian Jackson <Ian.Jackson@eu.citrix.com>
>>
>> When domain_death_xswatch_callback needed a further call to
>> xc_domain_getinfolist it would restart it with the last domain it
>> found rather than the first one it wants.
>>
>> If it only wants one it will also only ask for one domain.  The result
>> would then be that it gets the previous domain again (ie, the previous
>> one to the one it wants), which still doesn't reveal the answer to the
>> question, and it would therefore loop again.
>>
>> It's completely unclear to me why I thought it was a good idea to
>> start the xc_domain_getinfolist with the last domain previously found
>> rather than the first one left un-confirmed.  The code has been that
>> way since it was introduced.
>>     
>
> Is it because the xc_domain_getinfolist will fetch at most:
>         int nentries = LIBXL_TAILQ_NEXT(evg, entry) ? 200 : 1;
> entries?
>
> After your change then if the domid we are looking for is the 201st
> domain then won't we just keep going round looking at the first 200
> (undying) domains?
>   

Yes, that theoretically looks to be the case. When all 200 domains have
been examined the inner loop is terminated

if (got == gotend) {
LIBXL__LOG(CTX, LIBXL__LOG_DEBUG, " got==gotend");
break;
}

which will repeat the outer loop with same conditions. But
xc_domain_getinfolist() is called with first_domain set to the domain ID
we are looking for. Is it possible for xc_domain_getinfolist() to return
200 domains with IDs less than the ID we asked for?

Regards,
Jim

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/3] libxl: Domain destroy: unlock userdata earlier
  2015-03-18 12:20     ` Ian Campbell
@ 2015-03-18 17:52       ` Jim Fehlig
  2015-03-20 11:58         ` Ian Campbell
  0 siblings, 1 reply; 17+ messages in thread
From: Jim Fehlig @ 2015-03-18 17:52 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Ian.Jackson, Wei Liu, xen-devel

Ian Campbell wrote:
> On Tue, 2015-03-17 at 17:34 +0000, Wei Liu wrote:
>   
>> On Tue, Mar 17, 2015 at 09:30:58AM -0600, Jim Fehlig wrote:
>>     
>>> From: Ian Jackson <ian.jackson@eu.citrix.com>
>>>
>>> Unlock the userdata before we actually call xc_domain_destroy.  This
>>> leaves open the possibility that other libxl callers will see the
>>> half-destroyed domain (with no devices, paused), but this is fine.
>>>
>>> Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
>>> CC: Wei Liu <wei.liu2@citrix.com>
>>> Reviewed-by: Jim Fehlig <jfehlig@suse.com>
>>> Tested-by: Jim Fehlig <jfehlig@suse.com>
>>>       
>> Acked-by: Wei Liu <wei.liu2@citrix.com>
>>     
>
> Acked-by: Ian Campbell <ian.campbell@citrix.com>
>
> I'm not sure if this is safe/sensible to apply without the preceding
> patch which I had a comment on.
>   

It is not so sensible without the subsequent patch 3/3.  This patch is
not related to the preceding patch 1/3.

Regards,
Jim

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/3] libxl: In domain death search, start search at first domid we want
  2015-03-18 17:47     ` Jim Fehlig
@ 2015-03-19 10:22       ` Ian Campbell
  0 siblings, 0 replies; 17+ messages in thread
From: Ian Campbell @ 2015-03-19 10:22 UTC (permalink / raw)
  To: Jim Fehlig; +Cc: wei.liu2, Ian.Jackson, xen-devel

On Wed, 2015-03-18 at 11:47 -0600, Jim Fehlig wrote:
> Ian Campbell wrote:
> > On Tue, 2015-03-17 at 09:30 -0600, Jim Fehlig wrote:
> >   
> >> From: Ian Jackson <Ian.Jackson@eu.citrix.com>
> >>
> >> From: Ian Jackson <Ian.Jackson@eu.citrix.com>
> >>
> >> When domain_death_xswatch_callback needed a further call to
> >> xc_domain_getinfolist it would restart it with the last domain it
> >> found rather than the first one it wants.
> >>
> >> If it only wants one it will also only ask for one domain.  The result
> >> would then be that it gets the previous domain again (ie, the previous
> >> one to the one it wants), which still doesn't reveal the answer to the
> >> question, and it would therefore loop again.
> >>
> >> It's completely unclear to me why I thought it was a good idea to
> >> start the xc_domain_getinfolist with the last domain previously found
> >> rather than the first one left un-confirmed.  The code has been that
> >> way since it was introduced.
> >>     
> >
> > Is it because the xc_domain_getinfolist will fetch at most:
> >         int nentries = LIBXL_TAILQ_NEXT(evg, entry) ? 200 : 1;
> > entries?
> >
> > After your change then if the domid we are looking for is the 201st
> > domain then won't we just keep going round looking at the first 200
> > (undying) domains?
> >   
> 
> Yes, that theoretically looks to be the case. When all 200 domains have
> been examined the inner loop is terminated
> 
> if (got == gotend) {
> LIBXL__LOG(CTX, LIBXL__LOG_DEBUG, " got==gotend");
> break;
> }
> 
> which will repeat the outer loop with same conditions. But
> xc_domain_getinfolist() is called with first_domain set to the domain ID
> we are looking for. Is it possible for xc_domain_getinfolist() to return
> 200 domains with IDs less than the ID we asked for?

I'm not sure, but it does seem rather unlikely.

However what I'm not sure about is whether this function is intending to
handle multiple domains dying or not. It looks to me like it is (and
that's what I would expect).

Ah, and Ian has pushed the selection of domid from outside the loop to
inside the loop (as evg->domid). Which means that, assuming
xc_domain_getinfolist can't return 200 domains with IDs less that the
one we wanted we will infact make progress, possibly handling up to 200
dead domains, and then go round the loop again and start from the new
smallest domid. (evg is progressed inside domain_death_occurred).

So right, I think I understand what this patch is doing and I think it
is correct:

Acked-by: Ian Campbell <ian.campbell@citrix.com>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/3] libxl: Domain destroy: unlock userdata earlier
  2015-03-18 17:52       ` Jim Fehlig
@ 2015-03-20 11:58         ` Ian Campbell
  0 siblings, 0 replies; 17+ messages in thread
From: Ian Campbell @ 2015-03-20 11:58 UTC (permalink / raw)
  To: Jim Fehlig; +Cc: Ian.Jackson, Wei Liu, xen-devel

On Wed, 2015-03-18 at 11:52 -0600, Jim Fehlig wrote:
> Ian Campbell wrote:
> > On Tue, 2015-03-17 at 17:34 +0000, Wei Liu wrote:
> >   
> >> On Tue, Mar 17, 2015 at 09:30:58AM -0600, Jim Fehlig wrote:
> >>     
> >>> From: Ian Jackson <ian.jackson@eu.citrix.com>
> >>>
> >>> Unlock the userdata before we actually call xc_domain_destroy.  This
> >>> leaves open the possibility that other libxl callers will see the
> >>> half-destroyed domain (with no devices, paused), but this is fine.
> >>>
> >>> Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
> >>> CC: Wei Liu <wei.liu2@citrix.com>
> >>> Reviewed-by: Jim Fehlig <jfehlig@suse.com>
> >>> Tested-by: Jim Fehlig <jfehlig@suse.com>
> >>>       
> >> Acked-by: Wei Liu <wei.liu2@citrix.com>
> >>     
> >
> > Acked-by: Ian Campbell <ian.campbell@citrix.com>
> >
> > I'm not sure if this is safe/sensible to apply without the preceding
> > patch which I had a comment on.
> >   
> 
> It is not so sensible without the subsequent patch 3/3.  This patch is
> not related to the preceding patch 1/3.

Thanks, I applied 1/3 and then I decided I may as well apply 2/3 as
well, there's no harm and it would be interesting to see how the locking
change holds up.

Ian.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 3/3] libxl: Domain destroy: fork
  2015-03-18 12:27   ` Ian Campbell
@ 2015-03-20 15:14     ` Jim Fehlig
  2015-03-20 16:03       ` Ian Campbell
  0 siblings, 1 reply; 17+ messages in thread
From: Jim Fehlig @ 2015-03-20 15:14 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Ian.Jackson, wei.liu2, xen-devel

[-- Attachment #1: Type: text/plain, Size: 878 bytes --]

Ian Campbell wrote:
> On Tue, 2015-03-17 at 09:30 -0600, Jim Fehlig wrote:
>   
>> diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
>> index b6541d4..b43db1a 100644
>> --- a/tools/libxl/libxl.c
>> +++ b/tools/libxl/libxl.c
>> @@ -1481,6 +1481,10 @@ static void domain_destroy_callback(libxl__egc *egc,
>>  static void destroy_finish_check(libxl__egc *egc,
>>                                   libxl__domain_destroy_state *dds);
>>  
>> +static void domain_destroy_domid_cb(libxl__egc *egc,
>> +                                    libxl__ev_child *destroyer,
>> +                                    pid_t pid, int status);
>>     
>
> This seems misplaced, doesn't it want to go before libxl__destroy_domid
> just after the prototype of devices_destroy_cb, which is where it lives
> in the sequence?
>   

Yes, I think you are right. Here an adjusted patch.

Regards,
Jim


[-- Attachment #2: dom-destroy-fork.patch --]
[-- Type: text/x-diff, Size: 4911 bytes --]

>From 0c6fef6926580938b0fb8750c009af5151a056dc Mon Sep 17 00:00:00 2001
From: Ian Jackson <ian.jackson@eu.citrix.com>
Date: Thu, 5 Mar 2015 16:28:04 +0000
Subject: [PATCH 3/3] libxl: Domain destroy: fork

Call xc_domain_destroy in a subprocess.  That allows us to do so
asynchronously, rather than blocking the whole process calling libxl.

The changes in detail:

 * Provide an libxl__ev_child in libxl__domain_destroy_state, and
   initialise it in libxl__domain_destroy.  There is no possibility
   to `clean up' a libxl__ev_child, but there need to clean it up, as
   the control flow ensures that we only continue after the child has
   exited.

 * Call libxl__ev_child_fork at the right point and put the call to
   xc_domain_destroy and associated logging in the child.  (The child
   opens a new xenctrl handle because we mustn't use the parent's.)

 * Consequently, the success return path from domain_destroy_domid_cb
   no longer calls dis->callback.  Instead it simply returns.

 * We plumb the errorno value through the child's exit status, if it
   fits.  This means we normally do the logging only in the parent.

 * Incidentally, we fix the bug that we were treating the return value
   from xc_domain_destroy as an errno value when in fact it is a
   return value from do_domctl (in this case, 0 or -1 setting errno).

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Jim Fehlig <jfehlig@suse.com>
Tested-by: Jim Fehlig <jfehlig@suse.com>
---
 tools/libxl/libxl.c          | 57 ++++++++++++++++++++++++++++++++++++++++----
 tools/libxl/libxl_internal.h |  1 +
 2 files changed, 53 insertions(+), 5 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index b6541d4..b53a183 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -1558,6 +1558,10 @@ static void devices_destroy_cb(libxl__egc *egc,
                                libxl__devices_remove_state *drs,
                                int rc);
 
+static void domain_destroy_domid_cb(libxl__egc *egc,
+                                    libxl__ev_child *destroyer,
+                                    pid_t pid, int status);
+
 void libxl__destroy_domid(libxl__egc *egc, libxl__destroy_domid_state *dis)
 {
     STATE_AO_GC(dis->ao);
@@ -1567,6 +1571,8 @@ void libxl__destroy_domid(libxl__egc *egc, libxl__destroy_domid_state *dis)
     char *pid;
     int rc, dm_present;
 
+    libxl__ev_child_init(&dis->destroyer);
+
     rc = libxl_domain_info(ctx, NULL, domid);
     switch(rc) {
     case 0:
@@ -1672,17 +1678,58 @@ static void devices_destroy_cb(libxl__egc *egc,
 
     libxl__unlock_domain_userdata(lock);
 
-    rc = xc_domain_destroy(ctx->xch, domid);
-    if (rc < 0) {
-        LIBXL__LOG_ERRNOVAL(ctx, LIBXL__LOG_ERROR, rc, "xc_domain_destroy failed for %d", domid);
+    rc = libxl__ev_child_fork(gc, &dis->destroyer, domain_destroy_domid_cb);
+    if (rc < 0) goto out;
+    if (!rc) { /* child */
+        ctx->xch = xc_interface_open(ctx->lg,0,0);
+        if (!ctx->xch) goto badchild;
+
+        rc = xc_domain_destroy(ctx->xch, domid);
+        if (rc < 0) goto badchild;
+        _exit(0);
+
+    badchild:
+        if (errno > 0  && errno < 126) {
+            _exit(errno);
+        } else {
+            LOGE(ERROR,
+ "xc_domain_destroy failed for %d (with difficult errno value %d)",
+                 domid, errno);
+            _exit(-1);
+        }
+    }
+    LOG(INFO, "forked pid %ld for destroy of domain %d", (long)rc, domid);
+
+    return;
+
+out:
+    dis->callback(egc, dis, rc);
+    return;
+}
+
+static void domain_destroy_domid_cb(libxl__egc *egc,
+                                    libxl__ev_child *destroyer,
+                                    pid_t pid, int status)
+{
+    libxl__destroy_domid_state *dis = CONTAINER_OF(destroyer, *dis, destroyer);
+    STATE_AO_GC(dis->ao);
+    int rc;
+
+    if (status) {
+        if (WIFEXITED(status) && WEXITSTATUS(status)<126) {
+            LOGEV(ERROR, WEXITSTATUS(status),
+                  "xc_domain_destroy failed for %"PRIu32"", dis->domid);
+        } else {
+            libxl_report_child_exitstatus(CTX, XTL_ERROR,
+                                          "async domain destroy", pid, status);
+        }
         rc = ERROR_FAIL;
         goto out;
     }
     rc = 0;
 
-out:
+ out:
     dis->callback(egc, dis, rc);
-    return;
 }
 
 int libxl_console_exec(libxl_ctx *ctx, uint32_t domid, int cons_num,
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 934465a..28d32ef 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -2957,6 +2957,7 @@ struct libxl__destroy_domid_state {
     libxl__domid_destroy_cb *callback;
     /* private to implementation */
     libxl__devices_remove_state drs;
+    libxl__ev_child destroyer;
 };
 
 struct libxl__domain_destroy_state {
-- 
1.8.0.1


[-- Attachment #3: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH 3/3] libxl: Domain destroy: fork
  2015-03-20 15:14     ` Jim Fehlig
@ 2015-03-20 16:03       ` Ian Campbell
  0 siblings, 0 replies; 17+ messages in thread
From: Ian Campbell @ 2015-03-20 16:03 UTC (permalink / raw)
  To: Jim Fehlig; +Cc: Ian.Jackson, wei.liu2, xen-devel

On Fri, 2015-03-20 at 09:14 -0600, Jim Fehlig wrote:
> Ian Campbell wrote:
> > On Tue, 2015-03-17 at 09:30 -0600, Jim Fehlig wrote:
> >   
> >> diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
> >> index b6541d4..b43db1a 100644
> >> --- a/tools/libxl/libxl.c
> >> +++ b/tools/libxl/libxl.c
> >> @@ -1481,6 +1481,10 @@ static void domain_destroy_callback(libxl__egc *egc,
> >>  static void destroy_finish_check(libxl__egc *egc,
> >>                                   libxl__domain_destroy_state *dds);
> >>  
> >> +static void domain_destroy_domid_cb(libxl__egc *egc,
> >> +                                    libxl__ev_child *destroyer,
> >> +                                    pid_t pid, int status);
> >>     
> >
> > This seems misplaced, doesn't it want to go before libxl__destroy_domid
> > just after the prototype of devices_destroy_cb, which is where it lives
> > in the sequence?
> >   
> 
> Yes, I think you are right. Here an adjusted patch.

Acked + applied, thanks!

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2015-03-20 16:03 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-03-17 15:30 [PATCH 0/3] libxl: Fixes from Ian Jackson Jim Fehlig
2015-03-17 15:30 ` [PATCH 1/3] libxl: In domain death search, start search at first domid we want Jim Fehlig
2015-03-17 17:34   ` Wei Liu
2015-03-18 12:19   ` Ian Campbell
2015-03-18 17:47     ` Jim Fehlig
2015-03-19 10:22       ` Ian Campbell
2015-03-17 15:30 ` [PATCH 2/3] libxl: Domain destroy: unlock userdata earlier Jim Fehlig
2015-03-17 17:34   ` Wei Liu
2015-03-18 12:20     ` Ian Campbell
2015-03-18 17:52       ` Jim Fehlig
2015-03-20 11:58         ` Ian Campbell
2015-03-17 15:30 ` [PATCH 3/3] libxl: Domain destroy: fork Jim Fehlig
2015-03-17 18:04   ` Wei Liu
2015-03-18 12:28     ` Ian Campbell
2015-03-18 12:27   ` Ian Campbell
2015-03-20 15:14     ` Jim Fehlig
2015-03-20 16:03       ` Ian Campbell

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.