All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 00/12] libxl: fork: SIGCHLD flexibility
@ 2014-01-17 16:23 Ian Jackson
  2014-01-17 16:23 ` [PATCH 01/12] libxl: fork: Break out checked_waitpid Ian Jackson
                   ` (13 more replies)
  0 siblings, 14 replies; 58+ messages in thread
From: Ian Jackson @ 2014-01-17 16:23 UTC (permalink / raw)
  To: xen-devel; +Cc: Jim Fehlig, Ian Campbell

libvirt reaps its children synchronously and has no central pid
registry and no dispatch mechanism.  libxl does have a pid registry so
can provide a selective reaping facility, but that is not currently
exposed.  Here we expose it.

Also, libvirt has multiple libxl ctxs.  Prior to this series it is not
possible for those to share SIGCHLD: libxl expects either the
application, or _one_ libxl ctx, to own SIGCHLD.  In the final patch
of this series we relax this restriction by having libxl maintain a
process-wide list of the libxl ctxs that are supposed to be interested
in SIGCHLD.

I have not tested the selective reaping functionality.  The most
plausible test environment for that is a suitably modified libvirt.

I have tested the new SIGCHLD plumbing, at least with a single ctx,
since xl uses it.  Testing that it works in a real multi-ctx
application is again probably most easily done with libvirt.

I hope that with this series applied, simply having libvirt pass
libxl_sigchld_owner_libxl_always_selective_reap should be sufficient
for everything to work.  There is no need to specifically request the
SIGCHLD-sharing.

 a 01/12] libxl: fork: Break out checked_waitpid
 a 02/12] libxl: fork: Break out childproc_reaped_ours
 a 03/12] libxl: fork: Clarify docs for libxl_sigchld_owner
*  04/12] libxl: fork: Document libxl_sigchld_owner_libxl better
 a 05/12] libxl: fork: assert that chldmode is right
 a 06/12] libxl: fork: Provide libxl_childproc_sigchld_occurred
+a 07/12] libxl: fork: Provide ..._always_selective_reap
 a 08/12] libxl: fork: Provide LIBXL_HAVE_SIGCHLD_SELECTIVE_REAP
*  09/12] libxl: fork: Rename sigchld handler functions
*  10/12] libxl: fork: Break out sigchld_installhandler_core
*  11/12] libxl: fork: Break out sigchld_sethandler_raw
*  12/12] libxl: fork: Share SIGCHLD handler amongst ctxs

(a = acked; * = new patch; + = modified patch)

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [PATCH 01/12] libxl: fork: Break out checked_waitpid
  2014-01-17 16:23 [PATCH 00/12] libxl: fork: SIGCHLD flexibility Ian Jackson
@ 2014-01-17 16:23 ` Ian Jackson
  2014-01-17 16:23 ` [PATCH 02/12] libxl: fork: Break out childproc_reaped_ours Ian Jackson
                   ` (12 subsequent siblings)
  13 siblings, 0 replies; 58+ messages in thread
From: Ian Jackson @ 2014-01-17 16:23 UTC (permalink / raw)
  To: xen-devel; +Cc: Jim Fehlig, Ian Jackson, Ian Campbell

This is a simple error-handling wrapper for waitpid.  We're going to
want to call waitpid somewhere else and this avoids some of the
duplication.

No functional change in this patch.  (Technically, we used to check
chldmode_ours again in the EINTR case, and don't now, but that can't
have changed because we continuously hold the libxl ctx lock.)

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Cc: Jim Fehlig <jfehlig@suse.com>
Cc: Ian Campbell <Ian.Campbell@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
 tools/libxl/libxl_fork.c |   26 ++++++++++++++++++--------
 1 file changed, 18 insertions(+), 8 deletions(-)

diff --git a/tools/libxl/libxl_fork.c b/tools/libxl/libxl_fork.c
index 4ae9f94..2252370 100644
--- a/tools/libxl/libxl_fork.c
+++ b/tools/libxl/libxl_fork.c
@@ -155,6 +155,22 @@ int libxl__carefd_fd(const libxl__carefd *cf)
  * Actual child process handling
  */
 
+/* Like waitpid(,,WNOHANG) but handles all errors except ECHILD. */
+static pid_t checked_waitpid(libxl__egc *egc, pid_t want, int *status)
+{
+    for (;;) {
+        pid_t got = waitpid(want, status, WNOHANG);
+        if (got != -1)
+            return got;
+        if (errno == ECHILD)
+            return got;
+        if (errno == EINTR)
+            continue;
+        LIBXL__EVENT_DISASTER(egc, "waitpid() failed", errno, 0);
+        return 0;
+    }
+}
+
 static void sigchld_selfpipe_handler(libxl__egc *egc, libxl__ev_fd *ev,
                                      int fd, short events, short revents);
 
@@ -331,16 +347,10 @@ static void sigchld_selfpipe_handler(libxl__egc *egc, libxl__ev_fd *ev,
 
     while (chldmode_ours(CTX, 0) /* in case the app changes the mode */) {
         int status;
-        pid_t pid = waitpid(-1, &status, WNOHANG);
-
-        if (pid == 0) return;
+        pid_t pid = checked_waitpid(egc, -1, &status);
 
-        if (pid == -1) {
-            if (errno == ECHILD) return;
-            if (errno == EINTR) continue;
-            LIBXL__EVENT_DISASTER(egc, "waitpid() failed", errno, 0);
+        if (pid == 0 || pid == -1 /* ECHILD */)
             return;
-        }
 
         int rc = childproc_reaped(egc, pid, status);
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [PATCH 02/12] libxl: fork: Break out childproc_reaped_ours
  2014-01-17 16:23 [PATCH 00/12] libxl: fork: SIGCHLD flexibility Ian Jackson
  2014-01-17 16:23 ` [PATCH 01/12] libxl: fork: Break out checked_waitpid Ian Jackson
@ 2014-01-17 16:23 ` Ian Jackson
  2014-01-17 16:23 ` [PATCH 03/12] libxl: fork: Clarify docs for libxl_sigchld_owner Ian Jackson
                   ` (11 subsequent siblings)
  13 siblings, 0 replies; 58+ messages in thread
From: Ian Jackson @ 2014-01-17 16:23 UTC (permalink / raw)
  To: xen-devel; +Cc: Jim Fehlig, Ian Jackson, Ian Campbell

We're going to want to do this again at a new call site.

No functional change.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Cc: Jim Fehlig <jfehlig@suse.com>
Cc: Ian Campbell <Ian.Campbell@citrix.com>
Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
---
 tools/libxl/libxl_fork.c |   12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/tools/libxl/libxl_fork.c b/tools/libxl/libxl_fork.c
index 2252370..7b84765 100644
--- a/tools/libxl/libxl_fork.c
+++ b/tools/libxl/libxl_fork.c
@@ -290,6 +290,14 @@ static int perhaps_installhandler(libxl__gc *gc, bool creating)
     return 0;
 }
 
+static void childproc_reaped_ours(libxl__egc *egc, libxl__ev_child *ch,
+                                 int status)
+{
+    LIBXL_LIST_REMOVE(ch, entry);
+    ch->pid = -1;
+    ch->callback(egc, ch, ch->pid, status);
+}
+
 static int childproc_reaped(libxl__egc *egc, pid_t pid, int status)
 {
     EGC_GC;
@@ -303,9 +311,7 @@ static int childproc_reaped(libxl__egc *egc, pid_t pid, int status)
     return ERROR_UNKNOWN_CHILD;
 
  found:
-    LIBXL_LIST_REMOVE(ch, entry);
-    ch->pid = -1;
-    ch->callback(egc, ch, pid, status);
+    childproc_reaped_ours(egc, ch, status);
 
     perhaps_removehandler(gc);
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [PATCH 03/12] libxl: fork: Clarify docs for libxl_sigchld_owner
  2014-01-17 16:23 [PATCH 00/12] libxl: fork: SIGCHLD flexibility Ian Jackson
  2014-01-17 16:23 ` [PATCH 01/12] libxl: fork: Break out checked_waitpid Ian Jackson
  2014-01-17 16:23 ` [PATCH 02/12] libxl: fork: Break out childproc_reaped_ours Ian Jackson
@ 2014-01-17 16:23 ` Ian Jackson
  2014-01-17 16:23 ` [PATCH 04/12] libxl: fork: Document libxl_sigchld_owner_libxl better Ian Jackson
                   ` (10 subsequent siblings)
  13 siblings, 0 replies; 58+ messages in thread
From: Ian Jackson @ 2014-01-17 16:23 UTC (permalink / raw)
  To: xen-devel; +Cc: Jim Fehlig, Ian Jackson, Ian Campbell

Clarify that libxl_sigchld_owner_libxl causes libxl to reap all the
process's children, and clarify the wording of the description of
libxl_sigchld_owner_libxl_always.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Cc: Jim Fehlig <jfehlig@suse.com>
Cc: Ian Campbell <Ian.Campbell@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
 tools/libxl/libxl_event.h |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/tools/libxl/libxl_event.h b/tools/libxl/libxl_event.h
index 6261f99..ff0b2fa 100644
--- a/tools/libxl/libxl_event.h
+++ b/tools/libxl/libxl_event.h
@@ -467,7 +467,8 @@ void libxl_osevent_occurred_timeout(libxl_ctx *ctx, void *for_libxl)
 
 
 typedef enum {
-    /* libxl owns SIGCHLD whenever it has a child. */
+    /* libxl owns SIGCHLD whenever it has a child, and reaps
+     * all children, including those not spawned by libxl. */
     libxl_sigchld_owner_libxl,
 
     /* Application promises to call libxl_childproc_exited but NOT
@@ -476,7 +477,7 @@ typedef enum {
     libxl_sigchld_owner_mainloop,
 
     /* libxl owns SIGCHLD all the time, and the application is
-     * relying on libxl's event loop for reaping its own children. */
+     * relying on libxl's event loop for reaping its children too. */
     libxl_sigchld_owner_libxl_always,
 } libxl_sigchld_owner;
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [PATCH 04/12] libxl: fork: Document libxl_sigchld_owner_libxl better
  2014-01-17 16:23 [PATCH 00/12] libxl: fork: SIGCHLD flexibility Ian Jackson
                   ` (2 preceding siblings ...)
  2014-01-17 16:23 ` [PATCH 03/12] libxl: fork: Clarify docs for libxl_sigchld_owner Ian Jackson
@ 2014-01-17 16:23 ` Ian Jackson
  2014-01-17 16:28   ` Ian Campbell
  2014-01-17 16:23 ` [PATCH 05/12] libxl: fork: assert that chldmode is right Ian Jackson
                   ` (9 subsequent siblings)
  13 siblings, 1 reply; 58+ messages in thread
From: Ian Jackson @ 2014-01-17 16:23 UTC (permalink / raw)
  To: xen-devel; +Cc: Jim Fehlig, Ian Jackson, Ian Campbell

libxl_sigchld_owner_libxl ought to have been mentioned in the list of
options for chldowner.  Since it's the default, move the description
of the its behaviour into the description of that option.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Cc: Jim Fehlig <jfehlig@suse.com>
Cc: Ian Campbell <Ian.Campbell@citrix.com>
---
 tools/libxl/libxl_event.h |   27 +++++++++++++++++++--------
 1 file changed, 19 insertions(+), 8 deletions(-)

diff --git a/tools/libxl/libxl_event.h b/tools/libxl/libxl_event.h
index ff0b2fa..4f72c4b 100644
--- a/tools/libxl/libxl_event.h
+++ b/tools/libxl/libxl_event.h
@@ -442,9 +442,26 @@ void libxl_osevent_occurred_timeout(libxl_ctx *ctx, void *for_libxl)
  * For programs which run their own children alongside libxl's:
  *
  *     A program which does this must call libxl_childproc_setmode.
- *     There are two options:
+ *     There are three options:
  * 
+ *     libxl_sigchld_owner_libxl:
+ *
+ *       While any libxl operation which might use child processes
+ *       is running, works like libxl_sigchld_owner_libxl_always;
+ *       but, deinstalls the handler the rest of the time.
+ *
+ *       In this mode, the application, while it uses any libxl
+ *       operation which might create or use child processes (see
+ *       above):
+ *           - Must not have any child processes running.
+ *           - Must not install a SIGCHLD handler.
+ *           - Must not reap any children.
+ *
+ *       This is the default (i.e. if setmode is not called, or 0 is
+ *       passed for hooks).
+ *
  *     libxl_sigchld_owner_mainloop:
+ *
  *       The application must install a SIGCHLD handler and reap (at
  *       least) all of libxl's children and pass their exit status to
  *       libxl by calling libxl_childproc_exited.  (If the application
@@ -452,17 +469,11 @@ void libxl_osevent_occurred_timeout(libxl_ctx *ctx, void *for_libxl)
  *       on each ctx.)
  *
  *     libxl_sigchld_owner_libxl_always:
+ *
  *       The application expects libxl to reap all of its children,
  *       and provides a callback to be notified of their exit
  *       statues.  The application must have only one libxl_ctx
  *       configured this way.
- *
- * An application which fails to call setmode, or which passes 0 for
- * hooks, while it uses any libxl operation which might
- * create or use child processes (see above):
- *   - Must not have any child processes running.
- *   - Must not install a SIGCHLD handler.
- *   - Must not reap any children.
  */
 
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [PATCH 05/12] libxl: fork: assert that chldmode is right
  2014-01-17 16:23 [PATCH 00/12] libxl: fork: SIGCHLD flexibility Ian Jackson
                   ` (3 preceding siblings ...)
  2014-01-17 16:23 ` [PATCH 04/12] libxl: fork: Document libxl_sigchld_owner_libxl better Ian Jackson
@ 2014-01-17 16:23 ` Ian Jackson
  2014-01-17 16:23 ` [PATCH 06/12] libxl: fork: Provide libxl_childproc_sigchld_occurred Ian Jackson
                   ` (8 subsequent siblings)
  13 siblings, 0 replies; 58+ messages in thread
From: Ian Jackson @ 2014-01-17 16:23 UTC (permalink / raw)
  To: xen-devel; +Cc: Jim Fehlig, Ian Jackson, Ian Campbell

In libxl_childproc_reaped, check that the chldmode is as expected.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Cc: Jim Fehlig <jfehlig@suse.com>
Cc: Ian Campbell <Ian.Campbell@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
 tools/libxl/libxl_fork.c |    2 ++
 1 file changed, 2 insertions(+)

diff --git a/tools/libxl/libxl_fork.c b/tools/libxl/libxl_fork.c
index 7b84765..85db2fb 100644
--- a/tools/libxl/libxl_fork.c
+++ b/tools/libxl/libxl_fork.c
@@ -322,6 +322,8 @@ int libxl_childproc_reaped(libxl_ctx *ctx, pid_t pid, int status)
 {
     EGC_INIT(ctx);
     CTX_LOCK;
+    assert(CTX->childproc_hooks->chldowner
+           == libxl_sigchld_owner_mainloop);
     int rc = childproc_reaped(egc, pid, status);
     CTX_UNLOCK;
     EGC_FREE;
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [PATCH 06/12] libxl: fork: Provide libxl_childproc_sigchld_occurred
  2014-01-17 16:23 [PATCH 00/12] libxl: fork: SIGCHLD flexibility Ian Jackson
                   ` (4 preceding siblings ...)
  2014-01-17 16:23 ` [PATCH 05/12] libxl: fork: assert that chldmode is right Ian Jackson
@ 2014-01-17 16:23 ` Ian Jackson
  2014-01-17 16:24 ` [PATCH 07/12] libxl: fork: Provide ..._always_selective_reap Ian Jackson
                   ` (7 subsequent siblings)
  13 siblings, 0 replies; 58+ messages in thread
From: Ian Jackson @ 2014-01-17 16:23 UTC (permalink / raw)
  To: xen-devel; +Cc: Jim Fehlig, Ian Jackson, Ian Campbell

Applications exist which don't keep track of all their child processes
in a manner suitable for coherent dispatch of their termination.  In
such a situation, nothing in the whole process may call wait, or
waitpid(-1,,).  Doing so reaps processes belonging to other parts of
the application and there is then no way to deliver the exit status to
the right place.

To facilitate this, provide a facility for such an application to ask
libxl to call waitpid on each of its children individually.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Cc: Jim Fehlig <jfehlig@suse.com>
Cc: Ian Campbell <Ian.Campbell@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
 tools/libxl/libxl_event.h |   29 +++++++++++++++++++++++++----
 tools/libxl/libxl_fork.c  |   45 +++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 70 insertions(+), 4 deletions(-)

diff --git a/tools/libxl/libxl_event.h b/tools/libxl/libxl_event.h
index 4f72c4b..3c93955 100644
--- a/tools/libxl/libxl_event.h
+++ b/tools/libxl/libxl_event.h
@@ -482,9 +482,10 @@ typedef enum {
      * all children, including those not spawned by libxl. */
     libxl_sigchld_owner_libxl,
 
-    /* Application promises to call libxl_childproc_exited but NOT
-     * from within a signal handler.  libxl will not itself arrange to
-     * (un)block or catch SIGCHLD. */
+    /* Application promises to discover when SIGCHLD occurs and call
+     * libxl_childproc_exited or libxl_childproc_sigchld_occurred (but
+     * NOT from within a signal handler).  libxl will not itself
+     * arrange to (un)block or catch SIGCHLD. */
     libxl_sigchld_owner_mainloop,
 
     /* libxl owns SIGCHLD all the time, and the application is
@@ -542,7 +543,8 @@ void libxl_childproc_setmode(libxl_ctx *ctx, const libxl_childproc_hooks *hooks,
 
 /*
  * This function is for an application which owns SIGCHLD and which
- * therefore reaps all of the process's children.
+ * reaps all of the process's children, and dispatches the exit status
+ * to the correct place inside the application.
  *
  * May be called only by an application which has called setmode with
  * chldowner == libxl_sigchld_owner_mainloop.  If pid was a process started
@@ -558,6 +560,25 @@ void libxl_childproc_setmode(libxl_ctx *ctx, const libxl_childproc_hooks *hooks,
 int libxl_childproc_reaped(libxl_ctx *ctx, pid_t, int status)
                            LIBXL_EXTERNAL_CALLERS_ONLY;
 
+/*
+ * This function is for an application which owns SIGCHLD but which
+ * doesn't keep track of all of its own children in a manner suitable
+ * for reaping all of them and then dispatching them.
+ *
+ * Such an the application must notify libxl, by calling this
+ * function, that a SIGCHLD occurred.  libxl will then check all its
+ * children, reap any that are ready, and take any action necessary -
+ * but it will not reap anything else.
+ *
+ * May be called only by an application which has called setmode with
+ * chldowner == libxl_sigchld_owner_mainloop.
+ *
+ * May NOT be called from within a signal handler which might
+ * interrupt any libxl operation (just like libxl_childproc_reaped).
+ */
+void libxl_childproc_sigchld_occurred(libxl_ctx *ctx)
+                           LIBXL_EXTERNAL_CALLERS_ONLY;
+
 
 /*
  * An application which initialises a libxl_ctx in a parent process
diff --git a/tools/libxl/libxl_fork.c b/tools/libxl/libxl_fork.c
index 85db2fb..b2325e0 100644
--- a/tools/libxl/libxl_fork.c
+++ b/tools/libxl/libxl_fork.c
@@ -330,6 +330,51 @@ int libxl_childproc_reaped(libxl_ctx *ctx, pid_t pid, int status)
     return rc;
 }
 
+static void childproc_checkall(libxl__egc *egc)
+{
+    EGC_GC;
+    libxl__ev_child *ch;
+
+    for (;;) {
+        int status;
+        pid_t got;
+
+        LIBXL_LIST_FOREACH(ch, &CTX->children, entry) {
+            got = checked_waitpid(egc, ch->pid, &status);
+            if (got)
+                goto found;
+        }
+        /* not found */
+        return;
+
+    found:
+        if (got == -1) {
+            LIBXL__EVENT_DISASTER
+                (egc, "waitpid() gave ECHILD but we have a child",
+                 ECHILD, 0);
+            /* it must have finished but we don't know its status */
+            status = 255<<8; /* no wait.h macro for this! */
+            assert(WIFEXITED(status));
+            assert(WEXITSTATUS(status)==255);
+            assert(!WIFSIGNALED(status));
+            assert(!WIFSTOPPED(status));
+        }
+        childproc_reaped_ours(egc, ch, status);
+        /* we need to restart the loop, as children may have been edited */
+    }
+}
+
+void libxl_childproc_sigchld_occurred(libxl_ctx *ctx)
+{
+    EGC_INIT(ctx);
+    CTX_LOCK;
+    assert(CTX->childproc_hooks->chldowner
+           == libxl_sigchld_owner_mainloop);
+    childproc_checkall(egc);
+    CTX_UNLOCK;
+    EGC_FREE;
+}
+
 static void sigchld_selfpipe_handler(libxl__egc *egc, libxl__ev_fd *ev,
                                      int fd, short events, short revents)
 {
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [PATCH 07/12] libxl: fork: Provide ..._always_selective_reap
  2014-01-17 16:23 [PATCH 00/12] libxl: fork: SIGCHLD flexibility Ian Jackson
                   ` (5 preceding siblings ...)
  2014-01-17 16:23 ` [PATCH 06/12] libxl: fork: Provide libxl_childproc_sigchld_occurred Ian Jackson
@ 2014-01-17 16:24 ` Ian Jackson
  2014-01-17 22:17   ` Jim Fehlig
  2014-01-17 16:24 ` [PATCH 08/12] libxl: fork: Provide LIBXL_HAVE_SIGCHLD_SELECTIVE_REAP Ian Jackson
                   ` (6 subsequent siblings)
  13 siblings, 1 reply; 58+ messages in thread
From: Ian Jackson @ 2014-01-17 16:24 UTC (permalink / raw)
  To: xen-devel; +Cc: Jim Fehlig, Ian Jackson, Ian Campbell

Applications exist which want to use libxl in an event-driven mode but
which do not integrate child termination into their event system, but
instead reap all their own children synchronously.

In such an application libxl must own SIGCHLD but avoid reaping any
children that don't belong to libxl.

Provide libxl_sigchld_owner_libxl_always_selective_reap which has this
behaviour.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Cc: Jim Fehlig <jfehlig@suse.com>
Cc: Ian Campbell <Ian.Campbell@citrix.com>
Acked-by: Ian Campbell <Ian.Campbell@citrix.com>

---
v2: Document the new mode in the big "Subprocess handling" comment.
---
 tools/libxl/libxl_event.h |   11 +++++++++++
 tools/libxl/libxl_fork.c  |    7 +++++++
 2 files changed, 18 insertions(+)

diff --git a/tools/libxl/libxl_event.h b/tools/libxl/libxl_event.h
index 3c93955..824ac88 100644
--- a/tools/libxl/libxl_event.h
+++ b/tools/libxl/libxl_event.h
@@ -474,6 +474,12 @@ void libxl_osevent_occurred_timeout(libxl_ctx *ctx, void *for_libxl)
  *       and provides a callback to be notified of their exit
  *       statues.  The application must have only one libxl_ctx
  *       configured this way.
+ *
+ *     libxl_sigchld_owner_libxl_always_selective_reap:
+ *
+ *       The application expects to reap all of its own children
+ *       synchronously, and does not use SIGCHLD.  libxl is
+ *       to install a SIGCHLD handler.
  */
 
 
@@ -491,6 +497,11 @@ typedef enum {
     /* libxl owns SIGCHLD all the time, and the application is
      * relying on libxl's event loop for reaping its children too. */
     libxl_sigchld_owner_libxl_always,
+
+    /* libxl owns SIGCHLD all the time, but it must only reap its own
+     * children.  The application will reap its own children
+     * synchronously with waitpid, without the assistance of SIGCHLD. */
+    libxl_sigchld_owner_libxl_always_selective_reap,
 } libxl_sigchld_owner;
 
 typedef struct {
diff --git a/tools/libxl/libxl_fork.c b/tools/libxl/libxl_fork.c
index b2325e0..16e17f6 100644
--- a/tools/libxl/libxl_fork.c
+++ b/tools/libxl/libxl_fork.c
@@ -268,6 +268,7 @@ static bool chldmode_ours(libxl_ctx *ctx, bool creating)
     case libxl_sigchld_owner_mainloop:
         return 0;
     case libxl_sigchld_owner_libxl_always:
+    case libxl_sigchld_owner_libxl_always_selective_reap:
         return 1;
     }
     abort();
@@ -398,6 +399,12 @@ static void sigchld_selfpipe_handler(libxl__egc *egc, libxl__ev_fd *ev,
     int e = libxl__self_pipe_eatall(selfpipe);
     if (e) LIBXL__EVENT_DISASTER(egc, "read sigchld pipe", e, 0);
 
+    if (CTX->childproc_hooks->chldowner
+        == libxl_sigchld_owner_libxl_always_selective_reap) {
+        childproc_checkall(egc);
+        return;
+    }
+
     while (chldmode_ours(CTX, 0) /* in case the app changes the mode */) {
         int status;
         pid_t pid = checked_waitpid(egc, -1, &status);
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [PATCH 08/12] libxl: fork: Provide LIBXL_HAVE_SIGCHLD_SELECTIVE_REAP
  2014-01-17 16:23 [PATCH 00/12] libxl: fork: SIGCHLD flexibility Ian Jackson
                   ` (6 preceding siblings ...)
  2014-01-17 16:24 ` [PATCH 07/12] libxl: fork: Provide ..._always_selective_reap Ian Jackson
@ 2014-01-17 16:24 ` Ian Jackson
  2014-01-17 16:24 ` [PATCH 09/12] libxl: fork: Rename sigchld handler functions Ian Jackson
                   ` (5 subsequent siblings)
  13 siblings, 0 replies; 58+ messages in thread
From: Ian Jackson @ 2014-01-17 16:24 UTC (permalink / raw)
  To: xen-devel; +Cc: Jim Fehlig, Ian Jackson, Ian Campbell

This is the feature test macro for libxl_childproc_sigchld_occurred
and libxl_sigchld_owner_libxl_always_selective_reap.

It is split out into this separate patch because: a single feature
test is sensible because we do not intend anyone to release or ship
libxl versions with one of these but not the other; but, the two
features are in separate patches for clarity; and, this just makes
reading the actual code easier.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Cc: Jim Fehlig <jfehlig@suse.com>
Cc: Ian Campbell <Ian.Campbell@citrix.com
Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
---
 tools/libxl/libxl.h |   13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index 12d6c31..1ac34c3 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -409,6 +409,19 @@
  */
 #define LIBXL_HAVE_DRIVER_DOMAIN_CREATION 1
 
+/*
+ * LIBXL_HAVE_SIGCHLD_SELECTIVE_REAP
+ *
+ * If this is defined:
+ *
+ * Firstly, the enum libxl_sigchld_owner (in libxl_event.h) has the
+ * value libxl_sigchld_owner_libxl_always_selective_reap which may be
+ * passed to libxl_childproc_setmode in hooks->chldmode.
+ *
+ * Secondly, the function libxl_childproc_sigchld_occurred exists.
+ */
+#define LIBXL_HAVE_SIGCHLD_OWNER_SELECTIVE_REAP 1
+
 /* Functions annotated with LIBXL_EXTERNAL_CALLERS_ONLY may not be
  * called from within libxl itself. Callers outside libxl, who
  * do not #include libxl_internal.h, are fine. */
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [PATCH 09/12] libxl: fork: Rename sigchld handler functions
  2014-01-17 16:23 [PATCH 00/12] libxl: fork: SIGCHLD flexibility Ian Jackson
                   ` (7 preceding siblings ...)
  2014-01-17 16:24 ` [PATCH 08/12] libxl: fork: Provide LIBXL_HAVE_SIGCHLD_SELECTIVE_REAP Ian Jackson
@ 2014-01-17 16:24 ` Ian Jackson
  2014-01-20  9:59   ` Ian Campbell
  2014-01-17 16:24 ` [PATCH 10/12] libxl: fork: Break out sigchld_installhandler_core Ian Jackson
                   ` (4 subsequent siblings)
  13 siblings, 1 reply; 58+ messages in thread
From: Ian Jackson @ 2014-01-17 16:24 UTC (permalink / raw)
  To: xen-devel; +Cc: Jim Fehlig, Ian Jackson, Ian Campbell

We are going to change these functions so that different libxl ctx's
can share a single SIGCHLD handler.  Rename them now to a new name
which doesn't imply unconditional handler installation or removal.

Also note in the comments that they are idempotent.

No functional change.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Cc: Jim Fehlig <jfehlig@suse.com>
Cc: Ian Campbell <Ian.Campbell@citrix.com>
---
 tools/libxl/libxl.c          |    2 +-
 tools/libxl/libxl_fork.c     |   22 +++++++++++-----------
 tools/libxl/libxl_internal.h |    4 ++--
 3 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 2845ca4..4679b51 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -170,7 +170,7 @@ int libxl_ctx_free(libxl_ctx *ctx)
     /* If we have outstanding children, then the application inherits
      * them; we wish the application good luck with understanding
      * this if and when it reaps them. */
-    libxl__sigchld_removehandler(gc);
+    libxl__sigchld_notneeded(gc);
 
     if (ctx->sigchld_selfpipe[0] >= 0) {
         close(ctx->sigchld_selfpipe[0]);
diff --git a/tools/libxl/libxl_fork.c b/tools/libxl/libxl_fork.c
index 16e17f6..a15af8e 100644
--- a/tools/libxl/libxl_fork.c
+++ b/tools/libxl/libxl_fork.c
@@ -194,7 +194,7 @@ static void sigchld_removehandler_core(void)
     sigchld_owner = 0;
 }
 
-void libxl__sigchld_removehandler(libxl__gc *gc) /* non-reentrant */
+void libxl__sigchld_notneeded(libxl__gc *gc) /* non-reentrant, idempotent */
 {
     int rc;
 
@@ -210,7 +210,7 @@ void libxl__sigchld_removehandler(libxl__gc *gc) /* non-reentrant */
     }
 }
 
-int libxl__sigchld_installhandler(libxl__gc *gc) /* non-reentrant */
+int libxl__sigchld_needed(libxl__gc *gc) /* non-reentrant, idempotent */
 {
     int r, rc;
 
@@ -274,18 +274,18 @@ static bool chldmode_ours(libxl_ctx *ctx, bool creating)
     abort();
 }
 
-static void perhaps_removehandler(libxl__gc *gc)
+static void perhaps_sigchld_notneeded(libxl__gc *gc)
 {
     if (!chldmode_ours(CTX, 0))
-        libxl__sigchld_removehandler(gc);
+        libxl__sigchld_notneeded(gc);
 }
 
-static int perhaps_installhandler(libxl__gc *gc, bool creating)
+static int perhaps_sigchld_needed(libxl__gc *gc, bool creating)
 {
     int rc;
 
     if (chldmode_ours(CTX, creating)) {
-        rc = libxl__sigchld_installhandler(gc);
+        rc = libxl__sigchld_needed(gc);
         if (rc) return rc;
     }
     return 0;
@@ -314,7 +314,7 @@ static int childproc_reaped(libxl__egc *egc, pid_t pid, int status)
  found:
     childproc_reaped_ours(egc, ch, status);
 
-    perhaps_removehandler(gc);
+    perhaps_sigchld_notneeded(gc);
 
     return 0;
 }
@@ -445,7 +445,7 @@ pid_t libxl__ev_child_fork(libxl__gc *gc, libxl__ev_child *ch,
     CTX_LOCK;
     int rc;
 
-    perhaps_installhandler(gc, 1);
+    perhaps_sigchld_needed(gc, 1);
 
     pid_t pid =
         CTX->childproc_hooks->fork_replacement
@@ -473,7 +473,7 @@ pid_t libxl__ev_child_fork(libxl__gc *gc, libxl__ev_child *ch,
     rc = pid;
 
  out:
-    perhaps_removehandler(gc);
+    perhaps_sigchld_notneeded(gc);
     CTX_UNLOCK;
     return rc;
 }
@@ -492,8 +492,8 @@ void libxl_childproc_setmode(libxl_ctx *ctx, const libxl_childproc_hooks *hooks,
     ctx->childproc_hooks = hooks;
     ctx->childproc_user = user;
 
-    perhaps_removehandler(gc);
-    perhaps_installhandler(gc, 0); /* idempotent, ok to ignore errors for now */
+    perhaps_sigchld_notneeded(gc);
+    perhaps_sigchld_needed(gc, 0); /* idempotent, ok to ignore errors for now */
 
     CTX_UNLOCK;
     GC_FREE;
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 1bd23ff..fba681d 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -838,8 +838,8 @@ _hidden void libxl__poller_wakeup(libxl__egc *egc, libxl__poller *p);
 
 /* Internal to fork and child reaping machinery */
 extern const libxl_childproc_hooks libxl__childproc_default_hooks;
-int libxl__sigchld_installhandler(libxl__gc*); /* non-reentrant; logs errs */
-void libxl__sigchld_removehandler(libxl__gc*); /* non-reentrant */
+int libxl__sigchld_needed(libxl__gc*); /* non-reentrant idempotent, logs errs */
+void libxl__sigchld_notneeded(libxl__gc*); /* non-reentrant idempotent */
 int libxl__self_pipe_wakeup(int fd); /* returns 0 or -1 setting errno */
 int libxl__self_pipe_eatall(int fd); /* returns 0 or -1 setting errno */
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [PATCH 10/12] libxl: fork: Break out sigchld_installhandler_core
  2014-01-17 16:23 [PATCH 00/12] libxl: fork: SIGCHLD flexibility Ian Jackson
                   ` (8 preceding siblings ...)
  2014-01-17 16:24 ` [PATCH 09/12] libxl: fork: Rename sigchld handler functions Ian Jackson
@ 2014-01-17 16:24 ` Ian Jackson
  2014-01-20  9:59   ` Ian Campbell
  2014-01-17 16:24 ` [PATCH 11/12] libxl: fork: Break out sigchld_sethandler_raw Ian Jackson
                   ` (3 subsequent siblings)
  13 siblings, 1 reply; 58+ messages in thread
From: Ian Jackson @ 2014-01-17 16:24 UTC (permalink / raw)
  To: xen-devel; +Cc: Jim Fehlig, Ian Jackson, Ian Campbell

Pure code motion.  This is going to make the final substantive patch
easier to read.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Cc: Jim Fehlig <jfehlig@suse.com>
Cc: Ian Campbell <Ian.Campbell@citrix.com>
---
 tools/libxl/libxl_fork.c |   38 ++++++++++++++++++++++----------------
 1 file changed, 22 insertions(+), 16 deletions(-)

diff --git a/tools/libxl/libxl_fork.c b/tools/libxl/libxl_fork.c
index a15af8e..ce8e8eb 100644
--- a/tools/libxl/libxl_fork.c
+++ b/tools/libxl/libxl_fork.c
@@ -194,6 +194,27 @@ static void sigchld_removehandler_core(void)
     sigchld_owner = 0;
 }
 
+static void sigchld_installhandler_core(libxl__gc *gc)
+{
+    struct sigaction ours;
+    int r;
+
+    assert(!sigchld_owner);
+    sigchld_owner = CTX;
+
+    memset(&ours,0,sizeof(ours));
+    ours.sa_handler = sigchld_handler;
+    sigemptyset(&ours.sa_mask);
+    ours.sa_flags = SA_NOCLDSTOP | SA_RESTART;
+    r = sigaction(SIGCHLD, &ours, &sigchld_saved_action);
+    assert(!r);
+
+    assert(((void)"application must negotiate with libxl about SIGCHLD",
+            !(sigchld_saved_action.sa_flags & SA_SIGINFO) &&
+            (sigchld_saved_action.sa_handler == SIG_DFL ||
+             sigchld_saved_action.sa_handler == SIG_IGN)));
+}
+
 void libxl__sigchld_notneeded(libxl__gc *gc) /* non-reentrant, idempotent */
 {
     int rc;
@@ -236,22 +257,7 @@ int libxl__sigchld_needed(libxl__gc *gc) /* non-reentrant, idempotent */
 
     atfork_lock();
     if (sigchld_owner != CTX) {
-        struct sigaction ours;
-
-        assert(!sigchld_owner);
-        sigchld_owner = CTX;
-
-        memset(&ours,0,sizeof(ours));
-        ours.sa_handler = sigchld_handler;
-        sigemptyset(&ours.sa_mask);
-        ours.sa_flags = SA_NOCLDSTOP | SA_RESTART;
-        r = sigaction(SIGCHLD, &ours, &sigchld_saved_action);
-        assert(!r);
-
-        assert(((void)"application must negotiate with libxl about SIGCHLD",
-                !(sigchld_saved_action.sa_flags & SA_SIGINFO) &&
-                (sigchld_saved_action.sa_handler == SIG_DFL ||
-                 sigchld_saved_action.sa_handler == SIG_IGN)));
+        sigchld_installhandler_core(gc);
     }
     atfork_unlock();
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [PATCH 11/12] libxl: fork: Break out sigchld_sethandler_raw
  2014-01-17 16:23 [PATCH 00/12] libxl: fork: SIGCHLD flexibility Ian Jackson
                   ` (9 preceding siblings ...)
  2014-01-17 16:24 ` [PATCH 10/12] libxl: fork: Break out sigchld_installhandler_core Ian Jackson
@ 2014-01-17 16:24 ` Ian Jackson
  2014-01-20  9:58   ` Ian Campbell
  2014-01-17 16:24 ` [PATCH 12/12] libxl: fork: Share SIGCHLD handler amongst ctxs Ian Jackson
                   ` (2 subsequent siblings)
  13 siblings, 1 reply; 58+ messages in thread
From: Ian Jackson @ 2014-01-17 16:24 UTC (permalink / raw)
  To: xen-devel; +Cc: Jim Fehlig, Ian Jackson, Ian Campbell

We are going to want introduce another call site in the final
substantive patch.

Pure code motion; no functional change.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Cc: Jim Fehlig <jfehlig@suse.com>
Cc: Ian Campbell <Ian.Campbell@citrix.com>
---
 tools/libxl/libxl_fork.c |   20 ++++++++++++++------
 1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/tools/libxl/libxl_fork.c b/tools/libxl/libxl_fork.c
index ce8e8eb..b6b14fe 100644
--- a/tools/libxl/libxl_fork.c
+++ b/tools/libxl/libxl_fork.c
@@ -182,6 +182,19 @@ static void sigchld_handler(int signo)
     errno = esave;
 }
 
+static void sigchld_sethandler_raw(void (*handler)(int), struct sigaction *old)
+{
+    struct sigaction ours;
+    int r;
+
+    memset(&ours,0,sizeof(ours));
+    ours.sa_handler = handler;
+    sigemptyset(&ours.sa_mask);
+    ours.sa_flags = SA_NOCLDSTOP | SA_RESTART;
+    r = sigaction(SIGCHLD, &ours, old);
+    assert(!r);
+}
+
 static void sigchld_removehandler_core(void)
 {
     struct sigaction was;
@@ -202,12 +215,7 @@ static void sigchld_installhandler_core(libxl__gc *gc)
     assert(!sigchld_owner);
     sigchld_owner = CTX;
 
-    memset(&ours,0,sizeof(ours));
-    ours.sa_handler = sigchld_handler;
-    sigemptyset(&ours.sa_mask);
-    ours.sa_flags = SA_NOCLDSTOP | SA_RESTART;
-    r = sigaction(SIGCHLD, &ours, &sigchld_saved_action);
-    assert(!r);
+    sigchld_sethandler_raw(sigchld_handler, &sigchld_saved_action);
 
     assert(((void)"application must negotiate with libxl about SIGCHLD",
             !(sigchld_saved_action.sa_flags & SA_SIGINFO) &&
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [PATCH 12/12] libxl: fork: Share SIGCHLD handler amongst ctxs
  2014-01-17 16:23 [PATCH 00/12] libxl: fork: SIGCHLD flexibility Ian Jackson
                   ` (10 preceding siblings ...)
  2014-01-17 16:24 ` [PATCH 11/12] libxl: fork: Break out sigchld_sethandler_raw Ian Jackson
@ 2014-01-17 16:24 ` Ian Jackson
  2014-01-17 18:13   ` Ian Jackson
  2014-01-17 16:37 ` [PATCH 00/12] libxl: fork: SIGCHLD flexibility Ian Jackson
  2014-01-17 22:29 ` Jim Fehlig
  13 siblings, 1 reply; 58+ messages in thread
From: Ian Jackson @ 2014-01-17 16:24 UTC (permalink / raw)
  To: xen-devel; +Cc: Jim Fehlig, Ian Jackson, Ian Campbell

Previously, an application which had multiple libxl ctxs in multiple
threads, would have to itself plumb SIGCHLD through to each ctx.
Instead, permit multiple libxl ctxs to all share the SIGCHLD handler.

We keep a list of all the ctxs which are interested in SIGCHLD and
notify all of their self-pipes.

In more detail:

 * sigchld_owner, the ctx* of the SIGCHLD owner, is replaced by
   sigchld_users, a list of SIGCHLD users.

 * Each ctx keeps track of whether it is on the users list, so that
   libxl__sigchld_needed and libxl__sigchld_notneeded now instead of
   idempotently installing and removing the handler, idempotently add
   or remove the ctx from the list.

   We ensure that we always have the SIGCHLD handler installed
   iff the sigchld_users list is nonempty.  To make this a bit
   easier we make sigchld_installhandler_core and
   sigchld_removehandler_core idempotent.

   Specifically, the call sites for sigchld_installhandler_core and
   sigchld_removehandler_core are updated to manipulate sigchld_users
   and only call the install or remove functions as applicable.

 * In the signal handler we walk the list of SIGCHLD users and write
   to each of their self-pipes.  That means that we need to arrange to
   defer SIGCHLD when we are manipulating the list (to avoid the
   signal handler interrupting our list manipulation); this is quite
   tiresome to arrange.

   The code as written will, on the first installation of the SIGCHLD
   handler, firstly install the real handler, then immediately replace
   it with the deferral handler.  Doing it this way makes the code
   clearer as it makes the SIGCHLD deferral machinery much more
   self-contained (and hence easier to reason about).

 * The first part of libxl__sigchld_notneeded is broken out into a new
   function sigchld_user_remove (which is also needed during for
   postfork).  And of course that first part of the function is now
   rather different, as explained above.

 * sigchld_installhandler_core no longer takes the gc argument,
   because it now deals with SIGCHLD for all ctxs.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Cc: Jim Fehlig <jfehlig@suse.com>
Cc: Ian Campbell <Ian.Campbell@citrix.com>
---
 tools/libxl/libxl_event.h    |    2 +-
 tools/libxl/libxl_fork.c     |  132 +++++++++++++++++++++++++++++++++++-------
 tools/libxl/libxl_internal.h |    3 +
 3 files changed, 114 insertions(+), 23 deletions(-)

diff --git a/tools/libxl/libxl_event.h b/tools/libxl/libxl_event.h
index 824ac88..ab6ac5c 100644
--- a/tools/libxl/libxl_event.h
+++ b/tools/libxl/libxl_event.h
@@ -472,7 +472,7 @@ void libxl_osevent_occurred_timeout(libxl_ctx *ctx, void *for_libxl)
  *
  *       The application expects libxl to reap all of its children,
  *       and provides a callback to be notified of their exit
- *       statues.  The application must have only one libxl_ctx
+ *       statuses.  The application may have have multiple libxl_ctxs
  *       configured this way.
  *
  *     libxl_sigchld_owner_libxl_always_selective_reap:
diff --git a/tools/libxl/libxl_fork.c b/tools/libxl/libxl_fork.c
index b6b14fe..5b42bad 100644
--- a/tools/libxl/libxl_fork.c
+++ b/tools/libxl/libxl_fork.c
@@ -46,11 +46,18 @@ static int atfork_registered;
 static LIBXL_LIST_HEAD(, libxl__carefd) carefds =
     LIBXL_LIST_HEAD_INITIALIZER(carefds);
 
-/* non-null iff installed, protected by no_forking */
-static libxl_ctx *sigchld_owner;
+/* Protected against concurrency by no_forking.  sigchld_users is
+ * protected against being interrupted by SIGCHLD (and thus read
+ * asynchronously by the signal handler) by sigchld_defer (see
+ * below). */
+static bool sigchld_installed; /* 0 means not */
+static LIBXL_LIST_HEAD(, libxl_ctx) sigchld_users =
+    LIBXL_LIST_HEAD_INITIALIZER(sigchld_users);
 static struct sigaction sigchld_saved_action;
 
-static void sigchld_removehandler_core(void);
+static void sigchld_removehandler_core(void); /* idempotent */
+static void sigchld_user_remove(libxl_ctx *ctx); /* idempotent */
+static void sigchld_sethandler_raw(void (*handler)(int), struct sigaction *old);
 
 static void atfork_lock(void)
 {
@@ -126,8 +133,7 @@ void libxl_postfork_child_noexec(libxl_ctx *ctx)
     }
     LIBXL_LIST_INIT(&carefds);
 
-    if (sigchld_owner)
-        sigchld_removehandler_core();
+    sigchld_user_remove(ctx);
 
     atfork_unlock();
 }
@@ -152,7 +158,8 @@ int libxl__carefd_fd(const libxl__carefd *cf)
 }
 
 /*
- * Actual child process handling
+ * Low-level functions for child process handling, including
+ * the main SIGCHLD handler.
  */
 
 /* Like waitpid(,,WNOHANG) but handles all errors except ECHILD. */
@@ -176,9 +183,16 @@ static void sigchld_selfpipe_handler(libxl__egc *egc, libxl__ev_fd *ev,
 
 static void sigchld_handler(int signo)
 {
+    /* This function has to be reentrant!  Luckily it is. */
+
+    libxl_ctx *notify;
     int esave = errno;
-    int e = libxl__self_pipe_wakeup(sigchld_owner->sigchld_selfpipe[1]);
-    assert(!e); /* errors are probably EBADF, very bad */
+
+    LIBXL_LIST_FOREACH(notify, &sigchld_users, sigchld_users_entry) {
+        int e = libxl__self_pipe_wakeup(notify->sigchld_selfpipe[1]);
+        assert(!e); /* errors are probably EBADF, very bad */
+    }
+
     errno = esave;
 }
 
@@ -195,25 +209,74 @@ static void sigchld_sethandler_raw(void (*handler)(int), struct sigaction *old)
     assert(!r);
 }
 
-static void sigchld_removehandler_core(void)
+/*
+ * SIGCHLD deferral
+ *
+ * sigchld_defer and sigchld_release are a bit like using sigprocmask
+ * to block the signal only they work for the whole process.  Sadly
+ * this has to be done by setting a special handler that records the
+ * "pendingness" of the signal here in the program.  How tedious.
+ *
+ * A property of this approach is that the signal handler itself
+ * must be reentrant (see the comment in release_sigchld).
+ *
+ * Callers have the atfork_lock so there is no risk of concurrency
+ * within these functions, aside obviously from the risk of being
+ * interrupted by the signal.
+ */
+
+static volatile sig_atomic_t sigchld_occurred_while_deferred;
+
+static void sigchld_handler_when_deferred(int signo)
+{
+    sigchld_occurred_while_deferred = 1;
+}
+
+static void defer_sigchld(void)
+{
+    assert(sigchld_installed);
+    sigchld_sethandler_raw(sigchld_handler_when_deferred, 0);
+}
+
+static void release_sigchld(void)
+{
+    assert(sigchld_installed);
+    sigchld_sethandler_raw(sigchld_handler, 0);
+    if (sigchld_occurred_while_deferred) {
+        sigchld_occurred_while_deferred = 0;
+        /* We might get another SIGCHLD here, in which case
+         * sigchld_handler will be interrupted and re-entered.
+         * This is OK. */
+        sigchld_handler(SIGCHLD);
+    }
+}
+
+/*
+ * Meat of the child process handling.
+ */
+
+static void sigchld_removehandler_core(void) /* idempotent */
 {
     struct sigaction was;
     int r;
     
+    if (!sigchld_installed)
+        return;
+
     r = sigaction(SIGCHLD, &sigchld_saved_action, &was);
     assert(!r);
     assert(!(was.sa_flags & SA_SIGINFO));
     assert(was.sa_handler == sigchld_handler);
-    sigchld_owner = 0;
+
+    sigchld_installed = 0;
 }
 
-static void sigchld_installhandler_core(libxl__gc *gc)
+static void sigchld_installhandler_core(void) /* idempotent */
 {
-    struct sigaction ours;
-    int r;
+    if (sigchld_installed)
+        return;
 
-    assert(!sigchld_owner);
-    sigchld_owner = CTX;
+    sigchld_installed = 1;
 
     sigchld_sethandler_raw(sigchld_handler, &sigchld_saved_action);
 
@@ -223,15 +286,32 @@ static void sigchld_installhandler_core(libxl__gc *gc)
              sigchld_saved_action.sa_handler == SIG_IGN)));
 }
 
-void libxl__sigchld_notneeded(libxl__gc *gc) /* non-reentrant, idempotent */
+static void sigchld_user_remove(libxl_ctx *ctx) /* idempotent */
 {
-    int rc;
+    if (!ctx->sigchld_user_registered)
+        return;
 
     atfork_lock();
-    if (sigchld_owner == CTX)
+    defer_sigchld();
+
+    LIBXL_LIST_REMOVE(ctx, sigchld_users_entry);
+
+    release_sigchld();
+
+    if (LIBXL_LIST_EMPTY(&sigchld_users))
         sigchld_removehandler_core();
+
     atfork_unlock();
 
+    ctx->sigchld_user_registered = 0;
+}
+
+void libxl__sigchld_notneeded(libxl__gc *gc) /* non-reentrant, idempotent */
+{
+    int rc;
+
+    sigchld_user_remove(CTX);
+
     if (libxl__ev_fd_isregistered(&CTX->sigchld_selfpipe_efd)) {
         rc = libxl__ev_fd_modify(gc, &CTX->sigchld_selfpipe_efd, 0);
         if (rc)
@@ -262,12 +342,20 @@ int libxl__sigchld_needed(libxl__gc *gc) /* non-reentrant, idempotent */
         rc = libxl__ev_fd_modify(gc, &CTX->sigchld_selfpipe_efd, POLLIN);
         if (rc) goto out;
     }
+    if (!CTX->sigchld_user_registered) {
+        atfork_lock();
 
-    atfork_lock();
-    if (sigchld_owner != CTX) {
-        sigchld_installhandler_core(gc);
+        sigchld_installhandler_core();
+
+        defer_sigchld();
+
+        LIBXL_LIST_INSERT_HEAD(&sigchld_users, CTX, sigchld_users_entry);
+
+        release_sigchld();
+        atfork_unlock();
+
+        CTX->sigchld_user_registered = 1;
     }
-    atfork_unlock();
 
     rc = 0;
  out:
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index fba681d..8429448 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -355,6 +355,8 @@ struct libxl__ctx {
     int sigchld_selfpipe[2]; /* [0]==-1 means handler not installed */
     libxl__ev_fd sigchld_selfpipe_efd;
     LIBXL_LIST_HEAD(, libxl__ev_child) children;
+    bool sigchld_user_registered;
+    LIBXL_LIST_ENTRY(libxl_ctx) sigchld_users_entry;
 
     libxl_version_info version_info;
 };
@@ -840,6 +842,7 @@ _hidden void libxl__poller_wakeup(libxl__egc *egc, libxl__poller *p);
 extern const libxl_childproc_hooks libxl__childproc_default_hooks;
 int libxl__sigchld_needed(libxl__gc*); /* non-reentrant idempotent, logs errs */
 void libxl__sigchld_notneeded(libxl__gc*); /* non-reentrant idempotent */
+void libxl__sigchld_check_stale_handler(void);
 int libxl__self_pipe_wakeup(int fd); /* returns 0 or -1 setting errno */
 int libxl__self_pipe_eatall(int fd); /* returns 0 or -1 setting errno */
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 58+ messages in thread

* Re: [PATCH 04/12] libxl: fork: Document libxl_sigchld_owner_libxl better
  2014-01-17 16:23 ` [PATCH 04/12] libxl: fork: Document libxl_sigchld_owner_libxl better Ian Jackson
@ 2014-01-17 16:28   ` Ian Campbell
  0 siblings, 0 replies; 58+ messages in thread
From: Ian Campbell @ 2014-01-17 16:28 UTC (permalink / raw)
  To: Ian Jackson; +Cc: Jim Fehlig, xen-devel

On Fri, 2014-01-17 at 16:23 +0000, Ian Jackson wrote:
> libxl_sigchld_owner_libxl ought to have been mentioned in the list of
> options for chldowner.  Since it's the default, move the description
> of the its behaviour into the description of that option.
> 
> Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
> Cc: Jim Fehlig <jfehlig@suse.com>

Acked-by: Ian Campbell <Ian.Campbell@citrix.com>

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 00/12] libxl: fork: SIGCHLD flexibility
  2014-01-17 16:23 [PATCH 00/12] libxl: fork: SIGCHLD flexibility Ian Jackson
                   ` (11 preceding siblings ...)
  2014-01-17 16:24 ` [PATCH 12/12] libxl: fork: Share SIGCHLD handler amongst ctxs Ian Jackson
@ 2014-01-17 16:37 ` Ian Jackson
  2014-01-17 22:29 ` Jim Fehlig
  13 siblings, 0 replies; 58+ messages in thread
From: Ian Jackson @ 2014-01-17 16:37 UTC (permalink / raw)
  To: xen-devel, Ian Campbell, Jim Fehlig

To check that this was doing roughly the right things, I straced xl.

Here is what it does before the first time libxl wants to fork:

 pipe([17, 18])                          = 0
 rt_sigaction(SIGCHLD, {0xb76ad507, [], SA_RESTART|SA_NOCLDSTOP}, {SIG_DFL, [], 0}, 8) = 0
 rt_sigaction(SIGCHLD, {0xb76ad150, [], SA_RESTART|SA_NOCLDSTOP}, NULL, 8) = 0
 rt_sigaction(SIGCHLD, {0xb76ad507, [], SA_RESTART|SA_NOCLDSTOP}, NULL, 8) = 0
 clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0xb744a978) = 10033

That seems about right.  (It does leak the self-pipe into the child
but that is of no consequence.)


Here is what is happens in the parent when the child exits:

  --- SIGCHLD (Child exited) @ 0 (0) ---
  write(18, "\0", 1)                      = 1
  sigreturn()                             = ? (mask now [])
  gettimeofday({1389975566, 501125}, NULL) = 0
  poll([{fd=17, events=POLLIN}, {fd=7, events=POLLIN}, {fd=7, events=POLLIN}], 3, -1) = 1 ([{fd=17, revents=POLLIN}])
  gettimeofday({1389975566, 501571}, NULL) = 0
  read(17, "\0", 256)                     = 1
  waitpid(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], WNOHANG) = 10033

That looks mostly right for the libxl child handling except that there
are two instances of {fd=7, events=POLLIN}.  I'm going to investigate
what that is and why it might be happening, starting by trying to
figure out what fd 7 is.

The trace then continues:

  close(14)                               = 0
  close(15)                               = 0
  close(12)                               = 0
  munmap(0xb766a000, 4096)                = 0
  close(11)                               = 0

I think that is the migration ao and its fds etc. being torn down.

  write(10, "\0", 1)                      = 1

This is the poller wakeup for ao completion.

  rt_sigaction(SIGCHLD, {0xb76ad150, [], SA_RESTART|SA_NOCLDSTOP}, NULL, 8) = 0
  rt_sigaction(SIGCHLD, {0xb76ad507, [], SA_RESTART|SA_NOCLDSTOP}, NULL, 8) = 0
  rt_sigaction(SIGCHLD, {SIG_DFL, [], 0}, {0xb76ad507, [], SA_RESTART|SA_NOCLDSTOP}, 8) = 0

And here libxl removes the ctx from the SIGCHLD users: the first two
sigaction calls are from the defer and release, and the final one
restores the application's default handler.

Ian.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 12/12] libxl: fork: Share SIGCHLD handler amongst ctxs
  2014-01-17 16:24 ` [PATCH 12/12] libxl: fork: Share SIGCHLD handler amongst ctxs Ian Jackson
@ 2014-01-17 18:13   ` Ian Jackson
  2014-01-20  9:56     ` Ian Campbell
  0 siblings, 1 reply; 58+ messages in thread
From: Ian Jackson @ 2014-01-17 18:13 UTC (permalink / raw)
  To: xen-devel, Ian Campbell, Jim Fehlig

Ian Jackson writes ("[PATCH 12/12] libxl: fork: Share SIGCHLD handler amongst ctxs"):
> Previously, an application which had multiple libxl ctxs in multiple
> threads, would have to itself plumb SIGCHLD through to each ctx.
> Instead, permit multiple libxl ctxs to all share the SIGCHLD handler.

An un-posted version of this patch had a feature test macro for this
but I seem to have dropped that by mistake.  Here is a v2.1 with that
added.

Ian.

>From 3d519a8e8c117484f938e55b87eff97f2558144b Mon Sep 17 00:00:00 2001
From: Ian Jackson <ian.jackson@eu.citrix.com>
Date: Fri, 17 Jan 2014 11:58:55 +0000
Subject: [PATCH] libxl: fork: Share SIGCHLD handler amongst ctxs

Previously, an application which had multiple libxl ctxs in multiple
threads, would have to itself plumb SIGCHLD through to each ctx.
Instead, permit multiple libxl ctxs to all share the SIGCHLD handler.

We keep a list of all the ctxs which are interested in SIGCHLD and
notify all of their self-pipes.

In more detail:

 * sigchld_owner, the ctx* of the SIGCHLD owner, is replaced by
   sigchld_users, a list of SIGCHLD users.

 * Each ctx keeps track of whether it is on the users list, so that
   libxl__sigchld_needed and libxl__sigchld_notneeded now instead of
   idempotently installing and removing the handler, idempotently add
   or remove the ctx from the list.

   We ensure that we always have the SIGCHLD handler installed
   iff the sigchld_users list is nonempty.  To make this a bit
   easier we make sigchld_installhandler_core and
   sigchld_removehandler_core idempotent.

   Specifically, the call sites for sigchld_installhandler_core and
   sigchld_removehandler_core are updated to manipulate sigchld_users
   and only call the install or remove functions as applicable.

 * In the signal handler we walk the list of SIGCHLD users and write
   to each of their self-pipes.  That means that we need to arrange to
   defer SIGCHLD when we are manipulating the list (to avoid the
   signal handler interrupting our list manipulation); this is quite
   tiresome to arrange.

   The code as written will, on the first installation of the SIGCHLD
   handler, firstly install the real handler, then immediately replace
   it with the deferral handler.  Doing it this way makes the code
   clearer as it makes the SIGCHLD deferral machinery much more
   self-contained (and hence easier to reason about).

 * The first part of libxl__sigchld_notneeded is broken out into a new
   function sigchld_user_remove (which is also needed during for
   postfork).  And of course that first part of the function is now
   rather different, as explained above.

 * sigchld_installhandler_core no longer takes the gc argument,
   because it now deals with SIGCHLD for all ctxs.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Cc: Jim Fehlig <jfehlig@suse.com>
Cc: Ian Campbell <Ian.Campbell@citrix.com>

---
v2.1: Provide feature test macro LIBXL_HAVE_SIGCHLD_SHARING
---
 tools/libxl/libxl.h          |    9 +++
 tools/libxl/libxl_event.h    |    2 +-
 tools/libxl/libxl_fork.c     |  132 +++++++++++++++++++++++++++++++++++-------
 tools/libxl/libxl_internal.h |    3 +
 4 files changed, 123 insertions(+), 23 deletions(-)

diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index 1ac34c3..0b992d1 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -422,6 +422,15 @@
  */
 #define LIBXL_HAVE_SIGCHLD_OWNER_SELECTIVE_REAP 1
 
+/*
+ * LIBXL_HAVE_SIGCHLD_SHARING
+ *
+ * If this is defined, it is permissible for multiple libxl ctxs
+ * to simultaneously "own" SIGCHLD.  See "Subprocess handling"
+ * in libxl_event.h.
+ */
+#define LIBXL_HAVE_SIGCHLD_SHARING 1
+
 /* Functions annotated with LIBXL_EXTERNAL_CALLERS_ONLY may not be
  * called from within libxl itself. Callers outside libxl, who
  * do not #include libxl_internal.h, are fine. */
diff --git a/tools/libxl/libxl_event.h b/tools/libxl/libxl_event.h
index 824ac88..ab6ac5c 100644
--- a/tools/libxl/libxl_event.h
+++ b/tools/libxl/libxl_event.h
@@ -472,7 +472,7 @@ void libxl_osevent_occurred_timeout(libxl_ctx *ctx, void *for_libxl)
  *
  *       The application expects libxl to reap all of its children,
  *       and provides a callback to be notified of their exit
- *       statues.  The application must have only one libxl_ctx
+ *       statuses.  The application may have have multiple libxl_ctxs
  *       configured this way.
  *
  *     libxl_sigchld_owner_libxl_always_selective_reap:
diff --git a/tools/libxl/libxl_fork.c b/tools/libxl/libxl_fork.c
index b6b14fe..5b42bad 100644
--- a/tools/libxl/libxl_fork.c
+++ b/tools/libxl/libxl_fork.c
@@ -46,11 +46,18 @@ static int atfork_registered;
 static LIBXL_LIST_HEAD(, libxl__carefd) carefds =
     LIBXL_LIST_HEAD_INITIALIZER(carefds);
 
-/* non-null iff installed, protected by no_forking */
-static libxl_ctx *sigchld_owner;
+/* Protected against concurrency by no_forking.  sigchld_users is
+ * protected against being interrupted by SIGCHLD (and thus read
+ * asynchronously by the signal handler) by sigchld_defer (see
+ * below). */
+static bool sigchld_installed; /* 0 means not */
+static LIBXL_LIST_HEAD(, libxl_ctx) sigchld_users =
+    LIBXL_LIST_HEAD_INITIALIZER(sigchld_users);
 static struct sigaction sigchld_saved_action;
 
-static void sigchld_removehandler_core(void);
+static void sigchld_removehandler_core(void); /* idempotent */
+static void sigchld_user_remove(libxl_ctx *ctx); /* idempotent */
+static void sigchld_sethandler_raw(void (*handler)(int), struct sigaction *old);
 
 static void atfork_lock(void)
 {
@@ -126,8 +133,7 @@ void libxl_postfork_child_noexec(libxl_ctx *ctx)
     }
     LIBXL_LIST_INIT(&carefds);
 
-    if (sigchld_owner)
-        sigchld_removehandler_core();
+    sigchld_user_remove(ctx);
 
     atfork_unlock();
 }
@@ -152,7 +158,8 @@ int libxl__carefd_fd(const libxl__carefd *cf)
 }
 
 /*
- * Actual child process handling
+ * Low-level functions for child process handling, including
+ * the main SIGCHLD handler.
  */
 
 /* Like waitpid(,,WNOHANG) but handles all errors except ECHILD. */
@@ -176,9 +183,16 @@ static void sigchld_selfpipe_handler(libxl__egc *egc, libxl__ev_fd *ev,
 
 static void sigchld_handler(int signo)
 {
+    /* This function has to be reentrant!  Luckily it is. */
+
+    libxl_ctx *notify;
     int esave = errno;
-    int e = libxl__self_pipe_wakeup(sigchld_owner->sigchld_selfpipe[1]);
-    assert(!e); /* errors are probably EBADF, very bad */
+
+    LIBXL_LIST_FOREACH(notify, &sigchld_users, sigchld_users_entry) {
+        int e = libxl__self_pipe_wakeup(notify->sigchld_selfpipe[1]);
+        assert(!e); /* errors are probably EBADF, very bad */
+    }
+
     errno = esave;
 }
 
@@ -195,25 +209,74 @@ static void sigchld_sethandler_raw(void (*handler)(int), struct sigaction *old)
     assert(!r);
 }
 
-static void sigchld_removehandler_core(void)
+/*
+ * SIGCHLD deferral
+ *
+ * sigchld_defer and sigchld_release are a bit like using sigprocmask
+ * to block the signal only they work for the whole process.  Sadly
+ * this has to be done by setting a special handler that records the
+ * "pendingness" of the signal here in the program.  How tedious.
+ *
+ * A property of this approach is that the signal handler itself
+ * must be reentrant (see the comment in release_sigchld).
+ *
+ * Callers have the atfork_lock so there is no risk of concurrency
+ * within these functions, aside obviously from the risk of being
+ * interrupted by the signal.
+ */
+
+static volatile sig_atomic_t sigchld_occurred_while_deferred;
+
+static void sigchld_handler_when_deferred(int signo)
+{
+    sigchld_occurred_while_deferred = 1;
+}
+
+static void defer_sigchld(void)
+{
+    assert(sigchld_installed);
+    sigchld_sethandler_raw(sigchld_handler_when_deferred, 0);
+}
+
+static void release_sigchld(void)
+{
+    assert(sigchld_installed);
+    sigchld_sethandler_raw(sigchld_handler, 0);
+    if (sigchld_occurred_while_deferred) {
+        sigchld_occurred_while_deferred = 0;
+        /* We might get another SIGCHLD here, in which case
+         * sigchld_handler will be interrupted and re-entered.
+         * This is OK. */
+        sigchld_handler(SIGCHLD);
+    }
+}
+
+/*
+ * Meat of the child process handling.
+ */
+
+static void sigchld_removehandler_core(void) /* idempotent */
 {
     struct sigaction was;
     int r;
     
+    if (!sigchld_installed)
+        return;
+
     r = sigaction(SIGCHLD, &sigchld_saved_action, &was);
     assert(!r);
     assert(!(was.sa_flags & SA_SIGINFO));
     assert(was.sa_handler == sigchld_handler);
-    sigchld_owner = 0;
+
+    sigchld_installed = 0;
 }
 
-static void sigchld_installhandler_core(libxl__gc *gc)
+static void sigchld_installhandler_core(void) /* idempotent */
 {
-    struct sigaction ours;
-    int r;
+    if (sigchld_installed)
+        return;
 
-    assert(!sigchld_owner);
-    sigchld_owner = CTX;
+    sigchld_installed = 1;
 
     sigchld_sethandler_raw(sigchld_handler, &sigchld_saved_action);
 
@@ -223,15 +286,32 @@ static void sigchld_installhandler_core(libxl__gc *gc)
              sigchld_saved_action.sa_handler == SIG_IGN)));
 }
 
-void libxl__sigchld_notneeded(libxl__gc *gc) /* non-reentrant, idempotent */
+static void sigchld_user_remove(libxl_ctx *ctx) /* idempotent */
 {
-    int rc;
+    if (!ctx->sigchld_user_registered)
+        return;
 
     atfork_lock();
-    if (sigchld_owner == CTX)
+    defer_sigchld();
+
+    LIBXL_LIST_REMOVE(ctx, sigchld_users_entry);
+
+    release_sigchld();
+
+    if (LIBXL_LIST_EMPTY(&sigchld_users))
         sigchld_removehandler_core();
+
     atfork_unlock();
 
+    ctx->sigchld_user_registered = 0;
+}
+
+void libxl__sigchld_notneeded(libxl__gc *gc) /* non-reentrant, idempotent */
+{
+    int rc;
+
+    sigchld_user_remove(CTX);
+
     if (libxl__ev_fd_isregistered(&CTX->sigchld_selfpipe_efd)) {
         rc = libxl__ev_fd_modify(gc, &CTX->sigchld_selfpipe_efd, 0);
         if (rc)
@@ -262,12 +342,20 @@ int libxl__sigchld_needed(libxl__gc *gc) /* non-reentrant, idempotent */
         rc = libxl__ev_fd_modify(gc, &CTX->sigchld_selfpipe_efd, POLLIN);
         if (rc) goto out;
     }
+    if (!CTX->sigchld_user_registered) {
+        atfork_lock();
 
-    atfork_lock();
-    if (sigchld_owner != CTX) {
-        sigchld_installhandler_core(gc);
+        sigchld_installhandler_core();
+
+        defer_sigchld();
+
+        LIBXL_LIST_INSERT_HEAD(&sigchld_users, CTX, sigchld_users_entry);
+
+        release_sigchld();
+        atfork_unlock();
+
+        CTX->sigchld_user_registered = 1;
     }
-    atfork_unlock();
 
     rc = 0;
  out:
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index fba681d..8429448 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -355,6 +355,8 @@ struct libxl__ctx {
     int sigchld_selfpipe[2]; /* [0]==-1 means handler not installed */
     libxl__ev_fd sigchld_selfpipe_efd;
     LIBXL_LIST_HEAD(, libxl__ev_child) children;
+    bool sigchld_user_registered;
+    LIBXL_LIST_ENTRY(libxl_ctx) sigchld_users_entry;
 
     libxl_version_info version_info;
 };
@@ -840,6 +842,7 @@ _hidden void libxl__poller_wakeup(libxl__egc *egc, libxl__poller *p);
 extern const libxl_childproc_hooks libxl__childproc_default_hooks;
 int libxl__sigchld_needed(libxl__gc*); /* non-reentrant idempotent, logs errs */
 void libxl__sigchld_notneeded(libxl__gc*); /* non-reentrant idempotent */
+void libxl__sigchld_check_stale_handler(void);
 int libxl__self_pipe_wakeup(int fd); /* returns 0 or -1 setting errno */
 int libxl__self_pipe_eatall(int fd); /* returns 0 or -1 setting errno */
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 58+ messages in thread

* Re: [PATCH 07/12] libxl: fork: Provide ..._always_selective_reap
  2014-01-17 16:24 ` [PATCH 07/12] libxl: fork: Provide ..._always_selective_reap Ian Jackson
@ 2014-01-17 22:17   ` Jim Fehlig
  0 siblings, 0 replies; 58+ messages in thread
From: Jim Fehlig @ 2014-01-17 22:17 UTC (permalink / raw)
  To: Ian Jackson; +Cc: xen-devel, Ian Campbell

Ian Jackson wrote:
> Applications exist which want to use libxl in an event-driven mode but
> which do not integrate child termination into their event system, but
> instead reap all their own children synchronously.
>
> In such an application libxl must own SIGCHLD but avoid reaping any
> children that don't belong to libxl.
>
> Provide libxl_sigchld_owner_libxl_always_selective_reap which has this
> behaviour.
>
> Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
> Cc: Jim Fehlig <jfehlig@suse.com>
> Cc: Ian Campbell <Ian.Campbell@citrix.com>
> Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
>
> ---
> v2: Document the new mode in the big "Subprocess handling" comment.
> ---
>  tools/libxl/libxl_event.h |   11 +++++++++++
>  tools/libxl/libxl_fork.c  |    7 +++++++
>  2 files changed, 18 insertions(+)
>
> diff --git a/tools/libxl/libxl_event.h b/tools/libxl/libxl_event.h
> index 3c93955..824ac88 100644
> --- a/tools/libxl/libxl_event.h
> +++ b/tools/libxl/libxl_event.h
> @@ -474,6 +474,12 @@ void libxl_osevent_occurred_timeout(libxl_ctx *ctx, void *for_libxl)
>   *       and provides a callback to be notified of their exit
>   *       statues.  The application must have only one libxl_ctx
>   *       configured this way.
> + *
> + *     libxl_sigchld_owner_libxl_always_selective_reap:
> + *
> + *       The application expects to reap all of its own children
> + *       synchronously, and does not use SIGCHLD.  libxl is
> + *       to install a SIGCHLD handler.
>   */
>  
>  
> @@ -491,6 +497,11 @@ typedef enum {
>      /* libxl owns SIGCHLD all the time, and the application is
>       * relying on libxl's event loop for reaping its children too. */
>      libxl_sigchld_owner_libxl_always,
> +
> +    /* libxl owns SIGCHLD all the time, but it must only reap its own
> +     * children.  The application will reap its own children
> +     * synchronously with waitpid, without the assistance of SIGCHLD. */
> +    libxl_sigchld_owner_libxl_always_selective_reap,
>   

ACK to the doc improvements here.  Much clearer IMO.

Regards,
Jim

>  } libxl_sigchld_owner;
>  
>  typedef struct {
> diff --git a/tools/libxl/libxl_fork.c b/tools/libxl/libxl_fork.c
> index b2325e0..16e17f6 100644
> --- a/tools/libxl/libxl_fork.c
> +++ b/tools/libxl/libxl_fork.c
> @@ -268,6 +268,7 @@ static bool chldmode_ours(libxl_ctx *ctx, bool creating)
>      case libxl_sigchld_owner_mainloop:
>          return 0;
>      case libxl_sigchld_owner_libxl_always:
> +    case libxl_sigchld_owner_libxl_always_selective_reap:
>          return 1;
>      }
>      abort();
> @@ -398,6 +399,12 @@ static void sigchld_selfpipe_handler(libxl__egc *egc, libxl__ev_fd *ev,
>      int e = libxl__self_pipe_eatall(selfpipe);
>      if (e) LIBXL__EVENT_DISASTER(egc, "read sigchld pipe", e, 0);
>  
> +    if (CTX->childproc_hooks->chldowner
> +        == libxl_sigchld_owner_libxl_always_selective_reap) {
> +        childproc_checkall(egc);
> +        return;
> +    }
> +
>      while (chldmode_ours(CTX, 0) /* in case the app changes the mode */) {
>          int status;
>          pid_t pid = checked_waitpid(egc, -1, &status);
>   

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 00/12] libxl: fork: SIGCHLD flexibility
  2014-01-17 16:23 [PATCH 00/12] libxl: fork: SIGCHLD flexibility Ian Jackson
                   ` (12 preceding siblings ...)
  2014-01-17 16:37 ` [PATCH 00/12] libxl: fork: SIGCHLD flexibility Ian Jackson
@ 2014-01-17 22:29 ` Jim Fehlig
  2014-01-20 18:14   ` Jim Fehlig
  13 siblings, 1 reply; 58+ messages in thread
From: Jim Fehlig @ 2014-01-17 22:29 UTC (permalink / raw)
  To: Ian Jackson; +Cc: xen-devel, Ian Campbell

Ian Jackson wrote:
> libvirt reaps its children synchronously and has no central pid
> registry and no dispatch mechanism.  libxl does have a pid registry so
> can provide a selective reaping facility, but that is not currently
> exposed.  Here we expose it.
>
> Also, libvirt has multiple libxl ctxs.  Prior to this series it is not
> possible for those to share SIGCHLD: libxl expects either the
> application, or _one_ libxl ctx, to own SIGCHLD.  In the final patch
> of this series we relax this restriction by having libxl maintain a
> process-wide list of the libxl ctxs that are supposed to be interested
> in SIGCHLD.
>
> I have not tested the selective reaping functionality.  The most
> plausible test environment for that is a suitably modified libvirt.
>   

I've been testing this series (plus 1/3 in your "tools: Miscellanous
fixes for 4.4" series) on a suitably modified libvirt and the results
look good so far :).

I'm running four scripts concurrently that

- start / stop domA
- save / restore domB
- reboot domC
- get stats on dom{A,B,C}

They have been running for about an hour now, and I haven't noticed any
problems

Thanks!
Jim

> I have tested the new SIGCHLD plumbing, at least with a single ctx,
> since xl uses it.  Testing that it works in a real multi-ctx
> application is again probably most easily done with libvirt.
>
> I hope that with this series applied, simply having libvirt pass
> libxl_sigchld_owner_libxl_always_selective_reap should be sufficient
> for everything to work.  There is no need to specifically request the
> SIGCHLD-sharing.
>
>  a 01/12] libxl: fork: Break out checked_waitpid
>  a 02/12] libxl: fork: Break out childproc_reaped_ours
>  a 03/12] libxl: fork: Clarify docs for libxl_sigchld_owner
> *  04/12] libxl: fork: Document libxl_sigchld_owner_libxl better
>  a 05/12] libxl: fork: assert that chldmode is right
>  a 06/12] libxl: fork: Provide libxl_childproc_sigchld_occurred
> +a 07/12] libxl: fork: Provide ..._always_selective_reap
>  a 08/12] libxl: fork: Provide LIBXL_HAVE_SIGCHLD_SELECTIVE_REAP
> *  09/12] libxl: fork: Rename sigchld handler functions
> *  10/12] libxl: fork: Break out sigchld_installhandler_core
> *  11/12] libxl: fork: Break out sigchld_sethandler_raw
> *  12/12] libxl: fork: Share SIGCHLD handler amongst ctxs
>
> (a = acked; * = new patch; + = modified patch)
>
>   

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 12/12] libxl: fork: Share SIGCHLD handler amongst ctxs
  2014-01-17 18:13   ` Ian Jackson
@ 2014-01-20  9:56     ` Ian Campbell
  2014-01-21 14:40       ` Ian Jackson
  0 siblings, 1 reply; 58+ messages in thread
From: Ian Campbell @ 2014-01-20  9:56 UTC (permalink / raw)
  To: Ian Jackson; +Cc: Jim Fehlig, xen-devel

On Fri, 2014-01-17 at 18:13 +0000, Ian Jackson wrote:
> diff --git a/tools/libxl/libxl_event.h b/tools/libxl/libxl_event.h
> index 824ac88..ab6ac5c 100644
> --- a/tools/libxl/libxl_event.h
> +++ b/tools/libxl/libxl_event.h
> @@ -472,7 +472,7 @@ void libxl_osevent_occurred_timeout(libxl_ctx *ctx, void *for_libxl)
>   *
>   *       The application expects libxl to reap all of its children,
>   *       and provides a callback to be notified of their exit
> - *       statues.  The application must have only one libxl_ctx
> + *       statuses.  The application may have have multiple libxl_ctxs

"have have"

It's hard to believe but that seems to be the only comment I have, I
think I actually grokked the whole locking via SIGCHLD deferral thing
and it seems to make sense.

Acked-by: Ian Campbell <ian.campbell@citrix.com>

Ian.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 11/12] libxl: fork: Break out sigchld_sethandler_raw
  2014-01-17 16:24 ` [PATCH 11/12] libxl: fork: Break out sigchld_sethandler_raw Ian Jackson
@ 2014-01-20  9:58   ` Ian Campbell
  2014-01-20 17:57     ` Ian Jackson
  0 siblings, 1 reply; 58+ messages in thread
From: Ian Campbell @ 2014-01-20  9:58 UTC (permalink / raw)
  To: Ian Jackson; +Cc: Jim Fehlig, xen-devel

On Fri, 2014-01-17 at 16:24 +0000, Ian Jackson wrote:
> We are going to want introduce another call site in the final
> substantive patch.
> 
> Pure code motion; no functional change.
> 
> Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
> Cc: Jim Fehlig <jfehlig@suse.com>
> Cc: Ian Campbell <Ian.Campbell@citrix.com>
> ---
>  tools/libxl/libxl_fork.c |   20 ++++++++++++++------
>  1 file changed, 14 insertions(+), 6 deletions(-)
> 
> diff --git a/tools/libxl/libxl_fork.c b/tools/libxl/libxl_fork.c
> index ce8e8eb..b6b14fe 100644
> --- a/tools/libxl/libxl_fork.c
> +++ b/tools/libxl/libxl_fork.c
> @@ -182,6 +182,19 @@ static void sigchld_handler(int signo)
>      errno = esave;
>  }
>  
> +static void sigchld_sethandler_raw(void (*handler)(int), struct sigaction *old)
> +{
> +    struct sigaction ours;
> +    int r;
> +
> +    memset(&ours,0,sizeof(ours));
> +    ours.sa_handler = handler;
> +    sigemptyset(&ours.sa_mask);
> +    ours.sa_flags = SA_NOCLDSTOP | SA_RESTART;
> +    r = sigaction(SIGCHLD, &ours, old);
> +    assert(!r);
> +}
> +
>  static void sigchld_removehandler_core(void)
>  {
>      struct sigaction was;
> @@ -202,12 +215,7 @@ static void sigchld_installhandler_core(libxl__gc *gc)
>      assert(!sigchld_owner);
>      sigchld_owner = CTX;
>  
> -    memset(&ours,0,sizeof(ours));

Is "ours" now an unused variable in this context?

> -    ours.sa_handler = sigchld_handler;
> -    sigemptyset(&ours.sa_mask);
> -    ours.sa_flags = SA_NOCLDSTOP | SA_RESTART;
> -    r = sigaction(SIGCHLD, &ours, &sigchld_saved_action);
> -    assert(!r);
> +    sigchld_sethandler_raw(sigchld_handler, &sigchld_saved_action);
>  
>      assert(((void)"application must negotiate with libxl about SIGCHLD",
>              !(sigchld_saved_action.sa_flags & SA_SIGINFO) &&

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 10/12] libxl: fork: Break out sigchld_installhandler_core
  2014-01-17 16:24 ` [PATCH 10/12] libxl: fork: Break out sigchld_installhandler_core Ian Jackson
@ 2014-01-20  9:59   ` Ian Campbell
  0 siblings, 0 replies; 58+ messages in thread
From: Ian Campbell @ 2014-01-20  9:59 UTC (permalink / raw)
  To: Ian Jackson; +Cc: Jim Fehlig, xen-devel

On Fri, 2014-01-17 at 16:24 +0000, Ian Jackson wrote:
> Pure code motion.  This is going to make the final substantive patch
> easier to read.
> 
> Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
> Cc: Jim Fehlig <jfehlig@suse.com>

Acked-by: Ian Campbell <Ian.Campbell@citrix.com>

> ---
>  tools/libxl/libxl_fork.c |   38 ++++++++++++++++++++++----------------
>  1 file changed, 22 insertions(+), 16 deletions(-)
> 
> diff --git a/tools/libxl/libxl_fork.c b/tools/libxl/libxl_fork.c
> index a15af8e..ce8e8eb 100644
> --- a/tools/libxl/libxl_fork.c
> +++ b/tools/libxl/libxl_fork.c
> @@ -194,6 +194,27 @@ static void sigchld_removehandler_core(void)
>      sigchld_owner = 0;
>  }
>  
> +static void sigchld_installhandler_core(libxl__gc *gc)
> +{
> +    struct sigaction ours;
> +    int r;
> +
> +    assert(!sigchld_owner);
> +    sigchld_owner = CTX;
> +
> +    memset(&ours,0,sizeof(ours));
> +    ours.sa_handler = sigchld_handler;
> +    sigemptyset(&ours.sa_mask);
> +    ours.sa_flags = SA_NOCLDSTOP | SA_RESTART;
> +    r = sigaction(SIGCHLD, &ours, &sigchld_saved_action);
> +    assert(!r);
> +
> +    assert(((void)"application must negotiate with libxl about SIGCHLD",
> +            !(sigchld_saved_action.sa_flags & SA_SIGINFO) &&
> +            (sigchld_saved_action.sa_handler == SIG_DFL ||
> +             sigchld_saved_action.sa_handler == SIG_IGN)));
> +}
> +
>  void libxl__sigchld_notneeded(libxl__gc *gc) /* non-reentrant, idempotent */
>  {
>      int rc;
> @@ -236,22 +257,7 @@ int libxl__sigchld_needed(libxl__gc *gc) /* non-reentrant, idempotent */
>  
>      atfork_lock();
>      if (sigchld_owner != CTX) {
> -        struct sigaction ours;
> -
> -        assert(!sigchld_owner);
> -        sigchld_owner = CTX;
> -
> -        memset(&ours,0,sizeof(ours));
> -        ours.sa_handler = sigchld_handler;
> -        sigemptyset(&ours.sa_mask);
> -        ours.sa_flags = SA_NOCLDSTOP | SA_RESTART;
> -        r = sigaction(SIGCHLD, &ours, &sigchld_saved_action);
> -        assert(!r);
> -
> -        assert(((void)"application must negotiate with libxl about SIGCHLD",
> -                !(sigchld_saved_action.sa_flags & SA_SIGINFO) &&
> -                (sigchld_saved_action.sa_handler == SIG_DFL ||
> -                 sigchld_saved_action.sa_handler == SIG_IGN)));
> +        sigchld_installhandler_core(gc);
>      }
>      atfork_unlock();
>  

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 09/12] libxl: fork: Rename sigchld handler functions
  2014-01-17 16:24 ` [PATCH 09/12] libxl: fork: Rename sigchld handler functions Ian Jackson
@ 2014-01-20  9:59   ` Ian Campbell
  0 siblings, 0 replies; 58+ messages in thread
From: Ian Campbell @ 2014-01-20  9:59 UTC (permalink / raw)
  To: Ian Jackson; +Cc: Jim Fehlig, xen-devel

On Fri, 2014-01-17 at 16:24 +0000, Ian Jackson wrote:
> We are going to change these functions so that different libxl ctx's
> can share a single SIGCHLD handler.  Rename them now to a new name
> which doesn't imply unconditional handler installation or removal.
> 
> Also note in the comments that they are idempotent.
> 
> No functional change.
> 
> Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
> Cc: Jim Fehlig <jfehlig@suse.com>

Acked-by: Ian Campbell <Ian.Campbell@citrix.com>

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 11/12] libxl: fork: Break out sigchld_sethandler_raw
  2014-01-20  9:58   ` Ian Campbell
@ 2014-01-20 17:57     ` Ian Jackson
  0 siblings, 0 replies; 58+ messages in thread
From: Ian Jackson @ 2014-01-20 17:57 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Jim Fehlig, xen-devel

Ian Campbell writes ("Re: [PATCH 11/12] libxl: fork: Break out sigchld_sethandler_raw"):
> On Fri, 2014-01-17 at 16:24 +0000, Ian Jackson wrote:
> > @@ -202,12 +215,7 @@ static void sigchld_installhandler_core(libxl__gc *gc)
> >      assert(!sigchld_owner);
> >      sigchld_owner = CTX;
> >  
> > -    memset(&ours,0,sizeof(ours));
> 
> Is "ours" now an unused variable in this context?

Yes.  In fact these were deleted in the next patch.  I have moved the
deletion to this patch.

Ian.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 00/12] libxl: fork: SIGCHLD flexibility
  2014-01-17 22:29 ` Jim Fehlig
@ 2014-01-20 18:14   ` Jim Fehlig
  2014-01-21 14:46     ` Ian Jackson
  2014-01-21 15:28     ` [PATCH 00/12] libxl: fork: SIGCHLD flexibility Ian Jackson
  0 siblings, 2 replies; 58+ messages in thread
From: Jim Fehlig @ 2014-01-20 18:14 UTC (permalink / raw)
  To: Ian Jackson; +Cc: xen-devel, Ian Campbell

Jim Fehlig wrote:
> Ian Jackson wrote:
>   
>> libvirt reaps its children synchronously and has no central pid
>> registry and no dispatch mechanism.  libxl does have a pid registry so
>> can provide a selective reaping facility, but that is not currently
>> exposed.  Here we expose it.
>>
>> Also, libvirt has multiple libxl ctxs.  Prior to this series it is not
>> possible for those to share SIGCHLD: libxl expects either the
>> application, or _one_ libxl ctx, to own SIGCHLD.  In the final patch
>> of this series we relax this restriction by having libxl maintain a
>> process-wide list of the libxl ctxs that are supposed to be interested
>> in SIGCHLD.
>>
>> I have not tested the selective reaping functionality.  The most
>> plausible test environment for that is a suitably modified libvirt.
>>   
>>     
>
> I've been testing this series (plus 1/3 in your "tools: Miscellanous
> fixes for 4.4" series) on a suitably modified libvirt and the results
> look good so far :).
>
> I'm running four scripts concurrently that
>
> - start / stop domA
> - save / restore domB
> - reboot domC
> - get stats on dom{A,B,C}
>
> They have been running for about an hour now, and I haven't noticed any
> problems
>   

I let this run over the weekend and today noticed libvirtd was deadlocked

Thread 5 (Thread 0x7ffff10ea700 (LWP 42142)):
#0  0x00007ffff4d20b7d in read () from /lib64/libpthread.so.0
#1  0x00007fffeb88d028 in libxl__self_pipe_eatall (fd=39) at
libxl_event.c:1369
#2  0x00007fffeb88f52c in sigchld_selfpipe_handler (egc=0x7ffff10e9270,
ev=0x5555559986e8, fd=39,
    events=1, revents=1) at libxl_fork.c:501
#3  0x00007fffeb88bbf5 in afterpoll_internal (egc=0x7ffff10e9270,
poller=0x5555559a2b40, nfds=3,
    fds=0x5555558d96a0, now=...) at libxl_event.c:990
#4  0x00007fffeb88d2d2 in eventloop_iteration (egc=0x7ffff10e9270,
poller=0x5555559a2b40)
    at libxl_event.c:1431
#5  0x00007fffeb88de18 in libxl__ao_inprogress (ao=0x5555559beb30,
    file=0x7fffeb8a0a1b "libxl_create.c", line=1356,
    func=0x7fffeb8a1530 <__func__.16339> "do_domain_create") at
libxl_event.c:1676
#6  0x00007fffeb86813f in do_domain_create (ctx=0x555555998550,
d_config=0x7ffff10e94d0,
    domid=0x7ffff10e944c, restore_fd=-1, checkpointed_stream=0,
ao_how=0x0, aop_console_how=0x0)
    at libxl_create.c:1356
#7  0x00007fffeb86820d in libxl_domain_create_new (ctx=0x555555998550,
d_config=0x7ffff10e94d0,
    domid=0x7ffff10e944c, ao_how=0x0, aop_console_how=0x0) at
libxl_create.c:1377
#8  0x00007fffebad01b6 in libxlVmStart (driver=0x5555558b7be0,
vm=0x5555558d3280,
    start_paused=false, restore_fd=-1) at libxl/libxl_driver.c:630
#9  0x00007fffebad7594 in libxlDomainCreateWithFlags
(dom=0x5555559b9c00, flags=0)
    at libxl/libxl_driver.c:2924
#...

Thread 1 (Thread 0x7ffff7fc7840 (LWP 42135)):
#0  0x00007ffff4d2089c in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x00007ffff4d1c4f2 in _L_lock_957 () from /lib64/libpthread.so.0
#2  0x00007ffff4d1c35a in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x00007fffeb88943a in libxl__ctx_lock (ctx=0x555555998550) at
libxl_internal.h:2760
#4  0x00007fffeb88bf3d in libxl_osevent_occurred_fd (ctx=0x555555998550,
    for_libxl=0x5555559953e0, fd=45, events_ign=0, revents_ign=1) at
libxl_event.c:1049
#5  0x00007fffebacd56c in libxlDomainObjFDEventCallback (watch=40,
fd=45, vir_events=1,
    fd_info=0x5555559b5b80) at libxl/libxl_domain.c:132
#...

It looks like libxl is waiting for a read with a ctx locked on thread 5,
then receives an occurred_fd event on the same ctx in thread 1.  But it
is not clear to me why read() is blocking...

Regards,
Jim

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 12/12] libxl: fork: Share SIGCHLD handler amongst ctxs
  2014-01-20  9:56     ` Ian Campbell
@ 2014-01-21 14:40       ` Ian Jackson
  2014-01-21 14:53         ` Ian Campbell
  0 siblings, 1 reply; 58+ messages in thread
From: Ian Jackson @ 2014-01-21 14:40 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Jim Fehlig, xen-devel

Ian Campbell writes ("Re: [PATCH 12/12] libxl: fork: Share SIGCHLD handler amongst ctxs"):
> It's hard to believe but that seems to be the only comment I have, I
> think I actually grokked the whole locking via SIGCHLD deferral thing
> and it seems to make sense.
> 
> Acked-by: Ian Campbell <ian.campbell@citrix.com>

Following a pub conversation, and a conversation with a friend of mine
last night over a glass of port, I think this fixup patch needs to be
applied too.

Jim, I'm going to look at your crash now.

Ian.

>From 7a593aa6903f4a2e3b927e546da32582184843f5 Mon Sep 17 00:00:00 2001
From: Ian Jackson <ian.jackson@eu.citrix.com>
Date: Tue, 21 Jan 2014 14:22:41 +0000
Subject: [PATCH] libxl: fork: Fixup SIGCHLD sharing

* Use a mutex for defer_sigchld, to guard against concurrency between
  the thread calling defer_sigchld and an instance of the primary
  signal handler on another thread.

* libxl_sigchld_owner_libxl_always is incompatible with SIGCHLD sharing.
  Document this correctly.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
---
 tools/libxl/libxl_event.h |   14 ++++++++------
 tools/libxl/libxl_fork.c  |   26 ++++++++++++++++++++++++--
 2 files changed, 32 insertions(+), 8 deletions(-)

diff --git a/tools/libxl/libxl_event.h b/tools/libxl/libxl_event.h
index f0703f6..ca43cb9 100644
--- a/tools/libxl/libxl_event.h
+++ b/tools/libxl/libxl_event.h
@@ -470,16 +470,18 @@ void libxl_osevent_occurred_timeout(libxl_ctx *ctx, void *for_libxl)
  *
  *     libxl_sigchld_owner_libxl_always:
  *
- *       The application expects libxl to reap all of its children,
- *       and provides a callback to be notified of their exit
- *       statuses.  The application may have multiple libxl_ctxs
- *       configured this way.
+ *       The application expects this libxl ctx to reap all of the
+ *       process's children, and provides a callback to be notified of
+ *       their exit statuses.  The application must have only one
+ *       libxl_ctx configured this way.
  *
  *     libxl_sigchld_owner_libxl_always_selective_reap:
  *
  *       The application expects to reap all of its own children
- *       synchronously, and does not use SIGCHLD.  libxl is
- *       to install a SIGCHLD handler.
+ *       synchronously, and does not use SIGCHLD.  libxl is to install
+ *       a SIGCHLD handler.  The application may have multiple
+ *       libxl_ctxs configured this way; in which case all of its ctxs
+ *       must be so configured.
  */
 
 
diff --git a/tools/libxl/libxl_fork.c b/tools/libxl/libxl_fork.c
index 5b42bad..2432512 100644
--- a/tools/libxl/libxl_fork.c
+++ b/tools/libxl/libxl_fork.c
@@ -51,6 +51,7 @@ static LIBXL_LIST_HEAD(, libxl__carefd) carefds =
  * asynchronously by the signal handler) by sigchld_defer (see
  * below). */
 static bool sigchld_installed; /* 0 means not */
+static pthread_mutex_t sigchld_defer_mutex = PTHREAD_MUTEX_INITIALIZER;
 static LIBXL_LIST_HEAD(, libxl_ctx) sigchld_users =
     LIBXL_LIST_HEAD_INITIALIZER(sigchld_users);
 static struct sigaction sigchld_saved_action;
@@ -188,11 +189,17 @@ static void sigchld_handler(int signo)
     libxl_ctx *notify;
     int esave = errno;
 
+    int r = pthread_mutex_lock(&sigchld_defer_mutex);
+    assert(!r);
+
     LIBXL_LIST_FOREACH(notify, &sigchld_users, sigchld_users_entry) {
         int e = libxl__self_pipe_wakeup(notify->sigchld_selfpipe[1]);
         assert(!e); /* errors are probably EBADF, very bad */
     }
 
+    r = pthread_mutex_unlock(&sigchld_defer_mutex);
+    assert(!r);
+
     errno = esave;
 }
 
@@ -221,8 +228,10 @@ static void sigchld_sethandler_raw(void (*handler)(int), struct sigaction *old)
  * must be reentrant (see the comment in release_sigchld).
  *
  * Callers have the atfork_lock so there is no risk of concurrency
- * within these functions, aside obviously from the risk of being
- * interrupted by the signal.
+ * within these functions, aside from the risk of being interrupted by
+ * the signal.  We use sigchld_defer_mutex to guard against the
+ * possibility of the real signal handler being still running on
+ * another thread.
  */
 
 static volatile sig_atomic_t sigchld_occurred_while_deferred;
@@ -235,12 +244,25 @@ static void sigchld_handler_when_deferred(int signo)
 static void defer_sigchld(void)
 {
     assert(sigchld_installed);
+
     sigchld_sethandler_raw(sigchld_handler_when_deferred, 0);
+
+    /* Now _this thread_ cannot any longer be interrupted by the
+     * signal, so we can take the mutex without risk of deadlock.  If
+     * another thread is in the signal handler, either it or we will
+     * block and wait for the other. */
+
+    int r = pthread_mutex_lock(&sigchld_defer_mutex);
+    assert(!r);
 }
 
 static void release_sigchld(void)
 {
     assert(sigchld_installed);
+
+    int r = pthread_mutex_unlock(&sigchld_defer_mutex);
+    assert(!r);
+
     sigchld_sethandler_raw(sigchld_handler, 0);
     if (sigchld_occurred_while_deferred) {
         sigchld_occurred_while_deferred = 0;
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 58+ messages in thread

* Re: [PATCH 00/12] libxl: fork: SIGCHLD flexibility
  2014-01-20 18:14   ` Jim Fehlig
@ 2014-01-21 14:46     ` Ian Jackson
  2014-01-21 15:11       ` [PATCH 13/12] libxl: events: Break out libxl__pipe_nonblock, _close Ian Jackson
  2014-01-21 15:28     ` [PATCH 00/12] libxl: fork: SIGCHLD flexibility Ian Jackson
  1 sibling, 1 reply; 58+ messages in thread
From: Ian Jackson @ 2014-01-21 14:46 UTC (permalink / raw)
  To: Jim Fehlig; +Cc: xen-devel, Ian Campbell

Jim Fehlig writes ("Re: [Xen-devel] [PATCH 00/12] libxl: fork: SIGCHLD flexibility"):
...
> It looks like libxl is waiting for a read with a ctx locked on thread 5,
> then receives an occurred_fd event on the same ctx in thread 1.  But it
> is not clear to me why read() is blocking...

Presumably the sigchld handler (bottom half) lost the race with a
previous instance of the sigchld_selfpipe_handler (top half), causing
sigchld_selfpipe_handler to run when the pipe was in fact empty.

And the real bug is that nothing sets the pipe to nonblocking!
Bear with me.

Ian.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 12/12] libxl: fork: Share SIGCHLD handler amongst ctxs
  2014-01-21 14:40       ` Ian Jackson
@ 2014-01-21 14:53         ` Ian Campbell
  2014-01-21 15:09           ` Ian Jackson
  0 siblings, 1 reply; 58+ messages in thread
From: Ian Campbell @ 2014-01-21 14:53 UTC (permalink / raw)
  To: Ian Jackson; +Cc: Jim Fehlig, xen-devel

On Tue, 2014-01-21 at 14:40 +0000, Ian Jackson wrote:
> Ian Campbell writes ("Re: [PATCH 12/12] libxl: fork: Share SIGCHLD handler amongst ctxs"):
> > It's hard to believe but that seems to be the only comment I have, I
> > think I actually grokked the whole locking via SIGCHLD deferral thing
> > and it seems to make sense.
> > 
> > Acked-by: Ian Campbell <ian.campbell@citrix.com>
> 
> Following a pub conversation, and a conversation with a friend of mine
> last night over a glass of port, I think this fixup patch needs to be
> applied too.
> 
> Jim, I'm going to look at your crash now.
> 
> Ian.
> 
> From 7a593aa6903f4a2e3b927e546da32582184843f5 Mon Sep 17 00:00:00 2001
> From: Ian Jackson <ian.jackson@eu.citrix.com>
> Date: Tue, 21 Jan 2014 14:22:41 +0000
> Subject: [PATCH] libxl: fork: Fixup SIGCHLD sharing
> 
> * Use a mutex for defer_sigchld, to guard against concurrency between
>   the thread calling defer_sigchld and an instance of the primary
>   signal handler on another thread.
> 
> * libxl_sigchld_owner_libxl_always is incompatible with SIGCHLD sharing.
>   Document this correctly.
> 
> Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>

Acked-by: Ian Campbell <ian.campbell@citrix.com>

Including if you want to fold this into another patch.

> ---
>  tools/libxl/libxl_event.h |   14 ++++++++------
>  tools/libxl/libxl_fork.c  |   26 ++++++++++++++++++++++++--
>  2 files changed, 32 insertions(+), 8 deletions(-)
> 
> diff --git a/tools/libxl/libxl_event.h b/tools/libxl/libxl_event.h
> index f0703f6..ca43cb9 100644
> --- a/tools/libxl/libxl_event.h
> +++ b/tools/libxl/libxl_event.h
> @@ -470,16 +470,18 @@ void libxl_osevent_occurred_timeout(libxl_ctx *ctx, void *for_libxl)
>   *
>   *     libxl_sigchld_owner_libxl_always:
>   *
> - *       The application expects libxl to reap all of its children,
> - *       and provides a callback to be notified of their exit
> - *       statuses.  The application may have multiple libxl_ctxs
> - *       configured this way.
> + *       The application expects this libxl ctx to reap all of the
> + *       process's children, and provides a callback to be notified of
> + *       their exit statuses.  The application must have only one
> + *       libxl_ctx configured this way.
>   *
>   *     libxl_sigchld_owner_libxl_always_selective_reap:
>   *
>   *       The application expects to reap all of its own children
> - *       synchronously, and does not use SIGCHLD.  libxl is
> - *       to install a SIGCHLD handler.
> + *       synchronously, and does not use SIGCHLD.  libxl is to install
> + *       a SIGCHLD handler.  The application may have multiple
> + *       libxl_ctxs configured this way; in which case all of its ctxs
> + *       must be so configured.
>   */
>  
> 
> diff --git a/tools/libxl/libxl_fork.c b/tools/libxl/libxl_fork.c
> index 5b42bad..2432512 100644
> --- a/tools/libxl/libxl_fork.c
> +++ b/tools/libxl/libxl_fork.c
> @@ -51,6 +51,7 @@ static LIBXL_LIST_HEAD(, libxl__carefd) carefds =
>   * asynchronously by the signal handler) by sigchld_defer (see
>   * below). */
>  static bool sigchld_installed; /* 0 means not */
> +static pthread_mutex_t sigchld_defer_mutex = PTHREAD_MUTEX_INITIALIZER;
>  static LIBXL_LIST_HEAD(, libxl_ctx) sigchld_users =
>      LIBXL_LIST_HEAD_INITIALIZER(sigchld_users);
>  static struct sigaction sigchld_saved_action;
> @@ -188,11 +189,17 @@ static void sigchld_handler(int signo)
>      libxl_ctx *notify;
>      int esave = errno;
>  
> +    int r = pthread_mutex_lock(&sigchld_defer_mutex);
> +    assert(!r);
> +
>      LIBXL_LIST_FOREACH(notify, &sigchld_users, sigchld_users_entry) {
>          int e = libxl__self_pipe_wakeup(notify->sigchld_selfpipe[1]);
>          assert(!e); /* errors are probably EBADF, very bad */
>      }
>  
> +    r = pthread_mutex_unlock(&sigchld_defer_mutex);
> +    assert(!r);
> +
>      errno = esave;
>  }
>  
> @@ -221,8 +228,10 @@ static void sigchld_sethandler_raw(void (*handler)(int), struct sigaction *old)
>   * must be reentrant (see the comment in release_sigchld).
>   *
>   * Callers have the atfork_lock so there is no risk of concurrency
> - * within these functions, aside obviously from the risk of being
> - * interrupted by the signal.
> + * within these functions, aside from the risk of being interrupted by
> + * the signal.  We use sigchld_defer_mutex to guard against the
> + * possibility of the real signal handler being still running on
> + * another thread.
>   */
>  
>  static volatile sig_atomic_t sigchld_occurred_while_deferred;
> @@ -235,12 +244,25 @@ static void sigchld_handler_when_deferred(int signo)
>  static void defer_sigchld(void)
>  {
>      assert(sigchld_installed);
> +
>      sigchld_sethandler_raw(sigchld_handler_when_deferred, 0);
> +
> +    /* Now _this thread_ cannot any longer be interrupted by the
> +     * signal, so we can take the mutex without risk of deadlock.  If
> +     * another thread is in the signal handler, either it or we will
> +     * block and wait for the other. */
> +
> +    int r = pthread_mutex_lock(&sigchld_defer_mutex);
> +    assert(!r);
>  }
>  
>  static void release_sigchld(void)
>  {
>      assert(sigchld_installed);
> +
> +    int r = pthread_mutex_unlock(&sigchld_defer_mutex);
> +    assert(!r);
> +
>      sigchld_sethandler_raw(sigchld_handler, 0);
>      if (sigchld_occurred_while_deferred) {
>          sigchld_occurred_while_deferred = 0;

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 12/12] libxl: fork: Share SIGCHLD handler amongst ctxs
  2014-01-21 14:53         ` Ian Campbell
@ 2014-01-21 15:09           ` Ian Jackson
  0 siblings, 0 replies; 58+ messages in thread
From: Ian Jackson @ 2014-01-21 15:09 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Jim Fehlig, xen-devel

Ian Campbell writes ("Re: [PATCH 12/12] libxl: fork: Share SIGCHLD handler amongst ctxs"):
> On Tue, 2014-01-21 at 14:40 +0000, Ian Jackson wrote:
...
> > * Use a mutex for defer_sigchld, to guard against concurrency between
> >   the thread calling defer_sigchld and an instance of the primary
> >   signal handler on another thread.
> > 
> > * libxl_sigchld_owner_libxl_always is incompatible with SIGCHLD sharing.
> >   Document this correctly.
> > 
> > Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
> 
> Acked-by: Ian Campbell <ian.campbell@citrix.com>
> 
> Including if you want to fold this into another patch.

Thanks, I intend to fold it into my series.

Ian.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [PATCH 13/12] libxl: events: Break out libxl__pipe_nonblock, _close
  2014-01-21 14:46     ` Ian Jackson
@ 2014-01-21 15:11       ` Ian Jackson
  2014-01-21 15:11         ` [PATCH 14/12] libxl: fork: Make SIGCHLD self-pipe nonblocking Ian Jackson
  2014-01-21 15:27         ` [PATCH 13/12] libxl: events: Break out libxl__pipe_nonblock, _close Ian Campbell
  0 siblings, 2 replies; 58+ messages in thread
From: Ian Jackson @ 2014-01-21 15:11 UTC (permalink / raw)
  To: xen-devel; +Cc: Jim Fehlig, Ian Jackson, Ian Campbell

Break out the pipe creation and destruction from the poller code
into two new functions libxl__pipe_nonblock and libxl__pipe_close.

No overall functional difference other than minor differences in exact
log messages.

Also move libxl__self_pipe_wakeup and libxl__self_pipe_eatall into the
new pipe utilities section in libxl_event.c; this is pure code motion.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Cc: Jim Fehlig <jfehlig@suse.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
---
 tools/libxl/libxl_event.c    |  104 ++++++++++++++++++++++++++----------------
 tools/libxl/libxl_internal.h |    9 ++++
 2 files changed, 73 insertions(+), 40 deletions(-)

diff --git a/tools/libxl/libxl_event.c b/tools/libxl/libxl_event.c
index bdef7ac..35a8f81 100644
--- a/tools/libxl/libxl_event.c
+++ b/tools/libxl/libxl_event.c
@@ -1271,26 +1271,81 @@ int libxl_event_check(libxl_ctx *ctx, libxl_event **event_r,
 }
 
 /*
- * Manipulation of pollers
+ * Utilities for pipes (specifically, useful for self-pipes)
  */
 
-int libxl__poller_init(libxl_ctx *ctx, libxl__poller *p)
+void libxl__pipe_close(int fds[2])
+{
+    if (fds[0] >= 0) close(fds[0]);
+    if (fds[1] >= 0) close(fds[1]);
+    fds[0] = fds[1] = -1;
+}
+
+int libxl__pipe_nonblock(libxl_ctx *ctx, int fds[2])
 {
     int r, rc;
-    p->fd_polls = 0;
-    p->fd_rindices = 0;
 
-    r = pipe(p->wakeup_pipe);
+    r = libxl_pipe(ctx, fds);
     if (r) {
-        LIBXL__LOG_ERRNO(ctx, LIBXL__LOG_ERROR, "cannot create poller pipe");
+        fds[0] = fds[1] = -1;
         rc = ERROR_FAIL;
         goto out;
     }
 
-    rc = libxl_fd_set_nonblock(ctx, p->wakeup_pipe[0], 1);
+    rc = libxl_fd_set_nonblock(ctx, fds[0], 1);
     if (rc) goto out;
 
-    rc = libxl_fd_set_nonblock(ctx, p->wakeup_pipe[1], 1);
+    rc = libxl_fd_set_nonblock(ctx, fds[1], 1);
+    if (rc) goto out;
+
+    return 0;
+
+ out:
+    libxl__pipe_close(fds);
+    return rc;
+}
+
+int libxl__self_pipe_wakeup(int fd)
+{
+    static const char buf[1] = "";
+
+    for (;;) {
+        int r = write(fd, buf, 1);
+        if (r==1) return 0;
+        assert(r==-1);
+        if (errno == EINTR) continue;
+        if (errno == EWOULDBLOCK) return 0;
+        assert(errno);
+        return errno;
+    }
+}
+
+int libxl__self_pipe_eatall(int fd)
+{
+    char buf[256];
+    for (;;) {
+        int r = read(fd, buf, sizeof(buf));
+        if (r == sizeof(buf)) continue;
+        if (r >= 0) return 0;
+        assert(r == -1);
+        if (errno == EINTR) continue;
+        if (errno == EWOULDBLOCK) return 0;
+        assert(errno);
+        return errno;
+    }
+}
+
+/*
+ * Manipulation of pollers
+ */
+
+int libxl__poller_init(libxl_ctx *ctx, libxl__poller *p)
+{
+    int rc;
+    p->fd_polls = 0;
+    p->fd_rindices = 0;
+
+    rc = libxl__pipe_nonblock(ctx, p->wakeup_pipe);
     if (rc) goto out;
 
     return 0;
@@ -1302,8 +1357,7 @@ int libxl__poller_init(libxl_ctx *ctx, libxl__poller *p)
 
 void libxl__poller_dispose(libxl__poller *p)
 {
-    if (p->wakeup_pipe[1] > 0) close(p->wakeup_pipe[1]);
-    if (p->wakeup_pipe[0] > 0) close(p->wakeup_pipe[0]);
+    libxl__pipe_close(p->wakeup_pipe);
     free(p->fd_polls);
     free(p->fd_rindices);
 }
@@ -1347,36 +1401,6 @@ void libxl__poller_wakeup(libxl__egc *egc, libxl__poller *p)
     if (e) LIBXL__EVENT_DISASTER(egc, "cannot poke watch pipe", e, 0);
 }
 
-int libxl__self_pipe_wakeup(int fd)
-{
-    static const char buf[1] = "";
-
-    for (;;) {
-        int r = write(fd, buf, 1);
-        if (r==1) return 0;
-        assert(r==-1);
-        if (errno == EINTR) continue;
-        if (errno == EWOULDBLOCK) return 0;
-        assert(errno);
-        return errno;
-    }
-}
-
-int libxl__self_pipe_eatall(int fd)
-{
-    char buf[256];
-    for (;;) {
-        int r = read(fd, buf, sizeof(buf));
-        if (r == sizeof(buf)) continue;
-        if (r >= 0) return 0;
-        assert(r == -1);
-        if (errno == EINTR) continue;
-        if (errno == EWOULDBLOCK) return 0;
-        assert(errno);
-        return errno;
-    }
-}
-
 /*
  * Main event loop iteration
  */
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 8429448..9d17586 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -509,6 +509,15 @@ _hidden char *libxl__strndup(libxl__gc *gc_opt, const char *c, size_t n) NN1;
  * string. (similar to a gc'd dirname(3)). */
 _hidden char *libxl__dirname(libxl__gc *gc_opt, const char *s) NN1;
 
+/* Make a pipe and set both ends nonblocking.  On error, nothing
+ * is left open and both fds[]==-1, and a message is logged.
+ * Useful for self-pipes. */
+_hidden int libxl__pipe_nonblock(libxl_ctx *ctx, int fds[2]);
+/* Closes the pipe fd(s).  Either or both of fds[] may be -1 meaning
+ * `not open'.  Ignores any errors.  Sets fds[] to -1. */
+_hidden void libxl__pipe_close(int fds[2]);
+
+
 /* Each of these logs errors and returns a libxl error code.
  * They do not mind if path is already removed.
  * For _file, path must not be a directory; for _directory it must be. */
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [PATCH 14/12] libxl: fork: Make SIGCHLD self-pipe nonblocking
  2014-01-21 15:11       ` [PATCH 13/12] libxl: events: Break out libxl__pipe_nonblock, _close Ian Jackson
@ 2014-01-21 15:11         ` Ian Jackson
  2014-01-21 15:32           ` Ian Campbell
  2014-01-21 15:27         ` [PATCH 13/12] libxl: events: Break out libxl__pipe_nonblock, _close Ian Campbell
  1 sibling, 1 reply; 58+ messages in thread
From: Ian Jackson @ 2014-01-21 15:11 UTC (permalink / raw)
  To: xen-devel; +Cc: Jim Fehlig, Ian Jackson, Ian Campbell

Use the new libxl__pipe_nonblock and _close functions, rather than
open coding the same logic.  Now the pipe is nonblocking, which avoids
a race which could result in libxl deadlocking in a multithreaded
program.

Reported-by: Jim Fehlig <jfehlig@suse.com>
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Cc: Jim Fehlig <jfehlig@suse.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
---
 tools/libxl/libxl.c      |    6 +-----
 tools/libxl/libxl_fork.c |   12 +++---------
 2 files changed, 4 insertions(+), 14 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 4679b51..3730074 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -171,11 +171,7 @@ int libxl_ctx_free(libxl_ctx *ctx)
      * them; we wish the application good luck with understanding
      * this if and when it reaps them. */
     libxl__sigchld_notneeded(gc);
-
-    if (ctx->sigchld_selfpipe[0] >= 0) {
-        close(ctx->sigchld_selfpipe[0]);
-        close(ctx->sigchld_selfpipe[1]);
-    }
+    libxl__pipe_close(ctx->sigchld_selfpipe);
 
     pthread_mutex_destroy(&ctx->lock);
 
diff --git a/tools/libxl/libxl_fork.c b/tools/libxl/libxl_fork.c
index 2432512..1d0017b 100644
--- a/tools/libxl/libxl_fork.c
+++ b/tools/libxl/libxl_fork.c
@@ -343,17 +343,11 @@ void libxl__sigchld_notneeded(libxl__gc *gc) /* non-reentrant, idempotent */
 
 int libxl__sigchld_needed(libxl__gc *gc) /* non-reentrant, idempotent */
 {
-    int r, rc;
+    int rc;
 
     if (CTX->sigchld_selfpipe[0] < 0) {
-        r = pipe(CTX->sigchld_selfpipe);
-        if (r) {
-            CTX->sigchld_selfpipe[0] = -1;
-            LIBXL__LOG_ERRNO(CTX, LIBXL__LOG_ERROR,
-                             "failed to create sigchld pipe");
-            rc = ERROR_FAIL;
-            goto out;
-        }
+        rc = libxl__pipe_nonblock(CTX, CTX->sigchld_selfpipe);
+        if (rc) goto out;
     }
     if (!libxl__ev_fd_isregistered(&CTX->sigchld_selfpipe_efd)) {
         rc = libxl__ev_fd_register(gc, &CTX->sigchld_selfpipe_efd,
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 58+ messages in thread

* Re: [PATCH 13/12] libxl: events: Break out libxl__pipe_nonblock, _close
  2014-01-21 15:11       ` [PATCH 13/12] libxl: events: Break out libxl__pipe_nonblock, _close Ian Jackson
  2014-01-21 15:11         ` [PATCH 14/12] libxl: fork: Make SIGCHLD self-pipe nonblocking Ian Jackson
@ 2014-01-21 15:27         ` Ian Campbell
  2014-01-21 15:31           ` Ian Jackson
  1 sibling, 1 reply; 58+ messages in thread
From: Ian Campbell @ 2014-01-21 15:27 UTC (permalink / raw)
  To: Ian Jackson; +Cc: Jim Fehlig, xen-devel

On Tue, 2014-01-21 at 15:11 +0000, Ian Jackson wrote:
> Break out the pipe creation and destruction from the poller code
> into two new functions libxl__pipe_nonblock and libxl__pipe_close.
> 
> No overall functional difference other than minor differences in exact
> log messages.
> 
> Also move libxl__self_pipe_wakeup and libxl__self_pipe_eatall into the
> new pipe utilities section in libxl_event.c; this is pure code motion.

You also switched pipe() -> libxl_pipe(), which is fine but not
mentioned here.

> Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
> Cc: Jim Fehlig <jfehlig@suse.com>

Acked-by: Ian Campbell <ian.campbell@citrix.com>

> ---
>  tools/libxl/libxl_event.c    |  104 ++++++++++++++++++++++++++----------------
>  tools/libxl/libxl_internal.h |    9 ++++
>  2 files changed, 73 insertions(+), 40 deletions(-)
> 
> diff --git a/tools/libxl/libxl_event.c b/tools/libxl/libxl_event.c
> index bdef7ac..35a8f81 100644
> --- a/tools/libxl/libxl_event.c
> +++ b/tools/libxl/libxl_event.c
> @@ -1271,26 +1271,81 @@ int libxl_event_check(libxl_ctx *ctx, libxl_event **event_r,
>  }
>  
>  /*
> - * Manipulation of pollers
> + * Utilities for pipes (specifically, useful for self-pipes)
>   */
>  
> -int libxl__poller_init(libxl_ctx *ctx, libxl__poller *p)
> +void libxl__pipe_close(int fds[2])
> +{
> +    if (fds[0] >= 0) close(fds[0]);
> +    if (fds[1] >= 0) close(fds[1]);
> +    fds[0] = fds[1] = -1;
> +}
> +
> +int libxl__pipe_nonblock(libxl_ctx *ctx, int fds[2])
>  {
>      int r, rc;
> -    p->fd_polls = 0;
> -    p->fd_rindices = 0;
>  
> -    r = pipe(p->wakeup_pipe);
> +    r = libxl_pipe(ctx, fds);
>      if (r) {
> -        LIBXL__LOG_ERRNO(ctx, LIBXL__LOG_ERROR, "cannot create poller pipe");
> +        fds[0] = fds[1] = -1;
>          rc = ERROR_FAIL;
>          goto out;
>      }
>  
> -    rc = libxl_fd_set_nonblock(ctx, p->wakeup_pipe[0], 1);
> +    rc = libxl_fd_set_nonblock(ctx, fds[0], 1);
>      if (rc) goto out;
>  
> -    rc = libxl_fd_set_nonblock(ctx, p->wakeup_pipe[1], 1);
> +    rc = libxl_fd_set_nonblock(ctx, fds[1], 1);
> +    if (rc) goto out;
> +
> +    return 0;
> +
> + out:
> +    libxl__pipe_close(fds);
> +    return rc;
> +}
> +
> +int libxl__self_pipe_wakeup(int fd)
> +{
> +    static const char buf[1] = "";
> +
> +    for (;;) {
> +        int r = write(fd, buf, 1);
> +        if (r==1) return 0;
> +        assert(r==-1);
> +        if (errno == EINTR) continue;
> +        if (errno == EWOULDBLOCK) return 0;
> +        assert(errno);
> +        return errno;
> +    }
> +}
> +
> +int libxl__self_pipe_eatall(int fd)
> +{
> +    char buf[256];
> +    for (;;) {
> +        int r = read(fd, buf, sizeof(buf));
> +        if (r == sizeof(buf)) continue;
> +        if (r >= 0) return 0;
> +        assert(r == -1);
> +        if (errno == EINTR) continue;
> +        if (errno == EWOULDBLOCK) return 0;
> +        assert(errno);
> +        return errno;
> +    }
> +}
> +
> +/*
> + * Manipulation of pollers
> + */
> +
> +int libxl__poller_init(libxl_ctx *ctx, libxl__poller *p)
> +{
> +    int rc;
> +    p->fd_polls = 0;
> +    p->fd_rindices = 0;
> +
> +    rc = libxl__pipe_nonblock(ctx, p->wakeup_pipe);
>      if (rc) goto out;
>  
>      return 0;
> @@ -1302,8 +1357,7 @@ int libxl__poller_init(libxl_ctx *ctx, libxl__poller *p)
>  
>  void libxl__poller_dispose(libxl__poller *p)
>  {
> -    if (p->wakeup_pipe[1] > 0) close(p->wakeup_pipe[1]);
> -    if (p->wakeup_pipe[0] > 0) close(p->wakeup_pipe[0]);
> +    libxl__pipe_close(p->wakeup_pipe);
>      free(p->fd_polls);
>      free(p->fd_rindices);
>  }
> @@ -1347,36 +1401,6 @@ void libxl__poller_wakeup(libxl__egc *egc, libxl__poller *p)
>      if (e) LIBXL__EVENT_DISASTER(egc, "cannot poke watch pipe", e, 0);
>  }
>  
> -int libxl__self_pipe_wakeup(int fd)
> -{
> -    static const char buf[1] = "";
> -
> -    for (;;) {
> -        int r = write(fd, buf, 1);
> -        if (r==1) return 0;
> -        assert(r==-1);
> -        if (errno == EINTR) continue;
> -        if (errno == EWOULDBLOCK) return 0;
> -        assert(errno);
> -        return errno;
> -    }
> -}
> -
> -int libxl__self_pipe_eatall(int fd)
> -{
> -    char buf[256];
> -    for (;;) {
> -        int r = read(fd, buf, sizeof(buf));
> -        if (r == sizeof(buf)) continue;
> -        if (r >= 0) return 0;
> -        assert(r == -1);
> -        if (errno == EINTR) continue;
> -        if (errno == EWOULDBLOCK) return 0;
> -        assert(errno);
> -        return errno;
> -    }
> -}
> -
>  /*
>   * Main event loop iteration
>   */
> diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
> index 8429448..9d17586 100644
> --- a/tools/libxl/libxl_internal.h
> +++ b/tools/libxl/libxl_internal.h
> @@ -509,6 +509,15 @@ _hidden char *libxl__strndup(libxl__gc *gc_opt, const char *c, size_t n) NN1;
>   * string. (similar to a gc'd dirname(3)). */
>  _hidden char *libxl__dirname(libxl__gc *gc_opt, const char *s) NN1;
>  
> +/* Make a pipe and set both ends nonblocking.  On error, nothing
> + * is left open and both fds[]==-1, and a message is logged.
> + * Useful for self-pipes. */
> +_hidden int libxl__pipe_nonblock(libxl_ctx *ctx, int fds[2]);
> +/* Closes the pipe fd(s).  Either or both of fds[] may be -1 meaning
> + * `not open'.  Ignores any errors.  Sets fds[] to -1. */
> +_hidden void libxl__pipe_close(int fds[2]);
> +
> +
>  /* Each of these logs errors and returns a libxl error code.
>   * They do not mind if path is already removed.
>   * For _file, path must not be a directory; for _directory it must be. */

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 00/12] libxl: fork: SIGCHLD flexibility
  2014-01-20 18:14   ` Jim Fehlig
  2014-01-21 14:46     ` Ian Jackson
@ 2014-01-21 15:28     ` Ian Jackson
  2014-01-22  5:32       ` Jim Fehlig
  1 sibling, 1 reply; 58+ messages in thread
From: Ian Jackson @ 2014-01-21 15:28 UTC (permalink / raw)
  To: Jim Fehlig; +Cc: xen-devel, Ian Campbell

Jim Fehlig writes ("Re: [Xen-devel] [PATCH 00/12] libxl: fork: SIGCHLD flexibility"):
> I let this run over the weekend and today noticed libvirtd was deadlocked

I have just retested xl with:
  * my 3-patch 4.4 fixes series
  * v2 of my fork series
  * the extra mutex patch "libxl: fork: Fixup SIGCHLD sharing"
  * "13/12" and "14/12" just posted
and it WFM.

Of course I don't have the same setup as Jim.

Jim: if it's not too much trouble, I'd appreciate it if you could try
that combination.

For your convenience you can find a git branch of it at
  http://xenbits.xen.org/gitweb/?p=people/iwj/xen.git;a=shortlog;h=refs/tags/wip.enumerate-pids-v2.1
aka
  git://xenbits.xen.org/people/iwj/xen.git#wip.enumerate-pids-v2.1

A useful stress test would be to occasionally run something like
   while true; do kill -CHLD <pid of libvirt process>; done
while it's busy.

Ian.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 13/12] libxl: events: Break out libxl__pipe_nonblock, _close
  2014-01-21 15:27         ` [PATCH 13/12] libxl: events: Break out libxl__pipe_nonblock, _close Ian Campbell
@ 2014-01-21 15:31           ` Ian Jackson
  0 siblings, 0 replies; 58+ messages in thread
From: Ian Jackson @ 2014-01-21 15:31 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Jim Fehlig, xen-devel

Ian Campbell writes ("Re: [PATCH 13/12] libxl: events: Break out libxl__pipe_nonblock, _close"):
> On Tue, 2014-01-21 at 15:11 +0000, Ian Jackson wrote:
> > Also move libxl__self_pipe_wakeup and libxl__self_pipe_eatall into the
> > new pipe utilities section in libxl_event.c; this is pure code motion.
> 
> You also switched pipe() -> libxl_pipe(), which is fine but not
> mentioned here.

Good point.

> Acked-by: Ian Campbell <ian.campbell@citrix.com>

Thanks,
Ian.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 14/12] libxl: fork: Make SIGCHLD self-pipe nonblocking
  2014-01-21 15:11         ` [PATCH 14/12] libxl: fork: Make SIGCHLD self-pipe nonblocking Ian Jackson
@ 2014-01-21 15:32           ` Ian Campbell
  2014-01-21 15:48             ` Ian Jackson
  0 siblings, 1 reply; 58+ messages in thread
From: Ian Campbell @ 2014-01-21 15:32 UTC (permalink / raw)
  To: Ian Jackson; +Cc: Jim Fehlig, xen-devel

On Tue, 2014-01-21 at 15:11 +0000, Ian Jackson wrote:
> Use the new libxl__pipe_nonblock and _close functions, rather than
> open coding the same logic.  Now the pipe is nonblocking, which avoids
> a race which could result in libxl deadlocking in a multithreaded
> program.

Is any additional error handling required at any of the points where the
pipe is used, EWOULDBLOCK etc.

I think in practice that is all in libxl__self_pipe_wakeup and
libxl__self_pipe_eatall, the callers of which treat errors as disaster
(with assert and LIBXL__EVENT_DISASTER respectively). But both wakeup
and eatall handle EWOULDBLOCK sensibly.

Assuming my analysis of what needs to be handled and where it needs to
be done matches yours:
Acked-by: Ian Campbell <ian.campbell@citrix.com>

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 14/12] libxl: fork: Make SIGCHLD self-pipe nonblocking
  2014-01-21 15:32           ` Ian Campbell
@ 2014-01-21 15:48             ` Ian Jackson
  0 siblings, 0 replies; 58+ messages in thread
From: Ian Jackson @ 2014-01-21 15:48 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Jim Fehlig, xen-devel

Ian Campbell writes ("Re: [PATCH 14/12] libxl: fork: Make SIGCHLD self-pipe nonblocking"):
> On Tue, 2014-01-21 at 15:11 +0000, Ian Jackson wrote:
> > Use the new libxl__pipe_nonblock and _close functions, rather than
> > open coding the same logic.  Now the pipe is nonblocking, which avoids
> > a race which could result in libxl deadlocking in a multithreaded
> > program.
> 
> Is any additional error handling required at any of the points where the
> pipe is used, EWOULDBLOCK etc.

No, they are just the libxl__self_pipe* functions as you have noticed,
which are designed to work (only!) with nonblocking fds.

> Acked-by: Ian Campbell <ian.campbell@citrix.com>

Thanks,
Ian.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 00/12] libxl: fork: SIGCHLD flexibility
  2014-01-21 15:28     ` [PATCH 00/12] libxl: fork: SIGCHLD flexibility Ian Jackson
@ 2014-01-22  5:32       ` Jim Fehlig
  2014-01-23  4:05         ` Jim Fehlig
  0 siblings, 1 reply; 58+ messages in thread
From: Jim Fehlig @ 2014-01-22  5:32 UTC (permalink / raw)
  To: Ian Jackson; +Cc: xen-devel, Ian Campbell

Ian Jackson wrote:
> Jim Fehlig writes ("Re: [Xen-devel] [PATCH 00/12] libxl: fork: SIGCHLD flexibility"):
>   
>> I let this run over the weekend and today noticed libvirtd was deadlocked
>>     
>
> I have just retested xl with:
>   * my 3-patch 4.4 fixes series
>   * v2 of my fork series
>   * the extra mutex patch "libxl: fork: Fixup SIGCHLD sharing"
>   * "13/12" and "14/12" just posted
> and it WFM.
>
> Of course I don't have the same setup as Jim.
>
> Jim: if it's not too much trouble, I'd appreciate it if you could try
> that combination.
>
> For your convenience you can find a git branch of it at
>   http://xenbits.xen.org/gitweb/?p=people/iwj/xen.git;a=shortlog;h=refs/tags/wip.enumerate-pids-v2.1
> aka
>   git://xenbits.xen.org/people/iwj/xen.git#wip.enumerate-pids-v2.1
>   

I've been testing this branch and notice an occasional libvirtd segfault
that always occurs when calling libxl_domain_create_restore().  By
occasional, I mean my save/restore script might cause the segfault after
2 iterations, or 20 iterations, or ...  But the segfault always occurs
in libxl_domain_create_restore()

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffeef59700 (LWP 12083)]
0x00007ffff74577ef in virObjectIsClass (anyobj=0x2f302f6e69616d6f,
klass=0x5555558a1310)
    at util/virobject.c:362
362         return virClassIsDerivedFrom(obj->klass, klass);
(gdb) bt
#0  0x00007ffff74577ef in virObjectIsClass (anyobj=0x2f302f6e69616d6f,
klass=0x5555558a1310)
    at util/virobject.c:362
#1  0x00007ffff745765b in virObjectLock (anyobj=0x2f302f6e69616d6f) at
util/virobject.c:314
#2  0x00007fffe993cc96 in libxlDomainObjTimeoutModifyEventHook
(priv=0x5555558fc310,
    hndp=0x5555559e5d88, abs_t=...) at libxl/libxl_domain.c:302
#3  0x00007fffe96f8fed in time_deregister (gc=0x7fffeef58220,
ev=0x5555559eee48)
    at libxl_event.c:294
#4  0x00007fffe96facfd in afterpoll_internal (egc=0x7fffeef58220,
poller=0x5555559a4c70, nfds=3,
    fds=0x5555559c09d0, now=...) at libxl_event.c:1008
#5  0x00007fffe96fc312 in eventloop_iteration (egc=0x7fffeef58220,
poller=0x5555559a4c70)
    at libxl_event.c:1455
#6  0x00007fffe96fce58 in libxl__ao_inprogress (ao=0x5555559e9690,
    file=0x7fffe970fadb "libxl_create.c", line=1356,
    func=0x7fffe97105f0 <__func__.16344> "do_domain_create") at
libxl_event.c:1700
#7  0x00007fffe96d711f in do_domain_create (ctx=0x5555559d9fa0,
d_config=0x7fffeef58490,
    domid=0x7fffeef5840c, restore_fd=89, checkpointed_stream=0,
ao_how=0x0, aop_console_how=0x0)
    at libxl_create.c:1356
#8  0x00007fffe96d7238 in libxl_domain_create_restore
(ctx=0x5555559d9fa0, d_config=0x7fffeef58490,
    domid=0x7fffeef5840c, restore_fd=89, params=0x7fffeef58400,
ao_how=0x0, aop_console_how=0x0)
    at libxl_create.c:1387
#...
(gdb) f 2
#2  0x00007fffe993cc96 in libxlDomainObjTimeoutModifyEventHook
(priv=0x5555558fc310,
    hndp=0x5555559e5d88, abs_t=...) at libxl/libxl_domain.c:302
302         virObjectLock(info->priv);
(gdb) p info->priv
$3 = (libxlDomainObjPrivatePtr) 0x2f302f6e69616d6f
(gdb) f 9
#9  0x00007fffe993f2c7 in libxlVmStart (driver=0x5555558c2e50,
vm=0x5555558e6a50,
    start_paused=false, restore_fd=89) at libxl/libxl_driver.c:635
635             res = libxl_domain_create_restore(priv->ctx, &d_config,
&domid,
(gdb) p priv
$2 = (libxlDomainObjPrivatePtr) 0x5555558fc310

It looks like the libxlDomainObjPrivatePtr, stashed as part of
for_app_registration_out when registering the timeout, has been
trampled.  Not sure if the problem is in libvirt or libxl, but it is
late here and I'm calling it a night :).

Regards,
Jim

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 00/12] libxl: fork: SIGCHLD flexibility
  2014-01-22  5:32       ` Jim Fehlig
@ 2014-01-23  4:05         ` Jim Fehlig
  2014-01-23 10:56           ` Ian Jackson
  0 siblings, 1 reply; 58+ messages in thread
From: Jim Fehlig @ 2014-01-23  4:05 UTC (permalink / raw)
  To: Ian Jackson; +Cc: xen-devel, Ian Campbell

Jim Fehlig wrote:
> Ian Jackson wrote:
>   
>> Jim Fehlig writes ("Re: [Xen-devel] [PATCH 00/12] libxl: fork: SIGCHLD flexibility"):
>>   
>>     
>>> I let this run over the weekend and today noticed libvirtd was deadlocked
>>>     
>>>       
>> I have just retested xl with:
>>   * my 3-patch 4.4 fixes series
>>   * v2 of my fork series
>>   * the extra mutex patch "libxl: fork: Fixup SIGCHLD sharing"
>>   * "13/12" and "14/12" just posted
>> and it WFM.
>>
>> Of course I don't have the same setup as Jim.
>>
>> Jim: if it's not too much trouble, I'd appreciate it if you could try
>> that combination.
>>
>> For your convenience you can find a git branch of it at
>>   http://xenbits.xen.org/gitweb/?p=people/iwj/xen.git;a=shortlog;h=refs/tags/wip.enumerate-pids-v2.1
>> aka
>>   git://xenbits.xen.org/people/iwj/xen.git#wip.enumerate-pids-v2.1
>>   
>>     
>
> I've been testing this branch and notice an occasional libvirtd segfault
> that always occurs when calling libxl_domain_create_restore().  By
> occasional, I mean my save/restore script might cause the segfault after
> 2 iterations, or 20 iterations, or ...  But the segfault always occurs
> in libxl_domain_create_restore()
>
> Program received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 0x7fffeef59700 (LWP 12083)]
> 0x00007ffff74577ef in virObjectIsClass (anyobj=0x2f302f6e69616d6f,
> klass=0x5555558a1310)
>     at util/virobject.c:362
> 362         return virClassIsDerivedFrom(obj->klass, klass);
> (gdb) bt
> #0  0x00007ffff74577ef in virObjectIsClass (anyobj=0x2f302f6e69616d6f,
> klass=0x5555558a1310)
>     at util/virobject.c:362
> #1  0x00007ffff745765b in virObjectLock (anyobj=0x2f302f6e69616d6f) at
> util/virobject.c:314
> #2  0x00007fffe993cc96 in libxlDomainObjTimeoutModifyEventHook
> (priv=0x5555558fc310,
>     hndp=0x5555559e5d88, abs_t=...) at libxl/libxl_domain.c:302
> #3  0x00007fffe96f8fed in time_deregister (gc=0x7fffeef58220,
> ev=0x5555559eee48)
>     at libxl_event.c:294
> #4  0x00007fffe96facfd in afterpoll_internal (egc=0x7fffeef58220,
> poller=0x5555559a4c70, nfds=3,
>     fds=0x5555559c09d0, now=...) at libxl_event.c:1008
> #5  0x00007fffe96fc312 in eventloop_iteration (egc=0x7fffeef58220,
> poller=0x5555559a4c70)
>     at libxl_event.c:1455
> #6  0x00007fffe96fce58 in libxl__ao_inprogress (ao=0x5555559e9690,
>     file=0x7fffe970fadb "libxl_create.c", line=1356,
>     func=0x7fffe97105f0 <__func__.16344> "do_domain_create") at
> libxl_event.c:1700
> #7  0x00007fffe96d711f in do_domain_create (ctx=0x5555559d9fa0,
> d_config=0x7fffeef58490,
>     domid=0x7fffeef5840c, restore_fd=89, checkpointed_stream=0,
> ao_how=0x0, aop_console_how=0x0)
>     at libxl_create.c:1356
> #8  0x00007fffe96d7238 in libxl_domain_create_restore
> (ctx=0x5555559d9fa0, d_config=0x7fffeef58490,
>     domid=0x7fffeef5840c, restore_fd=89, params=0x7fffeef58400,
> ao_how=0x0, aop_console_how=0x0)
>     at libxl_create.c:1387
> #...
> (gdb) f 2
> #2  0x00007fffe993cc96 in libxlDomainObjTimeoutModifyEventHook
> (priv=0x5555558fc310,
>     hndp=0x5555559e5d88, abs_t=...) at libxl/libxl_domain.c:302
> 302         virObjectLock(info->priv);
> (gdb) p info->priv
> $3 = (libxlDomainObjPrivatePtr) 0x2f302f6e69616d6f
> (gdb) f 9
> #9  0x00007fffe993f2c7 in libxlVmStart (driver=0x5555558c2e50,
> vm=0x5555558e6a50,
>     start_paused=false, restore_fd=89) at libxl/libxl_driver.c:635
> 635             res = libxl_domain_create_restore(priv->ctx, &d_config,
> &domid,
> (gdb) p priv
> $2 = (libxlDomainObjPrivatePtr) 0x5555558fc310
>
> It looks like the libxlDomainObjPrivatePtr, stashed as part of
> for_app_registration_out when registering the timeout, has been
> trampled.  Not sure if the problem is in libvirt or libxl, but it is
> late here and I'm calling it a night :).
>   

It appears the timeout_modify callback is invoked on a previously
deregistered timeout.  I didn't notice the segfault when running
libvirtd under valgrind, but did see

==14653== Invalid read of size 8
==14653==    at 0x134ACD1C: libxlDomainObjTimeoutModifyEventHook
(libxl_domain.c:309)
==14653==    by 0x13730FEC: time_deregister (libxl_event.c:294)
==14653==    by 0x13732CFC: afterpoll_internal (libxl_event.c:1008)
==14653==    by 0x13734311: eventloop_iteration (libxl_event.c:1455)
==14653==    by 0x13734E57: libxl__ao_inprogress (libxl_event.c:1700)
==14653==    by 0x1370F11E: do_domain_create (libxl_create.c:1356)
==14653==    by 0x1370F237: libxl_domain_create_restore
(libxl_create.c:1387)
==14653==    by 0x134AF332: libxlVmStart (libxl_driver.c:635)
==14653==    by 0x134B382A: libxlDomainRestoreFlags (libxl_driver.c:2047)
==14653==    by 0x134B3975: libxlDomainRestore (libxl_driver.c:2070)
==14653==    by 0x53B5AC7: virDomainRestore (libvirt.c:2678)
==14653==    by 0x130ADC: remoteDispatchDomainRestore
(remote_dispatch.h:6657)
==14653==  Address 0x18000178 is 8 bytes inside a block of size 32 free'd
==14653==    at 0x4C28ADC: free (in
/usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==14653==    by 0x529B08F: virFree (viralloc.c:580)
==14653==    by 0x134AC578: libxlDomainObjEventHookInfoFree
(libxl_domain.c:110)
==14653==    by 0x52BE3DB: virEventPollCleanupTimeouts (vireventpoll.c:535)
==14653==    by 0x52BEA4C: virEventPollRunOnce (vireventpoll.c:651)
==14653==    by 0x52BC960: virEventRunDefaultImpl (virevent.c:306)

which is consistent with the gdb findings.  I've audited the timeout
handling code in libvirt and didn't notice any problems.  I'll have some
time tomorrow to continue poking.

Regards,
Jim

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 00/12] libxl: fork: SIGCHLD flexibility
  2014-01-23  4:05         ` Jim Fehlig
@ 2014-01-23 10:56           ` Ian Jackson
  2014-01-23 21:36             ` Jim Fehlig
  2014-01-24  4:27             ` Jim Fehlig
  0 siblings, 2 replies; 58+ messages in thread
From: Ian Jackson @ 2014-01-23 10:56 UTC (permalink / raw)
  To: Jim Fehlig; +Cc: xen-devel, Ian Campbell

Jim Fehlig writes ("Re: [Xen-devel] [PATCH 00/12] libxl: fork: SIGCHLD flexibility"):
> It appears the timeout_modify callback is invoked on a previously
> deregistered timeout.  I didn't notice the segfault when running
> libvirtd under valgrind, but did see

Hmmm.  This could be a libxl problem.  I'll review the code again and
maybe think about adding some assertions.

But I've slept on this and I had an idea about libvirt's rescheduling
timeouts.  Can you point me at the libvirt branch you're using (a git
tree would be ideal) and I'll take a look at that too ?

Thanks,
Ian.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 00/12] libxl: fork: SIGCHLD flexibility
  2014-01-23 10:56           ` Ian Jackson
@ 2014-01-23 21:36             ` Jim Fehlig
  2014-01-24  4:27             ` Jim Fehlig
  1 sibling, 0 replies; 58+ messages in thread
From: Jim Fehlig @ 2014-01-23 21:36 UTC (permalink / raw)
  To: Ian Jackson; +Cc: xen-devel, Ian Campbell

Ian Jackson wrote:
> Jim Fehlig writes ("Re: [Xen-devel] [PATCH 00/12] libxl: fork: SIGCHLD flexibility"):
>   
>> It appears the timeout_modify callback is invoked on a previously
>> deregistered timeout.  I didn't notice the segfault when running
>> libvirtd under valgrind, but did see
>>     
>
> Hmmm.  This could be a libxl problem.  I'll review the code again and
> maybe think about adding some assertions.
>
> But I've slept on this and I had an idea about libvirt's rescheduling
> timeouts.  Can you point me at the libvirt branch you're using (a git
> tree would be ideal) and I'll take a look at that too ?
>   

Sure

git@github.com:jfehlig/libvirt.git

I just pushed that from the test machine, which has the 3 commits I'm
testing plopped on the master branch

Regards,
Jim

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 00/12] libxl: fork: SIGCHLD flexibility
  2014-01-23 10:56           ` Ian Jackson
  2014-01-23 21:36             ` Jim Fehlig
@ 2014-01-24  4:27             ` Jim Fehlig
  2014-01-24 12:41               ` Ian Jackson
  1 sibling, 1 reply; 58+ messages in thread
From: Jim Fehlig @ 2014-01-24  4:27 UTC (permalink / raw)
  To: Ian Jackson; +Cc: xen-devel, Ian Campbell

Ian Jackson wrote:
> Jim Fehlig writes ("Re: [Xen-devel] [PATCH 00/12] libxl: fork: SIGCHLD flexibility"):
>   
>> It appears the timeout_modify callback is invoked on a previously
>> deregistered timeout.  I didn't notice the segfault when running
>> libvirtd under valgrind, but did see
>>     
>
> Hmmm.  This could be a libxl problem.  I'll review the code again and
> maybe think about adding some assertions.
>   

BTW, I only see the crash when the save/restore script is running.  I
stopped the other scripts and domains, running only save/restore on a
single domain, and see the crash rather quickly (within 10 iterations).

> But I've slept on this and I had an idea about libvirt's rescheduling
> timeouts. 

I'm not so thrilled with the timeout handling code in the libvirt libxl
driver.  The driver maintains a list of active timeouts because IIRC,
there were cases when the driver received timeout deregistrations when
calling libxl_ctx_free, at which point some of the associated structures
were freed.  The idea was to call libxl_osevent_occurred_timeout on any
active timeouts before freeing libxlDomainObjPrivate and its contents.

Regards,
Jim

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 00/12] libxl: fork: SIGCHLD flexibility
  2014-01-24  4:27             ` Jim Fehlig
@ 2014-01-24 12:41               ` Ian Jackson
  2014-01-24 12:52                 ` Ian Campbell
                                   ` (2 more replies)
  0 siblings, 3 replies; 58+ messages in thread
From: Ian Jackson @ 2014-01-24 12:41 UTC (permalink / raw)
  To: Jim Fehlig; +Cc: xen-devel, Ian Campbell

Jim Fehlig writes ("Re: [Xen-devel] [PATCH 00/12] libxl: fork: SIGCHLD flexibility"):
> BTW, I only see the crash when the save/restore script is running.  I
> stopped the other scripts and domains, running only save/restore on a
> single domain, and see the crash rather quickly (within 10 iterations).

I'll look at the libvirt code, but:

With a recurring timeout, how can you ever know it's cancelled ?
There might be threads out there, which don't hold any locks, which
are in the process of executing a callback for a timeout.  That might
be arbitrarily delayed from the pov of the rest of the program.

E.g.:

 Thread A                                             Thread B

   invoke some libxl operation
X    do some libxl stuff
X    register timeout (libxl)
XV     record timeout info
X    do some more libxl stuff
     ...
X    do some more libxl stuff
X    deregister timeout (libxl internal)
X     converted to request immediate timeout
XV     record new timeout info
X      release libvirt event loop lock
                                            entering libvirt event loop
                                       V     observe timeout is immediate
                                       V      need to do callback
                                               call libxl driver

      entering libvirt event loop
 V     observe timeout is immediate
 V      need to do callback
         call libxl driver
           call libxl
  X          libxl sees timeout is live
  X          libxl does libxl stuff
         libxl driver deregisters
 V         record lack of timeout
         free driver's timeout struct
                                               call libxl
                                      X          libxl sees timeout is dead
                                      X          libxl does nothing
                                             libxl driver deregisters
                                       V       CRASH due to deregistering
                                       V        already-deregistered timeout

If this is how things are, then I think there is no sane way to use
libvirt's timeouts (!)

In principle I guess the driver could keep its per-timeout structs
around forever and remember whether they've been deregistered or not.
Each one would have to have a lock in it.

But if you think about it, if you have 10 threads all running the
event loop and you set a timeout to zero, doesn't that mean that every
thread's event loop should do the timeout callback as fast as it can ?
That could be a lot of wasted effort.

The best solution would appear to be to provide a non-recurring
callback.

> I'm not so thrilled with the timeout handling code in the libvirt libxl
> driver.  The driver maintains a list of active timeouts because IIRC,
> there were cases when the driver received timeout deregistrations when
> calling libxl_ctx_free, at which point some of the associated structures
> were freed.  The idea was to call libxl_osevent_occurred_timeout on any
> active timeouts before freeing libxlDomainObjPrivate and its contents.

libxl does deregister fd callbacks in libxl_ctx_free.

But libxl doesn't currently "deregister" any timeouts in
libxl_ctx_free; indeed it would be a bit daft for it to do so as at
libxl_ctx_free there are no aos running so there would be nothing to
time out.

But there is a difficulty with timeouts which libxl has set to occur
immediately but which have not yet actually had the callback.  The the
application cannot call libxl_ctx_free with such timeouts outstanding,
because that would imply later calling back into libxl with a stale
ctx.

(Looking at the code I see that the "nexi" are never actually freed.
Bah.)

Thanks,
Ian.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 00/12] libxl: fork: SIGCHLD flexibility
  2014-01-24 12:41               ` Ian Jackson
@ 2014-01-24 12:52                 ` Ian Campbell
  2014-01-24 15:14                   ` Ian Jackson
  2014-01-27  5:39                   ` Jim Fehlig
  2014-01-27  5:22                 ` Jim Fehlig
  2014-01-28  1:39                 ` [libvirt] [Xen-devel] " Jim Fehlig
  2 siblings, 2 replies; 58+ messages in thread
From: Ian Campbell @ 2014-01-24 12:52 UTC (permalink / raw)
  To: Ian Jackson; +Cc: Jim Fehlig, xen-devel

On Fri, 2014-01-24 at 12:41 +0000, Ian Jackson wrote:

>  Thread A                                             Thread B
> 
>    invoke some libxl operation
> X    do some libxl stuff
> X    register timeout (libxl)
> XV     record timeout info
> X    do some more libxl stuff
>      ...
> X    do some more libxl stuff
> X    deregister timeout (libxl internal)
> X     converted to request immediate timeout
> XV     record new timeout info
> X      release libvirt event loop lock
>                                             entering libvirt event loop
>                                        V     observe timeout is immediate

Is there nothing in this interval which deregisters, pauses, quiesces or
otherwise prevents the timer from going off again until after the
callback (when the lock would be reacquired and whatever was done is
undone)?

(V is the libvirt event loop lock I presume?)

>                                        V      need to do callback
>                                                call libxl driver
> 
>       entering libvirt event loop
>  V     observe timeout is immediate

Given the behaviour I suggest above this would be prevented I think?

> But if you think about it, if you have 10 threads all running the
> event loop and you set a timeout to zero, doesn't that mean that every
> thread's event loop should do the timeout callback as fast as it can ?
> That could be a lot of wasted effort.

It doesn't seem all that likely triggering the same timeout multiple
times in different threads simultaneously would be a deliberate design
decision, so if the libvirt event core doesn't already prevent this
somehow then it seems to me that this is just a bug in the event loop
core.

In that case should be addressed in libvirt, and in any case the libvirt
core folks should be involved in the discussion, so they have the
opportunity to tell us how best, if at all, we can use the provided
facilities and/or whether these issues are deliberate of things which
should be fixed etc.

Ian.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 00/12] libxl: fork: SIGCHLD flexibility
  2014-01-24 12:52                 ` Ian Campbell
@ 2014-01-24 15:14                   ` Ian Jackson
  2014-01-24 15:18                     ` Ian Jackson
  2014-01-24 16:36                     ` Ian Jackson
  2014-01-27  5:39                   ` Jim Fehlig
  1 sibling, 2 replies; 58+ messages in thread
From: Ian Jackson @ 2014-01-24 15:14 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Jim Fehlig, xen-devel, Ian Jackson

Ian Campbell writes ("Re: [Xen-devel] [PATCH 00/12] libxl: fork: SIGCHLD flexibility"):
> On Fri, 2014-01-24 at 12:41 +0000, Ian Jackson wrote:
> >  Thread A                                             Thread B
> > X      release libvirt event loop lock
> >                                             entering libvirt event loop
> >                                        V     observe timeout is immediate
> 
> Is there nothing in this interval which deregisters, pauses, quiesces or
> otherwise prevents the timer from going off again until after the
> callback (when the lock would be reacquired and whatever was done is
> undone)?

No.  libvirt is quite happy to call the callback multiple times at
once.

I have just looked at the code in vireventpoll.c and there is nothing
that stops this being a problem.

> Given the behaviour I suggest above this would be prevented I think?

Yes, I think so.

> It doesn't seem all that likely triggering the same timeout multiple
> times in different threads simultaneously would be a deliberate design
> decision, so if the libvirt event core doesn't already prevent this
> somehow then it seems to me that this is just a bug in the event loop
> core.

Yes.

> In that case should be addressed in libvirt, and in any case the libvirt
> core folks should be involved in the discussion, so they have the
> opportunity to tell us how best, if at all, we can use the provided
> facilities and/or whether these issues are deliberate of things which
> should be fixed etc.

I'm looking at a way to fix this in libvirt.  Mail to follow ...

Ian.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 00/12] libxl: fork: SIGCHLD flexibility
  2014-01-24 15:14                   ` Ian Jackson
@ 2014-01-24 15:18                     ` Ian Jackson
  2014-01-24 16:36                     ` Ian Jackson
  1 sibling, 0 replies; 58+ messages in thread
From: Ian Jackson @ 2014-01-24 15:18 UTC (permalink / raw)
  To: Ian Campbell, Ian Jackson, Jim Fehlig, xen-devel

Ian Jackson writes ("Re: [Xen-devel] [PATCH 00/12] libxl: fork: SIGCHLD flexibility"):
> I have just looked at the code in vireventpoll.c and there is nothing
> that stops this being a problem.

While looking at this I found:

        /* Add 20ms fuzz so we don't pointlessly spin doing
         * <10ms sleeps, particularly on kernels with low HZ
         * it is fine that a timer expires 20ms earlier than
         * requested
         */
        if (eventLoop.timeouts[i].expiresAt <= (now+20)) {

This is Not Good.  Result is that maybe libvirt wakes up, calls the
timeout callback function for libxl, libxl compares the time with the
timeout requested and thinks "oh this must be some previous callback
with the same struct" (due to essential workaround for inherent API
race) and does nothing.
                                                            
Ian.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 00/12] libxl: fork: SIGCHLD flexibility
  2014-01-24 15:14                   ` Ian Jackson
  2014-01-24 15:18                     ` Ian Jackson
@ 2014-01-24 16:36                     ` Ian Jackson
  2014-01-24 16:57                       ` Ian Jackson
  1 sibling, 1 reply; 58+ messages in thread
From: Ian Jackson @ 2014-01-24 16:36 UTC (permalink / raw)
  To: Ian Campbell, Jim Fehlig, xen-devel

Ian Jackson writes ("Re: [Xen-devel] [PATCH 00/12] libxl: fork: SIGCHLD flexibility"):
> I have just looked at the code in vireventpoll.c and there is nothing
> that stops this being a problem.

I think it's a problem with fd callbacks too.

fd callbacks aren't always removed by the very same handler; they
might be removed by pretty much any arbitrary code (although it will
be in _some_ handler).

It isn't correct to defer removal of an fd callback because it is
important that by the time the removal callback returns, the fd is no
longer being polled by the event loop.  That includes being polled by
_any_ instance of the event loop perhaps in another thread.  Neither
libxl nor libvirt have any machinery to wake up other instances of the
event loop in other threads to change their fd sets, and to wait until
that's done.

libxl could (and perhaps should) grow such a thing in its own event
loop, but thinking about how to write it, it's going to be quite
tedious![1]

Changing the identity of the file objects referred to by fds being
polled on in another thread has unfortunate behaviours.  In
experiments done by a friend of mine we discovered that:
 * Linux keeps a secret handle onto the old open-file struct
 * FreeBSD honours the change immediately
 * Solaris returns EBADF even if the operation is dup2 to
   dup a new open file object onto the existing fd
(After discussion we agreed that a close reading of SuS permits only
the FreeBSD behaviour if the open file objects refer to plain files
which in practice means that the Solaris behaviour is buggy.  I think
the Linux behaviour is mad and the FreeBSD behaviour correct.)

And of course if the fd remains being polled on after the internal
deregistration callback has returned, libxl will feel free to close
the fd.  It might then be reused for any arbitrary object, which might
cause malfunctions if polled in the old way.  And in any case poll
might return EBADF which will probaby crash the event loop.

I'm trying to think of a way for libvirt to handle all of this without
duplicating the horrible synchronisation/wakeup thing I'm
contemplating for libxl.

Ian.

[1] Add a ctx-wide list of pollers, one for every libxl thread in
poll.  This list has to be covered by its own lock.

When fd deregistration occurs, we take this lock, wake up all the
pollers, release the lock, and then wait (perhaps with a condition
variable) for the pollers to acknowledge that they have left poll().

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 00/12] libxl: fork: SIGCHLD flexibility
  2014-01-24 16:36                     ` Ian Jackson
@ 2014-01-24 16:57                       ` Ian Jackson
  0 siblings, 0 replies; 58+ messages in thread
From: Ian Jackson @ 2014-01-24 16:57 UTC (permalink / raw)
  To: Ian Campbell, Jim Fehlig, xen-devel

Ian Jackson writes ("Re: [Xen-devel] [PATCH 00/12] libxl: fork: SIGCHLD flexibility"):
> libxl could (and perhaps should) grow such a thing in its own event
> loop, but thinking about how to write it, it's going to be quite
> tedious![1]
...
> [1] Add a ctx-wide list of pollers, one for every libxl thread in
> poll.  This list has to be covered by its own lock.
> 
> When fd deregistration occurs, we take this lock, wake up all the
> pollers, release the lock, and then wait (perhaps with a condition
> variable) for the pollers to acknowledge that they have left poll().

Something like this (warning, pseudocode).

Ian.

diff --git a/tools/libxl/libxl_event.c b/tools/libxl/libxl_event.c
index 93f8fdc..5e3fe87 100644
--- a/tools/libxl/libxl_event.c
+++ b/tools/libxl/libxl_event.c
@@ -232,6 +232,17 @@ void libxl__ev_fd_deregister(libxl__gc *gc, libxl__ev_fd *ev)
     LIBXL_LIST_REMOVE(ev, entry);
     ev->fd = -1;
 
+    with cohort lock{
+        push new cohort, mark it in use;
+
+        loop waiting for acks{
+            signal all pollers in our cohort and all older;
+            if there are none, hooray (exit the loop);
+            condvar wait;
+        }
+        cleanup;
+    }
+
  out:
     CTX_UNLOCK;
 }
@@ -1436,10 +1447,22 @@ static int eventloop_iteration(libxl__egc *egc, libxl__poller *poller) {
         poller->fd_polls_allocd = nfds;
     }
 
+    with cohort lock{
+        add poller to list in running cohort;
+    }
+
     CTX_UNLOCK;
     rc = poll(poller->fd_polls, nfds, timeout);
     CTX_LOCK;
 
+    with cohort lock{
+        remove poller from whatever cohort's list it was in #';
+        clean up old running cohorts:
+            while (oldest cohort is empty and unused)
+                delete it;
+        signal cv;
+    }
+
     if (rc < 0) {
         if (errno == EINTR)
             return 0; /* will go round again if caller requires */

^ permalink raw reply related	[flat|nested] 58+ messages in thread

* Re: [PATCH 00/12] libxl: fork: SIGCHLD flexibility
  2014-01-24 12:41               ` Ian Jackson
  2014-01-24 12:52                 ` Ian Campbell
@ 2014-01-27  5:22                 ` Jim Fehlig
  2014-01-27 14:48                   ` Ian Jackson
  2014-01-28  1:39                 ` [libvirt] [Xen-devel] " Jim Fehlig
  2 siblings, 1 reply; 58+ messages in thread
From: Jim Fehlig @ 2014-01-27  5:22 UTC (permalink / raw)
  To: Ian Jackson; +Cc: xen-devel, Ian Campbell

Ian Jackson wrote:
> Jim Fehlig writes ("Re: [Xen-devel] [PATCH 00/12] libxl: fork: SIGCHLD flexibility"):
>   
>> BTW, I only see the crash when the save/restore script is running.  I
>> stopped the other scripts and domains, running only save/restore on a
>> single domain, and see the crash rather quickly (within 10 iterations).
>>     
>
> I'll look at the libvirt code, but:
>
> With a recurring timeout, how can you ever know it's cancelled ?
> There might be threads out there, which don't hold any locks, which
> are in the process of executing a callback for a timeout.  That might
> be arbitrarily delayed from the pov of the rest of the program.
>
> E.g.:
>
>  Thread A                                             Thread B
>
>    invoke some libxl operation
> X    do some libxl stuff
> X    register timeout (libxl)
> XV     record timeout info
> X    do some more libxl stuff
>      ...
> X    do some more libxl stuff
> X    deregister timeout (libxl internal)
> X     converted to request immediate timeout
> XV     record new timeout info
> X      release libvirt event loop lock
>                                             entering libvirt event loop
>                                        V     observe timeout is immediate
>                                        V      need to do callback
>                                                call libxl driver
>
>       entering libvirt event loop
>  V     observe timeout is immediate
>  V      need to do callback
>          call libxl driver
>            call libxl
>   X          libxl sees timeout is live
>   X          libxl does libxl stuff
>          libxl driver deregisters
>  V         record lack of timeout
>          free driver's timeout struct
>                                                call libxl
>                                       X          libxl sees timeout is dead
>                                       X          libxl does nothing
>                                              libxl driver deregisters
>                                        V       CRASH due to deregistering
>                                        V        already-deregistered timeout
>
> If this is how things are, then I think there is no sane way to use
> libvirt's timeouts (!)
>   

Looking at libvirt's default event loop impl, and the current libxl
driver code, I think this is how things are :-/.  But maybe you have
just described a bug in the libxl driver.  In the timer callback,
libxlDomainObjPrivate is locked, the timeout is disabled in libvirt
event loop, libxlDomainObjPrivate is unlocked, and
libxl_osevent_occurred_timeout is called.  Could the issue be solved by
checking if the timeout is still valid in the callback, while holding a
lock on libxlDomainObjPrivate?  The first thread running the callback
could mark the timeout invalid before releasing the lock and calling
libxl_osevent_occurred_timeout.  After acquiring the
libxlDomainObjPrivate lock, subsequent threads running the callback
would see the timer is invalid and return.

Regards,
Jim

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 00/12] libxl: fork: SIGCHLD flexibility
  2014-01-24 12:52                 ` Ian Campbell
  2014-01-24 15:14                   ` Ian Jackson
@ 2014-01-27  5:39                   ` Jim Fehlig
  1 sibling, 0 replies; 58+ messages in thread
From: Jim Fehlig @ 2014-01-27  5:39 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel, Ian Jackson

Ian Campbell wrote:
> On Fri, 2014-01-24 at 12:41 +0000, Ian Jackson wrote:
>
>   
>>  Thread A                                             Thread B
>>
>>    invoke some libxl operation
>> X    do some libxl stuff
>> X    register timeout (libxl)
>> XV     record timeout info
>> X    do some more libxl stuff
>>      ...
>> X    do some more libxl stuff
>> X    deregister timeout (libxl internal)
>> X     converted to request immediate timeout
>> XV     record new timeout info
>> X      release libvirt event loop lock
>>                                             entering libvirt event loop
>>                                        V     observe timeout is immediate
>>     
>
> Is there nothing in this interval which deregisters, pauses, quiesces or
> otherwise prevents the timer from going off again until after the
> callback (when the lock would be reacquired and whatever was done is
> undone)?
>   

The libvirt libxl driver will disable the timeout in libvirt's event
loop before calling libxl_osevent_occurred_timeout.  But AFAICT, and as
Ian J. describes, another thread running the event loop could invoke the
timeout before it is disabled.  I think this could be handled in the
libxl driver as described in my response to his mail.

> (V is the libvirt event loop lock I presume?)
>
>   
>>                                        V      need to do callback
>>                                                call libxl driver
>>
>>       entering libvirt event loop
>>  V     observe timeout is immediate
>>     
>
> Given the behaviour I suggest above this would be prevented I think?
>
>   
>> But if you think about it, if you have 10 threads all running the
>> event loop and you set a timeout to zero, doesn't that mean that every
>> thread's event loop should do the timeout callback as fast as it can ?
>> That could be a lot of wasted effort.
>>     
>
> It doesn't seem all that likely triggering the same timeout multiple
> times in different threads simultaneously would be a deliberate design
> decision, so if the libvirt event core doesn't already prevent this
> somehow then it seems to me that this is just a bug in the event loop
> core.
>   

If I understand the code correctly, it does seem possible to trigger the
same timeout multiple times.

> In that case should be addressed in libvirt, and in any case the libvirt
> core folks should be involved in the discussion, so they have the
> opportunity to tell us how best, if at all, we can use the provided
> facilities and/or whether these issues are deliberate of things which
> should be fixed etc.
>   

Agreed.  I'll mention the issue and this discussion on the libvirt IRC
channel tomorrow.

Regards,
Jim

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 00/12] libxl: fork: SIGCHLD flexibility
  2014-01-27  5:22                 ` Jim Fehlig
@ 2014-01-27 14:48                   ` Ian Jackson
  0 siblings, 0 replies; 58+ messages in thread
From: Ian Jackson @ 2014-01-27 14:48 UTC (permalink / raw)
  To: Jim Fehlig; +Cc: xen-devel, Ian Campbell

Jim Fehlig writes ("Re: [Xen-devel] [PATCH 00/12] libxl: fork: SIGCHLD flexibility"):
> Looking at libvirt's default event loop impl, and the current libxl
> driver code, I think this is how things are :-/.  But maybe you have
> just described a bug in the libxl driver.

I think this is a bug in the libvirt timeout system.  At the very
least it should avoid entering the same timeout callback in multiple
threads.

Something like the attached (UNTESTED).

I'm currently working on a test case to demonstrate the fd race.

Ian.


Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>

diff --git a/src/util/vireventpoll.c b/src/util/vireventpoll.c
index 8a4c8bc..41b012d 100644
--- a/src/util/vireventpoll.c
+++ b/src/util/vireventpoll.c
@@ -66,6 +66,7 @@ struct virEventPollTimeout {
     virFreeCallback ff;
     void *opaque;
     int deleted;
+    bool in_callback;
 };
 
 /* Allocate extra slots for virEventPollHandle/virEventPollTimeout
@@ -238,6 +239,7 @@ int virEventPollAddTimeout(int frequency,
     eventLoop.timeouts[eventLoop.timeoutsCount].deleted = 0;
     eventLoop.timeouts[eventLoop.timeoutsCount].expiresAt =
         frequency >= 0 ? frequency + now : 0;
+    eventLoop.timeouts[eventLoop.timeoutsCount].in_callback = 0;
 
     eventLoop.timeoutsCount++;
     ret = nextTimer-1;
@@ -334,6 +336,8 @@ static int virEventPollCalculateTimeout(int *timeout) {
     for (i = 0; i < eventLoop.timeoutsCount; i++) {
         if (eventLoop.timeouts[i].deleted)
             continue;
+        if (eventLoop.timeouts[i].in_callback)
+            continue;
         if (eventLoop.timeouts[i].frequency < 0)
             continue;
 
@@ -429,7 +433,9 @@ static int virEventPollDispatchTimeouts(void)
         return -1;
 
     for (i = 0; i < ntimeouts; i++) {
-        if (eventLoop.timeouts[i].deleted || eventLoop.timeouts[i].frequency < 0)
+        if (eventLoop.timeouts[i].deleted)
+            continue;
+        if (eventLoop.timeouts[i].frequency < 0)
             continue;
 
         /* Add 20ms fuzz so we don't pointlessly spin doing
@@ -447,9 +453,11 @@ static int virEventPollDispatchTimeouts(void)
             PROBE(EVENT_POLL_DISPATCH_TIMEOUT,
                   "timer=%d",
                   timer);
+            eventLoop.timeouts[i].in_callback = 1;
             virMutexUnlock(&eventLoop.lock);
             (cb)(timer, opaque);
             virMutexLock(&eventLoop.lock);
+            eventLoop.timeouts[i].in_callback = 0;
         }
     }
     return 0;

^ permalink raw reply related	[flat|nested] 58+ messages in thread

* Re: [libvirt] [Xen-devel] [PATCH 00/12] libxl: fork: SIGCHLD flexibility
  2014-01-24 12:41               ` Ian Jackson
  2014-01-24 12:52                 ` Ian Campbell
  2014-01-27  5:22                 ` Jim Fehlig
@ 2014-01-28  1:39                 ` Jim Fehlig
  2014-01-28 10:06                   ` Daniel P. Berrange
                                     ` (2 more replies)
  2 siblings, 3 replies; 58+ messages in thread
From: Jim Fehlig @ 2014-01-28  1:39 UTC (permalink / raw)
  To: Ian Jackson; +Cc: LibVir, xen-devel, Ian Campbell

[Adding libvirt list...]

Ian Jackson wrote:
> Jim Fehlig writes ("Re: [Xen-devel] [PATCH 00/12] libxl: fork: SIGCHLD flexibility"):
>   
>> BTW, I only see the crash when the save/restore script is running.  I
>> stopped the other scripts and domains, running only save/restore on a
>> single domain, and see the crash rather quickly (within 10 iterations).
>>     
>
> I'll look at the libvirt code, but:
>
> With a recurring timeout, how can you ever know it's cancelled ?
> There might be threads out there, which don't hold any locks, which
> are in the process of executing a callback for a timeout.  That might
> be arbitrarily delayed from the pov of the rest of the program.
>
> E.g.:
>
>  Thread A                                             Thread B
>
>    invoke some libxl operation
> X    do some libxl stuff
> X    register timeout (libxl)
> XV     record timeout info
> X    do some more libxl stuff
>      ...
> X    do some more libxl stuff
> X    deregister timeout (libxl internal)
> X     converted to request immediate timeout
> XV     record new timeout info
> X      release libvirt event loop lock
>                                             entering libvirt event loop
>                                        V     observe timeout is immediate
>                                        V      need to do callback
>                                                call libxl driver
>
>       entering libvirt event loop
>  V     observe timeout is immediate
>  V      need to do callback
>          call libxl driver
>            call libxl
>   X          libxl sees timeout is live
>   X          libxl does libxl stuff
>          libxl driver deregisters
>  V         record lack of timeout
>          free driver's timeout struct
>                                                call libxl
>                                       X          libxl sees timeout is dead
>                                       X          libxl does nothing
>                                              libxl driver deregisters
>                                        V       CRASH due to deregistering
>                                        V        already-deregistered timeout
>
> If this is how things are, then I think there is no sane way to use
> libvirt's timeouts (!)
>   

Looking at the libvirt code again, it seems a single thread services the
event loop. See virNetServerRun() in src/util/virnetserver.c. Indeed, I
see the same thread ID in all the timer and fd callbacks. One of the
libvirt core devs can correct me if I'm wrong.

Regards,
Jim


> In principle I guess the driver could keep its per-timeout structs
> around forever and remember whether they've been deregistered or not.
> Each one would have to have a lock in it.
>
> But if you think about it, if you have 10 threads all running the
> event loop and you set a timeout to zero, doesn't that mean that every
> thread's event loop should do the timeout callback as fast as it can ?
> That could be a lot of wasted effort.
>
> The best solution would appear to be to provide a non-recurring
> callback.
>
>   
>> I'm not so thrilled with the timeout handling code in the libvirt libxl
>> driver.  The driver maintains a list of active timeouts because IIRC,
>> there were cases when the driver received timeout deregistrations when
>> calling libxl_ctx_free, at which point some of the associated structures
>> were freed.  The idea was to call libxl_osevent_occurred_timeout on any
>> active timeouts before freeing libxlDomainObjPrivate and its contents.
>>     
>
> libxl does deregister fd callbacks in libxl_ctx_free.
>
> But libxl doesn't currently "deregister" any timeouts in
> libxl_ctx_free; indeed it would be a bit daft for it to do so as at
> libxl_ctx_free there are no aos running so there would be nothing to
> time out.
>
> But there is a difficulty with timeouts which libxl has set to occur
> immediately but which have not yet actually had the callback.  The the
> application cannot call libxl_ctx_free with such timeouts outstanding,
> because that would imply later calling back into libxl with a stale
> ctx.
>
> (Looking at the code I see that the "nexi" are never actually freed.
> Bah.)
>
> Thanks,
> Ian.
>
>   

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [libvirt] [Xen-devel] [PATCH 00/12] libxl: fork: SIGCHLD flexibility
  2014-01-28  1:39                 ` [libvirt] [Xen-devel] " Jim Fehlig
@ 2014-01-28 10:06                   ` Daniel P. Berrange
  2014-01-29 16:23                     ` [libvirt] " Ian Jackson
  2014-01-30 12:18                   ` [libvirt] [Xen-devel] " Daniel P. Berrange
  2014-01-30 16:28                   ` Ian Jackson
  2 siblings, 1 reply; 58+ messages in thread
From: Daniel P. Berrange @ 2014-01-28 10:06 UTC (permalink / raw)
  To: Jim Fehlig; +Cc: LibVir, xen-devel, Ian Jackson, Ian Campbell

On Mon, Jan 27, 2014 at 06:39:36PM -0700, Jim Fehlig wrote:
> [Adding libvirt list...]
> 
> Ian Jackson wrote:
> > Jim Fehlig writes ("Re: [Xen-devel] [PATCH 00/12] libxl: fork: SIGCHLD flexibility"):
> >   
> >> BTW, I only see the crash when the save/restore script is running.  I
> >> stopped the other scripts and domains, running only save/restore on a
> >> single domain, and see the crash rather quickly (within 10 iterations).
> >>     
> >
> > I'll look at the libvirt code, but:
> >
> > With a recurring timeout, how can you ever know it's cancelled ?
> > There might be threads out there, which don't hold any locks, which
> > are in the process of executing a callback for a timeout.  That might
> > be arbitrarily delayed from the pov of the rest of the program.
> >
> > E.g.:
> >
> >  Thread A                                             Thread B
> >
> >    invoke some libxl operation
> > X    do some libxl stuff
> > X    register timeout (libxl)
> > XV     record timeout info
> > X    do some more libxl stuff
> >      ...
> > X    do some more libxl stuff
> > X    deregister timeout (libxl internal)
> > X     converted to request immediate timeout
> > XV     record new timeout info
> > X      release libvirt event loop lock
> >                                             entering libvirt event loop
> >                                        V     observe timeout is immediate
> >                                        V      need to do callback
> >                                                call libxl driver
> >
> >       entering libvirt event loop
> >  V     observe timeout is immediate
> >  V      need to do callback
> >          call libxl driver
> >            call libxl
> >   X          libxl sees timeout is live
> >   X          libxl does libxl stuff
> >          libxl driver deregisters
> >  V         record lack of timeout
> >          free driver's timeout struct
> >                                                call libxl
> >                                       X          libxl sees timeout is dead
> >                                       X          libxl does nothing
> >                                              libxl driver deregisters
> >                                        V       CRASH due to deregistering
> >                                        V        already-deregistered timeout
> >
> > If this is how things are, then I think there is no sane way to use
> > libvirt's timeouts (!)
> >   
> 
> Looking at the libvirt code again, it seems a single thread services the
> event loop. See virNetServerRun() in src/util/virnetserver.c. Indeed, I
> see the same thread ID in all the timer and fd callbacks. One of the
> libvirt core devs can correct me if I'm wrong.

Yes, you are correct. The threading model for libvirtd is that the
process thread leader executes the event loop, dealing with timer
and file descriptor I/O callbacks. There are also 'n' worker threads
which exclusively handle public API calls from libvirt clients. IOW
all your timer callbacks will be in one thread - which also means
you want your timer callbacks to be fast to execute. If you have any
very slow code you'll want to spawn a temporary worker thread from
the timer callback to do the slow work.

Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [libvirt] [PATCH 00/12] libxl: fork: SIGCHLD flexibility
  2014-01-28 10:06                   ` Daniel P. Berrange
@ 2014-01-29 16:23                     ` Ian Jackson
  0 siblings, 0 replies; 58+ messages in thread
From: Ian Jackson @ 2014-01-29 16:23 UTC (permalink / raw)
  To: Daniel P. Berrange; +Cc: LibVir, Jim Fehlig, xen-devel, Ian Campbell

Daniel P. Berrange writes ("Re: [libvirt] [Xen-devel] [PATCH 00/12] libxl: fork: SIGCHLD flexibility"):
> Yes, you are correct. The threading model for libvirtd is that the
> process thread leader executes the event loop, dealing with timer
> and file descriptor I/O callbacks. There are also 'n' worker threads
> which exclusively handle public API calls from libvirt clients. IOW
> all your timer callbacks will be in one thread - which also means
> you want your timer callbacks to be fast to execute.

Right.  Good.  All of libxl's callbacks should be fast.

There is still the problem that libxl functions on other threads may
make an fd deregistration call, while libvirt's event loop is still
polling (or selecting) on the fd in the main loop.

libxl might then close the fd, dup2 something onto it, leave the fd to
be reused for some other object.

Depending on the underlying kernel, that can cause side effects.  I
have reproduced the analogous bug with libxl's event loop, but I had
to use an fd connected to the controlling tty, from a background
process group, when the tty has stty tostop.  polling such a thing for
POLLOUT raises SIGTTOU.

Of course the libvirt xl driver still needs to cope with being told by
libxl to register or modify a timeout, or register deregister or
modify an fd, in other than the master thread.

Ian.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [libvirt] [Xen-devel] [PATCH 00/12] libxl: fork: SIGCHLD flexibility
  2014-01-28  1:39                 ` [libvirt] [Xen-devel] " Jim Fehlig
  2014-01-28 10:06                   ` Daniel P. Berrange
@ 2014-01-30 12:18                   ` Daniel P. Berrange
  2014-01-30 16:14                     ` Jim Fehlig
  2014-01-30 16:28                   ` Ian Jackson
  2 siblings, 1 reply; 58+ messages in thread
From: Daniel P. Berrange @ 2014-01-30 12:18 UTC (permalink / raw)
  To: Jim Fehlig; +Cc: LibVir, xen-devel, Ian Jackson, Ian Campbell

On Mon, Jan 27, 2014 at 06:39:36PM -0700, Jim Fehlig wrote:
> [Adding libvirt list...]
> 
> Ian Jackson wrote:
> > Jim Fehlig writes ("Re: [Xen-devel] [PATCH 00/12] libxl: fork: SIGCHLD flexibility"):

BTW, I meant to ask before - what is the SIGCHLD reference about in the
subject line ?

libvirt drivers that live inside libvirtd should never use or rely on
the SIGCHLD signal at all. All VM processes started by libvirtd ought
to be fully daemonized so that their parent is pid 1 / init. This ensures
that the libvirtd daemon can be restarted without all the VMs getting
reaped. Once the VMs are reparented to init, then libvirt or library
code it uses has no way of ever receiving SIGCHLD.

I'm unclear if the libvirt libxl driver is currently daemonizing the
processes associated with VMs it starts, but if not, it really should
do.

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [libvirt] [Xen-devel] [PATCH 00/12] libxl: fork: SIGCHLD flexibility
  2014-01-30 12:18                   ` [libvirt] [Xen-devel] " Daniel P. Berrange
@ 2014-01-30 16:14                     ` Jim Fehlig
  2014-01-30 16:17                       ` Daniel P. Berrange
  0 siblings, 1 reply; 58+ messages in thread
From: Jim Fehlig @ 2014-01-30 16:14 UTC (permalink / raw)
  To: Daniel P. Berrange; +Cc: LibVir, xen-devel, Ian Jackson, Ian Campbell

Daniel P. Berrange wrote:
> On Mon, Jan 27, 2014 at 06:39:36PM -0700, Jim Fehlig wrote:
>   
>> [Adding libvirt list...]
>>
>> Ian Jackson wrote:
>>     
>>> Jim Fehlig writes ("Re: [Xen-devel] [PATCH 00/12] libxl: fork: SIGCHLD flexibility"):
>>>       
>
> BTW, I meant to ask before - what is the SIGCHLD reference about in the
> subject line ?
>   

This is related to child processes started by libxl.  E.g. running a
bootloader when creating PV VMs, running a save/restore helper when
saving/restoring a VM, etc.

> libvirt drivers that live inside libvirtd should never use or rely on
> the SIGCHLD signal at all. All VM processes started by libvirtd ought
> to be fully daemonized so that their parent is pid 1 / init. This ensures
> that the libvirtd daemon can be restarted without all the VMs getting
> reaped. Once the VMs are reparented to init, then libvirt or library
> code it uses has no way of ever receiving SIGCHLD.
>   

Nod.  VMs are "daemonized", in the context of Xen.  Once running,
libvirtd can be restarted without reaping them.

Regards,
Jim

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [libvirt] [Xen-devel] [PATCH 00/12] libxl: fork: SIGCHLD flexibility
  2014-01-30 16:14                     ` Jim Fehlig
@ 2014-01-30 16:17                       ` Daniel P. Berrange
  0 siblings, 0 replies; 58+ messages in thread
From: Daniel P. Berrange @ 2014-01-30 16:17 UTC (permalink / raw)
  To: Jim Fehlig; +Cc: LibVir, xen-devel, Ian Jackson, Ian Campbell

On Thu, Jan 30, 2014 at 09:14:01AM -0700, Jim Fehlig wrote:
> Daniel P. Berrange wrote:
> > On Mon, Jan 27, 2014 at 06:39:36PM -0700, Jim Fehlig wrote:
> >   
> >> [Adding libvirt list...]
> >>
> >> Ian Jackson wrote:
> >>     
> >>> Jim Fehlig writes ("Re: [Xen-devel] [PATCH 00/12] libxl: fork: SIGCHLD flexibility"):
> >>>       
> >
> > BTW, I meant to ask before - what is the SIGCHLD reference about in the
> > subject line ?
> >   
> 
> This is related to child processes started by libxl.  E.g. running a
> bootloader when creating PV VMs, running a save/restore helper when
> saving/restoring a VM, etc.
> 
> > libvirt drivers that live inside libvirtd should never use or rely on
> > the SIGCHLD signal at all. All VM processes started by libvirtd ought
> > to be fully daemonized so that their parent is pid 1 / init. This ensures
> > that the libvirtd daemon can be restarted without all the VMs getting
> > reaped. Once the VMs are reparented to init, then libvirt or library
> > code it uses has no way of ever receiving SIGCHLD.
> >   
> 
> Nod.  VMs are "daemonized", in the context of Xen.  Once running,
> libvirtd can be restarted without reaping them.

Great, thanks for clarifying, this sounds fine then.

Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [libvirt] [Xen-devel] [PATCH 00/12] libxl: fork: SIGCHLD flexibility
  2014-01-28  1:39                 ` [libvirt] [Xen-devel] " Jim Fehlig
  2014-01-28 10:06                   ` Daniel P. Berrange
  2014-01-30 12:18                   ` [libvirt] [Xen-devel] " Daniel P. Berrange
@ 2014-01-30 16:28                   ` Ian Jackson
  2014-01-30 16:56                     ` Jim Fehlig
  2 siblings, 1 reply; 58+ messages in thread
From: Ian Jackson @ 2014-01-30 16:28 UTC (permalink / raw)
  To: Jim Fehlig; +Cc: LibVir, xen-devel, Ian Campbell

Jim Fehlig writes ("Re: [Xen-devel] [PATCH 00/12] libxl: fork: SIGCHLD flexibility"):
> Looking at the libvirt code again, it seems a single thread services the
> event loop. See virNetServerRun() in src/util/virnetserver.c. Indeed, I
> see the same thread ID in all the timer and fd callbacks. One of the
> libvirt core devs can correct me if I'm wrong.

OK.  So just to recap where we stand:

 * I think libxl needs the SIGCHLD flexibility series.  I'll repost
   that (v3) but it's had hardly any changes.

 * You need to fix the timer deregistration arrangements in the
   libvirt/libxl driver to avoid the crash you identified the other day.

 * Something needs to be done about the 20ms slop in the libvirt event
   loop (as it could cause libxl to lock up).  If you can't get rid of
   it in the libvirt core, then adding 20ms to the every requested
   callback time in the libvirt/libxl driver would work for now.

 * I think we can get away with not doing anything about the fd
   deregistration race in libvirt because both Linux and FreeBSD have
   behaviours that are tolerable.

 * libxl should have the fd deregistration race fixed in Xen 4.5.

Have you managed to fix the timer deregistration crash, and retest ?

Thanks,
Ian.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 00/12] libxl: fork: SIGCHLD flexibility
  2014-01-30 16:28                   ` Ian Jackson
@ 2014-01-30 16:56                     ` Jim Fehlig
  2014-01-30 17:12                       ` [libvirt] [Xen-devel] " Ian Jackson
  0 siblings, 1 reply; 58+ messages in thread
From: Jim Fehlig @ 2014-01-30 16:56 UTC (permalink / raw)
  To: Ian Jackson; +Cc: LibVir, xen-devel, Ian Campbell

Ian Jackson wrote:
> Jim Fehlig writes ("Re: [Xen-devel] [PATCH 00/12] libxl: fork: SIGCHLD flexibility"):
>   
>> Looking at the libvirt code again, it seems a single thread services the
>> event loop. See virNetServerRun() in src/util/virnetserver.c. Indeed, I
>> see the same thread ID in all the timer and fd callbacks. One of the
>> libvirt core devs can correct me if I'm wrong.
>>     
>
> OK.  So just to recap where we stand:
>
>  * I think libxl needs the SIGCHLD flexibility series.  I'll repost
>    that (v3) but it's had hardly any changes.
>   

Ok, thanks.  I'm currently testing on your git branch referenced earlier
in this thread

git://xenbits.xen.org/people/iwj/xen.git#wip.enumerate-pids-v2.1

>  * You need to fix the timer deregistration arrangements in the
>    libvirt/libxl driver to avoid the crash you identified the other day.
>   

Yes, I'm testing a fix now.

>  * Something needs to be done about the 20ms slop in the libvirt event
>    loop (as it could cause libxl to lock up).  If you can't get rid of
>    it in the libvirt core, then adding 20ms to the every requested
>    callback time in the libvirt/libxl driver would work for now.
>   

The commit msg adding the fuzz says

    Fix event test timer checks on kernels with HZ=100
   
    On kernels with HZ=100, the resolution of sleeps in poll() is
    quite bad. Doing a precise check on the expiry time vs the
    current time will thus often thing the timer has not expired
    even though we're within 10ms of the expected expiry time. This
    then causes another pointless sleep in poll() for <10ms. Timers
    do not need to have such precise expiration, so we treat a timer
    as expired if it is within 20ms of the expected expiry time. This
    also fixes the eventtest.c test suite on kernels with HZ=100
   
    * daemon/event.c: Add 20ms fuzz when checking for timer expiry


I could handle this in the libxl driver as you say, but doing so makes
me a bit nervous.  Potentially locking up libxl makes me nervous too :).

>  * I think we can get away with not doing anything about the fd
>    deregistration race in libvirt because both Linux and FreeBSD have
>    behaviours that are tolerable.
>
>  * libxl should have the fd deregistration race fixed in Xen 4.5.
>
> Have you managed to fix the timer deregistration crash, and retest ?
>   

Yes.  I've been running my tests for about 24 hours now with no problems
noted.  The tests include starting/stopping a persistent VM,
creating/stopping a transient VM, rebooting a persistent VM,
saving/restoring a transient VM, and getting info on all of these VMs.

I should probably add saving/restoring a persistent VM to the mix since
the associated libxl_ctx is never freed.  Only when a persistent VM is
undefined is the libxl_ctx freed.

Regards,
Jim

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [libvirt] [Xen-devel] [PATCH 00/12] libxl: fork: SIGCHLD flexibility
  2014-01-30 16:56                     ` Jim Fehlig
@ 2014-01-30 17:12                       ` Ian Jackson
  0 siblings, 0 replies; 58+ messages in thread
From: Ian Jackson @ 2014-01-30 17:12 UTC (permalink / raw)
  To: Jim Fehlig; +Cc: LibVir, xen-devel, Ian Campbell

Jim Fehlig writes ("Re: [Xen-devel] [PATCH 00/12] libxl: fork: SIGCHLD flexibility"):
> Ok, thanks.  I'm currently testing on your git branch referenced earlier
> in this thread
> 
> git://xenbits.xen.org/people/iwj/xen.git#wip.enumerate-pids-v2.1

Great.  That's the one.  My current version is pretty much identical -
some unused variables deleted and comments edited.

> >  * You need to fix the timer deregistration arrangements in the
> >    libvirt/libxl driver to avoid the crash you identified the other day.
> 
> Yes, I'm testing a fix now.

Great.

> >  * Something needs to be done about the 20ms slop in the libvirt event
> >    loop (as it could cause libxl to lock up).  If you can't get rid of
> >    it in the libvirt core, then adding 20ms to the every requested
> >    callback time in the libvirt/libxl driver would work for now.
> >   
> 
> The commit msg adding the fuzz says
> 
>     Fix event test timer checks on kernels with HZ=100
>    
>     On kernels with HZ=100, the resolution of sleeps in poll() is
>     quite bad. Doing a precise check on the expiry time vs the
>     current time will thus often thing the timer has not expired
>     even though we're within 10ms of the expected expiry time. This
>     then causes another pointless sleep in poll() for <10ms. Timers
>     do not need to have such precise expiration, so we treat a timer
>     as expired if it is within 20ms of the expected expiry time. This
>     also fixes the eventtest.c test suite on kernels with HZ=100

I think this is a bug in the kernel.  poll() may sleep longer, but not
shorter, than expected.

>     * daemon/event.c: Add 20ms fuzz when checking for timer expiry
> 
> I could handle this in the libxl driver as you say, but doing so makes
> me a bit nervous.  Potentially locking up libxl makes me nervous too :).

I was going to say that the code in libxl_osevent_occurred_timeout
checked the time against the requested time and would ignore the event
(thinking it was stale) if it was too early.

But in fact now that I read the code this is not true.  In fact I
think it will work OK (modulo some things happening too soon).  So the
upshot is that I still think this is a bug in libvirt but I don't
think it's critical to fix it.

Sorry to cause undue alarm.

> Yes.  I've been running my tests for about 24 hours now with no problems
> noted.  The tests include starting/stopping a persistent VM,
> creating/stopping a transient VM, rebooting a persistent VM,
> saving/restoring a transient VM, and getting info on all of these VMs.
> 
> I should probably add saving/restoring a persistent VM to the mix since
> the associated libxl_ctx is never freed.  Only when a persistent VM is
> undefined is the libxl_ctx freed.

Right.  Great.

Thanks,
Ian.

^ permalink raw reply	[flat|nested] 58+ messages in thread

end of thread, other threads:[~2014-01-30 17:12 UTC | newest]

Thread overview: 58+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-01-17 16:23 [PATCH 00/12] libxl: fork: SIGCHLD flexibility Ian Jackson
2014-01-17 16:23 ` [PATCH 01/12] libxl: fork: Break out checked_waitpid Ian Jackson
2014-01-17 16:23 ` [PATCH 02/12] libxl: fork: Break out childproc_reaped_ours Ian Jackson
2014-01-17 16:23 ` [PATCH 03/12] libxl: fork: Clarify docs for libxl_sigchld_owner Ian Jackson
2014-01-17 16:23 ` [PATCH 04/12] libxl: fork: Document libxl_sigchld_owner_libxl better Ian Jackson
2014-01-17 16:28   ` Ian Campbell
2014-01-17 16:23 ` [PATCH 05/12] libxl: fork: assert that chldmode is right Ian Jackson
2014-01-17 16:23 ` [PATCH 06/12] libxl: fork: Provide libxl_childproc_sigchld_occurred Ian Jackson
2014-01-17 16:24 ` [PATCH 07/12] libxl: fork: Provide ..._always_selective_reap Ian Jackson
2014-01-17 22:17   ` Jim Fehlig
2014-01-17 16:24 ` [PATCH 08/12] libxl: fork: Provide LIBXL_HAVE_SIGCHLD_SELECTIVE_REAP Ian Jackson
2014-01-17 16:24 ` [PATCH 09/12] libxl: fork: Rename sigchld handler functions Ian Jackson
2014-01-20  9:59   ` Ian Campbell
2014-01-17 16:24 ` [PATCH 10/12] libxl: fork: Break out sigchld_installhandler_core Ian Jackson
2014-01-20  9:59   ` Ian Campbell
2014-01-17 16:24 ` [PATCH 11/12] libxl: fork: Break out sigchld_sethandler_raw Ian Jackson
2014-01-20  9:58   ` Ian Campbell
2014-01-20 17:57     ` Ian Jackson
2014-01-17 16:24 ` [PATCH 12/12] libxl: fork: Share SIGCHLD handler amongst ctxs Ian Jackson
2014-01-17 18:13   ` Ian Jackson
2014-01-20  9:56     ` Ian Campbell
2014-01-21 14:40       ` Ian Jackson
2014-01-21 14:53         ` Ian Campbell
2014-01-21 15:09           ` Ian Jackson
2014-01-17 16:37 ` [PATCH 00/12] libxl: fork: SIGCHLD flexibility Ian Jackson
2014-01-17 22:29 ` Jim Fehlig
2014-01-20 18:14   ` Jim Fehlig
2014-01-21 14:46     ` Ian Jackson
2014-01-21 15:11       ` [PATCH 13/12] libxl: events: Break out libxl__pipe_nonblock, _close Ian Jackson
2014-01-21 15:11         ` [PATCH 14/12] libxl: fork: Make SIGCHLD self-pipe nonblocking Ian Jackson
2014-01-21 15:32           ` Ian Campbell
2014-01-21 15:48             ` Ian Jackson
2014-01-21 15:27         ` [PATCH 13/12] libxl: events: Break out libxl__pipe_nonblock, _close Ian Campbell
2014-01-21 15:31           ` Ian Jackson
2014-01-21 15:28     ` [PATCH 00/12] libxl: fork: SIGCHLD flexibility Ian Jackson
2014-01-22  5:32       ` Jim Fehlig
2014-01-23  4:05         ` Jim Fehlig
2014-01-23 10:56           ` Ian Jackson
2014-01-23 21:36             ` Jim Fehlig
2014-01-24  4:27             ` Jim Fehlig
2014-01-24 12:41               ` Ian Jackson
2014-01-24 12:52                 ` Ian Campbell
2014-01-24 15:14                   ` Ian Jackson
2014-01-24 15:18                     ` Ian Jackson
2014-01-24 16:36                     ` Ian Jackson
2014-01-24 16:57                       ` Ian Jackson
2014-01-27  5:39                   ` Jim Fehlig
2014-01-27  5:22                 ` Jim Fehlig
2014-01-27 14:48                   ` Ian Jackson
2014-01-28  1:39                 ` [libvirt] [Xen-devel] " Jim Fehlig
2014-01-28 10:06                   ` Daniel P. Berrange
2014-01-29 16:23                     ` [libvirt] " Ian Jackson
2014-01-30 12:18                   ` [libvirt] [Xen-devel] " Daniel P. Berrange
2014-01-30 16:14                     ` Jim Fehlig
2014-01-30 16:17                       ` Daniel P. Berrange
2014-01-30 16:28                   ` Ian Jackson
2014-01-30 16:56                     ` Jim Fehlig
2014-01-30 17:12                       ` [libvirt] [Xen-devel] " Ian Jackson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.