All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] question on libxl ao
@ 2013-11-05  9:22 Bamvor Jian Zhang
  2013-11-05  9:22 ` [PATCH 1/2] expose child fd in order to handle child exit in libvirt Bamvor Jian Zhang
  2013-11-05  9:22 ` [PATCH 2/2] call ao_how callback explicitly Bamvor Jian Zhang
  0 siblings, 2 replies; 29+ messages in thread
From: Bamvor Jian Zhang @ 2013-11-05  9:22 UTC (permalink / raw)
  To: xen-devel
  Cc: jfehlig, Bamvor Jian Zhang, ian.jackson, ian.campbell,
	stefano.stabellini

there are four choices for using ao_how in libvirt.
(a), ao without callback and set libxl_sigchld_owner_mainloop
(b), ao without callback and set libxl_sigchld_owner_libxl_always
(c), ao with callback and set libxl_sigchld_owner_mainloop
(d), ao with callback and set libxl_sigchld_owner_libxl_always

i got some trouble when i try (b) and (c).
i still not decided which one is better choice for implement in
libvirt. it would be easier if i could try above four choice.

Bamvor Jian Zhang (2):
  expose child fd in order to handle child exit in libvirt
  call ao_how callback explicitly

 tools/libxl/libxl.h          |  2 ++
 tools/libxl/libxl_event.c    | 18 ++++++++++++++++++
 tools/libxl/libxl_fork.c     |  5 +++++
 tools/libxl/libxl_internal.h |  1 +
 4 files changed, 26 insertions(+)

-- 
1.8.1.4

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH 1/2] expose child fd in order to handle child exit in libvirt
  2013-11-05  9:22 [PATCH 0/2] question on libxl ao Bamvor Jian Zhang
@ 2013-11-05  9:22 ` Bamvor Jian Zhang
  2013-11-05 16:05   ` [PATCH 0/2] question on libxl ao [and 1 more messages] Ian Jackson
  2013-11-05 19:23   ` [PATCH 0/2] question on libxl ao [and 1 more messages] " Ian Jackson
  2013-11-05  9:22 ` [PATCH 2/2] call ao_how callback explicitly Bamvor Jian Zhang
  1 sibling, 2 replies; 29+ messages in thread
From: Bamvor Jian Zhang @ 2013-11-05  9:22 UTC (permalink / raw)
  To: xen-devel
  Cc: jfehlig, Bamvor Jian Zhang, ian.jackson, ian.campbell,
	stefano.stabellini

libvirt could handle fd and timeout event through
libxl_osevent_hooks. either of these will not inform the child
exit if libvirt set the libxl_sigchld_owner_libxl_always.

add a function for returning the sigchld_selfpipe in order to
handle the child exit in libvirt.
meanwhile, there is only one pipe in ctx, it seems that it is
not worth to add this to libxl_osevent_hooks.

Signed-off-by: Bamvor Jian Zhang <bjzhang@suse.com>
---
 tools/libxl/libxl.h      | 2 ++
 tools/libxl/libxl_fork.c | 5 +++++
 2 files changed, 7 insertions(+)

diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index 1c6675d..11e1bc3 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -1071,6 +1071,8 @@ int libxl_flask_loadpolicy(libxl_ctx *ctx, void *policy, uint32_t size);
 int libxl_fd_set_cloexec(libxl_ctx *ctx, int fd, int cloexec);
 int libxl_fd_set_nonblock(libxl_ctx *ctx, int fd, int nonblock);
 
+int libxl_get_pipe_handle(libxl_ctx *ctx, int num);
+
 #include <libxl_event.h>
 
 #endif /* LIBXL_H */
diff --git a/tools/libxl/libxl_fork.c b/tools/libxl/libxl_fork.c
index 044ddad..1989258 100644
--- a/tools/libxl/libxl_fork.c
+++ b/tools/libxl/libxl_fork.c
@@ -418,6 +418,11 @@ int libxl__ev_child_xenstore_reopen(libxl__gc *gc, const char *what) {
     return rc;
 }
 
+int libxl_get_pipe_handle(libxl_ctx *ctx, int num)
+{
+    return ctx->sigchld_selfpipe[num];
+}
+
 /*
  * Local variables:
  * mode: C
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 2/2] call ao_how callback explicitly
  2013-11-05  9:22 [PATCH 0/2] question on libxl ao Bamvor Jian Zhang
  2013-11-05  9:22 ` [PATCH 1/2] expose child fd in order to handle child exit in libvirt Bamvor Jian Zhang
@ 2013-11-05  9:22 ` Bamvor Jian Zhang
  2013-11-05 15:51   ` Ian Jackson
  1 sibling, 1 reply; 29+ messages in thread
From: Bamvor Jian Zhang @ 2013-11-05  9:22 UTC (permalink / raw)
  To: xen-devel
  Cc: jfehlig, Bamvor Jian Zhang, ian.jackson, ian.campbell,
	stefano.stabellini

ao_how callback is not called after inserted into list.
in my test, i directly call it in
libxl__ao_complete_check_progress_reports. i know it should not
work like this. Could you give some suggestion about it?

Signed-off-by: Bamvor Jian Zhang <bjzhang@suse.com>
---
 tools/libxl/libxl_event.c    | 18 ++++++++++++++++++
 tools/libxl/libxl_internal.h |  1 +
 2 files changed, 19 insertions(+)

diff --git a/tools/libxl/libxl_event.c b/tools/libxl/libxl_event.c
index 0fe4428..8320547 100644
--- a/tools/libxl/libxl_event.c
+++ b/tools/libxl/libxl_event.c
@@ -1578,6 +1578,23 @@ void libxl__ao_complete(libxl__egc *egc, libxl__ao *ao, int rc)
     libxl__ao_complete_check_progress_reports(egc, ao);
 }
 
+void libxl__ao_occurred(libxl__egc *egc)
+{
+    EGC_GC;
+    libxl__ao *ao, *ao_tmp;
+    LIBXL_TAILQ_FOREACH_SAFE(ao, &egc->aos_for_callback,
+                             entry_for_callback, ao_tmp) {
+        LIBXL_TAILQ_REMOVE(&egc->aos_for_callback, ao, entry_for_callback);
+        LOG(DEBUG,"ao %p: completion callback", ao);
+        ao->how.callback(CTX, ao->rc, ao->how.u.for_callback);
+        CTX_LOCK;
+        ao->notified = 1;
+        if (!ao->in_initiator)
+            libxl__ao__destroy(CTX, ao);
+        CTX_UNLOCK;
+    }
+}
+
 void libxl__ao_complete_check_progress_reports(libxl__egc *egc, libxl__ao *ao)
 {
     /*
@@ -1601,6 +1618,7 @@ void libxl__ao_complete_check_progress_reports(libxl__egc *egc, libxl__ao *ao)
     } else if (ao->how.callback) {
         LIBXL__LOG(ctx, XTL_DEBUG, "ao %p: complete for callback",ao);
         LIBXL_TAILQ_INSERT_TAIL(&egc->aos_for_callback, ao, entry_for_callback);
+        libxl__ao_occurred(egc);
     } else {
         libxl_event *ev;
         ev = NEW_EVENT(egc, OPERATION_COMPLETE, ao->domid, ao->how.u.for_event);
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 4f92522..bed775d 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -1760,6 +1760,7 @@ _hidden int libxl__ao_inprogress(libxl__ao *ao,
        const char *file, int line, const char *func); /* temporarily unlocks */
 _hidden void libxl__ao_abort(libxl__ao *ao);
 _hidden void libxl__ao_complete(libxl__egc *egc, libxl__ao *ao, int rc);
+_hidden void libxl__ao_occurred(libxl__egc *egc);
 _hidden libxl__gc *libxl__ao_inprogress_gc(libxl__ao *ao);
 
 /* Can be called at any time.  Use is essential for any aop user. */
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/2] call ao_how callback explicitly
  2013-11-05  9:22 ` [PATCH 2/2] call ao_how callback explicitly Bamvor Jian Zhang
@ 2013-11-05 15:51   ` Ian Jackson
  2013-11-07 14:51     ` Bamvor Jian Zhang
  0 siblings, 1 reply; 29+ messages in thread
From: Ian Jackson @ 2013-11-05 15:51 UTC (permalink / raw)
  To: Bamvor Jian Zhang; +Cc: jfehlig, stefano.stabellini, ian.campbell, xen-devel

Bamvor Jian Zhang writes ("[PATCH 2/2] call ao_how callback explicitly"):
> ao_how callback is not called after inserted into list.
> in my test, i directly call it in
> libxl__ao_complete_check_progress_reports. i know it should not
> work like this. Could you give some suggestion about it?

I'm not sure what you mean.  Under what conditions do you not see the
callback happen ?  The existing machinery is supposed to take care of
that.  Specifically, in the patch you're changing,
libxl__ao_complete_check_progress_reports puts the ao onto
egc->aos_for_callback.  On the return path from the libxl event
function back to the application, we are supposed to call
libxl__egc_cleanup, which in turn calls egc_run_callbacks, which
should pick up the aos on aos_for_callback.  libxl__egc_cleanup is
called from the event loops in libxl_event_wait and
libxl__ao_inprogress and from the macro EGC_FREE.  It should be
impossible for a path to be missed because inside libxl one needs an
egc to call libxl__ao_complete.

What kind of an ao_how is your application passing to libxl ?  Is your
application multithreaded ?

Ian.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 0/2] question on libxl ao [and 1 more messages]
  2013-11-05  9:22 ` [PATCH 1/2] expose child fd in order to handle child exit in libvirt Bamvor Jian Zhang
@ 2013-11-05 16:05   ` Ian Jackson
  2013-11-05 19:19     ` [PATCH 0/3] libxl: event system: SIGCHLD related fixes Ian Jackson
  2013-11-07  3:00     ` [PATCH 0/2] question on libxl ao [and 1 more messages] Bamvor Jian Zhang
  2013-11-05 19:23   ` [PATCH 0/2] question on libxl ao [and 1 more messages] " Ian Jackson
  1 sibling, 2 replies; 29+ messages in thread
From: Ian Jackson @ 2013-11-05 16:05 UTC (permalink / raw)
  To: Bamvor Jian Zhang; +Cc: jfehlig, stefano.stabellini, ian.campbell, xen-devel

Bamvor Jian Zhang writes ("[PATCH 0/2] question on libxl ao"):
> there are four choices for using ao_how in libvirt.
> (a), ao without callback and set libxl_sigchld_owner_mainloop
> (b), ao without callback and set libxl_sigchld_owner_libxl_always
> (c), ao with callback and set libxl_sigchld_owner_mainloop
> (d), ao with callback and set libxl_sigchld_owner_libxl_always
> 
> i got some trouble when i try (b) and (c).
> i still not decided which one is better choice for implement in
> libvirt. it would be easier if i could try above four choice.

I'm not sure I follow.  I think the right answer for libvirt is
probably aos specifying callback, and libxl_sigchld_owner_mainloop.
I'm assuming that libvirt has something for handling children
already.  But I haven't looked.  If it doesn't then
libxl_sigchld_owner_always would probably be best.

Bamvor Jian Zhang writes ("[PATCH 1/2] expose child fd in order to handle child exit in libvirt"):
> libvirt could handle fd and timeout event through
> libxl_osevent_hooks. either of these will not inform the child
> exit if libvirt set the libxl_sigchld_owner_libxl_always.

I don't exactly follow, but I think you have found a bug.

If libvirt specifies libxl_sigchld_owner_libxl_always then libxl will
install a SIGCHLD handler and to the self-pipe trick.  The self-pipe
fd should be registered with the fd event machinery in the normal way.
However, this doesn't actually appear to happen.  Instead the
beforepoll/afterpoll functions handle that fd ad-hoc.  The result is
that, probably, nothing will notice the selfpipe becoming writable.

I will see about writing a patch to fix this.

> add a function for returning the sigchld_selfpipe in order to
> handle the child exit in libvirt.
> meanwhile, there is only one pipe in ctx, it seems that it is
> not worth to add this to libxl_osevent_hooks.

This approach is not right, though.  There is no need to add
additional interfaces to libxl.  The bug just needs to be fixed.

Thanks,
Ian.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH 0/3] libxl: event system: SIGCHLD related fixes
  2013-11-05 16:05   ` [PATCH 0/2] question on libxl ao [and 1 more messages] Ian Jackson
@ 2013-11-05 19:19     ` Ian Jackson
  2013-11-05 19:19       ` [PATCH 1/3] libxl: event system: pass gc, not just ctx, to internal sigchld manipulators Ian Jackson
                         ` (3 more replies)
  2013-11-07  3:00     ` [PATCH 0/2] question on libxl ao [and 1 more messages] Bamvor Jian Zhang
  1 sibling, 4 replies; 29+ messages in thread
From: Ian Jackson @ 2013-11-05 19:19 UTC (permalink / raw)
  To: xen-devel; +Cc: Jim Fehlig, Ian Jackson, Ian Campbell, Bamvor Jian Zhang

libxl's SIGCHLD handler is broken if libxl_sigchld_owner_libxl_always
and does not always work correctly with the fd registration system.

 1/3 libxl: event system: pass gc, not just ctx, to internal
 2/3 libxl: event system: Make
 3/3 libxl: event system: properly register the SIGCHLD

Reported-by: Bamvor Jian Zhang <bjzhang@suse.com>
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Cc: Bamvor Jian Zhang <bjzhang@suse.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Jim Fehlig <jfehlig@suse.com>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH 1/3] libxl: event system: pass gc, not just ctx, to internal sigchld manipulators
  2013-11-05 19:19     ` [PATCH 0/3] libxl: event system: SIGCHLD related fixes Ian Jackson
@ 2013-11-05 19:19       ` Ian Jackson
  2013-11-06 10:30         ` Ian Campbell
  2013-11-05 19:19       ` [PATCH 2/3] libxl: event system: Make libxl_sigchld_owner_libxl_always work Ian Jackson
                         ` (2 subsequent siblings)
  3 siblings, 1 reply; 29+ messages in thread
From: Ian Jackson @ 2013-11-05 19:19 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Ian Campbell

We are going to want the gc for libxl__ev_fd_register.

No functional change in this patch.  Simply change the argument types,
and the actual arguments from ctx to gc.  Inside these functions, use
CTX (the macro which uses gc) rather than the old formal parameter
ctx.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
---
 tools/libxl/libxl.c          |    2 +-
 tools/libxl/libxl_fork.c     |   36 ++++++++++++++++++------------------
 tools/libxl/libxl_internal.h |    4 ++--
 3 files changed, 21 insertions(+), 21 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 0f0f56c..d0a87d2 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -171,7 +171,7 @@ int libxl_ctx_free(libxl_ctx *ctx)
     /* If we have outstanding children, then the application inherits
      * them; we wish the application good luck with understanding
      * this if and when it reaps them. */
-    libxl__sigchld_removehandler(ctx);
+    libxl__sigchld_removehandler(gc);
 
     if (ctx->sigchld_selfpipe[0] >= 0) {
         close(ctx->sigchld_selfpipe[0]);
diff --git a/tools/libxl/libxl_fork.c b/tools/libxl/libxl_fork.c
index 044ddad..f392d67 100644
--- a/tools/libxl/libxl_fork.c
+++ b/tools/libxl/libxl_fork.c
@@ -173,23 +173,23 @@ static void sigchld_removehandler_core(void)
     sigchld_owner = 0;
 }
 
-void libxl__sigchld_removehandler(libxl_ctx *ctx) /* non-reentrant */
+void libxl__sigchld_removehandler(libxl__gc *gc) /* non-reentrant */
 {
     atfork_lock();
-    if (sigchld_owner == ctx)
+    if (sigchld_owner == CTX)
         sigchld_removehandler_core();
     atfork_unlock();
 }
 
-int libxl__sigchld_installhandler(libxl_ctx *ctx) /* non-reentrant */
+int libxl__sigchld_installhandler(libxl__gc *gc) /* non-reentrant */
 {
     int r, rc;
 
-    if (ctx->sigchld_selfpipe[0] < 0) {
-        r = pipe(ctx->sigchld_selfpipe);
+    if (CTX->sigchld_selfpipe[0] < 0) {
+        r = pipe(CTX->sigchld_selfpipe);
         if (r) {
-            ctx->sigchld_selfpipe[0] = -1;
-            LIBXL__LOG_ERRNO(ctx, LIBXL__LOG_ERROR,
+            CTX->sigchld_selfpipe[0] = -1;
+            LIBXL__LOG_ERRNO(CTX, LIBXL__LOG_ERROR,
                              "failed to create sigchld pipe");
             rc = ERROR_FAIL;
             goto out;
@@ -197,11 +197,11 @@ int libxl__sigchld_installhandler(libxl_ctx *ctx) /* non-reentrant */
     }
 
     atfork_lock();
-    if (sigchld_owner != ctx) {
+    if (sigchld_owner != CTX) {
         struct sigaction ours;
 
         assert(!sigchld_owner);
-        sigchld_owner = ctx;
+        sigchld_owner = CTX;
 
         memset(&ours,0,sizeof(ours));
         ours.sa_handler = sigchld_handler;
@@ -239,11 +239,11 @@ int libxl__fork_selfpipe_active(libxl_ctx *ctx)
     return ctx->sigchld_selfpipe[0];
 }
 
-static void perhaps_removehandler(libxl_ctx *ctx)
+static void perhaps_removehandler(libxl__gc *gc)
 {
-    if (LIBXL_LIST_EMPTY(&ctx->children) &&
-        ctx->childproc_hooks->chldowner != libxl_sigchld_owner_libxl_always)
-        libxl__sigchld_removehandler(ctx);
+    if (LIBXL_LIST_EMPTY(&CTX->children) &&
+        CTX->childproc_hooks->chldowner != libxl_sigchld_owner_libxl_always)
+        libxl__sigchld_removehandler(gc);
 }
 
 static int childproc_reaped(libxl__egc *egc, pid_t pid, int status)
@@ -263,7 +263,7 @@ static int childproc_reaped(libxl__egc *egc, pid_t pid, int status)
     ch->pid = -1;
     ch->callback(egc, ch, pid, status);
 
-    perhaps_removehandler(CTX);
+    perhaps_removehandler(gc);
 
     return 0;
 }
@@ -331,7 +331,7 @@ pid_t libxl__ev_child_fork(libxl__gc *gc, libxl__ev_child *ch,
     int rc;
 
     if (chldmode_ours(CTX)) {
-        rc = libxl__sigchld_installhandler(CTX);
+        rc = libxl__sigchld_installhandler(gc);
         if (rc) goto out;
     }
 
@@ -361,7 +361,7 @@ pid_t libxl__ev_child_fork(libxl__gc *gc, libxl__ev_child *ch,
     rc = pid;
 
  out:
-    perhaps_removehandler(CTX);
+    perhaps_removehandler(gc);
     CTX_UNLOCK;
     return rc;
 }
@@ -383,10 +383,10 @@ void libxl_childproc_setmode(libxl_ctx *ctx, const libxl_childproc_hooks *hooks,
     switch (ctx->childproc_hooks->chldowner) {
     case libxl_sigchld_owner_mainloop:
     case libxl_sigchld_owner_libxl:
-        libxl__sigchld_removehandler(ctx);
+        libxl__sigchld_removehandler(gc);
         break;
     case libxl_sigchld_owner_libxl_always:
-        libxl__sigchld_installhandler(ctx);
+        libxl__sigchld_installhandler(gc);
         break;
     default:
         abort();
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 4729566..bd2eeed 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -838,8 +838,8 @@ _hidden void libxl__poller_wakeup(libxl__egc *egc, libxl__poller *p);
 
 /* Internal to fork and child reaping machinery */
 extern const libxl_childproc_hooks libxl__childproc_default_hooks;
-int libxl__sigchld_installhandler(libxl_ctx *ctx); /* non-reentrant;logs errs */
-void libxl__sigchld_removehandler(libxl_ctx *ctx); /* non-reentrant */
+int libxl__sigchld_installhandler(libxl__gc*); /* non-reentrant; logs errs */
+void libxl__sigchld_removehandler(libxl__gc*); /* non-reentrant */
 int libxl__fork_selfpipe_active(libxl_ctx *ctx); /* returns read fd or -1 */
 void libxl__fork_selfpipe_woken(libxl__egc *egc);
 int libxl__self_pipe_wakeup(int fd); /* returns 0 or -1 setting errno */
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 2/3] libxl: event system: Make libxl_sigchld_owner_libxl_always work
  2013-11-05 19:19     ` [PATCH 0/3] libxl: event system: SIGCHLD related fixes Ian Jackson
  2013-11-05 19:19       ` [PATCH 1/3] libxl: event system: pass gc, not just ctx, to internal sigchld manipulators Ian Jackson
@ 2013-11-05 19:19       ` Ian Jackson
  2013-11-06 10:34         ` Ian Campbell
  2013-11-05 19:19       ` [PATCH 3/3] libxl: event system: properly register the SIGCHLD self-pipe Ian Jackson
  2013-11-06 10:38       ` [PATCH 0/3] libxl: event system: SIGCHLD related fixes Ian Campbell
  3 siblings, 1 reply; 29+ messages in thread
From: Ian Jackson @ 2013-11-05 19:19 UTC (permalink / raw)
  To: xen-devel; +Cc: Jim Fehlig, Ian Jackson, Ian Campbell, Bamvor Jian Zhang

Previously, libxl_sigchld_owner_libxl_always was mishandled.  It would
result in libxl paying no attention to the sigchld self pipe.

Fix this by fixing chldmode_ours so that it returns true iff we are
supposed to be handling SIGCHLD.

Additionally, we arrange to use chldmode_ours everywhere where we are
installing/removing signal handlers and/or deciding whether to check
the self pipe, etc.  This means it needs a new "creating" flag
argument for the benefit of libxl__ev_child_fork, which needs to
install the signal handler in libxl_sigchld_owner_libxl even if there
are not currently any children.

ctx->childproc_hooks->chldowner is now interpreted only by
chldmode_ours.

Reported-by: Bamvor Jian Zhang <bjzhang@suse.com>
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Cc: Bamvor Jian Zhang <bjzhang@suse.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Jim Fehlig <jfehlig@suse.com>
---
 tools/libxl/libxl_fork.c |   51 ++++++++++++++++++++++++----------------------
 1 file changed, 27 insertions(+), 24 deletions(-)

diff --git a/tools/libxl/libxl_fork.c b/tools/libxl/libxl_fork.c
index f392d67..335ca40 100644
--- a/tools/libxl/libxl_fork.c
+++ b/tools/libxl/libxl_fork.c
@@ -222,18 +222,23 @@ int libxl__sigchld_installhandler(libxl__gc *gc) /* non-reentrant */
     return rc;
 }
 
-static int chldmode_ours(libxl_ctx *ctx)
+static bool chldmode_ours(libxl_ctx *ctx, bool creating)
 {
-    return ctx->childproc_hooks->chldowner == libxl_sigchld_owner_libxl;
+    switch (ctx->childproc_hooks->chldowner) {
+    case libxl_sigchld_owner_libxl:
+        return creating || !LIBXL_LIST_EMPTY(&ctx->children);
+    case libxl_sigchld_owner_mainloop:
+        return 0;
+    case libxl_sigchld_owner_libxl_always:
+        return 1;
+    }
+    abort();
 }
 
 int libxl__fork_selfpipe_active(libxl_ctx *ctx)
 {
     /* Returns the fd to read, or -1 */
-    if (!chldmode_ours(ctx))
-        return -1;
-
-    if (LIBXL_LIST_EMPTY(&ctx->children))
+    if (!chldmode_ours(ctx, 0))
         return -1;
 
     return ctx->sigchld_selfpipe[0];
@@ -241,11 +246,21 @@ int libxl__fork_selfpipe_active(libxl_ctx *ctx)
 
 static void perhaps_removehandler(libxl__gc *gc)
 {
-    if (LIBXL_LIST_EMPTY(&CTX->children) &&
-        CTX->childproc_hooks->chldowner != libxl_sigchld_owner_libxl_always)
+    if (chldmode_ours(CTX, 0))
         libxl__sigchld_removehandler(gc);
 }
 
+static int perhaps_installhandler(libxl__gc *gc, bool creating)
+{
+    int rc;
+
+    if (chldmode_ours(CTX, creating)) {
+        rc = libxl__sigchld_installhandler(gc);
+        if (rc) return rc;
+    }
+    return 0;
+}
+
 static int childproc_reaped(libxl__egc *egc, pid_t pid, int status)
 {
     EGC_GC;
@@ -284,7 +299,7 @@ void libxl__fork_selfpipe_woken(libxl__egc *egc)
      * ctx must be locked EXACTLY ONCE */
     EGC_GC;
 
-    while (chldmode_ours(CTX) /* in case the app changes the mode */) {
+    while (chldmode_ours(CTX, 0) /* in case the app changes the mode */) {
         int status;
         pid_t pid = waitpid(-1, &status, WNOHANG);
 
@@ -330,10 +345,7 @@ pid_t libxl__ev_child_fork(libxl__gc *gc, libxl__ev_child *ch,
     CTX_LOCK;
     int rc;
 
-    if (chldmode_ours(CTX)) {
-        rc = libxl__sigchld_installhandler(gc);
-        if (rc) goto out;
-    }
+    perhaps_installhandler(gc, 1);
 
     pid_t pid =
         CTX->childproc_hooks->fork_replacement
@@ -380,17 +392,8 @@ void libxl_childproc_setmode(libxl_ctx *ctx, const libxl_childproc_hooks *hooks,
     ctx->childproc_hooks = hooks;
     ctx->childproc_user = user;
 
-    switch (ctx->childproc_hooks->chldowner) {
-    case libxl_sigchld_owner_mainloop:
-    case libxl_sigchld_owner_libxl:
-        libxl__sigchld_removehandler(gc);
-        break;
-    case libxl_sigchld_owner_libxl_always:
-        libxl__sigchld_installhandler(gc);
-        break;
-    default:
-        abort();
-    }
+    perhaps_removehandler(gc);
+    perhaps_installhandler(gc, 0); /* idempotent, ok to ignore errors for now */
 
     CTX_UNLOCK;
     GC_FREE;
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 3/3] libxl: event system: properly register the SIGCHLD self-pipe
  2013-11-05 19:19     ` [PATCH 0/3] libxl: event system: SIGCHLD related fixes Ian Jackson
  2013-11-05 19:19       ` [PATCH 1/3] libxl: event system: pass gc, not just ctx, to internal sigchld manipulators Ian Jackson
  2013-11-05 19:19       ` [PATCH 2/3] libxl: event system: Make libxl_sigchld_owner_libxl_always work Ian Jackson
@ 2013-11-05 19:19       ` Ian Jackson
  2013-11-06 10:36         ` Ian Campbell
  2013-11-06 10:38       ` [PATCH 0/3] libxl: event system: SIGCHLD related fixes Ian Campbell
  3 siblings, 1 reply; 29+ messages in thread
From: Ian Jackson @ 2013-11-05 19:19 UTC (permalink / raw)
  To: xen-devel; +Cc: Jim Fehlig, Ian Jackson, Ian Campbell, Bamvor Jian Zhang

An application which uses libxl_osevent_register_hooks, and doesn't
use libxl_sigchld_owner_mainloop, would never properly experience the
deaths of its (libxl) children.

This is because the SIGCHLD self pipe would be handled using ad-hoc
code in beforepoll_internal and afterpoll_internal.  However, this is
no good if application is using the fd registration system instead; in
that case these functions would not be called and nothing would deal
with readability of the self pipe.

Fix this as follows:

The SIGCHLD self pipe now is now dealt with by a standard libxl__ev_fd
handler, which is registered and deregistered along with the SIGCHLD
handler itself.  The handler function subsumes the ad-hoc response
code removed from beforepoll_internal, and the functionality of
libxl__fork_selfpipe_woken.

This is also tidier as the SIGCHLD self pipe is no longer touched
outside libxl_fork.c other than in ctx initialisation and teardown.

(The ad-hoc arrangements for poller->wakeup_pipe in
beforepoll_internal and afterpoll_internal are OK because the
libxl__poller mechanism exists to wake up threads which are sitting
inside libxl's poll loop, so is not applicable to the application's
event loop.)

Reported-by: Bamvor Jian Zhang <bjzhang@suse.com>
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Cc: Bamvor Jian Zhang <bjzhang@suse.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Jim Fehlig <jfehlig@suse.com>
---
 tools/libxl/libxl.c          |    2 ++
 tools/libxl/libxl_event.c    |   12 ----------
 tools/libxl/libxl_fork.c     |   52 ++++++++++++++++++++++++++++++++----------
 tools/libxl/libxl_internal.h |    3 +--
 4 files changed, 43 insertions(+), 26 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index d0a87d2..23e5a40 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -65,6 +65,7 @@ int libxl_ctx_alloc(libxl_ctx **pctx, int version,
     ctx->childproc_user = 0;
         
     ctx->sigchld_selfpipe[0] = -1;
+    libxl__ev_fd_init(&ctx->sigchld_selfpipe_efd);
 
     /* The mutex is special because we can't idempotently destroy it */
 
@@ -146,6 +147,7 @@ int libxl_ctx_free(libxl_ctx *ctx)
     for (i = 0; i < ctx->watch_nslots; i++)
         assert(!libxl__watch_slot_contents(gc, i));
     libxl__ev_fd_deregister(gc, &ctx->watch_efd);
+    libxl__ev_fd_deregister(gc, &ctx->sigchld_selfpipe_efd);
 
     /* Now there should be no more events requested from the application: */
 
diff --git a/tools/libxl/libxl_event.c b/tools/libxl/libxl_event.c
index 0fe4428..b732816 100644
--- a/tools/libxl/libxl_event.c
+++ b/tools/libxl/libxl_event.c
@@ -774,10 +774,6 @@ static int beforepoll_internal(libxl__gc *gc, libxl__poller *poller,
                                                                        \
         REQUIRE_FD(poller->wakeup_pipe[0], POLLIN, BODY);              \
                                                                        \
-        int selfpipe = libxl__fork_selfpipe_active(CTX);               \
-        if (selfpipe >= 0)                                             \
-            REQUIRE_FD(selfpipe, POLLIN, BODY);                        \
-                                                                       \
     }while(0)
 
 #define REQUIRE_FD(req_fd_, req_events_, BODY) do{      \
@@ -999,14 +995,6 @@ static void afterpoll_internal(libxl__egc *egc, libxl__poller *poller,
         if (e) LIBXL__EVENT_DISASTER(egc, "read wakeup", e, 0);
     }
 
-    int selfpipe = libxl__fork_selfpipe_active(CTX);
-    if (selfpipe >= 0 &&
-        afterpoll_check_fd(poller,fds,nfds, selfpipe, POLLIN)) {
-        int e = libxl__self_pipe_eatall(selfpipe);
-        if (e) LIBXL__EVENT_DISASTER(egc, "read sigchld pipe", e, 0);
-        libxl__fork_selfpipe_woken(egc);
-    }
-
     for (;;) {
         libxl__ev_time *etime = LIBXL_TAILQ_FIRST(&CTX->etimes);
         if (!etime)
diff --git a/tools/libxl/libxl_fork.c b/tools/libxl/libxl_fork.c
index 335ca40..aad1652 100644
--- a/tools/libxl/libxl_fork.c
+++ b/tools/libxl/libxl_fork.c
@@ -155,6 +155,9 @@ int libxl__carefd_fd(const libxl__carefd *cf)
  * Actual child process handling
  */
 
+static void sigchld_selfpipe_handler(libxl__egc *egc, libxl__ev_fd *ev,
+                                     int fd, short events, short revents);
+
 static void sigchld_handler(int signo)
 {
     int e = libxl__self_pipe_wakeup(sigchld_owner->sigchld_selfpipe[1]);
@@ -175,10 +178,18 @@ static void sigchld_removehandler_core(void)
 
 void libxl__sigchld_removehandler(libxl__gc *gc) /* non-reentrant */
 {
+    int rc;
+
     atfork_lock();
     if (sigchld_owner == CTX)
         sigchld_removehandler_core();
     atfork_unlock();
+
+    if (libxl__ev_fd_isregistered(&CTX->sigchld_selfpipe_efd)) {
+        rc = libxl__ev_fd_modify(gc, &CTX->sigchld_selfpipe_efd, 0);
+        if (rc)
+            libxl__ev_fd_deregister(gc, &CTX->sigchld_selfpipe_efd);
+    }
 }
 
 int libxl__sigchld_installhandler(libxl__gc *gc) /* non-reentrant */
@@ -195,6 +206,15 @@ int libxl__sigchld_installhandler(libxl__gc *gc) /* non-reentrant */
             goto out;
         }
     }
+    if (!libxl__ev_fd_isregistered(&CTX->sigchld_selfpipe_efd)) {
+        rc = libxl__ev_fd_register(gc, &CTX->sigchld_selfpipe_efd,
+                                   sigchld_selfpipe_handler,
+                                   CTX->sigchld_selfpipe[0], POLLIN);
+        if (rc) goto out;
+    } else {
+        rc = libxl__ev_fd_modify(gc, &CTX->sigchld_selfpipe_efd, POLLIN);
+        if (rc) goto out;
+    }
 
     atfork_lock();
     if (sigchld_owner != CTX) {
@@ -235,18 +255,9 @@ static bool chldmode_ours(libxl_ctx *ctx, bool creating)
     abort();
 }
 
-int libxl__fork_selfpipe_active(libxl_ctx *ctx)
-{
-    /* Returns the fd to read, or -1 */
-    if (!chldmode_ours(ctx, 0))
-        return -1;
-
-    return ctx->sigchld_selfpipe[0];
-}
-
 static void perhaps_removehandler(libxl__gc *gc)
 {
-    if (chldmode_ours(CTX, 0))
+    if (!chldmode_ours(CTX, 0))
         libxl__sigchld_removehandler(gc);
 }
 
@@ -293,12 +304,29 @@ int libxl_childproc_reaped(libxl_ctx *ctx, pid_t pid, int status)
     return rc;
 }
 
-void libxl__fork_selfpipe_woken(libxl__egc *egc)
+static void sigchld_selfpipe_handler(libxl__egc *egc, libxl__ev_fd *ev,
+                                     int fd, short events, short revents)
 {
     /* May make callbacks into the application for child processes.
-     * ctx must be locked EXACTLY ONCE */
+     * So, this function may unlock and relock the CTX.  This is OK
+     * because event callback functions are always called with the CTX
+     * locked exactly once, and from code which copes with reentrancy.
+     * (See also the comment in afterpoll_internal.) */
     EGC_GC;
 
+    int selfpipe = CTX->sigchld_selfpipe[0];
+
+    if (revents & ~POLLIN) {
+        LOG(ERROR, "unexpected poll event 0x%x on SIGCHLD self pipe", revents);
+        LIBXL__EVENT_DISASTER(egc,
+                              "unexpected poll event on SIGCHLD self pipe",
+                              0, 0);
+    }
+    assert(revents & POLLIN);
+
+    int e = libxl__self_pipe_eatall(selfpipe);
+    if (e) LIBXL__EVENT_DISASTER(egc, "read sigchld pipe", e, 0);
+
     while (chldmode_ours(CTX, 0) /* in case the app changes the mode */) {
         int status;
         pid_t pid = waitpid(-1, &status, WNOHANG);
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index bd2eeed..1e60333 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -353,6 +353,7 @@ struct libxl__ctx {
     const libxl_childproc_hooks *childproc_hooks;
     void *childproc_user;
     int sigchld_selfpipe[2]; /* [0]==-1 means handler not installed */
+    libxl__ev_fd sigchld_selfpipe_efd;
     LIBXL_LIST_HEAD(, libxl__ev_child) children;
 
     libxl_version_info version_info;
@@ -840,8 +841,6 @@ _hidden void libxl__poller_wakeup(libxl__egc *egc, libxl__poller *p);
 extern const libxl_childproc_hooks libxl__childproc_default_hooks;
 int libxl__sigchld_installhandler(libxl__gc*); /* non-reentrant; logs errs */
 void libxl__sigchld_removehandler(libxl__gc*); /* non-reentrant */
-int libxl__fork_selfpipe_active(libxl_ctx *ctx); /* returns read fd or -1 */
-void libxl__fork_selfpipe_woken(libxl__egc *egc);
 int libxl__self_pipe_wakeup(int fd); /* returns 0 or -1 setting errno */
 int libxl__self_pipe_eatall(int fd); /* returns 0 or -1 setting errno */
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH 0/2] question on libxl ao [and 1 more messages] [and 1 more messages]
  2013-11-05  9:22 ` [PATCH 1/2] expose child fd in order to handle child exit in libvirt Bamvor Jian Zhang
  2013-11-05 16:05   ` [PATCH 0/2] question on libxl ao [and 1 more messages] Ian Jackson
@ 2013-11-05 19:23   ` Ian Jackson
  1 sibling, 0 replies; 29+ messages in thread
From: Ian Jackson @ 2013-11-05 19:23 UTC (permalink / raw)
  To: Bamvor Jian Zhang; +Cc: jfehlig, stefano.stabellini, ian.campbell, xen-devel

Ian Jackson writes ("Re: [PATCH 0/2] question on libxl ao [and 1 more messages]"):
> Bamvor Jian Zhang writes ("[PATCH 1/2] expose child fd in order to handle child exit in libvirt"):
> > libvirt could handle fd and timeout event through
> > libxl_osevent_hooks. either of these will not inform the child
> > exit if libvirt set the libxl_sigchld_owner_libxl_always.
> 
> I don't exactly follow, but I think you have found a bug.
...

Can you try these three patches:

Ian Jackson writes ("[PATCH 0/3] libxl: event system: SIGCHLD related fixes"):
> libxl's SIGCHLD handler is broken if libxl_sigchld_owner_libxl_always
> and does not always work correctly with the fd registration system.
> 
>  1/3 libxl: event system: pass gc, not just ctx, to internal
>  2/3 libxl: event system: Make
>  3/3 libxl: event system: properly register the SIGCHLD
> 
> Reported-by: Bamvor Jian Zhang <bjzhang@suse.com>
> Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
> Cc: Bamvor Jian Zhang <bjzhang@suse.com>
> Cc: Ian Campbell <ian.campbell@citrix.com>
> Cc: Jim Fehlig <jfehlig@suse.com>

I've pushed them to
  git://xenbits.xen.org/people/iwj/xen.git#bamvor-sigchld-fixes-v1
  http://xenbits.xen.org/gitweb/?p=people/iwj/xen.git;a=shortlog;h=refs/heads/bamvor-sigchld-fixes-v1

Thanks,
Ian.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/3] libxl: event system: pass gc, not just ctx, to internal sigchld manipulators
  2013-11-05 19:19       ` [PATCH 1/3] libxl: event system: pass gc, not just ctx, to internal sigchld manipulators Ian Jackson
@ 2013-11-06 10:30         ` Ian Campbell
  0 siblings, 0 replies; 29+ messages in thread
From: Ian Campbell @ 2013-11-06 10:30 UTC (permalink / raw)
  To: Ian Jackson; +Cc: xen-devel

On Tue, 2013-11-05 at 19:19 +0000, Ian Jackson wrote:
> We are going to want the gc for libxl__ev_fd_register.
> 
> No functional change in this patch.  Simply change the argument types,
> and the actual arguments from ctx to gc.  Inside these functions, use
> CTX (the macro which uses gc) rather than the old formal parameter
> ctx.
> 
> Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>

Acked-by: Ian Campbell <ian.campbell@citrix.com>

(this series makes me wonder if your 'i' key was sticky when you wrote
all this stuff, chld indeed ;-))

Ian.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/3] libxl: event system: Make libxl_sigchld_owner_libxl_always work
  2013-11-05 19:19       ` [PATCH 2/3] libxl: event system: Make libxl_sigchld_owner_libxl_always work Ian Jackson
@ 2013-11-06 10:34         ` Ian Campbell
  0 siblings, 0 replies; 29+ messages in thread
From: Ian Campbell @ 2013-11-06 10:34 UTC (permalink / raw)
  To: Ian Jackson; +Cc: Jim Fehlig, Bamvor Jian Zhang, xen-devel

On Tue, 2013-11-05 at 19:19 +0000, Ian Jackson wrote:
> Previously, libxl_sigchld_owner_libxl_always was mishandled.  It would
> result in libxl paying no attention to the sigchld self pipe.
> 
> Fix this by fixing chldmode_ours so that it returns true iff we are
> supposed to be handling SIGCHLD.
> 
> Additionally, we arrange to use chldmode_ours everywhere where we are
> installing/removing signal handlers and/or deciding whether to check
> the self pipe, etc.  This means it needs a new "creating" flag
> argument for the benefit of libxl__ev_child_fork, which needs to
> install the signal handler in libxl_sigchld_owner_libxl even if there
> are not currently any children.
> 
> ctx->childproc_hooks->chldowner is now interpreted only by
> chldmode_ours.
> 
> Reported-by: Bamvor Jian Zhang <bjzhang@suse.com>
> Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
> Cc: Bamvor Jian Zhang <bjzhang@suse.com>

Acked-by: Ian Campbell <ian.campbell@citrix.com>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 3/3] libxl: event system: properly register the SIGCHLD self-pipe
  2013-11-05 19:19       ` [PATCH 3/3] libxl: event system: properly register the SIGCHLD self-pipe Ian Jackson
@ 2013-11-06 10:36         ` Ian Campbell
  0 siblings, 0 replies; 29+ messages in thread
From: Ian Campbell @ 2013-11-06 10:36 UTC (permalink / raw)
  To: Ian Jackson; +Cc: Jim Fehlig, Bamvor Jian Zhang, xen-devel

On Tue, 2013-11-05 at 19:19 +0000, Ian Jackson wrote:
> An application which uses libxl_osevent_register_hooks, and doesn't
> use libxl_sigchld_owner_mainloop, would never properly experience the
> deaths of its (libxl) children.
> 
> This is because the SIGCHLD self pipe would be handled using ad-hoc
> code in beforepoll_internal and afterpoll_internal.  However, this is
> no good if application is using the fd registration system instead; in
> that case these functions would not be called and nothing would deal
> with readability of the self pipe.
> 
> Fix this as follows:
> 
> The SIGCHLD self pipe now is now dealt with by a standard libxl__ev_fd
> handler, which is registered and deregistered along with the SIGCHLD
> handler itself.  The handler function subsumes the ad-hoc response
> code removed from beforepoll_internal, and the functionality of
> libxl__fork_selfpipe_woken.
> 
> This is also tidier as the SIGCHLD self pipe is no longer touched
> outside libxl_fork.c other than in ctx initialisation and teardown.
> 
> (The ad-hoc arrangements for poller->wakeup_pipe in
> beforepoll_internal and afterpoll_internal are OK because the
> libxl__poller mechanism exists to wake up threads which are sitting
> inside libxl's poll loop, so is not applicable to the application's
> event loop.)
> 
> Reported-by: Bamvor Jian Zhang <bjzhang@suse.com>
> Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
> Cc: Bamvor Jian Zhang <bjzhang@suse.com>

Acked-by: Ian Campbell <ian.campbell@citrix.com>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 0/3] libxl: event system: SIGCHLD related fixes
  2013-11-05 19:19     ` [PATCH 0/3] libxl: event system: SIGCHLD related fixes Ian Jackson
                         ` (2 preceding siblings ...)
  2013-11-05 19:19       ` [PATCH 3/3] libxl: event system: properly register the SIGCHLD self-pipe Ian Jackson
@ 2013-11-06 10:38       ` Ian Campbell
  2013-11-06 12:08         ` Ian Jackson
  3 siblings, 1 reply; 29+ messages in thread
From: Ian Campbell @ 2013-11-06 10:38 UTC (permalink / raw)
  To: Ian Jackson; +Cc: Jim Fehlig, Bamvor Jian Zhang, xen-devel

On Tue, 2013-11-05 at 19:19 +0000, Ian Jackson wrote:
> libxl's SIGCHLD handler is broken if libxl_sigchld_owner_libxl_always
> and does not always work correctly with the fd registration system.
> 
>  1/3 libxl: event system: pass gc, not just ctx, to internal
>  2/3 libxl: event system: Make
>  3/3 libxl: event system: properly register the SIGCHLD
> 
> Reported-by: Bamvor Jian Zhang <bjzhang@suse.com>
> Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
> Cc: Bamvor Jian Zhang <bjzhang@suse.com>
> Cc: Ian Campbell <ian.campbell@citrix.com>
> Cc: Jim Fehlig <jfehlig@suse.com>

I acked all three of these, but I think it probably wants an (Acked|
Tested|Something)-by from Jim and/or Bamvor before it gets committed.

Ian.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 0/3] libxl: event system: SIGCHLD related fixes
  2013-11-06 10:38       ` [PATCH 0/3] libxl: event system: SIGCHLD related fixes Ian Campbell
@ 2013-11-06 12:08         ` Ian Jackson
  2013-11-12 18:27           ` [PATCH v2 " Ian Jackson
  0 siblings, 1 reply; 29+ messages in thread
From: Ian Jackson @ 2013-11-06 12:08 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Jim Fehlig, Bamvor Jian Zhang, xen-devel

Ian Campbell writes ("Re: [PATCH 0/3] libxl: event system: SIGCHLD related fixes"):
> On Tue, 2013-11-05 at 19:19 +0000, Ian Jackson wrote:
> > libxl's SIGCHLD handler is broken if libxl_sigchld_owner_libxl_always
> > and does not always work correctly with the fd registration system.
> > 
> >  1/3 libxl: event system: pass gc, not just ctx, to internal
> >  2/3 libxl: event system: Make
> >  3/3 libxl: event system: properly register the SIGCHLD
...
> I acked all three of these, but I think it probably wants an (Acked|
> Tested|Something)-by from Jim and/or Bamvor before it gets committed.

Thanks.  Yes, indeed.  I have tested that xl still works but I don't
have a test harness for the fd registration machinery.

Ian.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 0/2] question on libxl ao [and 1 more messages]
  2013-11-05 16:05   ` [PATCH 0/2] question on libxl ao [and 1 more messages] Ian Jackson
  2013-11-05 19:19     ` [PATCH 0/3] libxl: event system: SIGCHLD related fixes Ian Jackson
@ 2013-11-07  3:00     ` Bamvor Jian Zhang
  1 sibling, 0 replies; 29+ messages in thread
From: Bamvor Jian Zhang @ 2013-11-07  3:00 UTC (permalink / raw)
  To: Ian Jackson; +Cc: Jim Fehlig, xen-devel, ian.campbell, stefano.stabellini



 >>>Ian Jackson <Ian.Jackson@eu.citrix.com> wrote: 
> Bamvor Jian Zhang writes ("[PATCH 0/2] question on libxl ao"): 
>> there are four choices for using ao_how in libvirt. 
>> (a), ao without callback and set libxl_sigchld_owner_mainloop 
>> (b), ao without callback and set libxl_sigchld_owner_libxl_always 
>> (c), ao with callback and set libxl_sigchld_owner_mainloop 
>> (d), ao with callback and set libxl_sigchld_owner_libxl_always 
>>  
>> i got some trouble when i try (b) and (c). 
>> i still not decided which one is better choice for implement in 
>> libvirt. it would be easier if i could try above four choice.
>  
> I'm not sure I follow.  I think the right answer for libvirt is 
> probably aos specifying callback, and libxl_sigchld_owner_mainloop. 
> I'm assuming that libvirt has something for handling children 
> already.  But I haven't looked.  If it doesn't then 
> libxl_sigchld_owner_always would probably be best.
AFAICS, no child handle in libvirt. i will try your patches.

thanks.

bamvor
>  
> Bamvor Jian Zhang writes ("[PATCH 1/2] expose child fd in order to handle  
> child exit in libvirt"): 
>> libvirt could handle fd and timeout event through 
>> libxl_osevent_hooks. either of these will not inform the child 
>> exit if libvirt set the libxl_sigchld_owner_libxl_always.
>  
> I don't exactly follow, but I think you have found a bug.
>  
> If libvirt specifies libxl_sigchld_owner_libxl_always then libxl will 
> install a SIGCHLD handler and to the self-pipe trick.  The self-pipe 
> fd should be registered with the fd event machinery in the normal way. 
> However, this doesn't actually appear to happen.  Instead the 
> beforepoll/afterpoll functions handle that fd ad-hoc.  The result is 
> that, probably, nothing will notice the selfpipe becoming writable.
>  
> I will see about writing a patch to fix this.
>  
>> add a function for returning the sigchld_selfpipe in order to 
>> handle the child exit in libvirt. 
>> meanwhile, there is only one pipe in ctx, it seems that it is 
>> not worth to add this to libxl_osevent_hooks.
>  
> This approach is not right, though.  There is no need to add 
> additional interfaces to libxl.  The bug just needs to be fixed.
>  
> Thanks, 
> Ian.
>  
> _______________________________________________ 
> Xen-devel mailing list 
> Xen-devel@lists.xen.org 
> http://lists.xen.org/xen-devel 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/2] call ao_how callback explicitly
  2013-11-05 15:51   ` Ian Jackson
@ 2013-11-07 14:51     ` Bamvor Jian Zhang
  2013-11-07 17:03       ` Ian Jackson
  0 siblings, 1 reply; 29+ messages in thread
From: Bamvor Jian Zhang @ 2013-11-07 14:51 UTC (permalink / raw)
  To: Ian.Jackson; +Cc: Jim Fehlig, xen-devel, ian.campbell, stefano.stabellini

> Bamvor Jian Zhang writes ("[PATCH 2/2] call ao_how callback explicitly"):
> > ao_how callback is not called after inserted into list.
> > in my test, i directly call it in
> > libxl__ao_complete_check_progress_reports. i know it should not
> > work like this. Could you give some suggestion about it?
> 
> I'm not sure what you mean.  Under what conditions do you not see the
> callback happen ?  The existing machinery is supposed to take care of
> that.  Specifically, in the patch you're changing,
> libxl__ao_complete_check_progress_reports puts the ao onto
> egc->aos_for_callback.  On the return path from the libxl event
> function back to the application, we are supposed to call
> libxl__egc_cleanup, which in turn calls egc_run_callbacks, which
> should pick up the aos on aos_for_callback.  libxl__egc_cleanup is
> called from the event loops in libxl_event_wait and
> libxl__ao_inprogress and from the macro EGC_FREE.  It should be
> impossible for a path to be missed because inside libxl one needs an
> egc to call libxl__ao_complete.
the libvirt libxl driver register the event handler through
libxl_event_register_callbacks, so the libxl_event_wait could not get
the event. and if the ao_how is used, the libxl__egc_cleanup in
"if ( poller )" statement will not be called either.
even if i could expose the libxl_egc_cleanup to libvirt, i still do not
know when should i call it? i do not know if there is a event triggered
before the ao_how callback should be called. (is there a fd event when
the async operation complete?).


diff --git a/tools/libxl/libxl_event.c b/tools/libxl/libxl_event.c
index b732816..c216651 100644
--- a/tools/libxl/libxl_event.c
+++ b/tools/libxl/libxl_event.c
@@ -1743,6 +1743,13 @@ void libxl__ao_progress_report(libxl__egc *egc, libxl__ao *ao,
     }
 }
 
+void libxl_egc_cleanup(libxl_ctx *ctx)
+{
+    EGC_INIT(ctx);
+    libxl__egc_cleanup(egc);
+    EGC_FREE;
+    return rc;
+}
 
 /*
  * Local variables:
diff --git a/tools/libxl/libxl_event.h b/tools/libxl/libxl_event.h
index 27a65dc..f0f7376 100644
--- a/tools/libxl/libxl_event.h
+++ b/tools/libxl/libxl_event.h
@@ -555,6 +555,7 @@ int libxl_childproc_reaped(libxl_ctx *ctx, pid_t, int status)
  */
 void libxl_postfork_child_noexec(libxl_ctx *ctx);
 
+void libxl_egc_cleanup(libxl_ctx *ctx);
 
 #endif
 
>
> What kind of an ao_how is your application passing to libxl ?
ao_how with callback set.
> Is your
> application multithreaded ?
yes.



> 
> Ian.
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/2] call ao_how callback explicitly
  2013-11-07 14:51     ` Bamvor Jian Zhang
@ 2013-11-07 17:03       ` Ian Jackson
  2013-11-08 13:17         ` Bamvor Jian Zhang
  0 siblings, 1 reply; 29+ messages in thread
From: Ian Jackson @ 2013-11-07 17:03 UTC (permalink / raw)
  To: Bamvor Jian Zhang; +Cc: Jim Fehlig, xen-devel, ian.campbell, stefano.stabellini

Bamvor Jian Zhang writes ("Re: [Xen-devel] [PATCH 2/2] call ao_how callback explicitly"):
...
> > I'm not sure what you mean.  Under what conditions do you not see the
> > callback happen ?  The existing machinery is supposed to take care of
> > that.  Specifically, in the patch you're changing,
> > libxl__ao_complete_check_progress_reports puts the ao onto
> > egc->aos_for_callback.  On the return path from the libxl event
> > function back to the application, we are supposed to call
> > libxl__egc_cleanup, which in turn calls egc_run_callbacks, which
> > should pick up the aos on aos_for_callback.  libxl__egc_cleanup is
> > called from the event loops in libxl_event_wait and
> > libxl__ao_inprogress and from the macro EGC_FREE.  It should be
> > impossible for a path to be missed because inside libxl one needs an
> > egc to call libxl__ao_complete.
...
> the libvirt libxl driver register the event handler through
> libxl_event_register_callbacks,

Right.

> so the libxl_event_wait could not get the event.

I understand from this that you're calling libxl_event_wait ?  (Rather
than that you're saying that you're not calling libxl_event_wait, but
doing so wouldn't help.)

But, as I understand it, you're passing an ao_how which specifies a
callback, rather than the generation of an event.  So I think
libxl_event_wait is irrelevant.

> and if the ao_how is used, the libxl__egc_cleanup in
> "if ( poller )" statement will not be called either.
> even if i could expose the libxl_egc_cleanup to libvirt, i still do not
> know when should i call it?

No, you shouldn't need to call it explicitly and doing so won't help.

> i do not know if there is a event triggered
> before the ao_how callback should be called. (is there a fd event when
> the async operation complete?).

It's the other way around.  The async operation completes, probably,
because of an fd event.  So I think the call flow should be:

 1. Your application notices the fd is readable (say)
 2. Your application calls libxl_osevent_occurred_fd.
 3. libxl_osevent_occurred_fd allocates an egc on entry.
 4. In libxl, libxl_osevent_occurred_fd calls whatever internal
    functions are supposed to respond to the fd.  If appropriate,
    this will cause some ao to complete.  All of these calls
    will receive the same egc as created in step 3.  I.e.:
 5. In libxl, something calls libxl__ao_complete.
 6. libxl__ao_complete puts the ao on the egc's aos_for_callback
    list.
 7. The call stack unwinds inside libxl, eventually returning
    to libxl_osevent_occurred_fd.
 8. libxl_osevent_occurred_fd uses EGC_FREE on exit, and EGC_FREE
    contains a call to libxl__egc_cleanup.
 9. libxl__egc_cleanup calls egc_run_callbacks which finds the ao
    on the aos_for_callback list and makes your application
    callback.

The purpose of the "poller" machinery is so that if at step 5 we
decide to return an event structure, rather than making a callback, a
concurrent call to libxl_event_wait (on a different thread) can be
woken up.

Can you turn on libxl's debugging and send me a trace of it losing the
event ?

Ian.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/2] call ao_how callback explicitly
  2013-11-07 17:03       ` Ian Jackson
@ 2013-11-08 13:17         ` Bamvor Jian Zhang
  2013-11-08 16:07           ` Ian Jackson
  0 siblings, 1 reply; 29+ messages in thread
From: Bamvor Jian Zhang @ 2013-11-08 13:17 UTC (permalink / raw)
  To: Ian Jackson; +Cc: Jim Fehlig, xen-devel, ian.campbell, stefano.stabellini

 >>>Ian Jackson <Ian.Jackson@eu.citrix.com> wrote: 
> Bamvor Jian Zhang writes ("Re: [Xen-devel] [PATCH 2/2] call ao_how callback  
> explicitly"): 
> ... 
> > > I'm not sure what you mean.  Under what conditions do you not see the 
> > > callback happen ?  The existing machinery is supposed to take care of 
> > > that.  Specifically, in the patch you're changing, 
> > > libxl__ao_complete_check_progress_reports puts the ao onto 
> > > egc->aos_for_callback.  On the return path from the libxl event 
> > > function back to the application, we are supposed to call 
> > > libxl__egc_cleanup, which in turn calls egc_run_callbacks, which 
> > > should pick up the aos on aos_for_callback.  libxl__egc_cleanup is 
> > > called from the event loops in libxl_event_wait and 
> > > libxl__ao_inprogress and from the macro EGC_FREE.  It should be 
> > > impossible for a path to be missed because inside libxl one needs an 
> > > egc to call libxl__ao_complete. 
> ... 
> > the libvirt libxl driver register the event handler through 
> > libxl_event_register_callbacks, 
>  
> Right. 
>  
> > so the libxl_event_wait could not get the event. 
>  
> I understand from this that you're calling libxl_event_wait ?  (Rather
> than that you're saying that you're not calling libxl_event_wait, but
> doing so wouldn't help.)
i am talking about why i could not call libxl_event_wait.
>
> But, as I understand it, you're passing an ao_how which specifies a
> callback, rather than the generation of an event.  So I think
> libxl_event_wait is irrelevant.
agree.
>  
> > and if the ao_how is used, the libxl__egc_cleanup in 
> > "if ( poller )" statement will not be called either. 
> > even if i could expose the libxl_egc_cleanup to libvirt, i still do not 
> > know when should i call it? 
>  
> No, you shouldn't need to call it explicitly and doing so won't help. 
>  
> > i do not know if there is a event triggered 
> > before the ao_how callback should be called. (is there a fd event when 
> > the async operation complete?). 
>  
> It's the other way around.  The async operation completes, probably, 
> because of an fd event.  So I think the call flow should be: 
>  
>  1. Your application notices the fd is readable (say) 
>  2. Your application calls libxl_osevent_occurred_fd. 
>  3. libxl_osevent_occurred_fd allocates an egc on entry. 
>  4. In libxl, libxl_osevent_occurred_fd calls whatever internal 
>     functions are supposed to respond to the fd.  If appropriate, 
>     this will cause some ao to complete.  All of these calls 
>     will receive the same egc as created in step 3.  I.e.: 
>  5. In libxl, something calls libxl__ao_complete. 
>  6. libxl__ao_complete puts the ao on the egc's aos_for_callback 
>     list. 
>  7. The call stack unwinds inside libxl, eventually returning 
>     to libxl_osevent_occurred_fd. 
>  8. libxl_osevent_occurred_fd uses EGC_FREE on exit, and EGC_FREE 
>     contains a call to libxl__egc_cleanup. 
>  9. libxl__egc_cleanup calls egc_run_callbacks which finds the ao 
>     on the aos_for_callback list and makes your application 
>     callback. 

with log and gdb, i could see above flow except the ao in egc->aops_for_callback is empty.
and from gdb i could know that the egc is different between libxl__ao_complete_check_progress_reports and fd event(here is watchfd_callback).
libxl__ao_complete_check_progress_reports
(gdb) p egc
$1 = (libxl__egc *) 0x7fb1fd03f500

watchfd_callback
(gdb) p egc
$4 = (libxl__egc *) 0x7ffffb291960

after check the code in libxl_osevent_occurred_fd, the egc is created in watchfd_callback. so, EGC_FREE could not handle the ao added by libxl__ao_complete_check_progress_reports. the latter egc is created by AO_CREATE in do_domain_create.
is it correct? if so, could we use the same egc all the time?

here is the full libxl.log:
libxl: debug: libxl_create.c:1230:do_domain_create: ao 0x7fb1d0001e50: create: how=0x7fb1f6448e80 callback=0x7fb1f6210fb0 poller=(nil)
libxl: debug: libxl_device.c:257:libxl__device_disk_set_backend: Disk vdev=hda spec.backend=tap
libxl: debug: libxl_create.c:675:initiate_domain_create: running bootloader
libxl: debug: libxl_bootloader.c:321:libxl__bootloader_run: not a PV domain, skipping bootloader
libxl: debug: libxl_event.c:608:libxl__ev_xswatch_deregister: watch w=0x7fb1d0003078: deregister unregistered
libxl: debug: libxl_numa.c:475:libxl__get_numa_candidate: New best NUMA placement candidate found: nr_nodes=1, nr_cpus=8, nr_vcpus=9, free_memkb=908
libxl: detail: libxl_dom.c:195:numa_place_domain: NUMA placement candidate with 1 nodes, 8 cpus and 908 KB free selected
xc: detail: elf_parse_binary: phdr: paddr=0x100000 memsz=0x9f0c8
xc: detail: elf_parse_binary: memory: 0x100000 -> 0x19f0c8
xc: info: VIRTUAL MEMORY ARRANGEMENT:
  Loader:        0000000000100000->000000000019f0c8
  Modules:       0000000000000000->0000000000000000
  TOTAL:         0000000000000000->000000001f800000
  ENTRY ADDRESS: 0000000000100000
xc: info: PHYSICAL MEMORY ALLOCATION:
  4KB PAGES: 0x0000000000000200
  2MB PAGES: 0x00000000000000fb
  1GB PAGES: 0x0000000000000000
xc: detail: elf_load_binary: phdr 0 at 0x7fb1f2f48000 -> 0x7fb1f2fddf4d
libxl: debug: libxl_device.c:257:libxl__device_disk_set_backend: Disk vdev=hda spec.backend=tap
libxl: debug: libxl_event.c:559:libxl__ev_xswatch_register: watch w=0x7fb1d0002558 wpath=/local/domain/0/backend/vbd/2/768/state token=3/0: register slotnum=3
libxl: debug: libxl_create.c:1243:do_domain_create: ao 0x7fb1d0001e50: inprogress: poller=(nil), flags=i
libxl: debug: libxl_event.c:503:watchfd_callback: watch w=0x7fb1d0002558 wpath=/local/domain/0/backend/vbd/2/768/state token=3/0: event epath=/local/domain/0/backend/vbd/2/768/state
libxl: debug: libxl_event.c:643:devstate_watch_callback: backend /local/domain/0/backend/vbd/2/768/state wanted state 2 ok
libxl: debug: libxl_event.c:596:libxl__ev_xswatch_deregister: watch w=0x7fb1d0002558 wpath=/local/domain/0/backend/vbd/2/768/state token=3/0: deregister slotnum=3
libxl: debug: libxl_event.c:608:libxl__ev_xswatch_deregister: watch w=0x7fb1d0002558: deregister unregistered
libxl: debug: libxl_device.c:959:device_hotplug: calling hotplug script: /etc/xen/scripts/block add
libxl: debug: libxl_dm.c:1218:libxl__spawn_local_dm: Spawning device-model /usr/lib/xen/bin/qemu-system-i386 with arguments:
libxl: debug: libxl_dm.c:1220:libxl__spawn_local_dm:   /usr/lib/xen/bin/qemu-system-i386
libxl: debug: libxl_dm.c:1220:libxl__spawn_local_dm:   -xen-domid
libxl: debug: libxl_dm.c:1220:libxl__spawn_local_dm:   2
libxl: debug: libxl_dm.c:1220:libxl__spawn_local_dm:   -chardev
libxl: debug: libxl_dm.c:1220:libxl__spawn_local_dm:   socket,id=libxl-cmd,path=/var/run/xen/qmp-libxl-2,server,nowait
libxl: debug: libxl_dm.c:1220:libxl__spawn_local_dm:   -mon
libxl: debug: libxl_dm.c:1220:libxl__spawn_local_dm:   chardev=libxl-cmd,mode=control
libxl: debug: libxl_dm.c:1220:libxl__spawn_local_dm:   -name
libxl: debug: libxl_dm.c:1220:libxl__spawn_local_dm:   bjz_04_sles11_sp2
libxl: debug: libxl_dm.c:1220:libxl__spawn_local_dm:   -vnc
libxl: debug: libxl_dm.c:1220:libxl__spawn_local_dm:   127.0.0.1:0,to=99
libxl: debug: libxl_dm.c:1220:libxl__spawn_local_dm:   -global
libxl: debug: libxl_dm.c:1220:libxl__spawn_local_dm:   isa-fdc.driveA=
libxl: debug: libxl_dm.c:1220:libxl__spawn_local_dm:   -vga
libxl: debug: libxl_dm.c:1220:libxl__spawn_local_dm:   cirrus
libxl: debug: libxl_dm.c:1220:libxl__spawn_local_dm:   -global
libxl: debug: libxl_dm.c:1220:libxl__spawn_local_dm:   vga.vram_size_mb=8
libxl: debug: libxl_dm.c:1220:libxl__spawn_local_dm:   -boot
libxl: debug: libxl_dm.c:1220:libxl__spawn_local_dm:   order=c
libxl: debug: libxl_dm.c:1220:libxl__spawn_local_dm:   -net
libxl: debug: libxl_dm.c:1220:libxl__spawn_local_dm:   none
libxl: debug: libxl_dm.c:1220:libxl__spawn_local_dm:   -M
libxl: debug: libxl_dm.c:1220:libxl__spawn_local_dm:   xenfv
libxl: debug: libxl_dm.c:1220:libxl__spawn_local_dm:   -m
libxl: debug: libxl_dm.c:1220:libxl__spawn_local_dm:   504
libxl: debug: libxl_dm.c:1220:libxl__spawn_local_dm:   -drive
libxl: debug: libxl_dm.c:1220:libxl__spawn_local_dm:   file=/var/lib/xen/images_2/bjz_04_sles11_sp2/disk0.raw,if=ide,index=0,media=disk,format=raw,cache=writeback
libxl: debug: libxl_event.c:559:libxl__ev_xswatch_register: watch w=0x7fb1d00032b0 wpath=/local/domain/0/device-model/2/state token=3/1: register slotnum=3
libxl: debug: libxl_event.c:503:watchfd_callback: watch w=0x7fb1d00032b0 wpath=/local/domain/0/device-model/2/state token=3/1: event epath=/local/domain/0/device-model/2/state
libxl: debug: libxl_event.c:596:libxl__ev_xswatch_deregister: watch w=0x7fb1d00032b0 wpath=/local/domain/0/device-model/2/state token=3/1: deregister slotnum=3
libxl: debug: libxl_event.c:472:watchfd_callback: watch epath=/local/domain/0/device-model/2/state token=3/1: empty slot
libxl: debug: libxl_event.c:608:libxl__ev_xswatch_deregister: watch w=0x7fb1d00032b0: deregister unregistered
libxl: debug: libxl_qmp.c:707:libxl__qmp_initialize: connected to /var/run/xen/qmp-libxl-2
libxl: debug: libxl_qmp.c:299:qmp_handle_response: message type: qmp
libxl: debug: libxl_qmp.c:555:qmp_send_prepare: next qmp command: '{
    "execute": "qmp_capabilities",
    "id": 1
}
'
libxl: debug: libxl_qmp.c:299:qmp_handle_response: message type: return
libxl: debug: libxl_qmp.c:555:qmp_send_prepare: next qmp command: '{
    "execute": "query-chardev",
    "id": 2
}
'
libxl: debug: libxl_qmp.c:299:qmp_handle_response: message type: return
libxl: debug: libxl_qmp.c:555:qmp_send_prepare: next qmp command: '{
    "execute": "query-vnc",
    "id": 3
}
'
libxl: debug: libxl_qmp.c:299:qmp_handle_response: message type: return
libxl: debug: libxl_event.c:559:libxl__ev_xswatch_register: watch w=0x7fb1d0007b58 wpath=/local/domain/0/backend/vif/2/0/state token=3/2: register slotnum=3
libxl: debug: libxl_event.c:503:watchfd_callback: watch w=0x7fb1d0007b58 wpath=/local/domain/0/backend/vif/2/0/state token=3/2: event epath=/local/domain/0/backend/vif/2/0/state
libxl: debug: libxl_event.c:643:devstate_watch_callback: backend /local/domain/0/backend/vif/2/0/state wanted state 2 ok
libxl: debug: libxl_event.c:596:libxl__ev_xswatch_deregister: watch w=0x7fb1d0007b58 wpath=/local/domain/0/backend/vif/2/0/state token=3/2: deregister slotnum=3
libxl: debug: libxl_event.c:608:libxl__ev_xswatch_deregister: watch w=0x7fb1d0007b58: deregister unregistered
libxl: debug: libxl_device.c:959:device_hotplug: calling hotplug script: /etc/xen/scripts/vif-bridge online
libxl: debug: libxl_event.c:472:watchfd_callback: watch epath=/local/domain/0/backend/vif/2/0/state token=3/2: empty slot
libxl: debug: libxl_event.c:1737:libxl__ao_progress_report: ao 0x7fb1d0001e50: progress report: ignored
libxl: debug: libxl_event.c:1569:libxl__ao_complete: ao 0x7fb1d0001e50: complete, rc=0
libxl: debug: libxl_event.c:1599:libxl__ao_complete_check_progress_reports: ao 0x7fb1d0001e50: complete for callback
libxl: debug: libxl_event.c:1173:egc_run_callbacks: ao 0x7fb1d0001e50: completion callback
libxl: debug: libxl_event.c:1541:libxl__ao__destroy: ao 0x7fb1d0001e50: destroy
libxl: debug: libxl_event.c:559:libxl__ev_xswatch_register: watch w=0x7fb1d00029b0 wpath=@releaseDomain token=3/3: register slotnum=3
libxl: debug: libxl_event.c:503:watchfd_callback: watch w=0x7fb1d00029b0 wpath=@releaseDomain token=3/3: event epath=@releaseDomain
libxl: debug: libxl.c:1000:domain_death_xswatch_callback: [evg=0x7fb1d0006330:2] from domid=2 nentries=1 rc=1
libxl: debug: libxl.c:1011:domain_death_xswatch_callback: [evg=0x7fb1d0006330:2]   got=domaininfos[0] got->domain=2
libxl: debug: libxl.c:1038:domain_death_xswatch_callback:  exists shutdown_reported=0 dominf.flags=ffff0022
libxl: debug: libxl.c:1004:domain_death_xswatch_callback: [evg=0] all reported
libxl: debug: libxl.c:1068:domain_death_xswatch_callback: domain death search done
xc: debug: hypercall buffer: total allocations:0 total releases:0
xc: debug: hypercall buffer: current allocations:0 maximum allocations:0
xc: debug: hypercall buffer: cache current size:0
xc: debug: hypercall buffer: cache hits:0 misses:0 toobig:0

best regards

bamvor
> The purpose of the "poller" machinery is so that if at step 5 we 
> decide to return an event structure, rather than making a callback, a 
> concurrent call to libxl_event_wait (on a different thread) can be 
> woken up. 
>  
> Can you turn on libxl's debugging and send me a trace of it losing the 
> event ? 
>  
> Ian. 
>  
> _______________________________________________ 
> Xen-devel mailing list 
> Xen-devel@lists.xen.org 
> http://lists.xen.org/xen-devel 
>  
>  

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/2] call ao_how callback explicitly
  2013-11-08 13:17         ` Bamvor Jian Zhang
@ 2013-11-08 16:07           ` Ian Jackson
  0 siblings, 0 replies; 29+ messages in thread
From: Ian Jackson @ 2013-11-08 16:07 UTC (permalink / raw)
  To: Bamvor Jian Zhang; +Cc: Jim Fehlig, xen-devel, ian.campbell, stefano.stabellini

Bamvor Jian Zhang writes ("Re: [Xen-devel] [PATCH 2/2] call ao_how callback explicitly"):
> Ian Jackson <Ian.Jackson@eu.citrix.com> wrote: 
> > It's the other way around.  The async operation completes, probably, 
> > because of an fd event.  So I think the call flow should be: 
> >  
> >  1. Your application notices the fd is readable (say) 
> >  2. Your application calls libxl_osevent_occurred_fd. 
> >  3. libxl_osevent_occurred_fd allocates an egc on entry. 
> >  4. In libxl, libxl_osevent_occurred_fd calls whatever internal 
> >     functions are supposed to respond to the fd.  If appropriate, 
> >     this will cause some ao to complete.  All of these calls 
> >     will receive the same egc as created in step 3.  I.e.: 
> >  5. In libxl, something calls libxl__ao_complete. 
> >  6. libxl__ao_complete puts the ao on the egc's aos_for_callback 
> >     list. 
> >  7. The call stack unwinds inside libxl, eventually returning 
> >     to libxl_osevent_occurred_fd. 
> >  8. libxl_osevent_occurred_fd uses EGC_FREE on exit, and EGC_FREE 
> >     contains a call to libxl__egc_cleanup. 
> >  9. libxl__egc_cleanup calls egc_run_callbacks which finds the ao 
> >     on the aos_for_callback list and makes your application 
> >     callback. 
> 
> with log and gdb, i could see above flow except the ao in egc->aops_for_callback is empty.

The ao is "empty" ?  I don't follow.

> and from gdb i could know that the egc is different between libxl__ao_complete_check_progress_reports and fd event(here is watchfd_callback).
> libxl__ao_complete_check_progress_reports
> (gdb) p egc
> $1 = (libxl__egc *) 0x7fb1fd03f500
> 
> watchfd_callback
> (gdb) p egc
> $4 = (libxl__egc *) 0x7ffffb291960

Every entry into libxl's event system has its own egc.  So if you are
entering two of these functions at once on different threads, each
will have its own egc.

> after check the code in libxl_osevent_occurred_fd, the egc is
> created in watchfd_callback.

No, watchfd_callback doesn't create the egc.  The EGC_GC macro doesn't
create the egc.  It only extracts the gc from it for the benefit of
code inside that function.  The egc will have been passed to
watchfd_callback from its caller.  The caller might be
afterpoll_internal called from libxl_osevent_afterpoll (but you imply
you're not using that) or from eventloop_iteration, which might be
called from libxl_event_wait (which you're not using) or
libxl__ao_inprogress (but only in the synchronous case which you're
not using).  Or it might be libxl_osevent_occurred_fd, which does
create the egc but runs the callbacks on exit.

> so, EGC_FREE could not handle the ao added by
> libxl__ao_complete_check_progress_reports. the latter egc is created
> by AO_CREATE in do_domain_create.  is it correct? if so, could we
> use the same egc all the time?

No, each egc corresponds to one entry into libxl.  That's what they're
for.

> here is the full libxl.log:

This shows only one ao.  Is that right ?  It starts here:

> libxl: debug: libxl_create.c:1230:do_domain_create: ao 0x7fb1d0001e50: create: how=0x7fb1f6448e80 callback=0x7fb1f6210fb0 poller=(nil)

And then, later, it completes:

> libxl: debug: libxl_event.c:1569:libxl__ao_complete: ao 0x7fb1d0001e50: complete, rc=0
> libxl: debug: libxl_event.c:1599:libxl__ao_complete_check_progress_reports: ao 0x7fb1d0001e50: complete for callback
> libxl: debug: libxl_event.c:1173:egc_run_callbacks: ao 0x7fb1d0001e50: completion callback
> libxl: debug: libxl_event.c:1541:libxl__ao__destroy: ao 0x7fb1d0001e50: destroy

Here we see the callback being made.  That message "completion
callback" comes from this code in egc_run_callbacks:

        LOG(DEBUG,"ao %p: completion callback", ao);
        ao->how.callback(CTX, ao->rc, ao->how.u.for_callback);

So this debug trace appears to show the callback actually happening.
What makes you think it isn't ?

Thanks,
Ian.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v2 0/3] libxl: event system: SIGCHLD related fixes
  2013-11-06 12:08         ` Ian Jackson
@ 2013-11-12 18:27           ` Ian Jackson
  2013-11-12 18:27             ` [PATCH 1/3] libxl: event system: pass gc, not just ctx, to internal sigchld manipulators Ian Jackson
                               ` (4 more replies)
  0 siblings, 5 replies; 29+ messages in thread
From: Ian Jackson @ 2013-11-12 18:27 UTC (permalink / raw)
  To: xen-devel; +Cc: Jim Fehlig, Ian.Campbell, Bamvor Jian Zhang

libxl's SIGCHLD handler is broken if libxl_sigchld_owner_libxl_always
and does not always work correctly with the fd registration system.

 1/3 libxl: event system: pass gc, not just ctx, to internal
 2/3 libxl: event system: Make
 3/3 libxl: event system: properly register the SIGCHLD

This is v2 of this series because when I reviewed it myself I found
that there was a critical missing ! in perhaps_removehandler, which
would remove the handler iff it was needed.  IMO this shows the
weakness of my testing arrangements.

I'd really appreciate some feedback on this from Bamvor or Jim.

Thanks,
Ian.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH 1/3] libxl: event system: pass gc, not just ctx, to internal sigchld manipulators
  2013-11-12 18:27           ` [PATCH v2 " Ian Jackson
@ 2013-11-12 18:27             ` Ian Jackson
  2013-11-12 18:27             ` [PATCH 2/3] libxl: event system: Make libxl_sigchld_owner_libxl_always work Ian Jackson
                               ` (3 subsequent siblings)
  4 siblings, 0 replies; 29+ messages in thread
From: Ian Jackson @ 2013-11-12 18:27 UTC (permalink / raw)
  To: xen-devel; +Cc: Jim Fehlig, Ian Jackson, Ian Campbell, Bamvor Jian Zhang

We are going to want the gc for libxl__ev_fd_register.

No functional change in this patch.  Simply change the argument types,
and the actual arguments from ctx to gc.  Inside these functions, use
CTX (the macro which uses gc) rather than the old formal parameter
ctx.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
---
 tools/libxl/libxl.c          |    2 +-
 tools/libxl/libxl_fork.c     |   36 ++++++++++++++++++------------------
 tools/libxl/libxl_internal.h |    4 ++--
 3 files changed, 21 insertions(+), 21 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 0de1112..2ebba98 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -171,7 +171,7 @@ int libxl_ctx_free(libxl_ctx *ctx)
     /* If we have outstanding children, then the application inherits
      * them; we wish the application good luck with understanding
      * this if and when it reaps them. */
-    libxl__sigchld_removehandler(ctx);
+    libxl__sigchld_removehandler(gc);
 
     if (ctx->sigchld_selfpipe[0] >= 0) {
         close(ctx->sigchld_selfpipe[0]);
diff --git a/tools/libxl/libxl_fork.c b/tools/libxl/libxl_fork.c
index b6f0b2d..ae6184c 100644
--- a/tools/libxl/libxl_fork.c
+++ b/tools/libxl/libxl_fork.c
@@ -175,23 +175,23 @@ static void sigchld_removehandler_core(void)
     sigchld_owner = 0;
 }
 
-void libxl__sigchld_removehandler(libxl_ctx *ctx) /* non-reentrant */
+void libxl__sigchld_removehandler(libxl__gc *gc) /* non-reentrant */
 {
     atfork_lock();
-    if (sigchld_owner == ctx)
+    if (sigchld_owner == CTX)
         sigchld_removehandler_core();
     atfork_unlock();
 }
 
-int libxl__sigchld_installhandler(libxl_ctx *ctx) /* non-reentrant */
+int libxl__sigchld_installhandler(libxl__gc *gc) /* non-reentrant */
 {
     int r, rc;
 
-    if (ctx->sigchld_selfpipe[0] < 0) {
-        r = pipe(ctx->sigchld_selfpipe);
+    if (CTX->sigchld_selfpipe[0] < 0) {
+        r = pipe(CTX->sigchld_selfpipe);
         if (r) {
-            ctx->sigchld_selfpipe[0] = -1;
-            LIBXL__LOG_ERRNO(ctx, LIBXL__LOG_ERROR,
+            CTX->sigchld_selfpipe[0] = -1;
+            LIBXL__LOG_ERRNO(CTX, LIBXL__LOG_ERROR,
                              "failed to create sigchld pipe");
             rc = ERROR_FAIL;
             goto out;
@@ -199,11 +199,11 @@ int libxl__sigchld_installhandler(libxl_ctx *ctx) /* non-reentrant */
     }
 
     atfork_lock();
-    if (sigchld_owner != ctx) {
+    if (sigchld_owner != CTX) {
         struct sigaction ours;
 
         assert(!sigchld_owner);
-        sigchld_owner = ctx;
+        sigchld_owner = CTX;
 
         memset(&ours,0,sizeof(ours));
         ours.sa_handler = sigchld_handler;
@@ -241,11 +241,11 @@ int libxl__fork_selfpipe_active(libxl_ctx *ctx)
     return ctx->sigchld_selfpipe[0];
 }
 
-static void perhaps_removehandler(libxl_ctx *ctx)
+static void perhaps_removehandler(libxl__gc *gc)
 {
-    if (LIBXL_LIST_EMPTY(&ctx->children) &&
-        ctx->childproc_hooks->chldowner != libxl_sigchld_owner_libxl_always)
-        libxl__sigchld_removehandler(ctx);
+    if (LIBXL_LIST_EMPTY(&CTX->children) &&
+        CTX->childproc_hooks->chldowner != libxl_sigchld_owner_libxl_always)
+        libxl__sigchld_removehandler(gc);
 }
 
 static int childproc_reaped(libxl__egc *egc, pid_t pid, int status)
@@ -265,7 +265,7 @@ static int childproc_reaped(libxl__egc *egc, pid_t pid, int status)
     ch->pid = -1;
     ch->callback(egc, ch, pid, status);
 
-    perhaps_removehandler(CTX);
+    perhaps_removehandler(gc);
 
     return 0;
 }
@@ -333,7 +333,7 @@ pid_t libxl__ev_child_fork(libxl__gc *gc, libxl__ev_child *ch,
     int rc;
 
     if (chldmode_ours(CTX)) {
-        rc = libxl__sigchld_installhandler(CTX);
+        rc = libxl__sigchld_installhandler(gc);
         if (rc) goto out;
     }
 
@@ -363,7 +363,7 @@ pid_t libxl__ev_child_fork(libxl__gc *gc, libxl__ev_child *ch,
     rc = pid;
 
  out:
-    perhaps_removehandler(CTX);
+    perhaps_removehandler(gc);
     CTX_UNLOCK;
     return rc;
 }
@@ -385,10 +385,10 @@ void libxl_childproc_setmode(libxl_ctx *ctx, const libxl_childproc_hooks *hooks,
     switch (ctx->childproc_hooks->chldowner) {
     case libxl_sigchld_owner_mainloop:
     case libxl_sigchld_owner_libxl:
-        libxl__sigchld_removehandler(ctx);
+        libxl__sigchld_removehandler(gc);
         break;
     case libxl_sigchld_owner_libxl_always:
-        libxl__sigchld_installhandler(ctx);
+        libxl__sigchld_installhandler(gc);
         break;
     default:
         abort();
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 4ad6ad3..d567a10 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -838,8 +838,8 @@ _hidden void libxl__poller_wakeup(libxl__egc *egc, libxl__poller *p);
 
 /* Internal to fork and child reaping machinery */
 extern const libxl_childproc_hooks libxl__childproc_default_hooks;
-int libxl__sigchld_installhandler(libxl_ctx *ctx); /* non-reentrant;logs errs */
-void libxl__sigchld_removehandler(libxl_ctx *ctx); /* non-reentrant */
+int libxl__sigchld_installhandler(libxl__gc*); /* non-reentrant; logs errs */
+void libxl__sigchld_removehandler(libxl__gc*); /* non-reentrant */
 int libxl__fork_selfpipe_active(libxl_ctx *ctx); /* returns read fd or -1 */
 void libxl__fork_selfpipe_woken(libxl__egc *egc);
 int libxl__self_pipe_wakeup(int fd); /* returns 0 or -1 setting errno */
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 2/3] libxl: event system: Make libxl_sigchld_owner_libxl_always work
  2013-11-12 18:27           ` [PATCH v2 " Ian Jackson
  2013-11-12 18:27             ` [PATCH 1/3] libxl: event system: pass gc, not just ctx, to internal sigchld manipulators Ian Jackson
@ 2013-11-12 18:27             ` Ian Jackson
  2013-11-12 18:27             ` [PATCH 3/3] libxl: event system: properly register the SIGCHLD self-pipe Ian Jackson
                               ` (2 subsequent siblings)
  4 siblings, 0 replies; 29+ messages in thread
From: Ian Jackson @ 2013-11-12 18:27 UTC (permalink / raw)
  To: xen-devel; +Cc: Jim Fehlig, Ian Jackson, Ian Campbell, Bamvor Jian Zhang

Previously, libxl_sigchld_owner_libxl_always was mishandled.  It would
result in libxl paying no attention to the sigchld self pipe.

Fix this by fixing chldmode_ours so that it returns true iff we are
supposed to be handling SIGCHLD.

Additionally, we arrange to use chldmode_ours everywhere where we are
installing/removing signal handlers and/or deciding whether to check
the self pipe, etc.  This means it needs a new "creating" flag
argument for the benefit of libxl__ev_child_fork, which needs to
install the signal handler in libxl_sigchld_owner_libxl even if there
are not currently any children.

ctx->childproc_hooks->chldowner is now interpreted only by
chldmode_ours.

Reported-by: Bamvor Jian Zhang <bjzhang@suse.com>
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Cc: Bamvor Jian Zhang <bjzhang@suse.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Jim Fehlig <jfehlig@suse.com>

---
v2: Get sense of chldmode test right in perhaps_removehandler (!)
---
 tools/libxl/libxl_fork.c |   51 ++++++++++++++++++++++++----------------------
 1 file changed, 27 insertions(+), 24 deletions(-)

diff --git a/tools/libxl/libxl_fork.c b/tools/libxl/libxl_fork.c
index ae6184c..a581763 100644
--- a/tools/libxl/libxl_fork.c
+++ b/tools/libxl/libxl_fork.c
@@ -224,18 +224,23 @@ int libxl__sigchld_installhandler(libxl__gc *gc) /* non-reentrant */
     return rc;
 }
 
-static int chldmode_ours(libxl_ctx *ctx)
+static bool chldmode_ours(libxl_ctx *ctx, bool creating)
 {
-    return ctx->childproc_hooks->chldowner == libxl_sigchld_owner_libxl;
+    switch (ctx->childproc_hooks->chldowner) {
+    case libxl_sigchld_owner_libxl:
+        return creating || !LIBXL_LIST_EMPTY(&ctx->children);
+    case libxl_sigchld_owner_mainloop:
+        return 0;
+    case libxl_sigchld_owner_libxl_always:
+        return 1;
+    }
+    abort();
 }
 
 int libxl__fork_selfpipe_active(libxl_ctx *ctx)
 {
     /* Returns the fd to read, or -1 */
-    if (!chldmode_ours(ctx))
-        return -1;
-
-    if (LIBXL_LIST_EMPTY(&ctx->children))
+    if (!chldmode_ours(ctx, 0))
         return -1;
 
     return ctx->sigchld_selfpipe[0];
@@ -243,11 +248,21 @@ int libxl__fork_selfpipe_active(libxl_ctx *ctx)
 
 static void perhaps_removehandler(libxl__gc *gc)
 {
-    if (LIBXL_LIST_EMPTY(&CTX->children) &&
-        CTX->childproc_hooks->chldowner != libxl_sigchld_owner_libxl_always)
+    if (!chldmode_ours(CTX, 0))
         libxl__sigchld_removehandler(gc);
 }
 
+static int perhaps_installhandler(libxl__gc *gc, bool creating)
+{
+    int rc;
+
+    if (chldmode_ours(CTX, creating)) {
+        rc = libxl__sigchld_installhandler(gc);
+        if (rc) return rc;
+    }
+    return 0;
+}
+
 static int childproc_reaped(libxl__egc *egc, pid_t pid, int status)
 {
     EGC_GC;
@@ -286,7 +301,7 @@ void libxl__fork_selfpipe_woken(libxl__egc *egc)
      * ctx must be locked EXACTLY ONCE */
     EGC_GC;
 
-    while (chldmode_ours(CTX) /* in case the app changes the mode */) {
+    while (chldmode_ours(CTX, 0) /* in case the app changes the mode */) {
         int status;
         pid_t pid = waitpid(-1, &status, WNOHANG);
 
@@ -332,10 +347,7 @@ pid_t libxl__ev_child_fork(libxl__gc *gc, libxl__ev_child *ch,
     CTX_LOCK;
     int rc;
 
-    if (chldmode_ours(CTX)) {
-        rc = libxl__sigchld_installhandler(gc);
-        if (rc) goto out;
-    }
+    perhaps_installhandler(gc, 1);
 
     pid_t pid =
         CTX->childproc_hooks->fork_replacement
@@ -382,17 +394,8 @@ void libxl_childproc_setmode(libxl_ctx *ctx, const libxl_childproc_hooks *hooks,
     ctx->childproc_hooks = hooks;
     ctx->childproc_user = user;
 
-    switch (ctx->childproc_hooks->chldowner) {
-    case libxl_sigchld_owner_mainloop:
-    case libxl_sigchld_owner_libxl:
-        libxl__sigchld_removehandler(gc);
-        break;
-    case libxl_sigchld_owner_libxl_always:
-        libxl__sigchld_installhandler(gc);
-        break;
-    default:
-        abort();
-    }
+    perhaps_removehandler(gc);
+    perhaps_installhandler(gc, 0); /* idempotent, ok to ignore errors for now */
 
     CTX_UNLOCK;
     GC_FREE;
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 3/3] libxl: event system: properly register the SIGCHLD self-pipe
  2013-11-12 18:27           ` [PATCH v2 " Ian Jackson
  2013-11-12 18:27             ` [PATCH 1/3] libxl: event system: pass gc, not just ctx, to internal sigchld manipulators Ian Jackson
  2013-11-12 18:27             ` [PATCH 2/3] libxl: event system: Make libxl_sigchld_owner_libxl_always work Ian Jackson
@ 2013-11-12 18:27             ` Ian Jackson
  2013-11-13 10:02             ` [PATCH v2 0/3] libxl: event system: SIGCHLD related fixes Ian Campbell
  2013-11-20  4:28             ` Jim Fehlig
  4 siblings, 0 replies; 29+ messages in thread
From: Ian Jackson @ 2013-11-12 18:27 UTC (permalink / raw)
  To: xen-devel; +Cc: Jim Fehlig, Ian Jackson, Ian Campbell, Bamvor Jian Zhang

An application which uses libxl_osevent_register_hooks, and doesn't
use libxl_sigchld_owner_mainloop, would never properly experience the
deaths of its (libxl) children.

This is because the SIGCHLD self pipe would be handled using ad-hoc
code in beforepoll_internal and afterpoll_internal.  However, this is
no good if application is using the fd registration system instead; in
that case these functions would not be called and nothing would deal
with readability of the self pipe.

Fix this as follows:

The SIGCHLD self pipe now is now dealt with by a standard libxl__ev_fd
handler, which is registered and deregistered along with the SIGCHLD
handler itself.  The handler function subsumes the ad-hoc response
code removed from beforepoll_internal, and the functionality of
libxl__fork_selfpipe_woken.

This is also tidier as the SIGCHLD self pipe is no longer touched
outside libxl_fork.c other than in ctx initialisation and teardown.

(The ad-hoc arrangements for poller->wakeup_pipe in
beforepoll_internal and afterpoll_internal are OK because the
libxl__poller mechanism exists to wake up threads which are sitting
inside libxl's poll loop, so is not applicable to the application's
event loop.)

Reported-by: Bamvor Jian Zhang <bjzhang@suse.com>
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Cc: Bamvor Jian Zhang <bjzhang@suse.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Jim Fehlig <jfehlig@suse.com>
---
 tools/libxl/libxl.c          |    2 ++
 tools/libxl/libxl_event.c    |   12 ----------
 tools/libxl/libxl_fork.c     |   50 ++++++++++++++++++++++++++++++++----------
 tools/libxl/libxl_internal.h |    3 +--
 4 files changed, 42 insertions(+), 25 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 2ebba98..3efc564 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -65,6 +65,7 @@ int libxl_ctx_alloc(libxl_ctx **pctx, int version,
     ctx->childproc_user = 0;
         
     ctx->sigchld_selfpipe[0] = -1;
+    libxl__ev_fd_init(&ctx->sigchld_selfpipe_efd);
 
     /* The mutex is special because we can't idempotently destroy it */
 
@@ -146,6 +147,7 @@ int libxl_ctx_free(libxl_ctx *ctx)
     for (i = 0; i < ctx->watch_nslots; i++)
         assert(!libxl__watch_slot_contents(gc, i));
     libxl__ev_fd_deregister(gc, &ctx->watch_efd);
+    libxl__ev_fd_deregister(gc, &ctx->sigchld_selfpipe_efd);
 
     /* Now there should be no more events requested from the application: */
 
diff --git a/tools/libxl/libxl_event.c b/tools/libxl/libxl_event.c
index 54d15db..bdef7ac 100644
--- a/tools/libxl/libxl_event.c
+++ b/tools/libxl/libxl_event.c
@@ -774,10 +774,6 @@ static int beforepoll_internal(libxl__gc *gc, libxl__poller *poller,
                                                                        \
         REQUIRE_FD(poller->wakeup_pipe[0], POLLIN, BODY);              \
                                                                        \
-        int selfpipe = libxl__fork_selfpipe_active(CTX);               \
-        if (selfpipe >= 0)                                             \
-            REQUIRE_FD(selfpipe, POLLIN, BODY);                        \
-                                                                       \
     }while(0)
 
 #define REQUIRE_FD(req_fd_, req_events_, BODY) do{      \
@@ -999,14 +995,6 @@ static void afterpoll_internal(libxl__egc *egc, libxl__poller *poller,
         if (e) LIBXL__EVENT_DISASTER(egc, "read wakeup", e, 0);
     }
 
-    int selfpipe = libxl__fork_selfpipe_active(CTX);
-    if (selfpipe >= 0 &&
-        afterpoll_check_fd(poller,fds,nfds, selfpipe, POLLIN)) {
-        int e = libxl__self_pipe_eatall(selfpipe);
-        if (e) LIBXL__EVENT_DISASTER(egc, "read sigchld pipe", e, 0);
-        libxl__fork_selfpipe_woken(egc);
-    }
-
     for (;;) {
         libxl__ev_time *etime = LIBXL_TAILQ_FIRST(&CTX->etimes);
         if (!etime)
diff --git a/tools/libxl/libxl_fork.c b/tools/libxl/libxl_fork.c
index a581763..4ae9f94 100644
--- a/tools/libxl/libxl_fork.c
+++ b/tools/libxl/libxl_fork.c
@@ -155,6 +155,9 @@ int libxl__carefd_fd(const libxl__carefd *cf)
  * Actual child process handling
  */
 
+static void sigchld_selfpipe_handler(libxl__egc *egc, libxl__ev_fd *ev,
+                                     int fd, short events, short revents);
+
 static void sigchld_handler(int signo)
 {
     int esave = errno;
@@ -177,10 +180,18 @@ static void sigchld_removehandler_core(void)
 
 void libxl__sigchld_removehandler(libxl__gc *gc) /* non-reentrant */
 {
+    int rc;
+
     atfork_lock();
     if (sigchld_owner == CTX)
         sigchld_removehandler_core();
     atfork_unlock();
+
+    if (libxl__ev_fd_isregistered(&CTX->sigchld_selfpipe_efd)) {
+        rc = libxl__ev_fd_modify(gc, &CTX->sigchld_selfpipe_efd, 0);
+        if (rc)
+            libxl__ev_fd_deregister(gc, &CTX->sigchld_selfpipe_efd);
+    }
 }
 
 int libxl__sigchld_installhandler(libxl__gc *gc) /* non-reentrant */
@@ -197,6 +208,15 @@ int libxl__sigchld_installhandler(libxl__gc *gc) /* non-reentrant */
             goto out;
         }
     }
+    if (!libxl__ev_fd_isregistered(&CTX->sigchld_selfpipe_efd)) {
+        rc = libxl__ev_fd_register(gc, &CTX->sigchld_selfpipe_efd,
+                                   sigchld_selfpipe_handler,
+                                   CTX->sigchld_selfpipe[0], POLLIN);
+        if (rc) goto out;
+    } else {
+        rc = libxl__ev_fd_modify(gc, &CTX->sigchld_selfpipe_efd, POLLIN);
+        if (rc) goto out;
+    }
 
     atfork_lock();
     if (sigchld_owner != CTX) {
@@ -237,15 +257,6 @@ static bool chldmode_ours(libxl_ctx *ctx, bool creating)
     abort();
 }
 
-int libxl__fork_selfpipe_active(libxl_ctx *ctx)
-{
-    /* Returns the fd to read, or -1 */
-    if (!chldmode_ours(ctx, 0))
-        return -1;
-
-    return ctx->sigchld_selfpipe[0];
-}
-
 static void perhaps_removehandler(libxl__gc *gc)
 {
     if (!chldmode_ours(CTX, 0))
@@ -295,12 +306,29 @@ int libxl_childproc_reaped(libxl_ctx *ctx, pid_t pid, int status)
     return rc;
 }
 
-void libxl__fork_selfpipe_woken(libxl__egc *egc)
+static void sigchld_selfpipe_handler(libxl__egc *egc, libxl__ev_fd *ev,
+                                     int fd, short events, short revents)
 {
     /* May make callbacks into the application for child processes.
-     * ctx must be locked EXACTLY ONCE */
+     * So, this function may unlock and relock the CTX.  This is OK
+     * because event callback functions are always called with the CTX
+     * locked exactly once, and from code which copes with reentrancy.
+     * (See also the comment in afterpoll_internal.) */
     EGC_GC;
 
+    int selfpipe = CTX->sigchld_selfpipe[0];
+
+    if (revents & ~POLLIN) {
+        LOG(ERROR, "unexpected poll event 0x%x on SIGCHLD self pipe", revents);
+        LIBXL__EVENT_DISASTER(egc,
+                              "unexpected poll event on SIGCHLD self pipe",
+                              0, 0);
+    }
+    assert(revents & POLLIN);
+
+    int e = libxl__self_pipe_eatall(selfpipe);
+    if (e) LIBXL__EVENT_DISASTER(egc, "read sigchld pipe", e, 0);
+
     while (chldmode_ours(CTX, 0) /* in case the app changes the mode */) {
         int status;
         pid_t pid = waitpid(-1, &status, WNOHANG);
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index d567a10..b1f5f81 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -353,6 +353,7 @@ struct libxl__ctx {
     const libxl_childproc_hooks *childproc_hooks;
     void *childproc_user;
     int sigchld_selfpipe[2]; /* [0]==-1 means handler not installed */
+    libxl__ev_fd sigchld_selfpipe_efd;
     LIBXL_LIST_HEAD(, libxl__ev_child) children;
 
     libxl_version_info version_info;
@@ -840,8 +841,6 @@ _hidden void libxl__poller_wakeup(libxl__egc *egc, libxl__poller *p);
 extern const libxl_childproc_hooks libxl__childproc_default_hooks;
 int libxl__sigchld_installhandler(libxl__gc*); /* non-reentrant; logs errs */
 void libxl__sigchld_removehandler(libxl__gc*); /* non-reentrant */
-int libxl__fork_selfpipe_active(libxl_ctx *ctx); /* returns read fd or -1 */
-void libxl__fork_selfpipe_woken(libxl__egc *egc);
 int libxl__self_pipe_wakeup(int fd); /* returns 0 or -1 setting errno */
 int libxl__self_pipe_eatall(int fd); /* returns 0 or -1 setting errno */
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 0/3] libxl: event system: SIGCHLD related fixes
  2013-11-12 18:27           ` [PATCH v2 " Ian Jackson
                               ` (2 preceding siblings ...)
  2013-11-12 18:27             ` [PATCH 3/3] libxl: event system: properly register the SIGCHLD self-pipe Ian Jackson
@ 2013-11-13 10:02             ` Ian Campbell
  2013-11-21 18:49               ` Ian Jackson
  2013-11-20  4:28             ` Jim Fehlig
  4 siblings, 1 reply; 29+ messages in thread
From: Ian Campbell @ 2013-11-13 10:02 UTC (permalink / raw)
  To: Ian Jackson; +Cc: Jim Fehlig, Bamvor Jian Zhang, xen-devel

On Tue, 2013-11-12 at 18:27 +0000, Ian Jackson wrote:
> libxl's SIGCHLD handler is broken if libxl_sigchld_owner_libxl_always
> and does not always work correctly with the fd registration system.
> 
>  1/3 libxl: event system: pass gc, not just ctx, to internal
>  2/3 libxl: event system: Make
>  3/3 libxl: event system: properly register the SIGCHLD
> 
> This is v2 of this series because when I reviewed it myself I found
> that there was a critical missing ! in perhaps_removehandler, which
> would remove the handler iff it was needed.  IMO this shows the
> weakness of my testing arrangements.

Sorry for also missing this. FWIW my ack from last time can still apply.

Ian.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 0/3] libxl: event system: SIGCHLD related fixes
  2013-11-12 18:27           ` [PATCH v2 " Ian Jackson
                               ` (3 preceding siblings ...)
  2013-11-13 10:02             ` [PATCH v2 0/3] libxl: event system: SIGCHLD related fixes Ian Campbell
@ 2013-11-20  4:28             ` Jim Fehlig
  2013-11-20 15:22               ` 答复: " Bamvor Jian Zhang
  4 siblings, 1 reply; 29+ messages in thread
From: Jim Fehlig @ 2013-11-20  4:28 UTC (permalink / raw)
  To: Bamvor Jian Zhang; +Cc: Ian Jackson, Ian.Campbell, xen-devel

Ian Jackson wrote:
> libxl's SIGCHLD handler is broken if libxl_sigchld_owner_libxl_always
> and does not always work correctly with the fd registration system.
>
>  1/3 libxl: event system: pass gc, not just ctx, to internal
>  2/3 libxl: event system: Make
>  3/3 libxl: event system: properly register the SIGCHLD
>
> This is v2 of this series because when I reviewed it myself I found
> that there was a critical missing ! in perhaps_removehandler, which
> would remove the handler iff it was needed.  IMO this shows the
> weakness of my testing arrangements.
>
> I'd really appreciate some feedback on this from Bamvor or Jim.
>   

Hrm, I haven't followed this thread and am not familiar with the details...

Bamvor, have you reviewed/tested Ian's patches to see if they resolve
the issue you originally raised in this thread?

Thanks,
Jim

^ permalink raw reply	[flat|nested] 29+ messages in thread

* 答复: Re:  [PATCH v2 0/3] libxl: event system: SIGCHLD related fixes
  2013-11-20  4:28             ` Jim Fehlig
@ 2013-11-20 15:22               ` Bamvor Jian Zhang
  2013-11-20 16:40                 ` 答复: Re: [PAT CH " Ian Jackson
  0 siblings, 1 reply; 29+ messages in thread
From: Bamvor Jian Zhang @ 2013-11-20 15:22 UTC (permalink / raw)
  To: Jim Fehlig; +Cc: ian.jackson, Ian.Campbell, xen-devel

hi, 
>>> Jim Fehlig  2013-11-20 下午 12:29 >>>
> Ian Jackson wrote:
> > libxl's SIGCHLD handler is broken if libxl_sigchld_owner_libxl_always
> > and does not always work correctly with the fd registration system.
> >
> >  1/3 libxl: event system: pass gc, not just ctx, to internal
> >  2/3 libxl: event system: Make
> >  3/3 libxl: event system: properly register the SIGCHLD
> >
> > This is v2 of this series because when I reviewed it myself I found
> > that there was a critical missing ! in perhaps_removehandler, which
> > would remove the handler iff it was needed.  IMO this shows the
> > weakness of my testing arrangements.
> >
> > I'd really appreciate some feedback on this from Bamvor or Jim.
> >   

> Hrm, I haven't followed this thread and am not familiar with the details...
> 
> Bamvor, have you reviewed/tested Ian's patches to see if they resolve
> the issue you originally raised in this thread?
sorry for late response.
i write a patch in libvirt in which use ao_how without callback and let libxl
manage the child. 
so fat i test create, save, restore successful. there is a dead-lock in libvirt when i
issue shutdown, i will deal with it tomorrow.


bamvor
> Thanks,
> Jim






_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: 答复: Re:  [PAT CH v2 0/3] libxl: event system: SIGCHLD related  fixes
  2013-11-20 15:22               ` 答复: " Bamvor Jian Zhang
@ 2013-11-20 16:40                 ` Ian Jackson
  0 siblings, 0 replies; 29+ messages in thread
From: Ian Jackson @ 2013-11-20 16:40 UTC (permalink / raw)
  To: Bamvor Jian Zhang; +Cc: Jim Fehlig, Ian.Campbell, xen-devel

Bamvor Jian Zhang writes ("答复: Re: [Xen-devel] [PAT CH v2 0/3] libxl: event system: SIGCHLD related fixes"):
> Jim Fehlig  2013-11-20 下午 12:29 >>>
> > Bamvor, have you reviewed/tested Ian's patches to see if they resolve
> > the issue you originally raised in this thread?
>
> sorry for late response.
> i write a patch in libvirt in which use ao_how without callback and let libxl
> manage the child. 

Err, OK.  I think, though, that at least some of what you reported was
a genuine libxl bug.  So perhaps you didn't need to change libvirt.

Thanks,
Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 0/3] libxl: event system: SIGCHLD related fixes
  2013-11-13 10:02             ` [PATCH v2 0/3] libxl: event system: SIGCHLD related fixes Ian Campbell
@ 2013-11-21 18:49               ` Ian Jackson
  0 siblings, 0 replies; 29+ messages in thread
From: Ian Jackson @ 2013-11-21 18:49 UTC (permalink / raw)
  To: Ian Campbell; +Cc: george.dunlap, Jim Fehlig, Bamvor Jian Zhang, xen-devel

Ian Campbell writes ("Re: [PATCH v2 0/3] libxl: event system: SIGCHLD related fixes"):
> On Tue, 2013-11-12 at 18:27 +0000, Ian Jackson wrote:
> > libxl's SIGCHLD handler is broken if libxl_sigchld_owner_libxl_always
> > and does not always work correctly with the fd registration system.
> > 
> >  1/3 libxl: event system: pass gc, not just ctx, to internal
> >  2/3 libxl: event system: Make
> >  3/3 libxl: event system: properly register the SIGCHLD
> > 
> > This is v2 of this series because when I reviewed it myself I found
> > that there was a critical missing ! in perhaps_removehandler, which
> > would remove the handler iff it was needed.  IMO this shows the
> > weakness of my testing arrangements.
> 
> Sorry for also missing this. FWIW my ack from last time can still apply.

Well, in the absence of more information, and after in-person
confirmation that this (attempted) bugfix would be better exposed for
testing in the tree rather than floating about in emails, I have
pushed these three patches.

Thanks,
ian.

^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2013-11-21 18:49 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-11-05  9:22 [PATCH 0/2] question on libxl ao Bamvor Jian Zhang
2013-11-05  9:22 ` [PATCH 1/2] expose child fd in order to handle child exit in libvirt Bamvor Jian Zhang
2013-11-05 16:05   ` [PATCH 0/2] question on libxl ao [and 1 more messages] Ian Jackson
2013-11-05 19:19     ` [PATCH 0/3] libxl: event system: SIGCHLD related fixes Ian Jackson
2013-11-05 19:19       ` [PATCH 1/3] libxl: event system: pass gc, not just ctx, to internal sigchld manipulators Ian Jackson
2013-11-06 10:30         ` Ian Campbell
2013-11-05 19:19       ` [PATCH 2/3] libxl: event system: Make libxl_sigchld_owner_libxl_always work Ian Jackson
2013-11-06 10:34         ` Ian Campbell
2013-11-05 19:19       ` [PATCH 3/3] libxl: event system: properly register the SIGCHLD self-pipe Ian Jackson
2013-11-06 10:36         ` Ian Campbell
2013-11-06 10:38       ` [PATCH 0/3] libxl: event system: SIGCHLD related fixes Ian Campbell
2013-11-06 12:08         ` Ian Jackson
2013-11-12 18:27           ` [PATCH v2 " Ian Jackson
2013-11-12 18:27             ` [PATCH 1/3] libxl: event system: pass gc, not just ctx, to internal sigchld manipulators Ian Jackson
2013-11-12 18:27             ` [PATCH 2/3] libxl: event system: Make libxl_sigchld_owner_libxl_always work Ian Jackson
2013-11-12 18:27             ` [PATCH 3/3] libxl: event system: properly register the SIGCHLD self-pipe Ian Jackson
2013-11-13 10:02             ` [PATCH v2 0/3] libxl: event system: SIGCHLD related fixes Ian Campbell
2013-11-21 18:49               ` Ian Jackson
2013-11-20  4:28             ` Jim Fehlig
2013-11-20 15:22               ` 答复: " Bamvor Jian Zhang
2013-11-20 16:40                 ` 答复: Re: [PAT CH " Ian Jackson
2013-11-07  3:00     ` [PATCH 0/2] question on libxl ao [and 1 more messages] Bamvor Jian Zhang
2013-11-05 19:23   ` [PATCH 0/2] question on libxl ao [and 1 more messages] " Ian Jackson
2013-11-05  9:22 ` [PATCH 2/2] call ao_how callback explicitly Bamvor Jian Zhang
2013-11-05 15:51   ` Ian Jackson
2013-11-07 14:51     ` Bamvor Jian Zhang
2013-11-07 17:03       ` Ian Jackson
2013-11-08 13:17         ` Bamvor Jian Zhang
2013-11-08 16:07           ` Ian Jackson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.