All of lore.kernel.org
 help / color / mirror / Atom feed
* [Xen-devel] [PATCH 0/8] libxl: event: Fix hang for some applications
@ 2020-01-10 13:28 Ian Jackson
  2020-01-10 13:28 ` [Xen-devel] [PATCH 1/8] libxl: event: Rename poller.fds_changed to .fds_deregistered Ian Jackson
                   ` (7 more replies)
  0 siblings, 8 replies; 9+ messages in thread
From: Ian Jackson @ 2020-01-10 13:28 UTC (permalink / raw)
  To: xen-devel; +Cc: Anthony PERARD, Ian Jackson, George Dunlap, Wei Liu

The meat here, including a description of the bug, is in:
  libxl: event: Fix hang when mixing blocking and eventy calls

I have compiled this but not tested it.  We do not have a good test
suite for this event stuff.  (And races etc are hard to test.)
George, can you check to see whether it fixes the issue you saw ?

If so then I suggest we try to convince ourselves of its correctness
via a second round of code review.  I will certainly want to read it
all again after the weekend, since then I will hopefully have
forgotten enough about this to make that a worthwhile exercise.

Ian Jackson (8):
  libxl: event: Rename poller.fds_changed to .fds_deregistered
  libxl: event: Rename ctx.pollers_fd_changed to .pollers_active
  libxl: event: Introduce CTX_UNLOCK_EGC_FREE
  libxl: event: Fix hang when mixing blocking and eventy calls
  libxl: event: poller pipe optimisation
  libxl: event: Break out baton_wake
  libxl: event: Fix possible hang with libxl_osevent_beforepoll
  libxl: event: Move poller pipe emptying to the end of afterpoll

 tools/libxl/libxl.c          |   4 +-
 tools/libxl/libxl_event.c    | 126 ++++++++++++++++++++++++++++++-------------
 tools/libxl/libxl_fork.c     |   6 +--
 tools/libxl/libxl_internal.h |  42 +++++++++++----
 4 files changed, 127 insertions(+), 51 deletions(-)

-- 
2.11.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Xen-devel] [PATCH 1/8] libxl: event: Rename poller.fds_changed to .fds_deregistered
  2020-01-10 13:28 [Xen-devel] [PATCH 0/8] libxl: event: Fix hang for some applications Ian Jackson
@ 2020-01-10 13:28 ` Ian Jackson
  2020-01-10 13:28 ` [Xen-devel] [PATCH 2/8] libxl: event: Rename ctx.pollers_fd_changed to .pollers_active Ian Jackson
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Ian Jackson @ 2020-01-10 13:28 UTC (permalink / raw)
  To: xen-devel; +Cc: Anthony PERARD, Ian Jackson, George Dunlap, Wei Liu

This is only for deregistration.  We are going to add another variable
for new events, with different semantics, and this overly-general name
will become confusing.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
---
 tools/libxl/libxl_event.c    | 8 ++++----
 tools/libxl/libxl_internal.h | 6 +++---
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/tools/libxl/libxl_event.c b/tools/libxl/libxl_event.c
index aa8b7d1945..1210c1bfb3 100644
--- a/tools/libxl/libxl_event.c
+++ b/tools/libxl/libxl_event.c
@@ -239,7 +239,7 @@ void libxl__ev_fd_deregister(libxl__gc *gc, libxl__ev_fd *ev)
     ev->fd = -1;
 
     LIBXL_LIST_FOREACH(poller, &CTX->pollers_fds_changed, fds_changed_entry)
-        poller->fds_changed = 1;
+        poller->fds_deregistered = 1;
 
  out:
     CTX_UNLOCK;
@@ -1120,7 +1120,7 @@ static int beforepoll_internal(libxl__gc *gc, libxl__poller *poller,
 
     *nfds_io = used;
 
-    poller->fds_changed = 0;
+    poller->fds_deregistered = 0;
 
     libxl__ev_time *etime = LIBXL_TAILQ_FIRST(&CTX->etimes);
     if (etime) {
@@ -1186,7 +1186,7 @@ static int afterpoll_check_fd(libxl__poller *poller,
             /* again, stale slot entry */
             continue;
 
-        assert(poller->fds_changed || !(fds[slot].revents & POLLNVAL));
+        assert(poller->fds_deregistered || !(fds[slot].revents & POLLNVAL));
 
         /* we mask in case requested events have changed */
         int slot_revents = fds[slot].revents & events;
@@ -1626,7 +1626,7 @@ int libxl__poller_init(libxl__gc *gc, libxl__poller *p)
     int rc;
     p->fd_polls = 0;
     p->fd_rindices = 0;
-    p->fds_changed = 0;
+    p->fds_deregistered = 0;
 
     rc = libxl__pipe_nonblock(CTX, p->wakeup_pipe);
     if (rc) goto out;
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index ba8c9b41ab..c5b71d15f0 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -629,14 +629,14 @@ struct libxl__poller {
     /*
      * We also use the poller to record whether any fds have been
      * deregistered since we entered poll.  Each poller which is not
-     * idle is on the list pollers_fds_changed.  fds_changed is
+     * idle is on the list pollers_fds_changed.  fds_deregistered is
      * cleared by beforepoll, and tested by afterpoll.  Whenever an fd
-     * event is deregistered, we set the fds_changed of all non-idle
+     * event is deregistered, we set the fds_deregistered of all non-idle
      * pollers.  So afterpoll can tell whether any POLLNVAL is
      * plausibly due to an fd being closed and reopened.
      */
     LIBXL_LIST_ENTRY(libxl__poller) fds_changed_entry;
-    bool fds_changed;
+    bool fds_deregistered;
 };
 
 struct libxl__gc {
-- 
2.11.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [Xen-devel] [PATCH 2/8] libxl: event: Rename ctx.pollers_fd_changed to .pollers_active
  2020-01-10 13:28 [Xen-devel] [PATCH 0/8] libxl: event: Fix hang for some applications Ian Jackson
  2020-01-10 13:28 ` [Xen-devel] [PATCH 1/8] libxl: event: Rename poller.fds_changed to .fds_deregistered Ian Jackson
@ 2020-01-10 13:28 ` Ian Jackson
  2020-01-10 13:28 ` [Xen-devel] [PATCH 3/8] libxl: event: Introduce CTX_UNLOCK_EGC_FREE Ian Jackson
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Ian Jackson @ 2020-01-10 13:28 UTC (permalink / raw)
  To: xen-devel; +Cc: Anthony PERARD, Ian Jackson, George Dunlap, Wei Liu

We are going to use this a bit more widely.  Make the name more
general.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
---
 tools/libxl/libxl.c          | 4 ++--
 tools/libxl/libxl_event.c    | 8 ++++----
 tools/libxl/libxl_internal.h | 6 +++---
 3 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index a0d84281d0..f60fd3e4fd 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -48,7 +48,7 @@ int libxl_ctx_alloc(libxl_ctx **pctx, int version,
     ctx->poller_app = 0;
     LIBXL_LIST_INIT(&ctx->pollers_event);
     LIBXL_LIST_INIT(&ctx->pollers_idle);
-    LIBXL_LIST_INIT(&ctx->pollers_fds_changed);
+    LIBXL_LIST_INIT(&ctx->pollers_active);
 
     LIBXL_LIST_INIT(&ctx->efds);
     LIBXL_TAILQ_INIT(&ctx->etimes);
@@ -177,7 +177,7 @@ int libxl_ctx_free(libxl_ctx *ctx)
     libxl__poller_put(ctx, ctx->poller_app);
     ctx->poller_app = NULL;
     assert(LIBXL_LIST_EMPTY(&ctx->pollers_event));
-    assert(LIBXL_LIST_EMPTY(&ctx->pollers_fds_changed));
+    assert(LIBXL_LIST_EMPTY(&ctx->pollers_active));
     libxl__poller *poller, *poller_tmp;
     LIBXL_LIST_FOREACH_SAFE(poller, &ctx->pollers_idle, entry, poller_tmp) {
         libxl__poller_dispose(poller);
diff --git a/tools/libxl/libxl_event.c b/tools/libxl/libxl_event.c
index 1210c1bfb3..5b12a45e70 100644
--- a/tools/libxl/libxl_event.c
+++ b/tools/libxl/libxl_event.c
@@ -238,7 +238,7 @@ void libxl__ev_fd_deregister(libxl__gc *gc, libxl__ev_fd *ev)
     LIBXL_LIST_REMOVE(ev, entry);
     ev->fd = -1;
 
-    LIBXL_LIST_FOREACH(poller, &CTX->pollers_fds_changed, fds_changed_entry)
+    LIBXL_LIST_FOREACH(poller, &CTX->pollers_active, active_entry)
         poller->fds_deregistered = 1;
 
  out:
@@ -1663,15 +1663,15 @@ libxl__poller *libxl__poller_get(libxl__gc *gc)
         }
     }
 
-    LIBXL_LIST_INSERT_HEAD(&CTX->pollers_fds_changed, p,
-                           fds_changed_entry);
+    LIBXL_LIST_INSERT_HEAD(&CTX->pollers_active, p,
+                           active_entry);
     return p;
 }
 
 void libxl__poller_put(libxl_ctx *ctx, libxl__poller *p)
 {
     if (!p) return;
-    LIBXL_LIST_REMOVE(p, fds_changed_entry);
+    LIBXL_LIST_REMOVE(p, active_entry);
     LIBXL_LIST_INSERT_HEAD(&ctx->pollers_idle, p, entry);
 }
 
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index c5b71d15f0..581d64b99c 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -629,13 +629,13 @@ struct libxl__poller {
     /*
      * We also use the poller to record whether any fds have been
      * deregistered since we entered poll.  Each poller which is not
-     * idle is on the list pollers_fds_changed.  fds_deregistered is
+     * idle is on the list pollers_active.  fds_deregistered is
      * cleared by beforepoll, and tested by afterpoll.  Whenever an fd
      * event is deregistered, we set the fds_deregistered of all non-idle
      * pollers.  So afterpoll can tell whether any POLLNVAL is
      * plausibly due to an fd being closed and reopened.
      */
-    LIBXL_LIST_ENTRY(libxl__poller) fds_changed_entry;
+    LIBXL_LIST_ENTRY(libxl__poller) active_entry;
     bool fds_deregistered;
 };
 
@@ -678,7 +678,7 @@ struct libxl__ctx {
 
     libxl__poller *poller_app; /* libxl_osevent_beforepoll and _afterpoll */
     LIBXL_LIST_HEAD(, libxl__poller) pollers_event, pollers_idle;
-    LIBXL_LIST_HEAD(, libxl__poller) pollers_fds_changed;
+    LIBXL_LIST_HEAD(, libxl__poller) pollers_active;
 
     LIBXL_SLIST_HEAD(libxl__osevent_hook_nexi, libxl__osevent_hook_nexus)
         hook_fd_nexi_idle, hook_timeout_nexi_idle;
-- 
2.11.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [Xen-devel] [PATCH 3/8] libxl: event: Introduce CTX_UNLOCK_EGC_FREE
  2020-01-10 13:28 [Xen-devel] [PATCH 0/8] libxl: event: Fix hang for some applications Ian Jackson
  2020-01-10 13:28 ` [Xen-devel] [PATCH 1/8] libxl: event: Rename poller.fds_changed to .fds_deregistered Ian Jackson
  2020-01-10 13:28 ` [Xen-devel] [PATCH 2/8] libxl: event: Rename ctx.pollers_fd_changed to .pollers_active Ian Jackson
@ 2020-01-10 13:28 ` Ian Jackson
  2020-01-10 13:28 ` [Xen-devel] [PATCH 4/8] libxl: event: Fix hang when mixing blocking and eventy calls Ian Jackson
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Ian Jackson @ 2020-01-10 13:28 UTC (permalink / raw)
  To: xen-devel; +Cc: Anthony PERARD, Ian Jackson, George Dunlap, Wei Liu

This is a very common exit pattern.  We are going to want to change
this pattern.  So we should make it into a macro of its own.

No functional change.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
---
 tools/libxl/libxl_event.c    | 18 ++++++------------
 tools/libxl/libxl_fork.c     |  6 ++----
 tools/libxl/libxl_internal.h |  2 ++
 3 files changed, 10 insertions(+), 16 deletions(-)

diff --git a/tools/libxl/libxl_event.c b/tools/libxl/libxl_event.c
index 5b12a45e70..be37e12bb0 100644
--- a/tools/libxl/libxl_event.c
+++ b/tools/libxl/libxl_event.c
@@ -1152,8 +1152,7 @@ int libxl_osevent_beforepoll(libxl_ctx *ctx, int *nfds_io,
     CTX_LOCK;
     int rc = beforepoll_internal(gc, ctx->poller_app,
                                  nfds_io, fds, timeout_upd, now);
-    CTX_UNLOCK;
-    EGC_FREE;
+    CTX_UNLOCK_EGC_FREE;
     return rc;
 }
 
@@ -1305,8 +1304,7 @@ void libxl_osevent_afterpoll(libxl_ctx *ctx, int nfds, const struct pollfd *fds,
     EGC_INIT(ctx);
     CTX_LOCK;
     afterpoll_internal(egc, ctx->poller_app, nfds, fds, now);
-    CTX_UNLOCK;
-    EGC_FREE;
+    CTX_UNLOCK_EGC_FREE;
 }
 
 /*
@@ -1342,8 +1340,7 @@ void libxl_osevent_occurred_fd(libxl_ctx *ctx, void *for_libxl,
     fd_occurs(egc, ev, revents_ign);
 
  out:
-    CTX_UNLOCK;
-    EGC_FREE;
+    CTX_UNLOCK_EGC_FREE;
 }
 
 void libxl_osevent_occurred_timeout(libxl_ctx *ctx, void *for_libxl)
@@ -1365,8 +1362,7 @@ void libxl_osevent_occurred_timeout(libxl_ctx *ctx, void *for_libxl)
     time_occurs(egc, ev, ERROR_TIMEDOUT);
 
  out:
-    CTX_UNLOCK;
-    EGC_FREE;
+    CTX_UNLOCK_EGC_FREE;
 }
 
 void libxl__event_disaster(libxl__egc *egc, const char *msg, int errnoval,
@@ -1546,8 +1542,7 @@ int libxl_event_check(libxl_ctx *ctx, libxl_event **event_r,
     EGC_INIT(ctx);
     CTX_LOCK;
     int rc = event_check_internal(egc, event_r, typemask, pred, pred_user);
-    CTX_UNLOCK;
-    EGC_FREE;
+    CTX_UNLOCK_EGC_FREE;
     return rc;
 }
 
@@ -1772,8 +1767,7 @@ int libxl_event_wait(libxl_ctx *ctx, libxl_event **event_r,
  out:
     libxl__poller_put(ctx, poller);
 
-    CTX_UNLOCK;
-    EGC_FREE;
+    CTX_UNLOCK_EGC_FREE;
     return rc;
 }
 
diff --git a/tools/libxl/libxl_fork.c b/tools/libxl/libxl_fork.c
index 0f1b6b518c..cf170b9085 100644
--- a/tools/libxl/libxl_fork.c
+++ b/tools/libxl/libxl_fork.c
@@ -483,8 +483,7 @@ int libxl_childproc_reaped(libxl_ctx *ctx, pid_t pid, int status)
     assert(CTX->childproc_hooks->chldowner
            == libxl_sigchld_owner_mainloop);
     int rc = childproc_reaped(egc, pid, status);
-    CTX_UNLOCK;
-    EGC_FREE;
+    CTX_UNLOCK_EGC_FREE;
     return rc;
 }
 
@@ -529,8 +528,7 @@ void libxl_childproc_sigchld_occurred(libxl_ctx *ctx)
     assert(CTX->childproc_hooks->chldowner
            == libxl_sigchld_owner_mainloop);
     childproc_checkall(egc);
-    CTX_UNLOCK;
-    EGC_FREE;
+    CTX_UNLOCK_EGC_FREE;
 }
 
 static void sigchld_selfpipe_handler(libxl__egc *egc, libxl__ev_fd *ev,
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 581d64b99c..983fffac7a 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -2363,6 +2363,8 @@ _hidden void libxl__egc_cleanup(libxl__egc *egc);
 
 #define EGC_FREE           libxl__egc_cleanup(egc)
 
+#define CTX_UNLOCK_EGC_FREE  do{ CTX_UNLOCK; EGC_FREE; }while(0)
+
 
 /*
  * Machinery for asynchronous operations ("ao")
-- 
2.11.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [Xen-devel] [PATCH 4/8] libxl: event: Fix hang when mixing blocking and eventy calls
  2020-01-10 13:28 [Xen-devel] [PATCH 0/8] libxl: event: Fix hang for some applications Ian Jackson
                   ` (2 preceding siblings ...)
  2020-01-10 13:28 ` [Xen-devel] [PATCH 3/8] libxl: event: Introduce CTX_UNLOCK_EGC_FREE Ian Jackson
@ 2020-01-10 13:28 ` Ian Jackson
  2020-01-10 13:28 ` [Xen-devel] [PATCH 5/8] libxl: event: poller pipe optimisation Ian Jackson
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Ian Jackson @ 2020-01-10 13:28 UTC (permalink / raw)
  To: xen-devel; +Cc: Anthony PERARD, Ian Jackson, George Dunlap, Wei Liu

If the application calls libxl with ao_how==0 and also makes calls
like _occurred, libxl will sometimes get stuck.

The bug happens as follows (for example):

  Thread A
       libxl_do_thing(,ao_how==0)
       libxl_do_thing starts, sets up some callbacks
       libxl_do_thing exit path calls AO_INPROGRESS
       libxl__ao_inprogress goes into event loop
       eventloop_iteration sleeps on:
          - do_thing's current fd set
          - sigchld pipe if applicable
          - its poller

  Thread B
       libxl_something_occurred
       the something is to do with do_thing, above
       do_thing_next_callback does some more work
       do_thing_next_callback becomes interested in fd N
       thread B returns to application

Note that nothing wakes up thread A.  A is not listening on fd N.  So
do_thing_* will not spot when fd N signals.  do_thing will not make
further timely progress.  If there is no timeout thread A will never
wake up.

The problem here occurs because thread A is waiting on an out of date
osevent set.  We need the following property: whenever any thread is
blocking in the libxl event loop, at least one thread must be using an
up to date osevent set.  It is OK for all but one threads to have
stale event sets, because so long as one waiting thread has the right
event set, any actually interesting event will, if nothing else, wake
that "right" thread up.  It will then make some progress and/or, if it
exits, ensure that some other thread becomes the "right" thread.

There is also the possibility that a thread might block waiting for
libxl osevents but outside libxl, eg if the application used
libxl_osevent_beforepoll.  We will deal with that in a moment.

Reported-by: George Dunlap <george.dunlap@citrix.com>
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
---
 tools/libxl/libxl_event.c    | 72 +++++++++++++++++++++++++++++++++++++-------
 tools/libxl/libxl_internal.h | 33 ++++++++++++++++----
 2 files changed, 88 insertions(+), 17 deletions(-)

diff --git a/tools/libxl/libxl_event.c b/tools/libxl/libxl_event.c
index be37e12bb0..18db091a6e 100644
--- a/tools/libxl/libxl_event.c
+++ b/tools/libxl/libxl_event.c
@@ -36,6 +36,42 @@ static libxl__ao *ao_nested_root(libxl__ao *ao);
 static void ao__check_destroy(libxl_ctx *ctx, libxl__ao *ao);
 
 
+static void pollers_note_osevent_added(libxl_ctx *ctx) {
+    libxl__poller *poller;
+    LIBXL_LIST_FOREACH(poller, &ctx->pollers_active, active_entry)
+        poller->osevents_added = 1;
+}
+
+void libxl__egc_cleanup_1_baton(libxl__egc *egc)
+{
+    EGC_GC;
+    libxl__poller *search, *wake=0;
+
+    LIBXL_LIST_FOREACH(search, &CTX->pollers_active, active_entry) {
+        if (search == CTX->poller_app)
+            /* This one is special.  We can't give it the baton. */
+            continue;
+        if (!search->osevents_added)
+            /* This poller is up to date and will wake up as needed. */
+            return;
+        if (!wake)
+            wake = search;
+    }
+
+    if (!wake)
+        /* no-one in libxl waiting for any events */
+        return;
+
+    libxl__poller_wakeup(egc, wake);
+
+    wake->osevents_added = 0;
+    /* This serves to make _1_baton idempotent.  It is OK even though
+     * that poller may currently be sleeping on only old osevents,
+     * because it is going to wake up because we've just prodded it,
+     * and it pick up new osevents on its next iteration (or pass
+     * on the baton). */
+}
+
 /*
  * The counter osevent_in_hook is used to ensure that the application
  * honours the reentrancy restriction documented in libxl_event.h.
@@ -194,6 +230,7 @@ int libxl__ev_fd_register(libxl__gc *gc, libxl__ev_fd *ev,
     ev->func = func;
 
     LIBXL_LIST_INSERT_HEAD(&CTX->efds, ev, entry);
+    pollers_note_osevent_added(CTX);
 
     rc = 0;
 
@@ -214,6 +251,8 @@ int libxl__ev_fd_modify(libxl__gc *gc, libxl__ev_fd *ev, short events)
     rc = OSEVENT_HOOK(fd,modify, noop, ev->fd, &ev->nexus->for_app_reg, events);
     if (rc) goto out;
 
+    if ((events & ~ev->events))
+        pollers_note_osevent_added(CTX);
     ev->events = events;
 
     rc = 0;
@@ -315,6 +354,7 @@ static int time_register_finite(libxl__gc *gc, libxl__ev_time *ev,
     LIBXL_TAILQ_INSERT_SORTED(&CTX->etimes, entry, ev, evsearch, /*empty*/,
                               timercmp(&ev->abs, &evsearch->abs, >));
 
+    pollers_note_osevent_added(CTX);
     return 0;
 }
 
@@ -1121,6 +1161,7 @@ static int beforepoll_internal(libxl__gc *gc, libxl__poller *poller,
     *nfds_io = used;
 
     poller->fds_deregistered = 0;
+    poller->osevents_added = 0;
 
     libxl__ev_time *etime = LIBXL_TAILQ_FIRST(&CTX->etimes);
     if (etime) {
@@ -1444,7 +1485,7 @@ static void egc_run_callbacks(libxl__egc *egc)
     }
 }
 
-void libxl__egc_cleanup(libxl__egc *egc)
+void libxl__egc_cleanup_2_ul_cb_gc(libxl__egc *egc)
 {
     EGC_GC;
     egc_run_callbacks(egc);
@@ -1754,13 +1795,15 @@ int libxl_event_wait(libxl_ctx *ctx, libxl_event **event_r,
         rc = eventloop_iteration(egc, poller);
         if (rc) goto out;
 
-        /* we unlock and cleanup the egc each time we go through this loop,
-         * so that (a) we don't accumulate garbage and (b) any events
-         * which are to be dispatched by callback are actually delivered
-         * in a timely fashion.
+        /* we unlock and cleanup the egc each time we go through this
+         * loop, so that (a) we don't accumulate garbage and (b) any
+         * events which are to be dispatched by callback are actually
+         * delivered in a timely fashion.  _1_baton will be
+         * called to pass the baton iff we actually leave; otherwise
+         * we are still carrying it.
          */
         CTX_UNLOCK;
-        libxl__egc_cleanup(egc);
+        libxl__egc_cleanup_2_ul_cb_gc(egc);
         CTX_LOCK;
     }
 
@@ -2033,10 +2076,12 @@ int libxl__ao_inprogress(libxl__ao *ao,
                  * synchronous cancellation ability. */
             }
 
+            /* The call to egc..1_baton is below, only if we are leaving. */
             CTX_UNLOCK;
-            libxl__egc_cleanup(&egc);
+            libxl__egc_cleanup_2_ul_cb_gc(&egc);
             CTX_LOCK;
         }
+        libxl__egc_cleanup_1_baton(&egc);
     } else {
         rc = 0;
     }
@@ -2053,6 +2098,9 @@ int libxl__ao_inprogress(libxl__ao *ao,
 static int ao__abort(libxl_ctx *ctx, libxl__ao *parent)
 /* Temporarily unlocks ctx, which must be locked exactly once on entry. */
 {
+    libxl__egc egc;
+    LIBXL_INIT_EGC(egc,ctx);
+
     int rc;
     ao__manip_enter(parent);
 
@@ -2073,9 +2121,6 @@ static int ao__abort(libxl_ctx *ctx, libxl__ao *parent)
 
     /* We keep calling abort hooks until there are none left */
     while (!LIBXL_LIST_EMPTY(&parent->abortables)) {
-        libxl__egc egc;
-        LIBXL_INIT_EGC(egc,ctx);
-
         assert(!parent->complete);
 
         libxl__ao_abortable *abrt = LIBXL_LIST_FIRST(&parent->abortables);
@@ -2088,8 +2133,9 @@ static int ao__abort(libxl_ctx *ctx, libxl__ao *parent)
                    "ao %p: abrt=%p: aborting", parent, abrt->ao);
         abrt->callback(&egc, abrt, ERROR_ABORTED);
 
+        /* The call to egc..1_baton is in the out block below. */
         libxl__ctx_unlock(ctx);
-        libxl__egc_cleanup(&egc);
+        libxl__egc_cleanup_2_ul_cb_gc(&egc);
         libxl__ctx_lock(ctx);
     }
 
@@ -2097,6 +2143,10 @@ static int ao__abort(libxl_ctx *ctx, libxl__ao *parent)
 
  out:
     ao__manip_leave(ctx, parent);
+    /* The call to egc..2_ul_cb_gc is above.  This is sufficient
+     * because only code inside the loop adds anything to the egc, and
+     * we ensures that the egc is clean when we leave the loop. */
+    libxl__egc_cleanup_1_baton(&egc);
     return rc;
 }
 
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 983fffac7a..b9c4031863 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -634,9 +634,23 @@ struct libxl__poller {
      * event is deregistered, we set the fds_deregistered of all non-idle
      * pollers.  So afterpoll can tell whether any POLLNVAL is
      * plausibly due to an fd being closed and reopened.
+     *
+     * Additionally, we record whether any fd or time event sources
+     * have been registered.  This is necessary because sometimes we
+     * need to wake up the only libxl thread stuck in
+     * eventloop_iteration so that it will pick up new fds or earlier
+     * timeouts.  osevents_added is cleared by beforepoll, and set by
+     * fd or timeout event registration.  When we are about to leave
+     * libxl (strictly, when we are about to give up an egc), we check
+     * whether there are any pollers.  If there are, then at least one
+     * of them must have osevents_added clear.  If not, we wake up the
+     * first one on the list.  Any entry on pollers_active constitutes
+     * a promise to also make this check, so the baton will never be
+     * dropped.
      */
     LIBXL_LIST_ENTRY(libxl__poller) active_entry;
     bool fds_deregistered;
+    bool osevents_added;
 };
 
 struct libxl__gc {
@@ -2350,7 +2364,10 @@ _hidden libxl_device_model_version libxl__default_device_model(libxl__gc *gc);
         LIBXL_STAILQ_INIT(&(egc).ev_immediates);        \
     } while(0)
 
-_hidden void libxl__egc_cleanup(libxl__egc *egc);
+_hidden void libxl__egc_cleanup_1_baton(libxl__egc *egc);
+  /* Passes the baton for added osevents.  See comment for
+   * osevents_added in struct libxl__poller. */
+_hidden void libxl__egc_cleanup_2_ul_cb_gc(libxl__egc *egc);
   /* Frees memory allocated within this egc's gc, and and report all
    * occurred events via callback, if applicable.  May reenter the
    * application; see restrictions above.  The ctx must be UNLOCKED. */
@@ -2361,9 +2378,11 @@ _hidden void libxl__egc_cleanup(libxl__egc *egc);
     libxl__egc egc[1]; LIBXL_INIT_EGC(egc[0],ctx);      \
     EGC_GC
 
-#define EGC_FREE           libxl__egc_cleanup(egc)
-
-#define CTX_UNLOCK_EGC_FREE  do{ CTX_UNLOCK; EGC_FREE; }while(0)
+#define CTX_UNLOCK_EGC_FREE  do{                \
+        libxl__egc_cleanup_1_baton(egc);        \
+        CTX_UNLOCK;                             \
+        libxl__egc_cleanup_2_ul_cb_gc(egc);     \
+    }while(0)
 
 
 /*
@@ -2468,8 +2487,9 @@ _hidden void libxl__egc_cleanup(libxl__egc *egc);
 
 #define AO_INPROGRESS ({                                        \
         libxl_ctx *ao__ctx = libxl__gc_owner(&ao->gc);          \
+        /* __ao_inprogress will do egc..1_baton if needed */	\
         CTX_UNLOCK;                                             \
-        EGC_FREE;                                               \
+        libxl__egc_cleanup_2_ul_cb_gc(egc);                     \
         CTX_LOCK;                                               \
         int ao__rc = libxl__ao_inprogress(ao,                   \
                                __FILE__, __LINE__, __func__);   \
@@ -2481,8 +2501,9 @@ _hidden void libxl__egc_cleanup(libxl__egc *egc);
         libxl_ctx *ao__ctx = libxl__gc_owner(&ao->gc);          \
         assert(rc);                                             \
         libxl__ao_create_fail(ao);                              \
+        libxl__egc_cleanup_1_baton(egc);                        \
         libxl__ctx_unlock(ao__ctx); /* gc is now invalid */     \
-        EGC_FREE;                                               \
+        libxl__egc_cleanup_2_ul_cb_gc(egc);                     \
         (rc);                                                   \
     })
 
-- 
2.11.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [Xen-devel] [PATCH 5/8] libxl: event: poller pipe optimisation
  2020-01-10 13:28 [Xen-devel] [PATCH 0/8] libxl: event: Fix hang for some applications Ian Jackson
                   ` (3 preceding siblings ...)
  2020-01-10 13:28 ` [Xen-devel] [PATCH 4/8] libxl: event: Fix hang when mixing blocking and eventy calls Ian Jackson
@ 2020-01-10 13:28 ` Ian Jackson
  2020-01-10 13:29 ` [Xen-devel] [PATCH 6/8] libxl: event: Break out baton_wake Ian Jackson
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Ian Jackson @ 2020-01-10 13:28 UTC (permalink / raw)
  To: xen-devel; +Cc: Anthony PERARD, Ian Jackson, George Dunlap, Wei Liu

Track in userland whether the poller pipe is nonempty.  This saves us
writing many many bytes to the pipe if nothing ever reads them.

This is going to be relevant in a moment, where we are going to create
a situation where this will happen quite a lot.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
---
 tools/libxl/libxl_event.c    | 2 ++
 tools/libxl/libxl_internal.h | 1 +
 2 files changed, 3 insertions(+)

diff --git a/tools/libxl/libxl_event.c b/tools/libxl/libxl_event.c
index 18db091a6e..05559cad9a 100644
--- a/tools/libxl/libxl_event.c
+++ b/tools/libxl/libxl_event.c
@@ -1319,6 +1319,7 @@ static void afterpoll_internal(libxl__egc *egc, libxl__poller *poller,
     }
 
     if (afterpoll_check_fd(poller,fds,nfds, poller->wakeup_pipe[0],POLLIN)) {
+        poller->pipe_nonempty = 0;
         int e = libxl__self_pipe_eatall(poller->wakeup_pipe[0]);
         if (e) LIBXL__EVENT_DISASTER(egc, "read wakeup", e, 0);
     }
@@ -1713,6 +1714,7 @@ void libxl__poller_put(libxl_ctx *ctx, libxl__poller *p)
 
 void libxl__poller_wakeup(libxl__egc *egc, libxl__poller *p)
 {
+    if (p->pipe_nonempty++) return;
     int e = libxl__self_pipe_wakeup(p->wakeup_pipe[1]);
     if (e) LIBXL__EVENT_DISASTER(egc, "cannot poke watch pipe", e, 0);
 }
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index b9c4031863..baf7509b1e 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -625,6 +625,7 @@ struct libxl__poller {
     int (*fd_rindices)[3]; /* see libxl_event.c:beforepoll_internal */
 
     int wakeup_pipe[2]; /* 0 means no fd allocated */
+    bool pipe_nonempty;
 
     /*
      * We also use the poller to record whether any fds have been
-- 
2.11.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [Xen-devel] [PATCH 6/8] libxl: event: Break out baton_wake
  2020-01-10 13:28 [Xen-devel] [PATCH 0/8] libxl: event: Fix hang for some applications Ian Jackson
                   ` (4 preceding siblings ...)
  2020-01-10 13:28 ` [Xen-devel] [PATCH 5/8] libxl: event: poller pipe optimisation Ian Jackson
@ 2020-01-10 13:29 ` Ian Jackson
  2020-01-10 13:29 ` [Xen-devel] [PATCH 7/8] libxl: event: Fix possible hang with libxl_osevent_beforepoll Ian Jackson
  2020-01-10 13:29 ` [Xen-devel] [PATCH 8/8] libxl: event: Move poller pipe emptying to the end of afterpoll Ian Jackson
  7 siblings, 0 replies; 9+ messages in thread
From: Ian Jackson @ 2020-01-10 13:29 UTC (permalink / raw)
  To: xen-devel; +Cc: Anthony PERARD, Ian Jackson, George Dunlap, Wei Liu

No functional change.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
---
 tools/libxl/libxl_event.c | 21 +++++++++++++--------
 1 file changed, 13 insertions(+), 8 deletions(-)

diff --git a/tools/libxl/libxl_event.c b/tools/libxl/libxl_event.c
index 05559cad9a..4d57843cce 100644
--- a/tools/libxl/libxl_event.c
+++ b/tools/libxl/libxl_event.c
@@ -42,6 +42,18 @@ static void pollers_note_osevent_added(libxl_ctx *ctx) {
         poller->osevents_added = 1;
 }
 
+static void baton_wake(libxl__egc *egc, libxl__poller *wake)
+{
+    libxl__poller_wakeup(egc, wake);
+
+    wake->osevents_added = 0;
+    /* This serves to make _1_baton idempotent.  It is OK even though
+     * that poller may currently be sleeping on only old osevents,
+     * because it is going to wake up because we've just prodded it,
+     * and it pick up new osevents on its next iteration (or pass
+     * on the baton). */
+}
+
 void libxl__egc_cleanup_1_baton(libxl__egc *egc)
 {
     EGC_GC;
@@ -62,14 +74,7 @@ void libxl__egc_cleanup_1_baton(libxl__egc *egc)
         /* no-one in libxl waiting for any events */
         return;
 
-    libxl__poller_wakeup(egc, wake);
-
-    wake->osevents_added = 0;
-    /* This serves to make _1_baton idempotent.  It is OK even though
-     * that poller may currently be sleeping on only old osevents,
-     * because it is going to wake up because we've just prodded it,
-     * and it pick up new osevents on its next iteration (or pass
-     * on the baton). */
+    baton_wake(egc, wake);
 }
 
 /*
-- 
2.11.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [Xen-devel] [PATCH 7/8] libxl: event: Fix possible hang with libxl_osevent_beforepoll
  2020-01-10 13:28 [Xen-devel] [PATCH 0/8] libxl: event: Fix hang for some applications Ian Jackson
                   ` (5 preceding siblings ...)
  2020-01-10 13:29 ` [Xen-devel] [PATCH 6/8] libxl: event: Break out baton_wake Ian Jackson
@ 2020-01-10 13:29 ` Ian Jackson
  2020-01-10 13:29 ` [Xen-devel] [PATCH 8/8] libxl: event: Move poller pipe emptying to the end of afterpoll Ian Jackson
  7 siblings, 0 replies; 9+ messages in thread
From: Ian Jackson @ 2020-01-10 13:29 UTC (permalink / raw)
  To: xen-devel; +Cc: Anthony PERARD, Ian Jackson, George Dunlap, Wei Liu

If the application uses libxl_osevent_beforepoll, a similar hang is
possible to the one described and fixed in
   libxl: event: Fix hang when mixing blocking and eventy calls
Application behaviour would have to be fairly unusual, but it
doesn't seem sensible to just leave this latent bug.

We fix the latent bug by waking up the "poller_app" pipe every time we
add osevents.  If the application does not ever call beforepoll, we
write one byte to the pipe and set pipe_nonempty and then we ignore
it.  We only write another byte if beforepoll is called again.

Normally in an eventy program there would only be one thread calling
libxl_osevent_beforepoll.  The effect in such a program is to
sometimes needlessly go round the poll loop again if a timeout
callback becomes interested in a new osevent.  We'll fix that in a
moment.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
---
 tools/libxl/libxl_event.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tools/libxl/libxl_event.c b/tools/libxl/libxl_event.c
index 4d57843cce..4314191c3b 100644
--- a/tools/libxl/libxl_event.c
+++ b/tools/libxl/libxl_event.c
@@ -59,6 +59,9 @@ void libxl__egc_cleanup_1_baton(libxl__egc *egc)
     EGC_GC;
     libxl__poller *search, *wake=0;
 
+    if (CTX->poller_app->osevents_added)
+        baton_wake(egc, CTX->poller_app);
+
     LIBXL_LIST_FOREACH(search, &CTX->pollers_active, active_entry) {
         if (search == CTX->poller_app)
             /* This one is special.  We can't give it the baton. */
-- 
2.11.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [Xen-devel] [PATCH 8/8] libxl: event: Move poller pipe emptying to the end of afterpoll
  2020-01-10 13:28 [Xen-devel] [PATCH 0/8] libxl: event: Fix hang for some applications Ian Jackson
                   ` (6 preceding siblings ...)
  2020-01-10 13:29 ` [Xen-devel] [PATCH 7/8] libxl: event: Fix possible hang with libxl_osevent_beforepoll Ian Jackson
@ 2020-01-10 13:29 ` Ian Jackson
  7 siblings, 0 replies; 9+ messages in thread
From: Ian Jackson @ 2020-01-10 13:29 UTC (permalink / raw)
  To: xen-devel; +Cc: Anthony PERARD, Ian Jackson, George Dunlap, Wei Liu

If a timer event callback causes this poller to be woken (not very
unlikely) we would go round the poll loop twice rather than once.

Do the poller pipe emptying at the end; this is slightly more
efficient because it can't cause any callbacks, so it happens after
all the callbacks have been run.

(This pipe-emptying has to happen in afterpoll rather than the
apparently more logical beforepoll, because the application calling
beforepoll doesn't constitute a promise to actually do anything.)

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
---
 tools/libxl/libxl_event.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/tools/libxl/libxl_event.c b/tools/libxl/libxl_event.c
index 4314191c3b..f59fbc719e 100644
--- a/tools/libxl/libxl_event.c
+++ b/tools/libxl/libxl_event.c
@@ -1326,12 +1326,6 @@ static void afterpoll_internal(libxl__egc *egc, libxl__poller *poller,
         fd_occurs(egc, efd, revents);
     }
 
-    if (afterpoll_check_fd(poller,fds,nfds, poller->wakeup_pipe[0],POLLIN)) {
-        poller->pipe_nonempty = 0;
-        int e = libxl__self_pipe_eatall(poller->wakeup_pipe[0]);
-        if (e) LIBXL__EVENT_DISASTER(egc, "read wakeup", e, 0);
-    }
-
     for (;;) {
         libxl__ev_time *etime = LIBXL_TAILQ_FIRST(&CTX->etimes);
         if (!etime)
@@ -1346,6 +1340,12 @@ static void afterpoll_internal(libxl__egc *egc, libxl__poller *poller,
 
         time_occurs(egc, etime, ERROR_TIMEDOUT);
     }
+
+    if (afterpoll_check_fd(poller,fds,nfds, poller->wakeup_pipe[0],POLLIN)) {
+        poller->pipe_nonempty = 0;
+        int e = libxl__self_pipe_eatall(poller->wakeup_pipe[0]);
+        if (e) LIBXL__EVENT_DISASTER(egc, "read wakeup", e, 0);
+    }
 }
 
 void libxl_osevent_afterpoll(libxl_ctx *ctx, int nfds, const struct pollfd *fds,
-- 
2.11.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2020-01-10 13:29 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-01-10 13:28 [Xen-devel] [PATCH 0/8] libxl: event: Fix hang for some applications Ian Jackson
2020-01-10 13:28 ` [Xen-devel] [PATCH 1/8] libxl: event: Rename poller.fds_changed to .fds_deregistered Ian Jackson
2020-01-10 13:28 ` [Xen-devel] [PATCH 2/8] libxl: event: Rename ctx.pollers_fd_changed to .pollers_active Ian Jackson
2020-01-10 13:28 ` [Xen-devel] [PATCH 3/8] libxl: event: Introduce CTX_UNLOCK_EGC_FREE Ian Jackson
2020-01-10 13:28 ` [Xen-devel] [PATCH 4/8] libxl: event: Fix hang when mixing blocking and eventy calls Ian Jackson
2020-01-10 13:28 ` [Xen-devel] [PATCH 5/8] libxl: event: poller pipe optimisation Ian Jackson
2020-01-10 13:29 ` [Xen-devel] [PATCH 6/8] libxl: event: Break out baton_wake Ian Jackson
2020-01-10 13:29 ` [Xen-devel] [PATCH 7/8] libxl: event: Fix possible hang with libxl_osevent_beforepoll Ian Jackson
2020-01-10 13:29 ` [Xen-devel] [PATCH 8/8] libxl: event: Move poller pipe emptying to the end of afterpoll Ian Jackson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.