From: Ian Jackson <ian.jackson@eu.citrix.com>
To: <xen-devel@lists.xenproject.org>
Cc: Anthony PERARD <anthony.perard@citrix.com>,
	Ian Jackson <ian.jackson@eu.citrix.com>,
	George Dunlap <george.dunlap@citrix.com>, Wei Liu <wl@xen.org>
Subject: [Xen-devel] [PATCH v3 09/10] libxl: event: Fix possible hang with libxl_osevent_beforepoll
Date: Fri, 17 Jan 2020 14:47:25 +0000
Message-ID: <20200117144726.582-10-ian.jackson@eu.citrix.com>
In-Reply-To: <20200117144726.582-1-ian.jackson@eu.citrix.com>

If the application uses libxl_osevent_beforepoll, a hang is possible
which is similar to the one described and fixed in
   libxl: event: Fix hang when mixing blocking and eventy calls
Application behaviour would have to be fairly unusual to trigger it,
but it doesn't seem sensible to just leave this latent bug.

We fix the latent bug by waking up the "poller_app" pipe every time we
add osevents.  If the application never calls beforepoll, we write
one byte to the pipe, set pipe_nonempty, and thereafter ignore the
pipe.  We only write another byte if beforepoll is called again.
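
The wakeup scheme described above can be illustrated with a small
standalone sketch.  This is not the libxl code: the struct and every
sketch_* function are hypothetical names invented for illustration,
and the points at which the flags are cleared (osevents_added in
beforepoll, pipe_nonempty in afterpoll) are assumptions based on the
description in this series.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical stand-in for a libxl poller; NOT the real libxl__poller. */
struct sketch_poller {
    int wakeup_pipe[2];  /* [0]: read end the waiter polls; [1]: write end */
    bool pipe_nonempty;  /* a wakeup byte is already pending in the pipe */
    bool osevents_added; /* osevents were added since the last beforepoll */
};

static int sketch_poller_init(struct sketch_poller *p)
{
    if (pipe(p->wakeup_pipe) < 0)
        return -1;
    p->pipe_nonempty = false;
    p->osevents_added = false;
    return 0;
}

/* Write a wakeup byte, but only if none is pending already. */
static int sketch_poller_wakeup(struct sketch_poller *p)
{
    ssize_t r;
    char byte = 0;

    if (p->pipe_nonempty)
        return 0; /* the waiter will wake up anyway */
    do {
        r = write(p->wakeup_pipe[1], &byte, 1);
    } while (r < 0 && errno == EINTR);
    if (r != 1)
        return -1;
    p->pipe_nonempty = true;
    return 0;
}

/* Assumed to run when the thread that added osevents leaves the library. */
static int sketch_cleanup_wake(struct sketch_poller *app)
{
    return app->osevents_added ? sketch_poller_wakeup(app) : 0;
}

/* Assumed beforepoll step: the waiter now holds an up-to-date event set. */
static void sketch_beforepoll(struct sketch_poller *app)
{
    app->osevents_added = false;
}

/* Assumed afterpoll step: consume the pending byte; returns bytes read. */
static int sketch_afterpoll(struct sketch_poller *app)
{
    char buf[16];
    ssize_t r;

    if (!app->pipe_nonempty)
        return 0; /* nothing pending, so do not risk a blocking read */
    do {
        r = read(app->wakeup_pipe[0], buf, sizeof buf);
    } while (r < 0 && errno == EINTR);
    assert(r != 0); /* the write end is still open, so EOF is impossible */
    if (r < 0)
        return -1;
    app->pipe_nonempty = false;
    return (int)r;
}
```

In this sketch, repeated calls to sketch_cleanup_wake while
osevents_added is set leave exactly one byte in the pipe, which is the
"write one byte and then ignore it" behaviour for an application that
never polls.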

Normally in an eventy program there would only be one thread calling
libxl_osevent_beforepoll.  The effect in such a program is to
sometimes needlessly go round the poll loop again if a timeout
callback becomes interested in a new osevent.  We'll fix that in a
moment.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
Tested-by: George Dunlap <george.dunlap@citrix.com>
---
v2: New addition to correctness arguments in libxl_event.c comment.
---
 tools/libxl/libxl_event.c | 54 +++++++++++++++++++++++++++++++++++++----------
 1 file changed, 43 insertions(+), 11 deletions(-)

diff --git a/tools/libxl/libxl_event.c b/tools/libxl/libxl_event.c
index 45cc67942d..5f6a607d80 100644
--- a/tools/libxl/libxl_event.c
+++ b/tools/libxl/libxl_event.c
@@ -41,18 +41,25 @@ static void ao__check_destroy(libxl_ctx *ctx, libxl__ao *ao);
  *
  * We need the following property (the "unstale liveness property"):
  *
- * Whenever any thread is blocking in the libxl event loop[1], at
- * least one thread must be using an up to date osevent set.  It is OK
- * for all but one threads to have stale event sets, because so long
- * as one waiting thread has the right event set, any actually
- * interesting event will, if nothing else, wake that "right" thread
- * up.  It will then make some progress and/or, if it exits, ensure
- * that some other thread becomes the "right" thread.
+ * Whenever any thread is blocking as a result of being given an fd
+ * set or timeout by libxl, at least one thread must be using an up to
+ * date osevent set.  It is OK for all but one threads to have stale
+ * event sets, because so long as one waiting thread has the right
+ * event set, any actually interesting event will, if nothing else,
+ * wake that "right" thread up.  It will then make some progress
+ * and/or, if it exits, ensure that some other thread becomes the
+ * "right" thread.
  *
- * [1] TODO: Right now we are considering only the libxl event loop.
- * We need to consider application event loop outside libxl too.
+ * For threads blocking outside libxl and which are receiving libxl's
+ * fd and timeout information via the libxl_osevent_hooks callbacks,
+ * libxl calls this function as soon as it becomes interested.  It is
+ * the responsibility of a provider of these functions in a
+ * multithreaded environment to make arrangements to wake up event
+ * waiting thread(s) with stale event sets.
  *
- * Argument that our approach is sound:
+ * Waiters outside libxl using _beforepoll are dealt with below.
+ *
+ * For the libxl event loop, the argument is as follows:
  *
  * The issue we are concerned about is libxl sleeping on an out of
  * date fd set, or too long a timeout, so that it doesn't make
@@ -132,7 +139,29 @@ static void ao__check_destroy(libxl_ctx *ctx, libxl__ao *ao);
  * will reenter libxl when it gains the lock and necessarily then
  * becomes a baton holder in category (a).
  *
- * So the "baton invariant" is maintained.  QED.
+ * So the "baton invariant" is maintained.
+ * QED (for waiters in libxl).
+ *
+ *
+ * For waiters outside libxl which used libxl_osevent_beforepoll
+ * to get the fd set:
+ *
+ * As above, adding an osevent involves having an egc or an ao.
+ * It sets poller->osevents_added on all active pollers.  Notably
+ * it sets it on poller_app, which is always active.
+ *
+ * The thread which does this will dispose of its egc or ao before
+ * exiting libxl so it will always wake up the poller_app if the last
+ * call to _beforepoll was before the osevents were added.  So the
+ * application's fd set contains at least a wakeup in the form of the
+ * poller_app fd.  The application cannot sleep on the libxl fd set
+ * until it has called _afterpoll which empties the pipe, and it
+ * is expected to then call _beforepoll again before sleeping.
+ *
+ * So all the application's event waiting thread(s) will always have
+ * an up to date osevent set, and will be woken up if necessary to
+ * achieve this.  (This is in contrast to libxl's own event loop where
+ * only one thread need be up to date, as discussed above.)
  */
 static void pollers_note_osevent_added(libxl_ctx *ctx) {
     libxl__poller *poller;
@@ -157,6 +186,9 @@ void libxl__egc_ao_cleanup_1_baton(libxl__gc *gc)
 {
     libxl__poller *search, *wake=0;
 
+    if (CTX->poller_app->osevents_added)
+        baton_wake(gc, CTX->poller_app);
+
     LIBXL_LIST_FOREACH(search, &CTX->pollers_active, active_entry) {
         if (search == CTX->poller_app)
             /* This one is special.  We can't give it the baton. */
-- 
2.11.0

