All of lore.kernel.org
 help / color / mirror / Atom feed
From: Paolo Bonzini <pbonzini@redhat.com>
To: qemu-devel@nongnu.org
Cc: qemu-block@nongnu.org, famz@redhat.com, kwolf@redhat.com,
	stefanha@redhat.com
Subject: [Qemu-devel] [PATCH 16/18] iothread: release AioContext around aio_poll
Date: Thu, 13 Oct 2016 19:34:20 +0200	[thread overview]
Message-ID: <1476380062-18001-17-git-send-email-pbonzini@redhat.com> (raw)
In-Reply-To: <1476380062-18001-1-git-send-email-pbonzini@redhat.com>

This is the first step towards having fine-grained critical sections in
dataplane threads, which will resolve lock ordering problems between
address_space_* functions (which need the BQL when doing MMIO, even
after we complete RCU-based dispatch) and the AioContext.

Because AioContext does not use contention callbacks anymore, the
unit test has to be changed.

Previously applied as a0710f7995f914e3044e5899bd8ff6c43c62f916 and
then reverted.

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 async.c                     | 22 +++-------------------
 docs/multiple-iothreads.txt | 40 +++++++++++++++++++++++-----------------
 include/block/aio.h         |  3 ---
 iothread.c                  | 11 ++---------
 tests/test-aio.c            | 22 ++++++++++++++--------
 5 files changed, 42 insertions(+), 56 deletions(-)

diff --git a/async.c b/async.c
index fb37b03..27db772 100644
--- a/async.c
+++ b/async.c
@@ -107,8 +107,8 @@ int aio_bh_poll(AioContext *ctx)
          * aio_notify again if necessary.
          */
         if (atomic_xchg(&bh->scheduled, 0)) {
-            /* Idle BHs and the notify BH don't count as progress */
-            if (!bh->idle && bh != ctx->notify_dummy_bh) {
+            /* Idle BHs don't count as progress */
+            if (!bh->idle) {
                 ret = 1;
             }
             bh->idle = 0;
@@ -260,7 +260,6 @@ aio_ctx_finalize(GSource     *source)
 {
     AioContext *ctx = (AioContext *) source;
 
-    qemu_bh_delete(ctx->notify_dummy_bh);
     thread_pool_free(ctx->thread_pool);
 
 #ifdef CONFIG_LINUX_AIO
@@ -346,19 +345,6 @@ static void aio_timerlist_notify(void *opaque)
     aio_notify(opaque);
 }
 
-static void aio_rfifolock_cb(void *opaque)
-{
-    AioContext *ctx = opaque;
-
-    /* Kick owner thread in case they are blocked in aio_poll() */
-    qemu_bh_schedule(ctx->notify_dummy_bh);
-}
-
-static void notify_dummy_bh(void *opaque)
-{
-    /* Do nothing, we were invoked just to force the event loop to iterate */
-}
-
 static void event_notifier_dummy_cb(EventNotifier *e)
 {
 }
@@ -386,11 +372,9 @@ AioContext *aio_context_new(Error **errp)
 #endif
     ctx->thread_pool = NULL;
     qemu_mutex_init(&ctx->bh_lock);
-    rfifolock_init(&ctx->lock, aio_rfifolock_cb, ctx);
+    rfifolock_init(&ctx->lock, NULL, NULL);
     timerlistgroup_init(&ctx->tlg, aio_timerlist_notify, ctx);
 
-    ctx->notify_dummy_bh = aio_bh_new(ctx, notify_dummy_bh, NULL);
-
     return ctx;
 fail:
     g_source_destroy(&ctx->source);
diff --git a/docs/multiple-iothreads.txt b/docs/multiple-iothreads.txt
index 40b8419..0e7cdb2 100644
--- a/docs/multiple-iothreads.txt
+++ b/docs/multiple-iothreads.txt
@@ -105,13 +105,10 @@ a BH in the target AioContext beforehand and then call qemu_bh_schedule().  No
 acquire/release or locking is needed for the qemu_bh_schedule() call.  But be
 sure to acquire the AioContext for aio_bh_new() if necessary.
 
-The relationship between AioContext and the block layer
--------------------------------------------------------
-The AioContext originates from the QEMU block layer because it provides a
-scoped way of running event loop iterations until all work is done.  This
-feature is used to complete all in-flight block I/O requests (see
-bdrv_drain_all()).  Nowadays AioContext is a generic event loop that can be
-used by any QEMU subsystem.
+AioContext and the block layer
+------------------------------
+The AioContext originates from the QEMU block layer, even though nowadays
+AioContext is a generic event loop that can be used by any QEMU subsystem.
 
 The block layer has support for AioContext integrated.  Each BlockDriverState
 is associated with an AioContext using bdrv_set_aio_context() and
@@ -122,13 +119,22 @@ Block layer code must therefore expect to run in an IOThread and avoid using
 old APIs that implicitly use the main loop.  See the "How to program for
 IOThreads" above for information on how to do that.
 
-If main loop code such as a QMP function wishes to access a BlockDriverState it
-must first call aio_context_acquire(bdrv_get_aio_context(bs)) to ensure the
-IOThread does not run in parallel.
-
-Long-running jobs (usually in the form of coroutines) are best scheduled in the
-BlockDriverState's AioContext to avoid the need to acquire/release around each
-bdrv_*() call.  Be aware that there is currently no mechanism to get notified
-when bdrv_set_aio_context() moves this BlockDriverState to a different
-AioContext (see bdrv_detach_aio_context()/bdrv_attach_aio_context()), so you
-may need to add this if you want to support long-running jobs.
+If main loop code such as a QMP function wishes to access a BlockDriverState
+it must first call aio_context_acquire(bdrv_get_aio_context(bs)) to ensure
+that callbacks in the IOThread do not run in parallel.
+
+Code running in the monitor typically needs to ensure that past
+requests from the guest are completed.  When a block device is running
+in an IOThread, the IOThread can also process requests from the guest
+(via ioeventfd).  To achieve both objects, wrap the code between
+bdrv_drained_begin() and bdrv_drained_end(), thus creating a "drained
+section".  The functions must be called between aio_context_acquire()
+and aio_context_release().  You can freely release and re-acquire the
+AioContext within a drained section.
+
+Long-running jobs (usually in the form of coroutines) are best scheduled in
+the BlockDriverState's AioContext to avoid the need to acquire/release around
+each bdrv_*() call.  The functions bdrv_add/remove_aio_context_notifier,
+or alternatively blk_add/remove_aio_context_notifier if you use BlockBackends,
+can be used to get a notification whenever bdrv_set_aio_context() moves a
+BlockDriverState to a different AioContext.
diff --git a/include/block/aio.h b/include/block/aio.h
index 60a4f21..5714aba 100644
--- a/include/block/aio.h
+++ b/include/block/aio.h
@@ -116,9 +116,6 @@ struct AioContext {
     bool notified;
     EventNotifier notifier;
 
-    /* Scheduling this BH forces the event loop it iterate */
-    QEMUBH *notify_dummy_bh;
-
     /* Thread pool for performing work and receiving completion callbacks */
     struct ThreadPool *thread_pool;
 
diff --git a/iothread.c b/iothread.c
index 8153e21..7359b10 100644
--- a/iothread.c
+++ b/iothread.c
@@ -40,7 +40,6 @@ AioContext *qemu_get_current_aio_context(void)
 static void *iothread_run(void *opaque)
 {
     IOThread *iothread = opaque;
-    bool blocking;
 
     rcu_register_thread();
 
@@ -50,14 +49,8 @@ static void *iothread_run(void *opaque)
     qemu_cond_signal(&iothread->init_done_cond);
     qemu_mutex_unlock(&iothread->init_done_lock);
 
-    while (!iothread->stopping) {
-        aio_context_acquire(iothread->ctx);
-        blocking = true;
-        while (!iothread->stopping && aio_poll(iothread->ctx, blocking)) {
-            /* Progress was made, keep going */
-            blocking = false;
-        }
-        aio_context_release(iothread->ctx);
+    while (!atomic_read(&iothread->stopping)) {
+        aio_poll(iothread->ctx, true);
     }
 
     rcu_unregister_thread();
diff --git a/tests/test-aio.c b/tests/test-aio.c
index 03aa846..5be99f8 100644
--- a/tests/test-aio.c
+++ b/tests/test-aio.c
@@ -100,6 +100,7 @@ static void event_ready_cb(EventNotifier *e)
 
 typedef struct {
     QemuMutex start_lock;
+    EventNotifier notifier;
     bool thread_acquired;
 } AcquireTestData;
 
@@ -111,6 +112,11 @@ static void *test_acquire_thread(void *opaque)
     qemu_mutex_lock(&data->start_lock);
     qemu_mutex_unlock(&data->start_lock);
 
+    /* event_notifier_set might be called either before or after
+     * the main thread's call to poll().  The test case's outcome
+     * should be the same in either case.
+     */
+    event_notifier_set(&data->notifier);
     aio_context_acquire(ctx);
     aio_context_release(ctx);
 
@@ -125,20 +131,19 @@ static void set_event_notifier(AioContext *ctx, EventNotifier *notifier,
     aio_set_event_notifier(ctx, notifier, false, handler);
 }
 
-static void dummy_notifier_read(EventNotifier *unused)
+static void dummy_notifier_read(EventNotifier *n)
 {
-    g_assert(false); /* should never be invoked */
+    event_notifier_test_and_clear(n);
 }
 
 static void test_acquire(void)
 {
     QemuThread thread;
-    EventNotifier notifier;
     AcquireTestData data;
 
     /* Dummy event notifier ensures aio_poll() will block */
-    event_notifier_init(&notifier, false);
-    set_event_notifier(ctx, &notifier, dummy_notifier_read);
+    event_notifier_init(&data.notifier, false);
+    set_event_notifier(ctx, &data.notifier, dummy_notifier_read);
     g_assert(!aio_poll(ctx, false)); /* consume aio_notify() */
 
     qemu_mutex_init(&data.start_lock);
@@ -152,12 +157,13 @@ static void test_acquire(void)
     /* Block in aio_poll(), let other thread kick us and acquire context */
     aio_context_acquire(ctx);
     qemu_mutex_unlock(&data.start_lock); /* let the thread run */
-    g_assert(!aio_poll(ctx, true));
+    g_assert(aio_poll(ctx, true));
+    g_assert(!data.thread_acquired);
     aio_context_release(ctx);
 
     qemu_thread_join(&thread);
-    set_event_notifier(ctx, &notifier, NULL);
-    event_notifier_cleanup(&notifier);
+    set_event_notifier(ctx, &data.notifier, NULL);
+    event_notifier_cleanup(&data.notifier);
 
     g_assert(data.thread_acquired);
 }
-- 
2.7.4

  parent reply	other threads:[~2016-10-13 17:35 UTC|newest]

Thread overview: 45+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-10-13 17:34 [Qemu-devel] [PATCH 00/18] dataplane: remove RFifoLock (including almost all previously sent patches) Paolo Bonzini
2016-10-13 17:34 ` [Qemu-devel] [PATCH 01/18] replication: interrupt failover if the main device is closed Paolo Bonzini
2016-10-13 17:34 ` [Qemu-devel] [PATCH 02/18] blockjob: introduce .drain callback for jobs Paolo Bonzini
2016-10-16 10:02   ` [Qemu-devel] [Qemu-block] " Stefan Hajnoczi
2016-10-17  7:53     ` [Qemu-devel] " Paolo Bonzini
2016-10-13 17:34 ` [Qemu-devel] [PATCH 03/18] mirror: use bdrv_drained_begin/bdrv_drained_end Paolo Bonzini
2016-10-14  9:43   ` Fam Zheng
2016-10-14 10:00     ` Paolo Bonzini
2016-10-13 17:34 ` [Qemu-devel] [PATCH 04/18] block: add BDS field to count in-flight requests Paolo Bonzini
2016-10-13 17:34 ` [Qemu-devel] [PATCH 05/18] block: change drain to look only at one child at a time Paolo Bonzini
2016-10-14 10:12   ` Fam Zheng
2016-10-13 17:34 ` [Qemu-devel] [PATCH 06/18] qed: Implement .bdrv_drain Paolo Bonzini
2016-10-14 10:33   ` Fam Zheng
2016-10-14 10:40     ` Paolo Bonzini
2016-10-13 17:34 ` [Qemu-devel] [PATCH 07/18] block: introduce bdrv_poll_while and bdrv_wakeup Paolo Bonzini
2016-10-14 10:42   ` Fam Zheng
2016-10-14 10:43     ` Paolo Bonzini
2016-10-16 10:25   ` [Qemu-devel] [Qemu-block] " Stefan Hajnoczi
2016-10-17  7:54     ` [Qemu-devel] " Paolo Bonzini
2016-10-13 17:34 ` [Qemu-devel] [PATCH 08/18] nfs: move nfs_set_events out of the while loops Paolo Bonzini
2016-10-16 10:37   ` [Qemu-devel] [Qemu-block] " Stefan Hajnoczi
2016-10-13 17:34 ` [Qemu-devel] [PATCH 09/18] nfs: use bdrv_poll_while and bdrv_wakeup Paolo Bonzini
2016-10-16 16:17   ` [Qemu-devel] [Qemu-block] " Stefan Hajnoczi
2016-10-13 17:34 ` [Qemu-devel] [PATCH 10/18] sheepdog: " Paolo Bonzini
2016-10-16 16:21   ` [Qemu-devel] [Qemu-block] " Stefan Hajnoczi
2016-10-13 17:34 ` [Qemu-devel] [PATCH 11/18] aio: introduce qemu_get_current_aio_context Paolo Bonzini
2016-10-16 16:28   ` [Qemu-devel] [Qemu-block] " Stefan Hajnoczi
2016-10-13 17:34 ` [Qemu-devel] [PATCH 12/18] iothread: detach all block devices before stopping them Paolo Bonzini
2016-10-14 14:50   ` Fam Zheng
2016-10-14 14:59     ` Paolo Bonzini
2016-10-13 17:34 ` [Qemu-devel] [PATCH 13/18] replication: pass BlockDriverState to reopen_backing_file Paolo Bonzini
2016-10-16 16:31   ` Stefan Hajnoczi
2016-10-13 17:34 ` [Qemu-devel] [PATCH 14/18] block: prepare bdrv_reopen_multiple to release AioContext Paolo Bonzini
2016-10-16 16:32   ` [Qemu-devel] [Qemu-block] " Stefan Hajnoczi
2016-10-13 17:34 ` [Qemu-devel] [PATCH 15/18] block: only call aio_poll on the current thread's AioContext Paolo Bonzini
2016-10-14 14:55   ` Fam Zheng
2016-10-16 16:40   ` [Qemu-devel] [Qemu-block] " Stefan Hajnoczi
2016-10-17  8:04     ` [Qemu-devel] " Paolo Bonzini
2016-10-18 10:10       ` Stefan Hajnoczi
2016-10-13 17:34 ` Paolo Bonzini [this message]
2016-10-13 17:34 ` [Qemu-devel] [PATCH 17/18] qemu-thread: introduce QemuRecMutex Paolo Bonzini
2016-10-13 17:34 ` [Qemu-devel] [PATCH 18/18] aio: convert from RFifoLock to QemuRecMutex Paolo Bonzini
2016-10-16 16:43   ` [Qemu-devel] [Qemu-block] " Stefan Hajnoczi
2016-10-17  8:58 ` [Qemu-devel] [PATCH 00/18] dataplane: remove RFifoLock (including almost all previously sent patches) Christian Borntraeger
2016-10-17  9:17   ` Paolo Bonzini

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1476380062-18001-17-git-send-email-pbonzini@redhat.com \
    --to=pbonzini@redhat.com \
    --cc=famz@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.