* [Qemu-devel] [RFC 0/7] dataplane: switch to N:M devices-per-thread model
@ 2013-12-12 13:19 Stefan Hajnoczi
  2013-12-12 13:19 ` [Qemu-devel] [RFC 1/7] rfifolock: add recursive FIFO lock Stefan Hajnoczi
                   ` (6 more replies)
  0 siblings, 7 replies; 10+ messages in thread
From: Stefan Hajnoczi @ 2013-12-12 13:19 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Paolo Bonzini, Michael Roth, Stefan Hajnoczi

This series moves the event loop thread out of dataplane code.  It makes
-iothread id=foo a separate concept.  This makes it possible to bind several
devices to the same iothread.

Syntax:

  qemu -iothread id=iothread0 \
       -device virtio-blk-pci,iothread=iothread0,x-data-plane=on,...

For backwards-compatibility the iothread= parameter can be omitted.  A
per-device IOThread will be created behind the scenes (just like the old 1:1
threading model).

This series includes the aio_context_acquire/release API which makes it easy to
synchronize access to AioContext across threads.

The IOThread object is really a stand-in for Michael Roth's QContext.  They
both have the same purpose but I needed something to develop against while
QContext is unfinished.  In order to make progress I'd like to agree on the
user-visible API that IOThread/QContext presents.  That way QContext can be
implemented step-by-step without holding up dataplane.

Finally, the -iothread command-line option will soon be replaced with -object.
I am following Igor's work in the area and will try to help get that code in.

Stefan Hajnoczi (7):
  rfifolock: add recursive FIFO lock
  aio: add aio_context_acquire() and aio_context_release()
  iothread: add I/O thread object
  iothread: command-line option
  qdev: add get_pointer_and_free() for temporary strings
  iothread: add "iothread" qdev property type
  dataplane: replace internal thread with IOThread

 Makefile.objs                    |   1 +
 async.c                          |  18 +++++
 hw/block/dataplane/virtio-blk.c  |  91 ++++++++++++-----------
 hw/core/qdev-properties-system.c |  65 +++++++++++++++++
 include/block/aio.h              |  18 +++++
 include/hw/qdev-properties.h     |   3 +
 include/hw/virtio/virtio-blk.h   |   8 ++-
 include/qemu/rfifolock.h         |  54 ++++++++++++++
 include/sysemu/iothread.h        |  32 +++++++++
 include/sysemu/sysemu.h          |   1 +
 iothread.c                       | 152 +++++++++++++++++++++++++++++++++++++++
 qemu-options.hx                  |   8 +++
 tests/Makefile                   |   2 +
 tests/test-aio.c                 |  58 +++++++++++++++
 tests/test-rfifolock.c           |  90 +++++++++++++++++++++++
 util/Makefile.objs               |   1 +
 util/rfifolock.c                 |  78 ++++++++++++++++++++
 vl.c                             |  12 ++++
 18 files changed, 647 insertions(+), 45 deletions(-)
 create mode 100644 include/qemu/rfifolock.h
 create mode 100644 include/sysemu/iothread.h
 create mode 100644 iothread.c
 create mode 100644 tests/test-rfifolock.c
 create mode 100644 util/rfifolock.c

-- 
1.8.4.2


* [Qemu-devel] [RFC 1/7] rfifolock: add recursive FIFO lock
  2013-12-12 13:19 [Qemu-devel] [RFC 0/7] dataplane: switch to N:M devices-per-thread model Stefan Hajnoczi
@ 2013-12-12 13:19 ` Stefan Hajnoczi
  2013-12-12 13:19 ` [Qemu-devel] [RFC 2/7] aio: add aio_context_acquire() and aio_context_release() Stefan Hajnoczi
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Stefan Hajnoczi @ 2013-12-12 13:19 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Paolo Bonzini, Michael Roth, Stefan Hajnoczi

QemuMutex does not guarantee fairness and cannot be acquired
recursively.

Fairness means each locker gets a turn and the scheduler cannot cause
starvation.

Recursive locking is useful for composition: it allows a sequence of
locking operations to be invoked atomically by acquiring the lock around
them.

This patch adds RFifoLock, a recursive lock that guarantees FIFO order.
Its first user is added in the next patch.

RFifoLock has one additional feature: it can be initialized with an
optional contention callback.  The callback is invoked whenever a thread
must wait for the lock.  For example, it can be used to poke the current
owner so that they release the lock soon.
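
For readers who want to experiment outside of QEMU, here is a minimal
standalone sketch of the same idea: a ticket lock that the owner may take
recursively and that calls a contention callback when a thread has to wait.
It uses plain pthreads instead of QemuMutex/QemuCond, and every sketch_* and
demo_* name is made up for illustration; it is not the code added by this
patch.

```c
/* Standalone sketch, NOT the patch code; all sketch_* names are made up. */
#include <assert.h>
#include <pthread.h>
#include <unistd.h>

typedef struct {
    pthread_mutex_t lock;   /* protects all fields */
    pthread_cond_t cond;    /* waiters sleep here until their ticket is up */
    unsigned int head;      /* ticket currently being served */
    unsigned int tail;      /* next ticket to hand out */
    pthread_t owner;        /* only meaningful while nesting > 0 */
    unsigned int nesting;   /* recursion depth of the owner */
    void (*cb)(void *);     /* optional contention callback */
    void *cb_opaque;
} SketchLock;

static void sketch_init(SketchLock *l, void (*cb)(void *), void *opaque)
{
    pthread_mutex_init(&l->lock, NULL);
    pthread_cond_init(&l->cond, NULL);
    l->head = l->tail = l->nesting = 0;
    l->cb = cb;
    l->cb_opaque = opaque;
}

static void sketch_acquire(SketchLock *l)
{
    pthread_mutex_lock(&l->lock);
    if (l->nesting > 0 && pthread_equal(l->owner, pthread_self())) {
        l->nesting++;       /* nested acquire by the owner: no ticket */
    } else {
        unsigned int ticket = l->tail++;
        while (ticket != l->head) {
            if (l->cb) {
                l->cb(l->cb_opaque);    /* ask the owner to hurry up */
            }
            pthread_cond_wait(&l->cond, &l->lock);
        }
        l->owner = pthread_self();
        l->nesting = 1;
    }
    pthread_mutex_unlock(&l->lock);
}

static void sketch_release(SketchLock *l)
{
    pthread_mutex_lock(&l->lock);
    assert(l->nesting > 0 && pthread_equal(l->owner, pthread_self()));
    if (--l->nesting == 0) {
        l->head++;          /* outermost release: serve the next ticket */
        pthread_cond_broadcast(&l->cond);
    }
    pthread_mutex_unlock(&l->lock);
}

static SketchLock demo_lock;
static int demo_fd[2];

static void demo_kick(void *opaque)
{
    char c = 0;
    ssize_t n = write(demo_fd[1], &c, 1);   /* wake the blocked owner */
    (void)n;
    (void)opaque;
}

static void *demo_contender(void *opaque)
{
    (void)opaque;
    sketch_acquire(&demo_lock);     /* lock is held, so demo_kick() runs */
    sketch_release(&demo_lock);
    return NULL;
}

/* Returns 1 if the owner was kicked by the contending thread. */
int sketch_demo(void)
{
    pthread_t t;
    char c;
    int kicked;

    sketch_init(&demo_lock, demo_kick, NULL);
    if (pipe(demo_fd) != 0) {
        return 0;
    }
    sketch_acquire(&demo_lock);
    sketch_acquire(&demo_lock);     /* the owner may nest */
    if (pthread_create(&t, NULL, demo_contender, NULL) != 0) {
        return 0;
    }
    kicked = read(demo_fd[0], &c, 1) == 1;  /* blocks until kicked */
    sketch_release(&demo_lock);
    sketch_release(&demo_lock);     /* outermost release hands over */
    pthread_join(t, NULL);
    close(demo_fd[0]);
    close(demo_fd[1]);
    pthread_cond_destroy(&demo_lock.cond);
    pthread_mutex_destroy(&demo_lock.lock);
    return kicked;
}
```

The callback runs with the internal mutex held, so in this sketch (as in the
patch) it must not call back into the lock.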

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 include/qemu/rfifolock.h | 54 +++++++++++++++++++++++++++++
 tests/Makefile           |  2 ++
 tests/test-rfifolock.c   | 90 ++++++++++++++++++++++++++++++++++++++++++++++++
 util/Makefile.objs       |  1 +
 util/rfifolock.c         | 78 +++++++++++++++++++++++++++++++++++++++++
 5 files changed, 225 insertions(+)
 create mode 100644 include/qemu/rfifolock.h
 create mode 100644 tests/test-rfifolock.c
 create mode 100644 util/rfifolock.c

diff --git a/include/qemu/rfifolock.h b/include/qemu/rfifolock.h
new file mode 100644
index 0000000..b23ab53
--- /dev/null
+++ b/include/qemu/rfifolock.h
@@ -0,0 +1,54 @@
+/*
+ * Recursive FIFO lock
+ *
+ * Copyright Red Hat, Inc. 2013
+ *
+ * Authors:
+ *  Stefan Hajnoczi   <stefanha@redhat.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#ifndef QEMU_RFIFOLOCK_H
+#define QEMU_RFIFOLOCK_H
+
+#include "qemu/thread.h"
+
+/* Recursive FIFO lock
+ *
+ * This lock provides more features than a plain mutex:
+ *
+ * 1. Fairness - enforces FIFO order.
+ * 2. Nesting - can be taken recursively.
+ * 3. Contention callback - optional, called when thread must wait.
+ *
+ * The recursive FIFO lock is heavyweight so prefer other synchronization
+ * primitives if you do not need its features.
+ */
+typedef struct {
+    QemuMutex lock;             /* protects all fields */
+
+    /* FIFO order */
+    unsigned int head;          /* active ticket number */
+    unsigned int tail;          /* waiting ticket number */
+    QemuCond cond;              /* used to wait for our ticket number */
+
+    /* Nesting */
+    QemuThread owner_thread;    /* thread that currently has ownership */
+    unsigned int nesting;       /* amount of nesting levels */
+
+    /* Contention callback */
+    void (*cb)(void *);         /* called when thread must wait, with ->lock
+                                 * held so it may not recursively lock/unlock
+                                 */
+    void *cb_opaque;
+} RFifoLock;
+
+void rfifolock_init(RFifoLock *r, void (*cb)(void *), void *opaque);
+void rfifolock_destroy(RFifoLock *r);
+void rfifolock_lock(RFifoLock *r);
+void rfifolock_unlock(RFifoLock *r);
+
+#endif /* QEMU_RFIFOLOCK_H */
diff --git a/tests/Makefile b/tests/Makefile
index 379cdd9..70f55db 100644
--- a/tests/Makefile
+++ b/tests/Makefile
@@ -31,6 +31,7 @@ check-unit-y += tests/test-visitor-serialization$(EXESUF)
 check-unit-y += tests/test-iov$(EXESUF)
 gcov-files-test-iov-y = util/iov.c
 check-unit-y += tests/test-aio$(EXESUF)
+check-unit-y += tests/test-rfifolock$(EXESUF)
 check-unit-y += tests/test-throttle$(EXESUF)
 gcov-files-test-aio-$(CONFIG_WIN32) = aio-win32.c
 gcov-files-test-aio-$(CONFIG_POSIX) = aio-posix.c
@@ -148,6 +149,7 @@ tests/check-qfloat$(EXESUF): tests/check-qfloat.o libqemuutil.a
 tests/check-qjson$(EXESUF): tests/check-qjson.o libqemuutil.a libqemustub.a
 tests/test-coroutine$(EXESUF): tests/test-coroutine.o $(block-obj-y) libqemuutil.a libqemustub.a
 tests/test-aio$(EXESUF): tests/test-aio.o $(block-obj-y) libqemuutil.a libqemustub.a
+tests/test-rfifolock$(EXESUF): tests/test-rfifolock.o libqemuutil.a libqemustub.a
 tests/test-throttle$(EXESUF): tests/test-throttle.o $(block-obj-y) libqemuutil.a libqemustub.a
 tests/test-thread-pool$(EXESUF): tests/test-thread-pool.o $(block-obj-y) libqemuutil.a libqemustub.a
 tests/test-iov$(EXESUF): tests/test-iov.o libqemuutil.a
diff --git a/tests/test-rfifolock.c b/tests/test-rfifolock.c
new file mode 100644
index 0000000..440dbcb
--- /dev/null
+++ b/tests/test-rfifolock.c
@@ -0,0 +1,90 @@
+/*
+ * RFifoLock tests
+ *
+ * Copyright Red Hat, Inc. 2013
+ *
+ * Authors:
+ *  Stefan Hajnoczi    <stefanha@redhat.com>
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2 or later.
+ * See the COPYING.LIB file in the top-level directory.
+ */
+
+#include <glib.h>
+#include "qemu-common.h"
+#include "qemu/rfifolock.h"
+
+static void test_nesting(void)
+{
+    RFifoLock lock;
+
+    /* Trivial test, ensure the lock is recursive */
+    rfifolock_init(&lock, NULL, NULL);
+    rfifolock_lock(&lock);
+    rfifolock_lock(&lock);
+    rfifolock_lock(&lock);
+    rfifolock_unlock(&lock);
+    rfifolock_unlock(&lock);
+    rfifolock_unlock(&lock);
+    rfifolock_destroy(&lock);
+}
+
+typedef struct {
+    RFifoLock lock;
+    int fd[2];
+} CallbackTestData;
+
+static void rfifolock_cb(void *opaque)
+{
+    CallbackTestData *data = opaque;
+    int ret;
+    char c = 0;
+
+    ret = write(data->fd[1], &c, sizeof(c));
+    g_assert(ret == 1);
+}
+
+static void *callback_thread(void *opaque)
+{
+    CallbackTestData *data = opaque;
+
+    /* The other thread holds the lock so the contention callback will be
+     * invoked...
+     */
+    rfifolock_lock(&data->lock);
+    rfifolock_unlock(&data->lock);
+    return NULL;
+}
+
+static void test_callback(void)
+{
+    CallbackTestData data;
+    QemuThread thread;
+    int ret;
+    char c;
+
+    rfifolock_init(&data.lock, rfifolock_cb, &data);
+    ret = qemu_pipe(data.fd);
+    g_assert(ret == 0);
+
+    /* Hold lock but allow the callback to kick us by writing to the pipe */
+    rfifolock_lock(&data.lock);
+    qemu_thread_create(&thread, callback_thread, &data, QEMU_THREAD_JOINABLE);
+    ret = read(data.fd[0], &c, sizeof(c));
+    g_assert(ret == 1);
+    rfifolock_unlock(&data.lock);
+    /* If we got here then the callback was invoked, as expected */
+
+    qemu_thread_join(&thread);
+    close(data.fd[0]);
+    close(data.fd[1]);
+    rfifolock_destroy(&data.lock);
+}
+
+int main(int argc, char **argv)
+{
+    g_test_init(&argc, &argv, NULL);
+    g_test_add_func("/nesting", test_nesting);
+    g_test_add_func("/callback", test_callback);
+    return g_test_run();
+}
diff --git a/util/Makefile.objs b/util/Makefile.objs
index af3e5cb..53a4c5e 100644
--- a/util/Makefile.objs
+++ b/util/Makefile.objs
@@ -13,3 +13,4 @@ util-obj-y += hexdump.o
 util-obj-y += crc32c.o
 util-obj-y += throttle.o
 util-obj-y += getauxval.o
+util-obj-y += rfifolock.o
diff --git a/util/rfifolock.c b/util/rfifolock.c
new file mode 100644
index 0000000..afbf748
--- /dev/null
+++ b/util/rfifolock.c
@@ -0,0 +1,78 @@
+/*
+ * Recursive FIFO lock
+ *
+ * Copyright Red Hat, Inc. 2013
+ *
+ * Authors:
+ *  Stefan Hajnoczi   <stefanha@redhat.com>
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2 or later.
+ * See the COPYING.LIB file in the top-level directory.
+ *
+ */
+
+#include <assert.h>
+#include "qemu/rfifolock.h"
+
+void rfifolock_init(RFifoLock *r, void (*cb)(void *), void *opaque)
+{
+    qemu_mutex_init(&r->lock);
+    r->head = 0;
+    r->tail = 0;
+    qemu_cond_init(&r->cond);
+    r->nesting = 0;
+    r->cb = cb;
+    r->cb_opaque = opaque;
+}
+
+void rfifolock_destroy(RFifoLock *r)
+{
+    qemu_cond_destroy(&r->cond);
+    qemu_mutex_destroy(&r->lock);
+}
+
+/*
+ * Theory of operation:
+ *
+ * In order to ensure FIFO ordering, implement a ticketlock.  Threads acquiring
+ * the lock enqueue themselves by incrementing the tail index.  When the lock
+ * is unlocked, the head is incremented and waiting threads are notified.
+ *
+ * Recursive locking does not take a ticket since the head is only incremented
+ * when the outermost recursive caller unlocks.
+ */
+void rfifolock_lock(RFifoLock *r)
+{
+    qemu_mutex_lock(&r->lock);
+
+    /* Take a ticket */
+    unsigned int ticket = r->tail++;
+
+    if (r->nesting > 0 && qemu_thread_is_self(&r->owner_thread)) {
+        r->tail--; /* put ticket back, we're nesting */
+    } else {
+        while (ticket != r->head) {
+            /* Invoke optional contention callback */
+            if (r->cb) {
+                r->cb(r->cb_opaque);
+            }
+            qemu_cond_wait(&r->cond, &r->lock);
+        }
+    }
+
+    qemu_thread_get_self(&r->owner_thread);
+    r->nesting++;
+    qemu_mutex_unlock(&r->lock);
+}
+
+void rfifolock_unlock(RFifoLock *r)
+{
+    qemu_mutex_lock(&r->lock);
+    assert(r->nesting > 0);
+    assert(qemu_thread_is_self(&r->owner_thread));
+    if (--r->nesting == 0) {
+        r->head++;
+        qemu_cond_broadcast(&r->cond);
+    }
+    qemu_mutex_unlock(&r->lock);
+}
-- 
1.8.4.2


* [Qemu-devel] [RFC 2/7] aio: add aio_context_acquire() and aio_context_release()
  2013-12-12 13:19 [Qemu-devel] [RFC 0/7] dataplane: switch to N:M devices-per-thread model Stefan Hajnoczi
  2013-12-12 13:19 ` [Qemu-devel] [RFC 1/7] rfifolock: add recursive FIFO lock Stefan Hajnoczi
@ 2013-12-12 13:19 ` Stefan Hajnoczi
  2013-12-12 13:19 ` [Qemu-devel] [RFC 3/7] iothread: add I/O thread object Stefan Hajnoczi
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Stefan Hajnoczi @ 2013-12-12 13:19 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Paolo Bonzini, Michael Roth, Stefan Hajnoczi

It can be useful to run an AioContext from a thread which normally does
not "own" the AioContext.  For example, request draining can be
implemented by acquiring the AioContext and looping aio_poll() until all
requests have been completed.

The following pattern should work:

  /* Event loop thread */
  while (running) {
      aio_context_acquire(ctx);
      aio_poll(ctx, true);
      aio_context_release(ctx);
  }

  /* Another thread */
  aio_context_acquire(ctx);
  bdrv_read(bs, 0x1000, buf, 1);
  aio_context_release(ctx);

This patch implements aio_context_acquire() and aio_context_release().

Note that existing aio_poll() callers do not need to worry about
acquiring and releasing - it is only needed when multiple threads will
call aio_poll() on the same AioContext.
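
Because the lock is recursive, helpers that take the context themselves can
still be called while it is already held.  A rough standalone sketch of that
composition follows.  A recursive pthread mutex stands in for the AioContext
lock, so it offers neither FIFO ordering nor the contention kick, and MockCtx
and the mock_* functions are invented for illustration only.

```c
#define _XOPEN_SOURCE 700   /* PTHREAD_MUTEX_RECURSIVE under strict -std= */
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for AioContext; not a QEMU type. */
typedef struct {
    pthread_mutex_t lock;   /* recursive, plays the role of ctx->lock */
    int pending;            /* requests still in flight */
    int completed;          /* requests finished by mock_poll() */
} MockCtx;

static void mock_acquire(MockCtx *ctx) { pthread_mutex_lock(&ctx->lock); }
static void mock_release(MockCtx *ctx) { pthread_mutex_unlock(&ctx->lock); }

static void mock_ctx_init(MockCtx *ctx)
{
    pthread_mutexattr_t attr;

    pthread_mutexattr_init(&attr);
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    pthread_mutex_init(&ctx->lock, &attr);
    pthread_mutexattr_destroy(&attr);
    ctx->pending = 0;
    ctx->completed = 0;
}

/* Queue one request; may be called with or without the context held. */
static void mock_submit(MockCtx *ctx)
{
    mock_acquire(ctx);
    ctx->pending++;
    mock_release(ctx);
}

/* One event loop iteration; the caller must hold the context.
 * Returns 1 if a request completed, 0 if there was nothing to do.
 */
static int mock_poll(MockCtx *ctx)
{
    assert(ctx->pending >= 0);
    if (ctx->pending == 0) {
        return 0;
    }
    ctx->pending--;
    ctx->completed++;
    return 1;
}

/* Submit and drain as one atomic sequence from a non-event-loop thread. */
int mock_submit_and_drain(int nrequests)
{
    MockCtx ctx;
    int i, completed;

    mock_ctx_init(&ctx);
    mock_acquire(&ctx);
    for (i = 0; i < nrequests; i++) {
        mock_submit(&ctx);          /* nested acquire, does not deadlock */
    }
    while (mock_poll(&ctx)) {
        /* keep polling until every request has completed */
    }
    completed = ctx.completed;
    mock_release(&ctx);
    pthread_mutex_destroy(&ctx.lock);
    return completed;
}
```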

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 async.c             | 18 +++++++++++++++++
 include/block/aio.h | 18 +++++++++++++++++
 tests/test-aio.c    | 58 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 94 insertions(+)

diff --git a/async.c b/async.c
index 5fb3fa6..6930185 100644
--- a/async.c
+++ b/async.c
@@ -214,6 +214,7 @@ aio_ctx_finalize(GSource     *source)
     thread_pool_free(ctx->thread_pool);
     aio_set_event_notifier(ctx, &ctx->notifier, NULL);
     event_notifier_cleanup(&ctx->notifier);
+    rfifolock_destroy(&ctx->lock);
     qemu_mutex_destroy(&ctx->bh_lock);
     g_array_free(ctx->pollfds, TRUE);
     timerlistgroup_deinit(&ctx->tlg);
@@ -250,6 +251,12 @@ static void aio_timerlist_notify(void *opaque)
     aio_notify(opaque);
 }
 
+static void aio_rfifolock_cb(void *opaque)
+{
+    /* Kick owner thread in case they are blocked in aio_poll() */
+    aio_notify(opaque);
+}
+
 AioContext *aio_context_new(void)
 {
     AioContext *ctx;
@@ -257,6 +264,7 @@ AioContext *aio_context_new(void)
     ctx->pollfds = g_array_new(FALSE, FALSE, sizeof(GPollFD));
     ctx->thread_pool = NULL;
     qemu_mutex_init(&ctx->bh_lock);
+    rfifolock_init(&ctx->lock, aio_rfifolock_cb, ctx);
     event_notifier_init(&ctx->notifier, false);
     aio_set_event_notifier(ctx, &ctx->notifier, 
                            (EventNotifierHandler *)
@@ -275,3 +283,13 @@ void aio_context_unref(AioContext *ctx)
 {
     g_source_unref(&ctx->source);
 }
+
+void aio_context_acquire(AioContext *ctx)
+{
+    rfifolock_lock(&ctx->lock);
+}
+
+void aio_context_release(AioContext *ctx)
+{
+    rfifolock_unlock(&ctx->lock);
+}
diff --git a/include/block/aio.h b/include/block/aio.h
index 2efdf41..4aaa5d5 100644
--- a/include/block/aio.h
+++ b/include/block/aio.h
@@ -19,6 +19,7 @@
 #include "qemu/queue.h"
 #include "qemu/event_notifier.h"
 #include "qemu/thread.h"
+#include "qemu/rfifolock.h"
 #include "qemu/timer.h"
 
 typedef struct BlockDriverAIOCB BlockDriverAIOCB;
@@ -47,6 +48,9 @@ typedef void IOHandler(void *opaque);
 struct AioContext {
     GSource source;
 
+    /* Protects all fields from multi-threaded access */
+    RFifoLock lock;
+
     /* The list of registered AIO handlers */
     QLIST_HEAD(, AioHandler) aio_handlers;
 
@@ -104,6 +108,20 @@ void aio_context_ref(AioContext *ctx);
  */
 void aio_context_unref(AioContext *ctx);
 
+/* Take ownership of the AioContext.  If the AioContext will be shared between
+ * threads, a thread must have ownership when calling aio_poll().
+ *
+ * Note that multiple threads calling aio_poll() means timers, BHs, and
+ * callbacks may be invoked from a different thread than they were registered
+ * from.  Therefore, code must use AioContext acquire/release or use
+ * fine-grained synchronization to protect shared state if other threads will
+ * be accessing it simultaneously.
+ */
+void aio_context_acquire(AioContext *ctx);
+
+/* Relinquish ownership of the AioContext. */
+void aio_context_release(AioContext *ctx);
+
 /**
  * aio_bh_new: Allocate a new bottom half structure.
  *
diff --git a/tests/test-aio.c b/tests/test-aio.c
index 592721e..d384b0b 100644
--- a/tests/test-aio.c
+++ b/tests/test-aio.c
@@ -112,6 +112,63 @@ static void test_notify(void)
     g_assert(!aio_poll(ctx, false));
 }
 
+typedef struct {
+    QemuMutex start_lock;
+    bool thread_acquired;
+} AcquireTestData;
+
+static void *test_acquire_thread(void *opaque)
+{
+    AcquireTestData *data = opaque;
+
+    /* Wait for other thread to let us start */
+    qemu_mutex_lock(&data->start_lock);
+    qemu_mutex_unlock(&data->start_lock);
+
+    aio_context_acquire(ctx);
+    aio_context_release(ctx);
+
+    data->thread_acquired = true; /* success, we got here */
+
+    return NULL;
+}
+
+static void dummy_notifier_read(EventNotifier *unused)
+{
+    g_assert(false); /* should never be invoked */
+}
+
+static void test_acquire(void)
+{
+    QemuThread thread;
+    EventNotifier notifier;
+    AcquireTestData data;
+
+    /* Dummy event notifier ensures aio_poll() will block */
+    event_notifier_init(&notifier, false);
+    aio_set_event_notifier(ctx, &notifier, dummy_notifier_read);
+    g_assert(!aio_poll(ctx, false)); /* consume aio_notify() */
+
+    qemu_mutex_init(&data.start_lock);
+    qemu_mutex_lock(&data.start_lock);
+    data.thread_acquired = false;
+
+    qemu_thread_create(&thread, test_acquire_thread,
+                       &data, QEMU_THREAD_JOINABLE);
+
+    /* Block in aio_poll(), let other thread kick us and acquire context */
+    aio_context_acquire(ctx);
+    qemu_mutex_unlock(&data.start_lock); /* let the thread run */
+    g_assert(!aio_poll(ctx, true));
+    aio_context_release(ctx);
+
+    qemu_thread_join(&thread);
+    aio_set_event_notifier(ctx, &notifier, NULL);
+    event_notifier_cleanup(&notifier);
+
+    g_assert(data.thread_acquired);
+}
+
 static void test_bh_schedule(void)
 {
     BHTestData data = { .n = 0 };
@@ -775,6 +832,7 @@ int main(int argc, char **argv)
 
     g_test_init(&argc, &argv, NULL);
     g_test_add_func("/aio/notify",                  test_notify);
+    g_test_add_func("/aio/acquire",                 test_acquire);
     g_test_add_func("/aio/bh/schedule",             test_bh_schedule);
     g_test_add_func("/aio/bh/schedule10",           test_bh_schedule10);
     g_test_add_func("/aio/bh/cancel",               test_bh_cancel);
-- 
1.8.4.2


* [Qemu-devel] [RFC 3/7] iothread: add I/O thread object
  2013-12-12 13:19 [Qemu-devel] [RFC 0/7] dataplane: switch to N:M devices-per-thread model Stefan Hajnoczi
  2013-12-12 13:19 ` [Qemu-devel] [RFC 1/7] rfifolock: add recursive FIFO lock Stefan Hajnoczi
  2013-12-12 13:19 ` [Qemu-devel] [RFC 2/7] aio: add aio_context_acquire() and aio_context_release() Stefan Hajnoczi
@ 2013-12-12 13:19 ` Stefan Hajnoczi
  2013-12-12 18:00   ` Michael Roth
  2013-12-12 13:19 ` [Qemu-devel] [RFC 4/7] iothread: command-line option Stefan Hajnoczi
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 10+ messages in thread
From: Stefan Hajnoczi @ 2013-12-12 13:19 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Paolo Bonzini, Michael Roth, Stefan Hajnoczi

This is a stand-in for Michael Roth's QContext.  I expect this to be
replaced once QContext is completed.

The IOThread object is an AioContext event loop thread.  This patch adds
the concept of multiple event loop threads, allowing users to define
them.

When SMP guests run on SMP hosts it makes sense to instantiate multiple
IOThreads.  This spreads event loop processing across multiple cores.
Note that additional patches are required to actually bind a device to
an IOThread.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 Makefile.objs             |   1 +
 include/sysemu/iothread.h |  31 +++++++++++++
 iothread.c                | 115 ++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 147 insertions(+)
 create mode 100644 include/sysemu/iothread.h
 create mode 100644 iothread.c

diff --git a/Makefile.objs b/Makefile.objs
index 2b6c1fe..a1102a5 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -42,6 +42,7 @@ libcacard-y += libcacard/vcardt.o
 
 ifeq ($(CONFIG_SOFTMMU),y)
 common-obj-y = $(block-obj-y) blockdev.o blockdev-nbd.o block/
+common-obj-y += iothread.o
 common-obj-y += net/
 common-obj-y += readline.o
 common-obj-y += qdev-monitor.o device-hotplug.o
diff --git a/include/sysemu/iothread.h b/include/sysemu/iothread.h
new file mode 100644
index 0000000..8c49bd6
--- /dev/null
+++ b/include/sysemu/iothread.h
@@ -0,0 +1,31 @@
+/*
+ * Event loop thread
+ *
+ * Copyright Red Hat Inc., 2013
+ *
+ * Authors:
+ *  Stefan Hajnoczi   <stefanha@redhat.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#ifndef IOTHREAD_H
+#define IOTHREAD_H
+
+#include "block/aio.h"
+
+#define TYPE_IOTHREAD "iothread"
+#define IOTHREADS_PATH "/backends/iothreads"
+
+typedef struct IOThread IOThread;
+
+#define IOTHREAD(obj) \
+   OBJECT_CHECK(IOThread, obj, TYPE_IOTHREAD)
+
+IOThread *iothread_find(const char *id);
+char *iothread_get_id(IOThread *iothread);
+AioContext *iothread_get_aio_context(IOThread *iothread);
+
+#endif /* IOTHREAD_H */
diff --git a/iothread.c b/iothread.c
new file mode 100644
index 0000000..dbc6047
--- /dev/null
+++ b/iothread.c
@@ -0,0 +1,115 @@
+/*
+ * Event loop thread
+ *
+ * Copyright Red Hat Inc., 2013
+ *
+ * Authors:
+ *  Stefan Hajnoczi   <stefanha@redhat.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#include "qom/object.h"
+#include "qemu/module.h"
+#include "qemu/thread.h"
+#include "block/aio.h"
+#include "sysemu/iothread.h"
+
+typedef ObjectClass IOThreadClass;
+struct IOThread {
+    Object parent;
+    QemuThread thread;
+    AioContext *ctx;
+    bool stopping;
+};
+
+#define IOTHREAD_GET_CLASS(obj) \
+   OBJECT_GET_CLASS(IOThreadClass, obj, TYPE_IOTHREAD)
+#define IOTHREAD_CLASS(klass) \
+   OBJECT_CLASS_CHECK(IOThreadClass, klass, TYPE_IOTHREAD)
+
+static void *iothread_run(void *opaque)
+{
+    IOThread *iothread = opaque;
+
+    for (;;) {
+        /* TODO can we optimize away acquire/release to only happen when
+         * aio_notify() was called?
+         */
+        aio_context_acquire(iothread->ctx);
+        if (iothread->stopping) {
+            aio_context_release(iothread->ctx);
+            break;
+        }
+        aio_poll(iothread->ctx, true);
+        aio_context_release(iothread->ctx);
+    }
+    return NULL;
+}
+
+static void iothread_instance_init(Object *obj)
+{
+    IOThread *iothread = IOTHREAD(obj);
+
+    iothread->stopping = false;
+    iothread->ctx = aio_context_new();
+
+    /* This assumes .instance_init() is called from a thread with useful CPU
+     * affinity for us to inherit.
+     */
+    qemu_thread_create(&iothread->thread, iothread_run,
+                       iothread, QEMU_THREAD_JOINABLE);
+}
+
+static void iothread_instance_finalize(Object *obj)
+{
+    IOThread *iothread = IOTHREAD(obj);
+
+    iothread->stopping = true;
+    aio_notify(iothread->ctx);
+    qemu_thread_join(&iothread->thread);
+    aio_context_unref(iothread->ctx);
+}
+
+static const TypeInfo iothread_info = {
+    .name = TYPE_IOTHREAD,
+    .parent = TYPE_OBJECT,
+    .instance_size = sizeof(IOThread),
+    .instance_init = iothread_instance_init,
+    .instance_finalize = iothread_instance_finalize,
+};
+
+static void iothread_register_types(void)
+{
+    type_register_static(&iothread_info);
+}
+
+type_init(iothread_register_types)
+
+IOThread *iothread_find(const char *id)
+{
+    Object *container = container_get(object_get_root(), IOTHREADS_PATH);
+    Object *child;
+
+    child = object_property_get_link(container, id, NULL);
+    if (!child) {
+        return NULL;
+    }
+    return IOTHREAD(child);
+}
+
+char *iothread_get_id(IOThread *iothread)
+{
+    /* The last path component is the identifier */
+    char *path = object_get_canonical_path(OBJECT(iothread));
+    char *id = g_strdup(&path[sizeof(IOTHREADS_PATH)]);
+    g_free(path);
+    return id;
+}
+
+AioContext *iothread_get_aio_context(IOThread *iothread)
+{
+    return iothread->ctx;
+}
-- 
1.8.4.2


* [Qemu-devel] [RFC 4/7] iothread: command-line option
  2013-12-12 13:19 [Qemu-devel] [RFC 0/7] dataplane: switch to N:M devices-per-thread model Stefan Hajnoczi
                   ` (2 preceding siblings ...)
  2013-12-12 13:19 ` [Qemu-devel] [RFC 3/7] iothread: add I/O thread object Stefan Hajnoczi
@ 2013-12-12 13:19 ` Stefan Hajnoczi
  2013-12-12 13:19 ` [Qemu-devel] [RFC 5/7] qdev: add get_pointer_and_free() for temporary strings Stefan Hajnoczi
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Stefan Hajnoczi @ 2013-12-12 13:19 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Paolo Bonzini, Michael Roth, Stefan Hajnoczi

The -object option has several limitations that prevent it from fully
instantiating an IOThread.  Igor Mammedov and Paolo Bonzini are fixing
-object.

In the meantime, add a traditional -iothread command-line option that
takes an identifier and keeps a global list of IOThreads.
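
The bookkeeping amounts to a find-or-fail registry keyed by id.  A minimal
sketch in plain C, assuming a simple linked list rather than the QOM
container the patch actually uses (SketchThread, sketch_find() and
sketch_add() are hypothetical names):

```c
/* Sketch only: a tiny id registry, not QEMU's QOM container code. */
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

typedef struct SketchThread {
    char *id;
    struct SketchThread *next;
} SketchThread;

static SketchThread *sketch_threads;    /* global list, newest first */

/* Look up an entry by id; returns NULL if there is none. */
SketchThread *sketch_find(const char *id)
{
    SketchThread *t;

    assert(id != NULL);
    for (t = sketch_threads; t; t = t->next) {
        if (strcmp(t->id, id) == 0) {
            return t;
        }
    }
    return NULL;
}

/* Returns 0 on success, -EEXIST for a duplicate id, -ENOMEM on OOM. */
int sketch_add(const char *id)
{
    SketchThread *t;
    size_t len;

    if (sketch_find(id)) {
        return -EEXIST;
    }
    len = strlen(id) + 1;
    t = malloc(sizeof(*t));
    if (!t) {
        return -ENOMEM;
    }
    t->id = malloc(len);
    if (!t->id) {
        free(t);
        return -ENOMEM;
    }
    memcpy(t->id, id, len);
    t->next = sketch_threads;
    sketch_threads = t;
    return 0;
}
```

Rejecting duplicates up front means a second -iothread with the same id fails
at startup instead of silently shadowing the first one.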

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 include/sysemu/iothread.h |  1 +
 include/sysemu/sysemu.h   |  1 +
 iothread.c                | 37 +++++++++++++++++++++++++++++++++++++
 qemu-options.hx           |  8 ++++++++
 vl.c                      | 12 ++++++++++++
 5 files changed, 59 insertions(+)

diff --git a/include/sysemu/iothread.h b/include/sysemu/iothread.h
index 8c49bd6..a4fcb61 100644
--- a/include/sysemu/iothread.h
+++ b/include/sysemu/iothread.h
@@ -24,6 +24,7 @@ typedef struct IOThread IOThread;
 #define IOTHREAD(obj) \
    OBJECT_CHECK(IOThread, obj, TYPE_IOTHREAD)
 
+int iothread_init(void);
 IOThread *iothread_find(const char *id);
 char *iothread_get_id(IOThread *iothread);
 AioContext *iothread_get_aio_context(IOThread *iothread);
diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
index 495dae8..77117cf 100644
--- a/include/sysemu/sysemu.h
+++ b/include/sysemu/sysemu.h
@@ -202,5 +202,6 @@ extern QemuOptsList qemu_netdev_opts;
 extern QemuOptsList qemu_net_opts;
 extern QemuOptsList qemu_global_opts;
 extern QemuOptsList qemu_mon_opts;
+extern QemuOptsList qemu_iothread_opts;
 
 #endif
diff --git a/iothread.c b/iothread.c
index dbc6047..8b2c0ef 100644
--- a/iothread.c
+++ b/iothread.c
@@ -12,6 +12,9 @@
  */
 
 #include "qom/object.h"
+#include "qapi/qmp/qerror.h"
+#include "qemu/config-file.h"
+#include "qemu/option.h"
 #include "qemu/module.h"
 #include "qemu/thread.h"
 #include "block/aio.h"
@@ -113,3 +116,37 @@ AioContext *iothread_get_aio_context(IOThread *iothread)
 {
     return iothread->ctx;
 }
+
+QemuOptsList qemu_iothread_opts = {
+    .name = "iothread",
+    .head = QTAILQ_HEAD_INITIALIZER(qemu_iothread_opts.head),
+    .desc = {
+        {
+            .name = "id",
+            .type = QEMU_OPT_STRING,
+            .help = "iothread identifier",
+        },
+        { /* end of list */ }
+    },
+};
+
+static int iothread_init_opts(QemuOpts *opts, void *opaque)
+{
+    Object *obj;
+    const char *id;
+
+    id = qemu_opts_id(opts);
+    if (iothread_find(id)) {
+        return -EEXIST;
+    }
+    obj = object_new(TYPE_IOTHREAD);
+    object_property_add_child(container_get(object_get_root(), IOTHREADS_PATH),
+                              id, obj, NULL);
+    return 0;
+}
+
+int iothread_init(void)
+{
+    return qemu_opts_foreach(qemu_find_opts("iothread"),
+                             iothread_init_opts, NULL, 1);
+}
diff --git a/qemu-options.hx b/qemu-options.hx
index af34483..6e29da1 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -3123,6 +3123,14 @@ STEXI
 prepend a timestamp to each log message.(default:on)
 ETEXI
 
+DEF("iothread", HAS_ARG, QEMU_OPTION_iothread,
+    "-iothread id=id\n", QEMU_ARCH_ALL)
+STEXI
+@item -iothread id=id
+@findex -iothread
+Run an event loop thread that devices can be bound to.
+ETEXI
+
 HXCOMM This is the last statement. Insert new options before this line!
 STEXI
 @end table
diff --git a/vl.c b/vl.c
index b0399de..55304dc 100644
--- a/vl.c
+++ b/vl.c
@@ -167,6 +167,7 @@ int main(int argc, char **argv)
 #include "sysemu/cpus.h"
 #include "sysemu/arch_init.h"
 #include "qemu/osdep.h"
+#include "sysemu/iothread.h"
 
 #include "ui/qemu-spice.h"
 #include "qapi/string-input-visitor.h"
@@ -2890,6 +2891,7 @@ int main(int argc, char **argv, char **envp)
     qemu_add_opts(&qemu_tpmdev_opts);
     qemu_add_opts(&qemu_realtime_opts);
     qemu_add_opts(&qemu_msg_opts);
+    qemu_add_opts(&qemu_iothread_opts);
 
     runstate_init();
 
@@ -3796,6 +3798,12 @@ int main(int argc, char **argv, char **envp)
                 }
                 configure_msg(opts);
                 break;
+            case QEMU_OPTION_iothread:
+                opts = qemu_opts_parse(qemu_find_opts("iothread"), optarg, 0);
+                if (!opts) {
+                    exit(1);
+                }
+                break;
             default:
                 os_parse_cmd_args(popt->index, optarg);
             }
@@ -4102,6 +4110,10 @@ int main(int argc, char **argv, char **envp)
     qemu_init_cpu_loop();
     qemu_mutex_lock_iothread();
 
+    if (iothread_init() < 0) {
+        exit(1);
+    }
+
 #ifdef CONFIG_SPICE
     /* spice needs the timers to be initialized by this point */
     qemu_spice_init();
-- 
1.8.4.2


* [Qemu-devel] [RFC 5/7] qdev: add get_pointer_and_free() for temporary strings
  2013-12-12 13:19 [Qemu-devel] [RFC 0/7] dataplane: switch to N:M devices-per-thread model Stefan Hajnoczi
                   ` (3 preceding siblings ...)
  2013-12-12 13:19 ` [Qemu-devel] [RFC 4/7] iothread: command-line option Stefan Hajnoczi
@ 2013-12-12 13:19 ` Stefan Hajnoczi
  2013-12-12 13:19 ` [Qemu-devel] [RFC 6/7] iothread: add "iothread" qdev property type Stefan Hajnoczi
  2013-12-12 13:19 ` [Qemu-devel] [RFC 7/7] dataplane: replace internal thread with IOThread Stefan Hajnoczi
  6 siblings, 0 replies; 10+ messages in thread
From: Stefan Hajnoczi @ 2013-12-12 13:19 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Paolo Bonzini, Michael Roth, Stefan Hajnoczi

get_pointer() assumes the string has unspecified lifetime (at least as
long as the object is alive).  In some cases we can only produce a
temporary string that should be freed when get_pointer() is done.
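
The ownership rule can be illustrated with a small standalone sketch: the
print() callback returns a heap-allocated string, the getter hands it to a
visitor, and then frees it.  SketchVisitor and all sketch_* names are
hypothetical stand-ins, not QEMU's Visitor or qdev APIs.

```c
/* Sketch only: "print to a temporary string, visit, free". */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    char out[64];   /* where the mock visitor records what it saw */
} SketchVisitor;

static char *sketch_strdup(const char *s)
{
    size_t len = strlen(s) + 1;
    char *p = malloc(len);

    assert(p != NULL);
    memcpy(p, s, len);
    return p;
}

/* The visitor copies the string, so the caller may free it afterwards. */
static void sketch_visit_str(SketchVisitor *v, const char *s)
{
    snprintf(v->out, sizeof(v->out), "%s", s);
}

/* print() callback: returns a heap string that the caller owns. */
static char *sketch_print_id(void *ptr)
{
    return sketch_strdup((const char *)ptr);
}

/* Visit the printed form of *ptr, then free the temporary string. */
void sketch_get_and_free(void **ptr, char *(*print)(void *),
                         SketchVisitor *v)
{
    char *p;

    assert(print != NULL);
    p = *ptr ? print(*ptr) : sketch_strdup("");
    sketch_visit_str(v, p);
    free(p);
}

/* Usage helper: returns the string the visitor saw for a given value. */
const char *sketch_demo_get(void *value)
{
    static SketchVisitor v;

    sketch_get_and_free(&value, sketch_print_id, &v);
    return v.out;
}
```

Duplicating the empty string in the NULL case keeps both branches on the same
"always free" path.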

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
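For illustration only (not part of the patch): a rough standalone sketch
of the ownership rule described above. The names visit_str(),
owned_print() and get_and_free() are hypothetical stand-ins, not QEMU
functions, and plain malloc()/free() stand in for g_strdup()/g_free().

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a visitor: copies the string it is shown. */
static char seen[64];

static void visit_str(const char *s)
{
    snprintf(seen, sizeof(seen), "%s", s);
}

/* Hypothetical printer that returns a fresh heap string the caller owns. */
static char *owned_print(const void *ptr)
{
    const char *id = ptr;
    size_t len = strlen(id) + 1;
    char *copy = malloc(len);

    if (copy) {
        memcpy(copy, id, len);
    }
    return copy;
}

/* Same shape as get_pointer_and_free(): print, visit, then free. */
static void get_and_free(const void *ptr, char *(*print)(const void *))
{
    char *p;

    assert(print != NULL);
    p = ptr ? print(ptr) : owned_print("");
    if (!p) {
        return; /* allocation failed; nothing to visit */
    }
    visit_str(p);
    free(p); /* the temporary must not outlive this call */
}
```

Calling get_and_free("iothread0", owned_print) from a main() leaves
"iothread0" in seen; passing NULL leaves the empty string, matching the
g_strdup("") fallback in the patch.
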
 hw/core/qdev-properties-system.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/hw/core/qdev-properties-system.c b/hw/core/qdev-properties-system.c
index 729efa8..200f853 100644
--- a/hw/core/qdev-properties-system.c
+++ b/hw/core/qdev-properties-system.c
@@ -31,6 +31,20 @@ static void get_pointer(Object *obj, Visitor *v, Property *prop,
     visit_type_str(v, &p, name, errp);
 }
 
+/* Same as get_pointer() but frees heap-allocated print() return value */
+static void get_pointer_and_free(Object *obj, Visitor *v, Property *prop,
+                                 char *(*print)(void *ptr),
+                                 const char *name, Error **errp)
+{
+    DeviceState *dev = DEVICE(obj);
+    void **ptr = qdev_get_prop_ptr(dev, prop);
+    char *p;
+
+    p = *ptr ? print(*ptr) : g_strdup("");
+    visit_type_str(v, &p, name, errp);
+    g_free(p);
+}
+
 static void set_pointer(Object *obj, Visitor *v, Property *prop,
                         int (*parse)(DeviceState *dev, const char *str,
                                      void **ptr),
-- 
1.8.4.2


* [Qemu-devel] [RFC 6/7] iothread: add "iothread" qdev property type
  2013-12-12 13:19 [Qemu-devel] [RFC 0/7] dataplane: switch to N:M devices-per-thread model Stefan Hajnoczi
                   ` (4 preceding siblings ...)
  2013-12-12 13:19 ` [Qemu-devel] [RFC 5/7] qdev: add get_pointer_and_free() for temporary strings Stefan Hajnoczi
@ 2013-12-12 13:19 ` Stefan Hajnoczi
  2013-12-12 13:19 ` [Qemu-devel] [RFC 7/7] dataplane: replace internal thread with IOThread Stefan Hajnoczi
  6 siblings, 0 replies; 10+ messages in thread
From: Stefan Hajnoczi @ 2013-12-12 13:19 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Paolo Bonzini, Michael Roth, Stefan Hajnoczi

Add an "iothread" qdev property type so devices can be hooked up to an
IOThread from the command-line:

  qemu -iothread id=iothread0 \
       -device some-device,iothread=iothread0

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
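For illustration only (not part of the patch): a toy model of the
reference pairing between parse_iothread() and release_iothread(). The
ToyThread type and the toy_*() helpers are hypothetical; they only mimic
the object_ref()/object_unref() pattern and do not use QOM. The sketch
returns -1 where the patch returns -ENOENT.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy refcounted object; a stand-in for a QOM Object, not the real type. */
typedef struct {
    const char *id;
    int refcnt;
} ToyThread;

static ToyThread registry[] = {
    { "iothread0", 1 },   /* the registry itself holds one reference */
    { "iothread1", 1 },
};

static ToyThread *toy_find(const char *id)
{
    for (size_t i = 0; i < sizeof(registry) / sizeof(registry[0]); i++) {
        if (strcmp(registry[i].id, id) == 0) {
            return &registry[i];
        }
    }
    return NULL;
}

/* Like parse_iothread(): take a reference only when the lookup succeeds. */
static int toy_parse(const char *id, ToyThread **ptr)
{
    ToyThread *t = toy_find(id);

    if (!t) {
        return -1;      /* unknown id: *ptr is left untouched */
    }
    t->refcnt++;
    *ptr = t;
    return 0;
}

/* Like release_iothread(): drop the reference if one was taken. */
static void toy_release(ToyThread **ptr)
{
    if (*ptr) {
        assert((*ptr)->refcnt > 1);
        (*ptr)->refcnt--;
    }
}
```

After a successful toy_parse() the count is 2, and toy_release() brings
it back to 1, so the property never outlives its reference and a failed
lookup never leaks one.
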
 hw/core/qdev-properties-system.c | 51 ++++++++++++++++++++++++++++++++++++++++
 include/hw/qdev-properties.h     |  3 +++
 2 files changed, 54 insertions(+)

diff --git a/hw/core/qdev-properties-system.c b/hw/core/qdev-properties-system.c
index 200f853..95b63b6 100644
--- a/hw/core/qdev-properties-system.c
+++ b/hw/core/qdev-properties-system.c
@@ -18,6 +18,7 @@
 #include "net/hub.h"
 #include "qapi/visitor.h"
 #include "sysemu/char.h"
+#include "sysemu/iothread.h"
 
 static void get_pointer(Object *obj, Visitor *v, Property *prop,
                         const char *(*print)(void *ptr),
@@ -396,6 +397,56 @@ void qdev_set_nic_properties(DeviceState *dev, NICInfo *nd)
     nd->instantiated = 1;
 }
 
+/* --- iothread --- */
+
+static char *print_iothread(void *ptr)
+{
+    return iothread_get_id(ptr);
+}
+
+static int parse_iothread(DeviceState *dev, const char *str, void **ptr)
+{
+    IOThread *iothread;
+
+    iothread = iothread_find(str);
+    if (!iothread) {
+        return -ENOENT;
+    }
+    object_ref(OBJECT(iothread));
+    *ptr = iothread;
+    return 0;
+}
+
+static void get_iothread(Object *obj, struct Visitor *v, void *opaque,
+                         const char *name, Error **errp)
+{
+    get_pointer_and_free(obj, v, opaque, print_iothread, name, errp);
+}
+
+static void set_iothread(Object *obj, struct Visitor *v, void *opaque,
+                         const char *name, Error **errp)
+{
+    set_pointer(obj, v, opaque, parse_iothread, name, errp);
+}
+
+static void release_iothread(Object *obj, const char *name, void *opaque)
+{
+    DeviceState *dev = DEVICE(obj);
+    Property *prop = opaque;
+    IOThread **ptr = qdev_get_prop_ptr(dev, prop);
+
+    if (*ptr) {
+        object_unref(OBJECT(*ptr));
+    }
+}
+
+PropertyInfo qdev_prop_iothread = {
+    .name = "iothread",
+    .get = get_iothread,
+    .set = set_iothread,
+    .release = release_iothread,
+};
+
 static int qdev_add_one_global(QemuOpts *opts, void *opaque)
 {
     GlobalProperty *g;
diff --git a/include/hw/qdev-properties.h b/include/hw/qdev-properties.h
index 692f82e..4048f9b 100644
--- a/include/hw/qdev-properties.h
+++ b/include/hw/qdev-properties.h
@@ -25,6 +25,7 @@ extern PropertyInfo qdev_prop_bios_chs_trans;
 extern PropertyInfo qdev_prop_drive;
 extern PropertyInfo qdev_prop_netdev;
 extern PropertyInfo qdev_prop_vlan;
+extern PropertyInfo qdev_prop_iothread;
 extern PropertyInfo qdev_prop_pci_devfn;
 extern PropertyInfo qdev_prop_blocksize;
 extern PropertyInfo qdev_prop_pci_host_devaddr;
@@ -134,6 +135,8 @@ extern PropertyInfo qdev_prop_arraylen;
     DEFINE_PROP(_n, _s, _f, qdev_prop_vlan, NICPeers)
 #define DEFINE_PROP_DRIVE(_n, _s, _f) \
     DEFINE_PROP(_n, _s, _f, qdev_prop_drive, BlockDriverState *)
+#define DEFINE_PROP_IOTHREAD(_n, _s, _f)             \
+    DEFINE_PROP(_n, _s, _f, qdev_prop_iothread, IOThread *)
 #define DEFINE_PROP_MACADDR(_n, _s, _f)         \
     DEFINE_PROP(_n, _s, _f, qdev_prop_macaddr, MACAddr)
 #define DEFINE_PROP_LOSTTICKPOLICY(_n, _s, _f, _d) \
-- 
1.8.4.2


* [Qemu-devel] [RFC 7/7] dataplane: replace internal thread with IOThread
  2013-12-12 13:19 [Qemu-devel] [RFC 0/7] dataplane: switch to N:M devices-per-thread model Stefan Hajnoczi
                   ` (5 preceding siblings ...)
  2013-12-12 13:19 ` [Qemu-devel] [RFC 6/7] iothread: add "iothread" qdev property type Stefan Hajnoczi
@ 2013-12-12 13:19 ` Stefan Hajnoczi
  6 siblings, 0 replies; 10+ messages in thread
From: Stefan Hajnoczi @ 2013-12-12 13:19 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Paolo Bonzini, Michael Roth, Stefan Hajnoczi

Today virtio-blk dataplane uses a 1:1 device-per-thread model.  Now that
IOThreads have been introduced we can generalize this to an N:M
devices-per-thread model.

This patch drops thread code from dataplane in favor of running inside
an IOThread AioContext.

As a bonus we solve the case where a guest keeps submitting I/O requests
while dataplane is trying to stop.  Previously the dataplane thread
would continue to process requests until the guest gave it a break.
Now we can shut down in bounded time thanks to
aio_context_acquire/release.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
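For illustration only (not part of the patch): a toy model of why the
new stop ordering terminates in bounded time. ToyPlane and the toy_*()
helpers are hypothetical and single-threaded; the
aio_context_acquire/release locking is left out. The model assumes a
busy guest that submits one request per loop iteration for as long as
the doorbell handler is attached.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct {
    bool host_attached;  /* guest doorbell handler installed? */
    bool io_attached;    /* completion handler installed? */
    int num_reqs;        /* requests in flight */
    int polls;           /* event loop iterations performed */
} ToyPlane;

/* One event loop iteration: a busy guest submits a request whenever the
 * doorbell handler is attached; one in-flight request completes per poll.
 */
static void toy_poll(ToyPlane *s)
{
    s->polls++;
    if (s->host_attached) {
        s->num_reqs++;
    }
    if (s->io_attached && s->num_reqs > 0) {
        s->num_reqs--;
    }
}

/* Same ordering as the new virtio_blk_data_plane_stop() */
static void toy_stop(ToyPlane *s)
{
    s->host_attached = false;      /* 1. no new requests from the guest */
    while (s->num_reqs > 0) {      /* 2. drain what is already in flight */
        toy_poll(s);
    }
    s->io_attached = false;        /* 3. nothing left to complete */
    assert(s->num_reqs == 0);
}
```

With five requests in flight, toy_stop() finishes after exactly five
iterations. If step 1 were skipped, the guest in this model would refill
the queue on every iteration and the drain loop would never exit.
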
 hw/block/dataplane/virtio-blk.c | 91 ++++++++++++++++++++++-------------------
 include/hw/virtio/virtio-blk.h  |  8 +++-
 2 files changed, 54 insertions(+), 45 deletions(-)

diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index f2d7350..43211ba 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -44,8 +44,6 @@ struct VirtIOBlockDataPlane {
     bool started;
     bool starting;
     bool stopping;
-    QEMUBH *start_bh;
-    QemuThread thread;
 
     VirtIOBlkConf *blk;
     int fd;                         /* image file descriptor */
@@ -59,12 +57,14 @@ struct VirtIOBlockDataPlane {
      * (because you don't own the file descriptor or handle; you just
      * use it).
      */
+    IOThread *iothread;
+    bool internal_iothread;
     AioContext *ctx;
     EventNotifier io_notifier;      /* Linux AIO completion */
     EventNotifier host_notifier;    /* doorbell */
 
     IOQueue ioqueue;                /* Linux AIO queue (should really be per
-                                       dataplane thread) */
+                                       IOThread) */
     VirtIOBlockRequest requests[REQ_MAX]; /* pool of requests, managed by the
                                              queue */
 
@@ -360,26 +360,7 @@ static void handle_io(EventNotifier *e)
     }
 }
 
-static void *data_plane_thread(void *opaque)
-{
-    VirtIOBlockDataPlane *s = opaque;
-
-    while (!s->stopping || s->num_reqs > 0) {
-        aio_poll(s->ctx, true);
-    }
-    return NULL;
-}
-
-static void start_data_plane_bh(void *opaque)
-{
-    VirtIOBlockDataPlane *s = opaque;
-
-    qemu_bh_delete(s->start_bh);
-    s->start_bh = NULL;
-    qemu_thread_create(&s->thread, data_plane_thread,
-                       s, QEMU_THREAD_JOINABLE);
-}
-
+/* Context: QEMU global mutex held */
 bool virtio_blk_data_plane_create(VirtIODevice *vdev, VirtIOBlkConf *blk,
                                   VirtIOBlockDataPlane **dataplane)
 {
@@ -407,7 +388,7 @@ bool virtio_blk_data_plane_create(VirtIODevice *vdev, VirtIOBlkConf *blk,
      * block jobs that can conflict.
      */
     if (bdrv_in_use(blk->conf.bs)) {
-        error_report("cannot start dataplane thread while device is in use");
+        error_report("cannot start dataplane while device is in use");
         return false;
     }
 
@@ -423,6 +404,20 @@ bool virtio_blk_data_plane_create(VirtIODevice *vdev, VirtIOBlkConf *blk,
     s->fd = fd;
     s->blk = blk;
 
+    if (blk->iothread) {
+        s->internal_iothread = false;
+        s->iothread = blk->iothread;
+        object_ref(OBJECT(s->iothread));
+    } else {
+        /* Create per-device IOThread if none specified */
+        s->internal_iothread = true;
+        s->iothread = IOTHREAD(object_new(TYPE_IOTHREAD));
+        object_property_add_child(container_get(object_get_root(),
+                                                IOTHREADS_PATH),
+                                  vdev->name, OBJECT(s->iothread), NULL);
+    }
+    s->ctx = iothread_get_aio_context(s->iothread);
+
     /* Prevent block operations that conflict with data plane thread */
     bdrv_set_in_use(blk->conf.bs, 1);
 
@@ -430,6 +425,7 @@ bool virtio_blk_data_plane_create(VirtIODevice *vdev, VirtIOBlkConf *blk,
     return true;
 }
 
+/* Context: QEMU global mutex held */
 void virtio_blk_data_plane_destroy(VirtIOBlockDataPlane *s)
 {
     if (!s) {
@@ -438,9 +434,14 @@ void virtio_blk_data_plane_destroy(VirtIOBlockDataPlane *s)
 
     virtio_blk_data_plane_stop(s);
     bdrv_set_in_use(s->blk->conf.bs, 0);
+    object_unref(OBJECT(s->iothread));
+    if (s->internal_iothread) {
+        object_unparent(OBJECT(s->iothread));
+    }
     g_free(s);
 }
 
+/* Context: QEMU global mutex held */
 void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s)
 {
     BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(s->vdev)));
@@ -464,8 +465,6 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s)
         return;
     }
 
-    s->ctx = aio_context_new();
-
     /* Set up guest notifier (irq) */
     if (k->set_guest_notifiers(qbus->parent, 1, true) != 0) {
         fprintf(stderr, "virtio-blk failed to set guest notifier, "
@@ -480,7 +479,6 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s)
         exit(1);
     }
     s->host_notifier = *virtio_queue_get_host_notifier(vq);
-    aio_set_event_notifier(s->ctx, &s->host_notifier, handle_notify);
 
     /* Set up ioqueue */
     ioq_init(&s->ioqueue, s->fd, REQ_MAX);
@@ -488,7 +486,6 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s)
         ioq_put_iocb(&s->ioqueue, &s->requests[i].iocb);
     }
     s->io_notifier = *ioq_get_notifier(&s->ioqueue);
-    aio_set_event_notifier(s->ctx, &s->io_notifier, handle_io);
 
     s->starting = false;
     s->started = true;
@@ -497,11 +494,14 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s)
     /* Kick right away to begin processing requests already in vring */
     event_notifier_set(virtio_queue_get_host_notifier(vq));
 
-    /* Spawn thread in BH so it inherits iothread cpusets */
-    s->start_bh = qemu_bh_new(start_data_plane_bh, s);
-    qemu_bh_schedule(s->start_bh);
+    /* Get this show started by hooking up our callbacks */
+    aio_context_acquire(s->ctx);
+    aio_set_event_notifier(s->ctx, &s->host_notifier, handle_notify);
+    aio_set_event_notifier(s->ctx, &s->io_notifier, handle_io);
+    aio_context_release(s->ctx);
 }
 
+/* Context: QEMU global mutex held */
 void virtio_blk_data_plane_stop(VirtIOBlockDataPlane *s)
 {
     BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(s->vdev)));
@@ -512,27 +512,32 @@ void virtio_blk_data_plane_stop(VirtIOBlockDataPlane *s)
     s->stopping = true;
     trace_virtio_blk_data_plane_stop(s);
 
-    /* Stop thread or cancel pending thread creation BH */
-    if (s->start_bh) {
-        qemu_bh_delete(s->start_bh);
-        s->start_bh = NULL;
-    } else {
-        aio_notify(s->ctx);
-        qemu_thread_join(&s->thread);
+    aio_context_acquire(s->ctx);
+
+    /* Stop notifications for new requests from guest */
+    aio_set_event_notifier(s->ctx, &s->host_notifier, NULL);
+
+    /* Complete pending requests */
+    while (s->num_reqs > 0) {
+        aio_poll(s->ctx, true);
     }
 
+    /* Stop ioq callbacks (there are no pending requests left) */
     aio_set_event_notifier(s->ctx, &s->io_notifier, NULL);
-    ioq_cleanup(&s->ioqueue);
 
-    aio_set_event_notifier(s->ctx, &s->host_notifier, NULL);
-    k->set_host_notifier(qbus->parent, 0, false);
+    aio_context_release(s->ctx);
+
+    /* Sync vring state back to virtqueue so that non-dataplane request
+     * processing can continue when we disable the host notifier below.
+     */
+    vring_teardown(&s->vring, s->vdev, 0);
 
-    aio_context_unref(s->ctx);
+    ioq_cleanup(&s->ioqueue);
+    k->set_host_notifier(qbus->parent, 0, false);
 
     /* Clean up guest notifier (irq) */
     k->set_guest_notifiers(qbus->parent, 1, false);
 
-    vring_teardown(&s->vring, s->vdev, 0);
     s->started = false;
     s->stopping = false;
 }
diff --git a/include/hw/virtio/virtio-blk.h b/include/hw/virtio/virtio-blk.h
index 41885da..12193fd 100644
--- a/include/hw/virtio/virtio-blk.h
+++ b/include/hw/virtio/virtio-blk.h
@@ -16,6 +16,7 @@
 
 #include "hw/virtio/virtio.h"
 #include "hw/block/block.h"
+#include "sysemu/iothread.h"
 
 #define TYPE_VIRTIO_BLK "virtio-blk-device"
 #define VIRTIO_BLK(obj) \
@@ -106,6 +107,7 @@ struct virtio_scsi_inhdr
 struct VirtIOBlkConf
 {
     BlockConf conf;
+    IOThread *iothread;
     char *serial;
     uint32_t scsi;
     uint32_t config_wce;
@@ -140,13 +142,15 @@ typedef struct VirtIOBlock {
         DEFINE_BLOCK_CHS_PROPERTIES(_state, _field.conf),                     \
         DEFINE_PROP_STRING("serial", _state, _field.serial),                  \
         DEFINE_PROP_BIT("config-wce", _state, _field.config_wce, 0, true),    \
-        DEFINE_PROP_BIT("scsi", _state, _field.scsi, 0, true)
+        DEFINE_PROP_BIT("scsi", _state, _field.scsi, 0, true),                \
+        DEFINE_PROP_IOTHREAD("iothread", _state, _field.iothread)
 #else
 #define DEFINE_VIRTIO_BLK_PROPERTIES(_state, _field)                          \
         DEFINE_BLOCK_PROPERTIES(_state, _field.conf),                         \
         DEFINE_BLOCK_CHS_PROPERTIES(_state, _field.conf),                     \
         DEFINE_PROP_STRING("serial", _state, _field.serial),                  \
-        DEFINE_PROP_BIT("config-wce", _state, _field.config_wce, 0, true)
+        DEFINE_PROP_BIT("config-wce", _state, _field.config_wce, 0, true),    \
+        DEFINE_PROP_IOTHREAD("iothread", _state, _field.iothread)
 #endif /* __linux__ */
 
 void virtio_blk_set_conf(DeviceState *dev, VirtIOBlkConf *blk);
-- 
1.8.4.2


* Re: [Qemu-devel] [RFC 3/7] iothread: add I/O thread object
  2013-12-12 13:19 ` [Qemu-devel] [RFC 3/7] iothread: add I/O thread object Stefan Hajnoczi
@ 2013-12-12 18:00   ` Michael Roth
  2013-12-13  9:20     ` Stefan Hajnoczi
  0 siblings, 1 reply; 10+ messages in thread
From: Michael Roth @ 2013-12-12 18:00 UTC (permalink / raw)
  To: Stefan Hajnoczi, qemu-devel; +Cc: Kevin Wolf, Paolo Bonzini

Quoting Stefan Hajnoczi (2013-12-12 07:19:40)
> This is a stand-in for Michael Roth's QContext.  I expect this to be
> replaced once QContext is completed.
> 
> The IOThread object is an AioContext event loop thread.  This patch adds
> the concept of multiple event loop threads, allowing users to define
> them.
> 
> When SMP guests run on SMP hosts it makes sense to instantiate multiple
> IOThreads.  This spreads event loop processing across multiple cores.
> Note that additional patches are required to actually bind a device to
> an IOThread.
> 
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  Makefile.objs             |   1 +
>  include/sysemu/iothread.h |  31 +++++++++++++
>  iothread.c                | 115 ++++++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 147 insertions(+)
>  create mode 100644 include/sysemu/iothread.h
>  create mode 100644 iothread.c
> 
> diff --git a/Makefile.objs b/Makefile.objs
> index 2b6c1fe..a1102a5 100644
> --- a/Makefile.objs
> +++ b/Makefile.objs
> @@ -42,6 +42,7 @@ libcacard-y += libcacard/vcardt.o
> 
>  ifeq ($(CONFIG_SOFTMMU),y)
>  common-obj-y = $(block-obj-y) blockdev.o blockdev-nbd.o block/
> +common-obj-y += iothread.o
>  common-obj-y += net/
>  common-obj-y += readline.o
>  common-obj-y += qdev-monitor.o device-hotplug.o
> diff --git a/include/sysemu/iothread.h b/include/sysemu/iothread.h
> new file mode 100644
> index 0000000..8c49bd6
> --- /dev/null
> +++ b/include/sysemu/iothread.h
> @@ -0,0 +1,31 @@
> +/*
> + * Event loop thread
> + *
> + * Copyright Red Hat Inc., 2013
> + *
> + * Authors:
> + *  Stefan Hajnoczi   <stefanha@redhat.com>
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + *
> + */
> +
> +#ifndef IOTHREAD_H
> +#define IOTHREAD_H
> +
> +#include "block/aio.h"
> +
> +#define TYPE_IOTHREAD "iothread"
> +#define IOTHREADS_PATH "/backends/iothreads"
> +
> +typedef struct IOThread IOThread;
> +
> +#define IOTHREAD(obj) \
> +   OBJECT_CHECK(IOThread, obj, TYPE_IOTHREAD)
> +
> +IOThread *iothread_find(const char *id);
> +char *iothread_get_id(IOThread *iothread);
> +AioContext *iothread_get_aio_context(IOThread *iothread);
> +
> +#endif /* IOTHREAD_H */
> diff --git a/iothread.c b/iothread.c
> new file mode 100644
> index 0000000..dbc6047
> --- /dev/null
> +++ b/iothread.c
> @@ -0,0 +1,115 @@
> +/*
> + * Event loop thread
> + *
> + * Copyright Red Hat Inc., 2013
> + *
> + * Authors:
> + *  Stefan Hajnoczi   <stefanha@redhat.com>
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + *
> + */
> +
> +#include "qom/object.h"
> +#include "qemu/module.h"
> +#include "qemu/thread.h"
> +#include "block/aio.h"
> +#include "sysemu/iothread.h"
> +
> +typedef ObjectClass IOThreadClass;
> +struct IOThread {
> +    Object parent;
> +    QemuThread thread;
> +    AioContext *ctx;
> +    bool stopping;
> +};
> +
> +#define IOTHREAD_GET_CLASS(obj) \
> +   OBJECT_GET_CLASS(IOThreadClass, obj, TYPE_IOTHREAD)
> +#define IOTHREAD_CLASS(klass) \
> +   OBJECT_CLASS_CHECK(IOThreadClass, klass, TYPE_IOTHREAD)
> +
> +static void *iothread_run(void *opaque)
> +{
> +    IOThread *iothread = opaque;
> +
> +    for (;;) {
> +        /* TODO can we optimize away acquire/release to only happen when
> +         * aio_notify() was called?
> +         */

Perhaps have the AioContext's notifier callback set a flag that can be
checked for afterward to determine whether we should release/re-acquire?
Calls to aio_context_acquire() could reset it upon acquisition, so we could
maybe do something like:

while(!iothread->stopping) {
    aio_context_acquire(iothread->ctx);
    while (!iothread->ctx->notified) {
        aio_poll(iothread->ctx, true);
    }
    aio_context_release(iothread->ctx);
}

> +        aio_context_acquire(iothread->ctx);
> +        if (iothread->stopping) {
> +            aio_context_release(iothread->ctx);
> +            break;
> +        }
> +        aio_poll(iothread->ctx, true);
> +        aio_context_release(iothread->ctx);
> +    }
> +    return NULL;
> +}
> +
> +static void iothread_instance_init(Object *obj)
> +{
> +    IOThread *iothread = IOTHREAD(obj);
> +
> +    iothread->stopping = false;
> +    iothread->ctx = aio_context_new();
> +
> +    /* This assumes .instance_init() is called from a thread with useful CPU
> +     * affinity for us to inherit.
> +     */

Is this assumption necessary/controllable? Couldn't we just expose the thread
id via QOM or some other interface so users/management can set the affinity
later?

> +    qemu_thread_create(&iothread->thread, iothread_run,
> +                       iothread, QEMU_THREAD_JOINABLE);
> +}
> +
> +static void iothread_instance_finalize(Object *obj)
> +{
> +    IOThread *iothread = IOTHREAD(obj);
> +
> +    iothread->stopping = true;
> +    aio_notify(iothread->ctx);
> +    qemu_thread_join(&iothread->thread);
> +    aio_context_unref(iothread->ctx);
> +}
> +
> +static const TypeInfo iothread_info = {
> +    .name = TYPE_IOTHREAD,
> +    .parent = TYPE_OBJECT,
> +    .instance_size = sizeof(IOThread),
> +    .instance_init = iothread_instance_init,
> +    .instance_finalize = iothread_instance_finalize,
> +};
> +
> +static void iothread_register_types(void)
> +{
> +    type_register_static(&iothread_info);
> +}
> +
> +type_init(iothread_register_types)
> +
> +IOThread *iothread_find(const char *id)
> +{
> +    Object *container = container_get(object_get_root(), IOTHREADS_PATH);
> +    Object *child;
> +
> +    child = object_property_get_link(container, id, NULL);
> +    if (!child) {
> +        return NULL;
> +    }
> +    return IOTHREAD(child);
> +}
> +
> +char *iothread_get_id(IOThread *iothread)
> +{
> +    /* The last path component is the identifier */
> +    char *path = object_get_canonical_path(OBJECT(iothread));
> +    char *id = g_strdup(&path[sizeof(IOTHREADS_PATH)]);
> +    g_free(path);
> +    return id;
> +}
> +
> +AioContext *iothread_get_aio_context(IOThread *iothread)
> +{
> +    return iothread->ctx;
> +}
> -- 
> 1.8.4.2


* Re: [Qemu-devel] [RFC 3/7] iothread: add I/O thread object
  2013-12-12 18:00   ` Michael Roth
@ 2013-12-13  9:20     ` Stefan Hajnoczi
  0 siblings, 0 replies; 10+ messages in thread
From: Stefan Hajnoczi @ 2013-12-13  9:20 UTC (permalink / raw)
  To: Michael Roth; +Cc: Kevin Wolf, Paolo Bonzini, qemu-devel, Stefan Hajnoczi

On Thu, Dec 12, 2013 at 12:00:12PM -0600, Michael Roth wrote:
> Quoting Stefan Hajnoczi (2013-12-12 07:19:40)
> > +static void *iothread_run(void *opaque)
> > +{
> > +    IOThread *iothread = opaque;
> > +
> > +    for (;;) {
> > +        /* TODO can we optimize away acquire/release to only happen when
> > +         * aio_notify() was called?
> > +         */
> 
> Perhaps have the AioContext's notifier callback set a flag that can be
> checked for afterward to determine whether we should release/re-acquire?
> Calls to aio_context_acquire() could reset it upon acquisition, so we could
> maybe do something like:
> 
> while(!iothread->stopping) {
>     aio_context_acquire(iothread->ctx);
>     while (!iothread->ctx->notified) {
>         aio_poll(iothread->ctx, true);
>     }
>     aio_context_release(iothread->ctx);
> }

When aio_notify() kicks aio_poll() it returns false.  So I was thinking of:

while (!iothread->stopping) {
    aio_context_acquire(iothread->ctx);
    while (!iothread->stopping && aio_poll(iothread->ctx, true)) {
        /* Progress was made, keep going */
    }
    aio_context_release(iothread->ctx);
}

I'll try it in the next version.  Just didn't want to get too fancy yet.
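
To check the shape of that loop outside QEMU, here is a rough standalone
sketch.  FakeIOThread and the fake_*() helpers are hypothetical stand-ins,
not the real AioContext API: fake_poll() replays a scripted list of
results, where false models a kick from aio_notify(), and it requests a
stop once the script runs out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    const bool *script;   /* scripted poll results; false = kicked */
    size_t len;
    size_t pos;
    int acquires;
    int releases;
    bool stopping;
} FakeIOThread;

static void fake_acquire(FakeIOThread *t) { t->acquires++; }
static void fake_release(FakeIOThread *t) { t->releases++; }

/* Returns the next scripted result; requests stop when the script ends */
static bool fake_poll(FakeIOThread *t)
{
    /* Must only be called with the context held */
    assert(t->acquires == t->releases + 1);

    if (t->pos >= t->len) {
        t->stopping = true;
        return false;
    }
    return t->script[t->pos++];
}

static void fake_iothread_run(FakeIOThread *t)
{
    while (!t->stopping) {
        fake_acquire(t);
        while (!t->stopping && fake_poll(t)) {
            /* Progress was made, keep going */
        }
        fake_release(t);
    }
}
```

With the script { true, true, false, true, true, true, false, true } the
loop takes the lock three times for nine poll calls, once per kick plus
once for the final stop, instead of once per poll as in the RFC version.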

> 
> > +        aio_context_acquire(iothread->ctx);
> > +        if (iothread->stopping) {
> > +            aio_context_release(iothread->ctx);
> > +            break;
> > +        }
> > +        aio_poll(iothread->ctx, true);
> > +        aio_context_release(iothread->ctx);
> > +    }
> > +    return NULL;
> > +}
> > +
> > +static void iothread_instance_init(Object *obj)
> > +{
> > +    IOThread *iothread = IOTHREAD(obj);
> > +
> > +    iothread->stopping = false;
> > +    iothread->ctx = aio_context_new();
> > +
> > +    /* This assumes .instance_init() is called from a thread with useful CPU
> > +     * affinity for us to inherit.
> > +     */
> 
> Is this assumption necessary/controllable? Couldn't we just expose the thread
> id via QOM or some other interface so users/management can set the affinity
> later?

This assumption holds since the monitor and command-line run in the main
thread.

The fix has traditionally been to create the thread from a BH scheduled
in the main loop.  That way it inherits the main thread's affinity.

We definitely need to expose tids via QOM/QMP.  That's something I'm
looking at QContext for.  Did you already implement an interface?

Stefan


end of thread, other threads:[~2013-12-13  9:20 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-12-12 13:19 [Qemu-devel] [RFC 0/7] dataplane: switch to N:M devices-per-thread model Stefan Hajnoczi
2013-12-12 13:19 ` [Qemu-devel] [RFC 1/7] rfifolock: add recursive FIFO lock Stefan Hajnoczi
2013-12-12 13:19 ` [Qemu-devel] [RFC 2/7] aio: add aio_context_acquire() and aio_context_release() Stefan Hajnoczi
2013-12-12 13:19 ` [Qemu-devel] [RFC 3/7] iothread: add I/O thread object Stefan Hajnoczi
2013-12-12 18:00   ` Michael Roth
2013-12-13  9:20     ` Stefan Hajnoczi
2013-12-12 13:19 ` [Qemu-devel] [RFC 4/7] iothread: command-line option Stefan Hajnoczi
2013-12-12 13:19 ` [Qemu-devel] [RFC 5/7] qdev: add get_pointer_and_free() for temporary strings Stefan Hajnoczi
2013-12-12 13:19 ` [Qemu-devel] [RFC 6/7] iothread: add "iothread" qdev property type Stefan Hajnoczi
2013-12-12 13:19 ` [Qemu-devel] [RFC 7/7] dataplane: replace internal thread with IOThread Stefan Hajnoczi
