* [Qemu-devel] [PATCH v5 00/12] block: Protect nested event loop with bdrv_drained_begin and bdrv_drained_end
From: Fam Zheng @ 2015-10-21  2:06 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, pbonzini, stefanha, qemu-block

v5: Rebase onto Kevin's block tree.

v4: Rebase onto master to fix the "bdrv_move_feature_fields" issue.

v3: Call bdrv_drain unconditionally in bdrv_drained_begin.
    Document the internal I/O implications between bdrv_drained_begin and end.

The nested aio_poll() calls in the block layer have a bug: new r/w requests
from ioeventfds and NBD exports are processed during the nested loop, which can
break the caller's semantics (qmp_transaction) or even invalidate its pointers
(bdrv_reopen).
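
The series fixes this by quiescing such sections explicitly. Roughly, the
resulting usage pattern is the following (an illustrative sketch only; the real
call sites are in the individual patches):

    bdrv_drained_begin(bs);   /* stop external request sources and drain bs */

    /* ... run a nested aio_poll() safely, e.g. during qmp_transaction
     * prepare/commit or bdrv_reopen: no new requests arrive from
     * ioeventfd or the NBD export while bs is quiesced ... */

    bdrv_drained_end(bs);     /* re-enable external request sources */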

Fam Zheng (12):
  aio: Add "is_external" flag for event handlers
  nbd: Mark fd handlers client type as "external"
  dataplane: Mark host notifiers' client type as "external"
  aio: introduce aio_{disable,enable}_external
  block: Introduce "drained begin/end" API
  block: Add "drained begin/end" for transactional external snapshot
  block: Add "drained begin/end" for transactional backup
  block: Add "drained begin/end" for transactional blockdev-backup
  block: Add "drained begin/end" for internal snapshot
  block: Introduce BlockDriver.bdrv_drain callback
  qed: Implement .bdrv_drain
  tests: Add test case for aio_disable_external

 aio-posix.c                     |  9 ++++-
 aio-win32.c                     |  8 +++-
 async.c                         |  3 +-
 block/curl.c                    | 14 ++++---
 block/io.c                      | 23 +++++++++++-
 block/iscsi.c                   |  9 ++---
 block/linux-aio.c               |  5 ++-
 block/nbd-client.c              | 10 +++--
 block/nfs.c                     | 17 ++++-----
 block/qed.c                     |  7 ++++
 block/sheepdog.c                | 38 ++++++++++++-------
 block/ssh.c                     |  5 ++-
 block/win32-aio.c               |  5 ++-
 blockdev.c                      | 37 ++++++++++++++++---
 hw/block/dataplane/virtio-blk.c |  5 ++-
 hw/scsi/virtio-scsi-dataplane.c | 22 +++++++----
 include/block/aio.h             | 39 ++++++++++++++++++++
 include/block/block.h           | 26 +++++++++++++
 include/block/block_int.h       |  8 ++++
 iohandler.c                     |  3 +-
 nbd.c                           |  4 +-
 tests/test-aio.c                | 82 ++++++++++++++++++++++++++++-------------
 22 files changed, 286 insertions(+), 93 deletions(-)

-- 
2.4.3


* [Qemu-devel] [PATCH v5 01/12] aio: Add "is_external" flag for event handlers
From: Fam Zheng @ 2015-10-21  2:06 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, pbonzini, stefanha, qemu-block

All callers pass in false for now; the handlers that really are external will
be switched to true in the coming patches.
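
For illustration, each converted call site changes mechanically along these
lines (a sketch with generic argument names, not a hunk from this patch):

    /* before */
    aio_set_fd_handler(ctx, fd, io_read, io_write, opaque);

    /* after: every existing caller passes false for now */
    aio_set_fd_handler(ctx, fd, false /* is_external */,
                       io_read, io_write, opaque);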

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 aio-posix.c                     |  6 ++++-
 aio-win32.c                     |  5 ++++
 async.c                         |  3 ++-
 block/curl.c                    | 14 +++++-----
 block/iscsi.c                   |  9 +++----
 block/linux-aio.c               |  5 ++--
 block/nbd-client.c              | 10 ++++---
 block/nfs.c                     | 17 +++++-------
 block/sheepdog.c                | 38 ++++++++++++++++++---------
 block/ssh.c                     |  5 ++--
 block/win32-aio.c               |  5 ++--
 hw/block/dataplane/virtio-blk.c |  6 +++--
 hw/scsi/virtio-scsi-dataplane.c | 24 +++++++++++------
 include/block/aio.h             |  2 ++
 iohandler.c                     |  3 ++-
 nbd.c                           |  4 ++-
 tests/test-aio.c                | 58 +++++++++++++++++++++++------------------
 17 files changed, 130 insertions(+), 84 deletions(-)

diff --git a/aio-posix.c b/aio-posix.c
index d477033..f0f9122 100644
--- a/aio-posix.c
+++ b/aio-posix.c
@@ -25,6 +25,7 @@ struct AioHandler
     IOHandler *io_write;
     int deleted;
     void *opaque;
+    bool is_external;
     QLIST_ENTRY(AioHandler) node;
 };
 
@@ -43,6 +44,7 @@ static AioHandler *find_aio_handler(AioContext *ctx, int fd)
 
 void aio_set_fd_handler(AioContext *ctx,
                         int fd,
+                        bool is_external,
                         IOHandler *io_read,
                         IOHandler *io_write,
                         void *opaque)
@@ -82,6 +84,7 @@ void aio_set_fd_handler(AioContext *ctx,
         node->io_read = io_read;
         node->io_write = io_write;
         node->opaque = opaque;
+        node->is_external = is_external;
 
         node->pfd.events = (io_read ? G_IO_IN | G_IO_HUP | G_IO_ERR : 0);
         node->pfd.events |= (io_write ? G_IO_OUT | G_IO_ERR : 0);
@@ -92,10 +95,11 @@ void aio_set_fd_handler(AioContext *ctx,
 
 void aio_set_event_notifier(AioContext *ctx,
                             EventNotifier *notifier,
+                            bool is_external,
                             EventNotifierHandler *io_read)
 {
     aio_set_fd_handler(ctx, event_notifier_get_fd(notifier),
-                       (IOHandler *)io_read, NULL, notifier);
+                       is_external, (IOHandler *)io_read, NULL, notifier);
 }
 
 bool aio_prepare(AioContext *ctx)
diff --git a/aio-win32.c b/aio-win32.c
index 50a6867..3110d85 100644
--- a/aio-win32.c
+++ b/aio-win32.c
@@ -28,11 +28,13 @@ struct AioHandler {
     GPollFD pfd;
     int deleted;
     void *opaque;
+    bool is_external;
     QLIST_ENTRY(AioHandler) node;
 };
 
 void aio_set_fd_handler(AioContext *ctx,
                         int fd,
+                        bool is_external,
                         IOHandler *io_read,
                         IOHandler *io_write,
                         void *opaque)
@@ -86,6 +88,7 @@ void aio_set_fd_handler(AioContext *ctx,
         node->opaque = opaque;
         node->io_read = io_read;
         node->io_write = io_write;
+        node->is_external = is_external;
 
         event = event_notifier_get_handle(&ctx->notifier);
         WSAEventSelect(node->pfd.fd, event,
@@ -98,6 +101,7 @@ void aio_set_fd_handler(AioContext *ctx,
 
 void aio_set_event_notifier(AioContext *ctx,
                             EventNotifier *e,
+                            bool is_external,
                             EventNotifierHandler *io_notify)
 {
     AioHandler *node;
@@ -133,6 +137,7 @@ void aio_set_event_notifier(AioContext *ctx,
             node->e = e;
             node->pfd.fd = (uintptr_t)event_notifier_get_handle(e);
             node->pfd.events = G_IO_IN;
+            node->is_external = is_external;
             QLIST_INSERT_HEAD(&ctx->aio_handlers, node, node);
 
             g_source_add_poll(&ctx->source, &node->pfd);
diff --git a/async.c b/async.c
index efce14b..bdc64a3 100644
--- a/async.c
+++ b/async.c
@@ -247,7 +247,7 @@ aio_ctx_finalize(GSource     *source)
     }
     qemu_mutex_unlock(&ctx->bh_lock);
 
-    aio_set_event_notifier(ctx, &ctx->notifier, NULL);
+    aio_set_event_notifier(ctx, &ctx->notifier, false, NULL);
     event_notifier_cleanup(&ctx->notifier);
     rfifolock_destroy(&ctx->lock);
     qemu_mutex_destroy(&ctx->bh_lock);
@@ -329,6 +329,7 @@ AioContext *aio_context_new(Error **errp)
     }
     g_source_set_can_recurse(&ctx->source, true);
     aio_set_event_notifier(ctx, &ctx->notifier,
+                           false,
                            (EventNotifierHandler *)
                            event_notifier_dummy_cb);
     ctx->thread_pool = NULL;
diff --git a/block/curl.c b/block/curl.c
index 032cc8a..8994182 100644
--- a/block/curl.c
+++ b/block/curl.c
@@ -154,18 +154,20 @@ static int curl_sock_cb(CURL *curl, curl_socket_t fd, int action,
     DPRINTF("CURL (AIO): Sock action %d on fd %d\n", action, fd);
     switch (action) {
         case CURL_POLL_IN:
-            aio_set_fd_handler(s->aio_context, fd, curl_multi_read,
-                               NULL, state);
+            aio_set_fd_handler(s->aio_context, fd, false,
+                               curl_multi_read, NULL, state);
             break;
         case CURL_POLL_OUT:
-            aio_set_fd_handler(s->aio_context, fd, NULL, curl_multi_do, state);
+            aio_set_fd_handler(s->aio_context, fd, false,
+                               NULL, curl_multi_do, state);
             break;
         case CURL_POLL_INOUT:
-            aio_set_fd_handler(s->aio_context, fd, curl_multi_read,
-                               curl_multi_do, state);
+            aio_set_fd_handler(s->aio_context, fd, false,
+                               curl_multi_read, curl_multi_do, state);
             break;
         case CURL_POLL_REMOVE:
-            aio_set_fd_handler(s->aio_context, fd, NULL, NULL, NULL);
+            aio_set_fd_handler(s->aio_context, fd, false,
+                               NULL, NULL, NULL);
             break;
     }
 
diff --git a/block/iscsi.c b/block/iscsi.c
index 93f1ee4..9a628b7 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -291,8 +291,8 @@ iscsi_set_events(IscsiLun *iscsilun)
     int ev = iscsi_which_events(iscsi);
 
     if (ev != iscsilun->events) {
-        aio_set_fd_handler(iscsilun->aio_context,
-                           iscsi_get_fd(iscsi),
+        aio_set_fd_handler(iscsilun->aio_context, iscsi_get_fd(iscsi),
+                           false,
                            (ev & POLLIN) ? iscsi_process_read : NULL,
                            (ev & POLLOUT) ? iscsi_process_write : NULL,
                            iscsilun);
@@ -1280,9 +1280,8 @@ static void iscsi_detach_aio_context(BlockDriverState *bs)
 {
     IscsiLun *iscsilun = bs->opaque;
 
-    aio_set_fd_handler(iscsilun->aio_context,
-                       iscsi_get_fd(iscsilun->iscsi),
-                       NULL, NULL, NULL);
+    aio_set_fd_handler(iscsilun->aio_context, iscsi_get_fd(iscsilun->iscsi),
+                       false, NULL, NULL, NULL);
     iscsilun->events = 0;
 
     if (iscsilun->nop_timer) {
diff --git a/block/linux-aio.c b/block/linux-aio.c
index c991443..88b0520 100644
--- a/block/linux-aio.c
+++ b/block/linux-aio.c
@@ -287,7 +287,7 @@ void laio_detach_aio_context(void *s_, AioContext *old_context)
 {
     struct qemu_laio_state *s = s_;
 
-    aio_set_event_notifier(old_context, &s->e, NULL);
+    aio_set_event_notifier(old_context, &s->e, false, NULL);
     qemu_bh_delete(s->completion_bh);
 }
 
@@ -296,7 +296,8 @@ void laio_attach_aio_context(void *s_, AioContext *new_context)
     struct qemu_laio_state *s = s_;
 
     s->completion_bh = aio_bh_new(new_context, qemu_laio_completion_bh, s);
-    aio_set_event_notifier(new_context, &s->e, qemu_laio_completion_cb);
+    aio_set_event_notifier(new_context, &s->e, false,
+                           qemu_laio_completion_cb);
 }
 
 void *laio_init(void)
diff --git a/block/nbd-client.c b/block/nbd-client.c
index e1bb919..b7fd17a 100644
--- a/block/nbd-client.c
+++ b/block/nbd-client.c
@@ -124,7 +124,7 @@ static int nbd_co_send_request(BlockDriverState *bs,
     s->send_coroutine = qemu_coroutine_self();
     aio_context = bdrv_get_aio_context(bs);
 
-    aio_set_fd_handler(aio_context, s->sock,
+    aio_set_fd_handler(aio_context, s->sock, false,
                        nbd_reply_ready, nbd_restart_write, bs);
     if (qiov) {
         if (!s->is_unix) {
@@ -144,7 +144,8 @@ static int nbd_co_send_request(BlockDriverState *bs,
     } else {
         rc = nbd_send_request(s->sock, request);
     }
-    aio_set_fd_handler(aio_context, s->sock, nbd_reply_ready, NULL, bs);
+    aio_set_fd_handler(aio_context, s->sock, false,
+                       nbd_reply_ready, NULL, bs);
     s->send_coroutine = NULL;
     qemu_co_mutex_unlock(&s->send_mutex);
     return rc;
@@ -348,14 +349,15 @@ int nbd_client_co_discard(BlockDriverState *bs, int64_t sector_num,
 void nbd_client_detach_aio_context(BlockDriverState *bs)
 {
     aio_set_fd_handler(bdrv_get_aio_context(bs),
-                       nbd_get_client_session(bs)->sock, NULL, NULL, NULL);
+                       nbd_get_client_session(bs)->sock,
+                       false, NULL, NULL, NULL);
 }
 
 void nbd_client_attach_aio_context(BlockDriverState *bs,
                                    AioContext *new_context)
 {
     aio_set_fd_handler(new_context, nbd_get_client_session(bs)->sock,
-                       nbd_reply_ready, NULL, bs);
+                       false, nbd_reply_ready, NULL, bs);
 }
 
 void nbd_client_close(BlockDriverState *bs)
diff --git a/block/nfs.c b/block/nfs.c
index 887a98e..fd79f89 100644
--- a/block/nfs.c
+++ b/block/nfs.c
@@ -63,11 +63,10 @@ static void nfs_set_events(NFSClient *client)
 {
     int ev = nfs_which_events(client->context);
     if (ev != client->events) {
-        aio_set_fd_handler(client->aio_context,
-                           nfs_get_fd(client->context),
+        aio_set_fd_handler(client->aio_context, nfs_get_fd(client->context),
+                           false,
                            (ev & POLLIN) ? nfs_process_read : NULL,
-                           (ev & POLLOUT) ? nfs_process_write : NULL,
-                           client);
+                           (ev & POLLOUT) ? nfs_process_write : NULL, client);
 
     }
     client->events = ev;
@@ -242,9 +241,8 @@ static void nfs_detach_aio_context(BlockDriverState *bs)
 {
     NFSClient *client = bs->opaque;
 
-    aio_set_fd_handler(client->aio_context,
-                       nfs_get_fd(client->context),
-                       NULL, NULL, NULL);
+    aio_set_fd_handler(client->aio_context, nfs_get_fd(client->context),
+                       false, NULL, NULL, NULL);
     client->events = 0;
 }
 
@@ -263,9 +261,8 @@ static void nfs_client_close(NFSClient *client)
         if (client->fh) {
             nfs_close(client->context, client->fh);
         }
-        aio_set_fd_handler(client->aio_context,
-                           nfs_get_fd(client->context),
-                           NULL, NULL, NULL);
+        aio_set_fd_handler(client->aio_context, nfs_get_fd(client->context),
+                           false, NULL, NULL, NULL);
         nfs_destroy_context(client->context);
     }
     memset(client, 0, sizeof(NFSClient));
diff --git a/block/sheepdog.c b/block/sheepdog.c
index e7e58b7..d80e4ed 100644
--- a/block/sheepdog.c
+++ b/block/sheepdog.c
@@ -651,14 +651,16 @@ static coroutine_fn void do_co_req(void *opaque)
     unsigned int *rlen = srco->rlen;
 
     co = qemu_coroutine_self();
-    aio_set_fd_handler(srco->aio_context, sockfd, NULL, restart_co_req, co);
+    aio_set_fd_handler(srco->aio_context, sockfd, false,
+                       NULL, restart_co_req, co);
 
     ret = send_co_req(sockfd, hdr, data, wlen);
     if (ret < 0) {
         goto out;
     }
 
-    aio_set_fd_handler(srco->aio_context, sockfd, restart_co_req, NULL, co);
+    aio_set_fd_handler(srco->aio_context, sockfd, false,
+                       restart_co_req, NULL, co);
 
     ret = qemu_co_recv(sockfd, hdr, sizeof(*hdr));
     if (ret != sizeof(*hdr)) {
@@ -683,7 +685,8 @@ static coroutine_fn void do_co_req(void *opaque)
 out:
     /* there is at most one request for this sockfd, so it is safe to
      * set each handler to NULL. */
-    aio_set_fd_handler(srco->aio_context, sockfd, NULL, NULL, NULL);
+    aio_set_fd_handler(srco->aio_context, sockfd, false,
+                       NULL, NULL, NULL);
 
     srco->ret = ret;
     srco->finished = true;
@@ -735,7 +738,8 @@ static coroutine_fn void reconnect_to_sdog(void *opaque)
     BDRVSheepdogState *s = opaque;
     AIOReq *aio_req, *next;
 
-    aio_set_fd_handler(s->aio_context, s->fd, NULL, NULL, NULL);
+    aio_set_fd_handler(s->aio_context, s->fd, false, NULL,
+                       NULL, NULL);
     close(s->fd);
     s->fd = -1;
 
@@ -938,7 +942,8 @@ static int get_sheep_fd(BDRVSheepdogState *s, Error **errp)
         return fd;
     }
 
-    aio_set_fd_handler(s->aio_context, fd, co_read_response, NULL, s);
+    aio_set_fd_handler(s->aio_context, fd, false,
+                       co_read_response, NULL, s);
     return fd;
 }
 
@@ -1199,7 +1204,7 @@ static void coroutine_fn add_aio_request(BDRVSheepdogState *s, AIOReq *aio_req,
 
     qemu_co_mutex_lock(&s->lock);
     s->co_send = qemu_coroutine_self();
-    aio_set_fd_handler(s->aio_context, s->fd,
+    aio_set_fd_handler(s->aio_context, s->fd, false,
                        co_read_response, co_write_request, s);
     socket_set_cork(s->fd, 1);
 
@@ -1218,7 +1223,8 @@ static void coroutine_fn add_aio_request(BDRVSheepdogState *s, AIOReq *aio_req,
     }
 out:
     socket_set_cork(s->fd, 0);
-    aio_set_fd_handler(s->aio_context, s->fd, co_read_response, NULL, s);
+    aio_set_fd_handler(s->aio_context, s->fd, false,
+                       co_read_response, NULL, s);
     s->co_send = NULL;
     qemu_co_mutex_unlock(&s->lock);
 }
@@ -1368,7 +1374,8 @@ static void sd_detach_aio_context(BlockDriverState *bs)
 {
     BDRVSheepdogState *s = bs->opaque;
 
-    aio_set_fd_handler(s->aio_context, s->fd, NULL, NULL, NULL);
+    aio_set_fd_handler(s->aio_context, s->fd, false, NULL,
+                       NULL, NULL);
 }
 
 static void sd_attach_aio_context(BlockDriverState *bs,
@@ -1377,7 +1384,8 @@ static void sd_attach_aio_context(BlockDriverState *bs,
     BDRVSheepdogState *s = bs->opaque;
 
     s->aio_context = new_context;
-    aio_set_fd_handler(new_context, s->fd, co_read_response, NULL, s);
+    aio_set_fd_handler(new_context, s->fd, false,
+                       co_read_response, NULL, s);
 }
 
 /* TODO Convert to fine grained options */
@@ -1490,7 +1498,8 @@ static int sd_open(BlockDriverState *bs, QDict *options, int flags,
     g_free(buf);
     return 0;
 out:
-    aio_set_fd_handler(bdrv_get_aio_context(bs), s->fd, NULL, NULL, NULL);
+    aio_set_fd_handler(bdrv_get_aio_context(bs), s->fd,
+                       false, NULL, NULL, NULL);
     if (s->fd >= 0) {
         closesocket(s->fd);
     }
@@ -1528,7 +1537,8 @@ static void sd_reopen_commit(BDRVReopenState *state)
     BDRVSheepdogState *s = state->bs->opaque;
 
     if (s->fd) {
-        aio_set_fd_handler(s->aio_context, s->fd, NULL, NULL, NULL);
+        aio_set_fd_handler(s->aio_context, s->fd, false,
+                           NULL, NULL, NULL);
         closesocket(s->fd);
     }
 
@@ -1551,7 +1561,8 @@ static void sd_reopen_abort(BDRVReopenState *state)
     }
 
     if (re_s->fd) {
-        aio_set_fd_handler(s->aio_context, re_s->fd, NULL, NULL, NULL);
+        aio_set_fd_handler(s->aio_context, re_s->fd, false,
+                           NULL, NULL, NULL);
         closesocket(re_s->fd);
     }
 
@@ -1935,7 +1946,8 @@ static void sd_close(BlockDriverState *bs)
         error_report("%s, %s", sd_strerror(rsp->result), s->name);
     }
 
-    aio_set_fd_handler(bdrv_get_aio_context(bs), s->fd, NULL, NULL, NULL);
+    aio_set_fd_handler(bdrv_get_aio_context(bs), s->fd,
+                       false, NULL, NULL, NULL);
     closesocket(s->fd);
     g_free(s->host_spec);
 }
diff --git a/block/ssh.c b/block/ssh.c
index d35b51f..af025c0 100644
--- a/block/ssh.c
+++ b/block/ssh.c
@@ -800,14 +800,15 @@ static coroutine_fn void set_fd_handler(BDRVSSHState *s, BlockDriverState *bs)
             rd_handler, wr_handler);
 
     aio_set_fd_handler(bdrv_get_aio_context(bs), s->sock,
-                       rd_handler, wr_handler, co);
+                       false, rd_handler, wr_handler, co);
 }
 
 static coroutine_fn void clear_fd_handler(BDRVSSHState *s,
                                           BlockDriverState *bs)
 {
     DPRINTF("s->sock=%d", s->sock);
-    aio_set_fd_handler(bdrv_get_aio_context(bs), s->sock, NULL, NULL, NULL);
+    aio_set_fd_handler(bdrv_get_aio_context(bs), s->sock,
+                       false, NULL, NULL, NULL);
 }
 
 /* A non-blocking call returned EAGAIN, so yield, ensuring the
diff --git a/block/win32-aio.c b/block/win32-aio.c
index 64e8682..bbf2f01 100644
--- a/block/win32-aio.c
+++ b/block/win32-aio.c
@@ -174,7 +174,7 @@ int win32_aio_attach(QEMUWin32AIOState *aio, HANDLE hfile)
 void win32_aio_detach_aio_context(QEMUWin32AIOState *aio,
                                   AioContext *old_context)
 {
-    aio_set_event_notifier(old_context, &aio->e, NULL);
+    aio_set_event_notifier(old_context, &aio->e, false, NULL);
     aio->is_aio_context_attached = false;
 }
 
@@ -182,7 +182,8 @@ void win32_aio_attach_aio_context(QEMUWin32AIOState *aio,
                                   AioContext *new_context)
 {
     aio->is_aio_context_attached = true;
-    aio_set_event_notifier(new_context, &aio->e, win32_aio_completion_cb);
+    aio_set_event_notifier(new_context, &aio->e, false,
+                           win32_aio_completion_cb);
 }
 
 QEMUWin32AIOState *win32_aio_init(void)
diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index 6106e46..f8716bc 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -283,7 +283,8 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s)
 
     /* Get this show started by hooking up our callbacks */
     aio_context_acquire(s->ctx);
-    aio_set_event_notifier(s->ctx, &s->host_notifier, handle_notify);
+    aio_set_event_notifier(s->ctx, &s->host_notifier, false,
+                           handle_notify);
     aio_context_release(s->ctx);
     return;
 
@@ -319,7 +320,8 @@ void virtio_blk_data_plane_stop(VirtIOBlockDataPlane *s)
     aio_context_acquire(s->ctx);
 
     /* Stop notifications for new requests from guest */
-    aio_set_event_notifier(s->ctx, &s->host_notifier, NULL);
+    aio_set_event_notifier(s->ctx, &s->host_notifier, false,
+                           NULL);
 
     /* Drain and switch bs back to the QEMU main loop */
     blk_set_aio_context(s->conf->conf.blk, qemu_get_aio_context());
diff --git a/hw/scsi/virtio-scsi-dataplane.c b/hw/scsi/virtio-scsi-dataplane.c
index 1248fd9..21f51df 100644
--- a/hw/scsi/virtio-scsi-dataplane.c
+++ b/hw/scsi/virtio-scsi-dataplane.c
@@ -60,7 +60,8 @@ static VirtIOSCSIVring *virtio_scsi_vring_init(VirtIOSCSI *s,
     r = g_new(VirtIOSCSIVring, 1);
     r->host_notifier = *virtio_queue_get_host_notifier(vq);
     r->guest_notifier = *virtio_queue_get_guest_notifier(vq);
-    aio_set_event_notifier(s->ctx, &r->host_notifier, handler);
+    aio_set_event_notifier(s->ctx, &r->host_notifier, false,
+                           handler);
 
     r->parent = s;
 
@@ -71,7 +72,8 @@ static VirtIOSCSIVring *virtio_scsi_vring_init(VirtIOSCSI *s,
     return r;
 
 fail_vring:
-    aio_set_event_notifier(s->ctx, &r->host_notifier, NULL);
+    aio_set_event_notifier(s->ctx, &r->host_notifier, false,
+                           NULL);
     k->set_host_notifier(qbus->parent, n, false);
     g_free(r);
     return NULL;
@@ -162,14 +164,17 @@ static void virtio_scsi_clear_aio(VirtIOSCSI *s)
     int i;
 
     if (s->ctrl_vring) {
-        aio_set_event_notifier(s->ctx, &s->ctrl_vring->host_notifier, NULL);
+        aio_set_event_notifier(s->ctx, &s->ctrl_vring->host_notifier,
+                               false, NULL);
     }
     if (s->event_vring) {
-        aio_set_event_notifier(s->ctx, &s->event_vring->host_notifier, NULL);
+        aio_set_event_notifier(s->ctx, &s->event_vring->host_notifier,
+                               false, NULL);
     }
     if (s->cmd_vrings) {
         for (i = 0; i < vs->conf.num_queues && s->cmd_vrings[i]; i++) {
-            aio_set_event_notifier(s->ctx, &s->cmd_vrings[i]->host_notifier, NULL);
+            aio_set_event_notifier(s->ctx, &s->cmd_vrings[i]->host_notifier,
+                                   false, NULL);
         }
     }
 }
@@ -290,10 +295,13 @@ void virtio_scsi_dataplane_stop(VirtIOSCSI *s)
 
     aio_context_acquire(s->ctx);
 
-    aio_set_event_notifier(s->ctx, &s->ctrl_vring->host_notifier, NULL);
-    aio_set_event_notifier(s->ctx, &s->event_vring->host_notifier, NULL);
+    aio_set_event_notifier(s->ctx, &s->ctrl_vring->host_notifier,
+                           false, NULL);
+    aio_set_event_notifier(s->ctx, &s->event_vring->host_notifier,
+                           false, NULL);
     for (i = 0; i < vs->conf.num_queues; i++) {
-        aio_set_event_notifier(s->ctx, &s->cmd_vrings[i]->host_notifier, NULL);
+        aio_set_event_notifier(s->ctx, &s->cmd_vrings[i]->host_notifier,
+                               false, NULL);
     }
 
     blk_drain_all(); /* ensure there are no in-flight requests */
diff --git a/include/block/aio.h b/include/block/aio.h
index 400b1b0..12f1141 100644
--- a/include/block/aio.h
+++ b/include/block/aio.h
@@ -299,6 +299,7 @@ bool aio_poll(AioContext *ctx, bool blocking);
  */
 void aio_set_fd_handler(AioContext *ctx,
                         int fd,
+                        bool is_external,
                         IOHandler *io_read,
                         IOHandler *io_write,
                         void *opaque);
@@ -312,6 +313,7 @@ void aio_set_fd_handler(AioContext *ctx,
  */
 void aio_set_event_notifier(AioContext *ctx,
                             EventNotifier *notifier,
+                            bool is_external,
                             EventNotifierHandler *io_read);
 
 /* Return a GSource that lets the main loop poll the file descriptors attached
diff --git a/iohandler.c b/iohandler.c
index 55f8501..eb68083 100644
--- a/iohandler.c
+++ b/iohandler.c
@@ -55,7 +55,8 @@ void qemu_set_fd_handler(int fd,
                          void *opaque)
 {
     iohandler_init();
-    aio_set_fd_handler(iohandler_ctx, fd, fd_read, fd_write, opaque);
+    aio_set_fd_handler(iohandler_ctx, fd, false,
+                       fd_read, fd_write, opaque);
 }
 
 /* reaping of zombies.  right now we're not passing the status to
diff --git a/nbd.c b/nbd.c
index 74859cb..fbc66be 100644
--- a/nbd.c
+++ b/nbd.c
@@ -1446,6 +1446,7 @@ static void nbd_set_handlers(NBDClient *client)
 {
     if (client->exp && client->exp->ctx) {
         aio_set_fd_handler(client->exp->ctx, client->sock,
+                           false,
                            client->can_read ? nbd_read : NULL,
                            client->send_coroutine ? nbd_restart_write : NULL,
                            client);
@@ -1455,7 +1456,8 @@ static void nbd_set_handlers(NBDClient *client)
 static void nbd_unset_handlers(NBDClient *client)
 {
     if (client->exp && client->exp->ctx) {
-        aio_set_fd_handler(client->exp->ctx, client->sock, NULL, NULL, NULL);
+        aio_set_fd_handler(client->exp->ctx, client->sock,
+                           false, NULL, NULL, NULL);
     }
 }
 
diff --git a/tests/test-aio.c b/tests/test-aio.c
index 217e337..03cd45d 100644
--- a/tests/test-aio.c
+++ b/tests/test-aio.c
@@ -118,6 +118,12 @@ static void *test_acquire_thread(void *opaque)
     return NULL;
 }
 
+static void set_event_notifier(AioContext *ctx, EventNotifier *notifier,
+                               EventNotifierHandler *handler)
+{
+    aio_set_event_notifier(ctx, notifier, false, handler);
+}
+
 static void dummy_notifier_read(EventNotifier *unused)
 {
     g_assert(false); /* should never be invoked */
@@ -131,7 +137,7 @@ static void test_acquire(void)
 
     /* Dummy event notifier ensures aio_poll() will block */
     event_notifier_init(&notifier, false);
-    aio_set_event_notifier(ctx, &notifier, dummy_notifier_read);
+    set_event_notifier(ctx, &notifier, dummy_notifier_read);
     g_assert(!aio_poll(ctx, false)); /* consume aio_notify() */
 
     qemu_mutex_init(&data.start_lock);
@@ -149,7 +155,7 @@ static void test_acquire(void)
     aio_context_release(ctx);
 
     qemu_thread_join(&thread);
-    aio_set_event_notifier(ctx, &notifier, NULL);
+    set_event_notifier(ctx, &notifier, NULL);
     event_notifier_cleanup(&notifier);
 
     g_assert(data.thread_acquired);
@@ -308,11 +314,11 @@ static void test_set_event_notifier(void)
 {
     EventNotifierTestData data = { .n = 0, .active = 0 };
     event_notifier_init(&data.e, false);
-    aio_set_event_notifier(ctx, &data.e, event_ready_cb);
+    set_event_notifier(ctx, &data.e, event_ready_cb);
     g_assert(!aio_poll(ctx, false));
     g_assert_cmpint(data.n, ==, 0);
 
-    aio_set_event_notifier(ctx, &data.e, NULL);
+    set_event_notifier(ctx, &data.e, NULL);
     g_assert(!aio_poll(ctx, false));
     g_assert_cmpint(data.n, ==, 0);
     event_notifier_cleanup(&data.e);
@@ -322,7 +328,7 @@ static void test_wait_event_notifier(void)
 {
     EventNotifierTestData data = { .n = 0, .active = 1 };
     event_notifier_init(&data.e, false);
-    aio_set_event_notifier(ctx, &data.e, event_ready_cb);
+    set_event_notifier(ctx, &data.e, event_ready_cb);
     while (aio_poll(ctx, false));
     g_assert_cmpint(data.n, ==, 0);
     g_assert_cmpint(data.active, ==, 1);
@@ -336,7 +342,7 @@ static void test_wait_event_notifier(void)
     g_assert_cmpint(data.n, ==, 1);
     g_assert_cmpint(data.active, ==, 0);
 
-    aio_set_event_notifier(ctx, &data.e, NULL);
+    set_event_notifier(ctx, &data.e, NULL);
     g_assert(!aio_poll(ctx, false));
     g_assert_cmpint(data.n, ==, 1);
 
@@ -347,7 +353,7 @@ static void test_flush_event_notifier(void)
 {
     EventNotifierTestData data = { .n = 0, .active = 10, .auto_set = true };
     event_notifier_init(&data.e, false);
-    aio_set_event_notifier(ctx, &data.e, event_ready_cb);
+    set_event_notifier(ctx, &data.e, event_ready_cb);
     while (aio_poll(ctx, false));
     g_assert_cmpint(data.n, ==, 0);
     g_assert_cmpint(data.active, ==, 10);
@@ -363,7 +369,7 @@ static void test_flush_event_notifier(void)
     g_assert_cmpint(data.active, ==, 0);
     g_assert(!aio_poll(ctx, false));
 
-    aio_set_event_notifier(ctx, &data.e, NULL);
+    set_event_notifier(ctx, &data.e, NULL);
     g_assert(!aio_poll(ctx, false));
     event_notifier_cleanup(&data.e);
 }
@@ -374,7 +380,7 @@ static void test_wait_event_notifier_noflush(void)
     EventNotifierTestData dummy = { .n = 0, .active = 1 };
 
     event_notifier_init(&data.e, false);
-    aio_set_event_notifier(ctx, &data.e, event_ready_cb);
+    set_event_notifier(ctx, &data.e, event_ready_cb);
 
     g_assert(!aio_poll(ctx, false));
     g_assert_cmpint(data.n, ==, 0);
@@ -387,7 +393,7 @@ static void test_wait_event_notifier_noflush(void)
 
     /* An active event notifier forces aio_poll to look at EventNotifiers.  */
     event_notifier_init(&dummy.e, false);
-    aio_set_event_notifier(ctx, &dummy.e, event_ready_cb);
+    set_event_notifier(ctx, &dummy.e, event_ready_cb);
 
     event_notifier_set(&data.e);
     g_assert(aio_poll(ctx, false));
@@ -407,10 +413,10 @@ static void test_wait_event_notifier_noflush(void)
     g_assert_cmpint(dummy.n, ==, 1);
     g_assert_cmpint(dummy.active, ==, 0);
 
-    aio_set_event_notifier(ctx, &dummy.e, NULL);
+    set_event_notifier(ctx, &dummy.e, NULL);
     event_notifier_cleanup(&dummy.e);
 
-    aio_set_event_notifier(ctx, &data.e, NULL);
+    set_event_notifier(ctx, &data.e, NULL);
     g_assert(!aio_poll(ctx, false));
     g_assert_cmpint(data.n, ==, 2);
 
@@ -428,7 +434,7 @@ static void test_timer_schedule(void)
      * an fd to wait on. Fixing this breaks other tests. So create a dummy one.
      */
     event_notifier_init(&e, false);
-    aio_set_event_notifier(ctx, &e, dummy_io_handler_read);
+    set_event_notifier(ctx, &e, dummy_io_handler_read);
     aio_poll(ctx, false);
 
     aio_timer_init(ctx, &data.timer, data.clock_type,
@@ -467,7 +473,7 @@ static void test_timer_schedule(void)
     g_assert(!aio_poll(ctx, false));
     g_assert_cmpint(data.n, ==, 2);
 
-    aio_set_event_notifier(ctx, &e, NULL);
+    set_event_notifier(ctx, &e, NULL);
     event_notifier_cleanup(&e);
 
     timer_del(&data.timer);
@@ -638,11 +644,11 @@ static void test_source_set_event_notifier(void)
 {
     EventNotifierTestData data = { .n = 0, .active = 0 };
     event_notifier_init(&data.e, false);
-    aio_set_event_notifier(ctx, &data.e, event_ready_cb);
+    set_event_notifier(ctx, &data.e, event_ready_cb);
     while (g_main_context_iteration(NULL, false));
     g_assert_cmpint(data.n, ==, 0);
 
-    aio_set_event_notifier(ctx, &data.e, NULL);
+    set_event_notifier(ctx, &data.e, NULL);
     while (g_main_context_iteration(NULL, false));
     g_assert_cmpint(data.n, ==, 0);
     event_notifier_cleanup(&data.e);
@@ -652,7 +658,7 @@ static void test_source_wait_event_notifier(void)
 {
     EventNotifierTestData data = { .n = 0, .active = 1 };
     event_notifier_init(&data.e, false);
-    aio_set_event_notifier(ctx, &data.e, event_ready_cb);
+    set_event_notifier(ctx, &data.e, event_ready_cb);
     while (g_main_context_iteration(NULL, false));
     g_assert_cmpint(data.n, ==, 0);
     g_assert_cmpint(data.active, ==, 1);
@@ -666,7 +672,7 @@ static void test_source_wait_event_notifier(void)
     g_assert_cmpint(data.n, ==, 1);
     g_assert_cmpint(data.active, ==, 0);
 
-    aio_set_event_notifier(ctx, &data.e, NULL);
+    set_event_notifier(ctx, &data.e, NULL);
     while (g_main_context_iteration(NULL, false));
     g_assert_cmpint(data.n, ==, 1);
 
@@ -677,7 +683,7 @@ static void test_source_flush_event_notifier(void)
 {
     EventNotifierTestData data = { .n = 0, .active = 10, .auto_set = true };
     event_notifier_init(&data.e, false);
-    aio_set_event_notifier(ctx, &data.e, event_ready_cb);
+    set_event_notifier(ctx, &data.e, event_ready_cb);
     while (g_main_context_iteration(NULL, false));
     g_assert_cmpint(data.n, ==, 0);
     g_assert_cmpint(data.active, ==, 10);
@@ -693,7 +699,7 @@ static void test_source_flush_event_notifier(void)
     g_assert_cmpint(data.active, ==, 0);
     g_assert(!g_main_context_iteration(NULL, false));
 
-    aio_set_event_notifier(ctx, &data.e, NULL);
+    set_event_notifier(ctx, &data.e, NULL);
     while (g_main_context_iteration(NULL, false));
     event_notifier_cleanup(&data.e);
 }
@@ -704,7 +710,7 @@ static void test_source_wait_event_notifier_noflush(void)
     EventNotifierTestData dummy = { .n = 0, .active = 1 };
 
     event_notifier_init(&data.e, false);
-    aio_set_event_notifier(ctx, &data.e, event_ready_cb);
+    set_event_notifier(ctx, &data.e, event_ready_cb);
 
     while (g_main_context_iteration(NULL, false));
     g_assert_cmpint(data.n, ==, 0);
@@ -717,7 +723,7 @@ static void test_source_wait_event_notifier_noflush(void)
 
     /* An active event notifier forces aio_poll to look at EventNotifiers.  */
     event_notifier_init(&dummy.e, false);
-    aio_set_event_notifier(ctx, &dummy.e, event_ready_cb);
+    set_event_notifier(ctx, &dummy.e, event_ready_cb);
 
     event_notifier_set(&data.e);
     g_assert(g_main_context_iteration(NULL, false));
@@ -737,10 +743,10 @@ static void test_source_wait_event_notifier_noflush(void)
     g_assert_cmpint(dummy.n, ==, 1);
     g_assert_cmpint(dummy.active, ==, 0);
 
-    aio_set_event_notifier(ctx, &dummy.e, NULL);
+    set_event_notifier(ctx, &dummy.e, NULL);
     event_notifier_cleanup(&dummy.e);
 
-    aio_set_event_notifier(ctx, &data.e, NULL);
+    set_event_notifier(ctx, &data.e, NULL);
     while (g_main_context_iteration(NULL, false));
     g_assert_cmpint(data.n, ==, 2);
 
@@ -759,7 +765,7 @@ static void test_source_timer_schedule(void)
      * an fd to wait on. Fixing this breaks other tests. So create a dummy one.
      */
     event_notifier_init(&e, false);
-    aio_set_event_notifier(ctx, &e, dummy_io_handler_read);
+    set_event_notifier(ctx, &e, dummy_io_handler_read);
     do {} while (g_main_context_iteration(NULL, false));
 
     aio_timer_init(ctx, &data.timer, data.clock_type,
@@ -784,7 +790,7 @@ static void test_source_timer_schedule(void)
     g_assert_cmpint(data.n, ==, 2);
     g_assert(qemu_clock_get_ns(data.clock_type) > expiry);
 
-    aio_set_event_notifier(ctx, &e, NULL);
+    set_event_notifier(ctx, &e, NULL);
     event_notifier_cleanup(&e);
 
     timer_del(&data.timer);
-- 
2.4.3


* [Qemu-devel] [PATCH v5 02/12] nbd: Mark fd handlers client type as "external"
From: Fam Zheng @ 2015-10-21  2:06 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, pbonzini, stefanha, qemu-block

This lets us distinguish them from internally used fds, and thus avoid
handling unwanted events in nested aio_poll() calls.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 nbd.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/nbd.c b/nbd.c
index fbc66be..dab1ebb 100644
--- a/nbd.c
+++ b/nbd.c
@@ -1446,7 +1446,7 @@ static void nbd_set_handlers(NBDClient *client)
 {
     if (client->exp && client->exp->ctx) {
         aio_set_fd_handler(client->exp->ctx, client->sock,
-                           false,
+                           true,
                            client->can_read ? nbd_read : NULL,
                            client->send_coroutine ? nbd_restart_write : NULL,
                            client);
@@ -1457,7 +1457,7 @@ static void nbd_unset_handlers(NBDClient *client)
 {
     if (client->exp && client->exp->ctx) {
         aio_set_fd_handler(client->exp->ctx, client->sock,
-                           false, NULL, NULL, NULL);
+                           true, NULL, NULL, NULL);
     }
 }
 
-- 
2.4.3


* [Qemu-devel] [PATCH v5 03/12] dataplane: Mark host notifiers' client type as "external"
From: Fam Zheng @ 2015-10-21  2:06 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, pbonzini, stefanha, qemu-block

They will be excluded by type from the nested event loops in the block layer,
so that unwanted events won't be processed there.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 hw/block/dataplane/virtio-blk.c |  5 ++---
 hw/scsi/virtio-scsi-dataplane.c | 18 ++++++++----------
 2 files changed, 10 insertions(+), 13 deletions(-)

diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index f8716bc..c42ddeb 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -283,7 +283,7 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s)
 
     /* Get this show started by hooking up our callbacks */
     aio_context_acquire(s->ctx);
-    aio_set_event_notifier(s->ctx, &s->host_notifier, false,
+    aio_set_event_notifier(s->ctx, &s->host_notifier, true,
                            handle_notify);
     aio_context_release(s->ctx);
     return;
@@ -320,8 +320,7 @@ void virtio_blk_data_plane_stop(VirtIOBlockDataPlane *s)
     aio_context_acquire(s->ctx);
 
     /* Stop notifications for new requests from guest */
-    aio_set_event_notifier(s->ctx, &s->host_notifier, false,
-                           NULL);
+    aio_set_event_notifier(s->ctx, &s->host_notifier, true, NULL);
 
     /* Drain and switch bs back to the QEMU main loop */
     blk_set_aio_context(s->conf->conf.blk, qemu_get_aio_context());
diff --git a/hw/scsi/virtio-scsi-dataplane.c b/hw/scsi/virtio-scsi-dataplane.c
index 21f51df..0d8d71e 100644
--- a/hw/scsi/virtio-scsi-dataplane.c
+++ b/hw/scsi/virtio-scsi-dataplane.c
@@ -60,8 +60,7 @@ static VirtIOSCSIVring *virtio_scsi_vring_init(VirtIOSCSI *s,
     r = g_new(VirtIOSCSIVring, 1);
     r->host_notifier = *virtio_queue_get_host_notifier(vq);
     r->guest_notifier = *virtio_queue_get_guest_notifier(vq);
-    aio_set_event_notifier(s->ctx, &r->host_notifier, false,
-                           handler);
+    aio_set_event_notifier(s->ctx, &r->host_notifier, true, handler);
 
     r->parent = s;
 
@@ -72,8 +71,7 @@ static VirtIOSCSIVring *virtio_scsi_vring_init(VirtIOSCSI *s,
     return r;
 
 fail_vring:
-    aio_set_event_notifier(s->ctx, &r->host_notifier, false,
-                           NULL);
+    aio_set_event_notifier(s->ctx, &r->host_notifier, true, NULL);
     k->set_host_notifier(qbus->parent, n, false);
     g_free(r);
     return NULL;
@@ -165,16 +163,16 @@ static void virtio_scsi_clear_aio(VirtIOSCSI *s)
 
     if (s->ctrl_vring) {
         aio_set_event_notifier(s->ctx, &s->ctrl_vring->host_notifier,
-                               false, NULL);
+                               true, NULL);
     }
     if (s->event_vring) {
         aio_set_event_notifier(s->ctx, &s->event_vring->host_notifier,
-                               false, NULL);
+                               true, NULL);
     }
     if (s->cmd_vrings) {
         for (i = 0; i < vs->conf.num_queues && s->cmd_vrings[i]; i++) {
             aio_set_event_notifier(s->ctx, &s->cmd_vrings[i]->host_notifier,
-                                   false, NULL);
+                                   true, NULL);
         }
     }
 }
@@ -296,12 +294,12 @@ void virtio_scsi_dataplane_stop(VirtIOSCSI *s)
     aio_context_acquire(s->ctx);
 
     aio_set_event_notifier(s->ctx, &s->ctrl_vring->host_notifier,
-                           false, NULL);
+                           true, NULL);
     aio_set_event_notifier(s->ctx, &s->event_vring->host_notifier,
-                           false, NULL);
+                           true, NULL);
     for (i = 0; i < vs->conf.num_queues; i++) {
         aio_set_event_notifier(s->ctx, &s->cmd_vrings[i]->host_notifier,
-                               false, NULL);
+                               true, NULL);
     }
 
     blk_drain_all(); /* ensure there are no in-flight requests */
-- 
2.4.3


* [Qemu-devel] [PATCH v5 04/12] aio: introduce aio_{disable, enable}_external
From: Fam Zheng @ 2015-10-21  2:06 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, pbonzini, stefanha, qemu-block

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 aio-posix.c         |  3 ++-
 aio-win32.c         |  3 ++-
 include/block/aio.h | 37 +++++++++++++++++++++++++++++++++++++
 3 files changed, 41 insertions(+), 2 deletions(-)

diff --git a/aio-posix.c b/aio-posix.c
index f0f9122..0467f23 100644
--- a/aio-posix.c
+++ b/aio-posix.c
@@ -261,7 +261,8 @@ bool aio_poll(AioContext *ctx, bool blocking)
 
     /* fill pollfds */
     QLIST_FOREACH(node, &ctx->aio_handlers, node) {
-        if (!node->deleted && node->pfd.events) {
+        if (!node->deleted && node->pfd.events
+            && aio_node_check(ctx, node->is_external)) {
             add_pollfd(node);
         }
     }
diff --git a/aio-win32.c b/aio-win32.c
index 3110d85..43c4c79 100644
--- a/aio-win32.c
+++ b/aio-win32.c
@@ -309,7 +309,8 @@ bool aio_poll(AioContext *ctx, bool blocking)
     /* fill fd sets */
     count = 0;
     QLIST_FOREACH(node, &ctx->aio_handlers, node) {
-        if (!node->deleted && node->io_notify) {
+        if (!node->deleted && node->io_notify
+            && aio_node_check(ctx, node->is_external)) {
             events[count++] = event_notifier_get_handle(node->e);
         }
     }
diff --git a/include/block/aio.h b/include/block/aio.h
index 12f1141..80151d1 100644
--- a/include/block/aio.h
+++ b/include/block/aio.h
@@ -122,6 +122,8 @@ struct AioContext {
 
     /* TimerLists for calling timers - one per clock type */
     QEMUTimerListGroup tlg;
+
+    int external_disable_cnt;
 };
 
 /**
@@ -375,4 +377,39 @@ static inline void aio_timer_init(AioContext *ctx,
  */
 int64_t aio_compute_timeout(AioContext *ctx);
 
+/**
+ * aio_disable_external:
+ * @ctx: the aio context
+ *
+ * Disable further processing of external clients.
+ */
+static inline void aio_disable_external(AioContext *ctx)
+{
+    atomic_inc(&ctx->external_disable_cnt);
+}
+
+/**
+ * aio_enable_external:
+ * @ctx: the aio context
+ *
+ * Enable the processing of external clients.
+ */
+static inline void aio_enable_external(AioContext *ctx)
+{
+    atomic_dec(&ctx->external_disable_cnt);
+}
+
+/**
+ * aio_node_check:
+ * @ctx: the aio context
+ * @is_external: Whether or not the checked node is an external event source.
+ *
+ * Check if the node's is_external flag is okay to be polled by the ctx at this
+ * moment. True means green light.
+ */
+static inline bool aio_node_check(AioContext *ctx, bool is_external)
+{
+    return !is_external || !atomic_read(&ctx->external_disable_cnt);
+}
+
 #endif
-- 
2.4.3


* [Qemu-devel] [PATCH v5 05/12] block: Introduce "drained begin/end" API
From: Fam Zheng @ 2015-10-21  2:06 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, pbonzini, stefanha, qemu-block

The semantics are that after bdrv_drained_begin(bs), bs will not receive new
external requests until the matching bdrv_drained_end(bs).
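
For example, a minimal sketch of the intended usage (sections may nest, per
the quiesce_counter below):

    bdrv_drained_begin(bs);   /* counter 0->1: disable external events, drain */
    bdrv_drained_begin(bs);   /* counter 1->2: nested section, drains again   */

    /* ... no new external requests can reach bs here ... */

    bdrv_drained_end(bs);     /* counter 2->1: still quiesced                 */
    bdrv_drained_end(bs);     /* counter 1->0: external events re-enabled     */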

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 block/io.c                | 17 +++++++++++++++++
 include/block/block.h     | 21 +++++++++++++++++++++
 include/block/block_int.h |  2 ++
 3 files changed, 40 insertions(+)

diff --git a/block/io.c b/block/io.c
index 2fd7a1d..5ac6256 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2624,3 +2624,20 @@ void bdrv_flush_io_queue(BlockDriverState *bs)
     }
     bdrv_start_throttled_reqs(bs);
 }
+
+void bdrv_drained_begin(BlockDriverState *bs)
+{
+    if (!bs->quiesce_counter++) {
+        aio_disable_external(bdrv_get_aio_context(bs));
+    }
+    bdrv_drain(bs);
+}
+
+void bdrv_drained_end(BlockDriverState *bs)
+{
+    assert(bs->quiesce_counter > 0);
+    if (--bs->quiesce_counter > 0) {
+        return;
+    }
+    aio_enable_external(bdrv_get_aio_context(bs));
+}
diff --git a/include/block/block.h b/include/block/block.h
index 28d903c..6d38b62 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -610,4 +610,25 @@ void bdrv_io_plug(BlockDriverState *bs);
 void bdrv_io_unplug(BlockDriverState *bs);
 void bdrv_flush_io_queue(BlockDriverState *bs);
 
+BlockAcctStats *bdrv_get_stats(BlockDriverState *bs);
+
+/**
+ * bdrv_drained_begin:
+ *
+ * Begin a quiesced section for exclusive access to the BDS, by disabling
+ * external request sources including NBD server and device model. Note that
+ * this doesn't block timers or coroutines from submitting more requests, which
+ * means block_job_pause is still necessary.
+ *
+ * This function can be recursive.
+ */
+void bdrv_drained_begin(BlockDriverState *bs);
+
+/**
+ * bdrv_drained_end:
+ *
+ * End a quiescent section started by bdrv_drained_begin().
+ */
+void bdrv_drained_end(BlockDriverState *bs);
+
 #endif
diff --git a/include/block/block_int.h b/include/block/block_int.h
index e472a03..e317b14 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -448,6 +448,8 @@ struct BlockDriverState {
     /* threshold limit for writes, in bytes. "High water mark". */
     uint64_t write_threshold_offset;
     NotifierWithReturn write_threshold_notifier;
+
+    int quiesce_counter;
 };
 
 struct BlockBackendRootState {
-- 
2.4.3


* [Qemu-devel] [PATCH v5 06/12] block: Add "drained begin/end" for transactional external snapshot
From: Fam Zheng @ 2015-10-21  2:06 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, pbonzini, stefanha, qemu-block

This ensures the atomicity of the transaction by avoiding processing of
external requests such as those from ioeventfd.
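
Roughly, the action's lifecycle becomes the following (a sketch assuming the
usual prepare -> commit/abort -> clean ordering of BdrvActionOps; error paths
omitted):

    /* .prepare */
    state->aio_context = bdrv_get_aio_context(state->old_bs);
    aio_context_acquire(state->aio_context);
    bdrv_drained_begin(state->old_bs);

    /* .commit or .abort runs with external events disabled */

    /* .clean */
    bdrv_drained_end(state->old_bs);
    aio_context_release(state->aio_context);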

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 blockdev.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 35e5719..e4a5eb4 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1569,6 +1569,7 @@ static void external_snapshot_prepare(BlkTransactionState *common,
     /* Acquire AioContext now so any threads operating on old_bs stop */
     state->aio_context = bdrv_get_aio_context(state->old_bs);
     aio_context_acquire(state->aio_context);
+    bdrv_drained_begin(state->old_bs);
 
     if (!bdrv_is_inserted(state->old_bs)) {
         error_setg(errp, QERR_DEVICE_HAS_NO_MEDIUM, device);
@@ -1638,8 +1639,6 @@ static void external_snapshot_commit(BlkTransactionState *common)
      * don't want to abort all of them if one of them fails the reopen */
     bdrv_reopen(state->old_bs, state->old_bs->open_flags & ~BDRV_O_RDWR,
                 NULL);
-
-    aio_context_release(state->aio_context);
 }
 
 static void external_snapshot_abort(BlkTransactionState *common)
@@ -1649,7 +1648,14 @@ static void external_snapshot_abort(BlkTransactionState *common)
     if (state->new_bs) {
         bdrv_unref(state->new_bs);
     }
+}
+
+static void external_snapshot_clean(BlkTransactionState *common)
+{
+    ExternalSnapshotState *state =
+                             DO_UPCAST(ExternalSnapshotState, common, common);
     if (state->aio_context) {
+        bdrv_drained_end(state->old_bs);
         aio_context_release(state->aio_context);
     }
 }
@@ -1809,6 +1815,7 @@ static const BdrvActionOps actions[] = {
         .prepare  = external_snapshot_prepare,
         .commit   = external_snapshot_commit,
         .abort = external_snapshot_abort,
+        .clean = external_snapshot_clean,
     },
     [TRANSACTION_ACTION_KIND_DRIVE_BACKUP] = {
         .instance_size = sizeof(DriveBackupState),
-- 
2.4.3


* [Qemu-devel] [PATCH v5 07/12] block: Add "drained begin/end" for transactional backup
From: Fam Zheng @ 2015-10-21  2:06 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, pbonzini, stefanha, qemu-block

This ensures the atomicity of the transaction by avoiding processing of
external requests such as those from ioeventfd.

Move the assignment to state->bs up to right after bdrv_drained_begin, so that
we can use it in the clean callback. The abort callback still checks bs->job
and state->job, so this is OK.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 blockdev.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/blockdev.c b/blockdev.c
index e4a5eb4..0a7848b 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1684,9 +1684,16 @@ static void drive_backup_prepare(BlkTransactionState *common, Error **errp)
         return;
     }
 
+    if (!blk_is_available(blk)) {
+        error_setg(errp, QERR_DEVICE_HAS_NO_MEDIUM, backup->device);
+        return;
+    }
+
     /* AioContext is released in .clean() */
     state->aio_context = blk_get_aio_context(blk);
     aio_context_acquire(state->aio_context);
+    bdrv_drained_begin(blk_bs(blk));
+    state->bs = blk_bs(blk);
 
     qmp_drive_backup(backup->device, backup->target,
                      backup->has_format, backup->format,
@@ -1702,7 +1709,6 @@ static void drive_backup_prepare(BlkTransactionState *common, Error **errp)
         return;
     }
 
-    state->bs = blk_bs(blk);
     state->job = state->bs->job;
 }
 
@@ -1722,6 +1728,7 @@ static void drive_backup_clean(BlkTransactionState *common)
     DriveBackupState *state = DO_UPCAST(DriveBackupState, common, common);
 
     if (state->aio_context) {
+        bdrv_drained_end(state->bs);
         aio_context_release(state->aio_context);
     }
 }
-- 
2.4.3


* [Qemu-devel] [PATCH v5 08/12] block: Add "drained begin/end" for transactional blockdev-backup
From: Fam Zheng @ 2015-10-21  2:06 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, pbonzini, stefanha, qemu-block

Similar to the previous patch, make sure that external events are not
dispatched during transaction operations.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 blockdev.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/blockdev.c b/blockdev.c
index 0a7848b..52f44b2 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1756,6 +1756,11 @@ static void blockdev_backup_prepare(BlkTransactionState *common, Error **errp)
         return;
     }
 
+    if (!blk_is_available(blk)) {
+        error_setg(errp, QERR_DEVICE_HAS_NO_MEDIUM, backup->device);
+        return;
+    }
+
     target = blk_by_name(backup->target);
     if (!target) {
         error_setg(errp, "Device '%s' not found", backup->target);
@@ -1770,6 +1775,8 @@ static void blockdev_backup_prepare(BlkTransactionState *common, Error **errp)
         return;
     }
     aio_context_acquire(state->aio_context);
+    state->bs = blk_bs(blk);
+    bdrv_drained_begin(state->bs);
 
     qmp_blockdev_backup(backup->device, backup->target,
                         backup->sync,
@@ -1782,7 +1789,6 @@ static void blockdev_backup_prepare(BlkTransactionState *common, Error **errp)
         return;
     }
 
-    state->bs = blk_bs(blk);
     state->job = state->bs->job;
 }
 
@@ -1802,6 +1808,7 @@ static void blockdev_backup_clean(BlkTransactionState *common)
     BlockdevBackupState *state = DO_UPCAST(BlockdevBackupState, common, common);
 
     if (state->aio_context) {
+        bdrv_drained_end(state->bs);
         aio_context_release(state->aio_context);
     }
 }
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [Qemu-devel] [PATCH v5 09/12] block: Add "drained begin/end" for internal snapshot
  2015-10-21  2:06 [Qemu-devel] [PATCH v5 00/12] block: Protect nested event loop with bdrv_drained_begin and bdrv_drained_end Fam Zheng
                   ` (7 preceding siblings ...)
  2015-10-21  2:06 ` [Qemu-devel] [PATCH v5 08/12] block: Add "drained begin/end" for transactional blockdev-backup Fam Zheng
@ 2015-10-21  2:06 ` Fam Zheng
  2015-10-21 13:37   ` Kevin Wolf
  2015-10-21 18:22   ` [Qemu-devel] [Qemu-block] " Jeff Cody
  2015-10-21  2:06 ` [Qemu-devel] [PATCH v5 10/12] block: Introduce BlockDriver.bdrv_drain callback Fam Zheng
                   ` (3 subsequent siblings)
  12 siblings, 2 replies; 29+ messages in thread
From: Fam Zheng @ 2015-10-21  2:06 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, pbonzini, stefanha, qemu-block

This ensures the atomicity of the transaction by avoiding processing of
external requests such as those from ioeventfd.

state->bs is assigned right before bdrv_drained_begin. Because it was
previously used in the abort callback as the flag for whether a snapshot
was created, we now need a separate flag - InternalSnapshotState.created.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 blockdev.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 52f44b2..92c2d0d 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1370,6 +1370,7 @@ typedef struct InternalSnapshotState {
     BlockDriverState *bs;
     AioContext *aio_context;
     QEMUSnapshotInfo sn;
+    bool created;
 } InternalSnapshotState;
 
 static void internal_snapshot_prepare(BlkTransactionState *common,
@@ -1407,6 +1408,8 @@ static void internal_snapshot_prepare(BlkTransactionState *common,
     /* AioContext is released in .clean() */
     state->aio_context = blk_get_aio_context(blk);
     aio_context_acquire(state->aio_context);
+    state->bs = blk_bs(blk);
+    bdrv_drained_begin(state->bs);
 
     if (!blk_is_available(blk)) {
         error_setg(errp, QERR_DEVICE_HAS_NO_MEDIUM, device);
@@ -1465,7 +1468,7 @@ static void internal_snapshot_prepare(BlkTransactionState *common,
     }
 
     /* 4. succeed, mark a snapshot is created */
-    state->bs = bs;
+    state->created = true;
 }
 
 static void internal_snapshot_abort(BlkTransactionState *common)
@@ -1476,7 +1479,7 @@ static void internal_snapshot_abort(BlkTransactionState *common)
     QEMUSnapshotInfo *sn = &state->sn;
     Error *local_error = NULL;
 
-    if (!bs) {
+    if (!state->created) {
         return;
     }
 
@@ -1497,6 +1500,7 @@ static void internal_snapshot_clean(BlkTransactionState *common)
                                              common, common);
 
     if (state->aio_context) {
+        bdrv_drained_end(state->bs);
         aio_context_release(state->aio_context);
     }
 }
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [Qemu-devel] [PATCH v5 10/12] block: Introduce BlockDriver.bdrv_drain callback
  2015-10-21  2:06 [Qemu-devel] [PATCH v5 00/12] block: Protect nested event loop with bdrv_drained_begin and bdrv_drained_end Fam Zheng
                   ` (8 preceding siblings ...)
  2015-10-21  2:06 ` [Qemu-devel] [PATCH v5 09/12] block: Add "drained begin/end" for internal snapshot Fam Zheng
@ 2015-10-21  2:06 ` Fam Zheng
  2015-10-21 18:25   ` [Qemu-devel] [Qemu-block] " Jeff Cody
  2015-10-21  2:06 ` [Qemu-devel] [PATCH v5 11/12] qed: Implement .bdrv_drain Fam Zheng
                   ` (2 subsequent siblings)
  12 siblings, 1 reply; 29+ messages in thread
From: Fam Zheng @ 2015-10-21  2:06 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, pbonzini, stefanha, qemu-block

Drivers can have internal request sources that generate I/O, like the
need_check_timer in QED. Since we want quiesced periods that contain
nested event loops in the block layer, we need a way to disable such
event sources.

Block drivers must implement the "bdrv_drain" callback if they have any
internal sources that can generate I/O activity, like a timer or a
worker thread (even in a library) that can schedule a QEMUBH in an
asynchronous callback.

Update the comments of bdrv_drain and bdrv_drained_begin accordingly.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 block/io.c                | 6 +++++-
 include/block/block.h     | 9 +++++++--
 include/block/block_int.h | 6 ++++++
 3 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/block/io.c b/block/io.c
index 5ac6256..999df63 100644
--- a/block/io.c
+++ b/block/io.c
@@ -235,7 +235,8 @@ bool bdrv_requests_pending(BlockDriverState *bs)
 }
 
 /*
- * Wait for pending requests to complete on a single BlockDriverState subtree
+ * Wait for pending requests to complete on a single BlockDriverState subtree,
+ * and suspend block driver's internal I/O until next request arrives.
  *
  * Note that unlike bdrv_drain_all(), the caller must hold the BlockDriverState
  * AioContext.
@@ -248,6 +249,9 @@ void bdrv_drain(BlockDriverState *bs)
 {
     bool busy = true;
 
+    if (bs->drv && bs->drv->bdrv_drain) {
+        bs->drv->bdrv_drain(bs);
+    }
     while (busy) {
         /* Keep iterating */
          bdrv_flush_io_queue(bs);
diff --git a/include/block/block.h b/include/block/block.h
index 6d38b62..cd58a32 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -617,8 +617,13 @@ BlockAcctStats *bdrv_get_stats(BlockDriverState *bs);
  *
  * Begin a quiesced section for exclusive access to the BDS, by disabling
  * external request sources including NBD server and device model. Note that
- * this doesn't block timers or coroutines from submitting more requests, which
- * means block_job_pause is still necessary.
+ * this doesn't prevent timers or coroutines from submitting more requests,
+ * which means block_job_pause is still necessary.
+ *
+ * If new I/O requests are submitted after bdrv_drained_begin is called before
+ * bdrv_drained_end, more internal I/O might be going on after the request has
+ * been completed. If you don't want this, you have to issue another bdrv_drain
+ * or use a nested bdrv_drained_begin/end section.
  *
  * This function can be recursive.
  */
diff --git a/include/block/block_int.h b/include/block/block_int.h
index e317b14..73eba05 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -288,6 +288,12 @@ struct BlockDriver {
      */
     int (*bdrv_probe_geometry)(BlockDriverState *bs, HDGeometry *geo);
 
+    /**
+     * Drain and stop any internal sources of requests in the driver, and
+     * remain so until next I/O callback (e.g. bdrv_co_writev) is called.
+     */
+    void (*bdrv_drain)(BlockDriverState *bs);
+
     QLIST_ENTRY(BlockDriver) list;
 };
 
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [Qemu-devel] [PATCH v5 11/12] qed: Implement .bdrv_drain
  2015-10-21  2:06 [Qemu-devel] [PATCH v5 00/12] block: Protect nested event loop with bdrv_drained_begin and bdrv_drained_end Fam Zheng
                   ` (9 preceding siblings ...)
  2015-10-21  2:06 ` [Qemu-devel] [PATCH v5 10/12] block: Introduce BlockDriver.bdrv_drain callback Fam Zheng
@ 2015-10-21  2:06 ` Fam Zheng
  2015-10-22  2:20   ` [Qemu-devel] [Qemu-block] " Jeff Cody
  2015-10-21  2:06 ` [Qemu-devel] [PATCH v5 12/12] tests: Add test case for aio_disable_external Fam Zheng
  2015-10-21 13:40 ` [Qemu-devel] [PATCH v5 00/12] block: Protect nested event loop with bdrv_drained_begin and bdrv_drained_end Kevin Wolf
  12 siblings, 1 reply; 29+ messages in thread
From: Fam Zheng @ 2015-10-21  2:06 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, pbonzini, stefanha, qemu-block

The "need_check_timer" is used to clear the "NEED_CHECK" flag in the
image header after a grace period once metadata update has finished. In
compliance to the bdrv_drain semantics we should make sure it remains
deleted once .bdrv_drain is called.

Call the qed_need_check_timer_cb manually to update the header
immediately.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 block/qed.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/block/qed.c b/block/qed.c
index 5ea05d4..e9dcb4d 100644
--- a/block/qed.c
+++ b/block/qed.c
@@ -375,6 +375,12 @@ static void bdrv_qed_attach_aio_context(BlockDriverState *bs,
     }
 }
 
+static void bdrv_qed_drain(BlockDriverState *bs)
+{
+    qed_cancel_need_check_timer(bs->opaque);
+    qed_need_check_timer_cb(bs->opaque);
+}
+
 static int bdrv_qed_open(BlockDriverState *bs, QDict *options, int flags,
                          Error **errp)
 {
@@ -1676,6 +1682,7 @@ static BlockDriver bdrv_qed = {
     .bdrv_check               = bdrv_qed_check,
     .bdrv_detach_aio_context  = bdrv_qed_detach_aio_context,
     .bdrv_attach_aio_context  = bdrv_qed_attach_aio_context,
+    .bdrv_drain               = bdrv_qed_drain,
 };
 
 static void bdrv_qed_init(void)
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [Qemu-devel] [PATCH v5 12/12] tests: Add test case for aio_disable_external
  2015-10-21  2:06 [Qemu-devel] [PATCH v5 00/12] block: Protect nested event loop with bdrv_drained_begin and bdrv_drained_end Fam Zheng
                   ` (10 preceding siblings ...)
  2015-10-21  2:06 ` [Qemu-devel] [PATCH v5 11/12] qed: Implement .bdrv_drain Fam Zheng
@ 2015-10-21  2:06 ` Fam Zheng
  2015-10-21 13:40 ` [Qemu-devel] [PATCH v5 00/12] block: Protect nested event loop with bdrv_drained_begin and bdrv_drained_end Kevin Wolf
  12 siblings, 0 replies; 29+ messages in thread
From: Fam Zheng @ 2015-10-21  2:06 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, pbonzini, stefanha, qemu-block

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 tests/test-aio.c | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/tests/test-aio.c b/tests/test-aio.c
index 03cd45d..1623803 100644
--- a/tests/test-aio.c
+++ b/tests/test-aio.c
@@ -374,6 +374,29 @@ static void test_flush_event_notifier(void)
     event_notifier_cleanup(&data.e);
 }
 
+static void test_aio_external_client(void)
+{
+    int i, j;
+
+    for (i = 1; i < 3; i++) {
+        EventNotifierTestData data = { .n = 0, .active = 10, .auto_set = true };
+        event_notifier_init(&data.e, false);
+        aio_set_event_notifier(ctx, &data.e, true, event_ready_cb);
+        event_notifier_set(&data.e);
+        for (j = 0; j < i; j++) {
+            aio_disable_external(ctx);
+        }
+        for (j = 0; j < i; j++) {
+            assert(!aio_poll(ctx, false));
+            assert(event_notifier_test_and_clear(&data.e));
+            event_notifier_set(&data.e);
+            aio_enable_external(ctx);
+        }
+        assert(aio_poll(ctx, false));
+        event_notifier_cleanup(&data.e);
+    }
+}
+
 static void test_wait_event_notifier_noflush(void)
 {
     EventNotifierTestData data = { .n = 0 };
@@ -832,6 +855,7 @@ int main(int argc, char **argv)
     g_test_add_func("/aio/event/wait",              test_wait_event_notifier);
     g_test_add_func("/aio/event/wait/no-flush-cb",  test_wait_event_notifier_noflush);
     g_test_add_func("/aio/event/flush",             test_flush_event_notifier);
+    g_test_add_func("/aio/external-client",         test_aio_external_client);
     g_test_add_func("/aio/timer/schedule",          test_timer_schedule);
 
     g_test_add_func("/aio-gsource/flush",                   test_source_flush);
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [Qemu-devel] [PATCH v5 09/12] block: Add "drained begin/end" for internal snapshot
  2015-10-21  2:06 ` [Qemu-devel] [PATCH v5 09/12] block: Add "drained begin/end" for internal snapshot Fam Zheng
@ 2015-10-21 13:37   ` Kevin Wolf
  2015-10-21 18:22   ` [Qemu-devel] [Qemu-block] " Jeff Cody
  1 sibling, 0 replies; 29+ messages in thread
From: Kevin Wolf @ 2015-10-21 13:37 UTC (permalink / raw)
  To: Fam Zheng; +Cc: pbonzini, stefanha, qemu-devel, qemu-block

On 21.10.2015 at 04:06, Fam Zheng wrote:
> This ensures the atomicity of the transaction by avoiding processing of
> external requests such as those from ioeventfd.
> 
> state->bs is assigned right before bdrv_drained_begin. Because it was
> previously used in the abort callback as the flag for whether a snapshot
> was created, we now need a separate flag - InternalSnapshotState.created.
> 
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  blockdev.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/blockdev.c b/blockdev.c
> index 52f44b2..92c2d0d 100644
> --- a/blockdev.c
> +++ b/blockdev.c
> @@ -1370,6 +1370,7 @@ typedef struct InternalSnapshotState {
>      BlockDriverState *bs;
>      AioContext *aio_context;
>      QEMUSnapshotInfo sn;
> +    bool created;
>  } InternalSnapshotState;
>  
>  static void internal_snapshot_prepare(BlkTransactionState *common,
> @@ -1407,6 +1408,8 @@ static void internal_snapshot_prepare(BlkTransactionState *common,
>      /* AioContext is released in .clean() */
>      state->aio_context = blk_get_aio_context(blk);
>      aio_context_acquire(state->aio_context);
> +    state->bs = blk_bs(blk);
> +    bdrv_drained_begin(state->bs);
>  
>      if (!blk_is_available(blk)) {
>          error_setg(errp, QERR_DEVICE_HAS_NO_MEDIUM, device);

The blk_is_available() check must come first, otherwise we can call
bdrv_drained_begin(NULL) and bdrv_drained_begin() doesn't accept that.
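
I.e. roughly this ordering (just a sketch of the idea, not necessarily
the final patch; only the reordering matters):

    /* AioContext is released in .clean() */
    state->aio_context = blk_get_aio_context(blk);
    aio_context_acquire(state->aio_context);

    if (!blk_is_available(blk)) {
        error_setg(errp, QERR_DEVICE_HAS_NO_MEDIUM, device);
        return;
    }

    /* blk_bs(blk) is known to be non-NULL from here on */
    state->bs = blk_bs(blk);
    bdrv_drained_begin(state->bs);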

Kevin

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Qemu-devel] [PATCH v5 00/12] block: Protect nested event loop with bdrv_drained_begin and bdrv_drained_end
  2015-10-21  2:06 [Qemu-devel] [PATCH v5 00/12] block: Protect nested event loop with bdrv_drained_begin and bdrv_drained_end Fam Zheng
                   ` (11 preceding siblings ...)
  2015-10-21  2:06 ` [Qemu-devel] [PATCH v5 12/12] tests: Add test case for aio_disable_external Fam Zheng
@ 2015-10-21 13:40 ` Kevin Wolf
  12 siblings, 0 replies; 29+ messages in thread
From: Kevin Wolf @ 2015-10-21 13:40 UTC (permalink / raw)
  To: Fam Zheng; +Cc: pbonzini, stefanha, qemu-devel, qemu-block

On 21.10.2015 at 04:06, Fam Zheng wrote:
> v5: Rebase onto Kevin's block tree.
> 
> v4: Rebase on to master so fix the "bdrv_move_feature_fields" issue.
> 
> v3: Call bdrv_drain unconditionally in bdrv_drained_begin.
>     Document the internal I/O implications between bdrv_drain_begin and end.
> 
> The nested aio_poll()'s in block layer has a bug that new r/w requests from
> ioeventfds and nbd exports are processed, which might break the caller's
> semantics (qmp_transaction) or even pointers (bdrv_reopen).

Patches 1-8 and 10-12:
Reviewed-by: Kevin Wolf <kwolf@redhat.com>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [PATCH v5 01/12] aio: Add "is_external" flag for event handlers
  2015-10-21  2:06 ` [Qemu-devel] [PATCH v5 01/12] aio: Add "is_external" flag for event handlers Fam Zheng
@ 2015-10-21 15:21   ` Jeff Cody
  0 siblings, 0 replies; 29+ messages in thread
From: Jeff Cody @ 2015-10-21 15:21 UTC (permalink / raw)
  To: Fam Zheng; +Cc: kwolf, pbonzini, qemu-block, qemu-devel, stefanha

On Wed, Oct 21, 2015 at 10:06:38AM +0800, Fam Zheng wrote:
> All callers pass in false, and the real external ones will switch to
> true in coming patches.
> 
> Signed-off-by: Fam Zheng <famz@redhat.com>

Just a comment, but not necessarily a request to change:

We have a lot of functions that take bool true/false arguments, and it
can sometimes make it difficult to read the code.  I wonder if an enum
(e.g. AIO_EXTERNAL, AIO_INTERNAL) would be more descriptive and easier
to read for the aio_set_* functions.
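
Purely as an illustration (nothing this series has to adopt), the
prototype might then look like:

    typedef enum {
        AIO_INTERNAL,
        AIO_EXTERNAL,
    } AioClientType;   /* type name is hypothetical */

    void aio_set_fd_handler(AioContext *ctx,
                            int fd,
                            AioClientType client_type,
                            IOHandler *io_read,
                            IOHandler *io_write,
                            void *opaque);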

Either way:

Reviewed-by: Jeff Cody <jcody@redhat.com>

> ---
>  aio-posix.c                     |  6 ++++-
>  aio-win32.c                     |  5 ++++
>  async.c                         |  3 ++-
>  block/curl.c                    | 14 +++++-----
>  block/iscsi.c                   |  9 +++----
>  block/linux-aio.c               |  5 ++--
>  block/nbd-client.c              | 10 ++++---
>  block/nfs.c                     | 17 +++++-------
>  block/sheepdog.c                | 38 ++++++++++++++++++---------
>  block/ssh.c                     |  5 ++--
>  block/win32-aio.c               |  5 ++--
>  hw/block/dataplane/virtio-blk.c |  6 +++--
>  hw/scsi/virtio-scsi-dataplane.c | 24 +++++++++++------
>  include/block/aio.h             |  2 ++
>  iohandler.c                     |  3 ++-
>  nbd.c                           |  4 ++-
>  tests/test-aio.c                | 58 +++++++++++++++++++++++------------------
>  17 files changed, 130 insertions(+), 84 deletions(-)
> 
> diff --git a/aio-posix.c b/aio-posix.c
> index d477033..f0f9122 100644
> --- a/aio-posix.c
> +++ b/aio-posix.c
> @@ -25,6 +25,7 @@ struct AioHandler
>      IOHandler *io_write;
>      int deleted;
>      void *opaque;
> +    bool is_external;
>      QLIST_ENTRY(AioHandler) node;
>  };
>  
> @@ -43,6 +44,7 @@ static AioHandler *find_aio_handler(AioContext *ctx, int fd)
>  
>  void aio_set_fd_handler(AioContext *ctx,
>                          int fd,
> +                        bool is_external,
>                          IOHandler *io_read,
>                          IOHandler *io_write,
>                          void *opaque)
> @@ -82,6 +84,7 @@ void aio_set_fd_handler(AioContext *ctx,
>          node->io_read = io_read;
>          node->io_write = io_write;
>          node->opaque = opaque;
> +        node->is_external = is_external;
>  
>          node->pfd.events = (io_read ? G_IO_IN | G_IO_HUP | G_IO_ERR : 0);
>          node->pfd.events |= (io_write ? G_IO_OUT | G_IO_ERR : 0);
> @@ -92,10 +95,11 @@ void aio_set_fd_handler(AioContext *ctx,
>  
>  void aio_set_event_notifier(AioContext *ctx,
>                              EventNotifier *notifier,
> +                            bool is_external,
>                              EventNotifierHandler *io_read)
>  {
>      aio_set_fd_handler(ctx, event_notifier_get_fd(notifier),
> -                       (IOHandler *)io_read, NULL, notifier);
> +                       is_external, (IOHandler *)io_read, NULL, notifier);
>  }
>  
>  bool aio_prepare(AioContext *ctx)
> diff --git a/aio-win32.c b/aio-win32.c
> index 50a6867..3110d85 100644
> --- a/aio-win32.c
> +++ b/aio-win32.c
> @@ -28,11 +28,13 @@ struct AioHandler {
>      GPollFD pfd;
>      int deleted;
>      void *opaque;
> +    bool is_external;
>      QLIST_ENTRY(AioHandler) node;
>  };
>  
>  void aio_set_fd_handler(AioContext *ctx,
>                          int fd,
> +                        bool is_external,
>                          IOHandler *io_read,
>                          IOHandler *io_write,
>                          void *opaque)
> @@ -86,6 +88,7 @@ void aio_set_fd_handler(AioContext *ctx,
>          node->opaque = opaque;
>          node->io_read = io_read;
>          node->io_write = io_write;
> +        node->is_external = is_external;
>  
>          event = event_notifier_get_handle(&ctx->notifier);
>          WSAEventSelect(node->pfd.fd, event,
> @@ -98,6 +101,7 @@ void aio_set_fd_handler(AioContext *ctx,
>  
>  void aio_set_event_notifier(AioContext *ctx,
>                              EventNotifier *e,
> +                            bool is_external,
>                              EventNotifierHandler *io_notify)
>  {
>      AioHandler *node;
> @@ -133,6 +137,7 @@ void aio_set_event_notifier(AioContext *ctx,
>              node->e = e;
>              node->pfd.fd = (uintptr_t)event_notifier_get_handle(e);
>              node->pfd.events = G_IO_IN;
> +            node->is_external = is_external;
>              QLIST_INSERT_HEAD(&ctx->aio_handlers, node, node);
>  
>              g_source_add_poll(&ctx->source, &node->pfd);
> diff --git a/async.c b/async.c
> index efce14b..bdc64a3 100644
> --- a/async.c
> +++ b/async.c
> @@ -247,7 +247,7 @@ aio_ctx_finalize(GSource     *source)
>      }
>      qemu_mutex_unlock(&ctx->bh_lock);
>  
> -    aio_set_event_notifier(ctx, &ctx->notifier, NULL);
> +    aio_set_event_notifier(ctx, &ctx->notifier, false, NULL);
>      event_notifier_cleanup(&ctx->notifier);
>      rfifolock_destroy(&ctx->lock);
>      qemu_mutex_destroy(&ctx->bh_lock);
> @@ -329,6 +329,7 @@ AioContext *aio_context_new(Error **errp)
>      }
>      g_source_set_can_recurse(&ctx->source, true);
>      aio_set_event_notifier(ctx, &ctx->notifier,
> +                           false,
>                             (EventNotifierHandler *)
>                             event_notifier_dummy_cb);
>      ctx->thread_pool = NULL;
> diff --git a/block/curl.c b/block/curl.c
> index 032cc8a..8994182 100644
> --- a/block/curl.c
> +++ b/block/curl.c
> @@ -154,18 +154,20 @@ static int curl_sock_cb(CURL *curl, curl_socket_t fd, int action,
>      DPRINTF("CURL (AIO): Sock action %d on fd %d\n", action, fd);
>      switch (action) {
>          case CURL_POLL_IN:
> -            aio_set_fd_handler(s->aio_context, fd, curl_multi_read,
> -                               NULL, state);
> +            aio_set_fd_handler(s->aio_context, fd, false,
> +                               curl_multi_read, NULL, state);
>              break;
>          case CURL_POLL_OUT:
> -            aio_set_fd_handler(s->aio_context, fd, NULL, curl_multi_do, state);
> +            aio_set_fd_handler(s->aio_context, fd, false,
> +                               NULL, curl_multi_do, state);
>              break;
>          case CURL_POLL_INOUT:
> -            aio_set_fd_handler(s->aio_context, fd, curl_multi_read,
> -                               curl_multi_do, state);
> +            aio_set_fd_handler(s->aio_context, fd, false,
> +                               curl_multi_read, curl_multi_do, state);
>              break;
>          case CURL_POLL_REMOVE:
> -            aio_set_fd_handler(s->aio_context, fd, NULL, NULL, NULL);
> +            aio_set_fd_handler(s->aio_context, fd, false,
> +                               NULL, NULL, NULL);
>              break;
>      }
>  
> diff --git a/block/iscsi.c b/block/iscsi.c
> index 93f1ee4..9a628b7 100644
> --- a/block/iscsi.c
> +++ b/block/iscsi.c
> @@ -291,8 +291,8 @@ iscsi_set_events(IscsiLun *iscsilun)
>      int ev = iscsi_which_events(iscsi);
>  
>      if (ev != iscsilun->events) {
> -        aio_set_fd_handler(iscsilun->aio_context,
> -                           iscsi_get_fd(iscsi),
> +        aio_set_fd_handler(iscsilun->aio_context, iscsi_get_fd(iscsi),
> +                           false,
>                             (ev & POLLIN) ? iscsi_process_read : NULL,
>                             (ev & POLLOUT) ? iscsi_process_write : NULL,
>                             iscsilun);
> @@ -1280,9 +1280,8 @@ static void iscsi_detach_aio_context(BlockDriverState *bs)
>  {
>      IscsiLun *iscsilun = bs->opaque;
>  
> -    aio_set_fd_handler(iscsilun->aio_context,
> -                       iscsi_get_fd(iscsilun->iscsi),
> -                       NULL, NULL, NULL);
> +    aio_set_fd_handler(iscsilun->aio_context, iscsi_get_fd(iscsilun->iscsi),
> +                       false, NULL, NULL, NULL);
>      iscsilun->events = 0;
>  
>      if (iscsilun->nop_timer) {
> diff --git a/block/linux-aio.c b/block/linux-aio.c
> index c991443..88b0520 100644
> --- a/block/linux-aio.c
> +++ b/block/linux-aio.c
> @@ -287,7 +287,7 @@ void laio_detach_aio_context(void *s_, AioContext *old_context)
>  {
>      struct qemu_laio_state *s = s_;
>  
> -    aio_set_event_notifier(old_context, &s->e, NULL);
> +    aio_set_event_notifier(old_context, &s->e, false, NULL);
>      qemu_bh_delete(s->completion_bh);
>  }
>  
> @@ -296,7 +296,8 @@ void laio_attach_aio_context(void *s_, AioContext *new_context)
>      struct qemu_laio_state *s = s_;
>  
>      s->completion_bh = aio_bh_new(new_context, qemu_laio_completion_bh, s);
> -    aio_set_event_notifier(new_context, &s->e, qemu_laio_completion_cb);
> +    aio_set_event_notifier(new_context, &s->e, false,
> +                           qemu_laio_completion_cb);
>  }
>  
>  void *laio_init(void)
> diff --git a/block/nbd-client.c b/block/nbd-client.c
> index e1bb919..b7fd17a 100644
> --- a/block/nbd-client.c
> +++ b/block/nbd-client.c
> @@ -124,7 +124,7 @@ static int nbd_co_send_request(BlockDriverState *bs,
>      s->send_coroutine = qemu_coroutine_self();
>      aio_context = bdrv_get_aio_context(bs);
>  
> -    aio_set_fd_handler(aio_context, s->sock,
> +    aio_set_fd_handler(aio_context, s->sock, false,
>                         nbd_reply_ready, nbd_restart_write, bs);
>      if (qiov) {
>          if (!s->is_unix) {
> @@ -144,7 +144,8 @@ static int nbd_co_send_request(BlockDriverState *bs,
>      } else {
>          rc = nbd_send_request(s->sock, request);
>      }
> -    aio_set_fd_handler(aio_context, s->sock, nbd_reply_ready, NULL, bs);
> +    aio_set_fd_handler(aio_context, s->sock, false,
> +                       nbd_reply_ready, NULL, bs);
>      s->send_coroutine = NULL;
>      qemu_co_mutex_unlock(&s->send_mutex);
>      return rc;
> @@ -348,14 +349,15 @@ int nbd_client_co_discard(BlockDriverState *bs, int64_t sector_num,
>  void nbd_client_detach_aio_context(BlockDriverState *bs)
>  {
>      aio_set_fd_handler(bdrv_get_aio_context(bs),
> -                       nbd_get_client_session(bs)->sock, NULL, NULL, NULL);
> +                       nbd_get_client_session(bs)->sock,
> +                       false, NULL, NULL, NULL);
>  }
>  
>  void nbd_client_attach_aio_context(BlockDriverState *bs,
>                                     AioContext *new_context)
>  {
>      aio_set_fd_handler(new_context, nbd_get_client_session(bs)->sock,
> -                       nbd_reply_ready, NULL, bs);
> +                       false, nbd_reply_ready, NULL, bs);
>  }
>  
>  void nbd_client_close(BlockDriverState *bs)
> diff --git a/block/nfs.c b/block/nfs.c
> index 887a98e..fd79f89 100644
> --- a/block/nfs.c
> +++ b/block/nfs.c
> @@ -63,11 +63,10 @@ static void nfs_set_events(NFSClient *client)
>  {
>      int ev = nfs_which_events(client->context);
>      if (ev != client->events) {
> -        aio_set_fd_handler(client->aio_context,
> -                           nfs_get_fd(client->context),
> +        aio_set_fd_handler(client->aio_context, nfs_get_fd(client->context),
> +                           false,
>                             (ev & POLLIN) ? nfs_process_read : NULL,
> -                           (ev & POLLOUT) ? nfs_process_write : NULL,
> -                           client);
> +                           (ev & POLLOUT) ? nfs_process_write : NULL, client);
>  
>      }
>      client->events = ev;
> @@ -242,9 +241,8 @@ static void nfs_detach_aio_context(BlockDriverState *bs)
>  {
>      NFSClient *client = bs->opaque;
>  
> -    aio_set_fd_handler(client->aio_context,
> -                       nfs_get_fd(client->context),
> -                       NULL, NULL, NULL);
> +    aio_set_fd_handler(client->aio_context, nfs_get_fd(client->context),
> +                       false, NULL, NULL, NULL);
>      client->events = 0;
>  }
>  
> @@ -263,9 +261,8 @@ static void nfs_client_close(NFSClient *client)
>          if (client->fh) {
>              nfs_close(client->context, client->fh);
>          }
> -        aio_set_fd_handler(client->aio_context,
> -                           nfs_get_fd(client->context),
> -                           NULL, NULL, NULL);
> +        aio_set_fd_handler(client->aio_context, nfs_get_fd(client->context),
> +                           false, NULL, NULL, NULL);
>          nfs_destroy_context(client->context);
>      }
>      memset(client, 0, sizeof(NFSClient));
> diff --git a/block/sheepdog.c b/block/sheepdog.c
> index e7e58b7..d80e4ed 100644
> --- a/block/sheepdog.c
> +++ b/block/sheepdog.c
> @@ -651,14 +651,16 @@ static coroutine_fn void do_co_req(void *opaque)
>      unsigned int *rlen = srco->rlen;
>  
>      co = qemu_coroutine_self();
> -    aio_set_fd_handler(srco->aio_context, sockfd, NULL, restart_co_req, co);
> +    aio_set_fd_handler(srco->aio_context, sockfd, false,
> +                       NULL, restart_co_req, co);
>  
>      ret = send_co_req(sockfd, hdr, data, wlen);
>      if (ret < 0) {
>          goto out;
>      }
>  
> -    aio_set_fd_handler(srco->aio_context, sockfd, restart_co_req, NULL, co);
> +    aio_set_fd_handler(srco->aio_context, sockfd, false,
> +                       restart_co_req, NULL, co);
>  
>      ret = qemu_co_recv(sockfd, hdr, sizeof(*hdr));
>      if (ret != sizeof(*hdr)) {
> @@ -683,7 +685,8 @@ static coroutine_fn void do_co_req(void *opaque)
>  out:
>      /* there is at most one request for this sockfd, so it is safe to
>       * set each handler to NULL. */
> -    aio_set_fd_handler(srco->aio_context, sockfd, NULL, NULL, NULL);
> +    aio_set_fd_handler(srco->aio_context, sockfd, false,
> +                       NULL, NULL, NULL);
>  
>      srco->ret = ret;
>      srco->finished = true;
> @@ -735,7 +738,8 @@ static coroutine_fn void reconnect_to_sdog(void *opaque)
>      BDRVSheepdogState *s = opaque;
>      AIOReq *aio_req, *next;
>  
> -    aio_set_fd_handler(s->aio_context, s->fd, NULL, NULL, NULL);
> +    aio_set_fd_handler(s->aio_context, s->fd, false, NULL,
> +                       NULL, NULL);
>      close(s->fd);
>      s->fd = -1;
>  
> @@ -938,7 +942,8 @@ static int get_sheep_fd(BDRVSheepdogState *s, Error **errp)
>          return fd;
>      }
>  
> -    aio_set_fd_handler(s->aio_context, fd, co_read_response, NULL, s);
> +    aio_set_fd_handler(s->aio_context, fd, false,
> +                       co_read_response, NULL, s);
>      return fd;
>  }
>  
> @@ -1199,7 +1204,7 @@ static void coroutine_fn add_aio_request(BDRVSheepdogState *s, AIOReq *aio_req,
>  
>      qemu_co_mutex_lock(&s->lock);
>      s->co_send = qemu_coroutine_self();
> -    aio_set_fd_handler(s->aio_context, s->fd,
> +    aio_set_fd_handler(s->aio_context, s->fd, false,
>                         co_read_response, co_write_request, s);
>      socket_set_cork(s->fd, 1);
>  
> @@ -1218,7 +1223,8 @@ static void coroutine_fn add_aio_request(BDRVSheepdogState *s, AIOReq *aio_req,
>      }
>  out:
>      socket_set_cork(s->fd, 0);
> -    aio_set_fd_handler(s->aio_context, s->fd, co_read_response, NULL, s);
> +    aio_set_fd_handler(s->aio_context, s->fd, false,
> +                       co_read_response, NULL, s);
>      s->co_send = NULL;
>      qemu_co_mutex_unlock(&s->lock);
>  }
> @@ -1368,7 +1374,8 @@ static void sd_detach_aio_context(BlockDriverState *bs)
>  {
>      BDRVSheepdogState *s = bs->opaque;
>  
> -    aio_set_fd_handler(s->aio_context, s->fd, NULL, NULL, NULL);
> +    aio_set_fd_handler(s->aio_context, s->fd, false, NULL,
> +                       NULL, NULL);
>  }
>  
>  static void sd_attach_aio_context(BlockDriverState *bs,
> @@ -1377,7 +1384,8 @@ static void sd_attach_aio_context(BlockDriverState *bs,
>      BDRVSheepdogState *s = bs->opaque;
>  
>      s->aio_context = new_context;
> -    aio_set_fd_handler(new_context, s->fd, co_read_response, NULL, s);
> +    aio_set_fd_handler(new_context, s->fd, false,
> +                       co_read_response, NULL, s);
>  }
>  
>  /* TODO Convert to fine grained options */
> @@ -1490,7 +1498,8 @@ static int sd_open(BlockDriverState *bs, QDict *options, int flags,
>      g_free(buf);
>      return 0;
>  out:
> -    aio_set_fd_handler(bdrv_get_aio_context(bs), s->fd, NULL, NULL, NULL);
> +    aio_set_fd_handler(bdrv_get_aio_context(bs), s->fd,
> +                       false, NULL, NULL, NULL);
>      if (s->fd >= 0) {
>          closesocket(s->fd);
>      }
> @@ -1528,7 +1537,8 @@ static void sd_reopen_commit(BDRVReopenState *state)
>      BDRVSheepdogState *s = state->bs->opaque;
>  
>      if (s->fd) {
> -        aio_set_fd_handler(s->aio_context, s->fd, NULL, NULL, NULL);
> +        aio_set_fd_handler(s->aio_context, s->fd, false,
> +                           NULL, NULL, NULL);
>          closesocket(s->fd);
>      }
>  
> @@ -1551,7 +1561,8 @@ static void sd_reopen_abort(BDRVReopenState *state)
>      }
>  
>      if (re_s->fd) {
> -        aio_set_fd_handler(s->aio_context, re_s->fd, NULL, NULL, NULL);
> +        aio_set_fd_handler(s->aio_context, re_s->fd, false,
> +                           NULL, NULL, NULL);
>          closesocket(re_s->fd);
>      }
>  
> @@ -1935,7 +1946,8 @@ static void sd_close(BlockDriverState *bs)
>          error_report("%s, %s", sd_strerror(rsp->result), s->name);
>      }
>  
> -    aio_set_fd_handler(bdrv_get_aio_context(bs), s->fd, NULL, NULL, NULL);
> +    aio_set_fd_handler(bdrv_get_aio_context(bs), s->fd,
> +                       false, NULL, NULL, NULL);
>      closesocket(s->fd);
>      g_free(s->host_spec);
>  }
> diff --git a/block/ssh.c b/block/ssh.c
> index d35b51f..af025c0 100644
> --- a/block/ssh.c
> +++ b/block/ssh.c
> @@ -800,14 +800,15 @@ static coroutine_fn void set_fd_handler(BDRVSSHState *s, BlockDriverState *bs)
>              rd_handler, wr_handler);
>  
>      aio_set_fd_handler(bdrv_get_aio_context(bs), s->sock,
> -                       rd_handler, wr_handler, co);
> +                       false, rd_handler, wr_handler, co);
>  }
>  
>  static coroutine_fn void clear_fd_handler(BDRVSSHState *s,
>                                            BlockDriverState *bs)
>  {
>      DPRINTF("s->sock=%d", s->sock);
> -    aio_set_fd_handler(bdrv_get_aio_context(bs), s->sock, NULL, NULL, NULL);
> +    aio_set_fd_handler(bdrv_get_aio_context(bs), s->sock,
> +                       false, NULL, NULL, NULL);
>  }
>  
>  /* A non-blocking call returned EAGAIN, so yield, ensuring the
> diff --git a/block/win32-aio.c b/block/win32-aio.c
> index 64e8682..bbf2f01 100644
> --- a/block/win32-aio.c
> +++ b/block/win32-aio.c
> @@ -174,7 +174,7 @@ int win32_aio_attach(QEMUWin32AIOState *aio, HANDLE hfile)
>  void win32_aio_detach_aio_context(QEMUWin32AIOState *aio,
>                                    AioContext *old_context)
>  {
> -    aio_set_event_notifier(old_context, &aio->e, NULL);
> +    aio_set_event_notifier(old_context, &aio->e, false, NULL);
>      aio->is_aio_context_attached = false;
>  }
>  
> @@ -182,7 +182,8 @@ void win32_aio_attach_aio_context(QEMUWin32AIOState *aio,
>                                    AioContext *new_context)
>  {
>      aio->is_aio_context_attached = true;
> -    aio_set_event_notifier(new_context, &aio->e, win32_aio_completion_cb);
> +    aio_set_event_notifier(new_context, &aio->e, false,
> +                           win32_aio_completion_cb);
>  }
>  
>  QEMUWin32AIOState *win32_aio_init(void)
> diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
> index 6106e46..f8716bc 100644
> --- a/hw/block/dataplane/virtio-blk.c
> +++ b/hw/block/dataplane/virtio-blk.c
> @@ -283,7 +283,8 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s)
>  
>      /* Get this show started by hooking up our callbacks */
>      aio_context_acquire(s->ctx);
> -    aio_set_event_notifier(s->ctx, &s->host_notifier, handle_notify);
> +    aio_set_event_notifier(s->ctx, &s->host_notifier, false,
> +                           handle_notify);
>      aio_context_release(s->ctx);
>      return;
>  
> @@ -319,7 +320,8 @@ void virtio_blk_data_plane_stop(VirtIOBlockDataPlane *s)
>      aio_context_acquire(s->ctx);
>  
>      /* Stop notifications for new requests from guest */
> -    aio_set_event_notifier(s->ctx, &s->host_notifier, NULL);
> +    aio_set_event_notifier(s->ctx, &s->host_notifier, false,
> +                           NULL);
>  
>      /* Drain and switch bs back to the QEMU main loop */
>      blk_set_aio_context(s->conf->conf.blk, qemu_get_aio_context());
> diff --git a/hw/scsi/virtio-scsi-dataplane.c b/hw/scsi/virtio-scsi-dataplane.c
> index 1248fd9..21f51df 100644
> --- a/hw/scsi/virtio-scsi-dataplane.c
> +++ b/hw/scsi/virtio-scsi-dataplane.c
> @@ -60,7 +60,8 @@ static VirtIOSCSIVring *virtio_scsi_vring_init(VirtIOSCSI *s,
>      r = g_new(VirtIOSCSIVring, 1);
>      r->host_notifier = *virtio_queue_get_host_notifier(vq);
>      r->guest_notifier = *virtio_queue_get_guest_notifier(vq);
> -    aio_set_event_notifier(s->ctx, &r->host_notifier, handler);
> +    aio_set_event_notifier(s->ctx, &r->host_notifier, false,
> +                           handler);
>  
>      r->parent = s;
>  
> @@ -71,7 +72,8 @@ static VirtIOSCSIVring *virtio_scsi_vring_init(VirtIOSCSI *s,
>      return r;
>  
>  fail_vring:
> -    aio_set_event_notifier(s->ctx, &r->host_notifier, NULL);
> +    aio_set_event_notifier(s->ctx, &r->host_notifier, false,
> +                           NULL);
>      k->set_host_notifier(qbus->parent, n, false);
>      g_free(r);
>      return NULL;
> @@ -162,14 +164,17 @@ static void virtio_scsi_clear_aio(VirtIOSCSI *s)
>      int i;
>  
>      if (s->ctrl_vring) {
> -        aio_set_event_notifier(s->ctx, &s->ctrl_vring->host_notifier, NULL);
> +        aio_set_event_notifier(s->ctx, &s->ctrl_vring->host_notifier,
> +                               false, NULL);
>      }
>      if (s->event_vring) {
> -        aio_set_event_notifier(s->ctx, &s->event_vring->host_notifier, NULL);
> +        aio_set_event_notifier(s->ctx, &s->event_vring->host_notifier,
> +                               false, NULL);
>      }
>      if (s->cmd_vrings) {
>          for (i = 0; i < vs->conf.num_queues && s->cmd_vrings[i]; i++) {
> -            aio_set_event_notifier(s->ctx, &s->cmd_vrings[i]->host_notifier, NULL);
> +            aio_set_event_notifier(s->ctx, &s->cmd_vrings[i]->host_notifier,
> +                                   false, NULL);
>          }
>      }
>  }
> @@ -290,10 +295,13 @@ void virtio_scsi_dataplane_stop(VirtIOSCSI *s)
>  
>      aio_context_acquire(s->ctx);
>  
> -    aio_set_event_notifier(s->ctx, &s->ctrl_vring->host_notifier, NULL);
> -    aio_set_event_notifier(s->ctx, &s->event_vring->host_notifier, NULL);
> +    aio_set_event_notifier(s->ctx, &s->ctrl_vring->host_notifier,
> +                           false, NULL);
> +    aio_set_event_notifier(s->ctx, &s->event_vring->host_notifier,
> +                           false, NULL);
>      for (i = 0; i < vs->conf.num_queues; i++) {
> -        aio_set_event_notifier(s->ctx, &s->cmd_vrings[i]->host_notifier, NULL);
> +        aio_set_event_notifier(s->ctx, &s->cmd_vrings[i]->host_notifier,
> +                               false, NULL);
>      }
>  
>      blk_drain_all(); /* ensure there are no in-flight requests */
> diff --git a/include/block/aio.h b/include/block/aio.h
> index 400b1b0..12f1141 100644
> --- a/include/block/aio.h
> +++ b/include/block/aio.h
> @@ -299,6 +299,7 @@ bool aio_poll(AioContext *ctx, bool blocking);
>   */
>  void aio_set_fd_handler(AioContext *ctx,
>                          int fd,
> +                        bool is_external,
>                          IOHandler *io_read,
>                          IOHandler *io_write,
>                          void *opaque);
> @@ -312,6 +313,7 @@ void aio_set_fd_handler(AioContext *ctx,
>   */
>  void aio_set_event_notifier(AioContext *ctx,
>                              EventNotifier *notifier,
> +                            bool is_external,
>                              EventNotifierHandler *io_read);
>  
>  /* Return a GSource that lets the main loop poll the file descriptors attached
> diff --git a/iohandler.c b/iohandler.c
> index 55f8501..eb68083 100644
> --- a/iohandler.c
> +++ b/iohandler.c
> @@ -55,7 +55,8 @@ void qemu_set_fd_handler(int fd,
>                           void *opaque)
>  {
>      iohandler_init();
> -    aio_set_fd_handler(iohandler_ctx, fd, fd_read, fd_write, opaque);
> +    aio_set_fd_handler(iohandler_ctx, fd, false,
> +                       fd_read, fd_write, opaque);
>  }
>  
>  /* reaping of zombies.  right now we're not passing the status to
> diff --git a/nbd.c b/nbd.c
> index 74859cb..fbc66be 100644
> --- a/nbd.c
> +++ b/nbd.c
> @@ -1446,6 +1446,7 @@ static void nbd_set_handlers(NBDClient *client)
>  {
>      if (client->exp && client->exp->ctx) {
>          aio_set_fd_handler(client->exp->ctx, client->sock,
> +                           false,
>                             client->can_read ? nbd_read : NULL,
>                             client->send_coroutine ? nbd_restart_write : NULL,
>                             client);
> @@ -1455,7 +1456,8 @@ static void nbd_set_handlers(NBDClient *client)
>  static void nbd_unset_handlers(NBDClient *client)
>  {
>      if (client->exp && client->exp->ctx) {
> -        aio_set_fd_handler(client->exp->ctx, client->sock, NULL, NULL, NULL);
> +        aio_set_fd_handler(client->exp->ctx, client->sock,
> +                           false, NULL, NULL, NULL);
>      }
>  }
>  
> diff --git a/tests/test-aio.c b/tests/test-aio.c
> index 217e337..03cd45d 100644
> --- a/tests/test-aio.c
> +++ b/tests/test-aio.c
> @@ -118,6 +118,12 @@ static void *test_acquire_thread(void *opaque)
>      return NULL;
>  }
>  
> +static void set_event_notifier(AioContext *ctx, EventNotifier *notifier,
> +                               EventNotifierHandler *handler)
> +{
> +    aio_set_event_notifier(ctx, notifier, false, handler);
> +}
> +
>  static void dummy_notifier_read(EventNotifier *unused)
>  {
>      g_assert(false); /* should never be invoked */
> @@ -131,7 +137,7 @@ static void test_acquire(void)
>  
>      /* Dummy event notifier ensures aio_poll() will block */
>      event_notifier_init(&notifier, false);
> -    aio_set_event_notifier(ctx, &notifier, dummy_notifier_read);
> +    set_event_notifier(ctx, &notifier, dummy_notifier_read);
>      g_assert(!aio_poll(ctx, false)); /* consume aio_notify() */
>  
>      qemu_mutex_init(&data.start_lock);
> @@ -149,7 +155,7 @@ static void test_acquire(void)
>      aio_context_release(ctx);
>  
>      qemu_thread_join(&thread);
> -    aio_set_event_notifier(ctx, &notifier, NULL);
> +    set_event_notifier(ctx, &notifier, NULL);
>      event_notifier_cleanup(&notifier);
>  
>      g_assert(data.thread_acquired);
> @@ -308,11 +314,11 @@ static void test_set_event_notifier(void)
>  {
>      EventNotifierTestData data = { .n = 0, .active = 0 };
>      event_notifier_init(&data.e, false);
> -    aio_set_event_notifier(ctx, &data.e, event_ready_cb);
> +    set_event_notifier(ctx, &data.e, event_ready_cb);
>      g_assert(!aio_poll(ctx, false));
>      g_assert_cmpint(data.n, ==, 0);
>  
> -    aio_set_event_notifier(ctx, &data.e, NULL);
> +    set_event_notifier(ctx, &data.e, NULL);
>      g_assert(!aio_poll(ctx, false));
>      g_assert_cmpint(data.n, ==, 0);
>      event_notifier_cleanup(&data.e);
> @@ -322,7 +328,7 @@ static void test_wait_event_notifier(void)
>  {
>      EventNotifierTestData data = { .n = 0, .active = 1 };
>      event_notifier_init(&data.e, false);
> -    aio_set_event_notifier(ctx, &data.e, event_ready_cb);
> +    set_event_notifier(ctx, &data.e, event_ready_cb);
>      while (aio_poll(ctx, false));
>      g_assert_cmpint(data.n, ==, 0);
>      g_assert_cmpint(data.active, ==, 1);
> @@ -336,7 +342,7 @@ static void test_wait_event_notifier(void)
>      g_assert_cmpint(data.n, ==, 1);
>      g_assert_cmpint(data.active, ==, 0);
>  
> -    aio_set_event_notifier(ctx, &data.e, NULL);
> +    set_event_notifier(ctx, &data.e, NULL);
>      g_assert(!aio_poll(ctx, false));
>      g_assert_cmpint(data.n, ==, 1);
>  
> @@ -347,7 +353,7 @@ static void test_flush_event_notifier(void)
>  {
>      EventNotifierTestData data = { .n = 0, .active = 10, .auto_set = true };
>      event_notifier_init(&data.e, false);
> -    aio_set_event_notifier(ctx, &data.e, event_ready_cb);
> +    set_event_notifier(ctx, &data.e, event_ready_cb);
>      while (aio_poll(ctx, false));
>      g_assert_cmpint(data.n, ==, 0);
>      g_assert_cmpint(data.active, ==, 10);
> @@ -363,7 +369,7 @@ static void test_flush_event_notifier(void)
>      g_assert_cmpint(data.active, ==, 0);
>      g_assert(!aio_poll(ctx, false));
>  
> -    aio_set_event_notifier(ctx, &data.e, NULL);
> +    set_event_notifier(ctx, &data.e, NULL);
>      g_assert(!aio_poll(ctx, false));
>      event_notifier_cleanup(&data.e);
>  }
> @@ -374,7 +380,7 @@ static void test_wait_event_notifier_noflush(void)
>      EventNotifierTestData dummy = { .n = 0, .active = 1 };
>  
>      event_notifier_init(&data.e, false);
> -    aio_set_event_notifier(ctx, &data.e, event_ready_cb);
> +    set_event_notifier(ctx, &data.e, event_ready_cb);
>  
>      g_assert(!aio_poll(ctx, false));
>      g_assert_cmpint(data.n, ==, 0);
> @@ -387,7 +393,7 @@ static void test_wait_event_notifier_noflush(void)
>  
>      /* An active event notifier forces aio_poll to look at EventNotifiers.  */
>      event_notifier_init(&dummy.e, false);
> -    aio_set_event_notifier(ctx, &dummy.e, event_ready_cb);
> +    set_event_notifier(ctx, &dummy.e, event_ready_cb);
>  
>      event_notifier_set(&data.e);
>      g_assert(aio_poll(ctx, false));
> @@ -407,10 +413,10 @@ static void test_wait_event_notifier_noflush(void)
>      g_assert_cmpint(dummy.n, ==, 1);
>      g_assert_cmpint(dummy.active, ==, 0);
>  
> -    aio_set_event_notifier(ctx, &dummy.e, NULL);
> +    set_event_notifier(ctx, &dummy.e, NULL);
>      event_notifier_cleanup(&dummy.e);
>  
> -    aio_set_event_notifier(ctx, &data.e, NULL);
> +    set_event_notifier(ctx, &data.e, NULL);
>      g_assert(!aio_poll(ctx, false));
>      g_assert_cmpint(data.n, ==, 2);
>  
> @@ -428,7 +434,7 @@ static void test_timer_schedule(void)
>       * an fd to wait on. Fixing this breaks other tests. So create a dummy one.
>       */
>      event_notifier_init(&e, false);
> -    aio_set_event_notifier(ctx, &e, dummy_io_handler_read);
> +    set_event_notifier(ctx, &e, dummy_io_handler_read);
>      aio_poll(ctx, false);
>  
>      aio_timer_init(ctx, &data.timer, data.clock_type,
> @@ -467,7 +473,7 @@ static void test_timer_schedule(void)
>      g_assert(!aio_poll(ctx, false));
>      g_assert_cmpint(data.n, ==, 2);
>  
> -    aio_set_event_notifier(ctx, &e, NULL);
> +    set_event_notifier(ctx, &e, NULL);
>      event_notifier_cleanup(&e);
>  
>      timer_del(&data.timer);
> @@ -638,11 +644,11 @@ static void test_source_set_event_notifier(void)
>  {
>      EventNotifierTestData data = { .n = 0, .active = 0 };
>      event_notifier_init(&data.e, false);
> -    aio_set_event_notifier(ctx, &data.e, event_ready_cb);
> +    set_event_notifier(ctx, &data.e, event_ready_cb);
>      while (g_main_context_iteration(NULL, false));
>      g_assert_cmpint(data.n, ==, 0);
>  
> -    aio_set_event_notifier(ctx, &data.e, NULL);
> +    set_event_notifier(ctx, &data.e, NULL);
>      while (g_main_context_iteration(NULL, false));
>      g_assert_cmpint(data.n, ==, 0);
>      event_notifier_cleanup(&data.e);
> @@ -652,7 +658,7 @@ static void test_source_wait_event_notifier(void)
>  {
>      EventNotifierTestData data = { .n = 0, .active = 1 };
>      event_notifier_init(&data.e, false);
> -    aio_set_event_notifier(ctx, &data.e, event_ready_cb);
> +    set_event_notifier(ctx, &data.e, event_ready_cb);
>      while (g_main_context_iteration(NULL, false));
>      g_assert_cmpint(data.n, ==, 0);
>      g_assert_cmpint(data.active, ==, 1);
> @@ -666,7 +672,7 @@ static void test_source_wait_event_notifier(void)
>      g_assert_cmpint(data.n, ==, 1);
>      g_assert_cmpint(data.active, ==, 0);
>  
> -    aio_set_event_notifier(ctx, &data.e, NULL);
> +    set_event_notifier(ctx, &data.e, NULL);
>      while (g_main_context_iteration(NULL, false));
>      g_assert_cmpint(data.n, ==, 1);
>  
> @@ -677,7 +683,7 @@ static void test_source_flush_event_notifier(void)
>  {
>      EventNotifierTestData data = { .n = 0, .active = 10, .auto_set = true };
>      event_notifier_init(&data.e, false);
> -    aio_set_event_notifier(ctx, &data.e, event_ready_cb);
> +    set_event_notifier(ctx, &data.e, event_ready_cb);
>      while (g_main_context_iteration(NULL, false));
>      g_assert_cmpint(data.n, ==, 0);
>      g_assert_cmpint(data.active, ==, 10);
> @@ -693,7 +699,7 @@ static void test_source_flush_event_notifier(void)
>      g_assert_cmpint(data.active, ==, 0);
>      g_assert(!g_main_context_iteration(NULL, false));
>  
> -    aio_set_event_notifier(ctx, &data.e, NULL);
> +    set_event_notifier(ctx, &data.e, NULL);
>      while (g_main_context_iteration(NULL, false));
>      event_notifier_cleanup(&data.e);
>  }
> @@ -704,7 +710,7 @@ static void test_source_wait_event_notifier_noflush(void)
>      EventNotifierTestData dummy = { .n = 0, .active = 1 };
>  
>      event_notifier_init(&data.e, false);
> -    aio_set_event_notifier(ctx, &data.e, event_ready_cb);
> +    set_event_notifier(ctx, &data.e, event_ready_cb);
>  
>      while (g_main_context_iteration(NULL, false));
>      g_assert_cmpint(data.n, ==, 0);
> @@ -717,7 +723,7 @@ static void test_source_wait_event_notifier_noflush(void)
>  
>      /* An active event notifier forces aio_poll to look at EventNotifiers.  */
>      event_notifier_init(&dummy.e, false);
> -    aio_set_event_notifier(ctx, &dummy.e, event_ready_cb);
> +    set_event_notifier(ctx, &dummy.e, event_ready_cb);
>  
>      event_notifier_set(&data.e);
>      g_assert(g_main_context_iteration(NULL, false));
> @@ -737,10 +743,10 @@ static void test_source_wait_event_notifier_noflush(void)
>      g_assert_cmpint(dummy.n, ==, 1);
>      g_assert_cmpint(dummy.active, ==, 0);
>  
> -    aio_set_event_notifier(ctx, &dummy.e, NULL);
> +    set_event_notifier(ctx, &dummy.e, NULL);
>      event_notifier_cleanup(&dummy.e);
>  
> -    aio_set_event_notifier(ctx, &data.e, NULL);
> +    set_event_notifier(ctx, &data.e, NULL);
>      while (g_main_context_iteration(NULL, false));
>      g_assert_cmpint(data.n, ==, 2);
>  
> @@ -759,7 +765,7 @@ static void test_source_timer_schedule(void)
>       * an fd to wait on. Fixing this breaks other tests. So create a dummy one.
>       */
>      event_notifier_init(&e, false);
> -    aio_set_event_notifier(ctx, &e, dummy_io_handler_read);
> +    set_event_notifier(ctx, &e, dummy_io_handler_read);
>      do {} while (g_main_context_iteration(NULL, false));
>  
>      aio_timer_init(ctx, &data.timer, data.clock_type,
> @@ -784,7 +790,7 @@ static void test_source_timer_schedule(void)
>      g_assert_cmpint(data.n, ==, 2);
>      g_assert(qemu_clock_get_ns(data.clock_type) > expiry);
>  
> -    aio_set_event_notifier(ctx, &e, NULL);
> +    set_event_notifier(ctx, &e, NULL);
>      event_notifier_cleanup(&e);
>  
>      timer_del(&data.timer);
> -- 
> 2.4.3
> 
> 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [PATCH v5 02/12] nbd: Mark fd handlers client type as "external"
  2015-10-21  2:06 ` [Qemu-devel] [PATCH v5 02/12] nbd: Mark fd handlers client type as "external" Fam Zheng
@ 2015-10-21 15:21   ` Jeff Cody
  0 siblings, 0 replies; 29+ messages in thread
From: Jeff Cody @ 2015-10-21 15:21 UTC (permalink / raw)
  To: Fam Zheng; +Cc: kwolf, pbonzini, qemu-block, qemu-devel, stefanha

On Wed, Oct 21, 2015 at 10:06:39AM +0800, Fam Zheng wrote:
> So we can distinguish them from internally used fds, and thus avoid
> handling unwanted events in nested aio polls.
> 
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  nbd.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/nbd.c b/nbd.c
> index fbc66be..dab1ebb 100644
> --- a/nbd.c
> +++ b/nbd.c
> @@ -1446,7 +1446,7 @@ static void nbd_set_handlers(NBDClient *client)
>  {
>      if (client->exp && client->exp->ctx) {
>          aio_set_fd_handler(client->exp->ctx, client->sock,
> -                           false,
> +                           true,
>                             client->can_read ? nbd_read : NULL,
>                             client->send_coroutine ? nbd_restart_write : NULL,
>                             client);
> @@ -1457,7 +1457,7 @@ static void nbd_unset_handlers(NBDClient *client)
>  {
>      if (client->exp && client->exp->ctx) {
>          aio_set_fd_handler(client->exp->ctx, client->sock,
> -                           false, NULL, NULL, NULL);
> +                           true, NULL, NULL, NULL);
>      }
>  }
>  
> -- 
> 2.4.3
> 
> 

Reviewed-by: Jeff Cody <jcody@redhat.com>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [PATCH v5 04/12] aio: introduce aio_{disable, enable}_external
  2015-10-21  2:06 ` [Qemu-devel] [PATCH v5 04/12] aio: introduce aio_{disable, enable}_external Fam Zheng
@ 2015-10-21 15:56   ` Jeff Cody
  2015-10-22  2:11     ` Fam Zheng
  0 siblings, 1 reply; 29+ messages in thread
From: Jeff Cody @ 2015-10-21 15:56 UTC (permalink / raw)
  To: Fam Zheng; +Cc: kwolf, pbonzini, qemu-block, qemu-devel, stefanha

On Wed, Oct 21, 2015 at 10:06:41AM +0800, Fam Zheng wrote:
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  aio-posix.c         |  3 ++-
>  aio-win32.c         |  3 ++-
>  include/block/aio.h | 37 +++++++++++++++++++++++++++++++++++++
>  3 files changed, 41 insertions(+), 2 deletions(-)
> 
> diff --git a/aio-posix.c b/aio-posix.c
> index f0f9122..0467f23 100644
> --- a/aio-posix.c
> +++ b/aio-posix.c
> @@ -261,7 +261,8 @@ bool aio_poll(AioContext *ctx, bool blocking)
>  
>      /* fill pollfds */
>      QLIST_FOREACH(node, &ctx->aio_handlers, node) {
> -        if (!node->deleted && node->pfd.events) {
> +        if (!node->deleted && node->pfd.events
> +            && aio_node_check(ctx, node->is_external)) {
>              add_pollfd(node);
>          }
>      }
> diff --git a/aio-win32.c b/aio-win32.c
> index 3110d85..43c4c79 100644
> --- a/aio-win32.c
> +++ b/aio-win32.c
> @@ -309,7 +309,8 @@ bool aio_poll(AioContext *ctx, bool blocking)
>      /* fill fd sets */
>      count = 0;
>      QLIST_FOREACH(node, &ctx->aio_handlers, node) {
> -        if (!node->deleted && node->io_notify) {
> +        if (!node->deleted && node->io_notify
> +            && aio_node_check(ctx, node->is_external)) {
>              events[count++] = event_notifier_get_handle(node->e);
>          }
>      }
> diff --git a/include/block/aio.h b/include/block/aio.h
> index 12f1141..80151d1 100644
> --- a/include/block/aio.h
> +++ b/include/block/aio.h
> @@ -122,6 +122,8 @@ struct AioContext {
>  
>      /* TimerLists for calling timers - one per clock type */
>      QEMUTimerListGroup tlg;
> +
> +    int external_disable_cnt;
>  };
>  
>  /**
> @@ -375,4 +377,39 @@ static inline void aio_timer_init(AioContext *ctx,
>   */
>  int64_t aio_compute_timeout(AioContext *ctx);
>  
> +/**
> + * aio_disable_external:
> + * @ctx: the aio context
> + *
> + * Disable the furthur processing of clients.

s/furthur/further

The comment below specifically references external clients - I think
the comments for aio_disable_external / aio_enable_external should be
worded similarly, so there is no confusion.
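
For instance (wording is only a suggestion, not taken from the series):

    /**
     * aio_disable_external:
     * @ctx: the aio context
     *
     * Disable the further processing of external clients.
     */

    /**
     * aio_enable_external:
     * @ctx: the aio context
     *
     * Re-enable the processing of external clients.
     */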

> + */
> +static inline void aio_disable_external(AioContext *ctx)
> +{
> +    atomic_inc(&ctx->external_disable_cnt);
> +}
> +
> +/**
> + * aio_enable_external:
> + * @ctx: the aio context
> + *
> + * Disable the processing of external clients.

Should this comment read "Enable" instead of "Disable"?

> + */
> +static inline void aio_enable_external(AioContext *ctx)
> +{
> +    atomic_dec(&ctx->external_disable_cnt);

Should we assert(ctx->external_disable_cnt >= 0)?
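
Something along these lines, perhaps (sketch only; checking before the
decrement so the offending caller is the one that trips the assertion):

    static inline void aio_enable_external(AioContext *ctx)
    {
        /* every enable must be paired with an earlier disable */
        assert(atomic_read(&ctx->external_disable_cnt) > 0);
        atomic_dec(&ctx->external_disable_cnt);
    }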


Additional comment:  the function names aio_enable_external() and
aio_disable_external() may be a bit misleading (particularly
aio_enable_external()).  It doesn't do a blanket enable of external
aio (i.e., it does not just blindly do ctx->external_disable_cnt = 0).

Perhaps something like aio_external_disable_inc/dec()?  (I'm not real
fond of that, either).

Just something for thought.


> +}
> +
> +/**
> + * aio_node_check:
> + * @ctx: the aio context
> + * @is_external: Whether or not the checked node is an external event source.
> + *
> + * Check if the node's is_external flag is okey to be polled by the ctx at this

s/okey/okay

> + * moment. True means green light.
> + */
> +static inline bool aio_node_check(AioContext *ctx, bool is_external)
> +{
> +    return !is_external || !atomic_read(&ctx->external_disable_cnt);
> +}
> +

It seems a little odd to me to have this helper function take the
is_external bool field from the node as the argument - any reason to
do that, rather than pass in the AioHandler and have aio_node_check()
parse whatever fields it deems necessary from it?
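
For the record, a sketch of that alternative (note AioHandler is
currently a private struct in aio-posix.c / aio-win32.c rather than in
the header, which may be why the plain bool is passed today):

    /* hypothetical variant, would require exposing AioHandler */
    static inline bool aio_node_check(AioContext *ctx, AioHandler *node)
    {
        return !node->is_external || !atomic_read(&ctx->external_disable_cnt);
    }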

>  #endif
> -- 
> 2.4.3
> 
> 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [PATCH v5 05/12] block: Introduce "drained begin/end" API
  2015-10-21  2:06 ` [Qemu-devel] [PATCH v5 05/12] block: Introduce "drained begin/end" API Fam Zheng
@ 2015-10-21 16:11   ` Jeff Cody
  2015-10-22  2:20     ` Fam Zheng
  0 siblings, 1 reply; 29+ messages in thread
From: Jeff Cody @ 2015-10-21 16:11 UTC (permalink / raw)
  To: Fam Zheng; +Cc: kwolf, pbonzini, qemu-block, qemu-devel, stefanha

On Wed, Oct 21, 2015 at 10:06:42AM +0800, Fam Zheng wrote:
> The semantics is that after bdrv_drained_begin(bs), bs will not get new external
> requests until the matching bdrv_drained_end(bs).
> 
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  block/io.c                | 17 +++++++++++++++++
>  include/block/block.h     | 21 +++++++++++++++++++++
>  include/block/block_int.h |  2 ++
>  3 files changed, 40 insertions(+)
> 
> diff --git a/block/io.c b/block/io.c
> index 2fd7a1d..5ac6256 100644
> --- a/block/io.c
> +++ b/block/io.c
> @@ -2624,3 +2624,20 @@ void bdrv_flush_io_queue(BlockDriverState *bs)
>      }
>      bdrv_start_throttled_reqs(bs);
>  }
> +
> +void bdrv_drained_begin(BlockDriverState *bs)
> +{
> +    if (!bs->quiesce_counter++) {
> +        aio_disable_external(bdrv_get_aio_context(bs));
> +    }
> +    bdrv_drain(bs);
> +}
> +
> +void bdrv_drained_end(BlockDriverState *bs)
> +{
> +    assert(bs->quiesce_counter > 0);
> +    if (--bs->quiesce_counter > 0) {
> +        return;
> +    }
> +    aio_enable_external(bdrv_get_aio_context(bs));
> +}

Why do we need a quiesce counter, given that
aio_{disable, enable}_external() increments / decrements a counter?



> diff --git a/include/block/block.h b/include/block/block.h
> index 28d903c..6d38b62 100644
> --- a/include/block/block.h
> +++ b/include/block/block.h
> @@ -610,4 +610,25 @@ void bdrv_io_plug(BlockDriverState *bs);
>  void bdrv_io_unplug(BlockDriverState *bs);
>  void bdrv_flush_io_queue(BlockDriverState *bs);
>  
> +BlockAcctStats *bdrv_get_stats(BlockDriverState *bs);
> +

Is the above line from a bad rebase?


> +/**
> + * bdrv_drained_begin:
> + *
> + * Begin a quiesced section for exclusive access to the BDS, by disabling
> + * external request sources including NBD server and device model. Note that
> + * this doesn't block timers or coroutines from submitting more requests, which
> + * means block_job_pause is still necessary.
> + *
> + * This function can be recursive.
> + */
> +void bdrv_drained_begin(BlockDriverState *bs);
> +
> +/**
> + * bdrv_drained_end:
> + *
> + * End a quiescent section started by bdrv_drained_begin().
> + */
> +void bdrv_drained_end(BlockDriverState *bs);
> +
>  #endif
> diff --git a/include/block/block_int.h b/include/block/block_int.h
> index e472a03..e317b14 100644
> --- a/include/block/block_int.h
> +++ b/include/block/block_int.h
> @@ -448,6 +448,8 @@ struct BlockDriverState {
>      /* threshold limit for writes, in bytes. "High water mark". */
>      uint64_t write_threshold_offset;
>      NotifierWithReturn write_threshold_notifier;
> +
> +    int quiesce_counter;
>  };
>  
>  struct BlockBackendRootState {
> -- 
> 2.4.3
> 
> 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [PATCH v5 06/12] block: Add "drained begin/end" for transactional external snapshot
  2015-10-21  2:06 ` [Qemu-devel] [PATCH v5 06/12] block: Add "drained begin/end" for transactional external snapshot Fam Zheng
@ 2015-10-21 17:18   ` Jeff Cody
  0 siblings, 0 replies; 29+ messages in thread
From: Jeff Cody @ 2015-10-21 17:18 UTC (permalink / raw)
  To: Fam Zheng; +Cc: kwolf, pbonzini, qemu-block, qemu-devel, stefanha

On Wed, Oct 21, 2015 at 10:06:43AM +0800, Fam Zheng wrote:
> This ensures the atomicity of the transaction by avoiding processing of
> external requests such as those from ioeventfd.
> 
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  blockdev.c | 11 +++++++++--
>  1 file changed, 9 insertions(+), 2 deletions(-)
> 
> diff --git a/blockdev.c b/blockdev.c
> index 35e5719..e4a5eb4 100644
> --- a/blockdev.c
> +++ b/blockdev.c
> @@ -1569,6 +1569,7 @@ static void external_snapshot_prepare(BlkTransactionState *common,
>      /* Acquire AioContext now so any threads operating on old_bs stop */
>      state->aio_context = bdrv_get_aio_context(state->old_bs);
>      aio_context_acquire(state->aio_context);
> +    bdrv_drained_begin(state->old_bs);
>  
>      if (!bdrv_is_inserted(state->old_bs)) {
>          error_setg(errp, QERR_DEVICE_HAS_NO_MEDIUM, device);
> @@ -1638,8 +1639,6 @@ static void external_snapshot_commit(BlkTransactionState *common)
>       * don't want to abort all of them if one of them fails the reopen */
>      bdrv_reopen(state->old_bs, state->old_bs->open_flags & ~BDRV_O_RDWR,
>                  NULL);
> -
> -    aio_context_release(state->aio_context);
>  }
>  
>  static void external_snapshot_abort(BlkTransactionState *common)
> @@ -1649,7 +1648,14 @@ static void external_snapshot_abort(BlkTransactionState *common)
>      if (state->new_bs) {
>          bdrv_unref(state->new_bs);
>      }
> +}
> +
> +static void external_snapshot_clean(BlkTransactionState *common)
> +{
> +    ExternalSnapshotState *state =
> +                             DO_UPCAST(ExternalSnapshotState, common, common);
>      if (state->aio_context) {
> +        bdrv_drained_end(state->old_bs);
>          aio_context_release(state->aio_context);
>      }
>  }
> @@ -1809,6 +1815,7 @@ static const BdrvActionOps actions[] = {
>          .prepare  = external_snapshot_prepare,
>          .commit   = external_snapshot_commit,
>          .abort = external_snapshot_abort,
> +        .clean = external_snapshot_clean,
>      },
>      [TRANSACTION_ACTION_KIND_DRIVE_BACKUP] = {
>          .instance_size = sizeof(DriveBackupState),
> -- 
> 2.4.3
> 
> 

Reviewed-by: Jeff Cody <jcody@redhat.com>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [PATCH v5 07/12] block: Add "drained begin/end" for transactional backup
  2015-10-21  2:06 ` [Qemu-devel] [PATCH v5 07/12] block: Add "drained begin/end" for transactional backup Fam Zheng
@ 2015-10-21 17:20   ` Jeff Cody
  0 siblings, 0 replies; 29+ messages in thread
From: Jeff Cody @ 2015-10-21 17:20 UTC (permalink / raw)
  To: Fam Zheng; +Cc: kwolf, pbonzini, qemu-block, qemu-devel, stefanha

On Wed, Oct 21, 2015 at 10:06:44AM +0800, Fam Zheng wrote:
> This ensures the atomicity of the transaction by avoiding processing of
> external requests such as those from ioeventfd.
> 
> Move the assignment to state->bs up right after bdrv_drained_begin, so
> that we can use it in the clean callback. The abort callback will still
> check bs->job and state->job, so it's OK.
> 
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  blockdev.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/blockdev.c b/blockdev.c
> index e4a5eb4..0a7848b 100644
> --- a/blockdev.c
> +++ b/blockdev.c
> @@ -1684,9 +1684,16 @@ static void drive_backup_prepare(BlkTransactionState *common, Error **errp)
>          return;
>      }
>  
> +    if (!blk_is_available(blk)) {
> +        error_setg(errp, QERR_DEVICE_HAS_NO_MEDIUM, backup->device);
> +        return;
> +    }
> +
>      /* AioContext is released in .clean() */
>      state->aio_context = blk_get_aio_context(blk);
>      aio_context_acquire(state->aio_context);
> +    bdrv_drained_begin(blk_bs(blk));
> +    state->bs = blk_bs(blk);
>  
>      qmp_drive_backup(backup->device, backup->target,
>                       backup->has_format, backup->format,
> @@ -1702,7 +1709,6 @@ static void drive_backup_prepare(BlkTransactionState *common, Error **errp)
>          return;
>      }
>  
> -    state->bs = blk_bs(blk);
>      state->job = state->bs->job;
>  }
>  
> @@ -1722,6 +1728,7 @@ static void drive_backup_clean(BlkTransactionState *common)
>      DriveBackupState *state = DO_UPCAST(DriveBackupState, common, common);
>  
>      if (state->aio_context) {
> +        bdrv_drained_end(state->bs);
>          aio_context_release(state->aio_context);
>      }
>  }
> -- 
> 2.4.3
> 
> 

Reviewed-by: Jeff Cody <jcody@redhat.com>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [PATCH v5 08/12] block: Add "drained begin/end" for transactional blockdev-backup
  2015-10-21  2:06 ` [Qemu-devel] [PATCH v5 08/12] block: Add "drained begin/end" for transactional blockdev-backup Fam Zheng
@ 2015-10-21 17:25   ` Jeff Cody
  0 siblings, 0 replies; 29+ messages in thread
From: Jeff Cody @ 2015-10-21 17:25 UTC (permalink / raw)
  To: Fam Zheng; +Cc: kwolf, pbonzini, qemu-block, qemu-devel, stefanha

On Wed, Oct 21, 2015 at 10:06:45AM +0800, Fam Zheng wrote:
> Similar to the previous patch, make sure that external events are not
> dispatched during transaction operations.
> 
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  blockdev.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/blockdev.c b/blockdev.c
> index 0a7848b..52f44b2 100644
> --- a/blockdev.c
> +++ b/blockdev.c
> @@ -1756,6 +1756,11 @@ static void blockdev_backup_prepare(BlkTransactionState *common, Error **errp)
>          return;
>      }
>  
> +    if (!blk_is_available(blk)) {
> +        error_setg(errp, QERR_DEVICE_HAS_NO_MEDIUM, backup->device);
> +        return;
> +    }
> +
>      target = blk_by_name(backup->target);
>      if (!target) {
>          error_setg(errp, "Device '%s' not found", backup->target);
> @@ -1770,6 +1775,8 @@ static void blockdev_backup_prepare(BlkTransactionState *common, Error **errp)
>          return;
>      }
>      aio_context_acquire(state->aio_context);
> +    state->bs = blk_bs(blk);
> +    bdrv_drained_begin(state->bs);
>  
>      qmp_blockdev_backup(backup->device, backup->target,
>                          backup->sync,
> @@ -1782,7 +1789,6 @@ static void blockdev_backup_prepare(BlkTransactionState *common, Error **errp)
>          return;
>      }
>  
> -    state->bs = blk_bs(blk);
>      state->job = state->bs->job;
>  }
>  
> @@ -1802,6 +1808,7 @@ static void blockdev_backup_clean(BlkTransactionState *common)
>      BlockdevBackupState *state = DO_UPCAST(BlockdevBackupState, common, common);
>  
>      if (state->aio_context) {
> +        bdrv_drained_end(state->bs);
>          aio_context_release(state->aio_context);
>      }
>  }
> -- 
> 2.4.3
> 
> 

Reviewed-by: Jeff Cody <jcody@redhat.com>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [PATCH v5 09/12] block: Add "drained begin/end" for internal snapshot
  2015-10-21  2:06 ` [Qemu-devel] [PATCH v5 09/12] block: Add "drained begin/end" for internal snapshot Fam Zheng
  2015-10-21 13:37   ` Kevin Wolf
@ 2015-10-21 18:22   ` Jeff Cody
  1 sibling, 0 replies; 29+ messages in thread
From: Jeff Cody @ 2015-10-21 18:22 UTC (permalink / raw)
  To: Fam Zheng; +Cc: kwolf, pbonzini, qemu-block, qemu-devel, stefanha

On Wed, Oct 21, 2015 at 10:06:46AM +0800, Fam Zheng wrote:
> This ensures the atomicity of the transaction by avoiding processing of
> external requests such as those from ioeventfd.
> 
> state->bs is assigned right after bdrv_drained_begin. Because it was
> used as the flag for deletion or not in abort, now we need a separate
> flag - InternalSnapshotState.created.
> 
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  blockdev.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/blockdev.c b/blockdev.c
> index 52f44b2..92c2d0d 100644
> --- a/blockdev.c
> +++ b/blockdev.c
> @@ -1370,6 +1370,7 @@ typedef struct InternalSnapshotState {
>      BlockDriverState *bs;
>      AioContext *aio_context;
>      QEMUSnapshotInfo sn;
> +    bool created;
>  } InternalSnapshotState;
>  
>  static void internal_snapshot_prepare(BlkTransactionState *common,
> @@ -1407,6 +1408,8 @@ static void internal_snapshot_prepare(BlkTransactionState *common,
>      /* AioContext is released in .clean() */
>      state->aio_context = blk_get_aio_context(blk);
>      aio_context_acquire(state->aio_context);
> +    state->bs = blk_bs(blk);
> +    bdrv_drained_begin(state->bs);
> 

Do you think it'd make sense to move the context acquisition & call to
bdrv_drained_begin() below the check for blk_is_available()?  That way
the assignment of state->bs doesn't happen until after the call to
blk_is_available().  It looks like state->created protects us either
way, though - but if blk_is_inserted() fails, then there is no need
for _abort or _clean to do anything, if we hold off on grabbing the
context.

I think I am fine with it as is though, too.
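
For concreteness, the reordering suggested above would look roughly like
this (a sketch only, reusing the identifiers already in the patch; the
earlier blk lookup and its error handling are elided):

    /* check the medium before touching the AioContext or bs */
    if (!blk_is_available(blk)) {
        error_setg(errp, QERR_DEVICE_HAS_NO_MEDIUM, device);
        return;
    }

    /* AioContext is released in .clean() */
    state->aio_context = blk_get_aio_context(blk);
    aio_context_acquire(state->aio_context);
    state->bs = blk_bs(blk);
    bdrv_drained_begin(state->bs);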

>      if (!blk_is_available(blk)) {
>          error_setg(errp, QERR_DEVICE_HAS_NO_MEDIUM, device);
> @@ -1465,7 +1468,7 @@ static void internal_snapshot_prepare(BlkTransactionState *common,
>      }
>  
>      /* 4. succeed, mark a snapshot is created */
> -    state->bs = bs;
> +    state->created = true;
>  }
>  
>  static void internal_snapshot_abort(BlkTransactionState *common)
> @@ -1476,7 +1479,7 @@ static void internal_snapshot_abort(BlkTransactionState *common)
>      QEMUSnapshotInfo *sn = &state->sn;
>      Error *local_error = NULL;
>  
> -    if (!bs) {
> +    if (!state->created) {
>          return;
>      }
>  
> @@ -1497,6 +1500,7 @@ static void internal_snapshot_clean(BlkTransactionState *common)
>                                               common, common);
>  
>      if (state->aio_context) {
> +        bdrv_drained_end(state->bs);
>          aio_context_release(state->aio_context);
>      }
>  }
> -- 
> 2.4.3
> 
> 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [PATCH v5 10/12] block: Introduce BlockDriver.bdrv_drain callback
  2015-10-21  2:06 ` [Qemu-devel] [PATCH v5 10/12] block: Introduce BlockDriver.bdrv_drain callback Fam Zheng
@ 2015-10-21 18:25   ` Jeff Cody
  0 siblings, 0 replies; 29+ messages in thread
From: Jeff Cody @ 2015-10-21 18:25 UTC (permalink / raw)
  To: Fam Zheng; +Cc: kwolf, pbonzini, qemu-block, qemu-devel, stefanha

On Wed, Oct 21, 2015 at 10:06:47AM +0800, Fam Zheng wrote:
> Drivers can have internal request sources that generate IO, like the
> need_check_timer in QED. Since we want quiesced periods that contain
> nested event loops in block layer, we need to have a way to disable such
> event sources.
> 
> Block drivers must implement the "bdrv_drain" callback if it has any
> internal sources that can generate I/O activity, like a timer or a
> worker thread (even in a library) that can schedule QEMUBH in an
> asynchronous callback.
> 
> Update the comments of bdrv_drain and bdrv_drained_begin accordingly.
> 
> Signed-off-by: Fam Zheng <famz@redhat.com>



Reviewed-by: Jeff Cody <jcody@redhat.com>
 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [PATCH v5 04/12] aio: introduce aio_{disable, enable}_external
  2015-10-21 15:56   ` [Qemu-devel] [Qemu-block] " Jeff Cody
@ 2015-10-22  2:11     ` Fam Zheng
  2015-10-22  2:20       ` Jeff Cody
  0 siblings, 1 reply; 29+ messages in thread
From: Fam Zheng @ 2015-10-22  2:11 UTC (permalink / raw)
  To: Jeff Cody; +Cc: kwolf, pbonzini, qemu-block, qemu-devel, stefanha

On Wed, 10/21 11:56, Jeff Cody wrote:
> > +static inline bool aio_node_check(AioContext *ctx, bool is_external)
> > +{
> > +    return !is_external || !atomic_read(&ctx->external_disable_cnt);
> > +}
> > +
> 
> It seems a little odd to me to have this helper function take the
> is_external bool field from the node as the argument - any reason to
> do that, rather than pass in the AioHandler and have aio_node_check()
> parse whatever fields it deems necessary from it?

AioHandler is defined differently for posix and win32, but I didn't want to
duplicate this function in two files.
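
To illustrate (struct layouts abbreviated, not copied verbatim from
aio-posix.c / aio-win32.c): the two AioHandler definitions only agree on
the is_external flag, so passing the flag keeps aio_node_check() as a
single inline helper in aio.h:

    /* aio-posix.c (abbreviated) */
    struct AioHandler {
        GPollFD pfd;
        IOHandler *io_read;
        IOHandler *io_write;
        bool is_external;
        /* ... */
    };

    /* aio-win32.c (abbreviated) */
    struct AioHandler {
        EventNotifier *e;
        EventNotifierHandler *io_notify;
        bool is_external;
        /* ... */
    };

    /* include/block/aio.h -- shared by both, so it only sees the flag */
    static inline bool aio_node_check(AioContext *ctx, bool is_external)
    {
        return !is_external || !atomic_read(&ctx->external_disable_cnt);
    }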

Fam

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [PATCH v5 04/12] aio: introduce aio_{disable, enable}_external
  2015-10-22  2:11     ` Fam Zheng
@ 2015-10-22  2:20       ` Jeff Cody
  0 siblings, 0 replies; 29+ messages in thread
From: Jeff Cody @ 2015-10-22  2:20 UTC (permalink / raw)
  To: Fam Zheng; +Cc: kwolf, pbonzini, qemu-block, qemu-devel, stefanha

On Thu, Oct 22, 2015 at 10:11:16AM +0800, Fam Zheng wrote:
> On Wed, 10/21 11:56, Jeff Cody wrote:
> > > +static inline bool aio_node_check(AioContext *ctx, bool is_external)
> > > +{
> > > +    return !is_external || !atomic_read(&ctx->external_disable_cnt);
> > > +}
> > > +
> > 
> > It seems a little odd to me to have this helper function take the
> > is_external bool field from the node as the argument - any reason to
> > do that, rather than pass in the AioHandler and have aio_node_check()
> > parse whatever fields it deems necessary from it?
> 
> AioHandler is defined differently for posix and win32, but I didn't want to
> duplicate this function in two files.
> 
> Fam

That makes sense, thanks.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [PATCH v5 05/12] block: Introduce "drained begin/end" API
  2015-10-21 16:11   ` [Qemu-devel] [Qemu-block] " Jeff Cody
@ 2015-10-22  2:20     ` Fam Zheng
  0 siblings, 0 replies; 29+ messages in thread
From: Fam Zheng @ 2015-10-22  2:20 UTC (permalink / raw)
  To: Jeff Cody; +Cc: kwolf, pbonzini, qemu-block, qemu-devel, stefanha

On Wed, 10/21 12:11, Jeff Cody wrote:
> On Wed, Oct 21, 2015 at 10:06:42AM +0800, Fam Zheng wrote:
> > The semantics is that after bdrv_drained_begin(bs), bs will not get new external
> > requests until the matching bdrv_drained_end(bs).
> > 
> > Signed-off-by: Fam Zheng <famz@redhat.com>
> > ---
> >  block/io.c                | 17 +++++++++++++++++
> >  include/block/block.h     | 21 +++++++++++++++++++++
> >  include/block/block_int.h |  2 ++
> >  3 files changed, 40 insertions(+)
> > 
> > diff --git a/block/io.c b/block/io.c
> > index 2fd7a1d..5ac6256 100644
> > --- a/block/io.c
> > +++ b/block/io.c
> > @@ -2624,3 +2624,20 @@ void bdrv_flush_io_queue(BlockDriverState *bs)
> >      }
> >      bdrv_start_throttled_reqs(bs);
> >  }
> > +
> > +void bdrv_drained_begin(BlockDriverState *bs)
> > +{
> > +    if (!bs->quiesce_counter++) {
> > +        aio_disable_external(bdrv_get_aio_context(bs));
> > +    }
> > +    bdrv_drain(bs);
> > +}
> > +
> > +void bdrv_drained_end(BlockDriverState *bs)
> > +{
> > +    assert(bs->quiesce_counter > 0);
> > +    if (--bs->quiesce_counter > 0) {
> > +        return;
> > +    }
> > +    aio_enable_external(bdrv_get_aio_context(bs));
> > +}
> 
> Why do we need a quiesce counter, given that
> aio_{disable, enable}_external() increments / decrements a counter?

No longer necessary in this version because we call bdrv_drain unconditionally,
but this makes it possible to introduce bdrv_is_quiesced() and an assertion on it,
so I lean toward keeping it as it doesn't hurt.
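
A rough sketch of what such a helper and assertion could look like
(bdrv_is_quiesced() is hypothetical and not part of this series; it builds
only on the quiesce_counter field added in this patch):

    /* hypothetical helper, not in this series */
    bool bdrv_is_quiesced(BlockDriverState *bs)
    {
        return bs->quiesce_counter > 0;
    }

    /* code that requires a drained section could then assert it: */
    assert(bdrv_is_quiesced(bs));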

> 
> > diff --git a/include/block/block.h b/include/block/block.h
> > index 28d903c..6d38b62 100644
> > --- a/include/block/block.h
> > +++ b/include/block/block.h
> > @@ -610,4 +610,25 @@ void bdrv_io_plug(BlockDriverState *bs);
> >  void bdrv_io_unplug(BlockDriverState *bs);
> >  void bdrv_flush_io_queue(BlockDriverState *bs);
> >  
> > +BlockAcctStats *bdrv_get_stats(BlockDriverState *bs);
> > +
> 
> Is the above line from a bad rebase?

Yes, will remove.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [PATCH v5 11/12] qed: Implement .bdrv_drain
  2015-10-21  2:06 ` [Qemu-devel] [PATCH v5 11/12] qed: Implement .bdrv_drain Fam Zheng
@ 2015-10-22  2:20   ` Jeff Cody
  2015-10-22  2:59     ` Fam Zheng
  0 siblings, 1 reply; 29+ messages in thread
From: Jeff Cody @ 2015-10-22  2:20 UTC (permalink / raw)
  To: Fam Zheng; +Cc: kwolf, pbonzini, qemu-block, qemu-devel, stefanha

On Wed, Oct 21, 2015 at 10:06:48AM +0800, Fam Zheng wrote:
> The "need_check_timer" is used to clear the "NEED_CHECK" flag in the
> image header after a grace period once metadata update has finished. In
> compliance to the bdrv_drain semantics we should make sure it remains
> deleted once .bdrv_drain is called.
> 
> Call the qed_need_check_timer_cb manually to update the header
> immediately.
> 
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  block/qed.c | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/block/qed.c b/block/qed.c
> index 5ea05d4..e9dcb4d 100644
> --- a/block/qed.c
> +++ b/block/qed.c
> @@ -375,6 +375,12 @@ static void bdrv_qed_attach_aio_context(BlockDriverState *bs,
>      }
>  }
>  
> +static void bdrv_qed_drain(BlockDriverState *bs)
> +{
> +    qed_cancel_need_check_timer(bs->opaque);
> +    qed_need_check_timer_cb(bs->opaque);
> +}
> +

Uh oh.

This causes a segfault sometimes, and other times an abort:


   # ./qemu-img create -f qed test.qed 512M
   Formatting 'test.qed', fmt=qed size=536870912 cluster_size=65536

   # ./qemu-io -c "read 0 512M" test.qed
   read 536870912/536870912 bytes at offset 0
   512 MiB, 1 ops; 0.0556 sec (8.988 GiB/sec and 17.9759 ops/sec)
   Segmentation fault (core dumped)


If I run the above qemu-io command with gdb, it will abort in
qed_plug_allocating_write_reqs().

I'd hazard a guess (I have not verified) that it is due to the
qed_header_write() call triggered by the aio flush callback function
qed_clear_need_check().  The aio flush is done inside the
qed_need_check_timer_cb() call.



>  static int bdrv_qed_open(BlockDriverState *bs, QDict *options, int flags,
>                           Error **errp)
>  {
> @@ -1676,6 +1682,7 @@ static BlockDriver bdrv_qed = {
>      .bdrv_check               = bdrv_qed_check,
>      .bdrv_detach_aio_context  = bdrv_qed_detach_aio_context,
>      .bdrv_attach_aio_context  = bdrv_qed_attach_aio_context,
> +    .bdrv_drain               = bdrv_qed_drain,
>  };
>  
>  static void bdrv_qed_init(void)
> -- 
> 2.4.3
> 
> 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [PATCH v5 11/12] qed: Implement .bdrv_drain
  2015-10-22  2:20   ` [Qemu-devel] [Qemu-block] " Jeff Cody
@ 2015-10-22  2:59     ` Fam Zheng
  0 siblings, 0 replies; 29+ messages in thread
From: Fam Zheng @ 2015-10-22  2:59 UTC (permalink / raw)
  To: Jeff Cody; +Cc: kwolf, pbonzini, qemu-block, qemu-devel, stefanha

On Wed, 10/21 22:20, Jeff Cody wrote:
> On Wed, Oct 21, 2015 at 10:06:48AM +0800, Fam Zheng wrote:
> > The "need_check_timer" is used to clear the "NEED_CHECK" flag in the
> > image header after a grace period once metadata update has finished. In
> > compliance to the bdrv_drain semantics we should make sure it remains
> > deleted once .bdrv_drain is called.
> > 
> > Call the qed_need_check_timer_cb manually to update the header
> > immediately.
> > 
> > Signed-off-by: Fam Zheng <famz@redhat.com>
> > ---
> >  block/qed.c | 7 +++++++
> >  1 file changed, 7 insertions(+)
> > 
> > diff --git a/block/qed.c b/block/qed.c
> > index 5ea05d4..e9dcb4d 100644
> > --- a/block/qed.c
> > +++ b/block/qed.c
> > @@ -375,6 +375,12 @@ static void bdrv_qed_attach_aio_context(BlockDriverState *bs,
> >      }
> >  }
> >  
> > +static void bdrv_qed_drain(BlockDriverState *bs)
> > +{
> > +    qed_cancel_need_check_timer(bs->opaque);
> > +    qed_need_check_timer_cb(bs->opaque);
> > +}
> > +
> 
> Uh oh.
> 
> This causes a segfault sometimes, and other times an abort:
> 
> 
>    # ./qemu-img create -f qed test.qed 512M
>    Formatting 'test.qed', fmt=qed size=536870912 cluster_size=65536
> 
>    # ./qemu-io -c "read 0 512M" test.qed
>    read 536870912/536870912 bytes at offset 0
>    512 MiB, 1 ops; 0.0556 sec (8.988 GiB/sec and 17.9759 ops/sec)
>    Segmentation fault (core dumped)
> 
> 
> If I run the above qemu-io command with gdb, it will abort in
> qed_plug_allocating_write_reqs().
> 
> I'd hazard a guess (I have not verified) that it is due to the
> qed_header_write() call triggered by the aio flush callback function
> qed_clear_need_check().  The aio flush is done inside the
> qed_need_check_timer_cb() call.

Good catch, I think it's because of the second bdrv_drain in bdrv_close(),
when the first bdrv_aio_flush in qed_need_check_timer_cb hasn't finished.

We need a different bdrv_qed_drain implementation here.
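
One possible direction (a sketch only, untested): only fire the callback
when the timer is actually armed, so a second drain -- e.g. the one issued
by bdrv_close() -- does not re-enter qed_need_check_timer_cb() while the
flush started by the first drain is still in flight:

    static void bdrv_qed_drain(BlockDriverState *bs)
    {
        BDRVQEDState *s = bs->opaque;

        /* If a previous drain already fired the check, the timer is gone
         * and there is nothing to do; calling qed_need_check_timer_cb()
         * again would plug the allocating write queue a second time. */
        if (s->need_check_timer && timer_pending(s->need_check_timer)) {
            qed_cancel_need_check_timer(s);
            qed_need_check_timer_cb(s);
        }
    }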

Fam

> 
> 
> 
> >  static int bdrv_qed_open(BlockDriverState *bs, QDict *options, int flags,
> >                           Error **errp)
> >  {
> > @@ -1676,6 +1682,7 @@ static BlockDriver bdrv_qed = {
> >      .bdrv_check               = bdrv_qed_check,
> >      .bdrv_detach_aio_context  = bdrv_qed_detach_aio_context,
> >      .bdrv_attach_aio_context  = bdrv_qed_attach_aio_context,
> > +    .bdrv_drain               = bdrv_qed_drain,
> >  };
> >  
> >  static void bdrv_qed_init(void)
> > -- 
> > 2.4.3
> > 
> > 

^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread (newest message: 2015-10-22  2:59 UTC)

Thread overview: 29+ messages
2015-10-21  2:06 [Qemu-devel] [PATCH v5 00/12] block: Protect nested event loop with bdrv_drained_begin and bdrv_drained_end Fam Zheng
2015-10-21  2:06 ` [Qemu-devel] [PATCH v5 01/12] aio: Add "is_external" flag for event handlers Fam Zheng
2015-10-21 15:21   ` [Qemu-devel] [Qemu-block] " Jeff Cody
2015-10-21  2:06 ` [Qemu-devel] [PATCH v5 02/12] nbd: Mark fd handlers client type as "external" Fam Zheng
2015-10-21 15:21   ` [Qemu-devel] [Qemu-block] " Jeff Cody
2015-10-21  2:06 ` [Qemu-devel] [PATCH v5 03/12] dataplane: Mark host notifiers' " Fam Zheng
2015-10-21  2:06 ` [Qemu-devel] [PATCH v5 04/12] aio: introduce aio_{disable, enable}_external Fam Zheng
2015-10-21 15:56   ` [Qemu-devel] [Qemu-block] " Jeff Cody
2015-10-22  2:11     ` Fam Zheng
2015-10-22  2:20       ` Jeff Cody
2015-10-21  2:06 ` [Qemu-devel] [PATCH v5 05/12] block: Introduce "drained begin/end" API Fam Zheng
2015-10-21 16:11   ` [Qemu-devel] [Qemu-block] " Jeff Cody
2015-10-22  2:20     ` Fam Zheng
2015-10-21  2:06 ` [Qemu-devel] [PATCH v5 06/12] block: Add "drained begin/end" for transactional external snapshot Fam Zheng
2015-10-21 17:18   ` [Qemu-devel] [Qemu-block] " Jeff Cody
2015-10-21  2:06 ` [Qemu-devel] [PATCH v5 07/12] block: Add "drained begin/end" for transactional backup Fam Zheng
2015-10-21 17:20   ` [Qemu-devel] [Qemu-block] " Jeff Cody
2015-10-21  2:06 ` [Qemu-devel] [PATCH v5 08/12] block: Add "drained begin/end" for transactional blockdev-backup Fam Zheng
2015-10-21 17:25   ` [Qemu-devel] [Qemu-block] " Jeff Cody
2015-10-21  2:06 ` [Qemu-devel] [PATCH v5 09/12] block: Add "drained begin/end" for internal snapshot Fam Zheng
2015-10-21 13:37   ` Kevin Wolf
2015-10-21 18:22   ` [Qemu-devel] [Qemu-block] " Jeff Cody
2015-10-21  2:06 ` [Qemu-devel] [PATCH v5 10/12] block: Introduce BlockDriver.bdrv_drain callback Fam Zheng
2015-10-21 18:25   ` [Qemu-devel] [Qemu-block] " Jeff Cody
2015-10-21  2:06 ` [Qemu-devel] [PATCH v5 11/12] qed: Implement .bdrv_drain Fam Zheng
2015-10-22  2:20   ` [Qemu-devel] [Qemu-block] " Jeff Cody
2015-10-22  2:59     ` Fam Zheng
2015-10-21  2:06 ` [Qemu-devel] [PATCH v5 12/12] tests: Add test case for aio_disable_external Fam Zheng
2015-10-21 13:40 ` [Qemu-devel] [PATCH v5 00/12] block: Protect nested event loop with bdrv_drained_begin and bdrv_drained_end Kevin Wolf
