All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH v2 00/12] block: bdrv_drained_begin/end for transactions on dataplane devices
@ 2015-10-12 11:50 Fam Zheng
  2015-10-12 11:50 ` [Qemu-devel] [PATCH v2 01/12] aio: Add "is_external" flag for event handlers Fam Zheng
                   ` (11 more replies)
  0 siblings, 12 replies; 20+ messages in thread
From: Fam Zheng @ 2015-10-12 11:50 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Paolo Bonzini, qemu-block, Stefan Hajnoczi

v2: Use "bool" for external/internal instead of bit mask in the interface, so I
    didn't pick up Kevin's rev-by due to patches change.
    Add a unit test in tests/test-aio.c.

External I/O requests mustn't come in while we're in the middle of a
QMP transaction. This series adds bdrv_drained_begin and bdrv_drained_end
around "prepare" and "cleanup" in the transaction actions to make sure that.

Note that "backup" action already starts the block job coroutine in prepare,
and "snapshot" action already creates the snapshot, but both are OK because we
call bdrv_drained_begin first.

The nested event loops also dispatch timers and BHs on the same AioContext. The
existing timers are iscsi, curl, qcow2 and qed. The only one that does I/O is
qed, which is dealt with by a new ".bdrv_drain" callback.

Fam Zheng (12):
  aio: Add "is_external" flag for event handlers
  nbd: Mark fd handlers client type as "external"
  dataplane: Mark host notifiers' client type as "external"
  aio: introduce aio_{disable,enable}_external
  block: Introduce "drained begin/end" API
  block: Add "drained begin/end" for transactional external snapshot
  block: Add "drained begin/end" for transactional backup
  block: Add "drained begin/end" for transactional blockdev-backup
  block: Add "drained begin/end" for internal snapshot
  block: Introduce BlockDriver.bdrv_drain callback
  qed: Implement .bdrv_drain
  tests: Add test case for aio_disable_external

 aio-posix.c                     |  9 ++++-
 aio-win32.c                     |  8 +++-
 async.c                         |  3 +-
 block.c                         |  2 +
 block/curl.c                    | 14 ++++---
 block/io.c                      | 21 +++++++++++
 block/iscsi.c                   |  9 ++---
 block/linux-aio.c               |  5 ++-
 block/nbd-client.c              | 10 +++--
 block/nfs.c                     | 17 ++++-----
 block/qed.c                     |  7 ++++
 block/sheepdog.c                | 38 ++++++++++++-------
 block/ssh.c                     |  5 ++-
 block/win32-aio.c               |  5 ++-
 blockdev.c                      | 27 +++++++++++---
 hw/block/dataplane/virtio-blk.c |  5 ++-
 hw/scsi/virtio-scsi-dataplane.c | 22 +++++++----
 include/block/aio.h             | 39 ++++++++++++++++++++
 include/block/block.h           | 19 ++++++++++
 include/block/block_int.h       |  8 ++++
 iohandler.c                     |  3 +-
 nbd.c                           |  4 +-
 tests/test-aio.c                | 82 ++++++++++++++++++++++++++++-------------
 23 files changed, 270 insertions(+), 92 deletions(-)

-- 
2.6.1

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Qemu-devel] [PATCH v2 01/12] aio: Add "is_external" flag for event handlers
  2015-10-12 11:50 [Qemu-devel] [PATCH v2 00/12] block: bdrv_drained_begin/end for transactions on dataplane devices Fam Zheng
@ 2015-10-12 11:50 ` Fam Zheng
  2015-10-12 11:50 ` [Qemu-devel] [PATCH v2 02/12] nbd: Mark fd handlers client type as "external" Fam Zheng
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 20+ messages in thread
From: Fam Zheng @ 2015-10-12 11:50 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Paolo Bonzini, qemu-block, Stefan Hajnoczi

All callers pass in false, and the real external ones will switch to
true in coming patches.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 aio-posix.c                     |  6 ++++-
 aio-win32.c                     |  5 ++++
 async.c                         |  3 ++-
 block/curl.c                    | 14 +++++-----
 block/iscsi.c                   |  9 +++----
 block/linux-aio.c               |  5 ++--
 block/nbd-client.c              | 10 ++++---
 block/nfs.c                     | 17 +++++-------
 block/sheepdog.c                | 38 ++++++++++++++++++---------
 block/ssh.c                     |  5 ++--
 block/win32-aio.c               |  5 ++--
 hw/block/dataplane/virtio-blk.c |  6 +++--
 hw/scsi/virtio-scsi-dataplane.c | 24 +++++++++++------
 include/block/aio.h             |  2 ++
 iohandler.c                     |  3 ++-
 nbd.c                           |  4 ++-
 tests/test-aio.c                | 58 +++++++++++++++++++++++------------------
 17 files changed, 130 insertions(+), 84 deletions(-)

diff --git a/aio-posix.c b/aio-posix.c
index d477033..f0f9122 100644
--- a/aio-posix.c
+++ b/aio-posix.c
@@ -25,6 +25,7 @@ struct AioHandler
     IOHandler *io_write;
     int deleted;
     void *opaque;
+    bool is_external;
     QLIST_ENTRY(AioHandler) node;
 };
 
@@ -43,6 +44,7 @@ static AioHandler *find_aio_handler(AioContext *ctx, int fd)
 
 void aio_set_fd_handler(AioContext *ctx,
                         int fd,
+                        bool is_external,
                         IOHandler *io_read,
                         IOHandler *io_write,
                         void *opaque)
@@ -82,6 +84,7 @@ void aio_set_fd_handler(AioContext *ctx,
         node->io_read = io_read;
         node->io_write = io_write;
         node->opaque = opaque;
+        node->is_external = is_external;
 
         node->pfd.events = (io_read ? G_IO_IN | G_IO_HUP | G_IO_ERR : 0);
         node->pfd.events |= (io_write ? G_IO_OUT | G_IO_ERR : 0);
@@ -92,10 +95,11 @@ void aio_set_fd_handler(AioContext *ctx,
 
 void aio_set_event_notifier(AioContext *ctx,
                             EventNotifier *notifier,
+                            bool is_external,
                             EventNotifierHandler *io_read)
 {
     aio_set_fd_handler(ctx, event_notifier_get_fd(notifier),
-                       (IOHandler *)io_read, NULL, notifier);
+                       is_external, (IOHandler *)io_read, NULL, notifier);
 }
 
 bool aio_prepare(AioContext *ctx)
diff --git a/aio-win32.c b/aio-win32.c
index 50a6867..3110d85 100644
--- a/aio-win32.c
+++ b/aio-win32.c
@@ -28,11 +28,13 @@ struct AioHandler {
     GPollFD pfd;
     int deleted;
     void *opaque;
+    bool is_external;
     QLIST_ENTRY(AioHandler) node;
 };
 
 void aio_set_fd_handler(AioContext *ctx,
                         int fd,
+                        bool is_external,
                         IOHandler *io_read,
                         IOHandler *io_write,
                         void *opaque)
@@ -86,6 +88,7 @@ void aio_set_fd_handler(AioContext *ctx,
         node->opaque = opaque;
         node->io_read = io_read;
         node->io_write = io_write;
+        node->is_external = is_external;
 
         event = event_notifier_get_handle(&ctx->notifier);
         WSAEventSelect(node->pfd.fd, event,
@@ -98,6 +101,7 @@ void aio_set_fd_handler(AioContext *ctx,
 
 void aio_set_event_notifier(AioContext *ctx,
                             EventNotifier *e,
+                            bool is_external,
                             EventNotifierHandler *io_notify)
 {
     AioHandler *node;
@@ -133,6 +137,7 @@ void aio_set_event_notifier(AioContext *ctx,
             node->e = e;
             node->pfd.fd = (uintptr_t)event_notifier_get_handle(e);
             node->pfd.events = G_IO_IN;
+            node->is_external = is_external;
             QLIST_INSERT_HEAD(&ctx->aio_handlers, node, node);
 
             g_source_add_poll(&ctx->source, &node->pfd);
diff --git a/async.c b/async.c
index efce14b..bdc64a3 100644
--- a/async.c
+++ b/async.c
@@ -247,7 +247,7 @@ aio_ctx_finalize(GSource     *source)
     }
     qemu_mutex_unlock(&ctx->bh_lock);
 
-    aio_set_event_notifier(ctx, &ctx->notifier, NULL);
+    aio_set_event_notifier(ctx, &ctx->notifier, false, NULL);
     event_notifier_cleanup(&ctx->notifier);
     rfifolock_destroy(&ctx->lock);
     qemu_mutex_destroy(&ctx->bh_lock);
@@ -329,6 +329,7 @@ AioContext *aio_context_new(Error **errp)
     }
     g_source_set_can_recurse(&ctx->source, true);
     aio_set_event_notifier(ctx, &ctx->notifier,
+                           false,
                            (EventNotifierHandler *)
                            event_notifier_dummy_cb);
     ctx->thread_pool = NULL;
diff --git a/block/curl.c b/block/curl.c
index 032cc8a..8994182 100644
--- a/block/curl.c
+++ b/block/curl.c
@@ -154,18 +154,20 @@ static int curl_sock_cb(CURL *curl, curl_socket_t fd, int action,
     DPRINTF("CURL (AIO): Sock action %d on fd %d\n", action, fd);
     switch (action) {
         case CURL_POLL_IN:
-            aio_set_fd_handler(s->aio_context, fd, curl_multi_read,
-                               NULL, state);
+            aio_set_fd_handler(s->aio_context, fd, false,
+                               curl_multi_read, NULL, state);
             break;
         case CURL_POLL_OUT:
-            aio_set_fd_handler(s->aio_context, fd, NULL, curl_multi_do, state);
+            aio_set_fd_handler(s->aio_context, fd, false,
+                               NULL, curl_multi_do, state);
             break;
         case CURL_POLL_INOUT:
-            aio_set_fd_handler(s->aio_context, fd, curl_multi_read,
-                               curl_multi_do, state);
+            aio_set_fd_handler(s->aio_context, fd, false,
+                               curl_multi_read, curl_multi_do, state);
             break;
         case CURL_POLL_REMOVE:
-            aio_set_fd_handler(s->aio_context, fd, NULL, NULL, NULL);
+            aio_set_fd_handler(s->aio_context, fd, false,
+                               NULL, NULL, NULL);
             break;
     }
 
diff --git a/block/iscsi.c b/block/iscsi.c
index 93f1ee4..9a628b7 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -291,8 +291,8 @@ iscsi_set_events(IscsiLun *iscsilun)
     int ev = iscsi_which_events(iscsi);
 
     if (ev != iscsilun->events) {
-        aio_set_fd_handler(iscsilun->aio_context,
-                           iscsi_get_fd(iscsi),
+        aio_set_fd_handler(iscsilun->aio_context, iscsi_get_fd(iscsi),
+                           false,
                            (ev & POLLIN) ? iscsi_process_read : NULL,
                            (ev & POLLOUT) ? iscsi_process_write : NULL,
                            iscsilun);
@@ -1280,9 +1280,8 @@ static void iscsi_detach_aio_context(BlockDriverState *bs)
 {
     IscsiLun *iscsilun = bs->opaque;
 
-    aio_set_fd_handler(iscsilun->aio_context,
-                       iscsi_get_fd(iscsilun->iscsi),
-                       NULL, NULL, NULL);
+    aio_set_fd_handler(iscsilun->aio_context, iscsi_get_fd(iscsilun->iscsi),
+                       false, NULL, NULL, NULL);
     iscsilun->events = 0;
 
     if (iscsilun->nop_timer) {
diff --git a/block/linux-aio.c b/block/linux-aio.c
index c991443..88b0520 100644
--- a/block/linux-aio.c
+++ b/block/linux-aio.c
@@ -287,7 +287,7 @@ void laio_detach_aio_context(void *s_, AioContext *old_context)
 {
     struct qemu_laio_state *s = s_;
 
-    aio_set_event_notifier(old_context, &s->e, NULL);
+    aio_set_event_notifier(old_context, &s->e, false, NULL);
     qemu_bh_delete(s->completion_bh);
 }
 
@@ -296,7 +296,8 @@ void laio_attach_aio_context(void *s_, AioContext *new_context)
     struct qemu_laio_state *s = s_;
 
     s->completion_bh = aio_bh_new(new_context, qemu_laio_completion_bh, s);
-    aio_set_event_notifier(new_context, &s->e, qemu_laio_completion_cb);
+    aio_set_event_notifier(new_context, &s->e, false,
+                           qemu_laio_completion_cb);
 }
 
 void *laio_init(void)
diff --git a/block/nbd-client.c b/block/nbd-client.c
index e1bb919..b7fd17a 100644
--- a/block/nbd-client.c
+++ b/block/nbd-client.c
@@ -124,7 +124,7 @@ static int nbd_co_send_request(BlockDriverState *bs,
     s->send_coroutine = qemu_coroutine_self();
     aio_context = bdrv_get_aio_context(bs);
 
-    aio_set_fd_handler(aio_context, s->sock,
+    aio_set_fd_handler(aio_context, s->sock, false,
                        nbd_reply_ready, nbd_restart_write, bs);
     if (qiov) {
         if (!s->is_unix) {
@@ -144,7 +144,8 @@ static int nbd_co_send_request(BlockDriverState *bs,
     } else {
         rc = nbd_send_request(s->sock, request);
     }
-    aio_set_fd_handler(aio_context, s->sock, nbd_reply_ready, NULL, bs);
+    aio_set_fd_handler(aio_context, s->sock, false,
+                       nbd_reply_ready, NULL, bs);
     s->send_coroutine = NULL;
     qemu_co_mutex_unlock(&s->send_mutex);
     return rc;
@@ -348,14 +349,15 @@ int nbd_client_co_discard(BlockDriverState *bs, int64_t sector_num,
 void nbd_client_detach_aio_context(BlockDriverState *bs)
 {
     aio_set_fd_handler(bdrv_get_aio_context(bs),
-                       nbd_get_client_session(bs)->sock, NULL, NULL, NULL);
+                       nbd_get_client_session(bs)->sock,
+                       false, NULL, NULL, NULL);
 }
 
 void nbd_client_attach_aio_context(BlockDriverState *bs,
                                    AioContext *new_context)
 {
     aio_set_fd_handler(new_context, nbd_get_client_session(bs)->sock,
-                       nbd_reply_ready, NULL, bs);
+                       false, nbd_reply_ready, NULL, bs);
 }
 
 void nbd_client_close(BlockDriverState *bs)
diff --git a/block/nfs.c b/block/nfs.c
index 887a98e..fd79f89 100644
--- a/block/nfs.c
+++ b/block/nfs.c
@@ -63,11 +63,10 @@ static void nfs_set_events(NFSClient *client)
 {
     int ev = nfs_which_events(client->context);
     if (ev != client->events) {
-        aio_set_fd_handler(client->aio_context,
-                           nfs_get_fd(client->context),
+        aio_set_fd_handler(client->aio_context, nfs_get_fd(client->context),
+                           false,
                            (ev & POLLIN) ? nfs_process_read : NULL,
-                           (ev & POLLOUT) ? nfs_process_write : NULL,
-                           client);
+                           (ev & POLLOUT) ? nfs_process_write : NULL, client);
 
     }
     client->events = ev;
@@ -242,9 +241,8 @@ static void nfs_detach_aio_context(BlockDriverState *bs)
 {
     NFSClient *client = bs->opaque;
 
-    aio_set_fd_handler(client->aio_context,
-                       nfs_get_fd(client->context),
-                       NULL, NULL, NULL);
+    aio_set_fd_handler(client->aio_context, nfs_get_fd(client->context),
+                       false, NULL, NULL, NULL);
     client->events = 0;
 }
 
@@ -263,9 +261,8 @@ static void nfs_client_close(NFSClient *client)
         if (client->fh) {
             nfs_close(client->context, client->fh);
         }
-        aio_set_fd_handler(client->aio_context,
-                           nfs_get_fd(client->context),
-                           NULL, NULL, NULL);
+        aio_set_fd_handler(client->aio_context, nfs_get_fd(client->context),
+                           false, NULL, NULL, NULL);
         nfs_destroy_context(client->context);
     }
     memset(client, 0, sizeof(NFSClient));
diff --git a/block/sheepdog.c b/block/sheepdog.c
index e7e58b7..d80e4ed 100644
--- a/block/sheepdog.c
+++ b/block/sheepdog.c
@@ -651,14 +651,16 @@ static coroutine_fn void do_co_req(void *opaque)
     unsigned int *rlen = srco->rlen;
 
     co = qemu_coroutine_self();
-    aio_set_fd_handler(srco->aio_context, sockfd, NULL, restart_co_req, co);
+    aio_set_fd_handler(srco->aio_context, sockfd, false,
+                       NULL, restart_co_req, co);
 
     ret = send_co_req(sockfd, hdr, data, wlen);
     if (ret < 0) {
         goto out;
     }
 
-    aio_set_fd_handler(srco->aio_context, sockfd, restart_co_req, NULL, co);
+    aio_set_fd_handler(srco->aio_context, sockfd, false,
+                       restart_co_req, NULL, co);
 
     ret = qemu_co_recv(sockfd, hdr, sizeof(*hdr));
     if (ret != sizeof(*hdr)) {
@@ -683,7 +685,8 @@ static coroutine_fn void do_co_req(void *opaque)
 out:
     /* there is at most one request for this sockfd, so it is safe to
      * set each handler to NULL. */
-    aio_set_fd_handler(srco->aio_context, sockfd, NULL, NULL, NULL);
+    aio_set_fd_handler(srco->aio_context, sockfd, false,
+                       NULL, NULL, NULL);
 
     srco->ret = ret;
     srco->finished = true;
@@ -735,7 +738,8 @@ static coroutine_fn void reconnect_to_sdog(void *opaque)
     BDRVSheepdogState *s = opaque;
     AIOReq *aio_req, *next;
 
-    aio_set_fd_handler(s->aio_context, s->fd, NULL, NULL, NULL);
+    aio_set_fd_handler(s->aio_context, s->fd, false, NULL,
+                       NULL, NULL);
     close(s->fd);
     s->fd = -1;
 
@@ -938,7 +942,8 @@ static int get_sheep_fd(BDRVSheepdogState *s, Error **errp)
         return fd;
     }
 
-    aio_set_fd_handler(s->aio_context, fd, co_read_response, NULL, s);
+    aio_set_fd_handler(s->aio_context, fd, false,
+                       co_read_response, NULL, s);
     return fd;
 }
 
@@ -1199,7 +1204,7 @@ static void coroutine_fn add_aio_request(BDRVSheepdogState *s, AIOReq *aio_req,
 
     qemu_co_mutex_lock(&s->lock);
     s->co_send = qemu_coroutine_self();
-    aio_set_fd_handler(s->aio_context, s->fd,
+    aio_set_fd_handler(s->aio_context, s->fd, false,
                        co_read_response, co_write_request, s);
     socket_set_cork(s->fd, 1);
 
@@ -1218,7 +1223,8 @@ static void coroutine_fn add_aio_request(BDRVSheepdogState *s, AIOReq *aio_req,
     }
 out:
     socket_set_cork(s->fd, 0);
-    aio_set_fd_handler(s->aio_context, s->fd, co_read_response, NULL, s);
+    aio_set_fd_handler(s->aio_context, s->fd, false,
+                       co_read_response, NULL, s);
     s->co_send = NULL;
     qemu_co_mutex_unlock(&s->lock);
 }
@@ -1368,7 +1374,8 @@ static void sd_detach_aio_context(BlockDriverState *bs)
 {
     BDRVSheepdogState *s = bs->opaque;
 
-    aio_set_fd_handler(s->aio_context, s->fd, NULL, NULL, NULL);
+    aio_set_fd_handler(s->aio_context, s->fd, false, NULL,
+                       NULL, NULL);
 }
 
 static void sd_attach_aio_context(BlockDriverState *bs,
@@ -1377,7 +1384,8 @@ static void sd_attach_aio_context(BlockDriverState *bs,
     BDRVSheepdogState *s = bs->opaque;
 
     s->aio_context = new_context;
-    aio_set_fd_handler(new_context, s->fd, co_read_response, NULL, s);
+    aio_set_fd_handler(new_context, s->fd, false,
+                       co_read_response, NULL, s);
 }
 
 /* TODO Convert to fine grained options */
@@ -1490,7 +1498,8 @@ static int sd_open(BlockDriverState *bs, QDict *options, int flags,
     g_free(buf);
     return 0;
 out:
-    aio_set_fd_handler(bdrv_get_aio_context(bs), s->fd, NULL, NULL, NULL);
+    aio_set_fd_handler(bdrv_get_aio_context(bs), s->fd,
+                       false, NULL, NULL, NULL);
     if (s->fd >= 0) {
         closesocket(s->fd);
     }
@@ -1528,7 +1537,8 @@ static void sd_reopen_commit(BDRVReopenState *state)
     BDRVSheepdogState *s = state->bs->opaque;
 
     if (s->fd) {
-        aio_set_fd_handler(s->aio_context, s->fd, NULL, NULL, NULL);
+        aio_set_fd_handler(s->aio_context, s->fd, false,
+                           NULL, NULL, NULL);
         closesocket(s->fd);
     }
 
@@ -1551,7 +1561,8 @@ static void sd_reopen_abort(BDRVReopenState *state)
     }
 
     if (re_s->fd) {
-        aio_set_fd_handler(s->aio_context, re_s->fd, NULL, NULL, NULL);
+        aio_set_fd_handler(s->aio_context, re_s->fd, false,
+                           NULL, NULL, NULL);
         closesocket(re_s->fd);
     }
 
@@ -1935,7 +1946,8 @@ static void sd_close(BlockDriverState *bs)
         error_report("%s, %s", sd_strerror(rsp->result), s->name);
     }
 
-    aio_set_fd_handler(bdrv_get_aio_context(bs), s->fd, NULL, NULL, NULL);
+    aio_set_fd_handler(bdrv_get_aio_context(bs), s->fd,
+                       false, NULL, NULL, NULL);
     closesocket(s->fd);
     g_free(s->host_spec);
 }
diff --git a/block/ssh.c b/block/ssh.c
index 8d06739..148ac3e 100644
--- a/block/ssh.c
+++ b/block/ssh.c
@@ -803,14 +803,15 @@ static coroutine_fn void set_fd_handler(BDRVSSHState *s, BlockDriverState *bs)
             rd_handler, wr_handler);
 
     aio_set_fd_handler(bdrv_get_aio_context(bs), s->sock,
-                       rd_handler, wr_handler, co);
+                       false, rd_handler, wr_handler, co);
 }
 
 static coroutine_fn void clear_fd_handler(BDRVSSHState *s,
                                           BlockDriverState *bs)
 {
     DPRINTF("s->sock=%d", s->sock);
-    aio_set_fd_handler(bdrv_get_aio_context(bs), s->sock, NULL, NULL, NULL);
+    aio_set_fd_handler(bdrv_get_aio_context(bs), s->sock,
+                       false, NULL, NULL, NULL);
 }
 
 /* A non-blocking call returned EAGAIN, so yield, ensuring the
diff --git a/block/win32-aio.c b/block/win32-aio.c
index 64e8682..bbf2f01 100644
--- a/block/win32-aio.c
+++ b/block/win32-aio.c
@@ -174,7 +174,7 @@ int win32_aio_attach(QEMUWin32AIOState *aio, HANDLE hfile)
 void win32_aio_detach_aio_context(QEMUWin32AIOState *aio,
                                   AioContext *old_context)
 {
-    aio_set_event_notifier(old_context, &aio->e, NULL);
+    aio_set_event_notifier(old_context, &aio->e, false, NULL);
     aio->is_aio_context_attached = false;
 }
 
@@ -182,7 +182,8 @@ void win32_aio_attach_aio_context(QEMUWin32AIOState *aio,
                                   AioContext *new_context)
 {
     aio->is_aio_context_attached = true;
-    aio_set_event_notifier(new_context, &aio->e, win32_aio_completion_cb);
+    aio_set_event_notifier(new_context, &aio->e, false,
+                           win32_aio_completion_cb);
 }
 
 QEMUWin32AIOState *win32_aio_init(void)
diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index 6106e46..f8716bc 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -283,7 +283,8 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s)
 
     /* Get this show started by hooking up our callbacks */
     aio_context_acquire(s->ctx);
-    aio_set_event_notifier(s->ctx, &s->host_notifier, handle_notify);
+    aio_set_event_notifier(s->ctx, &s->host_notifier, false,
+                           handle_notify);
     aio_context_release(s->ctx);
     return;
 
@@ -319,7 +320,8 @@ void virtio_blk_data_plane_stop(VirtIOBlockDataPlane *s)
     aio_context_acquire(s->ctx);
 
     /* Stop notifications for new requests from guest */
-    aio_set_event_notifier(s->ctx, &s->host_notifier, NULL);
+    aio_set_event_notifier(s->ctx, &s->host_notifier, false,
+                           NULL);
 
     /* Drain and switch bs back to the QEMU main loop */
     blk_set_aio_context(s->conf->conf.blk, qemu_get_aio_context());
diff --git a/hw/scsi/virtio-scsi-dataplane.c b/hw/scsi/virtio-scsi-dataplane.c
index 5575648..d149418 100644
--- a/hw/scsi/virtio-scsi-dataplane.c
+++ b/hw/scsi/virtio-scsi-dataplane.c
@@ -60,7 +60,8 @@ static VirtIOSCSIVring *virtio_scsi_vring_init(VirtIOSCSI *s,
     r = g_slice_new(VirtIOSCSIVring);
     r->host_notifier = *virtio_queue_get_host_notifier(vq);
     r->guest_notifier = *virtio_queue_get_guest_notifier(vq);
-    aio_set_event_notifier(s->ctx, &r->host_notifier, handler);
+    aio_set_event_notifier(s->ctx, &r->host_notifier, false,
+                           handler);
 
     r->parent = s;
 
@@ -71,7 +72,8 @@ static VirtIOSCSIVring *virtio_scsi_vring_init(VirtIOSCSI *s,
     return r;
 
 fail_vring:
-    aio_set_event_notifier(s->ctx, &r->host_notifier, NULL);
+    aio_set_event_notifier(s->ctx, &r->host_notifier, false,
+                           NULL);
     k->set_host_notifier(qbus->parent, n, false);
     g_slice_free(VirtIOSCSIVring, r);
     return NULL;
@@ -162,14 +164,17 @@ static void virtio_scsi_clear_aio(VirtIOSCSI *s)
     int i;
 
     if (s->ctrl_vring) {
-        aio_set_event_notifier(s->ctx, &s->ctrl_vring->host_notifier, NULL);
+        aio_set_event_notifier(s->ctx, &s->ctrl_vring->host_notifier,
+                               false, NULL);
     }
     if (s->event_vring) {
-        aio_set_event_notifier(s->ctx, &s->event_vring->host_notifier, NULL);
+        aio_set_event_notifier(s->ctx, &s->event_vring->host_notifier,
+                               false, NULL);
     }
     if (s->cmd_vrings) {
         for (i = 0; i < vs->conf.num_queues && s->cmd_vrings[i]; i++) {
-            aio_set_event_notifier(s->ctx, &s->cmd_vrings[i]->host_notifier, NULL);
+            aio_set_event_notifier(s->ctx, &s->cmd_vrings[i]->host_notifier,
+                                   false, NULL);
         }
     }
 }
@@ -290,10 +295,13 @@ void virtio_scsi_dataplane_stop(VirtIOSCSI *s)
 
     aio_context_acquire(s->ctx);
 
-    aio_set_event_notifier(s->ctx, &s->ctrl_vring->host_notifier, NULL);
-    aio_set_event_notifier(s->ctx, &s->event_vring->host_notifier, NULL);
+    aio_set_event_notifier(s->ctx, &s->ctrl_vring->host_notifier,
+                           false, NULL);
+    aio_set_event_notifier(s->ctx, &s->event_vring->host_notifier,
+                           false, NULL);
     for (i = 0; i < vs->conf.num_queues; i++) {
-        aio_set_event_notifier(s->ctx, &s->cmd_vrings[i]->host_notifier, NULL);
+        aio_set_event_notifier(s->ctx, &s->cmd_vrings[i]->host_notifier,
+                               false, NULL);
     }
 
     blk_drain_all(); /* ensure there are no in-flight requests */
diff --git a/include/block/aio.h b/include/block/aio.h
index 400b1b0..12f1141 100644
--- a/include/block/aio.h
+++ b/include/block/aio.h
@@ -299,6 +299,7 @@ bool aio_poll(AioContext *ctx, bool blocking);
  */
 void aio_set_fd_handler(AioContext *ctx,
                         int fd,
+                        bool is_external,
                         IOHandler *io_read,
                         IOHandler *io_write,
                         void *opaque);
@@ -312,6 +313,7 @@ void aio_set_fd_handler(AioContext *ctx,
  */
 void aio_set_event_notifier(AioContext *ctx,
                             EventNotifier *notifier,
+                            bool is_external,
                             EventNotifierHandler *io_read);
 
 /* Return a GSource that lets the main loop poll the file descriptors attached
diff --git a/iohandler.c b/iohandler.c
index 55f8501..eb68083 100644
--- a/iohandler.c
+++ b/iohandler.c
@@ -55,7 +55,8 @@ void qemu_set_fd_handler(int fd,
                          void *opaque)
 {
     iohandler_init();
-    aio_set_fd_handler(iohandler_ctx, fd, fd_read, fd_write, opaque);
+    aio_set_fd_handler(iohandler_ctx, fd, false,
+                       fd_read, fd_write, opaque);
 }
 
 /* reaping of zombies.  right now we're not passing the status to
diff --git a/nbd.c b/nbd.c
index 07240bd..32a1f66 100644
--- a/nbd.c
+++ b/nbd.c
@@ -1446,6 +1446,7 @@ static void nbd_set_handlers(NBDClient *client)
 {
     if (client->exp && client->exp->ctx) {
         aio_set_fd_handler(client->exp->ctx, client->sock,
+                           false,
                            client->can_read ? nbd_read : NULL,
                            client->send_coroutine ? nbd_restart_write : NULL,
                            client);
@@ -1455,7 +1456,8 @@ static void nbd_set_handlers(NBDClient *client)
 static void nbd_unset_handlers(NBDClient *client)
 {
     if (client->exp && client->exp->ctx) {
-        aio_set_fd_handler(client->exp->ctx, client->sock, NULL, NULL, NULL);
+        aio_set_fd_handler(client->exp->ctx, client->sock,
+                           false, NULL, NULL, NULL);
     }
 }
 
diff --git a/tests/test-aio.c b/tests/test-aio.c
index 217e337..03cd45d 100644
--- a/tests/test-aio.c
+++ b/tests/test-aio.c
@@ -118,6 +118,12 @@ static void *test_acquire_thread(void *opaque)
     return NULL;
 }
 
+static void set_event_notifier(AioContext *ctx, EventNotifier *notifier,
+                               EventNotifierHandler *handler)
+{
+    aio_set_event_notifier(ctx, notifier, false, handler);
+}
+
 static void dummy_notifier_read(EventNotifier *unused)
 {
     g_assert(false); /* should never be invoked */
@@ -131,7 +137,7 @@ static void test_acquire(void)
 
     /* Dummy event notifier ensures aio_poll() will block */
     event_notifier_init(&notifier, false);
-    aio_set_event_notifier(ctx, &notifier, dummy_notifier_read);
+    set_event_notifier(ctx, &notifier, dummy_notifier_read);
     g_assert(!aio_poll(ctx, false)); /* consume aio_notify() */
 
     qemu_mutex_init(&data.start_lock);
@@ -149,7 +155,7 @@ static void test_acquire(void)
     aio_context_release(ctx);
 
     qemu_thread_join(&thread);
-    aio_set_event_notifier(ctx, &notifier, NULL);
+    set_event_notifier(ctx, &notifier, NULL);
     event_notifier_cleanup(&notifier);
 
     g_assert(data.thread_acquired);
@@ -308,11 +314,11 @@ static void test_set_event_notifier(void)
 {
     EventNotifierTestData data = { .n = 0, .active = 0 };
     event_notifier_init(&data.e, false);
-    aio_set_event_notifier(ctx, &data.e, event_ready_cb);
+    set_event_notifier(ctx, &data.e, event_ready_cb);
     g_assert(!aio_poll(ctx, false));
     g_assert_cmpint(data.n, ==, 0);
 
-    aio_set_event_notifier(ctx, &data.e, NULL);
+    set_event_notifier(ctx, &data.e, NULL);
     g_assert(!aio_poll(ctx, false));
     g_assert_cmpint(data.n, ==, 0);
     event_notifier_cleanup(&data.e);
@@ -322,7 +328,7 @@ static void test_wait_event_notifier(void)
 {
     EventNotifierTestData data = { .n = 0, .active = 1 };
     event_notifier_init(&data.e, false);
-    aio_set_event_notifier(ctx, &data.e, event_ready_cb);
+    set_event_notifier(ctx, &data.e, event_ready_cb);
     while (aio_poll(ctx, false));
     g_assert_cmpint(data.n, ==, 0);
     g_assert_cmpint(data.active, ==, 1);
@@ -336,7 +342,7 @@ static void test_wait_event_notifier(void)
     g_assert_cmpint(data.n, ==, 1);
     g_assert_cmpint(data.active, ==, 0);
 
-    aio_set_event_notifier(ctx, &data.e, NULL);
+    set_event_notifier(ctx, &data.e, NULL);
     g_assert(!aio_poll(ctx, false));
     g_assert_cmpint(data.n, ==, 1);
 
@@ -347,7 +353,7 @@ static void test_flush_event_notifier(void)
 {
     EventNotifierTestData data = { .n = 0, .active = 10, .auto_set = true };
     event_notifier_init(&data.e, false);
-    aio_set_event_notifier(ctx, &data.e, event_ready_cb);
+    set_event_notifier(ctx, &data.e, event_ready_cb);
     while (aio_poll(ctx, false));
     g_assert_cmpint(data.n, ==, 0);
     g_assert_cmpint(data.active, ==, 10);
@@ -363,7 +369,7 @@ static void test_flush_event_notifier(void)
     g_assert_cmpint(data.active, ==, 0);
     g_assert(!aio_poll(ctx, false));
 
-    aio_set_event_notifier(ctx, &data.e, NULL);
+    set_event_notifier(ctx, &data.e, NULL);
     g_assert(!aio_poll(ctx, false));
     event_notifier_cleanup(&data.e);
 }
@@ -374,7 +380,7 @@ static void test_wait_event_notifier_noflush(void)
     EventNotifierTestData dummy = { .n = 0, .active = 1 };
 
     event_notifier_init(&data.e, false);
-    aio_set_event_notifier(ctx, &data.e, event_ready_cb);
+    set_event_notifier(ctx, &data.e, event_ready_cb);
 
     g_assert(!aio_poll(ctx, false));
     g_assert_cmpint(data.n, ==, 0);
@@ -387,7 +393,7 @@ static void test_wait_event_notifier_noflush(void)
 
     /* An active event notifier forces aio_poll to look at EventNotifiers.  */
     event_notifier_init(&dummy.e, false);
-    aio_set_event_notifier(ctx, &dummy.e, event_ready_cb);
+    set_event_notifier(ctx, &dummy.e, event_ready_cb);
 
     event_notifier_set(&data.e);
     g_assert(aio_poll(ctx, false));
@@ -407,10 +413,10 @@ static void test_wait_event_notifier_noflush(void)
     g_assert_cmpint(dummy.n, ==, 1);
     g_assert_cmpint(dummy.active, ==, 0);
 
-    aio_set_event_notifier(ctx, &dummy.e, NULL);
+    set_event_notifier(ctx, &dummy.e, NULL);
     event_notifier_cleanup(&dummy.e);
 
-    aio_set_event_notifier(ctx, &data.e, NULL);
+    set_event_notifier(ctx, &data.e, NULL);
     g_assert(!aio_poll(ctx, false));
     g_assert_cmpint(data.n, ==, 2);
 
@@ -428,7 +434,7 @@ static void test_timer_schedule(void)
      * an fd to wait on. Fixing this breaks other tests. So create a dummy one.
      */
     event_notifier_init(&e, false);
-    aio_set_event_notifier(ctx, &e, dummy_io_handler_read);
+    set_event_notifier(ctx, &e, dummy_io_handler_read);
     aio_poll(ctx, false);
 
     aio_timer_init(ctx, &data.timer, data.clock_type,
@@ -467,7 +473,7 @@ static void test_timer_schedule(void)
     g_assert(!aio_poll(ctx, false));
     g_assert_cmpint(data.n, ==, 2);
 
-    aio_set_event_notifier(ctx, &e, NULL);
+    set_event_notifier(ctx, &e, NULL);
     event_notifier_cleanup(&e);
 
     timer_del(&data.timer);
@@ -638,11 +644,11 @@ static void test_source_set_event_notifier(void)
 {
     EventNotifierTestData data = { .n = 0, .active = 0 };
     event_notifier_init(&data.e, false);
-    aio_set_event_notifier(ctx, &data.e, event_ready_cb);
+    set_event_notifier(ctx, &data.e, event_ready_cb);
     while (g_main_context_iteration(NULL, false));
     g_assert_cmpint(data.n, ==, 0);
 
-    aio_set_event_notifier(ctx, &data.e, NULL);
+    set_event_notifier(ctx, &data.e, NULL);
     while (g_main_context_iteration(NULL, false));
     g_assert_cmpint(data.n, ==, 0);
     event_notifier_cleanup(&data.e);
@@ -652,7 +658,7 @@ static void test_source_wait_event_notifier(void)
 {
     EventNotifierTestData data = { .n = 0, .active = 1 };
     event_notifier_init(&data.e, false);
-    aio_set_event_notifier(ctx, &data.e, event_ready_cb);
+    set_event_notifier(ctx, &data.e, event_ready_cb);
     while (g_main_context_iteration(NULL, false));
     g_assert_cmpint(data.n, ==, 0);
     g_assert_cmpint(data.active, ==, 1);
@@ -666,7 +672,7 @@ static void test_source_wait_event_notifier(void)
     g_assert_cmpint(data.n, ==, 1);
     g_assert_cmpint(data.active, ==, 0);
 
-    aio_set_event_notifier(ctx, &data.e, NULL);
+    set_event_notifier(ctx, &data.e, NULL);
     while (g_main_context_iteration(NULL, false));
     g_assert_cmpint(data.n, ==, 1);
 
@@ -677,7 +683,7 @@ static void test_source_flush_event_notifier(void)
 {
     EventNotifierTestData data = { .n = 0, .active = 10, .auto_set = true };
     event_notifier_init(&data.e, false);
-    aio_set_event_notifier(ctx, &data.e, event_ready_cb);
+    set_event_notifier(ctx, &data.e, event_ready_cb);
     while (g_main_context_iteration(NULL, false));
     g_assert_cmpint(data.n, ==, 0);
     g_assert_cmpint(data.active, ==, 10);
@@ -693,7 +699,7 @@ static void test_source_flush_event_notifier(void)
     g_assert_cmpint(data.active, ==, 0);
     g_assert(!g_main_context_iteration(NULL, false));
 
-    aio_set_event_notifier(ctx, &data.e, NULL);
+    set_event_notifier(ctx, &data.e, NULL);
     while (g_main_context_iteration(NULL, false));
     event_notifier_cleanup(&data.e);
 }
@@ -704,7 +710,7 @@ static void test_source_wait_event_notifier_noflush(void)
     EventNotifierTestData dummy = { .n = 0, .active = 1 };
 
     event_notifier_init(&data.e, false);
-    aio_set_event_notifier(ctx, &data.e, event_ready_cb);
+    set_event_notifier(ctx, &data.e, event_ready_cb);
 
     while (g_main_context_iteration(NULL, false));
     g_assert_cmpint(data.n, ==, 0);
@@ -717,7 +723,7 @@ static void test_source_wait_event_notifier_noflush(void)
 
     /* An active event notifier forces aio_poll to look at EventNotifiers.  */
     event_notifier_init(&dummy.e, false);
-    aio_set_event_notifier(ctx, &dummy.e, event_ready_cb);
+    set_event_notifier(ctx, &dummy.e, event_ready_cb);
 
     event_notifier_set(&data.e);
     g_assert(g_main_context_iteration(NULL, false));
@@ -737,10 +743,10 @@ static void test_source_wait_event_notifier_noflush(void)
     g_assert_cmpint(dummy.n, ==, 1);
     g_assert_cmpint(dummy.active, ==, 0);
 
-    aio_set_event_notifier(ctx, &dummy.e, NULL);
+    set_event_notifier(ctx, &dummy.e, NULL);
     event_notifier_cleanup(&dummy.e);
 
-    aio_set_event_notifier(ctx, &data.e, NULL);
+    set_event_notifier(ctx, &data.e, NULL);
     while (g_main_context_iteration(NULL, false));
     g_assert_cmpint(data.n, ==, 2);
 
@@ -759,7 +765,7 @@ static void test_source_timer_schedule(void)
      * an fd to wait on. Fixing this breaks other tests. So create a dummy one.
      */
     event_notifier_init(&e, false);
-    aio_set_event_notifier(ctx, &e, dummy_io_handler_read);
+    set_event_notifier(ctx, &e, dummy_io_handler_read);
     do {} while (g_main_context_iteration(NULL, false));
 
     aio_timer_init(ctx, &data.timer, data.clock_type,
@@ -784,7 +790,7 @@ static void test_source_timer_schedule(void)
     g_assert_cmpint(data.n, ==, 2);
     g_assert(qemu_clock_get_ns(data.clock_type) > expiry);
 
-    aio_set_event_notifier(ctx, &e, NULL);
+    set_event_notifier(ctx, &e, NULL);
     event_notifier_cleanup(&e);
 
     timer_del(&data.timer);
-- 
2.6.1

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [Qemu-devel] [PATCH v2 02/12] nbd: Mark fd handlers client type as "external"
  2015-10-12 11:50 [Qemu-devel] [PATCH v2 00/12] block: bdrv_drained_begin/end for transactions on dataplane devices Fam Zheng
  2015-10-12 11:50 ` [Qemu-devel] [PATCH v2 01/12] aio: Add "is_external" flag for event handlers Fam Zheng
@ 2015-10-12 11:50 ` Fam Zheng
  2015-10-12 11:50 ` [Qemu-devel] [PATCH v2 03/12] dataplane: Mark host notifiers' " Fam Zheng
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 20+ messages in thread
From: Fam Zheng @ 2015-10-12 11:50 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Paolo Bonzini, qemu-block, Stefan Hajnoczi

So we could distinguish it from internal used fds, thus avoid handling
unwanted events in nested aio polls.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 nbd.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/nbd.c b/nbd.c
index 32a1f66..b599e62 100644
--- a/nbd.c
+++ b/nbd.c
@@ -1446,7 +1446,7 @@ static void nbd_set_handlers(NBDClient *client)
 {
     if (client->exp && client->exp->ctx) {
         aio_set_fd_handler(client->exp->ctx, client->sock,
-                           false,
+                           true,
                            client->can_read ? nbd_read : NULL,
                            client->send_coroutine ? nbd_restart_write : NULL,
                            client);
@@ -1457,7 +1457,7 @@ static void nbd_unset_handlers(NBDClient *client)
 {
     if (client->exp && client->exp->ctx) {
         aio_set_fd_handler(client->exp->ctx, client->sock,
-                           false, NULL, NULL, NULL);
+                           true, NULL, NULL, NULL);
     }
 }
 
-- 
2.6.1

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [Qemu-devel] [PATCH v2 03/12] dataplane: Mark host notifiers' client type as "external"
  2015-10-12 11:50 [Qemu-devel] [PATCH v2 00/12] block: bdrv_drained_begin/end for transactions on dataplane devices Fam Zheng
  2015-10-12 11:50 ` [Qemu-devel] [PATCH v2 01/12] aio: Add "is_external" flag for event handlers Fam Zheng
  2015-10-12 11:50 ` [Qemu-devel] [PATCH v2 02/12] nbd: Mark fd handlers client type as "external" Fam Zheng
@ 2015-10-12 11:50 ` Fam Zheng
  2015-10-12 11:50 ` [Qemu-devel] [PATCH v2 04/12] aio: introduce aio_{disable, enable}_external Fam Zheng
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 20+ messages in thread
From: Fam Zheng @ 2015-10-12 11:50 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Paolo Bonzini, qemu-block, Stefan Hajnoczi

They will be excluded by type in the nested event loops in block layer,
so that unwanted events won't be processed there.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 hw/block/dataplane/virtio-blk.c |  5 ++---
 hw/scsi/virtio-scsi-dataplane.c | 18 ++++++++----------
 2 files changed, 10 insertions(+), 13 deletions(-)

diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index f8716bc..c42ddeb 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -283,7 +283,7 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s)
 
     /* Get this show started by hooking up our callbacks */
     aio_context_acquire(s->ctx);
-    aio_set_event_notifier(s->ctx, &s->host_notifier, false,
+    aio_set_event_notifier(s->ctx, &s->host_notifier, true,
                            handle_notify);
     aio_context_release(s->ctx);
     return;
@@ -320,8 +320,7 @@ void virtio_blk_data_plane_stop(VirtIOBlockDataPlane *s)
     aio_context_acquire(s->ctx);
 
     /* Stop notifications for new requests from guest */
-    aio_set_event_notifier(s->ctx, &s->host_notifier, false,
-                           NULL);
+    aio_set_event_notifier(s->ctx, &s->host_notifier, true, NULL);
 
     /* Drain and switch bs back to the QEMU main loop */
     blk_set_aio_context(s->conf->conf.blk, qemu_get_aio_context());
diff --git a/hw/scsi/virtio-scsi-dataplane.c b/hw/scsi/virtio-scsi-dataplane.c
index d149418..1c188f0 100644
--- a/hw/scsi/virtio-scsi-dataplane.c
+++ b/hw/scsi/virtio-scsi-dataplane.c
@@ -60,8 +60,7 @@ static VirtIOSCSIVring *virtio_scsi_vring_init(VirtIOSCSI *s,
     r = g_slice_new(VirtIOSCSIVring);
     r->host_notifier = *virtio_queue_get_host_notifier(vq);
     r->guest_notifier = *virtio_queue_get_guest_notifier(vq);
-    aio_set_event_notifier(s->ctx, &r->host_notifier, false,
-                           handler);
+    aio_set_event_notifier(s->ctx, &r->host_notifier, true, handler);
 
     r->parent = s;
 
@@ -72,8 +71,7 @@ static VirtIOSCSIVring *virtio_scsi_vring_init(VirtIOSCSI *s,
     return r;
 
 fail_vring:
-    aio_set_event_notifier(s->ctx, &r->host_notifier, false,
-                           NULL);
+    aio_set_event_notifier(s->ctx, &r->host_notifier, true, NULL);
     k->set_host_notifier(qbus->parent, n, false);
     g_slice_free(VirtIOSCSIVring, r);
     return NULL;
@@ -165,16 +163,16 @@ static void virtio_scsi_clear_aio(VirtIOSCSI *s)
 
     if (s->ctrl_vring) {
         aio_set_event_notifier(s->ctx, &s->ctrl_vring->host_notifier,
-                               false, NULL);
+                               true, NULL);
     }
     if (s->event_vring) {
         aio_set_event_notifier(s->ctx, &s->event_vring->host_notifier,
-                               false, NULL);
+                               true, NULL);
     }
     if (s->cmd_vrings) {
         for (i = 0; i < vs->conf.num_queues && s->cmd_vrings[i]; i++) {
             aio_set_event_notifier(s->ctx, &s->cmd_vrings[i]->host_notifier,
-                                   false, NULL);
+                                   true, NULL);
         }
     }
 }
@@ -296,12 +294,12 @@ void virtio_scsi_dataplane_stop(VirtIOSCSI *s)
     aio_context_acquire(s->ctx);
 
     aio_set_event_notifier(s->ctx, &s->ctrl_vring->host_notifier,
-                           false, NULL);
+                           true, NULL);
     aio_set_event_notifier(s->ctx, &s->event_vring->host_notifier,
-                           false, NULL);
+                           true, NULL);
     for (i = 0; i < vs->conf.num_queues; i++) {
         aio_set_event_notifier(s->ctx, &s->cmd_vrings[i]->host_notifier,
-                               false, NULL);
+                               true, NULL);
     }
 
     blk_drain_all(); /* ensure there are no in-flight requests */
-- 
2.6.1

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [Qemu-devel] [PATCH v2 04/12] aio: introduce aio_{disable, enable}_external
  2015-10-12 11:50 [Qemu-devel] [PATCH v2 00/12] block: bdrv_drained_begin/end for transactions on dataplane devices Fam Zheng
                   ` (2 preceding siblings ...)
  2015-10-12 11:50 ` [Qemu-devel] [PATCH v2 03/12] dataplane: Mark host notifiers' " Fam Zheng
@ 2015-10-12 11:50 ` Fam Zheng
  2015-10-12 11:50 ` [Qemu-devel] [PATCH v2 05/12] block: Introduce "drained begin/end" API Fam Zheng
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 20+ messages in thread
From: Fam Zheng @ 2015-10-12 11:50 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Paolo Bonzini, qemu-block, Stefan Hajnoczi

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 aio-posix.c         |  3 ++-
 aio-win32.c         |  3 ++-
 include/block/aio.h | 37 +++++++++++++++++++++++++++++++++++++
 3 files changed, 41 insertions(+), 2 deletions(-)

diff --git a/aio-posix.c b/aio-posix.c
index f0f9122..0467f23 100644
--- a/aio-posix.c
+++ b/aio-posix.c
@@ -261,7 +261,8 @@ bool aio_poll(AioContext *ctx, bool blocking)
 
     /* fill pollfds */
     QLIST_FOREACH(node, &ctx->aio_handlers, node) {
-        if (!node->deleted && node->pfd.events) {
+        if (!node->deleted && node->pfd.events
+            && aio_node_check(ctx, node->is_external)) {
             add_pollfd(node);
         }
     }
diff --git a/aio-win32.c b/aio-win32.c
index 3110d85..43c4c79 100644
--- a/aio-win32.c
+++ b/aio-win32.c
@@ -309,7 +309,8 @@ bool aio_poll(AioContext *ctx, bool blocking)
     /* fill fd sets */
     count = 0;
     QLIST_FOREACH(node, &ctx->aio_handlers, node) {
-        if (!node->deleted && node->io_notify) {
+        if (!node->deleted && node->io_notify
+            && aio_node_check(ctx, node->is_external)) {
             events[count++] = event_notifier_get_handle(node->e);
         }
     }
diff --git a/include/block/aio.h b/include/block/aio.h
index 12f1141..80151d1 100644
--- a/include/block/aio.h
+++ b/include/block/aio.h
@@ -122,6 +122,8 @@ struct AioContext {
 
     /* TimerLists for calling timers - one per clock type */
     QEMUTimerListGroup tlg;
+
+    int external_disable_cnt;
 };
 
 /**
@@ -375,4 +377,39 @@ static inline void aio_timer_init(AioContext *ctx,
  */
 int64_t aio_compute_timeout(AioContext *ctx);
 
+/**
+ * aio_disable_external:
+ * @ctx: the aio context
+ *
+ * Disable the furthur processing of clients.
+ */
+static inline void aio_disable_external(AioContext *ctx)
+{
+    atomic_inc(&ctx->external_disable_cnt);
+}
+
+/**
+ * aio_enable_external:
+ * @ctx: the aio context
+ *
+ * Disable the processing of external clients.
+ */
+static inline void aio_enable_external(AioContext *ctx)
+{
+    atomic_dec(&ctx->external_disable_cnt);
+}
+
+/**
+ * aio_node_check:
+ * @ctx: the aio context
+ * @is_external: Whether or not the checked node is an external event source.
+ *
+ * Check if the node's is_external flag is okey to be polled by the ctx at this
+ * moment. True means green light.
+ */
+static inline bool aio_node_check(AioContext *ctx, bool is_external)
+{
+    return !is_external || !atomic_read(&ctx->external_disable_cnt);
+}
+
 #endif
-- 
2.6.1

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [Qemu-devel] [PATCH v2 05/12] block: Introduce "drained begin/end" API
  2015-10-12 11:50 [Qemu-devel] [PATCH v2 00/12] block: bdrv_drained_begin/end for transactions on dataplane devices Fam Zheng
                   ` (3 preceding siblings ...)
  2015-10-12 11:50 ` [Qemu-devel] [PATCH v2 04/12] aio: introduce aio_{disable, enable}_external Fam Zheng
@ 2015-10-12 11:50 ` Fam Zheng
  2015-10-12 13:59   ` Kevin Wolf
  2015-10-12 14:27   ` Paolo Bonzini
  2015-10-12 11:50 ` [Qemu-devel] [PATCH v2 06/12] block: Add "drained begin/end" for transactional external snapshot Fam Zheng
                   ` (6 subsequent siblings)
  11 siblings, 2 replies; 20+ messages in thread
From: Fam Zheng @ 2015-10-12 11:50 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Paolo Bonzini, qemu-block, Stefan Hajnoczi

The semantics is that after bdrv_drained_begin(bs), bs will not get new external
requests until the matching bdrv_drained_end(bs).

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 block.c                   |  2 ++
 block/io.c                | 18 ++++++++++++++++++
 include/block/block.h     | 19 +++++++++++++++++++
 include/block/block_int.h |  2 ++
 4 files changed, 41 insertions(+)

diff --git a/block.c b/block.c
index 1f90b47..9b28a07 100644
--- a/block.c
+++ b/block.c
@@ -2058,6 +2058,8 @@ static void bdrv_move_feature_fields(BlockDriverState *bs_dest,
     bs_dest->device_list = bs_src->device_list;
     bs_dest->blk = bs_src->blk;
 
+    bs_dest->quiesce_counter = bs_src->quiesce_counter;
+
     memcpy(bs_dest->op_blockers, bs_src->op_blockers,
            sizeof(bs_dest->op_blockers));
 }
diff --git a/block/io.c b/block/io.c
index 94e18e6..5c088d5 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2618,3 +2618,21 @@ void bdrv_flush_io_queue(BlockDriverState *bs)
     }
     bdrv_start_throttled_reqs(bs);
 }
+
+void bdrv_drained_begin(BlockDriverState *bs)
+{
+    if (bs->quiesce_counter++) {
+        return;
+    }
+    aio_disable_external(bdrv_get_aio_context(bs));
+    bdrv_drain(bs);
+}
+
+void bdrv_drained_end(BlockDriverState *bs)
+{
+    assert(bs->quiesce_counter > 0);
+    if (--bs->quiesce_counter > 0) {
+        return;
+    }
+    aio_enable_external(bdrv_get_aio_context(bs));
+}
diff --git a/include/block/block.h b/include/block/block.h
index 2dd6630..c4f6eef 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -619,4 +619,23 @@ void bdrv_flush_io_queue(BlockDriverState *bs);
 
 BlockAcctStats *bdrv_get_stats(BlockDriverState *bs);
 
+/**
+ * bdrv_drained_begin:
+ *
+ * Begin a quiesced section for exclusive access to the BDS, by disabling
+ * external request sources including NBD server and device model. Note that
+ * this doesn't block timers or coroutines from submitting more requests, which
+ * means block_job_pause is still necessary.
+ *
+ * This function can be recursive.
+ */
+void bdrv_drained_begin(BlockDriverState *bs);
+
+/**
+ * bdrv_drained_end:
+ *
+ * End a quiescent section started by bdrv_drained_begin().
+ */
+void bdrv_drained_end(BlockDriverState *bs);
+
 #endif
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 14ad4c3..7c58221 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -456,6 +456,8 @@ struct BlockDriverState {
     /* threshold limit for writes, in bytes. "High water mark". */
     uint64_t write_threshold_offset;
     NotifierWithReturn write_threshold_notifier;
+
+    int quiesce_counter;
 };
 
 
-- 
2.6.1

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [Qemu-devel] [PATCH v2 06/12] block: Add "drained begin/end" for transactional external snapshot
  2015-10-12 11:50 [Qemu-devel] [PATCH v2 00/12] block: bdrv_drained_begin/end for transactions on dataplane devices Fam Zheng
                   ` (4 preceding siblings ...)
  2015-10-12 11:50 ` [Qemu-devel] [PATCH v2 05/12] block: Introduce "drained begin/end" API Fam Zheng
@ 2015-10-12 11:50 ` Fam Zheng
  2015-10-12 11:50 ` [Qemu-devel] [PATCH v2 07/12] block: Add "drained begin/end" for transactional backup Fam Zheng
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 20+ messages in thread
From: Fam Zheng @ 2015-10-12 11:50 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Paolo Bonzini, qemu-block, Stefan Hajnoczi

This ensures the atomicity of the transaction by avoiding processing of
external requests such as those from ioeventfd.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 blockdev.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 32b04b4..90f1e15 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1479,6 +1479,7 @@ static void external_snapshot_prepare(BlkTransactionState *common,
     /* Acquire AioContext now so any threads operating on old_bs stop */
     state->aio_context = bdrv_get_aio_context(state->old_bs);
     aio_context_acquire(state->aio_context);
+    bdrv_drained_begin(state->old_bs);
 
     if (!bdrv_is_inserted(state->old_bs)) {
         error_setg(errp, QERR_DEVICE_HAS_NO_MEDIUM, device);
@@ -1548,8 +1549,6 @@ static void external_snapshot_commit(BlkTransactionState *common)
      * don't want to abort all of them if one of them fails the reopen */
     bdrv_reopen(state->new_bs, state->new_bs->open_flags & ~BDRV_O_RDWR,
                 NULL);
-
-    aio_context_release(state->aio_context);
 }
 
 static void external_snapshot_abort(BlkTransactionState *common)
@@ -1559,7 +1558,14 @@ static void external_snapshot_abort(BlkTransactionState *common)
     if (state->new_bs) {
         bdrv_unref(state->new_bs);
     }
+}
+
+static void external_snapshot_clean(BlkTransactionState *common)
+{
+    ExternalSnapshotState *state =
+                             DO_UPCAST(ExternalSnapshotState, common, common);
     if (state->aio_context) {
+        bdrv_drained_end(state->old_bs);
         aio_context_release(state->aio_context);
     }
 }
@@ -1724,6 +1730,7 @@ static const BdrvActionOps actions[] = {
         .prepare  = external_snapshot_prepare,
         .commit   = external_snapshot_commit,
         .abort = external_snapshot_abort,
+        .clean = external_snapshot_clean,
     },
     [TRANSACTION_ACTION_KIND_DRIVE_BACKUP] = {
         .instance_size = sizeof(DriveBackupState),
-- 
2.6.1

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [Qemu-devel] [PATCH v2 07/12] block: Add "drained begin/end" for transactional backup
  2015-10-12 11:50 [Qemu-devel] [PATCH v2 00/12] block: bdrv_drained_begin/end for transactions on dataplane devices Fam Zheng
                   ` (5 preceding siblings ...)
  2015-10-12 11:50 ` [Qemu-devel] [PATCH v2 06/12] block: Add "drained begin/end" for transactional external snapshot Fam Zheng
@ 2015-10-12 11:50 ` Fam Zheng
  2015-10-12 11:50 ` [Qemu-devel] [PATCH v2 08/12] block: Add "drained begin/end" for transactional blockdev-backup Fam Zheng
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 20+ messages in thread
From: Fam Zheng @ 2015-10-12 11:50 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Paolo Bonzini, qemu-block, Stefan Hajnoczi

This ensures the atomicity of the transaction by avoiding processing of
external requests such as those from ioeventfd.

Move the assignment to state->bs up right after bdrv_drained_begin, so
that we can use it in the clean callback. The abort callback will still
check bs->job and state->job, so it's OK.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 blockdev.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/blockdev.c b/blockdev.c
index 90f1e15..232bc21 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1599,6 +1599,8 @@ static void drive_backup_prepare(BlkTransactionState *common, Error **errp)
     /* AioContext is released in .clean() */
     state->aio_context = bdrv_get_aio_context(bs);
     aio_context_acquire(state->aio_context);
+    bdrv_drained_begin(bs);
+    state->bs = bs;
 
     qmp_drive_backup(backup->device, backup->target,
                      backup->has_format, backup->format,
@@ -1614,7 +1616,6 @@ static void drive_backup_prepare(BlkTransactionState *common, Error **errp)
         return;
     }
 
-    state->bs = bs;
     state->job = state->bs->job;
 }
 
@@ -1634,6 +1635,7 @@ static void drive_backup_clean(BlkTransactionState *common)
     DriveBackupState *state = DO_UPCAST(DriveBackupState, common, common);
 
     if (state->aio_context) {
+        bdrv_drained_end(state->bs);
         aio_context_release(state->aio_context);
     }
 }
-- 
2.6.1

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [Qemu-devel] [PATCH v2 08/12] block: Add "drained begin/end" for transactional blockdev-backup
  2015-10-12 11:50 [Qemu-devel] [PATCH v2 00/12] block: bdrv_drained_begin/end for transactions on dataplane devices Fam Zheng
                   ` (6 preceding siblings ...)
  2015-10-12 11:50 ` [Qemu-devel] [PATCH v2 07/12] block: Add "drained begin/end" for transactional backup Fam Zheng
@ 2015-10-12 11:50 ` Fam Zheng
  2015-10-12 11:50 ` [Qemu-devel] [PATCH v2 09/12] block: Add "drained begin/end" for internal snapshot Fam Zheng
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 20+ messages in thread
From: Fam Zheng @ 2015-10-12 11:50 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Paolo Bonzini, qemu-block, Stefan Hajnoczi

Similar to the previous patch, make sure that external events are not
dispatched during transaction operations.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 blockdev.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/blockdev.c b/blockdev.c
index 232bc21..015afbf 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1680,6 +1680,8 @@ static void blockdev_backup_prepare(BlkTransactionState *common, Error **errp)
         return;
     }
     aio_context_acquire(state->aio_context);
+    state->bs = bs;
+    bdrv_drained_begin(bs);
 
     qmp_blockdev_backup(backup->device, backup->target,
                         backup->sync,
@@ -1692,7 +1694,6 @@ static void blockdev_backup_prepare(BlkTransactionState *common, Error **errp)
         return;
     }
 
-    state->bs = bs;
     state->job = state->bs->job;
 }
 
@@ -1712,6 +1713,7 @@ static void blockdev_backup_clean(BlkTransactionState *common)
     BlockdevBackupState *state = DO_UPCAST(BlockdevBackupState, common, common);
 
     if (state->aio_context) {
+        bdrv_drained_end(state->bs);
         aio_context_release(state->aio_context);
     }
 }
-- 
2.6.1

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [Qemu-devel] [PATCH v2 09/12] block: Add "drained begin/end" for internal snapshot
  2015-10-12 11:50 [Qemu-devel] [PATCH v2 00/12] block: bdrv_drained_begin/end for transactions on dataplane devices Fam Zheng
                   ` (7 preceding siblings ...)
  2015-10-12 11:50 ` [Qemu-devel] [PATCH v2 08/12] block: Add "drained begin/end" for transactional blockdev-backup Fam Zheng
@ 2015-10-12 11:50 ` Fam Zheng
  2015-10-12 11:50 ` [Qemu-devel] [PATCH v2 10/12] block: Introduce BlockDriver.bdrv_drain callback Fam Zheng
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 20+ messages in thread
From: Fam Zheng @ 2015-10-12 11:50 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Paolo Bonzini, qemu-block, Stefan Hajnoczi

This ensures the atomicity of the transaction by avoiding processing of
external requests such as those from ioeventfd.

state->bs is assigned right after bdrv_drained_begin. Because it was
used as the flag for deletion or not in abort, now we need a separate
flag - InternalSnapshotState.created.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 blockdev.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 015afbf..c3da2c6 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1280,6 +1280,7 @@ typedef struct InternalSnapshotState {
     BlockDriverState *bs;
     AioContext *aio_context;
     QEMUSnapshotInfo sn;
+    bool created;
 } InternalSnapshotState;
 
 static void internal_snapshot_prepare(BlkTransactionState *common,
@@ -1318,6 +1319,8 @@ static void internal_snapshot_prepare(BlkTransactionState *common,
     /* AioContext is released in .clean() */
     state->aio_context = bdrv_get_aio_context(bs);
     aio_context_acquire(state->aio_context);
+    bdrv_drained_begin(bs);
+    state->bs = bs;
 
     if (!bdrv_is_inserted(bs)) {
         error_setg(errp, QERR_DEVICE_HAS_NO_MEDIUM, device);
@@ -1375,7 +1378,7 @@ static void internal_snapshot_prepare(BlkTransactionState *common,
     }
 
     /* 4. succeed, mark a snapshot is created */
-    state->bs = bs;
+    state->created = true;
 }
 
 static void internal_snapshot_abort(BlkTransactionState *common)
@@ -1386,7 +1389,7 @@ static void internal_snapshot_abort(BlkTransactionState *common)
     QEMUSnapshotInfo *sn = &state->sn;
     Error *local_error = NULL;
 
-    if (!bs) {
+    if (!state->created) {
         return;
     }
 
@@ -1407,6 +1410,7 @@ static void internal_snapshot_clean(BlkTransactionState *common)
                                              common, common);
 
     if (state->aio_context) {
+        bdrv_drained_end(state->bs);
         aio_context_release(state->aio_context);
     }
 }
-- 
2.6.1

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [Qemu-devel] [PATCH v2 10/12] block: Introduce BlockDriver.bdrv_drain callback
  2015-10-12 11:50 [Qemu-devel] [PATCH v2 00/12] block: bdrv_drained_begin/end for transactions on dataplane devices Fam Zheng
                   ` (8 preceding siblings ...)
  2015-10-12 11:50 ` [Qemu-devel] [PATCH v2 09/12] block: Add "drained begin/end" for internal snapshot Fam Zheng
@ 2015-10-12 11:50 ` Fam Zheng
  2015-10-12 14:17   ` Kevin Wolf
  2015-10-12 11:50 ` [Qemu-devel] [PATCH v2 11/12] qed: Implement .bdrv_drain Fam Zheng
  2015-10-12 11:50 ` [Qemu-devel] [PATCH v2 12/12] tests: Add test case for aio_disable_external Fam Zheng
  11 siblings, 1 reply; 20+ messages in thread
From: Fam Zheng @ 2015-10-12 11:50 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Paolo Bonzini, qemu-block, Stefan Hajnoczi

Drivers can have internal request sources that generate IO, like the
need_check_timer in QED. Since we want quiesced periods that contain
nested event loops in block layer, we need to have a way to disable such
event sources.

Block drivers must implement the "bdrv_drain" callback if it has any
internal sources that can generate I/O activity, like a timer or a
worker thread (even in a library) that can schedule QEMUBH in an
asynchronous callback.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 block/io.c                | 3 +++
 include/block/block_int.h | 6 ++++++
 2 files changed, 9 insertions(+)

diff --git a/block/io.c b/block/io.c
index 5c088d5..0e6d77c 100644
--- a/block/io.c
+++ b/block/io.c
@@ -247,6 +247,9 @@ void bdrv_drain(BlockDriverState *bs)
 {
     bool busy = true;
 
+    if (bs->drv && bs->drv->bdrv_drain) {
+        bs->drv->bdrv_drain(bs);
+    }
     while (busy) {
         /* Keep iterating */
          bdrv_flush_io_queue(bs);
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 7c58221..99359b2 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -288,6 +288,12 @@ struct BlockDriver {
      */
     int (*bdrv_probe_geometry)(BlockDriverState *bs, HDGeometry *geo);
 
+    /**
+     * Drain and stop any internal sources of requests in the driver, and
+     * remain so until next I/O callback (e.g. bdrv_co_writev) is called.
+     */
+    void (*bdrv_drain)(BlockDriverState *bs);
+
     QLIST_ENTRY(BlockDriver) list;
 };
 
-- 
2.6.1

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [Qemu-devel] [PATCH v2 11/12] qed: Implement .bdrv_drain
  2015-10-12 11:50 [Qemu-devel] [PATCH v2 00/12] block: bdrv_drained_begin/end for transactions on dataplane devices Fam Zheng
                   ` (9 preceding siblings ...)
  2015-10-12 11:50 ` [Qemu-devel] [PATCH v2 10/12] block: Introduce BlockDriver.bdrv_drain callback Fam Zheng
@ 2015-10-12 11:50 ` Fam Zheng
  2015-10-12 14:23   ` Kevin Wolf
  2015-10-12 11:50 ` [Qemu-devel] [PATCH v2 12/12] tests: Add test case for aio_disable_external Fam Zheng
  11 siblings, 1 reply; 20+ messages in thread
From: Fam Zheng @ 2015-10-12 11:50 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Paolo Bonzini, qemu-block, Stefan Hajnoczi

The "need_check_timer" is used to clear the "NEED_CHECK" flag in the
image header after a grace period once metadata update has finished. In
compliance to the bdrv_drain semantics we should make sure it remains
deleted once .bdrv_drain is called.

Call the qed_need_check_timer_cb manually to update the header
immediately.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 block/qed.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/block/qed.c b/block/qed.c
index a7ff1d9..23bd273 100644
--- a/block/qed.c
+++ b/block/qed.c
@@ -381,6 +381,12 @@ static void bdrv_qed_attach_aio_context(BlockDriverState *bs,
     }
 }
 
+static void bdrv_qed_drain(BlockDriverState *bs)
+{
+    qed_cancel_need_check_timer(bs->opaque);
+    qed_need_check_timer_cb(bs->opaque);
+}
+
 static int bdrv_qed_open(BlockDriverState *bs, QDict *options, int flags,
                          Error **errp)
 {
@@ -1683,6 +1689,7 @@ static BlockDriver bdrv_qed = {
     .bdrv_check               = bdrv_qed_check,
     .bdrv_detach_aio_context  = bdrv_qed_detach_aio_context,
     .bdrv_attach_aio_context  = bdrv_qed_attach_aio_context,
+    .bdrv_drain               = bdrv_qed_drain,
 };
 
 static void bdrv_qed_init(void)
-- 
2.6.1

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [Qemu-devel] [PATCH v2 12/12] tests: Add test case for aio_disable_external
  2015-10-12 11:50 [Qemu-devel] [PATCH v2 00/12] block: bdrv_drained_begin/end for transactions on dataplane devices Fam Zheng
                   ` (10 preceding siblings ...)
  2015-10-12 11:50 ` [Qemu-devel] [PATCH v2 11/12] qed: Implement .bdrv_drain Fam Zheng
@ 2015-10-12 11:50 ` Fam Zheng
  11 siblings, 0 replies; 20+ messages in thread
From: Fam Zheng @ 2015-10-12 11:50 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Paolo Bonzini, qemu-block, Stefan Hajnoczi

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 tests/test-aio.c | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/tests/test-aio.c b/tests/test-aio.c
index 03cd45d..1623803 100644
--- a/tests/test-aio.c
+++ b/tests/test-aio.c
@@ -374,6 +374,29 @@ static void test_flush_event_notifier(void)
     event_notifier_cleanup(&data.e);
 }
 
+static void test_aio_external_client(void)
+{
+    int i, j;
+
+    for (i = 1; i < 3; i++) {
+        EventNotifierTestData data = { .n = 0, .active = 10, .auto_set = true };
+        event_notifier_init(&data.e, false);
+        aio_set_event_notifier(ctx, &data.e, true, event_ready_cb);
+        event_notifier_set(&data.e);
+        for (j = 0; j < i; j++) {
+            aio_disable_external(ctx);
+        }
+        for (j = 0; j < i; j++) {
+            assert(!aio_poll(ctx, false));
+            assert(event_notifier_test_and_clear(&data.e));
+            event_notifier_set(&data.e);
+            aio_enable_external(ctx);
+        }
+        assert(aio_poll(ctx, false));
+        event_notifier_cleanup(&data.e);
+    }
+}
+
 static void test_wait_event_notifier_noflush(void)
 {
     EventNotifierTestData data = { .n = 0 };
@@ -832,6 +855,7 @@ int main(int argc, char **argv)
     g_test_add_func("/aio/event/wait",              test_wait_event_notifier);
     g_test_add_func("/aio/event/wait/no-flush-cb",  test_wait_event_notifier_noflush);
     g_test_add_func("/aio/event/flush",             test_flush_event_notifier);
+    g_test_add_func("/aio/external-client",         test_aio_external_client);
     g_test_add_func("/aio/timer/schedule",          test_timer_schedule);
 
     g_test_add_func("/aio-gsource/flush",                   test_source_flush);
-- 
2.6.1

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [PATCH v2 05/12] block: Introduce "drained begin/end" API
  2015-10-12 11:50 ` [Qemu-devel] [PATCH v2 05/12] block: Introduce "drained begin/end" API Fam Zheng
@ 2015-10-12 13:59   ` Kevin Wolf
  2015-10-12 14:27   ` Paolo Bonzini
  1 sibling, 0 replies; 20+ messages in thread
From: Kevin Wolf @ 2015-10-12 13:59 UTC (permalink / raw)
  To: Fam Zheng; +Cc: Paolo Bonzini, qemu-block, qemu-devel, Stefan Hajnoczi

Am 12.10.2015 um 13:50 hat Fam Zheng geschrieben:
> The semantics is that after bdrv_drained_begin(bs), bs will not get new external
> requests until the matching bdrv_drained_end(bs).
> 
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  block.c                   |  2 ++
>  block/io.c                | 18 ++++++++++++++++++
>  include/block/block.h     | 19 +++++++++++++++++++
>  include/block/block_int.h |  2 ++
>  4 files changed, 41 insertions(+)
> 
> diff --git a/block.c b/block.c
> index 1f90b47..9b28a07 100644
> --- a/block.c
> +++ b/block.c
> @@ -2058,6 +2058,8 @@ static void bdrv_move_feature_fields(BlockDriverState *bs_dest,
>      bs_dest->device_list = bs_src->device_list;
>      bs_dest->blk = bs_src->blk;
>  
> +    bs_dest->quiesce_counter = bs_src->quiesce_counter;
> +
>      memcpy(bs_dest->op_blockers, bs_src->op_blockers,
>             sizeof(bs_dest->op_blockers));
>  }

This feels wrong. As I understand it, bdrv_drained_begin/end works on
specific nodes and not on trees. Including the field in
bdrv_move_feature_fields() means that it moves to the top of the tree
(i.e. it stays at in the same C object, which however belongs to a
different logical node now).

What I could imagine is that you did this so you can use
bdrv_draind_end() on the same BDS as you called bdrv_drained_start() on.
However, that's not the interface of bdrv_swap(), which really means
that the BDSes are swapped. So with this hunk you just end up having a
bug that cancels out the weirdness of the bdrv_swap() interface.

If you rebase on my bdrv_swap() removal series, things become a bit more
obvious. If you don't, you should drop this hunk and change some
bdrv_drained_end() calls, e.g. in the next patch, you'd have to call
bdrv_drained_begin(state->old_bs), but bdrv_drained_end(state->new_bs).

The rest of this patch looks good.

Kevin

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [PATCH v2 10/12] block: Introduce BlockDriver.bdrv_drain callback
  2015-10-12 11:50 ` [Qemu-devel] [PATCH v2 10/12] block: Introduce BlockDriver.bdrv_drain callback Fam Zheng
@ 2015-10-12 14:17   ` Kevin Wolf
  0 siblings, 0 replies; 20+ messages in thread
From: Kevin Wolf @ 2015-10-12 14:17 UTC (permalink / raw)
  To: Fam Zheng; +Cc: Paolo Bonzini, qemu-block, qemu-devel, Stefan Hajnoczi

Am 12.10.2015 um 13:50 hat Fam Zheng geschrieben:
> Drivers can have internal request sources that generate IO, like the
> need_check_timer in QED. Since we want quiesced periods that contain
> nested event loops in block layer, we need to have a way to disable such
> event sources.
> 
> Block drivers must implement the "bdrv_drain" callback if it has any
> internal sources that can generate I/O activity, like a timer or a
> worker thread (even in a library) that can schedule QEMUBH in an
> asynchronous callback.
> 
> Signed-off-by: Fam Zheng <famz@redhat.com>

I think the right interface would be .bdrv_drain_begin/end callbacks so
that the timers or background work can be reenabled again after the
drained section.

As it happens, QED doesn't need this because you chose to complete the
outstanding work and the timer only needs to be reenabled on the next
write operation. Fine with me, we can extend the interface as soon as we
really need it.

(Though, actually, I'm not sure... I think I'll comment on the QED
patch.)

Kevin

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [PATCH v2 11/12] qed: Implement .bdrv_drain
  2015-10-12 11:50 ` [Qemu-devel] [PATCH v2 11/12] qed: Implement .bdrv_drain Fam Zheng
@ 2015-10-12 14:23   ` Kevin Wolf
  0 siblings, 0 replies; 20+ messages in thread
From: Kevin Wolf @ 2015-10-12 14:23 UTC (permalink / raw)
  To: Fam Zheng; +Cc: Paolo Bonzini, qemu-block, qemu-devel, Stefan Hajnoczi

Am 12.10.2015 um 13:50 hat Fam Zheng geschrieben:
> The "need_check_timer" is used to clear the "NEED_CHECK" flag in the
> image header after a grace period once metadata update has finished. In
> compliance to the bdrv_drain semantics we should make sure it remains
> deleted once .bdrv_drain is called.
> 
> Call the qed_need_check_timer_cb manually to update the header
> immediately.
> 
> Signed-off-by: Fam Zheng <famz@redhat.com>

What happens if a new allocating write request is issued during the
drained section and the timer gets reenabled?

Kevin

>  block/qed.c | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/block/qed.c b/block/qed.c
> index a7ff1d9..23bd273 100644
> --- a/block/qed.c
> +++ b/block/qed.c
> @@ -381,6 +381,12 @@ static void bdrv_qed_attach_aio_context(BlockDriverState *bs,
>      }
>  }
>  
> +static void bdrv_qed_drain(BlockDriverState *bs)
> +{
> +    qed_cancel_need_check_timer(bs->opaque);
> +    qed_need_check_timer_cb(bs->opaque);
> +}
> +
>  static int bdrv_qed_open(BlockDriverState *bs, QDict *options, int flags,
>                           Error **errp)
>  {
> @@ -1683,6 +1689,7 @@ static BlockDriver bdrv_qed = {
>      .bdrv_check               = bdrv_qed_check,
>      .bdrv_detach_aio_context  = bdrv_qed_detach_aio_context,
>      .bdrv_attach_aio_context  = bdrv_qed_attach_aio_context,
> +    .bdrv_drain               = bdrv_qed_drain,
>  };
>  
>  static void bdrv_qed_init(void)
> -- 
> 2.6.1
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [PATCH v2 05/12] block: Introduce "drained begin/end" API
  2015-10-12 11:50 ` [Qemu-devel] [PATCH v2 05/12] block: Introduce "drained begin/end" API Fam Zheng
  2015-10-12 13:59   ` Kevin Wolf
@ 2015-10-12 14:27   ` Paolo Bonzini
  2015-10-13  9:31     ` Kevin Wolf
  1 sibling, 1 reply; 20+ messages in thread
From: Paolo Bonzini @ 2015-10-12 14:27 UTC (permalink / raw)
  To: Fam Zheng, qemu-devel; +Cc: Kevin Wolf, Stefan Hajnoczi, qemu-block



On 12/10/2015 13:50, Fam Zheng wrote:
> +void bdrv_drained_begin(BlockDriverState *bs)
> +{
> +    if (bs->quiesce_counter++) {
> +        return;
> +    }
> +    aio_disable_external(bdrv_get_aio_context(bs));
> +    bdrv_drain(bs);
> +}

I think bdrv_drain should be called unconditionally, i.e. before the
"if".  This should also solve Kevin's doubt about new allocating write
request reenabling the timer: any write request from the drained section
happens normally, until you get a nested drain request and then the
callback completes the requests.

Paolo

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [PATCH v2 05/12] block: Introduce "drained begin/end" API
  2015-10-12 14:27   ` Paolo Bonzini
@ 2015-10-13  9:31     ` Kevin Wolf
  2015-10-13 10:39       ` Paolo Bonzini
  0 siblings, 1 reply; 20+ messages in thread
From: Kevin Wolf @ 2015-10-13  9:31 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Stefan Hajnoczi, Fam Zheng, qemu-devel, qemu-block

Am 12.10.2015 um 16:27 hat Paolo Bonzini geschrieben:
> 
> 
> On 12/10/2015 13:50, Fam Zheng wrote:
> > +void bdrv_drained_begin(BlockDriverState *bs)
> > +{
> > +    if (bs->quiesce_counter++) {
> > +        return;
> > +    }
> > +    aio_disable_external(bdrv_get_aio_context(bs));
> > +    bdrv_drain(bs);
> > +}
> 
> I think bdrv_drain should be called unconditionally, i.e. before the
> "if".  This should also solve Kevin's doubt about new allocating write
> request reenabling the timer: any write request from the drained section
> happens normally, until you get a nested drain request and then the
> callback completes the requests.

This would mean that once you've sent an I/O request inside a drain
section, you have to expect that more internal I/O might be going on
after the request has completed. If you don't want this, you have to
issue another bdrv_drain() or use a nested bdrv_drained_begin/end()
section.

Sounds reasonable enough to me, but I guess this should be explicitly
documented.

Kevin

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [PATCH v2 05/12] block: Introduce "drained begin/end" API
  2015-10-13  9:31     ` Kevin Wolf
@ 2015-10-13 10:39       ` Paolo Bonzini
  2015-10-13 11:12         ` Kevin Wolf
  0 siblings, 1 reply; 20+ messages in thread
From: Paolo Bonzini @ 2015-10-13 10:39 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, Fam Zheng, qemu-devel, Stefan Hajnoczi



On 13/10/2015 11:31, Kevin Wolf wrote:
> This would mean that once you've sent an I/O request inside a drain
> section, you have to expect that more internal I/O might be going on
> after the request has completed. If you don't want this, you have to
> issue another bdrv_drain() or use a nested bdrv_drained_begin/end()
> section.

Yes.

> Sounds reasonable enough to me, but I guess this should be explicitly
> documented.

I agree.  Perhaps bdrv_drained_begin/end() could be renamed to
bdrv_drain_and_lock() / bdrv_unlock()?

Paolo

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [PATCH v2 05/12] block: Introduce "drained begin/end" API
  2015-10-13 10:39       ` Paolo Bonzini
@ 2015-10-13 11:12         ` Kevin Wolf
  0 siblings, 0 replies; 20+ messages in thread
From: Kevin Wolf @ 2015-10-13 11:12 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-block, Fam Zheng, qemu-devel, Stefan Hajnoczi

Am 13.10.2015 um 12:39 hat Paolo Bonzini geschrieben:
> 
> 
> On 13/10/2015 11:31, Kevin Wolf wrote:
> > This would mean that once you've sent an I/O request inside a drain
> > section, you have to expect that more internal I/O might be going on
> > after the request has completed. If you don't want this, you have to
> > issue another bdrv_drain() or use a nested bdrv_drained_begin/end()
> > section.
> 
> Yes.
> 
> > Sounds reasonable enough to me, but I guess this should be explicitly
> > documented.
> 
> I agree.  Perhaps bdrv_drained_begin/end() could be renamed to
> bdrv_drain_and_lock() / bdrv_unlock()?

It's not very obvious what bdrv_unlock() refers to, so I prefer the
current naming. Just making sure that the comment for bdrv_drained_begin
explains the exact semantics should be good enough.

Kevin

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2015-10-13 11:13 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-10-12 11:50 [Qemu-devel] [PATCH v2 00/12] block: bdrv_drained_begin/end for transactions on dataplane devices Fam Zheng
2015-10-12 11:50 ` [Qemu-devel] [PATCH v2 01/12] aio: Add "is_external" flag for event handlers Fam Zheng
2015-10-12 11:50 ` [Qemu-devel] [PATCH v2 02/12] nbd: Mark fd handlers client type as "external" Fam Zheng
2015-10-12 11:50 ` [Qemu-devel] [PATCH v2 03/12] dataplane: Mark host notifiers' " Fam Zheng
2015-10-12 11:50 ` [Qemu-devel] [PATCH v2 04/12] aio: introduce aio_{disable, enable}_external Fam Zheng
2015-10-12 11:50 ` [Qemu-devel] [PATCH v2 05/12] block: Introduce "drained begin/end" API Fam Zheng
2015-10-12 13:59   ` Kevin Wolf
2015-10-12 14:27   ` Paolo Bonzini
2015-10-13  9:31     ` Kevin Wolf
2015-10-13 10:39       ` Paolo Bonzini
2015-10-13 11:12         ` Kevin Wolf
2015-10-12 11:50 ` [Qemu-devel] [PATCH v2 06/12] block: Add "drained begin/end" for transactional external snapshot Fam Zheng
2015-10-12 11:50 ` [Qemu-devel] [PATCH v2 07/12] block: Add "drained begin/end" for transactional backup Fam Zheng
2015-10-12 11:50 ` [Qemu-devel] [PATCH v2 08/12] block: Add "drained begin/end" for transactional blockdev-backup Fam Zheng
2015-10-12 11:50 ` [Qemu-devel] [PATCH v2 09/12] block: Add "drained begin/end" for internal snapshot Fam Zheng
2015-10-12 11:50 ` [Qemu-devel] [PATCH v2 10/12] block: Introduce BlockDriver.bdrv_drain callback Fam Zheng
2015-10-12 14:17   ` Kevin Wolf
2015-10-12 11:50 ` [Qemu-devel] [PATCH v2 11/12] qed: Implement .bdrv_drain Fam Zheng
2015-10-12 14:23   ` Kevin Wolf
2015-10-12 11:50 ` [Qemu-devel] [PATCH v2 12/12] tests: Add test case for aio_disable_external Fam Zheng

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.