* [Qemu-devel] [PATCH for 2.9 v3 00/10] block: Fixes regarding dataplane and management operations
@ 2017-04-10 15:05 Fam Zheng
  2017-04-10 15:05 ` [Qemu-devel] [PATCH for 2.9 v3 01/10] block: Make bdrv_parent_drained_begin/end public Fam Zheng
                   ` (12 more replies)
  0 siblings, 13 replies; 29+ messages in thread
From: Fam Zheng @ 2017-04-10 15:05 UTC (permalink / raw)
  To: qemu-devel
  Cc: Paolo Bonzini, qemu-block, Ed Swierk, Fam Zheng, Kevin Wolf,
	Max Reitz, Eric Blake, Stefan Hajnoczi

v3: Respin the unmerged changes from v2 and include one new fix:

    (Yes, it is a big series for the last -rc, and I personally prefer the v2
    approach for the 4-9 part of the problem, which is much more mechanical.)

    - 1, 2 redo previous patch 4, using bdrv_parent_drained_begin/end.
      [Kevin]
      Also fix the ordering against aio_context_release. [Stefan]
    - 3 is unchanged from patch 6 in v2.
    - 4-9 rework patch 5 following Paolo's suggestion, which allowed a
      better patch split.
    - 10 fixes a latent bug that is revealed by patch 9.

v2: - Drop patch 4 in v1. On second thought, neither it nor Kevin's
      suggestion to move the BH processing to bdrv_drain_recurse/BDRV_POLL_WHILE
      is a complete fix, so leave that for a separate patch.
    - Add rev-by to patches 1, 3, 4.
    - Split from patch 1 in v1 and add patch 2, for the new assertions. [Kevin]
    - Rewrite patch 5. Fix block job's co when a BDS is moved to a different
      aio context. [Kevin]
    - Add patch 6.

Crashes are reported on dataplane devices when doing snapshot and commit under
guest I/O.

With this series, Ed's test case '176' now passes:

    https://github.com/skyportsystems/qemu-1/commits/eswierk-iotests-2.9

Fam Zheng (10):
  block: Make bdrv_parent_drained_begin/end public
  block: Quiesce old aio context during bdrv_set_aio_context
  tests/block-job-txn: Don't start block job before adding to txn
  coroutine: Extract qemu_aio_coroutine_enter
  async: Introduce aio_co_enter and aio_co_enter_if_inactive
  block: Introduce bdrv_coroutine_enter and *_if_inactive
  blockjob: Use bdrv_coroutine_enter to start coroutine
  qemu-io-cmds: Use bdrv_coroutine_enter
  block: Use bdrv_coroutine_enter to start I/O coroutines
  block: Fix bdrv_co_flush early return

 block.c                    | 17 +++++++++++++++--
 block/block-backend.c      |  4 ++--
 block/io.c                 | 34 ++++++++++++++++++----------------
 blockjob.c                 |  4 ++--
 include/block/aio.h        | 18 ++++++++++++++++++
 include/block/block.h      | 27 +++++++++++++++++++++++++++
 include/qemu/coroutine.h   |  5 +++++
 qemu-io-cmds.c             |  2 +-
 tests/qemu-iotests/109.out | 10 +++++-----
 tests/test-blockjob-txn.c  |  6 +++++-
 util/async.c               | 14 +++++++++++++-
 util/qemu-coroutine.c      | 11 ++++++++---
 util/trace-events          |  2 +-
 13 files changed, 120 insertions(+), 34 deletions(-)

-- 
2.9.3

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [Qemu-devel] [PATCH for 2.9 v3 01/10] block: Make bdrv_parent_drained_begin/end public
  2017-04-10 15:05 [Qemu-devel] [PATCH for 2.9 v3 00/10] block: Fixes regarding dataplane and management operations Fam Zheng
@ 2017-04-10 15:05 ` Fam Zheng
  2017-04-11  9:02   ` Kevin Wolf
  2017-04-11 10:30   ` Stefan Hajnoczi
  2017-04-10 15:05 ` [Qemu-devel] [PATCH for 2.9 v3 02/10] block: Quiesce old aio context during bdrv_set_aio_context Fam Zheng
                   ` (11 subsequent siblings)
  12 siblings, 2 replies; 29+ messages in thread
From: Fam Zheng @ 2017-04-10 15:05 UTC (permalink / raw)
  To: qemu-devel
  Cc: Paolo Bonzini, qemu-block, Ed Swierk, Fam Zheng, Kevin Wolf,
	Max Reitz, Eric Blake, Stefan Hajnoczi

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 block/io.c            |  4 ++--
 include/block/block.h | 16 ++++++++++++++++
 2 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/block/io.c b/block/io.c
index 7321dda..9598646 100644
--- a/block/io.c
+++ b/block/io.c
@@ -44,7 +44,7 @@ static void coroutine_fn bdrv_co_do_rw(void *opaque);
 static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
     int64_t offset, int count, BdrvRequestFlags flags);
 
-static void bdrv_parent_drained_begin(BlockDriverState *bs)
+void bdrv_parent_drained_begin(BlockDriverState *bs)
 {
     BdrvChild *c;
 
@@ -55,7 +55,7 @@ static void bdrv_parent_drained_begin(BlockDriverState *bs)
     }
 }
 
-static void bdrv_parent_drained_end(BlockDriverState *bs)
+void bdrv_parent_drained_end(BlockDriverState *bs)
 {
     BdrvChild *c;
 
diff --git a/include/block/block.h b/include/block/block.h
index 3e09222..488a07e 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -573,6 +573,22 @@ void bdrv_io_plug(BlockDriverState *bs);
 void bdrv_io_unplug(BlockDriverState *bs);
 
 /**
+ * bdrv_parent_drained_begin:
+ *
+ * Begin a quiesced section of all users of @bs. This is part of
+ * bdrv_drained_begin.
+ */
+void bdrv_parent_drained_begin(BlockDriverState *bs);
+
+/**
+ * bdrv_parent_drained_end:
+ *
+ * End a quiesced section of all users of @bs. This is part of
+ * bdrv_drained_end.
+ */
+void bdrv_parent_drained_end(BlockDriverState *bs);
+
+/**
  * bdrv_drained_begin:
  *
  * Begin a quiesced section for exclusive access to the BDS, by disabling
-- 
2.9.3


* [Qemu-devel] [PATCH for 2.9 v3 02/10] block: Quiesce old aio context during bdrv_set_aio_context
  2017-04-10 15:05 [Qemu-devel] [PATCH for 2.9 v3 00/10] block: Fixes regarding dataplane and management operations Fam Zheng
  2017-04-10 15:05 ` [Qemu-devel] [PATCH for 2.9 v3 01/10] block: Make bdrv_parent_drained_begin/end public Fam Zheng
@ 2017-04-10 15:05 ` Fam Zheng
  2017-04-11  9:02   ` Kevin Wolf
  2017-04-11 10:30   ` Stefan Hajnoczi
  2017-04-10 15:05 ` [Qemu-devel] [PATCH for 2.9 v3 03/10] tests/block-job-txn: Don't start block job before adding to txn Fam Zheng
                   ` (10 subsequent siblings)
  12 siblings, 2 replies; 29+ messages in thread
From: Fam Zheng @ 2017-04-10 15:05 UTC (permalink / raw)
  To: qemu-devel
  Cc: Paolo Bonzini, qemu-block, Ed Swierk, Fam Zheng, Kevin Wolf,
	Max Reitz, Eric Blake, Stefan Hajnoczi

The fact that bs->aio_context is changing can confuse the dataplane
iothread, because of the now fine-grained aio context lock. The
bdrv_drain call should really be a bdrv_drained_begin/end pair, but
since bs->aio_context is changing, we can just use aio_disable_external
and bdrv_parent_drained_begin.
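
The ordering above can be sketched with a toy model in plain C (the Toy*
names are hypothetical; the real QEMU calls being modeled are noted in
the comments):

```c
#include <assert.h>

/* Toy model of the ordering in bdrv_set_aio_context: quiesce external
 * I/O and parents first, then drain, then re-attach to the new context.
 * Simplified sketch; real QEMU types and locking are omitted. */
typedef struct {
    int external_disable_cnt;   /* models aio_disable_external() nesting */
    int parent_quiesce_cnt;     /* models bdrv_parent_drained_begin() nesting */
    int in_flight;              /* pending requests on the old context */
    int ctx_id;                 /* which AioContext the BDS is bound to */
} ToyBDS;

static void toy_set_aio_context(ToyBDS *bs, int new_ctx)
{
    bs->external_disable_cnt++; /* aio_disable_external(old ctx) */
    bs->parent_quiesce_cnt++;   /* bdrv_parent_drained_begin(bs) */
    bs->in_flight = 0;          /* bdrv_drain(bs): no in-flight requests */
    bs->ctx_id = new_ctx;       /* bdrv_attach_aio_context(bs, new ctx) */
    bs->parent_quiesce_cnt--;   /* bdrv_parent_drained_end(bs) */
    bs->external_disable_cnt--; /* aio_enable_external(old ctx) */
}
```

The point of the ordering is that no new external I/O can arrive between
the quiesce and the re-attach, and both counters return to zero afterwards.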

Reported-by: Ed Swierk <eswierk@skyportsystems.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
---
 block.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/block.c b/block.c
index b8a3011..a995a8e 100644
--- a/block.c
+++ b/block.c
@@ -4396,11 +4396,12 @@ void bdrv_attach_aio_context(BlockDriverState *bs,
 
 void bdrv_set_aio_context(BlockDriverState *bs, AioContext *new_context)
 {
-    AioContext *ctx;
+    AioContext *ctx = bdrv_get_aio_context(bs);
 
+    aio_disable_external(ctx);
+    bdrv_parent_drained_begin(bs);
     bdrv_drain(bs); /* ensure there are no in-flight requests */
 
-    ctx = bdrv_get_aio_context(bs);
     while (aio_poll(ctx, false)) {
         /* wait for all bottom halves to execute */
     }
@@ -4412,6 +4413,8 @@ void bdrv_set_aio_context(BlockDriverState *bs, AioContext *new_context)
      */
     aio_context_acquire(new_context);
     bdrv_attach_aio_context(bs, new_context);
+    bdrv_parent_drained_end(bs);
+    aio_enable_external(ctx);
     aio_context_release(new_context);
 }
 
-- 
2.9.3


* [Qemu-devel] [PATCH for 2.9 v3 03/10] tests/block-job-txn: Don't start block job before adding to txn
  2017-04-10 15:05 [Qemu-devel] [PATCH for 2.9 v3 00/10] block: Fixes regarding dataplane and management operations Fam Zheng
  2017-04-10 15:05 ` [Qemu-devel] [PATCH for 2.9 v3 01/10] block: Make bdrv_parent_drained_begin/end public Fam Zheng
  2017-04-10 15:05 ` [Qemu-devel] [PATCH for 2.9 v3 02/10] block: Quiesce old aio context during bdrv_set_aio_context Fam Zheng
@ 2017-04-10 15:05 ` Fam Zheng
  2017-04-11  9:04   ` Kevin Wolf
  2017-04-10 15:05 ` [Qemu-devel] [PATCH for 2.9 v3 04/10] coroutine: Extract qemu_aio_coroutine_enter Fam Zheng
                   ` (9 subsequent siblings)
  12 siblings, 1 reply; 29+ messages in thread
From: Fam Zheng @ 2017-04-10 15:05 UTC (permalink / raw)
  To: qemu-devel
  Cc: Paolo Bonzini, qemu-block, Ed Swierk, Fam Zheng, Kevin Wolf,
	Max Reitz, Eric Blake, Stefan Hajnoczi

Previously, the job could already complete before test_block_job_start
returned; as a result, the transactional state of other jobs added to
the same txn later could not be handled correctly.

Move the block_job_start() calls to callers after
block_job_txn_add_job() calls.
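
The race can be sketched with a toy model (hypothetical names, assuming
a job that completes immediately when started, like the test job here):

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the ordering bug: a job that completes as soon as it is
 * started escapes the transaction unless it was added to the txn first.
 * Illustrative names, not the real block job API. */
typedef struct {
    bool in_txn;
    bool completed_outside_txn;
} ToyJob;

static void toy_txn_add_job(ToyJob *job)
{
    job->in_txn = true;
}

static void toy_job_start(ToyJob *job)
{
    /* The test job completes immediately on start. */
    if (!job->in_txn) {
        job->completed_outside_txn = true;
    }
}
```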

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 tests/test-blockjob-txn.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/tests/test-blockjob-txn.c b/tests/test-blockjob-txn.c
index 4ccbda1..0f80194 100644
--- a/tests/test-blockjob-txn.c
+++ b/tests/test-blockjob-txn.c
@@ -110,7 +110,6 @@ static BlockJob *test_block_job_start(unsigned int iterations,
     s->result = result;
     data->job = s;
     data->result = result;
-    block_job_start(&s->common);
     return &s->common;
 }
 
@@ -123,6 +122,7 @@ static void test_single_job(int expected)
     txn = block_job_txn_new();
     job = test_block_job_start(1, true, expected, &result);
     block_job_txn_add_job(txn, job);
+    block_job_start(job);
 
     if (expected == -ECANCELED) {
         block_job_cancel(job);
@@ -164,6 +164,8 @@ static void test_pair_jobs(int expected1, int expected2)
     block_job_txn_add_job(txn, job1);
     job2 = test_block_job_start(2, true, expected2, &result2);
     block_job_txn_add_job(txn, job2);
+    block_job_start(job1);
+    block_job_start(job2);
 
     if (expected1 == -ECANCELED) {
         block_job_cancel(job1);
@@ -223,6 +225,8 @@ static void test_pair_jobs_fail_cancel_race(void)
     block_job_txn_add_job(txn, job1);
     job2 = test_block_job_start(2, false, 0, &result2);
     block_job_txn_add_job(txn, job2);
+    block_job_start(job1);
+    block_job_start(job2);
 
     block_job_cancel(job1);
 
-- 
2.9.3


* [Qemu-devel] [PATCH for 2.9 v3 04/10] coroutine: Extract qemu_aio_coroutine_enter
  2017-04-10 15:05 [Qemu-devel] [PATCH for 2.9 v3 00/10] block: Fixes regarding dataplane and management operations Fam Zheng
                   ` (2 preceding siblings ...)
  2017-04-10 15:05 ` [Qemu-devel] [PATCH for 2.9 v3 03/10] tests/block-job-txn: Don't start block job before adding to txn Fam Zheng
@ 2017-04-10 15:05 ` Fam Zheng
  2017-04-10 15:05 ` [Qemu-devel] [PATCH for 2.9 v3 05/10] async: Introduce aio_co_enter and aio_co_enter_if_inactive Fam Zheng
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 29+ messages in thread
From: Fam Zheng @ 2017-04-10 15:05 UTC (permalink / raw)
  To: qemu-devel
  Cc: Paolo Bonzini, qemu-block, Ed Swierk, Fam Zheng, Kevin Wolf,
	Max Reitz, Eric Blake, Stefan Hajnoczi

It's a variant of qemu_coroutine_enter with an explicit AioContext
parameter.
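
The extraction pattern can be sketched in plain C (illustrative names
only; current_ctx stands in for qemu_get_current_aio_context()):

```c
#include <assert.h>

/* Sketch of the refactoring: extract a variant taking an explicit
 * context parameter, then reimplement the old entry point as a thin
 * wrapper passing the previous implicit default. */
static int current_ctx = 1;       /* stands in for the caller's context */
static int last_entered_ctx = -1;

static void toy_aio_coroutine_enter(int ctx)
{
    last_entered_ctx = ctx;       /* co->ctx = ctx, then switch to co */
}

static void toy_coroutine_enter(void)
{
    toy_aio_coroutine_enter(current_ctx);   /* old behavior preserved */
}
```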

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 include/qemu/coroutine.h |  5 +++++
 util/qemu-coroutine.c    | 11 ++++++++---
 util/trace-events        |  2 +-
 3 files changed, 14 insertions(+), 4 deletions(-)

diff --git a/include/qemu/coroutine.h b/include/qemu/coroutine.h
index e60beaf..a4509bd 100644
--- a/include/qemu/coroutine.h
+++ b/include/qemu/coroutine.h
@@ -77,6 +77,11 @@ void qemu_coroutine_enter(Coroutine *coroutine);
 void qemu_coroutine_enter_if_inactive(Coroutine *co);
 
 /**
+ * Transfer control to a coroutine and associate it with ctx
+ */
+void qemu_aio_coroutine_enter(AioContext *ctx, Coroutine *co);
+
+/**
  * Transfer control back to a coroutine's caller
  *
  * This function does not return until the coroutine is re-entered using
diff --git a/util/qemu-coroutine.c b/util/qemu-coroutine.c
index 72412e5..486af9a 100644
--- a/util/qemu-coroutine.c
+++ b/util/qemu-coroutine.c
@@ -102,12 +102,12 @@ static void coroutine_delete(Coroutine *co)
     qemu_coroutine_delete(co);
 }
 
-void qemu_coroutine_enter(Coroutine *co)
+void qemu_aio_coroutine_enter(AioContext *ctx, Coroutine *co)
 {
     Coroutine *self = qemu_coroutine_self();
     CoroutineAction ret;
 
-    trace_qemu_coroutine_enter(self, co, co->entry_arg);
+    trace_qemu_aio_coroutine_enter(ctx, self, co, co->entry_arg);
 
     if (co->caller) {
         fprintf(stderr, "Co-routine re-entered recursively\n");
@@ -115,7 +115,7 @@ void qemu_coroutine_enter(Coroutine *co)
     }
 
     co->caller = self;
-    co->ctx = qemu_get_current_aio_context();
+    co->ctx = ctx;
 
     /* Store co->ctx before anything that stores co.  Matches
      * barrier in aio_co_wake and qemu_co_mutex_wake.
@@ -139,6 +139,11 @@ void qemu_coroutine_enter(Coroutine *co)
     }
 }
 
+void qemu_coroutine_enter(Coroutine *co)
+{
+    qemu_aio_coroutine_enter(qemu_get_current_aio_context(), co);
+}
+
 void qemu_coroutine_enter_if_inactive(Coroutine *co)
 {
     if (!qemu_coroutine_entered(co)) {
diff --git a/util/trace-events b/util/trace-events
index ac27d94..b44ef4f 100644
--- a/util/trace-events
+++ b/util/trace-events
@@ -22,7 +22,7 @@ buffer_move(const char *buf, size_t len, const char *from) "%s: %zd bytes from %
 buffer_free(const char *buf, size_t len) "%s: capacity %zd"
 
 # util/qemu-coroutine.c
-qemu_coroutine_enter(void *from, void *to, void *opaque) "from %p to %p opaque %p"
+qemu_aio_coroutine_enter(void *ctx, void *from, void *to, void *opaque) "ctx %p from %p to %p opaque %p"
 qemu_coroutine_yield(void *from, void *to) "from %p to %p"
 qemu_coroutine_terminate(void *co) "self %p"
 
-- 
2.9.3


* [Qemu-devel] [PATCH for 2.9 v3 05/10] async: Introduce aio_co_enter and aio_co_enter_if_inactive
  2017-04-10 15:05 [Qemu-devel] [PATCH for 2.9 v3 00/10] block: Fixes regarding dataplane and management operations Fam Zheng
                   ` (3 preceding siblings ...)
  2017-04-10 15:05 ` [Qemu-devel] [PATCH for 2.9 v3 04/10] coroutine: Extract qemu_aio_coroutine_enter Fam Zheng
@ 2017-04-10 15:05 ` Fam Zheng
  2017-04-11  9:28   ` Kevin Wolf
  2017-04-10 15:05 ` [Qemu-devel] [PATCH for 2.9 v3 06/10] block: Introduce bdrv_coroutine_enter and *_if_inactive Fam Zheng
                   ` (7 subsequent siblings)
  12 siblings, 1 reply; 29+ messages in thread
From: Fam Zheng @ 2017-04-10 15:05 UTC (permalink / raw)
  To: qemu-devel
  Cc: Paolo Bonzini, qemu-block, Ed Swierk, Fam Zheng, Kevin Wolf,
	Max Reitz, Eric Blake, Stefan Hajnoczi

They start the coroutine on the specified context.
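
The three-way dispatch in aio_co_enter can be sketched with a toy model
(simplified: ints stand in for AioContexts, and no real coroutines or
locks are involved):

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of aio_co_enter's dispatch, mirroring the body added to
 * util/async.c in this patch. */
typedef enum {
    DISPATCH_SCHEDULED,  /* aio_co_schedule(ctx, co): wrong thread */
    DISPATCH_QUEUED,     /* queued on self->co_queue_wakeup: in a coroutine */
    DISPATCH_ENTERED     /* acquire ctx, qemu_aio_coroutine_enter, release */
} ToyDispatch;

static ToyDispatch toy_aio_co_enter(int target_ctx, int current_ctx,
                                    bool in_coroutine)
{
    if (target_ctx != current_ctx) {
        return DISPATCH_SCHEDULED;
    }
    if (in_coroutine) {
        return DISPATCH_QUEUED;
    }
    return DISPATCH_ENTERED;
}
```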

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 include/block/aio.h | 18 ++++++++++++++++++
 util/async.c        | 14 +++++++++++++-
 2 files changed, 31 insertions(+), 1 deletion(-)

diff --git a/include/block/aio.h b/include/block/aio.h
index 677b6ff..b0a6bb3 100644
--- a/include/block/aio.h
+++ b/include/block/aio.h
@@ -511,6 +511,24 @@ void aio_co_schedule(AioContext *ctx, struct Coroutine *co);
 void aio_co_wake(struct Coroutine *co);
 
 /**
+ * aio_co_enter:
+ * @ctx: the context to run the coroutine
+ * @co: the coroutine to run
+ *
+ * Enter a coroutine in the specified AioContext.
+ */
+void aio_co_enter(AioContext *ctx, struct Coroutine *co);
+
+/**
+ * aio_co_enter_if_inactive:
+ * @ctx: the context to run the coroutine
+ * @co: the coroutine to run
+ *
+ * Enter a coroutine in the specified AioContext, if it's not already entered.
+ */
+void aio_co_enter_if_inactive(AioContext *ctx, struct Coroutine *co);
+
+/**
  * Return the AioContext whose event loop runs in the current thread.
  *
  * If called from an IOThread this will be the IOThread's AioContext.  If
diff --git a/util/async.c b/util/async.c
index 663e297..507671a 100644
--- a/util/async.c
+++ b/util/async.c
@@ -453,6 +453,11 @@ void aio_co_wake(struct Coroutine *co)
     smp_read_barrier_depends();
     ctx = atomic_read(&co->ctx);
 
+    aio_co_enter(ctx, co);
+}
+
+void aio_co_enter(AioContext *ctx, struct Coroutine *co)
+{
     if (ctx != qemu_get_current_aio_context()) {
         aio_co_schedule(ctx, co);
         return;
@@ -464,11 +469,18 @@ void aio_co_wake(struct Coroutine *co)
         QSIMPLEQ_INSERT_TAIL(&self->co_queue_wakeup, co, co_queue_next);
     } else {
         aio_context_acquire(ctx);
-        qemu_coroutine_enter(co);
+        qemu_aio_coroutine_enter(ctx, co);
         aio_context_release(ctx);
     }
 }
 
+void aio_co_enter_if_inactive(AioContext *ctx, struct Coroutine *co)
+{
+    if (!qemu_coroutine_entered(co)) {
+        aio_co_enter(ctx, co);
+    }
+}
+
 void aio_context_ref(AioContext *ctx)
 {
     g_source_ref(&ctx->source);
-- 
2.9.3


* [Qemu-devel] [PATCH for 2.9 v3 06/10] block: Introduce bdrv_coroutine_enter and *_if_inactive
  2017-04-10 15:05 [Qemu-devel] [PATCH for 2.9 v3 00/10] block: Fixes regarding dataplane and management operations Fam Zheng
                   ` (4 preceding siblings ...)
  2017-04-10 15:05 ` [Qemu-devel] [PATCH for 2.9 v3 05/10] async: Introduce aio_co_enter and aio_co_enter_if_inactive Fam Zheng
@ 2017-04-10 15:05 ` Fam Zheng
  2017-04-10 15:05 ` [Qemu-devel] [PATCH for 2.9 v3 07/10] blockjob: Use bdrv_coroutine_enter to start coroutine Fam Zheng
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 29+ messages in thread
From: Fam Zheng @ 2017-04-10 15:05 UTC (permalink / raw)
  To: qemu-devel
  Cc: Paolo Bonzini, qemu-block, Ed Swierk, Fam Zheng, Kevin Wolf,
	Max Reitz, Eric Blake, Stefan Hajnoczi

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 block.c               | 10 ++++++++++
 include/block/block.h | 11 +++++++++++
 2 files changed, 21 insertions(+)

diff --git a/block.c b/block.c
index a995a8e..e65b906 100644
--- a/block.c
+++ b/block.c
@@ -4324,6 +4324,16 @@ AioContext *bdrv_get_aio_context(BlockDriverState *bs)
     return bs->aio_context;
 }
 
+void bdrv_coroutine_enter(BlockDriverState *bs, Coroutine *co)
+{
+    aio_co_enter(bdrv_get_aio_context(bs), co);
+}
+
+void bdrv_coroutine_enter_if_inactive(BlockDriverState *bs, Coroutine *co)
+{
+    aio_co_enter_if_inactive(bdrv_get_aio_context(bs), co);
+}
+
 static void bdrv_do_remove_aio_context_notifier(BdrvAioNotifier *ban)
 {
     QLIST_REMOVE(ban, list);
diff --git a/include/block/block.h b/include/block/block.h
index 488a07e..dd9416e 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -558,6 +558,17 @@ bool bdrv_debug_is_suspended(BlockDriverState *bs, const char *tag);
 AioContext *bdrv_get_aio_context(BlockDriverState *bs);
 
 /**
+ * Transfer control to @co in the aio context of @bs
+ */
+void bdrv_coroutine_enter(BlockDriverState *bs, Coroutine *co);
+
+/**
+ * Transfer control to @co in the aio context of @bs if it's not active (i.e.
+ * part of the call stack of the running coroutine). Otherwise, do nothing.
+ */
+void bdrv_coroutine_enter_if_inactive(BlockDriverState *bs, Coroutine *co);
+
+/**
  * bdrv_set_aio_context:
  *
  * Changes the #AioContext used for fd handlers, timers, and BHs by this
-- 
2.9.3


* [Qemu-devel] [PATCH for 2.9 v3 07/10] blockjob: Use bdrv_coroutine_enter to start coroutine
  2017-04-10 15:05 [Qemu-devel] [PATCH for 2.9 v3 00/10] block: Fixes regarding dataplane and management operations Fam Zheng
                   ` (5 preceding siblings ...)
  2017-04-10 15:05 ` [Qemu-devel] [PATCH for 2.9 v3 06/10] block: Introduce bdrv_coroutine_enter and *_if_inactive Fam Zheng
@ 2017-04-10 15:05 ` Fam Zheng
  2017-04-10 15:05 ` [Qemu-devel] [PATCH for 2.9 v3 08/10] qemu-io-cmds: Use bdrv_coroutine_enter Fam Zheng
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 29+ messages in thread
From: Fam Zheng @ 2017-04-10 15:05 UTC (permalink / raw)
  To: qemu-devel
  Cc: Paolo Bonzini, qemu-block, Ed Swierk, Fam Zheng, Kevin Wolf,
	Max Reitz, Eric Blake, Stefan Hajnoczi

Resuming, and especially starting, the block job coroutine can happen in
the main thread. However, the coroutine's "home" AioContext should be
the same context as job->blk's. Use bdrv_coroutine_enter to ensure that.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 blockjob.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/blockjob.c b/blockjob.c
index 9b619f385..6e48932 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -290,7 +290,7 @@ void block_job_start(BlockJob *job)
     job->pause_count--;
     job->busy = true;
     job->paused = false;
-    qemu_coroutine_enter(job->co);
+    bdrv_coroutine_enter(blk_bs(job->blk), job->co);
 }
 
 void block_job_ref(BlockJob *job)
@@ -532,7 +532,7 @@ void block_job_user_resume(BlockJob *job)
 void block_job_enter(BlockJob *job)
 {
     if (job->co && !job->busy) {
-        qemu_coroutine_enter(job->co);
+        bdrv_coroutine_enter(blk_bs(job->blk), job->co);
     }
 }
 
-- 
2.9.3


* [Qemu-devel] [PATCH for 2.9 v3 08/10] qemu-io-cmds: Use bdrv_coroutine_enter
  2017-04-10 15:05 [Qemu-devel] [PATCH for 2.9 v3 00/10] block: Fixes regarding dataplane and management operations Fam Zheng
                   ` (6 preceding siblings ...)
  2017-04-10 15:05 ` [Qemu-devel] [PATCH for 2.9 v3 07/10] blockjob: Use bdrv_coroutine_enter to start coroutine Fam Zheng
@ 2017-04-10 15:05 ` Fam Zheng
  2017-04-10 15:05 ` [Qemu-devel] [PATCH for 2.9 v3 09/10] block: Use bdrv_coroutine_enter to start I/O coroutines Fam Zheng
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 29+ messages in thread
From: Fam Zheng @ 2017-04-10 15:05 UTC (permalink / raw)
  To: qemu-devel
  Cc: Paolo Bonzini, qemu-block, Ed Swierk, Fam Zheng, Kevin Wolf,
	Max Reitz, Eric Blake, Stefan Hajnoczi

qemu_coroutine_create associates @co with qemu_aio_context, but we poll
blk's context below. If the coroutine yields, it may never get resumed
again.

Use bdrv_coroutine_enter to make sure we are starting the I/O on the
right context.
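
The hang can be sketched with a toy model (hypothetical Toy* names; real
coroutines and AioContexts are omitted):

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the qemu-io hang: the caller polls blk's context, so a
 * yielded coroutine is only resumed if its home context is the one
 * being polled. */
typedef struct {
    int home_ctx;   /* context the yielded coroutine resumes on */
    bool done;
} ToyCo;

static void toy_poll_once(int polled_ctx, ToyCo *co)
{
    if (co->home_ctx == polled_ctx) {
        co->done = true;    /* resumed, I/O completes */
    }
}

/* Returns true if the I/O completed within the iteration budget. */
static bool toy_run(ToyCo *co, int polled_ctx, int budget)
{
    for (int i = 0; i < budget && !co->done; i++) {
        toy_poll_once(polled_ctx, co);
    }
    return co->done;
}
```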

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 qemu-io-cmds.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/qemu-io-cmds.c b/qemu-io-cmds.c
index 883f53b..312fc6d 100644
--- a/qemu-io-cmds.c
+++ b/qemu-io-cmds.c
@@ -521,7 +521,7 @@ static int do_co_pwrite_zeroes(BlockBackend *blk, int64_t offset,
     }
 
     co = qemu_coroutine_create(co_pwrite_zeroes_entry, &data);
-    qemu_coroutine_enter(co);
+    bdrv_coroutine_enter(blk_bs(blk), co);
     while (!data.done) {
         aio_poll(blk_get_aio_context(blk), true);
     }
-- 
2.9.3


* [Qemu-devel] [PATCH for 2.9 v3 09/10] block: Use bdrv_coroutine_enter to start I/O coroutines
  2017-04-10 15:05 [Qemu-devel] [PATCH for 2.9 v3 00/10] block: Fixes regarding dataplane and management operations Fam Zheng
                   ` (7 preceding siblings ...)
  2017-04-10 15:05 ` [Qemu-devel] [PATCH for 2.9 v3 08/10] qemu-io-cmds: Use bdrv_coroutine_enter Fam Zheng
@ 2017-04-10 15:05 ` Fam Zheng
  2017-04-10 15:38   ` Eric Blake
  2017-04-11 10:06   ` Kevin Wolf
  2017-04-10 15:05 ` [Qemu-devel] [PATCH for 2.9 v3 10/10] block: Fix bdrv_co_flush early return Fam Zheng
                   ` (3 subsequent siblings)
  12 siblings, 2 replies; 29+ messages in thread
From: Fam Zheng @ 2017-04-10 15:05 UTC (permalink / raw)
  To: qemu-devel
  Cc: Paolo Bonzini, qemu-block, Ed Swierk, Fam Zheng, Kevin Wolf,
	Max Reitz, Eric Blake, Stefan Hajnoczi

BDRV_POLL_WHILE waits for the started I/O by releasing bs's ctx and then
polling the main context. This relies on the yielded coroutine
continuing on bs->ctx and notifying qemu_aio_context with bdrv_wakeup().
Thus, using qemu_coroutine_enter to start I/O is wrong: if the coroutine
is entered from the main loop, co->ctx will be qemu_aio_context, and as
a result of the "release, poll, acquire" loop of BDRV_POLL_WHILE, race
conditions happen when both the main thread and the iothread access the
same BDS:

  main loop                                iothread
-----------------------------------------------------------------------
  blockdev_snapshot
    aio_context_acquire(bs->ctx)
                                           virtio_scsi_data_plane_handle_cmd
    bdrv_drained_begin(bs->ctx)
    bdrv_flush(bs)
      bdrv_co_flush(bs)                      aio_context_acquire(bs->ctx).enter
        ...
        qemu_coroutine_yield(co)
      BDRV_POLL_WHILE()
        aio_context_release(bs->ctx)
                                             aio_context_acquire(bs->ctx).return
                                               ...
                                                 aio_co_wake(co)
        aio_poll(qemu_aio_context)               ...
          co_schedule_bh_cb()                    ...
            qemu_coroutine_enter(co)             ...

              /* (A) bdrv_co_flush(bs)           /* (B) I/O on bs */
                      continues... */
                                             aio_context_release(bs->ctx)
        aio_context_acquire(bs->ctx)

Note that in the above case, bdrv_drained_begin() doesn't do the
"release, poll, acquire" in BDRV_POLL_WHILE, because bs->in_flight == 0.

Fix this by using bdrv_coroutine_enter to enter the coroutine in the
right context.
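
The fix can be sketched with a toy model (illustrative names; the real
calls being modeled are noted in the comments):

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the fix: qemu_coroutine_enter records the *caller's*
 * context as co->ctx, while bdrv_coroutine_enter records the BDS's, so
 * reentry after aio_co_wake happens under bs's context. */
static int enter_from_main_loop(int caller_ctx, int bs_ctx)
{
    (void)bs_ctx;
    return caller_ctx;  /* co->ctx = qemu_get_current_aio_context() */
}

static int enter_via_bdrv(int caller_ctx, int bs_ctx)
{
    (void)caller_ctx;
    return bs_ctx;      /* co->ctx = bdrv_get_aio_context(bs) */
}

/* A race is possible when the coroutine continues outside bs's ctx. */
static bool can_race(int resume_ctx, int bs_ctx)
{
    return resume_ctx != bs_ctx;
}
```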

iotests 109 output is updated because the coroutine reentry flow during
mirror job completion is different (now through co_queue_wakeup, instead
of the unconditional qemu_coroutine_switch before), making the final job
len different.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 block/block-backend.c      |  4 ++--
 block/io.c                 | 14 +++++++-------
 tests/qemu-iotests/109.out | 10 +++++-----
 3 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/block/block-backend.c b/block/block-backend.c
index 18ece99..a8f2b34 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -1045,7 +1045,7 @@ static int blk_prw(BlockBackend *blk, int64_t offset, uint8_t *buf,
         co_entry(&rwco);
     } else {
         Coroutine *co = qemu_coroutine_create(co_entry, &rwco);
-        qemu_coroutine_enter(co);
+        bdrv_coroutine_enter(blk_bs(blk), co);
         BDRV_POLL_WHILE(blk_bs(blk), rwco.ret == NOT_DONE);
     }
 
@@ -1152,7 +1152,7 @@ static BlockAIOCB *blk_aio_prwv(BlockBackend *blk, int64_t offset, int bytes,
     acb->has_returned = false;
 
     co = qemu_coroutine_create(co_entry, acb);
-    qemu_coroutine_enter(co);
+    bdrv_coroutine_enter(blk_bs(blk), co);
 
     acb->has_returned = true;
     if (acb->rwco.ret != NOT_DONE) {
diff --git a/block/io.c b/block/io.c
index 9598646..00e45ca 100644
--- a/block/io.c
+++ b/block/io.c
@@ -616,7 +616,7 @@ static int bdrv_prwv_co(BdrvChild *child, int64_t offset,
         bdrv_rw_co_entry(&rwco);
     } else {
         co = qemu_coroutine_create(bdrv_rw_co_entry, &rwco);
-        qemu_coroutine_enter(co);
+        bdrv_coroutine_enter(child->bs, co);
         BDRV_POLL_WHILE(child->bs, rwco.ret == NOT_DONE);
     }
     return rwco.ret;
@@ -1880,7 +1880,7 @@ int64_t bdrv_get_block_status_above(BlockDriverState *bs,
     } else {
         co = qemu_coroutine_create(bdrv_get_block_status_above_co_entry,
                                    &data);
-        qemu_coroutine_enter(co);
+        bdrv_coroutine_enter(bs, co);
         BDRV_POLL_WHILE(bs, !data.done);
     }
     return data.ret;
@@ -2006,7 +2006,7 @@ bdrv_rw_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos,
         };
         Coroutine *co = qemu_coroutine_create(bdrv_co_rw_vmstate_entry, &data);
 
-        qemu_coroutine_enter(co);
+        bdrv_coroutine_enter(bs, co);
         while (data.ret == -EINPROGRESS) {
             aio_poll(bdrv_get_aio_context(bs), true);
         }
@@ -2223,7 +2223,7 @@ static BlockAIOCB *bdrv_co_aio_prw_vector(BdrvChild *child,
     acb->is_write = is_write;
 
     co = qemu_coroutine_create(bdrv_co_do_rw, acb);
-    qemu_coroutine_enter(co);
+    bdrv_coroutine_enter(child->bs, co);
 
     bdrv_co_maybe_schedule_bh(acb);
     return &acb->common;
@@ -2254,7 +2254,7 @@ BlockAIOCB *bdrv_aio_flush(BlockDriverState *bs,
     acb->req.error = -EINPROGRESS;
 
     co = qemu_coroutine_create(bdrv_aio_flush_co_entry, acb);
-    qemu_coroutine_enter(co);
+    bdrv_coroutine_enter(bs, co);
 
     bdrv_co_maybe_schedule_bh(acb);
     return &acb->common;
@@ -2387,7 +2387,7 @@ int bdrv_flush(BlockDriverState *bs)
         bdrv_flush_co_entry(&flush_co);
     } else {
         co = qemu_coroutine_create(bdrv_flush_co_entry, &flush_co);
-        qemu_coroutine_enter(co);
+        bdrv_coroutine_enter(bs, co);
         BDRV_POLL_WHILE(bs, flush_co.ret == NOT_DONE);
     }
 
@@ -2534,7 +2534,7 @@ int bdrv_pdiscard(BlockDriverState *bs, int64_t offset, int count)
         bdrv_pdiscard_co_entry(&rwco);
     } else {
         co = qemu_coroutine_create(bdrv_pdiscard_co_entry, &rwco);
-        qemu_coroutine_enter(co);
+        bdrv_coroutine_enter(bs, co);
         BDRV_POLL_WHILE(bs, rwco.ret == NOT_DONE);
     }
 
diff --git a/tests/qemu-iotests/109.out b/tests/qemu-iotests/109.out
index e5d70d7..55fe536 100644
--- a/tests/qemu-iotests/109.out
+++ b/tests/qemu-iotests/109.out
@@ -10,7 +10,7 @@ Automatically detecting the format is dangerous for raw images, write operations
 Specify the 'raw' format explicitly to remove the restrictions.
 {"return": {}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_ERROR", "data": {"device": "src", "operation": "write", "action": "report"}}
-{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "src", "len": 1024, "offset": 0, "speed": 0, "type": "mirror", "error": "Operation not permitted"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "src", "len": 65536, "offset": 0, "speed": 0, "type": "mirror", "error": "Operation not permitted"}}
 {"return": []}
 read 65536/65536 bytes at offset 0
 64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
@@ -73,7 +73,7 @@ Automatically detecting the format is dangerous for raw images, write operations
 Specify the 'raw' format explicitly to remove the restrictions.
 {"return": {}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_ERROR", "data": {"device": "src", "operation": "write", "action": "report"}}
-{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "src", "len": 1024, "offset": 0, "speed": 0, "type": "mirror", "error": "Operation not permitted"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "src", "len": 65536, "offset": 0, "speed": 0, "type": "mirror", "error": "Operation not permitted"}}
 {"return": []}
 read 65536/65536 bytes at offset 0
 64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
@@ -115,7 +115,7 @@ Automatically detecting the format is dangerous for raw images, write operations
 Specify the 'raw' format explicitly to remove the restrictions.
 {"return": {}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_ERROR", "data": {"device": "src", "operation": "write", "action": "report"}}
-{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "src", "len": 2560, "offset": 0, "speed": 0, "type": "mirror", "error": "Operation not permitted"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "src", "len": 65536, "offset": 0, "speed": 0, "type": "mirror", "error": "Operation not permitted"}}
 {"return": []}
 read 65536/65536 bytes at offset 0
 64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
@@ -135,7 +135,7 @@ Automatically detecting the format is dangerous for raw images, write operations
 Specify the 'raw' format explicitly to remove the restrictions.
 {"return": {}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_ERROR", "data": {"device": "src", "operation": "write", "action": "report"}}
-{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "src", "len": 2560, "offset": OFFSET, "speed": 0, "type": "mirror", "error": "Operation not permitted"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "src", "len": 65536, "offset": OFFSET, "speed": 0, "type": "mirror", "error": "Operation not permitted"}}
 {"return": []}
 read 65536/65536 bytes at offset 0
 64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
@@ -195,7 +195,7 @@ Automatically detecting the format is dangerous for raw images, write operations
 Specify the 'raw' format explicitly to remove the restrictions.
 {"return": {}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_ERROR", "data": {"device": "src", "operation": "write", "action": "report"}}
-{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "src", "len": 2048, "offset": OFFSET, "speed": 0, "type": "mirror", "error": "Operation not permitted"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "src", "len": 65536, "offset": OFFSET, "speed": 0, "type": "mirror", "error": "Operation not permitted"}}
 {"return": []}
 read 65536/65536 bytes at offset 0
 64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [Qemu-devel] [PATCH for 2.9 v3 10/10] block: Fix bdrv_co_flush early return
  2017-04-10 15:05 [Qemu-devel] [PATCH for 2.9 v3 00/10] block: Fixes regarding dataplane and management operations Fam Zheng
                   ` (8 preceding siblings ...)
  2017-04-10 15:05 ` [Qemu-devel] [PATCH for 2.9 v3 09/10] block: Use bdrv_coroutine_enter to start I/O coroutines Fam Zheng
@ 2017-04-10 15:05 ` Fam Zheng
  2017-04-11  9:19   ` Paolo Bonzini
  2017-04-25 15:16   ` Kevin Wolf
  2017-04-11 10:52 ` [Qemu-devel] [PATCH for 2.9 v3 00/10] block: Fixes regarding dataplane and management operations Stefan Hajnoczi
                   ` (2 subsequent siblings)
  12 siblings, 2 replies; 29+ messages in thread
From: Fam Zheng @ 2017-04-10 15:05 UTC (permalink / raw)
  To: qemu-devel
  Cc: Paolo Bonzini, qemu-block, Ed Swierk, Fam Zheng, Kevin Wolf,
	Max Reitz, Eric Blake, Stefan Hajnoczi

bdrv_inc_in_flight and bdrv_dec_in_flight are mandatory for
BDRV_POLL_WHILE to work, even for the shortcut case where flush is
unnecessary. Move the if block to after bdrv_inc_in_flight, and while
at it fix the variable declaration position.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 block/io.c | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/block/io.c b/block/io.c
index 00e45ca..bae6947 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2278,16 +2278,17 @@ static void coroutine_fn bdrv_flush_co_entry(void *opaque)
 
 int coroutine_fn bdrv_co_flush(BlockDriverState *bs)
 {
-    int ret;
-
-    if (!bs || !bdrv_is_inserted(bs) || bdrv_is_read_only(bs) ||
-        bdrv_is_sg(bs)) {
-        return 0;
-    }
+    int current_gen;
+    int ret = 0;
 
     bdrv_inc_in_flight(bs);
 
-    int current_gen = bs->write_gen;
+    if (!bs || !bdrv_is_inserted(bs) || bdrv_is_read_only(bs) ||
+        bdrv_is_sg(bs)) {
+        goto early_exit;
+    }
+
+    current_gen = bs->write_gen;
 
     /* Wait until any previous flushes are completed */
     while (bs->active_flush_req) {
@@ -2370,6 +2371,7 @@ out:
     /* Return value is ignored - it's ok if wait queue is empty */
     qemu_co_queue_next(&bs->flush_queue);
 
+early_exit:
     bdrv_dec_in_flight(bs);
     return ret;
 }
-- 
2.9.3


* Re: [Qemu-devel] [PATCH for 2.9 v3 09/10] block: Use bdrv_coroutine_enter to start I/O coroutines
  2017-04-10 15:05 ` [Qemu-devel] [PATCH for 2.9 v3 09/10] block: Use bdrv_coroutine_enter to start I/O coroutines Fam Zheng
@ 2017-04-10 15:38   ` Eric Blake
  2017-04-11 10:06   ` Kevin Wolf
  1 sibling, 0 replies; 29+ messages in thread
From: Eric Blake @ 2017-04-10 15:38 UTC (permalink / raw)
  To: Fam Zheng, qemu-devel
  Cc: Paolo Bonzini, qemu-block, Ed Swierk, Kevin Wolf, Max Reitz,
	Stefan Hajnoczi


On 04/10/2017 10:05 AM, Fam Zheng wrote:
> BDRV_POLL_WHILE waits for the started I/O by releasing bs's ctx then polling
> the main context, which relies on the yielded the coroutine would continue on
> bs->ctx and notify qemu_aio_context with bdrv_wakeup(). Thus, using

Reads awkwardly.  I'm guessing:

which relies on the yielded coroutine continuing on bs->ctx before
notifying qemu_aio_context with bdrv_wakeup().

> qemu_coroutine_enter to start I/O is wrong because if the coroutine is entered
> from main loop, co->ctx will be qemu_aio_context, as a result of the "release,
> poll, acquire" loop of BDRV_POLL_WHILE, race conditions happen when both
> main thread and the iothread access the same BDS:
> 

>         aio_context_acquire(bs->ctx)
> 
> Note that in above case, bdrv_drained_begin() doesn't do the "release,
> poll, acquire" in BDRV_POLL_WHILE, because bs->in_flight == 0.
> 
> Fix this by using bdrv_coroutine_enter and enter coroutine in the right
> context.
> 
> iotests 109 output is updated because the coroutine reenter flow during
> mirror job complete is different (now through co_queue_wakeup, instead
> of the unconditional qemu_coroutine_switch before), making the end job
> len different.
> 
> Signed-off-by: Fam Zheng <famz@redhat.com>
> 
> fixup

Is this leftovers during rebasing?

> ---
>  block/block-backend.c      |  4 ++--
>  block/io.c                 | 14 +++++++-------
>  tests/qemu-iotests/109.out | 10 +++++-----
>  3 files changed, 14 insertions(+), 14 deletions(-)
> 


-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org




* Re: [Qemu-devel] [PATCH for 2.9 v3 01/10] block: Make bdrv_parent_drained_begin/end public
  2017-04-10 15:05 ` [Qemu-devel] [PATCH for 2.9 v3 01/10] block: Make bdrv_parent_drained_begin/end public Fam Zheng
@ 2017-04-11  9:02   ` Kevin Wolf
  2017-04-11 10:30   ` Stefan Hajnoczi
  1 sibling, 0 replies; 29+ messages in thread
From: Kevin Wolf @ 2017-04-11  9:02 UTC (permalink / raw)
  To: Fam Zheng
  Cc: qemu-devel, Paolo Bonzini, qemu-block, Ed Swierk, Max Reitz,
	Eric Blake, Stefan Hajnoczi

Am 10.04.2017 um 17:05 hat Fam Zheng geschrieben:
> Signed-off-by: Fam Zheng <famz@redhat.com>

Reviewed-by: Kevin Wolf <kwolf@redhat.com>


* Re: [Qemu-devel] [PATCH for 2.9 v3 02/10] block: Quiesce old aio context during bdrv_set_aio_context
  2017-04-10 15:05 ` [Qemu-devel] [PATCH for 2.9 v3 02/10] block: Quiesce old aio context during bdrv_set_aio_context Fam Zheng
@ 2017-04-11  9:02   ` Kevin Wolf
  2017-04-11 10:30   ` Stefan Hajnoczi
  1 sibling, 0 replies; 29+ messages in thread
From: Kevin Wolf @ 2017-04-11  9:02 UTC (permalink / raw)
  To: Fam Zheng
  Cc: qemu-devel, Paolo Bonzini, qemu-block, Ed Swierk, Max Reitz,
	Eric Blake, Stefan Hajnoczi

Am 10.04.2017 um 17:05 hat Fam Zheng geschrieben:
> The fact that the bs->aio_context is changing can confuse the dataplane
> iothread, because of the now fine-grained aio context lock.
> bdrv_drain should rather be a bdrv_drained_begin/end pair, but since
> bs->aio_context is changing, we can just use aio_disable_external and
> bdrv_parent_drained_begin.
> 
> Reported-by: Ed Swierk <eswierk@skyportsystems.com>
> Signed-off-by: Fam Zheng <famz@redhat.com>

Reviewed-by: Kevin Wolf <kwolf@redhat.com>


* Re: [Qemu-devel] [PATCH for 2.9 v3 03/10] tests/block-job-txn: Don't start block job before adding to txn
  2017-04-10 15:05 ` [Qemu-devel] [PATCH for 2.9 v3 03/10] tests/block-job-txn: Don't start block job before adding to txn Fam Zheng
@ 2017-04-11  9:04   ` Kevin Wolf
  0 siblings, 0 replies; 29+ messages in thread
From: Kevin Wolf @ 2017-04-11  9:04 UTC (permalink / raw)
  To: Fam Zheng
  Cc: qemu-devel, Paolo Bonzini, qemu-block, Ed Swierk, Max Reitz,
	Eric Blake, Stefan Hajnoczi

Am 10.04.2017 um 17:05 hat Fam Zheng geschrieben:
> Previously, the job could already complete before test_block_job_start
> returned; as a result, the transactional state of other jobs added to
> the same txn later could not be handled correctly.
> 
> Move the block_job_start() calls to callers after
> block_job_txn_add_job() calls.
> 
> Signed-off-by: Fam Zheng <famz@redhat.com>

Reviewed-by: Kevin Wolf <kwolf@redhat.com>


* Re: [Qemu-devel] [PATCH for 2.9 v3 10/10] block: Fix bdrv_co_flush early return
  2017-04-10 15:05 ` [Qemu-devel] [PATCH for 2.9 v3 10/10] block: Fix bdrv_co_flush early return Fam Zheng
@ 2017-04-11  9:19   ` Paolo Bonzini
  2017-04-25 15:16   ` Kevin Wolf
  1 sibling, 0 replies; 29+ messages in thread
From: Paolo Bonzini @ 2017-04-11  9:19 UTC (permalink / raw)
  To: Fam Zheng, qemu-devel
  Cc: qemu-block, Ed Swierk, Kevin Wolf, Max Reitz, Eric Blake,
	Stefan Hajnoczi



On 10/04/2017 23:05, Fam Zheng wrote:
> bdrv_inc_in_flight and bdrv_dec_in_flight are mandatory for
> BDRV_POLL_WHILE to work, even for the shortcut case where flush is
> unnecessary. Move the if block to after bdrv_inc_in_flight, and while
> at it fix the variable declaration position.
> 
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  block/io.c | 16 +++++++++-------
>  1 file changed, 9 insertions(+), 7 deletions(-)
> 
> diff --git a/block/io.c b/block/io.c
> index 00e45ca..bae6947 100644
> --- a/block/io.c
> +++ b/block/io.c
> @@ -2278,16 +2278,17 @@ static void coroutine_fn bdrv_flush_co_entry(void *opaque)
>  
>  int coroutine_fn bdrv_co_flush(BlockDriverState *bs)
>  {
> -    int ret;
> -
> -    if (!bs || !bdrv_is_inserted(bs) || bdrv_is_read_only(bs) ||
> -        bdrv_is_sg(bs)) {
> -        return 0;
> -    }
> +    int current_gen;
> +    int ret = 0;
>  
>      bdrv_inc_in_flight(bs);
>  
> -    int current_gen = bs->write_gen;
> +    if (!bs || !bdrv_is_inserted(bs) || bdrv_is_read_only(bs) ||
> +        bdrv_is_sg(bs)) {
> +        goto early_exit;
> +    }
> +
> +    current_gen = bs->write_gen;
>  
>      /* Wait until any previous flushes are completed */
>      while (bs->active_flush_req) {
> @@ -2370,6 +2371,7 @@ out:
>      /* Return value is ignored - it's ok if wait queue is empty */
>      qemu_co_queue_next(&bs->flush_queue);
>  
> +early_exit:
>      bdrv_dec_in_flight(bs);
>      return ret;
>  }
> 

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>


* Re: [Qemu-devel] [PATCH for 2.9 v3 05/10] async: Introduce aio_co_enter and aio_co_enter_if_inactive
  2017-04-10 15:05 ` [Qemu-devel] [PATCH for 2.9 v3 05/10] async: Introduce aio_co_enter and aio_co_enter_if_inactive Fam Zheng
@ 2017-04-11  9:28   ` Kevin Wolf
  2017-04-11 11:07     ` Fam Zheng
  0 siblings, 1 reply; 29+ messages in thread
From: Kevin Wolf @ 2017-04-11  9:28 UTC (permalink / raw)
  To: Fam Zheng
  Cc: qemu-devel, Paolo Bonzini, qemu-block, Ed Swierk, Max Reitz,
	Eric Blake, Stefan Hajnoczi

Am 10.04.2017 um 17:05 hat Fam Zheng geschrieben:
> They start the coroutine on the specified context.
> 
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  include/block/aio.h | 18 ++++++++++++++++++
>  util/async.c        | 14 +++++++++++++-
>  2 files changed, 31 insertions(+), 1 deletion(-)
> 
> diff --git a/include/block/aio.h b/include/block/aio.h
> index 677b6ff..b0a6bb3 100644
> --- a/include/block/aio.h
> +++ b/include/block/aio.h
> @@ -511,6 +511,24 @@ void aio_co_schedule(AioContext *ctx, struct Coroutine *co);
>  void aio_co_wake(struct Coroutine *co);
>  
>  /**
> + * aio_co_enter:
> + * @ctx: the context to run the coroutine
> + * @co: the coroutine to run
> + *
> + * Enter a coroutine in the specified AioContext.
> + */
> +void aio_co_enter(AioContext *ctx, struct Coroutine *co);
> +
> +/**
> + * aio_co_enter_if_inactive:
> + * @ctx: the context to run the coroutine
> + * @co: the coroutine to run
> + *
> + * Enter a coroutine in the specified AioContext, if it's not already entered.
> + */
> +void aio_co_enter_if_inactive(AioContext *ctx, struct Coroutine *co);
> +
> +/**
>   * Return the AioContext whose event loop runs in the current thread.
>   *
>   * If called from an IOThread this will be the IOThread's AioContext.  If
> diff --git a/util/async.c b/util/async.c
> index 663e297..507671a 100644
> --- a/util/async.c
> +++ b/util/async.c
> @@ -453,6 +453,11 @@ void aio_co_wake(struct Coroutine *co)
>      smp_read_barrier_depends();
>      ctx = atomic_read(&co->ctx);
>  
> +    aio_co_enter(ctx, co);
> +}
> +
> +void aio_co_enter(AioContext *ctx, struct Coroutine *co)
> +{
>      if (ctx != qemu_get_current_aio_context()) {
>          aio_co_schedule(ctx, co);
>          return;
> @@ -464,11 +469,18 @@ void aio_co_wake(struct Coroutine *co)
>          QSIMPLEQ_INSERT_TAIL(&self->co_queue_wakeup, co, co_queue_next);
>      } else {
>          aio_context_acquire(ctx);
> -        qemu_coroutine_enter(co);
> +        qemu_aio_coroutine_enter(ctx, co);
>          aio_context_release(ctx);
>      }
>  }
>  
> +void aio_co_enter_if_inactive(AioContext *ctx, struct Coroutine *co)
> +{
> +    if (!qemu_coroutine_entered(co)) {
> +        aio_co_enter(ctx, co);
> +    }
> +}

Is this a useful function, though?

I think the only interesting case is the first qemu_coroutine_enter()
after a coroutine is created; here we may want it to run in a different
AioContext than the caller. However, once this has happened, it is
already running in the right AioContext and we can use the normal
functions without giving an explicit AioContext (except in cases where
we wouldn't reenter from a callback of that AioContext, but do such
cases even exist?)

So I expect that some patches down the series, we get a patch that
converts more than is actually necessary. Let's see.

Kevin


* Re: [Qemu-devel] [PATCH for 2.9 v3 09/10] block: Use bdrv_coroutine_enter to start I/O coroutines
  2017-04-10 15:05 ` [Qemu-devel] [PATCH for 2.9 v3 09/10] block: Use bdrv_coroutine_enter to start I/O coroutines Fam Zheng
  2017-04-10 15:38   ` Eric Blake
@ 2017-04-11 10:06   ` Kevin Wolf
  2017-04-11 11:37     ` Fam Zheng
  2017-04-11 11:43     ` Paolo Bonzini
  1 sibling, 2 replies; 29+ messages in thread
From: Kevin Wolf @ 2017-04-11 10:06 UTC (permalink / raw)
  To: Fam Zheng
  Cc: qemu-devel, Paolo Bonzini, qemu-block, Ed Swierk, Max Reitz,
	Eric Blake, Stefan Hajnoczi

Am 10.04.2017 um 17:05 hat Fam Zheng geschrieben:
> BDRV_POLL_WHILE waits for the started I/O by releasing bs's ctx then polling
> the main context, which relies on the yielded the coroutine would continue on
> bs->ctx and notify qemu_aio_context with bdrv_wakeup(). Thus, using
> qemu_coroutine_enter to start I/O is wrong because if the coroutine is entered
> from main loop, co->ctx will be qemu_aio_context, as a result of the "release,
> poll, acquire" loop of BDRV_POLL_WHILE, race conditions happen when both
> main thread and the iothread access the same BDS:
> 
>   main loop                                iothread
> -----------------------------------------------------------------------
>   blockdev_snapshot
>     aio_context_acquire(bs->ctx)
>                                            virtio_scsi_data_plane_handle_cmd
>     bdrv_drained_begin(bs->ctx)
>     bdrv_flush(bs)
>       bdrv_co_flush(bs)                      aio_context_acquire(bs->ctx).enter
>         ...
>         qemu_coroutine_yield(co)
>       BDRV_POLL_WHILE()
>         aio_context_release(bs->ctx)
>                                              aio_context_acquire(bs->ctx).return
>                                                ...
>                                                  aio_co_wake(co)
>         aio_poll(qemu_aio_context)               ...
>           co_schedule_bh_cb()                    ...
>             qemu_coroutine_enter(co)             ...
> 
>               /* (A) bdrv_co_flush(bs)           /* (B) I/O on bs */
>                       continues... */
>                                              aio_context_release(bs->ctx)
>         aio_context_acquire(bs->ctx)
> 
> Note that in above case, bdrv_drained_begin() doesn't do the "release,
> poll, acquire" in BDRV_POLL_WHILE, because bs->in_flight == 0.
> 
> Fix this by using bdrv_coroutine_enter and enter coroutine in the right
> context.
> 
> iotests 109 output is updated because the coroutine reenter flow during
> mirror job complete is different (now through co_queue_wakeup, instead
> of the unconditional qemu_coroutine_switch before), making the end job
> len different.
> 
> Signed-off-by: Fam Zheng <famz@redhat.com>
> 
> fixup
> ---
>  block/block-backend.c      |  4 ++--
>  block/io.c                 | 14 +++++++-------

These changes are okay.

The question is whether we need more of them. We do have a few more
callers of qemu_coroutine_create():

* blkverify, quorum: Here, we are already in a coroutine context and
  therefore the caller has made sure that we're in the right AioContext.
  The usual functions simply inherit it, which is correct.

* nbd-client: Has its own handling for getting the coroutine into the
  right AioContexts; I'll assume it's okay.

* sheepdog:

  co_read_response() calls aio_co_wake() immediately after
  qemu_coroutine_create(). Is this allowed? I don't think co->ctx is
  even initialised at this time. Is this a bug introduced in Paolo's
  9d456654? ('block: explicitly acquire aiocontext in callbacks that
  need it') Anyway, called only in AioContext callbacks, so we're fine.

  do_req() is called from the main loop context in a few instances. Not
  sure if this is a problem, but maybe using bdrv_coroutine_enter()
  there would be safer.

* 9p, migration: Outside the block layer, trivially ok

* nbd/server: nbd_co_client_start() runs in the caller's AioContext.
  Should be fine, it doesn't directly issue any requests, but just
  spawns other coroutines that are put in the right AioContext.

* qemu-img, tests: Nobody cares about AioContexts there

So I think we should have another look at Sheepdog, the rest seems to be
fine.

>  tests/qemu-iotests/109.out | 10 +++++-----

This one is curious. Why do we copy more data depending on how the job
coroutine is reentered?

Kevin


* Re: [Qemu-devel] [PATCH for 2.9 v3 01/10] block: Make bdrv_parent_drained_begin/end public
  2017-04-10 15:05 ` [Qemu-devel] [PATCH for 2.9 v3 01/10] block: Make bdrv_parent_drained_begin/end public Fam Zheng
  2017-04-11  9:02   ` Kevin Wolf
@ 2017-04-11 10:30   ` Stefan Hajnoczi
  1 sibling, 0 replies; 29+ messages in thread
From: Stefan Hajnoczi @ 2017-04-11 10:30 UTC (permalink / raw)
  To: Fam Zheng
  Cc: qemu-devel, Paolo Bonzini, qemu-block, Ed Swierk, Kevin Wolf,
	Max Reitz, Eric Blake


On Mon, Apr 10, 2017 at 11:05:33PM +0800, Fam Zheng wrote:
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  block/io.c            |  4 ++--
>  include/block/block.h | 16 ++++++++++++++++
>  2 files changed, 18 insertions(+), 2 deletions(-)

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>



* Re: [Qemu-devel] [PATCH for 2.9 v3 02/10] block: Quiesce old aio context during bdrv_set_aio_context
  2017-04-10 15:05 ` [Qemu-devel] [PATCH for 2.9 v3 02/10] block: Quiesce old aio context during bdrv_set_aio_context Fam Zheng
  2017-04-11  9:02   ` Kevin Wolf
@ 2017-04-11 10:30   ` Stefan Hajnoczi
  1 sibling, 0 replies; 29+ messages in thread
From: Stefan Hajnoczi @ 2017-04-11 10:30 UTC (permalink / raw)
  To: Fam Zheng
  Cc: qemu-devel, Paolo Bonzini, qemu-block, Ed Swierk, Kevin Wolf,
	Max Reitz, Eric Blake


On Mon, Apr 10, 2017 at 11:05:34PM +0800, Fam Zheng wrote:
> The fact that the bs->aio_context is changing can confuse the dataplane
> iothread, because of the now fine-grained aio context lock.
> bdrv_drain should rather be a bdrv_drained_begin/end pair, but since
> bs->aio_context is changing, we can just use aio_disable_external and
> bdrv_parent_drained_begin.
> 
> Reported-by: Ed Swierk <eswierk@skyportsystems.com>
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  block.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>



* Re: [Qemu-devel] [PATCH for 2.9 v3 00/10] block: Fixes regarding dataplane and management operations
  2017-04-10 15:05 [Qemu-devel] [PATCH for 2.9 v3 00/10] block: Fixes regarding dataplane and management operations Fam Zheng
                   ` (9 preceding siblings ...)
  2017-04-10 15:05 ` [Qemu-devel] [PATCH for 2.9 v3 10/10] block: Fix bdrv_co_flush early return Fam Zheng
@ 2017-04-11 10:52 ` Stefan Hajnoczi
  2017-04-11 11:05 ` Kevin Wolf
  2017-04-11 11:13 ` [Qemu-devel] [PATCH for 2.9 v3 11/10] block, async: Remove unused *_enter_if_inactive() Kevin Wolf
  12 siblings, 0 replies; 29+ messages in thread
From: Stefan Hajnoczi @ 2017-04-11 10:52 UTC (permalink / raw)
  To: Fam Zheng
  Cc: qemu-devel, Paolo Bonzini, qemu-block, Ed Swierk, Kevin Wolf,
	Max Reitz, Eric Blake


On Mon, Apr 10, 2017 at 11:05:32PM +0800, Fam Zheng wrote:
> v3: Respin the unmerged changes from v2 and include one new fix:
> 
>     (Yes, it is a big series for the last -rc, and I personally prefer the v2
>     approach for the 4-9 part of the problem, which is much more mechanical.)
> 
>     - 1, 2 are redoing previous patch 4, using bdrv_parent_drained_begin/end.
>       [Kevin]
>       Also fix the ordering against aio_context_release. [Stefan]
>     - 3 is unchanged from patch 6 in v2.
>     - 4-9 are reworking of patch 5 following Paolo's suggestion, which allowed
>       better patch split.
>     - 10 is finding of a latent bug, which is revealed by patch 9.
> 
> v2: - Drop patch 4 in v1. A second thought made me feel neither it nor Kevin's
>       suggestion to move the BH process to bdrv_drain_recurse/BDRV_POLL_WHILE
>       is a complete fix. So leave it for a separate patch.
>     - Add rev-by to patches 1, 3, 4.
>     - Split from patch 1 in v1 and add patch 2, for the new assertions. [Kevin]
>     - Rewrite patch 5. Fix block job's co when a BDS is moved to a different
>       aio context. [Kevin]
>     - Add patch 6.
> 
> Crashes are reported on dataplane devices when doing snapshot and commit under
> guest I/O.
> 
> With this series, Ed's test case '176' now passes:
> 
>     https://github.com/skyportsystems/qemu-1/commits/eswierk-iotests-2.9
> 
> Fam Zheng (10):
>   block: Make bdrv_parent_drained_begin/end public
>   block: Quiesce old aio context during bdrv_set_aio_context
>   tests/block-job-txn: Don't start block job before adding to txn
>   coroutine: Extract qemu_aio_coroutine_enter
>   async: Introduce aio_co_enter and aio_co_enter_if_inactive
>   block: Introduce bdrv_coroutine_enter and *_if_inactive
>   blockjob: Use bdrv_coroutine_enter to start coroutine
>   qemu-io-cmds: Use bdrv_coroutine_enter
>   block: Use bdrv_coroutine_enter to start I/O coroutines
>   block: Fix bdrv_co_flush early return
> 
>  block.c                    | 17 +++++++++++++++--
>  block/block-backend.c      |  4 ++--
>  block/io.c                 | 34 ++++++++++++++++++----------------
>  blockjob.c                 |  4 ++--
>  include/block/aio.h        | 18 ++++++++++++++++++
>  include/block/block.h      | 27 +++++++++++++++++++++++++++
>  include/qemu/coroutine.h   |  5 +++++
>  qemu-io-cmds.c             |  2 +-
>  tests/qemu-iotests/109.out | 10 +++++-----
>  tests/test-blockjob-txn.c  |  6 +++++-
>  util/async.c               | 14 +++++++++++++-
>  util/qemu-coroutine.c      | 11 ++++++++---
>  util/trace-events          |  2 +-
>  13 files changed, 120 insertions(+), 34 deletions(-)

We need a fix for 2.9.  Interactions between AioContexts, coroutines,
and blockjobs are complex and brittle.  That is a pre-existing problem
though and hopefully we can simplify abstractions in 2.10.

Acked-by: Stefan Hajnoczi <stefanha@redhat.com>



* Re: [Qemu-devel] [PATCH for 2.9 v3 00/10] block: Fixes regarding dataplane and management operations
  2017-04-10 15:05 [Qemu-devel] [PATCH for 2.9 v3 00/10] block: Fixes regarding dataplane and management operations Fam Zheng
                   ` (10 preceding siblings ...)
  2017-04-11 10:52 ` [Qemu-devel] [PATCH for 2.9 v3 00/10] block: Fixes regarding dataplane and management operations Stefan Hajnoczi
@ 2017-04-11 11:05 ` Kevin Wolf
  2017-04-11 11:55   ` Fam Zheng
  2017-04-11 11:13 ` [Qemu-devel] [PATCH for 2.9 v3 11/10] block, async: Remove unused *_enter_if_inactive() Kevin Wolf
  12 siblings, 1 reply; 29+ messages in thread
From: Kevin Wolf @ 2017-04-11 11:05 UTC (permalink / raw)
  To: Fam Zheng
  Cc: qemu-devel, Paolo Bonzini, qemu-block, Ed Swierk, Max Reitz,
	Eric Blake, Stefan Hajnoczi

Am 10.04.2017 um 17:05 hat Fam Zheng geschrieben:
> v3: Respin the unmerged changes from v2 and include one new fix:
> 
>     (Yes, it is a big series for the last -rc, and I personally prefer the v2
>     approach for the 4-9 part of the problem, which is much more mechanical.)
> 
>     - 1, 2 are redoing previous patch 4, using bdrv_parent_drained_begin/end.
>       [Kevin]
>       Also fix the ordering against aio_context_release. [Stefan]
>     - 3 is unchanged from patch 6 in v2.
>     - 4-9 are reworking of patch 5 following Paolo's suggestion, which allowed
>       better patch split.
>     - 10 is finding of a latent bug, which is revealed by patch 9.
> 
> v2: - Drop patch 4 in v1. A second thought made me feel neither it nor Kevin's
>       suggestion to move the BH process to bdrv_drain_recurse/BDRV_POLL_WHILE
>       is a complete fix. So leave it for a separate patch.
>     - Add rev-by to patches 1, 3, 4.
>     - Split from patch 1 in v1 and add patch 2, for the new assertions. [Kevin]
>     - Rewrite patch 5. Fix block job's co when a BDS is moved to a different
>       aio context. [Kevin]
>     - Add patch 6.
> 
> Crashes are reported on dataplane devices when doing snapshot and commit under
> guest I/O.
> 
> With this series, Ed's test case '176' now passes:
> 
>     https://github.com/skyportsystems/qemu-1/commits/eswierk-iotests-2.9

I had only two points for this series. The first is that it adds unused
functions, which doesn't hurt (but I might just send a PATCH 11/10 to
remove them again). The second one is that some sheepdog code is
suspicious, but if anything it just means that this series is incomplete,
so not a show stopper either.

Reviewed-by: Kevin Wolf <kwolf@redhat.com>


* Re: [Qemu-devel] [PATCH for 2.9 v3 05/10] async: Introduce aio_co_enter and aio_co_enter_if_inactive
  2017-04-11  9:28   ` Kevin Wolf
@ 2017-04-11 11:07     ` Fam Zheng
  0 siblings, 0 replies; 29+ messages in thread
From: Fam Zheng @ 2017-04-11 11:07 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: qemu-devel, Paolo Bonzini, qemu-block, Ed Swierk, Max Reitz,
	Eric Blake, Stefan Hajnoczi

On Tue, 04/11 11:28, Kevin Wolf wrote:
> Am 10.04.2017 um 17:05 hat Fam Zheng geschrieben:
> > They start the coroutine on the specified context.
> > 
> > Signed-off-by: Fam Zheng <famz@redhat.com>
> > ---
> >  include/block/aio.h | 18 ++++++++++++++++++
> >  util/async.c        | 14 +++++++++++++-
> >  2 files changed, 31 insertions(+), 1 deletion(-)
> > 
> > diff --git a/include/block/aio.h b/include/block/aio.h
> > index 677b6ff..b0a6bb3 100644
> > --- a/include/block/aio.h
> > +++ b/include/block/aio.h
> > @@ -511,6 +511,24 @@ void aio_co_schedule(AioContext *ctx, struct Coroutine *co);
> >  void aio_co_wake(struct Coroutine *co);
> >  
> >  /**
> > + * aio_co_enter:
> > + * @ctx: the context to run the coroutine
> > + * @co: the coroutine to run
> > + *
> > + * Enter a coroutine in the specified AioContext.
> > + */
> > +void aio_co_enter(AioContext *ctx, struct Coroutine *co);
> > +
> > +/**
> > + * aio_co_enter_if_inactive:
> > + * @ctx: the context to run the coroutine
> > + * @co: the coroutine to run
> > + *
> > + * Enter a coroutine in the specified AioContext, if it's not already entered.
> > + */
> > +void aio_co_enter_if_inactive(AioContext *ctx, struct Coroutine *co);
> > +
> > +/**
> >   * Return the AioContext whose event loop runs in the current thread.
> >   *
> >   * If called from an IOThread this will be the IOThread's AioContext.  If
> > diff --git a/util/async.c b/util/async.c
> > index 663e297..507671a 100644
> > --- a/util/async.c
> > +++ b/util/async.c
> > @@ -453,6 +453,11 @@ void aio_co_wake(struct Coroutine *co)
> >      smp_read_barrier_depends();
> >      ctx = atomic_read(&co->ctx);
> >  
> > +    aio_co_enter(ctx, co);
> > +}
> > +
> > +void aio_co_enter(AioContext *ctx, struct Coroutine *co)
> > +{
> >      if (ctx != qemu_get_current_aio_context()) {
> >          aio_co_schedule(ctx, co);
> >          return;
> > @@ -464,11 +469,18 @@ void aio_co_wake(struct Coroutine *co)
> >          QSIMPLEQ_INSERT_TAIL(&self->co_queue_wakeup, co, co_queue_next);
> >      } else {
> >          aio_context_acquire(ctx);
> > -        qemu_coroutine_enter(co);
> > +        qemu_aio_coroutine_enter(ctx, co);
> >          aio_context_release(ctx);
> >      }
> >  }
> >  
> > +void aio_co_enter_if_inactive(AioContext *ctx, struct Coroutine *co)
> > +{
> > +    if (!qemu_coroutine_entered(co)) {
> > +        aio_co_enter(ctx, co);
> > +    }
> > +}
> 
> Is this a useful function, though?
> 
> I think the only interesting case is the first qemu_coroutine_enter()
> after a coroutine is created; here we may want it to run in a different
> AioContext than the caller. However, once this has happened, it is
> already running in the right AioContext and we can use the normal
> functions without giving an explicit AioContext (except in cases where
> we wouldn't reenter from a callback of that AioContext, but do such
> cases even exist?)
> 
> So I expect that some patches down the series, we get a patch that
> converts more than is actually necessary. Let's see.
> 
> Kevin

You are right; this function is actually unused.

Fam


* [Qemu-devel] [PATCH for 2.9 v3 11/10] block, async: Remove unused *_enter_if_inactive()
  2017-04-10 15:05 [Qemu-devel] [PATCH for 2.9 v3 00/10] block: Fixes regarding dataplane and management operations Fam Zheng
                   ` (11 preceding siblings ...)
  2017-04-11 11:05 ` Kevin Wolf
@ 2017-04-11 11:13 ` Kevin Wolf
  12 siblings, 0 replies; 29+ messages in thread
From: Kevin Wolf @ 2017-04-11 11:13 UTC (permalink / raw)
  To: qemu-block
  Cc: kwolf, pbonzini, eswierk, famz, mreitz, eblake, stefanha, qemu-devel

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block.c               | 5 -----
 include/block/aio.h   | 9 ---------
 include/block/block.h | 6 ------
 util/async.c          | 7 -------
 4 files changed, 27 deletions(-)

diff --git a/block.c b/block.c
index e65b906..086a12d 100644
--- a/block.c
+++ b/block.c
@@ -4329,11 +4329,6 @@ void bdrv_coroutine_enter(BlockDriverState *bs, Coroutine *co)
     aio_co_enter(bdrv_get_aio_context(bs), co);
 }
 
-void bdrv_coroutine_enter_if_inactive(BlockDriverState *bs, Coroutine *co)
-{
-    aio_co_enter_if_inactive(bdrv_get_aio_context(bs), co);
-}
-
 static void bdrv_do_remove_aio_context_notifier(BdrvAioNotifier *ban)
 {
     QLIST_REMOVE(ban, list);
diff --git a/include/block/aio.h b/include/block/aio.h
index b0a6bb3..406e323 100644
--- a/include/block/aio.h
+++ b/include/block/aio.h
@@ -520,15 +520,6 @@ void aio_co_wake(struct Coroutine *co);
 void aio_co_enter(AioContext *ctx, struct Coroutine *co);
 
 /**
- * aio_co_enter_if_inactive:
- * @ctx: the context to run the coroutine
- * @co: the coroutine to run
- *
- * Enter a coroutine in the specified AioContext, if it's not already entered.
- */
-void aio_co_enter_if_inactive(AioContext *ctx, struct Coroutine *co);
-
-/**
  * Return the AioContext whose event loop runs in the current thread.
  *
  * If called from an IOThread this will be the IOThread's AioContext.  If
diff --git a/include/block/block.h b/include/block/block.h
index dd9416e..97d4330 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -563,12 +563,6 @@ AioContext *bdrv_get_aio_context(BlockDriverState *bs);
 void bdrv_coroutine_enter(BlockDriverState *bs, Coroutine *co);
 
 /**
- * Transfer control to @co in the aio context of @bs if it's not active (i.e.
- * part of the call stack of the running coroutine). Otherwise, do nothing.
- */
-void bdrv_coroutine_enter_if_inactive(BlockDriverState *bs, Coroutine *co);
-
-/**
  * bdrv_set_aio_context:
  *
  * Changes the #AioContext used for fd handlers, timers, and BHs by this
diff --git a/util/async.c b/util/async.c
index 507671a..355af73 100644
--- a/util/async.c
+++ b/util/async.c
@@ -474,13 +474,6 @@ void aio_co_enter(AioContext *ctx, struct Coroutine *co)
     }
 }
 
-void aio_co_enter_if_inactive(AioContext *ctx, struct Coroutine *co)
-{
-    if (!qemu_coroutine_entered(co)) {
-        aio_co_enter(ctx, co);
-    }
-}
-
 void aio_context_ref(AioContext *ctx)
 {
     g_source_ref(&ctx->source);
-- 
1.8.3.1


* Re: [Qemu-devel] [PATCH for 2.9 v3 09/10] block: Use bdrv_coroutine_enter to start I/O coroutines
  2017-04-11 10:06   ` Kevin Wolf
@ 2017-04-11 11:37     ` Fam Zheng
  2017-04-11 11:43     ` Paolo Bonzini
  1 sibling, 0 replies; 29+ messages in thread
From: Fam Zheng @ 2017-04-11 11:37 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: qemu-block, qemu-devel, Max Reitz, Ed Swierk, Stefan Hajnoczi,
	Paolo Bonzini

On Tue, 04/11 12:06, Kevin Wolf wrote:
> >  tests/qemu-iotests/109.out | 10 +++++-----
> 
> This one is curious. Why do we copy more data depending on how the job
> coroutine is reentered?

I did trace it, and I think the difference is not very important: on master, the
second iteration of the mirror_run loop gets a chance to run, and there we have:

    s->common.len = s->common.offset +
                    (cnt + s->sectors_in_flight) * BDRV_SECTOR_SIZE;

E.g. in the mirror job of the first updated output, iter #1 has cnt = 128
and s->sectors_in_flight = 0; iter #2 has cnt = 0, s->sectors_in_flight = 2.

I think the reason is simply that co_queue_wakeup in bdrv_coroutine_enter subtly
changed how the mirror coroutine and bdrv_co_pwritev are ordered.

Fam


* Re: [Qemu-devel] [PATCH for 2.9 v3 09/10] block: Use bdrv_coroutine_enter to start I/O coroutines
  2017-04-11 10:06   ` Kevin Wolf
  2017-04-11 11:37     ` Fam Zheng
@ 2017-04-11 11:43     ` Paolo Bonzini
  1 sibling, 0 replies; 29+ messages in thread
From: Paolo Bonzini @ 2017-04-11 11:43 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: Fam Zheng, qemu-devel, qemu-block, Ed Swierk, Max Reitz,
	Eric Blake, Stefan Hajnoczi


> So I think we should have another look at Sheepdog, the rest seems to be
> fine.

Sheepdog currently SEGVs and this might be the cause.  I can look at it
on Thursday.

Paolo


* Re: [Qemu-devel] [PATCH for 2.9 v3 00/10] block: Fixes regarding dataplane and management operations
  2017-04-11 11:05 ` Kevin Wolf
@ 2017-04-11 11:55   ` Fam Zheng
  0 siblings, 0 replies; 29+ messages in thread
From: Fam Zheng @ 2017-04-11 11:55 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: qemu-devel, Paolo Bonzini, qemu-block, Ed Swierk, Max Reitz,
	Eric Blake, Stefan Hajnoczi

On Tue, 04/11 13:05, Kevin Wolf wrote:
> Am 10.04.2017 um 17:05 hat Fam Zheng geschrieben:
> > v3: Respin the unmerged changes from v2 and include one new fix:
> > 
> >     (Yes, it is a big series for the last -rc, and I personally prefer the v2
> >     approach for the 4-9 part of the problem, which is much more mechanical.)
> > 
> >     - 1, 2 are redoing previous patch 4, using bdrv_parent_drained_begin/end.
> >       [Kevin]
> >       Also fix the ordering against aio_context_release. [Stefan]
> >     - 3 is unchanged from patch 6 in v2.
> >     - 4-9 are reworking of patch 5 following Paolo's suggestion, which allowed
> >       better patch split.
> >     - 10 is finding of a latent bug, which is revealed by patch 9.
> > 
> > v2: - Drop patch 4 in v1. A second thought made me feel neither it nor Kevin's
> >       suggestion to move the BH process to bdrv_drain_recurse/BDRV_POLL_WHILE
> >       is a complete fix. So leave it for a separate patch.
> >     - Add rev-by to patches 1, 3, 4.
> >     - Split from patch 1 in v1 and add patch 2, for the new assertions. [Kevin]
> >     - Rewrite patch 5. Fix block job's co when a BDS is moved to a different
> >       aio context. [Kevin]
> >     - Add patch 6.
> > 
> > Crashes are reported on dataplane devices when doing snapshot and commit under
> > guest I/O.
> > 
> > With this series, Ed's test case '176' now passes:
> > 
> >     https://github.com/skyportsystems/qemu-1/commits/eswierk-iotests-2.9
> 
> I had only two points for this series. The first is that it adds unused
> functions, which doesn't hurt (but I might just send a PATCH 11/10 to
> remove them again). The second one is that some sheepdog code is
> suspicious, but if anything it just means that this series is incomplete,
> so not a show stopper either.
> 
> Reviewed-by: Kevin Wolf <kwolf@redhat.com>

Thanks, I'll squash in the removal patch and send a pull request for 2.9.

Fam


* Re: [Qemu-devel] [PATCH for 2.9 v3 10/10] block: Fix bdrv_co_flush early return
  2017-04-10 15:05 ` [Qemu-devel] [PATCH for 2.9 v3 10/10] block: Fix bdrv_co_flush early return Fam Zheng
  2017-04-11  9:19   ` Paolo Bonzini
@ 2017-04-25 15:16   ` Kevin Wolf
  2017-04-26  0:39     ` Fam Zheng
  1 sibling, 1 reply; 29+ messages in thread
From: Kevin Wolf @ 2017-04-25 15:16 UTC (permalink / raw)
  To: Fam Zheng
  Cc: qemu-devel, Paolo Bonzini, qemu-block, Ed Swierk, Max Reitz,
	Eric Blake, Stefan Hajnoczi

Am 10.04.2017 um 17:05 hat Fam Zheng geschrieben:
> bdrv_inc_in_flight and bdrv_dec_in_flight are mandatory for
> BDRV_POLL_WHILE to work, even for the shortcut case where flush is
> unnecessary. Move the if block to below bdrv_inc_in_flight, and BTW fix
> the variable declaration position.
> 
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  block/io.c | 16 +++++++++-------
>  1 file changed, 9 insertions(+), 7 deletions(-)
> 
> diff --git a/block/io.c b/block/io.c
> index 00e45ca..bae6947 100644
> --- a/block/io.c
> +++ b/block/io.c
> @@ -2278,16 +2278,17 @@ static void coroutine_fn bdrv_flush_co_entry(void *opaque)
>  
>  int coroutine_fn bdrv_co_flush(BlockDriverState *bs)
>  {
> -    int ret;
> -
> -    if (!bs || !bdrv_is_inserted(bs) || bdrv_is_read_only(bs) ||
> -        bdrv_is_sg(bs)) {
> -        return 0;
> -    }
> +    int current_gen;
> +    int ret = 0;
>  
>      bdrv_inc_in_flight(bs);

As Coverity points out, we're now using bs...

> -    int current_gen = bs->write_gen;
> +    if (!bs || !bdrv_is_inserted(bs) || bdrv_is_read_only(bs) ||

...before doing the NULL check.

I'm not sure if we even need to have a NULL check here, but we would have
to check all callers to make sure that it's unnecessary. Before commit
29cdb251, it only checked bs->drv and I don't see how that commit
introduced a NULL caller, but maybe one was added later.

In any case, bdrv_co_flush() needs a fix, either remove the NULL check
or do it first.

> +        bdrv_is_sg(bs)) {
> +        goto early_exit;
> +    }

Kevin


* Re: [Qemu-devel] [PATCH for 2.9 v3 10/10] block: Fix bdrv_co_flush early return
  2017-04-25 15:16   ` Kevin Wolf
@ 2017-04-26  0:39     ` Fam Zheng
  0 siblings, 0 replies; 29+ messages in thread
From: Fam Zheng @ 2017-04-26  0:39 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: qemu-devel, Paolo Bonzini, qemu-block, Ed Swierk, Max Reitz,
	Eric Blake, Stefan Hajnoczi

On Tue, 04/25 17:16, Kevin Wolf wrote:
> Am 10.04.2017 um 17:05 hat Fam Zheng geschrieben:
> > bdrv_inc_in_flight and bdrv_dec_in_flight are mandatory for
> > BDRV_POLL_WHILE to work, even for the shortcut case where flush is
> > unnecessary. Move the if block to below bdrv_inc_in_flight, and BTW fix
> > the variable declaration position.
> > 
> > Signed-off-by: Fam Zheng <famz@redhat.com>
> > ---
> >  block/io.c | 16 +++++++++-------
> >  1 file changed, 9 insertions(+), 7 deletions(-)
> > 
> > diff --git a/block/io.c b/block/io.c
> > index 00e45ca..bae6947 100644
> > --- a/block/io.c
> > +++ b/block/io.c
> > @@ -2278,16 +2278,17 @@ static void coroutine_fn bdrv_flush_co_entry(void *opaque)
> >  
> >  int coroutine_fn bdrv_co_flush(BlockDriverState *bs)
> >  {
> > -    int ret;
> > -
> > -    if (!bs || !bdrv_is_inserted(bs) || bdrv_is_read_only(bs) ||
> > -        bdrv_is_sg(bs)) {
> > -        return 0;
> > -    }
> > +    int current_gen;
> > +    int ret = 0;
> >  
> >      bdrv_inc_in_flight(bs);
> 
> As Coverity points out, we're now using bs...
> 
> > -    int current_gen = bs->write_gen;
> > +    if (!bs || !bdrv_is_inserted(bs) || bdrv_is_read_only(bs) ||
> 
> ...before doing the NULL check.
> 
> I'm not sure if we even need to have a NULL check here, but we would have
> to check all callers to make sure that it's unnecessary. Before commit
> 29cdb251, it only checked bs->drv and I don't see how that commit
> introduced a NULL caller, but maybe one was added later.
> 
> In any case, bdrv_co_flush() needs a fix, either remove the NULL check
> or do it first.

After auditing the callers, and given that the bdrv_inc_in_flight call above
hasn't caused a problem in practice, I think removing the NULL check is fine.

I'll send a patch.

Thanks.

Fam

> 
> > +        bdrv_is_sg(bs)) {
> > +        goto early_exit;
> > +    }
> 
> Kevin


end of thread, other threads:[~2017-04-26  0:40 UTC | newest]

Thread overview: 29+ messages
2017-04-10 15:05 [Qemu-devel] [PATCH for 2.9 v3 00/10] block: Fixes regarding dataplane and management operations Fam Zheng
2017-04-10 15:05 ` [Qemu-devel] [PATCH for 2.9 v3 01/10] block: Make bdrv_parent_drained_begin/end public Fam Zheng
2017-04-11  9:02   ` Kevin Wolf
2017-04-11 10:30   ` Stefan Hajnoczi
2017-04-10 15:05 ` [Qemu-devel] [PATCH for 2.9 v3 02/10] block: Quiesce old aio context during bdrv_set_aio_context Fam Zheng
2017-04-11  9:02   ` Kevin Wolf
2017-04-11 10:30   ` Stefan Hajnoczi
2017-04-10 15:05 ` [Qemu-devel] [PATCH for 2.9 v3 03/10] tests/block-job-txn: Don't start block job before adding to txn Fam Zheng
2017-04-11  9:04   ` Kevin Wolf
2017-04-10 15:05 ` [Qemu-devel] [PATCH for 2.9 v3 04/10] coroutine: Extract qemu_aio_coroutine_enter Fam Zheng
2017-04-10 15:05 ` [Qemu-devel] [PATCH for 2.9 v3 05/10] async: Introduce aio_co_enter and aio_co_enter_if_inactive Fam Zheng
2017-04-11  9:28   ` Kevin Wolf
2017-04-11 11:07     ` Fam Zheng
2017-04-10 15:05 ` [Qemu-devel] [PATCH for 2.9 v3 06/10] block: Introduce bdrv_coroutine_enter and *_if_inactive Fam Zheng
2017-04-10 15:05 ` [Qemu-devel] [PATCH for 2.9 v3 07/10] blockjob: Use bdrv_coroutine_enter to start coroutine Fam Zheng
2017-04-10 15:05 ` [Qemu-devel] [PATCH for 2.9 v3 08/10] qemu-io-cmds: Use bdrv_coroutine_enter Fam Zheng
2017-04-10 15:05 ` [Qemu-devel] [PATCH for 2.9 v3 09/10] block: Use bdrv_coroutine_enter to start I/O coroutines Fam Zheng
2017-04-10 15:38   ` Eric Blake
2017-04-11 10:06   ` Kevin Wolf
2017-04-11 11:37     ` Fam Zheng
2017-04-11 11:43     ` Paolo Bonzini
2017-04-10 15:05 ` [Qemu-devel] [PATCH for 2.9 v3 10/10] block: Fix bdrv_co_flush early return Fam Zheng
2017-04-11  9:19   ` Paolo Bonzini
2017-04-25 15:16   ` Kevin Wolf
2017-04-26  0:39     ` Fam Zheng
2017-04-11 10:52 ` [Qemu-devel] [PATCH for 2.9 v3 00/10] block: Fixes regarding dataplane and management operations Stefan Hajnoczi
2017-04-11 11:05 ` Kevin Wolf
2017-04-11 11:55   ` Fam Zheng
2017-04-11 11:13 ` [Qemu-devel] [PATCH for 2.9 v3 11/10] block, async: Remove unused *_enter_if_inactive() Kevin Wolf
