* Re: [Qemu-devel] [PATCH v2 11/11] block: Block "device IO" during bdrv_drain and bdrv_drain_all
  2015-05-13 17:28 ` [Qemu-devel] [PATCH v2 11/11] block: Block "device IO" during bdrv_drain and bdrv_drain_all Fam Zheng
@ 2015-05-13 10:26   ` Paolo Bonzini
  2015-05-13 11:08     ` Fam Zheng
  0 siblings, 1 reply; 33+ messages in thread
From: Paolo Bonzini @ 2015-05-13 10:26 UTC (permalink / raw)
  To: Fam Zheng, qemu-devel
  Cc: Kevin Wolf, qemu-block, jcody, armbru, mreitz, Stefan Hajnoczi



On 13/05/2015 19:28, Fam Zheng wrote:
> We don't want new requests from guest, so block the operation around the
> nested poll.
> 
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  block/io.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
> 
> diff --git a/block/io.c b/block/io.c
> index 1ce62c4..d369de3 100644
> --- a/block/io.c
> +++ b/block/io.c
> @@ -289,9 +289,15 @@ static bool bdrv_drain_one(BlockDriverState *bs)
>   */
>  void bdrv_drain(BlockDriverState *bs)
>  {
> +    Error *blocker = NULL;
> +
> +    error_setg(&blocker, "bdrv_drain in progress");
> +    bdrv_op_block(bs, BLOCK_OP_TYPE_DEVICE_IO, blocker);
>      while (bdrv_drain_one(bs)) {
>          /* Keep iterating */
>      }
> +    bdrv_op_unblock(bs, BLOCK_OP_TYPE_DEVICE_IO, blocker);
> +    error_free(blocker);
>  }
>  
>  /*
> @@ -311,6 +317,9 @@ void bdrv_drain_all(void)
>      /* Always run first iteration so any pending completion BHs run */
>      bool busy = true;
>      BlockDriverState *bs = NULL;
> +    Error *blocker = NULL;
> +
> +    error_setg(&blocker, "bdrv_drain_all in progress");
>  
>      while ((bs = bdrv_next(bs))) {
>          AioContext *aio_context = bdrv_get_aio_context(bs);
> @@ -319,6 +328,7 @@ void bdrv_drain_all(void)
>          if (bs->job) {
>              block_job_pause(bs->job);
>          }
> +        bdrv_op_block(bs, BLOCK_OP_TYPE_DEVICE_IO, blocker);
>          aio_context_release(aio_context);
>      }
>  
> @@ -343,8 +353,10 @@ void bdrv_drain_all(void)
>          if (bs->job) {
>              block_job_resume(bs->job);
>          }
> +        bdrv_op_unblock(bs, BLOCK_OP_TYPE_DEVICE_IO, blocker);
>          aio_context_release(aio_context);
>      }
> +    error_free(blocker);
>  }
>  
>  /**
> 

I think this isn't enough.  It's the callers of bdrv_drain and
bdrv_drain_all that need to block before drain and unblock before
aio_context_release.

Paolo


* Re: [Qemu-devel] [PATCH v2 05/11] virtio-blk: Don't handle output when there is "device IO" op blocker
  2015-05-13 17:28 ` [Qemu-devel] [PATCH v2 05/11] virtio-blk: Don't handle output when there is "device IO" op blocker Fam Zheng
@ 2015-05-13 10:26   ` Paolo Bonzini
  2015-05-13 11:09     ` Fam Zheng
  0 siblings, 1 reply; 33+ messages in thread
From: Paolo Bonzini @ 2015-05-13 10:26 UTC (permalink / raw)
  To: Fam Zheng, qemu-devel
  Cc: Kevin Wolf, qemu-block, jcody, armbru, mreitz, Stefan Hajnoczi



On 13/05/2015 19:28, Fam Zheng wrote:
> +static void virtio_blk_data_plane_pause(VirtIOBlock *vblk)
> +{
> +    VirtIOBlockDataPlane *s = vblk->dataplane;
> +
> +    event_notifier_test_and_clear(&s->host_notifier);
> +    aio_set_event_notifier(s->ctx, &s->host_notifier, NULL);
> +}
> +
> +static void handle_notify(EventNotifier *e);
> +static void virtio_blk_data_plane_resume(VirtIOBlock *vblk)
> +{
> +    VirtIOBlockDataPlane *s = vblk->dataplane;
> +
> +    aio_set_event_notifier(s->ctx, &s->host_notifier, handle_notify);
> +
> +    event_notifier_set(&s->host_notifier);
> +}

Perhaps add a note that these are called under aio_context_acquire?

Paolo


* Re: [Qemu-devel] [PATCH v2 11/11] block: Block "device IO" during bdrv_drain and bdrv_drain_all
  2015-05-13 10:26   ` Paolo Bonzini
@ 2015-05-13 11:08     ` Fam Zheng
  2015-05-13 11:33       ` Paolo Bonzini
  0 siblings, 1 reply; 33+ messages in thread
From: Fam Zheng @ 2015-05-13 11:08 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Kevin Wolf, qemu-block, armbru, jcody, qemu-devel, mreitz,
	Stefan Hajnoczi

On Wed, 05/13 12:26, Paolo Bonzini wrote:
> 
> 
> On 13/05/2015 19:28, Fam Zheng wrote:
> > We don't want new requests from guest, so block the operation around the
> > nested poll.
> > 
> > Signed-off-by: Fam Zheng <famz@redhat.com>
> > ---
> >  block/io.c | 12 ++++++++++++
> >  1 file changed, 12 insertions(+)
> > 
> > diff --git a/block/io.c b/block/io.c
> > index 1ce62c4..d369de3 100644
> > --- a/block/io.c
> > +++ b/block/io.c
> > @@ -289,9 +289,15 @@ static bool bdrv_drain_one(BlockDriverState *bs)
> >   */
> >  void bdrv_drain(BlockDriverState *bs)
> >  {
> > +    Error *blocker = NULL;
> > +
> > +    error_setg(&blocker, "bdrv_drain in progress");
> > +    bdrv_op_block(bs, BLOCK_OP_TYPE_DEVICE_IO, blocker);
> >      while (bdrv_drain_one(bs)) {
> >          /* Keep iterating */
> >      }
> > +    bdrv_op_unblock(bs, BLOCK_OP_TYPE_DEVICE_IO, blocker);
> > +    error_free(blocker);
> >  }
> >  
> >  /*
> > @@ -311,6 +317,9 @@ void bdrv_drain_all(void)
> >      /* Always run first iteration so any pending completion BHs run */
> >      bool busy = true;
> >      BlockDriverState *bs = NULL;
> > +    Error *blocker = NULL;
> > +
> > +    error_setg(&blocker, "bdrv_drain_all in progress");
> >  
> >      while ((bs = bdrv_next(bs))) {
> >          AioContext *aio_context = bdrv_get_aio_context(bs);
> > @@ -319,6 +328,7 @@ void bdrv_drain_all(void)
> >          if (bs->job) {
> >              block_job_pause(bs->job);
> >          }
> > +        bdrv_op_block(bs, BLOCK_OP_TYPE_DEVICE_IO, blocker);
> >          aio_context_release(aio_context);
> >      }
> >  
> > @@ -343,8 +353,10 @@ void bdrv_drain_all(void)
> >          if (bs->job) {
> >              block_job_resume(bs->job);
> >          }
> > +        bdrv_op_unblock(bs, BLOCK_OP_TYPE_DEVICE_IO, blocker);
> >          aio_context_release(aio_context);
> >      }
> > +    error_free(blocker);
> >  }
> >  
> >  /**
> > 
> 
> I think this isn't enough.  It's the callers of bdrv_drain and
> bdrv_drain_all that need to block before drain and unblock before
> aio_context_release.

Which callers do you mean? qmp_transaction is covered in this series.

Fam


* Re: [Qemu-devel] [PATCH v2 05/11] virtio-blk: Don't handle output when there is "device IO" op blocker
  2015-05-13 10:26   ` Paolo Bonzini
@ 2015-05-13 11:09     ` Fam Zheng
  0 siblings, 0 replies; 33+ messages in thread
From: Fam Zheng @ 2015-05-13 11:09 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Kevin Wolf, qemu-block, armbru, jcody, qemu-devel, mreitz,
	Stefan Hajnoczi

On Wed, 05/13 12:26, Paolo Bonzini wrote:
> 
> 
> On 13/05/2015 19:28, Fam Zheng wrote:
> > +static void virtio_blk_data_plane_pause(VirtIOBlock *vblk)
> > +{
> > +    VirtIOBlockDataPlane *s = vblk->dataplane;
> > +
> > +    event_notifier_test_and_clear(&s->host_notifier);
> > +    aio_set_event_notifier(s->ctx, &s->host_notifier, NULL);
> > +}
> > +
> > +static void handle_notify(EventNotifier *e);
> > +static void virtio_blk_data_plane_resume(VirtIOBlock *vblk)
> > +{
> > +    VirtIOBlockDataPlane *s = vblk->dataplane;
> > +
> > +    aio_set_event_notifier(s->ctx, &s->host_notifier, handle_notify);
> > +
> > +    event_notifier_set(&s->host_notifier);
> > +}
> 
> Perhaps add a note that these are called under aio_context_acquire?
> 

OK, good idea.

Fam


* Re: [Qemu-devel] [PATCH v2 10/11] blockdev: Block device IO during blockdev-backup transaction
  2015-05-13 17:28 ` [Qemu-devel] [PATCH v2 10/11] blockdev: Block device IO during blockdev-backup transaction Fam Zheng
@ 2015-05-13 11:22   ` Wen Congyang
  2015-05-13 12:55     ` Fam Zheng
  2015-05-13 12:21   ` Paolo Bonzini
  1 sibling, 1 reply; 33+ messages in thread
From: Wen Congyang @ 2015-05-13 11:22 UTC (permalink / raw)
  To: Fam Zheng, qemu-devel
  Cc: Kevin Wolf, qemu-block, jcody, armbru, mreitz, Stefan Hajnoczi, pbonzini

On 05/14/2015 01:28 AM, Fam Zheng wrote:
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  blockdev.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/blockdev.c b/blockdev.c
> index ae52d27..bd28183 100644
> --- a/blockdev.c
> +++ b/blockdev.c
> @@ -1649,6 +1649,7 @@ typedef struct BlockdevBackupState {
>      BlockDriverState *bs;
>      BlockJob *job;
>      AioContext *aio_context;
> +    Error *blocker;
>  } BlockdevBackupState;
>  
>  static void blockdev_backup_prepare(BlkTransactionState *common, Error **errp)
> @@ -1685,6 +1686,10 @@ static void blockdev_backup_prepare(BlkTransactionState *common, Error **errp)
>      }
>      aio_context_acquire(state->aio_context);
>  
> +    state->bs = bs;
> +    error_setg(&state->blocker, "blockdev-backup in progress");
> +    bdrv_op_block(bs, BLOCK_OP_TYPE_DEVICE_IO, state->blocker);
> +

Did you test this patch? You need to read from bs to do the backup!!
If the mode is none, you also need to write to bs!!

Thanks
Wen Congyang

>      qmp_blockdev_backup(backup->device, backup->target,
>                          backup->sync,
>                          backup->has_speed, backup->speed,
> @@ -1696,7 +1701,6 @@ static void blockdev_backup_prepare(BlkTransactionState *common, Error **errp)
>          return;
>      }
>  
> -    state->bs = bs;
>      state->job = state->bs->job;
>  }
>  
> @@ -1715,6 +1719,10 @@ static void blockdev_backup_clean(BlkTransactionState *common)
>  {
>      BlockdevBackupState *state = DO_UPCAST(BlockdevBackupState, common, common);
>  
> +    if (state->bs) {
> +        bdrv_op_unblock(state->bs, BLOCK_OP_TYPE_DEVICE_IO, state->blocker);
> +        error_free(state->blocker);
> +    }
>      if (state->aio_context) {
>          aio_context_release(state->aio_context);
>      }
> 


* Re: [Qemu-devel] [PATCH v2 01/11] block: Add op blocker type "device IO"
  2015-05-13 17:28 ` [Qemu-devel] [PATCH v2 01/11] block: Add op blocker type "device IO" Fam Zheng
@ 2015-05-13 11:32   ` Wen Congyang
  2015-05-13 12:04   ` Paolo Bonzini
  1 sibling, 0 replies; 33+ messages in thread
From: Wen Congyang @ 2015-05-13 11:32 UTC (permalink / raw)
  To: Fam Zheng, qemu-devel
  Cc: Kevin Wolf, qemu-block, jcody, armbru, mreitz, Stefan Hajnoczi, pbonzini

On 05/14/2015 01:28 AM, Fam Zheng wrote:
> Preventing device from submitting IO is useful around various nested
> aio_poll. Op blocker is a good place to put this flag.
> 
> Devices would submit IO requests through blk_* block backend interface,
> which calls blk_check_request to check the validity. Return -EBUSY if
> the operation is blocked, which means device IO is temporarily disabled
> by another BDS user.
> 
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  block/block-backend.c | 4 ++++
>  include/block/block.h | 1 +
>  2 files changed, 5 insertions(+)
> 
> diff --git a/block/block-backend.c b/block/block-backend.c
> index 93e46f3..71fc695 100644
> --- a/block/block-backend.c
> +++ b/block/block-backend.c
> @@ -478,6 +478,10 @@ static int blk_check_request(BlockBackend *blk, int64_t sector_num,
>          return -EIO;
>      }
>  
> +    if (bdrv_op_is_blocked(blk->bs, BLOCK_OP_TYPE_DEVICE_IO, NULL)) {
> +        return -EBUSY;
> +    }
> +

The guest doesn't know about this status, so how do we prevent it? If
it is a virtio disk, you can make it wait, but for other disks, what
is the behavior?

Thanks
Wen Congyang

>      return blk_check_byte_request(blk, sector_num * BDRV_SECTOR_SIZE,
>                                    nb_sectors * BDRV_SECTOR_SIZE);
>  }
> diff --git a/include/block/block.h b/include/block/block.h
> index 7d1a717..906fb31 100644
> --- a/include/block/block.h
> +++ b/include/block/block.h
> @@ -159,6 +159,7 @@ typedef enum BlockOpType {
>      BLOCK_OP_TYPE_RESIZE,
>      BLOCK_OP_TYPE_STREAM,
>      BLOCK_OP_TYPE_REPLACE,
> +    BLOCK_OP_TYPE_DEVICE_IO,
>      BLOCK_OP_TYPE_MAX,
>  } BlockOpType;
>  
> 


* Re: [Qemu-devel] [PATCH v2 11/11] block: Block "device IO" during bdrv_drain and bdrv_drain_all
  2015-05-13 11:08     ` Fam Zheng
@ 2015-05-13 11:33       ` Paolo Bonzini
  2015-05-13 15:17         ` Fam Zheng
  0 siblings, 1 reply; 33+ messages in thread
From: Paolo Bonzini @ 2015-05-13 11:33 UTC (permalink / raw)
  To: Fam Zheng
  Cc: Kevin Wolf, qemu-block, armbru, jcody, qemu-devel, mreitz,
	Stefan Hajnoczi



On 13/05/2015 13:08, Fam Zheng wrote:
> > I think this isn't enough.  It's the callers of bdrv_drain and
> > bdrv_drain_all that need to block before drain and unblock before
> > aio_context_release.
> 
> Which callers do you mean? qmp_transaction is covered in this series.

All of them. :(

In some cases it may be unnecessary.  I'm thinking of bdrv_set_aio_context,
and mig_save_device_dirty, but that's probably the exception rather than
the rule.

In some cases, like bdrv_snapshot_delete, the change is trivial.

The only somewhat more complex case is probably block/mirror.c.  There,
the aio_context_release happens through block_job_sleep_ns or
qemu_coroutine_yield.  I guess the change would look something like
this sketch:

diff --git a/block/mirror.c b/block/mirror.c
index 58f391a..dcfede0 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -513,16 +513,29 @@ static void coroutine_fn mirror_run(void *opaque)
 
         if (cnt == 0 && should_complete) {
             /* The dirty bitmap is not updated while operations are pending.
-             * If we're about to exit, wait for pending operations before
-             * calling bdrv_get_dirty_count(bs), or we may exit while the
+             * If we're about to exit, wait for pending operations and
+             * recheck bdrv_get_dirty_count(bs), or we may exit while the
              * source has dirty data to copy!
              *
              * Note that I/O can be submitted by the guest while
-             * mirror_populate runs.
+             * the rest of mirror_populate runs, but we lock it out here.
              */
             trace_mirror_before_drain(s, cnt);
+
+            // ... block ...
             bdrv_drain(bs);
             cnt = bdrv_get_dirty_count(s->dirty_bitmap);
+
+            if (cnt == 0) {
+                /* The two disks are in sync.  Exit and report successful
+                 * completion.
+                 */
+                assert(s->synced && QLIST_EMPTY(&bs->tracked_requests));
+                s->common.cancelled = false;
+                break; // ... and unblock somewhere else...
+            }
+
+            // ... unblock ...
         }
 
         ret = 0;
@@ -535,13 +549,6 @@ static void coroutine_fn mirror_run(void *opaque)
         } else if (!should_complete) {
             delay_ns = (s->in_flight == 0 && cnt == 0 ? SLICE_TIME : 0);
             block_job_sleep_ns(&s->common, QEMU_CLOCK_REALTIME, delay_ns);
-        } else if (cnt == 0) {
-            /* The two disks are in sync.  Exit and report successful
-             * completion.
-             */
-            assert(QLIST_EMPTY(&bs->tracked_requests));
-            s->common.cancelled = false;
-            break;
         }
         last_pause_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
     }


It can be the topic of a separate series.  But this patch brings a
false sense of security (either the blocker is unnecessary, or it
needs to last after bdrv_drain returns), so I think it should be
dropped.

Paolo


* Re: [Qemu-devel] [PATCH v2 01/11] block: Add op blocker type "device IO"
  2015-05-13 17:28 ` [Qemu-devel] [PATCH v2 01/11] block: Add op blocker type "device IO" Fam Zheng
  2015-05-13 11:32   ` Wen Congyang
@ 2015-05-13 12:04   ` Paolo Bonzini
  2015-05-13 15:02     ` Fam Zheng
  1 sibling, 1 reply; 33+ messages in thread
From: Paolo Bonzini @ 2015-05-13 12:04 UTC (permalink / raw)
  To: Fam Zheng, qemu-devel
  Cc: Kevin Wolf, qemu-block, jcody, armbru, mreitz, Stefan Hajnoczi



On 13/05/2015 19:28, Fam Zheng wrote:
> @@ -478,6 +478,10 @@ static int blk_check_request(BlockBackend *blk, int64_t sector_num,
>          return -EIO;
>      }
>  
> +    if (bdrv_op_is_blocked(blk->bs, BLOCK_OP_TYPE_DEVICE_IO, NULL)) {
> +        return -EBUSY;
> +    }

I think this is incorrect.  It's fine for backends to generate more I/O
after a blocker is submitted, as long as it's bounded.

For example, SCSI requests can result in many consecutive I/Os:

(1) FUA requests are split in write+flush

(2) adapters that do not use QEMUSGList-based I/O only read 128K at a time

(3) WRITE SAME operations are also split in chunks

(4) UNMAP operations process one descriptor at a time

Paolo

>      return blk_check_byte_request(blk, sector_num * BDRV_SECTOR_SIZE,
>                                    nb_sectors * BDRV_SECTOR_SIZE);
>  }


* Re: [Qemu-devel] [PATCH v2 10/11] blockdev: Block device IO during blockdev-backup transaction
  2015-05-13 17:28 ` [Qemu-devel] [PATCH v2 10/11] blockdev: Block device IO during blockdev-backup transaction Fam Zheng
  2015-05-13 11:22   ` Wen Congyang
@ 2015-05-13 12:21   ` Paolo Bonzini
  2015-05-13 15:08     ` Paolo Bonzini
  1 sibling, 1 reply; 33+ messages in thread
From: Paolo Bonzini @ 2015-05-13 12:21 UTC (permalink / raw)
  To: Fam Zheng, qemu-devel; +Cc: Stefan Hajnoczi, armbru, qemu-block, mreitz



On 13/05/2015 19:28, Fam Zheng wrote:
> +    state->bs = bs;
> +    error_setg(&state->blocker, "blockdev-backup in progress");
> +    bdrv_op_block(bs, BLOCK_OP_TYPE_DEVICE_IO, state->blocker);
> +
>      qmp_blockdev_backup(backup->device, backup->target,
>                          backup->sync,
>                          backup->has_speed, backup->speed,
> @@ -1696,7 +1701,6 @@ static void blockdev_backup_prepare(BlkTransactionState *common, Error **errp)
>          return;
>      }
>  
> -    state->bs = bs;

I don't understand this.  Jobs could pause/resume themselves by adding a
DEVICE_IO notifier on the targets.

However, block backups is the one job that cannot do this, because I/O
on the source triggers I/O on the target.

So if we consider this idea worthwhile, and decide that pausing device
I/O on the target should pause the block job, the backup job actually
has to prevent *adding a DEVICE_IO blocker* on the target.  This
"meta-block" is not possible in your design, which is a pity because on
the surface it looked nicer than mine.

FWIW, my original idea was:

- bdrv_pause checks if there is an operation blocker for PAUSE.  if it
is there, it fails

- otherwise, bdrv_pause invokes a notifier list if this is the outermost
call. if not the outermost call, it does nothing

- bdrv_resume does the same, but does not need a blocker

- drive-backup should block PAUSE on its target


Also, should the blockers (either DEVICE_IO in your design, or PAUSE in
mine) be included in bdrv_op_block_all?  I would say no in your case;
here is the proof:

- block-backup doesn't like DEVICE_IO blockers on the target

- block-backup calls bdrv_op_block_all on the target

- hence, bdrv_op_block_all shouldn't block DEVICE_IO


block_job_create is another suspicious caller of bdrv_op_block_all.  It
probably shouldn't block either PAUSE or DEVICE_IO.

Paolo


* Re: [Qemu-devel] [PATCH v2 10/11] blockdev: Block device IO during blockdev-backup transaction
  2015-05-13 11:22   ` Wen Congyang
@ 2015-05-13 12:55     ` Fam Zheng
  2015-05-14  1:12       ` Wen Congyang
  0 siblings, 1 reply; 33+ messages in thread
From: Fam Zheng @ 2015-05-13 12:55 UTC (permalink / raw)
  To: Wen Congyang
  Cc: Kevin Wolf, qemu-block, armbru, jcody, qemu-devel, mreitz,
	Stefan Hajnoczi, pbonzini

On Wed, 05/13 19:22, Wen Congyang wrote:
> On 05/14/2015 01:28 AM, Fam Zheng wrote:
> > Signed-off-by: Fam Zheng <famz@redhat.com>
> > ---
> >  blockdev.c | 10 +++++++++-
> >  1 file changed, 9 insertions(+), 1 deletion(-)
> > 
> > diff --git a/blockdev.c b/blockdev.c
> > index ae52d27..bd28183 100644
> > --- a/blockdev.c
> > +++ b/blockdev.c
> > @@ -1649,6 +1649,7 @@ typedef struct BlockdevBackupState {
> >      BlockDriverState *bs;
> >      BlockJob *job;
> >      AioContext *aio_context;
> > +    Error *blocker;
> >  } BlockdevBackupState;
> >  
> >  static void blockdev_backup_prepare(BlkTransactionState *common, Error **errp)
> > @@ -1685,6 +1686,10 @@ static void blockdev_backup_prepare(BlkTransactionState *common, Error **errp)
> >      }
> >      aio_context_acquire(state->aio_context);
> >  
> > +    state->bs = bs;
> > +    error_setg(&state->blocker, "blockdev-backup in progress");
> > +    bdrv_op_block(bs, BLOCK_OP_TYPE_DEVICE_IO, state->blocker);
> > +
> 
> Did you test this patch? You need to read from bs to do the backup!!
> If the mode is none, you also need to write to bs!!

This blocker is only temporary and is removed in blockdev_backup_clean before
qmp_transaction returns.

Fam

> 
> Thanks
> Wen Congyang
> 
> >      qmp_blockdev_backup(backup->device, backup->target,
> >                          backup->sync,
> >                          backup->has_speed, backup->speed,
> > @@ -1696,7 +1701,6 @@ static void blockdev_backup_prepare(BlkTransactionState *common, Error **errp)
> >          return;
> >      }
> >  
> > -    state->bs = bs;
> >      state->job = state->bs->job;
> >  }
> >  
> > @@ -1715,6 +1719,10 @@ static void blockdev_backup_clean(BlkTransactionState *common)
> >  {
> >      BlockdevBackupState *state = DO_UPCAST(BlockdevBackupState, common, common);
> >  
> > +    if (state->bs) {
> > +        bdrv_op_unblock(state->bs, BLOCK_OP_TYPE_DEVICE_IO, state->blocker);
> > +        error_free(state->blocker);
> > +    }
> >      if (state->aio_context) {
> >          aio_context_release(state->aio_context);
> >      }
> > 
> 


* Re: [Qemu-devel] [PATCH v2 01/11] block: Add op blocker type "device IO"
  2015-05-13 12:04   ` Paolo Bonzini
@ 2015-05-13 15:02     ` Fam Zheng
  2015-05-13 15:09       ` Paolo Bonzini
  0 siblings, 1 reply; 33+ messages in thread
From: Fam Zheng @ 2015-05-13 15:02 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Kevin Wolf, qemu-block, armbru, jcody, qemu-devel, mreitz,
	Stefan Hajnoczi

On Wed, 05/13 14:04, Paolo Bonzini wrote:
> 
> 
> On 13/05/2015 19:28, Fam Zheng wrote:
> > @@ -478,6 +478,10 @@ static int blk_check_request(BlockBackend *blk, int64_t sector_num,
> >          return -EIO;
> >      }
> >  
> > +    if (bdrv_op_is_blocked(blk->bs, BLOCK_OP_TYPE_DEVICE_IO, NULL)) {
> > +        return -EBUSY;
> > +    }
> 
> I think this is incorrect.  It's fine for backends to generate more I/O
> after a blocker is submitted, as long as it's bounded.
> 
> For example, SCSI requests can result in many consecutive I/Os:
> 
> (1) FUA requests are split in write+flush
> 
> (2) adapters that do not use QEMUSGList-based I/O only read 128K at a time
> 
> (3) WRITE SAME operations are also split in chunks
> 
> (4) UNMAP operations process one descriptor at a time

I don't understand the point of these examples. If we don't return -EBUSY here,
the request will sneak into block/io.c and perhaps break qmp transaction
semantics, if it lands between two backups.

Fam

> 
> Paolo
> 
> >      return blk_check_byte_request(blk, sector_num * BDRV_SECTOR_SIZE,
> >                                    nb_sectors * BDRV_SECTOR_SIZE);
> >  }


* Re: [Qemu-devel] [PATCH v2 10/11] blockdev: Block device IO during blockdev-backup transaction
  2015-05-13 12:21   ` Paolo Bonzini
@ 2015-05-13 15:08     ` Paolo Bonzini
  2015-05-13 15:34       ` Fam Zheng
  0 siblings, 1 reply; 33+ messages in thread
From: Paolo Bonzini @ 2015-05-13 15:08 UTC (permalink / raw)
  To: Fam Zheng, qemu-devel; +Cc: qemu-block, armbru, Stefan Hajnoczi, mreitz

On 13/05/2015 16:55, Fam Zheng wrote:
> > So if we consider this idea worthwhile, and decide that pausing device
> > I/O on the target should pause the block job, the backup job actually
> > has to prevent *adding a DEVICE_IO blocker* on the target.
> 
> When do we need to pause device IO on the target? For drive-backup the target
> is anonymous and no device exists on it, so it's out of the question. For
> blockdev-backup, users can have a device attached, but I'm not sure we should
> pause the job in this case.

Well, it depends on the meaning you give to "pause"/"device_io blocker".
 For me it was "I want exclusive ownership of the disk, no one else
should write".

For blockdev-backup for example you could mirror the backup destination
somewhere else, and mirror needs to drain the source.  For that to make
sense, the block job has to be paused.

Of course this is contrived, but it's better to make the design generic.

> > > This
> > > "meta-block" is not possible in your design, which is a pity because on
> > > the surface it looked nicer than mine.
> > > 
> > > FWIW, my original idea was:
> > > 
> > > - bdrv_pause checks if there is an operation blocker for PAUSE.  if it
> > > is there, it fails
> 
> It's more complicated: bdrv_drain_all can't fail now; if we want
> bdrv_pause there, it could fail, and it would become much harder to use.

I think we don't need bdrv_pause there.  See my reply to patch 11---we
want it in the callers.

> In your idea, who will block PAUSE, and why?

For example, blockdev-backup/drive-backup can block PAUSE on the target.
 But maybe they only need to add an op blocker notifier, and use it to
block device I/O on the source and drain it.  So your idea is safe.  Good!

Another idea was to block PAUSE in virtio-scsi until it's fixed.

Paolo

> > > - otherwise, bdrv_pause invokes a notifier list if this is the outermost
> > > call. if not the outermost call, it does nothing
> > > 
> > > - bdrv_resume does the same, but does not need a blocker
> > > 
> > > - drive-backup should block PAUSE on its target


* Re: [Qemu-devel] [PATCH v2 01/11] block: Add op blocker type "device IO"
  2015-05-13 15:02     ` Fam Zheng
@ 2015-05-13 15:09       ` Paolo Bonzini
  2015-05-14  2:40         ` Fam Zheng
  0 siblings, 1 reply; 33+ messages in thread
From: Paolo Bonzini @ 2015-05-13 15:09 UTC (permalink / raw)
  To: Fam Zheng
  Cc: Kevin Wolf, qemu-block, armbru, jcody, qemu-devel, mreitz,
	Stefan Hajnoczi



On 13/05/2015 17:02, Fam Zheng wrote:
>> > For example, SCSI requests can result in many consecutive I/Os:
>> > 
>> > (1) FUA requests are split in write+flush
>> > 
>> > (2) adapters that do not use QEMUSGList-based I/O only read 128K at a time
>> > 
>> > (3) WRITE SAME operations are also split in chunks
>> > 
>> > (4) UNMAP operations process one descriptor at a time
> I don't understand the point of these examples. If we don't return -EBUSY here,
> the request will sneak into block/io.c and perhaps break qmp transaction
> semantics, if it lands between two backups.

It won't, because after blocking DEVICE_IO you will always drain I/O and
the bdrv_drain will loop until the above are all satisfied.

Paolo


* Re: [Qemu-devel] [PATCH v2 11/11] block: Block "device IO" during bdrv_drain and bdrv_drain_all
  2015-05-13 11:33       ` Paolo Bonzini
@ 2015-05-13 15:17         ` Fam Zheng
  2015-05-13 15:25           ` Paolo Bonzini
  0 siblings, 1 reply; 33+ messages in thread
From: Fam Zheng @ 2015-05-13 15:17 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Kevin Wolf, qemu-block, armbru, jcody, qemu-devel, mreitz,
	Stefan Hajnoczi

On Wed, 05/13 13:33, Paolo Bonzini wrote:
> 
> 
> On 13/05/2015 13:08, Fam Zheng wrote:
> > > I think this isn't enough.  It's the callers of bdrv_drain and
> > > bdrv_drain_all that need to block before drain and unblock before
> > > aio_context_release.
> > 
> > Which callers do you mean? qmp_transaction is covered in this series.
> 
> All of them. :(
> 
> In some cases it may be unnecessary.  I'm thinking of bdrv_set_aio_context,
> and mig_save_device_dirty, but that's probably the exception rather than
> the rule.
> 
> In some cases, like bdrv_snapshot_delete, the change is trivial.
> 
> The only somewhat more complex case is probably block/mirror.c.  There,
> the aio_context_release happens through block_job_sleep_ns or
> qemu_coroutine_yield.  I guess the change would look something like
> this sketch:
> 
> diff --git a/block/mirror.c b/block/mirror.c
> index 58f391a..dcfede0 100644
> --- a/block/mirror.c
> +++ b/block/mirror.c
> @@ -513,16 +513,29 @@ static void coroutine_fn mirror_run(void *opaque)
>  
>          if (cnt == 0 && should_complete) {
>              /* The dirty bitmap is not updated while operations are pending.
> -             * If we're about to exit, wait for pending operations before
> -             * calling bdrv_get_dirty_count(bs), or we may exit while the
> +             * If we're about to exit, wait for pending operations and
> +             * recheck bdrv_get_dirty_count(bs), or we may exit while the
>               * source has dirty data to copy!
>               *
>               * Note that I/O can be submitted by the guest while
> -             * mirror_populate runs.
> +             * the rest of mirror_populate runs, but we lock it out here.
>               */
>              trace_mirror_before_drain(s, cnt);
> +
> +            // ... block ...
>              bdrv_drain(bs);
>              cnt = bdrv_get_dirty_count(s->dirty_bitmap);
> +
> +            if (cnt == 0) {
> +                /* The two disks are in sync.  Exit and report successful
> +                 * completion.
> +                 */
> +                assert(s->synced && QLIST_EMPTY(&bs->tracked_requests));
> +                s->common.cancelled = false;
> +                break; // ... and unblock somewhere else...
> +            }
> +
> +            // ... unblock ...
>          }
>  
>          ret = 0;
> @@ -535,13 +549,6 @@ static void coroutine_fn mirror_run(void *opaque)
>          } else if (!should_complete) {
>              delay_ns = (s->in_flight == 0 && cnt == 0 ? SLICE_TIME : 0);
>              block_job_sleep_ns(&s->common, QEMU_CLOCK_REALTIME, delay_ns);
> -        } else if (cnt == 0) {
> -            /* The two disks are in sync.  Exit and report successful
> -             * completion.
> -             */
> -            assert(QLIST_EMPTY(&bs->tracked_requests));
> -            s->common.cancelled = false;
> -            break;
>          }
>          last_pause_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
>      }
> 
> 
> It can be the topic of a separate series.  But this patch brings a
> false sense of security (either the blocker is unnecessary, or it
> needs to last after bdrv_drain returns), so I think it should be
> dropped.

Doesn't this let bdrv_drain_all return while virtio-blk-dataplane is under high
workload, in places where you say "the blocker is unnecessary"?

Fam


* Re: [Qemu-devel] [PATCH v2 11/11] block: Block "device IO" during bdrv_drain and bdrv_drain_all
  2015-05-13 15:17         ` Fam Zheng
@ 2015-05-13 15:25           ` Paolo Bonzini
  2015-05-14  3:02             ` Fam Zheng
  0 siblings, 1 reply; 33+ messages in thread
From: Paolo Bonzini @ 2015-05-13 15:25 UTC (permalink / raw)
  To: Fam Zheng
  Cc: Kevin Wolf, qemu-block, armbru, jcody, qemu-devel, mreitz,
	Stefan Hajnoczi



On 13/05/2015 17:17, Fam Zheng wrote:
>> > 
>> > It can be the topic of a separate series.  But this patch brings a
>> > false sense of security (either the blocker is unnecessary, or it
>> > needs to last after bdrv_drain returns), so I think it should be
>> > dropped.
> Doesn't this let bdrv_drain_all return when virtio-blk-dataplane is under a heavy
> workload, in the places where you say "the blocker is unnecessary"?

Yes, you're right.  Please document it in the commit message and the
code; it's tricky.

Paolo


* Re: [Qemu-devel] [PATCH v2 10/11] blockdev: Block device IO during blockdev-backup transaction
  2015-05-13 15:08     ` Paolo Bonzini
@ 2015-05-13 15:34       ` Fam Zheng
  0 siblings, 0 replies; 33+ messages in thread
From: Fam Zheng @ 2015-05-13 15:34 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: mreitz, qemu-block, qemu-devel, Stefan Hajnoczi, armbru

On Wed, 05/13 17:08, Paolo Bonzini wrote:
> On 13/05/2015 16:55, Fam Zheng wrote:
> > In your idea, who will block PAUSE, and why?
> 
> For example, blockdev-backup/drive-backup can block PAUSE on the target.
>  But maybe they only need to add an op blocker notifier, and use it to
> block device I/O on the source and drain it.  So your idea is safe.  Good!

Yes, sounds right.

> 
> Another idea was to block PAUSE in virtio-scsi until it's fixed.

I'd prefer to follow up with another series to fix virtio-scsi, if my idea works :)

Fam


* [Qemu-devel] [PATCH v2 00/11] Fix transactional snapshot with virtio-blk dataplane and NBD export
@ 2015-05-13 17:28 Fam Zheng
  2015-05-13 17:28 ` [Qemu-devel] [PATCH v2 01/11] block: Add op blocker type "device IO" Fam Zheng
                   ` (10 more replies)
  0 siblings, 11 replies; 33+ messages in thread
From: Fam Zheng @ 2015-05-13 17:28 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, qemu-block, jcody, armbru, mreitz, Stefan Hajnoczi, pbonzini

Changes from RFC:
  - Add op blocker listener in nbd server.
  - Add other transaction types.
  - Only notify listeners when changing from/to empty. (Paolo)

Reported by Paolo.

Unlike the iohandler in the main loop, iothreads currently process the event
notifier used by the virtio-blk ioeventfd in nested aio_poll. Without proper
protection this is dangerous, because guest requests can sneak into the block
layer at points where they must not.

For example, a QMP transaction may involve multiple bdrv_drain_all() calls in
handling the list of AioContexts it works on. If an aio_poll in one of those
bdrv_drain_all() calls happens to process a guest VQ kick, and dispatches the
ioeventfd event to virtio-blk, a new guest write is submitted, and voila, the
transaction semantics are violated.
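The race above can be sketched as a toy model (this is NOT QEMU code; all
names here -- toy_poll, toy_drain, the toy_dev struct -- are illustrative
stand-ins): the nested poll that drains completions also dispatches pending
guest kicks, so without the "device IO" check a new request is submitted in
the middle of the drain.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_dev {
    bool kick_pending;   /* ioeventfd armed by the guest */
    bool io_blocked;     /* "device IO" op blocker installed */
    int  in_flight;      /* requests the block layer still owns */
    int  sneaked_in;     /* requests submitted during the drain */
};

/* One nested aio_poll iteration: dispatch a pending guest kick (as the
 * iothread would), then complete one outstanding request. */
static bool toy_poll(struct toy_dev *d)
{
    if (d->kick_pending) {
        d->kick_pending = false;
        if (!d->io_blocked) {   /* the check this series adds */
            d->in_flight++;
            d->sneaked_in++;
        }
        return true;
    }
    if (d->in_flight > 0) {
        d->in_flight--;
        return true;
    }
    return false;
}

/* Drain with one outstanding request and one pending guest kick;
 * report how many requests sneaked in during the drain. */
int toy_drain(bool block_device_io)
{
    struct toy_dev d = { .kick_pending = true,
                         .io_blocked = block_device_io,
                         .in_flight = 1 };
    while (toy_poll(&d)) {
        /* keep iterating, like bdrv_drain */
    }
    return d.sneaked_in;
}
```

With the blocker installed the kick is dropped and nothing sneaks in; without
it, a request is submitted while the drain believes it has quiesced the device.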

This series avoids this problem by disabling virtio-blk handlers during
bdrv_drain_all() and transactions.

- Patches 1~3 add the block layer op blocker change notifier code.
- Patches 4,5 secure virtio-blk dataplane.
- Patch 6 secures nbd export.
- Patches 7~10 protect each transaction type from being violated by new IO
  generated in nested aio_poll.
- Patch 11 protects bdrv_drain and bdrv_drain_all.

Notes:

virtio-scsi-dataplane will be a bit more complicated, but still doable.  It
would probably need one more interface abstraction between scsi-disk, scsi-bus
and virtio-scsi.

Although other devices don't have pause/resume callbacks yet, blk_check_request,
which returns -EBUSY if the "device IO" op blocker is set, should already cover
most cases.
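A minimal sketch of that gating idea (simplified, not the actual QEMU code:
the `blockers` count stands in for the bs->op_blockers[BLOCK_OP_TYPE_DEVICE_IO]
list, and the byte-range checks of the real blk_check_request are elided):

```c
#include <errno.h>

int check_request(int blockers, long sector_num, int nb_sectors)
{
    if (sector_num < 0 || nb_sectors < 0) {
        return -EIO;       /* invalid request, as in the real helper */
    }
    if (blockers > 0) {
        return -EBUSY;     /* device IO temporarily disabled by another user */
    }
    return 0;              /* byte-range checks would follow here */
}
```

Any device going through the block backend gets the -EBUSY refusal for free,
without needing its own pause/resume implementation.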

Timers and block jobs also generate IO, but that should be fine as long as they
don't change guest-visible data, which is true AFAICT. Besides, bdrv_drain_all
already pauses block jobs.



Fam Zheng (11):
  block: Add op blocker type "device IO"
  block: Add op blocker notifier list
  block-backend: Add blk_op_blocker_add_notifier
  virtio-blk: Move complete_request to 'ops' structure
  virtio-blk: Don't handle output when there is "device IO" op blocker
  nbd-server: Clear "can_read" when "device io" blocker is set
  blockdev: Block device IO during internal snapshot transaction
  blockdev: Block device IO during external snapshot transaction
  blockdev: Block device IO during drive-backup transaction
  blockdev: Block device IO during blockdev-backup transaction
  block: Block "device IO" during bdrv_drain and bdrv_drain_all

 block.c                         | 28 +++++++++++++++++
 block/block-backend.c           | 10 ++++++
 block/io.c                      | 12 +++++++
 blockdev.c                      | 49 ++++++++++++++++++++++++-----
 hw/block/dataplane/virtio-blk.c | 36 ++++++++++++++++++---
 hw/block/virtio-blk.c           | 69 +++++++++++++++++++++++++++++++++++++++--
 include/block/block.h           |  9 ++++++
 include/block/block_int.h       |  3 ++
 include/hw/virtio/virtio-blk.h  | 17 ++++++++--
 include/sysemu/block-backend.h  |  2 ++
 nbd.c                           | 18 +++++++++++
 11 files changed, 235 insertions(+), 18 deletions(-)

-- 
2.4.0


* [Qemu-devel] [PATCH v2 01/11] block: Add op blocker type "device IO"
  2015-05-13 17:28 [Qemu-devel] [PATCH v2 00/11] Fix transactional snapshot with virtio-blk dataplane and NBD export Fam Zheng
@ 2015-05-13 17:28 ` Fam Zheng
  2015-05-13 11:32   ` Wen Congyang
  2015-05-13 12:04   ` Paolo Bonzini
  2015-05-13 17:28 ` [Qemu-devel] [PATCH v2 02/11] block: Add op blocker notifier list Fam Zheng
                   ` (9 subsequent siblings)
  10 siblings, 2 replies; 33+ messages in thread
From: Fam Zheng @ 2015-05-13 17:28 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, qemu-block, jcody, armbru, mreitz, Stefan Hajnoczi, pbonzini

Preventing a device from submitting IO is useful around the various nested
aio_poll calls. The op blocker mechanism is a good place for this flag.

Devices submit IO requests through the blk_* block backend interface, which
calls blk_check_request to check their validity. Return -EBUSY if the operation
is blocked, meaning device IO has been temporarily disabled by another BDS
user.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 block/block-backend.c | 4 ++++
 include/block/block.h | 1 +
 2 files changed, 5 insertions(+)

diff --git a/block/block-backend.c b/block/block-backend.c
index 93e46f3..71fc695 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -478,6 +478,10 @@ static int blk_check_request(BlockBackend *blk, int64_t sector_num,
         return -EIO;
     }
 
+    if (bdrv_op_is_blocked(blk->bs, BLOCK_OP_TYPE_DEVICE_IO, NULL)) {
+        return -EBUSY;
+    }
+
     return blk_check_byte_request(blk, sector_num * BDRV_SECTOR_SIZE,
                                   nb_sectors * BDRV_SECTOR_SIZE);
 }
diff --git a/include/block/block.h b/include/block/block.h
index 7d1a717..906fb31 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -159,6 +159,7 @@ typedef enum BlockOpType {
     BLOCK_OP_TYPE_RESIZE,
     BLOCK_OP_TYPE_STREAM,
     BLOCK_OP_TYPE_REPLACE,
+    BLOCK_OP_TYPE_DEVICE_IO,
     BLOCK_OP_TYPE_MAX,
 } BlockOpType;
 
-- 
2.4.0


* [Qemu-devel] [PATCH v2 02/11] block: Add op blocker notifier list
  2015-05-13 17:28 [Qemu-devel] [PATCH v2 00/11] Fix transactional snapshot with virtio-blk dataplane and NBD export Fam Zheng
  2015-05-13 17:28 ` [Qemu-devel] [PATCH v2 01/11] block: Add op blocker type "device IO" Fam Zheng
@ 2015-05-13 17:28 ` Fam Zheng
  2015-05-14  5:32   ` Fam Zheng
  2015-05-13 17:28 ` [Qemu-devel] [PATCH v2 03/11] block-backend: Add blk_op_blocker_add_notifier Fam Zheng
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 33+ messages in thread
From: Fam Zheng @ 2015-05-13 17:28 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, qemu-block, jcody, armbru, mreitz, Stefan Hajnoczi, pbonzini

BDS users can register a notifier and get notified about op blocker
changes.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 block.c                   | 28 ++++++++++++++++++++++++++++
 include/block/block.h     |  8 ++++++++
 include/block/block_int.h |  3 +++
 3 files changed, 39 insertions(+)

diff --git a/block.c b/block.c
index 7904098..cb47308 100644
--- a/block.c
+++ b/block.c
@@ -3375,6 +3375,12 @@ struct BdrvOpBlocker {
     QLIST_ENTRY(BdrvOpBlocker) list;
 };
 
+void bdrv_op_blocker_add_notifier(BlockDriverState *bs,
+                                  Notifier *notifier)
+{
+    notifier_list_add(&bs->op_blocker_notifiers, notifier);
+}
+
 bool bdrv_op_is_blocked(BlockDriverState *bs, BlockOpType op, Error **errp)
 {
     BdrvOpBlocker *blocker;
@@ -3391,26 +3397,48 @@ bool bdrv_op_is_blocked(BlockDriverState *bs, BlockOpType op, Error **errp)
     return false;
 }
 
+static void bdrv_op_blocker_notify(BlockDriverState *bs, BlockOpType op,
+                                   Error *reason, bool blocking)
+{
+    BlockOpEvent event = (BlockOpEvent) {
+        .type = op,
+        .reason = reason,
+        .blocking = blocking,
+    };
+
+    notifier_list_notify(&bs->op_blocker_notifiers, &event);
+}
+
 void bdrv_op_block(BlockDriverState *bs, BlockOpType op, Error *reason)
 {
+    bool blocked;
     BdrvOpBlocker *blocker;
     assert((int) op >= 0 && op < BLOCK_OP_TYPE_MAX);
 
     blocker = g_new0(BdrvOpBlocker, 1);
     blocker->reason = reason;
+    blocked = !QLIST_EMPTY(&bs->op_blockers[op]);
     QLIST_INSERT_HEAD(&bs->op_blockers[op], blocker, list);
+    if (!blocked) {
+        bdrv_op_blocker_notify(bs, op, reason, true);
+    }
 }
 
 void bdrv_op_unblock(BlockDriverState *bs, BlockOpType op, Error *reason)
 {
+    bool blocked;
     BdrvOpBlocker *blocker, *next;
     assert((int) op >= 0 && op < BLOCK_OP_TYPE_MAX);
+    blocked = !QLIST_EMPTY(&bs->op_blockers[op]);
     QLIST_FOREACH_SAFE(blocker, &bs->op_blockers[op], list, next) {
         if (blocker->reason == reason) {
             QLIST_REMOVE(blocker, list);
             g_free(blocker);
         }
     }
+    if (blocked && QLIST_EMPTY(&bs->op_blockers[op])) {
+        bdrv_op_blocker_notify(bs, op, reason, false);
+    }
 }
 
 void bdrv_op_block_all(BlockDriverState *bs, Error *reason)
diff --git a/include/block/block.h b/include/block/block.h
index 906fb31..3420b2c 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -163,6 +163,12 @@ typedef enum BlockOpType {
     BLOCK_OP_TYPE_MAX,
 } BlockOpType;
 
+typedef struct {
+    BlockOpType type;
+    Error *reason;
+    bool blocking;
+} BlockOpEvent;
+
 void bdrv_iostatus_enable(BlockDriverState *bs);
 void bdrv_iostatus_reset(BlockDriverState *bs);
 void bdrv_iostatus_disable(BlockDriverState *bs);
@@ -491,6 +497,8 @@ void bdrv_disable_copy_on_read(BlockDriverState *bs);
 void bdrv_ref(BlockDriverState *bs);
 void bdrv_unref(BlockDriverState *bs);
 
+void bdrv_op_blocker_add_notifier(BlockDriverState *bs,
+                                  Notifier *notifier);
 bool bdrv_op_is_blocked(BlockDriverState *bs, BlockOpType op, Error **errp);
 void bdrv_op_block(BlockDriverState *bs, BlockOpType op, Error *reason);
 void bdrv_op_unblock(BlockDriverState *bs, BlockOpType op, Error *reason);
diff --git a/include/block/block_int.h b/include/block/block_int.h
index db29b74..29d1c84 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -418,6 +418,9 @@ struct BlockDriverState {
     /* operation blockers */
     QLIST_HEAD(, BdrvOpBlocker) op_blockers[BLOCK_OP_TYPE_MAX];
 
+    /* Called when a list in op_blockers changes between empty and non-empty. */
+    NotifierList op_blocker_notifiers;
+
     /* long-running background operation */
     BlockJob *job;
 
-- 
2.4.0


* [Qemu-devel] [PATCH v2 03/11] block-backend: Add blk_op_blocker_add_notifier
  2015-05-13 17:28 [Qemu-devel] [PATCH v2 00/11] Fix transactional snapshot with virtio-blk dataplane and NBD export Fam Zheng
  2015-05-13 17:28 ` [Qemu-devel] [PATCH v2 01/11] block: Add op blocker type "device IO" Fam Zheng
  2015-05-13 17:28 ` [Qemu-devel] [PATCH v2 02/11] block: Add op blocker notifier list Fam Zheng
@ 2015-05-13 17:28 ` Fam Zheng
  2015-05-13 17:28 ` [Qemu-devel] [PATCH v2 04/11] virtio-blk: Move complete_request to 'ops' structure Fam Zheng
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 33+ messages in thread
From: Fam Zheng @ 2015-05-13 17:28 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, qemu-block, jcody, armbru, mreitz, Stefan Hajnoczi, pbonzini

Forward the call to bdrv_op_blocker_add_notifier.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 block/block-backend.c          | 6 ++++++
 include/sysemu/block-backend.h | 2 ++
 2 files changed, 8 insertions(+)

diff --git a/block/block-backend.c b/block/block-backend.c
index 71fc695..90d7476 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -806,6 +806,12 @@ void blk_op_unblock_all(BlockBackend *blk, Error *reason)
     bdrv_op_unblock_all(blk->bs, reason);
 }
 
+void blk_op_blocker_add_notifier(BlockBackend *blk,
+                                 Notifier *notifier)
+{
+    bdrv_op_blocker_add_notifier(blk->bs, notifier);
+}
+
 AioContext *blk_get_aio_context(BlockBackend *blk)
 {
     return bdrv_get_aio_context(blk->bs);
diff --git a/include/sysemu/block-backend.h b/include/sysemu/block-backend.h
index b4a4d5e..cde9651 100644
--- a/include/sysemu/block-backend.h
+++ b/include/sysemu/block-backend.h
@@ -136,6 +136,8 @@ int blk_get_flags(BlockBackend *blk);
 int blk_get_max_transfer_length(BlockBackend *blk);
 void blk_set_guest_block_size(BlockBackend *blk, int align);
 void *blk_blockalign(BlockBackend *blk, size_t size);
+void blk_op_blocker_add_notifier(BlockBackend *blk,
+                                 Notifier *notifier);
 bool blk_op_is_blocked(BlockBackend *blk, BlockOpType op, Error **errp);
 void blk_op_unblock(BlockBackend *blk, BlockOpType op, Error *reason);
 void blk_op_block_all(BlockBackend *blk, Error *reason);
-- 
2.4.0


* [Qemu-devel] [PATCH v2 04/11] virtio-blk: Move complete_request to 'ops' structure
  2015-05-13 17:28 [Qemu-devel] [PATCH v2 00/11] Fix transactional snapshot with virtio-blk dataplane and NBD export Fam Zheng
                   ` (2 preceding siblings ...)
  2015-05-13 17:28 ` [Qemu-devel] [PATCH v2 03/11] block-backend: Add blk_op_blocker_add_notifier Fam Zheng
@ 2015-05-13 17:28 ` Fam Zheng
  2015-05-13 17:28 ` [Qemu-devel] [PATCH v2 05/11] virtio-blk: Don't handle output when there is "device IO" op blocker Fam Zheng
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 33+ messages in thread
From: Fam Zheng @ 2015-05-13 17:28 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, qemu-block, jcody, armbru, mreitz, Stefan Hajnoczi, pbonzini

Should more ops be added to differentiate dataplane from non-dataplane code,
the new saved_ops approach will be cleaner than juggling N separate function
pointers.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 hw/block/dataplane/virtio-blk.c | 13 ++++++++-----
 hw/block/virtio-blk.c           |  8 ++++++--
 include/hw/virtio/virtio-blk.h  |  9 +++++++--
 3 files changed, 21 insertions(+), 9 deletions(-)

diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index 3db139b..ec0c8f4 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -51,8 +51,7 @@ struct VirtIOBlockDataPlane {
 
     /* Operation blocker on BDS */
     Error *blocker;
-    void (*saved_complete_request)(struct VirtIOBlockReq *req,
-                                   unsigned char status);
+    const VirtIOBlockOps *saved_ops;
 };
 
 /* Raise an interrupt to signal guest, if necessary */
@@ -88,6 +87,10 @@ static void complete_request_vring(VirtIOBlockReq *req, unsigned char status)
     qemu_bh_schedule(s->bh);
 }
 
+static const VirtIOBlockOps virtio_blk_data_plane_ops = {
+    .complete_request = complete_request_vring,
+};
+
 static void handle_notify(EventNotifier *e)
 {
     VirtIOBlockDataPlane *s = container_of(e, VirtIOBlockDataPlane,
@@ -269,8 +272,8 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s)
     }
     s->host_notifier = *virtio_queue_get_host_notifier(vq);
 
-    s->saved_complete_request = vblk->complete_request;
-    vblk->complete_request = complete_request_vring;
+    s->saved_ops = vblk->ops;
+    vblk->ops = &virtio_blk_data_plane_ops;
 
     s->starting = false;
     s->started = true;
@@ -313,7 +316,7 @@ void virtio_blk_data_plane_stop(VirtIOBlockDataPlane *s)
         return;
     }
     s->stopping = true;
-    vblk->complete_request = s->saved_complete_request;
+    vblk->ops = s->saved_ops;
     trace_virtio_blk_data_plane_stop(s);
 
     aio_context_acquire(s->ctx);
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index e6afe97..f4a9d19 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -59,9 +59,13 @@ static void virtio_blk_complete_request(VirtIOBlockReq *req,
     virtio_notify(vdev, s->vq);
 }
 
+static const VirtIOBlockOps virtio_blk_ops = (VirtIOBlockOps) {
+    .complete_request = virtio_blk_complete_request,
+};
+
 static void virtio_blk_req_complete(VirtIOBlockReq *req, unsigned char status)
 {
-    req->dev->complete_request(req, status);
+    req->dev->ops->complete_request(req, status);
 }
 
 static int virtio_blk_handle_rw_error(VirtIOBlockReq *req, int error,
@@ -905,7 +909,7 @@ static void virtio_blk_device_realize(DeviceState *dev, Error **errp)
     s->sector_mask = (s->conf.conf.logical_block_size / BDRV_SECTOR_SIZE) - 1;
 
     s->vq = virtio_add_queue(vdev, 128, virtio_blk_handle_output);
-    s->complete_request = virtio_blk_complete_request;
+    s->ops = &virtio_blk_ops;
     virtio_blk_data_plane_create(vdev, conf, &s->dataplane, &err);
     if (err != NULL) {
         error_propagate(errp, err);
diff --git a/include/hw/virtio/virtio-blk.h b/include/hw/virtio/virtio-blk.h
index 6bf5905..28b3436 100644
--- a/include/hw/virtio/virtio-blk.h
+++ b/include/hw/virtio/virtio-blk.h
@@ -44,6 +44,12 @@ struct VirtIOBlkConf
 struct VirtIOBlockDataPlane;
 
 struct VirtIOBlockReq;
+
+typedef struct {
+    /* Function to push to vq and notify guest */
+    void (*complete_request)(struct VirtIOBlockReq *req, unsigned char status);
+} VirtIOBlockOps;
+
 typedef struct VirtIOBlock {
     VirtIODevice parent_obj;
     BlockBackend *blk;
@@ -54,8 +60,7 @@ typedef struct VirtIOBlock {
     unsigned short sector_mask;
     bool original_wce;
     VMChangeStateEntry *change;
-    /* Function to push to vq and notify guest */
-    void (*complete_request)(struct VirtIOBlockReq *req, unsigned char status);
+    const VirtIOBlockOps *ops;
     Notifier migration_state_notifier;
     struct VirtIOBlockDataPlane *dataplane;
 } VirtIOBlock;
-- 
2.4.0


* [Qemu-devel] [PATCH v2 05/11] virtio-blk: Don't handle output when there is "device IO" op blocker
  2015-05-13 17:28 [Qemu-devel] [PATCH v2 00/11] Fix transactional snapshot with virtio-blk dataplane and NBD export Fam Zheng
                   ` (3 preceding siblings ...)
  2015-05-13 17:28 ` [Qemu-devel] [PATCH v2 04/11] virtio-blk: Move complete_request to 'ops' structure Fam Zheng
@ 2015-05-13 17:28 ` Fam Zheng
  2015-05-13 10:26   ` Paolo Bonzini
  2015-05-13 17:28 ` [Qemu-devel] [PATCH v2 06/11] nbd-server: Clear "can_read" when "device io" blocker is set Fam Zheng
                   ` (5 subsequent siblings)
  10 siblings, 1 reply; 33+ messages in thread
From: Fam Zheng @ 2015-05-13 17:28 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, qemu-block, jcody, armbru, mreitz, Stefan Hajnoczi, pbonzini

virtio-blk now listens for op blocker changes on the associated block
backend.

Upon setting the op blocker on BLOCK_OP_TYPE_DEVICE_IO:

  non-dataplane:
   1) Set VirtIOBlock.paused
   2) In virtio_blk_handle_output, do nothing if VirtIOBlock.paused

  dataplane:
   1) Clear the host event notifier
   2) In handle_notify, do nothing if VirtIOBlock.paused

Upon removing the op blocker:

  non-dataplane:
   1) Clear VirtIOBlock.paused
   2) Schedule a BH on the AioContext of the backend, which calls
   virtio_blk_handle_output, so that previously unhandled kicks can
   make progress

  dataplane:
   1) Set the host event notifier
   2) Notify the host event notifier so that unhandled events are
   processed

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 hw/block/dataplane/virtio-blk.c | 25 +++++++++++++++-
 hw/block/virtio-blk.c           | 63 +++++++++++++++++++++++++++++++++++++++--
 include/hw/virtio/virtio-blk.h  |  8 +++++-
 3 files changed, 92 insertions(+), 4 deletions(-)

diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index ec0c8f4..d6c943c 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -87,8 +87,28 @@ static void complete_request_vring(VirtIOBlockReq *req, unsigned char status)
     qemu_bh_schedule(s->bh);
 }
 
+static void virtio_blk_data_plane_pause(VirtIOBlock *vblk)
+{
+    VirtIOBlockDataPlane *s = vblk->dataplane;
+
+    event_notifier_test_and_clear(&s->host_notifier);
+    aio_set_event_notifier(s->ctx, &s->host_notifier, NULL);
+}
+
+static void handle_notify(EventNotifier *e);
+static void virtio_blk_data_plane_resume(VirtIOBlock *vblk)
+{
+    VirtIOBlockDataPlane *s = vblk->dataplane;
+
+    aio_set_event_notifier(s->ctx, &s->host_notifier, handle_notify);
+
+    event_notifier_set(&s->host_notifier);
+}
+
 static const VirtIOBlockOps virtio_blk_data_plane_ops = {
-    .complete_request = complete_request_vring,
+    .complete_request           = complete_request_vring,
+    .pause                      = virtio_blk_data_plane_pause,
+    .resume                     = virtio_blk_data_plane_resume,
 };
 
 static void handle_notify(EventNotifier *e)
@@ -98,6 +118,9 @@ static void handle_notify(EventNotifier *e)
     VirtIOBlock *vblk = VIRTIO_BLK(s->vdev);
 
     event_notifier_test_and_clear(&s->host_notifier);
+    if (vblk->paused) {
+        return;
+    }
     blk_io_plug(s->conf->conf.blk);
     for (;;) {
         MultiReqBuffer mrb = {};
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index f4a9d19..4bc1b2a 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -59,8 +59,38 @@ static void virtio_blk_complete_request(VirtIOBlockReq *req,
     virtio_notify(vdev, s->vq);
 }
 
+typedef struct {
+    QEMUBH *bh;
+    VirtIOBlock *s;
+} VirtIOBlockResumeData;
+
+static void virtio_blk_handle_output(VirtIODevice *vdev, VirtQueue *vq);
+static void virtio_blk_resume_bh_cb(void *opaque)
+{
+    VirtIOBlockResumeData *data = opaque;
+    qemu_bh_delete(data->bh);
+    virtio_blk_handle_output(VIRTIO_DEVICE(data->s), data->s->vq);
+}
+
+static void virtio_blk_pause(VirtIOBlock *vblk)
+{
+    /* TODO: stop ioeventfd */
+}
+
+static void virtio_blk_resume(VirtIOBlock *vblk)
+{
+    VirtIOBlockResumeData *data = g_new(VirtIOBlockResumeData, 1);
+    data->bh = aio_bh_new(blk_get_aio_context(vblk->blk),
+            virtio_blk_resume_bh_cb, data);
+    data->s = vblk;
+    data->s->paused = false;
+    qemu_bh_schedule(data->bh);
+}
+
 static const VirtIOBlockOps virtio_blk_ops = (VirtIOBlockOps) {
-    .complete_request = virtio_blk_complete_request,
+    .complete_request           = virtio_blk_complete_request,
+    .pause                      = virtio_blk_pause,
+    .resume                     = virtio_blk_resume,
 };
 
 static void virtio_blk_req_complete(VirtIOBlockReq *req, unsigned char status)
@@ -597,6 +627,9 @@ static void virtio_blk_handle_output(VirtIODevice *vdev, VirtQueue *vq)
     VirtIOBlockReq *req;
     MultiReqBuffer mrb = {};
 
+    if (s->paused) {
+        return;
+    }
     /* Some guests kick before setting VIRTIO_CONFIG_S_DRIVER_OK so start
      * dataplane here instead of waiting for .set_status().
      */
@@ -787,7 +820,7 @@ static void virtio_blk_save(QEMUFile *f, void *opaque)
 
     virtio_save(vdev, f);
 }
-    
+
 static void virtio_blk_save_device(VirtIODevice *vdev, QEMUFile *f)
 {
     VirtIOBlock *s = VIRTIO_BLK(vdev);
@@ -875,6 +908,29 @@ static void virtio_blk_migration_state_changed(Notifier *notifier, void *data)
     }
 }
 
+static void virtio_blk_op_blocker_changed(Notifier *notifier, void *opaque)
+{
+    BlockOpEvent *event = opaque;
+    VirtIOBlock *s = container_of(notifier, VirtIOBlock,
+                                  op_blocker_notifier);
+    bool pause;
+
+    if (event->type != BLOCK_OP_TYPE_DEVICE_IO) {
+        return;
+    }
+    pause = event->blocking || blk_op_is_blocked(s->blk,
+                                                 BLOCK_OP_TYPE_DEVICE_IO,
+                                                 NULL);
+    if (pause == s->paused) {
+        return;
+    }
+    if (pause) {
+        s->ops->pause(s);
+    } else {
+        s->ops->resume(s);
+    }
+}
+
 static void virtio_blk_device_realize(DeviceState *dev, Error **errp)
 {
     VirtIODevice *vdev = VIRTIO_DEVICE(dev);
@@ -926,6 +982,9 @@ static void virtio_blk_device_realize(DeviceState *dev, Error **errp)
     blk_set_guest_block_size(s->blk, s->conf.conf.logical_block_size);
 
     blk_iostatus_enable(s->blk);
+
+    s->op_blocker_notifier.notify = virtio_blk_op_blocker_changed;
+    blk_op_blocker_add_notifier(s->blk, &s->op_blocker_notifier);
 }
 
 static void virtio_blk_device_unrealize(DeviceState *dev, Error **errp)
diff --git a/include/hw/virtio/virtio-blk.h b/include/hw/virtio/virtio-blk.h
index 28b3436..aa15fea 100644
--- a/include/hw/virtio/virtio-blk.h
+++ b/include/hw/virtio/virtio-blk.h
@@ -42,12 +42,16 @@ struct VirtIOBlkConf
 };
 
 struct VirtIOBlockDataPlane;
-
+struct VirtIOBlock;
 struct VirtIOBlockReq;
 
 typedef struct {
     /* Function to push to vq and notify guest */
     void (*complete_request)(struct VirtIOBlockReq *req, unsigned char status);
+
+    /* Functions to pause/resume request handling */
+    void (*pause)(struct VirtIOBlock *vblk);
+    void (*resume)(struct VirtIOBlock *vblk);
 } VirtIOBlockOps;
 
 typedef struct VirtIOBlock {
@@ -62,6 +66,8 @@ typedef struct VirtIOBlock {
     VMChangeStateEntry *change;
     const VirtIOBlockOps *ops;
     Notifier migration_state_notifier;
+    Notifier op_blocker_notifier;
+    bool paused;
     struct VirtIOBlockDataPlane *dataplane;
 } VirtIOBlock;
 
-- 
2.4.0


* [Qemu-devel] [PATCH v2 06/11] nbd-server: Clear "can_read" when "device io" blocker is set
  2015-05-13 17:28 [Qemu-devel] [PATCH v2 00/11] Fix transactional snapshot with virtio-blk dataplane and NBD export Fam Zheng
                   ` (4 preceding siblings ...)
  2015-05-13 17:28 ` [Qemu-devel] [PATCH v2 05/11] virtio-blk: Don't handle output when there is "device IO" op blocker Fam Zheng
@ 2015-05-13 17:28 ` Fam Zheng
  2015-05-13 17:28 ` [Qemu-devel] [PATCH v2 07/11] blockdev: Block device IO during internal snapshot transaction Fam Zheng
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 33+ messages in thread
From: Fam Zheng @ 2015-05-13 17:28 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, qemu-block, jcody, armbru, mreitz, Stefan Hajnoczi, pbonzini

So that NBD export cannot submit IO during bdrv_drain_all().

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 nbd.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/nbd.c b/nbd.c
index 06b501b..7d9d3e4 100644
--- a/nbd.c
+++ b/nbd.c
@@ -160,6 +160,8 @@ struct NBDExport {
     uint32_t nbdflags;
     QTAILQ_HEAD(, NBDClient) clients;
     QTAILQ_ENTRY(NBDExport) next;
+    Notifier blocker_notifier;
+    bool io_blocked;
 
     AioContext *ctx;
 };
@@ -1053,6 +1055,16 @@ static void blk_aio_detach(void *opaque)
     exp->ctx = NULL;
 }
 
+static void nbd_op_blocker_changed(Notifier *notifier, void *data)
+{
+    BlockOpEvent *event = data;
+    NBDExport *exp = container_of(notifier, NBDExport, blocker_notifier);
+    if (event->type != BLOCK_OP_TYPE_DEVICE_IO) {
+        return;
+    }
+    exp->io_blocked = event->blocking;
+}
+
 NBDExport *nbd_export_new(BlockBackend *blk, off_t dev_offset, off_t size,
                           uint32_t nbdflags, void (*close)(NBDExport *),
                           Error **errp)
@@ -1081,6 +1093,8 @@ NBDExport *nbd_export_new(BlockBackend *blk, off_t dev_offset, off_t size,
      * access since the export could be available before migration handover.
      */
     blk_invalidate_cache(blk, NULL);
+    exp->blocker_notifier.notify = nbd_op_blocker_changed;
+    blk_op_blocker_add_notifier(blk, &exp->blocker_notifier);
     return exp;
 
 fail:
@@ -1132,6 +1146,7 @@ void nbd_export_close(NBDExport *exp)
     nbd_export_set_name(exp, NULL);
     nbd_export_put(exp);
     if (exp->blk) {
+        notifier_remove(&exp->blocker_notifier);
         blk_remove_aio_context_notifier(exp->blk, blk_aio_attached,
                                         blk_aio_detach, exp);
         blk_unref(exp->blk);
@@ -1455,6 +1470,9 @@ static void nbd_update_can_read(NBDClient *client)
     bool can_read = client->recv_coroutine ||
                     client->nb_requests < MAX_NBD_REQUESTS;
 
+    if (client->exp && client->exp->io_blocked) {
+        can_read = false;
+    }
     if (can_read != client->can_read) {
         client->can_read = can_read;
         nbd_set_handlers(client);
-- 
2.4.0


* [Qemu-devel] [PATCH v2 07/11] blockdev: Block device IO during internal snapshot transaction
  2015-05-13 17:28 [Qemu-devel] [PATCH v2 00/11] Fix transactional snapshot with virtio-blk dataplane and NBD export Fam Zheng
                   ` (5 preceding siblings ...)
  2015-05-13 17:28 ` [Qemu-devel] [PATCH v2 06/11] nbd-server: Clear "can_read" when "device io" blocker is set Fam Zheng
@ 2015-05-13 17:28 ` Fam Zheng
  2015-05-13 17:28 ` [Qemu-devel] [PATCH v2 08/11] blockdev: Block device IO during external " Fam Zheng
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 33+ messages in thread
From: Fam Zheng @ 2015-05-13 17:28 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, qemu-block, jcody, armbru, mreitz, Stefan Hajnoczi, pbonzini

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 blockdev.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 5eaf77e..7f763d9 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1262,6 +1262,7 @@ typedef struct InternalSnapshotState {
     BlockDriverState *bs;
     AioContext *aio_context;
     QEMUSnapshotInfo sn;
+    Error *blocker;
 } InternalSnapshotState;
 
 static void internal_snapshot_prepare(BlkTransactionState *common,
@@ -1300,6 +1301,10 @@ static void internal_snapshot_prepare(BlkTransactionState *common,
     state->aio_context = bdrv_get_aio_context(bs);
     aio_context_acquire(state->aio_context);
 
+    state->bs = bs;
+    error_setg(&state->blocker, "internal snapshot in progress");
+    bdrv_op_block(bs, BLOCK_OP_TYPE_DEVICE_IO, state->blocker);
+
     if (!bdrv_is_inserted(bs)) {
         error_set(errp, QERR_DEVICE_HAS_NO_MEDIUM, device);
         return;
@@ -1354,9 +1359,6 @@ static void internal_snapshot_prepare(BlkTransactionState *common,
                          name, device);
         return;
     }
-
-    /* 4. succeed, mark a snapshot is created */
-    state->bs = bs;
 }
 
 static void internal_snapshot_abort(BlkTransactionState *common)
@@ -1387,6 +1389,10 @@ static void internal_snapshot_clean(BlkTransactionState *common)
     InternalSnapshotState *state = DO_UPCAST(InternalSnapshotState,
                                              common, common);
 
+    if (state->bs) {
+        bdrv_op_unblock(state->bs, BLOCK_OP_TYPE_DEVICE_IO, state->blocker);
+        error_free(state->blocker);
+    }
     if (state->aio_context) {
         aio_context_release(state->aio_context);
     }
-- 
2.4.0


* [Qemu-devel] [PATCH v2 08/11] blockdev: Block device IO during external snapshot transaction
  2015-05-13 17:28 [Qemu-devel] [PATCH v2 00/11] Fix transactional snapshot with virtio-blk dataplane and NBD export Fam Zheng
                   ` (6 preceding siblings ...)
  2015-05-13 17:28 ` [Qemu-devel] [PATCH v2 07/11] blockdev: Block device IO during internal snapshot transaction Fam Zheng
@ 2015-05-13 17:28 ` Fam Zheng
  2015-05-13 17:28 ` [Qemu-devel] [PATCH v2 09/11] blockdev: Block device IO during drive-backup transaction Fam Zheng
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 33+ messages in thread
From: Fam Zheng @ 2015-05-13 17:28 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, qemu-block, jcody, armbru, mreitz, Stefan Hajnoczi, pbonzini

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 blockdev.c | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 7f763d9..923fc90 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1404,6 +1404,7 @@ typedef struct ExternalSnapshotState {
     BlockDriverState *old_bs;
     BlockDriverState *new_bs;
     AioContext *aio_context;
+    Error *blocker;
 } ExternalSnapshotState;
 
 static void external_snapshot_prepare(BlkTransactionState *common,
@@ -1525,6 +1526,9 @@ static void external_snapshot_prepare(BlkTransactionState *common,
     if (ret != 0) {
         error_propagate(errp, local_err);
     }
+
+    error_setg(&state->blocker, "snapshot in progress");
+    bdrv_op_block(state->old_bs, BLOCK_OP_TYPE_DEVICE_IO, state->blocker);
 }
 
 static void external_snapshot_commit(BlkTransactionState *common)
@@ -1541,8 +1545,6 @@ static void external_snapshot_commit(BlkTransactionState *common)
      * don't want to abort all of them if one of them fails the reopen */
     bdrv_reopen(state->new_bs, state->new_bs->open_flags & ~BDRV_O_RDWR,
                 NULL);
-
-    aio_context_release(state->aio_context);
 }
 
 static void external_snapshot_abort(BlkTransactionState *common)
@@ -1552,6 +1554,17 @@ static void external_snapshot_abort(BlkTransactionState *common)
     if (state->new_bs) {
         bdrv_unref(state->new_bs);
     }
+}
+
+static void external_snapshot_clean(BlkTransactionState *common)
+{
+    ExternalSnapshotState *state =
+                             DO_UPCAST(ExternalSnapshotState, common, common);
+
+    if (state->old_bs) {
+        bdrv_op_unblock(state->old_bs, BLOCK_OP_TYPE_DEVICE_IO, state->blocker);
+        error_free(state->blocker);
+    }
     if (state->aio_context) {
         aio_context_release(state->aio_context);
     }
@@ -1716,6 +1729,7 @@ static const BdrvActionOps actions[] = {
         .prepare  = external_snapshot_prepare,
         .commit   = external_snapshot_commit,
         .abort = external_snapshot_abort,
+        .clean = external_snapshot_clean,
     },
     [TRANSACTION_ACTION_KIND_DRIVE_BACKUP] = {
         .instance_size = sizeof(DriveBackupState),
-- 
2.4.0


* [Qemu-devel] [PATCH v2 09/11] blockdev: Block device IO during drive-backup transaction
  2015-05-13 17:28 [Qemu-devel] [PATCH v2 00/11] Fix transactional snapshot with virtio-blk dataplane and NBD export Fam Zheng
                   ` (7 preceding siblings ...)
  2015-05-13 17:28 ` [Qemu-devel] [PATCH v2 08/11] blockdev: Block device IO during external " Fam Zheng
@ 2015-05-13 17:28 ` Fam Zheng
  2015-05-13 17:28 ` [Qemu-devel] [PATCH v2 10/11] blockdev: Block device IO during blockdev-backup transaction Fam Zheng
  2015-05-13 17:28 ` [Qemu-devel] [PATCH v2 11/11] block: Block "device IO" during bdrv_drain and bdrv_drain_all Fam Zheng
  10 siblings, 0 replies; 33+ messages in thread
From: Fam Zheng @ 2015-05-13 17:28 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, qemu-block, jcody, armbru, mreitz, Stefan Hajnoczi, pbonzini

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 blockdev.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/blockdev.c b/blockdev.c
index 923fc90..ae52d27 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1575,6 +1575,7 @@ typedef struct DriveBackupState {
     BlockDriverState *bs;
     AioContext *aio_context;
     BlockJob *job;
+    Error *blocker;
 } DriveBackupState;
 
 static void drive_backup_prepare(BlkTransactionState *common, Error **errp)
@@ -1599,6 +1600,9 @@ static void drive_backup_prepare(BlkTransactionState *common, Error **errp)
     state->aio_context = bdrv_get_aio_context(bs);
     aio_context_acquire(state->aio_context);
 
+    state->bs = bs;
+    error_setg(&state->blocker, "drive-backup in progress");
+    bdrv_op_block(bs, BLOCK_OP_TYPE_DEVICE_IO, state->blocker);
     qmp_drive_backup(backup->device, backup->target,
                      backup->has_format, backup->format,
                      backup->sync,
@@ -1613,7 +1617,6 @@ static void drive_backup_prepare(BlkTransactionState *common, Error **errp)
         return;
     }
 
-    state->bs = bs;
     state->job = state->bs->job;
 }
 
@@ -1632,6 +1635,10 @@ static void drive_backup_clean(BlkTransactionState *common)
 {
     DriveBackupState *state = DO_UPCAST(DriveBackupState, common, common);
 
+    if (state->bs) {
+        bdrv_op_unblock(state->bs, BLOCK_OP_TYPE_DEVICE_IO, state->blocker);
+        error_free(state->blocker);
+    }
     if (state->aio_context) {
         aio_context_release(state->aio_context);
     }
-- 
2.4.0


* [Qemu-devel] [PATCH v2 10/11] blockdev: Block device IO during blockdev-backup transaction
  2015-05-13 17:28 [Qemu-devel] [PATCH v2 00/11] Fix transactional snapshot with virtio-blk dataplane and NBD export Fam Zheng
                   ` (8 preceding siblings ...)
  2015-05-13 17:28 ` [Qemu-devel] [PATCH v2 09/11] blockdev: Block device IO during drive-backup transaction Fam Zheng
@ 2015-05-13 17:28 ` Fam Zheng
  2015-05-13 11:22   ` Wen Congyang
  2015-05-13 12:21   ` Paolo Bonzini
  2015-05-13 17:28 ` [Qemu-devel] [PATCH v2 11/11] block: Block "device IO" during bdrv_drain and bdrv_drain_all Fam Zheng
  10 siblings, 2 replies; 33+ messages in thread
From: Fam Zheng @ 2015-05-13 17:28 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, qemu-block, jcody, armbru, mreitz, Stefan Hajnoczi, pbonzini

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 blockdev.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/blockdev.c b/blockdev.c
index ae52d27..bd28183 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1649,6 +1649,7 @@ typedef struct BlockdevBackupState {
     BlockDriverState *bs;
     BlockJob *job;
     AioContext *aio_context;
+    Error *blocker;
 } BlockdevBackupState;
 
 static void blockdev_backup_prepare(BlkTransactionState *common, Error **errp)
@@ -1685,6 +1686,10 @@ static void blockdev_backup_prepare(BlkTransactionState *common, Error **errp)
     }
     aio_context_acquire(state->aio_context);
 
+    state->bs = bs;
+    error_setg(&state->blocker, "blockdev-backup in progress");
+    bdrv_op_block(bs, BLOCK_OP_TYPE_DEVICE_IO, state->blocker);
+
     qmp_blockdev_backup(backup->device, backup->target,
                         backup->sync,
                         backup->has_speed, backup->speed,
@@ -1696,7 +1701,6 @@ static void blockdev_backup_prepare(BlkTransactionState *common, Error **errp)
         return;
     }
 
-    state->bs = bs;
     state->job = state->bs->job;
 }
 
@@ -1715,6 +1719,10 @@ static void blockdev_backup_clean(BlkTransactionState *common)
 {
     BlockdevBackupState *state = DO_UPCAST(BlockdevBackupState, common, common);
 
+    if (state->bs) {
+        bdrv_op_unblock(state->bs, BLOCK_OP_TYPE_DEVICE_IO, state->blocker);
+        error_free(state->blocker);
+    }
     if (state->aio_context) {
         aio_context_release(state->aio_context);
     }
-- 
2.4.0


* [Qemu-devel] [PATCH v2 11/11] block: Block "device IO" during bdrv_drain and bdrv_drain_all
  2015-05-13 17:28 [Qemu-devel] [PATCH v2 00/11] Fix transactional snapshot with virtio-blk dataplane and NBD export Fam Zheng
                   ` (9 preceding siblings ...)
  2015-05-13 17:28 ` [Qemu-devel] [PATCH v2 10/11] blockdev: Block device IO during blockdev-backup transaction Fam Zheng
@ 2015-05-13 17:28 ` Fam Zheng
  2015-05-13 10:26   ` Paolo Bonzini
  10 siblings, 1 reply; 33+ messages in thread
From: Fam Zheng @ 2015-05-13 17:28 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, qemu-block, jcody, armbru, mreitz, Stefan Hajnoczi, pbonzini

We don't want new requests from the guest, so block device IO around the
nested poll.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 block/io.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/block/io.c b/block/io.c
index 1ce62c4..d369de3 100644
--- a/block/io.c
+++ b/block/io.c
@@ -289,9 +289,15 @@ static bool bdrv_drain_one(BlockDriverState *bs)
  */
 void bdrv_drain(BlockDriverState *bs)
 {
+    Error *blocker = NULL;
+
+    error_setg(&blocker, "bdrv_drain in progress");
+    bdrv_op_block(bs, BLOCK_OP_TYPE_DEVICE_IO, blocker);
     while (bdrv_drain_one(bs)) {
         /* Keep iterating */
     }
+    bdrv_op_unblock(bs, BLOCK_OP_TYPE_DEVICE_IO, blocker);
+    error_free(blocker);
 }
 
 /*
@@ -311,6 +317,9 @@ void bdrv_drain_all(void)
     /* Always run first iteration so any pending completion BHs run */
     bool busy = true;
     BlockDriverState *bs = NULL;
+    Error *blocker = NULL;
+
+    error_setg(&blocker, "bdrv_drain_all in progress");
 
     while ((bs = bdrv_next(bs))) {
         AioContext *aio_context = bdrv_get_aio_context(bs);
@@ -319,6 +328,7 @@ void bdrv_drain_all(void)
         if (bs->job) {
             block_job_pause(bs->job);
         }
+        bdrv_op_block(bs, BLOCK_OP_TYPE_DEVICE_IO, blocker);
         aio_context_release(aio_context);
     }
 
@@ -343,8 +353,10 @@ void bdrv_drain_all(void)
         if (bs->job) {
             block_job_resume(bs->job);
         }
+        bdrv_op_unblock(bs, BLOCK_OP_TYPE_DEVICE_IO, blocker);
         aio_context_release(aio_context);
     }
+    error_free(blocker);
 }
 
 /**
-- 
2.4.0


* Re: [Qemu-devel] [PATCH v2 10/11] blockdev: Block device IO during blockdev-backup transaction
  2015-05-13 12:55     ` Fam Zheng
@ 2015-05-14  1:12       ` Wen Congyang
  2015-05-14  1:53         ` Fam Zheng
  0 siblings, 1 reply; 33+ messages in thread
From: Wen Congyang @ 2015-05-14  1:12 UTC (permalink / raw)
  To: Fam Zheng
  Cc: Kevin Wolf, qemu-block, armbru, jcody, qemu-devel, mreitz,
	Stefan Hajnoczi, pbonzini

On 05/13/2015 08:55 PM, Fam Zheng wrote:
> On Wed, 05/13 19:22, Wen Congyang wrote:
>> On 05/14/2015 01:28 AM, Fam Zheng wrote:
>>> Signed-off-by: Fam Zheng <famz@redhat.com>
>>> ---
>>>  blockdev.c | 10 +++++++++-
>>>  1 file changed, 9 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/blockdev.c b/blockdev.c
>>> index ae52d27..bd28183 100644
>>> --- a/blockdev.c
>>> +++ b/blockdev.c
>>> @@ -1649,6 +1649,7 @@ typedef struct BlockdevBackupState {
>>>      BlockDriverState *bs;
>>>      BlockJob *job;
>>>      AioContext *aio_context;
>>> +    Error *blocker;
>>>  } BlockdevBackupState;
>>>  
>>>  static void blockdev_backup_prepare(BlkTransactionState *common, Error **errp)
>>> @@ -1685,6 +1686,10 @@ static void blockdev_backup_prepare(BlkTransactionState *common, Error **errp)
>>>      }
>>>      aio_context_acquire(state->aio_context);
>>>  
>>> +    state->bs = bs;
>>> +    error_setg(&state->blocker, "blockdev-backup in progress");
>>> +    bdrv_op_block(bs, BLOCK_OP_TYPE_DEVICE_IO, state->blocker);
>>> +
>>
>> Did you test this patch? You need to read from bs to do the backup!!
>> If the mode is none, you also need to write to bs!!
> 
> This blocker is only temporary and is removed in blockdev_backup_clean before
> qmp_transaction returns.

Yes. Another question:
We will use bdrv_op_block_all() in the job, which doesn't unblock
BLOCK_OP_TYPE_DEVICE_IO. Is that OK?

Thanks
Wen Congyang

> 
> Fam
> 
>>
>> Thanks
>> Wen Congyang
>>
>>>      qmp_blockdev_backup(backup->device, backup->target,
>>>                          backup->sync,
>>>                          backup->has_speed, backup->speed,
>>> @@ -1696,7 +1701,6 @@ static void blockdev_backup_prepare(BlkTransactionState *common, Error **errp)
>>>          return;
>>>      }
>>>  
>>> -    state->bs = bs;
>>>      state->job = state->bs->job;
>>>  }
>>>  
>>> @@ -1715,6 +1719,10 @@ static void blockdev_backup_clean(BlkTransactionState *common)
>>>  {
>>>      BlockdevBackupState *state = DO_UPCAST(BlockdevBackupState, common, common);
>>>  
>>> +    if (state->bs) {
>>> +        bdrv_op_unblock(state->bs, BLOCK_OP_TYPE_DEVICE_IO, state->blocker);
>>> +        error_free(state->blocker);
>>> +    }
>>>      if (state->aio_context) {
>>>          aio_context_release(state->aio_context);
>>>      }
>>>
>>
> .
> 


* Re: [Qemu-devel] [PATCH v2 10/11] blockdev: Block device IO during blockdev-backup transaction
  2015-05-14  1:12       ` Wen Congyang
@ 2015-05-14  1:53         ` Fam Zheng
  0 siblings, 0 replies; 33+ messages in thread
From: Fam Zheng @ 2015-05-14  1:53 UTC (permalink / raw)
  To: Wen Congyang
  Cc: Kevin Wolf, qemu-block, armbru, jcody, qemu-devel, mreitz,
	Stefan Hajnoczi, pbonzini

On Thu, 05/14 09:12, Wen Congyang wrote:
> We will use bdrv_op_block_all() in the job, which doesn't unblock
> BLOCK_OP_TYPE_DEVICE_IO. Is that OK?

Good question and you're right, it's broken in this series. I will fix it.

Fam


* Re: [Qemu-devel] [PATCH v2 01/11] block: Add op blocker type "device IO"
  2015-05-13 15:09       ` Paolo Bonzini
@ 2015-05-14  2:40         ` Fam Zheng
  0 siblings, 0 replies; 33+ messages in thread
From: Fam Zheng @ 2015-05-14  2:40 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Kevin Wolf, qemu-block, qemu-devel, jcody, armbru, mreitz,
	Stefan Hajnoczi

On Wed, 05/13 17:09, Paolo Bonzini wrote:
> 
> 
> On 13/05/2015 17:02, Fam Zheng wrote:
> >> > For example, SCSI requests can result in many consecutive I/Os:
> >> > 
> >> > (1) FUA requests are split in write+flush
> >> > 
> >> > (2) adapters that do not use QEMUSGList-based I/O only read 128K at a time
> >> > 
> >> > (3) WRITE SAME operations are also split in chunks
> >> > 
> >> > (4) UNMAP operations process one descriptor at a time
> > I don't understand the point of these examples. If we don't return -EBUSY here,
> > the request will sneak into block/io.c and perhaps break qmp transaction
> > semantics, if it lands between two backups.
> 
> It won't, because after blocking DEVICE_IO you will always drain I/O and
> the bdrv_drain will loop until the above are all satisfied.

That's right, I'll drop this patch.

Fam


* Re: [Qemu-devel] [PATCH v2 11/11] block: Block "device IO" during bdrv_drain and bdrv_drain_all
  2015-05-13 15:25           ` Paolo Bonzini
@ 2015-05-14  3:02             ` Fam Zheng
  0 siblings, 0 replies; 33+ messages in thread
From: Fam Zheng @ 2015-05-14  3:02 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Kevin Wolf, qemu-block, qemu-devel, jcody, armbru, mreitz,
	Stefan Hajnoczi

On Wed, 05/13 17:25, Paolo Bonzini wrote:
> 
> 
> On 13/05/2015 17:17, Fam Zheng wrote:
> >> > 
> >> > It can be the topic of a separate series.  But this patch brings a
> >> > false sense of security (either the blocker is unnecessary, or it
> >> > needs to last after bdrv_drain returns), so I think it should be
> >> > dropped.
> > Doesn't this let bdrv_drain_all return while virtio-blk-dataplane is under heavy
> > workload, in the places where you say "the blocker is unnecessary"?
> 
> Yes, you're right.  Please document it in the commit message and the
> code, it's tricky.

OK, will do it.

Fam


* Re: [Qemu-devel] [PATCH v2 02/11] block: Add op blocker notifier list
  2015-05-13 17:28 ` [Qemu-devel] [PATCH v2 02/11] block: Add op blocker notifier list Fam Zheng
@ 2015-05-14  5:32   ` Fam Zheng
  0 siblings, 0 replies; 33+ messages in thread
From: Fam Zheng @ 2015-05-14  5:32 UTC (permalink / raw)
  To: qemu-devel

On Wed, 05/13 17:28, Fam Zheng wrote:
> +static void bdrv_op_blocker_notify(BlockDriverState *bs, BlockOpType op,
> +                                   Error *reason, bool blocking)
> +{
> +    BlockOpEvent event = (BlockOpEvent) {
> +        op = op,
> +        reason = reason,
> +        blocking = true,

s/true/blocking/

> +    };
> +
> +    notifier_list_notify(&bs->op_blocker_notifiers, &event);
> +}
> +


Thread overview: 33+ messages (newest: 2015-05-14  5:32 UTC)
2015-05-13 17:28 [Qemu-devel] [PATCH v2 00/11] Fix transactional snapshot with virtio-blk dataplane and NBD export Fam Zheng
2015-05-13 17:28 ` [Qemu-devel] [PATCH v2 01/11] block: Add op blocker type "device IO" Fam Zheng
2015-05-13 11:32   ` Wen Congyang
2015-05-13 12:04   ` Paolo Bonzini
2015-05-13 15:02     ` Fam Zheng
2015-05-13 15:09       ` Paolo Bonzini
2015-05-14  2:40         ` Fam Zheng
2015-05-13 17:28 ` [Qemu-devel] [PATCH v2 02/11] block: Add op blocker notifier list Fam Zheng
2015-05-14  5:32   ` Fam Zheng
2015-05-13 17:28 ` [Qemu-devel] [PATCH v2 03/11] block-backend: Add blk_op_blocker_add_notifier Fam Zheng
2015-05-13 17:28 ` [Qemu-devel] [PATCH v2 04/11] virtio-blk: Move complete_request to 'ops' structure Fam Zheng
2015-05-13 17:28 ` [Qemu-devel] [PATCH v2 05/11] virtio-blk: Don't handle output when there is "device IO" op blocker Fam Zheng
2015-05-13 10:26   ` Paolo Bonzini
2015-05-13 11:09     ` Fam Zheng
2015-05-13 17:28 ` [Qemu-devel] [PATCH v2 06/11] nbd-server: Clear "can_read" when "device io" blocker is set Fam Zheng
2015-05-13 17:28 ` [Qemu-devel] [PATCH v2 07/11] blockdev: Block device IO during internal snapshot transaction Fam Zheng
2015-05-13 17:28 ` [Qemu-devel] [PATCH v2 08/11] blockdev: Block device IO during external " Fam Zheng
2015-05-13 17:28 ` [Qemu-devel] [PATCH v2 09/11] blockdev: Block device IO during drive-backup transaction Fam Zheng
2015-05-13 17:28 ` [Qemu-devel] [PATCH v2 10/11] blockdev: Block device IO during blockdev-backup transaction Fam Zheng
2015-05-13 11:22   ` Wen Congyang
2015-05-13 12:55     ` Fam Zheng
2015-05-14  1:12       ` Wen Congyang
2015-05-14  1:53         ` Fam Zheng
2015-05-13 12:21   ` Paolo Bonzini
2015-05-13 15:08     ` Paolo Bonzini
2015-05-13 15:34       ` Fam Zheng
2015-05-13 17:28 ` [Qemu-devel] [PATCH v2 11/11] block: Block "device IO" during bdrv_drain and bdrv_drain_all Fam Zheng
2015-05-13 10:26   ` Paolo Bonzini
2015-05-13 11:08     ` Fam Zheng
2015-05-13 11:33       ` Paolo Bonzini
2015-05-13 15:17         ` Fam Zheng
2015-05-13 15:25           ` Paolo Bonzini
2015-05-14  3:02             ` Fam Zheng
