All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH v4 0/8] block: Mirror discarded sectors
@ 2015-05-22  3:40 Fam Zheng
  2015-05-22  3:40 ` [Qemu-devel] [PATCH v4 1/8] block: Add "base" option to bdrv_get_block_status Fam Zheng
                   ` (7 more replies)
  0 siblings, 8 replies; 18+ messages in thread
From: Fam Zheng @ 2015-05-22  3:40 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, qemu-block, Jeff Cody, qemu-stable, Stefan Hajnoczi,
	pbonzini, jsnow, wangxiaolong

v4: Add Paolo's rev-by in 2,4.
    Add "unmap" option to drive-mirror.
    Introduce "above" to bdrv_get_block_status and use it in mirror.c

v3: Add John's rev-by in patches 3~6.
    Rewrite patch 1: discard may not suffice, use write zeroes in that case.

v2: Fix typo and add Eric's rev-by in patch 3.
    Add patch 1 to discard target in mirror job. (Paolo)
    Add patch 6 to improve iotests.wait_ready. (John)

This fixes the mirror assert failure reported by wangxiaolong:

https://lists.gnu.org/archive/html/qemu-devel/2015-04/msg04458.html

The direct cause is that hbitmap code couldn't handle unset of bits *after*
iterator's current position. We could fix that, but the bdrv_reset_dirty() call
is more questionable:

Before, if guest discarded some sectors during migration, it could see
different data after moving to dest side, depending on block backends of the
src and the dest. This is IMO worse than mirroring the actual reading as done
in this series, because we don't know what the guest is doing.

For example if a guest first issues WRITE SAME to wipe out the area then issues
UNMAP to discard it, just to get rid of some sensitive data completely, we may
miss both operations and leave stale data on dest image.


Fam Zheng (8):
  block: Add "base" option to bdrv_get_block_status
  qmp: Add optional bool "unmap" to drive-mirror
  mirror: Do zero write on target if sectors not allocated
  block: Fix dirty bitmap in bdrv_co_discard
  block: Remove bdrv_reset_dirty
  qemu-iotests: Make block job methods common
  qemu-iotests: Add test case for mirror with unmap
  iotests: Use event_wait in wait_ready

 block.c                       | 12 --------
 block/io.c                    | 43 ++++++++++++++++++++++------
 block/mirror.c                | 26 ++++++++++++++---
 blockdev.c                    |  5 ++++
 hmp.c                         |  2 +-
 include/block/block.h         |  3 +-
 include/block/block_int.h     |  4 +--
 qapi/block-core.json          |  5 +++-
 qemu-img.c                    |  6 ++--
 qmp-commands.hx               |  3 ++
 tests/qemu-iotests/041        | 66 ++++++++++---------------------------------
 tests/qemu-iotests/131        | 59 ++++++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/131.out    |  5 ++++
 tests/qemu-iotests/group      |  1 +
 tests/qemu-iotests/iotests.py | 23 +++++++++++++++
 15 files changed, 180 insertions(+), 83 deletions(-)
 create mode 100644 tests/qemu-iotests/131
 create mode 100644 tests/qemu-iotests/131.out

-- 
2.4.1

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [Qemu-devel] [PATCH v4 1/8] block: Add "base" option to bdrv_get_block_status
  2015-05-22  3:40 [Qemu-devel] [PATCH v4 0/8] block: Mirror discarded sectors Fam Zheng
@ 2015-05-22  3:40 ` Fam Zheng
  2015-05-22 19:32   ` Eric Blake
  2015-05-25 14:43   ` Paolo Bonzini
  2015-05-22  3:40 ` [Qemu-devel] [PATCH v4 2/8] qmp: Add optional bool "unmap" to drive-mirror Fam Zheng
                   ` (6 subsequent siblings)
  7 siblings, 2 replies; 18+ messages in thread
From: Fam Zheng @ 2015-05-22  3:40 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, qemu-block, Jeff Cody, qemu-stable, Stefan Hajnoczi,
	pbonzini, jsnow, wangxiaolong

Now this function follows the backing chain until seeing BDRV_BLOCK_ALLOCATED.
Base is not included, and it can be NULL just like bdrv_is_allocated_above().

Existing callers pass in bs->backing_hd to keep the old behavior.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 block/io.c            | 39 ++++++++++++++++++++++++++++++++-------
 include/block/block.h |  3 ++-
 qemu-img.c            |  6 ++++--
 3 files changed, 38 insertions(+), 10 deletions(-)

diff --git a/block/io.c b/block/io.c
index 1ce62c4..36a4007 100644
--- a/block/io.c
+++ b/block/io.c
@@ -673,7 +673,8 @@ int bdrv_make_zero(BlockDriverState *bs, BdrvRequestFlags flags)
         if (nb_sectors <= 0) {
             return 0;
         }
-        ret = bdrv_get_block_status(bs, sector_num, nb_sectors, &n);
+        ret = bdrv_get_block_status(bs, bs->backing_hd, sector_num,
+                                    nb_sectors, &n);
         if (ret < 0) {
             error_report("error getting block status at sector %" PRId64 ": %s",
                          sector_num, strerror(-ret));
@@ -1450,7 +1451,8 @@ static int64_t coroutine_fn bdrv_co_get_block_status(BlockDriverState *bs,
 
     if (ret & BDRV_BLOCK_RAW) {
         assert(ret & BDRV_BLOCK_OFFSET_VALID);
-        return bdrv_get_block_status(bs->file, ret >> BDRV_SECTOR_BITS,
+        return bdrv_get_block_status(bs->file, bs->file->backing_hd,
+                                     ret >> BDRV_SECTOR_BITS,
                                      *pnum, pnum);
     }
 
@@ -1497,14 +1499,34 @@ static int64_t coroutine_fn bdrv_co_get_block_status(BlockDriverState *bs,
     return ret;
 }
 
+static int64_t coroutine_fn bdrv_co_get_block_status_above(BlockDriverState *bs,
+        BlockDriverState *base,
+        int64_t sector_num,
+        int nb_sectors,
+        int *pnum)
+{
+    BlockDriverState *p;
+    int64_t ret;
+
+    assert(bs != base);
+    for (p = bs; p != base; p = p->backing_hd) {
+        ret = bdrv_co_get_block_status(p, sector_num, nb_sectors, pnum);
+        if (ret < 0 || ret & BDRV_BLOCK_ALLOCATED) {
+            break;
+        }
+    }
+    return ret;
+}
+
 /* Coroutine wrapper for bdrv_get_block_status() */
 static void coroutine_fn bdrv_get_block_status_co_entry(void *opaque)
 {
     BdrvCoGetBlockStatusData *data = opaque;
-    BlockDriverState *bs = data->bs;
 
-    data->ret = bdrv_co_get_block_status(bs, data->sector_num, data->nb_sectors,
-                                         data->pnum);
+    data->ret = bdrv_co_get_block_status_above(data->bs, data->base,
+                                               data->sector_num,
+                                               data->nb_sectors,
+                                               data->pnum);
     data->done = true;
 }
 
@@ -1513,12 +1535,14 @@ static void coroutine_fn bdrv_get_block_status_co_entry(void *opaque)
  *
  * See bdrv_co_get_block_status() for details.
  */
-int64_t bdrv_get_block_status(BlockDriverState *bs, int64_t sector_num,
+int64_t bdrv_get_block_status(BlockDriverState *bs, BlockDriverState *base,
+                              int64_t sector_num,
                               int nb_sectors, int *pnum)
 {
     Coroutine *co;
     BdrvCoGetBlockStatusData data = {
         .bs = bs,
+        .base = base,
         .sector_num = sector_num,
         .nb_sectors = nb_sectors,
         .pnum = pnum,
@@ -1543,7 +1567,8 @@ int64_t bdrv_get_block_status(BlockDriverState *bs, int64_t sector_num,
 int coroutine_fn bdrv_is_allocated(BlockDriverState *bs, int64_t sector_num,
                                    int nb_sectors, int *pnum)
 {
-    int64_t ret = bdrv_get_block_status(bs, sector_num, nb_sectors, pnum);
+    int64_t ret = bdrv_get_block_status(bs, bs->backing_hd,
+                                        sector_num, nb_sectors, pnum);
     if (ret < 0) {
         return ret;
     }
diff --git a/include/block/block.h b/include/block/block.h
index 7d1a717..e1d408c 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -359,7 +359,8 @@ int bdrv_has_zero_init_1(BlockDriverState *bs);
 int bdrv_has_zero_init(BlockDriverState *bs);
 bool bdrv_unallocated_blocks_are_zero(BlockDriverState *bs);
 bool bdrv_can_write_zeroes_with_unmap(BlockDriverState *bs);
-int64_t bdrv_get_block_status(BlockDriverState *bs, int64_t sector_num,
+int64_t bdrv_get_block_status(BlockDriverState *bs, BlockDriverState *base,
+                              int64_t sector_num,
                               int nb_sectors, int *pnum);
 int bdrv_is_allocated(BlockDriverState *bs, int64_t sector_num, int nb_sectors,
                       int *pnum);
diff --git a/qemu-img.c b/qemu-img.c
index 8d30e43..723d8a6 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -1350,7 +1350,8 @@ static int convert_iteration_sectors(ImgConvertState *s, int64_t sector_num)
     n = MIN(s->total_sectors - sector_num, BDRV_REQUEST_MAX_SECTORS);
 
     if (s->sector_next_status <= sector_num) {
-        ret = bdrv_get_block_status(blk_bs(s->src[s->src_cur]),
+        BlockDriverState *bs = blk_bs(s->src[s->src_cur]);
+        ret = bdrv_get_block_status(bs, bs->backing_hd,
                                     sector_num - s->src_cur_offset,
                                     n, &n);
         if (ret < 0) {
@@ -2287,7 +2288,8 @@ static int get_block_status(BlockDriverState *bs, int64_t sector_num,
 
     depth = 0;
     for (;;) {
-        ret = bdrv_get_block_status(bs, sector_num, nb_sectors, &nb_sectors);
+        ret = bdrv_get_block_status(bs, bs->backing_hd, sector_num,
+                                    nb_sectors, &nb_sectors);
         if (ret < 0) {
             return ret;
         }
-- 
2.4.1

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [Qemu-devel] [PATCH v4 2/8] qmp: Add optional bool "unmap" to drive-mirror
  2015-05-22  3:40 [Qemu-devel] [PATCH v4 0/8] block: Mirror discarded sectors Fam Zheng
  2015-05-22  3:40 ` [Qemu-devel] [PATCH v4 1/8] block: Add "base" option to bdrv_get_block_status Fam Zheng
@ 2015-05-22  3:40 ` Fam Zheng
  2015-05-22 19:57   ` Eric Blake
  2015-05-22  3:40 ` [Qemu-devel] [PATCH v4 3/8] mirror: Do zero write on target if sectors not allocated Fam Zheng
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 18+ messages in thread
From: Fam Zheng @ 2015-05-22  3:40 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, qemu-block, Jeff Cody, qemu-stable, Stefan Hajnoczi,
	pbonzini, jsnow, wangxiaolong

If specified as "true", it allows discarding on target sectors where source is
not allocated.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 block/mirror.c            | 7 +++++--
 blockdev.c                | 5 +++++
 hmp.c                     | 2 +-
 include/block/block_int.h | 2 ++
 qapi/block-core.json      | 5 ++++-
 qmp-commands.hx           | 3 +++
 6 files changed, 20 insertions(+), 4 deletions(-)

diff --git a/block/mirror.c b/block/mirror.c
index 58f391a..85995b2 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -57,6 +57,7 @@ typedef struct MirrorBlockJob {
     int in_flight;
     int sectors_in_flight;
     int ret;
+    bool unmap;
 } MirrorBlockJob;
 
 typedef struct MirrorOp {
@@ -651,6 +652,7 @@ static void mirror_start_job(BlockDriverState *bs, BlockDriverState *target,
                              int64_t buf_size,
                              BlockdevOnError on_source_error,
                              BlockdevOnError on_target_error,
+                             bool unmap,
                              BlockCompletionFunc *cb,
                              void *opaque, Error **errp,
                              const BlockJobDriver *driver,
@@ -703,6 +705,7 @@ void mirror_start(BlockDriverState *bs, BlockDriverState *target,
                   int64_t speed, uint32_t granularity, int64_t buf_size,
                   MirrorSyncMode mode, BlockdevOnError on_source_error,
                   BlockdevOnError on_target_error,
+                  bool unmap,
                   BlockCompletionFunc *cb,
                   void *opaque, Error **errp)
 {
@@ -717,7 +720,7 @@ void mirror_start(BlockDriverState *bs, BlockDriverState *target,
     base = mode == MIRROR_SYNC_MODE_TOP ? bs->backing_hd : NULL;
     mirror_start_job(bs, target, replaces,
                      speed, granularity, buf_size,
-                     on_source_error, on_target_error, cb, opaque, errp,
+                     on_source_error, on_target_error, unmap, cb, opaque, errp,
                      &mirror_job_driver, is_none_mode, base);
 }
 
@@ -765,7 +768,7 @@ void commit_active_start(BlockDriverState *bs, BlockDriverState *base,
 
     bdrv_ref(base);
     mirror_start_job(bs, base, NULL, speed, 0, 0,
-                     on_error, on_error, cb, opaque, &local_err,
+                     on_error, on_error, false, cb, opaque, &local_err,
                      &commit_active_job_driver, false, base);
     if (local_err) {
         error_propagate(errp, local_err);
diff --git a/blockdev.c b/blockdev.c
index 5eaf77e..4387e14 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -2632,6 +2632,7 @@ void qmp_drive_mirror(const char *device, const char *target,
                       bool has_buf_size, int64_t buf_size,
                       bool has_on_source_error, BlockdevOnError on_source_error,
                       bool has_on_target_error, BlockdevOnError on_target_error,
+                      bool has_unmap, bool unmap,
                       Error **errp)
 {
     BlockBackend *blk;
@@ -2663,6 +2664,9 @@ void qmp_drive_mirror(const char *device, const char *target,
     if (!has_buf_size) {
         buf_size = DEFAULT_MIRROR_BUF_SIZE;
     }
+    if (!has_unmap) {
+        unmap = true;
+    }
 
     if (granularity != 0 && (granularity < 512 || granularity > 1048576 * 64)) {
         error_set(errp, QERR_INVALID_PARAMETER_VALUE, "granularity",
@@ -2802,6 +2806,7 @@ void qmp_drive_mirror(const char *device, const char *target,
                  has_replaces ? replaces : NULL,
                  speed, granularity, buf_size, sync,
                  on_source_error, on_target_error,
+                 unmap,
                  block_job_cb, bs, &local_err);
     if (local_err != NULL) {
         bdrv_unref(target_bs);
diff --git a/hmp.c b/hmp.c
index e17852d..06748e4 100644
--- a/hmp.c
+++ b/hmp.c
@@ -1056,7 +1056,7 @@ void hmp_drive_mirror(Monitor *mon, const QDict *qdict)
                      false, NULL, false, NULL,
                      full ? MIRROR_SYNC_MODE_FULL : MIRROR_SYNC_MODE_TOP,
                      true, mode, false, 0, false, 0, false, 0,
-                     false, 0, false, 0, &err);
+                     false, 0, false, 0, false, false, &err);
     hmp_handle_error(mon, &err);
 }
 
diff --git a/include/block/block_int.h b/include/block/block_int.h
index db29b74..d136d34 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -587,6 +587,7 @@ void commit_active_start(BlockDriverState *bs, BlockDriverState *base,
  * @mode: Whether to collapse all images in the chain to the target.
  * @on_source_error: The action to take upon error reading from the source.
  * @on_target_error: The action to take upon error writing to the target.
+ * @unmap: Whether unmap target where source sectors only contain zeroes.
  * @cb: Completion function for the job.
  * @opaque: Opaque pointer value passed to @cb.
  * @errp: Error object.
@@ -601,6 +602,7 @@ void mirror_start(BlockDriverState *bs, BlockDriverState *target,
                   int64_t speed, uint32_t granularity, int64_t buf_size,
                   MirrorSyncMode mode, BlockdevOnError on_source_error,
                   BlockdevOnError on_target_error,
+                  bool unmap,
                   BlockCompletionFunc *cb,
                   void *opaque, Error **errp);
 
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 863ffea..b422d58 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -954,6 +954,8 @@
 # @on-target-error: #optional the action to take on an error on the target,
 #                   default 'report' (no limitations, since this applies to
 #                   a different block device than @device).
+# @unmap: #optional Whether unmap target sectors where source has zero. Default
+#         is true. (Since 2.4)
 #
 # Returns: nothing on success
 #          If @device is not a valid block device, DeviceNotFound
@@ -966,7 +968,8 @@
             'sync': 'MirrorSyncMode', '*mode': 'NewImageMode',
             '*speed': 'int', '*granularity': 'uint32',
             '*buf-size': 'int', '*on-source-error': 'BlockdevOnError',
-            '*on-target-error': 'BlockdevOnError' } }
+            '*on-target-error': 'BlockdevOnError',
+            '*unmap': 'bool' } }
 
 ##
 # @BlockDirtyBitmap
diff --git a/qmp-commands.hx b/qmp-commands.hx
index 14e109e..63c86fc 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -1503,6 +1503,7 @@ EQMP
         .args_type  = "sync:s,device:B,target:s,speed:i?,mode:s?,format:s?,"
                       "node-name:s?,replaces:s?,"
                       "on-source-error:s?,on-target-error:s?,"
+                      "unmap:b?,"
                       "granularity:i?,buf-size:i?",
         .mhandler.cmd_new = qmp_marshal_input_drive_mirror,
     },
@@ -1542,6 +1543,8 @@ Arguments:
   (BlockdevOnError, default 'report')
 - "on-target-error": the action to take on an error on the target
   (BlockdevOnError, default 'report')
+- "unmap": whether the target sectors should be discarded where source has only
+  zeroes. (json-bool, optional, default true)
 
 The default value of the granularity is the image cluster size clamped
 between 4096 and 65536, if the image format defines one.  If the format
-- 
2.4.1

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [Qemu-devel] [PATCH v4 3/8] mirror: Do zero write on target if sectors not allocated
  2015-05-22  3:40 [Qemu-devel] [PATCH v4 0/8] block: Mirror discarded sectors Fam Zheng
  2015-05-22  3:40 ` [Qemu-devel] [PATCH v4 1/8] block: Add "base" option to bdrv_get_block_status Fam Zheng
  2015-05-22  3:40 ` [Qemu-devel] [PATCH v4 2/8] qmp: Add optional bool "unmap" to drive-mirror Fam Zheng
@ 2015-05-22  3:40 ` Fam Zheng
  2015-05-22 20:20   ` Eric Blake
  2015-05-25 14:36   ` Paolo Bonzini
  2015-05-22  3:40 ` [Qemu-devel] [PATCH v4 4/8] block: Fix dirty bitmap in bdrv_co_discard Fam Zheng
                   ` (4 subsequent siblings)
  7 siblings, 2 replies; 18+ messages in thread
From: Fam Zheng @ 2015-05-22  3:40 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, qemu-block, Jeff Cody, qemu-stable, Stefan Hajnoczi,
	pbonzini, jsnow, wangxiaolong

If guest discards a source cluster, mirroring with bdrv_aio_readv is overkill.
Some protocols do zero upon discard, where it's best to use
bdrv_aio_write_zeroes, otherwise, bdrv_aio_discard will be enough.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 block/mirror.c | 19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/block/mirror.c b/block/mirror.c
index 85995b2..87f5ac4 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -164,6 +164,8 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
     int64_t end, sector_num, next_chunk, next_sector, hbitmap_next_sector;
     uint64_t delay_ns = 0;
     MirrorOp *op;
+    int pnum;
+    int64_t ret;
 
     s->sector_num = hbitmap_iter_next(&s->hbi);
     if (s->sector_num < 0) {
@@ -290,8 +292,21 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
     s->in_flight++;
     s->sectors_in_flight += nb_sectors;
     trace_mirror_one_iteration(s, sector_num, nb_sectors);
-    bdrv_aio_readv(source, sector_num, &op->qiov, nb_sectors,
-                   mirror_read_complete, op);
+
+    ret = bdrv_get_block_status(source, NULL, sector_num, nb_sectors, &pnum);
+    if (ret < 0 || pnum < nb_sectors ||
+            (ret & BDRV_BLOCK_ALLOCATED && !(ret & BDRV_BLOCK_ZERO))) {
+        bdrv_aio_readv(source, sector_num, &op->qiov, nb_sectors,
+                       mirror_read_complete, op);
+    } else if (ret & BDRV_BLOCK_ZERO) {
+        bdrv_aio_write_zeroes(s->target, sector_num, op->nb_sectors,
+                              s->unmap ? BDRV_REQ_MAY_UNMAP : 0,
+                              mirror_write_complete, op);
+    } else {
+        assert(!(ret & BDRV_BLOCK_ALLOCATED));
+        bdrv_aio_discard(s->target, sector_num, op->nb_sectors,
+                         mirror_write_complete, op);
+    }
     return delay_ns;
 }
 
-- 
2.4.1

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [Qemu-devel] [PATCH v4 4/8] block: Fix dirty bitmap in bdrv_co_discard
  2015-05-22  3:40 [Qemu-devel] [PATCH v4 0/8] block: Mirror discarded sectors Fam Zheng
                   ` (2 preceding siblings ...)
  2015-05-22  3:40 ` [Qemu-devel] [PATCH v4 3/8] mirror: Do zero write on target if sectors not allocated Fam Zheng
@ 2015-05-22  3:40 ` Fam Zheng
  2015-05-22 20:22   ` Eric Blake
  2015-05-22  3:40 ` [Qemu-devel] [PATCH v4 5/8] block: Remove bdrv_reset_dirty Fam Zheng
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 18+ messages in thread
From: Fam Zheng @ 2015-05-22  3:40 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, qemu-block, Jeff Cody, qemu-stable, Stefan Hajnoczi,
	pbonzini, jsnow, wangxiaolong

Unsetting dirty globally with discard is not very correct. The discard may zero
out sectors (depending on can_write_zeroes_with_unmap), we should replicate
this change to destinition side to make sure that the guest sees the same data.

Calling bdrv_reset_dirty also troubles mirror job because the hbitmap iterator
doesn't expect unsetting of bits after current position.

So let's do it the opposite way which fixes both problems: set the dirty bits
if we are to discard it.

Reported-by: wangxiaolong@ucloud.cn
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
---
 block/io.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/block/io.c b/block/io.c
index 36a4007..a1ff943 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2368,8 +2368,6 @@ int coroutine_fn bdrv_co_discard(BlockDriverState *bs, int64_t sector_num,
         return -EROFS;
     }
 
-    bdrv_reset_dirty(bs, sector_num, nb_sectors);
-
     /* Do nothing if disabled.  */
     if (!(bs->open_flags & BDRV_O_UNMAP)) {
         return 0;
@@ -2379,6 +2377,8 @@ int coroutine_fn bdrv_co_discard(BlockDriverState *bs, int64_t sector_num,
         return 0;
     }
 
+    bdrv_set_dirty(bs, sector_num, nb_sectors);
+
     max_discard = MIN_NON_ZERO(bs->bl.max_discard, BDRV_REQUEST_MAX_SECTORS);
     while (nb_sectors > 0) {
         int ret;
-- 
2.4.1

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [Qemu-devel] [PATCH v4 5/8] block: Remove bdrv_reset_dirty
  2015-05-22  3:40 [Qemu-devel] [PATCH v4 0/8] block: Mirror discarded sectors Fam Zheng
                   ` (3 preceding siblings ...)
  2015-05-22  3:40 ` [Qemu-devel] [PATCH v4 4/8] block: Fix dirty bitmap in bdrv_co_discard Fam Zheng
@ 2015-05-22  3:40 ` Fam Zheng
  2015-05-22  3:40 ` [Qemu-devel] [PATCH v4 6/8] qemu-iotests: Make block job methods common Fam Zheng
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 18+ messages in thread
From: Fam Zheng @ 2015-05-22  3:40 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, qemu-block, Jeff Cody, qemu-stable, Stefan Hajnoczi,
	pbonzini, jsnow, wangxiaolong

Using this function would always be wrong because a dirty bitmap must
have a specific owner that consumes the dirty bits and calls
bdrv_reset_dirty_bitmap().

Remove the unused function to avoid future misuse.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
---
 block.c                   | 12 ------------
 include/block/block_int.h |  2 --
 2 files changed, 14 deletions(-)

diff --git a/block.c b/block.c
index 7904098..511e13c 100644
--- a/block.c
+++ b/block.c
@@ -3324,18 +3324,6 @@ void bdrv_set_dirty(BlockDriverState *bs, int64_t cur_sector,
     }
 }
 
-void bdrv_reset_dirty(BlockDriverState *bs, int64_t cur_sector,
-                      int nr_sectors)
-{
-    BdrvDirtyBitmap *bitmap;
-    QLIST_FOREACH(bitmap, &bs->dirty_bitmaps, list) {
-        if (!bdrv_dirty_bitmap_enabled(bitmap)) {
-            continue;
-        }
-        hbitmap_reset(bitmap->bitmap, cur_sector, nr_sectors);
-    }
-}
-
 /**
  * Advance an HBitmapIter to an arbitrary offset.
  */
diff --git a/include/block/block_int.h b/include/block/block_int.h
index d136d34..aa9bf50 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -637,7 +637,5 @@ bool blk_dev_is_medium_locked(BlockBackend *blk);
 void blk_dev_resize_cb(BlockBackend *blk);
 
 void bdrv_set_dirty(BlockDriverState *bs, int64_t cur_sector, int nr_sectors);
-void bdrv_reset_dirty(BlockDriverState *bs, int64_t cur_sector,
-                      int nr_sectors);
 
 #endif /* BLOCK_INT_H */
-- 
2.4.1

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [Qemu-devel] [PATCH v4 6/8] qemu-iotests: Make block job methods common
  2015-05-22  3:40 [Qemu-devel] [PATCH v4 0/8] block: Mirror discarded sectors Fam Zheng
                   ` (4 preceding siblings ...)
  2015-05-22  3:40 ` [Qemu-devel] [PATCH v4 5/8] block: Remove bdrv_reset_dirty Fam Zheng
@ 2015-05-22  3:40 ` Fam Zheng
  2015-05-22  3:40 ` [Qemu-devel] [PATCH v4 7/8] qemu-iotests: Add test case for mirror with unmap Fam Zheng
  2015-05-22  3:41 ` [Qemu-devel] [PATCH v4 8/8] iotests: Use event_wait in wait_ready Fam Zheng
  7 siblings, 0 replies; 18+ messages in thread
From: Fam Zheng @ 2015-05-22  3:40 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, qemu-block, Jeff Cody, qemu-stable, Stefan Hajnoczi,
	pbonzini, jsnow, wangxiaolong

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
---
 tests/qemu-iotests/041        | 66 ++++++++++---------------------------------
 tests/qemu-iotests/iotests.py | 28 ++++++++++++++++++
 2 files changed, 43 insertions(+), 51 deletions(-)

diff --git a/tests/qemu-iotests/041 b/tests/qemu-iotests/041
index 59a8f73..3d46ed7 100755
--- a/tests/qemu-iotests/041
+++ b/tests/qemu-iotests/041
@@ -34,38 +34,8 @@ quorum_img3 = os.path.join(iotests.test_dir, 'quorum3.img')
 quorum_repair_img = os.path.join(iotests.test_dir, 'quorum_repair.img')
 quorum_snapshot_file = os.path.join(iotests.test_dir, 'quorum_snapshot.img')
 
-class ImageMirroringTestCase(iotests.QMPTestCase):
-    '''Abstract base class for image mirroring test cases'''
 
-    def wait_ready(self, drive='drive0'):
-        '''Wait until a block job BLOCK_JOB_READY event'''
-        ready = False
-        while not ready:
-            for event in self.vm.get_qmp_events(wait=True):
-                if event['event'] == 'BLOCK_JOB_READY':
-                    self.assert_qmp(event, 'data/type', 'mirror')
-                    self.assert_qmp(event, 'data/device', drive)
-                    ready = True
-
-    def wait_ready_and_cancel(self, drive='drive0'):
-        self.wait_ready(drive=drive)
-        event = self.cancel_and_wait(drive=drive)
-        self.assertEquals(event['event'], 'BLOCK_JOB_COMPLETED')
-        self.assert_qmp(event, 'data/type', 'mirror')
-        self.assert_qmp(event, 'data/offset', event['data']['len'])
-
-    def complete_and_wait(self, drive='drive0', wait_ready=True):
-        '''Complete a block job and wait for it to finish'''
-        if wait_ready:
-            self.wait_ready(drive=drive)
-
-        result = self.vm.qmp('block-job-complete', device=drive)
-        self.assert_qmp(result, 'return', {})
-
-        event = self.wait_until_completed(drive=drive)
-        self.assert_qmp(event, 'data/type', 'mirror')
-
-class TestSingleDrive(ImageMirroringTestCase):
+class TestSingleDrive(iotests.QMPTestCase):
     image_len = 1 * 1024 * 1024 # MB
 
     def setUp(self):
@@ -221,17 +191,9 @@ class TestSingleDriveUnalignedLength(TestSingleDrive):
     test_small_buffer2 = None
     test_large_cluster = None
 
-class TestMirrorNoBacking(ImageMirroringTestCase):
+class TestMirrorNoBacking(iotests.QMPTestCase):
     image_len = 2 * 1024 * 1024 # MB
 
-    def complete_and_wait(self, drive='drive0', wait_ready=True):
-        iotests.create_image(target_backing_img, TestMirrorNoBacking.image_len)
-        return ImageMirroringTestCase.complete_and_wait(self, drive, wait_ready)
-
-    def compare_images(self, img1, img2):
-        iotests.create_image(target_backing_img, TestMirrorNoBacking.image_len)
-        return iotests.compare_images(img1, img2)
-
     def setUp(self):
         iotests.create_image(backing_img, TestMirrorNoBacking.image_len)
         qemu_img('create', '-f', iotests.imgfmt, '-o', 'backing_file=%s' % backing_img, test_img)
@@ -242,7 +204,10 @@ class TestMirrorNoBacking(ImageMirroringTestCase):
         self.vm.shutdown()
         os.remove(test_img)
         os.remove(backing_img)
-        os.remove(target_backing_img)
+        try:
+            os.remove(target_backing_img)
+        except:
+            pass
         os.remove(target_img)
 
     def test_complete(self):
@@ -257,7 +222,7 @@ class TestMirrorNoBacking(ImageMirroringTestCase):
         result = self.vm.qmp('query-block')
         self.assert_qmp(result, 'return[0]/inserted/file', target_img)
         self.vm.shutdown()
-        self.assertTrue(self.compare_images(test_img, target_img),
+        self.assertTrue(iotests.compare_images(test_img, target_img),
                         'target image does not match source after mirroring')
 
     def test_cancel(self):
@@ -272,7 +237,7 @@ class TestMirrorNoBacking(ImageMirroringTestCase):
         result = self.vm.qmp('query-block')
         self.assert_qmp(result, 'return[0]/inserted/file', test_img)
         self.vm.shutdown()
-        self.assertTrue(self.compare_images(test_img, target_img),
+        self.assertTrue(iotests.compare_images(test_img, target_img),
                         'target image does not match source after mirroring')
 
     def test_large_cluster(self):
@@ -283,7 +248,6 @@ class TestMirrorNoBacking(ImageMirroringTestCase):
                         %(TestMirrorNoBacking.image_len), target_backing_img)
         qemu_img('create', '-f', iotests.imgfmt, '-o', 'cluster_size=%d,backing_file=%s'
                         % (TestMirrorNoBacking.image_len, target_backing_img), target_img)
-        os.remove(target_backing_img)
 
         result = self.vm.qmp('drive-mirror', device='drive0', sync='full',
                              mode='existing', target=target_img)
@@ -293,10 +257,10 @@ class TestMirrorNoBacking(ImageMirroringTestCase):
         result = self.vm.qmp('query-block')
         self.assert_qmp(result, 'return[0]/inserted/file', target_img)
         self.vm.shutdown()
-        self.assertTrue(self.compare_images(test_img, target_img),
+        self.assertTrue(iotests.compare_images(test_img, target_img),
                         'target image does not match source after mirroring')
 
-class TestMirrorResized(ImageMirroringTestCase):
+class TestMirrorResized(iotests.QMPTestCase):
     backing_len = 1 * 1024 * 1024 # MB
     image_len = 2 * 1024 * 1024 # MB
 
@@ -344,7 +308,7 @@ class TestMirrorResized(ImageMirroringTestCase):
         self.assertTrue(iotests.compare_images(test_img, target_img),
                         'target image does not match source after mirroring')
 
-class TestReadErrors(ImageMirroringTestCase):
+class TestReadErrors(iotests.QMPTestCase):
     image_len = 2 * 1024 * 1024 # MB
 
     # this should be a multiple of twice the default granularity
@@ -498,7 +462,7 @@ new_state = "1"
         self.assert_no_active_block_jobs()
         self.vm.shutdown()
 
-class TestWriteErrors(ImageMirroringTestCase):
+class TestWriteErrors(iotests.QMPTestCase):
     image_len = 2 * 1024 * 1024 # MB
 
     # this should be a multiple of twice the default granularity
@@ -624,7 +588,7 @@ new_state = "1"
         self.assert_no_active_block_jobs()
         self.vm.shutdown()
 
-class TestSetSpeed(ImageMirroringTestCase):
+class TestSetSpeed(iotests.QMPTestCase):
     image_len = 80 * 1024 * 1024 # MB
 
     def setUp(self):
@@ -690,7 +654,7 @@ class TestSetSpeed(ImageMirroringTestCase):
 
         self.wait_ready_and_cancel()
 
-class TestUnbackedSource(ImageMirroringTestCase):
+class TestUnbackedSource(iotests.QMPTestCase):
     image_len = 2 * 1024 * 1024 # MB
 
     def setUp(self):
@@ -731,7 +695,7 @@ class TestUnbackedSource(ImageMirroringTestCase):
         self.complete_and_wait()
         self.assert_no_active_block_jobs()
 
-class TestRepairQuorum(ImageMirroringTestCase):
+class TestRepairQuorum(iotests.QMPTestCase):
     """ This class test quorum file repair using drive-mirror.
         It's mostly a fork of TestSingleDrive """
     image_len = 1 * 1024 * 1024 # MB
diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index e93e623..2e07cc4 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -326,6 +326,34 @@ class QMPTestCase(unittest.TestCase):
         self.assert_no_active_block_jobs()
         return event
 
+    def wait_ready(self, drive='drive0'):
+        '''Wait until a block job BLOCK_JOB_READY event'''
+        ready = False
+        while not ready:
+            for event in self.vm.get_qmp_events(wait=True):
+                if event['event'] == 'BLOCK_JOB_READY':
+                    self.assert_qmp(event, 'data/type', 'mirror')
+                    self.assert_qmp(event, 'data/device', drive)
+                    ready = True
+
+    def wait_ready_and_cancel(self, drive='drive0'):
+        self.wait_ready(drive=drive)
+        event = self.cancel_and_wait(drive=drive)
+        self.assertEquals(event['event'], 'BLOCK_JOB_COMPLETED')
+        self.assert_qmp(event, 'data/type', 'mirror')
+        self.assert_qmp(event, 'data/offset', event['data']['len'])
+
+    def complete_and_wait(self, drive='drive0', wait_ready=True):
+        '''Complete a block job and wait for it to finish'''
+        if wait_ready:
+            self.wait_ready(drive=drive)
+
+        result = self.vm.qmp('block-job-complete', device=drive)
+        self.assert_qmp(result, 'return', {})
+
+        event = self.wait_until_completed(drive=drive)
+        self.assert_qmp(event, 'data/type', 'mirror')
+
 def notrun(reason):
     '''Skip this test suite'''
     # Each test in qemu-iotests has a number ("seq")
-- 
2.4.1

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [Qemu-devel] [PATCH v4 7/8] qemu-iotests: Add test case for mirror with unmap
  2015-05-22  3:40 [Qemu-devel] [PATCH v4 0/8] block: Mirror discarded sectors Fam Zheng
                   ` (5 preceding siblings ...)
  2015-05-22  3:40 ` [Qemu-devel] [PATCH v4 6/8] qemu-iotests: Make block job methods common Fam Zheng
@ 2015-05-22  3:40 ` Fam Zheng
  2015-05-22  3:41 ` [Qemu-devel] [PATCH v4 8/8] iotests: Use event_wait in wait_ready Fam Zheng
  7 siblings, 0 replies; 18+ messages in thread
From: Fam Zheng @ 2015-05-22  3:40 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, qemu-block, Jeff Cody, qemu-stable, Stefan Hajnoczi,
	pbonzini, jsnow, wangxiaolong

This checks that the discard on mirror source that effectively zeroes
data is also reflected by the data of target.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
---
 tests/qemu-iotests/131     | 59 ++++++++++++++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/131.out |  5 ++++
 tests/qemu-iotests/group   |  1 +
 3 files changed, 65 insertions(+)
 create mode 100644 tests/qemu-iotests/131
 create mode 100644 tests/qemu-iotests/131.out

diff --git a/tests/qemu-iotests/131 b/tests/qemu-iotests/131
new file mode 100644
index 0000000..f53ef6e
--- /dev/null
+++ b/tests/qemu-iotests/131
@@ -0,0 +1,59 @@
+#!/usr/bin/env python
+#
+# Test mirror with unmap
+#
+# Copyright (C) 2015 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
+
+import time
+import os
+import iotests
+from iotests import qemu_img, qemu_io
+
+test_img = os.path.join(iotests.test_dir, 'test.img')
+target_img = os.path.join(iotests.test_dir, 'target.img')
+
+class TestSingleDrive(iotests.QMPTestCase):
+    image_len = 2 * 1024 * 1024 # MB
+
+    def setUp(self):
+        # Write data to the image so we can compare later
+        qemu_img('create', '-f', iotests.imgfmt, test_img, str(TestSingleDrive.image_len))
+        qemu_io('-f', iotests.imgfmt, '-c', 'write -P0x5d 0 2M', test_img)
+
+        self.vm = iotests.VM().add_drive(test_img, 'discard=unmap')
+        self.vm.launch()
+
+    def tearDown(self):
+        self.vm.shutdown()
+        os.remove(test_img)
+        try:
+            os.remove(target_img)
+        except OSError:
+            pass
+
+    def test_mirror_discard(self):
+        result = self.vm.qmp('drive-mirror', device='drive0', sync='full',
+                             target=target_img)
+        self.assert_qmp(result, 'return', {})
+        self.vm.hmp_qemu_io('drive0', 'discard 0 64k')
+        self.complete_and_wait('drive0')
+        self.vm.shutdown()
+        self.assertTrue(iotests.compare_images(test_img, target_img),
+                        'target image does not match source after mirroring')
+
+if __name__ == '__main__':
+    iotests.main(supported_fmts=['raw', 'qcow2'])
diff --git a/tests/qemu-iotests/131.out b/tests/qemu-iotests/131.out
new file mode 100644
index 0000000..ae1213e
--- /dev/null
+++ b/tests/qemu-iotests/131.out
@@ -0,0 +1,5 @@
+.
+----------------------------------------------------------------------
+Ran 1 tests
+
+OK
diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
index 6ca3466..34b16cb 100644
--- a/tests/qemu-iotests/group
+++ b/tests/qemu-iotests/group
@@ -128,3 +128,4 @@
 128 rw auto quick
 129 rw auto quick
 130 rw auto quick
+131 rw auto quick
-- 
2.4.1

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [Qemu-devel] [PATCH v4 8/8] iotests: Use event_wait in wait_ready
  2015-05-22  3:40 [Qemu-devel] [PATCH v4 0/8] block: Mirror discarded sectors Fam Zheng
                   ` (6 preceding siblings ...)
  2015-05-22  3:40 ` [Qemu-devel] [PATCH v4 7/8] qemu-iotests: Add test case for mirror with unmap Fam Zheng
@ 2015-05-22  3:41 ` Fam Zheng
  7 siblings, 0 replies; 18+ messages in thread
From: Fam Zheng @ 2015-05-22  3:41 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, qemu-block, Jeff Cody, qemu-stable, Stefan Hajnoczi,
	pbonzini, jsnow, wangxiaolong

Only poll the specific type of event we are interested in, to avoid
stealing events that should be consumed by someone else.

Suggested-by: John Snow <jsnow@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
---
 tests/qemu-iotests/iotests.py | 9 ++-------
 1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index 2e07cc4..0ddc513 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -328,13 +328,8 @@ class QMPTestCase(unittest.TestCase):
 
     def wait_ready(self, drive='drive0'):
         '''Wait until a block job BLOCK_JOB_READY event'''
-        ready = False
-        while not ready:
-            for event in self.vm.get_qmp_events(wait=True):
-                if event['event'] == 'BLOCK_JOB_READY':
-                    self.assert_qmp(event, 'data/type', 'mirror')
-                    self.assert_qmp(event, 'data/device', drive)
-                    ready = True
+        f = {'data': {'type': 'mirror', 'device': drive } }
+        event = self.vm.event_wait(name='BLOCK_JOB_READY', match=f)
 
     def wait_ready_and_cancel(self, drive='drive0'):
         self.wait_ready(drive=drive)
-- 
2.4.1

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [PATCH v4 1/8] block: Add "base" option to bdrv_get_block_status
  2015-05-22  3:40 ` [Qemu-devel] [PATCH v4 1/8] block: Add "base" option to bdrv_get_block_status Fam Zheng
@ 2015-05-22 19:32   ` Eric Blake
  2015-05-25 14:43   ` Paolo Bonzini
  1 sibling, 0 replies; 18+ messages in thread
From: Eric Blake @ 2015-05-22 19:32 UTC (permalink / raw)
  To: Fam Zheng, qemu-devel
  Cc: Kevin Wolf, qemu-block, Jeff Cody, qemu-stable, Stefan Hajnoczi,
	pbonzini, jsnow, wangxiaolong

[-- Attachment #1: Type: text/plain, Size: 692 bytes --]

On 05/21/2015 09:40 PM, Fam Zheng wrote:
> Now this function follows the backing chain until seeing BDRV_BLOCK_ALLOCATED.
> Base is not included, and it can be NULL just like bdrv_is_allocated_above().
> 
> Existing callers pass in bs->backing_hd to keep the old behavior.
> 
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  block/io.c            | 39 ++++++++++++++++++++++++++++++++-------
>  include/block/block.h |  3 ++-
>  qemu-img.c            |  6 ++++--
>  3 files changed, 38 insertions(+), 10 deletions(-)

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [PATCH v4 2/8] qmp: Add optional bool "unmap" to drive-mirror
  2015-05-22  3:40 ` [Qemu-devel] [PATCH v4 2/8] qmp: Add optional bool "unmap" to drive-mirror Fam Zheng
@ 2015-05-22 19:57   ` Eric Blake
  0 siblings, 0 replies; 18+ messages in thread
From: Eric Blake @ 2015-05-22 19:57 UTC (permalink / raw)
  To: Fam Zheng, qemu-devel
  Cc: Kevin Wolf, qemu-block, Jeff Cody, qemu-stable, Stefan Hajnoczi,
	pbonzini, jsnow, wangxiaolong

[-- Attachment #1: Type: text/plain, Size: 3246 bytes --]

On 05/21/2015 09:40 PM, Fam Zheng wrote:
> If specified as "true", it allows discarding on target sectors where source is
> not allocated.
> 
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  block/mirror.c            | 7 +++++--
>  blockdev.c                | 5 +++++
>  hmp.c                     | 2 +-
>  include/block/block_int.h | 2 ++
>  qapi/block-core.json      | 5 ++++-
>  qmp-commands.hx           | 3 +++
>  6 files changed, 20 insertions(+), 4 deletions(-)
> 

> +++ b/blockdev.c
> @@ -2632,6 +2632,7 @@ void qmp_drive_mirror(const char *device, const char *target,
>                        bool has_buf_size, int64_t buf_size,
>                        bool has_on_source_error, BlockdevOnError on_source_error,
>                        bool has_on_target_error, BlockdevOnError on_target_error,
> +                      bool has_unmap, bool unmap,
>                        Error **errp)
>  {
>      BlockBackend *blk;
> @@ -2663,6 +2664,9 @@ void qmp_drive_mirror(const char *device, const char *target,
>      if (!has_buf_size) {
>          buf_size = DEFAULT_MIRROR_BUF_SIZE;
>      }
> +    if (!has_unmap) {
> +        unmap = true;
> +    }

So drive-mirror defaults to true...

> +++ b/hmp.c
> @@ -1056,7 +1056,7 @@ void hmp_drive_mirror(Monitor *mon, const QDict *qdict)
>                       false, NULL, false, NULL,
>                       full ? MIRROR_SYNC_MODE_FULL : MIRROR_SYNC_MODE_TOP,
>                       true, mode, false, 0, false, 0, false, 0,
> -                     false, 0, false, 0, &err);
> +                     false, 0, false, 0, false, false, &err);

...even though you are passing unmap=false here.  It might be a bit less
confusing to pass unmap=true, even though the has_unmap=false means to
fall back to the default, so that someone reading this code doesn't get
confused about the unmap choice being changed by the defaulting mechanism.

>      hmp_handle_error(mon, &err);
>  }
>  
> diff --git a/include/block/block_int.h b/include/block/block_int.h
> index db29b74..d136d34 100644
> --- a/include/block/block_int.h
> +++ b/include/block/block_int.h
> @@ -587,6 +587,7 @@ void commit_active_start(BlockDriverState *bs, BlockDriverState *base,
>   * @mode: Whether to collapse all images in the chain to the target.
>   * @on_source_error: The action to take upon error reading from the source.
>   * @on_target_error: The action to take upon error writing to the target.
> + * @unmap: Whether unmap target where source sectors only contain zeroes.

s/Whether/Whether to/

> +++ b/qapi/block-core.json
> @@ -954,6 +954,8 @@
>  # @on-target-error: #optional the action to take on an error on the target,
>  #                   default 'report' (no limitations, since this applies to
>  #                   a different block device than @device).
> +# @unmap: #optional Whether unmap target sectors where source has zero. Default

s/Whether/Whether to/

> +#         is true. (Since 2.4)

That says what it does, but not why I would want to ever specify false.
 Is it worth a bit more explanation?


-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [PATCH v4 3/8] mirror: Do zero write on target if sectors not allocated
  2015-05-22  3:40 ` [Qemu-devel] [PATCH v4 3/8] mirror: Do zero write on target if sectors not allocated Fam Zheng
@ 2015-05-22 20:20   ` Eric Blake
  2015-05-25 14:38     ` Paolo Bonzini
  2015-05-25 14:36   ` Paolo Bonzini
  1 sibling, 1 reply; 18+ messages in thread
From: Eric Blake @ 2015-05-22 20:20 UTC (permalink / raw)
  To: Fam Zheng, qemu-devel
  Cc: Kevin Wolf, qemu-block, Jeff Cody, qemu-stable, Stefan Hajnoczi,
	pbonzini, jsnow, wangxiaolong

[-- Attachment #1: Type: text/plain, Size: 3347 bytes --]

On 05/21/2015 09:40 PM, Fam Zheng wrote:
> If guest discards a source cluster, mirroring with bdrv_aio_readv is overkill.
> Some protocols do zero upon discard, where it's best to use
> bdrv_aio_write_zeroes, otherwise, bdrv_aio_discard will be enough.
> 
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  block/mirror.c | 19 +++++++++++++++++--
>  1 file changed, 17 insertions(+), 2 deletions(-)
> 
> +
> +    ret = bdrv_get_block_status(source, NULL, sector_num, nb_sectors, &pnum);

Ah, you are checking the entire chain for allocation, so if it is
unallocated through all layers, then the destination doesn't need to
allocate it either.  But is this the correct location to start with,
when the block-mirror is shallow?

> +    if (ret < 0 || pnum < nb_sectors ||
> +            (ret & BDRV_BLOCK_ALLOCATED && !(ret & BDRV_BLOCK_ZERO))) {
> +        bdrv_aio_readv(source, sector_num, &op->qiov, nb_sectors,

Do we still want to call bdrv_aio_readv() if ret < 0 (where it will
likely fail), or should this 'if' be broken into two clauses?

> +                       mirror_read_complete, op);
> +    } else if (ret & BDRV_BLOCK_ZERO) {
> +        bdrv_aio_write_zeroes(s->target, sector_num, op->nb_sectors,
> +                              s->unmap ? BDRV_REQ_MAY_UNMAP : 0,
> +                              mirror_write_complete, op);
> +    } else {
> +        assert(!(ret & BDRV_BLOCK_ALLOCATED));
> +        bdrv_aio_discard(s->target, sector_num, op->nb_sectors,
> +                         mirror_write_complete, op);
> +    }

I'm okay with what happens when you mirror to a flat image, as in
copying "base <- active" into "copy".  There, the copy can omit clusters
that are not allocated in anywhere in the source chain, and can also
omit clusters if the source has all zero in the cluster, and the
destination would read back all zero even if the cluster is unallocated.

But I'm worried about a shallow copy.  If I start with "base <- active",
where "active" has an explicit zero cluster that is overwriting an
allocated non-zero cluster in "base", and I'm creating the shallow clone
to "base <- copy", then the default of 'unmap=true' says that
bdrv_aio_write_zeroes() may attempt to unmap the cluster in "copy".  At
which point, doesn't that mean that reading from "copy" will dredge up
the non-zero data from "base", which is NOT a faithful mirroring of
"active"?

Or symbolically, suppose I have this layout, with letters for non-zero
clusters, 0 for explicit zero clusters, and - for unallocated clusters:

base   AAA000---
active -0B-0B-0B   # Guest sees A0B00B00B

If I'm understanding your code correctly, a deep block-mirror will
create either:

copy   A-B--B--B   # Guest sees A0B00B00B, unmap was true, image is sparse

or

copy   A0B00B-0B   # Guest sees A0B00B00B, unmap was false, image is
allocated

But a shallow block-mirror will cause:

base   AAA000---
copy   -0B-0B-0B   # Guest sees A0B00B00B, unmap was false

or

base   AAA000---
copy   --B--B--B   # Guest sees AAB00B00B, unmap was true

Whoops - unmapping a cluster in the destination which was all zeros in
the source caused corruption in what the guest sees.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [PATCH v4 4/8] block: Fix dirty bitmap in bdrv_co_discard
  2015-05-22  3:40 ` [Qemu-devel] [PATCH v4 4/8] block: Fix dirty bitmap in bdrv_co_discard Fam Zheng
@ 2015-05-22 20:22   ` Eric Blake
  0 siblings, 0 replies; 18+ messages in thread
From: Eric Blake @ 2015-05-22 20:22 UTC (permalink / raw)
  To: Fam Zheng, qemu-devel
  Cc: Kevin Wolf, qemu-block, Jeff Cody, qemu-stable, Stefan Hajnoczi,
	pbonzini, jsnow, wangxiaolong

[-- Attachment #1: Type: text/plain, Size: 960 bytes --]

On 05/21/2015 09:40 PM, Fam Zheng wrote:
> Unsetting dirty globally with discard is not very correct. The discard may zero
> out sectors (depending on can_write_zeroes_with_unmap), we should replicate
> this change to destinition side to make sure that the guest sees the same data.

s/destinition/destination/

> 
> Calling bdrv_reset_dirty also troubles mirror job because the hbitmap iterator
> doesn't expect unsetting of bits after current position.
> 
> So let's do it the opposite way which fixes both problems: set the dirty bits
> if we are to discard it.
> 
> Reported-by: wangxiaolong@ucloud.cn
> Signed-off-by: Fam Zheng <famz@redhat.com>
> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  block/io.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [PATCH v4 3/8] mirror: Do zero write on target if sectors not allocated
  2015-05-22  3:40 ` [Qemu-devel] [PATCH v4 3/8] mirror: Do zero write on target if sectors not allocated Fam Zheng
  2015-05-22 20:20   ` Eric Blake
@ 2015-05-25 14:36   ` Paolo Bonzini
  2015-05-25 14:41     ` Paolo Bonzini
  1 sibling, 1 reply; 18+ messages in thread
From: Paolo Bonzini @ 2015-05-25 14:36 UTC (permalink / raw)
  To: Fam Zheng, qemu-devel
  Cc: Kevin Wolf, qemu-block, Jeff Cody, qemu-stable, Stefan Hajnoczi,
	jsnow, wangxiaolong



On 22/05/2015 05:40, Fam Zheng wrote:
> +    ret = bdrv_get_block_status(source, NULL, sector_num, nb_sectors, &pnum);
> +    if (ret < 0 || pnum < nb_sectors ||
> +            (ret & BDRV_BLOCK_ALLOCATED && !(ret & BDRV_BLOCK_ZERO))) {
> +        bdrv_aio_readv(source, sector_num, &op->qiov, nb_sectors,
> +                       mirror_read_complete, op);
> +    } else if (ret & BDRV_BLOCK_ZERO) {
> +        bdrv_aio_write_zeroes(s->target, sector_num, op->nb_sectors,
> +                              s->unmap ? BDRV_REQ_MAY_UNMAP : 0,
> +                              mirror_write_complete, op);
> +    } else {
> +        assert(!(ret & BDRV_BLOCK_ALLOCATED));
> +        bdrv_aio_discard(s->target, sector_num, op->nb_sectors,
> +                         mirror_write_complete, op);
> +    }

This doesn't work if you have a backing file.  You want to test
BDRV_BLOCK_DATA, not BDRV_BLOCK_ALLOCATED.

On the other hand, if BDRV_BLOCK_ALLOCATED is nonzero, you need to
recurse on bs->backing_hd.  The logic is very similar to
bdrv_is_allocated_above, but you need to write bdrv_get_block_status_above.

Paolo

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [PATCH v4 3/8] mirror: Do zero write on target if sectors not allocated
  2015-05-22 20:20   ` Eric Blake
@ 2015-05-25 14:38     ` Paolo Bonzini
  0 siblings, 0 replies; 18+ messages in thread
From: Paolo Bonzini @ 2015-05-25 14:38 UTC (permalink / raw)
  To: Eric Blake, Fam Zheng, qemu-devel
  Cc: Kevin Wolf, qemu-block, Jeff Cody, qemu-stable, Stefan Hajnoczi,
	jsnow, wangxiaolong

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256



On 22/05/2015 22:20, Eric Blake wrote:
> But I'm worried about a shallow copy.  If I start with "base <-
> active", where "active" has an explicit zero cluster that is
> overwriting an allocated non-zero cluster in "base", and I'm
> creating the shallow clone to "base <- copy", then the default of
> 'unmap=true' says that bdrv_aio_write_zeroes() may attempt to unmap
> the cluster in "copy".  At which point, doesn't that mean that
> reading from "copy" will dredge up the non-zero data from "base",
> which is NOT a faithful mirroring of "active"?

No, bdv_aio_write_zeroes+BDRV_REQ_MAY_UNMAP only unmaps if it results
in zeroes.  In addition, unlike bdrv_aio_discard,
bdrv_aio_write_zeroes will do a real write of zeroes if [sector_num,
sector_num+nb_sectors) is not aligned to the disk's unmap granularity.

Paolo
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2

iQEcBAEBCAAGBQJVYzPhAAoJEL/70l94x66DGMYIAKKGQIts+Ve689Iar0P6TmXz
BOO0xKJm477PAP9pvUtnGnpt+g2+OqKp0VBTu4F3tbLIQLQOZT1kAZ9oxJ9PMnEa
69c8n86DWdJIgx1lcnOI2wgjaoG65e2K2MH3vYs2Rj6ZD65/FfsF7+oj/R9zCgZm
MIMSiW/fbC5EjBQxOTkDWekuB+/iNRtV9K8YBCtmb8ZlF28X3FPrdZnoqcZGtxPL
RoGYLckjEuLMLSsDH3Frb1+lfKpCaSlJ203ihQQfXzr/8UBH4zOqPZOTbrhdA1pZ
3fHaW5KWyZ1vjW6bb64kpjMnRwCRxB+uAAY4pUmUCZxGsHxh19hGUeSuOexFvqc=
=GUVR
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [PATCH v4 3/8] mirror: Do zero write on target if sectors not allocated
  2015-05-25 14:36   ` Paolo Bonzini
@ 2015-05-25 14:41     ` Paolo Bonzini
  0 siblings, 0 replies; 18+ messages in thread
From: Paolo Bonzini @ 2015-05-25 14:41 UTC (permalink / raw)
  To: Fam Zheng, qemu-devel
  Cc: Stefan Hajnoczi, qemu-stable, qemu-block, wangxiaolong



On 25/05/2015 16:36, Paolo Bonzini wrote:
> 
> 
> On 22/05/2015 05:40, Fam Zheng wrote:
>> +    ret = bdrv_get_block_status(source, NULL, sector_num, nb_sectors, &pnum);
>> +    if (ret < 0 || pnum < nb_sectors ||
>> +            (ret & BDRV_BLOCK_ALLOCATED && !(ret & BDRV_BLOCK_ZERO))) {
>> +        bdrv_aio_readv(source, sector_num, &op->qiov, nb_sectors,
>> +                       mirror_read_complete, op);
>> +    } else if (ret & BDRV_BLOCK_ZERO) {
>> +        bdrv_aio_write_zeroes(s->target, sector_num, op->nb_sectors,
>> +                              s->unmap ? BDRV_REQ_MAY_UNMAP : 0,
>> +                              mirror_write_complete, op);
>> +    } else {
>> +        assert(!(ret & BDRV_BLOCK_ALLOCATED));
>> +        bdrv_aio_discard(s->target, sector_num, op->nb_sectors,
>> +                         mirror_write_complete, op);
>> +    }
> 
> This doesn't work if you have a backing file.  You want to test
> BDRV_BLOCK_DATA, not BDRV_BLOCK_ALLOCATED.
> 
> On the other hand, if BDRV_BLOCK_ALLOCATED is nonzero, you need to
> recurse on bs->backing_hd.  The logic is very similar to
> bdrv_is_allocated_above, but you need to write bdrv_get_block_status_above.

Oops, I totally missed the "NULL" in the first line.  Still, I think
BDRV_BLOCK_DATA is a better check than BDRV_BLOCK_ALLOCATED.

Paolo

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [PATCH v4 1/8] block: Add "base" option to bdrv_get_block_status
  2015-05-22  3:40 ` [Qemu-devel] [PATCH v4 1/8] block: Add "base" option to bdrv_get_block_status Fam Zheng
  2015-05-22 19:32   ` Eric Blake
@ 2015-05-25 14:43   ` Paolo Bonzini
  2015-05-26  3:30     ` Fam Zheng
  1 sibling, 1 reply; 18+ messages in thread
From: Paolo Bonzini @ 2015-05-25 14:43 UTC (permalink / raw)
  To: Fam Zheng, qemu-devel
  Cc: Kevin Wolf, qemu-block, Jeff Cody, qemu-stable, Stefan Hajnoczi,
	jsnow, wangxiaolong



On 22/05/2015 05:40, Fam Zheng wrote:
> +{
> +    BlockDriverState *p;
> +    int64_t ret;
> +
> +    assert(bs != base);
> +    for (p = bs; p != base; p = p->backing_hd) {
> +        ret = bdrv_co_get_block_status(p, sector_num, nb_sectors, pnum);

It's a bit ugly to have different parameters for bdrv_get_block_status
and bdrv_co_get_block_status.

> +        if (ret < 0 || ret & BDRV_BLOCK_ALLOCATED) {
> +            break;
> +        }
> +    }
> +    return ret;
> +}
> +

I think BDRV_BLOCK_ALLOCATED will always be true if passing NULL as the
base, so patch 3 still needs a tweak.

But, more important, you need extra logic to correct the output value of
pnum, as in bdrv_is_allocated_above.  If you want, you can also
reimplemented bdrv_is_allocated_above on top of bdrv_get_block_status.

Paolo

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [PATCH v4 1/8] block: Add "base" option to bdrv_get_block_status
  2015-05-25 14:43   ` Paolo Bonzini
@ 2015-05-26  3:30     ` Fam Zheng
  0 siblings, 0 replies; 18+ messages in thread
From: Fam Zheng @ 2015-05-26  3:30 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Kevin Wolf, qemu-block, Jeff Cody, qemu-devel, qemu-stable,
	Stefan Hajnoczi, jsnow, wangxiaolong

On Mon, 05/25 16:43, Paolo Bonzini wrote:
> 
> 
> On 22/05/2015 05:40, Fam Zheng wrote:
> > +{
> > +    BlockDriverState *p;
> > +    int64_t ret;
> > +
> > +    assert(bs != base);
> > +    for (p = bs; p != base; p = p->backing_hd) {
> > +        ret = bdrv_co_get_block_status(p, sector_num, nb_sectors, pnum);
> 
> It's a bit ugly to have different parameters for bdrv_get_block_status
> and bdrv_co_get_block_status.

I'll add bdrv_get_block_status_above and rewrite bdrv_get_block_status body on
top.

> 
> > +        if (ret < 0 || ret & BDRV_BLOCK_ALLOCATED) {
> > +            break;
> > +        }
> > +    }
> > +    return ret;
> > +}
> > +
> 
> I think BDRV_BLOCK_ALLOCATED will always be true if passing NULL as the
> base, so patch 3 still needs a tweak.

OK, will use BDRV_BLOCK_DATA there.

> 
> But, more important, you need extra logic to correct the output value of
> pnum, as in bdrv_is_allocated_above.

bdrv_co_get_block_status should fix pnum.

Fam

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2015-05-26  3:30 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-05-22  3:40 [Qemu-devel] [PATCH v4 0/8] block: Mirror discarded sectors Fam Zheng
2015-05-22  3:40 ` [Qemu-devel] [PATCH v4 1/8] block: Add "base" option to bdrv_get_block_status Fam Zheng
2015-05-22 19:32   ` Eric Blake
2015-05-25 14:43   ` Paolo Bonzini
2015-05-26  3:30     ` Fam Zheng
2015-05-22  3:40 ` [Qemu-devel] [PATCH v4 2/8] qmp: Add optional bool "unmap" to drive-mirror Fam Zheng
2015-05-22 19:57   ` Eric Blake
2015-05-22  3:40 ` [Qemu-devel] [PATCH v4 3/8] mirror: Do zero write on target if sectors not allocated Fam Zheng
2015-05-22 20:20   ` Eric Blake
2015-05-25 14:38     ` Paolo Bonzini
2015-05-25 14:36   ` Paolo Bonzini
2015-05-25 14:41     ` Paolo Bonzini
2015-05-22  3:40 ` [Qemu-devel] [PATCH v4 4/8] block: Fix dirty bitmap in bdrv_co_discard Fam Zheng
2015-05-22 20:22   ` Eric Blake
2015-05-22  3:40 ` [Qemu-devel] [PATCH v4 5/8] block: Remove bdrv_reset_dirty Fam Zheng
2015-05-22  3:40 ` [Qemu-devel] [PATCH v4 6/8] qemu-iotests: Make block job methods common Fam Zheng
2015-05-22  3:40 ` [Qemu-devel] [PATCH v4 7/8] qemu-iotests: Add test case for mirror with unmap Fam Zheng
2015-05-22  3:41 ` [Qemu-devel] [PATCH v4 8/8] iotests: Use event_wait in wait_ready Fam Zheng

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.