* [PATCH v2 00/19] Make image fleecing more usable
@ 2021-08-27 18:17 Vladimir Sementsov-Ogievskiy
  2021-08-27 18:17 ` [PATCH v2 01/19] block/block-copy: move copy_bitmap initialization to block_copy_state_new() Vladimir Sementsov-Ogievskiy
                   ` (19 more replies)
  0 siblings, 20 replies; 24+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-08-27 18:17 UTC
  To: qemu-block
  Cc: qemu-devel, armbru, xiechanglong.d, wencongyang2, eblake, hreitz,
	kwolf, vsementsov, jsnow

Hi all!

This continues "[PATCH RFC DRAFT 00/11] Make image fleecing more usable"
and supersedes "[PATCH v2 for-6.2 0/6] push backup with fleecing".

Supersedes: <20210804131750.127574-1-vsementsov@virtuozzo.com>
Supersedes: <20210721140424.163701-1-vsementsov@virtuozzo.com>

There are several improvements to the fleecing scheme:

1. support a bitmap in the copy-before-write filter

2. introduce the fleecing block driver, which opens the door for a lot
   of image fleecing improvements.
   See the "block: introduce fleecing block driver" commit message for
   details.

3. support the "push backup with fleecing" scheme, where the backup job
   is a client of the common fleecing scheme. That helps when writes to
   the final backup target are slow and we don't want guest writes to
   hang waiting for copy-before-write operations to the final target.

Vladimir Sementsov-Ogievskiy (19):
  block/block-copy: move copy_bitmap initialization to
    block_copy_state_new()
  block/dirty-bitmap: bdrv_merge_dirty_bitmap(): add return value
  block/block-copy: block_copy_state_new(): add bitmap parameter
  block/copy-before-write: add bitmap open parameter
  block/block-copy: add block_copy_reset()
  block: intoduce reqlist
  block/dirty-bitmap: introduce bdrv_dirty_bitmap_status()
  block/reqlist: add reqlist_wait_all()
  block: introduce FleecingState class
  block: introduce fleecing block driver
  block/copy-before-write: support fleecing block driver
  block/block-copy: add write-unchanged mode
  block/copy-before-write: use write-unchanged in fleecing mode
  iotests/image-fleecing: add test-case for fleecing format node
  iotests.py: add qemu_io_pipe_and_status()
  iotests/image-fleecing: add test case with bitmap
  block: blk_root(): return non-const pointer
  qapi: backup: add immutable-source parameter
  iotests/image-fleecing: test push backup with fleecing

 qapi/block-core.json                        |  39 ++-
 block/fleecing.h                            | 151 ++++++++++++
 include/block/block-copy.h                  |   4 +-
 include/block/block_int.h                   |   1 +
 include/block/dirty-bitmap.h                |   4 +-
 include/block/reqlist.h                     |  75 ++++++
 include/qemu/hbitmap.h                      |  11 +
 include/sysemu/block-backend.h              |   2 +-
 block/backup.c                              |  61 ++++-
 block/block-backend.c                       |   2 +-
 block/block-copy.c                          | 157 +++++-------
 block/copy-before-write.c                   |  70 +++++-
 block/dirty-bitmap.c                        |  15 +-
 block/fleecing-drv.c                        | 260 ++++++++++++++++++++
 block/fleecing.c                            | 182 ++++++++++++++
 block/monitor/bitmap-qmp-cmds.c             |   5 +-
 block/replication.c                         |   2 +-
 block/reqlist.c                             |  84 +++++++
 blockdev.c                                  |   1 +
 util/hbitmap.c                              |  36 +++
 MAINTAINERS                                 |   7 +-
 block/meson.build                           |   3 +
 tests/qemu-iotests/iotests.py               |   4 +
 tests/qemu-iotests/tests/image-fleecing     | 178 +++++++++++---
 tests/qemu-iotests/tests/image-fleecing.out | 221 ++++++++++++++++-
 25 files changed, 1420 insertions(+), 155 deletions(-)
 create mode 100644 block/fleecing.h
 create mode 100644 include/block/reqlist.h
 create mode 100644 block/fleecing-drv.c
 create mode 100644 block/fleecing.c
 create mode 100644 block/reqlist.c

-- 
2.29.2




* [PATCH v2 01/19] block/block-copy: move copy_bitmap initialization to block_copy_state_new()
From: Vladimir Sementsov-Ogievskiy @ 2021-08-27 18:17 UTC
  To: qemu-block
  Cc: qemu-devel, armbru, xiechanglong.d, wencongyang2, eblake, hreitz,
	kwolf, vsementsov, jsnow

A later commit in this series will complicate bitmap initialization.
In the future, the backup job will also be able to work without the
filter (when the source is immutable), so the copy-before-write filter
and the backup job will need the same bitmap initialization. It is
therefore reasonable to do it in block-copy.

Note that for now cbw_open() is the only caller of
block_copy_state_new().

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/block-copy.c        | 1 +
 block/copy-before-write.c | 4 ----
 2 files changed, 1 insertion(+), 4 deletions(-)

diff --git a/block/block-copy.c b/block/block-copy.c
index 37d804ec42..c39cb5fda7 100644
--- a/block/block-copy.c
+++ b/block/block-copy.c
@@ -401,6 +401,7 @@ BlockCopyState *block_copy_state_new(BdrvChild *source, BdrvChild *target,
         return NULL;
     }
     bdrv_disable_dirty_bitmap(copy_bitmap);
+    bdrv_set_dirty_bitmap(copy_bitmap, 0, bdrv_dirty_bitmap_size(copy_bitmap));
 
     /*
      * If source is in backing chain of target assume that target is going to be
diff --git a/block/copy-before-write.c b/block/copy-before-write.c
index 2a5e57deca..f5551cd15b 100644
--- a/block/copy-before-write.c
+++ b/block/copy-before-write.c
@@ -148,7 +148,6 @@ static int cbw_open(BlockDriverState *bs, QDict *options, int flags,
                     Error **errp)
 {
     BDRVCopyBeforeWriteState *s = bs->opaque;
-    BdrvDirtyBitmap *copy_bitmap;
 
     bs->file = bdrv_open_child(NULL, options, "file", bs, &child_of_bds,
                                BDRV_CHILD_FILTERED | BDRV_CHILD_PRIMARY,
@@ -176,9 +175,6 @@ static int cbw_open(BlockDriverState *bs, QDict *options, int flags,
         return -EINVAL;
     }
 
-    copy_bitmap = block_copy_dirty_bitmap(s->bcs);
-    bdrv_set_dirty_bitmap(copy_bitmap, 0, bdrv_dirty_bitmap_size(copy_bitmap));
-
     return 0;
 }
 
-- 
2.29.2




* [PATCH v2 02/19] block/dirty-bitmap: bdrv_merge_dirty_bitmap(): add return value
From: Vladimir Sementsov-Ogievskiy @ 2021-08-27 18:17 UTC
  To: qemu-block
  Cc: qemu-devel, armbru, xiechanglong.d, wencongyang2, eblake, hreitz,
	kwolf, vsementsov, jsnow

This simplifies failure handling in existing code and in upcoming new
users of bdrv_merge_dirty_bitmap().

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
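Not part of the patch: a standalone sketch of the convention the commit
message relies on, where a function returns bool next to an error
out-parameter so that callers can test the return value directly.
SketchError, SketchBitmap and sketch_merge() are hypothetical stand-ins
rather than QEMU APIs, and the size-mismatch failure is only an assumed
toy condition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an error object: just a message pointer. */
typedef struct SketchError {
    const char *msg;
} SketchError;

/* Hypothetical toy bitmap: one 64-bit word plus a size in bits. */
typedef struct SketchBitmap {
    uint64_t bits;
    unsigned size;
} SketchBitmap;

/*
 * Merge src into dest (bitwise OR).
 * Returns true on success, false on failure. On failure dest is left
 * untouched and errp->msg (if errp is non-NULL) describes the problem.
 */
bool sketch_merge(SketchBitmap *dest, const SketchBitmap *src,
                  SketchError *errp)
{
    assert(dest && src);

    if (dest->size != src->size) {
        if (errp) {
            errp->msg = "bitmaps have different sizes";
        }
        return false;
    }

    dest->bits |= src->bits;
    return true;
}
```

With this shape a caller can write "if (!sketch_merge(a, b, &err))"
and needs no local error variable to propagate, which mirrors the
simplification made in block_dirty_bitmap_merge() below.
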
 include/block/dirty-bitmap.h    | 2 +-
 block/dirty-bitmap.c            | 9 +++++++--
 block/monitor/bitmap-qmp-cmds.c | 5 +----
 3 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/include/block/dirty-bitmap.h b/include/block/dirty-bitmap.h
index 40950ae3d5..f95d350b70 100644
--- a/include/block/dirty-bitmap.h
+++ b/include/block/dirty-bitmap.h
@@ -77,7 +77,7 @@ void bdrv_dirty_bitmap_set_persistence(BdrvDirtyBitmap *bitmap,
                                        bool persistent);
 void bdrv_dirty_bitmap_set_inconsistent(BdrvDirtyBitmap *bitmap);
 void bdrv_dirty_bitmap_set_busy(BdrvDirtyBitmap *bitmap, bool busy);
-void bdrv_merge_dirty_bitmap(BdrvDirtyBitmap *dest, const BdrvDirtyBitmap *src,
+bool bdrv_merge_dirty_bitmap(BdrvDirtyBitmap *dest, const BdrvDirtyBitmap *src,
                              HBitmap **backup, Error **errp);
 void bdrv_dirty_bitmap_skip_store(BdrvDirtyBitmap *bitmap, bool skip);
 bool bdrv_dirty_bitmap_get(BdrvDirtyBitmap *bitmap, int64_t offset);
diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index 0ef46163e3..94a0276833 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -880,11 +880,14 @@ bool bdrv_dirty_bitmap_next_dirty_area(BdrvDirtyBitmap *bitmap,
  * Ensures permissions on bitmaps are reasonable; use for public API.
  *
  * @backup: If provided, make a copy of dest here prior to merge.
+ *
+ * Returns true on success, false on failure. In case of failure bitmaps are
+ * untouched.
  */
-void bdrv_merge_dirty_bitmap(BdrvDirtyBitmap *dest, const BdrvDirtyBitmap *src,
+bool bdrv_merge_dirty_bitmap(BdrvDirtyBitmap *dest, const BdrvDirtyBitmap *src,
                              HBitmap **backup, Error **errp)
 {
-    bool ret;
+    bool ret = false;
 
     bdrv_dirty_bitmaps_lock(dest->bs);
     if (src->bs != dest->bs) {
@@ -912,6 +915,8 @@ out:
     if (src->bs != dest->bs) {
         bdrv_dirty_bitmaps_unlock(src->bs);
     }
+
+    return ret;
 }
 
 /**
diff --git a/block/monitor/bitmap-qmp-cmds.c b/block/monitor/bitmap-qmp-cmds.c
index 9f11deec64..83970b22fa 100644
--- a/block/monitor/bitmap-qmp-cmds.c
+++ b/block/monitor/bitmap-qmp-cmds.c
@@ -259,7 +259,6 @@ BdrvDirtyBitmap *block_dirty_bitmap_merge(const char *node, const char *target,
     BlockDriverState *bs;
     BdrvDirtyBitmap *dst, *src, *anon;
     BlockDirtyBitmapMergeSourceList *lst;
-    Error *local_err = NULL;
 
     dst = block_dirty_bitmap_lookup(node, target, &bs, errp);
     if (!dst) {
@@ -297,9 +296,7 @@ BdrvDirtyBitmap *block_dirty_bitmap_merge(const char *node, const char *target,
             abort();
         }
 
-        bdrv_merge_dirty_bitmap(anon, src, NULL, &local_err);
-        if (local_err) {
-            error_propagate(errp, local_err);
+        if (!bdrv_merge_dirty_bitmap(anon, src, NULL, errp)) {
             dst = NULL;
             goto out;
         }
-- 
2.29.2




* [PATCH v2 03/19] block/block-copy: block_copy_state_new(): add bitmap parameter
From: Vladimir Sementsov-Ogievskiy @ 2021-08-27 18:17 UTC
  To: qemu-block
  Cc: qemu-devel, armbru, xiechanglong.d, wencongyang2, eblake, hreitz,
	kwolf, vsementsov, jsnow

This will be used in the following commit to bring "incremental" mode
to the copy-before-write filter.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
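Not part of the patch: a standalone sketch of the two initialization
paths this change introduces. The internal copy bitmap either starts
from the dirty bits of a caller-supplied bitmap or, when none is given,
from "everything dirty" as before. SketchBitmap and
sketch_init_copy_bitmap() are hypothetical names, and a single 64-bit
word stands in for a real dirty bitmap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy bitmap: bit i set means cluster i still has to be copied. */
typedef struct SketchBitmap {
    uint64_t bits;
} SketchBitmap;

/*
 * Initialize the internal copy bitmap.
 * With a user bitmap: start from its dirty bits only.
 * Without one: mark every cluster dirty.
 */
void sketch_init_copy_bitmap(SketchBitmap *copy, const SketchBitmap *user)
{
    assert(copy);

    copy->bits = 0;                 /* a freshly created bitmap is clean */
    if (user) {
        copy->bits |= user->bits;   /* merge the user's dirty bits */
    } else {
        copy->bits = UINT64_MAX;    /* no bitmap given: copy everything */
    }
}
```

Because the user bitmap is only read once, later changes to it do not
affect the copy bitmap in this sketch.
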
 include/block/block-copy.h |  2 +-
 block/block-copy.c         | 14 ++++++++++++--
 block/copy-before-write.c  |  2 +-
 3 files changed, 14 insertions(+), 4 deletions(-)

diff --git a/include/block/block-copy.h b/include/block/block-copy.h
index 99370fa38b..8da4cec1b6 100644
--- a/include/block/block-copy.h
+++ b/include/block/block-copy.h
@@ -25,7 +25,7 @@ typedef struct BlockCopyState BlockCopyState;
 typedef struct BlockCopyCallState BlockCopyCallState;
 
 BlockCopyState *block_copy_state_new(BdrvChild *source, BdrvChild *target,
-                                     Error **errp);
+                                     BdrvDirtyBitmap *bitmap, Error **errp);
 
 /* Function should be called prior any actual copy request */
 void block_copy_set_copy_opts(BlockCopyState *s, bool use_copy_range,
diff --git a/block/block-copy.c b/block/block-copy.c
index c39cb5fda7..65019d0d1d 100644
--- a/block/block-copy.c
+++ b/block/block-copy.c
@@ -383,8 +383,9 @@ static int64_t block_copy_calculate_cluster_size(BlockDriverState *target,
 }
 
 BlockCopyState *block_copy_state_new(BdrvChild *source, BdrvChild *target,
-                                     Error **errp)
+                                     BdrvDirtyBitmap *bitmap, Error **errp)
 {
+    ERRP_GUARD();
     BlockCopyState *s;
     int64_t cluster_size;
     BdrvDirtyBitmap *copy_bitmap;
@@ -401,7 +402,16 @@ BlockCopyState *block_copy_state_new(BdrvChild *source, BdrvChild *target,
         return NULL;
     }
     bdrv_disable_dirty_bitmap(copy_bitmap);
-    bdrv_set_dirty_bitmap(copy_bitmap, 0, bdrv_dirty_bitmap_size(copy_bitmap));
+    if (bitmap) {
+        if (!bdrv_merge_dirty_bitmap(copy_bitmap, bitmap, NULL, errp)) {
+            error_prepend(errp, "Failed to merge bitmap '%s' to internal "
+                          "copy-bitmap: ", bdrv_dirty_bitmap_name(bitmap));
+            return NULL;
+        }
+    } else {
+        bdrv_set_dirty_bitmap(copy_bitmap, 0,
+                              bdrv_dirty_bitmap_size(copy_bitmap));
+    }
 
     /*
      * If source is in backing chain of target assume that target is going to be
diff --git a/block/copy-before-write.c b/block/copy-before-write.c
index f5551cd15b..d31ca61111 100644
--- a/block/copy-before-write.c
+++ b/block/copy-before-write.c
@@ -169,7 +169,7 @@ static int cbw_open(BlockDriverState *bs, QDict *options, int flags,
             ((BDRV_REQ_FUA | BDRV_REQ_MAY_UNMAP | BDRV_REQ_NO_FALLBACK) &
              bs->file->bs->supported_zero_flags);
 
-    s->bcs = block_copy_state_new(bs->file, s->target, errp);
+    s->bcs = block_copy_state_new(bs->file, s->target, NULL, errp);
     if (!s->bcs) {
         error_prepend(errp, "Cannot create block-copy-state: ");
         return -EINVAL;
-- 
2.29.2




* [PATCH v2 04/19] block/copy-before-write: add bitmap open parameter
From: Vladimir Sementsov-Ogievskiy @ 2021-08-27 18:17 UTC
  To: qemu-block
  Cc: qemu-devel, armbru, xiechanglong.d, wencongyang2, eblake, hreitz,
	kwolf, vsementsov, jsnow

This brings "incremental" mode to the copy-before-write filter: the user
can specify a bitmap so that the filter copies only "dirty" areas.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
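Not part of the patch: a toy model of what "copy only dirty areas"
means for a guest write. Old data is saved to the target only when the
cluster is still marked dirty, and the bit is cleared afterwards.
SketchFilter, sketch_guest_write() and the tiny in-memory "images" are
hypothetical and only illustrate the idea; the real filter works through
block-copy and coroutines.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_CLUSTERS 8       /* hypothetical image size in clusters */
#define SKETCH_CLUSTER_SIZE 4   /* hypothetical cluster size in bytes */

typedef struct SketchFilter {
    uint8_t dirty;  /* bit i set: cluster i must be saved before overwrite */
    char source[SKETCH_CLUSTERS][SKETCH_CLUSTER_SIZE];
    char target[SKETCH_CLUSTERS][SKETCH_CLUSTER_SIZE];
} SketchFilter;

/* Guest write to one cluster. Returns true if old data was copied first. */
bool sketch_guest_write(SketchFilter *f, unsigned cluster, const char *data)
{
    bool copied = false;
    uint8_t mask;

    assert(f && data && cluster < SKETCH_CLUSTERS);
    mask = (uint8_t)(1u << cluster);

    if (f->dirty & mask) {
        /* copy-before-write: save the old contents to the target */
        memcpy(f->target[cluster], f->source[cluster], SKETCH_CLUSTER_SIZE);
        f->dirty &= (uint8_t)~mask;
        copied = true;
    }

    /* now the guest data may overwrite the source */
    memcpy(f->source[cluster], data, SKETCH_CLUSTER_SIZE);
    return copied;
}
```

Clusters whose bit was never set in the initial bitmap are overwritten
without any copy, which is the saving the bitmap parameter provides.
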
 qapi/block-core.json      | 10 +++++++++-
 block/copy-before-write.c | 30 +++++++++++++++++++++++++++++-
 2 files changed, 38 insertions(+), 2 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 6764d8b84f..c42d23752d 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -4063,11 +4063,19 @@
 #
 # @target: The target for copy-before-write operations.
 #
+# @bitmap: If specified, the copy-before-write filter performs
+#          copy-before-write operations only for dirty regions of the
+#          bitmap. The bitmap size must equal the length of the file and
+#          target children of the filter. Note also that the bitmap is
+#          used only to initialize the internal bitmap of the process, so
+#          later modification (or removal) of the specified bitmap does
+#          not influence the filter.
+#
 # Since: 6.2
 ##
 { 'struct': 'BlockdevOptionsCbw',
   'base': 'BlockdevOptionsGenericFormat',
-  'data': { 'target': 'BlockdevRef' } }
+  'data': { 'target': 'BlockdevRef', '*bitmap': 'BlockDirtyBitmap' } }
 
 ##
 # @BlockdevOptions:
diff --git a/block/copy-before-write.c b/block/copy-before-write.c
index d31ca61111..e3456ad6aa 100644
--- a/block/copy-before-write.c
+++ b/block/copy-before-write.c
@@ -148,6 +148,7 @@ static int cbw_open(BlockDriverState *bs, QDict *options, int flags,
                     Error **errp)
 {
     BDRVCopyBeforeWriteState *s = bs->opaque;
+    BdrvDirtyBitmap *bitmap = NULL;
 
     bs->file = bdrv_open_child(NULL, options, "file", bs, &child_of_bds,
                                BDRV_CHILD_FILTERED | BDRV_CHILD_PRIMARY,
@@ -162,6 +163,33 @@ static int cbw_open(BlockDriverState *bs, QDict *options, int flags,
         return -EINVAL;
     }
 
+    if (qdict_haskey(options, "bitmap.node") ||
+        qdict_haskey(options, "bitmap.name"))
+    {
+        const char *bitmap_node, *bitmap_name;
+
+        if (!qdict_haskey(options, "bitmap.node")) {
+            error_setg(errp, "bitmap.node is not specified");
+            return -EINVAL;
+        }
+
+        if (!qdict_haskey(options, "bitmap.name")) {
+            error_setg(errp, "bitmap.name is not specified");
+            return -EINVAL;
+        }
+
+        bitmap_node = qdict_get_str(options, "bitmap.node");
+        bitmap_name = qdict_get_str(options, "bitmap.name");
+        qdict_del(options, "bitmap.node");
+        qdict_del(options, "bitmap.name");
+
+        bitmap = block_dirty_bitmap_lookup(bitmap_node, bitmap_name, NULL,
+                                           errp);
+        if (!bitmap) {
+            return -EINVAL;
+        }
+    }
+
     bs->total_sectors = bs->file->bs->total_sectors;
     bs->supported_write_flags = BDRV_REQ_WRITE_UNCHANGED |
             (BDRV_REQ_FUA & bs->file->bs->supported_write_flags);
@@ -169,7 +197,7 @@ static int cbw_open(BlockDriverState *bs, QDict *options, int flags,
             ((BDRV_REQ_FUA | BDRV_REQ_MAY_UNMAP | BDRV_REQ_NO_FALLBACK) &
              bs->file->bs->supported_zero_flags);
 
-    s->bcs = block_copy_state_new(bs->file, s->target, NULL, errp);
+    s->bcs = block_copy_state_new(bs->file, s->target, bitmap, errp);
     if (!s->bcs) {
         error_prepend(errp, "Cannot create block-copy-state: ");
         return -EINVAL;
-- 
2.29.2




* [PATCH v2 05/19] block/block-copy: add block_copy_reset()
From: Vladimir Sementsov-Ogievskiy @ 2021-08-27 18:17 UTC
  To: qemu-block
  Cc: qemu-devel, armbru, xiechanglong.d, wencongyang2, eblake, hreitz,
	kwolf, vsementsov, jsnow

Split block_copy_reset() out of block_copy_reset_unallocated() to be
used separately later.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
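Not part of the patch: a single-threaded sketch of the bookkeeping that
the split-out helper performs, namely clearing a range in the copy
bitmap and recomputing the remaining work as dirty bytes plus in-flight
bytes. SketchState and the sketch_*() functions are hypothetical; the
sketch addresses clusters by index instead of byte offsets and leaves
out the locking done by the real code.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_CLUSTER_SIZE INT64_C(65536)  /* hypothetical cluster size */

typedef struct SketchState {
    uint64_t dirty;            /* bit i set: cluster i still to be copied */
    int64_t in_flight_bytes;   /* bytes of copy requests in progress */
    int64_t remaining;         /* progress value, in bytes */
} SketchState;

/* Number of bytes covered by dirty clusters. */
int64_t sketch_dirty_bytes(const SketchState *s)
{
    int64_t clusters = 0;
    uint64_t bits;

    for (bits = s->dirty; bits; bits >>= 1) {
        clusters += (int64_t)(bits & 1);
    }
    return clusters * SKETCH_CLUSTER_SIZE;
}

/* Clear @count clusters starting at @first and refresh the progress. */
void sketch_reset(SketchState *s, unsigned first, unsigned count)
{
    unsigned i;

    assert(s && first + count <= 64);

    for (i = first; i < first + count; i++) {
        s->dirty &= ~(UINT64_C(1) << i);
    }
    s->remaining = sketch_dirty_bytes(s) + s->in_flight_bytes;
}
```
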
 include/block/block-copy.h |  1 +
 block/block-copy.c         | 21 +++++++++++++--------
 2 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/include/block/block-copy.h b/include/block/block-copy.h
index 8da4cec1b6..a11e1620f6 100644
--- a/include/block/block-copy.h
+++ b/include/block/block-copy.h
@@ -34,6 +34,7 @@ void block_copy_set_progress_meter(BlockCopyState *s, ProgressMeter *pm);
 
 void block_copy_state_free(BlockCopyState *s);
 
+void block_copy_reset(BlockCopyState *s, int64_t offset, int64_t bytes);
 int64_t block_copy_reset_unallocated(BlockCopyState *s,
                                      int64_t offset, int64_t *count);
 
diff --git a/block/block-copy.c b/block/block-copy.c
index 65019d0d1d..ca51eab149 100644
--- a/block/block-copy.c
+++ b/block/block-copy.c
@@ -689,6 +689,18 @@ static int block_copy_is_cluster_allocated(BlockCopyState *s, int64_t offset,
     }
 }
 
+void block_copy_reset(BlockCopyState *s, int64_t offset, int64_t bytes)
+{
+    QEMU_LOCK_GUARD(&s->lock);
+
+    bdrv_reset_dirty_bitmap(s->copy_bitmap, offset, bytes);
+    if (s->progress) {
+        progress_set_remaining(s->progress,
+                               bdrv_get_dirty_count(s->copy_bitmap) +
+                               s->in_flight_bytes);
+    }
+}
+
 /*
  * Reset bits in copy_bitmap starting at offset if they represent unallocated
  * data in the image. May reset subsequent contiguous bits.
@@ -709,14 +721,7 @@ int64_t block_copy_reset_unallocated(BlockCopyState *s,
     bytes = clusters * s->cluster_size;
 
     if (!ret) {
-        qemu_co_mutex_lock(&s->lock);
-        bdrv_reset_dirty_bitmap(s->copy_bitmap, offset, bytes);
-        if (s->progress) {
-            progress_set_remaining(s->progress,
-                                   bdrv_get_dirty_count(s->copy_bitmap) +
-                                   s->in_flight_bytes);
-        }
-        qemu_co_mutex_unlock(&s->lock);
+        block_copy_reset(s, offset, bytes);
     }
 
     *count = bytes;
-- 
2.29.2




* [PATCH v2 06/19] block: intoduce reqlist
From: Vladimir Sementsov-Ogievskiy @ 2021-08-27 18:17 UTC
  To: qemu-block
  Cc: qemu-devel, armbru, xiechanglong.d, wencongyang2, eblake, hreitz,
	kwolf, vsementsov, jsnow

Split the intersecting-requests functionality out of block-copy so that
it can be reused in the copy-before-write filter.

Note: while here, fix a tiny typo in MAINTAINERS.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
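Not part of the patch: a standalone sketch of the conflict check that
moves into reqlist. Two half-open byte ranges intersect when each one
starts before the other ends. SketchReq and the sketch_*() helpers are
hypothetical and use a plain singly linked list in place of QLIST; wait
queues and coroutines are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct SketchReq {
    int64_t offset;
    int64_t bytes;
    struct SketchReq *next;
} SketchReq;

/* Initialize @req and insert it at the head of the list. */
void sketch_add_req(SketchReq **head, SketchReq *req,
                    int64_t offset, int64_t bytes)
{
    assert(head && req && bytes > 0);

    req->offset = offset;
    req->bytes = bytes;
    req->next = *head;
    *head = req;
}

/* Return the first request intersecting [offset, offset + bytes), or NULL. */
SketchReq *sketch_find_conflict(SketchReq *head, int64_t offset, int64_t bytes)
{
    SketchReq *r;

    for (r = head; r; r = r->next) {
        if (offset + bytes > r->offset && offset < r->offset + r->bytes) {
            return r;
        }
    }
    return NULL;
}
```

Ranges that merely touch, such as [0, 10) and [10, 20), do not count as
a conflict under this condition.
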
 include/block/reqlist.h |  67 +++++++++++++++++++++++
 block/block-copy.c      | 116 +++++++++++++---------------------------
 block/reqlist.c         |  76 ++++++++++++++++++++++++++
 MAINTAINERS             |   4 +-
 block/meson.build       |   1 +
 5 files changed, 184 insertions(+), 80 deletions(-)
 create mode 100644 include/block/reqlist.h
 create mode 100644 block/reqlist.c

diff --git a/include/block/reqlist.h b/include/block/reqlist.h
new file mode 100644
index 0000000000..b904d80216
--- /dev/null
+++ b/include/block/reqlist.h
@@ -0,0 +1,67 @@
+/*
+ * reqlist API
+ *
+ * Copyright (C) 2013 Proxmox Server Solutions
+ * Copyright (c) 2021 Virtuozzo International GmbH.
+ *
+ * Authors:
+ *  Dietmar Maurer (dietmar@proxmox.com)
+ *  Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#ifndef REQLIST_H
+#define REQLIST_H
+
+#include "qemu/coroutine.h"
+
+/*
+ * The API is not thread-safe and shouldn't be. The struct is public to be part
+ * of other structures and protected by third-party locks, see
+ * block/block-copy.c for example.
+ */
+
+typedef struct BlockReq {
+    int64_t offset;
+    int64_t bytes;
+
+    CoQueue wait_queue; /* coroutines blocked on this req */
+    QLIST_ENTRY(BlockReq) list;
+} BlockReq;
+
+typedef QLIST_HEAD(, BlockReq) BlockReqList;
+
+/*
+ * Initialize new request and add it to the list. Caller should be sure that
+ * there are no conflicting requests in the list.
+ */
+void reqlist_init_req(BlockReqList *reqs, BlockReq *req, int64_t offset,
+                      int64_t bytes);
+/* Search for request in the list intersecting with @offset/@bytes area. */
+BlockReq *reqlist_find_conflict(BlockReqList *reqs, int64_t offset,
+                                int64_t bytes);
+
+/*
+ * If there are no intersecting requests return false. Otherwise, wait for the
+ * first found intersecting request to finish and return true.
+ *
+ * @lock is passed to qemu_co_queue_wait()
+ * False return value proves that lock was NOT released.
+ */
+bool coroutine_fn reqlist_wait_one(BlockReqList *reqs, int64_t offset,
+                                   int64_t bytes, CoMutex *lock);
+
+/*
+ * Shrink request and wake all waiting coroutines (may be some of them are not
+ * intersecting with shrunk request).
+ */
+void coroutine_fn reqlist_shrink_req(BlockReq *req, int64_t new_bytes);
+
+/*
+ * Remove request and wake all waiting coroutines. Do not release any memory.
+ */
+void coroutine_fn reqlist_remove_req(BlockReq *req);
+
+#endif /* REQLIST_H */
diff --git a/block/block-copy.c b/block/block-copy.c
index ca51eab149..46e6a6736d 100644
--- a/block/block-copy.c
+++ b/block/block-copy.c
@@ -17,6 +17,7 @@
 #include "trace.h"
 #include "qapi/error.h"
 #include "block/block-copy.h"
+#include "block/reqlist.h"
 #include "sysemu/block-backend.h"
 #include "qemu/units.h"
 #include "qemu/coroutine.h"
@@ -82,7 +83,6 @@ typedef struct BlockCopyTask {
      */
     BlockCopyState *s;
     BlockCopyCallState *call_state;
-    int64_t offset;
     /*
      * @method can also be set again in the while loop of
      * block_copy_dirty_clusters(), but it is never accessed concurrently
@@ -93,21 +93,17 @@ typedef struct BlockCopyTask {
     BlockCopyMethod method;
 
     /*
-     * Fields whose state changes throughout the execution
-     * Protected by lock in BlockCopyState.
+     * Generally, req is protected by lock in BlockCopyState. Still, req.offset
+     * is only set on task creation, so may be read concurrently after creation.
+     * req.bytes is changed at most once, and needs protection only for the case
+     * of parallel read while updating @bytes value in block_copy_task_shrink().
      */
-    CoQueue wait_queue; /* coroutines blocked on this task */
-    /*
-     * Only protect the case of parallel read while updating @bytes
-     * value in block_copy_task_shrink().
-     */
-    int64_t bytes;
-    QLIST_ENTRY(BlockCopyTask) list;
+    BlockReq req;
 } BlockCopyTask;
 
 static int64_t task_end(BlockCopyTask *task)
 {
-    return task->offset + task->bytes;
+    return task->req.offset + task->req.bytes;
 }
 
 typedef struct BlockCopyState {
@@ -135,7 +131,7 @@ typedef struct BlockCopyState {
     CoMutex lock;
     int64_t in_flight_bytes;
     BlockCopyMethod method;
-    QLIST_HEAD(, BlockCopyTask) tasks; /* All tasks from all block-copy calls */
+    BlockReqList reqs;
     QLIST_HEAD(, BlockCopyCallState) calls;
     /*
      * skip_unallocated:
@@ -159,42 +155,6 @@ typedef struct BlockCopyState {
     RateLimit rate_limit;
 } BlockCopyState;
 
-/* Called with lock held */
-static BlockCopyTask *find_conflicting_task(BlockCopyState *s,
-                                            int64_t offset, int64_t bytes)
-{
-    BlockCopyTask *t;
-
-    QLIST_FOREACH(t, &s->tasks, list) {
-        if (offset + bytes > t->offset && offset < t->offset + t->bytes) {
-            return t;
-        }
-    }
-
-    return NULL;
-}
-
-/*
- * If there are no intersecting tasks return false. Otherwise, wait for the
- * first found intersecting tasks to finish and return true.
- *
- * Called with lock held. May temporary release the lock.
- * Return value of 0 proves that lock was NOT released.
- */
-static bool coroutine_fn block_copy_wait_one(BlockCopyState *s, int64_t offset,
-                                             int64_t bytes)
-{
-    BlockCopyTask *task = find_conflicting_task(s, offset, bytes);
-
-    if (!task) {
-        return false;
-    }
-
-    qemu_co_queue_wait(&task->wait_queue, &s->lock);
-
-    return true;
-}
-
 /* Called with lock held */
 static int64_t block_copy_chunk_size(BlockCopyState *s)
 {
@@ -238,7 +198,7 @@ block_copy_task_create(BlockCopyState *s, BlockCopyCallState *call_state,
     bytes = QEMU_ALIGN_UP(bytes, s->cluster_size);
 
     /* region is dirty, so no existent tasks possible in it */
-    assert(!find_conflicting_task(s, offset, bytes));
+    assert(!reqlist_find_conflict(&s->reqs, offset, bytes));
 
     bdrv_reset_dirty_bitmap(s->copy_bitmap, offset, bytes);
     s->in_flight_bytes += bytes;
@@ -248,12 +208,9 @@ block_copy_task_create(BlockCopyState *s, BlockCopyCallState *call_state,
         .task.func = block_copy_task_entry,
         .s = s,
         .call_state = call_state,
-        .offset = offset,
-        .bytes = bytes,
         .method = s->method,
     };
-    qemu_co_queue_init(&task->wait_queue);
-    QLIST_INSERT_HEAD(&s->tasks, task, list);
+    reqlist_init_req(&s->reqs, &task->req, offset, bytes);
 
     return task;
 }
@@ -269,34 +226,34 @@ static void coroutine_fn block_copy_task_shrink(BlockCopyTask *task,
                                                 int64_t new_bytes)
 {
     QEMU_LOCK_GUARD(&task->s->lock);
-    if (new_bytes == task->bytes) {
+    if (new_bytes == task->req.bytes) {
         return;
     }
 
-    assert(new_bytes > 0 && new_bytes < task->bytes);
+    assert(new_bytes > 0 && new_bytes < task->req.bytes);
 
-    task->s->in_flight_bytes -= task->bytes - new_bytes;
+    task->s->in_flight_bytes -= task->req.bytes - new_bytes;
     bdrv_set_dirty_bitmap(task->s->copy_bitmap,
-                          task->offset + new_bytes, task->bytes - new_bytes);
+                          task->req.offset + new_bytes,
+                          task->req.bytes - new_bytes);
 
-    task->bytes = new_bytes;
-    qemu_co_queue_restart_all(&task->wait_queue);
+    reqlist_shrink_req(&task->req, new_bytes);
 }
 
 static void coroutine_fn block_copy_task_end(BlockCopyTask *task, int ret)
 {
     QEMU_LOCK_GUARD(&task->s->lock);
-    task->s->in_flight_bytes -= task->bytes;
+    task->s->in_flight_bytes -= task->req.bytes;
     if (ret < 0) {
-        bdrv_set_dirty_bitmap(task->s->copy_bitmap, task->offset, task->bytes);
+        bdrv_set_dirty_bitmap(task->s->copy_bitmap, task->req.offset,
+                              task->req.bytes);
     }
-    QLIST_REMOVE(task, list);
     if (task->s->progress) {
         progress_set_remaining(task->s->progress,
                                bdrv_get_dirty_count(task->s->copy_bitmap) +
                                task->s->in_flight_bytes);
     }
-    qemu_co_queue_restart_all(&task->wait_queue);
+    reqlist_remove_req(&task->req);
 }
 
 void block_copy_state_free(BlockCopyState *s)
@@ -447,7 +404,7 @@ BlockCopyState *block_copy_state_new(BdrvChild *source, BdrvChild *target,
 
     ratelimit_init(&s->rate_limit);
     qemu_co_mutex_init(&s->lock);
-    QLIST_INIT(&s->tasks);
+    QLIST_INIT(&s->reqs);
     QLIST_INIT(&s->calls);
 
     return s;
@@ -480,7 +437,7 @@ static coroutine_fn int block_copy_task_run(AioTaskPool *pool,
 
     aio_task_pool_wait_slot(pool);
     if (aio_task_pool_status(pool) < 0) {
-        co_put_to_shres(task->s->mem, task->bytes);
+        co_put_to_shres(task->s->mem, task->req.bytes);
         block_copy_task_end(task, -ECANCELED);
         g_free(task);
         return -ECANCELED;
@@ -593,7 +550,8 @@ static coroutine_fn int block_copy_task_entry(AioTask *task)
     BlockCopyMethod method = t->method;
     int ret;
 
-    ret = block_copy_do_copy(s, t->offset, t->bytes, &method, &error_is_read);
+    ret = block_copy_do_copy(s, t->req.offset, t->req.bytes, &method,
+                             &error_is_read);
 
     WITH_QEMU_LOCK_GUARD(&s->lock) {
         if (s->method == t->method) {
@@ -606,10 +564,10 @@ static coroutine_fn int block_copy_task_entry(AioTask *task)
                 t->call_state->error_is_read = error_is_read;
             }
         } else if (s->progress) {
-            progress_work_done(s->progress, t->bytes);
+            progress_work_done(s->progress, t->req.bytes);
         }
     }
-    co_put_to_shres(s->mem, t->bytes);
+    co_put_to_shres(s->mem, t->req.bytes);
     block_copy_task_end(t, ret);
 
     return ret;
@@ -768,22 +726,22 @@ block_copy_dirty_clusters(BlockCopyCallState *call_state)
             trace_block_copy_skip_range(s, offset, bytes);
             break;
         }
-        if (task->offset > offset) {
-            trace_block_copy_skip_range(s, offset, task->offset - offset);
+        if (task->req.offset > offset) {
+            trace_block_copy_skip_range(s, offset, task->req.offset - offset);
         }
 
         found_dirty = true;
 
-        ret = block_copy_block_status(s, task->offset, task->bytes,
+        ret = block_copy_block_status(s, task->req.offset, task->req.bytes,
                                       &status_bytes);
         assert(ret >= 0); /* never fail */
-        if (status_bytes < task->bytes) {
+        if (status_bytes < task->req.bytes) {
             block_copy_task_shrink(task, status_bytes);
         }
         if (qatomic_read(&s->skip_unallocated) &&
             !(ret & BDRV_BLOCK_ALLOCATED)) {
             block_copy_task_end(task, 0);
-            trace_block_copy_skip_range(s, task->offset, task->bytes);
+            trace_block_copy_skip_range(s, task->req.offset, task->req.bytes);
             offset = task_end(task);
             bytes = end - offset;
             g_free(task);
@@ -804,11 +762,11 @@ block_copy_dirty_clusters(BlockCopyCallState *call_state)
             }
         }
 
-        ratelimit_calculate_delay(&s->rate_limit, task->bytes);
+        ratelimit_calculate_delay(&s->rate_limit, task->req.bytes);
 
-        trace_block_copy_process(s, task->offset);
+        trace_block_copy_process(s, task->req.offset);
 
-        co_get_from_shres(s->mem, task->bytes);
+        co_get_from_shres(s->mem, task->req.bytes);
 
         offset = task_end(task);
         bytes = end - offset;
@@ -876,8 +834,8 @@ static int coroutine_fn block_copy_common(BlockCopyCallState *call_state)
                  * Check that there is no task we still need to
                  * wait to complete
                  */
-                ret = block_copy_wait_one(s, call_state->offset,
-                                          call_state->bytes);
+                ret = reqlist_wait_one(&s->reqs, call_state->offset,
+                                       call_state->bytes, &s->lock);
                 if (ret == 0) {
                     /*
                      * No pending tasks, but check again the bitmap in this
@@ -885,7 +843,7 @@ static int coroutine_fn block_copy_common(BlockCopyCallState *call_state)
                      * between this and the critical section in
                      * block_copy_dirty_clusters().
                      *
-                     * block_copy_wait_one return value 0 also means that it
+                     * reqlist_wait_one return value 0 also means that it
                      * didn't release the lock. So, we are still in the same
                      * critical section, not interrupted by any concurrent
                      * access to state.
diff --git a/block/reqlist.c b/block/reqlist.c
new file mode 100644
index 0000000000..5e320ba649
--- /dev/null
+++ b/block/reqlist.c
@@ -0,0 +1,76 @@
+/*
+ * reqlist API
+ *
+ * Copyright (C) 2013 Proxmox Server Solutions
+ * Copyright (c) 2021 Virtuozzo International GmbH.
+ *
+ * Authors:
+ *  Dietmar Maurer (dietmar@proxmox.com)
+ *  Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+
+#include "block/reqlist.h"
+
+void reqlist_init_req(BlockReqList *reqs, BlockReq *req, int64_t offset,
+                      int64_t bytes)
+{
+    assert(!reqlist_find_conflict(reqs, offset, bytes));
+
+    *req = (BlockReq) {
+        .offset = offset,
+        .bytes = bytes,
+    };
+    qemu_co_queue_init(&req->wait_queue);
+    QLIST_INSERT_HEAD(reqs, req, list);
+}
+
+BlockReq *reqlist_find_conflict(BlockReqList *reqs, int64_t offset,
+                                int64_t bytes)
+{
+    BlockReq *r;
+
+    QLIST_FOREACH(r, reqs, list) {
+        if (offset + bytes > r->offset && offset < r->offset + r->bytes) {
+            return r;
+        }
+    }
+
+    return NULL;
+}
+
+bool coroutine_fn reqlist_wait_one(BlockReqList *reqs, int64_t offset,
+                                   int64_t bytes, CoMutex *lock)
+{
+    BlockReq *r = reqlist_find_conflict(reqs, offset, bytes);
+
+    if (!r) {
+        return false;
+    }
+
+    qemu_co_queue_wait(&r->wait_queue, lock);
+
+    return true;
+}
+
+void coroutine_fn reqlist_shrink_req(BlockReq *req, int64_t new_bytes)
+{
+    if (new_bytes == req->bytes) {
+        return;
+    }
+
+    assert(new_bytes > 0 && new_bytes < req->bytes);
+
+    req->bytes = new_bytes;
+    qemu_co_queue_restart_all(&req->wait_queue);
+}
+
+void coroutine_fn reqlist_remove_req(BlockReq *req)
+{
+    QLIST_REMOVE(req, list);
+    qemu_co_queue_restart_all(&req->wait_queue);
+}
diff --git a/MAINTAINERS b/MAINTAINERS
index ece044bd25..2d58170224 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2336,7 +2336,9 @@ F: block/stream.c
 F: block/mirror.c
 F: qapi/job.json
 F: block/block-copy.c
-F: include/block/block-copy.c
+F: include/block/block-copy.h
+F: block/reqlist.c
+F: include/block/reqlist.h
 F: block/copy-before-write.h
 F: block/copy-before-write.c
 F: include/block/aio_task.h
diff --git a/block/meson.build b/block/meson.build
index 66ee11e62c..c630e9374a 100644
--- a/block/meson.build
+++ b/block/meson.build
@@ -32,6 +32,7 @@ block_ss.add(files(
   'qcow2.c',
   'quorum.c',
   'raw-format.c',
+  'reqlist.c',
   'snapshot.c',
   'throttle-groups.c',
   'throttle.c',
-- 
2.29.2




* [PATCH v2 07/19] block/dirty-bitmap: introduce bdrv_dirty_bitmap_status()
  2021-08-27 18:17 [PATCH v2 00/19] Make image fleecing more usable Vladimir Sementsov-Ogievskiy
                   ` (5 preceding siblings ...)
  2021-08-27 18:17 ` [PATCH v2 06/19] block: intoduce reqlist Vladimir Sementsov-Ogievskiy
@ 2021-08-27 18:17 ` Vladimir Sementsov-Ogievskiy
  2021-08-27 18:17 ` [PATCH v2 08/19] block/reqlist: add reqlist_wait_all() Vladimir Sementsov-Ogievskiy
                   ` (12 subsequent siblings)
  19 siblings, 0 replies; 24+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-08-27 18:17 UTC (permalink / raw)
  To: qemu-block
  Cc: qemu-devel, armbru, xiechanglong.d, wencongyang2, eblake, hreitz,
	kwolf, vsementsov, jsnow

Add a convenience function, similar to bdrv_block_status(), to get the
status of a dirty bitmap.
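
For illustration only (not part of the patch), here is a minimal standalone
sketch of the intended semantics. It assumes a plain bool array in place of
HBitmap; toy_bitmap_status() and its array argument are hypothetical names
that do not exist in QEMU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Toy model, not QEMU code: report whether bits[start] is dirty and how many
 * consecutive bits, starting at @start and limited to @count, share that
 * value. This mirrors the "status plus run length" contract of
 * bdrv_block_status().
 */
static void toy_bitmap_status(const bool *bits, int64_t start, int64_t count,
                              bool *is_dirty, int64_t *pnum)
{
    int64_t n = 1;

    assert(start >= 0 && count > 0);

    *is_dirty = bits[start];
    /* Extend the run while the following bits have the same value */
    while (n < count && bits[start + n] == *is_dirty) {
        n++;
    }
    *pnum = n;
}
```

A caller would typically loop, advancing its offset by the returned run
length until the whole requested area is covered.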

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 include/block/dirty-bitmap.h |  2 ++
 include/qemu/hbitmap.h       | 11 +++++++++++
 block/dirty-bitmap.c         |  6 ++++++
 util/hbitmap.c               | 36 ++++++++++++++++++++++++++++++++++++
 4 files changed, 55 insertions(+)

diff --git a/include/block/dirty-bitmap.h b/include/block/dirty-bitmap.h
index f95d350b70..2ae7dc3d1d 100644
--- a/include/block/dirty-bitmap.h
+++ b/include/block/dirty-bitmap.h
@@ -115,6 +115,8 @@ int64_t bdrv_dirty_bitmap_next_zero(BdrvDirtyBitmap *bitmap, int64_t offset,
 bool bdrv_dirty_bitmap_next_dirty_area(BdrvDirtyBitmap *bitmap,
         int64_t start, int64_t end, int64_t max_dirty_count,
         int64_t *dirty_start, int64_t *dirty_count);
+void bdrv_dirty_bitmap_status(BdrvDirtyBitmap *bitmap, int64_t offset,
+                              int64_t bytes, bool *is_dirty, int64_t *count);
 BdrvDirtyBitmap *bdrv_reclaim_dirty_bitmap_locked(BdrvDirtyBitmap *bitmap,
                                                   Error **errp);
 
diff --git a/include/qemu/hbitmap.h b/include/qemu/hbitmap.h
index 5e71b6d6f7..845fda12db 100644
--- a/include/qemu/hbitmap.h
+++ b/include/qemu/hbitmap.h
@@ -340,6 +340,17 @@ bool hbitmap_next_dirty_area(const HBitmap *hb, int64_t start, int64_t end,
                              int64_t max_dirty_count,
                              int64_t *dirty_start, int64_t *dirty_count);
 
+/*
+ * hbitmap_status:
+ * @hb: The HBitmap to operate on
+ * @offset: the offset to start from
+ * @bytes: length of the requested area
+ * @is_dirty: is bitmap dirty at @offset
+ * @pnum: how many bits have the same value starting from @offset
+ */
+void hbitmap_status(const HBitmap *hb, int64_t offset, int64_t bytes,
+                    bool *is_dirty, int64_t *pnum);
+
 /**
  * hbitmap_iter_next:
  * @hbi: HBitmapIter to operate on.
diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index 94a0276833..e4a836749a 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -875,6 +875,12 @@ bool bdrv_dirty_bitmap_next_dirty_area(BdrvDirtyBitmap *bitmap,
                                    dirty_start, dirty_count);
 }
 
+void bdrv_dirty_bitmap_status(BdrvDirtyBitmap *bitmap, int64_t offset,
+                              int64_t bytes, bool *is_dirty, int64_t *count)
+{
+    hbitmap_status(bitmap->bitmap, offset, bytes, is_dirty, count);
+}
+
 /**
  * bdrv_merge_dirty_bitmap: merge src into dest.
  * Ensures permissions on bitmaps are reasonable; use for public API.
diff --git a/util/hbitmap.c b/util/hbitmap.c
index 305b894a63..ae8d0eb4d2 100644
--- a/util/hbitmap.c
+++ b/util/hbitmap.c
@@ -301,6 +301,42 @@ bool hbitmap_next_dirty_area(const HBitmap *hb, int64_t start, int64_t end,
     return true;
 }
 
+void hbitmap_status(const HBitmap *hb, int64_t start, int64_t count,
+                    bool *is_dirty, int64_t *pnum)
+{
+    int64_t next_dirty, next_zero;
+
+    assert(start >= 0);
+    assert(count > 0);
+    assert(start + count <= hb->orig_size);
+
+    next_dirty = hbitmap_next_dirty(hb, start, count);
+    if (next_dirty == -1) {
+        *pnum = count;
+        *is_dirty = false;
+        return;
+    }
+
+    if (next_dirty > start) {
+        *pnum = next_dirty - start;
+        *is_dirty = false;
+        return;
+    }
+
+    assert(next_dirty == start);
+
+    next_zero = hbitmap_next_zero(hb, start, count);
+    if (next_zero == -1) {
+        *pnum = count;
+        *is_dirty = true;
+        return;
+    }
+
+    assert(next_zero > start);
+    *pnum = next_zero - start;
+    *is_dirty = true;
+}
+
 bool hbitmap_empty(const HBitmap *hb)
 {
     return hb->count == 0;
-- 
2.29.2




* [PATCH v2 08/19] block/reqlist: add reqlist_wait_all()
  2021-08-27 18:17 [PATCH v2 00/19] Make image fleecing more usable Vladimir Sementsov-Ogievskiy
                   ` (6 preceding siblings ...)
  2021-08-27 18:17 ` [PATCH v2 07/19] block/dirty-bitmap: introduce bdrv_dirty_bitmap_status() Vladimir Sementsov-Ogievskiy
@ 2021-08-27 18:17 ` Vladimir Sementsov-Ogievskiy
  2021-08-27 18:17 ` [PATCH v2 09/19] block: introduce FleecingState class Vladimir Sementsov-Ogievskiy
                   ` (11 subsequent siblings)
  19 siblings, 0 replies; 24+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-08-27 18:17 UTC (permalink / raw)
  To: qemu-block
  Cc: qemu-devel, armbru, xiechanglong.d, wencongyang2, eblake, hreitz,
	kwolf, vsementsov, jsnow

Add a function to wait for all intersecting requests.
To be used in a later commit.
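
For illustration only (not part of the patch), the sketch below models what
"intersecting" means and how a wait-all loop terminates. It assumes a plain
array instead of a QLIST and replaces the coroutine wait with simply marking
the request finished; ToyReq and the toy_* functions are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct ToyReq {
    int64_t offset;
    int64_t bytes;
    bool active;
} ToyReq;

/* Half-open ranges intersect iff each one starts before the other ends */
static bool toy_ranges_intersect(int64_t a_off, int64_t a_bytes,
                                 int64_t b_off, int64_t b_bytes)
{
    return a_off + a_bytes > b_off && a_off < b_off + b_bytes;
}

/* Return the first active request intersecting the range, or NULL */
static ToyReq *toy_find_conflict(ToyReq *reqs, size_t n,
                                 int64_t offset, int64_t bytes)
{
    size_t i;

    assert(bytes > 0);

    for (i = 0; i < n; i++) {
        if (reqs[i].active &&
            toy_ranges_intersect(offset, bytes,
                                 reqs[i].offset, reqs[i].bytes)) {
            return &reqs[i];
        }
    }

    return NULL;
}

/*
 * Model of the wait-all loop. Marking a request inactive stands in for the
 * real wait returning after the request has completed. Returns how many
 * requests were waited for.
 */
static int toy_wait_all(ToyReq *reqs, size_t n, int64_t offset, int64_t bytes)
{
    ToyReq *r;
    int waited = 0;

    while ((r = toy_find_conflict(reqs, n, offset, bytes))) {
        r->active = false;
        waited++;
    }

    return waited;
}
```

The loop only ends once no active request intersects the range, which is why
the caller must not keep adding new requests to that range in parallel.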

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 include/block/reqlist.h | 8 ++++++++
 block/reqlist.c         | 8 ++++++++
 2 files changed, 16 insertions(+)

diff --git a/include/block/reqlist.h b/include/block/reqlist.h
index b904d80216..4695623bb3 100644
--- a/include/block/reqlist.h
+++ b/include/block/reqlist.h
@@ -53,6 +53,14 @@ BlockReq *reqlist_find_conflict(BlockReqList *reqs, int64_t offset,
 bool coroutine_fn reqlist_wait_one(BlockReqList *reqs, int64_t offset,
                                    int64_t bytes, CoMutex *lock);
 
+/*
+ * Wait for all intersecting requests. It just calls reqlist_wait_one() in a
+ * loop; the caller is responsible for not producing new requests in this
+ * region in parallel, otherwise reqlist_wait_all() may never return.
+ */
+void coroutine_fn reqlist_wait_all(BlockReqList *reqs, int64_t offset,
+                                   int64_t bytes, CoMutex *lock);
+
 /*
  * Shrink request and wake all waiting coroutines (may be some of them are not
  * intersecting with shrunk request).
diff --git a/block/reqlist.c b/block/reqlist.c
index 5e320ba649..52a362a1d8 100644
--- a/block/reqlist.c
+++ b/block/reqlist.c
@@ -57,6 +57,14 @@ bool coroutine_fn reqlist_wait_one(BlockReqList *reqs, int64_t offset,
     return true;
 }
 
+void coroutine_fn reqlist_wait_all(BlockReqList *reqs, int64_t offset,
+                                   int64_t bytes, CoMutex *lock)
+{
+    while (reqlist_wait_one(reqs, offset, bytes, lock)) {
+        /* continue */
+    }
+}
+
 void coroutine_fn reqlist_shrink_req(BlockReq *req, int64_t new_bytes)
 {
     if (new_bytes == req->bytes) {
-- 
2.29.2




* [PATCH v2 09/19] block: introduce FleecingState class
  2021-08-27 18:17 [PATCH v2 00/19] Make image fleecing more usable Vladimir Sementsov-Ogievskiy
                   ` (7 preceding siblings ...)
  2021-08-27 18:17 ` [PATCH v2 08/19] block/reqlist: add reqlist_wait_all() Vladimir Sementsov-Ogievskiy
@ 2021-08-27 18:17 ` Vladimir Sementsov-Ogievskiy
  2021-08-27 18:17 ` [PATCH v2 10/19] block: introduce fleecing block driver Vladimir Sementsov-Ogievskiy
                   ` (10 subsequent siblings)
  19 siblings, 0 replies; 24+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-08-27 18:17 UTC (permalink / raw)
  To: qemu-block
  Cc: qemu-devel, armbru, xiechanglong.d, wencongyang2, eblake, hreitz,
	kwolf, vsementsov, jsnow

FleecingState represents the state shared between the copy-before-write
filter and the upcoming fleecing block driver.
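
For illustration only (not part of the patch), the read-side decision that
FleecingState makes can be modelled as below. The sketch assumes one bool per
cluster in place of the two dirty bitmaps and leaves out locking and request
freezing; ToyFleecing and toy_read_where() are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_CLUSTERS 8

typedef struct ToyFleecing {
    bool access[TOY_CLUSTERS]; /* cluster may be read by the fleecing user */
    bool done[TOY_CLUSTERS];   /* cluster was already copied to the target */
} ToyFleecing;

/*
 * Decide where a fleecing read starting at cluster @start must go.
 * Fails with -EACCES if any requested cluster is not accessible. Otherwise
 * sets @from_target (true: read the target, false: read the source) and
 * @pnum, the number of clusters that share this answer.
 */
static int toy_read_where(const ToyFleecing *f, int64_t start, int64_t count,
                          bool *from_target, int64_t *pnum)
{
    int64_t i;

    assert(start >= 0 && count > 0 && start + count <= TOY_CLUSTERS);

    for (i = start; i < start + count; i++) {
        if (!f->access[i]) {
            return -EACCES;
        }
    }

    *from_target = f->done[start];
    for (i = start + 1; i < start + count; i++) {
        if (f->done[i] != *from_target) {
            break;
        }
    }
    *pnum = i - start;

    return 0;
}
```

In the real code the "read the source" case additionally registers a frozen
read request, so that guest writes to that area wait until the read finishes.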

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/fleecing.h  | 135 ++++++++++++++++++++++++++++++++++
 block/fleecing.c  | 182 ++++++++++++++++++++++++++++++++++++++++++++++
 MAINTAINERS       |   2 +
 block/meson.build |   1 +
 4 files changed, 320 insertions(+)
 create mode 100644 block/fleecing.h
 create mode 100644 block/fleecing.c

diff --git a/block/fleecing.h b/block/fleecing.h
new file mode 100644
index 0000000000..fb7b2f86c4
--- /dev/null
+++ b/block/fleecing.h
@@ -0,0 +1,135 @@
+/*
+ * FleecingState
+ *
+ * The common state of image fleecing, shared between copy-before-write filter
+ * and fleecing block driver.
+ *
+ * Copyright (c) 2021 Virtuozzo International GmbH.
+ *
+ * Author:
+ *  Sementsov-Ogievskiy Vladimir <vsementsov@virtuozzo.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see <http://www.gnu.org/licenses/>.
+ *
+ *
+ * Fleecing scheme looks as follows:
+ *
+ * [guest blk]                   [nbd export]
+ *    |                              |
+ *    |root                          |
+ *    v                              v
+ * [copy-before-write]--target-->[fleecing drv]
+ *    |                          /   |
+ *    |file                     /    |file
+ *    v                        /     v
+ * [active disk]<--source-----/  [temp disk]
+ *
+ * Note that "active disk" is also called just "source" and "temp disk" is also
+ * called "target".
+ *
+ * What happens here:
+ *
+ * copy-before-write filter performs copy-before-write operations: on guest
+ * write we copy the old data to the target child before rewriting. Note that
+ * we write this data through the fleecing driver: this keeps open the option
+ * of implementing a kind of cache in the fleecing driver in the future.
+ *
+ * Fleecing user is nbd export: it can read from fleecing node, which guarantees
+ * a snapshot-view for fleecing user. Fleecing user may also do discard
+ * operations.
+ *
+ * FleecingState is responsible for most of the fleecing logic:
+ *
+ * 1. Fleecing read. Handle reads of the fleecing user: we decide where to read
+ * from, the source node or the copy-before-write target node. In the former
+ * case we need to synchronize with guest writes. See fleecing_read_lock() and
+ * fleecing_read_unlock().
+ *
+ * 2. Guest write synchronization (part of [1] actually). See
+ * fleecing_mark_done_and_wait_readers()
+ *
+ * 3. Fleecing discard. Used by the fleecing user when the corresponding area
+ * is already copied. The fleecing user may discard an area that is not needed
+ * anymore, which should result in:
+ *   - discarding data to free disk space
+ *   - clearing bits in the copy-bitmap of block-copy, to avoid extra
+ *     copy-before-write operations
+ *   - clearing bits in the access-bitmap of FleecingState, to reject further
+ *     access to the area
+ *
+ * Still, FleecingState doesn't own any block children, so all real io
+ * operations (reads, writes and discards) are done by copy-before-write filter
+ * and fleecing block driver.
+ */
+
+#ifndef FLEECING_H
+#define FLEECING_H
+
+#include "block/block_int.h"
+#include "block/block-copy.h"
+#include "block/reqlist.h"
+
+typedef struct FleecingState FleecingState;
+
+/*
+ * Create FleecingState.
+ *
+ * @bcs: link to block-copy owned by copy-before-write filter.
+ *
+ * @fleecing_node: should be fleecing block driver node. Used to create some
+ * bitmaps in it.
+ */
+FleecingState *fleecing_new(BlockCopyState *bcs,
+                            BlockDriverState *fleecing_node,
+                            Error **errp);
+
+/* Free the state. Doesn't free block-copy state (@bcs) */
+void fleecing_free(FleecingState *s);
+
+/*
+ * Convenience function for those who want to do a fleecing read.
+ *
+ * If the requested region starts in a "done" area, i.e. data is already copied
+ * to the copy-before-write target node, @req is set to NULL and @pnum is set
+ * to the number of bytes available to read from the target. The user is still
+ * responsible for handling concurrent discards on the target.
+ *
+ * If the requested region starts in a "not done" area, i.e. we have to read
+ * from the source node directly, then @pnum bytes of the source node are
+ * frozen and will not be rewritten until fleecing_read_unlock() is called.
+ *
+ * Returns 0 on success and -EACCES on an attempt to read a non-dirty area of
+ * access_bitmap.
+ */
+int fleecing_read_lock(FleecingState *f, int64_t offset,
+                       int64_t bytes, const BlockReq **req, int64_t *pnum);
+/* Called as closing pair for fleecing_read_lock() */
+void fleecing_read_unlock(FleecingState *f, const BlockReq *req);
+
+/*
+ * Called when fleecing user doesn't need the region anymore (for example the
+ * region is successfully read and backed up somewhere).
+ * This prevents extra copy-before-write operations in this area in future.
+ * Next fleecing read from this area will fail with -EACCES.
+ */
+void fleecing_discard(FleecingState *f, int64_t offset, int64_t bytes);
+
+/*
+ * Called by copy-before-write filter after successful copy-before-write
+ * operation to synchronize with parallel fleecing reads.
+ */
+void fleecing_mark_done_and_wait_readers(FleecingState *f, int64_t offset,
+                                         int64_t bytes);
+
+#endif /* FLEECING_H */
diff --git a/block/fleecing.c b/block/fleecing.c
new file mode 100644
index 0000000000..f75d11b892
--- /dev/null
+++ b/block/fleecing.c
@@ -0,0 +1,182 @@
+/*
+ * FleecingState
+ *
+ * The common state of image fleecing, shared between copy-before-write filter
+ * and fleecing block driver.
+ *
+ * Copyright (c) 2021 Virtuozzo International GmbH.
+ *
+ * Author:
+ *  Sementsov-Ogievskiy Vladimir <vsementsov@virtuozzo.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "qemu/osdep.h"
+
+#include "sysemu/block-backend.h"
+#include "qemu/cutils.h"
+#include "qapi/error.h"
+#include "block/block_int.h"
+#include "block/coroutines.h"
+#include "block/qdict.h"
+#include "block/block-copy.h"
+#include "block/reqlist.h"
+
+#include "block/fleecing.h"
+
+/*
+ * @bcs: link to block-copy state owned by copy-before-write filter which
+ * performs copy-before-write operations in context of fleecing scheme.
+ * FleecingState doesn't own the block-copy state and doesn't free it.
+ *
+ * @lock: protects access to @access_bitmap, @done_bitmap and @frozen_read_reqs
+ *
+ * @access_bitmap: represents areas allowed for reading by the fleecing user.
+ * Reading from non-dirty areas leads to -EACCES. The discard operation, among
+ * other things, clears the corresponding bits in this bitmap.
+ *
+ * @done_bitmap: represents areas that were successfully copied by
+ * copy-before-write operations. So, for dirty areas the fleecing user should
+ * read from the target node and for clear areas from the source node.
+ *
+ * @frozen_read_reqs: current read requests of the fleecing user in the source
+ * node. The corresponding areas must not be rewritten by the guest.
+ */
+typedef struct FleecingState {
+    BlockCopyState *bcs;
+
+    CoMutex lock;
+
+    BdrvDirtyBitmap *access_bitmap;
+    BdrvDirtyBitmap *done_bitmap;
+
+    BlockReqList frozen_read_reqs;
+} FleecingState;
+
+FleecingState *fleecing_new(BlockCopyState *bcs,
+                            BlockDriverState *fleecing_node,
+                            Error **errp)
+{
+    BdrvDirtyBitmap *bcs_bitmap = block_copy_dirty_bitmap(bcs),
+                    *done_bitmap, *access_bitmap;
+    int64_t cluster_size = block_copy_cluster_size(bcs);
+    FleecingState *s;
+
+    /* done_bitmap starts empty */
+    done_bitmap = bdrv_create_dirty_bitmap(fleecing_node, cluster_size, NULL,
+                                           errp);
+    if (!done_bitmap) {
+        return NULL;
+    }
+    bdrv_disable_dirty_bitmap(done_bitmap);
+
+    /* access_bitmap starts equal to bcs_bitmap */
+    access_bitmap = bdrv_create_dirty_bitmap(fleecing_node, cluster_size, NULL,
+                                             errp);
+    if (!access_bitmap) {
+        return NULL;
+    }
+    bdrv_disable_dirty_bitmap(access_bitmap);
+    if (!bdrv_dirty_bitmap_merge_internal(access_bitmap, bcs_bitmap,
+                                          NULL, true))
+    {
+        return NULL;
+    }
+
+    s = g_new(FleecingState, 1);
+    *s = (FleecingState) {
+        .bcs = bcs,
+        .done_bitmap = done_bitmap,
+        .access_bitmap = access_bitmap,
+    };
+    qemu_co_mutex_init(&s->lock);
+    QLIST_INIT(&s->frozen_read_reqs);
+
+    return s;
+}
+
+void fleecing_free(FleecingState *s)
+{
+    if (!s) {
+        return;
+    }
+
+    bdrv_release_dirty_bitmap(s->access_bitmap);
+    bdrv_release_dirty_bitmap(s->done_bitmap);
+    g_free(s);
+}
+
+static BlockReq *add_read_req(FleecingState *s, uint64_t offset, uint64_t bytes)
+{
+    BlockReq *req = g_new(BlockReq, 1);
+
+    reqlist_init_req(&s->frozen_read_reqs, req, offset, bytes);
+
+    return req;
+}
+
+static void drop_read_req(BlockReq *req)
+{
+    reqlist_remove_req(req);
+    g_free(req);
+}
+
+int fleecing_read_lock(FleecingState *s, int64_t offset,
+                       int64_t bytes, const BlockReq **req,
+                       int64_t *pnum)
+{
+    bool done;
+
+    QEMU_LOCK_GUARD(&s->lock);
+
+    if (bdrv_dirty_bitmap_next_zero(s->access_bitmap, offset, bytes) != -1) {
+        return -EACCES;
+    }
+
+    bdrv_dirty_bitmap_status(s->done_bitmap, offset, bytes, &done, pnum);
+
+    /* Nothing is frozen on the source for a "done" area, so @req is NULL */
+    *req = done ? NULL : add_read_req(s, offset, *pnum);
+
+    return 0;
+}
+
+void fleecing_read_unlock(FleecingState *s, const BlockReq *req)
+{
+    QEMU_LOCK_GUARD(&s->lock);
+
+    drop_read_req((BlockReq *)req);
+}
+
+void fleecing_discard(FleecingState *s, int64_t offset, int64_t bytes)
+{
+    WITH_QEMU_LOCK_GUARD(&s->lock) {
+        bdrv_reset_dirty_bitmap(s->access_bitmap, offset, bytes);
+    }
+
+    block_copy_reset(s->bcs, offset, bytes);
+}
+
+void fleecing_mark_done_and_wait_readers(FleecingState *s, int64_t offset,
+                                         int64_t bytes)
+{
+    assert(QEMU_IS_ALIGNED(offset, block_copy_cluster_size(s->bcs)));
+    assert(QEMU_IS_ALIGNED(bytes, block_copy_cluster_size(s->bcs)));
+
+    WITH_QEMU_LOCK_GUARD(&s->lock) {
+        bdrv_set_dirty_bitmap(s->done_bitmap, offset, bytes);
+        reqlist_wait_all(&s->frozen_read_reqs, offset, bytes, &s->lock);
+    }
+}
diff --git a/MAINTAINERS b/MAINTAINERS
index 2d58170224..20f3c7eb68 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2341,6 +2341,8 @@ F: block/reqlist.c
 F: include/block/reqlist.h
 F: block/copy-before-write.h
 F: block/copy-before-write.c
+F: block/fleecing.h
+F: block/fleecing.c
 F: include/block/aio_task.h
 F: block/aio_task.c
 F: util/qemu-co-shared-resource.c
diff --git a/block/meson.build b/block/meson.build
index c630e9374a..caa52f49c2 100644
--- a/block/meson.build
+++ b/block/meson.build
@@ -18,6 +18,7 @@ block_ss.add(files(
   'crypto.c',
   'dirty-bitmap.c',
   'filter-compress.c',
+  'fleecing.c',
   'io.c',
   'mirror.c',
   'nbd.c',
-- 
2.29.2




* [PATCH v2 10/19] block: introduce fleecing block driver
  2021-08-27 18:17 [PATCH v2 00/19] Make image fleecing more usable Vladimir Sementsov-Ogievskiy
                   ` (8 preceding siblings ...)
  2021-08-27 18:17 ` [PATCH v2 09/19] block: introduce FleecingState class Vladimir Sementsov-Ogievskiy
@ 2021-08-27 18:17 ` Vladimir Sementsov-Ogievskiy
  2021-09-01 11:44   ` Markus Armbruster
  2021-08-27 18:18 ` [PATCH v2 11/19] block/copy-before-write: support " Vladimir Sementsov-Ogievskiy
                   ` (9 subsequent siblings)
  19 siblings, 1 reply; 24+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-08-27 18:17 UTC (permalink / raw)
  To: qemu-block
  Cc: qemu-devel, armbru, xiechanglong.d, wencongyang2, eblake, hreitz,
	kwolf, vsementsov, jsnow

Introduce a new driver that works together with copy-before-write to
improve fleecing.

Without the fleecing driver, the old fleecing scheme looks as follows:

[guest]
  |
  |root
  v
[copy-before-write] -----> [temp.qcow2] <--- [nbd export]
  |                 target  |
  |file                     |backing
  v                         |
[active disk] <-------------+

With the fleecing driver, the new scheme is:

[guest]
  |
  |root
  v
[copy-before-write] -----> [fleecing] <--- [nbd export]
  |                 target  |    |
  |file                     |    |file
  v                         |    v
[active disk]<--source------+  [temp.img]

Benefits of the new scheme:

1. Access control: if a remote client tries to read data that is not
   covered by the original dirty bitmap used on copy-before-write open,
   the client gets -EACCES.

2. Discard support: if a remote client does DISCARD, then in addition
   to discarding data in temp.img this tells the block-copy process not
   to copy these clusters. The next read from the discarded area will
   return -EACCES. This is significant: when the fleecing user has read
   data that was not yet copied to temp.img, we can avoid copying it on
   a later guest write.

3. Synchronisation between client reads and block-copy writes is more
   efficient: an intersecting block-copy write is not blocked during a
   client read.

4. We don't rely on the backing feature: the active disk does not have
   to be the backing file of the temp image, so we avoid some
   permission-related difficulties, and the temp image is no longer
   required to support backing; it may be a simple raw image.
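
For illustration only (not part of the patch), benefits 1 and 2 can be
modelled with two per-cluster flags. The sketch assumes one bool per cluster
in place of the real bitmaps and does no I/O; ToyScheme and the toy_*
functions are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define TOY_CLUSTERS 4

typedef struct ToyScheme {
    bool access[TOY_CLUSTERS]; /* readable by the fleecing client */
    bool copy[TOY_CLUSTERS];   /* must be copied before a guest write */
} ToyScheme;

/* Client read: clusters outside the access bitmap are rejected */
static int toy_client_read(const ToyScheme *s, int cluster)
{
    assert(cluster >= 0 && cluster < TOY_CLUSTERS);
    return s->access[cluster] ? 0 : -EACCES;
}

/* Client discard: forbid further reads and cancel the pending copy */
static void toy_discard(ToyScheme *s, int cluster)
{
    assert(cluster >= 0 && cluster < TOY_CLUSTERS);
    s->access[cluster] = false;
    s->copy[cluster] = false;
}

/* Guest write: returns true if old data had to be copied first */
static bool toy_guest_write(ToyScheme *s, int cluster)
{
    bool copied;

    assert(cluster >= 0 && cluster < TOY_CLUSTERS);
    copied = s->copy[cluster];
    s->copy[cluster] = false;

    return copied;
}
```

After a discard, a guest write to the same cluster no longer triggers a
copy-before-write operation, which is the saving described in item 2.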

Note that nothing calls fleecing_drv_activate() yet, so the new driver
is not usable on its own. That is done in the following patch, which
adds support for the fleecing block driver to the copy-before-write
filter driver.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 qapi/block-core.json |  17 ++-
 block/fleecing.h     |  16 +++
 block/fleecing-drv.c | 260 +++++++++++++++++++++++++++++++++++++++++++
 MAINTAINERS          |   1 +
 block/meson.build    |   1 +
 5 files changed, 294 insertions(+), 1 deletion(-)
 create mode 100644 block/fleecing-drv.c

diff --git a/qapi/block-core.json b/qapi/block-core.json
index c42d23752d..8a333136f5 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -2826,13 +2826,14 @@
 # @blkreplay: Since 4.2
 # @compress: Since 5.0
 # @copy-before-write: Since 6.2
+# @fleecing: Since 6.2
 #
 # Since: 2.9
 ##
 { 'enum': 'BlockdevDriver',
   'data': [ 'blkdebug', 'blklogwrites', 'blkreplay', 'blkverify', 'bochs',
             'cloop', 'compress', 'copy-before-write', 'copy-on-read', 'dmg',
-            'file', 'ftp', 'ftps', 'gluster',
+            'file', 'fleecing', 'ftp', 'ftps', 'gluster',
             {'name': 'host_cdrom', 'if': 'defined(HAVE_HOST_BLOCK_DEVICE)' },
             {'name': 'host_device', 'if': 'defined(HAVE_HOST_BLOCK_DEVICE)' },
             'http', 'https', 'iscsi',
@@ -4077,6 +4078,19 @@
   'base': 'BlockdevOptionsGenericFormat',
   'data': { 'target': 'BlockdevRef', '*bitmap': 'BlockDirtyBitmap' } }
 
+##
+# @BlockdevOptionsFleecing:
+#
+# Driver that pairs with copy-before-write to implement the fleecing scheme.
+#
+# @source: source node of fleecing
+#
+# Since: 6.2
+##
+{ 'struct': 'BlockdevOptionsFleecing',
+  'base': 'BlockdevOptionsGenericFormat',
+  'data': { 'source': 'str' } }
+
 ##
 # @BlockdevOptions:
 #
@@ -4133,6 +4147,7 @@
       'copy-on-read':'BlockdevOptionsCor',
       'dmg':        'BlockdevOptionsGenericFormat',
       'file':       'BlockdevOptionsFile',
+      'fleecing':   'BlockdevOptionsFleecing',
       'ftp':        'BlockdevOptionsCurlFtp',
       'ftps':       'BlockdevOptionsCurlFtps',
       'gluster':    'BlockdevOptionsGluster',
diff --git a/block/fleecing.h b/block/fleecing.h
index fb7b2f86c4..75ad2f8b19 100644
--- a/block/fleecing.h
+++ b/block/fleecing.h
@@ -80,6 +80,9 @@
 #include "block/block-copy.h"
 #include "block/reqlist.h"
 
+
+/* fleecing.c */
+
 typedef struct FleecingState FleecingState;
 
 /*
@@ -132,4 +135,17 @@ void fleecing_discard(FleecingState *f, int64_t offset, int64_t bytes);
 void fleecing_mark_done_and_wait_readers(FleecingState *f, int64_t offset,
                                          int64_t bytes);
 
+
+/* fleecing-drv.c */
+
+/* Returns true if @bs->drv is the fleecing block driver */
+bool is_fleecing_drv(BlockDriverState *bs);
+
+/*
+ * Normally FleecingState is created by copy-before-write filter. Then
+ * copy-before-write filter calls fleecing_drv_activate() to share FleecingState
+ * with fleecing block driver.
+ */
+void fleecing_drv_activate(BlockDriverState *bs, FleecingState *fleecing);
+
 #endif /* FLEECING_H */
diff --git a/block/fleecing-drv.c b/block/fleecing-drv.c
new file mode 100644
index 0000000000..9161e13809
--- /dev/null
+++ b/block/fleecing-drv.c
@@ -0,0 +1,260 @@
+/*
+ * fleecing block driver
+ *
+ * Copyright (c) 2021 Virtuozzo International GmbH.
+ *
+ * Author:
+ *  Sementsov-Ogievskiy Vladimir <vsementsov@virtuozzo.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "qemu/osdep.h"
+
+#include "sysemu/block-backend.h"
+#include "qemu/cutils.h"
+#include "qapi/error.h"
+#include "block/block_int.h"
+#include "block/coroutines.h"
+#include "block/qdict.h"
+#include "block/block-copy.h"
+#include "block/reqlist.h"
+
+#include "block/copy-before-write.h"
+#include "block/fleecing.h"
+
+typedef struct BDRVFleecingState {
+    FleecingState *fleecing;
+    BdrvChild *source;
+} BDRVFleecingState;
+
+static coroutine_fn int fleecing_co_preadv_part(
+        BlockDriverState *bs, uint64_t offset, uint64_t bytes,
+        QEMUIOVector *qiov, size_t qiov_offset, int flags)
+{
+    BDRVFleecingState *s = bs->opaque;
+    const BlockReq *req;
+    int ret;
+
+    if (!s->fleecing) {
+        /* fleecing_drv_activate() was not called */
+        return -EINVAL;
+    }
+
+    /* TODO: upgrade to async loop using AioTask */
+    while (bytes) {
+        int64_t cur_bytes;
+
+        ret = fleecing_read_lock(s->fleecing, offset, bytes, &req, &cur_bytes);
+        if (ret < 0) {
+            return ret;
+        }
+
+        if (req) {
+            ret = bdrv_co_preadv_part(s->source, offset, cur_bytes,
+                                      qiov, qiov_offset, flags);
+            fleecing_read_unlock(s->fleecing, req);
+        } else {
+            ret = bdrv_co_preadv_part(bs->file, offset, cur_bytes,
+                                      qiov, qiov_offset, flags);
+        }
+        if (ret < 0) {
+            return ret;
+        }
+
+        bytes -= cur_bytes;
+        offset += cur_bytes;
+        qiov_offset += cur_bytes;
+    }
+
+    return 0;
+}
+
+static int coroutine_fn fleecing_co_block_status(BlockDriverState *bs,
+                                                 bool want_zero, int64_t offset,
+                                                 int64_t bytes, int64_t *pnum,
+                                                 int64_t *map,
+                                                 BlockDriverState **file)
+{
+    BDRVFleecingState *s = bs->opaque;
+    const BlockReq *req = NULL;
+    int ret;
+    int64_t cur_bytes;
+
+    if (!s->fleecing) {
+        /* fleecing_drv_activate() was not called */
+        return -EINVAL;
+    }
+
+    ret = fleecing_read_lock(s->fleecing, offset, bytes, &req, &cur_bytes);
+    if (ret < 0) {
+        return ret;
+    }
+
+    *pnum = cur_bytes;
+    *map = offset;
+
+    if (req) {
+        *file = s->source->bs;
+        fleecing_read_unlock(s->fleecing, req);
+    } else {
+        *file = bs->file->bs;
+    }
+
+    return ret;
+}
+
+static int coroutine_fn fleecing_co_pdiscard(BlockDriverState *bs,
+                                             int64_t offset, int bytes)
+{
+    BDRVFleecingState *s = bs->opaque;
+    if (!s->fleecing) {
+        /* fleecing_drv_activate() was not called */
+        return -EINVAL;
+    }
+
+    fleecing_discard(s->fleecing, offset, bytes);
+
+    bdrv_co_pdiscard(bs->file, offset, bytes);
+
+    /*
+     * Ignore the bdrv_co_pdiscard() result: fleecing_discard() succeeded,
+     * so the next read from this area will fail with -EACCES anyway. It is
+     * more correct to report success now.
+     */
+    return 0;
+}
+
+static int coroutine_fn fleecing_co_pwrite_zeroes(BlockDriverState *bs,
+        int64_t offset, int bytes, BdrvRequestFlags flags)
+{
+    BDRVFleecingState *s = bs->opaque;
+    if (!s->fleecing) {
+        /* fleecing_drv_activate() was not called */
+        return -EINVAL;
+    }
+
+    /*
+     * TODO: implement a cache, to give the fleecing user a chance to read and
+     * discard this data before it is actually written to the temporary image.
+     */
+    return bdrv_co_pwrite_zeroes(bs->file, offset, bytes, flags);
+}
+
+static coroutine_fn int fleecing_co_pwritev(BlockDriverState *bs,
+                                       uint64_t offset,
+                                       uint64_t bytes,
+                                       QEMUIOVector *qiov, int flags)
+{
+    BDRVFleecingState *s = bs->opaque;
+    if (!s->fleecing) {
+        /* fleecing_drv_activate() was not called */
+        return -EINVAL;
+    }
+
+    /*
+     * TODO: implement a cache, to give the fleecing user a chance to read and
+     * discard this data before it is actually written to the temporary image.
+     */
+    return bdrv_co_pwritev(bs->file, offset, bytes, qiov, flags);
+}
+
+
+static void fleecing_refresh_filename(BlockDriverState *bs)
+{
+    pstrcpy(bs->exact_filename, sizeof(bs->exact_filename),
+            bs->file->bs->filename);
+}
+
+static int fleecing_open(BlockDriverState *bs, QDict *options, int flags,
+                         Error **errp)
+{
+    BDRVFleecingState *s = bs->opaque;
+
+    bs->file = bdrv_open_child(NULL, options, "file", bs, &child_of_bds,
+                               BDRV_CHILD_DATA | BDRV_CHILD_PRIMARY,
+                               false, errp);
+    if (!bs->file) {
+        return -EINVAL;
+    }
+
+    s->source = bdrv_open_child(NULL, options, "source", bs, &child_of_bds,
+                               BDRV_CHILD_DATA, false, errp);
+    if (!s->source) {
+        return -EINVAL;
+    }
+
+    bs->total_sectors = bs->file->bs->total_sectors;
+
+    return 0;
+}
+
+static void fleecing_child_perm(BlockDriverState *bs, BdrvChild *c,
+                                BdrvChildRole role,
+                                BlockReopenQueue *reopen_queue,
+                                uint64_t perm, uint64_t shared,
+                                uint64_t *nperm, uint64_t *nshared)
+{
+    bdrv_default_perms(bs, c, role, reopen_queue, perm, shared, nperm, nshared);
+
+    if (role & BDRV_CHILD_PRIMARY) {
+        *nshared &= BLK_PERM_CONSISTENT_READ;
+    } else {
+        *nperm &= BLK_PERM_CONSISTENT_READ;
+
+        /*
+         * The copy-before-write filter is responsible for the source child and
+         * needs write access to it.
+         */
+        *nshared |= BLK_PERM_WRITE;
+    }
+}
+
+BlockDriver bdrv_fleecing_drv = {
+    .format_name = "fleecing",
+    .instance_size = sizeof(BDRVFleecingState),
+
+    .bdrv_open                  = fleecing_open,
+
+    .bdrv_co_preadv_part        = fleecing_co_preadv_part,
+    .bdrv_co_pwritev            = fleecing_co_pwritev,
+    .bdrv_co_pwrite_zeroes      = fleecing_co_pwrite_zeroes,
+    .bdrv_co_pdiscard           = fleecing_co_pdiscard,
+    .bdrv_co_block_status       = fleecing_co_block_status,
+
+    .bdrv_refresh_filename      = fleecing_refresh_filename,
+
+    .bdrv_child_perm            = fleecing_child_perm,
+};
+
+bool is_fleecing_drv(BlockDriverState *bs)
+{
+    return bs && bs->drv == &bdrv_fleecing_drv;
+}
+
+void fleecing_drv_activate(BlockDriverState *bs, FleecingState *fleecing)
+{
+    BDRVFleecingState *s = bs->opaque;
+
+    assert(is_fleecing_drv(bs));
+
+    s->fleecing = fleecing;
+}
+
+static void fleecing_init(void)
+{
+    bdrv_register(&bdrv_fleecing_drv);
+}
+
+block_init(fleecing_init);
diff --git a/MAINTAINERS b/MAINTAINERS
index 20f3c7eb68..9b62f99557 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2343,6 +2343,7 @@ F: block/copy-before-write.h
 F: block/copy-before-write.c
 F: block/fleecing.h
 F: block/fleecing.c
+F: block/fleecing-drv.c
 F: include/block/aio_task.h
 F: block/aio_task.c
 F: util/qemu-co-shared-resource.c
diff --git a/block/meson.build b/block/meson.build
index caa52f49c2..10cd0c0ad2 100644
--- a/block/meson.build
+++ b/block/meson.build
@@ -19,6 +19,7 @@ block_ss.add(files(
   'dirty-bitmap.c',
   'filter-compress.c',
   'fleecing.c',
+  'fleecing-drv.c',
   'io.c',
   'mirror.c',
   'nbd.c',
-- 
2.29.2




* [PATCH v2 11/19] block/copy-before-write: support fleecing block driver
  2021-08-27 18:17 [PATCH v2 00/19] Make image fleecing more usable Vladimir Sementsov-Ogievskiy
                   ` (9 preceding siblings ...)
  2021-08-27 18:17 ` [PATCH v2 10/19] block: introduce fleecing block driver Vladimir Sementsov-Ogievskiy
@ 2021-08-27 18:18 ` Vladimir Sementsov-Ogievskiy
  2021-08-27 18:18 ` [PATCH v2 12/19] block/block-copy: add write-unchanged mode Vladimir Sementsov-Ogievskiy
                   ` (8 subsequent siblings)
  19 siblings, 0 replies; 24+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-08-27 18:18 UTC (permalink / raw)
  To: qemu-block
  Cc: qemu-devel, armbru, xiechanglong.d, wencongyang2, eblake, hreitz,
	kwolf, vsementsov, jsnow

The last step to make the new fleecing scheme work (see block/fleecing.h
for a description) is to update the copy-before-write filter.

If we detect that the unfiltered target child is a fleecing block driver
node, we:
 - initialize a shared FleecingState
 - activate the fleecing block driver with it
 - synchronize guest writes with the help of
   fleecing_mark_done_and_wait_readers()
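
For illustration only, the range handling on the guest-write path can be
sketched in Python. This is not the patch's code: align_down(), align_up()
and cbw_range() are hypothetical stand-ins for QEMU_ALIGN_DOWN,
QEMU_ALIGN_UP and the off/end computation in cbw_do_copy_before_write(),
and the 64 KiB cluster size is an assumption. The same aligned range is
first passed to block_copy() and then, in fleecing mode, to
fleecing_mark_done_and_wait_readers().

```python
def align_down(value, alignment):
    """Round value down to a multiple of alignment (like QEMU_ALIGN_DOWN)."""
    return value // alignment * alignment


def align_up(value, alignment):
    """Round value up to a multiple of alignment (like QEMU_ALIGN_UP)."""
    return (value + alignment - 1) // alignment * alignment


def cbw_range(offset, nbytes, cluster_size):
    """Return (off, length) of the cluster-aligned range covering a write."""
    off = align_down(offset, cluster_size)
    end = align_up(offset + nbytes, cluster_size)
    return off, end - off


CLUSTER = 64 * 1024  # assumed cluster size, for illustration only

# A 4 KiB write that straddles a cluster boundary covers two clusters.
print(cbw_range(CLUSTER - 2048, 4096, CLUSTER))  # (0, 131072)
```

Because the range is widened to whole clusters, a small guest write may
have to wait for readers of the entire surrounding cluster.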

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/copy-before-write.c | 27 ++++++++++++++++++++++++++-
 1 file changed, 26 insertions(+), 1 deletion(-)

diff --git a/block/copy-before-write.c b/block/copy-before-write.c
index e3456ad6aa..686a085861 100644
--- a/block/copy-before-write.c
+++ b/block/copy-before-write.c
@@ -33,10 +33,13 @@
 #include "block/block-copy.h"
 
 #include "block/copy-before-write.h"
+#include "block/fleecing.h"
 
 typedef struct BDRVCopyBeforeWriteState {
     BlockCopyState *bcs;
     BdrvChild *target;
+
+    FleecingState *fleecing;
 } BDRVCopyBeforeWriteState;
 
 static coroutine_fn int cbw_co_preadv(
@@ -50,6 +53,7 @@ static coroutine_fn int cbw_do_copy_before_write(BlockDriverState *bs,
         uint64_t offset, uint64_t bytes, BdrvRequestFlags flags)
 {
     BDRVCopyBeforeWriteState *s = bs->opaque;
+    int ret;
     uint64_t off, end;
     int64_t cluster_size = block_copy_cluster_size(s->bcs);
 
@@ -60,7 +64,16 @@ static coroutine_fn int cbw_do_copy_before_write(BlockDriverState *bs,
     off = QEMU_ALIGN_DOWN(offset, cluster_size);
     end = QEMU_ALIGN_UP(offset + bytes, cluster_size);
 
-    return block_copy(s->bcs, off, end - off, true);
+    ret = block_copy(s->bcs, off, end - off, true);
+    if (ret < 0) {
+        return ret;
+    }
+
+    if (s->fleecing) {
+        fleecing_mark_done_and_wait_readers(s->fleecing, off, end - off);
+    }
+
+    return 0;
 }
 
 static int coroutine_fn cbw_co_pdiscard(BlockDriverState *bs,
@@ -149,6 +162,7 @@ static int cbw_open(BlockDriverState *bs, QDict *options, int flags,
 {
     BDRVCopyBeforeWriteState *s = bs->opaque;
     BdrvDirtyBitmap *bitmap = NULL;
+    BlockDriverState *unfiltered_target;
 
     bs->file = bdrv_open_child(NULL, options, "file", bs, &child_of_bds,
                                BDRV_CHILD_FILTERED | BDRV_CHILD_PRIMARY,
@@ -162,6 +176,7 @@ static int cbw_open(BlockDriverState *bs, QDict *options, int flags,
     if (!s->target) {
         return -EINVAL;
     }
+    unfiltered_target = bdrv_skip_filters(s->target->bs);
 
     if (qdict_haskey(options, "bitmap.node") ||
         qdict_haskey(options, "bitmap.name"))
@@ -203,6 +218,14 @@ static int cbw_open(BlockDriverState *bs, QDict *options, int flags,
         return -EINVAL;
     }
 
+    if (is_fleecing_drv(unfiltered_target)) {
+        s->fleecing = fleecing_new(s->bcs, unfiltered_target, errp);
+        if (!s->fleecing) {
+            return -EINVAL;
+        }
+        fleecing_drv_activate(unfiltered_target, s->fleecing);
+    }
+
     return 0;
 }
 
@@ -210,6 +233,8 @@ static void cbw_close(BlockDriverState *bs)
 {
     BDRVCopyBeforeWriteState *s = bs->opaque;
 
+    fleecing_free(s->fleecing);
+    s->fleecing = NULL;
     block_copy_state_free(s->bcs);
     s->bcs = NULL;
 }
-- 
2.29.2




* [PATCH v2 12/19] block/block-copy: add write-unchanged mode
  2021-08-27 18:17 [PATCH v2 00/19] Make image fleecing more usable Vladimir Sementsov-Ogievskiy
                   ` (10 preceding siblings ...)
  2021-08-27 18:18 ` [PATCH v2 11/19] block/copy-before-write: support " Vladimir Sementsov-Ogievskiy
@ 2021-08-27 18:18 ` Vladimir Sementsov-Ogievskiy
  2021-08-27 18:18 ` [PATCH v2 13/19] block/copy-before-write: use write-unchanged in fleecing mode Vladimir Sementsov-Ogievskiy
                   ` (7 subsequent siblings)
  19 siblings, 0 replies; 24+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-08-27 18:18 UTC (permalink / raw)
  To: qemu-block
  Cc: qemu-devel, armbru, xiechanglong.d, wencongyang2, eblake, hreitz,
	kwolf, vsementsov, jsnow

We are going to implement the push-backup-with-fleecing scheme. This
means that the backup job will be a fleecing user and therefore will not
need a separate copy-before-write filter. Instead, it will treat the
source as a constant, unchanged drive. Of course, backup will want to
unshare writes on the source in this case, yet we still need to do
copy-before-write operations. These operations may, however, be
considered write-unchanged. Add the corresponding option to block-copy
now, to be used in the following commit.
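
As a hedged sketch of the flag handling this patch touches (Python, not
the patch's code): the write-unchanged bit is chosen once at state
creation and must survive later calls that reset the compression bit.
The ReqFlags bit values below are placeholders and are not QEMU's actual
BDRV_REQ_* values; initial_write_flags() and set_copy_opts() are
hypothetical names that model block_copy_state_new() and
block_copy_set_copy_opts() respectively.

```python
from enum import IntFlag


class ReqFlags(IntFlag):
    # Placeholder bit values, not QEMU's actual BDRV_REQ_* constants.
    WRITE_COMPRESSED = 0x1
    WRITE_UNCHANGED = 0x2
    SERIALISING = 0x4


def initial_write_flags(is_fleecing, write_unchanged):
    """Model of the .write_flags initializer in block_copy_state_new()."""
    flags = ReqFlags(0)
    if is_fleecing:
        flags |= ReqFlags.SERIALISING
    if write_unchanged:
        flags |= ReqFlags.WRITE_UNCHANGED
    return flags


def set_copy_opts(write_flags, compress):
    """Model of block_copy_set_copy_opts(): keep the creation-time bits,
    recompute only the compression bit."""
    keep = ReqFlags.SERIALISING | ReqFlags.WRITE_UNCHANGED
    flags = write_flags & keep
    if compress:
        flags |= ReqFlags.WRITE_COMPRESSED
    return flags


flags = initial_write_flags(is_fleecing=False, write_unchanged=True)
flags = set_copy_opts(flags, compress=True)
print(ReqFlags.WRITE_UNCHANGED in flags)  # True
```

Without WRITE_UNCHANGED in the keep mask, the first call to
set_copy_opts() would silently drop the mode selected at creation.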

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 include/block/block-copy.h | 3 ++-
 block/block-copy.c         | 9 ++++++---
 block/copy-before-write.c  | 2 +-
 3 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/include/block/block-copy.h b/include/block/block-copy.h
index a11e1620f6..a66f81d314 100644
--- a/include/block/block-copy.h
+++ b/include/block/block-copy.h
@@ -25,7 +25,8 @@ typedef struct BlockCopyState BlockCopyState;
 typedef struct BlockCopyCallState BlockCopyCallState;
 
 BlockCopyState *block_copy_state_new(BdrvChild *source, BdrvChild *target,
-                                     BdrvDirtyBitmap *bitmap, Error **errp);
+                                     BdrvDirtyBitmap *bitmap,
+                                     bool write_unchanged, Error **errp);
 
 /* Function should be called prior any actual copy request */
 void block_copy_set_copy_opts(BlockCopyState *s, bool use_copy_range,
diff --git a/block/block-copy.c b/block/block-copy.c
index 46e6a6736d..6d5d517ac6 100644
--- a/block/block-copy.c
+++ b/block/block-copy.c
@@ -279,7 +279,8 @@ void block_copy_set_copy_opts(BlockCopyState *s, bool use_copy_range,
                               bool compress)
 {
     /* Keep BDRV_REQ_SERIALISING set (or not set) in block_copy_state_new() */
-    s->write_flags = (s->write_flags & BDRV_REQ_SERIALISING) |
+    s->write_flags = (s->write_flags &
+                      (BDRV_REQ_SERIALISING | BDRV_REQ_WRITE_UNCHANGED)) |
         (compress ? BDRV_REQ_WRITE_COMPRESSED : 0);
 
     if (s->max_transfer < s->cluster_size) {
@@ -340,7 +341,8 @@ static int64_t block_copy_calculate_cluster_size(BlockDriverState *target,
 }
 
 BlockCopyState *block_copy_state_new(BdrvChild *source, BdrvChild *target,
-                                     BdrvDirtyBitmap *bitmap, Error **errp)
+                                     BdrvDirtyBitmap *bitmap,
+                                     bool write_unchanged, Error **errp)
 {
     ERRP_GUARD();
     BlockCopyState *s;
@@ -393,7 +395,8 @@ BlockCopyState *block_copy_state_new(BdrvChild *source, BdrvChild *target,
         .copy_bitmap = copy_bitmap,
         .cluster_size = cluster_size,
         .len = bdrv_dirty_bitmap_size(copy_bitmap),
-        .write_flags = (is_fleecing ? BDRV_REQ_SERIALISING : 0),
+        .write_flags = (is_fleecing ? BDRV_REQ_SERIALISING : 0) |
+            (write_unchanged ? BDRV_REQ_WRITE_UNCHANGED : 0),
         .mem = shres_create(BLOCK_COPY_MAX_MEM),
         .max_transfer = QEMU_ALIGN_DOWN(
                                     block_copy_max_transfer(source, target),
diff --git a/block/copy-before-write.c b/block/copy-before-write.c
index 686a085861..7e4e4bf7d4 100644
--- a/block/copy-before-write.c
+++ b/block/copy-before-write.c
@@ -212,7 +212,7 @@ static int cbw_open(BlockDriverState *bs, QDict *options, int flags,
             ((BDRV_REQ_FUA | BDRV_REQ_MAY_UNMAP | BDRV_REQ_NO_FALLBACK) &
              bs->file->bs->supported_zero_flags);
 
-    s->bcs = block_copy_state_new(bs->file, s->target, bitmap, errp);
+    s->bcs = block_copy_state_new(bs->file, s->target, bitmap, false, errp);
     if (!s->bcs) {
         error_prepend(errp, "Cannot create block-copy-state: ");
         return -EINVAL;
-- 
2.29.2




* [PATCH v2 13/19] block/copy-before-write: use write-unchanged in fleecing mode
  2021-08-27 18:17 [PATCH v2 00/19] Make image fleecing more usable Vladimir Sementsov-Ogievskiy
                   ` (11 preceding siblings ...)
  2021-08-27 18:18 ` [PATCH v2 12/19] block/block-copy: add write-unchanged mode Vladimir Sementsov-Ogievskiy
@ 2021-08-27 18:18 ` Vladimir Sementsov-Ogievskiy
  2021-08-27 18:18 ` [PATCH v2 14/19] iotests/image-fleecing: add test-case for fleecing format node Vladimir Sementsov-Ogievskiy
                   ` (6 subsequent siblings)
  19 siblings, 0 replies; 24+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-08-27 18:18 UTC (permalink / raw)
  To: qemu-block
  Cc: qemu-devel, armbru, xiechanglong.d, wencongyang2, eblake, hreitz,
	kwolf, vsementsov, jsnow

As announced in the previous commit, we need to use write-unchanged
operations for fleecing, so that the fleecing client may unshare writes
if needed.
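
To illustrate why the permission matters, here is a toy Python model of
the block layer's rule that two parents of one node can coexist only if
neither uses a permission the other does not share. It is a sketch under
assumptions: the Perm bit values are placeholders rather than QEMU's
BLK_PERM_* constants, compatible() is a hypothetical helper, and the
"client" tuple stands for any fleecing client that reads the node and
refuses to share plain WRITE.

```python
from enum import IntFlag


class Perm(IntFlag):
    # Placeholder bits; QEMU's BLK_PERM_* constants have their own values.
    CONSISTENT_READ = 0x1
    WRITE = 0x2
    WRITE_UNCHANGED = 0x4
    RESIZE = 0x8


ALL = Perm.CONSISTENT_READ | Perm.WRITE | Perm.WRITE_UNCHANGED | Perm.RESIZE


def compatible(a, b):
    """Each parent is a (perm, shared) pair. They can coexist only if
    neither uses a permission that the other does not share."""
    a_perm, a_shared = a
    b_perm, b_shared = b
    return not (a_perm & ~b_shared) and not (b_perm & ~a_shared)


# Hypothetical fleecing client: reads, and does not share WRITE.
client = (Perm.CONSISTENT_READ, ALL & ~Perm.WRITE)

# The filter's target child, without and with this patch's fleecing mode.
cbw_write = (Perm.WRITE, ALL & ~Perm.RESIZE)
cbw_write_unchanged = (Perm.WRITE_UNCHANGED, ALL & ~Perm.RESIZE)

print(compatible(client, cbw_write))            # False
print(compatible(client, cbw_write_unchanged))  # True
```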

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/copy-before-write.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/block/copy-before-write.c b/block/copy-before-write.c
index 7e4e4bf7d4..91250cc634 100644
--- a/block/copy-before-write.c
+++ b/block/copy-before-write.c
@@ -132,6 +132,8 @@ static void cbw_child_perm(BlockDriverState *bs, BdrvChild *c,
                            uint64_t perm, uint64_t shared,
                            uint64_t *nperm, uint64_t *nshared)
 {
+    BDRVCopyBeforeWriteState *s = bs->opaque;
+
     if (!(role & BDRV_CHILD_FILTERED)) {
         /*
          * Target child
@@ -142,7 +144,7 @@ static void cbw_child_perm(BlockDriverState *bs, BdrvChild *c,
          * only upfront.
          */
         *nshared = BLK_PERM_ALL & ~BLK_PERM_RESIZE;
-        *nperm = BLK_PERM_WRITE;
+        *nperm = s->fleecing ? BLK_PERM_WRITE_UNCHANGED : BLK_PERM_WRITE;
     } else {
         /* Source child */
         bdrv_default_perms(bs, c, role, reopen_queue,
@@ -212,7 +214,14 @@ static int cbw_open(BlockDriverState *bs, QDict *options, int flags,
             ((BDRV_REQ_FUA | BDRV_REQ_MAY_UNMAP | BDRV_REQ_NO_FALLBACK) &
              bs->file->bs->supported_zero_flags);
 
-    s->bcs = block_copy_state_new(bs->file, s->target, bitmap, false, errp);
+    /*
+     * For the fleecing scheme, set write_unchanged=true, as our
+     * copy-before-write operations will actually be write-unchanged. We also
+     * take the write-unchanged permission instead of write, which is important
+     * for backup with immutable_source=true to work as a fleecing client.
+     */
+    s->bcs = block_copy_state_new(bs->file, s->target, bitmap,
+                                  is_fleecing_drv(unfiltered_target), errp);
     if (!s->bcs) {
         error_prepend(errp, "Cannot create block-copy-state: ");
         return -EINVAL;
-- 
2.29.2




* [PATCH v2 14/19] iotests/image-fleecing: add test-case for fleecing format node
  2021-08-27 18:17 [PATCH v2 00/19] Make image fleecing more usable Vladimir Sementsov-Ogievskiy
                   ` (12 preceding siblings ...)
  2021-08-27 18:18 ` [PATCH v2 13/19] block/copy-before-write: use write-unchanged in fleecing mode Vladimir Sementsov-Ogievskiy
@ 2021-08-27 18:18 ` Vladimir Sementsov-Ogievskiy
  2021-08-27 18:18 ` [PATCH v2 15/19] iotests.py: add qemu_io_pipe_and_status() Vladimir Sementsov-Ogievskiy
                   ` (5 subsequent siblings)
  19 siblings, 0 replies; 24+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-08-27 18:18 UTC (permalink / raw)
  To: qemu-block
  Cc: qemu-devel, armbru, xiechanglong.d, wencongyang2, eblake, hreitz,
	kwolf, vsementsov, jsnow

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 tests/qemu-iotests/tests/image-fleecing     | 67 +++++++++++++------
 tests/qemu-iotests/tests/image-fleecing.out | 74 ++++++++++++++++++++-
 2 files changed, 121 insertions(+), 20 deletions(-)

diff --git a/tests/qemu-iotests/tests/image-fleecing b/tests/qemu-iotests/tests/image-fleecing
index f6318492c6..f18881fa71 100755
--- a/tests/qemu-iotests/tests/image-fleecing
+++ b/tests/qemu-iotests/tests/image-fleecing
@@ -48,12 +48,17 @@ remainder = [('0xd5', '0x108000',  '32k'), # Right-end of partial-left [1]
              ('0xdc', '32M',       '32k'), # Left-end of partial-right [2]
              ('0xcd', '0x3ff0000', '64k')] # patterns[3]
 
-def do_test(use_cbw, base_img_path, fleece_img_path, nbd_sock_path, vm):
+def do_test(use_cbw, use_fleecing_filter, base_img_path,
+            fleece_img_path, nbd_sock_path, vm):
     log('--- Setting up images ---')
     log('')
 
     assert qemu_img('create', '-f', iotests.imgfmt, base_img_path, '64M') == 0
-    assert qemu_img('create', '-f', 'qcow2', fleece_img_path, '64M') == 0
+    if use_fleecing_filter:
+        assert use_cbw
+        assert qemu_img('create', '-f', 'raw', fleece_img_path, '64M') == 0
+    else:
+        assert qemu_img('create', '-f', 'qcow2', fleece_img_path, '64M') == 0
 
     for p in patterns:
         qemu_io('-f', iotests.imgfmt,
@@ -80,24 +85,39 @@ def do_test(use_cbw, base_img_path, fleece_img_path, nbd_sock_path, vm):
     log('')
 
 
-    # create tmp_node backed by src_node
-    log(vm.qmp('blockdev-add', {
-        'driver': 'qcow2',
-        'node-name': tmp_node,
-        'file': {
+    if use_fleecing_filter:
+        log(vm.qmp('blockdev-add', {
+            'node-name': tmp_node,
             'driver': 'file',
             'filename': fleece_img_path,
-        },
-        'backing': src_node,
-    }))
+        }))
+    else:
+        # create tmp_node backed by src_node
+        log(vm.qmp('blockdev-add', {
+            'driver': 'qcow2',
+            'node-name': tmp_node,
+            'file': {
+                'driver': 'file',
+                'filename': fleece_img_path,
+            },
+            'backing': src_node,
+        }))
 
     # Establish CBW from source to fleecing node
     if use_cbw:
+        if use_fleecing_filter:
+            log(vm.qmp('blockdev-add', {
+                'driver': 'fleecing',
+                'node-name': 'fl-fleecing',
+                'file': tmp_node,
+                'source': src_node,
+            }))
+
         log(vm.qmp('blockdev-add', {
             'driver': 'copy-before-write',
             'node-name': 'fl-cbw',
             'file': src_node,
-            'target': tmp_node
+            'target': 'fl-fleecing' if use_fleecing_filter else tmp_node
         }))
 
         log(vm.qmp('qom-set', path=qom_path, property='drive', value='fl-cbw'))
@@ -108,16 +128,18 @@ def do_test(use_cbw, base_img_path, fleece_img_path, nbd_sock_path, vm):
                    target=tmp_node,
                    sync='none'))
 
+    export_node = 'fl-fleecing' if use_fleecing_filter else tmp_node
+
     log('')
     log('--- Setting up NBD Export ---')
     log('')
 
-    nbd_uri = 'nbd+unix:///%s?socket=%s' % (tmp_node, nbd_sock_path)
+    nbd_uri = 'nbd+unix:///%s?socket=%s' % (export_node, nbd_sock_path)
     log(vm.qmp('nbd-server-start',
                {'addr': { 'type': 'unix',
                           'data': { 'path': nbd_sock_path } } }))
 
-    log(vm.qmp('nbd-server-add', device=tmp_node))
+    log(vm.qmp('nbd-server-add', device=export_node))
 
     log('')
     log('--- Sanity Check ---')
@@ -150,16 +172,19 @@ def do_test(use_cbw, base_img_path, fleece_img_path, nbd_sock_path, vm):
     log('--- Cleanup ---')
     log('')
 
+    log(vm.qmp('nbd-server-stop'))
+
     if use_cbw:
         log(vm.qmp('qom-set', path=qom_path, property='drive', value=src_node))
         log(vm.qmp('blockdev-del', node_name='fl-cbw'))
+        if use_fleecing_filter:
+            log(vm.qmp('blockdev-del', node_name='fl-fleecing'))
     else:
         log(vm.qmp('block-job-cancel', device='fleecing'))
         e = vm.event_wait('BLOCK_JOB_CANCELLED')
         assert e is not None
         log(e, filters=[iotests.filter_qmp_event])
 
-    log(vm.qmp('nbd-server-stop'))
     log(vm.qmp('blockdev-del', node_name=tmp_node))
     vm.shutdown()
 
@@ -176,17 +201,21 @@ def do_test(use_cbw, base_img_path, fleece_img_path, nbd_sock_path, vm):
     log('Done')
 
 
-def test(use_cbw):
+def test(use_cbw, use_fleecing_filter):
     with iotests.FilePath('base.img') as base_img_path, \
          iotests.FilePath('fleece.img') as fleece_img_path, \
          iotests.FilePath('nbd.sock',
                           base_dir=iotests.sock_dir) as nbd_sock_path, \
          iotests.VM() as vm:
-        do_test(use_cbw, base_img_path, fleece_img_path, nbd_sock_path, vm)
+        do_test(use_cbw, use_fleecing_filter, base_img_path,
+                fleece_img_path, nbd_sock_path, vm)
 
 
 log('=== Test backup(sync=none) based fleecing ===\n')
-test(False)
+test(False, False)
+
+log('=== Test cbw-filter based fleecing ===\n')
+test(True, False)
 
-log('=== Test filter based fleecing ===\n')
-test(True)
+log('=== Test fleecing-format based fleecing ===\n')
+test(True, True)
diff --git a/tests/qemu-iotests/tests/image-fleecing.out b/tests/qemu-iotests/tests/image-fleecing.out
index e96d122a8b..da0af93388 100644
--- a/tests/qemu-iotests/tests/image-fleecing.out
+++ b/tests/qemu-iotests/tests/image-fleecing.out
@@ -51,9 +51,79 @@ read -P0 0x3fe0000 64k
 
 --- Cleanup ---
 
+{"return": {}}
 {"return": {}}
 {"data": {"device": "fleecing", "len": 67108864, "offset": 393216, "speed": 0, "type": "backup"}, "event": "BLOCK_JOB_CANCELLED", "timestamp": {"microseconds": "USECS", "seconds": "SECS"}}
 {"return": {}}
+
+--- Confirming writes ---
+
+read -P0xab 0 64k
+read -P0xad 0x00f8000 64k
+read -P0x1d 0x2008000 64k
+read -P0xea 0x3fe0000 64k
+read -P0xd5 0x108000 32k
+read -P0xdc 32M 32k
+read -P0xcd 0x3ff0000 64k
+
+Done
+=== Test cbw-filter based fleecing ===
+
+--- Setting up images ---
+
+Done
+
+--- Launching VM ---
+
+Done
+
+--- Setting up Fleecing Graph ---
+
+{"return": {}}
+{"return": {}}
+{"return": {}}
+
+--- Setting up NBD Export ---
+
+{"return": {}}
+{"return": {}}
+
+--- Sanity Check ---
+
+read -P0x5d 0 64k
+read -P0xd5 1M 64k
+read -P0xdc 32M 64k
+read -P0xcd 0x3ff0000 64k
+read -P0 0x00f8000 32k
+read -P0 0x2010000 32k
+read -P0 0x3fe0000 64k
+
+--- Testing COW ---
+
+write -P0xab 0 64k
+{"return": ""}
+write -P0xad 0x00f8000 64k
+{"return": ""}
+write -P0x1d 0x2008000 64k
+{"return": ""}
+write -P0xea 0x3fe0000 64k
+{"return": ""}
+
+--- Verifying Data ---
+
+read -P0x5d 0 64k
+read -P0xd5 1M 64k
+read -P0xdc 32M 64k
+read -P0xcd 0x3ff0000 64k
+read -P0 0x00f8000 32k
+read -P0 0x2010000 32k
+read -P0 0x3fe0000 64k
+
+--- Cleanup ---
+
+{"return": {}}
+{"return": {}}
+{"return": {}}
 {"return": {}}
 
 --- Confirming writes ---
@@ -67,7 +137,7 @@ read -P0xdc 32M 32k
 read -P0xcd 0x3ff0000 64k
 
 Done
-=== Test filter based fleecing ===
+=== Test fleecing-format based fleecing ===
 
 --- Setting up images ---
 
@@ -82,6 +152,7 @@ Done
 {"return": {}}
 {"return": {}}
 {"return": {}}
+{"return": {}}
 
 --- Setting up NBD Export ---
 
@@ -125,6 +196,7 @@ read -P0 0x3fe0000 64k
 {"return": {}}
 {"return": {}}
 {"return": {}}
+{"return": {}}
 
 --- Confirming writes ---
 
-- 
2.29.2




* [PATCH v2 15/19] iotests.py: add qemu_io_pipe_and_status()
  2021-08-27 18:17 [PATCH v2 00/19] Make image fleecing more usable Vladimir Sementsov-Ogievskiy
                   ` (13 preceding siblings ...)
  2021-08-27 18:18 ` [PATCH v2 14/19] iotests/image-fleecing: add test-case for fleecing format node Vladimir Sementsov-Ogievskiy
@ 2021-08-27 18:18 ` Vladimir Sementsov-Ogievskiy
  2021-08-27 18:18 ` [PATCH v2 16/19] iotests/image-fleecing: add test case with bitmap Vladimir Sementsov-Ogievskiy
                   ` (4 subsequent siblings)
  19 siblings, 0 replies; 24+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-08-27 18:18 UTC (permalink / raw)
  To: qemu-block
  Cc: qemu-devel, armbru, xiechanglong.d, wencongyang2, eblake, hreitz,
	kwolf, vsementsov, jsnow

Add a helper that returns both the output and the status, to be used in
the following commit.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 tests/qemu-iotests/iotests.py | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index 77efcb0927..ff96d8ef16 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -206,6 +206,10 @@ def qemu_io(*args):
     args = qemu_io_args + list(args)
     return qemu_tool_pipe_and_status('qemu-io', args)[0]
 
+def qemu_io_pipe_and_status(*args):
+    args = qemu_io_args + list(args)
+    return qemu_tool_pipe_and_status('qemu-io', args)
+
 def qemu_io_log(*args):
     result = qemu_io(*args)
     log(result, filters=[filter_testfiles, filter_qemu_io])
-- 
2.29.2




* [PATCH v2 16/19] iotests/image-fleecing: add test case with bitmap
  2021-08-27 18:17 [PATCH v2 00/19] Make image fleecing more usable Vladimir Sementsov-Ogievskiy
                   ` (14 preceding siblings ...)
  2021-08-27 18:18 ` [PATCH v2 15/19] iotests.py: add qemu_io_pipe_and_status() Vladimir Sementsov-Ogievskiy
@ 2021-08-27 18:18 ` Vladimir Sementsov-Ogievskiy
  2021-08-27 18:18 ` [PATCH v2 17/19] block: blk_root(): return non-const pointer Vladimir Sementsov-Ogievskiy
                   ` (3 subsequent siblings)
  19 siblings, 0 replies; 24+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-08-27 18:18 UTC (permalink / raw)
  To: qemu-block
  Cc: qemu-devel, armbru, xiechanglong.d, wencongyang2, eblake, hreitz,
	kwolf, vsementsov, jsnow

Note that reads of zero areas (not dirty in the bitmap) fail; that is
the expected behavior.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 tests/qemu-iotests/tests/image-fleecing     | 32 ++++++--
 tests/qemu-iotests/tests/image-fleecing.out | 84 +++++++++++++++++++++
 2 files changed, 108 insertions(+), 8 deletions(-)

diff --git a/tests/qemu-iotests/tests/image-fleecing b/tests/qemu-iotests/tests/image-fleecing
index f18881fa71..ab95e4960e 100755
--- a/tests/qemu-iotests/tests/image-fleecing
+++ b/tests/qemu-iotests/tests/image-fleecing
@@ -23,7 +23,7 @@
 # Creator/Owner: John Snow <jsnow@redhat.com>
 
 import iotests
-from iotests import log, qemu_img, qemu_io, qemu_io_silent
+from iotests import log, qemu_img, qemu_io, qemu_io_silent, qemu_io_pipe_and_status
 
 iotests.script_initialize(
     supported_fmts=['qcow2', 'qcow', 'qed', 'vmdk', 'vhdx', 'raw'],
@@ -49,11 +49,15 @@ remainder = [('0xd5', '0x108000',  '32k'), # Right-end of partial-left [1]
              ('0xcd', '0x3ff0000', '64k')] # patterns[3]
 
 def do_test(use_cbw, use_fleecing_filter, base_img_path,
-            fleece_img_path, nbd_sock_path, vm):
+            fleece_img_path, nbd_sock_path, vm,
+            bitmap=False):
     log('--- Setting up images ---')
     log('')
 
     assert qemu_img('create', '-f', iotests.imgfmt, base_img_path, '64M') == 0
+    if bitmap:
+        assert qemu_img('bitmap', '--add', base_img_path, 'bitmap0') == 0
+
     if use_fleecing_filter:
         assert use_cbw
         assert qemu_img('create', '-f', 'raw', fleece_img_path, '64M') == 0
@@ -113,12 +117,17 @@ def do_test(use_cbw, use_fleecing_filter, base_img_path,
                 'source': src_node,
             }))
 
-        log(vm.qmp('blockdev-add', {
+        fl_cbw = {
             'driver': 'copy-before-write',
             'node-name': 'fl-cbw',
             'file': src_node,
             'target': 'fl-fleecing' if use_fleecing_filter else tmp_node
-        }))
+        }
+
+        if bitmap:
+            fl_cbw['bitmap'] = {'node': src_node, 'name': 'bitmap0'}
+
+        log(vm.qmp('blockdev-add', fl_cbw))
 
         log(vm.qmp('qom-set', path=qom_path, property='drive', value='fl-cbw'))
     else:
@@ -148,7 +157,9 @@ def do_test(use_cbw, use_fleecing_filter, base_img_path,
     for p in patterns + zeroes:
         cmd = 'read -P%s %s %s' % p
         log(cmd)
-        assert qemu_io_silent('-r', '-f', 'raw', '-c', cmd, nbd_uri) == 0
+        out, ret = qemu_io_pipe_and_status('-r', '-f', 'raw', '-c', cmd, nbd_uri)
+        if ret != 0:
+            print(out)
 
     log('')
     log('--- Testing COW ---')
@@ -166,7 +177,9 @@ def do_test(use_cbw, use_fleecing_filter, base_img_path,
     for p in patterns + zeroes:
         cmd = 'read -P%s %s %s' % p
         log(cmd)
-        assert qemu_io_silent('-r', '-f', 'raw', '-c', cmd, nbd_uri) == 0
+        out, ret = qemu_io_pipe_and_status('-r', '-f', 'raw', '-c', cmd, nbd_uri)
+        if ret != 0:
+            print(out)
 
     log('')
     log('--- Cleanup ---')
@@ -201,14 +214,14 @@ def do_test(use_cbw, use_fleecing_filter, base_img_path,
     log('Done')
 
 
-def test(use_cbw, use_fleecing_filter):
+def test(use_cbw, use_fleecing_filter, bitmap=False):
     with iotests.FilePath('base.img') as base_img_path, \
          iotests.FilePath('fleece.img') as fleece_img_path, \
          iotests.FilePath('nbd.sock',
                           base_dir=iotests.sock_dir) as nbd_sock_path, \
          iotests.VM() as vm:
         do_test(use_cbw, use_fleecing_filter, base_img_path,
-                fleece_img_path, nbd_sock_path, vm)
+                fleece_img_path, nbd_sock_path, vm, bitmap=bitmap)
 
 
 log('=== Test backup(sync=none) based fleecing ===\n')
@@ -219,3 +232,6 @@ test(True, False)
 
 log('=== Test fleecing-format based fleecing ===\n')
 test(True, True)
+
+log('=== Test fleecing-format based fleecing with bitmap ===\n')
+test(True, True, bitmap=True)
diff --git a/tests/qemu-iotests/tests/image-fleecing.out b/tests/qemu-iotests/tests/image-fleecing.out
index da0af93388..62e1c1fe42 100644
--- a/tests/qemu-iotests/tests/image-fleecing.out
+++ b/tests/qemu-iotests/tests/image-fleecing.out
@@ -190,6 +190,90 @@ read -P0 0x00f8000 32k
 read -P0 0x2010000 32k
 read -P0 0x3fe0000 64k
 
+--- Cleanup ---
+
+{"return": {}}
+{"return": {}}
+{"return": {}}
+{"return": {}}
+{"return": {}}
+
+--- Confirming writes ---
+
+read -P0xab 0 64k
+read -P0xad 0x00f8000 64k
+read -P0x1d 0x2008000 64k
+read -P0xea 0x3fe0000 64k
+read -P0xd5 0x108000 32k
+read -P0xdc 32M 32k
+read -P0xcd 0x3ff0000 64k
+
+Done
+=== Test fleecing-format based fleecing with bitmap ===
+
+--- Setting up images ---
+
+Done
+
+--- Launching VM ---
+
+Done
+
+--- Setting up Fleecing Graph ---
+
+{"return": {}}
+{"return": {}}
+{"return": {}}
+{"return": {}}
+
+--- Setting up NBD Export ---
+
+{"return": {}}
+{"return": {}}
+
+--- Sanity Check ---
+
+read -P0x5d 0 64k
+read -P0xd5 1M 64k
+read -P0xdc 32M 64k
+read -P0xcd 0x3ff0000 64k
+read -P0 0x00f8000 32k
+read failed: Invalid argument
+
+read -P0 0x2010000 32k
+read failed: Invalid argument
+
+read -P0 0x3fe0000 64k
+read failed: Invalid argument
+
+
+--- Testing COW ---
+
+write -P0xab 0 64k
+{"return": ""}
+write -P0xad 0x00f8000 64k
+{"return": ""}
+write -P0x1d 0x2008000 64k
+{"return": ""}
+write -P0xea 0x3fe0000 64k
+{"return": ""}
+
+--- Verifying Data ---
+
+read -P0x5d 0 64k
+read -P0xd5 1M 64k
+read -P0xdc 32M 64k
+read -P0xcd 0x3ff0000 64k
+read -P0 0x00f8000 32k
+read failed: Invalid argument
+
+read -P0 0x2010000 32k
+read failed: Invalid argument
+
+read -P0 0x3fe0000 64k
+read failed: Invalid argument
+
+
 --- Cleanup ---
 
 {"return": {}}
-- 
2.29.2
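
Why exactly the three zero-area reads fail can be checked with a small
model. This is only a sketch under stated assumptions: the bitmap
granularity is assumed to be 64k, the class BitmapGatedView is
hypothetical, and EINVAL is chosen only because the expected output above
shows "Invalid argument". The offsets are the ones from the test output.

```python
import errno

CLUSTER = 64 * 1024  # assumed bitmap granularity


def clusters(offset, length):
    """Indexes of all clusters touched by [offset, offset + length)."""
    first = offset // CLUSTER
    last = (offset + length - 1) // CLUSTER
    return range(first, last + 1)


class BitmapGatedView:
    """Toy export that only serves clusters marked dirty in the bitmap."""

    def __init__(self):
        self.dirty = set()

    def mark_written(self, offset, length):
        self.dirty.update(clusters(offset, length))

    def read(self, offset, length):
        if any(c not in self.dirty for c in clusters(offset, length)):
            raise OSError(errno.EINVAL, 'area is not covered by the bitmap')
        return True


K, M = 1024, 1024 * 1024
view = BitmapGatedView()
# Initial pattern writes; they happen after the bitmap was added.
for off in (0, 1 * M, 32 * M, 0x3ff0000):
    view.mark_written(off, 64 * K)

# The "zeroes" areas were never written, so they are not in the bitmap.
zero_areas = [(0x00f8000, 32 * K), (0x2010000, 32 * K), (0x3fe0000, 64 * K)]
```

In this model every read of a written pattern succeeds and every read in
zero_areas raises, matching the "Sanity Check" section of the new test
output.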




* [PATCH v2 17/19] block: blk_root(): return non-const pointer
  2021-08-27 18:17 [PATCH v2 00/19] Make image fleecing more usable Vladimir Sementsov-Ogievskiy
                   ` (15 preceding siblings ...)
  2021-08-27 18:18 ` [PATCH v2 16/19] iotests/image-fleecing: add test case with bitmap Vladimir Sementsov-Ogievskiy
@ 2021-08-27 18:18 ` Vladimir Sementsov-Ogievskiy
  2021-08-27 18:18 ` [PATCH v2 18/19] qapi: backup: add immutable-source parameter Vladimir Sementsov-Ogievskiy
                   ` (2 subsequent siblings)
  19 siblings, 0 replies; 24+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-08-27 18:18 UTC (permalink / raw)
  To: qemu-block
  Cc: qemu-devel, armbru, xiechanglong.d, wencongyang2, eblake, hreitz,
	kwolf, vsementsov, jsnow

In the following patch we'll want to pass blk children to block-copy,
and const pointers are not enough for that. So return a non-const
pointer from blk_root().

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 include/sysemu/block-backend.h | 2 +-
 block/block-backend.c          | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/sysemu/block-backend.h b/include/sysemu/block-backend.h
index 29d4fdbf63..5d4dd877b7 100644
--- a/include/sysemu/block-backend.h
+++ b/include/sysemu/block-backend.h
@@ -271,7 +271,7 @@ int coroutine_fn blk_co_copy_range(BlockBackend *blk_in, int64_t off_in,
                                    int bytes, BdrvRequestFlags read_flags,
                                    BdrvRequestFlags write_flags);
 
-const BdrvChild *blk_root(BlockBackend *blk);
+BdrvChild *blk_root(BlockBackend *blk);
 
 int blk_make_empty(BlockBackend *blk, Error **errp);
 
diff --git a/block/block-backend.c b/block/block-backend.c
index 6140d133e2..b167c630d2 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -2463,7 +2463,7 @@ int coroutine_fn blk_co_copy_range(BlockBackend *blk_in, int64_t off_in,
                               bytes, read_flags, write_flags);
 }
 
-const BdrvChild *blk_root(BlockBackend *blk)
+BdrvChild *blk_root(BlockBackend *blk)
 {
     return blk->root;
 }
-- 
2.29.2




* [PATCH v2 18/19] qapi: backup: add immutable-source parameter
  2021-08-27 18:17 [PATCH v2 00/19] Make image fleecing more usable Vladimir Sementsov-Ogievskiy
                   ` (16 preceding siblings ...)
  2021-08-27 18:18 ` [PATCH v2 17/19] block: blk_root(): return non-const pointer Vladimir Sementsov-Ogievskiy
@ 2021-08-27 18:18 ` Vladimir Sementsov-Ogievskiy
  2021-09-01 11:47   ` Markus Armbruster
  2021-08-27 18:18 ` [PATCH v2 19/19] iotests/image-fleecing: test push backup with fleecing Vladimir Sementsov-Ogievskiy
  2021-09-22  7:45 ` [PATCH v2 00/19] Make image fleecing more usable Vladimir Sementsov-Ogievskiy
  19 siblings, 1 reply; 24+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-08-27 18:18 UTC (permalink / raw)
  To: qemu-block
  Cc: qemu-devel, armbru, xiechanglong.d, wencongyang2, eblake, hreitz,
	kwolf, vsementsov, jsnow

We are on the way to implementing an internal-backup-with-fleecing
scheme, in which a backup job copies from the fleecing block driver node
(which is the target of the copy-before-write filter) to the final
backup target. This job doesn't need its own filter: the fleecing block
driver node is a kind of snapshot, so it is immutable from the reader's
point of view.

Let's add a backup parameter that, instead of inserting a filter,
unshares writes on the source. This way the backup job becomes a simple
copying process.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 qapi/block-core.json      | 12 +++++++-
 include/block/block_int.h |  1 +
 block/backup.c            | 61 +++++++++++++++++++++++++++++++++++----
 block/replication.c       |  2 +-
 blockdev.c                |  1 +
 5 files changed, 70 insertions(+), 7 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 8a333136f5..995ca16a5e 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -1391,6 +1391,15 @@
 #                    above node specified by @drive. If this option is not given,
 #                    a node name is autogenerated. (Since: 4.2)
 #
+# @immutable-source: If true, assume source is immutable and don't insert filter
+#                    as no copy-before-write operations are needed. It will
+#                    fail if there are existing writers on source node, as well,
+#                    any attempt to add writer to source node during backup will
+#                    fail. @filter-node-name must not be set.
+#                    If false, insert copy-before-write filter above source node
+#                    (see also @filter-node-name parameter).
+#                    Default is false. (Since 6.2)
+#
 # @x-perf: Performance options. (Since 6.0)
 #
 # Note: @on-source-error and @on-target-error only affect background
@@ -1407,7 +1416,8 @@
             '*on-source-error': 'BlockdevOnError',
             '*on-target-error': 'BlockdevOnError',
             '*auto-finalize': 'bool', '*auto-dismiss': 'bool',
-            '*filter-node-name': 'str', '*x-perf': 'BackupPerf'  } }
+            '*filter-node-name': 'str', '*immutable-source': 'bool',
+            '*x-perf': 'BackupPerf'  } }
 
 ##
 # @DriveBackup:
diff --git a/include/block/block_int.h b/include/block/block_int.h
index f1a54db0f8..6571dad061 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -1284,6 +1284,7 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
                             BitmapSyncMode bitmap_mode,
                             bool compress,
                             const char *filter_node_name,
+                            bool immutable_source,
                             BackupPerf *perf,
                             BlockdevOnError on_source_error,
                             BlockdevOnError on_target_error,
diff --git a/block/backup.c b/block/backup.c
index 687d2882bc..a7f4d0d663 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -34,6 +34,14 @@ typedef struct BackupBlockJob {
     BlockDriverState *cbw;
     BlockDriverState *source_bs;
     BlockDriverState *target_bs;
+    BlockBackend *source_blk;
+    BlockBackend *target_blk;
+    /*
+     * Note that if backup runs with filter (immutable-source parameter is
+     * false), @cbw is set but @source_blk and @target_blk are NULL.
+     * Otherwise if backup runs without filter (immutable-source parameter is
+     * true), @cbw is NULL but @source_blk and @target_blk are set.
+     */
 
     BdrvDirtyBitmap *sync_bitmap;
 
@@ -102,7 +110,17 @@ static void backup_clean(Job *job)
 {
     BackupBlockJob *s = container_of(job, BackupBlockJob, common.job);
     block_job_remove_all_bdrv(&s->common);
-    bdrv_cbw_drop(s->cbw);
+    if (s->cbw) {
+        assert(!s->source_blk && !s->target_blk);
+        bdrv_cbw_drop(s->cbw);
+    } else {
+        block_copy_state_free(s->bcs);
+        s->bcs = NULL;
+        blk_unref(s->source_blk);
+        s->source_blk = NULL;
+        blk_unref(s->target_blk);
+        s->target_blk = NULL;
+    }
 }
 
 void backup_do_checkpoint(BlockJob *job, Error **errp)
@@ -356,6 +374,7 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
                   BitmapSyncMode bitmap_mode,
                   bool compress,
                   const char *filter_node_name,
+                  bool immutable_source,
                   BackupPerf *perf,
                   BlockdevOnError on_source_error,
                   BlockdevOnError on_target_error,
@@ -368,6 +387,7 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
     int64_t cluster_size;
     BlockDriverState *cbw = NULL;
     BlockCopyState *bcs = NULL;
+    BlockBackend *source_blk = NULL, *target_blk = NULL;
 
     assert(bs);
     assert(target);
@@ -376,6 +396,12 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
     assert(sync_mode != MIRROR_SYNC_MODE_INCREMENTAL);
     assert(sync_bitmap || sync_mode != MIRROR_SYNC_MODE_BITMAP);
 
+    if (immutable_source && filter_node_name) {
+        error_setg(errp, "immutable-source and filter-node-name should not "
+                   "be set simultaneously");
+        return NULL;
+    }
+
     if (bs == target) {
         error_setg(errp, "Source and target cannot be the same");
         return NULL;
@@ -450,9 +476,30 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
         goto error;
     }
 
-    cbw = bdrv_cbw_append(bs, target, filter_node_name, &bcs, errp);
-    if (!cbw) {
-        goto error;
+    if (immutable_source) {
+        source_blk = blk_new_with_bs(bs, BLK_PERM_CONSISTENT_READ,
+                                        BLK_PERM_WRITE_UNCHANGED |
+                                        BLK_PERM_CONSISTENT_READ, errp);
+        if (!source_blk) {
+            goto error;
+        }
+
+        target_blk  = blk_new_with_bs(target, BLK_PERM_WRITE,
+                                      BLK_PERM_CONSISTENT_READ, errp);
+        if (!target_blk) {
+            goto error;
+        }
+
+        bcs = block_copy_state_new(blk_root(source_blk), blk_root(target_blk),
+                                   NULL, false, errp);
+        if (!bcs) {
+            goto error;
+        }
+    } else {
+        cbw = bdrv_cbw_append(bs, target, filter_node_name, &bcs, errp);
+        if (!cbw) {
+            goto error;
+        }
     }
 
     cluster_size = block_copy_cluster_size(bcs);
@@ -464,7 +511,7 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
     }
 
     /* job->len is fixed, so we can't allow resize */
-    job = block_job_create(job_id, &backup_job_driver, txn, cbw,
+    job = block_job_create(job_id, &backup_job_driver, txn, cbw ?: bs,
                            0, BLK_PERM_ALL,
                            speed, creation_flags, cb, opaque, errp);
     if (!job) {
@@ -474,6 +521,8 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
     job->cbw = cbw;
     job->source_bs = bs;
     job->target_bs = target;
+    job->source_blk = source_blk;
+    job->target_blk = target_blk;
     job->on_source_error = on_source_error;
     job->on_target_error = on_target_error;
     job->sync_mode = sync_mode;
@@ -501,6 +550,8 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
     if (cbw) {
         bdrv_cbw_drop(cbw);
     }
+    blk_unref(source_blk);
+    blk_unref(target_blk);
 
     return NULL;
 }
diff --git a/block/replication.c b/block/replication.c
index 32444b9a8f..5dba4907b8 100644
--- a/block/replication.c
+++ b/block/replication.c
@@ -590,7 +590,7 @@ static void replication_start(ReplicationState *rs, ReplicationMode mode,
         s->backup_job = backup_job_create(
                                 NULL, s->secondary_disk->bs, s->hidden_disk->bs,
                                 0, MIRROR_SYNC_MODE_NONE, NULL, 0, false, NULL,
-                                &perf,
+                                false, &perf,
                                 BLOCKDEV_ON_ERROR_REPORT,
                                 BLOCKDEV_ON_ERROR_REPORT, JOB_INTERNAL,
                                 backup_job_completed, bs, NULL, &local_err);
diff --git a/blockdev.c b/blockdev.c
index 3d8ac368a1..3a8ff7c5ce 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -2911,6 +2911,7 @@ static BlockJob *do_backup_common(BackupCommon *backup,
                             backup->sync, bmap, backup->bitmap_mode,
                             backup->compress,
                             backup->filter_node_name,
+                            backup->immutable_source,
                             &perf,
                             backup->on_source_error,
                             backup->on_target_error,
-- 
2.29.2
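
The "unshare writes on source" idea can be modelled in a few lines. This
is a sketch, not QEMU code: the Node class, the user names and the bit
values are assumptions made for illustration. Only the permission
combination taken by the backup (it needs CONSISTENT_READ and shares
CONSISTENT_READ and WRITE_UNCHANGED, so WRITE is not shared) comes from
the patch.

```python
# Illustrative permission bits; the values are assumptions of this model.
CONSISTENT_READ = 0x01
WRITE = 0x02
WRITE_UNCHANGED = 0x04
ALL = CONSISTENT_READ | WRITE | WRITE_UNCHANGED


class Node:
    """Toy block node that tracks what each parent needs and shares."""

    def __init__(self, name):
        self.name = name
        self.parents = []  # list of (user, perm, shared)

    def attach(self, user, perm, shared):
        for other, o_perm, o_shared in self.parents:
            if perm & ~o_shared:
                raise PermissionError(
                    '%s needs a permission that %s does not share'
                    % (user, other))
            if o_perm & ~shared:
                raise PermissionError(
                    '%s does not share a permission that %s holds'
                    % (user, other))
        self.parents.append((user, perm, shared))


def try_attach(node, user, perm, shared):
    try:
        node.attach(user, perm, shared)
        return True
    except PermissionError:
        return False


BACKUP_PERM = CONSISTENT_READ
BACKUP_SHARED = CONSISTENT_READ | WRITE_UNCHANGED

# Case 1: backup attaches first, a writer tries to join later.
idle = Node('source')
backup_ok = try_attach(idle, 'backup', BACKUP_PERM, BACKUP_SHARED)
reader_ok = try_attach(idle, 'reader', CONSISTENT_READ, ALL)
late_writer_ok = try_attach(idle, 'writer', WRITE, ALL)

# Case 2: a writer already exists, backup start must fail.
busy = Node('source')
early_writer_ok = try_attach(busy, 'writer', WRITE, ALL)
backup_on_busy_ok = try_attach(busy, 'backup', BACKUP_PERM, BACKUP_SHARED)
```

Both failure cases described for @immutable-source fall out of the same
check: an existing writer blocks the backup, and a running backup blocks
new writers, while additional readers remain possible.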




* [PATCH v2 19/19] iotests/image-fleecing: test push backup with fleecing
  2021-08-27 18:17 [PATCH v2 00/19] Make image fleecing more usable Vladimir Sementsov-Ogievskiy
                   ` (17 preceding siblings ...)
  2021-08-27 18:18 ` [PATCH v2 18/19] qapi: backup: add immutable-source parameter Vladimir Sementsov-Ogievskiy
@ 2021-08-27 18:18 ` Vladimir Sementsov-Ogievskiy
  2021-09-22  7:45 ` [PATCH v2 00/19] Make image fleecing more usable Vladimir Sementsov-Ogievskiy
  19 siblings, 0 replies; 24+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-08-27 18:18 UTC (permalink / raw)
  To: qemu-block
  Cc: qemu-devel, armbru, xiechanglong.d, wencongyang2, eblake, hreitz,
	kwolf, vsementsov, jsnow

Add a test for push backup with fleecing:

 - start fleecing with the copy-before-write filter
 - start a backup job from the temporary fleecing node to the actual
   backup target

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 tests/qemu-iotests/tests/image-fleecing     | 121 ++++++++++++++------
 tests/qemu-iotests/tests/image-fleecing.out |  63 ++++++++++
 2 files changed, 152 insertions(+), 32 deletions(-)

diff --git a/tests/qemu-iotests/tests/image-fleecing b/tests/qemu-iotests/tests/image-fleecing
index ab95e4960e..ed57358e5d 100755
--- a/tests/qemu-iotests/tests/image-fleecing
+++ b/tests/qemu-iotests/tests/image-fleecing
@@ -48,9 +48,15 @@ remainder = [('0xd5', '0x108000',  '32k'), # Right-end of partial-left [1]
              ('0xdc', '32M',       '32k'), # Left-end of partial-right [2]
              ('0xcd', '0x3ff0000', '64k')] # patterns[3]
 
-def do_test(use_cbw, use_fleecing_filter, base_img_path,
-            fleece_img_path, nbd_sock_path, vm,
+def do_test(vm, use_cbw, use_fleecing_filter, base_img_path,
+            fleece_img_path, nbd_sock_path=None,
+            target_img_path=None,
             bitmap=False):
+    push_backup = target_img_path is not None
+    assert (nbd_sock_path is not None) != push_backup
+    if push_backup:
+        assert use_cbw
+
     log('--- Setting up images ---')
     log('')
 
@@ -64,6 +70,9 @@ def do_test(use_cbw, use_fleecing_filter, base_img_path,
     else:
         assert qemu_img('create', '-f', 'qcow2', fleece_img_path, '64M') == 0
 
+    if push_backup:
+        assert qemu_img('create', '-f', 'qcow2', target_img_path, '64M') == 0
+
     for p in patterns:
         qemu_io('-f', iotests.imgfmt,
                 '-c', 'write -P%s %s %s' % p, base_img_path)
@@ -139,27 +148,45 @@ def do_test(use_cbw, use_fleecing_filter, base_img_path,
 
     export_node = 'fl-fleecing' if use_fleecing_filter else tmp_node
 
-    log('')
-    log('--- Setting up NBD Export ---')
-    log('')
+    if push_backup:
+        log('')
+        log('--- Starting actual backup ---')
+        log('')
 
-    nbd_uri = 'nbd+unix:///%s?socket=%s' % (export_node, nbd_sock_path)
-    log(vm.qmp('nbd-server-start',
-               {'addr': { 'type': 'unix',
-                          'data': { 'path': nbd_sock_path } } }))
+        log(vm.qmp('blockdev-add', **{
+            'driver': iotests.imgfmt,
+            'node-name': 'target',
+            'file': {
+                'driver': 'file',
+                'filename': target_img_path
+            }
+        }))
+        log(vm.qmp('blockdev-backup', device=export_node,
+                   sync='full', target='target',
+                   immutable_source=True,
+                   job_id='push-backup', speed=1))
+    else:
+        log('')
+        log('--- Setting up NBD Export ---')
+        log('')
 
-    log(vm.qmp('nbd-server-add', device=export_node))
+        nbd_uri = 'nbd+unix:///%s?socket=%s' % (export_node, nbd_sock_path)
+        log(vm.qmp('nbd-server-start',
+                   {'addr': { 'type': 'unix',
+                              'data': { 'path': nbd_sock_path } } }))
 
-    log('')
-    log('--- Sanity Check ---')
-    log('')
+        log(vm.qmp('nbd-server-add', device=export_node))
 
-    for p in patterns + zeroes:
-        cmd = 'read -P%s %s %s' % p
-        log(cmd)
-        out, ret = qemu_io_pipe_and_status('-r', '-f', 'raw', '-c', cmd, nbd_uri)
-        if ret != 0:
-            print(out)
+        log('')
+        log('--- Sanity Check ---')
+        log('')
+
+        for p in patterns + zeroes:
+            cmd = 'read -P%s %s %s' % p
+            log(cmd)
+            out, ret = qemu_io_pipe_and_status('-r', '-f', 'raw', '-c', cmd, nbd_uri)
+            if ret != 0:
+                print(out)
 
     log('')
     log('--- Testing COW ---')
@@ -170,6 +197,20 @@ def do_test(use_cbw, use_fleecing_filter, base_img_path,
         log(cmd)
         log(vm.hmp_qemu_io(qom_path, cmd, qdev=True))
 
+    if push_backup:
+        # Check that previous operations were done during backup, not after
+        result = vm.qmp('query-block-jobs')
+        if len(result['return']) != 1:
+            log('Backup finished too fast, COW is not tested')
+
+        result = vm.qmp('block-job-set-speed', device='push-backup', speed=0)
+        assert result == {'return': {}}
+
+        log(vm.event_wait(name='BLOCK_JOB_COMPLETED',
+                          match={'data': {'device': 'push-backup'}}),
+                          filters=[iotests.filter_qmp_event])
+        log(vm.qmp('blockdev-del', node_name='target'))
+
     log('')
     log('--- Verifying Data ---')
     log('')
@@ -177,15 +218,19 @@ def do_test(use_cbw, use_fleecing_filter, base_img_path,
     for p in patterns + zeroes:
         cmd = 'read -P%s %s %s' % p
         log(cmd)
-        out, ret = qemu_io_pipe_and_status('-r', '-f', 'raw', '-c', cmd, nbd_uri)
-        if ret != 0:
-            print(out)
+        if push_backup:
+            assert qemu_io_silent('-r', '-c', cmd, target_img_path) == 0
+        else:
+            out, ret = qemu_io_pipe_and_status('-r', '-f', 'raw', '-c', cmd, nbd_uri)
+            if ret != 0:
+                print(out)
 
     log('')
     log('--- Cleanup ---')
     log('')
 
-    log(vm.qmp('nbd-server-stop'))
+    if not push_backup:
+        log(vm.qmp('nbd-server-stop'))
 
     if use_cbw:
         log(vm.qmp('qom-set', path=qom_path, property='drive', value=src_node))
@@ -214,24 +259,36 @@ def do_test(use_cbw, use_fleecing_filter, base_img_path,
     log('Done')
 
 
-def test(use_cbw, use_fleecing_filter, bitmap=False):
+def test(use_cbw, use_fleecing_filter,
+         nbd_sock_path=None, target_img_path=None, bitmap=False):
     with iotests.FilePath('base.img') as base_img_path, \
          iotests.FilePath('fleece.img') as fleece_img_path, \
-         iotests.FilePath('nbd.sock',
-                          base_dir=iotests.sock_dir) as nbd_sock_path, \
          iotests.VM() as vm:
-        do_test(use_cbw, use_fleecing_filter, base_img_path,
-                fleece_img_path, nbd_sock_path, vm, bitmap=bitmap)
+        do_test(vm, use_cbw, use_fleecing_filter, base_img_path,
+                fleece_img_path, nbd_sock_path, target_img_path,
+                bitmap=bitmap)
+
+def test_pull(use_cbw, use_fleecing_filter, bitmap=False):
+    with iotests.FilePath('nbd.sock',
+                          base_dir=iotests.sock_dir) as nbd_sock_path:
+        test(use_cbw, use_fleecing_filter, nbd_sock_path, None, bitmap=bitmap)
+
+def test_push():
+    with iotests.FilePath('target.img') as target_img_path:
+        test(True, True, None, target_img_path)
 
 
 log('=== Test backup(sync=none) based fleecing ===\n')
-test(False, False)
+test_pull(False, False)
 
 log('=== Test cbw-filter based fleecing ===\n')
-test(True, False)
+test_pull(True, False)
 
 log('=== Test fleecing-format based fleecing ===\n')
-test(True, True)
+test_pull(True, True)
 
 log('=== Test fleecing-format based fleecing with bitmap ===\n')
-test(True, True, bitmap=True)
+test_pull(True, True, bitmap=True)
+
+log('=== Test push backup with fleecing ===\n')
+test_push()
diff --git a/tests/qemu-iotests/tests/image-fleecing.out b/tests/qemu-iotests/tests/image-fleecing.out
index 62e1c1fe42..acfc89ff0e 100644
--- a/tests/qemu-iotests/tests/image-fleecing.out
+++ b/tests/qemu-iotests/tests/image-fleecing.out
@@ -293,3 +293,66 @@ read -P0xdc 32M 32k
 read -P0xcd 0x3ff0000 64k
 
 Done
+=== Test push backup with fleecing ===
+
+--- Setting up images ---
+
+Done
+
+--- Launching VM ---
+
+Done
+
+--- Setting up Fleecing Graph ---
+
+{"return": {}}
+{"return": {}}
+{"return": {}}
+{"return": {}}
+
+--- Starting actual backup ---
+
+{"return": {}}
+{"return": {}}
+
+--- Testing COW ---
+
+write -P0xab 0 64k
+{"return": ""}
+write -P0xad 0x00f8000 64k
+{"return": ""}
+write -P0x1d 0x2008000 64k
+{"return": ""}
+write -P0xea 0x3fe0000 64k
+{"return": ""}
+{"data": {"device": "push-backup", "len": 67108864, "offset": 67108864, "speed": 0, "type": "backup"}, "event": "BLOCK_JOB_COMPLETED", "timestamp": {"microseconds": "USECS", "seconds": "SECS"}}
+{"return": {}}
+
+--- Verifying Data ---
+
+read -P0x5d 0 64k
+read -P0xd5 1M 64k
+read -P0xdc 32M 64k
+read -P0xcd 0x3ff0000 64k
+read -P0 0x00f8000 32k
+read -P0 0x2010000 32k
+read -P0 0x3fe0000 64k
+
+--- Cleanup ---
+
+{"return": {}}
+{"return": {}}
+{"return": {}}
+{"return": {}}
+
+--- Confirming writes ---
+
+read -P0xab 0 64k
+read -P0xad 0x00f8000 64k
+read -P0x1d 0x2008000 64k
+read -P0xea 0x3fe0000 64k
+read -P0xd5 0x108000 32k
+read -P0xdc 32M 32k
+read -P0xcd 0x3ff0000 64k
+
+Done
-- 
2.29.2
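
For reference, the two QMP commands the push-backup branch of the test
issues can be written out as plain data. This is a hedged sketch: the
helper push_backup_commands() is hypothetical, the paths are placeholders,
and 'immutable-source' is the parameter proposed in patch 18 of this
series, not an option of any released QEMU version as far as this thread
shows.

```python
def push_backup_commands(export_node, target_img_path, imgfmt='qcow2',
                         filter_node_name=None):
    """Return the (command, arguments) pairs used to start a push backup."""
    if filter_node_name is not None:
        # Mirrors the check added to backup_job_create() in patch 18.
        raise ValueError('immutable-source and filter-node-name should '
                         'not be set simultaneously')
    return [
        ('blockdev-add', {
            'driver': imgfmt,
            'node-name': 'target',
            'file': {'driver': 'file', 'filename': target_img_path},
        }),
        ('blockdev-backup', {
            'device': export_node,
            'sync': 'full',
            'target': 'target',
            'immutable-source': True,
            'job-id': 'push-backup',
            # Throttle the job so that guest writes happen while it runs.
            'speed': 1,
        }),
    ]


commands = push_backup_commands('fl-fleecing', '/tmp/target.img')
for name, arguments in commands:
    print(name, sorted(arguments))
```

The test later sets the job speed to 0 (unlimited) and waits for
BLOCK_JOB_COMPLETED, so the guest writes in "Testing COW" land while the
backup is still in progress.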




* Re: [PATCH v2 10/19] block: introduce fleecing block driver
  2021-08-27 18:17 ` [PATCH v2 10/19] block: introduce fleecing block driver Vladimir Sementsov-Ogievskiy
@ 2021-09-01 11:44   ` Markus Armbruster
  2021-09-03 13:29     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 24+ messages in thread
From: Markus Armbruster @ 2021-09-01 11:44 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: kwolf, qemu-block, wencongyang2, xiechanglong.d, qemu-devel,
	jsnow, hreitz, eblake

Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> writes:

> Introduce a new driver, that works in pair with copy-before-write to
> improve fleecing.
>
> Without fleecing driver, old fleecing scheme looks as follows:
>
> [guest]
>   |
>   |root
>   v
> [copy-before-write] -----> [temp.qcow2] <--- [nbd export]
>   |                 target  |
>   |file                     |backing
>   v                         |
> [active disk] <-------------+
>
> With fleecing driver, new scheme is:
>
> [guest]
>   |
>   |root
>   v
> [copy-before-write] -----> [fleecing] <--- [nbd export]
>   |                 target  |    |
>   |file                     |    |file
>   v                         |    v
> [active disk]<--source------+  [temp.img]
>
> Benefits of new scheme:
>
> 1. Access control: if remote client try to read data that not covered
>    by original dirty bitmap used on copy-before-write open, client gets
>    -EACCES.
>
> 2. Discard support: if remote client do DISCARD, this additionally to
>    discarding data in temp.img informs block-copy process to not copy
>    these clusters. Next read from discarded area will return -EACCES.
>    This is significant thing: when fleecing user reads data that was
>    not yet copied to temp.img, we can avoid copying it on further guest
>    write.
>
> 3. Synchronisation between client reads and block-copy write is more
>    efficient: it doesn't block intersecting block-copy write during
>    client read.
>
> 4. We don't rely on backing feature: active disk should not be backing
>    of temp image, so we avoid some permission-related difficulties and
>    temp image now is not required to support backing, it may be simple
>    raw image.
>
> Note that now nobody calls fleecing_drv_activate(), so new driver is
> actually unusable. It's a work for the following patch: support
> fleecing block driver in copy-before-write filter driver.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>  qapi/block-core.json |  17 ++-
>  block/fleecing.h     |  16 +++
>  block/fleecing-drv.c | 260 +++++++++++++++++++++++++++++++++++++++++++
>  MAINTAINERS          |   1 +
>  block/meson.build    |   1 +
>  5 files changed, 294 insertions(+), 1 deletion(-)
>  create mode 100644 block/fleecing-drv.c
>
> diff --git a/qapi/block-core.json b/qapi/block-core.json
> index c42d23752d..8a333136f5 100644
> --- a/qapi/block-core.json
> +++ b/qapi/block-core.json
> @@ -2826,13 +2826,14 @@
>  # @blkreplay: Since 4.2
>  # @compress: Since 5.0
>  # @copy-before-write: Since 6.2
> +# @fleecing: Since 6.2
>  #
>  # Since: 2.9
>  ##
>  { 'enum': 'BlockdevDriver',
>    'data': [ 'blkdebug', 'blklogwrites', 'blkreplay', 'blkverify', 'bochs',
>              'cloop', 'compress', 'copy-before-write', 'copy-on-read', 'dmg',
> -            'file', 'ftp', 'ftps', 'gluster',
> +            'file', 'fleecing', 'ftp', 'ftps', 'gluster',
>              {'name': 'host_cdrom', 'if': 'defined(HAVE_HOST_BLOCK_DEVICE)' },
>              {'name': 'host_device', 'if': 'defined(HAVE_HOST_BLOCK_DEVICE)' },
>              'http', 'https', 'iscsi',
> @@ -4077,6 +4078,19 @@
>    'base': 'BlockdevOptionsGenericFormat',
>    'data': { 'target': 'BlockdevRef', '*bitmap': 'BlockDirtyBitmap' } }
>  
> +##
> +# @BlockdevOptionsFleecing:
> +#
> +# Driver that works in pair with copy-before-write to make fleecing scheme.

This is really terse.  Do we explain the driver's intended use anywhere?

I'd suggest s/to make fleecing scheme/to make a fleecing scheme/, except
it doesn't make much sense to me either way :)

> +#
> +# @source: source node of fleecing

We usually say "node name of ...".

> +#
> +# Since: 6.2
> +##
> +{ 'struct': 'BlockdevOptionsFleecing',
> +  'base': 'BlockdevOptionsGenericFormat',
> +  'data': { 'source': 'str' } }
> +
>  ##
>  # @BlockdevOptions:
>  #
> @@ -4133,6 +4147,7 @@
>        'copy-on-read':'BlockdevOptionsCor',
>        'dmg':        'BlockdevOptionsGenericFormat',
>        'file':       'BlockdevOptionsFile',
> +      'fleecing':   'BlockdevOptionsFleecing',
>        'ftp':        'BlockdevOptionsCurlFtp',
>        'ftps':       'BlockdevOptionsCurlFtps',
>        'gluster':    'BlockdevOptionsGluster',

[...]
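
As a possible starting point for documenting the intended use, the scheme
from the quoted commit message can be written down as a toy model. It is a
sketch under assumptions: FleecingModel and its method names are
hypothetical, clusters are tiny for readability, and the dirty-bitmap
access control is left out. It shows only copy-before-write on guest
writes and the DISCARD behavior from point 2.

```python
import errno


class FleecingModel:
    """Toy point-in-time view of a disk, kept by copy-before-write."""

    def __init__(self, disk, cluster_size):
        self.disk = bytearray(disk)   # the "active disk"
        self.cluster_size = cluster_size
        self.temp = {}                # cluster index -> saved old data
        self.discarded = set()        # clusters the client gave up

    def _clusters(self, offset, length):
        first = offset // self.cluster_size
        last = (offset + length - 1) // self.cluster_size
        return range(first, last + 1)

    def guest_write(self, offset, data):
        # Save the old contents before overwriting, unless they were
        # already saved or the client discarded that cluster.
        for c in self._clusters(offset, len(data)):
            if c not in self.temp and c not in self.discarded:
                start = c * self.cluster_size
                end = start + self.cluster_size
                self.temp[c] = bytes(self.disk[start:end])
        self.disk[offset:offset + len(data)] = data

    def client_read(self, offset, length):
        out = bytearray()
        for pos in range(offset, offset + length):
            c = pos // self.cluster_size
            if c in self.discarded:
                raise PermissionError(errno.EACCES, 'cluster was discarded')
            if c in self.temp:
                out.append(self.temp[c][pos % self.cluster_size])
            else:
                out.append(self.disk[pos])
        return bytes(out)

    def client_discard(self, cluster):
        # Drop any saved copy and stop copying this cluster in future.
        self.temp.pop(cluster, None)
        self.discarded.add(cluster)


model = FleecingModel(b'aaaabbbbcccc', cluster_size=4)
model.guest_write(4, b'XX')            # old 'bbbb' is saved first
snapshot = model.client_read(0, 12)    # still the original contents
model.client_discard(2)                # client is done with cluster 2
model.guest_write(8, b'Z')             # no copy-before-write needed now
```

In this model the client keeps seeing the original data after guest
writes, and a discarded cluster is neither copied on later guest writes
nor readable by the client afterwards.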




* Re: [PATCH v2 18/19] qapi: backup: add immutable-source parameter
  2021-08-27 18:18 ` [PATCH v2 18/19] qapi: backup: add immutable-source parameter Vladimir Sementsov-Ogievskiy
@ 2021-09-01 11:47   ` Markus Armbruster
  0 siblings, 0 replies; 24+ messages in thread
From: Markus Armbruster @ 2021-09-01 11:47 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: kwolf, qemu-block, wencongyang2, xiechanglong.d, qemu-devel,
	jsnow, hreitz, eblake

Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> writes:

> We are on the way to implement internal-backup with fleecing scheme,
> which includes backup job copying from fleecing block driver node
> (which is target of copy-before-write filter) to final target of
> backup. This job doesn't need own filter, as fleecing block driver node
> is a kind of snapshot, it's immutable from reader point of view.
>
> Let's add a parameter for backup to not insert filter but instead
> unshare writes on source. This way backup job becomes a simple copying
> process.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>  qapi/block-core.json      | 12 +++++++-
>  include/block/block_int.h |  1 +
>  block/backup.c            | 61 +++++++++++++++++++++++++++++++++++----
>  block/replication.c       |  2 +-
>  blockdev.c                |  1 +
>  5 files changed, 70 insertions(+), 7 deletions(-)
>
> diff --git a/qapi/block-core.json b/qapi/block-core.json
> index 8a333136f5..995ca16a5e 100644
> --- a/qapi/block-core.json
> +++ b/qapi/block-core.json
> @@ -1391,6 +1391,15 @@
>  #                    above node specified by @drive. If this option is not given,
>  #                    a node name is autogenerated. (Since: 4.2)
>  #
> +# @immutable-source: If true, assume source is immutable and don't insert filter

Suggest comma after immutable.

> +#                    as no copy-before-write operations are needed. It will
> +#                    fail if there are existing writers on source node, as well,
> +#                    any attempt to add writer to source node during backup will
> +#                    fail. @filter-node-name must not be set.

Suggest splitting the sentence, like "... fail if there are existing
writers on source node.  Any attempt ... will also fail."

> +#                    If false, insert copy-before-write filter above source node
> +#                    (see also @filter-node-name parameter).
> +#                    Default is false. (Since 6.2)
> +#
>  # @x-perf: Performance options. (Since 6.0)
>  #
>  # Note: @on-source-error and @on-target-error only affect background
> @@ -1407,7 +1416,8 @@
>              '*on-source-error': 'BlockdevOnError',
>              '*on-target-error': 'BlockdevOnError',
>              '*auto-finalize': 'bool', '*auto-dismiss': 'bool',
> -            '*filter-node-name': 'str', '*x-perf': 'BackupPerf'  } }
> +            '*filter-node-name': 'str', '*immutable-source': 'bool',
> +            '*x-perf': 'BackupPerf'  } }
>  
>  ##
>  # @DriveBackup:

[...]
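
To make the documented constraint concrete, the following sketch builds blockdev-backup arguments with the proposed @immutable-source flag and rejects the combination the doc comment forbids (@filter-node-name together with @immutable-source). The helper name and node names are hypothetical; in the patch the check would be done by QEMU itself, and here it is only mirrored on the client side for illustration.

```python
def blockdev_backup_command(device, target, immutable_source=False,
                            filter_node_name=None):
    """Build a blockdev-backup QMP command dict (illustrative only)."""
    # The proposed documentation says @filter-node-name must not be set
    # when @immutable-source is true, since no filter is inserted.
    if immutable_source and filter_node_name is not None:
        raise ValueError(
            "filter-node-name must not be set when immutable-source is true")

    arguments = {"device": device, "target": target, "sync": "full"}
    if immutable_source:
        arguments["immutable-source"] = True
    if filter_node_name is not None:
        arguments["filter-node-name"] = filter_node_name
    return {"execute": "blockdev-backup", "arguments": arguments}


if __name__ == "__main__":
    # 'fleecing0' and 'backup-target0' are made-up node names.
    print(blockdev_backup_command("fleecing0", "backup-target0",
                                  immutable_source=True))
```

With the flag left at its default of false, the command is unchanged from today's behaviour and the copy-before-write filter is inserted above the source as before.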




* Re: [PATCH v2 10/19] block: introduce fleecing block driver
  2021-09-01 11:44   ` Markus Armbruster
@ 2021-09-03 13:29     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 24+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-09-03 13:29 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: qemu-block, qemu-devel, xiechanglong.d, wencongyang2, eblake,
	hreitz, kwolf, jsnow

01.09.2021 14:44, Markus Armbruster wrote:
> Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> writes:
> 
>> Introduce a new driver, that works in pair with copy-before-write to
>> improve fleecing.
>>
>> Without fleecing driver, old fleecing scheme looks as follows:
>>
>> [guest]
>>    |
>>    |root
>>    v
>> [copy-before-write] -----> [temp.qcow2] <--- [nbd export]
>>    |                 target  |
>>    |file                     |backing
>>    v                         |
>> [active disk] <-------------+
>>
>> With fleecing driver, new scheme is:
>>
>> [guest]
>>    |
>>    |root
>>    v
>> [copy-before-write] -----> [fleecing] <--- [nbd export]
>>    |                 target  |    |
>>    |file                     |    |file
>>    v                         |    v
>> [active disk]<--source------+  [temp.img]
>>
>> Benefits of new scheme:
>>
>> 1. Access control: if remote client try to read data that not covered
>>     by original dirty bitmap used on copy-before-write open, client gets
>>     -EACCES.
>>
>> 2. Discard support: if remote client do DISCARD, this additionally to
>>     discarding data in temp.img informs block-copy process to not copy
>>     these clusters. Next read from discarded area will return -EACCES.
>>     This is significant thing: when fleecing user reads data that was
>>     not yet copied to temp.img, we can avoid copying it on further guest
>>     write.
>>
>> 3. Synchronisation between client reads and block-copy write is more
>>     efficient: it doesn't block intersecting block-copy write during
>>     client read.
>>
>> 4. We don't rely on backing feature: active disk should not be backing
>>     of temp image, so we avoid some permission-related difficulties and
>>     temp image now is not required to support backing, it may be simple
>>     raw image.
>>
>> Note that now nobody calls fleecing_drv_activate(), so new driver is
>> actually unusable. It's a work for the following patch: support
>> fleecing block driver in copy-before-write filter driver.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> ---
>>   qapi/block-core.json |  17 ++-
>>   block/fleecing.h     |  16 +++
>>   block/fleecing-drv.c | 260 +++++++++++++++++++++++++++++++++++++++++++
>>   MAINTAINERS          |   1 +
>>   block/meson.build    |   1 +
>>   5 files changed, 294 insertions(+), 1 deletion(-)
>>   create mode 100644 block/fleecing-drv.c
>>
>> diff --git a/qapi/block-core.json b/qapi/block-core.json
>> index c42d23752d..8a333136f5 100644
>> --- a/qapi/block-core.json
>> +++ b/qapi/block-core.json
>> @@ -2826,13 +2826,14 @@
>>   # @blkreplay: Since 4.2
>>   # @compress: Since 5.0
>>   # @copy-before-write: Since 6.2
>> +# @fleecing: Since 6.2
>>   #
>>   # Since: 2.9
>>   ##
>>   { 'enum': 'BlockdevDriver',
>>     'data': [ 'blkdebug', 'blklogwrites', 'blkreplay', 'blkverify', 'bochs',
>>               'cloop', 'compress', 'copy-before-write', 'copy-on-read', 'dmg',
>> -            'file', 'ftp', 'ftps', 'gluster',
>> +            'file', 'fleecing', 'ftp', 'ftps', 'gluster',
>>               {'name': 'host_cdrom', 'if': 'defined(HAVE_HOST_BLOCK_DEVICE)' },
>>               {'name': 'host_device', 'if': 'defined(HAVE_HOST_BLOCK_DEVICE)' },
>>               'http', 'https', 'iscsi',
>> @@ -4077,6 +4078,19 @@
>>     'base': 'BlockdevOptionsGenericFormat',
>>     'data': { 'target': 'BlockdevRef', '*bitmap': 'BlockDirtyBitmap' } }
>>   
>> +##
>> +# @BlockdevOptionsFleecing:
>> +#
>> +# Driver that works in pair with copy-before-write to make fleecing scheme.
> 
> This is really terse.  Do we explain the driver's intended use anywhere?

Hmm. I can duplicate the ASCII art from the commit message here, together with some
explanation.
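
As a rough illustration of the access-control and discard behaviour listed in the quoted commit message (benefits 1 and 2), here is a toy, cluster-granular model. It is not the driver's implementation: the class name, the dict-based "images" and the method names are all assumptions made for the sketch, and request synchronisation (benefit 3) is not modelled.

```python
import errno


class FleecingModel:
    """Toy model of the fleecing scheme at cluster granularity."""

    def __init__(self, source, bitmap):
        self.source = source        # active disk: cluster index -> data
        self.temp = {}              # temp image filled by copy-before-write
        self.allowed = set(bitmap)  # clusters the client may still read

    def guest_write(self, cluster, data):
        # Copy-before-write: save the old data first, but only for
        # clusters a client may still want to read.
        if cluster in self.allowed and cluster not in self.temp:
            self.temp[cluster] = self.source[cluster]
        self.source[cluster] = data

    def client_read(self, cluster):
        # Reads outside the bitmap, or of discarded clusters, fail
        # with EACCES.
        if cluster not in self.allowed:
            raise PermissionError(errno.EACCES, "cluster not available")
        # Prefer the saved copy; otherwise the source is still unchanged.
        return self.temp.get(cluster, self.source[cluster])

    def client_discard(self, cluster):
        # Drop any saved copy and stop tracking the cluster, so later
        # guest writes no longer trigger a copy for it.
        self.temp.pop(cluster, None)
        self.allowed.discard(cluster)
```

In this model a client that reads a cluster and then discards it spares the guest a copy on its next write to that cluster, which is the saving the commit message calls significant.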

> 
> I'd suggest s/to make fleecing scheme/to make a fleecing scheme/, except
> it doesn't make much sense to me either way :)
> 
>> +#
>> +# @source: source node of fleecing
> 
> We usually say "node name of ...".

OK

> 
>> +#
>> +# Since: 6.2
>> +##
>> +{ 'struct': 'BlockdevOptionsFleecing',
>> +  'base': 'BlockdevOptionsGenericFormat',
>> +  'data': { 'source': 'str' } }
>> +
>>   ##
>>   # @BlockdevOptions:
>>   #
>> @@ -4133,6 +4147,7 @@
>>         'copy-on-read':'BlockdevOptionsCor',
>>         'dmg':        'BlockdevOptionsGenericFormat',
>>         'file':       'BlockdevOptionsFile',
>> +      'fleecing':   'BlockdevOptionsFleecing',
>>         'ftp':        'BlockdevOptionsCurlFtp',
>>         'ftps':       'BlockdevOptionsCurlFtps',
>>         'gluster':    'BlockdevOptionsGluster',
> 
> [...]
> 


-- 
Best regards,
Vladimir



* Re: [PATCH v2 00/19] Make image fleecing more usable
  2021-08-27 18:17 [PATCH v2 00/19] Make image fleecing more usable Vladimir Sementsov-Ogievskiy
                   ` (18 preceding siblings ...)
  2021-08-27 18:18 ` [PATCH v2 19/19] iotests/image-fleecing: test push backup with fleecing Vladimir Sementsov-Ogievskiy
@ 2021-09-22  7:45 ` Vladimir Sementsov-Ogievskiy
  19 siblings, 0 replies; 24+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-09-22  7:45 UTC (permalink / raw)
  To: qemu-block
  Cc: qemu-devel, armbru, xiechanglong.d, wencongyang2, eblake, hreitz,
	kwolf, jsnow

ping )

27.08.2021 21:17, Vladimir Sementsov-Ogievskiy wrote:
> Hi all!
> 
> That continues "[PATCH RFC DRAFT 00/11] Make image fleecing more usable"
> and supersedes "[PATCH v2 for-6.2 0/6] push backup with fleecing"
> 
> Supersedes: <20210804131750.127574-1-vsementsov@virtuozzo.com>
> Supersedes: <20210721140424.163701-1-vsementsov@virtuozzo.com>
> 
> There several improvements to fleecing scheme:
> 
> 1. support bitmap in copy-before-write filter
> 
> 2. introduce fleecing block driver, which opens the door for a lot of
>     image fleecing improvements.
>     See "block: introduce fleecing block driver" commit message for
>     details.
> 
> 3. support "push backup with fleecing" scheme, when backup job is a
>     client of common fleecing scheme. That helps when writes to final
>     backup target are slow and we don't want guest writes hang waiting
>     for copy-before-write operations to final target.
> 
> Vladimir Sementsov-Ogievskiy (19):
>    block/block-copy: move copy_bitmap initialization to
>      block_copy_state_new()
>    block/dirty-bitmap: bdrv_merge_dirty_bitmap(): add return value
>    block/block-copy: block_copy_state_new(): add bitmap parameter
>    block/copy-before-write: add bitmap open parameter
>    block/block-copy: add block_copy_reset()
>    block: intoduce reqlist
>    block/dirty-bitmap: introduce bdrv_dirty_bitmap_status()
>    block/reqlist: add reqlist_wait_all()
>    block: introduce FleecingState class
>    block: introduce fleecing block driver
>    block/copy-before-write: support fleecing block driver
>    block/block-copy: add write-unchanged mode
>    block/copy-before-write: use write-unchanged in fleecing mode
>    iotests/image-fleecing: add test-case for fleecing format node
>    iotests.py: add qemu_io_pipe_and_status()
>    iotests/image-fleecing: add test case with bitmap
>    block: blk_root(): return non-const pointer
>    qapi: backup: add immutable-source parameter
>    iotests/image-fleecing: test push backup with fleecing
> 
>   qapi/block-core.json                        |  39 ++-
>   block/fleecing.h                            | 151 ++++++++++++
>   include/block/block-copy.h                  |   4 +-
>   include/block/block_int.h                   |   1 +
>   include/block/dirty-bitmap.h                |   4 +-
>   include/block/reqlist.h                     |  75 ++++++
>   include/qemu/hbitmap.h                      |  11 +
>   include/sysemu/block-backend.h              |   2 +-
>   block/backup.c                              |  61 ++++-
>   block/block-backend.c                       |   2 +-
>   block/block-copy.c                          | 157 +++++-------
>   block/copy-before-write.c                   |  70 +++++-
>   block/dirty-bitmap.c                        |  15 +-
>   block/fleecing-drv.c                        | 260 ++++++++++++++++++++
>   block/fleecing.c                            | 182 ++++++++++++++
>   block/monitor/bitmap-qmp-cmds.c             |   5 +-
>   block/replication.c                         |   2 +-
>   block/reqlist.c                             |  84 +++++++
>   blockdev.c                                  |   1 +
>   util/hbitmap.c                              |  36 +++
>   MAINTAINERS                                 |   7 +-
>   block/meson.build                           |   3 +
>   tests/qemu-iotests/iotests.py               |   4 +
>   tests/qemu-iotests/tests/image-fleecing     | 178 +++++++++++---
>   tests/qemu-iotests/tests/image-fleecing.out | 221 ++++++++++++++++-
>   25 files changed, 1420 insertions(+), 155 deletions(-)
>   create mode 100644 block/fleecing.h
>   create mode 100644 include/block/reqlist.h
>   create mode 100644 block/fleecing-drv.c
>   create mode 100644 block/fleecing.c
>   create mode 100644 block/reqlist.c
> 


-- 
Best regards,
Vladimir



Thread overview: 24+ messages
2021-08-27 18:17 [PATCH v2 00/19] Make image fleecing more usable Vladimir Sementsov-Ogievskiy
2021-08-27 18:17 ` [PATCH v2 01/19] block/block-copy: move copy_bitmap initialization to block_copy_state_new() Vladimir Sementsov-Ogievskiy
2021-08-27 18:17 ` [PATCH v2 02/19] block/dirty-bitmap: bdrv_merge_dirty_bitmap(): add return value Vladimir Sementsov-Ogievskiy
2021-08-27 18:17 ` [PATCH v2 03/19] block/block-copy: block_copy_state_new(): add bitmap parameter Vladimir Sementsov-Ogievskiy
2021-08-27 18:17 ` [PATCH v2 04/19] block/copy-before-write: add bitmap open parameter Vladimir Sementsov-Ogievskiy
2021-08-27 18:17 ` [PATCH v2 05/19] block/block-copy: add block_copy_reset() Vladimir Sementsov-Ogievskiy
2021-08-27 18:17 ` [PATCH v2 06/19] block: intoduce reqlist Vladimir Sementsov-Ogievskiy
2021-08-27 18:17 ` [PATCH v2 07/19] block/dirty-bitmap: introduce bdrv_dirty_bitmap_status() Vladimir Sementsov-Ogievskiy
2021-08-27 18:17 ` [PATCH v2 08/19] block/reqlist: add reqlist_wait_all() Vladimir Sementsov-Ogievskiy
2021-08-27 18:17 ` [PATCH v2 09/19] block: introduce FleecingState class Vladimir Sementsov-Ogievskiy
2021-08-27 18:17 ` [PATCH v2 10/19] block: introduce fleecing block driver Vladimir Sementsov-Ogievskiy
2021-09-01 11:44   ` Markus Armbruster
2021-09-03 13:29     ` Vladimir Sementsov-Ogievskiy
2021-08-27 18:18 ` [PATCH v2 11/19] block/copy-before-write: support " Vladimir Sementsov-Ogievskiy
2021-08-27 18:18 ` [PATCH v2 12/19] block/block-copy: add write-unchanged mode Vladimir Sementsov-Ogievskiy
2021-08-27 18:18 ` [PATCH v2 13/19] block/copy-before-write: use write-unchanged in fleecing mode Vladimir Sementsov-Ogievskiy
2021-08-27 18:18 ` [PATCH v2 14/19] iotests/image-fleecing: add test-case for fleecing format node Vladimir Sementsov-Ogievskiy
2021-08-27 18:18 ` [PATCH v2 15/19] iotests.py: add qemu_io_pipe_and_status() Vladimir Sementsov-Ogievskiy
2021-08-27 18:18 ` [PATCH v2 16/19] iotests/image-fleecing: add test case with bitmap Vladimir Sementsov-Ogievskiy
2021-08-27 18:18 ` [PATCH v2 17/19] block: blk_root(): return non-const pointer Vladimir Sementsov-Ogievskiy
2021-08-27 18:18 ` [PATCH v2 18/19] qapi: backup: add immutable-source parameter Vladimir Sementsov-Ogievskiy
2021-09-01 11:47   ` Markus Armbruster
2021-08-27 18:18 ` [PATCH v2 19/19] iotests/image-fleecing: test push backup with fleecing Vladimir Sementsov-Ogievskiy
2021-09-22  7:45 ` [PATCH v2 00/19] Make image fleecing more usable Vladimir Sementsov-Ogievskiy
