All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v14 00/13] Apply COR-filter to the block-stream permanently
@ 2020-12-04 22:07 Vladimir Sementsov-Ogievskiy
  2020-12-04 22:07 ` [PATCH v14 01/13] copy-on-read: support preadv/pwritev_part functions Vladimir Sementsov-Ogievskiy
                   ` (12 more replies)
  0 siblings, 13 replies; 37+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-12-04 22:07 UTC (permalink / raw)
  To: qemu-block
  Cc: qemu-devel, eblake, armbru, jsnow, fam, stefanha, mreitz, kwolf,
	den, andrey.shinkevich, vsementsov

Hi all!

I decided to post v14 myself, to show how to keep the test with parallel
stream jobs.

So, main addition in v14 is "bottom" argument for stream job. Next week
I'll send a follow-up with deprecation for old "base" API.

Also, I already finished my work on updating permissions, so that we
don't need ".active"-like things to add/drop filters, still, as v14 is
a lot bigger than v2, and I believe this this v14 is closer to be
merged, so I'd better resend my
  "[PATCH v2 00/36] block: update graph permissions update", basing on
this v14 (and reworking filter drop/remove).

05: fix bdrv_is_allocated_above() usage
09: merge change from further commit which uses unfiltered_bs for bdrv_find_backing_image
10: new
11: new
12: new, splitted from last commit to simplify it a bit
13: - pass bottom option to filter always (who know, may be base will appear during the job)
    - keep test test_stream_parallel in 30 iotest
    - rework changes in 245 iotest

Andrey Shinkevich (10):
  copy-on-read: support preadv/pwritev_part functions
  block: add API function to insert a node
  copy-on-read: add filter drop function
  qapi: add filter-node-name to block-stream
  qapi: create BlockdevOptionsCor structure for COR driver
  iotests: add #310 to test bottom node in COR driver
  block: include supported_read_flags into BDS structure
  copy-on-read: skip non-guest reads if no copy needed
  stream: skip filters when writing backing file name to QCOW2 header
  block: apply COR-filter to block-stream jobs

Vladimir Sementsov-Ogievskiy (3):
  qapi: block-stream: add "bottom" argument
  iotests: 30: prepare to COR filter insertion by stream job
  block/stream: add s->target_bs

 qapi/block-core.json           |  35 ++++++-
 block/copy-on-read.h           |  32 +++++++
 include/block/block.h          |  10 +-
 include/block/block_int.h      |  12 ++-
 block.c                        |  25 +++++
 block/copy-on-read.c           | 143 ++++++++++++++++++++++++---
 block/io.c                     |  12 ++-
 block/monitor/block-hmp-cmds.c |   7 +-
 block/stream.c                 | 170 +++++++++++++++++++++++----------
 blockdev.c                     |  71 ++++++++++----
 tests/qemu-iotests/030         |  12 ++-
 tests/qemu-iotests/141.out     |   2 +-
 tests/qemu-iotests/245         |  20 ++--
 tests/qemu-iotests/310         | 114 ++++++++++++++++++++++
 tests/qemu-iotests/310.out     |  15 +++
 tests/qemu-iotests/group       |   1 +
 16 files changed, 574 insertions(+), 107 deletions(-)
 create mode 100644 block/copy-on-read.h
 create mode 100755 tests/qemu-iotests/310
 create mode 100644 tests/qemu-iotests/310.out

-- 
2.21.3



^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH v14 01/13] copy-on-read: support preadv/pwritev_part functions
  2020-12-04 22:07 [PATCH v14 00/13] Apply COR-filter to the block-stream permanently Vladimir Sementsov-Ogievskiy
@ 2020-12-04 22:07 ` Vladimir Sementsov-Ogievskiy
  2020-12-04 22:07 ` [PATCH v14 02/13] block: add API function to insert a node Vladimir Sementsov-Ogievskiy
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 37+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-12-04 22:07 UTC (permalink / raw)
  To: qemu-block
  Cc: qemu-devel, eblake, armbru, jsnow, fam, stefanha, mreitz, kwolf,
	den, andrey.shinkevich, vsementsov

From: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>

Add support for the recently introduced functions
bdrv_co_preadv_part()
and
bdrv_co_pwritev_part()
to the COR-filter driver.

Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/copy-on-read.c | 28 ++++++++++++++++------------
 1 file changed, 16 insertions(+), 12 deletions(-)

diff --git a/block/copy-on-read.c b/block/copy-on-read.c
index 2816e61afe..cb03e0f2d3 100644
--- a/block/copy-on-read.c
+++ b/block/copy-on-read.c
@@ -74,21 +74,25 @@ static int64_t cor_getlength(BlockDriverState *bs)
 }
 
 
-static int coroutine_fn cor_co_preadv(BlockDriverState *bs,
-                                      uint64_t offset, uint64_t bytes,
-                                      QEMUIOVector *qiov, int flags)
+static int coroutine_fn cor_co_preadv_part(BlockDriverState *bs,
+                                           uint64_t offset, uint64_t bytes,
+                                           QEMUIOVector *qiov,
+                                           size_t qiov_offset,
+                                           int flags)
 {
-    return bdrv_co_preadv(bs->file, offset, bytes, qiov,
-                          flags | BDRV_REQ_COPY_ON_READ);
+    return bdrv_co_preadv_part(bs->file, offset, bytes, qiov, qiov_offset,
+                               flags | BDRV_REQ_COPY_ON_READ);
 }
 
 
-static int coroutine_fn cor_co_pwritev(BlockDriverState *bs,
-                                       uint64_t offset, uint64_t bytes,
-                                       QEMUIOVector *qiov, int flags)
+static int coroutine_fn cor_co_pwritev_part(BlockDriverState *bs,
+                                            uint64_t offset,
+                                            uint64_t bytes,
+                                            QEMUIOVector *qiov,
+                                            size_t qiov_offset, int flags)
 {
-
-    return bdrv_co_pwritev(bs->file, offset, bytes, qiov, flags);
+    return bdrv_co_pwritev_part(bs->file, offset, bytes, qiov, qiov_offset,
+                                flags);
 }
 
 
@@ -137,8 +141,8 @@ static BlockDriver bdrv_copy_on_read = {
 
     .bdrv_getlength                     = cor_getlength,
 
-    .bdrv_co_preadv                     = cor_co_preadv,
-    .bdrv_co_pwritev                    = cor_co_pwritev,
+    .bdrv_co_preadv_part                = cor_co_preadv_part,
+    .bdrv_co_pwritev_part               = cor_co_pwritev_part,
     .bdrv_co_pwrite_zeroes              = cor_co_pwrite_zeroes,
     .bdrv_co_pdiscard                   = cor_co_pdiscard,
     .bdrv_co_pwritev_compressed         = cor_co_pwritev_compressed,
-- 
2.21.3



^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v14 02/13] block: add API function to insert a node
  2020-12-04 22:07 [PATCH v14 00/13] Apply COR-filter to the block-stream permanently Vladimir Sementsov-Ogievskiy
  2020-12-04 22:07 ` [PATCH v14 01/13] copy-on-read: support preadv/pwritev_part functions Vladimir Sementsov-Ogievskiy
@ 2020-12-04 22:07 ` Vladimir Sementsov-Ogievskiy
  2020-12-10 17:33   ` Max Reitz
  2020-12-04 22:07 ` [PATCH v14 03/13] copy-on-read: add filter drop function Vladimir Sementsov-Ogievskiy
                   ` (10 subsequent siblings)
  12 siblings, 1 reply; 37+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-12-04 22:07 UTC (permalink / raw)
  To: qemu-block
  Cc: qemu-devel, eblake, armbru, jsnow, fam, stefanha, mreitz, kwolf,
	den, andrey.shinkevich, vsementsov

From: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>

Provide API for insertion a node to backing chain.

Suggested-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 include/block/block.h |  2 ++
 block.c               | 25 +++++++++++++++++++++++++
 2 files changed, 27 insertions(+)

diff --git a/include/block/block.h b/include/block/block.h
index c9d7c58765..81a3894129 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -350,6 +350,8 @@ void bdrv_append(BlockDriverState *bs_new, BlockDriverState *bs_top,
                  Error **errp);
 void bdrv_replace_node(BlockDriverState *from, BlockDriverState *to,
                        Error **errp);
+BlockDriverState *bdrv_insert_node(BlockDriverState *bs, QDict *node_options,
+                                   int flags, Error **errp);
 
 int bdrv_parse_aio(const char *mode, int *flags);
 int bdrv_parse_cache_mode(const char *mode, int *flags, bool *writethrough);
diff --git a/block.c b/block.c
index f1cedac362..b71c39f3e6 100644
--- a/block.c
+++ b/block.c
@@ -4698,6 +4698,31 @@ static void bdrv_delete(BlockDriverState *bs)
     g_free(bs);
 }
 
+BlockDriverState *bdrv_insert_node(BlockDriverState *bs, QDict *node_options,
+                                   int flags, Error **errp)
+{
+    BlockDriverState *new_node_bs;
+    Error *local_err = NULL;
+
+    new_node_bs =  bdrv_open(NULL, NULL, node_options, flags, errp);
+    if (new_node_bs == NULL) {
+        error_prepend(errp, "Could not create node: ");
+        return NULL;
+    }
+
+    bdrv_drained_begin(bs);
+    bdrv_replace_node(bs, new_node_bs, &local_err);
+    bdrv_drained_end(bs);
+
+    if (local_err) {
+        bdrv_unref(new_node_bs);
+        error_propagate(errp, local_err);
+        return NULL;
+    }
+
+    return new_node_bs;
+}
+
 /*
  * Run consistency checks on an image
  *
-- 
2.21.3



^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v14 03/13] copy-on-read: add filter drop function
  2020-12-04 22:07 [PATCH v14 00/13] Apply COR-filter to the block-stream permanently Vladimir Sementsov-Ogievskiy
  2020-12-04 22:07 ` [PATCH v14 01/13] copy-on-read: support preadv/pwritev_part functions Vladimir Sementsov-Ogievskiy
  2020-12-04 22:07 ` [PATCH v14 02/13] block: add API function to insert a node Vladimir Sementsov-Ogievskiy
@ 2020-12-04 22:07 ` Vladimir Sementsov-Ogievskiy
  2020-12-10 17:34   ` Max Reitz
  2020-12-04 22:07 ` [PATCH v14 04/13] qapi: add filter-node-name to block-stream Vladimir Sementsov-Ogievskiy
                   ` (9 subsequent siblings)
  12 siblings, 1 reply; 37+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-12-04 22:07 UTC (permalink / raw)
  To: qemu-block
  Cc: qemu-devel, eblake, armbru, jsnow, fam, stefanha, mreitz, kwolf,
	den, andrey.shinkevich, vsementsov

From: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>

Provide API for the COR-filter removal. Also, drop the filter child
permissions for an inactive state when the filter node is being
removed.
To insert the filter, the block generic layer function
bdrv_insert_node() can be used.
The new function bdrv_cor_filter_drop() may be considered as an
intermediate solution before the QEMU permission update system has
overhauled. Then we are able to implement the API function
bdrv_remove_node() on the block generic layer.

Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/copy-on-read.h | 32 +++++++++++++++++++++++++
 block/copy-on-read.c | 56 ++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 88 insertions(+)
 create mode 100644 block/copy-on-read.h

diff --git a/block/copy-on-read.h b/block/copy-on-read.h
new file mode 100644
index 0000000000..7bf405dccd
--- /dev/null
+++ b/block/copy-on-read.h
@@ -0,0 +1,32 @@
+/*
+ * Copy-on-read filter block driver
+ *
+ * The filter driver performs Copy-On-Read (COR) operations
+ *
+ * Copyright (c) 2018-2020 Virtuozzo International GmbH.
+ *
+ * Author:
+ *   Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef BLOCK_COPY_ON_READ
+#define BLOCK_COPY_ON_READ
+
+#include "block/block_int.h"
+
+void bdrv_cor_filter_drop(BlockDriverState *cor_filter_bs);
+
+#endif /* BLOCK_COPY_ON_READ */
diff --git a/block/copy-on-read.c b/block/copy-on-read.c
index cb03e0f2d3..618c4c4f43 100644
--- a/block/copy-on-read.c
+++ b/block/copy-on-read.c
@@ -23,11 +23,20 @@
 #include "qemu/osdep.h"
 #include "block/block_int.h"
 #include "qemu/module.h"
+#include "qapi/error.h"
+#include "block/copy-on-read.h"
+
+
+typedef struct BDRVStateCOR {
+    bool active;
+} BDRVStateCOR;
 
 
 static int cor_open(BlockDriverState *bs, QDict *options, int flags,
                     Error **errp)
 {
+    BDRVStateCOR *state = bs->opaque;
+
     bs->file = bdrv_open_child(NULL, options, "file", bs, &child_of_bds,
                                BDRV_CHILD_FILTERED | BDRV_CHILD_PRIMARY,
                                false, errp);
@@ -42,6 +51,13 @@ static int cor_open(BlockDriverState *bs, QDict *options, int flags,
         ((BDRV_REQ_FUA | BDRV_REQ_MAY_UNMAP | BDRV_REQ_NO_FALLBACK) &
             bs->file->bs->supported_zero_flags);
 
+    state->active = true;
+
+    /*
+     * We don't need to call bdrv_child_refresh_perms() now as the permissions
+     * will be updated later when the filter node gets its parent.
+     */
+
     return 0;
 }
 
@@ -57,6 +73,17 @@ static void cor_child_perm(BlockDriverState *bs, BdrvChild *c,
                            uint64_t perm, uint64_t shared,
                            uint64_t *nperm, uint64_t *nshared)
 {
+    BDRVStateCOR *s = bs->opaque;
+
+    if (!s->active) {
+        /*
+         * While the filter is being removed
+         */
+        *nperm = 0;
+        *nshared = BLK_PERM_ALL;
+        return;
+    }
+
     *nperm = perm & PERM_PASSTHROUGH;
     *nshared = (shared & PERM_PASSTHROUGH) | PERM_UNCHANGED;
 
@@ -135,6 +162,7 @@ static void cor_lock_medium(BlockDriverState *bs, bool locked)
 
 static BlockDriver bdrv_copy_on_read = {
     .format_name                        = "copy-on-read",
+    .instance_size                      = sizeof(BDRVStateCOR),
 
     .bdrv_open                          = cor_open,
     .bdrv_child_perm                    = cor_child_perm,
@@ -154,6 +182,34 @@ static BlockDriver bdrv_copy_on_read = {
     .is_filter                          = true,
 };
 
+
+void bdrv_cor_filter_drop(BlockDriverState *cor_filter_bs)
+{
+    BdrvChild *child;
+    BlockDriverState *bs;
+    BDRVStateCOR *s = cor_filter_bs->opaque;
+
+    child = bdrv_filter_child(cor_filter_bs);
+    if (!child) {
+        return;
+    }
+    bs = child->bs;
+
+    /* Retain the BDS until we complete the graph change. */
+    bdrv_ref(bs);
+    /* Hold a guest back from writing while permissions are being reset. */
+    bdrv_drained_begin(bs);
+    /* Drop permissions before the graph change. */
+    s->active = false;
+    bdrv_child_refresh_perms(cor_filter_bs, child, &error_abort);
+    bdrv_replace_node(cor_filter_bs, bs, &error_abort);
+
+    bdrv_drained_end(bs);
+    bdrv_unref(bs);
+    bdrv_unref(cor_filter_bs);
+}
+
+
 static void bdrv_copy_on_read_init(void)
 {
     bdrv_register(&bdrv_copy_on_read);
-- 
2.21.3



^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v14 04/13] qapi: add filter-node-name to block-stream
  2020-12-04 22:07 [PATCH v14 00/13] Apply COR-filter to the block-stream permanently Vladimir Sementsov-Ogievskiy
                   ` (2 preceding siblings ...)
  2020-12-04 22:07 ` [PATCH v14 03/13] copy-on-read: add filter drop function Vladimir Sementsov-Ogievskiy
@ 2020-12-04 22:07 ` Vladimir Sementsov-Ogievskiy
  2020-12-10 17:37   ` Max Reitz
  2020-12-04 22:07 ` [PATCH v14 05/13] qapi: create BlockdevOptionsCor structure for COR driver Vladimir Sementsov-Ogievskiy
                   ` (8 subsequent siblings)
  12 siblings, 1 reply; 37+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-12-04 22:07 UTC (permalink / raw)
  To: qemu-block
  Cc: qemu-devel, eblake, armbru, jsnow, fam, stefanha, mreitz, kwolf,
	den, andrey.shinkevich, vsementsov

From: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>

Provide the possibility to pass the 'filter-node-name' parameter to the
block-stream job as it is done for the commit block job.

Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 qapi/block-core.json           | 6 ++++++
 include/block/block_int.h      | 7 ++++++-
 block/monitor/block-hmp-cmds.c | 4 ++--
 block/stream.c                 | 4 +++-
 blockdev.c                     | 4 +++-
 5 files changed, 20 insertions(+), 5 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 04ad80bc1e..8ef3df6767 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -2543,6 +2543,11 @@
 #            'stop' and 'enospc' can only be used if the block device
 #            supports io-status (see BlockInfo).  Since 1.3.
 #
+# @filter-node-name: the node name that should be assigned to the
+#                    filter driver that the stream job inserts into the graph
+#                    above @device. If this option is not given, a node name is
+#                    autogenerated. (Since: 5.2)
+#
 # @auto-finalize: When false, this job will wait in a PENDING state after it has
 #                 finished its work, waiting for @block-job-finalize before
 #                 making any block graph changes.
@@ -2573,6 +2578,7 @@
   'data': { '*job-id': 'str', 'device': 'str', '*base': 'str',
             '*base-node': 'str', '*backing-file': 'str', '*speed': 'int',
             '*on-error': 'BlockdevOnError',
+            '*filter-node-name': 'str',
             '*auto-finalize': 'bool', '*auto-dismiss': 'bool' } }
 
 ##
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 95d9333be1..c05fa1eb6b 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -1134,6 +1134,9 @@ int is_windows_drive(const char *filename);
  *                  See @BlockJobCreateFlags
  * @speed: The maximum speed, in bytes per second, or 0 for unlimited.
  * @on_error: The action to take upon error.
+ * @filter_node_name: The node name that should be assigned to the filter
+ * driver that the commit job inserts into the graph above @bs. NULL means
+ * that a node name should be autogenerated.
  * @errp: Error object.
  *
  * Start a streaming operation on @bs.  Clusters that are unallocated
@@ -1146,7 +1149,9 @@ int is_windows_drive(const char *filename);
 void stream_start(const char *job_id, BlockDriverState *bs,
                   BlockDriverState *base, const char *backing_file_str,
                   int creation_flags, int64_t speed,
-                  BlockdevOnError on_error, Error **errp);
+                  BlockdevOnError on_error,
+                  const char *filter_node_name,
+                  Error **errp);
 
 /**
  * commit_start:
diff --git a/block/monitor/block-hmp-cmds.c b/block/monitor/block-hmp-cmds.c
index d15a2be827..e8a58f326e 100644
--- a/block/monitor/block-hmp-cmds.c
+++ b/block/monitor/block-hmp-cmds.c
@@ -508,8 +508,8 @@ void hmp_block_stream(Monitor *mon, const QDict *qdict)
 
     qmp_block_stream(true, device, device, base != NULL, base, false, NULL,
                      false, NULL, qdict_haskey(qdict, "speed"), speed, true,
-                     BLOCKDEV_ON_ERROR_REPORT, false, false, false, false,
-                     &error);
+                     BLOCKDEV_ON_ERROR_REPORT, false, NULL, false, false, false,
+                     false, &error);
 
     hmp_handle_error(mon, error);
 }
diff --git a/block/stream.c b/block/stream.c
index 236384f2f7..6e281c71ac 100644
--- a/block/stream.c
+++ b/block/stream.c
@@ -221,7 +221,9 @@ static const BlockJobDriver stream_job_driver = {
 void stream_start(const char *job_id, BlockDriverState *bs,
                   BlockDriverState *base, const char *backing_file_str,
                   int creation_flags, int64_t speed,
-                  BlockdevOnError on_error, Error **errp)
+                  BlockdevOnError on_error,
+                  const char *filter_node_name,
+                  Error **errp)
 {
     StreamBlockJob *s;
     BlockDriverState *iter;
diff --git a/blockdev.c b/blockdev.c
index fe6fb5dc1d..c917625245 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -2499,6 +2499,7 @@ void qmp_block_stream(bool has_job_id, const char *job_id, const char *device,
                       bool has_backing_file, const char *backing_file,
                       bool has_speed, int64_t speed,
                       bool has_on_error, BlockdevOnError on_error,
+                      bool has_filter_node_name, const char *filter_node_name,
                       bool has_auto_finalize, bool auto_finalize,
                       bool has_auto_dismiss, bool auto_dismiss,
                       Error **errp)
@@ -2581,7 +2582,8 @@ void qmp_block_stream(bool has_job_id, const char *job_id, const char *device,
     }
 
     stream_start(has_job_id ? job_id : NULL, bs, base_bs, base_name,
-                 job_flags, has_speed ? speed : 0, on_error, &local_err);
+                 job_flags, has_speed ? speed : 0, on_error,
+                 filter_node_name, &local_err);
     if (local_err) {
         error_propagate(errp, local_err);
         goto out;
-- 
2.21.3



^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v14 05/13] qapi: create BlockdevOptionsCor structure for COR driver
  2020-12-04 22:07 [PATCH v14 00/13] Apply COR-filter to the block-stream permanently Vladimir Sementsov-Ogievskiy
                   ` (3 preceding siblings ...)
  2020-12-04 22:07 ` [PATCH v14 04/13] qapi: add filter-node-name to block-stream Vladimir Sementsov-Ogievskiy
@ 2020-12-04 22:07 ` Vladimir Sementsov-Ogievskiy
  2020-12-10 17:43   ` Max Reitz
  2020-12-04 22:07 ` [PATCH v14 06/13] iotests: add #310 to test bottom node in " Vladimir Sementsov-Ogievskiy
                   ` (7 subsequent siblings)
  12 siblings, 1 reply; 37+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-12-04 22:07 UTC (permalink / raw)
  To: qemu-block
  Cc: qemu-devel, eblake, armbru, jsnow, fam, stefanha, mreitz, kwolf,
	den, andrey.shinkevich, vsementsov

From: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>

Create the BlockdevOptionsCor structure for COR driver specific options
splitting it off form the BlockdevOptionsGenericFormat. The only option
'bottom' node in the structure denotes an image file that limits the
COR operations in the backing chain.
We are going to use the COR-filter for a block-stream job and will pass
a bottom node name to the COR driver. The bottom node is the first
non-filter overlay of the base. It was introduced because the base node
itself may change due to possible concurrent jobs.

Suggested-by: Max Reitz <mreitz@redhat.com>
Suggested-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
  [vsementsov: fix bdrv_is_allocated_above() usage]
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 qapi/block-core.json | 21 +++++++++++++++-
 block/copy-on-read.c | 57 ++++++++++++++++++++++++++++++++++++++++++--
 2 files changed, 75 insertions(+), 3 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 8ef3df6767..04055ef50c 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -3942,6 +3942,25 @@
   'data': { 'throttle-group': 'str',
             'file' : 'BlockdevRef'
              } }
+
+##
+# @BlockdevOptionsCor:
+#
+# Driver specific block device options for the copy-on-read driver.
+#
+# @bottom: the name of a non-filter node (allocation-bearing layer) that limits
+#          the COR operations in the backing chain (inclusive).
+#          For the block-stream job, it will be the first non-filter overlay of
+#          the base node. We do not involve the base node into the COR
+#          operations because the base may change due to a concurrent
+#          block-commit job on the same backing chain.
+#
+# Since: 5.2
+##
+{ 'struct': 'BlockdevOptionsCor',
+  'base': 'BlockdevOptionsGenericFormat',
+  'data': { '*bottom': 'str' } }
+
 ##
 # @BlockdevOptions:
 #
@@ -3994,7 +4013,7 @@
       'bochs':      'BlockdevOptionsGenericFormat',
       'cloop':      'BlockdevOptionsGenericFormat',
       'compress':   'BlockdevOptionsGenericFormat',
-      'copy-on-read':'BlockdevOptionsGenericFormat',
+      'copy-on-read':'BlockdevOptionsCor',
       'dmg':        'BlockdevOptionsGenericFormat',
       'file':       'BlockdevOptionsFile',
       'ftp':        'BlockdevOptionsCurlFtp',
diff --git a/block/copy-on-read.c b/block/copy-on-read.c
index 618c4c4f43..67f61983c0 100644
--- a/block/copy-on-read.c
+++ b/block/copy-on-read.c
@@ -24,18 +24,23 @@
 #include "block/block_int.h"
 #include "qemu/module.h"
 #include "qapi/error.h"
+#include "qapi/qmp/qdict.h"
 #include "block/copy-on-read.h"
 
 
 typedef struct BDRVStateCOR {
     bool active;
+    BlockDriverState *bottom_bs;
 } BDRVStateCOR;
 
 
 static int cor_open(BlockDriverState *bs, QDict *options, int flags,
                     Error **errp)
 {
+    BlockDriverState *bottom_bs = NULL;
     BDRVStateCOR *state = bs->opaque;
+    /* Find a bottom node name, if any */
+    const char *bottom_node = qdict_get_try_str(options, "bottom");
 
     bs->file = bdrv_open_child(NULL, options, "file", bs, &child_of_bds,
                                BDRV_CHILD_FILTERED | BDRV_CHILD_PRIMARY,
@@ -51,7 +56,17 @@ static int cor_open(BlockDriverState *bs, QDict *options, int flags,
         ((BDRV_REQ_FUA | BDRV_REQ_MAY_UNMAP | BDRV_REQ_NO_FALLBACK) &
             bs->file->bs->supported_zero_flags);
 
+    if (bottom_node) {
+        bottom_bs = bdrv_lookup_bs(NULL, bottom_node, errp);
+        if (!bottom_bs) {
+            error_setg(errp, "Bottom node '%s' not found", bottom_node);
+            qdict_del(options, "bottom");
+            return -EINVAL;
+        }
+        qdict_del(options, "bottom");
+    }
     state->active = true;
+    state->bottom_bs = bottom_bs;
 
     /*
      * We don't need to call bdrv_child_refresh_perms() now as the permissions
@@ -107,8 +122,46 @@ static int coroutine_fn cor_co_preadv_part(BlockDriverState *bs,
                                            size_t qiov_offset,
                                            int flags)
 {
-    return bdrv_co_preadv_part(bs->file, offset, bytes, qiov, qiov_offset,
-                               flags | BDRV_REQ_COPY_ON_READ);
+    int64_t n;
+    int local_flags;
+    int ret;
+    BDRVStateCOR *state = bs->opaque;
+
+    if (!state->bottom_bs) {
+        return bdrv_co_preadv_part(bs->file, offset, bytes, qiov, qiov_offset,
+                                   flags | BDRV_REQ_COPY_ON_READ);
+    }
+
+    while (bytes) {
+        local_flags = flags;
+
+        /* In case of failure, try to copy-on-read anyway */
+        ret = bdrv_is_allocated(bs->file->bs, offset, bytes, &n);
+        if (ret <= 0) {
+            ret = bdrv_is_allocated_above(bdrv_backing_chain_next(bs->file->bs),
+                                          state->bottom_bs, true, offset,
+                                          n, &n);
+            if (ret > 0 || ret < 0) {
+                local_flags |= BDRV_REQ_COPY_ON_READ;
+            }
+            /* Finish earlier if the end of a backing file has been reached */
+            if (n == 0) {
+                break;
+            }
+        }
+
+        ret = bdrv_co_preadv_part(bs->file, offset, n, qiov, qiov_offset,
+                                  local_flags);
+        if (ret < 0) {
+            return ret;
+        }
+
+        offset += n;
+        qiov_offset += n;
+        bytes -= n;
+    }
+
+    return 0;
 }
 
 
-- 
2.21.3



^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v14 06/13] iotests: add #310 to test bottom node in COR driver
  2020-12-04 22:07 [PATCH v14 00/13] Apply COR-filter to the block-stream permanently Vladimir Sementsov-Ogievskiy
                   ` (4 preceding siblings ...)
  2020-12-04 22:07 ` [PATCH v14 05/13] qapi: create BlockdevOptionsCor structure for COR driver Vladimir Sementsov-Ogievskiy
@ 2020-12-04 22:07 ` Vladimir Sementsov-Ogievskiy
  2020-12-11 12:49   ` Max Reitz
  2020-12-04 22:07 ` [PATCH v14 07/13] block: include supported_read_flags into BDS structure Vladimir Sementsov-Ogievskiy
                   ` (6 subsequent siblings)
  12 siblings, 1 reply; 37+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-12-04 22:07 UTC (permalink / raw)
  To: qemu-block
  Cc: qemu-devel, eblake, armbru, jsnow, fam, stefanha, mreitz, kwolf,
	den, andrey.shinkevich, vsementsov

From: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>

The test case #310 is similar to #216 by Max Reitz. The difference is
that the test #310 involves a bottom node to the COR filter driver.

Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 tests/qemu-iotests/310     | 114 +++++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/310.out |  15 +++++
 tests/qemu-iotests/group   |   1 +
 3 files changed, 130 insertions(+)
 create mode 100755 tests/qemu-iotests/310
 create mode 100644 tests/qemu-iotests/310.out

diff --git a/tests/qemu-iotests/310 b/tests/qemu-iotests/310
new file mode 100755
index 0000000000..c8b34cd887
--- /dev/null
+++ b/tests/qemu-iotests/310
@@ -0,0 +1,114 @@
+#!/usr/bin/env python3
+#
+# Copy-on-read tests using a COR filter with a bottom node
+#
+# Copyright (C) 2018 Red Hat, Inc.
+# Copyright (c) 2020 Virtuozzo International GmbH
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
+
+import iotests
+from iotests import log, qemu_img, qemu_io_silent
+
+# Need backing file support
+iotests.script_initialize(supported_fmts=['qcow2', 'qcow', 'qed', 'vmdk'],
+                          supported_platforms=['linux'])
+
+log('')
+log('=== Copy-on-read across nodes ===')
+log('')
+
+# This test is similar to the 216 one by Max Reitz <mreitz@redhat.com>
+# The difference is that this test case involves a bottom node to the
+# COR filter driver.
+
+with iotests.FilePath('base.img') as base_img_path, \
+     iotests.FilePath('mid.img') as mid_img_path, \
+     iotests.FilePath('top.img') as top_img_path, \
+     iotests.VM() as vm:
+
+    log('--- Setting up images ---')
+    log('')
+
+    assert qemu_img('create', '-f', iotests.imgfmt, base_img_path, '64M') == 0
+    assert qemu_io_silent(base_img_path, '-c', 'write -P 1 0M 1M') == 0
+    assert qemu_io_silent(base_img_path, '-c', 'write -P 1 3M 1M') == 0
+    assert qemu_img('create', '-f', iotests.imgfmt, '-b', base_img_path,
+                    '-F', iotests.imgfmt, mid_img_path) == 0
+    assert qemu_io_silent(mid_img_path,  '-c', 'write -P 3 2M 1M') == 0
+    assert qemu_io_silent(mid_img_path,  '-c', 'write -P 3 4M 1M') == 0
+    assert qemu_img('create', '-f', iotests.imgfmt, '-b', mid_img_path,
+                    '-F', iotests.imgfmt, top_img_path) == 0
+    assert qemu_io_silent(top_img_path,  '-c', 'write -P 2 1M 1M') == 0
+
+#      0 1 2 3 4
+# top    2
+# mid      3   3
+# base 1     1
+
+    log('Done')
+
+    log('')
+    log('--- Doing COR ---')
+    log('')
+
+    vm.launch()
+
+    log(vm.qmp('blockdev-add',
+               node_name='node0',
+               driver='copy-on-read',
+               bottom='node2',
+               file={
+                   'driver': iotests.imgfmt,
+                   'file': {
+                       'driver': 'file',
+                       'filename': top_img_path
+                   },
+                   'backing': {
+                       'node-name': 'node2',
+                       'driver': iotests.imgfmt,
+                       'file': {
+                           'driver': 'file',
+                           'filename': mid_img_path
+                       },
+                       'backing': {
+                           'driver': iotests.imgfmt,
+                           'file': {
+                               'driver': 'file',
+                               'filename': base_img_path
+                           }
+                       },
+                   }
+               }))
+
+    # Trigger COR
+    log(vm.qmp('human-monitor-command',
+               command_line='qemu-io node0 "read 0 5M"'))
+
+    vm.shutdown()
+
+    log('')
+    log('--- Checking COR result ---')
+    log('')
+
+    assert qemu_io_silent(base_img_path, '-c', 'discard 0 4M') == 0
+    assert qemu_io_silent(mid_img_path, '-c', 'discard 0M 5M') == 0
+    assert qemu_io_silent(top_img_path,  '-c', 'read -P 0 0 1M') == 0
+    assert qemu_io_silent(top_img_path,  '-c', 'read -P 2 1M 1M') == 0
+    assert qemu_io_silent(top_img_path,  '-c', 'read -P 3 2M 1M') == 0
+    assert qemu_io_silent(top_img_path,  '-c', 'read -P 0 3M 1M') == 0
+    assert qemu_io_silent(top_img_path,  '-c', 'read -P 3 4M 1M') == 0
+
+    log('Done')
diff --git a/tests/qemu-iotests/310.out b/tests/qemu-iotests/310.out
new file mode 100644
index 0000000000..a70aa5cdae
--- /dev/null
+++ b/tests/qemu-iotests/310.out
@@ -0,0 +1,15 @@
+
+=== Copy-on-read across nodes ===
+
+--- Setting up images ---
+
+Done
+
+--- Doing COR ---
+
+{"return": {}}
+{"return": ""}
+
+--- Checking COR result ---
+
+Done
diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
index 2960dff728..2793dc31be 100644
--- a/tests/qemu-iotests/group
+++ b/tests/qemu-iotests/group
@@ -316,3 +316,4 @@
 305 rw quick
 307 rw quick export
 309 rw auto quick
+310 rw quick
-- 
2.21.3



^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v14 07/13] block: include supported_read_flags into BDS structure
  2020-12-04 22:07 [PATCH v14 00/13] Apply COR-filter to the block-stream permanently Vladimir Sementsov-Ogievskiy
                   ` (5 preceding siblings ...)
  2020-12-04 22:07 ` [PATCH v14 06/13] iotests: add #310 to test bottom node in " Vladimir Sementsov-Ogievskiy
@ 2020-12-04 22:07 ` Vladimir Sementsov-Ogievskiy
  2020-12-11 13:20   ` Max Reitz
  2020-12-04 22:07 ` [PATCH v14 08/13] copy-on-read: skip non-guest reads if no copy needed Vladimir Sementsov-Ogievskiy
                   ` (5 subsequent siblings)
  12 siblings, 1 reply; 37+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-12-04 22:07 UTC (permalink / raw)
  To: qemu-block
  Cc: qemu-devel, eblake, armbru, jsnow, fam, stefanha, mreitz, kwolf,
	den, andrey.shinkevich, vsementsov

From: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>

Add the new member supported_read_flags to the BlockDriverState
structure. It will control the flags set for copy-on-read operations.
Make the block generic layer evaluate supported read flags before they
go to a block driver.

Suggested-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 include/block/block_int.h |  4 ++++
 block/io.c                | 12 ++++++++++--
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/include/block/block_int.h b/include/block/block_int.h
index c05fa1eb6b..247e166ab6 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -873,6 +873,10 @@ struct BlockDriverState {
     /* I/O Limits */
     BlockLimits bl;
 
+    /*
+     * Flags honored during pread
+     */
+    unsigned int supported_read_flags;
     /* Flags honored during pwrite (so far: BDRV_REQ_FUA,
      * BDRV_REQ_WRITE_UNCHANGED).
      * If a driver does not support BDRV_REQ_WRITE_UNCHANGED, those
diff --git a/block/io.c b/block/io.c
index ec5e152bb7..e28b11c42b 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1405,6 +1405,9 @@ static int coroutine_fn bdrv_aligned_preadv(BdrvChild *child,
     if (flags & BDRV_REQ_COPY_ON_READ) {
         int64_t pnum;
 
+        /* The flag BDRV_REQ_COPY_ON_READ has reached its addressee */
+        flags &= ~BDRV_REQ_COPY_ON_READ;
+
         ret = bdrv_is_allocated(bs, offset, bytes, &pnum);
         if (ret < 0) {
             goto out;
@@ -1426,9 +1429,13 @@ static int coroutine_fn bdrv_aligned_preadv(BdrvChild *child,
         goto out;
     }
 
+    if (flags & ~bs->supported_read_flags) {
+        abort();
+    }
+
     max_bytes = ROUND_UP(MAX(0, total_bytes - offset), align);
     if (bytes <= max_bytes && bytes <= max_transfer) {
-        ret = bdrv_driver_preadv(bs, offset, bytes, qiov, qiov_offset, 0);
+        ret = bdrv_driver_preadv(bs, offset, bytes, qiov, qiov_offset, flags);
         goto out;
     }
 
@@ -1441,7 +1448,8 @@ static int coroutine_fn bdrv_aligned_preadv(BdrvChild *child,
 
             ret = bdrv_driver_preadv(bs, offset + bytes - bytes_remaining,
                                      num, qiov,
-                                     qiov_offset + bytes - bytes_remaining, 0);
+                                     qiov_offset + bytes - bytes_remaining,
+                                     flags);
             max_bytes -= num;
         } else {
             num = bytes_remaining;
-- 
2.21.3



^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v14 08/13] copy-on-read: skip non-guest reads if no copy needed
  2020-12-04 22:07 [PATCH v14 00/13] Apply COR-filter to the block-stream permanently Vladimir Sementsov-Ogievskiy
                   ` (6 preceding siblings ...)
  2020-12-04 22:07 ` [PATCH v14 07/13] block: include supported_read_flags into BDS structure Vladimir Sementsov-Ogievskiy
@ 2020-12-04 22:07 ` Vladimir Sementsov-Ogievskiy
  2020-12-11 14:29   ` Max Reitz
  2020-12-04 22:07 ` [PATCH v14 09/13] stream: skip filters when writing backing file name to QCOW2 header Vladimir Sementsov-Ogievskiy
                   ` (4 subsequent siblings)
  12 siblings, 1 reply; 37+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-12-04 22:07 UTC (permalink / raw)
  To: qemu-block
  Cc: qemu-devel, eblake, armbru, jsnow, fam, stefanha, mreitz, kwolf,
	den, andrey.shinkevich, vsementsov

From: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>

If the flag BDRV_REQ_PREFETCH was set, skip idling read/write
operations in COR-driver. It can be taken into account for the
COR-algorithms optimization. That check is being made during the
block stream job by the moment.

Add the BDRV_REQ_PREFETCH flag to the supported_read_flags of the
COR-filter.

block: Modify the comment for the flag BDRV_REQ_PREFETCH as we are
going to use it alone and pass it to the COR-filter driver for further
processing.

Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 include/block/block.h |  8 +++++---
 block/copy-on-read.c  | 14 ++++++++++----
 2 files changed, 15 insertions(+), 7 deletions(-)

diff --git a/include/block/block.h b/include/block/block.h
index 81a3894129..3499554d9c 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -81,9 +81,11 @@ typedef enum {
     BDRV_REQ_NO_FALLBACK        = 0x100,
 
     /*
-     * BDRV_REQ_PREFETCH may be used only together with BDRV_REQ_COPY_ON_READ
-     * on read request and means that caller doesn't really need data to be
-     * written to qiov parameter which may be NULL.
+     * BDRV_REQ_PREFETCH makes sense only in the context of copy-on-read
+     * (i.e., together with the BDRV_REQ_COPY_ON_READ flag or when a COR
+     * filter is involved), in which case it signals that the COR operation
+     * need not read the data into memory (qiov) but only ensure they are
+     * copied to the top layer (i.e., that COR operation is done).
      */
     BDRV_REQ_PREFETCH  = 0x200,
     /* Mask of valid flags */
diff --git a/block/copy-on-read.c b/block/copy-on-read.c
index 67f61983c0..8b64e55e22 100644
--- a/block/copy-on-read.c
+++ b/block/copy-on-read.c
@@ -49,6 +49,8 @@ static int cor_open(BlockDriverState *bs, QDict *options, int flags,
         return -EINVAL;
     }
 
+    bs->supported_read_flags = BDRV_REQ_PREFETCH;
+
     bs->supported_write_flags = BDRV_REQ_WRITE_UNCHANGED |
         (BDRV_REQ_FUA & bs->file->bs->supported_write_flags);
 
@@ -150,10 +152,14 @@ static int coroutine_fn cor_co_preadv_part(BlockDriverState *bs,
             }
         }
 
-        ret = bdrv_co_preadv_part(bs->file, offset, n, qiov, qiov_offset,
-                                  local_flags);
-        if (ret < 0) {
-            return ret;
+        /* Skip if neither read nor write are needed */
+        if ((local_flags & (BDRV_REQ_PREFETCH | BDRV_REQ_COPY_ON_READ)) !=
+            BDRV_REQ_PREFETCH) {
+            ret = bdrv_co_preadv_part(bs->file, offset, n, qiov, qiov_offset,
+                                      local_flags);
+            if (ret < 0) {
+                return ret;
+            }
         }
 
         offset += n;
-- 
2.21.3



^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v14 09/13] stream: skip filters when writing backing file name to QCOW2 header
  2020-12-04 22:07 [PATCH v14 00/13] Apply COR-filter to the block-stream permanently Vladimir Sementsov-Ogievskiy
                   ` (7 preceding siblings ...)
  2020-12-04 22:07 ` [PATCH v14 08/13] copy-on-read: skip non-guest reads if no copy needed Vladimir Sementsov-Ogievskiy
@ 2020-12-04 22:07 ` Vladimir Sementsov-Ogievskiy
  2020-12-11 15:15   ` Max Reitz
  2020-12-04 22:07 ` [PATCH v14 10/13] qapi: block-stream: add "bottom" argument Vladimir Sementsov-Ogievskiy
                   ` (3 subsequent siblings)
  12 siblings, 1 reply; 37+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-12-04 22:07 UTC (permalink / raw)
  To: qemu-block
  Cc: qemu-devel, eblake, armbru, jsnow, fam, stefanha, mreitz, kwolf,
	den, andrey.shinkevich, vsementsov

From: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>

Avoid writing a filter JSON file name and a filter format name to QCOW2
image when the backing file is being changed after the block stream
job. It can occur due to a concurrent commit job on the same backing
chain.
A user is still able to assign the 'backing-file' parameter for a
block-stream job keeping in mind the possible issue mentioned above.
If the user does not specify the 'backing-file' parameter, QEMU will
assign it automatically.

Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
 [vsementsov: use unfiltered_bs for bdrv_find_backing_image()]
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/stream.c | 21 +++++++++++++++++++--
 blockdev.c     |  8 +-------
 2 files changed, 20 insertions(+), 9 deletions(-)

diff --git a/block/stream.c b/block/stream.c
index 6e281c71ac..c208393c34 100644
--- a/block/stream.c
+++ b/block/stream.c
@@ -17,6 +17,7 @@
 #include "block/blockjob_int.h"
 #include "qapi/error.h"
 #include "qapi/qmp/qerror.h"
+#include "qemu/error-report.h"
 #include "qemu/ratelimit.h"
 #include "sysemu/block-backend.h"
 
@@ -65,6 +66,8 @@ static int stream_prepare(Job *job)
     BlockDriverState *bs = blk_bs(bjob->blk);
     BlockDriverState *unfiltered_bs = bdrv_skip_filters(bs);
     BlockDriverState *base = bdrv_filter_or_cow_bs(s->above_base);
+    BlockDriverState *base_unfiltered;
+    BlockDriverState *backing_bs;
     Error *local_err = NULL;
     int ret = 0;
 
@@ -75,8 +78,22 @@ static int stream_prepare(Job *job)
         const char *base_id = NULL, *base_fmt = NULL;
         if (base) {
             base_id = s->backing_file_str;
-            if (base->drv) {
-                base_fmt = base->drv->format_name;
+            if (base_id) {
+                backing_bs = bdrv_find_backing_image(unfiltered_bs, base_id);
+                if (backing_bs && backing_bs->drv) {
+                    base_fmt = backing_bs->drv->format_name;
+                } else {
+                    error_report("Format not found for backing file %s",
+                                 s->backing_file_str);
+                }
+            } else {
+                base_unfiltered = bdrv_skip_filters(base);
+                if (base_unfiltered) {
+                    base_id = base_unfiltered->filename;
+                    if (base_unfiltered->drv) {
+                        base_fmt = base_unfiltered->drv->format_name;
+                    }
+                }
             }
         }
         bdrv_set_backing_hd(unfiltered_bs, base, &local_err);
diff --git a/blockdev.c b/blockdev.c
index c917625245..70900f4f77 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -2508,7 +2508,6 @@ void qmp_block_stream(bool has_job_id, const char *job_id, const char *device,
     BlockDriverState *base_bs = NULL;
     AioContext *aio_context;
     Error *local_err = NULL;
-    const char *base_name = NULL;
     int job_flags = JOB_DEFAULT;
 
     if (!has_on_error) {
@@ -2536,7 +2535,6 @@ void qmp_block_stream(bool has_job_id, const char *job_id, const char *device,
             goto out;
         }
         assert(bdrv_get_aio_context(base_bs) == aio_context);
-        base_name = base;
     }
 
     if (has_base_node) {
@@ -2551,7 +2549,6 @@ void qmp_block_stream(bool has_job_id, const char *job_id, const char *device,
         }
         assert(bdrv_get_aio_context(base_bs) == aio_context);
         bdrv_refresh_filename(base_bs);
-        base_name = base_bs->filename;
     }
 
     /* Check for op blockers in the whole chain between bs and base */
@@ -2571,9 +2568,6 @@ void qmp_block_stream(bool has_job_id, const char *job_id, const char *device,
         goto out;
     }
 
-    /* backing_file string overrides base bs filename */
-    base_name = has_backing_file ? backing_file : base_name;
-
     if (has_auto_finalize && !auto_finalize) {
         job_flags |= JOB_MANUAL_FINALIZE;
     }
@@ -2581,7 +2575,7 @@ void qmp_block_stream(bool has_job_id, const char *job_id, const char *device,
         job_flags |= JOB_MANUAL_DISMISS;
     }
 
-    stream_start(has_job_id ? job_id : NULL, bs, base_bs, base_name,
+    stream_start(has_job_id ? job_id : NULL, bs, base_bs, backing_file,
                  job_flags, has_speed ? speed : 0, on_error,
                  filter_node_name, &local_err);
     if (local_err) {
-- 
2.21.3



^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v14 10/13] qapi: block-stream: add "bottom" argument
  2020-12-04 22:07 [PATCH v14 00/13] Apply COR-filter to the block-stream permanently Vladimir Sementsov-Ogievskiy
                   ` (8 preceding siblings ...)
  2020-12-04 22:07 ` [PATCH v14 09/13] stream: skip filters when writing backing file name to QCOW2 header Vladimir Sementsov-Ogievskiy
@ 2020-12-04 22:07 ` Vladimir Sementsov-Ogievskiy
  2020-12-11 16:05   ` Max Reitz
  2020-12-04 22:07 ` [PATCH v14 11/13] iotests: 30: prepare to COR filter insertion by stream job Vladimir Sementsov-Ogievskiy
                   ` (2 subsequent siblings)
  12 siblings, 1 reply; 37+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-12-04 22:07 UTC (permalink / raw)
  To: qemu-block
  Cc: qemu-devel, eblake, armbru, jsnow, fam, stefanha, mreitz, kwolf,
	den, andrey.shinkevich, vsementsov

The code already don't freeze base node and we try to make it prepared
for the situation when base node is changed during the operation. In
other words, block-stream doesn't own base node.

Let's introduce a new interface which should replace the current one,
which will in better relations with the code. Specifying bottom node
instead of base, and requiring it to be non-filter gives us the
following benefits:

 - drop difference between above_base and base_overlay, which will be
   renamed to just bottom, when old interface dropped

 - clean way to work with parallel streams/commits on the same backing
   chain, which otherwise become a problem when we introduce a filter
   for stream job

 - cleaner interface. Nobody will surprised the fact that base node may
   disappear during block-stream, when there is no word about "base" in
   the interface.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 qapi/block-core.json           |  8 +++--
 include/block/block_int.h      |  1 +
 block/monitor/block-hmp-cmds.c |  3 +-
 block/stream.c                 | 50 +++++++++++++++++++---------
 blockdev.c                     | 61 ++++++++++++++++++++++++++++------
 5 files changed, 94 insertions(+), 29 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 04055ef50c..5d6681a35d 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -2522,6 +2522,10 @@
 # @base-node: the node name of the backing file.
 #             It cannot be set if @base is also set. (Since 2.8)
 #
+# @bottom: the last node in the chain that should be streamed into
+#          top. It cannot be set any of @base, @base-node or @backing-file
+#          is set. It cannot be filter node. (Since 6.0)
+#
 # @backing-file: The backing file string to write into the top
 #                image. This filename is not validated.
 #
@@ -2576,8 +2580,8 @@
 ##
 { 'command': 'block-stream',
   'data': { '*job-id': 'str', 'device': 'str', '*base': 'str',
-            '*base-node': 'str', '*backing-file': 'str', '*speed': 'int',
-            '*on-error': 'BlockdevOnError',
+            '*base-node': 'str', '*backing-file': 'str', '*bottom': 'str',
+            '*speed': 'int', '*on-error': 'BlockdevOnError',
             '*filter-node-name': 'str',
             '*auto-finalize': 'bool', '*auto-dismiss': 'bool' } }
 
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 247e166ab6..b13154edbf 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -1152,6 +1152,7 @@ int is_windows_drive(const char *filename);
  */
 void stream_start(const char *job_id, BlockDriverState *bs,
                   BlockDriverState *base, const char *backing_file_str,
+                  BlockDriverState *bottom,
                   int creation_flags, int64_t speed,
                   BlockdevOnError on_error,
                   const char *filter_node_name,
diff --git a/block/monitor/block-hmp-cmds.c b/block/monitor/block-hmp-cmds.c
index e8a58f326e..afd75ab628 100644
--- a/block/monitor/block-hmp-cmds.c
+++ b/block/monitor/block-hmp-cmds.c
@@ -507,7 +507,8 @@ void hmp_block_stream(Monitor *mon, const QDict *qdict)
     int64_t speed = qdict_get_try_int(qdict, "speed", 0);
 
     qmp_block_stream(true, device, device, base != NULL, base, false, NULL,
-                     false, NULL, qdict_haskey(qdict, "speed"), speed, true,
+                     false, NULL, false, NULL,
+                     qdict_haskey(qdict, "speed"), speed, true,
                      BLOCKDEV_ON_ERROR_REPORT, false, NULL, false, false, false,
                      false, &error);
 
diff --git a/block/stream.c b/block/stream.c
index c208393c34..a2744d07fe 100644
--- a/block/stream.c
+++ b/block/stream.c
@@ -237,6 +237,7 @@ static const BlockJobDriver stream_job_driver = {
 
 void stream_start(const char *job_id, BlockDriverState *bs,
                   BlockDriverState *base, const char *backing_file_str,
+                  BlockDriverState *bottom,
                   int creation_flags, int64_t speed,
                   BlockdevOnError on_error,
                   const char *filter_node_name,
@@ -246,25 +247,42 @@ void stream_start(const char *job_id, BlockDriverState *bs,
     BlockDriverState *iter;
     bool bs_read_only;
     int basic_flags = BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE_UNCHANGED;
-    BlockDriverState *base_overlay = bdrv_find_overlay(bs, base);
+    BlockDriverState *base_overlay;
     BlockDriverState *above_base;
 
-    if (!base_overlay) {
-        error_setg(errp, "'%s' is not in the backing chain of '%s'",
-                   base->node_name, bs->node_name);
-        return;
-    }
+    assert(!(base && bottom));
+    assert(!(backing_file_str && bottom));
+
+    if (bottom) {
+        /*
+         * New simple interface. The code is written in terms of old interface
+         * with @base parameter (still, it doesn't freeze link to base, so in
+         * this mean old code is correct for new interface). So, for now, just
+         * emulate base_overlay and above_base. Still, when old interface
+         * finally removed, we should refactor code to use only "bottom", but
+         * not "*base*" things.
+         */
+        assert(!bottom->drv->is_filter);
+        base_overlay = above_base = bottom;
+    } else {
+        base_overlay = bdrv_find_overlay(bs, base);
+        if (!base_overlay) {
+            error_setg(errp, "'%s' is not in the backing chain of '%s'",
+                       base->node_name, bs->node_name);
+            return;
+        }
 
-    /*
-     * Find the node directly above @base.  @base_overlay is a COW overlay, so
-     * it must have a bdrv_cow_child(), but it is the immediate overlay of
-     * @base, so between the two there can only be filters.
-     */
-    above_base = base_overlay;
-    if (bdrv_cow_bs(above_base) != base) {
-        above_base = bdrv_cow_bs(above_base);
-        while (bdrv_filter_bs(above_base) != base) {
-            above_base = bdrv_filter_bs(above_base);
+        /*
+         * Find the node directly above @base.  @base_overlay is a COW overlay,
+         * so it must have a bdrv_cow_child(), but it is the immediate overlay
+         * of @base, so between the two there can only be filters.
+         */
+        above_base = base_overlay;
+        if (bdrv_cow_bs(above_base) != base) {
+            above_base = bdrv_cow_bs(above_base);
+            while (bdrv_filter_bs(above_base) != base) {
+                above_base = bdrv_filter_bs(above_base);
+            }
         }
     }
 
diff --git a/blockdev.c b/blockdev.c
index 70900f4f77..e0e19db88b 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -2497,6 +2497,7 @@ void qmp_block_stream(bool has_job_id, const char *job_id, const char *device,
                       bool has_base, const char *base,
                       bool has_base_node, const char *base_node,
                       bool has_backing_file, const char *backing_file,
+                      bool has_bottom, const char *bottom,
                       bool has_speed, int64_t speed,
                       bool has_on_error, BlockdevOnError on_error,
                       bool has_filter_node_name, const char *filter_node_name,
@@ -2504,16 +2505,37 @@ void qmp_block_stream(bool has_job_id, const char *job_id, const char *device,
                       bool has_auto_dismiss, bool auto_dismiss,
                       Error **errp)
 {
-    BlockDriverState *bs, *iter;
+    BlockDriverState *bs, *iter, *iter_end;
     BlockDriverState *base_bs = NULL;
+    BlockDriverState *bottom_bs = NULL;
     AioContext *aio_context;
     Error *local_err = NULL;
     int job_flags = JOB_DEFAULT;
+    struct {
+        bool a;
+        const char *a_name;
+        bool b;
+        const char *b_name;
+    } restricted_pairs[] = {
+        {has_base, "base", has_base_node, "base-node"},
+        {has_bottom, "bottom", has_base, "base"},
+        {has_bottom, "bottom", has_base_node, "base-node"},
+        {has_bottom, "bottom", has_backing_file, "backing-file"},
+        {0}
+    }, *rp = restricted_pairs;
 
     if (!has_on_error) {
         on_error = BLOCKDEV_ON_ERROR_REPORT;
     }
 
+    for ( ; rp->a_name; rp++) {
+        if (rp->a && rp->b) {
+            error_setg(errp, "'%s' and '%s' cannot be specified "
+                       "at the same time", rp->a_name, rp->b_name);
+            return;
+        }
+    }
+
     bs = bdrv_lookup_bs(device, device, errp);
     if (!bs) {
         return;
@@ -2522,12 +2544,6 @@ void qmp_block_stream(bool has_job_id, const char *job_id, const char *device,
     aio_context = bdrv_get_aio_context(bs);
     aio_context_acquire(aio_context);
 
-    if (has_base && has_base_node) {
-        error_setg(errp, "'base' and 'base-node' cannot be specified "
-                   "at the same time");
-        goto out;
-    }
-
     if (has_base) {
         base_bs = bdrv_find_backing_image(bs, base);
         if (base_bs == NULL) {
@@ -2551,8 +2567,33 @@ void qmp_block_stream(bool has_job_id, const char *job_id, const char *device,
         bdrv_refresh_filename(base_bs);
     }
 
-    /* Check for op blockers in the whole chain between bs and base */
-    for (iter = bs; iter && iter != base_bs;
+    if (has_bottom) {
+        bottom_bs = bdrv_lookup_bs(NULL, bottom, errp);
+        if (!bottom_bs) {
+            goto out;
+        }
+        if (!bottom_bs->drv) {
+            error_setg(errp, "Node '%s' is not open", bottom);
+            goto out;
+        }
+        if (bottom_bs->drv->is_filter) {
+            error_setg(errp, "Node '%s' is filter, use non-filter node"
+                       "as 'bottom'", bottom);
+            goto out;
+        }
+        if (!bdrv_chain_contains(bs, bottom_bs)) {
+            error_setg(errp, "Node '%s' is not in a chain starting from '%s'",
+                       bottom, device);
+            goto out;
+        }
+        assert(bdrv_get_aio_context(bottom_bs) == aio_context);
+    }
+
+    /*
+     * Check for op blockers in the whole chain between bs and base (or bottom)
+     */
+    iter_end = has_bottom ? bdrv_filter_or_cow_bs(bottom_bs) : base_bs;
+    for (iter = bs; iter && iter != iter_end;
          iter = bdrv_filter_or_cow_bs(iter))
     {
         if (bdrv_op_is_blocked(iter, BLOCK_OP_TYPE_STREAM, errp)) {
@@ -2576,7 +2617,7 @@ void qmp_block_stream(bool has_job_id, const char *job_id, const char *device,
     }
 
     stream_start(has_job_id ? job_id : NULL, bs, base_bs, backing_file,
-                 job_flags, has_speed ? speed : 0, on_error,
+                 bottom_bs, job_flags, has_speed ? speed : 0, on_error,
                  filter_node_name, &local_err);
     if (local_err) {
         error_propagate(errp, local_err);
-- 
2.21.3



^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v14 11/13] iotests: 30: prepare to COR filter insertion by stream job
  2020-12-04 22:07 [PATCH v14 00/13] Apply COR-filter to the block-stream permanently Vladimir Sementsov-Ogievskiy
                   ` (9 preceding siblings ...)
  2020-12-04 22:07 ` [PATCH v14 10/13] qapi: block-stream: add "bottom" argument Vladimir Sementsov-Ogievskiy
@ 2020-12-04 22:07 ` Vladimir Sementsov-Ogievskiy
  2020-12-11 16:09   ` Max Reitz
  2020-12-04 22:07 ` [PATCH v14 12/13] block/stream: add s->target_bs Vladimir Sementsov-Ogievskiy
  2020-12-04 22:07 ` [PATCH v14 13/13] block: apply COR-filter to block-stream jobs Vladimir Sementsov-Ogievskiy
  12 siblings, 1 reply; 37+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-12-04 22:07 UTC (permalink / raw)
  To: qemu-block
  Cc: qemu-devel, eblake, armbru, jsnow, fam, stefanha, mreitz, kwolf,
	den, andrey.shinkevich, vsementsov

test_stream_parallel run parallel stream jobs, intersecting so that top
of one is base of another. It's OK now, but it would be a problem if
insert the filter, as one job will want to use another job's filter as
above_base node.

Correct thing to do is move to new interface: "bottom" argument instead
of base. This guarantees that jobs don't intersect by their actions.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 tests/qemu-iotests/030 | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/tests/qemu-iotests/030 b/tests/qemu-iotests/030
index dcb4b5d6a6..bd8cf9cff7 100755
--- a/tests/qemu-iotests/030
+++ b/tests/qemu-iotests/030
@@ -245,7 +245,9 @@ class TestParallelOps(iotests.QMPTestCase):
             node_name = 'node%d' % i
             job_id = 'stream-%s' % node_name
             pending_jobs.append(job_id)
-            result = self.vm.qmp('block-stream', device=node_name, job_id=job_id, base=self.imgs[i-2], speed=1024)
+            result = self.vm.qmp('block-stream', device=node_name,
+                                 job_id=job_id, bottom=f'node{i-1}',
+                                 speed=1024)
             self.assert_qmp(result, 'return', {})
 
         for job in pending_jobs:
-- 
2.21.3



^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v14 12/13] block/stream: add s->target_bs
  2020-12-04 22:07 [PATCH v14 00/13] Apply COR-filter to the block-stream permanently Vladimir Sementsov-Ogievskiy
                   ` (10 preceding siblings ...)
  2020-12-04 22:07 ` [PATCH v14 11/13] iotests: 30: prepare to COR filter insertion by stream job Vladimir Sementsov-Ogievskiy
@ 2020-12-04 22:07 ` Vladimir Sementsov-Ogievskiy
  2020-12-11 16:33   ` Max Reitz
  2020-12-04 22:07 ` [PATCH v14 13/13] block: apply COR-filter to block-stream jobs Vladimir Sementsov-Ogievskiy
  12 siblings, 1 reply; 37+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-12-04 22:07 UTC (permalink / raw)
  To: qemu-block
  Cc: qemu-devel, eblake, armbru, jsnow, fam, stefanha, mreitz, kwolf,
	den, andrey.shinkevich, vsementsov

Add a direct link to target bs for convenience and to simplify
following commit which will insert COR filter above target bs.

This is a part of original commit written by Andrey.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/stream.c | 23 ++++++++++-------------
 1 file changed, 10 insertions(+), 13 deletions(-)

diff --git a/block/stream.c b/block/stream.c
index a2744d07fe..a7fd8945ad 100644
--- a/block/stream.c
+++ b/block/stream.c
@@ -34,6 +34,7 @@ typedef struct StreamBlockJob {
     BlockJob common;
     BlockDriverState *base_overlay; /* COW overlay (stream from this) */
     BlockDriverState *above_base;   /* Node directly above the base */
+    BlockDriverState *target_bs;
     BlockdevOnError on_error;
     char *backing_file_str;
     bool bs_read_only;
@@ -54,24 +55,21 @@ static void stream_abort(Job *job)
     StreamBlockJob *s = container_of(job, StreamBlockJob, common.job);
 
     if (s->chain_frozen) {
-        BlockJob *bjob = &s->common;
-        bdrv_unfreeze_backing_chain(blk_bs(bjob->blk), s->above_base);
+        bdrv_unfreeze_backing_chain(s->target_bs, s->above_base);
     }
 }
 
 static int stream_prepare(Job *job)
 {
     StreamBlockJob *s = container_of(job, StreamBlockJob, common.job);
-    BlockJob *bjob = &s->common;
-    BlockDriverState *bs = blk_bs(bjob->blk);
-    BlockDriverState *unfiltered_bs = bdrv_skip_filters(bs);
+    BlockDriverState *unfiltered_bs = bdrv_skip_filters(s->target_bs);
     BlockDriverState *base = bdrv_filter_or_cow_bs(s->above_base);
     BlockDriverState *base_unfiltered;
     BlockDriverState *backing_bs;
     Error *local_err = NULL;
     int ret = 0;
 
-    bdrv_unfreeze_backing_chain(bs, s->above_base);
+    bdrv_unfreeze_backing_chain(s->target_bs, s->above_base);
     s->chain_frozen = false;
 
     if (bdrv_cow_child(unfiltered_bs)) {
@@ -111,13 +109,12 @@ static void stream_clean(Job *job)
 {
     StreamBlockJob *s = container_of(job, StreamBlockJob, common.job);
     BlockJob *bjob = &s->common;
-    BlockDriverState *bs = blk_bs(bjob->blk);
 
     /* Reopen the image back in read-only mode if necessary */
     if (s->bs_read_only) {
         /* Give up write permissions before making it read-only */
         blk_set_perm(bjob->blk, 0, BLK_PERM_ALL, &error_abort);
-        bdrv_reopen_set_read_only(bs, true, NULL);
+        bdrv_reopen_set_read_only(s->target_bs, true, NULL);
     }
 
     g_free(s->backing_file_str);
@@ -127,8 +124,7 @@ static int coroutine_fn stream_run(Job *job, Error **errp)
 {
     StreamBlockJob *s = container_of(job, StreamBlockJob, common.job);
     BlockBackend *blk = s->common.blk;
-    BlockDriverState *bs = blk_bs(blk);
-    BlockDriverState *unfiltered_bs = bdrv_skip_filters(bs);
+    BlockDriverState *unfiltered_bs = bdrv_skip_filters(s->target_bs);
     bool enable_cor = !bdrv_cow_child(s->base_overlay);
     int64_t len;
     int64_t offset = 0;
@@ -141,7 +137,7 @@ static int coroutine_fn stream_run(Job *job, Error **errp)
         return 0;
     }
 
-    len = bdrv_getlength(bs);
+    len = bdrv_getlength(s->target_bs);
     if (len < 0) {
         return len;
     }
@@ -153,7 +149,7 @@ static int coroutine_fn stream_run(Job *job, Error **errp)
      * account.
      */
     if (enable_cor) {
-        bdrv_enable_copy_on_read(bs);
+        bdrv_enable_copy_on_read(s->target_bs);
     }
 
     for ( ; offset < len; offset += n) {
@@ -215,7 +211,7 @@ static int coroutine_fn stream_run(Job *job, Error **errp)
     }
 
     if (enable_cor) {
-        bdrv_disable_copy_on_read(bs);
+        bdrv_disable_copy_on_read(s->target_bs);
     }
 
     /* Do not remove the backing file if an error was there but ignored. */
@@ -330,6 +326,7 @@ void stream_start(const char *job_id, BlockDriverState *bs,
     s->base_overlay = base_overlay;
     s->above_base = above_base;
     s->backing_file_str = g_strdup(backing_file_str);
+    s->target_bs = bs;
     s->bs_read_only = bs_read_only;
     s->chain_frozen = true;
 
-- 
2.21.3



^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v14 13/13] block: apply COR-filter to block-stream jobs
  2020-12-04 22:07 [PATCH v14 00/13] Apply COR-filter to the block-stream permanently Vladimir Sementsov-Ogievskiy
                   ` (11 preceding siblings ...)
  2020-12-04 22:07 ` [PATCH v14 12/13] block/stream: add s->target_bs Vladimir Sementsov-Ogievskiy
@ 2020-12-04 22:07 ` Vladimir Sementsov-Ogievskiy
  2020-12-11 17:21   ` Max Reitz
  12 siblings, 1 reply; 37+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-12-04 22:07 UTC (permalink / raw)
  To: qemu-block
  Cc: qemu-devel, eblake, armbru, jsnow, fam, stefanha, mreitz, kwolf,
	den, andrey.shinkevich, vsementsov

From: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>

This patch completes the series with the COR-filter applied to
block-stream operations.

Adding the filter makes it possible in future implement discarding
copied regions in backing files during the block-stream job, to reduce
the disk overuse (we need control on permissions).

Also, the filter now is smart enough to do copy-on-read with specified
base, so we have benefit on guest reads even when doing block-stream of
the part of the backing chain.

Several iotests are slightly modified due to filter insertion.

Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/stream.c             | 78 ++++++++++++++++++++++++++------------
 tests/qemu-iotests/030     |  8 ++--
 tests/qemu-iotests/141.out |  2 +-
 tests/qemu-iotests/245     | 20 ++++++----
 4 files changed, 72 insertions(+), 36 deletions(-)

diff --git a/block/stream.c b/block/stream.c
index a7fd8945ad..b92f7de55b 100644
--- a/block/stream.c
+++ b/block/stream.c
@@ -18,8 +18,10 @@
 #include "qapi/error.h"
 #include "qapi/qmp/qerror.h"
 #include "qemu/error-report.h"
+#include "qapi/qmp/qdict.h"
 #include "qemu/ratelimit.h"
 #include "sysemu/block-backend.h"
+#include "block/copy-on-read.h"
 
 enum {
     /*
@@ -34,6 +36,7 @@ typedef struct StreamBlockJob {
     BlockJob common;
     BlockDriverState *base_overlay; /* COW overlay (stream from this) */
     BlockDriverState *above_base;   /* Node directly above the base */
+    BlockDriverState *cor_filter_bs;
     BlockDriverState *target_bs;
     BlockdevOnError on_error;
     char *backing_file_str;
@@ -46,8 +49,7 @@ static int coroutine_fn stream_populate(BlockBackend *blk,
 {
     assert(bytes < SIZE_MAX);
 
-    return blk_co_preadv(blk, offset, bytes, NULL,
-                         BDRV_REQ_COPY_ON_READ | BDRV_REQ_PREFETCH);
+    return blk_co_preadv(blk, offset, bytes, NULL, BDRV_REQ_PREFETCH);
 }
 
 static void stream_abort(Job *job)
@@ -55,7 +57,7 @@ static void stream_abort(Job *job)
     StreamBlockJob *s = container_of(job, StreamBlockJob, common.job);
 
     if (s->chain_frozen) {
-        bdrv_unfreeze_backing_chain(s->target_bs, s->above_base);
+        bdrv_unfreeze_backing_chain(s->cor_filter_bs, s->above_base);
     }
 }
 
@@ -69,7 +71,7 @@ static int stream_prepare(Job *job)
     Error *local_err = NULL;
     int ret = 0;
 
-    bdrv_unfreeze_backing_chain(s->target_bs, s->above_base);
+    bdrv_unfreeze_backing_chain(s->cor_filter_bs, s->above_base);
     s->chain_frozen = false;
 
     if (bdrv_cow_child(unfiltered_bs)) {
@@ -117,6 +119,8 @@ static void stream_clean(Job *job)
         bdrv_reopen_set_read_only(s->target_bs, true, NULL);
     }
 
+    bdrv_cor_filter_drop(s->cor_filter_bs);
+
     g_free(s->backing_file_str);
 }
 
@@ -125,7 +129,6 @@ static int coroutine_fn stream_run(Job *job, Error **errp)
     StreamBlockJob *s = container_of(job, StreamBlockJob, common.job);
     BlockBackend *blk = s->common.blk;
     BlockDriverState *unfiltered_bs = bdrv_skip_filters(s->target_bs);
-    bool enable_cor = !bdrv_cow_child(s->base_overlay);
     int64_t len;
     int64_t offset = 0;
     uint64_t delay_ns = 0;
@@ -143,15 +146,6 @@ static int coroutine_fn stream_run(Job *job, Error **errp)
     }
     job_progress_set_remaining(&s->common.job, len);
 
-    /* Turn on copy-on-read for the whole block device so that guest read
-     * requests help us make progress.  Only do this when copying the entire
-     * backing chain since the copy-on-read operation does not take base into
-     * account.
-     */
-    if (enable_cor) {
-        bdrv_enable_copy_on_read(s->target_bs);
-    }
-
     for ( ; offset < len; offset += n) {
         bool copy;
         int ret;
@@ -210,10 +204,6 @@ static int coroutine_fn stream_run(Job *job, Error **errp)
         }
     }
 
-    if (enable_cor) {
-        bdrv_disable_copy_on_read(s->target_bs);
-    }
-
     /* Do not remove the backing file if an error was there but ignored. */
     return error;
 }
@@ -244,7 +234,9 @@ void stream_start(const char *job_id, BlockDriverState *bs,
     bool bs_read_only;
     int basic_flags = BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE_UNCHANGED;
     BlockDriverState *base_overlay;
+    BlockDriverState *cor_filter_bs = NULL;
     BlockDriverState *above_base;
+    QDict *opts;
 
     assert(!(base && bottom));
     assert(!(backing_file_str && bottom));
@@ -295,17 +287,49 @@ void stream_start(const char *job_id, BlockDriverState *bs,
         }
     }
 
-    /* Prevent concurrent jobs trying to modify the graph structure here, we
-     * already have our own plans. Also don't allow resize as the image size is
-     * queried only at the job start and then cached. */
-    s = block_job_create(job_id, &stream_job_driver, NULL, bs,
-                         basic_flags | BLK_PERM_GRAPH_MOD,
+    opts = qdict_new();
+
+    qdict_put_str(opts, "driver", "copy-on-read");
+    qdict_put_str(opts, "file", bdrv_get_node_name(bs));
+    /* Pass the base_overlay node name as 'bottom' to COR driver */
+    qdict_put_str(opts, "bottom", base_overlay->node_name);
+    if (filter_node_name) {
+        qdict_put_str(opts, "node-name", filter_node_name);
+    }
+
+    cor_filter_bs = bdrv_insert_node(bs, opts, BDRV_O_RDWR, errp);
+    if (cor_filter_bs == NULL) {
+        goto fail;
+    }
+
+    if (!filter_node_name) {
+        cor_filter_bs->implicit = true;
+    }
+
+    if (bdrv_freeze_backing_chain(cor_filter_bs, bs, errp) < 0) {
+        bdrv_cor_filter_drop(cor_filter_bs);
+        cor_filter_bs = NULL;
+        goto fail;
+    }
+
+    s = block_job_create(job_id, &stream_job_driver, NULL, cor_filter_bs,
+                         BLK_PERM_CONSISTENT_READ,
                          basic_flags | BLK_PERM_WRITE,
                          speed, creation_flags, NULL, NULL, errp);
     if (!s) {
         goto fail;
     }
 
+    /*
+     * Prevent concurrent jobs trying to modify the graph structure here, we
+     * already have our own plans. Also don't allow resize as the image size is
+     * queried only at the job start and then cached.
+     */
+    if (block_job_add_bdrv(&s->common, "active node", bs, 0,
+                           basic_flags | BLK_PERM_WRITE, &error_abort)) {
+        goto fail;
+    }
+
     /* Block all intermediate nodes between bs and base, because they will
      * disappear from the chain after this operation. The streaming job reads
      * every block only once, assuming that it doesn't change, so forbid writes
@@ -326,6 +350,7 @@ void stream_start(const char *job_id, BlockDriverState *bs,
     s->base_overlay = base_overlay;
     s->above_base = above_base;
     s->backing_file_str = g_strdup(backing_file_str);
+    s->cor_filter_bs = cor_filter_bs;
     s->target_bs = bs;
     s->bs_read_only = bs_read_only;
     s->chain_frozen = true;
@@ -339,5 +364,10 @@ fail:
     if (bs_read_only) {
         bdrv_reopen_set_read_only(bs, true, NULL);
     }
-    bdrv_unfreeze_backing_chain(bs, above_base);
+    if (cor_filter_bs) {
+        bdrv_unfreeze_backing_chain(cor_filter_bs, above_base);
+        bdrv_cor_filter_drop(cor_filter_bs);
+    } else {
+        bdrv_unfreeze_backing_chain(bs, above_base);
+    }
 }
diff --git a/tests/qemu-iotests/030 b/tests/qemu-iotests/030
index bd8cf9cff7..c576d55d07 100755
--- a/tests/qemu-iotests/030
+++ b/tests/qemu-iotests/030
@@ -278,12 +278,14 @@ class TestParallelOps(iotests.QMPTestCase):
         self.assert_no_active_block_jobs()
 
         # Set a speed limit to make sure that this job blocks the rest
-        result = self.vm.qmp('block-stream', device='node4', job_id='stream-node4', base=self.imgs[1], speed=1024*1024)
+        result = self.vm.qmp('block-stream', device='node4',
+                             job_id='stream-node4', base=self.imgs[1],
+                             filter_node_name='stream-filter', speed=1024*1024)
         self.assert_qmp(result, 'return', {})
 
         result = self.vm.qmp('block-stream', device='node5', job_id='stream-node5', base=self.imgs[2])
         self.assert_qmp(result, 'error/desc',
-            "Node 'node4' is busy: block device is in use by block job: stream")
+            "Node 'stream-filter' is busy: block device is in use by block job: stream")
 
         result = self.vm.qmp('block-stream', device='node3', job_id='stream-node3', base=self.imgs[2])
         self.assert_qmp(result, 'error/desc',
@@ -296,7 +298,7 @@ class TestParallelOps(iotests.QMPTestCase):
         # block-commit should also fail if it touches nodes used by the stream job
         result = self.vm.qmp('block-commit', device='drive0', base=self.imgs[4], job_id='commit-node4')
         self.assert_qmp(result, 'error/desc',
-            "Node 'node4' is busy: block device is in use by block job: stream")
+            "Node 'stream-filter' is busy: block device is in use by block job: stream")
 
         result = self.vm.qmp('block-commit', device='drive0', base=self.imgs[1], top=self.imgs[3], job_id='commit-node1')
         self.assert_qmp(result, 'error/desc',
diff --git a/tests/qemu-iotests/141.out b/tests/qemu-iotests/141.out
index 08e0aecd65..028a16f365 100644
--- a/tests/qemu-iotests/141.out
+++ b/tests/qemu-iotests/141.out
@@ -99,7 +99,7 @@ wrote 1048576/1048576 bytes at offset 0
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "job0"}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "job0"}}
 {'execute': 'blockdev-del', 'arguments': {'node-name': 'drv0'}}
-{"error": {"class": "GenericError", "desc": "Node drv0 is in use"}}
+{"error": {"class": "GenericError", "desc": "Node 'drv0' is busy: block device is in use by block job: stream"}}
 {'execute': 'block-job-cancel', 'arguments': {'device': 'job0'}}
 {"return": {}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "aborting", "id": "job0"}}
diff --git a/tests/qemu-iotests/245 b/tests/qemu-iotests/245
index e60c8326d3..432e837e6c 100755
--- a/tests/qemu-iotests/245
+++ b/tests/qemu-iotests/245
@@ -892,20 +892,24 @@ class TestBlockdevReopen(iotests.QMPTestCase):
 
         # hd1 <- hd0
         result = self.vm.qmp('block-stream', conv_keys = True, job_id = 'stream0',
-                             device = 'hd1', auto_finalize = False)
+                             device = 'hd1', filter_node_name='cor',
+                             auto_finalize = False)
         self.assert_qmp(result, 'return', {})
 
-        # We can't reopen with the original options because that would
-        # make hd1 read-only and block-stream requires it to be read-write
-        # (Which error message appears depends on whether the stream job is
-        # already done with copying at this point.)
+        # We can't reopen with the original options because there is a filter
+        # inserted by stream job above hd1.
         self.reopen(opts, {},
-            ["Can't set node 'hd1' to r/o with copy-on-read enabled",
-             "Cannot make block node read-only, there is a writer on it"])
+                    "Cannot change the option 'backing.backing.file.node-name'")
+
+        # We can't reopen hd1 to read-only, as block-stream requires it to be
+        # read-write
+        self.reopen(opts['backing'], {'read-only': True},
+                    "Cannot make block node read-only, there is a writer on it")
 
         # We can't remove hd2 while the stream job is ongoing
         opts['backing']['backing'] = None
-        self.reopen(opts, {'backing.read-only': False}, "Cannot change 'backing' link from 'hd1' to 'hd2'")
+        self.reopen(opts['backing'], {'read-only': False},
+                    "Cannot change 'backing' link from 'hd1' to 'hd2'")
 
         # We can detach hd1 from hd0 because it doesn't affect the stream job
         opts['backing'] = None
-- 
2.21.3



^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [PATCH v14 02/13] block: add API function to insert a node
  2020-12-04 22:07 ` [PATCH v14 02/13] block: add API function to insert a node Vladimir Sementsov-Ogievskiy
@ 2020-12-10 17:33   ` Max Reitz
  0 siblings, 0 replies; 37+ messages in thread
From: Max Reitz @ 2020-12-10 17:33 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: fam, kwolf, qemu-devel, armbru, stefanha, andrey.shinkevich, den, jsnow

On 04.12.20 23:07, Vladimir Sementsov-Ogievskiy wrote:
> From: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
> 
> Provide API for insertion a node to backing chain.
> 
> Suggested-by: Max Reitz <mreitz@redhat.com>
> Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>   include/block/block.h |  2 ++
>   block.c               | 25 +++++++++++++++++++++++++
>   2 files changed, 27 insertions(+)

[...]

> diff --git a/block.c b/block.c
> index f1cedac362..b71c39f3e6 100644
> --- a/block.c
> +++ b/block.c
> @@ -4698,6 +4698,31 @@ static void bdrv_delete(BlockDriverState *bs)
>       g_free(bs);
>   }
>   
> +BlockDriverState *bdrv_insert_node(BlockDriverState *bs, QDict *node_options,
> +                                   int flags, Error **errp)
> +{
> +    BlockDriverState *new_node_bs;
> +    Error *local_err = NULL;
> +
> +    new_node_bs =  bdrv_open(NULL, NULL, node_options, flags, errp);

s/ =  / = /

With that done:

Reviewed-by: Max Reitz <mreitz@redhat.com>



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v14 03/13] copy-on-read: add filter drop function
  2020-12-04 22:07 ` [PATCH v14 03/13] copy-on-read: add filter drop function Vladimir Sementsov-Ogievskiy
@ 2020-12-10 17:34   ` Max Reitz
  0 siblings, 0 replies; 37+ messages in thread
From: Max Reitz @ 2020-12-10 17:34 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: fam, kwolf, qemu-devel, armbru, stefanha, andrey.shinkevich, den, jsnow

On 04.12.20 23:07, Vladimir Sementsov-Ogievskiy wrote:
> From: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
> 
> Provide API for the COR-filter removal. Also, drop the filter child
> permissions for an inactive state when the filter node is being
> removed.
> To insert the filter, the block generic layer function
> bdrv_insert_node() can be used.
> The new function bdrv_cor_filter_drop() may be considered as an
> intermediate solution before the QEMU permission update system has
> overhauled. Then we are able to implement the API function
> bdrv_remove_node() on the block generic layer.
> 
> Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>   block/copy-on-read.h | 32 +++++++++++++++++++++++++
>   block/copy-on-read.c | 56 ++++++++++++++++++++++++++++++++++++++++++++
>   2 files changed, 88 insertions(+)
>   create mode 100644 block/copy-on-read.h

Reviewed-by: Max Reitz <mreitz@redhat.com>



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v14 04/13] qapi: add filter-node-name to block-stream
  2020-12-04 22:07 ` [PATCH v14 04/13] qapi: add filter-node-name to block-stream Vladimir Sementsov-Ogievskiy
@ 2020-12-10 17:37   ` Max Reitz
  0 siblings, 0 replies; 37+ messages in thread
From: Max Reitz @ 2020-12-10 17:37 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: fam, kwolf, qemu-devel, armbru, stefanha, andrey.shinkevich, den, jsnow

On 04.12.20 23:07, Vladimir Sementsov-Ogievskiy wrote:
> From: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
> 
> Provide the possibility to pass the 'filter-node-name' parameter to the
> block-stream job as it is done for the commit block job.
> 
> Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>   qapi/block-core.json           | 6 ++++++
>   include/block/block_int.h      | 7 ++++++-
>   block/monitor/block-hmp-cmds.c | 4 ++--
>   block/stream.c                 | 4 +++-
>   blockdev.c                     | 4 +++-
>   5 files changed, 20 insertions(+), 5 deletions(-)
> 
> diff --git a/qapi/block-core.json b/qapi/block-core.json
> index 04ad80bc1e..8ef3df6767 100644
> --- a/qapi/block-core.json
> +++ b/qapi/block-core.json
> @@ -2543,6 +2543,11 @@
>   #            'stop' and 'enospc' can only be used if the block device
>   #            supports io-status (see BlockInfo).  Since 1.3.
>   #
> +# @filter-node-name: the node name that should be assigned to the
> +#                    filter driver that the stream job inserts into the graph
> +#                    above @device. If this option is not given, a node name is
> +#                    autogenerated. (Since: 5.2)

*6.0

> +#
>   # @auto-finalize: When false, this job will wait in a PENDING state after it has
>   #                 finished its work, waiting for @block-job-finalize before
>   #                 making any block graph changes.

[...]

> diff --git a/include/block/block_int.h b/include/block/block_int.h
> index 95d9333be1..c05fa1eb6b 100644
> --- a/include/block/block_int.h
> +++ b/include/block/block_int.h
> @@ -1134,6 +1134,9 @@ int is_windows_drive(const char *filename);
>    *                  See @BlockJobCreateFlags
>    * @speed: The maximum speed, in bytes per second, or 0 for unlimited.
>    * @on_error: The action to take upon error.
> + * @filter_node_name: The node name that should be assigned to the filter
> + * driver that the commit job inserts into the graph above @bs. NULL means
> + * that a node name should be autogenerated.

Personally, I find such descriptions to be more easily readable (or 
rather more easily visually separable from other parameters) if they’re 
indented.  I understand that two other parameters’ descriptions aren’t 
indented either (but one is), so in the end it’s your choice.  (But I 
thought a little nudging couldn’t hurt.)

So either way (with *6.0):

Reviewed-by: Max Reitz <mreitz@redhat.com>



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v14 05/13] qapi: create BlockdevOptionsCor structure for COR driver
  2020-12-04 22:07 ` [PATCH v14 05/13] qapi: create BlockdevOptionsCor structure for COR driver Vladimir Sementsov-Ogievskiy
@ 2020-12-10 17:43   ` Max Reitz
  2020-12-10 18:30     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 37+ messages in thread
From: Max Reitz @ 2020-12-10 17:43 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: fam, kwolf, qemu-devel, armbru, stefanha, andrey.shinkevich, den, jsnow

I don’t like this patch’s subject very much, because I find the 
implementation of the @bottom option to be more noteworthy than the 
addition of the QAPI structure.


On 04.12.20 23:07, Vladimir Sementsov-Ogievskiy wrote:
> From: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
> 
> Create the BlockdevOptionsCor structure for COR driver specific options
> splitting it off form the BlockdevOptionsGenericFormat. The only option
> 'bottom' node in the structure denotes an image file that limits the
> COR operations in the backing chain.
> We are going to use the COR-filter for a block-stream job and will pass
> a bottom node name to the COR driver. The bottom node is the first
> non-filter overlay of the base. It was introduced because the base node
> itself may change due to possible concurrent jobs.
> 
> Suggested-by: Max Reitz <mreitz@redhat.com>
> Suggested-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
>    [vsementsov: fix bdrv_is_allocated_above() usage]
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>   qapi/block-core.json | 21 +++++++++++++++-
>   block/copy-on-read.c | 57 ++++++++++++++++++++++++++++++++++++++++++--
>   2 files changed, 75 insertions(+), 3 deletions(-)
> 
> diff --git a/qapi/block-core.json b/qapi/block-core.json
> index 8ef3df6767..04055ef50c 100644
> --- a/qapi/block-core.json
> +++ b/qapi/block-core.json
> @@ -3942,6 +3942,25 @@
>     'data': { 'throttle-group': 'str',
>               'file' : 'BlockdevRef'
>                } }
> +
> +##
> +# @BlockdevOptionsCor:
> +#
> +# Driver specific block device options for the copy-on-read driver.
> +#
> +# @bottom: the name of a non-filter node (allocation-bearing layer) that limits
> +#          the COR operations in the backing chain (inclusive).

This seems to me like something’s missing.  Perhaps technically there 
isn’t, but “limits the COR operations” begs the question (to me) “Limits 
them in what way?” (to which the answer is: No data below @bottom is 
copied).

Could you make it more verbose?  Perhaps something like “The name of a 
non-filter node (allocation-bearing layer) that limits the COR 
operations in the backing chain (inclusive), so that no data below this 
node will be copied by this filter”?

> +#          For the block-stream job, it will be the first non-filter overlay of
> +#          the base node. We do not involve the base node into the COR
> +#          operations because the base may change due to a concurrent
> +#          block-commit job on the same backing chain.

I think the default behavior should be mentioned here somewhere, i.e. 
that no limit is applied, so that data from all backing layers may be 
copied.

> +#
> +# Since: 5.2

*6.0

> +##
> +{ 'struct': 'BlockdevOptionsCor',
> +  'base': 'BlockdevOptionsGenericFormat',
> +  'data': { '*bottom': 'str' } }
> +
>   ##
>   # @BlockdevOptions:
>   #

[...]

> diff --git a/block/copy-on-read.c b/block/copy-on-read.c
> index 618c4c4f43..67f61983c0 100644
> --- a/block/copy-on-read.c
> +++ b/block/copy-on-read.c

[...]

> @@ -51,7 +56,17 @@ static int cor_open(BlockDriverState *bs, QDict *options, int flags,
>           ((BDRV_REQ_FUA | BDRV_REQ_MAY_UNMAP | BDRV_REQ_NO_FALLBACK) &
>               bs->file->bs->supported_zero_flags);
>   
> +    if (bottom_node) {
> +        bottom_bs = bdrv_lookup_bs(NULL, bottom_node, errp);
> +        if (!bottom_bs) {
> +            error_setg(errp, "Bottom node '%s' not found", bottom_node);
> +            qdict_del(options, "bottom");
> +            return -EINVAL;
> +        }

Should we verify that bottom_bs is not a filter, as required by the schema?

Max



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v14 05/13] qapi: create BlockdevOptionsCor structure for COR driver
  2020-12-10 17:43   ` Max Reitz
@ 2020-12-10 18:30     ` Vladimir Sementsov-Ogievskiy
  2020-12-11  8:54       ` Max Reitz
  0 siblings, 1 reply; 37+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-12-10 18:30 UTC (permalink / raw)
  To: Max Reitz, qemu-block
  Cc: fam, kwolf, qemu-devel, armbru, stefanha, andrey.shinkevich, den, jsnow

10.12.2020 20:43, Max Reitz wrote:
> I don’t like this patch’s subject very much, because I find the implementation of the @bottom option to be more noteworthy than the addition of the QAPI structure.
> 
> 
> On 04.12.20 23:07, Vladimir Sementsov-Ogievskiy wrote:
>> From: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
>>
>> Create the BlockdevOptionsCor structure for COR driver specific options
>> splitting it off form the BlockdevOptionsGenericFormat. The only option
>> 'bottom' node in the structure denotes an image file that limits the
>> COR operations in the backing chain.
>> We are going to use the COR-filter for a block-stream job and will pass
>> a bottom node name to the COR driver. The bottom node is the first
>> non-filter overlay of the base. It was introduced because the base node
>> itself may change due to possible concurrent jobs.
>>
>> Suggested-by: Max Reitz <mreitz@redhat.com>
>> Suggested-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
>>    [vsementsov: fix bdrv_is_allocated_above() usage]
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> ---
>>   qapi/block-core.json | 21 +++++++++++++++-
>>   block/copy-on-read.c | 57 ++++++++++++++++++++++++++++++++++++++++++--
>>   2 files changed, 75 insertions(+), 3 deletions(-)
>>
>> diff --git a/qapi/block-core.json b/qapi/block-core.json
>> index 8ef3df6767..04055ef50c 100644
>> --- a/qapi/block-core.json
>> +++ b/qapi/block-core.json
>> @@ -3942,6 +3942,25 @@
>>     'data': { 'throttle-group': 'str',
>>               'file' : 'BlockdevRef'
>>                } }
>> +
>> +##
>> +# @BlockdevOptionsCor:
>> +#
>> +# Driver specific block device options for the copy-on-read driver.
>> +#
>> +# @bottom: the name of a non-filter node (allocation-bearing layer) that limits
>> +#          the COR operations in the backing chain (inclusive).
> 
> This seems to me like something’s missing.  Perhaps technically there isn’t, but “limits the COR operations” begs the question (to me) “Limits them in what way?” (to which the answer is: No data below @bottom is copied).
> 
> Could you make it more verbose?  Perhaps something like “The name of a non-filter node (allocation-bearing layer) that limits the COR operations in the backing chain (inclusive), so that no data below this node will be copied by this filter”?

Sounds good for me.

> 
>> +#          For the block-stream job, it will be the first non-filter overlay of
>> +#          the base node. We do not involve the base node into the COR
>> +#          operations because the base may change due to a concurrent
>> +#          block-commit job on the same backing chain.
> 

I now see that paragraph conflicts with further introduce of "bottom" for stream job itself. I think it may be safely dropped. It's a wrong place to describe how block-stream works.

> I think the default behavior should be mentioned here somewhere, i.e. that no limit is applied, so that data from all backing layers may be copied.

agree

> 
>> +#
>> +# Since: 5.2
> 
> *6.0
> 
>> +##
>> +{ 'struct': 'BlockdevOptionsCor',
>> +  'base': 'BlockdevOptionsGenericFormat',
>> +  'data': { '*bottom': 'str' } }
>> +
>>   ##
>>   # @BlockdevOptions:
>>   #
> 
> [...]
> 
>> diff --git a/block/copy-on-read.c b/block/copy-on-read.c
>> index 618c4c4f43..67f61983c0 100644
>> --- a/block/copy-on-read.c
>> +++ b/block/copy-on-read.c
> 
> [...]
> 
>> @@ -51,7 +56,17 @@ static int cor_open(BlockDriverState *bs, QDict *options, int flags,
>>           ((BDRV_REQ_FUA | BDRV_REQ_MAY_UNMAP | BDRV_REQ_NO_FALLBACK) &
>>               bs->file->bs->supported_zero_flags);
>> +    if (bottom_node) {
>> +        bottom_bs = bdrv_lookup_bs(NULL, bottom_node, errp);
>> +        if (!bottom_bs) {
>> +            error_setg(errp, "Bottom node '%s' not found", bottom_node);
>> +            qdict_del(options, "bottom");
>> +            return -EINVAL;
>> +        }
> 
> Should we verify that bottom_bs is not a filter, as required by the schema?
> 

yes, thanks for the catch!


Hmm.. Interesting, we don't freeze the backing chain in cor filter open. And I think we shouldn't. But then, bottom node may disappear. We should handle it without a crash.

I suggest:

1. document, that if bottom node disappear from the backing chain of the filter, it continues to work like without any specified "bottom" node

2. do bdrv_ref/bdrv_unref of bottom_bs, to not work with dead pointer

3. check in cor_co_preadv_part() is bottom_bs is still in backing chain or not

Haha, bottom node may return into backing chain at some moment and we can continue to handle it:)

-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v14 05/13] qapi: create BlockdevOptionsCor structure for COR driver
  2020-12-10 18:30     ` Vladimir Sementsov-Ogievskiy
@ 2020-12-11  8:54       ` Max Reitz
  2020-12-11 12:32         ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 37+ messages in thread
From: Max Reitz @ 2020-12-11  8:54 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: fam, kwolf, qemu-devel, armbru, stefanha, andrey.shinkevich, den, jsnow

On 10.12.20 19:30, Vladimir Sementsov-Ogievskiy wrote:
> 10.12.2020 20:43, Max Reitz wrote:
>> I don’t like this patch’s subject very much, because I find the 
>> implementation of the @bottom option to be more noteworthy than the 
>> addition of the QAPI structure.
>>
>>
>> On 04.12.20 23:07, Vladimir Sementsov-Ogievskiy wrote:
>>> From: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
>>>
>>> Create the BlockdevOptionsCor structure for COR driver specific options
>>> splitting it off form the BlockdevOptionsGenericFormat. The only option
>>> 'bottom' node in the structure denotes an image file that limits the
>>> COR operations in the backing chain.
>>> We are going to use the COR-filter for a block-stream job and will pass
>>> a bottom node name to the COR driver. The bottom node is the first
>>> non-filter overlay of the base. It was introduced because the base node
>>> itself may change due to possible concurrent jobs.
>>>
>>> Suggested-by: Max Reitz <mreitz@redhat.com>
>>> Suggested-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>> Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
>>>    [vsementsov: fix bdrv_is_allocated_above() usage]
>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>> ---
>>>   qapi/block-core.json | 21 +++++++++++++++-
>>>   block/copy-on-read.c | 57 ++++++++++++++++++++++++++++++++++++++++++--
>>>   2 files changed, 75 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/qapi/block-core.json b/qapi/block-core.json
>>> index 8ef3df6767..04055ef50c 100644
>>> --- a/qapi/block-core.json
>>> +++ b/qapi/block-core.json
>>> @@ -3942,6 +3942,25 @@
>>>     'data': { 'throttle-group': 'str',
>>>               'file' : 'BlockdevRef'
>>>                } }
>>> +
>>> +##
>>> +# @BlockdevOptionsCor:
>>> +#
>>> +# Driver specific block device options for the copy-on-read driver.
>>> +#
>>> +# @bottom: the name of a non-filter node (allocation-bearing layer) 
>>> that limits
>>> +#          the COR operations in the backing chain (inclusive).
>>
>> This seems to me like something’s missing.  Perhaps technically there 
>> isn’t, but “limits the COR operations” begs the question (to me) 
>> “Limits them in what way?” (to which the answer is: No data below 
>> @bottom is copied).
>>
>> Could you make it more verbose?  Perhaps something like “The name of a 
>> non-filter node (allocation-bearing layer) that limits the COR 
>> operations in the backing chain (inclusive), so that no data below 
>> this node will be copied by this filter”?
> 
> Sounds good for me.
> 
>>
>>> +#          For the block-stream job, it will be the first non-filter 
>>> overlay of
>>> +#          the base node. We do not involve the base node into the COR
>>> +#          operations because the base may change due to a concurrent
>>> +#          block-commit job on the same backing chain.
>>
> 
> I now see that paragraph conflicts with further introduce of "bottom" 
> for stream job itself. I think it may be safely dropped. It's a wrong 
> place to describe how block-stream works.
> 
>> I think the default behavior should be mentioned here somewhere, i.e. 
>> that no limit is applied, so that data from all backing layers may be 
>> copied.
> 
> agree
> 
>>
>>> +#
>>> +# Since: 5.2
>>
>> *6.0
>>
>>> +##
>>> +{ 'struct': 'BlockdevOptionsCor',
>>> +  'base': 'BlockdevOptionsGenericFormat',
>>> +  'data': { '*bottom': 'str' } }
>>> +
>>>   ##
>>>   # @BlockdevOptions:
>>>   #
>>
>> [...]
>>
>>> diff --git a/block/copy-on-read.c b/block/copy-on-read.c
>>> index 618c4c4f43..67f61983c0 100644
>>> --- a/block/copy-on-read.c
>>> +++ b/block/copy-on-read.c
>>
>> [...]
>>
>>> @@ -51,7 +56,17 @@ static int cor_open(BlockDriverState *bs, QDict 
>>> *options, int flags,
>>>           ((BDRV_REQ_FUA | BDRV_REQ_MAY_UNMAP | BDRV_REQ_NO_FALLBACK) &
>>>               bs->file->bs->supported_zero_flags);
>>> +    if (bottom_node) {
>>> +        bottom_bs = bdrv_lookup_bs(NULL, bottom_node, errp);
>>> +        if (!bottom_bs) {
>>> +            error_setg(errp, "Bottom node '%s' not found", 
>>> bottom_node);
>>> +            qdict_del(options, "bottom");
>>> +            return -EINVAL;
>>> +        }
>>
>> Should we verify that bottom_bs is not a filter, as required by the 
>> schema?
>>
> 
> yes, thanks for the catch!
> 
> 
> Hmm.. Interesting, we don't freeze the backing chain in cor filter open. 
> And I think we shouldn't. But then, bottom node may disappear. We should 
> handle it without a crash.
> 
> I suggest:
> 
> 1. document, that if bottom node disappear from the backing chain of the 
> filter, it continues to work like without any specified "bottom" node
> 
> 2. do bdrv_ref/bdrv_unref of bottom_bs, to not work with dead pointer
> 
> 3. check in cor_co_preadv_part() is bottom_bs is still in backing chain 
> or not

Hm, right.

Alternatively, we could also freeze the chain until the bottom node and 
then allow changing the @bottom node through reopen.  Then it would have 
to be manually unset before the bottom node is allowed to disappear from 
the chain.

Would freezing the chain pose a problem?

Max



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v14 05/13] qapi: create BlockdevOptionsCor structure for COR driver
  2020-12-11  8:54       ` Max Reitz
@ 2020-12-11 12:32         ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 37+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-12-11 12:32 UTC (permalink / raw)
  To: Max Reitz, qemu-block
  Cc: fam, kwolf, qemu-devel, armbru, stefanha, andrey.shinkevich, den, jsnow

11.12.2020 11:54, Max Reitz wrote:
> On 10.12.20 19:30, Vladimir Sementsov-Ogievskiy wrote:
>> 10.12.2020 20:43, Max Reitz wrote:
>>> I don’t like this patch’s subject very much, because I find the implementation of the @bottom option to be more noteworthy than the addition of the QAPI structure.
>>>
>>>
>>> On 04.12.20 23:07, Vladimir Sementsov-Ogievskiy wrote:
>>>> From: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
>>>>
>>>> Create the BlockdevOptionsCor structure for COR driver specific options
>>>> splitting it off form the BlockdevOptionsGenericFormat. The only option
>>>> 'bottom' node in the structure denotes an image file that limits the
>>>> COR operations in the backing chain.
>>>> We are going to use the COR-filter for a block-stream job and will pass
>>>> a bottom node name to the COR driver. The bottom node is the first
>>>> non-filter overlay of the base. It was introduced because the base node
>>>> itself may change due to possible concurrent jobs.
>>>>
>>>> Suggested-by: Max Reitz <mreitz@redhat.com>
>>>> Suggested-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>>> Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
>>>>    [vsementsov: fix bdrv_is_allocated_above() usage]
>>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>>> ---
>>>>   qapi/block-core.json | 21 +++++++++++++++-
>>>>   block/copy-on-read.c | 57 ++++++++++++++++++++++++++++++++++++++++++--
>>>>   2 files changed, 75 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/qapi/block-core.json b/qapi/block-core.json
>>>> index 8ef3df6767..04055ef50c 100644
>>>> --- a/qapi/block-core.json
>>>> +++ b/qapi/block-core.json
>>>> @@ -3942,6 +3942,25 @@
>>>>     'data': { 'throttle-group': 'str',
>>>>               'file' : 'BlockdevRef'
>>>>                } }
>>>> +
>>>> +##
>>>> +# @BlockdevOptionsCor:
>>>> +#
>>>> +# Driver specific block device options for the copy-on-read driver.
>>>> +#
>>>> +# @bottom: the name of a non-filter node (allocation-bearing layer) that limits
>>>> +#          the COR operations in the backing chain (inclusive).
>>>
>>> This seems to me like something’s missing.  Perhaps technically there isn’t, but “limits the COR operations” begs the question (to me) “Limits them in what way?” (to which the answer is: No data below @bottom is copied).
>>>
>>> Could you make it more verbose?  Perhaps something like “The name of a non-filter node (allocation-bearing layer) that limits the COR operations in the backing chain (inclusive), so that no data below this node will be copied by this filter”?
>>
>> Sounds good for me.
>>
>>>
>>>> +#          For the block-stream job, it will be the first non-filter overlay of
>>>> +#          the base node. We do not involve the base node into the COR
>>>> +#          operations because the base may change due to a concurrent
>>>> +#          block-commit job on the same backing chain.
>>>
>>
>> I now see that paragraph conflicts with further introduce of "bottom" for stream job itself. I think it may be safely dropped. It's a wrong place to describe how block-stream works.
>>
>>> I think the default behavior should be mentioned here somewhere, i.e. that no limit is applied, so that data from all backing layers may be copied.
>>
>> agree
>>
>>>
>>>> +#
>>>> +# Since: 5.2
>>>
>>> *6.0
>>>
>>>> +##
>>>> +{ 'struct': 'BlockdevOptionsCor',
>>>> +  'base': 'BlockdevOptionsGenericFormat',
>>>> +  'data': { '*bottom': 'str' } }
>>>> +
>>>>   ##
>>>>   # @BlockdevOptions:
>>>>   #
>>>
>>> [...]
>>>
>>>> diff --git a/block/copy-on-read.c b/block/copy-on-read.c
>>>> index 618c4c4f43..67f61983c0 100644
>>>> --- a/block/copy-on-read.c
>>>> +++ b/block/copy-on-read.c
>>>
>>> [...]
>>>
>>>> @@ -51,7 +56,17 @@ static int cor_open(BlockDriverState *bs, QDict *options, int flags,
>>>>           ((BDRV_REQ_FUA | BDRV_REQ_MAY_UNMAP | BDRV_REQ_NO_FALLBACK) &
>>>>               bs->file->bs->supported_zero_flags);
>>>> +    if (bottom_node) {
>>>> +        bottom_bs = bdrv_lookup_bs(NULL, bottom_node, errp);
>>>> +        if (!bottom_bs) {
>>>> +            error_setg(errp, "Bottom node '%s' not found", bottom_node);
>>>> +            qdict_del(options, "bottom");
>>>> +            return -EINVAL;
>>>> +        }
>>>
>>> Should we verify that bottom_bs is not a filter, as required by the schema?
>>>
>>
>> yes, thanks for the catch!
>>
>>
>> Hmm.. Interesting, we don't freeze the backing chain in cor filter open. And I think we shouldn't. But then, bottom node may disappear. We should handle it without a crash.
>>
>> I suggest:
>>
>> 1. document, that if bottom node disappear from the backing chain of the filter, it continues to work like without any specified "bottom" node
>>
>> 2. do bdrv_ref/bdrv_unref of bottom_bs, to not work with dead pointer
>>
>> 3. check in cor_co_preadv_part() is bottom_bs is still in backing chain or not
> 
> Hm, right.
> 
> Alternatively, we could also freeze the chain until the bottom node and then allow changing the @bottom node through reopen.  Then it would have to be manually unset before the bottom node is allowed to disappear from the chain.
> 
> Would freezing the chain pose a problem?
> 

Hmm. Then we'll just need not freeze it in block-stream, so freezing is done by filter.

It's just more restrictive.. But I can't imagine reasonable cases where user specify bottom node and than remove it. Forcing user to reopen the filter to change the bottom node may be more clean then "just don't care". OK, I think we can freeze the chain in the filter.


-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v14 06/13] iotests: add #310 to test bottom node in COR driver
  2020-12-04 22:07 ` [PATCH v14 06/13] iotests: add #310 to test bottom node in " Vladimir Sementsov-Ogievskiy
@ 2020-12-11 12:49   ` Max Reitz
  2020-12-11 13:10     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 37+ messages in thread
From: Max Reitz @ 2020-12-11 12:49 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: fam, kwolf, qemu-devel, armbru, stefanha, andrey.shinkevich, den, jsnow

On 04.12.20 23:07, Vladimir Sementsov-Ogievskiy wrote:
> From: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
> 
> The test case #310 is similar to #216 by Max Reitz. The difference is
> that the test #310 involves a bottom node to the COR filter driver.
> 
> Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>   tests/qemu-iotests/310     | 114 +++++++++++++++++++++++++++++++++++++
>   tests/qemu-iotests/310.out |  15 +++++
>   tests/qemu-iotests/group   |   1 +
>   3 files changed, 130 insertions(+)
>   create mode 100755 tests/qemu-iotests/310
>   create mode 100644 tests/qemu-iotests/310.out
> 
> diff --git a/tests/qemu-iotests/310 b/tests/qemu-iotests/310
> new file mode 100755
> index 0000000000..c8b34cd887
> --- /dev/null
> +++ b/tests/qemu-iotests/310
> @@ -0,0 +1,114 @@
> +#!/usr/bin/env python3
> +#
> +# Copy-on-read tests using a COR filter with a bottom node
> +#
> +# Copyright (C) 2018 Red Hat, Inc.
> +# Copyright (c) 2020 Virtuozzo International GmbH
> +#
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 2 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program.  If not, see <http://www.gnu.org/licenses/>.
> +#
> +
> +import iotests
> +from iotests import log, qemu_img, qemu_io_silent
> +
> +# Need backing file support
> +iotests.script_initialize(supported_fmts=['qcow2', 'qcow', 'qed', 'vmdk'],
> +                          supported_platforms=['linux'])
> +
> +log('')
> +log('=== Copy-on-read across nodes ===')
> +log('')
> +
> +# This test is similar to the 216 one by Max Reitz <mreitz@redhat.com>
> +# The difference is that this test case involves a bottom node to the
> +# COR filter driver.
> +
> +with iotests.FilePath('base.img') as base_img_path, \
> +     iotests.FilePath('mid.img') as mid_img_path, \
> +     iotests.FilePath('top.img') as top_img_path, \
> +     iotests.VM() as vm:
> +
> +    log('--- Setting up images ---')
> +    log('')
> +
> +    assert qemu_img('create', '-f', iotests.imgfmt, base_img_path, '64M') == 0
> +    assert qemu_io_silent(base_img_path, '-c', 'write -P 1 0M 1M') == 0
> +    assert qemu_io_silent(base_img_path, '-c', 'write -P 1 3M 1M') == 0
> +    assert qemu_img('create', '-f', iotests.imgfmt, '-b', base_img_path,
> +                    '-F', iotests.imgfmt, mid_img_path) == 0
> +    assert qemu_io_silent(mid_img_path,  '-c', 'write -P 3 2M 1M') == 0
> +    assert qemu_io_silent(mid_img_path,  '-c', 'write -P 3 4M 1M') == 0
> +    assert qemu_img('create', '-f', iotests.imgfmt, '-b', mid_img_path,
> +                    '-F', iotests.imgfmt, top_img_path) == 0
> +    assert qemu_io_silent(top_img_path,  '-c', 'write -P 2 1M 1M') == 0
> +
> +#      0 1 2 3 4
> +# top    2
> +# mid      3   3
> +# base 1     1
> +
> +    log('Done')
> +
> +    log('')
> +    log('--- Doing COR ---')
> +    log('')
> +
> +    vm.launch()
> +
> +    log(vm.qmp('blockdev-add',
> +               node_name='node0',
> +               driver='copy-on-read',
> +               bottom='node2',
> +               file={
> +                   'driver': iotests.imgfmt,
> +                   'file': {
> +                       'driver': 'file',
> +                       'filename': top_img_path
> +                   },
> +                   'backing': {
> +                       'node-name': 'node2',
> +                       'driver': iotests.imgfmt,
> +                       'file': {
> +                           'driver': 'file',
> +                           'filename': mid_img_path
> +                       },
> +                       'backing': {
> +                           'driver': iotests.imgfmt,
> +                           'file': {
> +                               'driver': 'file',
> +                               'filename': base_img_path
> +                           }
> +                       },
> +                   }
> +               }))
> +
> +    # Trigger COR
> +    log(vm.qmp('human-monitor-command',
> +               command_line='qemu-io node0 "read 0 5M"'))
> +
> +    vm.shutdown()
> +
> +    log('')
> +    log('--- Checking COR result ---')
> +    log('')
> +
> +    assert qemu_io_silent(base_img_path, '-c', 'discard 0 4M') == 0
> +    assert qemu_io_silent(mid_img_path, '-c', 'discard 0M 5M') == 0

The data discard leaves behind is undefined, so this may not result in 
zeroes.  (In fact, the test does fail for me with vmdk, qed, and qcow.) 
  'write -z' would work better, although perhaps you intentionally chose 
discard to just drop the data from the backing images.

In that case, you could also recreate the middle image, so it’s empty 
then – the only problem with that is that it’ll break VMDK because it 
stores this reference to its backing image, and if the backing image is 
changed, you’ll get EINVAL when falling back to it...

(The same goes for overwriting any data in the backing image, though, be 
it with discard, write -z, or write -P 0.  So I suppose VMDK just won’t 
work with this test.)

Max

> +    assert qemu_io_silent(top_img_path,  '-c', 'read -P 0 0 1M') == 0
> +    assert qemu_io_silent(top_img_path,  '-c', 'read -P 2 1M 1M') == 0
> +    assert qemu_io_silent(top_img_path,  '-c', 'read -P 3 2M 1M') == 0
> +    assert qemu_io_silent(top_img_path,  '-c', 'read -P 0 3M 1M') == 0
> +    assert qemu_io_silent(top_img_path,  '-c', 'read -P 3 4M 1M') == 0
> +
> +    log('Done')



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v14 06/13] iotests: add #310 to test bottom node in COR driver
  2020-12-11 12:49   ` Max Reitz
@ 2020-12-11 13:10     ` Vladimir Sementsov-Ogievskiy
  2020-12-11 13:24       ` Max Reitz
  0 siblings, 1 reply; 37+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-12-11 13:10 UTC (permalink / raw)
  To: Max Reitz, qemu-block
  Cc: fam, kwolf, qemu-devel, armbru, stefanha, andrey.shinkevich, den, jsnow

11.12.2020 15:49, Max Reitz wrote:
> On 04.12.20 23:07, Vladimir Sementsov-Ogievskiy wrote:
>> From: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
>>
>> The test case #310 is similar to #216 by Max Reitz. The difference is
>> that the test #310 involves a bottom node to the COR filter driver.
>>
>> Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
>> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> ---
>>   tests/qemu-iotests/310     | 114 +++++++++++++++++++++++++++++++++++++
>>   tests/qemu-iotests/310.out |  15 +++++
>>   tests/qemu-iotests/group   |   1 +
>>   3 files changed, 130 insertions(+)
>>   create mode 100755 tests/qemu-iotests/310
>>   create mode 100644 tests/qemu-iotests/310.out
>>
>> diff --git a/tests/qemu-iotests/310 b/tests/qemu-iotests/310
>> new file mode 100755
>> index 0000000000..c8b34cd887
>> --- /dev/null
>> +++ b/tests/qemu-iotests/310
>> @@ -0,0 +1,114 @@
>> +#!/usr/bin/env python3
>> +#
>> +# Copy-on-read tests using a COR filter with a bottom node
>> +#
>> +# Copyright (C) 2018 Red Hat, Inc.
>> +# Copyright (c) 2020 Virtuozzo International GmbH
>> +#
>> +# This program is free software; you can redistribute it and/or modify
>> +# it under the terms of the GNU General Public License as published by
>> +# the Free Software Foundation; either version 2 of the License, or
>> +# (at your option) any later version.
>> +#
>> +# This program is distributed in the hope that it will be useful,
>> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> +# GNU General Public License for more details.
>> +#
>> +# You should have received a copy of the GNU General Public License
>> +# along with this program.  If not, see <http://www.gnu.org/licenses/>.
>> +#
>> +
>> +import iotests
>> +from iotests import log, qemu_img, qemu_io_silent
>> +
>> +# Need backing file support
>> +iotests.script_initialize(supported_fmts=['qcow2', 'qcow', 'qed', 'vmdk'],
>> +                          supported_platforms=['linux'])
>> +
>> +log('')
>> +log('=== Copy-on-read across nodes ===')
>> +log('')
>> +
>> +# This test is similar to the 216 one by Max Reitz <mreitz@redhat.com>
>> +# The difference is that this test case involves a bottom node to the
>> +# COR filter driver.
>> +
>> +with iotests.FilePath('base.img') as base_img_path, \
>> +     iotests.FilePath('mid.img') as mid_img_path, \
>> +     iotests.FilePath('top.img') as top_img_path, \
>> +     iotests.VM() as vm:
>> +
>> +    log('--- Setting up images ---')
>> +    log('')
>> +
>> +    assert qemu_img('create', '-f', iotests.imgfmt, base_img_path, '64M') == 0
>> +    assert qemu_io_silent(base_img_path, '-c', 'write -P 1 0M 1M') == 0
>> +    assert qemu_io_silent(base_img_path, '-c', 'write -P 1 3M 1M') == 0
>> +    assert qemu_img('create', '-f', iotests.imgfmt, '-b', base_img_path,
>> +                    '-F', iotests.imgfmt, mid_img_path) == 0
>> +    assert qemu_io_silent(mid_img_path,  '-c', 'write -P 3 2M 1M') == 0
>> +    assert qemu_io_silent(mid_img_path,  '-c', 'write -P 3 4M 1M') == 0
>> +    assert qemu_img('create', '-f', iotests.imgfmt, '-b', mid_img_path,
>> +                    '-F', iotests.imgfmt, top_img_path) == 0
>> +    assert qemu_io_silent(top_img_path,  '-c', 'write -P 2 1M 1M') == 0
>> +
>> +#      0 1 2 3 4
>> +# top    2
>> +# mid      3   3
>> +# base 1     1
>> +
>> +    log('Done')
>> +
>> +    log('')
>> +    log('--- Doing COR ---')
>> +    log('')
>> +
>> +    vm.launch()
>> +
>> +    log(vm.qmp('blockdev-add',
>> +               node_name='node0',
>> +               driver='copy-on-read',
>> +               bottom='node2',
>> +               file={
>> +                   'driver': iotests.imgfmt,
>> +                   'file': {
>> +                       'driver': 'file',
>> +                       'filename': top_img_path
>> +                   },
>> +                   'backing': {
>> +                       'node-name': 'node2',
>> +                       'driver': iotests.imgfmt,
>> +                       'file': {
>> +                           'driver': 'file',
>> +                           'filename': mid_img_path
>> +                       },
>> +                       'backing': {
>> +                           'driver': iotests.imgfmt,
>> +                           'file': {
>> +                               'driver': 'file',
>> +                               'filename': base_img_path
>> +                           }
>> +                       },
>> +                   }
>> +               }))
>> +
>> +    # Trigger COR
>> +    log(vm.qmp('human-monitor-command',
>> +               command_line='qemu-io node0 "read 0 5M"'))
>> +
>> +    vm.shutdown()
>> +
>> +    log('')
>> +    log('--- Checking COR result ---')
>> +    log('')
>> +
>> +    assert qemu_io_silent(base_img_path, '-c', 'discard 0 4M') == 0
>> +    assert qemu_io_silent(mid_img_path, '-c', 'discard 0M 5M') == 0
> 
> The data discard leaves behind is undefined, so this may not result in zeroes.  (In fact, the test does fail for me with vmdk, qed, and qcow.)  'write -z' would work better, although perhaps you intentionally chose discard to just drop the data from the backing images.
> 
> In that case, you could also recreate the middle image, so it’s empty then – the only problem with that is that it’ll break VMDK because it stores this reference to its backing image, and if the backing image is changed, you’ll get EINVAL when falling back to it...
> 
> (The same goes for overwriting any data in the backing image, though, be it with discard, write -z, or write -P 0.  So I suppose VMDK just won’t work with this test.)
> 

I think the goal is just to be sure the following reads read from the top and check exactly that COR works.

So we can just use 'write -z'.. Or, we can changed the backing file of top_img to nothing instead. Can qemu-img do it?

> 
>> +    assert qemu_io_silent(top_img_path,  '-c', 'read -P 0 0 1M') == 0
>> +    assert qemu_io_silent(top_img_path,  '-c', 'read -P 2 1M 1M') == 0
>> +    assert qemu_io_silent(top_img_path,  '-c', 'read -P 3 2M 1M') == 0
>> +    assert qemu_io_silent(top_img_path,  '-c', 'read -P 0 3M 1M') == 0
>> +    assert qemu_io_silent(top_img_path,  '-c', 'read -P 3 4M 1M') == 0
>> +
>> +    log('Done')
> 


-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v14 07/13] block: include supported_read_flags into BDS structure
  2020-12-04 22:07 ` [PATCH v14 07/13] block: include supported_read_flags into BDS structure Vladimir Sementsov-Ogievskiy
@ 2020-12-11 13:20   ` Max Reitz
  2020-12-11 13:31     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 37+ messages in thread
From: Max Reitz @ 2020-12-11 13:20 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: fam, kwolf, qemu-devel, armbru, stefanha, andrey.shinkevich, den, jsnow

On 04.12.20 23:07, Vladimir Sementsov-Ogievskiy wrote:
> From: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
> 
> Add the new member supported_read_flags to the BlockDriverState
> structure. It will control the flags set for copy-on-read operations.
> Make the block generic layer evaluate supported read flags before they
> go to a block driver.
> 
> Suggested-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>   include/block/block_int.h |  4 ++++
>   block/io.c                | 12 ++++++++++--
>   2 files changed, 14 insertions(+), 2 deletions(-)

[...]

> diff --git a/block/io.c b/block/io.c
> index ec5e152bb7..e28b11c42b 100644
> --- a/block/io.c
> +++ b/block/io.c

[...]

> @@ -1426,9 +1429,13 @@ static int coroutine_fn bdrv_aligned_preadv(BdrvChild *child,
>           goto out;
>       }
>   
> +    if (flags & ~bs->supported_read_flags) {
> +        abort();
> +    }

I’d prefer an assert(!(flags & ~bs->supported_read_flags)), so in case 
we do abort, there’s going to be an error message that immediately tells 
what the problem is.

Apart from that:

Reviewed-by: Max Reitz <mreitz@redhat.com>

> +
>       max_bytes = ROUND_UP(MAX(0, total_bytes - offset), align);
>       if (bytes <= max_bytes && bytes <= max_transfer) {
> -        ret = bdrv_driver_preadv(bs, offset, bytes, qiov, qiov_offset, 0);
> +        ret = bdrv_driver_preadv(bs, offset, bytes, qiov, qiov_offset, flags);
>           goto out;
>       }



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v14 06/13] iotests: add #310 to test bottom node in COR driver
  2020-12-11 13:10     ` Vladimir Sementsov-Ogievskiy
@ 2020-12-11 13:24       ` Max Reitz
  0 siblings, 0 replies; 37+ messages in thread
From: Max Reitz @ 2020-12-11 13:24 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: fam, kwolf, qemu-devel, armbru, stefanha, andrey.shinkevich, den, jsnow

On 11.12.20 14:10, Vladimir Sementsov-Ogievskiy wrote:
> 11.12.2020 15:49, Max Reitz wrote:
>> On 04.12.20 23:07, Vladimir Sementsov-Ogievskiy wrote:
>>> From: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
>>>
>>> The test case #310 is similar to #216 by Max Reitz. The difference is
>>> that the test #310 involves a bottom node to the COR filter driver.
>>>
>>> Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
>>> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>> ---
>>>   tests/qemu-iotests/310     | 114 +++++++++++++++++++++++++++++++++++++
>>>   tests/qemu-iotests/310.out |  15 +++++
>>>   tests/qemu-iotests/group   |   1 +
>>>   3 files changed, 130 insertions(+)
>>>   create mode 100755 tests/qemu-iotests/310
>>>   create mode 100644 tests/qemu-iotests/310.out
>>>
>>> diff --git a/tests/qemu-iotests/310 b/tests/qemu-iotests/310
>>> new file mode 100755
>>> index 0000000000..c8b34cd887
>>> --- /dev/null
>>> +++ b/tests/qemu-iotests/310
>>> @@ -0,0 +1,114 @@
>>> +#!/usr/bin/env python3
>>> +#
>>> +# Copy-on-read tests using a COR filter with a bottom node
>>> +#
>>> +# Copyright (C) 2018 Red Hat, Inc.
>>> +# Copyright (c) 2020 Virtuozzo International GmbH
>>> +#
>>> +# This program is free software; you can redistribute it and/or modify
>>> +# it under the terms of the GNU General Public License as published by
>>> +# the Free Software Foundation; either version 2 of the License, or
>>> +# (at your option) any later version.
>>> +#
>>> +# This program is distributed in the hope that it will be useful,
>>> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
>>> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>>> +# GNU General Public License for more details.
>>> +#
>>> +# You should have received a copy of the GNU General Public License
>>> +# along with this program.  If not, see <http://www.gnu.org/licenses/>.
>>> +#
>>> +
>>> +import iotests
>>> +from iotests import log, qemu_img, qemu_io_silent
>>> +
>>> +# Need backing file support
>>> +iotests.script_initialize(supported_fmts=['qcow2', 'qcow', 'qed', 
>>> 'vmdk'],
>>> +                          supported_platforms=['linux'])
>>> +
>>> +log('')
>>> +log('=== Copy-on-read across nodes ===')
>>> +log('')
>>> +
>>> +# This test is similar to the 216 one by Max Reitz <mreitz@redhat.com>
>>> +# The difference is that this test case involves a bottom node to the
>>> +# COR filter driver.
>>> +
>>> +with iotests.FilePath('base.img') as base_img_path, \
>>> +     iotests.FilePath('mid.img') as mid_img_path, \
>>> +     iotests.FilePath('top.img') as top_img_path, \
>>> +     iotests.VM() as vm:
>>> +
>>> +    log('--- Setting up images ---')
>>> +    log('')
>>> +
>>> +    assert qemu_img('create', '-f', iotests.imgfmt, base_img_path, 
>>> '64M') == 0
>>> +    assert qemu_io_silent(base_img_path, '-c', 'write -P 1 0M 1M') == 0
>>> +    assert qemu_io_silent(base_img_path, '-c', 'write -P 1 3M 1M') == 0
>>> +    assert qemu_img('create', '-f', iotests.imgfmt, '-b', 
>>> base_img_path,
>>> +                    '-F', iotests.imgfmt, mid_img_path) == 0
>>> +    assert qemu_io_silent(mid_img_path,  '-c', 'write -P 3 2M 1M') == 0
>>> +    assert qemu_io_silent(mid_img_path,  '-c', 'write -P 3 4M 1M') == 0
>>> +    assert qemu_img('create', '-f', iotests.imgfmt, '-b', mid_img_path,
>>> +                    '-F', iotests.imgfmt, top_img_path) == 0
>>> +    assert qemu_io_silent(top_img_path,  '-c', 'write -P 2 1M 1M') == 0
>>> +
>>> +#      0 1 2 3 4
>>> +# top    2
>>> +# mid      3   3
>>> +# base 1     1
>>> +
>>> +    log('Done')
>>> +
>>> +    log('')
>>> +    log('--- Doing COR ---')
>>> +    log('')
>>> +
>>> +    vm.launch()
>>> +
>>> +    log(vm.qmp('blockdev-add',
>>> +               node_name='node0',
>>> +               driver='copy-on-read',
>>> +               bottom='node2',
>>> +               file={
>>> +                   'driver': iotests.imgfmt,
>>> +                   'file': {
>>> +                       'driver': 'file',
>>> +                       'filename': top_img_path
>>> +                   },
>>> +                   'backing': {
>>> +                       'node-name': 'node2',
>>> +                       'driver': iotests.imgfmt,
>>> +                       'file': {
>>> +                           'driver': 'file',
>>> +                           'filename': mid_img_path
>>> +                       },
>>> +                       'backing': {
>>> +                           'driver': iotests.imgfmt,
>>> +                           'file': {
>>> +                               'driver': 'file',
>>> +                               'filename': base_img_path
>>> +                           }
>>> +                       },
>>> +                   }
>>> +               }))
>>> +
>>> +    # Trigger COR
>>> +    log(vm.qmp('human-monitor-command',
>>> +               command_line='qemu-io node0 "read 0 5M"'))
>>> +
>>> +    vm.shutdown()
>>> +
>>> +    log('')
>>> +    log('--- Checking COR result ---')
>>> +    log('')
>>> +
>>> +    assert qemu_io_silent(base_img_path, '-c', 'discard 0 4M') == 0
>>> +    assert qemu_io_silent(mid_img_path, '-c', 'discard 0M 5M') == 0
>>
>> The data discard leaves behind is undefined, so this may not result in 
>> zeroes.  (In fact, the test does fail for me with vmdk, qed, and 
>> qcow.)  'write -z' would work better, although perhaps you 
>> intentionally chose discard to just drop the data from the backing 
>> images.
>>
>> In that case, you could also recreate the middle image, so it’s empty 
>> then – the only problem with that is that it’ll break VMDK because it 
>> stores this reference to its backing image, and if the backing image 
>> is changed, you’ll get EINVAL when falling back to it...
>>
>> (The same goes for overwriting any data in the backing image, though, 
>> be it with discard, write -z, or write -P 0.  So I suppose VMDK just 
>> won’t work with this test.)
>>
> 
> I think the goal is just to be sure the following reads read from the 
> top and check exactly that COR works.
> 
> So we can just use 'write -z'.. Or, we can changed the backing file of 
> top_img to nothing instead. Can qemu-img do it?

Yes, with rebase -u -b ''.  (I think I tested that, and that too didn’t 
work for all formats, though...?)

Max



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v14 07/13] block: include supported_read_flags into BDS structure
  2020-12-11 13:20   ` Max Reitz
@ 2020-12-11 13:31     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 37+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-12-11 13:31 UTC (permalink / raw)
  To: Max Reitz, qemu-block
  Cc: fam, kwolf, qemu-devel, armbru, stefanha, andrey.shinkevich, den, jsnow

11.12.2020 16:20, Max Reitz wrote:
> On 04.12.20 23:07, Vladimir Sementsov-Ogievskiy wrote:
>> From: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
>>
>> Add the new member supported_read_flags to the BlockDriverState
>> structure. It will control the flags set for copy-on-read operations.
>> Make the block generic layer evaluate supported read flags before they
>> go to a block driver.
>>
>> Suggested-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
>> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> ---
>>   include/block/block_int.h |  4 ++++
>>   block/io.c                | 12 ++++++++++--
>>   2 files changed, 14 insertions(+), 2 deletions(-)
> 
> [...]
> 
>> diff --git a/block/io.c b/block/io.c
>> index ec5e152bb7..e28b11c42b 100644
>> --- a/block/io.c
>> +++ b/block/io.c
> 
> [...]
> 
>> @@ -1426,9 +1429,13 @@ static int coroutine_fn bdrv_aligned_preadv(BdrvChild *child,
>>           goto out;
>>       }
>> +    if (flags & ~bs->supported_read_flags) {
>> +        abort();
>> +    }
> 
> I’d prefer an assert(!(flags & ~bs->supported_read_flags)), so in case we do abort, there’s going to be an error message that immediately tells what the problem is.

agree. and one-line check is shorter than three-line

> 
> Apart from that:
> 
> Reviewed-by: Max Reitz <mreitz@redhat.com>
> 
>> +
>>       max_bytes = ROUND_UP(MAX(0, total_bytes - offset), align);
>>       if (bytes <= max_bytes && bytes <= max_transfer) {
>> -        ret = bdrv_driver_preadv(bs, offset, bytes, qiov, qiov_offset, 0);
>> +        ret = bdrv_driver_preadv(bs, offset, bytes, qiov, qiov_offset, flags);
>>           goto out;
>>       }
> 


-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v14 08/13] copy-on-read: skip non-guest reads if no copy needed
  2020-12-04 22:07 ` [PATCH v14 08/13] copy-on-read: skip non-guest reads if no copy needed Vladimir Sementsov-Ogievskiy
@ 2020-12-11 14:29   ` Max Reitz
  0 siblings, 0 replies; 37+ messages in thread
From: Max Reitz @ 2020-12-11 14:29 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: fam, kwolf, qemu-devel, armbru, stefanha, andrey.shinkevich, den, jsnow

On 04.12.20 23:07, Vladimir Sementsov-Ogievskiy wrote:
> From: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
> 
> If the flag BDRV_REQ_PREFETCH was set, skip idling read/write
> operations in COR-driver. It can be taken into account for the
> COR-algorithms optimization. That check is being made during the
> block stream job by the moment.
> 
> Add the BDRV_REQ_PREFETCH flag to the supported_read_flags of the
> COR-filter.
> 
> block: Modify the comment for the flag BDRV_REQ_PREFETCH as we are
> going to use it alone and pass it to the COR-filter driver for further
> processing.
> 
> Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>   include/block/block.h |  8 +++++---
>   block/copy-on-read.c  | 14 ++++++++++----
>   2 files changed, 15 insertions(+), 7 deletions(-)

Reviewed-by: Max Reitz <mreitz@redhat.com>



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v14 09/13] stream: skip filters when writing backing file name to QCOW2 header
  2020-12-04 22:07 ` [PATCH v14 09/13] stream: skip filters when writing backing file name to QCOW2 header Vladimir Sementsov-Ogievskiy
@ 2020-12-11 15:15   ` Max Reitz
  0 siblings, 0 replies; 37+ messages in thread
From: Max Reitz @ 2020-12-11 15:15 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: fam, kwolf, qemu-devel, armbru, stefanha, andrey.shinkevich, den, jsnow

On 04.12.20 23:07, Vladimir Sementsov-Ogievskiy wrote:
> From: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
> 
> Avoid writing a filter JSON file name and a filter format name to QCOW2
> image when the backing file is being changed after the block stream
> job. It can occur due to a concurrent commit job on the same backing
> chain.
> A user is still able to assign the 'backing-file' parameter for a
> block-stream job keeping in mind the possible issue mentioned above.
> If the user does not specify the 'backing-file' parameter, QEMU will
> assign it automatically.
> 
> Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
>   [vsementsov: use unfiltered_bs for bdrv_find_backing_image()]
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>   block/stream.c | 21 +++++++++++++++++++--
>   blockdev.c     |  8 +-------
>   2 files changed, 20 insertions(+), 9 deletions(-)
> 
> diff --git a/block/stream.c b/block/stream.c
> index 6e281c71ac..c208393c34 100644
> --- a/block/stream.c
> +++ b/block/stream.c

[...]

> @@ -75,8 +78,22 @@ static int stream_prepare(Job *job)
>           const char *base_id = NULL, *base_fmt = NULL;
>           if (base) {
>               base_id = s->backing_file_str;
> -            if (base->drv) {
> -                base_fmt = base->drv->format_name;
> +            if (base_id) {
> +                backing_bs = bdrv_find_backing_image(unfiltered_bs, base_id);
> +                if (backing_bs && backing_bs->drv) {
> +                    base_fmt = backing_bs->drv->format_name;
> +                } else {
> +                    error_report("Format not found for backing file %s",
> +                                 s->backing_file_str);

I think it’s actually going to be rather likely that we’re not going to 
find the backing file here.  If the user were to use a filename that 
just appears as-such in the backing chain, they wouldn’t need to specify 
a backing-file parameter at all, because the one figured out 
automatically would be just fine.

But then again, what are we supposed to do then.  We can continue as 
before, which is to just use the base node’s format.  But if the user 
wants to perhaps use a backing file that isn’t even open in qemu (a copy 
of the the base on some different storage), we have no idea what format 
it’s in.

So printing an error here, but continuing on with setting a backing_fmt 
is probably the most reasonable thing to do indeed.

> +                }
> +            } else {
> +                base_unfiltered = bdrv_skip_filters(base);
> +                if (base_unfiltered) {

@base_unfiltered cannot be NULL here (because @base is sure not to be 
NULL).  Of course, double-checking isn’t wrong, it just looks a bit 
weird, because it seems to imply that we might end up with a case where 
base != NULL, but base_id == NULL.  Anyway:

Reviewed-by: Max Reitz <mreitz@redhat.com>

> +                    base_id = base_unfiltered->filename;
> +                    if (base_unfiltered->drv) {
> +                        base_fmt = base_unfiltered->drv->format_name;
> +                    }
> +                }
>               }
>           }
>           bdrv_set_backing_hd(unfiltered_bs, base, &local_err);



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v14 10/13] qapi: block-stream: add "bottom" argument
  2020-12-04 22:07 ` [PATCH v14 10/13] qapi: block-stream: add "bottom" argument Vladimir Sementsov-Ogievskiy
@ 2020-12-11 16:05   ` Max Reitz
  2020-12-11 16:50     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 37+ messages in thread
From: Max Reitz @ 2020-12-11 16:05 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: fam, kwolf, qemu-devel, armbru, stefanha, andrey.shinkevich, den, jsnow

On 04.12.20 23:07, Vladimir Sementsov-Ogievskiy wrote:
> The code already don't freeze base node and we try to make it prepared
> for the situation when base node is changed during the operation. In
> other words, block-stream doesn't own base node.
> 
> Let's introduce a new interface which should replace the current one,
> which will in better relations with the code. Specifying bottom node
> instead of base, and requiring it to be non-filter gives us the
> following benefits:
> 
>   - drop difference between above_base and base_overlay, which will be
>     renamed to just bottom, when old interface dropped
> 
>   - clean way to work with parallel streams/commits on the same backing
>     chain, which otherwise become a problem when we introduce a filter
>     for stream job
> 
>   - cleaner interface. Nobody will surprised the fact that base node may
>     disappear during block-stream, when there is no word about "base" in
>     the interface.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>   qapi/block-core.json           |  8 +++--
>   include/block/block_int.h      |  1 +
>   block/monitor/block-hmp-cmds.c |  3 +-
>   block/stream.c                 | 50 +++++++++++++++++++---------
>   blockdev.c                     | 61 ++++++++++++++++++++++++++++------
>   5 files changed, 94 insertions(+), 29 deletions(-)
> 
> diff --git a/qapi/block-core.json b/qapi/block-core.json
> index 04055ef50c..5d6681a35d 100644
> --- a/qapi/block-core.json
> +++ b/qapi/block-core.json
> @@ -2522,6 +2522,10 @@
>   # @base-node: the node name of the backing file.
>   #             It cannot be set if @base is also set. (Since 2.8)
>   #
> +# @bottom: the last node in the chain that should be streamed into
> +#          top. It cannot be set any of @base, @base-node or @backing-file

s/set any/set if any/

But what’s the problem with backing-file?  The fact that specifying 
backing-file means that stream will look for that filename in the 
backing chain when the job is done (so if you use @bottom, we generally 
don’t want to rely on the presence of any nodes below it)?

(If so, I would have thought that we actually want the user to specify 
backing-file so we don’t have to look down below @bottom to look for a 
filename.  Perhaps a @backing-fmt parameter would help.)

[...]

> diff --git a/blockdev.c b/blockdev.c
> index 70900f4f77..e0e19db88b 100644
> --- a/blockdev.c
> +++ b/blockdev.c

[...]

> @@ -2551,8 +2567,33 @@ void qmp_block_stream(bool has_job_id, const char *job_id, const char *device,
>           bdrv_refresh_filename(base_bs);
>       }
>   
> -    /* Check for op blockers in the whole chain between bs and base */
> -    for (iter = bs; iter && iter != base_bs;
> +    if (has_bottom) {
> +        bottom_bs = bdrv_lookup_bs(NULL, bottom, errp);
> +        if (!bottom_bs) {
> +            goto out;
> +        }
> +        if (!bottom_bs->drv) {
> +            error_setg(errp, "Node '%s' is not open", bottom);
> +            goto out;
> +        }
> +        if (bottom_bs->drv->is_filter) {
> +            error_setg(errp, "Node '%s' is filter, use non-filter node"
> +                       "as 'bottom'", bottom);

Missing a space between “node” and “as”.  (Also, probably two articles, 
i.e. “Node '%s' is a filter, use a non-filter node...”.)

The rest looks good to me, but I’m withholding my R-b because I haven’t 
understood why using @bottom precludes giving @backing-file.

Max



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v14 11/13] iotests: 30: prepare to COR filter insertion by stream job
  2020-12-04 22:07 ` [PATCH v14 11/13] iotests: 30: prepare to COR filter insertion by stream job Vladimir Sementsov-Ogievskiy
@ 2020-12-11 16:09   ` Max Reitz
  0 siblings, 0 replies; 37+ messages in thread
From: Max Reitz @ 2020-12-11 16:09 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: fam, kwolf, qemu-devel, armbru, stefanha, andrey.shinkevich, den, jsnow

On 04.12.20 23:07, Vladimir Sementsov-Ogievskiy wrote:
> test_stream_parallel run parallel stream jobs, intersecting so that top
> of one is base of another. It's OK now, but it would be a problem if
> insert the filter, as one job will want to use another job's filter as
> above_base node.
> 
> Correct thing to do is move to new interface: "bottom" argument instead
> of base. This guarantees that jobs don't intersect by their actions.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>   tests/qemu-iotests/030 | 4 +++-
>   1 file changed, 3 insertions(+), 1 deletion(-)

Reviewed-by: Max Reitz <mreitz@redhat.com>



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v14 12/13] block/stream: add s->target_bs
  2020-12-04 22:07 ` [PATCH v14 12/13] block/stream: add s->target_bs Vladimir Sementsov-Ogievskiy
@ 2020-12-11 16:33   ` Max Reitz
  0 siblings, 0 replies; 37+ messages in thread
From: Max Reitz @ 2020-12-11 16:33 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: fam, kwolf, qemu-devel, armbru, stefanha, andrey.shinkevich, den, jsnow

On 04.12.20 23:07, Vladimir Sementsov-Ogievskiy wrote:
> Add a direct link to target bs for convenience and to simplify
> following commit which will insert COR filter above target bs.
> 
> This is a part of original commit written by Andrey.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>   block/stream.c | 23 ++++++++++-------------
>   1 file changed, 10 insertions(+), 13 deletions(-)

Reviewed-by: Max Reitz <mreitz@redhat.com>



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v14 10/13] qapi: block-stream: add "bottom" argument
  2020-12-11 16:05   ` Max Reitz
@ 2020-12-11 16:50     ` Vladimir Sementsov-Ogievskiy
  2020-12-11 17:24       ` Max Reitz
  0 siblings, 1 reply; 37+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-12-11 16:50 UTC (permalink / raw)
  To: Max Reitz, qemu-block
  Cc: fam, kwolf, qemu-devel, armbru, stefanha, andrey.shinkevich, den, jsnow

11.12.2020 19:05, Max Reitz wrote:
> On 04.12.20 23:07, Vladimir Sementsov-Ogievskiy wrote:
>> The code already don't freeze base node and we try to make it prepared
>> for the situation when base node is changed during the operation. In
>> other words, block-stream doesn't own base node.
>>
>> Let's introduce a new interface which should replace the current one,
>> which will in better relations with the code. Specifying bottom node
>> instead of base, and requiring it to be non-filter gives us the
>> following benefits:
>>
>>   - drop difference between above_base and base_overlay, which will be
>>     renamed to just bottom, when old interface dropped
>>
>>   - clean way to work with parallel streams/commits on the same backing
>>     chain, which otherwise become a problem when we introduce a filter
>>     for stream job
>>
>>   - cleaner interface. Nobody will surprised the fact that base node may
>>     disappear during block-stream, when there is no word about "base" in
>>     the interface.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> ---
>>   qapi/block-core.json           |  8 +++--
>>   include/block/block_int.h      |  1 +
>>   block/monitor/block-hmp-cmds.c |  3 +-
>>   block/stream.c                 | 50 +++++++++++++++++++---------
>>   blockdev.c                     | 61 ++++++++++++++++++++++++++++------
>>   5 files changed, 94 insertions(+), 29 deletions(-)
>>
>> diff --git a/qapi/block-core.json b/qapi/block-core.json
>> index 04055ef50c..5d6681a35d 100644
>> --- a/qapi/block-core.json
>> +++ b/qapi/block-core.json
>> @@ -2522,6 +2522,10 @@
>>   # @base-node: the node name of the backing file.
>>   #             It cannot be set if @base is also set. (Since 2.8)
>>   #
>> +# @bottom: the last node in the chain that should be streamed into
>> +#          top. It cannot be set any of @base, @base-node or @backing-file
> 
> s/set any/set if any/
> 
> But what’s the problem with backing-file?  The fact that specifying backing-file means that stream will look for that filename in the backing chain when the job is done (so if you use @bottom, we generally don’t want to rely on the presence of any nodes below it)?

I just wanted to deprecate 'backing-file' together with base and base-node as a next step. If user wants to set backing file unrelated to current backing-chain, is it correct at all? It's a direct violation of what's going on, and I doubt that other parts of Qemu working with backing-file are prepared for such situation. User can do it by hand later.. Anyway, we'll have three releases deprecation period for people to come and cry that this is a really needed option, so we can support it later on demand.

> 
> (If so, I would have thought that we actually want the user to specify backing-file so we don’t have to look down below @bottom to look for a filename.  Perhaps a @backing-fmt parameter would help.)

If we decide that 'backing-file' is really needed, than yes we should require backing-fmt to be specified together with backing-file when using new "bottom" interface.

> 
> [...]
> 
>> diff --git a/blockdev.c b/blockdev.c
>> index 70900f4f77..e0e19db88b 100644
>> --- a/blockdev.c
>> +++ b/blockdev.c
> 
> [...]
> 
>> @@ -2551,8 +2567,33 @@ void qmp_block_stream(bool has_job_id, const char *job_id, const char *device,
>>           bdrv_refresh_filename(base_bs);
>>       }
>> -    /* Check for op blockers in the whole chain between bs and base */
>> -    for (iter = bs; iter && iter != base_bs;
>> +    if (has_bottom) {
>> +        bottom_bs = bdrv_lookup_bs(NULL, bottom, errp);
>> +        if (!bottom_bs) {
>> +            goto out;
>> +        }
>> +        if (!bottom_bs->drv) {
>> +            error_setg(errp, "Node '%s' is not open", bottom);
>> +            goto out;
>> +        }
>> +        if (bottom_bs->drv->is_filter) {
>> +            error_setg(errp, "Node '%s' is filter, use non-filter node"
>> +                       "as 'bottom'", bottom);
> 
> Missing a space between “node” and “as”.  (Also, probably two articles, i.e. “Node '%s' is a filter, use a non-filter node...”.)
> 
> The rest looks good to me, but I’m withholding my R-b because I haven’t understood why using @bottom precludes giving @backing-file.
> 
> Max
> 


-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v14 13/13] block: apply COR-filter to block-stream jobs
  2020-12-04 22:07 ` [PATCH v14 13/13] block: apply COR-filter to block-stream jobs Vladimir Sementsov-Ogievskiy
@ 2020-12-11 17:21   ` Max Reitz
  2020-12-11 17:48     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 37+ messages in thread
From: Max Reitz @ 2020-12-11 17:21 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: fam, kwolf, qemu-devel, armbru, stefanha, andrey.shinkevich, den, jsnow

On 04.12.20 23:07, Vladimir Sementsov-Ogievskiy wrote:
> From: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
> 
> This patch completes the series with the COR-filter applied to
> block-stream operations.
> 
> Adding the filter makes it possible in future implement discarding
> copied regions in backing files during the block-stream job, to reduce
> the disk overuse (we need control on permissions).
> 
> Also, the filter now is smart enough to do copy-on-read with specified
> base, so we have benefit on guest reads even when doing block-stream of
> the part of the backing chain.
> 
> Several iotests are slightly modified due to filter insertion.
> 
> Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>   block/stream.c             | 78 ++++++++++++++++++++++++++------------
>   tests/qemu-iotests/030     |  8 ++--
>   tests/qemu-iotests/141.out |  2 +-
>   tests/qemu-iotests/245     | 20 ++++++----
>   4 files changed, 72 insertions(+), 36 deletions(-)
> 
> diff --git a/block/stream.c b/block/stream.c
> index a7fd8945ad..b92f7de55b 100644
> --- a/block/stream.c
> +++ b/block/stream.c

[...]

> @@ -295,17 +287,49 @@ void stream_start(const char *job_id, BlockDriverState *bs,

[...]

> +    opts = qdict_new();
> +
> +    qdict_put_str(opts, "driver", "copy-on-read");
> +    qdict_put_str(opts, "file", bdrv_get_node_name(bs));
> +    /* Pass the base_overlay node name as 'bottom' to COR driver */
> +    qdict_put_str(opts, "bottom", base_overlay->node_name);

Hm.  Should we set this option even if no base was specified?

On one hand, omitting this option would cor_co_preadv_part() a bit quicker.

On the other, what happens when you add a backing file below the bottom 
node during streaming (yes, a largely theoretical case)...  Now, all 
data from it is ignored.  That seemed a bit strange to me at first, but 
on second thought, it makes more sense.  Doing anything else would 
produce a garbage result basically, because stream_run() doesn’t take 
such a change into account.

So...  After all I think I agree with setting @bottom unconditionally.

And that’s the only comment I had. :)

Reviewed-by: Max Reitz <mreitz@redhat.com>



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v14 10/13] qapi: block-stream: add "bottom" argument
  2020-12-11 16:50     ` Vladimir Sementsov-Ogievskiy
@ 2020-12-11 17:24       ` Max Reitz
  2020-12-11 17:42         ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 37+ messages in thread
From: Max Reitz @ 2020-12-11 17:24 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: fam, kwolf, qemu-devel, armbru, stefanha, andrey.shinkevich, den, jsnow

On 11.12.20 17:50, Vladimir Sementsov-Ogievskiy wrote:
> 11.12.2020 19:05, Max Reitz wrote:
>> On 04.12.20 23:07, Vladimir Sementsov-Ogievskiy wrote:
>>> The code already don't freeze base node and we try to make it prepared
>>> for the situation when base node is changed during the operation. In
>>> other words, block-stream doesn't own base node.
>>>
>>> Let's introduce a new interface which should replace the current one,
>>> which will in better relations with the code. Specifying bottom node
>>> instead of base, and requiring it to be non-filter gives us the
>>> following benefits:
>>>
>>>   - drop difference between above_base and base_overlay, which will be
>>>     renamed to just bottom, when old interface dropped
>>>
>>>   - clean way to work with parallel streams/commits on the same backing
>>>     chain, which otherwise become a problem when we introduce a filter
>>>     for stream job
>>>
>>>   - cleaner interface. Nobody will surprised the fact that base node may
>>>     disappear during block-stream, when there is no word about "base" in
>>>     the interface.
>>>
>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>> ---
>>>   qapi/block-core.json           |  8 +++--
>>>   include/block/block_int.h      |  1 +
>>>   block/monitor/block-hmp-cmds.c |  3 +-
>>>   block/stream.c                 | 50 +++++++++++++++++++---------
>>>   blockdev.c                     | 61 ++++++++++++++++++++++++++++------
>>>   5 files changed, 94 insertions(+), 29 deletions(-)
>>>
>>> diff --git a/qapi/block-core.json b/qapi/block-core.json
>>> index 04055ef50c..5d6681a35d 100644
>>> --- a/qapi/block-core.json
>>> +++ b/qapi/block-core.json
>>> @@ -2522,6 +2522,10 @@
>>>   # @base-node: the node name of the backing file.
>>>   #             It cannot be set if @base is also set. (Since 2.8)
>>>   #
>>> +# @bottom: the last node in the chain that should be streamed into
>>> +#          top. It cannot be set any of @base, @base-node or 
>>> @backing-file
>>
>> s/set any/set if any/
>>
>> But what’s the problem with backing-file?  The fact that specifying 
>> backing-file means that stream will look for that filename in the 
>> backing chain when the job is done (so if you use @bottom, we 
>> generally don’t want to rely on the presence of any nodes below it)?
> 
> I just wanted to deprecate 'backing-file' together with base and 
> base-node as a next step. If user wants to set backing file unrelated to 
> current backing-chain, is it correct at all? It's a direct violation of 
> what's going on, and I doubt that other parts of Qemu working with 
> backing-file are prepared for such situation. User can do it by hand 
> later.. Anyway, we'll have three releases deprecation period for people 
> to come and cry that this is a really needed option, so we can support 
> it later on demand.
> 
>>
>> (If so, I would have thought that we actually want the user to specify 
>> backing-file so we don’t have to look down below @bottom to look for a 
>> filename.  Perhaps a @backing-fmt parameter would help.)
> 
> If we decide that 'backing-file' is really needed, than yes we should 
> require backing-fmt to be specified together with backing-file when 
> using new "bottom" interface.
Before I can agree on removing backing-file (or deprecating it), I need 
to know what it’s actually used for.  I actually don’t, though.  The 
only reason I could imagine was because the user wanted to write some 
string into there that is different from base.filename.

(The original commit 13d8cc515df does mention cases like FD passing, 
where qemu has no idea what an appropriate filename would be (it can 
only see /dev/fd/*).  From that, it does appear to me that it’ll be 
needed even with @bottom.)

Max



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v14 10/13] qapi: block-stream: add "bottom" argument
  2020-12-11 17:24       ` Max Reitz
@ 2020-12-11 17:42         ` Vladimir Sementsov-Ogievskiy
  2020-12-11 17:52           ` Max Reitz
  0 siblings, 1 reply; 37+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-12-11 17:42 UTC (permalink / raw)
  To: Max Reitz, qemu-block
  Cc: fam, kwolf, qemu-devel, armbru, stefanha, andrey.shinkevich, den, jsnow

11.12.2020 20:24, Max Reitz wrote:
> On 11.12.20 17:50, Vladimir Sementsov-Ogievskiy wrote:
>> 11.12.2020 19:05, Max Reitz wrote:
>>> On 04.12.20 23:07, Vladimir Sementsov-Ogievskiy wrote:
>>>> The code already don't freeze base node and we try to make it prepared
>>>> for the situation when base node is changed during the operation. In
>>>> other words, block-stream doesn't own base node.
>>>>
>>>> Let's introduce a new interface which should replace the current one,
>>>> which will in better relations with the code. Specifying bottom node
>>>> instead of base, and requiring it to be non-filter gives us the
>>>> following benefits:
>>>>
>>>>   - drop difference between above_base and base_overlay, which will be
>>>>     renamed to just bottom, when old interface dropped
>>>>
>>>>   - clean way to work with parallel streams/commits on the same backing
>>>>     chain, which otherwise become a problem when we introduce a filter
>>>>     for stream job
>>>>
>>>>   - cleaner interface. Nobody will surprised the fact that base node may
>>>>     disappear during block-stream, when there is no word about "base" in
>>>>     the interface.
>>>>
>>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>>> ---
>>>>   qapi/block-core.json           |  8 +++--
>>>>   include/block/block_int.h      |  1 +
>>>>   block/monitor/block-hmp-cmds.c |  3 +-
>>>>   block/stream.c                 | 50 +++++++++++++++++++---------
>>>>   blockdev.c                     | 61 ++++++++++++++++++++++++++++------
>>>>   5 files changed, 94 insertions(+), 29 deletions(-)
>>>>
>>>> diff --git a/qapi/block-core.json b/qapi/block-core.json
>>>> index 04055ef50c..5d6681a35d 100644
>>>> --- a/qapi/block-core.json
>>>> +++ b/qapi/block-core.json
>>>> @@ -2522,6 +2522,10 @@
>>>>   # @base-node: the node name of the backing file.
>>>>   #             It cannot be set if @base is also set. (Since 2.8)
>>>>   #
>>>> +# @bottom: the last node in the chain that should be streamed into
>>>> +#          top. It cannot be set any of @base, @base-node or @backing-file
>>>
>>> s/set any/set if any/
>>>
>>> But what’s the problem with backing-file?  The fact that specifying backing-file means that stream will look for that filename in the backing chain when the job is done (so if you use @bottom, we generally don’t want to rely on the presence of any nodes below it)?
>>
>> I just wanted to deprecate 'backing-file' together with base and base-node as a next step. If user wants to set backing file unrelated to current backing-chain, is it correct at all? It's a direct violation of what's going on, and I doubt that other parts of Qemu working with backing-file are prepared for such situation. User can do it by hand later.. Anyway, we'll have three releases deprecation period for people to come and cry that this is a really needed option, so we can support it later on demand.
>>
>>>
>>> (If so, I would have thought that we actually want the user to specify backing-file so we don’t have to look down below @bottom to look for a filename.  Perhaps a @backing-fmt parameter would help.)
>>
>> If we decide that 'backing-file' is really needed, than yes we should require backing-fmt to be specified together with backing-file when using new "bottom" interface.
> Before I can agree on removing backing-file (or deprecating it), I need to know what it’s actually used for.  I actually don’t, though.  The only reason I could imagine was because the user wanted to write some string into there that is different from base.filename.
> 
> (The original commit 13d8cc515df does mention cases like FD passing, where qemu has no idea what an appropriate filename would be (it can only see /dev/fd/*).  From that, it does appear to me that it’ll be needed even with @bottom.)
> 

I should have checked it myself.. That's one more reason for my "RFC: don't store backing filename in qcow2 image"..

OK, do you think we can require backing-fmt to be specified if backing-file and bottom are specified? Or allow omitting it and deprecate this thing? We actually already have deprecation message in bdrv_change_backing_file(), and how we are trying to workaround it in block-stream will not work with file descriptors anyway (hmm, and old code works, so, actually 09 is a regression?)

-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v14 13/13] block: apply COR-filter to block-stream jobs
  2020-12-11 17:21   ` Max Reitz
@ 2020-12-11 17:48     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 37+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-12-11 17:48 UTC (permalink / raw)
  To: Max Reitz, qemu-block
  Cc: fam, kwolf, qemu-devel, armbru, stefanha, andrey.shinkevich, den, jsnow

11.12.2020 20:21, Max Reitz wrote:
> On 04.12.20 23:07, Vladimir Sementsov-Ogievskiy wrote:
>> From: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
>>
>> This patch completes the series with the COR-filter applied to
>> block-stream operations.
>>
>> Adding the filter makes it possible in future implement discarding
>> copied regions in backing files during the block-stream job, to reduce
>> the disk overuse (we need control on permissions).
>>
>> Also, the filter now is smart enough to do copy-on-read with specified
>> base, so we have benefit on guest reads even when doing block-stream of
>> the part of the backing chain.
>>
>> Several iotests are slightly modified due to filter insertion.
>>
>> Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> ---
>>   block/stream.c             | 78 ++++++++++++++++++++++++++------------
>>   tests/qemu-iotests/030     |  8 ++--
>>   tests/qemu-iotests/141.out |  2 +-
>>   tests/qemu-iotests/245     | 20 ++++++----
>>   4 files changed, 72 insertions(+), 36 deletions(-)
>>
>> diff --git a/block/stream.c b/block/stream.c
>> index a7fd8945ad..b92f7de55b 100644
>> --- a/block/stream.c
>> +++ b/block/stream.c
> 
> [...]
> 
>> @@ -295,17 +287,49 @@ void stream_start(const char *job_id, BlockDriverState *bs,
> 
> [...]
> 
>> +    opts = qdict_new();
>> +
>> +    qdict_put_str(opts, "driver", "copy-on-read");
>> +    qdict_put_str(opts, "file", bdrv_get_node_name(bs));
>> +    /* Pass the base_overlay node name as 'bottom' to COR driver */
>> +    qdict_put_str(opts, "bottom", base_overlay->node_name);
> 
> Hm.  Should we set this option even if no base was specified?
> 
> On one hand, omitting this option would cor_co_preadv_part() a bit quicker.
> 
> On the other, what happens when you add a backing file below the bottom node during streaming (yes, a largely theoretical case)...

Yes, that's what I was thinking about.

And more: we are moving to using "bottom" and deprecate "base". So bottom is the main concept, and it can't be NULL. If user don't specify it, than default bottom - is the current bottom node in the chain.

I think, we are not going to introduce a different behavior for stream "without bottom", when user can add more nodes to the chain during the job, and all of them will be removed after the job. It will require rethinking of freezing and keeping references on intermediate nodes at least..

>  Now, all data from it is ignored.  That seemed a bit strange to me at first, but on second thought, it makes more sense.  Doing anything else would produce a garbage result basically, because stream_run() doesn’t take such a change into account.
> 

yes

> So...  After all I think I agree with setting @bottom unconditionally.
> 
> And that’s the only comment I had. :)
> 
> Reviewed-by: Max Reitz <mreitz@redhat.com>
> 

Thanks! v15 will come next week)

-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v14 10/13] qapi: block-stream: add "bottom" argument
  2020-12-11 17:42         ` Vladimir Sementsov-Ogievskiy
@ 2020-12-11 17:52           ` Max Reitz
  0 siblings, 0 replies; 37+ messages in thread
From: Max Reitz @ 2020-12-11 17:52 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: fam, kwolf, qemu-devel, armbru, stefanha, andrey.shinkevich, den, jsnow

On 11.12.20 18:42, Vladimir Sementsov-Ogievskiy wrote:
> 11.12.2020 20:24, Max Reitz wrote:
>> On 11.12.20 17:50, Vladimir Sementsov-Ogievskiy wrote:
>>> 11.12.2020 19:05, Max Reitz wrote:
>>>> On 04.12.20 23:07, Vladimir Sementsov-Ogievskiy wrote:
>>>>> The code already don't freeze base node and we try to make it prepared
>>>>> for the situation when base node is changed during the operation. In
>>>>> other words, block-stream doesn't own base node.
>>>>>
>>>>> Let's introduce a new interface which should replace the current one,
>>>>> which will in better relations with the code. Specifying bottom node
>>>>> instead of base, and requiring it to be non-filter gives us the
>>>>> following benefits:
>>>>>
>>>>>   - drop difference between above_base and base_overlay, which will be
>>>>>     renamed to just bottom, when old interface dropped
>>>>>
>>>>>   - clean way to work with parallel streams/commits on the same 
>>>>> backing
>>>>>     chain, which otherwise become a problem when we introduce a filter
>>>>>     for stream job
>>>>>
>>>>>   - cleaner interface. Nobody will surprised the fact that base 
>>>>> node may
>>>>>     disappear during block-stream, when there is no word about 
>>>>> "base" in
>>>>>     the interface.
>>>>>
>>>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>>>> ---
>>>>>   qapi/block-core.json           |  8 +++--
>>>>>   include/block/block_int.h      |  1 +
>>>>>   block/monitor/block-hmp-cmds.c |  3 +-
>>>>>   block/stream.c                 | 50 +++++++++++++++++++---------
>>>>>   blockdev.c                     | 61 
>>>>> ++++++++++++++++++++++++++++------
>>>>>   5 files changed, 94 insertions(+), 29 deletions(-)
>>>>>
>>>>> diff --git a/qapi/block-core.json b/qapi/block-core.json
>>>>> index 04055ef50c..5d6681a35d 100644
>>>>> --- a/qapi/block-core.json
>>>>> +++ b/qapi/block-core.json
>>>>> @@ -2522,6 +2522,10 @@
>>>>>   # @base-node: the node name of the backing file.
>>>>>   #             It cannot be set if @base is also set. (Since 2.8)
>>>>>   #
>>>>> +# @bottom: the last node in the chain that should be streamed into
>>>>> +#          top. It cannot be set any of @base, @base-node or 
>>>>> @backing-file
>>>>
>>>> s/set any/set if any/
>>>>
>>>> But what’s the problem with backing-file?  The fact that specifying 
>>>> backing-file means that stream will look for that filename in the 
>>>> backing chain when the job is done (so if you use @bottom, we 
>>>> generally don’t want to rely on the presence of any nodes below it)?
>>>
>>> I just wanted to deprecate 'backing-file' together with base and 
>>> base-node as a next step. If user wants to set backing file unrelated 
>>> to current backing-chain, is it correct at all? It's a direct 
>>> violation of what's going on, and I doubt that other parts of Qemu 
>>> working with backing-file are prepared for such situation. User can 
>>> do it by hand later.. Anyway, we'll have three releases deprecation 
>>> period for people to come and cry that this is a really needed 
>>> option, so we can support it later on demand.
>>>
>>>>
>>>> (If so, I would have thought that we actually want the user to 
>>>> specify backing-file so we don’t have to look down below @bottom to 
>>>> look for a filename.  Perhaps a @backing-fmt parameter would help.)
>>>
>>> If we decide that 'backing-file' is really needed, than yes we should 
>>> require backing-fmt to be specified together with backing-file when 
>>> using new "bottom" interface.
>> Before I can agree on removing backing-file (or deprecating it), I 
>> need to know what it’s actually used for.  I actually don’t, though.  
>> The only reason I could imagine was because the user wanted to write 
>> some string into there that is different from base.filename.
>>
>> (The original commit 13d8cc515df does mention cases like FD passing, 
>> where qemu has no idea what an appropriate filename would be (it can 
>> only see /dev/fd/*).  From that, it does appear to me that it’ll be 
>> needed even with @bottom.)
>>
> 
> I should have checked it myself.. That's one more reason for my "RFC: 
> don't store backing filename in qcow2 image"..
> 
> OK, do you think we can require backing-fmt to be specified if 
> backing-file and bottom are specified?

Sure.

> Or allow omitting it and 
> deprecate this thing? We actually already have deprecation message in 
> bdrv_change_backing_file(), and how we are trying to workaround it in 
> block-stream will not work with file descriptors anyway (hmm, and old 
> code works, so, actually 09 is a regression?)

I think requiring backing-fmt for bottom + backing-file would be the 
most simple and clean way, hopefully saving us some headaches.

Max



^ permalink raw reply	[flat|nested] 37+ messages in thread

end of thread, other threads:[~2020-12-11 18:32 UTC | newest]

Thread overview: 37+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-12-04 22:07 [PATCH v14 00/13] Apply COR-filter to the block-stream permanently Vladimir Sementsov-Ogievskiy
2020-12-04 22:07 ` [PATCH v14 01/13] copy-on-read: support preadv/pwritev_part functions Vladimir Sementsov-Ogievskiy
2020-12-04 22:07 ` [PATCH v14 02/13] block: add API function to insert a node Vladimir Sementsov-Ogievskiy
2020-12-10 17:33   ` Max Reitz
2020-12-04 22:07 ` [PATCH v14 03/13] copy-on-read: add filter drop function Vladimir Sementsov-Ogievskiy
2020-12-10 17:34   ` Max Reitz
2020-12-04 22:07 ` [PATCH v14 04/13] qapi: add filter-node-name to block-stream Vladimir Sementsov-Ogievskiy
2020-12-10 17:37   ` Max Reitz
2020-12-04 22:07 ` [PATCH v14 05/13] qapi: create BlockdevOptionsCor structure for COR driver Vladimir Sementsov-Ogievskiy
2020-12-10 17:43   ` Max Reitz
2020-12-10 18:30     ` Vladimir Sementsov-Ogievskiy
2020-12-11  8:54       ` Max Reitz
2020-12-11 12:32         ` Vladimir Sementsov-Ogievskiy
2020-12-04 22:07 ` [PATCH v14 06/13] iotests: add #310 to test bottom node in " Vladimir Sementsov-Ogievskiy
2020-12-11 12:49   ` Max Reitz
2020-12-11 13:10     ` Vladimir Sementsov-Ogievskiy
2020-12-11 13:24       ` Max Reitz
2020-12-04 22:07 ` [PATCH v14 07/13] block: include supported_read_flags into BDS structure Vladimir Sementsov-Ogievskiy
2020-12-11 13:20   ` Max Reitz
2020-12-11 13:31     ` Vladimir Sementsov-Ogievskiy
2020-12-04 22:07 ` [PATCH v14 08/13] copy-on-read: skip non-guest reads if no copy needed Vladimir Sementsov-Ogievskiy
2020-12-11 14:29   ` Max Reitz
2020-12-04 22:07 ` [PATCH v14 09/13] stream: skip filters when writing backing file name to QCOW2 header Vladimir Sementsov-Ogievskiy
2020-12-11 15:15   ` Max Reitz
2020-12-04 22:07 ` [PATCH v14 10/13] qapi: block-stream: add "bottom" argument Vladimir Sementsov-Ogievskiy
2020-12-11 16:05   ` Max Reitz
2020-12-11 16:50     ` Vladimir Sementsov-Ogievskiy
2020-12-11 17:24       ` Max Reitz
2020-12-11 17:42         ` Vladimir Sementsov-Ogievskiy
2020-12-11 17:52           ` Max Reitz
2020-12-04 22:07 ` [PATCH v14 11/13] iotests: 30: prepare to COR filter insertion by stream job Vladimir Sementsov-Ogievskiy
2020-12-11 16:09   ` Max Reitz
2020-12-04 22:07 ` [PATCH v14 12/13] block/stream: add s->target_bs Vladimir Sementsov-Ogievskiy
2020-12-11 16:33   ` Max Reitz
2020-12-04 22:07 ` [PATCH v14 13/13] block: apply COR-filter to block-stream jobs Vladimir Sementsov-Ogievskiy
2020-12-11 17:21   ` Max Reitz
2020-12-11 17:48     ` Vladimir Sementsov-Ogievskiy

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.