* [PATCH v5 00/45] Transactional block-graph modifying API
@ 2022-03-30 21:28 Vladimir Sementsov-Ogievskiy
  2022-03-30 21:28 ` [PATCH v5 01/45] block: BlockDriver: add .filtered_child_is_backing field Vladimir Sementsov-Ogievskiy
                   ` (44 more replies)
  0 siblings, 45 replies; 78+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2022-03-30 21:28 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, hreitz, vsementsov, v.sementsov-og, qemu-devel

Hi all!

v5: rebased on master, sorry for the noise

This is a big series that unites some of my previous ones and
completes them with the additions needed to finally implement a
block-graph modifying API. The series is called "v4" as it inherits
"[PATCH v3 00/11] blockdev-replace" (among other things).

After this series, we have blockdev-add, blockdev-del and
x-blockdev-replace transaction actions, which allow inserting and
removing filters.

An additional challenge is to avoid intermediate permission updates.
That's an existing paradigm of block-graph modifications: first do all
the modifications and then refresh the permissions. Now we should bring
this paradigm to block-graph modifying transactions: if several
graph-modifying commands are sequential in one transaction, permissions
are updated after the last of these commands. One application of this is
the possibility to correct the copy-before-write filter's permission
requirements (see last patch).
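
To illustrate the "modify first, refresh permissions once" paradigm
described above, here is a minimal sketch. It is a toy model and not
QEMU code: ToyGraph, ToyAction, toy_refresh_perms() and
toy_run_transaction() are hypothetical names, and the permission rule
(an upper bound on the number of edges) is an assumption chosen only so
that an intermediate state can be invalid while the final one is fine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model of a block graph; not QEMU's data structures. */
typedef struct ToyGraph {
    int edges;          /* number of parent->child edges */
    int perm_refreshes; /* how many times permissions were recomputed */
    int max_edges;      /* toy permission rule: more edges than this fails */
} ToyGraph;

typedef struct ToyAction {
    int edge_delta;     /* +1 inserts a node, -1 removes one */
} ToyAction;

/* Recompute permissions for the whole graph; true means they are valid. */
static bool toy_refresh_perms(ToyGraph *g)
{
    g->perm_refreshes++;
    return g->edges >= 0 && g->edges <= g->max_edges;
}

/*
 * Apply every action without checking permissions in between, then
 * refresh permissions once. On failure, roll the graph back.
 */
static bool toy_run_transaction(ToyGraph *g, const ToyAction *actions,
                                size_t n)
{
    int saved_edges = g->edges;

    for (size_t i = 0; i < n; i++) {
        g->edges += actions[i].edge_delta; /* no permission update here */
    }

    if (!toy_refresh_perms(g)) {
        g->edges = saved_edges;            /* abort: undo all actions */
        return false;
    }
    return true;
}
```

In this sketch, a transaction that inserts one node and removes another
succeeds even though the state between the two actions would violate the
toy rule, because permissions are only checked after the last action.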

I now unite all these things into one series because:
 - they depend on each other, and I have to rebase them together when
   something needs a fix or refactoring
 - just to resend with my new email address
If needed, parts may go in separately, and I can split them again if
necessary.

So, what is here:

1. "[PATCH 00/14] block: cleanup backing and file handling" series,
unchanged:

  block: BlockDriver: add .filtered_child_is_backing field
  block: introduce bdrv_open_file_child() helper
  block/blklogwrites: don't care to remove bs->file child on failure
  test-bdrv-graph-mod: update test_parallel_perm_update test case
  tests-bdrv-drain: bdrv_replace_test driver: declare supports_backing
  test-bdrv-graph-mod: fix filters to be filters
  block: document connection between child roles and
    bs->backing/bs->file
  block/snapshot: stress that we fallback to primary child
  Revert "block: Let replace_child_noperm free children"
  Revert "block: Let replace_child_tran keep indirect pointer"
  Revert "block: Restructure remove_file_or_backing_child()"
  Revert "block: Pass BdrvChild ** to replace_child_noperm"
  block: Manipulate bs->file / bs->backing pointers in .attach/.detach
  block/snapshot: drop indirection around bdrv_snapshot_fallback_ptr

2. implement bdrv_unref_tran() - the key thing to implement the
blockdev-del transaction action later.
This part inherits from "[PATCH 00/14] block: blockdev-del force=false".
Still, force=false is not implemented and qcow2 is untouched, as the
target now is transactional removal.

  block: refactor bdrv_remove_file_or_backing_child to bdrv_remove_child
  block: drop bdrv_detach_child()
  block: drop bdrv_remove_filter_or_cow_child
  block: bdrv_refresh_perms(): allow external tran
  block: refactor bdrv_list_refresh_perms to allow any list of nodes
  block: make permission update functions public
  block: add bdrv_try_set_aio_context_tran transaction action
  block: implemet bdrv_unref_tran()

3. Move blockdev.c transactions to util/transactions.c API.

  blockdev: refactor transaction to use Transaction API
  blockdev: transactions: rename some things
  blockdev: qmp_transaction: refactor loop to classic for
  blockdev: transaction: refactor handling transaction properties
  blockdev: qmp_transaction: drop extra generic layer

4. add blockdev-del transaction action

  qapi: block: add blockdev-del transaction action

5. add blockdev-add transaction action
(inherits from "[PATCH 0/2] blockdev-add transaction")

  block: introduce BDRV_O_NOPERM flag
  block: bdrv_insert_node(): use BDRV_O_NOPERM
  qapi: block: add blockdev-add transaction action
  iotests: add blockdev-add-transaction

6. add x-blockdev-replace command and transaction action
(inherits from "[PATCH v3 00/11] blockdev-replace")

  block-backend: blk_root(): drop const specifier on return type
  block/export: add blk_by_export_id()
  block: make bdrv_find_child() function public
  block: bdrv_replace_child_bs(): move to external transaction
  qapi: add x-blockdev-replace command
  qapi: add x-blockdev-replace transaction action
  block: bdrv_get_xdbg_block_graph(): report export ids
  iotests.py: qemu_img_create: use imgfmt by default
  iotests.py: introduce VM.assert_edges_list() method
  iotests.py: add VM.qmp_check() helper
  iotests: add filter-insertion

7. Correct the permission scheme of the copy-before-write filter, with
the help of the new design of the graph-modifying API.

  block: bdrv_open_inherit: create BlockBackend only when necessary
  block/copy-before-write: correct permission scheme

 block.c                                       | 871 ++++++++++--------
 block/blkdebug.c                              |   9 +-
 block/blklogwrites.c                          |  11 +-
 block/blkreplay.c                             |   7 +-
 block/blkverify.c                             |   9 +-
 block/block-backend.c                         |  11 +-
 block/bochs.c                                 |   7 +-
 block/cloop.c                                 |   7 +-
 block/commit.c                                |   1 +
 block/copy-before-write.c                     |  24 +-
 block/copy-on-read.c                          |   9 +-
 block/crypto.c                                |  11 +-
 block/dmg.c                                   |   7 +-
 block/export/export.c                         |  31 +
 block/filter-compress.c                       |   6 +-
 block/mirror.c                                |   1 +
 block/monitor/block-hmp-cmds.c                |   2 +-
 block/parallels.c                             |   7 +-
 block/preallocate.c                           |   9 +-
 block/qcow.c                                  |   6 +-
 block/qcow2.c                                 |   8 +-
 block/qed.c                                   |   8 +-
 block/raw-format.c                            |   4 +-
 block/replication.c                           |   8 +-
 block/snapshot.c                              |  60 +-
 block/throttle.c                              |   8 +-
 block/vdi.c                                   |   7 +-
 block/vhdx.c                                  |   7 +-
 block/vmdk.c                                  |   7 +-
 block/vpc.c                                   |   7 +-
 blockdev.c                                    | 842 +++++++++--------
 include/block/block-common.h                  |  47 +-
 include/block/block-global-state.h            |  24 +-
 include/block/block_int-common.h              |  36 +-
 include/block/block_int-global-state.h        |   3 +-
 include/block/block_int-io.h                  |   1 +
 include/block/export.h                        |   1 +
 include/sysemu/block-backend-global-state.h   |   3 +-
 qapi/block-core.json                          |  73 +-
 qapi/transaction.json                         |  35 +-
 stubs/blk-by-qdev-id.c                        |   9 +
 stubs/blk-exp-find-by-blk.c                   |   9 +
 stubs/meson.build                             |   2 +
 tests/qemu-iotests/iotests.py                 |  23 +
 .../tests/blockdev-add-transaction            |  52 ++
 .../tests/blockdev-add-transaction.out        |   6 +
 tests/qemu-iotests/tests/filter-insertion     | 253 +++++
 tests/qemu-iotests/tests/filter-insertion.out |   5 +
 tests/qemu-iotests/tests/image-fleecing       |  20 +-
 tests/qemu-iotests/tests/image-fleecing.out   |   8 -
 tests/unit/test-bdrv-drain.c                  |  11 +-
 tests/unit/test-bdrv-graph-mod.c              |  94 +-
 52 files changed, 1725 insertions(+), 1002 deletions(-)
 create mode 100644 stubs/blk-by-qdev-id.c
 create mode 100644 stubs/blk-exp-find-by-blk.c
 create mode 100755 tests/qemu-iotests/tests/blockdev-add-transaction
 create mode 100644 tests/qemu-iotests/tests/blockdev-add-transaction.out
 create mode 100755 tests/qemu-iotests/tests/filter-insertion
 create mode 100644 tests/qemu-iotests/tests/filter-insertion.out

-- 
2.35.1




* [PATCH v5 01/45] block: BlockDriver: add .filtered_child_is_backing field
  2022-03-30 21:28 [PATCH v5 00/45] Transactional block-graph modifying API Vladimir Sementsov-Ogievskiy
@ 2022-03-30 21:28 ` Vladimir Sementsov-Ogievskiy
  2022-06-07  9:57   ` Hanna Reitz
  2022-03-30 21:28 ` [PATCH v5 02/45] block: introduce bdrv_open_file_child() helper Vladimir Sementsov-Ogievskiy
                   ` (43 subsequent siblings)
  44 siblings, 1 reply; 78+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2022-03-30 21:28 UTC (permalink / raw)
  To: qemu-block
  Cc: kwolf, v.sementsov-og, qemu-devel, hreitz, vsementsov, John Snow

Unfortunately, not all filters use the .file child as the filtered
child. The two exceptions are mirror_top and commit_top. Happily, they
are both private filters. The bad thing is that this inconsistency is
observable through the QMP commands query-block /
query-named-block-nodes. So whether we could just change mirror_top and
commit_top to use the file child, as all other filter drivers do, is an
open question. Probably we could do that with some kind of deprecation
period, but how would we warn users during it?

For now, let's just add a field so we can distinguish them in generic
code; it will be used in further commits.
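
As an illustration of how generic code might use such a field, here is
a minimal sketch. It is not part of this patch: ToyDriver, ToyNode,
ToyChild and toy_filtered_child() are hypothetical stand-ins, and only
the two flags mirror the real .is_filter and .filtered_child_is_backing
fields.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for BdrvChild, BlockDriver and BlockDriverState. */
typedef struct ToyChild {
    const char *name;
} ToyChild;

typedef struct ToyDriver {
    bool is_filter;
    bool filtered_child_is_backing; /* only meaningful when is_filter */
} ToyDriver;

typedef struct ToyNode {
    const ToyDriver *drv;
    ToyChild *file;
    ToyChild *backing;
} ToyNode;

/*
 * Return the filtered child of a filter node, or NULL for non-filters.
 * Filters normally filter node->file; drivers that set
 * filtered_child_is_backing filter node->backing instead.
 */
static ToyChild *toy_filtered_child(const ToyNode *node)
{
    if (!node->drv->is_filter) {
        /* The flag must stay false for non-filter drivers. */
        assert(!node->drv->filtered_child_is_backing);
        return NULL;
    }
    return node->drv->filtered_child_is_backing ? node->backing
                                                : node->file;
}
```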

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
---
 block/commit.c                   |  1 +
 block/mirror.c                   |  1 +
 include/block/block_int-common.h | 13 +++++++++++++
 3 files changed, 15 insertions(+)

diff --git a/block/commit.c b/block/commit.c
index 851d1c557a..7722a392af 100644
--- a/block/commit.c
+++ b/block/commit.c
@@ -238,6 +238,7 @@ static BlockDriver bdrv_commit_top = {
     .bdrv_child_perm            = bdrv_commit_top_child_perm,
 
     .is_filter                  = true,
+    .filtered_child_is_backing  = true,
 };
 
 void commit_start(const char *job_id, BlockDriverState *bs,
diff --git a/block/mirror.c b/block/mirror.c
index d8ecb9efa2..824b273fc7 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -1578,6 +1578,7 @@ static BlockDriver bdrv_mirror_top = {
     .bdrv_child_perm            = bdrv_mirror_top_child_perm,
 
     .is_filter                  = true,
+    .filtered_child_is_backing  = true,
 };
 
 static BlockJob *mirror_start_job(
diff --git a/include/block/block_int-common.h b/include/block/block_int-common.h
index 8947abab76..9d91ccbcbf 100644
--- a/include/block/block_int-common.h
+++ b/include/block/block_int-common.h
@@ -119,6 +119,19 @@ struct BlockDriver {
      * (And this filtered child must then be bs->file or bs->backing.)
      */
     bool is_filter;
+    /*
+     * Only make sense for filter drivers, for others must be false.
+     * If true, filtered child is bs->backing. Otherwise it's bs->file.
+     * Only two internal filters use bs->backing as filtered child and has this
+     * field set to true: mirror_top and commit_top.
+     *
+     * Never create any more such filters!
+     *
+     * TODO: imagine how to deprecate this behavior and make all filters work
+     * similarly using bs->file as filtered child.
+     */
+    bool filtered_child_is_backing;
+
     /*
      * Set to true if the BlockDriver is a format driver.  Format nodes
      * generally do not expect their children to be other format nodes
-- 
2.35.1




* [PATCH v5 02/45] block: introduce bdrv_open_file_child() helper
  2022-03-30 21:28 [PATCH v5 00/45] Transactional block-graph modifying API Vladimir Sementsov-Ogievskiy
  2022-03-30 21:28 ` [PATCH v5 01/45] block: BlockDriver: add .filtered_child_is_backing field Vladimir Sementsov-Ogievskiy
@ 2022-03-30 21:28 ` Vladimir Sementsov-Ogievskiy
  2022-06-07  9:57   ` Hanna Reitz
  2022-03-30 21:28 ` [PATCH v5 03/45] block/blklogwrites: don't care to remove bs->file child on failure Vladimir Sementsov-Ogievskiy
                   ` (42 subsequent siblings)
  44 siblings, 1 reply; 78+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2022-03-30 21:28 UTC (permalink / raw)
  To: qemu-block
  Cc: kwolf, Fam Zheng, v.sementsov-og, Stefan Hajnoczi, Jeff Cody,
	Wen Congyang, Xie Changlong, qemu-devel, hreitz, vsementsov,
	Pavel Dovgalyuk, Denis V. Lunev, Paolo Bonzini, Stefan Weil,
	John Snow, Ari Sundholm

Almost all drivers call bdrv_open_child() similarly. Let's create a
helper for this.

The only driver not updated that calls bdrv_open_child() to set
bs->file is raw-format, as it sometimes wants to have a filtered child
but doesn't set drv->is_filter to true.

Possibly we should implement a drv->is_filter_func() handler, to
consider raw-format a filter when it works as a filter. But that's
another story.

Note also that we reduce the number of assignments to bs->file in the
code: this helps us restrict modification of this field in a further
commit.
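
The shape of such a helper can be sketched in isolation as follows.
This is a toy model only: ToyParent, ToyChild, the TOY_ROLE_* values and
the simulate_failure parameter are hypothetical and do not correspond to
real QEMU definitions. It shows the two things the helper centralizes:
choosing the child role from is_filter, and turning a NULL child into a
negative errno return value.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy role bits; the values are arbitrary and not QEMU's. */
enum {
    TOY_ROLE_IMAGE    = 1 << 0,
    TOY_ROLE_FILTERED = 1 << 1,
    TOY_ROLE_PRIMARY  = 1 << 2,
};

typedef struct ToyChild {
    unsigned role;
} ToyChild;

typedef struct ToyParent {
    bool is_filter;
    ToyChild *file; /* primary child, set only by the helper */
    ToyChild slot;  /* storage for the opened child in this toy model */
} ToyParent;

/* Stand-in for the generic open call: returns NULL on failure. */
static ToyChild *toy_open_child(ToyParent *parent, unsigned role,
                                bool simulate_failure)
{
    if (simulate_failure) {
        return NULL;
    }
    parent->slot.role = role;
    return &parent->slot;
}

/* Open the primary child: pick the role, assign file, return 0 or -EINVAL. */
static int toy_open_file_child(ToyParent *parent, bool simulate_failure)
{
    unsigned role = parent->is_filter
        ? (TOY_ROLE_FILTERED | TOY_ROLE_PRIMARY)
        : TOY_ROLE_IMAGE;

    parent->file = toy_open_child(parent, role, simulate_failure);

    return parent->file ? 0 : -EINVAL;
}
```

A caller then only checks the return value, as the driver hunks of this
patch do with "if (ret < 0)".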

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
---
 block.c                            | 21 +++++++++++++++++++++
 block/blkdebug.c                   |  9 +++------
 block/blklogwrites.c               |  7 ++-----
 block/blkreplay.c                  |  7 ++-----
 block/blkverify.c                  |  9 +++------
 block/bochs.c                      |  7 +++----
 block/cloop.c                      |  7 +++----
 block/copy-before-write.c          |  9 ++++-----
 block/copy-on-read.c               |  9 ++++-----
 block/crypto.c                     | 11 ++++++-----
 block/dmg.c                        |  7 +++----
 block/filter-compress.c            |  6 ++----
 block/parallels.c                  |  7 +++----
 block/preallocate.c                |  9 ++++-----
 block/qcow.c                       |  6 ++----
 block/qcow2.c                      |  8 ++++----
 block/qed.c                        |  8 ++++----
 block/replication.c                |  8 +++-----
 block/throttle.c                   |  8 +++-----
 block/vdi.c                        |  7 +++----
 block/vhdx.c                       |  7 +++----
 block/vmdk.c                       |  7 +++----
 block/vpc.c                        |  7 +++----
 include/block/block-global-state.h |  3 +++
 24 files changed, 94 insertions(+), 100 deletions(-)

diff --git a/block.c b/block.c
index 718e4cae8b..8110b1b330 100644
--- a/block.c
+++ b/block.c
@@ -3666,6 +3666,27 @@ BdrvChild *bdrv_open_child(const char *filename,
                              errp);
 }
 
+/*
+ * Wrapper on bdrv_open_child() for most popular case: open primary child of bs.
+ */
+int bdrv_open_file_child(const char *filename,
+                         QDict *options, const char *bdref_key,
+                         BlockDriverState *parent, Error **errp)
+{
+    BdrvChildRole role;
+
+    /* commit_top and mirror_top don't use this function */
+    assert(!parent->drv->filtered_child_is_backing);
+
+    role = parent->drv->is_filter ?
+        (BDRV_CHILD_FILTERED | BDRV_CHILD_PRIMARY) : BDRV_CHILD_IMAGE;
+
+    parent->file = bdrv_open_child(filename, options, bdref_key, parent,
+                                   &child_of_bds, role, false, errp);
+
+    return parent->file ? 0 : -EINVAL;
+}
+
 /*
  * TODO Future callers may need to specify parent/child_class in order for
  * option inheritance to work. Existing callers use it for the root node.
diff --git a/block/blkdebug.c b/block/blkdebug.c
index bbf2948703..5fcfc8ac6f 100644
--- a/block/blkdebug.c
+++ b/block/blkdebug.c
@@ -503,12 +503,9 @@ static int blkdebug_open(BlockDriverState *bs, QDict *options, int flags,
     }
 
     /* Open the image file */
-    bs->file = bdrv_open_child(qemu_opt_get(opts, "x-image"), options, "image",
-                               bs, &child_of_bds,
-                               BDRV_CHILD_FILTERED | BDRV_CHILD_PRIMARY,
-                               false, errp);
-    if (!bs->file) {
-        ret = -EINVAL;
+    ret = bdrv_open_file_child(qemu_opt_get(opts, "x-image"), options, "image",
+                               bs, errp);
+    if (ret < 0) {
         goto out;
     }
 
diff --git a/block/blklogwrites.c b/block/blklogwrites.c
index f7a251e91f..f66a617eb3 100644
--- a/block/blklogwrites.c
+++ b/block/blklogwrites.c
@@ -155,11 +155,8 @@ static int blk_log_writes_open(BlockDriverState *bs, QDict *options, int flags,
     }
 
     /* Open the file */
-    bs->file = bdrv_open_child(NULL, options, "file", bs, &child_of_bds,
-                               BDRV_CHILD_FILTERED | BDRV_CHILD_PRIMARY, false,
-                               errp);
-    if (!bs->file) {
-        ret = -EINVAL;
+    ret = bdrv_open_file_child(NULL, options, "file", bs, errp);
+    if (ret < 0) {
         goto fail;
     }
 
diff --git a/block/blkreplay.c b/block/blkreplay.c
index dcbe780ddb..76a0b8d12a 100644
--- a/block/blkreplay.c
+++ b/block/blkreplay.c
@@ -26,11 +26,8 @@ static int blkreplay_open(BlockDriverState *bs, QDict *options, int flags,
     int ret;
 
     /* Open the image file */
-    bs->file = bdrv_open_child(NULL, options, "image", bs, &child_of_bds,
-                               BDRV_CHILD_FILTERED | BDRV_CHILD_PRIMARY,
-                               false, errp);
-    if (!bs->file) {
-        ret = -EINVAL;
+    ret = bdrv_open_file_child(NULL, options, "image", bs, errp);
+    if (ret < 0) {
         goto fail;
     }
 
diff --git a/block/blkverify.c b/block/blkverify.c
index e4a37af3b2..e4d40a63aa 100644
--- a/block/blkverify.c
+++ b/block/blkverify.c
@@ -122,12 +122,9 @@ static int blkverify_open(BlockDriverState *bs, QDict *options, int flags,
     }
 
     /* Open the raw file */
-    bs->file = bdrv_open_child(qemu_opt_get(opts, "x-raw"), options, "raw",
-                               bs, &child_of_bds,
-                               BDRV_CHILD_FILTERED | BDRV_CHILD_PRIMARY,
-                               false, errp);
-    if (!bs->file) {
-        ret = -EINVAL;
+    ret = bdrv_open_file_child(qemu_opt_get(opts, "x-raw"), options, "raw",
+                               bs, errp);
+    if (ret < 0) {
         goto fail;
     }
 
diff --git a/block/bochs.c b/block/bochs.c
index 4d68658087..b2dc06bbfd 100644
--- a/block/bochs.c
+++ b/block/bochs.c
@@ -110,10 +110,9 @@ static int bochs_open(BlockDriverState *bs, QDict *options, int flags,
         return ret;
     }
 
-    bs->file = bdrv_open_child(NULL, options, "file", bs, &child_of_bds,
-                               BDRV_CHILD_IMAGE, false, errp);
-    if (!bs->file) {
-        return -EINVAL;
+    ret = bdrv_open_file_child(NULL, options, "file", bs, errp);
+    if (ret < 0) {
+        return ret;
     }
 
     ret = bdrv_pread(bs->file, 0, &bochs, sizeof(bochs));
diff --git a/block/cloop.c b/block/cloop.c
index b8c6d0eccd..bee87da173 100644
--- a/block/cloop.c
+++ b/block/cloop.c
@@ -71,10 +71,9 @@ static int cloop_open(BlockDriverState *bs, QDict *options, int flags,
         return ret;
     }
 
-    bs->file = bdrv_open_child(NULL, options, "file", bs, &child_of_bds,
-                               BDRV_CHILD_IMAGE, false, errp);
-    if (!bs->file) {
-        return -EINVAL;
+    ret = bdrv_open_file_child(NULL, options, "file", bs, errp);
+    if (ret < 0) {
+        return ret;
     }
 
     /* read header */
diff --git a/block/copy-before-write.c b/block/copy-before-write.c
index a8a06fdc09..4fad564691 100644
--- a/block/copy-before-write.c
+++ b/block/copy-before-write.c
@@ -376,12 +376,11 @@ static int cbw_open(BlockDriverState *bs, QDict *options, int flags,
     BDRVCopyBeforeWriteState *s = bs->opaque;
     BdrvDirtyBitmap *bitmap = NULL;
     int64_t cluster_size;
+    int ret;
 
-    bs->file = bdrv_open_child(NULL, options, "file", bs, &child_of_bds,
-                               BDRV_CHILD_FILTERED | BDRV_CHILD_PRIMARY,
-                               false, errp);
-    if (!bs->file) {
-        return -EINVAL;
+    ret = bdrv_open_file_child(NULL, options, "file", bs, errp);
+    if (ret < 0) {
+        return ret;
     }
 
     s->target = bdrv_open_child(NULL, options, "target", bs, &child_of_bds,
diff --git a/block/copy-on-read.c b/block/copy-on-read.c
index 1fc7fb3333..815ac1d835 100644
--- a/block/copy-on-read.c
+++ b/block/copy-on-read.c
@@ -41,12 +41,11 @@ static int cor_open(BlockDriverState *bs, QDict *options, int flags,
     BDRVStateCOR *state = bs->opaque;
     /* Find a bottom node name, if any */
     const char *bottom_node = qdict_get_try_str(options, "bottom");
+    int ret;
 
-    bs->file = bdrv_open_child(NULL, options, "file", bs, &child_of_bds,
-                               BDRV_CHILD_FILTERED | BDRV_CHILD_PRIMARY,
-                               false, errp);
-    if (!bs->file) {
-        return -EINVAL;
+    ret = bdrv_open_file_child(NULL, options, "file", bs, errp);
+    if (ret < 0) {
+        return ret;
     }
 
     bs->supported_read_flags = BDRV_REQ_PREFETCH;
diff --git a/block/crypto.c b/block/crypto.c
index 1ba82984ef..e165447a5b 100644
--- a/block/crypto.c
+++ b/block/crypto.c
@@ -261,15 +261,14 @@ static int block_crypto_open_generic(QCryptoBlockFormat format,
 {
     BlockCrypto *crypto = bs->opaque;
     QemuOpts *opts = NULL;
-    int ret = -EINVAL;
+    int ret;
     QCryptoBlockOpenOptions *open_opts = NULL;
     unsigned int cflags = 0;
     QDict *cryptoopts = NULL;
 
-    bs->file = bdrv_open_child(NULL, options, "file", bs, &child_of_bds,
-                               BDRV_CHILD_IMAGE, false, errp);
-    if (!bs->file) {
-        return -EINVAL;
+    ret = bdrv_open_file_child(NULL, options, "file", bs, errp);
+    if (ret < 0) {
+        return ret;
     }
 
     bs->supported_write_flags = BDRV_REQ_FUA &
@@ -277,6 +276,7 @@ static int block_crypto_open_generic(QCryptoBlockFormat format,
 
     opts = qemu_opts_create(opts_spec, NULL, 0, &error_abort);
     if (!qemu_opts_absorb_qdict(opts, options, errp)) {
+        ret = -EINVAL;
         goto cleanup;
     }
 
@@ -285,6 +285,7 @@ static int block_crypto_open_generic(QCryptoBlockFormat format,
 
     open_opts = block_crypto_open_opts_init(cryptoopts, errp);
     if (!open_opts) {
+        ret = -EINVAL;
         goto cleanup;
     }
 
diff --git a/block/dmg.c b/block/dmg.c
index c626587f9c..f91c5d3980 100644
--- a/block/dmg.c
+++ b/block/dmg.c
@@ -440,10 +440,9 @@ static int dmg_open(BlockDriverState *bs, QDict *options, int flags,
         return ret;
     }
 
-    bs->file = bdrv_open_child(NULL, options, "file", bs, &child_of_bds,
-                               BDRV_CHILD_IMAGE, false, errp);
-    if (!bs->file) {
-        return -EINVAL;
+    ret = bdrv_open_file_child(NULL, options, "file", bs, errp);
+    if (ret < 0) {
+        return ret;
     }
 
     block_module_load_one("dmg-bz2");
diff --git a/block/filter-compress.c b/block/filter-compress.c
index d5be538619..b2cfa9a9a5 100644
--- a/block/filter-compress.c
+++ b/block/filter-compress.c
@@ -30,10 +30,8 @@
 static int compress_open(BlockDriverState *bs, QDict *options, int flags,
                          Error **errp)
 {
-    bs->file = bdrv_open_child(NULL, options, "file", bs, &child_of_bds,
-                               BDRV_CHILD_FILTERED | BDRV_CHILD_PRIMARY,
-                               false, errp);
-    if (!bs->file) {
+    int ret = bdrv_open_file_child(NULL, options, "file", bs, errp);
+    if (ret < 0) {
         return -EINVAL;
     }
 
diff --git a/block/parallels.c b/block/parallels.c
index cd23e02d06..c55f1af5da 100644
--- a/block/parallels.c
+++ b/block/parallels.c
@@ -736,10 +736,9 @@ static int parallels_open(BlockDriverState *bs, QDict *options, int flags,
     Error *local_err = NULL;
     char *buf;
 
-    bs->file = bdrv_open_child(NULL, options, "file", bs, &child_of_bds,
-                               BDRV_CHILD_IMAGE, false, errp);
-    if (!bs->file) {
-        return -EINVAL;
+    ret = bdrv_open_file_child(NULL, options, "file", bs, errp);
+    if (ret < 0) {
+        return ret;
     }
 
     ret = bdrv_pread(bs->file, 0, &ph, sizeof(ph));
diff --git a/block/preallocate.c b/block/preallocate.c
index e15cb8c74a..d50ee7f49b 100644
--- a/block/preallocate.c
+++ b/block/preallocate.c
@@ -134,6 +134,7 @@ static int preallocate_open(BlockDriverState *bs, QDict *options, int flags,
                             Error **errp)
 {
     BDRVPreallocateState *s = bs->opaque;
+    int ret;
 
     /*
      * s->data_end and friends should be initialized on permission update.
@@ -141,11 +142,9 @@ static int preallocate_open(BlockDriverState *bs, QDict *options, int flags,
      */
     s->file_end = s->zero_start = s->data_end = -EINVAL;
 
-    bs->file = bdrv_open_child(NULL, options, "file", bs, &child_of_bds,
-                               BDRV_CHILD_FILTERED | BDRV_CHILD_PRIMARY,
-                               false, errp);
-    if (!bs->file) {
-        return -EINVAL;
+    ret = bdrv_open_file_child(NULL, options, "file", bs, errp);
+    if (ret < 0) {
+        return ret;
     }
 
     if (!preallocate_absorb_opts(&s->opts, options, bs->file->bs, errp)) {
diff --git a/block/qcow.c b/block/qcow.c
index 4fba1b9e36..b4033c0db9 100644
--- a/block/qcow.c
+++ b/block/qcow.c
@@ -121,10 +121,8 @@ static int qcow_open(BlockDriverState *bs, QDict *options, int flags,
     qdict_extract_subqdict(options, &encryptopts, "encrypt.");
     encryptfmt = qdict_get_try_str(encryptopts, "format");
 
-    bs->file = bdrv_open_child(NULL, options, "file", bs, &child_of_bds,
-                               BDRV_CHILD_IMAGE, false, errp);
-    if (!bs->file) {
-        ret = -EINVAL;
+    ret = bdrv_open_file_child(NULL, options, "file", bs, errp);
+    if (ret < 0) {
         goto fail;
     }
 
diff --git a/block/qcow2.c b/block/qcow2.c
index b5c47931ef..6de8dbef32 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1891,11 +1891,11 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags,
         .errp = errp,
         .ret = -EINPROGRESS
     };
+    int ret;
 
-    bs->file = bdrv_open_child(NULL, options, "file", bs, &child_of_bds,
-                               BDRV_CHILD_IMAGE, false, errp);
-    if (!bs->file) {
-        return -EINVAL;
+    ret = bdrv_open_file_child(NULL, options, "file", bs, errp);
+    if (ret < 0) {
+        return ret;
     }
 
     /* Initialise locks */
diff --git a/block/qed.c b/block/qed.c
index f34d9a3ac1..1ff024f16d 100644
--- a/block/qed.c
+++ b/block/qed.c
@@ -559,11 +559,11 @@ static int bdrv_qed_open(BlockDriverState *bs, QDict *options, int flags,
         .errp = errp,
         .ret = -EINPROGRESS
     };
+    int ret;
 
-    bs->file = bdrv_open_child(NULL, options, "file", bs, &child_of_bds,
-                               BDRV_CHILD_IMAGE, false, errp);
-    if (!bs->file) {
-        return -EINVAL;
+    ret = bdrv_open_file_child(NULL, options, "file", bs, errp);
+    if (ret < 0) {
+        return ret;
     }
 
     bdrv_qed_init_state(bs);
diff --git a/block/replication.c b/block/replication.c
index 55c8f894aa..2f17397764 100644
--- a/block/replication.c
+++ b/block/replication.c
@@ -88,11 +88,9 @@ static int replication_open(BlockDriverState *bs, QDict *options,
     const char *mode;
     const char *top_id;
 
-    bs->file = bdrv_open_child(NULL, options, "file", bs, &child_of_bds,
-                               BDRV_CHILD_FILTERED | BDRV_CHILD_PRIMARY,
-                               false, errp);
-    if (!bs->file) {
-        return -EINVAL;
+    ret = bdrv_open_file_child(NULL, options, "file", bs, errp);
+    if (ret < 0) {
+        return ret;
     }
 
     ret = -EINVAL;
diff --git a/block/throttle.c b/block/throttle.c
index 6e8d52fa24..4fb5798c27 100644
--- a/block/throttle.c
+++ b/block/throttle.c
@@ -78,11 +78,9 @@ static int throttle_open(BlockDriverState *bs, QDict *options,
     char *group;
     int ret;
 
-    bs->file = bdrv_open_child(NULL, options, "file", bs, &child_of_bds,
-                               BDRV_CHILD_FILTERED | BDRV_CHILD_PRIMARY,
-                               false, errp);
-    if (!bs->file) {
-        return -EINVAL;
+    ret = bdrv_open_file_child(NULL, options, "file", bs, errp);
+    if (ret < 0) {
+        return ret;
     }
     bs->supported_write_flags = bs->file->bs->supported_write_flags |
                                 BDRV_REQ_WRITE_UNCHANGED;
diff --git a/block/vdi.c b/block/vdi.c
index cca3a3a356..a539081138 100644
--- a/block/vdi.c
+++ b/block/vdi.c
@@ -377,10 +377,9 @@ static int vdi_open(BlockDriverState *bs, QDict *options, int flags,
     int ret;
     QemuUUID uuid_link, uuid_parent;
 
-    bs->file = bdrv_open_child(NULL, options, "file", bs, &child_of_bds,
-                               BDRV_CHILD_IMAGE, false, errp);
-    if (!bs->file) {
-        return -EINVAL;
+    ret = bdrv_open_file_child(NULL, options, "file", bs, errp);
+    if (ret < 0) {
+        return ret;
     }
 
     logout("\n");
diff --git a/block/vhdx.c b/block/vhdx.c
index 410c6f9610..994f8b91cc 100644
--- a/block/vhdx.c
+++ b/block/vhdx.c
@@ -997,10 +997,9 @@ static int vhdx_open(BlockDriverState *bs, QDict *options, int flags,
     uint64_t signature;
     Error *local_err = NULL;
 
-    bs->file = bdrv_open_child(NULL, options, "file", bs, &child_of_bds,
-                               BDRV_CHILD_IMAGE, false, errp);
-    if (!bs->file) {
-        return -EINVAL;
+    ret = bdrv_open_file_child(NULL, options, "file", bs, errp);
+    if (ret < 0) {
+        return ret;
     }
 
     s->bat = NULL;
diff --git a/block/vmdk.c b/block/vmdk.c
index 37c0946066..9fd417b4a3 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -1263,10 +1263,9 @@ static int vmdk_open(BlockDriverState *bs, QDict *options, int flags,
     BDRVVmdkState *s = bs->opaque;
     uint32_t magic;
 
-    bs->file = bdrv_open_child(NULL, options, "file", bs, &child_of_bds,
-                               BDRV_CHILD_IMAGE, false, errp);
-    if (!bs->file) {
-        return -EINVAL;
+    ret = bdrv_open_file_child(NULL, options, "file", bs, errp);
+    if (ret < 0) {
+        return ret;
     }
 
     buf = vmdk_read_desc(bs->file, 0, errp);
diff --git a/block/vpc.c b/block/vpc.c
index 4d8f16e199..cd82eb7f92 100644
--- a/block/vpc.c
+++ b/block/vpc.c
@@ -233,10 +233,9 @@ static int vpc_open(BlockDriverState *bs, QDict *options, int flags,
     int ret;
     int64_t bs_size;
 
-    bs->file = bdrv_open_child(NULL, options, "file", bs, &child_of_bds,
-                               BDRV_CHILD_IMAGE, false, errp);
-    if (!bs->file) {
-        return -EINVAL;
+    ret = bdrv_open_file_child(NULL, options, "file", bs, errp);
+    if (ret < 0) {
+        return ret;
     }
 
     opts = qemu_opts_create(&vpc_runtime_opts, NULL, 0, &error_abort);
diff --git a/include/block/block-global-state.h b/include/block/block-global-state.h
index 25bb69bbef..600afcf5bd 100644
--- a/include/block/block-global-state.h
+++ b/include/block/block-global-state.h
@@ -76,6 +76,9 @@ BdrvChild *bdrv_open_child(const char *filename,
                            const BdrvChildClass *child_class,
                            BdrvChildRole child_role,
                            bool allow_none, Error **errp);
+int bdrv_open_file_child(const char *filename,
+                         QDict *options, const char *bdref_key,
+                         BlockDriverState *parent, Error **errp);
 BlockDriverState *bdrv_open_blockdev_ref(BlockdevRef *ref, Error **errp);
 int bdrv_set_backing_hd(BlockDriverState *bs, BlockDriverState *backing_hd,
                         Error **errp);
-- 
2.35.1




* [PATCH v5 03/45] block/blklogwrites: don't care to remove bs->file child on failure
  2022-03-30 21:28 [PATCH v5 00/45] Transactional block-graph modifying API Vladimir Sementsov-Ogievskiy
  2022-03-30 21:28 ` [PATCH v5 01/45] block: BlockDriver: add .filtered_child_is_backing field Vladimir Sementsov-Ogievskiy
  2022-03-30 21:28 ` [PATCH v5 02/45] block: introduce bdrv_open_file_child() helper Vladimir Sementsov-Ogievskiy
@ 2022-03-30 21:28 ` Vladimir Sementsov-Ogievskiy
  2022-06-07 10:05   ` Hanna Reitz
  2022-03-30 21:28 ` [PATCH v5 04/45] test-bdrv-graph-mod: update test_parallel_perm_update test case Vladimir Sementsov-Ogievskiy
                   ` (41 subsequent siblings)
  44 siblings, 1 reply; 78+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2022-03-30 21:28 UTC (permalink / raw)
  To: qemu-block
  Cc: kwolf, v.sementsov-og, qemu-devel, hreitz, vsementsov, Ari Sundholm

We don't need to remove bs->file; the generic layer takes care of it. No
other driver removes bs->file by hand on failure.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
---
 block/blklogwrites.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/block/blklogwrites.c b/block/blklogwrites.c
index f66a617eb3..7d25df97cc 100644
--- a/block/blklogwrites.c
+++ b/block/blklogwrites.c
@@ -254,10 +254,6 @@ fail_log:
         s->log_file = NULL;
     }
 fail:
-    if (ret < 0) {
-        bdrv_unref_child(bs, bs->file);
-        bs->file = NULL;
-    }
     qemu_opts_del(opts);
     return ret;
 }
-- 
2.35.1




* [PATCH v5 04/45] test-bdrv-graph-mod: update test_parallel_perm_update test case
  2022-03-30 21:28 [PATCH v5 00/45] Transactional block-graph modifying API Vladimir Sementsov-Ogievskiy
                   ` (2 preceding siblings ...)
  2022-03-30 21:28 ` [PATCH v5 03/45] block/blklogwrites: don't care to remove bs->file child on failure Vladimir Sementsov-Ogievskiy
@ 2022-03-30 21:28 ` Vladimir Sementsov-Ogievskiy
  2022-06-07 10:53   ` Hanna Reitz
  2022-03-30 21:28 ` [PATCH v5 05/45] tests-bdrv-drain: bdrv_replace_test driver: declare supports_backing Vladimir Sementsov-Ogievskiy
                   ` (40 subsequent siblings)
  44 siblings, 1 reply; 78+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2022-03-30 21:28 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, hreitz, vsementsov, v.sementsov-og, qemu-devel

test_parallel_perm_update() does two things that we are going to
restrict in the near future:

1. It updates the bs->file field by hand. bs->file will be managed
   automatically by generic code (together with the bs->children list).

   Let's refactor our "tricky" bds to have its own state, where one
   of the children is linked as "selected".
   This also looks less "tricky", so avoid using this word.

2. It creates FILTERED children that are not PRIMARY. Except in tests,
   all FILTERED children in the QEMU block layer are always PRIMARY as
   well.  We are going to formalize this rule, so let's use DATA
   children here.

While at it, update the picture to better correspond to the test
code.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
---
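The per-child permission decision described above can be sketched outside
QEMU as follows. This is only a toy model under stated assumptions: the
Toy* types, the TOY_PERM_* values and toy_refresh() are hypothetical
stand-ins and not QEMU API, and the "not selected" branch (take nothing,
share everything) is an assumption, since that part of the real callback is
not shown in the hunk. It does not model the graph ordering problem
(topological sort vs. plain DFS) that the test itself exercises.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy stand-ins for BLK_PERM_* (values are illustrative only). */
enum { TOY_PERM_WRITE = 1u << 1, TOY_PERM_ALL = 0x1f };

typedef struct ToyChild {
    uint64_t perm;    /* permissions the parent takes on this child */
    uint64_t shared;  /* permissions the parent lets others hold */
} ToyChild;

/* Driver-private state, analogous to BDRVWriteToSelectedState. */
typedef struct ToyWriteToSelected {
    ToyChild *selected;
} ToyWriteToSelected;

/*
 * Same decision as write_to_selected_perms(): exclusive write on the
 * selected child. The else branch is an assumption made for this sketch.
 */
static void toy_child_perm(const ToyWriteToSelected *s, const ToyChild *c,
                           uint64_t *nperm, uint64_t *nshared)
{
    if (s->selected && c == s->selected) {
        *nperm = TOY_PERM_WRITE;
        *nshared = TOY_PERM_ALL & ~(uint64_t)TOY_PERM_WRITE;
    } else {
        *nperm = 0;
        *nshared = TOY_PERM_ALL;
    }
}

/* Recompute permissions of every child from the current driver state. */
static void toy_refresh(const ToyWriteToSelected *s, ToyChild **children,
                        size_t n)
{
    for (size_t i = 0; i < n; i++) {
        toy_child_perm(s, children[i],
                       &children[i]->perm, &children[i]->shared);
    }
}

/* Mirrors the test flow: select fl1, refresh, switch to fl2, refresh. */
static int toy_demo(void)
{
    ToyChild fl1 = {0, 0}, fl2 = {0, 0};
    ToyChild *children[] = { &fl1, &fl2 };
    ToyWriteToSelected s = { .selected = &fl1 };

    toy_refresh(&s, children, 2);
    if (!(fl1.perm & TOY_PERM_WRITE) || (fl2.perm & TOY_PERM_WRITE)) {
        return 0;
    }

    s.selected = &fl2;
    toy_refresh(&s, children, 2);
    if (!(fl2.perm & TOY_PERM_WRITE) || (fl1.perm & TOY_PERM_WRITE)) {
        return 0;
    }
    return 1;
}
```

Keeping the selection in driver-private state, rather than in bs->file,
is what lets generic code own bs->file without the test interfering.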
 tests/unit/test-bdrv-graph-mod.c | 70 ++++++++++++++++++++------------
 1 file changed, 44 insertions(+), 26 deletions(-)

diff --git a/tests/unit/test-bdrv-graph-mod.c b/tests/unit/test-bdrv-graph-mod.c
index a6e3bb79be..40795d3c04 100644
--- a/tests/unit/test-bdrv-graph-mod.c
+++ b/tests/unit/test-bdrv-graph-mod.c
@@ -241,13 +241,26 @@ static void test_parallel_exclusive_write(void)
     bdrv_unref(top);
 }
 
-static void write_to_file_perms(BlockDriverState *bs, BdrvChild *c,
-                                     BdrvChildRole role,
-                                     BlockReopenQueue *reopen_queue,
-                                     uint64_t perm, uint64_t shared,
-                                     uint64_t *nperm, uint64_t *nshared)
+/*
+ * A write-to-selected node may have several DATA children, one of which may be
+ * "selected". Exclusive write permission is taken on the selected child.
+ *
+ * We don't implement the write handler itself, as we only need to test how the
+ * permission update works.
+ */
+typedef struct BDRVWriteToSelectedState {
+    BdrvChild *selected;
+} BDRVWriteToSelectedState;
+
+static void write_to_selected_perms(BlockDriverState *bs, BdrvChild *c,
+                                    BdrvChildRole role,
+                                    BlockReopenQueue *reopen_queue,
+                                    uint64_t perm, uint64_t shared,
+                                    uint64_t *nperm, uint64_t *nshared)
 {
-    if (bs->file && c == bs->file) {
+    BDRVWriteToSelectedState *s = bs->opaque;
+
+    if (s->selected && c == s->selected) {
         *nperm = BLK_PERM_WRITE;
         *nshared = BLK_PERM_ALL & ~BLK_PERM_WRITE;
     } else {
@@ -256,9 +269,10 @@ static void write_to_file_perms(BlockDriverState *bs, BdrvChild *c,
     }
 }
 
-static BlockDriver bdrv_write_to_file = {
-    .format_name = "tricky-perm",
-    .bdrv_child_perm = write_to_file_perms,
+static BlockDriver bdrv_write_to_selected = {
+    .format_name = "write-to-selected",
+    .instance_size = sizeof(BDRVWriteToSelectedState),
+    .bdrv_child_perm = write_to_selected_perms,
 };
 
 
@@ -266,15 +280,18 @@ static BlockDriver bdrv_write_to_file = {
  * The following test shows that topological-sort order is required for
  * permission update, simple DFS is not enough.
  *
- * Consider the block driver which has two filter children: one active
- * with exclusive write access and one inactive with no specific
- * permissions.
+ * Consider the block driver (write-to-selected) which has two children: one is
+ * selected so we have exclusive write access to it and for the other one we
+ * don't need any specific permissions.
  *
  * And, these two children has a common base child, like this:
+ *   (an additional "top" on top is used in the test just because the only
+ *    public function to update permissions needs a specific child to update.
+ *    Making bdrv_refresh_perms() public just for this test isn't worth it.)
  *
- * ┌─────┐     ┌──────┐
- * │ fl2 │ ◀── │ top  │
- * └─────┘     └──────┘
+ * ┌─────┐     ┌───────────────────┐     ┌─────┐
+ * │ fl2 │ ◀── │ write-to-selected │ ◀── │ top │
+ * └─────┘     └───────────────────┘     └─────┘
  *   │           │
  *   │           │ w
  *   │           ▼
@@ -290,7 +307,7 @@ static BlockDriver bdrv_write_to_file = {
  *
  * So, exclusive write is propagated.
  *
- * Assume, we want to make fl2 active instead of fl1.
+ * Assume we want to select fl2 instead of fl1.
  * So, we set some option for top driver and do permission update.
  *
  * With simple DFS, if permission update goes first through
@@ -306,9 +323,10 @@ static BlockDriver bdrv_write_to_file = {
 static void test_parallel_perm_update(void)
 {
     BlockDriverState *top = no_perm_node("top");
-    BlockDriverState *tricky =
-            bdrv_new_open_driver(&bdrv_write_to_file, "tricky", BDRV_O_RDWR,
+    BlockDriverState *ws =
+            bdrv_new_open_driver(&bdrv_write_to_selected, "ws", BDRV_O_RDWR,
                                  &error_abort);
+    BDRVWriteToSelectedState *s = ws->opaque;
     BlockDriverState *base = no_perm_node("base");
     BlockDriverState *fl1 = pass_through_node("fl1");
     BlockDriverState *fl2 = pass_through_node("fl2");
@@ -320,33 +338,33 @@ static void test_parallel_perm_update(void)
      */
     bdrv_ref(base);
 
-    bdrv_attach_child(top, tricky, "file", &child_of_bds, BDRV_CHILD_DATA,
+    bdrv_attach_child(top, ws, "file", &child_of_bds, BDRV_CHILD_DATA,
                       &error_abort);
-    c_fl1 = bdrv_attach_child(tricky, fl1, "first", &child_of_bds,
-                              BDRV_CHILD_FILTERED, &error_abort);
-    c_fl2 = bdrv_attach_child(tricky, fl2, "second", &child_of_bds,
-                              BDRV_CHILD_FILTERED, &error_abort);
+    c_fl1 = bdrv_attach_child(ws, fl1, "first", &child_of_bds,
+                              BDRV_CHILD_DATA, &error_abort);
+    c_fl2 = bdrv_attach_child(ws, fl2, "second", &child_of_bds,
+                              BDRV_CHILD_DATA, &error_abort);
     bdrv_attach_child(fl1, base, "backing", &child_of_bds, BDRV_CHILD_FILTERED,
                       &error_abort);
     bdrv_attach_child(fl2, base, "backing", &child_of_bds, BDRV_CHILD_FILTERED,
                       &error_abort);
 
     /* Select fl1 as first child to be active */
-    tricky->file = c_fl1;
+    s->selected = c_fl1;
     bdrv_child_refresh_perms(top, top->children.lh_first, &error_abort);
 
     assert(c_fl1->perm & BLK_PERM_WRITE);
     assert(!(c_fl2->perm & BLK_PERM_WRITE));
 
     /* Now, try to switch active child and update permissions */
-    tricky->file = c_fl2;
+    s->selected = c_fl2;
     bdrv_child_refresh_perms(top, top->children.lh_first, &error_abort);
 
     assert(c_fl2->perm & BLK_PERM_WRITE);
     assert(!(c_fl1->perm & BLK_PERM_WRITE));
 
     /* Switch once more, to not care about real child order in the list */
-    tricky->file = c_fl1;
+    s->selected = c_fl1;
     bdrv_child_refresh_perms(top, top->children.lh_first, &error_abort);
 
     assert(c_fl1->perm & BLK_PERM_WRITE);
-- 
2.35.1




* [PATCH v5 05/45] tests-bdrv-drain: bdrv_replace_test driver: declare supports_backing
  2022-03-30 21:28 [PATCH v5 00/45] Transactional block-graph modifying API Vladimir Sementsov-Ogievskiy
                   ` (3 preceding siblings ...)
  2022-03-30 21:28 ` [PATCH v5 04/45] test-bdrv-graph-mod: update test_parallel_perm_update test case Vladimir Sementsov-Ogievskiy
@ 2022-03-30 21:28 ` Vladimir Sementsov-Ogievskiy
  2022-06-07 10:59   ` Hanna Reitz
  2022-03-30 21:28 ` [PATCH v5 06/45] test-bdrv-graph-mod: fix filters to be filters Vladimir Sementsov-Ogievskiy
                   ` (39 subsequent siblings)
  44 siblings, 1 reply; 78+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2022-03-30 21:28 UTC (permalink / raw)
  To: qemu-block
  Cc: kwolf, v.sementsov-og, Nikita Lapshin, qemu-devel, hreitz,
	vsementsov, Eric Blake

We do add a COW child to the node.  In the future we are going to forbid
adding a COW child to a node that doesn't support backing. So, fix it
here now.

Don't worry about setting bs->backing itself: in a further commit we'll
update the block layer to automatically set/unset this field in generic
code.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
---
 tests/unit/test-bdrv-drain.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tests/unit/test-bdrv-drain.c b/tests/unit/test-bdrv-drain.c
index 36be84ae55..23d425a494 100644
--- a/tests/unit/test-bdrv-drain.c
+++ b/tests/unit/test-bdrv-drain.c
@@ -1948,6 +1948,7 @@ static void coroutine_fn bdrv_replace_test_co_drain_end(BlockDriverState *bs)
 static BlockDriver bdrv_replace_test = {
     .format_name            = "replace_test",
     .instance_size          = sizeof(BDRVReplaceTestState),
+    .supports_backing       = true,
 
     .bdrv_close             = bdrv_replace_test_close,
     .bdrv_co_preadv         = bdrv_replace_test_co_preadv,
-- 
2.35.1




* [PATCH v5 06/45] test-bdrv-graph-mod: fix filters to be filters
  2022-03-30 21:28 [PATCH v5 00/45] Transactional block-graph modifying API Vladimir Sementsov-Ogievskiy
                   ` (4 preceding siblings ...)
  2022-03-30 21:28 ` [PATCH v5 05/45] tests-bdrv-drain: bdrv_replace_test driver: declare supports_backing Vladimir Sementsov-Ogievskiy
@ 2022-03-30 21:28 ` Vladimir Sementsov-Ogievskiy
  2022-06-07 11:22   ` Hanna Reitz
  2022-03-30 21:28 ` [PATCH v5 07/45] block: document connection between child roles and bs->backing/bs->file Vladimir Sementsov-Ogievskiy
                   ` (38 subsequent siblings)
  44 siblings, 1 reply; 78+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2022-03-30 21:28 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, hreitz, vsementsov, v.sementsov-og, qemu-devel

bdrv_pass_through is used as a filter; even the node variables are
named accordingly. We want to append it, so it should be a
backing-child-based filter like mirror_top.
So, in test_update_perm_tree, the first child should be DATA, as we don't
want filters with two filtered children.

bdrv_exclusive_writer is used as a filter once. So it should be a filter
anyway. We want to append it, so it should be a backing-child-based
filter too.

Make all FILTERED children PRIMARY as well. We are going to enforce
this rule by assertion soon.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
---
 include/block/block_int-common.h |  5 +++--
 tests/unit/test-bdrv-graph-mod.c | 24 +++++++++++++++++-------
 2 files changed, 20 insertions(+), 9 deletions(-)

diff --git a/include/block/block_int-common.h b/include/block/block_int-common.h
index 9d91ccbcbf..d68adc6ff3 100644
--- a/include/block/block_int-common.h
+++ b/include/block/block_int-common.h
@@ -122,8 +122,9 @@ struct BlockDriver {
     /*
      * Only make sense for filter drivers, for others must be false.
      * If true, filtered child is bs->backing. Otherwise it's bs->file.
-     * Only two internal filters use bs->backing as filtered child and has this
-     * field set to true: mirror_top and commit_top.
+     * Two internal filters use bs->backing as filtered child and have this
+     * field set to true: mirror_top and commit_top. There are also two such
+     * test filters in tests/unit/test-bdrv-graph-mod.c.
      *
      * Never create any more such filters!
      *
diff --git a/tests/unit/test-bdrv-graph-mod.c b/tests/unit/test-bdrv-graph-mod.c
index 40795d3c04..7265971013 100644
--- a/tests/unit/test-bdrv-graph-mod.c
+++ b/tests/unit/test-bdrv-graph-mod.c
@@ -26,6 +26,8 @@
 
 static BlockDriver bdrv_pass_through = {
     .format_name = "pass-through",
+    .is_filter = true,
+    .filtered_child_is_backing = true,
     .bdrv_child_perm = bdrv_default_perms,
 };
 
@@ -57,6 +59,8 @@ static void exclusive_write_perms(BlockDriverState *bs, BdrvChild *c,
 
 static BlockDriver bdrv_exclusive_writer = {
     .format_name = "exclusive-writer",
+    .is_filter = true,
+    .filtered_child_is_backing = true,
     .bdrv_child_perm = exclusive_write_perms,
 };
 
@@ -134,7 +138,7 @@ static void test_update_perm_tree(void)
     blk_insert_bs(root, bs, &error_abort);
 
     bdrv_attach_child(filter, bs, "child", &child_of_bds,
-                      BDRV_CHILD_FILTERED | BDRV_CHILD_PRIMARY, &error_abort);
+                      BDRV_CHILD_DATA, &error_abort);
 
     ret = bdrv_append(filter, bs, NULL);
     g_assert_cmpint(ret, <, 0);
@@ -228,11 +232,14 @@ static void test_parallel_exclusive_write(void)
      */
     bdrv_ref(base);
 
-    bdrv_attach_child(top, fl1, "backing", &child_of_bds, BDRV_CHILD_DATA,
+    bdrv_attach_child(top, fl1, "backing", &child_of_bds,
+                      BDRV_CHILD_FILTERED | BDRV_CHILD_PRIMARY,
                       &error_abort);
-    bdrv_attach_child(fl1, base, "backing", &child_of_bds, BDRV_CHILD_FILTERED,
+    bdrv_attach_child(fl1, base, "backing", &child_of_bds,
+                      BDRV_CHILD_FILTERED | BDRV_CHILD_PRIMARY,
                       &error_abort);
-    bdrv_attach_child(fl2, base, "backing", &child_of_bds, BDRV_CHILD_FILTERED,
+    bdrv_attach_child(fl2, base, "backing", &child_of_bds,
+                      BDRV_CHILD_FILTERED | BDRV_CHILD_PRIMARY,
                       &error_abort);
 
     bdrv_replace_node(fl1, fl2, &error_abort);
@@ -344,9 +351,11 @@ static void test_parallel_perm_update(void)
                               BDRV_CHILD_DATA, &error_abort);
     c_fl2 = bdrv_attach_child(ws, fl2, "second", &child_of_bds,
                               BDRV_CHILD_DATA, &error_abort);
-    bdrv_attach_child(fl1, base, "backing", &child_of_bds, BDRV_CHILD_FILTERED,
+    bdrv_attach_child(fl1, base, "backing", &child_of_bds,
+                      BDRV_CHILD_FILTERED | BDRV_CHILD_PRIMARY,
                       &error_abort);
-    bdrv_attach_child(fl2, base, "backing", &child_of_bds, BDRV_CHILD_FILTERED,
+    bdrv_attach_child(fl2, base, "backing", &child_of_bds,
+                      BDRV_CHILD_FILTERED | BDRV_CHILD_PRIMARY,
                       &error_abort);
 
     /* Select fl1 as first child to be active */
@@ -397,7 +406,8 @@ static void test_append_greedy_filter(void)
     BlockDriverState *base = no_perm_node("base");
     BlockDriverState *fl = exclusive_writer_node("fl1");
 
-    bdrv_attach_child(top, base, "backing", &child_of_bds, BDRV_CHILD_COW,
+    bdrv_attach_child(top, base, "backing", &child_of_bds,
+                      BDRV_CHILD_FILTERED | BDRV_CHILD_PRIMARY,
                       &error_abort);
 
     bdrv_append(fl, base, &error_abort);
-- 
2.35.1




* [PATCH v5 07/45] block: document connection between child roles and bs->backing/bs->file
  2022-03-30 21:28 [PATCH v5 00/45] Transactional block-graph modifying API Vladimir Sementsov-Ogievskiy
                   ` (5 preceding siblings ...)
  2022-03-30 21:28 ` [PATCH v5 06/45] test-bdrv-graph-mod: fix filters to be filters Vladimir Sementsov-Ogievskiy
@ 2022-03-30 21:28 ` Vladimir Sementsov-Ogievskiy
  2022-06-07 12:11   ` Hanna Reitz
  2022-03-30 21:28 ` [PATCH v5 08/45] block/snapshot: stress that we fallback to primary child Vladimir Sementsov-Ogievskiy
                   ` (37 subsequent siblings)
  44 siblings, 1 reply; 78+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2022-03-30 21:28 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, hreitz, vsementsov, v.sementsov-og, qemu-devel

Make the informal rules formal. In a further commit we'll add the
corresponding assertions.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
---
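As a rough illustration of what an assertion for the filter rules might
check, here is a standalone sketch. It is not the code the later commit
adds: the TOY_* role bits and toy_filter_roles_ok() are hypothetical names
that only mirror the DATA/METADATA/FILTERED/COW/PRIMARY roles documented
below, and the bit values are chosen for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the BdrvChildRoleBits documented below. */
enum {
    TOY_DATA     = 1 << 0,
    TOY_METADATA = 1 << 1,
    TOY_FILTERED = 1 << 2,
    TOY_COW      = 1 << 3,
    TOY_PRIMARY  = 1 << 4,
};

/*
 * Check the documented rules for a filter node, given its children's roles:
 *  - every child has at least one of DATA, METADATA, FILTERED, COW;
 *  - there are no COW children;
 *  - exactly one child is FILTERED|PRIMARY, and no other child has either
 *    of these two bits.
 */
static bool toy_filter_roles_ok(const unsigned *roles, size_t n)
{
    const unsigned fp = TOY_FILTERED | TOY_PRIMARY;
    size_t filtered_primary = 0;

    for (size_t i = 0; i < n; i++) {
        unsigned r = roles[i];

        if (!(r & (TOY_DATA | TOY_METADATA | TOY_FILTERED | TOY_COW))) {
            return false;
        }
        if (r & TOY_COW) {
            return false;
        }
        if ((r & fp) == fp) {
            filtered_primary++;
        } else if (r & fp) {
            return false;  /* FILTERED without PRIMARY, or vice versa */
        }
    }
    return filtered_primary == 1;
}
```

For example, a copy-before-write-like layout (one FILTERED|PRIMARY child
plus a DATA target) passes this check, while a FILTERED child that is not
PRIMARY fails it.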
 include/block/block-common.h | 42 ++++++++++++++++++++++++++++++++++++
 1 file changed, 42 insertions(+)

diff --git a/include/block/block-common.h b/include/block/block-common.h
index fdb7306e78..2687a2519c 100644
--- a/include/block/block-common.h
+++ b/include/block/block-common.h
@@ -313,6 +313,48 @@ enum {
  *
  * At least one of DATA, METADATA, FILTERED, or COW must be set for
  * every child.
+ *
+ *
+ * = Connection with bs->children, bs->file and bs->backing fields =
+ *
+ * 1. Filters
+ *
+ * Filter drivers have drv->is_filter = true.
+ *
+ * A filter driver has exactly one FILTERED|PRIMARY child, and may have other
+ * children which must not have these bits (an example is the copy-before-write
+ * filter, which also has a target DATA child).
+ *
+ * A filter driver never has COW children.
+ *
+ * For all filters except mirror_top and commit_top, the filtered child is
+ * linked in bs->file, and bs->backing is NULL.
+ *
+ * For mirror_top and commit_top, the filtered child is linked in bs->backing
+ * and their bs->file is NULL. These two filters have
+ * drv->filtered_child_is_backing = true.
+ *
+ * 2. "raw" driver (block/raw-format.c)
+ *
+ * Formally it's not a filter (drv->is_filter = false).
+ *
+ * bs->backing is always NULL.
+ *
+ * It has only one child, linked in bs->file. Its role is either
+ * FILTERED|PRIMARY (like a filter) or DATA|PRIMARY, depending on options.
+ *
+ * 3. Other drivers
+ *
+ * They don't have any FILTERED children.
+ *
+ * May have at most one COW child. In this case it's linked in bs->backing.
+ * Otherwise bs->backing is NULL. A COW child is never PRIMARY.
+ *
+ * May have at most one PRIMARY child. In this case it's linked in bs->file.
+ * Otherwise bs->file is NULL.
+ *
+ * May also have some other children that have neither the PRIMARY nor the COW
+ * bit set.
  */
 enum BdrvChildRoleBits {
     /*
-- 
2.35.1




* [PATCH v5 08/45] block/snapshot: stress that we fallback to primary child
  2022-03-30 21:28 [PATCH v5 00/45] Transactional block-graph modifying API Vladimir Sementsov-Ogievskiy
                   ` (6 preceding siblings ...)
  2022-03-30 21:28 ` [PATCH v5 07/45] block: document connection between child roles and bs->backing/bs->file Vladimir Sementsov-Ogievskiy
@ 2022-03-30 21:28 ` Vladimir Sementsov-Ogievskiy
  2022-06-07 13:42   ` Hanna Reitz
  2022-03-30 21:28 ` [PATCH v5 09/45] Revert "block: Let replace_child_noperm free children" Vladimir Sementsov-Ogievskiy
                   ` (36 subsequent siblings)
  44 siblings, 1 reply; 78+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2022-03-30 21:28 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, hreitz, vsementsov, v.sementsov-og, qemu-devel

What we actually choose is the primary child. Let's stress that in the code.

We are going to drop the indirect pointer logic here in the future. This
commit already simplifies that future work: we drop the use of indirection
in the assertion now.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
---
 block/snapshot.c | 30 ++++++++++--------------------
 1 file changed, 10 insertions(+), 20 deletions(-)

diff --git a/block/snapshot.c b/block/snapshot.c
index d6f53c3065..f4ec4f9ef3 100644
--- a/block/snapshot.c
+++ b/block/snapshot.c
@@ -161,21 +161,14 @@ bool bdrv_snapshot_find_by_id_and_name(BlockDriverState *bs,
 static BdrvChild **bdrv_snapshot_fallback_ptr(BlockDriverState *bs)
 {
     BdrvChild **fallback;
-    BdrvChild *child;
+    BdrvChild *child = bdrv_primary_child(bs);
 
-    /*
-     * The only BdrvChild pointers that are safe to modify (and which
-     * we can thus return a reference to) are bs->file and
-     * bs->backing.
-     */
-    fallback = &bs->file;
-    if (!*fallback && bs->drv && bs->drv->is_filter) {
-        fallback = &bs->backing;
-    }
-
-    if (!*fallback) {
+    /* We allow fallback only to primary child */
+    if (!child) {
         return NULL;
     }
+    fallback = (child == bs->file ? &bs->file : &bs->backing);
+    assert(*fallback == child);
 
     /*
      * Check that there are no other children that would need to be
@@ -309,15 +302,12 @@ int bdrv_snapshot_goto(BlockDriverState *bs,
         }
 
         /*
-         * fallback_ptr is &bs->file or &bs->backing.  *fallback_ptr
-         * was closed above and set to NULL, but the .bdrv_open() call
-         * has opened it again, because we set the respective option
-         * (with the qdict_put_str() call above).
-         * Assert that .bdrv_open() has attached some child on
-         * *fallback_ptr, and that it has attached the one we wanted
-         * it to (i.e., fallback_bs).
+         * fallback was a primary child. It was closed above and set to NULL,
+         * but the .bdrv_open() call has opened it again, because we set the
+         * respective option (with the qdict_put_str() call above).
+         * Assert that .bdrv_open() has attached some BDS as primary child.
          */
-        assert(*fallback_ptr && fallback_bs == (*fallback_ptr)->bs);
+        assert(bdrv_primary_bs(bs) == fallback_bs);
         bdrv_unref(fallback_bs);
         return ret;
     }
-- 
2.35.1




* [PATCH v5 09/45] Revert "block: Let replace_child_noperm free children"
  2022-03-30 21:28 [PATCH v5 00/45] Transactional block-graph modifying API Vladimir Sementsov-Ogievskiy
                   ` (7 preceding siblings ...)
  2022-03-30 21:28 ` [PATCH v5 08/45] block/snapshot: stress that we fallback to primary child Vladimir Sementsov-Ogievskiy
@ 2022-03-30 21:28 ` Vladimir Sementsov-Ogievskiy
  2022-06-07 14:03   ` Hanna Reitz
  2022-03-30 21:28 ` [PATCH v5 10/45] Revert "block: Let replace_child_tran keep indirect pointer" Vladimir Sementsov-Ogievskiy
                   ` (35 subsequent siblings)
  44 siblings, 1 reply; 78+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2022-03-30 21:28 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, hreitz, vsementsov, v.sementsov-og, qemu-devel

We are going to reimplement this behavior (clear bs->file / bs->backing
pointers automatically when child->bs is cleared) in a nicer way.

This reverts commit b0a9f6fed3d80de610dcd04a7e66f9f30a04174f.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
---
 block.c | 102 +++++++++++++-------------------------------------------
 1 file changed, 23 insertions(+), 79 deletions(-)

diff --git a/block.c b/block.c
index 8110b1b330..7f11ec4a80 100644
--- a/block.c
+++ b/block.c
@@ -90,10 +90,8 @@ static BlockDriverState *bdrv_open_inherit(const char *filename,
 static bool bdrv_recurse_has_child(BlockDriverState *bs,
                                    BlockDriverState *child);
 
-static void bdrv_child_free(BdrvChild *child);
 static void bdrv_replace_child_noperm(BdrvChild **child,
-                                      BlockDriverState *new_bs,
-                                      bool free_empty_child);
+                                      BlockDriverState *new_bs);
 static void bdrv_remove_file_or_backing_child(BlockDriverState *bs,
                                               BdrvChild *child,
                                               Transaction *tran);
@@ -2338,7 +2336,6 @@ typedef struct BdrvReplaceChildState {
     BdrvChild *child;
     BdrvChild **childp;
     BlockDriverState *old_bs;
-    bool free_empty_child;
 } BdrvReplaceChildState;
 
 static void bdrv_replace_child_commit(void *opaque)
@@ -2346,9 +2343,6 @@ static void bdrv_replace_child_commit(void *opaque)
     BdrvReplaceChildState *s = opaque;
     GLOBAL_STATE_CODE();
 
-    if (s->free_empty_child && !s->child->bs) {
-        bdrv_child_free(s->child);
-    }
     bdrv_unref(s->old_bs);
 }
 
@@ -2366,26 +2360,22 @@ static void bdrv_replace_child_abort(void *opaque)
      *     modify the BdrvChild * pointer we indirectly pass to it, i.e. it
      *     will not modify s->child.  From that perspective, it does not matter
      *     whether we pass s->childp or &s->child.
+     *     (TODO: Right now, bdrv_replace_child_noperm() never modifies that
+     *     pointer anyway (though it will in the future), so at this point it
+     *     absolutely does not matter whether we pass s->childp or &s->child.)
      * (2) If new_bs is not NULL, s->childp will be NULL.  We then cannot use
      *     it here.
      * (3) If new_bs is NULL, *s->childp will have been NULLed by
      *     bdrv_replace_child_tran()'s bdrv_replace_child_noperm() call, and we
      *     must not pass a NULL *s->childp here.
+     *     (TODO: In its current state, bdrv_replace_child_noperm() will not
+     *     have NULLed *s->childp, so this does not apply yet.  It will in the
+     *     future.)
      *
      * So whether new_bs was NULL or not, we cannot pass s->childp here; and in
      * any case, there is no reason to pass it anyway.
      */
-    bdrv_replace_child_noperm(&s->child, s->old_bs, true);
-    /*
-     * The child was pre-existing, so s->old_bs must be non-NULL, and
-     * s->child thus must not have been freed
-     */
-    assert(s->child != NULL);
-    if (!new_bs) {
-        /* As described above, *s->childp was cleared, so restore it */
-        assert(s->childp != NULL);
-        *s->childp = s->child;
-    }
+    bdrv_replace_child_noperm(&s->child, s->old_bs);
     bdrv_unref(new_bs);
 }
 
@@ -2402,44 +2392,30 @@ static TransactionActionDrv bdrv_replace_child_drv = {
  *
  * The function doesn't update permissions, caller is responsible for this.
  *
- * (*childp)->bs must not be NULL.
- *
  * Note that if new_bs == NULL, @childp is stored in a state object attached
  * to @tran, so that the old child can be reinstated in the abort handler.
  * Therefore, if @new_bs can be NULL, @childp must stay valid until the
  * transaction is committed or aborted.
  *
- * If @free_empty_child is true and @new_bs is NULL, the BdrvChild is
- * freed (on commit).  @free_empty_child should only be false if the
- * caller will free the BDrvChild themselves (which may be important
- * if this is in turn called in another transactional context).
+ * (TODO: The reinstating does not happen yet, but it will once
+ * bdrv_replace_child_noperm() NULLs *childp when new_bs is NULL.)
  */
 static void bdrv_replace_child_tran(BdrvChild **childp,
                                     BlockDriverState *new_bs,
-                                    Transaction *tran,
-                                    bool free_empty_child)
+                                    Transaction *tran)
 {
     BdrvReplaceChildState *s = g_new(BdrvReplaceChildState, 1);
     *s = (BdrvReplaceChildState) {
         .child = *childp,
         .childp = new_bs == NULL ? childp : NULL,
         .old_bs = (*childp)->bs,
-        .free_empty_child = free_empty_child,
     };
     tran_add(tran, &bdrv_replace_child_drv, s);
 
-    /* The abort handler relies on this */
-    assert(s->old_bs != NULL);
-
     if (new_bs) {
         bdrv_ref(new_bs);
     }
-    /*
-     * Pass free_empty_child=false, we will free the child (if
-     * necessary) in bdrv_replace_child_commit() (if our
-     * @free_empty_child parameter was true).
-     */
-    bdrv_replace_child_noperm(childp, new_bs, false);
+    bdrv_replace_child_noperm(childp, new_bs);
     /* old_bs reference is transparently moved from *childp to @s */
 }
 
@@ -2821,22 +2797,8 @@ uint64_t bdrv_qapi_perm_to_blk_perm(BlockPermission qapi_perm)
     return permissions[qapi_perm];
 }
 
-/**
- * Replace (*childp)->bs by @new_bs.
- *
- * If @new_bs is NULL, *childp will be set to NULL, too: BDS parents
- * generally cannot handle a BdrvChild with .bs == NULL, so clearing
- * BdrvChild.bs should generally immediately be followed by the
- * BdrvChild pointer being cleared as well.
- *
- * If @free_empty_child is true and @new_bs is NULL, the BdrvChild is
- * freed.  @free_empty_child should only be false if the caller will
- * free the BdrvChild themselves (this may be important in a
- * transactional context, where it may only be freed on commit).
- */
 static void bdrv_replace_child_noperm(BdrvChild **childp,
-                                      BlockDriverState *new_bs,
-                                      bool free_empty_child)
+                                      BlockDriverState *new_bs)
 {
     BdrvChild *child = *childp;
     BlockDriverState *old_bs = child->bs;
@@ -2875,9 +2837,6 @@ static void bdrv_replace_child_noperm(BdrvChild **childp,
     }
 
     child->bs = new_bs;
-    if (!new_bs) {
-        *childp = NULL;
-    }
 
     if (new_bs) {
         assert_bdrv_graph_writable(new_bs);
@@ -2908,10 +2867,6 @@ static void bdrv_replace_child_noperm(BdrvChild **childp,
         bdrv_parent_drained_end_single(child);
         drain_saldo++;
     }
-
-    if (free_empty_child && !child->bs) {
-        bdrv_child_free(child);
-    }
 }
 
 /**
@@ -2943,14 +2898,7 @@ static void bdrv_attach_child_common_abort(void *opaque)
     BlockDriverState *bs = child->bs;
 
     GLOBAL_STATE_CODE();
-    /*
-     * Pass free_empty_child=false, because we still need the child
-     * for the AioContext operations on the parent below; those
-     * BdrvChildClass methods all work on a BdrvChild object, so we
-     * need to keep it as an empty shell (after this function, it will
-     * not be attached to any parent, and it will not have a .bs).
-     */
-    bdrv_replace_child_noperm(s->child, NULL, false);
+    bdrv_replace_child_noperm(s->child, NULL);
 
     if (bdrv_get_aio_context(bs) != s->old_child_ctx) {
         bdrv_try_set_aio_context(bs, s->old_child_ctx, &error_abort);
@@ -2972,6 +2920,7 @@ static void bdrv_attach_child_common_abort(void *opaque)
 
     bdrv_unref(bs);
     bdrv_child_free(child);
+    *s->child = NULL;
 }
 
 static TransactionActionDrv bdrv_attach_child_common_drv = {
@@ -3050,9 +2999,7 @@ static int bdrv_attach_child_common(BlockDriverState *child_bs,
     }
 
     bdrv_ref(child_bs);
-    bdrv_replace_child_noperm(&new_child, child_bs, true);
-    /* child_bs was non-NULL, so new_child must not have been freed */
-    assert(new_child != NULL);
+    bdrv_replace_child_noperm(&new_child, child_bs);
 
     *child = new_child;
 
@@ -3113,7 +3060,8 @@ static void bdrv_detach_child(BdrvChild **childp)
     BlockDriverState *old_bs = (*childp)->bs;
 
     GLOBAL_STATE_CODE();
-    bdrv_replace_child_noperm(childp, NULL, true);
+    bdrv_replace_child_noperm(childp, NULL);
+    bdrv_child_free(*childp);
 
     if (old_bs) {
         /*
@@ -5167,11 +5115,7 @@ static void bdrv_remove_file_or_backing_child(BlockDriverState *bs,
     }
 
     if (child->bs) {
-        /*
-         * Pass free_empty_child=false, we will free the child in
-         * bdrv_remove_filter_or_cow_child_commit()
-         */
-        bdrv_replace_child_tran(childp, NULL, tran, false);
+        bdrv_replace_child_tran(childp, NULL, tran);
     }
 
     s = g_new(BdrvRemoveFilterOrCowChild, 1);
@@ -5181,6 +5125,8 @@ static void bdrv_remove_file_or_backing_child(BlockDriverState *bs,
         .is_backing = (childp == &bs->backing),
     };
     tran_add(tran, &bdrv_remove_filter_or_cow_child_drv, s);
+
+    *childp = NULL;
 }
 
 /*
@@ -5224,7 +5170,7 @@ static int bdrv_replace_node_noperm(BlockDriverState *from,
          * Passing a pointer to the local variable @c is fine here, because
          * @to is not NULL, and so &c will not be attached to the transaction.
          */
-        bdrv_replace_child_tran(&c, to, tran, true);
+        bdrv_replace_child_tran(&c, to, tran);
     }
 
     return 0;
@@ -5389,9 +5335,7 @@ int bdrv_replace_child_bs(BdrvChild *child, BlockDriverState *new_bs,
     bdrv_drained_begin(old_bs);
     bdrv_drained_begin(new_bs);
 
-    bdrv_replace_child_tran(&child, new_bs, tran, true);
-    /* @new_bs must have been non-NULL, so @child must not have been freed */
-    assert(child != NULL);
+    bdrv_replace_child_tran(&child, new_bs, tran);
 
     found = g_hash_table_new(NULL, NULL);
     refresh_list = bdrv_topological_dfs(refresh_list, found, old_bs);
-- 
2.35.1




* [PATCH v5 10/45] Revert "block: Let replace_child_tran keep indirect pointer"
  2022-03-30 21:28 [PATCH v5 00/45] Transactional block-graph modifying API Vladimir Sementsov-Ogievskiy
                   ` (8 preceding siblings ...)
  2022-03-30 21:28 ` [PATCH v5 09/45] Revert "block: Let replace_child_noperm free children" Vladimir Sementsov-Ogievskiy
@ 2022-03-30 21:28 ` Vladimir Sementsov-Ogievskiy
  2022-03-30 21:28 ` [PATCH v5 11/45] Revert "block: Restructure remove_file_or_backing_child()" Vladimir Sementsov-Ogievskiy
                   ` (34 subsequent siblings)
  44 siblings, 0 replies; 78+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2022-03-30 21:28 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, hreitz, vsementsov, v.sementsov-og, qemu-devel

That commit was a preparation for the previously reverted
"block: Let replace_child_noperm free children". Drop it too; we don't
need it for the new approach.

This reverts commit 82b54cf51656bf3cd5ed1ac549e8a1085a0e3290.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
---
 block.c | 81 +++++++--------------------------------------------------
 1 file changed, 10 insertions(+), 71 deletions(-)

diff --git a/block.c b/block.c
index 7f11ec4a80..258851576a 100644
--- a/block.c
+++ b/block.c
@@ -2334,7 +2334,6 @@ static int bdrv_drv_set_perm(BlockDriverState *bs, uint64_t perm,
 
 typedef struct BdrvReplaceChildState {
     BdrvChild *child;
-    BdrvChild **childp;
     BlockDriverState *old_bs;
 } BdrvReplaceChildState;
 
@@ -2352,29 +2351,7 @@ static void bdrv_replace_child_abort(void *opaque)
     BlockDriverState *new_bs = s->child->bs;
 
     GLOBAL_STATE_CODE();
-    /*
-     * old_bs reference is transparently moved from @s to s->child.
-     *
-     * Pass &s->child here instead of s->childp, because:
-     * (1) s->old_bs must be non-NULL, so bdrv_replace_child_noperm() will not
-     *     modify the BdrvChild * pointer we indirectly pass to it, i.e. it
-     *     will not modify s->child.  From that perspective, it does not matter
-     *     whether we pass s->childp or &s->child.
-     *     (TODO: Right now, bdrv_replace_child_noperm() never modifies that
-     *     pointer anyway (though it will in the future), so at this point it
-     *     absolutely does not matter whether we pass s->childp or &s->child.)
-     * (2) If new_bs is not NULL, s->childp will be NULL.  We then cannot use
-     *     it here.
-     * (3) If new_bs is NULL, *s->childp will have been NULLed by
-     *     bdrv_replace_child_tran()'s bdrv_replace_child_noperm() call, and we
-     *     must not pass a NULL *s->childp here.
-     *     (TODO: In its current state, bdrv_replace_child_noperm() will not
-     *     have NULLed *s->childp, so this does not apply yet.  It will in the
-     *     future.)
-     *
-     * So whether new_bs was NULL or not, we cannot pass s->childp here; and in
-     * any case, there is no reason to pass it anyway.
-     */
+    /* old_bs reference is transparently moved from @s to @s->child */
     bdrv_replace_child_noperm(&s->child, s->old_bs);
     bdrv_unref(new_bs);
 }
@@ -2391,32 +2368,22 @@ static TransactionActionDrv bdrv_replace_child_drv = {
  * Note: real unref of old_bs is done only on commit.
  *
  * The function doesn't update permissions, caller is responsible for this.
- *
- * Note that if new_bs == NULL, @childp is stored in a state object attached
- * to @tran, so that the old child can be reinstated in the abort handler.
- * Therefore, if @new_bs can be NULL, @childp must stay valid until the
- * transaction is committed or aborted.
- *
- * (TODO: The reinstating does not happen yet, but it will once
- * bdrv_replace_child_noperm() NULLs *childp when new_bs is NULL.)
  */
-static void bdrv_replace_child_tran(BdrvChild **childp,
-                                    BlockDriverState *new_bs,
+static void bdrv_replace_child_tran(BdrvChild *child, BlockDriverState *new_bs,
                                     Transaction *tran)
 {
     BdrvReplaceChildState *s = g_new(BdrvReplaceChildState, 1);
     *s = (BdrvReplaceChildState) {
-        .child = *childp,
-        .childp = new_bs == NULL ? childp : NULL,
-        .old_bs = (*childp)->bs,
+        .child = child,
+        .old_bs = child->bs,
     };
     tran_add(tran, &bdrv_replace_child_drv, s);
 
     if (new_bs) {
         bdrv_ref(new_bs);
     }
-    bdrv_replace_child_noperm(childp, new_bs);
-    /* old_bs reference is transparently moved from *childp to @s */
+    bdrv_replace_child_noperm(&child, new_bs);
+    /* old_bs reference is transparently moved from @child to @s */
 }
 
 /*
@@ -5041,7 +5008,6 @@ static bool should_update_child(BdrvChild *c, BlockDriverState *to)
 
 typedef struct BdrvRemoveFilterOrCowChild {
     BdrvChild *child;
-    BlockDriverState *bs;
     bool is_backing;
 } BdrvRemoveFilterOrCowChild;
 
@@ -5071,19 +5037,10 @@ static void bdrv_remove_filter_or_cow_child_commit(void *opaque)
     bdrv_child_free(s->child);
 }
 
-static void bdrv_remove_filter_or_cow_child_clean(void *opaque)
-{
-    BdrvRemoveFilterOrCowChild *s = opaque;
-
-    /* Drop the bs reference after the transaction is done */
-    bdrv_unref(s->bs);
-    g_free(s);
-}
-
 static TransactionActionDrv bdrv_remove_filter_or_cow_child_drv = {
     .abort = bdrv_remove_filter_or_cow_child_abort,
     .commit = bdrv_remove_filter_or_cow_child_commit,
-    .clean = bdrv_remove_filter_or_cow_child_clean,
+    .clean = g_free,
 };
 
 /*
@@ -5101,11 +5058,6 @@ static void bdrv_remove_file_or_backing_child(BlockDriverState *bs,
         return;
     }
 
-    /*
-     * Keep a reference to @bs so @childp will stay valid throughout the
-     * transaction (required by bdrv_replace_child_tran())
-     */
-    bdrv_ref(bs);
     if (child == bs->backing) {
         childp = &bs->backing;
     } else if (child == bs->file) {
@@ -5115,13 +5067,12 @@ static void bdrv_remove_file_or_backing_child(BlockDriverState *bs,
     }
 
     if (child->bs) {
-        bdrv_replace_child_tran(childp, NULL, tran);
+        bdrv_replace_child_tran(*childp, NULL, tran);
     }
 
     s = g_new(BdrvRemoveFilterOrCowChild, 1);
     *s = (BdrvRemoveFilterOrCowChild) {
         .child = child,
-        .bs = bs,
         .is_backing = (childp == &bs->backing),
     };
     tran_add(tran, &bdrv_remove_filter_or_cow_child_drv, s);
@@ -5147,7 +5098,6 @@ static int bdrv_replace_node_noperm(BlockDriverState *from,
 {
     BdrvChild *c, *next;
 
-    assert(to != NULL);
     GLOBAL_STATE_CODE();
 
     QLIST_FOREACH_SAFE(c, &from->parents, next_parent, next) {
@@ -5165,12 +5115,7 @@ static int bdrv_replace_node_noperm(BlockDriverState *from,
                        c->name, from->node_name);
             return -EPERM;
         }
-
-        /*
-         * Passing a pointer to the local variable @c is fine here, because
-         * @to is not NULL, and so &c will not be attached to the transaction.
-         */
-        bdrv_replace_child_tran(&c, to, tran);
+        bdrv_replace_child_tran(c, to, tran);
     }
 
     return 0;
@@ -5185,8 +5130,6 @@ static int bdrv_replace_node_noperm(BlockDriverState *from,
  *
  * With @detach_subchain=true @to must be in a backing chain of @from. In this
  * case backing link of the cow-parent of @to is removed.
- *
- * @to must not be NULL.
  */
 static int bdrv_replace_node_common(BlockDriverState *from,
                                     BlockDriverState *to,
@@ -5200,7 +5143,6 @@ static int bdrv_replace_node_common(BlockDriverState *from,
     int ret;
 
     GLOBAL_STATE_CODE();
-    assert(to != NULL);
 
     if (detach_subchain) {
         assert(bdrv_chain_contains(from, to));
@@ -5257,9 +5199,6 @@ out:
     return ret;
 }
 
-/**
- * Replace node @from by @to (where neither may be NULL).
- */
 int bdrv_replace_node(BlockDriverState *from, BlockDriverState *to,
                       Error **errp)
 {
@@ -5335,7 +5274,7 @@ int bdrv_replace_child_bs(BdrvChild *child, BlockDriverState *new_bs,
     bdrv_drained_begin(old_bs);
     bdrv_drained_begin(new_bs);
 
-    bdrv_replace_child_tran(&child, new_bs, tran);
+    bdrv_replace_child_tran(child, new_bs, tran);
 
     found = g_hash_table_new(NULL, NULL);
     refresh_list = bdrv_topological_dfs(refresh_list, found, old_bs);
-- 
2.35.1




* [PATCH v5 11/45] Revert "block: Restructure remove_file_or_backing_child()"
  2022-03-30 21:28 [PATCH v5 00/45] Transactional block-graph modifying API Vladimir Sementsov-Ogievskiy
                   ` (9 preceding siblings ...)
  2022-03-30 21:28 ` [PATCH v5 10/45] Revert "block: Let replace_child_tran keep indirect pointer" Vladimir Sementsov-Ogievskiy
@ 2022-03-30 21:28 ` Vladimir Sementsov-Ogievskiy
  2022-03-30 21:28 ` [PATCH v5 12/45] Revert "block: Pass BdrvChild ** to replace_child_noperm" Vladimir Sementsov-Ogievskiy
                   ` (33 subsequent siblings)
  44 siblings, 0 replies; 78+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2022-03-30 21:28 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, hreitz, vsementsov, v.sementsov-og, qemu-devel

That commit was a preparation for the previously reverted
"block: Let replace_child_noperm free children". Drop it too; we don't
need it for the new approach.

This reverts commit 562bda8bb41879eeda0bd484dd3d55134579b28e.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
---
 block.c | 21 +++++++++------------
 1 file changed, 9 insertions(+), 12 deletions(-)

diff --git a/block.c b/block.c
index 258851576a..34eee40d48 100644
--- a/block.c
+++ b/block.c
@@ -5051,33 +5051,30 @@ static void bdrv_remove_file_or_backing_child(BlockDriverState *bs,
                                               BdrvChild *child,
                                               Transaction *tran)
 {
-    BdrvChild **childp;
     BdrvRemoveFilterOrCowChild *s;
 
+    assert(child == bs->backing || child == bs->file);
+
     if (!child) {
         return;
     }
 
-    if (child == bs->backing) {
-        childp = &bs->backing;
-    } else if (child == bs->file) {
-        childp = &bs->file;
-    } else {
-        g_assert_not_reached();
-    }
-
     if (child->bs) {
-        bdrv_replace_child_tran(*childp, NULL, tran);
+        bdrv_replace_child_tran(child, NULL, tran);
     }
 
     s = g_new(BdrvRemoveFilterOrCowChild, 1);
     *s = (BdrvRemoveFilterOrCowChild) {
         .child = child,
-        .is_backing = (childp == &bs->backing),
+        .is_backing = (child == bs->backing),
     };
     tran_add(tran, &bdrv_remove_filter_or_cow_child_drv, s);
 
-    *childp = NULL;
+    if (s->is_backing) {
+        bs->backing = NULL;
+    } else {
+        bs->file = NULL;
+    }
 }
 
 /*
-- 
2.35.1




* [PATCH v5 12/45] Revert "block: Pass BdrvChild ** to replace_child_noperm"
  2022-03-30 21:28 [PATCH v5 00/45] Transactional block-graph modifying API Vladimir Sementsov-Ogievskiy
                   ` (10 preceding siblings ...)
  2022-03-30 21:28 ` [PATCH v5 11/45] Revert "block: Restructure remove_file_or_backing_child()" Vladimir Sementsov-Ogievskiy
@ 2022-03-30 21:28 ` Vladimir Sementsov-Ogievskiy
  2022-03-30 21:28 ` [PATCH v5 13/45] block: Manipulate bs->file / bs->backing pointers in .attach/.detach Vladimir Sementsov-Ogievskiy
                   ` (32 subsequent siblings)
  44 siblings, 0 replies; 78+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2022-03-30 21:28 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, hreitz, vsementsov, v.sementsov-og, qemu-devel

That commit was a preparation for the previously reverted
"block: Let replace_child_noperm free children". Drop it too; we don't
need it for the new approach.

This reverts commit be64bbb0149748f3999c49b13976aafb8330ea86.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
---
 block.c | 23 +++++++++++------------
 1 file changed, 11 insertions(+), 12 deletions(-)

diff --git a/block.c b/block.c
index 34eee40d48..8e8ed639fe 100644
--- a/block.c
+++ b/block.c
@@ -90,7 +90,7 @@ static BlockDriverState *bdrv_open_inherit(const char *filename,
 static bool bdrv_recurse_has_child(BlockDriverState *bs,
                                    BlockDriverState *child);
 
-static void bdrv_replace_child_noperm(BdrvChild **child,
+static void bdrv_replace_child_noperm(BdrvChild *child,
                                       BlockDriverState *new_bs);
 static void bdrv_remove_file_or_backing_child(BlockDriverState *bs,
                                               BdrvChild *child,
@@ -2352,7 +2352,7 @@ static void bdrv_replace_child_abort(void *opaque)
 
     GLOBAL_STATE_CODE();
     /* old_bs reference is transparently moved from @s to @s->child */
-    bdrv_replace_child_noperm(&s->child, s->old_bs);
+    bdrv_replace_child_noperm(s->child, s->old_bs);
     bdrv_unref(new_bs);
 }
 
@@ -2382,7 +2382,7 @@ static void bdrv_replace_child_tran(BdrvChild *child, BlockDriverState *new_bs,
     if (new_bs) {
         bdrv_ref(new_bs);
     }
-    bdrv_replace_child_noperm(&child, new_bs);
+    bdrv_replace_child_noperm(child, new_bs);
     /* old_bs reference is transparently moved from @child to @s */
 }
 
@@ -2764,10 +2764,9 @@ uint64_t bdrv_qapi_perm_to_blk_perm(BlockPermission qapi_perm)
     return permissions[qapi_perm];
 }
 
-static void bdrv_replace_child_noperm(BdrvChild **childp,
+static void bdrv_replace_child_noperm(BdrvChild *child,
                                       BlockDriverState *new_bs)
 {
-    BdrvChild *child = *childp;
     BlockDriverState *old_bs = child->bs;
     int new_bs_quiesce_counter;
     int drain_saldo;
@@ -2865,7 +2864,7 @@ static void bdrv_attach_child_common_abort(void *opaque)
     BlockDriverState *bs = child->bs;
 
     GLOBAL_STATE_CODE();
-    bdrv_replace_child_noperm(s->child, NULL);
+    bdrv_replace_child_noperm(child, NULL);
 
     if (bdrv_get_aio_context(bs) != s->old_child_ctx) {
         bdrv_try_set_aio_context(bs, s->old_child_ctx, &error_abort);
@@ -2966,7 +2965,7 @@ static int bdrv_attach_child_common(BlockDriverState *child_bs,
     }
 
     bdrv_ref(child_bs);
-    bdrv_replace_child_noperm(&new_child, child_bs);
+    bdrv_replace_child_noperm(new_child, child_bs);
 
     *child = new_child;
 
@@ -3022,13 +3021,13 @@ static int bdrv_attach_child_noperm(BlockDriverState *parent_bs,
     return 0;
 }
 
-static void bdrv_detach_child(BdrvChild **childp)
+static void bdrv_detach_child(BdrvChild *child)
 {
-    BlockDriverState *old_bs = (*childp)->bs;
+    BlockDriverState *old_bs = child->bs;
 
     GLOBAL_STATE_CODE();
-    bdrv_replace_child_noperm(childp, NULL);
-    bdrv_child_free(*childp);
+    bdrv_replace_child_noperm(child, NULL);
+    bdrv_child_free(child);
 
     if (old_bs) {
         /*
@@ -3140,7 +3139,7 @@ void bdrv_root_unref_child(BdrvChild *child)
     GLOBAL_STATE_CODE();
 
     child_bs = child->bs;
-    bdrv_detach_child(&child);
+    bdrv_detach_child(child);
     bdrv_unref(child_bs);
 }
 
-- 
2.35.1




* [PATCH v5 13/45] block: Manipulate bs->file / bs->backing pointers in .attach/.detach
  2022-03-30 21:28 [PATCH v5 00/45] Transactional block-graph modifying API Vladimir Sementsov-Ogievskiy
                   ` (11 preceding siblings ...)
  2022-03-30 21:28 ` [PATCH v5 12/45] Revert "block: Pass BdrvChild ** to replace_child_noperm" Vladimir Sementsov-Ogievskiy
@ 2022-03-30 21:28 ` Vladimir Sementsov-Ogievskiy
  2022-06-07 15:55   ` Hanna Reitz
  2022-03-30 21:28 ` [PATCH v5 14/45] block/snapshot: drop indirection around bdrv_snapshot_fallback_ptr Vladimir Sementsov-Ogievskiy
                   ` (31 subsequent siblings)
  44 siblings, 1 reply; 78+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2022-03-30 21:28 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, hreitz, vsementsov, v.sementsov-og, qemu-devel

bs->file and bs->backing duplicate part of bs->children. But it is a
very useful duplication, so let's not drop them :)

We should manage bs->file and bs->backing in the same place where we
manage bs->children, to keep them in sync.

Moreover, generic I/O paths are not prepared for a BdrvChild without a
bs, so it's doubly good to clear bs->file / bs->backing when we detach
the child.

Detach is simple: if we detach the bs->file or bs->backing child, just
set the corresponding field to NULL.

Attach is a bit more complicated. But we can still determine precisely
whether we should set one of bs->file / bs->backing:

- if the role is BDRV_CHILD_COW, we definitely deal with bs->backing
- else, if the role is BDRV_CHILD_FILTERED (it must also be
  BDRV_CHILD_PRIMARY), it's a filtered child. Use
  bs->drv->filtered_child_is_backing to choose the pointer field to
  modify.
- else, if the role is BDRV_CHILD_PRIMARY, we deal with bs->file
- in all other cases, it's neither bs->backing nor bs->file. It's some
  other child and we shouldn't care
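
The attach and detach rules above can be sketched in isolation. This is
only an illustration, not the code from the diff: Node, Child, the ROLE_*
bits and their values are hypothetical stand-ins for BlockDriverState,
BdrvChild and BdrvChildRole, and the branch order follows the list above
rather than the order used in bdrv_child_cb_attach().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for BdrvChildRole bits; values are illustrative. */
enum {
    ROLE_FILTERED = 1 << 0,
    ROLE_COW      = 1 << 1,
    ROLE_PRIMARY  = 1 << 2,
};

typedef struct Child {
    unsigned role;
} Child;

typedef struct Node {
    bool filtered_child_is_backing; /* mirrors the BlockDriver field */
    Child *backing;
    Child *file;
} Node;

/* Attach: decide from the role alone which pointer, if any, to set. */
static void node_attach(Node *bs, Child *c)
{
    if (c->role & ROLE_COW) {
        assert(!bs->backing);
        bs->backing = c;
    } else if (c->role & ROLE_FILTERED) {
        /* A filtered child must also be the primary child. */
        assert(c->role & ROLE_PRIMARY);
        assert(!bs->backing && !bs->file);
        if (bs->filtered_child_is_backing) {
            bs->backing = c;
        } else {
            bs->file = c;
        }
    } else if (c->role & ROLE_PRIMARY) {
        assert(!bs->file);
        bs->file = c;
    }
    /* Any other role: neither bs->backing nor bs->file is touched. */
}

/* Detach: clear whichever pointer refers to the child being removed. */
static void node_detach(Node *bs, Child *c)
{
    if (c == bs->backing) {
        bs->backing = NULL;
    } else if (c == bs->file) {
        bs->file = NULL;
    }
}
```

Because both helpers run at the point where the children list changes,
callers never have to assign the two pointers by hand.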

OK. This change brings one more good thing: we can (and should) get rid
of all indirect pointers in the block-graph-change transactions:

bdrv_attach_child_common() stores a BdrvChild** in the transaction to
clear it on abort.

bdrv_attach_child_common() has two callers: bdrv_attach_child_noperm()
just passes this feature through, and bdrv_root_attach_child() doesn't
need the feature.

Look at bdrv_attach_child_noperm() callers:
  - bdrv_attach_child() doesn't need the feature
  - bdrv_set_file_or_backing_noperm() uses the feature to manage
    bs->file and bs->backing, we don't want it anymore
  - bdrv_append() uses the feature to manage bs->backing, again we
    don't want it anymore

So, we should drop this stuff! Great!
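
A minimal sketch of why the indirect pointer becomes unnecessary,
assuming simplified hypothetical types (Node, Child, AttachState) and a
single "file" slot. It is not the QEMU transaction API: the abort step is
called directly here instead of going through tran_add()/tran_finalize().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct Node Node;

typedef struct Child {
    Node *parent;
    Node *bs;
} Child;

struct Node {
    Child *file;
};

/*
 * Hypothetical replace helper: the detach/attach steps keep parent->file
 * in sync, so nobody outside needs a Child ** to fix it up later.
 */
static void child_set_bs(Child *c, Node *new_bs)
{
    if (c->bs && c->parent->file == c) {
        c->parent->file = NULL;     /* detach clears the pointer */
    }
    c->bs = new_bs;
    if (new_bs) {
        c->parent->file = c;        /* attach sets the pointer */
    }
}

/* State kept for the abort action: a direct pointer is enough. */
typedef struct AttachState {
    Child *child;
} AttachState;

static AttachState *attach_child(Node *parent, Node *bs)
{
    Child *c = calloc(1, sizeof(*c));
    AttachState *s = malloc(sizeof(*s));

    assert(c && s);
    c->parent = parent;
    child_set_bs(c, bs);
    s->child = c;
    return s;
}

/* Abort: detaching already resets parent->file, then the child is freed. */
static void attach_abort(AttachState *s)
{
    child_set_bs(s->child, NULL);
    free(s->child);
    free(s);
}
```

The abort handler does not need to know where the caller stored its copy
of the child pointer, which is why no caller-owned variable has to stay
alive until the end of the transaction.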

We still keep the BdrvChild** argument to return the child, together
with the int return value, instead of simply returning BdrvChild*, as we
don't want to lose the int return values.

However, we don't require *@child to be NULL anymore, and even allow
@child to be NULL if the caller doesn't need the new child pointer.
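
The calling convention described here can be sketched as follows. The
names (attach_child, attach_child_or_null, the static pool) are
hypothetical and only illustrate the pattern of an int return code plus
an optional out-parameter; they are not the functions from block.c.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

typedef struct Child {
    int id;
} Child;

/* Hypothetical owner of the children, standing in for the block graph. */
static Child pool[4];
static size_t pool_used;

/*
 * Returns 0 or a negative errno. On success, *child is set if child is
 * not NULL. On failure, *child is left untouched, so callers must check
 * the return value instead of testing the pointer.
 */
static int attach_child(int id, Child **child)
{
    Child *new_child;

    if (id < 0) {
        return -EINVAL;
    }
    if (pool_used == sizeof(pool) / sizeof(pool[0])) {
        return -ENOSPC;
    }

    new_child = &pool[pool_used++];
    new_child->id = id;

    if (child) {
        *child = new_child;
    }
    return 0;
}

/* Wrapper that maps the error code to NULL, as the public helpers do. */
static Child *attach_child_or_null(int id)
{
    Child *child; /* need not be pre-set to NULL */
    int ret = attach_child(id, &child);

    return ret < 0 ? NULL : child;
}
```

Keeping the int return lets internal callers distinguish error codes,
while the wrapper shows why the old assert((ret < 0) == !child) check is
replaced by an explicit test of ret.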

Finally, we now set .file / .backing automatically in generic code and
want to restrict setting them by hand outside of .attach/.detach.
So, this patch cleans up all remaining places where they were set.
To find such places I use:

  git grep '\->file ='
  git grep '\->backing ='
  git grep '&.*\<backing\>'
  git grep '&.*\<file\>'

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
---
 block.c                          | 156 ++++++++++++++-----------------
 block/raw-format.c               |   4 +-
 block/snapshot.c                 |   1 -
 include/block/block_int-common.h |  15 ++-
 tests/unit/test-bdrv-drain.c     |  10 +-
 5 files changed, 89 insertions(+), 97 deletions(-)

diff --git a/block.c b/block.c
index 8e8ed639fe..6b43e101a1 100644
--- a/block.c
+++ b/block.c
@@ -1438,9 +1438,33 @@ static void bdrv_child_cb_attach(BdrvChild *child)
 
     assert_bdrv_graph_writable(bs);
     QLIST_INSERT_HEAD(&bs->children, child, next);
-
-    if (child->role & BDRV_CHILD_COW) {
+    if (bs->drv->is_filter | (child->role & BDRV_CHILD_FILTERED)) {
+        /*
+         * Here we handle filters and block/raw-format.c when it behave like
+         * filter.
+         */
+        assert(!(child->role & BDRV_CHILD_COW));
+        if (child->role & (BDRV_CHILD_PRIMARY | BDRV_CHILD_FILTERED)) {
+            assert(child->role & BDRV_CHILD_PRIMARY);
+            assert(child->role & BDRV_CHILD_FILTERED);
+            assert(!bs->backing);
+            assert(!bs->file);
+
+            if (bs->drv->filtered_child_is_backing) {
+                bs->backing = child;
+            } else {
+                bs->file = child;
+            }
+        }
+    } else if (child->role & BDRV_CHILD_COW) {
+        assert(bs->drv->supports_backing);
+        assert(!(child->role & BDRV_CHILD_PRIMARY));
+        assert(!bs->backing);
+        bs->backing = child;
         bdrv_backing_attach(child);
+    } else if (child->role & BDRV_CHILD_PRIMARY) {
+        assert(!bs->file);
+        bs->file = child;
     }
 
     bdrv_apply_subtree_drain(child, bs);
@@ -1458,6 +1482,12 @@ static void bdrv_child_cb_detach(BdrvChild *child)
 
     assert_bdrv_graph_writable(bs);
     QLIST_REMOVE(child, next);
+    if (child == bs->backing) {
+        assert(child != bs->file);
+        bs->backing = NULL;
+    } else if (child == bs->file) {
+        bs->file = NULL;
+    }
 }
 
 static int bdrv_child_cb_update_filename(BdrvChild *c, BlockDriverState *base,
@@ -1663,7 +1693,7 @@ open_failed:
     bs->drv = NULL;
     if (bs->file != NULL) {
         bdrv_unref_child(bs, bs->file);
-        bs->file = NULL;
+        assert(!bs->file);
     }
     g_free(bs->opaque);
     bs->opaque = NULL;
@@ -2852,7 +2882,7 @@ static void bdrv_child_free(BdrvChild *child)
 }
 
 typedef struct BdrvAttachChildCommonState {
-    BdrvChild **child;
+    BdrvChild *child;
     AioContext *old_parent_ctx;
     AioContext *old_child_ctx;
 } BdrvAttachChildCommonState;
@@ -2860,33 +2890,31 @@ typedef struct BdrvAttachChildCommonState {
 static void bdrv_attach_child_common_abort(void *opaque)
 {
     BdrvAttachChildCommonState *s = opaque;
-    BdrvChild *child = *s->child;
-    BlockDriverState *bs = child->bs;
+    BlockDriverState *bs = s->child->bs;
 
     GLOBAL_STATE_CODE();
-    bdrv_replace_child_noperm(child, NULL);
+    bdrv_replace_child_noperm(s->child, NULL);
 
     if (bdrv_get_aio_context(bs) != s->old_child_ctx) {
         bdrv_try_set_aio_context(bs, s->old_child_ctx, &error_abort);
     }
 
-    if (bdrv_child_get_parent_aio_context(child) != s->old_parent_ctx) {
+    if (bdrv_child_get_parent_aio_context(s->child) != s->old_parent_ctx) {
         GSList *ignore;
 
         /* No need to ignore `child`, because it has been detached already */
         ignore = NULL;
-        child->klass->can_set_aio_ctx(child, s->old_parent_ctx, &ignore,
-                                      &error_abort);
+        s->child->klass->can_set_aio_ctx(s->child, s->old_parent_ctx, &ignore,
+                                         &error_abort);
         g_slist_free(ignore);
 
         ignore = NULL;
-        child->klass->set_aio_ctx(child, s->old_parent_ctx, &ignore);
+        s->child->klass->set_aio_ctx(s->child, s->old_parent_ctx, &ignore);
         g_slist_free(ignore);
     }
 
     bdrv_unref(bs);
-    bdrv_child_free(child);
-    *s->child = NULL;
+    bdrv_child_free(s->child);
 }
 
 static TransactionActionDrv bdrv_attach_child_common_drv = {
@@ -2897,11 +2925,11 @@ static TransactionActionDrv bdrv_attach_child_common_drv = {
 /*
  * Common part of attaching bdrv child to bs or to blk or to job
  *
- * Resulting new child is returned through @child.
- * At start *@child must be NULL.
- * @child is saved to a new entry of @tran, so that *@child could be reverted to
- * NULL on abort(). So referenced variable must live at least until transaction
- * end.
+ * If @child is not NULL, it's set to the newly created child. Note that the
+ * @child pointer is not stored in the transaction and therefore not cleared on
+ * abort. Consider @child as part of the return value: we could change the
+ * return value of the function to BdrvChild* and return the child directly,
+ * but this way we would lose the different return codes.
  *
  * Function doesn't update permissions, caller is responsible for this.
  */
@@ -2917,8 +2945,6 @@ static int bdrv_attach_child_common(BlockDriverState *child_bs,
     AioContext *parent_ctx;
     AioContext *child_ctx = bdrv_get_aio_context(child_bs);
 
-    assert(child);
-    assert(*child == NULL);
     assert(child_class->get_parent_desc);
     GLOBAL_STATE_CODE();
 
@@ -2967,22 +2993,25 @@ static int bdrv_attach_child_common(BlockDriverState *child_bs,
     bdrv_ref(child_bs);
     bdrv_replace_child_noperm(new_child, child_bs);
 
-    *child = new_child;
-
     BdrvAttachChildCommonState *s = g_new(BdrvAttachChildCommonState, 1);
     *s = (BdrvAttachChildCommonState) {
-        .child = child,
+        .child = new_child,
         .old_parent_ctx = parent_ctx,
         .old_child_ctx = child_ctx,
     };
     tran_add(tran, &bdrv_attach_child_common_drv, s);
 
+    if (child) {
+        *child = new_child;
+    }
+
     return 0;
 }
 
 /*
- * Variable referenced by @child must live at least until transaction end.
- * (see bdrv_attach_child_common() doc for details)
+ * If @child is not NULL, it's set to new created child. @child is a part of
+ * return value, not a part of transaction (see bdrv_attach_child_common() doc
+ * for details).
  *
  * Function doesn't update permissions, caller is responsible for this.
  */
@@ -3063,7 +3092,7 @@ BdrvChild *bdrv_root_attach_child(BlockDriverState *child_bs,
                                   void *opaque, Error **errp)
 {
     int ret;
-    BdrvChild *child = NULL;
+    BdrvChild *child;
     Transaction *tran = tran_new();
 
     GLOBAL_STATE_CODE();
@@ -3079,11 +3108,10 @@ BdrvChild *bdrv_root_attach_child(BlockDriverState *child_bs,
 
 out:
     tran_finalize(tran, ret);
-    /* child is unset on failure by bdrv_attach_child_common_abort() */
-    assert((ret < 0) == !child);
 
     bdrv_unref(child_bs);
-    return child;
+
+    return ret < 0 ? NULL : child;
 }
 
 /*
@@ -3105,7 +3133,7 @@ BdrvChild *bdrv_attach_child(BlockDriverState *parent_bs,
                              Error **errp)
 {
     int ret;
-    BdrvChild *child = NULL;
+    BdrvChild *child;
     Transaction *tran = tran_new();
 
     GLOBAL_STATE_CODE();
@@ -3123,12 +3151,10 @@ BdrvChild *bdrv_attach_child(BlockDriverState *parent_bs,
 
 out:
     tran_finalize(tran, ret);
-    /* child is unset on failure by bdrv_attach_child_common_abort() */
-    assert((ret < 0) == !child);
 
     bdrv_unref(child_bs);
 
-    return child;
+    return ret < 0 ? NULL : child;
 }
 
 /* Callers must ensure that child->frozen is false. */
@@ -3331,9 +3357,7 @@ static int bdrv_set_file_or_backing_noperm(BlockDriverState *parent_bs,
     ret = bdrv_attach_child_noperm(parent_bs, child_bs,
                                    is_backing ? "backing" : "file",
                                    &child_of_bds, role,
-                                   is_backing ? &parent_bs->backing :
-                                                &parent_bs->file,
-                                   tran, errp);
+                                   NULL, tran, errp);
     if (ret < 0) {
         return ret;
     }
@@ -3591,14 +3615,16 @@ int bdrv_open_file_child(const char *filename,
 
     /* commit_top and mirror_top don't use this function */
     assert(!parent->drv->filtered_child_is_backing);
-
     role = parent->drv->is_filter ?
         (BDRV_CHILD_FILTERED | BDRV_CHILD_PRIMARY) : BDRV_CHILD_IMAGE;
 
-    parent->file = bdrv_open_child(filename, options, bdref_key, parent,
-                                   &child_of_bds, role, false, errp);
+    if (!bdrv_open_child(filename, options, bdref_key, parent,
+                         &child_of_bds, role, false, errp))
+    {
+        return -EINVAL;
+    }
 
-    return parent->file ? 0 : -EINVAL;
+    return 0;
 }
 
 /*
@@ -4873,8 +4899,8 @@ static void bdrv_close(BlockDriverState *bs)
         bdrv_unref_child(bs, child);
     }
 
-    bs->backing = NULL;
-    bs->file = NULL;
+    assert(!bs->backing);
+    assert(!bs->file);
     g_free(bs->opaque);
     bs->opaque = NULL;
     qatomic_set(&bs->copy_on_read, 0);
@@ -5005,41 +5031,14 @@ static bool should_update_child(BdrvChild *c, BlockDriverState *to)
     return ret;
 }
 
-typedef struct BdrvRemoveFilterOrCowChild {
-    BdrvChild *child;
-    bool is_backing;
-} BdrvRemoveFilterOrCowChild;
-
-static void bdrv_remove_filter_or_cow_child_abort(void *opaque)
-{
-    BdrvRemoveFilterOrCowChild *s = opaque;
-    BlockDriverState *parent_bs = s->child->opaque;
-
-    if (s->is_backing) {
-        parent_bs->backing = s->child;
-    } else {
-        parent_bs->file = s->child;
-    }
-
-    /*
-     * We don't have to restore child->bs here to undo bdrv_replace_child_tran()
-     * because that function is transactionable and it registered own completion
-     * entries in @tran, so .abort() for bdrv_replace_child_safe() will be
-     * called automatically.
-     */
-}
-
 static void bdrv_remove_filter_or_cow_child_commit(void *opaque)
 {
-    BdrvRemoveFilterOrCowChild *s = opaque;
     GLOBAL_STATE_CODE();
-    bdrv_child_free(s->child);
+    bdrv_child_free(opaque);
 }
 
 static TransactionActionDrv bdrv_remove_filter_or_cow_child_drv = {
-    .abort = bdrv_remove_filter_or_cow_child_abort,
     .commit = bdrv_remove_filter_or_cow_child_commit,
-    .clean = g_free,
 };
 
 /*
@@ -5050,8 +5049,6 @@ static void bdrv_remove_file_or_backing_child(BlockDriverState *bs,
                                               BdrvChild *child,
                                               Transaction *tran)
 {
-    BdrvRemoveFilterOrCowChild *s;
-
     assert(child == bs->backing || child == bs->file);
 
     if (!child) {
@@ -5062,18 +5059,7 @@ static void bdrv_remove_file_or_backing_child(BlockDriverState *bs,
         bdrv_replace_child_tran(child, NULL, tran);
     }
 
-    s = g_new(BdrvRemoveFilterOrCowChild, 1);
-    *s = (BdrvRemoveFilterOrCowChild) {
-        .child = child,
-        .is_backing = (child == bs->backing),
-    };
-    tran_add(tran, &bdrv_remove_filter_or_cow_child_drv, s);
-
-    if (s->is_backing) {
-        bs->backing = NULL;
-    } else {
-        bs->file = NULL;
-    }
+    tran_add(tran, &bdrv_remove_filter_or_cow_child_drv, child);
 }
 
 /*
@@ -5235,7 +5221,7 @@ int bdrv_append(BlockDriverState *bs_new, BlockDriverState *bs_top,
 
     ret = bdrv_attach_child_noperm(bs_new, bs_top, "backing",
                                    &child_of_bds, bdrv_backing_role(bs_new),
-                                   &bs_new->backing, tran, errp);
+                                   NULL, tran, errp);
     if (ret < 0) {
         goto out;
     }
diff --git a/block/raw-format.c b/block/raw-format.c
index 69fd650eaf..d8ca8ee3a9 100644
--- a/block/raw-format.c
+++ b/block/raw-format.c
@@ -457,8 +457,8 @@ static int raw_open(BlockDriverState *bs, QDict *options, int flags,
         file_role = BDRV_CHILD_FILTERED | BDRV_CHILD_PRIMARY;
     }
 
-    bs->file = bdrv_open_child(NULL, options, "file", bs, &child_of_bds,
-                               file_role, false, errp);
+    bdrv_open_child(NULL, options, "file", bs, &child_of_bds,
+                    file_role, false, errp);
     if (!bs->file) {
         return -EINVAL;
     }
diff --git a/block/snapshot.c b/block/snapshot.c
index f4ec4f9ef3..02a880911f 100644
--- a/block/snapshot.c
+++ b/block/snapshot.c
@@ -288,7 +288,6 @@ int bdrv_snapshot_goto(BlockDriverState *bs,
 
         /* .bdrv_open() will re-attach it */
         bdrv_unref_child(bs, *fallback_ptr);
-        *fallback_ptr = NULL;
 
         ret = bdrv_snapshot_goto(fallback_bs, snapshot_id, errp);
         open_ret = drv->bdrv_open(bs, options, bs->open_flags, &local_err);
diff --git a/include/block/block_int-common.h b/include/block/block_int-common.h
index d68adc6ff3..c4d8b11dbb 100644
--- a/include/block/block_int-common.h
+++ b/include/block/block_int-common.h
@@ -1056,9 +1056,6 @@ struct BlockDriverState {
     QDict *full_open_options;
     char exact_filename[PATH_MAX];
 
-    BdrvChild *backing;
-    BdrvChild *file;
-
     /* I/O Limits */
     BlockLimits bl;
 
@@ -1117,7 +1114,19 @@ struct BlockDriverState {
      * parent node of this node.
      */
     BlockDriverState *inherits_from;
+
+    /*
+     * @backing and @file are some of @children or NULL. All these three fields
+     * (@file, @backing and @children) are modified only in
+     * bdrv_child_cb_attach() and bdrv_child_cb_detach().
+     *
+     * See also comment in include/block/block.h, to learn how backing and file
+     * are connected with BdrvChildRole.
+     */
     QLIST_HEAD(, BdrvChild) children;
+    BdrvChild *backing;
+    BdrvChild *file;
+
     QLIST_HEAD(, BdrvChild) parents;
 
     QDict *options;
diff --git a/tests/unit/test-bdrv-drain.c b/tests/unit/test-bdrv-drain.c
index 23d425a494..4cf99edf5b 100644
--- a/tests/unit/test-bdrv-drain.c
+++ b/tests/unit/test-bdrv-drain.c
@@ -1808,9 +1808,8 @@ static void test_drop_intermediate_poll(void)
     for (i = 0; i < 3; i++) {
         if (i) {
             /* Takes the reference to chain[i - 1] */
-            chain[i]->backing = bdrv_attach_child(chain[i], chain[i - 1],
-                                                  "chain", &chain_child_class,
-                                                  BDRV_CHILD_COW, &error_abort);
+            bdrv_attach_child(chain[i], chain[i - 1], "chain",
+                              &chain_child_class, BDRV_CHILD_COW, &error_abort);
         }
     }
 
@@ -2028,9 +2027,8 @@ static void do_test_replace_child_mid_drain(int old_drain_count,
     new_child_bs->total_sectors = 1;
 
     bdrv_ref(old_child_bs);
-    parent_bs->backing = bdrv_attach_child(parent_bs, old_child_bs, "child",
-                                           &child_of_bds, BDRV_CHILD_COW,
-                                           &error_abort);
+    bdrv_attach_child(parent_bs, old_child_bs, "child", &child_of_bds,
+                      BDRV_CHILD_COW, &error_abort);
 
     for (i = 0; i < old_drain_count; i++) {
         bdrv_drained_begin(old_child_bs);
-- 
2.35.1




* [PATCH v5 14/45] block/snapshot: drop indirection around bdrv_snapshot_fallback_ptr
  2022-03-30 21:28 [PATCH v5 00/45] Transactional block-graph modifying API Vladimir Sementsov-Ogievskiy
                   ` (12 preceding siblings ...)
  2022-03-30 21:28 ` [PATCH v5 13/45] block: Manipulate bs->file / bs->backing pointers in .attach/.detach Vladimir Sementsov-Ogievskiy
@ 2022-03-30 21:28 ` Vladimir Sementsov-Ogievskiy
  2022-06-07 15:58   ` Hanna Reitz
  2022-03-30 21:28 ` [PATCH v5 15/45] block: refactor bdrv_remove_file_or_backing_child to bdrv_remove_child Vladimir Sementsov-Ogievskiy
                   ` (30 subsequent siblings)
  44 siblings, 1 reply; 78+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2022-03-30 21:28 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, hreitz, vsementsov, v.sementsov-og, qemu-devel

Now that the indirection is no longer used, we can safely reduce it to
a simple pointer.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
---
 block/snapshot.c | 39 +++++++++++++++++----------------------
 1 file changed, 17 insertions(+), 22 deletions(-)

diff --git a/block/snapshot.c b/block/snapshot.c
index 02a880911f..4eb9258de6 100644
--- a/block/snapshot.c
+++ b/block/snapshot.c
@@ -151,34 +151,29 @@ bool bdrv_snapshot_find_by_id_and_name(BlockDriverState *bs,
 }
 
 /**
- * Return a pointer to the child BDS pointer to which we can fall
+ * Return a pointer to child of given BDS to which we can fall
  * back if the given BDS does not support snapshots.
  * Return NULL if there is no BDS to (safely) fall back to.
- *
- * We need to return an indirect pointer because bdrv_snapshot_goto()
- * has to modify the BdrvChild pointer.
  */
-static BdrvChild **bdrv_snapshot_fallback_ptr(BlockDriverState *bs)
+static BdrvChild *bdrv_snapshot_fallback_ptr(BlockDriverState *bs)
 {
-    BdrvChild **fallback;
-    BdrvChild *child = bdrv_primary_child(bs);
+    BdrvChild *fallback = bdrv_primary_child(bs);
+    BdrvChild *child;
 
     /* We allow fallback only to primary child */
-    if (!child) {
+    if (!fallback) {
         return NULL;
     }
-    fallback = (child == bs->file ? &bs->file : &bs->backing);
-    assert(*fallback == child);
 
     /*
      * Check that there are no other children that would need to be
      * snapshotted.  If there are, it is not safe to fall back to
-     * *fallback.
+     * fallback.
      */
     QLIST_FOREACH(child, &bs->children, next) {
         if (child->role & (BDRV_CHILD_DATA | BDRV_CHILD_METADATA |
                            BDRV_CHILD_FILTERED) &&
-            child != *fallback)
+            child != fallback)
         {
             return NULL;
         }
@@ -189,8 +184,8 @@ static BdrvChild **bdrv_snapshot_fallback_ptr(BlockDriverState *bs)
 
 static BlockDriverState *bdrv_snapshot_fallback(BlockDriverState *bs)
 {
-    BdrvChild **child_ptr = bdrv_snapshot_fallback_ptr(bs);
-    return child_ptr ? (*child_ptr)->bs : NULL;
+    BdrvChild *child_ptr = bdrv_snapshot_fallback_ptr(bs);
+    return child_ptr ? child_ptr->bs : NULL;
 }
 
 int bdrv_can_snapshot(BlockDriverState *bs)
@@ -237,7 +232,7 @@ int bdrv_snapshot_goto(BlockDriverState *bs,
                        Error **errp)
 {
     BlockDriver *drv = bs->drv;
-    BdrvChild **fallback_ptr;
+    BdrvChild *fallback;
     int ret, open_ret;
 
     GLOBAL_STATE_CODE();
@@ -260,13 +255,13 @@ int bdrv_snapshot_goto(BlockDriverState *bs,
         return ret;
     }
 
-    fallback_ptr = bdrv_snapshot_fallback_ptr(bs);
-    if (fallback_ptr) {
+    fallback = bdrv_snapshot_fallback_ptr(bs);
+    if (fallback) {
         QDict *options;
         QDict *file_options;
         Error *local_err = NULL;
-        BlockDriverState *fallback_bs = (*fallback_ptr)->bs;
-        char *subqdict_prefix = g_strdup_printf("%s.", (*fallback_ptr)->name);
+        BlockDriverState *fallback_bs = fallback->bs;
+        char *subqdict_prefix = g_strdup_printf("%s.", fallback->name);
 
         options = qdict_clone_shallow(bs->options);
 
@@ -277,8 +272,8 @@ int bdrv_snapshot_goto(BlockDriverState *bs,
         qobject_unref(file_options);
         g_free(subqdict_prefix);
 
-        /* Force .bdrv_open() below to re-attach fallback_bs on *fallback_ptr */
-        qdict_put_str(options, (*fallback_ptr)->name,
+        /* Force .bdrv_open() below to re-attach fallback_bs on fallback */
+        qdict_put_str(options, fallback->name,
                       bdrv_get_node_name(fallback_bs));
 
         /* Now close bs, apply the snapshot on fallback_bs, and re-open bs */
@@ -287,7 +282,7 @@ int bdrv_snapshot_goto(BlockDriverState *bs,
         }
 
         /* .bdrv_open() will re-attach it */
-        bdrv_unref_child(bs, *fallback_ptr);
+        bdrv_unref_child(bs, fallback);
 
         ret = bdrv_snapshot_goto(fallback_bs, snapshot_id, errp);
         open_ret = drv->bdrv_open(bs, options, bs->open_flags, &local_err);
-- 
2.35.1




* [PATCH v5 15/45] block: refactor bdrv_remove_file_or_backing_child to bdrv_remove_child
  2022-03-30 21:28 [PATCH v5 00/45] Transactional block-graph modifying API Vladimir Sementsov-Ogievskiy
                   ` (13 preceding siblings ...)
  2022-03-30 21:28 ` [PATCH v5 14/45] block/snapshot: drop indirection around bdrv_snapshot_fallback_ptr Vladimir Sementsov-Ogievskiy
@ 2022-03-30 21:28 ` Vladimir Sementsov-Ogievskiy
  2022-06-08 10:04   ` Hanna Reitz
  2022-03-30 21:28 ` [PATCH v5 16/45] block: drop bdrv_detach_child() Vladimir Sementsov-Ogievskiy
                   ` (29 subsequent siblings)
  44 siblings, 1 reply; 78+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2022-03-30 21:28 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, hreitz, vsementsov, v.sementsov-og, qemu-devel

Now the function can remove any child, so give it a more generic name.
Drop the assertion and the bs argument, which becomes unused. The
function will be reused in a later commit.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
---
 block.c | 22 ++++++++--------------
 1 file changed, 8 insertions(+), 14 deletions(-)

diff --git a/block.c b/block.c
index 6b43e101a1..ea5687edc8 100644
--- a/block.c
+++ b/block.c
@@ -92,9 +92,7 @@ static bool bdrv_recurse_has_child(BlockDriverState *bs,
 
 static void bdrv_replace_child_noperm(BdrvChild *child,
                                       BlockDriverState *new_bs);
-static void bdrv_remove_file_or_backing_child(BlockDriverState *bs,
-                                              BdrvChild *child,
-                                              Transaction *tran);
+static void bdrv_remove_child(BdrvChild *child, Transaction *tran);
 static void bdrv_remove_filter_or_cow_child(BlockDriverState *bs,
                                             Transaction *tran);
 
@@ -3347,7 +3345,7 @@ static int bdrv_set_file_or_backing_noperm(BlockDriverState *parent_bs,
 
     if (child) {
         bdrv_unset_inherits_from(parent_bs, child, tran);
-        bdrv_remove_file_or_backing_child(parent_bs, child, tran);
+        bdrv_remove_child(child, tran);
     }
 
     if (!child_bs) {
@@ -5031,26 +5029,22 @@ static bool should_update_child(BdrvChild *c, BlockDriverState *to)
     return ret;
 }
 
-static void bdrv_remove_filter_or_cow_child_commit(void *opaque)
+static void bdrv_remove_child_commit(void *opaque)
 {
     GLOBAL_STATE_CODE();
     bdrv_child_free(opaque);
 }
 
-static TransactionActionDrv bdrv_remove_filter_or_cow_child_drv = {
-    .commit = bdrv_remove_filter_or_cow_child_commit,
+static TransactionActionDrv bdrv_remove_child_drv = {
+    .commit = bdrv_remove_child_commit,
 };
 
 /*
  * A function to remove backing or file child of @bs.
  * Function doesn't update permissions, caller is responsible for this.
  */
-static void bdrv_remove_file_or_backing_child(BlockDriverState *bs,
-                                              BdrvChild *child,
-                                              Transaction *tran)
+static void bdrv_remove_child(BdrvChild *child, Transaction *tran)
 {
-    assert(child == bs->backing || child == bs->file);
-
     if (!child) {
         return;
     }
@@ -5059,7 +5053,7 @@ static void bdrv_remove_file_or_backing_child(BlockDriverState *bs,
         bdrv_replace_child_tran(child, NULL, tran);
     }
 
-    tran_add(tran, &bdrv_remove_filter_or_cow_child_drv, child);
+    tran_add(tran, &bdrv_remove_child_drv, child);
 }
 
 /*
@@ -5070,7 +5064,7 @@ static void bdrv_remove_file_or_backing_child(BlockDriverState *bs,
 static void bdrv_remove_filter_or_cow_child(BlockDriverState *bs,
                                             Transaction *tran)
 {
-    bdrv_remove_file_or_backing_child(bs, bdrv_filter_or_cow_child(bs), tran);
+    bdrv_remove_child(bdrv_filter_or_cow_child(bs), tran);
 }
 
 static int bdrv_replace_node_noperm(BlockDriverState *from,
-- 
2.35.1




* [PATCH v5 16/45] block: drop bdrv_detach_child()
  2022-03-30 21:28 [PATCH v5 00/45] Transactional block-graph modifying API Vladimir Sementsov-Ogievskiy
                   ` (14 preceding siblings ...)
  2022-03-30 21:28 ` [PATCH v5 15/45] block: refactor bdrv_remove_file_or_backing_child to bdrv_remove_child Vladimir Sementsov-Ogievskiy
@ 2022-03-30 21:28 ` Vladimir Sementsov-Ogievskiy
  2022-06-08 10:22   ` Hanna Reitz
  2022-03-30 21:28 ` [PATCH v5 17/45] block: drop bdrv_remove_filter_or_cow_child Vladimir Sementsov-Ogievskiy
                   ` (28 subsequent siblings)
  44 siblings, 1 reply; 78+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2022-03-30 21:28 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, hreitz, vsementsov, v.sementsov-og, qemu-devel

The only caller is bdrv_root_unref_child(), so let's do the logic
directly in it. This simplifies the later conversion of
bdrv_root_unref_child() to a transaction action.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
---
 block.c | 45 ++++++++++++++++++---------------------------
 1 file changed, 18 insertions(+), 27 deletions(-)

diff --git a/block.c b/block.c
index ea5687edc8..34e89b277f 100644
--- a/block.c
+++ b/block.c
@@ -3048,30 +3048,6 @@ static int bdrv_attach_child_noperm(BlockDriverState *parent_bs,
     return 0;
 }
 
-static void bdrv_detach_child(BdrvChild *child)
-{
-    BlockDriverState *old_bs = child->bs;
-
-    GLOBAL_STATE_CODE();
-    bdrv_replace_child_noperm(child, NULL);
-    bdrv_child_free(child);
-
-    if (old_bs) {
-        /*
-         * Update permissions for old node. We're just taking a parent away, so
-         * we're loosening restrictions. Errors of permission update are not
-         * fatal in this case, ignore them.
-         */
-        bdrv_refresh_perms(old_bs, NULL);
-
-        /*
-         * When the parent requiring a non-default AioContext is removed, the
-         * node moves back to the main AioContext
-         */
-        bdrv_try_set_aio_context(old_bs, qemu_get_aio_context(), NULL);
-    }
-}
-
 /*
  * This function steals the reference to child_bs from the caller.
  * That reference is later dropped by bdrv_root_unref_child().
@@ -3158,12 +3134,27 @@ out:
 /* Callers must ensure that child->frozen is false. */
 void bdrv_root_unref_child(BdrvChild *child)
 {
-    BlockDriverState *child_bs;
+    BlockDriverState *child_bs = child->bs;
 
     GLOBAL_STATE_CODE();
+    bdrv_replace_child_noperm(child, NULL);
+    bdrv_child_free(child);
+
+    if (child_bs) {
+        /*
+         * Update permissions for old node. We're just taking a parent away, so
+         * we're loosening restrictions. Errors of permission update are not
+         * fatal in this case, ignore them.
+         */
+        bdrv_refresh_perms(child_bs, NULL);
+
+        /*
+         * When the parent requiring a non-default AioContext is removed, the
+         * node moves back to the main AioContext
+         */
+        bdrv_try_set_aio_context(child_bs, qemu_get_aio_context(), NULL);
+    }
 
-    child_bs = child->bs;
-    bdrv_detach_child(child);
     bdrv_unref(child_bs);
 }
 
-- 
2.35.1




* [PATCH v5 17/45] block: drop bdrv_remove_filter_or_cow_child
  2022-03-30 21:28 [PATCH v5 00/45] Transactional block-graph modifying API Vladimir Sementsov-Ogievskiy
                   ` (15 preceding siblings ...)
  2022-03-30 21:28 ` [PATCH v5 16/45] block: drop bdrv_detach_child() Vladimir Sementsov-Ogievskiy
@ 2022-03-30 21:28 ` Vladimir Sementsov-Ogievskiy
  2022-06-08 10:40   ` Hanna Reitz
  2022-03-30 21:28 ` [PATCH v5 18/45] block: bdrv_refresh_perms(): allow external tran Vladimir Sementsov-Ogievskiy
                   ` (27 subsequent siblings)
  44 siblings, 1 reply; 78+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2022-03-30 21:28 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, hreitz, vsementsov, v.sementsov-og, qemu-devel

Drop this simple wrapper, which is used in only one place. We have too
many graph-modifying functions even without it.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
---
 block.c | 15 +--------------
 1 file changed, 1 insertion(+), 14 deletions(-)

diff --git a/block.c b/block.c
index 34e89b277f..656e596e0c 100644
--- a/block.c
+++ b/block.c
@@ -93,8 +93,6 @@ static bool bdrv_recurse_has_child(BlockDriverState *bs,
 static void bdrv_replace_child_noperm(BdrvChild *child,
                                       BlockDriverState *new_bs);
 static void bdrv_remove_child(BdrvChild *child, Transaction *tran);
-static void bdrv_remove_filter_or_cow_child(BlockDriverState *bs,
-                                            Transaction *tran);
 
 static int bdrv_reopen_prepare(BDRVReopenState *reopen_state,
                                BlockReopenQueue *queue,
@@ -5047,17 +5045,6 @@ static void bdrv_remove_child(BdrvChild *child, Transaction *tran)
     tran_add(tran, &bdrv_remove_child_drv, child);
 }
 
-/*
- * A function to remove backing-chain child of @bs if exists: cow child for
- * format nodes (always .backing) and filter child for filters (may be .file or
- * .backing)
- */
-static void bdrv_remove_filter_or_cow_child(BlockDriverState *bs,
-                                            Transaction *tran)
-{
-    bdrv_remove_child(bdrv_filter_or_cow_child(bs), tran);
-}
-
 static int bdrv_replace_node_noperm(BlockDriverState *from,
                                     BlockDriverState *to,
                                     bool auto_skip, Transaction *tran,
@@ -5142,7 +5129,7 @@ static int bdrv_replace_node_common(BlockDriverState *from,
     }
 
     if (detach_subchain) {
-        bdrv_remove_filter_or_cow_child(to_cow_parent, tran);
+        bdrv_remove_child(bdrv_filter_or_cow_child(to_cow_parent), tran);
     }
 
     found = g_hash_table_new(NULL, NULL);
-- 
2.35.1




* [PATCH v5 18/45] block: bdrv_refresh_perms(): allow external tran
  2022-03-30 21:28 [PATCH v5 00/45] Transactional block-graph modifying API Vladimir Sementsov-Ogievskiy
                   ` (16 preceding siblings ...)
  2022-03-30 21:28 ` [PATCH v5 17/45] block: drop bdrv_remove_filter_or_cow_child Vladimir Sementsov-Ogievskiy
@ 2022-03-30 21:28 ` Vladimir Sementsov-Ogievskiy
  2022-06-08 10:57   ` Hanna Reitz
  2022-03-30 21:28 ` [PATCH v5 19/45] block: refactor bdrv_list_refresh_perms to allow any list of nodes Vladimir Sementsov-Ogievskiy
                   ` (26 subsequent siblings)
  44 siblings, 1 reply; 78+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2022-03-30 21:28 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, hreitz, vsementsov, v.sementsov-og, qemu-devel

Allow passing an external Transaction pointer and stop creating extra
Transaction objects.
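
For illustration only, here is a minimal stand-alone sketch of the
"optional external transaction" pattern. It is not QEMU code: the Toy*
types and toy_* functions are hypothetical stand-ins, and a negative
permission value is just an assumed way to model a failing refresh. A
NULL transaction means the function creates and finalizes its own, so
the caller cannot roll the change back later.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-ins, NOT QEMU's Transaction API: a stack of undo callbacks. */
typedef struct ToyAction {
    void (*abort)(void *opaque);
    void *opaque;
    struct ToyAction *next;
} ToyAction;

typedef struct ToyTran {
    ToyAction *actions;
} ToyTran;

typedef struct ToyNode {
    int perm;
} ToyNode;

typedef struct ToyPermState {
    ToyNode *node;
    int old_perm;
} ToyPermState;

static ToyTran *toy_tran_new(void)
{
    return calloc(1, sizeof(ToyTran));
}

static void toy_tran_add(ToyTran *tran, void (*abort_fn)(void *), void *opaque)
{
    ToyAction *act = malloc(sizeof(*act));

    act->abort = abort_fn;
    act->opaque = opaque;
    act->next = tran->actions;   /* newest action goes first */
    tran->actions = act;
}

/* ret < 0: undo newest first; otherwise keep the changes. Frees everything. */
static void toy_tran_finalize(ToyTran *tran, int ret)
{
    while (tran->actions) {
        ToyAction *act = tran->actions;

        tran->actions = act->next;
        if (ret < 0) {
            act->abort(act->opaque);
        }
        free(act->opaque);
        free(act);
    }
    free(tran);
}

static void toy_perm_abort(void *opaque)
{
    ToyPermState *s = opaque;

    s->node->perm = s->old_perm;
}

/* @tran may be NULL; then a local transaction is used and finalized here. */
static int toy_refresh_perms(ToyNode *node, int new_perm, ToyTran *tran)
{
    ToyTran *local_tran = NULL;
    ToyPermState *s;
    int ret;

    if (!tran) {
        tran = local_tran = toy_tran_new();
    }

    s = malloc(sizeof(*s));
    s->node = node;
    s->old_perm = node->perm;
    toy_tran_add(tran, toy_perm_abort, s);

    node->perm = new_perm;
    ret = new_perm < 0 ? -1 : 0;   /* assumed failure model */

    if (local_tran) {
        toy_tran_finalize(local_tran, ret);
    }
    return ret;
}
```

With an external transaction, the undo action simply joins the caller's
list and is committed or aborted together with everything else the
caller registered.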

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
---
 block.c | 31 ++++++++++++++++++++-----------
 1 file changed, 20 insertions(+), 11 deletions(-)

diff --git a/block.c b/block.c
index 656e596e0c..f3ed351360 100644
--- a/block.c
+++ b/block.c
@@ -2557,15 +2557,24 @@ char *bdrv_perm_names(uint64_t perm)
 }
 
 
-static int bdrv_refresh_perms(BlockDriverState *bs, Error **errp)
+/* @tran is allowed to be NULL. In this case no rollback is possible */
+static int bdrv_refresh_perms(BlockDriverState *bs, Transaction *tran,
+                              Error **errp)
 {
     int ret;
-    Transaction *tran = tran_new();
+    Transaction *local_tran = NULL;
     g_autoptr(GSList) list = bdrv_topological_dfs(NULL, NULL, bs);
     GLOBAL_STATE_CODE();
 
+    if (!tran) {
+        tran = local_tran = tran_new();
+    }
+
     ret = bdrv_list_refresh_perms(list, NULL, tran, errp);
-    tran_finalize(tran, ret);
+
+    if (local_tran) {
+        tran_finalize(local_tran, ret);
+    }
 
     return ret;
 }
@@ -2581,7 +2590,7 @@ int bdrv_child_try_set_perm(BdrvChild *c, uint64_t perm, uint64_t shared,
 
     bdrv_child_set_perm(c, perm, shared, tran);
 
-    ret = bdrv_refresh_perms(c->bs, &local_err);
+    ret = bdrv_refresh_perms(c->bs, tran, &local_err);
 
     tran_finalize(tran, ret);
 
@@ -3076,7 +3085,7 @@ BdrvChild *bdrv_root_attach_child(BlockDriverState *child_bs,
         goto out;
     }
 
-    ret = bdrv_refresh_perms(child_bs, errp);
+    ret = bdrv_refresh_perms(child_bs, tran, errp);
 
 out:
     tran_finalize(tran, ret);
@@ -3116,7 +3125,7 @@ BdrvChild *bdrv_attach_child(BlockDriverState *parent_bs,
         goto out;
     }
 
-    ret = bdrv_refresh_perms(parent_bs, errp);
+    ret = bdrv_refresh_perms(parent_bs, tran, errp);
     if (ret < 0) {
         goto out;
     }
@@ -3144,7 +3153,7 @@ void bdrv_root_unref_child(BdrvChild *child)
          * we're loosening restrictions. Errors of permission update are not
          * fatal in this case, ignore them.
          */
-        bdrv_refresh_perms(child_bs, NULL);
+        bdrv_refresh_perms(child_bs, NULL, NULL);
 
         /*
          * When the parent requiring a non-default AioContext is removed, the
@@ -3386,7 +3395,7 @@ int bdrv_set_backing_hd(BlockDriverState *bs, BlockDriverState *backing_hd,
         goto out;
     }
 
-    ret = bdrv_refresh_perms(bs, errp);
+    ret = bdrv_refresh_perms(bs, tran, errp);
 out:
     tran_finalize(tran, ret);
 
@@ -5203,7 +5212,7 @@ int bdrv_append(BlockDriverState *bs_new, BlockDriverState *bs_top,
         goto out;
     }
 
-    ret = bdrv_refresh_perms(bs_new, errp);
+    ret = bdrv_refresh_perms(bs_new, tran, errp);
 out:
     tran_finalize(tran, ret);
 
@@ -6500,7 +6509,7 @@ int bdrv_activate(BlockDriverState *bs, Error **errp)
      */
     if (bs->open_flags & BDRV_O_INACTIVE) {
         bs->open_flags &= ~BDRV_O_INACTIVE;
-        ret = bdrv_refresh_perms(bs, errp);
+        ret = bdrv_refresh_perms(bs, NULL, errp);
         if (ret < 0) {
             bs->open_flags |= BDRV_O_INACTIVE;
             return ret;
@@ -6645,7 +6654,7 @@ static int bdrv_inactivate_recurse(BlockDriverState *bs)
      * We only tried to loosen restrictions, so errors are not fatal, ignore
      * them.
      */
-    bdrv_refresh_perms(bs, NULL);
+    bdrv_refresh_perms(bs, NULL, NULL);
 
     /* Recursively inactivate children */
     QLIST_FOREACH(child, &bs->children, next) {
-- 
2.35.1




* [PATCH v5 19/45] block: refactor bdrv_list_refresh_perms to allow any list of nodes
  2022-03-30 21:28 [PATCH v5 00/45] Transactional block-graph modifying API Vladimir Sementsov-Ogievskiy
                   ` (17 preceding siblings ...)
  2022-03-30 21:28 ` [PATCH v5 18/45] block: bdrv_refresh_perms(): allow external tran Vladimir Sementsov-Ogievskiy
@ 2022-03-30 21:28 ` Vladimir Sementsov-Ogievskiy
  2022-06-08 11:27   ` Hanna Reitz
  2022-03-30 21:28 ` [PATCH v5 20/45] block: make permission update functions public Vladimir Sementsov-Ogievskiy
                   ` (25 subsequent siblings)
  44 siblings, 1 reply; 78+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2022-03-30 21:28 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, hreitz, vsementsov, v.sementsov-og, qemu-devel

We are going to collect nodes in a list and update their permissions
afterwards in more places. Calling bdrv_topological_dfs() each time is
inconvenient, and also incorrect, because we are going to interleave
graph modifications with filling the node list.

So, let's switch to a function that takes any list of nodes, adds all
their subtrees, does a topological sort and finally refreshes
permissions.

While at it, make the function public, as we'll want to use it from
blockdev.c in the near future.
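
As a rough stand-alone illustration of the idea (not the QEMU
implementation), the sketch below takes an arbitrary list of nodes that
may contain duplicates, completes it with all subtrees and orders the
result so that every parent comes before its children. The Toy* types,
toy_* functions and fixed-size arrays are assumptions made for brevity;
the "found" flag plays the role of the hash table.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_NODES 8
#define TOY_MAX_CHILDREN 4

/* Hypothetical toy graph node, not BlockDriverState. */
typedef struct ToyNode {
    const char *name;
    struct ToyNode *children[TOY_MAX_CHILDREN];
    size_t n_children;
    int found;                      /* visited marker */
} ToyNode;

typedef struct ToyList {
    ToyNode *items[TOY_MAX_NODES];
    size_t len;
} ToyList;

/* Post-order DFS: a node is appended only after its whole subtree. */
static void toy_dfs(ToyList *out, ToyNode *node)
{
    size_t i;

    if (node->found) {
        return;                     /* duplicates and shared children */
    }
    node->found = 1;

    for (i = 0; i < node->n_children; i++) {
        toy_dfs(out, node->children[i]);
    }

    assert(out->len < TOY_MAX_NODES);
    out->items[out->len++] = node;
}

/*
 * Complete @nodes with all subtrees and sort topologically.
 * Reversing the post-order puts every parent before its children.
 */
static void toy_collect(ToyNode **nodes, size_t n, ToyList *out)
{
    size_t i;

    out->len = 0;
    for (i = 0; i < n; i++) {
        toy_dfs(out, nodes[i]);
    }

    for (i = 0; i < out->len / 2; i++) {
        ToyNode *tmp = out->items[i];

        out->items[i] = out->items[out->len - 1 - i];
        out->items[out->len - 1 - i] = tmp;
    }
}
```

The caller only has to append the nodes it touched, in any order;
sorting and deduplication happen in one place before the refresh.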

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
---
 block.c | 51 ++++++++++++++++++++++++++++++++-------------------
 1 file changed, 32 insertions(+), 19 deletions(-)

diff --git a/block.c b/block.c
index f3ed351360..9009f73534 100644
--- a/block.c
+++ b/block.c
@@ -2487,8 +2487,12 @@ static int bdrv_node_refresh_perm(BlockDriverState *bs, BlockReopenQueue *q,
     return 0;
 }
 
-static int bdrv_list_refresh_perms(GSList *list, BlockReopenQueue *q,
-                                   Transaction *tran, Error **errp)
+/*
+ * @list is a product of bdrv_topological_dfs() (may be called several times) -
+ * a topologically sorted subgraph.
+ */
+static int bdrv_do_refresh_perms(GSList *list, BlockReopenQueue *q,
+                                 Transaction *tran, Error **errp)
 {
     int ret;
     BlockDriverState *bs;
@@ -2510,6 +2514,24 @@ static int bdrv_list_refresh_perms(GSList *list, BlockReopenQueue *q,
     return 0;
 }
 
+/*
+ * @list is any list of nodes. List is completed by all subtrees and
+ * topologically sorted. It's not a problem if some node occurs in the @list
+ * several times.
+ */
+static int bdrv_list_refresh_perms(GSList *list, BlockReopenQueue *q,
+                                   Transaction *tran, Error **errp)
+{
+    g_autoptr(GHashTable) found = g_hash_table_new(NULL, NULL);
+    g_autoptr(GSList) refresh_list = NULL;
+
+    for ( ; list; list = list->next) {
+        refresh_list = bdrv_topological_dfs(refresh_list, found, list->data);
+    }
+
+    return bdrv_do_refresh_perms(refresh_list, q, tran, errp);
+}
+
 void bdrv_get_cumulative_perm(BlockDriverState *bs, uint64_t *perm,
                               uint64_t *shared_perm)
 {
@@ -2570,7 +2592,7 @@ static int bdrv_refresh_perms(BlockDriverState *bs, Transaction *tran,
         tran = local_tran = tran_new();
     }
 
-    ret = bdrv_list_refresh_perms(list, NULL, tran, errp);
+    ret = bdrv_do_refresh_perms(list, NULL, tran, errp);
 
     if (local_tran) {
         tran_finalize(local_tran, ret);
@@ -4339,7 +4361,6 @@ int bdrv_reopen_multiple(BlockReopenQueue *bs_queue, Error **errp)
     BlockReopenQueueEntry *bs_entry, *next;
     AioContext *ctx;
     Transaction *tran = tran_new();
-    g_autoptr(GHashTable) found = NULL;
     g_autoptr(GSList) refresh_list = NULL;
 
     assert(qemu_get_current_aio_context() == qemu_get_aio_context());
@@ -4369,18 +4390,15 @@ int bdrv_reopen_multiple(BlockReopenQueue *bs_queue, Error **errp)
         bs_entry->prepared = true;
     }
 
-    found = g_hash_table_new(NULL, NULL);
     QTAILQ_FOREACH(bs_entry, bs_queue, entry) {
         BDRVReopenState *state = &bs_entry->state;
 
-        refresh_list = bdrv_topological_dfs(refresh_list, found, state->bs);
+        refresh_list = g_slist_prepend(refresh_list, state->bs);
         if (state->old_backing_bs) {
-            refresh_list = bdrv_topological_dfs(refresh_list, found,
-                                                state->old_backing_bs);
+            refresh_list = g_slist_prepend(refresh_list, state->old_backing_bs);
         }
         if (state->old_file_bs) {
-            refresh_list = bdrv_topological_dfs(refresh_list, found,
-                                                state->old_file_bs);
+            refresh_list = g_slist_prepend(refresh_list, state->old_file_bs);
         }
     }
 
@@ -5100,7 +5118,6 @@ static int bdrv_replace_node_common(BlockDriverState *from,
                                     Error **errp)
 {
     Transaction *tran = tran_new();
-    g_autoptr(GHashTable) found = NULL;
     g_autoptr(GSList) refresh_list = NULL;
     BlockDriverState *to_cow_parent = NULL;
     int ret;
@@ -5141,10 +5158,8 @@ static int bdrv_replace_node_common(BlockDriverState *from,
         bdrv_remove_child(bdrv_filter_or_cow_child(to_cow_parent), tran);
     }
 
-    found = g_hash_table_new(NULL, NULL);
-
-    refresh_list = bdrv_topological_dfs(refresh_list, found, to);
-    refresh_list = bdrv_topological_dfs(refresh_list, found, from);
+    refresh_list = g_slist_prepend(refresh_list, to);
+    refresh_list = g_slist_prepend(refresh_list, from);
 
     ret = bdrv_list_refresh_perms(refresh_list, NULL, tran, errp);
     if (ret < 0) {
@@ -5227,7 +5242,6 @@ int bdrv_replace_child_bs(BdrvChild *child, BlockDriverState *new_bs,
 {
     int ret;
     Transaction *tran = tran_new();
-    g_autoptr(GHashTable) found = NULL;
     g_autoptr(GSList) refresh_list = NULL;
     BlockDriverState *old_bs = child->bs;
 
@@ -5239,9 +5253,8 @@ int bdrv_replace_child_bs(BdrvChild *child, BlockDriverState *new_bs,
 
     bdrv_replace_child_tran(child, new_bs, tran);
 
-    found = g_hash_table_new(NULL, NULL);
-    refresh_list = bdrv_topological_dfs(refresh_list, found, old_bs);
-    refresh_list = bdrv_topological_dfs(refresh_list, found, new_bs);
+    refresh_list = g_slist_prepend(refresh_list, old_bs);
+    refresh_list = g_slist_prepend(refresh_list, new_bs);
 
     ret = bdrv_list_refresh_perms(refresh_list, NULL, tran, errp);
 
-- 
2.35.1




* [PATCH v5 20/45] block: make permission update functions public
  2022-03-30 21:28 [PATCH v5 00/45] Transactional block-graph modifying API Vladimir Sementsov-Ogievskiy
                   ` (18 preceding siblings ...)
  2022-03-30 21:28 ` [PATCH v5 19/45] block: refactor bdrv_list_refresh_perms to allow any list of nodes Vladimir Sementsov-Ogievskiy
@ 2022-03-30 21:28 ` Vladimir Sementsov-Ogievskiy
  2022-06-08 11:31   ` Hanna Reitz
  2022-03-30 21:28 ` [PATCH v5 21/45] block: add bdrv_try_set_aio_context_tran transaction action Vladimir Sementsov-Ogievskiy
                   ` (24 subsequent siblings)
  44 siblings, 1 reply; 78+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2022-03-30 21:28 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, hreitz, vsementsov, v.sementsov-og, qemu-devel

We'll need them in later commits in blockdev.c for the new transactional
block-graph modifying API.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
---
 block.c                            | 7 +++----
 include/block/block-global-state.h | 4 ++++
 2 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/block.c b/block.c
index 9009f73534..be19964f89 100644
--- a/block.c
+++ b/block.c
@@ -2519,8 +2519,8 @@ static int bdrv_do_refresh_perms(GSList *list, BlockReopenQueue *q,
  * topologically sorted. It's not a problem if some node occurs in the @list
  * several times.
  */
-static int bdrv_list_refresh_perms(GSList *list, BlockReopenQueue *q,
-                                   Transaction *tran, Error **errp)
+int bdrv_list_refresh_perms(GSList *list, BlockReopenQueue *q,
+                            Transaction *tran, Error **errp)
 {
     g_autoptr(GHashTable) found = g_hash_table_new(NULL, NULL);
     g_autoptr(GSList) refresh_list = NULL;
@@ -2580,8 +2580,7 @@ char *bdrv_perm_names(uint64_t perm)
 
 
 /* @tran is allowed to be NULL. In this case no rollback is possible */
-static int bdrv_refresh_perms(BlockDriverState *bs, Transaction *tran,
-                              Error **errp)
+int bdrv_refresh_perms(BlockDriverState *bs, Transaction *tran, Error **errp)
 {
     int ret;
     Transaction *local_tran = NULL;
diff --git a/include/block/block-global-state.h b/include/block/block-global-state.h
index 600afcf5bd..c307b48b2a 100644
--- a/include/block/block-global-state.h
+++ b/include/block/block-global-state.h
@@ -253,4 +253,8 @@ void bdrv_unregister_buf(BlockDriverState *bs, void *host);
 
 void bdrv_cancel_in_flight(BlockDriverState *bs);
 
+int bdrv_list_refresh_perms(GSList *list, BlockReopenQueue *q,
+                            Transaction *tran, Error **errp);
+int bdrv_refresh_perms(BlockDriverState *bs, Transaction *tran, Error **errp);
+
 #endif /* BLOCK_GLOBAL_STATE_H */
-- 
2.35.1




* [PATCH v5 21/45] block: add bdrv_try_set_aio_context_tran transaction action
  2022-03-30 21:28 [PATCH v5 00/45] Transactional block-graph modifying API Vladimir Sementsov-Ogievskiy
                   ` (19 preceding siblings ...)
  2022-03-30 21:28 ` [PATCH v5 20/45] block: make permission update functions public Vladimir Sementsov-Ogievskiy
@ 2022-03-30 21:28 ` Vladimir Sementsov-Ogievskiy
  2022-06-08 11:49   ` Hanna Reitz
  2022-06-13  7:46   ` Hanna Reitz
  2022-03-30 21:28 ` [PATCH v5 22/45] block: implemet bdrv_unref_tran() Vladimir Sementsov-Ogievskiy
                   ` (23 subsequent siblings)
  44 siblings, 2 replies; 78+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2022-03-30 21:28 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, hreitz, vsementsov, v.sementsov-og, qemu-devel

To be used in a further commit.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
---
 block.c | 48 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 48 insertions(+)

diff --git a/block.c b/block.c
index be19964f89..1900cdf277 100644
--- a/block.c
+++ b/block.c
@@ -2907,6 +2907,54 @@ static void bdrv_child_free(BdrvChild *child)
     g_free(child);
 }
 
+typedef struct BdrvTrySetAioContextState {
+    BlockDriverState *bs;
+    AioContext *old_ctx;
+} BdrvTrySetAioContextState;
+
+static void bdrv_try_set_aio_context_abort(void *opaque)
+{
+    BdrvTrySetAioContextState *s = opaque;
+
+    if (bdrv_get_aio_context(s->bs) != s->old_ctx) {
+        bdrv_try_set_aio_context(s->bs, s->old_ctx, &error_abort);
+    }
+}
+
+static TransactionActionDrv bdrv_try_set_aio_context_drv = {
+    .abort = bdrv_try_set_aio_context_abort,
+    .clean = g_free,
+};
+
+__attribute__((unused))
+static int bdrv_try_set_aio_context_tran(BlockDriverState *bs,
+                                         AioContext *new_ctx,
+                                         Transaction *tran,
+                                         Error **errp)
+{
+    AioContext *old_ctx = bdrv_get_aio_context(bs);
+    BdrvTrySetAioContextState *s;
+    int ret;
+
+    if (old_ctx == new_ctx) {
+        return 0;
+    }
+
+    ret = bdrv_try_set_aio_context(bs, new_ctx, errp);
+    if (ret < 0) {
+        return ret;
+    }
+
+    s = g_new(BdrvTrySetAioContextState, 1);
+    *s = (BdrvTrySetAioContextState) {
+        .bs = bs,
+        .old_ctx = old_ctx,
+    };
+    tran_add(tran, &bdrv_try_set_aio_context_drv, s);
+
+    return 0;
+}
+
 typedef struct BdrvAttachChildCommonState {
     BdrvChild *child;
     AioContext *old_parent_ctx;
-- 
2.35.1




* [PATCH v5 22/45] block: implemet bdrv_unref_tran()
  2022-03-30 21:28 [PATCH v5 00/45] Transactional block-graph modifying API Vladimir Sementsov-Ogievskiy
                   ` (20 preceding siblings ...)
  2022-03-30 21:28 ` [PATCH v5 21/45] block: add bdrv_try_set_aio_context_tran transaction action Vladimir Sementsov-Ogievskiy
@ 2022-03-30 21:28 ` Vladimir Sementsov-Ogievskiy
  2022-06-13  9:07   ` Hanna Reitz
  2022-03-30 21:28 ` [PATCH v5 23/45] blockdev: refactor transaction to use Transaction API Vladimir Sementsov-Ogievskiy
                   ` (22 subsequent siblings)
  44 siblings, 1 reply; 78+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2022-03-30 21:28 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, hreitz, vsementsov, v.sementsov-og, qemu-devel

How are nodes removed during block-graph update transactions now? Look
at bdrv_replace_child_tran: bdrv_unref() is simply postponed to the commit
phase.

What is the problem with it?

We want to make copy-before-write permissions strict: it should always
unshare write, not only when it has at least one parent. But if so, we
can neither insert the filter nor remove it:

To insert the filter, we should first do blockdev-add, and the filter will
unshare write on the child, so blockdev-add will fail if the disk is in
use by the guest.

To remove the filter, we should first do a replace operation, which
again leads to a situation where the filter and the old parent share one
child, and the old parent wants write permission while the filter
unshares it.

The solution is to first do both graph-modifying operations (add &
replace, or replace & remove) and only then update permissions. But
that is not possible with the current method of transactionally removing
a block node: if we just postpone bdrv_unref() to the commit phase, then
in the prepare phase the node is not removed, and it still keeps all
permissions on its children.

What to do? In general, I don't know. But it's possible to solve the
problem for the block drivers that don't need access to their
children in .bdrv_close(). For such drivers we can detach their
children in the prepare stage (still postponing the bdrv_close() call to
the commit phase). For this to work we of course should effectively reduce
bs->refcnt in the prepare phase as well.

So, the logic of the new bdrv_unref_tran() is:

prepare:
  decrease refcnt and detach children if possible (and if refcnt is 0)

commit:
  do bdrv_delete() if refcnt is 0

abort:
  restore children and refcnt
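
The three phases above can be illustrated with a minimal, self-contained
sketch. This is a toy model and not the QEMU implementation: ToyNode,
ToyTran and the toy_* functions are hypothetical names invented for
illustration, and detaching/restoring children is left out so that only
the refcnt handling is visible.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a block node: only a refcount and a "deleted" flag. */
typedef struct ToyNode {
    int refcnt;
    bool deleted;
} ToyNode;

#define TOY_MAX_ACTIONS 16

/* Toy transaction: remembers which nodes were unreferenced in prepare. */
typedef struct ToyTran {
    ToyNode *unref[TOY_MAX_ACTIONS];
    size_t n;
} ToyTran;

/* prepare: drop the reference immediately, but do not delete anything yet */
void toy_unref_tran(ToyNode *node, ToyTran *tran)
{
    assert(node->refcnt > 0);
    assert(tran->n < TOY_MAX_ACTIONS);
    node->refcnt--;
    tran->unref[tran->n++] = node;
}

/* commit: delete every recorded node whose refcnt ended up at 0 */
void toy_tran_commit(ToyTran *tran)
{
    for (size_t i = 0; i < tran->n; i++) {
        ToyNode *node = tran->unref[i];
        if (node->refcnt == 0 && !node->deleted) {
            node->deleted = true;
        }
    }
    tran->n = 0;
}

/* abort: give the references back, newest first */
void toy_tran_abort(ToyTran *tran)
{
    while (tran->n > 0) {
        tran->unref[--tran->n]->refcnt++;
    }
}
```

In this sketch an aborted transaction leaves the node exactly as it was,
while a committed one deletes it only if nobody took a new reference in
between.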

What's the difficulty with it? If we want to remove nodes transactionally
(and with no permission change), we should understand that some
nodes may be removed recursively, and finally we get several possibly
not-deleted leaves, where permissions should be updated. How will the
caller know what to update? That leads to an additional transaction-wide
refresh_list variable, which is filled by various graph-modifying
functions. So, the user should declare a refresh_list variable and do one
or several block-graph modifying operations (that may remove some
nodes), then call bdrv_list_refresh_perms() on the resulting
refresh_list.
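
The calling convention described above can be sketched as a toy model.
Everything here is hypothetical (ToyBs, ToyRefreshList and the toy_*
functions are not QEMU names); the replace step only mirrors the rule
visible in the diff below, where the new node is always recorded and the
old node is recorded only if it is still referenced.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy node: a refcount plus a counter of permission refreshes. */
typedef struct ToyBs {
    const char *name;
    int refcnt;
    int refreshed;
} ToyBs;

#define TOY_LIST_MAX 32

/* Transaction-wide list of nodes whose permissions need a refresh. */
typedef struct ToyRefreshList {
    ToyBs *nodes[TOY_LIST_MAX];
    size_t n;
} ToyRefreshList;

void toy_list_add(ToyRefreshList *list, ToyBs *bs)
{
    assert(list->n < TOY_LIST_MAX);
    list->nodes[list->n++] = bs;
}

/* Graph-modifying step: no permission update, only bookkeeping. */
void toy_replace_child(ToyBs **child_slot, ToyBs *new_bs,
                       ToyRefreshList *list)
{
    ToyBs *old_bs = *child_slot;

    if (new_bs) {
        new_bs->refcnt++;
        toy_list_add(list, new_bs);
    }
    *child_slot = new_bs;
    if (old_bs) {
        old_bs->refcnt--;
        if (old_bs->refcnt > 0) {
            toy_list_add(list, old_bs);
        }
    }
}

/* Refresh each listed node once; duplicates in the list are tolerated. */
int toy_list_refresh_perms(ToyRefreshList *list)
{
    int count = 0;

    for (size_t i = 0; i < list->n; i++) {
        bool seen = false;
        for (size_t j = 0; j < i; j++) {
            if (list->nodes[j] == list->nodes[i]) {
                seen = true;
            }
        }
        if (!seen) {
            list->nodes[i]->refreshed++;
            count++;
        }
    }
    list->n = 0;
    return count;
}
```

The point of the pattern is that several toy_replace_child() calls can
run back to back and the permission refresh happens once, at the end,
over whatever the list has accumulated.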

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
---
 block.c                            | 225 +++++++++++++++++++----------
 include/block/block-common.h       |   2 -
 include/block/block-global-state.h |   2 +
 include/block/block_int-common.h   |   7 +
 4 files changed, 156 insertions(+), 80 deletions(-)

diff --git a/block.c b/block.c
index 1900cdf277..a7020d3cd8 100644
--- a/block.c
+++ b/block.c
@@ -92,10 +92,12 @@ static bool bdrv_recurse_has_child(BlockDriverState *bs,
 
 static void bdrv_replace_child_noperm(BdrvChild *child,
                                       BlockDriverState *new_bs);
-static void bdrv_remove_child(BdrvChild *child, Transaction *tran);
+static void bdrv_remove_child(BdrvChild *child, GSList **refresh_list,
+                              Transaction *tran);
 
 static int bdrv_reopen_prepare(BDRVReopenState *reopen_state,
                                BlockReopenQueue *queue,
+                               GSList **refresh_list,
                                Transaction *change_child_tran, Error **errp);
 static void bdrv_reopen_commit(BDRVReopenState *reopen_state);
 static void bdrv_reopen_abort(BDRVReopenState *reopen_state);
@@ -2363,40 +2365,24 @@ typedef struct BdrvReplaceChildState {
     BlockDriverState *old_bs;
 } BdrvReplaceChildState;
 
-static void bdrv_replace_child_commit(void *opaque)
-{
-    BdrvReplaceChildState *s = opaque;
-    GLOBAL_STATE_CODE();
-
-    bdrv_unref(s->old_bs);
-}
-
 static void bdrv_replace_child_abort(void *opaque)
 {
     BdrvReplaceChildState *s = opaque;
     BlockDriverState *new_bs = s->child->bs;
 
     GLOBAL_STATE_CODE();
-    /* old_bs reference is transparently moved from @s to @s->child */
     bdrv_replace_child_noperm(s->child, s->old_bs);
     bdrv_unref(new_bs);
 }
 
 static TransactionActionDrv bdrv_replace_child_drv = {
-    .commit = bdrv_replace_child_commit,
     .abort = bdrv_replace_child_abort,
     .clean = g_free,
 };
 
-/*
- * bdrv_replace_child_tran
- *
- * Note: real unref of old_bs is done only on commit.
- *
- * The function doesn't update permissions, caller is responsible for this.
- */
+/* Caller is responsible to refresh permissions in @refresh_list */
 static void bdrv_replace_child_tran(BdrvChild *child, BlockDriverState *new_bs,
-                                    Transaction *tran)
+                                    GSList **refresh_list, Transaction *tran)
 {
     BdrvReplaceChildState *s = g_new(BdrvReplaceChildState, 1);
     *s = (BdrvReplaceChildState) {
@@ -2407,9 +2393,15 @@ static void bdrv_replace_child_tran(BdrvChild *child, BlockDriverState *new_bs,
 
     if (new_bs) {
         bdrv_ref(new_bs);
+        *refresh_list = g_slist_prepend(*refresh_list, new_bs);
     }
     bdrv_replace_child_noperm(child, new_bs);
-    /* old_bs reference is transparently moved from @child to @s */
+    if (s->old_bs) {
+        bdrv_unref_tran(s->old_bs, refresh_list, tran);
+        if (s->old_bs->refcnt > 0) {
+            *refresh_list = g_slist_prepend(*refresh_list, s->old_bs);
+        }
+    }
 }
 
 /*
@@ -2926,7 +2918,6 @@ static TransactionActionDrv bdrv_try_set_aio_context_drv = {
     .clean = g_free,
 };
 
-__attribute__((unused))
 static int bdrv_try_set_aio_context_tran(BlockDriverState *bs,
                                          AioContext *new_ctx,
                                          Transaction *tran,
@@ -3207,31 +3198,41 @@ out:
     return ret < 0 ? NULL : child;
 }
 
-/* Callers must ensure that child->frozen is false. */
-void bdrv_root_unref_child(BdrvChild *child)
+/* Caller is responsible to refresh permissions in @refresh_list */
+static void bdrv_root_unref_child_tran(BdrvChild *child, GSList **refresh_list,
+                                       Transaction *tran)
 {
     BlockDriverState *child_bs = child->bs;
 
     GLOBAL_STATE_CODE();
-    bdrv_replace_child_noperm(child, NULL);
-    bdrv_child_free(child);
-
-    if (child_bs) {
-        /*
-         * Update permissions for old node. We're just taking a parent away, so
-         * we're loosening restrictions. Errors of permission update are not
-         * fatal in this case, ignore them.
-         */
-        bdrv_refresh_perms(child_bs, NULL, NULL);
+    bdrv_remove_child(child, refresh_list, tran);
 
+    if (child_bs && child_bs->refcnt > 0) {
         /*
          * When the parent requiring a non-default AioContext is removed, the
          * node moves back to the main AioContext
          */
-        bdrv_try_set_aio_context(child_bs, qemu_get_aio_context(), NULL);
+        bdrv_try_set_aio_context_tran(child_bs, qemu_get_aio_context(),
+                                      tran, NULL);
     }
+}
 
-    bdrv_unref(child_bs);
+/* Callers must ensure that child->frozen is false. */
+void bdrv_root_unref_child(BdrvChild *child)
+{
+    Transaction *tran = tran_new();
+    g_autoptr(GSList) refresh_list = NULL;
+
+    bdrv_root_unref_child_tran(child, &refresh_list, tran);
+
+    /*
+     * Update permissions for old node. We're just taking a parent away, so
+     * we're loosening restrictions. Errors of permission update are not
+     * fatal in this case, ignore them.
+     */
+    bdrv_list_refresh_perms(refresh_list, NULL, tran, NULL);
+
+    tran_commit(tran);
 }
 
 typedef struct BdrvSetInheritsFrom {
@@ -3300,16 +3301,28 @@ static void bdrv_unset_inherits_from(BlockDriverState *root, BdrvChild *child,
     }
 }
 
-/* Callers must ensure that child->frozen is false. */
-void bdrv_unref_child(BlockDriverState *parent, BdrvChild *child)
+/* Caller is responsible to refresh permissions in @refresh_list */
+static void bdrv_unref_child_tran(BlockDriverState *parent, BdrvChild *child,
+                                    GSList **refresh_list, Transaction *tran)
 {
     GLOBAL_STATE_CODE();
     if (child == NULL) {
         return;
     }
 
-    bdrv_unset_inherits_from(parent, child, NULL);
-    bdrv_root_unref_child(child);
+    bdrv_unset_inherits_from(parent, child, tran);
+    bdrv_root_unref_child_tran(child, refresh_list, tran);
+}
+
+/* Callers must ensure that child->frozen is false. */
+void bdrv_unref_child(BlockDriverState *parent, BdrvChild *child)
+{
+    Transaction *tran = tran_new();
+    g_autoptr(GSList) refresh_list = NULL;
+
+    bdrv_unref_child_tran(parent, child, &refresh_list, tran);
+    bdrv_list_refresh_perms(refresh_list, NULL, tran, NULL);
+    tran_commit(tran);
 }
 
 
@@ -3354,11 +3367,12 @@ static BdrvChildRole bdrv_backing_role(BlockDriverState *bs)
  * Sets the bs->backing or bs->file link of a BDS. A new reference is created;
  * callers which don't need their own reference any more must call bdrv_unref().
  *
- * Function doesn't update permissions, caller is responsible for this.
+ * Caller is responsible to refresh permissions in @refresh_list.
  */
 static int bdrv_set_file_or_backing_noperm(BlockDriverState *parent_bs,
                                            BlockDriverState *child_bs,
                                            bool is_backing,
+                                           GSList **refresh_list,
                                            Transaction *tran, Error **errp)
 {
     int ret = 0;
@@ -3412,13 +3426,15 @@ static int bdrv_set_file_or_backing_noperm(BlockDriverState *parent_bs,
 
     if (child) {
         bdrv_unset_inherits_from(parent_bs, child, tran);
-        bdrv_remove_child(child, tran);
+        bdrv_remove_child(child, refresh_list, tran);
     }
 
     if (!child_bs) {
         goto out;
     }
 
+    *refresh_list = g_slist_prepend(*refresh_list, parent_bs);
+
     ret = bdrv_attach_child_noperm(parent_bs, child_bs,
                                    is_backing ? "backing" : "file",
                                    &child_of_bds, role,
@@ -3442,12 +3458,15 @@ out:
     return 0;
 }
 
+/* Caller is responsible to refresh permissions in @refresh_list */
 static int bdrv_set_backing_noperm(BlockDriverState *bs,
                                    BlockDriverState *backing_hd,
+                                   GSList **refresh_list,
                                    Transaction *tran, Error **errp)
 {
     GLOBAL_STATE_CODE();
-    return bdrv_set_file_or_backing_noperm(bs, backing_hd, true, tran, errp);
+    return bdrv_set_file_or_backing_noperm(bs, backing_hd, true, refresh_list,
+                                           tran, errp);
 }
 
 int bdrv_set_backing_hd(BlockDriverState *bs, BlockDriverState *backing_hd,
@@ -3455,16 +3474,17 @@ int bdrv_set_backing_hd(BlockDriverState *bs, BlockDriverState *backing_hd,
 {
     int ret;
     Transaction *tran = tran_new();
+    g_autoptr(GSList) refresh_list = NULL;
 
     GLOBAL_STATE_CODE();
     bdrv_drained_begin(bs);
 
-    ret = bdrv_set_backing_noperm(bs, backing_hd, tran, errp);
+    ret = bdrv_set_backing_noperm(bs, backing_hd, &refresh_list, tran, errp);
     if (ret < 0) {
         goto out;
     }
 
-    ret = bdrv_refresh_perms(bs, tran, errp);
+    ret = bdrv_list_refresh_perms(refresh_list, NULL, tran, errp);
 out:
     tran_finalize(tran, ret);
 
@@ -4429,7 +4449,8 @@ int bdrv_reopen_multiple(BlockReopenQueue *bs_queue, Error **errp)
         assert(bs_entry->state.bs->quiesce_counter > 0);
         ctx = bdrv_get_aio_context(bs_entry->state.bs);
         aio_context_acquire(ctx);
-        ret = bdrv_reopen_prepare(&bs_entry->state, bs_queue, tran, errp);
+        ret = bdrv_reopen_prepare(&bs_entry->state, bs_queue, &refresh_list,
+                                  tran, errp);
         aio_context_release(ctx);
         if (ret < 0) {
             goto abort;
@@ -4441,14 +4462,7 @@ int bdrv_reopen_multiple(BlockReopenQueue *bs_queue, Error **errp)
         BDRVReopenState *state = &bs_entry->state;
 
         refresh_list = g_slist_prepend(refresh_list, state->bs);
-        if (state->old_backing_bs) {
-            refresh_list = g_slist_prepend(refresh_list, state->old_backing_bs);
-        }
-        if (state->old_file_bs) {
-            refresh_list = g_slist_prepend(refresh_list, state->old_file_bs);
-        }
     }
-
     /*
      * Note that file-posix driver rely on permission update done during reopen
      * (even if no permission changed), because it wants "new" permissions for
@@ -4561,10 +4575,14 @@ int bdrv_reopen_set_read_only(BlockDriverState *bs, bool read_only,
  * true and reopen_state->new_backing_bs contains a pointer to the new
  * backing BlockDriverState (or NULL).
  *
+ * Caller is responsible to refresh permissions in @refresh_list.
+ *
  * Return 0 on success, otherwise return < 0 and set @errp.
  */
 static int bdrv_reopen_parse_file_or_backing(BDRVReopenState *reopen_state,
-                                             bool is_backing, Transaction *tran,
+                                             bool is_backing,
+                                             GSList **refresh_list,
+                                             Transaction *tran,
                                              Error **errp)
 {
     BlockDriverState *bs = reopen_state->bs;
@@ -4632,14 +4650,8 @@ static int bdrv_reopen_parse_file_or_backing(BDRVReopenState *reopen_state,
         return -EINVAL;
     }
 
-    if (is_backing) {
-        reopen_state->old_backing_bs = old_child_bs;
-    } else {
-        reopen_state->old_file_bs = old_child_bs;
-    }
-
     return bdrv_set_file_or_backing_noperm(bs, new_child_bs, is_backing,
-                                           tran, errp);
+                                           refresh_list, tran, errp);
 }
 
 /*
@@ -4651,6 +4663,8 @@ static int bdrv_reopen_parse_file_or_backing(BDRVReopenState *reopen_state,
  * flags are the new open flags
  * queue is the reopen queue
  *
+ * Caller is responsible to refresh permissions in @refresh_list.
+ *
  * Returns 0 on success, non-zero on error.  On error errp will be set
  * as well.
  *
@@ -4661,6 +4675,7 @@ static int bdrv_reopen_parse_file_or_backing(BDRVReopenState *reopen_state,
  */
 static int bdrv_reopen_prepare(BDRVReopenState *reopen_state,
                                BlockReopenQueue *queue,
+                               GSList **refresh_list,
                                Transaction *change_child_tran, Error **errp)
 {
     int ret = -1;
@@ -4782,7 +4797,7 @@ static int bdrv_reopen_prepare(BDRVReopenState *reopen_state,
      * either a reference to an existing node (using its node name)
      * or NULL to simply detach the current backing file.
      */
-    ret = bdrv_reopen_parse_file_or_backing(reopen_state, true,
+    ret = bdrv_reopen_parse_file_or_backing(reopen_state, true, refresh_list,
                                             change_child_tran, errp);
     if (ret < 0) {
         goto error;
@@ -4790,7 +4805,7 @@ static int bdrv_reopen_prepare(BDRVReopenState *reopen_state,
     qdict_del(reopen_state->options, "backing");
 
     /* Allow changing the 'file' option. In this case NULL is not allowed */
-    ret = bdrv_reopen_parse_file_or_backing(reopen_state, false,
+    ret = bdrv_reopen_parse_file_or_backing(reopen_state, false, refresh_list,
                                             change_child_tran, errp);
     if (ret < 0) {
         goto error;
@@ -5104,24 +5119,28 @@ static TransactionActionDrv bdrv_remove_child_drv = {
 
 /*
  * A function to remove backing or file child of @bs.
- * Function doesn't update permissions, caller is responsible for this.
+ * Caller is responsible to refresh permissions in @refresh_list.
  */
-static void bdrv_remove_child(BdrvChild *child, Transaction *tran)
+static void bdrv_remove_child(BdrvChild *child, GSList **refresh_list,
+                              Transaction *tran)
 {
     if (!child) {
         return;
     }
 
     if (child->bs) {
-        bdrv_replace_child_tran(child, NULL, tran);
+        bdrv_replace_child_tran(child, NULL, refresh_list, tran);
     }
 
     tran_add(tran, &bdrv_remove_child_drv, child);
 }
 
+/* Caller is responsible to refresh permissions in @refresh_list */
 static int bdrv_replace_node_noperm(BlockDriverState *from,
                                     BlockDriverState *to,
-                                    bool auto_skip, Transaction *tran,
+                                    bool auto_skip,
+                                    GSList **refresh_list,
+                                    Transaction *tran,
                                     Error **errp)
 {
     BdrvChild *c, *next;
@@ -5143,7 +5162,7 @@ static int bdrv_replace_node_noperm(BlockDriverState *from,
                        c->name, from->node_name);
             return -EPERM;
         }
-        bdrv_replace_child_tran(c, to, tran);
+        bdrv_replace_child_tran(c, to, refresh_list, tran);
     }
 
     return 0;
@@ -5196,18 +5215,17 @@ static int bdrv_replace_node_common(BlockDriverState *from,
      * permissions based on new graph. If we fail, we'll roll-back the
      * replacement.
      */
-    ret = bdrv_replace_node_noperm(from, to, auto_skip, tran, errp);
+    ret = bdrv_replace_node_noperm(from, to, auto_skip, &refresh_list, tran,
+                                   errp);
     if (ret < 0) {
         goto out;
     }
 
     if (detach_subchain) {
-        bdrv_remove_child(bdrv_filter_or_cow_child(to_cow_parent), tran);
+        bdrv_remove_child(bdrv_filter_or_cow_child(to_cow_parent),
+                          &refresh_list, tran);
     }
 
-    refresh_list = g_slist_prepend(refresh_list, to);
-    refresh_list = g_slist_prepend(refresh_list, from);
-
     ret = bdrv_list_refresh_perms(refresh_list, NULL, tran, errp);
     if (ret < 0) {
         goto out;
@@ -5257,6 +5275,7 @@ int bdrv_append(BlockDriverState *bs_new, BlockDriverState *bs_top,
 {
     int ret;
     Transaction *tran = tran_new();
+    g_autoptr(GSList) refresh_list = NULL;
 
     GLOBAL_STATE_CODE();
 
@@ -5269,12 +5288,13 @@ int bdrv_append(BlockDriverState *bs_new, BlockDriverState *bs_top,
         goto out;
     }
 
-    ret = bdrv_replace_node_noperm(bs_top, bs_new, true, tran, errp);
+    ret = bdrv_replace_node_noperm(bs_top, bs_new, true, &refresh_list, tran,
+                                   errp);
     if (ret < 0) {
         goto out;
     }
 
-    ret = bdrv_refresh_perms(bs_new, tran, errp);
+    ret = bdrv_list_refresh_perms(refresh_list, NULL, tran, errp);
 out:
     tran_finalize(tran, ret);
 
@@ -5298,10 +5318,7 @@ int bdrv_replace_child_bs(BdrvChild *child, BlockDriverState *new_bs,
     bdrv_drained_begin(old_bs);
     bdrv_drained_begin(new_bs);
 
-    bdrv_replace_child_tran(child, new_bs, tran);
-
-    refresh_list = g_slist_prepend(refresh_list, old_bs);
-    refresh_list = g_slist_prepend(refresh_list, new_bs);
+    bdrv_replace_child_tran(child, new_bs, &refresh_list, tran);
 
     ret = bdrv_list_refresh_perms(refresh_list, NULL, tran, errp);
 
@@ -6830,6 +6847,58 @@ void bdrv_ref(BlockDriverState *bs)
     bs->refcnt++;
 }
 
+static void bdrv_unref_commit(void *opaque)
+{
+    BlockDriverState *bs = opaque;
+
+    if (bs->refcnt == 0) {
+        bdrv_delete(bs);
+    }
+}
+
+static void bdrv_unref_abort(void *opaque)
+{
+    bdrv_ref(opaque);
+}
+
+static TransactionActionDrv bdrv_unref_drv = {
+    .commit = bdrv_unref_commit,
+    .abort = bdrv_unref_abort,
+};
+
+/*
+ * Transactional unref
+ *   - deletion is postponed to transaction commit
+ *   - where possible children are detached now, and permissions are not
+ *     updated. @refresh_list is filled with nodes, to call
+ *     bdrv_list_refresh_perms() on.
+ */
+void bdrv_unref_tran(BlockDriverState *bs, GSList **refresh_list,
+                     Transaction *tran)
+{
+    BdrvChild *child, *next;
+
+    if (!bs) {
+        return;
+    }
+
+    assert(bs->refcnt > 0);
+    bs->refcnt--;
+
+    tran_add(tran, &bdrv_unref_drv, bs);
+
+    if (bs->drv && (!bs->drv->bdrv_close || bs->drv->independent_close) &&
+        refresh_list && bs->refcnt == 0)
+    {
+        QLIST_FOREACH_SAFE(child, &bs->children, next, next) {
+            if (child->bs && child->bs->refcnt > 1) {
+                *refresh_list = g_slist_prepend(*refresh_list, child->bs);
+            }
+            bdrv_unref_child_tran(bs, child, refresh_list, tran);
+        }
+    }
+}
+
 /* Release a previously grabbed reference to bs.
  * If after releasing, reference count is zero, the BlockDriverState is
  * deleted. */
diff --git a/include/block/block-common.h b/include/block/block-common.h
index 2687a2519c..2f247dd607 100644
--- a/include/block/block-common.h
+++ b/include/block/block-common.h
@@ -230,8 +230,6 @@ typedef struct BDRVReopenState {
     int flags;
     BlockdevDetectZeroesOptions detect_zeroes;
     bool backing_missing;
-    BlockDriverState *old_backing_bs; /* keep pointer for permissions update */
-    BlockDriverState *old_file_bs; /* keep pointer for permissions update */
     QDict *options;
     QDict *explicit_options;
     void *opaque;
diff --git a/include/block/block-global-state.h b/include/block/block-global-state.h
index c307b48b2a..f3ec72810e 100644
--- a/include/block/block-global-state.h
+++ b/include/block/block-global-state.h
@@ -186,6 +186,8 @@ void bdrv_img_create(const char *filename, const char *fmt,
 
 void bdrv_ref(BlockDriverState *bs);
 void bdrv_unref(BlockDriverState *bs);
+void bdrv_unref_tran(BlockDriverState *bs, GSList **refresh_list,
+                     Transaction *tran);
 void bdrv_unref_child(BlockDriverState *parent, BdrvChild *child);
 BdrvChild *bdrv_attach_child(BlockDriverState *parent_bs,
                              BlockDriverState *child_bs,
diff --git a/include/block/block_int-common.h b/include/block/block_int-common.h
index c4d8b11dbb..6713c58934 100644
--- a/include/block/block_int-common.h
+++ b/include/block/block_int-common.h
@@ -162,6 +162,13 @@ struct BlockDriver {
      */
     bool supports_backing;
 
+    /*
+     * If true that guarantees that .bdrv_close doesn't access any bdrv children
+     * and is safe to be called in commit phase of block-graph modifying
+     * transaction.
+     */
+    bool independent_close;
+
     bool has_variable_length;
 
     /*
-- 
2.35.1




* [PATCH v5 23/45] blockdev: refactor transaction to use Transaction API
  2022-03-30 21:28 [PATCH v5 00/45] Transactional block-graph modifying API Vladimir Sementsov-Ogievskiy
                   ` (21 preceding siblings ...)
  2022-03-30 21:28 ` [PATCH v5 22/45] block: implemet bdrv_unref_tran() Vladimir Sementsov-Ogievskiy
@ 2022-03-30 21:28 ` Vladimir Sementsov-Ogievskiy
  2022-03-30 21:28 ` [PATCH v5 24/45] blockdev: transactions: rename some things Vladimir Sementsov-Ogievskiy
                   ` (21 subsequent siblings)
  44 siblings, 0 replies; 78+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2022-03-30 21:28 UTC (permalink / raw)
  To: qemu-block
  Cc: kwolf, v.sementsov-og, qemu-devel, Markus Armbruster, hreitz, vsementsov

We are going to add more block-graph modifying transaction actions,
and block-graph modifying functions are already based on the Transaction
API.

Next, we'll need to separately update permissions after several
graph-modifying actions, and this would be simple with the help of the
Transaction API.

So, now let's just transform what we have into new-style transaction
actions.
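
As a rough illustration of the new style, here is a self-contained toy
model of an action that registers its own commit/abort/clean callbacks.
It is a sketch under assumptions, not QEMU's Transaction implementation:
ToyActionDrv, ToyTran, toy_tran_add(), toy_tran_finalize() and the
toy_snap_* names are all hypothetical. The action registers itself
before doing any work, as the actions in the diff below do, so that the
clean callback runs even when the action fails halfway.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Per-action callbacks; any of them may be NULL. */
typedef struct ToyActionDrv {
    void (*commit)(void *opaque);
    void (*abort)(void *opaque);
    void (*clean)(void *opaque);
} ToyActionDrv;

typedef struct ToyAct {
    const ToyActionDrv *drv;
    void *opaque;
    struct ToyAct *next;
} ToyAct;

typedef struct ToyTran {
    ToyAct *head;
} ToyTran;

/* Record an action; the newest one is finalized first. */
void toy_tran_add(ToyTran *tran, const ToyActionDrv *drv, void *opaque)
{
    ToyAct *act = malloc(sizeof(*act));
    assert(act);
    act->drv = drv;
    act->opaque = opaque;
    act->next = tran->head;
    tran->head = act;
}

/* Run commit (ok) or abort (!ok) for every action, then clean. */
void toy_tran_finalize(ToyTran *tran, bool ok)
{
    while (tran->head) {
        ToyAct *act = tran->head;
        void (*cb)(void *) = ok ? act->drv->commit : act->drv->abort;

        tran->head = act->next;
        if (cb) {
            cb(act->opaque);
        }
        if (act->drv->clean) {
            act->drv->clean(act->opaque);
        }
        free(act);
    }
}

/* Example action state: "creates a snapshot" by bumping a counter. */
typedef struct ToySnapState {
    int *counter;
    bool created;
} ToySnapState;

void toy_snap_abort(void *opaque)
{
    ToySnapState *s = opaque;
    if (s->created) {
        (*s->counter)--;
    }
}

void toy_snap_clean(void *opaque)
{
    free(opaque);
}

const ToyActionDrv toy_snap_drv = {
    .abort = toy_snap_abort,
    .clean = toy_snap_clean,
};

/* Single entry point replacing separate prepare/commit/abort/clean ops. */
int toy_snap_action(int *counter, bool fail, ToyTran *tran)
{
    ToySnapState *s = calloc(1, sizeof(*s));
    assert(s);
    s->counter = counter;
    toy_tran_add(tran, &toy_snap_drv, s); /* register before any work */

    if (fail) {
        return -1;
    }
    (*counter)++;
    s->created = true;
    return 0;
}
```

With this shape the caller needs only one callback per action kind and
decides between commit and abort once, for the whole transaction.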

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
---
 blockdev.c | 317 +++++++++++++++++++++++++++++++----------------------
 1 file changed, 186 insertions(+), 131 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index e46e831212..a9fb5f66b0 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1200,10 +1200,7 @@ typedef struct BlkActionState BlkActionState;
  */
 typedef struct BlkActionOps {
     size_t instance_size;
-    void (*prepare)(BlkActionState *common, Error **errp);
-    void (*commit)(BlkActionState *common);
-    void (*abort)(BlkActionState *common);
-    void (*clean)(BlkActionState *common);
+    void (*action)(BlkActionState *common, Transaction *tran, Error **errp);
 } BlkActionOps;
 
 /**
@@ -1235,6 +1232,12 @@ typedef struct InternalSnapshotState {
     bool created;
 } InternalSnapshotState;
 
+static void internal_snapshot_abort(void *opaque);
+static void internal_snapshot_clean(void *opaque);
+TransactionActionDrv internal_snapshot_drv = {
+    .abort = internal_snapshot_abort,
+    .clean = internal_snapshot_clean,
+};
 
 static int action_check_completion_mode(BlkActionState *s, Error **errp)
 {
@@ -1249,8 +1252,8 @@ static int action_check_completion_mode(BlkActionState *s, Error **errp)
     return 0;
 }
 
-static void internal_snapshot_prepare(BlkActionState *common,
-                                      Error **errp)
+static void internal_snapshot_action(BlkActionState *common,
+                                     Transaction *tran, Error **errp)
 {
     Error *local_err = NULL;
     const char *device;
@@ -1269,6 +1272,8 @@ static void internal_snapshot_prepare(BlkActionState *common,
     internal = common->action->u.blockdev_snapshot_internal_sync.data;
     state = DO_UPCAST(InternalSnapshotState, common, common);
 
+    tran_add(tran, &internal_snapshot_drv, state);
+
     /* 1. parse input */
     device = internal->device;
     name = internal->name;
@@ -1353,10 +1358,9 @@ out:
     aio_context_release(aio_context);
 }
 
-static void internal_snapshot_abort(BlkActionState *common)
+static void internal_snapshot_abort(void *opaque)
 {
-    InternalSnapshotState *state =
-                             DO_UPCAST(InternalSnapshotState, common, common);
+    InternalSnapshotState *state = opaque;
     BlockDriverState *bs = state->bs;
     QEMUSnapshotInfo *sn = &state->sn;
     AioContext *aio_context;
@@ -1380,10 +1384,9 @@ static void internal_snapshot_abort(BlkActionState *common)
     aio_context_release(aio_context);
 }
 
-static void internal_snapshot_clean(BlkActionState *common)
+static void internal_snapshot_clean(void *opaque)
 {
-    InternalSnapshotState *state = DO_UPCAST(InternalSnapshotState,
-                                             common, common);
+    InternalSnapshotState *state = opaque;
     AioContext *aio_context;
 
     if (!state->bs) {
@@ -1396,6 +1399,8 @@ static void internal_snapshot_clean(BlkActionState *common)
     bdrv_drained_end(state->bs);
 
     aio_context_release(aio_context);
+
+    g_free(state);
 }
 
 /* external snapshot private data */
@@ -1406,8 +1411,17 @@ typedef struct ExternalSnapshotState {
     bool overlay_appended;
 } ExternalSnapshotState;
 
-static void external_snapshot_prepare(BlkActionState *common,
-                                      Error **errp)
+static void external_snapshot_commit(void *opaque);
+static void external_snapshot_abort(void *opaque);
+static void external_snapshot_clean(void *opaque);
+TransactionActionDrv external_snapshot_drv = {
+    .commit = external_snapshot_commit,
+    .abort = external_snapshot_abort,
+    .clean = external_snapshot_clean,
+};
+
+static void external_snapshot_action(BlkActionState *common, Transaction *tran,
+                                     Error **errp)
 {
     int ret;
     int flags = 0;
@@ -1426,6 +1440,8 @@ static void external_snapshot_prepare(BlkActionState *common,
     AioContext *aio_context;
     uint64_t perm, shared;
 
+    tran_add(tran, &external_snapshot_drv, state);
+
     /* 'blockdev-snapshot' and 'blockdev-snapshot-sync' have similar
      * purpose but a different set of parameters */
     switch (action->type) {
@@ -1575,10 +1591,9 @@ out:
     aio_context_release(aio_context);
 }
 
-static void external_snapshot_commit(BlkActionState *common)
+static void external_snapshot_commit(void *opaque)
 {
-    ExternalSnapshotState *state =
-                             DO_UPCAST(ExternalSnapshotState, common, common);
+    ExternalSnapshotState *state = opaque;
     AioContext *aio_context;
 
     aio_context = bdrv_get_aio_context(state->old_bs);
@@ -1594,10 +1609,9 @@ static void external_snapshot_commit(BlkActionState *common)
     aio_context_release(aio_context);
 }
 
-static void external_snapshot_abort(BlkActionState *common)
+static void external_snapshot_abort(void *opaque)
 {
-    ExternalSnapshotState *state =
-                             DO_UPCAST(ExternalSnapshotState, common, common);
+    ExternalSnapshotState *state = opaque;
     if (state->new_bs) {
         if (state->overlay_appended) {
             AioContext *aio_context;
@@ -1637,10 +1651,9 @@ static void external_snapshot_abort(BlkActionState *common)
     }
 }
 
-static void external_snapshot_clean(BlkActionState *common)
+static void external_snapshot_clean(void *opaque)
 {
-    ExternalSnapshotState *state =
-                             DO_UPCAST(ExternalSnapshotState, common, common);
+    ExternalSnapshotState *state = opaque;
     AioContext *aio_context;
 
     if (!state->old_bs) {
@@ -1654,6 +1667,8 @@ static void external_snapshot_clean(BlkActionState *common)
     bdrv_unref(state->new_bs);
 
     aio_context_release(aio_context);
+
+    g_free(state);
 }
 
 typedef struct DriveBackupState {
@@ -1668,7 +1683,17 @@ static BlockJob *do_backup_common(BackupCommon *backup,
                                   AioContext *aio_context,
                                   JobTxn *txn, Error **errp);
 
-static void drive_backup_prepare(BlkActionState *common, Error **errp)
+static void drive_backup_commit(void *opaque);
+static void drive_backup_abort(void *opaque);
+static void drive_backup_clean(void *opaque);
+TransactionActionDrv drive_backup_drv = {
+    .commit = drive_backup_commit,
+    .abort = drive_backup_abort,
+    .clean = drive_backup_clean,
+};
+
+static void drive_backup_action(BlkActionState *common, Transaction *tran,
+                                Error **errp)
 {
     DriveBackupState *state = DO_UPCAST(DriveBackupState, common, common);
     DriveBackup *backup;
@@ -1684,6 +1709,8 @@ static void drive_backup_prepare(BlkActionState *common, Error **errp)
     bool set_backing_hd = false;
     int ret;
 
+    tran_add(tran, &drive_backup_drv, state);
+
     assert(common->action->type == TRANSACTION_ACTION_KIND_DRIVE_BACKUP);
     backup = common->action->u.drive_backup.data;
 
@@ -1814,9 +1841,9 @@ out:
     aio_context_release(aio_context);
 }
 
-static void drive_backup_commit(BlkActionState *common)
+static void drive_backup_commit(void *opaque)
 {
-    DriveBackupState *state = DO_UPCAST(DriveBackupState, common, common);
+    DriveBackupState *state = opaque;
     AioContext *aio_context;
 
     aio_context = bdrv_get_aio_context(state->bs);
@@ -1828,9 +1855,9 @@ static void drive_backup_commit(BlkActionState *common)
     aio_context_release(aio_context);
 }
 
-static void drive_backup_abort(BlkActionState *common)
+static void drive_backup_abort(void *opaque)
 {
-    DriveBackupState *state = DO_UPCAST(DriveBackupState, common, common);
+    DriveBackupState *state = opaque;
 
     if (state->job) {
         AioContext *aio_context;
@@ -1844,9 +1871,9 @@ static void drive_backup_abort(BlkActionState *common)
     }
 }
 
-static void drive_backup_clean(BlkActionState *common)
+static void drive_backup_clean(void *opaque)
 {
-    DriveBackupState *state = DO_UPCAST(DriveBackupState, common, common);
+    DriveBackupState *state = opaque;
     AioContext *aio_context;
 
     if (!state->bs) {
@@ -1859,6 +1886,8 @@ static void drive_backup_clean(BlkActionState *common)
     bdrv_drained_end(state->bs);
 
     aio_context_release(aio_context);
+
+    g_free(state);
 }
 
 typedef struct BlockdevBackupState {
@@ -1867,7 +1896,17 @@ typedef struct BlockdevBackupState {
     BlockJob *job;
 } BlockdevBackupState;
 
-static void blockdev_backup_prepare(BlkActionState *common, Error **errp)
+static void blockdev_backup_commit(void *opaque);
+static void blockdev_backup_abort(void *opaque);
+static void blockdev_backup_clean(void *opaque);
+TransactionActionDrv blockdev_backup_drv = {
+    .commit = blockdev_backup_commit,
+    .abort = blockdev_backup_abort,
+    .clean = blockdev_backup_clean,
+};
+
+static void blockdev_backup_action(BlkActionState *common, Transaction *tran,
+                                   Error **errp)
 {
     BlockdevBackupState *state = DO_UPCAST(BlockdevBackupState, common, common);
     BlockdevBackup *backup;
@@ -1877,6 +1916,8 @@ static void blockdev_backup_prepare(BlkActionState *common, Error **errp)
     AioContext *old_context;
     int ret;
 
+    tran_add(tran, &blockdev_backup_drv, state);
+
     assert(common->action->type == TRANSACTION_ACTION_KIND_BLOCKDEV_BACKUP);
     backup = common->action->u.blockdev_backup.data;
 
@@ -1915,9 +1956,9 @@ static void blockdev_backup_prepare(BlkActionState *common, Error **errp)
     aio_context_release(aio_context);
 }
 
-static void blockdev_backup_commit(BlkActionState *common)
+static void blockdev_backup_commit(void *opaque)
 {
-    BlockdevBackupState *state = DO_UPCAST(BlockdevBackupState, common, common);
+    BlockdevBackupState *state = opaque;
     AioContext *aio_context;
 
     aio_context = bdrv_get_aio_context(state->bs);
@@ -1929,9 +1970,9 @@ static void blockdev_backup_commit(BlkActionState *common)
     aio_context_release(aio_context);
 }
 
-static void blockdev_backup_abort(BlkActionState *common)
+static void blockdev_backup_abort(void *opaque)
 {
-    BlockdevBackupState *state = DO_UPCAST(BlockdevBackupState, common, common);
+    BlockdevBackupState *state = opaque;
 
     if (state->job) {
         AioContext *aio_context;
@@ -1945,9 +1986,9 @@ static void blockdev_backup_abort(BlkActionState *common)
     }
 }
 
-static void blockdev_backup_clean(BlkActionState *common)
+static void blockdev_backup_clean(void *opaque)
 {
-    BlockdevBackupState *state = DO_UPCAST(BlockdevBackupState, common, common);
+    BlockdevBackupState *state = opaque;
     AioContext *aio_context;
 
     if (!state->bs) {
@@ -1960,6 +2001,8 @@ static void blockdev_backup_clean(BlkActionState *common)
     bdrv_drained_end(state->bs);
 
     aio_context_release(aio_context);
+
+    g_free(state);
 }
 
 typedef struct BlockDirtyBitmapState {
@@ -1971,14 +2014,22 @@ typedef struct BlockDirtyBitmapState {
     bool was_enabled;
 } BlockDirtyBitmapState;
 
-static void block_dirty_bitmap_add_prepare(BlkActionState *common,
-                                           Error **errp)
+static void block_dirty_bitmap_add_abort(void *opaque);
+TransactionActionDrv block_dirty_bitmap_add_drv = {
+    .abort = block_dirty_bitmap_add_abort,
+    .clean = g_free,
+};
+
+static void block_dirty_bitmap_add_action(BlkActionState *common,
+                                          Transaction *tran, Error **errp)
 {
     Error *local_err = NULL;
     BlockDirtyBitmapAdd *action;
     BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
                                              common, common);
 
+    tran_add(tran, &block_dirty_bitmap_add_drv, state);
+
     if (action_check_completion_mode(common, errp) < 0) {
         return;
     }
@@ -1998,13 +2049,12 @@ static void block_dirty_bitmap_add_prepare(BlkActionState *common,
     }
 }
 
-static void block_dirty_bitmap_add_abort(BlkActionState *common)
+static void block_dirty_bitmap_add_abort(void *opaque)
 {
     BlockDirtyBitmapAdd *action;
-    BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
-                                             common, common);
+    BlockDirtyBitmapState *state = opaque;
 
-    action = common->action->u.block_dirty_bitmap_add.data;
+    action = state->common.action->u.block_dirty_bitmap_add.data;
     /* Should not be able to fail: IF the bitmap was added via .prepare(),
      * then the node reference and bitmap name must have been valid.
      */
@@ -2013,13 +2063,23 @@ static void block_dirty_bitmap_add_abort(BlkActionState *common)
     }
 }
 
-static void block_dirty_bitmap_clear_prepare(BlkActionState *common,
-                                             Error **errp)
+static void block_dirty_bitmap_restore(void *opaque);
+static void block_dirty_bitmap_free_backup(void *opaque);
+TransactionActionDrv block_dirty_bitmap_clear_drv = {
+    .abort = block_dirty_bitmap_restore,
+    .commit = block_dirty_bitmap_free_backup,
+    .clean = g_free,
+};
+
+static void block_dirty_bitmap_clear_action(BlkActionState *common,
+                                            Transaction *tran, Error **errp)
 {
     BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
                                              common, common);
     BlockDirtyBitmap *action;
 
+    tran_add(tran, &block_dirty_bitmap_clear_drv, state);
+
     if (action_check_completion_mode(common, errp) < 0) {
         return;
     }
@@ -2040,31 +2100,37 @@ static void block_dirty_bitmap_clear_prepare(BlkActionState *common,
     bdrv_clear_dirty_bitmap(state->bitmap, &state->backup);
 }
 
-static void block_dirty_bitmap_restore(BlkActionState *common)
+static void block_dirty_bitmap_restore(void *opaque)
 {
-    BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
-                                             common, common);
+    BlockDirtyBitmapState *state = opaque;
 
     if (state->backup) {
         bdrv_restore_dirty_bitmap(state->bitmap, state->backup);
     }
 }
 
-static void block_dirty_bitmap_free_backup(BlkActionState *common)
+static void block_dirty_bitmap_free_backup(void *opaque)
 {
-    BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
-                                             common, common);
+    BlockDirtyBitmapState *state = opaque;
 
     hbitmap_free(state->backup);
 }
 
-static void block_dirty_bitmap_enable_prepare(BlkActionState *common,
-                                              Error **errp)
+static void block_dirty_bitmap_enable_abort(void *opaque);
+TransactionActionDrv block_dirty_bitmap_enable_drv = {
+    .abort = block_dirty_bitmap_enable_abort,
+    .clean = g_free,
+};
+
+static void block_dirty_bitmap_enable_action(BlkActionState *common,
+                                             Transaction *tran, Error **errp)
 {
     BlockDirtyBitmap *action;
     BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
                                              common, common);
 
+    tran_add(tran, &block_dirty_bitmap_enable_drv, state);
+
     if (action_check_completion_mode(common, errp) < 0) {
         return;
     }
@@ -2086,23 +2152,30 @@ static void block_dirty_bitmap_enable_prepare(BlkActionState *common,
     bdrv_enable_dirty_bitmap(state->bitmap);
 }
 
-static void block_dirty_bitmap_enable_abort(BlkActionState *common)
+static void block_dirty_bitmap_enable_abort(void *opaque)
 {
-    BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
-                                             common, common);
+    BlockDirtyBitmapState *state = opaque;
 
     if (!state->was_enabled) {
         bdrv_disable_dirty_bitmap(state->bitmap);
     }
 }
 
-static void block_dirty_bitmap_disable_prepare(BlkActionState *common,
-                                               Error **errp)
+static void block_dirty_bitmap_disable_abort(void *opaque);
+TransactionActionDrv block_dirty_bitmap_disable_drv = {
+    .abort = block_dirty_bitmap_disable_abort,
+    .clean = g_free,
+};
+
+static void block_dirty_bitmap_disable_action(BlkActionState *common,
+                                              Transaction *tran, Error **errp)
 {
     BlockDirtyBitmap *action;
     BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
                                              common, common);
 
+    tran_add(tran, &block_dirty_bitmap_disable_drv, state);
+
     if (action_check_completion_mode(common, errp) < 0) {
         return;
     }
@@ -2124,23 +2197,30 @@ static void block_dirty_bitmap_disable_prepare(BlkActionState *common,
     bdrv_disable_dirty_bitmap(state->bitmap);
 }
 
-static void block_dirty_bitmap_disable_abort(BlkActionState *common)
+static void block_dirty_bitmap_disable_abort(void *opaque)
 {
-    BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
-                                             common, common);
+    BlockDirtyBitmapState *state = opaque;
 
     if (state->was_enabled) {
         bdrv_enable_dirty_bitmap(state->bitmap);
     }
 }
 
-static void block_dirty_bitmap_merge_prepare(BlkActionState *common,
-                                             Error **errp)
+TransactionActionDrv block_dirty_bitmap_merge_drv = {
+    .commit = block_dirty_bitmap_free_backup,
+    .abort = block_dirty_bitmap_restore,
+    .clean = g_free,
+};
+
+static void block_dirty_bitmap_merge_action(BlkActionState *common,
+                                            Transaction *tran, Error **errp)
 {
     BlockDirtyBitmapMerge *action;
     BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
                                              common, common);
 
+    tran_add(tran, &block_dirty_bitmap_merge_drv, state);
+
     if (action_check_completion_mode(common, errp) < 0) {
         return;
     }
@@ -2152,13 +2232,23 @@ static void block_dirty_bitmap_merge_prepare(BlkActionState *common,
                                              errp);
 }
 
-static void block_dirty_bitmap_remove_prepare(BlkActionState *common,
-                                              Error **errp)
+static void block_dirty_bitmap_remove_commit(void *opaque);
+static void block_dirty_bitmap_remove_abort(void *opaque);
+TransactionActionDrv block_dirty_bitmap_remove_drv = {
+    .commit = block_dirty_bitmap_remove_commit,
+    .abort = block_dirty_bitmap_remove_abort,
+    .clean = g_free,
+};
+
+static void block_dirty_bitmap_remove_action(BlkActionState *common,
+                                             Transaction *tran, Error **errp)
 {
     BlockDirtyBitmap *action;
     BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
                                              common, common);
 
+    tran_add(tran, &block_dirty_bitmap_remove_drv, state);
+
     if (action_check_completion_mode(common, errp) < 0) {
         return;
     }
@@ -2173,10 +2263,9 @@ static void block_dirty_bitmap_remove_prepare(BlkActionState *common,
     }
 }
 
-static void block_dirty_bitmap_remove_abort(BlkActionState *common)
+static void block_dirty_bitmap_remove_abort(void *opaque)
 {
-    BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
-                                             common, common);
+    BlockDirtyBitmapState *state = opaque;
 
     if (state->bitmap) {
         bdrv_dirty_bitmap_skip_store(state->bitmap, false);
@@ -2184,21 +2273,28 @@ static void block_dirty_bitmap_remove_abort(BlkActionState *common)
     }
 }
 
-static void block_dirty_bitmap_remove_commit(BlkActionState *common)
+static void block_dirty_bitmap_remove_commit(void *opaque)
 {
-    BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
-                                             common, common);
+    BlockDirtyBitmapState *state = opaque;
 
     bdrv_dirty_bitmap_set_busy(state->bitmap, false);
     bdrv_release_dirty_bitmap(state->bitmap);
 }
 
-static void abort_prepare(BlkActionState *common, Error **errp)
+static void abort_commit(void *opaque);
+TransactionActionDrv abort_drv = {
+    .commit = abort_commit,
+    .clean = g_free,
+};
+
+static void abort_action(BlkActionState *common, Transaction *tran,
+                         Error **errp)
 {
+    tran_add(tran, &abort_drv, common);
     error_setg(errp, "Transaction aborted using Abort action");
 }
 
-static void abort_commit(BlkActionState *common)
+static void abort_commit(void *opaque)
 {
     g_assert_not_reached(); /* this action never succeeds */
 }
@@ -2206,75 +2302,51 @@ static void abort_commit(BlkActionState *common)
 static const BlkActionOps actions[] = {
     [TRANSACTION_ACTION_KIND_BLOCKDEV_SNAPSHOT] = {
         .instance_size = sizeof(ExternalSnapshotState),
-        .prepare  = external_snapshot_prepare,
-        .commit   = external_snapshot_commit,
-        .abort = external_snapshot_abort,
-        .clean = external_snapshot_clean,
+        .action  = external_snapshot_action,
     },
     [TRANSACTION_ACTION_KIND_BLOCKDEV_SNAPSHOT_SYNC] = {
         .instance_size = sizeof(ExternalSnapshotState),
-        .prepare  = external_snapshot_prepare,
-        .commit   = external_snapshot_commit,
-        .abort = external_snapshot_abort,
-        .clean = external_snapshot_clean,
+        .action  = external_snapshot_action,
     },
     [TRANSACTION_ACTION_KIND_DRIVE_BACKUP] = {
         .instance_size = sizeof(DriveBackupState),
-        .prepare = drive_backup_prepare,
-        .commit = drive_backup_commit,
-        .abort = drive_backup_abort,
-        .clean = drive_backup_clean,
+        .action = drive_backup_action,
     },
     [TRANSACTION_ACTION_KIND_BLOCKDEV_BACKUP] = {
         .instance_size = sizeof(BlockdevBackupState),
-        .prepare = blockdev_backup_prepare,
-        .commit = blockdev_backup_commit,
-        .abort = blockdev_backup_abort,
-        .clean = blockdev_backup_clean,
+        .action = blockdev_backup_action,
     },
     [TRANSACTION_ACTION_KIND_ABORT] = {
         .instance_size = sizeof(BlkActionState),
-        .prepare = abort_prepare,
-        .commit = abort_commit,
+        .action = abort_action,
     },
     [TRANSACTION_ACTION_KIND_BLOCKDEV_SNAPSHOT_INTERNAL_SYNC] = {
         .instance_size = sizeof(InternalSnapshotState),
-        .prepare  = internal_snapshot_prepare,
-        .abort = internal_snapshot_abort,
-        .clean = internal_snapshot_clean,
+        .action  = internal_snapshot_action,
     },
     [TRANSACTION_ACTION_KIND_BLOCK_DIRTY_BITMAP_ADD] = {
         .instance_size = sizeof(BlockDirtyBitmapState),
-        .prepare = block_dirty_bitmap_add_prepare,
-        .abort = block_dirty_bitmap_add_abort,
+        .action = block_dirty_bitmap_add_action,
     },
     [TRANSACTION_ACTION_KIND_BLOCK_DIRTY_BITMAP_CLEAR] = {
         .instance_size = sizeof(BlockDirtyBitmapState),
-        .prepare = block_dirty_bitmap_clear_prepare,
-        .commit = block_dirty_bitmap_free_backup,
-        .abort = block_dirty_bitmap_restore,
+        .action = block_dirty_bitmap_clear_action,
     },
     [TRANSACTION_ACTION_KIND_BLOCK_DIRTY_BITMAP_ENABLE] = {
         .instance_size = sizeof(BlockDirtyBitmapState),
-        .prepare = block_dirty_bitmap_enable_prepare,
-        .abort = block_dirty_bitmap_enable_abort,
+        .action = block_dirty_bitmap_enable_action,
     },
     [TRANSACTION_ACTION_KIND_BLOCK_DIRTY_BITMAP_DISABLE] = {
         .instance_size = sizeof(BlockDirtyBitmapState),
-        .prepare = block_dirty_bitmap_disable_prepare,
-        .abort = block_dirty_bitmap_disable_abort,
+        .action = block_dirty_bitmap_disable_action,
     },
     [TRANSACTION_ACTION_KIND_BLOCK_DIRTY_BITMAP_MERGE] = {
         .instance_size = sizeof(BlockDirtyBitmapState),
-        .prepare = block_dirty_bitmap_merge_prepare,
-        .commit = block_dirty_bitmap_free_backup,
-        .abort = block_dirty_bitmap_restore,
+        .action = block_dirty_bitmap_merge_action,
     },
     [TRANSACTION_ACTION_KIND_BLOCK_DIRTY_BITMAP_REMOVE] = {
         .instance_size = sizeof(BlockDirtyBitmapState),
-        .prepare = block_dirty_bitmap_remove_prepare,
-        .commit = block_dirty_bitmap_remove_commit,
-        .abort = block_dirty_bitmap_remove_abort,
+        .action = block_dirty_bitmap_remove_action,
     },
     /* Where are transactions for MIRROR, COMMIT and STREAM?
      * Although these blockjobs use transaction callbacks like the backup job,
@@ -2316,14 +2388,11 @@ void qmp_transaction(TransactionActionList *dev_list,
 {
     TransactionActionList *dev_entry = dev_list;
     JobTxn *block_job_txn = NULL;
-    BlkActionState *state, *next;
     Error *local_err = NULL;
+    Transaction *tran = tran_new();
 
     GLOBAL_STATE_CODE();
 
-    QTAILQ_HEAD(, BlkActionState) snap_bdrv_states;
-    QTAILQ_INIT(&snap_bdrv_states);
-
     /* Does this transaction get canceled as a group on failure?
      * If not, we don't really need to make a JobTxn.
      */
@@ -2339,6 +2408,7 @@ void qmp_transaction(TransactionActionList *dev_list,
     while (NULL != dev_entry) {
         TransactionAction *dev_info = NULL;
         const BlkActionOps *ops;
+        BlkActionState *state;
 
         dev_info = dev_entry->value;
         dev_entry = dev_entry->next;
@@ -2353,38 +2423,23 @@ void qmp_transaction(TransactionActionList *dev_list,
         state->action = dev_info;
         state->block_job_txn = block_job_txn;
         state->txn_props = props;
-        QTAILQ_INSERT_TAIL(&snap_bdrv_states, state, entry);
 
-        state->ops->prepare(state, &local_err);
+        state->ops->action(state, tran, &local_err);
         if (local_err) {
             error_propagate(errp, local_err);
             goto delete_and_fail;
         }
     }
 
-    QTAILQ_FOREACH(state, &snap_bdrv_states, entry) {
-        if (state->ops->commit) {
-            state->ops->commit(state);
-        }
-    }
+    tran_commit(tran);
 
     /* success */
     goto exit;
 
 delete_and_fail:
     /* failure, and it is all-or-none; roll back all operations */
-    QTAILQ_FOREACH_REVERSE(state, &snap_bdrv_states, entry) {
-        if (state->ops->abort) {
-            state->ops->abort(state);
-        }
-    }
+    tran_abort(tran);
 exit:
-    QTAILQ_FOREACH_SAFE(state, &snap_bdrv_states, entry, next) {
-        if (state->ops->clean) {
-            state->ops->clean(state);
-        }
-        g_free(state);
-    }
     if (!has_props) {
         qapi_free_TransactionProperties(props);
     }
-- 
2.35.1




* [PATCH v5 24/45] blockdev: transactions: rename some things
  2022-03-30 21:28 [PATCH v5 00/45] Transactional block-graph modifying API Vladimir Sementsov-Ogievskiy
                   ` (22 preceding siblings ...)
  2022-03-30 21:28 ` [PATCH v5 23/45] blockdev: refactor transaction to use Transaction API Vladimir Sementsov-Ogievskiy
@ 2022-03-30 21:28 ` Vladimir Sementsov-Ogievskiy
  2022-03-30 21:28 ` [PATCH v5 25/45] blockdev: qmp_transaction: refactor loop to classic for Vladimir Sementsov-Ogievskiy
                   ` (20 subsequent siblings)
  44 siblings, 0 replies; 78+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2022-03-30 21:28 UTC (permalink / raw)
  To: qemu-block
  Cc: kwolf, v.sementsov-og, qemu-devel, Markus Armbruster, hreitz, vsementsov

Look at qmp_transaction(): dev_list is not an obvious name for a list of
actions. In the QAPI schema this argument is called "actions". Let's
follow the common practice of using the same argument names in the QAPI
schema and in the code.

For the same reason, rename props to properties.

Next, we have to rename the global map of actions, so that it does not
conflict with the new name of the function argument.

Also rename the dev_entry loop variable to match the new name of the
list.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
---
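A note on why the global table has to be renamed: in C a function
parameter hides a file-scope object of the same name, so with an
argument called "actions" any use of actions[type] or
ARRAY_SIZE(actions) inside the function would refer to the list
pointer rather than to the dispatch table. Below is a minimal
standalone sketch of the resulting layout. ActionKind, ActionList,
ActionOps and dispatch_all() are hypothetical stand-ins invented for
illustration; they are not the QAPI-generated types or blockdev.c
code.

```c
#include <assert.h>
#include <stddef.h>

#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))

/* Hypothetical stand-ins for the generated QAPI types. */
typedef enum { KIND_SNAPSHOT, KIND_BACKUP, KIND_ABORT } ActionKind;

typedef struct ActionList {
    ActionKind value;
    struct ActionList *next;
} ActionList;

typedef struct ActionOps {
    const char *name;
} ActionOps;

/*
 * File-scope dispatch table, indexed by action kind. If it were still
 * called "actions", the parameter of dispatch_all() would hide it.
 */
static const ActionOps actions_map[] = {
    [KIND_SNAPSHOT] = { .name = "snapshot" },
    [KIND_BACKUP]   = { .name = "backup" },
    [KIND_ABORT]    = { .name = "abort" },
};

/*
 * Walk the list of requested actions and record which table entry
 * each one maps to. Returns the number of entries written to names[].
 */
static size_t dispatch_all(const ActionList *actions,
                           const char **names, size_t max)
{
    const ActionList *act;
    size_t n = 0;

    for (act = actions; act && n < max; act = act->next) {
        /* "actions" is the list here, "actions_map" the table */
        assert((size_t)act->value < ARRAY_SIZE(actions_map));
        names[n++] = actions_map[act->value].name;
    }
    return n;
}
```

With distinct names, both the list and the table stay reachable from
inside the function, which is what the rename in this patch is meant
to achieve.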
 blockdev.c | 30 +++++++++++++++---------------
 1 file changed, 15 insertions(+), 15 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index a9fb5f66b0..177f3ff989 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -2299,7 +2299,7 @@ static void abort_commit(void *opaque)
     g_assert_not_reached(); /* this action never succeeds */
 }
 
-static const BlkActionOps actions[] = {
+static const BlkActionOps actions_map[] = {
     [TRANSACTION_ACTION_KIND_BLOCKDEV_SNAPSHOT] = {
         .instance_size = sizeof(ExternalSnapshotState),
         .action  = external_snapshot_action,
@@ -2381,12 +2381,12 @@ static TransactionProperties *get_transaction_properties(
  *
  * Always run under BQL.
  */
-void qmp_transaction(TransactionActionList *dev_list,
-                     bool has_props,
-                     struct TransactionProperties *props,
+void qmp_transaction(TransactionActionList *actions,
+                     bool has_properties,
+                     struct TransactionProperties *properties,
                      Error **errp)
 {
-    TransactionActionList *dev_entry = dev_list;
+    TransactionActionList *act = actions;
     JobTxn *block_job_txn = NULL;
     Error *local_err = NULL;
     Transaction *tran = tran_new();
@@ -2396,8 +2396,8 @@ void qmp_transaction(TransactionActionList *dev_list,
     /* Does this transaction get canceled as a group on failure?
      * If not, we don't really need to make a JobTxn.
      */
-    props = get_transaction_properties(props);
-    if (props->completion_mode != ACTION_COMPLETION_MODE_INDIVIDUAL) {
+    properties = get_transaction_properties(properties);
+    if (properties->completion_mode != ACTION_COMPLETION_MODE_INDIVIDUAL) {
         block_job_txn = job_txn_new();
     }
 
@@ -2405,24 +2405,24 @@ void qmp_transaction(TransactionActionList *dev_list,
     bdrv_drain_all();
 
     /* We don't do anything in this loop that commits us to the operations */
-    while (NULL != dev_entry) {
+    while (NULL != act) {
         TransactionAction *dev_info = NULL;
         const BlkActionOps *ops;
         BlkActionState *state;
 
-        dev_info = dev_entry->value;
-        dev_entry = dev_entry->next;
+        dev_info = act->value;
+        act = act->next;
 
-        assert(dev_info->type < ARRAY_SIZE(actions));
+        assert(dev_info->type < ARRAY_SIZE(actions_map));
 
-        ops = &actions[dev_info->type];
+        ops = &actions_map[dev_info->type];
         assert(ops->instance_size > 0);
 
         state = g_malloc0(ops->instance_size);
         state->ops = ops;
         state->action = dev_info;
         state->block_job_txn = block_job_txn;
-        state->txn_props = props;
+        state->txn_props = properties;
 
         state->ops->action(state, tran, &local_err);
         if (local_err) {
@@ -2440,8 +2440,8 @@ delete_and_fail:
     /* failure, and it is all-or-none; roll back all operations */
     tran_abort(tran);
 exit:
-    if (!has_props) {
-        qapi_free_TransactionProperties(props);
+    if (!has_properties) {
+        qapi_free_TransactionProperties(properties);
     }
     job_txn_unref(block_job_txn);
 }
-- 
2.35.1




* [PATCH v5 25/45] blockdev: qmp_transaction: refactor loop to classic for
  2022-03-30 21:28 [PATCH v5 00/45] Transactional block-graph modifying API Vladimir Sementsov-Ogievskiy
                   ` (23 preceding siblings ...)
  2022-03-30 21:28 ` [PATCH v5 24/45] blockdev: transactions: rename some things Vladimir Sementsov-Ogievskiy
@ 2022-03-30 21:28 ` Vladimir Sementsov-Ogievskiy
  2022-03-30 21:28 ` [PATCH v5 26/45] blockdev: transaction: refactor handling transaction properties Vladimir Sementsov-Ogievskiy
                   ` (19 subsequent siblings)
  44 siblings, 0 replies; 78+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2022-03-30 21:28 UTC (permalink / raw)
  To: qemu-block
  Cc: kwolf, v.sementsov-og, qemu-devel, Markus Armbruster, hreitz, vsementsov

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
---
 blockdev.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 177f3ff989..b44f0ca101 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -2386,7 +2386,7 @@ void qmp_transaction(TransactionActionList *actions,
                      struct TransactionProperties *properties,
                      Error **errp)
 {
-    TransactionActionList *act = actions;
+    TransactionActionList *act;
     JobTxn *block_job_txn = NULL;
     Error *local_err = NULL;
     Transaction *tran = tran_new();
@@ -2405,14 +2405,11 @@ void qmp_transaction(TransactionActionList *actions,
     bdrv_drain_all();
 
     /* We don't do anything in this loop that commits us to the operations */
-    while (NULL != act) {
-        TransactionAction *dev_info = NULL;
+    for (act = actions; act; act = act->next) {
+        TransactionAction *dev_info = act->value;
         const BlkActionOps *ops;
         BlkActionState *state;
 
-        dev_info = act->value;
-        act = act->next;
-
         assert(dev_info->type < ARRAY_SIZE(actions_map));
 
         ops = &actions_map[dev_info->type];
-- 
2.35.1




* [PATCH v5 26/45] blockdev: transaction: refactor handling transaction properties
  2022-03-30 21:28 [PATCH v5 00/45] Transactional block-graph modifying API Vladimir Sementsov-Ogievskiy
                   ` (24 preceding siblings ...)
  2022-03-30 21:28 ` [PATCH v5 25/45] blockdev: qmp_transaction: refactor loop to classic for Vladimir Sementsov-Ogievskiy
@ 2022-03-30 21:28 ` Vladimir Sementsov-Ogievskiy
  2022-03-30 21:28 ` [PATCH v5 27/45] blockdev: qmp_transaction: drop extra generic layer Vladimir Sementsov-Ogievskiy
                   ` (18 subsequent siblings)
  44 siblings, 0 replies; 78+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2022-03-30 21:28 UTC (permalink / raw)
  To: qemu-block
  Cc: kwolf, v.sementsov-og, qemu-devel, Markus Armbruster, hreitz, vsementsov

Only backup supports GROUPED mode. Make this logic clearer, and avoid
passing the transaction properties to each action.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
---
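The idea of checking the completion mode once, instead of in every
action, could be sketched roughly as below. This is only an
illustration under assumptions: CompletionMode, ActionKind, ActionList,
kind_supports_grouped() and first_unsupported() are hypothetical names
made up for this note, and the real check in blockdev.c may be shaped
differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the generated QAPI enums and list type. */
typedef enum { MODE_INDIVIDUAL, MODE_GROUPED } CompletionMode;

typedef enum {
    KIND_BLOCKDEV_SNAPSHOT,
    KIND_DRIVE_BACKUP,
    KIND_BLOCKDEV_BACKUP,
    KIND_BLOCK_DIRTY_BITMAP_ADD,
} ActionKind;

typedef struct ActionList {
    ActionKind kind;
    struct ActionList *next;
} ActionList;

/* Only the two backup actions can take part in a grouped completion. */
static bool kind_supports_grouped(ActionKind kind)
{
    return kind == KIND_DRIVE_BACKUP || kind == KIND_BLOCKDEV_BACKUP;
}

/*
 * Validate the whole list up front. Returns NULL when every action is
 * acceptable for the requested mode, otherwise the first offending
 * entry, so that the caller can report it and bail out before any
 * action has run.
 */
static const ActionList *first_unsupported(const ActionList *actions,
                                           CompletionMode mode)
{
    const ActionList *act;

    assert(mode == MODE_INDIVIDUAL || mode == MODE_GROUPED);

    if (mode == MODE_INDIVIDUAL) {
        return NULL; /* every action supports individual completion */
    }

    for (act = actions; act; act = act->next) {
        if (!kind_supports_grouped(act->kind)) {
            return act;
        }
    }
    return NULL;
}
```

With a single check like this in the caller, the individual actions no
longer need access to the transaction properties at all.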
 blockdev.c | 88 ++++++++++++------------------------------------------
 1 file changed, 19 insertions(+), 69 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index b44f0ca101..3c9e826355 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1220,7 +1220,6 @@ struct BlkActionState {
     TransactionAction *action;
     const BlkActionOps *ops;
     JobTxn *block_job_txn;
-    TransactionProperties *txn_props;
     QTAILQ_ENTRY(BlkActionState) entry;
 };
 
@@ -1239,19 +1238,6 @@ TransactionActionDrv internal_snapshot_drv = {
     .clean = internal_snapshot_clean,
 };
 
-static int action_check_completion_mode(BlkActionState *s, Error **errp)
-{
-    if (s->txn_props->completion_mode != ACTION_COMPLETION_MODE_INDIVIDUAL) {
-        error_setg(errp,
-                   "Action '%s' does not support Transaction property "
-                   "completion-mode = %s",
-                   TransactionActionKind_str(s->action->type),
-                   ActionCompletionMode_str(s->txn_props->completion_mode));
-        return -1;
-    }
-    return 0;
-}
-
 static void internal_snapshot_action(BlkActionState *common,
                                      Transaction *tran, Error **errp)
 {
@@ -1274,15 +1260,9 @@ static void internal_snapshot_action(BlkActionState *common,
 
     tran_add(tran, &internal_snapshot_drv, state);
 
-    /* 1. parse input */
     device = internal->device;
     name = internal->name;
 
-    /* 2. check for validation */
-    if (action_check_completion_mode(common, errp) < 0) {
-        return;
-    }
-
     bs = qmp_get_root_bs(device, errp);
     if (!bs) {
         return;
@@ -1468,9 +1448,6 @@ static void external_snapshot_action(BlkActionState *common, Transaction *tran,
     }
 
     /* start processing */
-    if (action_check_completion_mode(common, errp) < 0) {
-        return;
-    }
 
     state->old_bs = bdrv_lookup_bs(device, node_name, errp);
     if (!state->old_bs) {
@@ -2030,10 +2007,6 @@ static void block_dirty_bitmap_add_action(BlkActionState *common,
 
     tran_add(tran, &block_dirty_bitmap_add_drv, state);
 
-    if (action_check_completion_mode(common, errp) < 0) {
-        return;
-    }
-
     action = common->action->u.block_dirty_bitmap_add.data;
     /* AIO context taken and released within qmp_block_dirty_bitmap_add */
     qmp_block_dirty_bitmap_add(action->node, action->name,
@@ -2080,10 +2053,6 @@ static void block_dirty_bitmap_clear_action(BlkActionState *common,
 
     tran_add(tran, &block_dirty_bitmap_clear_drv, state);
 
-    if (action_check_completion_mode(common, errp) < 0) {
-        return;
-    }
-
     action = common->action->u.block_dirty_bitmap_clear.data;
     state->bitmap = block_dirty_bitmap_lookup(action->node,
                                               action->name,
@@ -2131,10 +2100,6 @@ static void block_dirty_bitmap_enable_action(BlkActionState *common,
 
     tran_add(tran, &block_dirty_bitmap_enable_drv, state);
 
-    if (action_check_completion_mode(common, errp) < 0) {
-        return;
-    }
-
     action = common->action->u.block_dirty_bitmap_enable.data;
     state->bitmap = block_dirty_bitmap_lookup(action->node,
                                               action->name,
@@ -2176,10 +2141,6 @@ static void block_dirty_bitmap_disable_action(BlkActionState *common,
 
     tran_add(tran, &block_dirty_bitmap_disable_drv, state);
 
-    if (action_check_completion_mode(common, errp) < 0) {
-        return;
-    }
-
     action = common->action->u.block_dirty_bitmap_disable.data;
     state->bitmap = block_dirty_bitmap_lookup(action->node,
                                               action->name,
@@ -2221,10 +2182,6 @@ static void block_dirty_bitmap_merge_action(BlkActionState *common,
 
     tran_add(tran, &block_dirty_bitmap_merge_drv, state);
 
-    if (action_check_completion_mode(common, errp) < 0) {
-        return;
-    }
-
     action = common->action->u.block_dirty_bitmap_merge.data;
 
     state->bitmap = block_dirty_bitmap_merge(action->node, action->target,
@@ -2249,10 +2206,6 @@ static void block_dirty_bitmap_remove_action(BlkActionState *common,
 
     tran_add(tran, &block_dirty_bitmap_remove_drv, state);
 
-    if (action_check_completion_mode(common, errp) < 0) {
-        return;
-    }
-
     action = common->action->u.block_dirty_bitmap_remove.data;
 
     state->bitmap = block_dirty_bitmap_remove(action->node, action->name,
@@ -2356,25 +2309,6 @@ static const BlkActionOps actions_map[] = {
      */
 };
 
-/**
- * Allocate a TransactionProperties structure if necessary, and fill
- * that structure with desired defaults if they are unset.
- */
-static TransactionProperties *get_transaction_properties(
-    TransactionProperties *props)
-{
-    if (!props) {
-        props = g_new0(TransactionProperties, 1);
-    }
-
-    if (!props->has_completion_mode) {
-        props->has_completion_mode = true;
-        props->completion_mode = ACTION_COMPLETION_MODE_INDIVIDUAL;
-    }
-
-    return props;
-}
-
 /*
  * 'Atomic' group operations.  The operations are performed as a set, and if
  * any fail then we roll back all operations in the group.
@@ -2390,14 +2324,31 @@ void qmp_transaction(TransactionActionList *actions,
     JobTxn *block_job_txn = NULL;
     Error *local_err = NULL;
     Transaction *tran = tran_new();
+    ActionCompletionMode comp_mode =
+        has_properties ? properties->completion_mode :
+        ACTION_COMPLETION_MODE_INDIVIDUAL;
 
     GLOBAL_STATE_CODE();
 
     /* Does this transaction get canceled as a group on failure?
      * If not, we don't really need to make a JobTxn.
      */
-    properties = get_transaction_properties(properties);
-    if (properties->completion_mode != ACTION_COMPLETION_MODE_INDIVIDUAL) {
+    if (comp_mode != ACTION_COMPLETION_MODE_INDIVIDUAL) {
+        for (act = actions; act; act = act->next) {
+            TransactionActionKind type = act->value->type;
+
+            if (type != TRANSACTION_ACTION_KIND_BLOCKDEV_BACKUP &&
+                type != TRANSACTION_ACTION_KIND_DRIVE_BACKUP)
+            {
+                error_setg(errp,
+                           "Action '%s' does not support Transaction property "
+                           "completion-mode = %s",
+                           TransactionActionKind_str(type),
+                           ActionCompletionMode_str(comp_mode));
+                return;
+            }
+        }
+
         block_job_txn = job_txn_new();
     }
 
@@ -2419,7 +2370,6 @@ void qmp_transaction(TransactionActionList *actions,
         state->ops = ops;
         state->action = dev_info;
         state->block_job_txn = block_job_txn;
-        state->txn_props = properties;
 
         state->ops->action(state, tran, &local_err);
         if (local_err) {
-- 
2.35.1




* [PATCH v5 27/45] blockdev: qmp_transaction: drop extra generic layer
  2022-03-30 21:28 [PATCH v5 00/45] Transactional block-graph modifying API Vladimir Sementsov-Ogievskiy
                   ` (25 preceding siblings ...)
  2022-03-30 21:28 ` [PATCH v5 26/45] blockdev: transaction: refactor handling transaction properties Vladimir Sementsov-Ogievskiy
@ 2022-03-30 21:28 ` Vladimir Sementsov-Ogievskiy
  2022-03-30 21:28 ` [PATCH v5 28/45] qapi: block: add blockdev-del transaction action Vladimir Sementsov-Ogievskiy
                   ` (17 subsequent siblings)
  44 siblings, 0 replies; 78+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2022-03-30 21:28 UTC (permalink / raw)
  To: qemu-block
  Cc: kwolf, v.sementsov-og, qemu-devel, Markus Armbruster, hreitz, vsementsov

Let's simplify things:

First, actions generally don't need access to the common
BlkActionState structure. The only exceptions are the backup actions,
which need block_job_txn.

Next, for actions built on the Transaction API it is more natural to
allocate the state structure in the action itself.

So, do the following transformation:

1. Let each action be represented by a function that takes the
   corresponding structure as an argument.

2. Instead of the array-map marshaller, add a function that calls the
   corresponding action directly.

3. The BlkActionOps and BlkActionState structures become unused, so
   drop them.
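
The transformation above can be sketched in isolation roughly as follows.
This is only an illustration under stated assumptions: every name in it
(Action, Tran, tran_push(), dispatch(), ...) is a hypothetical stand-in and
not the QEMU code. It shows a switch that calls each action directly, with
each action allocating its own state and only the backup-like action taking
the extra job-group argument.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; none of these are QEMU types. */
typedef enum { ACT_SNAPSHOT, ACT_BACKUP, ACT_ABORT } ActKind;
typedef struct { const char *device; } SnapshotArgs;
typedef struct { const char *device; const char *target; } BackupArgs;

typedef struct {
    ActKind kind;
    union {
        SnapshotArgs snapshot;
        BackupArgs backup;
    } u;
} Action;

/* Toy transaction: a list of cleanup callbacks with their opaque state. */
typedef struct Entry {
    void (*clean)(void *opaque);
    void *opaque;
    struct Entry *next;
} Entry;

typedef struct { Entry *head; } Tran;

static void tran_push(Tran *t, void (*clean)(void *), void *opaque)
{
    Entry *e = malloc(sizeof(*e));
    if (!e) {
        abort();
    }
    e->clean = clean;
    e->opaque = opaque;
    e->next = t->head;
    t->head = e;
}

/* Run every cleanup callback; return how many entries were cleaned. */
static int tran_finish(Tran *t)
{
    int n = 0;
    while (t->head) {
        Entry *e = t->head;
        t->head = e->next;
        e->clean(e->opaque);
        free(e);
        n++;
    }
    return n;
}

typedef struct { const char *device; int created; } SnapshotState;
typedef struct { const char *target; int started; } BackupState;

/* Each action allocates its own state and registers it for cleanup. */
static int snapshot_action(const SnapshotArgs *args, Tran *t)
{
    SnapshotState *s = calloc(1, sizeof(*s));
    if (!s) {
        abort();
    }
    tran_push(t, free, s);
    s->device = args->device;
    s->created = 1;
    return 0;
}

/* Only the backup-like action needs the extra job-group argument. */
static int backup_action(const BackupArgs *args, int *job_group, Tran *t)
{
    BackupState *s = calloc(1, sizeof(*s));
    if (!s) {
        abort();
    }
    tran_push(t, free, s);
    s->target = args->target;
    s->started = 1;
    (*job_group)++;
    return 0;
}

/* Direct dispatch replaces the table of function pointers. */
static int dispatch(const Action *act, int *job_group, Tran *t)
{
    assert(act && t);
    switch (act->kind) {
    case ACT_SNAPSHOT:
        return snapshot_action(&act->u.snapshot, t);
    case ACT_BACKUP:
        return backup_action(&act->u.backup, job_group, t);
    case ACT_ABORT:
        return -1; /* this action never succeeds */
    }
    return -1;
}
```

A caller would run dispatch() once per action and then tran_finish() to
release the per-action state. Because every action has its own signature,
no common base struct and no upcast are needed in this sketch.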

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
---
 blockdev.c | 278 +++++++++++++++++------------------------------------
 1 file changed, 89 insertions(+), 189 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 3c9e826355..a7287bf64f 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1178,54 +1178,8 @@ out_aio_context:
     return NULL;
 }
 
-/* New and old BlockDriverState structs for atomic group operations */
-
-typedef struct BlkActionState BlkActionState;
-
-/**
- * BlkActionOps:
- * Table of operations that define an Action.
- *
- * @instance_size: Size of state struct, in bytes.
- * @prepare: Prepare the work, must NOT be NULL.
- * @commit: Commit the changes, can be NULL.
- * @abort: Abort the changes on fail, can be NULL.
- * @clean: Clean up resources after all transaction actions have called
- *         commit() or abort(). Can be NULL.
- *
- * Only prepare() may fail. In a single transaction, only one of commit() or
- * abort() will be called. clean() will always be called if it is present.
- *
- * Always run under BQL.
- */
-typedef struct BlkActionOps {
-    size_t instance_size;
-    void (*action)(BlkActionState *common, Transaction *tran, Error **errp);
-} BlkActionOps;
-
-/**
- * BlkActionState:
- * Describes one Action's state within a Transaction.
- *
- * @action: QAPI-defined enum identifying which Action to perform.
- * @ops: Table of ActionOps this Action can perform.
- * @block_job_txn: Transaction which this action belongs to.
- * @entry: List membership for all Actions in this Transaction.
- *
- * This structure must be arranged as first member in a subclassed type,
- * assuming that the compiler will also arrange it to the same offsets as the
- * base class.
- */
-struct BlkActionState {
-    TransactionAction *action;
-    const BlkActionOps *ops;
-    JobTxn *block_job_txn;
-    QTAILQ_ENTRY(BlkActionState) entry;
-};
-
 /* internal snapshot private data */
 typedef struct InternalSnapshotState {
-    BlkActionState common;
     BlockDriverState *bs;
     QEMUSnapshotInfo sn;
     bool created;
@@ -1238,7 +1192,7 @@ TransactionActionDrv internal_snapshot_drv = {
     .clean = internal_snapshot_clean,
 };
 
-static void internal_snapshot_action(BlkActionState *common,
+static void internal_snapshot_action(BlockdevSnapshotInternal *internal,
                                      Transaction *tran, Error **errp)
 {
     Error *local_err = NULL;
@@ -1248,16 +1202,10 @@ static void internal_snapshot_action(BlkActionState *common,
     QEMUSnapshotInfo old_sn, *sn;
     bool ret;
     qemu_timeval tv;
-    BlockdevSnapshotInternal *internal;
-    InternalSnapshotState *state;
+    InternalSnapshotState *state = g_new0(InternalSnapshotState, 1);
     AioContext *aio_context;
     int ret1;
 
-    g_assert(common->action->type ==
-             TRANSACTION_ACTION_KIND_BLOCKDEV_SNAPSHOT_INTERNAL_SYNC);
-    internal = common->action->u.blockdev_snapshot_internal_sync.data;
-    state = DO_UPCAST(InternalSnapshotState, common, common);
-
     tran_add(tran, &internal_snapshot_drv, state);
 
     device = internal->device;
@@ -1385,7 +1333,6 @@ static void internal_snapshot_clean(void *opaque)
 
 /* external snapshot private data */
 typedef struct ExternalSnapshotState {
-    BlkActionState common;
     BlockDriverState *old_bs;
     BlockDriverState *new_bs;
     bool overlay_appended;
@@ -1400,8 +1347,8 @@ TransactionActionDrv external_snapshot_drv = {
     .clean = external_snapshot_clean,
 };
 
-static void external_snapshot_action(BlkActionState *common, Transaction *tran,
-                                     Error **errp)
+static void external_snapshot_action(TransactionAction *action,
+                                     Transaction *tran, Error **errp)
 {
     int ret;
     int flags = 0;
@@ -1414,9 +1361,7 @@ static void external_snapshot_action(BlkActionState *common, Transaction *tran,
     const char *snapshot_ref;
     /* File name of the new image (for 'blockdev-snapshot-sync') */
     const char *new_image_file;
-    ExternalSnapshotState *state =
-                             DO_UPCAST(ExternalSnapshotState, common, common);
-    TransactionAction *action = common->action;
+    ExternalSnapshotState *state = g_new0(ExternalSnapshotState, 1);
     AioContext *aio_context;
     uint64_t perm, shared;
 
@@ -1649,7 +1594,6 @@ static void external_snapshot_clean(void *opaque)
 }
 
 typedef struct DriveBackupState {
-    BlkActionState common;
     BlockDriverState *bs;
     BlockJob *job;
 } DriveBackupState;
@@ -1669,11 +1613,11 @@ TransactionActionDrv drive_backup_drv = {
     .clean = drive_backup_clean,
 };
 
-static void drive_backup_action(BlkActionState *common, Transaction *tran,
-                                Error **errp)
+static void drive_backup_action(DriveBackup *backup,
+                                JobTxn *block_job_txn,
+                                Transaction *tran, Error **errp)
 {
-    DriveBackupState *state = DO_UPCAST(DriveBackupState, common, common);
-    DriveBackup *backup;
+    DriveBackupState *state = g_new0(DriveBackupState, 1);
     BlockDriverState *bs;
     BlockDriverState *target_bs;
     BlockDriverState *source = NULL;
@@ -1688,9 +1632,6 @@ static void drive_backup_action(BlkActionState *common, Transaction *tran,
 
     tran_add(tran, &drive_backup_drv, state);
 
-    assert(common->action->type == TRANSACTION_ACTION_KIND_DRIVE_BACKUP);
-    backup = common->action->u.drive_backup.data;
-
     if (!backup->has_mode) {
         backup->mode = NEW_IMAGE_MODE_ABSOLUTE_PATHS;
     }
@@ -1810,7 +1751,7 @@ static void drive_backup_action(BlkActionState *common, Transaction *tran,
 
     state->job = do_backup_common(qapi_DriveBackup_base(backup),
                                   bs, target_bs, aio_context,
-                                  common->block_job_txn, errp);
+                                  block_job_txn, errp);
 
 unref:
     bdrv_unref(target_bs);
@@ -1868,7 +1809,6 @@ static void drive_backup_clean(void *opaque)
 }
 
 typedef struct BlockdevBackupState {
-    BlkActionState common;
     BlockDriverState *bs;
     BlockJob *job;
 } BlockdevBackupState;
@@ -1882,11 +1822,11 @@ TransactionActionDrv blockdev_backup_drv = {
     .clean = blockdev_backup_clean,
 };
 
-static void blockdev_backup_action(BlkActionState *common, Transaction *tran,
-                                   Error **errp)
+static void blockdev_backup_action(BlockdevBackup *backup,
+                                   JobTxn *block_job_txn,
+                                   Transaction *tran, Error **errp)
 {
-    BlockdevBackupState *state = DO_UPCAST(BlockdevBackupState, common, common);
-    BlockdevBackup *backup;
+    BlockdevBackupState *state = g_new0(BlockdevBackupState, 1);
     BlockDriverState *bs;
     BlockDriverState *target_bs;
     AioContext *aio_context;
@@ -1895,9 +1835,6 @@ static void blockdev_backup_action(BlkActionState *common, Transaction *tran,
 
     tran_add(tran, &blockdev_backup_drv, state);
 
-    assert(common->action->type == TRANSACTION_ACTION_KIND_BLOCKDEV_BACKUP);
-    backup = common->action->u.blockdev_backup.data;
-
     bs = bdrv_lookup_bs(backup->device, backup->device, errp);
     if (!bs) {
         return;
@@ -1928,7 +1865,7 @@ static void blockdev_backup_action(BlkActionState *common, Transaction *tran,
 
     state->job = do_backup_common(qapi_BlockdevBackup_base(backup),
                                   bs, target_bs, aio_context,
-                                  common->block_job_txn, errp);
+                                  block_job_txn, errp);
 
     aio_context_release(aio_context);
 }
@@ -1983,11 +1920,9 @@ static void blockdev_backup_clean(void *opaque)
 }
 
 typedef struct BlockDirtyBitmapState {
-    BlkActionState common;
     BdrvDirtyBitmap *bitmap;
     BlockDriverState *bs;
     HBitmap *backup;
-    bool prepared;
     bool was_enabled;
 } BlockDirtyBitmapState;
 
@@ -1997,17 +1932,14 @@ TransactionActionDrv block_dirty_bitmap_add_drv = {
     .clean = g_free,
 };
 
-static void block_dirty_bitmap_add_action(BlkActionState *common,
+static void block_dirty_bitmap_add_action(BlockDirtyBitmapAdd *action,
                                           Transaction *tran, Error **errp)
 {
     Error *local_err = NULL;
-    BlockDirtyBitmapAdd *action;
-    BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
-                                             common, common);
+    BlockDirtyBitmapState *state = g_new0(BlockDirtyBitmapState, 1);
 
     tran_add(tran, &block_dirty_bitmap_add_drv, state);
 
-    action = common->action->u.block_dirty_bitmap_add.data;
     /* AIO context taken and released within qmp_block_dirty_bitmap_add */
     qmp_block_dirty_bitmap_add(action->node, action->name,
                                action->has_granularity, action->granularity,
@@ -2016,7 +1948,8 @@ static void block_dirty_bitmap_add_action(BlkActionState *common,
                                &local_err);
 
     if (!local_err) {
-        state->prepared = true;
+        state->bitmap = block_dirty_bitmap_lookup(action->node, action->name,
+                                                  NULL, &error_abort);
     } else {
         error_propagate(errp, local_err);
     }
@@ -2024,15 +1957,10 @@ static void block_dirty_bitmap_add_action(BlkActionState *common,
 
 static void block_dirty_bitmap_add_abort(void *opaque)
 {
-    BlockDirtyBitmapAdd *action;
     BlockDirtyBitmapState *state = opaque;
 
-    action = state->common.action->u.block_dirty_bitmap_add.data;
-    /* Should not be able to fail: IF the bitmap was added via .prepare(),
-     * then the node reference and bitmap name must have been valid.
-     */
-    if (state->prepared) {
-        qmp_block_dirty_bitmap_remove(action->node, action->name, &error_abort);
+    if (state->bitmap) {
+        bdrv_release_dirty_bitmap(state->bitmap);
     }
 }
 
@@ -2044,16 +1972,13 @@ TransactionActionDrv block_dirty_bitmap_clear_drv = {
     .clean = g_free,
 };
 
-static void block_dirty_bitmap_clear_action(BlkActionState *common,
+static void block_dirty_bitmap_clear_action(BlockDirtyBitmap *action,
                                             Transaction *tran, Error **errp)
 {
-    BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
-                                             common, common);
-    BlockDirtyBitmap *action;
+    BlockDirtyBitmapState *state = g_new0(BlockDirtyBitmapState, 1);
 
     tran_add(tran, &block_dirty_bitmap_clear_drv, state);
 
-    action = common->action->u.block_dirty_bitmap_clear.data;
     state->bitmap = block_dirty_bitmap_lookup(action->node,
                                               action->name,
                                               &state->bs,
@@ -2091,16 +2016,13 @@ TransactionActionDrv block_dirty_bitmap_enable_drv = {
     .clean = g_free,
 };
 
-static void block_dirty_bitmap_enable_action(BlkActionState *common,
+static void block_dirty_bitmap_enable_action(BlockDirtyBitmap *action,
                                              Transaction *tran, Error **errp)
 {
-    BlockDirtyBitmap *action;
-    BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
-                                             common, common);
+    BlockDirtyBitmapState *state = g_new0(BlockDirtyBitmapState, 1);
 
     tran_add(tran, &block_dirty_bitmap_enable_drv, state);
 
-    action = common->action->u.block_dirty_bitmap_enable.data;
     state->bitmap = block_dirty_bitmap_lookup(action->node,
                                               action->name,
                                               NULL,
@@ -2132,16 +2054,13 @@ TransactionActionDrv block_dirty_bitmap_disable_drv = {
     .clean = g_free,
 };
 
-static void block_dirty_bitmap_disable_action(BlkActionState *common,
+static void block_dirty_bitmap_disable_action(BlockDirtyBitmap *action,
                                               Transaction *tran, Error **errp)
 {
-    BlockDirtyBitmap *action;
-    BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
-                                             common, common);
+    BlockDirtyBitmapState *state = g_new0(BlockDirtyBitmapState, 1);
 
     tran_add(tran, &block_dirty_bitmap_disable_drv, state);
 
-    action = common->action->u.block_dirty_bitmap_disable.data;
     state->bitmap = block_dirty_bitmap_lookup(action->node,
                                               action->name,
                                               NULL,
@@ -2173,17 +2092,13 @@ TransactionActionDrv block_dirty_bitmap_merge_drv = {
     .clean = g_free,
 };
 
-static void block_dirty_bitmap_merge_action(BlkActionState *common,
+static void block_dirty_bitmap_merge_action(BlockDirtyBitmapMerge *action,
                                             Transaction *tran, Error **errp)
 {
-    BlockDirtyBitmapMerge *action;
-    BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
-                                             common, common);
+    BlockDirtyBitmapState *state = g_new0(BlockDirtyBitmapState, 1);
 
     tran_add(tran, &block_dirty_bitmap_merge_drv, state);
 
-    action = common->action->u.block_dirty_bitmap_merge.data;
-
     state->bitmap = block_dirty_bitmap_merge(action->node, action->target,
                                              action->bitmaps, &state->backup,
                                              errp);
@@ -2197,16 +2112,13 @@ TransactionActionDrv block_dirty_bitmap_remove_drv = {
     .clean = g_free,
 };
 
-static void block_dirty_bitmap_remove_action(BlkActionState *common,
+static void block_dirty_bitmap_remove_action(BlockDirtyBitmap *action,
                                              Transaction *tran, Error **errp)
 {
-    BlockDirtyBitmap *action;
-    BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
-                                             common, common);
+    BlockDirtyBitmapState *state = g_new0(BlockDirtyBitmapState, 1);
 
     tran_add(tran, &block_dirty_bitmap_remove_drv, state);
 
-    action = common->action->u.block_dirty_bitmap_remove.data;
 
     state->bitmap = block_dirty_bitmap_remove(action->node, action->name,
                                               false, &state->bs, errp);
@@ -2237,13 +2149,11 @@ static void block_dirty_bitmap_remove_commit(void *opaque)
 static void abort_commit(void *opaque);
 TransactionActionDrv abort_drv = {
     .commit = abort_commit,
-    .clean = g_free,
 };
 
-static void abort_action(BlkActionState *common, Transaction *tran,
-                         Error **errp)
+static void abort_action(Transaction *tran, Error **errp)
 {
-    tran_add(tran, &abort_drv, common);
+    tran_add(tran, &abort_drv, NULL);
     error_setg(errp, "Transaction aborted using Abort action");
 }
 
@@ -2252,62 +2162,66 @@ static void abort_commit(void *opaque)
     g_assert_not_reached(); /* this action never succeeds */
 }
 
-static const BlkActionOps actions_map[] = {
-    [TRANSACTION_ACTION_KIND_BLOCKDEV_SNAPSHOT] = {
-        .instance_size = sizeof(ExternalSnapshotState),
-        .action  = external_snapshot_action,
-    },
-    [TRANSACTION_ACTION_KIND_BLOCKDEV_SNAPSHOT_SYNC] = {
-        .instance_size = sizeof(ExternalSnapshotState),
-        .action  = external_snapshot_action,
-    },
-    [TRANSACTION_ACTION_KIND_DRIVE_BACKUP] = {
-        .instance_size = sizeof(DriveBackupState),
-        .action = drive_backup_action,
-    },
-    [TRANSACTION_ACTION_KIND_BLOCKDEV_BACKUP] = {
-        .instance_size = sizeof(BlockdevBackupState),
-        .action = blockdev_backup_action,
-    },
-    [TRANSACTION_ACTION_KIND_ABORT] = {
-        .instance_size = sizeof(BlkActionState),
-        .action = abort_action,
-    },
-    [TRANSACTION_ACTION_KIND_BLOCKDEV_SNAPSHOT_INTERNAL_SYNC] = {
-        .instance_size = sizeof(InternalSnapshotState),
-        .action  = internal_snapshot_action,
-    },
-    [TRANSACTION_ACTION_KIND_BLOCK_DIRTY_BITMAP_ADD] = {
-        .instance_size = sizeof(BlockDirtyBitmapState),
-        .action = block_dirty_bitmap_add_action,
-    },
-    [TRANSACTION_ACTION_KIND_BLOCK_DIRTY_BITMAP_CLEAR] = {
-        .instance_size = sizeof(BlockDirtyBitmapState),
-        .action = block_dirty_bitmap_clear_action,
-    },
-    [TRANSACTION_ACTION_KIND_BLOCK_DIRTY_BITMAP_ENABLE] = {
-        .instance_size = sizeof(BlockDirtyBitmapState),
-        .action = block_dirty_bitmap_enable_action,
-    },
-    [TRANSACTION_ACTION_KIND_BLOCK_DIRTY_BITMAP_DISABLE] = {
-        .instance_size = sizeof(BlockDirtyBitmapState),
-        .action = block_dirty_bitmap_disable_action,
-    },
-    [TRANSACTION_ACTION_KIND_BLOCK_DIRTY_BITMAP_MERGE] = {
-        .instance_size = sizeof(BlockDirtyBitmapState),
-        .action = block_dirty_bitmap_merge_action,
-    },
-    [TRANSACTION_ACTION_KIND_BLOCK_DIRTY_BITMAP_REMOVE] = {
-        .instance_size = sizeof(BlockDirtyBitmapState),
-        .action = block_dirty_bitmap_remove_action,
-    },
-    /* Where are transactions for MIRROR, COMMIT and STREAM?
+static void transaction_action(TransactionAction *act, JobTxn *block_job_txn,
+                               Transaction *tran, Error **errp)
+{
+    switch (act->type) {
+    case TRANSACTION_ACTION_KIND_BLOCKDEV_SNAPSHOT:
+    case TRANSACTION_ACTION_KIND_BLOCKDEV_SNAPSHOT_SYNC:
+        external_snapshot_action(act, tran, errp);
+        return;
+    case TRANSACTION_ACTION_KIND_DRIVE_BACKUP:
+        drive_backup_action(act->u.drive_backup.data,
+                            block_job_txn, tran, errp);
+        return;
+    case TRANSACTION_ACTION_KIND_BLOCKDEV_BACKUP:
+        blockdev_backup_action(act->u.blockdev_backup.data,
+                               block_job_txn, tran, errp);
+        return;
+    case TRANSACTION_ACTION_KIND_ABORT:
+        abort_action(tran, errp);
+        return;
+    case TRANSACTION_ACTION_KIND_BLOCKDEV_SNAPSHOT_INTERNAL_SYNC:
+        internal_snapshot_action(act->u.blockdev_snapshot_internal_sync.data,
+                                 tran, errp);
+        return;
+    case TRANSACTION_ACTION_KIND_BLOCK_DIRTY_BITMAP_ADD:
+        block_dirty_bitmap_add_action(act->u.block_dirty_bitmap_add.data,
+                                      tran, errp);
+        return;
+    case TRANSACTION_ACTION_KIND_BLOCK_DIRTY_BITMAP_CLEAR:
+        block_dirty_bitmap_clear_action(act->u.block_dirty_bitmap_clear.data,
+                                        tran, errp);
+        return;
+    case TRANSACTION_ACTION_KIND_BLOCK_DIRTY_BITMAP_ENABLE:
+        block_dirty_bitmap_enable_action(act->u.block_dirty_bitmap_enable.data,
+                                         tran, errp);
+        return;
+    case TRANSACTION_ACTION_KIND_BLOCK_DIRTY_BITMAP_DISABLE:
+        block_dirty_bitmap_disable_action(
+                act->u.block_dirty_bitmap_disable.data, tran, errp);
+        return;
+    case TRANSACTION_ACTION_KIND_BLOCK_DIRTY_BITMAP_MERGE:
+        block_dirty_bitmap_merge_action(act->u.block_dirty_bitmap_merge.data,
+                                        tran, errp);
+        return;
+    case TRANSACTION_ACTION_KIND_BLOCK_DIRTY_BITMAP_REMOVE:
+        block_dirty_bitmap_remove_action(act->u.block_dirty_bitmap_remove.data,
+                                         tran, errp);
+        return;
+    /*
+     * Where are transactions for MIRROR, COMMIT and STREAM?
      * Although these blockjobs use transaction callbacks like the backup job,
      * these jobs do not necessarily adhere to transaction semantics.
      * These jobs may not fully undo all of their actions on abort, nor do they
      * necessarily work in transactions with more than one job in them.
      */
-};
+    case TRANSACTION_ACTION_KIND__MAX:
+    default:
+        g_assert_not_reached();
+    };
+}
+
 
 /*
  * 'Atomic' group operations.  The operations are performed as a set, and if
@@ -2357,21 +2271,7 @@ void qmp_transaction(TransactionActionList *actions,
 
     /* We don't do anything in this loop that commits us to the operations */
     for (act = actions; act; act = act->next) {
-        TransactionAction *dev_info = act->value;
-        const BlkActionOps *ops;
-        BlkActionState *state;
-
-        assert(dev_info->type < ARRAY_SIZE(actions_map));
-
-        ops = &actions_map[dev_info->type];
-        assert(ops->instance_size > 0);
-
-        state = g_malloc0(ops->instance_size);
-        state->ops = ops;
-        state->action = dev_info;
-        state->block_job_txn = block_job_txn;
-
-        state->ops->action(state, tran, &local_err);
+        transaction_action(act->value, block_job_txn, tran, &local_err);
         if (local_err) {
             error_propagate(errp, local_err);
             goto delete_and_fail;
-- 
2.35.1




* [PATCH v5 28/45] qapi: block: add blockdev-del transaction action
  2022-03-30 21:28 [PATCH v5 00/45] Transactional block-graph modifying API Vladimir Sementsov-Ogievskiy
                   ` (26 preceding siblings ...)
  2022-03-30 21:28 ` [PATCH v5 27/45] blockdev: qmp_transaction: drop extra generic layer Vladimir Sementsov-Ogievskiy
@ 2022-03-30 21:28 ` Vladimir Sementsov-Ogievskiy
  2022-03-30 21:28 ` [PATCH v5 29/45] block: introduce BDRV_O_NOPERM flag Vladimir Sementsov-Ogievskiy
                   ` (16 subsequent siblings)
  44 siblings, 0 replies; 78+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2022-03-30 21:28 UTC (permalink / raw)
  To: qemu-block
  Cc: kwolf, v.sementsov-og, qemu-devel, Markus Armbruster, hreitz,
	vsementsov, Eric Blake

Support blockdev-del in a transaction.

The tricky part is how we update permissions: not after every
blockdev-del operation, but after a group of such operations. Soon we'll
support blockdev-add and the new blockdev-replace in the same manner, so
a wide range of block-graph modifying operations can be done in one
batch, with permissions updated only after the whole group, to avoid
intermediate permission conflicts.
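
To make the grouping rule concrete, here is a minimal standalone sketch.
It is an assumption-laden model, not the patch itself: Kind, Batch,
flush_pending() and run_actions() are hypothetical names, and a counter
stands in for the list of nodes that await a permission refresh.
Consecutive "del" actions accumulate pending work; the refresh runs once
when a different kind of action is met, and once more at the end.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: only "del" actions defer their permission update. */
typedef enum { KIND_DEL, KIND_OTHER } Kind;

typedef struct {
    int pending;   /* nodes still waiting for a permission refresh */
    int refreshes; /* how many times the refresh actually ran */
} Batch;

/* Refresh once for everything collected so far, if there is anything. */
static void flush_pending(Batch *b)
{
    assert(b != NULL);
    if (b->pending > 0) {
        b->refreshes++;
        b->pending = 0;
    }
}

/* Walk the action list and return the number of refreshes performed. */
static int run_actions(const Kind *kinds, size_t n)
{
    Batch b = { 0, 0 };

    for (size_t i = 0; i < n; i++) {
        if (kinds[i] != KIND_DEL) {
            /* A non-del action closes the current group first. */
            flush_pending(&b);
        } else {
            /* A del action only records that a refresh is needed. */
            b.pending++;
        }
    }

    /* Final refresh for a group that reaches the end of the list. */
    flush_pending(&b);
    return b.refreshes;
}
```

In this model three consecutive del actions cost one refresh instead of
three, and a non-del action in the middle splits them into two groups.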

Additionally, we need to add the aio_context_acquire_tran() transaction
action, to keep the AioContext acquired up to and including the final
bdrv_delete() in the commit of bdrv_unref_tran().
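
The idea of holding a context until the transaction is finalized can be
sketched as below. Everything here is hypothetical (Lock, Tran,
lock_acquire_tran(), tran_commit_all(), ...), and the callback ordering,
last-added entry first with commit followed by clean for each entry, is
an assumption of this sketch rather than a statement about QEMU's
transaction helper. The point shown is that a release registered as a
clean callback runs only after the deferred deletion has been committed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a context that must be held while deleting. */
typedef struct { int held; } Lock;

typedef struct {
    void (*commit)(void *opaque);
    void (*clean)(void *opaque);
    void *opaque;
} Entry;

#define MAX_ENTRIES 8
typedef struct { Entry e[MAX_ENTRIES]; int n; } Tran;

static void tran_add_entry(Tran *t, void (*commit)(void *),
                           void (*clean)(void *), void *opaque)
{
    assert(t->n < MAX_ENTRIES);
    t->e[t->n].commit = commit;
    t->e[t->n].clean = clean;
    t->e[t->n].opaque = opaque;
    t->n++;
}

static void lock_release_cb(void *opaque)
{
    ((Lock *)opaque)->held--;
}

/* Acquire now; the release is deferred to the transaction's clean step. */
static void lock_acquire_tran(Lock *l, Tran *t)
{
    l->held++;
    tran_add_entry(t, NULL, lock_release_cb, l);
}

typedef struct { Lock *lock; int deleted_while_locked; } Node;

/* Deferred deletion: record whether the lock was held when it ran. */
static void node_delete_commit(void *opaque)
{
    Node *n = opaque;
    n->deleted_while_locked = n->lock->held > 0;
}

/* Sketch ordering: newest entry first, commit then clean per entry. */
static void tran_commit_all(Tran *t)
{
    for (int i = t->n - 1; i >= 0; i--) {
        if (t->e[i].commit) {
            t->e[i].commit(t->e[i].opaque);
        }
        if (t->e[i].clean) {
            t->e[i].clean(t->e[i].opaque);
        }
    }
    t->n = 0;
}

/* Returns 1 if the deletion ran under the lock and the lock was released. */
static int acquire_demo(void)
{
    Lock lock = { 0 };
    Tran tran = { .n = 0 };
    Node node = { &lock, 0 };

    lock_acquire_tran(&lock, &tran);                     /* added first */
    tran_add_entry(&tran, node_delete_commit, NULL, &node); /* added later */
    tran_commit_all(&tran);

    return node.deleted_while_locked && lock.held == 0;
}
```

With a plain acquire/release pair around the preparation step, the lock
would already be dropped by the time the deferred deletion runs; tying the
release to the transaction avoids that in this model.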

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
---
 blockdev.c            | 87 +++++++++++++++++++++++++++++++++++++------
 qapi/block-core.json  | 11 ++++--
 qapi/transaction.json | 12 ++++++
 3 files changed, 95 insertions(+), 15 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index a7287bf64f..1cd95f4f02 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -63,6 +63,9 @@
 #include "qemu/main-loop.h"
 #include "qemu/throttle-options.h"
 
+static int blockdev_del(const char *node_name, GSList **detached,
+                        Transaction *tran, Error **errp);
+
 /* Protected by BQL */
 QTAILQ_HEAD(, BlockDriverState) monitor_bdrv_states =
     QTAILQ_HEAD_INITIALIZER(monitor_bdrv_states);
@@ -2163,6 +2166,7 @@ static void abort_commit(void *opaque)
 }
 
 static void transaction_action(TransactionAction *act, JobTxn *block_job_txn,
+                               GSList **refresh_list,
                                Transaction *tran, Error **errp)
 {
     switch (act->type) {
@@ -2209,6 +2213,10 @@ static void transaction_action(TransactionAction *act, JobTxn *block_job_txn,
         block_dirty_bitmap_remove_action(act->u.block_dirty_bitmap_remove.data,
                                          tran, errp);
         return;
+    case TRANSACTION_ACTION_KIND_BLOCKDEV_DEL:
+        blockdev_del(act->u.blockdev_del.data->node_name,
+                     refresh_list, tran, errp);
+        return;
     /*
      * Where are transactions for MIRROR, COMMIT and STREAM?
      * Although these blockjobs use transaction callbacks like the backup job,
@@ -2234,6 +2242,7 @@ void qmp_transaction(TransactionActionList *actions,
                      struct TransactionProperties *properties,
                      Error **errp)
 {
+    int ret;
     TransactionActionList *act;
     JobTxn *block_job_txn = NULL;
     Error *local_err = NULL;
@@ -2241,6 +2250,7 @@ void qmp_transaction(TransactionActionList *actions,
     ActionCompletionMode comp_mode =
         has_properties ? properties->completion_mode :
         ACTION_COMPLETION_MODE_INDIVIDUAL;
+    g_autoptr(GSList) refresh_list = NULL;
 
     GLOBAL_STATE_CODE();
 
@@ -2271,13 +2281,32 @@ void qmp_transaction(TransactionActionList *actions,
 
     /* We don't do anything in this loop that commits us to the operations */
     for (act = actions; act; act = act->next) {
-        transaction_action(act->value, block_job_txn, tran, &local_err);
+        TransactionActionKind type = act->value->type;
+
+        if (refresh_list &&
+            type != TRANSACTION_ACTION_KIND_BLOCKDEV_DEL)
+        {
+            ret = bdrv_list_refresh_perms(refresh_list, NULL, tran, errp);
+            if (ret < 0) {
+                goto delete_and_fail;
+            }
+            g_slist_free(refresh_list);
+            refresh_list = NULL;
+        }
+
+        transaction_action(act->value, block_job_txn, &refresh_list, tran,
+                           &local_err);
         if (local_err) {
             error_propagate(errp, local_err);
             goto delete_and_fail;
         }
     }
 
+    ret = bdrv_list_refresh_perms(refresh_list, NULL, tran, errp);
+    if (ret < 0) {
+        goto delete_and_fail;
+    }
+
     tran_commit(tran);
 
     /* success */
@@ -3520,9 +3549,25 @@ fail:
     g_slist_free(drained);
 }
 
-void qmp_blockdev_del(const char *node_name, Error **errp)
+static void aio_context_acquire_clean(void *opaque)
+{
+    aio_context_release(opaque);
+}
+
+TransactionActionDrv aio_context_acquire_drv = {
+    .clean = aio_context_acquire_clean,
+};
+
+static void aio_context_acquire_tran(AioContext *ctx, Transaction *tran)
+{
+    aio_context_acquire(ctx);
+    tran_add(tran, &aio_context_acquire_drv, ctx);
+}
+
+/* Function doesn't update permissions, it's a responsibility of caller. */
+static int blockdev_del(const char *node_name, GSList **refresh_list,
+                        Transaction *tran, Error **errp)
 {
-    AioContext *aio_context;
     BlockDriverState *bs;
 
     GLOBAL_STATE_CODE();
@@ -3530,36 +3575,54 @@ void qmp_blockdev_del(const char *node_name, Error **errp)
     bs = bdrv_find_node(node_name);
     if (!bs) {
         error_setg(errp, "Failed to find node with node-name='%s'", node_name);
-        return;
+        return -EINVAL;
     }
     if (bdrv_has_blk(bs)) {
         error_setg(errp, "Node %s is in use", node_name);
-        return;
+        return -EINVAL;
     }
-    aio_context = bdrv_get_aio_context(bs);
-    aio_context_acquire(aio_context);
+    aio_context_acquire_tran(bdrv_get_aio_context(bs), tran);
 
     if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_DRIVE_DEL, errp)) {
-        goto out;
+        return -EINVAL;
     }
 
     if (!QTAILQ_IN_USE(bs, monitor_list)) {
         error_setg(errp, "Node %s is not owned by the monitor",
                    bs->node_name);
-        goto out;
+        return -EINVAL;
     }
 
     if (bs->refcnt > 1) {
         error_setg(errp, "Block device %s is in use",
                    bdrv_get_device_or_node_name(bs));
-        goto out;
+        return -EINVAL;
     }
 
     QTAILQ_REMOVE(&monitor_bdrv_states, bs, monitor_list);
-    bdrv_unref(bs);
+    bdrv_unref_tran(bs, refresh_list, tran);
+
+    return 0;
+}
+
+void qmp_blockdev_del(const char *node_name, Error **errp)
+{
+    int ret;
+    Transaction *tran = tran_new();
+    g_autoptr(GSList) refresh_list = NULL;
+
+    ret = blockdev_del(node_name, &refresh_list, tran, errp);
+    if (ret < 0) {
+        goto out;
+    }
+
+    ret = bdrv_list_refresh_perms(refresh_list, NULL, tran, errp);
+    if (ret < 0) {
+        goto out;
+    }
 
 out:
-    aio_context_release(aio_context);
+    tran_finalize(tran, ret);
 }
 
 static BdrvChild *bdrv_find_child(BlockDriverState *parent_bs,
diff --git a/qapi/block-core.json b/qapi/block-core.json
index e89f2dfb5b..d915cddde9 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -4407,6 +4407,13 @@
 { 'command': 'blockdev-reopen',
   'data': { 'options': ['BlockdevOptions'] } }
 
+##
+# @BlockdevDel:
+#
+# @node-name: Name of the graph node to delete.
+##
+{ 'struct': 'BlockdevDel', 'data': { 'node-name': 'str' } }
+
 ##
 # @blockdev-del:
 #
@@ -4414,8 +4421,6 @@
 # The command will fail if the node is attached to a device or is
 # otherwise being used.
 #
-# @node-name: Name of the graph node to delete.
-#
 # Since: 2.9
 #
 # Example:
@@ -4438,7 +4443,7 @@
 # <- { "return": {} }
 #
 ##
-{ 'command': 'blockdev-del', 'data': { 'node-name': 'str' } }
+{ 'command': 'blockdev-del', 'data': 'BlockdevDel' }
 
 ##
 # @BlockdevCreateOptionsFile:
diff --git a/qapi/transaction.json b/qapi/transaction.json
index 381a2df782..ea20df770c 100644
--- a/qapi/transaction.json
+++ b/qapi/transaction.json
@@ -53,6 +53,7 @@
 # @blockdev-snapshot-internal-sync: Since 1.7
 # @blockdev-snapshot-sync: since 1.1
 # @drive-backup: Since 1.6
+# @blockdev-del: since 7.1
 #
 # Features:
 # @deprecated: Member @drive-backup is deprecated.  Use member
@@ -66,6 +67,7 @@
             'block-dirty-bitmap-disable', 'block-dirty-bitmap-merge',
             'blockdev-backup', 'blockdev-snapshot',
             'blockdev-snapshot-internal-sync', 'blockdev-snapshot-sync',
+            'blockdev-del',
             { 'name': 'drive-backup', 'features': [ 'deprecated' ] } ] }
 
 ##
@@ -140,6 +142,15 @@
 { 'struct': 'DriveBackupWrapper',
   'data': { 'data': 'DriveBackup' } }
 
+##
+# @BlockdevDelWrapper:
+#
+# Since: 7.1
+##
+{ 'struct': 'BlockdevDelWrapper',
+  'data': { 'data': 'BlockdevDel' } }
+
+
 ##
 # @TransactionAction:
 #
@@ -163,6 +174,7 @@
        'blockdev-snapshot': 'BlockdevSnapshotWrapper',
        'blockdev-snapshot-internal-sync': 'BlockdevSnapshotInternalWrapper',
        'blockdev-snapshot-sync': 'BlockdevSnapshotSyncWrapper',
+       'blockdev-del': 'BlockdevDelWrapper',
        'drive-backup': 'DriveBackupWrapper'
    } }
 
-- 
2.35.1



* [PATCH v5 29/45] block: introduce BDRV_O_NOPERM flag
  2022-03-30 21:28 [PATCH v5 00/45] Transactional block-graph modifying API Vladimir Sementsov-Ogievskiy
                   ` (27 preceding siblings ...)
  2022-03-30 21:28 ` [PATCH v5 28/45] qapi: block: add blockdev-del transaction action Vladimir Sementsov-Ogievskiy
@ 2022-03-30 21:28 ` Vladimir Sementsov-Ogievskiy
  2022-06-13  9:54   ` Hanna Reitz
  2022-03-30 21:28 ` [PATCH v5 30/45] block: bdrv_insert_node(): use BDRV_O_NOPERM Vladimir Sementsov-Ogievskiy
                   ` (15 subsequent siblings)
  44 siblings, 1 reply; 78+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2022-03-30 21:28 UTC (permalink / raw)
  To: qemu-block
  Cc: kwolf, v.sementsov-og, qemu-devel, Markus Armbruster, hreitz, vsementsov

Currently the copy-before-write filter has a weak permission model: when
it has no parents, it shares the write permission on the source. Otherwise
we simply couldn't blockdev-add it when an existing user of the source
holds the write permission.

The situation is bad: it means that the copy-before-write filter doesn't
guarantee that all writes go through it. It would be much better to always
unshare write. But how do we insert the filter in that case?

The solution is to do blockdev-add and blockdev-replace in one
transaction and, moreover, to update permissions only after both commands.

For now, let's make it possible to skip the permission update on the file
child of the copy-before-write filter at open time.

New interfaces are:

- bds_tree_init() with a flags argument, so that the caller may pass
  additional flags, for example the new BDRV_O_NOPERM.

- bdrv_open_file_child_common() with a boolean refresh_perms argument.
  Drivers may call this function with refresh_perms = false if they want
  to honor BDRV_O_NOPERM. No driver does so yet.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
---
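To illustrate why skipping the permission refresh at open time helps, here
is a minimal self-contained toy model. It is not QEMU code: Edge, attach(),
perms_ok() and insert_filter() are hypothetical names invented for this
sketch, and the permission check is far simpler than the real one. It only
shows that a write-unsharing filter cannot be attached with an immediate
check while another writer still uses the source, but can be attached if
the check is postponed until after the writer has been moved above the
filter.

```c
#include <assert.h>
#include <stdbool.h>

#define PERM_WRITE 0x1u
#define MAX_EDGES 8

enum { GUEST, SOURCE, FILTER };   /* toy node ids, purely illustrative */

/* One parent->child edge of the toy graph. */
typedef struct Edge {
    int parent, child;
    unsigned perm;     /* what the parent needs on the child */
    unsigned shared;   /* what the parent tolerates from other parents */
} Edge;

static Edge edges[MAX_EDGES];
static int nb_edges;

/* True if no parent needs a permission that a sibling parent does not share. */
static bool perms_ok(void)
{
    for (int i = 0; i < nb_edges; i++) {
        for (int j = 0; j < nb_edges; j++) {
            if (i != j && edges[i].child == edges[j].child &&
                (edges[i].perm & ~edges[j].shared)) {
                return false;
            }
        }
    }
    return true;
}

/* Attach an edge; validate immediately only if refresh_perms is set. */
static bool attach(int parent, int child, unsigned perm, unsigned shared,
                   bool refresh_perms)
{
    assert(nb_edges < MAX_EDGES);
    edges[nb_edges++] = (Edge){ parent, child, perm, shared };
    if (refresh_perms && !perms_ok()) {
        nb_edges--;            /* undo the attach on conflict */
        return false;
    }
    return true;
}

/* Insert a write-unsharing filter above SOURCE while GUEST writes to it. */
bool insert_filter(bool defer_perms)
{
    nb_edges = 0;
    attach(GUEST, SOURCE, PERM_WRITE, PERM_WRITE, true);

    if (!attach(FILTER, SOURCE, PERM_WRITE, 0, !defer_perms)) {
        return false;          /* eager check: GUEST still writes to SOURCE */
    }
    edges[0].child = FILTER;   /* "replace": GUEST now writes through FILTER */
    return perms_ok();         /* single refresh after both steps */
}
```

In this toy, insert_filter(false) fails at the attach step, while
insert_filter(true) succeeds because the only check happens once the graph
has reached its final shape.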
 block.c                                | 82 +++++++++++++++++++-------
 block/monitor/block-hmp-cmds.c         |  2 +-
 blockdev.c                             | 13 ++--
 include/block/block-common.h           |  3 +
 include/block/block-global-state.h     | 11 ++++
 include/block/block_int-global-state.h |  3 +-
 6 files changed, 83 insertions(+), 31 deletions(-)

diff --git a/block.c b/block.c
index a7020d3cd8..ca0b629bec 100644
--- a/block.c
+++ b/block.c
@@ -3166,12 +3166,13 @@ out:
  * If @parent_bs and @child_bs are in different AioContexts, the caller must
  * hold the AioContext lock for @child_bs, but not for @parent_bs.
  */
-BdrvChild *bdrv_attach_child(BlockDriverState *parent_bs,
-                             BlockDriverState *child_bs,
-                             const char *child_name,
-                             const BdrvChildClass *child_class,
-                             BdrvChildRole child_role,
-                             Error **errp)
+static BdrvChild *bdrv_do_attach_child(BlockDriverState *parent_bs,
+                                       BlockDriverState *child_bs,
+                                       const char *child_name,
+                                       const BdrvChildClass *child_class,
+                                       BdrvChildRole child_role,
+                                       bool refresh_perms,
+                                       Error **errp)
 {
     int ret;
     BdrvChild *child;
@@ -3185,9 +3186,11 @@ BdrvChild *bdrv_attach_child(BlockDriverState *parent_bs,
         goto out;
     }
 
-    ret = bdrv_refresh_perms(parent_bs, tran, errp);
-    if (ret < 0) {
-        goto out;
+    if (refresh_perms) {
+        ret = bdrv_refresh_perms(parent_bs, tran, errp);
+        if (ret < 0) {
+            goto out;
+        }
     }
 
 out:
@@ -3198,6 +3201,17 @@ out:
     return ret < 0 ? NULL : child;
 }
 
+BdrvChild *bdrv_attach_child(BlockDriverState *parent_bs,
+                             BlockDriverState *child_bs,
+                             const char *child_name,
+                             const BdrvChildClass *child_class,
+                             BdrvChildRole child_role,
+                             Error **errp)
+{
+    return bdrv_do_attach_child(parent_bs, child_bs, child_name, child_class,
+                                child_role, true, errp);
+}
+
 /* Caller is responsible to refresh permissions in @refresh_list */
 static void bdrv_root_unref_child_tran(BdrvChild *child, GSList **refresh_list,
                                        Transaction *tran)
@@ -3668,12 +3682,13 @@ done:
  *
  * The BlockdevRef will be removed from the options QDict.
  */
-BdrvChild *bdrv_open_child(const char *filename,
-                           QDict *options, const char *bdref_key,
-                           BlockDriverState *parent,
-                           const BdrvChildClass *child_class,
-                           BdrvChildRole child_role,
-                           bool allow_none, Error **errp)
+BdrvChild *bdrv_open_child_common(const char *filename,
+                                  QDict *options, const char *bdref_key,
+                                  BlockDriverState *parent,
+                                  const BdrvChildClass *child_class,
+                                  BdrvChildRole child_role,
+                                  bool allow_none, bool refresh_perms,
+                                  Error **errp)
 {
     BlockDriverState *bs;
 
@@ -3685,16 +3700,29 @@ BdrvChild *bdrv_open_child(const char *filename,
         return NULL;
     }
 
-    return bdrv_attach_child(parent, bs, bdref_key, child_class, child_role,
-                             errp);
+    return bdrv_do_attach_child(parent, bs, bdref_key, child_class, child_role,
+                                refresh_perms, errp);
+}
+
+BdrvChild *bdrv_open_child(const char *filename,
+                           QDict *options, const char *bdref_key,
+                           BlockDriverState *parent,
+                           const BdrvChildClass *child_class,
+                           BdrvChildRole child_role,
+                           bool allow_none, Error **errp)
+{
+    return bdrv_open_child_common(filename, options, bdref_key, parent,
+                                  child_class, child_role, allow_none, true,
+                                  errp);
 }
 
 /*
  * Wrapper on bdrv_open_child() for most popular case: open primary child of bs.
  */
-int bdrv_open_file_child(const char *filename,
-                         QDict *options, const char *bdref_key,
-                         BlockDriverState *parent, Error **errp)
+int bdrv_open_file_child_common(const char *filename,
+                                QDict *options, const char *bdref_key,
+                                BlockDriverState *parent, bool refresh_perms,
+                                Error **errp)
 {
     BdrvChildRole role;
 
@@ -3703,8 +3731,9 @@ int bdrv_open_file_child(const char *filename,
     role = parent->drv->is_filter ?
         (BDRV_CHILD_FILTERED | BDRV_CHILD_PRIMARY) : BDRV_CHILD_IMAGE;
 
-    if (!bdrv_open_child(filename, options, bdref_key, parent,
-                         &child_of_bds, role, false, errp))
+    if (!bdrv_open_child_common(filename, options, bdref_key, parent,
+                                &child_of_bds, role, false, refresh_perms,
+                                errp))
     {
         return -EINVAL;
     }
@@ -3712,6 +3741,15 @@ int bdrv_open_file_child(const char *filename,
     return 0;
 }
 
+int bdrv_open_file_child(const char *filename,
+                         QDict *options, const char *bdref_key,
+                         BlockDriverState *parent,
+                         Error **errp)
+{
+    return bdrv_open_file_child_common(filename, options, bdref_key, parent,
+                                       true, errp);
+}
+
 /*
  * TODO Future callers may need to specify parent/child_class in order for
  * option inheritance to work. Existing callers use it for the root node.
diff --git a/block/monitor/block-hmp-cmds.c b/block/monitor/block-hmp-cmds.c
index bfb3c043a0..9145ccfc46 100644
--- a/block/monitor/block-hmp-cmds.c
+++ b/block/monitor/block-hmp-cmds.c
@@ -76,7 +76,7 @@ static void hmp_drive_add_node(Monitor *mon, const char *optstr)
         goto out;
     }
 
-    BlockDriverState *bs = bds_tree_init(qdict, &local_err);
+    BlockDriverState *bs = bds_tree_init(qdict, 0, &local_err);
     if (!bs) {
         error_report_err(local_err);
         goto out;
diff --git a/blockdev.c b/blockdev.c
index 1cd95f4f02..16a9b98afc 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -643,12 +643,11 @@ err_no_opts:
 }
 
 /* Takes the ownership of bs_opts */
-BlockDriverState *bds_tree_init(QDict *bs_opts, Error **errp)
+BlockDriverState *bds_tree_init(QDict *bs_opts, BdrvRequestFlags flags,
+                                Error **errp)
 {
-    int bdrv_flags = 0;
-
     GLOBAL_STATE_CODE();
-    /* bdrv_open() defaults to the values in bdrv_flags (for compatibility
+    /* bdrv_open() defaults to the values in flags (for compatibility
      * with other callers) rather than what we want as the real defaults.
      * Apply the defaults here instead. */
     qdict_set_default_str(bs_opts, BDRV_OPT_CACHE_DIRECT, "off");
@@ -656,10 +655,10 @@ BlockDriverState *bds_tree_init(QDict *bs_opts, Error **errp)
     qdict_set_default_str(bs_opts, BDRV_OPT_READ_ONLY, "off");
 
     if (runstate_check(RUN_STATE_INMIGRATE)) {
-        bdrv_flags |= BDRV_O_INACTIVE;
+        flags |= BDRV_O_INACTIVE;
     }
 
-    return bdrv_open(NULL, NULL, bs_opts, bdrv_flags, errp);
+    return bdrv_open(NULL, NULL, bs_opts, flags, errp);
 }
 
 void blockdev_close_all_bdrv_states(void)
@@ -3473,7 +3472,7 @@ void qmp_blockdev_add(BlockdevOptions *options, Error **errp)
         goto fail;
     }
 
-    bs = bds_tree_init(qdict, errp);
+    bs = bds_tree_init(qdict, 0, errp);
     if (!bs) {
         goto fail;
     }
diff --git a/include/block/block-common.h b/include/block/block-common.h
index 2f247dd607..face2d62d0 100644
--- a/include/block/block-common.h
+++ b/include/block/block-common.h
@@ -145,6 +145,9 @@ typedef enum {
                                       read-write fails */
 #define BDRV_O_IO_URING    0x40000 /* use io_uring instead of the thread pool */
 
+#define BDRV_O_NOPERM      0x80000 /* Don't update permissions if possible,
+                                      open() caller will do that. */
+
 #define BDRV_O_CACHE_MASK  (BDRV_O_NOCACHE | BDRV_O_NO_FLUSH)
 
 
diff --git a/include/block/block-global-state.h b/include/block/block-global-state.h
index f3ec72810e..8527bcad28 100644
--- a/include/block/block-global-state.h
+++ b/include/block/block-global-state.h
@@ -76,6 +76,17 @@ BdrvChild *bdrv_open_child(const char *filename,
                            const BdrvChildClass *child_class,
                            BdrvChildRole child_role,
                            bool allow_none, Error **errp);
+BdrvChild *bdrv_open_child_common(const char *filename,
+                                  QDict *options, const char *bdref_key,
+                                  BlockDriverState *parent,
+                                  const BdrvChildClass *child_class,
+                                  BdrvChildRole child_role,
+                                  bool allow_none, bool refresh_perms,
+                                  Error **errp);
+int bdrv_open_file_child_common(const char *filename,
+                                QDict *options, const char *bdref_key,
+                                BlockDriverState *parent, bool refresh_perms,
+                                Error **errp);
 int bdrv_open_file_child(const char *filename,
                          QDict *options, const char *bdref_key,
                          BlockDriverState *parent, Error **errp);
diff --git a/include/block/block_int-global-state.h b/include/block/block_int-global-state.h
index 0f21b0570b..aed62a45d9 100644
--- a/include/block/block_int-global-state.h
+++ b/include/block/block_int-global-state.h
@@ -245,7 +245,8 @@ void bdrv_set_monitor_owned(BlockDriverState *bs);
 
 void blockdev_close_all_bdrv_states(void);
 
-BlockDriverState *bds_tree_init(QDict *bs_opts, Error **errp);
+BlockDriverState *bds_tree_init(QDict *bs_opts, BdrvRequestFlags flags,
+                                Error **errp);
 
 /**
  * Simple implementation of bdrv_co_create_opts for protocol drivers
-- 
2.35.1



* [PATCH v5 30/45] block: bdrv_insert_node(): use BDRV_O_NOPERM
  2022-03-30 21:28 [PATCH v5 00/45] Transactional block-graph modifying API Vladimir Sementsov-Ogievskiy
                   ` (28 preceding siblings ...)
  2022-03-30 21:28 ` [PATCH v5 29/45] block: introduce BDRV_O_NOPERM flag Vladimir Sementsov-Ogievskiy
@ 2022-03-30 21:28 ` Vladimir Sementsov-Ogievskiy
  2022-03-30 21:28 ` [PATCH v5 31/45] qapi: block: add blockdev-add transaction action Vladimir Sementsov-Ogievskiy
                   ` (14 subsequent siblings)
  44 siblings, 0 replies; 78+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2022-03-30 21:28 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, hreitz, vsementsov, v.sementsov-og, qemu-devel

The subsequent bdrv_replace_node() will refresh permissions anyway, so we
can avoid intermediate permission conflicts.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
---
 block.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/block.c b/block.c
index ca0b629bec..17c057a962 100644
--- a/block.c
+++ b/block.c
@@ -5420,8 +5420,8 @@ BlockDriverState *bdrv_insert_node(BlockDriverState *bs, QDict *options,
 
     GLOBAL_STATE_CODE();
 
-    new_node_bs = bdrv_new_open_driver_opts(drv, node_name, options, flags,
-                                            errp);
+    new_node_bs = bdrv_new_open_driver_opts(drv, node_name, options,
+                                            flags | BDRV_O_NOPERM, errp);
     options = NULL; /* bdrv_new_open_driver() eats options */
     if (!new_node_bs) {
         error_prepend(errp, "Could not create node: ");
-- 
2.35.1



* [PATCH v5 31/45] qapi: block: add blockdev-add transaction action
  2022-03-30 21:28 [PATCH v5 00/45] Transactional block-graph modifying API Vladimir Sementsov-Ogievskiy
                   ` (29 preceding siblings ...)
  2022-03-30 21:28 ` [PATCH v5 30/45] block: bdrv_insert_node(): use BDRV_O_NOPERM Vladimir Sementsov-Ogievskiy
@ 2022-03-30 21:28 ` Vladimir Sementsov-Ogievskiy
  2022-03-30 21:28 ` [PATCH v5 32/45] iotests: add blockdev-add-transaction Vladimir Sementsov-Ogievskiy
                   ` (13 subsequent siblings)
  44 siblings, 0 replies; 78+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2022-03-30 21:28 UTC (permalink / raw)
  To: qemu-block
  Cc: kwolf, v.sementsov-og, qemu-devel, Markus Armbruster, hreitz,
	vsementsov, Eric Blake

Use the new flag to avoid permission updates where possible during
blockdev_add, so that a series of add/del (and soon, the new 'replace')
commands can be executed before the actual permission update, avoiding
intermediate permission conflicts.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
---
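The batching rule that the qmp_transaction() loop follows after this patch
can be sketched with a small counting model. This is a simplified
illustration, not the QEMU implementation: ToyActionKind, defers_refresh()
and count_refreshes() are hypothetical names, and the model assumes every
add/del action queues exactly one node. Consecutive add/del actions only
queue nodes; the queue is flushed before any other kind of action and once
more at the end.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum {
    ACT_BLOCKDEV_ADD,
    ACT_BLOCKDEV_DEL,
    ACT_OTHER          /* any action that needs settled permissions */
} ToyActionKind;

/* Actions that only queue nodes for a later permission refresh. */
static bool defers_refresh(ToyActionKind kind)
{
    return kind == ACT_BLOCKDEV_ADD || kind == ACT_BLOCKDEV_DEL;
}

/* Count how many permission refreshes a sequence of actions triggers. */
int count_refreshes(const ToyActionKind *acts, size_t n)
{
    int refreshes = 0;
    size_t pending = 0;   /* nodes waiting in the toy refresh list */

    assert(acts != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (pending > 0 && !defers_refresh(acts[i])) {
            refreshes++;  /* flush before an action of another kind */
            pending = 0;
        }
        if (defers_refresh(acts[i])) {
            pending++;
        }
    }

    /*
     * Final flush. The patched loop calls the refresh unconditionally at
     * the end; this toy counts it only when something is actually queued.
     */
    if (pending > 0) {
        refreshes++;
    }
    return refreshes;
}
```

With this model, a run of add, del, add costs a single refresh, whereas
add, other, del costs two, because the action in the middle forces the
pending list to be flushed first.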
 blockdev.c            | 54 ++++++++++++++++++++++++++++++++++++++++---
 qapi/transaction.json | 10 +++++++-
 2 files changed, 60 insertions(+), 4 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 16a9b98afc..3afd2ceea8 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -65,6 +65,8 @@
 
 static int blockdev_del(const char *node_name, GSList **detached,
                         Transaction *tran, Error **errp);
+static int blockdev_add(BlockdevOptions *options, GSList **refresh_list,
+                        Transaction *tran, Error **errp);
 
 /* Protected by BQL */
 QTAILQ_HEAD(, BlockDriverState) monitor_bdrv_states =
@@ -2216,6 +2218,10 @@ static void transaction_action(TransactionAction *act, JobTxn *block_job_txn,
         blockdev_del(act->u.blockdev_del.data->node_name,
                      refresh_list, tran, errp);
         return;
+    case TRANSACTION_ACTION_KIND_BLOCKDEV_ADD:
+        blockdev_add(act->u.blockdev_add.data,
+                     refresh_list, tran, errp);
+        return;
     /*
      * Where are transactions for MIRROR, COMMIT and STREAM?
      * Although these blockjobs use transaction callbacks like the backup job,
@@ -2283,7 +2289,8 @@ void qmp_transaction(TransactionActionList *actions,
         TransactionActionKind type = act->value->type;
 
         if (refresh_list &&
-            type != TRANSACTION_ACTION_KIND_BLOCKDEV_DEL)
+            type != TRANSACTION_ACTION_KIND_BLOCKDEV_DEL &&
+            type != TRANSACTION_ACTION_KIND_BLOCKDEV_ADD)
         {
             ret = bdrv_list_refresh_perms(refresh_list, NULL, tran, errp);
             if (ret < 0) {
@@ -3454,7 +3461,21 @@ out:
     aio_context_release(aio_context);
 }
 
-void qmp_blockdev_add(BlockdevOptions *options, Error **errp)
+static void blockdev_add_abort(void *opaque)
+{
+    BlockDriverState *bs = opaque;
+
+    QTAILQ_REMOVE(&monitor_bdrv_states, bs, monitor_list);
+    bdrv_unref(bs);
+}
+
+TransactionActionDrv blockdev_add_drv = {
+    .abort = blockdev_add_abort,
+};
+
+/* Caller is responsible to refresh permissions */
+static int blockdev_add(BlockdevOptions *options, GSList **refresh_list,
+                        Transaction *tran, Error **errp)
 {
     BlockDriverState *bs;
     QObject *obj;
@@ -3472,15 +3493,42 @@ void qmp_blockdev_add(BlockdevOptions *options, Error **errp)
         goto fail;
     }
 
-    bs = bds_tree_init(qdict, 0, errp);
+    bs = bds_tree_init(qdict, BDRV_O_NOPERM, errp);
     if (!bs) {
         goto fail;
     }
 
+    *refresh_list = g_slist_prepend(*refresh_list, bs);
+    tran_add(tran, &blockdev_add_drv, bs);
+
     bdrv_set_monitor_owned(bs);
 
+    visit_free(v);
+    return 0;
+
 fail:
     visit_free(v);
+    return -EINVAL;
+}
+
+void qmp_blockdev_add(BlockdevOptions *options, Error **errp)
+{
+    int ret;
+    Transaction *tran = tran_new();
+    g_autoptr(GSList) refresh_list = NULL;
+
+    ret = blockdev_add(options, &refresh_list, tran, errp);
+    if (ret < 0) {
+        goto out;
+    }
+
+    ret = bdrv_list_refresh_perms(refresh_list, NULL, tran, errp);
+    if (ret < 0) {
+        goto out;
+    }
+
+out:
+    tran_finalize(tran, ret);
 }
 
 void qmp_blockdev_reopen(BlockdevOptionsList *reopen_list, Error **errp)
diff --git a/qapi/transaction.json b/qapi/transaction.json
index ea20df770c..000dd16bb7 100644
--- a/qapi/transaction.json
+++ b/qapi/transaction.json
@@ -67,7 +67,7 @@
             'block-dirty-bitmap-disable', 'block-dirty-bitmap-merge',
             'blockdev-backup', 'blockdev-snapshot',
             'blockdev-snapshot-internal-sync', 'blockdev-snapshot-sync',
-            'blockdev-del',
+            'blockdev-del', 'blockdev-add',
             { 'name': 'drive-backup', 'features': [ 'deprecated' ] } ] }
 
 ##
@@ -150,6 +150,13 @@
 { 'struct': 'BlockdevDelWrapper',
   'data': { 'data': 'BlockdevDel' } }
 
+##
+# @BlockdevAddWrapper:
+#
+# Since: 7.1
+##
+{ 'struct': 'BlockdevAddWrapper',
+  'data': { 'data': 'BlockdevOptions' } }
 
 ##
 # @TransactionAction:
@@ -175,6 +182,7 @@
        'blockdev-snapshot-internal-sync': 'BlockdevSnapshotInternalWrapper',
        'blockdev-snapshot-sync': 'BlockdevSnapshotSyncWrapper',
        'blockdev-del': 'BlockdevDelWrapper',
+       'blockdev-add': 'BlockdevAddWrapper',
        'drive-backup': 'DriveBackupWrapper'
    } }
 
-- 
2.35.1



* [PATCH v5 32/45] iotests: add blockdev-add-transaction
  2022-03-30 21:28 [PATCH v5 00/45] Transactional block-graph modifying API Vladimir Sementsov-Ogievskiy
                   ` (30 preceding siblings ...)
  2022-03-30 21:28 ` [PATCH v5 31/45] qapi: block: add blockdev-add transaction action Vladimir Sementsov-Ogievskiy
@ 2022-03-30 21:28 ` Vladimir Sementsov-Ogievskiy
  2022-03-30 21:28 ` [PATCH v5 33/45] block-backend: blk_root(): drop const specifier on return type Vladimir Sementsov-Ogievskiy
                   ` (12 subsequent siblings)
  44 siblings, 0 replies; 78+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2022-03-30 21:28 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, hreitz, vsementsov, v.sementsov-og, qemu-devel

Add a test for transaction support of blockdev-add.

The test is format-agnostic, so limit it to qcow2 to avoid extra test runs.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
---
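The all-or-nothing behaviour that the test checks can be modelled with a
small undo stack. This is a sketch under simplifying assumptions, not the
QEMU transaction API: ToyTran, toy_tran_add(), toy_tran_finalize(),
toy_blockdev_add() and toy_transaction() are hypothetical names. Each
successful add registers an abort callback; if a later add fails, the
callbacks run in reverse order and no node remains.

```c
#include <assert.h>
#include <string.h>

#define MAX_NODES 8
#define MAX_ACTIONS 8

/* Toy node registry standing in for the list of monitor-owned nodes. */
static const char *nodes[MAX_NODES];
static int nb_nodes;

/* Toy transaction: a stack of undo callbacks, run in reverse on failure. */
typedef struct ToyTran {
    void (*abort_fn[MAX_ACTIONS])(void *opaque);
    void *opaque[MAX_ACTIONS];
    int nb;
} ToyTran;

static void toy_tran_add(ToyTran *t, void (*abort_fn)(void *), void *opaque)
{
    assert(t->nb < MAX_ACTIONS);
    t->abort_fn[t->nb] = abort_fn;
    t->opaque[t->nb] = opaque;
    t->nb++;
}

/* Roll back if ret < 0, otherwise keep everything; then empty the stack. */
static void toy_tran_finalize(ToyTran *t, int ret)
{
    if (ret < 0) {
        while (t->nb > 0) {
            t->nb--;
            t->abort_fn[t->nb](t->opaque[t->nb]);
        }
    }
    t->nb = 0;
}

/* Undo of a successful add: drop the node with the given name. */
static void toy_add_abort(void *opaque)
{
    const char *name = opaque;

    for (int i = 0; i < nb_nodes; i++) {
        if (strcmp(nodes[i], name) == 0) {
            nodes[i] = nodes[--nb_nodes];
            return;
        }
    }
}

/* Add a node; fail on a duplicate name, register the undo on success. */
static int toy_blockdev_add(const char *name, ToyTran *t)
{
    for (int i = 0; i < nb_nodes; i++) {
        if (strcmp(nodes[i], name) == 0) {
            return -1;
        }
    }
    assert(nb_nodes < MAX_NODES);
    nodes[nb_nodes++] = name;
    toy_tran_add(t, toy_add_abort, (void *)name);
    return 0;
}

/* Run all adds as one unit; return how many nodes exist afterwards. */
int toy_transaction(const char *const *names, int n)
{
    ToyTran t = { .nb = 0 };
    int ret = 0;

    for (int i = 0; i < n && ret == 0; i++) {
        ret = toy_blockdev_add(names[i], &t);
    }
    toy_tran_finalize(&t, ret);
    return nb_nodes;
}
```

Starting from an empty registry, adding "node0" twice leaves zero nodes,
and adding "node0" and "node1" leaves two, which matches the expected
output recorded for the test.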
 .../tests/blockdev-add-transaction            | 52 +++++++++++++++++++
 .../tests/blockdev-add-transaction.out        |  6 +++
 2 files changed, 58 insertions(+)
 create mode 100755 tests/qemu-iotests/tests/blockdev-add-transaction
 create mode 100644 tests/qemu-iotests/tests/blockdev-add-transaction.out

diff --git a/tests/qemu-iotests/tests/blockdev-add-transaction b/tests/qemu-iotests/tests/blockdev-add-transaction
new file mode 100755
index 0000000000..ce3c1c069b
--- /dev/null
+++ b/tests/qemu-iotests/tests/blockdev-add-transaction
@@ -0,0 +1,52 @@
+#!/usr/bin/env python3
+#
+# Test blockdev-add transaction action
+#
+# Copyright (c) 2022 Virtuozzo International GmbH.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
+
+import iotests
+from iotests import log
+
+iotests.script_initialize(supported_fmts=['qcow2'])
+
+with iotests.VM() as vm:
+    vm.launch()
+
+    # Use same node-name for nodes, neither one should appear.
+    vm.qmp_log('transaction', actions=[
+        {'type': 'blockdev-add',
+         'data': {'node-name': 'node0', 'driver': 'null-co',
+                  'size': 1024 * 1024}},
+        {'type': 'blockdev-add',
+         'data': {'node-name': 'node0', 'driver': 'null-co',
+                  'size': 1024 * 1024}}
+    ])
+
+    n = len(vm.qmp('query-named-block-nodes')['return'])
+    log(f'Created {n} nodes')
+
+    vm.qmp_log('transaction', actions=[
+        {'type': 'blockdev-add',
+         'data': {'node-name': 'node0', 'driver': 'null-co',
+                  'size': 1024 * 1024}},
+        {'type': 'blockdev-add',
+         'data': {'node-name': 'node1', 'driver': 'null-co',
+                  'size': 1024 * 1024}}
+    ])
+
+    n = len(vm.qmp('query-named-block-nodes')['return'])
+    log(f'Created {n} nodes')
diff --git a/tests/qemu-iotests/tests/blockdev-add-transaction.out b/tests/qemu-iotests/tests/blockdev-add-transaction.out
new file mode 100644
index 0000000000..7e6cd5a9a3
--- /dev/null
+++ b/tests/qemu-iotests/tests/blockdev-add-transaction.out
@@ -0,0 +1,6 @@
+{"execute": "transaction", "arguments": {"actions": [{"data": {"driver": "null-co", "node-name": "node0", "size": 1048576}, "type": "blockdev-add"}, {"data": {"driver": "null-co", "node-name": "node0", "size": 1048576}, "type": "blockdev-add"}]}}
+{"error": {"class": "GenericError", "desc": "Duplicate nodes with node-name='node0'"}}
+Created 0 nodes
+{"execute": "transaction", "arguments": {"actions": [{"data": {"driver": "null-co", "node-name": "node0", "size": 1048576}, "type": "blockdev-add"}, {"data": {"driver": "null-co", "node-name": "node1", "size": 1048576}, "type": "blockdev-add"}]}}
+{"return": {}}
+Created 2 nodes
-- 
2.35.1



* [PATCH v5 33/45] block-backend: blk_root(): drop const specifier on return type
  2022-03-30 21:28 [PATCH v5 00/45] Transactional block-graph modifying API Vladimir Sementsov-Ogievskiy
                   ` (31 preceding siblings ...)
  2022-03-30 21:28 ` [PATCH v5 32/45] iotests: add blockdev-add-transaction Vladimir Sementsov-Ogievskiy
@ 2022-03-30 21:28 ` Vladimir Sementsov-Ogievskiy
  2022-03-30 21:28 ` [PATCH v5 34/45] block/export: add blk_by_export_id() Vladimir Sementsov-Ogievskiy
                   ` (11 subsequent siblings)
  44 siblings, 0 replies; 78+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2022-03-30 21:28 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, hreitz, vsementsov, v.sementsov-og, qemu-devel

We'll need a non-const child pointer for graph modifications in later
commits.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
---
 block/block-backend.c                       | 2 +-
 include/sysemu/block-backend-global-state.h | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/block/block-backend.c b/block/block-backend.c
index e0e1aff4b1..f5476bb9fc 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -2618,7 +2618,7 @@ int coroutine_fn blk_co_copy_range(BlockBackend *blk_in, int64_t off_in,
                               bytes, read_flags, write_flags);
 }
 
-const BdrvChild *blk_root(BlockBackend *blk)
+BdrvChild *blk_root(BlockBackend *blk)
 {
     GLOBAL_STATE_CODE();
     return blk->root;
diff --git a/include/sysemu/block-backend-global-state.h b/include/sysemu/block-backend-global-state.h
index 2e93a74679..0ee6dced99 100644
--- a/include/sysemu/block-backend-global-state.h
+++ b/include/sysemu/block-backend-global-state.h
@@ -109,7 +109,7 @@ void blk_set_force_allow_inactivate(BlockBackend *blk);
 void blk_register_buf(BlockBackend *blk, void *host, size_t size);
 void blk_unregister_buf(BlockBackend *blk, void *host);
 
-const BdrvChild *blk_root(BlockBackend *blk);
+BdrvChild *blk_root(BlockBackend *blk);
 
 int blk_make_empty(BlockBackend *blk, Error **errp);
 
-- 
2.35.1



* [PATCH v5 34/45] block/export: add blk_by_export_id()
  2022-03-30 21:28 [PATCH v5 00/45] Transactional block-graph modifying API Vladimir Sementsov-Ogievskiy
                   ` (32 preceding siblings ...)
  2022-03-30 21:28 ` [PATCH v5 33/45] block-backend: blk_root(): drop const specifier on return type Vladimir Sementsov-Ogievskiy
@ 2022-03-30 21:28 ` Vladimir Sementsov-Ogievskiy
  2022-03-30 21:28 ` [PATCH v5 35/45] block: make bdrv_find_child() function public Vladimir Sementsov-Ogievskiy
                   ` (10 subsequent siblings)
  44 siblings, 0 replies; 78+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2022-03-30 21:28 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, hreitz, vsementsov, v.sementsov-og, qemu-devel

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
---
 block/export/export.c                       | 18 ++++++++++++++++++
 include/sysemu/block-backend-global-state.h |  1 +
 2 files changed, 19 insertions(+)

diff --git a/block/export/export.c b/block/export/export.c
index 7253af3bc3..66e62f0074 100644
--- a/block/export/export.c
+++ b/block/export/export.c
@@ -362,3 +362,21 @@ BlockExportInfoList *qmp_query_block_exports(Error **errp)
 
     return head;
 }
+
+BlockBackend *blk_by_export_id(const char *id, Error **errp)
+{
+    BlockExport *exp;
+
+    exp = blk_exp_find(id);
+    if (exp == NULL) {
+        error_setg(errp, "Export '%s' not found", id);
+        return NULL;
+    }
+
+    if (!exp->blk) {
+        error_setg(errp, "Export '%s' is empty", id);
+        return NULL;
+    }
+
+    return exp->blk;
+}
diff --git a/include/sysemu/block-backend-global-state.h b/include/sysemu/block-backend-global-state.h
index 0ee6dced99..ea1a93d787 100644
--- a/include/sysemu/block-backend-global-state.h
+++ b/include/sysemu/block-backend-global-state.h
@@ -58,6 +58,7 @@ void blk_detach_dev(BlockBackend *blk, DeviceState *dev);
 DeviceState *blk_get_attached_dev(BlockBackend *blk);
 BlockBackend *blk_by_dev(void *dev);
 BlockBackend *blk_by_qdev_id(const char *id, Error **errp);
+BlockBackend *blk_by_export_id(const char *id, Error **errp);
 void blk_set_dev_ops(BlockBackend *blk, const BlockDevOps *ops, void *opaque);
 
 void blk_activate(BlockBackend *blk, Error **errp);
-- 
2.35.1



* [PATCH v5 35/45] block: make bdrv_find_child() function public
  2022-03-30 21:28 [PATCH v5 00/45] Transactional block-graph modifying API Vladimir Sementsov-Ogievskiy
                   ` (33 preceding siblings ...)
  2022-03-30 21:28 ` [PATCH v5 34/45] block/export: add blk_by_export_id() Vladimir Sementsov-Ogievskiy
@ 2022-03-30 21:28 ` Vladimir Sementsov-Ogievskiy
  2022-03-30 21:28 ` [PATCH v5 36/45] block: bdrv_replace_child_bs(): move to external transaction Vladimir Sementsov-Ogievskiy
                   ` (9 subsequent siblings)
  44 siblings, 0 replies; 78+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2022-03-30 21:28 UTC (permalink / raw)
  To: qemu-block
  Cc: kwolf, v.sementsov-og, qemu-devel, Markus Armbruster, hreitz, vsementsov

To be reused soon.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
---
 block.c                      | 13 +++++++++++++
 blockdev.c                   | 14 --------------
 include/block/block_int-io.h |  1 +
 3 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/block.c b/block.c
index 17c057a962..9e1be402e2 100644
--- a/block.c
+++ b/block.c
@@ -8039,6 +8039,19 @@ int bdrv_make_empty(BdrvChild *c, Error **errp)
     return 0;
 }
 
+BdrvChild *bdrv_find_child(BlockDriverState *parent_bs, const char *child_name)
+{
+    BdrvChild *child;
+
+    QLIST_FOREACH(child, &parent_bs->children, next) {
+        if (strcmp(child->name, child_name) == 0) {
+            return child;
+        }
+    }
+
+    return NULL;
+}
+
 /*
  * Return the child that @bs acts as an overlay for, and from which data may be
  * copied in COW or COR operations.  Usually this is the backing file.
diff --git a/blockdev.c b/blockdev.c
index 3afd2ceea8..abd0600d15 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -3672,20 +3672,6 @@ out:
     tran_finalize(tran, ret);
 }
 
-static BdrvChild *bdrv_find_child(BlockDriverState *parent_bs,
-                                  const char *child_name)
-{
-    BdrvChild *child;
-
-    QLIST_FOREACH(child, &parent_bs->children, next) {
-        if (strcmp(child->name, child_name) == 0) {
-            return child;
-        }
-    }
-
-    return NULL;
-}
-
 void qmp_x_blockdev_change(const char *parent, bool has_child,
                            const char *child, bool has_node,
                            const char *node, Error **errp)
diff --git a/include/block/block_int-io.h b/include/block/block_int-io.h
index bb454200e5..0ce5eaf9a2 100644
--- a/include/block/block_int-io.h
+++ b/include/block/block_int-io.h
@@ -122,6 +122,7 @@ int coroutine_fn bdrv_co_copy_range_to(BdrvChild *src, int64_t src_offset,
 
 int refresh_total_sectors(BlockDriverState *bs, int64_t hint);
 
+BdrvChild *bdrv_find_child(BlockDriverState *parent_bs, const char *child_name);
 BdrvChild *bdrv_cow_child(BlockDriverState *bs);
 BdrvChild *bdrv_filter_child(BlockDriverState *bs);
 BdrvChild *bdrv_filter_or_cow_child(BlockDriverState *bs);
-- 
2.35.1




* [PATCH v5 36/45] block: bdrv_replace_child_bs(): move to external transaction
  2022-03-30 21:28 [PATCH v5 00/45] Transactional block-graph modifying API Vladimir Sementsov-Ogievskiy
                   ` (34 preceding siblings ...)
  2022-03-30 21:28 ` [PATCH v5 35/45] block: make bdrv_find_child() function public Vladimir Sementsov-Ogievskiy
@ 2022-03-30 21:28 ` Vladimir Sementsov-Ogievskiy
  2022-03-30 21:28 ` [PATCH v5 37/45] qapi: add x-blockdev-replace command Vladimir Sementsov-Ogievskiy
                   ` (8 subsequent siblings)
  44 siblings, 0 replies; 78+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2022-03-30 21:28 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, hreitz, vsementsov, v.sementsov-og, qemu-devel

We'll need this functionality as part of an external transaction, so make
the whole function a transaction action. For this we need to introduce a
transaction action helper, bdrv_drained(), which calls
bdrv_drained_begin() and postpones bdrv_drained_end() to the .clean() phase.
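
For readers less familiar with the transaction helpers, here is a rough
Python model of the idea (a sketch, not QEMU code). The Transaction, Node
and drained() names are made up for illustration. The only behavior taken
from QEMU is that finalizing runs .commit or .abort for every action and
then .clean for every action, newest action first.

```python
class Transaction:
    """Toy model: each action has optional commit, abort and clean hooks."""

    def __init__(self):
        self.actions = []

    def add(self, commit=None, abort=None, clean=None):
        # Newest action goes first, so undo and cleanup run in reverse order.
        self.actions.insert(0, (commit, abort, clean))

    def finalize(self, ret):
        phase = 0 if ret >= 0 else 1  # 0: commit on success, 1: abort on failure
        for action in self.actions:
            if action[phase]:
                action[phase]()
        # The clean phase runs in both cases, after commit or abort.
        for action in self.actions:
            if action[2]:
                action[2]()
        self.actions.clear()


class Node:
    """Hypothetical stand-in for a block node."""

    def __init__(self, name):
        self.name = name
        self.refcnt = 1
        self.quiesce = 0


def drained(node, tran, log):
    """Begin a drained section now and end it in the clean phase."""
    node.refcnt += 1
    node.quiesce += 1
    log.append(f'drain-begin {node.name}')

    def clean():
        node.quiesce -= 1
        node.refcnt -= 1
        log.append(f'drain-end {node.name}')

    tran.add(clean=clean)


def demo(ret):
    log = []
    tran = Transaction()
    old, new = Node('old'), Node('new')
    drained(old, tran, log)
    drained(new, tran, log)
    log.append('replace child')
    tran.finalize(ret)
    return log, old, new
```

In this model both nodes stay drained and referenced until finalize(),
whether the caller passes success or failure, which is why the caller no
longer pairs begin/end by hand.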

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
---
 block.c                            | 42 +++++++++++++++++++-----------
 block/block-backend.c              |  9 ++++++-
 include/block/block-global-state.h |  2 +-
 3 files changed, 36 insertions(+), 17 deletions(-)

diff --git a/block.c b/block.c
index 9e1be402e2..4b5b7d8794 100644
--- a/block.c
+++ b/block.c
@@ -5341,32 +5341,44 @@ out:
     return ret;
 }
 
+static void bdrv_drained_clean(void *opaque)
+{
+    BlockDriverState *bs = opaque;
+
+    bdrv_drained_end(bs);
+    bdrv_unref(bs);
+}
+
+TransactionActionDrv bdrv_drained_drv = {
+    .clean = bdrv_drained_clean,
+};
+
+/*
+ * Start drained section on @bs, and finish it in .clean action.
+ * Reference to @bs is kept, so @bs can't be removed during transaction.
+ */
+static void bdrv_drained(BlockDriverState *bs, Transaction *tran)
+{
+    bdrv_ref(bs);
+    bdrv_drained_begin(bs);
+    tran_add(tran, &bdrv_drained_drv, bs);
+}
+
 /* Not for empty child */
 int bdrv_replace_child_bs(BdrvChild *child, BlockDriverState *new_bs,
-                          Error **errp)
+                          Transaction *tran, Error **errp)
 {
-    int ret;
-    Transaction *tran = tran_new();
     g_autoptr(GSList) refresh_list = NULL;
     BlockDriverState *old_bs = child->bs;
 
     GLOBAL_STATE_CODE();
 
-    bdrv_ref(old_bs);
-    bdrv_drained_begin(old_bs);
-    bdrv_drained_begin(new_bs);
+    bdrv_drained(old_bs, tran);
+    bdrv_drained(new_bs, tran);
 
     bdrv_replace_child_tran(child, new_bs, &refresh_list, tran);
 
-    ret = bdrv_list_refresh_perms(refresh_list, NULL, tran, errp);
-
-    tran_finalize(tran, ret);
-
-    bdrv_drained_end(old_bs);
-    bdrv_drained_end(new_bs);
-    bdrv_unref(old_bs);
-
-    return ret;
+    return bdrv_list_refresh_perms(refresh_list, NULL, tran, errp);
 }
 
 static void bdrv_delete(BlockDriverState *bs)
diff --git a/block/block-backend.c b/block/block-backend.c
index f5476bb9fc..fa1d810da2 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -929,8 +929,15 @@ int blk_insert_bs(BlockBackend *blk, BlockDriverState *bs, Error **errp)
  */
 int blk_replace_bs(BlockBackend *blk, BlockDriverState *new_bs, Error **errp)
 {
+    int ret;
+    Transaction *tran = tran_new();
+
     GLOBAL_STATE_CODE();
-    return bdrv_replace_child_bs(blk->root, new_bs, errp);
+
+    ret = bdrv_replace_child_bs(blk->root, new_bs, tran, errp);
+    tran_finalize(tran, ret);
+
+    return ret;
 }
 
 /*
diff --git a/include/block/block-global-state.h b/include/block/block-global-state.h
index 8527bcad28..fa5f698228 100644
--- a/include/block/block-global-state.h
+++ b/include/block/block-global-state.h
@@ -65,7 +65,7 @@ int bdrv_append(BlockDriverState *bs_new, BlockDriverState *bs_top,
 int bdrv_replace_node(BlockDriverState *from, BlockDriverState *to,
                       Error **errp);
 int bdrv_replace_child_bs(BdrvChild *child, BlockDriverState *new_bs,
-                          Error **errp);
+                          Transaction *tran, Error **errp);
 BlockDriverState *bdrv_insert_node(BlockDriverState *bs, QDict *node_options,
                                    int flags, Error **errp);
 int bdrv_drop_filter(BlockDriverState *bs, Error **errp);
-- 
2.35.1




* [PATCH v5 37/45] qapi: add x-blockdev-replace command
  2022-03-30 21:28 [PATCH v5 00/45] Transactional block-graph modifying API Vladimir Sementsov-Ogievskiy
                   ` (35 preceding siblings ...)
  2022-03-30 21:28 ` [PATCH v5 36/45] block: bdrv_replace_child_bs(): move to external transaction Vladimir Sementsov-Ogievskiy
@ 2022-03-30 21:28 ` Vladimir Sementsov-Ogievskiy
  2022-03-30 21:28 ` [PATCH v5 38/45] qapi: add x-blockdev-replace transaction action Vladimir Sementsov-Ogievskiy
                   ` (7 subsequent siblings)
  44 siblings, 0 replies; 78+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2022-03-30 21:28 UTC (permalink / raw)
  To: qemu-block
  Cc: kwolf, v.sementsov-og, qemu-devel, Markus Armbruster, hreitz,
	vsementsov, Paolo Bonzini, Eric Blake

Add a command that can replace the bs in the following BdrvChild structures:

 - qdev blk root child
 - block-export blk root child
 - any child BlockDriverState selected by child-name
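
To make the three cases concrete, here is a small Python sketch that
builds the command arguments for each parent type. The field names come
from the QAPI schema in this patch; the replace_args() helper and the
REQUIRED table are hypothetical, for illustration only.

```python
# Reference fields each parent type needs, per the BlockdevReplace union.
REQUIRED = {
    'qdev': ('qdev-id',),
    'export': ('export-id',),
    'driver': ('node-name', 'child'),
}


def replace_args(parent_type, new_child, **ref):
    """Build x-blockdev-replace arguments (hypothetical helper)."""
    if parent_type not in REQUIRED:
        raise ValueError(f'unknown parent type: {parent_type}')
    # Allow Python-style keyword names: node_name -> node-name.
    fields = {key.replace('_', '-'): value for key, value in ref.items()}
    if sorted(fields) != sorted(REQUIRED[parent_type]):
        raise ValueError(f'{parent_type} needs {REQUIRED[parent_type]}')
    return {'parent-type': parent_type, 'new-child': new_child, **fields}


by_qdev = replace_args('qdev', 'filter', qdev_id='sda')
by_export = replace_args('export', 'filter', export_id='exp0')
by_driver = replace_args('driver', 'filter', node_name='disk0', child='file')
```

The ids 'sda', 'exp0', 'disk0' and 'filter' are placeholders.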

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
---
 blockdev.c             | 65 ++++++++++++++++++++++++++++++++++++++++++
 qapi/block-core.json   | 62 ++++++++++++++++++++++++++++++++++++++++
 stubs/blk-by-qdev-id.c |  9 ++++++
 stubs/meson.build      |  1 +
 4 files changed, 137 insertions(+)
 create mode 100644 stubs/blk-by-qdev-id.c

diff --git a/blockdev.c b/blockdev.c
index abd0600d15..3d84cb6f92 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -2166,6 +2166,71 @@ static void abort_commit(void *opaque)
     g_assert_not_reached(); /* this action never succeeds */
 }
 
+static int blockdev_replace(BlockdevReplace *repl, Transaction *tran,
+                            Error **errp)
+{
+    BdrvChild *child = NULL;
+    BlockDriverState *new_child_bs;
+
+    if (repl->parent_type == BLOCK_PARENT_TYPE_DRIVER) {
+        BlockDriverState *parent_bs;
+
+        parent_bs = bdrv_find_node(repl->u.driver.node_name);
+        if (!parent_bs) {
+            error_setg(errp, "Block driver node with node-name '%s' not "
+                       "found", repl->u.driver.node_name);
+            return -EINVAL;
+        }
+
+        child = bdrv_find_child(parent_bs, repl->u.driver.child);
+        if (!child) {
+            error_setg(errp, "Block driver node '%s' doesn't have child "
+                       "named '%s'", repl->u.driver.node_name,
+                       repl->u.driver.child);
+            return -EINVAL;
+        }
+    } else {
+        /* Other types are similar, they work through blk */
+        BlockBackend *blk;
+        bool is_qdev = repl->parent_type == BLOCK_PARENT_TYPE_QDEV;
+        const char *id =
+            is_qdev ? repl->u.qdev.qdev_id : repl->u.export.export_id;
+
+        assert(is_qdev || repl->parent_type == BLOCK_PARENT_TYPE_EXPORT);
+
+        blk = is_qdev ? blk_by_qdev_id(id, errp) : blk_by_export_id(id, errp);
+        if (!blk) {
+            return -EINVAL;
+        }
+
+        child = blk_root(blk);
+        if (!child) {
+            error_setg(errp, "%s '%s' is empty, nothing to replace",
+                       is_qdev ? "Device" : "Export", id);
+            return -EINVAL;
+        }
+    }
+
+    assert(child);
+    assert(child->bs);
+
+    new_child_bs = bdrv_find_node(repl->new_child);
+    if (!new_child_bs) {
+        error_setg(errp, "Node '%s' not found", repl->new_child);
+        return -EINVAL;
+    }
+
+    return bdrv_replace_child_bs(child, new_child_bs, tran, errp);
+}
+
+void qmp_x_blockdev_replace(BlockdevReplace *repl, Error **errp)
+{
+    Transaction *tran = tran_new();
+    int ret = blockdev_replace(repl, tran, errp);
+
+    tran_finalize(tran, ret);
+}
+
 static void transaction_action(TransactionAction *act, JobTxn *block_job_txn,
                                GSList **refresh_list,
                                Transaction *tran, Error **errp)
diff --git a/qapi/block-core.json b/qapi/block-core.json
index d915cddde9..6e944e4f52 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -5592,3 +5592,65 @@
 { 'command': 'blockdev-snapshot-delete-internal-sync',
   'data': { 'device': 'str', '*id': 'str', '*name': 'str'},
   'returns': 'SnapshotInfo' }
+
+##
+# @BlockParentType:
+#
+# Since 7.0
+##
+{ 'enum': 'BlockParentType',
+  'data': ['qdev', 'driver', 'export'] }
+
+##
+# @BdrvChildRefQdev:
+#
+# Since 7.0
+##
+{ 'struct': 'BdrvChildRefQdev',
+  'data': { 'qdev-id': 'str' } }
+
+##
+# @BdrvChildRefExport:
+#
+# Since 7.0
+##
+{ 'struct': 'BdrvChildRefExport',
+  'data': { 'export-id': 'str' } }
+
+##
+# @BdrvChildRefDriver:
+#
+# Since 7.0
+##
+{ 'struct': 'BdrvChildRefDriver',
+  'data': { 'node-name': 'str', 'child': 'str' } }
+
+##
+# @BlockdevReplace:
+#
+# Since 7.0
+##
+{ 'union': 'BlockdevReplace',
+  'base': {
+      'parent-type': 'BlockParentType',
+      'new-child': 'str'
+  },
+  'discriminator': 'parent-type',
+  'data': {
+      'qdev': 'BdrvChildRefQdev',
+      'export': 'BdrvChildRefExport',
+      'driver': 'BdrvChildRefDriver'
+  } }
+
+##
+# @x-blockdev-replace:
+#
+# Replace a block-node associated with device (selected by
+# @qdev-id) or with block-export (selected by @export-id) or
+# any child of block-node (selected by @node-name and @child)
+# with @new-child block-node.
+#
+# Since 7.0
+##
+{ 'command': 'x-blockdev-replace', 'boxed': true,
+  'data': 'BlockdevReplace' }
diff --git a/stubs/blk-by-qdev-id.c b/stubs/blk-by-qdev-id.c
new file mode 100644
index 0000000000..0e751ce4f7
--- /dev/null
+++ b/stubs/blk-by-qdev-id.c
@@ -0,0 +1,9 @@
+#include "qemu/osdep.h"
+#include "qapi/error.h"
+#include "sysemu/block-backend.h"
+
+BlockBackend *blk_by_qdev_id(const char *id, Error **errp)
+{
+    error_setg(errp, "blk '%s' not found", id);
+    return NULL;
+}
diff --git a/stubs/meson.build b/stubs/meson.build
index 6f80fec761..9924810a23 100644
--- a/stubs/meson.build
+++ b/stubs/meson.build
@@ -1,6 +1,7 @@
 stub_ss.add(files('bdrv-next-monitor-owned.c'))
 stub_ss.add(files('blk-commit-all.c'))
 stub_ss.add(files('blk-exp-close-all.c'))
+stub_ss.add(files('blk-by-qdev-id.c'))
 stub_ss.add(files('blockdev-close-all-bdrv-states.c'))
 stub_ss.add(files('change-state-handler.c'))
 stub_ss.add(files('cmos.c'))
-- 
2.35.1




* [PATCH v5 38/45] qapi: add x-blockdev-replace transaction action
  2022-03-30 21:28 [PATCH v5 00/45] Transactional block-graph modifying API Vladimir Sementsov-Ogievskiy
                   ` (36 preceding siblings ...)
  2022-03-30 21:28 ` [PATCH v5 37/45] qapi: add x-blockdev-replace command Vladimir Sementsov-Ogievskiy
@ 2022-03-30 21:28 ` Vladimir Sementsov-Ogievskiy
  2022-03-30 21:28 ` [PATCH v5 39/45] block: bdrv_get_xdbg_block_graph(): report export ids Vladimir Sementsov-Ogievskiy
                   ` (6 subsequent siblings)
  44 siblings, 0 replies; 78+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2022-03-30 21:28 UTC (permalink / raw)
  To: qemu-block
  Cc: kwolf, v.sementsov-og, qemu-devel, Markus Armbruster, hreitz,
	vsementsov, Eric Blake

Support x-blockdev-replace as a transaction action.
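
As a sketch of how a client might use this, the Python below builds the
'actions' list for inserting a filter in one transaction: add the filter
node, then point the parent at it. The action layout ('type' plus 'data')
follows qapi/transaction.json; the insert_filter_actions() helper and the
node names are hypothetical, and this has not been run against QEMU.

```python
def insert_filter_actions(filter_node, driver, child_node, parent_ref):
    """Return transaction actions that insert a filter above child_node."""
    return [
        {
            'type': 'blockdev-add',
            'data': {
                'node-name': filter_node,
                'driver': driver,
                'file': child_node,
            },
        },
        {
            'type': 'x-blockdev-replace',
            # parent_ref carries parent-type and its reference fields.
            'data': {**parent_ref, 'new-child': filter_node},
        },
    ]


actions = insert_filter_actions(
    'filter', 'preallocate', 'disk0',
    {'parent-type': 'qdev', 'qdev-id': 'sda'})
```

Because both actions modify the graph, permissions are only refreshed
after the second one, so no intermediate state has to be valid.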

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
---
 block.c                            |  4 ++--
 blockdev.c                         | 29 ++++++++++++++++++++++++-----
 include/block/block-global-state.h |  2 ++
 qapi/transaction.json              | 15 ++++++++++++++-
 4 files changed, 42 insertions(+), 8 deletions(-)

diff --git a/block.c b/block.c
index 4b5b7d8794..efe0ec0f00 100644
--- a/block.c
+++ b/block.c
@@ -2381,8 +2381,8 @@ static TransactionActionDrv bdrv_replace_child_drv = {
 };
 
 /* Caller is responsible to refresh permissions in @refresh_list */
-static void bdrv_replace_child_tran(BdrvChild *child, BlockDriverState *new_bs,
-                                    GSList **refresh_list, Transaction *tran)
+void bdrv_replace_child_tran(BdrvChild *child, BlockDriverState *new_bs,
+                             GSList **refresh_list, Transaction *tran)
 {
     BdrvReplaceChildState *s = g_new(BdrvReplaceChildState, 1);
     *s = (BdrvReplaceChildState) {
diff --git a/blockdev.c b/blockdev.c
index 3d84cb6f92..89167fbc08 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -2166,8 +2166,9 @@ static void abort_commit(void *opaque)
     g_assert_not_reached(); /* this action never succeeds */
 }
 
-static int blockdev_replace(BlockdevReplace *repl, Transaction *tran,
-                            Error **errp)
+/* Caller is responsible to update permission of nodes added to @refresh_list */
+static int blockdev_replace(BlockdevReplace *repl, GSList **refresh_list,
+                            Transaction *tran, Error **errp)
 {
     BdrvChild *child = NULL;
     BlockDriverState *new_child_bs;
@@ -2220,14 +2221,27 @@ static int blockdev_replace(BlockdevReplace *repl, Transaction *tran,
         return -EINVAL;
     }
 
-    return bdrv_replace_child_bs(child, new_child_bs, tran, errp);
+    bdrv_replace_child_tran(child, new_child_bs, refresh_list, tran);
+    return 0;
 }
 
 void qmp_x_blockdev_replace(BlockdevReplace *repl, Error **errp)
 {
+    int ret;
     Transaction *tran = tran_new();
-    int ret = blockdev_replace(repl, tran, errp);
+    g_autoptr(GSList) update_list = NULL;
+
+    ret = blockdev_replace(repl, &update_list, tran, errp);
+    if (ret < 0) {
+        goto out;
+    }
 
+    ret = bdrv_list_refresh_perms(update_list, NULL, tran, errp);
+    if (ret < 0) {
+        goto out;
+    }
+
+out:
     tran_finalize(tran, ret);
 }
 
@@ -2287,6 +2301,10 @@ static void transaction_action(TransactionAction *act, JobTxn *block_job_txn,
         blockdev_add(act->u.blockdev_add.data,
                      refresh_list, tran, errp);
         return;
+    case TRANSACTION_ACTION_KIND_X_BLOCKDEV_REPLACE:
+        blockdev_replace(act->u.x_blockdev_replace.data,
+                         refresh_list, tran, errp);
+        return;
     /*
      * Where are transactions for MIRROR, COMMIT and STREAM?
      * Although these blockjobs use transaction callbacks like the backup job,
@@ -2355,7 +2373,8 @@ void qmp_transaction(TransactionActionList *actions,
 
         if (refresh_list &&
             type != TRANSACTION_ACTION_KIND_BLOCKDEV_DEL &&
-            type != TRANSACTION_ACTION_KIND_BLOCKDEV_ADD)
+            type != TRANSACTION_ACTION_KIND_BLOCKDEV_ADD &&
+            type != TRANSACTION_ACTION_KIND_X_BLOCKDEV_REPLACE)
         {
             ret = bdrv_list_refresh_perms(refresh_list, NULL, tran, errp);
             if (ret < 0) {
diff --git a/include/block/block-global-state.h b/include/block/block-global-state.h
index fa5f698228..253cc28a9a 100644
--- a/include/block/block-global-state.h
+++ b/include/block/block-global-state.h
@@ -66,6 +66,8 @@ int bdrv_replace_node(BlockDriverState *from, BlockDriverState *to,
                       Error **errp);
 int bdrv_replace_child_bs(BdrvChild *child, BlockDriverState *new_bs,
                           Transaction *tran, Error **errp);
+void bdrv_replace_child_tran(BdrvChild *child, BlockDriverState *new_bs,
+                             GSList **refresh_list, Transaction *tran);
 BlockDriverState *bdrv_insert_node(BlockDriverState *bs, QDict *node_options,
                                    int flags, Error **errp);
 int bdrv_drop_filter(BlockDriverState *bs, Error **errp);
diff --git a/qapi/transaction.json b/qapi/transaction.json
index 000dd16bb7..61cb2d2312 100644
--- a/qapi/transaction.json
+++ b/qapi/transaction.json
@@ -54,10 +54,13 @@
 # @blockdev-snapshot-sync: since 1.1
 # @drive-backup: Since 1.6
 # @blockdev-del: since 7.1
+# @blockdev-add: since 7.1
+# @x-blockdev-replace: since 7.1
 #
 # Features:
 # @deprecated: Member @drive-backup is deprecated.  Use member
 #              @blockdev-backup instead.
+# @unstable: Member @x-blockdev-replace is experimental
 #
 # Since: 1.1
 ##
@@ -68,6 +71,7 @@
             'blockdev-backup', 'blockdev-snapshot',
             'blockdev-snapshot-internal-sync', 'blockdev-snapshot-sync',
             'blockdev-del', 'blockdev-add',
+            { 'name': 'x-blockdev-replace', 'features': [ 'unstable' ] },
             { 'name': 'drive-backup', 'features': [ 'deprecated' ] } ] }
 
 ##
@@ -158,6 +162,14 @@
 { 'struct': 'BlockdevAddWrapper',
   'data': { 'data': 'BlockdevOptions' } }
 
+##
+# @BlockdevReplaceWrapper:
+#
+# Since: 7.1
+##
+{ 'struct': 'BlockdevReplaceWrapper',
+  'data': { 'data': 'BlockdevReplace' } }
+
 ##
 # @TransactionAction:
 #
@@ -183,7 +195,8 @@
        'blockdev-snapshot-sync': 'BlockdevSnapshotSyncWrapper',
        'blockdev-del': 'BlockdevDelWrapper',
        'blockdev-add': 'BlockdevAddWrapper',
-       'drive-backup': 'DriveBackupWrapper'
+       'drive-backup': 'DriveBackupWrapper',
+       'x-blockdev-replace': 'BlockdevReplaceWrapper'
    } }
 
 ##
-- 
2.35.1




* [PATCH v5 39/45] block: bdrv_get_xdbg_block_graph(): report export ids
  2022-03-30 21:28 [PATCH v5 00/45] Transactional block-graph modifying API Vladimir Sementsov-Ogievskiy
                   ` (37 preceding siblings ...)
  2022-03-30 21:28 ` [PATCH v5 38/45] qapi: add x-blockdev-replace transaction action Vladimir Sementsov-Ogievskiy
@ 2022-03-30 21:28 ` Vladimir Sementsov-Ogievskiy
  2022-03-30 21:28 ` [PATCH v5 40/45] iotests.py: qemu_img_create: use imgfmt by default Vladimir Sementsov-Ogievskiy
                   ` (5 subsequent siblings)
  44 siblings, 0 replies; 78+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2022-03-30 21:28 UTC (permalink / raw)
  To: qemu-block
  Cc: kwolf, v.sementsov-og, qemu-devel, hreitz, vsementsov, Paolo Bonzini

Currently we report empty blk names for block exports. That's not good.
Let's try to find the corresponding block export and report its id.
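
The resulting name lookup order can be sketched in Python as follows.
graph_parent_name() is a hypothetical helper that only models the hunk in
bdrv_get_xdbg_block_graph() below; None stands for "no export found".

```python
def graph_parent_name(blk_name, export_id, attached_dev_id):
    """Pick the name reported for a BlockBackend in the debug graph."""
    if blk_name:
        return blk_name          # a named blk keeps its own name
    if export_id:
        return export_id         # otherwise prefer the export id
    return attached_dev_id       # finally fall back to the device id
```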

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
---
 block.c                     |  4 ++++
 block/export/export.c       | 13 +++++++++++++
 include/block/export.h      |  1 +
 stubs/blk-exp-find-by-blk.c |  9 +++++++++
 stubs/meson.build           |  1 +
 5 files changed, 28 insertions(+)
 create mode 100644 stubs/blk-exp-find-by-blk.c

diff --git a/block.c b/block.c
index efe0ec0f00..40f54fe121 100644
--- a/block.c
+++ b/block.c
@@ -6147,7 +6147,11 @@ XDbgBlockGraph *bdrv_get_xdbg_block_graph(Error **errp)
     for (blk = blk_all_next(NULL); blk; blk = blk_all_next(blk)) {
         char *allocated_name = NULL;
         const char *name = blk_name(blk);
+        BlockExport *exp = blk_exp_find_by_blk(blk);
 
+        if (!*name && exp) {
+            name = exp->id;
+        }
         if (!*name) {
             name = allocated_name = blk_get_attached_dev_id(blk);
         }
diff --git a/block/export/export.c b/block/export/export.c
index 66e62f0074..a9f935f772 100644
--- a/block/export/export.c
+++ b/block/export/export.c
@@ -54,6 +54,19 @@ BlockExport *blk_exp_find(const char *id)
     return NULL;
 }
 
+BlockExport *blk_exp_find_by_blk(BlockBackend *blk)
+{
+    BlockExport *exp;
+
+    QLIST_FOREACH(exp, &block_exports, next) {
+        if (exp->blk == blk) {
+            return exp;
+        }
+    }
+
+    return NULL;
+}
+
 static const BlockExportDriver *blk_exp_find_driver(BlockExportType type)
 {
     int i;
diff --git a/include/block/export.h b/include/block/export.h
index 7feb02e10d..172c180819 100644
--- a/include/block/export.h
+++ b/include/block/export.h
@@ -80,6 +80,7 @@ struct BlockExport {
 
 BlockExport *blk_exp_add(BlockExportOptions *export, Error **errp);
 BlockExport *blk_exp_find(const char *id);
+BlockExport *blk_exp_find_by_blk(BlockBackend *blk);
 void blk_exp_ref(BlockExport *exp);
 void blk_exp_unref(BlockExport *exp);
 void blk_exp_request_shutdown(BlockExport *exp);
diff --git a/stubs/blk-exp-find-by-blk.c b/stubs/blk-exp-find-by-blk.c
new file mode 100644
index 0000000000..2fc1da953b
--- /dev/null
+++ b/stubs/blk-exp-find-by-blk.c
@@ -0,0 +1,9 @@
+#include "qemu/osdep.h"
+#include "sysemu/block-backend.h"
+#include "block/export.h"
+
+BlockExport *blk_exp_find_by_blk(BlockBackend *blk)
+{
+    return NULL;
+}
+
diff --git a/stubs/meson.build b/stubs/meson.build
index 9924810a23..af60dd9778 100644
--- a/stubs/meson.build
+++ b/stubs/meson.build
@@ -2,6 +2,7 @@ stub_ss.add(files('bdrv-next-monitor-owned.c'))
 stub_ss.add(files('blk-commit-all.c'))
 stub_ss.add(files('blk-exp-close-all.c'))
 stub_ss.add(files('blk-by-qdev-id.c'))
+stub_ss.add(files('blk-exp-find-by-blk.c'))
 stub_ss.add(files('blockdev-close-all-bdrv-states.c'))
 stub_ss.add(files('change-state-handler.c'))
 stub_ss.add(files('cmos.c'))
-- 
2.35.1




* [PATCH v5 40/45] iotests.py: qemu_img_create: use imgfmt by default
  2022-03-30 21:28 [PATCH v5 00/45] Transactional block-graph modifying API Vladimir Sementsov-Ogievskiy
                   ` (38 preceding siblings ...)
  2022-03-30 21:28 ` [PATCH v5 39/45] block: bdrv_get_xdbg_block_graph(): report export ids Vladimir Sementsov-Ogievskiy
@ 2022-03-30 21:28 ` Vladimir Sementsov-Ogievskiy
  2022-03-30 21:28 ` [PATCH v5 41/45] iotests.py: introduce VM.assert_edges_list() method Vladimir Sementsov-Ogievskiy
                   ` (4 subsequent siblings)
  44 siblings, 0 replies; 78+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2022-03-30 21:28 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, hreitz, vsementsov, v.sementsov-og, qemu-devel

Less typing: let's use imgfmt by default if the user specifies
neither -f nor --image-opts.
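
A standalone sketch of the defaulting rule, for illustration.
with_default_format() is a hypothetical name, and the 'qcow2' default is
an assumption here; in iotests.py the format comes from the test
environment.

```python
def with_default_format(args, imgfmt='qcow2'):
    """Prepend '-f <imgfmt>' unless the caller already chose a format."""
    if '-f' not in args and '--image-opts' not in args:
        return ('-f', imgfmt) + tuple(args)
    return tuple(args)


implicit = with_default_format(('disk.img', '1M'))
explicit = with_default_format(('-f', 'raw', 'disk.img', '1M'))
```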

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
---
 tests/qemu-iotests/iotests.py | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index fcec3e51e5..c7a38a95a4 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -266,6 +266,8 @@ def ordered_qmp(qmsg, conv_keys=True):
     return qmsg
 
 def qemu_img_create(*args: str) -> 'subprocess.CompletedProcess[str]':
+    if '-f' not in args and '--image-opts' not in args:
+        args = ['-f', imgfmt] + list(args)
     return qemu_img('create', *args)
 
 def qemu_img_json(*args: str) -> Any:
-- 
2.35.1




* [PATCH v5 41/45] iotests.py: introduce VM.assert_edges_list() method
  2022-03-30 21:28 [PATCH v5 00/45] Transactional block-graph modifying API Vladimir Sementsov-Ogievskiy
                   ` (39 preceding siblings ...)
  2022-03-30 21:28 ` [PATCH v5 40/45] iotests.py: qemu_img_create: use imgfmt by default Vladimir Sementsov-Ogievskiy
@ 2022-03-30 21:28 ` Vladimir Sementsov-Ogievskiy
  2022-03-30 21:28 ` [PATCH v5 42/45] iotests.py: add VM.qmp_check() helper Vladimir Sementsov-Ogievskiy
                   ` (3 subsequent siblings)
  44 siblings, 0 replies; 78+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2022-03-30 21:28 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, hreitz, vsementsov, v.sementsov-og, qemu-devel

Add an alternative method to check the block graph, to be used in a later
commit.
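
To show what the edges list looks like, here is the same conversion as a
standalone Python function applied to a made-up reply. The sample is a
trimmed, hypothetical x-debug-query-block-graph result that keeps only
the fields the conversion reads.

```python
def edges_list(graph):
    """Convert a block graph reply into (parent, child_name, child) tuples."""
    nodes = {n['id']: n['name'] for n in graph['nodes']}
    # Names must be unique, otherwise the tuples would be ambiguous.
    assert len(set(nodes.values())) == len(nodes)
    return [(nodes[e['parent']], e['name'], nodes[e['child']])
            for e in graph['edges']]


sample = {
    'nodes': [
        {'id': 1, 'name': 'disk0'},
        {'id': 2, 'name': 'filter'},
        {'id': 3, 'name': 'file0'},
    ],
    'edges': [
        {'parent': 2, 'child': 3, 'name': 'file'},
        {'parent': 1, 'child': 2, 'name': 'file'},
    ],
}

edges = sorted(edges_list(sample))
```

Sorting both sides, as assert_edges_list() does, makes the comparison
independent of the order in which edges are reported.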

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
---
 tests/qemu-iotests/iotests.py | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index c7a38a95a4..aaa77b5105 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -1084,6 +1084,23 @@ def check_bitmap_status(self, node_name, bitmap_name, fields):
 
         return fields.items() <= ret.items()
 
+    def get_block_graph(self):
+        """
+        Returns block graph in form of edges list, where each edge is a tuple:
+          (parent_node_name, child_name, child_node_name)
+        """
+        graph = self.qmp('x-debug-query-block-graph')['return']
+
+        nodes = {n['id']: n['name'] for n in graph['nodes']}
+        # Check that all names are unique:
+        assert len(set(nodes.values())) == len(nodes)
+
+        return [(nodes[e['parent']], e['name'], nodes[e['child']])
+                for e in graph['edges']]
+
+    def assert_edges_list(self, edges):
+        assert sorted(edges) == sorted(self.get_block_graph())
+
     def assert_block_path(self, root, path, expected_node, graph=None):
         """
         Check whether the node under the given path in the block graph
-- 
2.35.1




* [PATCH v5 42/45] iotests.py: add VM.qmp_check() helper
  2022-03-30 21:28 [PATCH v5 00/45] Transactional block-graph modifying API Vladimir Sementsov-Ogievskiy
                   ` (40 preceding siblings ...)
  2022-03-30 21:28 ` [PATCH v5 41/45] iotests.py: introduce VM.assert_edges_list() method Vladimir Sementsov-Ogievskiy
@ 2022-03-30 21:28 ` Vladimir Sementsov-Ogievskiy
  2022-03-30 21:29 ` [PATCH v5 43/45] iotests: add filter-insertion Vladimir Sementsov-Ogievskiy
                   ` (2 subsequent siblings)
  44 siblings, 0 replies; 78+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2022-03-30 21:28 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, hreitz, vsementsov, v.sementsov-og, qemu-devel

I'm tired of this pattern being everywhere. Let's add a helper.
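
A minimal sketch of the helper's behavior, using a fake VM so it runs
without QEMU. FakeVM and its canned replies are hypothetical; only the
qmp_check() body mirrors the patch.

```python
class FakeVM:
    """Stand-in for iotests.VM that replays canned QMP replies."""

    def __init__(self, replies):
        self.replies = list(replies)

    def qmp(self, cmd, **kwargs):
        return self.replies.pop(0)

    def qmp_check(self, *args, **kwargs):
        # Same check as in the patch: only an empty success reply passes.
        result = self.qmp(*args, **kwargs)
        assert result == {'return': {}}


vm = FakeVM([
    {'return': {}},
    {'error': {'class': 'GenericError', 'desc': 'Node not found'}},
])
vm.qmp_check('blockdev-del', node_name='filter')  # passes silently

try:
    vm.qmp_check('blockdev-del', node_name='missing')
    failed = False
except AssertionError:
    failed = True
```

Note that the helper only fits commands that return an empty dict; a
command with a non-empty 'return' would fail the assertion.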

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
---
 tests/qemu-iotests/iotests.py | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index aaa77b5105..329297bfe4 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -1101,6 +1101,10 @@ def get_block_graph(self):
     def assert_edges_list(self, edges):
         assert sorted(edges) == sorted(self.get_block_graph())
 
+    def qmp_check(self, *args, **kwargs):
+        result = self.qmp(*args, **kwargs)
+        assert result == {'return': {}}
+
     def assert_block_path(self, root, path, expected_node, graph=None):
         """
         Check whether the node under the given path in the block graph
-- 
2.35.1




* [PATCH v5 43/45] iotests: add filter-insertion
  2022-03-30 21:28 [PATCH v5 00/45] Transactional block-graph modifying API Vladimir Sementsov-Ogievskiy
                   ` (41 preceding siblings ...)
  2022-03-30 21:28 ` [PATCH v5 42/45] iotests.py: add VM.qmp_check() helper Vladimir Sementsov-Ogievskiy
@ 2022-03-30 21:29 ` Vladimir Sementsov-Ogievskiy
  2022-03-30 21:29 ` [PATCH v5 44/45] block: bdrv_open_inherit: create BlockBackend only when necessary Vladimir Sementsov-Ogievskiy
  2022-03-30 21:29 ` [PATCH v5 45/45] block/copy-before-write: correct permission scheme Vladimir Sementsov-Ogievskiy
  44 siblings, 0 replies; 78+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2022-03-30 21:29 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, hreitz, vsementsov, v.sementsov-og, qemu-devel

Demonstrate new API for filter insertion and removal.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
---
 tests/qemu-iotests/tests/filter-insertion     | 253 ++++++++++++++++++
 tests/qemu-iotests/tests/filter-insertion.out |   5 +
 2 files changed, 258 insertions(+)
 create mode 100755 tests/qemu-iotests/tests/filter-insertion
 create mode 100644 tests/qemu-iotests/tests/filter-insertion.out

diff --git a/tests/qemu-iotests/tests/filter-insertion b/tests/qemu-iotests/tests/filter-insertion
new file mode 100755
index 0000000000..4898d6e043
--- /dev/null
+++ b/tests/qemu-iotests/tests/filter-insertion
@@ -0,0 +1,253 @@
+#!/usr/bin/env python3
+#
+# Tests for inserting and removing filters in a block graph.
+#
+# Copyright (c) 2022 Virtuozzo International GmbH.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
+
+import os
+
+import iotests
+from iotests import qemu_img_create, try_remove
+
+
+disk = os.path.join(iotests.test_dir, 'disk')
+sock = os.path.join(iotests.sock_dir, 'sock')
+size = 1024 * 1024
+
+
+class TestFilterInsertion(iotests.QMPTestCase):
+    def setUp(self):
+        qemu_img_create(disk, str(size))
+
+        self.vm = iotests.VM()
+        self.vm.launch()
+
+        self.vm.qmp_check('blockdev-add', {
+            'node-name': 'disk0',
+            'driver': 'qcow2',
+            'file': {
+                'node-name': 'file0',
+                'driver': 'file',
+                'filename': disk
+            }
+        })
+
+    def tearDown(self):
+        self.vm.shutdown()
+        os.remove(disk)
+        try_remove(sock)
+
+    def test_simple_insertion(self):
+        vm = self.vm
+
+        vm.qmp_check('blockdev-add', {
+            'node-name': 'filter',
+            'driver': 'preallocate',
+            'file': 'file0'
+        })
+
+        vm.qmp_check('x-blockdev-replace', {
+            'parent-type': 'driver',
+            'node-name': 'disk0',
+            'child': 'file',
+            'new-child': 'filter'
+        })
+
+        # Filter inserted:
+        # disk0 -file-> filter -file-> file0
+        vm.assert_edges_list([
+            ('disk0', 'file', 'filter'),
+            ('filter', 'file', 'file0')
+        ])
+
+        vm.qmp_check('x-blockdev-replace', {
+            'parent-type': 'driver',
+            'node-name': 'disk0',
+            'child': 'file',
+            'new-child': 'file0'
+        })
+
+        # Filter replaced, but still exists:
+        # disk0 -file-> file0 <-file- filter
+        vm.assert_edges_list([
+            ('disk0', 'file', 'file0'),
+            ('filter', 'file', 'file0')
+        ])
+
+        vm.qmp_check('blockdev-del', node_name='filter')
+
+        # Filter removed
+        # disk0 -file-> file0
+        vm.assert_edges_list([
+            ('disk0', 'file', 'file0')
+        ])
+
+    def test_tran_insert_under_qdev(self):
+        vm = self.vm
+
+        vm.qmp_check('device_add', driver='virtio-scsi')
+        vm.qmp_check('device_add', id='sda', driver='scsi-hd', drive='disk0')
+
+        vm.qmp_check('transaction', actions=[
+            {
+                'type': 'blockdev-add',
+                'data': {
+                    'node-name': 'filter',
+                    'driver': 'compress',
+                    'file': 'disk0'
+                }
+            }, {
+                'type': 'x-blockdev-replace',
+                'data': {
+                    'parent-type': 'qdev',
+                    'qdev-id': 'sda',
+                    'new-child': 'filter'
+                }
+            }
+        ])
+
+        # Filter inserted:
+        # sda -root-> filter -file-> disk0 -file-> file0
+        vm.assert_edges_list([
+            # parent_node_name, child_name, child_node_name
+            ('sda', 'root', 'filter'),
+            ('filter', 'file', 'disk0'),
+            ('disk0', 'file', 'file0'),
+        ])
+
+        vm.qmp_check('x-blockdev-replace', {
+            'parent-type': 'qdev',
+            'qdev-id': 'sda',
+            'new-child': 'disk0'
+        })
+        vm.qmp_check('blockdev-del', node_name='filter')
+
+        # Filter removed:
+        # sda -root-> disk0 -file-> file0
+        vm.assert_edges_list([
+            # parent_node_name, child_name, child_node_name
+            ('sda', 'root', 'disk0'),
+            ('disk0', 'file', 'file0'),
+        ])
+
+    def test_tran_insert_under_nbd_export(self):
+        vm = self.vm
+
+        vm.qmp_check('nbd-server-start',
+                     addr={'type': 'unix', 'data': {'path': sock}})
+        vm.qmp_check('block-export-add', id='exp1', type='nbd',
+                     node_name='disk0', name='exp1')
+        vm.qmp_check('block-export-add', id='exp2', type='nbd',
+                     node_name='disk0', name='exp2')
+        vm.qmp_check('object-add', qom_type='throttle-group',
+                     id='tg', limits={'iops-read': 1})
+
+        vm.qmp_check('transaction', actions=[
+            {
+                'type': 'blockdev-add',
+                'data': {
+                    'node-name': 'filter',
+                    'driver': 'throttle',
+                    'throttle-group': 'tg',
+                    'file': 'disk0'
+                }
+            }, {
+                'type': 'x-blockdev-replace',
+                'data': {
+                    'parent-type': 'export',
+                    'export-id': 'exp1',
+                    'new-child': 'filter'
+                }
+            }
+        ])
+
+        # Only exp1 is throttled, exp2 is not:
+        # exp1 -root-> filter
+        #                |
+        #                |file
+        #                v
+        # exp2 -root-> disk0 -file-> file0
+        vm.assert_edges_list([
+            # parent_node_name, child_name, child_node_name
+            ('exp1', 'root', 'filter'),
+            ('filter', 'file', 'disk0'),
+            ('disk0', 'file', 'file0'),
+            ('exp2', 'root', 'disk0')
+        ])
+
+        vm.qmp_check('x-blockdev-replace', {
+            'parent-type': 'export',
+            'export-id': 'exp2',
+            'new-child': 'filter'
+        })
+
+        # Both throttled:
+        # exp1 -root-> filter <-root- exp2
+        #                |
+        #                |file
+        #                v
+        #              disk0 -file-> file0
+        vm.assert_edges_list([
+            # parent_node_name, child_name, child_node_name
+            ('exp1', 'root', 'filter'),
+            ('filter', 'file', 'disk0'),
+            ('disk0', 'file', 'file0'),
+            ('exp2', 'root', 'filter')
+        ])
+
+        # Check that the filter is in use and can't be removed
+        result = vm.qmp('blockdev-del', node_name='filter')
+        self.assert_qmp(result, 'error/desc', 'Node filter is in use')
+
+        vm.qmp_check('transaction', actions=[
+            {
+                'type': 'x-blockdev-replace',
+                'data': {
+                    'parent-type': 'export',
+                    'export-id': 'exp1',
+                    'new-child': 'disk0'
+                }
+            }, {
+                'type': 'x-blockdev-replace',
+                'data': {
+                    'parent-type': 'export',
+                    'export-id': 'exp2',
+                    'new-child': 'disk0'
+                }
+            }
+        ])
+        vm.qmp_check('blockdev-del', node_name='filter')
+
+        # Filter removed:
+        # exp1 -root-> disk0 <-root- exp2
+        #                |
+        #                |file
+        #                v
+        #              file0
+        vm.assert_edges_list([
+            # parent_node_name, child_name, child_node_name
+            ('exp1', 'root', 'disk0'),
+            ('disk0', 'file', 'file0'),
+            ('exp2', 'root', 'disk0')
+        ])
+
+
+if __name__ == '__main__':
+    iotests.main(
+        supported_fmts=['qcow2'],
+        supported_protocols=['file']
+    )
diff --git a/tests/qemu-iotests/tests/filter-insertion.out b/tests/qemu-iotests/tests/filter-insertion.out
new file mode 100644
index 0000000000..8d7e996700
--- /dev/null
+++ b/tests/qemu-iotests/tests/filter-insertion.out
@@ -0,0 +1,5 @@
+...
+----------------------------------------------------------------------
+Ran 3 tests
+
+OK
-- 
2.35.1




* [PATCH v5 44/45] block: bdrv_open_inherit: create BlockBackend only when necessary
  2022-03-30 21:28 [PATCH v5 00/45] Transactional block-graph modifying API Vladimir Sementsov-Ogievskiy
                   ` (42 preceding siblings ...)
  2022-03-30 21:29 ` [PATCH v5 43/45] iotests: add filter-insertion Vladimir Sementsov-Ogievskiy
@ 2022-03-30 21:29 ` Vladimir Sementsov-Ogievskiy
  2022-03-30 21:29 ` [PATCH v5 45/45] block/copy-before-write: correct permission scheme Vladimir Sementsov-Ogievskiy
  44 siblings, 0 replies; 78+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2022-03-30 21:29 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, hreitz, vsementsov, v.sementsov-og, qemu-devel

We need this blk only for probing, so let's create it only when we are
actually going to probe.

That's significant for further changes: we'll need to avoid permission
updates during open() when possible (and refresh the permissions later,
of course). But blk_unref() leads to a permission update. Instead of
implementing extra logic to avoid the permission update during
blk_unref() when we want that, let's just drop blk_unref() from the
normal code path.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
---
 block.c | 48 +++++++++++++++++++++++++-----------------------
 1 file changed, 25 insertions(+), 23 deletions(-)

diff --git a/block.c b/block.c
index 40f54fe121..5a4b59eb6c 100644
--- a/block.c
+++ b/block.c
@@ -1821,7 +1821,7 @@ QemuOptsList bdrv_create_opts_simple = {
  *
  * Removes all processed options from *options.
  */
-static int bdrv_open_common(BlockDriverState *bs, BlockBackend *file,
+static int bdrv_open_common(BlockDriverState *bs, BlockDriverState *file,
                             QDict *options, Error **errp)
 {
     int ret, open_flags;
@@ -1861,8 +1861,8 @@ static int bdrv_open_common(BlockDriverState *bs, BlockBackend *file,
     }
 
     if (file != NULL) {
-        bdrv_refresh_filename(blk_bs(file));
-        filename = blk_bs(file)->filename;
+        bdrv_refresh_filename(file);
+        filename = file->filename;
     } else {
         /*
          * Caution: while qdict_get_try_str() is fine, getting
@@ -3883,7 +3883,7 @@ static BlockDriverState *bdrv_open_inherit(const char *filename,
                                            Error **errp)
 {
     int ret;
-    BlockBackend *file = NULL;
+    BlockDriverState *file_bs = NULL;
     BlockDriverState *bs;
     BlockDriver *drv = NULL;
     BdrvChild *child;
@@ -4016,8 +4016,6 @@ static BlockDriverState *bdrv_open_inherit(const char *filename,
      * probing, the block drivers will do their own bdrv_open_child() for the
      * same BDS, which is why we put the node name back into options. */
     if ((flags & BDRV_O_PROTOCOL) == 0) {
-        BlockDriverState *file_bs;
-
         file_bs = bdrv_open_child_bs(filename, options, "file", bs,
                                      &child_of_bds, BDRV_CHILD_IMAGE,
                                      true, &local_err);
@@ -4025,24 +4023,28 @@ static BlockDriverState *bdrv_open_inherit(const char *filename,
             goto fail;
         }
         if (file_bs != NULL) {
-            /* Not requesting BLK_PERM_CONSISTENT_READ because we're only
-             * looking at the header to guess the image format. This works even
-             * in cases where a guest would not see a consistent state. */
-            file = blk_new(bdrv_get_aio_context(file_bs), 0, BLK_PERM_ALL);
-            blk_insert_bs(file, file_bs, &local_err);
-            bdrv_unref(file_bs);
-            if (local_err) {
-                goto fail;
-            }
-
             qdict_put_str(options, "file", bdrv_get_node_name(file_bs));
         }
     }
 
     /* Image format probing */
     bs->probed = !drv;
-    if (!drv && file) {
+    if (!drv && file_bs) {
+        /*
+         * Not requesting BLK_PERM_CONSISTENT_READ because we're only
+         * looking at the header to guess the image format. This works even
+         * in cases where a guest would not see a consistent state.
+         */
+        BlockBackend *file = blk_new(bdrv_get_aio_context(file_bs), 0,
+                                     BLK_PERM_ALL);
+        blk_insert_bs(file, file_bs, &local_err);
+        if (local_err) {
+            blk_unref(file);
+            goto fail;
+        }
+
         ret = find_image_format(file, filename, &drv, &local_err);
+        blk_unref(file);
         if (ret < 0) {
             goto fail;
         }
@@ -4068,17 +4070,17 @@ static BlockDriverState *bdrv_open_inherit(const char *filename,
     assert(!!(flags & BDRV_O_PROTOCOL) == !!drv->bdrv_file_open);
     /* file must be NULL if a protocol BDS is about to be created
      * (the inverse results in an error message from bdrv_open_common()) */
-    assert(!(flags & BDRV_O_PROTOCOL) || !file);
+    assert(!(flags & BDRV_O_PROTOCOL) || !file_bs);
 
     /* Open the image */
-    ret = bdrv_open_common(bs, file, options, &local_err);
+    ret = bdrv_open_common(bs, file_bs, options, &local_err);
     if (ret < 0) {
         goto fail;
     }
 
-    if (file) {
-        blk_unref(file);
-        file = NULL;
+    if (file_bs) {
+        bdrv_unref(file_bs);
+        file_bs = NULL;
     }
 
     /* If there is a backing file, use it */
@@ -4142,7 +4144,7 @@ static BlockDriverState *bdrv_open_inherit(const char *filename,
     return bs;
 
 fail:
-    blk_unref(file);
+    bdrv_unref(file_bs);
     qobject_unref(snapshot_options);
     qobject_unref(bs->explicit_options);
     qobject_unref(bs->options);
-- 
2.35.1




* [PATCH v5 45/45] block/copy-before-write: correct permission scheme
  2022-03-30 21:28 [PATCH v5 00/45] Transactional block-graph modifying API Vladimir Sementsov-Ogievskiy
                   ` (43 preceding siblings ...)
  2022-03-30 21:29 ` [PATCH v5 44/45] block: bdrv_open_inherit: create BlockBackend only when necessary Vladimir Sementsov-Ogievskiy
@ 2022-03-30 21:29 ` Vladimir Sementsov-Ogievskiy
  44 siblings, 0 replies; 78+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2022-03-30 21:29 UTC (permalink / raw)
  To: qemu-block
  Cc: kwolf, v.sementsov-og, qemu-devel, hreitz, vsementsov, John Snow

Finally we can strictly unshare write on the source node, as all writes
must go through the copy-before-write filter. For this to work:

 - Declare independent close, so that the blockdev-del transaction
   action may detach children of the removed node at the prepare phase
   (that's for filter removal). We can do it because the
   copy-before-write filter doesn't do any IO on its children on
   close().

 - Support BDRV_O_NOPERM, so that the blockdev-add transaction action
   can skip the intermediate permission update. We can do it because
   the copy-before-write filter doesn't do any IO on its children on
   open().

 - Move to the new block-graph modifying API in the image-fleecing
   iotest. Separate qom-set + del/add doesn't work anymore for the
   copy-before-write filter, because the intermediate state violates
   the new strict permissions.
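
To illustrate the last point, here is a toy Python model (not QEMU
code) of why the permission check has to wait until the end of the
transaction. The Parent class, the conflicts() helper and the single
"write" permission are assumptions made up for this sketch; only the
node names sda, src and fl-cbw follow the image-fleecing test.

```python
from itertools import permutations

WRITE = "write"


class Parent:
    """Hypothetical parent of a node: what it needs and what it tolerates."""

    def __init__(self, name, perm, shared):
        self.name = name
        self.perm = set(perm)      # permissions this parent requires
        self.shared = set(shared)  # permissions it lets other parents hold


def conflicts(graph):
    """List (child, a, b) where parent a needs something parent b unshares."""
    found = []
    for child, parents in graph.items():
        for a, b in permutations(parents, 2):
            if a.perm - b.shared:
                found.append((child, a.name, b.name))
    return found


sda = Parent("sda", perm={WRITE}, shared={WRITE})
# The filter strictly unshares write on its source child.
cbw = Parent("fl-cbw", perm={WRITE}, shared=set())

# Initial graph: sda -> src
graph = {"src": [sda]}

# Action 1 (blockdev-add): fl-cbw becomes a second parent of src.
graph["src"].append(cbw)
graph["fl-cbw"] = []
after_add = conflicts(graph)

# Action 2 (x-blockdev-replace): sda moves from src to fl-cbw.
graph["src"].remove(sda)
graph["fl-cbw"].append(sda)
after_replace = conflicts(graph)

print("checked after blockdev-add:", after_add)
print("checked after the whole transaction:", after_replace)
```

In this model a check right after the first action reports that sda
still writes to src while fl-cbw forbids it, whereas a single check
after both actions finds nothing to complain about.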

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
---
 block/copy-before-write.c                   | 17 ++++++++---------
 tests/qemu-iotests/tests/image-fleecing     | 20 +++++++++++++++-----
 tests/qemu-iotests/tests/image-fleecing.out |  8 --------
 3 files changed, 23 insertions(+), 22 deletions(-)

diff --git a/block/copy-before-write.c b/block/copy-before-write.c
index 4fad564691..90a9c7874a 100644
--- a/block/copy-before-write.c
+++ b/block/copy-before-write.c
@@ -319,12 +319,8 @@ static void cbw_child_perm(BlockDriverState *bs, BdrvChild *c,
         bdrv_default_perms(bs, c, role, reopen_queue,
                            perm, shared, nperm, nshared);
 
-        if (!QLIST_EMPTY(&bs->parents)) {
-            if (perm & BLK_PERM_WRITE) {
-                *nperm = *nperm | BLK_PERM_CONSISTENT_READ;
-            }
-            *nshared &= ~(BLK_PERM_WRITE | BLK_PERM_RESIZE);
-        }
+        *nperm = *nperm | BLK_PERM_CONSISTENT_READ;
+        *nshared &= ~(BLK_PERM_WRITE | BLK_PERM_RESIZE);
     }
 }
 
@@ -378,13 +374,15 @@ static int cbw_open(BlockDriverState *bs, QDict *options, int flags,
     int64_t cluster_size;
     int ret;
 
-    ret = bdrv_open_file_child(NULL, options, "file", bs, errp);
+    ret = bdrv_open_file_child_common(NULL, options, "file", bs,
+                                      !(flags & BDRV_O_NOPERM), errp);
     if (ret < 0) {
         return ret;
     }
 
-    s->target = bdrv_open_child(NULL, options, "target", bs, &child_of_bds,
-                                BDRV_CHILD_DATA, false, errp);
+    s->target = bdrv_open_child_common(NULL, options, "target", bs,
+                                       &child_of_bds, BDRV_CHILD_DATA, false,
+                                       !(flags & BDRV_O_NOPERM), errp);
     if (!s->target) {
         return -EINVAL;
     }
@@ -444,6 +442,7 @@ static void cbw_close(BlockDriverState *bs)
 BlockDriver bdrv_cbw_filter = {
     .format_name = "copy-before-write",
     .instance_size = sizeof(BDRVCopyBeforeWriteState),
+    .independent_close = true,
 
     .bdrv_open                  = cbw_open,
     .bdrv_close                 = cbw_close,
diff --git a/tests/qemu-iotests/tests/image-fleecing b/tests/qemu-iotests/tests/image-fleecing
index b7e5076104..23b55ded70 100755
--- a/tests/qemu-iotests/tests/image-fleecing
+++ b/tests/qemu-iotests/tests/image-fleecing
@@ -131,9 +131,13 @@ def do_test(vm, use_cbw, use_snapshot_access_filter, base_img_path,
         if bitmap:
             fl_cbw['bitmap'] = {'node': src_node, 'name': 'bitmap0'}
 
-        log(vm.qmp('blockdev-add', fl_cbw))
-
-        log(vm.qmp('qom-set', path=qom_path, property='drive', value='fl-cbw'))
+        log(vm.qmp('transaction', {'actions': [
+            {'type': 'blockdev-add', 'data': fl_cbw},
+            {'type': 'x-blockdev-replace', 'data': {
+                'parent-type': 'qdev',
+                'qdev-id': 'sda',
+                'new-child': 'fl-cbw'}}
+        ]}))
 
         if use_snapshot_access_filter:
             log(vm.qmp('blockdev-add', {
@@ -242,8 +246,14 @@ def do_test(vm, use_cbw, use_snapshot_access_filter, base_img_path,
     if use_cbw:
         if use_snapshot_access_filter:
             log(vm.qmp('blockdev-del', node_name='fl-access'))
-        log(vm.qmp('qom-set', path=qom_path, property='drive', value=src_node))
-        log(vm.qmp('blockdev-del', node_name='fl-cbw'))
+        log(vm.qmp('transaction', {'actions': [
+            {'type': 'x-blockdev-replace', 'data': {
+                'parent-type': 'qdev',
+                'qdev-id': 'sda',
+                'new-child': src_node}},
+            {'type': 'blockdev-del', 'data': {
+                'node-name': 'fl-cbw'}}
+        ]}))
     else:
         log(vm.qmp('block-job-cancel', device='fleecing'))
         e = vm.event_wait('BLOCK_JOB_CANCELLED')
diff --git a/tests/qemu-iotests/tests/image-fleecing.out b/tests/qemu-iotests/tests/image-fleecing.out
index acfc89ff0e..33c6c239da 100644
--- a/tests/qemu-iotests/tests/image-fleecing.out
+++ b/tests/qemu-iotests/tests/image-fleecing.out
@@ -79,7 +79,6 @@ Done
 
 --- Setting up Fleecing Graph ---
 
-{"return": {}}
 {"return": {}}
 {"return": {}}
 
@@ -124,7 +123,6 @@ read -P0 0x3fe0000 64k
 {"return": {}}
 {"return": {}}
 {"return": {}}
-{"return": {}}
 
 --- Confirming writes ---
 
@@ -152,7 +150,6 @@ Done
 {"return": {}}
 {"return": {}}
 {"return": {}}
-{"return": {}}
 
 --- Setting up NBD Export ---
 
@@ -196,7 +193,6 @@ read -P0 0x3fe0000 64k
 {"return": {}}
 {"return": {}}
 {"return": {}}
-{"return": {}}
 
 --- Confirming writes ---
 
@@ -224,7 +220,6 @@ Done
 {"return": {}}
 {"return": {}}
 {"return": {}}
-{"return": {}}
 
 --- Setting up NBD Export ---
 
@@ -280,7 +275,6 @@ read failed: Invalid argument
 {"return": {}}
 {"return": {}}
 {"return": {}}
-{"return": {}}
 
 --- Confirming writes ---
 
@@ -308,7 +302,6 @@ Done
 {"return": {}}
 {"return": {}}
 {"return": {}}
-{"return": {}}
 
 --- Starting actual backup ---
 
@@ -343,7 +336,6 @@ read -P0 0x3fe0000 64k
 {"return": {}}
 {"return": {}}
 {"return": {}}
-{"return": {}}
 
 --- Confirming writes ---
 
-- 
2.35.1




* Re: [PATCH v5 02/45] block: introduce bdrv_open_file_child() helper
  2022-03-30 21:28 ` [PATCH v5 02/45] block: introduce bdrv_open_file_child() helper Vladimir Sementsov-Ogievskiy
@ 2022-06-07  9:57   ` Hanna Reitz
  0 siblings, 0 replies; 78+ messages in thread
From: Hanna Reitz @ 2022-06-07  9:57 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: qemu-devel, kwolf, vsementsov, v.sementsov-og, Ari Sundholm,
	Pavel Dovgalyuk, Paolo Bonzini, Stefan Hajnoczi, John Snow,
	Denis V. Lunev, Wen Congyang, Xie Changlong, Stefan Weil,
	Jeff Cody, Fam Zheng

On 30.03.22 23:28, Vladimir Sementsov-Ogievskiy wrote:
> Almost all drivers call bdrv_open_child() similarly. Let's create a
> helper for this.
>
> The only not updated driver that call bdrv_open_child() to set
> bs->file is raw-format, as it sometimes want to have filtered child but
> don't set drv->is_filter to true.

Also snapshot-access, which uses DATA | PRIMARY.

> Possibly we should implement drv->is_filter_func() handler, to consider
> raw-format as filter when it works as filter.. But it's another story.
>
> Note also, that we decrease assignments to bs->file in code: it helps
> us restrict modifying this field in further commit.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
> ---

[...]

> diff --git a/block/filter-compress.c b/block/filter-compress.c
> index d5be538619..b2cfa9a9a5 100644
> --- a/block/filter-compress.c
> +++ b/block/filter-compress.c
> @@ -30,10 +30,8 @@
>   static int compress_open(BlockDriverState *bs, QDict *options, int flags,
>                            Error **errp)
>   {
> -    bs->file = bdrv_open_child(NULL, options, "file", bs, &child_of_bds,
> -                               BDRV_CHILD_FILTERED | BDRV_CHILD_PRIMARY,
> -                               false, errp);
> -    if (!bs->file) {
> +    int ret = bdrv_open_file_child(NULL, options, "file", bs, errp);
> +    if (ret < 0) {
>           return -EINVAL;

Should probably be `return ret;` like elsewhere.

With that done:

Reviewed-by: Hanna Reitz <hreitz@redhat.com>




* Re: [PATCH v5 01/45] block: BlockDriver: add .filtered_child_is_backing field
  2022-03-30 21:28 ` [PATCH v5 01/45] block: BlockDriver: add .filtered_child_is_backing field Vladimir Sementsov-Ogievskiy
@ 2022-06-07  9:57   ` Hanna Reitz
  0 siblings, 0 replies; 78+ messages in thread
From: Hanna Reitz @ 2022-06-07  9:57 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: qemu-devel, kwolf, vsementsov, v.sementsov-og, John Snow

On 30.03.22 23:28, Vladimir Sementsov-Ogievskiy wrote:
> Unfortunately not all filters use .file child as filtered child. Two
> exclusions are mirror_top and commit_top. Happily they both are private
> filters. Bad thing is that this inconsistency is observable through qmp
> commands query-block / query-named-block-nodes. So, could we just
> change mirror_top and commit_top to use file child as all other filter
> driver is an open question. Probably, we could do that with some kind
> of deprecation period, but how to warn users during it?
>
> For now, let's just add a field so we can distinguish them in generic
> code, it will be used in further commits.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
> ---
>   block/commit.c                   |  1 +
>   block/mirror.c                   |  1 +
>   include/block/block_int-common.h | 13 +++++++++++++
>   3 files changed, 15 insertions(+)

Reviewed-by: Hanna Reitz <hreitz@redhat.com>




* Re: [PATCH v5 03/45] block/blklogwrites: don't care to remove bs->file child on failure
  2022-03-30 21:28 ` [PATCH v5 03/45] block/blklogwrites: don't care to remove bs->file child on failure Vladimir Sementsov-Ogievskiy
@ 2022-06-07 10:05   ` Hanna Reitz
  0 siblings, 0 replies; 78+ messages in thread
From: Hanna Reitz @ 2022-06-07 10:05 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: qemu-devel, kwolf, vsementsov, v.sementsov-og, Ari Sundholm

On 30.03.22 23:28, Vladimir Sementsov-Ogievskiy wrote:
> We don't need to remove bs->file, generic layer takes care of it. No
> other driver cares to remove bs->file on failure by hand.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
> ---
>   block/blklogwrites.c | 4 ----
>   1 file changed, 4 deletions(-)

Reviewed-by: Hanna Reitz <hreitz@redhat.com>




* Re: [PATCH v5 04/45] test-bdrv-graph-mod: update test_parallel_perm_update test case
  2022-03-30 21:28 ` [PATCH v5 04/45] test-bdrv-graph-mod: update test_parallel_perm_update test case Vladimir Sementsov-Ogievskiy
@ 2022-06-07 10:53   ` Hanna Reitz
  2022-06-09 13:08     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 78+ messages in thread
From: Hanna Reitz @ 2022-06-07 10:53 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: qemu-devel, kwolf, vsementsov, v.sementsov-og

On 30.03.22 23:28, Vladimir Sementsov-Ogievskiy wrote:
> test_parallel_perm_update() does two things that we are going to
> restrict in the near future:
>
> 1. It updates bs->file field by hand. bs->file will be managed
>     automatically by generic code (together with bs->children list).
>
>     Let's better refactor our "tricky" bds to have own state where one
>     of children is linked as "selected".
>     This also looks less "tricky", so avoid using this word.
>
> 2. It create FILTERED children that are not PRIMARY. Except for tests
>     all FILTERED children in the Qemu block layer are always PRIMARY as
>     well.  We are going to formalize this rule, so let's better use DATA
>     children here.

Another thing is that any node may have at most one FILTERED child at a 
time, which was already formalized in BDRV_CHILD_FILTERED’s description.
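
As a rough illustration of the two rules discussed here (at most one
FILTERED child per node, and every FILTERED child is also PRIMARY), a
small Python sketch follows. The ChildRole enum only mirrors the
BDRV_CHILD_* names; its bit values and the check_child_roles() helper
are assumptions for this example, not QEMU code.

```python
from enum import IntFlag


class ChildRole(IntFlag):
    # Bit values are illustrative; only the names mirror BDRV_CHILD_*.
    DATA = 1
    METADATA = 2
    FILTERED = 4
    COW = 8
    PRIMARY = 16


def check_child_roles(roles):
    """Return the rule violations for the child roles of one node."""
    errors = []
    filtered = [r for r in roles if r & ChildRole.FILTERED]
    if len(filtered) > 1:
        errors.append("more than one FILTERED child")
    if any(not r & ChildRole.PRIMARY for r in filtered):
        errors.append("FILTERED child is not PRIMARY")
    return errors


# Test driver before the patch: two FILTERED, non-PRIMARY children.
old_layout = [ChildRole.FILTERED, ChildRole.FILTERED]
# Test driver after the patch: two plain DATA children.
new_layout = [ChildRole.DATA, ChildRole.DATA]
# A copy-before-write-like node: one filtered child plus a DATA target.
cbw_like = [ChildRole.FILTERED | ChildRole.PRIMARY, ChildRole.DATA]

for name, roles in [("old", old_layout), ("new", new_layout),
                    ("cbw-like", cbw_like)]:
    print(name, check_child_roles(roles))
```

Only the old layout trips both checks; the DATA-only layout and the
copy-before-write-like layout pass.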

> While being here, update the picture to better correspond to the test
> code.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
> ---

The change looks good, I’m just a bit confused when it comes to the 
comment describing what’s going on.

>   tests/unit/test-bdrv-graph-mod.c | 70 ++++++++++++++++++++------------
>   1 file changed, 44 insertions(+), 26 deletions(-)
>
> diff --git a/tests/unit/test-bdrv-graph-mod.c b/tests/unit/test-bdrv-graph-mod.c
> index a6e3bb79be..40795d3c04 100644
> --- a/tests/unit/test-bdrv-graph-mod.c
> +++ b/tests/unit/test-bdrv-graph-mod.c

[...]

> @@ -266,15 +280,18 @@ static BlockDriver bdrv_write_to_file = {
>    * The following test shows that topological-sort order is required for
>    * permission update, simple DFS is not enough.
>    *
> - * Consider the block driver which has two filter children: one active
> - * with exclusive write access and one inactive with no specific
> - * permissions.
> + * Consider the block driver (write-to-selected) which has two children: one is
> + * selected so we have exclusive write access to it and for the other one we
> + * don't need any specific permissions.
>    *
>    * And, these two children has a common base child, like this:
> + *   (additional "top" on top is used in test just because the only public
> + *    function to update permission should get a specific child to update.
> + *    Making bdrv_refresh_perms() public just for this test doesn't worth it)

s/doesn't/isn't/

>    *
> - * ┌─────┐     ┌──────┐
> - * │ fl2 │ ◀── │ top  │
> - * └─────┘     └──────┘
> + * ┌─────┐     ┌───────────────────┐     ┌─────┐
> + * │ fl2 │ ◀── │ write-to-selected │ ◀── │ top │
> + * └─────┘     └───────────────────┘     └─────┘
>    *   │           │
>    *   │           │ w
>    *   │           ▼
> @@ -290,7 +307,7 @@ static BlockDriver bdrv_write_to_file = {
>    *
>    * So, exclusive write is propagated.
>    *
> - * Assume, we want to make fl2 active instead of fl1.
> + * Assume, we want to select fl2  instead of fl1.

There’s a double space after “fl2”.

>    * So, we set some option for top driver and do permission update.

Here and in the rest of the comment, it’s now unclear what node “top” 
refers to.  I think it’s still the now-renamed “write-to-selected” node, 
right?  But “top” is now a different node, so I’m not 100% sure.

(On the other hand, even before this patch, there was a “top” node that 
was distinct from the former “tricky” node...  So it seems like this 
comment was already not quite right before?)
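
For reference, the ordering issue that the quoted comment describes can
be sketched in Python on the same graph. The edge list below is my
reading of the picture (top -> write-to-selected -> fl1/fl2 -> base);
the helper names are made up for this sketch, and graphlib needs
Python 3.9 or newer.

```python
from graphlib import TopologicalSorter

# parent -> children, as read from the picture in the quoted comment
children = {
    "top": ["write-to-selected"],
    "write-to-selected": ["fl1", "fl2"],
    "fl1": ["base"],
    "fl2": ["base"],
    "base": [],
}


def dfs_preorder(node, seen=None):
    """Visit a node, then recurse into its children in listed order."""
    seen = [] if seen is None else seen
    if node not in seen:
        seen.append(node)
        for child in children[node]:
            dfs_preorder(child, seen)
    return seen


def parents_first_order():
    """Topological order in which every node comes after all its parents."""
    parents = {node: set() for node in children}
    for parent, kids in children.items():
        for kid in kids:
            parents[kid].add(parent)
    # TopologicalSorter expects node -> predecessors and emits
    # predecessors first.
    return list(TopologicalSorter(parents).static_order())


dfs = dfs_preorder("top")
topo = parents_first_order()
print("DFS:        ", dfs)
print("topological:", topo)
```

The plain DFS reaches base through fl1 before fl2 has been visited, so
base would be updated while one of its parents still holds its old
permissions. The topological order places base after both fl1 and fl2.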

>    *
>    * With simple DFS, if permission update goes first through




* Re: [PATCH v5 05/45] tests-bdrv-drain: bdrv_replace_test driver: declare supports_backing
  2022-03-30 21:28 ` [PATCH v5 05/45] tests-bdrv-drain: bdrv_replace_test driver: declare supports_backing Vladimir Sementsov-Ogievskiy
@ 2022-06-07 10:59   ` Hanna Reitz
  0 siblings, 0 replies; 78+ messages in thread
From: Hanna Reitz @ 2022-06-07 10:59 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: qemu-devel, kwolf, vsementsov, v.sementsov-og, Eric Blake,
	Nikita Lapshin

On 30.03.22 23:28, Vladimir Sementsov-Ogievskiy wrote:
> We do add COW child to the node.  In future we are going to forbid
> adding COW child to the node that doesn't support backing. So, fix it
> here now.
>
> Don't worry about setting bs->backing itself: it further commit we'll

s/it/in/

> update the block-layer to automatically set/unset this field in generic
> code.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
> ---
>   tests/unit/test-bdrv-drain.c | 1 +
>   1 file changed, 1 insertion(+)

Reviewed-by: Hanna Reitz <hreitz@redhat.com>




* Re: [PATCH v5 06/45] test-bdrv-graph-mod: fix filters to be filters
  2022-03-30 21:28 ` [PATCH v5 06/45] test-bdrv-graph-mod: fix filters to be filters Vladimir Sementsov-Ogievskiy
@ 2022-06-07 11:22   ` Hanna Reitz
  0 siblings, 0 replies; 78+ messages in thread
From: Hanna Reitz @ 2022-06-07 11:22 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: qemu-devel, kwolf, vsementsov, v.sementsov-og

On 30.03.22 23:28, Vladimir Sementsov-Ogievskiy wrote:
> bdrv_pass_through is used as filter, even all node variables has
> corresponding names. We want to append it, so it should be
> backing-child-based filter like mirror_top.
> So, in test_update_perm_tree, first child should be DATA, as we don't
> want filters with two filtered children.
>
> bdrv_exclusive_writer is used as a filter once. So it should be filter
> anyway. We want to append it, so it should be backing-child-based
> fitler too.
>
> Make all FILTERED children to be PRIMARY as well. We are going to force
> this rule by assertion soon.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
> ---
>   include/block/block_int-common.h |  5 +++--
>   tests/unit/test-bdrv-graph-mod.c | 24 +++++++++++++++++-------
>   2 files changed, 20 insertions(+), 9 deletions(-)
>
> diff --git a/include/block/block_int-common.h b/include/block/block_int-common.h
> index 9d91ccbcbf..d68adc6ff3 100644
> --- a/include/block/block_int-common.h
> +++ b/include/block/block_int-common.h
> @@ -122,8 +122,9 @@ struct BlockDriver {
>       /*
>        * Only make sense for filter drivers, for others must be false.
>        * If true, filtered child is bs->backing. Otherwise it's bs->file.
> -     * Only two internal filters use bs->backing as filtered child and has this
> -     * field set to true: mirror_top and commit_top.
> +     * Two internal filters use bs->backing as filtered child and has this
> +     * field set to true: mirror_top and commit_top. There also two such test
> +     * filters in tests/unit/test-bdrv-graph-mod.c.
>        *
>        * Never create any more such filters!

I mean, it’s just a test, of course, but it is kind of strange that you 
put this very strong imperative here just a couple of patches ago and 
now you disobey it. O:)

Makes sense, though.

Reviewed-by: Hanna Reitz <hreitz@redhat.com>




* Re: [PATCH v5 07/45] block: document connection between child roles and bs->backing/bs->file
  2022-03-30 21:28 ` [PATCH v5 07/45] block: document connection between child roles and bs->backing/bs->file Vladimir Sementsov-Ogievskiy
@ 2022-06-07 12:11   ` Hanna Reitz
  0 siblings, 0 replies; 78+ messages in thread
From: Hanna Reitz @ 2022-06-07 12:11 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: qemu-devel, kwolf, vsementsov, v.sementsov-og

On 30.03.22 23:28, Vladimir Sementsov-Ogievskiy wrote:
> Make the informal rules formal. In a further commit we'll add
> corresponding assertions.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
> ---
>   include/block/block-common.h | 42 ++++++++++++++++++++++++++++++++++++
>   1 file changed, 42 insertions(+)
>
> diff --git a/include/block/block-common.h b/include/block/block-common.h
> index fdb7306e78..2687a2519c 100644
> --- a/include/block/block-common.h
> +++ b/include/block/block-common.h
> @@ -313,6 +313,48 @@ enum {
>    *
>    * At least one of DATA, METADATA, FILTERED, or COW must be set for
>    * every child.
> + *
> + *
> + * = Connection with bs->children, bs->file and bs->backing fields =
> + *
> + * 1. Filters
> + *
> + * Filter drivers has drv->is_filter = true.

s/has/have/

> + *
> + * Filter driver has exactly one FILTERED|PRIMARY child, any may have other

s/Filter driver/A filter node/?

And s/any/and/, I think.

> + * children which must not have these bits (the example is copy-before-write
> + * filter that also has target DATA child).

Mild style suggestion: “one example is the copy-before-write filter,
which also has its target DATA child”

> + *
> + * Filter driver never has COW children.

Maybe “Filter nodes never have COW children.”?

> + *
> + * For all filters except for mirror_top and commit_top, the filtered child is
> + * linked in bs->file, bs->backing is NULL.
> + *
> + * For mirror_top and commit_top filtered child is linked in bs->backing and

s/commit_top filtered/commit_top, the filtered/ (like in the paragraph 
above)

> + * their bs->file is NULL. These two filters has drv->filtered_child_is_backing

s/has/have/

> + * = true.

This also applies to the two test drivers in test-bdrv-graph-mod; should 
that be mentioned, too?

Or should we just link to filtered_child_is_backing when it comes to the 
list of drivers for which this applies, e.g. by rephrasing the two 
paragraphs as follows:

For most filters, the filtered child is linked in bs->file, bs->backing 
is NULL.  For some filters (as an exception), it is the other way 
around; those drivers will have drv->filtered_child_is_backing set to 
true (see that field’s documentation for what drivers this concerns).

(Just so we don’t duplicate the list of drivers.)

> + *
> + * 2. "raw" driver (block/raw-format.c)
> + *
> + * Formally it's not a filter (drv->is_filter = false)
> + *
> + * bs->backing is always NULL
> + *
> + * Only has one child, linked in bs->file. It's role is either FILTERED|PRIMARY

s/it's/its/

> + * (like filter) either DATA|PRIMARY depending on options.

s/either/or/

> + *
> + * 3. Other drivers
> + *
> + * Doesn't have any FILTERED children.

s/Doesn't/Don't/ (because “drivers” was in plural)

> + *
> + * May have at most one COW child. In this case it's linked in bs->backing.
> + * Otherwise bs->backing is NULL. COW child is never PRIMARY.
> + *
> + * May have at most one PRIMARY child. In this case it's linked in bs->file.
> + * Otherwise bs->file is NULL.
> + *
> + * May also have some other children that don't have neither PRIMARY nor COW
> + * bits set.

I think either “that don't have the PRIMARY or COW bit set" or "that 
have neither the PRIMARY nor the COW bit set".

>    */
>   enum BdrvChildRoleBits {
>       /*

Aside from typo/style nit picks, sounds good!
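
For what it's worth, the rules in section 3 ("Other drivers") are simple enough to express as a check. The following is only a sketch under stated assumptions: the `ROLE_*` bits and `roles_ok_for_non_filter()` are hypothetical stand-ins, not QEMU's real definitions, and the assertions promised for a later commit may well look different.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in role bits; the values are arbitrary, not QEMU's. */
enum {
    ROLE_DATA     = 1 << 0,
    ROLE_METADATA = 1 << 1,
    ROLE_FILTERED = 1 << 2,
    ROLE_COW      = 1 << 3,
    ROLE_PRIMARY  = 1 << 4,
};

/*
 * Check the documented rules for a node that is neither a filter nor
 * "raw": no FILTERED children, at most one COW child (never PRIMARY),
 * and at most one PRIMARY child.
 */
static bool roles_ok_for_non_filter(const unsigned *roles, size_t n)
{
    size_t cow = 0, primary = 0;

    for (size_t i = 0; i < n; i++) {
        if (roles[i] & ROLE_FILTERED) {
            return false;
        }
        if (roles[i] & ROLE_COW) {
            if (roles[i] & ROLE_PRIMARY) {
                return false;
            }
            cow++;
        }
        if (roles[i] & ROLE_PRIMARY) {
            primary++;
        }
    }
    return cow <= 1 && primary <= 1;
}
```

A format node with one DATA|METADATA|PRIMARY child and one COW child passes; a node with two COW children, or with a COW|PRIMARY child, does not.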




* Re: [PATCH v5 08/45] block/snapshot: stress that we fallback to primary child
  2022-03-30 21:28 ` [PATCH v5 08/45] block/snapshot: stress that we fallback to primary child Vladimir Sementsov-Ogievskiy
@ 2022-06-07 13:42   ` Hanna Reitz
  0 siblings, 0 replies; 78+ messages in thread
From: Hanna Reitz @ 2022-06-07 13:42 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: qemu-devel, kwolf, vsementsov, v.sementsov-og

On 30.03.22 23:28, Vladimir Sementsov-Ogievskiy wrote:
> Actually what we chose is a primary child. Let's stress it in the code.
>
> We are going to drop indirect pointer logic here in future. Actually
> this commit simplifies the future work: we drop use of indirection in
> the assertion now.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
> ---
>   block/snapshot.c | 30 ++++++++++--------------------
>   1 file changed, 10 insertions(+), 20 deletions(-)
>
> diff --git a/block/snapshot.c b/block/snapshot.c
> index d6f53c3065..f4ec4f9ef3 100644
> --- a/block/snapshot.c
> +++ b/block/snapshot.c
> @@ -161,21 +161,14 @@ bool bdrv_snapshot_find_by_id_and_name(BlockDriverState *bs,
>   static BdrvChild **bdrv_snapshot_fallback_ptr(BlockDriverState *bs)
>   {
>       BdrvChild **fallback;
> -    BdrvChild *child;
> +    BdrvChild *child = bdrv_primary_child(bs);
>   
> -    /*
> -     * The only BdrvChild pointers that are safe to modify (and which
> -     * we can thus return a reference to) are bs->file and
> -     * bs->backing.
> -     */
> -    fallback = &bs->file;
> -    if (!*fallback && bs->drv && bs->drv->is_filter) {
> -        fallback = &bs->backing;
> -    }
> -
> -    if (!*fallback) {
> +    /* We allow fallback only to primary child */
> +    if (!child) {
>           return NULL;
>       }
> +    fallback = (child == bs->file ? &bs->file : &bs->backing);
> +    assert(*fallback == child);
>   
>       /*
>        * Check that there are no other children that would need to be
> @@ -309,15 +302,12 @@ int bdrv_snapshot_goto(BlockDriverState *bs,
>           }
>   
>           /*
> -         * fallback_ptr is &bs->file or &bs->backing.  *fallback_ptr
> -         * was closed above and set to NULL, but the .bdrv_open() call
> -         * has opened it again, because we set the respective option
> -         * (with the qdict_put_str() call above).
> -         * Assert that .bdrv_open() has attached some child on
> -         * *fallback_ptr, and that it has attached the one we wanted
> -         * it to (i.e., fallback_bs).
> +         * fallback was a primary child. It was closed above and set to NULL,
> +         * but the .bdrv_open() call has opened it again, because we set the
> +         * respective option (with the qdict_put_str() call above).
> +         * Assert that .bdrv_open() has attached some BDS as primary child.

s/some/the right/?

Reviewed-by: Hanna Reitz <hreitz@redhat.com>

>            */
> -        assert(*fallback_ptr && fallback_bs == (*fallback_ptr)->bs);
> +        assert(bdrv_primary_bs(bs) == fallback_bs);
>           bdrv_unref(fallback_bs);
>           return ret;
>       }




* Re: [PATCH v5 09/45] Revert "block: Let replace_child_noperm free children"
  2022-03-30 21:28 ` [PATCH v5 09/45] Revert "block: Let replace_child_noperm free children" Vladimir Sementsov-Ogievskiy
@ 2022-06-07 14:03   ` Hanna Reitz
  2022-06-07 15:09     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 78+ messages in thread
From: Hanna Reitz @ 2022-06-07 14:03 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: qemu-devel, kwolf, vsementsov, v.sementsov-og

On 30.03.22 23:28, Vladimir Sementsov-Ogievskiy wrote:
> We are going to reimplement this behavior (clear bs->file / bs->backing
> pointers automatically when child->bs is cleared) in a nicer way.
>
> This reverts commit b0a9f6fed3d80de610dcd04a7e66f9f30a04174f.

This doesn’t really explain why it’s fine to revert this commit here.  
As far as I understand, the bug that was fixed in that commit will 
resurface when it is reverted without the proposed reimplementation, so 
technically, we cannot revert before reimplementing.

As far as I can guess, it’d be unwieldy to do the reimplementation while 
these existing changes are in the way, and it’d be one bomb of a patch 
to squash these five patches (9 to 14) into one, and that’s why you’ve 
chosen to do it this way around.

But technically, we can’t willingly break something just to keep patches 
nicer.  We can make exceptions, but then there needs to be justification 
here in the commit message.

(Or perhaps I’m wrong and it is fine at this point to revert the patch, 
but then I’d like to see the explanation for that, too, because I can’t 
see it myself.)

Hanna




* Re: [PATCH v5 09/45] Revert "block: Let replace_child_noperm free children"
  2022-06-07 14:03   ` Hanna Reitz
@ 2022-06-07 15:09     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 78+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2022-06-07 15:09 UTC (permalink / raw)
  To: Hanna Reitz, Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: qemu-devel, kwolf, vsementsov, v.sementsov-og

On 6/7/22 17:03, Hanna Reitz wrote:
> On 30.03.22 23:28, Vladimir Sementsov-Ogievskiy wrote:
>> We are going to reimplement this behavior (clear bs->file / bs->backing
>> pointers automatically when child->bs is cleared) in a nicer way.
>>
>> This reverts commit b0a9f6fed3d80de610dcd04a7e66f9f30a04174f.
> 
> This doesn’t really explain why it’s fine to revert this commit here. As far as I understand, the bug that was fixed in that commit will resurface when it is reverted without the proposed reimplementation, so technically, we cannot revert before reimplementing.
> 
> As far as I can guess, it’d be unwieldy to do the reimplementation while these existing changes are in the way, and it’d be one bomb of a patch to squash these five patches (9 to 14) into one, and that’s why you’ve chosen to do it this way around.

Yes, that's the reason

> 
> But technically, we can’t willingly break something just to keep patches nicer.  We can make exceptions, but then there needs to be justification here in the commit message.

Agree, will add.

As far as I remember (and after re-reading the commit message), b0a9f6fed3d80de610dc was not a direct fix for a concrete bug. It was a measure to prevent theoretical problems, and we don't have any test for it. So I think breaking bisect at this point for some future test is not too bad.

> 
> (Or perhaps I’m wrong and it is fine at this point to revert the patch, but then I’d like to see the explanation for that, too, because I can’t see it myself.)
> 
> Hanna
> 
> 


-- 
Best regards,
Vladimir



* Re: [PATCH v5 13/45] block: Manipulate bs->file / bs->backing pointers in .attach/.detach
  2022-03-30 21:28 ` [PATCH v5 13/45] block: Manipulate bs->file / bs->backing pointers in .attach/.detach Vladimir Sementsov-Ogievskiy
@ 2022-06-07 15:55   ` Hanna Reitz
  2022-06-09 13:40     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 78+ messages in thread
From: Hanna Reitz @ 2022-06-07 15:55 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: qemu-devel, kwolf, vsementsov, v.sementsov-og

On 30.03.22 23:28, Vladimir Sementsov-Ogievskiy wrote:
> bs->file and bs->backing are a kind of duplication of part of
> bs->children. But a very useful duplication, so let's not drop them at
> all:)
>
> We should manage bs->file and bs->backing in the same place where we
> manage bs->children, to keep them in sync.
>
> Moreover, generic I/O paths are unprepared for a BdrvChild without a bs, so
> it's doubly good to clear bs->file / bs->backing when we detach the
> child.

I think this was reproducible (rarely) with 030, but I can’t reproduce 
it now.  Oh well.

> Detach is simple: if we detach the bs->file or bs->backing child, just
> set the corresponding field to NULL.
>
> Attach is a bit more complicated. But we can still precisely detect
> whether we should set one of bs->file / bs->backing or not:
>
> - if role is BDRV_CHILD_COW, we definitely deal with bs->backing
> - else, if role is BDRV_CHILD_FILTERED (it must also be
>    BDRV_CHILD_PRIMARY), it's a filtered child. Use
>    bs->drv->filtered_child_is_backing to choose the pointer field to
>    modify.
> - else, if role is BDRV_CHILD_PRIMARY, we deal with bs->file
> - in all other cases, it's neither bs->backing nor bs->file. It's some
>    other child and we shouldn't care

Sounds correct.
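
To double-check my reading of the four rules, here they are as a standalone function. This is only a sketch: `pick_slot()`, `ChildSlot` and the `ROLE_*` bits are hypothetical stand-ins, not the code in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in role bits; names follow the commit message, values are arbitrary. */
enum {
    ROLE_DATA     = 1 << 0,
    ROLE_FILTERED = 1 << 1,
    ROLE_COW      = 1 << 2,
    ROLE_PRIMARY  = 1 << 3,
};

/* Which pointer field of the parent a newly attached child belongs in. */
typedef enum { SLOT_NONE, SLOT_FILE, SLOT_BACKING } ChildSlot;

static ChildSlot pick_slot(unsigned role, bool filtered_child_is_backing)
{
    if (role & ROLE_COW) {
        /* Rule 1: a COW child always goes into bs->backing. */
        return SLOT_BACKING;
    }
    if (role & ROLE_FILTERED) {
        /* Rule 2: a filtered child must be PRIMARY; the driver picks the field. */
        assert(role & ROLE_PRIMARY);
        return filtered_child_is_backing ? SLOT_BACKING : SLOT_FILE;
    }
    if (role & ROLE_PRIMARY) {
        /* Rule 3: any other PRIMARY child goes into bs->file. */
        return SLOT_FILE;
    }
    /* Rule 4: some other child; neither field is touched. */
    return SLOT_NONE;
}
```

The order of the checks matters: COW is tested first, so FILTERED and PRIMARY are only consulted for non-COW children.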

> OK. This change brings one more good thing: we can (and should) get rid
> of all indirect pointers in the block-graph-change transactions:
>
> bdrv_attach_child_common() stores BdrvChild** into transaction to clear
> it on abort.
>
> bdrv_attach_child_common() has two callers: bdrv_attach_child_noperm()
> just passes this feature through, bdrv_root_attach_child() doesn't need
> the feature.
>
> Look at bdrv_attach_child_noperm() callers:
>    - bdrv_attach_child() doesn't need the feature
>    - bdrv_set_file_or_backing_noperm() uses the feature to manage
>      bs->file and bs->backing, we don't want it anymore
>    - bdrv_append() uses the feature to manage bs->backing, again we
>      don't want it anymore
>
> So, we should drop this stuff! Great!
>
> We still keep BdrvChild** argument to return the child and int return
> value, and not move to simply returning BdrvChild*, as we don't want to
> lose int return values.
>
> However, we don't require *@child to be NULL anymore, and even allow
> @child to be NULL if the caller doesn't need the new child pointer.
>
> Finally, we now set .file / .backing automatically in generic code and
> want to restrict setting them by hand outside of .attach/.detach.
> So, this patch cleans up all remaining places where they were set.
> To find such places I use:
>
>    git grep '\->file ='
>    git grep '\->backing ='
>    git grep '&.*\<backing\>'
>    git grep '&.*\<file\>'

Awesome.

block/snapshot-access.c needs a touchup, but other than that, this still 
seems to hold.

> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
> ---
>   block.c                          | 156 ++++++++++++++-----------------
>   block/raw-format.c               |   4 +-
>   block/snapshot.c                 |   1 -
>   include/block/block_int-common.h |  15 ++-
>   tests/unit/test-bdrv-drain.c     |  10 +-
>   5 files changed, 89 insertions(+), 97 deletions(-)
>
> diff --git a/block.c b/block.c
> index 8e8ed639fe..6b43e101a1 100644
> --- a/block.c
> +++ b/block.c
> @@ -1438,9 +1438,33 @@ static void bdrv_child_cb_attach(BdrvChild *child)
>   
>       assert_bdrv_graph_writable(bs);
>       QLIST_INSERT_HEAD(&bs->children, child, next);
> -
> -    if (child->role & BDRV_CHILD_COW) {
> +    if (bs->drv->is_filter | (child->role & BDRV_CHILD_FILTERED)) {

Should be `||`.

> +        /*
> +         * Here we handle filters and block/raw-format.c when it behave like
> +         * filter.

I’d like this comment to expand on how they are handled.

For example, that they generally have a single PRIMARY child, which is 
also the FILTERED child, and that they may have multiple more children, 
but none of them will be a COW child.  So bs->file will be the PRIMARY 
child, unless the PRIMARY child goes into bs->backing on exceptional 
cases; and bs->backing will be nothing else.  (Which is why we ignore 
all other children.)

> +         */
> +        assert(!(child->role & BDRV_CHILD_COW));
> +        if (child->role & (BDRV_CHILD_PRIMARY | BDRV_CHILD_FILTERED)) {

Why do we check for FILTERED here?  It appears to me that PRIMARY is the 
flag that tells us to put this child into bs->file (but for filters, 
sometimes we have to make an exception and put it into bs->backing).

Is the check for FILTERED just a safeguard, so that filter drivers 
always set the two in tandem?  If so, I’d make the condition just `role 
& PRIMARY` and then in an `else` path assert that `!(role & FILTERED)`.
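
Roughly what I have in mind, as a sketch with stand-in types (`struct Node`, `struct Child`, `filter_attach()` and the `ROLE_*` bits are hypothetical and only mirror the names used in the patch):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in role bits; values are arbitrary. */
enum {
    ROLE_DATA     = 1 << 0,
    ROLE_FILTERED = 1 << 1,
    ROLE_COW      = 1 << 2,
    ROLE_PRIMARY  = 1 << 3,
};

struct Child {
    unsigned role;
};

struct Node {
    bool filtered_child_is_backing;
    struct Child *file;
    struct Child *backing;
};

/* Attach path for a filter node: PRIMARY decides, FILTERED is only verified. */
static void filter_attach(struct Node *bs, struct Child *child)
{
    assert(!(child->role & ROLE_COW));

    if (child->role & ROLE_PRIMARY) {
        /* For filters, the PRIMARY child is also the FILTERED one. */
        assert(child->role & ROLE_FILTERED);
        assert(!bs->file && !bs->backing);

        if (bs->filtered_child_is_backing) {
            bs->backing = child;
        } else {
            bs->file = child;
        }
    } else {
        /* Any other child of a filter must not claim to be FILTERED. */
        assert(!(child->role & ROLE_FILTERED));
    }
}
```

That way a driver that sets only one of the two bits trips an assertion in either branch, and the condition itself says what actually selects the pointer.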

> +            assert(child->role & BDRV_CHILD_PRIMARY);
> +            assert(child->role & BDRV_CHILD_FILTERED);
> +            assert(!bs->backing);
> +            assert(!bs->file);
> +
> +            if (bs->drv->filtered_child_is_backing) {
> +                bs->backing = child;
> +            } else {
> +                bs->file = child;
> +            }
> +        }

[...]

> @@ -2897,11 +2925,11 @@ static TransactionActionDrv bdrv_attach_child_common_drv = {
>   /*
>    * Common part of attaching bdrv child to bs or to blk or to job
>    *
> - * Resulting new child is returned through @child.
> - * At start *@child must be NULL.
> - * @child is saved to a new entry of @tran, so that *@child could be reverted to
> - * NULL on abort(). So referenced variable must live at least until transaction
> - * end.
> + * If @child is not NULL, it's set to new created child. Note, that @child
> + * pointer is stored in the transaction and therefore not cleared on abort.

I can’t quite parse this comment.  It doesn’t look like `child` is 
stored in the transaction.  I mean, `new_child` is, which is what 
`*child` is, but there’s a difference between `@child` and `*child` (or 
`*@child`) after all.

Or is there a “not” missing, i.e. “that the @child pointer is not stored 
in the transaction”?  That would also make more sense, why it isn’t 
cleared on abort.

I’d also like to ask for this to be even more clear, e.g. by adding a 
sentence “When this transaction is aborted, the pointer stored in 
*@child becomes invalid.”

> + * Consider @child as part of return value: we may change the return value of
> + * the function to BdrvChild* and return child directly, but this way we lose
> + * different return codes.

I mean, do we even care about return codes?  I hope not, but maybe we 
do?  We do have `errp` for a description, and I think the only 
distinction we make in the block layer based on error codes is ENOSPC 
vs. anything else.  I hope this function never returns ENOSPC, so I 
think the return value shouldn’t matter.

(I can understand that it seems like a loss if we can no longer decide 
between e.g. EINVAL and EPERM, but I don’t think it really is.  We could 
just make it EINVAL always and it shouldn’t matter.)

>    *
>    * Function doesn't update permissions, caller is responsible for this.
>    */




* Re: [PATCH v5 14/45] block/snapshot: drop indirection around bdrv_snapshot_fallback_ptr
  2022-03-30 21:28 ` [PATCH v5 14/45] block/snapshot: drop indirection around bdrv_snapshot_fallback_ptr Vladimir Sementsov-Ogievskiy
@ 2022-06-07 15:58   ` Hanna Reitz
  2022-06-09 14:44     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 78+ messages in thread
From: Hanna Reitz @ 2022-06-07 15:58 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: qemu-devel, kwolf, vsementsov, v.sementsov-og

On 30.03.22 23:28, Vladimir Sementsov-Ogievskiy wrote:
> Now that the indirection is not actually used, we can safely reduce it to
> a simple pointer.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
> ---
>   block/snapshot.c | 39 +++++++++++++++++----------------------
>   1 file changed, 17 insertions(+), 22 deletions(-)

Looks good, just wondering whether we should drop some of the "_ptr" 
suffixes now.

> diff --git a/block/snapshot.c b/block/snapshot.c
> index 02a880911f..4eb9258de6 100644
> --- a/block/snapshot.c
> +++ b/block/snapshot.c
> @@ -151,34 +151,29 @@ bool bdrv_snapshot_find_by_id_and_name(BlockDriverState *bs,
>   }
>   
>   /**
> - * Return a pointer to the child BDS pointer to which we can fall
> + * Return a pointer to child of given BDS to which we can fall
>    * back if the given BDS does not support snapshots.
>    * Return NULL if there is no BDS to (safely) fall back to.
> - *
> - * We need to return an indirect pointer because bdrv_snapshot_goto()
> - * has to modify the BdrvChild pointer.
>    */
> -static BdrvChild **bdrv_snapshot_fallback_ptr(BlockDriverState *bs)
> +static BdrvChild *bdrv_snapshot_fallback_ptr(BlockDriverState *bs)

The _ptr part was meant to point out that it returns an indirect 
pointer; maybe we should name it bdrv_snapshot_fallback_child() now?

>   {
> -    BdrvChild **fallback;
> -    BdrvChild *child = bdrv_primary_child(bs);
> +    BdrvChild *fallback = bdrv_primary_child(bs);
> +    BdrvChild *child;
>   
>       /* We allow fallback only to primary child */
> -    if (!child) {
> +    if (!fallback) {
>           return NULL;
>       }
> -    fallback = (child == bs->file ? &bs->file : &bs->backing);
> -    assert(*fallback == child);
>   
>       /*
>        * Check that there are no other children that would need to be
>        * snapshotted.  If there are, it is not safe to fall back to
> -     * *fallback.
> +     * fallback.
>        */
>       QLIST_FOREACH(child, &bs->children, next) {
>           if (child->role & (BDRV_CHILD_DATA | BDRV_CHILD_METADATA |
>                              BDRV_CHILD_FILTERED) &&
> -            child != *fallback)
> +            child != fallback)
>           {
>               return NULL;
>           }
> @@ -189,8 +184,8 @@ static BdrvChild **bdrv_snapshot_fallback_ptr(BlockDriverState *bs)
>   
>   static BlockDriverState *bdrv_snapshot_fallback(BlockDriverState *bs)
>   {
> -    BdrvChild **child_ptr = bdrv_snapshot_fallback_ptr(bs);

Just "child" is enough (and better) now, I think.

> -    return child_ptr ? (*child_ptr)->bs : NULL;
> +    BdrvChild *child_ptr = bdrv_snapshot_fallback_ptr(bs);
> +    return child_ptr ? child_ptr->bs : NULL;
>   }
>   
>   int bdrv_can_snapshot(BlockDriverState *bs)




* Re: [PATCH v5 15/45] block: refactor bdrv_remove_file_or_backing_child to bdrv_remove_child
  2022-03-30 21:28 ` [PATCH v5 15/45] block: refactor bdrv_remove_file_or_backing_child to bdrv_remove_child Vladimir Sementsov-Ogievskiy
@ 2022-06-08 10:04   ` Hanna Reitz
  0 siblings, 0 replies; 78+ messages in thread
From: Hanna Reitz @ 2022-06-08 10:04 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: qemu-devel, kwolf, vsementsov, v.sementsov-og

On 30.03.22 23:28, Vladimir Sementsov-Ogievskiy wrote:
> Now the function can remove any child, so give it a more generic name.
> Drop assertions and drop the bs argument, which becomes unused. The function
> will be reused in a further commit.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
> ---
>   block.c | 22 ++++++++--------------
>   1 file changed, 8 insertions(+), 14 deletions(-)

Good!

> diff --git a/block.c b/block.c
> index 6b43e101a1..ea5687edc8 100644
> --- a/block.c
> +++ b/block.c

[...]

> -static TransactionActionDrv bdrv_remove_filter_or_cow_child_drv = {
> -    .commit = bdrv_remove_filter_or_cow_child_commit,
> +static TransactionActionDrv bdrv_remove_child_drv = {
> +    .commit = bdrv_remove_child_commit,
>   };
>   
>   /*
>    * A function to remove backing or file child of @bs.

I think it’d make sense to update this description here.

>    * Function doesn't update permissions, caller is responsible for this.
>    */
> -static void bdrv_remove_file_or_backing_child(BlockDriverState *bs,
> -                                              BdrvChild *child,
> -                                              Transaction *tran)
> +static void bdrv_remove_child(BdrvChild *child, Transaction *tran)
>   {
> -    assert(child == bs->backing || child == bs->file);
> -
>       if (!child) {
>           return;
>       }




* Re: [PATCH v5 16/45] block: drop bdrv_detach_child()
  2022-03-30 21:28 ` [PATCH v5 16/45] block: drop bdrv_detach_child() Vladimir Sementsov-Ogievskiy
@ 2022-06-08 10:22   ` Hanna Reitz
  0 siblings, 0 replies; 78+ messages in thread
From: Hanna Reitz @ 2022-06-08 10:22 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: qemu-devel, kwolf, vsementsov, v.sementsov-og

On 30.03.22 23:28, Vladimir Sementsov-Ogievskiy wrote:
> The only caller is bdrv_root_unref_child(), so let's just do the logic
> directly in it. It simplifies further conversion of
> bdrv_root_unref_child() to a transaction action.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
> ---
>   block.c | 45 ++++++++++++++++++---------------------------
>   1 file changed, 18 insertions(+), 27 deletions(-)

Reviewed-by: Hanna Reitz <hreitz@redhat.com>




* Re: [PATCH v5 17/45] block: drop bdrv_remove_filter_or_cow_child
  2022-03-30 21:28 ` [PATCH v5 17/45] block: drop bdrv_remove_filter_or_cow_child Vladimir Sementsov-Ogievskiy
@ 2022-06-08 10:40   ` Hanna Reitz
  0 siblings, 0 replies; 78+ messages in thread
From: Hanna Reitz @ 2022-06-08 10:40 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: qemu-devel, kwolf, vsementsov, v.sementsov-og

On 30.03.22 23:28, Vladimir Sementsov-Ogievskiy wrote:
> Drop this simple wrapper used only in one place. We have too many graph
> modifying functions even without it.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
> ---
>   block.c | 15 +--------------
>   1 file changed, 1 insertion(+), 14 deletions(-)

Reviewed-by: Hanna Reitz <hreitz@redhat.com>




* Re: [PATCH v5 18/45] block: bdrv_refresh_perms(): allow external tran
  2022-03-30 21:28 ` [PATCH v5 18/45] block: bdrv_refresh_perms(): allow external tran Vladimir Sementsov-Ogievskiy
@ 2022-06-08 10:57   ` Hanna Reitz
  0 siblings, 0 replies; 78+ messages in thread
From: Hanna Reitz @ 2022-06-08 10:57 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: qemu-devel, kwolf, vsementsov, v.sementsov-og

On 30.03.22 23:28, Vladimir Sementsov-Ogievskiy wrote:
> Allow passing external Transaction pointer, stop creating extra
> Transaction objects.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
> ---
>   block.c | 31 ++++++++++++++++++++-----------
>   1 file changed, 20 insertions(+), 11 deletions(-)

Reviewed-by: Hanna Reitz <hreitz@redhat.com>




* Re: [PATCH v5 19/45] block: refactor bdrv_list_refresh_perms to allow any list of nodes
  2022-03-30 21:28 ` [PATCH v5 19/45] block: refactor bdrv_list_refresh_perms to allow any list of nodes Vladimir Sementsov-Ogievskiy
@ 2022-06-08 11:27   ` Hanna Reitz
  0 siblings, 0 replies; 78+ messages in thread
From: Hanna Reitz @ 2022-06-08 11:27 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: qemu-devel, kwolf, vsementsov, v.sementsov-og

On 30.03.22 23:28, Vladimir Sementsov-Ogievskiy wrote:
> We are going to increase usage of collecting nodes in a list to then
> update, and calling bdrv_topological_dfs() each time is not convenient,
> and not correct as we are going to interleave graph modifying with
> filling the node list.
>
> So, let's switch to a function that takes any list of nodes, adds all
> their subtrees and does a topological sort. And finally, refreshes
> permissions.
>
> While at it, make the function public, as we'll want to use it
> from blockdev.c in the near future.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
> ---
>   block.c | 51 ++++++++++++++++++++++++++++++++-------------------
>   1 file changed, 32 insertions(+), 19 deletions(-)
>
> diff --git a/block.c b/block.c
> index f3ed351360..9009f73534 100644
> --- a/block.c
> +++ b/block.c

[...]

> @@ -2510,6 +2514,24 @@ static int bdrv_list_refresh_perms(GSList *list, BlockReopenQueue *q,
>       return 0;
>   }
>   
> +/*
> + * @list is any list of nodes. List is completed by all subtreees and

*subtrees

With that fixed:

Reviewed-by: Hanna Reitz <hreitz@redhat.com>

> + * topologically sorted. It's not a problem if some node occurs in the @list
> + * several times.
> + */
> +static int bdrv_list_refresh_perms(GSList *list, BlockReopenQueue *q,
> +                                   Transaction *tran, Error **errp)
> +{
> +    g_autoptr(GHashTable) found = g_hash_table_new(NULL, NULL);
> +    g_autoptr(GSList) refresh_list = NULL;
> +
> +    for ( ; list; list = list->next) {
> +        refresh_list = bdrv_topological_dfs(refresh_list, found, list->data);
> +    }
> +
> +    return bdrv_do_refresh_perms(refresh_list, q, tran, errp);
> +}
> +
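
For readers following along, the idea is a depth-first walk from every listed node that shares one "found" set, so duplicates and overlapping subtrees are harmless, and that yields parents before children. Below is a self-contained sketch of that idea with fixed-size arrays instead of GSList/GHashTable; `struct Node`, `struct Order`, `dfs()` and `topo_sort()` are hypothetical names, not the block layer's.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_NODES    8
#define MAX_CHILDREN 4

struct Node {
    int id;                              /* index into found[], 0..MAX_NODES-1 */
    size_t n_children;
    struct Node *children[MAX_CHILDREN];
};

struct Order {
    struct Node *post[MAX_NODES];        /* nodes in post-order (children first) */
    size_t len;
    bool found[MAX_NODES];               /* shared across all starting nodes */
};

static void dfs(struct Order *o, struct Node *n)
{
    if (o->found[n->id]) {
        return;                          /* already reached via another path */
    }
    o->found[n->id] = true;
    for (size_t i = 0; i < n->n_children; i++) {
        dfs(o, n->children[i]);
    }
    o->post[o->len++] = n;
}

/*
 * Collect the subtrees of all @roots and store them in @out so that every
 * parent comes before its children. Returns the number of nodes stored.
 */
static size_t topo_sort(struct Node **roots, size_t n_roots, struct Node **out)
{
    struct Order o = { .len = 0 };

    for (size_t i = 0; i < n_roots; i++) {
        dfs(&o, roots[i]);
    }
    /* Reversed post-order is a topological order of a DAG. */
    for (size_t i = 0; i < o.len; i++) {
        out[i] = o.post[o.len - 1 - i];
    }
    return o.len;
}
```

With a diamond-shaped graph (top over fl1 and fl2, both over base) and the starting list {fl2, top, fl2}, each node appears once in the result and base comes last, after both of its parents.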




* Re: [PATCH v5 20/45] block: make permission update functions public
  2022-03-30 21:28 ` [PATCH v5 20/45] block: make permission update functions public Vladimir Sementsov-Ogievskiy
@ 2022-06-08 11:31   ` Hanna Reitz
  0 siblings, 0 replies; 78+ messages in thread
From: Hanna Reitz @ 2022-06-08 11:31 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: qemu-devel, kwolf, vsementsov, v.sementsov-og

On 30.03.22 23:28, Vladimir Sementsov-Ogievskiy wrote:
> We'll need them in further commits in blockdev.c for new transaction
> block-graph modifying API.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
> ---
>   block.c                            | 7 +++----
>   include/block/block-global-state.h | 4 ++++
>   2 files changed, 7 insertions(+), 4 deletions(-)

Reviewed-by: Hanna Reitz <hreitz@redhat.com>




* Re: [PATCH v5 21/45] block: add bdrv_try_set_aio_context_tran transaction action
  2022-03-30 21:28 ` [PATCH v5 21/45] block: add bdrv_try_set_aio_context_tran transaction action Vladimir Sementsov-Ogievskiy
@ 2022-06-08 11:49   ` Hanna Reitz
  2022-06-09 14:56     ` Vladimir Sementsov-Ogievskiy
  2022-06-13  7:46   ` Hanna Reitz
  1 sibling, 1 reply; 78+ messages in thread
From: Hanna Reitz @ 2022-06-08 11:49 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: qemu-devel, kwolf, vsementsov, v.sementsov-og

On 30.03.22 23:28, Vladimir Sementsov-Ogievskiy wrote:
> To be used in further commit.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
> ---
>   block.c | 48 ++++++++++++++++++++++++++++++++++++++++++++++++
>   1 file changed, 48 insertions(+)

Looking at bdrv_child_try_set_aio_context(), it looks like
bdrv_can_set_aio_context() was supposed to be the .prepare action, and
bdrv_set_aio_context_ignore() should be the .commit action.  Can we not
use it that way?
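
To illustrate the split I mean, here is a generic two-phase sketch. All names (`struct Node`, `can_set_ctx()`, `set_ctx()`, `try_set_ctx()`) are made up for the example and are not the block layer functions; the point is only that the first phase may fail but changes nothing, and the second phase changes state but cannot fail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct Node {
    int ctx;        /* stand-in for the node's current context */
    bool pinned;    /* hypothetical: node refuses to change context */
};

/* Phase 1 ("prepare"): may fail, must not modify anything. */
static bool can_set_ctx(const struct Node *nodes, size_t n, int new_ctx)
{
    for (size_t i = 0; i < n; i++) {
        if (nodes[i].pinned && nodes[i].ctx != new_ctx) {
            return false;
        }
    }
    return true;
}

/* Phase 2 ("commit"): only called after a successful check, cannot fail. */
static void set_ctx(struct Node *nodes, size_t n, int new_ctx)
{
    for (size_t i = 0; i < n; i++) {
        nodes[i].ctx = new_ctx;
    }
}

/* Returns 0 on success, -1 if the change is refused; no partial updates. */
static int try_set_ctx(struct Node *nodes, size_t n, int new_ctx)
{
    if (!can_set_ctx(nodes, n, new_ctx)) {
        return -1;
    }
    set_ctx(nodes, n, new_ctx);
    return 0;
}
```

If the check phase fails, nothing has been touched, so there is nothing to roll back on abort.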




* Re: [PATCH v5 04/45] test-bdrv-graph-mod: update test_parallel_perm_update test case
  2022-06-07 10:53   ` Hanna Reitz
@ 2022-06-09 13:08     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 78+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2022-06-09 13:08 UTC (permalink / raw)
  To: Hanna Reitz, Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: qemu-devel, kwolf, vsementsov, v.sementsov-og

On 6/7/22 13:53, Hanna Reitz wrote:
> On 30.03.22 23:28, Vladimir Sementsov-Ogievskiy wrote:
>> test_parallel_perm_update() does two things that we are going to
>> restrict in the near future:
>>
>> 1. It updates bs->file field by hand. bs->file will be managed
>>     automatically by generic code (together with bs->children list).
>>
>>     Let's refactor our "tricky" bds to have its own state where one
>>     of the children is linked as "selected".
>>     This also looks less "tricky", so avoid using this word.
>>
>> 2. It creates FILTERED children that are not PRIMARY. Except for tests,
>>     all FILTERED children in the QEMU block layer are always PRIMARY as
>>     well.  We are going to formalize this rule, so let's use DATA
>>     children here.
> 
> Another thing is that any node may have at most one FILTERED child at a time, which was already formalized in BDRV_CHILD_FILTERED’s description.

Right, will add

> 
>> While being here, update the picture to better correspond to the test
>> code.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
>> ---
> 
> The change looks good, I’m just a bit confused when it comes to the comment describing what’s going on.
> 
>>   tests/unit/test-bdrv-graph-mod.c | 70 ++++++++++++++++++++------------
>>   1 file changed, 44 insertions(+), 26 deletions(-)
>>
>> diff --git a/tests/unit/test-bdrv-graph-mod.c b/tests/unit/test-bdrv-graph-mod.c
>> index a6e3bb79be..40795d3c04 100644
>> --- a/tests/unit/test-bdrv-graph-mod.c
>> +++ b/tests/unit/test-bdrv-graph-mod.c
> 
> [...]
> 
>> @@ -266,15 +280,18 @@ static BlockDriver bdrv_write_to_file = {
>>    * The following test shows that topological-sort order is required for
>>    * permission update, simple DFS is not enough.
>>    *
>> - * Consider the block driver which has two filter children: one active
>> - * with exclusive write access and one inactive with no specific
>> - * permissions.
>> + * Consider the block driver (write-to-selected) which has two children: one is
>> + * selected so we have exclusive write access to it and for the other one we
>> + * don't need any specific permissions.
>>    *
>>    * And, these two children has a common base child, like this:
>> + *   (additional "top" on top is used in test just because the only public
>> + *    function to update permission should get a specific child to update.
>> + *    Making bdrv_refresh_perms() public just for this test doesn't worth it)
> 
> s/doesn't/isn't/
> 
>>    *
>> - * ┌─────┐     ┌──────┐
>> - * │ fl2 │ ◀── │ top  │
>> - * └─────┘     └──────┘
>> + * ┌─────┐     ┌───────────────────┐     ┌─────┐
>> + * │ fl2 │ ◀── │ write-to-selected │ ◀── │ top │
>> + * └─────┘     └───────────────────┘     └─────┘
>>    *   │           │
>>    *   │           │ w
>>    *   │           ▼
>> @@ -290,7 +307,7 @@ static BlockDriver bdrv_write_to_file = {
>>    *
>>    * So, exclusive write is propagated.
>>    *
>> - * Assume, we want to make fl2 active instead of fl1.
>> + * Assume, we want to select fl2  instead of fl1.
> 
> There’s a double space after “fl2”.
> 
>>    * So, we set some option for top driver and do permission update.
> 
> Here and in the rest of the comment, it’s now unclear what node “top” refers to.  I think it’s still the now-renamed “write-to-selected” node, right?  But “top” is now a different node, so I’m not 100% sure.

Right, will fix.

> 
> (On the other hand, even before this patch, there was a “top” node that was distinct from the former “tricky” node...  So it seems like this comment was already not quite right before?)

Hmm, yes. I tried to make this clearer, but didn't update the whole comment.

> 
>>    *
>>    * With simple DFS, if permission update goes first through
> 
> 


-- 
Best regards,
Vladimir



* Re: [PATCH v5 13/45] block: Manipulate bs->file / bs->backing pointers in .attach/.detach
  2022-06-07 15:55   ` Hanna Reitz
@ 2022-06-09 13:40     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 78+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2022-06-09 13:40 UTC (permalink / raw)
  To: Hanna Reitz, Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: qemu-devel, kwolf, vsementsov, v.sementsov-og

On 6/7/22 18:55, Hanna Reitz wrote:
> On 30.03.22 23:28, Vladimir Sementsov-Ogievskiy wrote:
>> bs->file and bs->backing are a kind of duplication of part of
>> bs->children. But very useful diplication, so let's not drop them at
>> all:)
>>
>> We should manage bs->file and bs->backing in same place, where we
>> manage bs->children, to keep them in sync.
>>
>> Moreover, generic io paths are unprepared to BdrvChild without a bs, so
>> it's double good to clear bs->file / bs->backing when we detach the
>> child.
> 
> I think this was reproducible (rarely) with 030, but I can’t reproduce it now.  Oh well.
> 
>> Detach is simple: if we detach bs->file or bs->backing child, just
>> set corresponding field to NULL.
>>
>> Attach is a bit more complicated. But we still can precisely detect
>> should we set one of bs->file / bs->backing or not:
>>
>> - if role is BDRV_CHILD_COW, we definitely deal with bs->backing
>> - else, if role is BDRV_CHILD_FILTERED (it must be also
>>    BDRV_CHILD_PRIMARY), it's a filtered child. Use
>>    bs->drv->filtered_child_is_backing to chose the pointer field to
>>    modify.
>> - else, if role is BDRV_CHILD_PRIMARY, we deal with bs->file
>> - in all other cases, it's neither bs->backing nor bs->file. It's some
>>    other child and we shouldn't care
> 
> Sounds correct.
> 
>> OK. This change brings one more good thing: we can (and should) get rid
>> of all indirect pointers in the block-graph-change transactions:
>>
>> bdrv_attach_child_common() stores BdrvChild** into transaction to clear
>> it on abort.
>>
>> bdrv_attach_child_common() has two callers: bdrv_attach_child_noperm()
>> just pass-through this feature, bdrv_root_attach_child() doesn't need
>> the feature.
>>
>> Look at bdrv_attach_child_noperm() callers:
>>    - bdrv_attach_child() doesn't need the feature
>>    - bdrv_set_file_or_backing_noperm() uses the feature to manage
>>      bs->file and bs->backing, we don't want it anymore
>>    - bdrv_append() uses the feature to manage bs->backing, again we
>>      don't want it anymore
>>
>> So, we should drop this stuff! Great!
>>
>> We still keep BdrvChild** argument to return the child and int return
>> value, and not move to simply returning BdrvChild*, as we don't want to
>> lose int return values.
>>
>> However we don't require *@child to be NULL anymore, and even allow
>> @child to be NULL, if caller don't need the new child pointer.
>>
>> Finally, we now set .file / .backing automatically in generic code and
>> want to restring setting them by hand outside of .attach/.detach.
>> So, this patch cleanups all remaining places where they were set.
>> To find such places I use:
>>
>>    git grep '\->file ='
>>    git grep '\->backing ='
>>    git grep '&.*\<backing\>'
>>    git grep '&.*\<file\>'
> 
> Awesome.
> 
> block/snapshot-access.c needs a touchup, but other than that, this still seems to hold.
> 
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
>> ---
>>   block.c                          | 156 ++++++++++++++-----------------
>>   block/raw-format.c               |   4 +-
>>   block/snapshot.c                 |   1 -
>>   include/block/block_int-common.h |  15 ++-
>>   tests/unit/test-bdrv-drain.c     |  10 +-
>>   5 files changed, 89 insertions(+), 97 deletions(-)
>>
>> diff --git a/block.c b/block.c
>> index 8e8ed639fe..6b43e101a1 100644
>> --- a/block.c
>> +++ b/block.c
>> @@ -1438,9 +1438,33 @@ static void bdrv_child_cb_attach(BdrvChild *child)
>>       assert_bdrv_graph_writable(bs);
>>       QLIST_INSERT_HEAD(&bs->children, child, next);
>> -
>> -    if (child->role & BDRV_CHILD_COW) {
>> +    if (bs->drv->is_filter | (child->role & BDRV_CHILD_FILTERED)) {
> 
> Should be `||`.
> 
>> +        /*
>> +         * Here we handle filters and block/raw-format.c when it behave like
>> +         * filter.
> 
> I’d like this comment to expand on how they are handled.
> 
> For example, that they generally have a single PRIMARY child, which is also the FILTERED child, and that they may have multiple more children, but none of them will be a COW child.  So bs->file will be the PRIMARY child, unless the PRIMARY child goes into bs->backing on exceptional cases; and bs->backing will be nothing else.  (Which is why we ignore all other children.)
> 
>> +         */
>> +        assert(!(child->role & BDRV_CHILD_COW));
>> +        if (child->role & (BDRV_CHILD_PRIMARY | BDRV_CHILD_FILTERED)) {
> 
> Why do we check for FILTERED here?  It appears to me that PRIMARY is the flag that tells us to put this child into bs->file (but for filters, sometimes we have to make an exception and put it into bs->backing).
> 
> Is the check for FILTERED just a safeguard, so that filter drivers always set the two in tandem?  If so, I’d make the condition just `role & PRIMARY` and then in an `else` path assert that `!(role & FILTERED)`.

Agree
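
A minimal sketch of how the attach path could look with both review points applied (logical `||` in the outer check, `role & PRIMARY` as the inner condition, and an assertion in the `else` path). The types and role bits below are stand-ins invented for illustration; they are not the real QEMU definitions, and `attach_filter_child()` is a hypothetical name:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in role bits; the real BDRV_CHILD_* values live in QEMU headers. */
enum {
    ROLE_DATA     = 1 << 0,
    ROLE_FILTERED = 1 << 1,
    ROLE_COW      = 1 << 2,
    ROLE_PRIMARY  = 1 << 3,
};

typedef struct Child {
    unsigned role;
} Child;

typedef struct Node {
    bool is_filter;
    bool filtered_child_is_backing;
    Child *file;
    Child *backing;
} Node;

/* Hypothetical helper: link @child into node->file or node->backing. */
void attach_filter_child(Node *node, Child *child)
{
    /* Logical OR, as requested in review (the patch used bitwise |). */
    if (!(node->is_filter || (child->role & ROLE_FILTERED))) {
        return; /* non-filter nodes are out of scope for this sketch */
    }

    assert(!(child->role & ROLE_COW));

    if (child->role & ROLE_PRIMARY) {
        /* As in the quoted patch: PRIMARY and FILTERED come in tandem. */
        assert(child->role & ROLE_FILTERED);
        assert(!node->backing);
        assert(!node->file);

        if (node->filtered_child_is_backing) {
            node->backing = child;
        } else {
            node->file = child;
        }
    } else {
        /* Safeguard: a non-PRIMARY child must not be FILTERED. */
        assert(!(child->role & ROLE_FILTERED));
    }
}
```

With this shape, any other child of a filter (for example a plain DATA child) is simply ignored and leaves both pointers untouched.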

> 
>> +            assert(child->role & BDRV_CHILD_PRIMARY);
>> +            assert(child->role & BDRV_CHILD_FILTERED);
>> +            assert(!bs->backing);
>> +            assert(!bs->file);
>> +
>> +            if (bs->drv->filtered_child_is_backing) {
>> +                bs->backing = child;
>> +            } else {
>> +                bs->file = child;
>> +            }
>> +        }
> 
> [...]
> 
>> @@ -2897,11 +2925,11 @@ static TransactionActionDrv bdrv_attach_child_common_drv = {
>>   /*
>>    * Common part of attaching bdrv child to bs or to blk or to job
>>    *
>> - * Resulting new child is returned through @child.
>> - * At start *@child must be NULL.
>> - * @child is saved to a new entry of @tran, so that *@child could be reverted to
>> - * NULL on abort(). So referenced variable must live at least until transaction
>> - * end.
>> + * If @child is not NULL, it's set to new created child. Note, that @child
>> + * pointer is stored in the transaction and therefore not cleared on abort.
> 
> I can’t quite parse this comment.  It doesn’t look like `child` is stored in the transaction.  I mean, `new_child` is, which is what `*child` is, but there’s a difference between `@child` and `*child` (or `*@child`) after all.
> 
> Or is there a “not” missing, i.e. “that the @child pointer is not stored in the transaction”?  That would also make more sense, why it isn’t cleared on abort.

Yes, "not" is missing, sorry)

> 
> I’d also like to ask for this to be even more clear, e.g. by adding a sentence “When this transaction is aborted, the pointer stored in *@child becomes invalid.”

OK

> 
>> + * Consider @child as part of return value: we may change the return value of
>> + * the function to BdrvChild* and return child directly, but this way we lose
>> + * different return codes.
> 
> I mean, do we even care about return codes?  I hope not, but maybe we do?  We do have `errp` for a description, and I think the only distinction we make in the block layer based on error codes is ENOSPC vs. anything else.  I hope this function never returns ENOSPC, so I think the return value shouldn’t matter.
> 
> (I can understand that it seems like a loss if we can no longer decide between e.g. EINVAL and EPERM, but I don’t think it really is.  We could just make it EINVAL always and it shouldn’t matter.)
> 

Hmm. Seems reasonable. I'll check whether we can switch to simply returning the child.


-- 
Best regards,
Vladimir



* Re: [PATCH v5 14/45] block/snapshot: drop indirection around bdrv_snapshot_fallback_ptr
  2022-06-07 15:58   ` Hanna Reitz
@ 2022-06-09 14:44     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 78+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2022-06-09 14:44 UTC (permalink / raw)
  To: Hanna Reitz, Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: qemu-devel, kwolf, vsementsov, v.sementsov-og

On 6/7/22 18:58, Hanna Reitz wrote:
> On 30.03.22 23:28, Vladimir Sementsov-Ogievskiy wrote:
>> Now the indirection is not actually used, we can safely reduce it to
>> simple pointer.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
>> ---
>>   block/snapshot.c | 39 +++++++++++++++++----------------------
>>   1 file changed, 17 insertions(+), 22 deletions(-)
> 
> Looks good, just wondering whether we should drop some of the "_ptr" suffixes now.
> 
>> diff --git a/block/snapshot.c b/block/snapshot.c
>> index 02a880911f..4eb9258de6 100644
>> --- a/block/snapshot.c
>> +++ b/block/snapshot.c
>> @@ -151,34 +151,29 @@ bool bdrv_snapshot_find_by_id_and_name(BlockDriverState *bs,
>>   }
>>   /**
>> - * Return a pointer to the child BDS pointer to which we can fall
>> + * Return a pointer to child of given BDS to which we can fall
>>    * back if the given BDS does not support snapshots.
>>    * Return NULL if there is no BDS to (safely) fall back to.
>> - *
>> - * We need to return an indirect pointer because bdrv_snapshot_goto()
>> - * has to modify the BdrvChild pointer.
>>    */
>> -static BdrvChild **bdrv_snapshot_fallback_ptr(BlockDriverState *bs)
>> +static BdrvChild *bdrv_snapshot_fallback_ptr(BlockDriverState *bs)
> 
> The _ptr part was meant to point out that it returns an indirect pointer; maybe we should name it bdrv_snapshot_fallback_child() now?
> 
>>   {
>> -    BdrvChild **fallback;
>> -    BdrvChild *child = bdrv_primary_child(bs);
>> +    BdrvChild *fallback = bdrv_primary_child(bs);
>> +    BdrvChild *child;
>>       /* We allow fallback only to primary child */
>> -    if (!child) {
>> +    if (!fallback) {
>>           return NULL;
>>       }
>> -    fallback = (child == bs->file ? &bs->file : &bs->backing);
>> -    assert(*fallback == child);
>>       /*
>>        * Check that there are no other children that would need to be
>>        * snapshotted.  If there are, it is not safe to fall back to
>> -     * *fallback.
>> +     * fallback.
>>        */
>>       QLIST_FOREACH(child, &bs->children, next) {
>>           if (child->role & (BDRV_CHILD_DATA | BDRV_CHILD_METADATA |
>>                              BDRV_CHILD_FILTERED) &&
>> -            child != *fallback)
>> +            child != fallback)
>>           {
>>               return NULL;
>>           }
>> @@ -189,8 +184,8 @@ static BdrvChild **bdrv_snapshot_fallback_ptr(BlockDriverState *bs)
>>   static BlockDriverState *bdrv_snapshot_fallback(BlockDriverState *bs)
>>   {
>> -    BdrvChild **child_ptr = bdrv_snapshot_fallback_ptr(bs);
> 
> Just "child" is enough (and better) now, I think.
> 
>> -    return child_ptr ? (*child_ptr)->bs : NULL;
>> +    BdrvChild *child_ptr = bdrv_snapshot_fallback_ptr(bs);
>> +    return child_ptr ? child_ptr->bs : NULL;
>>   }
>>   int bdrv_can_snapshot(BlockDriverState *bs)
> 
> 

Agreed on all comments, will do.

-- 
Best regards,
Vladimir



* Re: [PATCH v5 21/45] block: add bdrv_try_set_aio_context_tran transaction action
  2022-06-08 11:49   ` Hanna Reitz
@ 2022-06-09 14:56     ` Vladimir Sementsov-Ogievskiy
  2022-06-13  7:12       ` Hanna Reitz
  0 siblings, 1 reply; 78+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2022-06-09 14:56 UTC (permalink / raw)
  To: Hanna Reitz, Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: qemu-devel, kwolf, vsementsov, v.sementsov-og

On 6/8/22 14:49, Hanna Reitz wrote:
> On 30.03.22 23:28, Vladimir Sementsov-Ogievskiy wrote:
>> To be used in further commit.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
>> ---
>>   block.c | 48 ++++++++++++++++++++++++++++++++++++++++++++++++
>>   1 file changed, 48 insertions(+)
> 
> Looking at bdrv_child_try_set_aio_context(), it looks like bdrv_can_set_aio_context() were supposed to be the .prepare action, and bdrv_set_aio_context_ignore() should be the .commit action.  Can we not use it that way?
> 
> 


The difference is that we want the whole action to be done in the .prepare stage, not only the check. That's generally better: when we do several actions in one transaction, later actions usually depend on the results of earlier ones.

And I think it's necessary for graph updates. Graph relations are changed during other actions' .prepare phases, so I can't imagine how to postpone the aio-context update to the .commit phase.


But I agree that having both the _can_ / _set_ and the *tran APIs doesn't look good. Maybe we can refactor it, but not in this series, I think)
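
A minimal sketch of the "do everything in prepare, undo in abort" pattern described above. This is a stand-in model, not the QEMU transaction API: `MiniTran`, `mini_tran_*`, `SetCtxState` and `set_ctx_tran()` are hypothetical names, and the AioContext is reduced to a plain integer:

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ACTIONS 8

typedef struct Action {
    void (*abort)(void *opaque);   /* undo the work done at prepare time */
    void (*commit)(void *opaque);  /* finalize; may be NULL */
    void *opaque;
} Action;

typedef struct MiniTran {
    Action actions[MAX_ACTIONS];
    size_t n;
} MiniTran;

void mini_tran_add(MiniTran *t, Action a)
{
    assert(t->n < MAX_ACTIONS);
    t->actions[t->n++] = a;
}

/* Roll back in reverse order, so each undo sees the state it expects. */
void mini_tran_abort(MiniTran *t)
{
    while (t->n > 0) {
        Action *a = &t->actions[--t->n];
        if (a->abort) {
            a->abort(a->opaque);
        }
    }
}

void mini_tran_commit(MiniTran *t)
{
    for (size_t i = 0; i < t->n; i++) {
        if (t->actions[i].commit) {
            t->actions[i].commit(t->actions[i].opaque);
        }
    }
    t->n = 0;
}

typedef struct SetCtxState {
    int *ctx;     /* the value being changed (stands in for the context) */
    int old_ctx;  /* saved so that abort can restore it */
} SetCtxState;

static void set_ctx_abort(void *opaque)
{
    SetCtxState *s = opaque;
    *s->ctx = s->old_ctx;
}

/*
 * The whole change happens here, at prepare time, so that actions added
 * later in the same transaction already observe the new value.
 */
void set_ctx_tran(int *ctx, int new_ctx, SetCtxState *s, MiniTran *t)
{
    s->ctx = ctx;
    s->old_ctx = *ctx;
    *ctx = new_ctx;
    mini_tran_add(t, (Action){ .abort = set_ctx_abort, .opaque = s });
}
```

A second `set_ctx_tran()` call in the same transaction can read the value written by the first one, which would not be possible if the update were deferred to commit.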

-- 
Best regards,
Vladimir



* Re: [PATCH v5 21/45] block: add bdrv_try_set_aio_context_tran transaction action
  2022-06-09 14:56     ` Vladimir Sementsov-Ogievskiy
@ 2022-06-13  7:12       ` Hanna Reitz
  0 siblings, 0 replies; 78+ messages in thread
From: Hanna Reitz @ 2022-06-13  7:12 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: qemu-devel, kwolf, vsementsov, v.sementsov-og

On 09.06.22 16:56, Vladimir Sementsov-Ogievskiy wrote:
> On 6/8/22 14:49, Hanna Reitz wrote:
>> On 30.03.22 23:28, Vladimir Sementsov-Ogievskiy wrote:
>>> To be used in further commit.
>>>
>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
>>> ---
>>>   block.c | 48 ++++++++++++++++++++++++++++++++++++++++++++++++
>>>   1 file changed, 48 insertions(+)
>>
>> Looking at bdrv_child_try_set_aio_context(), it looks like 
>> bdrv_can_set_aio_context() were supposed to be the .prepare action, 
>> and bdrv_set_aio_context_ignore() should be the .commit action.  Can 
>> we not use it that way?
>>
>>
>
>
> The difference is that we want the whole action be done in .prepare 
> stage, not only the check. It's generally better: when do several 
> actions in a transaction, actions usually depend on result of previous 
> actions.

Ah, yes.

> And I think it's necessary for graph update. Graph relations are 
> changed during other actions .prepare phases, so I can't imagine how 
> to postpone aio-context update to .commit phase.

OK, sounds good.

> But I agree, that having both _can_ / _set_  and *tran APIs don't look 
> good. May be we can refactor it.. But not in this series I think)




* Re: [PATCH v5 21/45] block: add bdrv_try_set_aio_context_tran transaction action
  2022-03-30 21:28 ` [PATCH v5 21/45] block: add bdrv_try_set_aio_context_tran transaction action Vladimir Sementsov-Ogievskiy
  2022-06-08 11:49   ` Hanna Reitz
@ 2022-06-13  7:46   ` Hanna Reitz
  2022-06-20 20:57     ` Vladimir Sementsov-Ogievskiy
  1 sibling, 1 reply; 78+ messages in thread
From: Hanna Reitz @ 2022-06-13  7:46 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: qemu-devel, kwolf, vsementsov, v.sementsov-og

On 30.03.22 23:28, Vladimir Sementsov-Ogievskiy wrote:
> To be used in further commit.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
> ---
>   block.c | 48 ++++++++++++++++++++++++++++++++++++++++++++++++
>   1 file changed, 48 insertions(+)
>
> diff --git a/block.c b/block.c
> index be19964f89..1900cdf277 100644
> --- a/block.c
> +++ b/block.c
> @@ -2907,6 +2907,54 @@ static void bdrv_child_free(BdrvChild *child)
>       g_free(child);
>   }
>   
> +typedef struct BdrvTrySetAioContextState {
> +    BlockDriverState *bs;
> +    AioContext *old_ctx;
> +} BdrvTrySetAioContextState;
> +
> +static void bdrv_try_set_aio_context_abort(void *opaque)
> +{
> +    BdrvTrySetAioContextState *s = opaque;
> +
> +    if (bdrv_get_aio_context(s->bs) != s->old_ctx) {
> +        bdrv_try_set_aio_context(s->bs, s->old_ctx, &error_abort);

As far as I understand, users of this transaction will need to do a bit 
of AioContext lock shuffling: To set the context, they need to hold 
old_ctx, but not new_ctx; but in case of abort, they need to release 
old_ctx and acquire new_ctx before the abort handlers are called.  (Due 
to the constraints on bdrv_set_aio_context_ignore().)

If that’s true, I think that should be documented somewhere.

> +    }
> +}
> +




* Re: [PATCH v5 22/45] block: implemet bdrv_unref_tran()
  2022-03-30 21:28 ` [PATCH v5 22/45] block: implemet bdrv_unref_tran() Vladimir Sementsov-Ogievskiy
@ 2022-06-13  9:07   ` Hanna Reitz
  2022-06-20 21:16     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 78+ messages in thread
From: Hanna Reitz @ 2022-06-13  9:07 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: qemu-devel, kwolf, vsementsov, v.sementsov-og

On 30.03.22 23:28, Vladimir Sementsov-Ogievskiy wrote:
> Now nodes are removed during block-graph update transactions now? Look
> at bdrv_replace_child_tran: bdrv_unref() is simply postponed to commit
> phase.
>
> What is the problem with it?
>
> We want to make copy-before-write permissions strict: it should unshare
> write always, not only when it has at least one parent.

Looking over this patch in not too much detail (because I find it rather 
complicated), it looks OK to me; but this reason for why we need it 
doesn’t really satisfy me.  What is the problem with how CBW permissions 
work?  Is that really the only reason for this patch?

> But if so, we
> can't neither insert the filter nor remove it:
>
> To insert the filter, we should first do blockdev-add, and filter will
> unshare write on the child, so, blockdev-add will fail if disk is in
> use by guest.
>
> To remove the filter, we should first do a replace operations, which
> again leads to situation when the filter and old parent share one
> child, and all parent want write permission when the filter unshare it.
>
> The solution is first do both graph-modifying operations (add &
> replace, or replace & remove) and only then update permissions. But
> that is not possible with current method to transactionally remove the
> block node: if we just postpone bdrv_unref() to commit phase, than on
> prepare phase the node is not removed, and it still keep all
> permissions on its children.
>
> What to do? In general, I don't know. But it's possible to solve the
> problem for the block drivers that doesn't need access to their
> children on .bdrv_close(). For such drivers we can detach their
> children on prepare stage (still, postponing bdrv_close() call to
> commit phase). For this to work we of course should effectively reduce
> bs->refcnt on prepare phase as well.
>
> So, the logic of new bdrv_unref_tran() is:
>
> prepare:
>    decrease refcnt and detach children if possible (and if refcnt is 0)
>
> commit:
>    do bdrv_delete() if refcnt is 0
>
> abort:
>    restore children and refcnt
>
> What's the difficulty with it? If we want to transactionally (and with
> no permission change) remove nodes, we should understand that some
> nodes may be removed recursively, and finally we get several possible
> not deleted leaves, where permissions should be updated. How caller
> will know what to update? That leads to additional transaction-wide
> refresh_list variable, which is filled by various graph modifying
> function. So, user should declare referesh_list variable and do one or
> several block-graph modifying operations (that may probably remove some
> nodes), then user call bdrv_list_refresh_perms on resulting
> refresh_list.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>




* Re: [PATCH v5 29/45] block: introduce BDRV_O_NOPERM flag
  2022-03-30 21:28 ` [PATCH v5 29/45] block: introduce BDRV_O_NOPERM flag Vladimir Sementsov-Ogievskiy
@ 2022-06-13  9:54   ` Hanna Reitz
  2022-06-21 12:11     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 78+ messages in thread
From: Hanna Reitz @ 2022-06-13  9:54 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: qemu-devel, kwolf, vsementsov, v.sementsov-og, Markus Armbruster

On 30.03.22 23:28, Vladimir Sementsov-Ogievskiy wrote:
> Now copy-before-write filter has weak permission model: when it has no
> parents, it share write permission on source. Otherwise we just can't
> blockdev-add it, when existing user of source has write permission.
>
> The situation is bad, it means that copy-before-write filter doesn't
> guarantee that all write goes through it.

I don’t understand how this situation really is bad, because it sounds 
like anything else would just be a safeguard against users adding a CBW 
filter without making use of it.  Which I’d think is their own fault.

As far as I remember the actual problem is that we cannot do 
transactional graph modifications, where e.g. a CBW node is inserted and 
a bitmap is created in a single atomic transaction[1].  Which is a 
problem.  And now I just don’t quite understand how unsharing WRITE 
unconditionally would help with the actual problem.

[1] Then again, would that even be “atomic”?  For that transaction to
work as intended, the node would need to be drained during the 
transaction (so that the bitmap stays in sync with the CBW state). It 
doesn’t look like that would be the case.

So perhaps I’m just remembering incorrectly.

> And a lot better is unshare
> write always. But how to insert the filter in this case?
>
> The solution is to do blockdev-add and blockdev-replace in one
> transaction, and more, update permissions only after both command.
>
> For now, let's create a possibility to not update permission on file
> child of copy-before-write filter at time of open.
>
> New interfaces are:
>
> - bds_tree_init() with flags argument, so that caller may pass
>    additional flags, for example the new BDRV_O_NOPERM.
>
> - bdrv_open_file_child_common() with boolean refresh_perms arguments.
>    Drivers may use this function with refresh_perms = true, if they want
>    to satisfy BDRV_O_NOPERM. No one such driver for now.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>




* Re: [PATCH v5 21/45] block: add bdrv_try_set_aio_context_tran transaction action
  2022-06-13  7:46   ` Hanna Reitz
@ 2022-06-20 20:57     ` Vladimir Sementsov-Ogievskiy
  2022-06-21 11:04       ` Hanna Reitz
  0 siblings, 1 reply; 78+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2022-06-20 20:57 UTC (permalink / raw)
  To: Hanna Reitz, Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: qemu-devel, kwolf, vsementsov, v.sementsov-og

On 6/13/22 10:46, Hanna Reitz wrote:
> On 30.03.22 23:28, Vladimir Sementsov-Ogievskiy wrote:
>> To be used in further commit.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
>> ---
>>   block.c | 48 ++++++++++++++++++++++++++++++++++++++++++++++++
>>   1 file changed, 48 insertions(+)
>>
>> diff --git a/block.c b/block.c
>> index be19964f89..1900cdf277 100644
>> --- a/block.c
>> +++ b/block.c
>> @@ -2907,6 +2907,54 @@ static void bdrv_child_free(BdrvChild *child)
>>       g_free(child);
>>   }
>> +typedef struct BdrvTrySetAioContextState {
>> +    BlockDriverState *bs;
>> +    AioContext *old_ctx;
>> +} BdrvTrySetAioContextState;
>> +
>> +static void bdrv_try_set_aio_context_abort(void *opaque)
>> +{
>> +    BdrvTrySetAioContextState *s = opaque;
>> +
>> +    if (bdrv_get_aio_context(s->bs) != s->old_ctx) {
>> +        bdrv_try_set_aio_context(s->bs, s->old_ctx, &error_abort);
> 
> As far as I understand, users of this transaction will need to do a bit of AioContext lock shuffling: To set the context, they need to hold old_ctx, but not new_ctx; but in case of abort, they need to release old_ctx and acquire new_ctx before the abort handlers are called.  (Due to the constraints on bdrv_set_aio_context_ignore().)
> 
> If that’s true, I think that should be documented somewhere.
> 

Hmm.. Actually, I think that bdrv_try_set_aio_context_abort() should do this shuffling by itself. The only hope of correctly rolling back a transaction is to operate on the assumption that on .abort() we are in exactly the same state as on .prepare(), regardless of other actions. And this means that old_ctx is acquired and new_ctx is not.


-- 
Best regards,
Vladimir



* Re: [PATCH v5 22/45] block: implemet bdrv_unref_tran()
  2022-06-13  9:07   ` Hanna Reitz
@ 2022-06-20 21:16     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 78+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2022-06-20 21:16 UTC (permalink / raw)
  To: Hanna Reitz, Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: qemu-devel, kwolf, vsementsov, v.sementsov-og

On 6/13/22 12:07, Hanna Reitz wrote:
> On 30.03.22 23:28, Vladimir Sementsov-Ogievskiy wrote:
>> Now nodes are removed during block-graph update transactions now? Look
>> at bdrv_replace_child_tran: bdrv_unref() is simply postponed to commit
>> phase.
>>
>> What is the problem with it?
>>
>> We want to make copy-before-write permissions strict: it should unshare
>> write always, not only when it has at least one parent.
> 
> Looking over this patch in not too much detail (because I find it rather complicated), it looks OK to me; but this reason for why we need it doesn’t really satisfy me.  What is the problem with how CBW permissions work?  Is that really the only reason for this patch?

Currently, CBW doesn't unshare write when it has no parents. It's a kind of "inactive" state.

That's not quite correct. For example, if we just don't have parents at the start of the job, nothing prevents the user from adding new parents that write directly to the source, bypassing the CBW filter. Of course, the user is responsible for their actions. But ideally, the CBW filter should guarantee that we are doing the correct thing. It becomes more important when we consider the "snapshot-access" interface. The CBW filter provides this interface, and it should guarantee that it works correctly.

And to achieve this we want to effectively remove nodes during the transaction, not just postpone removal to commit(). And that fits well with the global concept: do all modifications first, then update permissions.

The other way could be removing permissions from nodes "to be removed", but that seems less correct to me.

Are these strong permissions for CBW worth the complexity? Good question) And actually it's hard to estimate in such a big series. I can try to split this out of the series and see whether we can at least postpone it, keeping for now only the interfaces with weaker protection.
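
To illustrate what "effectively remove nodes during the transaction" means, here is a minimal sketch of the prepare / commit / abort split from the commit message. It is a simplified model under stated assumptions: `Node`, `UnrefState` and the `unref_*` functions are hypothetical, a node has at most one child, and recursive removal, the "driver does not need its children on close" check and the permission refresh list are all left out:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct Node Node;
struct Node {
    int refcnt;
    Node *child;   /* single child, for brevity */
    bool deleted;  /* stands in for the real deletion */
};

typedef struct UnrefState {
    Node *node;
    Node *saved_child;  /* child detached at prepare time, if any */
} UnrefState;

/* prepare: drop the reference and detach the child if refcnt hits 0 */
void unref_prepare(Node *node, UnrefState *s)
{
    assert(node->refcnt > 0);
    s->node = node;
    s->saved_child = NULL;

    if (--node->refcnt == 0) {
        /* The node no longer holds its child, so it no longer needs
         * any permissions on it. */
        s->saved_child = node->child;
        node->child = NULL;
    }
}

/* commit: actually delete the node if nobody references it anymore */
void unref_commit(UnrefState *s)
{
    if (s->node->refcnt == 0) {
        s->node->deleted = true;
    }
}

/* abort: restore the child and the reference count */
void unref_abort(UnrefState *s)
{
    if (s->saved_child) {
        s->node->child = s->saved_child;
    }
    s->node->refcnt++;
}
```

The point of the split is that after `unref_prepare()` the node already looks removed from the graph's point of view, so a permission update done before commit does not see its old requirements.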

> 
>> But if so, we
>> can't neither insert the filter nor remove it:
>>
>> To insert the filter, we should first do blockdev-add, and filter will
>> unshare write on the child, so, blockdev-add will fail if disk is in
>> use by guest.
>>
>> To remove the filter, we should first do a replace operations, which
>> again leads to situation when the filter and old parent share one
>> child, and all parent want write permission when the filter unshare it.
>>
>> The solution is first do both graph-modifying operations (add &
>> replace, or replace & remove) and only then update permissions. But
>> that is not possible with current method to transactionally remove the
>> block node: if we just postpone bdrv_unref() to commit phase, than on
>> prepare phase the node is not removed, and it still keep all
>> permissions on its children.
>>
>> What to do? In general, I don't know. But it's possible to solve the
>> problem for the block drivers that doesn't need access to their
>> children on .bdrv_close(). For such drivers we can detach their
>> children on prepare stage (still, postponing bdrv_close() call to
>> commit phase). For this to work we of course should effectively reduce
>> bs->refcnt on prepare phase as well.
>>
>> So, the logic of new bdrv_unref_tran() is:
>>
>> prepare:
>>    decrease refcnt and detach children if possible (and if refcnt is 0)
>>
>> commit:
>>    do bdrv_delete() if refcnt is 0
>>
>> abort:
>>    restore children and refcnt
>>
>> What's the difficulty with it? If we want to transactionally (and with
>> no permission change) remove nodes, we should understand that some
>> nodes may be removed recursively, and finally we get several possible
>> not deleted leaves, where permissions should be updated. How caller
>> will know what to update? That leads to additional transaction-wide
>> refresh_list variable, which is filled by various graph modifying
>> function. So, user should declare referesh_list variable and do one or
>> several block-graph modifying operations (that may probably remove some
>> nodes), then user call bdrv_list_refresh_perms on resulting
>> refresh_list.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
> 
> 


-- 
Best regards,
Vladimir



* Re: [PATCH v5 21/45] block: add bdrv_try_set_aio_context_tran transaction action
  2022-06-20 20:57     ` Vladimir Sementsov-Ogievskiy
@ 2022-06-21 11:04       ` Hanna Reitz
  2022-06-21 11:44         ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 78+ messages in thread
From: Hanna Reitz @ 2022-06-21 11:04 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: qemu-devel, kwolf, vsementsov, v.sementsov-og

On 20.06.22 22:57, Vladimir Sementsov-Ogievskiy wrote:
> On 6/13/22 10:46, Hanna Reitz wrote:
>> On 30.03.22 23:28, Vladimir Sementsov-Ogievskiy wrote:
>>> To be used in further commit.
>>>
>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
>>> ---
>>>   block.c | 48 ++++++++++++++++++++++++++++++++++++++++++++++++
>>>   1 file changed, 48 insertions(+)
>>>
>>> diff --git a/block.c b/block.c
>>> index be19964f89..1900cdf277 100644
>>> --- a/block.c
>>> +++ b/block.c
>>> @@ -2907,6 +2907,54 @@ static void bdrv_child_free(BdrvChild *child)
>>>       g_free(child);
>>>   }
>>> +typedef struct BdrvTrySetAioContextState {
>>> +    BlockDriverState *bs;
>>> +    AioContext *old_ctx;
>>> +} BdrvTrySetAioContextState;
>>> +
>>> +static void bdrv_try_set_aio_context_abort(void *opaque)
>>> +{
>>> +    BdrvTrySetAioContextState *s = opaque;
>>> +
>>> +    if (bdrv_get_aio_context(s->bs) != s->old_ctx) {
>>> +        bdrv_try_set_aio_context(s->bs, s->old_ctx, &error_abort);
>>
>> As far as I understand, users of this transaction will need to do a 
>> bit of AioContext lock shuffling: To set the context, they need to 
>> hold old_ctx, but not new_ctx; but in case of abort, they need to 
>> release old_ctx and acquire new_ctx before the abort handlers are 
>> called.  (Due to the constraints on bdrv_set_aio_context_ignore().)
>>
>> If that’s true, I think that should be documented somewhere.
>>
>
> Hmm.. Actually, I think that bdrv_try_set_aio_context_abort() should 
> do this shuffling by itself. The only hope to correctly roll back a 
> transaction is to operate under the assumption that in .abort() we are 
> in exactly the same state as in .prepare(), regardless of other 
> actions. And this means that old_ctx is acquired and new_ctx is not.

But if old_ctx is acquired and new_ctx is not, you cannot invoke 
bdrv_try_set_aio_context(bs, old_ctx), because that requires the current 
context (bdrv_get_aio_context(bs)) to be held, but not old_ctx (the 
“new” context for this call).




* Re: [PATCH v5 21/45] block: add bdrv_try_set_aio_context_tran transaction action
  2022-06-21 11:04       ` Hanna Reitz
@ 2022-06-21 11:44         ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 78+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2022-06-21 11:44 UTC (permalink / raw)
  To: Hanna Reitz, Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: qemu-devel, kwolf, vsementsov, v.sementsov-og

On 6/21/22 14:04, Hanna Reitz wrote:
> On 20.06.22 22:57, Vladimir Sementsov-Ogievskiy wrote:
>> On 6/13/22 10:46, Hanna Reitz wrote:
>>> On 30.03.22 23:28, Vladimir Sementsov-Ogievskiy wrote:
>>>> To be used in further commit.
>>>>
>>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
>>>> ---
>>>>   block.c | 48 ++++++++++++++++++++++++++++++++++++++++++++++++
>>>>   1 file changed, 48 insertions(+)
>>>>
>>>> diff --git a/block.c b/block.c
>>>> index be19964f89..1900cdf277 100644
>>>> --- a/block.c
>>>> +++ b/block.c
>>>> @@ -2907,6 +2907,54 @@ static void bdrv_child_free(BdrvChild *child)
>>>>       g_free(child);
>>>>   }
>>>> +typedef struct BdrvTrySetAioContextState {
>>>> +    BlockDriverState *bs;
>>>> +    AioContext *old_ctx;
>>>> +} BdrvTrySetAioContextState;
>>>> +
>>>> +static void bdrv_try_set_aio_context_abort(void *opaque)
>>>> +{
>>>> +    BdrvTrySetAioContextState *s = opaque;
>>>> +
>>>> +    if (bdrv_get_aio_context(s->bs) != s->old_ctx) {
>>>> +        bdrv_try_set_aio_context(s->bs, s->old_ctx, &error_abort);
>>>
>>> As far as I understand, users of this transaction will need to do a bit of AioContext lock shuffling: To set the context, they need to hold old_ctx, but not new_ctx; but in case of abort, they need to release old_ctx and acquire new_ctx before the abort handlers are called.  (Due to the constraints on bdrv_set_aio_context_ignore().)
>>>
>>> If that’s true, I think that should be documented somewhere.
>>>
>>
>> Hmm.. Actually, I think that bdrv_try_set_aio_context_abort() should do this shuffling by itself. The only hope to correctly roll back a transaction is to operate under the assumption that in .abort() we are in exactly the same state as in .prepare(), regardless of other actions. And this means that old_ctx is acquired and new_ctx is not.
> 
> But if old_ctx is acquired and new_ctx is not, you cannot invoke bdrv_try_set_aio_context(bs, old_ctx), because that requires the current context (bdrv_get_aio_context(bs)) to be held, but not old_ctx (the “new” context for this call).
> 

Yes, and that means that .abort() should release old_ctx and acquire new_ctx before calling bdrv_try_set_aio_context(), and afterwards release new_ctx and re-acquire old_ctx. Does that make sense?
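
To make the proposed ordering concrete, here is a minimal toy model of it. This is only a sketch under assumptions: ToyCtx, ToyNode and the toy_*() functions are hypothetical stand-ins and not QEMU API, and toy_set_ctx() only models the constraint discussed in this thread (the node's current context must be held and the target context must not be held).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins; NOT the QEMU AioContext / BlockDriverState types. */
typedef struct ToyCtx { bool held; } ToyCtx;
typedef struct ToyNode { ToyCtx *ctx; } ToyNode;

static void toy_acquire(ToyCtx *c) { assert(!c->held); c->held = true; }
static void toy_release(ToyCtx *c) { assert(c->held); c->held = false; }

/*
 * Models the constraint from the discussion: the node's current context
 * must be held, the target context must not be.
 * Returns 0 on success, -1 if the locking precondition is violated.
 */
static int toy_set_ctx(ToyNode *n, ToyCtx *target)
{
    if (!n->ctx->held || target->held) {
        return -1;
    }
    n->ctx = target;
    return 0;
}

typedef struct ToySetCtxState {
    ToyNode *node;
    ToyCtx *old_ctx;
} ToySetCtxState;

/*
 * Abort handler that does the lock shuffling itself. It assumes the
 * caller is in the same locking state as at prepare time: old_ctx is
 * held, the node's current (new) context is not.
 */
static int toy_set_ctx_abort(ToySetCtxState *s)
{
    ToyCtx *new_ctx = s->node->ctx;
    int ret;

    if (new_ctx == s->old_ctx) {
        return 0;                   /* nothing was changed, nothing to undo */
    }

    toy_release(s->old_ctx);        /* the target must not be held ... */
    toy_acquire(new_ctx);           /* ... and the current context must be */
    ret = toy_set_ctx(s->node, s->old_ctx);
    toy_release(new_ctx);           /* restore the caller's locking state */
    toy_acquire(s->old_ctx);
    return ret;
}
```

With this ordering the caller sees the same lock state before and after .abort(), so it does not need to know that any shuffling happened.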

-- 
Best regards,
Vladimir



* Re: [PATCH v5 29/45] block: introduce BDRV_O_NOPERM flag
  2022-06-13  9:54   ` Hanna Reitz
@ 2022-06-21 12:11     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 78+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2022-06-21 12:11 UTC (permalink / raw)
  To: Hanna Reitz, Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: qemu-devel, kwolf, vsementsov, v.sementsov-og, Markus Armbruster

On 6/13/22 12:54, Hanna Reitz wrote:
> On 30.03.22 23:28, Vladimir Sementsov-Ogievskiy wrote:
>> Now the copy-before-write filter has a weak permission model: when it
>> has no parents, it shares write permission on the source. Otherwise we
>> just couldn't blockdev-add it when an existing user of the source has
>> write permission.
>>
>> The situation is bad: it means that the copy-before-write filter
>> doesn't guarantee that all writes go through it.
> 
> I don’t understand how this situation really is bad, because it sounds like anything else would just be a safeguard against users adding a CBW filter without making use of it.  Which I’d think is their own fault.
> 
> As far as I remember the actual problem is that we cannot do transactional graph modifications, where e.g. a CBW node is inserted and a bitmap is created in a single atomic transaction[1].  Which is a problem.  And now I just don’t quite understand how unsharing WRITE unconditionally would help with the actual problem.
> 
> [1] Then again, would then even be “atomic”?  For that transaction to work as intended, the node would need to be drained during the transaction (so that the bitmap stays in sync with the CBW state). It doesn’t look like that would be the case.

I think we should already be in a drained section when we do the transaction.

In qmp_transaction we have a bdrv_drain_all() call. It's enough if we don't yield during transaction actions (and mostly, we shouldn't) (is it enough when we have iothreads?). Probably, it should be bdrv_drain_all_begin() before all actions and bdrv_drain_all_end() after them.

> 
> So perhaps I’m just remembering incorrectly.

OK, the same answer: I should try to split these features, as they are separate:

1. transactional API

2. strict permissions for CBW

It seems that [2] is not necessary for [1]. If so, we can consider the smaller picture first (only [1]) and do [2] later (or not at all, if it remains too complicated for the small benefit).

> 
>> And it would be a lot better to always unshare
>> write. But how to insert the filter in this case?
>>
>> The solution is to do blockdev-add and blockdev-replace in one
>> transaction and, moreover, update permissions only after both commands.
>>
>> For now, let's create a possibility to not update permissions on the
>> file child of the copy-before-write filter at open time.
>>
>> New interfaces are:
>>
>> - bds_tree_init() with flags argument, so that caller may pass
>>    additional flags, for example the new BDRV_O_NOPERM.
>>
>> - bdrv_open_file_child_common() with a boolean refresh_perms argument.
>>    Drivers may use this function with refresh_perms = true, if they want
>>    to satisfy BDRV_O_NOPERM. No driver does so for now.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
> 
> 


-- 
Best regards,
Vladimir



end of thread, other threads:[~2022-06-21 12:12 UTC | newest]

Thread overview: 78+ messages (download: mbox.gz / follow: Atom feed)
2022-03-30 21:28 [PATCH v5 00/45] Transactional block-graph modifying API Vladimir Sementsov-Ogievskiy
2022-03-30 21:28 ` [PATCH v5 01/45] block: BlockDriver: add .filtered_child_is_backing field Vladimir Sementsov-Ogievskiy
2022-06-07  9:57   ` Hanna Reitz
2022-03-30 21:28 ` [PATCH v5 02/45] block: introduce bdrv_open_file_child() helper Vladimir Sementsov-Ogievskiy
2022-06-07  9:57   ` Hanna Reitz
2022-03-30 21:28 ` [PATCH v5 03/45] block/blklogwrites: don't care to remove bs->file child on failure Vladimir Sementsov-Ogievskiy
2022-06-07 10:05   ` Hanna Reitz
2022-03-30 21:28 ` [PATCH v5 04/45] test-bdrv-graph-mod: update test_parallel_perm_update test case Vladimir Sementsov-Ogievskiy
2022-06-07 10:53   ` Hanna Reitz
2022-06-09 13:08     ` Vladimir Sementsov-Ogievskiy
2022-03-30 21:28 ` [PATCH v5 05/45] tests-bdrv-drain: bdrv_replace_test driver: declare supports_backing Vladimir Sementsov-Ogievskiy
2022-06-07 10:59   ` Hanna Reitz
2022-03-30 21:28 ` [PATCH v5 06/45] test-bdrv-graph-mod: fix filters to be filters Vladimir Sementsov-Ogievskiy
2022-06-07 11:22   ` Hanna Reitz
2022-03-30 21:28 ` [PATCH v5 07/45] block: document connection between child roles and bs->backing/bs->file Vladimir Sementsov-Ogievskiy
2022-06-07 12:11   ` Hanna Reitz
2022-03-30 21:28 ` [PATCH v5 08/45] block/snapshot: stress that we fallback to primary child Vladimir Sementsov-Ogievskiy
2022-06-07 13:42   ` Hanna Reitz
2022-03-30 21:28 ` [PATCH v5 09/45] Revert "block: Let replace_child_noperm free children" Vladimir Sementsov-Ogievskiy
2022-06-07 14:03   ` Hanna Reitz
2022-06-07 15:09     ` Vladimir Sementsov-Ogievskiy
2022-03-30 21:28 ` [PATCH v5 10/45] Revert "block: Let replace_child_tran keep indirect pointer" Vladimir Sementsov-Ogievskiy
2022-03-30 21:28 ` [PATCH v5 11/45] Revert "block: Restructure remove_file_or_backing_child()" Vladimir Sementsov-Ogievskiy
2022-03-30 21:28 ` [PATCH v5 12/45] Revert "block: Pass BdrvChild ** to replace_child_noperm" Vladimir Sementsov-Ogievskiy
2022-03-30 21:28 ` [PATCH v5 13/45] block: Manipulate bs->file / bs->backing pointers in .attach/.detach Vladimir Sementsov-Ogievskiy
2022-06-07 15:55   ` Hanna Reitz
2022-06-09 13:40     ` Vladimir Sementsov-Ogievskiy
2022-03-30 21:28 ` [PATCH v5 14/45] block/snapshot: drop indirection around bdrv_snapshot_fallback_ptr Vladimir Sementsov-Ogievskiy
2022-06-07 15:58   ` Hanna Reitz
2022-06-09 14:44     ` Vladimir Sementsov-Ogievskiy
2022-03-30 21:28 ` [PATCH v5 15/45] block: refactor bdrv_remove_file_or_backing_child to bdrv_remove_child Vladimir Sementsov-Ogievskiy
2022-06-08 10:04   ` Hanna Reitz
2022-03-30 21:28 ` [PATCH v5 16/45] block: drop bdrv_detach_child() Vladimir Sementsov-Ogievskiy
2022-06-08 10:22   ` Hanna Reitz
2022-03-30 21:28 ` [PATCH v5 17/45] block: drop bdrv_remove_filter_or_cow_child Vladimir Sementsov-Ogievskiy
2022-06-08 10:40   ` Hanna Reitz
2022-03-30 21:28 ` [PATCH v5 18/45] block: bdrv_refresh_perms(): allow external tran Vladimir Sementsov-Ogievskiy
2022-06-08 10:57   ` Hanna Reitz
2022-03-30 21:28 ` [PATCH v5 19/45] block: refactor bdrv_list_refresh_perms to allow any list of nodes Vladimir Sementsov-Ogievskiy
2022-06-08 11:27   ` Hanna Reitz
2022-03-30 21:28 ` [PATCH v5 20/45] block: make permission update functions public Vladimir Sementsov-Ogievskiy
2022-06-08 11:31   ` Hanna Reitz
2022-03-30 21:28 ` [PATCH v5 21/45] block: add bdrv_try_set_aio_context_tran transaction action Vladimir Sementsov-Ogievskiy
2022-06-08 11:49   ` Hanna Reitz
2022-06-09 14:56     ` Vladimir Sementsov-Ogievskiy
2022-06-13  7:12       ` Hanna Reitz
2022-06-13  7:46   ` Hanna Reitz
2022-06-20 20:57     ` Vladimir Sementsov-Ogievskiy
2022-06-21 11:04       ` Hanna Reitz
2022-06-21 11:44         ` Vladimir Sementsov-Ogievskiy
2022-03-30 21:28 ` [PATCH v5 22/45] block: implemet bdrv_unref_tran() Vladimir Sementsov-Ogievskiy
2022-06-13  9:07   ` Hanna Reitz
2022-06-20 21:16     ` Vladimir Sementsov-Ogievskiy
2022-03-30 21:28 ` [PATCH v5 23/45] blockdev: refactor transaction to use Transaction API Vladimir Sementsov-Ogievskiy
2022-03-30 21:28 ` [PATCH v5 24/45] blockdev: transactions: rename some things Vladimir Sementsov-Ogievskiy
2022-03-30 21:28 ` [PATCH v5 25/45] blockdev: qmp_transaction: refactor loop to classic for Vladimir Sementsov-Ogievskiy
2022-03-30 21:28 ` [PATCH v5 26/45] blockdev: transaction: refactor handling transaction properties Vladimir Sementsov-Ogievskiy
2022-03-30 21:28 ` [PATCH v5 27/45] blockdev: qmp_transaction: drop extra generic layer Vladimir Sementsov-Ogievskiy
2022-03-30 21:28 ` [PATCH v5 28/45] qapi: block: add blockdev-del transaction action Vladimir Sementsov-Ogievskiy
2022-03-30 21:28 ` [PATCH v5 29/45] block: introduce BDRV_O_NOPERM flag Vladimir Sementsov-Ogievskiy
2022-06-13  9:54   ` Hanna Reitz
2022-06-21 12:11     ` Vladimir Sementsov-Ogievskiy
2022-03-30 21:28 ` [PATCH v5 30/45] block: bdrv_insert_node(): use BDRV_O_NOPERM Vladimir Sementsov-Ogievskiy
2022-03-30 21:28 ` [PATCH v5 31/45] qapi: block: add blockdev-add transaction action Vladimir Sementsov-Ogievskiy
2022-03-30 21:28 ` [PATCH v5 32/45] iotests: add blockdev-add-transaction Vladimir Sementsov-Ogievskiy
2022-03-30 21:28 ` [PATCH v5 33/45] block-backend: blk_root(): drop const specifier on return type Vladimir Sementsov-Ogievskiy
2022-03-30 21:28 ` [PATCH v5 34/45] block/export: add blk_by_export_id() Vladimir Sementsov-Ogievskiy
2022-03-30 21:28 ` [PATCH v5 35/45] block: make bdrv_find_child() function public Vladimir Sementsov-Ogievskiy
2022-03-30 21:28 ` [PATCH v5 36/45] block: bdrv_replace_child_bs(): move to external transaction Vladimir Sementsov-Ogievskiy
2022-03-30 21:28 ` [PATCH v5 37/45] qapi: add x-blockdev-replace command Vladimir Sementsov-Ogievskiy
2022-03-30 21:28 ` [PATCH v5 38/45] qapi: add x-blockdev-replace transaction action Vladimir Sementsov-Ogievskiy
2022-03-30 21:28 ` [PATCH v5 39/45] block: bdrv_get_xdbg_block_graph(): report export ids Vladimir Sementsov-Ogievskiy
2022-03-30 21:28 ` [PATCH v5 40/45] iotests.py: qemu_img_create: use imgfmt by default Vladimir Sementsov-Ogievskiy
2022-03-30 21:28 ` [PATCH v5 41/45] iotests.py: introduce VM.assert_edges_list() method Vladimir Sementsov-Ogievskiy
2022-03-30 21:28 ` [PATCH v5 42/45] iotests.py: add VM.qmp_check() helper Vladimir Sementsov-Ogievskiy
2022-03-30 21:29 ` [PATCH v5 43/45] iotests: add filter-insertion Vladimir Sementsov-Ogievskiy
2022-03-30 21:29 ` [PATCH v5 44/45] block: bdrv_open_inherit: create BlockBackend only when necessary Vladimir Sementsov-Ogievskiy
2022-03-30 21:29 ` [PATCH v5 45/45] block/copy-before-write: correct permission scheme Vladimir Sementsov-Ogievskiy
