* [PATCH v2 00/36] block: update graph permissions update
@ 2020-11-27 14:44 Vladimir Sementsov-Ogievskiy
  2020-11-27 14:44 ` [PATCH v2 01/36] tests/test-bdrv-graph-mod: add test_parallel_exclusive_write Vladimir Sementsov-Ogievskiy
                   ` (36 more replies)
  0 siblings, 37 replies; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-11-27 14:44 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, armbru, jsnow, mreitz, kwolf, vsementsov, den

Hi all!

Here is a proposal for updating the graph-changing procedures.

What brought me here is the question of "activating" filters after
insertion, which is done in mirror_top and backup_top. The problem is
that we can't simply avoid a permission conflict when inserting the
filter: during insertion, the old permissions of the relations to be
removed conflict with the new permissions of the newly created
relations. The current solution is to support an additional "inactive"
mode for the filter, in which it doesn't require any permissions.

I suggest changing the order of operations: let's first do all graph
relation modifications and then refresh permissions. Of course, we'll
need a way to restore the old graph if the refresh fails.
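
To make the "modify first, refresh afterwards, roll back on failure"
idea concrete, here is a minimal toy sketch. All names in it (Txn,
txn_add, txn_finalize, Edge, replace_child) are hypothetical and chosen
for illustration only; this is not the API added in util/transactions.c,
and the permission check is a stand-in that simply rejects negative
child ids.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout; NOT the API of util/transactions.c. */
enum { TXN_MAX = 16 };

typedef struct TxnAction {
    void (*abort)(void *opaque);   /* how to undo one recorded change */
    void *opaque;
} TxnAction;

typedef struct Txn {
    TxnAction actions[TXN_MAX];
    size_t n;
} Txn;

/* Record how to undo a change that has just been made. */
static int txn_add(Txn *t, void (*abort_fn)(void *), void *opaque)
{
    if (t->n == TXN_MAX) {
        return -1;
    }
    t->actions[t->n].abort = abort_fn;
    t->actions[t->n].opaque = opaque;
    t->n++;
    return 0;
}

/* On failure undo the recorded changes in reverse order, else drop them. */
static void txn_finalize(Txn *t, int ret)
{
    if (ret < 0) {
        while (t->n > 0) {
            TxnAction *a = &t->actions[--t->n];
            a->abort(a->opaque);
        }
    }
    t->n = 0;
}

/* Toy graph relation: a parent slot that points at a child id. */
typedef struct Edge {
    int child;
    int saved;
} Edge;

static void edge_restore(void *opaque)
{
    Edge *e = opaque;
    e->child = e->saved;
}

/* Step 1: change the graph without any permission check. */
static void edge_replace_noperm(Edge *e, int new_child, Txn *t)
{
    e->saved = e->child;
    e->child = new_child;
    txn_add(t, edge_restore, e);
}

/* Step 2: toy permission refresh; a negative child id models a conflict. */
static int refresh_perms(const Edge *e)
{
    return e->child >= 0 ? 0 : -1;
}

/* Modify the graph first, then refresh; restore the old graph on failure. */
static int replace_child(Edge *e, int new_child)
{
    Txn t = { .n = 0 };
    int ret;

    edge_replace_noperm(e, new_child, &t);
    ret = refresh_perms(e);
    txn_finalize(&t, ret);
    return ret;
}
```

With an Edge whose child is 1, replace_child(&e, 2) succeeds and leaves
the new child in place, while a later replace_child(&e, -5) fails and
restores child 2.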

Another problem with permission update is that we update permissions in
DFS order, which is not always correct. It is better to update a node
when all its parents are already updated and require correct
permissions. This needs a topological sort of nodes prior to the
permission update; see more in the patches later.
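
As a rough illustration of the ordering requirement, the sketch below
computes a parents-before-children order for a small DAG by reversing
the DFS post-order. The Graph type and the helper names are assumptions
made up for this example, not code from the series, and the graph is
assumed to be acyclic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy graph; node ids are small integers. */
enum { MAX_NODES = 8, MAX_CHILDREN = 4 };

typedef struct Graph {
    int n_children[MAX_NODES];
    int children[MAX_NODES][MAX_CHILDREN];
} Graph;

static void graph_add_edge(Graph *g, int parent, int child)
{
    g->children[parent][g->n_children[parent]++] = child;
}

/* Plain DFS that records each node after all of its children. */
static void visit(const Graph *g, int node, bool *seen, int *post, int *n_post)
{
    if (seen[node]) {
        return;
    }
    seen[node] = true;
    for (int i = 0; i < g->n_children[node]; i++) {
        visit(g, g->children[node][i], seen, post, n_post);
    }
    post[(*n_post)++] = node;
}

/*
 * Fill order[] with the nodes reachable from root so that every node
 * comes after all of its parents. Returns the number of nodes written.
 * Reversed DFS post-order is a topological order for an acyclic graph.
 */
static int topo_order(const Graph *g, int root, int *order)
{
    bool seen[MAX_NODES] = { false };
    int post[MAX_NODES];
    int n = 0;

    visit(g, root, seen, post, &n);
    for (int i = 0; i < n; i++) {
        order[i] = post[n - 1 - i];
    }
    return n;
}
```

For a diamond such as top -> {fl1, fl2} -> base, the shared base node
always lands after both fl1 and fl2, whichever branch the DFS happens to
walk first.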

Patches plan:

01,02 - add failing tests to illustrate conceptual problems of the
current permission update system.
[Here is a side suggestion: we usually add tests after the fix, so a
 careful reviewer has to change the order of patches to check that the
 test fails before the fix. I add tests in a way that they may simply be
 executed but do not yet take part in make check. It seems more natural:
 first show the problem, then fix it. And when fixed, make the tests
 available for make check]

03-09 - some preparations, refactorings which may go in separate

10 - new transaction API

15 - topological sort implemented for permission update, one of the new
tests now passes

19 - improve bdrv_replace_node. The second new test now passes

26 - drop .active field and activation procedure for backup-top!

30 - update bdrv_reopen_multiple. At this point everything is using the
new paradigm of permission update

31-36 - post refactoring, drop old interfaces, we are done.

Note that this series does nothing about another graph-update problem
discussed under "[PATCH RFC 0/5] Fix accidental crash in iotest 30".

The series is based on Max's block-next branch and can be found here:

git: https://src.openvz.org/scm/~vsementsov/qemu.git
tag: up-block-topologic-perm-v2

Vladimir Sementsov-Ogievskiy (36):
  tests/test-bdrv-graph-mod: add test_parallel_exclusive_write
  tests/test-bdrv-graph-mod: add test_parallel_perm_update
  block: bdrv_append(): don't consume reference
  block: bdrv_append(): return status
  block: add bdrv_parent_try_set_aio_context
  block: BdrvChildClass: add .get_parent_aio_context handler
  block: drop ctx argument from bdrv_root_attach_child
  block: make bdrv_reopen_{prepare,commit,abort} private
  block: return value from bdrv_replace_node()
  util: add transactions.c
  block: bdrv_refresh_perms: check parents compliance
  block: refactor bdrv_child* permission functions
  block: rewrite bdrv_child_try_set_perm() using bdrv_refresh_perms()
  block: inline bdrv_child_*() permission functions calls
  block: use topological sort for permission update
  block: add bdrv_drv_set_perm transaction action
  block: add bdrv_list_* permission update functions
  block: add bdrv_replace_child_safe() transaction action
  block: fix bdrv_replace_node_common
  block: add bdrv_attach_child_common() transaction action
  block: add bdrv_attach_child_noperm() transaction action
  block: split out bdrv_replace_node_noperm()
  block: adapt bdrv_append() for inserting filters
  block: add bdrv_remove_backing transaction action
  block: introduce bdrv_drop_filter()
  block/backup-top: drop .active
  block: drop ignore_children for permission update functions
  block: add bdrv_set_backing_noperm() transaction action
  blockdev: qmp_x_blockdev_reopen: acquire all contexts
  block: bdrv_reopen_multiple: refresh permissions on updated graph
  block: drop unused permission update functions
  block: inline bdrv_check_perm_common()
  block: inline bdrv_replace_child()
  block: refactor bdrv_child_set_perm_safe() transaction action
  block: rename bdrv_replace_child_safe() to bdrv_replace_child()
  block: refactor bdrv_node_check_perm()

 include/block/block.h       |   20 +-
 include/block/block_int.h   |    8 +-
 include/qemu/transactions.h |   46 ++
 block.c                     | 1319 ++++++++++++++++++++---------------
 block/backup-top.c          |   39 +-
 block/block-backend.c       |   13 +-
 block/commit.c              |    7 +-
 block/file-posix.c          |   84 +--
 block/mirror.c              |    9 +-
 blockdev.c                  |   33 +-
 blockjob.c                  |   11 +-
 tests/test-bdrv-drain.c     |    2 +-
 tests/test-bdrv-graph-mod.c |  122 +++-
 util/transactions.c         |   81 +++
 tests/qemu-iotests/283.out  |    2 +-
 util/meson.build            |    1 +
 16 files changed, 1100 insertions(+), 697 deletions(-)
 create mode 100644 include/qemu/transactions.h
 create mode 100644 util/transactions.c

-- 
2.21.3




* [PATCH v2 01/36] tests/test-bdrv-graph-mod: add test_parallel_exclusive_write
  2020-11-27 14:44 [PATCH v2 00/36] block: update graph permissions update Vladimir Sementsov-Ogievskiy
@ 2020-11-27 14:44 ` Vladimir Sementsov-Ogievskiy
  2020-11-27 14:44 ` [PATCH v2 02/36] tests/test-bdrv-graph-mod: add test_parallel_perm_update Vladimir Sementsov-Ogievskiy
                   ` (35 subsequent siblings)
  36 siblings, 0 replies; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-11-27 14:44 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, armbru, jsnow, mreitz, kwolf, vsementsov, den

Add a test that shows that the concept of ignore_children is incomplete.
Actually, when we want to update something while ignoring the
permissions of some existing BdrvChild, we should also ignore the
propagated effect of this child on the other children. But that's not
done. A better approach (updating permissions on an already updated
graph) will be implemented later.

Now the test fails, so it's added with a -d argument to not break make
check.

The test fails with

 "Conflicts with use by fl1 as 'backing', which does not allow 'write' on base"

because when updating permissions we can ignore the original top->fl1
BdrvChild. But we don't ignore the exclusive write permission in the
fl1->base BdrvChild, which is propagated. The correct thing to do is to
make the graph change first and then do the permission update from the
top node.

To run the test, do

  ./test-bdrv-graph-mod -d -p /bdrv-graph-mod/parallel-exclusive-write

from <build-directory>/tests.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 tests/test-bdrv-graph-mod.c | 62 +++++++++++++++++++++++++++++++++++++
 1 file changed, 62 insertions(+)

diff --git a/tests/test-bdrv-graph-mod.c b/tests/test-bdrv-graph-mod.c
index 8cff13830e..3b9e6f242f 100644
--- a/tests/test-bdrv-graph-mod.c
+++ b/tests/test-bdrv-graph-mod.c
@@ -44,6 +44,21 @@ static BlockDriver bdrv_no_perm = {
     .bdrv_child_perm = no_perm_default_perms,
 };
 
+static void exclusive_write_perms(BlockDriverState *bs, BdrvChild *c,
+                                  BdrvChildRole role,
+                                  BlockReopenQueue *reopen_queue,
+                                  uint64_t perm, uint64_t shared,
+                                  uint64_t *nperm, uint64_t *nshared)
+{
+    *nperm = BLK_PERM_WRITE;
+    *nshared = BLK_PERM_ALL & ~BLK_PERM_WRITE;
+}
+
+static BlockDriver bdrv_exclusive_writer = {
+    .format_name = "exclusive-writer",
+    .bdrv_child_perm = exclusive_write_perms,
+};
+
 static BlockDriverState *no_perm_node(const char *name)
 {
     return bdrv_new_open_driver(&bdrv_no_perm, name, BDRV_O_RDWR, &error_abort);
@@ -55,6 +70,12 @@ static BlockDriverState *pass_through_node(const char *name)
                                 BDRV_O_RDWR, &error_abort);
 }
 
+static BlockDriverState *exclusive_writer_node(const char *name)
+{
+    return bdrv_new_open_driver(&bdrv_exclusive_writer, name,
+                                BDRV_O_RDWR, &error_abort);
+}
+
 /*
  * test_update_perm_tree
  *
@@ -185,8 +206,44 @@ static void test_should_update_child(void)
     blk_unref(root);
 }
 
+/*
+ * test_parallel_exclusive_write
+ *
+ * Check that when we replace node, old permissions of the node being removed
+ * doesn't break the replacement.
+ */
+static void test_parallel_exclusive_write(void)
+{
+    BlockDriverState *top = exclusive_writer_node("top");
+    BlockDriverState *base = no_perm_node("base");
+    BlockDriverState *fl1 = pass_through_node("fl1");
+    BlockDriverState *fl2 = pass_through_node("fl2");
+
+    bdrv_attach_child(top, fl1, "backing", &child_of_bds, BDRV_CHILD_DATA,
+                      &error_abort);
+    bdrv_attach_child(fl1, base, "backing", &child_of_bds, BDRV_CHILD_FILTERED,
+                      &error_abort);
+    bdrv_attach_child(fl2, base, "backing", &child_of_bds, BDRV_CHILD_FILTERED,
+                      &error_abort);
+    bdrv_ref(base);
+
+    bdrv_replace_node(fl1, fl2, &error_abort);
+
+    bdrv_unref(top);
+}
+
 int main(int argc, char *argv[])
 {
+    int i;
+    bool debug = false;
+
+    for (i = 1; i < argc; i++) {
+        if (!strcmp(argv[i], "-d")) {
+            debug = true;
+            break;
+        }
+    }
+
     bdrv_init();
     qemu_init_main_loop(&error_abort);
 
@@ -196,5 +253,10 @@ int main(int argc, char *argv[])
     g_test_add_func("/bdrv-graph-mod/should-update-child",
                     test_should_update_child);
 
+    if (debug) {
+        g_test_add_func("/bdrv-graph-mod/parallel-exclusive-write",
+                        test_parallel_exclusive_write);
+    }
+
     return g_test_run();
 }
-- 
2.21.3




* [PATCH v2 02/36] tests/test-bdrv-graph-mod: add test_parallel_perm_update
  2020-11-27 14:44 [PATCH v2 00/36] block: update graph permissions update Vladimir Sementsov-Ogievskiy
  2020-11-27 14:44 ` [PATCH v2 01/36] tests/test-bdrv-graph-mod: add test_parallel_exclusive_write Vladimir Sementsov-Ogievskiy
@ 2020-11-27 14:44 ` Vladimir Sementsov-Ogievskiy
  2021-01-18 14:05   ` Kevin Wolf
  2020-11-27 14:44 ` [PATCH v2 03/36] block: bdrv_append(): don't consume reference Vladimir Sementsov-Ogievskiy
                   ` (34 subsequent siblings)
  36 siblings, 1 reply; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-11-27 14:44 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, armbru, jsnow, mreitz, kwolf, vsementsov, den

Add a test to show that simple DFS recursion order is not correct for
permission update. The correct order is topological-sort order, which
will be introduced later.

Consider a block driver which has two filter children: one active
with exclusive write access and one inactive with no specific
permissions.

And these two children have a common base child, like this:

┌─────┐     ┌──────┐
│ fl2 │ ◀── │ top  │
└─────┘     └──────┘
  │           │
  │           │ w
  │           ▼
  │         ┌──────┐
  │         │ fl1  │
  │         └──────┘
  │           │
  │           │ w
  │           ▼
  │         ┌──────┐
  └───────▶ │ base │
            └──────┘

So, exclusive write is propagated.

Assume we want to make fl2 active instead of fl1.
So, we set some option for the top driver and do a permission update.

If the permission update (remember, it's DFS) goes first through the
top->fl1->base branch, it will succeed: it first drops the exclusive
write permissions and then applies them to the other BdrvChild.
But if the permission update goes first through the top->fl2->base
branch, it will fail, because when we try to update the fl2->base
child, the old, not yet updated fl1->base child will be in conflict.
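
The order dependence can be modelled in a few lines. The sketch below is
a toy under stated assumptions: the Edge type, the PERM_* bit values and
the helpers are hypothetical and not QEMU's BLK_PERM_* constants or its
real permission code. It only encodes the rule that two parents of one
node are compatible when neither uses a permission the other does not
share.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit values for illustration; not QEMU's BLK_PERM_*. */
enum { PERM_WRITE = 1 << 0, PERM_ALL = 0xff };

/* Permissions one parent holds on a shared child node. */
typedef struct Edge {
    uint64_t perm;     /* what this parent uses */
    uint64_t shared;   /* what it lets other parents use */
} Edge;

static Edge edge_exclusive_write(void)
{
    return (Edge){ PERM_WRITE, PERM_ALL & ~PERM_WRITE };
}

static Edge edge_inactive(void)
{
    return (Edge){ 0, PERM_ALL };
}

/* Neither side may use a permission the other side does not share. */
static bool edges_compatible(const Edge *a, const Edge *b)
{
    return !(a->perm & ~b->shared) && !(b->perm & ~a->shared);
}

/* Try to switch *e to want while the sibling keeps its current state. */
static bool edge_try_update(Edge *e, Edge want, const Edge *sibling)
{
    if (!edges_compatible(&want, sibling)) {
        return false;
    }
    *e = want;
    return true;
}
```

Starting with fl1->base exclusive and fl2->base inactive, updating
fl2->base first is rejected, while dropping the permissions of fl1->base
first lets the later fl2->base update go through.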

Now the test fails, so it runs only with the -d flag. To run it, do

  ./test-bdrv-graph-mod -d -p /bdrv-graph-mod/parallel-perm-update

from <build-directory>/tests.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 tests/test-bdrv-graph-mod.c | 64 +++++++++++++++++++++++++++++++++++++
 1 file changed, 64 insertions(+)

diff --git a/tests/test-bdrv-graph-mod.c b/tests/test-bdrv-graph-mod.c
index 3b9e6f242f..27e3361a60 100644
--- a/tests/test-bdrv-graph-mod.c
+++ b/tests/test-bdrv-graph-mod.c
@@ -232,6 +232,68 @@ static void test_parallel_exclusive_write(void)
     bdrv_unref(top);
 }
 
+static void write_to_file_perms(BlockDriverState *bs, BdrvChild *c,
+                                     BdrvChildRole role,
+                                     BlockReopenQueue *reopen_queue,
+                                     uint64_t perm, uint64_t shared,
+                                     uint64_t *nperm, uint64_t *nshared)
+{
+    if (bs->file && c == bs->file) {
+        *nperm = BLK_PERM_WRITE;
+        *nshared = BLK_PERM_ALL & ~BLK_PERM_WRITE;
+    } else {
+        *nperm = 0;
+        *nshared = BLK_PERM_ALL;
+    }
+}
+
+static BlockDriver bdrv_write_to_file = {
+    .format_name = "tricky-perm",
+    .bdrv_child_perm = write_to_file_perms,
+};
+
+static void test_parallel_perm_update(void)
+{
+    BlockDriverState *top = no_perm_node("top");
+    BlockDriverState *tricky =
+            bdrv_new_open_driver(&bdrv_write_to_file, "tricky", BDRV_O_RDWR,
+                                 &error_abort);
+    BlockDriverState *base = no_perm_node("base");
+    BlockDriverState *fl1 = pass_through_node("fl1");
+    BlockDriverState *fl2 = pass_through_node("fl2");
+    BdrvChild *c_fl1, *c_fl2;
+
+    bdrv_attach_child(top, tricky, "file", &child_of_bds, BDRV_CHILD_DATA,
+                      &error_abort);
+    c_fl1 = bdrv_attach_child(tricky, fl1, "first", &child_of_bds,
+                              BDRV_CHILD_FILTERED, &error_abort);
+    c_fl2 = bdrv_attach_child(tricky, fl2, "second", &child_of_bds,
+                              BDRV_CHILD_FILTERED, &error_abort);
+    bdrv_attach_child(fl1, base, "backing", &child_of_bds, BDRV_CHILD_FILTERED,
+                      &error_abort);
+    bdrv_attach_child(fl2, base, "backing", &child_of_bds, BDRV_CHILD_FILTERED,
+                      &error_abort);
+    bdrv_ref(base);
+
+    /* Select fl1 as first child to be active */
+    tricky->file = c_fl1;
+    bdrv_child_refresh_perms(top, top->children.lh_first, &error_abort);
+
+    assert(c_fl1->perm & BLK_PERM_WRITE);
+
+    /* Now, try to switch active child and update permissions */
+    tricky->file = c_fl2;
+    bdrv_child_refresh_perms(top, top->children.lh_first, &error_abort);
+
+    assert(c_fl2->perm & BLK_PERM_WRITE);
+
+    /* Switch once more, to not care about real child order in the list */
+    tricky->file = c_fl1;
+    bdrv_child_refresh_perms(top, top->children.lh_first, &error_abort);
+
+    assert(c_fl1->perm & BLK_PERM_WRITE);
+}
+
 int main(int argc, char *argv[])
 {
     int i;
@@ -256,6 +318,8 @@ int main(int argc, char *argv[])
     if (debug) {
         g_test_add_func("/bdrv-graph-mod/parallel-exclusive-write",
                         test_parallel_exclusive_write);
+        g_test_add_func("/bdrv-graph-mod/parallel-perm-update",
+                        test_parallel_perm_update);
     }
 
     return g_test_run();
-- 
2.21.3




* [PATCH v2 03/36] block: bdrv_append(): don't consume reference
  2020-11-27 14:44 [PATCH v2 00/36] block: update graph permissions update Vladimir Sementsov-Ogievskiy
  2020-11-27 14:44 ` [PATCH v2 01/36] tests/test-bdrv-graph-mod: add test_parallel_exclusive_write Vladimir Sementsov-Ogievskiy
  2020-11-27 14:44 ` [PATCH v2 02/36] tests/test-bdrv-graph-mod: add test_parallel_perm_update Vladimir Sementsov-Ogievskiy
@ 2020-11-27 14:44 ` Vladimir Sementsov-Ogievskiy
  2021-01-18 14:18   ` Kevin Wolf
  2020-11-27 14:44 ` [PATCH v2 04/36] block: bdrv_append(): return status Vladimir Sementsov-Ogievskiy
                   ` (33 subsequent siblings)
  36 siblings, 1 reply; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-11-27 14:44 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, armbru, jsnow, mreitz, kwolf, vsementsov, den

We have too many comments for this feature. It seems better just not to
do it. Most real users (tests don't count) have to create an additional
reference.

Also drop the comment in external_snapshot_prepare:
 - bdrv_append doesn't "remove" the old bs in the common sense, so it
   sounds strange
 - the fact that bdrv_append can fail is obvious from the context
 - the fact that we must roll back all changes in transaction abort is
   known (it's the direct role of abort)

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block.c                     | 19 +++----------------
 block/backup-top.c          |  1 -
 block/commit.c              |  1 +
 block/mirror.c              |  3 ---
 blockdev.c                  |  4 ----
 tests/test-bdrv-drain.c     |  2 +-
 tests/test-bdrv-graph-mod.c |  2 ++
 7 files changed, 7 insertions(+), 25 deletions(-)

diff --git a/block.c b/block.c
index 0dd28f0902..55efef3c9d 100644
--- a/block.c
+++ b/block.c
@@ -3145,11 +3145,6 @@ static BlockDriverState *bdrv_append_temp_snapshot(BlockDriverState *bs,
         goto out;
     }
 
-    /* bdrv_append() consumes a strong reference to bs_snapshot
-     * (i.e. it will call bdrv_unref() on it) even on error, so in
-     * order to be able to return one, we have to increase
-     * bs_snapshot's refcount here */
-    bdrv_ref(bs_snapshot);
     bdrv_append(bs_snapshot, bs, &local_err);
     if (local_err) {
         error_propagate(errp, local_err);
@@ -4608,10 +4603,8 @@ void bdrv_replace_node(BlockDriverState *from, BlockDriverState *to,
  *
  * This function does not create any image files.
  *
- * bdrv_append() takes ownership of a bs_new reference and unrefs it because
- * that's what the callers commonly need. bs_new will be referenced by the old
- * parents of bs_top after bdrv_append() returns. If the caller needs to keep a
- * reference of its own, it must call bdrv_ref().
+ * Recent update: bdrv_append does NOT eat bs_new reference for now. Drop this
+ * comment several moths later.
  */
 void bdrv_append(BlockDriverState *bs_new, BlockDriverState *bs_top,
                  Error **errp)
@@ -4621,20 +4614,14 @@ void bdrv_append(BlockDriverState *bs_new, BlockDriverState *bs_top,
     bdrv_set_backing_hd(bs_new, bs_top, &local_err);
     if (local_err) {
         error_propagate(errp, local_err);
-        goto out;
+        return;
     }
 
     bdrv_replace_node(bs_top, bs_new, &local_err);
     if (local_err) {
         error_propagate(errp, local_err);
         bdrv_set_backing_hd(bs_new, NULL, &error_abort);
-        goto out;
     }
-
-    /* bs_new is now referenced by its new parents, we don't need the
-     * additional reference any more. */
-out:
-    bdrv_unref(bs_new);
 }
 
 static void bdrv_delete(BlockDriverState *bs)
diff --git a/block/backup-top.c b/block/backup-top.c
index fe6883cc97..650ed6195c 100644
--- a/block/backup-top.c
+++ b/block/backup-top.c
@@ -222,7 +222,6 @@ BlockDriverState *bdrv_backup_top_append(BlockDriverState *source,
 
     bdrv_drained_begin(source);
 
-    bdrv_ref(top);
     bdrv_append(top, source, &local_err);
     if (local_err) {
         error_prepend(&local_err, "Cannot append backup-top filter: ");
diff --git a/block/commit.c b/block/commit.c
index 71db7ba747..61924bcf66 100644
--- a/block/commit.c
+++ b/block/commit.c
@@ -313,6 +313,7 @@ void commit_start(const char *job_id, BlockDriverState *bs,
     commit_top_bs->total_sectors = top->total_sectors;
 
     bdrv_append(commit_top_bs, top, &local_err);
+    bdrv_unref(commit_top_bs); /* referenced by new parents or failed */
     if (local_err) {
         commit_top_bs = NULL;
         error_propagate(errp, local_err);
diff --git a/block/mirror.c b/block/mirror.c
index 8e1ad6eceb..13f7ecc998 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -1605,9 +1605,6 @@ static BlockJob *mirror_start_job(
     bs_opaque = g_new0(MirrorBDSOpaque, 1);
     mirror_top_bs->opaque = bs_opaque;
 
-    /* bdrv_append takes ownership of the mirror_top_bs reference, need to keep
-     * it alive until block_job_create() succeeds even if bs has no parent. */
-    bdrv_ref(mirror_top_bs);
     bdrv_drained_begin(bs);
     bdrv_append(mirror_top_bs, bs, &local_err);
     bdrv_drained_end(bs);
diff --git a/blockdev.c b/blockdev.c
index b5f11c524b..96c96f8ba6 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1587,10 +1587,6 @@ static void external_snapshot_prepare(BlkActionState *common,
         goto out;
     }
 
-    /* This removes our old bs and adds the new bs. This is an operation that
-     * can fail, so we need to do it in .prepare; undoing it for abort is
-     * always possible. */
-    bdrv_ref(state->new_bs);
     bdrv_append(state->new_bs, state->old_bs, &local_err);
     if (local_err) {
         error_propagate(errp, local_err);
diff --git a/tests/test-bdrv-drain.c b/tests/test-bdrv-drain.c
index 8a29e33e00..892f7f47d8 100644
--- a/tests/test-bdrv-drain.c
+++ b/tests/test-bdrv-drain.c
@@ -1478,7 +1478,6 @@ static void test_append_to_drained(void)
     g_assert_cmpint(base_s->drain_count, ==, 1);
     g_assert_cmpint(base->in_flight, ==, 0);
 
-    /* Takes ownership of overlay, so we don't have to unref it later */
     bdrv_append(overlay, base, &error_abort);
     g_assert_cmpint(base->in_flight, ==, 0);
     g_assert_cmpint(overlay->in_flight, ==, 0);
@@ -1495,6 +1494,7 @@ static void test_append_to_drained(void)
     g_assert_cmpint(overlay->quiesce_counter, ==, 0);
     g_assert_cmpint(overlay_s->drain_count, ==, 0);
 
+    bdrv_unref(overlay);
     bdrv_unref(base);
     blk_unref(blk);
 }
diff --git a/tests/test-bdrv-graph-mod.c b/tests/test-bdrv-graph-mod.c
index 27e3361a60..cfe096c9af 100644
--- a/tests/test-bdrv-graph-mod.c
+++ b/tests/test-bdrv-graph-mod.c
@@ -138,6 +138,7 @@ static void test_update_perm_tree(void)
     bdrv_append(filter, bs, &local_err);
     error_free_or_abort(&local_err);
 
+    bdrv_unref(filter);
     blk_unref(root);
 }
 
@@ -202,6 +203,7 @@ static void test_should_update_child(void)
     bdrv_append(filter, bs, &error_abort);
     g_assert(target->backing->bs == bs);
 
+    bdrv_unref(filter);
     bdrv_unref(bs);
     blk_unref(root);
 }
-- 
2.21.3




* [PATCH v2 04/36] block: bdrv_append(): return status
  2020-11-27 14:44 [PATCH v2 00/36] block: update graph permissions update Vladimir Sementsov-Ogievskiy
                   ` (2 preceding siblings ...)
  2020-11-27 14:44 ` [PATCH v2 03/36] block: bdrv_append(): don't consume reference Vladimir Sementsov-Ogievskiy
@ 2020-11-27 14:44 ` Vladimir Sementsov-Ogievskiy
  2020-12-14 15:49   ` Alberto Garcia
  2021-01-18 14:32   ` Kevin Wolf
  2020-11-27 14:44 ` [PATCH v2 05/36] block: add bdrv_parent_try_set_aio_context Vladimir Sementsov-Ogievskiy
                   ` (32 subsequent siblings)
  36 siblings, 2 replies; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-11-27 14:44 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, armbru, jsnow, mreitz, kwolf, vsementsov, den

Return int status to avoid extra error propagation schemes.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 include/block/block.h       |  4 ++--
 block.c                     | 15 ++++++++-------
 block/commit.c              |  6 ++----
 block/mirror.c              |  6 ++----
 blockdev.c                  |  6 +++---
 tests/test-bdrv-graph-mod.c |  6 +++---
 6 files changed, 20 insertions(+), 23 deletions(-)

diff --git a/include/block/block.h b/include/block/block.h
index db37a35cee..ee3f5a6cca 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -344,8 +344,8 @@ int bdrv_create(BlockDriver *drv, const char* filename,
 int bdrv_create_file(const char *filename, QemuOpts *opts, Error **errp);
 
 BlockDriverState *bdrv_new(void);
-void bdrv_append(BlockDriverState *bs_new, BlockDriverState *bs_top,
-                 Error **errp);
+int bdrv_append(BlockDriverState *bs_new, BlockDriverState *bs_top,
+                Error **errp);
 void bdrv_replace_node(BlockDriverState *from, BlockDriverState *to,
                        Error **errp);
 
diff --git a/block.c b/block.c
index 55efef3c9d..916087ee1a 100644
--- a/block.c
+++ b/block.c
@@ -3103,7 +3103,6 @@ static BlockDriverState *bdrv_append_temp_snapshot(BlockDriverState *bs,
     int64_t total_size;
     QemuOpts *opts = NULL;
     BlockDriverState *bs_snapshot = NULL;
-    Error *local_err = NULL;
     int ret;
 
     /* if snapshot, we create a temporary backing file and open it
@@ -3145,9 +3144,8 @@ static BlockDriverState *bdrv_append_temp_snapshot(BlockDriverState *bs,
         goto out;
     }
 
-    bdrv_append(bs_snapshot, bs, &local_err);
-    if (local_err) {
-        error_propagate(errp, local_err);
+    ret = bdrv_append(bs_snapshot, bs, errp);
+    if (ret < 0) {
         bs_snapshot = NULL;
         goto out;
     }
@@ -4606,22 +4604,25 @@ void bdrv_replace_node(BlockDriverState *from, BlockDriverState *to,
  * Recent update: bdrv_append does NOT eat bs_new reference for now. Drop this
  * comment several moths later.
  */
-void bdrv_append(BlockDriverState *bs_new, BlockDriverState *bs_top,
-                 Error **errp)
+int bdrv_append(BlockDriverState *bs_new, BlockDriverState *bs_top,
+                Error **errp)
 {
     Error *local_err = NULL;
 
     bdrv_set_backing_hd(bs_new, bs_top, &local_err);
     if (local_err) {
         error_propagate(errp, local_err);
-        return;
+        return -EPERM;
     }
 
     bdrv_replace_node(bs_top, bs_new, &local_err);
     if (local_err) {
         error_propagate(errp, local_err);
         bdrv_set_backing_hd(bs_new, NULL, &error_abort);
+        return -EPERM;
     }
+
+    return 0;
 }
 
 static void bdrv_delete(BlockDriverState *bs)
diff --git a/block/commit.c b/block/commit.c
index 61924bcf66..b89bb20b75 100644
--- a/block/commit.c
+++ b/block/commit.c
@@ -254,7 +254,6 @@ void commit_start(const char *job_id, BlockDriverState *bs,
     BlockDriverState *iter;
     BlockDriverState *commit_top_bs = NULL;
     BlockDriverState *filtered_base;
-    Error *local_err = NULL;
     int64_t base_size, top_size;
     uint64_t base_perms, iter_shared_perms;
     int ret;
@@ -312,11 +311,10 @@ void commit_start(const char *job_id, BlockDriverState *bs,
 
     commit_top_bs->total_sectors = top->total_sectors;
 
-    bdrv_append(commit_top_bs, top, &local_err);
+    ret = bdrv_append(commit_top_bs, top, errp);
     bdrv_unref(commit_top_bs); /* referenced by new parents or failed */
-    if (local_err) {
+    if (ret < 0) {
         commit_top_bs = NULL;
-        error_propagate(errp, local_err);
         goto fail;
     }
 
diff --git a/block/mirror.c b/block/mirror.c
index 13f7ecc998..c3fbe3e8bd 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -1560,7 +1560,6 @@ static BlockJob *mirror_start_job(
     BlockDriverState *mirror_top_bs;
     bool target_is_backing;
     uint64_t target_perms, target_shared_perms;
-    Error *local_err = NULL;
     int ret;
 
     if (granularity == 0) {
@@ -1606,12 +1605,11 @@ static BlockJob *mirror_start_job(
     mirror_top_bs->opaque = bs_opaque;
 
     bdrv_drained_begin(bs);
-    bdrv_append(mirror_top_bs, bs, &local_err);
+    ret = bdrv_append(mirror_top_bs, bs, errp);
     bdrv_drained_end(bs);
 
-    if (local_err) {
+    if (ret < 0) {
         bdrv_unref(mirror_top_bs);
-        error_propagate(errp, local_err);
         return NULL;
     }
 
diff --git a/blockdev.c b/blockdev.c
index 96c96f8ba6..2af35d0958 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1432,6 +1432,7 @@ typedef struct ExternalSnapshotState {
 static void external_snapshot_prepare(BlkActionState *common,
                                       Error **errp)
 {
+    int ret;
     int flags = 0;
     QDict *options = NULL;
     Error *local_err = NULL;
@@ -1587,9 +1588,8 @@ static void external_snapshot_prepare(BlkActionState *common,
         goto out;
     }
 
-    bdrv_append(state->new_bs, state->old_bs, &local_err);
-    if (local_err) {
-        error_propagate(errp, local_err);
+    ret = bdrv_append(state->new_bs, state->old_bs, errp);
+    if (ret < 0) {
         goto out;
     }
     state->overlay_appended = true;
diff --git a/tests/test-bdrv-graph-mod.c b/tests/test-bdrv-graph-mod.c
index cfe096c9af..74f4a4153b 100644
--- a/tests/test-bdrv-graph-mod.c
+++ b/tests/test-bdrv-graph-mod.c
@@ -122,7 +122,7 @@ static BlockDriverState *exclusive_writer_node(const char *name)
  */
 static void test_update_perm_tree(void)
 {
-    Error *local_err = NULL;
+    int ret;
 
     BlockBackend *root = blk_new(qemu_get_aio_context(),
                                  BLK_PERM_WRITE | BLK_PERM_CONSISTENT_READ,
@@ -135,8 +135,8 @@ static void test_update_perm_tree(void)
     bdrv_attach_child(filter, bs, "child", &child_of_bds,
                       BDRV_CHILD_FILTERED | BDRV_CHILD_PRIMARY, &error_abort);
 
-    bdrv_append(filter, bs, &local_err);
-    error_free_or_abort(&local_err);
+    ret = bdrv_append(filter, bs, NULL);
+    g_assert_cmpint(ret, <, 0);
 
     bdrv_unref(filter);
     blk_unref(root);
-- 
2.21.3




* [PATCH v2 05/36] block: add bdrv_parent_try_set_aio_context
  2020-11-27 14:44 [PATCH v2 00/36] block: update graph permissions update Vladimir Sementsov-Ogievskiy
                   ` (3 preceding siblings ...)
  2020-11-27 14:44 ` [PATCH v2 04/36] block: bdrv_append(): return status Vladimir Sementsov-Ogievskiy
@ 2020-11-27 14:44 ` Vladimir Sementsov-Ogievskiy
  2021-01-18 15:08   ` Kevin Wolf
  2020-11-27 14:44 ` [PATCH v2 06/36] block: BdrvChildClass: add .get_parent_aio_context handler Vladimir Sementsov-Ogievskiy
                   ` (31 subsequent siblings)
  36 siblings, 1 reply; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-11-27 14:44 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, armbru, jsnow, mreitz, kwolf, vsementsov, den

We already have bdrv_parent_can_set_aio_context(). Add the corresponding
bdrv_parent_set_aio_context_ignore() and
bdrv_parent_try_set_aio_context() and use them instead of open-coding.

Make bdrv_parent_try_set_aio_context() public, as it will be used in
a later commit.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 include/block/block.h |  2 ++
 block.c               | 51 +++++++++++++++++++++++++++++++++----------
 2 files changed, 41 insertions(+), 12 deletions(-)

diff --git a/include/block/block.h b/include/block/block.h
index ee3f5a6cca..550c5a7513 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -686,6 +686,8 @@ bool bdrv_child_can_set_aio_context(BdrvChild *c, AioContext *ctx,
                                     GSList **ignore, Error **errp);
 bool bdrv_can_set_aio_context(BlockDriverState *bs, AioContext *ctx,
                               GSList **ignore, Error **errp);
+int bdrv_parent_try_set_aio_context(BdrvChild *c, AioContext *ctx,
+                                    Error **errp);
 int bdrv_probe_blocksizes(BlockDriverState *bs, BlockSizes *bsz);
 int bdrv_probe_geometry(BlockDriverState *bs, HDGeometry *geo);
 
diff --git a/block.c b/block.c
index 916087ee1a..5d925c208d 100644
--- a/block.c
+++ b/block.c
@@ -81,6 +81,9 @@ static BlockDriverState *bdrv_open_inherit(const char *filename,
                                            BdrvChildRole child_role,
                                            Error **errp);
 
+static void bdrv_parent_set_aio_context_ignore(BdrvChild *c, AioContext *ctx,
+                                               GSList **ignore);
+
 /* If non-zero, use only whitelisted block drivers */
 static int use_bdrv_whitelist;
 
@@ -2655,17 +2658,12 @@ BdrvChild *bdrv_root_attach_child(BlockDriverState *child_bs,
      * try moving the parent into the AioContext of child_bs instead. */
     if (bdrv_get_aio_context(child_bs) != ctx) {
         ret = bdrv_try_set_aio_context(child_bs, ctx, &local_err);
-        if (ret < 0 && child_class->can_set_aio_ctx) {
-            GSList *ignore = g_slist_prepend(NULL, child);
-            ctx = bdrv_get_aio_context(child_bs);
-            if (child_class->can_set_aio_ctx(child, ctx, &ignore, NULL)) {
-                error_free(local_err);
+        if (ret < 0) {
+            if (bdrv_parent_try_set_aio_context(child, ctx, NULL) == 0) {
                 ret = 0;
-                g_slist_free(ignore);
-                ignore = g_slist_prepend(NULL, child);
-                child_class->set_aio_ctx(child, ctx, &ignore);
+                error_free(local_err);
+                local_err = NULL;
             }
-            g_slist_free(ignore);
         }
         if (ret < 0) {
             error_propagate(errp, local_err);
@@ -6452,9 +6450,7 @@ void bdrv_set_aio_context_ignore(BlockDriverState *bs,
         if (g_slist_find(*ignore, child)) {
             continue;
         }
-        assert(child->klass->set_aio_ctx);
-        *ignore = g_slist_prepend(*ignore, child);
-        child->klass->set_aio_ctx(child, new_context, ignore);
+        bdrv_parent_set_aio_context_ignore(child, new_context, ignore);
     }
 
     bdrv_detach_aio_context(bs);
@@ -6511,6 +6507,37 @@ static bool bdrv_parent_can_set_aio_context(BdrvChild *c, AioContext *ctx,
     return true;
 }
 
+static void bdrv_parent_set_aio_context_ignore(BdrvChild *c, AioContext *ctx,
+                                               GSList **ignore)
+{
+    if (g_slist_find(*ignore, c)) {
+        return;
+    }
+    *ignore = g_slist_prepend(*ignore, c);
+
+    assert(c->klass->set_aio_ctx);
+    c->klass->set_aio_ctx(c, ctx, ignore);
+}
+
+int bdrv_parent_try_set_aio_context(BdrvChild *c, AioContext *ctx,
+                                    Error **errp)
+{
+    GSList *ignore = NULL;
+
+    if (!bdrv_parent_can_set_aio_context(c, ctx, &ignore, errp)) {
+        g_slist_free(ignore);
+        return -EPERM;
+    }
+
+    g_slist_free(ignore);
+    ignore = NULL;
+
+    bdrv_parent_set_aio_context_ignore(c, ctx, &ignore);
+    g_slist_free(ignore);
+
+    return 0;
+}
+
 bool bdrv_child_can_set_aio_context(BdrvChild *c, AioContext *ctx,
                                     GSList **ignore, Error **errp)
 {
-- 
2.21.3




* [PATCH v2 06/36] block: BdrvChildClass: add .get_parent_aio_context handler
  2020-11-27 14:44 [PATCH v2 00/36] block: update graph permissions update Vladimir Sementsov-Ogievskiy
                   ` (4 preceding siblings ...)
  2020-11-27 14:44 ` [PATCH v2 05/36] block: add bdrv_parent_try_set_aio_context Vladimir Sementsov-Ogievskiy
@ 2020-11-27 14:44 ` Vladimir Sementsov-Ogievskiy
  2021-01-18 15:13   ` Kevin Wolf
  2020-11-27 14:44 ` [PATCH v2 07/36] block: drop ctx argument from bdrv_root_attach_child Vladimir Sementsov-Ogievskiy
                   ` (30 subsequent siblings)
  36 siblings, 1 reply; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-11-27 14:44 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, armbru, jsnow, mreitz, kwolf, vsementsov, den

Add a new handler to get the parent's AioContext and implement it in all
child classes. Add the corresponding public interface, to be used soon.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 include/block/block.h     |  3 +++
 include/block/block_int.h |  2 ++
 block.c                   | 13 +++++++++++++
 block/block-backend.c     |  9 +++++++++
 blockjob.c                |  8 ++++++++
 5 files changed, 35 insertions(+)

diff --git a/include/block/block.h b/include/block/block.h
index 550c5a7513..6788ccd25b 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -688,6 +688,9 @@ bool bdrv_can_set_aio_context(BlockDriverState *bs, AioContext *ctx,
                               GSList **ignore, Error **errp);
 int bdrv_parent_try_set_aio_context(BdrvChild *c, AioContext *ctx,
                                     Error **errp);
+
+AioContext *bdrv_child_get_parent_aio_context(BdrvChild *c);
+
 int bdrv_probe_blocksizes(BlockDriverState *bs, BlockSizes *bsz);
 int bdrv_probe_geometry(BlockDriverState *bs, HDGeometry *geo);
 
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 9138aaf5ec..943fd855fe 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -772,6 +772,8 @@ struct BdrvChildClass {
     bool (*can_set_aio_ctx)(BdrvChild *child, AioContext *ctx,
                             GSList **ignore, Error **errp);
     void (*set_aio_ctx)(BdrvChild *child, AioContext *ctx, GSList **ignore);
+
+    AioContext *(*get_parent_aio_context)(BdrvChild *child);
 };
 
 extern const BdrvChildClass child_of_bds;
diff --git a/block.c b/block.c
index 5d925c208d..95d3684d8d 100644
--- a/block.c
+++ b/block.c
@@ -1334,6 +1334,13 @@ static int bdrv_child_cb_update_filename(BdrvChild *c, BlockDriverState *base,
     return 0;
 }
 
+static AioContext *bdrv_child_cb_get_parent_aio_context(BdrvChild *c)
+{
+    BlockDriverState *bs = c->opaque;
+
+    return bdrv_get_aio_context(bs);
+}
+
 const BdrvChildClass child_of_bds = {
     .parent_is_bds   = true,
     .get_parent_desc = bdrv_child_get_parent_desc,
@@ -1347,8 +1354,14 @@ const BdrvChildClass child_of_bds = {
     .can_set_aio_ctx = bdrv_child_cb_can_set_aio_ctx,
     .set_aio_ctx     = bdrv_child_cb_set_aio_ctx,
     .update_filename = bdrv_child_cb_update_filename,
+    .get_parent_aio_context = bdrv_child_cb_get_parent_aio_context,
 };
 
+AioContext *bdrv_child_get_parent_aio_context(BdrvChild *c)
+{
+    return c->klass->get_parent_aio_context(c);
+}
+
 static int bdrv_open_flags(BlockDriverState *bs, int flags)
 {
     int open_flags = flags;
diff --git a/block/block-backend.c b/block/block-backend.c
index ce78d30794..28efa0dff3 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -298,6 +298,13 @@ static void blk_root_detach(BdrvChild *child)
     }
 }
 
+static AioContext *blk_root_get_parent_aio_context(BdrvChild *c)
+{
+    BlockBackend *blk = c->opaque;
+
+    return blk_get_aio_context(blk);
+}
+
 static const BdrvChildClass child_root = {
     .inherit_options    = blk_root_inherit_options,
 
@@ -318,6 +325,8 @@ static const BdrvChildClass child_root = {
 
     .can_set_aio_ctx    = blk_root_can_set_aio_ctx,
     .set_aio_ctx        = blk_root_set_aio_ctx,
+
+    .get_parent_aio_context = blk_root_get_parent_aio_context,
 };
 
 /*
diff --git a/blockjob.c b/blockjob.c
index 9d0bed01c2..f671763c2c 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -163,6 +163,13 @@ static void child_job_set_aio_ctx(BdrvChild *c, AioContext *ctx,
     job->job.aio_context = ctx;
 }
 
+static AioContext *child_job_get_parent_aio_context(BdrvChild *c)
+{
+    BlockJob *job = c->opaque;
+
+    return job->job.aio_context;
+}
+
 static const BdrvChildClass child_job = {
     .get_parent_desc    = child_job_get_parent_desc,
     .drained_begin      = child_job_drained_begin,
@@ -171,6 +178,7 @@ static const BdrvChildClass child_job = {
     .can_set_aio_ctx    = child_job_can_set_aio_ctx,
     .set_aio_ctx        = child_job_set_aio_ctx,
     .stay_at_node       = true,
+    .get_parent_aio_context = child_job_get_parent_aio_context,
 };
 
 void block_job_remove_all_bdrv(BlockJob *job)
-- 
2.21.3




* [PATCH v2 07/36] block: drop ctx argument from bdrv_root_attach_child
  2020-11-27 14:44 [PATCH v2 00/36] block: update graph permissions update Vladimir Sementsov-Ogievskiy
                   ` (5 preceding siblings ...)
  2020-11-27 14:44 ` [PATCH v2 06/36] block: BdrvChildClass: add .get_parent_aio_context handler Vladimir Sementsov-Ogievskiy
@ 2020-11-27 14:44 ` Vladimir Sementsov-Ogievskiy
  2020-11-27 14:44 ` [PATCH v2 08/36] block: make bdrv_reopen_{prepare, commit, abort} private Vladimir Sementsov-Ogievskiy via
                   ` (29 subsequent siblings)
  36 siblings, 0 replies; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-11-27 14:44 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, armbru, jsnow, mreitz, kwolf, vsementsov, den

Passing the parent AioContext is redundant, as child_class and the parent
opaque pointer are enough to retrieve it. Drop the argument and use the new
bdrv_child_get_parent_aio_context() interface.
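
To illustrate why the class plus the opaque pointer is sufficient, here is
a minimal standalone sketch. It is not the QEMU code: Ctx, Node, Job and
all function names below are hypothetical stand-ins for AioContext,
BlockDriverState, BlockJob and the real handlers.

```c
#include <assert.h>

/* Hypothetical stand-in for AioContext. */
typedef struct Ctx { int id; } Ctx;
typedef struct Child Child;

/* Hypothetical stand-in for BdrvChildClass with only the new handler. */
typedef struct ChildClass {
    Ctx *(*get_parent_ctx)(Child *c);
} ChildClass;

struct Child {
    const ChildClass *klass;
    void *opaque;           /* the parent object; its type depends on klass */
};

/* Two different kinds of parent, each storing its context differently. */
typedef struct Node { Ctx *ctx; } Node;
typedef struct Job { Ctx *ctx; } Job;

static Ctx *node_get_parent_ctx(Child *c)
{
    Node *n = c->opaque;
    return n->ctx;
}

static Ctx *job_get_parent_ctx(Child *c)
{
    Job *j = c->opaque;
    return j->ctx;
}

static const ChildClass child_of_node = { .get_parent_ctx = node_get_parent_ctx };
static const ChildClass child_of_job = { .get_parent_ctx = job_get_parent_ctx };

/* Generic accessor: the caller no longer has to pass the context in. */
static Ctx *child_get_parent_ctx(Child *c)
{
    assert(c->klass->get_parent_ctx);
    return c->klass->get_parent_ctx(c);
}
```

The attach function can then ask the child for its parent's context
instead of trusting a separately passed argument that could get out of
sync with the parent.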

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 include/block/block_int.h | 1 -
 block.c                   | 8 +++++---
 block/block-backend.c     | 4 ++--
 blockjob.c                | 3 +--
 4 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/include/block/block_int.h b/include/block/block_int.h
index 943fd855fe..24a04ac2dc 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -1278,7 +1278,6 @@ BdrvChild *bdrv_root_attach_child(BlockDriverState *child_bs,
                                   const char *child_name,
                                   const BdrvChildClass *child_class,
                                   BdrvChildRole child_role,
-                                  AioContext *ctx,
                                   uint64_t perm, uint64_t shared_perm,
                                   void *opaque, Error **errp);
 void bdrv_root_unref_child(BdrvChild *child);
diff --git a/block.c b/block.c
index 95d3684d8d..15e6ab666e 100644
--- a/block.c
+++ b/block.c
@@ -2640,13 +2640,13 @@ BdrvChild *bdrv_root_attach_child(BlockDriverState *child_bs,
                                   const char *child_name,
                                   const BdrvChildClass *child_class,
                                   BdrvChildRole child_role,
-                                  AioContext *ctx,
                                   uint64_t perm, uint64_t shared_perm,
                                   void *opaque, Error **errp)
 {
     BdrvChild *child;
     Error *local_err = NULL;
     int ret;
+    AioContext *ctx;
 
     ret = bdrv_check_update_perm(child_bs, NULL, perm, shared_perm, NULL, errp);
     if (ret < 0) {
@@ -2666,6 +2666,8 @@ BdrvChild *bdrv_root_attach_child(BlockDriverState *child_bs,
         .opaque         = opaque,
     };
 
+    ctx = bdrv_child_get_parent_aio_context(child);
+
     /* If the AioContexts don't match, first try to move the subtree of
      * child_bs into the AioContext of the new parent. If this doesn't work,
      * try moving the parent into the AioContext of child_bs instead. */
@@ -2721,8 +2723,8 @@ BdrvChild *bdrv_attach_child(BlockDriverState *parent_bs,
                     perm, shared_perm, &perm, &shared_perm);
 
     child = bdrv_root_attach_child(child_bs, child_name, child_class,
-                                   child_role, bdrv_get_aio_context(parent_bs),
-                                   perm, shared_perm, parent_bs, errp);
+                                   child_role, perm, shared_perm, parent_bs,
+                                   errp);
     if (child == NULL) {
         return NULL;
     }
diff --git a/block/block-backend.c b/block/block-backend.c
index 28efa0dff3..357931ee34 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -435,7 +435,7 @@ BlockBackend *blk_new_open(const char *filename, const char *reference,
 
     blk->root = bdrv_root_attach_child(bs, "root", &child_root,
                                        BDRV_CHILD_FILTERED | BDRV_CHILD_PRIMARY,
-                                       blk->ctx, perm, BLK_PERM_ALL, blk, errp);
+                                       perm, BLK_PERM_ALL, blk, errp);
     if (!blk->root) {
         blk_unref(blk);
         return NULL;
@@ -849,7 +849,7 @@ int blk_insert_bs(BlockBackend *blk, BlockDriverState *bs, Error **errp)
     bdrv_ref(bs);
     blk->root = bdrv_root_attach_child(bs, "root", &child_root,
                                        BDRV_CHILD_FILTERED | BDRV_CHILD_PRIMARY,
-                                       blk->ctx, blk->perm, blk->shared_perm,
+                                       blk->perm, blk->shared_perm,
                                        blk, errp);
     if (blk->root == NULL) {
         return -EPERM;
diff --git a/blockjob.c b/blockjob.c
index f671763c2c..01da714755 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -225,8 +225,7 @@ int block_job_add_bdrv(BlockJob *job, const char *name, BlockDriverState *bs,
     if (job->job.aio_context != qemu_get_aio_context()) {
         aio_context_release(job->job.aio_context);
     }
-    c = bdrv_root_attach_child(bs, name, &child_job, 0,
-                               job->job.aio_context, perm, shared_perm, job,
+    c = bdrv_root_attach_child(bs, name, &child_job, 0, perm, shared_perm, job,
                                errp);
     if (job->job.aio_context != qemu_get_aio_context()) {
         aio_context_acquire(job->job.aio_context);
-- 
2.21.3




* [PATCH v2 08/36] block: make bdrv_reopen_{prepare, commit, abort} private
  2020-11-27 14:44 [PATCH v2 00/36] block: update graph permissions update Vladimir Sementsov-Ogievskiy
                   ` (6 preceding siblings ...)
  2020-11-27 14:44 ` [PATCH v2 07/36] block: drop ctx argument from bdrv_root_attach_child Vladimir Sementsov-Ogievskiy
@ 2020-11-27 14:44 ` Vladimir Sementsov-Ogievskiy via
  2020-12-15 17:28   ` Alberto Garcia
  2021-01-18 15:24   ` [PATCH v2 08/36] block: make bdrv_reopen_{prepare,commit,abort} private Kevin Wolf
  2020-11-27 14:44 ` [PATCH v2 09/36] block: return value from bdrv_replace_node() Vladimir Sementsov-Ogievskiy
                   ` (28 subsequent siblings)
  36 siblings, 2 replies; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy via @ 2020-11-27 14:44 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, armbru, jsnow, mreitz, kwolf, vsementsov, den

These functions are called only from bdrv_reopen_multiple() in block.c.
No reason to publish them.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 include/block/block.h |  4 ----
 block.c               | 13 +++++++++----
 2 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/include/block/block.h b/include/block/block.h
index 6788ccd25b..5d59984ad4 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -373,10 +373,6 @@ BlockReopenQueue *bdrv_reopen_queue(BlockReopenQueue *bs_queue,
 int bdrv_reopen_multiple(BlockReopenQueue *bs_queue, Error **errp);
 int bdrv_reopen_set_read_only(BlockDriverState *bs, bool read_only,
                               Error **errp);
-int bdrv_reopen_prepare(BDRVReopenState *reopen_state,
-                        BlockReopenQueue *queue, Error **errp);
-void bdrv_reopen_commit(BDRVReopenState *reopen_state);
-void bdrv_reopen_abort(BDRVReopenState *reopen_state);
 int bdrv_pwrite_zeroes(BdrvChild *child, int64_t offset,
                        int bytes, BdrvRequestFlags flags);
 int bdrv_make_zero(BdrvChild *child, BdrvRequestFlags flags);
diff --git a/block.c b/block.c
index 15e6ab666e..3765c7caed 100644
--- a/block.c
+++ b/block.c
@@ -84,6 +84,11 @@ static BlockDriverState *bdrv_open_inherit(const char *filename,
 static void bdrv_parent_set_aio_context_ignore(BdrvChild *c, AioContext *ctx,
                                                GSList **ignore);
 
+static int bdrv_reopen_prepare(BDRVReopenState *reopen_state, BlockReopenQueue
+                               *queue, Error **errp);
+static void bdrv_reopen_commit(BDRVReopenState *reopen_state);
+static void bdrv_reopen_abort(BDRVReopenState *reopen_state);
+
 /* If non-zero, use only whitelisted block drivers */
 static int use_bdrv_whitelist;
 
@@ -4082,8 +4087,8 @@ static int bdrv_reopen_parse_backing(BDRVReopenState *reopen_state,
  * commit() for any other BDS that have been left in a prepare() state
  *
  */
-int bdrv_reopen_prepare(BDRVReopenState *reopen_state, BlockReopenQueue *queue,
-                        Error **errp)
+static int bdrv_reopen_prepare(BDRVReopenState *reopen_state,
+                               BlockReopenQueue *queue, Error **errp)
 {
     int ret = -1;
     int old_flags;
@@ -4298,7 +4303,7 @@ error:
  * makes them final by swapping the staging BlockDriverState contents into
  * the active BlockDriverState contents.
  */
-void bdrv_reopen_commit(BDRVReopenState *reopen_state)
+static void bdrv_reopen_commit(BDRVReopenState *reopen_state)
 {
     BlockDriver *drv;
     BlockDriverState *bs;
@@ -4358,7 +4363,7 @@ void bdrv_reopen_commit(BDRVReopenState *reopen_state)
  * Abort the reopen, and delete and free the staged changes in
  * reopen_state
  */
-void bdrv_reopen_abort(BDRVReopenState *reopen_state)
+static void bdrv_reopen_abort(BDRVReopenState *reopen_state)
 {
     BlockDriver *drv;
 
-- 
2.21.3




* [PATCH v2 09/36] block: return value from bdrv_replace_node()
  2020-11-27 14:44 [PATCH v2 00/36] block: update graph permissions update Vladimir Sementsov-Ogievskiy
                   ` (7 preceding siblings ...)
  2020-11-27 14:44 ` [PATCH v2 08/36] block: make bdrv_reopen_{prepare, commit, abort} private Vladimir Sementsov-Ogievskiy via
@ 2020-11-27 14:44 ` Vladimir Sementsov-Ogievskiy
  2020-12-15 17:30   ` Alberto Garcia
  2021-01-18 15:40   ` Kevin Wolf
  2020-11-27 14:44 ` [PATCH v2 10/36] util: add transactions.c Vladimir Sementsov-Ogievskiy
                   ` (27 subsequent siblings)
  36 siblings, 2 replies; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-11-27 14:44 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, armbru, jsnow, mreitz, kwolf, vsementsov, den

Functions with an errp argument are not recommended to be void functions.
Make bdrv_replace_node() return a status.
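
As a rough illustration of the convention (a sketch only: the Error struct
and replace_node_sketch() below are hypothetical and much simpler than
QEMU's real Error API), a status-returning function lets callers test the
return value directly, even when they pass errp == NULL:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical minimal error object; QEMU's real Error type is richer. */
typedef struct Error {
    const char *msg;
} Error;

static Error conflict_error = { "permission conflict" };

/* Returns 0 on success, -EPERM on failure; sets *errp only on failure. */
static int replace_node_sketch(int allowed, Error **errp)
{
    if (!allowed) {
        if (errp) {
            *errp = &conflict_error;
        }
        return -EPERM;
    }
    return 0;
}
```

With a void function the caller would need a local error variable just to
find out whether the call failed.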

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 include/block/block.h |  4 ++--
 block.c               | 14 ++++++++------
 2 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/include/block/block.h b/include/block/block.h
index 5d59984ad4..8f6100dad7 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -346,8 +346,8 @@ int bdrv_create_file(const char *filename, QemuOpts *opts, Error **errp);
 BlockDriverState *bdrv_new(void);
 int bdrv_append(BlockDriverState *bs_new, BlockDriverState *bs_top,
                 Error **errp);
-void bdrv_replace_node(BlockDriverState *from, BlockDriverState *to,
-                       Error **errp);
+int bdrv_replace_node(BlockDriverState *from, BlockDriverState *to,
+                      Error **errp);
 
 int bdrv_parse_aio(const char *mode, int *flags);
 int bdrv_parse_cache_mode(const char *mode, int *flags, bool *writethrough);
diff --git a/block.c b/block.c
index 3765c7caed..29082c6d47 100644
--- a/block.c
+++ b/block.c
@@ -4537,14 +4537,14 @@ static bool should_update_child(BdrvChild *c, BlockDriverState *to)
  * With auto_skip=false the error is returned if from has a parent which should
  * not be updated.
  */
-static void bdrv_replace_node_common(BlockDriverState *from,
-                                     BlockDriverState *to,
-                                     bool auto_skip, Error **errp)
+static int bdrv_replace_node_common(BlockDriverState *from,
+                                    BlockDriverState *to,
+                                    bool auto_skip, Error **errp)
 {
+    int ret = -EPERM;
     BdrvChild *c, *next;
     GSList *list = NULL, *p;
     uint64_t perm = 0, shared = BLK_PERM_ALL;
-    int ret;
 
     /* Make sure that @from doesn't go away until we have successfully attached
      * all of its parents to @to. */
@@ -4600,10 +4600,12 @@ out:
     g_slist_free(list);
     bdrv_drained_end(from);
     bdrv_unref(from);
+
+    return ret;
 }
 
-void bdrv_replace_node(BlockDriverState *from, BlockDriverState *to,
-                       Error **errp)
+int bdrv_replace_node(BlockDriverState *from, BlockDriverState *to,
+                      Error **errp)
 {
     return bdrv_replace_node_common(from, to, true, errp);
 }
-- 
2.21.3




* [PATCH v2 10/36] util: add transactions.c
  2020-11-27 14:44 [PATCH v2 00/36] block: update graph permissions update Vladimir Sementsov-Ogievskiy
                   ` (8 preceding siblings ...)
  2020-11-27 14:44 ` [PATCH v2 09/36] block: return value from bdrv_replace_node() Vladimir Sementsov-Ogievskiy
@ 2020-11-27 14:44 ` Vladimir Sementsov-Ogievskiy
  2021-01-18 16:50   ` Kevin Wolf
  2020-11-27 14:44 ` [PATCH v2 11/36] block: bdrv_refresh_perms: check parents compliance Vladimir Sementsov-Ogievskiy
                   ` (26 subsequent siblings)
  36 siblings, 1 reply; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-11-27 14:44 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, armbru, jsnow, mreitz, kwolf, vsementsov, den

Add a simple transaction API, to be used in the upcoming rework of block
graph operations.

Intended usage:

- "prepare" is the main function of the action. It should make the main
  effect of the action visible to the following actions while keeping
  rollback possible, saving whatever is needed in the action state, which
  is prepended to the list. The driver struct therefore has no "prepare"
  field: prepare is supposed to be called directly.

- commit/rollback is supposed to be called for the list of action
  states, to commit/rollback all the actions in reverse order.

- When possible, "commit" should not have an effect visible to other
  actions, which makes transparent logical interaction between actions
  possible.
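
The usage described above can be sketched as follows. This is a
standalone illustration, not the code added by this patch: it uses a
hand-rolled list instead of GSList, and Action, ActionDrv, tran_finish()
and the set_int_* example action are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the API in this patch; not the QEMU code. */
typedef struct ActionDrv {
    void (*abort)(void *opaque);
    void (*commit)(void *opaque);
    void (*clean)(void *opaque);
} ActionDrv;

typedef struct Action {
    const ActionDrv *drv;
    void *opaque;
    struct Action *next;
} Action;

/* Prepend, so the list is later walked from the most recent action back. */
static void tran_prepend(Action **list, const ActionDrv *drv, void *opaque)
{
    Action *act = malloc(sizeof(*act));
    assert(act);
    act->drv = drv;
    act->opaque = opaque;
    act->next = *list;
    *list = act;
}

/* Abort everything if ret < 0, commit otherwise; then clean and free. */
static void tran_finish(Action *list, int ret)
{
    while (list) {
        Action *next = list->next;
        void (*fn)(void *) = ret < 0 ? list->drv->abort : list->drv->commit;
        if (fn) {
            fn(list->opaque);
        }
        if (list->drv->clean) {
            list->drv->clean(list->opaque);
        }
        free(list);
        list = next;
    }
}

/* Example action: set an int, remembering the old value for rollback. */
typedef struct SetIntState {
    int *target;
    int old_value;
} SetIntState;

static void set_int_abort(void *opaque)
{
    SetIntState *s = opaque;
    *s->target = s->old_value;
}

static void set_int_clean(void *opaque)
{
    free(opaque);
}

/* No .commit: the effect is already in place after prepare. */
static const ActionDrv set_int_drv = {
    .abort = set_int_abort,
    .clean = set_int_clean,
};

/* "prepare": called directly, effect is visible immediately. */
static void set_int_prepare(int *target, int value, Action **tran)
{
    SetIntState *s = malloc(sizeof(*s));
    assert(s);
    s->target = target;
    s->old_value = *target;
    *target = value;
    tran_prepend(tran, &set_int_drv, s);
}
```

If two set_int actions touch the same variable, aborting them in reverse
order restores the value seen before the first prepare; aborting in
forward order would leave the intermediate value behind, which is why the
list is built by prepending.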

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 include/qemu/transactions.h | 46 +++++++++++++++++++++
 util/transactions.c         | 81 +++++++++++++++++++++++++++++++++++++
 util/meson.build            |  1 +
 3 files changed, 128 insertions(+)
 create mode 100644 include/qemu/transactions.h
 create mode 100644 util/transactions.c

diff --git a/include/qemu/transactions.h b/include/qemu/transactions.h
new file mode 100644
index 0000000000..a5b15f46ab
--- /dev/null
+++ b/include/qemu/transactions.h
@@ -0,0 +1,46 @@
+/*
+ * Simple transactions API
+ *
+ * Copyright (c) 2020 Virtuozzo International GmbH.
+ *
+ * Author:
+ *  Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef QEMU_TRANSACTIONS_H
+#define QEMU_TRANSACTIONS_H
+
+#include <gmodule.h>
+
+typedef struct TransactionActionDrv {
+    void (*abort)(void *opaque);
+    void (*commit)(void *opaque);
+    void (*clean)(void *opaque);
+} TransactionActionDrv;
+
+void tran_prepend(GSList **list, TransactionActionDrv *drv, void *opaque);
+void tran_abort(GSList *backup);
+void tran_commit(GSList *backup);
+static inline void tran_finalize(GSList *backup, int ret)
+{
+    if (ret < 0) {
+        tran_abort(backup);
+    } else {
+        tran_commit(backup);
+    }
+}
+
+#endif /* QEMU_TRANSACTIONS_H */
diff --git a/util/transactions.c b/util/transactions.c
new file mode 100644
index 0000000000..ef1b9a36a4
--- /dev/null
+++ b/util/transactions.c
@@ -0,0 +1,81 @@
+/*
+ * Simple transactions API
+ *
+ * Copyright (c) 2020 Virtuozzo International GmbH.
+ *
+ * Author:
+ *  Sementsov-Ogievskiy Vladimir <vsementsov@virtuozzo.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "qemu/osdep.h"
+
+#include "qemu/transactions.h"
+
+typedef struct BdrvAction {
+    TransactionActionDrv *drv;
+    void *opaque;
+} BdrvAction;
+
+void tran_prepend(GSList **list, TransactionActionDrv *drv, void *opaque)
+{
+    BdrvAction *act;
+
+    act = g_new(BdrvAction, 1);
+    *act = (BdrvAction) {
+        .drv = drv,
+        .opaque = opaque
+    };
+
+    *list = g_slist_prepend(*list, act);
+}
+
+void tran_abort(GSList *list)
+{
+    GSList *p;
+
+    for (p = list; p != NULL; p = p->next) {
+        BdrvAction *act = p->data;
+
+        if (act->drv->abort) {
+            act->drv->abort(act->opaque);
+        }
+
+        if (act->drv->clean) {
+            act->drv->clean(act->opaque);
+        }
+    }
+
+    g_slist_free_full(list, g_free);
+}
+
+void tran_commit(GSList *list)
+{
+    GSList *p;
+
+    for (p = list; p != NULL; p = p->next) {
+        BdrvAction *act = p->data;
+
+        if (act->drv->commit) {
+            act->drv->commit(act->opaque);
+        }
+
+        if (act->drv->clean) {
+            act->drv->clean(act->opaque);
+        }
+    }
+
+    g_slist_free_full(list, g_free);
+}
diff --git a/util/meson.build b/util/meson.build
index f359af0d46..8c7c28bd40 100644
--- a/util/meson.build
+++ b/util/meson.build
@@ -41,6 +41,7 @@ util_ss.add(files('qsp.c'))
 util_ss.add(files('range.c'))
 util_ss.add(files('stats64.c'))
 util_ss.add(files('systemd.c'))
+util_ss.add(files('transactions.c'))
 util_ss.add(when: 'CONFIG_POSIX', if_true: files('drm.c'))
 util_ss.add(files('guest-random.c'))
 
-- 
2.21.3




* [PATCH v2 11/36] block: bdrv_refresh_perms: check parents compliance
  2020-11-27 14:44 [PATCH v2 00/36] block: update graph permissions update Vladimir Sementsov-Ogievskiy
                   ` (9 preceding siblings ...)
  2020-11-27 14:44 ` [PATCH v2 10/36] util: add transactions.c Vladimir Sementsov-Ogievskiy
@ 2020-11-27 14:44 ` Vladimir Sementsov-Ogievskiy
  2021-01-19 17:42   ` Kevin Wolf
  2020-11-27 14:44 ` [PATCH v2 12/36] block: refactor bdrv_child* permission functions Vladimir Sementsov-Ogievskiy
                   ` (25 subsequent siblings)
  36 siblings, 1 reply; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-11-27 14:44 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, armbru, jsnow, mreitz, kwolf, vsementsov, den

Add an additional check that the parents of a node do not conflict with
each other. This should not hurt existing callers, and it allows a later
patch to use bdrv_refresh_perms() to update the subtree of a changed
BdrvChild (that is, to check that the change is correct).

The new check will replace bdrv_check_update_perm() in the following
permission refactoring, so keep the error messages the same to avoid
changes in unit test results.
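
The pairwise check can be modelled in isolation as below. This is a
simplified sketch, not the patch itself: Edge, the PERM_* constants and
both function names are hypothetical, and error reporting is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified model of a parent edge; not QEMU's BdrvChild. */
typedef struct Edge {
    uint64_t perm;        /* permissions this parent needs */
    uint64_t shared_perm; /* permissions it tolerates from other parents */
} Edge;

enum { PERM_READ = 1 << 0, PERM_WRITE = 1 << 1 };

/* a allows b if everything b needs is shared by a. */
static bool a_allows_b(const Edge *a, const Edge *b)
{
    return (b->perm & a->shared_perm) == b->perm;
}

/* Check every ordered pair, since the relation is not symmetric. */
static bool parents_compliant(const Edge *parents, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            if (i != j && !a_allows_b(&parents[i], &parents[j])) {
                return false;
            }
        }
    }
    return true;
}
```

For example, a writer that shares everything and a reader that shares
only read access pass the check in one direction but fail in the other,
so both directions have to be tested.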

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block.c | 63 ++++++++++++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 54 insertions(+), 9 deletions(-)

diff --git a/block.c b/block.c
index 29082c6d47..a756f3e8ad 100644
--- a/block.c
+++ b/block.c
@@ -1966,6 +1966,57 @@ bool bdrv_is_writable(BlockDriverState *bs)
     return bdrv_is_writable_after_reopen(bs, NULL);
 }
 
+static char *bdrv_child_user_desc(BdrvChild *c)
+{
+    if (c->klass->get_parent_desc) {
+        return c->klass->get_parent_desc(c);
+    }
+
+    return g_strdup("another user");
+}
+
+static bool bdrv_a_allow_b(BdrvChild *a, BdrvChild *b, Error **errp)
+{
+    g_autofree char *user = NULL;
+    g_autofree char *perm_names = NULL;
+
+    if ((b->perm & a->shared_perm) == b->perm) {
+        return true;
+    }
+
+    perm_names = bdrv_perm_names(b->perm & ~a->shared_perm);
+    user = bdrv_child_user_desc(a);
+    error_setg(errp, "Conflicts with use by %s as '%s', which does not "
+               "allow '%s' on %s",
+               user, a->name, perm_names, bdrv_get_node_name(b->bs));
+
+    return false;
+}
+
+static bool bdrv_check_parents_compliance(BlockDriverState *bs, Error **errp)
+{
+    BdrvChild *a, *b;
+
+    /*
+     * During the loop we'll look at each pair twice. That's correct, as
+     * bdrv_a_allow_b() is asymmetric and we should check each pair in both
+     * directions.
+     */
+    QLIST_FOREACH(a, &bs->parents, next_parent) {
+        QLIST_FOREACH(b, &bs->parents, next_parent) {
+            if (a == b) {
+                continue;
+            }
+
+            if (!bdrv_a_allow_b(a, b, errp)) {
+                return false;
+            }
+        }
+    }
+
+    return true;
+}
+
 static void bdrv_child_perm(BlockDriverState *bs, BlockDriverState *child_bs,
                             BdrvChild *c, BdrvChildRole role,
                             BlockReopenQueue *reopen_queue,
@@ -2143,15 +2194,6 @@ void bdrv_get_cumulative_perm(BlockDriverState *bs, uint64_t *perm,
     *shared_perm = cumulative_shared_perms;
 }
 
-static char *bdrv_child_user_desc(BdrvChild *c)
-{
-    if (c->klass->get_parent_desc) {
-        return c->klass->get_parent_desc(c);
-    }
-
-    return g_strdup("another user");
-}
-
 char *bdrv_perm_names(uint64_t perm)
 {
     struct perm_name {
@@ -2295,6 +2337,9 @@ static int bdrv_refresh_perms(BlockDriverState *bs, Error **errp)
     int ret;
     uint64_t perm, shared_perm;
 
+    if (!bdrv_check_parents_compliance(bs, errp)) {
+        return -EPERM;
+    }
     bdrv_get_cumulative_perm(bs, &perm, &shared_perm);
     ret = bdrv_check_perm(bs, NULL, perm, shared_perm, NULL, errp);
     if (ret < 0) {
-- 
2.21.3




* [PATCH v2 12/36] block: refactor bdrv_child* permission functions
  2020-11-27 14:44 [PATCH v2 00/36] block: update graph permissions update Vladimir Sementsov-Ogievskiy
                   ` (10 preceding siblings ...)
  2020-11-27 14:44 ` [PATCH v2 11/36] block: bdrv_refresh_perms: check parents compliance Vladimir Sementsov-Ogievskiy
@ 2020-11-27 14:44 ` Vladimir Sementsov-Ogievskiy
  2021-01-19 18:09   ` Kevin Wolf
  2020-11-27 14:44 ` [PATCH v2 13/36] block: rewrite bdrv_child_try_set_perm() using bdrv_refresh_perms() Vladimir Sementsov-Ogievskiy
                   ` (24 subsequent siblings)
  36 siblings, 1 reply; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-11-27 14:44 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, armbru, jsnow, mreitz, kwolf, vsementsov, den

Split out the non-recursive parts and refactor them as a block graph
transaction action.
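
The backup-once behaviour of the split-out setter can be sketched on its
own. The struct below is a hypothetical reduction of BdrvChild to the
fields involved, and the functions take the child directly rather than a
void pointer or a transaction list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical reduced child edge; field names mirror the patch. */
typedef struct Child {
    uint64_t perm, shared_perm;
    bool has_backup_perm;
    uint64_t backup_perm, backup_shared_perm;
} Child;

/* Save the original permissions only on the first call, then overwrite. */
static void child_set_perm_safe(Child *c, uint64_t perm, uint64_t shared)
{
    if (!c->has_backup_perm) {
        c->has_backup_perm = true;
        c->backup_perm = c->perm;
        c->backup_shared_perm = c->shared_perm;
    }
    c->perm = perm;
    c->shared_perm = shared;
}

/* Commit: keep the new values and drop the backup. */
static void child_set_perm_commit(Child *c)
{
    c->has_backup_perm = false;
}

/* Abort: restore the backup if there is one; otherwise do nothing. */
static void child_set_perm_abort(Child *c)
{
    if (c->has_backup_perm) {
        c->perm = c->backup_perm;
        c->shared_perm = c->backup_shared_perm;
        c->has_backup_perm = false;
    }
}
```

Setting permissions twice before an abort still restores the values from
before the first set, and an abort after a commit is a no-op.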

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block.c | 79 ++++++++++++++++++++++++++++++++++++++++++---------------
 1 file changed, 59 insertions(+), 20 deletions(-)

diff --git a/block.c b/block.c
index a756f3e8ad..7267b4a3e9 100644
--- a/block.c
+++ b/block.c
@@ -48,6 +48,7 @@
 #include "qemu/timer.h"
 #include "qemu/cutils.h"
 #include "qemu/id.h"
+#include "qemu/transactions.h"
 #include "block/coroutines.h"
 
 #ifdef CONFIG_BSD
@@ -2033,6 +2034,61 @@ static void bdrv_child_perm(BlockDriverState *bs, BlockDriverState *child_bs,
     }
 }
 
+static void bdrv_child_set_perm_commit(void *opaque)
+{
+    BdrvChild *c = opaque;
+
+    c->has_backup_perm = false;
+}
+
+static void bdrv_child_set_perm_abort(void *opaque)
+{
+    BdrvChild *c = opaque;
+    /*
+     * We may have child->has_backup_perm unset at this point, as in case of
+     * _check_ stage of permission update failure we may _check_ not the whole
+     * subtree.  Still, _abort_ is called on the whole subtree anyway.
+     */
+    if (c->has_backup_perm) {
+        c->perm = c->backup_perm;
+        c->shared_perm = c->backup_shared_perm;
+        c->has_backup_perm = false;
+    }
+}
+
+static TransactionActionDrv bdrv_child_set_pem_drv = {
+    .abort = bdrv_child_set_perm_abort,
+    .commit = bdrv_child_set_perm_commit,
+};
+
+/*
+ * With tran=NULL needs to be followed by direct call to either
+ * bdrv_child_set_perm_commit() or bdrv_child_set_perm_abort().
+ *
+ * With non-NULL tran needs to be followed by tran_abort() or tran_commit()
+ * instead.
+ */
+static void bdrv_child_set_perm_safe(BdrvChild *c, uint64_t perm,
+                                     uint64_t shared, GSList **tran)
+{
+    if (!c->has_backup_perm) {
+        c->has_backup_perm = true;
+        c->backup_perm = c->perm;
+        c->backup_shared_perm = c->shared_perm;
+    }
+    /*
+     * Note: it's OK if c->has_backup_perm was already set, as we can find the
+     * same c twice during check_perm procedure
+     */
+
+    c->perm = perm;
+    c->shared_perm = shared;
+
+    if (tran) {
+        tran_prepend(tran, &bdrv_child_set_pem_drv, c);
+    }
+}
+
 /*
  * Check whether permissions on this node can be changed in a way that
  * @cumulative_perms and @cumulative_shared_perms are the new cumulative
@@ -2298,37 +2354,20 @@ static int bdrv_child_check_perm(BdrvChild *c, BlockReopenQueue *q,
         return ret;
     }
 
-    if (!c->has_backup_perm) {
-        c->has_backup_perm = true;
-        c->backup_perm = c->perm;
-        c->backup_shared_perm = c->shared_perm;
-    }
-    /*
-     * Note: it's OK if c->has_backup_perm was already set, as we can find the
-     * same child twice during check_perm procedure
-     */
-
-    c->perm = perm;
-    c->shared_perm = shared;
+    bdrv_child_set_perm_safe(c, perm, shared, NULL);
 
     return 0;
 }
 
 static void bdrv_child_set_perm(BdrvChild *c)
 {
-    c->has_backup_perm = false;
-
+    bdrv_child_set_perm_commit(c);
     bdrv_set_perm(c->bs);
 }
 
 static void bdrv_child_abort_perm_update(BdrvChild *c)
 {
-    if (c->has_backup_perm) {
-        c->perm = c->backup_perm;
-        c->shared_perm = c->backup_shared_perm;
-        c->has_backup_perm = false;
-    }
-
+    bdrv_child_set_perm_abort(c);
     bdrv_abort_perm_update(c->bs);
 }
 
-- 
2.21.3




* [PATCH v2 13/36] block: rewrite bdrv_child_try_set_perm() using bdrv_refresh_perms()
  2020-11-27 14:44 [PATCH v2 00/36] block: update graph permissions update Vladimir Sementsov-Ogievskiy
                   ` (11 preceding siblings ...)
  2020-11-27 14:44 ` [PATCH v2 12/36] block: refactor bdrv_child* permission functions Vladimir Sementsov-Ogievskiy
@ 2020-11-27 14:44 ` Vladimir Sementsov-Ogievskiy
  2020-11-27 14:45 ` [PATCH v2 14/36] block: inline bdrv_child_*() permission functions calls Vladimir Sementsov-Ogievskiy
                   ` (23 subsequent siblings)
  36 siblings, 0 replies; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-11-27 14:44 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, armbru, jsnow, mreitz, kwolf, vsementsov, den

We are going to drop the recursive bdrv_child_* functions, so stop using
them in bdrv_child_try_set_perm() as a first step.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/block.c b/block.c
index 7267b4a3e9..82786ebe1f 100644
--- a/block.c
+++ b/block.c
@@ -2394,11 +2394,16 @@ int bdrv_child_try_set_perm(BdrvChild *c, uint64_t perm, uint64_t shared,
                             Error **errp)
 {
     Error *local_err = NULL;
+    GSList *tran = NULL;
     int ret;
 
-    ret = bdrv_child_check_perm(c, NULL, perm, shared, NULL, &local_err);
+    bdrv_child_set_perm_safe(c, perm, shared, &tran);
+
+    ret = bdrv_refresh_perms(c->bs, &local_err);
+
+    tran_finalize(tran, ret);
+
     if (ret < 0) {
-        bdrv_child_abort_perm_update(c);
         if ((perm & ~c->perm) || (c->shared_perm & ~shared)) {
             /* tighten permissions */
             error_propagate(errp, local_err);
@@ -2412,12 +2417,9 @@ int bdrv_child_try_set_perm(BdrvChild *c, uint64_t perm, uint64_t shared,
             error_free(local_err);
             ret = 0;
         }
-        return ret;
     }
 
-    bdrv_child_set_perm(c);
-
-    return 0;
+    return ret;
 }
 
 int bdrv_child_refresh_perms(BlockDriverState *bs, BdrvChild *c, Error **errp)
-- 
2.21.3




* [PATCH v2 14/36] block: inline bdrv_child_*() permission functions calls
  2020-11-27 14:44 [PATCH v2 00/36] block: update graph permissions update Vladimir Sementsov-Ogievskiy
                   ` (12 preceding siblings ...)
  2020-11-27 14:44 ` [PATCH v2 13/36] block: rewrite bdrv_child_try_set_perm() using bdrv_refresh_perms() Vladimir Sementsov-Ogievskiy
@ 2020-11-27 14:45 ` Vladimir Sementsov-Ogievskiy
  2020-12-16 17:16   ` Alberto Garcia
  2020-11-27 14:45 ` [PATCH v2 15/36] block: use topological sort for permission update Vladimir Sementsov-Ogievskiy
                   ` (22 subsequent siblings)
  36 siblings, 1 reply; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-11-27 14:45 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, armbru, jsnow, mreitz, kwolf, vsementsov, den

Each of them has only one caller. Open-coding them simplifies further
permission-update system changes.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block.c | 59 +++++++++++++++++----------------------------------------
 1 file changed, 17 insertions(+), 42 deletions(-)

diff --git a/block.c b/block.c
index 82786ebe1f..92bfcbedc9 100644
--- a/block.c
+++ b/block.c
@@ -1914,11 +1914,11 @@ static int bdrv_fill_options(QDict **options, const char *filename,
     return 0;
 }
 
-static int bdrv_child_check_perm(BdrvChild *c, BlockReopenQueue *q,
-                                 uint64_t perm, uint64_t shared,
-                                 GSList *ignore_children, Error **errp);
-static void bdrv_child_abort_perm_update(BdrvChild *c);
-static void bdrv_child_set_perm(BdrvChild *c);
+static int bdrv_check_update_perm(BlockDriverState *bs, BlockReopenQueue *q,
+                                  uint64_t new_used_perm,
+                                  uint64_t new_shared_perm,
+                                  GSList *ignore_children,
+                                  Error **errp);
 
 typedef struct BlockReopenQueueEntry {
      bool prepared;
@@ -2166,15 +2166,21 @@ static int bdrv_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
     /* Check all children */
     QLIST_FOREACH(c, &bs->children, next) {
         uint64_t cur_perm, cur_shared;
+        GSList *cur_ignore_children;
 
         bdrv_child_perm(bs, c->bs, c, c->role, q,
                         cumulative_perms, cumulative_shared_perms,
                         &cur_perm, &cur_shared);
-        ret = bdrv_child_check_perm(c, q, cur_perm, cur_shared, ignore_children,
-                                    errp);
+
+        cur_ignore_children = g_slist_prepend(g_slist_copy(ignore_children), c);
+        ret = bdrv_check_update_perm(c->bs, q, cur_perm, cur_shared,
+                                     cur_ignore_children, errp);
+        g_slist_free(cur_ignore_children);
         if (ret < 0) {
             return ret;
         }
+
+        bdrv_child_set_perm_safe(c, cur_perm, cur_shared, NULL);
     }
 
     return 0;
@@ -2201,7 +2207,8 @@ static void bdrv_abort_perm_update(BlockDriverState *bs)
     }
 
     QLIST_FOREACH(c, &bs->children, next) {
-        bdrv_child_abort_perm_update(c);
+        bdrv_child_set_perm_abort(c);
+        bdrv_abort_perm_update(c->bs);
     }
 }
 
@@ -2230,7 +2237,8 @@ static void bdrv_set_perm(BlockDriverState *bs)
 
     /* Update all children */
     QLIST_FOREACH(c, &bs->children, next) {
-        bdrv_child_set_perm(c);
+        bdrv_child_set_perm_commit(c);
+        bdrv_set_perm(c->bs);
     }
 }
 
@@ -2338,39 +2346,6 @@ static int bdrv_check_update_perm(BlockDriverState *bs, BlockReopenQueue *q,
                            ignore_children, errp);
 }
 
-/* Needs to be followed by a call to either bdrv_child_set_perm() or
- * bdrv_child_abort_perm_update(). */
-static int bdrv_child_check_perm(BdrvChild *c, BlockReopenQueue *q,
-                                 uint64_t perm, uint64_t shared,
-                                 GSList *ignore_children, Error **errp)
-{
-    int ret;
-
-    ignore_children = g_slist_prepend(g_slist_copy(ignore_children), c);
-    ret = bdrv_check_update_perm(c->bs, q, perm, shared, ignore_children, errp);
-    g_slist_free(ignore_children);
-
-    if (ret < 0) {
-        return ret;
-    }
-
-    bdrv_child_set_perm_safe(c, perm, shared, NULL);
-
-    return 0;
-}
-
-static void bdrv_child_set_perm(BdrvChild *c)
-{
-    bdrv_child_set_perm_commit(c);
-    bdrv_set_perm(c->bs);
-}
-
-static void bdrv_child_abort_perm_update(BdrvChild *c)
-{
-    bdrv_child_set_perm_abort(c);
-    bdrv_abort_perm_update(c->bs);
-}
-
 static int bdrv_refresh_perms(BlockDriverState *bs, Error **errp)
 {
     int ret;
-- 
2.21.3




* [PATCH v2 15/36] block: use topological sort for permission update
  2020-11-27 14:44 [PATCH v2 00/36] block: update graph permissions update Vladimir Sementsov-Ogievskiy
                   ` (13 preceding siblings ...)
  2020-11-27 14:45 ` [PATCH v2 14/36] block: inline bdrv_child_*() permission functions calls Vladimir Sementsov-Ogievskiy
@ 2020-11-27 14:45 ` Vladimir Sementsov-Ogievskiy
  2021-01-27 18:38   ` Kevin Wolf
  2020-11-27 14:45 ` [PATCH v2 16/36] block: add bdrv_drv_set_perm transaction action Vladimir Sementsov-Ogievskiy
                   ` (21 subsequent siblings)
  36 siblings, 1 reply; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-11-27 14:45 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, armbru, jsnow, mreitz, kwolf, vsementsov, den

Rewrite bdrv_check_perm(), bdrv_abort_perm_update() and bdrv_set_perm()
to update nodes in topological sort order instead of simple DFS. With
topologically sorted nodes, we update a node only when all its parents
have already been updated. With simple DFS this is not guaranteed.

Consider the following example:

    A -+
    |  |
    |  v
    |  B
    |  |
    v  |
    C<-+

A is a parent of B and C, and B is a parent of C.

Obviously, to update permissions we should go in the order A B C, so
that when we update C, all parent permissions have already been updated.
But with the current approach (simple recursion) we can update in the
sequence A C B C (C is updated twice). On the first update of C we
consider the old B permissions, so we do the wrong thing. If it
succeeds, all is OK: on the second C update we finish with a correct
graph. But if the wrong thing fails, we break the whole process for no
reason (it's possible that the updated B permissions would be less
strict, but we never get to check that).
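
To make the A C B C sequence concrete, here is a small standalone sketch
of plain recursion over the example graph. Node, Trace and the helper
names are hypothetical and only model the visiting order, not the real
permission code. It assumes A lists C before B among its children, which
is what produces the problematic order.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CHILDREN 2
#define MAX_VISITS 16

/* Hypothetical miniature node; not QEMU's BlockDriverState. */
typedef struct Node {
    char name;
    struct Node *children[MAX_CHILDREN];
} Node;

/* Visit order, recorded as a NUL-terminated string of node names. */
typedef struct Trace {
    char order[MAX_VISITS + 1];
    size_t len;
} Trace;

/* Plain recursion: "update" a node, then recurse into each child. */
static void naive_update(const Node *n, Trace *t)
{
    assert(t->len < MAX_VISITS);
    t->order[t->len++] = n->name;
    t->order[t->len] = '\0';

    for (size_t i = 0; i < MAX_CHILDREN && n->children[i]; i++) {
        naive_update(n->children[i], t);
    }
}

/* Build the graph from the example and return the naive visit order. */
static const char *naive_order_for_example(Trace *t)
{
    static Node a = { .name = 'A' };
    static Node b = { .name = 'B' };
    static Node c = { .name = 'C' };

    a.children[0] = &c;   /* assumption: A lists C before B */
    a.children[1] = &b;
    b.children[0] = &c;

    t->len = 0;
    t->order[0] = '\0';
    naive_update(&a, t);
    return t->order;
}
```

With this child order the recorded sequence is A, C, B, C: the first
visit of C happens before B has been updated.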

Also, the new approach gives a way to update several nodes
simultaneously and correctly: we just need to run bdrv_topological_dfs()
several times to add all nodes and their subtrees into one topologically
sorted list (a later patch in this series will update
bdrv_replace_node() in this manner).
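
The ordering idea can be sketched outside of QEMU as follows. This is
only an illustration under simplifying assumptions: Node, Order and
order_index() are hypothetical, a per-node "found" flag replaces the
GHashTable, and a fixed array filled from the back replaces the
prepended GSList. The traversal itself follows the same scheme as
bdrv_topological_dfs(): skip nodes already found, recurse into children,
then prepend the node.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_CHILDREN 2
#define MAX_NODES 8

/* Hypothetical miniature node; not QEMU's BlockDriverState. */
typedef struct Node {
    char name;
    struct Node *children[MAX_CHILDREN];
    bool found;           /* stands in for the hash table of seen nodes */
} Node;

/* Output order; filled from the back so each insert acts as a prepend. */
typedef struct Order {
    Node *slot[MAX_NODES];
    size_t head;          /* index of the first valid slot */
} Order;

static void order_init(Order *o)
{
    o->head = MAX_NODES;
}

/*
 * Post-order DFS with prepend: a node is stored only after all of its
 * children, so in the final order every parent precedes its children.
 */
static void topological_dfs(Order *o, Node *n)
{
    if (n->found) {
        return;
    }
    n->found = true;

    for (size_t i = 0; i < MAX_CHILDREN && n->children[i]; i++) {
        topological_dfs(o, n->children[i]);
    }

    assert(o->head > 0);
    o->slot[--o->head] = n;
}

/* Position of n in the order, or MAX_NODES if n is not in it. */
static size_t order_index(const Order *o, const Node *n)
{
    for (size_t i = o->head; i < MAX_NODES; i++) {
        if (o->slot[i] == n) {
            return i - o->head;
        }
    }
    return MAX_NODES;
}
```

Calling topological_dfs() once on A from the example yields A B C even
if A lists C first. Calling it again on the same Order with another
root, say a node D whose only child is B, prepends just D, because B and
C are already marked as found. That is the "run it several times" case
described above.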

Test test_parallel_perm_update() now passes, so move it out of the
debugging "if".

We also need to support ignore_children in
bdrv_check_parents_compliance().

For test 283, the order of the parents compliance check changes, so the
expected error message changes accordingly.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block.c                     | 103 +++++++++++++++++++++++++++++-------
 tests/test-bdrv-graph-mod.c |   4 +-
 tests/qemu-iotests/283.out  |   2 +-
 3 files changed, 86 insertions(+), 23 deletions(-)

diff --git a/block.c b/block.c
index 92bfcbedc9..81ccf51605 100644
--- a/block.c
+++ b/block.c
@@ -1994,7 +1994,9 @@ static bool bdrv_a_allow_b(BdrvChild *a, BdrvChild *b, Error **errp)
     return false;
 }
 
-static bool bdrv_check_parents_compliance(BlockDriverState *bs, Error **errp)
+static bool bdrv_check_parents_compliance(BlockDriverState *bs,
+                                          GSList *ignore_children,
+                                          Error **errp)
 {
     BdrvChild *a, *b;
 
@@ -2005,7 +2007,9 @@ static bool bdrv_check_parents_compliance(BlockDriverState *bs, Error **errp)
      */
     QLIST_FOREACH(a, &bs->parents, next_parent) {
         QLIST_FOREACH(b, &bs->parents, next_parent) {
-            if (a == b) {
+            if (a == b || g_slist_find(ignore_children, a) ||
+                g_slist_find(ignore_children, b))
+            {
                 continue;
             }
 
@@ -2034,6 +2038,29 @@ static void bdrv_child_perm(BlockDriverState *bs, BlockDriverState *child_bs,
     }
 }
 
+static GSList *bdrv_topological_dfs(GSList *list, GHashTable *found,
+                                    BlockDriverState *bs)
+{
+    BdrvChild *child;
+    g_autoptr(GHashTable) local_found = NULL;
+
+    if (!found) {
+        assert(!list);
+        found = local_found = g_hash_table_new(NULL, NULL);
+    }
+
+    if (g_hash_table_contains(found, bs)) {
+        return list;
+    }
+    g_hash_table_add(found, bs);
+
+    QLIST_FOREACH(child, &bs->children, next) {
+        list = bdrv_topological_dfs(list, found, child->bs);
+    }
+
+    return g_slist_prepend(list, bs);
+}
+
 static void bdrv_child_set_perm_commit(void *opaque)
 {
     BdrvChild *c = opaque;
@@ -2098,10 +2125,10 @@ static void bdrv_child_set_perm_safe(BdrvChild *c, uint64_t perm,
  * A call to this function must always be followed by a call to bdrv_set_perm()
  * or bdrv_abort_perm_update().
  */
-static int bdrv_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
-                           uint64_t cumulative_perms,
-                           uint64_t cumulative_shared_perms,
-                           GSList *ignore_children, Error **errp)
+static int bdrv_node_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
+                                uint64_t cumulative_perms,
+                                uint64_t cumulative_shared_perms,
+                                GSList *ignore_children, Error **errp)
 {
     BlockDriver *drv = bs->drv;
     BdrvChild *c;
@@ -2166,21 +2193,43 @@ static int bdrv_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
     /* Check all children */
     QLIST_FOREACH(c, &bs->children, next) {
         uint64_t cur_perm, cur_shared;
-        GSList *cur_ignore_children;
 
         bdrv_child_perm(bs, c->bs, c, c->role, q,
                         cumulative_perms, cumulative_shared_perms,
                         &cur_perm, &cur_shared);
+        bdrv_child_set_perm_safe(c, cur_perm, cur_shared, NULL);
+    }
+
+    return 0;
+}
+
+static int bdrv_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
+                           uint64_t cumulative_perms,
+                           uint64_t cumulative_shared_perms,
+                           GSList *ignore_children, Error **errp)
+{
+    int ret;
+    BlockDriverState *root = bs;
+    g_autoptr(GSList) list = bdrv_topological_dfs(NULL, NULL, root);
+
+    for ( ; list; list = list->next) {
+        bs = list->data;
+
+        if (bs != root) {
+            if (!bdrv_check_parents_compliance(bs, ignore_children, errp)) {
+                return -EINVAL;
+            }
+
+            bdrv_get_cumulative_perm(bs, &cumulative_perms,
+                                     &cumulative_shared_perms);
+        }
 
-        cur_ignore_children = g_slist_prepend(g_slist_copy(ignore_children), c);
-        ret = bdrv_check_update_perm(c->bs, q, cur_perm, cur_shared,
-                                     cur_ignore_children, errp);
-        g_slist_free(cur_ignore_children);
+        ret = bdrv_node_check_perm(bs, q, cumulative_perms,
+                                   cumulative_shared_perms,
+                                   ignore_children, errp);
         if (ret < 0) {
             return ret;
         }
-
-        bdrv_child_set_perm_safe(c, cur_perm, cur_shared, NULL);
     }
 
     return 0;
@@ -2190,10 +2239,8 @@ static int bdrv_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
  * Notifies drivers that after a previous bdrv_check_perm() call, the
  * permission update is not performed and any preparations made for it (e.g.
  * taken file locks) need to be undone.
- *
- * This function recursively notifies all child nodes.
  */
-static void bdrv_abort_perm_update(BlockDriverState *bs)
+static void bdrv_node_abort_perm_update(BlockDriverState *bs)
 {
     BlockDriver *drv = bs->drv;
     BdrvChild *c;
@@ -2208,11 +2255,19 @@ static void bdrv_abort_perm_update(BlockDriverState *bs)
 
     QLIST_FOREACH(c, &bs->children, next) {
         bdrv_child_set_perm_abort(c);
-        bdrv_abort_perm_update(c->bs);
     }
 }
 
-static void bdrv_set_perm(BlockDriverState *bs)
+static void bdrv_abort_perm_update(BlockDriverState *bs)
+{
+    g_autoptr(GSList) list = bdrv_topological_dfs(NULL, NULL, bs);
+
+    for ( ; list; list = list->next) {
+        bdrv_node_abort_perm_update((BlockDriverState *)list->data);
+    }
+}
+
+static void bdrv_node_set_perm(BlockDriverState *bs)
 {
     uint64_t cumulative_perms, cumulative_shared_perms;
     BlockDriver *drv = bs->drv;
@@ -2238,7 +2293,15 @@ static void bdrv_set_perm(BlockDriverState *bs)
     /* Update all children */
     QLIST_FOREACH(c, &bs->children, next) {
         bdrv_child_set_perm_commit(c);
-        bdrv_set_perm(c->bs);
+    }
+}
+
+static void bdrv_set_perm(BlockDriverState *bs)
+{
+    g_autoptr(GSList) list = bdrv_topological_dfs(NULL, NULL, bs);
+
+    for ( ; list; list = list->next) {
+        bdrv_node_set_perm((BlockDriverState *)list->data);
     }
 }
 
@@ -2351,7 +2414,7 @@ static int bdrv_refresh_perms(BlockDriverState *bs, Error **errp)
     int ret;
     uint64_t perm, shared_perm;
 
-    if (!bdrv_check_parents_compliance(bs, errp)) {
+    if (!bdrv_check_parents_compliance(bs, NULL, errp)) {
         return -EPERM;
     }
     bdrv_get_cumulative_perm(bs, &perm, &shared_perm);
diff --git a/tests/test-bdrv-graph-mod.c b/tests/test-bdrv-graph-mod.c
index 74f4a4153b..0d62e05ddb 100644
--- a/tests/test-bdrv-graph-mod.c
+++ b/tests/test-bdrv-graph-mod.c
@@ -316,12 +316,12 @@ int main(int argc, char *argv[])
     g_test_add_func("/bdrv-graph-mod/update-perm-tree", test_update_perm_tree);
     g_test_add_func("/bdrv-graph-mod/should-update-child",
                     test_should_update_child);
+    g_test_add_func("/bdrv-graph-mod/parallel-perm-update",
+                    test_parallel_perm_update);
 
     if (debug) {
         g_test_add_func("/bdrv-graph-mod/parallel-exclusive-write",
                         test_parallel_exclusive_write);
-        g_test_add_func("/bdrv-graph-mod/parallel-perm-update",
-                        test_parallel_perm_update);
     }
 
     return g_test_run();
diff --git a/tests/qemu-iotests/283.out b/tests/qemu-iotests/283.out
index d8cff22cc1..fbb7d0f619 100644
--- a/tests/qemu-iotests/283.out
+++ b/tests/qemu-iotests/283.out
@@ -5,4 +5,4 @@
 {"execute": "blockdev-add", "arguments": {"driver": "blkdebug", "image": "base", "node-name": "other", "take-child-perms": ["write"]}}
 {"return": {}}
 {"execute": "blockdev-backup", "arguments": {"device": "source", "sync": "full", "target": "target"}}
-{"error": {"class": "GenericError", "desc": "Cannot set permissions for backup-top filter: Conflicts with use by other as 'image', which uses 'write' on base"}}
+{"error": {"class": "GenericError", "desc": "Cannot set permissions for backup-top filter: Conflicts with use by source as 'image', which does not allow 'write' on base"}}
-- 
2.21.3




* [PATCH v2 16/36] block: add bdrv_drv_set_perm transaction action
  2020-11-27 14:44 [PATCH v2 00/36] block: update graph permissions update Vladimir Sementsov-Ogievskiy
                   ` (14 preceding siblings ...)
  2020-11-27 14:45 ` [PATCH v2 15/36] block: use topological sort for permission update Vladimir Sementsov-Ogievskiy
@ 2020-11-27 14:45 ` Vladimir Sementsov-Ogievskiy
  2020-11-27 14:45 ` [PATCH v2 17/36] block: add bdrv_list_* permission update functions Vladimir Sementsov-Ogievskiy
                   ` (20 subsequent siblings)
  36 siblings, 0 replies; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-11-27 14:45 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, armbru, jsnow, mreitz, kwolf, vsementsov, den

Refactor the calls to driver callbacks into a separate transaction
action, to be used later.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block.c | 70 ++++++++++++++++++++++++++++++++++++++++++++-------------
 1 file changed, 54 insertions(+), 16 deletions(-)

diff --git a/block.c b/block.c
index 81ccf51605..4a43a33401 100644
--- a/block.c
+++ b/block.c
@@ -2116,6 +2116,54 @@ static void bdrv_child_set_perm_safe(BdrvChild *c, uint64_t perm,
     }
 }
 
+static void bdrv_drv_set_perm_commit(void *opaque)
+{
+    BlockDriverState *bs = opaque;
+    uint64_t cumulative_perms, cumulative_shared_perms;
+
+    if (bs->drv->bdrv_set_perm) {
+        bdrv_get_cumulative_perm(bs, &cumulative_perms,
+                                 &cumulative_shared_perms);
+        bs->drv->bdrv_set_perm(bs, cumulative_perms, cumulative_shared_perms);
+    }
+}
+
+static void bdrv_drv_set_perm_abort(void *opaque)
+{
+    BlockDriverState *bs = opaque;
+
+    if (bs->drv->bdrv_abort_perm_update) {
+        bs->drv->bdrv_abort_perm_update(bs);
+    }
+}
+
+TransactionActionDrv bdrv_drv_set_perm_drv = {
+    .abort = bdrv_drv_set_perm_abort,
+    .commit = bdrv_drv_set_perm_commit,
+};
+
+static int bdrv_drv_set_perm(BlockDriverState *bs, uint64_t perm,
+                             uint64_t shared_perm, GSList **tran,
+                             Error **errp)
+{
+    if (!bs->drv) {
+        return 0;
+    }
+
+    if (bs->drv->bdrv_check_perm) {
+        int ret = bs->drv->bdrv_check_perm(bs, perm, shared_perm, errp);
+        if (ret < 0) {
+            return ret;
+        }
+    }
+
+    if (tran) {
+        tran_prepend(tran, &bdrv_drv_set_perm_drv, bs);
+    }
+
+    return 0;
+}
+
 /*
  * Check whether permissions on this node can be changed in a way that
  * @cumulative_perms and @cumulative_shared_perms are the new cumulative
@@ -2176,12 +2224,10 @@ static int bdrv_node_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
         return 0;
     }
 
-    if (drv->bdrv_check_perm) {
-        ret = drv->bdrv_check_perm(bs, cumulative_perms,
-                                   cumulative_shared_perms, errp);
-        if (ret < 0) {
-            return ret;
-        }
+    ret = bdrv_drv_set_perm(bs, cumulative_perms, cumulative_shared_perms, NULL,
+                            errp);
+    if (ret < 0) {
+        return ret;
     }
 
     /* Drivers that never have children can omit .bdrv_child_perm() */
@@ -2249,9 +2295,7 @@ static void bdrv_node_abort_perm_update(BlockDriverState *bs)
         return;
     }
 
-    if (drv->bdrv_abort_perm_update) {
-        drv->bdrv_abort_perm_update(bs);
-    }
+    bdrv_drv_set_perm_abort(bs);
 
     QLIST_FOREACH(c, &bs->children, next) {
         bdrv_child_set_perm_abort(c);
@@ -2269,7 +2313,6 @@ static void bdrv_abort_perm_update(BlockDriverState *bs)
 
 static void bdrv_node_set_perm(BlockDriverState *bs)
 {
-    uint64_t cumulative_perms, cumulative_shared_perms;
     BlockDriver *drv = bs->drv;
     BdrvChild *c;
 
@@ -2277,12 +2320,7 @@ static void bdrv_node_set_perm(BlockDriverState *bs)
         return;
     }
 
-    bdrv_get_cumulative_perm(bs, &cumulative_perms, &cumulative_shared_perms);
-
-    /* Update this node */
-    if (drv->bdrv_set_perm) {
-        drv->bdrv_set_perm(bs, cumulative_perms, cumulative_shared_perms);
-    }
+    bdrv_drv_set_perm_commit(bs);
 
     /* Drivers that never have children can omit .bdrv_child_perm() */
     if (!drv->bdrv_child_perm) {
-- 
2.21.3




* [PATCH v2 17/36] block: add bdrv_list_* permission update functions
  2020-11-27 14:44 [PATCH v2 00/36] block: update graph permissions update Vladimir Sementsov-Ogievskiy
                   ` (15 preceding siblings ...)
  2020-11-27 14:45 ` [PATCH v2 16/36] block: add bdrv_drv_set_perm transaction action Vladimir Sementsov-Ogievskiy
@ 2020-11-27 14:45 ` Vladimir Sementsov-Ogievskiy
  2020-11-27 14:45 ` [PATCH v2 18/36] block: add bdrv_replace_child_safe() transaction action Vladimir Sementsov-Ogievskiy
                   ` (19 subsequent siblings)
  36 siblings, 0 replies; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-11-27 14:45 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, armbru, jsnow, mreitz, kwolf, vsementsov, den

Add a new interface that allows using an existing node list. It will be
used to fix bdrv_replace_node() in a later commit.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block.c | 106 +++++++++++++++++++++++++++++++++++++-------------------
 1 file changed, 71 insertions(+), 35 deletions(-)

diff --git a/block.c b/block.c
index 4a43a33401..6996aee1cf 100644
--- a/block.c
+++ b/block.c
@@ -2176,7 +2176,8 @@ static int bdrv_drv_set_perm(BlockDriverState *bs, uint64_t perm,
 static int bdrv_node_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
                                 uint64_t cumulative_perms,
                                 uint64_t cumulative_shared_perms,
-                                GSList *ignore_children, Error **errp)
+                                GSList *ignore_children,
+                                GSList **tran, Error **errp)
 {
     BlockDriver *drv = bs->drv;
     BdrvChild *c;
@@ -2224,7 +2225,7 @@ static int bdrv_node_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
         return 0;
     }
 
-    ret = bdrv_drv_set_perm(bs, cumulative_perms, cumulative_shared_perms, NULL,
+    ret = bdrv_drv_set_perm(bs, cumulative_perms, cumulative_shared_perms, tran,
                             errp);
     if (ret < 0) {
         return ret;
@@ -2243,36 +2244,53 @@ static int bdrv_node_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
         bdrv_child_perm(bs, c->bs, c, c->role, q,
                         cumulative_perms, cumulative_shared_perms,
                         &cur_perm, &cur_shared);
-        bdrv_child_set_perm_safe(c, cur_perm, cur_shared, NULL);
+        bdrv_child_set_perm_safe(c, cur_perm, cur_shared, tran);
     }
 
     return 0;
 }
 
-static int bdrv_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
-                           uint64_t cumulative_perms,
-                           uint64_t cumulative_shared_perms,
-                           GSList *ignore_children, Error **errp)
+/*
+ * If use_cumulative_perms is true, use cumulative_perms and
+ * cumulative_shared_perms for first element of the list. Otherwise just refresh
+ * all permissions.
+ */
+static int bdrv_check_perm_common(GSList *list, BlockReopenQueue *q,
+                                  bool use_cumulative_perms,
+                                  uint64_t cumulative_perms,
+                                  uint64_t cumulative_shared_perms,
+                                  GSList *ignore_children,
+                                  GSList **tran, Error **errp)
 {
     int ret;
-    BlockDriverState *root = bs;
-    g_autoptr(GSList) list = bdrv_topological_dfs(NULL, NULL, root);
+    BlockDriverState *bs;
 
-    for ( ; list; list = list->next) {
+    if (use_cumulative_perms) {
         bs = list->data;
 
-        if (bs != root) {
-            if (!bdrv_check_parents_compliance(bs, ignore_children, errp)) {
-                return -EINVAL;
-            }
+        ret = bdrv_node_check_perm(bs, q, cumulative_perms,
+                                   cumulative_shared_perms,
+                                   ignore_children, tran, errp);
+        if (ret < 0) {
+            return ret;
+        }
 
-            bdrv_get_cumulative_perm(bs, &cumulative_perms,
-                                     &cumulative_shared_perms);
+        list = list->next;
+    }
+
+    for ( ; list; list = list->next) {
+        bs = list->data;
+
+        if (!bdrv_check_parents_compliance(bs, ignore_children, errp)) {
+            return -EINVAL;
         }
 
+        bdrv_get_cumulative_perm(bs, &cumulative_perms,
+                                 &cumulative_shared_perms);
+
         ret = bdrv_node_check_perm(bs, q, cumulative_perms,
                                    cumulative_shared_perms,
-                                   ignore_children, errp);
+                                   ignore_children, tran, errp);
         if (ret < 0) {
             return ret;
         }
@@ -2281,6 +2299,23 @@ static int bdrv_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
     return 0;
 }
 
+static int bdrv_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
+                           uint64_t cumulative_perms,
+                           uint64_t cumulative_shared_perms,
+                           GSList *ignore_children, Error **errp)
+{
+    g_autoptr(GSList) list = bdrv_topological_dfs(NULL, NULL, bs);
+    return bdrv_check_perm_common(list, q, true, cumulative_perms,
+                                  cumulative_shared_perms, ignore_children,
+                                  NULL, errp);
+}
+
+static int bdrv_list_refresh_perms(GSList *list, BlockReopenQueue *q,
+                                   GSList **tran, Error **errp)
+{
+    return bdrv_check_perm_common(list, q, false, 0, 0, NULL, tran, errp);
+}
+
 /*
  * Notifies drivers that after a previous bdrv_check_perm() call, the
  * permission update is not performed and any preparations made for it (e.g.
@@ -2302,15 +2337,19 @@ static void bdrv_node_abort_perm_update(BlockDriverState *bs)
     }
 }
 
-static void bdrv_abort_perm_update(BlockDriverState *bs)
+static void bdrv_list_abort_perm_update(GSList *list)
 {
-    g_autoptr(GSList) list = bdrv_topological_dfs(NULL, NULL, bs);
-
     for ( ; list; list = list->next) {
         bdrv_node_abort_perm_update((BlockDriverState *)list->data);
     }
 }
 
+static void bdrv_abort_perm_update(BlockDriverState *bs)
+{
+    g_autoptr(GSList) list = bdrv_topological_dfs(NULL, NULL, bs);
+    return bdrv_list_abort_perm_update(list);
+}
+
 static void bdrv_node_set_perm(BlockDriverState *bs)
 {
     BlockDriver *drv = bs->drv;
@@ -2334,15 +2373,19 @@ static void bdrv_node_set_perm(BlockDriverState *bs)
     }
 }
 
-static void bdrv_set_perm(BlockDriverState *bs)
+static void bdrv_list_set_perm(GSList *list)
 {
-    g_autoptr(GSList) list = bdrv_topological_dfs(NULL, NULL, bs);
-
     for ( ; list; list = list->next) {
         bdrv_node_set_perm((BlockDriverState *)list->data);
     }
 }
 
+static void bdrv_set_perm(BlockDriverState *bs)
+{
+    g_autoptr(GSList) list = bdrv_topological_dfs(NULL, NULL, bs);
+    return bdrv_list_set_perm(list);
+}
+
 void bdrv_get_cumulative_perm(BlockDriverState *bs, uint64_t *perm,
                               uint64_t *shared_perm)
 {
@@ -2450,20 +2493,13 @@ static int bdrv_check_update_perm(BlockDriverState *bs, BlockReopenQueue *q,
 static int bdrv_refresh_perms(BlockDriverState *bs, Error **errp)
 {
     int ret;
-    uint64_t perm, shared_perm;
+    GSList *tran = NULL;
+    g_autoptr(GSList) list = bdrv_topological_dfs(NULL, NULL, bs);
 
-    if (!bdrv_check_parents_compliance(bs, NULL, errp)) {
-        return -EPERM;
-    }
-    bdrv_get_cumulative_perm(bs, &perm, &shared_perm);
-    ret = bdrv_check_perm(bs, NULL, perm, shared_perm, NULL, errp);
-    if (ret < 0) {
-        bdrv_abort_perm_update(bs);
-        return ret;
-    }
-    bdrv_set_perm(bs);
+    ret = bdrv_list_refresh_perms(list, NULL, &tran, errp);
+    tran_finalize(tran, ret);
 
-    return 0;
+    return ret;
 }
 
 int bdrv_child_try_set_perm(BdrvChild *c, uint64_t perm, uint64_t shared,
-- 
2.21.3




* [PATCH v2 18/36] block: add bdrv_replace_child_safe() transaction action
  2020-11-27 14:44 [PATCH v2 00/36] block: update graph permissions update Vladimir Sementsov-Ogievskiy
                   ` (16 preceding siblings ...)
  2020-11-27 14:45 ` [PATCH v2 17/36] block: add bdrv_list_* permission update functions Vladimir Sementsov-Ogievskiy
@ 2020-11-27 14:45 ` Vladimir Sementsov-Ogievskiy
  2020-11-27 14:45 ` [PATCH v2 19/36] block: fix bdrv_replace_node_common Vladimir Sementsov-Ogievskiy
                   ` (18 subsequent siblings)
  36 siblings, 0 replies; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-11-27 14:45 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, armbru, jsnow, mreitz, kwolf, vsementsov, den

To be used in the following commit.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block.c | 53 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 53 insertions(+)

diff --git a/block.c b/block.c
index 6996aee1cf..f24bd60c2f 100644
--- a/block.c
+++ b/block.c
@@ -84,6 +84,8 @@ static BlockDriverState *bdrv_open_inherit(const char *filename,
 
 static void bdrv_parent_set_aio_context_ignore(BdrvChild *c, AioContext *ctx,
                                                GSList **ignore);
+static void bdrv_replace_child_noperm(BdrvChild *child,
+                                      BlockDriverState *new_bs);
 
 static int bdrv_reopen_prepare(BDRVReopenState *reopen_state, BlockReopenQueue
                                *queue, Error **errp);
@@ -2164,6 +2166,57 @@ static int bdrv_drv_set_perm(BlockDriverState *bs, uint64_t perm,
     return 0;
 }
 
+typedef struct BdrvReplaceChildState {
+    BdrvChild *child;
+    BlockDriverState *old_bs;
+} BdrvReplaceChildState;
+
+static void bdrv_replace_child_commit(void *opaque)
+{
+    BdrvReplaceChildState *s = opaque;
+
+    bdrv_unref(s->old_bs);
+}
+
+static void bdrv_replace_child_abort(void *opaque)
+{
+    BdrvReplaceChildState *s = opaque;
+    BlockDriverState *new_bs = s->child->bs;
+
+    /* old_bs reference is transparently moved from @s to @s->child */
+    bdrv_replace_child_noperm(s->child, s->old_bs);
+    bdrv_unref(new_bs);
+}
+
+static TransactionActionDrv bdrv_replace_child_drv = {
+    .commit = bdrv_replace_child_commit,
+    .abort = bdrv_replace_child_abort,
+    .clean = g_free,
+};
+
+/*
+ * bdrv_replace_child_safe
+ *
+ * Note: real unref of old_bs is done only on commit.
+ */
+__attribute__((unused))
+static void bdrv_replace_child_safe(BdrvChild *child, BlockDriverState *new_bs,
+                                    GSList **tran)
+{
+    BdrvReplaceChildState *s = g_new(BdrvReplaceChildState, 1);
+    *s = (BdrvReplaceChildState) {
+        .child = child,
+        .old_bs = child->bs,
+    };
+    tran_prepend(tran, &bdrv_replace_child_drv, s);
+
+    if (new_bs) {
+        bdrv_ref(new_bs);
+    }
+    bdrv_replace_child_noperm(child, new_bs);
+    /* old_bs reference is transparently moved from @child to @s */
+}
+
 /*
  * Check whether permissions on this node can be changed in a way that
  * @cumulative_perms and @cumulative_shared_perms are the new cumulative
-- 
2.21.3




* [PATCH v2 19/36] block: fix bdrv_replace_node_common
  2020-11-27 14:44 [PATCH v2 00/36] block: update graph permissions update Vladimir Sementsov-Ogievskiy
                   ` (17 preceding siblings ...)
  2020-11-27 14:45 ` [PATCH v2 18/36] block: add bdrv_replace_child_safe() transaction action Vladimir Sementsov-Ogievskiy
@ 2020-11-27 14:45 ` Vladimir Sementsov-Ogievskiy
  2021-02-03 18:23   ` Kevin Wolf
  2020-11-27 14:45 ` [PATCH v2 20/36] block: add bdrv_attach_child_common() transaction action Vladimir Sementsov-Ogievskiy
                   ` (17 subsequent siblings)
  36 siblings, 1 reply; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-11-27 14:45 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, armbru, jsnow, mreitz, kwolf, vsementsov, den

The ignore_children mechanism doesn't help to track all propagated
permissions of the children we want to ignore. The simplest way to
update permissions correctly is to update the graph first and then
update the permissions. In this case we just refresh permissions for
the whole subgraph (in topological-sort order) and everything is
calculated correctly and automatically, without any ignore_children.

So, refactor bdrv_replace_node_common to first do graph update and then
refresh the permissions.

The test_parallel_exclusive_write() test now passes, so move it out of
the debugging "if".
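
For illustration only (not part of this patch): a minimal sketch of the
ordering idea, assuming a toy graph with fixed-size child arrays.
ToyNode, ToyOrder and the toy_* helpers are hypothetical names, and the
"found" flag merely stands in for the hash table shared between the two
bdrv_topological_dfs() calls. A post-order DFS that prepends each node
after its children yields a list in which every parent precedes its
children, which is the order a permission refresh wants.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_CHILDREN 4
#define TOY_MAX_NODES 16

/* Hypothetical toy node, not a BlockDriverState. */
typedef struct ToyNode {
    struct ToyNode *children[TOY_MAX_CHILDREN];
    size_t nb_children;
    bool found;     /* stands in for membership in the "found" table */
} ToyNode;

/* The list is nodes[first .. TOY_MAX_NODES - 1]; prepending lowers first. */
typedef struct ToyOrder {
    ToyNode *nodes[TOY_MAX_NODES];
    size_t first;
} ToyOrder;

static void toy_order_init(ToyOrder *order)
{
    order->first = TOY_MAX_NODES;
}

static void toy_add_child(ToyNode *parent, ToyNode *child)
{
    assert(parent->nb_children < TOY_MAX_CHILDREN);
    parent->children[parent->nb_children++] = child;
}

/*
 * Post-order DFS: visit all children, then prepend the node itself.
 * Nodes that were already found (possibly from another root) are skipped.
 */
static void toy_topological_dfs(ToyOrder *order, ToyNode *node)
{
    if (node->found) {
        return;
    }
    node->found = true;

    for (size_t i = 0; i < node->nb_children; i++) {
        toy_topological_dfs(order, node->children[i]);
    }

    assert(order->first > 0);
    order->nodes[--order->first] = node;
}

/* Position of @node in the list, or -1 if it was never visited. */
static int toy_order_index(const ToyOrder *order, const ToyNode *node)
{
    for (size_t i = order->first; i < TOY_MAX_NODES; i++) {
        if (order->nodes[i] == node) {
            return (int)(i - order->first);
        }
    }
    return -1;
}
```

Calling toy_topological_dfs() for a second root, much like the patch
does for @to and then @from, only adds nodes that were not reached yet,
so a shared child appears once and still comes after all its parents.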

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block.c                     | 42 ++++++++++++++-----------------------
 tests/test-bdrv-graph-mod.c | 18 +++-------------
 2 files changed, 19 insertions(+), 41 deletions(-)

diff --git a/block.c b/block.c
index f24bd60c2f..f0fcd75555 100644
--- a/block.c
+++ b/block.c
@@ -2199,7 +2199,6 @@ static TransactionActionDrv bdrv_replace_child_drv = {
  *
  * Note: real unref of old_bs is done only on commit.
  */
-__attribute__((unused))
 static void bdrv_replace_child_safe(BdrvChild *child, BlockDriverState *new_bs,
                                     GSList **tran)
 {
@@ -4794,8 +4793,9 @@ static int bdrv_replace_node_common(BlockDriverState *from,
 {
     int ret = -EPERM;
     BdrvChild *c, *next;
-    GSList *list = NULL, *p;
-    uint64_t perm = 0, shared = BLK_PERM_ALL;
+    GSList *tran = NULL;
+    g_autoptr(GHashTable) found = NULL;
+    g_autoptr(GSList) refresh_list = NULL;
 
     /* Make sure that @from doesn't go away until we have successfully attached
      * all of its parents to @to. */
@@ -4805,7 +4805,12 @@ static int bdrv_replace_node_common(BlockDriverState *from,
     assert(bdrv_get_aio_context(from) == bdrv_get_aio_context(to));
     bdrv_drained_begin(from);
 
-    /* Put all parents into @list and calculate their cumulative permissions */
+    /*
+     * Do the replacement without permission update.
+     * Replacement may influence the permissions, we should calculate new
+     * permissions based on new graph. If we fail, we'll roll-back the
+     * replacement.
+     */
     QLIST_FOREACH_SAFE(c, &from->parents, next_parent, next) {
         assert(c->bs == from);
         if (!should_update_child(c, to)) {
@@ -4821,34 +4826,19 @@ static int bdrv_replace_node_common(BlockDriverState *from,
                        c->name, from->node_name);
             goto out;
         }
-        list = g_slist_prepend(list, c);
-        perm |= c->perm;
-        shared &= c->shared_perm;
-    }
-
-    /* Check whether the required permissions can be granted on @to, ignoring
-     * all BdrvChild in @list so that they can't block themselves. */
-    ret = bdrv_check_update_perm(to, NULL, perm, shared, list, errp);
-    if (ret < 0) {
-        bdrv_abort_perm_update(to);
-        goto out;
+        bdrv_replace_child_safe(c, to, &tran);
     }
 
-    /* Now actually perform the change. We performed the permission check for
-     * all elements of @list at once, so set the permissions all at once at the
-     * very end. */
-    for (p = list; p != NULL; p = p->next) {
-        c = p->data;
+    found = g_hash_table_new(NULL, NULL);
 
-        bdrv_ref(to);
-        bdrv_replace_child_noperm(c, to);
-        bdrv_unref(from);
-    }
+    refresh_list = bdrv_topological_dfs(refresh_list, found, to);
+    refresh_list = bdrv_topological_dfs(refresh_list, found, from);
 
-    bdrv_set_perm(to);
+    ret = bdrv_list_refresh_perms(refresh_list, NULL, &tran, errp);
 
 out:
-    g_slist_free(list);
+    tran_finalize(tran, ret);
+
     bdrv_drained_end(from);
     bdrv_unref(from);
 
diff --git a/tests/test-bdrv-graph-mod.c b/tests/test-bdrv-graph-mod.c
index 0d62e05ddb..93a5941a9b 100644
--- a/tests/test-bdrv-graph-mod.c
+++ b/tests/test-bdrv-graph-mod.c
@@ -294,20 +294,11 @@ static void test_parallel_perm_update(void)
     bdrv_child_refresh_perms(top, top->children.lh_first, &error_abort);
 
     assert(c_fl1->perm & BLK_PERM_WRITE);
+    bdrv_unref(top);
 }
 
 int main(int argc, char *argv[])
 {
-    int i;
-    bool debug = false;
-
-    for (i = 1; i < argc; i++) {
-        if (!strcmp(argv[i], "-d")) {
-            debug = true;
-            break;
-        }
-    }
-
     bdrv_init();
     qemu_init_main_loop(&error_abort);
 
@@ -318,11 +309,8 @@ int main(int argc, char *argv[])
                     test_should_update_child);
     g_test_add_func("/bdrv-graph-mod/parallel-perm-update",
                     test_parallel_perm_update);
-
-    if (debug) {
-        g_test_add_func("/bdrv-graph-mod/parallel-exclusive-write",
-                        test_parallel_exclusive_write);
-    }
+    g_test_add_func("/bdrv-graph-mod/parallel-exclusive-write",
+                    test_parallel_exclusive_write);
 
     return g_test_run();
 }
-- 
2.21.3




* [PATCH v2 20/36] block: add bdrv_attach_child_common() transaction action
  2020-11-27 14:44 [PATCH v2 00/36] block: update graph permissions update Vladimir Sementsov-Ogievskiy
                   ` (18 preceding siblings ...)
  2020-11-27 14:45 ` [PATCH v2 19/36] block: fix bdrv_replace_node_common Vladimir Sementsov-Ogievskiy
@ 2020-11-27 14:45 ` Vladimir Sementsov-Ogievskiy
  2021-02-03 21:01   ` Kevin Wolf
  2020-11-27 14:45 ` [PATCH v2 21/36] block: add bdrv_attach_child_noperm() " Vladimir Sementsov-Ogievskiy
                   ` (16 subsequent siblings)
  36 siblings, 1 reply; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-11-27 14:45 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, armbru, jsnow, mreitz, kwolf, vsementsov, den

Split out the no-perm part of bdrv_root_attach_child() into a separate
transaction action. bdrv_root_attach_child() now moves to the new
permission update paradigm: first update graph relations, then update
permissions.
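
For illustration only (not QEMU code): a rough sketch of the "change
first, validate afterwards, roll back on failure" pattern. All Toy*
types and toy_* functions are hypothetical and only loosely modeled on
the transaction helpers used in this series; the real tran_prepend()
and tran_finalize() may differ in detail. The "graph change" here is
just an integer assignment, and the "permission check" is a sum limit.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical toy driver: callbacks for one recorded change. */
typedef struct ToyActionDrv {
    void (*abort)(void *opaque);   /* undo the change */
    void (*commit)(void *opaque);  /* make the change final */
    void (*clean)(void *opaque);   /* free per-action state, always called */
} ToyActionDrv;

typedef struct ToyAction {
    const ToyActionDrv *drv;
    void *opaque;
    struct ToyAction *next;
} ToyAction;

/* Record an action at the head of the list, so it is undone first. */
static void toy_tran_prepend(ToyAction **tran, const ToyActionDrv *drv,
                             void *opaque)
{
    ToyAction *act = malloc(sizeof(*act));

    assert(act);
    act->drv = drv;
    act->opaque = opaque;
    act->next = *tran;
    *tran = act;
}

/* ret < 0: abort every action, newest first; otherwise commit them. */
static void toy_tran_finalize(ToyAction *tran, int ret)
{
    while (tran) {
        ToyAction *next = tran->next;

        if (ret < 0 && tran->drv->abort) {
            tran->drv->abort(tran->opaque);
        } else if (ret >= 0 && tran->drv->commit) {
            tran->drv->commit(tran->opaque);
        }
        if (tran->drv->clean) {
            tran->drv->clean(tran->opaque);
        }
        free(tran);
        tran = next;
    }
}

typedef struct ToySetState {
    int *slot;
    int old_value;
} ToySetState;

static void toy_set_abort(void *opaque)
{
    ToySetState *s = opaque;

    *s->slot = s->old_value;
}

static const ToyActionDrv toy_set_drv = {
    .abort = toy_set_abort,
    .clean = free,
};

/* Change *slot right away, but remember how to undo it. */
static void toy_set_safe(int *slot, int value, ToyAction **tran)
{
    ToySetState *s = malloc(sizeof(*s));

    assert(s);
    s->slot = slot;
    s->old_value = *slot;
    toy_tran_prepend(tran, &toy_set_drv, s);
    *slot = value;
}

/* Modify first, validate afterwards, then commit or roll back. */
static int toy_update(int *a, int *b, int new_a, int new_b, int limit)
{
    ToyAction *tran = NULL;
    int ret;

    toy_set_safe(a, new_a, &tran);
    toy_set_safe(b, new_b, &tran);
    ret = (*a + *b <= limit) ? 0 : -1;
    toy_tran_finalize(tran, ret);

    return ret;
}
```

The point of the sketch is that the validation step sees the final
state of both values, and a failed validation leaves no trace because
every recorded change is undone in reverse order.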

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block.c | 162 ++++++++++++++++++++++++++++++++++++++++----------------
 1 file changed, 117 insertions(+), 45 deletions(-)

diff --git a/block.c b/block.c
index f0fcd75555..a7ccbb4fb1 100644
--- a/block.c
+++ b/block.c
@@ -86,6 +86,13 @@ static void bdrv_parent_set_aio_context_ignore(BdrvChild *c, AioContext *ctx,
                                                GSList **ignore);
 static void bdrv_replace_child_noperm(BdrvChild *child,
                                       BlockDriverState *new_bs);
+static int bdrv_attach_child_common(BlockDriverState *child_bs,
+                                    const char *child_name,
+                                    const BdrvChildClass *child_class,
+                                    BdrvChildRole child_role,
+                                    uint64_t perm, uint64_t shared_perm,
+                                    void *opaque, BdrvChild **child,
+                                    GSList **tran, Error **errp);
 
 static int bdrv_reopen_prepare(BDRVReopenState *reopen_state, BlockReopenQueue
                                *queue, Error **errp);
@@ -2898,55 +2905,22 @@ BdrvChild *bdrv_root_attach_child(BlockDriverState *child_bs,
                                   uint64_t perm, uint64_t shared_perm,
                                   void *opaque, Error **errp)
 {
-    BdrvChild *child;
-    Error *local_err = NULL;
     int ret;
-    AioContext *ctx;
+    BdrvChild *child = NULL;
+    GSList *tran = NULL;
 
-    ret = bdrv_check_update_perm(child_bs, NULL, perm, shared_perm, NULL, errp);
+    ret = bdrv_attach_child_common(child_bs, child_name, child_class,
+                                   child_role, perm, shared_perm, opaque,
+                                   &child, &tran, errp);
     if (ret < 0) {
-        bdrv_abort_perm_update(child_bs);
         bdrv_unref(child_bs);
         return NULL;
     }
 
-    child = g_new(BdrvChild, 1);
-    *child = (BdrvChild) {
-        .bs             = NULL,
-        .name           = g_strdup(child_name),
-        .klass          = child_class,
-        .role           = child_role,
-        .perm           = perm,
-        .shared_perm    = shared_perm,
-        .opaque         = opaque,
-    };
-
-    ctx = bdrv_child_get_parent_aio_context(child);
-
-    /* If the AioContexts don't match, first try to move the subtree of
-     * child_bs into the AioContext of the new parent. If this doesn't work,
-     * try moving the parent into the AioContext of child_bs instead. */
-    if (bdrv_get_aio_context(child_bs) != ctx) {
-        ret = bdrv_try_set_aio_context(child_bs, ctx, &local_err);
-        if (ret < 0) {
-            if (bdrv_parent_try_set_aio_context(child, ctx, NULL) == 0) {
-                ret = 0;
-                error_free(local_err);
-                local_err = NULL;
-            }
-        }
-        if (ret < 0) {
-            error_propagate(errp, local_err);
-            g_free(child);
-            bdrv_abort_perm_update(child_bs);
-            bdrv_unref(child_bs);
-            return NULL;
-        }
-    }
-
-    /* This performs the matching bdrv_set_perm() for the above check. */
-    bdrv_replace_child(child, child_bs);
+    ret = bdrv_refresh_perms(child_bs, errp);
+    tran_finalize(tran, ret);
 
+    bdrv_unref(child_bs);
     return child;
 }
 
@@ -2988,16 +2962,114 @@ BdrvChild *bdrv_attach_child(BlockDriverState *parent_bs,
     return child;
 }
 
-static void bdrv_detach_child(BdrvChild *child)
+static void bdrv_remove_empty_child(BdrvChild *child)
 {
+    assert(!child->bs);
     QLIST_SAFE_REMOVE(child, next);
-
-    bdrv_replace_child(child, NULL);
-
     g_free(child->name);
     g_free(child);
 }
 
+typedef struct BdrvAttachChildCommonState {
+    BdrvChild **child;
+    AioContext *old_parent_ctx;
+    AioContext *old_child_ctx;
+} BdrvAttachChildCommonState;
+
+static void bdrv_attach_child_common_abort(void *opaque)
+{
+    BdrvAttachChildCommonState *s = opaque;
+    BdrvChild *child = *s->child;
+    BlockDriverState *bs = child->bs;
+
+    bdrv_replace_child_noperm(child, NULL);
+
+    if (bdrv_get_aio_context(bs) != s->old_child_ctx) {
+        bdrv_try_set_aio_context(bs, s->old_child_ctx, &error_abort);
+    }
+
+    if (bdrv_child_get_parent_aio_context(child) != s->old_parent_ctx) {
+        bdrv_parent_try_set_aio_context(child, s->old_parent_ctx,
+                                        &error_abort);
+    }
+
+    bdrv_unref(bs);
+    bdrv_remove_empty_child(child);
+    *s->child = NULL;
+}
+
+static TransactionActionDrv bdrv_attach_child_common_drv = {
+    .abort = bdrv_attach_child_common_abort,
+};
+
+/*
+ * Common part of attaching bdrv child to bs or to blk or to job
+ */
+static int bdrv_attach_child_common(BlockDriverState *child_bs,
+                                    const char *child_name,
+                                    const BdrvChildClass *child_class,
+                                    BdrvChildRole child_role,
+                                    uint64_t perm, uint64_t shared_perm,
+                                    void *opaque, BdrvChild **child,
+                                    GSList **tran, Error **errp)
+{
+    int ret;
+    BdrvChild *new_child;
+    AioContext *parent_ctx;
+    AioContext *child_ctx = bdrv_get_aio_context(child_bs);
+
+    assert(child);
+    assert(*child == NULL);
+
+    new_child = g_new(BdrvChild, 1);
+    *new_child = (BdrvChild) {
+        .bs             = NULL,
+        .name           = g_strdup(child_name),
+        .klass          = child_class,
+        .role           = child_role,
+        .perm           = perm,
+        .shared_perm    = shared_perm,
+        .opaque         = opaque,
+    };
+
+    parent_ctx = bdrv_child_get_parent_aio_context(new_child);
+    if (child_ctx != parent_ctx) {
+        ret = bdrv_try_set_aio_context(child_bs, parent_ctx, NULL);
+        if (ret < 0) {
+            /*
+             * bdrv_try_set_aio_context() doesn't need rollback after failure,
+             * so we don't care.
+             */
+            ret = bdrv_parent_try_set_aio_context(new_child, child_ctx, errp);
+        }
+        if (ret < 0) {
+            bdrv_remove_empty_child(new_child);
+            return ret;
+        }
+    }
+
+    bdrv_ref(child_bs);
+    bdrv_replace_child_noperm(new_child, child_bs);
+
+    *child = new_child;
+
+    BdrvAttachChildCommonState *s = g_new(BdrvAttachChildCommonState, 1);
+    *s = (BdrvAttachChildCommonState) {
+        .child = child,
+        .old_parent_ctx = parent_ctx,
+        .old_child_ctx = child_ctx,
+    };
+    tran_prepend(tran, &bdrv_attach_child_common_drv, s);
+
+    return 0;
+}
+
+static void bdrv_detach_child(BdrvChild *child)
+{
+    bdrv_replace_child(child, NULL);
+    bdrv_remove_empty_child(child);
+}
+
 /* Callers must ensure that child->frozen is false. */
 void bdrv_root_unref_child(BdrvChild *child)
 {
-- 
2.21.3




* [PATCH v2 21/36] block: add bdrv_attach_child_noperm() transaction action
  2020-11-27 14:44 [PATCH v2 00/36] block: update graph permissions update Vladimir Sementsov-Ogievskiy
                   ` (19 preceding siblings ...)
  2020-11-27 14:45 ` [PATCH v2 20/36] block: add bdrv_attach_child_common() transaction action Vladimir Sementsov-Ogievskiy
@ 2020-11-27 14:45 ` Vladimir Sementsov-Ogievskiy
  2020-11-27 14:45 ` [PATCH v2 22/36] block: split out bdrv_replace_node_noperm() Vladimir Sementsov-Ogievskiy
                   ` (15 subsequent siblings)
  36 siblings, 0 replies; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-11-27 14:45 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, armbru, jsnow, mreitz, kwolf, vsementsov, den

Split out the no-perm part of bdrv_attach_child() as a separate
transaction action. It will be used in later commits.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block.c | 71 ++++++++++++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 58 insertions(+), 13 deletions(-)

diff --git a/block.c b/block.c
index a7ccbb4fb1..162a247579 100644
--- a/block.c
+++ b/block.c
@@ -86,6 +86,14 @@ static void bdrv_parent_set_aio_context_ignore(BdrvChild *c, AioContext *ctx,
                                                GSList **ignore);
 static void bdrv_replace_child_noperm(BdrvChild *child,
                                       BlockDriverState *new_bs);
+static int bdrv_attach_child_noperm(BlockDriverState *parent_bs,
+                                    BlockDriverState *child_bs,
+                                    const char *child_name,
+                                    const BdrvChildClass *child_class,
+                                    BdrvChildRole child_role,
+                                    BdrvChild **child,
+                                    GSList **tran,
+                                    Error **errp);
 static int bdrv_attach_child_common(BlockDriverState *child_bs,
                                     const char *child_name,
                                     const BdrvChildClass *child_class,
@@ -2942,23 +2950,26 @@ BdrvChild *bdrv_attach_child(BlockDriverState *parent_bs,
                              BdrvChildRole child_role,
                              Error **errp)
 {
-    BdrvChild *child;
-    uint64_t perm, shared_perm;
-
-    bdrv_get_cumulative_perm(parent_bs, &perm, &shared_perm);
+    int ret;
+    BdrvChild *child = NULL;
+    GSList *tran = NULL;
 
-    assert(parent_bs->drv);
-    bdrv_child_perm(parent_bs, child_bs, NULL, child_role, NULL,
-                    perm, shared_perm, &perm, &shared_perm);
+    ret = bdrv_attach_child_noperm(parent_bs, child_bs, child_name, child_class,
+                                   child_role, &child, &tran, errp);
+    if (ret < 0) {
+        goto out;
+    }
 
-    child = bdrv_root_attach_child(child_bs, child_name, child_class,
-                                   child_role, perm, shared_perm, parent_bs,
-                                   errp);
-    if (child == NULL) {
-        return NULL;
+    ret = bdrv_refresh_perms(parent_bs, errp);
+    if (ret < 0) {
+        goto out;
     }
 
-    QLIST_INSERT_HEAD(&parent_bs->children, child, next);
+out:
+    tran_finalize(tran, ret);
+
+    bdrv_unref(child_bs);
+
     return child;
 }
 
@@ -3064,6 +3075,40 @@ static int bdrv_attach_child_common(BlockDriverState *child_bs,
     return 0;
 }
 
+static int bdrv_attach_child_noperm(BlockDriverState *parent_bs,
+                                    BlockDriverState *child_bs,
+                                    const char *child_name,
+                                    const BdrvChildClass *child_class,
+                                    BdrvChildRole child_role,
+                                    BdrvChild **child,
+                                    GSList **tran,
+                                    Error **errp)
+{
+    int ret;
+    uint64_t perm, shared_perm;
+
+    assert(parent_bs->drv);
+
+    bdrv_get_cumulative_perm(parent_bs, &perm, &shared_perm);
+    bdrv_child_perm(parent_bs, child_bs, NULL, child_role, NULL,
+                    perm, shared_perm, &perm, &shared_perm);
+
+    ret = bdrv_attach_child_common(child_bs, child_name, child_class,
+                                   child_role, perm, shared_perm, parent_bs,
+                                   child, tran, errp);
+    if (ret < 0) {
+        return ret;
+    }
+
+    QLIST_INSERT_HEAD(&parent_bs->children, *child, next);
+    /*
+     * child is removed in bdrv_attach_child_common_abort(), so there is no
+     * need to abort this change separately.
+     */
+
+    return 0;
+}
+
 static void bdrv_detach_child(BdrvChild *child)
 {
     bdrv_replace_child(child, NULL);
-- 
2.21.3




* [PATCH v2 22/36] block: split out bdrv_replace_node_noperm()
  2020-11-27 14:44 [PATCH v2 00/36] block: update graph permissions update Vladimir Sementsov-Ogievskiy
                   ` (20 preceding siblings ...)
  2020-11-27 14:45 ` [PATCH v2 21/36] block: add bdrv_attach_child_noperm() " Vladimir Sementsov-Ogievskiy
@ 2020-11-27 14:45 ` Vladimir Sementsov-Ogievskiy
  2021-02-03 21:16   ` Kevin Wolf
  2020-11-27 14:45 ` [PATCH v2 23/36] block: adapt bdrv_append() for inserting filters Vladimir Sementsov-Ogievskiy
                   ` (14 subsequent siblings)
  36 siblings, 1 reply; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-11-27 14:45 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, armbru, jsnow, mreitz, kwolf, vsementsov, den

Split part of bdrv_replace_node_common() to be used separately.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block.c | 47 ++++++++++++++++++++++++++++++-----------------
 1 file changed, 30 insertions(+), 17 deletions(-)

diff --git a/block.c b/block.c
index 162a247579..02da1a90bc 100644
--- a/block.c
+++ b/block.c
@@ -4897,6 +4897,33 @@ static bool should_update_child(BdrvChild *c, BlockDriverState *to)
     return ret;
 }
 
+static int bdrv_replace_node_noperm(BlockDriverState *from,
+                                    BlockDriverState *to,
+                                    bool auto_skip, GSList **tran, Error **errp)
+{
+    BdrvChild *c, *next;
+
+    QLIST_FOREACH_SAFE(c, &from->parents, next_parent, next) {
+        assert(c->bs == from);
+        if (!should_update_child(c, to)) {
+            if (auto_skip) {
+                continue;
+            }
+            error_setg(errp, "Should not change '%s' link to '%s'",
+                       c->name, from->node_name);
+            return -EPERM;
+        }
+        if (c->frozen) {
+            error_setg(errp, "Cannot change '%s' link to '%s'",
+                       c->name, from->node_name);
+            return -EPERM;
+        }
+        bdrv_replace_child_safe(c, to, tran);
+    }
+
+    return 0;
+}
+
 /*
  * With auto_skip=true bdrv_replace_node_common skips updating from parents
  * if it creates a parent-child relation loop or if parent is block-job.
@@ -4909,7 +4936,6 @@ static int bdrv_replace_node_common(BlockDriverState *from,
                                     bool auto_skip, Error **errp)
 {
     int ret = -EPERM;
-    BdrvChild *c, *next;
     GSList *tran = NULL;
     g_autoptr(GHashTable) found = NULL;
     g_autoptr(GSList) refresh_list = NULL;
@@ -4928,22 +4954,9 @@ static int bdrv_replace_node_common(BlockDriverState *from,
      * permissions based on new graph. If we fail, we'll roll-back the
      * replacement.
      */
-    QLIST_FOREACH_SAFE(c, &from->parents, next_parent, next) {
-        assert(c->bs == from);
-        if (!should_update_child(c, to)) {
-            if (auto_skip) {
-                continue;
-            }
-            error_setg(errp, "Should not change '%s' link to '%s'",
-                       c->name, from->node_name);
-            goto out;
-        }
-        if (c->frozen) {
-            error_setg(errp, "Cannot change '%s' link to '%s'",
-                       c->name, from->node_name);
-            goto out;
-        }
-        bdrv_replace_child_safe(c, to, &tran);
+    ret = bdrv_replace_node_noperm(from, to, auto_skip, &tran, errp);
+    if (ret < 0) {
+        goto out;
     }
 
     found = g_hash_table_new(NULL, NULL);
-- 
2.21.3




* [PATCH v2 23/36] block: adapt bdrv_append() for inserting filters
  2020-11-27 14:44 [PATCH v2 00/36] block: update graph permissions update Vladimir Sementsov-Ogievskiy
                   ` (21 preceding siblings ...)
  2020-11-27 14:45 ` [PATCH v2 22/36] block: split out bdrv_replace_node_noperm() Vladimir Sementsov-Ogievskiy
@ 2020-11-27 14:45 ` Vladimir Sementsov-Ogievskiy
  2021-02-03 21:33   ` Kevin Wolf
  2020-11-27 14:45 ` [PATCH v2 24/36] block: add bdrv_remove_backing transaction action Vladimir Sementsov-Ogievskiy
                   ` (13 subsequent siblings)
  36 siblings, 1 reply; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-11-27 14:45 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, armbru, jsnow, mreitz, kwolf, vsementsov, den

bdrv_append() is not well suited for inserting filters: it does an
extra permission update as part of bdrv_set_backing_hd(). During this
update the filter may conflict with other parents of bs_top.

Instead, let's first do all graph modifications and update permissions
afterwards.

Note: bdrv_append() still only works for backing-child based
filters. It's something to improve later.

It helps that bdrv_append() is used to append new nodes, which have no
backing child. Let's add an assertion.
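
For illustration only (not part of this patch): a toy model of the
conflict described above. ToyParentLink, the TOY_PERM_* bits and
toy_parents_compatible() are hypothetical stand-ins, assuming the usual
rule that every permission one parent uses must be shared by all other
parents of the same node. While the filter is attached but the old
parent has not been moved yet, both links point at the node and the
check can fail; once the graph is in its final shape it passes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical permission bits, not the real BLK_PERM_* values. */
#define TOY_PERM_READ  (1ULL << 0)
#define TOY_PERM_WRITE (1ULL << 1)

/* One parent link of a node: what it uses and what it lets others use. */
typedef struct ToyParentLink {
    uint64_t perm;
    uint64_t shared_perm;
} ToyParentLink;

/*
 * Return true if no parent uses a permission that some other parent
 * of the same node does not share.
 */
static bool toy_parents_compatible(const ToyParentLink *links, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            if (i != j && (links[i].perm & ~links[j].shared_perm)) {
                return false;
            }
        }
    }
    return true;
}
```

With an old parent that writes exclusively (shares only read) and a
filter that also needs write, checking both links together reports a
conflict, while checking the filter link alone does not. Doing the
check only after the replacement avoids the transient state.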

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block.c | 28 +++++++++++++++++-----------
 1 file changed, 17 insertions(+), 11 deletions(-)

diff --git a/block.c b/block.c
index 02da1a90bc..7094922509 100644
--- a/block.c
+++ b/block.c
@@ -4998,22 +4998,28 @@ int bdrv_replace_node(BlockDriverState *from, BlockDriverState *to,
 int bdrv_append(BlockDriverState *bs_new, BlockDriverState *bs_top,
                 Error **errp)
 {
-    Error *local_err = NULL;
+    int ret;
+    GSList *tran = NULL;
 
-    bdrv_set_backing_hd(bs_new, bs_top, &local_err);
-    if (local_err) {
-        error_propagate(errp, local_err);
-        return -EPERM;
+    assert(!bs_new->backing);
+
+    ret = bdrv_attach_child_noperm(bs_new, bs_top, "backing",
+                                   &child_of_bds, bdrv_backing_role(bs_new),
+                                   &bs_new->backing, &tran, errp);
+    if (ret < 0) {
+        goto out;
     }
 
-    bdrv_replace_node(bs_top, bs_new, &local_err);
-    if (local_err) {
-        error_propagate(errp, local_err);
-        bdrv_set_backing_hd(bs_new, NULL, &error_abort);
-        return -EPERM;
+    ret = bdrv_replace_node_noperm(bs_top, bs_new, true, &tran, errp);
+    if (ret < 0) {
+        goto out;
     }
 
-    return 0;
+    ret = bdrv_refresh_perms(bs_new, errp);
+out:
+    tran_finalize(tran, ret);
+
+    return ret;
 }
 
 static void bdrv_delete(BlockDriverState *bs)
-- 
2.21.3




* [PATCH v2 24/36] block: add bdrv_remove_backing transaction action
  2020-11-27 14:44 [PATCH v2 00/36] block: update graph permissions update Vladimir Sementsov-Ogievskiy
                   ` (22 preceding siblings ...)
  2020-11-27 14:45 ` [PATCH v2 23/36] block: adapt bdrv_append() for inserting filters Vladimir Sementsov-Ogievskiy
@ 2020-11-27 14:45 ` Vladimir Sementsov-Ogievskiy
  2020-11-27 14:45 ` [PATCH v2 25/36] block: introduce bdrv_drop_filter() Vladimir Sementsov-Ogievskiy
                   ` (12 subsequent siblings)
  36 siblings, 0 replies; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-11-27 14:45 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, armbru, jsnow, mreitz, kwolf, vsementsov, den

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block.c | 42 ++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 40 insertions(+), 2 deletions(-)

diff --git a/block.c b/block.c
index 7094922509..b1394b721c 100644
--- a/block.c
+++ b/block.c
@@ -2973,12 +2973,19 @@ out:
     return child;
 }
 
+static void bdrv_child_free(void *opaque)
+{
+    BdrvChild *c = opaque;
+
+    g_free(c->name);
+    g_free(c);
+}
+
 static void bdrv_remove_empty_child(BdrvChild *child)
 {
     assert(!child->bs);
     QLIST_SAFE_REMOVE(child, next);
-    g_free(child->name);
-    g_free(child);
+    bdrv_child_free(child);
 }
 
 typedef struct BdrvAttachChildCommonState {
@@ -4897,6 +4904,37 @@ static bool should_update_child(BdrvChild *c, BlockDriverState *to)
     return ret;
 }
 
+/* this doesn't restore original child bs, only the child itself */
+static void bdrv_remove_backing_abort(void *opaque)
+{
+    BdrvChild *c = opaque;
+    BlockDriverState *parent_bs = c->opaque;
+
+    QLIST_INSERT_HEAD(&parent_bs->children, c, next);
+    parent_bs->backing = c;
+}
+
+static TransactionActionDrv bdrv_remove_backing_drv = {
+    .abort = bdrv_remove_backing_abort,
+    .commit = bdrv_child_free,
+};
+
+__attribute__((unused))
+static void bdrv_remove_backing(BlockDriverState *bs, GSList **tran)
+{
+    if (!bs->backing) {
+        return;
+    }
+
+    if (bs->backing->bs) {
+        bdrv_replace_child_safe(bs->backing, NULL, tran);
+    }
+
+    tran_prepend(tran, &bdrv_remove_backing_drv, bs->backing);
+    QLIST_SAFE_REMOVE(bs->backing, next);
+    bs->backing = NULL;
+}
+
 static int bdrv_replace_node_noperm(BlockDriverState *from,
                                     BlockDriverState *to,
                                     bool auto_skip, GSList **tran, Error **errp)
-- 
2.21.3




* [PATCH v2 25/36] block: introduce bdrv_drop_filter()
  2020-11-27 14:44 [PATCH v2 00/36] block: update graph permissions update Vladimir Sementsov-Ogievskiy
                   ` (23 preceding siblings ...)
  2020-11-27 14:45 ` [PATCH v2 24/36] block: add bdrv_remove_backing transaction action Vladimir Sementsov-Ogievskiy
@ 2020-11-27 14:45 ` Vladimir Sementsov-Ogievskiy
  2021-02-04 11:31   ` Kevin Wolf
  2020-11-27 14:45 ` [PATCH v2 26/36] block/backup-top: drop .active Vladimir Sementsov-Ogievskiy
                   ` (11 subsequent siblings)
  36 siblings, 1 reply; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-11-27 14:45 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, armbru, jsnow, mreitz, kwolf, vsementsov, den

Using bdrv_replace_node() to remove a filter is not good enough: it
keeps the filter's child reference, which may conflict with the original
top node during the permission update.

Instead, let's create a new interface that does all graph
modifications first and then updates permissions.

Let's modify bdrv_replace_node_common() so that it can additionally drop
the backing-chain child link pointing to the new node. This is quite
appropriate for bdrv_drop_intermediate() and makes it possible to add
the new bdrv_drop_filter() as a simple wrapper.
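
To illustrate why the leftover child link matters, here is a minimal toy
model, not QEMU code. The Edge type, perms_ok() and drop_filter() are all
hypothetical names invented for this sketch, and the permission model is
reduced to a single "write" bit. It assumes a chain top -> filter -> base
where the filter does not share write on its child.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_EDGES 8
#define DETACHED (-1)

/* Hypothetical parent->child link carrying a single "write" permission bit. */
typedef struct Edge {
    int parent;
    int child;          /* DETACHED once the link has been removed */
    bool wants_write;   /* parent requires write access to the child */
    bool shares_write;  /* parent lets other parents write to the child */
} Edge;

/* Every pair of links to the same child must be mutually compatible. */
static bool perms_ok(const Edge *e, size_t n)
{
    for (size_t a = 0; a < n; a++) {
        for (size_t b = 0; b < n; b++) {
            if (a == b || e[a].child < 0 || e[a].child != e[b].child) {
                continue;
            }
            if (e[a].wants_write && !e[b].shares_write) {
                return false;
            }
        }
    }
    return true;
}

/*
 * Step 1: change the graph only. Step 2: check permissions on the final
 * graph. If the check fails, restore the saved graph.
 */
static bool drop_filter(Edge *e, size_t n, int filter, bool detach_subchain)
{
    Edge saved[MAX_EDGES];
    int to = DETACHED;

    assert(n <= MAX_EDGES);
    memcpy(saved, e, n * sizeof(*e));

    /* Find the node below the filter. */
    for (size_t i = 0; i < n; i++) {
        if (e[i].parent == filter) {
            to = e[i].child;
        }
    }
    if (to == DETACHED) {
        return false;
    }

    for (size_t i = 0; i < n; i++) {
        if (e[i].child == filter) {
            e[i].child = to;            /* re-point the filter's parents */
        } else if (detach_subchain && e[i].parent == filter) {
            e[i].child = DETACHED;      /* drop the filter's own child link */
        }
    }

    if (!perms_ok(e, n)) {
        memcpy(e, saved, n * sizeof(*e));   /* roll back */
        return false;
    }
    return true;
}
```

In this model, dropping the filter without detaching its child link fails
the permission check, because the filter's link still refuses to share
write on base. With detach_subchain the same operation succeeds.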

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 include/block/block.h |  1 +
 block.c               | 42 ++++++++++++++++++++++++++++++++++++++----
 2 files changed, 39 insertions(+), 4 deletions(-)

diff --git a/include/block/block.h b/include/block/block.h
index 8f6100dad7..0f21ef313f 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -348,6 +348,7 @@ int bdrv_append(BlockDriverState *bs_new, BlockDriverState *bs_top,
                 Error **errp);
 int bdrv_replace_node(BlockDriverState *from, BlockDriverState *to,
                       Error **errp);
+int bdrv_drop_filter(BlockDriverState *bs, Error **errp);
 
 int bdrv_parse_aio(const char *mode, int *flags);
 int bdrv_parse_cache_mode(const char *mode, int *flags, bool *writethrough);
diff --git a/block.c b/block.c
index b1394b721c..e835a78f06 100644
--- a/block.c
+++ b/block.c
@@ -4919,7 +4919,6 @@ static TransactionActionDrv bdrv_remove_backing_drv = {
     .commit = bdrv_child_free,
 };
 
-__attribute__((unused))
 static void bdrv_remove_backing(BlockDriverState *bs, GSList **tran)
 {
     if (!bs->backing) {
@@ -4968,15 +4967,30 @@ static int bdrv_replace_node_noperm(BlockDriverState *from,
  *
  * With auto_skip=false the error is returned if from has a parent which should
  * not be updated.
+ *
+ * With detach_subchain, @to must be in the backing chain of @from. In this
+ * case the backing link of the cow-parent of @to is removed.
  */
 static int bdrv_replace_node_common(BlockDriverState *from,
                                     BlockDriverState *to,
-                                    bool auto_skip, Error **errp)
+                                    bool auto_skip, bool detach_subchain,
+                                    Error **errp)
 {
     int ret = -EPERM;
     GSList *tran = NULL;
     g_autoptr(GHashTable) found = NULL;
     g_autoptr(GSList) refresh_list = NULL;
+    BlockDriverState *to_cow_parent;
+
+    if (detach_subchain) {
+        assert(bdrv_chain_contains(from, to));
+        for (to_cow_parent = from;
+             bdrv_filter_or_cow_bs(to_cow_parent) != to;
+             to_cow_parent = bdrv_filter_or_cow_bs(to_cow_parent))
+        {
+            ;
+        }
+    }
 
     /* Make sure that @from doesn't go away until we have successfully attached
      * all of its parents to @to. */
@@ -4997,6 +5011,10 @@ static int bdrv_replace_node_common(BlockDriverState *from,
         goto out;
     }
 
+    if (detach_subchain) {
+        bdrv_remove_backing(to_cow_parent, &tran);
+    }
+
     found = g_hash_table_new(NULL, NULL);
 
     refresh_list = bdrv_topological_dfs(refresh_list, found, to);
@@ -5016,7 +5034,13 @@ out:
 int bdrv_replace_node(BlockDriverState *from, BlockDriverState *to,
                       Error **errp)
 {
-    return bdrv_replace_node_common(from, to, true, errp);
+    return bdrv_replace_node_common(from, to, true, false, errp);
+}
+
+int bdrv_drop_filter(BlockDriverState *bs, Error **errp)
+{
+    return bdrv_replace_node_common(bs, bdrv_filter_or_cow_bs(bs), true, true,
+                                    errp);
 }
 
 /*
@@ -5326,7 +5350,17 @@ int bdrv_drop_intermediate(BlockDriverState *top, BlockDriverState *base,
         updated_children = g_slist_prepend(updated_children, c);
     }
 
-    bdrv_replace_node_common(top, base, false, &local_err);
+    /*
+     * It seems correct to pass detach_subchain=true here, but that triggers
+     * one more not-yet-fixed bug: due to a nested aio_poll loop we switch to
+     * another drained section, which modifies the graph (for example, removing
+     * the child that we keep in the updated_children list). So, it's a TODO.
+     *
+     * Note: the bug is triggered if we pass detach_subchain=true here and run
+     * test-bdrv-drain. The test_drop_intermediate_poll() test case will crash.
+     * That's a FIXME.
+     */
+    bdrv_replace_node_common(top, base, false, false, &local_err);
     if (local_err) {
         error_report_err(local_err);
         goto exit;
-- 
2.21.3




* [PATCH v2 26/36] block/backup-top: drop .active
  2020-11-27 14:44 [PATCH v2 00/36] block: update graph permissions update Vladimir Sementsov-Ogievskiy
                   ` (24 preceding siblings ...)
  2020-11-27 14:45 ` [PATCH v2 25/36] block: introduce bdrv_drop_filter() Vladimir Sementsov-Ogievskiy
@ 2020-11-27 14:45 ` Vladimir Sementsov-Ogievskiy
  2021-02-04 12:26   ` Kevin Wolf
  2020-11-27 14:45 ` [PATCH v2 27/36] block: drop ignore_children for permission update functions Vladimir Sementsov-Ogievskiy
                   ` (10 subsequent siblings)
  36 siblings, 1 reply; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-11-27 14:45 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, armbru, jsnow, mreitz, kwolf, vsementsov, den

We don't need this workaround anymore: bdrv_append() is already smart
enough, and we can use the new bdrv_drop_filter().

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/backup-top.c         | 38 +-------------------------------------
 tests/qemu-iotests/283.out |  2 +-
 2 files changed, 2 insertions(+), 38 deletions(-)

diff --git a/block/backup-top.c b/block/backup-top.c
index 650ed6195c..84eb73aeb7 100644
--- a/block/backup-top.c
+++ b/block/backup-top.c
@@ -37,7 +37,6 @@
 typedef struct BDRVBackupTopState {
     BlockCopyState *bcs;
     BdrvChild *target;
-    bool active;
     int64_t cluster_size;
 } BDRVBackupTopState;
 
@@ -127,21 +126,6 @@ static void backup_top_child_perm(BlockDriverState *bs, BdrvChild *c,
                                   uint64_t perm, uint64_t shared,
                                   uint64_t *nperm, uint64_t *nshared)
 {
-    BDRVBackupTopState *s = bs->opaque;
-
-    if (!s->active) {
-        /*
-         * The filter node may be in process of bdrv_append(), which firstly do
-         * bdrv_set_backing_hd() and then bdrv_replace_node(). This means that
-         * we can't unshare BLK_PERM_WRITE during bdrv_append() operation. So,
-         * let's require nothing during bdrv_append() and refresh permissions
-         * after it (see bdrv_backup_top_append()).
-         */
-        *nperm = 0;
-        *nshared = BLK_PERM_ALL;
-        return;
-    }
-
     if (!(role & BDRV_CHILD_FILTERED)) {
         /*
          * Target child
@@ -229,18 +213,6 @@ BlockDriverState *bdrv_backup_top_append(BlockDriverState *source,
     }
     appended = true;
 
-    /*
-     * bdrv_append() finished successfully, now we can require permissions
-     * we want.
-     */
-    state->active = true;
-    bdrv_child_refresh_perms(top, top->backing, &local_err);
-    if (local_err) {
-        error_prepend(&local_err,
-                      "Cannot set permissions for backup-top filter: ");
-        goto fail;
-    }
-
     state->cluster_size = cluster_size;
     state->bcs = block_copy_state_new(top->backing, state->target,
                                       cluster_size, write_flags, &local_err);
@@ -256,7 +228,6 @@ BlockDriverState *bdrv_backup_top_append(BlockDriverState *source,
 
 fail:
     if (appended) {
-        state->active = false;
         bdrv_backup_top_drop(top);
     } else {
         bdrv_unref(top);
@@ -272,16 +243,9 @@ void bdrv_backup_top_drop(BlockDriverState *bs)
 {
     BDRVBackupTopState *s = bs->opaque;
 
-    bdrv_drained_begin(bs);
+    bdrv_drop_filter(bs, &error_abort);
 
     block_copy_state_free(s->bcs);
 
-    s->active = false;
-    bdrv_child_refresh_perms(bs, bs->backing, &error_abort);
-    bdrv_replace_node(bs, bs->backing->bs, &error_abort);
-    bdrv_set_backing_hd(bs, NULL, &error_abort);
-
-    bdrv_drained_end(bs);
-
     bdrv_unref(bs);
 }
diff --git a/tests/qemu-iotests/283.out b/tests/qemu-iotests/283.out
index fbb7d0f619..a34e4e3f92 100644
--- a/tests/qemu-iotests/283.out
+++ b/tests/qemu-iotests/283.out
@@ -5,4 +5,4 @@
 {"execute": "blockdev-add", "arguments": {"driver": "blkdebug", "image": "base", "node-name": "other", "take-child-perms": ["write"]}}
 {"return": {}}
 {"execute": "blockdev-backup", "arguments": {"device": "source", "sync": "full", "target": "target"}}
-{"error": {"class": "GenericError", "desc": "Cannot set permissions for backup-top filter: Conflicts with use by source as 'image', which does not allow 'write' on base"}}
+{"error": {"class": "GenericError", "desc": "Cannot append backup-top filter: Conflicts with use by source as 'image', which does not allow 'write' on base"}}
-- 
2.21.3




* [PATCH v2 27/36] block: drop ignore_children for permission update functions
  2020-11-27 14:44 [PATCH v2 00/36] block: update graph permissions update Vladimir Sementsov-Ogievskiy
                   ` (25 preceding siblings ...)
  2020-11-27 14:45 ` [PATCH v2 26/36] block/backup-top: drop .active Vladimir Sementsov-Ogievskiy
@ 2020-11-27 14:45 ` Vladimir Sementsov-Ogievskiy
  2020-11-27 14:45 ` [PATCH v2 28/36] block: add bdrv_set_backing_noperm() transaction action Vladimir Sementsov-Ogievskiy
                   ` (9 subsequent siblings)
  36 siblings, 0 replies; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-11-27 14:45 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, armbru, jsnow, mreitz, kwolf, vsementsov, den

This argument is always NULL. Drop it.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block.c | 36 +++++++++++-------------------------
 1 file changed, 11 insertions(+), 25 deletions(-)

diff --git a/block.c b/block.c
index e835a78f06..54fb6d24bd 100644
--- a/block.c
+++ b/block.c
@@ -1934,7 +1934,6 @@ static int bdrv_fill_options(QDict **options, const char *filename,
 static int bdrv_check_update_perm(BlockDriverState *bs, BlockReopenQueue *q,
                                   uint64_t new_used_perm,
                                   uint64_t new_shared_perm,
-                                  GSList *ignore_children,
                                   Error **errp);
 
 typedef struct BlockReopenQueueEntry {
@@ -2011,9 +2010,7 @@ static bool bdrv_a_allow_b(BdrvChild *a, BdrvChild *b, Error **errp)
     return false;
 }
 
-static bool bdrv_check_parents_compliance(BlockDriverState *bs,
-                                          GSList *ignore_children,
-                                          Error **errp)
+static bool bdrv_check_parents_compliance(BlockDriverState *bs, Error **errp)
 {
     BdrvChild *a, *b;
 
@@ -2024,9 +2021,7 @@ static bool bdrv_check_parents_compliance(BlockDriverState *bs,
      */
     QLIST_FOREACH(a, &bs->parents, next_parent) {
         QLIST_FOREACH(b, &bs->parents, next_parent) {
-            if (a == b || g_slist_find(ignore_children, a) ||
-                g_slist_find(ignore_children, b))
-            {
+            if (a == b) {
                 continue;
             }
 
@@ -2243,7 +2238,6 @@ static void bdrv_replace_child_safe(BdrvChild *child, BlockDriverState *new_bs,
 static int bdrv_node_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
                                 uint64_t cumulative_perms,
                                 uint64_t cumulative_shared_perms,
-                                GSList *ignore_children,
                                 GSList **tran, Error **errp)
 {
     BlockDriver *drv = bs->drv;
@@ -2326,7 +2320,6 @@ static int bdrv_check_perm_common(GSList *list, BlockReopenQueue *q,
                                   bool use_cumulative_perms,
                                   uint64_t cumulative_perms,
                                   uint64_t cumulative_shared_perms,
-                                  GSList *ignore_children,
                                   GSList **tran, Error **errp)
 {
     int ret;
@@ -2337,7 +2330,7 @@ static int bdrv_check_perm_common(GSList *list, BlockReopenQueue *q,
 
         ret = bdrv_node_check_perm(bs, q, cumulative_perms,
                                    cumulative_shared_perms,
-                                   ignore_children, tran, errp);
+                                   tran, errp);
         if (ret < 0) {
             return ret;
         }
@@ -2348,7 +2341,7 @@ static int bdrv_check_perm_common(GSList *list, BlockReopenQueue *q,
     for ( ; list; list = list->next) {
         bs = list->data;
 
-        if (!bdrv_check_parents_compliance(bs, ignore_children, errp)) {
+        if (!bdrv_check_parents_compliance(bs, errp)) {
             return -EINVAL;
         }
 
@@ -2357,7 +2350,7 @@ static int bdrv_check_perm_common(GSList *list, BlockReopenQueue *q,
 
         ret = bdrv_node_check_perm(bs, q, cumulative_perms,
                                    cumulative_shared_perms,
-                                   ignore_children, tran, errp);
+                                   tran, errp);
         if (ret < 0) {
             return ret;
         }
@@ -2368,19 +2361,17 @@ static int bdrv_check_perm_common(GSList *list, BlockReopenQueue *q,
 
 static int bdrv_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
                            uint64_t cumulative_perms,
-                           uint64_t cumulative_shared_perms,
-                           GSList *ignore_children, Error **errp)
+                           uint64_t cumulative_shared_perms, Error **errp)
 {
     g_autoptr(GSList) list = bdrv_topological_dfs(NULL, NULL, bs);
     return bdrv_check_perm_common(list, q, true, cumulative_perms,
-                                  cumulative_shared_perms, ignore_children,
-                                  NULL, errp);
+                                  cumulative_shared_perms, NULL, errp);
 }
 
 static int bdrv_list_refresh_perms(GSList *list, BlockReopenQueue *q,
                                    GSList **tran, Error **errp)
 {
-    return bdrv_check_perm_common(list, q, false, 0, 0, NULL, tran, errp);
+    return bdrv_check_perm_common(list, q, false, 0, 0, tran, errp);
 }
 
 /*
@@ -2509,7 +2500,6 @@ char *bdrv_perm_names(uint64_t perm)
 static int bdrv_check_update_perm(BlockDriverState *bs, BlockReopenQueue *q,
                                   uint64_t new_used_perm,
                                   uint64_t new_shared_perm,
-                                  GSList *ignore_children,
                                   Error **errp)
 {
     BdrvChild *c;
@@ -2521,10 +2511,6 @@ static int bdrv_check_update_perm(BlockDriverState *bs, BlockReopenQueue *q,
     assert(new_shared_perm & BLK_PERM_WRITE_UNCHANGED);
 
     QLIST_FOREACH(c, &bs->parents, next_parent) {
-        if (g_slist_find(ignore_children, c)) {
-            continue;
-        }
-
         if ((new_used_perm & c->shared_perm) != new_used_perm) {
             char *user = bdrv_child_user_desc(c);
             char *perm_names = bdrv_perm_names(new_used_perm & ~c->shared_perm);
@@ -2554,7 +2540,7 @@ static int bdrv_check_update_perm(BlockDriverState *bs, BlockReopenQueue *q,
     }
 
     return bdrv_check_perm(bs, q, cumulative_perms, cumulative_shared_perms,
-                           ignore_children, errp);
+                           errp);
 }
 
 static int bdrv_refresh_perms(BlockDriverState *bs, Error **errp)
@@ -4149,7 +4135,7 @@ int bdrv_reopen_multiple(BlockReopenQueue *bs_queue, Error **errp)
     QTAILQ_FOREACH(bs_entry, bs_queue, entry) {
         BDRVReopenState *state = &bs_entry->state;
         ret = bdrv_check_perm(state->bs, bs_queue, state->perm,
-                              state->shared_perm, NULL, errp);
+                              state->shared_perm, errp);
         if (ret < 0) {
             goto cleanup_perm;
         }
@@ -4161,7 +4147,7 @@ int bdrv_reopen_multiple(BlockReopenQueue *bs_queue, Error **errp)
                             bs_queue, state->perm, state->shared_perm,
                             &nperm, &nshared);
             ret = bdrv_check_update_perm(state->new_backing_bs, NULL,
-                                         nperm, nshared, NULL, errp);
+                                         nperm, nshared, errp);
             if (ret < 0) {
                 goto cleanup_perm;
             }
-- 
2.21.3




* [PATCH v2 28/36] block: add bdrv_set_backing_noperm() transaction action
  2020-11-27 14:44 [PATCH v2 00/36] block: update graph permissions update Vladimir Sementsov-Ogievskiy
                   ` (26 preceding siblings ...)
  2020-11-27 14:45 ` [PATCH v2 27/36] block: drop ignore_children for permission update functions Vladimir Sementsov-Ogievskiy
@ 2020-11-27 14:45 ` Vladimir Sementsov-Ogievskiy
  2021-02-05 14:00   ` Kevin Wolf
  2021-02-05 16:26   ` Kevin Wolf
  2020-11-27 14:45 ` [PATCH v2 29/36] blockdev: qmp_x_blockdev_reopen: acquire all contexts Vladimir Sementsov-Ogievskiy
                   ` (8 subsequent siblings)
  36 siblings, 2 replies; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-11-27 14:45 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, armbru, jsnow, mreitz, kwolf, vsementsov, den

Split out the no-perm part of bdrv_set_backing_hd() as a separate
transaction action. Note that if a BdrvChild already exists, we reuse it
rather than recreating it, just to perform fewer actions.
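
For readers unfamiliar with the pattern, the following is a minimal sketch
of a transaction list with abort/commit/clean callbacks. It is a
self-contained stand-in written for this illustration, not QEMU's
implementation: ActionDrv, Action, tran_sketch_prepend(),
tran_sketch_finalize() and the set_int() example action are all
hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical per-action callbacks; any of them may be NULL. */
typedef struct ActionDrv {
    void (*abort)(void *opaque);
    void (*commit)(void *opaque);
    void (*clean)(void *opaque);
} ActionDrv;

typedef struct Action {
    const ActionDrv *drv;
    void *opaque;
    struct Action *next;
} Action;

/* Register an action at the head, so the newest action is finalized first. */
static void tran_sketch_prepend(Action **tran, const ActionDrv *drv,
                                void *opaque)
{
    Action *a = malloc(sizeof(*a));

    assert(a);
    a->drv = drv;
    a->opaque = opaque;
    a->next = *tran;
    *tran = a;
}

/* ret < 0 aborts every action, otherwise commits; clean always runs. */
static void tran_sketch_finalize(Action *tran, int ret)
{
    while (tran) {
        Action *next = tran->next;
        void (*cb)(void *) = ret < 0 ? tran->drv->abort : tran->drv->commit;

        if (cb) {
            cb(tran->opaque);
        }
        if (tran->drv->clean) {
            tran->drv->clean(tran->opaque);
        }
        free(tran);
        tran = next;
    }
}

/* Example action: assign an int now, remember the old value for abort. */
typedef struct SetIntState {
    int *target;
    int old_value;
} SetIntState;

static void set_int_abort(void *opaque)
{
    SetIntState *s = opaque;

    *s->target = s->old_value;
}

static const ActionDrv set_int_drv = {
    .abort = set_int_abort,
    .clean = free,
};

static void set_int(int *target, int value, Action **tran)
{
    SetIntState *s = malloc(sizeof(*s));

    assert(s);
    s->target = target;
    s->old_value = *target;
    *target = value;
    tran_sketch_prepend(tran, &set_int_drv, s);
}
```

Because actions are prepended, an abort undoes them newest first, so two
successive set_int() calls on the same variable restore the original value.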

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block.c | 111 +++++++++++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 89 insertions(+), 22 deletions(-)

diff --git a/block.c b/block.c
index 54fb6d24bd..617cba9547 100644
--- a/block.c
+++ b/block.c
@@ -101,6 +101,7 @@ static int bdrv_attach_child_common(BlockDriverState *child_bs,
                                     uint64_t perm, uint64_t shared_perm,
                                     void *opaque, BdrvChild **child,
                                     GSList **tran, Error **errp);
+static void bdrv_remove_backing(BlockDriverState *bs, GSList **tran);
 
 static int bdrv_reopen_prepare(BDRVReopenState *reopen_state, BlockReopenQueue
                                *queue, Error **errp);
@@ -3194,45 +3195,111 @@ static BdrvChildRole bdrv_backing_role(BlockDriverState *bs)
     }
 }
 
+typedef struct BdrvSetBackingNoPermState {
+    BlockDriverState *bs;
+    BlockDriverState *backing_bs;
+    BlockDriverState *old_inherits_from;
+    GSList *attach_tran;
+} BdrvSetBackingNoPermState;
+
+static void bdrv_set_backing_noperm_abort(void *opaque)
+{
+    BdrvSetBackingNoPermState *s = opaque;
+
+    if (s->backing_bs) {
+        s->backing_bs->inherits_from = s->old_inherits_from;
+    }
+
+    tran_abort(s->attach_tran);
+
+    bdrv_refresh_limits(s->bs, NULL);
+    if (s->old_inherits_from) {
+        bdrv_refresh_limits(s->old_inherits_from, NULL);
+    }
+}
+
+static void bdrv_set_backing_noperm_commit(void *opaque)
+{
+    BdrvSetBackingNoPermState *s = opaque;
+
+    tran_commit(s->attach_tran);
+}
+
+static TransactionActionDrv bdrv_set_backing_noperm_drv = {
+    .abort = bdrv_set_backing_noperm_abort,
+    .commit = bdrv_set_backing_noperm_commit,
+    .clean = g_free,
+};
+
 /*
  * Sets the bs->backing link of a BDS. A new reference is created; callers
  * which don't need their own reference any more must call bdrv_unref().
  */
-void bdrv_set_backing_hd(BlockDriverState *bs, BlockDriverState *backing_hd,
-                         Error **errp)
+static int bdrv_set_backing_noperm(BlockDriverState *bs,
+                                   BlockDriverState *backing_bs,
+                                   GSList **tran, Error **errp)
 {
-    bool update_inherits_from = bdrv_chain_contains(bs, backing_hd) &&
-        bdrv_inherits_from_recursive(backing_hd, bs);
+    int ret = 0;
+    bool update_inherits_from = bdrv_chain_contains(bs, backing_bs) &&
+        bdrv_inherits_from_recursive(backing_bs, bs);
+    GSList *attach_tran = NULL;
+    BdrvSetBackingNoPermState *s;
 
     if (bdrv_is_backing_chain_frozen(bs, child_bs(bs->backing), errp)) {
-        return;
+        return -EPERM;
     }
 
-    if (backing_hd) {
-        bdrv_ref(backing_hd);
+    if (bs->backing && backing_bs) {
+        bdrv_replace_child_safe(bs->backing, backing_bs, tran);
+    } else if (bs->backing && !backing_bs) {
+        bdrv_remove_backing(bs, tran);
+    } else if (backing_bs) {
+        assert(!bs->backing);
+        ret = bdrv_attach_child_noperm(bs, backing_bs, "backing",
+                                       &child_of_bds, bdrv_backing_role(bs),
+                                       &bs->backing, &attach_tran, errp);
+        if (ret < 0) {
+            tran_abort(attach_tran);
+            return ret;
+        }
     }
 
-    if (bs->backing) {
-        /* Cannot be frozen, we checked that above */
-        bdrv_unref_child(bs, bs->backing);
-        bs->backing = NULL;
-    }
+    s = g_new(BdrvSetBackingNoPermState, 1);
+    *s = (BdrvSetBackingNoPermState) {
+        .bs = bs,
+        .backing_bs = backing_bs,
+        .old_inherits_from = backing_bs ? backing_bs->inherits_from : NULL,
+    };
+    tran_prepend(tran, &bdrv_set_backing_noperm_drv, s);
 
-    if (!backing_hd) {
-        goto out;
+    /*
+     * If backing_bs was already part of bs's backing chain, and
+     * inherits_from pointed recursively to bs then let's update it to
+     * point directly to bs (else it will become NULL).
+     */
+    if (backing_bs && update_inherits_from) {
+        backing_bs->inherits_from = bs;
     }
 
-    bs->backing = bdrv_attach_child(bs, backing_hd, "backing", &child_of_bds,
-                                    bdrv_backing_role(bs), errp);
-    /* If backing_hd was already part of bs's backing chain, and
-     * inherits_from pointed recursively to bs then let's update it to
-     * point directly to bs (else it will become NULL). */
-    if (bs->backing && update_inherits_from) {
-        backing_hd->inherits_from = bs;
+    bdrv_refresh_limits(bs, NULL);
+
+    return 0;
+}
+
+void bdrv_set_backing_hd(BlockDriverState *bs, BlockDriverState *backing_hd,
+                         Error **errp)
+{
+    int ret;
+    GSList *tran = NULL;
+
+    ret = bdrv_set_backing_noperm(bs, backing_hd, &tran, errp);
+    if (ret < 0) {
+        goto out;
     }
 
+    ret = bdrv_refresh_perms(bs, errp);
 out:
-    bdrv_refresh_limits(bs, NULL);
+    tran_finalize(tran, ret);
 }
 
 /*
-- 
2.21.3




* [PATCH v2 29/36] blockdev: qmp_x_blockdev_reopen: acquire all contexts
  2020-11-27 14:44 [PATCH v2 00/36] block: update graph permissions update Vladimir Sementsov-Ogievskiy
                   ` (27 preceding siblings ...)
  2020-11-27 14:45 ` [PATCH v2 28/36] block: add bdrv_set_backing_noperm() transaction action Vladimir Sementsov-Ogievskiy
@ 2020-11-27 14:45 ` Vladimir Sementsov-Ogievskiy
  2021-02-05 16:01   ` Kevin Wolf
  2020-11-27 14:45 ` [PATCH v2 30/36] block: bdrv_reopen_multiple: refresh permissions on updated graph Vladimir Sementsov-Ogievskiy
                   ` (7 subsequent siblings)
  36 siblings, 1 reply; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-11-27 14:45 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, armbru, jsnow, mreitz, kwolf, vsementsov, den

During reopen we may add a backing bs from another aio context, which may
change the original context of the top bs.

We are going to move graph modification to the prepare stage. So it will
be possible for bdrv_flush() in bdrv_reopen_prepare() to be called on a bs
in a non-original aio context that we didn't acquire, which leads to a
crash.

It would be more correct to acquire all aio contexts we are going to work
with. The simplest way is to just acquire all of them. It may be
optimized later if needed.
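
The "acquire each distinct context once" idea can be sketched on a toy
model as follows. Ctx, ToyNode, acquire_all() and release_all() are
hypothetical names for this illustration only; a real lock is replaced by
a simple depth counter.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CTXS 16

/* Toy stand-ins: a context with a lock depth, and a node bound to one. */
typedef struct Ctx {
    int lock_depth;
} Ctx;

typedef struct ToyNode {
    Ctx *ctx;
} ToyNode;

/*
 * Walk all nodes and acquire every distinct context exactly once.
 * 'held' must have room for MAX_CTXS entries. Returns how many were taken.
 */
static size_t acquire_all(const ToyNode *nodes, size_t n_nodes, Ctx **held)
{
    size_t n_held = 0;

    for (size_t i = 0; i < n_nodes; i++) {
        Ctx *c = nodes[i].ctx;
        size_t j;

        for (j = 0; j < n_held; j++) {
            if (held[j] == c) {
                break;              /* already acquired via another node */
            }
        }
        if (j == n_held) {
            assert(n_held < MAX_CTXS);
            held[n_held++] = c;
            c->lock_depth++;        /* stands in for the real acquire */
        }
    }
    return n_held;
}

/* Release exactly the contexts recorded by acquire_all(). */
static void release_all(Ctx **held, size_t n_held)
{
    for (size_t i = 0; i < n_held; i++) {
        held[i]->lock_depth--;      /* stands in for the real release */
    }
}
```

Recording what was acquired keeps the release side symmetric even if nodes
move between contexts while the operation runs.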

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 blockdev.c | 23 +++++++++++++++++++----
 1 file changed, 19 insertions(+), 4 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 2af35d0958..098a05709d 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -3531,7 +3531,6 @@ fail:
 void qmp_x_blockdev_reopen(BlockdevOptions *options, Error **errp)
 {
     BlockDriverState *bs;
-    AioContext *ctx;
     QObject *obj;
     Visitor *v = qobject_output_visitor_new(&obj);
     BlockReopenQueue *queue;
@@ -3557,13 +3556,29 @@ void qmp_x_blockdev_reopen(BlockdevOptions *options, Error **errp)
     qdict_flatten(qdict);
 
     /* Perform the reopen operation */
-    ctx = bdrv_get_aio_context(bs);
-    aio_context_acquire(ctx);
+    BdrvNextIterator it;
+    GSList *aio_ctxs = NULL, *ctx;
+    BlockDriverState *it_bs;
+
+    for (it_bs = bdrv_first(&it); it_bs; it_bs = bdrv_next(&it)) {
+        AioContext *aio_context = bdrv_get_aio_context(it_bs);
+
+        if (!g_slist_find(aio_ctxs, aio_context)) {
+            aio_ctxs = g_slist_prepend(aio_ctxs, aio_context);
+            aio_context_acquire(aio_context);
+        }
+    }
+
     bdrv_subtree_drained_begin(bs);
     queue = bdrv_reopen_queue(NULL, bs, qdict, false);
     bdrv_reopen_multiple(queue, errp);
     bdrv_subtree_drained_end(bs);
-    aio_context_release(ctx);
+
+    for (ctx = aio_ctxs; ctx != NULL; ctx = ctx->next) {
+        AioContext *aio_context = ctx->data;
+        aio_context_release(aio_context);
+    }
+    g_slist_free(aio_ctxs);
 
 fail:
     visit_free(v);
-- 
2.21.3




* [PATCH v2 30/36] block: bdrv_reopen_multiple: refresh permissions on updated graph
  2020-11-27 14:44 [PATCH v2 00/36] block: update graph permissions update Vladimir Sementsov-Ogievskiy
                   ` (28 preceding siblings ...)
  2020-11-27 14:45 ` [PATCH v2 29/36] blockdev: qmp_x_blockdev_reopen: acquire all contexts Vladimir Sementsov-Ogievskiy
@ 2020-11-27 14:45 ` Vladimir Sementsov-Ogievskiy
  2021-02-05 17:57   ` Kevin Wolf
  2020-11-27 14:45 ` [PATCH v2 31/36] block: drop unused permission update functions Vladimir Sementsov-Ogievskiy
                   ` (6 subsequent siblings)
  36 siblings, 1 reply; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-11-27 14:45 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, armbru, jsnow, mreitz, kwolf, vsementsov, den

Move bdrv_reopen_multiple() to the new paradigm of permission update:
first update graph relations, then refresh the permissions.

We have to modify the reopen process in the file-posix driver: with the
new scheme we don't have prepared permissions in raw_reopen_prepare(), so
we should reconfigure the fd in raw_check_perm(). Still, this seems more
natural and simpler anyway.
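
The refresh list is built by calling a topological DFS for several start
nodes that share one "found" set. A minimal sketch of that call shape on a
toy adjacency matrix is shown here. Graph and topo_dfs() are hypothetical
and only cover nodes reachable from the start nodes; this is not the
actual bdrv_topological_dfs() code.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_NODES 8

/* Toy DAG: edge[p][c] is true when p is a parent of c. */
typedef struct Graph {
    int n;
    bool edge[MAX_NODES][MAX_NODES];
} Graph;

/*
 * Depth-first walk that stores each node in front of everything already
 * stored, after all of its children were visited. 'order' is filled from
 * the back: 'pos' is the current start index and the new start index is
 * returned. Sharing 'found' and 'order' across calls lets several start
 * nodes contribute to one list, in which every visited node comes before
 * its children.
 */
static int topo_dfs(const Graph *g, int node, bool *found, int *order,
                    int pos)
{
    if (found[node]) {
        return pos;
    }
    found[node] = true;

    for (int c = 0; c < g->n; c++) {
        if (g->edge[node][c]) {
            pos = topo_dfs(g, c, found, order, pos);
        }
    }

    order[--pos] = node;
    return pos;
}
```

Processing the resulting list front to back means a node's permissions are
refreshed only after the parents that were visited have been handled.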

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 include/block/block.h |   2 +-
 block.c               | 183 +++++++++++-------------------------------
 block/file-posix.c    |  84 +++++--------------
 3 files changed, 70 insertions(+), 199 deletions(-)

diff --git a/include/block/block.h b/include/block/block.h
index 0f21ef313f..82271d9ccd 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -195,7 +195,7 @@ typedef struct BDRVReopenState {
     BlockdevDetectZeroesOptions detect_zeroes;
     bool backing_missing;
     bool replace_backing_bs;  /* new_backing_bs is ignored if this is false */
-    BlockDriverState *new_backing_bs; /* If NULL then detach the current bs */
+    BlockDriverState *old_backing_bs; /* keep pointer for permissions update */
     uint64_t perm, shared_perm;
     QDict *options;
     QDict *explicit_options;
diff --git a/block.c b/block.c
index 617cba9547..474e624152 100644
--- a/block.c
+++ b/block.c
@@ -103,8 +103,9 @@ static int bdrv_attach_child_common(BlockDriverState *child_bs,
                                     GSList **tran, Error **errp);
 static void bdrv_remove_backing(BlockDriverState *bs, GSList **tran);
 
-static int bdrv_reopen_prepare(BDRVReopenState *reopen_state, BlockReopenQueue
-                               *queue, Error **errp);
+static int bdrv_reopen_prepare(BDRVReopenState *reopen_state,
+                               BlockReopenQueue *queue,
+                               GSList **set_backings_tran, Error **errp);
 static void bdrv_reopen_commit(BDRVReopenState *reopen_state);
 static void bdrv_reopen_abort(BDRVReopenState *reopen_state);
 
@@ -2403,6 +2404,7 @@ static void bdrv_list_abort_perm_update(GSList *list)
     }
 }
 
+__attribute__((unused))
 static void bdrv_abort_perm_update(BlockDriverState *bs)
 {
     g_autoptr(GSList) list = bdrv_topological_dfs(NULL, NULL, bs);
@@ -2498,6 +2500,7 @@ char *bdrv_perm_names(uint64_t perm)
  *
  * Needs to be followed by a call to either bdrv_set_perm() or
  * bdrv_abort_perm_update(). */
+__attribute__((unused))
 static int bdrv_check_update_perm(BlockDriverState *bs, BlockReopenQueue *q,
                                   uint64_t new_used_perm,
                                   uint64_t new_shared_perm,
@@ -4100,10 +4103,6 @@ static BlockReopenQueue *bdrv_reopen_queue_child(BlockReopenQueue *bs_queue,
     bs_entry->state.explicit_options = explicit_options;
     bs_entry->state.flags = flags;
 
-    /* This needs to be overwritten in bdrv_reopen_prepare() */
-    bs_entry->state.perm = UINT64_MAX;
-    bs_entry->state.shared_perm = 0;
-
     /*
      * If keep_old_opts is false then it means that unspecified
      * options must be reset to their original value. We don't allow
@@ -4186,40 +4185,37 @@ BlockReopenQueue *bdrv_reopen_queue(BlockReopenQueue *bs_queue,
  */
 int bdrv_reopen_multiple(BlockReopenQueue *bs_queue, Error **errp)
 {
-    int ret = -1;
+    int ret = 0;
     BlockReopenQueueEntry *bs_entry, *next;
+    GSList *tran = NULL;
+    g_autoptr(GHashTable) found = NULL;
+    g_autoptr(GSList) refresh_list = NULL;
 
     assert(bs_queue != NULL);
 
     QTAILQ_FOREACH(bs_entry, bs_queue, entry) {
         assert(bs_entry->state.bs->quiesce_counter > 0);
-        if (bdrv_reopen_prepare(&bs_entry->state, bs_queue, errp)) {
-            goto cleanup;
+        ret = bdrv_reopen_prepare(&bs_entry->state, bs_queue, &tran, errp);
+        if (ret < 0) {
+            goto abort;
         }
         bs_entry->prepared = true;
     }
 
+    found = g_hash_table_new(NULL, NULL);
     QTAILQ_FOREACH(bs_entry, bs_queue, entry) {
         BDRVReopenState *state = &bs_entry->state;
-        ret = bdrv_check_perm(state->bs, bs_queue, state->perm,
-                              state->shared_perm, errp);
-        if (ret < 0) {
-            goto cleanup_perm;
-        }
-        /* Check if new_backing_bs would accept the new permissions */
-        if (state->replace_backing_bs && state->new_backing_bs) {
-            uint64_t nperm, nshared;
-            bdrv_child_perm(state->bs, state->new_backing_bs,
-                            NULL, bdrv_backing_role(state->bs),
-                            bs_queue, state->perm, state->shared_perm,
-                            &nperm, &nshared);
-            ret = bdrv_check_update_perm(state->new_backing_bs, NULL,
-                                         nperm, nshared, errp);
-            if (ret < 0) {
-                goto cleanup_perm;
-            }
+
+        refresh_list = bdrv_topological_dfs(refresh_list, found, state->bs);
+        if (state->old_backing_bs) {
+            refresh_list = bdrv_topological_dfs(refresh_list, found,
+                                                state->old_backing_bs);
         }
-        bs_entry->perms_checked = true;
+    }
+
+    ret = bdrv_list_refresh_perms(refresh_list, bs_queue, &tran, errp);
+    if (ret < 0) {
+        goto abort;
     }
 
     /*
@@ -4235,51 +4231,29 @@ int bdrv_reopen_multiple(BlockReopenQueue *bs_queue, Error **errp)
         bdrv_reopen_commit(&bs_entry->state);
     }
 
-    ret = 0;
-cleanup_perm:
-    QTAILQ_FOREACH_SAFE(bs_entry, bs_queue, entry, next) {
-        BDRVReopenState *state = &bs_entry->state;
+    tran_commit(tran);
 
-        if (!bs_entry->perms_checked) {
-            continue;
-        }
-
-        if (ret == 0) {
-            uint64_t perm, shared;
-
-            bdrv_get_cumulative_perm(state->bs, &perm, &shared);
-            assert(perm == state->perm);
-            assert(shared == state->shared_perm);
+    QTAILQ_FOREACH_REVERSE(bs_entry, bs_queue, entry) {
+        BlockDriverState *bs = bs_entry->state.bs;
 
-            bdrv_set_perm(state->bs);
-        } else {
-            bdrv_abort_perm_update(state->bs);
-            if (state->replace_backing_bs && state->new_backing_bs) {
-                bdrv_abort_perm_update(state->new_backing_bs);
-            }
+        if (bs->drv->bdrv_reopen_commit_post) {
+            bs->drv->bdrv_reopen_commit_post(&bs_entry->state);
         }
     }
+    goto cleanup;
 
-    if (ret == 0) {
-        QTAILQ_FOREACH_REVERSE(bs_entry, bs_queue, entry) {
-            BlockDriverState *bs = bs_entry->state.bs;
-
-            if (bs->drv->bdrv_reopen_commit_post)
-                bs->drv->bdrv_reopen_commit_post(&bs_entry->state);
+abort:
+    tran_abort(tran);
+    QTAILQ_FOREACH_SAFE(bs_entry, bs_queue, entry, next) {
+        if (bs_entry->prepared) {
+            bdrv_reopen_abort(&bs_entry->state);
         }
+        qobject_unref(bs_entry->state.explicit_options);
+        qobject_unref(bs_entry->state.options);
     }
+
 cleanup:
     QTAILQ_FOREACH_SAFE(bs_entry, bs_queue, entry, next) {
-        if (ret) {
-            if (bs_entry->prepared) {
-                bdrv_reopen_abort(&bs_entry->state);
-            }
-            qobject_unref(bs_entry->state.explicit_options);
-            qobject_unref(bs_entry->state.options);
-        }
-        if (bs_entry->state.new_backing_bs) {
-            bdrv_unref(bs_entry->state.new_backing_bs);
-        }
         g_free(bs_entry);
     }
     g_free(bs_queue);
@@ -4304,53 +4278,6 @@ int bdrv_reopen_set_read_only(BlockDriverState *bs, bool read_only,
     return ret;
 }
 
-static BlockReopenQueueEntry *find_parent_in_reopen_queue(BlockReopenQueue *q,
-                                                          BdrvChild *c)
-{
-    BlockReopenQueueEntry *entry;
-
-    QTAILQ_FOREACH(entry, q, entry) {
-        BlockDriverState *bs = entry->state.bs;
-        BdrvChild *child;
-
-        QLIST_FOREACH(child, &bs->children, next) {
-            if (child == c) {
-                return entry;
-            }
-        }
-    }
-
-    return NULL;
-}
-
-static void bdrv_reopen_perm(BlockReopenQueue *q, BlockDriverState *bs,
-                             uint64_t *perm, uint64_t *shared)
-{
-    BdrvChild *c;
-    BlockReopenQueueEntry *parent;
-    uint64_t cumulative_perms = 0;
-    uint64_t cumulative_shared_perms = BLK_PERM_ALL;
-
-    QLIST_FOREACH(c, &bs->parents, next_parent) {
-        parent = find_parent_in_reopen_queue(q, c);
-        if (!parent) {
-            cumulative_perms |= c->perm;
-            cumulative_shared_perms &= c->shared_perm;
-        } else {
-            uint64_t nperm, nshared;
-
-            bdrv_child_perm(parent->state.bs, bs, c, c->role, q,
-                            parent->state.perm, parent->state.shared_perm,
-                            &nperm, &nshared);
-
-            cumulative_perms |= nperm;
-            cumulative_shared_perms &= nshared;
-        }
-    }
-    *perm = cumulative_perms;
-    *shared = cumulative_shared_perms;
-}
-
 static bool bdrv_reopen_can_attach(BlockDriverState *parent,
                                    BdrvChild *child,
                                    BlockDriverState *new_child,
@@ -4392,6 +4319,7 @@ static bool bdrv_reopen_can_attach(BlockDriverState *parent,
  * Return 0 on success, otherwise return < 0 and set @errp.
  */
 static int bdrv_reopen_parse_backing(BDRVReopenState *reopen_state,
+                                     GSList **set_backings_tran,
                                      Error **errp)
 {
     BlockDriverState *bs = reopen_state->bs;
@@ -4468,6 +4396,8 @@ static int bdrv_reopen_parse_backing(BDRVReopenState *reopen_state,
 
     /* If we want to replace the backing file we need some extra checks */
     if (new_backing_bs != bdrv_filter_or_cow_bs(overlay_bs)) {
+        int ret;
+
         /* Check for implicit nodes between bs and its backing file */
         if (bs != overlay_bs) {
             error_setg(errp, "Cannot change backing link if '%s' has "
@@ -4488,9 +4418,11 @@ static int bdrv_reopen_parse_backing(BDRVReopenState *reopen_state,
             return -EPERM;
         }
         reopen_state->replace_backing_bs = true;
-        if (new_backing_bs) {
-            bdrv_ref(new_backing_bs);
-            reopen_state->new_backing_bs = new_backing_bs;
+        reopen_state->old_backing_bs = bs->backing ? bs->backing->bs : NULL;
+        ret = bdrv_set_backing_noperm(bs, new_backing_bs, set_backings_tran,
+                                      errp);
+        if (ret < 0) {
+            return ret;
         }
     }
 
@@ -4515,7 +4447,8 @@ static int bdrv_reopen_parse_backing(BDRVReopenState *reopen_state,
  *
  */
 static int bdrv_reopen_prepare(BDRVReopenState *reopen_state,
-                               BlockReopenQueue *queue, Error **errp)
+                               BlockReopenQueue *queue,
+                               GSList **set_backings_tran, Error **errp)
 {
     int ret = -1;
     int old_flags;
@@ -4582,10 +4515,6 @@ static int bdrv_reopen_prepare(BDRVReopenState *reopen_state,
         goto error;
     }
 
-    /* Calculate required permissions after reopening */
-    bdrv_reopen_perm(queue, reopen_state->bs,
-                     &reopen_state->perm, &reopen_state->shared_perm);
-
     ret = bdrv_flush(reopen_state->bs);
     if (ret) {
         error_setg_errno(errp, -ret, "Error flushing drive");
@@ -4645,7 +4574,7 @@ static int bdrv_reopen_prepare(BDRVReopenState *reopen_state,
      * either a reference to an existing node (using its node name)
      * or NULL to simply detach the current backing file.
      */
-    ret = bdrv_reopen_parse_backing(reopen_state, errp);
+    ret = bdrv_reopen_parse_backing(reopen_state, set_backings_tran, errp);
     if (ret < 0) {
         goto error;
     }
@@ -4767,22 +4696,6 @@ static void bdrv_reopen_commit(BDRVReopenState *reopen_state)
         qdict_del(bs->explicit_options, child->name);
         qdict_del(bs->options, child->name);
     }
-
-    /*
-     * Change the backing file if a new one was specified. We do this
-     * after updating bs->options, so bdrv_refresh_filename() (called
-     * from bdrv_set_backing_hd()) has the new values.
-     */
-    if (reopen_state->replace_backing_bs) {
-        BlockDriverState *old_backing_bs = child_bs(bs->backing);
-        assert(!old_backing_bs || !old_backing_bs->implicit);
-        /* Abort the permission update on the backing bs we're detaching */
-        if (old_backing_bs) {
-            bdrv_abort_perm_update(old_backing_bs);
-        }
-        bdrv_set_backing_hd(bs, reopen_state->new_backing_bs, &error_abort);
-    }
-
     bdrv_refresh_limits(bs, NULL);
 }
 
diff --git a/block/file-posix.c b/block/file-posix.c
index 37d9266f6a..42c60c8a02 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -175,7 +175,6 @@ typedef struct BDRVRawState {
 } BDRVRawState;
 
 typedef struct BDRVRawReopenState {
-    int fd;
     int open_flags;
     bool drop_cache;
     bool check_cache_dropped;
@@ -1062,7 +1061,6 @@ static int raw_reopen_prepare(BDRVReopenState *state,
     BDRVRawReopenState *rs;
     QemuOpts *opts;
     int ret;
-    Error *local_err = NULL;
 
     assert(state != NULL);
     assert(state->bs != NULL);
@@ -1088,32 +1086,9 @@ static int raw_reopen_prepare(BDRVReopenState *state,
      * bdrv_reopen_prepare() will detect changes and complain. */
     qemu_opts_to_qdict(opts, state->options);
 
-    rs->fd = raw_reconfigure_getfd(state->bs, state->flags, &rs->open_flags,
-                                   state->perm, true, &local_err);
-    if (local_err) {
-        error_propagate(errp, local_err);
-        ret = -1;
-        goto out;
-    }
-
-    /* Fail already reopen_prepare() if we can't get a working O_DIRECT
-     * alignment with the new fd. */
-    if (rs->fd != -1) {
-        raw_probe_alignment(state->bs, rs->fd, &local_err);
-        if (local_err) {
-            error_propagate(errp, local_err);
-            ret = -EINVAL;
-            goto out_fd;
-        }
-    }
-
     s->reopen_state = state;
     ret = 0;
-out_fd:
-    if (ret < 0) {
-        qemu_close(rs->fd);
-        rs->fd = -1;
-    }
+
 out:
     qemu_opts_del(opts);
     return ret;
@@ -1127,10 +1102,6 @@ static void raw_reopen_commit(BDRVReopenState *state)
     s->drop_cache = rs->drop_cache;
     s->check_cache_dropped = rs->check_cache_dropped;
     s->open_flags = rs->open_flags;
-
-    qemu_close(s->fd);
-    s->fd = rs->fd;
-
     g_free(state->opaque);
     state->opaque = NULL;
 
@@ -1149,10 +1120,6 @@ static void raw_reopen_abort(BDRVReopenState *state)
         return;
     }
 
-    if (rs->fd >= 0) {
-        qemu_close(rs->fd);
-        rs->fd = -1;
-    }
     g_free(state->opaque);
     state->opaque = NULL;
 
@@ -3060,39 +3027,30 @@ static int raw_check_perm(BlockDriverState *bs, uint64_t perm, uint64_t shared,
                           Error **errp)
 {
     BDRVRawState *s = bs->opaque;
-    BDRVRawReopenState *rs = NULL;
+    int input_flags = s->reopen_state ? s->reopen_state->flags : bs->open_flags;
     int open_flags;
     int ret;
 
-    if (s->perm_change_fd) {
+    /* We may need a new fd if auto-read-only switches the mode */
+    ret = raw_reconfigure_getfd(bs, input_flags, &open_flags, perm,
+                                false, errp);
+    if (ret < 0) {
+        return ret;
+    } else if (ret != s->fd) {
+        Error *local_err = NULL;
+
         /*
-         * In the context of reopen, this function may be called several times
-         * (directly and recursively while change permissions of the parent).
-         * This is even true for children that don't inherit from the original
-         * reopen node, so s->reopen_state is not set.
-         *
-         * Ignore all but the first call.
+         * Fail already check_perm() if we can't get a working O_DIRECT
+         * alignment with the new fd.
          */
-        return 0;
-    }
-
-    if (s->reopen_state) {
-        /* We already have a new file descriptor to set permissions for */
-        assert(s->reopen_state->perm == perm);
-        assert(s->reopen_state->shared_perm == shared);
-        rs = s->reopen_state->opaque;
-        s->perm_change_fd = rs->fd;
-        s->perm_change_flags = rs->open_flags;
-    } else {
-        /* We may need a new fd if auto-read-only switches the mode */
-        ret = raw_reconfigure_getfd(bs, bs->open_flags, &open_flags, perm,
-                                    false, errp);
-        if (ret < 0) {
-            return ret;
-        } else if (ret != s->fd) {
-            s->perm_change_fd = ret;
-            s->perm_change_flags = open_flags;
+        raw_probe_alignment(bs, ret, &local_err);
+        if (local_err) {
+            error_propagate(errp, local_err);
+            return -EINVAL;
         }
+
+        s->perm_change_fd = ret;
+        s->perm_change_flags = open_flags;
     }
 
     /* Prepare permissions on old fd to avoid conflicts between old and new,
@@ -3114,7 +3072,7 @@ static int raw_check_perm(BlockDriverState *bs, uint64_t perm, uint64_t shared,
     return 0;
 
 fail:
-    if (s->perm_change_fd && !s->reopen_state) {
+    if (s->perm_change_fd) {
         qemu_close(s->perm_change_fd);
     }
     s->perm_change_fd = 0;
@@ -3145,7 +3103,7 @@ static void raw_abort_perm_update(BlockDriverState *bs)
 
     /* For reopen, .bdrv_reopen_abort is called afterwards and will close
      * the file descriptor. */
-    if (s->perm_change_fd && !s->reopen_state) {
+    if (s->perm_change_fd) {
         qemu_close(s->perm_change_fd);
     }
     s->perm_change_fd = 0;
-- 
2.21.3
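
The refresh_list built in bdrv_reopen_multiple() above comes from
bdrv_topological_dfs(), whose body is not part of this diff. A minimal
sketch of the idea, assuming it is a depth-first walk that prepends each
node only after all of its children have been visited (so every node ends
up ahead of its children in the list), could look like this. Node, Order
and topological_dfs() are hypothetical names for illustration, not QEMU
code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_CHILDREN 4
#define MAX_NODES 16

/* Hypothetical stand-in for a graph node: a name plus child links. */
typedef struct Node {
    const char *name;
    struct Node *children[MAX_CHILDREN];
    size_t n_children;
    bool found;             /* plays the role of the "found" hash table */
} Node;

/* Result list, filled from the back so that adding a node is a "prepend". */
typedef struct Order {
    Node *nodes[MAX_NODES];
    size_t head;            /* index of the first valid entry */
} Order;

static void topological_dfs(Order *order, Node *node)
{
    if (node->found) {
        return;             /* already listed via another parent */
    }
    node->found = true;

    for (size_t i = 0; i < node->n_children; i++) {
        topological_dfs(order, node->children[i]);
    }

    /* Prepend only after the whole subtree is listed (reverse post-order). */
    assert(order->head > 0);
    order->nodes[--order->head] = node;
}
```

For a diamond graph top -> {left, right} -> base, calling
topological_dfs() on top with head initialised to MAX_NODES lists
top, right, left, base. The shared node base comes after both of its
parents, which a plain pre-order walk (top, left, base, right) would not
guarantee.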




* [PATCH v2 31/36] block: drop unused permission update functions
  2020-11-27 14:44 [PATCH v2 00/36] block: update graph permissions update Vladimir Sementsov-Ogievskiy
                   ` (29 preceding siblings ...)
  2020-11-27 14:45 ` [PATCH v2 30/36] block: bdrv_reopen_multiple: refresh permissions on updated graph Vladimir Sementsov-Ogievskiy
@ 2020-11-27 14:45 ` Vladimir Sementsov-Ogievskiy
  2020-11-27 14:45 ` [PATCH v2 32/36] block: inline bdrv_check_perm_common() Vladimir Sementsov-Ogievskiy
                   ` (5 subsequent siblings)
  36 siblings, 0 replies; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-11-27 14:45 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, armbru, jsnow, mreitz, kwolf, vsementsov, den

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block.c | 103 --------------------------------------------------------
 1 file changed, 103 deletions(-)

diff --git a/block.c b/block.c
index 474e624152..3ea04bbd8f 100644
--- a/block.c
+++ b/block.c
@@ -1933,11 +1933,6 @@ static int bdrv_fill_options(QDict **options, const char *filename,
     return 0;
 }
 
-static int bdrv_check_update_perm(BlockDriverState *bs, BlockReopenQueue *q,
-                                  uint64_t new_used_perm,
-                                  uint64_t new_shared_perm,
-                                  Error **errp);
-
 typedef struct BlockReopenQueueEntry {
      bool prepared;
      bool perms_checked;
@@ -2361,56 +2356,12 @@ static int bdrv_check_perm_common(GSList *list, BlockReopenQueue *q,
     return 0;
 }
 
-static int bdrv_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
-                           uint64_t cumulative_perms,
-                           uint64_t cumulative_shared_perms, Error **errp)
-{
-    g_autoptr(GSList) list = bdrv_topological_dfs(NULL, NULL, bs);
-    return bdrv_check_perm_common(list, q, true, cumulative_perms,
-                                  cumulative_shared_perms, NULL, errp);
-}
-
 static int bdrv_list_refresh_perms(GSList *list, BlockReopenQueue *q,
                                    GSList **tran, Error **errp)
 {
     return bdrv_check_perm_common(list, q, false, 0, 0, tran, errp);
 }
 
-/*
- * Notifies drivers that after a previous bdrv_check_perm() call, the
- * permission update is not performed and any preparations made for it (e.g.
- * taken file locks) need to be undone.
- */
-static void bdrv_node_abort_perm_update(BlockDriverState *bs)
-{
-    BlockDriver *drv = bs->drv;
-    BdrvChild *c;
-
-    if (!drv) {
-        return;
-    }
-
-    bdrv_drv_set_perm_abort(bs);
-
-    QLIST_FOREACH(c, &bs->children, next) {
-        bdrv_child_set_perm_abort(c);
-    }
-}
-
-static void bdrv_list_abort_perm_update(GSList *list)
-{
-    for ( ; list; list = list->next) {
-        bdrv_node_abort_perm_update((BlockDriverState *)list->data);
-    }
-}
-
-__attribute__((unused))
-static void bdrv_abort_perm_update(BlockDriverState *bs)
-{
-    g_autoptr(GSList) list = bdrv_topological_dfs(NULL, NULL, bs);
-    return bdrv_list_abort_perm_update(list);
-}
-
 static void bdrv_node_set_perm(BlockDriverState *bs)
 {
     BlockDriver *drv = bs->drv;
@@ -2492,60 +2443,6 @@ char *bdrv_perm_names(uint64_t perm)
     return g_string_free(result, FALSE);
 }
 
-/*
- * Checks whether a new reference to @bs can be added if the new user requires
- * @new_used_perm/@new_shared_perm as its permissions. If @ignore_children is
- * set, the BdrvChild objects in this list are ignored in the calculations;
- * this allows checking permission updates for an existing reference.
- *
- * Needs to be followed by a call to either bdrv_set_perm() or
- * bdrv_abort_perm_update(). */
-__attribute__((unused))
-static int bdrv_check_update_perm(BlockDriverState *bs, BlockReopenQueue *q,
-                                  uint64_t new_used_perm,
-                                  uint64_t new_shared_perm,
-                                  Error **errp)
-{
-    BdrvChild *c;
-    uint64_t cumulative_perms = new_used_perm;
-    uint64_t cumulative_shared_perms = new_shared_perm;
-
-
-    /* There is no reason why anyone couldn't tolerate write_unchanged */
-    assert(new_shared_perm & BLK_PERM_WRITE_UNCHANGED);
-
-    QLIST_FOREACH(c, &bs->parents, next_parent) {
-        if ((new_used_perm & c->shared_perm) != new_used_perm) {
-            char *user = bdrv_child_user_desc(c);
-            char *perm_names = bdrv_perm_names(new_used_perm & ~c->shared_perm);
-
-            error_setg(errp, "Conflicts with use by %s as '%s', which does not "
-                             "allow '%s' on %s",
-                       user, c->name, perm_names, bdrv_get_node_name(c->bs));
-            g_free(user);
-            g_free(perm_names);
-            return -EPERM;
-        }
-
-        if ((c->perm & new_shared_perm) != c->perm) {
-            char *user = bdrv_child_user_desc(c);
-            char *perm_names = bdrv_perm_names(c->perm & ~new_shared_perm);
-
-            error_setg(errp, "Conflicts with use by %s as '%s', which uses "
-                             "'%s' on %s",
-                       user, c->name, perm_names, bdrv_get_node_name(c->bs));
-            g_free(user);
-            g_free(perm_names);
-            return -EPERM;
-        }
-
-        cumulative_perms |= c->perm;
-        cumulative_shared_perms &= c->shared_perm;
-    }
-
-    return bdrv_check_perm(bs, q, cumulative_perms, cumulative_shared_perms,
-                           errp);
-}
 
 static int bdrv_refresh_perms(BlockDriverState *bs, Error **errp)
 {
-- 
2.21.3




* [PATCH v2 32/36] block: inline bdrv_check_perm_common()
  2020-11-27 14:44 [PATCH v2 00/36] block: update graph permissions update Vladimir Sementsov-Ogievskiy
                   ` (30 preceding siblings ...)
  2020-11-27 14:45 ` [PATCH v2 31/36] block: drop unused permission update functions Vladimir Sementsov-Ogievskiy
@ 2020-11-27 14:45 ` Vladimir Sementsov-Ogievskiy
  2020-11-27 14:45 ` [PATCH v2 33/36] block: inline bdrv_replace_child() Vladimir Sementsov-Ogievskiy
                   ` (4 subsequent siblings)
  36 siblings, 0 replies; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-11-27 14:45 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, armbru, jsnow, mreitz, kwolf, vsementsov, den

bdrv_check_perm_common() now has only one caller, so a separate "common"
helper no longer makes sense. Inline it into bdrv_list_refresh_perms().

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block.c | 32 +++-----------------------------
 1 file changed, 3 insertions(+), 29 deletions(-)

diff --git a/block.c b/block.c
index 3ea04bbd8f..6c87ad0287 100644
--- a/block.c
+++ b/block.c
@@ -2308,33 +2308,13 @@ static int bdrv_node_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
     return 0;
 }
 
-/*
- * If use_cumulative_perms is true, use cumulative_perms and
- * cumulative_shared_perms for first element of the list. Otherwise just refresh
- * all permissions.
- */
-static int bdrv_check_perm_common(GSList *list, BlockReopenQueue *q,
-                                  bool use_cumulative_perms,
-                                  uint64_t cumulative_perms,
-                                  uint64_t cumulative_shared_perms,
-                                  GSList **tran, Error **errp)
+static int bdrv_list_refresh_perms(GSList *list, BlockReopenQueue *q,
+                                   GSList **tran, Error **errp)
 {
     int ret;
+    uint64_t cumulative_perms, cumulative_shared_perms;
     BlockDriverState *bs;
 
-    if (use_cumulative_perms) {
-        bs = list->data;
-
-        ret = bdrv_node_check_perm(bs, q, cumulative_perms,
-                                   cumulative_shared_perms,
-                                   tran, errp);
-        if (ret < 0) {
-            return ret;
-        }
-
-        list = list->next;
-    }
-
     for ( ; list; list = list->next) {
         bs = list->data;
 
@@ -2356,12 +2336,6 @@ static int bdrv_check_perm_common(GSList *list, BlockReopenQueue *q,
     return 0;
 }
 
-static int bdrv_list_refresh_perms(GSList *list, BlockReopenQueue *q,
-                                   GSList **tran, Error **errp)
-{
-    return bdrv_check_perm_common(list, q, false, 0, 0, tran, errp);
-}
-
 static void bdrv_node_set_perm(BlockDriverState *bs)
 {
     BlockDriver *drv = bs->drv;
-- 
2.21.3




* [PATCH v2 33/36] block: inline bdrv_replace_child()
  2020-11-27 14:44 [PATCH v2 00/36] block: update graph permissions update Vladimir Sementsov-Ogievskiy
                   ` (31 preceding siblings ...)
  2020-11-27 14:45 ` [PATCH v2 32/36] block: inline bdrv_check_perm_common() Vladimir Sementsov-Ogievskiy
@ 2020-11-27 14:45 ` Vladimir Sementsov-Ogievskiy
  2020-11-27 14:45 ` [PATCH v2 34/36] block: refactor bdrv_child_set_perm_safe() transaction action Vladimir Sementsov-Ogievskiy
                   ` (3 subsequent siblings)
  36 siblings, 0 replies; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-11-27 14:45 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, armbru, jsnow, mreitz, kwolf, vsementsov, den

bdrv_replace_child() has only one caller, and that caller always passes
NULL as the second argument. Inline it now. This allows dropping some
more interfaces that become unused.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block.c | 101 ++++++++++----------------------------------------------
 1 file changed, 18 insertions(+), 83 deletions(-)

diff --git a/block.c b/block.c
index 6c87ad0287..3093d20db8 100644
--- a/block.c
+++ b/block.c
@@ -2336,42 +2336,6 @@ static int bdrv_list_refresh_perms(GSList *list, BlockReopenQueue *q,
     return 0;
 }
 
-static void bdrv_node_set_perm(BlockDriverState *bs)
-{
-    BlockDriver *drv = bs->drv;
-    BdrvChild *c;
-
-    if (!drv) {
-        return;
-    }
-
-    bdrv_drv_set_perm_commit(bs);
-
-    /* Drivers that never have children can omit .bdrv_child_perm() */
-    if (!drv->bdrv_child_perm) {
-        assert(QLIST_EMPTY(&bs->children));
-        return;
-    }
-
-    /* Update all children */
-    QLIST_FOREACH(c, &bs->children, next) {
-        bdrv_child_set_perm_commit(c);
-    }
-}
-
-static void bdrv_list_set_perm(GSList *list)
-{
-    for ( ; list; list = list->next) {
-        bdrv_node_set_perm((BlockDriverState *)list->data);
-    }
-}
-
-static void bdrv_set_perm(BlockDriverState *bs)
-{
-    g_autoptr(GSList) list = bdrv_topological_dfs(NULL, NULL, bs);
-    return bdrv_list_set_perm(list);
-}
-
 void bdrv_get_cumulative_perm(BlockDriverState *bs, uint64_t *perm,
                               uint64_t *shared_perm)
 {
@@ -2711,52 +2675,6 @@ static void bdrv_replace_child_noperm(BdrvChild *child,
     }
 }
 
-/*
- * Updates @child to change its reference to point to @new_bs, including
- * checking and applying the necessary permission updates both to the old node
- * and to @new_bs.
- *
- * NULL is passed as @new_bs for removing the reference before freeing @child.
- *
- * If @new_bs is not NULL, bdrv_check_perm() must be called beforehand, as this
- * function uses bdrv_set_perm() to update the permissions according to the new
- * reference that @new_bs gets.
- *
- * Callers must ensure that child->frozen is false.
- */
-static void bdrv_replace_child(BdrvChild *child, BlockDriverState *new_bs)
-{
-    BlockDriverState *old_bs = child->bs;
-
-    /* Asserts that child->frozen == false */
-    bdrv_replace_child_noperm(child, new_bs);
-
-    /*
-     * Start with the new node's permissions.  If @new_bs is a (direct
-     * or indirect) child of @old_bs, we must complete the permission
-     * update on @new_bs before we loosen the restrictions on @old_bs.
-     * Otherwise, bdrv_check_perm() on @old_bs would re-initiate
-     * updating the permissions of @new_bs, and thus not purely loosen
-     * restrictions.
-     */
-    if (new_bs) {
-        bdrv_set_perm(new_bs);
-    }
-
-    if (old_bs) {
-        /*
-         * Update permissions for old node. We're just taking a parent away, so
-         * we're loosening restrictions. Errors of permission update are not
-         * fatal in this case, ignore them.
-         */
-        bdrv_refresh_perms(old_bs, NULL);
-
-        /* When the parent requiring a non-default AioContext is removed, the
-         * node moves back to the main AioContext */
-        bdrv_try_set_aio_context(old_bs, qemu_get_aio_context(), NULL);
-    }
-}
-
 /*
  * This function steals the reference to child_bs from the caller.
  * That reference is later dropped by bdrv_root_unref_child().
@@ -2979,8 +2897,25 @@ static int bdrv_attach_child_noperm(BlockDriverState *parent_bs,
 
 static void bdrv_detach_child(BdrvChild *child)
 {
-    bdrv_replace_child(child, NULL);
+    BlockDriverState *old_bs = child->bs;
+
+    bdrv_replace_child_noperm(child, NULL);
     bdrv_remove_empty_child(child);
+
+    if (old_bs) {
+        /*
+         * Update permissions for old node. We're just taking a parent away, so
+         * we're loosening restrictions. Errors of permission update are not
+         * fatal in this case, ignore them.
+         */
+        bdrv_refresh_perms(old_bs, NULL);
+
+        /*
+         * When the parent requiring a non-default AioContext is removed, the
+         * node moves back to the main AioContext
+         */
+        bdrv_try_set_aio_context(old_bs, qemu_get_aio_context(), NULL);
+    }
 }
 
 /* Callers must ensure that child->frozen is false. */
-- 
2.21.3




* [PATCH v2 34/36] block: refactor bdrv_child_set_perm_safe() transaction action
  2020-11-27 14:44 [PATCH v2 00/36] block: update graph permissions update Vladimir Sementsov-Ogievskiy
                   ` (32 preceding siblings ...)
  2020-11-27 14:45 ` [PATCH v2 33/36] block: inline bdrv_replace_child() Vladimir Sementsov-Ogievskiy
@ 2020-11-27 14:45 ` Vladimir Sementsov-Ogievskiy
  2021-02-10 14:51   ` Kevin Wolf
  2020-11-27 14:45 ` [PATCH v2 35/36] block: rename bdrv_replace_child_safe() to bdrv_replace_child() Vladimir Sementsov-Ogievskiy
                   ` (2 subsequent siblings)
  36 siblings, 1 reply; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-11-27 14:45 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, armbru, jsnow, mreitz, kwolf, vsementsov, den

The old interfaces are dropped and nobody calls
bdrv_child_set_perm_abort() or bdrv_child_set_perm_commit() directly any
more, so we can use a dedicated state structure for the action and stop
abusing the BdrvChild structure to hold the backup values. Also drop the
"_safe" suffix, which is now redundant.
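
To illustrate the pattern, here is a minimal standalone sketch of a
transaction action that keeps its undo data in its own heap-allocated
state. It assumes, as the ".clean = g_free" line in the diff suggests,
that a cleanup callback runs once the transaction finishes, whether it
was committed or aborted. Action, Child, SetPermState, tran_add(),
tran_finish() and child_set_perm() are hypothetical names, not the QEMU
API:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical miniature transaction: a LIFO list of undoable actions. */
typedef struct Action {
    void (*undo)(void *opaque);    /* roll the change back; may be NULL */
    void (*clean)(void *opaque);   /* release opaque on commit or abort */
    void *opaque;
    struct Action *next;
} Action;

/* Hypothetical stand-in for BdrvChild: only the two permission masks. */
typedef struct Child {
    uint64_t perm;
    uint64_t shared_perm;
} Child;

/* Per-action state: the old values live here, not in Child itself. */
typedef struct SetPermState {
    Child *child;
    uint64_t old_perm;
    uint64_t old_shared_perm;
} SetPermState;

static void set_perm_undo(void *opaque)
{
    SetPermState *s = opaque;

    s->child->perm = s->old_perm;
    s->child->shared_perm = s->old_shared_perm;
}

static void tran_add(Action **tran, void (*undo)(void *),
                     void (*clean)(void *), void *opaque)
{
    Action *a = malloc(sizeof(*a));

    assert(a != NULL);
    *a = (Action) { .undo = undo, .clean = clean,
                    .opaque = opaque, .next = *tran };
    *tran = a;                     /* prepend: undo runs newest first */
}

static void child_set_perm(Child *c, uint64_t perm, uint64_t shared,
                           Action **tran)
{
    SetPermState *s = malloc(sizeof(*s));

    assert(s != NULL);
    *s = (SetPermState) { .child = c, .old_perm = c->perm,
                          .old_shared_perm = c->shared_perm };
    c->perm = perm;
    c->shared_perm = shared;
    tran_add(tran, set_perm_undo, free, s);
}

/* Finish the transaction: undo every action if aborted, then clean up. */
static void tran_finish(Action *tran, int aborted)
{
    while (tran) {
        Action *next = tran->next;

        if (aborted && tran->undo) {
            tran->undo(tran->opaque);
        }
        if (tran->clean) {
            tran->clean(tran->opaque);
        }
        free(tran);
        tran = next;
    }
}
```

Because each call records its own old values and the undo callbacks run
newest first, setting the same child twice in one transaction still
restores the original permissions on abort, without any
"has backup" flag in the child.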

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 include/block/block_int.h |  5 ----
 block.c                   | 63 ++++++++++++++-------------------------
 2 files changed, 22 insertions(+), 46 deletions(-)

diff --git a/include/block/block_int.h b/include/block/block_int.h
index 24a04ac2dc..1e509db867 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -796,11 +796,6 @@ struct BdrvChild {
      */
     uint64_t shared_perm;
 
-    /* backup of permissions during permission update procedure */
-    bool has_backup_perm;
-    uint64_t backup_perm;
-    uint64_t backup_shared_perm;
-
     /*
      * This link is frozen: the child can neither be replaced nor
      * detached from the parent.
diff --git a/block.c b/block.c
index 3093d20db8..1fde22e4f4 100644
--- a/block.c
+++ b/block.c
@@ -2070,59 +2070,40 @@ static GSList *bdrv_topological_dfs(GSList *list, GHashTable *found,
     return g_slist_prepend(list, bs);
 }
 
-static void bdrv_child_set_perm_commit(void *opaque)
-{
-    BdrvChild *c = opaque;
-
-    c->has_backup_perm = false;
-}
+typedef struct BdrvChildSetPermState {
+    BdrvChild *child;
+    uint64_t old_perm;
+    uint64_t old_shared_perm;
+} BdrvChildSetPermState;
 
 static void bdrv_child_set_perm_abort(void *opaque)
 {
-    BdrvChild *c = opaque;
-    /*
-     * We may have child->has_backup_perm unset at this point, as in case of
-     * _check_ stage of permission update failure we may _check_ not the whole
-     * subtree.  Still, _abort_ is called on the whole subtree anyway.
-     */
-    if (c->has_backup_perm) {
-        c->perm = c->backup_perm;
-        c->shared_perm = c->backup_shared_perm;
-        c->has_backup_perm = false;
-    }
+    BdrvChildSetPermState *s = opaque;
+
+    s->child->perm = s->old_perm;
+    s->child->shared_perm = s->old_shared_perm;
 }
 
 static TransactionActionDrv bdrv_child_set_pem_drv = {
     .abort = bdrv_child_set_perm_abort,
-    .commit = bdrv_child_set_perm_commit,
+    .clean = g_free,
 };
 
-/*
- * With tran=NULL needs to be followed by direct call to either
- * bdrv_child_set_perm_commit() or bdrv_child_set_perm_abort().
- *
- * With non-NUll tran needs to be followed by tran_abort() or tran_commit()
- * instead.
- */
-static void bdrv_child_set_perm_safe(BdrvChild *c, uint64_t perm,
-                                     uint64_t shared, GSList **tran)
+static void bdrv_child_set_perm(BdrvChild *c, uint64_t perm,
+                                uint64_t shared, GSList **tran)
 {
-    if (!c->has_backup_perm) {
-        c->has_backup_perm = true;
-        c->backup_perm = c->perm;
-        c->backup_shared_perm = c->shared_perm;
-    }
-    /*
-     * Note: it's OK if c->has_backup_perm was already set, as we can find the
-     * same c twice during check_perm procedure
-     */
+    BdrvChildSetPermState *s = g_new(BdrvChildSetPermState, 1);
+
+    *s = (BdrvChildSetPermState) {
+        .child = c,
+        .old_perm = c->perm,
+        .old_shared_perm = c->shared_perm,
+    };
 
     c->perm = perm;
     c->shared_perm = shared;
 
-    if (tran) {
-        tran_prepend(tran, &bdrv_child_set_pem_drv, c);
-    }
+    tran_prepend(tran, &bdrv_child_set_pem_drv, s);
 }
 
 static void bdrv_drv_set_perm_commit(void *opaque)
@@ -2302,7 +2283,7 @@ static int bdrv_node_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
         bdrv_child_perm(bs, c->bs, c, c->role, q,
                         cumulative_perms, cumulative_shared_perms,
                         &cur_perm, &cur_shared);
-        bdrv_child_set_perm_safe(c, cur_perm, cur_shared, tran);
+        bdrv_child_set_perm(c, cur_perm, cur_shared, tran);
     }
 
     return 0;
@@ -2401,7 +2382,7 @@ int bdrv_child_try_set_perm(BdrvChild *c, uint64_t perm, uint64_t shared,
     GSList *tran = NULL;
     int ret;
 
-    bdrv_child_set_perm_safe(c, perm, shared, &tran);
+    bdrv_child_set_perm(c, perm, shared, &tran);
 
     ret = bdrv_refresh_perms(c->bs, &local_err);
 
-- 
2.21.3




* [PATCH v2 35/36] block: rename bdrv_replace_child_safe() to bdrv_replace_child()
  2020-11-27 14:44 [PATCH v2 00/36] block: update graph permissions update Vladimir Sementsov-Ogievskiy
                   ` (33 preceding siblings ...)
  2020-11-27 14:45 ` [PATCH v2 34/36] block: refactor bdrv_child_set_perm_safe() transaction action Vladimir Sementsov-Ogievskiy
@ 2020-11-27 14:45 ` Vladimir Sementsov-Ogievskiy
  2020-11-27 14:45 ` [PATCH v2 36/36] block: refactor bdrv_node_check_perm() Vladimir Sementsov-Ogievskiy
  2021-01-09 10:12 ` [PATCH v2 00/36] block: update graph permissions update Vladimir Sementsov-Ogievskiy
  36 siblings, 0 replies; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-11-27 14:45 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, armbru, jsnow, mreitz, kwolf, vsementsov, den

We no longer have bdrv_replace_child(), so it's time for
bdrv_replace_child_safe() to take its place.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/block.c b/block.c
index 1fde22e4f4..20b1cf59f7 100644
--- a/block.c
+++ b/block.c
@@ -2183,11 +2183,11 @@ static TransactionActionDrv bdrv_replace_child_drv = {
 };
 
 /*
- * bdrv_replace_child_safe
+ * bdrv_replace_child
  *
  * Note: real unref of old_bs is done only on commit.
  */
-static void bdrv_replace_child_safe(BdrvChild *child, BlockDriverState *new_bs,
+static void bdrv_replace_child(BdrvChild *child, BlockDriverState *new_bs,
                                     GSList **tran)
 {
     BdrvReplaceChildState *s = g_new(BdrvReplaceChildState, 1);
@@ -3040,7 +3040,7 @@ static int bdrv_set_backing_noperm(BlockDriverState *bs,
     }
 
     if (bs->backing && backing_bs) {
-        bdrv_replace_child_safe(bs->backing, backing_bs, tran);
+        bdrv_replace_child(bs->backing, backing_bs, tran);
     } else if (bs->backing && !backing_bs) {
         bdrv_remove_backing(bs, tran);
     } else if (backing_bs) {
@@ -4679,7 +4679,7 @@ static void bdrv_remove_backing(BlockDriverState *bs, GSList **tran)
     }
 
     if (bs->backing->bs) {
-        bdrv_replace_child_safe(bs->backing, NULL, tran);
+        bdrv_replace_child(bs->backing, NULL, tran);
     }
 
     tran_prepend(tran, &bdrv_remove_backing_drv, bs->backing);
@@ -4708,7 +4708,7 @@ static int bdrv_replace_node_noperm(BlockDriverState *from,
                        c->name, from->node_name);
             return -EPERM;
         }
-        bdrv_replace_child_safe(c, to, tran);
+        bdrv_replace_child(c, to, tran);
     }
 
     return 0;
-- 
2.21.3




* [PATCH v2 36/36] block: refactor bdrv_node_check_perm()
  2020-11-27 14:44 [PATCH v2 00/36] block: update graph permissions update Vladimir Sementsov-Ogievskiy
                   ` (34 preceding siblings ...)
  2020-11-27 14:45 ` [PATCH v2 35/36] block: rename bdrv_replace_child_safe() to bdrv_replace_child() Vladimir Sementsov-Ogievskiy
@ 2020-11-27 14:45 ` Vladimir Sementsov-Ogievskiy
  2021-02-10 15:07   ` Kevin Wolf
  2021-01-09 10:12 ` [PATCH v2 00/36] block: update graph permissions update Vladimir Sementsov-Ogievskiy
  36 siblings, 1 reply; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-11-27 14:45 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, armbru, jsnow, mreitz, kwolf, vsementsov, den

Now, bdrv_node_check_perm() is called only with fresh cumulative
permissions, so it's actually "refresh_perm".

Move the permission calculation into the function. Also, drop the
unreachable error message.

Also add the Virtuozzo copyright, as a lot of work has been done at this
point.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block.c | 38 +++++++++-----------------------------
 1 file changed, 9 insertions(+), 29 deletions(-)

diff --git a/block.c b/block.c
index 20b1cf59f7..576b145cbf 100644
--- a/block.c
+++ b/block.c
@@ -2,6 +2,7 @@
  * QEMU System Emulator block driver
  *
  * Copyright (c) 2003 Fabrice Bellard
+ * Copyright (c) 2020 Virtuozzo International GmbH.
  *
  * Permission is hereby granted, free of charge, to any person obtaining a copy
  * of this software and associated documentation files (the "Software"), to deal
@@ -2204,23 +2205,15 @@ static void bdrv_replace_child(BdrvChild *child, BlockDriverState *new_bs,
     /* old_bs reference is transparently moved from @child to @s */
 }
 
-/*
- * Check whether permissions on this node can be changed in a way that
- * @cumulative_perms and @cumulative_shared_perms are the new cumulative
- * permissions of all its parents. This involves checking whether all necessary
- * permission changes to child nodes can be performed.
- *
- * A call to this function must always be followed by a call to bdrv_set_perm()
- * or bdrv_abort_perm_update().
- */
-static int bdrv_node_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
-                                uint64_t cumulative_perms,
-                                uint64_t cumulative_shared_perms,
-                                GSList **tran, Error **errp)
+static int bdrv_node_refresh_perm(BlockDriverState *bs, BlockReopenQueue *q,
+                                  GSList **tran, Error **errp)
 {
     BlockDriver *drv = bs->drv;
     BdrvChild *c;
     int ret;
+    uint64_t cumulative_perms, cumulative_shared_perms;
+
+    bdrv_get_cumulative_perm(bs, &cumulative_perms, &cumulative_shared_perms);
 
     /* Write permissions never work with read-only images */
     if ((cumulative_perms & (BLK_PERM_WRITE | BLK_PERM_WRITE_UNCHANGED)) &&
@@ -2229,15 +2222,8 @@ static int bdrv_node_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
         if (!bdrv_is_writable_after_reopen(bs, NULL)) {
             error_setg(errp, "Block node is read-only");
         } else {
-            uint64_t current_perms, current_shared;
-            bdrv_get_cumulative_perm(bs, &current_perms, &current_shared);
-            if (current_perms & (BLK_PERM_WRITE | BLK_PERM_WRITE_UNCHANGED)) {
-                error_setg(errp, "Cannot make block node read-only, there is "
-                           "a writer on it");
-            } else {
-                error_setg(errp, "Cannot make block node read-only and create "
-                           "a writer on it");
-            }
+            error_setg(errp, "Cannot make block node read-only, there is "
+                       "a writer on it");
         }
 
         return -EPERM;
@@ -2293,7 +2279,6 @@ static int bdrv_list_refresh_perms(GSList *list, BlockReopenQueue *q,
                                    GSList **tran, Error **errp)
 {
     int ret;
-    uint64_t cumulative_perms, cumulative_shared_perms;
     BlockDriverState *bs;
 
     for ( ; list; list = list->next) {
@@ -2303,12 +2288,7 @@ static int bdrv_list_refresh_perms(GSList *list, BlockReopenQueue *q,
             return -EINVAL;
         }
 
-        bdrv_get_cumulative_perm(bs, &cumulative_perms,
-                                 &cumulative_shared_perms);
-
-        ret = bdrv_node_check_perm(bs, q, cumulative_perms,
-                                   cumulative_shared_perms,
-                                   tran, errp);
+        ret = bdrv_node_refresh_perm(bs, q, tran, errp);
         if (ret < 0) {
             return ret;
         }
-- 
2.21.3
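
Editorial sketch of what "fresh cumulative permissions" means for the refactored function: the permissions required by all parents are OR-ed together, the permissions they share are AND-ed, and the read-only check runs on the result. This is a simplified model under stated assumptions; ParentEdge, the PERM_* values and node_refresh_perm() are hypothetical and only mirror the shape of the change, not block.c itself.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical permission bits for this sketch only. */
#define PERM_READ  ((uint64_t)0x01)
#define PERM_WRITE ((uint64_t)0x02)
#define PERM_ALL   ((uint64_t)0x03)

/* One parent's view of the node: what it needs and what it tolerates. */
typedef struct ParentEdge {
    uint64_t perm;
    uint64_t shared_perm;
} ParentEdge;

/* Union of required permissions, intersection of shared permissions. */
static void get_cumulative_perm(const ParentEdge *parents, size_t n,
                                uint64_t *perm, uint64_t *shared_perm)
{
    uint64_t cumulative_perms = 0;
    uint64_t cumulative_shared_perms = PERM_ALL;
    size_t i;

    assert(perm && shared_perm);
    for (i = 0; i < n; i++) {
        cumulative_perms |= parents[i].perm;
        cumulative_shared_perms &= parents[i].shared_perm;
    }
    *perm = cumulative_perms;
    *shared_perm = cumulative_shared_perms;
}

/*
 * The caller no longer passes cumulative permissions in; they are computed
 * here from the current parents before the read-only check.
 */
static int node_refresh_perm(const ParentEdge *parents, size_t n,
                             bool read_only)
{
    uint64_t perm, shared_perm;

    get_cumulative_perm(parents, n, &perm, &shared_perm);
    if ((perm & PERM_WRITE) && read_only) {
        return -EPERM;
    }
    return 0;
}
```

Computing the masks inside the function removes the possibility of a caller passing stale values, which is why only one error message remains reachable in the patch.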




* Re: [PATCH v2 04/36] block: bdrv_append(): return status
  2020-11-27 14:44 ` [PATCH v2 04/36] block: bdrv_append(): return status Vladimir Sementsov-Ogievskiy
@ 2020-12-14 15:49   ` Alberto Garcia
  2021-01-18 14:32   ` Kevin Wolf
  1 sibling, 0 replies; 108+ messages in thread
From: Alberto Garcia @ 2020-12-14 15:49 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: kwolf, vsementsov, armbru, qemu-devel, mreitz, den, jsnow

On Fri 27 Nov 2020 03:44:50 PM CET, Vladimir Sementsov-Ogievskiy wrote:
> Return int status to avoid extra error propagation schemes.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

Reviewed-by: Alberto Garcia <berto@igalia.com>

Berto



* Re: [PATCH v2 08/36] block: make bdrv_reopen_{prepare, commit, abort} private
  2020-11-27 14:44 ` [PATCH v2 08/36] block: make bdrv_reopen_{prepare, commit, abort} private Vladimir Sementsov-Ogievskiy via
@ 2020-12-15 17:28   ` Alberto Garcia
  2021-01-18 15:24   ` [PATCH v2 08/36] block: make bdrv_reopen_{prepare,commit,abort} private Kevin Wolf
  1 sibling, 0 replies; 108+ messages in thread
From: Alberto Garcia @ 2020-12-15 17:28 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy via, qemu-block
  Cc: kwolf, vsementsov, armbru, qemu-devel, mreitz, den, jsnow

On Fri 27 Nov 2020 03:44:54 PM CET, Vladimir Sementsov-Ogievskiy via wrote:
> These functions are called only from bdrv_reopen_multiple() in block.c.
> No reason to publish them.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

Reviewed-by: Alberto Garcia <berto@igalia.com>

Berto



* Re: [PATCH v2 09/36] block: return value from bdrv_replace_node()
  2020-11-27 14:44 ` [PATCH v2 09/36] block: return value from bdrv_replace_node() Vladimir Sementsov-Ogievskiy
@ 2020-12-15 17:30   ` Alberto Garcia
  2021-01-18 15:40   ` Kevin Wolf
  1 sibling, 0 replies; 108+ messages in thread
From: Alberto Garcia @ 2020-12-15 17:30 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: kwolf, vsementsov, armbru, qemu-devel, mreitz, den, jsnow

On Fri 27 Nov 2020 03:44:55 PM CET, Vladimir Sementsov-Ogievskiy wrote:
> Functions with an errp argument are not recommended to be void functions.
> Improve bdrv_replace_node().
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

Reviewed-by: Alberto Garcia <berto@igalia.com>

Berto



* Re: [PATCH v2 14/36] block: inline bdrv_child_*() permission functions calls
  2020-11-27 14:45 ` [PATCH v2 14/36] block: inline bdrv_child_*() permission functions calls Vladimir Sementsov-Ogievskiy
@ 2020-12-16 17:16   ` Alberto Garcia
  0 siblings, 0 replies; 108+ messages in thread
From: Alberto Garcia @ 2020-12-16 17:16 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: kwolf, vsementsov, armbru, qemu-devel, mreitz, den, jsnow

On Fri 27 Nov 2020 03:45:00 PM CET, Vladimir Sementsov-Ogievskiy wrote:
> Each of them has only one caller. Open-coding simplifies further
> permission-update system changes.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

Reviewed-by: Alberto Garcia <berto@igalia.com>

Berto



* Re: [PATCH v2 00/36] block: update graph permissions update
  2020-11-27 14:44 [PATCH v2 00/36] block: update graph permissions update Vladimir Sementsov-Ogievskiy
                   ` (35 preceding siblings ...)
  2020-11-27 14:45 ` [PATCH v2 36/36] block: refactor bdrv_node_check_perm() Vladimir Sementsov-Ogievskiy
@ 2021-01-09 10:12 ` Vladimir Sementsov-Ogievskiy
  36 siblings, 0 replies; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-01-09 10:12 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, armbru, qemu-devel, mreitz, den, jsnow

ping

27.11.2020 17:44, Vladimir Sementsov-Ogievskiy wrote:
> Hi all!
> 
> Here is a proposal of updating graph changing procedures.
> 
> The thing brought me here is a question about "activating" filters after
> insertion, which is done in mirror_top and backup_top. The problem is
> that we can't simply avoid permission conflict when inserting the
> filter: during insertion old permissions of relations to be removed
> conflicting with new permissions of new created relations. And current
> solution is supporting additional "inactive" mode for the filter when it
> doesn't require any permissions.
> 
> I suggest to change the order of operations: let's first do all graph
> relations modifications and then refresh permissions. Of course we'll
> need a way to restore old graph if refresh fails.
> 
> Another problem with permission update is that we update permissions in
> order of DFS which is not always correct. Better is update node when all
> its parents already updated and require correct permissions. This needs
> a topological sort of nodes prior to permission update, see more in
> patches later.
> 
> Patches plan:
> 
> 01,02 - add failing tests to illustrate conceptual problems of current
> permission update system.
> [Here is side suggestion: we usually add tests after fix, so careful
>   reviewer has to change order of patches to check that test fails before
>   fix. I add tests in a way that they may be simply executed but not yet take
>   part in make check. It seems more native: first show the problem, then
>   fix it. And when fixed, make tests available for make check]
> 
> 03-09 - some preparations, refactorings which may go in separate
> 
> 10 - new transaction API
> 
> 15 - topological sort implemented for permission update, one of new tests
> now passes
> 
> 19 - improve bdrv_replace_node. second new test now passes
> 
> 26 - drop .active field and activation procedure for backup-top!
> 
> 30 - update bdrv_reopen_multiple. At this point everything is using new
> paradigm of permission update
> 
> 31-36 - post refactoring, drop old interfaces, we are done.
> 
> Note, that this series does nothing with another graph-update problem
> discussed under "[PATCH RFC 0/5] Fix accidental crash in iotest 30".
> 
> The series based on block-next Max's branch and can be found here:
> 
> git: https://src.openvz.org/scm/~vsementsov/qemu.git
> tag: up-block-topologic-perm-v2
> 
> Vladimir Sementsov-Ogievskiy (36):
>    tests/test-bdrv-graph-mod: add test_parallel_exclusive_write
>    tests/test-bdrv-graph-mod: add test_parallel_perm_update
>    block: bdrv_append(): don't consume reference
>    block: bdrv_append(): return status
>    block: add bdrv_parent_try_set_aio_context
>    block: BdrvChildClass: add .get_parent_aio_context handler
>    block: drop ctx argument from bdrv_root_attach_child
>    block: make bdrv_reopen_{prepare,commit,abort} private
>    block: return value from bdrv_replace_node()
>    util: add transactions.c
>    block: bdrv_refresh_perms: check parents compliance
>    block: refactor bdrv_child* permission functions
>    block: rewrite bdrv_child_try_set_perm() using bdrv_refresh_perms()
>    block: inline bdrv_child_*() permission functions calls
>    block: use topological sort for permission update
>    block: add bdrv_drv_set_perm transaction action
>    block: add bdrv_list_* permission update functions
>    block: add bdrv_replace_child_safe() transaction action
>    block: fix bdrv_replace_node_common
>    block: add bdrv_attach_child_common() transaction action
>    block: add bdrv_attach_child_noperm() transaction action
>    block: split out bdrv_replace_node_noperm()
>    block: adapt bdrv_append() for inserting filters
>    block: add bdrv_remove_backing transaction action
>    block: introduce bdrv_drop_filter()
>    block/backup-top: drop .active
>    block: drop ignore_children for permission update functions
>    block: add bdrv_set_backing_noperm() transaction action
>    blockdev: qmp_x_blockdev_reopen: acquire all contexts
>    block: bdrv_reopen_multiple: refresh permissions on updated graph
>    block: drop unused permission update functions
>    block: inline bdrv_check_perm_common()
>    block: inline bdrv_replace_child()
>    block: refactor bdrv_child_set_perm_safe() transaction action
>    block: rename bdrv_replace_child_safe() to bdrv_replace_child()
>    block: refactor bdrv_node_check_perm()
> 
>   include/block/block.h       |   20 +-
>   include/block/block_int.h   |    8 +-
>   include/qemu/transactions.h |   46 ++
>   block.c                     | 1319 ++++++++++++++++++++---------------
>   block/backup-top.c          |   39 +-
>   block/block-backend.c       |   13 +-
>   block/commit.c              |    7 +-
>   block/file-posix.c          |   84 +--
>   block/mirror.c              |    9 +-
>   blockdev.c                  |   33 +-
>   blockjob.c                  |   11 +-
>   tests/test-bdrv-drain.c     |    2 +-
>   tests/test-bdrv-graph-mod.c |  122 +++-
>   util/transactions.c         |   81 +++
>   tests/qemu-iotests/283.out  |    2 +-
>   util/meson.build            |    1 +
>   16 files changed, 1100 insertions(+), 697 deletions(-)
>   create mode 100644 include/qemu/transactions.h
>   create mode 100644 util/transactions.c
> 


-- 
Best regards,
Vladimir



* Re: [PATCH v2 02/36] tests/test-bdrv-graph-mod: add test_parallel_perm_update
  2020-11-27 14:44 ` [PATCH v2 02/36] tests/test-bdrv-graph-mod: add test_parallel_perm_update Vladimir Sementsov-Ogievskiy
@ 2021-01-18 14:05   ` Kevin Wolf
  2021-01-18 17:13     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 108+ messages in thread
From: Kevin Wolf @ 2021-01-18 14:05 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: qemu-block, armbru, qemu-devel, mreitz, den, jsnow

Am 27.11.2020 um 15:44 hat Vladimir Sementsov-Ogievskiy geschrieben:
> Add test to show that simple DFS recursion order is not correct for
> permission update. Correct order is topological-sort order, which will
> be introduced later.
> 
> Consider the block driver which has two filter children: one active
> with exclusive write access and one inactive with no specific
> permissions.
> 
> And, these two children have a common base child, like this:
> 
> ┌─────┐     ┌──────┐
> │ fl2 │ ◀── │ top  │
> └─────┘     └──────┘
>   │           │
>   │           │ w
>   │           ▼
>   │         ┌──────┐
>   │         │ fl1  │
>   │         └──────┘
>   │           │
>   │           │ w
>   │           ▼
>   │         ┌──────┐
>   └───────▶ │ base │
>             └──────┘
> 
> So, exclusive write is propagated.
> 
> Assume, we want to make fl2 active instead of fl1.
> So, we set some option for top driver and do permission update.
> 
> If permission update (remember, it's DFS) goes first through
> top->fl1->base branch it will succeed: it first drops exclusive write
> permissions and then applies them to the other BdrvChild.
> But if permission update goes first through top->fl2->base branch it
> will fail, as when we try to update fl2->base child, old not yet
> updated fl1->base child will be in conflict.
> 
> Now test fails, so it runs only with -d flag. To run do
> 
>   ./test-bdrv-graph-mod -d -p /bdrv-graph-mod/parallel-perm-update
> 
> from <build-directory>/tests.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>  tests/test-bdrv-graph-mod.c | 64 +++++++++++++++++++++++++++++++++++++
>  1 file changed, 64 insertions(+)
> 
> diff --git a/tests/test-bdrv-graph-mod.c b/tests/test-bdrv-graph-mod.c
> index 3b9e6f242f..27e3361a60 100644
> --- a/tests/test-bdrv-graph-mod.c
> +++ b/tests/test-bdrv-graph-mod.c
> @@ -232,6 +232,68 @@ static void test_parallel_exclusive_write(void)
>      bdrv_unref(top);
>  }
>  
> +static void write_to_file_perms(BlockDriverState *bs, BdrvChild *c,
> +                                     BdrvChildRole role,
> +                                     BlockReopenQueue *reopen_queue,
> +                                     uint64_t perm, uint64_t shared,
> +                                     uint64_t *nperm, uint64_t *nshared)
> +{
> +    if (bs->file && c == bs->file) {
> +        *nperm = BLK_PERM_WRITE;
> +        *nshared = BLK_PERM_ALL & ~BLK_PERM_WRITE;
> +    } else {
> +        *nperm = 0;
> +        *nshared = BLK_PERM_ALL;
> +    }
> +}
> +
> +static BlockDriver bdrv_write_to_file = {
> +    .format_name = "tricky-perm",
> +    .bdrv_child_perm = write_to_file_perms,
> +};
> +
> +static void test_parallel_perm_update(void)
> +{
> +    BlockDriverState *top = no_perm_node("top");
> +    BlockDriverState *tricky =
> +            bdrv_new_open_driver(&bdrv_write_to_file, "tricky", BDRV_O_RDWR,
> +                                 &error_abort);
> +    BlockDriverState *base = no_perm_node("base");
> +    BlockDriverState *fl1 = pass_through_node("fl1");
> +    BlockDriverState *fl2 = pass_through_node("fl2");
> +    BdrvChild *c_fl1, *c_fl2;
> +
> +    bdrv_attach_child(top, tricky, "file", &child_of_bds, BDRV_CHILD_DATA,
> +                      &error_abort);
> +    c_fl1 = bdrv_attach_child(tricky, fl1, "first", &child_of_bds,
> +                              BDRV_CHILD_FILTERED, &error_abort);
> +    c_fl2 = bdrv_attach_child(tricky, fl2, "second", &child_of_bds,
> +                              BDRV_CHILD_FILTERED, &error_abort);
> +    bdrv_attach_child(fl1, base, "backing", &child_of_bds, BDRV_CHILD_FILTERED,
> +                      &error_abort);
> +    bdrv_attach_child(fl2, base, "backing", &child_of_bds, BDRV_CHILD_FILTERED,
> +                      &error_abort);
> +    bdrv_ref(base);
> +
> +    /* Select fl1 as first child to be active */
> +    tricky->file = c_fl1;
> +    bdrv_child_refresh_perms(top, top->children.lh_first, &error_abort);
> +
> +    assert(c_fl1->perm & BLK_PERM_WRITE);
> +
> +    /* Now, try to switch active child and update permissions */
> +    tricky->file = c_fl2;
> +    bdrv_child_refresh_perms(top, top->children.lh_first, &error_abort);
> +
> +    assert(c_fl2->perm & BLK_PERM_WRITE);
> +
> +    /* Switch once more, to not care about real child order in the list */
> +    tricky->file = c_fl1;
> +    bdrv_child_refresh_perms(top, top->children.lh_first, &error_abort);
> +
> +    assert(c_fl1->perm & BLK_PERM_WRITE);

Should we also assert in each case that we don't hold the write
permission for the inactive child?

Kevin
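
Editorial sketch of the ordering problem from the quoted commit message, together with the extra check asked for above. It models the four-node diagram (top, fl1, fl2, base) with a single write bit; the top node stands in for the driver that picks the active filter. Everything here (Edge, refresh_node(), the two visiting orders) is a hypothetical toy model, not the QEMU permission code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model: a single permission bit, not QEMU's BLK_PERM_* set. */
#define PERM_WRITE ((uint32_t)1)
#define PERM_ALL   ((uint32_t)1)

enum { TOP, FL1, FL2, BASE };

typedef struct Edge {
    int parent, child;
    uint32_t perm, shared;
} Edge;

/* Edge indexes: 0 top->fl1, 1 top->fl2, 2 fl1->base, 3 fl2->base */
static Edge edges[] = {
    { TOP, FL1, 0, PERM_ALL },
    { TOP, FL2, 0, PERM_ALL },
    { FL1, BASE, 0, PERM_ALL },
    { FL2, BASE, 0, PERM_ALL },
};
#define N_EDGES (sizeof(edges) / sizeof(edges[0]))

static int active_filter = FL1;

/* DFS that happens to walk top->fl2->base before top->fl1->base. */
static const int order_dfs_fl2_first[] = { TOP, FL2, BASE, FL1, BASE };
/* Every node comes after all of its parents. */
static const int order_topological[] = { TOP, FL2, FL1, BASE };

/* Check the node's parent edges for conflicts, then update its child edges. */
static bool refresh_node(int node)
{
    uint32_t perm = 0, shared = PERM_ALL;
    size_t i, j;

    for (i = 0; i < N_EDGES; i++) {
        if (edges[i].child != node) {
            continue;
        }
        for (j = 0; j < N_EDGES; j++) {
            if (j != i && edges[j].child == node &&
                (edges[i].perm & ~edges[j].shared)) {
                return false; /* a parent needs what another does not share */
            }
        }
        perm |= edges[i].perm;
        shared &= edges[i].shared;
    }
    for (i = 0; i < N_EDGES; i++) {
        if (edges[i].parent != node) {
            continue;
        }
        if (node == TOP) {
            /* Exclusive write on the active filter, nothing on the other. */
            bool act = edges[i].child == active_filter;
            edges[i].perm = act ? PERM_WRITE : 0;
            edges[i].shared = act ? (PERM_ALL & ~PERM_WRITE) : PERM_ALL;
        } else {
            /* Filters pass their cumulative permissions through. */
            edges[i].perm = perm;
            edges[i].shared = shared;
        }
    }
    return true;
}

static bool refresh_in_order(const int *order, size_t n)
{
    for (size_t k = 0; k < n; k++) {
        if (!refresh_node(order[k])) {
            return false;
        }
    }
    return true;
}

/* Put the graph into the initial state of the diagram: fl1 is active. */
static void reset_graph(void)
{
    bool ok;

    for (size_t i = 0; i < N_EDGES; i++) {
        edges[i].perm = 0;
        edges[i].shared = PERM_ALL;
    }
    active_filter = FL1;
    ok = refresh_in_order(order_topological, 4);
    assert(ok);
    (void)ok;
}
```

In this model, switching active_filter to FL2 and refreshing with order_dfs_fl2_first fails at base, because fl1->base still holds the old exclusive write. The same switch with order_topological succeeds, and afterwards fl1->base no longer carries PERM_WRITE, which is the condition the suggested additional assertion would check.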




* Re: [PATCH v2 03/36] block: bdrv_append(): don't consume reference
  2020-11-27 14:44 ` [PATCH v2 03/36] block: bdrv_append(): don't consume reference Vladimir Sementsov-Ogievskiy
@ 2021-01-18 14:18   ` Kevin Wolf
  2021-01-18 17:21     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 108+ messages in thread
From: Kevin Wolf @ 2021-01-18 14:18 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: qemu-block, armbru, qemu-devel, mreitz, den, jsnow

Am 27.11.2020 um 15:44 hat Vladimir Sementsov-Ogievskiy geschrieben:
> We have too many comments for this feature. It seems better just not to
> do it. Most real users (tests don't count) have to create an additional
> reference.
> 
> Drop also comment in external_snapshot_prepare:
>  - bdrv_append doesn't "remove" old bs in common sense, it sounds
>    strange
>  - the fact that bdrv_append can fail is obvious from the context
>  - the fact that we must rollback all changes in transaction abort is
>    known (it's the direct role of abort)
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>  block.c                     | 19 +++----------------
>  block/backup-top.c          |  1 -
>  block/commit.c              |  1 +
>  block/mirror.c              |  3 ---
>  blockdev.c                  |  4 ----
>  tests/test-bdrv-drain.c     |  2 +-
>  tests/test-bdrv-graph-mod.c |  2 ++
>  7 files changed, 7 insertions(+), 25 deletions(-)
> 
> diff --git a/block.c b/block.c
> index 0dd28f0902..55efef3c9d 100644
> --- a/block.c
> +++ b/block.c
> @@ -3145,11 +3145,6 @@ static BlockDriverState *bdrv_append_temp_snapshot(BlockDriverState *bs,
>          goto out;
>      }
>  
> -    /* bdrv_append() consumes a strong reference to bs_snapshot
> -     * (i.e. it will call bdrv_unref() on it) even on error, so in
> -     * order to be able to return one, we have to increase
> -     * bs_snapshot's refcount here */
> -    bdrv_ref(bs_snapshot);
>      bdrv_append(bs_snapshot, bs, &local_err);
>      if (local_err) {
>          error_propagate(errp, local_err);
> @@ -4608,10 +4603,8 @@ void bdrv_replace_node(BlockDriverState *from, BlockDriverState *to,
>   *
>   * This function does not create any image files.
>   *
> - * bdrv_append() takes ownership of a bs_new reference and unrefs it because
> - * that's what the callers commonly need. bs_new will be referenced by the old
> - * parents of bs_top after bdrv_append() returns. If the caller needs to keep a
> - * reference of its own, it must call bdrv_ref().
> + * Recent update: bdrv_append does NOT eat bs_new reference for now. Drop this
> + * comment several moths later.

A comment like this is unusual. Do you think there is a high risk of
somebody introducing a new bdrv_append() in parallel and that they would
read this comment when rebasing their existing patches?

If we do keep the comment: s/for now/now/ (it has recently changed,
we're not intending to change it later) and s/moths/months/.

>   */
>  void bdrv_append(BlockDriverState *bs_new, BlockDriverState *bs_top,
>                   Error **errp)
> @@ -4621,20 +4614,14 @@ void bdrv_append(BlockDriverState *bs_new, BlockDriverState *bs_top,
>      bdrv_set_backing_hd(bs_new, bs_top, &local_err);
>      if (local_err) {
>          error_propagate(errp, local_err);
> -        goto out;
> +        return;
>      }
>  
>      bdrv_replace_node(bs_top, bs_new, &local_err);
>      if (local_err) {
>          error_propagate(errp, local_err);
>          bdrv_set_backing_hd(bs_new, NULL, &error_abort);
> -        goto out;

Can we leave a return here just in case that new code will be added at
the end of the function?

>      }
> -
> -    /* bs_new is now referenced by its new parents, we don't need the
> -     * additional reference any more. */
> -out:
> -    bdrv_unref(bs_new);
>  }
>  
>  static void bdrv_delete(BlockDriverState *bs)
> diff --git a/block/backup-top.c b/block/backup-top.c
> index fe6883cc97..650ed6195c 100644
> --- a/block/backup-top.c
> +++ b/block/backup-top.c
> @@ -222,7 +222,6 @@ BlockDriverState *bdrv_backup_top_append(BlockDriverState *source,
>  
>      bdrv_drained_begin(source);
>  
> -    bdrv_ref(top);
>      bdrv_append(top, source, &local_err);
>      if (local_err) {
>          error_prepend(&local_err, "Cannot append backup-top filter: ");
> diff --git a/block/commit.c b/block/commit.c
> index 71db7ba747..61924bcf66 100644
> --- a/block/commit.c
> +++ b/block/commit.c
> @@ -313,6 +313,7 @@ void commit_start(const char *job_id, BlockDriverState *bs,
>      commit_top_bs->total_sectors = top->total_sectors;
>  
>      bdrv_append(commit_top_bs, top, &local_err);
> +    bdrv_unref(commit_top_bs); /* referenced by new parents or failed */
>      if (local_err) {
>          commit_top_bs = NULL;
>          error_propagate(errp, local_err);
> diff --git a/block/mirror.c b/block/mirror.c
> index 8e1ad6eceb..13f7ecc998 100644
> --- a/block/mirror.c
> +++ b/block/mirror.c
> @@ -1605,9 +1605,6 @@ static BlockJob *mirror_start_job(
>      bs_opaque = g_new0(MirrorBDSOpaque, 1);
>      mirror_top_bs->opaque = bs_opaque;
>  
> -    /* bdrv_append takes ownership of the mirror_top_bs reference, need to keep
> -     * it alive until block_job_create() succeeds even if bs has no parent. */
> -    bdrv_ref(mirror_top_bs);
>      bdrv_drained_begin(bs);
>      bdrv_append(mirror_top_bs, bs, &local_err);
>      bdrv_drained_end(bs);
> diff --git a/blockdev.c b/blockdev.c
> index b5f11c524b..96c96f8ba6 100644
> --- a/blockdev.c
> +++ b/blockdev.c
> @@ -1587,10 +1587,6 @@ static void external_snapshot_prepare(BlkActionState *common,
>          goto out;
>      }
>  
> -    /* This removes our old bs and adds the new bs. This is an operation that
> -     * can fail, so we need to do it in .prepare; undoing it for abort is
> -     * always possible. */

This comment is still relevant, it's unrelated to the bdrv_ref().

> -    bdrv_ref(state->new_bs);
>      bdrv_append(state->new_bs, state->old_bs, &local_err);
>      if (local_err) {
>          error_propagate(errp, local_err);

Kevin
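
Editorial sketch of the ownership rule the patch introduces: the append operation never drops the caller's reference, so the caller releases it explicitly, on success and on failure alike (as the commit.c hunk does). The model also keeps an explicit return on the error path, as requested in the review. Node, node_append() and the parents_accept flag are hypothetical and only illustrate the reference counting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical refcounted node standing in for BlockDriverState. */
typedef struct Node {
    int refcnt;
} Node;

static int nodes_freed;

static Node *node_new(void)
{
    Node *n = calloc(1, sizeof(*n));

    assert(n);
    n->refcnt = 1; /* the creator owns the first reference */
    return n;
}

static void node_ref(Node *n)
{
    n->refcnt++;
}

static void node_unref(Node *n)
{
    assert(n->refcnt > 0);
    if (--n->refcnt == 0) {
        free(n);
        nodes_freed++;
    }
}

/*
 * Model of an append that does not consume the caller's reference.
 * On success the new parents take a reference of their own; on failure
 * the refcount is left untouched.
 */
static int node_append(Node *bs_new, bool parents_accept)
{
    if (!parents_accept) {
        return -1; /* explicit return, safe if code is added below later */
    }
    node_ref(bs_new);
    return 0;
}
```

A caller that does not need the node afterwards calls node_unref() right after node_append(), regardless of the result: on success the node stays alive through its parents, on failure it is freed.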




* Re: [PATCH v2 04/36] block: bdrv_append(): return status
  2020-11-27 14:44 ` [PATCH v2 04/36] block: bdrv_append(): return status Vladimir Sementsov-Ogievskiy
  2020-12-14 15:49   ` Alberto Garcia
@ 2021-01-18 14:32   ` Kevin Wolf
  1 sibling, 0 replies; 108+ messages in thread
From: Kevin Wolf @ 2021-01-18 14:32 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: qemu-block, armbru, qemu-devel, mreitz, den, jsnow

Am 27.11.2020 um 15:44 hat Vladimir Sementsov-Ogievskiy geschrieben:
> Return int status to avoid extra error propagation schemes.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

Reviewed-by: Kevin Wolf <kwolf@redhat.com>




* Re: [PATCH v2 05/36] block: add bdrv_parent_try_set_aio_context
  2020-11-27 14:44 ` [PATCH v2 05/36] block: add bdrv_parent_try_set_aio_context Vladimir Sementsov-Ogievskiy
@ 2021-01-18 15:08   ` Kevin Wolf
  2021-01-18 17:26     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 108+ messages in thread
From: Kevin Wolf @ 2021-01-18 15:08 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: qemu-block, armbru, qemu-devel, mreitz, den, jsnow

Am 27.11.2020 um 15:44 hat Vladimir Sementsov-Ogievskiy geschrieben:
> We already have bdrv_parent_can_set_aio_context(). Add corresponding
> bdrv_parent_set_aio_context_ignore() and
> bdrv_parent_try_set_aio_context() and use them instead of open-coding.
> 
> Make bdrv_parent_try_set_aio_context() public, as it will be used in
> further commit.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>  include/block/block.h |  2 ++
>  block.c               | 51 +++++++++++++++++++++++++++++++++----------
>  2 files changed, 41 insertions(+), 12 deletions(-)
> 
> diff --git a/include/block/block.h b/include/block/block.h
> index ee3f5a6cca..550c5a7513 100644
> --- a/include/block/block.h
> +++ b/include/block/block.h
> @@ -686,6 +686,8 @@ bool bdrv_child_can_set_aio_context(BdrvChild *c, AioContext *ctx,
>                                      GSList **ignore, Error **errp);
>  bool bdrv_can_set_aio_context(BlockDriverState *bs, AioContext *ctx,
>                                GSList **ignore, Error **errp);
> +int bdrv_parent_try_set_aio_context(BdrvChild *c, AioContext *ctx,
> +                                    Error **errp);
>  int bdrv_probe_blocksizes(BlockDriverState *bs, BlockSizes *bsz);
>  int bdrv_probe_geometry(BlockDriverState *bs, HDGeometry *geo);
>  
> diff --git a/block.c b/block.c
> index 916087ee1a..5d925c208d 100644
> --- a/block.c
> +++ b/block.c
> @@ -81,6 +81,9 @@ static BlockDriverState *bdrv_open_inherit(const char *filename,
>                                             BdrvChildRole child_role,
>                                             Error **errp);
>  
> +static void bdrv_parent_set_aio_context_ignore(BdrvChild *c, AioContext *ctx,
> +                                               GSList **ignore);
> +
>  /* If non-zero, use only whitelisted block drivers */
>  static int use_bdrv_whitelist;
>  
> @@ -2655,17 +2658,12 @@ BdrvChild *bdrv_root_attach_child(BlockDriverState *child_bs,
>       * try moving the parent into the AioContext of child_bs instead. */
>      if (bdrv_get_aio_context(child_bs) != ctx) {
>          ret = bdrv_try_set_aio_context(child_bs, ctx, &local_err);
> -        if (ret < 0 && child_class->can_set_aio_ctx) {
> -            GSList *ignore = g_slist_prepend(NULL, child);
> -            ctx = bdrv_get_aio_context(child_bs);

You are losing this line...

> -            if (child_class->can_set_aio_ctx(child, ctx, &ignore, NULL)) {
> -                error_free(local_err);
> +        if (ret < 0) {
> +            if (bdrv_parent_try_set_aio_context(child, ctx, NULL) == 0) {

...before this one, so I think the wrong context is passed now. Instead
of trying to move the parent to the AioContext of the child, we'll try
to move it to the AioContext in which it already is (and which doesn't
match the AioContext of the child).

Kevin

>                  ret = 0;
> -                g_slist_free(ignore);
> -                ignore = g_slist_prepend(NULL, child);
> -                child_class->set_aio_ctx(child, ctx, &ignore);
> +                error_free(local_err);
> +                local_err = NULL;
>              }
> -            g_slist_free(ignore);
>          }
>          if (ret < 0) {
>              error_propagate(errp, local_err);
> @@ -6452,9 +6450,7 @@ void bdrv_set_aio_context_ignore(BlockDriverState *bs,
>          if (g_slist_find(*ignore, child)) {
>              continue;
>          }
> -        assert(child->klass->set_aio_ctx);
> -        *ignore = g_slist_prepend(*ignore, child);
> -        child->klass->set_aio_ctx(child, new_context, ignore);
> +        bdrv_parent_set_aio_context_ignore(child, new_context, ignore);
>      }
>  
>      bdrv_detach_aio_context(bs);
> @@ -6511,6 +6507,37 @@ static bool bdrv_parent_can_set_aio_context(BdrvChild *c, AioContext *ctx,
>      return true;
>  }
>  
> +static void bdrv_parent_set_aio_context_ignore(BdrvChild *c, AioContext *ctx,
> +                                               GSList **ignore)
> +{
> +    if (g_slist_find(*ignore, c)) {
> +        return;
> +    }
> +    *ignore = g_slist_prepend(*ignore, c);
> +
> +    assert(c->klass->set_aio_ctx);
> +    c->klass->set_aio_ctx(c, ctx, ignore);
> +}
> +
> +int bdrv_parent_try_set_aio_context(BdrvChild *c, AioContext *ctx,
> +                                    Error **errp)
> +{
> +    GSList *ignore = NULL;
> +
> +    if (!bdrv_parent_can_set_aio_context(c, ctx, &ignore, errp)) {
> +        g_slist_free(ignore);
> +        return -EPERM;
> +    }
> +
> +    g_slist_free(ignore);
> +    ignore = NULL;
> +
> +    bdrv_parent_set_aio_context_ignore(c, ctx, &ignore);
> +    g_slist_free(ignore);
> +
> +    return 0;
> +}
> +
>  bool bdrv_child_can_set_aio_context(BdrvChild *c, AioContext *ctx,
>                                      GSList **ignore, Error **errp)
>  {
> -- 
> 2.21.3
> 




* Re: [PATCH v2 06/36] block: BdrvChildClass: add .get_parent_aio_context handler
  2020-11-27 14:44 ` [PATCH v2 06/36] block: BdrvChildClass: add .get_parent_aio_context handler Vladimir Sementsov-Ogievskiy
@ 2021-01-18 15:13   ` Kevin Wolf
  2021-01-18 17:36     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 108+ messages in thread
From: Kevin Wolf @ 2021-01-18 15:13 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: qemu-block, armbru, qemu-devel, mreitz, den, jsnow

Am 27.11.2020 um 15:44 hat Vladimir Sementsov-Ogievskiy geschrieben:
> Add new handler to get aio context and implement it in all child
> classes. Add corresponding public interface to be used soon.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

Hm, are you going to introduce cases where parent and child context
don't match, or why is this a useful function?

Even if so, I feel it shouldn't be any of the child's business what
AioContext the parent uses.

Well, maybe the rest of the series will answer this.

Kevin




* Re: [PATCH v2 08/36] block: make bdrv_reopen_{prepare,commit,abort} private
  2020-11-27 14:44 ` [PATCH v2 08/36] block: make bdrv_reopen_{prepare, commit, abort} private Vladimir Sementsov-Ogievskiy via
  2020-12-15 17:28   ` Alberto Garcia
@ 2021-01-18 15:24   ` Kevin Wolf
  1 sibling, 0 replies; 108+ messages in thread
From: Kevin Wolf @ 2021-01-18 15:24 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: qemu-block, armbru, qemu-devel, mreitz, den, jsnow

Am 27.11.2020 um 15:44 hat Vladimir Sementsov-Ogievskiy geschrieben:
> These functions are called only from bdrv_reopen_multiple() in block.c.
> No reason to publish them.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

Reviewed-by: Kevin Wolf <kwolf@redhat.com>




* Re: [PATCH v2 09/36] block: return value from bdrv_replace_node()
  2020-11-27 14:44 ` [PATCH v2 09/36] block: return value from bdrv_replace_node() Vladimir Sementsov-Ogievskiy
  2020-12-15 17:30   ` Alberto Garcia
@ 2021-01-18 15:40   ` Kevin Wolf
  1 sibling, 0 replies; 108+ messages in thread
From: Kevin Wolf @ 2021-01-18 15:40 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: qemu-block, armbru, qemu-devel, mreitz, den, jsnow

Am 27.11.2020 um 15:44 hat Vladimir Sementsov-Ogievskiy geschrieben:
> Functions with an errp argument are not recommended to be void functions.
> Improve bdrv_replace_node().
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>  include/block/block.h |  4 ++--
>  block.c               | 14 ++++++++------
>  2 files changed, 10 insertions(+), 8 deletions(-)
> 
> diff --git a/include/block/block.h b/include/block/block.h
> index 5d59984ad4..8f6100dad7 100644
> --- a/include/block/block.h
> +++ b/include/block/block.h
> @@ -346,8 +346,8 @@ int bdrv_create_file(const char *filename, QemuOpts *opts, Error **errp);
>  BlockDriverState *bdrv_new(void);
>  int bdrv_append(BlockDriverState *bs_new, BlockDriverState *bs_top,
>                  Error **errp);
> -void bdrv_replace_node(BlockDriverState *from, BlockDriverState *to,
> -                       Error **errp);
> +int bdrv_replace_node(BlockDriverState *from, BlockDriverState *to,
> +                      Error **errp);
>  
>  int bdrv_parse_aio(const char *mode, int *flags);
>  int bdrv_parse_cache_mode(const char *mode, int *flags, bool *writethrough);
> diff --git a/block.c b/block.c
> index 3765c7caed..29082c6d47 100644
> --- a/block.c
> +++ b/block.c
> @@ -4537,14 +4537,14 @@ static bool should_update_child(BdrvChild *c, BlockDriverState *to)
>   * With auto_skip=false the error is returned if from has a parent which should
>   * not be updated.
>   */
> -static void bdrv_replace_node_common(BlockDriverState *from,
> -                                     BlockDriverState *to,
> -                                     bool auto_skip, Error **errp)
> +static int bdrv_replace_node_common(BlockDriverState *from,
> +                                    BlockDriverState *to,
> +                                    bool auto_skip, Error **errp)
>  {
> +    int ret = -EPERM;
>      BdrvChild *c, *next;
>      GSList *list = NULL, *p;
>      uint64_t perm = 0, shared = BLK_PERM_ALL;
> -    int ret;

I think I'd prefer setting ret in each error path. This makes it more
obvious that ret has the right value and hasn't been modified between
the initialisation and the error.

>  
>      /* Make sure that @from doesn't go away until we have successfully attached
>       * all of its parents to @to. */
> @@ -4600,10 +4600,12 @@ out:

Let's add an explicit ret = 0 right before the out: label.
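
Both suggestions together would look roughly like the sketch below. replace_node_sketch() and its two flags are hypothetical; they only simulate the failure conditions, and the error codes are picked for illustration, not taken from block.c.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for bdrv_replace_node_common(). The flags only
 * simulate the two failure conditions.
 */
static int replace_node_sketch(bool parent_blocks_update,
                               bool perm_check_fails)
{
    int ret;    /* deliberately not initialised at declaration */

    if (parent_blocks_update) {
        ret = -EPERM;       /* set right where the error is detected */
        goto out;
    }

    if (perm_check_fails) {
        ret = -EACCES;      /* another path may use another code */
        goto out;
    }

    /* ... the actual graph change would happen here ... */

    ret = 0;                /* explicit success value before the label */
out:
    /* common cleanup: g_slist_free(), bdrv_drained_end(), bdrv_unref() */
    return ret;
}
```

A side benefit of leaving ret uninitialised at the declaration is that the compiler can warn about a newly added error path that forgets to set it.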

>      g_slist_free(list);
>      bdrv_drained_end(from);
>      bdrv_unref(from);
> +
> +    return ret;
>  }

With these changes:

Reviewed-by: Kevin Wolf <kwolf@redhat.com>




* Re: [PATCH v2 10/36] util: add transactions.c
  2020-11-27 14:44 ` [PATCH v2 10/36] util: add transactions.c Vladimir Sementsov-Ogievskiy
@ 2021-01-18 16:50   ` Kevin Wolf
  2021-01-18 17:41     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 108+ messages in thread
From: Kevin Wolf @ 2021-01-18 16:50 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: qemu-block, armbru, qemu-devel, mreitz, den, jsnow

Am 27.11.2020 um 15:44 hat Vladimir Sementsov-Ogievskiy geschrieben:
> Add simple transaction API to use in further update of block graph
> operations.
> 
> Supposed usage is:
> 
> - "prepare" is the main function of the action. It should make the main
>   effect of the action visible to the following actions while keeping
>   roll-back possible, saving whatever is needed in the action state,
>   which is prepended to the list. The driver struct therefore has no
>   "prepare" field, as "prepare" is supposed to be called directly.

So the convention is that tran_prepend() should be called by the
function that does the preparation? Or would we call tran_prepend() and
do the actual action in different places?

> - commit/rollback is supposed to be called for the list of action
>   states, to commit/rollback all the actions in reverse order
> 
> - When possible, "commit" should not have an effect visible to other
>   actions, which makes transparent logical interaction between actions
>   possible.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>  include/qemu/transactions.h | 46 +++++++++++++++++++++
>  util/transactions.c         | 81 +++++++++++++++++++++++++++++++++++++
>  util/meson.build            |  1 +
>  3 files changed, 128 insertions(+)
>  create mode 100644 include/qemu/transactions.h
>  create mode 100644 util/transactions.c
> 
> diff --git a/include/qemu/transactions.h b/include/qemu/transactions.h
> new file mode 100644
> index 0000000000..a5b15f46ab
> --- /dev/null
> +++ b/include/qemu/transactions.h
> @@ -0,0 +1,46 @@
> +/*
> + * Simple transactions API
> + *
> + * Copyright (c) 2020 Virtuozzo International GmbH.
> + *
> + * Author:
> + *  Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program. If not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#ifndef QEMU_TRANSACTIONS_H
> +#define QEMU_TRANSACTIONS_H
> +
> +#include <gmodule.h>
> +
> +typedef struct TransactionActionDrv {
> +    void (*abort)(void *opeque);
> +    void (*commit)(void *opeque);
> +    void (*clean)(void *opeque);
> +} TransactionActionDrv;

s/opeque/opaque/

> +void tran_prepend(GSList **list, TransactionActionDrv *drv, void *opaque);
> +void tran_abort(GSList *backup);
> +void tran_commit(GSList *backup);

I'd add an empty line before a full function definition.

> +static inline void tran_finalize(GSList *backup, int ret)
> +{
> +    if (ret < 0) {
> +        tran_abort(backup);
> +    } else {
> +        tran_commit(backup);
> +    }
> +}

Let's use an opaque struct instead of GSList here and...

> +#endif /* QEMU_TRANSACTIONS_H */
> diff --git a/util/transactions.c b/util/transactions.c
> new file mode 100644
> index 0000000000..ef1b9a36a4
> --- /dev/null
> +++ b/util/transactions.c
> @@ -0,0 +1,81 @@
> +/*
> + * Simple transactions API
> + *
> + * Copyright (c) 2020 Virtuozzo International GmbH.
> + *
> + * Author:
> + *  Sementsov-Ogievskiy Vladimir <vsementsov@virtuozzo.com>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program. If not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#include "qemu/osdep.h"
> +
> +#include "qemu/transactions.h"
> +
> +typedef struct BdrvAction {
> +    TransactionActionDrv *drv;
> +    void *opaque;
> +} BdrvAction;

...add a QSLIST_ENTRY (or similar) here to make it a type-safe list.

The missing type safety of GSList means that we should avoid it whenever
it's easily possible (i.e. we know the number of lists in which an
element will be). Here, each BdrvAction will only be in a single list,
so typed lists should be simple enough.
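
Roughly what I have in mind, as a sketch only. It uses the BSD SLIST macros from <sys/queue.h> as a stand-in for QEMU's QSLIST, and Transaction, TransactionAction, tran_new(), tran_add() and the set_int example action are illustrative names, not a proposal for the final API. Running "clean" right after each abort/commit is also just an assumption of this sketch.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/queue.h>   /* BSD SLIST, standing in for QEMU's QSLIST */

typedef struct TransactionActionDrv {
    void (*abort)(void *opaque);
    void (*commit)(void *opaque);
    void (*clean)(void *opaque);
} TransactionActionDrv;

typedef struct TransactionAction {
    const TransactionActionDrv *drv;
    void *opaque;
    SLIST_ENTRY(TransactionAction) entry;   /* typed link instead of GSList */
} TransactionAction;

/* In a real header only a forward declaration would be public. */
typedef struct Transaction {
    SLIST_HEAD(, TransactionAction) actions;
} Transaction;

static Transaction *tran_new(void)
{
    Transaction *tran = calloc(1, sizeof(*tran));
    assert(tran);
    SLIST_INIT(&tran->actions);
    return tran;
}

/* Called by the "prepare" code once its effect is visible. */
static void tran_add(Transaction *tran, const TransactionActionDrv *drv,
                     void *opaque)
{
    TransactionAction *act = calloc(1, sizeof(*act));
    assert(act);
    act->drv = drv;
    act->opaque = opaque;
    SLIST_INSERT_HEAD(&tran->actions, act, entry);
}

/* Head insertion means actions are processed in reverse order of adding. */
static void tran_finalize(Transaction *tran, int ret)
{
    while (!SLIST_EMPTY(&tran->actions)) {
        TransactionAction *act = SLIST_FIRST(&tran->actions);
        SLIST_REMOVE_HEAD(&tran->actions, entry);
        if (ret < 0 && act->drv->abort) {
            act->drv->abort(act->opaque);
        } else if (ret >= 0 && act->drv->commit) {
            act->drv->commit(act->opaque);
        }
        if (act->drv->clean) {
            act->drv->clean(act->opaque);
        }
        free(act);
    }
    free(tran);
}

/* Example action: set an int, remembering the old value for rollback. */
typedef struct SetIntState {
    int *target;
    int old_value;
} SetIntState;

static void set_int_abort(void *opaque)
{
    SetIntState *s = opaque;
    *s->target = s->old_value;
}

static void set_int_clean(void *opaque)
{
    free(opaque);
}

static const TransactionActionDrv set_int_drv = {
    .abort = set_int_abort,
    .clean = set_int_clean,
};

/* The "prepare" function: does the work and registers the action itself. */
static void set_int(int *target, int value, Transaction *tran)
{
    SetIntState *s = malloc(sizeof(*s));
    assert(s);
    s->target = target;
    s->old_value = *target;
    *target = value;
    tran_add(tran, &set_int_drv, s);
}
```

Two set_int() calls on the same variable followed by tran_finalize(tran, -1) restore the original value only because the later action is undone first, which is the reverse-order property the commit message describes.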

Kevin




* Re: [PATCH v2 02/36] tests/test-bdrv-graph-mod: add test_parallel_perm_update
  2021-01-18 14:05   ` Kevin Wolf
@ 2021-01-18 17:13     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-01-18 17:13 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, armbru, qemu-devel, mreitz, den, jsnow

18.01.2021 17:05, Kevin Wolf wrote:
> Am 27.11.2020 um 15:44 hat Vladimir Sementsov-Ogievskiy geschrieben:
>> Add test to show that simple DFS recursion order is not correct for
>> permission update. Correct order is topological-sort order, which will
>> be introduced later.
>>
>> Consider a block driver which has two filter children: one active
>> with exclusive write access and one inactive with no specific
>> permissions.
>>
>> And these two children have a common base child, like this:
>>
>> ┌─────┐     ┌──────┐
>> │ fl2 │ ◀── │ top  │
>> └─────┘     └──────┘
>>    │           │
>>    │           │ w
>>    │           ▼
>>    │         ┌──────┐
>>    │         │ fl1  │
>>    │         └──────┘
>>    │           │
>>    │           │ w
>>    │           ▼
>>    │         ┌──────┐
>>    └───────▶ │ base │
>>              └──────┘
>>
>> So, exclusive write is propagated.
>>
>> Assume we want to make fl2 active instead of fl1.
>> So, we set some option for the top driver and do a permission update.
>>
>> If the permission update (remember, it's DFS) goes first through the
>> top->fl1->base branch, it will succeed: it first drops the exclusive
>> write permissions and then applies them to the other BdrvChild.
>> But if the permission update goes first through the top->fl2->base
>> branch, it will fail: when we try to update the fl2->base child, the
>> old, not yet updated fl1->base child will be in conflict.
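
For reference, the ordering property can be shown on the four nodes of the diagram above with a tiny stand-alone sketch. It uses Kahn's algorithm, which is only one way to obtain a topological order and not necessarily what the series implements; the enum, edge[] and toy_topo_order() are made up for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { TOP, FL1, FL2, BASE, N_NODES };

/* edge[p][c] is true when node p is a parent of node c. */
static const bool edge[N_NODES][N_NODES] = {
    [TOP] = { [FL1] = true, [FL2] = true },
    [FL1] = { [BASE] = true },
    [FL2] = { [BASE] = true },
};

/*
 * Kahn's algorithm: a node is emitted only after all of its parents
 * have been emitted. Returns the number of nodes written to order[].
 */
static size_t toy_topo_order(int order[N_NODES])
{
    int pending_parents[N_NODES] = { 0 };
    bool done[N_NODES] = { false };
    size_t n = 0;

    for (int p = 0; p < N_NODES; p++) {
        for (int c = 0; c < N_NODES; c++) {
            if (edge[p][c]) {
                pending_parents[c]++;
            }
        }
    }

    for (;;) {
        int next = -1;

        for (int i = 0; i < N_NODES; i++) {
            if (!done[i] && pending_parents[i] == 0) {
                next = i;
                break;
            }
        }
        if (next < 0) {
            break;      /* nothing left, or a cycle */
        }

        done[next] = true;
        order[n++] = next;
        for (int c = 0; c < N_NODES; c++) {
            if (edge[next][c]) {
                pending_parents[c]--;
            }
        }
    }
    return n;
}
```

Whatever the order of fl1 and fl2, base always comes after both of them, so both of its parents already require their new permissions by the time base is updated.
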
>>
>> The test currently fails, so it runs only with the -d flag. To run it, do
>>
>>    ./test-bdrv-graph-mod -d -p /bdrv-graph-mod/parallel-perm-update
>>
>> from <build-directory>/tests.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> ---
>>   tests/test-bdrv-graph-mod.c | 64 +++++++++++++++++++++++++++++++++++++
>>   1 file changed, 64 insertions(+)
>>
>> diff --git a/tests/test-bdrv-graph-mod.c b/tests/test-bdrv-graph-mod.c
>> index 3b9e6f242f..27e3361a60 100644
>> --- a/tests/test-bdrv-graph-mod.c
>> +++ b/tests/test-bdrv-graph-mod.c
>> @@ -232,6 +232,68 @@ static void test_parallel_exclusive_write(void)
>>       bdrv_unref(top);
>>   }
>>   
>> +static void write_to_file_perms(BlockDriverState *bs, BdrvChild *c,
>> +                                     BdrvChildRole role,
>> +                                     BlockReopenQueue *reopen_queue,
>> +                                     uint64_t perm, uint64_t shared,
>> +                                     uint64_t *nperm, uint64_t *nshared)
>> +{
>> +    if (bs->file && c == bs->file) {
>> +        *nperm = BLK_PERM_WRITE;
>> +        *nshared = BLK_PERM_ALL & ~BLK_PERM_WRITE;
>> +    } else {
>> +        *nperm = 0;
>> +        *nshared = BLK_PERM_ALL;
>> +    }
>> +}
>> +
>> +static BlockDriver bdrv_write_to_file = {
>> +    .format_name = "tricky-perm",
>> +    .bdrv_child_perm = write_to_file_perms,
>> +};
>> +
>> +static void test_parallel_perm_update(void)
>> +{
>> +    BlockDriverState *top = no_perm_node("top");
>> +    BlockDriverState *tricky =
>> +            bdrv_new_open_driver(&bdrv_write_to_file, "tricky", BDRV_O_RDWR,
>> +                                 &error_abort);
>> +    BlockDriverState *base = no_perm_node("base");
>> +    BlockDriverState *fl1 = pass_through_node("fl1");
>> +    BlockDriverState *fl2 = pass_through_node("fl2");
>> +    BdrvChild *c_fl1, *c_fl2;
>> +
>> +    bdrv_attach_child(top, tricky, "file", &child_of_bds, BDRV_CHILD_DATA,
>> +                      &error_abort);
>> +    c_fl1 = bdrv_attach_child(tricky, fl1, "first", &child_of_bds,
>> +                              BDRV_CHILD_FILTERED, &error_abort);
>> +    c_fl2 = bdrv_attach_child(tricky, fl2, "second", &child_of_bds,
>> +                              BDRV_CHILD_FILTERED, &error_abort);
>> +    bdrv_attach_child(fl1, base, "backing", &child_of_bds, BDRV_CHILD_FILTERED,
>> +                      &error_abort);
>> +    bdrv_attach_child(fl2, base, "backing", &child_of_bds, BDRV_CHILD_FILTERED,
>> +                      &error_abort);
>> +    bdrv_ref(base);
>> +
>> +    /* Select fl1 as first child to be active */
>> +    tricky->file = c_fl1;
>> +    bdrv_child_refresh_perms(top, top->children.lh_first, &error_abort);
>> +
>> +    assert(c_fl1->perm & BLK_PERM_WRITE);
>> +
>> +    /* Now, try to switch active child and update permissions */
>> +    tricky->file = c_fl2;
>> +    bdrv_child_refresh_perms(top, top->children.lh_first, &error_abort);
>> +
>> +    assert(c_fl2->perm & BLK_PERM_WRITE);
>> +
>> +    /* Switch once more, to not care about real child order in the list */
>> +    tricky->file = c_fl1;
>> +    bdrv_child_refresh_perms(top, top->children.lh_first, &error_abort);
>> +
>> +    assert(c_fl1->perm & BLK_PERM_WRITE);
> 
> Should we also assert in each case that we don't hold the write
> permission for the inactive child?
> 

Won't hurt, will add

-- 
Best regards,
Vladimir



* Re: [PATCH v2 03/36] block: bdrv_append(): don't consume reference
  2021-01-18 14:18   ` Kevin Wolf
@ 2021-01-18 17:21     ` Vladimir Sementsov-Ogievskiy
  2021-01-18 17:59       ` Kevin Wolf
  0 siblings, 1 reply; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-01-18 17:21 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, armbru, qemu-devel, mreitz, den, jsnow

18.01.2021 17:18, Kevin Wolf wrote:
> Am 27.11.2020 um 15:44 hat Vladimir Sementsov-Ogievskiy geschrieben:
>> We have too many comments about this feature. It seems better simply
>> not to do it. Most real users (tests don't count) have to create an
>> additional reference.
>>
>> Drop also comment in external_snapshot_prepare:
>>   - bdrv_append doesn't "remove" old bs in common sense, it sounds
>>     strange
>>   - the fact that bdrv_append can fail is obvious from the context
>>   - the fact that we must rollback all changes in transaction abort is
>>     known (it's the direct role of abort)
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> ---
>>   block.c                     | 19 +++----------------
>>   block/backup-top.c          |  1 -
>>   block/commit.c              |  1 +
>>   block/mirror.c              |  3 ---
>>   blockdev.c                  |  4 ----
>>   tests/test-bdrv-drain.c     |  2 +-
>>   tests/test-bdrv-graph-mod.c |  2 ++
>>   7 files changed, 7 insertions(+), 25 deletions(-)
>>
>> diff --git a/block.c b/block.c
>> index 0dd28f0902..55efef3c9d 100644
>> --- a/block.c
>> +++ b/block.c
>> @@ -3145,11 +3145,6 @@ static BlockDriverState *bdrv_append_temp_snapshot(BlockDriverState *bs,
>>           goto out;
>>       }
>>   
>> -    /* bdrv_append() consumes a strong reference to bs_snapshot
>> -     * (i.e. it will call bdrv_unref() on it) even on error, so in
>> -     * order to be able to return one, we have to increase
>> -     * bs_snapshot's refcount here */
>> -    bdrv_ref(bs_snapshot);
>>       bdrv_append(bs_snapshot, bs, &local_err);
>>       if (local_err) {
>>           error_propagate(errp, local_err);
>> @@ -4608,10 +4603,8 @@ void bdrv_replace_node(BlockDriverState *from, BlockDriverState *to,
>>    *
>>    * This function does not create any image files.
>>    *
>> - * bdrv_append() takes ownership of a bs_new reference and unrefs it because
>> - * that's what the callers commonly need. bs_new will be referenced by the old
>> - * parents of bs_top after bdrv_append() returns. If the caller needs to keep a
>> - * reference of its own, it must call bdrv_ref().
>> + * Recent update: bdrv_append does NOT eat bs_new reference for now. Drop this
>> + * comment several moths later.
> 
> A comment like this is unusual. Do you think there is a high risk of
> somebody introducing a new bdrv_append() in parallel and that they would
> read this comment when rebasing their existing patches?

Or even later: someone may remember that bdrv_append() eats the reference, then face some strange behavior with a new call of bdrv_append(), and finally go check the function code and see the new comment. I don't insist, we can drop the comment.
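
For anyone who does remember the old behavior, the new rule can be modeled with a toy refcount. This is a sketch only: ToyNode, ToyParent and toy_append() are not QEMU API, and the real bdrv_append() rewires the old parents of bs_top rather than a single parent pointer.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct ToyNode {
    int refcnt;
} ToyNode;

typedef struct ToyParent {
    ToyNode *child;
} ToyParent;

static void toy_ref(ToyNode *n)
{
    n->refcnt++;
}

static void toy_unref(ToyNode *n)
{
    assert(n->refcnt > 0);
    n->refcnt--;    /* real code would delete the node at zero */
}

/*
 * New convention: the append takes its own reference for the parent and
 * never touches the caller's reference, neither on success nor on error.
 */
static int toy_append(ToyParent *parent, ToyNode *node_new,
                      bool simulate_failure)
{
    if (simulate_failure) {
        return -EPERM;
    }
    toy_ref(node_new);
    parent->child = node_new;
    return 0;
}
```

A caller that does not need the node afterwards (like commit_start() in this patch) simply calls toy_unref() right after toy_append(), on both the success and the error path.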

> 
> If we do keep the comment: s/for now/now/ (it has recently changed,
> we're not intending to change it later) and s/moths/months/.
> 
>>    */
>>   void bdrv_append(BlockDriverState *bs_new, BlockDriverState *bs_top,
>>                    Error **errp)
>> @@ -4621,20 +4614,14 @@ void bdrv_append(BlockDriverState *bs_new, BlockDriverState *bs_top,
>>       bdrv_set_backing_hd(bs_new, bs_top, &local_err);
>>       if (local_err) {
>>           error_propagate(errp, local_err);
>> -        goto out;
>> +        return;
>>       }
>>   
>>       bdrv_replace_node(bs_top, bs_new, &local_err);
>>       if (local_err) {
>>           error_propagate(errp, local_err);
>>           bdrv_set_backing_hd(bs_new, NULL, &error_abort);
>> -        goto out;
> 
> Can we leave a return here just in case that new code will be added at
> the end of the function?

sure

> 
>>       }
>> -
>> -    /* bs_new is now referenced by its new parents, we don't need the
>> -     * additional reference any more. */
>> -out:
>> -    bdrv_unref(bs_new);
>>   }
>>   
>>   static void bdrv_delete(BlockDriverState *bs)
>> diff --git a/block/backup-top.c b/block/backup-top.c
>> index fe6883cc97..650ed6195c 100644
>> --- a/block/backup-top.c
>> +++ b/block/backup-top.c
>> @@ -222,7 +222,6 @@ BlockDriverState *bdrv_backup_top_append(BlockDriverState *source,
>>   
>>       bdrv_drained_begin(source);
>>   
>> -    bdrv_ref(top);
>>       bdrv_append(top, source, &local_err);
>>       if (local_err) {
>>           error_prepend(&local_err, "Cannot append backup-top filter: ");
>> diff --git a/block/commit.c b/block/commit.c
>> index 71db7ba747..61924bcf66 100644
>> --- a/block/commit.c
>> +++ b/block/commit.c
>> @@ -313,6 +313,7 @@ void commit_start(const char *job_id, BlockDriverState *bs,
>>       commit_top_bs->total_sectors = top->total_sectors;
>>   
>>       bdrv_append(commit_top_bs, top, &local_err);
>> +    bdrv_unref(commit_top_bs); /* referenced by new parents or failed */
>>       if (local_err) {
>>           commit_top_bs = NULL;
>>           error_propagate(errp, local_err);
>> diff --git a/block/mirror.c b/block/mirror.c
>> index 8e1ad6eceb..13f7ecc998 100644
>> --- a/block/mirror.c
>> +++ b/block/mirror.c
>> @@ -1605,9 +1605,6 @@ static BlockJob *mirror_start_job(
>>       bs_opaque = g_new0(MirrorBDSOpaque, 1);
>>       mirror_top_bs->opaque = bs_opaque;
>>   
>> -    /* bdrv_append takes ownership of the mirror_top_bs reference, need to keep
>> -     * it alive until block_job_create() succeeds even if bs has no parent. */
>> -    bdrv_ref(mirror_top_bs);
>>       bdrv_drained_begin(bs);
>>       bdrv_append(mirror_top_bs, bs, &local_err);
>>       bdrv_drained_end(bs);
>> diff --git a/blockdev.c b/blockdev.c
>> index b5f11c524b..96c96f8ba6 100644
>> --- a/blockdev.c
>> +++ b/blockdev.c
>> @@ -1587,10 +1587,6 @@ static void external_snapshot_prepare(BlkActionState *common,
>>           goto out;
>>       }
>>   
>> -    /* This removes our old bs and adds the new bs. This is an operation that
>> -     * can fail, so we need to do it in .prepare; undoing it for abort is
>> -     * always possible. */
> 
> This comment is still relevant, it's unrelated to the bdrv_ref().

I described in the commit message why I dislike this comment :) I can keep it, of course; it's not critical.

> 
>> -    bdrv_ref(state->new_bs);
>>       bdrv_append(state->new_bs, state->old_bs, &local_err);
>>       if (local_err) {
>>           error_propagate(errp, local_err);
> 
> Kevin
> 


-- 
Best regards,
Vladimir



* Re: [PATCH v2 05/36] block: add bdrv_parent_try_set_aio_context
  2021-01-18 15:08   ` Kevin Wolf
@ 2021-01-18 17:26     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-01-18 17:26 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, armbru, qemu-devel, mreitz, den, jsnow

18.01.2021 18:08, Kevin Wolf wrote:
> Am 27.11.2020 um 15:44 hat Vladimir Sementsov-Ogievskiy geschrieben:
>> We already have bdrv_parent_can_set_aio_context(). Add corresponding
>> bdrv_parent_set_aio_context_ignore() and
>> bdrv_parent_try_set_aio_context() and use them instead of open-coding.
>>
>> Make bdrv_parent_try_set_aio_context() public, as it will be used in
>> further commit.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> ---
>>   include/block/block.h |  2 ++
>>   block.c               | 51 +++++++++++++++++++++++++++++++++----------
>>   2 files changed, 41 insertions(+), 12 deletions(-)
>>
>> diff --git a/include/block/block.h b/include/block/block.h
>> index ee3f5a6cca..550c5a7513 100644
>> --- a/include/block/block.h
>> +++ b/include/block/block.h
>> @@ -686,6 +686,8 @@ bool bdrv_child_can_set_aio_context(BdrvChild *c, AioContext *ctx,
>>                                       GSList **ignore, Error **errp);
>>   bool bdrv_can_set_aio_context(BlockDriverState *bs, AioContext *ctx,
>>                                 GSList **ignore, Error **errp);
>> +int bdrv_parent_try_set_aio_context(BdrvChild *c, AioContext *ctx,
>> +                                    Error **errp);
>>   int bdrv_probe_blocksizes(BlockDriverState *bs, BlockSizes *bsz);
>>   int bdrv_probe_geometry(BlockDriverState *bs, HDGeometry *geo);
>>   
>> diff --git a/block.c b/block.c
>> index 916087ee1a..5d925c208d 100644
>> --- a/block.c
>> +++ b/block.c
>> @@ -81,6 +81,9 @@ static BlockDriverState *bdrv_open_inherit(const char *filename,
>>                                              BdrvChildRole child_role,
>>                                              Error **errp);
>>   
>> +static void bdrv_parent_set_aio_context_ignore(BdrvChild *c, AioContext *ctx,
>> +                                               GSList **ignore);
>> +
>>   /* If non-zero, use only whitelisted block drivers */
>>   static int use_bdrv_whitelist;
>>   
>> @@ -2655,17 +2658,12 @@ BdrvChild *bdrv_root_attach_child(BlockDriverState *child_bs,
>>        * try moving the parent into the AioContext of child_bs instead. */
>>       if (bdrv_get_aio_context(child_bs) != ctx) {
>>           ret = bdrv_try_set_aio_context(child_bs, ctx, &local_err);
>> -        if (ret < 0 && child_class->can_set_aio_ctx) {
>> -            GSList *ignore = g_slist_prepend(NULL, child);
>> -            ctx = bdrv_get_aio_context(child_bs);
> 
> You are losing this line...
> 
>> -            if (child_class->can_set_aio_ctx(child, ctx, &ignore, NULL)) {
>> -                error_free(local_err);
>> +        if (ret < 0) {
>> +            if (bdrv_parent_try_set_aio_context(child, ctx, NULL) == 0) {
> 
> ...before this one, so I think the wrong context is passed now. Instead
> of trying to move the parent to the AioContext of the child, we'll try
> to move it to the AioContext in which it already is (and which doesn't
> match the AioContext of the child).
> 

Oops, right, will fix

> 
>>                   ret = 0;
>> -                g_slist_free(ignore);
>> -                ignore = g_slist_prepend(NULL, child);
>> -                child_class->set_aio_ctx(child, ctx, &ignore);
>> +                error_free(local_err);
>> +                local_err = NULL;
>>               }
>> -            g_slist_free(ignore);
>>           }
>>           if (ret < 0) {
>>               error_propagate(errp, local_err);
>> @@ -6452,9 +6450,7 @@ void bdrv_set_aio_context_ignore(BlockDriverState *bs,
>>           if (g_slist_find(*ignore, child)) {
>>               continue;
>>           }
>> -        assert(child->klass->set_aio_ctx);
>> -        *ignore = g_slist_prepend(*ignore, child);
>> -        child->klass->set_aio_ctx(child, new_context, ignore);
>> +        bdrv_parent_set_aio_context_ignore(child, new_context, ignore);
>>       }
>>   
>>       bdrv_detach_aio_context(bs);
>> @@ -6511,6 +6507,37 @@ static bool bdrv_parent_can_set_aio_context(BdrvChild *c, AioContext *ctx,
>>       return true;
>>   }
>>   
>> +static void bdrv_parent_set_aio_context_ignore(BdrvChild *c, AioContext *ctx,
>> +                                               GSList **ignore)
>> +{
>> +    if (g_slist_find(*ignore, c)) {
>> +        return;
>> +    }
>> +    *ignore = g_slist_prepend(*ignore, c);
>> +
>> +    assert(c->klass->set_aio_ctx);
>> +    c->klass->set_aio_ctx(c, ctx, ignore);
>> +}
>> +
>> +int bdrv_parent_try_set_aio_context(BdrvChild *c, AioContext *ctx,
>> +                                    Error **errp)
>> +{
>> +    GSList *ignore = NULL;
>> +
>> +    if (!bdrv_parent_can_set_aio_context(c, ctx, &ignore, errp)) {
>> +        g_slist_free(ignore);
>> +        return -EPERM;
>> +    }
>> +
>> +    g_slist_free(ignore);
>> +    ignore = NULL;
>> +
>> +    bdrv_parent_set_aio_context_ignore(c, ctx, &ignore);
>> +    g_slist_free(ignore);
>> +
>> +    return 0;
>> +}
>> +
>>   bool bdrv_child_can_set_aio_context(BdrvChild *c, AioContext *ctx,
>>                                       GSList **ignore, Error **errp)
>>   {
>> -- 
>> 2.21.3
>>
> 


-- 
Best regards,
Vladimir



* Re: [PATCH v2 06/36] block: BdrvChildClass: add .get_parent_aio_context handler
  2021-01-18 15:13   ` Kevin Wolf
@ 2021-01-18 17:36     ` Vladimir Sementsov-Ogievskiy
  2021-01-19 16:38       ` Kevin Wolf
  0 siblings, 1 reply; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-01-18 17:36 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, armbru, qemu-devel, mreitz, den, jsnow

18.01.2021 18:13, Kevin Wolf wrote:
> Am 27.11.2020 um 15:44 hat Vladimir Sementsov-Ogievskiy geschrieben:
>> Add new handler to get aio context and implement it in all child
>> classes. Add corresponding public interface to be used soon.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> 
> Hm, are you going to introduce cases where parent and child context
> don't match, or why is this a useful function?
> 
> Even if so, I feel it shouldn't be any of the child's business what
> AioContext the parent uses.
> 
> Well, maybe the rest of the series will answer this.
> 

It's for the following patch, so that we don't have to pass the parent (as opaque) together with its class and, separately, its ctx. It just simplifies the interface of a function that we are going to work with a lot.

Hm. I'll take this opportunity to say that the terminology that calls a graph edge "BdrvChild" always leads to confusion between parents and children: "child_class" is a class of the parent, the list of parents is a list of children... It would all be a lot simpler to understand if BdrvChild were BdrvEdge or something like that.

-- 
Best regards,
Vladimir



* Re: [PATCH v2 10/36] util: add transactions.c
  2021-01-18 16:50   ` Kevin Wolf
@ 2021-01-18 17:41     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-01-18 17:41 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, armbru, qemu-devel, mreitz, den, jsnow

18.01.2021 19:50, Kevin Wolf wrote:
> Am 27.11.2020 um 15:44 hat Vladimir Sementsov-Ogievskiy geschrieben:
>> Add simple transaction API to use in further update of block graph
>> operations.
>>
>> Supposed usage is:
>>
>> - "prepare" is main function of the action and it should make the main
>>    effect of the action to be visible for the following actions, keeping
>>    possibility of roll-back, saving necessary things in action state,
>>    which is prepended to the list. So, driver struct doesn't include
>>    "prepare" field, as it is supposed to be called directly.
> 
> So the convention is that tran_prepend() should be called by the
> function that does the preparation?

yes.

> Or would we call tran_prepend() and
> do the actual action in different places?
> 
>> - commit/rollback is supposed to be called for the list of action
>>    states, to commit/rollback all the actions in reverse order
>>
>> - When possible "commit" should not make visible effect for other
>>    actions, which make possible transparent logical interaction between
>>    actions.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> ---
>>   include/qemu/transactions.h | 46 +++++++++++++++++++++
>>   util/transactions.c         | 81 +++++++++++++++++++++++++++++++++++++
>>   util/meson.build            |  1 +
>>   3 files changed, 128 insertions(+)
>>   create mode 100644 include/qemu/transactions.h
>>   create mode 100644 util/transactions.c
>>
>> diff --git a/include/qemu/transactions.h b/include/qemu/transactions.h
>> new file mode 100644
>> index 0000000000..a5b15f46ab
>> --- /dev/null
>> +++ b/include/qemu/transactions.h
>> @@ -0,0 +1,46 @@
>> +/*
>> + * Simple transactions API
>> + *
>> + * Copyright (c) 2020 Virtuozzo International GmbH.
>> + *
>> + * Author:
>> + *  Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License as published by
>> + * the Free Software Foundation; either version 2 of the License, or
>> + * (at your option) any later version.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU General Public License
>> + * along with this program. If not, see <http://www.gnu.org/licenses/>.
>> + */
>> +
>> +#ifndef QEMU_TRANSACTIONS_H
>> +#define QEMU_TRANSACTIONS_H
>> +
>> +#include <gmodule.h>
>> +
>> +typedef struct TransactionActionDrv {
>> +    void (*abort)(void *opeque);
>> +    void (*commit)(void *opeque);
>> +    void (*clean)(void *opeque);
>> +} TransactionActionDrv;
> 
> s/opeque/opaque/
> 
>> +void tran_prepend(GSList **list, TransactionActionDrv *drv, void *opaque);
>> +void tran_abort(GSList *backup);
>> +void tran_commit(GSList *backup);
> 
> I'd add an empty line before a full function definition.
> 
>> +static inline void tran_finalize(GSList *backup, int ret)
>> +{
>> +    if (ret < 0) {
>> +        tran_abort(backup);
>> +    } else {
>> +        tran_commit(backup);
>> +    }
>> +}
> 
> Let's use an opaque struct instead of GSList here and...
> 
>> +#endif /* QEMU_TRANSACTIONS_H */
>> diff --git a/util/transactions.c b/util/transactions.c
>> new file mode 100644
>> index 0000000000..ef1b9a36a4
>> --- /dev/null
>> +++ b/util/transactions.c
>> @@ -0,0 +1,81 @@
>> +/*
>> + * Simple transactions API
>> + *
>> + * Copyright (c) 2020 Virtuozzo International GmbH.
>> + *
>> + * Author:
>> + *  Sementsov-Ogievskiy Vladimir <vsementsov@virtuozzo.com>
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License as published by
>> + * the Free Software Foundation; either version 2 of the License, or
>> + * (at your option) any later version.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU General Public License
>> + * along with this program. If not, see <http://www.gnu.org/licenses/>.
>> + */
>> +
>> +#include "qemu/osdep.h"
>> +
>> +#include "qemu/transactions.h"
>> +
>> +typedef struct BdrvAction {
>> +    TransactionActionDrv *drv;
>> +    void *opaque;
>> +} BdrvAction;
> 
> ...add a QSLIST_ENTRY (or similar) here to make it a type-safe list.
> 
> The missing type safety of GSList means that we should avoid it whenever
> it's easily possible (i.e. we know the number of lists in which an
> element will be). Here, each BdrvAction will only be in a single list,
> so typed lists should be simple enough.
> 

OK
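
For illustration, here is a minimal sketch of what the typed-list variant could look like. It is an assumption about the shape, not the code of any later version of the series: the Transaction and TranAction names, the changed tran_prepend() signature and the set_int example action are hypothetical, a hand-rolled next pointer stands in for QSLIST_ENTRY so that the sketch compiles without qemu/queue.h, and calling .clean right after abort/commit is a guess at the intended semantics.

```c
#include <assert.h>
#include <stdlib.h>

/* Per-action callbacks; the field names follow the quoted patch. */
typedef struct TransactionActionDrv {
    void (*abort)(void *opaque);
    void (*commit)(void *opaque);
    void (*clean)(void *opaque);
} TransactionActionDrv;

/* The embedded 'next' pointer stands in for QSLIST_ENTRY(TranAction). */
typedef struct TranAction {
    const TransactionActionDrv *drv;
    void *opaque;
    struct TranAction *next;
} TranAction;

/* Hypothetical container; a real header would only forward-declare it. */
typedef struct Transaction {
    TranAction *head;
} Transaction;

/* Prepending means finalization walks the actions newest-first. */
static void tran_prepend(Transaction *tran, const TransactionActionDrv *drv,
                         void *opaque)
{
    TranAction *act = malloc(sizeof(*act));
    assert(act);
    act->drv = drv;
    act->opaque = opaque;
    act->next = tran->head;
    tran->head = act;
}

/* Abort on ret < 0, commit otherwise; then clean and free each action. */
static void tran_finalize(Transaction *tran, int ret)
{
    while (tran->head) {
        TranAction *act = tran->head;
        void (*fn)(void *) = ret < 0 ? act->drv->abort : act->drv->commit;

        tran->head = act->next;
        if (fn) {
            fn(act->opaque);
        }
        if (act->drv->clean) {
            act->drv->clean(act->opaque);
        }
        free(act);
    }
}

/* Example action: set an int, remembering the old value for rollback. */
typedef struct SetIntState {
    int *target;
    int old;
} SetIntState;

static void set_int_abort(void *opaque)
{
    SetIntState *s = opaque;
    *s->target = s->old;
}

static const TransactionActionDrv set_int_drv = {
    .abort = set_int_abort,
    .clean = free,          /* the action state is a plain malloc() block */
};

/* The "prepare" step: apply the effect now and register the undo state. */
static void set_int(Transaction *tran, int *target, int value)
{
    SetIntState *s = malloc(sizeof(*s));
    assert(s);
    s->target = target;
    s->old = *target;
    *target = value;
    tran_prepend(tran, &set_int_drv, s);
}

/* Runs two actions and finalizes with 'ret'; returns the resulting value. */
int demo_transaction(int ret)
{
    int x = 1;
    Transaction tran = { NULL };

    set_int(&tran, &x, 2);
    set_int(&tran, &x, 3);
    tran_finalize(&tran, ret);
    return x;
}
```

With a negative ret the two actions are undone newest-first, so x goes back to its initial value; with ret >= 0 the last value stays. Since every TranAction lives in exactly one list, the compiler can check the element type, which is the point of dropping GSList.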


-- 
Best regards,
Vladimir



* Re: [PATCH v2 03/36] block: bdrv_append(): don't consume reference
  2021-01-18 17:21     ` Vladimir Sementsov-Ogievskiy
@ 2021-01-18 17:59       ` Kevin Wolf
  0 siblings, 0 replies; 108+ messages in thread
From: Kevin Wolf @ 2021-01-18 17:59 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: qemu-block, armbru, qemu-devel, mreitz, den, jsnow

Am 18.01.2021 um 18:21 hat Vladimir Sementsov-Ogievskiy geschrieben:
> 18.01.2021 17:18, Kevin Wolf wrote:
> > Am 27.11.2020 um 15:44 hat Vladimir Sementsov-Ogievskiy geschrieben:
> > > We have too much comments for this feature. It seems better just don't
> > > do it. Most of real users (tests don't count) have to create additional
> > > reference.
> > > 
> > > Drop also comment in external_snapshot_prepare:
> > >   - bdrv_append doesn't "remove" old bs in common sense, it sounds
> > >     strange
> > >   - the fact that bdrv_append can fail is obvious from the context
> > >   - the fact that we must rollback all changes in transaction abort is
> > >     known (it's the direct role of abort)
> > > 
> > > Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

> > > diff --git a/blockdev.c b/blockdev.c
> > > index b5f11c524b..96c96f8ba6 100644
> > > --- a/blockdev.c
> > > +++ b/blockdev.c
> > > @@ -1587,10 +1587,6 @@ static void external_snapshot_prepare(BlkActionState *common,
> > >           goto out;
> > >       }
> > > -    /* This removes our old bs and adds the new bs. This is an operation that
> > > -     * can fail, so we need to do it in .prepare; undoing it for abort is
> > > -     * always possible. */
> > 
> > This comment is still relevant, it's unrelated to the bdrv_ref().
> 
> I described in commit message, why I dislike this comment :) I can
> keep it of course, it's not critical

Ah, right, I missed this bit in the commit message (or forgot it until I
reached this hunk) and thought it was an accidental removal.

If it's intentional, no reason to change the patch.

Kevin




* Re: [PATCH v2 06/36] block: BdrvChildClass: add .get_parent_aio_context handler
  2021-01-18 17:36     ` Vladimir Sementsov-Ogievskiy
@ 2021-01-19 16:38       ` Kevin Wolf
  2021-01-22 11:04         ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 108+ messages in thread
From: Kevin Wolf @ 2021-01-19 16:38 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: qemu-block, armbru, qemu-devel, mreitz, den, jsnow

Am 18.01.2021 um 18:36 hat Vladimir Sementsov-Ogievskiy geschrieben:
> 18.01.2021 18:13, Kevin Wolf wrote:
> > Am 27.11.2020 um 15:44 hat Vladimir Sementsov-Ogievskiy geschrieben:
> > > Add new handler to get aio context and implement it in all child
> > > classes. Add corresponding public interface to be used soon.
> > > 
> > > Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> > 
> > Hm, are you going to introduce cases where parent and child context
> > don't match, or why is this a useful function?
> > 
> > Even if so, I feel it shouldn't be any of the child's business what
> > AioContext the parent uses.
> > 
> > Well, maybe the rest of the series will answer this.
> 
> It's for the following patch, to not pass parent (as opaque) with it's
> class, and with its ctx in separate. Just to simplify the interface of
> the function, we are going to work with a lot.

In a way, the result is nicer because we already assume that ctx is the
parent context when we move things to different AioContexts. So it's
more consistent to just directly take it from the parent.

At the same time, it also complicates things a bit because now we need a
defined state in the middle of an operation that adds, removes or
replaces a child.

I guess I still need to make more progress in the review of this series before
I see how you're using the interface.

> Hm. I'll take this opportunity to say, that the terminology that calls
> graph edge "BdrvChild" always leads to the mess between parents and
> children.. "child_class" is a class of parent.. list of parents is
> list of children... It all would be a lot simpler to understand if
> BdrvChild was BdrvEdge or something like this.

Yeah, that would probably be clearer now that we actually use it to
work with both ends of the edge. And BdrvNode instead of
BlockDriverState, I guess.

Kevin




* Re: [PATCH v2 11/36] block: bdrv_refresh_perms: check parents compliance
  2020-11-27 14:44 ` [PATCH v2 11/36] block: bdrv_refresh_perms: check parents compliance Vladimir Sementsov-Ogievskiy
@ 2021-01-19 17:42   ` Kevin Wolf
  2021-01-19 18:10     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 108+ messages in thread
From: Kevin Wolf @ 2021-01-19 17:42 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: qemu-block, armbru, qemu-devel, mreitz, den, jsnow

Am 27.11.2020 um 15:44 hat Vladimir Sementsov-Ogievskiy geschrieben:
> Add additional check that node parents do not interfere with each
> other. This should not hurt existing callers and allows in further
> patch use bdrv_refresh_perms() to update a subtree of changed
> BdrvChild (check that change is correct).
> 
> New check will substitute bdrv_check_update_perm() in following
> permissions refactoring, so keep error messages the same to avoid
> unit test result changes.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

The change itself looks ok, but I'm not happy with the naming. It feels
a bit unspecific. How about inverting the result and calling it
bdrv_parent_perms_conflict() and bdrv_child_perms_conflict()?

At least, I'd call it "permission consistency" rather than "compliance".

> diff --git a/block.c b/block.c
> index 29082c6d47..a756f3e8ad 100644
> --- a/block.c
> +++ b/block.c
> @@ -1966,6 +1966,57 @@ bool bdrv_is_writable(BlockDriverState *bs)
>      return bdrv_is_writable_after_reopen(bs, NULL);
>  }
>  
> +static char *bdrv_child_user_desc(BdrvChild *c)
> +{
> +    if (c->klass->get_parent_desc) {
> +        return c->klass->get_parent_desc(c);
> +    }
> +
> +    return g_strdup("another user");
> +}
> +
> +static bool bdrv_a_allow_b(BdrvChild *a, BdrvChild *b, Error **errp)
> +{
> +    g_autofree char *user = NULL;
> +    g_autofree char *perm_names = NULL;
> +
> +    if ((b->perm & a->shared_perm) == b->perm) {
> +        return true;
> +    }
> +
> +    perm_names = bdrv_perm_names(b->perm & ~a->shared_perm);
> +    user = bdrv_child_user_desc(a);
> +    error_setg(errp, "Conflicts with use by %s as '%s', which does not "
> +               "allow '%s' on %s",
> +               user, a->name, perm_names, bdrv_get_node_name(b->bs));
> +
> +    return false;
> +}
> +
> +static bool bdrv_check_parents_compliance(BlockDriverState *bs, Error **errp)
> +{
> +    BdrvChild *a, *b;
> +
> +    /*
> +     * During the loop we'll look at each pair twice. That's correct is

s/is/because/ or what did you mean here?

> +     * bdrv_a_allow_b() is asymmetric and we should check each pair in both
> +     * directions.
> +     */
> +    QLIST_FOREACH(a, &bs->parents, next_parent) {
> +        QLIST_FOREACH(b, &bs->parents, next_parent) {
> +            if (a == b) {
> +                continue;
> +            }
> +
> +            if (!bdrv_a_allow_b(a, b, errp)) {
> +                return false;
> +            }
> +        }
> +    }
> +
> +    return true;
> +}
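
To make the inverted naming concrete, here is a minimal standalone sketch of the same pairwise check with the result flipped (true means conflict). It is only an illustration under stated assumptions: Edge is a simplified stand-in for BdrvChild, the PERM_* constants and all function names are hypothetical, and error reporting is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical permission bits, for illustration only. */
enum { PERM_READ = 1, PERM_WRITE = 2 };

/* Simplified stand-in for one parent edge (BdrvChild) of a node. */
typedef struct Edge {
    const char *name;
    uint64_t perm;          /* permissions this parent needs */
    uint64_t shared_perm;   /* permissions it tolerates in other parents */
} Edge;

/*
 * True if 'b' needs a permission that 'a' does not share. This is the
 * negation of the test in bdrv_a_allow_b() and is just as asymmetric.
 */
static bool edge_conflicts_with(const Edge *a, const Edge *b)
{
    return (b->perm & ~a->shared_perm) != 0;
}

/* Check every ordered pair, because the relation above is asymmetric. */
bool parent_perms_conflict(const Edge *parents, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            if (i != j && edge_conflicts_with(&parents[i], &parents[j])) {
                return true;
            }
        }
    }
    return false;
}

/* A reader that shares only READ conflicts with a writer. */
bool demo_reader_writer(void)
{
    const Edge parents[] = {
        { "reader", PERM_READ, PERM_READ },
        { "writer", PERM_WRITE, PERM_READ | PERM_WRITE },
    };

    return parent_perms_conflict(parents, 2);
}

/* Two readers that both share READ are fine. */
bool demo_two_readers(void)
{
    const Edge parents[] = {
        { "reader-1", PERM_READ, PERM_READ },
        { "reader-2", PERM_READ, PERM_READ },
    };

    return parent_perms_conflict(parents, 2);
}
```

In the reader/writer case only one direction fails: the reader does not share WRITE, while the writer is happy to share READ. That is why each pair has to be visited twice.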

Kevin




* Re: [PATCH v2 12/36] block: refactor bdrv_child* permission functions
  2020-11-27 14:44 ` [PATCH v2 12/36] block: refactor bdrv_child* permission functions Vladimir Sementsov-Ogievskiy
@ 2021-01-19 18:09   ` Kevin Wolf
  2021-01-19 18:30     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 108+ messages in thread
From: Kevin Wolf @ 2021-01-19 18:09 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: qemu-block, armbru, qemu-devel, mreitz, den, jsnow

Am 27.11.2020 um 15:44 hat Vladimir Sementsov-Ogievskiy geschrieben:
> Split out non-recursive parts, and refactor as block graph transaction
> action.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>  block.c | 79 ++++++++++++++++++++++++++++++++++++++++++---------------
>  1 file changed, 59 insertions(+), 20 deletions(-)
> 
> diff --git a/block.c b/block.c
> index a756f3e8ad..7267b4a3e9 100644
> --- a/block.c
> +++ b/block.c
> @@ -48,6 +48,7 @@
>  #include "qemu/timer.h"
>  #include "qemu/cutils.h"
>  #include "qemu/id.h"
> +#include "qemu/transactions.h"
>  #include "block/coroutines.h"
>  
>  #ifdef CONFIG_BSD
> @@ -2033,6 +2034,61 @@ static void bdrv_child_perm(BlockDriverState *bs, BlockDriverState *child_bs,
>      }
>  }
>  
> +static void bdrv_child_set_perm_commit(void *opaque)
> +{
> +    BdrvChild *c = opaque;
> +
> +    c->has_backup_perm = false;
> +}
> +
> +static void bdrv_child_set_perm_abort(void *opaque)
> +{
> +    BdrvChild *c = opaque;
> +    /*
> +     * We may have child->has_backup_perm unset at this point, as in case of
> +     * _check_ stage of permission update failure we may _check_ not the whole
> +     * subtree.  Still, _abort_ is called on the whole subtree anyway.
> +     */
> +    if (c->has_backup_perm) {
> +        c->perm = c->backup_perm;
> +        c->shared_perm = c->backup_shared_perm;
> +        c->has_backup_perm = false;
> +    }
> +}
> +
> +static TransactionActionDrv bdrv_child_set_pem_drv = {
> +    .abort = bdrv_child_set_perm_abort,
> +    .commit = bdrv_child_set_perm_commit,
> +};
> +
> +/*
> + * With tran=NULL needs to be followed by direct call to either
> + * bdrv_child_set_perm_commit() or bdrv_child_set_perm_abort().
> + *
> + * With non-NUll tran needs to be followed by tran_abort() or tran_commit()

s/NUll/NULL/

> + * instead.
> + */
> +static void bdrv_child_set_perm_safe(BdrvChild *c, uint64_t perm,
> +                                     uint64_t shared, GSList **tran)
> +{
> +    if (!c->has_backup_perm) {
> +        c->has_backup_perm = true;
> +        c->backup_perm = c->perm;
> +        c->backup_shared_perm = c->shared_perm;
> +    }

This is the obvious refactoring at this point, and it's fine as such.

However, when you start to actually use tran (and in new places), it
means that I have to check that we can never end up here recursively
with a different tran.

It would probably be much cleaner if the next patch moved the backup
state into the opaque struct for BdrvAction.

Kevin




* Re: [PATCH v2 11/36] block: bdrv_refresh_perms: check parents compliance
  2021-01-19 17:42   ` Kevin Wolf
@ 2021-01-19 18:10     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-01-19 18:10 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, armbru, qemu-devel, mreitz, den, jsnow

19.01.2021 20:42, Kevin Wolf wrote:
> Am 27.11.2020 um 15:44 hat Vladimir Sementsov-Ogievskiy geschrieben:
>> Add additional check that node parents do not interfere with each
>> other. This should not hurt existing callers and allows in further
>> patch use bdrv_refresh_perms() to update a subtree of changed
>> BdrvChild (check that change is correct).
>>
>> New check will substitute bdrv_check_update_perm() in following
>> permissions refactoring, so keep error messages the same to avoid
>> unit test result changes.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> 
> The change itself looks ok, but I'm not happy with the naming. It feels
> a bit unspecific. How about inverting the result and calling it
> bdrv_parent_perms_conflict() and bdrv_child_perms_conflict()?
> 
> At least, I'd call it "permission consistency" rather then "compliance".

bdrv_parent_perms_conflict() sounds good to me, OK

> 
>> diff --git a/block.c b/block.c
>> index 29082c6d47..a756f3e8ad 100644
>> --- a/block.c
>> +++ b/block.c
>> @@ -1966,6 +1966,57 @@ bool bdrv_is_writable(BlockDriverState *bs)
>>       return bdrv_is_writable_after_reopen(bs, NULL);
>>   }
>>   
>> +static char *bdrv_child_user_desc(BdrvChild *c)
>> +{
>> +    if (c->klass->get_parent_desc) {
>> +        return c->klass->get_parent_desc(c);
>> +    }
>> +
>> +    return g_strdup("another user");
>> +}
>> +
>> +static bool bdrv_a_allow_b(BdrvChild *a, BdrvChild *b, Error **errp)
>> +{
>> +    g_autofree char *user = NULL;
>> +    g_autofree char *perm_names = NULL;
>> +
>> +    if ((b->perm & a->shared_perm) == b->perm) {
>> +        return true;
>> +    }
>> +
>> +    perm_names = bdrv_perm_names(b->perm & ~a->shared_perm);
>> +    user = bdrv_child_user_desc(a);
>> +    error_setg(errp, "Conflicts with use by %s as '%s', which does not "
>> +               "allow '%s' on %s",
>> +               user, a->name, perm_names, bdrv_get_node_name(b->bs));
>> +
>> +    return false;
>> +}
>> +
>> +static bool bdrv_check_parents_compliance(BlockDriverState *bs, Error **errp)
>> +{
>> +    BdrvChild *a, *b;
>> +
>> +    /*
>> +     * During the loop we'll look at each pair twice. That's correct is
> 
> s/is/because/ or what did you mean here?

yes, s/is/because/

> 
>> +     * bdrv_a_allow_b() is asymmetric and we should check each pair in both
>> +     * directions.
>> +     */
>> +    QLIST_FOREACH(a, &bs->parents, next_parent) {
>> +        QLIST_FOREACH(b, &bs->parents, next_parent) {
>> +            if (a == b) {
>> +                continue;
>> +            }
>> +
>> +            if (!bdrv_a_allow_b(a, b, errp)) {
>> +                return false;
>> +            }
>> +        }
>> +    }
>> +
>> +    return true;
>> +}
> 
> Kevin
> 


-- 
Best regards,
Vladimir



* Re: [PATCH v2 12/36] block: refactor bdrv_child* permission functions
  2021-01-19 18:09   ` Kevin Wolf
@ 2021-01-19 18:30     ` Vladimir Sementsov-Ogievskiy
  2021-01-20  9:09       ` Kevin Wolf
  0 siblings, 1 reply; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-01-19 18:30 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, armbru, qemu-devel, mreitz, den, jsnow

19.01.2021 21:09, Kevin Wolf wrote:
> Am 27.11.2020 um 15:44 hat Vladimir Sementsov-Ogievskiy geschrieben:
>> Split out non-recursive parts, and refactor as block graph transaction
>> action.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> ---
>>   block.c | 79 ++++++++++++++++++++++++++++++++++++++++++---------------
>>   1 file changed, 59 insertions(+), 20 deletions(-)
>>
>> diff --git a/block.c b/block.c
>> index a756f3e8ad..7267b4a3e9 100644
>> --- a/block.c
>> +++ b/block.c
>> @@ -48,6 +48,7 @@
>>   #include "qemu/timer.h"
>>   #include "qemu/cutils.h"
>>   #include "qemu/id.h"
>> +#include "qemu/transactions.h"
>>   #include "block/coroutines.h"
>>   
>>   #ifdef CONFIG_BSD
>> @@ -2033,6 +2034,61 @@ static void bdrv_child_perm(BlockDriverState *bs, BlockDriverState *child_bs,
>>       }
>>   }
>>   
>> +static void bdrv_child_set_perm_commit(void *opaque)
>> +{
>> +    BdrvChild *c = opaque;
>> +
>> +    c->has_backup_perm = false;
>> +}
>> +
>> +static void bdrv_child_set_perm_abort(void *opaque)
>> +{
>> +    BdrvChild *c = opaque;
>> +    /*
>> +     * We may have child->has_backup_perm unset at this point, as in case of
>> +     * _check_ stage of permission update failure we may _check_ not the whole
>> +     * subtree.  Still, _abort_ is called on the whole subtree anyway.
>> +     */
>> +    if (c->has_backup_perm) {
>> +        c->perm = c->backup_perm;
>> +        c->shared_perm = c->backup_shared_perm;
>> +        c->has_backup_perm = false;
>> +    }
>> +}
>> +
>> +static TransactionActionDrv bdrv_child_set_pem_drv = {
>> +    .abort = bdrv_child_set_perm_abort,
>> +    .commit = bdrv_child_set_perm_commit,
>> +};
>> +
>> +/*
>> + * With tran=NULL needs to be followed by direct call to either
>> + * bdrv_child_set_perm_commit() or bdrv_child_set_perm_abort().
>> + *
>> + * With non-NUll tran needs to be followed by tran_abort() or tran_commit()
> 
> s/NUll/NULL/
> 
>> + * instead.
>> + */
>> +static void bdrv_child_set_perm_safe(BdrvChild *c, uint64_t perm,
>> +                                     uint64_t shared, GSList **tran)
>> +{
>> +    if (!c->has_backup_perm) {
>> +        c->has_backup_perm = true;
>> +        c->backup_perm = c->perm;
>> +        c->backup_shared_perm = c->shared_perm;
>> +    }
> 
> This is the obvious refactoring at this point, and it's fine as such.
> 
> However, when you start to actually use tran (and in new places), it
> means that I have to check that we can never end up here recursively
> with a different tran.

I don't follow.. Which different tran do you mean?

> 
> It would probably be much cleaner if the next patch moved the backup
> state into the opaque struct for BdrvAction.

But old code that calls the function with tran=NULL can't use it. Hmm, we could probably support both ways: c->backup_perm for callers with tran=NULL and an opaque struct for new-style callers.


-- 
Best regards,
Vladimir



* Re: [PATCH v2 12/36] block: refactor bdrv_child* permission functions
  2021-01-19 18:30     ` Vladimir Sementsov-Ogievskiy
@ 2021-01-20  9:09       ` Kevin Wolf
  2021-01-20  9:56         ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 108+ messages in thread
From: Kevin Wolf @ 2021-01-20  9:09 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: qemu-block, armbru, qemu-devel, mreitz, den, jsnow

Am 19.01.2021 um 19:30 hat Vladimir Sementsov-Ogievskiy geschrieben:
> 19.01.2021 21:09, Kevin Wolf wrote:
> > Am 27.11.2020 um 15:44 hat Vladimir Sementsov-Ogievskiy geschrieben:
> > > Split out non-recursive parts, and refactor as block graph transaction
> > > action.
> > > 
> > > Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> > > ---
> > >   block.c | 79 ++++++++++++++++++++++++++++++++++++++++++---------------
> > >   1 file changed, 59 insertions(+), 20 deletions(-)
> > > 
> > > diff --git a/block.c b/block.c
> > > index a756f3e8ad..7267b4a3e9 100644
> > > --- a/block.c
> > > +++ b/block.c
> > > @@ -48,6 +48,7 @@
> > >   #include "qemu/timer.h"
> > >   #include "qemu/cutils.h"
> > >   #include "qemu/id.h"
> > > +#include "qemu/transactions.h"
> > >   #include "block/coroutines.h"
> > >   #ifdef CONFIG_BSD
> > > @@ -2033,6 +2034,61 @@ static void bdrv_child_perm(BlockDriverState *bs, BlockDriverState *child_bs,
> > >       }
> > >   }
> > > +static void bdrv_child_set_perm_commit(void *opaque)
> > > +{
> > > +    BdrvChild *c = opaque;
> > > +
> > > +    c->has_backup_perm = false;
> > > +}
> > > +
> > > +static void bdrv_child_set_perm_abort(void *opaque)
> > > +{
> > > +    BdrvChild *c = opaque;
> > > +    /*
> > > +     * We may have child->has_backup_perm unset at this point, as in case of
> > > +     * _check_ stage of permission update failure we may _check_ not the whole
> > > +     * subtree.  Still, _abort_ is called on the whole subtree anyway.
> > > +     */
> > > +    if (c->has_backup_perm) {
> > > +        c->perm = c->backup_perm;
> > > +        c->shared_perm = c->backup_shared_perm;
> > > +        c->has_backup_perm = false;
> > > +    }
> > > +}
> > > +
> > > +static TransactionActionDrv bdrv_child_set_pem_drv = {
> > > +    .abort = bdrv_child_set_perm_abort,
> > > +    .commit = bdrv_child_set_perm_commit,
> > > +};
> > > +
> > > +/*
> > > + * With tran=NULL needs to be followed by direct call to either
> > > + * bdrv_child_set_perm_commit() or bdrv_child_set_perm_abort().
> > > + *
> > > + * With non-NUll tran needs to be followed by tran_abort() or tran_commit()
> > 
> > s/NUll/NULL/
> > 
> > > + * instead.
> > > + */
> > > +static void bdrv_child_set_perm_safe(BdrvChild *c, uint64_t perm,
> > > +                                     uint64_t shared, GSList **tran)
> > > +{
> > > +    if (!c->has_backup_perm) {
> > > +        c->has_backup_perm = true;
> > > +        c->backup_perm = c->perm;
> > > +        c->backup_shared_perm = c->shared_perm;
> > > +    }
> > 
> > This is the obvious refactoring at this point, and it's fine as such.
> > 
> > However, when you start to actually use tran (and in new places), it
> > means that I have to check that we can never end up here recursively
> > with a different tran.
> 
> I don't follow.. Which different tran do you mean?

Any really. At this point in the series, nothing passes a non-NULL tran
yet. When you add the first user, it is only a local transaction for a
single node. If something else called bdrv_child_set_perm_safe() on the
same node before the transaction has completed, the result would be
broken.

So reviewing/understanding this change (and actually developing it in
the first place) means going through all the code that ends up inside
the transaction and making sure that we never try to change permissions
for the same node a second time in any context.

> > 
> > It would probably be much cleaner if the next patch moved the backup
> > state into the opaque struct for BdrvAction.
> 
> But old code which calls the function with tran=NULL can't use it..
> Hmm, we can probably support both ways: c->backup_perm for callers
> with tran=NULL and opaque struct for new style callers.

Hm, you're right about that... Maybe that's too much complication then.

What happens in the next patch is still understandable enough with the
way you currently have it. Let's see what it looks like with the rest.
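
For reference, the alternative discussed above (keeping the backup in per-action state rather than in the child itself) could look roughly like the sketch below. It is a simplified model, not block.c code: Edge stands in for BdrvChild, and SetPermState, edge_set_perm() and perm_tran_finalize() are hypothetical names. The transaction here is just a singly linked stack of saved states.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified stand-in for BdrvChild: no backup fields in the edge itself. */
typedef struct Edge {
    uint64_t perm;
    uint64_t shared_perm;
} Edge;

/* Per-action state: each call owns its own copy of the old values. */
typedef struct SetPermState {
    Edge *c;
    uint64_t old_perm;
    uint64_t old_shared_perm;
    struct SetPermState *next;
} SetPermState;

/* "Prepare": save the old values, apply the new ones, push the state. */
static void edge_set_perm(Edge *c, uint64_t perm, uint64_t shared,
                          SetPermState **tran)
{
    SetPermState *s = malloc(sizeof(*s));
    assert(s);
    s->c = c;
    s->old_perm = c->perm;
    s->old_shared_perm = c->shared_perm;
    s->next = *tran;
    *tran = s;
    c->perm = perm;
    c->shared_perm = shared;
}

/* Pops newest-first; restores the saved values only when ret < 0. */
static void perm_tran_finalize(SetPermState **tran, int ret)
{
    while (*tran) {
        SetPermState *s = *tran;

        *tran = s->next;
        if (ret < 0) {
            s->c->perm = s->old_perm;
            s->c->shared_perm = s->old_shared_perm;
        }
        free(s);
    }
}

/* Two transactions touch the same edge; the inner one is aborted first. */
int demo_nested_abort(void)
{
    Edge c = { .perm = 1, .shared_perm = 0 };
    SetPermState *outer = NULL, *inner = NULL;

    edge_set_perm(&c, 2, 0, &outer);
    edge_set_perm(&c, 3, 0, &inner);
    perm_tran_finalize(&inner, -1);
    if (c.perm != 2) {
        return -1;              /* inner abort must only undo 2 -> 3 */
    }
    perm_tran_finalize(&outer, -1);
    return (int)c.perm;
}

/* The same edge is changed twice inside one transaction, then aborted. */
int demo_double_set(void)
{
    Edge c = { .perm = 1, .shared_perm = 0 };
    SetPermState *tran = NULL;

    edge_set_perm(&c, 2, 0, &tran);
    edge_set_perm(&c, 3, 0, &tran);
    perm_tran_finalize(&tran, -1);
    return (int)c.perm;
}
```

With a single has_backup_perm flag in the edge, the inner abort in demo_nested_abort() would restore the value saved by the outer call and clear the flag, so the edge would be back at 1 while the outer transaction is still open. The per-action copy avoids that as long as transactions are finalized newest-first. The cost is the one already noted: callers with tran=NULL have nowhere to keep the state.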

Kevin




* Re: [PATCH v2 12/36] block: refactor bdrv_child* permission functions
  2021-01-20  9:09       ` Kevin Wolf
@ 2021-01-20  9:56         ` Vladimir Sementsov-Ogievskiy
  2021-01-20 10:06           ` Kevin Wolf
  0 siblings, 1 reply; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-01-20  9:56 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, armbru, qemu-devel, mreitz, den, jsnow

20.01.2021 12:09, Kevin Wolf wrote:
> Am 19.01.2021 um 19:30 hat Vladimir Sementsov-Ogievskiy geschrieben:
>> 19.01.2021 21:09, Kevin Wolf wrote:
>>> Am 27.11.2020 um 15:44 hat Vladimir Sementsov-Ogievskiy geschrieben:
>>>> Split out non-recursive parts, and refactor as block graph transaction
>>>> action.
>>>>
>>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>>> ---
>>>>    block.c | 79 ++++++++++++++++++++++++++++++++++++++++++---------------
>>>>    1 file changed, 59 insertions(+), 20 deletions(-)
>>>>
>>>> diff --git a/block.c b/block.c
>>>> index a756f3e8ad..7267b4a3e9 100644
>>>> --- a/block.c
>>>> +++ b/block.c
>>>> @@ -48,6 +48,7 @@
>>>>    #include "qemu/timer.h"
>>>>    #include "qemu/cutils.h"
>>>>    #include "qemu/id.h"
>>>> +#include "qemu/transactions.h"
>>>>    #include "block/coroutines.h"
>>>>    #ifdef CONFIG_BSD
>>>> @@ -2033,6 +2034,61 @@ static void bdrv_child_perm(BlockDriverState *bs, BlockDriverState *child_bs,
>>>>        }
>>>>    }
>>>> +static void bdrv_child_set_perm_commit(void *opaque)
>>>> +{
>>>> +    BdrvChild *c = opaque;
>>>> +
>>>> +    c->has_backup_perm = false;
>>>> +}
>>>> +
>>>> +static void bdrv_child_set_perm_abort(void *opaque)
>>>> +{
>>>> +    BdrvChild *c = opaque;
>>>> +    /*
>>>> +     * We may have child->has_backup_perm unset at this point, as in case of
>>>> +     * _check_ stage of permission update failure we may _check_ not the whole
>>>> +     * subtree.  Still, _abort_ is called on the whole subtree anyway.
>>>> +     */
>>>> +    if (c->has_backup_perm) {
>>>> +        c->perm = c->backup_perm;
>>>> +        c->shared_perm = c->backup_shared_perm;
>>>> +        c->has_backup_perm = false;
>>>> +    }
>>>> +}
>>>> +
>>>> +static TransactionActionDrv bdrv_child_set_pem_drv = {
>>>> +    .abort = bdrv_child_set_perm_abort,
>>>> +    .commit = bdrv_child_set_perm_commit,
>>>> +};
>>>> +
>>>> +/*
>>>> + * With tran=NULL needs to be followed by direct call to either
>>>> + * bdrv_child_set_perm_commit() or bdrv_child_set_perm_abort().
>>>> + *
>>>> + * With non-NUll tran needs to be followed by tran_abort() or tran_commit()
>>>
>>> s/NUll/NULL/
>>>
>>>> + * instead.
>>>> + */
>>>> +static void bdrv_child_set_perm_safe(BdrvChild *c, uint64_t perm,
>>>> +                                     uint64_t shared, GSList **tran)
>>>> +{
>>>> +    if (!c->has_backup_perm) {
>>>> +        c->has_backup_perm = true;
>>>> +        c->backup_perm = c->perm;
>>>> +        c->backup_shared_perm = c->shared_perm;
>>>> +    }
>>>
>>> This is the obvious refactoring at this point, and it's fine as such.
>>>
>>> However, when you start to actually use tran (and in new places), it
>>> means that I have to check that we can never end up here recursively
>>> with a different tran.
>>
>> I don't follow.. Which different tran do you mean?
> 
> Any really. At this point in the series, nothing passes a non-NULL tran
> yet. When you add the first user, it is only a local transaction for a
> single node. If something else called bdrv_child_set_perm_safe() on the
> same node before the transaction has completed, the result would be
> broken.

But this problem is preexisting: if someone calls bdrv_child_set_perm twice on the same node during one update operation, c->backup* will be overwritten.

> 
> So reviewing/understanding this change (and actually developing it in
> the first place) means going through all the code that ends up inside
> the transaction and making sure that we never try to change permissions
> for the same node a second time in any context.

I think we do exactly that when we find the same node several times during an update. And that is fixed in "[PATCH v2 15/36] block: use topological sort for permission update", where we move to a topologically sorted list.

> 
>>>
>>> It would probably be much cleaner if the next patch moved the backup
>>> state into the opaque struct for BdrvAction.
>>
>> But old code which calls the function with tran=NULL can't use it..
>> Hmm, we can probably support both ways: c->backup_perm for callers
>> with tran=NULL and opaque struct for new style callers.
> 
> Hm, you're right about that... Maybe that's too much complication then.
> 
> What happens in the next patch is still understandable enough with the
> way you currently have it. Let's see what it looks like with the rest.
> 


-- 
Best regards,
Vladimir



* Re: [PATCH v2 12/36] block: refactor bdrv_child* permission functions
  2021-01-20  9:56         ` Vladimir Sementsov-Ogievskiy
@ 2021-01-20 10:06           ` Kevin Wolf
  0 siblings, 0 replies; 108+ messages in thread
From: Kevin Wolf @ 2021-01-20 10:06 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: qemu-block, armbru, qemu-devel, mreitz, den, jsnow

Am 20.01.2021 um 10:56 hat Vladimir Sementsov-Ogievskiy geschrieben:
> 20.01.2021 12:09, Kevin Wolf wrote:
> > Am 19.01.2021 um 19:30 hat Vladimir Sementsov-Ogievskiy geschrieben:
> > > 19.01.2021 21:09, Kevin Wolf wrote:
> > > > Am 27.11.2020 um 15:44 hat Vladimir Sementsov-Ogievskiy geschrieben:
> > > > > Split out non-recursive parts, and refactor as block graph transaction
> > > > > action.
> > > > > 
> > > > > Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> > > > > ---
> > > > >    block.c | 79 ++++++++++++++++++++++++++++++++++++++++++---------------
> > > > >    1 file changed, 59 insertions(+), 20 deletions(-)
> > > > > 
> > > > > diff --git a/block.c b/block.c
> > > > > index a756f3e8ad..7267b4a3e9 100644
> > > > > --- a/block.c
> > > > > +++ b/block.c
> > > > > @@ -48,6 +48,7 @@
> > > > >    #include "qemu/timer.h"
> > > > >    #include "qemu/cutils.h"
> > > > >    #include "qemu/id.h"
> > > > > +#include "qemu/transactions.h"
> > > > >    #include "block/coroutines.h"
> > > > >    #ifdef CONFIG_BSD
> > > > > @@ -2033,6 +2034,61 @@ static void bdrv_child_perm(BlockDriverState *bs, BlockDriverState *child_bs,
> > > > >        }
> > > > >    }
> > > > > +static void bdrv_child_set_perm_commit(void *opaque)
> > > > > +{
> > > > > +    BdrvChild *c = opaque;
> > > > > +
> > > > > +    c->has_backup_perm = false;
> > > > > +}
> > > > > +
> > > > > +static void bdrv_child_set_perm_abort(void *opaque)
> > > > > +{
> > > > > +    BdrvChild *c = opaque;
> > > > > +    /*
> > > > > +     * We may have child->has_backup_perm unset at this point, as in case of
> > > > > +     * _check_ stage of permission update failure we may _check_ not the whole
> > > > > +     * subtree.  Still, _abort_ is called on the whole subtree anyway.
> > > > > +     */
> > > > > +    if (c->has_backup_perm) {
> > > > > +        c->perm = c->backup_perm;
> > > > > +        c->shared_perm = c->backup_shared_perm;
> > > > > +        c->has_backup_perm = false;
> > > > > +    }
> > > > > +}
> > > > > +
> > > > > +static TransactionActionDrv bdrv_child_set_pem_drv = {
> > > > > +    .abort = bdrv_child_set_perm_abort,
> > > > > +    .commit = bdrv_child_set_perm_commit,
> > > > > +};
> > > > > +
> > > > > +/*
> > > > > + * With tran=NULL needs to be followed by direct call to either
> > > > > + * bdrv_child_set_perm_commit() or bdrv_child_set_perm_abort().
> > > > > + *
> > > > > + * With non-NUll tran needs to be followed by tran_abort() or tran_commit()
> > > > 
> > > > s/NUll/NULL/
> > > > 
> > > > > + * instead.
> > > > > + */
> > > > > +static void bdrv_child_set_perm_safe(BdrvChild *c, uint64_t perm,
> > > > > +                                     uint64_t shared, GSList **tran)
> > > > > +{
> > > > > +    if (!c->has_backup_perm) {
> > > > > +        c->has_backup_perm = true;
> > > > > +        c->backup_perm = c->perm;
> > > > > +        c->backup_shared_perm = c->shared_perm;
> > > > > +    }
> > > > 
> > > > This is the obvious refactoring at this point, and it's fine as such.
> > > > 
> > > > However, when you start to actually use tran (and in new places), it
> > > > means that I have to check that we can never end up here recursively
> > > > with a different tran.
> > > 
> > > I don't follow.. Which different tran do you mean?
> > 
> > Any really. At this point in the series, nothing passes a non-NULL tran
> > yet. When you add the first user, it is only a local transaction for a
> > single node. If something else called bdrv_child_set_perm_safe() on the
> > same node before the transaction has completed, the result would be
> > broken.
> 
> But this problem is preexisting: if someone call bdrv_child_set_perm
> twice on the same node during one update operation, c->backup* will be
> rewritten.
> 
> > 
> > So reviewing/understanding this change (and actually developing it in
> > the first place) means going through all the code that ends up inside
> > the transaction and making sure that we never try to change permissions
> > for the same node a second time in any context.
> 
> I think we do it, when find same node several times during update. And
> that is fixed in "[PATCH v2 15/36] block: use topological sort for
> permission update", when we move to topological sorted list.

Ah, fair. Knowing that the state is broken before this patch makes
things easier in a way...
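
For readers following the backup discussion, here is a minimal standalone
sketch of the save-once pattern from the quoted hunk. DemoChild and the
demo_* functions are hypothetical stand-ins, not QEMU code, and the final
assignment of the new permissions is an assumption, because the quoted hunk
ends before that point.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the permission fields of BdrvChild. */
typedef struct DemoChild {
    uint64_t perm, shared_perm;
    uint64_t backup_perm, backup_shared_perm;
    bool has_backup_perm;
} DemoChild;

/* Set new permissions; the old ones are saved on the first call only. */
static void demo_set_perm_safe(DemoChild *c, uint64_t perm, uint64_t shared)
{
    if (!c->has_backup_perm) {
        c->has_backup_perm = true;
        c->backup_perm = c->perm;
        c->backup_shared_perm = c->shared_perm;
    }
    /* Assumption: the new values are applied right after the backup. */
    c->perm = perm;
    c->shared_perm = shared;
}

/* Keep the new permissions and drop the backup. */
static void demo_set_perm_commit(DemoChild *c)
{
    c->has_backup_perm = false;
}

/* Restore the saved permissions; a no-op if nothing was saved. */
static void demo_set_perm_abort(DemoChild *c)
{
    if (c->has_backup_perm) {
        c->perm = c->backup_perm;
        c->shared_perm = c->backup_shared_perm;
        c->has_backup_perm = false;
    }
}
```

With a single backup slot per child, an abort always returns to the state
before the first call, and the first commit or abort clears the slot. Two
independent updates touching the same child at the same time therefore
cannot each have their own rollback point, which is the concern raised above.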

Kevin




* Re: [PATCH v2 06/36] block: BdrvChildClass: add .get_parent_aio_context handler
  2021-01-19 16:38       ` Kevin Wolf
@ 2021-01-22 11:04         ` Vladimir Sementsov-Ogievskiy
  2021-01-22 11:18           ` Kevin Wolf
  0 siblings, 1 reply; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-01-22 11:04 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, armbru, qemu-devel, mreitz, den, jsnow

19.01.2021 19:38, Kevin Wolf wrote:
> Am 18.01.2021 um 18:36 hat Vladimir Sementsov-Ogievskiy geschrieben:
>> 18.01.2021 18:13, Kevin Wolf wrote:
>>> Am 27.11.2020 um 15:44 hat Vladimir Sementsov-Ogievskiy geschrieben:
>>>> Add new handler to get aio context and implement it in all child
>>>> classes. Add corresponding public interface to be used soon.
>>>>
>>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>>
>>> Hm, are you going to introduce cases where parent and child context
>>> don't match, or why is this a useful function?
>>>
>>> Even if so, I feel it shouldn't be any of the child's business what
>>> AioContext the parent uses.
>>>
>>> Well, maybe the rest of the series will answer this.
>>
>> It's for the following patch, to not pass parent (as opaque) with it's
>> class, and with its ctx in separate. Just to simplify the interface of
>> the function, we are going to work with a lot.
> 
> In a way, the result is nicer because we already assume that ctx is the
> parent context when we move things to different AioContexts. So it's
> more consistent to just directly take it from the parent.
> 
> At the same time, it also complicates things a bit because now we need a
> defined state in the middle of an operation that adds, removes or
> replaces a child.
> 
> I guess I still to make more progress in the review of this series until
> I see how you're using the interface.
> 
>> Hm. I'll take this opportunity to say, that the terminology that calls
>> graph edge "BdrvChild" always leads to the mess between parents and
>> children.. "child_class" is a class of parent.. list of parents is
>> list of children... It all would be a lot simpler to understand if
>> BdrvChild was BdrvEdge or something like this.
> 
> Yeah, that would probably be clearer now that we actually use it to
> work with both ends of the edge. And BdrvNode instead of
> BlockDriverState, I guess.

Do you think we could just rename them? Or would it be too painful for
developers who are used to the current names? I can make patches.


-- 
Best regards,
Vladimir



* Re: [PATCH v2 06/36] block: BdrvChildClass: add .get_parent_aio_context handler
  2021-01-22 11:04         ` Vladimir Sementsov-Ogievskiy
@ 2021-01-22 11:18           ` Kevin Wolf
  2021-01-22 11:26             ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 108+ messages in thread
From: Kevin Wolf @ 2021-01-22 11:18 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: qemu-block, armbru, qemu-devel, mreitz, den, jsnow

Am 22.01.2021 um 12:04 hat Vladimir Sementsov-Ogievskiy geschrieben:
> 19.01.2021 19:38, Kevin Wolf wrote:
> > Am 18.01.2021 um 18:36 hat Vladimir Sementsov-Ogievskiy geschrieben:
> > > 18.01.2021 18:13, Kevin Wolf wrote:
> > > > Am 27.11.2020 um 15:44 hat Vladimir Sementsov-Ogievskiy geschrieben:
> > > > > Add new handler to get aio context and implement it in all child
> > > > > classes. Add corresponding public interface to be used soon.
> > > > > 
> > > > > Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> > > > 
> > > > Hm, are you going to introduce cases where parent and child context
> > > > don't match, or why is this a useful function?
> > > > 
> > > > Even if so, I feel it shouldn't be any of the child's business what
> > > > AioContext the parent uses.
> > > > 
> > > > Well, maybe the rest of the series will answer this.
> > > 
> > > It's for the following patch, to not pass parent (as opaque) with it's
> > > class, and with its ctx in separate. Just to simplify the interface of
> > > the function, we are going to work with a lot.
> > 
> > In a way, the result is nicer because we already assume that ctx is the
> > parent context when we move things to different AioContexts. So it's
> > more consistent to just directly take it from the parent.
> > 
> > At the same time, it also complicates things a bit because now we need a
> > defined state in the middle of an operation that adds, removes or
> > replaces a child.
> > 
> > I guess I still to make more progress in the review of this series until
> > I see how you're using the interface.
> > 
> > > Hm. I'll take this opportunity to say, that the terminology that calls
> > > graph edge "BdrvChild" always leads to the mess between parents and
> > > children.. "child_class" is a class of parent.. list of parents is
> > > list of children... It all would be a lot simpler to understand if
> > > BdrvChild was BdrvEdge or something like this.
> > 
> > Yeah, that would probably be clearer now that we actually use it to
> > work with both ends of the edge. And BdrvNode instead of
> > BlockDriverState, I guess.
> 
> Do you think, we can just rename them? Or it would be too painful for developers,
> who get used to current names? I can make patches

I think getting used to new names wouldn't be too bad. I would be more
afraid of the merge conflicts.

Not sure, but it might be in the category of things that we would do
differently if we were starting over, but that are maybe not worth
changing now.

Kevin




* Re: [PATCH v2 06/36] block: BdrvChildClass: add .get_parent_aio_context handler
  2021-01-22 11:18           ` Kevin Wolf
@ 2021-01-22 11:26             ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-01-22 11:26 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, armbru, qemu-devel, mreitz, den, jsnow

22.01.2021 14:18, Kevin Wolf wrote:
> Am 22.01.2021 um 12:04 hat Vladimir Sementsov-Ogievskiy geschrieben:
>> 19.01.2021 19:38, Kevin Wolf wrote:
>>> Am 18.01.2021 um 18:36 hat Vladimir Sementsov-Ogievskiy geschrieben:
>>>> 18.01.2021 18:13, Kevin Wolf wrote:
>>>>> Am 27.11.2020 um 15:44 hat Vladimir Sementsov-Ogievskiy geschrieben:
>>>>>> Add new handler to get aio context and implement it in all child
>>>>>> classes. Add corresponding public interface to be used soon.
>>>>>>
>>>>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>>>>
>>>>> Hm, are you going to introduce cases where parent and child context
>>>>> don't match, or why is this a useful function?
>>>>>
>>>>> Even if so, I feel it shouldn't be any of the child's business what
>>>>> AioContext the parent uses.
>>>>>
>>>>> Well, maybe the rest of the series will answer this.
>>>>
>>>> It's for the following patch, to not pass parent (as opaque) with it's
>>>> class, and with its ctx in separate. Just to simplify the interface of
>>>> the function, we are going to work with a lot.
>>>
>>> In a way, the result is nicer because we already assume that ctx is the
>>> parent context when we move things to different AioContexts. So it's
>>> more consistent to just directly take it from the parent.
>>>
>>> At the same time, it also complicates things a bit because now we need a
>>> defined state in the middle of an operation that adds, removes or
>>> replaces a child.
>>>
>>> I guess I still to make more progress in the review of this series until
>>> I see how you're using the interface.
>>>
>>>> Hm. I'll take this opportunity to say, that the terminology that calls
>>>> graph edge "BdrvChild" always leads to the mess between parents and
>>>> children.. "child_class" is a class of parent.. list of parents is
>>>> list of children... It all would be a lot simpler to understand if
>>>> BdrvChild was BdrvEdge or something like this.
>>>
>>> Yeah, that would probably be clearer now that we actually use it to
>>> work with both ends of the edge. And BdrvNode instead of
>>> BlockDriverState, I guess.
>>
>> Do you think, we can just rename them? Or it would be too painful for developers,
>> who get used to current names? I can make patches
> 
> I think getting used to new names wouldn't be too bad. I would be more
> afraid of the merge conflicts.
> 
> Not sure, but it might in the category that we would do it differently
> if we were starting over, but maybe not worth changing now.
> 

Hmm, yes, such a rename would add a year of uncomfortable patch backporting.


-- 
Best regards,
Vladimir



* Re: [PATCH v2 15/36] block: use topological sort for permission update
  2020-11-27 14:45 ` [PATCH v2 15/36] block: use topological sort for permission update Vladimir Sementsov-Ogievskiy
@ 2021-01-27 18:38   ` Kevin Wolf
  2021-01-28  9:34     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 108+ messages in thread
From: Kevin Wolf @ 2021-01-27 18:38 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: qemu-block, armbru, qemu-devel, mreitz, den, jsnow

Am 27.11.2020 um 15:45 hat Vladimir Sementsov-Ogievskiy geschrieben:
> Rewrite bdrv_check_perm(), bdrv_abort_perm_update() and bdrv_set_perm()
> to update nodes in topological sort order instead of simple DFS. With
> topologically sorted nodes, we update a node only when all its parents
> already updated. With DFS it's not so.
> 
> Consider the following example:
> 
>     A -+
>     |  |
>     |  v
>     |  B
>     |  |
>     v  |
>     C<-+
> 
> A is parent for B and C, B is parent for C.
> 
> Obviously, to update permissions, we should go in order A B C, so, when
> we update C, all parent permissions already updated.

I wondered for a moment why this order is obvious. Taking a permission
on A may mean that we need to take the permission on C, too.

The answer is (or so I think) that the whole operation is atomic so the
half-updated state will never be visible to a caller, but this is about
calculating the right permissions. Permissions a node needs on its
children may depend on what its parents requested, but parent
permissions never depend on what children request.

Ok, makes sense.
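
To make the ordering argument concrete, here is a minimal standalone sketch
of a post-order DFS that yields parents before children on the A/B/C graph
from the commit message. DemoNode and demo_topological_dfs() are
hypothetical names; the patch itself uses GSList and GHashTable. Filling the
array from the back plays the role of g_slist_prepend(), and the found flag
plays the role of the hash table lookup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAX_CHILDREN 4

/* Hypothetical, simplified graph node; not QEMU's BlockDriverState. */
typedef struct DemoNode {
    const char *name;
    struct DemoNode *children[DEMO_MAX_CHILDREN];
    size_t n_children;
    bool found;     /* already visited by the DFS */
} DemoNode;

/*
 * Post-order DFS. @order is filled from the back: the caller sets *pos to
 * the array capacity, and on return order[*pos .. capacity) holds the
 * visited nodes, each one placed before everything reachable from it.
 */
static void demo_topological_dfs(DemoNode *node, DemoNode **order, size_t *pos)
{
    size_t i;

    if (node->found) {
        return;
    }
    node->found = true;

    for (i = 0; i < node->n_children; i++) {
        demo_topological_dfs(node->children[i], order, pos);
    }

    /* All descendants are stored already, so @node goes in front of them. */
    assert(*pos > 0);
    order[--*pos] = node;
}
```

Because a node is stored only after all of its descendants, it ends up in
front of them regardless of the order in which its children are listed.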

> But with current
> approach (simple recursion) we can update in sequence A C B C (C is
> updated twice). On first update of C, we consider old B permissions, so
> doing wrong thing. If it succeed, all is OK, on second C update we will
> finish with correct graph. But if the wrong thing failed, we break the
> whole process for no reason (it's possible that updated B permission
> will be less strict, but we will never check it).
> 
> Also new approach gives a way to simultaneously and correctly update
> several nodes, we just need to run bdrv_topological_dfs() several times
> to add all nodes and their subtrees into one topologically sorted list
> (next patch will update bdrv_replace_node() in this manner).
> 
> Test test_parallel_perm_update() is now passing, so move it out of
> debugging "if".
> 
> We also need to support ignore_children in
> bdrv_check_parents_compliance().
> 
> For test 283 order of parents compliance check is changed.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>  block.c                     | 103 +++++++++++++++++++++++++++++-------
>  tests/test-bdrv-graph-mod.c |   4 +-
>  tests/qemu-iotests/283.out  |   2 +-
>  3 files changed, 86 insertions(+), 23 deletions(-)
> 
> diff --git a/block.c b/block.c
> index 92bfcbedc9..81ccf51605 100644
> --- a/block.c
> +++ b/block.c
> @@ -1994,7 +1994,9 @@ static bool bdrv_a_allow_b(BdrvChild *a, BdrvChild *b, Error **errp)
>      return false;
>  }
>  
> -static bool bdrv_check_parents_compliance(BlockDriverState *bs, Error **errp)
> +static bool bdrv_check_parents_compliance(BlockDriverState *bs,
> +                                          GSList *ignore_children,
> +                                          Error **errp)
>  {
>      BdrvChild *a, *b;
>  
> @@ -2005,7 +2007,9 @@ static bool bdrv_check_parents_compliance(BlockDriverState *bs, Error **errp)
>       */
>      QLIST_FOREACH(a, &bs->parents, next_parent) {
>          QLIST_FOREACH(b, &bs->parents, next_parent) {
> -            if (a == b) {
> +            if (a == b || g_slist_find(ignore_children, a) ||
> +                g_slist_find(ignore_children, b))

'a' should be checked in the outer loop, no reason to repeat the same
check all the time in the inner loop.
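
A minimal standalone sketch of that restructuring, assuming a plain array
of parent edges instead of the QLIST and an "ignored" flag instead of the
g_slist_find() lookup. DemoEdge and the demo_* functions are hypothetical
names, and the compatibility rule is a simplified reading of what
bdrv_a_allow_b() checks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a parent edge (BdrvChild) of one node. */
typedef struct DemoEdge {
    uint64_t perm;          /* permissions this edge uses */
    uint64_t shared_perm;   /* permissions it tolerates on other edges */
    bool ignored;           /* stands in for membership in ignore_children */
} DemoEdge;

/* a allows b when every permission b uses is among those a shares. */
static bool demo_a_allow_b(const DemoEdge *a, const DemoEdge *b)
{
    return (b->perm & ~a->shared_perm) == 0;
}

static bool demo_check_parents_compliance(const DemoEdge *parents, size_t n)
{
    size_t i, j;

    for (i = 0; i < n; i++) {
        if (parents[i].ignored) {
            continue;       /* tested once per 'a', not once per pair */
        }
        for (j = 0; j < n; j++) {
            if (i == j || parents[j].ignored) {
                continue;
            }
            if (!demo_a_allow_b(&parents[i], &parents[j])) {
                return false;
            }
        }
    }

    return true;
}
```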

> +            {
>                  continue;
>              }
>  
> @@ -2034,6 +2038,29 @@ static void bdrv_child_perm(BlockDriverState *bs, BlockDriverState *child_bs,
>      }
>  }
>  
> +static GSList *bdrv_topological_dfs(GSList *list, GHashTable *found,
> +                                    BlockDriverState *bs)

It would be good to have a comment that explains the details of the
contract.

In particular, this seems to require that @list is already topologically
sorted, and it's complete in the sense that if a node is in the list,
all of its children are in the list, too.

> +{
> +    BdrvChild *child;
> +    g_autoptr(GHashTable) local_found = NULL;
> +
> +    if (!found) {
> +        assert(!list);
> +        found = local_found = g_hash_table_new(NULL, NULL);
> +    }
> +
> +    if (g_hash_table_contains(found, bs)) {
> +        return list;
> +    }
> +    g_hash_table_add(found, bs);
> +
> +    QLIST_FOREACH(child, &bs->children, next) {
> +        list = bdrv_topological_dfs(list, found, child->bs);
> +    }
> +
> +    return g_slist_prepend(list, bs);
> +}
> +
>  static void bdrv_child_set_perm_commit(void *opaque)
>  {
>      BdrvChild *c = opaque;
> @@ -2098,10 +2125,10 @@ static void bdrv_child_set_perm_safe(BdrvChild *c, uint64_t perm,
>   * A call to this function must always be followed by a call to bdrv_set_perm()
>   * or bdrv_abort_perm_update().
>   */

One big source of confusion for me when trying to understand this was
that bdrv_check_perm() is a misnomer since commit f962e96150e and the
above comment isn't really accurate any more.

The function doesn't only check the validity of the new permissions in
advance to actually making the change, but it already updates the
permissions of all child nodes (however not of its root node).

So we have gone from the original check/set/abort model (which the
function names still suggest) to a prepare/commit/rollback model.

I think some comment updates are in order, and possibly we should rename
some functions, too.
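
As an illustration of the prepare/commit/rollback shape, here is a minimal
standalone sketch. All names (DemoTran, DemoActionDrv, demo_int_set() and so
on) are hypothetical and only loosely modelled on the TransactionActionDrv
seen earlier in the series; this is not the QEMU implementation.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_ACTIONS 8

/* Hypothetical per-action callbacks; either one may be NULL. */
typedef struct DemoActionDrv {
    void (*abort)(void *opaque);
    void (*commit)(void *opaque);
} DemoActionDrv;

typedef struct DemoAction {
    const DemoActionDrv *drv;
    void *opaque;
} DemoAction;

typedef struct DemoTran {
    DemoAction actions[DEMO_MAX_ACTIONS];
    size_t n;
} DemoTran;

static void demo_tran_add(DemoTran *t, const DemoActionDrv *drv, void *opaque)
{
    assert(t->n < DEMO_MAX_ACTIONS);
    t->actions[t->n].drv = drv;
    t->actions[t->n].opaque = opaque;
    t->n++;
}

/* Roll back: undo the recorded actions, newest first. */
static void demo_tran_abort(DemoTran *t)
{
    while (t->n > 0) {
        DemoAction *a = &t->actions[--t->n];
        if (a->drv->abort) {
            a->drv->abort(a->opaque);
        }
    }
}

/* Commit: finalize the recorded actions in the order they were added. */
static void demo_tran_commit(DemoTran *t)
{
    size_t i;

    for (i = 0; i < t->n; i++) {
        if (t->actions[i].drv->commit) {
            t->actions[i].drv->commit(t->actions[i].opaque);
        }
    }
    t->n = 0;
}

/* Example action: set an int and remember the old value for rollback. */
typedef struct DemoIntSet {
    int *target;
    int old_value;
} DemoIntSet;

static void demo_int_set_abort(void *opaque)
{
    DemoIntSet *s = opaque;
    *s->target = s->old_value;
}

static const DemoActionDrv demo_int_set_drv = {
    .abort = demo_int_set_abort,
};

/* Prepare step: apply the change now and record how to undo it. */
static void demo_int_set(DemoTran *t, DemoIntSet *s, int *target, int value)
{
    s->target = target;
    s->old_value = *target;
    *target = value;
    demo_tran_add(t, &demo_int_set_drv, s);
}
```

The point of the sketch is that the prepare step already modifies state, as
bdrv_check_perm() does since commit f962e96150e, so "check" no longer
describes it.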

> -static int bdrv_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
> -                           uint64_t cumulative_perms,
> -                           uint64_t cumulative_shared_perms,
> -                           GSList *ignore_children, Error **errp)
> +static int bdrv_node_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
> +                                uint64_t cumulative_perms,
> +                                uint64_t cumulative_shared_perms,
> +                                GSList *ignore_children, Error **errp)
>  {
>      BlockDriver *drv = bs->drv;
>      BdrvChild *c;
> @@ -2166,21 +2193,43 @@ static int bdrv_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
>      /* Check all children */
>      QLIST_FOREACH(c, &bs->children, next) {
>          uint64_t cur_perm, cur_shared;
> -        GSList *cur_ignore_children;
>  
>          bdrv_child_perm(bs, c->bs, c, c->role, q,
>                          cumulative_perms, cumulative_shared_perms,
>                          &cur_perm, &cur_shared);
> +        bdrv_child_set_perm_safe(c, cur_perm, cur_shared, NULL);

This "added" line is actually old code. What is removed here is the
recursive call of bdrv_check_update_perm(). This is what the code below
will have to replace.

> +    }
> +
> +    return 0;
> +}
> +
> +static int bdrv_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
> +                           uint64_t cumulative_perms,
> +                           uint64_t cumulative_shared_perms,
> +                           GSList *ignore_children, Error **errp)
> +{
> +    int ret;
> +    BlockDriverState *root = bs;
> +    g_autoptr(GSList) list = bdrv_topological_dfs(NULL, NULL, root);
> +
> +    for ( ; list; list = list->next) {
> +        bs = list->data;
> +
> +        if (bs != root) {
> +            if (!bdrv_check_parents_compliance(bs, ignore_children, errp)) {
> +                return -EINVAL;
> +            }

At this point bs still has the old permissions, but we don't access
them. As we're going in topological order, the parents have already been
updated if they were a child covered in bdrv_node_check_perm(), so we're
checking the relevant values. Good.

What about the root node? If I understand correctly, the parents of the
root node wouldn't have been checked in the old code. In the new state,
the parent BdrvChild already has to contain the new permission.

In bdrv_refresh_perms(), we already check parent conflicts, so no change
for all callers going through it. Good.

bdrv_reopen_multiple() is less obvious. It passes permissions from the
BDRVReopenState, without applying the permissions first. Do we check the
old parent permissions instead of the new state here?

> +            bdrv_get_cumulative_perm(bs, &cumulative_perms,
> +                                     &cumulative_shared_perms);
> +        }
>  
> -        cur_ignore_children = g_slist_prepend(g_slist_copy(ignore_children), c);
> -        ret = bdrv_check_update_perm(c->bs, q, cur_perm, cur_shared,
> -                                     cur_ignore_children, errp);
> -        g_slist_free(cur_ignore_children);
> +        ret = bdrv_node_check_perm(bs, q, cumulative_perms,
> +                                   cumulative_shared_perms,
> +                                   ignore_children, errp);

We use the original ignore_children for every node in the sorted list.
The old code extends it with all nodes in the path to each node.

For the bdrv_check_update_perm() call that is now replaced with
bdrv_check_parents_compliance(), I think this was necessary because
bdrv_check_update_perm() always assumes adding a new edge, so if you
update one instead of adding it, you have to ignore it so that it can't
conflict with itself. This isn't necessary any more now because we just
update and then check for consistency.

For passing to bdrv_node_check_perm() it doesn't make a difference
anyway because the parameter is now unused (and should probably be
removed).

>          if (ret < 0) {
>              return ret;
>          }
> -
> -        bdrv_child_set_perm_safe(c, cur_perm, cur_shared, NULL);
>      }
>  
>      return 0;

A tricky patch to understand, but I think it's right for the most part.

Kevin




* Re: [PATCH v2 15/36] block: use topological sort for permission update
  2021-01-27 18:38   ` Kevin Wolf
@ 2021-01-28  9:34     ` Vladimir Sementsov-Ogievskiy
  2021-01-28 17:13       ` Kevin Wolf
  0 siblings, 1 reply; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-01-28  9:34 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, armbru, qemu-devel, mreitz, den, jsnow

27.01.2021 21:38, Kevin Wolf wrote:
> Am 27.11.2020 um 15:45 hat Vladimir Sementsov-Ogievskiy geschrieben:
>> Rewrite bdrv_check_perm(), bdrv_abort_perm_update() and bdrv_set_perm()
>> to update nodes in topological sort order instead of simple DFS. With
>> topologically sorted nodes, we update a node only when all its parents
>> already updated. With DFS it's not so.
>>
>> Consider the following example:
>>
>>      A -+
>>      |  |
>>      |  v
>>      |  B
>>      |  |
>>      v  |
>>      C<-+
>>
>> A is parent for B and C, B is parent for C.
>>
>> Obviously, to update permissions, we should go in order A B C, so, when
>> we update C, all parent permissions already updated.
> 
> I wondered for a moment why this order is obvious. Taking a permission
> on A may mean that we need to take the permission on C, too.
> 
> The answer is (or so I think) that the whole operation is atomic so the
> half-updated state will never be visible to a caller, but this is about
> calculating the right permissions. Permissions a node needs on its
> children may depend on what its parents requested, but parent
> permissions never depend on what children request.
> 

Yes, it is exactly about these relations.

> 
>> But with current
>> approach (simple recursion) we can update in sequence A C B C (C is
>> updated twice). On first update of C, we consider old B permissions, so
>> doing wrong thing. If it succeed, all is OK, on second C update we will
>> finish with correct graph. But if the wrong thing failed, we break the
>> whole process for no reason (it's possible that updated B permission
>> will be less strict, but we will never check it).
>>
>> Also new approach gives a way to simultaneously and correctly update
>> several nodes, we just need to run bdrv_topological_dfs() several times
>> to add all nodes and their subtrees into one topologically sorted list
>> (next patch will update bdrv_replace_node() in this manner).
>>
>> Test test_parallel_perm_update() is now passing, so move it out of
>> debugging "if".
>>
>> We also need to support ignore_children in
>> bdrv_check_parents_compliance().
>>
>> For test 283 order of parents compliance check is changed.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> ---
>>   block.c                     | 103 +++++++++++++++++++++++++++++-------
>>   tests/test-bdrv-graph-mod.c |   4 +-
>>   tests/qemu-iotests/283.out  |   2 +-
>>   3 files changed, 86 insertions(+), 23 deletions(-)
>>
>> diff --git a/block.c b/block.c
>> index 92bfcbedc9..81ccf51605 100644
>> --- a/block.c
>> +++ b/block.c
>> @@ -1994,7 +1994,9 @@ static bool bdrv_a_allow_b(BdrvChild *a, BdrvChild *b, Error **errp)
>>       return false;
>>   }
>>   
>> -static bool bdrv_check_parents_compliance(BlockDriverState *bs, Error **errp)
>> +static bool bdrv_check_parents_compliance(BlockDriverState *bs,
>> +                                          GSList *ignore_children,
>> +                                          Error **errp)
>>   {
>>       BdrvChild *a, *b;
>>   
>> @@ -2005,7 +2007,9 @@ static bool bdrv_check_parents_compliance(BlockDriverState *bs, Error **errp)
>>        */
>>       QLIST_FOREACH(a, &bs->parents, next_parent) {
>>           QLIST_FOREACH(b, &bs->parents, next_parent) {
>> -            if (a == b) {
>> +            if (a == b || g_slist_find(ignore_children, a) ||
>> +                g_slist_find(ignore_children, b))
> 
> 'a' should be checked in the outer loop, no reason to repeat the same
> check all the time in the inner loop.
> 
>> +            {
>>                   continue;
>>               }
>>   
>> @@ -2034,6 +2038,29 @@ static void bdrv_child_perm(BlockDriverState *bs, BlockDriverState *child_bs,
>>       }
>>   }
>>   
>> +static GSList *bdrv_topological_dfs(GSList *list, GHashTable *found,
>> +                                    BlockDriverState *bs)
> 
> It would be good to have a comment that explains the details of the
> contract.
> 
> In particular, this seems to require that @list is already topologically
> sorted, and it's complete in the sense that if a node is in the list,
> all of its children are in the list, too.

Right, will add

> 
>> +{
>> +    BdrvChild *child;
>> +    g_autoptr(GHashTable) local_found = NULL;
>> +
>> +    if (!found) {
>> +        assert(!list);
>> +        found = local_found = g_hash_table_new(NULL, NULL);
>> +    }
>> +
>> +    if (g_hash_table_contains(found, bs)) {
>> +        return list;
>> +    }
>> +    g_hash_table_add(found, bs);
>> +
>> +    QLIST_FOREACH(child, &bs->children, next) {
>> +        list = bdrv_topological_dfs(list, found, child->bs);
>> +    }
>> +
>> +    return g_slist_prepend(list, bs);
>> +}
>> +
>>   static void bdrv_child_set_perm_commit(void *opaque)
>>   {
>>       BdrvChild *c = opaque;
>> @@ -2098,10 +2125,10 @@ static void bdrv_child_set_perm_safe(BdrvChild *c, uint64_t perm,
>>    * A call to this function must always be followed by a call to bdrv_set_perm()
>>    * or bdrv_abort_perm_update().
>>    */
> 
> One big source of confusion for me when trying to understand this was
> that bdrv_check_perm() is a misnomer since commit f962e96150e and the
> above comment isn't really accurate any more.
> 
> The function doesn't only check the validity of the new permissions in
> advance to actually making the change, but it already updates the
> permissions of all child nodes (however not of its root node).
> 
> So we have gone from the original check/set/abort model (which the
> function names still suggest) to a prepare/commit/rollback model.
> 
> I think some comment updates are in order, and possibly we should rename
> some functions, too.
> 

By the end of the series they are refactored and renamed to be a native
part of the new transaction system (introduced in [10]).

>> -static int bdrv_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
>> -                           uint64_t cumulative_perms,
>> -                           uint64_t cumulative_shared_perms,
>> -                           GSList *ignore_children, Error **errp)
>> +static int bdrv_node_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
>> +                                uint64_t cumulative_perms,
>> +                                uint64_t cumulative_shared_perms,
>> +                                GSList *ignore_children, Error **errp)
>>   {
>>       BlockDriver *drv = bs->drv;
>>       BdrvChild *c;
>> @@ -2166,21 +2193,43 @@ static int bdrv_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
>>       /* Check all children */
>>       QLIST_FOREACH(c, &bs->children, next) {
>>           uint64_t cur_perm, cur_shared;
>> -        GSList *cur_ignore_children;
>>   
>>           bdrv_child_perm(bs, c->bs, c, c->role, q,
>>                           cumulative_perms, cumulative_shared_perms,
>>                           &cur_perm, &cur_shared);
>> +        bdrv_child_set_perm_safe(c, cur_perm, cur_shared, NULL);
> 
> This "added" line is actually old code. What is removed here is the
> recursive call of bdrv_check_update_perm(). This is what the code below
> will have to replace.

Yes, we'll use an explicit loop instead of recursion.

> 
>> +    }
>> +
>> +    return 0;
>> +}
>> +
>> +static int bdrv_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
>> +                           uint64_t cumulative_perms,
>> +                           uint64_t cumulative_shared_perms,
>> +                           GSList *ignore_children, Error **errp)
>> +{
>> +    int ret;
>> +    BlockDriverState *root = bs;
>> +    g_autoptr(GSList) list = bdrv_topological_dfs(NULL, NULL, root);
>> +
>> +    for ( ; list; list = list->next) {
>> +        bs = list->data;
>> +
>> +        if (bs != root) {
>> +            if (!bdrv_check_parents_compliance(bs, ignore_children, errp)) {
>> +                return -EINVAL;
>> +            }
> 
> At this point bs still had the old permissions, but we don't access
> them. As we're going in topological order, the parents have already been
> updated if they were a child covered in bdrv_node_check_perm(), so we're
> checking the relevant values. Good.
> 
> What about the root node? If I understand correctly, the parents of the
> root nodes wouldn't have been checked in the old code. In the new state,
> the parent BdrvChild already has to contain the new permission.
> 
> In bdrv_refresh_perms(), we already check parent conflicts, so no change
> for all callers going through it. Good.
> 
> bdrv_reopen_multiple() is less obvious. It passes permissions from the
> BDRVReopenState, without applying the permissions first.

This will be changed later in the series.

> Do we check the
> old parent permissions instead of the new state here?

We use given (new) cumulative permissions for bs, and recalculate permissions for bs subtree.

It follows the old behavior. The only thing that changes is that pre-patch
we do DFS recursion starting from bs (and probably visit some nodes
several times), while after the patch we first do a topological sort of
the bs subtree and then go through the list. The order of nodes is better
and we visit each node once.
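
To illustrate the ordering, here is a minimal standalone sketch (not the
QEMU code; Node, Order, order_init() and topo_dfs() are made-up names, and
the fixed-size arrays are only there to keep it short). It assumes the
graph is acyclic and shows a DFS that prepends a node after its children,
so that the resulting list has every parent before its children:

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CHILDREN 4
#define MAX_NODES 16

/* Hypothetical stand-in for a graph node; not QEMU's BlockDriverState. */
typedef struct Node {
    const char *name;
    struct Node *children[MAX_CHILDREN];
    size_t n_children;
    int visited;
} Node;

/* Result list, filled from the back so that "prepend" is cheap. */
typedef struct Order {
    Node *nodes[MAX_NODES];
    size_t start;               /* index of the first valid entry */
} Order;

static void order_init(Order *o)
{
    o->start = MAX_NODES;
}

/*
 * Depth-first walk that prepends a node only after all of its children
 * were handled. Reading the result front to back then yields every
 * parent before its children, and each node exactly once.
 */
static void topo_dfs(Node *n, Order *o)
{
    size_t i;

    if (n->visited) {
        return;                 /* already in the list, don't revisit */
    }
    n->visited = 1;

    for (i = 0; i < n->n_children; i++) {
        topo_dfs(n->children[i], o);
    }

    assert(o->start > 0);       /* sketch only: no dynamic growth */
    o->nodes[--o->start] = n;
}
```

For a diamond (top -> a, top -> b, a -> base, b -> base) a plain recursive
walk reaches base twice, while the list built here contains it once, after
both a and b.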

> 
>> +            bdrv_get_cumulative_perm(bs, &cumulative_perms,
>> +                                     &cumulative_shared_perms);
>> +        }
>>   
>> -        cur_ignore_children = g_slist_prepend(g_slist_copy(ignore_children), c);
>> -        ret = bdrv_check_update_perm(c->bs, q, cur_perm, cur_shared,
>> -                                     cur_ignore_children, errp);
>> -        g_slist_free(cur_ignore_children);
>> +        ret = bdrv_node_check_perm(bs, q, cumulative_perms,
>> +                                   cumulative_shared_perms,
>> +                                   ignore_children, errp);
> 
> We use the original ignore_children for every node in the sorted list.
> The old code extends it with all nodes in the path to each node.
> 
> For the bdrv_check_update_perm() call that is now replaced with
> bdrv_check_parents_compliance(), I think this was necessary because
> bdrv_check_update_perm() always assumes adding a new edge, so if you
> update one instead of adding it, you have to ignore it so that it can't
> conflict with itself. This isn't necessary any more now because we just
> update and then check for consistency.
> 
> For passing to bdrv_node_check_perm() it doesn't make a difference
> anyway because the parameter is now unused (and should probably be
> removed).

ignore_children will be dropped in [27]. For now it is still needed for bdrv_replace_node_common

> 
>>           if (ret < 0) {
>>               return ret;
>>           }
>> -
>> -        bdrv_child_set_perm_safe(c, cur_perm, cur_shared, NULL);
>>       }
>>   
>>       return 0;
> 
> A tricky patch to understand, but I think it's right for the most part.
> 
> Kevin
> 


-- 
Best regards,
Vladimir



* Re: [PATCH v2 15/36] block: use topological sort for permission update
  2021-01-28  9:34     ` Vladimir Sementsov-Ogievskiy
@ 2021-01-28 17:13       ` Kevin Wolf
  2021-01-28 18:04         ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 108+ messages in thread
From: Kevin Wolf @ 2021-01-28 17:13 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: qemu-block, armbru, qemu-devel, mreitz, den, jsnow

Am 28.01.2021 um 10:34 hat Vladimir Sementsov-Ogievskiy geschrieben:
> 27.01.2021 21:38, Kevin Wolf wrote:
> > Am 27.11.2020 um 15:45 hat Vladimir Sementsov-Ogievskiy geschrieben:
> > > -static int bdrv_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
> > > -                           uint64_t cumulative_perms,
> > > -                           uint64_t cumulative_shared_perms,
> > > -                           GSList *ignore_children, Error **errp)
> > > +static int bdrv_node_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
> > > +                                uint64_t cumulative_perms,
> > > +                                uint64_t cumulative_shared_perms,
> > > +                                GSList *ignore_children, Error **errp)
> > >   {
> > >       BlockDriver *drv = bs->drv;
> > >       BdrvChild *c;
> > > @@ -2166,21 +2193,43 @@ static int bdrv_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
> > >       /* Check all children */
> > >       QLIST_FOREACH(c, &bs->children, next) {
> > >           uint64_t cur_perm, cur_shared;
> > > -        GSList *cur_ignore_children;
> > >           bdrv_child_perm(bs, c->bs, c, c->role, q,
> > >                           cumulative_perms, cumulative_shared_perms,
> > >                           &cur_perm, &cur_shared);
> > > +        bdrv_child_set_perm_safe(c, cur_perm, cur_shared, NULL);
> > 
> > This "added" line is actually old code. What is removed here is the
> > recursive call of bdrv_check_update_perm(). This is what the code below
> > will have to replace.
> 
> yes, we'll use explicit loop instead of recursion
> 
> > 
> > > +    }
> > > +
> > > +    return 0;
> > > +}
> > > +
> > > +static int bdrv_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
> > > +                           uint64_t cumulative_perms,
> > > +                           uint64_t cumulative_shared_perms,
> > > +                           GSList *ignore_children, Error **errp)
> > > +{
> > > +    int ret;
> > > +    BlockDriverState *root = bs;
> > > +    g_autoptr(GSList) list = bdrv_topological_dfs(NULL, NULL, root);
> > > +
> > > +    for ( ; list; list = list->next) {
> > > +        bs = list->data;
> > > +
> > > +        if (bs != root) {
> > > +            if (!bdrv_check_parents_compliance(bs, ignore_children, errp)) {
> > > +                return -EINVAL;
> > > +            }
> > 
> > At this point bs still had the old permissions, but we don't access
> > them. As we're going in topological order, the parents have already been
> > updated if they were a child covered in bdrv_node_check_perm(), so we're
> > checking the relevant values. Good.
> > 
> > What about the root node? If I understand correctly, the parents of the
> > root nodes wouldn't have been checked in the old code. In the new state,
> > the parent BdrvChild already has to contain the new permission.
> > 
> > In bdrv_refresh_perms(), we already check parent conflicts, so no change
> > for all callers going through it. Good.
> > 
> > bdrv_reopen_multiple() is less obvious. It passes permissions from the
> > BDRVReopenState, without applying the permissions first.
> 
> It will be changed in the series
> 
> > Do we check the
> > old parent permissions instead of the new state here?
> 
> We use given (new) cumulative permissions for bs, and recalculate
> permissions for bs subtree.

Where do we actually set them? I would expect a
bdrv_child_set_perm_safe() call somewhere, but I can't see it in the
call path from bdrv_reopen_multiple().

> It follows the old behavior. The only thing that changes is that
> pre-patch we do DFS recursion starting from bs (and probably visit
> some nodes several times), while after the patch we first do a
> topological sort of the bs subtree and then go through the list. The
> order of nodes is better and we visit each node once.

It's not the only thing that changes. Maybe this is what makes the patch
hard to understand, because it seems to do two steps at once:

1. Change the order in which nodes are processed

2. Replace bdrv_check_update_perm() with bdrv_check_parents_compliance()

In step 2, the point I mentioned above is important (new permissions
must already be set in the BdrvChild objects).

The switch to bdrv_check_parents_compliance() also means that error
messages become a bit worse because we don't know any more which of the
conflicting nodes is the new one, so we can't provide two different
messages any more. This is probably unavoidable, though.
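
To make the "update first, then check for consistency" idea concrete,
here is a small standalone model (an illustration only, not the actual
bdrv_check_parents_compliance(); Edge, edge_tolerates(),
parents_compliant() and the PERM_* bits are hypothetical names). Every
parent edge of a node carries the permissions it uses and the permissions
it shares with others, and the check is symmetric across all pairs:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical permission bits, for illustration only. */
#define PERM_READ   UINT64_C(0x01)
#define PERM_WRITE  UINT64_C(0x02)

/* One parent edge of a node: what it uses and what it lets others use. */
typedef struct Edge {
    uint64_t perm;
    uint64_t shared_perm;
} Edge;

/* True if everything b uses is something a is willing to share. */
static bool edge_tolerates(const Edge *a, const Edge *b)
{
    return (b->perm & ~a->shared_perm) == 0;
}

/*
 * Check all pairs of parent edges against each other. No edge is
 * treated as "the new one", so a failure cannot say which side
 * introduced the conflict.
 */
static bool parents_compliant(const Edge *parents, size_t n)
{
    size_t i, j;

    assert(parents || n == 0);

    for (i = 0; i < n; i++) {
        for (j = i + 1; j < n; j++) {
            if (!edge_tolerates(&parents[i], &parents[j]) ||
                !edge_tolerates(&parents[j], &parents[i])) {
                return false;
            }
        }
    }
    return true;
}
```

Because the check only sees the final set of edges, it has no notion of
an edge being added or updated, which is also why a single error message
is all it can offer.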

> > 
> > > +            bdrv_get_cumulative_perm(bs, &cumulative_perms,
> > > +                                     &cumulative_shared_perms);
> > > +        }
> > > -        cur_ignore_children = g_slist_prepend(g_slist_copy(ignore_children), c);
> > > -        ret = bdrv_check_update_perm(c->bs, q, cur_perm, cur_shared,
> > > -                                     cur_ignore_children, errp);
> > > -        g_slist_free(cur_ignore_children);
> > > +        ret = bdrv_node_check_perm(bs, q, cumulative_perms,
> > > +                                   cumulative_shared_perms,
> > > +                                   ignore_children, errp);
> > 
> > We use the original ignore_children for every node in the sorted list.
> > The old code extends it with all nodes in the path to each node.
> > 
> > For the bdrv_check_update_perm() call that is now replaced with
> > bdrv_check_parents_compliance(), I think this was necessary because
> > bdrv_check_update_perm() always assumes adding a new edge, so if you
> > update one instead of adding it, you have to ignore it so that it can't
> > conflict with itself. This isn't necessary any more now because we just
> > update and then check for consistency.
> > 
> > For passing to bdrv_node_check_perm() it doesn't make a difference
> > anyway because the parameter is now unused (and should probably be
> > removed).
> 
> ignore_children will be dropped in [27]. For now it is still needed
> for bdrv_replace_node_common

In bdrv_node_check_perm(), it's already unused after this patch. But
fair enough.

Kevin




* Re: [PATCH v2 15/36] block: use topological sort for permission update
  2021-01-28 17:13       ` Kevin Wolf
@ 2021-01-28 18:04         ` Vladimir Sementsov-Ogievskiy
  2021-02-03 18:38           ` Kevin Wolf
  0 siblings, 1 reply; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-01-28 18:04 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, armbru, qemu-devel, mreitz, den, jsnow

28.01.2021 20:13, Kevin Wolf wrote:
> Am 28.01.2021 um 10:34 hat Vladimir Sementsov-Ogievskiy geschrieben:
>> 27.01.2021 21:38, Kevin Wolf wrote:
>>> Am 27.11.2020 um 15:45 hat Vladimir Sementsov-Ogievskiy geschrieben:
>>>> -static int bdrv_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
>>>> -                           uint64_t cumulative_perms,
>>>> -                           uint64_t cumulative_shared_perms,
>>>> -                           GSList *ignore_children, Error **errp)
>>>> +static int bdrv_node_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
>>>> +                                uint64_t cumulative_perms,
>>>> +                                uint64_t cumulative_shared_perms,
>>>> +                                GSList *ignore_children, Error **errp)
>>>>    {
>>>>        BlockDriver *drv = bs->drv;
>>>>        BdrvChild *c;
>>>> @@ -2166,21 +2193,43 @@ static int bdrv_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
>>>>        /* Check all children */
>>>>        QLIST_FOREACH(c, &bs->children, next) {
>>>>            uint64_t cur_perm, cur_shared;
>>>> -        GSList *cur_ignore_children;
>>>>            bdrv_child_perm(bs, c->bs, c, c->role, q,
>>>>                            cumulative_perms, cumulative_shared_perms,
>>>>                            &cur_perm, &cur_shared);
>>>> +        bdrv_child_set_perm_safe(c, cur_perm, cur_shared, NULL);
>>>
>>> This "added" line is actually old code. What is removed here is the
>>> recursive call of bdrv_check_update_perm(). This is what the code below
>>> will have to replace.
>>
>> yes, we'll use explicit loop instead of recursion
>>
>>>
>>>> +    }
>>>> +
>>>> +    return 0;
>>>> +}
>>>> +
>>>> +static int bdrv_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
>>>> +                           uint64_t cumulative_perms,
>>>> +                           uint64_t cumulative_shared_perms,
>>>> +                           GSList *ignore_children, Error **errp)
>>>> +{
>>>> +    int ret;
>>>> +    BlockDriverState *root = bs;
>>>> +    g_autoptr(GSList) list = bdrv_topological_dfs(NULL, NULL, root);
>>>> +
>>>> +    for ( ; list; list = list->next) {
>>>> +        bs = list->data;
>>>> +
>>>> +        if (bs != root) {
>>>> +            if (!bdrv_check_parents_compliance(bs, ignore_children, errp)) {
>>>> +                return -EINVAL;
>>>> +            }
>>>
>>> At this point bs still had the old permissions, but we don't access
>>> them. As we're going in topological order, the parents have already been
>>> updated if they were a child covered in bdrv_node_check_perm(), so we're
>>> checking the relevant values. Good.
>>>
>>> What about the root node? If I understand correctly, the parents of the
>>> root nodes wouldn't have been checked in the old code. In the new state,
>>> the parent BdrvChild already has to contain the new permission.
>>>
>>> In bdrv_refresh_perms(), we already check parent conflicts, so no change
>>> for all callers going through it. Good.
>>>
>>> bdrv_reopen_multiple() is less obvious. It passes permissions from the
>>> BDRVReopenState, without applying the permissions first.
>>
>> It will be changed in the series
>>
>>> Do we check the
>>> old parent permissions instead of the new state here?
>>
>> We use given (new) cumulative permissions for bs, and recalculate
>> permissions for bs subtree.
> 
> Where do we actually set them? I would expect a
> bdrv_child_set_perm_safe() call somewhere, but I can't see it in the
> call path from bdrv_reopen_multiple().

You mean parent BdrvChild objects? Then this question applies as well to pre-patch code.

So, we just call bdrv_check_perm() for bs in bdrv_reopen_multiple(). I think the answer is like this:

If state->perm and state->shared_perm are different from the actual cumulative permissions (before reopen), then we must
have the parent(s) of the node in the same bs_queue. Then, the corresponding children are updated as part
of another bdrv_check_perm() call from the same loop in bdrv_reopen_multiple().

Let's check how state->perm and state->shared_perm are set:

bdrv_reopen_queue_child()

     /* This needs to be overwritten in bdrv_reopen_prepare() */
     bs_entry->state.perm = UINT64_MAX;
     bs_entry->state.shared_perm = 0;


...
  
bdrv_reopen_prepare()

        bdrv_reopen_perm(queue, reopen_state->bs,
                      &reopen_state->perm, &reopen_state->shared_perm);

and bdrv_reopen_perm() calculates cumulative permissions, taking permissions from the queue for parents which exist in the queue.

I'm not sure how correct this is, keeping in mind that we may look at a node in the queue for which bdrv_reopen_perm() was not yet called, but the idea is clear.
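
As a rough model of that calculation (a simplified standalone sketch, not
the real bdrv_reopen_perm(); ParentEdge, cumulative_perm() and the PERM_*
bits are made-up names): the used permissions of all parents are OR-ed,
the shared ones are AND-ed, and for a parent that is in the queue the
queued values are taken instead of the currently applied ones.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical permission bits, for illustration only. */
#define PERM_READ   UINT64_C(0x01)
#define PERM_WRITE  UINT64_C(0x02)
#define PERM_ALL    UINT64_C(0x1f)

/* One parent edge, with its applied and (optionally) queued values. */
typedef struct ParentEdge {
    uint64_t perm;              /* currently applied */
    uint64_t shared_perm;       /* currently applied */
    bool in_queue;              /* parent takes part in the reopen */
    uint64_t new_perm;          /* valid only if in_queue */
    uint64_t new_shared_perm;   /* valid only if in_queue */
} ParentEdge;

/*
 * Used permissions accumulate with OR, shared permissions with AND.
 * With no parents the result is "nothing used, everything shared".
 */
static void cumulative_perm(const ParentEdge *parents, size_t n,
                            uint64_t *perm, uint64_t *shared)
{
    uint64_t cumulative_perms = 0;
    uint64_t cumulative_shared = PERM_ALL;
    size_t i;

    assert(perm && shared);

    for (i = 0; i < n; i++) {
        const ParentEdge *p = &parents[i];

        cumulative_perms |= p->in_queue ? p->new_perm : p->perm;
        cumulative_shared &= p->in_queue ? p->new_shared_perm
                                         : p->shared_perm;
    }

    *perm = cumulative_perms;
    *shared = cumulative_shared;
}
```

In this model the concern above corresponds to reading new_perm of a
queued parent before anything has filled it in.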

> 
>> It follows the old behavior. The only thing that changes is that
>> pre-patch we do DFS recursion starting from bs (and probably visit
>> some nodes several times), while after the patch we first do a
>> topological sort of the bs subtree and then go through the list. The
>> order of nodes is better and we visit each node once.
> 
> It's not the only thing that changes. Maybe this is what makes the patch
> hard to understand, because it seems to do two steps at once:
> 
> 1. Change the order in which nodes are processed
> 
> 2. Replace bdrv_check_update_perm() with bdrv_check_parents_compliance()

Hmm, yes. But we call bdrv_check_parents_compliance() only for nodes inside the subtree, i.e. for all nodes except the root.
So, for them the permissions are already updated.

> 
> In step 2, the point I mentioned above is important (new permissions
> must already be set in the BdrvChild objects).
> 
> The switch to bdrv_check_parents_compliance() also means that error
> messages become a bit worse because we don't know any more which of the
> conflicting nodes is the new one, so we can't provide two different
> messages any more. This is probably unavoidable, though.
> 
>>>
>>>> +            bdrv_get_cumulative_perm(bs, &cumulative_perms,
>>>> +                                     &cumulative_shared_perms);
>>>> +        }
>>>> -        cur_ignore_children = g_slist_prepend(g_slist_copy(ignore_children), c);
>>>> -        ret = bdrv_check_update_perm(c->bs, q, cur_perm, cur_shared,
>>>> -                                     cur_ignore_children, errp);
>>>> -        g_slist_free(cur_ignore_children);
>>>> +        ret = bdrv_node_check_perm(bs, q, cumulative_perms,
>>>> +                                   cumulative_shared_perms,
>>>> +                                   ignore_children, errp);
>>>
>>> We use the original ignore_children for every node in the sorted list.
>>> The old code extends it with all nodes in the path to each node.
>>>
>>> For the bdrv_check_update_perm() call that is now replaced with
>>> bdrv_check_parents_compliance(), I think this was necessary because
>>> bdrv_check_update_perm() always assumes adding a new edge, so if you
>>> update one instead of adding it, you have to ignore it so that it can't
>>> conflict with itself. This isn't necessary any more now because we just
>>> update and then check for consistency.
>>>
>>> For passing to bdrv_node_check_perm() it doesn't make a difference
>>> anyway because the parameter is now unused (and should probably be
>>> removed).
>>
>> ignore_children will be dropped in [27]. For now it is still needed
>> for bdrv_replace_node_common
> 
> In bdrv_node_check_perm(), it's already unused after this patch. But
> fair enough.
> 
> Kevin
> 


-- 
Best regards,
Vladimir



* Re: [PATCH v2 19/36] block: fix bdrv_replace_node_common
  2020-11-27 14:45 ` [PATCH v2 19/36] block: fix bdrv_replace_node_common Vladimir Sementsov-Ogievskiy
@ 2021-02-03 18:23   ` Kevin Wolf
  2021-02-04  7:24     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 108+ messages in thread
From: Kevin Wolf @ 2021-02-03 18:23 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: qemu-block, armbru, qemu-devel, mreitz, den, jsnow

Am 27.11.2020 um 15:45 hat Vladimir Sementsov-Ogievskiy geschrieben:
> The ignore_children mechanism doesn't help to track all propagated
> permissions of the children we want to ignore. The simplest way to
> correctly update permissions is to update the graph first and then do
> the permission update. In this case we just refresh permissions for the
> whole subgraph (in topological-sort defined order) and everything is
> correctly calculated automatically without any ignore_children.
> 
> So, refactor bdrv_replace_node_common to first do graph update and then
> refresh the permissions.
> 
> Test test_parallel_exclusive_write() now passes, so move it out of
> debugging "if".
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

> diff --git a/tests/test-bdrv-graph-mod.c b/tests/test-bdrv-graph-mod.c
> index 0d62e05ddb..93a5941a9b 100644
> --- a/tests/test-bdrv-graph-mod.c
> +++ b/tests/test-bdrv-graph-mod.c
> @@ -294,20 +294,11 @@ static void test_parallel_perm_update(void)
>      bdrv_child_refresh_perms(top, top->children.lh_first, &error_abort);
>  
>      assert(c_fl1->perm & BLK_PERM_WRITE);
> +    bdrv_unref(top);
>  }

Why do we have this addition in this patch? Shouldn't the changed function
behave the same as before with respect to referenced nodes?

Kevin




* Re: [PATCH v2 15/36] block: use topological sort for permission update
  2021-01-28 18:04         ` Vladimir Sementsov-Ogievskiy
@ 2021-02-03 18:38           ` Kevin Wolf
  2021-02-04  7:16             ` Vladimir Sementsov-Ogievskiy
  2021-03-10 11:08             ` Vladimir Sementsov-Ogievskiy
  0 siblings, 2 replies; 108+ messages in thread
From: Kevin Wolf @ 2021-02-03 18:38 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: qemu-block, armbru, qemu-devel, mreitz, den, jsnow

Am 28.01.2021 um 19:04 hat Vladimir Sementsov-Ogievskiy geschrieben:
> 28.01.2021 20:13, Kevin Wolf wrote:
> > Am 28.01.2021 um 10:34 hat Vladimir Sementsov-Ogievskiy geschrieben:
> > > 27.01.2021 21:38, Kevin Wolf wrote:
> > > > Am 27.11.2020 um 15:45 hat Vladimir Sementsov-Ogievskiy geschrieben:
> > > > > -static int bdrv_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
> > > > > -                           uint64_t cumulative_perms,
> > > > > -                           uint64_t cumulative_shared_perms,
> > > > > -                           GSList *ignore_children, Error **errp)
> > > > > +static int bdrv_node_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
> > > > > +                                uint64_t cumulative_perms,
> > > > > +                                uint64_t cumulative_shared_perms,
> > > > > +                                GSList *ignore_children, Error **errp)
> > > > >    {
> > > > >        BlockDriver *drv = bs->drv;
> > > > >        BdrvChild *c;
> > > > > @@ -2166,21 +2193,43 @@ static int bdrv_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
> > > > >        /* Check all children */
> > > > >        QLIST_FOREACH(c, &bs->children, next) {
> > > > >            uint64_t cur_perm, cur_shared;
> > > > > -        GSList *cur_ignore_children;
> > > > >            bdrv_child_perm(bs, c->bs, c, c->role, q,
> > > > >                            cumulative_perms, cumulative_shared_perms,
> > > > >                            &cur_perm, &cur_shared);
> > > > > +        bdrv_child_set_perm_safe(c, cur_perm, cur_shared, NULL);
> > > > 
> > > > This "added" line is actually old code. What is removed here is the
> > > > recursive call of bdrv_check_update_perm(). This is what the code below
> > > > will have to replace.
> > > 
> > > yes, we'll use explicit loop instead of recursion
> > > 
> > > > 
> > > > > +    }
> > > > > +
> > > > > +    return 0;
> > > > > +}
> > > > > +
> > > > > +static int bdrv_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
> > > > > +                           uint64_t cumulative_perms,
> > > > > +                           uint64_t cumulative_shared_perms,
> > > > > +                           GSList *ignore_children, Error **errp)
> > > > > +{
> > > > > +    int ret;
> > > > > +    BlockDriverState *root = bs;
> > > > > +    g_autoptr(GSList) list = bdrv_topological_dfs(NULL, NULL, root);
> > > > > +
> > > > > +    for ( ; list; list = list->next) {
> > > > > +        bs = list->data;
> > > > > +
> > > > > +        if (bs != root) {
> > > > > +            if (!bdrv_check_parents_compliance(bs, ignore_children, errp)) {
> > > > > +                return -EINVAL;
> > > > > +            }
> > > > 
> > > > At this point bs still had the old permissions, but we don't access
> > > > them. As we're going in topological order, the parents have already been
> > > > updated if they were a child covered in bdrv_node_check_perm(), so we're
> > > > checking the relevant values. Good.
> > > > 
> > > > What about the root node? If I understand correctly, the parents of the
> > > > root nodes wouldn't have been checked in the old code. In the new state,
> > > > the parent BdrvChild already has to contain the new permission.
> > > > 
> > > > In bdrv_refresh_perms(), we already check parent conflicts, so no change
> > > > for all callers going through it. Good.
> > > > 
> > > > bdrv_reopen_multiple() is less obvious. It passes permissions from the
> > > > BDRVReopenState, without applying the permissions first.
> > > 
> > > It will be changed in the series
> > > 
> > > > Do we check the
> > > > old parent permissions instead of the new state here?
> > > 
> > > We use given (new) cumulative permissions for bs, and recalculate
> > > permissions for bs subtree.
> > 
> > Where do we actually set them? I would expect a
> > bdrv_child_set_perm_safe() call somewhere, but I can't see it in the
> > call path from bdrv_reopen_multiple().
> 
> You mean parent BdrvChild objects? Then this question applies as well
> to pre-patch code.

I don't think so. The pre-patch code doesn't rely on the permissions
already being set in the BdrvChild object, but it gets them passed in
parameters. Changing the graph first and relying on the information in
BdrvChild is the new approach that you're introducing.

> So, we just call bdrv_check_perm() for bs in bdrv_reopen_multiple(). I
> think the answer is like this:
> 
> If state->perm and state->shared_perm are different from the actual
> cumulative permissions (before reopen), then we must have the
> parent(s) of the node in the same bs_queue. Then, the corresponding
> children are updated as part of another bdrv_check_perm() call from the
> same loop in bdrv_reopen_multiple().
> 
> Let's check how state->perm and state->shared_perm are set:
> 
> bdrv_reopen_queue_child()
> 
>     /* This needs to be overwritten in bdrv_reopen_prepare() */
>     bs_entry->state.perm = UINT64_MAX;
>     bs_entry->state.shared_perm = 0;
> 
> 
> ...
> bdrv_reopen_prepare()
> 
>        bdrv_reopen_perm(queue, reopen_state->bs,
>                      &reopen_state->perm, &reopen_state->shared_perm);
> 
> and bdrv_reopen_perm() calculates cumulative permissions, taking
> permissions from the queue for parents which exist in the queue.

Right, but it stores the new permissions in reopen_state, not in the
BdrvChild objects that this patch is looking at. Or am I missing
something?

> I'm not sure how correct this is, keeping in mind that we may look at
> a node in the queue for which bdrv_reopen_perm() was not yet called,
> but the idea is clear.

I don't think the above code can work correctly without something
actually updating the BdrvChild first.

> > > It follows the old behavior. The only thing that changes is that
> > > pre-patch we do DFS recursion starting from bs (and probably visit
> > > some nodes several times), while after the patch we first do a
> > > topological sort of the bs subtree and then go through the list. The
> > > order of nodes is better and we visit each node once.
> > 
> > It's not the only thing that changes. Maybe this is what makes the patch
> > hard to understand, because it seems to do two steps at once:
> > 
> > 1. Change the order in which nodes are processed
> > 
> > 2. Replace bdrv_check_update_perm() with bdrv_check_parents_compliance()
> 
> > Hmm, yes. But we call bdrv_check_parents_compliance() only for nodes
> > inside the subtree, i.e. for all nodes except the root. So, for them
> > the permissions are already updated.

Ah! This might be the missing piece that makes it safe.

Maybe worth a comment?

Kevin




* Re: [PATCH v2 20/36] block: add bdrv_attach_child_common() transaction action
  2020-11-27 14:45 ` [PATCH v2 20/36] block: add bdrv_attach_child_common() transaction action Vladimir Sementsov-Ogievskiy
@ 2021-02-03 21:01   ` Kevin Wolf
  2021-02-04  7:34     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 108+ messages in thread
From: Kevin Wolf @ 2021-02-03 21:01 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: qemu-block, armbru, qemu-devel, mreitz, den, jsnow

Am 27.11.2020 um 15:45 hat Vladimir Sementsov-Ogievskiy geschrieben:
> Split out no-perm part of bdrv_root_attach_child() into separate
> transaction action. bdrv_root_attach_child() now moves to new
> permission update paradigm: first update graph relations then update
> permissions.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>  block.c | 162 ++++++++++++++++++++++++++++++++++++++++----------------
>  1 file changed, 117 insertions(+), 45 deletions(-)
> 
> diff --git a/block.c b/block.c
> index f0fcd75555..a7ccbb4fb1 100644
> --- a/block.c
> +++ b/block.c
> @@ -86,6 +86,13 @@ static void bdrv_parent_set_aio_context_ignore(BdrvChild *c, AioContext *ctx,
>                                                 GSList **ignore);
>  static void bdrv_replace_child_noperm(BdrvChild *child,
>                                        BlockDriverState *new_bs);
> +static int bdrv_attach_child_common(BlockDriverState *child_bs,
> +                                    const char *child_name,
> +                                    const BdrvChildClass *child_class,
> +                                    BdrvChildRole child_role,
> +                                    uint64_t perm, uint64_t shared_perm,
> +                                    void *opaque, BdrvChild **child,
> +                                    GSList **tran, Error **errp);

If you added the new code above bdrv_root_attach_child(), we wouldn't
need the forward declaration and the patch would probably be simpler to
read (because it's the first part of bdrv_root_attach_child() that is
factored out).

>  static int bdrv_reopen_prepare(BDRVReopenState *reopen_state, BlockReopenQueue
>                                 *queue, Error **errp);
> @@ -2898,55 +2905,22 @@ BdrvChild *bdrv_root_attach_child(BlockDriverState *child_bs,
>                                    uint64_t perm, uint64_t shared_perm,
>                                    void *opaque, Error **errp)
>  {
> -    BdrvChild *child;
> -    Error *local_err = NULL;
>      int ret;
> -    AioContext *ctx;
> +    BdrvChild *child = NULL;
> +    GSList *tran = NULL;
>  
> -    ret = bdrv_check_update_perm(child_bs, NULL, perm, shared_perm, NULL, errp);
> +    ret = bdrv_attach_child_common(child_bs, child_name, child_class,
> +                                   child_role, perm, shared_perm, opaque,
> +                                   &child, &tran, errp);
>      if (ret < 0) {
> -        bdrv_abort_perm_update(child_bs);
>          bdrv_unref(child_bs);
>          return NULL;
>      }
>  
> -    child = g_new(BdrvChild, 1);
> -    *child = (BdrvChild) {
> -        .bs             = NULL,
> -        .name           = g_strdup(child_name),
> -        .klass          = child_class,
> -        .role           = child_role,
> -        .perm           = perm,
> -        .shared_perm    = shared_perm,
> -        .opaque         = opaque,
> -    };
> -
> -    ctx = bdrv_child_get_parent_aio_context(child);
> -
> -    /* If the AioContexts don't match, first try to move the subtree of
> -     * child_bs into the AioContext of the new parent. If this doesn't work,
> -     * try moving the parent into the AioContext of child_bs instead. */
> -    if (bdrv_get_aio_context(child_bs) != ctx) {
> -        ret = bdrv_try_set_aio_context(child_bs, ctx, &local_err);
> -        if (ret < 0) {
> -            if (bdrv_parent_try_set_aio_context(child, ctx, NULL) == 0) {
> -                ret = 0;
> -                error_free(local_err);
> -                local_err = NULL;
> -            }
> -        }
> -        if (ret < 0) {
> -            error_propagate(errp, local_err);
> -            g_free(child);
> -            bdrv_abort_perm_update(child_bs);
> -            bdrv_unref(child_bs);
> -            return NULL;
> -        }
> -    }
> -
> -    /* This performs the matching bdrv_set_perm() for the above check. */
> -    bdrv_replace_child(child, child_bs);
> +    ret = bdrv_refresh_perms(child_bs, errp);
> +    tran_finalize(tran, ret);
>  
> +    bdrv_unref(child_bs);
>      return child;
>  }
>  
> @@ -2988,16 +2962,114 @@ BdrvChild *bdrv_attach_child(BlockDriverState *parent_bs,
>      return child;
>  }
>  
> -static void bdrv_detach_child(BdrvChild *child)
> +static void bdrv_remove_empty_child(BdrvChild *child)
>  {
> +    assert(!child->bs);
>      QLIST_SAFE_REMOVE(child, next);
> -
> -    bdrv_replace_child(child, NULL);
> -
>      g_free(child->name);
>      g_free(child);
>  }
>  
> +typedef struct BdrvAttachChildCommonState {
> +    BdrvChild **child;
> +    AioContext *old_parent_ctx;
> +    AioContext *old_child_ctx;
> +} BdrvAttachChildCommonState;
> +
> +static void bdrv_attach_child_common_abort(void *opaque)
> +{
> +    BdrvAttachChildCommonState *s = opaque;
> +    BdrvChild *child = *s->child;
> +    BlockDriverState *bs = child->bs;
> +
> +    bdrv_replace_child_noperm(child, NULL);
> +
> +    if (bdrv_get_aio_context(bs) != s->old_child_ctx) {
> +        bdrv_try_set_aio_context(bs, s->old_child_ctx, &error_abort);

Would failure actually be fatal? I think we can ignore it, the node is
in an AioContext that works for it.

> +    }
> +
> +    if (bdrv_child_get_parent_aio_context(child) != s->old_parent_ctx) {
> +        bdrv_parent_try_set_aio_context(child, s->old_parent_ctx,
> +                                        &error_abort);

And the same here.

> +    }
> +
> +    bdrv_unref(bs);
> +    bdrv_remove_empty_child(child);
> +    *s->child = NULL;
> +}
> +
> +static TransactionActionDrv bdrv_attach_child_common_drv = {
> +    .abort = bdrv_attach_child_common_abort,
> +};
> +
> +/*
> + * Common part of attaching bdrv child to bs or to blk or to job
> + */
> +static int bdrv_attach_child_common(BlockDriverState *child_bs,
> +                                    const char *child_name,
> +                                    const BdrvChildClass *child_class,
> +                                    BdrvChildRole child_role,
> +                                    uint64_t perm, uint64_t shared_perm,
> +                                    void *opaque, BdrvChild **child,
> +                                    GSList **tran, Error **errp)
> +{
> +    int ret;
> +    BdrvChild *new_child;
> +    AioContext *parent_ctx;
> +    AioContext *child_ctx = bdrv_get_aio_context(child_bs);
> +
> +    assert(child);
> +    assert(*child == NULL);
> +
> +    new_child = g_new(BdrvChild, 1);
> +    *new_child = (BdrvChild) {
> +        .bs             = NULL,
> +        .name           = g_strdup(child_name),
> +        .klass          = child_class,
> +        .role           = child_role,
> +        .perm           = perm,
> +        .shared_perm    = shared_perm,
> +        .opaque         = opaque,
> +    };
> +
> +    parent_ctx = bdrv_child_get_parent_aio_context(new_child);
> +    if (child_ctx != parent_ctx) {
> +        ret = bdrv_try_set_aio_context(child_bs, parent_ctx, NULL);
> +        if (ret < 0) {
> +            /*
> +             * bdrv_try_set_aio_context_tran don't need rollback after failure,
> +             * so we don't care.
> +             */
> +            ret = bdrv_parent_try_set_aio_context(new_child, child_ctx, errp);
> +        }
> +        if (ret < 0) {
> +            bdrv_remove_empty_child(new_child);
> +            return ret;
> +        }
> +    }

Not sure why you decided to rewrite this block while moving it from
bdrv_root_attach_child().

We're losing the comment above it, and a possible error message is now
related to changing the context of the parent node instead of the newly
added node, which I imagine is less obvious in the general case.

> +    bdrv_ref(child_bs);
> +    bdrv_replace_child_noperm(new_child, child_bs);
> +
> +    *child = new_child;
> +
> +    BdrvAttachChildCommonState *s = g_new(BdrvAttachChildCommonState, 1);
> +    *s = (BdrvAttachChildCommonState) {
> +        .child = child,
> +        .old_parent_ctx = parent_ctx,
> +        .old_child_ctx = child_ctx,
> +    };
> +    tran_prepend(tran, &bdrv_attach_child_common_drv, s);
> +
> +    return 0;
> +}

Kevin




* Re: [PATCH v2 22/36] block: split out bdrv_replace_node_noperm()
  2020-11-27 14:45 ` [PATCH v2 22/36] block: split out bdrv_replace_node_noperm() Vladimir Sementsov-Ogievskiy
@ 2021-02-03 21:16   ` Kevin Wolf
  0 siblings, 0 replies; 108+ messages in thread
From: Kevin Wolf @ 2021-02-03 21:16 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: qemu-block, armbru, qemu-devel, mreitz, den, jsnow

Am 27.11.2020 um 15:45 hat Vladimir Sementsov-Ogievskiy geschrieben:
> Split part of bdrv_replace_node_common() to be used separately.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

> @@ -4909,7 +4936,6 @@ static int bdrv_replace_node_common(BlockDriverState *from,
>                                      bool auto_skip, Error **errp)
>  {
>      int ret = -EPERM;
> -    BdrvChild *c, *next;
>      GSList *tran = NULL;
>      g_autoptr(GHashTable) found = NULL;
>      g_autoptr(GSList) refresh_list = NULL;

ret doesn't need to be initialised any more now.

Kevin




* Re: [PATCH v2 23/36] block: adapt bdrv_append() for inserting filters
  2020-11-27 14:45 ` [PATCH v2 23/36] block: adapt bdrv_append() for inserting filters Vladimir Sementsov-Ogievskiy
@ 2021-02-03 21:33   ` Kevin Wolf
  2021-02-04  8:30     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 108+ messages in thread
From: Kevin Wolf @ 2021-02-03 21:33 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: qemu-block, armbru, qemu-devel, mreitz, den, jsnow

Am 27.11.2020 um 15:45 hat Vladimir Sementsov-Ogievskiy geschrieben:
> bdrv_append is not very good for inserting filters: it does extra
> permission update as part of bdrv_set_backing_hd(). During this update
> filter may conflict with other parents of top_bs.
> 
> Instead, let's first do all graph modifications and after it update
> permissions.

This sounds like it fixes a bug. If so, should we have a test like for
the other cases fixed by this series?

> Note: bdrv_append() is still only works for backing-child based
> filters. It's something to improve later.
> 
> It simplifies the fact that bdrv_append() used to append new nodes,
> without backing child. Let's add an assertion.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>  block.c | 28 +++++++++++++++++-----------
>  1 file changed, 17 insertions(+), 11 deletions(-)
> 
> diff --git a/block.c b/block.c
> index 02da1a90bc..7094922509 100644
> --- a/block.c
> +++ b/block.c
> @@ -4998,22 +4998,28 @@ int bdrv_replace_node(BlockDriverState *from, BlockDriverState *to,
>  int bdrv_append(BlockDriverState *bs_new, BlockDriverState *bs_top,
>                  Error **errp)
>  {
> -    Error *local_err = NULL;
> +    int ret;
> +    GSList *tran = NULL;
>  
> -    bdrv_set_backing_hd(bs_new, bs_top, &local_err);
> -    if (local_err) {
> -        error_propagate(errp, local_err);
> -        return -EPERM;
> +    assert(!bs_new->backing);
> +
> +    ret = bdrv_attach_child_noperm(bs_new, bs_top, "backing",
> +                                   &child_of_bds, bdrv_backing_role(bs_new),
> +                                   &bs_new->backing, &tran, errp);
> +    if (ret < 0) {
> +        goto out;
>      }

I don't think changing bs->backing without bdrv_set_backing_hd() is
correct at the moment. We lose a few things:

1. The bdrv_is_backing_chain_frozen() check
2. Updating backing_hd->inherits_from if necessary
3. bdrv_refresh_limits()

If I'm not missing anything, all of these are needed in the context of
bdrv_append().

> -    bdrv_replace_node(bs_top, bs_new, &local_err);
> -    if (local_err) {
> -        error_propagate(errp, local_err);
> -        bdrv_set_backing_hd(bs_new, NULL, &error_abort);
> -        return -EPERM;
> +    ret = bdrv_replace_node_noperm(bs_top, bs_new, true, &tran, errp);
> +    if (ret < 0) {
> +        goto out;
>      }
>  
> -    return 0;
> +    ret = bdrv_refresh_perms(bs_new, errp);
> +out:
> +    tran_finalize(tran, ret);
> +
> +    return ret;
>  }

Kevin




* Re: [PATCH v2 15/36] block: use topological sort for permission update
  2021-02-03 18:38           ` Kevin Wolf
@ 2021-02-04  7:16             ` Vladimir Sementsov-Ogievskiy
  2021-03-10 11:08             ` Vladimir Sementsov-Ogievskiy
  1 sibling, 0 replies; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-02-04  7:16 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, qemu-devel, armbru, jsnow, mreitz, den

03.02.2021 21:38, Kevin Wolf wrote:
> Am 28.01.2021 um 19:04 hat Vladimir Sementsov-Ogievskiy geschrieben:
>> 28.01.2021 20:13, Kevin Wolf wrote:
>>> Am 28.01.2021 um 10:34 hat Vladimir Sementsov-Ogievskiy geschrieben:
>>>> 27.01.2021 21:38, Kevin Wolf wrote:
>>>>> Am 27.11.2020 um 15:45 hat Vladimir Sementsov-Ogievskiy geschrieben:
>>>>>> -static int bdrv_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
>>>>>> -                           uint64_t cumulative_perms,
>>>>>> -                           uint64_t cumulative_shared_perms,
>>>>>> -                           GSList *ignore_children, Error **errp)
>>>>>> +static int bdrv_node_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
>>>>>> +                                uint64_t cumulative_perms,
>>>>>> +                                uint64_t cumulative_shared_perms,
>>>>>> +                                GSList *ignore_children, Error **errp)
>>>>>>     {
>>>>>>         BlockDriver *drv = bs->drv;
>>>>>>         BdrvChild *c;
>>>>>> @@ -2166,21 +2193,43 @@ static int bdrv_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
>>>>>>         /* Check all children */
>>>>>>         QLIST_FOREACH(c, &bs->children, next) {
>>>>>>             uint64_t cur_perm, cur_shared;
>>>>>> -        GSList *cur_ignore_children;
>>>>>>             bdrv_child_perm(bs, c->bs, c, c->role, q,
>>>>>>                             cumulative_perms, cumulative_shared_perms,
>>>>>>                             &cur_perm, &cur_shared);
>>>>>> +        bdrv_child_set_perm_safe(c, cur_perm, cur_shared, NULL);
>>>>>
>>>>> This "added" line is actually old code. What is removed here is the
>>>>> recursive call of bdrv_check_update_perm(). This is what the code below
>>>>> will have to replace.
>>>>
>>>> yes, we'll use explicit loop instead of recursion
>>>>
>>>>>
>>>>>> +    }
>>>>>> +
>>>>>> +    return 0;
>>>>>> +}
>>>>>> +
>>>>>> +static int bdrv_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
>>>>>> +                           uint64_t cumulative_perms,
>>>>>> +                           uint64_t cumulative_shared_perms,
>>>>>> +                           GSList *ignore_children, Error **errp)
>>>>>> +{
>>>>>> +    int ret;
>>>>>> +    BlockDriverState *root = bs;
>>>>>> +    g_autoptr(GSList) list = bdrv_topological_dfs(NULL, NULL, root);
>>>>>> +
>>>>>> +    for ( ; list; list = list->next) {
>>>>>> +        bs = list->data;
>>>>>> +
>>>>>> +        if (bs != root) {
>>>>>> +            if (!bdrv_check_parents_compliance(bs, ignore_children, errp)) {
>>>>>> +                return -EINVAL;
>>>>>> +            }
>>>>>
>>>>> At this point bs still had the old permissions, but we don't access
>>>>> them. As we're going in topological order, the parents have already been
>>>>> updated if they were a child covered in bdrv_node_check_perm(), so we're
>>>>> checking the relevant values. Good.
>>>>>
>>>>> What about the root node? If I understand correctly, the parents of the
>>>>> root nodes wouldn't have been checked in the old code. In the new state,
>>>>> the parent BdrvChild already has to contain the new permission.
>>>>>
>>>>> In bdrv_refresh_perms(), we already check parent conflicts, so no change
>>>>> for all callers going through it. Good.
>>>>>
>>>>> bdrv_reopen_multiple() is less obvious. It passes permissions from the
>>>>> BDRVReopenState, without applying the permissions first.
>>>>
>>>> It will be changed in the series
>>>>
>>>>> Do we check the
>>>>> old parent permissions instead of the new state here?
>>>>
>>>> We use given (new) cumulative permissions for bs, and recalculate
>>>> permissions for bs subtree.
>>>
>>> Where do we actually set them? I would expect a
>>> bdrv_child_set_perm_safe() call somewhere, but I can't see it in the
>>> call path from bdrv_reopen_multiple().
>>
>> You mean parent BdrvChild objects? Then this question applies as well
>> to pre-patch code.
> 
> I don't think so. The pre-patch code doesn't rely on the permissions
> already being set in the BdrvChild object, but it gets them passed in
> parameters. Changing the graph first and relying on the information in
> BdrvChild is the new approach that you're introducing.
> 
>> So, we just call bdrv_check_perm() for bs in bdrv_reopen_multiple.. I
>> think the answer is like this:
>>
>> if state->perm and state->shared_perm are different from actual
>> cumulative permissions (before reopne), then we must have the
>> parent(s) of the node in same bs_queue. Then, corresponding children
>> are updated as part of another bdrv_check_perm call from same loop in
>> bdrv_reopen_multiple().
>>
>> Let's check how state->perm and state->shared_perm are set:
>>
>> bdrv_reopen_queue_child()
>>
>>      /* This needs to be overwritten in bdrv_reopen_prepare() */
>>      bs_entry->state.perm = UINT64_MAX;
>>      bs_entry->state.shared_perm = 0;
>>
>>
>> ...
>> bdrv_reopen_prepare()
>>
>>         bdrv_reopen_perm(queue, reopen_state->bs,
>>                       &reopen_state->perm, &reopen_state->shared_perm);
>>
>> and bdrv_reopen_perm() calculate cumulative permissions, taking
>> permissions from the queue, for parents which exists in queue.
> 
> Right, but it stores the new permissions in reopen_state, not in the
> BdrvChild objects that this patch is looking it. Or am I missing
> something?
> 
>> Not sure how much it correct, keeping in mind that we may look at a
>> node in queue, for which bdrv_reopen_perm was not yet called, but the
>> idea is clean.
> 
> I don't think the above code can work correctly without something
> actually updating the BdrvChild first.
> 
>>>> It follows old behavior. The only thing is changed that pre-patch we
>>>> do DFS recursion starting from bs (and probably visit some nodes
>>>> several times), after-patch we first do topological sort of bs subtree
>>>> and go through the list. The order of nodes is better and we visit
>>>> each node once.
>>>
>>> It's not the only thing that changes. Maybe this is what makes the patch
>>> hard to understand, because it seems to do two steps at once:
>>>
>>> 1. Change the order in which nodes are processed
>>>
>>> 2. Replace bdrv_check_update_perm() with bdrv_check_parents_compliance()
>>
>> hmm, yes. But we do bdrv_check_parents_compliance() only for nodes
>> inside subtree, for all except root.  So, for them we have updated
>> permissions.
> 
> Ah! This might be the missing piece that makes it safe.
> 
> Maybe worth a comment?
> 

Will add

-- 
Best regards,
Vladimir
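
The ordering argument in this subthread (update a node only after all of its parents have been updated) can be illustrated with a minimal, self-contained sketch. It is not QEMU code: the Graph and Order types and all helper names are hypothetical, and fixed-size arrays stand in for the GSList that bdrv_topological_dfs() builds in the patch. The sketch records nodes in DFS post-order and then reverses that list, so every node reachable from the root appears after all of its parents.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_NODES 8

/* Hypothetical toy graph: children[i][j] is the j-th child index of node i. */
typedef struct Graph {
    int n_children[MAX_NODES];
    int children[MAX_NODES][MAX_NODES];
} Graph;

/* Resulting processing order (node indices). */
typedef struct Order {
    int nodes[MAX_NODES];
    int len;
} Order;

static void graph_init(Graph *g)
{
    memset(g, 0, sizeof(*g));
}

static void graph_add_edge(Graph *g, int parent, int child)
{
    assert(parent >= 0 && parent < MAX_NODES);
    assert(child >= 0 && child < MAX_NODES);
    assert(g->n_children[parent] < MAX_NODES);
    g->children[parent][g->n_children[parent]++] = child;
}

/* Post-order DFS: a node is recorded only after its whole subtree. */
static void dfs_post(const Graph *g, int node, bool *seen, int *post, int *n)
{
    if (seen[node]) {
        return;
    }
    seen[node] = true;
    for (int i = 0; i < g->n_children[node]; i++) {
        dfs_post(g, g->children[node][i], seen, post, n);
    }
    post[(*n)++] = node;
}

/* Reverse post-order: each node comes after all of its reachable parents. */
static Order topological_order(const Graph *g, int root)
{
    bool seen[MAX_NODES] = { false };
    int post[MAX_NODES];
    int n = 0;
    Order o = { .len = 0 };

    dfs_post(g, root, seen, post, &n);
    for (int i = n - 1; i >= 0; i--) {
        o.nodes[o.len++] = post[i];
    }
    return o;
}

/* Position of a node in the order, or -1 if it was not reached. */
static int order_pos(const Order *o, int node)
{
    for (int i = 0; i < o->len; i++) {
        if (o->nodes[i] == node) {
            return i;
        }
    }
    return -1;
}
```

For a diamond-shaped graph (0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3, edges added in that order) this yields 0, 2, 1, 3, whereas a plain pre-order DFS would reach node 3 through node 1 before node 2 has been processed.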



* Re: [PATCH v2 19/36] block: fix bdrv_replace_node_common
  2021-02-03 18:23   ` Kevin Wolf
@ 2021-02-04  7:24     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-02-04  7:24 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, qemu-devel, armbru, jsnow, mreitz, den

03.02.2021 21:23, Kevin Wolf wrote:
> Am 27.11.2020 um 15:45 hat Vladimir Sementsov-Ogievskiy geschrieben:
>> inore_children thing doesn't help to track all propagated permissions
>> of children we want to ignore. The simplest way to correctly update
>> permissions is update graph first and then do permission update. In
>> this case we just referesh permissions for the whole subgraph (in
>> topological-sort defined order) and everything is correctly calculated
>> automatically without any ignore_children.
>>
>> So, refactor bdrv_replace_node_common to first do graph update and then
>> refresh the permissions.
>>
>> Test test_parallel_exclusive_write() now pass, so move it out of
>> debugging "if".
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> 
>> diff --git a/tests/test-bdrv-graph-mod.c b/tests/test-bdrv-graph-mod.c
>> index 0d62e05ddb..93a5941a9b 100644
>> --- a/tests/test-bdrv-graph-mod.c
>> +++ b/tests/test-bdrv-graph-mod.c
>> @@ -294,20 +294,11 @@ static void test_parallel_perm_update(void)
>>       bdrv_child_refresh_perms(top, top->children.lh_first, &error_abort);
>>   
>>       assert(c_fl1->perm & BLK_PERM_WRITE);
>> +    bdrv_unref(top);
>>   }
> 
> Why do have this addition in this patch? Shouldn't the changed function
> behave the same as before with respect to referenced nodes?
> 

Hmm, this looks like an accidental fixup that should be squashed into the original commit, or just a mistake. I'll check when preparing the next version.


-- 
Best regards,
Vladimir



* Re: [PATCH v2 20/36] block: add bdrv_attach_child_common() transaction action
  2021-02-03 21:01   ` Kevin Wolf
@ 2021-02-04  7:34     ` Vladimir Sementsov-Ogievskiy
  2021-02-04  7:50       ` Kevin Wolf
  0 siblings, 1 reply; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-02-04  7:34 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, qemu-devel, armbru, jsnow, mreitz, den

04.02.2021 00:01, Kevin Wolf wrote:
> Am 27.11.2020 um 15:45 hat Vladimir Sementsov-Ogievskiy geschrieben:
>> Split out no-perm part of bdrv_root_attach_child() into separate
>> transaction action. bdrv_root_attach_child() now moves to new
>> permission update paradigm: first update graph relations then update
>> permissions.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> ---
>>   block.c | 162 ++++++++++++++++++++++++++++++++++++++++----------------
>>   1 file changed, 117 insertions(+), 45 deletions(-)
>>
>> diff --git a/block.c b/block.c
>> index f0fcd75555..a7ccbb4fb1 100644
>> --- a/block.c
>> +++ b/block.c
>> @@ -86,6 +86,13 @@ static void bdrv_parent_set_aio_context_ignore(BdrvChild *c, AioContext *ctx,
>>                                                  GSList **ignore);
>>   static void bdrv_replace_child_noperm(BdrvChild *child,
>>                                         BlockDriverState *new_bs);
>> +static int bdrv_attach_child_common(BlockDriverState *child_bs,
>> +                                    const char *child_name,
>> +                                    const BdrvChildClass *child_class,
>> +                                    BdrvChildRole child_role,
>> +                                    uint64_t perm, uint64_t shared_perm,
>> +                                    void *opaque, BdrvChild **child,
>> +                                    GSList **tran, Error **errp);
> 
> If you added the new code above bdrv_root_attach_child(), we wouldn't
> need the forward declaration and the patch would probably be simpler to
> read (because it's the first part of bdrv_root_attach_child() that is
> factored out).
> 
>>   static int bdrv_reopen_prepare(BDRVReopenState *reopen_state, BlockReopenQueue
>>                                  *queue, Error **errp);
>> @@ -2898,55 +2905,22 @@ BdrvChild *bdrv_root_attach_child(BlockDriverState *child_bs,
>>                                     uint64_t perm, uint64_t shared_perm,
>>                                     void *opaque, Error **errp)
>>   {
>> -    BdrvChild *child;
>> -    Error *local_err = NULL;
>>       int ret;
>> -    AioContext *ctx;
>> +    BdrvChild *child = NULL;
>> +    GSList *tran = NULL;
>>   
>> -    ret = bdrv_check_update_perm(child_bs, NULL, perm, shared_perm, NULL, errp);
>> +    ret = bdrv_attach_child_common(child_bs, child_name, child_class,
>> +                                   child_role, perm, shared_perm, opaque,
>> +                                   &child, &tran, errp);
>>       if (ret < 0) {
>> -        bdrv_abort_perm_update(child_bs);
>>           bdrv_unref(child_bs);
>>           return NULL;
>>       }
>>   
>> -    child = g_new(BdrvChild, 1);
>> -    *child = (BdrvChild) {
>> -        .bs             = NULL,
>> -        .name           = g_strdup(child_name),
>> -        .klass          = child_class,
>> -        .role           = child_role,
>> -        .perm           = perm,
>> -        .shared_perm    = shared_perm,
>> -        .opaque         = opaque,
>> -    };
>> -
>> -    ctx = bdrv_child_get_parent_aio_context(child);
>> -
>> -    /* If the AioContexts don't match, first try to move the subtree of
>> -     * child_bs into the AioContext of the new parent. If this doesn't work,
>> -     * try moving the parent into the AioContext of child_bs instead. */
>> -    if (bdrv_get_aio_context(child_bs) != ctx) {
>> -        ret = bdrv_try_set_aio_context(child_bs, ctx, &local_err);
>> -        if (ret < 0) {
>> -            if (bdrv_parent_try_set_aio_context(child, ctx, NULL) == 0) {
>> -                ret = 0;
>> -                error_free(local_err);
>> -                local_err = NULL;
>> -            }
>> -        }
>> -        if (ret < 0) {
>> -            error_propagate(errp, local_err);
>> -            g_free(child);
>> -            bdrv_abort_perm_update(child_bs);
>> -            bdrv_unref(child_bs);
>> -            return NULL;
>> -        }
>> -    }
>> -
>> -    /* This performs the matching bdrv_set_perm() for the above check. */
>> -    bdrv_replace_child(child, child_bs);
>> +    ret = bdrv_refresh_perms(child_bs, errp);
>> +    tran_finalize(tran, ret);
>>   
>> +    bdrv_unref(child_bs);
>>       return child;
>>   }
>>   
>> @@ -2988,16 +2962,114 @@ BdrvChild *bdrv_attach_child(BlockDriverState *parent_bs,
>>       return child;
>>   }
>>   
>> -static void bdrv_detach_child(BdrvChild *child)
>> +static void bdrv_remove_empty_child(BdrvChild *child)
>>   {
>> +    assert(!child->bs);
>>       QLIST_SAFE_REMOVE(child, next);
>> -
>> -    bdrv_replace_child(child, NULL);
>> -
>>       g_free(child->name);
>>       g_free(child);
>>   }
>>   
>> +typedef struct BdrvAttachChildCommonState {
>> +    BdrvChild **child;
>> +    AioContext *old_parent_ctx;
>> +    AioContext *old_child_ctx;
>> +} BdrvAttachChildCommonState;
>> +
>> +static void bdrv_attach_child_common_abort(void *opaque)
>> +{
>> +    BdrvAttachChildCommonState *s = opaque;
>> +    BdrvChild *child = *s->child;
>> +    BlockDriverState *bs = child->bs;
>> +
>> +    bdrv_replace_child_noperm(child, NULL);
>> +
>> +    if (bdrv_get_aio_context(bs) != s->old_child_ctx) {
>> +        bdrv_try_set_aio_context(bs, s->old_child_ctx, &error_abort);
> 
> Would failure actually be fatal? I think we can ignore it, the node is
> in an AioContext that works for it.

As far as I explored the code, the aio-context check is transparent enough: nothing relies on I/O, etc., so if we succeeded in changing the context, we must succeed in reverting it.

And as I understand it, this is critical: if we fail to roll back the aio-context change somewhere (but succeed in reverting the graph relation change), we end up with different aio contexts inside one block subtree.

> 
>> +    }
>> +
>> +    if (bdrv_child_get_parent_aio_context(child) != s->old_parent_ctx) {
>> +        bdrv_parent_try_set_aio_context(child, s->old_parent_ctx,
>> +                                        &error_abort);
> 
> And the same here.
> 
>> +    }
>> +
>> +    bdrv_unref(bs);
>> +    bdrv_remove_empty_child(child);
>> +    *s->child = NULL;
>> +}
>> +
>> +static TransactionActionDrv bdrv_attach_child_common_drv = {
>> +    .abort = bdrv_attach_child_common_abort,
>> +};
>> +
>> +/*
>> + * Common part of attoching bdrv child to bs or to blk or to job
>> + */
>> +static int bdrv_attach_child_common(BlockDriverState *child_bs,
>> +                                    const char *child_name,
>> +                                    const BdrvChildClass *child_class,
>> +                                    BdrvChildRole child_role,
>> +                                    uint64_t perm, uint64_t shared_perm,
>> +                                    void *opaque, BdrvChild **child,
>> +                                    GSList **tran, Error **errp)
>> +{
>> +    int ret;
>> +    BdrvChild *new_child;
>> +    AioContext *parent_ctx;
>> +    AioContext *child_ctx = bdrv_get_aio_context(child_bs);
>> +
>> +    assert(child);
>> +    assert(*child == NULL);
>> +
>> +    new_child = g_new(BdrvChild, 1);
>> +    *new_child = (BdrvChild) {
>> +        .bs             = NULL,
>> +        .name           = g_strdup(child_name),
>> +        .klass          = child_class,
>> +        .role           = child_role,
>> +        .perm           = perm,
>> +        .shared_perm    = shared_perm,
>> +        .opaque         = opaque,
>> +    };
>> +
>> +    parent_ctx = bdrv_child_get_parent_aio_context(new_child);
>> +    if (child_ctx != parent_ctx) {
>> +        ret = bdrv_try_set_aio_context(child_bs, parent_ctx, NULL);
>> +        if (ret < 0) {
>> +            /*
>> +             * bdrv_try_set_aio_context_tran don't need rollback after failure,
>> +             * so we don't care.
>> +             */
>> +            ret = bdrv_parent_try_set_aio_context(new_child, child_ctx, errp);
>> +        }
>> +        if (ret < 0) {
>> +            bdrv_remove_empty_child(new_child);
>> +            return ret;
>> +        }
>> +    }
> 
> Not sure why you decided to rewrite this block while moving it from
> bdrv_root_attach_child().
> 
> We're losing the comment above it, and a possible error message is now
> related to changing the context of the parent node instead of the newly
> added node, which I imagine is less obvious in the general case.

I don't remember :( I'll try to revert it, and if I find that the rewrite is really needed, I'll leave a good comment explaining it.

> 
>> +    bdrv_ref(child_bs);
>> +    bdrv_replace_child_noperm(new_child, child_bs);
>> +
>> +    *child = new_child;
>> +
>> +    BdrvAttachChildCommonState *s = g_new(BdrvAttachChildCommonState, 1);
>> +    *s = (BdrvAttachChildCommonState) {
>> +        .child = child,
>> +        .old_parent_ctx = parent_ctx,
>> +        .old_child_ctx = child_ctx,
>> +    };
>> +    tran_prepend(tran, &bdrv_attach_child_common_drv, s);
>> +
>> +    return 0;
>> +}
> 
> Kevin
> 


-- 
Best regards,
Vladimir
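
The comment that the patch drops from bdrv_root_attach_child() describes a two-step fallback: first try to move the child subtree into the parent's AioContext and, if that fails, try to move the parent into the child's AioContext. A minimal sketch of that logic, including the error-reporting behaviour raised in the review (on total failure, report the error from the first attempt, i.e. about the newly added node), could look as follows. It is only an illustration under stated assumptions: ToyNode, toy_try_set_ctx() and toy_unify_ctx() are hypothetical names, contexts are plain integers, and the "movable" flag stands in for whatever can make a real context change fail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a node that lives in some context. */
typedef struct ToyNode {
    int ctx;        /* current context id */
    bool movable;   /* assumption: whether the node can change context */
} ToyNode;

/* Try to move a node; on failure store a message in *err (if non-NULL). */
static int toy_try_set_ctx(ToyNode *n, int ctx, const char *what,
                           const char **err)
{
    if (!n->movable) {
        if (err) {
            *err = what;
        }
        return -1;
    }
    n->ctx = ctx;
    return 0;
}

/*
 * Make parent and child share one context.  Prefer moving the child;
 * fall back to moving the parent.  If both fail, report the error of the
 * first attempt (the one about the child).
 */
static int toy_unify_ctx(ToyNode *parent, ToyNode *child, const char **err)
{
    const char *local_err = NULL;
    int ret;

    if (child->ctx == parent->ctx) {
        return 0;
    }

    ret = toy_try_set_ctx(child, parent->ctx, "cannot move child node",
                          &local_err);
    if (ret < 0) {
        /* The error of the fallback is deliberately ignored. */
        if (toy_try_set_ctx(parent, child->ctx, "cannot move parent node",
                            NULL) == 0) {
            ret = 0;
            local_err = NULL;
        }
    }
    if (ret < 0 && err) {
        *err = local_err;
    }
    return ret;
}
```

Keeping the first error in a local variable and discarding it only when the fallback succeeds is what keeps the reported message tied to the newly added node.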



* Re: [PATCH v2 20/36] block: add bdrv_attach_child_common() transaction action
  2021-02-04  7:34     ` Vladimir Sementsov-Ogievskiy
@ 2021-02-04  7:50       ` Kevin Wolf
  0 siblings, 0 replies; 108+ messages in thread
From: Kevin Wolf @ 2021-02-04  7:50 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: qemu-block, armbru, qemu-devel, mreitz, den, jsnow

Am 04.02.2021 um 08:34 hat Vladimir Sementsov-Ogievskiy geschrieben:
> 04.02.2021 00:01, Kevin Wolf wrote:
> > Am 27.11.2020 um 15:45 hat Vladimir Sementsov-Ogievskiy geschrieben:
> > > Split out no-perm part of bdrv_root_attach_child() into separate
> > > transaction action. bdrv_root_attach_child() now moves to new
> > > permission update paradigm: first update graph relations then update
> > > permissions.
> > > 
> > > Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

> > > +static void bdrv_attach_child_common_abort(void *opaque)
> > > +{
> > > +    BdrvAttachChildCommonState *s = opaque;
> > > +    BdrvChild *child = *s->child;
> > > +    BlockDriverState *bs = child->bs;
> > > +
> > > +    bdrv_replace_child_noperm(child, NULL);
> > > +
> > > +    if (bdrv_get_aio_context(bs) != s->old_child_ctx) {
> > > +        bdrv_try_set_aio_context(bs, s->old_child_ctx, &error_abort);
> > 
> > Would failure actually be fatal? I think we can ignore it, the node is
> > in an AioContext that works for it.
> 
> As far as I explored the code, check-aio-context is transparent
> enough, nothing rely on IO, etc, and if we succeeded to change it we
> must success in revert.
> 
> And as I understand it is critical: if we failed to rollback
> aio-context change somewhere (but succeeded in reverting graph
> relation change), it means that we end up with different aio contexts
> inside one block subtree..

Ah, right, we're going to change the graph once again, so what is
working now doesn't have to be working for the changed graph.

Ok, let's leave this as &error_abort.

Kevin
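
The rollback scheme discussed here (each graph change registers an abort action, and finalizing with a negative return value undoes the changes) can be sketched in a few lines. This is a toy model and not the series' implementation: ToyTran, toy_tran_prepend(), toy_tran_finalize() and set_int() are hypothetical names, a hand-written singly linked list replaces the GSList used in the patch, and a plain integer assignment stands in for a graph modification. Because actions are prepended, aborting walks them in reverse order of registration.

```c
#include <assert.h>
#include <stdlib.h>

/* One registered action: how to undo a single change. */
typedef struct ToyTranAction {
    void (*abort)(void *opaque);
    void *opaque;
    struct ToyTranAction *next;
} ToyTranAction;

typedef struct ToyTran {
    ToyTranAction *head;
} ToyTran;

/* Prepend, so that the most recent change is undone first. */
static void toy_tran_prepend(ToyTran *tran, void (*abort_fn)(void *),
                             void *opaque)
{
    ToyTranAction *a = malloc(sizeof(*a));

    assert(a);
    a->abort = abort_fn;
    a->opaque = opaque;
    a->next = tran->head;
    tran->head = a;
}

/* ret < 0: roll back every change; otherwise keep them. Frees the list. */
static void toy_tran_finalize(ToyTran *tran, int ret)
{
    ToyTranAction *a = tran->head;

    while (a) {
        ToyTranAction *next = a->next;

        if (ret < 0) {
            a->abort(a->opaque);
        }
        free(a->opaque);
        free(a);
        a = next;
    }
    tran->head = NULL;
}

/* Example action: assign an int and remember the old value. */
typedef struct SetIntState {
    int *target;
    int old_value;
} SetIntState;

static void set_int_abort(void *opaque)
{
    SetIntState *s = opaque;

    *s->target = s->old_value;
}

static void set_int(ToyTran *tran, int *target, int value)
{
    SetIntState *s = malloc(sizeof(*s));

    assert(s);
    s->target = target;
    s->old_value = *target;
    *target = value;
    toy_tran_prepend(tran, set_int_abort, s);
}
```

The reverse order matters when two actions touch the same state: if a value is changed from 1 to 2 and then from 2 to 3, undoing the second change before the first restores 1, while undoing them in registration order would leave 2 behind.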




* Re: [PATCH v2 23/36] block: adapt bdrv_append() for inserting filters
  2021-02-03 21:33   ` Kevin Wolf
@ 2021-02-04  8:30     ` Vladimir Sementsov-Ogievskiy
  2021-02-04  9:05       ` Kevin Wolf
  0 siblings, 1 reply; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-02-04  8:30 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, qemu-devel, armbru, jsnow, mreitz, den

04.02.2021 00:33, Kevin Wolf wrote:
> Am 27.11.2020 um 15:45 hat Vladimir Sementsov-Ogievskiy geschrieben:
>> bdrv_append is not very good for inserting filters: it does extra
>> permission update as part of bdrv_set_backing_hd(). During this update
>> filter may conflict with other parents of top_bs.
>>
>> Instead, let's first do all graph modifications and after it update
>> permissions.
> 
> This sounds like it fixes a bug. If so, should we have a test like for
> the other cases fixed by this series?

Hm. I considered it more of a limitation than a bug: we just have to work around it with the "inactive" mode of filters. But adding a test is a good idea anyway. Will do.

> 
>> Note: bdrv_append() is still only works for backing-child based
>> filters. It's something to improve later.
>>
>> It simplifies the fact that bdrv_append() used to append new nodes,
>> without backing child. Let's add an assertion.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> ---
>>   block.c | 28 +++++++++++++++++-----------
>>   1 file changed, 17 insertions(+), 11 deletions(-)
>>
>> diff --git a/block.c b/block.c
>> index 02da1a90bc..7094922509 100644
>> --- a/block.c
>> +++ b/block.c
>> @@ -4998,22 +4998,28 @@ int bdrv_replace_node(BlockDriverState *from, BlockDriverState *to,
>>   int bdrv_append(BlockDriverState *bs_new, BlockDriverState *bs_top,
>>                   Error **errp)
>>   {
>> -    Error *local_err = NULL;
>> +    int ret;
>> +    GSList *tran = NULL;
>>   
>> -    bdrv_set_backing_hd(bs_new, bs_top, &local_err);
>> -    if (local_err) {
>> -        error_propagate(errp, local_err);
>> -        return -EPERM;
>> +    assert(!bs_new->backing);
>> +
>> +    ret = bdrv_attach_child_noperm(bs_new, bs_top, "backing",
>> +                                   &child_of_bds, bdrv_backing_role(bs_new),
>> +                                   &bs_new->backing, &tran, errp);
>> +    if (ret < 0) {
>> +        goto out;
>>       }
> 
> I don't think changing bs->backing without bdrv_set_backing_hd() is
> correct at the moment. We lose a few things:
> 
> 1. The bdrv_is_backing_chain_frozen() check
> 2. Updating backing_hd->inherits_from if necessary
> 3. bdrv_refresh_limits()
> 
> If I'm not missing anything, all of these are needed in the context of
> bdrv_append().

I decided that bdrv_append() is only for appending new nodes, so the frozen and inherits_from checks are not needed. And I've added assert(!bs_new->backing)...

Checking this now:

- appending filters is obvious
- bdrv_append_temp_snapshot() creates a new qcow2 node based on a tmp file; I don't see any backing initialization (and it would be rather strange)
- external_snapshot_prepare() does check: if (bdrv_cow_child(state->new_bs)) {  error-out }

So everything is OK. I should describe this in the commit message and add a comment to bdrv_append().

> 
>> -    bdrv_replace_node(bs_top, bs_new, &local_err);
>> -    if (local_err) {
>> -        error_propagate(errp, local_err);
>> -        bdrv_set_backing_hd(bs_new, NULL, &error_abort);
>> -        return -EPERM;
>> +    ret = bdrv_replace_node_noperm(bs_top, bs_new, true, &tran, errp);
>> +    if (ret < 0) {
>> +        goto out;
>>       }
>>   
>> -    return 0;
>> +    ret = bdrv_refresh_perms(bs_new, errp);
>> +out:
>> +    tran_finalize(tran, ret);
>> +
>> +    return ret;
>>   }
> 
> Kevin
> 


-- 
Best regards,
Vladimir



* Re: [PATCH v2 23/36] block: adapt bdrv_append() for inserting filters
  2021-02-04  8:30     ` Vladimir Sementsov-Ogievskiy
@ 2021-02-04  9:05       ` Kevin Wolf
  2021-02-04 11:54         ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 108+ messages in thread
From: Kevin Wolf @ 2021-02-04  9:05 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: qemu-block, armbru, qemu-devel, mreitz, den, jsnow

Am 04.02.2021 um 09:30 hat Vladimir Sementsov-Ogievskiy geschrieben:
> 04.02.2021 00:33, Kevin Wolf wrote:
> > Am 27.11.2020 um 15:45 hat Vladimir Sementsov-Ogievskiy geschrieben:
> > >   int bdrv_append(BlockDriverState *bs_new, BlockDriverState *bs_top,
> > >                   Error **errp)
> > >   {
> > > -    Error *local_err = NULL;
> > > +    int ret;
> > > +    GSList *tran = NULL;
> > > -    bdrv_set_backing_hd(bs_new, bs_top, &local_err);
> > > -    if (local_err) {
> > > -        error_propagate(errp, local_err);
> > > -        return -EPERM;
> > > +    assert(!bs_new->backing);
> > > +
> > > +    ret = bdrv_attach_child_noperm(bs_new, bs_top, "backing",
> > > +                                   &child_of_bds, bdrv_backing_role(bs_new),
> > > +                                   &bs_new->backing, &tran, errp);
> > > +    if (ret < 0) {
> > > +        goto out;
> > >       }
> > 
> > I don't think changing bs->backing without bdrv_set_backing_hd() is
> > correct at the moment. We lose a few things:
> > 
> > 1. The bdrv_is_backing_chain_frozen() check
> > 2. Updating backing_hd->inherits_from if necessary
> > 3. bdrv_refresh_limits()
> > 
> > If I'm not missing anything, all of these are needed in the context of
> > bdrv_append().
> 
> I decided that bdrv_append() is only for appending new nodes, so
> frozen and inherts_from checks are not needed. And I've added
> assert(!bs_new->backing)...
> 
> Checking this now:
> 
> - appending filters is obvious
> - bdrv_append_temp_snapshot() creates new qcow2 node based on tmp
>   file, don't see any backing initialization (and it would be rather
>   strange)

Yes, the internal uses are obviously unproblematic for the frozen check.

> - external_snapshot_prepare() do check if
>   (bdrv_cow_child(state->new_bs)) {  error-out }

Ok, the only thing bdrv_set_backing_hd() can and must check is whether
the link to the old backing file was frozen, and we know that we don't
have an old backing file. Makes sense.

Same thing for inherits_from, we only do this if the new backing
file (i.e. the old active layer for bdrv_append) was already in the
backing chain of the new node.

> So everything is OK. I should describe it in commit message and add a
> comment to bdrv_append.

What about bdrv_refresh_limits()? The node gains a new backing file, so
I think the limits could change.

Ideally, bdrv_child_cb_attach/detach() would take care of this, but at
the moment they don't.

Kevin




* Re: [PATCH v2 25/36] block: introduce bdrv_drop_filter()
  2020-11-27 14:45 ` [PATCH v2 25/36] block: introduce bdrv_drop_filter() Vladimir Sementsov-Ogievskiy
@ 2021-02-04 11:31   ` Kevin Wolf
  2021-02-04 12:27     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 108+ messages in thread
From: Kevin Wolf @ 2021-02-04 11:31 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: qemu-block, armbru, qemu-devel, mreitz, den, jsnow

Am 27.11.2020 um 15:45 hat Vladimir Sementsov-Ogievskiy geschrieben:
> Using bdrv_replace_node() for removing filter is not good enough: it
> keeps child reference of the filter, which may conflict with original
> top node during permission update.
> 
> Instead let's create new interface, which will do all graph
> modifications first and then update permissions.
> 
> Let's modify bdrv_replace_node_common(), allowing it additionally drop
> backing chain child link pointing to new node. This is quite
> appropriate for bdrv_drop_intermediate() and makes possible to add
> new bdrv_drop_filter() as a simple wrapper.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>  include/block/block.h |  1 +
>  block.c               | 42 ++++++++++++++++++++++++++++++++++++++----
>  2 files changed, 39 insertions(+), 4 deletions(-)
> 
> diff --git a/include/block/block.h b/include/block/block.h
> index 8f6100dad7..0f21ef313f 100644
> --- a/include/block/block.h
> +++ b/include/block/block.h
> @@ -348,6 +348,7 @@ int bdrv_append(BlockDriverState *bs_new, BlockDriverState *bs_top,
>                  Error **errp);
>  int bdrv_replace_node(BlockDriverState *from, BlockDriverState *to,
>                        Error **errp);
> +int bdrv_drop_filter(BlockDriverState *bs, Error **errp);
>  
>  int bdrv_parse_aio(const char *mode, int *flags);
>  int bdrv_parse_cache_mode(const char *mode, int *flags, bool *writethrough);
> diff --git a/block.c b/block.c
> index b1394b721c..e835a78f06 100644
> --- a/block.c
> +++ b/block.c
> @@ -4919,7 +4919,6 @@ static TransactionActionDrv bdrv_remove_backing_drv = {
>      .commit = bdrv_child_free,
>  };
>  
> -__attribute__((unused))
>  static void bdrv_remove_backing(BlockDriverState *bs, GSList **tran)
>  {
>      if (!bs->backing) {
> @@ -4968,15 +4967,30 @@ static int bdrv_replace_node_noperm(BlockDriverState *from,
>   *
>   * With auto_skip=false the error is returned if from has a parent which should
>   * not be updated.
> + *
> + * With detach_subchain to must be in a backing chain of from. In this case

@to and @from make it easier to read.

> + * backing link of the cow-parent of @to is removed.
>   */
>  static int bdrv_replace_node_common(BlockDriverState *from,
>                                      BlockDriverState *to,
> -                                    bool auto_skip, Error **errp)
> +                                    bool auto_skip, bool detach_subchain,
> +                                    Error **errp)
>  {
>      int ret = -EPERM;
>      GSList *tran = NULL;
>      g_autoptr(GHashTable) found = NULL;
>      g_autoptr(GSList) refresh_list = NULL;
> +    BlockDriverState *to_cow_parent;
> +
> +    if (detach_subchain) {
> +        assert(bdrv_chain_contains(from, to));

The loop below also relies on from != to, so maybe assert that, too.

> +        for (to_cow_parent = from;
> +             bdrv_filter_or_cow_bs(to_cow_parent) != to;
> +             to_cow_parent = bdrv_filter_or_cow_bs(to_cow_parent))
> +        {
> +            ;
> +        }
> +    }
>  
>      /* Make sure that @from doesn't go away until we have successfully attached
>       * all of its parents to @to. */
> @@ -4997,6 +5011,10 @@ static int bdrv_replace_node_common(BlockDriverState *from,
>          goto out;
>      }
>  
> +    if (detach_subchain) {
> +        bdrv_remove_backing(to_cow_parent, &tran);
> +    }

So bdrv_drop_filter() only works for filters that go through
bs->backing?

Wouldn't it have been more useful to make it bdrv_remove_filter_or_cow()
like you already use in other places in this patch?

If not, the limitation needs to be documented for bdrv_drop_filter().
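
To illustrate the from != to remark above, here is a toy sketch in plain
C. ToyNode and toy_find_cow_parent() are made-up names standing in for
BlockDriverState and the loop in the patch; nothing here is QEMU API,
and the NULL check in the loop is an addition for the sketch only. The
walk starts at @from and only ever compares children against @to, so it
cannot find a match when @from == @to:

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a block node; NOT QEMU's BlockDriverState. */
typedef struct ToyNode {
    struct ToyNode *child;   /* plays the role of the filter-or-COW child */
} ToyNode;

/*
 * Return the node in the chain starting at @from whose child is @to,
 * or NULL if @to is not below @from.
 */
static ToyNode *toy_find_cow_parent(ToyNode *from, ToyNode *to)
{
    ToyNode *p;

    /* The extra precondition suggested in the review comment. */
    assert(from != to);

    /* Same shape as the patch's loop, plus a NULL check for the toy. */
    for (p = from; p && p->child != to; p = p->child) {
        ;
    }
    return p;
}
```

With a three-node chain top -> mid -> base, the helper returns mid for
(top, base) and top for (top, mid).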

Kevin




* Re: [PATCH v2 23/36] block: adapt bdrv_append() for inserting filters
  2021-02-04  9:05       ` Kevin Wolf
@ 2021-02-04 11:54         ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-02-04 11:54 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, qemu-devel, armbru, jsnow, mreitz, den

04.02.2021 12:05, Kevin Wolf wrote:
> Am 04.02.2021 um 09:30 hat Vladimir Sementsov-Ogievskiy geschrieben:
>> 04.02.2021 00:33, Kevin Wolf wrote:
>>> Am 27.11.2020 um 15:45 hat Vladimir Sementsov-Ogievskiy geschrieben:
>>>>    int bdrv_append(BlockDriverState *bs_new, BlockDriverState *bs_top,
>>>>                    Error **errp)
>>>>    {
>>>> -    Error *local_err = NULL;
>>>> +    int ret;
>>>> +    GSList *tran = NULL;
>>>> -    bdrv_set_backing_hd(bs_new, bs_top, &local_err);
>>>> -    if (local_err) {
>>>> -        error_propagate(errp, local_err);
>>>> -        return -EPERM;
>>>> +    assert(!bs_new->backing);
>>>> +
>>>> +    ret = bdrv_attach_child_noperm(bs_new, bs_top, "backing",
>>>> +                                   &child_of_bds, bdrv_backing_role(bs_new),
>>>> +                                   &bs_new->backing, &tran, errp);
>>>> +    if (ret < 0) {
>>>> +        goto out;
>>>>        }
>>>
>>> I don't think changing bs->backing without bdrv_set_backing_hd() is
>>> correct at the moment. We lose a few things:
>>>
>>> 1. The bdrv_is_backing_chain_frozen() check
>>> 2. Updating backing_hd->inherits_from if necessary
>>> 3. bdrv_refresh_limits()
>>>
>>> If I'm not missing anything, all of these are needed in the context of
>>> bdrv_append().
>>
>> I decided that bdrv_append() is only for appending new nodes, so
>> frozen and inherts_from checks are not needed. And I've added
>> assert(!bs_new->backing)...
>>
>> Checking this now:
>>
>> - appending filters is obvious
>> - bdrv_append_temp_snapshot() creates new qcow2 node based on tmp
>>    file, don't see any backing initialization (and it would be rather
>>    strange)
> 
> Yes, the internal uses are obviously unproblematic for the frozen check.
> 
>> - external_snapshot_prepare() do check if
>>    (bdrv_cow_child(state->new_bs)) {  error-out }
> 
> Ok, the only thing bdrv_set_backing_hd() can and must check is whether
> the link to the old backing file was frozen, and we know that we don't
> have an old backing file. Makes sense.
> 
> Same thing for inherits_from, we only do this if the the new backing
> file (i.e. the old active layer for bdrv_append) was already in the
> backing chain of the new node.
> 
>> So everything is OK. I should describe it in commit message and add a
>> comment to bdrv_append.
> 
> What about bdrv_refresh_limits()? The node gains a new backing file, so
> I think the limits could change.
> 
> Ideally, bdrv_child_cb_attach/detach() would take care of this, but at
> the moment they don't.
> 

When answering, I thought that it was called at the end of the function. But I both forgot to write that in the answer and was wrong :) What is actually called there is bdrv_refresh_perms(). I'll add a call to bdrv_refresh_limits().


-- 
Best regards,
Vladimir



* Re: [PATCH v2 26/36] block/backup-top: drop .active
  2020-11-27 14:45 ` [PATCH v2 26/36] block/backup-top: drop .active Vladimir Sementsov-Ogievskiy
@ 2021-02-04 12:26   ` Kevin Wolf
  2021-02-04 12:33     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 108+ messages in thread
From: Kevin Wolf @ 2021-02-04 12:26 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: qemu-block, armbru, qemu-devel, mreitz, den, jsnow

Am 27.11.2020 um 15:45 hat Vladimir Sementsov-Ogievskiy geschrieben:
> We don't need this workaround anymore: bdrv_append is already smart
> enough and we can use new bdrv_drop_filter().
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>  block/backup-top.c         | 38 +-------------------------------------
>  tests/qemu-iotests/283.out |  2 +-
>  2 files changed, 2 insertions(+), 38 deletions(-)
> 
> diff --git a/block/backup-top.c b/block/backup-top.c
> index 650ed6195c..84eb73aeb7 100644
> --- a/block/backup-top.c
> +++ b/block/backup-top.c
> @@ -37,7 +37,6 @@
>  typedef struct BDRVBackupTopState {
>      BlockCopyState *bcs;
>      BdrvChild *target;
> -    bool active;
>      int64_t cluster_size;
>  } BDRVBackupTopState;
>  
> @@ -127,21 +126,6 @@ static void backup_top_child_perm(BlockDriverState *bs, BdrvChild *c,
>                                    uint64_t perm, uint64_t shared,
>                                    uint64_t *nperm, uint64_t *nshared)
>  {
> -    BDRVBackupTopState *s = bs->opaque;
> -
> -    if (!s->active) {
> -        /*
> -         * The filter node may be in process of bdrv_append(), which firstly do
> -         * bdrv_set_backing_hd() and then bdrv_replace_node(). This means that
> -         * we can't unshare BLK_PERM_WRITE during bdrv_append() operation. So,
> -         * let's require nothing during bdrv_append() and refresh permissions
> -         * after it (see bdrv_backup_top_append()).
> -         */
> -        *nperm = 0;
> -        *nshared = BLK_PERM_ALL;
> -        return;
> -    }
> -
>      if (!(role & BDRV_CHILD_FILTERED)) {
>          /*
>           * Target child
> @@ -229,18 +213,6 @@ BlockDriverState *bdrv_backup_top_append(BlockDriverState *source,
>      }
>      appended = true;
>  
> -    /*
> -     * bdrv_append() finished successfully, now we can require permissions
> -     * we want.
> -     */
> -    state->active = true;
> -    bdrv_child_refresh_perms(top, top->backing, &local_err);

bdrv_append() uses bdrv_refresh_perms() for the whole node. Is it doing
unnecessary extra work there and should really do the same as backup-top
did here, i.e. bdrv_child_refresh_perms(bs_new->backing)?

(Really a comment for an earlier patch. This patch itself looks fine.)

Kevin




* Re: [PATCH v2 25/36] block: introduce bdrv_drop_filter()
  2021-02-04 11:31   ` Kevin Wolf
@ 2021-02-04 12:27     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-02-04 12:27 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, qemu-devel, armbru, jsnow, mreitz, den

04.02.2021 14:31, Kevin Wolf wrote:
> Am 27.11.2020 um 15:45 hat Vladimir Sementsov-Ogievskiy geschrieben:
>> Using bdrv_replace_node() for removing filter is not good enough: it
>> keeps child reference of the filter, which may conflict with original
>> top node during permission update.
>>
>> Instead let's create new interface, which will do all graph
>> modifications first and then update permissions.
>>
>> Let's modify bdrv_replace_node_common(), allowing it additionally drop
>> backing chain child link pointing to new node. This is quite
>> appropriate for bdrv_drop_intermediate() and makes possible to add
>> new bdrv_drop_filter() as a simple wrapper.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> ---
>>   include/block/block.h |  1 +
>>   block.c               | 42 ++++++++++++++++++++++++++++++++++++++----
>>   2 files changed, 39 insertions(+), 4 deletions(-)
>>
>> diff --git a/include/block/block.h b/include/block/block.h
>> index 8f6100dad7..0f21ef313f 100644
>> --- a/include/block/block.h
>> +++ b/include/block/block.h
>> @@ -348,6 +348,7 @@ int bdrv_append(BlockDriverState *bs_new, BlockDriverState *bs_top,
>>                   Error **errp);
>>   int bdrv_replace_node(BlockDriverState *from, BlockDriverState *to,
>>                         Error **errp);
>> +int bdrv_drop_filter(BlockDriverState *bs, Error **errp);
>>   
>>   int bdrv_parse_aio(const char *mode, int *flags);
>>   int bdrv_parse_cache_mode(const char *mode, int *flags, bool *writethrough);
>> diff --git a/block.c b/block.c
>> index b1394b721c..e835a78f06 100644
>> --- a/block.c
>> +++ b/block.c
>> @@ -4919,7 +4919,6 @@ static TransactionActionDrv bdrv_remove_backing_drv = {
>>       .commit = bdrv_child_free,
>>   };
>>   
>> -__attribute__((unused))
>>   static void bdrv_remove_backing(BlockDriverState *bs, GSList **tran)
>>   {
>>       if (!bs->backing) {
>> @@ -4968,15 +4967,30 @@ static int bdrv_replace_node_noperm(BlockDriverState *from,
>>    *
>>    * With auto_skip=false the error is returned if from has a parent which should
>>    * not be updated.
>> + *
>> + * With detach_subchain to must be in a backing chain of from. In this case
> 
> @to and @from make it easier to read.
> 
>> + * backing link of the cow-parent of @to is removed.
>>    */
>>   static int bdrv_replace_node_common(BlockDriverState *from,
>>                                       BlockDriverState *to,
>> -                                    bool auto_skip, Error **errp)
>> +                                    bool auto_skip, bool detach_subchain,
>> +                                    Error **errp)
>>   {
>>       int ret = -EPERM;
>>       GSList *tran = NULL;
>>       g_autoptr(GHashTable) found = NULL;
>>       g_autoptr(GSList) refresh_list = NULL;
>> +    BlockDriverState *to_cow_parent;
>> +
>> +    if (detach_subchain) {
>> +        assert(bdrv_chain_contains(from, to));
> 
> The loop below also relies on from != to, so maybe assert that, too.
> 
>> +        for (to_cow_parent = from;
>> +             bdrv_filter_or_cow_bs(to_cow_parent) != to;
>> +             to_cow_parent = bdrv_filter_or_cow_bs(to_cow_parent))
>> +        {
>> +            ;
>> +        }
>> +    }
>>   
>>       /* Make sure that @from doesn't go away until we have successfully attached
>>        * all of its parents to @to. */
>> @@ -4997,6 +5011,10 @@ static int bdrv_replace_node_common(BlockDriverState *from,
>>           goto out;
>>       }
>>   
>> +    if (detach_subchain) {
>> +        bdrv_remove_backing(to_cow_parent, &tran);
>> +    }
> 
> So bdrv_drop_filter() only works for filters that go through
> bs->backing?
> 
> Wouldn't it have been more useful to make it bdrv_remove_filter_or_cow()
> like you use already use in other places in this patch?
> 
> If not, the limitation needs to be documented for bdrv_drop_filter().
> 

bdrv_append() supports only bs->backing based filters too, so for now it's enough. We'll probably refactor it all again in the future when we implement the multi-reopen QMP command. I'll look at it when preparing the new version and either improve it or document the limitation.


-- 
Best regards,
Vladimir



* Re: [PATCH v2 26/36] block/backup-top: drop .active
  2021-02-04 12:26   ` Kevin Wolf
@ 2021-02-04 12:33     ` Vladimir Sementsov-Ogievskiy
  2021-02-04 13:25       ` Kevin Wolf
  0 siblings, 1 reply; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-02-04 12:33 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, qemu-devel, armbru, jsnow, mreitz, den

04.02.2021 15:26, Kevin Wolf wrote:
> Am 27.11.2020 um 15:45 hat Vladimir Sementsov-Ogievskiy geschrieben:
>> We don't need this workaround anymore: bdrv_append is already smart
>> enough and we can use new bdrv_drop_filter().
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> ---
>>   block/backup-top.c         | 38 +-------------------------------------
>>   tests/qemu-iotests/283.out |  2 +-
>>   2 files changed, 2 insertions(+), 38 deletions(-)
>>
>> diff --git a/block/backup-top.c b/block/backup-top.c
>> index 650ed6195c..84eb73aeb7 100644
>> --- a/block/backup-top.c
>> +++ b/block/backup-top.c
>> @@ -37,7 +37,6 @@
>>   typedef struct BDRVBackupTopState {
>>       BlockCopyState *bcs;
>>       BdrvChild *target;
>> -    bool active;
>>       int64_t cluster_size;
>>   } BDRVBackupTopState;
>>   
>> @@ -127,21 +126,6 @@ static void backup_top_child_perm(BlockDriverState *bs, BdrvChild *c,
>>                                     uint64_t perm, uint64_t shared,
>>                                     uint64_t *nperm, uint64_t *nshared)
>>   {
>> -    BDRVBackupTopState *s = bs->opaque;
>> -
>> -    if (!s->active) {
>> -        /*
>> -         * The filter node may be in process of bdrv_append(), which firstly do
>> -         * bdrv_set_backing_hd() and then bdrv_replace_node(). This means that
>> -         * we can't unshare BLK_PERM_WRITE during bdrv_append() operation. So,
>> -         * let's require nothing during bdrv_append() and refresh permissions
>> -         * after it (see bdrv_backup_top_append()).
>> -         */
>> -        *nperm = 0;
>> -        *nshared = BLK_PERM_ALL;
>> -        return;
>> -    }
>> -
>>       if (!(role & BDRV_CHILD_FILTERED)) {
>>           /*
>>            * Target child
>> @@ -229,18 +213,6 @@ BlockDriverState *bdrv_backup_top_append(BlockDriverState *source,
>>       }
>>       appended = true;
>>   
>> -    /*
>> -     * bdrv_append() finished successfully, now we can require permissions
>> -     * we want.
>> -     */
>> -    state->active = true;
>> -    bdrv_child_refresh_perms(top, top->backing, &local_err);
> 
> bdrv_append() uses bdrv_refresh_perms() for the whole node. Is it doing
> unnecessary extra work there and should really do the same as backup-top
> did here, i.e. bdrv_child_refresh_perms(bs_new->backing)?
> 
> (Really a comment for an earlier patch. This patch itself looks fine.)
> 

You mean how the backup-top code works at the point when we modified bdrv_append()? Actually it all works, as we use state->active: we set it to true and then have to call refresh_perms. Now we drop the refresh_perms call _together_ with the state->active variable, so the filter is always "active", but the new bdrv_append() can handle that. I.e., before this patch the backup-top.c code is correct but over-complicated with logic that is no longer necessary after the bdrv_append() improvement (and of course we also need bdrv_drop_filter() to drop the whole state->active related logic).


-- 
Best regards,
Vladimir



* Re: [PATCH v2 26/36] block/backup-top: drop .active
  2021-02-04 12:33     ` Vladimir Sementsov-Ogievskiy
@ 2021-02-04 13:25       ` Kevin Wolf
  2021-02-04 13:46         ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 108+ messages in thread
From: Kevin Wolf @ 2021-02-04 13:25 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: qemu-block, armbru, qemu-devel, mreitz, den, jsnow

Am 04.02.2021 um 13:33 hat Vladimir Sementsov-Ogievskiy geschrieben:
> 04.02.2021 15:26, Kevin Wolf wrote:
> > Am 27.11.2020 um 15:45 hat Vladimir Sementsov-Ogievskiy geschrieben:
> > > We don't need this workaround anymore: bdrv_append is already smart
> > > enough and we can use new bdrv_drop_filter().
> > > 
> > > Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> > > ---
> > >   block/backup-top.c         | 38 +-------------------------------------
> > >   tests/qemu-iotests/283.out |  2 +-
> > >   2 files changed, 2 insertions(+), 38 deletions(-)
> > > 
> > > diff --git a/block/backup-top.c b/block/backup-top.c
> > > index 650ed6195c..84eb73aeb7 100644
> > > --- a/block/backup-top.c
> > > +++ b/block/backup-top.c
> > > @@ -37,7 +37,6 @@
> > >   typedef struct BDRVBackupTopState {
> > >       BlockCopyState *bcs;
> > >       BdrvChild *target;
> > > -    bool active;
> > >       int64_t cluster_size;
> > >   } BDRVBackupTopState;
> > > @@ -127,21 +126,6 @@ static void backup_top_child_perm(BlockDriverState *bs, BdrvChild *c,
> > >                                     uint64_t perm, uint64_t shared,
> > >                                     uint64_t *nperm, uint64_t *nshared)
> > >   {
> > > -    BDRVBackupTopState *s = bs->opaque;
> > > -
> > > -    if (!s->active) {
> > > -        /*
> > > -         * The filter node may be in process of bdrv_append(), which firstly do
> > > -         * bdrv_set_backing_hd() and then bdrv_replace_node(). This means that
> > > -         * we can't unshare BLK_PERM_WRITE during bdrv_append() operation. So,
> > > -         * let's require nothing during bdrv_append() and refresh permissions
> > > -         * after it (see bdrv_backup_top_append()).
> > > -         */
> > > -        *nperm = 0;
> > > -        *nshared = BLK_PERM_ALL;
> > > -        return;
> > > -    }
> > > -
> > >       if (!(role & BDRV_CHILD_FILTERED)) {
> > >           /*
> > >            * Target child
> > > @@ -229,18 +213,6 @@ BlockDriverState *bdrv_backup_top_append(BlockDriverState *source,
> > >       }
> > >       appended = true;
> > > -    /*
> > > -     * bdrv_append() finished successfully, now we can require permissions
> > > -     * we want.
> > > -     */
> > > -    state->active = true;
> > > -    bdrv_child_refresh_perms(top, top->backing, &local_err);
> > 
> > bdrv_append() uses bdrv_refresh_perms() for the whole node. Is it doing
> > unnecessary extra work there and should really do the same as backup-top
> > did here, i.e. bdrv_child_refresh_perms(bs_new->backing)?
> > 
> > (Really a comment for an earlier patch. This patch itself looks fine.)
> > 
> 
> You mean how backup-top code works at the point when we modified
> bdrv_append()? Actually all works, as we use state->active. We set it
> to true and should call refresh_perms. Now we drop _refresh_perms
> _together_ with state->active variable, so filter is always "active",
> but new bdrv_append can handle it now. I.e., before this patch
> backup-top.c code is correct but over-complicated with logic which is
> not necessary after bdrv_append() improvement (and of-course we need
> also bdrv_drop_filter() to drop the whole state->active related
> logic).

No, I just mean that bdrv_child_refresh_perms(bs, bs->backing) is enough
when adding a new image to the chain. A full bdrv_refresh_perms()
like we now have in bdrv_append() is doing more work than is necessary.

It doesn't make a difference for backup-top (because the filter has only
a single child), but if you append a new qcow2 snapshot, you would also
recalculate permissions for the bs->file subtree even though nothing has
changed there.

It's only a small detail anyway, not very important in a slow path.
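
To make the difference concrete, here is a toy sketch in plain C that
just counts how many nodes each kind of refresh would walk. ToyBs and
both helpers are made up for illustration and only model the shape of
the graph; they are not the QEMU functions and say nothing about what
the real permission update does per node:

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_CHILDREN 2   /* slot 0: "backing", slot 1: "file" */

/* Toy stand-in for a block node; NOT QEMU's BlockDriverState. */
typedef struct ToyBs {
    struct ToyBs *children[TOY_MAX_CHILDREN];
} ToyBs;

/* Whole-node refresh: visits @bs and everything below it. */
static int toy_refresh_subtree(const ToyBs *bs)
{
    int visited = 1;

    for (size_t i = 0; i < TOY_MAX_CHILDREN; i++) {
        if (bs->children[i]) {
            visited += toy_refresh_subtree(bs->children[i]);
        }
    }
    return visited;
}

/* Single-child refresh: visits only the subtree behind one child slot. */
static int toy_refresh_child(const ToyBs *bs, size_t idx)
{
    assert(idx < TOY_MAX_CHILDREN);
    return bs->children[idx] ? toy_refresh_subtree(bs->children[idx]) : 0;
}
```

For a new snapshot node with its own file child, appended on top of an
old top node that also has a file child, the whole-node walk touches
four nodes while the backing-only walk touches two.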

Kevin




* Re: [PATCH v2 26/36] block/backup-top: drop .active
  2021-02-04 13:25       ` Kevin Wolf
@ 2021-02-04 13:46         ` Vladimir Sementsov-Ogievskiy
  2021-02-04 14:31           ` Kevin Wolf
  0 siblings, 1 reply; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-02-04 13:46 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, qemu-devel, armbru, jsnow, mreitz, den

04.02.2021 16:25, Kevin Wolf wrote:
> Am 04.02.2021 um 13:33 hat Vladimir Sementsov-Ogievskiy geschrieben:
>> 04.02.2021 15:26, Kevin Wolf wrote:
>>> Am 27.11.2020 um 15:45 hat Vladimir Sementsov-Ogievskiy geschrieben:
>>>> We don't need this workaround anymore: bdrv_append is already smart
>>>> enough and we can use new bdrv_drop_filter().
>>>>
>>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>>> ---
>>>>    block/backup-top.c         | 38 +-------------------------------------
>>>>    tests/qemu-iotests/283.out |  2 +-
>>>>    2 files changed, 2 insertions(+), 38 deletions(-)
>>>>
>>>> diff --git a/block/backup-top.c b/block/backup-top.c
>>>> index 650ed6195c..84eb73aeb7 100644
>>>> --- a/block/backup-top.c
>>>> +++ b/block/backup-top.c
>>>> @@ -37,7 +37,6 @@
>>>>    typedef struct BDRVBackupTopState {
>>>>        BlockCopyState *bcs;
>>>>        BdrvChild *target;
>>>> -    bool active;
>>>>        int64_t cluster_size;
>>>>    } BDRVBackupTopState;
>>>> @@ -127,21 +126,6 @@ static void backup_top_child_perm(BlockDriverState *bs, BdrvChild *c,
>>>>                                      uint64_t perm, uint64_t shared,
>>>>                                      uint64_t *nperm, uint64_t *nshared)
>>>>    {
>>>> -    BDRVBackupTopState *s = bs->opaque;
>>>> -
>>>> -    if (!s->active) {
>>>> -        /*
>>>> -         * The filter node may be in process of bdrv_append(), which firstly do
>>>> -         * bdrv_set_backing_hd() and then bdrv_replace_node(). This means that
>>>> -         * we can't unshare BLK_PERM_WRITE during bdrv_append() operation. So,
>>>> -         * let's require nothing during bdrv_append() and refresh permissions
>>>> -         * after it (see bdrv_backup_top_append()).
>>>> -         */
>>>> -        *nperm = 0;
>>>> -        *nshared = BLK_PERM_ALL;
>>>> -        return;
>>>> -    }
>>>> -
>>>>        if (!(role & BDRV_CHILD_FILTERED)) {
>>>>            /*
>>>>             * Target child
>>>> @@ -229,18 +213,6 @@ BlockDriverState *bdrv_backup_top_append(BlockDriverState *source,
>>>>        }
>>>>        appended = true;
>>>> -    /*
>>>> -     * bdrv_append() finished successfully, now we can require permissions
>>>> -     * we want.
>>>> -     */
>>>> -    state->active = true;
>>>> -    bdrv_child_refresh_perms(top, top->backing, &local_err);
>>>
>>> bdrv_append() uses bdrv_refresh_perms() for the whole node. Is it doing
>>> unnecessary extra work there and should really do the same as backup-top
>>> did here, i.e. bdrv_child_refresh_perms(bs_new->backing)?
>>>
>>> (Really a comment for an earlier patch. This patch itself looks fine.)
>>>
>>
>> You mean how backup-top code works at the point when we modified
>> bdrv_append()? Actually all works, as we use state->active. We set it
>> to true and should call refresh_perms. Now we drop _refresh_perms
>> _together_ with state->active variable, so filter is always "active",
>> but new bdrv_append can handle it now. I.e., before this patch
>> backup-top.c code is correct but over-complicated with logic which is
>> not necessary after bdrv_append() improvement (and of-course we need
>> also bdrv_drop_filter() to drop the whole state->active related
>> logic).
> 
> No, I just mean that bdrv_child_refresh_perms(bs, bs->backing) is enough
> when adding a new image to the chain. A full bdrv_child_refresh_perms()
> like we now have in bdrv_append() is doing more work than is necessary.
> 
> It doesn't make a difference for backup-top (because the filter has only
> a single child), but if you append a new qcow2 snapshot, you would also
> recalculate permissions for the bs->file subtree even though nothing has
> changed there.
> 
> It's only a small detail anyway, not very important in a slow path.
> 

Understood now. I think bdrv_append() does the correct thing: bs_new gets new parents, so we refresh the whole subtree. So when appending qcow2 we should refresh its file child as well. The permissions of bs_new's new parents will probably influence what qcow2 wants to do with its file node.

-- 
Best regards,
Vladimir



* Re: [PATCH v2 26/36] block/backup-top: drop .active
  2021-02-04 13:46         ` Vladimir Sementsov-Ogievskiy
@ 2021-02-04 14:31           ` Kevin Wolf
  0 siblings, 0 replies; 108+ messages in thread
From: Kevin Wolf @ 2021-02-04 14:31 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: qemu-block, armbru, qemu-devel, mreitz, den, jsnow

Am 04.02.2021 um 14:46 hat Vladimir Sementsov-Ogievskiy geschrieben:
> 04.02.2021 16:25, Kevin Wolf wrote:
> > Am 04.02.2021 um 13:33 hat Vladimir Sementsov-Ogievskiy geschrieben:
> > > 04.02.2021 15:26, Kevin Wolf wrote:
> > > > Am 27.11.2020 um 15:45 hat Vladimir Sementsov-Ogievskiy geschrieben:
> > > > > We don't need this workaround anymore: bdrv_append is already smart
> > > > > enough and we can use new bdrv_drop_filter().
> > > > > 
> > > > > Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> > > > > ---
> > > > >    block/backup-top.c         | 38 +-------------------------------------
> > > > >    tests/qemu-iotests/283.out |  2 +-
> > > > >    2 files changed, 2 insertions(+), 38 deletions(-)
> > > > > 
> > > > > diff --git a/block/backup-top.c b/block/backup-top.c
> > > > > index 650ed6195c..84eb73aeb7 100644
> > > > > --- a/block/backup-top.c
> > > > > +++ b/block/backup-top.c
> > > > > @@ -37,7 +37,6 @@
> > > > >    typedef struct BDRVBackupTopState {
> > > > >        BlockCopyState *bcs;
> > > > >        BdrvChild *target;
> > > > > -    bool active;
> > > > >        int64_t cluster_size;
> > > > >    } BDRVBackupTopState;
> > > > > @@ -127,21 +126,6 @@ static void backup_top_child_perm(BlockDriverState *bs, BdrvChild *c,
> > > > >                                      uint64_t perm, uint64_t shared,
> > > > >                                      uint64_t *nperm, uint64_t *nshared)
> > > > >    {
> > > > > -    BDRVBackupTopState *s = bs->opaque;
> > > > > -
> > > > > -    if (!s->active) {
> > > > > -        /*
> > > > > -         * The filter node may be in process of bdrv_append(), which firstly do
> > > > > -         * bdrv_set_backing_hd() and then bdrv_replace_node(). This means that
> > > > > -         * we can't unshare BLK_PERM_WRITE during bdrv_append() operation. So,
> > > > > -         * let's require nothing during bdrv_append() and refresh permissions
> > > > > -         * after it (see bdrv_backup_top_append()).
> > > > > -         */
> > > > > -        *nperm = 0;
> > > > > -        *nshared = BLK_PERM_ALL;
> > > > > -        return;
> > > > > -    }
> > > > > -
> > > > >        if (!(role & BDRV_CHILD_FILTERED)) {
> > > > >            /*
> > > > >             * Target child
> > > > > @@ -229,18 +213,6 @@ BlockDriverState *bdrv_backup_top_append(BlockDriverState *source,
> > > > >        }
> > > > >        appended = true;
> > > > > -    /*
> > > > > -     * bdrv_append() finished successfully, now we can require permissions
> > > > > -     * we want.
> > > > > -     */
> > > > > -    state->active = true;
> > > > > -    bdrv_child_refresh_perms(top, top->backing, &local_err);
> > > > 
> > > > bdrv_append() uses bdrv_refresh_perms() for the whole node. Is it doing
> > > > unnecessary extra work there and should really do the same as backup-top
> > > > did here, i.e. bdrv_child_refresh_perms(bs_new->backing)?
> > > > 
> > > > (Really a comment for an earlier patch. This patch itself looks fine.)
> > > > 
> > > 
> > > You mean how backup-top code works at the point when we modified
> > > bdrv_append()? Actually all works, as we use state->active. We set it
> > > to true and should call refresh_perms. Now we drop _refresh_perms
> > > _together_ with state->active variable, so filter is always "active",
> > > but new bdrv_append can handle it now. I.e., before this patch
> > > backup-top.c code is correct but over-complicated with logic which is
> > > not necessary after bdrv_append() improvement (and of-course we need
> > > also bdrv_drop_filter() to drop the whole state->active related
> > > logic).
> > 
> > No, I just mean that bdrv_child_refresh_perms(bs, bs->backing) is enough
> > when adding a new image to the chain. A full bdrv_refresh_perms()
> > like we now have in bdrv_append() is doing more work than is necessary.
> > 
> > It doesn't make a difference for backup-top (because the filter has only
> > a single child), but if you append a new qcow2 snapshot, you would also
> > recalculate permissions for the bs->file subtree even though nothing has
> > changed there.
> > 
> > It's only a small detail anyway, not very important in a slow path.
> 
> Understood now. I think bdrv_append() does the correct thing: bs_new gets
> new parents, so we refresh the whole subtree. So when appending qcow2
> we should refresh its file child as well. The new permissions of
> bs_new's new parents will probably influence what qcow2 wants to do
> with its file node.

You mean the parents that move from bs_top to bs_new and that they could
change the permissions that bs_new needs?

Good point, yes.
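
To make the point concrete, here is a minimal toy sketch, not QEMU code: the ToyNode type, the toy_* helpers and the PERM_* values are all hypothetical, and the rule "forward parents' permissions to the file child" is only an assumption standing in for what a format driver might do. It shows that once a writer parent has moved to bs_new, refreshing only the backing child leaves the file child's permissions stale, while refreshing the whole node updates them.

```c
#include <assert.h>
#include <stdint.h>

/* Toy permission bits; names echo BLK_PERM_* but values are illustrative. */
enum { PERM_READ = 1, PERM_WRITE = 2 };

/* Hypothetical format node with two children, loosely modelled on qcow2. */
typedef struct ToyNode {
    uint64_t parent_perms[4];   /* permissions requested by each parent */
    int nb_parents;
    uint64_t backing_perm;      /* what the node requires on "backing" */
    uint64_t file_perm;         /* what the node requires on "file" */
} ToyNode;

/* OR together what all parents require from this node. */
static uint64_t toy_cumulative_perms(const ToyNode *n)
{
    uint64_t perm = 0;
    for (int i = 0; i < n->nb_parents; i++) {
        perm |= n->parent_perms[i];
    }
    return perm;
}

/* Refresh only the backing child: in this toy, backing is only read. */
static void toy_refresh_backing(ToyNode *n)
{
    n->backing_perm = toy_cumulative_perms(n) ? PERM_READ : 0;
}

/* Refresh the whole node: parents' permissions are forwarded to file. */
static void toy_refresh_node(ToyNode *n)
{
    toy_refresh_backing(n);
    n->file_perm = toy_cumulative_perms(n);
}

static void toy_add_parent(ToyNode *n, uint64_t perm)
{
    assert(n->nb_parents < 4);
    n->parent_perms[n->nb_parents++] = perm;
}

/* Returns bs_new's file permissions after a writer parent moved to it. */
uint64_t toy_file_perm_after_append(int full_refresh)
{
    ToyNode bs_new = { .nb_parents = 0 };

    toy_refresh_node(&bs_new);                       /* no parents yet */
    toy_add_parent(&bs_new, PERM_READ | PERM_WRITE); /* moved from bs_top */

    if (full_refresh) {
        toy_refresh_node(&bs_new);
    } else {
        toy_refresh_backing(&bs_new);
    }
    return bs_new.file_perm;
}
```

With full_refresh == 0 the file child keeps the permissions it had before the parent moved; with full_refresh == 1 it picks up the new writer.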

Kevin



^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH v2 28/36] block: add bdrv_set_backing_noperm() transaction action
  2020-11-27 14:45 ` [PATCH v2 28/36] block: add bdrv_set_backing_noperm() transaction action Vladimir Sementsov-Ogievskiy
@ 2021-02-05 14:00   ` Kevin Wolf
  2021-02-05 16:06     ` Vladimir Sementsov-Ogievskiy
  2021-02-05 16:26   ` Kevin Wolf
  1 sibling, 1 reply; 108+ messages in thread
From: Kevin Wolf @ 2021-02-05 14:00 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: qemu-block, armbru, qemu-devel, mreitz, den, jsnow

Am 27.11.2020 um 15:45 hat Vladimir Sementsov-Ogievskiy geschrieben:
> Split out no-perm part of bdrv_set_backing_hd() as a separate
> transaction action. Note that in the case of an existing BdrvChild we
> reuse it rather than recreate it, just to do fewer actions.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>  block.c | 111 +++++++++++++++++++++++++++++++++++++++++++++-----------
>  1 file changed, 89 insertions(+), 22 deletions(-)
> 
> diff --git a/block.c b/block.c
> index 54fb6d24bd..617cba9547 100644
> --- a/block.c
> +++ b/block.c
> @@ -101,6 +101,7 @@ static int bdrv_attach_child_common(BlockDriverState *child_bs,
>                                      uint64_t perm, uint64_t shared_perm,
>                                      void *opaque, BdrvChild **child,
>                                      GSList **tran, Error **errp);
> +static void bdrv_remove_backing(BlockDriverState *bs, GSList **tran);
>  
>  static int bdrv_reopen_prepare(BDRVReopenState *reopen_state, BlockReopenQueue
>                                 *queue, Error **errp);
> @@ -3194,45 +3195,111 @@ static BdrvChildRole bdrv_backing_role(BlockDriverState *bs)
>      }
>  }
>  
> +typedef struct BdrvSetBackingNoPermState {
> +    BlockDriverState *bs;
> +    BlockDriverState *backing_bs;
> +    BlockDriverState *old_inherits_from;
> +    GSList *attach_tran;
> +} BdrvSetBackingNoPermState;

Why do we need the nested attach_tran instead of just including these
actions in the outer transaction?

> +static void bdrv_set_backing_noperm_abort(void *opaque)
> +{
> +    BdrvSetBackingNoPermState *s = opaque;
> +
> +    if (s->backing_bs) {
> +        s->backing_bs->inherits_from = s->old_inherits_from;
> +    }
> +
> +    tran_abort(s->attach_tran);
> +
> +    bdrv_refresh_limits(s->bs, NULL);
> +    if (s->old_inherits_from) {
> +        bdrv_refresh_limits(s->old_inherits_from, NULL);
> +    }

How is bs->inherits_from related to limits? I don't see a
bdrv_refresh_limits() call in bdrv_set_backing_noperm() that this would
undo.

> +}
> +
> +static void bdrv_set_backing_noperm_commit(void *opaque)
> +{
> +    BdrvSetBackingNoPermState *s = opaque;
> +
> +    tran_commit(s->attach_tran);
> +}
> +
> +static TransactionActionDrv bdrv_set_backing_noperm_drv = {
> +    .abort = bdrv_set_backing_noperm_abort,
> +    .commit = bdrv_set_backing_noperm_commit,
> +    .clean = g_free,
> +};
> +
>  /*
>   * Sets the bs->backing link of a BDS. A new reference is created; callers
>   * which don't need their own reference any more must call bdrv_unref().
>   */
> -void bdrv_set_backing_hd(BlockDriverState *bs, BlockDriverState *backing_hd,
> -                         Error **errp)
> +static int bdrv_set_backing_noperm(BlockDriverState *bs,
> +                                   BlockDriverState *backing_bs,
> +                                   GSList **tran, Error **errp)
>  {
> -    bool update_inherits_from = bdrv_chain_contains(bs, backing_hd) &&
> -        bdrv_inherits_from_recursive(backing_hd, bs);
> +    int ret = 0;
> +    bool update_inherits_from = bdrv_chain_contains(bs, backing_bs) &&
> +        bdrv_inherits_from_recursive(backing_bs, bs);
> +    GSList *attach_tran = NULL;
> +    BdrvSetBackingNoPermState *s;
>  
>      if (bdrv_is_backing_chain_frozen(bs, child_bs(bs->backing), errp)) {
> -        return;
> +        return -EPERM;
>      }
>  
> -    if (backing_hd) {
> -        bdrv_ref(backing_hd);
> +    if (bs->backing && backing_bs) {
> +        bdrv_replace_child_safe(bs->backing, backing_bs, tran);
> +    } else if (bs->backing && !backing_bs) {
> +        bdrv_remove_backing(bs, tran);
> +    } else if (backing_bs) {
> +        assert(!bs->backing);
> +        ret = bdrv_attach_child_noperm(bs, backing_bs, "backing",
> +                                       &child_of_bds, bdrv_backing_role(bs),
> +                                       &bs->backing, &attach_tran, errp);
> +        if (ret < 0) {
> +            tran_abort(attach_tran);

This looks wrong to me, we'll call tran_abort() a second time through
bdrv_set_backing_noperm_abort() when the outer transaction aborts.

I also notice that the other two if branches do just add things to the
outer 'tran', it's just this branch that gets a nested one.
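
A minimal sketch of the alternative, assuming a hand-rolled list instead of the series' GSList-based helpers (ToyAction and all toy_* names are hypothetical): each step records its undo action directly in the caller's transaction, a failing step adds nothing and does not abort on its own, and there is exactly one abort site, so no undo action can run twice.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for one entry of a transaction list. */
typedef struct ToyAction {
    void (*abort)(void *opaque);
    void *opaque;
    struct ToyAction *next;
} ToyAction;

/* Prepend, so that abort runs the undo steps in reverse order. */
static void toy_tran_prepend(ToyAction **tran, void (*abort)(void *),
                             void *opaque)
{
    ToyAction *a = malloc(sizeof(*a));
    assert(a);
    a->abort = abort;
    a->opaque = opaque;
    a->next = *tran;
    *tran = a;
}

/* Run every undo step exactly once and consume the list. */
static void toy_tran_abort(ToyAction **tran)
{
    while (*tran) {
        ToyAction *a = *tran;
        *tran = a->next;
        a->abort(a->opaque);
        free(a);
    }
}

static void toy_count_undo(void *opaque)
{
    ++*(int *)opaque;
}

/*
 * A step that adds its undo action to the caller's (outer) transaction.
 * On failure it adds nothing and leaves aborting to the caller.
 */
static int toy_attach(ToyAction **tran, int *undo_count, int fail)
{
    if (fail) {
        return -1;
    }
    toy_tran_prepend(tran, toy_count_undo, undo_count);
    return 0;
}

/* Runs two steps, then aborts the outer transaction once. */
int toy_undo_count_after_abort(int fail_second)
{
    ToyAction *tran = NULL;
    int undone = 0;

    if (toy_attach(&tran, &undone, 0) == 0) {
        toy_attach(&tran, &undone, fail_second);
    }
    toy_tran_abort(&tran);   /* the single abort site */
    return undone;
}
```

If the second step fails, only the first step's undo runs (count 1); if both succeed and the outer transaction is aborted later, each undo runs once (count 2).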

> +            return ret;
> +        }
>      }
>  
> -    if (bs->backing) {
> -        /* Cannot be frozen, we checked that above */
> -        bdrv_unref_child(bs, bs->backing);
> -        bs->backing = NULL;
> -    }
> +    s = g_new(BdrvSetBackingNoPermState, 1);
> +    *s = (BdrvSetBackingNoPermState) {
> +        .bs = bs,
> +        .backing_bs = backing_bs,
> +        .old_inherits_from = backing_bs ? backing_bs->inherits_from : NULL,
> +    };
> +    tran_prepend(tran, &bdrv_set_backing_noperm_drv, s);
>  
> -    if (!backing_hd) {
> -        goto out;
> +    /*
> +     * If backing_bs was already part of bs's backing chain, and
> +     * inherits_from pointed recursively to bs then let's update it to
> +     * point directly to bs (else it will become NULL).

Setting it to NULL was previously done by bdrv_unref_child().

bdrv_replace_child_safe() and bdrv_remove_backing() don't seem to do
this any more.

> +     */
> +    if (backing_bs && update_inherits_from) {
> +        backing_bs->inherits_from = bs;
>      }
>  
> -    bs->backing = bdrv_attach_child(bs, backing_hd, "backing", &child_of_bds,
> -                                    bdrv_backing_role(bs), errp);
> -    /* If backing_hd was already part of bs's backing chain, and
> -     * inherits_from pointed recursively to bs then let's update it to
> -     * point directly to bs (else it will become NULL). */
> -    if (bs->backing && update_inherits_from) {
> -        backing_hd->inherits_from = bs;
> +    bdrv_refresh_limits(bs, NULL);
> +
> +    return 0;
> +}

Kevin



^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH v2 29/36] blockdev: qmp_x_blockdev_reopen: acquire all contexts
  2020-11-27 14:45 ` [PATCH v2 29/36] blockdev: qmp_x_blockdev_reopen: acquire all contexts Vladimir Sementsov-Ogievskiy
@ 2021-02-05 16:01   ` Kevin Wolf
  2021-02-05 16:16     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 108+ messages in thread
From: Kevin Wolf @ 2021-02-05 16:01 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: qemu-block, armbru, qemu-devel, mreitz, den, jsnow

Am 27.11.2020 um 15:45 hat Vladimir Sementsov-Ogievskiy geschrieben:
> During reopen we may add a backing bs from another aio context, which may
> lead to changing the original context of the top bs.
> 
> We are going to move graph modification to the prepare stage. So it will
> be possible that bdrv_flush() in bdrv_reopen_prepare() is called on a bs
> in a non-original aio context, which we didn't acquire, which leads to
> a crash.
> 
> It would be more correct to acquire all aio contexts we are going to
> work with. And the simplest way is to just acquire all of them. It may
> be optimized later if needed.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

I'm afraid it's not as easy. Holding the lock of more than one
AioContext is always a bit risky with respect to deadlocks.

For example, changing the AioContext of a node with
bdrv_set_aio_context_ignore() has explicit rules that are now violated:

 * The caller must own the AioContext lock for the old AioContext of bs, but it
 * must not own the AioContext lock for new_context (unless new_context is the
 * same as the current context of bs).

Draining while holding all AioContext locks is suspicious, too. I think
I have seen deadlocks before, which is why bdrv_drain_all_*() are
careful to only ever lock a single AioContext at a time.
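
As a rough illustration of the two locking shapes (a single-threaded toy with hypothetical ToyCtx and toy_* names, not the real AioContext API; it only counts how many locks are held at once and cannot itself deadlock):

```c
#include <assert.h>

#define TOY_NB_CTX 3

/* Hypothetical stand-in for an AioContext with a lock. */
typedef struct ToyCtx {
    int locked;
} ToyCtx;

static int toy_held;      /* locks currently held */
static int toy_max_held;  /* high-water mark of simultaneously held locks */

static void toy_acquire(ToyCtx *c)
{
    assert(!c->locked);
    c->locked = 1;
    if (++toy_held > toy_max_held) {
        toy_max_held = toy_held;
    }
}

static void toy_release(ToyCtx *c)
{
    assert(c->locked);
    c->locked = 0;
    toy_held--;
}

/* Visit every context, holding at most one lock at any time. */
int toy_visit_one_at_a_time(void)
{
    ToyCtx ctx[TOY_NB_CTX] = {{0}};

    toy_held = toy_max_held = 0;
    for (int i = 0; i < TOY_NB_CTX; i++) {
        toy_acquire(&ctx[i]);
        /* per-context work would go here */
        toy_release(&ctx[i]);
    }
    return toy_max_held;
}

/* Acquire every lock first, as "acquire all contexts" would do. */
int toy_visit_holding_all(void)
{
    ToyCtx ctx[TOY_NB_CTX] = {{0}};

    toy_held = toy_max_held = 0;
    for (int i = 0; i < TOY_NB_CTX; i++) {
        toy_acquire(&ctx[i]);
    }
    /* work that may wait on another thread needing one of these locks */
    for (int i = TOY_NB_CTX - 1; i >= 0; i--) {
        toy_release(&ctx[i]);
    }
    return toy_max_held;
}
```

The first shape never holds more than one lock; the second holds all of them while it works, which is the situation where anything that waits for progress in another context can get stuck.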

Kevin



^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH v2 28/36] block: add bdrv_set_backing_noperm() transaction action
  2021-02-05 14:00   ` Kevin Wolf
@ 2021-02-05 16:06     ` Vladimir Sementsov-Ogievskiy
  2021-02-05 16:30       ` Kevin Wolf
  0 siblings, 1 reply; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-02-05 16:06 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, qemu-devel, armbru, jsnow, mreitz, den

05.02.2021 17:00, Kevin Wolf wrote:
> Am 27.11.2020 um 15:45 hat Vladimir Sementsov-Ogievskiy geschrieben:
>> Split out no-perm part of bdrv_set_backing_hd() as a separate
>> transaction action. Note that in the case of an existing BdrvChild we
>> reuse it rather than recreate it, just to do fewer actions.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> ---
>>   block.c | 111 +++++++++++++++++++++++++++++++++++++++++++++-----------
>>   1 file changed, 89 insertions(+), 22 deletions(-)
>>
>> diff --git a/block.c b/block.c
>> index 54fb6d24bd..617cba9547 100644
>> --- a/block.c
>> +++ b/block.c
>> @@ -101,6 +101,7 @@ static int bdrv_attach_child_common(BlockDriverState *child_bs,
>>                                       uint64_t perm, uint64_t shared_perm,
>>                                       void *opaque, BdrvChild **child,
>>                                       GSList **tran, Error **errp);
>> +static void bdrv_remove_backing(BlockDriverState *bs, GSList **tran);
>>   
>>   static int bdrv_reopen_prepare(BDRVReopenState *reopen_state, BlockReopenQueue
>>                                  *queue, Error **errp);
>> @@ -3194,45 +3195,111 @@ static BdrvChildRole bdrv_backing_role(BlockDriverState *bs)
>>       }
>>   }
>>   
>> +typedef struct BdrvSetBackingNoPermState {
>> +    BlockDriverState *bs;
>> +    BlockDriverState *backing_bs;
>> +    BlockDriverState *old_inherits_from;
>> +    GSList *attach_tran;
>> +} BdrvSetBackingNoPermState;
> 
> Why do we need the nested attach_tran instead of just including these
> actions in the outer transaction?
> 
>> +static void bdrv_set_backing_noperm_abort(void *opaque)
>> +{
>> +    BdrvSetBackingNoPermState *s = opaque;
>> +
>> +    if (s->backing_bs) {
>> +        s->backing_bs->inherits_from = s->old_inherits_from;
>> +    }
>> +
>> +    tran_abort(s->attach_tran);
>> +
>> +    bdrv_refresh_limits(s->bs, NULL);
>> +    if (s->old_inherits_from) {
>> +        bdrv_refresh_limits(s->old_inherits_from, NULL);
>> +    }
> 
> How is bs->inherits_from related to limits? I don't see a
> bdrv_refresh_limits() call in bdrv_set_backing_noperm() that this would
> undo.
> 
>> +}
>> +
>> +static void bdrv_set_backing_noperm_commit(void *opaque)
>> +{
>> +    BdrvSetBackingNoPermState *s = opaque;
>> +
>> +    tran_commit(s->attach_tran);
>> +}
>> +
>> +static TransactionActionDrv bdrv_set_backing_noperm_drv = {
>> +    .abort = bdrv_set_backing_noperm_abort,
>> +    .commit = bdrv_set_backing_noperm_commit,
>> +    .clean = g_free,
>> +};
>> +
>>   /*
>>    * Sets the bs->backing link of a BDS. A new reference is created; callers
>>    * which don't need their own reference any more must call bdrv_unref().
>>    */
>> -void bdrv_set_backing_hd(BlockDriverState *bs, BlockDriverState *backing_hd,
>> -                         Error **errp)
>> +static int bdrv_set_backing_noperm(BlockDriverState *bs,
>> +                                   BlockDriverState *backing_bs,
>> +                                   GSList **tran, Error **errp)
>>   {
>> -    bool update_inherits_from = bdrv_chain_contains(bs, backing_hd) &&
>> -        bdrv_inherits_from_recursive(backing_hd, bs);
>> +    int ret = 0;
>> +    bool update_inherits_from = bdrv_chain_contains(bs, backing_bs) &&
>> +        bdrv_inherits_from_recursive(backing_bs, bs);
>> +    GSList *attach_tran = NULL;
>> +    BdrvSetBackingNoPermState *s;
>>   
>>       if (bdrv_is_backing_chain_frozen(bs, child_bs(bs->backing), errp)) {
>> -        return;
>> +        return -EPERM;
>>       }
>>   
>> -    if (backing_hd) {
>> -        bdrv_ref(backing_hd);
>> +    if (bs->backing && backing_bs) {
>> +        bdrv_replace_child_safe(bs->backing, backing_bs, tran);
>> +    } else if (bs->backing && !backing_bs) {
>> +        bdrv_remove_backing(bs, tran);
>> +    } else if (backing_bs) {
>> +        assert(!bs->backing);
>> +        ret = bdrv_attach_child_noperm(bs, backing_bs, "backing",
>> +                                       &child_of_bds, bdrv_backing_role(bs),
>> +                                       &bs->backing, &attach_tran, errp);
>> +        if (ret < 0) {
>> +            tran_abort(attach_tran);
> 
> This looks wrong to me, we'll call tran_abort() a second time through
> bdrv_set_backing_noperm_abort() when the outer transaction aborts.
> 
> I also notice that the other two if branches do just add things to the
> outer 'tran', it's just this branch that gets a nested one.
> 
>> +            return ret;
>> +        }
>>       }
>>   
>> -    if (bs->backing) {
>> -        /* Cannot be frozen, we checked that above */
>> -        bdrv_unref_child(bs, bs->backing);
>> -        bs->backing = NULL;
>> -    }
>> +    s = g_new(BdrvSetBackingNoPermState, 1);
>> +    *s = (BdrvSetBackingNoPermState) {
>> +        .bs = bs,
>> +        .backing_bs = backing_bs,
>> +        .old_inherits_from = backing_bs ? backing_bs->inherits_from : NULL,
>> +    };
>> +    tran_prepend(tran, &bdrv_set_backing_noperm_drv, s);
>>   
>> -    if (!backing_hd) {
>> -        goto out;
>> +    /*
>> +     * If backing_bs was already part of bs's backing chain, and
>> +     * inherits_from pointed recursively to bs then let's update it to
>> +     * point directly to bs (else it will become NULL).
> 
> Setting it to NULL was previously done by bdrv_unref_child().
> 
> bdrv_replace_child_safe() and bdrv_remove_backing() don't seem to do
> this any more.

Hmm, yes.. Maybe we should move bdrv_unset_inherits_from() from bdrv_unref_child() to bdrv_replace_child_noperm()?

> 
>> +     */
>> +    if (backing_bs && update_inherits_from) {
>> +        backing_bs->inherits_from = bs;
>>       }
>>   
>> -    bs->backing = bdrv_attach_child(bs, backing_hd, "backing", &child_of_bds,
>> -                                    bdrv_backing_role(bs), errp);
>> -    /* If backing_hd was already part of bs's backing chain, and
>> -     * inherits_from pointed recursively to bs then let's update it to
>> -     * point directly to bs (else it will become NULL). */
>> -    if (bs->backing && update_inherits_from) {
>> -        backing_hd->inherits_from = bs;
>> +    bdrv_refresh_limits(bs, NULL);
>> +
>> +    return 0;
>> +}
> 
> Kevin
> 


-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH v2 29/36] blockdev: qmp_x_blockdev_reopen: acquire all contexts
  2021-02-05 16:01   ` Kevin Wolf
@ 2021-02-05 16:16     ` Vladimir Sementsov-Ogievskiy
  2021-02-05 16:36       ` Kevin Wolf
  0 siblings, 1 reply; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-02-05 16:16 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, qemu-devel, armbru, jsnow, mreitz, den

05.02.2021 19:01, Kevin Wolf wrote:
> Am 27.11.2020 um 15:45 hat Vladimir Sementsov-Ogievskiy geschrieben:
>> During reopen we may add a backing bs from another aio context, which may
>> lead to changing the original context of the top bs.
>>
>> We are going to move graph modification to the prepare stage. So it will
>> be possible that bdrv_flush() in bdrv_reopen_prepare() is called on a bs
>> in a non-original aio context, which we didn't acquire, which leads to
>> a crash.
>>
>> It would be more correct to acquire all aio contexts we are going to
>> work with. And the simplest way is to just acquire all of them. It may
>> be optimized later if needed.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> 
> I'm afraid it's not as easy. Holding the lock of more than one
> AioContext is always a bit risky with respect to deadlocks.
> 
> For example, changing the AioContext of a node with
> bdrv_set_aio_context_ignore() has explicit rules that are now violated:
> 
>   * The caller must own the AioContext lock for the old AioContext of bs, but it
>   * must not own the AioContext lock for new_context (unless new_context is the
>   * same as the current context of bs).
> 
> Draining while holding all AioContext locks is suspicious, too. I think
> I have seen deadlocks before, which is why bdrv_drain_all_*() are
> careful to only ever lock a single AioContext at a time.
> 
> Kevin
> 

That's not good :\ Hmm, we should probably just flush everything before all graph modifications.

-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH v2 28/36] block: add bdrv_set_backing_noperm() transaction action
  2020-11-27 14:45 ` [PATCH v2 28/36] block: add bdrv_set_backing_noperm() transaction action Vladimir Sementsov-Ogievskiy
  2021-02-05 14:00   ` Kevin Wolf
@ 2021-02-05 16:26   ` Kevin Wolf
  2021-02-08  9:34     ` Vladimir Sementsov-Ogievskiy
  1 sibling, 1 reply; 108+ messages in thread
From: Kevin Wolf @ 2021-02-05 16:26 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: qemu-block, armbru, qemu-devel, mreitz, den, jsnow

Am 27.11.2020 um 15:45 hat Vladimir Sementsov-Ogievskiy geschrieben:
> Split out no-perm part of bdrv_set_backing_hd() as a separate
> transaction action. Note that in the case of an existing BdrvChild we
> reuse it rather than recreate it, just to do fewer actions.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

>  /*
>   * Sets the bs->backing link of a BDS. A new reference is created; callers
>   * which don't need their own reference any more must call bdrv_unref().
>   */
> -void bdrv_set_backing_hd(BlockDriverState *bs, BlockDriverState *backing_hd,
> -                         Error **errp)
> +static int bdrv_set_backing_noperm(BlockDriverState *bs,
> +                                   BlockDriverState *backing_bs,
> +                                   GSList **tran, Error **errp)
>  {
> -    bool update_inherits_from = bdrv_chain_contains(bs, backing_hd) &&
> -        bdrv_inherits_from_recursive(backing_hd, bs);
> +    int ret = 0;
> +    bool update_inherits_from = bdrv_chain_contains(bs, backing_bs) &&
> +        bdrv_inherits_from_recursive(backing_bs, bs);
> +    GSList *attach_tran = NULL;
> +    BdrvSetBackingNoPermState *s;
>  
>      if (bdrv_is_backing_chain_frozen(bs, child_bs(bs->backing), errp)) {
> -        return;
> +        return -EPERM;
>      }
>  
> -    if (backing_hd) {
> -        bdrv_ref(backing_hd);
> +    if (bs->backing && backing_bs) {
> +        bdrv_replace_child_safe(bs->backing, backing_bs, tran);

The old code with separate bdrv_unref_child() and then
bdrv_attach_child() tried to make the AioContexts of bs and backing_bs
compatible by moving one of the nodes if necessary.

bdrv_replace_child_safe() doesn't seem to do that, but it only asserts
that both nodes are already in the same context.

I see that iotest 245 doesn't crash, which I think it should if this
were broken, but where does the switch happen now?

Kevin



^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH v2 28/36] block: add bdrv_set_backing_noperm() transaction action
  2021-02-05 16:06     ` Vladimir Sementsov-Ogievskiy
@ 2021-02-05 16:30       ` Kevin Wolf
  2021-03-11 18:29         ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 108+ messages in thread
From: Kevin Wolf @ 2021-02-05 16:30 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: qemu-block, armbru, qemu-devel, mreitz, den, jsnow

Am 05.02.2021 um 17:06 hat Vladimir Sementsov-Ogievskiy geschrieben:
> 05.02.2021 17:00, Kevin Wolf wrote:
> > Am 27.11.2020 um 15:45 hat Vladimir Sementsov-Ogievskiy geschrieben:
> > > Split out no-perm part of bdrv_set_backing_hd() as a separate
> > > transaction action. Note that in the case of an existing BdrvChild we
> > > reuse it rather than recreate it, just to do fewer actions.
> > > 
> > > Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> > > ---
> > >   block.c | 111 +++++++++++++++++++++++++++++++++++++++++++++-----------
> > >   1 file changed, 89 insertions(+), 22 deletions(-)
> > > 
> > > diff --git a/block.c b/block.c
> > > index 54fb6d24bd..617cba9547 100644
> > > --- a/block.c
> > > +++ b/block.c
> > > @@ -101,6 +101,7 @@ static int bdrv_attach_child_common(BlockDriverState *child_bs,
> > >                                       uint64_t perm, uint64_t shared_perm,
> > >                                       void *opaque, BdrvChild **child,
> > >                                       GSList **tran, Error **errp);
> > > +static void bdrv_remove_backing(BlockDriverState *bs, GSList **tran);
> > >   static int bdrv_reopen_prepare(BDRVReopenState *reopen_state, BlockReopenQueue
> > >                                  *queue, Error **errp);
> > > @@ -3194,45 +3195,111 @@ static BdrvChildRole bdrv_backing_role(BlockDriverState *bs)
> > >       }
> > >   }
> > > +typedef struct BdrvSetBackingNoPermState {
> > > +    BlockDriverState *bs;
> > > +    BlockDriverState *backing_bs;
> > > +    BlockDriverState *old_inherits_from;
> > > +    GSList *attach_tran;
> > > +} BdrvSetBackingNoPermState;
> > 
> > Why do we need the nested attach_tran instead of just including these
> > actions in the outer transaction?
> > 
> > > +static void bdrv_set_backing_noperm_abort(void *opaque)
> > > +{
> > > +    BdrvSetBackingNoPermState *s = opaque;
> > > +
> > > +    if (s->backing_bs) {
> > > +        s->backing_bs->inherits_from = s->old_inherits_from;
> > > +    }
> > > +
> > > +    tran_abort(s->attach_tran);
> > > +
> > > +    bdrv_refresh_limits(s->bs, NULL);
> > > +    if (s->old_inherits_from) {
> > > +        bdrv_refresh_limits(s->old_inherits_from, NULL);
> > > +    }
> > 
> > How is bs->inherits_from related to limits? I don't see a
> > bdrv_refresh_limits() call in bdrv_set_backing_noperm() that this would
> > undo.
> > 
> > > +}
> > > +
> > > +static void bdrv_set_backing_noperm_commit(void *opaque)
> > > +{
> > > +    BdrvSetBackingNoPermState *s = opaque;
> > > +
> > > +    tran_commit(s->attach_tran);
> > > +}
> > > +
> > > +static TransactionActionDrv bdrv_set_backing_noperm_drv = {
> > > +    .abort = bdrv_set_backing_noperm_abort,
> > > +    .commit = bdrv_set_backing_noperm_commit,
> > > +    .clean = g_free,
> > > +};
> > > +
> > >   /*
> > >    * Sets the bs->backing link of a BDS. A new reference is created; callers
> > >    * which don't need their own reference any more must call bdrv_unref().
> > >    */
> > > -void bdrv_set_backing_hd(BlockDriverState *bs, BlockDriverState *backing_hd,
> > > -                         Error **errp)
> > > +static int bdrv_set_backing_noperm(BlockDriverState *bs,
> > > +                                   BlockDriverState *backing_bs,
> > > +                                   GSList **tran, Error **errp)
> > >   {
> > > -    bool update_inherits_from = bdrv_chain_contains(bs, backing_hd) &&
> > > -        bdrv_inherits_from_recursive(backing_hd, bs);
> > > +    int ret = 0;
> > > +    bool update_inherits_from = bdrv_chain_contains(bs, backing_bs) &&
> > > +        bdrv_inherits_from_recursive(backing_bs, bs);
> > > +    GSList *attach_tran = NULL;
> > > +    BdrvSetBackingNoPermState *s;
> > >       if (bdrv_is_backing_chain_frozen(bs, child_bs(bs->backing), errp)) {
> > > -        return;
> > > +        return -EPERM;
> > >       }
> > > -    if (backing_hd) {
> > > -        bdrv_ref(backing_hd);
> > > +    if (bs->backing && backing_bs) {
> > > +        bdrv_replace_child_safe(bs->backing, backing_bs, tran);
> > > +    } else if (bs->backing && !backing_bs) {
> > > +        bdrv_remove_backing(bs, tran);
> > > +    } else if (backing_bs) {
> > > +        assert(!bs->backing);
> > > +        ret = bdrv_attach_child_noperm(bs, backing_bs, "backing",
> > > +                                       &child_of_bds, bdrv_backing_role(bs),
> > > +                                       &bs->backing, &attach_tran, errp);
> > > +        if (ret < 0) {
> > > +            tran_abort(attach_tran);
> > 
> > This looks wrong to me, we'll call tran_abort() a second time through
> > bdrv_set_backing_noperm_abort() when the outer transaction aborts.
> > 
> > I also notice that the other two if branches do just add things to the
> > outer 'tran', it's just this branch that gets a nested one.
> > 
> > > +            return ret;
> > > +        }
> > >       }
> > > -    if (bs->backing) {
> > > -        /* Cannot be frozen, we checked that above */
> > > -        bdrv_unref_child(bs, bs->backing);
> > > -        bs->backing = NULL;
> > > -    }
> > > +    s = g_new(BdrvSetBackingNoPermState, 1);
> > > +    *s = (BdrvSetBackingNoPermState) {
> > > +        .bs = bs,
> > > +        .backing_bs = backing_bs,
> > > +        .old_inherits_from = backing_bs ? backing_bs->inherits_from : NULL,
> > > +    };
> > > +    tran_prepend(tran, &bdrv_set_backing_noperm_drv, s);
> > > -    if (!backing_hd) {
> > > -        goto out;
> > > +    /*
> > > +     * If backing_bs was already part of bs's backing chain, and
> > > +     * inherits_from pointed recursively to bs then let's update it to
> > > +     * point directly to bs (else it will become NULL).
> > 
> > Setting it to NULL was previously done by bdrv_unref_child().
> > 
> > bdrv_replace_child_safe() and bdrv_remove_backing() don't seem to do
> > this any more.
> 
> Hmm, yes.. Maybe we should move bdrv_unset_inherits_from() from
> bdrv_unref_child() to bdrv_replace_child_noperm()?

Sounds good to me. This should hopefully be called for all graph changes
that could possibly happen.
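
A simplified sketch of that idea, with hypothetical ToyBDS/ToyChild types and toy_* helpers standing in for the real ones (the real bdrv_unset_inherits_from() does more than this; the toy only clears a direct link): because both the replace path and the remove path go through the lowest-level function, the stale inherits_from pointer is dropped in either case.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for BlockDriverState and BdrvChild. */
typedef struct ToyBDS ToyBDS;
struct ToyBDS {
    ToyBDS *inherits_from;
};

typedef struct ToyChild {
    ToyBDS *parent;
    ToyBDS *bs;
} ToyChild;

/*
 * Lowest-level graph edit. Every replace or detach goes through here,
 * so this is the one place that drops a stale inherits_from link.
 */
static void toy_replace_child_noperm(ToyChild *c, ToyBDS *new_bs)
{
    ToyBDS *old_bs = c->bs;

    if (old_bs && old_bs->inherits_from == c->parent) {
        old_bs->inherits_from = NULL;
    }
    c->bs = new_bs;
}

/* Higher-level helpers only delegate; they need no extra bookkeeping. */
static void toy_replace_child_safe(ToyChild *c, ToyBDS *new_bs)
{
    toy_replace_child_noperm(c, new_bs);
}

static void toy_remove_backing(ToyChild *c)
{
    toy_replace_child_noperm(c, NULL);
}

/* Returns 1 if the old backing node no longer inherits from its parent. */
int toy_inherits_cleared(int remove)
{
    ToyBDS top = { NULL };
    ToyBDS old_backing = { &top };
    ToyBDS new_backing = { NULL };
    ToyChild backing = { &top, &old_backing };

    if (remove) {
        toy_remove_backing(&backing);
    } else {
        toy_replace_child_safe(&backing, &new_backing);
    }
    return old_backing.inherits_from == NULL;
}
```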

Kevin

> > 
> > > +     */
> > > +    if (backing_bs && update_inherits_from) {
> > > +        backing_bs->inherits_from = bs;
> > >       }
> > > -    bs->backing = bdrv_attach_child(bs, backing_hd, "backing", &child_of_bds,
> > > -                                    bdrv_backing_role(bs), errp);
> > > -    /* If backing_hd was already part of bs's backing chain, and
> > > -     * inherits_from pointed recursively to bs then let's update it to
> > > -     * point directly to bs (else it will become NULL). */
> > > -    if (bs->backing && update_inherits_from) {
> > > -        backing_hd->inherits_from = bs;
> > > +    bdrv_refresh_limits(bs, NULL);
> > > +
> > > +    return 0;
> > > +}
> > 
> > Kevin
> > 
> 
> 
> -- 
> Best regards,
> Vladimir
> 



^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH v2 29/36] blockdev: qmp_x_blockdev_reopen: acquire all contexts
  2021-02-05 16:16     ` Vladimir Sementsov-Ogievskiy
@ 2021-02-05 16:36       ` Kevin Wolf
  0 siblings, 0 replies; 108+ messages in thread
From: Kevin Wolf @ 2021-02-05 16:36 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: qemu-block, armbru, qemu-devel, mreitz, den, jsnow

Am 05.02.2021 um 17:16 hat Vladimir Sementsov-Ogievskiy geschrieben:
> 05.02.2021 19:01, Kevin Wolf wrote:
> > Am 27.11.2020 um 15:45 hat Vladimir Sementsov-Ogievskiy geschrieben:
> > > During reopen we may add a backing bs from another aio context, which may
> > > lead to changing the original context of the top bs.
> > > 
> > > We are going to move graph modification to the prepare stage. So it will
> > > be possible that bdrv_flush() in bdrv_reopen_prepare() is called on a bs
> > > in a non-original aio context, which we didn't acquire, which leads to
> > > a crash.
> > > 
> > > It would be more correct to acquire all aio contexts we are going to
> > > work with. And the simplest way is to just acquire all of them. It may
> > > be optimized later if needed.
> > > 
> > > Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> > 
> > I'm afraid it's not as easy. Holding the lock of more than one
> > AioContext is always a bit risky with respect to deadlocks.
> > 
> > For example, changing the AioContext of a node with
> > bdrv_set_aio_context_ignore() has explicit rules that are now violated:
> > 
> >   * The caller must own the AioContext lock for the old AioContext of bs, but it
> >   * must not own the AioContext lock for new_context (unless new_context is the
> >   * same as the current context of bs).
> > 
> > Draining while holding all AioContext locks is suspicious, too. I think
> > I have seen deadlocks before, which is why bdrv_drain_all_*() are
> > careful to only ever lock a single AioContext at a time.
> 
> That's not good :\ Hmm, we should probably just flush everything
> before all graph modifications.

Would that have to be a separate phase before prepare then?

I suppose the same problem exists with drv->bdrv_reopen_prepare, which
might be called in a different state (both graph structure and
AioContext) than before. I'll have to see the patch first that reorders
things, but this callback has always had the problem that sometimes it
wants the old state and sometimes it wants the new state...

Kevin



^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH v2 30/36] block: bdrv_reopen_multiple: refresh permissions on updated graph
  2020-11-27 14:45 ` [PATCH v2 30/36] block: bdrv_reopen_multiple: refresh permissions on updated graph Vladimir Sementsov-Ogievskiy
@ 2021-02-05 17:57   ` Kevin Wolf
  2021-02-08 11:21     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 108+ messages in thread
From: Kevin Wolf @ 2021-02-05 17:57 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: qemu-block, armbru, qemu-devel, mreitz, den, jsnow

Am 27.11.2020 um 15:45 hat Vladimir Sementsov-Ogievskiy geschrieben:
> Move bdrv_reopen_multiple to the new paradigm of permission update:
> first update graph relations, then refresh the permissions.
> 
> We have to modify the reopen process in the file-posix driver: with the
> new scheme we don't have prepared permissions in raw_reopen_prepare(), so
> we should reconfigure the fd in raw_check_perm(). Still, this seems more
> natural and simpler anyway.

Hm... The diffstat shows that it is simpler because it needs less code.

But relying on the permission change callbacks for getting a new file
descriptor that changes more than just permissions doesn't feel
completely right either. Can we even expect the permission callbacks to
be called when the permissions aren't changed?

But then, reopen and permission updates were already a bit entangled
before. If we can guarantee that the permission functions will always be
called, even if the permissions don't change, I guess it's okay.

> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>  include/block/block.h |   2 +-
>  block.c               | 183 +++++++++++-------------------------------
>  block/file-posix.c    |  84 +++++--------------
>  3 files changed, 70 insertions(+), 199 deletions(-)
> 
> diff --git a/include/block/block.h b/include/block/block.h
> index 0f21ef313f..82271d9ccd 100644
> --- a/include/block/block.h
> +++ b/include/block/block.h
> @@ -195,7 +195,7 @@ typedef struct BDRVReopenState {
>      BlockdevDetectZeroesOptions detect_zeroes;
>      bool backing_missing;
>      bool replace_backing_bs;  /* new_backing_bs is ignored if this is false */
> -    BlockDriverState *new_backing_bs; /* If NULL then detach the current bs */
> +    BlockDriverState *old_backing_bs; /* keep pointer for permissions update */
>      uint64_t perm, shared_perm;

perm and shared_perm are unused now and can be removed.

>      QDict *options;
>      QDict *explicit_options;
> diff --git a/block.c b/block.c
> index 617cba9547..474e624152 100644
> --- a/block.c
> +++ b/block.c
> @@ -103,8 +103,9 @@ static int bdrv_attach_child_common(BlockDriverState *child_bs,
>                                      GSList **tran, Error **errp);
>  static void bdrv_remove_backing(BlockDriverState *bs, GSList **tran);
>  
> -static int bdrv_reopen_prepare(BDRVReopenState *reopen_state, BlockReopenQueue
> -                               *queue, Error **errp);
> +static int bdrv_reopen_prepare(BDRVReopenState *reopen_state,
> +                               BlockReopenQueue *queue,
> +                               GSList **set_backings_tran, Error **errp);
>  static void bdrv_reopen_commit(BDRVReopenState *reopen_state);
>  static void bdrv_reopen_abort(BDRVReopenState *reopen_state);
>  
> @@ -2403,6 +2404,7 @@ static void bdrv_list_abort_perm_update(GSList *list)
>      }
>  }
>  
> +__attribute__((unused))
>  static void bdrv_abort_perm_update(BlockDriverState *bs)
>  {
>      g_autoptr(GSList) list = bdrv_topological_dfs(NULL, NULL, bs);
> @@ -2498,6 +2500,7 @@ char *bdrv_perm_names(uint64_t perm)
>   *
>   * Needs to be followed by a call to either bdrv_set_perm() or
>   * bdrv_abort_perm_update(). */
> +__attribute__((unused))
>  static int bdrv_check_update_perm(BlockDriverState *bs, BlockReopenQueue *q,
>                                    uint64_t new_used_perm,
>                                    uint64_t new_shared_perm,
> @@ -4100,10 +4103,6 @@ static BlockReopenQueue *bdrv_reopen_queue_child(BlockReopenQueue *bs_queue,
>      bs_entry->state.explicit_options = explicit_options;
>      bs_entry->state.flags = flags;
>  
> -    /* This needs to be overwritten in bdrv_reopen_prepare() */
> -    bs_entry->state.perm = UINT64_MAX;
> -    bs_entry->state.shared_perm = 0;
> -
>      /*
>       * If keep_old_opts is false then it means that unspecified
>       * options must be reset to their original value. We don't allow
> @@ -4186,40 +4185,37 @@ BlockReopenQueue *bdrv_reopen_queue(BlockReopenQueue *bs_queue,
>   */
>  int bdrv_reopen_multiple(BlockReopenQueue *bs_queue, Error **errp)
>  {
> -    int ret = -1;
> +    int ret = 0;

I would prefer to leave this right before the 'goto cleanup'.

Not sure if I fully understand all consequences yet, but overall, apart
from my concerns about file-posix and the potential AioContext locking
problems, this looks like a nice simplification of the process.

Come to think of it, the AioContext handling is probably wrong already
before your series. reopen_commit for one node could move the whole tree
to a different context and then the later nodes would all be processed
while holding the wrong lock.

Kevin




* Re: [PATCH v2 28/36] block: add bdrv_set_backing_noperm() transaction action
  2021-02-05 16:26   ` Kevin Wolf
@ 2021-02-08  9:34     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-02-08  9:34 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, qemu-devel, armbru, jsnow, mreitz, den

05.02.2021 19:26, Kevin Wolf wrote:
> On 27.11.2020 at 15:45, Vladimir Sementsov-Ogievskiy wrote:
>> Split out the no-perm part of bdrv_set_backing_hd() as a separate
>> transaction action. Note that in case of an existing BdrvChild we reuse
>> it rather than recreate it, just to do fewer actions.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> 
>>   /*
>>    * Sets the bs->backing link of a BDS. A new reference is created; callers
>>    * which don't need their own reference any more must call bdrv_unref().
>>    */
>> -void bdrv_set_backing_hd(BlockDriverState *bs, BlockDriverState *backing_hd,
>> -                         Error **errp)
>> +static int bdrv_set_backing_noperm(BlockDriverState *bs,
>> +                                   BlockDriverState *backing_bs,
>> +                                   GSList **tran, Error **errp)
>>   {
>> -    bool update_inherits_from = bdrv_chain_contains(bs, backing_hd) &&
>> -        bdrv_inherits_from_recursive(backing_hd, bs);
>> +    int ret = 0;
>> +    bool update_inherits_from = bdrv_chain_contains(bs, backing_bs) &&
>> +        bdrv_inherits_from_recursive(backing_bs, bs);
>> +    GSList *attach_tran = NULL;
>> +    BdrvSetBackingNoPermState *s;
>>   
>>       if (bdrv_is_backing_chain_frozen(bs, child_bs(bs->backing), errp)) {
>> -        return;
>> +        return -EPERM;
>>       }
>>   
>> -    if (backing_hd) {
>> -        bdrv_ref(backing_hd);
>> +    if (bs->backing && backing_bs) {
>> +        bdrv_replace_child_safe(bs->backing, backing_bs, tran);
> 
> The old code with separate bdrv_unref_child() and then
> bdrv_attach_child() tried to make the AioContexts of bs and backing_bs
> compatible by moving one of the nodes if necessary.
> 
> bdrv_replace_child_safe() doesn't seem to do that, but it only asserts
> that both nodes are already in the same context.
> 
> I see that iotest 245 doesn't crash, which I think it should if this
> were broken, but where does the switch happen now?

Hmm. It seems that on the path "if (bs->backing && backing_bs) {" we really miss AioContext handling. Probably 245 doesn't check this branch? Or it leaves different AioContexts in one subtree.


-- 
Best regards,
Vladimir



* Re: [PATCH v2 30/36] block: bdrv_reopen_multiple: refresh permissions on updated graph
  2021-02-05 17:57   ` Kevin Wolf
@ 2021-02-08 11:21     ` Vladimir Sementsov-Ogievskiy
  2021-02-10 14:13       ` Kevin Wolf
  2021-02-10 14:38       ` Kevin Wolf
  0 siblings, 2 replies; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-02-08 11:21 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, qemu-devel, armbru, jsnow, mreitz, den

05.02.2021 20:57, Kevin Wolf wrote:
> On 27.11.2020 at 15:45, Vladimir Sementsov-Ogievskiy wrote:
>> Move bdrv_reopen_multiple() to the new paradigm of permission update:
>> first update the graph relations, then refresh the permissions.
>>
>> We have to modify the reopen process in the file-posix driver: with the
>> new scheme we don't have prepared permissions in raw_reopen_prepare(), so
>> we should reconfigure the fd in raw_check_perm(). Still, this seems more
>> natural and simpler anyway.
> 
> Hm... The diffstat shows that it is simpler because it needs less code.
> 
> But relying on the permission change callbacks for getting a new file
> descriptor that changes more than just permissions doesn't feel
> completely right either. Can we even expect the permission callbacks to
> be called when the permissions aren't changed?

With the new scheme, the permission update becomes an obvious step of bdrv_reopen_multiple(): we call bdrv_list_refresh_perms() for the list of all touched nodes and all their subtrees. And the callbacks are called unconditionally, via bdrv_node_refresh_perm() -> bdrv_drv_set_perm(). So I think we can rely on it. It is probably worth a comment or two.
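
As a minimal sketch of that guarantee (a toy model, not the actual block.c code; Node, toy_set_perm() and refresh_perms() are hypothetical names): the refresh walks every listed node and invokes the driver callback even when the wanted permissions equal the current ones, which is what a driver would need in order to use the callback for reconfiguring its fd.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy node: hypothetical stand-in for a BlockDriverState. */
typedef struct Node {
    uint64_t perm;          /* currently applied permissions */
    uint64_t wanted_perm;   /* what the parents require after the graph change */
    int set_perm_calls;     /* how often the driver callback ran */
} Node;

/* Hypothetical driver callback; a real driver could reconfigure its fd here. */
static void toy_set_perm(Node *n, uint64_t perm)
{
    n->perm = perm;
    n->set_perm_calls++;
}

/*
 * Refresh every node in the list. The callback is invoked unconditionally,
 * i.e. also when wanted_perm == perm.
 */
static void refresh_perms(Node **list, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        assert(list[i] != NULL);
        toy_set_perm(list[i], list[i]->wanted_perm);
    }
}
```

In this model a node whose permissions stay the same still sees exactly one callback invocation per refresh, which is the property the comments in the real code would have to document.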

> 
> But then, reopen and permission updates were already a bit entangled
> before. If we can guarantee that the permission functions will always be
> called, even if the permissions don't change, I guess it's okay.
> 
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> ---
>>   include/block/block.h |   2 +-
>>   block.c               | 183 +++++++++++-------------------------------
>>   block/file-posix.c    |  84 +++++--------------
>>   3 files changed, 70 insertions(+), 199 deletions(-)
>>
>> diff --git a/include/block/block.h b/include/block/block.h
>> index 0f21ef313f..82271d9ccd 100644
>> --- a/include/block/block.h
>> +++ b/include/block/block.h
>> @@ -195,7 +195,7 @@ typedef struct BDRVReopenState {
>>       BlockdevDetectZeroesOptions detect_zeroes;
>>       bool backing_missing;
>>       bool replace_backing_bs;  /* new_backing_bs is ignored if this is false */
>> -    BlockDriverState *new_backing_bs; /* If NULL then detach the current bs */
>> +    BlockDriverState *old_backing_bs; /* keep pointer for permissions update */
>>       uint64_t perm, shared_perm;
> 
> perm and shared_perm are unused now and can be removed.
> 
>>       QDict *options;
>>       QDict *explicit_options;
>> diff --git a/block.c b/block.c
>> index 617cba9547..474e624152 100644
>> --- a/block.c
>> +++ b/block.c
>> @@ -103,8 +103,9 @@ static int bdrv_attach_child_common(BlockDriverState *child_bs,
>>                                       GSList **tran, Error **errp);
>>   static void bdrv_remove_backing(BlockDriverState *bs, GSList **tran);
>>   
>> -static int bdrv_reopen_prepare(BDRVReopenState *reopen_state, BlockReopenQueue
>> -                               *queue, Error **errp);
>> +static int bdrv_reopen_prepare(BDRVReopenState *reopen_state,
>> +                               BlockReopenQueue *queue,
>> +                               GSList **set_backings_tran, Error **errp);
>>   static void bdrv_reopen_commit(BDRVReopenState *reopen_state);
>>   static void bdrv_reopen_abort(BDRVReopenState *reopen_state);
>>   
>> @@ -2403,6 +2404,7 @@ static void bdrv_list_abort_perm_update(GSList *list)
>>       }
>>   }
>>   
>> +__attribute__((unused))
>>   static void bdrv_abort_perm_update(BlockDriverState *bs)
>>   {
>>       g_autoptr(GSList) list = bdrv_topological_dfs(NULL, NULL, bs);
>> @@ -2498,6 +2500,7 @@ char *bdrv_perm_names(uint64_t perm)
>>    *
>>    * Needs to be followed by a call to either bdrv_set_perm() or
>>    * bdrv_abort_perm_update(). */
>> +__attribute__((unused))
>>   static int bdrv_check_update_perm(BlockDriverState *bs, BlockReopenQueue *q,
>>                                     uint64_t new_used_perm,
>>                                     uint64_t new_shared_perm,
>> @@ -4100,10 +4103,6 @@ static BlockReopenQueue *bdrv_reopen_queue_child(BlockReopenQueue *bs_queue,
>>       bs_entry->state.explicit_options = explicit_options;
>>       bs_entry->state.flags = flags;
>>   
>> -    /* This needs to be overwritten in bdrv_reopen_prepare() */
>> -    bs_entry->state.perm = UINT64_MAX;
>> -    bs_entry->state.shared_perm = 0;
>> -
>>       /*
>>        * If keep_old_opts is false then it means that unspecified
>>        * options must be reset to their original value. We don't allow
>> @@ -4186,40 +4185,37 @@ BlockReopenQueue *bdrv_reopen_queue(BlockReopenQueue *bs_queue,
>>    */
>>   int bdrv_reopen_multiple(BlockReopenQueue *bs_queue, Error **errp)
>>   {
>> -    int ret = -1;
>> +    int ret = 0;
> 
> I would prefer to leave this right before the 'goto cleanup'.
> 
> Not sure if I fully understand all consequences yet, but overall, apart
> from my concerns about file-posix and the potential AioContext locking
> problems, this looks like a nice simplification of the process.
> 
> Come to think of it, the AioContext handling is probably wrong already
> before your series. reopen_commit for one node could move the whole tree
> to a different context and then the later nodes would all be processed
> while holding the wrong lock.
> 

Probably the proper way is to acquire all involved AioContexts, as I do in patch 29, and to update the AioContext-changing functions to work under such conditions (all AioContexts are already acquired by the caller).


-- 
Best regards,
Vladimir



* Re: [PATCH v2 30/36] block: bdrv_reopen_multiple: refresh permissions on updated graph
  2021-02-08 11:21     ` Vladimir Sementsov-Ogievskiy
@ 2021-02-10 14:13       ` Kevin Wolf
  2021-02-10 14:38       ` Kevin Wolf
  1 sibling, 0 replies; 108+ messages in thread
From: Kevin Wolf @ 2021-02-10 14:13 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: qemu-block, armbru, qemu-devel, mreitz, den, jsnow

On 08.02.2021 at 12:21, Vladimir Sementsov-Ogievskiy wrote:
> 05.02.2021 20:57, Kevin Wolf wrote:
> > On 27.11.2020 at 15:45, Vladimir Sementsov-Ogievskiy wrote:
> > > Move bdrv_reopen_multiple() to the new paradigm of permission update:
> > > first update the graph relations, then refresh the permissions.
> > > 
> > > We have to modify the reopen process in the file-posix driver: with the
> > > new scheme we don't have prepared permissions in raw_reopen_prepare(), so
> > > we should reconfigure the fd in raw_check_perm(). Still, this seems more
> > > natural and simpler anyway.
> > 
> > Hm... The diffstat shows that it is simpler because it needs less code.
> > 
> > But relying on the permission change callbacks for getting a new file
> > descriptor that changes more than just permissions doesn't feel
> > completely right either. Can we even expect the permission callbacks to
> > be called when the permissions aren't changed?
> 
> With the new scheme, the permission update becomes an obvious step of
> bdrv_reopen_multiple(): we call bdrv_list_refresh_perms() for the list
> of all touched nodes and all their subtrees. And the callbacks are
> called unconditionally, via bdrv_node_refresh_perm() -> bdrv_drv_set_perm().
> So I think we can rely on it. It is probably worth a comment or two.

Yes, some comments in the right places that we must call the driver
callbacks even if the permissions are the same as before wouldn't hurt.

> > 
> > But then, reopen and permission updates were already a bit entangled
> > before. If we can guarantee that the permission functions will always be
> > called, even if the permissions don't change, I guess it's okay.
> > 
> > > Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> > > ---
> > >   include/block/block.h |   2 +-
> > >   block.c               | 183 +++++++++++-------------------------------
> > >   block/file-posix.c    |  84 +++++--------------
> > >   3 files changed, 70 insertions(+), 199 deletions(-)
> > > 
> > > diff --git a/include/block/block.h b/include/block/block.h
> > > index 0f21ef313f..82271d9ccd 100644
> > > --- a/include/block/block.h
> > > +++ b/include/block/block.h
> > > @@ -195,7 +195,7 @@ typedef struct BDRVReopenState {
> > >       BlockdevDetectZeroesOptions detect_zeroes;
> > >       bool backing_missing;
> > >       bool replace_backing_bs;  /* new_backing_bs is ignored if this is false */
> > > -    BlockDriverState *new_backing_bs; /* If NULL then detach the current bs */
> > > +    BlockDriverState *old_backing_bs; /* keep pointer for permissions update */
> > >       uint64_t perm, shared_perm;
> > 
> > perm and shared_perm are unused now and can be removed.
> > 
> > >       QDict *options;
> > >       QDict *explicit_options;
> > > diff --git a/block.c b/block.c
> > > index 617cba9547..474e624152 100644
> > > --- a/block.c
> > > +++ b/block.c
> > > @@ -103,8 +103,9 @@ static int bdrv_attach_child_common(BlockDriverState *child_bs,
> > >                                       GSList **tran, Error **errp);
> > >   static void bdrv_remove_backing(BlockDriverState *bs, GSList **tran);
> > > -static int bdrv_reopen_prepare(BDRVReopenState *reopen_state, BlockReopenQueue
> > > -                               *queue, Error **errp);
> > > +static int bdrv_reopen_prepare(BDRVReopenState *reopen_state,
> > > +                               BlockReopenQueue *queue,
> > > +                               GSList **set_backings_tran, Error **errp);
> > >   static void bdrv_reopen_commit(BDRVReopenState *reopen_state);
> > >   static void bdrv_reopen_abort(BDRVReopenState *reopen_state);
> > > @@ -2403,6 +2404,7 @@ static void bdrv_list_abort_perm_update(GSList *list)
> > >       }
> > >   }
> > > +__attribute__((unused))
> > >   static void bdrv_abort_perm_update(BlockDriverState *bs)
> > >   {
> > >       g_autoptr(GSList) list = bdrv_topological_dfs(NULL, NULL, bs);
> > > @@ -2498,6 +2500,7 @@ char *bdrv_perm_names(uint64_t perm)
> > >    *
> > >    * Needs to be followed by a call to either bdrv_set_perm() or
> > >    * bdrv_abort_perm_update(). */
> > > +__attribute__((unused))
> > >   static int bdrv_check_update_perm(BlockDriverState *bs, BlockReopenQueue *q,
> > >                                     uint64_t new_used_perm,
> > >                                     uint64_t new_shared_perm,
> > > @@ -4100,10 +4103,6 @@ static BlockReopenQueue *bdrv_reopen_queue_child(BlockReopenQueue *bs_queue,
> > >       bs_entry->state.explicit_options = explicit_options;
> > >       bs_entry->state.flags = flags;
> > > -    /* This needs to be overwritten in bdrv_reopen_prepare() */
> > > -    bs_entry->state.perm = UINT64_MAX;
> > > -    bs_entry->state.shared_perm = 0;
> > > -
> > >       /*
> > >        * If keep_old_opts is false then it means that unspecified
> > >        * options must be reset to their original value. We don't allow
> > > @@ -4186,40 +4185,37 @@ BlockReopenQueue *bdrv_reopen_queue(BlockReopenQueue *bs_queue,
> > >    */
> > >   int bdrv_reopen_multiple(BlockReopenQueue *bs_queue, Error **errp)
> > >   {
> > > -    int ret = -1;
> > > +    int ret = 0;
> > 
> > I would prefer to leave this right before the 'goto cleanup'.
> > 
> > Not sure if I fully understand all consequences yet, but overall, apart
> > from my concerns about file-posix and the potential AioContext locking
> > problems, this looks like a nice simplification of the process.
> > 
> > Come to think of it, the AioContext handling is probably wrong already
> > before your series. reopen_commit for one node could move the whole tree
> > to a different context and then the later nodes would all be processed
> > while holding the wrong lock.
> > 
> 
> Probably the proper way is to acquire all involved AioContexts, as I do
> in patch 29, and to update the AioContext-changing functions to work
> under such conditions (all AioContexts are already acquired by the caller).

Well, as we already discussed, patch 29 is probably wrong in its current
form. But you seemed to have a solution in mind, which will hopefully
work here, too.

Kevin




* Re: [PATCH v2 30/36] block: bdrv_reopen_multiple: refresh permissions on updated graph
  2021-02-08 11:21     ` Vladimir Sementsov-Ogievskiy
  2021-02-10 14:13       ` Kevin Wolf
@ 2021-02-10 14:38       ` Kevin Wolf
  1 sibling, 0 replies; 108+ messages in thread
From: Kevin Wolf @ 2021-02-10 14:38 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: qemu-block, armbru, qemu-devel, mreitz, den, jsnow

On 08.02.2021 at 12:21, Vladimir Sementsov-Ogievskiy wrote:
> > Come to think of it, the AioContext handling is probably wrong already
> > before your series. reopen_commit for one node could move the whole tree
> > to a different context and then the later nodes would all be processed
> > while holding the wrong lock.
> 
> Probably the proper way is to acquire all involved AioContexts, as I do
> in patch 29, and to update the AioContext-changing functions to work
> under such conditions (all AioContexts are already acquired by the caller).

Whoops, what I gave was kind of a non-answer...

So essentially the reason for the locking rules of changing the
AioContext is that they drain the node first and drain imposes the
locking rule that the AioContext for the node to be drained must be
locked, and all other AioContexts must be unlocked.

The reason why drain imposes the rule is that we run AIO_WAIT_WHILE() in
one thread and we may need the event loops in other threads to make
progress until the while condition can eventually become false. If other
threads can't make progress because their lock is taken, we'll see
deadlocks sooner or later.
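
The rule can be illustrated with a deliberately simplified, single-threaded toy model (not QEMU code; ToyCtx and the toy_* functions are hypothetical, and a bounded loop stands in for the unbounded wait): a foreign event loop can only complete its pending requests if the waiter does not hold its lock, so a waiter that holds a foreign lock never sees the condition become false.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy event loop: hypothetical stand-in for an AioContext. */
typedef struct ToyCtx {
    bool locked_by_waiter;  /* the waiting thread holds this context's lock */
    int pending;            /* requests this loop still has to complete */
} ToyCtx;

/* One iteration of a foreign event loop; it is stuck if its lock is taken. */
static void toy_ctx_run_once(ToyCtx *ctx)
{
    if (!ctx->locked_by_waiter && ctx->pending > 0) {
        ctx->pending--;
    }
}

/*
 * Model of "wait while requests are pending": give the other loops up to
 * max_rounds chances to run. Returns true if everything drained, false if
 * the wait did not finish (a deadlock in the real, unbounded loop).
 */
static bool toy_wait_drained(ToyCtx *others, size_t n, int max_rounds)
{
    assert(others != NULL || n == 0);
    for (int round = 0; round < max_rounds; round++) {
        int total = 0;
        for (size_t i = 0; i < n; i++) {
            toy_ctx_run_once(&others[i]);
            total += others[i].pending;
        }
        if (total == 0) {
            return true;
        }
    }
    return false;
}
```

With all foreign contexts unlocked the wait finishes after a few rounds; as soon as one of them is marked as locked by the waiter, its pending count never drops and the function reports failure, which corresponds to the deadlock described above.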

Kevin




* Re: [PATCH v2 34/36] block: refactor bdrv_child_set_perm_safe() transaction action
  2020-11-27 14:45 ` [PATCH v2 34/36] block: refactor bdrv_child_set_perm_safe() transaction action Vladimir Sementsov-Ogievskiy
@ 2021-02-10 14:51   ` Kevin Wolf
  0 siblings, 0 replies; 108+ messages in thread
From: Kevin Wolf @ 2021-02-10 14:51 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: qemu-block, armbru, qemu-devel, mreitz, den, jsnow

On 27.11.2020 at 15:45, Vladimir Sementsov-Ogievskiy wrote:
> The old interfaces are dropped and nobody calls
> bdrv_child_set_perm_abort() and bdrv_child_set_perm_commit() directly, so
> we can use a dedicated state structure for the action and stop using the
> BdrvChild structure for it. Also, drop the "_safe" suffix, which is
> redundant now.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

> diff --git a/block.c b/block.c
> index 3093d20db8..1fde22e4f4 100644
> --- a/block.c
> +++ b/block.c
> @@ -2070,59 +2070,40 @@ static GSList *bdrv_topological_dfs(GSList *list, GHashTable *found,
>      return g_slist_prepend(list, bs);
>  }
>  
> -static void bdrv_child_set_perm_commit(void *opaque)
> -{
> -    BdrvChild *c = opaque;
> -
> -    c->has_backup_perm = false;
> -}
> +typedef struct BdrvChildSetPermState {
> +    BdrvChild *child;
> +    uint64_t old_perm;
> +    uint64_t old_shared_perm;
> +} BdrvChildSetPermState;
>  
>  static void bdrv_child_set_perm_abort(void *opaque)
>  {
> -    BdrvChild *c = opaque;
> -    /*
> -     * We may have child->has_backup_perm unset at this point, as in case of
> -     * _check_ stage of permission update failure we may _check_ not the whole
> -     * subtree.  Still, _abort_ is called on the whole subtree anyway.
> -     */
> -    if (c->has_backup_perm) {
> -        c->perm = c->backup_perm;
> -        c->shared_perm = c->backup_shared_perm;
> -        c->has_backup_perm = false;
> -    }
> +    BdrvChildSetPermState *s = opaque;
> +
> +    s->child->perm = s->old_perm;
> +    s->child->shared_perm = s->old_shared_perm;
>  }

Ah, so this patch actually implements what I had asked for somewhere at
the start of the series.

Don't bother changing it earlier then. As long as it's in the same
series, this is fine.
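
For readers following along, a minimal sketch of the pattern this patch moves to: each transaction action carries its own state (here the old permissions) instead of stashing backup fields in the child. This is a toy model and not the block.c code; ToyChild, ToyAction and the toy_* functions are hypothetical names, and a plain linked list stands in for the GSList-based transaction.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for BdrvChild; only the fields used here. */
typedef struct ToyChild {
    uint64_t perm;
    uint64_t shared_perm;
} ToyChild;

/* One recorded action: what to call on abort, and its private state. */
typedef struct ToyAction {
    void (*abort)(void *opaque);
    void *opaque;
    struct ToyAction *next;
} ToyAction;

/* Private state of the set-perm action, modeled on BdrvChildSetPermState. */
typedef struct ToySetPermState {
    ToyChild *child;
    uint64_t old_perm;
    uint64_t old_shared_perm;
} ToySetPermState;

static void toy_set_perm_abort(void *opaque)
{
    ToySetPermState *s = opaque;

    s->child->perm = s->old_perm;
    s->child->shared_perm = s->old_shared_perm;
}

/* Apply new permissions and push an undo record onto the transaction. */
static void toy_child_set_perm(ToyChild *c, uint64_t perm, uint64_t shared,
                               ToyAction **tran)
{
    ToySetPermState *s = malloc(sizeof(*s));
    ToyAction *a = malloc(sizeof(*a));

    assert(s != NULL && a != NULL);
    *s = (ToySetPermState){ .child = c, .old_perm = c->perm,
                            .old_shared_perm = c->shared_perm };
    *a = (ToyAction){ .abort = toy_set_perm_abort, .opaque = s,
                      .next = *tran };
    *tran = a;
    c->perm = perm;
    c->shared_perm = shared;
}

/* Finish the transaction: undo newest-first if aborting, then free. */
static void toy_tran_finalize(ToyAction **tran, int do_abort)
{
    while (*tran) {
        ToyAction *a = *tran;

        *tran = a->next;
        if (do_abort) {
            a->abort(a->opaque);
        }
        free(a->opaque);
        free(a);
    }
}
```

Because records are pushed at the head of the list, an abort replays them newest-first, so several updates to the same child unwind back to the original values without any has_backup_perm style flag.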

Kevin




* Re: [PATCH v2 36/36] block: refactor bdrv_node_check_perm()
  2020-11-27 14:45 ` [PATCH v2 36/36] block: refactor bdrv_node_check_perm() Vladimir Sementsov-Ogievskiy
@ 2021-02-10 15:07   ` Kevin Wolf
  2021-02-11  9:50     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 108+ messages in thread
From: Kevin Wolf @ 2021-02-10 15:07 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: qemu-block, armbru, qemu-devel, mreitz, den, jsnow

On 27.11.2020 at 15:45, Vladimir Sementsov-Ogievskiy wrote:
> Now, bdrv_node_check_perm() is called only with fresh cumulative
> permissions, so it's actually "refresh_perm".
> 
> Move permission calculation to the function. Also, drop unreachable
> error message.
> 
> Add also Virtuozzo copyright, as big work is done at this point.

I guess we could add many copyright lines then... Maybe we should, I
don't know.

> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>  block.c | 38 +++++++++-----------------------------
>  1 file changed, 9 insertions(+), 29 deletions(-)
> 
> diff --git a/block.c b/block.c
> index 20b1cf59f7..576b145cbf 100644
> --- a/block.c
> +++ b/block.c
> @@ -2,6 +2,7 @@
>   * QEMU System Emulator block driver
>   *
>   * Copyright (c) 2003 Fabrice Bellard
> + * Copyright (c) 2020 Virtuozzo International GmbH.
>   *
>   * Permission is hereby granted, free of charge, to any person obtaining a copy
>   * of this software and associated documentation files (the "Software"), to deal
> @@ -2204,23 +2205,15 @@ static void bdrv_replace_child(BdrvChild *child, BlockDriverState *new_bs,
>      /* old_bs reference is transparently moved from @child to @s */
>  }
>  
> -/*
> - * Check whether permissions on this node can be changed in a way that
> - * @cumulative_perms and @cumulative_shared_perms are the new cumulative
> - * permissions of all its parents. This involves checking whether all necessary
> - * permission changes to child nodes can be performed.
> - *
> - * A call to this function must always be followed by a call to bdrv_set_perm()
> - * or bdrv_abort_perm_update().
> - */

Would you mind updating the comment rather than removing it?

> -static int bdrv_node_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
> -                                uint64_t cumulative_perms,
> -                                uint64_t cumulative_shared_perms,
> -                                GSList **tran, Error **errp)
> +static int bdrv_node_refresh_perm(BlockDriverState *bs, BlockReopenQueue *q,
> +                                  GSList **tran, Error **errp)
>  {
>      BlockDriver *drv = bs->drv;
>      BdrvChild *c;
>      int ret;
> +    uint64_t cumulative_perms, cumulative_shared_perms;
> +
> +    bdrv_get_cumulative_perm(bs, &cumulative_perms, &cumulative_shared_perms);
>  
>      /* Write permissions never work with read-only images */
>      if ((cumulative_perms & (BLK_PERM_WRITE | BLK_PERM_WRITE_UNCHANGED)) &&
> @@ -2229,15 +2222,8 @@ static int bdrv_node_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
>          if (!bdrv_is_writable_after_reopen(bs, NULL)) {
>              error_setg(errp, "Block node is read-only");
>          } else {
> -            uint64_t current_perms, current_shared;
> -            bdrv_get_cumulative_perm(bs, &current_perms, &current_shared);
> -            if (current_perms & (BLK_PERM_WRITE | BLK_PERM_WRITE_UNCHANGED)) {
> -                error_setg(errp, "Cannot make block node read-only, there is "
> -                           "a writer on it");
> -            } else {
> -                error_setg(errp, "Cannot make block node read-only and create "
> -                           "a writer on it");
> -            }
> +            error_setg(errp, "Cannot make block node read-only, there is "
> +                       "a writer on it");

Hm, so if you want to add a new writer to an existing read-only node,
this is the error message that you would get?

Now that we can't distinguish both cases any more, should we try to
rephrase it so that it makes sense for both directions? Like "Read-only
block node <node-name> cannot support read-write users"?
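
A small sketch of how such a direction-neutral message could be formatted, assuming the node name is available as a string. This uses plain snprintf() rather than error_setg() so that it stands alone; toy_format_ro_error() is a hypothetical helper, not an existing QEMU function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Format one message that reads correctly in both directions (making a node
 * read-only while it has a writer, or adding a writer to a read-only node).
 * Returns true if the whole message fit into buf, false if it was truncated.
 */
static bool toy_format_ro_error(char *buf, size_t len, const char *node_name)
{
    int n;

    assert(buf != NULL && node_name != NULL);
    n = snprintf(buf, len,
                 "Read-only block node '%s' cannot support read-write users",
                 node_name);
    return n >= 0 && (size_t)n < len;
}
```

The same format string could be passed to error_setg() in the place where the patch currently sets "Cannot make block node read-only, there is a writer on it".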


Sorry for it taking so long, but I've now finally looked at all patches
in this series. Please feel free to send v3 when you're ready.

Kevin




* Re: [PATCH v2 36/36] block: refactor bdrv_node_check_perm()
  2021-02-10 15:07   ` Kevin Wolf
@ 2021-02-11  9:50     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-02-11  9:50 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, qemu-devel, armbru, jsnow, mreitz, den

10.02.2021 18:07, Kevin Wolf wrote:
> On 27.11.2020 at 15:45, Vladimir Sementsov-Ogievskiy wrote:
>> Now, bdrv_node_check_perm() is called only with fresh cumulative
>> permissions, so it's actually "refresh_perm".
>>
>> Move permission calculation to the function. Also, drop unreachable
>> error message.
>>
>> Add also Virtuozzo copyright, as big work is done at this point.
> 
> I guess we could add many copyright lines then... Maybe we should, I
> don't know.
> 
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> ---
>>   block.c | 38 +++++++++-----------------------------
>>   1 file changed, 9 insertions(+), 29 deletions(-)
>>
>> diff --git a/block.c b/block.c
>> index 20b1cf59f7..576b145cbf 100644
>> --- a/block.c
>> +++ b/block.c
>> @@ -2,6 +2,7 @@
>>    * QEMU System Emulator block driver
>>    *
>>    * Copyright (c) 2003 Fabrice Bellard
>> + * Copyright (c) 2020 Virtuozzo International GmbH.
>>    *
>>    * Permission is hereby granted, free of charge, to any person obtaining a copy
>>    * of this software and associated documentation files (the "Software"), to deal
>> @@ -2204,23 +2205,15 @@ static void bdrv_replace_child(BdrvChild *child, BlockDriverState *new_bs,
>>       /* old_bs reference is transparently moved from @child to @s */
>>   }
>>   
>> -/*
>> - * Check whether permissions on this node can be changed in a way that
>> - * @cumulative_perms and @cumulative_shared_perms are the new cumulative
>> - * permissions of all its parents. This involves checking whether all necessary
>> - * permission changes to child nodes can be performed.
>> - *
>> - * A call to this function must always be followed by a call to bdrv_set_perm()
>> - * or bdrv_abort_perm_update().
>> - */
> 
> Would you mind updating the comment rather than removing it?
> 
>> -static int bdrv_node_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
>> -                                uint64_t cumulative_perms,
>> -                                uint64_t cumulative_shared_perms,
>> -                                GSList **tran, Error **errp)
>> +static int bdrv_node_refresh_perm(BlockDriverState *bs, BlockReopenQueue *q,
>> +                                  GSList **tran, Error **errp)
>>   {
>>       BlockDriver *drv = bs->drv;
>>       BdrvChild *c;
>>       int ret;
>> +    uint64_t cumulative_perms, cumulative_shared_perms;
>> +
>> +    bdrv_get_cumulative_perm(bs, &cumulative_perms, &cumulative_shared_perms);
>>   
>>       /* Write permissions never work with read-only images */
>>       if ((cumulative_perms & (BLK_PERM_WRITE | BLK_PERM_WRITE_UNCHANGED)) &&
>> @@ -2229,15 +2222,8 @@ static int bdrv_node_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
>>           if (!bdrv_is_writable_after_reopen(bs, NULL)) {
>>               error_setg(errp, "Block node is read-only");
>>           } else {
>> -            uint64_t current_perms, current_shared;
>> -            bdrv_get_cumulative_perm(bs, &current_perms, &current_shared);
>> -            if (current_perms & (BLK_PERM_WRITE | BLK_PERM_WRITE_UNCHANGED)) {
>> -                error_setg(errp, "Cannot make block node read-only, there is "
>> -                           "a writer on it");
>> -            } else {
>> -                error_setg(errp, "Cannot make block node read-only and create "
>> -                           "a writer on it");
>> -            }
>> +            error_setg(errp, "Cannot make block node read-only, there is "
>> +                       "a writer on it");
> 
> Hm, so if you want to add a new writer to an existing read-only node,
> this is the error message that you would get?
> 
> Now that we can't distinguish both cases any more, should we try to
> rephrase it so that it makes sense for both directions? Like "Read-only
> block node <node-name> cannot support read-write users"?
> 

OK.

> 
> Sorry for it taking so long, but I've now finally looked at all patches
> in this series. Please feel free to send v3 when you're ready.
> 
Thanks a lot for reviewing!

-- 
Best regards,
Vladimir



* Re: [PATCH v2 15/36] block: use topological sort for permission update
  2021-02-03 18:38           ` Kevin Wolf
  2021-02-04  7:16             ` Vladimir Sementsov-Ogievskiy
@ 2021-03-10 11:08             ` Vladimir Sementsov-Ogievskiy
  2021-03-10 11:55               ` Kevin Wolf
  1 sibling, 1 reply; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-03-10 11:08 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, qemu-devel, armbru, jsnow, mreitz, den

03.02.2021 21:38, Kevin Wolf wrote:
> Am 28.01.2021 um 19:04 hat Vladimir Sementsov-Ogievskiy geschrieben:
>> 28.01.2021 20:13, Kevin Wolf wrote:
>>> Am 28.01.2021 um 10:34 hat Vladimir Sementsov-Ogievskiy geschrieben:
>>>> 27.01.2021 21:38, Kevin Wolf wrote:
>>>>> Am 27.11.2020 um 15:45 hat Vladimir Sementsov-Ogievskiy geschrieben:
>>>>>> -static int bdrv_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
>>>>>> -                           uint64_t cumulative_perms,
>>>>>> -                           uint64_t cumulative_shared_perms,
>>>>>> -                           GSList *ignore_children, Error **errp)
>>>>>> +static int bdrv_node_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
>>>>>> +                                uint64_t cumulative_perms,
>>>>>> +                                uint64_t cumulative_shared_perms,
>>>>>> +                                GSList *ignore_children, Error **errp)
>>>>>>     {
>>>>>>         BlockDriver *drv = bs->drv;
>>>>>>         BdrvChild *c;
>>>>>> @@ -2166,21 +2193,43 @@ static int bdrv_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
>>>>>>         /* Check all children */
>>>>>>         QLIST_FOREACH(c, &bs->children, next) {
>>>>>>             uint64_t cur_perm, cur_shared;
>>>>>> -        GSList *cur_ignore_children;
>>>>>>             bdrv_child_perm(bs, c->bs, c, c->role, q,
>>>>>>                             cumulative_perms, cumulative_shared_perms,
>>>>>>                             &cur_perm, &cur_shared);
>>>>>> +        bdrv_child_set_perm_safe(c, cur_perm, cur_shared, NULL);
>>>>>
>>>>> This "added" line is actually old code. What is removed here is the
>>>>> recursive call of bdrv_check_update_perm(). This is what the code below
>>>>> will have to replace.
>>>>
>>>> yes, we'll use explicit loop instead of recursion
>>>>
>>>>>
>>>>>> +    }
>>>>>> +
>>>>>> +    return 0;
>>>>>> +}
>>>>>> +
>>>>>> +static int bdrv_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
>>>>>> +                           uint64_t cumulative_perms,
>>>>>> +                           uint64_t cumulative_shared_perms,
>>>>>> +                           GSList *ignore_children, Error **errp)
>>>>>> +{
>>>>>> +    int ret;
>>>>>> +    BlockDriverState *root = bs;
>>>>>> +    g_autoptr(GSList) list = bdrv_topological_dfs(NULL, NULL, root);
>>>>>> +
>>>>>> +    for ( ; list; list = list->next) {
>>>>>> +        bs = list->data;
>>>>>> +
>>>>>> +        if (bs != root) {
>>>>>> +            if (!bdrv_check_parents_compliance(bs, ignore_children, errp)) {
>>>>>> +                return -EINVAL;
>>>>>> +            }
>>>>>
>>>>> At this point bs still had the old permissions, but we don't access
>>>>> them. As we're going in topological order, the parents have already been
>>>>> updated if they were a child covered in bdrv_node_check_perm(), so we're
>>>>> checking the relevant values. Good.
>>>>>
>>>>> What about the root node? If I understand correctly, the parents of the
>>>>> root nodes wouldn't have been checked in the old code. In the new state,
>>>>> the parent BdrvChild already has to contain the new permission.
>>>>>
>>>>> In bdrv_refresh_perms(), we already check parent conflicts, so no change
>>>>> for all callers going through it. Good.
>>>>>
>>>>> bdrv_reopen_multiple() is less obvious. It passes permissions from the
>>>>> BDRVReopenState, without applying the permissions first.
>>>>
>>>> It will be changed in the series
>>>>
>>>>> Do we check the
>>>>> old parent permissions instead of the new state here?
>>>>
>>>> We use given (new) cumulative permissions for bs, and recalculate
>>>> permissions for bs subtree.
>>>
>>> Where do we actually set them? I would expect a
>>> bdrv_child_set_perm_safe() call somewhere, but I can't see it in the
>>> call path from bdrv_reopen_multiple().
>>
>> You mean parent BdrvChild objects? Then this question applies as well
>> to pre-patch code.
> 
> I don't think so. The pre-patch code doesn't rely on the permissions
> already being set in the BdrvChild object, but it gets them passed in
> parameters. Changing the graph first and relying on the information in
> BdrvChild is the new approach that you're introducing.

The new code still passes permissions as parameters for the root node. Only
inside the subtree do we rely on updated parents, and that is correct because
of the topological order of updating.


OK, that's incorrect in the following case: one of the parents (X) of an
inner node in the bs subtree IS NOT in the bs subtree but IS in the reopen
queue. Then we'll use the wrong permissions of X. Still:

1. It's preexisting. bdrv_check_update_perm() doesn't take the reopen queue
into account when calculating cumulative permissions (and ignore_children
doesn't help in the described case).

2. We can hope that on the next permission update, started from node X,
permissions will become more correct.

3. At the end of the series, permission calculation in bdrv_reopen_multiple()
is rewritten anyway.
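
To illustrate why the topological order matters inside the subtree, here is a minimal self-contained sketch. It is not the bdrv_topological_dfs() from the patch (which works on BlockDriverState and GSList); the Node type, topo_dfs() and topo_sort() are hypothetical names, and the graph is assumed to be acyclic and to fit in the caller's array:

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_NODES 8
#define MAX_CHILDREN 4

/* Hypothetical stand-in for a block node with its child links. */
typedef struct Node {
    const char *name;
    struct Node *children[MAX_CHILDREN];
    int nb_children;
    bool visited;
} Node;

/*
 * Post-order DFS: a node is stored only after all of its descendants.
 * Filling the array from the back therefore places every node ahead of
 * all of its descendants, i.e. each child comes after all of its
 * parents that are reachable from the root.
 */
static void topo_dfs(Node *n, Node **order, int *pos)
{
    if (n->visited) {
        return;
    }
    n->visited = true;
    for (int i = 0; i < n->nb_children; i++) {
        topo_dfs(n->children[i], order, pos);
    }
    order[--(*pos)] = n;
}

/*
 * Writes the nodes reachable from root into order[0..count) and returns
 * count. Assumes capacity is at least the number of reachable nodes.
 */
static int topo_sort(Node *root, Node **order, int capacity)
{
    int pos = capacity;
    int count;

    topo_dfs(root, order, &pos);
    count = capacity - pos;
    for (int i = 0; i < count; i++) {
        order[i] = order[pos + i];
    }
    return count;
}
```

For a diamond graph (root with children A and B, both having child C), a plain pre-order walk reaches C through A before B has been handled, while this ordering places C after both A and B. A parent X that is outside the subtree is never visited, which is exactly the gap described above.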


> 
>> So, we just call bdrv_check_perm() for bs in bdrv_reopen_multiple.. I
>> think the answer is like this:
>>
>> if state->perm and state->shared_perm are different from actual
>> cumulative permissions (before reopen), then we must have the
>> parent(s) of the node in same bs_queue. Then, corresponding children
>> are updated as part of another bdrv_check_perm call from same loop in
>> bdrv_reopen_multiple().
>>
>> Let's check how state->perm and state->shared_perm are set:
>>
>> bdrv_reopen_queue_child()
>>
>>      /* This needs to be overwritten in bdrv_reopen_prepare() */
>>      bs_entry->state.perm = UINT64_MAX;
>>      bs_entry->state.shared_perm = 0;
>>
>>
>> ...
>> bdrv_reopen_prepare()
>>
>>         bdrv_reopen_perm(queue, reopen_state->bs,
>>                       &reopen_state->perm, &reopen_state->shared_perm);
>>
>> and bdrv_reopen_perm() calculates cumulative permissions, taking
>> permissions from the queue for parents which exist in the queue.
> 
> Right, but it stores the new permissions in reopen_state, not in the
> BdrvChild objects that this patch is looking at. Or am I missing
> something?
> 
>> Not sure how correct it is, keeping in mind that we may look at a
>> node in queue, for which bdrv_reopen_perm was not yet called, but the
>> idea is clean.
> 
> I don't think the above code can work correctly without something
> actually updating the BdrvChild first.
> 
>>>> It follows old behavior. The only thing is changed that pre-patch we
>>>> do DFS recursion starting from bs (and probably visit some nodes
>>>> several times), after-patch we first do topological sort of bs subtree
>>>> and go through the list. The order of nodes is better and we visit
>>>> each node once.
>>>
>>> It's not the only thing that changes. Maybe this is what makes the patch
>>> hard to understand, because it seems to do two steps at once:
>>>
>>> 1. Change the order in which nodes are processed
>>>
>>> 2. Replace bdrv_check_update_perm() with bdrv_check_parents_compliance()
>>
>> hmm, yes. But we do bdrv_check_parents_compliance() only for nodes
>> inside subtree, for all except root.  So, for them we have updated
>> permissions.
> 
> Ah! This might be the missing piece that makes it safe.
> 
> Maybe worth a comment?
> 
> Kevin
> 


-- 
Best regards,
Vladimir



* Re: [PATCH v2 15/36] block: use topological sort for permission update
  2021-03-10 11:08             ` Vladimir Sementsov-Ogievskiy
@ 2021-03-10 11:55               ` Kevin Wolf
  0 siblings, 0 replies; 108+ messages in thread
From: Kevin Wolf @ 2021-03-10 11:55 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: qemu-block, armbru, qemu-devel, mreitz, den, jsnow

Am 10.03.2021 um 12:08 hat Vladimir Sementsov-Ogievskiy geschrieben:
> 03.02.2021 21:38, Kevin Wolf wrote:
> > Am 28.01.2021 um 19:04 hat Vladimir Sementsov-Ogievskiy geschrieben:
> > > 28.01.2021 20:13, Kevin Wolf wrote:
> > > > Am 28.01.2021 um 10:34 hat Vladimir Sementsov-Ogievskiy geschrieben:
> > > > > 27.01.2021 21:38, Kevin Wolf wrote:
> > > > > > Am 27.11.2020 um 15:45 hat Vladimir Sementsov-Ogievskiy geschrieben:
> > > > > > > -static int bdrv_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
> > > > > > > -                           uint64_t cumulative_perms,
> > > > > > > -                           uint64_t cumulative_shared_perms,
> > > > > > > -                           GSList *ignore_children, Error **errp)
> > > > > > > +static int bdrv_node_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
> > > > > > > +                                uint64_t cumulative_perms,
> > > > > > > +                                uint64_t cumulative_shared_perms,
> > > > > > > +                                GSList *ignore_children, Error **errp)
> > > > > > >     {
> > > > > > >         BlockDriver *drv = bs->drv;
> > > > > > >         BdrvChild *c;
> > > > > > > @@ -2166,21 +2193,43 @@ static int bdrv_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
> > > > > > >         /* Check all children */
> > > > > > >         QLIST_FOREACH(c, &bs->children, next) {
> > > > > > >             uint64_t cur_perm, cur_shared;
> > > > > > > -        GSList *cur_ignore_children;
> > > > > > >             bdrv_child_perm(bs, c->bs, c, c->role, q,
> > > > > > >                             cumulative_perms, cumulative_shared_perms,
> > > > > > >                             &cur_perm, &cur_shared);
> > > > > > > +        bdrv_child_set_perm_safe(c, cur_perm, cur_shared, NULL);
> > > > > > 
> > > > > > This "added" line is actually old code. What is removed here is the
> > > > > > recursive call of bdrv_check_update_perm(). This is what the code below
> > > > > > will have to replace.
> > > > > 
> > > > > yes, we'll use explicit loop instead of recursion
> > > > > 
> > > > > > 
> > > > > > > +    }
> > > > > > > +
> > > > > > > +    return 0;
> > > > > > > +}
> > > > > > > +
> > > > > > > +static int bdrv_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
> > > > > > > +                           uint64_t cumulative_perms,
> > > > > > > +                           uint64_t cumulative_shared_perms,
> > > > > > > +                           GSList *ignore_children, Error **errp)
> > > > > > > +{
> > > > > > > +    int ret;
> > > > > > > +    BlockDriverState *root = bs;
> > > > > > > +    g_autoptr(GSList) list = bdrv_topological_dfs(NULL, NULL, root);
> > > > > > > +
> > > > > > > +    for ( ; list; list = list->next) {
> > > > > > > +        bs = list->data;
> > > > > > > +
> > > > > > > +        if (bs != root) {
> > > > > > > +            if (!bdrv_check_parents_compliance(bs, ignore_children, errp)) {
> > > > > > > +                return -EINVAL;
> > > > > > > +            }
> > > > > > 
> > > > > > At this point bs still had the old permissions, but we don't access
> > > > > > them. As we're going in topological order, the parents have already been
> > > > > > updated if they were a child covered in bdrv_node_check_perm(), so we're
> > > > > > checking the relevant values. Good.
> > > > > > 
> > > > > > What about the root node? If I understand correctly, the parents of the
> > > > > > root nodes wouldn't have been checked in the old code. In the new state,
> > > > > > the parent BdrvChild already has to contain the new permission.
> > > > > > 
> > > > > > In bdrv_refresh_perms(), we already check parent conflicts, so no change
> > > > > > for all callers going through it. Good.
> > > > > > 
> > > > > > bdrv_reopen_multiple() is less obvious. It passes permissions from the
> > > > > > BDRVReopenState, without applying the permissions first.
> > > > > 
> > > > > It will be changed in the series
> > > > > 
> > > > > > Do we check the
> > > > > > old parent permissions instead of the new state here?
> > > > > 
> > > > > We use given (new) cumulative permissions for bs, and recalculate
> > > > > permissions for bs subtree.
> > > > 
> > > > Where do we actually set them? I would expect a
> > > > bdrv_child_set_perm_safe() call somewhere, but I can't see it in the
> > > > call path from bdrv_reopen_multiple().
> > > 
> > > You mean parent BdrvChild objects? Then this question applies as well
> > > to pre-patch code.
> > 
> > I don't think so. The pre-patch code doesn't rely on the permissions
> > already being set in the BdrvChild object, but it gets them passed in
> > parameters. Changing the graph first and relying on the information in
> > BdrvChild is the new approach that you're introducing.
> 
> The new code still passes permissions as parameters for the root node. Only
> inside the subtree do we rely on updated parents, and that is correct because
> of the topological order of updating.
> 
> 
> OK, that's incorrect in the following case: one of the parents (X) of an
> inner node in the bs subtree IS NOT in the bs subtree but IS in the reopen
> queue. Then we'll use the wrong permissions of X. Still:
> 
> 1. It's preexisting. bdrv_check_update_perm() doesn't take the reopen queue
> into account when calculating cumulative permissions (and ignore_children
> doesn't help in the described case).
> 
> 2. We can hope that on the next permission update, started from node X,
> permissions will become more correct.
> 
> 3. At the end of the series, permission calculation in bdrv_reopen_multiple()
> is rewritten anyway.

Yes, I think 3. is the strongest argument in favour of the patch.

Kevin




* Re: [PATCH v2 28/36] block: add bdrv_set_backing_noperm() transaction action
  2021-02-05 16:30       ` Kevin Wolf
@ 2021-03-11 18:29         ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-03-11 18:29 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, qemu-devel, armbru, jsnow, mreitz, den

05.02.2021 19:30, Kevin Wolf wrote:
> Am 05.02.2021 um 17:06 hat Vladimir Sementsov-Ogievskiy geschrieben:
>> 05.02.2021 17:00, Kevin Wolf wrote:
>>> Am 27.11.2020 um 15:45 hat Vladimir Sementsov-Ogievskiy geschrieben:
>>>> Split out the no-perm part of bdrv_set_backing_hd() as a separate
>>>> transaction action. Note that in case of an existing BdrvChild we reuse
>>>> it rather than recreate it, just to do fewer actions.
>>>>
>>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>>> ---
>>>>    block.c | 111 +++++++++++++++++++++++++++++++++++++++++++++-----------
>>>>    1 file changed, 89 insertions(+), 22 deletions(-)
>>>>
>>>> diff --git a/block.c b/block.c
>>>> index 54fb6d24bd..617cba9547 100644
>>>> --- a/block.c
>>>> +++ b/block.c
>>>> @@ -101,6 +101,7 @@ static int bdrv_attach_child_common(BlockDriverState *child_bs,
>>>>                                        uint64_t perm, uint64_t shared_perm,
>>>>                                        void *opaque, BdrvChild **child,
>>>>                                        GSList **tran, Error **errp);
>>>> +static void bdrv_remove_backing(BlockDriverState *bs, GSList **tran);
>>>>    static int bdrv_reopen_prepare(BDRVReopenState *reopen_state, BlockReopenQueue
>>>>                                   *queue, Error **errp);
>>>> @@ -3194,45 +3195,111 @@ static BdrvChildRole bdrv_backing_role(BlockDriverState *bs)
>>>>        }
>>>>    }
>>>> +typedef struct BdrvSetBackingNoPermState {
>>>> +    BlockDriverState *bs;
>>>> +    BlockDriverState *backing_bs;
>>>> +    BlockDriverState *old_inherits_from;
>>>> +    GSList *attach_tran;
>>>> +} BdrvSetBackingNoPermState;
>>>
>>> Why do we need the nested attach_tran instead of just including these
>>> actions in the outer transaction?
>>>
>>>> +static void bdrv_set_backing_noperm_abort(void *opaque)
>>>> +{
>>>> +    BdrvSetBackingNoPermState *s = opaque;
>>>> +
>>>> +    if (s->backing_bs) {
>>>> +        s->backing_bs->inherits_from = s->old_inherits_from;
>>>> +    }
>>>> +
>>>> +    tran_abort(s->attach_tran);
>>>> +
>>>> +    bdrv_refresh_limits(s->bs, NULL);
>>>> +    if (s->old_inherits_from) {
>>>> +        bdrv_refresh_limits(s->old_inherits_from, NULL);
>>>> +    }
>>>
>>> How is bs->inherits_from related to limits? I don't see a
>>> bdrv_refresh_limits() call in bdrv_set_backing_noperm() that this would
>>> undo.
>>>
>>>> +}
>>>> +
>>>> +static void bdrv_set_backing_noperm_commit(void *opaque)
>>>> +{
>>>> +    BdrvSetBackingNoPermState *s = opaque;
>>>> +
>>>> +    tran_commit(s->attach_tran);
>>>> +}
>>>> +
>>>> +static TransactionActionDrv bdrv_set_backing_noperm_drv = {
>>>> +    .abort = bdrv_set_backing_noperm_abort,
>>>> +    .commit = bdrv_set_backing_noperm_commit,
>>>> +    .clean = g_free,
>>>> +};
>>>> +
>>>>    /*
>>>>     * Sets the bs->backing link of a BDS. A new reference is created; callers
>>>>     * which don't need their own reference any more must call bdrv_unref().
>>>>     */
>>>> -void bdrv_set_backing_hd(BlockDriverState *bs, BlockDriverState *backing_hd,
>>>> -                         Error **errp)
>>>> +static int bdrv_set_backing_noperm(BlockDriverState *bs,
>>>> +                                   BlockDriverState *backing_bs,
>>>> +                                   GSList **tran, Error **errp)
>>>>    {
>>>> -    bool update_inherits_from = bdrv_chain_contains(bs, backing_hd) &&
>>>> -        bdrv_inherits_from_recursive(backing_hd, bs);
>>>> +    int ret = 0;
>>>> +    bool update_inherits_from = bdrv_chain_contains(bs, backing_bs) &&
>>>> +        bdrv_inherits_from_recursive(backing_bs, bs);
>>>> +    GSList *attach_tran = NULL;
>>>> +    BdrvSetBackingNoPermState *s;
>>>>        if (bdrv_is_backing_chain_frozen(bs, child_bs(bs->backing), errp)) {
>>>> -        return;
>>>> +        return -EPERM;
>>>>        }
>>>> -    if (backing_hd) {
>>>> -        bdrv_ref(backing_hd);
>>>> +    if (bs->backing && backing_bs) {
>>>> +        bdrv_replace_child_safe(bs->backing, backing_bs, tran);
>>>> +    } else if (bs->backing && !backing_bs) {
>>>> +        bdrv_remove_backing(bs, tran);
>>>> +    } else if (backing_bs) {
>>>> +        assert(!bs->backing);
>>>> +        ret = bdrv_attach_child_noperm(bs, backing_bs, "backing",
>>>> +                                       &child_of_bds, bdrv_backing_role(bs),
>>>> +                                       &bs->backing, &attach_tran, errp);
>>>> +        if (ret < 0) {
>>>> +            tran_abort(attach_tran);
>>>
>>> This looks wrong to me, we'll call tran_abort() a second time through
>>> bdrv_set_backing_noperm_abort() when the outer transaction aborts.
>>>
>>> I also notice that the other two if branches do just add things to the
>>> outer 'tran', it's just this branch that gets a nested one.
>>>
>>>> +            return ret;
>>>> +        }
>>>>        }
>>>> -    if (bs->backing) {
>>>> -        /* Cannot be frozen, we checked that above */
>>>> -        bdrv_unref_child(bs, bs->backing);
>>>> -        bs->backing = NULL;
>>>> -    }
>>>> +    s = g_new(BdrvSetBackingNoPermState, 1);
>>>> +    *s = (BdrvSetBackingNoPermState) {
>>>> +        .bs = bs,
>>>> +        .backing_bs = backing_bs,
>>>> +        .old_inherits_from = backing_bs ? backing_bs->inherits_from : NULL,
>>>> +    };
>>>> +    tran_prepend(tran, &bdrv_set_backing_noperm_drv, s);
>>>> -    if (!backing_hd) {
>>>> -        goto out;
>>>> +    /*
>>>> +     * If backing_bs was already part of bs's backing chain, and
>>>> +     * inherits_from pointed recursively to bs then let's update it to
>>>> +     * point directly to bs (else it will become NULL).
>>>
>>> Setting it to NULL was previously done by bdrv_unref_child().
>>>
>>> bdrv_replace_child_safe() and bdrv_remove_backing() don't seem to do
>>> this any more.
>>
>> Hmm, yes.. Maybe we should move bdrv_unset_inherits_from() from
>> bdrv_unref_child() to bdrv_replace_child_noperm()?
> 
> Sounds good to me. This should hopefully be called for all graph changes
> that could possibly happen.
> 

This would break a current "feature":

block jobs don't permanently break inherits_from: while a filter is
inserted, inherits_from is broken, but when the filter is removed it works
again, because .inherits_from is not changed by bdrv_append(). So I'll just
try to keep the current behavior around inherits_from as is.
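
On the double tran_abort() concern from the quoted review: a minimal sketch of a flat transaction list, where every action is added to the one outer list and so each abort callback runs exactly once. This is not QEMU's util/transactions.c; Tran, Action, tran_add(), tran_finalize() and record_order() are hypothetical names, and emptying the list before running callbacks is a design choice of this sketch:

```c
#include <stdlib.h>

/* Hypothetical action: one undo/redo step with its private state. */
typedef struct Action {
    void (*abort)(void *opaque);
    void (*commit)(void *opaque);
    void *opaque;
    struct Action *next;
} Action;

typedef struct Tran {
    Action *head;
} Tran;

/* Prepend an action, so that abort later runs in reverse order of addition. */
static int tran_add(Tran *t, void (*abort_cb)(void *),
                    void (*commit_cb)(void *), void *opaque)
{
    Action *a = malloc(sizeof(*a));

    if (!a) {
        return -1;
    }
    a->abort = abort_cb;
    a->commit = commit_cb;
    a->opaque = opaque;
    a->next = t->head;
    t->head = a;
    return 0;
}

/*
 * Run all commit (do_commit != 0) or abort callbacks and free the list.
 * The list is detached first, so a second call finds nothing to do and
 * no callback can run twice.
 */
static void tran_finalize(Tran *t, int do_commit)
{
    Action *a = t->head;

    t->head = NULL;
    while (a) {
        Action *next = a->next;
        void (*cb)(void *) = do_commit ? a->commit : a->abort;

        if (cb) {
            cb(a->opaque);
        }
        free(a);
        a = next;
    }
}

/* Demo callback: stores the position at which it was invoked. */
static int call_seq;
static void record_order(void *opaque)
{
    *(int *)opaque = ++call_seq;
}
```

With two actions added in order, aborting invokes the second one first, and a repeated abort changes nothing. A nested list that is aborted both directly and again from the outer action's abort handler would not have that property unless it is emptied in the same way.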



-- 
Best regards,
Vladimir



Thread overview: 108+ messages
2020-11-27 14:44 [PATCH v2 00/36] block: update graph permissions update Vladimir Sementsov-Ogievskiy
2020-11-27 14:44 ` [PATCH v2 01/36] tests/test-bdrv-graph-mod: add test_parallel_exclusive_write Vladimir Sementsov-Ogievskiy
2020-11-27 14:44 ` [PATCH v2 02/36] tests/test-bdrv-graph-mod: add test_parallel_perm_update Vladimir Sementsov-Ogievskiy
2021-01-18 14:05   ` Kevin Wolf
2021-01-18 17:13     ` Vladimir Sementsov-Ogievskiy
2020-11-27 14:44 ` [PATCH v2 03/36] block: bdrv_append(): don't consume reference Vladimir Sementsov-Ogievskiy
2021-01-18 14:18   ` Kevin Wolf
2021-01-18 17:21     ` Vladimir Sementsov-Ogievskiy
2021-01-18 17:59       ` Kevin Wolf
2020-11-27 14:44 ` [PATCH v2 04/36] block: bdrv_append(): return status Vladimir Sementsov-Ogievskiy
2020-12-14 15:49   ` Alberto Garcia
2021-01-18 14:32   ` Kevin Wolf
2020-11-27 14:44 ` [PATCH v2 05/36] block: add bdrv_parent_try_set_aio_context Vladimir Sementsov-Ogievskiy
2021-01-18 15:08   ` Kevin Wolf
2021-01-18 17:26     ` Vladimir Sementsov-Ogievskiy
2020-11-27 14:44 ` [PATCH v2 06/36] block: BdrvChildClass: add .get_parent_aio_context handler Vladimir Sementsov-Ogievskiy
2021-01-18 15:13   ` Kevin Wolf
2021-01-18 17:36     ` Vladimir Sementsov-Ogievskiy
2021-01-19 16:38       ` Kevin Wolf
2021-01-22 11:04         ` Vladimir Sementsov-Ogievskiy
2021-01-22 11:18           ` Kevin Wolf
2021-01-22 11:26             ` Vladimir Sementsov-Ogievskiy
2020-11-27 14:44 ` [PATCH v2 07/36] block: drop ctx argument from bdrv_root_attach_child Vladimir Sementsov-Ogievskiy
2020-11-27 14:44 ` [PATCH v2 08/36] block: make bdrv_reopen_{prepare, commit, abort} private Vladimir Sementsov-Ogievskiy
2020-12-15 17:28   ` Alberto Garcia
2021-01-18 15:24   ` [PATCH v2 08/36] block: make bdrv_reopen_{prepare,commit,abort} private Kevin Wolf
2020-11-27 14:44 ` [PATCH v2 09/36] block: return value from bdrv_replace_node() Vladimir Sementsov-Ogievskiy
2020-12-15 17:30   ` Alberto Garcia
2021-01-18 15:40   ` Kevin Wolf
2020-11-27 14:44 ` [PATCH v2 10/36] util: add transactions.c Vladimir Sementsov-Ogievskiy
2021-01-18 16:50   ` Kevin Wolf
2021-01-18 17:41     ` Vladimir Sementsov-Ogievskiy
2020-11-27 14:44 ` [PATCH v2 11/36] block: bdrv_refresh_perms: check parents compliance Vladimir Sementsov-Ogievskiy
2021-01-19 17:42   ` Kevin Wolf
2021-01-19 18:10     ` Vladimir Sementsov-Ogievskiy
2020-11-27 14:44 ` [PATCH v2 12/36] block: refactor bdrv_child* permission functions Vladimir Sementsov-Ogievskiy
2021-01-19 18:09   ` Kevin Wolf
2021-01-19 18:30     ` Vladimir Sementsov-Ogievskiy
2021-01-20  9:09       ` Kevin Wolf
2021-01-20  9:56         ` Vladimir Sementsov-Ogievskiy
2021-01-20 10:06           ` Kevin Wolf
2020-11-27 14:44 ` [PATCH v2 13/36] block: rewrite bdrv_child_try_set_perm() using bdrv_refresh_perms() Vladimir Sementsov-Ogievskiy
2020-11-27 14:45 ` [PATCH v2 14/36] block: inline bdrv_child_*() permission functions calls Vladimir Sementsov-Ogievskiy
2020-12-16 17:16   ` Alberto Garcia
2020-11-27 14:45 ` [PATCH v2 15/36] block: use topological sort for permission update Vladimir Sementsov-Ogievskiy
2021-01-27 18:38   ` Kevin Wolf
2021-01-28  9:34     ` Vladimir Sementsov-Ogievskiy
2021-01-28 17:13       ` Kevin Wolf
2021-01-28 18:04         ` Vladimir Sementsov-Ogievskiy
2021-02-03 18:38           ` Kevin Wolf
2021-02-04  7:16             ` Vladimir Sementsov-Ogievskiy
2021-03-10 11:08             ` Vladimir Sementsov-Ogievskiy
2021-03-10 11:55               ` Kevin Wolf
2020-11-27 14:45 ` [PATCH v2 16/36] block: add bdrv_drv_set_perm transaction action Vladimir Sementsov-Ogievskiy
2020-11-27 14:45 ` [PATCH v2 17/36] block: add bdrv_list_* permission update functions Vladimir Sementsov-Ogievskiy
2020-11-27 14:45 ` [PATCH v2 18/36] block: add bdrv_replace_child_safe() transaction action Vladimir Sementsov-Ogievskiy
2020-11-27 14:45 ` [PATCH v2 19/36] block: fix bdrv_replace_node_common Vladimir Sementsov-Ogievskiy
2021-02-03 18:23   ` Kevin Wolf
2021-02-04  7:24     ` Vladimir Sementsov-Ogievskiy
2020-11-27 14:45 ` [PATCH v2 20/36] block: add bdrv_attach_child_common() transaction action Vladimir Sementsov-Ogievskiy
2021-02-03 21:01   ` Kevin Wolf
2021-02-04  7:34     ` Vladimir Sementsov-Ogievskiy
2021-02-04  7:50       ` Kevin Wolf
2020-11-27 14:45 ` [PATCH v2 21/36] block: add bdrv_attach_child_noperm() " Vladimir Sementsov-Ogievskiy
2020-11-27 14:45 ` [PATCH v2 22/36] block: split out bdrv_replace_node_noperm() Vladimir Sementsov-Ogievskiy
2021-02-03 21:16   ` Kevin Wolf
2020-11-27 14:45 ` [PATCH v2 23/36] block: adapt bdrv_append() for inserting filters Vladimir Sementsov-Ogievskiy
2021-02-03 21:33   ` Kevin Wolf
2021-02-04  8:30     ` Vladimir Sementsov-Ogievskiy
2021-02-04  9:05       ` Kevin Wolf
2021-02-04 11:54         ` Vladimir Sementsov-Ogievskiy
2020-11-27 14:45 ` [PATCH v2 24/36] block: add bdrv_remove_backing transaction action Vladimir Sementsov-Ogievskiy
2020-11-27 14:45 ` [PATCH v2 25/36] block: introduce bdrv_drop_filter() Vladimir Sementsov-Ogievskiy
2021-02-04 11:31   ` Kevin Wolf
2021-02-04 12:27     ` Vladimir Sementsov-Ogievskiy
2020-11-27 14:45 ` [PATCH v2 26/36] block/backup-top: drop .active Vladimir Sementsov-Ogievskiy
2021-02-04 12:26   ` Kevin Wolf
2021-02-04 12:33     ` Vladimir Sementsov-Ogievskiy
2021-02-04 13:25       ` Kevin Wolf
2021-02-04 13:46         ` Vladimir Sementsov-Ogievskiy
2021-02-04 14:31           ` Kevin Wolf
2020-11-27 14:45 ` [PATCH v2 27/36] block: drop ignore_children for permission update functions Vladimir Sementsov-Ogievskiy
2020-11-27 14:45 ` [PATCH v2 28/36] block: add bdrv_set_backing_noperm() transaction action Vladimir Sementsov-Ogievskiy
2021-02-05 14:00   ` Kevin Wolf
2021-02-05 16:06     ` Vladimir Sementsov-Ogievskiy
2021-02-05 16:30       ` Kevin Wolf
2021-03-11 18:29         ` Vladimir Sementsov-Ogievskiy
2021-02-05 16:26   ` Kevin Wolf
2021-02-08  9:34     ` Vladimir Sementsov-Ogievskiy
2020-11-27 14:45 ` [PATCH v2 29/36] blockdev: qmp_x_blockdev_reopen: acquire all contexts Vladimir Sementsov-Ogievskiy
2021-02-05 16:01   ` Kevin Wolf
2021-02-05 16:16     ` Vladimir Sementsov-Ogievskiy
2021-02-05 16:36       ` Kevin Wolf
2020-11-27 14:45 ` [PATCH v2 30/36] block: bdrv_reopen_multiple: refresh permissions on updated graph Vladimir Sementsov-Ogievskiy
2021-02-05 17:57   ` Kevin Wolf
2021-02-08 11:21     ` Vladimir Sementsov-Ogievskiy
2021-02-10 14:13       ` Kevin Wolf
2021-02-10 14:38       ` Kevin Wolf
2020-11-27 14:45 ` [PATCH v2 31/36] block: drop unused permission update functions Vladimir Sementsov-Ogievskiy
2020-11-27 14:45 ` [PATCH v2 32/36] block: inline bdrv_check_perm_common() Vladimir Sementsov-Ogievskiy
2020-11-27 14:45 ` [PATCH v2 33/36] block: inline bdrv_replace_child() Vladimir Sementsov-Ogievskiy
2020-11-27 14:45 ` [PATCH v2 34/36] block: refactor bdrv_child_set_perm_safe() transaction action Vladimir Sementsov-Ogievskiy
2021-02-10 14:51   ` Kevin Wolf
2020-11-27 14:45 ` [PATCH v2 35/36] block: rename bdrv_replace_child_safe() to bdrv_replace_child() Vladimir Sementsov-Ogievskiy
2020-11-27 14:45 ` [PATCH v2 36/36] block: refactor bdrv_node_check_perm() Vladimir Sementsov-Ogievskiy
2021-02-10 15:07   ` Kevin Wolf
2021-02-11  9:50     ` Vladimir Sementsov-Ogievskiy
2021-01-09 10:12 ` [PATCH v2 00/36] block: update graph permissions update Vladimir Sementsov-Ogievskiy
