* [Qemu-devel] [PATCH v2 0/3] block/mirror: Fix target backing BDS
@ 2016-06-06 14:42 Max Reitz
  2016-06-06 14:42 ` [Qemu-devel] [PATCH v2 1/3] block: Allow replacement of a BDS by its overlay Max Reitz
                   ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Max Reitz @ 2016-06-06 14:42 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, Max Reitz, Kevin Wolf, Fam Zheng

Issue #1: Sometimes we can have a wrong backing BDS for the target after
a mirror block job. In "existing" mode with drive-mirror, or when using
blockdev-mirror, it's generally the user's fault. In "absolute-paths"
mode this only means that after a sync=full drive-mirror, the target may
have a backing file, but this will not change its visible data, so it's
"fine".

Still, it's ugly.

Issue #2: Currently the backing chain of the target is basically opened
using bdrv_open_backing_file() (except for sometimes™). This results in
multiple BDSs for a single physical file, which is bad. In most use
cases, this is only temporary, but it still is bad.

We can just reuse the existing backing chain of the source, so we should
do so.


Patch 2 fixes both issues. Patch 1 allows change_parent_backing_link() to
replace a BDS by its immediate overlay (which is necessary so that patch
2 can set the source BDS as the backing BDS of the target (sync=none) in
mirror_complete(), i.e. before bdrv_replace_in_backing_chain() is called
in mirror_exit()).

Patch 3 adds a test.


v2:
- Move the whole logic to mirror_complete(). This has the benefit of
  resolving the bdrv_open_backing_file() issue with multiple BDSs being
  open for a single physical file (which is a very real issue when it
  comes to image locking).
- However, this has the drawback of requiring patch 1, which therefore
  had to be added.


git-backport-diff against v1:

Key:
[----] : patches are identical
[####] : number of functional differences between upstream/downstream patch
[down] : patch is downstream-only
The flags [FC] indicate (F)unctional and (C)ontextual differences, respectively

001/3:[down] 'block: Allow replacement of a BDS by its overlay'
002/3:[0030] [FC] 'block/mirror: Fix target backing BDS'
003/3:[----] [--] 'iotests: Add test for post-mirror backing chains'


Max Reitz (3):
  block: Allow replacement of a BDS by its overlay
  block/mirror: Fix target backing BDS
  iotests: Add test for post-mirror backing chains

 block.c                    |  23 +++--
 block/mirror.c             |  21 +++--
 tests/qemu-iotests/155     | 218 +++++++++++++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/155.out |   5 ++
 tests/qemu-iotests/group   |   1 +
 5 files changed, 251 insertions(+), 17 deletions(-)
 create mode 100755 tests/qemu-iotests/155
 create mode 100644 tests/qemu-iotests/155.out

-- 
2.8.3

* [Qemu-devel] [PATCH v2 1/3] block: Allow replacement of a BDS by its overlay
  2016-06-06 14:42 [Qemu-devel] [PATCH v2 0/3] block/mirror: Fix target backing BDS Max Reitz
@ 2016-06-06 14:42 ` Max Reitz
  2016-06-08  8:58   ` Kevin Wolf
  2016-06-06 14:42 ` [Qemu-devel] [PATCH v2 2/3] block/mirror: Fix target backing BDS Max Reitz
  2016-06-06 14:42 ` [Qemu-devel] [PATCH v2 3/3] iotests: Add test for post-mirror backing chains Max Reitz
  2 siblings, 1 reply; 16+ messages in thread
From: Max Reitz @ 2016-06-06 14:42 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, Max Reitz, Kevin Wolf, Fam Zheng

change_parent_backing_link() asserts that the BDS to be replaced is not
used as a backing file. However, we may want to replace a BDS by its
overlay in which case that very link should not be redirected.

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 block.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/block.c b/block.c
index f54bc25..16463aa 100644
--- a/block.c
+++ b/block.c
@@ -2224,9 +2224,22 @@ void bdrv_close_all(void)
 static void change_parent_backing_link(BlockDriverState *from,
                                        BlockDriverState *to)
 {
-    BdrvChild *c, *next;
+    BdrvChild *c, *next, *to_c;
 
     QLIST_FOREACH_SAFE(c, &from->parents, next_parent, next) {
+        if (c->role == &child_backing) {
+            /* Allow @from to be in a backing chain, but only if it is @to's
+             * backing chain. Do not replace @from by @to there. */
+            QLIST_FOREACH(to_c, &to->children, next) {
+                if (to_c == c) {
+                    break;
+                }
+            }
+            if (to_c) {
+                continue;
+            }
+        }
+
         assert(c->role != &child_backing);
         bdrv_ref(to);
         bdrv_replace_child(c, to);
-- 
2.8.3

* [Qemu-devel] [PATCH v2 2/3] block/mirror: Fix target backing BDS
  2016-06-06 14:42 [Qemu-devel] [PATCH v2 0/3] block/mirror: Fix target backing BDS Max Reitz
  2016-06-06 14:42 ` [Qemu-devel] [PATCH v2 1/3] block: Allow replacement of a BDS by its overlay Max Reitz
@ 2016-06-06 14:42 ` Max Reitz
  2016-06-08  9:32   ` Kevin Wolf
  2016-06-06 14:42 ` [Qemu-devel] [PATCH v2 3/3] iotests: Add test for post-mirror backing chains Max Reitz
  2 siblings, 1 reply; 16+ messages in thread
From: Max Reitz @ 2016-06-06 14:42 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, Max Reitz, Kevin Wolf, Fam Zheng

Currently, we are trying to move the backing BDS from the source to the
target in bdrv_replace_in_backing_chain() which is called from
mirror_exit(). However, mirror_complete() already tries to open the
target's backing chain with a call to bdrv_open_backing_file().

First, we should only set the target's backing BDS once. Second, the
mirroring block job has a better idea of what to set it to than the
generic code in bdrv_replace_in_backing_chain() (in fact, the latter's
conditions on when to move the backing BDS from source to target are not
really correct).

Therefore, remove that code from bdrv_replace_in_backing_chain() and
leave it to mirror_complete().

However, mirror_complete() in turn pursues a questionable strategy by
employing bdrv_open_backing_file(). On the one hand, this may open the
wrong backing file with drive-mirror in "existing" mode, and it will not
override a possibly wrong backing file in the blockdev-mirror case.

On the other hand, we want to reuse the existing backing chain of the
source instead of opening everything anew, because the latter results in
having multiple BDSs for a single physical file and thus potentially
concurrent access which we should try to avoid.

Thus, instead of invoking bdrv_open_backing_file(), just set the correct
backing BDS directly via bdrv_set_backing_hd(). Also, do so only when
mirror_complete() is certain to succeed.

In contrast to what bdrv_replace_in_backing_chain() did so far, we do
not need to drop the source's backing file.
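
The selection rule described above can be modelled outside of QEMU. The
following is only a toy sketch in Python: the BDS class, chain_contains()
and complete() are hypothetical stand-ins for BlockDriverState,
bdrv_chain_contains() and the new block in mirror_complete(). It also
assumes that the job's base is the source's backing BDS for sync=top and
unset otherwise, which is what the expectations in the iotest of patch 3
suggest.

```python
class BDS:
    """Hypothetical stand-in for a BlockDriverState with one backing link."""
    def __init__(self, name, backing=None):
        self.name = name
        self.backing = backing


def chain_contains(top, node):
    """Return True if node is top or anywhere in top's backing chain."""
    while top is not None:
        if top is node:
            return True
        top = top.backing
    return False


def complete(src, target, base, is_none_mode):
    """Model of the backing adjustment done when the job completes."""
    # If the target already sits below the source (commit case), leave
    # its backing link alone.
    if chain_contains(src.backing, target):
        return
    backing = src if is_none_mode else base
    if target.backing is not backing:
        target.backing = backing


results = {}
for mode in ("full", "top", "none"):
    back2 = BDS("back2")
    source = BDS("source", back2)
    target = BDS("target")
    # Assumption: base is only set for sync=top.
    base = back2 if mode == "top" else None
    complete(source, target, base, is_none_mode=(mode == "none"))
    results[mode] = target.backing.name if target.backing else None

print(results)  # {'full': None, 'top': 'back2', 'none': 'source'}
```

In this model the source keeps its own backing link in all three cases,
matching the statement that the source's backing file does not need to
be dropped.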

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 block.c        |  8 --------
 block/mirror.c | 21 +++++++++++++--------
 2 files changed, 13 insertions(+), 16 deletions(-)

diff --git a/block.c b/block.c
index 16463aa..792f5dd 100644
--- a/block.c
+++ b/block.c
@@ -2288,14 +2288,6 @@ void bdrv_replace_in_backing_chain(BlockDriverState *old, BlockDriverState *new)
 
     change_parent_backing_link(old, new);
 
-    /* Change backing files if a previously independent node is added to the
-     * chain. For active commit, we replace top by its own (indirect) backing
-     * file and don't do anything here so we don't build a loop. */
-    if (new->backing == NULL && !bdrv_chain_contains(backing_bs(old), new)) {
-        bdrv_set_backing_hd(new, backing_bs(old));
-        bdrv_set_backing_hd(old, NULL);
-    }
-
     bdrv_unref(old);
 }
 
diff --git a/block/mirror.c b/block/mirror.c
index 80fd3c7..217475b 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -742,15 +742,11 @@ static void mirror_set_speed(BlockJob *job, int64_t speed, Error **errp)
 static void mirror_complete(BlockJob *job, Error **errp)
 {
     MirrorBlockJob *s = container_of(job, MirrorBlockJob, common);
-    Error *local_err = NULL;
-    int ret;
+    BlockDriverState *src, *target;
+
+    src = blk_bs(job->blk);
+    target = blk_bs(s->target);
 
-    ret = bdrv_open_backing_file(blk_bs(s->target), NULL, "backing",
-                                 &local_err);
-    if (ret < 0) {
-        error_propagate(errp, local_err);
-        return;
-    }
     if (!s->synced) {
         error_setg(errp, QERR_BLOCK_JOB_NOT_READY, job->id);
         return;
@@ -777,6 +773,15 @@ static void mirror_complete(BlockJob *job, Error **errp)
         aio_context_release(replace_aio_context);
     }
 
+    /* Now we need to adjust the target's backing BDS. This is not necessary
+     * when having performed a commit operation. */
+    if (!bdrv_chain_contains(backing_bs(src), target)) {
+        BlockDriverState *backing = s->is_none_mode ? src : s->base;
+        if (backing_bs(target) != backing) {
+            bdrv_set_backing_hd(target, backing);
+        }
+    }
+
     s->should_complete = true;
     block_job_enter(&s->common);
 }
-- 
2.8.3

* [Qemu-devel] [PATCH v2 3/3] iotests: Add test for post-mirror backing chains
  2016-06-06 14:42 [Qemu-devel] [PATCH v2 0/3] block/mirror: Fix target backing BDS Max Reitz
  2016-06-06 14:42 ` [Qemu-devel] [PATCH v2 1/3] block: Allow replacement of a BDS by its overlay Max Reitz
  2016-06-06 14:42 ` [Qemu-devel] [PATCH v2 2/3] block/mirror: Fix target backing BDS Max Reitz
@ 2016-06-06 14:42 ` Max Reitz
  2 siblings, 0 replies; 16+ messages in thread
From: Max Reitz @ 2016-06-06 14:42 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, Max Reitz, Kevin Wolf, Fam Zheng

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 tests/qemu-iotests/155     | 218 +++++++++++++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/155.out |   5 ++
 tests/qemu-iotests/group   |   1 +
 3 files changed, 224 insertions(+)
 create mode 100755 tests/qemu-iotests/155
 create mode 100644 tests/qemu-iotests/155.out

diff --git a/tests/qemu-iotests/155 b/tests/qemu-iotests/155
new file mode 100755
index 0000000..76fdd4f
--- /dev/null
+++ b/tests/qemu-iotests/155
@@ -0,0 +1,218 @@
+#!/usr/bin/env python
+#
+# Test whether the backing BDSs are correct after completion of a
+# mirror block job
+#
+# Copyright (C) 2016 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
+
+import os
+import stat
+import time
+import iotests
+from iotests import qemu_img
+
+back0_img = os.path.join(iotests.test_dir, 'back0.' + iotests.imgfmt)
+back1_img = os.path.join(iotests.test_dir, 'back1.' + iotests.imgfmt)
+back2_img = os.path.join(iotests.test_dir, 'back2.' + iotests.imgfmt)
+source_img = os.path.join(iotests.test_dir, 'source.' + iotests.imgfmt)
+target_img = os.path.join(iotests.test_dir, 'target.' + iotests.imgfmt)
+
+class BaseClass(iotests.QMPTestCase):
+    def setUp(self):
+        qemu_img('create', '-f', iotests.imgfmt, back0_img, '1M')
+        qemu_img('create', '-f', iotests.imgfmt, '-b', back0_img, back1_img)
+        qemu_img('create', '-f', iotests.imgfmt, '-b', back1_img, back2_img)
+        qemu_img('create', '-f', iotests.imgfmt, '-b', back2_img, source_img)
+
+        self.vm = iotests.VM()
+        self.vm.add_drive(None, '', 'none')
+        self.vm.launch()
+
+        # Add the BDS via blockdev-add so it stays around after the mirror block
+        # job has been completed
+        result = self.vm.qmp('blockdev-add',
+                             options={'node-name': 'source',
+                                      'driver': iotests.imgfmt,
+                                      'file': {'driver': 'file',
+                                               'filename': source_img}})
+        self.assert_qmp(result, 'return', {})
+
+        result = self.vm.qmp('x-blockdev-insert-medium',
+                             device='drive0', node_name='source')
+        self.assert_qmp(result, 'return', {})
+
+        self.assertIntactSourceBackingChain()
+
+        if self.existing:
+            if self.target_backing:
+                qemu_img('create', '-f', iotests.imgfmt,
+                         '-b', self.target_backing, target_img, '1M')
+            else:
+                qemu_img('create', '-f', iotests.imgfmt, target_img, '1M')
+
+            if self.cmd == 'blockdev-mirror':
+                result = self.vm.qmp('blockdev-add',
+                                     options={'node-name': 'target',
+                                              'driver': iotests.imgfmt,
+                                              'file': {'driver': 'file',
+                                                       'filename': target_img}})
+                self.assert_qmp(result, 'return', {})
+
+    def tearDown(self):
+        self.vm.shutdown()
+        os.remove(source_img)
+        os.remove(back2_img)
+        os.remove(back1_img)
+        os.remove(back0_img)
+        try:
+            os.remove(target_img)
+        except OSError:
+            pass
+
+    def findBlockNode(self, node_name, id=None):
+        if id:
+            result = self.vm.qmp('query-block')
+            for device in result['return']:
+                if device['device'] == id:
+                    if node_name:
+                        self.assert_qmp(device, 'inserted/node-name', node_name)
+                    return device['inserted']
+        else:
+            result = self.vm.qmp('query-named-block-nodes')
+            for node in result['return']:
+                if node['node-name'] == node_name:
+                    return node
+
+        self.fail('Cannot find node %s/%s' % (id, node_name))
+
+    def assertIntactSourceBackingChain(self):
+        node = self.findBlockNode('source')
+
+        self.assert_qmp(node, 'image' + '/backing-image' * 0 + '/filename',
+                        source_img)
+        self.assert_qmp(node, 'image' + '/backing-image' * 1 + '/filename',
+                        back2_img)
+        self.assert_qmp(node, 'image' + '/backing-image' * 2 + '/filename',
+                        back1_img)
+        self.assert_qmp(node, 'image' + '/backing-image' * 3 + '/filename',
+                        back0_img)
+        self.assert_qmp_absent(node, 'image' + '/backing-image' * 4)
+
+
+class MirrorBaseClass(BaseClass):
+    def runMirror(self, sync):
+        if self.cmd == 'blockdev-mirror':
+            result = self.vm.qmp(self.cmd, device='drive0', sync=sync,
+                                 target='target')
+        else:
+            if self.existing:
+                mode = 'existing'
+            else:
+                mode = 'absolute-paths'
+            result = self.vm.qmp(self.cmd, device='drive0', sync=sync,
+                                 target=target_img, format=iotests.imgfmt,
+                                 mode=mode, node_name='target')
+
+        self.assert_qmp(result, 'return', {})
+
+        self.vm.event_wait('BLOCK_JOB_READY')
+
+        result = self.vm.qmp('block-job-complete', device='drive0')
+        self.assert_qmp(result, 'return', {})
+
+        self.vm.event_wait('BLOCK_JOB_COMPLETED')
+
+    def testFull(self):
+        self.runMirror('full')
+
+        node = self.findBlockNode('target', 'drive0')
+        self.assert_qmp_absent(node, 'image/backing-image')
+
+        self.assertIntactSourceBackingChain()
+
+    def testTop(self):
+        self.runMirror('top')
+
+        node = self.findBlockNode('target', 'drive0')
+        self.assert_qmp(node, 'image/backing-image/filename', back2_img)
+
+        self.assertIntactSourceBackingChain()
+
+    def testNone(self):
+        self.runMirror('none')
+
+        node = self.findBlockNode('target', 'drive0')
+        self.assert_qmp(node, 'image/backing-image/filename', source_img)
+
+        self.assertIntactSourceBackingChain()
+
+
+class TestDriveMirrorAbsolutePaths(MirrorBaseClass):
+    cmd = 'drive-mirror'
+    existing = False
+
+class TestDriveMirrorExistingNoBacking(MirrorBaseClass):
+    cmd = 'drive-mirror'
+    existing = True
+    target_backing = None
+
+class TestDriveMirrorExistingBacking(MirrorBaseClass):
+    cmd = 'drive-mirror'
+    existing = True
+    target_backing = 'null-co://'
+
+class TestBlockdevMirrorNoBacking(MirrorBaseClass):
+    cmd = 'blockdev-mirror'
+    existing = True
+    target_backing = None
+
+class TestBlockdevMirrorBacking(MirrorBaseClass):
+    cmd = 'blockdev-mirror'
+    existing = True
+    target_backing = 'null-co://'
+
+
+class TestCommit(BaseClass):
+    existing = False
+
+    def testCommit(self):
+        result = self.vm.qmp('block-commit', device='drive0', base=back1_img)
+        self.assert_qmp(result, 'return', {})
+
+        self.vm.event_wait('BLOCK_JOB_READY')
+
+        result = self.vm.qmp('block-job-complete', device='drive0')
+        self.assert_qmp(result, 'return', {})
+
+        self.vm.event_wait('BLOCK_JOB_COMPLETED')
+
+        node = self.findBlockNode(None, 'drive0')
+        self.assert_qmp(node, 'image' + '/backing-image' * 0 + '/filename',
+                        back1_img)
+        self.assert_qmp(node, 'image' + '/backing-image' * 1 + '/filename',
+                        back0_img)
+        self.assert_qmp_absent(node, 'image' + '/backing-image' * 2 +
+                               '/filename')
+
+        self.assertIntactSourceBackingChain()
+
+
+BaseClass = None
+MirrorBaseClass = None
+
+if __name__ == '__main__':
+    iotests.main(supported_fmts=['qcow2'])
diff --git a/tests/qemu-iotests/155.out b/tests/qemu-iotests/155.out
new file mode 100644
index 0000000..b6f2576
--- /dev/null
+++ b/tests/qemu-iotests/155.out
@@ -0,0 +1,5 @@
+................
+----------------------------------------------------------------------
+Ran 16 tests
+
+OK
diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
index ab1d76e..9f1f2c0 100644
--- a/tests/qemu-iotests/group
+++ b/tests/qemu-iotests/group
@@ -154,3 +154,4 @@
 150 rw auto quick
 152 rw auto quick
 154 rw auto backing quick
+155 rw auto
-- 
2.8.3

* Re: [Qemu-devel] [PATCH v2 1/3] block: Allow replacement of a BDS by its overlay
  2016-06-06 14:42 ` [Qemu-devel] [PATCH v2 1/3] block: Allow replacement of a BDS by its overlay Max Reitz
@ 2016-06-08  8:58   ` Kevin Wolf
  2016-06-08 14:21     ` Max Reitz
  0 siblings, 1 reply; 16+ messages in thread
From: Kevin Wolf @ 2016-06-08  8:58 UTC (permalink / raw)
  To: Max Reitz; +Cc: qemu-block, qemu-devel, Fam Zheng

On 06.06.2016 at 16:42, Max Reitz wrote:
> change_parent_backing_link() asserts that the BDS to be replaced is not
> used as a backing file. However, we may want to replace a BDS by its
> overlay in which case that very link should not be redirected.
> 
> Signed-off-by: Max Reitz <mreitz@redhat.com>

So the scenario is like this?

              +---- to
              v
    base <- from <- top

And we want to change it into this:

    base <- from <- to <- top

Okay, makes sense.

(Hm, put ASCII art in the commit message? I'd be all for it.)

>  block.c | 15 ++++++++++++++-
>  1 file changed, 14 insertions(+), 1 deletion(-)
> 
> diff --git a/block.c b/block.c
> index f54bc25..16463aa 100644
> --- a/block.c
> +++ b/block.c
> @@ -2224,9 +2224,22 @@ void bdrv_close_all(void)
>  static void change_parent_backing_link(BlockDriverState *from,
>                                         BlockDriverState *to)
>  {
> -    BdrvChild *c, *next;
> +    BdrvChild *c, *next, *to_c;
>  
>      QLIST_FOREACH_SAFE(c, &from->parents, next_parent, next) {
> +        if (c->role == &child_backing) {
> +            /* Allow @from to be in a backing chain, but only if it is @to's
> +             * backing chain. Do not replace @from by @to there. */

The comment suggests bdrv_chain_contains(), but you only accept it as a
direct child of top and you accept non-backing-file children.

Intuitively I would say that anywhere in the backing chain would make
sense, but we can always allow that later when we actually need it.
Accepting all types of children sounds right when I think about
inserting a quorum node as 'to'.

So I guess the code is fine and the comment should be changed to
correctly reflect what the code does.

> +            QLIST_FOREACH(to_c, &to->children, next) {
> +                if (to_c == c) {
> +                    break;
> +                }
> +            }
> +            if (to_c) {
> +                continue;
> +            }
> +        }
> +
>          assert(c->role != &child_backing);
>          bdrv_ref(to);
>          bdrv_replace_child(c, to);

The other thing is that I'm unsure whether this function makes any sense
at all. "Replace in all parents" is kind of arbitrary. In the long term,
we may want to allow the user to specify the exact graph modifications
on (block-)job-complete.

Kevin

* Re: [Qemu-devel] [PATCH v2 2/3] block/mirror: Fix target backing BDS
  2016-06-06 14:42 ` [Qemu-devel] [PATCH v2 2/3] block/mirror: Fix target backing BDS Max Reitz
@ 2016-06-08  9:32   ` Kevin Wolf
  2016-06-08 11:28     ` Paolo Bonzini
                       ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Kevin Wolf @ 2016-06-08  9:32 UTC (permalink / raw)
  To: Max Reitz; +Cc: qemu-block, qemu-devel, Fam Zheng, nsoffer, eblake, pbonzini

On 06.06.2016 at 16:42, Max Reitz wrote:
> Currently, we are trying to move the backing BDS from the source to the
> target in bdrv_replace_in_backing_chain() which is called from
> mirror_exit(). However, mirror_complete() already tries to open the
> target's backing chain with a call to bdrv_open_backing_file().
> 
> First, we should only set the target's backing BDS once. Second, the
> mirroring block job has a better idea of what to set it to than the
> generic code in bdrv_replace_in_backing_chain() (in fact, the latter's
> conditions on when to move the backing BDS from source to target are not
> really correct).
> 
> Therefore, remove that code from bdrv_replace_in_backing_chain() and
> leave it to mirror_complete().
> 
> However, mirror_complete() in turn pursues a questionable strategy by
> employing bdrv_open_backing_file(): On the one hand, because this may
> open the wrong backing file with drive-mirror in "existing" mode, or
> because it will not override a possibly wrong backing file in the
> blockdev-mirror case.
> 
> On the other hand, we want to reuse the existing backing chain of the
> source instead of opening everything anew, because the latter results in
> having multiple BDSs for a single physical file and thus potentially
> concurrent access which we should try to avoid.

Careful, this "wrong" backing file might actually be intended!

Consider a case where you want to move an image with its whole backing
chain to different storage. In that case, you would copy all of the
backing files (cp is good enough, they are read-only), create the
destination image which already points at the copied backing chain, and
then mirror in "existing" mode.

The intention is obviously that after the job completion the new backing
chain is used and not the old one.
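
To make the workflow concrete, here is a rough Python sketch that only
assembles the commands and does not run anything. All paths and the
device name are made up, plan_storage_move() is a hypothetical helper,
and sync=top is an assumption (it fits the case where the backing files
have already been copied). Newer qemu-img versions may additionally
want the backing format given explicitly.

```python
import json


def plan_storage_move(device, old_backing, new_backing, new_target,
                      fmt="qcow2"):
    """Return the three steps of the move without executing them."""
    # 1. Copy the read-only backing file to the new storage.
    copy_cmd = ["cp", old_backing, new_backing]
    # 2. Create the destination image, already pointing at the copy.
    create_cmd = ["qemu-img", "create", "-f", fmt,
                  "-b", new_backing, new_target]
    # 3. Mirror into the pre-created image; "existing" tells QEMU to
    #    reuse the file instead of creating it.
    qmp = {"execute": "drive-mirror",
           "arguments": {"device": device,
                         "target": new_target,
                         "format": fmt,
                         "sync": "top",
                         "mode": "existing"}}
    return copy_cmd, create_cmd, qmp


# Hypothetical paths, for illustration only.
copy_cmd, create_cmd, qmp = plan_storage_move(
    "drive0", "/old/base.qcow2", "/new/base.qcow2", "/new/top.qcow2")

print(" ".join(copy_cmd))
print(" ".join(create_cmd))
print(json.dumps(qmp, indent=2))
```

After block-job-complete, the expectation in this scenario is that the
guest reads through /new/base.qcow2, not through the old chain.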

I know that such cases were discussed when mirroring was introduced, but
I'm not sure whether they are actually used. We need some input there:

Eric, can you tell us whether libvirt makes use of such a setup?

Nir, I'm not sure who is the right person in oVirt these days, but do
you either know yourself whether oVirt requires this to work, or do you
know who else would know?

> Thus, instead of invoking bdrv_open_backing_file(), just set the correct
> backing BDS directly via bdrv_set_backing_hd(). Also, do so only when
> mirror_complete() is certain to succeed.
> 
> In contrast to what bdrv_replace_in_backing_chain() did so far, we do
> not need to drop the source's backing file.
> 
> Signed-off-by: Max Reitz <mreitz@redhat.com>

Leaving the actual code review for later when we have decided what
semantics we even want.

Kevin

* Re: [Qemu-devel] [PATCH v2 2/3] block/mirror: Fix target backing BDS
  2016-06-08  9:32   ` Kevin Wolf
@ 2016-06-08 11:28     ` Paolo Bonzini
  2016-06-08 11:47       ` Kevin Wolf
  2016-06-08 14:40       ` Max Reitz
  2016-06-08 14:38     ` Max Reitz
  2016-06-08 15:39     ` Nir Soffer
  2 siblings, 2 replies; 16+ messages in thread
From: Paolo Bonzini @ 2016-06-08 11:28 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: Max Reitz, qemu-block, qemu-devel, Fam Zheng, nsoffer, eblake



----- Original Message -----
> From: "Kevin Wolf" <kwolf@redhat.com>
> To: "Max Reitz" <mreitz@redhat.com>
> Cc: qemu-block@nongnu.org, qemu-devel@nongnu.org, "Fam Zheng" <famz@redhat.com>, nsoffer@redhat.com,
> eblake@redhat.com, pbonzini@redhat.com
> Sent: Wednesday, June 8, 2016 11:32:29 AM
> Subject: Re: [PATCH v2 2/3] block/mirror: Fix target backing BDS
> 
> On 06.06.2016 at 16:42, Max Reitz wrote:
> > Currently, we are trying to move the backing BDS from the source to the
> > target in bdrv_replace_in_backing_chain() which is called from
> > mirror_exit(). However, mirror_complete() already tries to open the
> > target's backing chain with a call to bdrv_open_backing_file().
> > 
> > First, we should only set the target's backing BDS once. Second, the
> > mirroring block job has a better idea of what to set it to than the
> > generic code in bdrv_replace_in_backing_chain() (in fact, the latter's
> > conditions on when to move the backing BDS from source to target are not
> > really correct).
> > 
> > Therefore, remove that code from bdrv_replace_in_backing_chain() and
> > leave it to mirror_complete().
> > 
> > However, mirror_complete() in turn pursues a questionable strategy by
> > employing bdrv_open_backing_file(): On the one hand, because this may
> > open the wrong backing file with drive-mirror in "existing" mode, or
> > because it will not override a possibly wrong backing file in the
> > blockdev-mirror case.
> 
> Careful, this "wrong" backing file might actually be intended!
> 
> Consider a case where you want to move an image with its whole backing
> chain to different storage. In that case, you would copy all of the
> backing files (cp is good enough, they are read-only), create the
> destination image which already points at the copied backing chain, and
> then mirror in "existing" mode.
> 
> The intention is obviously that after the job completion the new backing
> chain is used and not the old one.

Yes, this is the intention and it should not be changed.  In addition
to what Kevin said, you can use drive-mirror to collapse the image to a
single file; in this case, QEMU should not be using the backing files of
the source.

bdrv_open_backing_file() is used because what we want to do is to
"undo" the BDRV_O_NO_BACKING flag used by qmp_drive_mirror.

If the contents change under the guest's feet, it's the layers above
QEMU that have screwed up.

Paolo

* Re: [Qemu-devel] [PATCH v2 2/3] block/mirror: Fix target backing BDS
  2016-06-08 11:28     ` Paolo Bonzini
@ 2016-06-08 11:47       ` Kevin Wolf
  2016-06-08 14:40       ` Max Reitz
  1 sibling, 0 replies; 16+ messages in thread
From: Kevin Wolf @ 2016-06-08 11:47 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Max Reitz, qemu-block, qemu-devel, Fam Zheng, nsoffer, eblake

On 08.06.2016 at 13:28, Paolo Bonzini wrote:
> 
> 
> ----- Original Message -----
> > From: "Kevin Wolf" <kwolf@redhat.com>
> > To: "Max Reitz" <mreitz@redhat.com>
> > Cc: qemu-block@nongnu.org, qemu-devel@nongnu.org, "Fam Zheng" <famz@redhat.com>, nsoffer@redhat.com,
> > eblake@redhat.com, pbonzini@redhat.com
> > Sent: Wednesday, June 8, 2016 11:32:29 AM
> > Subject: Re: [PATCH v2 2/3] block/mirror: Fix target backing BDS
> > 
> > On 06.06.2016 at 16:42, Max Reitz wrote:
> > > Currently, we are trying to move the backing BDS from the source to the
> > > target in bdrv_replace_in_backing_chain() which is called from
> > > mirror_exit(). However, mirror_complete() already tries to open the
> > > target's backing chain with a call to bdrv_open_backing_file().
> > > 
> > > First, we should only set the target's backing BDS once. Second, the
> > > mirroring block job has a better idea of what to set it to than the
> > > generic code in bdrv_replace_in_backing_chain() (in fact, the latter's
> > > conditions on when to move the backing BDS from source to target are not
> > > really correct).
> > > 
> > > Therefore, remove that code from bdrv_replace_in_backing_chain() and
> > > leave it to mirror_complete().
> > > 
> > > However, mirror_complete() in turn pursues a questionable strategy by
> > > employing bdrv_open_backing_file(): On the one hand, because this may
> > > open the wrong backing file with drive-mirror in "existing" mode, or
> > > because it will not override a possibly wrong backing file in the
> > > blockdev-mirror case.
> > 
> > Careful, this "wrong" backing file might actually be intended!
> > 
> > Consider a case where you want to move an image with its whole backing
> > chain to different storage. In that case, you would copy all of the
> > backing files (cp is good enough, they are read-only), create the
> > destination image which already points at the copied backing chain, and
> > then mirror in "existing" mode.
> > 
> > The intention is obviously that after the job completion the new backing
> > chain is used and not the old one.
> 
> Yes, this is the intention and it should not be changed.  In addition
> to what Kevin said, you can use drive-mirror to collapse the image to a
> single file; in this case, QEMU should not be using the backing files of
> the source.
> 
> bdrv_open_backing_file() is used because what we want to do is to
> "undo" the BDRV_O_NO_BACKING flag used by qmp_drive_mirror.
> 
> If the contents change under the guest feet, it's the layers above
> QEMU that have screwed up.

We should probably have test cases for both scenarios. They would make
it obvious that changing this behaviour is not okay. Actually, I'm
surprised that our existing cases don't seem to cover this.

Kevin

* Re: [Qemu-devel] [PATCH v2 1/3] block: Allow replacement of a BDS by its overlay
  2016-06-08  8:58   ` Kevin Wolf
@ 2016-06-08 14:21     ` Max Reitz
  0 siblings, 0 replies; 16+ messages in thread
From: Max Reitz @ 2016-06-08 14:21 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, qemu-devel, Fam Zheng

On 08.06.2016 10:58, Kevin Wolf wrote:
> On 06.06.2016 at 16:42, Max Reitz wrote:
>> change_parent_backing_link() asserts that the BDS to be replaced is not
>> used as a backing file. However, we may want to replace a BDS by its
>> overlay in which case that very link should not be redirected.
>>
>> Signed-off-by: Max Reitz <mreitz@redhat.com>
> 
> So the scenario is like this?
> 
>               +---- to
>               v
>     base <- from <- top
> 
> And we want to change it into this:
> 
>     base <- from <- to <- top
> 
> Okay, makes sense.

Doesn't the assert(c->role != &child_backing) prevent this scenario?
(which is the reason for the first sentence in my commit message.)

So the scenario is rather:

           +----- target
           v
base <- source <- BB

to:

base <- source <- target <- BB

> (Hm, put ASCII art in the commit message? I'd be all for it.)

Well, depending on whether a v3 will come or not, I can put it there, of
course.

> 
>>  block.c | 15 ++++++++++++++-
>>  1 file changed, 14 insertions(+), 1 deletion(-)
>>
>> diff --git a/block.c b/block.c
>> index f54bc25..16463aa 100644
>> --- a/block.c
>> +++ b/block.c
>> @@ -2224,9 +2224,22 @@ void bdrv_close_all(void)
>>  static void change_parent_backing_link(BlockDriverState *from,
>>                                         BlockDriverState *to)
>>  {
>> -    BdrvChild *c, *next;
>> +    BdrvChild *c, *next, *to_c;
>>  
>>      QLIST_FOREACH_SAFE(c, &from->parents, next_parent, next) {
>> +        if (c->role == &child_backing) {
>> +            /* Allow @from to be in a backing chain, but only if it is @to's
>> +             * backing chain. Do not replace @from by @to there. */
> 
> The comment suggests bdrv_chain_contains(), but you only accept it as a
> direct child of top and you accept non-backing-file children.
> 
> Intuitively I would say that anywhere in the backing chain would make
> sense, but we can always allow that later when we actually need it.
> Accepting all types of children sounds right when I think about
> inserting a quorum node as 'to'.
> 
> So I guess the code is fine and the comment should be changed to
> correctly reflect what the code does.

OK, will do.

>> +            QLIST_FOREACH(to_c, &to->children, next) {
>> +                if (to_c == c) {
>> +                    break;
>> +                }
>> +            }
>> +            if (to_c) {
>> +                continue;
>> +            }
>> +        }
>> +
>>          assert(c->role != &child_backing);
>>          bdrv_ref(to);
>>          bdrv_replace_child(c, to);
> 
> The other thing is that I'm unsure whether this function makes any sense
> at all. "Replace in all parents" is kind of arbitrary. In the long term,
> we may want to allow the user to specify the exact graph modifications
> on (block-)job-complete.

Well, as a replacement of bdrv_swap(), this function definitely does
make sense. Whether we can do even better in the long run is a question
for the long run, I think.

Max



* Re: [Qemu-devel] [PATCH v2 2/3] block/mirror: Fix target backing BDS
  2016-06-08  9:32   ` Kevin Wolf
  2016-06-08 11:28     ` Paolo Bonzini
@ 2016-06-08 14:38     ` Max Reitz
  2016-06-08 16:54       ` Max Reitz
  2016-06-08 15:39     ` Nir Soffer
  2 siblings, 1 reply; 16+ messages in thread
From: Max Reitz @ 2016-06-08 14:38 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, qemu-devel, Fam Zheng, nsoffer, eblake, pbonzini


On 08.06.2016 11:32, Kevin Wolf wrote:
> Am 06.06.2016 um 16:42 hat Max Reitz geschrieben:
>> Currently, we are trying to move the backing BDS from the source to the
>> target in bdrv_replace_in_backing_chain() which is called from
>> mirror_exit(). However, mirror_complete() already tries to open the
>> target's backing chain with a call to bdrv_open_backing_file().
>>
>> First, we should only set the target's backing BDS once. Second, the
>> mirroring block job has a better idea of what to set it to than the
>> generic code in bdrv_replace_in_backing_chain() (in fact, the latter's
>> conditions on when to move the backing BDS from source to target are not
>> really correct).
>>
>> Therefore, remove that code from bdrv_replace_in_backing_chain() and
>> leave it to mirror_complete().
>>
>> However, mirror_complete() in turn pursues a questionable strategy by
>> employing bdrv_open_backing_file(): On the one hand, because this may
>> open the wrong backing file with drive-mirror in "existing" mode, or
>> because it will not override a possibly wrong backing file in the
>> blockdev-mirror case.
>>
>> On the other hand, we want to reuse the existing backing chain of the
>> source instead of opening everything anew, because the latter results in
>> having multiple BDSs for a single physical file and thus potentially
>> concurrent access which we should try to avoid.
> 
> Careful, this "wrong" backing file might actually be intended!

True.

I still don't consider it correct to open the whole backing chain anew,
though, at least in absolute-paths mode, because this will result in
having at least two BDSs for each physical image file (one for the old
chain, one for the new one).

So let's go through everything.

== drive-mirror with absolute-paths ==

We already have the backing chain open (around the source BDS), and it's
definitely the correct one. So I think we can always reuse it for the
target.

== drive-mirror with existing ==

You're right, we should probably keep doing bdrv_open_backing_file()
because we cannot check whether the existing image has the same backing
chain as a new absolute-paths image would have had.

This is likely to cause issues if you actually do want to have the
"default" backing chain, though, because of the multiple-BDS problem.
This case is basically guaranteed to break with sync=none and default
image locking.

== blockdev-mirror ==

In theory the simplest one: We just assume the backing chain of the
target has been opened already, and then we blame the user if they have
created multiple BDSs per physical file.

Unfortunately, in practice we require the target BDS to not have a
backing file at all. blockdev-mirror is just supposed to open the
backing chain after completion, which I really don't like (I don't think
a blockdev- command should do this kind of magic).

Maybe we should allow the target to have a backing file (I really don't
see why it shouldn't have one) and treat the non-backing case like
drive-mirror in existing mode.


Does that sound right?
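To make the three cases easier to compare, here is a minimal sketch of the
policy as laid out in this message. All names (MirrorKind, BackingPlan,
plan_target_backing) are hypothetical and exist only for this example; none
of this is actual QEMU code, and it encodes the proposal under discussion,
not a decided behaviour.

```c
#include <assert.h>

/* How the mirror job was started (hypothetical enum for this sketch). */
typedef enum {
    MIRROR_ABSOLUTE_PATHS,   /* drive-mirror, mode=absolute-paths */
    MIRROR_EXISTING,         /* drive-mirror, mode=existing */
    MIRROR_BLOCKDEV          /* blockdev-mirror */
} MirrorKind;

/* What to do about the target's backing chain on completion. */
typedef enum {
    BACKING_REUSE_SOURCE_CHAIN,   /* attach the already-open source chain */
    BACKING_OPEN_FROM_IMAGE,      /* keep bdrv_open_backing_file() */
    BACKING_TRUST_USER            /* assume the user set the chain up */
} BackingPlan;

static BackingPlan plan_target_backing(MirrorKind kind)
{
    switch (kind) {
    case MIRROR_ABSOLUTE_PATHS:
        /* The chain around the source is open and known to be correct. */
        return BACKING_REUSE_SOURCE_CHAIN;
    case MIRROR_EXISTING:
        /* We cannot tell what the existing image points at, so open it. */
        return BACKING_OPEN_FROM_IMAGE;
    case MIRROR_BLOCKDEV:
        /* The target node was created by the user, chain included. */
        return BACKING_TRUST_USER;
    }
    return BACKING_TRUST_USER;   /* not reached for valid input */
}
```

Only the absolute-paths case avoids a second BDS per physical file; the
other two leave that responsibility with whoever created the target.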

Max


> Consider a case where you want to move an image with its whole backing
> chain to different storage. In that case, you would copy all of the
> backing files (cp is good enough, they are read-only), create the
> destination image which already points at the copied backing chain, and
> then mirror in "existing" mode.
> 
> The intention is obviously that after the job completion the new backing
> chain is used and not the old one.
> 
> I know that such cases were discussed when mirroring was introduced, I'm
> not sure whether it's actually used. We need some input there:
> 
> Eric, can you tell us whether libvirt makes use of such a setup?
> 
> Nir, I'm not sure who is the right person in oVirt these days, but do
> you either know yourself whether oVirt requires this to work, or do you
> know who else would know?
> 
>> Thus, instead of invoking bdrv_open_backing_file(), just set the correct
>> backing BDS directly via bdrv_set_backing_hd(). Also, do so only when
>> mirror_complete() is certain to succeed.
>>
>> In contrast to what bdrv_replace_in_backing_chain() did so far, we do
>> not need to drop the source's backing file.
>>
>> Signed-off-by: Max Reitz <mreitz@redhat.com>
> 
> Leaving the actual code review for later when we have decided what
> semantics we even want.
> 
> Kevin
> 




* Re: [Qemu-devel] [PATCH v2 2/3] block/mirror: Fix target backing BDS
  2016-06-08 11:28     ` Paolo Bonzini
  2016-06-08 11:47       ` Kevin Wolf
@ 2016-06-08 14:40       ` Max Reitz
  2016-06-08 14:42         ` Max Reitz
  1 sibling, 1 reply; 16+ messages in thread
From: Max Reitz @ 2016-06-08 14:40 UTC (permalink / raw)
  To: Paolo Bonzini, Kevin Wolf
  Cc: qemu-block, qemu-devel, Fam Zheng, nsoffer, eblake


On 08.06.2016 13:28, Paolo Bonzini wrote:
> 
> 
> ----- Original Message -----
>> From: "Kevin Wolf" <kwolf@redhat.com>
>> To: "Max Reitz" <mreitz@redhat.com>
>> Cc: qemu-block@nongnu.org, qemu-devel@nongnu.org, "Fam Zheng" <famz@redhat.com>, nsoffer@redhat.com,
>> eblake@redhat.com, pbonzini@redhat.com
>> Sent: Wednesday, June 8, 2016 11:32:29 AM
>> Subject: Re: [PATCH v2 2/3] block/mirror: Fix target backing BDS
>>
>> Am 06.06.2016 um 16:42 hat Max Reitz geschrieben:
>>> Currently, we are trying to move the backing BDS from the source to the
>>> target in bdrv_replace_in_backing_chain() which is called from
>>> mirror_exit(). However, mirror_complete() already tries to open the
>>> target's backing chain with a call to bdrv_open_backing_file().
>>>
>>> First, we should only set the target's backing BDS once. Second, the
>>> mirroring block job has a better idea of what to set it to than the
>>> generic code in bdrv_replace_in_backing_chain() (in fact, the latter's
>>> conditions on when to move the backing BDS from source to target are not
>>> really correct).
>>>
>>> Therefore, remove that code from bdrv_replace_in_backing_chain() and
>>> leave it to mirror_complete().
>>>
>>> However, mirror_complete() in turn pursues a questionable strategy by
>>> employing bdrv_open_backing_file(): On the one hand, because this may
>>> open the wrong backing file with drive-mirror in "existing" mode, or
>>> because it will not override a possibly wrong backing file in the
>>> blockdev-mirror case.
>>
>> Careful, this "wrong" backing file might actually be intended!
>>
>> Consider a case where you want to move an image with its whole backing
>> chain to different storage. In that case, you would copy all of the
>> backing files (cp is good enough, they are read-only), create the
>> destination image which already points at the copied backing chain, and
>> then mirror in "existing" mode.
>>
>> The intention is obviously that after the job completion the new backing
>> chain is used and not the old one.
> 
> Yes, this is the intention and it should not be changed.  In addition
> to what Kevin said, you can use drive-mirror to collapse the image to a
> single file; in this case, QEMU should not be using the backing files of
> the source.

That is an issue that we have right now. If you do drive-mirror in
absolute-paths mode with sync=full, the target will have the backing
chain of the source. This is something that this patch fixes.

In fact, I think if you do drive-mirror in existing mode or
blockdev-mirror and the target image does not have a backing file
(whatever sync mode you have used), the same will happen.

Max

> bdrv_open_backing_file() is used because what we want to do is to
> "undo" the BDRV_O_NO_BACKING flag used by qmp_drive_mirror.
> 
> If the contents change under the guest's feet, it's the layers above
> QEMU that have screwed up.
> 
> Paolo
> 




* Re: [Qemu-devel] [PATCH v2 2/3] block/mirror: Fix target backing BDS
  2016-06-08 14:40       ` Max Reitz
@ 2016-06-08 14:42         ` Max Reitz
  0 siblings, 0 replies; 16+ messages in thread
From: Max Reitz @ 2016-06-08 14:42 UTC (permalink / raw)
  To: Paolo Bonzini, Kevin Wolf
  Cc: qemu-block, qemu-devel, Fam Zheng, nsoffer, eblake


On 08.06.2016 16:40, Max Reitz wrote:
> On 08.06.2016 13:28, Paolo Bonzini wrote:
>>
>>
>> ----- Original Message -----
>>> From: "Kevin Wolf" <kwolf@redhat.com>
>>> To: "Max Reitz" <mreitz@redhat.com>
>>> Cc: qemu-block@nongnu.org, qemu-devel@nongnu.org, "Fam Zheng" <famz@redhat.com>, nsoffer@redhat.com,
>>> eblake@redhat.com, pbonzini@redhat.com
>>> Sent: Wednesday, June 8, 2016 11:32:29 AM
>>> Subject: Re: [PATCH v2 2/3] block/mirror: Fix target backing BDS
>>>
>>> Am 06.06.2016 um 16:42 hat Max Reitz geschrieben:
>>>> Currently, we are trying to move the backing BDS from the source to the
>>>> target in bdrv_replace_in_backing_chain() which is called from
>>>> mirror_exit(). However, mirror_complete() already tries to open the
>>>> target's backing chain with a call to bdrv_open_backing_file().
>>>>
>>>> First, we should only set the target's backing BDS once. Second, the
>>>> mirroring block job has a better idea of what to set it to than the
>>>> generic code in bdrv_replace_in_backing_chain() (in fact, the latter's
>>>> conditions on when to move the backing BDS from source to target are not
>>>> really correct).
>>>>
>>>> Therefore, remove that code from bdrv_replace_in_backing_chain() and
>>>> leave it to mirror_complete().
>>>>
>>>> However, mirror_complete() in turn pursues a questionable strategy by
>>>> employing bdrv_open_backing_file(): On the one hand, because this may
>>>> open the wrong backing file with drive-mirror in "existing" mode, or
>>>> because it will not override a possibly wrong backing file in the
>>>> blockdev-mirror case.
>>>
>>> Careful, this "wrong" backing file might actually be intended!
>>>
>>> Consider a case where you want to move an image with its whole backing
>>> chain to different storage. In that case, you would copy all of the
>>> backing files (cp is good enough, they are read-only), create the
>>> destination image which already points at the copied backing chain, and
>>> then mirror in "existing" mode.
>>>
>>> The intention is obviously that after the job completion the new backing
>>> chain is used and not the old one.
>>
>> Yes, this is the intention and it should not be changed.  In addition
>> to what Kevin said, you can use drive-mirror to collapse the image to a
>> single file; in this case, QEMU should not be using the backing files of
>> the source.
> 
> That is an issue that we have right now. If you do drive-mirror in
> absolute-paths mode with sync=full, the target will have the backing
> chain of the source. This is something that this patch fixes.

As a clarification: I mean the backing chain inside QEMU (in the BDS
graph), not the on-disk backing chain, i.e. how the physical image files
link to each other.

Max

> In fact, I think if you do drive-mirror in existing mode or
> blockdev-mirror and the target image does not have a backing file
> (whatever sync mode you have used), the same will happen.
> 
> Max
> 
>> bdrv_open_backing_file() is used because what we want to do is to
>> "undo" the BDRV_O_NO_BACKING flag used by qmp_drive_mirror.
>>
>> If the contents change under the guest's feet, it's the layers above
>> QEMU that have screwed up.
>>
>> Paolo
>>
> 
> 




* Re: [Qemu-devel] [PATCH v2 2/3] block/mirror: Fix target backing BDS
  2016-06-08  9:32   ` Kevin Wolf
  2016-06-08 11:28     ` Paolo Bonzini
  2016-06-08 14:38     ` Max Reitz
@ 2016-06-08 15:39     ` Nir Soffer
  2016-06-09  8:58       ` Kevin Wolf
  2 siblings, 1 reply; 16+ messages in thread
From: Nir Soffer @ 2016-06-08 15:39 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: Max Reitz, qemu-block, qemu-devel, Fam Zheng, Eric Blake, pbonzini

On Wed, Jun 8, 2016 at 12:32 PM, Kevin Wolf <kwolf@redhat.com> wrote:
> Am 06.06.2016 um 16:42 hat Max Reitz geschrieben:
>> Currently, we are trying to move the backing BDS from the source to the
>> target in bdrv_replace_in_backing_chain() which is called from
>> mirror_exit(). However, mirror_complete() already tries to open the
>> target's backing chain with a call to bdrv_open_backing_file().
>>
>> First, we should only set the target's backing BDS once. Second, the
>> mirroring block job has a better idea of what to set it to than the
>> generic code in bdrv_replace_in_backing_chain() (in fact, the latter's
>> conditions on when to move the backing BDS from source to target are not
>> really correct).
>>
>> Therefore, remove that code from bdrv_replace_in_backing_chain() and
>> leave it to mirror_complete().
>>
>> However, mirror_complete() in turn pursues a questionable strategy by
>> employing bdrv_open_backing_file(): On the one hand, because this may
>> open the wrong backing file with drive-mirror in "existing" mode, or
>> because it will not override a possibly wrong backing file in the
>> blockdev-mirror case.
>>
>> On the other hand, we want to reuse the existing backing chain of the
>> source instead of opening everything anew, because the latter results in
>> having multiple BDSs for a single physical file and thus potentially
>> concurrent access which we should try to avoid.
>
> Careful, this "wrong" backing file might actually be intended!
>
> Consider a case where you want to move an image with its whole backing
> chain to different storage. In that case, you would copy all of the
> backing files (cp is good enough, they are read-only), create the
> destination image which already points at the copied backing chain, and
> then mirror in "existing" mode.
>
> The intention is obviously that after the job completion the new backing
> chain is used and not the old one.
>
> I know that such cases were discussed when mirroring was introduced, I'm
> not sure whether it's actually used. We need some input there:
>
> Eric, can you tell us whether libvirt makes use of such a setup?
>
> Nir, I'm not sure who is the right person in oVirt these days, but do
> you either know yourself whether oVirt requires this to work, or do you
> know who else would know?

I'm the right person, thanks for keeping me in the loop.

What you describe is how we migrate a disk from one storage to another:

1. Create a vm snapshot
2. Create a volume on the destination storage for the snapshot
3. Start mirroring from the source snapshot to the destination snapshot
    using libvirt virDomainBlockCopy:
    https://libvirt.org/html/libvirt-libvirt-domain.html#virDomainBlockCopy
4. Copy the rest of the chain from source to destination using qemu-img convert
5. Pivot to the new chain using libvirt virDomainBlockJobAbort
    https://libvirt.org/html/libvirt-libvirt-domain.html#virDomainBlockJobAbort
6. Remove the old chain

The source and target can be files or block devices, and we also plan to
support rbd and gluster volumes as targets, maybe also as sources.

Nir

>
>> Thus, instead of invoking bdrv_open_backing_file(), just set the correct
>> backing BDS directly via bdrv_set_backing_hd(). Also, do so only when
>> mirror_complete() is certain to succeed.
>>
>> In contrast to what bdrv_replace_in_backing_chain() did so far, we do
>> not need to drop the source's backing file.
>>
>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>
> Leaving the actual code review for later when we have decided what
> semantics we even want.
>
> Kevin


* Re: [Qemu-devel] [PATCH v2 2/3] block/mirror: Fix target backing BDS
  2016-06-08 14:38     ` Max Reitz
@ 2016-06-08 16:54       ` Max Reitz
  0 siblings, 0 replies; 16+ messages in thread
From: Max Reitz @ 2016-06-08 16:54 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, qemu-devel, Fam Zheng, nsoffer, eblake, pbonzini


On 08.06.2016 16:38, Max Reitz wrote:
> On 08.06.2016 11:32, Kevin Wolf wrote:
>> Am 06.06.2016 um 16:42 hat Max Reitz geschrieben:
>>> Currently, we are trying to move the backing BDS from the source to the
>>> target in bdrv_replace_in_backing_chain() which is called from
>>> mirror_exit(). However, mirror_complete() already tries to open the
>>> target's backing chain with a call to bdrv_open_backing_file().
>>>
>>> First, we should only set the target's backing BDS once. Second, the
>>> mirroring block job has a better idea of what to set it to than the
>>> generic code in bdrv_replace_in_backing_chain() (in fact, the latter's
>>> conditions on when to move the backing BDS from source to target are not
>>> really correct).
>>>
>>> Therefore, remove that code from bdrv_replace_in_backing_chain() and
>>> leave it to mirror_complete().
>>>
>>> However, mirror_complete() in turn pursues a questionable strategy by
>>> employing bdrv_open_backing_file(): On the one hand, because this may
>>> open the wrong backing file with drive-mirror in "existing" mode, or
>>> because it will not override a possibly wrong backing file in the
>>> blockdev-mirror case.
>>>
>>> On the other hand, we want to reuse the existing backing chain of the
>>> source instead of opening everything anew, because the latter results in
>>> having multiple BDSs for a single physical file and thus potentially
>>> concurrent access which we should try to avoid.
>>
>> Careful, this "wrong" backing file might actually be intended!
> 
> True.
> 
> I still don't consider it correct to open the whole backing chain anew,
> though, at least in absolute-paths mode, because this will result in
> having at least two BDSs for each physical image file (one for the old
> chain, one for the new one).
> 
> So let's go through everything.
> 
> == drive-mirror with absolute-paths ==
> 
> We already have the backing chain open (around the source BDS), and it's
> definitely the correct one. So I think we can always reuse it for the
> target.
> 
> == drive-mirror with existing ==
> 
> You're right, we should probably keep doing bdrv_open_backing_file()
> because we cannot check whether the existing image has the same backing
> chain as a new absolute-paths image would have had.
> 
> This is likely to cause issues if you actually do want to have the
> "default" backing chain, though, because of the multiple-BDS problem.
> This case is basically guaranteed to break with sync=none and default
> image locking.
> 
> == blockdev-mirror ==
> 
> In theory the simplest one: We just assume the backing chain of the
> target has been opened already, and then we blame the user if they have
> created multiple BDSs per physical file.
> 
> Unfortunately, in practice we require the target BDS to not have a
> backing file at all. blockdev-mirror is just supposed to open the
> backing chain after completion, which I really don't like (I don't think
> a blockdev- command should do this kind of magic).

Good news: Turns out I was wrong. I was somehow mixing things up with
blockdev-snapshot (don't ask me why, I have no clue).

So I think it'd be fine to rely on the user that the backing chain of
the target is correct.

Max

> Maybe we should allow the target to have a backing file (I really don't
> see why it shouldn't have one) and treat the non-backing case like
> drive-mirror in existing mode.
> 
> 
> Does that sound right?
> 
> Max
> 
> 
>> Consider a case where you want to move an image with its whole backing
>> chain to different storage. In that case, you would copy all of the
>> backing files (cp is good enough, they are read-only), create the
>> destination image which already points at the copied backing chain, and
>> then mirror in "existing" mode.
>>
>> The intention is obviously that after the job completion the new backing
>> chain is used and not the old one.
>>
>> I know that such cases were discussed when mirroring was introduced, I'm
>> not sure whether it's actually used. We need some input there:
>>
>> Eric, can you tell us whether libvirt makes use of such a setup?
>>
>> Nir, I'm not sure who is the right person in oVirt these days, but do
>> you either know yourself whether oVirt requires this to work, or do you
>> know who else would know?
>>
>>> Thus, instead of invoking bdrv_open_backing_file(), just set the correct
>>> backing BDS directly via bdrv_set_backing_hd(). Also, do so only when
>>> mirror_complete() is certain to succeed.
>>>
>>> In contrast to what bdrv_replace_in_backing_chain() did so far, we do
>>> not need to drop the source's backing file.
>>>
>>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>>
>> Leaving the actual code review for later when we have decided what
>> semantics we even want.
>>
>> Kevin
>>
> 
> 




* Re: [Qemu-devel] [PATCH v2 2/3] block/mirror: Fix target backing BDS
  2016-06-08 15:39     ` Nir Soffer
@ 2016-06-09  8:58       ` Kevin Wolf
  2016-06-09 11:16         ` Nir Soffer
  0 siblings, 1 reply; 16+ messages in thread
From: Kevin Wolf @ 2016-06-09  8:58 UTC (permalink / raw)
  To: Nir Soffer
  Cc: Max Reitz, qemu-block, qemu-devel, Fam Zheng, Eric Blake, pbonzini

Am 08.06.2016 um 17:39 hat Nir Soffer geschrieben:
> On Wed, Jun 8, 2016 at 12:32 PM, Kevin Wolf <kwolf@redhat.com> wrote:
> > Am 06.06.2016 um 16:42 hat Max Reitz geschrieben:
> >> Currently, we are trying to move the backing BDS from the source to the
> >> target in bdrv_replace_in_backing_chain() which is called from
> >> mirror_exit(). However, mirror_complete() already tries to open the
> >> target's backing chain with a call to bdrv_open_backing_file().
> >>
> >> First, we should only set the target's backing BDS once. Second, the
> >> mirroring block job has a better idea of what to set it to than the
> >> generic code in bdrv_replace_in_backing_chain() (in fact, the latter's
> >> conditions on when to move the backing BDS from source to target are not
> >> really correct).
> >>
> >> Therefore, remove that code from bdrv_replace_in_backing_chain() and
> >> leave it to mirror_complete().
> >>
> >> However, mirror_complete() in turn pursues a questionable strategy by
> >> employing bdrv_open_backing_file(): On the one hand, because this may
> >> open the wrong backing file with drive-mirror in "existing" mode, or
> >> because it will not override a possibly wrong backing file in the
> >> blockdev-mirror case.
> >>
> >> On the other hand, we want to reuse the existing backing chain of the
> >> source instead of opening everything anew, because the latter results in
> >> having multiple BDSs for a single physical file and thus potentially
> >> concurrent access which we should try to avoid.
> >
> > Careful, this "wrong" backing file might actually be intended!
> >
> > Consider a case where you want to move an image with its whole backing
> > chain to different storage. In that case, you would copy all of the
> > backing files (cp is good enough, they are read-only), create the
> > destination image which already points at the copied backing chain, and
> > then mirror in "existing" mode.
> >
> > The intention is obviously that after the job completion the new backing
> > chain is used and not the old one.
> >
> > I know that such cases were discussed when mirroring was introduced, I'm
> > not sure whether it's actually used. We need some input there:
> >
> > Eric, can you tell us whether libvirt makes use of such a setup?
> >
> > Nir, I'm not sure who is the right person in oVirt these days, but do
> > you either know yourself whether oVirt requires this to work, or do you
> > know who else would know?
> 
> I'm the right person, thanks for keeping me in the loop.
> 
> What you describe is how we migrate a disk from one storage to another:
> 
> 1. Create a vm snapshot
> 2. Create a volume on the destination storage for the snapshot
> 3. Start mirroring from the source snapshot to the destination snapshot
>     using libvirt virDomainBlockCopy:
>     https://libvirt.org/html/libvirt-libvirt-domain.html#virDomainBlockCopy

With VIR_DOMAIN_BLOCK_COPY_SHALLOW set, right? (That is, sync=top in QMP
terms.)

> 4. Copy the rest of the chain from source to destination using qemu-img convert
> 5. Pivot to the new chain using libvirt virDomainBlockJobAbort
>     https://libvirt.org/html/libvirt-libvirt-domain.html#virDomainBlockJobAbort
> 6. Remove the old chain
> 
> The source and target can be files or block devices, and we also plan to
> support rbd and gluster volumes as targets, maybe also as sources.

Thanks, Nir, we should then do our best not to break it.

Max, maybe we can add a qemu-iotests case that does the exact same thing
as oVirt does?

Kevin


* Re: [Qemu-devel] [PATCH v2 2/3] block/mirror: Fix target backing BDS
  2016-06-09  8:58       ` Kevin Wolf
@ 2016-06-09 11:16         ` Nir Soffer
  0 siblings, 0 replies; 16+ messages in thread
From: Nir Soffer @ 2016-06-09 11:16 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: Max Reitz, qemu-block, qemu-devel, Fam Zheng, Eric Blake, Paolo Bonzini

On Thu, Jun 9, 2016 at 11:58 AM, Kevin Wolf <kwolf@redhat.com> wrote:
> Am 08.06.2016 um 17:39 hat Nir Soffer geschrieben:
>> On Wed, Jun 8, 2016 at 12:32 PM, Kevin Wolf <kwolf@redhat.com> wrote:
>> > Am 06.06.2016 um 16:42 hat Max Reitz geschrieben:
>> >> Currently, we are trying to move the backing BDS from the source to the
>> >> target in bdrv_replace_in_backing_chain() which is called from
>> >> mirror_exit(). However, mirror_complete() already tries to open the
>> >> target's backing chain with a call to bdrv_open_backing_file().
>> >>
>> >> First, we should only set the target's backing BDS once. Second, the
>> >> mirroring block job has a better idea of what to set it to than the
>> >> generic code in bdrv_replace_in_backing_chain() (in fact, the latter's
>> >> conditions on when to move the backing BDS from source to target are not
>> >> really correct).
>> >>
>> >> Therefore, remove that code from bdrv_replace_in_backing_chain() and
>> >> leave it to mirror_complete().
>> >>
>> >> However, mirror_complete() in turn pursues a questionable strategy by
>> >> employing bdrv_open_backing_file(): On the one hand, because this may
>> >> open the wrong backing file with drive-mirror in "existing" mode, or
>> >> because it will not override a possibly wrong backing file in the
>> >> blockdev-mirror case.
>> >>
>> >> On the other hand, we want to reuse the existing backing chain of the
>> >> source instead of opening everything anew, because the latter results in
>> >> having multiple BDSs for a single physical file and thus potentially
>> >> concurrent access which we should try to avoid.
>> >
>> > Careful, this "wrong" backing file might actually be intended!
>> >
>> > Consider a case where you want to move an image with its whole backing
>> > chain to different storage. In that case, you would copy all of the
>> > backing files (cp is good enough, they are read-only), create the
>> > destination image which already points at the copied backing chain, and
>> > then mirror in "existing" mode.
>> >
>> > The intention is obviously that after the job completion the new backing
>> > chain is used and not the old one.
>> >
>> > I know that such cases were discussed when mirroring was introduced, I'm
>> > not sure whether it's actually used. We need some input there:
>> >
>> > Eric, can you tell us whether libvirt makes use of such a setup?
>> >
>> > Nir, I'm not sure who is the right person in oVirt these days, but do
>> > you either know yourself whether oVirt requires this to work, or do you
>> > know who else would know?
>>
>> I'm the right person, thanks for keeping me in the loop.
>>
>> What you describe is how we migrate a disk from one storage to another:
>>
>> 1. Create a vm snapshot
>> 2. Create a volume on the destination storage for the snapshot
>> 3. Start mirroring from the source snapshot to the destination snapshot
>>     using libvirt virDomainBlockCopy:
>>     https://libvirt.org/html/libvirt-libvirt-domain.html#virDomainBlockCopy
>
> With VIR_DOMAIN_BLOCK_COPY_SHALLOW set, right? (That is, sync=top in QMP
> terms.)

Yes, actually we use:

VIR_DOMAIN_BLOCK_COPY_SHALLOW | VIR_DOMAIN_BLOCK_COPY_REUSE_EXT
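
As a rough sketch of how these two flags might translate into mirror
arguments on the QEMU side: the shallow-to-sync=top mapping is the one
Kevin mentions above, while treating REUSE_EXT as mode=existing is my
understanding and should be read as an assumption. The DEMO_* constants,
DemoMirrorArgs and demo_flags_to_qmp() are hypothetical names local to
this example; they are not libvirt's definitions or its real translation
code.

```c
#include <assert.h>
#include <string.h>

/* Local stand-ins for the two libvirt flags (hypothetical values). */
enum {
    DEMO_COPY_SHALLOW   = 1 << 0,
    DEMO_COPY_REUSE_EXT = 1 << 1
};

/* The two mirror arguments this thread cares about. */
typedef struct DemoMirrorArgs {
    const char *sync;   /* "top" or "full" */
    const char *mode;   /* "existing" or "absolute-paths" */
} DemoMirrorArgs;

static DemoMirrorArgs demo_flags_to_qmp(unsigned flags)
{
    DemoMirrorArgs args;

    /* Shallow copy: only the top image is mirrored. */
    args.sync = (flags & DEMO_COPY_SHALLOW) ? "top" : "full";
    /* Reusing an external file: the target image already exists
     * (assumed mapping, see the lead-in). */
    args.mode = (flags & DEMO_COPY_REUSE_EXT) ? "existing"
                                              : "absolute-paths";
    return args;
}
```

With both flags set this yields sync=top and mode=existing, i.e. the
combination where the pre-created target must keep its own backing chain
after the pivot.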

>> 4. Copy the rest of the chain from source to destination using qemu-img convert
>> 5. Pivot to the new chain using libvirt virDomainBlockJobAbort
>>     https://libvirt.org/html/libvirt-libvirt-domain.html#virDomainBlockJobAbort
>> 6. Remove the old chain
>>
>> The source and target can be files or block devices, and we also plan to
>> support rbd and gluster volumes as targets, maybe also as sources.
>
> Thanks, Nir, we should then do our best not to break it.
>
> Max, maybe we can add a qemu-iotests case that does the exact same thing
> as oVirt does?
>
> Kevin


end of thread (last message 2016-06-09 11:16 UTC)

Thread overview: 16+ messages
2016-06-06 14:42 [Qemu-devel] [PATCH v2 0/3] block/mirror: Fix target backing BDS Max Reitz
2016-06-06 14:42 ` [Qemu-devel] [PATCH v2 1/3] block: Allow replacement of a BDS by its overlay Max Reitz
2016-06-08  8:58   ` Kevin Wolf
2016-06-08 14:21     ` Max Reitz
2016-06-06 14:42 ` [Qemu-devel] [PATCH v2 2/3] block/mirror: Fix target backing BDS Max Reitz
2016-06-08  9:32   ` Kevin Wolf
2016-06-08 11:28     ` Paolo Bonzini
2016-06-08 11:47       ` Kevin Wolf
2016-06-08 14:40       ` Max Reitz
2016-06-08 14:42         ` Max Reitz
2016-06-08 14:38     ` Max Reitz
2016-06-08 16:54       ` Max Reitz
2016-06-08 15:39     ` Nir Soffer
2016-06-09  8:58       ` Kevin Wolf
2016-06-09 11:16         ` Nir Soffer
2016-06-06 14:42 ` [Qemu-devel] [PATCH v2 3/3] iotests: Add test for post-mirror backing chains Max Reitz
