All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH v2 0/9] block/qcow2: Migration handoff fixes and cleanups
@ 2016-01-13 16:37 Kevin Wolf
  2016-01-13 16:37 ` [Qemu-devel] [PATCH v2 1/9] qcow2: Write feature table only for v3 images Kevin Wolf
                   ` (9 more replies)
  0 siblings, 10 replies; 12+ messages in thread
From: Kevin Wolf @ 2016-01-13 16:37 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, qemu-devel, mreitz

This is the non-controversal part of the qcow2 locking series, which just makes
the state of images with respect to live migration clearer and fixes a few bugs
we have in the handoff procedure and with writing the qcow2 feature table.

Kevin Wolf (9):
  qcow2: Write feature table only for v3 images
  qcow2: Write full header on image creation
  block: Assert no write requests under BDRV_O_INCOMING
  block: Fix error path in bdrv_invalidate_cache()
  block: Rename BDRV_O_INCOMING to BDRV_O_INACTIVE
  block: Inactivate BDS when migration completes
  qcow2: Implement .bdrv_inactivate
  qcow2: Fix BDRV_O_INACTIVE handling in qcow2_invalidate_cache()
  qcow2: Make image inaccessible after failed qcow2_invalidate_cache()

 block.c                    |  42 +++++++++++++++--
 block/io.c                 |   2 +
 block/qcow2.c              | 110 +++++++++++++++++++++++++++------------------
 block/qed.c                |   4 +-
 include/block/block.h      |   3 +-
 include/block/block_int.h  |   1 +
 migration/migration.c      |   7 +++
 nbd.c                      |   2 +-
 qmp.c                      |  12 +++++
 tests/qemu-iotests/031.out |  17 +++----
 tests/qemu-iotests/036     |   2 +
 tests/qemu-iotests/036.out |   5 +++
 tests/qemu-iotests/061.out |  35 ++++++++-------
 13 files changed, 166 insertions(+), 76 deletions(-)

-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Qemu-devel] [PATCH v2 1/9] qcow2: Write feature table only for v3 images
  2016-01-13 16:37 [Qemu-devel] [PATCH v2 0/9] block/qcow2: Migration handoff fixes and cleanups Kevin Wolf
@ 2016-01-13 16:37 ` Kevin Wolf
  2016-01-13 16:37 ` [Qemu-devel] [PATCH v2 2/9] qcow2: Write full header on image creation Kevin Wolf
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: Kevin Wolf @ 2016-01-13 16:37 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, qemu-devel, mreitz

Version 2 images don't have feature bits, so writing a feature table to
those images is kind of pointless.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 block/qcow2.c              | 48 ++++++++++++++++++++++++----------------------
 tests/qemu-iotests/031.out | 12 +-----------
 tests/qemu-iotests/061.out | 15 ---------------
 3 files changed, 26 insertions(+), 49 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 1789af4..5f22e18 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1895,31 +1895,33 @@ int qcow2_update_header(BlockDriverState *bs)
     }
 
     /* Feature table */
-    Qcow2Feature features[] = {
-        {
-            .type = QCOW2_FEAT_TYPE_INCOMPATIBLE,
-            .bit  = QCOW2_INCOMPAT_DIRTY_BITNR,
-            .name = "dirty bit",
-        },
-        {
-            .type = QCOW2_FEAT_TYPE_INCOMPATIBLE,
-            .bit  = QCOW2_INCOMPAT_CORRUPT_BITNR,
-            .name = "corrupt bit",
-        },
-        {
-            .type = QCOW2_FEAT_TYPE_COMPATIBLE,
-            .bit  = QCOW2_COMPAT_LAZY_REFCOUNTS_BITNR,
-            .name = "lazy refcounts",
-        },
-    };
+    if (s->qcow_version >= 3) {
+        Qcow2Feature features[] = {
+            {
+                .type = QCOW2_FEAT_TYPE_INCOMPATIBLE,
+                .bit  = QCOW2_INCOMPAT_DIRTY_BITNR,
+                .name = "dirty bit",
+            },
+            {
+                .type = QCOW2_FEAT_TYPE_INCOMPATIBLE,
+                .bit  = QCOW2_INCOMPAT_CORRUPT_BITNR,
+                .name = "corrupt bit",
+            },
+            {
+                .type = QCOW2_FEAT_TYPE_COMPATIBLE,
+                .bit  = QCOW2_COMPAT_LAZY_REFCOUNTS_BITNR,
+                .name = "lazy refcounts",
+            },
+        };
 
-    ret = header_ext_add(buf, QCOW2_EXT_MAGIC_FEATURE_TABLE,
-                         features, sizeof(features), buflen);
-    if (ret < 0) {
-        goto fail;
+        ret = header_ext_add(buf, QCOW2_EXT_MAGIC_FEATURE_TABLE,
+                             features, sizeof(features), buflen);
+        if (ret < 0) {
+            goto fail;
+        }
+        buf += ret;
+        buflen -= ret;
     }
-    buf += ret;
-    buflen -= ret;
 
     /* Keep unknown header extensions */
     QLIST_FOREACH(uext, &s->unknown_header_ext, next) {
diff --git a/tests/qemu-iotests/031.out b/tests/qemu-iotests/031.out
index fce3ce0..f065404 100644
--- a/tests/qemu-iotests/031.out
+++ b/tests/qemu-iotests/031.out
@@ -53,11 +53,6 @@ refcount_order            4
 header_length             72
 
 Header extension:
-magic                     0x6803f857
-length                    144
-data                      <binary>
-
-Header extension:
 magic                     0x12345678
 length                    31
 data                      'This is a test header extension'
@@ -68,7 +63,7 @@ No errors were found on the image.
 
 magic                     0x514649fb
 version                   2
-backing_file_offset       0x128
+backing_file_offset       0x90
 backing_file_size         0x17
 cluster_bits              16
 size                      67108864
@@ -91,11 +86,6 @@ length                    11
 data                      'host_device'
 
 Header extension:
-magic                     0x6803f857
-length                    144
-data                      <binary>
-
-Header extension:
 magic                     0x12345678
 length                    31
 data                      'This is a test header extension'
diff --git a/tests/qemu-iotests/061.out b/tests/qemu-iotests/061.out
index 57aae28..d604682 100644
--- a/tests/qemu-iotests/061.out
+++ b/tests/qemu-iotests/061.out
@@ -43,11 +43,6 @@ autoclear_features        0x0
 refcount_order            4
 header_length             72
 
-Header extension:
-magic                     0x6803f857
-length                    144
-data                      <binary>
-
 read 131072/131072 bytes at offset 0
 128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 No errors were found on the image.
@@ -105,11 +100,6 @@ autoclear_features        0x0
 refcount_order            4
 header_length             72
 
-Header extension:
-magic                     0x6803f857
-length                    144
-data                      <binary>
-
 read 131072/131072 bytes at offset 0
 128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 No errors were found on the image.
@@ -155,11 +145,6 @@ autoclear_features        0x0
 refcount_order            4
 header_length             72
 
-Header extension:
-magic                     0x6803f857
-length                    144
-data                      <binary>
-
 No errors were found on the image.
 
 === Testing version upgrade and resize ===
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [Qemu-devel] [PATCH v2 2/9] qcow2: Write full header on image creation
  2016-01-13 16:37 [Qemu-devel] [PATCH v2 0/9] block/qcow2: Migration handoff fixes and cleanups Kevin Wolf
  2016-01-13 16:37 ` [Qemu-devel] [PATCH v2 1/9] qcow2: Write feature table only for v3 images Kevin Wolf
@ 2016-01-13 16:37 ` Kevin Wolf
  2016-01-13 16:37 ` [Qemu-devel] [PATCH v2 3/9] block: Assert no write requests under BDRV_O_INCOMING Kevin Wolf
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: Kevin Wolf @ 2016-01-13 16:37 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, qemu-devel, mreitz

When creating a qcow2 image, we didn't necessarily call
qcow2_update_header(), but could end up with the basic header that
qcow2_create2() created manually. One thing that this basic header
lacks is the feature table. Let's make sure that it's always present.

This requires a few updates to test cases as well.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 block/qcow2.c              |  7 +++++++
 tests/qemu-iotests/031.out |  5 +++++
 tests/qemu-iotests/036     |  2 ++
 tests/qemu-iotests/036.out |  5 +++++
 tests/qemu-iotests/061.out | 20 ++++++++++++++++++++
 5 files changed, 39 insertions(+)

diff --git a/block/qcow2.c b/block/qcow2.c
index 5f22e18..01f1fe3 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -2239,6 +2239,13 @@ static int qcow2_create2(const char *filename, int64_t total_size,
         abort();
     }
 
+    /* Create a full header (including things like feature table) */
+    ret = qcow2_update_header(bs);
+    if (ret < 0) {
+        error_setg_errno(errp, -ret, "Could not update qcow2 header");
+        goto out;
+    }
+
     /* Okay, now that we have a valid image, let's give it the right size */
     ret = bdrv_truncate(bs, total_size);
     if (ret < 0) {
diff --git a/tests/qemu-iotests/031.out b/tests/qemu-iotests/031.out
index f065404..7f5050b 100644
--- a/tests/qemu-iotests/031.out
+++ b/tests/qemu-iotests/031.out
@@ -116,6 +116,11 @@ refcount_order            4
 header_length             104
 
 Header extension:
+magic                     0x6803f857
+length                    144
+data                      <binary>
+
+Header extension:
 magic                     0x12345678
 length                    31
 data                      'This is a test header extension'
diff --git a/tests/qemu-iotests/036 b/tests/qemu-iotests/036
index 392f1ef..c4cc91b 100755
--- a/tests/qemu-iotests/036
+++ b/tests/qemu-iotests/036
@@ -57,6 +57,7 @@ _make_test_img 64M
 $PYTHON qcow2.py "$TEST_IMG" set-feature-bit incompatible 63
 
 # Without feature table
+$PYTHON qcow2.py "$TEST_IMG" del-header-ext 0x6803f857
 $PYTHON qcow2.py "$TEST_IMG" dump-header
 _img_info
 
@@ -73,6 +74,7 @@ $PYTHON qcow2.py "$TEST_IMG" set-feature-bit incompatible 62
 $PYTHON qcow2.py "$TEST_IMG" set-feature-bit incompatible 63
 
 # Without feature table
+$PYTHON qcow2.py "$TEST_IMG" del-header-ext 0x6803f857
 _img_info
 
 # With feature table containing bit 63
diff --git a/tests/qemu-iotests/036.out b/tests/qemu-iotests/036.out
index 5616e37..f443635 100644
--- a/tests/qemu-iotests/036.out
+++ b/tests/qemu-iotests/036.out
@@ -56,6 +56,11 @@ autoclear_features        0x8000000000000000
 refcount_order            4
 header_length             104
 
+Header extension:
+magic                     0x6803f857
+length                    144
+data                      <binary>
+
 
 === Repair image ===
 
diff --git a/tests/qemu-iotests/061.out b/tests/qemu-iotests/061.out
index d604682..a03732e 100644
--- a/tests/qemu-iotests/061.out
+++ b/tests/qemu-iotests/061.out
@@ -24,6 +24,11 @@ autoclear_features        0x0
 refcount_order            4
 header_length             104
 
+Header extension:
+magic                     0x6803f857
+length                    144
+data                      <binary>
+
 magic                     0x514649fb
 version                   2
 backing_file_offset       0x0
@@ -76,6 +81,11 @@ autoclear_features        0x0
 refcount_order            4
 header_length             104
 
+Header extension:
+magic                     0x6803f857
+length                    144
+data                      <binary>
+
 ERROR cluster 5 refcount=0 reference=1
 ERROR cluster 6 refcount=0 reference=1
 Rebuilding refcount structure
@@ -126,6 +136,11 @@ autoclear_features        0x40000000000
 refcount_order            4
 header_length             104
 
+Header extension:
+magic                     0x6803f857
+length                    144
+data                      <binary>
+
 magic                     0x514649fb
 version                   2
 backing_file_offset       0x0
@@ -228,6 +243,11 @@ autoclear_features        0x0
 refcount_order            4
 header_length             104
 
+Header extension:
+magic                     0x6803f857
+length                    144
+data                      <binary>
+
 ERROR cluster 5 refcount=0 reference=1
 ERROR cluster 6 refcount=0 reference=1
 Rebuilding refcount structure
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [Qemu-devel] [PATCH v2 3/9] block: Assert no write requests under BDRV_O_INCOMING
  2016-01-13 16:37 [Qemu-devel] [PATCH v2 0/9] block/qcow2: Migration handoff fixes and cleanups Kevin Wolf
  2016-01-13 16:37 ` [Qemu-devel] [PATCH v2 1/9] qcow2: Write feature table only for v3 images Kevin Wolf
  2016-01-13 16:37 ` [Qemu-devel] [PATCH v2 2/9] qcow2: Write full header on image creation Kevin Wolf
@ 2016-01-13 16:37 ` Kevin Wolf
  2016-01-13 16:37 ` [Qemu-devel] [PATCH v2 4/9] block: Fix error path in bdrv_invalidate_cache() Kevin Wolf
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: Kevin Wolf @ 2016-01-13 16:37 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, qemu-devel, mreitz

As long as BDRV_O_INCOMING is set, the image file is only opened so we
have a file descriptor for it. We're definitely not supposed to modify
the image, it's still owned by the migration source.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 block/io.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/block/io.c b/block/io.c
index 63e3678..2f19f61 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1300,6 +1300,7 @@ static int coroutine_fn bdrv_co_do_pwritev(BlockDriverState *bs,
     if (bs->read_only) {
         return -EPERM;
     }
+    assert(!(bs->open_flags & BDRV_O_INCOMING));
 
     ret = bdrv_check_byte_request(bs, offset, bytes);
     if (ret < 0) {
@@ -2461,6 +2462,7 @@ int coroutine_fn bdrv_co_discard(BlockDriverState *bs, int64_t sector_num,
     } else if (bs->read_only) {
         return -EPERM;
     }
+    assert(!(bs->open_flags & BDRV_O_INCOMING));
 
     /* Do nothing if disabled.  */
     if (!(bs->open_flags & BDRV_O_UNMAP)) {
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [Qemu-devel] [PATCH v2 4/9] block: Fix error path in bdrv_invalidate_cache()
  2016-01-13 16:37 [Qemu-devel] [PATCH v2 0/9] block/qcow2: Migration handoff fixes and cleanups Kevin Wolf
                   ` (2 preceding siblings ...)
  2016-01-13 16:37 ` [Qemu-devel] [PATCH v2 3/9] block: Assert no write requests under BDRV_O_INCOMING Kevin Wolf
@ 2016-01-13 16:37 ` Kevin Wolf
  2016-01-13 16:37 ` [Qemu-devel] [PATCH v2 5/9] block: Rename BDRV_O_INCOMING to BDRV_O_INACTIVE Kevin Wolf
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: Kevin Wolf @ 2016-01-13 16:37 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, qemu-devel, mreitz

We can only clear BDRV_O_INCOMING if the caches were actually
invalidated.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 block.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/block.c b/block.c
index ef37d51..ee9d7ea 100644
--- a/block.c
+++ b/block.c
@@ -3276,12 +3276,14 @@ void bdrv_invalidate_cache(BlockDriverState *bs, Error **errp)
         bdrv_invalidate_cache(bs->file->bs, &local_err);
     }
     if (local_err) {
+        bs->open_flags |= BDRV_O_INCOMING;
         error_propagate(errp, local_err);
         return;
     }
 
     ret = refresh_total_sectors(bs, bs->total_sectors);
     if (ret < 0) {
+        bs->open_flags |= BDRV_O_INCOMING;
         error_setg_errno(errp, -ret, "Could not refresh total sector count");
         return;
     }
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [Qemu-devel] [PATCH v2 5/9] block: Rename BDRV_O_INCOMING to BDRV_O_INACTIVE
  2016-01-13 16:37 [Qemu-devel] [PATCH v2 0/9] block/qcow2: Migration handoff fixes and cleanups Kevin Wolf
                   ` (3 preceding siblings ...)
  2016-01-13 16:37 ` [Qemu-devel] [PATCH v2 4/9] block: Fix error path in bdrv_invalidate_cache() Kevin Wolf
@ 2016-01-13 16:37 ` Kevin Wolf
  2016-01-13 16:49   ` Eric Blake
  2016-01-13 16:37 ` [Qemu-devel] [PATCH v2 6/9] block: Inactivate BDS when migration completes Kevin Wolf
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 12+ messages in thread
From: Kevin Wolf @ 2016-01-13 16:37 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, qemu-devel, mreitz

Instead of covering only the state of images on the migration
destination before the migration is completed, the flag will also cover
the state of images on the migration source after completion. This
common state implies that the image is technically still open, but no
writes will happen and any cached contents will be reloaded from disk if
and when the image leaves this state.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block.c               | 10 +++++-----
 block/io.c            |  4 ++--
 block/qcow2.c         |  6 +++---
 block/qed.c           |  4 ++--
 include/block/block.h |  2 +-
 nbd.c                 |  2 +-
 6 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/block.c b/block.c
index ee9d7ea..7fbf74d 100644
--- a/block.c
+++ b/block.c
@@ -1191,7 +1191,7 @@ static int bdrv_fill_options(QDict **options, const char *filename,
     }
 
     if (runstate_check(RUN_STATE_INMIGRATE)) {
-        *flags |= BDRV_O_INCOMING;
+        *flags |= BDRV_O_INACTIVE;
     }
 
     return 0;
@@ -3265,10 +3265,10 @@ void bdrv_invalidate_cache(BlockDriverState *bs, Error **errp)
         return;
     }
 
-    if (!(bs->open_flags & BDRV_O_INCOMING)) {
+    if (!(bs->open_flags & BDRV_O_INACTIVE)) {
         return;
     }
-    bs->open_flags &= ~BDRV_O_INCOMING;
+    bs->open_flags &= ~BDRV_O_INACTIVE;
 
     if (bs->drv->bdrv_invalidate_cache) {
         bs->drv->bdrv_invalidate_cache(bs, &local_err);
@@ -3276,14 +3276,14 @@ void bdrv_invalidate_cache(BlockDriverState *bs, Error **errp)
         bdrv_invalidate_cache(bs->file->bs, &local_err);
     }
     if (local_err) {
-        bs->open_flags |= BDRV_O_INCOMING;
+        bs->open_flags |= BDRV_O_INACTIVE;
         error_propagate(errp, local_err);
         return;
     }
 
     ret = refresh_total_sectors(bs, bs->total_sectors);
     if (ret < 0) {
-        bs->open_flags |= BDRV_O_INCOMING;
+        bs->open_flags |= BDRV_O_INACTIVE;
         error_setg_errno(errp, -ret, "Could not refresh total sector count");
         return;
     }
diff --git a/block/io.c b/block/io.c
index 2f19f61..0129456 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1300,7 +1300,7 @@ static int coroutine_fn bdrv_co_do_pwritev(BlockDriverState *bs,
     if (bs->read_only) {
         return -EPERM;
     }
-    assert(!(bs->open_flags & BDRV_O_INCOMING));
+    assert(!(bs->open_flags & BDRV_O_INACTIVE));
 
     ret = bdrv_check_byte_request(bs, offset, bytes);
     if (ret < 0) {
@@ -2462,7 +2462,7 @@ int coroutine_fn bdrv_co_discard(BlockDriverState *bs, int64_t sector_num,
     } else if (bs->read_only) {
         return -EPERM;
     }
-    assert(!(bs->open_flags & BDRV_O_INCOMING));
+    assert(!(bs->open_flags & BDRV_O_INACTIVE));
 
     /* Do nothing if disabled.  */
     if (!(bs->open_flags & BDRV_O_UNMAP)) {
diff --git a/block/qcow2.c b/block/qcow2.c
index 01f1fe3..9e4abf3 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1140,7 +1140,7 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags,
     }
 
     /* Clear unknown autoclear feature bits */
-    if (!bs->read_only && !(flags & BDRV_O_INCOMING) && s->autoclear_features) {
+    if (!bs->read_only && !(flags & BDRV_O_INACTIVE) && s->autoclear_features) {
         s->autoclear_features = 0;
         ret = qcow2_update_header(bs);
         if (ret < 0) {
@@ -1153,7 +1153,7 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags,
     qemu_co_mutex_init(&s->lock);
 
     /* Repair image if dirty */
-    if (!(flags & (BDRV_O_CHECK | BDRV_O_INCOMING)) && !bs->read_only &&
+    if (!(flags & (BDRV_O_CHECK | BDRV_O_INACTIVE)) && !bs->read_only &&
         (s->incompatible_features & QCOW2_INCOMPAT_DIRTY)) {
         BdrvCheckResult result = {0};
 
@@ -1692,7 +1692,7 @@ static void qcow2_close(BlockDriverState *bs)
     /* else pre-write overlap checks in cache_destroy may crash */
     s->l1_table = NULL;
 
-    if (!(bs->open_flags & BDRV_O_INCOMING)) {
+    if (!(bs->open_flags & BDRV_O_INACTIVE)) {
         int ret1, ret2;
 
         ret1 = qcow2_cache_flush(bs, s->l2_table_cache);
diff --git a/block/qed.c b/block/qed.c
index 9b88895..63185b5 100644
--- a/block/qed.c
+++ b/block/qed.c
@@ -477,7 +477,7 @@ static int bdrv_qed_open(BlockDriverState *bs, QDict *options, int flags,
      * feature is no longer valid.
      */
     if ((s->header.autoclear_features & ~QED_AUTOCLEAR_FEATURE_MASK) != 0 &&
-        !bdrv_is_read_only(bs->file->bs) && !(flags & BDRV_O_INCOMING)) {
+        !bdrv_is_read_only(bs->file->bs) && !(flags & BDRV_O_INACTIVE)) {
         s->header.autoclear_features &= QED_AUTOCLEAR_FEATURE_MASK;
 
         ret = qed_write_header_sync(s);
@@ -505,7 +505,7 @@ static int bdrv_qed_open(BlockDriverState *bs, QDict *options, int flags,
          * aid data recovery from an otherwise inconsistent image.
          */
         if (!bdrv_is_read_only(bs->file->bs) &&
-            !(flags & BDRV_O_INCOMING)) {
+            !(flags & BDRV_O_INACTIVE)) {
             BdrvCheckResult result = {0};
 
             ret = qed_check(s, &result, true);
diff --git a/include/block/block.h b/include/block/block.h
index c96923d..2b7d33c 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -84,7 +84,7 @@ typedef struct HDGeometry {
 #define BDRV_O_NO_BACKING  0x0100 /* don't open the backing file */
 #define BDRV_O_NO_FLUSH    0x0200 /* disable flushing on this disk */
 #define BDRV_O_COPY_ON_READ 0x0400 /* copy read backing sectors into image */
-#define BDRV_O_INCOMING    0x0800  /* consistency hint for incoming migration */
+#define BDRV_O_INACTIVE    0x0800  /* consistency hint for migration handoff */
 #define BDRV_O_CHECK       0x1000  /* open solely for consistency check */
 #define BDRV_O_ALLOW_RDWR  0x2000  /* allow reopen to change from r/o to r/w */
 #define BDRV_O_UNMAP       0x4000  /* execute guest UNMAP/TRIM operations */
diff --git a/nbd.c b/nbd.c
index b3d9654..2faa2a0 100644
--- a/nbd.c
+++ b/nbd.c
@@ -1077,7 +1077,7 @@ NBDExport *nbd_export_new(BlockBackend *blk, off_t dev_offset, off_t size,
     blk_add_aio_context_notifier(blk, blk_aio_attached, blk_aio_detach, exp);
     /*
      * NBD exports are used for non-shared storage migration.  Make sure
-     * that BDRV_O_INCOMING is cleared and the image is ready for write
+     * that BDRV_O_INACTIVE is cleared and the image is ready for write
      * access since the export could be available before migration handover.
      */
     blk_invalidate_cache(blk, NULL);
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [Qemu-devel] [PATCH v2 6/9] block: Inactivate BDS when migration completes
  2016-01-13 16:37 [Qemu-devel] [PATCH v2 0/9] block/qcow2: Migration handoff fixes and cleanups Kevin Wolf
                   ` (4 preceding siblings ...)
  2016-01-13 16:37 ` [Qemu-devel] [PATCH v2 5/9] block: Rename BDRV_O_INCOMING to BDRV_O_INACTIVE Kevin Wolf
@ 2016-01-13 16:37 ` Kevin Wolf
  2016-01-13 16:37 ` [Qemu-devel] [PATCH v2 7/9] qcow2: Implement .bdrv_inactivate Kevin Wolf
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: Kevin Wolf @ 2016-01-13 16:37 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, qemu-devel, mreitz

So far, live migration with shared storage meant that the image is in a
not-really-ready don't-touch-me state on the destination while the
source is still actively using it, but after completing the migration,
the image was fully opened on both sides. This is bad.

This patch adds a block driver callback to inactivate images on the
source before completing the migration. Inactivation means that it goes
to a state as if it was just live migrated to the qemu instance on the
source (i.e. BDRV_O_INACTIVE is set). You're then supposed to continue
either on the source or on the destination, which takes ownership of the
image.

A typical migration looks like this now with respect to disk images:

1. Destination qemu is started, the image is opened with
   BDRV_O_INACTIVE. The image is fully opened on the source.

2. Migration is about to complete. The source flushes the image and
   inactivates it. Now both sides have the image opened with
   BDRV_O_INACTIVE and are expecting the other side to still modify it.

3. One side (the destination on success) continues and calls
   bdrv_invalidate_all() in order to take ownership of the image again.
   This removes BDRV_O_INACTIVE on the resuming side; the flag remains
   set on the other side.

This ensures that the same image isn't written to by both instances
(unless both are resumed, but then you get what you deserve). This is
important because .bdrv_close for non-BDRV_O_INACTIVE images could write
to the image file, which is definitely forbidden while another host is
using the image.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
---
 block.c                   | 34 ++++++++++++++++++++++++++++++++++
 include/block/block.h     |  1 +
 include/block/block_int.h |  1 +
 migration/migration.c     |  7 +++++++
 qmp.c                     | 12 ++++++++++++
 5 files changed, 55 insertions(+)

diff --git a/block.c b/block.c
index 7fbf74d..527549b 100644
--- a/block.c
+++ b/block.c
@@ -3307,6 +3307,40 @@ void bdrv_invalidate_cache_all(Error **errp)
     }
 }
 
+static int bdrv_inactivate(BlockDriverState *bs)
+{
+    int ret;
+
+    if (bs->drv->bdrv_inactivate) {
+        ret = bs->drv->bdrv_inactivate(bs);
+        if (ret < 0) {
+            return ret;
+        }
+    }
+
+    bs->open_flags |= BDRV_O_INACTIVE;
+    return 0;
+}
+
+int bdrv_inactivate_all(void)
+{
+    BlockDriverState *bs;
+    int ret;
+
+    QTAILQ_FOREACH(bs, &bdrv_states, device_list) {
+        AioContext *aio_context = bdrv_get_aio_context(bs);
+
+        aio_context_acquire(aio_context);
+        ret = bdrv_inactivate(bs);
+        aio_context_release(aio_context);
+        if (ret < 0) {
+            return ret;
+        }
+    }
+
+    return 0;
+}
+
 /**************************************************************/
 /* removable device support */
 
diff --git a/include/block/block.h b/include/block/block.h
index 2b7d33c..25f36dc 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -369,6 +369,7 @@ BlockAIOCB *bdrv_aio_ioctl(BlockDriverState *bs,
 /* Invalidate any cached metadata used by image formats */
 void bdrv_invalidate_cache(BlockDriverState *bs, Error **errp);
 void bdrv_invalidate_cache_all(Error **errp);
+int bdrv_inactivate_all(void);
 
 /* Ensure contents are flushed to disk.  */
 int bdrv_flush(BlockDriverState *bs);
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 256609d..428fa33 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -172,6 +172,7 @@ struct BlockDriver {
      * Invalidate any cached meta-data.
      */
     void (*bdrv_invalidate_cache)(BlockDriverState *bs, Error **errp);
+    int (*bdrv_inactivate)(BlockDriverState *bs);
 
     /*
      * Flushes all data that was already written to the OS all the way down to
diff --git a/migration/migration.c b/migration/migration.c
index c842499..309aa98 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1416,7 +1416,11 @@ static int postcopy_start(MigrationState *ms, bool *old_vm_running)
     *old_vm_running = runstate_is_running();
     global_state_store();
     ret = vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
+    if (ret < 0) {
+        goto fail;
+    }
 
+    ret = bdrv_inactivate_all();
     if (ret < 0) {
         goto fail;
     }
@@ -1536,6 +1540,9 @@ static void migration_completion(MigrationState *s, int current_active_state,
         if (!ret) {
             ret = vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
             if (ret >= 0) {
+                ret = bdrv_inactivate_all();
+            }
+            if (ret >= 0) {
                 qemu_file_set_rate_limit(s->file, INT64_MAX);
                 qemu_savevm_state_complete_precopy(s->file, false);
             }
diff --git a/qmp.c b/qmp.c
index 0a1fa19..39b8da1 100644
--- a/qmp.c
+++ b/qmp.c
@@ -192,6 +192,18 @@ void qmp_cont(Error **errp)
         }
     }
 
+    /* Continuing after completed migration. Images have been inactivated to
+     * allow the destination to take control. Need to get control back now. */
+    if (runstate_check(RUN_STATE_FINISH_MIGRATE) ||
+        runstate_check(RUN_STATE_POSTMIGRATE))
+    {
+        bdrv_invalidate_cache_all(&local_err);
+        if (local_err) {
+            error_propagate(errp, local_err);
+            return;
+        }
+    }
+
     if (runstate_check(RUN_STATE_INMIGRATE)) {
         autostart = 1;
     } else {
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [Qemu-devel] [PATCH v2 7/9] qcow2: Implement .bdrv_inactivate
  2016-01-13 16:37 [Qemu-devel] [PATCH v2 0/9] block/qcow2: Migration handoff fixes and cleanups Kevin Wolf
                   ` (5 preceding siblings ...)
  2016-01-13 16:37 ` [Qemu-devel] [PATCH v2 6/9] block: Inactivate BDS when migration completes Kevin Wolf
@ 2016-01-13 16:37 ` Kevin Wolf
  2016-01-13 16:37 ` [Qemu-devel] [PATCH v2 8/9] qcow2: Fix BDRV_O_INACTIVE handling in qcow2_invalidate_cache() Kevin Wolf
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: Kevin Wolf @ 2016-01-13 16:37 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, qemu-devel, mreitz

The callback has to ensure that closing or flushing the image afterwards
wouldn't cause a write access to the image files. This means that just
the caches have to be written out, which is part of the existing
.bdrv_close implementation.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 block/qcow2.c | 45 ++++++++++++++++++++++++++++-----------------
 1 file changed, 28 insertions(+), 17 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 9e4abf3..519e2ae 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1685,6 +1685,32 @@ fail:
     return ret;
 }
 
+static int qcow2_inactivate(BlockDriverState *bs)
+{
+    BDRVQcow2State *s = bs->opaque;
+    int ret, result = 0;
+
+    ret = qcow2_cache_flush(bs, s->l2_table_cache);
+    if (ret) {
+        result = ret;
+        error_report("Failed to flush the L2 table cache: %s",
+                     strerror(-ret));
+    }
+
+    ret = qcow2_cache_flush(bs, s->refcount_block_cache);
+    if (ret) {
+        result = ret;
+        error_report("Failed to flush the refcount block cache: %s",
+                     strerror(-ret));
+    }
+
+    if (result == 0) {
+        qcow2_mark_clean(bs);
+    }
+
+    return result;
+}
+
 static void qcow2_close(BlockDriverState *bs)
 {
     BDRVQcow2State *s = bs->opaque;
@@ -1693,23 +1719,7 @@ static void qcow2_close(BlockDriverState *bs)
     s->l1_table = NULL;
 
     if (!(bs->open_flags & BDRV_O_INACTIVE)) {
-        int ret1, ret2;
-
-        ret1 = qcow2_cache_flush(bs, s->l2_table_cache);
-        ret2 = qcow2_cache_flush(bs, s->refcount_block_cache);
-
-        if (ret1) {
-            error_report("Failed to flush the L2 table cache: %s",
-                         strerror(-ret1));
-        }
-        if (ret2) {
-            error_report("Failed to flush the refcount block cache: %s",
-                         strerror(-ret2));
-        }
-
-        if (!ret1 && !ret2) {
-            qcow2_mark_clean(bs);
-        }
+        qcow2_inactivate(bs);
     }
 
     cache_clean_timer_del(bs);
@@ -3340,6 +3350,7 @@ BlockDriver bdrv_qcow2 = {
 
     .bdrv_refresh_limits        = qcow2_refresh_limits,
     .bdrv_invalidate_cache      = qcow2_invalidate_cache,
+    .bdrv_inactivate            = qcow2_inactivate,
 
     .create_opts         = &qcow2_create_opts,
     .bdrv_check          = qcow2_check,
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [Qemu-devel] [PATCH v2 8/9] qcow2: Fix BDRV_O_INACTIVE handling in qcow2_invalidate_cache()
  2016-01-13 16:37 [Qemu-devel] [PATCH v2 0/9] block/qcow2: Migration handoff fixes and cleanups Kevin Wolf
                   ` (6 preceding siblings ...)
  2016-01-13 16:37 ` [Qemu-devel] [PATCH v2 7/9] qcow2: Implement .bdrv_inactivate Kevin Wolf
@ 2016-01-13 16:37 ` Kevin Wolf
  2016-01-13 16:37 ` [Qemu-devel] [PATCH v2 9/9] qcow2: Make image inaccessible after failed qcow2_invalidate_cache() Kevin Wolf
  2016-01-19 16:55 ` [Qemu-devel] [PATCH v2 0/9] block/qcow2: Migration handoff fixes and cleanups Kevin Wolf
  9 siblings, 0 replies; 12+ messages in thread
From: Kevin Wolf @ 2016-01-13 16:37 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, qemu-devel, mreitz

What qcow2_invalidate_cache() should do is close the image with
BDRV_O_INACTIVE set and reopen it with the flag cleared. In fact, it
used to do exactly the opposite: qcow2_close() relied on bs->open_flags,
which is already updated to have cleared BDRV_O_INACTIVE at this point,
whereas qcow2_open() was called with s->flags, which has the flag still
set. Fix this.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 block/qcow2.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 519e2ae..1b9c5fc 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1718,7 +1718,7 @@ static void qcow2_close(BlockDriverState *bs)
     /* else pre-write overlap checks in cache_destroy may crash */
     s->l1_table = NULL;
 
-    if (!(bs->open_flags & BDRV_O_INACTIVE)) {
+    if (!(s->flags & BDRV_O_INACTIVE)) {
         qcow2_inactivate(bs);
     }
 
@@ -1769,6 +1769,7 @@ static void qcow2_invalidate_cache(BlockDriverState *bs, Error **errp)
     memset(s, 0, sizeof(BDRVQcow2State));
     options = qdict_clone_shallow(bs->options);
 
+    flags &= ~BDRV_O_INACTIVE;
     ret = qcow2_open(bs, options, flags, &local_err);
     QDECREF(options);
     if (local_err) {
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [Qemu-devel] [PATCH v2 9/9] qcow2: Make image inaccessible after failed qcow2_invalidate_cache()
  2016-01-13 16:37 [Qemu-devel] [PATCH v2 0/9] block/qcow2: Migration handoff fixes and cleanups Kevin Wolf
                   ` (7 preceding siblings ...)
  2016-01-13 16:37 ` [Qemu-devel] [PATCH v2 8/9] qcow2: Fix BDRV_O_INACTIVE handling in qcow2_invalidate_cache() Kevin Wolf
@ 2016-01-13 16:37 ` Kevin Wolf
  2016-01-19 16:55 ` [Qemu-devel] [PATCH v2 0/9] block/qcow2: Migration handoff fixes and cleanups Kevin Wolf
  9 siblings, 0 replies; 12+ messages in thread
From: Kevin Wolf @ 2016-01-13 16:37 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, qemu-devel, mreitz

If qcow2_invalidate_cache() fails, we are in a state where qcow2_close()
has already been completed, but the image hasn't been reopened yet.
Calling into any qcow2 function for an image in this state will cause
crashes.

The real solution would be to get rid of the close/open pair and instead
do an atomic reset of the involved data structures, but this isn't
trivial, so let's just make the image inaccessible for now.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 block/qcow2.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/block/qcow2.c b/block/qcow2.c
index 1b9c5fc..28d1a4b 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1763,6 +1763,7 @@ static void qcow2_invalidate_cache(BlockDriverState *bs, Error **errp)
     bdrv_invalidate_cache(bs->file->bs, &local_err);
     if (local_err) {
         error_propagate(errp, local_err);
+        bs->drv = NULL;
         return;
     }
 
@@ -1776,9 +1777,11 @@ static void qcow2_invalidate_cache(BlockDriverState *bs, Error **errp)
         error_setg(errp, "Could not reopen qcow2 layer: %s",
                    error_get_pretty(local_err));
         error_free(local_err);
+        bs->drv = NULL;
         return;
     } else if (ret < 0) {
         error_setg_errno(errp, -ret, "Could not reopen qcow2 layer");
+        bs->drv = NULL;
         return;
     }
 
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [Qemu-devel] [PATCH v2 5/9] block: Rename BDRV_O_INCOMING to BDRV_O_INACTIVE
  2016-01-13 16:37 ` [Qemu-devel] [PATCH v2 5/9] block: Rename BDRV_O_INCOMING to BDRV_O_INACTIVE Kevin Wolf
@ 2016-01-13 16:49   ` Eric Blake
  0 siblings, 0 replies; 12+ messages in thread
From: Eric Blake @ 2016-01-13 16:49 UTC (permalink / raw)
  To: Kevin Wolf, qemu-block; +Cc: qemu-devel, mreitz

[-- Attachment #1: Type: text/plain, Size: 939 bytes --]

On 01/13/2016 09:37 AM, Kevin Wolf wrote:
> Instead of covering only the state of images on the migration
> destination before the migration is completed, the flag will also cover
> the state of images on the migration source after completion. This
> common state implies that the image is technically still open, but no
> writes will happen and any cached contents will be reloaded from disk if
> and when the image leaves this state.
> 
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  block.c               | 10 +++++-----
>  block/io.c            |  4 ++--
>  block/qcow2.c         |  6 +++---
>  block/qed.c           |  4 ++--
>  include/block/block.h |  2 +-
>  nbd.c                 |  2 +-
>  6 files changed, 14 insertions(+), 14 deletions(-)
> 

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Qemu-devel] [PATCH v2 0/9] block/qcow2: Migration handoff fixes and cleanups
  2016-01-13 16:37 [Qemu-devel] [PATCH v2 0/9] block/qcow2: Migration handoff fixes and cleanups Kevin Wolf
                   ` (8 preceding siblings ...)
  2016-01-13 16:37 ` [Qemu-devel] [PATCH v2 9/9] qcow2: Make image inaccessible after failed qcow2_invalidate_cache() Kevin Wolf
@ 2016-01-19 16:55 ` Kevin Wolf
  9 siblings, 0 replies; 12+ messages in thread
From: Kevin Wolf @ 2016-01-19 16:55 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, mreitz

Am 13.01.2016 um 17:37 hat Kevin Wolf geschrieben:
> This is the non-controversal part of the qcow2 locking series, which just makes
> the state of images with respect to live migration clearer and fixes a few bugs
> we have in the handoff procedure and with writing the qcow2 feature table.
> 
> Kevin Wolf (9):
>   qcow2: Write feature table only for v3 images
>   qcow2: Write full header on image creation
>   block: Assert no write requests under BDRV_O_INCOMING
>   block: Fix error path in bdrv_invalidate_cache()
>   block: Rename BDRV_O_INCOMING to BDRV_O_INACTIVE
>   block: Inactivate BDS when migration completes
>   qcow2: Implement .bdrv_inactivate
>   qcow2: Fix BDRV_O_INACTIVE handling in qcow2_invalidate_cache()
>   qcow2: Make image inaccessible after failed qcow2_invalidate_cache()

Applied to the block branch.

Kevin

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2016-01-19 16:55 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-01-13 16:37 [Qemu-devel] [PATCH v2 0/9] block/qcow2: Migration handoff fixes and cleanups Kevin Wolf
2016-01-13 16:37 ` [Qemu-devel] [PATCH v2 1/9] qcow2: Write feature table only for v3 images Kevin Wolf
2016-01-13 16:37 ` [Qemu-devel] [PATCH v2 2/9] qcow2: Write full header on image creation Kevin Wolf
2016-01-13 16:37 ` [Qemu-devel] [PATCH v2 3/9] block: Assert no write requests under BDRV_O_INCOMING Kevin Wolf
2016-01-13 16:37 ` [Qemu-devel] [PATCH v2 4/9] block: Fix error path in bdrv_invalidate_cache() Kevin Wolf
2016-01-13 16:37 ` [Qemu-devel] [PATCH v2 5/9] block: Rename BDRV_O_INCOMING to BDRV_O_INACTIVE Kevin Wolf
2016-01-13 16:49   ` Eric Blake
2016-01-13 16:37 ` [Qemu-devel] [PATCH v2 6/9] block: Inactivate BDS when migration completes Kevin Wolf
2016-01-13 16:37 ` [Qemu-devel] [PATCH v2 7/9] qcow2: Implement .bdrv_inactivate Kevin Wolf
2016-01-13 16:37 ` [Qemu-devel] [PATCH v2 8/9] qcow2: Fix BDRV_O_INACTIVE handling in qcow2_invalidate_cache() Kevin Wolf
2016-01-13 16:37 ` [Qemu-devel] [PATCH v2 9/9] qcow2: Make image inaccessible after failed qcow2_invalidate_cache() Kevin Wolf
2016-01-19 16:55 ` [Qemu-devel] [PATCH v2 0/9] block/qcow2: Migration handoff fixes and cleanups Kevin Wolf

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.