QEMU-Devel Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH for-5.0 v2 0/4] bug fixes extracted from larger qcow2 zero bit work
@ 2020-03-24 17:42 Eric Blake
  2020-03-24 17:42 ` [PATCH v2 1/4] qcow2: Comment typo fixes Eric Blake
                   ` (4 more replies)
  0 siblings, 5 replies; 12+ messages in thread
From: Eric Blake @ 2020-03-24 17:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, qemu-block, mreitz

My proposal [1] to add an autoclear all-zero-content bit to the qcow2
format has now stalled into 5.1 territory, but several patches in my
initial proposal are uncontroversial and obvious bug fixes worth
having in 5.0.

[1] https://lists.gnu.org/archive/html/qemu-devel/2020-01/msg08075.html

Eric Blake (4):
  qcow2: Comment typo fixes
  qcow2: List autoclear bit names in header
  qcow2: Avoid feature name extension on small cluster size
  sheepdog: Consistently set bdrv_has_zero_init_truncate

 docs/interop/qcow2.txt     |  3 ++-
 block/qcow2.c              | 29 +++++++++++++++++++++++------
 block/sheepdog.c           |  2 ++
 tests/qemu-iotests/031.out |  8 ++++----
 tests/qemu-iotests/036     |  6 ++++--
 tests/qemu-iotests/036.out |  4 ++--
 tests/qemu-iotests/061     |  6 ++++--
 tests/qemu-iotests/061.out | 14 +++++++-------
 8 files changed, 48 insertions(+), 24 deletions(-)

-- 
2.26.0.rc2



^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v2 1/4] qcow2: Comment typo fixes
  2020-03-24 17:42 [PATCH for-5.0 v2 0/4] bug fixes extracted from larger qcow2 zero bit work Eric Blake
@ 2020-03-24 17:42 ` Eric Blake
  2020-03-24 17:42 ` [PATCH v2 2/4] qcow2: List autoclear bit names in header Eric Blake
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 12+ messages in thread
From: Eric Blake @ 2020-03-24 17:42 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, Vladimir Sementsov-Ogievskiy, Alberto Garcia, qemu-block, mreitz

Various trivial typos noticed while working on this file.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
---
 block/qcow2.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index d1da3d91db21..5a7f30e692a2 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -177,7 +177,7 @@ static ssize_t qcow2_crypto_hdr_write_func(QCryptoBlock *block, size_t offset,
 }


-/* 
+/*
  * read qcow2 extension and fill bs
  * start reading from start_offset
  * finish reading upon magic of value 0 or when end_offset reached
@@ -3255,7 +3255,7 @@ qcow2_co_create(BlockdevCreateOptions *create_options, Error **errp)
      * inconsistency later.
      *
      * We do need a refcount table because growing the refcount table means
-     * allocating two new refcount blocks - the seconds of which would be at
+     * allocating two new refcount blocks - the second of which would be at
      * 2 GB for 64k clusters, and we don't want to have a 2 GB initial file
      * size for any qcow2 image.
      */
@@ -3500,7 +3500,7 @@ qcow2_co_create(BlockdevCreateOptions *create_options, Error **errp)
         goto out;
     }

-    /* Want a backing file? There you go.*/
+    /* Want a backing file? There you go. */
     if (qcow2_opts->has_backing_file) {
         const char *backing_format = NULL;

-- 
2.26.0.rc2



^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v2 2/4] qcow2: List autoclear bit names in header
  2020-03-24 17:42 [PATCH for-5.0 v2 0/4] bug fixes extracted from larger qcow2 zero bit work Eric Blake
  2020-03-24 17:42 ` [PATCH v2 1/4] qcow2: Comment typo fixes Eric Blake
@ 2020-03-24 17:42 ` Eric Blake
  2020-03-24 17:42 ` [PATCH v2 3/4] qcow2: Avoid feature name extension on small cluster size Eric Blake
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 12+ messages in thread
From: Eric Blake @ 2020-03-24 17:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, Vladimir Sementsov-Ogievskiy, qemu-block, mreitz

The feature table is supposed to advertise the name of all feature
bits that we support; however, we forgot to update the table for
autoclear bits.  While at it, move the table to read-only memory in
code, and tweak the qcow2 spec to name the second autoclear bit.
Update iotests that are affected by the longer header length.

Fixes: 88ddffae
Fixes: 93c24936
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

---
[v2] tweak name of "bitmaps" autoclear bit [Vladimir]
---
 docs/interop/qcow2.txt     |  3 ++-
 block/qcow2.c              | 12 +++++++++++-
 tests/qemu-iotests/031.out |  8 ++++----
 tests/qemu-iotests/036.out |  4 ++--
 tests/qemu-iotests/061.out | 14 +++++++-------
 5 files changed, 26 insertions(+), 15 deletions(-)

diff --git a/docs/interop/qcow2.txt b/docs/interop/qcow2.txt
index 5597e244746e..640e0eca4000 100644
--- a/docs/interop/qcow2.txt
+++ b/docs/interop/qcow2.txt
@@ -143,7 +143,8 @@ the next fields through header_length.
                                 bit is unset, the bitmaps extension data must be
                                 considered inconsistent.

-                    Bit 1:      If this bit is set, the external data file can
+                    Bit 1:      Raw external data bit
+                                If this bit is set, the external data file can
                                 be read as a consistent standalone raw image
                                 without looking at the qcow2 metadata.

diff --git a/block/qcow2.c b/block/qcow2.c
index 5a7f30e692a2..67b0c214fba4 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -2825,7 +2825,7 @@ int qcow2_update_header(BlockDriverState *bs)

     /* Feature table */
     if (s->qcow_version >= 3) {
-        Qcow2Feature features[] = {
+        static const Qcow2Feature features[] = {
             {
                 .type = QCOW2_FEAT_TYPE_INCOMPATIBLE,
                 .bit  = QCOW2_INCOMPAT_DIRTY_BITNR,
@@ -2846,6 +2846,16 @@ int qcow2_update_header(BlockDriverState *bs)
                 .bit  = QCOW2_COMPAT_LAZY_REFCOUNTS_BITNR,
                 .name = "lazy refcounts",
             },
+            {
+                .type = QCOW2_FEAT_TYPE_AUTOCLEAR,
+                .bit  = QCOW2_AUTOCLEAR_BITMAPS_BITNR,
+                .name = "bitmaps",
+            },
+            {
+                .type = QCOW2_FEAT_TYPE_AUTOCLEAR,
+                .bit  = QCOW2_AUTOCLEAR_DATA_FILE_RAW_BITNR,
+                .name = "raw external data",
+            },
         };

         ret = header_ext_add(buf, QCOW2_EXT_MAGIC_FEATURE_TABLE,
diff --git a/tests/qemu-iotests/031.out b/tests/qemu-iotests/031.out
index d535e407bc30..46f97c5a4ea4 100644
--- a/tests/qemu-iotests/031.out
+++ b/tests/qemu-iotests/031.out
@@ -117,7 +117,7 @@ header_length             104

 Header extension:
 magic                     0x6803f857
-length                    192
+length                    288
 data                      <binary>

 Header extension:
@@ -150,7 +150,7 @@ header_length             104

 Header extension:
 magic                     0x6803f857
-length                    192
+length                    288
 data                      <binary>

 Header extension:
@@ -164,7 +164,7 @@ No errors were found on the image.

 magic                     0x514649fb
 version                   3
-backing_file_offset       0x178
+backing_file_offset       0x1d8
 backing_file_size         0x17
 cluster_bits              16
 size                      67108864
@@ -188,7 +188,7 @@ data                      'host_device'

 Header extension:
 magic                     0x6803f857
-length                    192
+length                    288
 data                      <binary>

 Header extension:
diff --git a/tests/qemu-iotests/036.out b/tests/qemu-iotests/036.out
index 0b52b934e115..23b699ce0622 100644
--- a/tests/qemu-iotests/036.out
+++ b/tests/qemu-iotests/036.out
@@ -26,7 +26,7 @@ compatible_features       []
 autoclear_features        [63]
 Header extension:
 magic                     0x6803f857
-length                    192
+length                    288
 data                      <binary>


@@ -38,7 +38,7 @@ compatible_features       []
 autoclear_features        []
 Header extension:
 magic                     0x6803f857
-length                    192
+length                    288
 data                      <binary>

 *** done
diff --git a/tests/qemu-iotests/061.out b/tests/qemu-iotests/061.out
index 8b3091a412bc..413cc4e0f4ab 100644
--- a/tests/qemu-iotests/061.out
+++ b/tests/qemu-iotests/061.out
@@ -26,7 +26,7 @@ header_length             104

 Header extension:
 magic                     0x6803f857
-length                    192
+length                    288
 data                      <binary>

 magic                     0x514649fb
@@ -84,7 +84,7 @@ header_length             104

 Header extension:
 magic                     0x6803f857
-length                    192
+length                    288
 data                      <binary>

 magic                     0x514649fb
@@ -140,7 +140,7 @@ header_length             104

 Header extension:
 magic                     0x6803f857
-length                    192
+length                    288
 data                      <binary>

 ERROR cluster 5 refcount=0 reference=1
@@ -195,7 +195,7 @@ header_length             104

 Header extension:
 magic                     0x6803f857
-length                    192
+length                    288
 data                      <binary>

 magic                     0x514649fb
@@ -264,7 +264,7 @@ header_length             104

 Header extension:
 magic                     0x6803f857
-length                    192
+length                    288
 data                      <binary>

 read 65536/65536 bytes at offset 44040192
@@ -298,7 +298,7 @@ header_length             104

 Header extension:
 magic                     0x6803f857
-length                    192
+length                    288
 data                      <binary>

 ERROR cluster 5 refcount=0 reference=1
@@ -327,7 +327,7 @@ header_length             104

 Header extension:
 magic                     0x6803f857
-length                    192
+length                    288
 data                      <binary>

 read 131072/131072 bytes at offset 0
-- 
2.26.0.rc2



^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v2 3/4] qcow2: Avoid feature name extension on small cluster size
  2020-03-24 17:42 [PATCH for-5.0 v2 0/4] bug fixes extracted from larger qcow2 zero bit work Eric Blake
  2020-03-24 17:42 ` [PATCH v2 1/4] qcow2: Comment typo fixes Eric Blake
  2020-03-24 17:42 ` [PATCH v2 2/4] qcow2: List autoclear bit names in header Eric Blake
@ 2020-03-24 17:42 ` Eric Blake
  2020-03-25 12:42   ` Max Reitz
  2020-03-24 17:42 ` [PATCH v2 4/4] sheepdog: Consistently set bdrv_has_zero_init_truncate Eric Blake
  2020-03-26 13:32 ` [PATCH for-5.0 v2 0/4] bug fixes extracted from larger qcow2 zero bit work Max Reitz
  4 siblings, 1 reply; 12+ messages in thread
From: Eric Blake @ 2020-03-24 17:42 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, Vladimir Sementsov-Ogievskiy, Alberto Garcia, qemu-block, mreitz

As the feature name table can be quite large (over 9k if all 64 bits
of all three feature fields have names; a mere 8 features leaves only
8 bytes for a backing file name in a 512-byte cluster), it is unwise
to emit this optional header in images with small cluster sizes.

Update iotest 036 to skip running on small cluster sizes; meanwhile,
note that iotest 061 never passed on alternative cluster sizes
(however, I limited this patch to tests with output affected by adding
feature names, rather than auditing for other tests that are not
robust to alternative cluster sizes).

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
---
 block/qcow2.c          | 11 +++++++++--
 tests/qemu-iotests/036 |  6 ++++--
 tests/qemu-iotests/061 |  6 ++++--
 3 files changed, 17 insertions(+), 6 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 67b0c214fba4..9475ace57163 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -2823,8 +2823,15 @@ int qcow2_update_header(BlockDriverState *bs)
         buflen -= ret;
     }

-    /* Feature table */
-    if (s->qcow_version >= 3) {
+    /*
+     * Feature table.  A mere 8 feature names occupies 392 bytes, and
+     * when coupled with the v3 minimum header of 104 bytes plus the
+     * 8-byte end-of-extension marker, that would leave only 8 bytes
+     * for a backing file name in an image with 512-byte clusters.
+     * Thus, we choose to omit this header for cluster sizes 4k and
+     * smaller.
+     */
+    if (s->qcow_version >= 3 && s->cluster_size > 4096) {
         static const Qcow2Feature features[] = {
             {
                 .type = QCOW2_FEAT_TYPE_INCOMPATIBLE,
diff --git a/tests/qemu-iotests/036 b/tests/qemu-iotests/036
index 512598421c20..cf522de7a1aa 100755
--- a/tests/qemu-iotests/036
+++ b/tests/qemu-iotests/036
@@ -44,8 +44,10 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
 _supported_fmt qcow2
 _supported_proto file
 # Only qcow2v3 and later supports feature bits;
-# qcow2.py does not support external data files
-_unsupported_imgopts 'compat=0.10' data_file
+# qcow2.py does not support external data files;
+# this test requires a cluster size large enough for the feature table
+_unsupported_imgopts 'compat=0.10' data_file \
+		     'cluster_size=\(512\|1024\|2048\|4096\)'

 echo
 echo === Image with unknown incompatible feature bit ===
diff --git a/tests/qemu-iotests/061 b/tests/qemu-iotests/061
index 36b040491fef..ce285d308408 100755
--- a/tests/qemu-iotests/061
+++ b/tests/qemu-iotests/061
@@ -44,8 +44,10 @@ _supported_os Linux
 # Conversion between different compat versions can only really work
 # with refcount_bits=16;
 # we have explicit tests for data_file here, but the whole test does
-# not work with it
-_unsupported_imgopts 'refcount_bits=\([^1]\|.\([^6]\|$\)\)' data_file
+# not work with it;
+# we have explicit tests for various cluster sizes, the remaining tests
+# require the default 64k cluster
+_unsupported_imgopts 'refcount_bits=\([^1]\|.\([^6]\|$\)\)' data_file cluster_size

 echo
 echo "=== Testing version downgrade with zero expansion ==="
-- 
2.26.0.rc2



^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v2 4/4] sheepdog: Consistently set bdrv_has_zero_init_truncate
  2020-03-24 17:42 [PATCH for-5.0 v2 0/4] bug fixes extracted from larger qcow2 zero bit work Eric Blake
                   ` (2 preceding siblings ...)
  2020-03-24 17:42 ` [PATCH v2 3/4] qcow2: Avoid feature name extension on small cluster size Eric Blake
@ 2020-03-24 17:42 ` Eric Blake
  2020-03-24 19:33   ` John Snow
  2020-03-25 14:45   ` [PATCH-for-5.0 " Philippe Mathieu-Daudé
  2020-03-26 13:32 ` [PATCH for-5.0 v2 0/4] bug fixes extracted from larger qcow2 zero bit work Max Reitz
  4 siblings, 2 replies; 12+ messages in thread
From: Eric Blake @ 2020-03-24 17:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, Liu Yuan, open list:Sheepdog, qemu-block, mreitz

block_int.h claims that .bdrv_has_zero_init must return 0 if
.bdrv_has_zero_init_truncate does likewise; but this is violated if
only the former callback is provided if .bdrv_co_truncate also exists.
When adding the latter callback, it was mistakenly added to only one
of the three possible sheepdog instantiations.

Fixes: 1dcaf527
Signed-off-by: Eric Blake <eblake@redhat.com>
---
 block/sheepdog.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/block/sheepdog.c b/block/sheepdog.c
index cfa84338a2d6..522c16a93676 100644
--- a/block/sheepdog.c
+++ b/block/sheepdog.c
@@ -3269,6 +3269,7 @@ static BlockDriver bdrv_sheepdog_tcp = {
     .bdrv_co_create               = sd_co_create,
     .bdrv_co_create_opts          = sd_co_create_opts,
     .bdrv_has_zero_init           = bdrv_has_zero_init_1,
+    .bdrv_has_zero_init_truncate  = bdrv_has_zero_init_1,
     .bdrv_getlength               = sd_getlength,
     .bdrv_get_allocated_file_size = sd_get_allocated_file_size,
     .bdrv_co_truncate             = sd_co_truncate,
@@ -3307,6 +3308,7 @@ static BlockDriver bdrv_sheepdog_unix = {
     .bdrv_co_create               = sd_co_create,
     .bdrv_co_create_opts          = sd_co_create_opts,
     .bdrv_has_zero_init           = bdrv_has_zero_init_1,
+    .bdrv_has_zero_init_truncate  = bdrv_has_zero_init_1,
     .bdrv_getlength               = sd_getlength,
     .bdrv_get_allocated_file_size = sd_get_allocated_file_size,
     .bdrv_co_truncate             = sd_co_truncate,
-- 
2.26.0.rc2



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 4/4] sheepdog: Consistently set bdrv_has_zero_init_truncate
  2020-03-24 17:42 ` [PATCH v2 4/4] sheepdog: Consistently set bdrv_has_zero_init_truncate Eric Blake
@ 2020-03-24 19:33   ` John Snow
  2020-03-25 14:45   ` [PATCH-for-5.0 " Philippe Mathieu-Daudé
  1 sibling, 0 replies; 12+ messages in thread
From: John Snow @ 2020-03-24 19:33 UTC (permalink / raw)
  To: Eric Blake, qemu-devel
  Cc: kwolf, Liu Yuan, open list:Sheepdog, qemu-block, mreitz



On 3/24/20 1:42 PM, Eric Blake wrote:
> block_int.h claims that .bdrv_has_zero_init must return 0 if
> .bdrv_has_zero_init_truncate does likewise; but this is violated if
> only the former callback is provided if .bdrv_co_truncate also exists.
> When adding the latter callback, it was mistakenly added to only one
> of the three possible sheepdog instantiations.
> 
> Fixes: 1dcaf527
> Signed-off-by: Eric Blake <eblake@redhat.com>
> ---
>  block/sheepdog.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/block/sheepdog.c b/block/sheepdog.c
> index cfa84338a2d6..522c16a93676 100644
> --- a/block/sheepdog.c
> +++ b/block/sheepdog.c
> @@ -3269,6 +3269,7 @@ static BlockDriver bdrv_sheepdog_tcp = {
>      .bdrv_co_create               = sd_co_create,
>      .bdrv_co_create_opts          = sd_co_create_opts,
>      .bdrv_has_zero_init           = bdrv_has_zero_init_1,
> +    .bdrv_has_zero_init_truncate  = bdrv_has_zero_init_1,
>      .bdrv_getlength               = sd_getlength,
>      .bdrv_get_allocated_file_size = sd_get_allocated_file_size,
>      .bdrv_co_truncate             = sd_co_truncate,
> @@ -3307,6 +3308,7 @@ static BlockDriver bdrv_sheepdog_unix = {
>      .bdrv_co_create               = sd_co_create,
>      .bdrv_co_create_opts          = sd_co_create_opts,
>      .bdrv_has_zero_init           = bdrv_has_zero_init_1,
> +    .bdrv_has_zero_init_truncate  = bdrv_has_zero_init_1,
>      .bdrv_getlength               = sd_getlength,
>      .bdrv_get_allocated_file_size = sd_get_allocated_file_size,
>      .bdrv_co_truncate             = sd_co_truncate,
> 

Reviewed-by: John Snow <jsnow@redhat.com>



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 3/4] qcow2: Avoid feature name extension on small cluster size
  2020-03-24 17:42 ` [PATCH v2 3/4] qcow2: Avoid feature name extension on small cluster size Eric Blake
@ 2020-03-25 12:42   ` Max Reitz
  2020-03-25 13:18     ` Eric Blake
  0 siblings, 1 reply; 12+ messages in thread
From: Max Reitz @ 2020-03-25 12:42 UTC (permalink / raw)
  To: Eric Blake, qemu-devel
  Cc: kwolf, Vladimir Sementsov-Ogievskiy, Alberto Garcia, qemu-block

[-- Attachment #1.1: Type: text/plain, Size: 2460 bytes --]

On 24.03.20 18:42, Eric Blake wrote:
> As the feature name table can be quite large (over 9k if all 64 bits
> of all three feature fields have names; a mere 8 features leaves only
> 8 bytes for a backing file name in a 512-byte cluster), it is unwise
> to emit this optional header in images with small cluster sizes.
> 
> Update iotest 036 to skip running on small cluster sizes; meanwhile,
> note that iotest 061 never passed on alternative cluster sizes
> (however, I limited this patch to tests with output affected by adding
> feature names, rather than auditing for other tests that are not
> robust to alternative cluster sizes).

That’s a bit more brave than necessary, considering I don’t think anyone
has ever run the iotests with the cluster_size option.  (I certainly
don’t, and I don’t plan to, because I don’t think it’s that important to
test that.)  There are certainly many other iotests that fail with a
non-default cluster size.

Not that it’s wrong care about it.  On the opposite, I’m happy you do. :)

> Signed-off-by: Eric Blake <eblake@redhat.com>
> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> Reviewed-by: Alberto Garcia <berto@igalia.com>
> ---
>  block/qcow2.c          | 11 +++++++++--
>  tests/qemu-iotests/036 |  6 ++++--
>  tests/qemu-iotests/061 |  6 ++++--
>  3 files changed, 17 insertions(+), 6 deletions(-)
> 
> diff --git a/block/qcow2.c b/block/qcow2.c
> index 67b0c214fba4..9475ace57163 100644
> --- a/block/qcow2.c
> +++ b/block/qcow2.c
> @@ -2823,8 +2823,15 @@ int qcow2_update_header(BlockDriverState *bs)
>          buflen -= ret;
>      }
> 
> -    /* Feature table */
> -    if (s->qcow_version >= 3) {
> +    /*
> +     * Feature table.  A mere 8 feature names occupies 392 bytes, and
> +     * when coupled with the v3 minimum header of 104 bytes plus the
> +     * 8-byte end-of-extension marker, that would leave only 8 bytes
> +     * for a backing file name in an image with 512-byte clusters.
> +     * Thus, we choose to omit this header for cluster sizes 4k and
> +     * smaller.

Can’t we do this decision more dynamically?  Like, always omit it when
cluster_size - sizeof(features) < 2k/3k/...?

Max

> +     */
> +    if (s->qcow_version >= 3 && s->cluster_size > 4096) {
>          static const Qcow2Feature features[] = {
>              {
>                  .type = QCOW2_FEAT_TYPE_INCOMPATIBLE,


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 3/4] qcow2: Avoid feature name extension on small cluster size
  2020-03-25 12:42   ` Max Reitz
@ 2020-03-25 13:18     ` Eric Blake
  2020-03-25 13:52       ` Max Reitz
  0 siblings, 1 reply; 12+ messages in thread
From: Eric Blake @ 2020-03-25 13:18 UTC (permalink / raw)
  To: Max Reitz, qemu-devel
  Cc: kwolf, Vladimir Sementsov-Ogievskiy, Alberto Garcia, qemu-block

On 3/25/20 7:42 AM, Max Reitz wrote:
> On 24.03.20 18:42, Eric Blake wrote:
>> As the feature name table can be quite large (over 9k if all 64 bits
>> of all three feature fields have names; a mere 8 features leaves only
>> 8 bytes for a backing file name in a 512-byte cluster), it is unwise
>> to emit this optional header in images with small cluster sizes.
>>
>> Update iotest 036 to skip running on small cluster sizes; meanwhile,
>> note that iotest 061 never passed on alternative cluster sizes
>> (however, I limited this patch to tests with output affected by adding
>> feature names, rather than auditing for other tests that are not
>> robust to alternative cluster sizes).
> 

>> -    /* Feature table */
>> -    if (s->qcow_version >= 3) {
>> +    /*
>> +     * Feature table.  A mere 8 feature names occupies 392 bytes, and
>> +     * when coupled with the v3 minimum header of 104 bytes plus the
>> +     * 8-byte end-of-extension marker, that would leave only 8 bytes
>> +     * for a backing file name in an image with 512-byte clusters.
>> +     * Thus, we choose to omit this header for cluster sizes 4k and
>> +     * smaller.
> 
> Can’t we do this decision more dynamically?  Like, always omit it when
> cluster_size - sizeof(features) < 2k/3k/...?
> 
> Max
> 
>> +     */
>> +    if (s->qcow_version >= 3 && s->cluster_size > 4096) {

Yes.  But when you consider that sizeof(features) is a compile-time 
constant, it isn't really dynamic for a given qemu release, but rather a 
different way to spell things; about the only thing it would buy us is 
that our cutoff window for what cluster size no longer gets the header 
may automatically shift from 2k to 4k to 8k as future qemu versions add 
more qcow2 features.  If we want to write it like that, which size limit 
do you propose?  Or asked differently, how much space should we reserve 
for other extension headers + backing file name?

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 3/4] qcow2: Avoid feature name extension on small cluster size
  2020-03-25 13:18     ` Eric Blake
@ 2020-03-25 13:52       ` Max Reitz
  2020-03-25 14:01         ` Eric Blake
  0 siblings, 1 reply; 12+ messages in thread
From: Max Reitz @ 2020-03-25 13:52 UTC (permalink / raw)
  To: Eric Blake, qemu-devel
  Cc: kwolf, Vladimir Sementsov-Ogievskiy, Alberto Garcia, qemu-block

[-- Attachment #1.1: Type: text/plain, Size: 2744 bytes --]

On 25.03.20 14:18, Eric Blake wrote:
> On 3/25/20 7:42 AM, Max Reitz wrote:
>> On 24.03.20 18:42, Eric Blake wrote:
>>> As the feature name table can be quite large (over 9k if all 64 bits
>>> of all three feature fields have names; a mere 8 features leaves only
>>> 8 bytes for a backing file name in a 512-byte cluster), it is unwise
>>> to emit this optional header in images with small cluster sizes.
>>>
>>> Update iotest 036 to skip running on small cluster sizes; meanwhile,
>>> note that iotest 061 never passed on alternative cluster sizes
>>> (however, I limited this patch to tests with output affected by adding
>>> feature names, rather than auditing for other tests that are not
>>> robust to alternative cluster sizes).
>>
> 
>>> -    /* Feature table */
>>> -    if (s->qcow_version >= 3) {
>>> +    /*
>>> +     * Feature table.  A mere 8 feature names occupies 392 bytes, and
>>> +     * when coupled with the v3 minimum header of 104 bytes plus the
>>> +     * 8-byte end-of-extension marker, that would leave only 8 bytes
>>> +     * for a backing file name in an image with 512-byte clusters.
>>> +     * Thus, we choose to omit this header for cluster sizes 4k and
>>> +     * smaller.
>>
>> Can’t we do this decision more dynamically?  Like, always omit it when
>> cluster_size - sizeof(features) < 2k/3k/...?
>>
>> Max
>>
>>> +     */
>>> +    if (s->qcow_version >= 3 && s->cluster_size > 4096) {
> 
> Yes.  But when you consider that sizeof(features) is a compile-time
> constant, it isn't really dynamic for a given qemu release, but rather a
> different way to spell things; about the only thing it would buy us is
> that our cutoff window for what cluster size no longer gets the header
> may automatically shift from 2k to 4k to 8k as future qemu versions add
> more qcow2 features.

Yes.

> If we want to write it like that, which size limit
> do you propose?  Or asked differently, how much space should we reserve
> for other extension headers + backing file name?

Well, that was the “2k/3k/...” list. :)

The backing file name is limited to 1k, so I suppose that plus 1k for a
potential external data filename, plus 1k for the rest, so 3k in total?

Now that I look into the spec, I see that it actually doesn’t say that
the backing filename has to be part of the header cluster.  But, well.

It also only says that the image header must be part of the first
cluster, which in my opinion doesn’t necessarily include its extensions.
 So header extensions (and the backing filename) could actually be in
consecutive clusters.  But that of course would be much more difficult
to implement.

Max


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 3/4] qcow2: Avoid feature name extension on small cluster size
  2020-03-25 13:52       ` Max Reitz
@ 2020-03-25 14:01         ` Eric Blake
  0 siblings, 0 replies; 12+ messages in thread
From: Eric Blake @ 2020-03-25 14:01 UTC (permalink / raw)
  To: Max Reitz, qemu-devel
  Cc: kwolf, Vladimir Sementsov-Ogievskiy, Alberto Garcia, qemu-block

On 3/25/20 8:52 AM, Max Reitz wrote:

>> If we want to write it like that, which size limit
>> do you propose?  Or asked differently, how much space should we reserve
>> for other extension headers + backing file name?
> 
> Well, that was the “2k/3k/...” list. :)
> 
> The backing file name is limited to 1k, so I suppose that plus 1k for a
> potential external data filename, plus 1k for the rest, so 3k in total?
> 
> Now that I look into the spec, I see that it actually doesn’t say that
> the backing filename has to be part of the header cluster.  But, well.

qemu enforces that the header is one cluster.  But you're right, that 
does not appear to directly be a limitation mandated by the spec, and we 
could relax qemu to allow the header to be several consecutive clusters. 
  The tricky part, however, is that the backing file name is NOT 
described by a header extension, but rather is just whatever bytes occur 
after the final header extension.  There's no clear indication anywhere 
on when to stop reading those bytes, other than by an implicit limit 
such as insisting those bytes fall within the first cluster.

Had we been smarter when designing v3, we would have made the backing 
file name a header extension (in fact, it would have been possible to 
design the additional fields of v3 to look like an unknown header 
extension when parsed by a v2 binary) - but hindsight is 20/20.

> 
> It also only says that the image header must be part of the first
> cluster, which in my opinion doesn’t necessarily include its extensions.
>   So header extensions (and the backing filename) could actually be in
> consecutive clusters.  But that of course would be much more difficult
> to implement.

We'd still want a sane limit even for small-cluster images (maybe "no 
more than 2M of header information, regardless of cluster size); or 
maybe even introduce a NEW header field and/or extension to make it 
obvious how many clusters are being used for the purpose of the metadata 
header in this particular image (with sane fallbacks for when that 
extension is not present).  But you're right - it comes with complexity. 
  This patch as written is safe for 5.0-rc1, but this discussion about 
teaching qemu to permit headers larger than 1 cluster is squarely in the 
5.1 category, if at all.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH-for-5.0 v2 4/4] sheepdog: Consistently set bdrv_has_zero_init_truncate
  2020-03-24 17:42 ` [PATCH v2 4/4] sheepdog: Consistently set bdrv_has_zero_init_truncate Eric Blake
  2020-03-24 19:33   ` John Snow
@ 2020-03-25 14:45   ` Philippe Mathieu-Daudé
  1 sibling, 0 replies; 12+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-03-25 14:45 UTC (permalink / raw)
  To: Eric Blake, qemu-devel
  Cc: kwolf, Liu Yuan, open list:Sheepdog, qemu-block, mreitz

On 3/24/20 6:42 PM, Eric Blake wrote:
> block_int.h claims that .bdrv_has_zero_init must return 0 if
> .bdrv_has_zero_init_truncate does likewise; but this is violated if
> only the former callback is provided if .bdrv_co_truncate also exists.
> When adding the latter callback, it was mistakenly added to only one
> of the three possible sheepdog instantiations.
> 
> Fixes: 1dcaf527
> Signed-off-by: Eric Blake <eblake@redhat.com>
> ---
>   block/sheepdog.c | 2 ++
>   1 file changed, 2 insertions(+)
> 
> diff --git a/block/sheepdog.c b/block/sheepdog.c
> index cfa84338a2d6..522c16a93676 100644
> --- a/block/sheepdog.c
> +++ b/block/sheepdog.c
> @@ -3269,6 +3269,7 @@ static BlockDriver bdrv_sheepdog_tcp = {
>       .bdrv_co_create               = sd_co_create,
>       .bdrv_co_create_opts          = sd_co_create_opts,
>       .bdrv_has_zero_init           = bdrv_has_zero_init_1,
> +    .bdrv_has_zero_init_truncate  = bdrv_has_zero_init_1,
>       .bdrv_getlength               = sd_getlength,
>       .bdrv_get_allocated_file_size = sd_get_allocated_file_size,
>       .bdrv_co_truncate             = sd_co_truncate,
> @@ -3307,6 +3308,7 @@ static BlockDriver bdrv_sheepdog_unix = {
>       .bdrv_co_create               = sd_co_create,
>       .bdrv_co_create_opts          = sd_co_create_opts,
>       .bdrv_has_zero_init           = bdrv_has_zero_init_1,
> +    .bdrv_has_zero_init_truncate  = bdrv_has_zero_init_1,
>       .bdrv_getlength               = sd_getlength,
>       .bdrv_get_allocated_file_size = sd_get_allocated_file_size,
>       .bdrv_co_truncate             = sd_co_truncate,
> 

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH for-5.0 v2 0/4] bug fixes extracted from larger qcow2 zero bit work
  2020-03-24 17:42 [PATCH for-5.0 v2 0/4] bug fixes extracted from larger qcow2 zero bit work Eric Blake
                   ` (3 preceding siblings ...)
  2020-03-24 17:42 ` [PATCH v2 4/4] sheepdog: Consistently set bdrv_has_zero_init_truncate Eric Blake
@ 2020-03-26 13:32 ` Max Reitz
  4 siblings, 0 replies; 12+ messages in thread
From: Max Reitz @ 2020-03-26 13:32 UTC (permalink / raw)
  To: Eric Blake, qemu-devel; +Cc: kwolf, qemu-block

[-- Attachment #1.1: Type: text/plain, Size: 1522 bytes --]

On 24.03.20 18:42, Eric Blake wrote:
> My proposal [1] to add an autoclear all-zero-content bit to the qcow2
> format has now stalled into 5.1 territory, but several patches in my
> initial proposal are uncontroversial and obvious bug fixes worth
> having in 5.0.
> 
> [1] https://lists.gnu.org/archive/html/qemu-devel/2020-01/msg08075.html
> 
> Eric Blake (4):
>   qcow2: Comment typo fixes
>   qcow2: List autoclear bit names in header
>   qcow2: Avoid feature name extension on small cluster size
>   sheepdog: Consistently set bdrv_has_zero_init_truncate

OK, so my only real gripe was with the question whether we could make
patch 3 decide dynamically when to include or not include the feature
name table.  There’s little difference in practice: Right now, we could
probably get away with including it in images with 4k clusters, then it
would automatically switch to 8k clusters minimum (which is the limit
chosen in patch 3 as-is), and at some theoretical point in the far, far
future it would switch 16k clusters minimum.

I don’t think anyone really cares about whether the feature name table
is in images with 4k clusters or not, and I think we still have a couple
of decades before we the table gets too big for images with 8k clusters
(and we probably won’t be using qcow2 then anymore).

So tl;dr: There’s no practical benefit of deciding dynamically, hence
I’m taking this series as-is:

https://git.xanclic.moe/XanClic/qemu/commits/branch/block

Thanks!

Max


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, back to index

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-24 17:42 [PATCH for-5.0 v2 0/4] bug fixes extracted from larger qcow2 zero bit work Eric Blake
2020-03-24 17:42 ` [PATCH v2 1/4] qcow2: Comment typo fixes Eric Blake
2020-03-24 17:42 ` [PATCH v2 2/4] qcow2: List autoclear bit names in header Eric Blake
2020-03-24 17:42 ` [PATCH v2 3/4] qcow2: Avoid feature name extension on small cluster size Eric Blake
2020-03-25 12:42   ` Max Reitz
2020-03-25 13:18     ` Eric Blake
2020-03-25 13:52       ` Max Reitz
2020-03-25 14:01         ` Eric Blake
2020-03-24 17:42 ` [PATCH v2 4/4] sheepdog: Consistently set bdrv_has_zero_init_truncate Eric Blake
2020-03-24 19:33   ` John Snow
2020-03-25 14:45   ` [PATCH-for-5.0 " Philippe Mathieu-Daudé
2020-03-26 13:32 ` [PATCH for-5.0 v2 0/4] bug fixes extracted from larger qcow2 zero bit work Max Reitz

QEMU-Devel Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/qemu-devel/0 qemu-devel/git/0.git
	git clone --mirror https://lore.kernel.org/qemu-devel/1 qemu-devel/git/1.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 qemu-devel qemu-devel/ https://lore.kernel.org/qemu-devel \
		qemu-devel@nongnu.org
	public-inbox-index qemu-devel

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.nongnu.qemu-devel


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git