All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH 0/2] qemu-img: Special post-backing convert handle
@ 2018-05-01 16:57 Max Reitz
  2018-05-01 16:57 ` [Qemu-devel] [PATCH 1/2] qemu-img: Special post-backing convert handling Max Reitz
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Max Reitz @ 2018-05-01 16:57 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, Max Reitz, Kevin Wolf

This is the issue:

$ qemu-img create -f qcow2 base.qcow2 512M
Formatting 'base.qcow2', fmt=qcow2 size=536870912 cluster_size=65536 lazy_refcounts=off refcount_bits=16
$ qemu-img create -f qcow2 -b base.qcow2 source.qcow2 1G
Formatting 'source.qcow2', fmt=qcow2 size=1073741824 backing_file=base.qcow2 cluster_size=65536 lazy_refcounts=off refcount_bits=16
$ qemu-img convert -O qcow2 -B base.qcow2 -o compat=0.10 \
    source.qcow2 target.qcow2
$ qemu-img info target.qcow2
image: target.qcow2
file format: qcow2
virtual size: 1.0G (1073741824 bytes)
disk size: 512M    <-------------------- this here
cluster_size: 65536
backing file: base.qcow2
Format specific information:
    compat: 0.10
    refcount bits: 16
$ qemu-img map target.qcow2
Offset          Length          Mapped to       File
0x20000000      0x20000000      0x50000         target.qcow2

So qemu-img convert sees that source.qcow2 contains only zeroes past the
end of base.qcow2 -- but that is not the backing file, so it will
explicitly write zeroes to target.qcow2.  But that file is compat=0.10,
so it does not support efficient zero writes and will actually fill that
area with real zeroes.  Hence the mapping, hence the disk size.

However, we don't need to write zeroes to an image when it is
initialized to zeroes -- which the current qemu-img code says doesn't
happen for target images with backing files.  But it does happen when
the target backing file is shorter than the target image, then the area
past the end of the backing file may indeed read as zeroes and we don't
need to write anything there.

So the first patch in this series makes qemu-img convert detect that
case and handle zeroes past the end of the backing file as "fall back to
the backing file" (because that means zeroes, at least if unallocated
areas read as zeroes (which I presume they have to for formats
supporting backing files, but better be safe than sorry)), and the
second adds an iotest.


Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1527898


Max Reitz (2):
  qemu-img: Special post-backing convert handling
  iotests: Test post-backing convert target behavior

 qemu-img.c                 | 26 +++++++++++++++++++++++++-
 tests/qemu-iotests/122     | 42 ++++++++++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/122.out | 18 ++++++++++++++++++
 3 files changed, 85 insertions(+), 1 deletion(-)

-- 
2.14.3

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Qemu-devel] [PATCH 1/2] qemu-img: Special post-backing convert handling
  2018-05-01 16:57 [Qemu-devel] [PATCH 0/2] qemu-img: Special post-backing convert handle Max Reitz
@ 2018-05-01 16:57 ` Max Reitz
  2018-05-01 18:09   ` Eric Blake
  2018-05-01 16:57 ` [Qemu-devel] [PATCH 2/2] iotests: Test post-backing convert target behavior Max Reitz
  2018-06-01 11:15 ` [Qemu-devel] [PATCH 0/2] qemu-img: Special post-backing convert handle Max Reitz
  2 siblings, 1 reply; 6+ messages in thread
From: Max Reitz @ 2018-05-01 16:57 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, Max Reitz, Kevin Wolf

Currently, qemu-img convert writes zeroes when it reads zeroes.
Sometimes it does not because the target is initialized to zeroes
anyway, so we do not need to overwrite (and thus potentially allocate)
it.  This is never the case for targets with backing files, though.  But
even them may have an area that is initialized to zeroes, and that is
the area past the end of the backing file (if that is shorter than the
overlay).

So if the target format's unallocated blocks are zero and there is a gap
between the target's backing file's end and the target's end, we do not
have to explicitly write zeroes there.

Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1527898
Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 qemu-img.c | 26 +++++++++++++++++++++++++-
 1 file changed, 25 insertions(+), 1 deletion(-)

diff --git a/qemu-img.c b/qemu-img.c
index 855fa52514..5365afbd13 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -1542,7 +1542,9 @@ typedef struct ImgConvertState {
     BlockBackend *target;
     bool has_zero_init;
     bool compressed;
+    bool unallocated_blocks_are_zero;
     bool target_has_backing;
+    int64_t target_backing_sectors; /* negative if unknown */
     bool wr_in_order;
     int min_sparse;
     size_t cluster_sectors;
@@ -1571,12 +1573,23 @@ static int convert_iteration_sectors(ImgConvertState *s, int64_t sector_num)
 {
     int64_t src_cur_offset;
     int ret, n, src_cur;
+    bool post_backing_zero = false;
 
     convert_select_part(s, sector_num, &src_cur, &src_cur_offset);
 
     assert(s->total_sectors > sector_num);
     n = MIN(s->total_sectors - sector_num, BDRV_REQUEST_MAX_SECTORS);
 
+    if (s->target_backing_sectors >= 0) {
+        if (sector_num >= s->target_backing_sectors) {
+            post_backing_zero = s->unallocated_blocks_are_zero;
+        } else if (sector_num + n > s->target_backing_sectors) {
+            /* Split requests around target_backing_sectors (because
+             * starting from there, zeros are handled differently) */
+            n = s->target_backing_sectors - sector_num;
+        }
+    }
+
     if (s->sector_next_status <= sector_num) {
         int64_t count = n * BDRV_SECTOR_SIZE;
 
@@ -1598,7 +1611,7 @@ static int convert_iteration_sectors(ImgConvertState *s, int64_t sector_num)
         n = DIV_ROUND_UP(count, BDRV_SECTOR_SIZE);
 
         if (ret & BDRV_BLOCK_ZERO) {
-            s->status = BLK_ZERO;
+            s->status = post_backing_zero ? BLK_BACKING_FILE : BLK_ZERO;
         } else if (ret & BDRV_BLOCK_DATA) {
             s->status = BLK_DATA;
         } else {
@@ -2319,6 +2332,16 @@ static int img_convert(int argc, char **argv)
         }
     }
 
+    if (s.target_has_backing) {
+        /* Errors are treated as "backing length unknown" (which means
+         * s.target_backing_sectors has to be negative, which it will
+         * be automatically).  The backing file length is used only
+         * for optimizations, so such a case is not fatal. */
+        s.target_backing_sectors = bdrv_nb_sectors(out_bs->backing->bs);
+    } else {
+        s.target_backing_sectors = -1;
+    }
+
     ret = bdrv_get_info(out_bs, &bdi);
     if (ret < 0) {
         if (s.compressed) {
@@ -2328,6 +2351,7 @@ static int img_convert(int argc, char **argv)
     } else {
         s.compressed = s.compressed || bdi.needs_compressed_writes;
         s.cluster_sectors = bdi.cluster_size / BDRV_SECTOR_SIZE;
+        s.unallocated_blocks_are_zero = bdi.unallocated_blocks_are_zero;
     }
 
     ret = convert_do_copy(&s);
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [Qemu-devel] [PATCH 2/2] iotests: Test post-backing convert target behavior
  2018-05-01 16:57 [Qemu-devel] [PATCH 0/2] qemu-img: Special post-backing convert handle Max Reitz
  2018-05-01 16:57 ` [Qemu-devel] [PATCH 1/2] qemu-img: Special post-backing convert handling Max Reitz
@ 2018-05-01 16:57 ` Max Reitz
  2018-05-01 18:14   ` Eric Blake
  2018-06-01 11:15 ` [Qemu-devel] [PATCH 0/2] qemu-img: Special post-backing convert handle Max Reitz
  2 siblings, 1 reply; 6+ messages in thread
From: Max Reitz @ 2018-05-01 16:57 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, Max Reitz, Kevin Wolf

This adds a test case to 122 for what happens when you convert to a
target with a backing file that is shorter than the target, and the
image format does not support efficient zero writes (as is the case with
qcow2 v2).

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 tests/qemu-iotests/122     | 42 ++++++++++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/122.out | 18 ++++++++++++++++++
 2 files changed, 60 insertions(+)

diff --git a/tests/qemu-iotests/122 b/tests/qemu-iotests/122
index 45b359c2ba..d8c8ad722d 100755
--- a/tests/qemu-iotests/122
+++ b/tests/qemu-iotests/122
@@ -76,6 +76,48 @@ $QEMU_IMG convert -O $IMGFMT -c -B "$TEST_IMG".base "$TEST_IMG" "$TEST_IMG".orig
 $QEMU_IO -c "read -P 0 0 3M" "$TEST_IMG".orig 2>&1 | _filter_qemu_io | _filter_testdir
 
 
+echo
+echo "=== Converting to an overlay larger than its backing file ==="
+echo
+
+TEST_IMG="$TEST_IMG".base _make_test_img 256M
+# Needs to be at least how much an L2 table covers
+# (64 kB/entry * 64 kB / 8 B/entry = 512 MB)
+# That way, qcow2 will yield at least two status request responses.
+# With just a single response, it would always say "Allocated in the
+# backing file", so the optimization qemu-img convert tries to do is
+# done automatically.  Once it has to be queried twice, however (and
+# one of the queries is completely after the end of the backing file),
+# the block layer will automatically add a ZERO flag that qemu-img
+# convert used to follow up with a zero write to the target.
+# We do not want such a zero write, however, because we are past the
+# end of the backing file on the target as well, so we do not need to
+# write anything there.
+_make_test_img -b "$TEST_IMG".base 768M
+
+# Use compat=0.10 as the output so there is no zero cluster support
+$QEMU_IMG convert -O $IMGFMT -B "$TEST_IMG".base -o compat=0.10 \
+    "$TEST_IMG" "$TEST_IMG".orig
+# See that nothing has been allocated past 64M
+$QEMU_IMG map "$TEST_IMG".orig | _filter_qemu_img_map
+
+echo
+
+# Just before the end of the backing file
+$QEMU_IO -c 'write -P 0x11 255M 1M' "$TEST_IMG".base 2>&1 | _filter_qemu_io
+# Somewhere in the second L2 table
+$QEMU_IO -c 'write -P 0x22 600M 1M' "$TEST_IMG" 2>&1 | _filter_qemu_io
+
+$QEMU_IMG convert -O $IMGFMT -B "$TEST_IMG".base -o compat=0.10 \
+    "$TEST_IMG" "$TEST_IMG".orig
+
+$QEMU_IMG map "$TEST_IMG".orig | _filter_qemu_img_map
+$QEMU_IO -c 'read -P 0x11 255M 1M' \
+         -c 'read -P 0x22 600M 1M' \
+         "$TEST_IMG".orig \
+    | _filter_qemu_io
+
+
 echo
 echo "=== Concatenate multiple source images ==="
 echo
diff --git a/tests/qemu-iotests/122.out b/tests/qemu-iotests/122.out
index 47d8656db8..6c7ee1da6c 100644
--- a/tests/qemu-iotests/122.out
+++ b/tests/qemu-iotests/122.out
@@ -28,6 +28,24 @@ read 3145728/3145728 bytes at offset 0
 read 3145728/3145728 bytes at offset 0
 3 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 
+=== Converting to an overlay larger than its backing file ===
+
+Formatting 'TEST_DIR/t.IMGFMT.base', fmt=IMGFMT size=268435456
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=805306368 backing_file=TEST_DIR/t.IMGFMT.base
+Offset          Length          File
+
+wrote 1048576/1048576 bytes at offset 267386880
+1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+wrote 1048576/1048576 bytes at offset 629145600
+1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+Offset          Length          File
+0xff00000       0x100000        TEST_DIR/t.IMGFMT.base
+0x25800000      0x100000        TEST_DIR/t.IMGFMT.orig
+read 1048576/1048576 bytes at offset 267386880
+1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 1048576/1048576 bytes at offset 629145600
+1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
 === Concatenate multiple source images ===
 
 Formatting 'TEST_DIR/t.IMGFMT.1', fmt=IMGFMT size=4194304
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [Qemu-devel] [PATCH 1/2] qemu-img: Special post-backing convert handling
  2018-05-01 16:57 ` [Qemu-devel] [PATCH 1/2] qemu-img: Special post-backing convert handling Max Reitz
@ 2018-05-01 18:09   ` Eric Blake
  0 siblings, 0 replies; 6+ messages in thread
From: Eric Blake @ 2018-05-01 18:09 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: Kevin Wolf, qemu-devel

On 05/01/2018 11:57 AM, Max Reitz wrote:
> Currently, qemu-img convert writes zeroes when it reads zeroes.
> Sometimes it does not because the target is initialized to zeroes
> anyway, so we do not need to overwrite (and thus potentially allocate)
> it.  This is never the case for targets with backing files, though.  But
> even them may have an area that is initialized to zeroes, and that is

s/them/they/

> the area past the end of the backing file (if that is shorter than the
> overlay).
> 
> So if the target format's unallocated blocks are zero and there is a gap
> between the target's backing file's end and the target's end, we do not
> have to explicitly write zeroes there.
> 
> Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1527898
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
>   qemu-img.c | 26 +++++++++++++++++++++++++-
>   1 file changed, 25 insertions(+), 1 deletion(-)
> 

Some long variable names, but it aids legibility.  Nice fix.

Reviewed-by: Eric Blake <eblake@redhat.com>

(And here's hoping that Red Hat someday flips the switch so that new 
images avoid compat=0.10 in their downstream builds, as we've had quite 
a few corner cases where we already have optimized speed for compat=1.1 ...)

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Qemu-devel] [PATCH 2/2] iotests: Test post-backing convert target behavior
  2018-05-01 16:57 ` [Qemu-devel] [PATCH 2/2] iotests: Test post-backing convert target behavior Max Reitz
@ 2018-05-01 18:14   ` Eric Blake
  0 siblings, 0 replies; 6+ messages in thread
From: Eric Blake @ 2018-05-01 18:14 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: Kevin Wolf, qemu-devel

On 05/01/2018 11:57 AM, Max Reitz wrote:
> This adds a test case to 122 for what happens when you convert to a
> target with a backing file that is shorter than the target, and the
> image format does not support efficient zero writes (as is the case with
> qcow2 v2).
> 
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
>   tests/qemu-iotests/122     | 42 ++++++++++++++++++++++++++++++++++++++++++
>   tests/qemu-iotests/122.out | 18 ++++++++++++++++++
>   2 files changed, 60 insertions(+)
> 

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Qemu-devel] [PATCH 0/2] qemu-img: Special post-backing convert handle
  2018-05-01 16:57 [Qemu-devel] [PATCH 0/2] qemu-img: Special post-backing convert handle Max Reitz
  2018-05-01 16:57 ` [Qemu-devel] [PATCH 1/2] qemu-img: Special post-backing convert handling Max Reitz
  2018-05-01 16:57 ` [Qemu-devel] [PATCH 2/2] iotests: Test post-backing convert target behavior Max Reitz
@ 2018-06-01 11:15 ` Max Reitz
  2 siblings, 0 replies; 6+ messages in thread
From: Max Reitz @ 2018-06-01 11:15 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, Kevin Wolf

[-- Attachment #1: Type: text/plain, Size: 2613 bytes --]

On 2018-05-01 18:57, Max Reitz wrote:
> This is the issue:
> 
> $ qemu-img create -f qcow2 base.qcow2 512M
> Formatting 'base.qcow2', fmt=qcow2 size=536870912 cluster_size=65536 lazy_refcounts=off refcount_bits=16
> $ qemu-img create -f qcow2 -b base.qcow2 source.qcow2 1G
> Formatting 'source.qcow2', fmt=qcow2 size=1073741824 backing_file=base.qcow2 cluster_size=65536 lazy_refcounts=off refcount_bits=16
> $ qemu-img convert -O qcow2 -B base.qcow2 -o compat=0.10 \
>     source.qcow2 target.qcow2
> $ qemu-img info target.qcow2
> image: target.qcow2
> file format: qcow2
> virtual size: 1.0G (1073741824 bytes)
> disk size: 512M    <-------------------- this here
> cluster_size: 65536
> backing file: base.qcow2
> Format specific information:
>     compat: 0.10
>     refcount bits: 16
> $ qemu-img map target.qcow2
> Offset          Length          Mapped to       File
> 0x20000000      0x20000000      0x50000         target.qcow2
> 
> So qemu-img convert sees that source.qcow2 contains only zeroes past the
> end of base.qcow2 -- but that is not the backing file, so it will
> explicitly write zeroes to target.qcow2.  But that file is compat=0.10,
> so it does not support efficient zero writes and will actually fill that
> area with real zeroes.  Hence the mapping, hence the disk size.
> 
> However, we don't need to write zeroes to an image when it is
> initialized to zeroes -- which the current qemu-img code says doesn't
> happen for target images with backing files.  But it does happen when
> the target backing file is shorter than the target image, then the area
> past the end of the backing file may indeed read as zeroes and we don't
> need to write anything there.
> 
> So the first patch in this series makes qemu-img convert detect that
> case and handle zeroes past the end of the backing file as "fall back to
> the backing file" (because that means zeroes, at least if unallocated
> areas read as zeroes (which I presume they have to for formats
> supporting backing files, but better be safe than sorry)), and the
> second adds an iotest.
> 
> 
> Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1527898
> 
> 
> Max Reitz (2):
>   qemu-img: Special post-backing convert handling
>   iotests: Test post-backing convert target behavior
> 
>  qemu-img.c                 | 26 +++++++++++++++++++++++++-
>  tests/qemu-iotests/122     | 42 ++++++++++++++++++++++++++++++++++++++++++
>  tests/qemu-iotests/122.out | 18 ++++++++++++++++++
>  3 files changed, 85 insertions(+), 1 deletion(-)

Applied to my block branch.

Max


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2018-06-01 11:15 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-05-01 16:57 [Qemu-devel] [PATCH 0/2] qemu-img: Special post-backing convert handle Max Reitz
2018-05-01 16:57 ` [Qemu-devel] [PATCH 1/2] qemu-img: Special post-backing convert handling Max Reitz
2018-05-01 18:09   ` Eric Blake
2018-05-01 16:57 ` [Qemu-devel] [PATCH 2/2] iotests: Test post-backing convert target behavior Max Reitz
2018-05-01 18:14   ` Eric Blake
2018-06-01 11:15 ` [Qemu-devel] [PATCH 0/2] qemu-img: Special post-backing convert handle Max Reitz

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.