QEMU-Devel Archive on lore.kernel.org
* [PATCH 0/2] qcow2: Limit total allocation range to INT_MAX
@ 2019-10-10 10:08 Max Reitz
  2019-10-10 10:08 ` [PATCH 1/2] " Max Reitz
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Max Reitz @ 2019-10-10 10:08 UTC (permalink / raw)
  To: qemu-block; +Cc: Kevin Wolf, qemu-devel, Max Reitz

Hi,

While looking for why handle_alloc_space() seems to cause issues on
ppc64le+XFS (performance degradation and data corruption), I spotted
this other issue.  It isn’t as bad, but still needs fixing.

See patch 1 for what is fixed and patch 2 for what breaks otherwise.


Max Reitz (2):
  qcow2: Limit total allocation range to INT_MAX
  iotests: Test large write request to qcow2 file

 block/qcow2-cluster.c      |  5 ++-
 tests/qemu-iotests/268     | 83 ++++++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/268.out |  9 +++++
 tests/qemu-iotests/group   |  1 +
 4 files changed, 97 insertions(+), 1 deletion(-)
 create mode 100755 tests/qemu-iotests/268
 create mode 100644 tests/qemu-iotests/268.out

-- 
2.21.0




* [PATCH 1/2] qcow2: Limit total allocation range to INT_MAX
  2019-10-10 10:08 [PATCH 0/2] qcow2: Limit total allocation range to INT_MAX Max Reitz
@ 2019-10-10 10:08 ` " Max Reitz
  2019-10-10 15:56   ` Eric Blake
  2019-10-11 10:18   ` Philippe Mathieu-Daudé
  2019-10-10 10:08 ` [PATCH 2/2] iotests: Test large write request to qcow2 file Max Reitz
  2019-10-14 15:12 ` [PATCH 0/2] qcow2: Limit total allocation range to INT_MAX Kevin Wolf
  2 siblings, 2 replies; 7+ messages in thread
From: Max Reitz @ 2019-10-10 10:08 UTC (permalink / raw)
  To: qemu-block; +Cc: Kevin Wolf, qemu-devel, Max Reitz

When the COW areas are included, the size of an allocation can exceed
INT_MAX.  This is kind of limited by handle_alloc() in that it already
caps avail_bytes at INT_MAX, but the number of clusters still reflects
the original length.

This can have all sorts of effects, ranging from the storage layer write
call failing to image corruption.  (If there were no image corruption,
then I suppose there would be data loss because the .cow_end area is
forced to be empty, even though there might be something we need to
COW.)

Fix all of it by limiting nb_clusters so the equivalent number of bytes
will not exceed INT_MAX.
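
To illustrate the clamp, here is a minimal standalone sketch.  The
helper names are made up for this example and are not QEMU functions;
cluster_bits stands in for s->cluster_bits, and the byte count is
computed in 64 bits so the sketch itself cannot overflow.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

#define MIN(a, b) ((a) < (b) ? (a) : (b))

/*
 * Hypothetical helper (not a QEMU function): clamp a cluster count so
 * that the byte length it covers fits into an int.
 */
static uint64_t clamp_nb_clusters(uint64_t nb_clusters, int cluster_bits)
{
    assert(cluster_bits >= 0 && cluster_bits < 31);
    return MIN(nb_clusters, (uint64_t)(INT_MAX >> cluster_bits));
}

/* Byte length covered by nb_clusters clusters, computed in 64 bits. */
static uint64_t clusters_to_bytes(uint64_t nb_clusters, int cluster_bits)
{
    return nb_clusters << cluster_bits;
}
```

With 2 MB clusters (cluster_bits = 21), INT_MAX >> 21 is 1023, so
clamp_nb_clusters(1025, 21) yields 1023 clusters, i.e. 2145386496
bytes, which fits into an int.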

Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 block/qcow2-cluster.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 8d5fa1539c..8982b7b762 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -1330,6 +1330,9 @@ static int handle_alloc(BlockDriverState *bs, uint64_t guest_offset,
     nb_clusters = MIN(nb_clusters, s->l2_slice_size - l2_index);
     assert(nb_clusters <= INT_MAX);
 
+    /* Limit total allocation byte count to INT_MAX */
+    nb_clusters = MIN(nb_clusters, INT_MAX >> s->cluster_bits);
+
     /* Find L2 entry for the first involved cluster */
     ret = get_cluster_table(bs, guest_offset, &l2_slice, &l2_index);
     if (ret < 0) {
@@ -1412,7 +1415,7 @@ static int handle_alloc(BlockDriverState *bs, uint64_t guest_offset,
      * request actually writes to (excluding COW at the end)
      */
     uint64_t requested_bytes = *bytes + offset_into_cluster(s, guest_offset);
-    int avail_bytes = MIN(INT_MAX, nb_clusters << s->cluster_bits);
+    int avail_bytes = nb_clusters << s->cluster_bits;
     int nb_bytes = MIN(requested_bytes, avail_bytes);
     QCowL2Meta *old_m = *m;
 
-- 
2.21.0




* [PATCH 2/2] iotests: Test large write request to qcow2 file
  2019-10-10 10:08 [PATCH 0/2] qcow2: Limit total allocation range to INT_MAX Max Reitz
  2019-10-10 10:08 ` [PATCH 1/2] " Max Reitz
@ 2019-10-10 10:08 ` Max Reitz
  2019-10-10 16:25   ` Eric Blake
  2019-10-14 15:12 ` [PATCH 0/2] qcow2: Limit total allocation range to INT_MAX Kevin Wolf
  2 siblings, 1 reply; 7+ messages in thread
From: Max Reitz @ 2019-10-10 10:08 UTC (permalink / raw)
  To: qemu-block; +Cc: Kevin Wolf, qemu-devel, Max Reitz

Without HEAD^, the following happens when you attempt a large write
request to a qcow2 file such that the number of bytes covered by all
clusters involved in a single allocation will exceed INT_MAX:

(A) handle_alloc_space() decides to fill the whole area with zeroes and
    fails because bdrv_co_pwrite_zeroes() fails (the request is too
    large).

(B) If handle_alloc_space() does not do anything, but merge_cow()
    decides that the requests can be merged, it will create an IOV that
    is too long to be written later.

(C) Otherwise, all parts will be written separately, so those requests
    will work.

In either B or C, though, qcow2_alloc_cluster_link_l2() will have an
overflow: We use an int (i) to iterate over nb_clusters, and then
calculate the L2 entry based on "i << s->cluster_bits" -- which will
overflow if the range covers more than INT_MAX bytes.  This then leads
to image corruption because the L2 entry will be wrong (it will be
recognized as a compressed cluster).
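
The overflow can be modelled outside of QEMU roughly as follows.  This
is only a sketch: signed overflow is undefined behavior in C, so the
code emulates the common two's-complement wraparound explicitly
instead of relying on it.  The helper names and the example cluster
offset are hypothetical; bit 62 is the compressed-cluster flag of an
L2 entry in the qcow2 format.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit 62 of a qcow2 L2 entry marks a compressed cluster. */
#define L2_FLAG_COMPRESSED (1ULL << 62)

/*
 * Model of what a wrapping 32-bit "i << cluster_bits" typically
 * yields: keep the low 32 bits and sign-extend them.  Assumes i >= 0.
 */
static int64_t wrapped_shift32(int i, int cluster_bits)
{
    uint32_t low = (uint32_t)((uint64_t)i << cluster_bits);
    return (low & 0x80000000u) ? (int64_t)low - 4294967296LL : (int64_t)low;
}

/* L2 entry as computed with the (modelled) overflowing int shift. */
static uint64_t l2_entry_32bit(uint64_t cluster_offset, int i, int cluster_bits)
{
    return cluster_offset + (uint64_t)wrapped_shift32(i, cluster_bits);
}

/* L2 entry as it would be with a 64-bit shift. */
static uint64_t l2_entry_64bit(uint64_t cluster_offset, int i, int cluster_bits)
{
    return cluster_offset + ((uint64_t)i << cluster_bits);
}

static bool looks_compressed(uint64_t l2_entry)
{
    return (l2_entry & L2_FLAG_COMPRESSED) != 0;
}
```

For 2 MB clusters, i = 1024 is the first index where the shift no
longer fits: the modelled result is negative, its sign extension sets
the upper bits of the entry, and looks_compressed() returns true for
l2_entry_32bit(0x500000, 1024, 21) but false for the 64-bit variant.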

Even if that were not the case, the .cow_end area would be empty
(because handle_alloc() will cap avail_bytes and nb_bytes at INT_MAX, so
their difference (which is the .cow_end size) will be 0).
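
A small sketch of that arithmetic, under the assumption that the
.cow_end size is exactly avail_bytes - nb_bytes as described above.
The function names are hypothetical, and everything is computed in 64
bits so that the capped and uncapped results can be compared.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

#define MIN(a, b) ((a) < (b) ? (a) : (b))

/* .cow_end size with avail_bytes capped at INT_MAX (old behavior). */
static uint64_t cow_end_capped(uint64_t requested_bytes,
                               uint64_t nb_clusters, int cluster_bits)
{
    uint64_t avail_bytes = MIN((uint64_t)INT_MAX, nb_clusters << cluster_bits);
    uint64_t nb_bytes = MIN(requested_bytes, avail_bytes);
    return avail_bytes - nb_bytes;
}

/* .cow_end size the same clusters would need without the cap. */
static uint64_t cow_end_uncapped(uint64_t requested_bytes,
                                 uint64_t nb_clusters, int cluster_bits)
{
    uint64_t avail_bytes = nb_clusters << cluster_bits;
    uint64_t nb_bytes = MIN(requested_bytes, avail_bytes);
    return avail_bytes - nb_bytes;
}
```

For the request used in the test (2^31 + 256 requested bytes counted
from the start of the first cluster, 1025 clusters of 2 MB), the
capped variant returns 0, whereas the uncapped one returns 2096896,
which is the tail of the last cluster that would have to be COWed.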

So this test checks that on such large requests, the image will not be
corrupted.  Unfortunately, we cannot check whether COW will be handled
correctly, because that data is discarded when it is written to null-co
(but we have to use null-co, because writing 2 GB of data in a test is
not quite reasonable).

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 tests/qemu-iotests/268     | 83 ++++++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/268.out |  9 +++++
 tests/qemu-iotests/group   |  1 +
 3 files changed, 93 insertions(+)
 create mode 100755 tests/qemu-iotests/268
 create mode 100644 tests/qemu-iotests/268.out

diff --git a/tests/qemu-iotests/268 b/tests/qemu-iotests/268
new file mode 100755
index 0000000000..b9a12b908c
--- /dev/null
+++ b/tests/qemu-iotests/268
@@ -0,0 +1,83 @@
+#!/usr/bin/env bash
+#
+# Test large write to a qcow2 image
+#
+# Copyright (C) 2019 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
+
+seq=$(basename "$0")
+echo "QA output created by $seq"
+
+status=1	# failure is the default!
+
+_cleanup()
+{
+    _cleanup_test_img
+}
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+. ./common.rc
+. ./common.filter
+
+# This is a qcow2 regression test
+_supported_fmt qcow2
+_supported_proto file
+_supported_os Linux
+
+# We use our own external data file and our own cluster size, and we
+# require v3 images
+_unsupported_imgopts data_file cluster_size 'compat=0.10'
+
+
+# We need a backing file so that handle_alloc_space() will not do
+# anything.  (If it were to do anything, it would simply fail its
+# write-zeroes request because the request range is too large.)
+TEST_IMG="$TEST_IMG.base" _make_test_img 4G
+$QEMU_IO -c 'write 0 512' "$TEST_IMG.base" | _filter_qemu_io
+
+# (Use .orig because _cleanup_test_img will remove that file)
+# We need a large cluster size, see below for why (above the $QEMU_IO
+# invocation)
+_make_test_img -o cluster_size=2M,data_file="$TEST_IMG.orig" \
+    -b "$TEST_IMG.base" 4G
+
+# We want a null-co as the data file, because it allows us to quickly
+# "write" 2G of data without using any space.
+# (qemu-img create does not like it, though, because null-co does not
+# support image creation.)
+$QEMU_IMG amend -o data_file="json:{'driver':'null-co',,'size':'4294967296'}" \
+    "$TEST_IMG"
+
+# This gives us a range of:
+#   2^31 - 512 + 768 - 1 = 2^31 + 255 > 2^31
+# until the beginning of the end COW block.  (The total allocation
+# size depends on the cluster size, but all that is important is that
+# it exceeds INT_MAX.)
+#
+# 2^31 - 512 is the maximum request size.  We want this to result in a
+# single allocation, and because the qcow2 driver splits allocations
+# on L2 boundaries, we need large L2 tables; hence the cluster size of
+# 2 MB.  (Anything from 256 kB should work, though, because then one L2
+# table covers 8 GB.)
+$QEMU_IO -c "write 768 $((2 ** 31 - 512))" "$TEST_IMG" | _filter_qemu_io
+
+_check_test_img
+
+# success, all done
+echo "*** done"
+rm -f $seq.full
+status=0
diff --git a/tests/qemu-iotests/268.out b/tests/qemu-iotests/268.out
new file mode 100644
index 0000000000..35d4f9e3e9
--- /dev/null
+++ b/tests/qemu-iotests/268.out
@@ -0,0 +1,9 @@
+QA output created by 268
+Formatting 'TEST_DIR/t.IMGFMT.base', fmt=IMGFMT size=4294967296
+wrote 512/512 bytes at offset 0
+512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=4294967296 backing_file=TEST_DIR/t.IMGFMT.base data_file=TEST_DIR/t.IMGFMT.orig
+wrote 2147483136/2147483136 bytes at offset 768
+2 GiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+No errors were found on the image.
+*** done
diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
index 5805a79d9e..4c37b8acec 100644
--- a/tests/qemu-iotests/group
+++ b/tests/qemu-iotests/group
@@ -278,3 +278,4 @@
 265 rw auto quick
 266 rw quick
 267 rw auto quick snapshot
+268 rw backing quick
-- 
2.21.0




* Re: [PATCH 1/2] qcow2: Limit total allocation range to INT_MAX
  2019-10-10 10:08 ` [PATCH 1/2] " Max Reitz
@ 2019-10-10 15:56   ` Eric Blake
  2019-10-11 10:18   ` Philippe Mathieu-Daudé
  1 sibling, 0 replies; 7+ messages in thread
From: Eric Blake @ 2019-10-10 15:56 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: Kevin Wolf, qemu-devel

On 10/10/19 5:08 AM, Max Reitz wrote:
> When the COW areas are included, the size of an allocation can exceed
> INT_MAX.  This is kind of limited by handle_alloc() in that it already
> caps avail_bytes at INT_MAX, but the number of clusters still reflects
> the original length.
> 
> This can have all sorts of effects, ranging from the storage layer write
> call failing to image corruption.  (If there were no image corruption,
> then I suppose there would be data loss because the .cow_end area is
> forced to be empty, even though there might be something we need to
> COW.)
> 
> Fix all of it by limiting nb_clusters so the equivalent number of bytes
> will not exceed INT_MAX.
> 
> Cc: qemu-stable@nongnu.org
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
>   block/qcow2-cluster.c | 5 ++++-
>   1 file changed, 4 insertions(+), 1 deletion(-)

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



* Re: [PATCH 2/2] iotests: Test large write request to qcow2 file
  2019-10-10 10:08 ` [PATCH 2/2] iotests: Test large write request to qcow2 file Max Reitz
@ 2019-10-10 16:25   ` Eric Blake
  0 siblings, 0 replies; 7+ messages in thread
From: Eric Blake @ 2019-10-10 16:25 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: Kevin Wolf, qemu-devel

On 10/10/19 5:08 AM, Max Reitz wrote:
> Without HEAD^, the following happens when you attempt a large write
> request to a qcow2 file such that the number of bytes covered by all
> clusters involved in a single allocation will exceed INT_MAX:
> 
> (A) handle_alloc_space() decides to fill the whole area with zeroes and
>      fails because bdrv_co_pwrite_zeroes() fails (the request is too
>      large).
> 
> (B) If handle_alloc_space() does not do anything, but merge_cow()
>      decides that the requests can be merged, it will create a too long
>      IOV that later cannot be written.
> 
> (C) Otherwise, all parts will be written separately, so those requests
>      will work.
> 
> In either B or C, though, qcow2_alloc_cluster_link_l2() will have an
> overflow: We use an int (i) to iterate over nb_clusters, and then
> calculate the L2 entry based on "i << s->cluster_bits" -- which will
> overflow if the range covers more than INT_MAX bytes.  This then leads
> to image corruption because the L2 entry will be wrong (it will be
> recognized as a compressed cluster).
> 
> Even if that were not the case, the .cow_end area would be empty
> (because handle_alloc() will cap avail_bytes and nb_bytes at INT_MAX, so
> their difference (which is the .cow_end size) will be 0).
> 
> So this test checks that on such large requests, the image will not be
> corrupted.  Unfortunately, we cannot check whether COW will be handled
> correctly, because that data is discarded when it is written to null-co
> (but we have to use null-co, because writing 2 GB of data in a test is
> not quite reasonable).
> 
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
>   tests/qemu-iotests/268     | 83 ++++++++++++++++++++++++++++++++++++++
>   tests/qemu-iotests/268.out |  9 +++++
>   tests/qemu-iotests/group   |  1 +
>   3 files changed, 93 insertions(+)
>   create mode 100755 tests/qemu-iotests/268
>   create mode 100644 tests/qemu-iotests/268.out
> 
> diff --git a/tests/qemu-iotests/268 b/tests/qemu-iotests/268
> new file mode 100755
> index 0000000000..b9a12b908c
> --- /dev/null
> +++ b/tests/qemu-iotests/268

> +# We want a null-co as the data file, because it allows us to quickly
> +# "write" 2G of data without using any space.
> +# (qemu-img create does not like it, though, because null-co does not
> +# support image creation.)
> +$QEMU_IMG amend -o data_file="json:{'driver':'null-co',,'size':'4294967296'}" \
> +    "$TEST_IMG"
> +

A bit awkward, but works.

> +# This gives us a range of:
> +#   2^31 - 512 + 768 - 1 = 2^31 + 255 > 2^31
> +# until the beginning of the end COW block.  (The total allocation
> +# size depends on the cluster size, but all that is important is that
> +# it exceeds INT_MAX.)
> +#
> +# 2^31 - 512 is the maximum request size.  We want this to result in a
> +# single allocation, and because the qcow2 driver splits allocations
> +# on L2 boundaries, we need large L2 tables; hence the cluster size of
> +# 2 MB.  (Anything from 256 kB should work, though, because then one L2
> +# table covers 8 GB.)
> +$QEMU_IO -c "write 768 $((2 ** 31 - 512))" "$TEST_IMG" | _filter_qemu_io

Yep, that causes the rounding issue that requires being able to handle
more than 2G gracefully.
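
For anyone who wants to check the numbers, the rounding can be
reproduced with a standalone sketch like the one below.  The helper
names are invented for this example, and the L2 coverage calculation
assumes the standard 8-byte L2 entries.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Number of clusters touched by the byte range [offset, offset + len). */
static uint64_t clusters_touched(uint64_t offset, uint64_t len,
                                 int cluster_bits)
{
    assert(len > 0);
    uint64_t first = offset >> cluster_bits;
    uint64_t last = (offset + len - 1) >> cluster_bits;
    return last - first + 1;
}

/* True if a byte count no longer fits into an int. */
static bool exceeds_int_max(uint64_t bytes)
{
    return bytes > (uint64_t)INT_MAX;
}

/*
 * Guest bytes covered by one L2 table: one cluster holds
 * cluster_size / 8 entries, each mapping one cluster.
 */
static uint64_t l2_coverage_bytes(int cluster_bits)
{
    uint64_t cluster_size = 1ULL << cluster_bits;
    return (cluster_size / 8) * cluster_size;
}
```

The write at offset 768 with length 2^31 - 512 ends at byte
2^31 + 255, so with 2 MB clusters it touches 1025 clusters, or
2149580800 bytes in total, which is more than INT_MAX.  The coverage
helper gives 8 GB for 256 kB clusters, matching the comment in the
test, and only 512 MB for 64 kB clusters, which is too small to keep
the request in a single allocation.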

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



* Re: [PATCH 1/2] qcow2: Limit total allocation range to INT_MAX
  2019-10-10 10:08 ` [PATCH 1/2] " Max Reitz
  2019-10-10 15:56   ` Eric Blake
@ 2019-10-11 10:18   ` Philippe Mathieu-Daudé
  1 sibling, 0 replies; 7+ messages in thread
From: Philippe Mathieu-Daudé @ 2019-10-11 10:18 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: Kevin Wolf, qemu-devel

On 10/10/19 12:08 PM, Max Reitz wrote:
> When the COW areas are included, the size of an allocation can exceed
> INT_MAX.  This is kind of limited by handle_alloc() in that it already
> caps avail_bytes at INT_MAX, but the number of clusters still reflects
> the original length.
> 
> This can have all sorts of effects, ranging from the storage layer write
> call failing to image corruption.  (If there were no image corruption,
> then I suppose there would be data loss because the .cow_end area is
> forced to be empty, even though there might be something we need to
> COW.)
> 
> Fix all of it by limiting nb_clusters so the equivalent number of bytes
> will not exceed INT_MAX.
> 
> Cc: qemu-stable@nongnu.org
> Signed-off-by: Max Reitz <mreitz@redhat.com>

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>

> ---
>   block/qcow2-cluster.c | 5 ++++-
>   1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
> index 8d5fa1539c..8982b7b762 100644
> --- a/block/qcow2-cluster.c
> +++ b/block/qcow2-cluster.c
> @@ -1330,6 +1330,9 @@ static int handle_alloc(BlockDriverState *bs, uint64_t guest_offset,
>       nb_clusters = MIN(nb_clusters, s->l2_slice_size - l2_index);
>       assert(nb_clusters <= INT_MAX);
>   
> +    /* Limit total allocation byte count to INT_MAX */
> +    nb_clusters = MIN(nb_clusters, INT_MAX >> s->cluster_bits);
> +
>       /* Find L2 entry for the first involved cluster */
>       ret = get_cluster_table(bs, guest_offset, &l2_slice, &l2_index);
>       if (ret < 0) {
> @@ -1412,7 +1415,7 @@ static int handle_alloc(BlockDriverState *bs, uint64_t guest_offset,
>        * request actually writes to (excluding COW at the end)
>        */
>       uint64_t requested_bytes = *bytes + offset_into_cluster(s, guest_offset);
> -    int avail_bytes = MIN(INT_MAX, nb_clusters << s->cluster_bits);
> +    int avail_bytes = nb_clusters << s->cluster_bits;
>       int nb_bytes = MIN(requested_bytes, avail_bytes);
>       QCowL2Meta *old_m = *m;
>   
> 



* Re: [PATCH 0/2] qcow2: Limit total allocation range to INT_MAX
  2019-10-10 10:08 [PATCH 0/2] qcow2: Limit total allocation range to INT_MAX Max Reitz
  2019-10-10 10:08 ` [PATCH 1/2] " Max Reitz
  2019-10-10 10:08 ` [PATCH 2/2] iotests: Test large write request to qcow2 file Max Reitz
@ 2019-10-14 15:12 ` Kevin Wolf
  2 siblings, 0 replies; 7+ messages in thread
From: Kevin Wolf @ 2019-10-14 15:12 UTC (permalink / raw)
  To: Max Reitz; +Cc: qemu-devel, qemu-block

On 10.10.2019 at 12:08, Max Reitz wrote:
> Hi,
> 
> While looking for why handle_alloc_space() seems to cause issues on
> ppc64le+XFS (performance degradation and data corruption), I spotted
> this other issue.  It isn’t as bad, but still needs fixing.
> 
> See patch 1 for what is fixed and patch 2 for what breaks otherwise.

Thanks, applied to the block branch.

Kevin



