* [PATCH 0/2] qcow2: Limit total allocation range to INT_MAX
From: Max Reitz @ 2019-10-10 10:08 UTC
To: qemu-block; +Cc: Kevin Wolf, qemu-devel, Max Reitz

Hi,

While looking for why handle_alloc_space() seems to cause issues on
ppc64le+XFS (performance degradation and data corruption), I spotted
this other issue. It isn’t as bad, but still needs fixing.

See patch 1 for what is fixed and patch 2 for what breaks otherwise.

Max Reitz (2):
  qcow2: Limit total allocation range to INT_MAX
  iotests: Test large write request to qcow2 file

 block/qcow2-cluster.c      |  5 ++-
 tests/qemu-iotests/268     | 83 ++++++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/268.out |  9 +++++
 tests/qemu-iotests/group   |  1 +
 4 files changed, 97 insertions(+), 1 deletion(-)
 create mode 100755 tests/qemu-iotests/268
 create mode 100644 tests/qemu-iotests/268.out

-- 
2.21.0

* [PATCH 1/2] qcow2: Limit total allocation range to INT_MAX
From: Max Reitz @ 2019-10-10 10:08 UTC
To: qemu-block; +Cc: Kevin Wolf, qemu-devel, Max Reitz

When the COW areas are included, the size of an allocation can exceed
INT_MAX. This is kind of limited by handle_alloc() in that it already
caps avail_bytes at INT_MAX, but the number of clusters still reflects
the original length.

This can have all sorts of effects, ranging from the storage layer write
call failing to image corruption. (If there were no image corruption,
then I suppose there would be data loss because the .cow_end area is
forced to be empty, even though there might be something we need to
COW.)

Fix all of it by limiting nb_clusters so the equivalent number of bytes
will not exceed INT_MAX.

Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 block/qcow2-cluster.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 8d5fa1539c..8982b7b762 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -1330,6 +1330,9 @@ static int handle_alloc(BlockDriverState *bs, uint64_t guest_offset,
     nb_clusters = MIN(nb_clusters, s->l2_slice_size - l2_index);
     assert(nb_clusters <= INT_MAX);
 
+    /* Limit total allocation byte count to INT_MAX */
+    nb_clusters = MIN(nb_clusters, INT_MAX >> s->cluster_bits);
+
     /* Find L2 entry for the first involved cluster */
     ret = get_cluster_table(bs, guest_offset, &l2_slice, &l2_index);
     if (ret < 0) {
@@ -1412,7 +1415,7 @@ static int handle_alloc(BlockDriverState *bs, uint64_t guest_offset,
      * request actually writes to (excluding COW at the end)
      */
     uint64_t requested_bytes = *bytes + offset_into_cluster(s, guest_offset);
-    int avail_bytes = MIN(INT_MAX, nb_clusters << s->cluster_bits);
+    int avail_bytes = nb_clusters << s->cluster_bits;
     int nb_bytes = MIN(requested_bytes, avail_bytes);
     QCowL2Meta *old_m = *m;
 
-- 
2.21.0

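The effect of the added clamp can be seen in isolation with a small standalone sketch. This is not QEMU code: clamp_nb_clusters() is a hypothetical helper name that only mirrors the MIN(nb_clusters, INT_MAX >> s->cluster_bits) expression from the hunk above, and the 64-bit types are an assumption made to keep the sketch free of overflow.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

#define MIN(a, b) ((a) < (b) ? (a) : (b))

/*
 * Hypothetical helper (not part of QEMU): limit a cluster count so that
 * the equivalent byte count, nb_clusters << cluster_bits, fits in an int.
 */
static uint64_t clamp_nb_clusters(uint64_t nb_clusters, int cluster_bits)
{
    /* Keep the shift well-defined for a 32-bit int. */
    assert(cluster_bits >= 0 && cluster_bits < 31);

    /* Largest cluster count whose byte size does not exceed INT_MAX. */
    uint64_t max_clusters = (uint64_t)(INT_MAX >> cluster_bits);

    return MIN(nb_clusters, max_clusters);
}
```

With cluster_bits = 21 (the 2 MB clusters used by the test in patch 2), INT_MAX >> 21 is 1023, so a request spanning 1025 clusters is cut down to 1023 clusters, or 2145386496 bytes, which fits in an int. Smaller requests pass through unchanged.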
* Re: [PATCH 1/2] qcow2: Limit total allocation range to INT_MAX
From: Eric Blake @ 2019-10-10 15:56 UTC
To: Max Reitz, qemu-block; +Cc: Kevin Wolf, qemu-devel

On 10/10/19 5:08 AM, Max Reitz wrote:
> When the COW areas are included, the size of an allocation can exceed
> INT_MAX. This is kind of limited by handle_alloc() in that it already
> caps avail_bytes at INT_MAX, but the number of clusters still reflects
> the original length.
>
> This can have all sorts of effects, ranging from the storage layer write
> call failing to image corruption. (If there were no image corruption,
> then I suppose there would be data loss because the .cow_end area is
> forced to be empty, even though there might be something we need to
> COW.)
>
> Fix all of it by limiting nb_clusters so the equivalent number of bytes
> will not exceed INT_MAX.
>
> Cc: qemu-stable@nongnu.org
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
>  block/qcow2-cluster.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

* Re: [PATCH 1/2] qcow2: Limit total allocation range to INT_MAX
From: Philippe Mathieu-Daudé @ 2019-10-11 10:18 UTC
To: Max Reitz, qemu-block; +Cc: Kevin Wolf, qemu-devel

On 10/10/19 12:08 PM, Max Reitz wrote:
> When the COW areas are included, the size of an allocation can exceed
> INT_MAX. This is kind of limited by handle_alloc() in that it already
> caps avail_bytes at INT_MAX, but the number of clusters still reflects
> the original length.
>
> This can have all sorts of effects, ranging from the storage layer write
> call failing to image corruption. (If there were no image corruption,
> then I suppose there would be data loss because the .cow_end area is
> forced to be empty, even though there might be something we need to
> COW.)
>
> Fix all of it by limiting nb_clusters so the equivalent number of bytes
> will not exceed INT_MAX.
>
> Cc: qemu-stable@nongnu.org
> Signed-off-by: Max Reitz <mreitz@redhat.com>

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>

> ---
>  block/qcow2-cluster.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
> index 8d5fa1539c..8982b7b762 100644
> --- a/block/qcow2-cluster.c
> +++ b/block/qcow2-cluster.c
> @@ -1330,6 +1330,9 @@ static int handle_alloc(BlockDriverState *bs, uint64_t guest_offset,
>      nb_clusters = MIN(nb_clusters, s->l2_slice_size - l2_index);
>      assert(nb_clusters <= INT_MAX);
>
> +    /* Limit total allocation byte count to INT_MAX */
> +    nb_clusters = MIN(nb_clusters, INT_MAX >> s->cluster_bits);
> +
>      /* Find L2 entry for the first involved cluster */
>      ret = get_cluster_table(bs, guest_offset, &l2_slice, &l2_index);
>      if (ret < 0) {
> @@ -1412,7 +1415,7 @@ static int handle_alloc(BlockDriverState *bs, uint64_t guest_offset,
>       * request actually writes to (excluding COW at the end)
>       */
>      uint64_t requested_bytes = *bytes + offset_into_cluster(s, guest_offset);
> -    int avail_bytes = MIN(INT_MAX, nb_clusters << s->cluster_bits);
> +    int avail_bytes = nb_clusters << s->cluster_bits;
>      int nb_bytes = MIN(requested_bytes, avail_bytes);
>      QCowL2Meta *old_m = *m;
>
>

* [PATCH 2/2] iotests: Test large write request to qcow2 file
From: Max Reitz @ 2019-10-10 10:08 UTC
To: qemu-block; +Cc: Kevin Wolf, qemu-devel, Max Reitz

Without HEAD^, the following happens when you attempt a large write
request to a qcow2 file such that the number of bytes covered by all
clusters involved in a single allocation will exceed INT_MAX:

(A) handle_alloc_space() decides to fill the whole area with zeroes and
    fails because bdrv_co_pwrite_zeroes() fails (the request is too
    large).

(B) If handle_alloc_space() does not do anything, but merge_cow()
    decides that the requests can be merged, it will create a too long
    IOV that later cannot be written.

(C) Otherwise, all parts will be written separately, so those requests
    will work.

In either B or C, though, qcow2_alloc_cluster_link_l2() will have an
overflow: We use an int (i) to iterate over nb_clusters, and then
calculate the L2 entry based on "i << s->cluster_bits" -- which will
overflow if the range covers more than INT_MAX bytes. This then leads
to image corruption because the L2 entry will be wrong (it will be
recognized as a compressed cluster).

Even if that were not the case, the .cow_end area would be empty
(because handle_alloc() will cap avail_bytes and nb_bytes at INT_MAX, so
their difference (which is the .cow_end size) will be 0).

So this test checks that on such large requests, the image will not be
corrupted. Unfortunately, we cannot check whether COW will be handled
correctly, because that data is discarded when it is written to null-co
(but we have to use null-co, because writing 2 GB of data in a test is
not quite reasonable).

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 tests/qemu-iotests/268     | 83 ++++++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/268.out |  9 +++++
 tests/qemu-iotests/group   |  1 +
 3 files changed, 93 insertions(+)
 create mode 100755 tests/qemu-iotests/268
 create mode 100644 tests/qemu-iotests/268.out

diff --git a/tests/qemu-iotests/268 b/tests/qemu-iotests/268
new file mode 100755
index 0000000000..b9a12b908c
--- /dev/null
+++ b/tests/qemu-iotests/268
@@ -0,0 +1,83 @@
+#!/usr/bin/env bash
+#
+# Test large write to a qcow2 image
+#
+# Copyright (C) 2019 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program. If not, see <http://www.gnu.org/licenses/>.
+#
+
+seq=$(basename "$0")
+echo "QA output created by $seq"
+
+status=1 # failure is the default!
+
+_cleanup()
+{
+    _cleanup_test_img
+}
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+. ./common.rc
+. ./common.filter
+
+# This is a qcow2 regression test
+_supported_fmt qcow2
+_supported_proto file
+_supported_os Linux
+
+# We use our own external data file and our own cluster size, and we
+# require v3 images
+_unsupported_imgopts data_file cluster_size 'compat=0.10'
+
+
+# We need a backing file so that handle_alloc_space() will not do
+# anything. (If it were to do anything, it would simply fail its
+# write-zeroes request because the request range is too large.)
+TEST_IMG="$TEST_IMG.base" _make_test_img 4G
+$QEMU_IO -c 'write 0 512' "$TEST_IMG.base" | _filter_qemu_io
+
+# (Use .orig because _cleanup_test_img will remove that file)
+# We need a large cluster size, see below for why (above the $QEMU_IO
+# invocation)
+_make_test_img -o cluster_size=2M,data_file="$TEST_IMG.orig" \
+    -b "$TEST_IMG.base" 4G
+
+# We want a null-co as the data file, because it allows us to quickly
+# "write" 2G of data without using any space.
+# (qemu-img create does not like it, though, because null-co does not
+# support image creation.)
+$QEMU_IMG amend -o data_file="json:{'driver':'null-co',,'size':'4294967296'}" \
+    "$TEST_IMG"
+
+# This gives us a range of:
+#   2^31 - 512 + 768 - 1 = 2^31 + 255 > 2^31
+# until the beginning of the end COW block. (The total allocation
+# size depends on the cluster size, but all that is important is that
+# it exceeds INT_MAX.)
+#
+# 2^31 - 512 is the maximum request size. We want this to result in a
+# single allocation, and because the qcow2 driver splits allocations
+# on L2 boundaries, we need large L2 tables; hence the cluster size of
+# 2 MB. (Anything from 256 kB should work, though, because then one L2
+# table covers 8 GB.)
+$QEMU_IO -c "write 768 $((2 ** 31 - 512))" "$TEST_IMG" | _filter_qemu_io
+
+_check_test_img
+
+# success, all done
+echo "*** done"
+rm -f $seq.full
+status=0
diff --git a/tests/qemu-iotests/268.out b/tests/qemu-iotests/268.out
new file mode 100644
index 0000000000..35d4f9e3e9
--- /dev/null
+++ b/tests/qemu-iotests/268.out
@@ -0,0 +1,9 @@
+QA output created by 268
+Formatting 'TEST_DIR/t.IMGFMT.base', fmt=IMGFMT size=4294967296
+wrote 512/512 bytes at offset 0
+512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=4294967296 backing_file=TEST_DIR/t.IMGFMT.base data_file=TEST_DIR/t.IMGFMT.orig
+wrote 2147483136/2147483136 bytes at offset 768
+2 GiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+No errors were found on the image.
+*** done
diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
index 5805a79d9e..4c37b8acec 100644
--- a/tests/qemu-iotests/group
+++ b/tests/qemu-iotests/group
@@ -278,3 +278,4 @@
 265 rw auto quick
 266 rw quick
 267 rw auto quick snapshot
+268 rw backing quick
-- 
2.21.0

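The arithmetic behind the test's write request can be checked with a short standalone sketch. It is not QEMU code: clusters_touched(), cluster_byte_offset() and fits_in_int() are hypothetical helper names, and all values are computed in 64 bits so the sketch itself cannot overflow. It only illustrates why a write of 2^31 - 512 bytes at offset 768 with 2 MB clusters (cluster_bits = 21) covers more than INT_MAX bytes, and why a byte offset derived from "i << s->cluster_bits" stops fitting in an int.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of clusters covered by the byte range
 * [offset, offset + bytes).
 */
static uint64_t clusters_touched(uint64_t offset, uint64_t bytes,
                                 int cluster_bits)
{
    assert(bytes > 0);

    uint64_t first = offset >> cluster_bits;
    uint64_t last = (offset + bytes - 1) >> cluster_bits;

    return last - first + 1;
}

/*
 * Hypothetical helper: byte offset of the i-th cluster of an allocation,
 * computed in 64 bits instead of in an int.
 */
static uint64_t cluster_byte_offset(uint64_t i, int cluster_bits)
{
    return i << cluster_bits;
}

/* Hypothetical helper: true if the value can be stored in an int. */
static bool fits_in_int(uint64_t value)
{
    return value <= (uint64_t)INT_MAX;
}
```

For the request in the test, clusters_touched(768, 2^31 - 512, 21) is 1025, since the last byte written is at 2^31 + 255. The byte offset of cluster 1023 (2145386496) still fits in an int, but that of cluster 1024 is exactly 2^31 and does not, which is the case patch 1 avoids by limiting nb_clusters.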
* Re: [PATCH 2/2] iotests: Test large write request to qcow2 file
From: Eric Blake @ 2019-10-10 16:25 UTC
To: Max Reitz, qemu-block; +Cc: Kevin Wolf, qemu-devel

On 10/10/19 5:08 AM, Max Reitz wrote:
> Without HEAD^, the following happens when you attempt a large write
> request to a qcow2 file such that the number of bytes covered by all
> clusters involved in a single allocation will exceed INT_MAX:
>
> (A) handle_alloc_space() decides to fill the whole area with zeroes and
>     fails because bdrv_co_pwrite_zeroes() fails (the request is too
>     large).
>
> (B) If handle_alloc_space() does not do anything, but merge_cow()
>     decides that the requests can be merged, it will create a too long
>     IOV that later cannot be written.
>
> (C) Otherwise, all parts will be written separately, so those requests
>     will work.
>
> In either B or C, though, qcow2_alloc_cluster_link_l2() will have an
> overflow: We use an int (i) to iterate over nb_clusters, and then
> calculate the L2 entry based on "i << s->cluster_bits" -- which will
> overflow if the range covers more than INT_MAX bytes. This then leads
> to image corruption because the L2 entry will be wrong (it will be
> recognized as a compressed cluster).
>
> Even if that were not the case, the .cow_end area would be empty
> (because handle_alloc() will cap avail_bytes and nb_bytes at INT_MAX, so
> their difference (which is the .cow_end size) will be 0).
>
> So this test checks that on such large requests, the image will not be
> corrupted. Unfortunately, we cannot check whether COW will be handled
> correctly, because that data is discarded when it is written to null-co
> (but we have to use null-co, because writing 2 GB of data in a test is
> not quite reasonable).
>
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
>  tests/qemu-iotests/268     | 83 ++++++++++++++++++++++++++++++++++++++
>  tests/qemu-iotests/268.out |  9 +++++
>  tests/qemu-iotests/group   |  1 +
>  3 files changed, 93 insertions(+)
>  create mode 100755 tests/qemu-iotests/268
>  create mode 100644 tests/qemu-iotests/268.out
>
> diff --git a/tests/qemu-iotests/268 b/tests/qemu-iotests/268
> new file mode 100755
> index 0000000000..b9a12b908c
> --- /dev/null
> +++ b/tests/qemu-iotests/268

> +# We want a null-co as the data file, because it allows us to quickly
> +# "write" 2G of data without using any space.
> +# (qemu-img create does not like it, though, because null-co does not
> +# support image creation.)
> +$QEMU_IMG amend -o data_file="json:{'driver':'null-co',,'size':'4294967296'}" \
> +    "$TEST_IMG"
> +

A bit awkward, but works.

> +# This gives us a range of:
> +#   2^31 - 512 + 768 - 1 = 2^31 + 255 > 2^31
> +# until the beginning of the end COW block. (The total allocation
> +# size depends on the cluster size, but all that is important is that
> +# it exceeds INT_MAX.)
> +#
> +# 2^31 - 512 is the maximum request size. We want this to result in a
> +# single allocation, and because the qcow2 driver splits allocations
> +# on L2 boundaries, we need large L2 tables; hence the cluster size of
> +# 2 MB. (Anything from 256 kB should work, though, because then one L2
> +# table covers 8 GB.)
> +$QEMU_IO -c "write 768 $((2 ** 31 - 512))" "$TEST_IMG" | _filter_qemu_io

Yep, that causes the rounding issue that requires being able to handle
> 2G gracefully.

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

* Re: [PATCH 0/2] qcow2: Limit total allocation range to INT_MAX
From: Kevin Wolf @ 2019-10-14 15:12 UTC
To: Max Reitz; +Cc: qemu-devel, qemu-block

On 10.10.2019 at 12:08, Max Reitz wrote:
> Hi,
>
> While looking for why handle_alloc_space() seems to cause issues on
> ppc64le+XFS (performance degradation and data corruption), I spotted
> this other issue. It isn’t as bad, but still needs fixing.
>
> See patch 1 for what is fixed and patch 2 for what breaks otherwise.

Thanks, applied to the block branch.

Kevin

Thread overview: 7+ messages
2019-10-10 10:08 [PATCH 0/2] qcow2: Limit total allocation range to INT_MAX Max Reitz
2019-10-10 10:08 ` [PATCH 1/2] " Max Reitz
2019-10-10 15:56   ` Eric Blake
2019-10-11 10:18   ` Philippe Mathieu-Daudé
2019-10-10 10:08 ` [PATCH 2/2] iotests: Test large write request to qcow2 file Max Reitz
2019-10-10 16:25   ` Eric Blake
2019-10-14 15:12 ` [PATCH 0/2] qcow2: Limit total allocation range to INT_MAX Kevin Wolf