[Qemu-devel] [PATCH for-4.0? 0/2] Fix overflow bug in qcow2 discard

All of lore.kernel.org
 help / color / mirror / Atom feed

* [Qemu-devel] [PATCH for-4.0? 0/2] Fix overflow bug in qcow2 discard
@ 2019-04-17 10:09 ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 14+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-04-17 10:09 UTC (permalink / raw)
  To: qemu-devel, qemu-block; +Cc: kwolf, mreitz, fam, stefanha, vsementsov, den

Hi all. We faced an interesting bug, which may be simply reproduced:

prepare image:
./qemu-img create -f qcow2 -o cluster_size=1M /ssd/test 2300M
./qemu-io -c 'write 100M 2000M' -c 'write 2100M 200M' -c 'write 0 100M' /ssd/test

shrink:
./qemu-img resize --shrink /ssd/test 50M

bug:
./qemu-img info /ssd/test
image: /ssd/test
file format: qcow2
virtual size: 50M (52428800 bytes)
disk size: 2.2G
cluster_size: 1048576
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

Virtual size is shrunk, but file - not. It is due to the fact,
that merged qcow2 discard may exceed 2G, and then converting from
uint64_t to int in qcow2_process_discards when we call bdrv_pdiscard
make wrong thing.

So, here are proposal of fix and new iotest for it.

Vladimir Sementsov-Ogievskiy (2):
  block/io: bdrv_pdiscard: support int64_t bytes parameter
  iotests: test big qcow2 shrink

 include/block/block.h      |  4 +--
 block/io.c                 | 19 ++++++-----
 tests/qemu-iotests/249     | 69 ++++++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/249.out | 30 +++++++++++++++++
 tests/qemu-iotests/group   |  1 +
 5 files changed, 112 insertions(+), 11 deletions(-)
 create mode 100755 tests/qemu-iotests/249
 create mode 100644 tests/qemu-iotests/249.out

-- 
2.18.0

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Qemu-devel] [PATCH for-4.0? 0/2] Fix overflow bug in qcow2 discard
@ 2019-04-17 10:09 ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 14+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-04-17 10:09 UTC (permalink / raw)
  To: qemu-devel, qemu-block; +Cc: kwolf, fam, vsementsov, den, mreitz, stefanha

Hi all. We faced an interesting bug, which may be simply reproduced:

prepare image:
./qemu-img create -f qcow2 -o cluster_size=1M /ssd/test 2300M
./qemu-io -c 'write 100M 2000M' -c 'write 2100M 200M' -c 'write 0 100M' /ssd/test

shrink:
./qemu-img resize --shrink /ssd/test 50M

bug:
./qemu-img info /ssd/test
image: /ssd/test
file format: qcow2
virtual size: 50M (52428800 bytes)
disk size: 2.2G
cluster_size: 1048576
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

Virtual size is shrunk, but file - not. It is due to the fact,
that merged qcow2 discard may exceed 2G, and then converting from
uint64_t to int in qcow2_process_discards when we call bdrv_pdiscard
make wrong thing.

So, here are proposal of fix and new iotest for it.

Vladimir Sementsov-Ogievskiy (2):
  block/io: bdrv_pdiscard: support int64_t bytes parameter
  iotests: test big qcow2 shrink

 include/block/block.h      |  4 +--
 block/io.c                 | 19 ++++++-----
 tests/qemu-iotests/249     | 69 ++++++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/249.out | 30 +++++++++++++++++
 tests/qemu-iotests/group   |  1 +
 5 files changed, 112 insertions(+), 11 deletions(-)
 create mode 100755 tests/qemu-iotests/249
 create mode 100644 tests/qemu-iotests/249.out

-- 
2.18.0



^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Qemu-devel] [PATCH 1/2] block/io: bdrv_pdiscard: support int64_t bytes parameter
@ 2019-04-17 10:09   ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 14+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-04-17 10:09 UTC (permalink / raw)
  To: qemu-devel, qemu-block; +Cc: kwolf, mreitz, fam, stefanha, vsementsov, den

This fixes at least one overflow in qcow2_process_discards.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 include/block/block.h |  4 ++--
 block/io.c            | 19 ++++++++++---------
 2 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/include/block/block.h b/include/block/block.h
index c7a26199aa..69fa18867e 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -432,8 +432,8 @@ void bdrv_drain_all(void);
     AIO_WAIT_WHILE(bdrv_get_aio_context(bs_),              \
                    cond); })
 
-int bdrv_pdiscard(BdrvChild *child, int64_t offset, int bytes);
-int bdrv_co_pdiscard(BdrvChild *child, int64_t offset, int bytes);
+int bdrv_pdiscard(BdrvChild *child, int64_t offset, int64_t bytes);
+int bdrv_co_pdiscard(BdrvChild *child, int64_t offset, int64_t bytes);
 int bdrv_has_zero_init_1(BlockDriverState *bs);
 int bdrv_has_zero_init(BlockDriverState *bs);
 bool bdrv_unallocated_blocks_are_zero(BlockDriverState *bs);
diff --git a/block/io.c b/block/io.c
index dfc153b8d8..c6429380e8 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2653,7 +2653,7 @@ int bdrv_flush(BlockDriverState *bs)
 typedef struct DiscardCo {
     BdrvChild *child;
     int64_t offset;
-    int bytes;
+    int64_t bytes;
     int ret;
 } DiscardCo;
 static void coroutine_fn bdrv_pdiscard_co_entry(void *opaque)
@@ -2664,14 +2664,15 @@ static void coroutine_fn bdrv_pdiscard_co_entry(void *opaque)
     aio_wait_kick();
 }
 
-int coroutine_fn bdrv_co_pdiscard(BdrvChild *child, int64_t offset, int bytes)
+int coroutine_fn bdrv_co_pdiscard(BdrvChild *child, int64_t offset,
+                                  int64_t bytes)
 {
     BdrvTrackedRequest req;
     int max_pdiscard, ret;
     int head, tail, align;
     BlockDriverState *bs = child->bs;
 
-    if (!bs || !bs->drv) {
+    if (!bs || !bs->drv || !bdrv_is_inserted(bs)) {
         return -ENOMEDIUM;
     }
 
@@ -2679,9 +2680,8 @@ int coroutine_fn bdrv_co_pdiscard(BdrvChild *child, int64_t offset, int bytes)
         return -EPERM;
     }
 
-    ret = bdrv_check_byte_request(bs, offset, bytes);
-    if (ret < 0) {
-        return ret;
+    if (offset < 0) {
+        return -EIO;
     }
 
     /* Do nothing if disabled.  */
@@ -2711,12 +2711,13 @@ int coroutine_fn bdrv_co_pdiscard(BdrvChild *child, int64_t offset, int bytes)
         goto out;
     }
 
-    max_pdiscard = QEMU_ALIGN_DOWN(MIN_NON_ZERO(bs->bl.max_pdiscard, INT_MAX),
+    max_pdiscard = QEMU_ALIGN_DOWN(MIN_NON_ZERO(bs->bl.max_pdiscard,
+                                                BDRV_REQUEST_MAX_BYTES),
                                    align);
     assert(max_pdiscard >= bs->bl.request_alignment);
 
     while (bytes > 0) {
-        int num = bytes;
+        int64_t num = bytes;
 
         if (head) {
             /* Make small requests to get to alignment boundaries. */
@@ -2778,7 +2779,7 @@ out:
     return ret;
 }
 
-int bdrv_pdiscard(BdrvChild *child, int64_t offset, int bytes)
+int bdrv_pdiscard(BdrvChild *child, int64_t offset, int64_t bytes)
 {
     Coroutine *co;
     DiscardCo rwco = {
-- 
2.18.0

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [Qemu-devel] [PATCH 1/2] block/io: bdrv_pdiscard: support int64_t bytes parameter
@ 2019-04-17 10:09   ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 14+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-04-17 10:09 UTC (permalink / raw)
  To: qemu-devel, qemu-block; +Cc: kwolf, fam, vsementsov, den, mreitz, stefanha

This fixes at least one overflow in qcow2_process_discards.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 include/block/block.h |  4 ++--
 block/io.c            | 19 ++++++++++---------
 2 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/include/block/block.h b/include/block/block.h
index c7a26199aa..69fa18867e 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -432,8 +432,8 @@ void bdrv_drain_all(void);
     AIO_WAIT_WHILE(bdrv_get_aio_context(bs_),              \
                    cond); })
 
-int bdrv_pdiscard(BdrvChild *child, int64_t offset, int bytes);
-int bdrv_co_pdiscard(BdrvChild *child, int64_t offset, int bytes);
+int bdrv_pdiscard(BdrvChild *child, int64_t offset, int64_t bytes);
+int bdrv_co_pdiscard(BdrvChild *child, int64_t offset, int64_t bytes);
 int bdrv_has_zero_init_1(BlockDriverState *bs);
 int bdrv_has_zero_init(BlockDriverState *bs);
 bool bdrv_unallocated_blocks_are_zero(BlockDriverState *bs);
diff --git a/block/io.c b/block/io.c
index dfc153b8d8..c6429380e8 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2653,7 +2653,7 @@ int bdrv_flush(BlockDriverState *bs)
 typedef struct DiscardCo {
     BdrvChild *child;
     int64_t offset;
-    int bytes;
+    int64_t bytes;
     int ret;
 } DiscardCo;
 static void coroutine_fn bdrv_pdiscard_co_entry(void *opaque)
@@ -2664,14 +2664,15 @@ static void coroutine_fn bdrv_pdiscard_co_entry(void *opaque)
     aio_wait_kick();
 }
 
-int coroutine_fn bdrv_co_pdiscard(BdrvChild *child, int64_t offset, int bytes)
+int coroutine_fn bdrv_co_pdiscard(BdrvChild *child, int64_t offset,
+                                  int64_t bytes)
 {
     BdrvTrackedRequest req;
     int max_pdiscard, ret;
     int head, tail, align;
     BlockDriverState *bs = child->bs;
 
-    if (!bs || !bs->drv) {
+    if (!bs || !bs->drv || !bdrv_is_inserted(bs)) {
         return -ENOMEDIUM;
     }
 
@@ -2679,9 +2680,8 @@ int coroutine_fn bdrv_co_pdiscard(BdrvChild *child, int64_t offset, int bytes)
         return -EPERM;
     }
 
-    ret = bdrv_check_byte_request(bs, offset, bytes);
-    if (ret < 0) {
-        return ret;
+    if (offset < 0) {
+        return -EIO;
     }
 
     /* Do nothing if disabled.  */
@@ -2711,12 +2711,13 @@ int coroutine_fn bdrv_co_pdiscard(BdrvChild *child, int64_t offset, int bytes)
         goto out;
     }
 
-    max_pdiscard = QEMU_ALIGN_DOWN(MIN_NON_ZERO(bs->bl.max_pdiscard, INT_MAX),
+    max_pdiscard = QEMU_ALIGN_DOWN(MIN_NON_ZERO(bs->bl.max_pdiscard,
+                                                BDRV_REQUEST_MAX_BYTES),
                                    align);
     assert(max_pdiscard >= bs->bl.request_alignment);
 
     while (bytes > 0) {
-        int num = bytes;
+        int64_t num = bytes;
 
         if (head) {
             /* Make small requests to get to alignment boundaries. */
@@ -2778,7 +2779,7 @@ out:
     return ret;
 }
 
-int bdrv_pdiscard(BdrvChild *child, int64_t offset, int bytes)
+int bdrv_pdiscard(BdrvChild *child, int64_t offset, int64_t bytes)
 {
     Coroutine *co;
     DiscardCo rwco = {
-- 
2.18.0



^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [Qemu-devel] [PATCH 2/2] iotests: test big qcow2 shrink
@ 2019-04-17 10:09   ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 14+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-04-17 10:09 UTC (permalink / raw)
  To: qemu-devel, qemu-block; +Cc: kwolf, mreitz, fam, stefanha, vsementsov, den

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 tests/qemu-iotests/249     | 69 ++++++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/249.out | 30 +++++++++++++++++
 tests/qemu-iotests/group   |  1 +
 3 files changed, 100 insertions(+)
 create mode 100755 tests/qemu-iotests/249
 create mode 100644 tests/qemu-iotests/249.out

diff --git a/tests/qemu-iotests/249 b/tests/qemu-iotests/249
new file mode 100755
index 0000000000..f0140461ad
--- /dev/null
+++ b/tests/qemu-iotests/249
@@ -0,0 +1,69 @@
+#!/usr/bin/env bash
+#
+# Test big discard in qcow2 shrink
+#
+# Copyright (c) 2019 Virtuozzo International GmbH. All rights reserved.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
+
+# creator
+owner=vsementsov@virtuozzo.com
+
+seq=`basename $0`
+echo "QA output created by $seq"
+
+status=1	# failure is the default!
+
+_cleanup()
+{
+    _cleanup_test_img
+}
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+. ./common.rc
+. ./common.filter
+
+# Requires backing files and .bdrv_change_backing_file support
+_supported_fmt qcow2
+_supported_proto file
+_supported_os Linux
+
+# This test checks bug in qcow2_process_discards, fixed by previous commit.
+# The problem was that bdrv_pdiscard was called with uint64_t bytes parameter
+# which was cropped to int.
+# To reproduce bug we need to overflow int by one sequential discard, so we
+# need size > 2G, bigger cluster size (as with default 64k we may have maximum
+# of 512M sequential data, corresponding to one L1 entry, and we need to write
+# at offset 0 at last, to prevent bdrv_co_truncate(bs->file) in
+# qcow2_co_truncate to stole the whole effect of failed discard.
+
+size=2300M
+IMGOPTS="cluster_size=1M"
+
+_make_test_img $size
+$QEMU_IO -c 'write 100M 2000M' -c 'write 2100M 200M' \
+    -c 'write 0 100M' "$TEST_IMG" | _filter_qemu_io
+
+$QEMU_IMG info "$TEST_IMG" | _filter_testdir
+
+$QEMU_IMG resize --shrink "$TEST_IMG" 50M
+
+$QEMU_IMG info "$TEST_IMG" | _filter_testdir
+
+# success, all done
+echo "*** done"
+rm -f $seq.full
+status=0
diff --git a/tests/qemu-iotests/249.out b/tests/qemu-iotests/249.out
new file mode 100644
index 0000000000..18bf7b2fb2
--- /dev/null
+++ b/tests/qemu-iotests/249.out
@@ -0,0 +1,30 @@
+QA output created by 249
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=2411724800
+wrote 2097152000/2097152000 bytes at offset 104857600
+1.953 GiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+wrote 209715200/209715200 bytes at offset 2202009600
+200 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+wrote 104857600/104857600 bytes at offset 0
+100 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+image: TEST_DIR/t.qcow2
+file format: qcow2
+virtual size: 2.2G (2411724800 bytes)
+disk size: 2.3G
+cluster_size: 1048576
+Format specific information:
+    compat: 1.1
+    lazy refcounts: false
+    refcount bits: 16
+    corrupt: false
+Image resized.
+image: TEST_DIR/t.qcow2
+file format: qcow2
+virtual size: 50M (52428800 bytes)
+disk size: 54M
+cluster_size: 1048576
+Format specific information:
+    compat: 1.1
+    lazy refcounts: false
+    refcount bits: 16
+    corrupt: false
+*** done
diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
index bae7718380..3a40dbfe69 100644
--- a/tests/qemu-iotests/group
+++ b/tests/qemu-iotests/group
@@ -248,3 +248,4 @@
 246 rw auto quick
 247 rw auto quick
 248 rw auto quick
+249 rw auto
-- 
2.18.0

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [Qemu-devel] [PATCH 2/2] iotests: test big qcow2 shrink
@ 2019-04-17 10:09   ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 14+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-04-17 10:09 UTC (permalink / raw)
  To: qemu-devel, qemu-block; +Cc: kwolf, fam, vsementsov, den, mreitz, stefanha

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 tests/qemu-iotests/249     | 69 ++++++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/249.out | 30 +++++++++++++++++
 tests/qemu-iotests/group   |  1 +
 3 files changed, 100 insertions(+)
 create mode 100755 tests/qemu-iotests/249
 create mode 100644 tests/qemu-iotests/249.out

diff --git a/tests/qemu-iotests/249 b/tests/qemu-iotests/249
new file mode 100755
index 0000000000..f0140461ad
--- /dev/null
+++ b/tests/qemu-iotests/249
@@ -0,0 +1,69 @@
+#!/usr/bin/env bash
+#
+# Test big discard in qcow2 shrink
+#
+# Copyright (c) 2019 Virtuozzo International GmbH. All rights reserved.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
+
+# creator
+owner=vsementsov@virtuozzo.com
+
+seq=`basename $0`
+echo "QA output created by $seq"
+
+status=1	# failure is the default!
+
+_cleanup()
+{
+    _cleanup_test_img
+}
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+. ./common.rc
+. ./common.filter
+
+# Requires backing files and .bdrv_change_backing_file support
+_supported_fmt qcow2
+_supported_proto file
+_supported_os Linux
+
+# This test checks bug in qcow2_process_discards, fixed by previous commit.
+# The problem was that bdrv_pdiscard was called with uint64_t bytes parameter
+# which was cropped to int.
+# To reproduce bug we need to overflow int by one sequential discard, so we
+# need size > 2G, bigger cluster size (as with default 64k we may have maximum
+# of 512M sequential data, corresponding to one L1 entry, and we need to write
+# at offset 0 at last, to prevent bdrv_co_truncate(bs->file) in
+# qcow2_co_truncate to stole the whole effect of failed discard.
+
+size=2300M
+IMGOPTS="cluster_size=1M"
+
+_make_test_img $size
+$QEMU_IO -c 'write 100M 2000M' -c 'write 2100M 200M' \
+    -c 'write 0 100M' "$TEST_IMG" | _filter_qemu_io
+
+$QEMU_IMG info "$TEST_IMG" | _filter_testdir
+
+$QEMU_IMG resize --shrink "$TEST_IMG" 50M
+
+$QEMU_IMG info "$TEST_IMG" | _filter_testdir
+
+# success, all done
+echo "*** done"
+rm -f $seq.full
+status=0
diff --git a/tests/qemu-iotests/249.out b/tests/qemu-iotests/249.out
new file mode 100644
index 0000000000..18bf7b2fb2
--- /dev/null
+++ b/tests/qemu-iotests/249.out
@@ -0,0 +1,30 @@
+QA output created by 249
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=2411724800
+wrote 2097152000/2097152000 bytes at offset 104857600
+1.953 GiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+wrote 209715200/209715200 bytes at offset 2202009600
+200 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+wrote 104857600/104857600 bytes at offset 0
+100 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+image: TEST_DIR/t.qcow2
+file format: qcow2
+virtual size: 2.2G (2411724800 bytes)
+disk size: 2.3G
+cluster_size: 1048576
+Format specific information:
+    compat: 1.1
+    lazy refcounts: false
+    refcount bits: 16
+    corrupt: false
+Image resized.
+image: TEST_DIR/t.qcow2
+file format: qcow2
+virtual size: 50M (52428800 bytes)
+disk size: 54M
+cluster_size: 1048576
+Format specific information:
+    compat: 1.1
+    lazy refcounts: false
+    refcount bits: 16
+    corrupt: false
+*** done
diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
index bae7718380..3a40dbfe69 100644
--- a/tests/qemu-iotests/group
+++ b/tests/qemu-iotests/group
@@ -248,3 +248,4 @@
 246 rw auto quick
 247 rw auto quick
 248 rw auto quick
+249 rw auto
-- 
2.18.0



^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH for-4.0? 0/2] Fix overflow bug in qcow2 discard
@ 2019-04-17 12:33   ` Eric Blake
  0 siblings, 0 replies; 14+ messages in thread
From: Eric Blake @ 2019-04-17 12:33 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-devel, qemu-block
  Cc: kwolf, fam, den, mreitz, stefanha

[-- Attachment #1: Type: text/plain, Size: 2004 bytes --]

On 4/17/19 5:09 AM, Vladimir Sementsov-Ogievskiy wrote:
> Hi all. We faced an interesting bug, which may be simply reproduced:
> 
> prepare image:
> ./qemu-img create -f qcow2 -o cluster_size=1M /ssd/test 2300M
> ./qemu-io -c 'write 100M 2000M' -c 'write 2100M 200M' -c 'write 0 100M' /ssd/test
> 
> shrink:
> ./qemu-img resize --shrink /ssd/test 50M
> 
> bug:
> ./qemu-img info /ssd/test
> image: /ssd/test
> file format: qcow2
> virtual size: 50M (52428800 bytes)
> disk size: 2.2G
> cluster_size: 1048576
> Format specific information:
>     compat: 1.1
>     lazy refcounts: false
>     refcount bits: 16
>     corrupt: false
> 
> Virtual size is shrunk, but file - not. It is due to the fact,
> that merged qcow2 discard may exceed 2G, and then converting from
> uint64_t to int in qcow2_process_discards when we call bdrv_pdiscard
> make wrong thing.

Too late for 4.0, but also not a regression new to this release, since
the problem appears to have been present since its introduction in
commit 0b919fae (1.6.0) (that is, even back then, Qcow2DiscardRegion was
introduced with a 64-bit discard length, but qcow2_process_discards
blindly passed that through bdrv_discard() at the time, which took 'int
nb_sectors').

> 
> So, here are proposal of fix and new iotest for it.
> 
> Vladimir Sementsov-Ogievskiy (2):
>   block/io: bdrv_pdiscard: support int64_t bytes parameter
>   iotests: test big qcow2 shrink
> 
>  include/block/block.h      |  4 +--
>  block/io.c                 | 19 ++++++-----
>  tests/qemu-iotests/249     | 69 ++++++++++++++++++++++++++++++++++++++
>  tests/qemu-iotests/249.out | 30 +++++++++++++++++
>  tests/qemu-iotests/group   |  1 +
>  5 files changed, 112 insertions(+), 11 deletions(-)
>  create mode 100755 tests/qemu-iotests/249
>  create mode 100644 tests/qemu-iotests/249.out
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH for-4.0? 0/2] Fix overflow bug in qcow2 discard
@ 2019-04-17 12:33   ` Eric Blake
  0 siblings, 0 replies; 14+ messages in thread
From: Eric Blake @ 2019-04-17 12:33 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-devel, qemu-block
  Cc: kwolf, fam, den, stefanha, mreitz

[-- Attachment #1: Type: text/plain, Size: 2004 bytes --]

On 4/17/19 5:09 AM, Vladimir Sementsov-Ogievskiy wrote:
> Hi all. We faced an interesting bug, which may be simply reproduced:
> 
> prepare image:
> ./qemu-img create -f qcow2 -o cluster_size=1M /ssd/test 2300M
> ./qemu-io -c 'write 100M 2000M' -c 'write 2100M 200M' -c 'write 0 100M' /ssd/test
> 
> shrink:
> ./qemu-img resize --shrink /ssd/test 50M
> 
> bug:
> ./qemu-img info /ssd/test
> image: /ssd/test
> file format: qcow2
> virtual size: 50M (52428800 bytes)
> disk size: 2.2G
> cluster_size: 1048576
> Format specific information:
>     compat: 1.1
>     lazy refcounts: false
>     refcount bits: 16
>     corrupt: false
> 
> Virtual size is shrunk, but file - not. It is due to the fact,
> that merged qcow2 discard may exceed 2G, and then converting from
> uint64_t to int in qcow2_process_discards when we call bdrv_pdiscard
> make wrong thing.

Too late for 4.0, but also not a regression new to this release, since
the problem appears to have been present since its introduction in
commit 0b919fae (1.6.0) (that is, even back then, Qcow2DiscardRegion was
introduced with a 64-bit discard length, but qcow2_process_discards
blindly passed that through bdrv_discard() at the time, which took 'int
nb_sectors').

> 
> So, here are proposal of fix and new iotest for it.
> 
> Vladimir Sementsov-Ogievskiy (2):
>   block/io: bdrv_pdiscard: support int64_t bytes parameter
>   iotests: test big qcow2 shrink
> 
>  include/block/block.h      |  4 +--
>  block/io.c                 | 19 ++++++-----
>  tests/qemu-iotests/249     | 69 ++++++++++++++++++++++++++++++++++++++
>  tests/qemu-iotests/249.out | 30 +++++++++++++++++
>  tests/qemu-iotests/group   |  1 +
>  5 files changed, 112 insertions(+), 11 deletions(-)
>  create mode 100755 tests/qemu-iotests/249
>  create mode 100644 tests/qemu-iotests/249.out
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH 1/2] block/io: bdrv_pdiscard: support int64_t bytes parameter
@ 2019-04-17 14:48     ` Eric Blake
  0 siblings, 0 replies; 14+ messages in thread
From: Eric Blake @ 2019-04-17 14:48 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-devel, qemu-block
  Cc: kwolf, fam, den, mreitz, stefanha

[-- Attachment #1: Type: text/plain, Size: 3433 bytes --]

On 4/17/19 5:09 AM, Vladimir Sementsov-Ogievskiy wrote:
> This fixes at least one overflow in qcow2_process_discards.

It's worth calling out how long the problem of passing >2G discard
requests has been present (my reply to the cover letter tracked down
0b919fae as tracking a 64-bit discard region but passing it to
bdrv_discard() which took an int sectors; I'm not sure if later changes
to byte-based rather than sector-based made a difference).

> @@ -2664,14 +2664,15 @@ static void coroutine_fn bdrv_pdiscard_co_entry(void *opaque)
>      aio_wait_kick();
>  }
>  
> -int coroutine_fn bdrv_co_pdiscard(BdrvChild *child, int64_t offset, int bytes)
> +int coroutine_fn bdrv_co_pdiscard(BdrvChild *child, int64_t offset,
> +                                  int64_t bytes)
>  {
>      BdrvTrackedRequest req;
>      int max_pdiscard, ret;
>      int head, tail, align;
>      BlockDriverState *bs = child->bs;
>  
> -    if (!bs || !bs->drv) {
> +    if (!bs || !bs->drv || !bdrv_is_inserted(bs)) {

This change seems unrelated? Oh, it's because you are inlining the rest
of what bdrv_check_byte_request used to do.

>          return -ENOMEDIUM;
>      }
>  
> @@ -2679,9 +2680,8 @@ int coroutine_fn bdrv_co_pdiscard(BdrvChild *child, int64_t offset, int bytes)
>          return -EPERM;
>      }
>  
> -    ret = bdrv_check_byte_request(bs, offset, bytes);
> -    if (ret < 0) {
> -        return ret;

If we keep this call in place, we can flag if there were any other
callers that were passing truncated 64-bit quantities. But I also agree
that now that we are switching to a 64-bit interface, we no longer have
to check whether callers were properly limiting their requests.

Hmm - I just realized that bdrv_check_byte_request() takes a size_t
(rather than int64_t) size argument - could this result in any other
truncations on a 32-bit platform that don't affect 64-bit platforms?

> @@ -2711,12 +2711,13 @@ int coroutine_fn bdrv_co_pdiscard(BdrvChild *child, int64_t offset, int bytes)
>          goto out;
>      }
>  
> -    max_pdiscard = QEMU_ALIGN_DOWN(MIN_NON_ZERO(bs->bl.max_pdiscard, INT_MAX),
> +    max_pdiscard = QEMU_ALIGN_DOWN(MIN_NON_ZERO(bs->bl.max_pdiscard,
> +                                                BDRV_REQUEST_MAX_BYTES),
>                                     align);

This change is a no-op, since BDRV_REQUEST_MAX_BYTES is already INT_MAX
aligned down to sector size, and align is at least sector size.

>      assert(max_pdiscard >= bs->bl.request_alignment);
>  
>      while (bytes > 0) {
> -        int num = bytes;
> +        int64_t num = bytes;
>  
>          if (head) {
>              /* Make small requests to get to alignment boundaries. */
> @@ -2778,7 +2779,7 @@ out:
>      return ret;
>  }
>  
> -int bdrv_pdiscard(BdrvChild *child, int64_t offset, int bytes)
> +int bdrv_pdiscard(BdrvChild *child, int64_t offset, int64_t bytes)
>  {
>      Coroutine *co;
>      DiscardCo rwco = {
> 

I'm not sure the patch is perfect, but I definitely agree that we want
to support 64-byte discard length (where the block layer fragments the
request as needed) rather than the current 32-byte discard length (where
callers have to be careful to not suffer from truncation).

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH 1/2] block/io: bdrv_pdiscard: support int64_t bytes parameter
@ 2019-04-17 14:48     ` Eric Blake
  0 siblings, 0 replies; 14+ messages in thread
From: Eric Blake @ 2019-04-17 14:48 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-devel, qemu-block
  Cc: kwolf, fam, den, stefanha, mreitz

[-- Attachment #1: Type: text/plain, Size: 3433 bytes --]

On 4/17/19 5:09 AM, Vladimir Sementsov-Ogievskiy wrote:
> This fixes at least one overflow in qcow2_process_discards.

It's worth calling out how long the problem of passing >2G discard
requests has been present (my reply to the cover letter tracked down
0b919fae as tracking a 64-bit discard region but passing it to
bdrv_discard() which took an int sectors; I'm not sure if later changes
to byte-based rather than sector-based made a difference).

> @@ -2664,14 +2664,15 @@ static void coroutine_fn bdrv_pdiscard_co_entry(void *opaque)
>      aio_wait_kick();
>  }
>  
> -int coroutine_fn bdrv_co_pdiscard(BdrvChild *child, int64_t offset, int bytes)
> +int coroutine_fn bdrv_co_pdiscard(BdrvChild *child, int64_t offset,
> +                                  int64_t bytes)
>  {
>      BdrvTrackedRequest req;
>      int max_pdiscard, ret;
>      int head, tail, align;
>      BlockDriverState *bs = child->bs;
>  
> -    if (!bs || !bs->drv) {
> +    if (!bs || !bs->drv || !bdrv_is_inserted(bs)) {

This change seems unrelated? Oh, it's because you are inlining the rest
of what bdrv_check_byte_request used to do.

>          return -ENOMEDIUM;
>      }
>  
> @@ -2679,9 +2680,8 @@ int coroutine_fn bdrv_co_pdiscard(BdrvChild *child, int64_t offset, int bytes)
>          return -EPERM;
>      }
>  
> -    ret = bdrv_check_byte_request(bs, offset, bytes);
> -    if (ret < 0) {
> -        return ret;

If we keep this call in place, we can flag if there were any other
callers that were passing truncated 64-bit quantities. But I also agree
that now that we are switching to a 64-bit interface, we no longer have
to check whether callers were properly limiting their requests.

Hmm - I just realized that bdrv_check_byte_request() takes a size_t
(rather than int64_t) size argument - could this result in any other
truncations on a 32-bit platform that don't affect 64-bit platforms?

> @@ -2711,12 +2711,13 @@ int coroutine_fn bdrv_co_pdiscard(BdrvChild *child, int64_t offset, int bytes)
>          goto out;
>      }
>  
> -    max_pdiscard = QEMU_ALIGN_DOWN(MIN_NON_ZERO(bs->bl.max_pdiscard, INT_MAX),
> +    max_pdiscard = QEMU_ALIGN_DOWN(MIN_NON_ZERO(bs->bl.max_pdiscard,
> +                                                BDRV_REQUEST_MAX_BYTES),
>                                     align);

This change is a no-op, since BDRV_REQUEST_MAX_BYTES is already INT_MAX
aligned down to sector size, and align is at least sector size.

>      assert(max_pdiscard >= bs->bl.request_alignment);
>  
>      while (bytes > 0) {
> -        int num = bytes;
> +        int64_t num = bytes;
>  
>          if (head) {
>              /* Make small requests to get to alignment boundaries. */
> @@ -2778,7 +2779,7 @@ out:
>      return ret;
>  }
>  
> -int bdrv_pdiscard(BdrvChild *child, int64_t offset, int bytes)
> +int bdrv_pdiscard(BdrvChild *child, int64_t offset, int64_t bytes)
>  {
>      Coroutine *co;
>      DiscardCo rwco = {
> 

I'm not sure the patch is perfect, but I definitely agree that we want
to support 64-byte discard length (where the block layer fragments the
request as needed) rather than the current 32-byte discard length (where
callers have to be careful to not suffer from truncation).

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH 2/2] iotests: test big qcow2 shrink
@ 2019-04-17 15:04     ` Eric Blake
  0 siblings, 0 replies; 14+ messages in thread
From: Eric Blake @ 2019-04-17 15:04 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-devel, qemu-block
  Cc: kwolf, fam, den, mreitz, stefanha

[-- Attachment #1: Type: text/plain, Size: 3139 bytes --]

On 4/17/19 5:09 AM, Vladimir Sementsov-Ogievskiy wrote:
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>  tests/qemu-iotests/249     | 69 ++++++++++++++++++++++++++++++++++++++
>  tests/qemu-iotests/249.out | 30 +++++++++++++++++
>  tests/qemu-iotests/group   |  1 +
>  3 files changed, 100 insertions(+)
>  create mode 100755 tests/qemu-iotests/249
>  create mode 100644 tests/qemu-iotests/249.out

> +
> +# This test checks bug in qcow2_process_discards, fixed by previous commit.

This sentence makes sense in a commit message, but is a bit stale in the
test itself. Simpler to just state:

This test checks that qcow2_process_discards does not truncate a discard
request > 2G.

> +# The problem was that bdrv_pdiscard was called with uint64_t bytes parameter
> +# which was cropped to int.

then drop this sentence.

> +# To reproduce bug we need to overflow int by one sequential discard, so we
> +# need size > 2G, bigger cluster size (as with default 64k we may have maximum
> +# of 512M sequential data, corresponding to one L1 entry, and we need to write

s/entry,/entry),/

> +# at offset 0 at last, to prevent bdrv_co_truncate(bs->file) in
> +# qcow2_co_truncate to stole the whole effect of failed discard.

and we need a second write at offset 0 to prevent
bdrv_co_truncate(bs->file) from short-circuiting the entire discard.

> +
> +size=2300M
> +IMGOPTS="cluster_size=1M"
> +
> +_make_test_img $size
> +$QEMU_IO -c 'write 100M 2000M' -c 'write 2100M 200M' \
> +    -c 'write 0 100M' "$TEST_IMG" | _filter_qemu_io

This takes a LONG time to produce; and when I tried it on a 32-bit
machine, I got an ENOMEM failure for being unable to allocate a 2000M
buffer. Is there any faster way to produce a similar setup, perhaps by
manually messing with the qcow2 file rather than directly writing that
much consecutive data?  Even if there isn't, you'll want to at least
make the test gracefully skip rather than fail if it hits ENOMEM.

> +Image resized.
> +image: TEST_DIR/t.qcow2
> +file format: qcow2
> +virtual size: 50M (52428800 bytes)
> +disk size: 54M

I confirmed that when patch 1/2 is not present, this reports 2.2G
instead of 54M.

The output here will need tweaking if my patch to rework qemu-img
human-readable sizing lands first (it's currently queued on Kevin's
block-next tree, but needs a rework to fix a few more iotests and to
resolve a 32-bit compilation issue).

> +++ b/tests/qemu-iotests/group
> @@ -248,3 +248,4 @@
>  246 rw auto quick
>  247 rw auto quick
>  248 rw auto quick
> +249 rw auto
> 

I agree that this isn't quick. I will also point out that at least three
earlier messages have also claimed 249:

Berto: [PATCH for-4.1 v2 2/2] iotests: Check that images are in
read-only mode after block-commit

Max: [PATCH v3 6/7] iotests: Test qemu-img convert --salvage

and yourself :) [PATCH v6 7/7] iotests: test nbd reconnect

so someone gets to rename.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH 2/2] iotests: test big qcow2 shrink
@ 2019-04-17 15:04     ` Eric Blake
  0 siblings, 0 replies; 14+ messages in thread
From: Eric Blake @ 2019-04-17 15:04 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-devel, qemu-block
  Cc: kwolf, fam, den, stefanha, mreitz

[-- Attachment #1: Type: text/plain, Size: 3139 bytes --]

On 4/17/19 5:09 AM, Vladimir Sementsov-Ogievskiy wrote:
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>  tests/qemu-iotests/249     | 69 ++++++++++++++++++++++++++++++++++++++
>  tests/qemu-iotests/249.out | 30 +++++++++++++++++
>  tests/qemu-iotests/group   |  1 +
>  3 files changed, 100 insertions(+)
>  create mode 100755 tests/qemu-iotests/249
>  create mode 100644 tests/qemu-iotests/249.out

> +
> +# This test checks bug in qcow2_process_discards, fixed by previous commit.

This sentence makes sense in a commit message, but is a bit stale in the
test itself. Simpler to just state:

This test checks that qcow2_process_discards does not truncate a discard
request > 2G.

> +# The problem was that bdrv_pdiscard was called with uint64_t bytes parameter
> +# which was cropped to int.

then drop this sentence.

> +# To reproduce bug we need to overflow int by one sequential discard, so we
> +# need size > 2G, bigger cluster size (as with default 64k we may have maximum
> +# of 512M sequential data, corresponding to one L1 entry, and we need to write

s/entry,/entry),/

> +# at offset 0 at last, to prevent bdrv_co_truncate(bs->file) in
> +# qcow2_co_truncate to stole the whole effect of failed discard.

and we need a second write at offset 0 to prevent
bdrv_co_truncate(bs->file) from short-circuiting the entire discard.

> +
> +size=2300M
> +IMGOPTS="cluster_size=1M"
> +
> +_make_test_img $size
> +$QEMU_IO -c 'write 100M 2000M' -c 'write 2100M 200M' \
> +    -c 'write 0 100M' "$TEST_IMG" | _filter_qemu_io

This takes a LONG time to produce; and when I tried it on a 32-bit
machine, I got an ENOMEM failure for being unable to allocate a 2000M
buffer. Is there any faster way to produce a similar setup, perhaps by
manually messing with the qcow2 file rather than directly writing that
much consecutive data?  Even if there isn't, you'll want to at least
make the test gracefully skip rather than fail if it hits ENOMEM.

> +Image resized.
> +image: TEST_DIR/t.qcow2
> +file format: qcow2
> +virtual size: 50M (52428800 bytes)
> +disk size: 54M

I confirmed that when patch 1/2 is not present, this reports 2.2G
instead of 54M.

The output here will need tweaking if my patch to rework qemu-img
human-readable sizing lands first (it's currently queued on Kevin's
block-next tree, but needs a rework to fix a few more iotests and to
resolve a 32-bit compilation issue).

> +++ b/tests/qemu-iotests/group
> @@ -248,3 +248,4 @@
>  246 rw auto quick
>  247 rw auto quick
>  248 rw auto quick
> +249 rw auto
> 

I agree that this isn't quick. I will also point out that at least three
earlier messages have also claimed 249:

Berto: [PATCH for-4.1 v2 2/2] iotests: Check that images are in
read-only mode after block-commit

Max: [PATCH v3 6/7] iotests: Test qemu-img convert --salvage

and yourself :) [PATCH v6 7/7] iotests: test nbd reconnect

so someone gets to rename.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH 1/2] block/io: bdrv_pdiscard: support int64_t bytes parameter
@ 2019-04-19 11:14       ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 14+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-04-19 11:14 UTC (permalink / raw)
  To: Eric Blake, qemu-devel, qemu-block
  Cc: kwolf, fam, Denis Lunev, mreitz, stefanha

17.04.2019 17:48, Eric Blake wrote:
> On 4/17/19 5:09 AM, Vladimir Sementsov-Ogievskiy wrote:
>> This fixes at least one overflow in qcow2_process_discards.
> 
> It's worth calling out how long the problem of passing >2G discard
> requests has been present (my reply to the cover letter tracked down
> 0b919fae as tracking a 64-bit discard region but passing it to
> bdrv_discard() which took an int sectors; I'm not sure if later changes
> to byte-based rather than sector-based made a difference).
> 
> 
>> @@ -2664,14 +2664,15 @@ static void coroutine_fn bdrv_pdiscard_co_entry(void *opaque)
>>       aio_wait_kick();
>>   }
>>   
>> -int coroutine_fn bdrv_co_pdiscard(BdrvChild *child, int64_t offset, int bytes)
>> +int coroutine_fn bdrv_co_pdiscard(BdrvChild *child, int64_t offset,
>> +                                  int64_t bytes)
>>   {
>>       BdrvTrackedRequest req;
>>       int max_pdiscard, ret;
>>       int head, tail, align;
>>       BlockDriverState *bs = child->bs;
>>   
>> -    if (!bs || !bs->drv) {
>> +    if (!bs || !bs->drv || !bdrv_is_inserted(bs)) {
> 
> This change seems unrelated? Oh, it's because you are inlining the rest
> of what bdrv_check_byte_request used to do.
> 
>>           return -ENOMEDIUM;
>>       }
>>   
>> @@ -2679,9 +2680,8 @@ int coroutine_fn bdrv_co_pdiscard(BdrvChild *child, int64_t offset, int bytes)
>>           return -EPERM;
>>       }
>>   
>> -    ret = bdrv_check_byte_request(bs, offset, bytes);
>> -    if (ret < 0) {
>> -        return ret;
> 
> If we keep this call in place, we can flag if there were any other
> callers that were passing truncated 64-bit quantities. But I also agree
> that now that we are switching to a 64-bit interface, we no longer have
> to check whether callers were properly limiting their requests.
> 
> Hmm - I just realized that bdrv_check_byte_request() takes a size_t
> (rather than int64_t) size argument - could this result in any other
> truncations on a 32-bit platform that don't affect 64-bit platforms?
> 
>> @@ -2711,12 +2711,13 @@ int coroutine_fn bdrv_co_pdiscard(BdrvChild *child, int64_t offset, int bytes)
>>           goto out;
>>       }
>>   
>> -    max_pdiscard = QEMU_ALIGN_DOWN(MIN_NON_ZERO(bs->bl.max_pdiscard, INT_MAX),
>> +    max_pdiscard = QEMU_ALIGN_DOWN(MIN_NON_ZERO(bs->bl.max_pdiscard,
>> +                                                BDRV_REQUEST_MAX_BYTES),
>>                                      align);
> 
> This change is a no-op, since BDRV_REQUEST_MAX_BYTES is already INT_MAX
> aligned down to sector size, and align is at least sector size.

It's a part of inlining bdrv_check_byte_request too.
It's MIN(SIZE_MAX, INT_MAX).. is it possible for size to less than int? seems not

> 
>>       assert(max_pdiscard >= bs->bl.request_alignment);
>>   
>>       while (bytes > 0) {
>> -        int num = bytes;
>> +        int64_t num = bytes;
>>   
>>           if (head) {
>>               /* Make small requests to get to alignment boundaries. */
>> @@ -2778,7 +2779,7 @@ out:
>>       return ret;
>>   }
>>   
>> -int bdrv_pdiscard(BdrvChild *child, int64_t offset, int bytes)
>> +int bdrv_pdiscard(BdrvChild *child, int64_t offset, int64_t bytes)
>>   {
>>       Coroutine *co;
>>       DiscardCo rwco = {
>>
> 
> I'm not sure the patch is perfect, but I definitely agree that we want
> to support 64-byte discard length (where the block layer fragments the
> request as needed) rather than the current 32-byte discard length (where
> callers have to be careful to not suffer from truncation).
> 


-- 
Best regards,
Vladimir

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH 1/2] block/io: bdrv_pdiscard: support int64_t bytes parameter
@ 2019-04-19 11:14       ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 14+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-04-19 11:14 UTC (permalink / raw)
  To: Eric Blake, qemu-devel, qemu-block
  Cc: kwolf, fam, Denis Lunev, stefanha, mreitz

17.04.2019 17:48, Eric Blake wrote:
> On 4/17/19 5:09 AM, Vladimir Sementsov-Ogievskiy wrote:
>> This fixes at least one overflow in qcow2_process_discards.
> 
> It's worth calling out how long the problem of passing >2G discard
> requests has been present (my reply to the cover letter tracked down
> 0b919fae as tracking a 64-bit discard region but passing it to
> bdrv_discard() which took an int sectors; I'm not sure if later changes
> to byte-based rather than sector-based made a difference).
> 
> 
>> @@ -2664,14 +2664,15 @@ static void coroutine_fn bdrv_pdiscard_co_entry(void *opaque)
>>       aio_wait_kick();
>>   }
>>   
>> -int coroutine_fn bdrv_co_pdiscard(BdrvChild *child, int64_t offset, int bytes)
>> +int coroutine_fn bdrv_co_pdiscard(BdrvChild *child, int64_t offset,
>> +                                  int64_t bytes)
>>   {
>>       BdrvTrackedRequest req;
>>       int max_pdiscard, ret;
>>       int head, tail, align;
>>       BlockDriverState *bs = child->bs;
>>   
>> -    if (!bs || !bs->drv) {
>> +    if (!bs || !bs->drv || !bdrv_is_inserted(bs)) {
> 
> This change seems unrelated? Oh, it's because you are inlining the rest
> of what bdrv_check_byte_request used to do.
> 
>>           return -ENOMEDIUM;
>>       }
>>   
>> @@ -2679,9 +2680,8 @@ int coroutine_fn bdrv_co_pdiscard(BdrvChild *child, int64_t offset, int bytes)
>>           return -EPERM;
>>       }
>>   
>> -    ret = bdrv_check_byte_request(bs, offset, bytes);
>> -    if (ret < 0) {
>> -        return ret;
> 
> If we keep this call in place, we can flag if there were any other
> callers that were passing truncated 64-bit quantities. But I also agree
> that now that we are switching to a 64-bit interface, we no longer have
> to check whether callers were properly limiting their requests.
> 
> Hmm - I just realized that bdrv_check_byte_request() takes a size_t
> (rather than int64_t) size argument - could this result in any other
> truncations on a 32-bit platform that don't affect 64-bit platforms?
> 
>> @@ -2711,12 +2711,13 @@ int coroutine_fn bdrv_co_pdiscard(BdrvChild *child, int64_t offset, int bytes)
>>           goto out;
>>       }
>>   
>> -    max_pdiscard = QEMU_ALIGN_DOWN(MIN_NON_ZERO(bs->bl.max_pdiscard, INT_MAX),
>> +    max_pdiscard = QEMU_ALIGN_DOWN(MIN_NON_ZERO(bs->bl.max_pdiscard,
>> +                                                BDRV_REQUEST_MAX_BYTES),
>>                                      align);
> 
> This change is a no-op, since BDRV_REQUEST_MAX_BYTES is already INT_MAX
> aligned down to sector size, and align is at least sector size.

It's a part of inlining bdrv_check_byte_request too.
It's MIN(SIZE_MAX, INT_MAX).. is it possible for size to less than int? seems not

> 
>>       assert(max_pdiscard >= bs->bl.request_alignment);
>>   
>>       while (bytes > 0) {
>> -        int num = bytes;
>> +        int64_t num = bytes;
>>   
>>           if (head) {
>>               /* Make small requests to get to alignment boundaries. */
>> @@ -2778,7 +2779,7 @@ out:
>>       return ret;
>>   }
>>   
>> -int bdrv_pdiscard(BdrvChild *child, int64_t offset, int bytes)
>> +int bdrv_pdiscard(BdrvChild *child, int64_t offset, int64_t bytes)
>>   {
>>       Coroutine *co;
>>       DiscardCo rwco = {
>>
> 
> I'm not sure the patch is perfect, but I definitely agree that we want
> to support 64-byte discard length (where the block layer fragments the
> request as needed) rather than the current 32-byte discard length (where
> callers have to be careful to not suffer from truncation).
> 


-- 
Best regards,
Vladimir

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2019-04-19 11:18 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-04-17 10:09 [Qemu-devel] [PATCH for-4.0? 0/2] Fix overflow bug in qcow2 discard Vladimir Sementsov-Ogievskiy
2019-04-17 10:09 ` Vladimir Sementsov-Ogievskiy
2019-04-17 10:09 ` [Qemu-devel] [PATCH 1/2] block/io: bdrv_pdiscard: support int64_t bytes parameter Vladimir Sementsov-Ogievskiy
2019-04-17 10:09   ` Vladimir Sementsov-Ogievskiy
2019-04-17 14:48   ` Eric Blake
2019-04-17 14:48     ` Eric Blake
2019-04-19 11:14     ` Vladimir Sementsov-Ogievskiy
2019-04-19 11:14       ` Vladimir Sementsov-Ogievskiy
2019-04-17 10:09 ` [Qemu-devel] [PATCH 2/2] iotests: test big qcow2 shrink Vladimir Sementsov-Ogievskiy
2019-04-17 10:09   ` Vladimir Sementsov-Ogievskiy
2019-04-17 15:04   ` Eric Blake
2019-04-17 15:04     ` Eric Blake
2019-04-17 12:33 ` [Qemu-devel] [PATCH for-4.0? 0/2] Fix overflow bug in qcow2 discard Eric Blake
2019-04-17 12:33   ` Eric Blake

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.