All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/4] block: prepare for 64bit
@ 2020-12-03 22:27 Vladimir Sementsov-Ogievskiy
  2020-12-03 22:27 ` [PATCH 1/4] block/file-posix: fix workaround in raw_do_pwrite_zeroes() Vladimir Sementsov-Ogievskiy
                   ` (4 more replies)
  0 siblings, 5 replies; 17+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-12-03 22:27 UTC (permalink / raw)
  To: qemu-block
  Cc: qemu-devel, fam, stefanha, mreitz, kwolf, eblake, berto, den, vsementsov

Hi all!

This is a preparation series for v4 of "[PATCH v3 00/17] 64bit
block-layer".

The whole thing is in 04, and 01-03 are small preparations.

Vladimir Sementsov-Ogievskiy (4):
  block/file-posix: fix workaround in raw_do_pwrite_zeroes()
  block/io: bdrv_refresh_limits(): use ERRP_GUARD
  block/io: bdrv_check_byte_request(): drop bdrv_is_inserted()
  block: introduce BDRV_MAX_LENGTH

 include/block/block.h        | 10 ++++++
 include/block/block_int.h    |  8 +++++
 block.c                      | 17 ++++++++-
 block/file-posix.c           |  7 ++--
 block/io.c                   | 69 ++++++++++++++++++++++++++----------
 tests/test-write-threshold.c |  4 +++
 tests/qemu-iotests/206       |  2 +-
 tests/qemu-iotests/206.out   |  6 ++++
 8 files changed, 98 insertions(+), 25 deletions(-)

-- 
2.21.3



^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 1/4] block/file-posix: fix workaround in raw_do_pwrite_zeroes()
  2020-12-03 22:27 [PATCH 0/4] block: prepare for 64bit Vladimir Sementsov-Ogievskiy
@ 2020-12-03 22:27 ` Vladimir Sementsov-Ogievskiy
  2020-12-03 22:27 ` [PATCH 2/4] block/io: bdrv_refresh_limits(): use ERRP_GUARD Vladimir Sementsov-Ogievskiy
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 17+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-12-03 22:27 UTC (permalink / raw)
  To: qemu-block
  Cc: qemu-devel, fam, stefanha, mreitz, kwolf, eblake, berto, den, vsementsov

We should not set overlap_bytes:

1. Don't worry: it is calculated by bdrv_mark_request_serialising() and
   will be equal to or greater than bytes anyway.

2. If the request was already aligned up to some greater alignment,
   than we may break things: we reduce overlap_bytes, and further
   bdrv_mark_request_serialising() may not help, as it will not restore
   old bigger alignment.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/file-posix.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/block/file-posix.c b/block/file-posix.c
index d5fd1dbcd2..1b35bd6cfa 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -2952,7 +2952,6 @@ raw_do_pwrite_zeroes(BlockDriverState *bs, int64_t offset, int bytes,
 
         end = INT64_MAX & -(uint64_t)bs->bl.request_alignment;
         req->bytes = end - req->offset;
-        req->overlap_bytes = req->bytes;
 
         bdrv_mark_request_serialising(req, bs->bl.request_alignment);
     }
-- 
2.21.3



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 2/4] block/io: bdrv_refresh_limits(): use ERRP_GUARD
  2020-12-03 22:27 [PATCH 0/4] block: prepare for 64bit Vladimir Sementsov-Ogievskiy
  2020-12-03 22:27 ` [PATCH 1/4] block/file-posix: fix workaround in raw_do_pwrite_zeroes() Vladimir Sementsov-Ogievskiy
@ 2020-12-03 22:27 ` Vladimir Sementsov-Ogievskiy
  2020-12-04 15:08   ` Alberto Garcia
  2020-12-03 22:27 ` [PATCH 3/4] block/io: bdrv_check_byte_request(): drop bdrv_is_inserted() Vladimir Sementsov-Ogievskiy
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 17+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-12-03 22:27 UTC (permalink / raw)
  To: qemu-block
  Cc: qemu-devel, fam, stefanha, mreitz, kwolf, eblake, berto, den, vsementsov

This simplifies following commit.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/io.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/block/io.c b/block/io.c
index ec5e152bb7..3e91074c9f 100644
--- a/block/io.c
+++ b/block/io.c
@@ -135,10 +135,10 @@ static void bdrv_merge_limits(BlockLimits *dst, const BlockLimits *src)
 
 void bdrv_refresh_limits(BlockDriverState *bs, Error **errp)
 {
+    ERRP_GUARD();
     BlockDriver *drv = bs->drv;
     BdrvChild *c;
     bool have_limits;
-    Error *local_err = NULL;
 
     memset(&bs->bl, 0, sizeof(bs->bl));
 
@@ -156,9 +156,8 @@ void bdrv_refresh_limits(BlockDriverState *bs, Error **errp)
     QLIST_FOREACH(c, &bs->children, next) {
         if (c->role & (BDRV_CHILD_DATA | BDRV_CHILD_FILTERED | BDRV_CHILD_COW))
         {
-            bdrv_refresh_limits(c->bs, &local_err);
-            if (local_err) {
-                error_propagate(errp, local_err);
+            bdrv_refresh_limits(c->bs, errp);
+            if (*errp) {
                 return;
             }
             bdrv_merge_limits(&bs->bl, &c->bs->bl);
-- 
2.21.3



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 3/4] block/io: bdrv_check_byte_request(): drop bdrv_is_inserted()
  2020-12-03 22:27 [PATCH 0/4] block: prepare for 64bit Vladimir Sementsov-Ogievskiy
  2020-12-03 22:27 ` [PATCH 1/4] block/file-posix: fix workaround in raw_do_pwrite_zeroes() Vladimir Sementsov-Ogievskiy
  2020-12-03 22:27 ` [PATCH 2/4] block/io: bdrv_refresh_limits(): use ERRP_GUARD Vladimir Sementsov-Ogievskiy
@ 2020-12-03 22:27 ` Vladimir Sementsov-Ogievskiy
  2020-12-04 15:16   ` Alberto Garcia
  2020-12-03 22:27 ` [PATCH 4/4] block: introduce BDRV_MAX_LENGTH Vladimir Sementsov-Ogievskiy
  2020-12-08 17:13 ` [PATCH 0/4] block: prepare for 64bit Kevin Wolf
  4 siblings, 1 reply; 17+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-12-03 22:27 UTC (permalink / raw)
  To: qemu-block
  Cc: qemu-devel, fam, stefanha, mreitz, kwolf, eblake, berto, den, vsementsov

Move bdrv_is_inserted() calls into callers.

We are going to make bdrv_check_byte_request() a clean thing.
bdrv_is_inserted() is not about checking the request, it's about
checking the bs. So, it should be separate.

With this patch we probably change error path for some failure
scenarios. But depending on the fact that querying too big request on
empty cdrom (or corrupted qcow2 node with no drv) will result in EIO
and not ENOMEDIUM would be very strange. More over, we are going to
move to 64bit requests, so larger requests will be allowed anyway.

More over, keeping in mind that cdrom is the only driver that has
.bdrv_is_inserted() handler it's strange that we should care so much
about it in generic block layer, intuitively we should just do read and
write, and cdrom driver should return correct errors if it is not
inserted. But it's a work for another series.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/io.c | 25 ++++++++++++-------------
 1 file changed, 12 insertions(+), 13 deletions(-)

diff --git a/block/io.c b/block/io.c
index 3e91074c9f..ef75a5abb4 100644
--- a/block/io.c
+++ b/block/io.c
@@ -884,17 +884,12 @@ static bool coroutine_fn bdrv_wait_serialising_requests(BdrvTrackedRequest *self
     return waited;
 }
 
-static int bdrv_check_byte_request(BlockDriverState *bs, int64_t offset,
-                                   size_t size)
+static int bdrv_check_byte_request(int64_t offset, size_t size)
 {
     if (size > BDRV_REQUEST_MAX_BYTES) {
         return -EIO;
     }
 
-    if (!bdrv_is_inserted(bs)) {
-        return -ENOMEDIUM;
-    }
-
     if (offset < 0) {
         return -EIO;
     }
@@ -1642,7 +1637,11 @@ int coroutine_fn bdrv_co_preadv_part(BdrvChild *child,
 
     trace_bdrv_co_preadv(bs, offset, bytes, flags);
 
-    ret = bdrv_check_byte_request(bs, offset, bytes);
+    if (!bdrv_is_inserted(bs)) {
+        return -ENOMEDIUM;
+    }
+
+    ret = bdrv_check_byte_request(offset, bytes);
     if (ret < 0) {
         return ret;
     }
@@ -2054,11 +2053,11 @@ int coroutine_fn bdrv_co_pwritev_part(BdrvChild *child,
 
     trace_bdrv_co_pwritev(child->bs, offset, bytes, flags);
 
-    if (!bs->drv) {
+    if (!bdrv_is_inserted(bs)) {
         return -ENOMEDIUM;
     }
 
-    ret = bdrv_check_byte_request(bs, offset, bytes);
+    ret = bdrv_check_byte_request(offset, bytes);
     if (ret < 0) {
         return ret;
     }
@@ -3045,10 +3044,10 @@ static int coroutine_fn bdrv_co_copy_range_internal(
     assert(!(read_flags & BDRV_REQ_NO_FALLBACK));
     assert(!(write_flags & BDRV_REQ_NO_FALLBACK));
 
-    if (!dst || !dst->bs) {
+    if (!dst || !dst->bs || !bdrv_is_inserted(dst->bs)) {
         return -ENOMEDIUM;
     }
-    ret = bdrv_check_byte_request(dst->bs, dst_offset, bytes);
+    ret = bdrv_check_byte_request(dst_offset, bytes);
     if (ret) {
         return ret;
     }
@@ -3056,10 +3055,10 @@ static int coroutine_fn bdrv_co_copy_range_internal(
         return bdrv_co_pwrite_zeroes(dst, dst_offset, bytes, write_flags);
     }
 
-    if (!src || !src->bs) {
+    if (!src || !src->bs || !bdrv_is_inserted(src->bs)) {
         return -ENOMEDIUM;
     }
-    ret = bdrv_check_byte_request(src->bs, src_offset, bytes);
+    ret = bdrv_check_byte_request(src_offset, bytes);
     if (ret) {
         return ret;
     }
-- 
2.21.3



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 4/4] block: introduce BDRV_MAX_LENGTH
  2020-12-03 22:27 [PATCH 0/4] block: prepare for 64bit Vladimir Sementsov-Ogievskiy
                   ` (2 preceding siblings ...)
  2020-12-03 22:27 ` [PATCH 3/4] block/io: bdrv_check_byte_request(): drop bdrv_is_inserted() Vladimir Sementsov-Ogievskiy
@ 2020-12-03 22:27 ` Vladimir Sementsov-Ogievskiy
  2021-01-07  9:58   ` Richard W.M. Jones
  2020-12-08 17:13 ` [PATCH 0/4] block: prepare for 64bit Kevin Wolf
  4 siblings, 1 reply; 17+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-12-03 22:27 UTC (permalink / raw)
  To: qemu-block
  Cc: qemu-devel, fam, stefanha, mreitz, kwolf, eblake, berto, den, vsementsov

We are going to modify block layer to work with 64bit requests. And
first step is moving to int64_t type for both offset and bytes
arguments in all block request related functions.

It's mostly safe (when widening signed or unsigned int to int64_t), but
switching from uint64_t is questionable.

So, let's first establish the set of requests we want to work with.
First signed int64_t should be enough, as off_t is signed anyway. Then,
obviously offset + bytes should not overflow.

And most interesting: (offset + bytes) being aligned up should not
overflow as well. Aligned to what alignment? First thing that comes in
mind is bs->bl.request_alignment, as we align up request to this
alignment. But there is another thing: look at
bdrv_mark_request_serialising(). It aligns request up to some given
alignment. And this parameter may be bdrv_get_cluster_size(), which is
often a lot greater than bs->bl.request_alignment.
Note also, that bdrv_mark_request_serialising() uses signed int64_t for
calculations. So, actually, we already depend on some restrictions.

Happily, bdrv_get_cluster_size() returns int and
bs->bl.request_alignment has 32bit unsigned type, but defined to be a
power of 2 less than INT_MAX. So, we may establish, that INT_MAX is
absolute maximum for any kind of alignment that may occur with the
request.

Note, that bdrv_get_cluster_size() is not documented to return power
of 2, still bdrv_mark_request_serialising() behaves like it is.
Also, backup uses bdi.cluster_size and is not prepared to it not being
power of 2.
So, let's establish that Qemu supports only power-of-2 clusters and
alignments.

So, alignment can't be greater than 2^30.

Finally to be safe with calculations, to not calculate different
maximums for different nodes (depending on cluster size and
request_alignment), let's simply set QEMU_ALIGN_DOWN(INT64_MAX, 2^30)
as absolute maximum bytes length for Qemu. Actually, it's not much less
than INT64_MAX.

OK, then, let's apply it to block/io.

Let's consider all block/io entry points of offset/bytes:

4 bytes/offset interface functions: bdrv_co_preadv_part(),
bdrv_co_pwritev_part(), bdrv_co_copy_range_internal() and
bdrv_co_pdiscard() and we check them all with bdrv_check_request().

We also have one entry point with only offset: bdrv_co_truncate().
Check the offset.

And one public structure: BdrvTrackedRequest. Happily, it has only
three external users:

 file-posix.c: adopted by this patch
 write-threshold.c: only read fields
 test-write-threshold.c: sets obviously small constant values

Better is to make the structure private and add corresponding
interfaces.. Still it's not obvious what kind of interface is needed
for file-posix.c. Let's keep it public but add corresponding
assertions.

After this patch we'll convert functions in block/io.c to int64_t bytes
and offset parameters. We can assume that offset/bytes pair always
satisfy new restrictions, and make
corresponding assertions where needed. If we reach some offset/bytes
point in block/io.c missing bdrv_check_request() it is considered a
bug. As well, if block/io.c modifies a offset/bytes request, expanding
it more then aligning up to request_alignment, it's a bug too.

For all io requests except for discard we keep for now old restriction
of 32bit request length.

iotest 206 output error message changed, as now test disk size is
larger than new limit. Add one more test case with new maximum disk
size to cover too-big-L1 case.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 include/block/block.h        | 10 +++++++
 include/block/block_int.h    |  8 ++++++
 block.c                      | 17 +++++++++++-
 block/file-posix.c           |  6 ++---
 block/io.c                   | 51 +++++++++++++++++++++++++++++-------
 tests/test-write-threshold.c |  4 +++
 tests/qemu-iotests/206       |  2 +-
 tests/qemu-iotests/206.out   |  6 +++++
 8 files changed, 90 insertions(+), 14 deletions(-)

diff --git a/include/block/block.h b/include/block/block.h
index c9d7c58765..5b81e33e94 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -142,6 +142,16 @@ typedef struct HDGeometry {
                                            INT_MAX >> BDRV_SECTOR_BITS)
 #define BDRV_REQUEST_MAX_BYTES (BDRV_REQUEST_MAX_SECTORS << BDRV_SECTOR_BITS)
 
+/*
+ * We want allow aligning requests and disk length up to any 32bit alignment
+ * and don't afraid of overflow.
+ * To achieve it, and in the same time use some pretty number as maximum disk
+ * size, let's define maximum "length" (a limit for any offset/bytes request and
+ * for disk size) to be the greatest power of 2 less than INT64_MAX.
+ */
+#define BDRV_MAX_ALIGNMENT (1L << 30)
+#define BDRV_MAX_LENGTH (QEMU_ALIGN_DOWN(INT64_MAX, BDRV_MAX_ALIGNMENT))
+
 /*
  * Allocation status flags for bdrv_block_status() and friends.
  *
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 95d9333be1..1eeafc118c 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -70,6 +70,12 @@ enum BdrvTrackedRequestType {
     BDRV_TRACKED_TRUNCATE,
 };
 
+/*
+ * That is not quite good that BdrvTrackedRequest structure is public,
+ * as block/io.c is very careful about incoming offset/bytes being
+ * correct. Be sure to assert bdrv_check_request() succeeded after any
+ * modification of BdrvTrackedRequest object out of block/io.c
+ */
 typedef struct BdrvTrackedRequest {
     BlockDriverState *bs;
     int64_t offset;
@@ -87,6 +93,8 @@ typedef struct BdrvTrackedRequest {
     struct BdrvTrackedRequest *waiting_for;
 } BdrvTrackedRequest;
 
+int bdrv_check_request(int64_t offset, int64_t bytes);
+
 struct BlockDriver {
     const char *format_name;
     int instance_size;
diff --git a/block.c b/block.c
index f1cedac362..1a8494e19c 100644
--- a/block.c
+++ b/block.c
@@ -961,6 +961,11 @@ int refresh_total_sectors(BlockDriverState *bs, int64_t hint)
     }
 
     bs->total_sectors = hint;
+
+    if (bs->total_sectors * BDRV_SECTOR_SIZE > BDRV_MAX_LENGTH) {
+        return -EFBIG;
+    }
+
     return 0;
 }
 
@@ -5534,6 +5539,7 @@ void bdrv_get_backing_filename(BlockDriverState *bs,
 
 int bdrv_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
 {
+    int ret;
     BlockDriver *drv = bs->drv;
     /* if bs->drv == NULL, bs is closed, so there's nothing to do here */
     if (!drv) {
@@ -5547,7 +5553,16 @@ int bdrv_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
         return -ENOTSUP;
     }
     memset(bdi, 0, sizeof(*bdi));
-    return drv->bdrv_get_info(bs, bdi);
+    ret = drv->bdrv_get_info(bs, bdi);
+    if (ret < 0) {
+        return ret;
+    }
+
+    if (bdi->cluster_size > BDRV_MAX_ALIGNMENT) {
+        return -EINVAL;
+    }
+
+    return 0;
 }
 
 ImageInfoSpecific *bdrv_get_specific_info(BlockDriverState *bs,
diff --git a/block/file-posix.c b/block/file-posix.c
index 1b35bd6cfa..fd0a8d0dbf 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -2926,7 +2926,6 @@ raw_do_pwrite_zeroes(BlockDriverState *bs, int64_t offset, int bytes,
 #ifdef CONFIG_FALLOCATE
     if (offset + bytes > bs->total_sectors * BDRV_SECTOR_SIZE) {
         BdrvTrackedRequest *req;
-        uint64_t end;
 
         /*
          * This is a workaround for a bug in the Linux XFS driver,
@@ -2950,8 +2949,9 @@ raw_do_pwrite_zeroes(BlockDriverState *bs, int64_t offset, int bytes,
         assert(req->offset <= offset);
         assert(req->offset + req->bytes >= offset + bytes);
 
-        end = INT64_MAX & -(uint64_t)bs->bl.request_alignment;
-        req->bytes = end - req->offset;
+        req->bytes = BDRV_MAX_LENGTH - req->offset;
+
+        assert(bdrv_check_request(req->offset, req->bytes) == 0);
 
         bdrv_mark_request_serialising(req, bs->bl.request_alignment);
     }
diff --git a/block/io.c b/block/io.c
index ef75a5abb4..6343d85476 100644
--- a/block/io.c
+++ b/block/io.c
@@ -176,6 +176,13 @@ void bdrv_refresh_limits(BlockDriverState *bs, Error **errp)
     /* Then let the driver override it */
     if (drv->bdrv_refresh_limits) {
         drv->bdrv_refresh_limits(bs, errp);
+        if (*errp) {
+            return;
+        }
+    }
+
+    if (bs->bl.request_alignment > BDRV_MAX_ALIGNMENT) {
+        error_setg(errp, "Driver requires too large request alignment");
     }
 }
 
@@ -884,13 +891,31 @@ static bool coroutine_fn bdrv_wait_serialising_requests(BdrvTrackedRequest *self
     return waited;
 }
 
-static int bdrv_check_byte_request(int64_t offset, size_t size)
+int bdrv_check_request(int64_t offset, int64_t bytes)
 {
-    if (size > BDRV_REQUEST_MAX_BYTES) {
+    if (offset < 0 || bytes < 0) {
         return -EIO;
     }
 
-    if (offset < 0) {
+    if (bytes > BDRV_MAX_LENGTH) {
+        return -EIO;
+    }
+
+    if (offset > BDRV_MAX_LENGTH - bytes) {
+        return -EIO;
+    }
+
+    return 0;
+}
+
+static int bdrv_check_request32(int64_t offset, int64_t bytes)
+{
+    int ret = bdrv_check_request(offset, bytes);
+    if (ret < 0) {
+        return ret;
+    }
+
+    if (bytes > BDRV_REQUEST_MAX_BYTES) {
         return -EIO;
     }
 
@@ -1641,7 +1666,7 @@ int coroutine_fn bdrv_co_preadv_part(BdrvChild *child,
         return -ENOMEDIUM;
     }
 
-    ret = bdrv_check_byte_request(offset, bytes);
+    ret = bdrv_check_request32(offset, bytes);
     if (ret < 0) {
         return ret;
     }
@@ -2057,7 +2082,7 @@ int coroutine_fn bdrv_co_pwritev_part(BdrvChild *child,
         return -ENOMEDIUM;
     }
 
-    ret = bdrv_check_byte_request(offset, bytes);
+    ret = bdrv_check_request32(offset, bytes);
     if (ret < 0) {
         return ret;
     }
@@ -2787,8 +2812,9 @@ int coroutine_fn bdrv_co_pdiscard(BdrvChild *child, int64_t offset,
         return -EPERM;
     }
 
-    if (offset < 0 || bytes < 0 || bytes > INT64_MAX - offset) {
-        return -EIO;
+    ret = bdrv_check_request(offset, bytes);
+    if (ret < 0) {
+        return ret;
     }
 
     /* Do nothing if disabled.  */
@@ -3047,7 +3073,7 @@ static int coroutine_fn bdrv_co_copy_range_internal(
     if (!dst || !dst->bs || !bdrv_is_inserted(dst->bs)) {
         return -ENOMEDIUM;
     }
-    ret = bdrv_check_byte_request(dst_offset, bytes);
+    ret = bdrv_check_request32(dst_offset, bytes);
     if (ret) {
         return ret;
     }
@@ -3058,7 +3084,7 @@ static int coroutine_fn bdrv_co_copy_range_internal(
     if (!src || !src->bs || !bdrv_is_inserted(src->bs)) {
         return -ENOMEDIUM;
     }
-    ret = bdrv_check_byte_request(src_offset, bytes);
+    ret = bdrv_check_request32(src_offset, bytes);
     if (ret) {
         return ret;
     }
@@ -3188,6 +3214,13 @@ int coroutine_fn bdrv_co_truncate(BdrvChild *child, int64_t offset, bool exact,
         return -EINVAL;
     }
 
+    ret = bdrv_check_request(offset, 0);
+    if (ret < 0) {
+        error_setg(errp, "Required too big image size, it must be not greater "
+                   "than %" PRId64, BDRV_MAX_LENGTH);
+        return ret;
+    }
+
     old_size = bdrv_getlength(bs);
     if (old_size < 0) {
         error_setg_errno(errp, -old_size, "Failed to get old image size");
diff --git a/tests/test-write-threshold.c b/tests/test-write-threshold.c
index 97ca12f710..4cf032652d 100644
--- a/tests/test-write-threshold.c
+++ b/tests/test-write-threshold.c
@@ -64,6 +64,8 @@ static void test_threshold_not_trigger(void)
     req.offset = 1024;
     req.bytes = 1024;
 
+    assert(bdrv_check_request(req.offset, req.bytes) == 0);
+
     bdrv_write_threshold_set(&bs, threshold);
     amount = bdrv_write_threshold_exceeded(&bs, &req);
     g_assert_cmpuint(amount, ==, 0);
@@ -82,6 +84,8 @@ static void test_threshold_trigger(void)
     req.offset = (4 * 1024 * 1024) - 1024;
     req.bytes = 2 * 1024;
 
+    assert(bdrv_check_request(req.offset, req.bytes) == 0);
+
     bdrv_write_threshold_set(&bs, threshold);
     amount = bdrv_write_threshold_exceeded(&bs, &req);
     g_assert_cmpuint(amount, >=, 1024);
diff --git a/tests/qemu-iotests/206 b/tests/qemu-iotests/206
index 11bc51f256..021d94142e 100755
--- a/tests/qemu-iotests/206
+++ b/tests/qemu-iotests/206
@@ -201,7 +201,7 @@ with iotests.FilePath('t.qcow2') as disk_path, \
 
     vm.launch()
     for size in [ 1234, 18446744073709551104, 9223372036854775808,
-                  9223372036854775296 ]:
+                  9223372036854775296, 9223372035781033984 ]:
         vm.blockdev_create({ 'driver': imgfmt,
                              'file': 'node0',
                              'size': size })
diff --git a/tests/qemu-iotests/206.out b/tests/qemu-iotests/206.out
index a100849fcb..e8a36de00b 100644
--- a/tests/qemu-iotests/206.out
+++ b/tests/qemu-iotests/206.out
@@ -180,6 +180,12 @@ Job failed: Could not resize image: Image size cannot be negative
 
 {"execute": "blockdev-create", "arguments": {"job-id": "job0", "options": {"driver": "qcow2", "file": "node0", "size": 9223372036854775296}}}
 {"return": {}}
+Job failed: Could not resize image: Required too big image size, it must be not greater than 9223372035781033984
+{"execute": "job-dismiss", "arguments": {"id": "job0"}}
+{"return": {}}
+
+{"execute": "blockdev-create", "arguments": {"job-id": "job0", "options": {"driver": "qcow2", "file": "node0", "size": 9223372035781033984}}}
+{"return": {}}
 Job failed: Could not resize image: Failed to grow the L1 table: File too large
 {"execute": "job-dismiss", "arguments": {"id": "job0"}}
 {"return": {}}
-- 
2.21.3



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/4] block/io: bdrv_refresh_limits(): use ERRP_GUARD
  2020-12-03 22:27 ` [PATCH 2/4] block/io: bdrv_refresh_limits(): use ERRP_GUARD Vladimir Sementsov-Ogievskiy
@ 2020-12-04 15:08   ` Alberto Garcia
  0 siblings, 0 replies; 17+ messages in thread
From: Alberto Garcia @ 2020-12-04 15:08 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: fam, kwolf, vsementsov, qemu-devel, mreitz, stefanha, den

On Thu 03 Dec 2020 11:27:11 PM CET, Vladimir Sementsov-Ogievskiy wrote:
> This simplifies following commit.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

Reviewed-by: Alberto Garcia <berto@igalia.com>

Berto


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 3/4] block/io: bdrv_check_byte_request(): drop bdrv_is_inserted()
  2020-12-03 22:27 ` [PATCH 3/4] block/io: bdrv_check_byte_request(): drop bdrv_is_inserted() Vladimir Sementsov-Ogievskiy
@ 2020-12-04 15:16   ` Alberto Garcia
  0 siblings, 0 replies; 17+ messages in thread
From: Alberto Garcia @ 2020-12-04 15:16 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: fam, kwolf, vsementsov, qemu-devel, mreitz, stefanha, den

On Thu 03 Dec 2020 11:27:12 PM CET, Vladimir Sementsov-Ogievskiy wrote:
> Move bdrv_is_inserted() calls into callers.
>
> We are going to make bdrv_check_byte_request() a clean thing.
> bdrv_is_inserted() is not about checking the request, it's about
> checking the bs. So, it should be separate.
>
> With this patch we probably change error path for some failure
> scenarios. But depending on the fact that querying too big request on
> empty cdrom (or corrupted qcow2 node with no drv) will result in EIO
> and not ENOMEDIUM would be very strange. More over, we are going to
> move to 64bit requests, so larger requests will be allowed anyway.
>
> More over, keeping in mind that cdrom is the only driver that has
> .bdrv_is_inserted() handler it's strange that we should care so much
> about it in generic block layer, intuitively we should just do read and
> write, and cdrom driver should return correct errors if it is not
> inserted. But it's a work for another series.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

Reviewed-by: Alberto Garcia <berto@igalia.com>

Berto


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 0/4] block: prepare for 64bit
  2020-12-03 22:27 [PATCH 0/4] block: prepare for 64bit Vladimir Sementsov-Ogievskiy
                   ` (3 preceding siblings ...)
  2020-12-03 22:27 ` [PATCH 4/4] block: introduce BDRV_MAX_LENGTH Vladimir Sementsov-Ogievskiy
@ 2020-12-08 17:13 ` Kevin Wolf
  2020-12-08 17:32   ` Vladimir Sementsov-Ogievskiy
  4 siblings, 1 reply; 17+ messages in thread
From: Kevin Wolf @ 2020-12-08 17:13 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: fam, berto, qemu-block, qemu-devel, mreitz, stefanha, den

Am 03.12.2020 um 23:27 hat Vladimir Sementsov-Ogievskiy geschrieben:
> Hi all!
> 
> This is a preparation series for v4 of "[PATCH v3 00/17] 64bit
> block-layer".
> 
> The whole thing is in 04, and 01-03 are small preparations.

Thanks, applied to the block branch.

Kevin



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 0/4] block: prepare for 64bit
  2020-12-08 17:13 ` [PATCH 0/4] block: prepare for 64bit Kevin Wolf
@ 2020-12-08 17:32   ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 17+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-12-08 17:32 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: qemu-block, qemu-devel, fam, stefanha, mreitz, eblake, berto, den

08.12.2020 20:13, Kevin Wolf wrote:
> Am 03.12.2020 um 23:27 hat Vladimir Sementsov-Ogievskiy geschrieben:
>> Hi all!
>>
>> This is a preparation series for v4 of "[PATCH v3 00/17] 64bit
>> block-layer".
>>
>> The whole thing is in 04, and 01-03 are small preparations.
> 
> Thanks, applied to the block branch.
> 

Thank you!

-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 4/4] block: introduce BDRV_MAX_LENGTH
  2020-12-03 22:27 ` [PATCH 4/4] block: introduce BDRV_MAX_LENGTH Vladimir Sementsov-Ogievskiy
@ 2021-01-07  9:58   ` Richard W.M. Jones
  2021-01-07 10:56     ` Richard W.M. Jones
  2021-01-08 10:51     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 2 replies; 17+ messages in thread
From: Richard W.M. Jones @ 2021-01-07  9:58 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: fam, kwolf, berto, qemu-block, qemu-devel, mreitz, stefanha, den

On Fri, Dec 04, 2020 at 01:27:13AM +0300, Vladimir Sementsov-Ogievskiy wrote:
> Finally to be safe with calculations, to not calculate different
> maximums for different nodes (depending on cluster size and
> request_alignment), let's simply set QEMU_ALIGN_DOWN(INT64_MAX, 2^30)
> as absolute maximum bytes length for Qemu. Actually, it's not much less
> than INT64_MAX.

> +/*
> + * We want allow aligning requests and disk length up to any 32bit alignment
> + * and don't afraid of overflow.
> + * To achieve it, and in the same time use some pretty number as maximum disk
> + * size, let's define maximum "length" (a limit for any offset/bytes request and
> + * for disk size) to be the greatest power of 2 less than INT64_MAX.
> + */
> +#define BDRV_MAX_ALIGNMENT (1L << 30)
> +#define BDRV_MAX_LENGTH (QEMU_ALIGN_DOWN(INT64_MAX, BDRV_MAX_ALIGNMENT))

This change broke nbdkit tests.

We test that qemu can handle a qemu NBD export of size 2^63 - 512, the
largest size that (experimentally) we found qemu could safely handle.
eg:

  https://github.com/libguestfs/nbdkit/blob/master/tests/test-memory-largest-for-qemu.sh

Before this commit:

  $ nbdkit memory $(( 2**63 - 512 )) --run './qemu-img info "$uri"'
  image: nbd://localhost:10809
  file format: raw
  virtual size: 8 EiB (9223372036854775296 bytes)
  disk size: unavailable

After this commit:

  $ nbdkit memory $(( 2**63 - 512 )) --run './qemu-img info "$uri"'
  qemu-img: Could not open 'nbd://localhost:10809': Could not refresh total sector count: File too large

Can I confirm that this limit is now the new official one and we
should adjust nbdkit tests?  Or was this change unintentional given
that qemu seemed happy to handle 2^63 - 512 disks before?

Note that nbdkit & libnbd support up to 2^63 - 1 bytes (we are not
limited to whole sectors).  Also the Linux kernel will let you create
a /dev/nbdX device of size 2^63 - 1.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
libguestfs lets you edit virtual machines.  Supports shell scripting,
bindings from many languages.  http://libguestfs.org



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 4/4] block: introduce BDRV_MAX_LENGTH
  2021-01-07  9:58   ` Richard W.M. Jones
@ 2021-01-07 10:56     ` Richard W.M. Jones
  2021-01-07 12:20       ` Richard W.M. Jones
  2021-01-08 10:51     ` Vladimir Sementsov-Ogievskiy
  1 sibling, 1 reply; 17+ messages in thread
From: Richard W.M. Jones @ 2021-01-07 10:56 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: fam, kwolf, berto, qemu-block, qemu-devel, mreitz, stefanha, den

On Thu, Jan 07, 2021 at 09:58:17AM +0000, Richard W.M. Jones wrote:
> On Fri, Dec 04, 2020 at 01:27:13AM +0300, Vladimir Sementsov-Ogievskiy wrote:
> > Finally to be safe with calculations, to not calculate different
> > maximums for different nodes (depending on cluster size and
> > request_alignment), let's simply set QEMU_ALIGN_DOWN(INT64_MAX, 2^30)
> > as absolute maximum bytes length for Qemu. Actually, it's not much less
> > than INT64_MAX.
> 
> > +/*
> > + * We want allow aligning requests and disk length up to any 32bit alignment
> > + * and don't afraid of overflow.
> > + * To achieve it, and in the same time use some pretty number as maximum disk
> > + * size, let's define maximum "length" (a limit for any offset/bytes request and
> > + * for disk size) to be the greatest power of 2 less than INT64_MAX.
> > + */
> > +#define BDRV_MAX_ALIGNMENT (1L << 30)
> > +#define BDRV_MAX_LENGTH (QEMU_ALIGN_DOWN(INT64_MAX, BDRV_MAX_ALIGNMENT))
> 
> This change broke nbdkit tests.

Actually that's not the only problem.  It appears that we're unable to
read or write the last sector of this disk:

$ nbdkit memory $(( 2**63 - 2**30 )) --run 'build/qemu-io -r -f raw "$uri" -c "r -v $(( 2**63 - 2**30 - 512 )) 512" ' 
read failed: Input/output error

$ nbdkit memory $(( 2**63 - 2**30 )) --run 'build/qemu-io -f raw "$uri" -c "w -P 3 $(( 2**63 - 2**30 - 512 )) 512" ' 
write failed: Input/output error

You can play around with the constants.  I found it's possible to read
and write the non-aligned 512 bytes starting at 2^63-2^30-513.  Could
be a fencepost error somewhere in qemu?

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-builder quickly builds VMs from scratch
http://libguestfs.org/virt-builder.1.html



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 4/4] block: introduce BDRV_MAX_LENGTH
  2021-01-07 10:56     ` Richard W.M. Jones
@ 2021-01-07 12:20       ` Richard W.M. Jones
  2021-01-08 11:14         ` Vladimir Sementsov-Ogievskiy
  2021-02-04 14:05         ` Eric Blake
  0 siblings, 2 replies; 17+ messages in thread
From: Richard W.M. Jones @ 2021-01-07 12:20 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: fam, kwolf, berto, qemu-block, qemu-devel, mreitz, berrange,
	stefanha, den

On Thu, Jan 07, 2021 at 10:56:12AM +0000, Richard W.M. Jones wrote:
> On Thu, Jan 07, 2021 at 09:58:17AM +0000, Richard W.M. Jones wrote:
> > On Fri, Dec 04, 2020 at 01:27:13AM +0300, Vladimir Sementsov-Ogievskiy wrote:
> > > Finally to be safe with calculations, to not calculate different
> > > maximums for different nodes (depending on cluster size and
> > > request_alignment), let's simply set QEMU_ALIGN_DOWN(INT64_MAX, 2^30)
> > > as absolute maximum bytes length for Qemu. Actually, it's not much less
> > > than INT64_MAX.
> > 
> > > +/*
> > > + * We want allow aligning requests and disk length up to any 32bit alignment
> > > + * and don't afraid of overflow.
> > > + * To achieve it, and in the same time use some pretty number as maximum disk
> > > + * size, let's define maximum "length" (a limit for any offset/bytes request and
> > > + * for disk size) to be the greatest power of 2 less than INT64_MAX.
> > > + */
> > > +#define BDRV_MAX_ALIGNMENT (1L << 30)
> > > +#define BDRV_MAX_LENGTH (QEMU_ALIGN_DOWN(INT64_MAX, BDRV_MAX_ALIGNMENT))
> > 
> > This change broke nbdkit tests.
> 
> Actually that's not the only problem.  It appears that we're unable to
> read or write the last sector of this disk:
> 
> $ nbdkit memory $(( 2**63 - 2**30 )) --run 'build/qemu-io -r -f raw "$uri" -c "r -v $(( 2**63 - 2**30 - 512 )) 512" ' 
> read failed: Input/output error
> 
> $ nbdkit memory $(( 2**63 - 2**30 )) --run 'build/qemu-io -f raw "$uri" -c "w -P 3 $(( 2**63 - 2**30 - 512 )) 512" ' 
> write failed: Input/output error
> 
> You can play around with the constants.  I found it's possible to read
> and write the non-aligned 512 bytes starting at 2^63-2^30-513.  Could
> be a fencepost error somewhere in qemu?

Actually this is a pre-existing bug in qemu.

What happens is qemu-io calls qemu_strtosz("9223372035781033472")
which returns 0x7fffffffc0000000 and no error.  That answer is plain
flat out wrong.  The reason for that is qemu_strtosz uses floating
point for the calculation(!) so is limited to 53 bits of precision and
silently truncates.

It happened we were just lucky that our earlier test with
2^63 - 1024 worked.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
libguestfs lets you edit virtual machines.  Supports shell scripting,
bindings from many languages.  http://libguestfs.org



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 4/4] block: introduce BDRV_MAX_LENGTH
  2021-01-07  9:58   ` Richard W.M. Jones
  2021-01-07 10:56     ` Richard W.M. Jones
@ 2021-01-08 10:51     ` Vladimir Sementsov-Ogievskiy
  2021-01-08 11:02       ` Richard W.M. Jones
  1 sibling, 1 reply; 17+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-01-08 10:51 UTC (permalink / raw)
  To: Richard W.M. Jones
  Cc: fam, kwolf, berto, qemu-block, qemu-devel, mreitz, stefanha, den

07.01.2021 12:58, Richard W.M. Jones wrote:
> On Fri, Dec 04, 2020 at 01:27:13AM +0300, Vladimir Sementsov-Ogievskiy wrote:
>> Finally to be safe with calculations, to not calculate different
>> maximums for different nodes (depending on cluster size and
>> request_alignment), let's simply set QEMU_ALIGN_DOWN(INT64_MAX, 2^30)
>> as absolute maximum bytes length for Qemu. Actually, it's not much less
>> than INT64_MAX.
> 
>> +/*
>> + * We want allow aligning requests and disk length up to any 32bit alignment
>> + * and don't afraid of overflow.
>> + * To achieve it, and in the same time use some pretty number as maximum disk
>> + * size, let's define maximum "length" (a limit for any offset/bytes request and
>> + * for disk size) to be the greatest power of 2 less than INT64_MAX.
>> + */
>> +#define BDRV_MAX_ALIGNMENT (1L << 30)
>> +#define BDRV_MAX_LENGTH (QEMU_ALIGN_DOWN(INT64_MAX, BDRV_MAX_ALIGNMENT))
> 
> This change broke nbdkit tests.
> 
> We test that qemu can handle a qemu NBD export of size 2^63 - 512, the
> largest size that (experimentally) we found qemu could safely handle.
> eg:
> 
>    https://github.com/libguestfs/nbdkit/blob/master/tests/test-memory-largest-for-qemu.sh
> 
> Before this commit:
> 
>    $ nbdkit memory $(( 2**63 - 512 )) --run './qemu-img info "$uri"'
>    image: nbd://localhost:10809
>    file format: raw
>    virtual size: 8 EiB (9223372036854775296 bytes)
>    disk size: unavailable
> 
> After this commit:
> 
>    $ nbdkit memory $(( 2**63 - 512 )) --run './qemu-img info "$uri"'
>    qemu-img: Could not open 'nbd://localhost:10809': Could not refresh total sector count: File too large
> 
> Can I confirm that this limit is now the new official one and we
> should adjust nbdkit tests?  Or was this change unintentional given
> that qemu seemed happy to handle 2^63 - 512 disks before?
> 
> Note that nbdkit & libnbd support up to 2^63 - 1 bytes (we are not
> limited to whole sectors).  Also the Linux kernel will let you create
> a /dev/nbdX device of size 2^63 - 1.
> 

Hi Rich! The change is intentional.

I think the benefit of having clean limit, allowing us to align up bytes to some alignment (which is a common operation) exceeds the loss of 1G at the end of 2**63 range. We get simpler and safer code. And anyway, new limit is not much worse than 2**63. If at some point we have a problem with it being too restrictive, it's than very likely that 2**63 would be too small as well, which will require so much work that our a bit more restrictive limit is unlikely to increase the difficulty.


-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 4/4] block: introduce BDRV_MAX_LENGTH
  2021-01-08 10:51     ` Vladimir Sementsov-Ogievskiy
@ 2021-01-08 11:02       ` Richard W.M. Jones
  0 siblings, 0 replies; 17+ messages in thread
From: Richard W.M. Jones @ 2021-01-08 11:02 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: fam, kwolf, berto, qemu-block, qemu-devel, mreitz, stefanha, den

On Fri, Jan 08, 2021 at 01:51:35PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> 07.01.2021 12:58, Richard W.M. Jones wrote:
> >On Fri, Dec 04, 2020 at 01:27:13AM +0300, Vladimir Sementsov-Ogievskiy wrote:
> >>Finally to be safe with calculations, to not calculate different
> >>maximums for different nodes (depending on cluster size and
> >>request_alignment), let's simply set QEMU_ALIGN_DOWN(INT64_MAX, 2^30)
> >>as absolute maximum bytes length for Qemu. Actually, it's not much less
> >>than INT64_MAX.
> >
> >>+/*
> >>+ * We want allow aligning requests and disk length up to any 32bit alignment
> >>+ * and don't afraid of overflow.
> >>+ * To achieve it, and in the same time use some pretty number as maximum disk
> >>+ * size, let's define maximum "length" (a limit for any offset/bytes request and
> >>+ * for disk size) to be the greatest power of 2 less than INT64_MAX.
> >>+ */
> >>+#define BDRV_MAX_ALIGNMENT (1L << 30)
> >>+#define BDRV_MAX_LENGTH (QEMU_ALIGN_DOWN(INT64_MAX, BDRV_MAX_ALIGNMENT))
> >
> >This change broke nbdkit tests.
> >
> >We test that qemu can handle a qemu NBD export of size 2^63 - 512, the
> >largest size that (experimentally) we found qemu could safely handle.
> >eg:
> >
> >   https://github.com/libguestfs/nbdkit/blob/master/tests/test-memory-largest-for-qemu.sh
> >
> >Before this commit:
> >
> >   $ nbdkit memory $(( 2**63 - 512 )) --run './qemu-img info "$uri"'
> >   image: nbd://localhost:10809
> >   file format: raw
> >   virtual size: 8 EiB (9223372036854775296 bytes)
> >   disk size: unavailable
> >
> >After this commit:
> >
> >   $ nbdkit memory $(( 2**63 - 512 )) --run './qemu-img info "$uri"'
> >   qemu-img: Could not open 'nbd://localhost:10809': Could not refresh total sector count: File too large
> >
> >Can I confirm that this limit is now the new official one and we
> >should adjust nbdkit tests?  Or was this change unintentional given
> >that qemu seemed happy to handle 2^63 - 512 disks before?
> >
> >Note that nbdkit & libnbd support up to 2^63 - 1 bytes (we are not
> >limited to whole sectors).  Also the Linux kernel will let you create
> >a /dev/nbdX device of size 2^63 - 1.
> >
> 
> Hi Rich! The change is intentional.
>
> I think the benefit of having clean limit, allowing us to align up
> bytes to some alignment (which is a common operation) exceeds the
> loss of 1G at the end of 2**63 range. We get simpler and safer
> code. And anyway, new limit is not much worse than 2**63.

That's fine, as long as we settle on this.  I've updated the nbdkit tests:

https://github.com/libguestfs/nbdkit/commit/c3ec8c951e39a0f921252c162c236f23c588d2bd

> If at some
> point we have a problem with it being too restrictive, it's than
> very likely that 2**63 would be too small as well, which will
> require so much work that our a bit more restrictive limit is
> unlikely to increase the difficulty.

The next step is definitely working on 128 bit offsets!
https://rwmj.wordpress.com/2011/10/03/when-will-disk-sizes-go-beyond-64-bits/

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine.  Supports Linux and Windows.
http://people.redhat.com/~rjones/virt-df/



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 4/4] block: introduce BDRV_MAX_LENGTH
  2021-01-07 12:20       ` Richard W.M. Jones
@ 2021-01-08 11:14         ` Vladimir Sementsov-Ogievskiy
  2021-01-08 11:27           ` Daniel P. Berrangé
  2021-02-04 14:05         ` Eric Blake
  1 sibling, 1 reply; 17+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-01-08 11:14 UTC (permalink / raw)
  To: Richard W.M. Jones
  Cc: fam, kwolf, berto, qemu-block, qemu-devel, mreitz, berrange,
	stefanha, den

07.01.2021 15:20, Richard W.M. Jones wrote:
> On Thu, Jan 07, 2021 at 10:56:12AM +0000, Richard W.M. Jones wrote:
>> On Thu, Jan 07, 2021 at 09:58:17AM +0000, Richard W.M. Jones wrote:
>>> On Fri, Dec 04, 2020 at 01:27:13AM +0300, Vladimir Sementsov-Ogievskiy wrote:
>>>> Finally to be safe with calculations, to not calculate different
>>>> maximums for different nodes (depending on cluster size and
>>>> request_alignment), let's simply set QEMU_ALIGN_DOWN(INT64_MAX, 2^30)
>>>> as absolute maximum bytes length for Qemu. Actually, it's not much less
>>>> than INT64_MAX.
>>>
>>>> +/*
>>>> + * We want allow aligning requests and disk length up to any 32bit alignment
>>>> + * and don't afraid of overflow.
>>>> + * To achieve it, and in the same time use some pretty number as maximum disk
>>>> + * size, let's define maximum "length" (a limit for any offset/bytes request and
>>>> + * for disk size) to be the greatest power of 2 less than INT64_MAX.
>>>> + */
>>>> +#define BDRV_MAX_ALIGNMENT (1L << 30)
>>>> +#define BDRV_MAX_LENGTH (QEMU_ALIGN_DOWN(INT64_MAX, BDRV_MAX_ALIGNMENT))
>>>
>>> This change broke nbdkit tests.
>>
>> Actually that's not the only problem.  It appears that we're unable to
>> read or write the last sector of this disk:
>>
>> $ nbdkit memory $(( 2**63 - 2**30 )) --run 'build/qemu-io -r -f raw "$uri" -c "r -v $(( 2**63 - 2**30 - 512 )) 512" '
>> read failed: Input/output error
>>
>> $ nbdkit memory $(( 2**63 - 2**30 )) --run 'build/qemu-io -f raw "$uri" -c "w -P 3 $(( 2**63 - 2**30 - 512 )) 512" '
>> write failed: Input/output error
>>
>> You can play around with the constants.  I found it's possible to read
>> and write the non-aligned 512 bytes starting at 2^63-2^30-513.  Could
>> be a fencepost error somewhere in qemu?
> 
> Actually this is a pre-existing bug in qemu.
> 
> What happens is qemu-io calls qemu_strtosz("9223372035781033472")
> which returns 0x7fffffffc0000000 and no error.  That answer is plain
> flat out wrong.  The reason for that is qemu_strtosz uses floating
> point for the calculation(!) so is limited to 53 bits of precision and
> silently truncates.

Hmm.. Seems it wants to support floats, like 1.5G.. Don't seem to be very useful, but why not. I think we should call qemu_strtou64 first, and only if it fails or leaves '.', 'e' or 'E' (anything else?) as end character we should do all this floating point logic.

> 
> It happened we were just lucky that our earlier test with
> 2^63 - 1024 worked.
> 
> Rich.
> 


-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 4/4] block: introduce BDRV_MAX_LENGTH
  2021-01-08 11:14         ` Vladimir Sementsov-Ogievskiy
@ 2021-01-08 11:27           ` Daniel P. Berrangé
  0 siblings, 0 replies; 17+ messages in thread
From: Daniel P. Berrangé @ 2021-01-08 11:27 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: fam, kwolf, berto, qemu-block, Richard W.M. Jones, qemu-devel,
	stefanha, den, mreitz

On Fri, Jan 08, 2021 at 02:14:30PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> 07.01.2021 15:20, Richard W.M. Jones wrote:
> > On Thu, Jan 07, 2021 at 10:56:12AM +0000, Richard W.M. Jones wrote:
> > > On Thu, Jan 07, 2021 at 09:58:17AM +0000, Richard W.M. Jones wrote:
> > > > On Fri, Dec 04, 2020 at 01:27:13AM +0300, Vladimir Sementsov-Ogievskiy wrote:
> > > > > Finally to be safe with calculations, to not calculate different
> > > > > maximums for different nodes (depending on cluster size and
> > > > > request_alignment), let's simply set QEMU_ALIGN_DOWN(INT64_MAX, 2^30)
> > > > > as absolute maximum bytes length for Qemu. Actually, it's not much less
> > > > > than INT64_MAX.
> > > > 
> > > > > +/*
> > > > > + * We want allow aligning requests and disk length up to any 32bit alignment
> > > > > + * and don't afraid of overflow.
> > > > > + * To achieve it, and in the same time use some pretty number as maximum disk
> > > > > + * size, let's define maximum "length" (a limit for any offset/bytes request and
> > > > > + * for disk size) to be the greatest power of 2 less than INT64_MAX.
> > > > > + */
> > > > > +#define BDRV_MAX_ALIGNMENT (1L << 30)
> > > > > +#define BDRV_MAX_LENGTH (QEMU_ALIGN_DOWN(INT64_MAX, BDRV_MAX_ALIGNMENT))
> > > > 
> > > > This change broke nbdkit tests.
> > > 
> > > Actually that's not the only problem.  It appears that we're unable to
> > > read or write the last sector of this disk:
> > > 
> > > $ nbdkit memory $(( 2**63 - 2**30 )) --run 'build/qemu-io -r -f raw "$uri" -c "r -v $(( 2**63 - 2**30 - 512 )) 512" '
> > > read failed: Input/output error
> > > 
> > > $ nbdkit memory $(( 2**63 - 2**30 )) --run 'build/qemu-io -f raw "$uri" -c "w -P 3 $(( 2**63 - 2**30 - 512 )) 512" '
> > > write failed: Input/output error
> > > 
> > > You can play around with the constants.  I found it's possible to read
> > > and write the non-aligned 512 bytes starting at 2^63-2^30-513.  Could
> > > be a fencepost error somewhere in qemu?
> > 
> > Actually this is a pre-existing bug in qemu.
> > 
> > What happens is qemu-io calls qemu_strtosz("9223372035781033472")
> > which returns 0x7fffffffc0000000 and no error.  That answer is plain
> > flat out wrong.  The reason for that is qemu_strtosz uses floating
> > point for the calculation(!) so is limited to 53 bits of precision and
> > silently truncates.
> 
> Hmm.. Seems it wants to support floats, like 1.5G.. Don't seem to be very useful, but why not. I think we should call qemu_strtou64 first, and only if it fails or leaves '.', 'e' or 'E' (anything else?) as end character we should do all this floating point logic.

I'm not sure we actually care about the floating point exponent
syntax suffixes. The test suite doesn't cover that format, only
the plain fractional parts.

It should be possible to avoid floating point entirely if we just
split on "." and then use int64 arithmetic to scale the two parts
before recombining them. 


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 4/4] block: introduce BDRV_MAX_LENGTH
  2021-01-07 12:20       ` Richard W.M. Jones
  2021-01-08 11:14         ` Vladimir Sementsov-Ogievskiy
@ 2021-02-04 14:05         ` Eric Blake
  1 sibling, 0 replies; 17+ messages in thread
From: Eric Blake @ 2021-02-04 14:05 UTC (permalink / raw)
  To: Richard W.M. Jones, Vladimir Sementsov-Ogievskiy
  Cc: fam, kwolf, berto, qemu-block, qemu-devel, mreitz, stefanha, den,
	berrange

On 1/7/21 6:20 AM, Richard W.M. Jones wrote:

>> Actually that's not the only problem.  It appears that we're unable to
>> read or write the last sector of this disk:
>>
>> $ nbdkit memory $(( 2**63 - 2**30 )) --run 'build/qemu-io -r -f raw "$uri" -c "r -v $(( 2**63 - 2**30 - 512 )) 512" ' 
>> read failed: Input/output error
>>
>> $ nbdkit memory $(( 2**63 - 2**30 )) --run 'build/qemu-io -f raw "$uri" -c "w -P 3 $(( 2**63 - 2**30 - 512 )) 512" ' 
>> write failed: Input/output error
>>
>> You can play around with the constants.  I found it's possible to read
>> and write the non-aligned 512 bytes starting at 2^63-2^30-513.  Could
>> be a fencepost error somewhere in qemu?
> 
> Actually this is a pre-existing bug in qemu.
> 
> What happens is qemu-io calls qemu_strtosz("9223372035781033472")
> which returns 0x7fffffffc0000000 and no error.  That answer is plain
> flat out wrong.  The reason for that is qemu_strtosz uses floating
> point for the calculation(!) so is limited to 53 bits of precision and
> silently truncates.

I'm working a patch for that soon.  It was easy to fix things to parse
with full 64 bits of precision while still allowing a fractional bump
(for things like 1.5M), but harder to chase down all the spots in the
testsuite that are impacted by our parser now being more accurate.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2021-02-04 14:08 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-12-03 22:27 [PATCH 0/4] block: prepare for 64bit Vladimir Sementsov-Ogievskiy
2020-12-03 22:27 ` [PATCH 1/4] block/file-posix: fix workaround in raw_do_pwrite_zeroes() Vladimir Sementsov-Ogievskiy
2020-12-03 22:27 ` [PATCH 2/4] block/io: bdrv_refresh_limits(): use ERRP_GUARD Vladimir Sementsov-Ogievskiy
2020-12-04 15:08   ` Alberto Garcia
2020-12-03 22:27 ` [PATCH 3/4] block/io: bdrv_check_byte_request(): drop bdrv_is_inserted() Vladimir Sementsov-Ogievskiy
2020-12-04 15:16   ` Alberto Garcia
2020-12-03 22:27 ` [PATCH 4/4] block: introduce BDRV_MAX_LENGTH Vladimir Sementsov-Ogievskiy
2021-01-07  9:58   ` Richard W.M. Jones
2021-01-07 10:56     ` Richard W.M. Jones
2021-01-07 12:20       ` Richard W.M. Jones
2021-01-08 11:14         ` Vladimir Sementsov-Ogievskiy
2021-01-08 11:27           ` Daniel P. Berrangé
2021-02-04 14:05         ` Eric Blake
2021-01-08 10:51     ` Vladimir Sementsov-Ogievskiy
2021-01-08 11:02       ` Richard W.M. Jones
2020-12-08 17:13 ` [PATCH 0/4] block: prepare for 64bit Kevin Wolf
2020-12-08 17:32   ` Vladimir Sementsov-Ogievskiy

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.