qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/4] qcow2: Misc BDRV_SECTOR_SIZE updates
@ 2020-01-09 19:12 Alberto Garcia
  2020-01-09 19:12 ` [PATCH v2 1/4] qcow2: Require that the virtual size is a multiple of the sector size Alberto Garcia
                   ` (3 more replies)
  0 siblings, 4 replies; 23+ messages in thread
From: Alberto Garcia @ 2020-01-09 19:12 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Alberto Garcia, qemu-block, Max Reitz, Nir Soffer

This small series gets rid of all the remaining instances of hardcoded
sector sizes in the qcow2 code and adds a check for images whose
virtual size is not a multiple of the sector size.

See the individual patches for details.

Berto

v2:
- Modify output of iotest 080 to make it easier to understand [Nir]
- Use the QEMU_IS_ALIGNED() macro instead of the modulus operator [Nir]
- Tighten some assertions [Kevin]

v1: https://lists.gnu.org/archive/html/qemu-block/2020-01/msg00139.html

Alberto Garcia (4):
  qcow2: Require that the virtual size is a multiple of the sector size
  qcow2: Don't round the L1 table allocation up to the sector size
  qcow2: Tighten cluster_offset alignment assertions
  qcow2: Use BDRV_SECTOR_SIZE instead of the hardcoded value

 block/qcow2-cluster.c      |  7 +++----
 block/qcow2-refcount.c     |  2 +-
 block/qcow2-snapshot.c     |  3 +--
 block/qcow2.c              | 28 +++++++++++++++++-----------
 docs/interop/qcow2.txt     |  3 ++-
 tests/qemu-iotests/080     |  8 ++++++++
 tests/qemu-iotests/080.out |  5 +++++
 7 files changed, 37 insertions(+), 19 deletions(-)

-- 
2.20.1



^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH v2 1/4] qcow2: Require that the virtual size is a multiple of the sector size
  2020-01-09 19:12 [PATCH v2 0/4] qcow2: Misc BDRV_SECTOR_SIZE updates Alberto Garcia
@ 2020-01-09 19:12 ` Alberto Garcia
  2020-01-14 13:01   ` Max Reitz
  2020-01-09 19:13 ` [PATCH v2 2/4] qcow2: Don't round the L1 table allocation up to " Alberto Garcia
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 23+ messages in thread
From: Alberto Garcia @ 2020-01-09 19:12 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Alberto Garcia, qemu-block, Max Reitz, Nir Soffer

The qcow2 header specifies the virtual size of the image in bytes, but
BlockDriverState stores it as a number of 512-byte sectors.

If the user tries to create an image with a size that is not a
multiple of the sector size then this is fixed on creation by
silently rounding the image size up (see commit c2eb918e32).
qcow2_co_truncate() is more strict and returns an error instead.

However when an image is opened the virtual size is rounded down,
which means that trying to access the last few advertised bytes will
result in an error. As seen above QEMU cannot create such images and
there's no good use case that would require us to try to handle them
so let's just treat them as unsupported.

Signed-off-by: Alberto Garcia <berto@igalia.com>
---
 block/qcow2.c              | 7 +++++++
 docs/interop/qcow2.txt     | 3 ++-
 tests/qemu-iotests/080     | 8 ++++++++
 tests/qemu-iotests/080.out | 5 +++++
 4 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 7fbaac8457..87ca2832f0 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1326,6 +1326,13 @@ static int coroutine_fn qcow2_do_open(BlockDriverState *bs, QDict *options,
         goto fail;
     }
 
+    if (!QEMU_IS_ALIGNED(header.size, BDRV_SECTOR_SIZE)) {
+        error_setg(errp, "Virtual size is not a multiple of %u",
+                   (unsigned) BDRV_SECTOR_SIZE);
+        ret = -EINVAL;
+        goto fail;
+    }
+
     if (header.header_length > sizeof(header)) {
         s->unknown_header_fields_size = header.header_length - sizeof(header);
         s->unknown_header_fields = g_malloc(s->unknown_header_fields_size);
diff --git a/docs/interop/qcow2.txt b/docs/interop/qcow2.txt
index af5711e533..891f5662d8 100644
--- a/docs/interop/qcow2.txt
+++ b/docs/interop/qcow2.txt
@@ -40,7 +40,8 @@ The first cluster of a qcow2 image contains the file header:
                     with larger cluster sizes.
 
          24 - 31:   size
-                    Virtual disk size in bytes.
+                    Virtual disk size in bytes. qemu can only handle
+                    sizes that are a multiple of 512 bytes.
 
                     Note: qemu has an implementation limit of 32 MB as
                     the maximum L1 table size.  With a 2 MB cluster
diff --git a/tests/qemu-iotests/080 b/tests/qemu-iotests/080
index 4bcb5021e8..6f136d616f 100755
--- a/tests/qemu-iotests/080
+++ b/tests/qemu-iotests/080
@@ -48,6 +48,7 @@ header_size=104
 
 offset_backing_file_offset=8
 offset_backing_file_size=16
+offset_virtual_size=24
 offset_l1_size=36
 offset_l1_table_offset=40
 offset_refcount_table_offset=48
@@ -197,6 +198,13 @@ poke_file "$TEST_IMG" "$offset_snap1_l1_size" "\x10\x00\x00\x00"
 { $QEMU_IMG snapshot -d test $TEST_IMG; } 2>&1 | _filter_testdir
 _check_test_img
 
+echo
+echo "== Image size not a multiple of the sector size =="
+_make_test_img 64k
+echo "Modifying virtual size to 65535 bytes"
+poke_file "$TEST_IMG" "$offset_virtual_size" "\x00\x00\x00\x00\x00\x00\xff\xff"
+{ $QEMU_IMG info $TEST_IMG; } 2>&1 | _filter_testdir | _filter_imgfmt
+
 # success, all done
 echo "*** done"
 rm -f $seq.full
diff --git a/tests/qemu-iotests/080.out b/tests/qemu-iotests/080.out
index 45ab01db8e..aadc817339 100644
--- a/tests/qemu-iotests/080.out
+++ b/tests/qemu-iotests/080.out
@@ -104,4 +104,9 @@ Data may be corrupted, or further writes to the image may corrupt it.
 
 3 leaked clusters were found on the image.
 This means waste of disk space, but no harm to data.
+
+== Image size not a multiple of the sector size ==
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=65536
+Modifying virtual size to 65535 bytes
+qemu-img: Could not open 'TEST_DIR/t.IMGFMT': Virtual size is not a multiple of 512
 *** done
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v2 2/4] qcow2: Don't round the L1 table allocation up to the sector size
  2020-01-09 19:12 [PATCH v2 0/4] qcow2: Misc BDRV_SECTOR_SIZE updates Alberto Garcia
  2020-01-09 19:12 ` [PATCH v2 1/4] qcow2: Require that the virtual size is a multiple of the sector size Alberto Garcia
@ 2020-01-09 19:13 ` Alberto Garcia
  2020-01-14 13:46   ` Max Reitz
  2020-01-09 19:13 ` [PATCH v2 3/4] qcow2: Tighten cluster_offset alignment assertions Alberto Garcia
  2020-01-09 19:13 ` [PATCH v2 4/4] qcow2: Use BDRV_SECTOR_SIZE instead of the hardcoded value Alberto Garcia
  3 siblings, 1 reply; 23+ messages in thread
From: Alberto Garcia @ 2020-01-09 19:13 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Alberto Garcia, qemu-block, Max Reitz, Nir Soffer

The L1 table is read from disk using the byte-based bdrv_pread() and
is never accessed beyond its last element, so there's no need to
allocate more memory than that.

Signed-off-by: Alberto Garcia <berto@igalia.com>
---
 block/qcow2-cluster.c  | 5 ++---
 block/qcow2-refcount.c | 2 +-
 block/qcow2-snapshot.c | 3 +--
 block/qcow2.c          | 2 +-
 4 files changed, 5 insertions(+), 7 deletions(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 8982b7b762..932fc48919 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -124,12 +124,11 @@ int qcow2_grow_l1_table(BlockDriverState *bs, uint64_t min_size,
 #endif
 
     new_l1_size2 = sizeof(uint64_t) * new_l1_size;
-    new_l1_table = qemu_try_blockalign(bs->file->bs,
-                                       ROUND_UP(new_l1_size2, 512));
+    new_l1_table = qemu_try_blockalign(bs->file->bs, new_l1_size2);
     if (new_l1_table == NULL) {
         return -ENOMEM;
     }
-    memset(new_l1_table, 0, ROUND_UP(new_l1_size2, 512));
+    memset(new_l1_table, 0, new_l1_size2);
 
     if (s->l1_size) {
         memcpy(new_l1_table, s->l1_table, s->l1_size * sizeof(uint64_t));
diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index f67ac6b2d8..c963bc8de1 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -1262,7 +1262,7 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
      * l1_table_offset when it is the current s->l1_table_offset! Be careful
      * when changing this! */
     if (l1_table_offset != s->l1_table_offset) {
-        l1_table = g_try_malloc0(ROUND_UP(l1_size2, 512));
+        l1_table = g_try_malloc0(l1_size2);
         if (l1_size2 && l1_table == NULL) {
             ret = -ENOMEM;
             goto fail;
diff --git a/block/qcow2-snapshot.c b/block/qcow2-snapshot.c
index 5ab64da1ec..82c32d4c9b 100644
--- a/block/qcow2-snapshot.c
+++ b/block/qcow2-snapshot.c
@@ -1024,8 +1024,7 @@ int qcow2_snapshot_load_tmp(BlockDriverState *bs,
         return ret;
     }
     new_l1_bytes = sn->l1_size * sizeof(uint64_t);
-    new_l1_table = qemu_try_blockalign(bs->file->bs,
-                                       ROUND_UP(new_l1_bytes, 512));
+    new_l1_table = qemu_try_blockalign(bs->file->bs, new_l1_bytes);
     if (new_l1_table == NULL) {
         return -ENOMEM;
     }
diff --git a/block/qcow2.c b/block/qcow2.c
index 87ca2832f0..848a6c5182 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1499,7 +1499,7 @@ static int coroutine_fn qcow2_do_open(BlockDriverState *bs, QDict *options,
 
     if (s->l1_size > 0) {
         s->l1_table = qemu_try_blockalign(bs->file->bs,
-            ROUND_UP(s->l1_size * sizeof(uint64_t), 512));
+                                          s->l1_size * sizeof(uint64_t));
         if (s->l1_table == NULL) {
             error_setg(errp, "Could not allocate L1 table");
             ret = -ENOMEM;
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v2 3/4] qcow2: Tighten cluster_offset alignment assertions
  2020-01-09 19:12 [PATCH v2 0/4] qcow2: Misc BDRV_SECTOR_SIZE updates Alberto Garcia
  2020-01-09 19:12 ` [PATCH v2 1/4] qcow2: Require that the virtual size is a multiple of the sector size Alberto Garcia
  2020-01-09 19:13 ` [PATCH v2 2/4] qcow2: Don't round the L1 table allocation up to " Alberto Garcia
@ 2020-01-09 19:13 ` Alberto Garcia
  2020-01-10 12:14   ` Alberto Garcia
  2020-01-14 13:56   ` Max Reitz
  2020-01-09 19:13 ` [PATCH v2 4/4] qcow2: Use BDRV_SECTOR_SIZE instead of the hardcoded value Alberto Garcia
  3 siblings, 2 replies; 23+ messages in thread
From: Alberto Garcia @ 2020-01-09 19:13 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Alberto Garcia, qemu-block, Max Reitz, Nir Soffer

qcow2_alloc_cluster_offset() and qcow2_get_cluster_offset() always
return offsets that are cluster-aligned so don't just check that they
are sector-aligned.

The check in qcow2_co_preadv_task() is also replaced by an assertion
for the same reason.

Signed-off-by: Alberto Garcia <berto@igalia.com>
---
 block/qcow2.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 848a6c5182..783d2b9578 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -2175,10 +2175,7 @@ static coroutine_fn int qcow2_co_preadv_task(BlockDriverState *bs,
                                           offset, bytes, qiov, qiov_offset);
 
     case QCOW2_CLUSTER_NORMAL:
-        if ((file_cluster_offset & 511) != 0) {
-            return -EIO;
-        }
-
+        assert(QEMU_IS_ALIGNED(file_cluster_offset, s->cluster_size));
         if (bs->encrypted) {
             return qcow2_co_preadv_encrypted(bs, file_cluster_offset,
                                              offset, bytes, qiov, qiov_offset);
@@ -2514,7 +2511,7 @@ static coroutine_fn int qcow2_co_pwritev_part(
             goto out_locked;
         }
 
-        assert((cluster_offset & 511) == 0);
+        assert(QEMU_IS_ALIGNED(cluster_offset, s->cluster_size));
 
         ret = qcow2_pre_write_overlap_check(bs, 0,
                                             cluster_offset + offset_in_cluster,
@@ -3904,7 +3901,7 @@ qcow2_co_copy_range_to(BlockDriverState *bs,
             goto fail;
         }
 
-        assert((cluster_offset & 511) == 0);
+        assert(QEMU_IS_ALIGNED(cluster_offset, s->cluster_size));
 
         ret = qcow2_pre_write_overlap_check(bs, 0,
                 cluster_offset + offset_in_cluster, cur_bytes, true);
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v2 4/4] qcow2: Use BDRV_SECTOR_SIZE instead of the hardcoded value
  2020-01-09 19:12 [PATCH v2 0/4] qcow2: Misc BDRV_SECTOR_SIZE updates Alberto Garcia
                   ` (2 preceding siblings ...)
  2020-01-09 19:13 ` [PATCH v2 3/4] qcow2: Tighten cluster_offset alignment assertions Alberto Garcia
@ 2020-01-09 19:13 ` Alberto Garcia
  2020-01-14 14:15   ` Max Reitz
  3 siblings, 1 reply; 23+ messages in thread
From: Alberto Garcia @ 2020-01-09 19:13 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Alberto Garcia, qemu-block, Max Reitz, Nir Soffer

This replaces all remaining instances in the qcow2 code.

Signed-off-by: Alberto Garcia <berto@igalia.com>
---
 block/qcow2-cluster.c |  2 +-
 block/qcow2.c         | 10 ++++++----
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 932fc48919..777ca2d409 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -219,7 +219,7 @@ static int l2_load(BlockDriverState *bs, uint64_t offset,
  * Writes one sector of the L1 table to the disk (can't update single entries
  * and we really don't want bdrv_pread to perform a read-modify-write)
  */
-#define L1_ENTRIES_PER_SECTOR (512 / 8)
+#define L1_ENTRIES_PER_SECTOR (BDRV_SECTOR_SIZE / 8)
 int qcow2_write_l1_entry(BlockDriverState *bs, int l1_index)
 {
     BDRVQcow2State *s = bs->opaque;
diff --git a/block/qcow2.c b/block/qcow2.c
index 783d2b9578..c0f3e715ef 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -3280,7 +3280,8 @@ qcow2_co_create(BlockdevCreateOptions *create_options, Error **errp)
 
     /* Validate options and set default values */
     if (!QEMU_IS_ALIGNED(qcow2_opts->size, BDRV_SECTOR_SIZE)) {
-        error_setg(errp, "Image size must be a multiple of 512 bytes");
+        error_setg(errp, "Image size must be a multiple of %u bytes",
+                   (unsigned) BDRV_SECTOR_SIZE);
         ret = -EINVAL;
         goto out;
     }
@@ -3836,7 +3837,7 @@ qcow2_co_copy_range_from(BlockDriverState *bs,
         case QCOW2_CLUSTER_NORMAL:
             child = s->data_file;
             copy_offset += offset_into_cluster(s, src_offset);
-            if ((copy_offset & 511) != 0) {
+            if (!QEMU_IS_ALIGNED(copy_offset, BDRV_SECTOR_SIZE)) {
                 ret = -EIO;
                 goto out;
             }
@@ -3958,8 +3959,9 @@ static int coroutine_fn qcow2_co_truncate(BlockDriverState *bs, int64_t offset,
         return -ENOTSUP;
     }
 
-    if (offset & 511) {
-        error_setg(errp, "The new size must be a multiple of 512");
+    if (!QEMU_IS_ALIGNED(offset, BDRV_SECTOR_SIZE)) {
+        error_setg(errp, "The new size must be a multiple of %u",
+                   (unsigned) BDRV_SECTOR_SIZE);
         return -EINVAL;
     }
 
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 3/4] qcow2: Tighten cluster_offset alignment assertions
  2020-01-09 19:13 ` [PATCH v2 3/4] qcow2: Tighten cluster_offset alignment assertions Alberto Garcia
@ 2020-01-10 12:14   ` Alberto Garcia
  2020-01-14 13:58     ` Max Reitz
  2020-01-14 13:56   ` Max Reitz
  1 sibling, 1 reply; 23+ messages in thread
From: Alberto Garcia @ 2020-01-10 12:14 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Nir Soffer, qemu-block, Max Reitz

On Thu 09 Jan 2020 08:13:01 PM CET, Alberto Garcia wrote:
> -        assert((cluster_offset & 511) == 0);
> +        assert(QEMU_IS_ALIGNED(cluster_offset, s->cluster_size));

On second thoughts this patch could also use the (probably more
efficient) offset_into_cluster() call instead.

I can resend it if you think it's a good idea.

Berto


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 1/4] qcow2: Require that the virtual size is a multiple of the sector size
  2020-01-09 19:12 ` [PATCH v2 1/4] qcow2: Require that the virtual size is a multiple of the sector size Alberto Garcia
@ 2020-01-14 13:01   ` Max Reitz
  2020-01-14 13:20     ` Alberto Garcia
  0 siblings, 1 reply; 23+ messages in thread
From: Max Reitz @ 2020-01-14 13:01 UTC (permalink / raw)
  To: Alberto Garcia, qemu-devel; +Cc: Kevin Wolf, Nir Soffer, qemu-block


[-- Attachment #1.1: Type: text/plain, Size: 828 bytes --]

On 09.01.20 20:12, Alberto Garcia wrote:
> The qcow2 header specifies the virtual size of the image in bytes, but
> BlockDriverState stores it as a number of 512-byte sectors.
> 
> If the user tries to create an image with a size that is not a
> multiple of the sector size then this is fixed on creation by
> silently rounding the image size up (see commit c2eb918e32).
> qcow2_co_truncate() is more strict and returns an error instead.
> 
> However when an image is opened the virtual size is rounded down,
> which means that trying to access the last few advertised bytes will
> result in an error. As seen above QEMU cannot create such images and
> there's no good use case that would require us to try to handle them
> so let's just treat them as unsupported.

But isn’t that just a bug in qemu?

Max


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 1/4] qcow2: Require that the virtual size is a multiple of the sector size
  2020-01-14 13:01   ` Max Reitz
@ 2020-01-14 13:20     ` Alberto Garcia
  2020-01-14 13:47       ` Max Reitz
  0 siblings, 1 reply; 23+ messages in thread
From: Alberto Garcia @ 2020-01-14 13:20 UTC (permalink / raw)
  To: Max Reitz, qemu-devel; +Cc: Kevin Wolf, Nir Soffer, qemu-block

On Tue 14 Jan 2020 02:01:03 PM CET, Max Reitz wrote:
>> However when an image is opened the virtual size is rounded down,
>> which means that trying to access the last few advertised bytes will
>> result in an error. As seen above QEMU cannot create such images and
>> there's no good use case that would require us to try to handle them
>> so let's just treat them as unsupported.
>
> But isn’t that just a bug in qemu?

Yes, but does it make sense to try to support images with unaligned
sizes if no one is going to create them ever and QEMU cannot even
generate them?

Berto


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 2/4] qcow2: Don't round the L1 table allocation up to the sector size
  2020-01-09 19:13 ` [PATCH v2 2/4] qcow2: Don't round the L1 table allocation up to " Alberto Garcia
@ 2020-01-14 13:46   ` Max Reitz
  0 siblings, 0 replies; 23+ messages in thread
From: Max Reitz @ 2020-01-14 13:46 UTC (permalink / raw)
  To: Alberto Garcia, qemu-devel; +Cc: Kevin Wolf, Nir Soffer, qemu-block


[-- Attachment #1.1: Type: text/plain, Size: 521 bytes --]

On 09.01.20 20:13, Alberto Garcia wrote:
> The L1 table is read from disk using the byte-based bdrv_pread() and
> is never accessed beyond its last element, so there's no need to
> allocate more memory than that.
> 
> Signed-off-by: Alberto Garcia <berto@igalia.com>
> ---
>  block/qcow2-cluster.c  | 5 ++---
>  block/qcow2-refcount.c | 2 +-
>  block/qcow2-snapshot.c | 3 +--
>  block/qcow2.c          | 2 +-
>  4 files changed, 5 insertions(+), 7 deletions(-)

Reviewed-by: Max Reitz <mreitz@redhat.com>


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 1/4] qcow2: Require that the virtual size is a multiple of the sector size
  2020-01-14 13:20     ` Alberto Garcia
@ 2020-01-14 13:47       ` Max Reitz
  2020-01-14 13:58         ` Alberto Garcia
  0 siblings, 1 reply; 23+ messages in thread
From: Max Reitz @ 2020-01-14 13:47 UTC (permalink / raw)
  To: Alberto Garcia, qemu-devel; +Cc: Kevin Wolf, Nir Soffer, qemu-block


[-- Attachment #1.1: Type: text/plain, Size: 726 bytes --]

On 14.01.20 14:20, Alberto Garcia wrote:
> On Tue 14 Jan 2020 02:01:03 PM CET, Max Reitz wrote:
>>> However when an image is opened the virtual size is rounded down,
>>> which means that trying to access the last few advertised bytes will
>>> result in an error. As seen above QEMU cannot create such images and
>>> there's no good use case that would require us to try to handle them
>>> so let's just treat them as unsupported.
>>
>> But isn’t that just a bug in qemu?
> 
> Yes, but does it make sense to try to support images with unaligned
> sizes if no one is going to create them ever and QEMU cannot even
> generate them?

If nobody uses such images ever, isn’t the current code fine as-is?

Max


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 3/4] qcow2: Tighten cluster_offset alignment assertions
  2020-01-09 19:13 ` [PATCH v2 3/4] qcow2: Tighten cluster_offset alignment assertions Alberto Garcia
  2020-01-10 12:14   ` Alberto Garcia
@ 2020-01-14 13:56   ` Max Reitz
  1 sibling, 0 replies; 23+ messages in thread
From: Max Reitz @ 2020-01-14 13:56 UTC (permalink / raw)
  To: Alberto Garcia, qemu-devel; +Cc: Kevin Wolf, Nir Soffer, qemu-block


[-- Attachment #1.1: Type: text/plain, Size: 503 bytes --]

On 09.01.20 20:13, Alberto Garcia wrote:
> qcow2_alloc_cluster_offset() and qcow2_get_cluster_offset() always
> return offsets that are cluster-aligned so don't just check that they
> are sector-aligned.
> 
> The check in qcow2_co_preadv_task() is also replaced by an assertion
> for the same reason.
> 
> Signed-off-by: Alberto Garcia <berto@igalia.com>
> ---
>  block/qcow2.c | 9 +++------
>  1 file changed, 3 insertions(+), 6 deletions(-)

Reviewed-by: Max Reitz <mreitz@redhat.com>


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 1/4] qcow2: Require that the virtual size is a multiple of the sector size
  2020-01-14 13:47       ` Max Reitz
@ 2020-01-14 13:58         ` Alberto Garcia
  2020-01-14 14:03           ` Max Reitz
  0 siblings, 1 reply; 23+ messages in thread
From: Alberto Garcia @ 2020-01-14 13:58 UTC (permalink / raw)
  To: Max Reitz, qemu-devel; +Cc: Kevin Wolf, Nir Soffer, qemu-block

On Tue 14 Jan 2020 02:47:53 PM CET, Max Reitz wrote:

>> Yes, but does it make sense to try to support images with unaligned
>> sizes if no one is going to create them ever and QEMU cannot even
>> generate them?
>
> If nobody uses such images ever, isn’t the current code fine as-is?

I'll rephrase:

1. It is possible to have a qcow2 image with a size of 1 GB + 13 bytes
   (the size field in the qcow2 header allows it).

2. qemu-img cannot currently produce such images.

3. QEMU can open them but it gets the size wrong.

4. Fixing this in QEMU involves non-trivial changes (at least at the
   moment).

5. I don't think there's any use case for such images so instead of
   fixing QEMU I propose that we simply refuse to open them.

Berto


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 3/4] qcow2: Tighten cluster_offset alignment assertions
  2020-01-10 12:14   ` Alberto Garcia
@ 2020-01-14 13:58     ` Max Reitz
  0 siblings, 0 replies; 23+ messages in thread
From: Max Reitz @ 2020-01-14 13:58 UTC (permalink / raw)
  To: Alberto Garcia, qemu-devel; +Cc: Kevin Wolf, Nir Soffer, qemu-block


[-- Attachment #1.1: Type: text/plain, Size: 523 bytes --]

On 10.01.20 13:14, Alberto Garcia wrote:
> On Thu 09 Jan 2020 08:13:01 PM CET, Alberto Garcia wrote:
>> -        assert((cluster_offset & 511) == 0);
>> +        assert(QEMU_IS_ALIGNED(cluster_offset, s->cluster_size));
> 
> On second thoughts this patch could also use the (probably more
> efficient) offset_into_cluster() call instead.
> 
> I can resend it if you think it's a good idea.
Either way works for me.  (I don’t think it’s going to make any
difference in practice, but of course, why not.)

Max


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 1/4] qcow2: Require that the virtual size is a multiple of the sector size
  2020-01-14 13:58         ` Alberto Garcia
@ 2020-01-14 14:03           ` Max Reitz
  2020-01-16 14:24             ` Alberto Garcia
  0 siblings, 1 reply; 23+ messages in thread
From: Max Reitz @ 2020-01-14 14:03 UTC (permalink / raw)
  To: Alberto Garcia, qemu-devel; +Cc: Kevin Wolf, Nir Soffer, qemu-block


[-- Attachment #1.1: Type: text/plain, Size: 1760 bytes --]

On 14.01.20 14:58, Alberto Garcia wrote:
> On Tue 14 Jan 2020 02:47:53 PM CET, Max Reitz wrote:
> 
>>> Yes, but does it make sense to try to support images with unaligned
>>> sizes if no one is going to create them ever and QEMU cannot even
>>> generate them?
>>
>> If nobody uses such images ever, isn’t the current code fine as-is?
> 
> I'll rephrase:
> 
> 1. It is possible to have a qcow2 image with a size of 1 GB + 13 bytes
>    (the size field in the qcow2 header allows it).
> 
> 2. qemu-img cannot currently produce such images.
> 
> 3. QEMU can open them but it gets the size wrong.

Yes, but in a safe way.  The user simply doesn’t get access to those 13
bytes.

With your proposed change, it would just reject the image altogether.
As a user, I’d prefer not being able to access 13 bytes to not being
able to access 1 GB + 13 bytes.

Furthermore, the user couldn’t even fix the image then because they
couldn’t resize it with qemu-img resize.

> 4. Fixing this in QEMU involves non-trivial changes (at least at the
>    moment).
> 
> 5. I don't think there's any use case for such images so instead of
>    fixing QEMU I propose that we simply refuse to open them.

If there’s no use case, I don’t see how the current behavior is problematic.

The arguments I could immediately agree to are (1) simplicity, and (2)
the image is probably corrupted.

As for (1), this patch adds more lines than it removes, so from a pure
quantitative standpoint, it doesn’t make the code simpler.

As for (2), I’m always of the opinion we shouldn’t refuse to read
corrupted images just because they are corrupted.  We should open them
and let the user read as much as is reasonably possible.

Max


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 4/4] qcow2: Use BDRV_SECTOR_SIZE instead of the hardcoded value
  2020-01-09 19:13 ` [PATCH v2 4/4] qcow2: Use BDRV_SECTOR_SIZE instead of the hardcoded value Alberto Garcia
@ 2020-01-14 14:15   ` Max Reitz
  2020-01-16 23:26     ` Alberto Garcia
  2020-01-18 18:07     ` Alberto Garcia
  0 siblings, 2 replies; 23+ messages in thread
From: Max Reitz @ 2020-01-14 14:15 UTC (permalink / raw)
  To: Alberto Garcia, qemu-devel; +Cc: Kevin Wolf, Nir Soffer, qemu-block


[-- Attachment #1.1: Type: text/plain, Size: 3292 bytes --]

On 09.01.20 20:13, Alberto Garcia wrote:
> This replaces all remaining instances in the qcow2 code.
> 
> Signed-off-by: Alberto Garcia <berto@igalia.com>
> ---
>  block/qcow2-cluster.c |  2 +-
>  block/qcow2.c         | 10 ++++++----
>  2 files changed, 7 insertions(+), 5 deletions(-)

The question of course is why we would even have such instances still.

> diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
> index 932fc48919..777ca2d409 100644
> --- a/block/qcow2-cluster.c
> +++ b/block/qcow2-cluster.c
> @@ -219,7 +219,7 @@ static int l2_load(BlockDriverState *bs, uint64_t offset,
>   * Writes one sector of the L1 table to the disk (can't update single entries
>   * and we really don't want bdrv_pread to perform a read-modify-write)
>   */
> -#define L1_ENTRIES_PER_SECTOR (512 / 8)
> +#define L1_ENTRIES_PER_SECTOR (BDRV_SECTOR_SIZE / 8)
>  int qcow2_write_l1_entry(BlockDriverState *bs, int l1_index)

Here it’s because the comment is wrong: “Can’t update single entries” –
yes, we can.  We’d just have to do a bdrv_pwrite() to a single entry.

>  {
>      BDRVQcow2State *s = bs->opaque;
> diff --git a/block/qcow2.c b/block/qcow2.c
> index 783d2b9578..c0f3e715ef 100644
> --- a/block/qcow2.c
> +++ b/block/qcow2.c
> @@ -3280,7 +3280,8 @@ qcow2_co_create(BlockdevCreateOptions *create_options, Error **errp)
>  
>      /* Validate options and set default values */
>      if (!QEMU_IS_ALIGNED(qcow2_opts->size, BDRV_SECTOR_SIZE)) {
> -        error_setg(errp, "Image size must be a multiple of 512 bytes");
> +        error_setg(errp, "Image size must be a multiple of %u bytes",
> +                   (unsigned) BDRV_SECTOR_SIZE);

Either way how patch 1 goes, this is due to a limitation in qemu.

>          ret = -EINVAL;
>          goto out;
>      }
> @@ -3836,7 +3837,7 @@ qcow2_co_copy_range_from(BlockDriverState *bs,
>          case QCOW2_CLUSTER_NORMAL:
>              child = s->data_file;
>              copy_offset += offset_into_cluster(s, src_offset);
> -            if ((copy_offset & 511) != 0) {
> +            if (!QEMU_IS_ALIGNED(copy_offset, BDRV_SECTOR_SIZE)) {

Hm.  I don’t get this one.

>                  ret = -EIO;
>                  goto out;
>              }
> @@ -3958,8 +3959,9 @@ static int coroutine_fn qcow2_co_truncate(BlockDriverState *bs, int64_t offset,
>          return -ENOTSUP;
>      }
>  
> -    if (offset & 511) {
> -        error_setg(errp, "The new size must be a multiple of 512");
> +    if (!QEMU_IS_ALIGNED(offset, BDRV_SECTOR_SIZE)) {
> +        error_setg(errp, "The new size must be a multiple of %u",
> +                   (unsigned) BDRV_SECTOR_SIZE);

This too is due to the limitation in qemu that BDS lengths are only
stored in multiples of BDRV_SECTOR_SIZE, so we disallow any other image
sizes.

>          return -EINVAL;
>      }
So in 3/4 instances this patch looks good to me (it’s really about the
block layers concept of a “sector”, even though that notion is outdated
in the first hunk), but I don’t quite get the third hunk.  (I’m sure it
has to do something with the block layer, so it’s OK to change in this
patch, but I’m still interested in why we have that limitation there.)

Max


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 1/4] qcow2: Require that the virtual size is a multiple of the sector size
  2020-01-14 14:03           ` Max Reitz
@ 2020-01-16 14:24             ` Alberto Garcia
  0 siblings, 0 replies; 23+ messages in thread
From: Alberto Garcia @ 2020-01-16 14:24 UTC (permalink / raw)
  To: Max Reitz, qemu-devel; +Cc: Kevin Wolf, Nir Soffer, qemu-block

On Tue 14 Jan 2020 03:03:31 PM CET, Max Reitz wrote:
>> 3. QEMU can open them but it gets the size wrong.
>
> Yes, but in a safe way.  The user simply doesn’t get access to those
> 13 bytes.

Ok, I think I'll withdraw the patch then :-)

Berto


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 4/4] qcow2: Use BDRV_SECTOR_SIZE instead of the hardcoded value
  2020-01-14 14:15   ` Max Reitz
@ 2020-01-16 23:26     ` Alberto Garcia
  2020-01-17  9:12       ` Max Reitz
  2020-01-18 18:07     ` Alberto Garcia
  1 sibling, 1 reply; 23+ messages in thread
From: Alberto Garcia @ 2020-01-16 23:26 UTC (permalink / raw)
  To: Max Reitz, qemu-devel; +Cc: Kevin Wolf, Nir Soffer, qemu-block

On Tue 14 Jan 2020 03:15:48 PM CET, Max Reitz wrote:
>> @@ -219,7 +219,7 @@ static int l2_load(BlockDriverState *bs, uint64_t offset,
>>   * Writes one sector of the L1 table to the disk (can't update single entries
>>   * and we really don't want bdrv_pread to perform a read-modify-write)
>>   */
>> -#define L1_ENTRIES_PER_SECTOR (512 / 8)
>> +#define L1_ENTRIES_PER_SECTOR (BDRV_SECTOR_SIZE / 8)
>>  int qcow2_write_l1_entry(BlockDriverState *bs, int l1_index)
>
> Here it’s because the comment is wrong: “Can’t update single entries” –
> yes, we can.  We’d just have to do a bdrv_pwrite() to a single entry.

What's the point of qcow2_write_l1_entry() then?

>> @@ -3836,7 +3837,7 @@ qcow2_co_copy_range_from(BlockDriverState *bs,
>>          case QCOW2_CLUSTER_NORMAL:
>>              child = s->data_file;
>>              copy_offset += offset_into_cluster(s, src_offset);
>> -            if ((copy_offset & 511) != 0) {
>> +            if (!QEMU_IS_ALIGNED(copy_offset, BDRV_SECTOR_SIZE)) {
>
> Hm.  I don’t get this one.

Checking the code (e.g. block_copy_do_copy()) it seems that the whole
chunk must be cluster aligned so I don't get this one either.

Berto


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 4/4] qcow2: Use BDRV_SECTOR_SIZE instead of the hardcoded value
  2020-01-16 23:26     ` Alberto Garcia
@ 2020-01-17  9:12       ` Max Reitz
  2020-01-17  9:55         ` Kevin Wolf
  0 siblings, 1 reply; 23+ messages in thread
From: Max Reitz @ 2020-01-17  9:12 UTC (permalink / raw)
  To: Alberto Garcia, qemu-devel; +Cc: Kevin Wolf, Nir Soffer, qemu-block


[-- Attachment #1.1: Type: text/plain, Size: 2050 bytes --]

On 17.01.20 00:26, Alberto Garcia wrote:
> On Tue 14 Jan 2020 03:15:48 PM CET, Max Reitz wrote:
>>> @@ -219,7 +219,7 @@ static int l2_load(BlockDriverState *bs, uint64_t offset,
>>>   * Writes one sector of the L1 table to the disk (can't update single entries
>>>   * and we really don't want bdrv_pread to perform a read-modify-write)
>>>   */
>>> -#define L1_ENTRIES_PER_SECTOR (512 / 8)
>>> +#define L1_ENTRIES_PER_SECTOR (BDRV_SECTOR_SIZE / 8)
>>>  int qcow2_write_l1_entry(BlockDriverState *bs, int l1_index)
>>
>> Here it’s because the comment is wrong: “Can’t update single entries” –
>> yes, we can.  We’d just have to do a bdrv_pwrite() to a single entry.
> 
> What's the point of qcow2_write_l1_entry() then?

I think the point was that we couldn’t, for a long time, because the
block layer only provided sector-granularity access.  This function
simply was never changed when the block layer gained the ability to do
byte-granularity I/O.

(We’d still need this function, but only for the endian swap, I think.)

>>> @@ -3836,7 +3837,7 @@ qcow2_co_copy_range_from(BlockDriverState *bs,
>>>          case QCOW2_CLUSTER_NORMAL:
>>>              child = s->data_file;
>>>              copy_offset += offset_into_cluster(s, src_offset);
>>> -            if ((copy_offset & 511) != 0) {
>>> +            if (!QEMU_IS_ALIGNED(copy_offset, BDRV_SECTOR_SIZE)) {
>>
>> Hm.  I don’t get this one.
> 
> Checking the code (e.g. block_copy_do_copy()) it seems that the whole
> chunk must be cluster aligned so I don't get this one either.

Hm, how did you get to block_copy_do_copy()?  That’s part of the
block-copy infrastructure that’s only used for the backup job, as far as
I’m aware.  It’s different from copy_range.

I don’t see any limitation for copy_range.  I suppose maybe it doesn’t
work for anything that isn’t aligned to physical sectors?  But the qcow2
driver shouldn’t care about that.

On thing’s for sure, the raw driver doesn’t care about it.

Max


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 4/4] qcow2: Use BDRV_SECTOR_SIZE instead of the hardcoded value
  2020-01-17  9:12       ` Max Reitz
@ 2020-01-17  9:55         ` Kevin Wolf
  2020-01-17 11:01           ` Max Reitz
  0 siblings, 1 reply; 23+ messages in thread
From: Kevin Wolf @ 2020-01-17  9:55 UTC (permalink / raw)
  To: Max Reitz; +Cc: Nir Soffer, Alberto Garcia, qemu-devel, qemu-block

[-- Attachment #1: Type: text/plain, Size: 2874 bytes --]

Am 17.01.2020 um 10:12 hat Max Reitz geschrieben:
> On 17.01.20 00:26, Alberto Garcia wrote:
> > On Tue 14 Jan 2020 03:15:48 PM CET, Max Reitz wrote:
> >>> @@ -219,7 +219,7 @@ static int l2_load(BlockDriverState *bs, uint64_t offset,
> >>>   * Writes one sector of the L1 table to the disk (can't update single entries
> >>>   * and we really don't want bdrv_pread to perform a read-modify-write)
> >>>   */
> >>> -#define L1_ENTRIES_PER_SECTOR (512 / 8)
> >>> +#define L1_ENTRIES_PER_SECTOR (BDRV_SECTOR_SIZE / 8)
> >>>  int qcow2_write_l1_entry(BlockDriverState *bs, int l1_index)
> >>
> >> Here it’s because the comment is wrong: “Can’t update single entries” –
> >> yes, we can.  We’d just have to do a bdrv_pwrite() to a single entry.
> > 
> > What's the point of qcow2_write_l1_entry() then?
> 
> I think the point was that we couldn’t, for a long time, because the
> block layer only provided sector-granularity access.  This function
> simply was never changed when the block layer gained the ability to do
> byte-granularity I/O.
> 
> (We’d still need this function, but only for the endian swap, I think.)

We still can't do byte-granularity writes with O_DIRECT, because that's
a kernel requirement.

The comment explains that we don't want to do a RMW cycle to write a
single entry because that would be slower than just writing a whole
sector. I think this is still accurate. Maybe we should change the
comment to say "can't necessarily update". (The part that looks really
wrong in the comment is "bdrv_pread", that should be "bdrv_pwrite"...)

Now, what's wrong about the logic to avoid the RMW is that it assumes
a fixed required alignment of 512. What it should do is looking at
bs->file->bl.request_alignment and rounding accordingly.

> >>> @@ -3836,7 +3837,7 @@ qcow2_co_copy_range_from(BlockDriverState *bs,
> >>>          case QCOW2_CLUSTER_NORMAL:
> >>>              child = s->data_file;
> >>>              copy_offset += offset_into_cluster(s, src_offset);
> >>> -            if ((copy_offset & 511) != 0) {
> >>> +            if (!QEMU_IS_ALIGNED(copy_offset, BDRV_SECTOR_SIZE)) {
> >>
> >> Hm.  I don’t get this one.
> > 
> > Checking the code (e.g. block_copy_do_copy()) it seems that the whole
> > chunk must be cluster aligned so I don't get this one either.
> 
> Hm, how did you get to block_copy_do_copy()?  That’s part of the
> block-copy infrastructure that’s only used for the backup job, as far as
> I’m aware.  It’s different from copy_range.
> 
> I don’t see any limitation for copy_range.  I suppose maybe it doesn’t
> work for anything that isn’t aligned to physical sectors?  But the qcow2
> driver shouldn’t care about that.
> 
> On thing’s for sure, the raw driver doesn’t care about it.

I don't understand this one either.

Kevin

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 801 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 4/4] qcow2: Use BDRV_SECTOR_SIZE instead of the hardcoded value
  2020-01-17  9:55         ` Kevin Wolf
@ 2020-01-17 11:01           ` Max Reitz
  2020-01-17 11:51             ` Kevin Wolf
  0 siblings, 1 reply; 23+ messages in thread
From: Max Reitz @ 2020-01-17 11:01 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: Nir Soffer, Alberto Garcia, qemu-devel, qemu-block


[-- Attachment #1.1: Type: text/plain, Size: 3105 bytes --]

On 17.01.20 10:55, Kevin Wolf wrote:
> Am 17.01.2020 um 10:12 hat Max Reitz geschrieben:
>> On 17.01.20 00:26, Alberto Garcia wrote:
>>> On Tue 14 Jan 2020 03:15:48 PM CET, Max Reitz wrote:
>>>>> @@ -219,7 +219,7 @@ static int l2_load(BlockDriverState *bs, uint64_t offset,
>>>>>   * Writes one sector of the L1 table to the disk (can't update single entries
>>>>>   * and we really don't want bdrv_pread to perform a read-modify-write)
>>>>>   */
>>>>> -#define L1_ENTRIES_PER_SECTOR (512 / 8)
>>>>> +#define L1_ENTRIES_PER_SECTOR (BDRV_SECTOR_SIZE / 8)
>>>>>  int qcow2_write_l1_entry(BlockDriverState *bs, int l1_index)
>>>>
>>>> Here it’s because the comment is wrong: “Can’t update single entries” –
>>>> yes, we can.  We’d just have to do a bdrv_pwrite() to a single entry.
>>>
>>> What's the point of qcow2_write_l1_entry() then?
>>
>> I think the point was that we couldn’t, for a long time, because the
>> block layer only provided sector-granularity access.  This function
>> simply was never changed when the block layer gained the ability to do
>> byte-granularity I/O.
>>
>> (We’d still need this function, but only for the endian swap, I think.)
> 
> We still can't do byte-granularity writes with O_DIRECT, because that's
> a kernel requirement.

Ah, yes.  But that makes BDRV_SECTOR_SIZE the wrong choice.

> The comment explains that we don't want to do a RMW cycle to write a
> single entry because that would be slower than just writing a whole
> sector. I think this is still accurate. Maybe we should change the
> comment to say "can't necessarily update". (The part that looks really
> wrong in the comment is "bdrv_pread", that should be "bdrv_pwrite"...)

Hm.  But we wouldn’t do an RMW cycle without O_DIRECT, would we?

> Now, what's wrong about the logic to avoid the RMW is that it assumes
> a fixed required alignment of 512. What it should do is looking at
> bs->file->bl.request_alignment and rounding accordingly.

Yes.

>>>>> @@ -3836,7 +3837,7 @@ qcow2_co_copy_range_from(BlockDriverState *bs,
>>>>>          case QCOW2_CLUSTER_NORMAL:
>>>>>              child = s->data_file;
>>>>>              copy_offset += offset_into_cluster(s, src_offset);
>>>>> -            if ((copy_offset & 511) != 0) {
>>>>> +            if (!QEMU_IS_ALIGNED(copy_offset, BDRV_SECTOR_SIZE)) {
>>>>
>>>> Hm.  I don’t get this one.
>>>
>>> Checking the code (e.g. block_copy_do_copy()) it seems that the whole
>>> chunk must be cluster aligned so I don't get this one either.
>>
>> Hm, how did you get to block_copy_do_copy()?  That’s part of the
>> block-copy infrastructure that’s only used for the backup job, as far as
>> I’m aware.  It’s different from copy_range.
>>
>> I don’t see any limitation for copy_range.  I suppose maybe it doesn’t
>> work for anything that isn’t aligned to physical sectors?  But the qcow2
>> driver shouldn’t care about that.
>>
>> On thing’s for sure, the raw driver doesn’t care about it.
> 
> I don't understand this one either.

Good. :-)

Max


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 4/4] qcow2: Use BDRV_SECTOR_SIZE instead of the hardcoded value
  2020-01-17 11:01           ` Max Reitz
@ 2020-01-17 11:51             ` Kevin Wolf
  2020-01-17 14:34               ` Alberto Garcia
  0 siblings, 1 reply; 23+ messages in thread
From: Kevin Wolf @ 2020-01-17 11:51 UTC (permalink / raw)
  To: Max Reitz; +Cc: Nir Soffer, Alberto Garcia, qemu-devel, qemu-block

[-- Attachment #1: Type: text/plain, Size: 2441 bytes --]

Am 17.01.2020 um 12:01 hat Max Reitz geschrieben:
> On 17.01.20 10:55, Kevin Wolf wrote:
> > Am 17.01.2020 um 10:12 hat Max Reitz geschrieben:
> >> On 17.01.20 00:26, Alberto Garcia wrote:
> >>> On Tue 14 Jan 2020 03:15:48 PM CET, Max Reitz wrote:
> >>>>> @@ -219,7 +219,7 @@ static int l2_load(BlockDriverState *bs, uint64_t offset,
> >>>>>   * Writes one sector of the L1 table to the disk (can't update single entries
> >>>>>   * and we really don't want bdrv_pread to perform a read-modify-write)
> >>>>>   */
> >>>>> -#define L1_ENTRIES_PER_SECTOR (512 / 8)
> >>>>> +#define L1_ENTRIES_PER_SECTOR (BDRV_SECTOR_SIZE / 8)
> >>>>>  int qcow2_write_l1_entry(BlockDriverState *bs, int l1_index)
> >>>>
> >>>> Here it’s because the comment is wrong: “Can’t update single entries” –
> >>>> yes, we can.  We’d just have to do a bdrv_pwrite() to a single entry.
> >>>
> >>> What's the point of qcow2_write_l1_entry() then?
> >>
> >> I think the point was that we couldn’t, for a long time, because the
> >> block layer only provided sector-granularity access.  This function
> >> simply was never changed when the block layer gained the ability to do
> >> byte-granularity I/O.
> >>
> >> (We’d still need this function, but only for the endian swap, I think.)
> > 
> > We still can't do byte-granularity writes with O_DIRECT, because that's
> > a kernel requirement.
> 
> Ah, yes.  But that makes BDRV_SECTOR_SIZE the wrong choice.
> 
> > The comment explains that we don't want to do a RMW cycle to write a
> > single entry because that would be slower than just writing a whole
> > sector. I think this is still accurate. Maybe we should change the
> > comment to say "can't necessarily update". (The part that looks really
> > wrong in the comment is "bdrv_pread", that should be "bdrv_pwrite"...)
> 
> Hm.  But we wouldn’t do an RMW cycle without O_DIRECT, would we?

file-posix with a regular file has request_alignment == 1 and wouldn't
cause an RMW cycle, I think. I won't try to make a statement about all
non-O_DIRECT cases in all protocols.

The important point is that some cases exist which would get us RMW.

> > Now, what's wrong about the logic to avoid the RMW is that it assumes
> > a fixed required alignment of 512. What it should do is looking at
> > bs->file->bl.request_alignment and rounding accordingly.
> 
> Yes.

Who'll write the patch? :-)

Kevin

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 801 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 4/4] qcow2: Use BDRV_SECTOR_SIZE instead of the hardcoded value
  2020-01-17 11:51             ` Kevin Wolf
@ 2020-01-17 14:34               ` Alberto Garcia
  0 siblings, 0 replies; 23+ messages in thread
From: Alberto Garcia @ 2020-01-17 14:34 UTC (permalink / raw)
  To: Kevin Wolf, Max Reitz; +Cc: Nir Soffer, qemu-devel, qemu-block

On Fri 17 Jan 2020 12:51:04 PM CET, Kevin Wolf wrote:
>> > Now, what's wrong about the logic to avoid the RMW is that it assumes
>> > a fixed required alignment of 512. What it should do is looking at
>> > bs->file->bl.request_alignment and rounding accordingly.
>> 
>> Yes.
>
> Who'll write the patch? :-)

I have to resend the series anyway so I can do it.

Berto


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 4/4] qcow2: Use BDRV_SECTOR_SIZE instead of the hardcoded value
  2020-01-14 14:15   ` Max Reitz
  2020-01-16 23:26     ` Alberto Garcia
@ 2020-01-18 18:07     ` Alberto Garcia
  1 sibling, 0 replies; 23+ messages in thread
From: Alberto Garcia @ 2020-01-18 18:07 UTC (permalink / raw)
  To: Max Reitz, qemu-devel; +Cc: Kevin Wolf, Nir Soffer, qemu-block

On Tue 14 Jan 2020 03:15:48 PM CET, Max Reitz <mreitz@redhat.com> wrote:
>> @@ -3836,7 +3837,7 @@ qcow2_co_copy_range_from(BlockDriverState *bs,
>>          case QCOW2_CLUSTER_NORMAL:
>>              child = s->data_file;
>>              copy_offset += offset_into_cluster(s, src_offset);
>> -            if ((copy_offset & 511) != 0) {
>> +            if (!QEMU_IS_ALIGNED(copy_offset, BDRV_SECTOR_SIZE)) {
>
> Hm.  I don’t get this one.

Ok, this came with Fam's "qemu-img convert with copy offloading" series:

   https://lists.gnu.org/archive/html/qemu-block/2018-06/msg00015.html

and qemu-img uses sectors here:

   blk_co_copy_range(..., sector_num << BDRV_SECTOR_BITS,
                     n << BDRV_SECTOR_BITS, ...)

so I guess that's why the check is there. Again, I think this should be
bl.request_alignment, because as far as I can tell copy_file_range()
works just fine unless O_DIRECT is used.

Berto


^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2020-01-18 18:08 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-01-09 19:12 [PATCH v2 0/4] qcow2: Misc BDRV_SECTOR_SIZE updates Alberto Garcia
2020-01-09 19:12 ` [PATCH v2 1/4] qcow2: Require that the virtual size is a multiple of the sector size Alberto Garcia
2020-01-14 13:01   ` Max Reitz
2020-01-14 13:20     ` Alberto Garcia
2020-01-14 13:47       ` Max Reitz
2020-01-14 13:58         ` Alberto Garcia
2020-01-14 14:03           ` Max Reitz
2020-01-16 14:24             ` Alberto Garcia
2020-01-09 19:13 ` [PATCH v2 2/4] qcow2: Don't round the L1 table allocation up to " Alberto Garcia
2020-01-14 13:46   ` Max Reitz
2020-01-09 19:13 ` [PATCH v2 3/4] qcow2: Tighten cluster_offset alignment assertions Alberto Garcia
2020-01-10 12:14   ` Alberto Garcia
2020-01-14 13:58     ` Max Reitz
2020-01-14 13:56   ` Max Reitz
2020-01-09 19:13 ` [PATCH v2 4/4] qcow2: Use BDRV_SECTOR_SIZE instead of the hardcoded value Alberto Garcia
2020-01-14 14:15   ` Max Reitz
2020-01-16 23:26     ` Alberto Garcia
2020-01-17  9:12       ` Max Reitz
2020-01-17  9:55         ` Kevin Wolf
2020-01-17 11:01           ` Max Reitz
2020-01-17 11:51             ` Kevin Wolf
2020-01-17 14:34               ` Alberto Garcia
2020-01-18 18:07     ` Alberto Garcia

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).