* [PATCH for-5.1 0/2] qemu-img convert: Fix abort with unaligned image size
@ 2020-07-10 14:21 Kevin Wolf
  2020-07-10 14:21 ` [PATCH for-5.1 1/2] block: Require aligned image size to avoid assertion failure Kevin Wolf
                   ` (2 more replies)
  0 siblings, 3 replies; 20+ messages in thread
From: Kevin Wolf @ 2020-07-10 14:21 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, nsoffer, qemu-devel, mreitz

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1834646

Patch 1 fixes the assertion failure by failing gracefully when opening
an image whose size isn't aligned to the required request alignment.

Patch 2 relaxes the restrictions for NFS, which actually supports byte
alignment, but incorrectly gets a 4k request alignment in the file-posix
block driver.

Kevin Wolf (2):
  block: Require aligned image size to avoid assertion failure
  file-posix: Allow byte-aligned O_DIRECT with NFS

 block.c            | 10 ++++++++++
 block/file-posix.c | 26 +++++++++++++++++++++++++-
 2 files changed, 35 insertions(+), 1 deletion(-)

-- 
2.25.4




* [PATCH for-5.1 1/2] block: Require aligned image size to avoid assertion failure
  2020-07-10 14:21 [PATCH for-5.1 0/2] qemu-img convert: Fix abort with unaligned image size Kevin Wolf
@ 2020-07-10 14:21 ` Kevin Wolf
  2020-07-10 14:37   ` Eric Blake
                     ` (2 more replies)
  2020-07-10 14:21 ` [PATCH for-5.1 2/2] file-posix: Allow byte-aligned O_DIRECT with NFS Kevin Wolf
  2020-07-10 14:43 ` [PATCH for-5.1 0/2] qemu-img convert: Fix abort with unaligned image size no-reply
  2 siblings, 3 replies; 20+ messages in thread
From: Kevin Wolf @ 2020-07-10 14:21 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, nsoffer, qemu-devel, mreitz

Unaligned requests will automatically be aligned to bl.request_alignment
and we don't want to extend requests to access space beyond the end of
the image, so it's required that the image size is aligned.

With write requests, this could cause assertion failures like this if
RESIZE permissions weren't requested:

qemu-img: block/io.c:1910: bdrv_co_write_req_prepare: Assertion `end_sector <= bs->total_sectors || child->perm & BLK_PERM_RESIZE' failed.

This was e.g. triggered by qemu-img converting to a target image with 4k
request alignment when the image was only aligned to 512 bytes, but not
to 4k.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/block.c b/block.c
index cc377d7ef3..c635777911 100644
--- a/block.c
+++ b/block.c
@@ -1489,6 +1489,16 @@ static int bdrv_open_driver(BlockDriverState *bs, BlockDriver *drv,
         return -EINVAL;
     }
 
+    /*
+     * Unaligned requests will automatically be aligned to bl.request_alignment
+     * and we don't want to extend requests to access space beyond the end of
+     * the image, so it's required that the image size is aligned.
+     */
+    if ((bs->total_sectors * BDRV_SECTOR_SIZE) % bs->bl.request_alignment) {
+        error_setg(errp, "Image size is not a multiple of request alignment");
+        return -EINVAL;
+    }
+
     assert(bdrv_opt_mem_align(bs) != 0);
     assert(bdrv_min_mem_align(bs) != 0);
     assert(is_power_of_2(bs->bl.request_alignment));
-- 
2.25.4




* [PATCH for-5.1 2/2] file-posix: Allow byte-aligned O_DIRECT with NFS
  2020-07-10 14:21 [PATCH for-5.1 0/2] qemu-img convert: Fix abort with unaligned image size Kevin Wolf
  2020-07-10 14:21 ` [PATCH for-5.1 1/2] block: Require aligned image size to avoid assertion failure Kevin Wolf
@ 2020-07-10 14:21 ` Kevin Wolf
  2020-07-10 14:39   ` Eric Blake
  2020-07-13 16:29   ` Nir Soffer
  2020-07-10 14:43 ` [PATCH for-5.1 0/2] qemu-img convert: Fix abort with unaligned image size no-reply
  2 siblings, 2 replies; 20+ messages in thread
From: Kevin Wolf @ 2020-07-10 14:21 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, nsoffer, qemu-devel, mreitz

Since commit a6b257a08e3 ('file-posix: Handle undetectable alignment'),
we assume that if we open a file with O_DIRECT and alignment probing
returns 1, we just couldn't find out the real alignment requirement
because some filesystems make the requirement only for allocated blocks.
In this case, a safe default of 4k is used.

This is too strict for NFS, which does actually allow byte-aligned requests
even with O_DIRECT. Because we can't distinguish both cases with generic
code, let's just look at the file system magic and disable
s->needs_alignment for NFS. This way, O_DIRECT can still be used on NFS
for images that are not aligned to 4k.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/file-posix.c | 26 +++++++++++++++++++++++++-
 1 file changed, 25 insertions(+), 1 deletion(-)

diff --git a/block/file-posix.c b/block/file-posix.c
index 0c4e07c415..4e9dac461b 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -62,10 +62,12 @@
 #include <sys/ioctl.h>
 #include <sys/param.h>
 #include <sys/syscall.h>
+#include <sys/vfs.h>
 #include <linux/cdrom.h>
 #include <linux/fd.h>
 #include <linux/fs.h>
 #include <linux/hdreg.h>
+#include <linux/magic.h>
 #include <scsi/sg.h>
 #ifdef __s390__
 #include <asm/dasd.h>
@@ -300,6 +302,28 @@ static int probe_physical_blocksize(int fd, unsigned int *blk_size)
 #endif
 }
 
+/*
+ * Returns true if no alignment restrictions are necessary even for files
+ * opened with O_DIRECT.
+ *
+ * raw_probe_alignment() probes the required alignment and assume that 1 means
+ * the probing failed, so it falls back to a safe default of 4k. This can be
+ * avoided if we know that byte alignment is okay for the file.
+ */
+static bool dio_byte_aligned(int fd)
+{
+#ifdef __linux__
+    struct statfs buf;
+    int ret;
+
+    ret = fstatfs(fd, &buf);
+    if (ret == 0 && buf.f_type == NFS_SUPER_MAGIC) {
+        return true;
+    }
+#endif
+    return false;
+}
+
 /* Check if read is allowed with given memory buffer and length.
  *
  * This function is used to check O_DIRECT memory buffer and request alignment.
@@ -631,7 +655,7 @@ static int raw_open_common(BlockDriverState *bs, QDict *options,
 
     s->has_discard = true;
     s->has_write_zeroes = true;
-    if ((bs->open_flags & BDRV_O_NOCACHE) != 0) {
+    if ((bs->open_flags & BDRV_O_NOCACHE) != 0 && !dio_byte_aligned(s->fd)) {
         s->needs_alignment = true;
     }
 
-- 
2.25.4




* Re: [PATCH for-5.1 1/2] block: Require aligned image size to avoid assertion failure
  2020-07-10 14:21 ` [PATCH for-5.1 1/2] block: Require aligned image size to avoid assertion failure Kevin Wolf
@ 2020-07-10 14:37   ` Eric Blake
  2020-07-13 11:19   ` Max Reitz
  2020-07-13 16:33   ` Nir Soffer
  2 siblings, 0 replies; 20+ messages in thread
From: Eric Blake @ 2020-07-10 14:37 UTC (permalink / raw)
  To: Kevin Wolf, qemu-block; +Cc: qemu-devel, mreitz

On 7/10/20 9:21 AM, Kevin Wolf wrote:
> Unaligned requests will automatically be aligned to bl.request_alignment
> and we don't want to extend requests to access space beyond the end of
> the image, so it's required that the image size is aligned.

Yep, that's what I've already done on nbd images.

nbdkit has '--filter=truncate' which rounds an image size up to 
alignment by reading the absent tail as zeros, and permitting writes 
that rewrite zero but failing with EIO any write that would attempt to 
change the tail.  We may eventually want that complexity in qemu's block 
layer for ALL drivers (as part of switching the block layer to 
byte-accurate sizing everywhere), but that's a LOT more effort.  The 
short term of just mandating alignment is much easier and still defensible.
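
As a rough illustration of the tail-padding semantics described above, here
is a minimal sketch over an in-memory buffer. It is not nbdkit or QEMU code:
PaddedImage, padded_size(), padded_read() and padded_write() are hypothetical
names made up for this example, and the alignment is assumed to be a power
of two.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical in-memory image; not an nbdkit or QEMU structure. */
typedef struct {
    uint8_t *data;       /* backing bytes, real_size long */
    uint64_t real_size;  /* actual, possibly unaligned size */
    uint64_t align;      /* request alignment, assumed power of two */
} PaddedImage;

/* Size exposed to callers: the real size rounded up to the alignment. */
static uint64_t padded_size(const PaddedImage *img)
{
    assert(img->align != 0 && (img->align & (img->align - 1)) == 0);
    return (img->real_size + img->align - 1) & ~(img->align - 1);
}

/* Number of bytes of [off, off + len) that lie inside the real image. */
static uint64_t bytes_in_image(const PaddedImage *img, uint64_t off,
                               uint64_t len)
{
    uint64_t avail;

    if (off >= img->real_size) {
        return 0;
    }
    avail = img->real_size - off;
    return len < avail ? len : avail;
}

/* Reads past the real end of the image return zeros. */
static int padded_read(const PaddedImage *img, uint64_t off,
                       uint8_t *buf, uint64_t len)
{
    uint64_t size = padded_size(img);
    uint64_t n;

    if (off > size || len > size - off) {
        return -EINVAL;
    }
    n = bytes_in_image(img, off, len);
    if (n > 0) {
        memcpy(buf, img->data + off, n);
    }
    memset(buf + n, 0, len - n);
    return 0;
}

/* Writes may touch the absent tail only if they keep it all zeros. */
static int padded_write(PaddedImage *img, uint64_t off,
                        const uint8_t *buf, uint64_t len)
{
    uint64_t size = padded_size(img);
    uint64_t n, i;

    if (off > size || len > size - off) {
        return -EINVAL;
    }
    n = bytes_in_image(img, off, len);
    for (i = n; i < len; i++) {
        if (buf[i] != 0) {
            return -EIO;   /* would change the tail */
        }
    }
    if (n > 0) {
        memcpy(img->data + off, buf, n);
    }
    return 0;
}
```

With a 5-byte image and 4-byte alignment, padded_size() reports 8, reads of
bytes 5..7 return zeros, and a write covering bytes 4..7 succeeds only while
the last three bytes of the buffer are zero.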

> 
> With write requests, this could cause assertion failures like this if
> RESIZE permissions weren't requested:
> 
> qemu-img: block/io.c:1910: bdrv_co_write_req_prepare: Assertion `end_sector <= bs->total_sectors || child->perm & BLK_PERM_RESIZE' failed.
> 
> This was e.g. triggered by qemu-img converting to a target image with 4k
> request alignment when the image was only aligned to 512 bytes, but not
> to 4k.
> 
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>   block.c | 10 ++++++++++
>   1 file changed, 10 insertions(+)

Reviewed-by: Eric Blake <eblake@redhat.com>

> 
> diff --git a/block.c b/block.c
> index cc377d7ef3..c635777911 100644
> --- a/block.c
> +++ b/block.c
> @@ -1489,6 +1489,16 @@ static int bdrv_open_driver(BlockDriverState *bs, BlockDriver *drv,
>           return -EINVAL;
>       }
>   
> +    /*
> +     * Unaligned requests will automatically be aligned to bl.request_alignment
> +     * and we don't want to extend requests to access space beyond the end of
> +     * the image, so it's required that the image size is aligned.
> +     */
> +    if ((bs->total_sectors * BDRV_SECTOR_SIZE) % bs->bl.request_alignment) {
> +        error_setg(errp, "Image size is not a multiple of request alignment");
> +        return -EINVAL;
> +    }
> +

Do we have any iotest coverage of this new message?  (If none of our 
existing tests broke, then you should add one...)


-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




* Re: [PATCH for-5.1 2/2] file-posix: Allow byte-aligned O_DIRECT with NFS
  2020-07-10 14:21 ` [PATCH for-5.1 2/2] file-posix: Allow byte-aligned O_DIRECT with NFS Kevin Wolf
@ 2020-07-10 14:39   ` Eric Blake
  2020-07-13 16:29   ` Nir Soffer
  1 sibling, 0 replies; 20+ messages in thread
From: Eric Blake @ 2020-07-10 14:39 UTC (permalink / raw)
  To: Kevin Wolf, qemu-block; +Cc: qemu-devel, mreitz

On 7/10/20 9:21 AM, Kevin Wolf wrote:
> Since commit a6b257a08e3 ('file-posix: Handle undetectable alignment'),
> we assume that if we open a file with O_DIRECT and alignment probing
> returns 1, we just couldn't find out the real alignment requirement
> because some filesystems make the requirement only for allocated blocks.
> In this case, a safe default of 4k is used.
> 
> This is too strict NFS, which does actually allow byte-aligned requests

strict for

> even with O_DIRECT. Because we can't distinguish both cases with generic
> code, let's just look at the file system magic and disable
> s->needs_alignment for NFS. This way, O_DIRECT can still be used on NFS
> for images that are not aligned to 4k.
> 
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>   block/file-posix.c | 26 +++++++++++++++++++++++++-
>   1 file changed, 25 insertions(+), 1 deletion(-)
> 

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




* Re: [PATCH for-5.1 0/2] qemu-img convert: Fix abort with unaligned image size
  2020-07-10 14:21 [PATCH for-5.1 0/2] qemu-img convert: Fix abort with unaligned image size Kevin Wolf
  2020-07-10 14:21 ` [PATCH for-5.1 1/2] block: Require aligned image size to avoid assertion failure Kevin Wolf
  2020-07-10 14:21 ` [PATCH for-5.1 2/2] file-posix: Allow byte-aligned O_DIRECT with NFS Kevin Wolf
@ 2020-07-10 14:43 ` no-reply
  2 siblings, 0 replies; 20+ messages in thread
From: no-reply @ 2020-07-10 14:43 UTC (permalink / raw)
  To: kwolf; +Cc: kwolf, nsoffer, qemu-devel, qemu-block, mreitz

Patchew URL: https://patchew.org/QEMU/20200710142149.40962-1-kwolf@redhat.com/



Hi,

This series failed the docker-quick@centos7 build test. Please find the testing commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
make docker-image-centos7 V=1 NETWORK=1
time make docker-test-quick@centos7 SHOW_ENV=1 J=14 NETWORK=1
=== TEST SCRIPT END ===

Not run: 259
Failures: 268
Failed 1 of 119 iotests
make: *** [check-tests/check-block.sh] Error 1
Traceback (most recent call last):
  File "./tests/docker/docker.py", line 669, in <module>
    sys.exit(main())
---
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', '--label', 'com.qemu.instance.uuid=7dd2224f6d7b4c79a604d33021e80c19', '-u', '1001', '--security-opt', 'seccomp=unconfined', '--rm', '-e', 'TARGET_LIST=', '-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 'SHOW_ENV=1', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', '/home/patchew/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', '/var/tmp/patchew-tester-tmp-vz0tbkd6/src/docker-src.2020-07-10-10.27.42.8932:/var/tmp/qemu:z,ro', 'qemu:centos7', '/var/tmp/qemu/run', 'test-quick']' returned non-zero exit status 2.
filter=--filter=label=com.qemu.instance.uuid=7dd2224f6d7b4c79a604d33021e80c19
make[1]: *** [docker-run] Error 1
make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-vz0tbkd6/src'
make: *** [docker-run-test-quick@centos7] Error 2

real    16m16.326s
user    0m9.052s


The full log is available at
http://patchew.org/logs/20200710142149.40962-1-kwolf@redhat.com/testing.docker-quick@centos7/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com


* Re: [PATCH for-5.1 1/2] block: Require aligned image size to avoid assertion failure
  2020-07-10 14:21 ` [PATCH for-5.1 1/2] block: Require aligned image size to avoid assertion failure Kevin Wolf
  2020-07-10 14:37   ` Eric Blake
@ 2020-07-13 11:19   ` Max Reitz
  2020-07-13 11:52     ` Max Reitz
  2020-07-13 14:29     ` Kevin Wolf
  2020-07-13 16:33   ` Nir Soffer
  2 siblings, 2 replies; 20+ messages in thread
From: Max Reitz @ 2020-07-13 11:19 UTC (permalink / raw)
  To: Kevin Wolf, qemu-block; +Cc: nsoffer, qemu-devel



On 10.07.20 16:21, Kevin Wolf wrote:
> Unaligned requests will automatically be aligned to bl.request_alignment
> and we don't want to extend requests to access space beyond the end of
> the image, so it's required that the image size is aligned.
> 
> With write requests, this could cause assertion failures like this if
> RESIZE permissions weren't requested:
> 
> qemu-img: block/io.c:1910: bdrv_co_write_req_prepare: Assertion `end_sector <= bs->total_sectors || child->perm & BLK_PERM_RESIZE' failed.
> 
> This was e.g. triggered by qemu-img converting to a target image with 4k
> request alignment when the image was only aligned to 512 bytes, but not
> to 4k.
> 
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  block.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)

(I think we had some proposal like this before, but I can’t find it,
unfortunately...)

I can’t see how with this patch you could create qcow2 images and then
use them with direct I/O, because AFAICS, qemu-img create doesn’t allow
specifying caching options, so AFAIU you’re stuck with:

$ ./qemu-img create -f qcow2 /mnt/tmp/foo.qcow2 1M
Formatting '/mnt/tmp/foo.qcow2', fmt=qcow2 cluster_size=65536
compression_type=zlib size=1048576 lazy_refcounts=off refcount_bits=16

$ sudo ./qemu-io -t none /mnt/tmp/foo.qcow2
qemu-io: can't open device /mnt/tmp/foo.qcow2: Image size is not a
multiple of request alignment

(/mnt/tmp is a filesystem on a “losetup -b 4096” device.)

Or you use blockdev-create, that seems to work (because of course you
can set the cache mode on the protocol node when you open it for
formatting).  But, well, I think there should be a working qemu-img
create case.

Also, I’m afraid of breaking existing use cases with this patch (just
qemu-img create + using the image with cache=none).

Max




* Re: [PATCH for-5.1 1/2] block: Require aligned image size to avoid assertion failure
  2020-07-13 11:19   ` Max Reitz
@ 2020-07-13 11:52     ` Max Reitz
  2020-07-13 14:29     ` Kevin Wolf
  1 sibling, 0 replies; 20+ messages in thread
From: Max Reitz @ 2020-07-13 11:52 UTC (permalink / raw)
  To: Kevin Wolf, qemu-block; +Cc: nsoffer, qemu-devel



On 13.07.20 13:19, Max Reitz wrote:
> On 10.07.20 16:21, Kevin Wolf wrote:
>> Unaligned requests will automatically be aligned to bl.request_alignment
>> and we don't want to extend requests to access space beyond the end of
>> the image, so it's required that the image size is aligned.
>>
>> With write requests, this could cause assertion failures like this if
>> RESIZE permissions weren't requested:
>>
>> qemu-img: block/io.c:1910: bdrv_co_write_req_prepare: Assertion `end_sector <= bs->total_sectors || child->perm & BLK_PERM_RESIZE' failed.
>>
>> This was e.g. triggered by qemu-img converting to a target image with 4k
>> request alignment when the image was only aligned to 512 bytes, but not
>> to 4k.
>>
>> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
>> ---
>>  block.c | 10 ++++++++++
>>  1 file changed, 10 insertions(+)
> 
> (I think we had some proposal like this before, but I can’t find it,
> unfortunately...)

(Ah, here it is:

https://lists.nongnu.org/archive/html/qemu-devel/2020-03/msg03077.html

(Which interestingly teases yet another mysterious “we had a discussion
on this before”...))

> I can’t see how with this patch you could create qcow2 images and then
> use them with direct I/O, because AFAICS, qemu-img create doesn’t allow
> specifying caching options, so AFAIU you’re stuck with:
> 
> $ ./qemu-img create -f qcow2 /mnt/tmp/foo.qcow2 1M
> Formatting '/mnt/tmp/foo.qcow2', fmt=qcow2 cluster_size=65536
> compression_type=zlib size=1048576 lazy_refcounts=off refcount_bits=16
> 
> $ sudo ./qemu-io -t none /mnt/tmp/foo.qcow2
> qemu-io: can't open device /mnt/tmp/foo.qcow2: Image size is not a
> multiple of request alignment
> 
> (/mnt/tmp is a filesystem on a “losetup -b 4096” device.)
> 
> Or you use blockdev-create, that seems to work (because of course you
> can set the cache mode on the protocol node when you open it for
> formatting).  But, well, I think there should be a working qemu-img
> create case.
> 
> Also, I’m afraid of breaking existing use cases with this patch (just
> qemu-img create + using the image with cache=none).
> 
> Max
> 





* Re: [PATCH for-5.1 1/2] block: Require aligned image size to avoid assertion failure
  2020-07-13 11:19   ` Max Reitz
  2020-07-13 11:52     ` Max Reitz
@ 2020-07-13 14:29     ` Kevin Wolf
  2020-07-14  9:56       ` Max Reitz
  1 sibling, 1 reply; 20+ messages in thread
From: Kevin Wolf @ 2020-07-13 14:29 UTC (permalink / raw)
  To: Max Reitz; +Cc: nsoffer, qemu-devel, qemu-block


Am 13.07.2020 um 13:19 hat Max Reitz geschrieben:
> On 10.07.20 16:21, Kevin Wolf wrote:
> > Unaligned requests will automatically be aligned to bl.request_alignment
> > and we don't want to extend requests to access space beyond the end of
> > the image, so it's required that the image size is aligned.
> > 
> > With write requests, this could cause assertion failures like this if
> > RESIZE permissions weren't requested:
> > 
> > qemu-img: block/io.c:1910: bdrv_co_write_req_prepare: Assertion `end_sector <= bs->total_sectors || child->perm & BLK_PERM_RESIZE' failed.
> > 
> > This was e.g. triggered by qemu-img converting to a target image with 4k
> > request alignment when the image was only aligned to 512 bytes, but not
> > to 4k.
> > 
> > Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> > ---
> >  block.c | 10 ++++++++++
> >  1 file changed, 10 insertions(+)
> 
> (I think we had some proposal like this before, but I can’t find it,
> unfortunately...)
> 
> I can’t see how with this patch you could create qcow2 images and then
> use them with direct I/O, because AFAICS, qemu-img create doesn’t allow
> specifying caching options, so AFAIU you’re stuck with:
> 
> $ ./qemu-img create -f qcow2 /mnt/tmp/foo.qcow2 1M
> Formatting '/mnt/tmp/foo.qcow2', fmt=qcow2 cluster_size=65536
> compression_type=zlib size=1048576 lazy_refcounts=off refcount_bits=16
> 
> $ sudo ./qemu-io -t none /mnt/tmp/foo.qcow2
> qemu-io: can't open device /mnt/tmp/foo.qcow2: Image size is not a
> multiple of request alignment
> 
> (/mnt/tmp is a filesystem on a “losetup -b 4096” device.)

Hm, that looks like some regrettable collateral damage...

Well, you could argue that we should be writing full L1 tables with zero
padding instead of just the used part. I thought we had fixed this long
ago. But looks like we haven't.

But we should still avoid crashing in other cases, so what is the
difference between both? Is it just that qcow2 has the RESIZE permission
anyway so it doesn't matter?

If so, maybe attaching to a block node with WRITE, but not RESIZE is
what needs to fail when the image size is unaligned?
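
A minimal sketch of that idea, to make the proposed rule concrete. This is
not the actual block layer permission code: PERM_WRITE and PERM_RESIZE are
hypothetical stand-ins for BLK_PERM_WRITE and BLK_PERM_RESIZE, and
check_attach_perm() is a made-up helper.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for BLK_PERM_WRITE / BLK_PERM_RESIZE. */
enum {
    PERM_WRITE  = 1 << 0,
    PERM_RESIZE = 1 << 1,
};

/*
 * Reject a parent that wants WRITE without RESIZE on a node whose size is
 * not a multiple of the request alignment. Such a write could be widened
 * past the end of the image, which is what trips the assertion.
 */
static int check_attach_perm(uint64_t image_size, uint64_t request_alignment,
                             unsigned perm)
{
    bool unaligned;

    assert(request_alignment != 0);
    unaligned = (image_size % request_alignment) != 0;

    if (unaligned && (perm & PERM_WRITE) && !(perm & PERM_RESIZE)) {
        return -EINVAL;
    }
    return 0;
}
```

Under this rule, read-only users and users that already hold RESIZE (which
appears to be the qcow2 case) would still be able to open an unaligned
image; only WRITE without RESIZE is refused.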

> Or you use blockdev-create, that seems to work (because of course you
> can set the cache mode on the protocol node when you open it for
> formatting).  But, well, I think there should be a working qemu-img
> create case.
> 
> Also, I’m afraid of breaking existing use cases with this patch (just
> qemu-img create + using the image with cache=none).

I think for raw images, failure on start is better than crashing when
the VM is running. The qcow2 case needs to be fixed, of course.

Either way, I guess patch 2 can already be merged and would solve at
least the immediate bug report.

Kevin



* Re: [PATCH for-5.1 2/2] file-posix: Allow byte-aligned O_DIRECT with NFS
  2020-07-10 14:21 ` [PATCH for-5.1 2/2] file-posix: Allow byte-aligned O_DIRECT with NFS Kevin Wolf
  2020-07-10 14:39   ` Eric Blake
@ 2020-07-13 16:29   ` Nir Soffer
  1 sibling, 0 replies; 20+ messages in thread
From: Nir Soffer @ 2020-07-13 16:29 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: QEMU Developers, qemu-block, Max Reitz

On Fri, Jul 10, 2020 at 5:22 PM Kevin Wolf <kwolf@redhat.com> wrote:
>
> Since commit a6b257a08e3 ('file-posix: Handle undetectable alignment'),
> we assume that if we open a file with O_DIRECT and alignment probing
> returns 1, we just couldn't find out the real alignment requirement
> because some filesystems make the requirement only for allocated blocks.
> In this case, a safe default of 4k is used.
>
> This is too strict NFS, which does actually allow byte-aligned requests
> even with O_DIRECT. Because we can't distinguish both cases with generic
> code, let's just look at the file system magic and disable
> s->needs_alignment for NFS. This way, O_DIRECT can still be used on NFS
> for images that are not aligned to 4k.
>
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  block/file-posix.c | 26 +++++++++++++++++++++++++-
>  1 file changed, 25 insertions(+), 1 deletion(-)
>
> diff --git a/block/file-posix.c b/block/file-posix.c
> index 0c4e07c415..4e9dac461b 100644
> --- a/block/file-posix.c
> +++ b/block/file-posix.c
> @@ -62,10 +62,12 @@
>  #include <sys/ioctl.h>
>  #include <sys/param.h>
>  #include <sys/syscall.h>
> +#include <sys/vfs.h>
>  #include <linux/cdrom.h>
>  #include <linux/fd.h>
>  #include <linux/fs.h>
>  #include <linux/hdreg.h>
> +#include <linux/magic.h>
>  #include <scsi/sg.h>
>  #ifdef __s390__
>  #include <asm/dasd.h>
> @@ -300,6 +302,28 @@ static int probe_physical_blocksize(int fd, unsigned int *blk_size)
>  #endif
>  }
>
> +/*
> + * Returns true if no alignment restrictions are necessary even for files
> + * opened with O_DIRECT.
> + *
> + * raw_probe_alignment() probes the required alignment and assume that 1 means
> + * the probing failed, so it falls back to a safe default of 4k. This can be
> + * avoided if we know that byte alignment is okay for the file.
> + */
> +static bool dio_byte_aligned(int fd)
> +{
> +#ifdef __linux__
> +    struct statfs buf;
> +    int ret;
> +
> +    ret = fstatfs(fd, &buf);
> +    if (ret == 0 && buf.f_type == NFS_SUPER_MAGIC) {
> +        return true;
> +    }
> +#endif
> +    return false;
> +}
> +
>  /* Check if read is allowed with given memory buffer and length.
>   *
>   * This function is used to check O_DIRECT memory buffer and request alignment.
> @@ -631,7 +655,7 @@ static int raw_open_common(BlockDriverState *bs, QDict *options,
>
>      s->has_discard = true;
>      s->has_write_zeroes = true;
> -    if ((bs->open_flags & BDRV_O_NOCACHE) != 0) {
> +    if ((bs->open_flags & BDRV_O_NOCACHE) != 0 && !dio_byte_aligned(s->fd)) {
>          s->needs_alignment = true;

I did not know we have needs_alignment. Isn't this the same as using
request_alignment = 1?

For example we can check if we are on NFS and avoid the fallback to max_align:

    if (!bs->bl.request_alignment) {
        int i;
        size_t align;
        buf = qemu_memalign(max_align, max_align);
        for (i = 0; i < ARRAY_SIZE(alignments); i++) {
            align = alignments[i];
            if (raw_is_io_aligned(fd, buf, align)) {
                /* Fallback to safe value. */
                bs->bl.request_alignment = (align != 1) ? align : max_align;
                break;
            }
        }
        qemu_vfree(buf);
    }

After this we will have correct bl.request_alignment and buf_align.
Hopefully this will not break code expecting request_alignment >= 512.

Assuming that needs_alignment is well tested, this patch may be safer.
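
To spell out the alternative, a rough standalone sketch of the decision
follows. It is not the file-posix code: fd_is_nfs() and
choose_request_alignment() are hypothetical helpers, and the statfs check
simply mirrors what dio_byte_aligned() in this patch does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#ifdef __linux__
#include <sys/vfs.h>
#include <linux/magic.h>
#endif

/* Hypothetical helper: true if fd lives on NFS (Linux only). */
static bool fd_is_nfs(int fd)
{
#ifdef __linux__
    struct statfs buf;

    if (fstatfs(fd, &buf) == 0 && buf.f_type == NFS_SUPER_MAGIC) {
        return true;
    }
#else
    (void)fd;
#endif
    return false;
}

/*
 * Hypothetical helper: pick the request alignment from the probed value.
 * A probed alignment of 1 is normally treated as "could not detect" and
 * replaced by max_align, unless byte alignment is known to be fine.
 */
static size_t choose_request_alignment(size_t probed, size_t max_align,
                                       bool byte_aligned_ok)
{
    assert(probed != 0 && max_align != 0);

    if (probed == 1 && !byte_aligned_ok) {
        return max_align;   /* fall back to the safe value */
    }
    return probed;
}
```

A caller would pass fd_is_nfs(fd) as byte_aligned_ok, so that NFS keeps a
request alignment of 1 while other filesystems still get the fallback.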

Nir

>      }
>
> --
> 2.25.4
>




* Re: [PATCH for-5.1 1/2] block: Require aligned image size to avoid assertion failure
  2020-07-10 14:21 ` [PATCH for-5.1 1/2] block: Require aligned image size to avoid assertion failure Kevin Wolf
  2020-07-10 14:37   ` Eric Blake
  2020-07-13 11:19   ` Max Reitz
@ 2020-07-13 16:33   ` Nir Soffer
  2020-07-13 16:56     ` Kevin Wolf
  2 siblings, 1 reply; 20+ messages in thread
From: Nir Soffer @ 2020-07-13 16:33 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: QEMU Developers, qemu-block, Max Reitz

On Fri, Jul 10, 2020 at 5:22 PM Kevin Wolf <kwolf@redhat.com> wrote:
>
> Unaligned requests will automatically be aligned to bl.request_alignment
> and we don't want to extend requests to access space beyond the end of
> the image, so it's required that the image size is aligned.
>
> With write requests, this could cause assertion failures like this if
> RESIZE permissions weren't requested:
>
> qemu-img: block/io.c:1910: bdrv_co_write_req_prepare: Assertion `end_sector <= bs->total_sectors || child->perm & BLK_PERM_RESIZE' failed.
>
> This was e.g. triggered by qemu-img converting to a target image with 4k
> request alignment when the image was only aligned to 512 bytes, but not
> to 4k.

Was it on NFS? Shouldn't this be fixed by the next patch then?

>
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  block.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
>
> diff --git a/block.c b/block.c
> index cc377d7ef3..c635777911 100644
> --- a/block.c
> +++ b/block.c
> @@ -1489,6 +1489,16 @@ static int bdrv_open_driver(BlockDriverState *bs, BlockDriver *drv,
>          return -EINVAL;
>      }
>
> +    /*
> +     * Unaligned requests will automatically be aligned to bl.request_alignment
> +     * and we don't want to extend requests to access space beyond the end of
> +     * the image, so it's required that the image size is aligned.
> +     */
> +    if ((bs->total_sectors * BDRV_SECTOR_SIZE) % bs->bl.request_alignment) {
> +        error_setg(errp, "Image size is not a multiple of request alignment");
> +        return -EINVAL;
> +    }
> +
>      assert(bdrv_opt_mem_align(bs) != 0);
>      assert(bdrv_min_mem_align(bs) != 0);
>      assert(is_power_of_2(bs->bl.request_alignment));
> --
> 2.25.4
>




* Re: [PATCH for-5.1 1/2] block: Require aligned image size to avoid assertion failure
  2020-07-13 16:33   ` Nir Soffer
@ 2020-07-13 16:56     ` Kevin Wolf
  2020-07-15 13:22       ` Nir Soffer
  0 siblings, 1 reply; 20+ messages in thread
From: Kevin Wolf @ 2020-07-13 16:56 UTC (permalink / raw)
  To: Nir Soffer; +Cc: QEMU Developers, qemu-block, Max Reitz

Am 13.07.2020 um 18:33 hat Nir Soffer geschrieben:
> On Fri, Jul 10, 2020 at 5:22 PM Kevin Wolf <kwolf@redhat.com> wrote:
> >
> > Unaligned requests will automatically be aligned to bl.request_alignment
> > and we don't want to extend requests to access space beyond the end of
> > the image, so it's required that the image size is aligned.
> >
> > With write requests, this could cause assertion failures like this if
> > RESIZE permissions weren't requested:
> >
> > qemu-img: block/io.c:1910: bdrv_co_write_req_prepare: Assertion `end_sector <= bs->total_sectors || child->perm & BLK_PERM_RESIZE' failed.
> >
> > This was e.g. triggered by qemu-img converting to a target image with 4k
> > request alignment when the image was only aligned to 512 bytes, but not
> > to 4k.
> 
> Was it on NFS? Shouldn't this be fixed by the next patch then?

Patch 2 makes the problem go away for NFS because NFS doesn't even
require the 4k alignment. But on storage that legitimately needs 4k
alignment (or possibly other filesystems that are misdetected), you
would still hit the same problem.
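
To make the arithmetic concrete, here is a small sketch with made-up
numbers. SECTOR_SIZE stands in for BDRV_SECTOR_SIZE, and
aligned_end_sector() and size_is_aligned() are hypothetical helpers, not
block layer functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SECTOR_SIZE 512 /* stand-in for BDRV_SECTOR_SIZE */

/* Round n up to a multiple of align (align must be a power of two). */
static uint64_t align_up(uint64_t n, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (n + align - 1) & ~(align - 1);
}

/* End sector of a request after it was widened to the request alignment. */
static uint64_t aligned_end_sector(uint64_t offset, uint64_t bytes,
                                   uint64_t align)
{
    return align_up(offset + bytes, align) / SECTOR_SIZE;
}

/* The condition that patch 1 checks when opening an image. */
static bool size_is_aligned(uint64_t total_sectors, uint64_t align)
{
    return (total_sectors * SECTOR_SIZE) % align == 0;
}
```

Take an image of 1 MiB + 512 bytes (2049 sectors) with 4k request
alignment. A write to the last 512 bytes (offset 1048576) is widened to end
at byte 1052672, i.e. end sector 2056, which is beyond total_sectors, so
the assertion fails unless RESIZE was requested.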

Kevin




* Re: [PATCH for-5.1 1/2] block: Require aligned image size to avoid assertion failure
  2020-07-13 14:29     ` Kevin Wolf
@ 2020-07-14  9:56       ` Max Reitz
  2020-07-14 11:08         ` Kevin Wolf
  0 siblings, 1 reply; 20+ messages in thread
From: Max Reitz @ 2020-07-14  9:56 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: nsoffer, qemu-devel, qemu-block



On 13.07.20 16:29, Kevin Wolf wrote:
> Am 13.07.2020 um 13:19 hat Max Reitz geschrieben:
>> On 10.07.20 16:21, Kevin Wolf wrote:
>>> Unaligned requests will automatically be aligned to bl.request_alignment
>>> and we don't want to extend requests to access space beyond the end of
>>> the image, so it's required that the image size is aligned.
>>>
>>> With write requests, this could cause assertion failures like this if
>>> RESIZE permissions weren't requested:
>>>
>>> qemu-img: block/io.c:1910: bdrv_co_write_req_prepare: Assertion `end_sector <= bs->total_sectors || child->perm & BLK_PERM_RESIZE' failed.
>>>
>>> This was e.g. triggered by qemu-img converting to a target image with 4k
>>> request alignment when the image was only aligned to 512 bytes, but not
>>> to 4k.
>>>
>>> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
>>> ---
>>>  block.c | 10 ++++++++++
>>>  1 file changed, 10 insertions(+)
>>
>> (I think we had some proposal like this before, but I can’t find it,
>> unfortunately...)
>>
>> I can’t see how with this patch you could create qcow2 images and then
>> use them with direct I/O, because AFAICS, qemu-img create doesn’t allow
>> specifying caching options, so AFAIU you’re stuck with:
>>
>> $ ./qemu-img create -f qcow2 /mnt/tmp/foo.qcow2 1M
>> Formatting '/mnt/tmp/foo.qcow2', fmt=qcow2 cluster_size=65536
>> compression_type=zlib size=1048576 lazy_refcounts=off refcount_bits=16
>>
>> $ sudo ./qemu-io -t none /mnt/tmp/foo.qcow2
>> qemu-io: can't open device /mnt/tmp/foo.qcow2: Image size is not a
>> multiple of request alignment
>>
>> (/mnt/tmp is a filesystem on a “losetup -b 4096” device.)
> 
> Hm, that looks like some regrettable collateral damage...
> 
> Well, you could argue that we should be writing full L1 tables with zero
> padding instead of just the used part. I thought we had fixed this long
> ago. But looks like we haven't.

That would help for the standard case.  It wouldn’t when the cluster
size is smaller than the request alignment, which, while maybe not
important, would still be a shame.

> But we should still avoid crashing in other cases, so what is the
> difference between both? Is it just that qcow2 has the RESIZE permission
> anyway so it doesn't matter?

I assume so.

> If so, maybe attaching to a block node with WRITE, but not RESIZE is
> what needs to fail when the image size is unaligned?

That sounds reasonable.

The obvious question is what happens when the RESIZE capability is
removed.  Dropping capabilities may never fail – I suppose we could
force-keep the RESIZE capability for such nodes?

Or we could immediately align such files to the block size once they are
opened (with the RESIZE capability).
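
A minimal sketch of the "fail on attach" idea, purely for illustration.
The permission bits and the function name are hypothetical and are not
QEMU's BLK_PERM_* definitions or any existing block-layer function; the
sketch only assumes that a parent asking for WRITE without RESIZE on an
unaligned image is the one case that has to be refused:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical permission bits for illustration; not QEMU's BLK_PERM_* values. */
#define SKETCH_PERM_WRITE  (1u << 0)
#define SKETCH_PERM_RESIZE (1u << 1)

/* Returns true if a parent may attach to the node with the given permissions. */
bool sketch_attach_allowed(uint32_t perm, uint64_t image_size,
                           uint32_t request_alignment)
{
    assert(request_alignment > 0);

    bool unaligned = (image_size % request_alignment) != 0;
    bool write_without_resize =
        (perm & SKETCH_PERM_WRITE) && !(perm & SKETCH_PERM_RESIZE);

    /* Only WRITE without RESIZE on an unaligned image is refused. */
    return !(unaligned && write_without_resize);
}
```

Under this sketch, read-only users and users holding both WRITE and RESIZE
can still attach to an unaligned image, so only the combination that can
trip the assertion is rejected.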

>> Or you use blockdev-create, that seems to work (because of course you
>> can set the cache mode on the protocol node when you open it for
>> formatting).  But, well, I think there should be a working qemu-img
>> create case.
>>
>> Also, I’m afraid of breaking existing use cases with this patch (just
>> qemu-img create + using the image with cache=none).
> 
> I think for raw images, failure on start is better than crashing when
> the VM is running.

Agreed.

> The qcow2 case needs to be fixed, of course.
> 
> Either case, I guess patch 2 can already be merged and would solve at
> least the immediate bug report.

Also true.

Max



* Re: [PATCH for-5.1 1/2] block: Require aligned image size to avoid assertion failure
  2020-07-14  9:56       ` Max Reitz
@ 2020-07-14 11:08         ` Kevin Wolf
  2020-07-14 16:22           ` Max Reitz
  0 siblings, 1 reply; 20+ messages in thread
From: Kevin Wolf @ 2020-07-14 11:08 UTC (permalink / raw)
  To: Max Reitz; +Cc: nsoffer, qemu-devel, qemu-block


On 14.07.2020 at 11:56, Max Reitz wrote:
> On 13.07.20 16:29, Kevin Wolf wrote:
> > On 13.07.2020 at 13:19, Max Reitz wrote:
> >> On 10.07.20 16:21, Kevin Wolf wrote:
> >>> Unaligned requests will automatically be aligned to bl.request_alignment
> >>> and we don't want to extend requests to access space beyond the end of
> >>> the image, so it's required that the image size is aligned.
> >>>
> >>> With write requests, this could cause assertion failures like this if
> >>> RESIZE permissions weren't requested:
> >>>
> >>> qemu-img: block/io.c:1910: bdrv_co_write_req_prepare: Assertion `end_sector <= bs->total_sectors || child->perm & BLK_PERM_RESIZE' failed.
> >>>
> >>> This was e.g. triggered by qemu-img converting to a target image with 4k
> >>> request alignment when the image was only aligned to 512 bytes, but not
> >>> to 4k.
> >>>
> >>> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> >>> ---
> >>>  block.c | 10 ++++++++++
> >>>  1 file changed, 10 insertions(+)
> >>
> >> (I think we had some proposal like this before, but I can’t find it,
> >> unfortunately...)
> >>
> >> I can’t see how with this patch you could create qcow2 images and then
> >> use them with direct I/O, because AFAICS, qemu-img create doesn’t allow
> >> specifying caching options, so AFAIU you’re stuck with:
> >>
> >> $ ./qemu-img create -f qcow2 /mnt/tmp/foo.qcow2 1M
> >> Formatting '/mnt/tmp/foo.qcow2', fmt=qcow2 cluster_size=65536
> >> compression_type=zlib size=1048576 lazy_refcounts=off refcount_bits=16
> >>
> >> $ sudo ./qemu-io -t none /mnt/tmp/foo.qcow2
> >> qemu-io: can't open device /mnt/tmp/foo.qcow2: Image size is not a
> >> multiple of request alignment
> >>
> >> (/mnt/tmp is a filesystem on a “losetup -b 4096” device.)
> > 
> > Hm, that looks like some regrettable collateral damage...
> > 
> > Well, you could argue that we should be writing full L1 tables with zero
> > padding instead of just the used part. I thought we had fixed this long
> > ago. But looks like we haven't.
> 
> That would help for the standard case.  It wouldn’t when the cluster
> size is smaller than the request alignment, which, while maybe not
> important, would still be a shame.

I don't think it would be unreasonable to require a cluster size that is
a multiple of the logical block size of your host storage if you want to
use O_DIRECT.

But we have unaligned images in practice, so this is pure theory anyway.

> > But we should still avoid crashing in other cases, so what is the
> > difference between both? Is it just that qcow2 has the RESIZE permission
> > anyway so it doesn't matter?
> 
> I assume so.
> 
> > If so, maybe attaching to a block node with WRITE, but not RESIZE is
> > what needs to fail when the image size is unaligned?
> 
> That sounds reasonable.
> 
> The obvious question is what happens when the RESIZE capability is
> removed.  Dropping capabilities may never fail – I suppose we could
> force-keep the RESIZE capability for such nodes?

It's not nice, but I think we already have this kind of behaviour for
unlocking failures. So yes, that sounds like an option.

> Or we could immediately align such files to the block size once they
> are opened (with the RESIZE capability).

Automatically resizing the image file is obviously harmless for qcow2
images, but it would be a guest-visible change for raw images. It might
be better to avoid this.

Kevin


* Re: [PATCH for-5.1 1/2] block: Require aligned image size to avoid assertion failure
  2020-07-14 11:08         ` Kevin Wolf
@ 2020-07-14 16:22           ` Max Reitz
  2020-07-15  9:20             ` Kevin Wolf
  0 siblings, 1 reply; 20+ messages in thread
From: Max Reitz @ 2020-07-14 16:22 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: nsoffer, qemu-devel, qemu-block



On 14.07.20 13:08, Kevin Wolf wrote:
> On 14.07.2020 at 11:56, Max Reitz wrote:
>> On 13.07.20 16:29, Kevin Wolf wrote:
>>> On 13.07.2020 at 13:19, Max Reitz wrote:
>>>> On 10.07.20 16:21, Kevin Wolf wrote:
>>>>> Unaligned requests will automatically be aligned to bl.request_alignment
>>>>> and we don't want to extend requests to access space beyond the end of
>>>>> the image, so it's required that the image size is aligned.
>>>>>
>>>>> With write requests, this could cause assertion failures like this if
>>>>> RESIZE permissions weren't requested:
>>>>>
>>>>> qemu-img: block/io.c:1910: bdrv_co_write_req_prepare: Assertion `end_sector <= bs->total_sectors || child->perm & BLK_PERM_RESIZE' failed.
>>>>>
>>>>> This was e.g. triggered by qemu-img converting to a target image with 4k
>>>>> request alignment when the image was only aligned to 512 bytes, but not
>>>>> to 4k.
>>>>>
>>>>> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
>>>>> ---
>>>>>  block.c | 10 ++++++++++
>>>>>  1 file changed, 10 insertions(+)
>>>>
>>>> (I think we had some proposal like this before, but I can’t find it,
>>>> unfortunately...)
>>>>
>>>> I can’t see how with this patch you could create qcow2 images and then
>>>> use them with direct I/O, because AFAICS, qemu-img create doesn’t allow
>>>> specifying caching options, so AFAIU you’re stuck with:
>>>>
>>>> $ ./qemu-img create -f qcow2 /mnt/tmp/foo.qcow2 1M
>>>> Formatting '/mnt/tmp/foo.qcow2', fmt=qcow2 cluster_size=65536
>>>> compression_type=zlib size=1048576 lazy_refcounts=off refcount_bits=16
>>>>
>>>> $ sudo ./qemu-io -t none /mnt/tmp/foo.qcow2
>>>> qemu-io: can't open device /mnt/tmp/foo.qcow2: Image size is not a
>>>> multiple of request alignment
>>>>
>>>> (/mnt/tmp is a filesystem on a “losetup -b 4096” device.)
>>>
>>> Hm, that looks like some regrettable collateral damage...
>>>
>>> Well, you could argue that we should be writing full L1 tables with zero
>>> padding instead of just the used part. I thought we had fixed this long
>>> ago. But looks like we haven't.
>>
>> That would help for the standard case.  It wouldn’t when the cluster
>> size is smaller than the request alignment, which, while maybe not
>> important, would still be a shame.
> 
> I don't think it would be unreasonable to require a cluster size that is
> a multiple of the logical block size of your host storage if you want to
> use O_DIRECT.

True.

> But we have unaligned images in practice, so this is pure theory anyway.

Hm.  Maybe it would help to just adjust the error message to instruct
the user to resize the image to fit the request alignment?  (e.g. “is
not a multiple of the request alignment %u (try resizing the image to
%llu bytes)”)
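
A minimal sketch of how such a hint could be computed, for illustration
only. The function names are hypothetical and this is not the code from
the patch; the message text simply combines the existing error with the
wording suggested above:

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Round size up to the next multiple of alignment (alignment must be > 0).
 * Overflow near UINT64_MAX is not handled in this sketch. */
uint64_t sketch_align_up(uint64_t size, uint32_t alignment)
{
    assert(alignment > 0);
    uint64_t rem = size % alignment;
    return rem ? size + (alignment - rem) : size;
}

/* Format the hint into buf; returns snprintf()'s result. */
int sketch_format_hint(char *buf, size_t len, uint64_t size,
                       uint32_t alignment)
{
    return snprintf(buf, len,
                    "Image size is not a multiple of the request alignment %"
                    PRIu32 " (try resizing the image to %" PRIu64 " bytes)",
                    alignment, sketch_align_up(size, alignment));
}
```

For a made-up file size of 1049088 bytes (1 MiB + 512) and a 4096-byte
request alignment, the suggested size would be 1052672 bytes.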

>>> But we should still avoid crashing in other cases, so what is the
>>> difference between both? Is it just that qcow2 has the RESIZE permission
>>> anyway so it doesn't matter?
>>
>> I assume so.
>>
>>> If so, maybe attaching to a block node with WRITE, but not RESIZE is
>>> what needs to fail when the image size is unaligned?
>>
>> That sounds reasonable.
>>
>> The obvious question is what happens when the RESIZE capability is
>> removed.  Dropping capabilities may never fail – I suppose we could
>> force-keep the RESIZE capability for such nodes?
> 
> It's not nice, but I think we already have this kind of behaviour for
> unlocking failures. So yes, that sounds like an option.
> 
>> Or we could immediately align such files to the block size once they
>> are opened (with the RESIZE capability).
> 
> Automatically resizing the image file is obviously harmless for qcow2
> images, but it would be a guest-visible change for raw images. It might
> be better to avoid this.

Well, it seems to be what already happens if the guest device has taken
the RESIZE capability (i.e., whenever there’s no failing assertion).
The only difference that appears to me is just that it happens only when
writing to the end of the image instead of unconditionally when opening it.

Max



* Re: [PATCH for-5.1 1/2] block: Require aligned image size to avoid assertion failure
  2020-07-14 16:22           ` Max Reitz
@ 2020-07-15  9:20             ` Kevin Wolf
  0 siblings, 0 replies; 20+ messages in thread
From: Kevin Wolf @ 2020-07-15  9:20 UTC (permalink / raw)
  To: Max Reitz; +Cc: nsoffer, qemu-devel, qemu-block


On 14.07.2020 at 18:22, Max Reitz wrote:
> On 14.07.20 13:08, Kevin Wolf wrote:
> > On 14.07.2020 at 11:56, Max Reitz wrote:
> >> On 13.07.20 16:29, Kevin Wolf wrote:
> >>> On 13.07.2020 at 13:19, Max Reitz wrote:
> >>>> On 10.07.20 16:21, Kevin Wolf wrote:
> >>>>> Unaligned requests will automatically be aligned to bl.request_alignment
> >>>>> and we don't want to extend requests to access space beyond the end of
> >>>>> the image, so it's required that the image size is aligned.
> >>>>>
> >>>>> With write requests, this could cause assertion failures like this if
> >>>>> RESIZE permissions weren't requested:
> >>>>>
> >>>>> qemu-img: block/io.c:1910: bdrv_co_write_req_prepare: Assertion `end_sector <= bs->total_sectors || child->perm & BLK_PERM_RESIZE' failed.
> >>>>>
> >>>>> This was e.g. triggered by qemu-img converting to a target image with 4k
> >>>>> request alignment when the image was only aligned to 512 bytes, but not
> >>>>> to 4k.
> >>>>>
> >>>>> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> >>>>> ---
> >>>>>  block.c | 10 ++++++++++
> >>>>>  1 file changed, 10 insertions(+)
> >>>>
> >>>> (I think we had some proposal like this before, but I can’t find it,
> >>>> unfortunately...)
> >>>>
> >>>> I can’t see how with this patch you could create qcow2 images and then
> >>>> use them with direct I/O, because AFAICS, qemu-img create doesn’t allow
> >>>> specifying caching options, so AFAIU you’re stuck with:
> >>>>
> >>>> $ ./qemu-img create -f qcow2 /mnt/tmp/foo.qcow2 1M
> >>>> Formatting '/mnt/tmp/foo.qcow2', fmt=qcow2 cluster_size=65536
> >>>> compression_type=zlib size=1048576 lazy_refcounts=off refcount_bits=16
> >>>>
> >>>> $ sudo ./qemu-io -t none /mnt/tmp/foo.qcow2
> >>>> qemu-io: can't open device /mnt/tmp/foo.qcow2: Image size is not a
> >>>> multiple of request alignment
> >>>>
> >>>> (/mnt/tmp is a filesystem on a “losetup -b 4096” device.)
> >>>
> >>> Hm, that looks like some regrettable collateral damage...
> >>>
> >>> Well, you could argue that we should be writing full L1 tables with zero
> >>> padding instead of just the used part. I thought we had fixed this long
> >>> ago. But looks like we haven't.
> >>
> >> That would help for the standard case.  It wouldn’t when the cluster
> >> size is smaller than the request alignment, which, while maybe not
> >> important, would still be a shame.
> > 
> > I don't think it would be unreasonable to require a cluster size that is
> > a multiple of the logical block size of your host storage if you want to
> > use O_DIRECT.
> 
> True.
> 
> > But we have unaligned images in practice, so this is pure theory anyway.
> 
> Hm.  Maybe it would help to just adjust the error message to instruct
> the user to resize the image to fit the request alignment?  (e.g. “is
> not a multiple of the request alignment %u (try resizing the image to
> %llu bytes)”)

This would require management tools to automatically do this or we would
break any users that don't manually invoke QEMU. I don't think this is a
realistic option, especially since "management tools" must probably
include all those one-off shell scripts that people use.

> >>> But we should still avoid crashing in other cases, so what is the
> >>> difference between both? Is it just that qcow2 has the RESIZE permission
> >>> anyway so it doesn't matter?
> >>
> >> I assume so.
> >>
> >>> If so, maybe attaching to a block node with WRITE, but not RESIZE is
> >>> what needs to fail when the image size is unaligned?
> >>
> >> That sounds reasonable.
> >>
> >> The obvious question is what happens when the RESIZE capability is
> >> removed.  Dropping capabilities may never fail – I suppose we could
> >> force-keep the RESIZE capability for such nodes?
> > 
> > It's not nice, but I think we already have this kind of behaviour for
> > unlocking failures. So yes, that sounds like an option.
> > 
> >> Or we could immediately align such files to the block size once they
> >> are opened (with the RESIZE capability).
> > 
> > Automatically resizing the image file is obviously harmless for qcow2
> > images, but it would be a guest-visible change for raw images. It might
> > be better to avoid this.
> 
> Well, it seems to be what already happens if the guest device has taken
> the RESIZE capability (i.e., whenever there’s no failing assertion).
> The only difference that appears to me is just that it happens only when
> writing to the end of the image instead of unconditionally when opening it.

I would have considered this as part of the bug rather than a desirable
future behaviour. blk_check_byte_request() tries to catch any request
going past EOF, it just doesn't know anything about request_alignment.
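
To illustrate the gap, here is a minimal sketch with hypothetical helper
names. It is not the actual blk_check_byte_request() implementation; it
only assumes a byte-granularity EOF check on one side and a request that
is later widened to request_alignment on the other:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Byte-granularity check: does [offset, offset + bytes) stay inside the image? */
bool sketch_within_eof(uint64_t offset, uint64_t bytes, uint64_t image_size)
{
    return offset <= image_size && bytes <= image_size - offset;
}

/* End offset of the request after rounding it up to request_alignment. */
uint64_t sketch_aligned_end(uint64_t offset, uint64_t bytes,
                            uint32_t request_alignment)
{
    assert(request_alignment > 0);
    uint64_t end = offset + bytes;
    uint64_t rem = end % request_alignment;
    return rem ? end + (request_alignment - rem) : end;
}
```

With a made-up image size of 1049088 bytes (1 MiB + 512), a 512-byte write
at offset 1048576 passes the byte-granularity check, but with a 4096-byte
request alignment the widened request ends at 1052672, which is past EOF.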

Kevin


* Re: [PATCH for-5.1 1/2] block: Require aligned image size to avoid assertion failure
  2020-07-13 16:56     ` Kevin Wolf
@ 2020-07-15 13:22       ` Nir Soffer
  2020-07-15 13:42         ` Kevin Wolf
  2020-07-15 14:03         ` Daniel P. Berrangé
  0 siblings, 2 replies; 20+ messages in thread
From: Nir Soffer @ 2020-07-15 13:22 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: QEMU Developers, qemu-block, Max Reitz

On Mon, Jul 13, 2020 at 7:56 PM Kevin Wolf <kwolf@redhat.com> wrote:
>
> On 13.07.2020 at 18:33, Nir Soffer wrote:
> > On Fri, Jul 10, 2020 at 5:22 PM Kevin Wolf <kwolf@redhat.com> wrote:
> > >
> > > Unaligned requests will automatically be aligned to bl.request_alignment
> > > and we don't want to extend requests to access space beyond the end of
> > > the image, so it's required that the image size is aligned.
> > >
> > > With write requests, this could cause assertion failures like this if
> > > RESIZE permissions weren't requested:
> > >
> > > qemu-img: block/io.c:1910: bdrv_co_write_req_prepare: Assertion `end_sector <= bs->total_sectors || child->perm & BLK_PERM_RESIZE' failed.
> > >
> > > This was e.g. triggered by qemu-img converting to a target image with 4k
> > > request alignment when the image was only aligned to 512 bytes, but not
> > > to 4k.
> >
> > Was it on NFS? Shouldn't this be fixed by the next patch then?
>
> Patch 2 makes the problem go away for NFS because NFS doesn't even
> require the 4k alignment. But on storage that legitimately needs 4k
> alignment (or possibly other filesystems that are misdetected), you
> would still hit the same problem.

I want to add the oVirt point of view on this. We enforce raw image
alignment of 4k on file-based storage, and 128m on block storage, so our
raw images cannot have this issue.

We have an issue with empty qcow2 images that have an unaligned size, but
we don't create such images in normal flows.

Nir




* Re: [PATCH for-5.1 1/2] block: Require aligned image size to avoid assertion failure
  2020-07-15 13:22       ` Nir Soffer
@ 2020-07-15 13:42         ` Kevin Wolf
  2020-07-15 14:03           ` Nir Soffer
  2020-07-15 14:03         ` Daniel P. Berrangé
  1 sibling, 1 reply; 20+ messages in thread
From: Kevin Wolf @ 2020-07-15 13:42 UTC (permalink / raw)
  To: Nir Soffer; +Cc: QEMU Developers, qemu-block, Max Reitz

On 15.07.2020 at 15:22, Nir Soffer wrote:
> On Mon, Jul 13, 2020 at 7:56 PM Kevin Wolf <kwolf@redhat.com> wrote:
> >
> > On 13.07.2020 at 18:33, Nir Soffer wrote:
> > > On Fri, Jul 10, 2020 at 5:22 PM Kevin Wolf <kwolf@redhat.com> wrote:
> > > >
> > > > Unaligned requests will automatically be aligned to bl.request_alignment
> > > > and we don't want to extend requests to access space beyond the end of
> > > > the image, so it's required that the image size is aligned.
> > > >
> > > > With write requests, this could cause assertion failures like this if
> > > > RESIZE permissions weren't requested:
> > > >
> > > > qemu-img: block/io.c:1910: bdrv_co_write_req_prepare: Assertion `end_sector <= bs->total_sectors || child->perm & BLK_PERM_RESIZE' failed.
> > > >
> > > > This was e.g. triggered by qemu-img converting to a target image with 4k
> > > > request alignment when the image was only aligned to 512 bytes, but not
> > > > to 4k.
> > >
> > > Was it on NFS? Shouldn't this be fixed by the next patch then?
> >
> > Patch 2 makes the problem go away for NFS because NFS doesn't even
> > require the 4k alignment. But on storage that legitimately needs 4k
> > alignment (or possibly other filesystems that are misdetected), you
> > would still hit the same problem.
> 
> I want to add the oVirt point of view on this. We enforce raw image
> alignment of 4k on file based storage, and 128m on block storage, so
> our raw images cannot have this issue.

Yes, then you won't hit the problem.

> We have an issue with empty qcow2 images that have an unaligned size,
> but we don't create such images in normal flows.

Can you give a reproducer where qcow2 images would be affected?
Generally speaking, the qcow2 driver either takes both WRITE and RESIZE
permissions or neither. So it should just automatically resize the image
as needed instead of crashing.

Kevin




* Re: [PATCH for-5.1 1/2] block: Require aligned image size to avoid assertion failure
  2020-07-15 13:22       ` Nir Soffer
  2020-07-15 13:42         ` Kevin Wolf
@ 2020-07-15 14:03         ` Daniel P. Berrangé
  1 sibling, 0 replies; 20+ messages in thread
From: Daniel P. Berrangé @ 2020-07-15 14:03 UTC (permalink / raw)
  To: Nir Soffer; +Cc: Kevin Wolf, QEMU Developers, qemu-block, Max Reitz

On Wed, Jul 15, 2020 at 04:22:06PM +0300, Nir Soffer wrote:
> On Mon, Jul 13, 2020 at 7:56 PM Kevin Wolf <kwolf@redhat.com> wrote:
> >
> > On 13.07.2020 at 18:33, Nir Soffer wrote:
> > > On Fri, Jul 10, 2020 at 5:22 PM Kevin Wolf <kwolf@redhat.com> wrote:
> > > >
> > > > Unaligned requests will automatically be aligned to bl.request_alignment
> > > > and we don't want to extend requests to access space beyond the end of
> > > > the image, so it's required that the image size is aligned.
> > > >
> > > > With write requests, this could cause assertion failures like this if
> > > > RESIZE permissions weren't requested:
> > > >
> > > > qemu-img: block/io.c:1910: bdrv_co_write_req_prepare: Assertion `end_sector <= bs->total_sectors || child->perm & BLK_PERM_RESIZE' failed.
> > > >
> > > > This was e.g. triggered by qemu-img converting to a target image with 4k
> > > > request alignment when the image was only aligned to 512 bytes, but not
> > > > to 4k.
> > >
> > > Was it on NFS? Shouldn't this be fixed by the next patch then?
> >
> > Patch 2 makes the problem go away for NFS because NFS doesn't even
> > require the 4k alignment. But on storage that legitimately needs 4k
> > alignment (or possibly other filesystems that are misdetected), you
> > would still hit the same problem.
> 
> I want to add the oVirt point of view on this. We enforce raw image
> alignment of 4k on file-based storage, and 128m on block storage, so
> our raw images cannot have this issue.

OpenStack should have a minimum alignment of 1 GB for image sizes, so
this change is also no trouble for it.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|




* Re: [PATCH for-5.1 1/2] block: Require aligned image size to avoid assertion failure
  2020-07-15 13:42         ` Kevin Wolf
@ 2020-07-15 14:03           ` Nir Soffer
  0 siblings, 0 replies; 20+ messages in thread
From: Nir Soffer @ 2020-07-15 14:03 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: QEMU Developers, qemu-block, Max Reitz

On Wed, Jul 15, 2020 at 4:42 PM Kevin Wolf <kwolf@redhat.com> wrote:
>
> On 15.07.2020 at 15:22, Nir Soffer wrote:
> > On Mon, Jul 13, 2020 at 7:56 PM Kevin Wolf <kwolf@redhat.com> wrote:
> > >
> > > On 13.07.2020 at 18:33, Nir Soffer wrote:
> > > > On Fri, Jul 10, 2020 at 5:22 PM Kevin Wolf <kwolf@redhat.com> wrote:
> > > > >
> > > > > Unaligned requests will automatically be aligned to bl.request_alignment
> > > > > and we don't want to extend requests to access space beyond the end of
> > > > > the image, so it's required that the image size is aligned.
> > > > >
> > > > > With write requests, this could cause assertion failures like this if
> > > > > RESIZE permissions weren't requested:
> > > > >
> > > > > qemu-img: block/io.c:1910: bdrv_co_write_req_prepare: Assertion `end_sector <= bs->total_sectors || child->perm & BLK_PERM_RESIZE' failed.
> > > > >
> > > > > This was e.g. triggered by qemu-img converting to a target image with 4k
> > > > > request alignment when the image was only aligned to 512 bytes, but not
> > > > > to 4k.
> > > >
> > > > Was it on NFS? Shouldn't this be fixed by the next patch then?
> > >
> > > Patch 2 makes the problem go away for NFS because NFS doesn't even
> > > require the 4k alignment. But on storage that legitimately needs 4k
> > > alignment (or possibly other filesystems that are misdetected), you
> > > would still hit the same problem.
> >
> > I want to add the oVirt point of view on this. We enforce raw image
> > alignment of 4k on file based storage, and 128m on block storage, so
> > our raw images cannot have this issue.
>
> Yes, then you won't hit the problem.
>
> > We have an issue with empty qcow2 images that have an unaligned size,
> > but we don't create such images in normal flows.
>
> Can you give a reproducer where qcow2 images would be affected?
> Generally speaking, the qcow2 driver either takes both WRITE and RESIZE
> permissions or neither. So it should just automatically resize the image
> as needed instead of crashing.

I think this is a theoretical issue in other programs trying to access
the unaligned images using direct I/O.




end of thread, other threads:[~2020-07-15 14:06 UTC | newest]

Thread overview: 20+ messages
2020-07-10 14:21 [PATCH for-5.1 0/2] qemu-img convert: Fix abort with unaligned image size Kevin Wolf
2020-07-10 14:21 ` [PATCH for-5.1 1/2] block: Require aligned image size to avoid assertion failure Kevin Wolf
2020-07-10 14:37   ` Eric Blake
2020-07-13 11:19   ` Max Reitz
2020-07-13 11:52     ` Max Reitz
2020-07-13 14:29     ` Kevin Wolf
2020-07-14  9:56       ` Max Reitz
2020-07-14 11:08         ` Kevin Wolf
2020-07-14 16:22           ` Max Reitz
2020-07-15  9:20             ` Kevin Wolf
2020-07-13 16:33   ` Nir Soffer
2020-07-13 16:56     ` Kevin Wolf
2020-07-15 13:22       ` Nir Soffer
2020-07-15 13:42         ` Kevin Wolf
2020-07-15 14:03           ` Nir Soffer
2020-07-15 14:03         ` Daniel P. Berrangé
2020-07-10 14:21 ` [PATCH for-5.1 2/2] file-posix: Allow byte-aligned O_DIRECT with NFS Kevin Wolf
2020-07-10 14:39   ` Eric Blake
2020-07-13 16:29   ` Nir Soffer
2020-07-10 14:43 ` [PATCH for-5.1 0/2] qemu-img convert: Fix abort with unaligned image size no-reply
