* [PATCH 0/2] file-posix: alignment probing improvements @ 2022-11-01 19:00 Stefan Hajnoczi 2022-11-01 19:00 ` [PATCH 1/2] file-posix: fix Linux alignment probing when EIO is returned Stefan Hajnoczi 2022-11-01 19:00 ` [PATCH 2/2] file-posix: add statx(STATX_DIOALIGN) support Stefan Hajnoczi 0 siblings, 2 replies; 13+ messages in thread From: Stefan Hajnoczi @ 2022-11-01 19:00 UTC (permalink / raw) To: qemu-devel; +Cc: hreitz, qemu-block, Kevin Wolf, nsoffer, Stefan Hajnoczi These patches fix alignment probing with dm-crypt and add support for the new Linux statx(STATX_DIOALIGN) interface. Stefan Hajnoczi (2): file-posix: fix Linux alignment probing when EIO is returned file-posix: add statx(STATX_DIOALIGN) support block/file-posix.c | 96 +++++++++++++++++++++++++--------------------- 1 file changed, 52 insertions(+), 44 deletions(-) -- 2.38.1 ^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH 1/2] file-posix: fix Linux alignment probing when EIO is returned 2022-11-01 19:00 [PATCH 0/2] file-posix: alignment probing improvements Stefan Hajnoczi @ 2022-11-01 19:00 ` Stefan Hajnoczi 2022-11-02 2:27 ` Eric Biggers 2022-11-01 19:00 ` [PATCH 2/2] file-posix: add statx(STATX_DIOALIGN) support Stefan Hajnoczi 1 sibling, 1 reply; 13+ messages in thread From: Stefan Hajnoczi @ 2022-11-01 19:00 UTC (permalink / raw) To: qemu-devel; +Cc: hreitz, qemu-block, Kevin Wolf, nsoffer, Stefan Hajnoczi Linux dm-crypt returns errno EIO from unaligned O_DIRECT pread(2) calls. Alignment probing fails on dm-crypt devices because the code expects EINVAL. Treating any errno as an "unaligned" indicator would be easy, but breaks commit 22d182e82b4b ("block/raw-posix: fix launching with failed disks"). Offline disks return EIO for correctly aligned requests and EINVAL for unaligned requests. It's possible to make both dm-crypt and offline disks work: look for the transition from EINVAL to EIO instead of for a single EINVAL value. Buglink: https://gitlab.com/qemu-project/qemu/-/issues/1290 Fixes: 22d182e82b4b ("block/raw-posix: fix launching with failed disks") Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> --- block/file-posix.c | 42 +++++++++++++++--------------------------- 1 file changed, 15 insertions(+), 27 deletions(-) diff --git a/block/file-posix.c b/block/file-posix.c index b9647c5ffc..b9d62f52fe 100644 --- a/block/file-posix.c +++ b/block/file-posix.c @@ -355,31 +355,6 @@ static bool raw_needs_alignment(BlockDriverState *bs) return s->force_alignment; } -/* Check if read is allowed with given memory buffer and length. - * - * This function is used to check O_DIRECT memory buffer and request alignment. - */ -static bool raw_is_io_aligned(int fd, void *buf, size_t len) -{ - ssize_t ret = pread(fd, buf, len, 0); - - if (ret >= 0) { - return true; - } - -#ifdef __linux__ - /* The Linux kernel returns EINVAL for misaligned O_DIRECT reads. Ignore - * other errors (e.g. real I/O error), which could happen on a failed - * drive, since we only care about probing alignment. - */ - if (errno != EINVAL) { - return true; - } -#endif - - return false; -} - static void raw_probe_alignment(BlockDriverState *bs, int fd, Error **errp) { BDRVRawState *s = bs->opaque; @@ -426,34 +401,47 @@ static void raw_probe_alignment(BlockDriverState *bs, int fd, Error **errp) * try to detect buf_align, which cannot be detected in some cases (e.g. * Gluster). If buf_align cannot be detected, we fallback to the value of * request_alignment. + * + * The probing loop keeps track of the last errno so that the alignment of + * offline disks can be probed. On Linux pread(2) returns with errno EINVAL + * for most file descriptors when O_DIRECT alignment constraints are unmet. + * Offline disks fail correctly aligned pread(2) with EIO. Therefore it's + * possible to detect alignment on offline disks by observing when the + * errno changes from EINVAL to something else. */ if (!bs->bl.request_alignment) { + int last_errno = 0; int i; size_t align; buf = qemu_memalign(max_align, max_align); for (i = 0; i < ARRAY_SIZE(alignments); i++) { align = alignments[i]; - if (raw_is_io_aligned(fd, buf, align)) { + if (pread(fd, buf, align, 0) >= 0 || + (errno != EINVAL && last_errno == EINVAL)) { /* Fallback to safe value. */ bs->bl.request_alignment = (align != 1) ? align : max_align; break; } + last_errno = errno; } qemu_vfree(buf); } if (!s->buf_align) { + int last_errno = 0; int i; size_t align; buf = qemu_memalign(max_align, 2 * max_align); for (i = 0; i < ARRAY_SIZE(alignments); i++) { align = alignments[i]; - if (raw_is_io_aligned(fd, buf + align, max_align)) { + if (pread(fd, buf + align, max_align, 0) >= 0 || + (errno != EINVAL && last_errno == EINVAL)) { /* Fallback to request_alignment. */ s->buf_align = (align != 1) ? align : bs->bl.request_alignment; break; } + last_errno = errno; } qemu_vfree(buf); } -- 2.38.1 ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH 1/2] file-posix: fix Linux alignment probing when EIO is returned 2022-11-01 19:00 ` [PATCH 1/2] file-posix: fix Linux alignment probing when EIO is returned Stefan Hajnoczi @ 2022-11-02 2:27 ` Eric Biggers 2022-11-02 2:49 ` Eric Biggers 0 siblings, 1 reply; 13+ messages in thread From: Eric Biggers @ 2022-11-02 2:27 UTC (permalink / raw) To: Stefan Hajnoczi; +Cc: qemu-devel, hreitz, qemu-block, Kevin Wolf, nsoffer On Tue, Nov 01, 2022 at 03:00:30PM -0400, Stefan Hajnoczi wrote: > Linux dm-crypt returns errno EIO from unaligned O_DIRECT pread(2) calls. Citation needed. For direct I/O to block devices, the kernel's block layer checks the alignment before the I/O is actually submitted to the underlying block device. See https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/block/fops.c?h=v6.1-rc3#n306 > Buglink: https://gitlab.com/qemu-project/qemu/-/issues/1290 That "bug" seems to be based on a misunderstanding of the kernel source code, and not any actual testing. I just tested it, and the error code is EINVAL. - Eric ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/2] file-posix: fix Linux alignment probing when EIO is returned 2022-11-02 2:27 ` Eric Biggers @ 2022-11-02 2:49 ` Eric Biggers 2022-11-02 18:50 ` Stefan Hajnoczi 2022-11-03 9:52 ` Kevin Wolf 0 siblings, 2 replies; 13+ messages in thread From: Eric Biggers @ 2022-11-02 2:49 UTC (permalink / raw) To: Stefan Hajnoczi; +Cc: qemu-devel, hreitz, qemu-block, Kevin Wolf, nsoffer On Tue, Nov 01, 2022 at 07:27:16PM -0700, Eric Biggers wrote: > On Tue, Nov 01, 2022 at 03:00:30PM -0400, Stefan Hajnoczi wrote: > > Linux dm-crypt returns errno EIO from unaligned O_DIRECT pread(2) calls. > > Citation needed. For direct I/O to block devices, the kernel's block layer > checks the alignment before the I/O is actually submitted to the underlying > block device. See > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/block/fops.c?h=v6.1-rc3#n306 > > > Buglink: https://gitlab.com/qemu-project/qemu/-/issues/1290 > > That "bug" seems to be based on a misunderstanding of the kernel source code, > and not any actual testing. > > I just tested it, and the error code is EINVAL. > I think I see what's happening. The kernel code was broken just a few months ago, in v6.0 by the commit "block: relax direct io memory alignment" (https://git.kernel.org/linus/b1a000d3b8ec582d). Now the block layer lets DIO through when the user buffer is only aligned to the device's dma_alignment. But a dm-crypt device has a dma_alignment of 512 even when the crypto sector size (and thus also the logical block size) is 4096. So there is now a case where misaligned DIO can reach dm-crypt, when that shouldn't be possible. It also means that STATX_DIOALIGN will give the wrong value for stx_dio_mem_align in the above case, 512 instead of 4096. This is because STATX_DIOALIGN for block devices relies on the dma_alignment. I'll raise this on the linux-block and dm-devel mailing lists. It would be nice if people reported kernel bugs instead of silently working around them... - Eric ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/2] file-posix: fix Linux alignment probing when EIO is returned 2022-11-02 2:49 ` Eric Biggers @ 2022-11-02 18:50 ` Stefan Hajnoczi 2022-11-03 9:52 ` Kevin Wolf 1 sibling, 0 replies; 13+ messages in thread From: Stefan Hajnoczi @ 2022-11-02 18:50 UTC (permalink / raw) To: Eric Biggers; +Cc: qemu-devel, hreitz, qemu-block, Kevin Wolf, nsoffer [-- Attachment #1: Type: text/plain, Size: 1848 bytes --] On Tue, Nov 01, 2022 at 07:49:20PM -0700, Eric Biggers wrote: > On Tue, Nov 01, 2022 at 07:27:16PM -0700, Eric Biggers wrote: > > On Tue, Nov 01, 2022 at 03:00:30PM -0400, Stefan Hajnoczi wrote: > > > Linux dm-crypt returns errno EIO from unaligned O_DIRECT pread(2) calls. > > > > Citation needed. For direct I/O to block devices, the kernel's block layer > > checks the alignment before the I/O is actually submitted to the underlying > > block device. See > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/block/fops.c?h=v6.1-rc3#n306 > > > > > Buglink: https://gitlab.com/qemu-project/qemu/-/issues/1290 > > > > That "bug" seems to be based on a misunderstanding of the kernel source code, > > and not any actual testing. > > > > I just tested it, and the error code is EINVAL. > > > > I think I see what's happening. The kernel code was broken just a few months > ago, in v6.0 by the commit "block: relax direct io memory alignment" > (https://git.kernel.org/linus/b1a000d3b8ec582d). Now the block layer lets DIO > through when the user buffer is only aligned to the device's dma_alignment. But > a dm-crypt device has a dma_alignment of 512 even when the crypto sector size > (and thus also the logical block size) is 4096. So there is now a case where > misaligned DIO can reach dm-crypt, when that shouldn't be possible. > > It also means that STATX_DIOALIGN will give the wrong value for > stx_dio_mem_align in the above case, 512 instead of 4096. This is because > STATX_DIOALIGN for block devices relies on the dma_alignment. > > I'll raise this on the linux-block and dm-devel mailing lists. It would be nice > if people reported kernel bugs instead of silently working around them... Thanks! You have completed the picture of what's going on here. Stefan [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/2] file-posix: fix Linux alignment probing when EIO is returned 2022-11-02 2:49 ` Eric Biggers 2022-11-02 18:50 ` Stefan Hajnoczi @ 2022-11-03 9:52 ` Kevin Wolf 2022-11-03 13:57 ` Stefan Hajnoczi 2022-11-03 16:26 ` Eric Biggers 1 sibling, 2 replies; 13+ messages in thread From: Kevin Wolf @ 2022-11-03 9:52 UTC (permalink / raw) To: Eric Biggers; +Cc: Stefan Hajnoczi, qemu-devel, hreitz, qemu-block, nsoffer Am 02.11.2022 um 03:49 hat Eric Biggers geschrieben: > On Tue, Nov 01, 2022 at 07:27:16PM -0700, Eric Biggers wrote: > > On Tue, Nov 01, 2022 at 03:00:30PM -0400, Stefan Hajnoczi wrote: > > > Linux dm-crypt returns errno EIO from unaligned O_DIRECT pread(2) calls. > > > > Citation needed. For direct I/O to block devices, the kernel's block layer > > checks the alignment before the I/O is actually submitted to the underlying > > block device. See > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/block/fops.c?h=v6.1-rc3#n306 > > > > > Buglink: https://gitlab.com/qemu-project/qemu/-/issues/1290 > > > > That "bug" seems to be based on a misunderstanding of the kernel source code, > > and not any actual testing. > > > > I just tested it, and the error code is EINVAL. > > > > I think I see what's happening. The kernel code was broken just a few months > ago, in v6.0 by the commit "block: relax direct io memory alignment" > (https://git.kernel.org/linus/b1a000d3b8ec582d). Now the block layer lets DIO > through when the user buffer is only aligned to the device's dma_alignment. But > a dm-crypt device has a dma_alignment of 512 even when the crypto sector size > (and thus also the logical block size) is 4096. So there is now a case where > misaligned DIO can reach dm-crypt, when that shouldn't be possible. > > It also means that STATX_DIOALIGN will give the wrong value for > stx_dio_mem_align in the above case, 512 instead of 4096. This is because > STATX_DIOALIGN for block devices relies on the dma_alignment. In other words, STATX_DIOALIGN is unusable from the start because we don't know whether the information it returns is actually correct? :-/ I guess we could still use the value returned by STATX_DIOALIGN as a preferred value that we'll use if it survives probing, and otherwise fall back to the same probing we've always been doing because there was no (or no sane) way to get the information from the kernel. > I'll raise this on the linux-block and dm-devel mailing lists. It > would be nice if people reported kernel bugs instead of silently > working around them... I wasn't involved in this specific one, but in case you're wondering why I wouldn't have reported it either... On one hand, I agree with you because I want bugs in my code reported, too, but on the other hand, it has also happened to me before that you're treated as the stupid userland developer who doesn't know how the kernel works and who should better have kept his ideas of how it should work to himself - which is not exactly encouraging to report things when you can just deal with the way they are. I wouldn't have been able to tell whether in the mind of the respective maintainers, -EINVAL is required behaviour or whether that was just a totally unreasonable assumption on our side. Erring on the safe side, I'll give up an assumption that obviously doesn't match reality. And even a kernel fix now doesn't change that there are broken kernels out there and we need to work with them. So reporting it doesn't even solve our problem, it's just additional effort with limited expectations of success. Kevin ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/2] file-posix: fix Linux alignment probing when EIO is returned 2022-11-03 9:52 ` Kevin Wolf @ 2022-11-03 13:57 ` Stefan Hajnoczi 2022-11-03 16:26 ` Eric Biggers 1 sibling, 0 replies; 13+ messages in thread From: Stefan Hajnoczi @ 2022-11-03 13:57 UTC (permalink / raw) To: Kevin Wolf; +Cc: Eric Biggers, qemu-devel, hreitz, qemu-block, nsoffer [-- Attachment #1: Type: text/plain, Size: 2394 bytes --] On Thu, Nov 03, 2022 at 10:52:43AM +0100, Kevin Wolf wrote: > Am 02.11.2022 um 03:49 hat Eric Biggers geschrieben: > > On Tue, Nov 01, 2022 at 07:27:16PM -0700, Eric Biggers wrote: > > > On Tue, Nov 01, 2022 at 03:00:30PM -0400, Stefan Hajnoczi wrote: > > > > Linux dm-crypt returns errno EIO from unaligned O_DIRECT pread(2) calls. > > > > > > Citation needed. For direct I/O to block devices, the kernel's block layer > > > checks the alignment before the I/O is actually submitted to the underlying > > > block device. See > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/block/fops.c?h=v6.1-rc3#n306 > > > > > > > Buglink: https://gitlab.com/qemu-project/qemu/-/issues/1290 > > > > > > That "bug" seems to be based on a misunderstanding of the kernel source code, > > > and not any actual testing. > > > > > > I just tested it, and the error code is EINVAL. > > > > > > > I think I see what's happening. The kernel code was broken just a few months > > ago, in v6.0 by the commit "block: relax direct io memory alignment" > > (https://git.kernel.org/linus/b1a000d3b8ec582d). Now the block layer lets DIO > > through when the user buffer is only aligned to the device's dma_alignment. But > > a dm-crypt device has a dma_alignment of 512 even when the crypto sector size > > (and thus also the logical block size) is 4096. So there is now a case where > > misaligned DIO can reach dm-crypt, when that shouldn't be possible. > > > > It also means that STATX_DIOALIGN will give the wrong value for > > stx_dio_mem_align in the above case, 512 instead of 4096. This is because > > STATX_DIOALIGN for block devices relies on the dma_alignment. > > In other words, STATX_DIOALIGN is unusable from the start because we > don't know whether the information it returns is actually correct? :-/ > > I guess we could still use the value returned by STATX_DIOALIGN as a > preferred value that we'll use if it survives probing, and otherwise > fall back to the same probing we've always been doing because there was > no (or no sane) way to get the information from the kernel. Yes, it seems probing is required to verify the values returned by STATX_DIOALIGN. At least until enough time passes so we can say "STATX_DIOALIGN has been correct for X years and no one is running those old kernels anymore". Stefan [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/2] file-posix: fix Linux alignment probing when EIO is returned 2022-11-03 9:52 ` Kevin Wolf 2022-11-03 13:57 ` Stefan Hajnoczi @ 2022-11-03 16:26 ` Eric Biggers 2022-11-03 16:54 ` Eric Biggers 1 sibling, 1 reply; 13+ messages in thread From: Eric Biggers @ 2022-11-03 16:26 UTC (permalink / raw) To: Kevin Wolf; +Cc: Stefan Hajnoczi, qemu-devel, hreitz, qemu-block, nsoffer On Thu, Nov 03, 2022 at 10:52:43AM +0100, Kevin Wolf wrote: > Am 02.11.2022 um 03:49 hat Eric Biggers geschrieben: > > On Tue, Nov 01, 2022 at 07:27:16PM -0700, Eric Biggers wrote: > > > On Tue, Nov 01, 2022 at 03:00:30PM -0400, Stefan Hajnoczi wrote: > > > > Linux dm-crypt returns errno EIO from unaligned O_DIRECT pread(2) calls. > > > > > > Citation needed. For direct I/O to block devices, the kernel's block layer > > > checks the alignment before the I/O is actually submitted to the underlying > > > block device. See > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/block/fops.c?h=v6.1-rc3#n306 > > > > > > > Buglink: https://gitlab.com/qemu-project/qemu/-/issues/1290 > > > > > > That "bug" seems to be based on a misunderstanding of the kernel source code, > > > and not any actual testing. > > > > > > I just tested it, and the error code is EINVAL. > > > > > > > I think I see what's happening. The kernel code was broken just a few months > > ago, in v6.0 by the commit "block: relax direct io memory alignment" > > (https://git.kernel.org/linus/b1a000d3b8ec582d). Now the block layer lets DIO > > through when the user buffer is only aligned to the device's dma_alignment. But > > a dm-crypt device has a dma_alignment of 512 even when the crypto sector size > > (and thus also the logical block size) is 4096. So there is now a case where > > misaligned DIO can reach dm-crypt, when that shouldn't be possible. > > > > It also means that STATX_DIOALIGN will give the wrong value for > > stx_dio_mem_align in the above case, 512 instead of 4096. This is because > > STATX_DIOALIGN for block devices relies on the dma_alignment. > > In other words, STATX_DIOALIGN is unusable from the start because we > don't know whether the information it returns is actually correct? :-/ That's a silly point of view. STATX_DIOALIGN has only been in a released kernel for a few weeks (v6.0), the bug is only in one edge case, and it will get fixed quickly and backported to v6.0.y where users of 6.0 will get it. Basically every Linux API has been broken at one point or the other, but things get fixed. Direct I/O itself has been totally broken on some filesystems, so by this argument why are you even using direct I/O? Actually, why are you even using Linux? Just use one of those operating systems with no bugs instead. Also note that your alternative is probing, which is super broken because many filesystems fall back to buffered I/O for misaligned direct I/O instead of returning an error. So please cooperate with getting things fixed properly instead of continuing to use broken workarounds. > > I'll raise this on the linux-block and dm-devel mailing lists. It > > would be nice if people reported kernel bugs instead of silently > > working around them... > > I wasn't involved in this specific one, but in case you're wondering why > I wouldn't have reported it either... > > On one hand, I agree with you because I want bugs in my code reported, > too, but on the other hand, it has also happened to me before that > you're treated as the stupid userland developer who doesn't know how > the kernel works and who should better have kept his ideas of how it > should work to himself - which is not exactly encouraging to report > things when you can just deal with the way they are. I wouldn't have > been able to tell whether in the mind of the respective maintainers, > -EINVAL is required behaviour or whether that was just a totally > unreasonable assumption on our side. Erring on the safe side, I'll give > up an assumption that obviously doesn't match reality. The error code is documented in the read(2) and write(2) man pages. So clearly either the kernel was wrong or the man page was wrong. Either way it needed to be reported. - Eric ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/2] file-posix: fix Linux alignment probing when EIO is returned 2022-11-03 16:26 ` Eric Biggers @ 2022-11-03 16:54 ` Eric Biggers 2022-11-03 17:54 ` Stefan Hajnoczi 0 siblings, 1 reply; 13+ messages in thread From: Eric Biggers @ 2022-11-03 16:54 UTC (permalink / raw) To: Kevin Wolf; +Cc: Stefan Hajnoczi, qemu-devel, hreitz, qemu-block, nsoffer On Thu, Nov 03, 2022 at 04:26:14PM +0000, Eric Biggers wrote: > > In other words, STATX_DIOALIGN is unusable from the start because we > > don't know whether the information it returns is actually correct? :-/ > > That's a silly point of view. STATX_DIOALIGN has only been in a released kernel > for a few weeks (v6.0), the bug is only in one edge case, and it will get fixed > quickly and backported to v6.0.y where users of 6.0 will get it. Actually, scratch that. STATX_DIOALIGN is in 6.1, not 6.0. So it hasn't even been released yet. Upstream is currently on v6.1-rc3. So thank you for reporting (or for not reporting?) this. We'll make sure it gets fixed before release :-) - Eric ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/2] file-posix: fix Linux alignment probing when EIO is returned 2022-11-03 16:54 ` Eric Biggers @ 2022-11-03 17:54 ` Stefan Hajnoczi 0 siblings, 0 replies; 13+ messages in thread From: Stefan Hajnoczi @ 2022-11-03 17:54 UTC (permalink / raw) To: Eric Biggers; +Cc: Kevin Wolf, qemu-devel, hreitz, qemu-block, nsoffer [-- Attachment #1: Type: text/plain, Size: 825 bytes --] On Thu, Nov 03, 2022 at 04:54:09PM +0000, Eric Biggers wrote: > On Thu, Nov 03, 2022 at 04:26:14PM +0000, Eric Biggers wrote: > > > In other words, STATX_DIOALIGN is unusable from the start because we > > > don't know whether the information it returns is actually correct? :-/ > > > > That's a silly point of view. STATX_DIOALIGN has only been in a released kernel > > for a few weeks (v6.0), the bug is only in one edge case, and it will get fixed > > quickly and backported to v6.0.y where users of 6.0 will get it. > > Actually, scratch that. STATX_DIOALIGN is in 6.1, not 6.0. So it hasn't even > been released yet. Upstream is currently on v6.1-rc3. > > So thank you for reporting (or for not reporting?) this. We'll make sure it > gets fixed before release :-) That's great. Thanks! Stefan [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH 2/2] file-posix: add statx(STATX_DIOALIGN) support 2022-11-01 19:00 [PATCH 0/2] file-posix: alignment probing improvements Stefan Hajnoczi 2022-11-01 19:00 ` [PATCH 1/2] file-posix: fix Linux alignment probing when EIO is returned Stefan Hajnoczi @ 2022-11-01 19:00 ` Stefan Hajnoczi 2022-11-02 3:32 ` Eric Biggers 1 sibling, 1 reply; 13+ messages in thread From: Stefan Hajnoczi @ 2022-11-01 19:00 UTC (permalink / raw) To: qemu-devel Cc: hreitz, qemu-block, Kevin Wolf, nsoffer, Stefan Hajnoczi, Eric Biggers Linux commit 825cf206ed51 ("statx: add direct I/O alignment information") added an interface to fetch O_DIRECT alignment values for block devices and file systems. Prefer STATX_DIOALIGN to older interfaces and probing, but keep them as fallbacks in case STATX_DIOALIGN cannot provide the information. Testing shows the status of STATX_DIOALIGN support in Linux 6.1-rc3 appears to be: - btrfs: no - ext4: yes - XFS: yes - NVMe block devices: yes - dm-crypt: yes Cc: Eric Biggers <ebiggers@google.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> --- block/file-posix.c | 54 +++++++++++++++++++++++++++++++--------------- 1 file changed, 37 insertions(+), 17 deletions(-) diff --git a/block/file-posix.c b/block/file-posix.c index b9d62f52fe..00d8191880 100644 --- a/block/file-posix.c +++ b/block/file-posix.c @@ -372,28 +372,48 @@ static void raw_probe_alignment(BlockDriverState *bs, int fd, Error **errp) bs->bl.request_alignment = 0; s->buf_align = 0; + +#if defined(__linux__) && defined(STATX_DIOALIGN) + struct statx stx; + + /* + * Linux 6.1 introduced an interface for both block devices and file + * systems. The system call returns with the STATX_DIOALIGN bit cleared + * when the information is unavailable. + */ + if (statx(fd, "", AT_EMPTY_PATH, STATX_DIOALIGN, &stx) == 0 && + (stx.stx_mask & STATX_DIOALIGN)) { + bs->bl.request_alignment = stx.stx_dio_offset_align; + s->buf_align = stx.stx_dio_mem_align; + } +#endif /* defined(__linux__) && defined(STATX_DIOALIGN) */ + /* Let's try to use the logical blocksize for the alignment. */ - if (probe_logical_blocksize(fd, &bs->bl.request_alignment) < 0) { - bs->bl.request_alignment = 0; + if (!bs->bl.request_alignment) { + if (probe_logical_blocksize(fd, &bs->bl.request_alignment) < 0) { + bs->bl.request_alignment = 0; + } } #ifdef __linux__ - /* - * The XFS ioctl definitions are shipped in extra packages that might - * not always be available. Since we just need the XFS_IOC_DIOINFO ioctl - * here, we simply use our own definition instead: - */ - struct xfs_dioattr { - uint32_t d_mem; - uint32_t d_miniosz; - uint32_t d_maxiosz; - } da; - if (ioctl(fd, _IOR('X', 30, struct xfs_dioattr), &da) >= 0) { - bs->bl.request_alignment = da.d_miniosz; - /* The kernel returns wrong information for d_mem */ - /* s->buf_align = da.d_mem; */ + if (!bs->bl.request_alignment) { + /* + * The XFS ioctl definitions are shipped in extra packages that might + * not always be available. Since we just need the XFS_IOC_DIOINFO ioctl + * here, we simply use our own definition instead: + */ + struct xfs_dioattr { + uint32_t d_mem; + uint32_t d_miniosz; + uint32_t d_maxiosz; + } da; + if (ioctl(fd, _IOR('X', 30, struct xfs_dioattr), &da) >= 0) { + bs->bl.request_alignment = da.d_miniosz; + /* The kernel returns wrong information for d_mem */ + /* s->buf_align = da.d_mem; */ + } } -#endif +#endif /* __linux__ */ /* * If we could not get the sizes so far, we can only guess them. First try -- 2.38.1 ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH 2/2] file-posix: add statx(STATX_DIOALIGN) support 2022-11-01 19:00 ` [PATCH 2/2] file-posix: add statx(STATX_DIOALIGN) support Stefan Hajnoczi @ 2022-11-02 3:32 ` Eric Biggers 2022-11-02 18:53 ` Stefan Hajnoczi 0 siblings, 1 reply; 13+ messages in thread From: Eric Biggers @ 2022-11-02 3:32 UTC (permalink / raw) To: Stefan Hajnoczi; +Cc: qemu-devel, hreitz, qemu-block, Kevin Wolf, nsoffer On Tue, Nov 01, 2022 at 03:00:31PM -0400, Stefan Hajnoczi wrote: > /* Let's try to use the logical blocksize for the alignment. */ > - if (probe_logical_blocksize(fd, &bs->bl.request_alignment) < 0) { > - bs->bl.request_alignment = 0; > + if (!bs->bl.request_alignment) { > + if (probe_logical_blocksize(fd, &bs->bl.request_alignment) < 0) { > + bs->bl.request_alignment = 0; > + } > } > > #ifdef __linux__ > - /* > - * The XFS ioctl definitions are shipped in extra packages that might > - * not always be available. Since we just need the XFS_IOC_DIOINFO ioctl > - * here, we simply use our own definition instead: > - */ > - struct xfs_dioattr { > - uint32_t d_mem; > - uint32_t d_miniosz; > - uint32_t d_maxiosz; > - } da; > - if (ioctl(fd, _IOR('X', 30, struct xfs_dioattr), &da) >= 0) { > - bs->bl.request_alignment = da.d_miniosz; > - /* The kernel returns wrong information for d_mem */ > - /* s->buf_align = da.d_mem; */ > + if (!bs->bl.request_alignment) { This patch changes the fallback code to make the request_alignment value from probe_logical_blocksize() override the value from XFS_IOC_DIOINFO. Is that intentional? > + if (ioctl(fd, _IOR('X', 30, struct xfs_dioattr), &da) >= 0) { > + bs->bl.request_alignment = da.d_miniosz; > + /* The kernel returns wrong information for d_mem */ > + /* s->buf_align = da.d_mem; */ Has this bug been reported to the XFS developers (linux-xfs@vger.kernel.org)? - Eric ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 2/2] file-posix: add statx(STATX_DIOALIGN) support 2022-11-02 3:32 ` Eric Biggers @ 2022-11-02 18:53 ` Stefan Hajnoczi 0 siblings, 0 replies; 13+ messages in thread From: Stefan Hajnoczi @ 2022-11-02 18:53 UTC (permalink / raw) To: Eric Biggers, pbonzini Cc: qemu-devel, hreitz, qemu-block, Kevin Wolf, nsoffer [-- Attachment #1: Type: text/plain, Size: 1965 bytes --] On Tue, Nov 01, 2022 at 08:32:30PM -0700, Eric Biggers wrote: > On Tue, Nov 01, 2022 at 03:00:31PM -0400, Stefan Hajnoczi wrote: > > /* Let's try to use the logical blocksize for the alignment. */ > > - if (probe_logical_blocksize(fd, &bs->bl.request_alignment) < 0) { > > - bs->bl.request_alignment = 0; > > + if (!bs->bl.request_alignment) { > > + if (probe_logical_blocksize(fd, &bs->bl.request_alignment) < 0) { > > + bs->bl.request_alignment = 0; > > + } > > } > > > > #ifdef __linux__ > > - /* > > - * The XFS ioctl definitions are shipped in extra packages that might > > - * not always be available. Since we just need the XFS_IOC_DIOINFO ioctl > > - * here, we simply use our own definition instead: > > - */ > > - struct xfs_dioattr { > > - uint32_t d_mem; > > - uint32_t d_miniosz; > > - uint32_t d_maxiosz; > > - } da; > > - if (ioctl(fd, _IOR('X', 30, struct xfs_dioattr), &da) >= 0) { > > - bs->bl.request_alignment = da.d_miniosz; > > - /* The kernel returns wrong information for d_mem */ > > - /* s->buf_align = da.d_mem; */ > > + if (!bs->bl.request_alignment) { > > This patch changes the fallback code to make the request_alignment value from > probe_logical_blocksize() override the value from XFS_IOC_DIOINFO. Is that > intentional? Thanks for pointing out the bug. That was not intentional. Will fix. > > + if (ioctl(fd, _IOR('X', 30, struct xfs_dioattr), &da) >= 0) { > > + bs->bl.request_alignment = da.d_miniosz; > > + /* The kernel returns wrong information for d_mem */ > > + /* s->buf_align = da.d_mem; */ > > Has this bug been reported to the XFS developers (linux-xfs@vger.kernel.org)? Paolo: Do you remember if you reported this when you wrote commit c25f53b06eba ("raw: Probe required direct I/O alignment")? Stefan [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2022-11-03 17:55 UTC | newest] Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2022-11-01 19:00 [PATCH 0/2] file-posix: alignment probing improvements Stefan Hajnoczi 2022-11-01 19:00 ` [PATCH 1/2] file-posix: fix Linux alignment probing when EIO is returned Stefan Hajnoczi 2022-11-02 2:27 ` Eric Biggers 2022-11-02 2:49 ` Eric Biggers 2022-11-02 18:50 ` Stefan Hajnoczi 2022-11-03 9:52 ` Kevin Wolf 2022-11-03 13:57 ` Stefan Hajnoczi 2022-11-03 16:26 ` Eric Biggers 2022-11-03 16:54 ` Eric Biggers 2022-11-03 17:54 ` Stefan Hajnoczi 2022-11-01 19:00 ` [PATCH 2/2] file-posix: add statx(STATX_DIOALIGN) support Stefan Hajnoczi 2022-11-02 3:32 ` Eric Biggers 2022-11-02 18:53 ` Stefan Hajnoczi
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.