* [Qemu-devel] Unaligned images with O_DIRECT
From: Max Reitz @ 2019-05-14 15:06 UTC
To: Qemu-block; +Cc: Kevin Wolf, qemu-devel

Hi,

Unaligned images don’t work so well with O_DIRECT:

  $ echo > foo
  $ qemu-img map --image-opts driver=file,filename=foo,cache.direct=on
  Offset          Length          Mapped to       File
  qemu-img: block/io.c:2093: bdrv_co_block_status: Assertion `*pnum &&
  QEMU_IS_ALIGNED(*pnum, align) && align > offset - aligned_offset' failed.
  [1]    10954 abort (core dumped)  qemu-img map --image-opts
  driver=file,filename=foo,cache.direct=on

(compare https://bugzilla.redhat.com/show_bug.cgi?id=1588356)

This is because the request_alignment is 512 (in my case), but the EOF
is not aligned accordingly, so raw_co_block_status() returns an unaligned
*pnum.

I suppose having an unaligned tail is not so bad and maybe we can just
adjust the assertion accordingly.  On the other hand, this has been
broken for a while.  Does it even make sense to use O_DIRECT with
unaligned images?  Shouldn’t we just reject them outright?

Max
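[Editorial illustration] The assertion quoted in the message above combines three conditions. The sketch below restates them as a standalone C predicate so the failure is easy to reproduce on paper. The function names is_aligned() and block_status_result_ok() are hypothetical and are not QEMU symbols; only the condition itself is taken from the assertion text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for QEMU_IS_ALIGNED(): true if n is a multiple of align. */
static bool is_aligned(int64_t n, int64_t align)
{
    assert(align > 0);
    return n % align == 0;
}

/*
 * Restates the three conditions of the quoted assertion.
 *   pnum           - number of bytes the driver reported for this status
 *   align          - the request_alignment (512 in the report above)
 *   offset         - the offset the caller asked about
 *   aligned_offset - offset rounded down to a multiple of align
 */
static bool block_status_result_ok(int64_t pnum, int64_t align,
                                   int64_t offset, int64_t aligned_offset)
{
    return pnum != 0 &&
           is_aligned(pnum, align) &&
           align > offset - aligned_offset;
}
```

With the 1-byte file produced by `echo > foo`, a driver that stops at EOF reports pnum = 1, so block_status_result_ok(1, 512, 0, 0) is false, which corresponds to the abort shown above.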
* Re: [Qemu-devel] Unaligned images with O_DIRECT
From: Eric Blake @ 2019-05-14 15:45 UTC
To: Max Reitz, Qemu-block; +Cc: Kevin Wolf, qemu-devel

On 5/14/19 10:06 AM, Max Reitz wrote:
> Hi,
>
> Unaligned images don’t work so well with O_DIRECT:
>
> $ echo > foo
> $ qemu-img map --image-opts driver=file,filename=foo,cache.direct=on
> Offset          Length          Mapped to       File
> qemu-img: block/io.c:2093: bdrv_co_block_status: Assertion `*pnum &&
> QEMU_IS_ALIGNED(*pnum, align) && align > offset - aligned_offset' failed.
> [1]    10954 abort (core dumped)  qemu-img map --image-opts
> driver=file,filename=foo,cache.direct=on
>
> (compare https://bugzilla.redhat.com/show_bug.cgi?id=1588356)
>
> This is because the request_alignment is 512 (in my case), but the EOF
> is not aligned accordingly, so raw_co_block_status() returns an unaligned
> *pnum.

Uggh. Yet another reason why I want qemu to support byte-accurate
sizing, instead of rounding up. The rounding keeps raising its head in
more and more places. I have pending patches that are trying to improve
block status to round driver answers up to match request_alignment (when
the protocol layer has finer granularity than the format layer); but
this sounds like it is a bug in the file driver itself for returning an
answer that is not properly rounded according to its own
request_alignment boundary, and not one where my pending patches would help.

>
> I suppose having an unaligned tail is not so bad and maybe we can just
> adjust the assertion accordingly.  On the other hand, this has been
> broken for a while.  Does it even make sense to use O_DIRECT with
> unaligned images?  Shouldn’t we just reject them outright?

The tail of an unaligned file is generally inaccessible to O_DIRECT,
where it is easier to use ftruncate() up to an aligned boundary if you
really must play with that region of the file, and then ftruncate() back
to the intended size after I/O. But that sounds hairy. We could also
round down and silently ignore the tail of the file, but that is at odds
with our practice of rounding size up. So for the short term, I'd be
happy with a patch that just rejects any attempt to use cache.direct=on
(O_DIRECT) with a file that is not already a multiple of the alignment
required thereby. (For reference, that's what qemu as NBD client
recently did when talking to a server that advertises a size
inconsistent with forced minimum block access: commit 3add3ab7)

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org
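[Editorial illustration] The short-term proposal above, rejecting a file whose length is not a multiple of the required alignment, could look roughly like the following C sketch. It is not taken from any QEMU patch: size_matches_alignment() and check_direct_io_size() are hypothetical names, and the choice of -EINVAL as the error code is an assumption made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <sys/stat.h>
#include <sys/types.h>

/* True if a file of `size` bytes ends exactly on an `align` boundary. */
static bool size_matches_alignment(int64_t size, int64_t align)
{
    assert(align > 0);
    return size % align == 0;
}

/*
 * Hypothetical open-time check: returns 0 if the file behind `fd` may be
 * used with the given alignment, -EINVAL if its length is unaligned, or
 * a negative errno value if the length cannot be determined.
 */
static int check_direct_io_size(int fd, int64_t align)
{
    struct stat st;

    if (fstat(fd, &st) < 0) {
        return -errno;
    }
    if (!size_matches_alignment((int64_t)st.st_size, align)) {
        return -EINVAL;
    }
    return 0;
}
```

Under these assumptions a 1-byte file fails the check for align = 512, while a 1024-byte file passes.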
* Re: [Qemu-devel] Unaligned images with O_DIRECT
From: Max Reitz @ 2019-05-14 16:15 UTC
To: Eric Blake, Qemu-block; +Cc: Kevin Wolf, qemu-devel

On 14.05.19 17:45, Eric Blake wrote:
> On 5/14/19 10:06 AM, Max Reitz wrote:
>> Hi,
>>
>> Unaligned images don’t work so well with O_DIRECT:
>>
>> $ echo > foo
>> $ qemu-img map --image-opts driver=file,filename=foo,cache.direct=on
>> Offset          Length          Mapped to       File
>> qemu-img: block/io.c:2093: bdrv_co_block_status: Assertion `*pnum &&
>> QEMU_IS_ALIGNED(*pnum, align) && align > offset - aligned_offset' failed.
>> [1]    10954 abort (core dumped)  qemu-img map --image-opts
>> driver=file,filename=foo,cache.direct=on
>>
>> (compare https://bugzilla.redhat.com/show_bug.cgi?id=1588356)
>>
>> This is because the request_alignment is 512 (in my case), but the EOF
>> is not aligned accordingly, so raw_co_block_status() returns an unaligned
>> *pnum.
>
> Uggh. Yet another reason why I want qemu to support byte-accurate
> sizing, instead of rounding up. The rounding keeps raising its head in
> more and more places. I have pending patches that are trying to improve
> block status to round driver answers up to match request_alignment (when
> the protocol layer has finer granularity than the format layer); but
> this sounds like it is a bug in the file driver itself for returning an
> answer that is not properly rounded according to its own
> request_alignment boundary, and not one where my pending patches would help.

Yes, I think so, too.

>> I suppose having an unaligned tail is not so bad and maybe we can just
>> adjust the assertion accordingly.  On the other hand, this has been
>> broken for a while.  Does it even make sense to use O_DIRECT with
>> unaligned images?  Shouldn’t we just reject them outright?
>
> The tail of an unaligned file is generally inaccessible to O_DIRECT,

Especially with this.

> where it is easier to use ftruncate() up to an aligned boundary if you
> really must play with that region of the file, and then ftruncate() back
> to the intended size after I/O. But that sounds hairy. We could also
> round down and silently ignore the tail of the file, but that is at odds
> with our practice of rounding size up. So for the short term, I'd be
> happy with a patch that just rejects any attempt to use cache.direct=on
> (O_DIRECT) with a file that is not already a multiple of the alignment
> required thereby. (For reference, that's what qemu as NBD client
> recently did when talking to a server that advertises a size
> inconsistent with forced minimum block access: commit 3add3ab7)

OK, I’ll send a patch.  Thanks for your explanation!

Max
* Re: [Qemu-devel] Unaligned images with O_DIRECT
From: Max Reitz @ 2019-05-14 17:28 UTC
To: Eric Blake, Qemu-block; +Cc: Kevin Wolf, qemu-devel

On 14.05.19 18:15, Max Reitz wrote:
> On 14.05.19 17:45, Eric Blake wrote:
>> On 5/14/19 10:06 AM, Max Reitz wrote:
>>> Hi,
>>>
>>> Unaligned images don’t work so well with O_DIRECT:
>>>
>>> $ echo > foo
>>> $ qemu-img map --image-opts driver=file,filename=foo,cache.direct=on
>>> Offset          Length          Mapped to       File
>>> qemu-img: block/io.c:2093: bdrv_co_block_status: Assertion `*pnum &&
>>> QEMU_IS_ALIGNED(*pnum, align) && align > offset - aligned_offset' failed.
>>> [1]    10954 abort (core dumped)  qemu-img map --image-opts
>>> driver=file,filename=foo,cache.direct=on
>>>
>>> (compare https://bugzilla.redhat.com/show_bug.cgi?id=1588356)
>>>
>>> This is because the request_alignment is 512 (in my case), but the EOF
>>> is not aligned accordingly, so raw_co_block_status() returns an unaligned
>>> *pnum.
>>
>> Uggh. Yet another reason why I want qemu to support byte-accurate
>> sizing, instead of rounding up. The rounding keeps raising its head in
>> more and more places. I have pending patches that are trying to improve
>> block status to round driver answers up to match request_alignment (when
>> the protocol layer has finer granularity than the format layer); but
>> this sounds like it is a bug in the file driver itself for returning an
>> answer that is not properly rounded according to its own
>> request_alignment boundary, and not one where my pending patches would help.
>
> Yes, I think so, too.
>
>>> I suppose having an unaligned tail is not so bad and maybe we can just
>>> adjust the assertion accordingly.  On the other hand, this has been
>>> broken for a while.  Does it even make sense to use O_DIRECT with
>>> unaligned images?  Shouldn’t we just reject them outright?
>>
>> The tail of an unaligned file is generally inaccessible to O_DIRECT,
>
> Especially with this.
>
>> where it is easier to use ftruncate() up to an aligned boundary if you
>> really must play with that region of the file, and then ftruncate() back
>> to the intended size after I/O. But that sounds hairy. We could also
>> round down and silently ignore the tail of the file, but that is at odds
>> with our practice of rounding size up. So for the short term, I'd be
>> happy with a patch that just rejects any attempt to use cache.direct=on
>> (O_DIRECT) with a file that is not already a multiple of the alignment
>> required thereby. (For reference, that's what qemu as NBD client
>> recently did when talking to a server that advertises a size
>> inconsistent with forced minimum block access: commit 3add3ab7)
>
> OK, I’ll send a patch.  Thanks for your explanation!

Well, or maybe not.

  $ ./qemu-img create -f qcow2 foo.qcow2 64M
  $ ./qemu-img map --image-opts \
      driver=qcow2,file.filename=foo.qcow2,cache.direct=on
  qemu-img: Could not open
  'driver=qcow2,file.filename=foo.qcow2,cache.direct=on': File length
  (196616 bytes) is not a multiple of the O_DIRECT alignment (512 bytes)
  Try cache.direct=off, or increasing the file size to match the alignment

That may be considered a bug in qcow2.  Maybe it should always fill all
clusters.  But even if we did so and fixed it now, we can’t disallow
qemu from opening such images.

Also, well, the tail is accessible, we just need to access it with the
proper alignment (and then we get a short read).  This seems to be
handled just fine.

So I think file-posix should just return a rounded result.  Well, or
bdrv_co_block_status() could ignore it for the EOF, because it throws
away everything past the EOF anyway with:

    if (*pnum > bytes) {
        *pnum = bytes;
    }

On one hand, I agree that file-posix should return an aligned result.
On the other, it doesn’t make a difference, so I don’t think we need to
enforce it (at EOF).

Max
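[Editorial illustration] The two options in the message above interact as follows: if the driver rounds its answer up to the alignment at EOF, the caller's existing clamp discards the excess again. The C sketch below models only that arithmetic. It is a simplified, hypothetical model and not QEMU's code: driver_pnum() and clamp_to_request() are invented names, and the model ignores data/hole detection entirely.

```c
#include <assert.h>
#include <stdint.h>

/* Round n up to the next multiple of align. */
static int64_t round_up(int64_t n, int64_t align)
{
    assert(align > 0);
    return (n + align - 1) / align * align;
}

/*
 * Hypothetical driver side: report how many bytes starting at `offset`
 * share one status. If the request reaches EOF, report the distance to
 * EOF rounded up to `align` instead of the exact (unaligned) distance.
 */
static int64_t driver_pnum(int64_t offset, int64_t bytes,
                           int64_t file_size, int64_t align)
{
    assert(offset >= 0 && offset < file_size && bytes > 0);
    if (offset + bytes >= file_size) {
        return round_up(file_size - offset, align);
    }
    return bytes;
}

/* Caller side: the clamp quoted in the message above. */
static int64_t clamp_to_request(int64_t pnum, int64_t bytes)
{
    if (pnum > bytes) {
        pnum = bytes;
    }
    return pnum;
}
```

For the 196616-byte qcow2 file from the message, the last aligned block starts at offset 196608 and holds 8 bytes. In this model driver_pnum(196608, 512, 196616, 512) yields 512, and clamp_to_request(512, 8) brings it back to 8, so the rounding is invisible to the caller.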
* Re: [Qemu-devel] Unaligned images with O_DIRECT
From: Eric Blake @ 2019-05-14 21:36 UTC
To: Max Reitz, Qemu-block; +Cc: Kevin Wolf, qemu-devel

On 5/14/19 12:28 PM, Max Reitz wrote:

>>>
>>> The tail of an unaligned file is generally inaccessible to O_DIRECT,
>>
>> Especially with this.
>>
>>> where it is easier to use ftruncate() up to an aligned boundary if you
>>> really must play with that region of the file, and then ftruncate() back
>>> to the intended size after I/O. But that sounds hairy. We could also
>>> round down and silently ignore the tail of the file, but that is at odds
>>> with our practice of rounding size up. So for the short term, I'd be
>>> happy with a patch that just rejects any attempt to use cache.direct=on
>>> (O_DIRECT) with a file that is not already a multiple of the alignment
>>> required thereby. (For reference, that's what qemu as NBD client
>>> recently did when talking to a server that advertises a size
>>> inconsistent with forced minimum block access: commit 3add3ab7)
>>
>> OK, I’ll send a patch.  Thanks for your explanation!
> Well, or maybe not.
>
> $ ./qemu-img create -f qcow2 foo.qcow2 64M
> $ ./qemu-img map --image-opts \
>     driver=qcow2,file.filename=foo.qcow2,cache.direct=on
> qemu-img: Could not open
> 'driver=qcow2,file.filename=foo.qcow2,cache.direct=on': File length
> (196616 bytes) is not a multiple of the O_DIRECT alignment (512 bytes)
> Try cache.direct=off, or increasing the file size to match the alignment
>
> That may be considered a bug in qcow2.  Maybe it should always fill all
> clusters.  But even if we did so and fixed it now, we can’t disallow
> qemu from opening such images.
>
> Also, well, the tail is accessible, we just need to access it with the
> proper alignment (and then we get a short read).  This seems to be
> handled just fine.

Oh. Yeah, short reads with O_DIRECT are possible (short writes not so
much; for those, you have to write a full buffer then ftruncate back
down). But we DO want to support short reads because of pre-existing
images, whether or not we also improve qcow2 to always create aligned
image sizes. The qcow2 spec allows unaligned images, even if we quit
creating new ones.

>
> So I think file-posix should just return a rounded result.  Well, or
> bdrv_co_block_status() could ignore it for the EOF, because it throws
> away everything past the EOF anyway with:
>
>     if (*pnum > bytes) {
>         *pnum = bytes;
>     }
>
> On one hand, I agree that file-posix should return an aligned result.
> On the other, it doesn’t make a difference, so I don’t think we need to
> enforce it (at EOF).

My thoughts: Right now, only io.c sets (or even reads) BDRV_BLOCK_EOF,
and it is documented as an internal flag for optimizations. But it would
be very easy to amend the contract of a driver's .bdrv_co_block_status
to state that a driver may set BDRV_BLOCK_EOF at the end of a file, and
MUST set that flag if the end of the file also happens to be unaligned
with respect to the driver's request_alignment. (Most drivers won't need
to care, but file-posix.c under O_DIRECT would have to start caring).
Then fix io.c to relax the assertion: the result must either be aligned
(current condition) OR the driver must have reported BDRV_BLOCK_EOF (new
condition). At that point, the block layer can take care of rounding out
the block status for the unaligned tail beyond EOF up to the alignment
boundary (similar to the rounding I have proposed in my other patches).

If you don't get to that first, then it looks like I'll have to fold
that into my v2 patches when I get back to addressing those block
status alignment problems. Thanks again for testing, and forcing me to
think about the issue.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org
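[Editorial illustration] The amended contract described in the last message can be sketched as a relaxed check followed by a rounding step in the caller. This is a reading of the proposal and not an actual patch: SKETCH_BLOCK_EOF stands in for BDRV_BLOCK_EOF with an arbitrary value chosen for the example, and result_ok_relaxed() and round_out_tail() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for BDRV_BLOCK_EOF; the value is arbitrary for this sketch. */
#define SKETCH_BLOCK_EOF 0x1

/*
 * Relaxed condition: a non-empty result is acceptable if it is aligned
 * (the current rule) OR the driver flagged that it ends at EOF (the
 * proposed new rule).
 */
static bool result_ok_relaxed(int64_t pnum, int64_t align, int status)
{
    assert(align > 0);
    return pnum > 0 &&
           (pnum % align == 0 || (status & SKETCH_BLOCK_EOF) != 0);
}

/*
 * Block-layer side: once the driver has reported an unaligned tail at
 * EOF, round the reported length up to the alignment boundary.
 */
static int64_t round_out_tail(int64_t pnum, int64_t align, int status)
{
    assert(result_ok_relaxed(pnum, align, status));
    if (pnum % align != 0) {
        pnum += align - pnum % align;
    }
    return pnum;
}
```

In this sketch an 8-byte tail reported with the EOF flag passes the check and is rounded out to 512, whereas the same 8-byte answer without the flag is still rejected.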