* [Qemu-devel] [PATCH] block/io.c: fix for the allocation failure @ 2019-04-05 14:24 Andrey Shinkevich 2019-04-05 14:24 ` Andrey Shinkevich ` (3 more replies) 0 siblings, 4 replies; 25+ messages in thread From: Andrey Shinkevich @ 2019-04-05 14:24 UTC (permalink / raw) To: qemu-devel, qemu-block Cc: stefanha, fam, kwolf, mreitz, den, vsementsov, andrey.shinkevich On a file system used by the customer, fallocate() returns an error if the block is not properly aligned. So, bdrv_co_pwrite_zeroes() fails. We can handle that case the same way as it is done for the unsupported cases, namely, call to bdrv_driver_pwritev() that writes zeroes to an image for the unaligned chunk of the block. Suggested-by: Denis V. Lunev <den@openvz.org> Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> --- block/io.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/block/io.c b/block/io.c index dfc153b..0412a51 100644 --- a/block/io.c +++ b/block/io.c @@ -1516,7 +1516,7 @@ static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs, assert(!bs->supported_zero_flags); } - if (ret == -ENOTSUP && !(flags & BDRV_REQ_NO_FALLBACK)) { + if (ret < 0 && !(flags & BDRV_REQ_NO_FALLBACK)) { /* Fall back to bounce buffer if write zeroes is unsupported */ BdrvRequestFlags write_flags = flags & ~BDRV_REQ_ZERO_WRITE; -- 1.8.3.1 ^ permalink raw reply related [flat|nested] 25+ messages in thread
* [Qemu-devel] [PATCH] block/io.c: fix for the allocation failure 2019-04-05 14:24 [Qemu-devel] [PATCH] block/io.c: fix for the allocation failure Andrey Shinkevich @ 2019-04-05 14:24 ` Andrey Shinkevich 2019-04-05 22:50 ` [Qemu-devel] [Qemu-block] " John Snow ` (2 subsequent siblings) 3 siblings, 0 replies; 25+ messages in thread From: Andrey Shinkevich @ 2019-04-05 14:24 UTC (permalink / raw) To: qemu-devel, qemu-block Cc: fam, kwolf, vsementsov, mreitz, stefanha, andrey.shinkevich, den On a file system used by the customer, fallocate() returns an error if the block is not properly aligned. So, bdrv_co_pwrite_zeroes() fails. We can handle that case the same way as it is done for the unsupported cases, namely, call to bdrv_driver_pwritev() that writes zeroes to an image for the unaligned chunk of the block. Suggested-by: Denis V. Lunev <den@openvz.org> Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> --- block/io.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/block/io.c b/block/io.c index dfc153b..0412a51 100644 --- a/block/io.c +++ b/block/io.c @@ -1516,7 +1516,7 @@ static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs, assert(!bs->supported_zero_flags); } - if (ret == -ENOTSUP && !(flags & BDRV_REQ_NO_FALLBACK)) { + if (ret < 0 && !(flags & BDRV_REQ_NO_FALLBACK)) { /* Fall back to bounce buffer if write zeroes is unsupported */ BdrvRequestFlags write_flags = flags & ~BDRV_REQ_ZERO_WRITE; -- 1.8.3.1 ^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [Qemu-devel] [Qemu-block] [PATCH] block/io.c: fix for the allocation failure 2019-04-05 14:24 [Qemu-devel] [PATCH] block/io.c: fix for the allocation failure Andrey Shinkevich 2019-04-05 14:24 ` Andrey Shinkevich @ 2019-04-05 22:50 ` John Snow 2019-04-05 22:50 ` John Snow 2019-04-08 9:44 ` Andrey Shinkevich 2019-04-08 9:00 ` [Qemu-devel] " Stefan Hajnoczi 2019-08-17 14:42 ` Eric Blake 3 siblings, 2 replies; 25+ messages in thread From: John Snow @ 2019-04-05 22:50 UTC (permalink / raw) To: Andrey Shinkevich, qemu-devel, qemu-block Cc: fam, kwolf, vsementsov, mreitz, stefanha, den On 4/5/19 10:24 AM, Andrey Shinkevich wrote: > On a file system used by the customer, fallocate() returns an error > if the block is not properly aligned. So, bdrv_co_pwrite_zeroes() > fails. We can handle that case the same way as it is done for the > unsupported cases, namely, call to bdrv_driver_pwritev() that writes > zeroes to an image for the unaligned chunk of the block. > > Suggested-by: Denis V. Lunev <den@openvz.org> > Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> > --- > block/io.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/block/io.c b/block/io.c > index dfc153b..0412a51 100644 > --- a/block/io.c > +++ b/block/io.c > @@ -1516,7 +1516,7 @@ static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs, > assert(!bs->supported_zero_flags); > } > > - if (ret == -ENOTSUP && !(flags & BDRV_REQ_NO_FALLBACK)) { > + if (ret < 0 && !(flags & BDRV_REQ_NO_FALLBACK)) { > /* Fall back to bounce buffer if write zeroes is unsupported */ > BdrvRequestFlags write_flags = flags & ~BDRV_REQ_ZERO_WRITE; > > I suppose that if fallocate fails for any reason and we're allowing fallback, we're either going to succeed ... or fail again very soon thereafter. Are there any cases where it is vital to not ignore the first fallocate failure? I'm a little wary of ignoring the return code from bdrv_co_pwrite_zeroes, but I am assuming that if there is a "real" failure here that the following bounce writes will also fail "safely." I'm not completely confident, but I have no tangible objections: Reviewed-by: John Snow <jsnow@redhat.com> ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [Qemu-devel] [Qemu-block] [PATCH] block/io.c: fix for the allocation failure 2019-04-05 22:50 ` [Qemu-devel] [Qemu-block] " John Snow @ 2019-04-05 22:50 ` John Snow 2019-04-08 9:44 ` Andrey Shinkevich 1 sibling, 0 replies; 25+ messages in thread From: John Snow @ 2019-04-05 22:50 UTC (permalink / raw) To: Andrey Shinkevich, qemu-devel, qemu-block Cc: fam, kwolf, vsementsov, mreitz, stefanha, den On 4/5/19 10:24 AM, Andrey Shinkevich wrote: > On a file system used by the customer, fallocate() returns an error > if the block is not properly aligned. So, bdrv_co_pwrite_zeroes() > fails. We can handle that case the same way as it is done for the > unsupported cases, namely, call to bdrv_driver_pwritev() that writes > zeroes to an image for the unaligned chunk of the block. > > Suggested-by: Denis V. Lunev <den@openvz.org> > Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> > --- > block/io.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/block/io.c b/block/io.c > index dfc153b..0412a51 100644 > --- a/block/io.c > +++ b/block/io.c > @@ -1516,7 +1516,7 @@ static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs, > assert(!bs->supported_zero_flags); > } > > - if (ret == -ENOTSUP && !(flags & BDRV_REQ_NO_FALLBACK)) { > + if (ret < 0 && !(flags & BDRV_REQ_NO_FALLBACK)) { > /* Fall back to bounce buffer if write zeroes is unsupported */ > BdrvRequestFlags write_flags = flags & ~BDRV_REQ_ZERO_WRITE; > > I suppose that if fallocate fails for any reason and we're allowing fallback, we're either going to succeed ... or fail again very soon thereafter. Are there any cases where it is vital to not ignore the first fallocate failure? I'm a little wary of ignoring the return code from bdrv_co_pwrite_zeroes, but I am assuming that if there is a "real" failure here that the following bounce writes will also fail "safely." I'm not completely confident, but I have no tangible objections: Reviewed-by: John Snow <jsnow@redhat.com> ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [Qemu-devel] [Qemu-block] [PATCH] block/io.c: fix for the allocation failure 2019-04-05 22:50 ` [Qemu-devel] [Qemu-block] " John Snow 2019-04-05 22:50 ` John Snow @ 2019-04-08 9:44 ` Andrey Shinkevich 2019-04-08 9:44 ` Andrey Shinkevich 2019-04-08 10:04 ` Kevin Wolf 1 sibling, 2 replies; 25+ messages in thread From: Andrey Shinkevich @ 2019-04-08 9:44 UTC (permalink / raw) To: John Snow, qemu-devel, qemu-block Cc: fam, kwolf, Vladimir Sementsov-Ogievskiy, mreitz, stefanha, Denis Lunev On 06/04/2019 01:50, John Snow wrote: > > > On 4/5/19 10:24 AM, Andrey Shinkevich wrote: >> On a file system used by the customer, fallocate() returns an error >> if the block is not properly aligned. So, bdrv_co_pwrite_zeroes() >> fails. We can handle that case the same way as it is done for the >> unsupported cases, namely, call to bdrv_driver_pwritev() that writes >> zeroes to an image for the unaligned chunk of the block. >> >> Suggested-by: Denis V. Lunev <den@openvz.org> >> Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> >> --- >> block/io.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/block/io.c b/block/io.c >> index dfc153b..0412a51 100644 >> --- a/block/io.c >> +++ b/block/io.c >> @@ -1516,7 +1516,7 @@ static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs, >> assert(!bs->supported_zero_flags); >> } >> >> - if (ret == -ENOTSUP && !(flags & BDRV_REQ_NO_FALLBACK)) { >> + if (ret < 0 && !(flags & BDRV_REQ_NO_FALLBACK)) { >> /* Fall back to bounce buffer if write zeroes is unsupported */ >> BdrvRequestFlags write_flags = flags & ~BDRV_REQ_ZERO_WRITE; >> >> > > I suppose that if fallocate fails for any reason and we're allowing > fallback, we're either going to succeed ... or fail again very soon > thereafter. > > Are there any cases where it is vital to not ignore the first fallocate > failure? I'm a little wary of ignoring the return code from > bdrv_co_pwrite_zeroes, but I am assuming that if there is a "real" > failure here that the following bounce writes will also fail "safely." > > I'm not completely confident, but I have no tangible objections: > Reviewed-by: John Snow <jsnow@redhat.com> > Thank you for your review, John! Let me clarify the circumstances and quote the bug report: "Customer had Win-2012 VM with 50GB system disk which was later resized to 256GB without resizing the partition inside VM. Now, while trying to resize to 50G, the following error will appear 'Failed to reduce the number of L2 tables: Invalid argument' It was found that it is possible to shrink the disk to 128G and any size above that number, but size below 128G will bring the mentioned error." The fallocate() returns no error on that file system if the offset and the (offset + bytes) parameters of the bdrv_co_do_pwrite_zeroes() both are aligned to 4K. -- With the best regards, Andrey Shinkevich ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [Qemu-devel] [Qemu-block] [PATCH] block/io.c: fix for the allocation failure 2019-04-08 9:44 ` Andrey Shinkevich @ 2019-04-08 9:44 ` Andrey Shinkevich 2019-04-08 10:04 ` Kevin Wolf 1 sibling, 0 replies; 25+ messages in thread From: Andrey Shinkevich @ 2019-04-08 9:44 UTC (permalink / raw) To: John Snow, qemu-devel, qemu-block Cc: fam, kwolf, Vladimir Sementsov-Ogievskiy, Denis Lunev, mreitz, stefanha On 06/04/2019 01:50, John Snow wrote: > > > On 4/5/19 10:24 AM, Andrey Shinkevich wrote: >> On a file system used by the customer, fallocate() returns an error >> if the block is not properly aligned. So, bdrv_co_pwrite_zeroes() >> fails. We can handle that case the same way as it is done for the >> unsupported cases, namely, call to bdrv_driver_pwritev() that writes >> zeroes to an image for the unaligned chunk of the block. >> >> Suggested-by: Denis V. Lunev <den@openvz.org> >> Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> >> --- >> block/io.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/block/io.c b/block/io.c >> index dfc153b..0412a51 100644 >> --- a/block/io.c >> +++ b/block/io.c >> @@ -1516,7 +1516,7 @@ static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs, >> assert(!bs->supported_zero_flags); >> } >> >> - if (ret == -ENOTSUP && !(flags & BDRV_REQ_NO_FALLBACK)) { >> + if (ret < 0 && !(flags & BDRV_REQ_NO_FALLBACK)) { >> /* Fall back to bounce buffer if write zeroes is unsupported */ >> BdrvRequestFlags write_flags = flags & ~BDRV_REQ_ZERO_WRITE; >> >> > > I suppose that if fallocate fails for any reason and we're allowing > fallback, we're either going to succeed ... or fail again very soon > thereafter. > > Are there any cases where it is vital to not ignore the first fallocate > failure? I'm a little wary of ignoring the return code from > bdrv_co_pwrite_zeroes, but I am assuming that if there is a "real" > failure here that the following bounce writes will also fail "safely." > > I'm not completely confident, but I have no tangible objections: > Reviewed-by: John Snow <jsnow@redhat.com> > Thank you for your review, John! Let me clarify the circumstances and quote the bug report: "Customer had Win-2012 VM with 50GB system disk which was later resized to 256GB without resizing the partition inside VM. Now, while trying to resize to 50G, the following error will appear 'Failed to reduce the number of L2 tables: Invalid argument' It was found that it is possible to shrink the disk to 128G and any size above that number, but size below 128G will bring the mentioned error." The fallocate() returns no error on that file system if the offset and the (offset + bytes) parameters of the bdrv_co_do_pwrite_zeroes() both are aligned to 4K. -- With the best regards, Andrey Shinkevich ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [Qemu-devel] [Qemu-block] [PATCH] block/io.c: fix for the allocation failure 2019-04-08 9:44 ` Andrey Shinkevich 2019-04-08 9:44 ` Andrey Shinkevich @ 2019-04-08 10:04 ` Kevin Wolf 2019-04-08 10:04 ` Kevin Wolf ` (2 more replies) 1 sibling, 3 replies; 25+ messages in thread From: Kevin Wolf @ 2019-04-08 10:04 UTC (permalink / raw) To: Andrey Shinkevich Cc: John Snow, qemu-devel, qemu-block, fam, Vladimir Sementsov-Ogievskiy, mreitz, stefanha, Denis Lunev Am 08.04.2019 um 11:44 hat Andrey Shinkevich geschrieben: > > > On 06/04/2019 01:50, John Snow wrote: > > > > > > On 4/5/19 10:24 AM, Andrey Shinkevich wrote: > >> On a file system used by the customer, fallocate() returns an error > >> if the block is not properly aligned. So, bdrv_co_pwrite_zeroes() > >> fails. We can handle that case the same way as it is done for the > >> unsupported cases, namely, call to bdrv_driver_pwritev() that writes > >> zeroes to an image for the unaligned chunk of the block. > >> > >> Suggested-by: Denis V. Lunev <den@openvz.org> > >> Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> > >> --- > >> block/io.c | 2 +- > >> 1 file changed, 1 insertion(+), 1 deletion(-) > >> > >> diff --git a/block/io.c b/block/io.c > >> index dfc153b..0412a51 100644 > >> --- a/block/io.c > >> +++ b/block/io.c > >> @@ -1516,7 +1516,7 @@ static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs, > >> assert(!bs->supported_zero_flags); > >> } > >> > >> - if (ret == -ENOTSUP && !(flags & BDRV_REQ_NO_FALLBACK)) { > >> + if (ret < 0 && !(flags & BDRV_REQ_NO_FALLBACK)) { > >> /* Fall back to bounce buffer if write zeroes is unsupported */ > >> BdrvRequestFlags write_flags = flags & ~BDRV_REQ_ZERO_WRITE; > >> > >> > > > > I suppose that if fallocate fails for any reason and we're allowing > > fallback, we're either going to succeed ... or fail again very soon > > thereafter. > > > > Are there any cases where it is vital to not ignore the first fallocate > > failure? I'm a little wary of ignoring the return code from > > bdrv_co_pwrite_zeroes, but I am assuming that if there is a "real" > > failure here that the following bounce writes will also fail "safely." > > > > I'm not completely confident, but I have no tangible objections: > > Reviewed-by: John Snow <jsnow@redhat.com> > > > > Thank you for your review, John! > > Let me clarify the circumstances and quote the bug report: > "Customer had Win-2012 VM with 50GB system disk which was later resized > to 256GB without resizing the partition inside VM. > Now, while trying to resize to 50G, the following error will appear > 'Failed to reduce the number of L2 tables: Invalid argument' > It was found that it is possible to shrink the disk to 128G and any size > above that number, but size below 128G will bring the mentioned error." > > The fallocate() returns no error on that file system if the offset and > the (offset + bytes) parameters of the bdrv_co_do_pwrite_zeroes() both > are aligned to 4K. What is the return value you get from this file system? Maybe turning that into ENOTSUP in file-posix would be less invasive. Just falling back for any error gives me the vague feeling that it could cause problems sooner or later. Kevin ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [Qemu-devel] [Qemu-block] [PATCH] block/io.c: fix for the allocation failure 2019-04-08 10:04 ` Kevin Wolf @ 2019-04-08 10:04 ` Kevin Wolf 2019-04-08 10:14 ` Kevin Wolf 2019-04-08 11:55 ` Andrey Shinkevich 2 siblings, 0 replies; 25+ messages in thread From: Kevin Wolf @ 2019-04-08 10:04 UTC (permalink / raw) To: Andrey Shinkevich Cc: fam, Vladimir Sementsov-Ogievskiy, Denis Lunev, qemu-block, qemu-devel, mreitz, stefanha, John Snow Am 08.04.2019 um 11:44 hat Andrey Shinkevich geschrieben: > > > On 06/04/2019 01:50, John Snow wrote: > > > > > > On 4/5/19 10:24 AM, Andrey Shinkevich wrote: > >> On a file system used by the customer, fallocate() returns an error > >> if the block is not properly aligned. So, bdrv_co_pwrite_zeroes() > >> fails. We can handle that case the same way as it is done for the > >> unsupported cases, namely, call to bdrv_driver_pwritev() that writes > >> zeroes to an image for the unaligned chunk of the block. > >> > >> Suggested-by: Denis V. Lunev <den@openvz.org> > >> Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> > >> --- > >> block/io.c | 2 +- > >> 1 file changed, 1 insertion(+), 1 deletion(-) > >> > >> diff --git a/block/io.c b/block/io.c > >> index dfc153b..0412a51 100644 > >> --- a/block/io.c > >> +++ b/block/io.c > >> @@ -1516,7 +1516,7 @@ static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs, > >> assert(!bs->supported_zero_flags); > >> } > >> > >> - if (ret == -ENOTSUP && !(flags & BDRV_REQ_NO_FALLBACK)) { > >> + if (ret < 0 && !(flags & BDRV_REQ_NO_FALLBACK)) { > >> /* Fall back to bounce buffer if write zeroes is unsupported */ > >> BdrvRequestFlags write_flags = flags & ~BDRV_REQ_ZERO_WRITE; > >> > >> > > > > I suppose that if fallocate fails for any reason and we're allowing > > fallback, we're either going to succeed ... or fail again very soon > > thereafter. > > > > Are there any cases where it is vital to not ignore the first fallocate > > failure? I'm a little wary of ignoring the return code from > > bdrv_co_pwrite_zeroes, but I am assuming that if there is a "real" > > failure here that the following bounce writes will also fail "safely." > > > > I'm not completely confident, but I have no tangible objections: > > Reviewed-by: John Snow <jsnow@redhat.com> > > > > Thank you for your review, John! > > Let me clarify the circumstances and quote the bug report: > "Customer had Win-2012 VM with 50GB system disk which was later resized > to 256GB without resizing the partition inside VM. > Now, while trying to resize to 50G, the following error will appear > 'Failed to reduce the number of L2 tables: Invalid argument' > It was found that it is possible to shrink the disk to 128G and any size > above that number, but size below 128G will bring the mentioned error." > > The fallocate() returns no error on that file system if the offset and > the (offset + bytes) parameters of the bdrv_co_do_pwrite_zeroes() both > are aligned to 4K. What is the return value you get from this file system? Maybe turning that into ENOTSUP in file-posix would be less invasive. Just falling back for any error gives me the vague feeling that it could cause problems sooner or later. Kevin ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [Qemu-devel] [Qemu-block] [PATCH] block/io.c: fix for the allocation failure 2019-04-08 10:04 ` Kevin Wolf 2019-04-08 10:04 ` Kevin Wolf @ 2019-04-08 10:14 ` Kevin Wolf 2019-04-08 10:14 ` Kevin Wolf 2019-04-10 14:54 ` Stefan Hajnoczi 2019-04-08 11:55 ` Andrey Shinkevich 2 siblings, 2 replies; 25+ messages in thread From: Kevin Wolf @ 2019-04-08 10:14 UTC (permalink / raw) To: Andrey Shinkevich Cc: John Snow, qemu-devel, qemu-block, fam, Vladimir Sementsov-Ogievskiy, mreitz, stefanha, Denis Lunev Am 08.04.2019 um 12:04 hat Kevin Wolf geschrieben: > Am 08.04.2019 um 11:44 hat Andrey Shinkevich geschrieben: > > > > > > On 06/04/2019 01:50, John Snow wrote: > > > > > > > > > On 4/5/19 10:24 AM, Andrey Shinkevich wrote: > > >> On a file system used by the customer, fallocate() returns an error > > >> if the block is not properly aligned. So, bdrv_co_pwrite_zeroes() > > >> fails. We can handle that case the same way as it is done for the > > >> unsupported cases, namely, call to bdrv_driver_pwritev() that writes > > >> zeroes to an image for the unaligned chunk of the block. > > >> > > >> Suggested-by: Denis V. Lunev <den@openvz.org> > > >> Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> > > >> --- > > >> block/io.c | 2 +- > > >> 1 file changed, 1 insertion(+), 1 deletion(-) > > >> > > >> diff --git a/block/io.c b/block/io.c > > >> index dfc153b..0412a51 100644 > > >> --- a/block/io.c > > >> +++ b/block/io.c > > >> @@ -1516,7 +1516,7 @@ static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs, > > >> assert(!bs->supported_zero_flags); > > >> } > > >> > > >> - if (ret == -ENOTSUP && !(flags & BDRV_REQ_NO_FALLBACK)) { > > >> + if (ret < 0 && !(flags & BDRV_REQ_NO_FALLBACK)) { > > >> /* Fall back to bounce buffer if write zeroes is unsupported */ > > >> BdrvRequestFlags write_flags = flags & ~BDRV_REQ_ZERO_WRITE; > > >> > > >> > > > > > > I suppose that if fallocate fails for any reason and we're allowing > > > fallback, we're either going to succeed ... or fail again very soon > > > thereafter. > > > > > > Are there any cases where it is vital to not ignore the first fallocate > > > failure? I'm a little wary of ignoring the return code from > > > bdrv_co_pwrite_zeroes, but I am assuming that if there is a "real" > > > failure here that the following bounce writes will also fail "safely." > > > > > > I'm not completely confident, but I have no tangible objections: > > > Reviewed-by: John Snow <jsnow@redhat.com> > > > > > > > Thank you for your review, John! > > > > Let me clarify the circumstances and quote the bug report: > > "Customer had Win-2012 VM with 50GB system disk which was later resized > > to 256GB without resizing the partition inside VM. > > Now, while trying to resize to 50G, the following error will appear > > 'Failed to reduce the number of L2 tables: Invalid argument' > > It was found that it is possible to shrink the disk to 128G and any size > > above that number, but size below 128G will bring the mentioned error." > > > > The fallocate() returns no error on that file system if the offset and > > the (offset + bytes) parameters of the bdrv_co_do_pwrite_zeroes() both > > are aligned to 4K. > > What is the return value you get from this file system? > > Maybe turning that into ENOTSUP in file-posix would be less invasive. > Just falling back for any error gives me the vague feeling that it could > cause problems sooner or later. Also, which file system is this? This seems to be a file system bug. fallocate() isn't documented to possibly have alignment restrictions for FALLOC_FL_ZERO_RANGE (if this is the operation you're talking about). FALLOC_FL_PUNCH_HOLE even explicitly mentions the behaviour for partial blocks, so there is no doubt that operations for partial blocks are considered valid. Operations that may impose restrictions are explicitly documented as such. So in any case, this would only be a workaround for a buggy file system. The real bug needs to be fixed there. Kevin ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [Qemu-devel] [Qemu-block] [PATCH] block/io.c: fix for the allocation failure 2019-04-08 10:14 ` Kevin Wolf @ 2019-04-08 10:14 ` Kevin Wolf 2019-04-10 14:54 ` Stefan Hajnoczi 1 sibling, 0 replies; 25+ messages in thread From: Kevin Wolf @ 2019-04-08 10:14 UTC (permalink / raw) To: Andrey Shinkevich Cc: fam, Vladimir Sementsov-Ogievskiy, Denis Lunev, qemu-block, qemu-devel, mreitz, stefanha, John Snow Am 08.04.2019 um 12:04 hat Kevin Wolf geschrieben: > Am 08.04.2019 um 11:44 hat Andrey Shinkevich geschrieben: > > > > > > On 06/04/2019 01:50, John Snow wrote: > > > > > > > > > On 4/5/19 10:24 AM, Andrey Shinkevich wrote: > > >> On a file system used by the customer, fallocate() returns an error > > >> if the block is not properly aligned. So, bdrv_co_pwrite_zeroes() > > >> fails. We can handle that case the same way as it is done for the > > >> unsupported cases, namely, call to bdrv_driver_pwritev() that writes > > >> zeroes to an image for the unaligned chunk of the block. > > >> > > >> Suggested-by: Denis V. Lunev <den@openvz.org> > > >> Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> > > >> --- > > >> block/io.c | 2 +- > > >> 1 file changed, 1 insertion(+), 1 deletion(-) > > >> > > >> diff --git a/block/io.c b/block/io.c > > >> index dfc153b..0412a51 100644 > > >> --- a/block/io.c > > >> +++ b/block/io.c > > >> @@ -1516,7 +1516,7 @@ static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs, > > >> assert(!bs->supported_zero_flags); > > >> } > > >> > > >> - if (ret == -ENOTSUP && !(flags & BDRV_REQ_NO_FALLBACK)) { > > >> + if (ret < 0 && !(flags & BDRV_REQ_NO_FALLBACK)) { > > >> /* Fall back to bounce buffer if write zeroes is unsupported */ > > >> BdrvRequestFlags write_flags = flags & ~BDRV_REQ_ZERO_WRITE; > > >> > > >> > > > > > > I suppose that if fallocate fails for any reason and we're allowing > > > fallback, we're either going to succeed ... or fail again very soon > > > thereafter. > > > > > > Are there any cases where it is vital to not ignore the first fallocate > > > failure? I'm a little wary of ignoring the return code from > > > bdrv_co_pwrite_zeroes, but I am assuming that if there is a "real" > > > failure here that the following bounce writes will also fail "safely." > > > > > > I'm not completely confident, but I have no tangible objections: > > > Reviewed-by: John Snow <jsnow@redhat.com> > > > > > > > Thank you for your review, John! > > > > Let me clarify the circumstances and quote the bug report: > > "Customer had Win-2012 VM with 50GB system disk which was later resized > > to 256GB without resizing the partition inside VM. > > Now, while trying to resize to 50G, the following error will appear > > 'Failed to reduce the number of L2 tables: Invalid argument' > > It was found that it is possible to shrink the disk to 128G and any size > > above that number, but size below 128G will bring the mentioned error." > > > > The fallocate() returns no error on that file system if the offset and > > the (offset + bytes) parameters of the bdrv_co_do_pwrite_zeroes() both > > are aligned to 4K. > > What is the return value you get from this file system? > > Maybe turning that into ENOTSUP in file-posix would be less invasive. > Just falling back for any error gives me the vague feeling that it could > cause problems sooner or later. Also, which file system is this? This seems to be a file system bug. fallocate() isn't documented to possibly have alignment restrictions for FALLOC_FL_ZERO_RANGE (if this is the operation you're talking about). FALLOC_FL_PUNCH_HOLE even explicitly mentions the behaviour for partial blocks, so there is no doubt that operations for partial blocks are considered valid. Operations that may impose restrictions are explicitly documented as such. So in any case, this would only be a workaround for a buggy file system. The real bug needs to be fixed there. Kevin ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [Qemu-devel] [Qemu-block] [PATCH] block/io.c: fix for the allocation failure 2019-04-08 10:14 ` Kevin Wolf 2019-04-08 10:14 ` Kevin Wolf @ 2019-04-10 14:54 ` Stefan Hajnoczi 2019-04-10 14:54 ` Stefan Hajnoczi 1 sibling, 1 reply; 25+ messages in thread From: Stefan Hajnoczi @ 2019-04-10 14:54 UTC (permalink / raw) To: Kevin Wolf Cc: Andrey Shinkevich, fam, Vladimir Sementsov-Ogievskiy, Denis Lunev, qemu-block, qemu-devel, mreitz, stefanha [-- Attachment #1: Type: text/plain, Size: 4075 bytes --] On Mon, Apr 08, 2019 at 12:14:49PM +0200, Kevin Wolf wrote: > Am 08.04.2019 um 12:04 hat Kevin Wolf geschrieben: > > Am 08.04.2019 um 11:44 hat Andrey Shinkevich geschrieben: > > > > > > > > > On 06/04/2019 01:50, John Snow wrote: > > > > > > > > > > > > On 4/5/19 10:24 AM, Andrey Shinkevich wrote: > > > >> On a file system used by the customer, fallocate() returns an error > > > >> if the block is not properly aligned. So, bdrv_co_pwrite_zeroes() > > > >> fails. We can handle that case the same way as it is done for the > > > >> unsupported cases, namely, call to bdrv_driver_pwritev() that writes > > > >> zeroes to an image for the unaligned chunk of the block. > > > >> > > > >> Suggested-by: Denis V. Lunev <den@openvz.org> > > > >> Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> > > > >> --- > > > >> block/io.c | 2 +- > > > >> 1 file changed, 1 insertion(+), 1 deletion(-) > > > >> > > > >> diff --git a/block/io.c b/block/io.c > > > >> index dfc153b..0412a51 100644 > > > >> --- a/block/io.c > > > >> +++ b/block/io.c > > > >> @@ -1516,7 +1516,7 @@ static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs, > > > >> assert(!bs->supported_zero_flags); > > > >> } > > > >> > > > >> - if (ret == -ENOTSUP && !(flags & BDRV_REQ_NO_FALLBACK)) { > > > >> + if (ret < 0 && !(flags & BDRV_REQ_NO_FALLBACK)) { > > > >> /* Fall back to bounce buffer if write zeroes is unsupported */ > > > >> BdrvRequestFlags write_flags = flags & ~BDRV_REQ_ZERO_WRITE; > > > >> > > > >> > > > > > > > > I suppose that if fallocate fails for any reason and we're allowing > > > > fallback, we're either going to succeed ... or fail again very soon > > > > thereafter. > > > > > > > > Are there any cases where it is vital to not ignore the first fallocate > > > > failure? I'm a little wary of ignoring the return code from > > > > bdrv_co_pwrite_zeroes, but I am assuming that if there is a "real" > > > > failure here that the following bounce writes will also fail "safely." > > > > > > > > I'm not completely confident, but I have no tangible objections: > > > > Reviewed-by: John Snow <jsnow@redhat.com> > > > > > > > > > > Thank you for your review, John! > > > > > > Let me clarify the circumstances and quote the bug report: > > > "Customer had Win-2012 VM with 50GB system disk which was later resized > > > to 256GB without resizing the partition inside VM. > > > Now, while trying to resize to 50G, the following error will appear > > > 'Failed to reduce the number of L2 tables: Invalid argument' > > > It was found that it is possible to shrink the disk to 128G and any size > > > above that number, but size below 128G will bring the mentioned error." > > > > > > The fallocate() returns no error on that file system if the offset and > > > the (offset + bytes) parameters of the bdrv_co_do_pwrite_zeroes() both > > > are aligned to 4K. > > > > What is the return value you get from this file system? > > > > Maybe turning that into ENOTSUP in file-posix would be less invasive. > > Just falling back for any error gives me the vague feeling that it could > > cause problems sooner or later. > > Also, which file system is this? This seems to be a file system bug. > fallocate() isn't documented to possibly have alignment restrictions for > FALLOC_FL_ZERO_RANGE (if this is the operation you're talking about). > FALLOC_FL_PUNCH_HOLE even explicitly mentions the behaviour for partial > blocks, so there is no doubt that operations for partial blocks are > considered valid. Operations that may impose restrictions are explicitly > documented as such. > > So in any case, this would only be a workaround for a buggy file system. > The real bug needs to be fixed there. I agree regarding the root cause of the bug, but the fallback behavior is reasonable IMO. Andrey: If you update the patch with a more specific errno I'll queue that patch instead. Stefan [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 455 bytes --] ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [Qemu-devel] [Qemu-block] [PATCH] block/io.c: fix for the allocation failure 2019-04-10 14:54 ` Stefan Hajnoczi @ 2019-04-10 14:54 ` Stefan Hajnoczi 0 siblings, 0 replies; 25+ messages in thread From: Stefan Hajnoczi @ 2019-04-10 14:54 UTC (permalink / raw) To: Kevin Wolf Cc: fam, Vladimir Sementsov-Ogievskiy, Denis Lunev, qemu-block, qemu-devel, mreitz, stefanha, Andrey Shinkevich [-- Attachment #1: Type: text/plain, Size: 4075 bytes --] On Mon, Apr 08, 2019 at 12:14:49PM +0200, Kevin Wolf wrote: > Am 08.04.2019 um 12:04 hat Kevin Wolf geschrieben: > > Am 08.04.2019 um 11:44 hat Andrey Shinkevich geschrieben: > > > > > > > > > On 06/04/2019 01:50, John Snow wrote: > > > > > > > > > > > > On 4/5/19 10:24 AM, Andrey Shinkevich wrote: > > > >> On a file system used by the customer, fallocate() returns an error > > > >> if the block is not properly aligned. So, bdrv_co_pwrite_zeroes() > > > >> fails. We can handle that case the same way as it is done for the > > > >> unsupported cases, namely, call to bdrv_driver_pwritev() that writes > > > >> zeroes to an image for the unaligned chunk of the block. > > > >> > > > >> Suggested-by: Denis V. Lunev <den@openvz.org> > > > >> Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> > > > >> --- > > > >> block/io.c | 2 +- > > > >> 1 file changed, 1 insertion(+), 1 deletion(-) > > > >> > > > >> diff --git a/block/io.c b/block/io.c > > > >> index dfc153b..0412a51 100644 > > > >> --- a/block/io.c > > > >> +++ b/block/io.c > > > >> @@ -1516,7 +1516,7 @@ static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs, > > > >> assert(!bs->supported_zero_flags); > > > >> } > > > >> > > > >> - if (ret == -ENOTSUP && !(flags & BDRV_REQ_NO_FALLBACK)) { > > > >> + if (ret < 0 && !(flags & BDRV_REQ_NO_FALLBACK)) { > > > >> /* Fall back to bounce buffer if write zeroes is unsupported */ > > > >> BdrvRequestFlags write_flags = flags & ~BDRV_REQ_ZERO_WRITE; > > > >> > > > >> > > > > > > > > I suppose that if fallocate fails for any reason and we're allowing > > > > fallback, we're either going to succeed ... or fail again very soon > > > > thereafter. > > > > > > > > Are there any cases where it is vital to not ignore the first fallocate > > > > failure? I'm a little wary of ignoring the return code from > > > > bdrv_co_pwrite_zeroes, but I am assuming that if there is a "real" > > > > failure here that the following bounce writes will also fail "safely." > > > > > > > > I'm not completely confident, but I have no tangible objections: > > > > Reviewed-by: John Snow <jsnow@redhat.com> > > > > > > > > > > Thank you for your review, John! > > > > > > Let me clarify the circumstances and quote the bug report: > > > "Customer had Win-2012 VM with 50GB system disk which was later resized > > > to 256GB without resizing the partition inside VM. > > > Now, while trying to resize to 50G, the following error will appear > > > 'Failed to reduce the number of L2 tables: Invalid argument' > > > It was found that it is possible to shrink the disk to 128G and any size > > > above that number, but size below 128G will bring the mentioned error." > > > > > > The fallocate() returns no error on that file system if the offset and > > > the (offset + bytes) parameters of the bdrv_co_do_pwrite_zeroes() both > > > are aligned to 4K. > > > > What is the return value you get from this file system? > > > > Maybe turning that into ENOTSUP in file-posix would be less invasive. > > Just falling back for any error gives me the vague feeling that it could > > cause problems sooner or later. > > Also, which file system is this? This seems to be a file system bug. > fallocate() isn't documented to possibly have alignment restrictions for > FALLOC_FL_ZERO_RANGE (if this is the operation you're talking about). > FALLOC_FL_PUNCH_HOLE even explicitly mentions the behaviour for partial > blocks, so there is no doubt that operations for partial blocks are > considered valid. Operations that may impose restrictions are explicitly > documented as such. > > So in any case, this would only be a workaround for a buggy file system. > The real bug needs to be fixed there. I agree regarding the root cause of the bug, but the fallback behavior is reasonable IMO. Andrey: If you update the patch with a more specific errno I'll queue that patch instead. Stefan [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 455 bytes --] ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [Qemu-devel] [Qemu-block] [PATCH] block/io.c: fix for the allocation failure 2019-04-08 10:04 ` Kevin Wolf 2019-04-08 10:04 ` Kevin Wolf 2019-04-08 10:14 ` Kevin Wolf @ 2019-04-08 11:55 ` Andrey Shinkevich 2019-04-08 11:55 ` Andrey Shinkevich 2 siblings, 1 reply; 25+ messages in thread From: Andrey Shinkevich @ 2019-04-08 11:55 UTC (permalink / raw) To: Kevin Wolf Cc: John Snow, qemu-devel, qemu-block, fam, Vladimir Sementsov-Ogievskiy, mreitz, stefanha, Denis Lunev On 08/04/2019 13:04, Kevin Wolf wrote: > Am 08.04.2019 um 11:44 hat Andrey Shinkevich geschrieben: >> >> >> On 06/04/2019 01:50, John Snow wrote: >>> >>> >>> On 4/5/19 10:24 AM, Andrey Shinkevich wrote: >>>> On a file system used by the customer, fallocate() returns an error >>>> if the block is not properly aligned. So, bdrv_co_pwrite_zeroes() >>>> fails. We can handle that case the same way as it is done for the >>>> unsupported cases, namely, call to bdrv_driver_pwritev() that writes >>>> zeroes to an image for the unaligned chunk of the block. >>>> >>>> Suggested-by: Denis V. Lunev <den@openvz.org> >>>> Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> >>>> --- >>>> block/io.c | 2 +- >>>> 1 file changed, 1 insertion(+), 1 deletion(-) >>>> >>>> diff --git a/block/io.c b/block/io.c >>>> index dfc153b..0412a51 100644 >>>> --- a/block/io.c >>>> +++ b/block/io.c >>>> @@ -1516,7 +1516,7 @@ static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs, >>>> assert(!bs->supported_zero_flags); >>>> } >>>> >>>> - if (ret == -ENOTSUP && !(flags & BDRV_REQ_NO_FALLBACK)) { >>>> + if (ret < 0 && !(flags & BDRV_REQ_NO_FALLBACK)) { >>>> /* Fall back to bounce buffer if write zeroes is unsupported */ >>>> BdrvRequestFlags write_flags = flags & ~BDRV_REQ_ZERO_WRITE; >>>> >>>> >>> >>> I suppose that if fallocate fails for any reason and we're allowing >>> fallback, we're either going to succeed ... or fail again very soon >>> thereafter. >>> >>> Are there any cases where it is vital to not ignore the first fallocate >>> failure? I'm a little wary of ignoring the return code from >>> bdrv_co_pwrite_zeroes, but I am assuming that if there is a "real" >>> failure here that the following bounce writes will also fail "safely." >>> >>> I'm not completely confident, but I have no tangible objections: >>> Reviewed-by: John Snow <jsnow@redhat.com> >>> >> >> Thank you for your review, John! >> >> Let me clarify the circumstances and quote the bug report: >> "Customer had Win-2012 VM with 50GB system disk which was later resized >> to 256GB without resizing the partition inside VM. >> Now, while trying to resize to 50G, the following error will appear >> 'Failed to reduce the number of L2 tables: Invalid argument' >> It was found that it is possible to shrink the disk to 128G and any size >> above that number, but size below 128G will bring the mentioned error." >> >> The fallocate() returns no error on that file system if the offset and >> the (offset + bytes) parameters of the bdrv_co_do_pwrite_zeroes() both >> are aligned to 4K. > > What is the return value you get from this file system? > > Maybe turning that into ENOTSUP in file-posix would be less invasive. > Just falling back for any error gives me the vague feeling that it could > cause problems sooner or later. > > Kevin > The return value for that custom distributed file system is "Invalid argument", that's "EINVAL: mode is FALLOC_FL_ZERO_RANGE, but the file referred to by fd is not a regular file". When I reproduced the bug, I saw that the alignment in the bdrv_co_do_pwrite_zeroes() was set to '1' in that case: MAX(bs->bl.pwrite_zeroes_alignment /=0/), bs->bl.request_alignment /=1/); With my first patch I had not sent before, a new member of the structure BlockLimits, say pwrite_zeroes_alignment_min, was set to 4K for a raw file and the alignment was made for the block. Then zeroes were written to the image for the unaligned head and tail by invoking the bdrv_driver_pwritev(). That approach has cons also: we would write to the disk always. The file system maintainers say that the bug is a particular case and they may not return the general error such as 'unsupported'. -- Andrey ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [Qemu-devel] [Qemu-block] [PATCH] block/io.c: fix for the allocation failure 2019-04-08 11:55 ` Andrey Shinkevich @ 2019-04-08 11:55 ` Andrey Shinkevich 0 siblings, 0 replies; 25+ messages in thread From: Andrey Shinkevich @ 2019-04-08 11:55 UTC (permalink / raw) To: Kevin Wolf Cc: fam, Vladimir Sementsov-Ogievskiy, Denis Lunev, qemu-block, qemu-devel, mreitz, stefanha, John Snow On 08/04/2019 13:04, Kevin Wolf wrote: > Am 08.04.2019 um 11:44 hat Andrey Shinkevich geschrieben: >> >> >> On 06/04/2019 01:50, John Snow wrote: >>> >>> >>> On 4/5/19 10:24 AM, Andrey Shinkevich wrote: >>>> On a file system used by the customer, fallocate() returns an error >>>> if the block is not properly aligned. So, bdrv_co_pwrite_zeroes() >>>> fails. We can handle that case the same way as it is done for the >>>> unsupported cases, namely, call to bdrv_driver_pwritev() that writes >>>> zeroes to an image for the unaligned chunk of the block. >>>> >>>> Suggested-by: Denis V. Lunev <den@openvz.org> >>>> Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> >>>> --- >>>> block/io.c | 2 +- >>>> 1 file changed, 1 insertion(+), 1 deletion(-) >>>> >>>> diff --git a/block/io.c b/block/io.c >>>> index dfc153b..0412a51 100644 >>>> --- a/block/io.c >>>> +++ b/block/io.c >>>> @@ -1516,7 +1516,7 @@ static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs, >>>> assert(!bs->supported_zero_flags); >>>> } >>>> >>>> - if (ret == -ENOTSUP && !(flags & BDRV_REQ_NO_FALLBACK)) { >>>> + if (ret < 0 && !(flags & BDRV_REQ_NO_FALLBACK)) { >>>> /* Fall back to bounce buffer if write zeroes is unsupported */ >>>> BdrvRequestFlags write_flags = flags & ~BDRV_REQ_ZERO_WRITE; >>>> >>>> >>> >>> I suppose that if fallocate fails for any reason and we're allowing >>> fallback, we're either going to succeed ... or fail again very soon >>> thereafter. >>> >>> Are there any cases where it is vital to not ignore the first fallocate >>> failure? I'm a little wary of ignoring the return code from >>> bdrv_co_pwrite_zeroes, but I am assuming that if there is a "real" >>> failure here that the following bounce writes will also fail "safely." >>> >>> I'm not completely confident, but I have no tangible objections: >>> Reviewed-by: John Snow <jsnow@redhat.com> >>> >> >> Thank you for your review, John! >> >> Let me clarify the circumstances and quote the bug report: >> "Customer had Win-2012 VM with 50GB system disk which was later resized >> to 256GB without resizing the partition inside VM. >> Now, while trying to resize to 50G, the following error will appear >> 'Failed to reduce the number of L2 tables: Invalid argument' >> It was found that it is possible to shrink the disk to 128G and any size >> above that number, but size below 128G will bring the mentioned error." >> >> The fallocate() returns no error on that file system if the offset and >> the (offset + bytes) parameters of the bdrv_co_do_pwrite_zeroes() both >> are aligned to 4K. > > What is the return value you get from this file system? > > Maybe turning that into ENOTSUP in file-posix would be less invasive. > Just falling back for any error gives me the vague feeling that it could > cause problems sooner or later. > > Kevin > The return value for that custom distributed file system is "Invalid argument", that's "EINVAL: mode is FALLOC_FL_ZERO_RANGE, but the file referred to by fd is not a regular file". When I reproduced the bug, I saw that the alignment in the bdrv_co_do_pwrite_zeroes() was set to '1' in that case: MAX(bs->bl.pwrite_zeroes_alignment /=0/), bs->bl.request_alignment /=1/); With my first patch I had not sent before, a new member of the structure BlockLimits, say pwrite_zeroes_alignment_min, was set to 4K for a raw file and the alignment was made for the block. Then zeroes were written to the image for the unaligned head and tail by invoking the bdrv_driver_pwritev(). That approach has cons also: we would write to the disk always. The file system maintainers say that the bug is a particular case and they may not return the general error such as 'unsupported'. -- Andrey ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [Qemu-devel] [PATCH] block/io.c: fix for the allocation failure 2019-04-05 14:24 [Qemu-devel] [PATCH] block/io.c: fix for the allocation failure Andrey Shinkevich 2019-04-05 14:24 ` Andrey Shinkevich 2019-04-05 22:50 ` [Qemu-devel] [Qemu-block] " John Snow @ 2019-04-08 9:00 ` Stefan Hajnoczi 2019-04-08 9:00 ` Stefan Hajnoczi 2019-04-08 9:45 ` Andrey Shinkevich 2019-08-17 14:42 ` Eric Blake 3 siblings, 2 replies; 25+ messages in thread From: Stefan Hajnoczi @ 2019-04-08 9:00 UTC (permalink / raw) To: Andrey Shinkevich Cc: qemu-devel, qemu-block, fam, kwolf, mreitz, den, vsementsov [-- Attachment #1: Type: text/plain, Size: 729 bytes --] On Fri, Apr 05, 2019 at 05:24:04PM +0300, Andrey Shinkevich wrote: > On a file system used by the customer, fallocate() returns an error > if the block is not properly aligned. So, bdrv_co_pwrite_zeroes() > fails. We can handle that case the same way as it is done for the > unsupported cases, namely, call to bdrv_driver_pwritev() that writes > zeroes to an image for the unaligned chunk of the block. > > Suggested-by: Denis V. Lunev <den@openvz.org> > Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> > --- > block/io.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) Thanks, applied to my block-next tree for QEMU 4.1: https://github.com/stefanha/qemu/commits/block-next Stefan [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 455 bytes --] ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [Qemu-devel] [PATCH] block/io.c: fix for the allocation failure 2019-04-08 9:00 ` [Qemu-devel] " Stefan Hajnoczi @ 2019-04-08 9:00 ` Stefan Hajnoczi 2019-04-08 9:45 ` Andrey Shinkevich 1 sibling, 0 replies; 25+ messages in thread From: Stefan Hajnoczi @ 2019-04-08 9:00 UTC (permalink / raw) To: Andrey Shinkevich Cc: fam, kwolf, vsementsov, qemu-block, qemu-devel, mreitz, den [-- Attachment #1: Type: text/plain, Size: 729 bytes --] On Fri, Apr 05, 2019 at 05:24:04PM +0300, Andrey Shinkevich wrote: > On a file system used by the customer, fallocate() returns an error > if the block is not properly aligned. So, bdrv_co_pwrite_zeroes() > fails. We can handle that case the same way as it is done for the > unsupported cases, namely, call to bdrv_driver_pwritev() that writes > zeroes to an image for the unaligned chunk of the block. > > Suggested-by: Denis V. Lunev <den@openvz.org> > Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> > --- > block/io.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) Thanks, applied to my block-next tree for QEMU 4.1: https://github.com/stefanha/qemu/commits/block-next Stefan [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 455 bytes --] ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [Qemu-devel] [PATCH] block/io.c: fix for the allocation failure 2019-04-08 9:00 ` [Qemu-devel] " Stefan Hajnoczi 2019-04-08 9:00 ` Stefan Hajnoczi @ 2019-04-08 9:45 ` Andrey Shinkevich 2019-04-08 9:45 ` Andrey Shinkevich 1 sibling, 1 reply; 25+ messages in thread From: Andrey Shinkevich @ 2019-04-08 9:45 UTC (permalink / raw) To: Stefan Hajnoczi Cc: qemu-devel, qemu-block, fam, kwolf, mreitz, Denis Lunev, Vladimir Sementsov-Ogievskiy On 08/04/2019 12:00, Stefan Hajnoczi wrote: > On Fri, Apr 05, 2019 at 05:24:04PM +0300, Andrey Shinkevich wrote: >> On a file system used by the customer, fallocate() returns an error >> if the block is not properly aligned. So, bdrv_co_pwrite_zeroes() >> fails. We can handle that case the same way as it is done for the >> unsupported cases, namely, call to bdrv_driver_pwritev() that writes >> zeroes to an image for the unaligned chunk of the block. >> >> Suggested-by: Denis V. Lunev <den@openvz.org> >> Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> >> --- >> block/io.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) > > Thanks, applied to my block-next tree for QEMU 4.1: > https://github.com/stefanha/qemu/commits/block-next > > Stefan > Thank you all! -- With the best regards, Andrey Shinkevich ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [Qemu-devel] [PATCH] block/io.c: fix for the allocation failure 2019-04-08 9:45 ` Andrey Shinkevich @ 2019-04-08 9:45 ` Andrey Shinkevich 0 siblings, 0 replies; 25+ messages in thread From: Andrey Shinkevich @ 2019-04-08 9:45 UTC (permalink / raw) To: Stefan Hajnoczi Cc: fam, kwolf, Vladimir Sementsov-Ogievskiy, Denis Lunev, qemu-block, qemu-devel, mreitz On 08/04/2019 12:00, Stefan Hajnoczi wrote: > On Fri, Apr 05, 2019 at 05:24:04PM +0300, Andrey Shinkevich wrote: >> On a file system used by the customer, fallocate() returns an error >> if the block is not properly aligned. So, bdrv_co_pwrite_zeroes() >> fails. We can handle that case the same way as it is done for the >> unsupported cases, namely, call to bdrv_driver_pwritev() that writes >> zeroes to an image for the unaligned chunk of the block. >> >> Suggested-by: Denis V. Lunev <den@openvz.org> >> Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> >> --- >> block/io.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) > > Thanks, applied to my block-next tree for QEMU 4.1: > https://github.com/stefanha/qemu/commits/block-next > > Stefan > Thank you all! -- With the best regards, Andrey Shinkevich ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [Qemu-devel] [PATCH] block/io.c: fix for the allocation failure 2019-04-05 14:24 [Qemu-devel] [PATCH] block/io.c: fix for the allocation failure Andrey Shinkevich ` (2 preceding siblings ...) 2019-04-08 9:00 ` [Qemu-devel] " Stefan Hajnoczi @ 2019-08-17 14:42 ` Eric Blake 2019-08-17 14:49 ` Eric Blake 3 siblings, 1 reply; 25+ messages in thread From: Eric Blake @ 2019-08-17 14:42 UTC (permalink / raw) To: Andrey Shinkevich, qemu-devel, qemu-block Cc: fam, kwolf, vsementsov, mreitz, stefanha, den [-- Attachment #1.1: Type: text/plain, Size: 1646 bytes --] On 4/5/19 9:24 AM, Andrey Shinkevich wrote: > On a file system used by the customer, fallocate() returns an error Which error? > if the block is not properly aligned. So, bdrv_co_pwrite_zeroes() > fails. We can handle that case the same way as it is done for the > unsupported cases, namely, call to bdrv_driver_pwritev() that writes > zeroes to an image for the unaligned chunk of the block. > > Suggested-by: Denis V. Lunev <den@openvz.org> > Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> > --- > block/io.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/block/io.c b/block/io.c > index dfc153b..0412a51 100644 > --- a/block/io.c > +++ b/block/io.c > @@ -1516,7 +1516,7 @@ static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs, > assert(!bs->supported_zero_flags); > } > > - if (ret == -ENOTSUP && !(flags & BDRV_REQ_NO_FALLBACK)) { > + if (ret < 0 && !(flags & BDRV_REQ_NO_FALLBACK)) { This change is a regression of sorts. Now, you are unconditionally attempting the fallback for ALL failures (such as EIO) and for all drivers, even when that was not previously attempted and increases the traffic. I think we should revert this patch and instead fix the fallocate() path to convert whatever ACTUAL errno you got from unaligned fallocate failure into ENOTSUP (that is, just the file-posix.c location that failed), while leaving all other errors as immediately fatal. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [Qemu-devel] [PATCH] block/io.c: fix for the allocation failure 2019-08-17 14:42 ` Eric Blake @ 2019-08-17 14:49 ` Eric Blake 2019-08-17 14:56 ` Eric Blake 0 siblings, 1 reply; 25+ messages in thread From: Eric Blake @ 2019-08-17 14:49 UTC (permalink / raw) To: Andrey Shinkevich, qemu-devel, qemu-block Cc: fam, kwolf, vsementsov, mreitz, stefanha, den [-- Attachment #1.1: Type: text/plain, Size: 1949 bytes --] On 8/17/19 9:42 AM, Eric Blake wrote: > On 4/5/19 9:24 AM, Andrey Shinkevich wrote: >> On a file system used by the customer, fallocate() returns an error > > Which error? Okay, I read the rest of the thread; EINVAL. But the commit message was not amended before becoming commit 118f9944. >> >> - if (ret == -ENOTSUP && !(flags & BDRV_REQ_NO_FALLBACK)) { >> + if (ret < 0 && !(flags & BDRV_REQ_NO_FALLBACK)) { > > This change is a regression of sorts. Now, you are unconditionally > attempting the fallback for ALL failures (such as EIO) and for all > drivers, even when that was not previously attempted and increases the > traffic. I think we should revert this patch and instead fix the > fallocate() path to convert whatever ACTUAL errno you got from unaligned > fallocate failure into ENOTSUP (that is, just the file-posix.c location > that failed), while leaving all other errors as immediately fatal. And the rest of the thread worried about that exact scenario. Here's how I encountered it. I was trying to debug the nbdkit sh plugin, with: $ cat >script <<\EOF case $1 in get_size) echo 1m;; pread) false;; can_write|can_zero) ;; pwrite) ;; zero) echo ENOTSUP; exit 1 ;; *) exit 2;; esac EOF (the script has a subtle bug; zero) should be using 'echo ENOTSUP >&2', but because it didn't, nbdkit treats the failure as EIO rather than the intended ENOTSUP) coupled with: $ qemu-io -f raw nbd://localhost:10810 -c 'w -z 0 1' but when the script fails with EIO and qemu-io reported that the write was still successful, I was confused (I was debugging a server-side fallback to write, not a client-side one), until I discovered that we changed the semantics in qemu 4.1 that EIO is no longer fatal and attempts the write fallback. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [Qemu-devel] [PATCH] block/io.c: fix for the allocation failure 2019-08-17 14:49 ` Eric Blake @ 2019-08-17 14:56 ` Eric Blake 2019-08-19 19:46 ` Denis V. Lunev 0 siblings, 1 reply; 25+ messages in thread From: Eric Blake @ 2019-08-17 14:56 UTC (permalink / raw) To: Andrey Shinkevich, qemu-devel, qemu-block Cc: fam, kwolf, vsementsov, mreitz, stefanha, den [-- Attachment #1.1: Type: text/plain, Size: 872 bytes --] On 8/17/19 9:49 AM, Eric Blake wrote: >> This change is a regression of sorts. Now, you are unconditionally >> attempting the fallback for ALL failures (such as EIO) and for all >> drivers, even when that was not previously attempted and increases the >> traffic. I think we should revert this patch and instead fix the >> fallocate() path to convert whatever ACTUAL errno you got from unaligned >> fallocate failure into ENOTSUP (that is, just the file-posix.c location >> that failed), while leaving all other errors as immediately fatal. Or even better, fix the call site of fallocate() to skip attempting an unaligned fallocate(), and just directly return ENOTSUP, rather than trying to diagnose EINVAL after the fact. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [Qemu-devel] [PATCH] block/io.c: fix for the allocation failure 2019-08-17 14:56 ` Eric Blake @ 2019-08-19 19:46 ` Denis V. Lunev 2019-08-19 20:30 ` Eric Blake 0 siblings, 1 reply; 25+ messages in thread From: Denis V. Lunev @ 2019-08-19 19:46 UTC (permalink / raw) To: Eric Blake, Andrey Shinkevich, qemu-devel, qemu-block Cc: fam, kwolf, Vladimir Sementsov-Ogievskiy, stefanha, mreitz On 8/17/19 5:56 PM, Eric Blake wrote: > On 8/17/19 9:49 AM, Eric Blake wrote: > >>> This change is a regression of sorts. Now, you are unconditionally >>> attempting the fallback for ALL failures (such as EIO) and for all >>> drivers, even when that was not previously attempted and increases the >>> traffic. I think we should revert this patch and instead fix the >>> fallocate() path to convert whatever ACTUAL errno you got from unaligned >>> fallocate failure into ENOTSUP (that is, just the file-posix.c location >>> that failed), while leaving all other errors as immediately fatal. > Or even better, fix the call site of fallocate() to skip attempting an > unaligned fallocate(), and just directly return ENOTSUP, rather than > trying to diagnose EINVAL after the fact. > No way. Single ENOTSUP will turn off fallocate() support on caller side while aligned (99.99% of calls) works normally. Den ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [Qemu-devel] [PATCH] block/io.c: fix for the allocation failure 2019-08-19 19:46 ` Denis V. Lunev @ 2019-08-19 20:30 ` Eric Blake 2019-08-19 20:53 ` Denis V. Lunev 0 siblings, 1 reply; 25+ messages in thread From: Eric Blake @ 2019-08-19 20:30 UTC (permalink / raw) To: Denis V. Lunev, Andrey Shinkevich, qemu-devel, qemu-block Cc: fam, kwolf, Vladimir Sementsov-Ogievskiy, stefanha, mreitz [-- Attachment #1.1: Type: text/plain, Size: 1342 bytes --] On 8/19/19 2:46 PM, Denis V. Lunev wrote: > On 8/17/19 5:56 PM, Eric Blake wrote: >> On 8/17/19 9:49 AM, Eric Blake wrote: >> >>>> This change is a regression of sorts. Now, you are unconditionally >>>> attempting the fallback for ALL failures (such as EIO) and for all >>>> drivers, even when that was not previously attempted and increases the >>>> traffic. I think we should revert this patch and instead fix the >>>> fallocate() path to convert whatever ACTUAL errno you got from unaligned >>>> fallocate failure into ENOTSUP (that is, just the file-posix.c location >>>> that failed), while leaving all other errors as immediately fatal. >> Or even better, fix the call site of fallocate() to skip attempting an >> unaligned fallocate(), and just directly return ENOTSUP, rather than >> trying to diagnose EINVAL after the fact. >> > No way. Single ENOTSUP will turn off fallocate() support on caller side > while > aligned (99.99% of calls) works normally. I didn't mean skip fallocate() unconditionally, only when unaligned: if (request not aligned enough) return -ENOTSUP; fallocate() ... so that the 99.99% requests that ARE aligned get to use fallocate() normally. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [Qemu-devel] [PATCH] block/io.c: fix for the allocation failure 2019-08-19 20:30 ` Eric Blake @ 2019-08-19 20:53 ` Denis V. Lunev 2019-08-19 21:29 ` Eric Blake 0 siblings, 1 reply; 25+ messages in thread From: Denis V. Lunev @ 2019-08-19 20:53 UTC (permalink / raw) To: Eric Blake, Andrey Shinkevich, qemu-devel, qemu-block Cc: fam, kwolf, Vladimir Sementsov-Ogievskiy, stefanha, mreitz On 8/19/19 11:30 PM, Eric Blake wrote: > On 8/19/19 2:46 PM, Denis V. Lunev wrote: >> On 8/17/19 5:56 PM, Eric Blake wrote: >>> On 8/17/19 9:49 AM, Eric Blake wrote: >>> >>>>> This change is a regression of sorts. Now, you are unconditionally >>>>> attempting the fallback for ALL failures (such as EIO) and for all >>>>> drivers, even when that was not previously attempted and increases the >>>>> traffic. I think we should revert this patch and instead fix the >>>>> fallocate() path to convert whatever ACTUAL errno you got from unaligned >>>>> fallocate failure into ENOTSUP (that is, just the file-posix.c location >>>>> that failed), while leaving all other errors as immediately fatal. >>> Or even better, fix the call site of fallocate() to skip attempting an >>> unaligned fallocate(), and just directly return ENOTSUP, rather than >>> trying to diagnose EINVAL after the fact. >>> >> No way. Single ENOTSUP will turn off fallocate() support on caller side >> while >> aligned (99.99% of calls) works normally. > I didn't mean skip fallocate() unconditionally, only when unaligned: > > if (request not aligned enough) > return -ENOTSUP; > fallocate() ... > > so that the 99.99% requests that ARE aligned get to use fallocate() > normally. > static int handle_aiocb_write_zeroes(void *opaque) { ... #ifdef CONFIG_FALLOCATE_ZERO_RANGE if (s->has_write_zeroes) { int ret = do_fallocate(s->fd, FALLOC_FL_ZERO_RANGE, aiocb->aio_offset, aiocb->aio_nbytes); if (ret == 0 || ret != -ENOTSUP) { return ret; } s->has_write_zeroes = false; } #endif thus, right now, single ENOTSUP disables fallocate functionality completely setting s->has_write_zeroes to false and that is pretty much correct. ENOTSUP is "static" error code which returns persistent ENOTSUP under any consequences. Its handling usually disables some functionality. This is why original idea is proposed. Den ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [Qemu-devel] [PATCH] block/io.c: fix for the allocation failure 2019-08-19 20:53 ` Denis V. Lunev @ 2019-08-19 21:29 ` Eric Blake 0 siblings, 0 replies; 25+ messages in thread From: Eric Blake @ 2019-08-19 21:29 UTC (permalink / raw) To: Denis V. Lunev, Andrey Shinkevich, qemu-devel, qemu-block Cc: fam, kwolf, Vladimir Sementsov-Ogievskiy, stefanha, mreitz [-- Attachment #1.1: Type: text/plain, Size: 2133 bytes --] On 8/19/19 3:53 PM, Denis V. Lunev wrote: >>>> Or even better, fix the call site of fallocate() to skip attempting an >>>> unaligned fallocate(), and just directly return ENOTSUP, rather than >>>> trying to diagnose EINVAL after the fact. >>>> >>> No way. Single ENOTSUP will turn off fallocate() support on caller side >>> while >>> aligned (99.99% of calls) works normally. >> I didn't mean skip fallocate() unconditionally, only when unaligned: >> >> if (request not aligned enough) >> return -ENOTSUP; >> fallocate() ... >> >> so that the 99.99% requests that ARE aligned get to use fallocate() >> normally. >> > static int handle_aiocb_write_zeroes(void *opaque) > { > ... > #ifdef CONFIG_FALLOCATE_ZERO_RANGE > if (s->has_write_zeroes) { > int ret = do_fallocate(s->fd, FALLOC_FL_ZERO_RANGE, > aiocb->aio_offset, aiocb->aio_nbytes); > if (ret == 0 || ret != -ENOTSUP) { > return ret; > } > s->has_write_zeroes = false; > } > #endif > > thus, right now, single ENOTSUP disables fallocate > functionality completely setting s->has_write_zeroes > to false and that is pretty much correct. > > ENOTSUP is "static" error code which returns persistent > ENOTSUP under any consequences. Not always true. And the block layer doesn't expect it to be true. It is perfectly fine for one invocation to return ENOTSUP ('I can't handle this request, so fall back to pwrite for me) and the next to just work ('this one was aligned, so I handled it just fine). It just means that you have to be more careful with the logic: never set s->has_write_zeroes=false if you skipped the fallocate, or if the fallocate failed due to EINVAL rather than ENOTSUP (but still report ENOTSUP to the block layer, to document that you want the EINVAL for unaligned request to be turned into a fallback to pwrite). -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 25+ messages in thread
end of thread, other threads:[~2019-08-19 21:30 UTC | newest] Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2019-04-05 14:24 [Qemu-devel] [PATCH] block/io.c: fix for the allocation failure Andrey Shinkevich 2019-04-05 14:24 ` Andrey Shinkevich 2019-04-05 22:50 ` [Qemu-devel] [Qemu-block] " John Snow 2019-04-05 22:50 ` John Snow 2019-04-08 9:44 ` Andrey Shinkevich 2019-04-08 9:44 ` Andrey Shinkevich 2019-04-08 10:04 ` Kevin Wolf 2019-04-08 10:04 ` Kevin Wolf 2019-04-08 10:14 ` Kevin Wolf 2019-04-08 10:14 ` Kevin Wolf 2019-04-10 14:54 ` Stefan Hajnoczi 2019-04-10 14:54 ` Stefan Hajnoczi 2019-04-08 11:55 ` Andrey Shinkevich 2019-04-08 11:55 ` Andrey Shinkevich 2019-04-08 9:00 ` [Qemu-devel] " Stefan Hajnoczi 2019-04-08 9:00 ` Stefan Hajnoczi 2019-04-08 9:45 ` Andrey Shinkevich 2019-04-08 9:45 ` Andrey Shinkevich 2019-08-17 14:42 ` Eric Blake 2019-08-17 14:49 ` Eric Blake 2019-08-17 14:56 ` Eric Blake 2019-08-19 19:46 ` Denis V. Lunev 2019-08-19 20:30 ` Eric Blake 2019-08-19 20:53 ` Denis V. Lunev 2019-08-19 21:29 ` Eric Blake
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).