qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH] block/io.c: fix for the allocation failure
@ 2019-04-05 14:24 Andrey Shinkevich
  2019-04-05 14:24 ` Andrey Shinkevich
                   ` (3 more replies)
  0 siblings, 4 replies; 25+ messages in thread
From: Andrey Shinkevich @ 2019-04-05 14:24 UTC (permalink / raw)
  To: qemu-devel, qemu-block
  Cc: stefanha, fam, kwolf, mreitz, den, vsementsov, andrey.shinkevich

On a file system used by the customer, fallocate() returns an error
if the block is not properly aligned. So, bdrv_co_pwrite_zeroes()
fails. We can handle that case the same way as it is done for the
unsupported cases, namely, call to bdrv_driver_pwritev() that writes
zeroes to an image for the unaligned chunk of the block.

Suggested-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
---
 block/io.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/io.c b/block/io.c
index dfc153b..0412a51 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1516,7 +1516,7 @@ static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
             assert(!bs->supported_zero_flags);
         }
 
-        if (ret == -ENOTSUP && !(flags & BDRV_REQ_NO_FALLBACK)) {
+        if (ret < 0 && !(flags & BDRV_REQ_NO_FALLBACK)) {
             /* Fall back to bounce buffer if write zeroes is unsupported */
             BdrvRequestFlags write_flags = flags & ~BDRV_REQ_ZERO_WRITE;
 
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [Qemu-devel] [PATCH] block/io.c: fix for the allocation failure
  2019-04-05 14:24 [Qemu-devel] [PATCH] block/io.c: fix for the allocation failure Andrey Shinkevich
@ 2019-04-05 14:24 ` Andrey Shinkevich
  2019-04-05 22:50 ` [Qemu-devel] [Qemu-block] " John Snow
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 25+ messages in thread
From: Andrey Shinkevich @ 2019-04-05 14:24 UTC (permalink / raw)
  To: qemu-devel, qemu-block
  Cc: fam, kwolf, vsementsov, mreitz, stefanha, andrey.shinkevich, den

On a file system used by the customer, fallocate() returns an error
if the block is not properly aligned. So, bdrv_co_pwrite_zeroes()
fails. We can handle that case the same way as it is done for the
unsupported cases, namely, call to bdrv_driver_pwritev() that writes
zeroes to an image for the unaligned chunk of the block.

Suggested-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
---
 block/io.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/io.c b/block/io.c
index dfc153b..0412a51 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1516,7 +1516,7 @@ static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
             assert(!bs->supported_zero_flags);
         }
 
-        if (ret == -ENOTSUP && !(flags & BDRV_REQ_NO_FALLBACK)) {
+        if (ret < 0 && !(flags & BDRV_REQ_NO_FALLBACK)) {
             /* Fall back to bounce buffer if write zeroes is unsupported */
             BdrvRequestFlags write_flags = flags & ~BDRV_REQ_ZERO_WRITE;
 
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [PATCH] block/io.c: fix for the allocation failure
  2019-04-05 14:24 [Qemu-devel] [PATCH] block/io.c: fix for the allocation failure Andrey Shinkevich
  2019-04-05 14:24 ` Andrey Shinkevich
@ 2019-04-05 22:50 ` John Snow
  2019-04-05 22:50   ` John Snow
  2019-04-08  9:44   ` Andrey Shinkevich
  2019-04-08  9:00 ` [Qemu-devel] " Stefan Hajnoczi
  2019-08-17 14:42 ` Eric Blake
  3 siblings, 2 replies; 25+ messages in thread
From: John Snow @ 2019-04-05 22:50 UTC (permalink / raw)
  To: Andrey Shinkevich, qemu-devel, qemu-block
  Cc: fam, kwolf, vsementsov, mreitz, stefanha, den



On 4/5/19 10:24 AM, Andrey Shinkevich wrote:
> On a file system used by the customer, fallocate() returns an error
> if the block is not properly aligned. So, bdrv_co_pwrite_zeroes()
> fails. We can handle that case the same way as it is done for the
> unsupported cases, namely, call to bdrv_driver_pwritev() that writes
> zeroes to an image for the unaligned chunk of the block.
> 
> Suggested-by: Denis V. Lunev <den@openvz.org>
> Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
> ---
>  block/io.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/block/io.c b/block/io.c
> index dfc153b..0412a51 100644
> --- a/block/io.c
> +++ b/block/io.c
> @@ -1516,7 +1516,7 @@ static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
>              assert(!bs->supported_zero_flags);
>          }
>  
> -        if (ret == -ENOTSUP && !(flags & BDRV_REQ_NO_FALLBACK)) {
> +        if (ret < 0 && !(flags & BDRV_REQ_NO_FALLBACK)) {
>              /* Fall back to bounce buffer if write zeroes is unsupported */
>              BdrvRequestFlags write_flags = flags & ~BDRV_REQ_ZERO_WRITE;
>  
> 

I suppose that if fallocate fails for any reason and we're allowing
fallback, we're either going to succeed ... or fail again very soon
thereafter.

Are there any cases where it is vital to not ignore the first fallocate
failure? I'm a little wary of ignoring the return code from
bdrv_co_pwrite_zeroes, but I am assuming that if there is a "real"
failure here that the following bounce writes will also fail "safely."

I'm not completely confident, but I have no tangible objections:
Reviewed-by: John Snow <jsnow@redhat.com>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [PATCH] block/io.c: fix for the allocation failure
  2019-04-05 22:50 ` [Qemu-devel] [Qemu-block] " John Snow
@ 2019-04-05 22:50   ` John Snow
  2019-04-08  9:44   ` Andrey Shinkevich
  1 sibling, 0 replies; 25+ messages in thread
From: John Snow @ 2019-04-05 22:50 UTC (permalink / raw)
  To: Andrey Shinkevich, qemu-devel, qemu-block
  Cc: fam, kwolf, vsementsov, mreitz, stefanha, den



On 4/5/19 10:24 AM, Andrey Shinkevich wrote:
> On a file system used by the customer, fallocate() returns an error
> if the block is not properly aligned. So, bdrv_co_pwrite_zeroes()
> fails. We can handle that case the same way as it is done for the
> unsupported cases, namely, call to bdrv_driver_pwritev() that writes
> zeroes to an image for the unaligned chunk of the block.
> 
> Suggested-by: Denis V. Lunev <den@openvz.org>
> Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
> ---
>  block/io.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/block/io.c b/block/io.c
> index dfc153b..0412a51 100644
> --- a/block/io.c
> +++ b/block/io.c
> @@ -1516,7 +1516,7 @@ static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
>              assert(!bs->supported_zero_flags);
>          }
>  
> -        if (ret == -ENOTSUP && !(flags & BDRV_REQ_NO_FALLBACK)) {
> +        if (ret < 0 && !(flags & BDRV_REQ_NO_FALLBACK)) {
>              /* Fall back to bounce buffer if write zeroes is unsupported */
>              BdrvRequestFlags write_flags = flags & ~BDRV_REQ_ZERO_WRITE;
>  
> 

I suppose that if fallocate fails for any reason and we're allowing
fallback, we're either going to succeed ... or fail again very soon
thereafter.

Are there any cases where it is vital to not ignore the first fallocate
failure? I'm a little wary of ignoring the return code from
bdrv_co_pwrite_zeroes, but I am assuming that if there is a "real"
failure here that the following bounce writes will also fail "safely."

I'm not completely confident, but I have no tangible objections:
Reviewed-by: John Snow <jsnow@redhat.com>


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH] block/io.c: fix for the allocation failure
  2019-04-05 14:24 [Qemu-devel] [PATCH] block/io.c: fix for the allocation failure Andrey Shinkevich
  2019-04-05 14:24 ` Andrey Shinkevich
  2019-04-05 22:50 ` [Qemu-devel] [Qemu-block] " John Snow
@ 2019-04-08  9:00 ` Stefan Hajnoczi
  2019-04-08  9:00   ` Stefan Hajnoczi
  2019-04-08  9:45   ` Andrey Shinkevich
  2019-08-17 14:42 ` Eric Blake
  3 siblings, 2 replies; 25+ messages in thread
From: Stefan Hajnoczi @ 2019-04-08  9:00 UTC (permalink / raw)
  To: Andrey Shinkevich
  Cc: qemu-devel, qemu-block, fam, kwolf, mreitz, den, vsementsov

[-- Attachment #1: Type: text/plain, Size: 729 bytes --]

On Fri, Apr 05, 2019 at 05:24:04PM +0300, Andrey Shinkevich wrote:
> On a file system used by the customer, fallocate() returns an error
> if the block is not properly aligned. So, bdrv_co_pwrite_zeroes()
> fails. We can handle that case the same way as it is done for the
> unsupported cases, namely, call to bdrv_driver_pwritev() that writes
> zeroes to an image for the unaligned chunk of the block.
> 
> Suggested-by: Denis V. Lunev <den@openvz.org>
> Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
> ---
>  block/io.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Thanks, applied to my block-next tree for QEMU 4.1:
https://github.com/stefanha/qemu/commits/block-next

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 455 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH] block/io.c: fix for the allocation failure
  2019-04-08  9:00 ` [Qemu-devel] " Stefan Hajnoczi
@ 2019-04-08  9:00   ` Stefan Hajnoczi
  2019-04-08  9:45   ` Andrey Shinkevich
  1 sibling, 0 replies; 25+ messages in thread
From: Stefan Hajnoczi @ 2019-04-08  9:00 UTC (permalink / raw)
  To: Andrey Shinkevich
  Cc: fam, kwolf, vsementsov, qemu-block, qemu-devel, mreitz, den

[-- Attachment #1: Type: text/plain, Size: 729 bytes --]

On Fri, Apr 05, 2019 at 05:24:04PM +0300, Andrey Shinkevich wrote:
> On a file system used by the customer, fallocate() returns an error
> if the block is not properly aligned. So, bdrv_co_pwrite_zeroes()
> fails. We can handle that case the same way as it is done for the
> unsupported cases, namely, call to bdrv_driver_pwritev() that writes
> zeroes to an image for the unaligned chunk of the block.
> 
> Suggested-by: Denis V. Lunev <den@openvz.org>
> Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
> ---
>  block/io.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Thanks, applied to my block-next tree for QEMU 4.1:
https://github.com/stefanha/qemu/commits/block-next

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 455 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [PATCH] block/io.c: fix for the allocation failure
  2019-04-05 22:50 ` [Qemu-devel] [Qemu-block] " John Snow
  2019-04-05 22:50   ` John Snow
@ 2019-04-08  9:44   ` Andrey Shinkevich
  2019-04-08  9:44     ` Andrey Shinkevich
  2019-04-08 10:04     ` Kevin Wolf
  1 sibling, 2 replies; 25+ messages in thread
From: Andrey Shinkevich @ 2019-04-08  9:44 UTC (permalink / raw)
  To: John Snow, qemu-devel, qemu-block
  Cc: fam, kwolf, Vladimir Sementsov-Ogievskiy, mreitz, stefanha, Denis Lunev



On 06/04/2019 01:50, John Snow wrote:
> 
> 
> On 4/5/19 10:24 AM, Andrey Shinkevich wrote:
>> On a file system used by the customer, fallocate() returns an error
>> if the block is not properly aligned. So, bdrv_co_pwrite_zeroes()
>> fails. We can handle that case the same way as it is done for the
>> unsupported cases, namely, call to bdrv_driver_pwritev() that writes
>> zeroes to an image for the unaligned chunk of the block.
>>
>> Suggested-by: Denis V. Lunev <den@openvz.org>
>> Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
>> ---
>>   block/io.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/block/io.c b/block/io.c
>> index dfc153b..0412a51 100644
>> --- a/block/io.c
>> +++ b/block/io.c
>> @@ -1516,7 +1516,7 @@ static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
>>               assert(!bs->supported_zero_flags);
>>           }
>>   
>> -        if (ret == -ENOTSUP && !(flags & BDRV_REQ_NO_FALLBACK)) {
>> +        if (ret < 0 && !(flags & BDRV_REQ_NO_FALLBACK)) {
>>               /* Fall back to bounce buffer if write zeroes is unsupported */
>>               BdrvRequestFlags write_flags = flags & ~BDRV_REQ_ZERO_WRITE;
>>   
>>
> 
> I suppose that if fallocate fails for any reason and we're allowing
> fallback, we're either going to succeed ... or fail again very soon
> thereafter.
> 
> Are there any cases where it is vital to not ignore the first fallocate
> failure? I'm a little wary of ignoring the return code from
> bdrv_co_pwrite_zeroes, but I am assuming that if there is a "real"
> failure here that the following bounce writes will also fail "safely."
> 
> I'm not completely confident, but I have no tangible objections:
> Reviewed-by: John Snow <jsnow@redhat.com>
> 

Thank you for your review, John!

Let me clarify the circumstances and quote the bug report:
"Customer had Win-2012 VM with 50GB system disk which was later resized 
to 256GB without resizing the partition inside VM.
Now, while trying to resize to 50G, the following error will appear
'Failed to reduce the number of L2 tables: Invalid argument'
It was found that it is possible to shrink the disk to 128G and any size 
above that number, but size below 128G will bring the mentioned error."

The fallocate() returns no error on that file system if the offset and
the (offset + bytes) parameters of the bdrv_co_do_pwrite_zeroes() both
are aligned to 4K.
-- 
With the best regards,
Andrey Shinkevich

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [PATCH] block/io.c: fix for the allocation failure
  2019-04-08  9:44   ` Andrey Shinkevich
@ 2019-04-08  9:44     ` Andrey Shinkevich
  2019-04-08 10:04     ` Kevin Wolf
  1 sibling, 0 replies; 25+ messages in thread
From: Andrey Shinkevich @ 2019-04-08  9:44 UTC (permalink / raw)
  To: John Snow, qemu-devel, qemu-block
  Cc: fam, kwolf, Vladimir Sementsov-Ogievskiy, Denis Lunev, mreitz, stefanha



On 06/04/2019 01:50, John Snow wrote:
> 
> 
> On 4/5/19 10:24 AM, Andrey Shinkevich wrote:
>> On a file system used by the customer, fallocate() returns an error
>> if the block is not properly aligned. So, bdrv_co_pwrite_zeroes()
>> fails. We can handle that case the same way as it is done for the
>> unsupported cases, namely, call to bdrv_driver_pwritev() that writes
>> zeroes to an image for the unaligned chunk of the block.
>>
>> Suggested-by: Denis V. Lunev <den@openvz.org>
>> Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
>> ---
>>   block/io.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/block/io.c b/block/io.c
>> index dfc153b..0412a51 100644
>> --- a/block/io.c
>> +++ b/block/io.c
>> @@ -1516,7 +1516,7 @@ static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
>>               assert(!bs->supported_zero_flags);
>>           }
>>   
>> -        if (ret == -ENOTSUP && !(flags & BDRV_REQ_NO_FALLBACK)) {
>> +        if (ret < 0 && !(flags & BDRV_REQ_NO_FALLBACK)) {
>>               /* Fall back to bounce buffer if write zeroes is unsupported */
>>               BdrvRequestFlags write_flags = flags & ~BDRV_REQ_ZERO_WRITE;
>>   
>>
> 
> I suppose that if fallocate fails for any reason and we're allowing
> fallback, we're either going to succeed ... or fail again very soon
> thereafter.
> 
> Are there any cases where it is vital to not ignore the first fallocate
> failure? I'm a little wary of ignoring the return code from
> bdrv_co_pwrite_zeroes, but I am assuming that if there is a "real"
> failure here that the following bounce writes will also fail "safely."
> 
> I'm not completely confident, but I have no tangible objections:
> Reviewed-by: John Snow <jsnow@redhat.com>
> 

Thank you for your review, John!

Let me clarify the circumstances and quote the bug report:
"Customer had Win-2012 VM with 50GB system disk which was later resized 
to 256GB without resizing the partition inside VM.
Now, while trying to resize to 50G, the following error will appear
'Failed to reduce the number of L2 tables: Invalid argument'
It was found that it is possible to shrink the disk to 128G and any size 
above that number, but size below 128G will bring the mentioned error."

The fallocate() returns no error on that file system if the offset and
the (offset + bytes) parameters of the bdrv_co_do_pwrite_zeroes() both
are aligned to 4K.
-- 
With the best regards,
Andrey Shinkevich

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH] block/io.c: fix for the allocation failure
  2019-04-08  9:00 ` [Qemu-devel] " Stefan Hajnoczi
  2019-04-08  9:00   ` Stefan Hajnoczi
@ 2019-04-08  9:45   ` Andrey Shinkevich
  2019-04-08  9:45     ` Andrey Shinkevich
  1 sibling, 1 reply; 25+ messages in thread
From: Andrey Shinkevich @ 2019-04-08  9:45 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: qemu-devel, qemu-block, fam, kwolf, mreitz, Denis Lunev,
	Vladimir Sementsov-Ogievskiy



On 08/04/2019 12:00, Stefan Hajnoczi wrote:
> On Fri, Apr 05, 2019 at 05:24:04PM +0300, Andrey Shinkevich wrote:
>> On a file system used by the customer, fallocate() returns an error
>> if the block is not properly aligned. So, bdrv_co_pwrite_zeroes()
>> fails. We can handle that case the same way as it is done for the
>> unsupported cases, namely, call to bdrv_driver_pwritev() that writes
>> zeroes to an image for the unaligned chunk of the block.
>>
>> Suggested-by: Denis V. Lunev <den@openvz.org>
>> Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
>> ---
>>   block/io.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> Thanks, applied to my block-next tree for QEMU 4.1:
> https://github.com/stefanha/qemu/commits/block-next
> 
> Stefan
> 

Thank you all!
-- 
With the best regards,
Andrey Shinkevich

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH] block/io.c: fix for the allocation failure
  2019-04-08  9:45   ` Andrey Shinkevich
@ 2019-04-08  9:45     ` Andrey Shinkevich
  0 siblings, 0 replies; 25+ messages in thread
From: Andrey Shinkevich @ 2019-04-08  9:45 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: fam, kwolf, Vladimir Sementsov-Ogievskiy, Denis Lunev,
	qemu-block, qemu-devel, mreitz



On 08/04/2019 12:00, Stefan Hajnoczi wrote:
> On Fri, Apr 05, 2019 at 05:24:04PM +0300, Andrey Shinkevich wrote:
>> On a file system used by the customer, fallocate() returns an error
>> if the block is not properly aligned. So, bdrv_co_pwrite_zeroes()
>> fails. We can handle that case the same way as it is done for the
>> unsupported cases, namely, call to bdrv_driver_pwritev() that writes
>> zeroes to an image for the unaligned chunk of the block.
>>
>> Suggested-by: Denis V. Lunev <den@openvz.org>
>> Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
>> ---
>>   block/io.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> Thanks, applied to my block-next tree for QEMU 4.1:
> https://github.com/stefanha/qemu/commits/block-next
> 
> Stefan
> 

Thank you all!
-- 
With the best regards,
Andrey Shinkevich


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [PATCH] block/io.c: fix for the allocation failure
  2019-04-08  9:44   ` Andrey Shinkevich
  2019-04-08  9:44     ` Andrey Shinkevich
@ 2019-04-08 10:04     ` Kevin Wolf
  2019-04-08 10:04       ` Kevin Wolf
                         ` (2 more replies)
  1 sibling, 3 replies; 25+ messages in thread
From: Kevin Wolf @ 2019-04-08 10:04 UTC (permalink / raw)
  To: Andrey Shinkevich
  Cc: John Snow, qemu-devel, qemu-block, fam,
	Vladimir Sementsov-Ogievskiy, mreitz, stefanha, Denis Lunev

Am 08.04.2019 um 11:44 hat Andrey Shinkevich geschrieben:
> 
> 
> On 06/04/2019 01:50, John Snow wrote:
> > 
> > 
> > On 4/5/19 10:24 AM, Andrey Shinkevich wrote:
> >> On a file system used by the customer, fallocate() returns an error
> >> if the block is not properly aligned. So, bdrv_co_pwrite_zeroes()
> >> fails. We can handle that case the same way as it is done for the
> >> unsupported cases, namely, call to bdrv_driver_pwritev() that writes
> >> zeroes to an image for the unaligned chunk of the block.
> >>
> >> Suggested-by: Denis V. Lunev <den@openvz.org>
> >> Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
> >> ---
> >>   block/io.c | 2 +-
> >>   1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/block/io.c b/block/io.c
> >> index dfc153b..0412a51 100644
> >> --- a/block/io.c
> >> +++ b/block/io.c
> >> @@ -1516,7 +1516,7 @@ static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
> >>               assert(!bs->supported_zero_flags);
> >>           }
> >>   
> >> -        if (ret == -ENOTSUP && !(flags & BDRV_REQ_NO_FALLBACK)) {
> >> +        if (ret < 0 && !(flags & BDRV_REQ_NO_FALLBACK)) {
> >>               /* Fall back to bounce buffer if write zeroes is unsupported */
> >>               BdrvRequestFlags write_flags = flags & ~BDRV_REQ_ZERO_WRITE;
> >>   
> >>
> > 
> > I suppose that if fallocate fails for any reason and we're allowing
> > fallback, we're either going to succeed ... or fail again very soon
> > thereafter.
> > 
> > Are there any cases where it is vital to not ignore the first fallocate
> > failure? I'm a little wary of ignoring the return code from
> > bdrv_co_pwrite_zeroes, but I am assuming that if there is a "real"
> > failure here that the following bounce writes will also fail "safely."
> > 
> > I'm not completely confident, but I have no tangible objections:
> > Reviewed-by: John Snow <jsnow@redhat.com>
> > 
> 
> Thank you for your review, John!
> 
> Let me clarify the circumstances and quote the bug report:
> "Customer had Win-2012 VM with 50GB system disk which was later resized 
> to 256GB without resizing the partition inside VM.
> Now, while trying to resize to 50G, the following error will appear
> 'Failed to reduce the number of L2 tables: Invalid argument'
> It was found that it is possible to shrink the disk to 128G and any size 
> above that number, but size below 128G will bring the mentioned error."
> 
> The fallocate() returns no error on that file system if the offset and
> the (offset + bytes) parameters of the bdrv_co_do_pwrite_zeroes() both
> are aligned to 4K.

What is the return value you get from this file system?

Maybe turning that into ENOTSUP in file-posix would be less invasive.
Just falling back for any error gives me the vague feeling that it could
cause problems sooner or later.

Kevin

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [PATCH] block/io.c: fix for the allocation failure
  2019-04-08 10:04     ` Kevin Wolf
@ 2019-04-08 10:04       ` Kevin Wolf
  2019-04-08 10:14       ` Kevin Wolf
  2019-04-08 11:55       ` Andrey Shinkevich
  2 siblings, 0 replies; 25+ messages in thread
From: Kevin Wolf @ 2019-04-08 10:04 UTC (permalink / raw)
  To: Andrey Shinkevich
  Cc: fam, Vladimir Sementsov-Ogievskiy, Denis Lunev, qemu-block,
	qemu-devel, mreitz, stefanha, John Snow

Am 08.04.2019 um 11:44 hat Andrey Shinkevich geschrieben:
> 
> 
> On 06/04/2019 01:50, John Snow wrote:
> > 
> > 
> > On 4/5/19 10:24 AM, Andrey Shinkevich wrote:
> >> On a file system used by the customer, fallocate() returns an error
> >> if the block is not properly aligned. So, bdrv_co_pwrite_zeroes()
> >> fails. We can handle that case the same way as it is done for the
> >> unsupported cases, namely, call to bdrv_driver_pwritev() that writes
> >> zeroes to an image for the unaligned chunk of the block.
> >>
> >> Suggested-by: Denis V. Lunev <den@openvz.org>
> >> Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
> >> ---
> >>   block/io.c | 2 +-
> >>   1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/block/io.c b/block/io.c
> >> index dfc153b..0412a51 100644
> >> --- a/block/io.c
> >> +++ b/block/io.c
> >> @@ -1516,7 +1516,7 @@ static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
> >>               assert(!bs->supported_zero_flags);
> >>           }
> >>   
> >> -        if (ret == -ENOTSUP && !(flags & BDRV_REQ_NO_FALLBACK)) {
> >> +        if (ret < 0 && !(flags & BDRV_REQ_NO_FALLBACK)) {
> >>               /* Fall back to bounce buffer if write zeroes is unsupported */
> >>               BdrvRequestFlags write_flags = flags & ~BDRV_REQ_ZERO_WRITE;
> >>   
> >>
> > 
> > I suppose that if fallocate fails for any reason and we're allowing
> > fallback, we're either going to succeed ... or fail again very soon
> > thereafter.
> > 
> > Are there any cases where it is vital to not ignore the first fallocate
> > failure? I'm a little wary of ignoring the return code from
> > bdrv_co_pwrite_zeroes, but I am assuming that if there is a "real"
> > failure here that the following bounce writes will also fail "safely."
> > 
> > I'm not completely confident, but I have no tangible objections:
> > Reviewed-by: John Snow <jsnow@redhat.com>
> > 
> 
> Thank you for your review, John!
> 
> Let me clarify the circumstances and quote the bug report:
> "Customer had Win-2012 VM with 50GB system disk which was later resized 
> to 256GB without resizing the partition inside VM.
> Now, while trying to resize to 50G, the following error will appear
> 'Failed to reduce the number of L2 tables: Invalid argument'
> It was found that it is possible to shrink the disk to 128G and any size 
> above that number, but size below 128G will bring the mentioned error."
> 
> The fallocate() returns no error on that file system if the offset and
> the (offset + bytes) parameters of the bdrv_co_do_pwrite_zeroes() both
> are aligned to 4K.

What is the return value you get from this file system?

Maybe turning that into ENOTSUP in file-posix would be less invasive.
Just falling back for any error gives me the vague feeling that it could
cause problems sooner or later.

Kevin


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [PATCH] block/io.c: fix for the allocation failure
  2019-04-08 10:04     ` Kevin Wolf
  2019-04-08 10:04       ` Kevin Wolf
@ 2019-04-08 10:14       ` Kevin Wolf
  2019-04-08 10:14         ` Kevin Wolf
  2019-04-10 14:54         ` Stefan Hajnoczi
  2019-04-08 11:55       ` Andrey Shinkevich
  2 siblings, 2 replies; 25+ messages in thread
From: Kevin Wolf @ 2019-04-08 10:14 UTC (permalink / raw)
  To: Andrey Shinkevich
  Cc: John Snow, qemu-devel, qemu-block, fam,
	Vladimir Sementsov-Ogievskiy, mreitz, stefanha, Denis Lunev

Am 08.04.2019 um 12:04 hat Kevin Wolf geschrieben:
> Am 08.04.2019 um 11:44 hat Andrey Shinkevich geschrieben:
> > 
> > 
> > On 06/04/2019 01:50, John Snow wrote:
> > > 
> > > 
> > > On 4/5/19 10:24 AM, Andrey Shinkevich wrote:
> > >> On a file system used by the customer, fallocate() returns an error
> > >> if the block is not properly aligned. So, bdrv_co_pwrite_zeroes()
> > >> fails. We can handle that case the same way as it is done for the
> > >> unsupported cases, namely, call to bdrv_driver_pwritev() that writes
> > >> zeroes to an image for the unaligned chunk of the block.
> > >>
> > >> Suggested-by: Denis V. Lunev <den@openvz.org>
> > >> Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
> > >> ---
> > >>   block/io.c | 2 +-
> > >>   1 file changed, 1 insertion(+), 1 deletion(-)
> > >>
> > >> diff --git a/block/io.c b/block/io.c
> > >> index dfc153b..0412a51 100644
> > >> --- a/block/io.c
> > >> +++ b/block/io.c
> > >> @@ -1516,7 +1516,7 @@ static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
> > >>               assert(!bs->supported_zero_flags);
> > >>           }
> > >>   
> > >> -        if (ret == -ENOTSUP && !(flags & BDRV_REQ_NO_FALLBACK)) {
> > >> +        if (ret < 0 && !(flags & BDRV_REQ_NO_FALLBACK)) {
> > >>               /* Fall back to bounce buffer if write zeroes is unsupported */
> > >>               BdrvRequestFlags write_flags = flags & ~BDRV_REQ_ZERO_WRITE;
> > >>   
> > >>
> > > 
> > > I suppose that if fallocate fails for any reason and we're allowing
> > > fallback, we're either going to succeed ... or fail again very soon
> > > thereafter.
> > > 
> > > Are there any cases where it is vital to not ignore the first fallocate
> > > failure? I'm a little wary of ignoring the return code from
> > > bdrv_co_pwrite_zeroes, but I am assuming that if there is a "real"
> > > failure here that the following bounce writes will also fail "safely."
> > > 
> > > I'm not completely confident, but I have no tangible objections:
> > > Reviewed-by: John Snow <jsnow@redhat.com>
> > > 
> > 
> > Thank you for your review, John!
> > 
> > Let me clarify the circumstances and quote the bug report:
> > "Customer had Win-2012 VM with 50GB system disk which was later resized 
> > to 256GB without resizing the partition inside VM.
> > Now, while trying to resize to 50G, the following error will appear
> > 'Failed to reduce the number of L2 tables: Invalid argument'
> > It was found that it is possible to shrink the disk to 128G and any size 
> > above that number, but size below 128G will bring the mentioned error."
> > 
> > The fallocate() returns no error on that file system if the offset and
> > the (offset + bytes) parameters of the bdrv_co_do_pwrite_zeroes() both
> > are aligned to 4K.
> 
> What is the return value you get from this file system?
> 
> Maybe turning that into ENOTSUP in file-posix would be less invasive.
> Just falling back for any error gives me the vague feeling that it could
> cause problems sooner or later.

Also, which file system is this? This seems to be a file system bug.
fallocate() isn't documented to possibly have alignment restrictions for
FALLOC_FL_ZERO_RANGE (if this is the operation you're talking about).
FALLOC_FL_PUNCH_HOLE even explicitly mentions the behaviour for partial
blocks, so there is no doubt that operations for partial blocks are
considered valid. Operations that may impose restrictions are explicitly
documented as such.

So in any case, this would only be a workaround for a buggy file system.
The real bug needs to be fixed there.

Kevin

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [PATCH] block/io.c: fix for the allocation failure
  2019-04-08 10:14       ` Kevin Wolf
@ 2019-04-08 10:14         ` Kevin Wolf
  2019-04-10 14:54         ` Stefan Hajnoczi
  1 sibling, 0 replies; 25+ messages in thread
From: Kevin Wolf @ 2019-04-08 10:14 UTC (permalink / raw)
  To: Andrey Shinkevich
  Cc: fam, Vladimir Sementsov-Ogievskiy, Denis Lunev, qemu-block,
	qemu-devel, mreitz, stefanha, John Snow

Am 08.04.2019 um 12:04 hat Kevin Wolf geschrieben:
> Am 08.04.2019 um 11:44 hat Andrey Shinkevich geschrieben:
> > 
> > 
> > On 06/04/2019 01:50, John Snow wrote:
> > > 
> > > 
> > > On 4/5/19 10:24 AM, Andrey Shinkevich wrote:
> > >> On a file system used by the customer, fallocate() returns an error
> > >> if the block is not properly aligned. So, bdrv_co_pwrite_zeroes()
> > >> fails. We can handle that case the same way as it is done for the
> > >> unsupported cases, namely, call to bdrv_driver_pwritev() that writes
> > >> zeroes to an image for the unaligned chunk of the block.
> > >>
> > >> Suggested-by: Denis V. Lunev <den@openvz.org>
> > >> Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
> > >> ---
> > >>   block/io.c | 2 +-
> > >>   1 file changed, 1 insertion(+), 1 deletion(-)
> > >>
> > >> diff --git a/block/io.c b/block/io.c
> > >> index dfc153b..0412a51 100644
> > >> --- a/block/io.c
> > >> +++ b/block/io.c
> > >> @@ -1516,7 +1516,7 @@ static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
> > >>               assert(!bs->supported_zero_flags);
> > >>           }
> > >>   
> > >> -        if (ret == -ENOTSUP && !(flags & BDRV_REQ_NO_FALLBACK)) {
> > >> +        if (ret < 0 && !(flags & BDRV_REQ_NO_FALLBACK)) {
> > >>               /* Fall back to bounce buffer if write zeroes is unsupported */
> > >>               BdrvRequestFlags write_flags = flags & ~BDRV_REQ_ZERO_WRITE;
> > >>   
> > >>
> > > 
> > > I suppose that if fallocate fails for any reason and we're allowing
> > > fallback, we're either going to succeed ... or fail again very soon
> > > thereafter.
> > > 
> > > Are there any cases where it is vital to not ignore the first fallocate
> > > failure? I'm a little wary of ignoring the return code from
> > > bdrv_co_pwrite_zeroes, but I am assuming that if there is a "real"
> > > failure here that the following bounce writes will also fail "safely."
> > > 
> > > I'm not completely confident, but I have no tangible objections:
> > > Reviewed-by: John Snow <jsnow@redhat.com>
> > > 
> > 
> > Thank you for your review, John!
> > 
> > Let me clarify the circumstances and quote the bug report:
> > "Customer had Win-2012 VM with 50GB system disk which was later resized 
> > to 256GB without resizing the partition inside VM.
> > Now, while trying to resize to 50G, the following error will appear
> > 'Failed to reduce the number of L2 tables: Invalid argument'
> > It was found that it is possible to shrink the disk to 128G and any size 
> > above that number, but size below 128G will bring the mentioned error."
> > 
> > The fallocate() returns no error on that file system if the offset and
> > the (offset + bytes) parameters of the bdrv_co_do_pwrite_zeroes() both
> > are aligned to 4K.
> 
> What is the return value you get from this file system?
> 
> Maybe turning that into ENOTSUP in file-posix would be less invasive.
> Just falling back for any error gives me the vague feeling that it could
> cause problems sooner or later.

Also, which file system is this? This seems to be a file system bug.
fallocate() isn't documented to possibly have alignment restrictions for
FALLOC_FL_ZERO_RANGE (if this is the operation you're talking about).
FALLOC_FL_PUNCH_HOLE even explicitly mentions the behaviour for partial
blocks, so there is no doubt that operations for partial blocks are
considered valid. Operations that may impose restrictions are explicitly
documented as such.

So in any case, this would only be a workaround for a buggy file system.
The real bug needs to be fixed there.

Kevin


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [PATCH] block/io.c: fix for the allocation failure
  2019-04-08 10:04     ` Kevin Wolf
  2019-04-08 10:04       ` Kevin Wolf
  2019-04-08 10:14       ` Kevin Wolf
@ 2019-04-08 11:55       ` Andrey Shinkevich
  2019-04-08 11:55         ` Andrey Shinkevich
  2 siblings, 1 reply; 25+ messages in thread
From: Andrey Shinkevich @ 2019-04-08 11:55 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: John Snow, qemu-devel, qemu-block, fam,
	Vladimir Sementsov-Ogievskiy, mreitz, stefanha, Denis Lunev



On 08/04/2019 13:04, Kevin Wolf wrote:
> Am 08.04.2019 um 11:44 hat Andrey Shinkevich geschrieben:
>>
>>
>> On 06/04/2019 01:50, John Snow wrote:
>>>
>>>
>>> On 4/5/19 10:24 AM, Andrey Shinkevich wrote:
>>>> On a file system used by the customer, fallocate() returns an error
>>>> if the block is not properly aligned. So, bdrv_co_pwrite_zeroes()
>>>> fails. We can handle that case the same way as it is done for the
>>>> unsupported cases, namely, call to bdrv_driver_pwritev() that writes
>>>> zeroes to an image for the unaligned chunk of the block.
>>>>
>>>> Suggested-by: Denis V. Lunev <den@openvz.org>
>>>> Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
>>>> ---
>>>>    block/io.c | 2 +-
>>>>    1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/block/io.c b/block/io.c
>>>> index dfc153b..0412a51 100644
>>>> --- a/block/io.c
>>>> +++ b/block/io.c
>>>> @@ -1516,7 +1516,7 @@ static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
>>>>                assert(!bs->supported_zero_flags);
>>>>            }
>>>>    
>>>> -        if (ret == -ENOTSUP && !(flags & BDRV_REQ_NO_FALLBACK)) {
>>>> +        if (ret < 0 && !(flags & BDRV_REQ_NO_FALLBACK)) {
>>>>                /* Fall back to bounce buffer if write zeroes is unsupported */
>>>>                BdrvRequestFlags write_flags = flags & ~BDRV_REQ_ZERO_WRITE;
>>>>    
>>>>
>>>
>>> I suppose that if fallocate fails for any reason and we're allowing
>>> fallback, we're either going to succeed ... or fail again very soon
>>> thereafter.
>>>
>>> Are there any cases where it is vital to not ignore the first fallocate
>>> failure? I'm a little wary of ignoring the return code from
>>> bdrv_co_pwrite_zeroes, but I am assuming that if there is a "real"
>>> failure here that the following bounce writes will also fail "safely."
>>>
>>> I'm not completely confident, but I have no tangible objections:
>>> Reviewed-by: John Snow <jsnow@redhat.com>
>>>
>>
>> Thank you for your review, John!
>>
>> Let me clarify the circumstances and quote the bug report:
>> "Customer had Win-2012 VM with 50GB system disk which was later resized
>> to 256GB without resizing the partition inside VM.
>> Now, while trying to resize to 50G, the following error will appear
>> 'Failed to reduce the number of L2 tables: Invalid argument'
>> It was found that it is possible to shrink the disk to 128G and any size
>> above that number, but size below 128G will bring the mentioned error."
>>
>> The fallocate() returns no error on that file system if the offset and
>> the (offset + bytes) parameters of the bdrv_co_do_pwrite_zeroes() both
>> are aligned to 4K.
> 
> What is the return value you get from this file system?
> 
> Maybe turning that into ENOTSUP in file-posix would be less invasive.
> Just falling back for any error gives me the vague feeling that it could
> cause problems sooner or later.
> 
> Kevin
> 

The return value for that custom distributed file system is
"Invalid argument", that's
"EINVAL: mode is FALLOC_FL_ZERO_RANGE, but the file referred to
by fd is not a regular file".

When I reproduced the bug, I saw that the alignment in the
bdrv_co_do_pwrite_zeroes() was set to '1' in that case:

MAX(bs->bl.pwrite_zeroes_alignment /=0/),
     bs->bl.request_alignment /=1/);

With my first patch I had not sent before, a new member of the structure
BlockLimits, say pwrite_zeroes_alignment_min, was set to 4K for a raw
file and the alignment was made for the block. Then zeroes were written
to the image for the unaligned head and tail by invoking the
bdrv_driver_pwritev(). That approach has cons also: we would write to
the disk always.

The file system maintainers say that the bug is a particular case and
they may not return the general error such as 'unsupported'.
-- 
Andrey

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [PATCH] block/io.c: fix for the allocation failure
  2019-04-08 11:55       ` Andrey Shinkevich
@ 2019-04-08 11:55         ` Andrey Shinkevich
  0 siblings, 0 replies; 25+ messages in thread
From: Andrey Shinkevich @ 2019-04-08 11:55 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: fam, Vladimir Sementsov-Ogievskiy, Denis Lunev, qemu-block,
	qemu-devel, mreitz, stefanha, John Snow



On 08/04/2019 13:04, Kevin Wolf wrote:
> Am 08.04.2019 um 11:44 hat Andrey Shinkevich geschrieben:
>>
>>
>> On 06/04/2019 01:50, John Snow wrote:
>>>
>>>
>>> On 4/5/19 10:24 AM, Andrey Shinkevich wrote:
>>>> On a file system used by the customer, fallocate() returns an error
>>>> if the block is not properly aligned. So, bdrv_co_pwrite_zeroes()
>>>> fails. We can handle that case the same way as it is done for the
>>>> unsupported cases, namely, call to bdrv_driver_pwritev() that writes
>>>> zeroes to an image for the unaligned chunk of the block.
>>>>
>>>> Suggested-by: Denis V. Lunev <den@openvz.org>
>>>> Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
>>>> ---
>>>>    block/io.c | 2 +-
>>>>    1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/block/io.c b/block/io.c
>>>> index dfc153b..0412a51 100644
>>>> --- a/block/io.c
>>>> +++ b/block/io.c
>>>> @@ -1516,7 +1516,7 @@ static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
>>>>                assert(!bs->supported_zero_flags);
>>>>            }
>>>>    
>>>> -        if (ret == -ENOTSUP && !(flags & BDRV_REQ_NO_FALLBACK)) {
>>>> +        if (ret < 0 && !(flags & BDRV_REQ_NO_FALLBACK)) {
>>>>                /* Fall back to bounce buffer if write zeroes is unsupported */
>>>>                BdrvRequestFlags write_flags = flags & ~BDRV_REQ_ZERO_WRITE;
>>>>    
>>>>
>>>
>>> I suppose that if fallocate fails for any reason and we're allowing
>>> fallback, we're either going to succeed ... or fail again very soon
>>> thereafter.
>>>
>>> Are there any cases where it is vital to not ignore the first fallocate
>>> failure? I'm a little wary of ignoring the return code from
>>> bdrv_co_pwrite_zeroes, but I am assuming that if there is a "real"
>>> failure here that the following bounce writes will also fail "safely."
>>>
>>> I'm not completely confident, but I have no tangible objections:
>>> Reviewed-by: John Snow <jsnow@redhat.com>
>>>
>>
>> Thank you for your review, John!
>>
>> Let me clarify the circumstances and quote the bug report:
>> "Customer had Win-2012 VM with 50GB system disk which was later resized
>> to 256GB without resizing the partition inside VM.
>> Now, while trying to resize to 50G, the following error will appear
>> 'Failed to reduce the number of L2 tables: Invalid argument'
>> It was found that it is possible to shrink the disk to 128G and any size
>> above that number, but size below 128G will bring the mentioned error."
>>
>> The fallocate() returns no error on that file system if the offset and
>> the (offset + bytes) parameters of the bdrv_co_do_pwrite_zeroes() both
>> are aligned to 4K.
> 
> What is the return value you get from this file system?
> 
> Maybe turning that into ENOTSUP in file-posix would be less invasive.
> Just falling back for any error gives me the vague feeling that it could
> cause problems sooner or later.
> 
> Kevin
> 

The return value for that custom distributed file system is
"Invalid argument", that's
"EINVAL: mode is FALLOC_FL_ZERO_RANGE, but the file referred to
by fd is not a regular file".

When I reproduced the bug, I saw that the alignment in the
bdrv_co_do_pwrite_zeroes() was set to '1' in that case:

MAX(bs->bl.pwrite_zeroes_alignment /=0/),
     bs->bl.request_alignment /=1/);

With my first patch I had not sent before, a new member of the structure
BlockLimits, say pwrite_zeroes_alignment_min, was set to 4K for a raw
file and the alignment was made for the block. Then zeroes were written
to the image for the unaligned head and tail by invoking the
bdrv_driver_pwritev(). That approach has cons also: we would write to
the disk always.

The file system maintainers say that the bug is a particular case and
they may not return the general error such as 'unsupported'.
-- 
Andrey

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [PATCH] block/io.c: fix for the allocation failure
  2019-04-08 10:14       ` Kevin Wolf
  2019-04-08 10:14         ` Kevin Wolf
@ 2019-04-10 14:54         ` Stefan Hajnoczi
  2019-04-10 14:54           ` Stefan Hajnoczi
  1 sibling, 1 reply; 25+ messages in thread
From: Stefan Hajnoczi @ 2019-04-10 14:54 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: Andrey Shinkevich, fam, Vladimir Sementsov-Ogievskiy,
	Denis Lunev, qemu-block, qemu-devel, mreitz, stefanha

[-- Attachment #1: Type: text/plain, Size: 4075 bytes --]

On Mon, Apr 08, 2019 at 12:14:49PM +0200, Kevin Wolf wrote:
> Am 08.04.2019 um 12:04 hat Kevin Wolf geschrieben:
> > Am 08.04.2019 um 11:44 hat Andrey Shinkevich geschrieben:
> > > 
> > > 
> > > On 06/04/2019 01:50, John Snow wrote:
> > > > 
> > > > 
> > > > On 4/5/19 10:24 AM, Andrey Shinkevich wrote:
> > > >> On a file system used by the customer, fallocate() returns an error
> > > >> if the block is not properly aligned. So, bdrv_co_pwrite_zeroes()
> > > >> fails. We can handle that case the same way as it is done for the
> > > >> unsupported cases, namely, call to bdrv_driver_pwritev() that writes
> > > >> zeroes to an image for the unaligned chunk of the block.
> > > >>
> > > >> Suggested-by: Denis V. Lunev <den@openvz.org>
> > > >> Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
> > > >> ---
> > > >>   block/io.c | 2 +-
> > > >>   1 file changed, 1 insertion(+), 1 deletion(-)
> > > >>
> > > >> diff --git a/block/io.c b/block/io.c
> > > >> index dfc153b..0412a51 100644
> > > >> --- a/block/io.c
> > > >> +++ b/block/io.c
> > > >> @@ -1516,7 +1516,7 @@ static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
> > > >>               assert(!bs->supported_zero_flags);
> > > >>           }
> > > >>   
> > > >> -        if (ret == -ENOTSUP && !(flags & BDRV_REQ_NO_FALLBACK)) {
> > > >> +        if (ret < 0 && !(flags & BDRV_REQ_NO_FALLBACK)) {
> > > >>               /* Fall back to bounce buffer if write zeroes is unsupported */
> > > >>               BdrvRequestFlags write_flags = flags & ~BDRV_REQ_ZERO_WRITE;
> > > >>   
> > > >>
> > > > 
> > > > I suppose that if fallocate fails for any reason and we're allowing
> > > > fallback, we're either going to succeed ... or fail again very soon
> > > > thereafter.
> > > > 
> > > > Are there any cases where it is vital to not ignore the first fallocate
> > > > failure? I'm a little wary of ignoring the return code from
> > > > bdrv_co_pwrite_zeroes, but I am assuming that if there is a "real"
> > > > failure here that the following bounce writes will also fail "safely."
> > > > 
> > > > I'm not completely confident, but I have no tangible objections:
> > > > Reviewed-by: John Snow <jsnow@redhat.com>
> > > > 
> > > 
> > > Thank you for your review, John!
> > > 
> > > Let me clarify the circumstances and quote the bug report:
> > > "Customer had Win-2012 VM with 50GB system disk which was later resized 
> > > to 256GB without resizing the partition inside VM.
> > > Now, while trying to resize to 50G, the following error will appear
> > > 'Failed to reduce the number of L2 tables: Invalid argument'
> > > It was found that it is possible to shrink the disk to 128G and any size 
> > > above that number, but size below 128G will bring the mentioned error."
> > > 
> > > The fallocate() returns no error on that file system if the offset and
> > > the (offset + bytes) parameters of the bdrv_co_do_pwrite_zeroes() both
> > > are aligned to 4K.
> > 
> > What is the return value you get from this file system?
> > 
> > Maybe turning that into ENOTSUP in file-posix would be less invasive.
> > Just falling back for any error gives me the vague feeling that it could
> > cause problems sooner or later.
> 
> Also, which file system is this? This seems to be a file system bug.
> fallocate() isn't documented to possibly have alignment restrictions for
> FALLOC_FL_ZERO_RANGE (if this is the operation you're talking about).
> FALLOC_FL_PUNCH_HOLE even explicitly mentions the behaviour for partial
> blocks, so there is no doubt that operations for partial blocks are
> considered valid. Operations that may impose restrictions are explicitly
> documented as such.
> 
> So in any case, this would only be a workaround for a buggy file system.
> The real bug needs to be fixed there.

I agree regarding the root cause of the bug, but the fallback behavior
is reasonable IMO.

Andrey: If you update the patch with a more specific errno I'll queue
that patch instead.

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 455 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [PATCH] block/io.c: fix for the allocation failure
  2019-04-10 14:54         ` Stefan Hajnoczi
@ 2019-04-10 14:54           ` Stefan Hajnoczi
  0 siblings, 0 replies; 25+ messages in thread
From: Stefan Hajnoczi @ 2019-04-10 14:54 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: fam, Vladimir Sementsov-Ogievskiy, Denis Lunev, qemu-block,
	qemu-devel, mreitz, stefanha, Andrey Shinkevich

[-- Attachment #1: Type: text/plain, Size: 4075 bytes --]

On Mon, Apr 08, 2019 at 12:14:49PM +0200, Kevin Wolf wrote:
> Am 08.04.2019 um 12:04 hat Kevin Wolf geschrieben:
> > Am 08.04.2019 um 11:44 hat Andrey Shinkevich geschrieben:
> > > 
> > > 
> > > On 06/04/2019 01:50, John Snow wrote:
> > > > 
> > > > 
> > > > On 4/5/19 10:24 AM, Andrey Shinkevich wrote:
> > > >> On a file system used by the customer, fallocate() returns an error
> > > >> if the block is not properly aligned. So, bdrv_co_pwrite_zeroes()
> > > >> fails. We can handle that case the same way as it is done for the
> > > >> unsupported cases, namely, call to bdrv_driver_pwritev() that writes
> > > >> zeroes to an image for the unaligned chunk of the block.
> > > >>
> > > >> Suggested-by: Denis V. Lunev <den@openvz.org>
> > > >> Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
> > > >> ---
> > > >>   block/io.c | 2 +-
> > > >>   1 file changed, 1 insertion(+), 1 deletion(-)
> > > >>
> > > >> diff --git a/block/io.c b/block/io.c
> > > >> index dfc153b..0412a51 100644
> > > >> --- a/block/io.c
> > > >> +++ b/block/io.c
> > > >> @@ -1516,7 +1516,7 @@ static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
> > > >>               assert(!bs->supported_zero_flags);
> > > >>           }
> > > >>   
> > > >> -        if (ret == -ENOTSUP && !(flags & BDRV_REQ_NO_FALLBACK)) {
> > > >> +        if (ret < 0 && !(flags & BDRV_REQ_NO_FALLBACK)) {
> > > >>               /* Fall back to bounce buffer if write zeroes is unsupported */
> > > >>               BdrvRequestFlags write_flags = flags & ~BDRV_REQ_ZERO_WRITE;
> > > >>   
> > > >>
> > > > 
> > > > I suppose that if fallocate fails for any reason and we're allowing
> > > > fallback, we're either going to succeed ... or fail again very soon
> > > > thereafter.
> > > > 
> > > > Are there any cases where it is vital to not ignore the first fallocate
> > > > failure? I'm a little wary of ignoring the return code from
> > > > bdrv_co_pwrite_zeroes, but I am assuming that if there is a "real"
> > > > failure here that the following bounce writes will also fail "safely."
> > > > 
> > > > I'm not completely confident, but I have no tangible objections:
> > > > Reviewed-by: John Snow <jsnow@redhat.com>
> > > > 
> > > 
> > > Thank you for your review, John!
> > > 
> > > Let me clarify the circumstances and quote the bug report:
> > > "Customer had Win-2012 VM with 50GB system disk which was later resized 
> > > to 256GB without resizing the partition inside VM.
> > > Now, while trying to resize to 50G, the following error will appear
> > > 'Failed to reduce the number of L2 tables: Invalid argument'
> > > It was found that it is possible to shrink the disk to 128G and any size 
> > > above that number, but size below 128G will bring the mentioned error."
> > > 
> > > The fallocate() returns no error on that file system if the offset and
> > > the (offset + bytes) parameters of the bdrv_co_do_pwrite_zeroes() both
> > > are aligned to 4K.
> > 
> > What is the return value you get from this file system?
> > 
> > Maybe turning that into ENOTSUP in file-posix would be less invasive.
> > Just falling back for any error gives me the vague feeling that it could
> > cause problems sooner or later.
> 
> Also, which file system is this? This seems to be a file system bug.
> fallocate() isn't documented to possibly have alignment restrictions for
> FALLOC_FL_ZERO_RANGE (if this is the operation you're talking about).
> FALLOC_FL_PUNCH_HOLE even explicitly mentions the behaviour for partial
> blocks, so there is no doubt that operations for partial blocks are
> considered valid. Operations that may impose restrictions are explicitly
> documented as such.
> 
> So in any case, this would only be a workaround for a buggy file system.
> The real bug needs to be fixed there.

I agree regarding the root cause of the bug, but the fallback behavior
is reasonable IMO.

Andrey: If you update the patch with a more specific errno I'll queue
that patch instead.

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 455 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH] block/io.c: fix for the allocation failure
  2019-04-05 14:24 [Qemu-devel] [PATCH] block/io.c: fix for the allocation failure Andrey Shinkevich
                   ` (2 preceding siblings ...)
  2019-04-08  9:00 ` [Qemu-devel] " Stefan Hajnoczi
@ 2019-08-17 14:42 ` Eric Blake
  2019-08-17 14:49   ` Eric Blake
  3 siblings, 1 reply; 25+ messages in thread
From: Eric Blake @ 2019-08-17 14:42 UTC (permalink / raw)
  To: Andrey Shinkevich, qemu-devel, qemu-block
  Cc: fam, kwolf, vsementsov, mreitz, stefanha, den


[-- Attachment #1.1: Type: text/plain, Size: 1646 bytes --]

On 4/5/19 9:24 AM, Andrey Shinkevich wrote:
> On a file system used by the customer, fallocate() returns an error

Which error?

> if the block is not properly aligned. So, bdrv_co_pwrite_zeroes()
> fails. We can handle that case the same way as it is done for the
> unsupported cases, namely, call to bdrv_driver_pwritev() that writes
> zeroes to an image for the unaligned chunk of the block.
> 
> Suggested-by: Denis V. Lunev <den@openvz.org>
> Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
> ---
>  block/io.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/block/io.c b/block/io.c
> index dfc153b..0412a51 100644
> --- a/block/io.c
> +++ b/block/io.c
> @@ -1516,7 +1516,7 @@ static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
>              assert(!bs->supported_zero_flags);
>          }
>  
> -        if (ret == -ENOTSUP && !(flags & BDRV_REQ_NO_FALLBACK)) {
> +        if (ret < 0 && !(flags & BDRV_REQ_NO_FALLBACK)) {

This change is a regression of sorts.  Now, you are unconditionally
attempting the fallback for ALL failures (such as EIO) and for all
drivers, even when that was not previously attempted and increases the
traffic.  I think we should revert this patch and instead fix the
fallocate() path to convert whatever ACTUAL errno you got from unaligned
fallocate failure into ENOTSUP (that is, just the file-posix.c location
that failed), while leaving all other errors as immediately fatal.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH] block/io.c: fix for the allocation failure
  2019-08-17 14:42 ` Eric Blake
@ 2019-08-17 14:49   ` Eric Blake
  2019-08-17 14:56     ` Eric Blake
  0 siblings, 1 reply; 25+ messages in thread
From: Eric Blake @ 2019-08-17 14:49 UTC (permalink / raw)
  To: Andrey Shinkevich, qemu-devel, qemu-block
  Cc: fam, kwolf, vsementsov, mreitz, stefanha, den


[-- Attachment #1.1: Type: text/plain, Size: 1949 bytes --]

On 8/17/19 9:42 AM, Eric Blake wrote:
> On 4/5/19 9:24 AM, Andrey Shinkevich wrote:
>> On a file system used by the customer, fallocate() returns an error
> 
> Which error?

Okay, I read the rest of the thread; EINVAL.  But the commit message was
not amended before becoming commit 118f9944.


>>  
>> -        if (ret == -ENOTSUP && !(flags & BDRV_REQ_NO_FALLBACK)) {
>> +        if (ret < 0 && !(flags & BDRV_REQ_NO_FALLBACK)) {
> 
> This change is a regression of sorts.  Now, you are unconditionally
> attempting the fallback for ALL failures (such as EIO) and for all
> drivers, even when that was not previously attempted and increases the
> traffic.  I think we should revert this patch and instead fix the
> fallocate() path to convert whatever ACTUAL errno you got from unaligned
> fallocate failure into ENOTSUP (that is, just the file-posix.c location
> that failed), while leaving all other errors as immediately fatal.

And the rest of the thread worried about that exact scenario.

Here's how I encountered it. I was trying to debug the nbdkit sh plugin,
with:

$ cat >script  <<\EOF
case $1 in
get_size) echo 1m;;
pread) false;;
can_write|can_zero) ;;
pwrite) ;;
zero) echo ENOTSUP; exit 1 ;;
*) exit 2;;
esac
EOF

(the script has a subtle bug; zero) should be using 'echo ENOTSUP >&2',
but because it didn't, nbdkit treats the failure as EIO rather than the
intended ENOTSUP)

coupled with:

$ qemu-io -f raw nbd://localhost:10810 -c 'w -z 0 1'

but when the script fails with EIO and qemu-io reported that the write
was still successful, I was confused (I was debugging a server-side
fallback to write, not a client-side one), until I discovered that we
changed the semantics in qemu 4.1 that EIO is no longer fatal and
attempts the write fallback.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH] block/io.c: fix for the allocation failure
  2019-08-17 14:49   ` Eric Blake
@ 2019-08-17 14:56     ` Eric Blake
  2019-08-19 19:46       ` Denis V. Lunev
  0 siblings, 1 reply; 25+ messages in thread
From: Eric Blake @ 2019-08-17 14:56 UTC (permalink / raw)
  To: Andrey Shinkevich, qemu-devel, qemu-block
  Cc: fam, kwolf, vsementsov, mreitz, stefanha, den


[-- Attachment #1.1: Type: text/plain, Size: 872 bytes --]

On 8/17/19 9:49 AM, Eric Blake wrote:

>> This change is a regression of sorts.  Now, you are unconditionally
>> attempting the fallback for ALL failures (such as EIO) and for all
>> drivers, even when that was not previously attempted and increases the
>> traffic.  I think we should revert this patch and instead fix the
>> fallocate() path to convert whatever ACTUAL errno you got from unaligned
>> fallocate failure into ENOTSUP (that is, just the file-posix.c location
>> that failed), while leaving all other errors as immediately fatal.

Or even better, fix the call site of fallocate() to skip attempting an
unaligned fallocate(), and just directly return ENOTSUP, rather than
trying to diagnose EINVAL after the fact.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH] block/io.c: fix for the allocation failure
  2019-08-17 14:56     ` Eric Blake
@ 2019-08-19 19:46       ` Denis V. Lunev
  2019-08-19 20:30         ` Eric Blake
  0 siblings, 1 reply; 25+ messages in thread
From: Denis V. Lunev @ 2019-08-19 19:46 UTC (permalink / raw)
  To: Eric Blake, Andrey Shinkevich, qemu-devel, qemu-block
  Cc: fam, kwolf, Vladimir Sementsov-Ogievskiy, stefanha, mreitz

On 8/17/19 5:56 PM, Eric Blake wrote:
> On 8/17/19 9:49 AM, Eric Blake wrote:
>
>>> This change is a regression of sorts.  Now, you are unconditionally
>>> attempting the fallback for ALL failures (such as EIO) and for all
>>> drivers, even when that was not previously attempted and increases the
>>> traffic.  I think we should revert this patch and instead fix the
>>> fallocate() path to convert whatever ACTUAL errno you got from unaligned
>>> fallocate failure into ENOTSUP (that is, just the file-posix.c location
>>> that failed), while leaving all other errors as immediately fatal.
> Or even better, fix the call site of fallocate() to skip attempting an
> unaligned fallocate(), and just directly return ENOTSUP, rather than
> trying to diagnose EINVAL after the fact.
>
No way. Single ENOTSUP will turn off fallocate() support on caller side
while
aligned (99.99% of calls) works normally.

Den

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH] block/io.c: fix for the allocation failure
  2019-08-19 19:46       ` Denis V. Lunev
@ 2019-08-19 20:30         ` Eric Blake
  2019-08-19 20:53           ` Denis V. Lunev
  0 siblings, 1 reply; 25+ messages in thread
From: Eric Blake @ 2019-08-19 20:30 UTC (permalink / raw)
  To: Denis V. Lunev, Andrey Shinkevich, qemu-devel, qemu-block
  Cc: fam, kwolf, Vladimir Sementsov-Ogievskiy, stefanha, mreitz


[-- Attachment #1.1: Type: text/plain, Size: 1342 bytes --]

On 8/19/19 2:46 PM, Denis V. Lunev wrote:
> On 8/17/19 5:56 PM, Eric Blake wrote:
>> On 8/17/19 9:49 AM, Eric Blake wrote:
>>
>>>> This change is a regression of sorts.  Now, you are unconditionally
>>>> attempting the fallback for ALL failures (such as EIO) and for all
>>>> drivers, even when that was not previously attempted and increases the
>>>> traffic.  I think we should revert this patch and instead fix the
>>>> fallocate() path to convert whatever ACTUAL errno you got from unaligned
>>>> fallocate failure into ENOTSUP (that is, just the file-posix.c location
>>>> that failed), while leaving all other errors as immediately fatal.
>> Or even better, fix the call site of fallocate() to skip attempting an
>> unaligned fallocate(), and just directly return ENOTSUP, rather than
>> trying to diagnose EINVAL after the fact.
>>
> No way. Single ENOTSUP will turn off fallocate() support on caller side
> while
> aligned (99.99% of calls) works normally.

I didn't mean skip fallocate() unconditionally, only when unaligned:

if (request not aligned enough)
   return -ENOTSUP;
fallocate() ...

so that the 99.99% requests that ARE aligned get to use fallocate()
normally.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH] block/io.c: fix for the allocation failure
  2019-08-19 20:30         ` Eric Blake
@ 2019-08-19 20:53           ` Denis V. Lunev
  2019-08-19 21:29             ` Eric Blake
  0 siblings, 1 reply; 25+ messages in thread
From: Denis V. Lunev @ 2019-08-19 20:53 UTC (permalink / raw)
  To: Eric Blake, Andrey Shinkevich, qemu-devel, qemu-block
  Cc: fam, kwolf, Vladimir Sementsov-Ogievskiy, stefanha, mreitz

On 8/19/19 11:30 PM, Eric Blake wrote:
> On 8/19/19 2:46 PM, Denis V. Lunev wrote:
>> On 8/17/19 5:56 PM, Eric Blake wrote:
>>> On 8/17/19 9:49 AM, Eric Blake wrote:
>>>
>>>>> This change is a regression of sorts.  Now, you are unconditionally
>>>>> attempting the fallback for ALL failures (such as EIO) and for all
>>>>> drivers, even when that was not previously attempted and increases the
>>>>> traffic.  I think we should revert this patch and instead fix the
>>>>> fallocate() path to convert whatever ACTUAL errno you got from unaligned
>>>>> fallocate failure into ENOTSUP (that is, just the file-posix.c location
>>>>> that failed), while leaving all other errors as immediately fatal.
>>> Or even better, fix the call site of fallocate() to skip attempting an
>>> unaligned fallocate(), and just directly return ENOTSUP, rather than
>>> trying to diagnose EINVAL after the fact.
>>>
>> No way. Single ENOTSUP will turn off fallocate() support on caller side
>> while
>> aligned (99.99% of calls) works normally.
> I didn't mean skip fallocate() unconditionally, only when unaligned:
>
> if (request not aligned enough)
>    return -ENOTSUP;
> fallocate() ...
>
> so that the 99.99% requests that ARE aligned get to use fallocate()
> normally.
>
static int handle_aiocb_write_zeroes(void *opaque)
{
...
#ifdef CONFIG_FALLOCATE_ZERO_RANGE
    if (s->has_write_zeroes) {
        int ret = do_fallocate(s->fd, FALLOC_FL_ZERO_RANGE,
                               aiocb->aio_offset, aiocb->aio_nbytes);
        if (ret == 0 || ret != -ENOTSUP) {
            return ret;
        }
        s->has_write_zeroes = false;
    }
#endif

thus, right now, single ENOTSUP disables fallocate
functionality completely setting s->has_write_zeroes
to false and that is pretty much correct.

ENOTSUP is "static" error code which returns persistent
ENOTSUP under any consequences. Its handling usually
disables some functionality.

This is why original idea is proposed.

Den

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH] block/io.c: fix for the allocation failure
  2019-08-19 20:53           ` Denis V. Lunev
@ 2019-08-19 21:29             ` Eric Blake
  0 siblings, 0 replies; 25+ messages in thread
From: Eric Blake @ 2019-08-19 21:29 UTC (permalink / raw)
  To: Denis V. Lunev, Andrey Shinkevich, qemu-devel, qemu-block
  Cc: fam, kwolf, Vladimir Sementsov-Ogievskiy, stefanha, mreitz


[-- Attachment #1.1: Type: text/plain, Size: 2133 bytes --]

On 8/19/19 3:53 PM, Denis V. Lunev wrote:

>>>> Or even better, fix the call site of fallocate() to skip attempting an
>>>> unaligned fallocate(), and just directly return ENOTSUP, rather than
>>>> trying to diagnose EINVAL after the fact.
>>>>
>>> No way. Single ENOTSUP will turn off fallocate() support on caller side
>>> while
>>> aligned (99.99% of calls) works normally.
>> I didn't mean skip fallocate() unconditionally, only when unaligned:
>>
>> if (request not aligned enough)
>>    return -ENOTSUP;
>> fallocate() ...
>>
>> so that the 99.99% requests that ARE aligned get to use fallocate()
>> normally.
>>
> static int handle_aiocb_write_zeroes(void *opaque)
> {
> ...
> #ifdef CONFIG_FALLOCATE_ZERO_RANGE
>     if (s->has_write_zeroes) {
>         int ret = do_fallocate(s->fd, FALLOC_FL_ZERO_RANGE,
>                                aiocb->aio_offset, aiocb->aio_nbytes);
>         if (ret == 0 || ret != -ENOTSUP) {
>             return ret;
>         }
>         s->has_write_zeroes = false;
>     }
> #endif
> 
> thus, right now, single ENOTSUP disables fallocate
> functionality completely setting s->has_write_zeroes
> to false and that is pretty much correct.
> 
> ENOTSUP is "static" error code which returns persistent
> ENOTSUP under any consequences.

Not always true. And the block layer doesn't expect it to be true. It is
perfectly fine for one invocation to return ENOTSUP ('I can't handle
this request, so fall back to pwrite for me) and the next to just work
('this one was aligned, so I handled it just fine).  It just means that
you have to be more careful with the logic: never set
s->has_write_zeroes=false if you skipped the fallocate, or if the
fallocate failed due to EINVAL rather than ENOTSUP (but still report
ENOTSUP to the block layer, to document that you want the EINVAL for
unaligned request to be turned into a fallback to pwrite).

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2019-08-19 21:30 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-04-05 14:24 [Qemu-devel] [PATCH] block/io.c: fix for the allocation failure Andrey Shinkevich
2019-04-05 14:24 ` Andrey Shinkevich
2019-04-05 22:50 ` [Qemu-devel] [Qemu-block] " John Snow
2019-04-05 22:50   ` John Snow
2019-04-08  9:44   ` Andrey Shinkevich
2019-04-08  9:44     ` Andrey Shinkevich
2019-04-08 10:04     ` Kevin Wolf
2019-04-08 10:04       ` Kevin Wolf
2019-04-08 10:14       ` Kevin Wolf
2019-04-08 10:14         ` Kevin Wolf
2019-04-10 14:54         ` Stefan Hajnoczi
2019-04-10 14:54           ` Stefan Hajnoczi
2019-04-08 11:55       ` Andrey Shinkevich
2019-04-08 11:55         ` Andrey Shinkevich
2019-04-08  9:00 ` [Qemu-devel] " Stefan Hajnoczi
2019-04-08  9:00   ` Stefan Hajnoczi
2019-04-08  9:45   ` Andrey Shinkevich
2019-04-08  9:45     ` Andrey Shinkevich
2019-08-17 14:42 ` Eric Blake
2019-08-17 14:49   ` Eric Blake
2019-08-17 14:56     ` Eric Blake
2019-08-19 19:46       ` Denis V. Lunev
2019-08-19 20:30         ` Eric Blake
2019-08-19 20:53           ` Denis V. Lunev
2019-08-19 21:29             ` Eric Blake

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).