qemu-devel.nongnu.org archive mirror
* [PATCH for-5.1] nbd: Fix large trim/zero requests
@ 2020-07-22 21:22 Eric Blake
  2020-07-23  7:23 ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 4+ messages in thread
From: Eric Blake @ 2020-07-22 21:22 UTC (permalink / raw)
  To: qemu-devel; +Cc: vsementsov, qemu-stable, open list:Network Block Dev...

Although qemu as NBD client limits requests to <2G, the NBD protocol
allows clients to send requests almost all the way up to 4G.  But
because our block layer is not yet 64-bit clean, we accidentally wrap
such requests into a negative size, and fail with EIO instead of
performing the intended operation.

The bug is visible in modern systems with something as simple as:

$ qemu-img create -f qcow2 /tmp/image.img 5G
$ sudo qemu-nbd --connect=/dev/nbd0 /tmp/image.img
$ sudo blkdiscard /dev/nbd0

or with user-space only:

$ truncate --size=3G file
$ qemu-nbd -f raw file
$ nbdsh -u nbd://localhost:10809 -c 'h.trim(3*1024*1024*1024,0)'

Alas, our iotests do not currently make it easy to add external
dependencies on blkdiscard or nbdsh, so we have to rely on manual
testing for now.

This patch can be reverted when we later improve the overall block
layer to be 64-bit clean, but for now, a minimal fix was deemed less
risky prior to release.

CC: qemu-stable@nongnu.org
Fixes: 1f4d6d18ed
Fixes: 1c6c4bb7f0
Fixes: https://github.com/systemd/systemd/issues/16242
Signed-off-by: Eric Blake <eblake@redhat.com>
---
 nbd/server.c | 26 ++++++++++++++++++++++----
 1 file changed, 22 insertions(+), 4 deletions(-)

diff --git a/nbd/server.c b/nbd/server.c
index 4752a6c8bc07..029618017c90 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -2378,8 +2378,17 @@ static coroutine_fn int nbd_handle_request(NBDClient *client,
         if (request->flags & NBD_CMD_FLAG_FAST_ZERO) {
             flags |= BDRV_REQ_NO_FALLBACK;
         }
-        ret = blk_pwrite_zeroes(exp->blk, request->from + exp->dev_offset,
-                                request->len, flags);
+        ret = 0;
+        /* FIXME simplify this when blk_pwrite_zeroes switches to 64-bit */
+        while (ret == 0 && request->len) {
+            int align = client->check_align ?: 1;
+            int len = MIN(request->len, QEMU_ALIGN_DOWN(BDRV_REQUEST_MAX_BYTES,
+                                                        align));
+            ret = blk_pwrite_zeroes(exp->blk, request->from + exp->dev_offset,
+                                    len, flags);
+            request->len -= len;
+            request->from += len;
+        }
         return nbd_send_generic_reply(client, request->handle, ret,
                                       "writing to file failed", errp);

@@ -2393,8 +2402,17 @@ static coroutine_fn int nbd_handle_request(NBDClient *client,
                                       "flush failed", errp);

     case NBD_CMD_TRIM:
-        ret = blk_co_pdiscard(exp->blk, request->from + exp->dev_offset,
-                              request->len);
+        ret = 0;
+        /* FIXME simplify this when blk_co_pdiscard switches to 64-bit */
+        while (ret == 0 && request->len) {
+            int align = client->check_align ?: 1;
+            int len = MIN(request->len, QEMU_ALIGN_DOWN(BDRV_REQUEST_MAX_BYTES,
+                                                        align));
+            ret = blk_co_pdiscard(exp->blk, request->from + exp->dev_offset,
+                                  len);
+            request->len -= len;
+            request->from += len;
+        }
         if (ret == 0 && request->flags & NBD_CMD_FLAG_FUA) {
             ret = blk_co_flush(exp->blk);
         }
-- 
2.27.0



^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH for-5.1] nbd: Fix large trim/zero requests
  2020-07-22 21:22 [PATCH for-5.1] nbd: Fix large trim/zero requests Eric Blake
@ 2020-07-23  7:23 ` Vladimir Sementsov-Ogievskiy
  2020-07-23 11:47   ` Eric Blake
  0 siblings, 1 reply; 4+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-07-23  7:23 UTC (permalink / raw)
  To: Eric Blake, qemu-devel; +Cc: qemu-stable, open list:Network Block Dev...

23.07.2020 00:22, Eric Blake wrote:
> Although qemu as NBD client limits requests to <2G, the NBD protocol
> allows clients to send requests almost all the way up to 4G.  But
> because our block layer is not yet 64-bit clean, we accidentally wrap
> such requests into a negative size, and fail with EIO instead of
> performing the intended operation.
> 
> The bug is visible in modern systems with something as simple as:
> 
> $ qemu-img create -f qcow2 /tmp/image.img 5G
> $ sudo qemu-nbd --connect=/dev/nbd0 /tmp/image.img
> $ sudo blkdiscard /dev/nbd0
> 
> or with user-space only:
> 
> $ truncate --size=3G file
> $ qemu-nbd -f raw file
> $ nbdsh -u nbd://localhost:10809 -c 'h.trim(3*1024*1024*1024,0)'
> 
> Alas, our iotests do not currently make it easy to add external
> dependencies on blkdiscard or nbdsh, so we have to rely on manual
> testing for now.
> 
> This patch can be reverted when we later improve the overall block
> layer to be 64-bit clean, but for now, a minimal fix was deemed less
> risky prior to release.
> 
> CC: qemu-stable@nongnu.org
> Fixes: 1f4d6d18ed
> Fixes: 1c6c4bb7f0
> Fixes: https://github.com/systemd/systemd/issues/16242
> Signed-off-by: Eric Blake <eblake@redhat.com>
> ---
>   nbd/server.c | 26 ++++++++++++++++++++++----
>   1 file changed, 22 insertions(+), 4 deletions(-)
> 
> diff --git a/nbd/server.c b/nbd/server.c
> index 4752a6c8bc07..029618017c90 100644
> --- a/nbd/server.c
> +++ b/nbd/server.c
> @@ -2378,8 +2378,17 @@ static coroutine_fn int nbd_handle_request(NBDClient *client,
>           if (request->flags & NBD_CMD_FLAG_FAST_ZERO) {
>               flags |= BDRV_REQ_NO_FALLBACK;
>           }
> -        ret = blk_pwrite_zeroes(exp->blk, request->from + exp->dev_offset,
> -                                request->len, flags);
> +        ret = 0;
> +        /* FIXME simplify this when blk_pwrite_zeroes switches to 64-bit */
> +        while (ret == 0 && request->len) {
> +            int align = client->check_align ?: 1;
> +            int len = MIN(request->len, QEMU_ALIGN_DOWN(BDRV_REQUEST_MAX_BYTES,
> +                                                        align));
> +            ret = blk_pwrite_zeroes(exp->blk, request->from + exp->dev_offset,
> +                                    len, flags);
> +            request->len -= len;
> +            request->from += len;
> +        }
>           return nbd_send_generic_reply(client, request->handle, ret,
>                                         "writing to file failed", errp);
> 
> @@ -2393,8 +2402,17 @@ static coroutine_fn int nbd_handle_request(NBDClient *client,
>                                         "flush failed", errp);
> 
>       case NBD_CMD_TRIM:
> -        ret = blk_co_pdiscard(exp->blk, request->from + exp->dev_offset,
> -                              request->len);
> +        ret = 0;
> +        /* FIXME simplify this when blk_co_pdiscard switches to 64-bit */
> +        while (ret == 0 && request->len) {

Did you check that none of these paths return a positive ret on success? I'd prefer to compare ret >= 0 for this temporary code, so we don't have to care.

> +            int align = client->check_align ?: 1;
> +            int len = MIN(request->len, QEMU_ALIGN_DOWN(BDRV_REQUEST_MAX_BYTES,
> +                                                        align));
> +            ret = blk_co_pdiscard(exp->blk, request->from + exp->dev_offset,
> +                                  len);
> +            request->len -= len;
> +            request->from += len;

Hmm.. this modifies the function parameter. It is safe for now, as the nbd_handle_request() call is the last use of request in nbd_trip().

> +        }
>           if (ret == 0 && request->flags & NBD_CMD_FLAG_FUA) {
>               ret = blk_co_flush(exp->blk);
>           }
> 


-- 
Best regards,
Vladimir



* Re: [PATCH for-5.1] nbd: Fix large trim/zero requests
  2020-07-23  7:23 ` Vladimir Sementsov-Ogievskiy
@ 2020-07-23 11:47   ` Eric Blake
  2020-07-23 13:08     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 4+ messages in thread
From: Eric Blake @ 2020-07-23 11:47 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-devel
  Cc: qemu-stable, open list:Network Block Dev...

On 7/23/20 2:23 AM, Vladimir Sementsov-Ogievskiy wrote:
> 23.07.2020 00:22, Eric Blake wrote:
>> Although qemu as NBD client limits requests to <2G, the NBD protocol
>> allows clients to send requests almost all the way up to 4G.  But
>> because our block layer is not yet 64-bit clean, we accidentally wrap
>> such requests into a negative size, and fail with EIO instead of
>> performing the intended operation.
>>

>> @@ -2378,8 +2378,17 @@ static coroutine_fn int nbd_handle_request(NBDClient *client,
>>           if (request->flags & NBD_CMD_FLAG_FAST_ZERO) {
>>               flags |= BDRV_REQ_NO_FALLBACK;
>>           }
>> -        ret = blk_pwrite_zeroes(exp->blk, request->from + exp->dev_offset,
>> -                                request->len, flags);
>> +        ret = 0;
>> +        /* FIXME simplify this when blk_pwrite_zeroes switches to 64-bit */
>> +        while (ret == 0 && request->len) {
>> +            int align = client->check_align ?: 1;
>> +            int len = MIN(request->len, QEMU_ALIGN_DOWN(BDRV_REQUEST_MAX_BYTES,
>> +                                                        align));
>> +            ret = blk_pwrite_zeroes(exp->blk, request->from + exp->dev_offset,
>> +                                    len, flags);
>> +            request->len -= len;
>> +            request->from += len;
>> +        }
>>           return nbd_send_generic_reply(client, request->handle, ret,
>>                                         "writing to file failed", errp);

It's easy enough to audit that blk_pwrite_zeroes returns 0 (and not an 
arbitrary positive value) on success.

>>
>> @@ -2393,8 +2402,17 @@ static coroutine_fn int nbd_handle_request(NBDClient *client,
>>                                         "flush failed", errp);
>>
>>       case NBD_CMD_TRIM:
>> -        ret = blk_co_pdiscard(exp->blk, request->from + exp->dev_offset,
>> -                              request->len);
>> +        ret = 0;
>> +        /* FIXME simplify this when blk_co_pdiscard switches to 64-bit */
>> +        while (ret == 0 && request->len) {
> 
> Did you check that none of these paths return a positive ret on success? I'd 
> prefer to compare ret >= 0 for this temporary code, so we don't have to care.

And for blk_co_pdiscard, the audit is already right here in the patch: 
pre-patch, ret = blk_co_pdiscard, followed by...

> 
>> +            int align = client->check_align ?: 1;
>> +            int len = MIN(request->len, QEMU_ALIGN_DOWN(BDRV_REQUEST_MAX_BYTES,
>> +                                                        align));
>> +            ret = blk_co_pdiscard(exp->blk, request->from + exp->dev_offset,
>> +                                  len);
>> +            request->len -= len;
>> +            request->from += len;
> 
> Hmm.. this modifies the function parameter. It is safe for now, as the 
> nbd_handle_request() call is the last use of request in nbd_trip().
> 
>> +        }
>>           if (ret == 0 && request->flags & NBD_CMD_FLAG_FUA) {

...a check for ret == 0.

>>               ret = blk_co_flush(exp->blk);
>>           }
>>
> 
> 

Yes, I can respin the patch to use >= 0 as the check for success in both 
functions, but it doesn't change the behavior.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




* Re: [PATCH for-5.1] nbd: Fix large trim/zero requests
  2020-07-23 11:47   ` Eric Blake
@ 2020-07-23 13:08     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 4+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-07-23 13:08 UTC (permalink / raw)
  To: Eric Blake, qemu-devel; +Cc: qemu-stable, open list:Network Block Dev...

23.07.2020 14:47, Eric Blake wrote:
> On 7/23/20 2:23 AM, Vladimir Sementsov-Ogievskiy wrote:
>> 23.07.2020 00:22, Eric Blake wrote:
>>> Although qemu as NBD client limits requests to <2G, the NBD protocol
>>> allows clients to send requests almost all the way up to 4G.  But
>>> because our block layer is not yet 64-bit clean, we accidentally wrap
>>> such requests into a negative size, and fail with EIO instead of
>>> performing the intended operation.
>>>
> 
>>> @@ -2378,8 +2378,17 @@ static coroutine_fn int nbd_handle_request(NBDClient *client,
>>>           if (request->flags & NBD_CMD_FLAG_FAST_ZERO) {
>>>               flags |= BDRV_REQ_NO_FALLBACK;
>>>           }
>>> -        ret = blk_pwrite_zeroes(exp->blk, request->from + exp->dev_offset,
>>> -                                request->len, flags);
>>> +        ret = 0;
>>> +        /* FIXME simplify this when blk_pwrite_zeroes switches to 64-bit */
>>> +        while (ret == 0 && request->len) {
>>> +            int align = client->check_align ?: 1;
>>> +            int len = MIN(request->len, QEMU_ALIGN_DOWN(BDRV_REQUEST_MAX_BYTES,
>>> +                                                        align));
>>> +            ret = blk_pwrite_zeroes(exp->blk, request->from + exp->dev_offset,
>>> +                                    len, flags);
>>> +            request->len -= len;
>>> +            request->from += len;
>>> +        }
>>>           return nbd_send_generic_reply(client, request->handle, ret,
>>>                                         "writing to file failed", errp);
> 
> It's easy enough to audit that blk_pwrite_zeroes returns 0 (and not arbitrary positive) on success.

Yes, that's why I posted my comment on blk_co_pdiscard :)

> 
>>>
>>> @@ -2393,8 +2402,17 @@ static coroutine_fn int nbd_handle_request(NBDClient *client,
>>>                                         "flush failed", errp);
>>>
>>>       case NBD_CMD_TRIM:
>>> -        ret = blk_co_pdiscard(exp->blk, request->from + exp->dev_offset,
>>> -                              request->len);
>>> +        ret = 0;
>>> +        /* FIXME simplify this when blk_co_pdiscard switches to 64-bit */
>>> +        while (ret == 0 && request->len) {
>>
>> Did you check that none of these paths return a positive ret on success? I'd prefer to compare ret >= 0 for this temporary code, so we don't have to care.
> 
> And for blk_co_pdiscard, the audit is already right here in the patch: pre-patch, ret = blk_co_pdiscard, followed by...
> 
>>
>>> +            int align = client->check_align ?: 1;
>>> +            int len = MIN(request->len, QEMU_ALIGN_DOWN(BDRV_REQUEST_MAX_BYTES,
>>> +                                                        align));
>>> +            ret = blk_co_pdiscard(exp->blk, request->from + exp->dev_offset,
>>> +                                  len);
>>> +            request->len -= len;
>>> +            request->from += len;
>>
>> Hmm.. this modifies the function parameter. It is safe for now, as the nbd_handle_request() call is the last use of request in nbd_trip().
>>
>>> +        }
>>>           if (ret == 0 && request->flags & NBD_CMD_FLAG_FUA) {
> 
> ...a check for ret == 0.

Hmm. Is it a bug or not? Anyway, bdrv_co_pdiscard() does "if (ret && ..." as
well, so if some driver returned a positive ret, it would already fail earlier.

> 
>>>               ret = blk_co_flush(exp->blk);
>>>           }
>>>
>>
>>
> 
> Yes, I can respin the patch to use >= 0 as the check for success in both functions, but it doesn't change the behavior.
> 

OK. Anyway:
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

-- 
Best regards,
Vladimir



end of thread, other threads:[~2020-07-23 13:11 UTC | newest]

Thread overview: 4+ messages
-- links below jump to the message on this page --
2020-07-22 21:22 [PATCH for-5.1] nbd: Fix large trim/zero requests Eric Blake
2020-07-23  7:23 ` Vladimir Sementsov-Ogievskiy
2020-07-23 11:47   ` Eric Blake
2020-07-23 13:08     ` Vladimir Sementsov-Ogievskiy
