All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH 0/2] Optimize sparse reads over NBD
@ 2017-11-07  3:09 Eric Blake
  2017-11-07  3:09 ` [Qemu-devel] [PATCH 1/2] nbd/server: Implement sparse reads atop structured reply Eric Blake
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Eric Blake @ 2017-11-07  3:09 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, pbonzini, vsementsov

When I first proposed the NBD extension of structured reads,
it was in order to more efficiently read sparse files without
sending lots of zeroes over the wire.  These two patches feel
like a feature addition, and missed soft freeze, so I'm
reluctant to include them in a 2.11 pull request; on the other
hand, implementing structured replies without sparse reads is
a rather incomplete feature addition even if it complies with
the NBD spec.  Since structured replies is a new 2.11 feature,
I could argue that it is a bug if the new feature does not go
all the way to sparse reads.

I've posted this as two patches, but welcome opinions on whether
it should be squashed into one.

Based-on: 20171107030236.23633-1-eblake@redhat.com
([PATCH 0/8] various NBD fixes for 2.11)

Eric Blake (2):
  nbd/server: Implement sparse reads atop structured reply
  nbd/server: Optimize final chunk of sparse read

 nbd/server.c     | 79 +++++++++++++++++++++++++++++++++++++++++++++++++++++---
 nbd/trace-events |  1 +
 2 files changed, 77 insertions(+), 3 deletions(-)

-- 
2.13.6

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Qemu-devel] [PATCH 1/2] nbd/server: Implement sparse reads atop structured reply
  2017-11-07  3:09 [Qemu-devel] [PATCH 0/2] Optimize sparse reads over NBD Eric Blake
@ 2017-11-07  3:09 ` Eric Blake
  2017-11-09  8:43   ` Vladimir Sementsov-Ogievskiy
  2017-11-07  3:09 ` [Qemu-devel] [PATCH 2/2] nbd/server: Optimize final chunk of sparse read Eric Blake
  2017-11-07 11:03 ` [Qemu-devel] [PATCH 0/2] Optimize sparse reads over NBD Paolo Bonzini
  2 siblings, 1 reply; 6+ messages in thread
From: Eric Blake @ 2017-11-07  3:09 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, pbonzini, vsementsov

The reason that NBD added structured reply in the first place was
to allow for efficient reads of sparse files, by allowing the
reply to include chunks to quickly communicate holes to the client
without sending lots of zeroes over the wire.  Time to implement
this in the server; our client can already read such data.

We can only skip holes insofar as the block layer can query them;
and only if the client is okay with a fragmented request (if a
client requests NBD_CMD_FLAG_DF and the entire read is a hole, we
could technically return a single NBD_REPLY_TYPE_OFFSET_HOLE, but
that's a fringe case not worth catering to here).  Sadly, the
control flow is a bit wonkier than I would have preferred, but
it was minimally invasive to have a split in the action between
a fragmented read (handled directly where we recognize
NBD_CMD_READ with the right conditions, and sending multiple
chunks) vs. a single read (handled at the end of nbd_trip, for
both simple and structured replies, when we know there is only
one thing being read).  Likewise, I didn't make any effort to
optimize the final chunk of a fragmented read to set the
NBD_REPLY_FLAG_DONE, but unconditionally send that as a separate
NBD_REPLY_TYPE_NONE.

Signed-off-by: Eric Blake <eblake@redhat.com>
---
 nbd/server.c     | 78 +++++++++++++++++++++++++++++++++++++++++++++++++++++---
 nbd/trace-events |  1 +
 2 files changed, 76 insertions(+), 3 deletions(-)

diff --git a/nbd/server.c b/nbd/server.c
index df771fd42f..3c5e3005a6 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -1293,6 +1293,7 @@ static int coroutine_fn nbd_co_send_structured_read(NBDClient *client,
                                                     uint64_t offset,
                                                     void *data,
                                                     size_t size,
+                                                    bool final,
                                                     Error **errp)
 {
     NBDStructuredReadData chunk;
@@ -1303,13 +1304,73 @@ static int coroutine_fn nbd_co_send_structured_read(NBDClient *client,

     assert(size);
     trace_nbd_co_send_structured_read(handle, offset, data, size);
-    set_be_chunk(&chunk.h, NBD_REPLY_FLAG_DONE, NBD_REPLY_TYPE_OFFSET_DATA,
-                 handle, sizeof(chunk) - sizeof(chunk.h) + size);
+    set_be_chunk(&chunk.h, final ? NBD_REPLY_FLAG_DONE : 0,
+                 NBD_REPLY_TYPE_OFFSET_DATA, handle,
+                 sizeof(chunk) - sizeof(chunk.h) + size);
     stq_be_p(&chunk.offset, offset);

     return nbd_co_send_iov(client, iov, 2, errp);
 }

+static int coroutine_fn nbd_co_send_sparse_read(NBDClient *client,
+                                                uint64_t handle,
+                                                uint64_t offset,
+                                                uint8_t *data,
+                                                size_t size,
+                                                Error **errp)
+{
+    int ret = 0;
+    NBDExport *exp = client->exp;
+    size_t progress = 0;
+
+    while (progress < size) {
+        int64_t pnum;
+        int status = bdrv_block_status_above(blk_bs(exp->blk), NULL,
+                                             offset + progress,
+                                             size - progress, &pnum, NULL,
+                                             NULL);
+
+        if (status < 0) {
+            error_setg_errno(errp, -status, "unable to check for holes");
+            return status;
+        }
+        assert(pnum && pnum <= size - progress);
+        if (status & BDRV_BLOCK_ZERO) {
+            NBDStructuredReadHole chunk;
+            struct iovec iov[] = {
+                {.iov_base = &chunk, .iov_len = sizeof(chunk)},
+            };
+
+            trace_nbd_co_send_structured_read_hole(handle, offset + progress,
+                                                   pnum);
+            set_be_chunk(&chunk.h, 0, NBD_REPLY_TYPE_OFFSET_HOLE,
+                         handle, sizeof(chunk) - sizeof(chunk.h));
+            stq_be_p(&chunk.offset, offset + progress);
+            stl_be_p(&chunk.length, pnum);
+            ret = nbd_co_send_iov(client, iov, 1, errp);
+        } else {
+            ret = blk_pread(exp->blk, offset + progress + exp->dev_offset,
+                            data + progress, pnum);
+            if (ret < 0) {
+                error_setg_errno(errp, -ret, "reading from file failed");
+                break;
+            }
+            ret = nbd_co_send_structured_read(client, handle, offset + progress,
+                                              data + progress, pnum, false,
+                                              errp);
+        }
+
+        if (ret < 0) {
+            break;
+        }
+        progress += pnum;
+    }
+    if (!ret) {
+        ret = nbd_co_send_structured_done(client, handle, errp);
+    }
+    return ret;
+}
+
 static int coroutine_fn nbd_co_send_structured_error(NBDClient *client,
                                                      uint64_t handle,
                                                      uint32_t error,
@@ -1471,6 +1532,16 @@ static coroutine_fn void nbd_trip(void *opaque)
             }
         }

+        if (client->structured_reply && !(request.flags & NBD_CMD_FLAG_DF)) {
+            ret = nbd_co_send_sparse_read(req->client, request.handle,
+                                          request.from, req->data, request.len,
+                                          &local_err);
+            if (ret < 0) {
+                goto reply;
+            }
+            goto done;
+        }
+
         ret = blk_pread(exp->blk, request.from + exp->dev_offset,
                         req->data, request.len);
         if (ret < 0) {
@@ -1563,7 +1634,8 @@ reply:
         } else if (reply_data_len) {
             ret = nbd_co_send_structured_read(req->client, request.handle,
                                               request.from, req->data,
-                                              reply_data_len, &local_err);
+                                              reply_data_len, true,
+                                              &local_err);
         } else {
             ret = nbd_co_send_structured_done(req->client, request.handle,
                                               &local_err);
diff --git a/nbd/trace-events b/nbd/trace-events
index 92568edce5..2b8268ce8c 100644
--- a/nbd/trace-events
+++ b/nbd/trace-events
@@ -57,6 +57,7 @@ nbd_blk_aio_detach(const char *name, void *ctx) "Export %s: Detaching clients fr
 nbd_co_send_simple_reply(uint64_t handle, uint32_t error, const char *errname, int len) "Send simple reply: handle = %" PRIu64 ", error = %" PRIu32 " (%s), len = %d"
 nbd_co_send_structured_done(uint64_t handle) "Send structured reply done: handle = %" PRIu64
 nbd_co_send_structured_read(uint64_t handle, uint64_t offset, void *data, size_t size) "Send structured read data reply: handle = %" PRIu64 ", offset = %" PRIu64 ", data = %p, len = %zu"
+nbd_co_send_structured_read_hole(uint64_t handle, uint64_t offset, size_t size) "Send structured read hole reply: handle = %" PRIu64 ", offset = %" PRIu64 ", len = %zu"
 nbd_co_send_structured_error(uint64_t handle, int err, const char *errname, const char *msg) "Send structured error reply: handle = %" PRIu64 ", error = %d (%s), msg = '%s'"
 nbd_co_receive_request_decode_type(uint64_t handle, uint16_t type, const char *name) "Decoding type: handle = %" PRIu64 ", type = %" PRIu16 " (%s)"
 nbd_co_receive_request_payload_received(uint64_t handle, uint32_t len) "Payload received: handle = %" PRIu64 ", len = %" PRIu32
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [Qemu-devel] [PATCH 2/2] nbd/server: Optimize final chunk of sparse read
  2017-11-07  3:09 [Qemu-devel] [PATCH 0/2] Optimize sparse reads over NBD Eric Blake
  2017-11-07  3:09 ` [Qemu-devel] [PATCH 1/2] nbd/server: Implement sparse reads atop structured reply Eric Blake
@ 2017-11-07  3:09 ` Eric Blake
  2017-11-09  8:49   ` Vladimir Sementsov-Ogievskiy
  2017-11-07 11:03 ` [Qemu-devel] [PATCH 0/2] Optimize sparse reads over NBD Paolo Bonzini
  2 siblings, 1 reply; 6+ messages in thread
From: Eric Blake @ 2017-11-07  3:09 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, pbonzini, vsementsov

If we are careful to handle 0-length read requests correctly,
we can optimize our sparse read to send the NBD_REPLY_FLAG_DONE
bit on our last OFFSET_DATA or OFFSET_HOLE chunk rather than
needing a separate chunk.

Signed-off-by: Eric Blake <eblake@redhat.com>
---
 nbd/server.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/nbd/server.c b/nbd/server.c
index 3c5e3005a6..cde61d2461 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -1329,12 +1329,14 @@ static int coroutine_fn nbd_co_send_sparse_read(NBDClient *client,
                                              offset + progress,
                                              size - progress, &pnum, NULL,
                                              NULL);
+        bool final;

         if (status < 0) {
             error_setg_errno(errp, -status, "unable to check for holes");
             return status;
         }
         assert(pnum && pnum <= size - progress);
+        final = progress + pnum == size;
         if (status & BDRV_BLOCK_ZERO) {
             NBDStructuredReadHole chunk;
             struct iovec iov[] = {
@@ -1343,7 +1345,8 @@ static int coroutine_fn nbd_co_send_sparse_read(NBDClient *client,

             trace_nbd_co_send_structured_read_hole(handle, offset + progress,
                                                    pnum);
-            set_be_chunk(&chunk.h, 0, NBD_REPLY_TYPE_OFFSET_HOLE,
+            set_be_chunk(&chunk.h, final ? NBD_REPLY_FLAG_DONE : 0,
+                         NBD_REPLY_TYPE_OFFSET_HOLE,
                          handle, sizeof(chunk) - sizeof(chunk.h));
             stq_be_p(&chunk.offset, offset + progress);
             stl_be_p(&chunk.length, pnum);
@@ -1356,7 +1359,7 @@ static int coroutine_fn nbd_co_send_sparse_read(NBDClient *client,
                 break;
             }
             ret = nbd_co_send_structured_read(client, handle, offset + progress,
-                                              data + progress, pnum, false,
+                                              data + progress, pnum, final,
                                               errp);
         }

@@ -1365,9 +1368,6 @@ static int coroutine_fn nbd_co_send_sparse_read(NBDClient *client,
         }
         progress += pnum;
     }
-    if (!ret) {
-        ret = nbd_co_send_structured_done(client, handle, errp);
-    }
     return ret;
 }

@@ -1532,7 +1532,8 @@ static coroutine_fn void nbd_trip(void *opaque)
             }
         }

-        if (client->structured_reply && !(request.flags & NBD_CMD_FLAG_DF)) {
+        if (client->structured_reply && !(request.flags & NBD_CMD_FLAG_DF) &&
+            request.len) {
             ret = nbd_co_send_sparse_read(req->client, request.handle,
                                           request.from, req->data, request.len,
                                           &local_err);
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [Qemu-devel] [PATCH 0/2] Optimize sparse reads over NBD
  2017-11-07  3:09 [Qemu-devel] [PATCH 0/2] Optimize sparse reads over NBD Eric Blake
  2017-11-07  3:09 ` [Qemu-devel] [PATCH 1/2] nbd/server: Implement sparse reads atop structured reply Eric Blake
  2017-11-07  3:09 ` [Qemu-devel] [PATCH 2/2] nbd/server: Optimize final chunk of sparse read Eric Blake
@ 2017-11-07 11:03 ` Paolo Bonzini
  2 siblings, 0 replies; 6+ messages in thread
From: Paolo Bonzini @ 2017-11-07 11:03 UTC (permalink / raw)
  To: Eric Blake, qemu-devel; +Cc: qemu-block, vsementsov

On 07/11/2017 04:09, Eric Blake wrote:
> When I first proposed the NBD extension of structured reads,
> it was in order to more efficiently read sparse files without
> sending lots of zeroes over the wire.  These two patches feel
> like a feature addition, and missed soft freeze, so I'm
> reluctant to include them in a 2.11 pull request; on the other
> hand, implementing structured replies without sparse reads is
> a rather incomplete feature addition even if it complies with
> the NBD spec.  Since structured replies is a new 2.11 feature,
> I could argue that it is a bug if the new feature does not go
> all the way to sparse reads.
> 
> I've posted this as two patches, but welcome opinions on whether
> it should be squashed into one.

I think I agree with pushing this to the next release.

Paolo

> Based-on: 20171107030236.23633-1-eblake@redhat.com
> ([PATCH 0/8] various NBD fixes for 2.11)
> 
> Eric Blake (2):
>   nbd/server: Implement sparse reads atop structured reply
>   nbd/server: Optimize final chunk of sparse read
> 
>  nbd/server.c     | 79 +++++++++++++++++++++++++++++++++++++++++++++++++++++---
>  nbd/trace-events |  1 +
>  2 files changed, 77 insertions(+), 3 deletions(-)
> 

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Qemu-devel] [PATCH 1/2] nbd/server: Implement sparse reads atop structured reply
  2017-11-07  3:09 ` [Qemu-devel] [PATCH 1/2] nbd/server: Implement sparse reads atop structured reply Eric Blake
@ 2017-11-09  8:43   ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 6+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2017-11-09  8:43 UTC (permalink / raw)
  To: Eric Blake, qemu-devel; +Cc: qemu-block, pbonzini

07.11.2017 06:09, Eric Blake wrote:
> The reason that NBD added structured reply in the first place was
> to allow for efficient reads of sparse files, by allowing the
> reply to include chunks to quickly communicate holes to the client
> without sending lots of zeroes over the wire.  Time to implement
> this in the server; our client can already read such data.
>
> We can only skip holes insofar as the block layer can query them;
> and only if the client is okay with a fragmented request (if a
> client requests NBD_CMD_FLAG_DF and the entire read is a hole, we
> could technically return a single NBD_REPLY_TYPE_OFFSET_HOLE, but
> that's a fringe case not worth catering to here).  Sadly, the
> control flow is a bit wonkier than I would have preferred, but
> it was minimally invasive to have a split in the action between
> a fragmented read (handled directly where we recognize
> NBD_CMD_READ with the right conditions, and sending multiple
> chunks) vs. a single read (handled at the end of nbd_trip, for
> both simple and structured replies, when we know there is only
> one thing being read).  Likewise, I didn't make any effort to
> optimize the final chunk of a fragmented read to set the
> NBD_REPLY_FLAG_DONE, but unconditionally send that as a separate
> NBD_REPLY_TYPE_NONE.
>
> Signed-off-by: Eric Blake <eblake@redhat.com>


Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>


-- 
Best regards,
Vladimir

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Qemu-devel] [PATCH 2/2] nbd/server: Optimize final chunk of sparse read
  2017-11-07  3:09 ` [Qemu-devel] [PATCH 2/2] nbd/server: Optimize final chunk of sparse read Eric Blake
@ 2017-11-09  8:49   ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 6+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2017-11-09  8:49 UTC (permalink / raw)
  To: Eric Blake, qemu-devel; +Cc: qemu-block, pbonzini

07.11.2017 06:09, Eric Blake wrote:
> If we are careful to handle 0-length read requests correctly,
> we can optimize our sparse read to send the NBD_REPLY_FLAG_DONE
> bit on our last OFFSET_DATA or OFFSET_HOLE chunk rather than
> needing a separate chunk.
>
> Signed-off-by: Eric Blake <eblake@redhat.com>

Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>



-- 
Best regards,
Vladimir

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2017-11-09  8:49 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-11-07  3:09 [Qemu-devel] [PATCH 0/2] Optimize sparse reads over NBD Eric Blake
2017-11-07  3:09 ` [Qemu-devel] [PATCH 1/2] nbd/server: Implement sparse reads atop structured reply Eric Blake
2017-11-09  8:43   ` Vladimir Sementsov-Ogievskiy
2017-11-07  3:09 ` [Qemu-devel] [PATCH 2/2] nbd/server: Optimize final chunk of sparse read Eric Blake
2017-11-09  8:49   ` Vladimir Sementsov-Ogievskiy
2017-11-07 11:03 ` [Qemu-devel] [PATCH 0/2] Optimize sparse reads over NBD Paolo Bonzini

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.