All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v4 00/24] qemu patches for 64-bit NBD extensions
@ 2023-06-08 13:56 Eric Blake
  2023-06-08 13:56 ` [PATCH v4 01/24] nbd/client: Use smarter assert Eric Blake
                   ` (23 more replies)
  0 siblings, 24 replies; 62+ messages in thread
From: Eric Blake @ 2023-06-08 13:56 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, libguestfs, vsementsov

v3 was here:
https://lists.gnu.org/archive/html/qemu-devel/2023-05/msg03607.html

Since then, I've incorporated lots of feedback from Vladimir:
 - split several patches into smaller pieces
 - use an enum to track various negotiation modes
 - reorder a few patches
 - several new patches (2, 5, 6)
 - other cleanups

001/24:[----] [--] 'nbd/client: Use smarter assert'
002/24:[down] 'nbd: Consistent typedef usage in header'
003/24:[0137] [FC] 'nbd/server: Prepare for alternate-size headers'
004/24:[0093] [FC] 'nbd/server: Refactor to pass full request around'
005/24:[down] 'nbd: s/handle/cookie/ to match NBD spec'
006/24:[down] 'nbd/client: Simplify cookie vs. index computation'
007/24:[----] [-C] 'nbd/client: Add safety check on chunk payload length'
008/24:[down] 'nbd: Use enum for various negotiation modes'
009/24:[down] 'nbd: Replace bool structured_reply with mode enum'
010/24:[down] 'nbd/client: Pass mode through to nbd_send_request'
011/24:[0096] [FC] 'nbd: Add types for extended headers'
012/24:[0118] [FC] 'nbd: Prepare for 64-bit request effect lengths'
013/24:[0071] [FC] 'nbd/server: Refactor handling of request payload'
014/24:[down] 'nbd/server: Prepare to receive extended header requests'
015/24:[down] 'nbd/server: Prepare to send extended header replies'
016/24:[0132] [FC] 'nbd/server: Support 64-bit block status'
017/24:[down] 'nbd/server: Enable initial support for extended headers'
018/24:[down] 'nbd/client: Plumb errp through nbd_receive_replies'
019/24:[0066] [FC] 'nbd/client: Initial support for extended headers'
020/24:[0032] [FC] 'nbd/client: Accept 64-bit block status chunks'
021/24:[0058] [FC] 'nbd/client: Request extended headers during negotiation'
022/24:[down] 'nbd/server: Refactor list of negotiated meta contexts'
023/24:[0132] [FC] 'nbd/server: Prepare for per-request filtering of BLOCK_STATUS'
024/24:[0109] [FC] 'nbd/server: Add FLAG_PAYLOAD support to CMD_BLOCK_STATUS'

Eric Blake (24):
  nbd/client: Use smarter assert
  nbd: Consistent typedef usage in header
  nbd/server: Prepare for alternate-size headers
  nbd/server: Refactor to pass full request around
  nbd: s/handle/cookie/ to match NBD spec
  nbd/client: Simplify cookie vs. index computation
  nbd/client: Add safety check on chunk payload length
  nbd: Use enum for various negotiation modes
  nbd: Replace bool structured_reply with mode enum
  nbd/client: Pass mode through to nbd_send_request
  nbd: Add types for extended headers
  nbd: Prepare for 64-bit request effect lengths
  nbd/server: Refactor handling of request payload
  nbd/server: Prepare to receive extended header requests
  nbd/server: Prepare to send extended header replies
  nbd/server: Support 64-bit block status
  nbd/server: Enable initial support for extended headers
  nbd/client: Plumb errp through nbd_receive_replies
  nbd/client: Initial support for extended headers
  nbd/client: Accept 64-bit block status chunks
  nbd/client: Request extended headers during negotiation
  nbd/server: Refactor list of negotiated meta contexts
  nbd/server: Prepare for per-request filtering of BLOCK_STATUS
  nbd/server: Add FLAG_PAYLOAD support to CMD_BLOCK_STATUS

 docs/interop/nbd.txt                          |   1 +
 include/block/nbd.h                           | 201 ++++--
 nbd/nbd-internal.h                            |   8 +-
 block/nbd.c                                   | 189 +++--
 nbd/client-connection.c                       |   4 +-
 nbd/client.c                                  | 199 ++++--
 nbd/common.c                                  |  29 +-
 nbd/server.c                                  | 666 ++++++++++++------
 qemu-nbd.c                                    |   8 +-
 block/trace-events                            |   1 +
 nbd/trace-events                              |  29 +-
 tests/qemu-iotests/223.out                    |  18 +-
 tests/qemu-iotests/233.out                    |   4 +
 tests/qemu-iotests/241.out                    |   3 +
 tests/qemu-iotests/307.out                    |  15 +-
 .../tests/nbd-qemu-allocation.out             |   3 +-
 16 files changed, 937 insertions(+), 441 deletions(-)


base-commit: 4f65e89f8cf0e079b4ec3ddfede314bbb4e35c76
-- 
2.40.1



^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH v4 01/24] nbd/client: Use smarter assert
  2023-06-08 13:56 [PATCH v4 00/24] qemu patches for 64-bit NBD extensions Eric Blake
@ 2023-06-08 13:56 ` Eric Blake
  2023-06-08 13:56 ` [PATCH v4 02/24] nbd: Consistent typedef usage in header Eric Blake
                   ` (22 subsequent siblings)
  23 siblings, 0 replies; 62+ messages in thread
From: Eric Blake @ 2023-06-08 13:56 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, libguestfs, vsementsov, Dr . David Alan Gilbert

Assigning strlen() to a uint32_t and then asserting that it isn't too
large doesn't catch the case of an input string 4G in length.
Thankfully, the incoming strings can never be that large: if the
export name or query is reflecting a string the client got from the
server, we already guarantee that we dropped the NBD connection if the
server sent more than 32M in a single reply to our NBD_OPT_* request;
if the export name is coming from qemu, nbd_receive_negotiate()
asserted that strlen(info->name) <= NBD_MAX_STRING_SIZE; and
similarly, a query string via x->dirty_bitmap coming from the user was
bounds-checked in either qemu-nbd or by the limitations of QMP.
Still, it doesn't hurt to be more explicit in how we write our
assertions to not have to analyze whether inadvertent wraparound is
possible.

Fixes: 93676c88 ("nbd: Don't send oversize strings", v4.2.0)
Reported-by: Dr. David Alan Gilbert <dave@treblig.org>
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
---
 nbd/client.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/nbd/client.c b/nbd/client.c
index 30d5383cb19..ff75722e487 100644
--- a/nbd/client.c
+++ b/nbd/client.c
@@ -650,19 +650,20 @@ static int nbd_send_meta_query(QIOChannel *ioc, uint32_t opt,
                                Error **errp)
 {
     int ret;
-    uint32_t export_len = strlen(export);
+    uint32_t export_len;
     uint32_t queries = !!query;
     uint32_t query_len = 0;
     uint32_t data_len;
     char *data;
     char *p;

+    assert(strnlen(export, NBD_MAX_STRING_SIZE + 1) <= NBD_MAX_STRING_SIZE);
+    export_len = strlen(export);
     data_len = sizeof(export_len) + export_len + sizeof(queries);
-    assert(export_len <= NBD_MAX_STRING_SIZE);
     if (query) {
+        assert(strnlen(query, NBD_MAX_STRING_SIZE + 1) <= NBD_MAX_STRING_SIZE);
         query_len = strlen(query);
         data_len += sizeof(query_len) + query_len;
-        assert(query_len <= NBD_MAX_STRING_SIZE);
     } else {
         assert(opt == NBD_OPT_LIST_META_CONTEXT);
     }
-- 
2.40.1



^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v4 02/24] nbd: Consistent typedef usage in header
  2023-06-08 13:56 [PATCH v4 00/24] qemu patches for 64-bit NBD extensions Eric Blake
  2023-06-08 13:56 ` [PATCH v4 01/24] nbd/client: Use smarter assert Eric Blake
@ 2023-06-08 13:56 ` Eric Blake
  2023-06-08 14:17   ` [Libguestfs] " Eric Blake
  2023-06-08 13:56 ` [PATCH v4 03/24] nbd/server: Prepare for alternate-size headers Eric Blake
                   ` (21 subsequent siblings)
  23 siblings, 1 reply; 62+ messages in thread
From: Eric Blake @ 2023-06-08 13:56 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, libguestfs, vsementsov, Kevin Wolf, Hanna Reitz

We had a mix of struct declarataions followed by typedefs, and direct
struct definitions as part of a typedef.  Pick a single style.  Also
float a couple of opaque typedefs earlier in the file, as a later
patch wants to refer NBDExport* in NBDRequest.  No semantic impact.

Signed-off-by: Eric Blake <eblake@redhat.com>
---

v4: new patch
---
 include/block/nbd.h | 28 ++++++++++++----------------
 1 file changed, 12 insertions(+), 16 deletions(-)

diff --git a/include/block/nbd.h b/include/block/nbd.h
index a4c98169c39..9c3ceae5ba5 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -1,5 +1,5 @@
 /*
- *  Copyright (C) 2016-2022 Red Hat, Inc.
+ *  Copyright Red Hat
  *  Copyright (C) 2005  Anthony Liguori <anthony@codemonkey.ws>
  *
  *  Network Block Device
@@ -26,24 +26,25 @@
 #include "qapi/error.h"
 #include "qemu/bswap.h"

+typedef struct NBDExport NBDExport;
+typedef struct NBDClient NBDClient;
+
 extern const BlockExportDriver blk_exp_nbd;

 /* Handshake phase structs - this struct is passed on the wire */

-struct NBDOption {
+typedef struct NBDOption {
     uint64_t magic; /* NBD_OPTS_MAGIC */
     uint32_t option; /* NBD_OPT_* */
     uint32_t length;
-} QEMU_PACKED;
-typedef struct NBDOption NBDOption;
+} QEMU_PACKED NBDOption;

-struct NBDOptionReply {
+typedef struct NBDOptionReply {
     uint64_t magic; /* NBD_REP_MAGIC */
     uint32_t option; /* NBD_OPT_* */
     uint32_t type; /* NBD_REP_* */
     uint32_t length;
-} QEMU_PACKED;
-typedef struct NBDOptionReply NBDOptionReply;
+} QEMU_PACKED NBDOptionReply;

 typedef struct NBDOptionReplyMetaContext {
     NBDOptionReply h; /* h.type = NBD_REP_META_CONTEXT, h.length > 4 */
@@ -56,14 +57,13 @@ typedef struct NBDOptionReplyMetaContext {
  * Note: these are _NOT_ the same as the network representation of an NBD
  * request and reply!
  */
-struct NBDRequest {
+typedef struct NBDRequest {
     uint64_t handle;
     uint64_t from;
     uint32_t len;
     uint16_t flags; /* NBD_CMD_FLAG_* */
     uint16_t type; /* NBD_CMD_* */
-};
-typedef struct NBDRequest NBDRequest;
+} NBDRequest;

 typedef struct NBDSimpleReply {
     uint32_t magic;  /* NBD_SIMPLE_REPLY_MAGIC */
@@ -282,7 +282,7 @@ static inline bool nbd_reply_type_is_error(int type)
 #define NBD_ESHUTDOWN  108

 /* Details collected by NBD_OPT_EXPORT_NAME and NBD_OPT_GO */
-struct NBDExportInfo {
+typedef struct NBDExportInfo {
     /* Set by client before nbd_receive_negotiate() */
     bool request_sizes;
     char *x_dirty_bitmap;
@@ -310,8 +310,7 @@ struct NBDExportInfo {
     char *description;
     int n_contexts;
     char **contexts;
-};
-typedef struct NBDExportInfo NBDExportInfo;
+} NBDExportInfo;

 int nbd_receive_negotiate(AioContext *aio_context, QIOChannel *ioc,
                           QCryptoTLSCreds *tlscreds,
@@ -330,9 +329,6 @@ int nbd_client(int fd);
 int nbd_disconnect(int fd);
 int nbd_errno_to_system_errno(int err);

-typedef struct NBDExport NBDExport;
-typedef struct NBDClient NBDClient;
-
 void nbd_export_set_on_eject_blk(BlockExport *exp, BlockBackend *blk);

 AioContext *nbd_export_aio_context(NBDExport *exp);
-- 
2.40.1



^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v4 03/24] nbd/server: Prepare for alternate-size headers
  2023-06-08 13:56 [PATCH v4 00/24] qemu patches for 64-bit NBD extensions Eric Blake
  2023-06-08 13:56 ` [PATCH v4 01/24] nbd/client: Use smarter assert Eric Blake
  2023-06-08 13:56 ` [PATCH v4 02/24] nbd: Consistent typedef usage in header Eric Blake
@ 2023-06-08 13:56 ` Eric Blake
  2023-06-12 13:53   ` Vladimir Sementsov-Ogievskiy
  2023-06-08 13:56 ` [PATCH v4 04/24] nbd/server: Refactor to pass full request around Eric Blake
                   ` (20 subsequent siblings)
  23 siblings, 1 reply; 62+ messages in thread
From: Eric Blake @ 2023-06-08 13:56 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, libguestfs, vsementsov, Kevin Wolf, Hanna Reitz

Upstream NBD now documents[1] an extension that supports 64-bit effect
lengths in requests.  As part of that extension, the size of the reply
headers will change in order to permit a 64-bit length in the reply
for symmetry[2].  Additionally, where the reply header is currently 16
bytes for simple reply, and 20 bytes for structured reply; with the
extension enabled, there will only be one extended reply header, of 32
bytes, with both structured and extended modes sending identical
payloads for chunked replies.

Since we are already wired up to use iovecs, it is easiest to allow
for this change in header size by splitting each structured reply
across multiple iovecs, one for the header (which will become wider in
a future patch according to client negotiation), and the other(s) for
the chunk payload, and removing the header from the payload struct
definitions.  Rename the affected functions with s/structured/chunk/
to make it obvious that the code will be reused in extended mode.

Interestingly, the client side code never utilized the packed types,
so only the server code needs to be updated.

[1] https://github.com/NetworkBlockDevice/nbd/blob/extension-ext-header/doc/proto.md
as of NBD commit e6f3b94a934

[2] Note that on the surface, this is because some future server might
permit a 4G+ NBD_CMD_READ and need to reply with that much data in one
transaction.  But even though the extended reply length is widened to
64 bits, for now the NBD spec is clear that servers will not reply
with more than a maximum payload bounded by the 32-bit
NBD_INFO_BLOCK_SIZE field; allowing a client and server to mutually
agree to transactions larger than 4G would require yet another
extension.

Signed-off-by: Eric Blake <eblake@redhat.com>
---

v4: hoist earlier in series, drop most changes to
nbd_co_send_simple_reply, pass niov to set_be_chunk, rename several
functions, drop R-b
---
 include/block/nbd.h |   8 +--
 nbd/server.c        | 137 ++++++++++++++++++++++++++------------------
 nbd/trace-events    |   8 +--
 3 files changed, 88 insertions(+), 65 deletions(-)

diff --git a/include/block/nbd.h b/include/block/nbd.h
index 9c3ceae5ba5..e563f1774b0 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -96,28 +96,28 @@ typedef union NBDReply {

 /* Header of chunk for NBD_REPLY_TYPE_OFFSET_DATA */
 typedef struct NBDStructuredReadData {
-    NBDStructuredReplyChunk h; /* h.length >= 9 */
+    /* header's .length >= 9 */
     uint64_t offset;
     /* At least one byte of data payload follows, calculated from h.length */
 } QEMU_PACKED NBDStructuredReadData;

 /* Complete chunk for NBD_REPLY_TYPE_OFFSET_HOLE */
 typedef struct NBDStructuredReadHole {
-    NBDStructuredReplyChunk h; /* h.length == 12 */
+    /* header's length == 12 */
     uint64_t offset;
     uint32_t length;
 } QEMU_PACKED NBDStructuredReadHole;

 /* Header of all NBD_REPLY_TYPE_ERROR* errors */
 typedef struct NBDStructuredError {
-    NBDStructuredReplyChunk h; /* h.length >= 6 */
+    /* header's length >= 6 */
     uint32_t error;
     uint16_t message_length;
 } QEMU_PACKED NBDStructuredError;

 /* Header of NBD_REPLY_TYPE_BLOCK_STATUS */
 typedef struct NBDStructuredMeta {
-    NBDStructuredReplyChunk h; /* h.length >= 12 (at least one extent) */
+    /* header's length >= 12 (at least one extent) */
     uint32_t context_id;
     /* extents follows */
 } QEMU_PACKED NBDStructuredMeta;
diff --git a/nbd/server.c b/nbd/server.c
index febe001a399..6698ab46365 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -1,5 +1,5 @@
 /*
- *  Copyright (C) 2016-2022 Red Hat, Inc.
+ *  Copyright Red Hat
  *  Copyright (C) 2005  Anthony Liguori <anthony@codemonkey.ws>
  *
  *  Network Block Device Server Side
@@ -1906,16 +1906,36 @@ static int coroutine_fn nbd_co_send_simple_reply(NBDClient *client,
         {.iov_base = data, .iov_len = len}
     };

+    assert(!len || !nbd_err);
     trace_nbd_co_send_simple_reply(handle, nbd_err, nbd_err_lookup(nbd_err),
                                    len);
     set_be_simple_reply(&reply, nbd_err, handle);

-    return nbd_co_send_iov(client, iov, len ? 2 : 1, errp);
+    return nbd_co_send_iov(client, iov, 2, errp);
 }

-static inline void set_be_chunk(NBDStructuredReplyChunk *chunk, uint16_t flags,
-                                uint16_t type, uint64_t handle, uint32_t length)
+/*
+ * Prepare the header of a reply chunk for network transmission.
+ *
+ * On input, @iov is partially initialized: iov[0].iov_base must point
+ * to an uninitialized NBDReply, while the remaining @niov elements
+ * (if any) must be ready for transmission.  This function then
+ * populates iov[0] for transmission.
+ */
+static inline void set_be_chunk(NBDClient *client, struct iovec *iov,
+                                size_t niov, uint16_t flags, uint16_t type,
+                                uint64_t handle)
 {
+    /* TODO - handle structured vs. extended replies */
+    NBDStructuredReplyChunk *chunk = iov->iov_base;
+    size_t i, length = 0;
+
+    for (i = 1; i < niov; i++) {
+        length += iov[i].iov_len;
+    }
+    assert(length <= NBD_MAX_BUFFER_SIZE + sizeof(NBDStructuredReadData));
+
+    iov[0].iov_len = sizeof(*chunk);
     stl_be_p(&chunk->magic, NBD_STRUCTURED_REPLY_MAGIC);
     stw_be_p(&chunk->flags, flags);
     stw_be_p(&chunk->type, type);
@@ -1923,67 +1943,71 @@ static inline void set_be_chunk(NBDStructuredReplyChunk *chunk, uint16_t flags,
     stl_be_p(&chunk->length, length);
 }

-static int coroutine_fn nbd_co_send_structured_done(NBDClient *client,
-                                                    uint64_t handle,
-                                                    Error **errp)
+static int coroutine_fn nbd_co_send_chunk_done(NBDClient *client,
+                                               uint64_t handle,
+                                               Error **errp)
 {
-    NBDStructuredReplyChunk chunk;
+    NBDReply hdr;
     struct iovec iov[] = {
-        {.iov_base = &chunk, .iov_len = sizeof(chunk)},
+        {.iov_base = &hdr},
     };

-    trace_nbd_co_send_structured_done(handle);
-    set_be_chunk(&chunk, NBD_REPLY_FLAG_DONE, NBD_REPLY_TYPE_NONE, handle, 0);
+    trace_nbd_co_send_chunk_done(handle);
+    set_be_chunk(client, iov, 1, NBD_REPLY_FLAG_DONE,
+                 NBD_REPLY_TYPE_NONE, handle);

     return nbd_co_send_iov(client, iov, 1, errp);
 }

-static int coroutine_fn nbd_co_send_structured_read(NBDClient *client,
-                                                    uint64_t handle,
-                                                    uint64_t offset,
-                                                    void *data,
-                                                    size_t size,
-                                                    bool final,
-                                                    Error **errp)
+static int coroutine_fn nbd_co_send_chunk_read(NBDClient *client,
+                                               uint64_t handle,
+                                               uint64_t offset,
+                                               void *data,
+                                               size_t size,
+                                               bool final,
+                                               Error **errp)
 {
+    NBDReply hdr;
     NBDStructuredReadData chunk;
     struct iovec iov[] = {
+        {.iov_base = &hdr},
         {.iov_base = &chunk, .iov_len = sizeof(chunk)},
         {.iov_base = data, .iov_len = size}
     };

     assert(size);
-    trace_nbd_co_send_structured_read(handle, offset, data, size);
-    set_be_chunk(&chunk.h, final ? NBD_REPLY_FLAG_DONE : 0,
-                 NBD_REPLY_TYPE_OFFSET_DATA, handle,
-                 sizeof(chunk) - sizeof(chunk.h) + size);
+    trace_nbd_co_send_chunk_read(handle, offset, data, size);
+    set_be_chunk(client, iov, 3, final ? NBD_REPLY_FLAG_DONE : 0,
+                 NBD_REPLY_TYPE_OFFSET_DATA, handle);
     stq_be_p(&chunk.offset, offset);

-    return nbd_co_send_iov(client, iov, 2, errp);
+    return nbd_co_send_iov(client, iov, 3, errp);
 }

-static int coroutine_fn nbd_co_send_structured_error(NBDClient *client,
-                                                     uint64_t handle,
-                                                     uint32_t error,
-                                                     const char *msg,
-                                                     Error **errp)
+static int coroutine_fn nbd_co_send_chunk_error(NBDClient *client,
+                                                uint64_t handle,
+                                                uint32_t error,
+                                                const char *msg,
+                                                Error **errp)
 {
+    NBDReply hdr;
     NBDStructuredError chunk;
     int nbd_err = system_errno_to_nbd_errno(error);
     struct iovec iov[] = {
+        {.iov_base = &hdr},
         {.iov_base = &chunk, .iov_len = sizeof(chunk)},
         {.iov_base = (char *)msg, .iov_len = msg ? strlen(msg) : 0},
     };

     assert(nbd_err);
-    trace_nbd_co_send_structured_error(handle, nbd_err,
-                                       nbd_err_lookup(nbd_err), msg ? msg : "");
-    set_be_chunk(&chunk.h, NBD_REPLY_FLAG_DONE, NBD_REPLY_TYPE_ERROR, handle,
-                 sizeof(chunk) - sizeof(chunk.h) + iov[1].iov_len);
+    trace_nbd_co_send_chunk_error(handle, nbd_err,
+                                  nbd_err_lookup(nbd_err), msg ? msg : "");
+    set_be_chunk(client, iov, 3, NBD_REPLY_FLAG_DONE,
+                 NBD_REPLY_TYPE_ERROR, handle);
     stl_be_p(&chunk.error, nbd_err);
-    stw_be_p(&chunk.message_length, iov[1].iov_len);
+    stw_be_p(&chunk.message_length, iov[2].iov_len);

-    return nbd_co_send_iov(client, iov, 1 + !!iov[1].iov_len, errp);
+    return nbd_co_send_iov(client, iov, 3, errp);
 }

 /* Do a sparse read and send the structured reply to the client.
@@ -2013,27 +2037,27 @@ static int coroutine_fn nbd_co_send_sparse_read(NBDClient *client,
             char *msg = g_strdup_printf("unable to check for holes: %s",
                                         strerror(-status));

-            ret = nbd_co_send_structured_error(client, handle, -status, msg,
-                                               errp);
+            ret = nbd_co_send_chunk_error(client, handle, -status, msg, errp);
             g_free(msg);
             return ret;
         }
         assert(pnum && pnum <= size - progress);
         final = progress + pnum == size;
         if (status & BDRV_BLOCK_ZERO) {
+            NBDReply hdr;
             NBDStructuredReadHole chunk;
             struct iovec iov[] = {
+                {.iov_base = &hdr},
                 {.iov_base = &chunk, .iov_len = sizeof(chunk)},
             };

-            trace_nbd_co_send_structured_read_hole(handle, offset + progress,
-                                                   pnum);
-            set_be_chunk(&chunk.h, final ? NBD_REPLY_FLAG_DONE : 0,
-                         NBD_REPLY_TYPE_OFFSET_HOLE,
-                         handle, sizeof(chunk) - sizeof(chunk.h));
+            trace_nbd_co_send_chunk_read_hole(handle, offset + progress, pnum);
+            set_be_chunk(client, iov, 2,
+                         final ? NBD_REPLY_FLAG_DONE : 0,
+                         NBD_REPLY_TYPE_OFFSET_HOLE, handle);
             stq_be_p(&chunk.offset, offset + progress);
             stl_be_p(&chunk.length, pnum);
-            ret = nbd_co_send_iov(client, iov, 1, errp);
+            ret = nbd_co_send_iov(client, iov, 2, errp);
         } else {
             ret = blk_co_pread(exp->common.blk, offset + progress, pnum,
                                data + progress, 0);
@@ -2041,9 +2065,8 @@ static int coroutine_fn nbd_co_send_sparse_read(NBDClient *client,
                 error_setg_errno(errp, -ret, "reading from file failed");
                 break;
             }
-            ret = nbd_co_send_structured_read(client, handle, offset + progress,
-                                              data + progress, pnum, final,
-                                              errp);
+            ret = nbd_co_send_chunk_read(client, handle, offset + progress,
+                                         data + progress, pnum, final, errp);
         }

         if (ret < 0) {
@@ -2199,8 +2222,10 @@ static int coroutine_fn
 nbd_co_send_extents(NBDClient *client, uint64_t handle, NBDExtentArray *ea,
                     bool last, uint32_t context_id, Error **errp)
 {
+    NBDReply hdr;
     NBDStructuredMeta chunk;
     struct iovec iov[] = {
+        {.iov_base = &hdr},
         {.iov_base = &chunk, .iov_len = sizeof(chunk)},
         {.iov_base = ea->extents, .iov_len = ea->count * sizeof(ea->extents[0])}
     };
@@ -2209,12 +2234,11 @@ nbd_co_send_extents(NBDClient *client, uint64_t handle, NBDExtentArray *ea,

     trace_nbd_co_send_extents(handle, ea->count, context_id, ea->total_length,
                               last);
-    set_be_chunk(&chunk.h, last ? NBD_REPLY_FLAG_DONE : 0,
-                 NBD_REPLY_TYPE_BLOCK_STATUS,
-                 handle, sizeof(chunk) - sizeof(chunk.h) + iov[1].iov_len);
+    set_be_chunk(client, iov, 3, last ? NBD_REPLY_FLAG_DONE : 0,
+                 NBD_REPLY_TYPE_BLOCK_STATUS, handle);
     stl_be_p(&chunk.context_id, context_id);

-    return nbd_co_send_iov(client, iov, 2, errp);
+    return nbd_co_send_iov(client, iov, 3, errp);
 }

 /* Get block status from the exported device and send it to the client */
@@ -2235,8 +2259,8 @@ coroutine_fn nbd_co_send_block_status(NBDClient *client, uint64_t handle,
         ret = blockalloc_to_extents(blk, offset, length, ea);
     }
     if (ret < 0) {
-        return nbd_co_send_structured_error(
-                client, handle, -ret, "can't get block status", errp);
+        return nbd_co_send_chunk_error(client, handle, -ret,
+                                       "can't get block status", errp);
     }

     return nbd_co_send_extents(client, handle, ea, last, context_id, errp);
@@ -2408,8 +2432,7 @@ static coroutine_fn int nbd_send_generic_reply(NBDClient *client,
                                                Error **errp)
 {
     if (client->structured_reply && ret < 0) {
-        return nbd_co_send_structured_error(client, handle, -ret, error_msg,
-                                            errp);
+        return nbd_co_send_chunk_error(client, handle, -ret, error_msg, errp);
     } else {
         return nbd_co_send_simple_reply(client, handle, ret < 0 ? -ret : 0,
                                         NULL, 0, errp);
@@ -2451,11 +2474,11 @@ static coroutine_fn int nbd_do_cmd_read(NBDClient *client, NBDRequest *request,

     if (client->structured_reply) {
         if (request->len) {
-            return nbd_co_send_structured_read(client, request->handle,
-                                               request->from, data,
-                                               request->len, true, errp);
+            return nbd_co_send_chunk_read(client, request->handle,
+                                          request->from, data,
+                                          request->len, true, errp);
         } else {
-            return nbd_co_send_structured_done(client, request->handle, errp);
+            return nbd_co_send_chunk_done(client, request->handle, errp);
         }
     } else {
         return nbd_co_send_simple_reply(client, request->handle, 0,
diff --git a/nbd/trace-events b/nbd/trace-events
index b7032ca2778..50ca05a9e22 100644
--- a/nbd/trace-events
+++ b/nbd/trace-events
@@ -64,11 +64,11 @@ nbd_receive_request(uint32_t magic, uint16_t flags, uint16_t type, uint64_t from
 nbd_blk_aio_attached(const char *name, void *ctx) "Export %s: Attaching clients to AIO context %p"
 nbd_blk_aio_detach(const char *name, void *ctx) "Export %s: Detaching clients from AIO context %p"
 nbd_co_send_simple_reply(uint64_t handle, uint32_t error, const char *errname, int len) "Send simple reply: handle = %" PRIu64 ", error = %" PRIu32 " (%s), len = %d"
-nbd_co_send_structured_done(uint64_t handle) "Send structured reply done: handle = %" PRIu64
-nbd_co_send_structured_read(uint64_t handle, uint64_t offset, void *data, size_t size) "Send structured read data reply: handle = %" PRIu64 ", offset = %" PRIu64 ", data = %p, len = %zu"
-nbd_co_send_structured_read_hole(uint64_t handle, uint64_t offset, size_t size) "Send structured read hole reply: handle = %" PRIu64 ", offset = %" PRIu64 ", len = %zu"
+nbd_co_send_chunk_done(uint64_t handle) "Send structured reply done: handle = %" PRIu64
+nbd_co_send_chunk_read(uint64_t handle, uint64_t offset, void *data, size_t size) "Send structured read data reply: handle = %" PRIu64 ", offset = %" PRIu64 ", data = %p, len = %zu"
+nbd_co_send_chunk_read_hole(uint64_t handle, uint64_t offset, size_t size) "Send structured read hole reply: handle = %" PRIu64 ", offset = %" PRIu64 ", len = %zu"
 nbd_co_send_extents(uint64_t handle, unsigned int extents, uint32_t id, uint64_t length, int last) "Send block status reply: handle = %" PRIu64 ", extents = %u, context = %d (extents cover %" PRIu64 " bytes, last chunk = %d)"
-nbd_co_send_structured_error(uint64_t handle, int err, const char *errname, const char *msg) "Send structured error reply: handle = %" PRIu64 ", error = %d (%s), msg = '%s'"
+nbd_co_send_chunk_error(uint64_t handle, int err, const char *errname, const char *msg) "Send structured error reply: handle = %" PRIu64 ", error = %d (%s), msg = '%s'"
 nbd_co_receive_request_decode_type(uint64_t handle, uint16_t type, const char *name) "Decoding type: handle = %" PRIu64 ", type = %" PRIu16 " (%s)"
 nbd_co_receive_request_payload_received(uint64_t handle, uint32_t len) "Payload received: handle = %" PRIu64 ", len = %" PRIu32
 nbd_co_receive_align_compliance(const char *op, uint64_t from, uint32_t len, uint32_t align) "client sent non-compliant unaligned %s request: from=0x%" PRIx64 ", len=0x%" PRIx32 ", align=0x%" PRIx32
-- 
2.40.1



^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v4 04/24] nbd/server: Refactor to pass full request around
  2023-06-08 13:56 [PATCH v4 00/24] qemu patches for 64-bit NBD extensions Eric Blake
                   ` (2 preceding siblings ...)
  2023-06-08 13:56 ` [PATCH v4 03/24] nbd/server: Prepare for alternate-size headers Eric Blake
@ 2023-06-08 13:56 ` Eric Blake
  2023-06-08 13:56 ` [PATCH v4 05/24] nbd: s/handle/cookie/ to match NBD spec Eric Blake
                   ` (19 subsequent siblings)
  23 siblings, 0 replies; 62+ messages in thread
From: Eric Blake @ 2023-06-08 13:56 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, libguestfs, vsementsov

Part of NBD's 64-bit headers extension involves passing the client's
requested offset back as part of the reply header (one reason it
stated for this change: converting absolute offsets stored in
NBD_REPLY_TYPE_OFFSET_DATA to relative offsets within the buffer is
easier if the absolute offset of the buffer is also available).  This
is a refactoring patch to pass the full request around the reply
stack, rather than just the handle, so that later patches can then
access request->from when extended headers are active.  Meanwhile,
this patch enables us to now assert that simple replies are only
attempted when appropriate, and otherwise has no semantic change.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
---

v4: reorder earlier in series, add assertion, keep R-b
---
 nbd/server.c | 114 ++++++++++++++++++++++++++-------------------------
 1 file changed, 59 insertions(+), 55 deletions(-)

diff --git a/nbd/server.c b/nbd/server.c
index 6698ab46365..26b27d69202 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -1893,7 +1893,7 @@ static inline void set_be_simple_reply(NBDSimpleReply *reply, uint64_t error,
 }

 static int coroutine_fn nbd_co_send_simple_reply(NBDClient *client,
-                                                 uint64_t handle,
+                                                 NBDRequest *request,
                                                  uint32_t error,
                                                  void *data,
                                                  size_t len,
@@ -1907,9 +1907,10 @@ static int coroutine_fn nbd_co_send_simple_reply(NBDClient *client,
     };

     assert(!len || !nbd_err);
-    trace_nbd_co_send_simple_reply(handle, nbd_err, nbd_err_lookup(nbd_err),
-                                   len);
-    set_be_simple_reply(&reply, nbd_err, handle);
+    assert(!client->structured_reply || request->type != NBD_CMD_READ);
+    trace_nbd_co_send_simple_reply(request->handle, nbd_err,
+                                   nbd_err_lookup(nbd_err), len);
+    set_be_simple_reply(&reply, nbd_err, request->handle);

     return nbd_co_send_iov(client, iov, 2, errp);
 }
@@ -1924,7 +1925,7 @@ static int coroutine_fn nbd_co_send_simple_reply(NBDClient *client,
  */
 static inline void set_be_chunk(NBDClient *client, struct iovec *iov,
                                 size_t niov, uint16_t flags, uint16_t type,
-                                uint64_t handle)
+                                NBDRequest *request)
 {
     /* TODO - handle structured vs. extended replies */
     NBDStructuredReplyChunk *chunk = iov->iov_base;
@@ -1939,12 +1940,12 @@ static inline void set_be_chunk(NBDClient *client, struct iovec *iov,
     stl_be_p(&chunk->magic, NBD_STRUCTURED_REPLY_MAGIC);
     stw_be_p(&chunk->flags, flags);
     stw_be_p(&chunk->type, type);
-    stq_be_p(&chunk->handle, handle);
+    stq_be_p(&chunk->handle, request->handle);
     stl_be_p(&chunk->length, length);
 }

 static int coroutine_fn nbd_co_send_chunk_done(NBDClient *client,
-                                               uint64_t handle,
+                                               NBDRequest *request,
                                                Error **errp)
 {
     NBDReply hdr;
@@ -1952,15 +1953,15 @@ static int coroutine_fn nbd_co_send_chunk_done(NBDClient *client,
         {.iov_base = &hdr},
     };

-    trace_nbd_co_send_chunk_done(handle);
+    trace_nbd_co_send_chunk_done(request->handle);
     set_be_chunk(client, iov, 1, NBD_REPLY_FLAG_DONE,
-                 NBD_REPLY_TYPE_NONE, handle);
+                 NBD_REPLY_TYPE_NONE, request);

     return nbd_co_send_iov(client, iov, 1, errp);
 }

 static int coroutine_fn nbd_co_send_chunk_read(NBDClient *client,
-                                               uint64_t handle,
+                                               NBDRequest *request,
                                                uint64_t offset,
                                                void *data,
                                                size_t size,
@@ -1976,16 +1977,16 @@ static int coroutine_fn nbd_co_send_chunk_read(NBDClient *client,
     };

     assert(size);
-    trace_nbd_co_send_chunk_read(handle, offset, data, size);
+    trace_nbd_co_send_chunk_read(request->handle, offset, data, size);
     set_be_chunk(client, iov, 3, final ? NBD_REPLY_FLAG_DONE : 0,
-                 NBD_REPLY_TYPE_OFFSET_DATA, handle);
+                 NBD_REPLY_TYPE_OFFSET_DATA, request);
     stq_be_p(&chunk.offset, offset);

     return nbd_co_send_iov(client, iov, 3, errp);
 }
-
+/*ebb*/
 static int coroutine_fn nbd_co_send_chunk_error(NBDClient *client,
-                                                uint64_t handle,
+                                                NBDRequest *request,
                                                 uint32_t error,
                                                 const char *msg,
                                                 Error **errp)
@@ -2000,10 +2001,10 @@ static int coroutine_fn nbd_co_send_chunk_error(NBDClient *client,
     };

     assert(nbd_err);
-    trace_nbd_co_send_chunk_error(handle, nbd_err,
+    trace_nbd_co_send_chunk_error(request->handle, nbd_err,
                                   nbd_err_lookup(nbd_err), msg ? msg : "");
     set_be_chunk(client, iov, 3, NBD_REPLY_FLAG_DONE,
-                 NBD_REPLY_TYPE_ERROR, handle);
+                 NBD_REPLY_TYPE_ERROR, request);
     stl_be_p(&chunk.error, nbd_err);
     stw_be_p(&chunk.message_length, iov[2].iov_len);

@@ -2015,7 +2016,7 @@ static int coroutine_fn nbd_co_send_chunk_error(NBDClient *client,
  * reported to the client, at which point this function succeeds.
  */
 static int coroutine_fn nbd_co_send_sparse_read(NBDClient *client,
-                                                uint64_t handle,
+                                                NBDRequest *request,
                                                 uint64_t offset,
                                                 uint8_t *data,
                                                 size_t size,
@@ -2037,7 +2038,7 @@ static int coroutine_fn nbd_co_send_sparse_read(NBDClient *client,
             char *msg = g_strdup_printf("unable to check for holes: %s",
                                         strerror(-status));

-            ret = nbd_co_send_chunk_error(client, handle, -status, msg, errp);
+            ret = nbd_co_send_chunk_error(client, request, -status, msg, errp);
             g_free(msg);
             return ret;
         }
@@ -2051,10 +2052,11 @@ static int coroutine_fn nbd_co_send_sparse_read(NBDClient *client,
                 {.iov_base = &chunk, .iov_len = sizeof(chunk)},
             };

-            trace_nbd_co_send_chunk_read_hole(handle, offset + progress, pnum);
+            trace_nbd_co_send_chunk_read_hole(request->handle,
+                                              offset + progress, pnum);
             set_be_chunk(client, iov, 2,
                          final ? NBD_REPLY_FLAG_DONE : 0,
-                         NBD_REPLY_TYPE_OFFSET_HOLE, handle);
+                         NBD_REPLY_TYPE_OFFSET_HOLE, request);
             stq_be_p(&chunk.offset, offset + progress);
             stl_be_p(&chunk.length, pnum);
             ret = nbd_co_send_iov(client, iov, 2, errp);
@@ -2065,7 +2067,7 @@ static int coroutine_fn nbd_co_send_sparse_read(NBDClient *client,
                 error_setg_errno(errp, -ret, "reading from file failed");
                 break;
             }
-            ret = nbd_co_send_chunk_read(client, handle, offset + progress,
+            ret = nbd_co_send_chunk_read(client, request, offset + progress,
                                          data + progress, pnum, final, errp);
         }

@@ -2219,7 +2221,7 @@ static int coroutine_fn blockalloc_to_extents(BlockBackend *blk,
  * @last controls whether NBD_REPLY_FLAG_DONE is sent.
  */
 static int coroutine_fn
-nbd_co_send_extents(NBDClient *client, uint64_t handle, NBDExtentArray *ea,
+nbd_co_send_extents(NBDClient *client, NBDRequest *request, NBDExtentArray *ea,
                     bool last, uint32_t context_id, Error **errp)
 {
     NBDReply hdr;
@@ -2232,10 +2234,10 @@ nbd_co_send_extents(NBDClient *client, uint64_t handle, NBDExtentArray *ea,

     nbd_extent_array_convert_to_be(ea);

-    trace_nbd_co_send_extents(handle, ea->count, context_id, ea->total_length,
-                              last);
+    trace_nbd_co_send_extents(request->handle, ea->count, context_id,
+                              ea->total_length, last);
     set_be_chunk(client, iov, 3, last ? NBD_REPLY_FLAG_DONE : 0,
-                 NBD_REPLY_TYPE_BLOCK_STATUS, handle);
+                 NBD_REPLY_TYPE_BLOCK_STATUS, request);
     stl_be_p(&chunk.context_id, context_id);

     return nbd_co_send_iov(client, iov, 3, errp);
@@ -2243,7 +2245,7 @@ nbd_co_send_extents(NBDClient *client, uint64_t handle, NBDExtentArray *ea,

 /* Get block status from the exported device and send it to the client */
 static int
-coroutine_fn nbd_co_send_block_status(NBDClient *client, uint64_t handle,
+coroutine_fn nbd_co_send_block_status(NBDClient *client, NBDRequest *request,
                                       BlockBackend *blk, uint64_t offset,
                                       uint32_t length, bool dont_fragment,
                                       bool last, uint32_t context_id,
@@ -2259,11 +2261,11 @@ coroutine_fn nbd_co_send_block_status(NBDClient *client, uint64_t handle,
         ret = blockalloc_to_extents(blk, offset, length, ea);
     }
     if (ret < 0) {
-        return nbd_co_send_chunk_error(client, handle, -ret,
+        return nbd_co_send_chunk_error(client, request, -ret,
                                        "can't get block status", errp);
     }

-    return nbd_co_send_extents(client, handle, ea, last, context_id, errp);
+    return nbd_co_send_extents(client, request, ea, last, context_id, errp);
 }

 /* Populate @ea from a dirty bitmap. */
@@ -2298,17 +2300,20 @@ static void bitmap_to_extents(BdrvDirtyBitmap *bitmap,
     bdrv_dirty_bitmap_unlock(bitmap);
 }

-static int coroutine_fn nbd_co_send_bitmap(NBDClient *client, uint64_t handle,
-                                           BdrvDirtyBitmap *bitmap, uint64_t offset,
-                                           uint32_t length, bool dont_fragment, bool last,
-                                           uint32_t context_id, Error **errp)
+static int coroutine_fn nbd_co_send_bitmap(NBDClient *client,
+                                           NBDRequest *request,
+                                           BdrvDirtyBitmap *bitmap,
+                                           uint64_t offset,
+                                           uint32_t length, bool dont_fragment,
+                                           bool last, uint32_t context_id,
+                                           Error **errp)
 {
     unsigned int nb_extents = dont_fragment ? 1 : NBD_MAX_BLOCK_STATUS_EXTENTS;
     g_autoptr(NBDExtentArray) ea = nbd_extent_array_new(nb_extents);

     bitmap_to_extents(bitmap, offset, length, ea);

-    return nbd_co_send_extents(client, handle, ea, last, context_id, errp);
+    return nbd_co_send_extents(client, request, ea, last, context_id, errp);
 }

 /* nbd_co_receive_request
@@ -2426,15 +2431,15 @@ static int coroutine_fn nbd_co_receive_request(NBDRequestData *req, NBDRequest *
  * Returns 0 if connection is still live, -errno on failure to talk to client
  */
 static coroutine_fn int nbd_send_generic_reply(NBDClient *client,
-                                               uint64_t handle,
+                                               NBDRequest *request,
                                                int ret,
                                                const char *error_msg,
                                                Error **errp)
 {
     if (client->structured_reply && ret < 0) {
-        return nbd_co_send_chunk_error(client, handle, -ret, error_msg, errp);
+        return nbd_co_send_chunk_error(client, request, -ret, error_msg, errp);
     } else {
-        return nbd_co_send_simple_reply(client, handle, ret < 0 ? -ret : 0,
+        return nbd_co_send_simple_reply(client, request, ret < 0 ? -ret : 0,
                                         NULL, 0, errp);
     }
 }
@@ -2454,7 +2459,7 @@ static coroutine_fn int nbd_do_cmd_read(NBDClient *client, NBDRequest *request,
     if (request->flags & NBD_CMD_FLAG_FUA) {
         ret = blk_co_flush(exp->common.blk);
         if (ret < 0) {
-            return nbd_send_generic_reply(client, request->handle, ret,
+            return nbd_send_generic_reply(client, request, ret,
                                           "flush failed", errp);
         }
     }
@@ -2462,26 +2467,25 @@ static coroutine_fn int nbd_do_cmd_read(NBDClient *client, NBDRequest *request,
     if (client->structured_reply && !(request->flags & NBD_CMD_FLAG_DF) &&
         request->len)
     {
-        return nbd_co_send_sparse_read(client, request->handle, request->from,
+        return nbd_co_send_sparse_read(client, request, request->from,
                                        data, request->len, errp);
     }

     ret = blk_co_pread(exp->common.blk, request->from, request->len, data, 0);
     if (ret < 0) {
-        return nbd_send_generic_reply(client, request->handle, ret,
+        return nbd_send_generic_reply(client, request, ret,
                                       "reading from file failed", errp);
     }

     if (client->structured_reply) {
         if (request->len) {
-            return nbd_co_send_chunk_read(client, request->handle,
-                                          request->from, data,
+            return nbd_co_send_chunk_read(client, request, request->from, data,
                                           request->len, true, errp);
         } else {
-            return nbd_co_send_chunk_done(client, request->handle, errp);
+            return nbd_co_send_chunk_done(client, request, errp);
         }
     } else {
-        return nbd_co_send_simple_reply(client, request->handle, 0,
+        return nbd_co_send_simple_reply(client, request, 0,
                                         data, request->len, errp);
     }
 }
@@ -2504,7 +2508,7 @@ static coroutine_fn int nbd_do_cmd_cache(NBDClient *client, NBDRequest *request,
     ret = blk_co_preadv(exp->common.blk, request->from, request->len,
                         NULL, BDRV_REQ_COPY_ON_READ | BDRV_REQ_PREFETCH);

-    return nbd_send_generic_reply(client, request->handle, ret,
+    return nbd_send_generic_reply(client, request, ret,
                                   "caching data failed", errp);
 }

@@ -2535,7 +2539,7 @@ static coroutine_fn int nbd_handle_request(NBDClient *client,
         }
         ret = blk_co_pwrite(exp->common.blk, request->from, request->len, data,
                             flags);
-        return nbd_send_generic_reply(client, request->handle, ret,
+        return nbd_send_generic_reply(client, request, ret,
                                       "writing to file failed", errp);

     case NBD_CMD_WRITE_ZEROES:
@@ -2551,7 +2555,7 @@ static coroutine_fn int nbd_handle_request(NBDClient *client,
         }
         ret = blk_co_pwrite_zeroes(exp->common.blk, request->from, request->len,
                                    flags);
-        return nbd_send_generic_reply(client, request->handle, ret,
+        return nbd_send_generic_reply(client, request, ret,
                                       "writing to file failed", errp);

     case NBD_CMD_DISC:
@@ -2560,7 +2564,7 @@ static coroutine_fn int nbd_handle_request(NBDClient *client,

     case NBD_CMD_FLUSH:
         ret = blk_co_flush(exp->common.blk);
-        return nbd_send_generic_reply(client, request->handle, ret,
+        return nbd_send_generic_reply(client, request, ret,
                                       "flush failed", errp);

     case NBD_CMD_TRIM:
@@ -2568,12 +2572,12 @@ static coroutine_fn int nbd_handle_request(NBDClient *client,
         if (ret >= 0 && request->flags & NBD_CMD_FLAG_FUA) {
             ret = blk_co_flush(exp->common.blk);
         }
-        return nbd_send_generic_reply(client, request->handle, ret,
+        return nbd_send_generic_reply(client, request, ret,
                                       "discard failed", errp);

     case NBD_CMD_BLOCK_STATUS:
         if (!request->len) {
-            return nbd_send_generic_reply(client, request->handle, -EINVAL,
+            return nbd_send_generic_reply(client, request, -EINVAL,
                                           "need non-zero length", errp);
         }
         if (client->export_meta.count) {
@@ -2581,7 +2585,7 @@ static coroutine_fn int nbd_handle_request(NBDClient *client,
             int contexts_remaining = client->export_meta.count;

             if (client->export_meta.base_allocation) {
-                ret = nbd_co_send_block_status(client, request->handle,
+                ret = nbd_co_send_block_status(client, request,
                                                exp->common.blk,
                                                request->from,
                                                request->len, dont_fragment,
@@ -2594,7 +2598,7 @@ static coroutine_fn int nbd_handle_request(NBDClient *client,
             }

             if (client->export_meta.allocation_depth) {
-                ret = nbd_co_send_block_status(client, request->handle,
+                ret = nbd_co_send_block_status(client, request,
                                                exp->common.blk,
                                                request->from, request->len,
                                                dont_fragment,
@@ -2610,7 +2614,7 @@ static coroutine_fn int nbd_handle_request(NBDClient *client,
                 if (!client->export_meta.bitmaps[i]) {
                     continue;
                 }
-                ret = nbd_co_send_bitmap(client, request->handle,
+                ret = nbd_co_send_bitmap(client, request,
                                          client->exp->export_bitmaps[i],
                                          request->from, request->len,
                                          dont_fragment, !--contexts_remaining,
@@ -2624,7 +2628,7 @@ static coroutine_fn int nbd_handle_request(NBDClient *client,

             return 0;
         } else {
-            return nbd_send_generic_reply(client, request->handle, -EINVAL,
+            return nbd_send_generic_reply(client, request, -EINVAL,
                                           "CMD_BLOCK_STATUS not negotiated",
                                           errp);
         }
@@ -2632,7 +2636,7 @@ static coroutine_fn int nbd_handle_request(NBDClient *client,
     default:
         msg = g_strdup_printf("invalid request type (%" PRIu32 ") received",
                               request->type);
-        ret = nbd_send_generic_reply(client, request->handle, -EINVAL, msg,
+        ret = nbd_send_generic_reply(client, request, -EINVAL, msg,
                                      errp);
         g_free(msg);
         return ret;
@@ -2695,7 +2699,7 @@ static coroutine_fn void nbd_trip(void *opaque)
         Error *export_err = local_err;

         local_err = NULL;
-        ret = nbd_send_generic_reply(client, request.handle, -EINVAL,
+        ret = nbd_send_generic_reply(client, &request, -EINVAL,
                                      error_get_pretty(export_err), &local_err);
         error_free(export_err);
     } else {
-- 
2.40.1



^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v4 05/24] nbd: s/handle/cookie/ to match NBD spec
  2023-06-08 13:56 [PATCH v4 00/24] qemu patches for 64-bit NBD extensions Eric Blake
                   ` (3 preceding siblings ...)
  2023-06-08 13:56 ` [PATCH v4 04/24] nbd/server: Refactor to pass full request around Eric Blake
@ 2023-06-08 13:56 ` Eric Blake
  2023-06-08 14:32   ` [Libguestfs] " Eric Blake
  2023-06-12 14:12   ` Vladimir Sementsov-Ogievskiy
  2023-06-08 13:56 ` [PATCH v4 06/24] nbd/client: Simplify cookie vs. index computation Eric Blake
                   ` (18 subsequent siblings)
  23 siblings, 2 replies; 62+ messages in thread
From: Eric Blake @ 2023-06-08 13:56 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, libguestfs, vsementsov, Kevin Wolf, Hanna Reitz

Externally, libnbd exposed the 64-bit opaque marker for each client
NBD packet as the "cookie", because it was less confusing when
contrasted with 'struct nbd_handle *' holding all libnbd state.  It
also avoids confusion between the nown 'handle' as a way to identify a
packet and the verb 'handle' for reacting to things like signals.
Upstream NBD changed their spec to favor the name "cookie" based on
libnbd's recommendations[1], so we can do likewise.

[1] https://github.com/NetworkBlockDevice/nbd/commit/ca4392eb2b

Signed-off-by: Eric Blake <eblake@redhat.com>
---

v4: new patch
---
 include/block/nbd.h | 11 +++---
 block/nbd.c         | 96 +++++++++++++++++++++++----------------------
 nbd/client.c        | 14 +++----
 nbd/server.c        | 29 +++++++-------
 nbd/trace-events    | 22 +++++------
 5 files changed, 87 insertions(+), 85 deletions(-)

diff --git a/include/block/nbd.h b/include/block/nbd.h
index e563f1774b0..59db69bafa5 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -58,7 +58,7 @@ typedef struct NBDOptionReplyMetaContext {
  * request and reply!
  */
 typedef struct NBDRequest {
-    uint64_t handle;
+    uint64_t cookie;
     uint64_t from;
     uint32_t len;
     uint16_t flags; /* NBD_CMD_FLAG_* */
@@ -68,7 +68,7 @@ typedef struct NBDRequest {
 typedef struct NBDSimpleReply {
     uint32_t magic;  /* NBD_SIMPLE_REPLY_MAGIC */
     uint32_t error;
-    uint64_t handle;
+    uint64_t cookie;
 } QEMU_PACKED NBDSimpleReply;

 /* Header of all structured replies */
@@ -76,7 +76,7 @@ typedef struct NBDStructuredReplyChunk {
     uint32_t magic;  /* NBD_STRUCTURED_REPLY_MAGIC */
     uint16_t flags;  /* combination of NBD_REPLY_FLAG_* */
     uint16_t type;   /* NBD_REPLY_TYPE_* */
-    uint64_t handle; /* request handle */
+    uint64_t cookie; /* request handle */
     uint32_t length; /* length of payload */
 } QEMU_PACKED NBDStructuredReplyChunk;

@@ -84,13 +84,14 @@ typedef union NBDReply {
     NBDSimpleReply simple;
     NBDStructuredReplyChunk structured;
     struct {
-        /* @magic and @handle fields have the same offset and size both in
+        /*
+         * @magic and @cookie fields have the same offset and size both in
          * simple reply and structured reply chunk, so let them be accessible
          * without ".simple." or ".structured." specification
          */
         uint32_t magic;
         uint32_t _skip;
-        uint64_t handle;
+        uint64_t cookie;
     } QEMU_PACKED;
 } NBDReply;

diff --git a/block/nbd.c b/block/nbd.c
index 5aef5cb6bd5..be3c46c6fee 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -1,8 +1,8 @@
 /*
- * QEMU Block driver for  NBD
+ * QEMU Block driver for NBD
  *
  * Copyright (c) 2019 Virtuozzo International GmbH.
- * Copyright (C) 2016 Red Hat, Inc.
+ * Copyright Red Hat
  * Copyright (C) 2008 Bull S.A.S.
  *     Author: Laurent Vivier <Laurent.Vivier@bull.net>
  *
@@ -50,8 +50,8 @@
 #define EN_OPTSTR ":exportname="
 #define MAX_NBD_REQUESTS    16

-#define HANDLE_TO_INDEX(bs, handle) ((handle) ^ (uint64_t)(intptr_t)(bs))
-#define INDEX_TO_HANDLE(bs, index)  ((index)  ^ (uint64_t)(intptr_t)(bs))
+#define COOKIE_TO_INDEX(bs, cookie) ((cookie) ^ (uint64_t)(intptr_t)(bs))
+#define INDEX_TO_COOKIE(bs, index)  ((index)  ^ (uint64_t)(intptr_t)(bs))

 typedef struct {
     Coroutine *coroutine;
@@ -417,25 +417,25 @@ static void coroutine_fn GRAPH_RDLOCK nbd_reconnect_attempt(BDRVNBDState *s)
     reconnect_delay_timer_del(s);
 }

-static coroutine_fn int nbd_receive_replies(BDRVNBDState *s, uint64_t handle)
+static coroutine_fn int nbd_receive_replies(BDRVNBDState *s, uint64_t cookie)
 {
     int ret;
-    uint64_t ind = HANDLE_TO_INDEX(s, handle), ind2;
+    uint64_t ind = COOKIE_TO_INDEX(s, cookie), ind2;
     QEMU_LOCK_GUARD(&s->receive_mutex);

     while (true) {
-        if (s->reply.handle == handle) {
+        if (s->reply.cookie == cookie) {
             /* We are done */
             return 0;
         }

-        if (s->reply.handle != 0) {
+        if (s->reply.cookie != 0) {
             /*
              * Some other request is being handled now. It should already be
-             * woken by whoever set s->reply.handle (or never wait in this
+             * woken by whoever set s->reply.cookie (or never wait in this
              * yield). So, we should not wake it here.
              */
-            ind2 = HANDLE_TO_INDEX(s, s->reply.handle);
+            ind2 = COOKIE_TO_INDEX(s, s->reply.cookie);
             assert(!s->requests[ind2].receiving);

             s->requests[ind].receiving = true;
@@ -445,9 +445,9 @@ static coroutine_fn int nbd_receive_replies(BDRVNBDState *s, uint64_t handle)
             /*
              * We may be woken for 2 reasons:
              * 1. From this function, executing in parallel coroutine, when our
-             *    handle is received.
+             *    cookie is received.
              * 2. From nbd_co_receive_one_chunk(), when previous request is
-             *    finished and s->reply.handle set to 0.
+             *    finished and s->reply.cookie set to 0.
              * Anyway, it's OK to lock the mutex and go to the next iteration.
              */

@@ -456,8 +456,8 @@ static coroutine_fn int nbd_receive_replies(BDRVNBDState *s, uint64_t handle)
             continue;
         }

-        /* We are under mutex and handle is 0. We have to do the dirty work. */
-        assert(s->reply.handle == 0);
+        /* We are under mutex and cookie is 0. We have to do the dirty work. */
+        assert(s->reply.cookie == 0);
         ret = nbd_receive_reply(s->bs, s->ioc, &s->reply, NULL);
         if (ret <= 0) {
             ret = ret ? ret : -EIO;
@@ -468,12 +468,12 @@ static coroutine_fn int nbd_receive_replies(BDRVNBDState *s, uint64_t handle)
             nbd_channel_error(s, -EINVAL);
             return -EINVAL;
         }
-        ind2 = HANDLE_TO_INDEX(s, s->reply.handle);
+        ind2 = COOKIE_TO_INDEX(s, s->reply.cookie);
         if (ind2 >= MAX_NBD_REQUESTS || !s->requests[ind2].coroutine) {
             nbd_channel_error(s, -EINVAL);
             return -EINVAL;
         }
-        if (s->reply.handle == handle) {
+        if (s->reply.cookie == cookie) {
             /* We are done */
             return 0;
         }
@@ -519,7 +519,7 @@ nbd_co_send_request(BlockDriverState *bs, NBDRequest *request,
     qemu_mutex_unlock(&s->requests_lock);

     qemu_co_mutex_lock(&s->send_mutex);
-    request->handle = INDEX_TO_HANDLE(s, i);
+    request->cookie = INDEX_TO_COOKIE(s, i);

     assert(s->ioc);

@@ -828,11 +828,11 @@ static coroutine_fn int nbd_co_receive_structured_payload(
  * corresponding to the server's error reply), and errp is unchanged.
  */
 static coroutine_fn int nbd_co_do_receive_one_chunk(
-        BDRVNBDState *s, uint64_t handle, bool only_structured,
+        BDRVNBDState *s, uint64_t cookie, bool only_structured,
         int *request_ret, QEMUIOVector *qiov, void **payload, Error **errp)
 {
     int ret;
-    int i = HANDLE_TO_INDEX(s, handle);
+    int i = COOKIE_TO_INDEX(s, cookie);
     void *local_payload = NULL;
     NBDStructuredReplyChunk *chunk;

@@ -841,14 +841,14 @@ static coroutine_fn int nbd_co_do_receive_one_chunk(
     }
     *request_ret = 0;

-    ret = nbd_receive_replies(s, handle);
+    ret = nbd_receive_replies(s, cookie);
     if (ret < 0) {
         error_setg(errp, "Connection closed");
         return -EIO;
     }
     assert(s->ioc);

-    assert(s->reply.handle == handle);
+    assert(s->reply.cookie == cookie);

     if (nbd_reply_is_simple(&s->reply)) {
         if (only_structured) {
@@ -918,11 +918,11 @@ static coroutine_fn int nbd_co_do_receive_one_chunk(
  * Return value is a fatal error code or normal nbd reply error code
  */
 static coroutine_fn int nbd_co_receive_one_chunk(
-        BDRVNBDState *s, uint64_t handle, bool only_structured,
+        BDRVNBDState *s, uint64_t cookie, bool only_structured,
         int *request_ret, QEMUIOVector *qiov, NBDReply *reply, void **payload,
         Error **errp)
 {
-    int ret = nbd_co_do_receive_one_chunk(s, handle, only_structured,
+    int ret = nbd_co_do_receive_one_chunk(s, cookie, only_structured,
                                           request_ret, qiov, payload, errp);

     if (ret < 0) {
@@ -932,7 +932,7 @@ static coroutine_fn int nbd_co_receive_one_chunk(
         /* For assert at loop start in nbd_connection_entry */
         *reply = s->reply;
     }
-    s->reply.handle = 0;
+    s->reply.cookie = 0;

     nbd_recv_coroutines_wake(s);

@@ -975,10 +975,10 @@ static void nbd_iter_request_error(NBDReplyChunkIter *iter, int ret)
  * NBD_FOREACH_REPLY_CHUNK
  * The pointer stored in @payload requires g_free() to free it.
  */
-#define NBD_FOREACH_REPLY_CHUNK(s, iter, handle, structured, \
+#define NBD_FOREACH_REPLY_CHUNK(s, iter, cookie, structured, \
                                 qiov, reply, payload) \
     for (iter = (NBDReplyChunkIter) { .only_structured = structured }; \
-         nbd_reply_chunk_iter_receive(s, &iter, handle, qiov, reply, payload);)
+         nbd_reply_chunk_iter_receive(s, &iter, cookie, qiov, reply, payload);)

 /*
  * nbd_reply_chunk_iter_receive
@@ -986,7 +986,7 @@ static void nbd_iter_request_error(NBDReplyChunkIter *iter, int ret)
  */
 static bool coroutine_fn nbd_reply_chunk_iter_receive(BDRVNBDState *s,
                                                       NBDReplyChunkIter *iter,
-                                                      uint64_t handle,
+                                                      uint64_t cookie,
                                                       QEMUIOVector *qiov,
                                                       NBDReply *reply,
                                                       void **payload)
@@ -1005,7 +1005,7 @@ static bool coroutine_fn nbd_reply_chunk_iter_receive(BDRVNBDState *s,
         reply = &local_reply;
     }

-    ret = nbd_co_receive_one_chunk(s, handle, iter->only_structured,
+    ret = nbd_co_receive_one_chunk(s, cookie, iter->only_structured,
                                    &request_ret, qiov, reply, payload,
                                    &local_err);
     if (ret < 0) {
@@ -1038,7 +1038,7 @@ static bool coroutine_fn nbd_reply_chunk_iter_receive(BDRVNBDState *s,

 break_loop:
     qemu_mutex_lock(&s->requests_lock);
-    s->requests[HANDLE_TO_INDEX(s, handle)].coroutine = NULL;
+    s->requests[COOKIE_TO_INDEX(s, cookie)].coroutine = NULL;
     s->in_flight--;
     qemu_co_queue_next(&s->free_sema);
     qemu_mutex_unlock(&s->requests_lock);
@@ -1046,12 +1046,13 @@ break_loop:
     return false;
 }

-static int coroutine_fn nbd_co_receive_return_code(BDRVNBDState *s, uint64_t handle,
-                                                   int *request_ret, Error **errp)
+static int coroutine_fn
+nbd_co_receive_return_code(BDRVNBDState *s, uint64_t cookie,
+                           int *request_ret, Error **errp)
 {
     NBDReplyChunkIter iter;

-    NBD_FOREACH_REPLY_CHUNK(s, iter, handle, false, NULL, NULL, NULL) {
+    NBD_FOREACH_REPLY_CHUNK(s, iter, cookie, false, NULL, NULL, NULL) {
         /* nbd_reply_chunk_iter_receive does all the work */
     }

@@ -1060,16 +1061,17 @@ static int coroutine_fn nbd_co_receive_return_code(BDRVNBDState *s, uint64_t han
     return iter.ret;
 }

-static int coroutine_fn nbd_co_receive_cmdread_reply(BDRVNBDState *s, uint64_t handle,
-                                                     uint64_t offset, QEMUIOVector *qiov,
-                                                     int *request_ret, Error **errp)
+static int coroutine_fn
+nbd_co_receive_cmdread_reply(BDRVNBDState *s, uint64_t cookie,
+                             uint64_t offset, QEMUIOVector *qiov,
+                             int *request_ret, Error **errp)
 {
     NBDReplyChunkIter iter;
     NBDReply reply;
     void *payload = NULL;
     Error *local_err = NULL;

-    NBD_FOREACH_REPLY_CHUNK(s, iter, handle, s->info.structured_reply,
+    NBD_FOREACH_REPLY_CHUNK(s, iter, cookie, s->info.structured_reply,
                             qiov, &reply, &payload)
     {
         int ret;
@@ -1112,10 +1114,10 @@ static int coroutine_fn nbd_co_receive_cmdread_reply(BDRVNBDState *s, uint64_t h
     return iter.ret;
 }

-static int coroutine_fn nbd_co_receive_blockstatus_reply(BDRVNBDState *s,
-                                                         uint64_t handle, uint64_t length,
-                                                         NBDExtent *extent,
-                                                         int *request_ret, Error **errp)
+static int coroutine_fn
+nbd_co_receive_blockstatus_reply(BDRVNBDState *s, uint64_t cookie,
+                                 uint64_t length, NBDExtent *extent,
+                                 int *request_ret, Error **errp)
 {
     NBDReplyChunkIter iter;
     NBDReply reply;
@@ -1124,7 +1126,7 @@ static int coroutine_fn nbd_co_receive_blockstatus_reply(BDRVNBDState *s,
     bool received = false;

     assert(!extent->length);
-    NBD_FOREACH_REPLY_CHUNK(s, iter, handle, false, NULL, &reply, &payload) {
+    NBD_FOREACH_REPLY_CHUNK(s, iter, cookie, false, NULL, &reply, &payload) {
         int ret;
         NBDStructuredReplyChunk *chunk = &reply.structured;

@@ -1194,11 +1196,11 @@ nbd_co_request(BlockDriverState *bs, NBDRequest *request,
             continue;
         }

-        ret = nbd_co_receive_return_code(s, request->handle,
+        ret = nbd_co_receive_return_code(s, request->cookie,
                                          &request_ret, &local_err);
         if (local_err) {
             trace_nbd_co_request_fail(request->from, request->len,
-                                      request->handle, request->flags,
+                                      request->cookie, request->flags,
                                       request->type,
                                       nbd_cmd_lookup(request->type),
                                       ret, error_get_pretty(local_err));
@@ -1253,10 +1255,10 @@ nbd_client_co_preadv(BlockDriverState *bs, int64_t offset, int64_t bytes,
             continue;
         }

-        ret = nbd_co_receive_cmdread_reply(s, request.handle, offset, qiov,
+        ret = nbd_co_receive_cmdread_reply(s, request.cookie, offset, qiov,
                                            &request_ret, &local_err);
         if (local_err) {
-            trace_nbd_co_request_fail(request.from, request.len, request.handle,
+            trace_nbd_co_request_fail(request.from, request.len, request.cookie,
                                       request.flags, request.type,
                                       nbd_cmd_lookup(request.type),
                                       ret, error_get_pretty(local_err));
@@ -1411,11 +1413,11 @@ static int coroutine_fn GRAPH_RDLOCK nbd_client_co_block_status(
             continue;
         }

-        ret = nbd_co_receive_blockstatus_reply(s, request.handle, bytes,
+        ret = nbd_co_receive_blockstatus_reply(s, request.cookie, bytes,
                                                &extent, &request_ret,
                                                &local_err);
         if (local_err) {
-            trace_nbd_co_request_fail(request.from, request.len, request.handle,
+            trace_nbd_co_request_fail(request.from, request.len, request.cookie,
                                       request.flags, request.type,
                                       nbd_cmd_lookup(request.type),
                                       ret, error_get_pretty(local_err));
diff --git a/nbd/client.c b/nbd/client.c
index ff75722e487..ea3590ca3d0 100644
--- a/nbd/client.c
+++ b/nbd/client.c
@@ -1,5 +1,5 @@
 /*
- *  Copyright (C) 2016-2019 Red Hat, Inc.
+ *  Copyright Red Hat
  *  Copyright (C) 2005  Anthony Liguori <anthony@codemonkey.ws>
  *
  *  Network Block Device Client Side
@@ -1350,14 +1350,14 @@ int nbd_send_request(QIOChannel *ioc, NBDRequest *request)
 {
     uint8_t buf[NBD_REQUEST_SIZE];

-    trace_nbd_send_request(request->from, request->len, request->handle,
+    trace_nbd_send_request(request->from, request->len, request->cookie,
                            request->flags, request->type,
                            nbd_cmd_lookup(request->type));

     stl_be_p(buf, NBD_REQUEST_MAGIC);
     stw_be_p(buf + 4, request->flags);
     stw_be_p(buf + 6, request->type);
-    stq_be_p(buf + 8, request->handle);
+    stq_be_p(buf + 8, request->cookie);
     stq_be_p(buf + 16, request->from);
     stl_be_p(buf + 24, request->len);

@@ -1383,7 +1383,7 @@ static int nbd_receive_simple_reply(QIOChannel *ioc, NBDSimpleReply *reply,
     }

     reply->error = be32_to_cpu(reply->error);
-    reply->handle = be64_to_cpu(reply->handle);
+    reply->cookie = be64_to_cpu(reply->cookie);

     return 0;
 }
@@ -1410,7 +1410,7 @@ static int nbd_receive_structured_reply_chunk(QIOChannel *ioc,

     chunk->flags = be16_to_cpu(chunk->flags);
     chunk->type = be16_to_cpu(chunk->type);
-    chunk->handle = be64_to_cpu(chunk->handle);
+    chunk->cookie = be64_to_cpu(chunk->cookie);
     chunk->length = be32_to_cpu(chunk->length);

     return 0;
@@ -1487,7 +1487,7 @@ int coroutine_fn nbd_receive_reply(BlockDriverState *bs, QIOChannel *ioc,
         }
         trace_nbd_receive_simple_reply(reply->simple.error,
                                        nbd_err_lookup(reply->simple.error),
-                                       reply->handle);
+                                       reply->cookie);
         break;
     case NBD_STRUCTURED_REPLY_MAGIC:
         ret = nbd_receive_structured_reply_chunk(ioc, &reply->structured, errp);
@@ -1497,7 +1497,7 @@ int coroutine_fn nbd_receive_reply(BlockDriverState *bs, QIOChannel *ioc,
         type = nbd_reply_type_lookup(reply->structured.type);
         trace_nbd_receive_structured_reply_chunk(reply->structured.flags,
                                                  reply->structured.type, type,
-                                                 reply->structured.handle,
+                                                 reply->structured.cookie,
                                                  reply->structured.length);
         break;
     default:
diff --git a/nbd/server.c b/nbd/server.c
index 26b27d69202..8486b64b15d 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -1428,7 +1428,7 @@ static int coroutine_fn nbd_receive_request(NBDClient *client, NBDRequest *reque
        [ 0 ..  3]   magic   (NBD_REQUEST_MAGIC)
        [ 4 ..  5]   flags   (NBD_CMD_FLAG_FUA, ...)
        [ 6 ..  7]   type    (NBD_CMD_READ, ...)
-       [ 8 .. 15]   handle
+       [ 8 .. 15]   cookie
        [16 .. 23]   from
        [24 .. 27]   len
      */
@@ -1436,7 +1436,7 @@ static int coroutine_fn nbd_receive_request(NBDClient *client, NBDRequest *reque
     magic = ldl_be_p(buf);
     request->flags  = lduw_be_p(buf + 4);
     request->type   = lduw_be_p(buf + 6);
-    request->handle = ldq_be_p(buf + 8);
+    request->cookie = ldq_be_p(buf + 8);
     request->from   = ldq_be_p(buf + 16);
     request->len    = ldl_be_p(buf + 24);

@@ -1885,11 +1885,11 @@ static int coroutine_fn nbd_co_send_iov(NBDClient *client, struct iovec *iov,
 }

 static inline void set_be_simple_reply(NBDSimpleReply *reply, uint64_t error,
-                                       uint64_t handle)
+                                       uint64_t cookie)
 {
     stl_be_p(&reply->magic, NBD_SIMPLE_REPLY_MAGIC);
     stl_be_p(&reply->error, error);
-    stq_be_p(&reply->handle, handle);
+    stq_be_p(&reply->cookie, cookie);
 }

 static int coroutine_fn nbd_co_send_simple_reply(NBDClient *client,
@@ -1908,9 +1908,9 @@ static int coroutine_fn nbd_co_send_simple_reply(NBDClient *client,

     assert(!len || !nbd_err);
     assert(!client->structured_reply || request->type != NBD_CMD_READ);
-    trace_nbd_co_send_simple_reply(request->handle, nbd_err,
+    trace_nbd_co_send_simple_reply(request->cookie, nbd_err,
                                    nbd_err_lookup(nbd_err), len);
-    set_be_simple_reply(&reply, nbd_err, request->handle);
+    set_be_simple_reply(&reply, nbd_err, request->cookie);

     return nbd_co_send_iov(client, iov, 2, errp);
 }
@@ -1940,7 +1940,7 @@ static inline void set_be_chunk(NBDClient *client, struct iovec *iov,
     stl_be_p(&chunk->magic, NBD_STRUCTURED_REPLY_MAGIC);
     stw_be_p(&chunk->flags, flags);
     stw_be_p(&chunk->type, type);
-    stq_be_p(&chunk->handle, request->handle);
+    stq_be_p(&chunk->cookie, request->cookie);
     stl_be_p(&chunk->length, length);
 }

@@ -1953,10 +1953,9 @@ static int coroutine_fn nbd_co_send_chunk_done(NBDClient *client,
         {.iov_base = &hdr},
     };

-    trace_nbd_co_send_chunk_done(request->handle);
+    trace_nbd_co_send_chunk_done(request->cookie);
     set_be_chunk(client, iov, 1, NBD_REPLY_FLAG_DONE,
                  NBD_REPLY_TYPE_NONE, request);
-
     return nbd_co_send_iov(client, iov, 1, errp);
 }

@@ -1977,7 +1976,7 @@ static int coroutine_fn nbd_co_send_chunk_read(NBDClient *client,
     };

     assert(size);
-    trace_nbd_co_send_chunk_read(request->handle, offset, data, size);
+    trace_nbd_co_send_chunk_read(request->cookie, offset, data, size);
     set_be_chunk(client, iov, 3, final ? NBD_REPLY_FLAG_DONE : 0,
                  NBD_REPLY_TYPE_OFFSET_DATA, request);
     stq_be_p(&chunk.offset, offset);
@@ -2001,7 +2000,7 @@ static int coroutine_fn nbd_co_send_chunk_error(NBDClient *client,
     };

     assert(nbd_err);
-    trace_nbd_co_send_chunk_error(request->handle, nbd_err,
+    trace_nbd_co_send_chunk_error(request->cookie, nbd_err,
                                   nbd_err_lookup(nbd_err), msg ? msg : "");
     set_be_chunk(client, iov, 3, NBD_REPLY_FLAG_DONE,
                  NBD_REPLY_TYPE_ERROR, request);
@@ -2052,7 +2051,7 @@ static int coroutine_fn nbd_co_send_sparse_read(NBDClient *client,
                 {.iov_base = &chunk, .iov_len = sizeof(chunk)},
             };

-            trace_nbd_co_send_chunk_read_hole(request->handle,
+            trace_nbd_co_send_chunk_read_hole(request->cookie,
                                               offset + progress, pnum);
             set_be_chunk(client, iov, 2,
                          final ? NBD_REPLY_FLAG_DONE : 0,
@@ -2234,7 +2233,7 @@ nbd_co_send_extents(NBDClient *client, NBDRequest *request, NBDExtentArray *ea,

     nbd_extent_array_convert_to_be(ea);

-    trace_nbd_co_send_extents(request->handle, ea->count, context_id,
+    trace_nbd_co_send_extents(request->cookie, ea->count, context_id,
                               ea->total_length, last);
     set_be_chunk(client, iov, 3, last ? NBD_REPLY_FLAG_DONE : 0,
                  NBD_REPLY_TYPE_BLOCK_STATUS, request);
@@ -2337,7 +2336,7 @@ static int coroutine_fn nbd_co_receive_request(NBDRequestData *req, NBDRequest *
         return ret;
     }

-    trace_nbd_co_receive_request_decode_type(request->handle, request->type,
+    trace_nbd_co_receive_request_decode_type(request->cookie, request->type,
                                              nbd_cmd_lookup(request->type));

     if (request->type != NBD_CMD_WRITE) {
@@ -2378,7 +2377,7 @@ static int coroutine_fn nbd_co_receive_request(NBDRequestData *req, NBDRequest *
         }
         req->complete = true;

-        trace_nbd_co_receive_request_payload_received(request->handle,
+        trace_nbd_co_receive_request_payload_received(request->cookie,
                                                       request->len);
     }

diff --git a/nbd/trace-events b/nbd/trace-events
index 50ca05a9e22..f19a4d0db39 100644
--- a/nbd/trace-events
+++ b/nbd/trace-events
@@ -31,9 +31,9 @@ nbd_client_loop(void) "Doing NBD loop"
 nbd_client_loop_ret(int ret, const char *error) "NBD loop returned %d: %s"
 nbd_client_clear_queue(void) "Clearing NBD queue"
 nbd_client_clear_socket(void) "Clearing NBD socket"
-nbd_send_request(uint64_t from, uint32_t len, uint64_t handle, uint16_t flags, uint16_t type, const char *name) "Sending request to server: { .from = %" PRIu64", .len = %" PRIu32 ", .handle = %" PRIu64 ", .flags = 0x%" PRIx16 ", .type = %" PRIu16 " (%s) }"
-nbd_receive_simple_reply(int32_t error, const char *errname, uint64_t handle) "Got simple reply: { .error = %" PRId32 " (%s), handle = %" PRIu64" }"
-nbd_receive_structured_reply_chunk(uint16_t flags, uint16_t type, const char *name, uint64_t handle, uint32_t length) "Got structured reply chunk: { flags = 0x%" PRIx16 ", type = %d (%s), handle = %" PRIu64 ", length = %" PRIu32 " }"
+nbd_send_request(uint64_t from, uint32_t len, uint64_t cookie, uint16_t flags, uint16_t type, const char *name) "Sending request to server: { .from = %" PRIu64", .len = %" PRIu32 ", .cookie = %" PRIu64 ", .flags = 0x%" PRIx16 ", .type = %" PRIu16 " (%s) }"
+nbd_receive_simple_reply(int32_t error, const char *errname, uint64_t cookie) "Got simple reply: { .error = %" PRId32 " (%s), cookie = %" PRIu64" }"
+nbd_receive_structured_reply_chunk(uint16_t flags, uint16_t type, const char *name, uint64_t cookie, uint32_t length) "Got structured reply chunk: { flags = 0x%" PRIx16 ", type = %d (%s), cookie = %" PRIu64 ", length = %" PRIu32 " }"

 # common.c
 nbd_unknown_error(int err) "Squashing unexpected error %d to EINVAL"
@@ -63,14 +63,14 @@ nbd_negotiate_success(void) "Negotiation succeeded"
 nbd_receive_request(uint32_t magic, uint16_t flags, uint16_t type, uint64_t from, uint32_t len) "Got request: { magic = 0x%" PRIx32 ", .flags = 0x%" PRIx16 ", .type = 0x%" PRIx16 ", from = %" PRIu64 ", len = %" PRIu32 " }"
 nbd_blk_aio_attached(const char *name, void *ctx) "Export %s: Attaching clients to AIO context %p"
 nbd_blk_aio_detach(const char *name, void *ctx) "Export %s: Detaching clients from AIO context %p"
-nbd_co_send_simple_reply(uint64_t handle, uint32_t error, const char *errname, int len) "Send simple reply: handle = %" PRIu64 ", error = %" PRIu32 " (%s), len = %d"
-nbd_co_send_chunk_done(uint64_t handle) "Send structured reply done: handle = %" PRIu64
-nbd_co_send_chunk_read(uint64_t handle, uint64_t offset, void *data, size_t size) "Send structured read data reply: handle = %" PRIu64 ", offset = %" PRIu64 ", data = %p, len = %zu"
-nbd_co_send_chunk_read_hole(uint64_t handle, uint64_t offset, size_t size) "Send structured read hole reply: handle = %" PRIu64 ", offset = %" PRIu64 ", len = %zu"
-nbd_co_send_extents(uint64_t handle, unsigned int extents, uint32_t id, uint64_t length, int last) "Send block status reply: handle = %" PRIu64 ", extents = %u, context = %d (extents cover %" PRIu64 " bytes, last chunk = %d)"
-nbd_co_send_chunk_error(uint64_t handle, int err, const char *errname, const char *msg) "Send structured error reply: handle = %" PRIu64 ", error = %d (%s), msg = '%s'"
-nbd_co_receive_request_decode_type(uint64_t handle, uint16_t type, const char *name) "Decoding type: handle = %" PRIu64 ", type = %" PRIu16 " (%s)"
-nbd_co_receive_request_payload_received(uint64_t handle, uint32_t len) "Payload received: handle = %" PRIu64 ", len = %" PRIu32
+nbd_co_send_simple_reply(uint64_t cookie, uint32_t error, const char *errname, int len) "Send simple reply: cookie = %" PRIu64 ", error = %" PRIu32 " (%s), len = %d"
+nbd_co_send_chunk_done(uint64_t cookie) "Send structured reply done: cookie = %" PRIu64
+nbd_co_send_chunk_read(uint64_t cookie, uint64_t offset, void *data, size_t size) "Send structured read data reply: cookie = %" PRIu64 ", offset = %" PRIu64 ", data = %p, len = %zu"
+nbd_co_send_chunk_read_hole(uint64_t cookie, uint64_t offset, size_t size) "Send structured read hole reply: cookie = %" PRIu64 ", offset = %" PRIu64 ", len = %zu"
+nbd_co_send_extents(uint64_t cookie, unsigned int extents, uint32_t id, uint64_t length, int last) "Send block status reply: cookie = %" PRIu64 ", extents = %u, context = %d (extents cover %" PRIu64 " bytes, last chunk = %d)"
+nbd_co_send_chunk_error(uint64_t cookie, int err, const char *errname, const char *msg) "Send structured error reply: cookie = %" PRIu64 ", error = %d (%s), msg = '%s'"
+nbd_co_receive_request_decode_type(uint64_t cookie, uint16_t type, const char *name) "Decoding type: cookie = %" PRIu64 ", type = %" PRIu16 " (%s)"
+nbd_co_receive_request_payload_received(uint64_t cookie, uint32_t len) "Payload received: cookie = %" PRIu64 ", len = %" PRIu32
 nbd_co_receive_align_compliance(const char *op, uint64_t from, uint32_t len, uint32_t align) "client sent non-compliant unaligned %s request: from=0x%" PRIx64 ", len=0x%" PRIx32 ", align=0x%" PRIx32
 nbd_trip(void) "Reading request"

-- 
2.40.1



^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v4 06/24] nbd/client: Simplify cookie vs. index computation
  2023-06-08 13:56 [PATCH v4 00/24] qemu patches for 64-bit NBD extensions Eric Blake
                   ` (4 preceding siblings ...)
  2023-06-08 13:56 ` [PATCH v4 05/24] nbd: s/handle/cookie/ to match NBD spec Eric Blake
@ 2023-06-08 13:56 ` Eric Blake
  2023-06-12 14:27   ` Vladimir Sementsov-Ogievskiy
  2023-06-08 13:56 ` [PATCH v4 07/24] nbd/client: Add safety check on chunk payload length Eric Blake
                   ` (17 subsequent siblings)
  23 siblings, 1 reply; 62+ messages in thread
From: Eric Blake @ 2023-06-08 13:56 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, libguestfs, vsementsov, Kevin Wolf, Hanna Reitz

Our code relies on a sentinel cookie value of zero for deciding when a
packet has been handled, as well as relying on array indices between 0
and MAX_NBD_REQUESTS-1 for dereferencing purposes.  As long as we can
symmetrically convert between two forms, there is no reason to go with
the odd choice of using XOR with a random pointer, when we can instead
simplify the mappings with a mere offset of 1.

Signed-off-by: Eric Blake <eblake@redhat.com>
---

v4: new patch
---
 block/nbd.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/block/nbd.c b/block/nbd.c
index be3c46c6fee..5322e66166c 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -50,8 +50,8 @@
 #define EN_OPTSTR ":exportname="
 #define MAX_NBD_REQUESTS    16

-#define COOKIE_TO_INDEX(bs, cookie) ((cookie) ^ (uint64_t)(intptr_t)(bs))
-#define INDEX_TO_COOKIE(bs, index)  ((index)  ^ (uint64_t)(intptr_t)(bs))
+#define COOKIE_TO_INDEX(cookie) ((cookie) - 1)
+#define INDEX_TO_COOKIE(index)  ((index) + 1)

 typedef struct {
     Coroutine *coroutine;
@@ -420,7 +420,7 @@ static void coroutine_fn GRAPH_RDLOCK nbd_reconnect_attempt(BDRVNBDState *s)
 static coroutine_fn int nbd_receive_replies(BDRVNBDState *s, uint64_t cookie)
 {
     int ret;
-    uint64_t ind = COOKIE_TO_INDEX(s, cookie), ind2;
+    uint64_t ind = COOKIE_TO_INDEX(cookie), ind2;
     QEMU_LOCK_GUARD(&s->receive_mutex);

     while (true) {
@@ -435,7 +435,7 @@ static coroutine_fn int nbd_receive_replies(BDRVNBDState *s, uint64_t cookie)
              * woken by whoever set s->reply.cookie (or never wait in this
              * yield). So, we should not wake it here.
              */
-            ind2 = COOKIE_TO_INDEX(s, s->reply.cookie);
+            ind2 = COOKIE_TO_INDEX(s->reply.cookie);
             assert(!s->requests[ind2].receiving);

             s->requests[ind].receiving = true;
@@ -468,7 +468,7 @@ static coroutine_fn int nbd_receive_replies(BDRVNBDState *s, uint64_t cookie)
             nbd_channel_error(s, -EINVAL);
             return -EINVAL;
         }
-        ind2 = COOKIE_TO_INDEX(s, s->reply.cookie);
+        ind2 = COOKIE_TO_INDEX(s->reply.cookie);
         if (ind2 >= MAX_NBD_REQUESTS || !s->requests[ind2].coroutine) {
             nbd_channel_error(s, -EINVAL);
             return -EINVAL;
@@ -519,7 +519,7 @@ nbd_co_send_request(BlockDriverState *bs, NBDRequest *request,
     qemu_mutex_unlock(&s->requests_lock);

     qemu_co_mutex_lock(&s->send_mutex);
-    request->cookie = INDEX_TO_COOKIE(s, i);
+    request->cookie = INDEX_TO_COOKIE(i);

     assert(s->ioc);

@@ -832,7 +832,7 @@ static coroutine_fn int nbd_co_do_receive_one_chunk(
         int *request_ret, QEMUIOVector *qiov, void **payload, Error **errp)
 {
     int ret;
-    int i = COOKIE_TO_INDEX(s, cookie);
+    int i = COOKIE_TO_INDEX(cookie);
     void *local_payload = NULL;
     NBDStructuredReplyChunk *chunk;

@@ -1038,7 +1038,7 @@ static bool coroutine_fn nbd_reply_chunk_iter_receive(BDRVNBDState *s,

 break_loop:
     qemu_mutex_lock(&s->requests_lock);
-    s->requests[COOKIE_TO_INDEX(s, cookie)].coroutine = NULL;
+    s->requests[COOKIE_TO_INDEX(cookie)].coroutine = NULL;
     s->in_flight--;
     qemu_co_queue_next(&s->free_sema);
     qemu_mutex_unlock(&s->requests_lock);
-- 
2.40.1



^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v4 07/24] nbd/client: Add safety check on chunk payload length
  2023-06-08 13:56 [PATCH v4 00/24] qemu patches for 64-bit NBD extensions Eric Blake
                   ` (5 preceding siblings ...)
  2023-06-08 13:56 ` [PATCH v4 06/24] nbd/client: Simplify cookie vs. index computation Eric Blake
@ 2023-06-08 13:56 ` Eric Blake
  2023-06-08 13:56 ` [PATCH v4 08/24] nbd: Use enum for various negotiation modes Eric Blake
                   ` (16 subsequent siblings)
  23 siblings, 0 replies; 62+ messages in thread
From: Eric Blake @ 2023-06-08 13:56 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, libguestfs, vsementsov

Our existing use of structured replies either reads into a qiov capped
at 32M (NBD_CMD_READ) or caps allocation to 1000 bytes (see
NBD_MAX_MALLOC_PAYLOAD in block/nbd.c).  But the existing length
checks are rather late; if we encounter a buggy (or malicious) server
that sends a super-large payload length, we should drop the connection
right then rather than assuming the layer on top will be careful.
This becomes more important when we permit 64-bit lengths which are
even more likely to have the potential for attempted denial of service
abuse.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
---

v4: sink this later in series [Vladimir]
---
 nbd/client.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/nbd/client.c b/nbd/client.c
index ea3590ca3d0..1b5569556fe 100644
--- a/nbd/client.c
+++ b/nbd/client.c
@@ -1413,6 +1413,18 @@ static int nbd_receive_structured_reply_chunk(QIOChannel *ioc,
     chunk->cookie = be64_to_cpu(chunk->cookie);
     chunk->length = be32_to_cpu(chunk->length);

+    /*
+     * Because we use BLOCK_STATUS with REQ_ONE, and cap READ requests
+     * at 32M, no valid server should send us payload larger than
+     * this.  Even if we stopped using REQ_ONE, sane servers will cap
+     * the number of extents they return for block status.
+     */
+    if (chunk->length > NBD_MAX_BUFFER_SIZE + sizeof(NBDStructuredReadData)) {
+        error_setg(errp, "server chunk %" PRIu32 " (%s) payload is too long",
+                   chunk->type, nbd_rep_lookup(chunk->type));
+        return -EINVAL;
+    }
+
     return 0;
 }

-- 
2.40.1



^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v4 08/24] nbd: Use enum for various negotiation modes
  2023-06-08 13:56 [PATCH v4 00/24] qemu patches for 64-bit NBD extensions Eric Blake
                   ` (6 preceding siblings ...)
  2023-06-08 13:56 ` [PATCH v4 07/24] nbd/client: Add safety check on chunk payload length Eric Blake
@ 2023-06-08 13:56 ` Eric Blake
  2023-06-12 14:39   ` Vladimir Sementsov-Ogievskiy
  2023-06-08 13:56 ` [PATCH v4 09/24] nbd: Replace bool structured_reply with mode enum Eric Blake
                   ` (15 subsequent siblings)
  23 siblings, 1 reply; 62+ messages in thread
From: Eric Blake @ 2023-06-08 13:56 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, libguestfs, vsementsov, Kevin Wolf, Hanna Reitz

Deciphering the hard-coded list of integer return values from
nbd_start_negotiate() will only get more confusing when adding support
for 64-bit extended headers.  Better is to name things in an enum.
Although the function in question is private to client.c, putting the
enum in a public header and including an enum-to-string conversion
will allow its use in more places in upcoming patches.

The enum is intentionally laid out so that operators like <= can be
used to group multiple modes with similar characteristics, and where
the least powerful mode has value 0, even though this patch does not
exploit that.  No semantic change intended.

Signed-off-by: Eric Blake <eblake@redhat.com>
---

v4: new patch, expanding enum idea from v3 4/14
---
 include/block/nbd.h | 11 +++++++++++
 nbd/client.c        | 46 ++++++++++++++++++++++++---------------------
 nbd/common.c        | 17 +++++++++++++++++
 3 files changed, 53 insertions(+), 21 deletions(-)

diff --git a/include/block/nbd.h b/include/block/nbd.h
index 59db69bafa5..aba4279b56c 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -52,6 +52,16 @@ typedef struct NBDOptionReplyMetaContext {
     /* metadata context name follows */
 } QEMU_PACKED NBDOptionReplyMetaContext;

+/* Track results of negotiation */
+typedef enum NBDMode {
+    /* Keep this list in a continuum of increasing features. */
+    NBD_MODE_OLDSTYLE,     /* server lacks newstyle negotiation */
+    NBD_MODE_EXPORT_NAME,  /* newstyle but only OPT_EXPORT_NAME safe */
+    NBD_MODE_SIMPLE,       /* newstyle but only simple replies */
+    NBD_MODE_STRUCTURED,   /* newstyle, structured replies enabled */
+    /* TODO add NBD_MODE_EXTENDED */
+} NBDMode;
+
 /* Transmission phase structs
  *
  * Note: these are _NOT_ the same as the network representation of an NBD
@@ -404,6 +414,7 @@ const char *nbd_rep_lookup(uint32_t rep);
 const char *nbd_info_lookup(uint16_t info);
 const char *nbd_cmd_lookup(uint16_t info);
 const char *nbd_err_lookup(int err);
+const char *nbd_mode_lookup(NBDMode mode);

 /* nbd/client-connection.c */
 typedef struct NBDClientConnection NBDClientConnection;
diff --git a/nbd/client.c b/nbd/client.c
index 1b5569556fe..479208d5d9d 100644
--- a/nbd/client.c
+++ b/nbd/client.c
@@ -875,10 +875,7 @@ static int nbd_list_meta_contexts(QIOChannel *ioc,
  * Start the handshake to the server.  After a positive return, the server
  * is ready to accept additional NBD_OPT requests.
  * Returns: negative errno: failure talking to server
- *          0: server is oldstyle, must call nbd_negotiate_finish_oldstyle
- *          1: server is newstyle, but can only accept EXPORT_NAME
- *          2: server is newstyle, but lacks structured replies
- *          3: server is newstyle and set up for structured replies
+ *          non-negative: enum NBDMode describing server abilities
  */
 static int nbd_start_negotiate(AioContext *aio_context, QIOChannel *ioc,
                                QCryptoTLSCreds *tlscreds,
@@ -969,16 +966,16 @@ static int nbd_start_negotiate(AioContext *aio_context, QIOChannel *ioc,
                     return -EINVAL;
                 }
             }
-            return 2 + result;
+            return result ? NBD_MODE_STRUCTURED : NBD_MODE_SIMPLE;
         } else {
-            return 1;
+            return NBD_MODE_EXPORT_NAME;
         }
     } else if (magic == NBD_CLIENT_MAGIC) {
         if (tlscreds) {
             error_setg(errp, "Server does not support STARTTLS");
             return -EINVAL;
         }
-        return 0;
+        return NBD_MODE_OLDSTYLE;
     } else {
         error_setg(errp, "Bad server magic received: 0x%" PRIx64, magic);
         return -EINVAL;
@@ -1032,6 +1029,9 @@ int nbd_receive_negotiate(AioContext *aio_context, QIOChannel *ioc,

     result = nbd_start_negotiate(aio_context, ioc, tlscreds, hostname, outioc,
                                  info->structured_reply, &zeroes, errp);
+    if (result < 0) {
+        return result;
+    }

     info->structured_reply = false;
     info->base_allocation = false;
@@ -1039,8 +1039,8 @@ int nbd_receive_negotiate(AioContext *aio_context, QIOChannel *ioc,
         ioc = *outioc;
     }

-    switch (result) {
-    case 3: /* newstyle, with structured replies */
+    switch ((NBDMode)result) {
+    case NBD_MODE_STRUCTURED:
         info->structured_reply = true;
         if (base_allocation) {
             result = nbd_negotiate_simple_meta_context(ioc, info, errp);
@@ -1050,7 +1050,7 @@ int nbd_receive_negotiate(AioContext *aio_context, QIOChannel *ioc,
             info->base_allocation = result == 1;
         }
         /* fall through */
-    case 2: /* newstyle, try OPT_GO */
+    case NBD_MODE_SIMPLE:
         /* Try NBD_OPT_GO first - if it works, we are done (it
          * also gives us a good message if the server requires
          * TLS).  If it is not available, fall back to
@@ -1073,7 +1073,7 @@ int nbd_receive_negotiate(AioContext *aio_context, QIOChannel *ioc,
             return -EINVAL;
         }
         /* fall through */
-    case 1: /* newstyle, but limited to EXPORT_NAME */
+    case NBD_MODE_EXPORT_NAME:
         /* write the export name request */
         if (nbd_send_option_request(ioc, NBD_OPT_EXPORT_NAME, -1, info->name,
                                     errp) < 0) {
@@ -1089,7 +1089,7 @@ int nbd_receive_negotiate(AioContext *aio_context, QIOChannel *ioc,
             return -EINVAL;
         }
         break;
-    case 0: /* oldstyle, parse length and flags */
+    case NBD_MODE_OLDSTYLE:
         if (*info->name) {
             error_setg(errp, "Server does not support non-empty export names");
             return -EINVAL;
@@ -1099,7 +1099,7 @@ int nbd_receive_negotiate(AioContext *aio_context, QIOChannel *ioc,
         }
         break;
     default:
-        return result;
+        g_assert_not_reached();
     }

     trace_nbd_receive_negotiate_size_flags(info->size, info->flags);
@@ -1155,10 +1155,13 @@ int nbd_receive_export_list(QIOChannel *ioc, QCryptoTLSCreds *tlscreds,
     if (tlscreds && sioc) {
         ioc = sioc;
     }
+    if (result < 0) {
+        goto out;
+    }

-    switch (result) {
-    case 2:
-    case 3:
+    switch ((NBDMode)result) {
+    case NBD_MODE_SIMPLE:
+    case NBD_MODE_STRUCTURED:
         /* newstyle - use NBD_OPT_LIST to populate array, then try
          * NBD_OPT_INFO on each array member. If structured replies
          * are enabled, also try NBD_OPT_LIST_META_CONTEXT. */
@@ -1179,7 +1182,7 @@ int nbd_receive_export_list(QIOChannel *ioc, QCryptoTLSCreds *tlscreds,
             memset(&array[count - 1], 0, sizeof(*array));
             array[count - 1].name = name;
             array[count - 1].description = desc;
-            array[count - 1].structured_reply = result == 3;
+            array[count - 1].structured_reply = result == NBD_MODE_STRUCTURED;
         }

         for (i = 0; i < count; i++) {
@@ -1195,7 +1198,7 @@ int nbd_receive_export_list(QIOChannel *ioc, QCryptoTLSCreds *tlscreds,
                 break;
             }

-            if (result == 3 &&
+            if (result == NBD_MODE_STRUCTURED &&
                 nbd_list_meta_contexts(ioc, &array[i], errp) < 0) {
                 goto out;
             }
@@ -1204,11 +1207,12 @@ int nbd_receive_export_list(QIOChannel *ioc, QCryptoTLSCreds *tlscreds,
         /* Send NBD_OPT_ABORT as a courtesy before hanging up */
         nbd_send_opt_abort(ioc);
         break;
-    case 1: /* newstyle, but limited to EXPORT_NAME */
+    case NBD_MODE_EXPORT_NAME:
         error_setg(errp, "Server does not support export lists");
         /* We can't even send NBD_OPT_ABORT, so merely hang up */
         goto out;
-    case 0: /* oldstyle, parse length and flags */
+    case NBD_MODE_OLDSTYLE:
+        /* Lone export name is implied, but we can parse length and flags */
         array = g_new0(NBDExportInfo, 1);
         array->name = g_strdup("");
         count = 1;
@@ -1226,7 +1230,7 @@ int nbd_receive_export_list(QIOChannel *ioc, QCryptoTLSCreds *tlscreds,
         }
         break;
     default:
-        goto out;
+        g_assert_not_reached();
     }

     *info = array;
diff --git a/nbd/common.c b/nbd/common.c
index ddfe7d11837..989fbe54a19 100644
--- a/nbd/common.c
+++ b/nbd/common.c
@@ -248,3 +248,20 @@ int nbd_errno_to_system_errno(int err)
     }
     return ret;
 }
+
+
+const char *nbd_mode_lookup(NBDMode mode)
+{
+    switch (mode) {
+    case NBD_MODE_OLDSTYLE:
+        return "oldstyle";
+    case NBD_MODE_EXPORT_NAME:
+        return "export name only";
+    case NBD_MODE_SIMPLE:
+        return "simple headers";
+    case NBD_MODE_STRUCTURED:
+        return "structured replies";
+    default:
+        return "<unknown>";
+    }
+}
-- 
2.40.1



^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v4 09/24] nbd: Replace bool structured_reply with mode enum
  2023-06-08 13:56 [PATCH v4 00/24] qemu patches for 64-bit NBD extensions Eric Blake
                   ` (7 preceding siblings ...)
  2023-06-08 13:56 ` [PATCH v4 08/24] nbd: Use enum for various negotiation modes Eric Blake
@ 2023-06-08 13:56 ` Eric Blake
  2023-06-12 15:07   ` Vladimir Sementsov-Ogievskiy
  2023-06-08 13:56 ` [PATCH v4 10/24] nbd/client: Pass mode through to nbd_send_request Eric Blake
                   ` (14 subsequent siblings)
  23 siblings, 1 reply; 62+ messages in thread
From: Eric Blake @ 2023-06-08 13:56 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, libguestfs, vsementsov, Kevin Wolf, Hanna Reitz

The upcoming patches for 64-bit extensions requires various points in
the protocol to make decisions based on what was negotiated.  While we
could easily add a 'bool extended_headers' alongside the existing
'bool structured_reply', this does not scale well if more modes are
added in the future.  Better is to expose the mode enum added in the
previous patch out to a wider use in the code base.

Where the code previously checked for structured_reply being set or
clear, it now prefers checking for an inequality; this works because
the nodes are in a continuum of increasing abilities, and allows us to
touch fewer places if we ever insert other modes in the middle of the
enum.  There should be no semantic change in this patch.

Signed-off-by: Eric Blake <eblake@redhat.com>
---

v4: new patch, expanding enum idea from v3 4/14
---
 include/block/nbd.h     |  2 +-
 block/nbd.c             |  8 +++++---
 nbd/client-connection.c |  4 ++--
 nbd/client.c            | 18 +++++++++---------
 nbd/server.c            | 27 +++++++++++++++------------
 qemu-nbd.c              |  4 +++-
 6 files changed, 35 insertions(+), 28 deletions(-)

diff --git a/include/block/nbd.h b/include/block/nbd.h
index aba4279b56c..fea69ac24bb 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -304,7 +304,7 @@ typedef struct NBDExportInfo {

     /* In-out fields, set by client before nbd_receive_negotiate() and
      * updated by server results during nbd_receive_negotiate() */
-    bool structured_reply;
+    NBDMode mode; /* input maximum mode tolerated; output actual mode chosen */
     bool base_allocation; /* base:allocation context for NBD_CMD_BLOCK_STATUS */

     /* Set by server results during nbd_receive_negotiate() and
diff --git a/block/nbd.c b/block/nbd.c
index 5322e66166c..5f88f7a819b 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -464,7 +464,8 @@ static coroutine_fn int nbd_receive_replies(BDRVNBDState *s, uint64_t cookie)
             nbd_channel_error(s, ret);
             return ret;
         }
-        if (nbd_reply_is_structured(&s->reply) && !s->info.structured_reply) {
+        if (nbd_reply_is_structured(&s->reply) &&
+            s->info.mode < NBD_MODE_STRUCTURED) {
             nbd_channel_error(s, -EINVAL);
             return -EINVAL;
         }
@@ -867,7 +868,7 @@ static coroutine_fn int nbd_co_do_receive_one_chunk(
     }

     /* handle structured reply chunk */
-    assert(s->info.structured_reply);
+    assert(s->info.mode >= NBD_MODE_STRUCTURED);
     chunk = &s->reply.structured;

     if (chunk->type == NBD_REPLY_TYPE_NONE) {
@@ -1071,7 +1072,8 @@ nbd_co_receive_cmdread_reply(BDRVNBDState *s, uint64_t cookie,
     void *payload = NULL;
     Error *local_err = NULL;

-    NBD_FOREACH_REPLY_CHUNK(s, iter, cookie, s->info.structured_reply,
+    NBD_FOREACH_REPLY_CHUNK(s, iter, cookie,
+                            s->info.mode >= NBD_MODE_STRUCTURED,
                             qiov, &reply, &payload)
     {
         int ret;
diff --git a/nbd/client-connection.c b/nbd/client-connection.c
index 3d14296c042..13e4cb6684b 100644
--- a/nbd/client-connection.c
+++ b/nbd/client-connection.c
@@ -1,5 +1,5 @@
 /*
- * QEMU Block driver for  NBD
+ * QEMU Block driver for NBD
  *
  * Copyright (c) 2021 Virtuozzo International GmbH.
  *
@@ -93,7 +93,7 @@ NBDClientConnection *nbd_client_connection_new(const SocketAddress *saddr,
         .do_negotiation = do_negotiation,

         .initial_info.request_sizes = true,
-        .initial_info.structured_reply = true,
+        .initial_info.mode = NBD_MODE_STRUCTURED,
         .initial_info.base_allocation = true,
         .initial_info.x_dirty_bitmap = g_strdup(x_dirty_bitmap),
         .initial_info.name = g_strdup(export_name ?: "")
diff --git a/nbd/client.c b/nbd/client.c
index 479208d5d9d..faa054c4527 100644
--- a/nbd/client.c
+++ b/nbd/client.c
@@ -880,7 +880,7 @@ static int nbd_list_meta_contexts(QIOChannel *ioc,
 static int nbd_start_negotiate(AioContext *aio_context, QIOChannel *ioc,
                                QCryptoTLSCreds *tlscreds,
                                const char *hostname, QIOChannel **outioc,
-                               bool structured_reply, bool *zeroes,
+                               NBDMode max_mode, bool *zeroes,
                                Error **errp)
 {
     ERRP_GUARD();
@@ -958,7 +958,7 @@ static int nbd_start_negotiate(AioContext *aio_context, QIOChannel *ioc,
         if (fixedNewStyle) {
             int result = 0;

-            if (structured_reply) {
+            if (max_mode >= NBD_MODE_STRUCTURED) {
                 result = nbd_request_simple_option(ioc,
                                                    NBD_OPT_STRUCTURED_REPLY,
                                                    false, errp);
@@ -1028,20 +1028,19 @@ int nbd_receive_negotiate(AioContext *aio_context, QIOChannel *ioc,
     trace_nbd_receive_negotiate_name(info->name);

     result = nbd_start_negotiate(aio_context, ioc, tlscreds, hostname, outioc,
-                                 info->structured_reply, &zeroes, errp);
+                                 info->mode, &zeroes, errp);
     if (result < 0) {
         return result;
     }

-    info->structured_reply = false;
+    info->mode = result;
     info->base_allocation = false;
     if (tlscreds && *outioc) {
         ioc = *outioc;
     }

-    switch ((NBDMode)result) {
+    switch (info->mode) {
     case NBD_MODE_STRUCTURED:
-        info->structured_reply = true;
         if (base_allocation) {
             result = nbd_negotiate_simple_meta_context(ioc, info, errp);
             if (result < 0) {
@@ -1150,8 +1149,8 @@ int nbd_receive_export_list(QIOChannel *ioc, QCryptoTLSCreds *tlscreds,
     QIOChannel *sioc = NULL;

     *info = NULL;
-    result = nbd_start_negotiate(NULL, ioc, tlscreds, hostname, &sioc, true,
-                                 NULL, errp);
+    result = nbd_start_negotiate(NULL, ioc, tlscreds, hostname, &sioc,
+                                 NBD_MODE_STRUCTURED, NULL, errp);
     if (tlscreds && sioc) {
         ioc = sioc;
     }
@@ -1182,7 +1181,7 @@ int nbd_receive_export_list(QIOChannel *ioc, QCryptoTLSCreds *tlscreds,
             memset(&array[count - 1], 0, sizeof(*array));
             array[count - 1].name = name;
             array[count - 1].description = desc;
-            array[count - 1].structured_reply = result == NBD_MODE_STRUCTURED;
+            array[count - 1].mode = result;
         }

         for (i = 0; i < count; i++) {
@@ -1215,6 +1214,7 @@ int nbd_receive_export_list(QIOChannel *ioc, QCryptoTLSCreds *tlscreds,
         /* Lone export name is implied, but we can parse length and flags */
         array = g_new0(NBDExportInfo, 1);
         array->name = g_strdup("");
+        array->mode = NBD_MODE_OLDSTYLE;
         count = 1;

         if (nbd_negotiate_finish_oldstyle(ioc, array, errp) < 0) {
diff --git a/nbd/server.c b/nbd/server.c
index 8486b64b15d..bade4f7990c 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -143,7 +143,7 @@ struct NBDClient {

     uint32_t check_align; /* If non-zero, check for aligned client requests */

-    bool structured_reply;
+    NBDMode mode;
     NBDExportMetaContexts export_meta;

     uint32_t opt; /* Current option being negotiated */
@@ -502,7 +502,7 @@ static int nbd_negotiate_handle_export_name(NBDClient *client, bool no_zeroes,
     }

     myflags = client->exp->nbdflags;
-    if (client->structured_reply) {
+    if (client->mode >= NBD_MODE_STRUCTURED) {
         myflags |= NBD_FLAG_SEND_DF;
     }
     trace_nbd_negotiate_new_style_size_flags(client->exp->size, myflags);
@@ -687,7 +687,7 @@ static int nbd_negotiate_handle_info(NBDClient *client, Error **errp)

     /* Send NBD_INFO_EXPORT always */
     myflags = exp->nbdflags;
-    if (client->structured_reply) {
+    if (client->mode >= NBD_MODE_STRUCTURED) {
         myflags |= NBD_FLAG_SEND_DF;
     }
     trace_nbd_negotiate_new_style_size_flags(exp->size, myflags);
@@ -985,7 +985,8 @@ static int nbd_negotiate_meta_queries(NBDClient *client,
     size_t i;
     size_t count = 0;

-    if (client->opt == NBD_OPT_SET_META_CONTEXT && !client->structured_reply) {
+    if (client->opt == NBD_OPT_SET_META_CONTEXT &&
+        client->mode < NBD_MODE_STRUCTURED) {
         return nbd_opt_invalid(client, errp,
                                "request option '%s' when structured reply "
                                "is not negotiated",
@@ -1261,13 +1262,13 @@ static int nbd_negotiate_options(NBDClient *client, Error **errp)
             case NBD_OPT_STRUCTURED_REPLY:
                 if (length) {
                     ret = nbd_reject_length(client, false, errp);
-                } else if (client->structured_reply) {
+                } else if (client->mode >= NBD_MODE_STRUCTURED) {
                     ret = nbd_negotiate_send_rep_err(
                         client, NBD_REP_ERR_INVALID, errp,
                         "structured reply already negotiated");
                 } else {
                     ret = nbd_negotiate_send_rep(client, NBD_REP_ACK, errp);
-                    client->structured_reply = true;
+                    client->mode = NBD_MODE_STRUCTURED;
                 }
                 break;

@@ -1907,7 +1908,9 @@ static int coroutine_fn nbd_co_send_simple_reply(NBDClient *client,
     };

     assert(!len || !nbd_err);
-    assert(!client->structured_reply || request->type != NBD_CMD_READ);
+    assert(client->mode < NBD_MODE_STRUCTURED ||
+           (client->mode == NBD_MODE_STRUCTURED &&
+            request->type != NBD_CMD_READ));
     trace_nbd_co_send_simple_reply(request->cookie, nbd_err,
                                    nbd_err_lookup(nbd_err), len);
     set_be_simple_reply(&reply, nbd_err, request->cookie);
@@ -2409,7 +2412,7 @@ static int coroutine_fn nbd_co_receive_request(NBDRequestData *req, NBDRequest *
                                               client->check_align);
     }
     valid_flags = NBD_CMD_FLAG_FUA;
-    if (request->type == NBD_CMD_READ && client->structured_reply) {
+    if (request->type == NBD_CMD_READ && client->mode >= NBD_MODE_STRUCTURED) {
         valid_flags |= NBD_CMD_FLAG_DF;
     } else if (request->type == NBD_CMD_WRITE_ZEROES) {
         valid_flags |= NBD_CMD_FLAG_NO_HOLE | NBD_CMD_FLAG_FAST_ZERO;
@@ -2435,7 +2438,7 @@ static coroutine_fn int nbd_send_generic_reply(NBDClient *client,
                                                const char *error_msg,
                                                Error **errp)
 {
-    if (client->structured_reply && ret < 0) {
+    if (client->mode >= NBD_MODE_STRUCTURED && ret < 0) {
         return nbd_co_send_chunk_error(client, request, -ret, error_msg, errp);
     } else {
         return nbd_co_send_simple_reply(client, request, ret < 0 ? -ret : 0,
@@ -2463,8 +2466,8 @@ static coroutine_fn int nbd_do_cmd_read(NBDClient *client, NBDRequest *request,
         }
     }

-    if (client->structured_reply && !(request->flags & NBD_CMD_FLAG_DF) &&
-        request->len)
+    if (client->mode >= NBD_MODE_STRUCTURED &&
+        !(request->flags & NBD_CMD_FLAG_DF) && request->len)
     {
         return nbd_co_send_sparse_read(client, request, request->from,
                                        data, request->len, errp);
@@ -2476,7 +2479,7 @@ static coroutine_fn int nbd_do_cmd_read(NBDClient *client, NBDRequest *request,
                                       "reading from file failed", errp);
     }

-    if (client->structured_reply) {
+    if (client->mode >= NBD_MODE_STRUCTURED) {
         if (request->len) {
             return nbd_co_send_chunk_read(client, request, request->from, data,
                                           request->len, true, errp);
diff --git a/qemu-nbd.c b/qemu-nbd.c
index 4276163564b..3ddd0bf02b4 100644
--- a/qemu-nbd.c
+++ b/qemu-nbd.c
@@ -274,8 +274,10 @@ static void *show_parts(void *arg)

 static void *nbd_client_thread(void *arg)
 {
+    /* TODO: Revisit this if nbd.ko ever gains support for structured reply */
     char *device = arg;
-    NBDExportInfo info = { .request_sizes = false, .name = g_strdup("") };
+    NBDExportInfo info = { .request_sizes = false, .name = g_strdup(""),
+                           .mode = NBD_MODE_SIMPLE };
     QIOChannelSocket *sioc;
     int fd = -1;
     int ret = EXIT_FAILURE;
-- 
2.40.1



^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v4 10/24] nbd/client: Pass mode through to nbd_send_request
  2023-06-08 13:56 [PATCH v4 00/24] qemu patches for 64-bit NBD extensions Eric Blake
                   ` (8 preceding siblings ...)
  2023-06-08 13:56 ` [PATCH v4 09/24] nbd: Replace bool structured_reply with mode enum Eric Blake
@ 2023-06-08 13:56 ` Eric Blake
  2023-06-12 15:48   ` Vladimir Sementsov-Ogievskiy
  2023-06-08 13:56 ` [PATCH v4 11/24] nbd: Add types for extended headers Eric Blake
                   ` (13 subsequent siblings)
  23 siblings, 1 reply; 62+ messages in thread
From: Eric Blake @ 2023-06-08 13:56 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, libguestfs, vsementsov, Kevin Wolf, Hanna Reitz

Once the 64-bit headers extension is enabled, the data layout we send
over the wire for a client request depends on the mode negotiated with
the server.  Rather than adding a parameter to nbd_send_request, we
can add a member to struct NBDRequest, since it already does not
reflect on-wire format.  Some callers initialize it directly; many
others rely on a common initialization point during
nbd_co_send_request().  At this point, there is no semantic change.

Signed-off-by: Eric Blake <eblake@redhat.com>
---

v4: new patch, based on ideas in v3 4/14, but by modifying NBDRequest
instead of adding a parameter
---
 include/block/nbd.h | 12 +++++++-----
 block/nbd.c         |  5 +++--
 nbd/client.c        |  3 ++-
 3 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/include/block/nbd.h b/include/block/nbd.h
index fea69ac24bb..52420660a65 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -62,17 +62,19 @@ typedef enum NBDMode {
     /* TODO add NBD_MODE_EXTENDED */
 } NBDMode;

-/* Transmission phase structs
- *
- * Note: these are _NOT_ the same as the network representation of an NBD
- * request and reply!
+/* Transmission phase structs */
+
+/*
+ * Note: NBDRequest is _NOT_ the same as the network representation of an NBD
+ * request!
  */
 typedef struct NBDRequest {
     uint64_t cookie;
     uint64_t from;
     uint32_t len;
     uint16_t flags; /* NBD_CMD_FLAG_* */
-    uint16_t type; /* NBD_CMD_* */
+    uint16_t type;  /* NBD_CMD_* */
+    NBDMode mode;   /* Determines which network representation to use */
 } NBDRequest;

 typedef struct NBDSimpleReply {
diff --git a/block/nbd.c b/block/nbd.c
index 5f88f7a819b..ca5991f868a 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -339,7 +339,7 @@ int coroutine_fn nbd_co_do_establish_connection(BlockDriverState *bs,
          * We have connected, but must fail for other reasons.
          * Send NBD_CMD_DISC as a courtesy to the server.
          */
-        NBDRequest request = { .type = NBD_CMD_DISC };
+        NBDRequest request = { .type = NBD_CMD_DISC, .mode = s->info.mode };

         nbd_send_request(s->ioc, &request);

@@ -521,6 +521,7 @@ nbd_co_send_request(BlockDriverState *bs, NBDRequest *request,

     qemu_co_mutex_lock(&s->send_mutex);
     request->cookie = INDEX_TO_COOKIE(i);
+    request->mode = s->info.mode;

     assert(s->ioc);

@@ -1466,7 +1467,7 @@ static void nbd_yank(void *opaque)
 static void nbd_client_close(BlockDriverState *bs)
 {
     BDRVNBDState *s = (BDRVNBDState *)bs->opaque;
-    NBDRequest request = { .type = NBD_CMD_DISC };
+    NBDRequest request = { .type = NBD_CMD_DISC, .mode = s->info.mode };

     if (s->ioc) {
         nbd_send_request(s->ioc, &request);
diff --git a/nbd/client.c b/nbd/client.c
index faa054c4527..40a1eb72346 100644
--- a/nbd/client.c
+++ b/nbd/client.c
@@ -1224,7 +1224,7 @@ int nbd_receive_export_list(QIOChannel *ioc, QCryptoTLSCreds *tlscreds,
         /* Send NBD_CMD_DISC as a courtesy to the server, but ignore all
          * errors now that we have the information we wanted. */
         if (nbd_drop(ioc, 124, NULL) == 0) {
-            NBDRequest request = { .type = NBD_CMD_DISC };
+            NBDRequest request = { .type = NBD_CMD_DISC, .mode = result };

             nbd_send_request(ioc, &request);
         }
@@ -1354,6 +1354,7 @@ int nbd_send_request(QIOChannel *ioc, NBDRequest *request)
 {
     uint8_t buf[NBD_REQUEST_SIZE];

+    assert(request->mode <= NBD_MODE_STRUCTURED); /* TODO handle extended */
     trace_nbd_send_request(request->from, request->len, request->cookie,
                            request->flags, request->type,
                            nbd_cmd_lookup(request->type));
-- 
2.40.1



^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v4 11/24] nbd: Add types for extended headers
  2023-06-08 13:56 [PATCH v4 00/24] qemu patches for 64-bit NBD extensions Eric Blake
                   ` (9 preceding siblings ...)
  2023-06-08 13:56 ` [PATCH v4 10/24] nbd/client: Pass mode through to nbd_send_request Eric Blake
@ 2023-06-08 13:56 ` Eric Blake
  2023-06-12 16:11   ` Vladimir Sementsov-Ogievskiy
  2023-06-08 13:56 ` [PATCH v4 12/24] nbd: Prepare for 64-bit request effect lengths Eric Blake
                   ` (12 subsequent siblings)
  23 siblings, 1 reply; 62+ messages in thread
From: Eric Blake @ 2023-06-08 13:56 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, libguestfs, vsementsov, Kevin Wolf, Hanna Reitz

Add the constants and structs necessary for later patches to start
implementing the NBD_OPT_EXTENDED_HEADERS extension in both the client
and server, matching recent upstream nbd.git (through commit
e6f3b94a934).  This patch does not change any existing behavior, but
merely sets the stage for upcoming patches.

This patch does not change the status quo that neither the client nor
server use a packed-struct representation for the request header.
While most of the patch adds new types, there is also some churn for
renaming the existing NBDExtent to NBDExtent32 to contrast it with
NBDExtent64, which I thought was a nicer name than NBDExtentExt.

Signed-off-by: Eric Blake <eblake@redhat.com>
---

v4: Hoist earlier in series, tweak a few comments, defer docs/interop
change to when feature is actually turned on, NBDExtent rename, add
QEMU_BUG_BUILD_ON for sanity sake, hoist in block status payload bits
from v3 14/14; R-b dropped
---
 include/block/nbd.h | 124 +++++++++++++++++++++++++++++++-------------
 nbd/nbd-internal.h  |   3 +-
 block/nbd.c         |   6 +--
 nbd/common.c        |  12 ++++-
 nbd/server.c        |   6 +--
 5 files changed, 106 insertions(+), 45 deletions(-)

diff --git a/include/block/nbd.h b/include/block/nbd.h
index 52420660a65..f706e38dc72 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -59,7 +59,7 @@ typedef enum NBDMode {
     NBD_MODE_EXPORT_NAME,  /* newstyle but only OPT_EXPORT_NAME safe */
     NBD_MODE_SIMPLE,       /* newstyle but only simple replies */
     NBD_MODE_STRUCTURED,   /* newstyle, structured replies enabled */
-    /* TODO add NBD_MODE_EXTENDED */
+    NBD_MODE_EXTENDED,     /* newstyle, extended headers enabled */
 } NBDMode;

 /* Transmission phase structs */
@@ -92,20 +92,36 @@ typedef struct NBDStructuredReplyChunk {
     uint32_t length; /* length of payload */
 } QEMU_PACKED NBDStructuredReplyChunk;

+typedef struct NBDExtendedReplyChunk {
+    uint32_t magic;  /* NBD_EXTENDED_REPLY_MAGIC */
+    uint16_t flags;  /* combination of NBD_REPLY_FLAG_* */
+    uint16_t type;   /* NBD_REPLY_TYPE_* */
+    uint64_t cookie; /* request handle */
+    uint64_t offset; /* request offset */
+    uint64_t length; /* length of payload */
+} QEMU_PACKED NBDExtendedReplyChunk;
+
 typedef union NBDReply {
     NBDSimpleReply simple;
     NBDStructuredReplyChunk structured;
+    NBDExtendedReplyChunk extended;
     struct {
         /*
-         * @magic and @cookie fields have the same offset and size both in
-         * simple reply and structured reply chunk, so let them be accessible
-         * without ".simple." or ".structured." specification
+         * @magic and @cookie fields have the same offset and size in all
+         * forms of replies, so let them be accessible without ".simple.",
+         * ".structured.", or ".extended." specifications.
          */
         uint32_t magic;
         uint32_t _skip;
         uint64_t cookie;
-    } QEMU_PACKED;
+    };
 } NBDReply;
+QEMU_BUILD_BUG_ON(offsetof(NBDReply, simple.cookie) !=
+                  offsetof(NBDReply, cookie));
+QEMU_BUILD_BUG_ON(offsetof(NBDReply, structured.cookie) !=
+                  offsetof(NBDReply, cookie));
+QEMU_BUILD_BUG_ON(offsetof(NBDReply, extended.cookie) !=
+                  offsetof(NBDReply, cookie));

 /* Header of chunk for NBD_REPLY_TYPE_OFFSET_DATA */
 typedef struct NBDStructuredReadData {
@@ -132,14 +148,34 @@ typedef struct NBDStructuredError {
 typedef struct NBDStructuredMeta {
     /* header's length >= 12 (at least one extent) */
     uint32_t context_id;
-    /* extents follows */
+    /* NBDExtent32 extents[] follows, array length implied by header */
 } QEMU_PACKED NBDStructuredMeta;

-/* Extent chunk for NBD_REPLY_TYPE_BLOCK_STATUS */
-typedef struct NBDExtent {
+/* Extent array element for NBD_REPLY_TYPE_BLOCK_STATUS */
+typedef struct NBDExtent32 {
     uint32_t length;
     uint32_t flags; /* NBD_STATE_* */
-} QEMU_PACKED NBDExtent;
+} QEMU_PACKED NBDExtent32;
+
+/* Header of NBD_REPLY_TYPE_BLOCK_STATUS_EXT */
+typedef struct NBDExtendedMeta {
+    /* header's length >= 24 (at least one extent) */
+    uint32_t context_id;
+    uint32_t count; /* header length must be count * 16 + 8 */
+    /* NBDExtent64 extents[count] follows */
+} QEMU_PACKED NBDExtendedMeta;
+
+/* Extent array element for NBD_REPLY_TYPE_BLOCK_STATUS_EXT */
+typedef struct NBDExtent64 {
+    uint64_t length;
+    uint64_t flags; /* NBD_STATE_* */
+} QEMU_PACKED NBDExtent64;
+
+/* Client payload for limiting NBD_CMD_BLOCK_STATUS reply */
+typedef struct NBDBlockStatusPayload {
+    uint64_t effect_length;
+    /* uint32_t ids[] follows, array length implied by header */
+} QEMU_PACKED NBDBlockStatusPayload;

 /* Transmission (export) flags: sent from server to client during handshake,
    but describe what will happen during transmission */
@@ -157,20 +193,22 @@ enum {
     NBD_FLAG_SEND_RESIZE_BIT        =  9, /* Send resize */
     NBD_FLAG_SEND_CACHE_BIT         = 10, /* Send CACHE (prefetch) */
     NBD_FLAG_SEND_FAST_ZERO_BIT     = 11, /* FAST_ZERO flag for WRITE_ZEROES */
+    NBD_FLAG_BLOCK_STAT_PAYLOAD_BIT = 12, /* PAYLOAD flag for BLOCK_STATUS */
 };

-#define NBD_FLAG_HAS_FLAGS         (1 << NBD_FLAG_HAS_FLAGS_BIT)
-#define NBD_FLAG_READ_ONLY         (1 << NBD_FLAG_READ_ONLY_BIT)
-#define NBD_FLAG_SEND_FLUSH        (1 << NBD_FLAG_SEND_FLUSH_BIT)
-#define NBD_FLAG_SEND_FUA          (1 << NBD_FLAG_SEND_FUA_BIT)
-#define NBD_FLAG_ROTATIONAL        (1 << NBD_FLAG_ROTATIONAL_BIT)
-#define NBD_FLAG_SEND_TRIM         (1 << NBD_FLAG_SEND_TRIM_BIT)
-#define NBD_FLAG_SEND_WRITE_ZEROES (1 << NBD_FLAG_SEND_WRITE_ZEROES_BIT)
-#define NBD_FLAG_SEND_DF           (1 << NBD_FLAG_SEND_DF_BIT)
-#define NBD_FLAG_CAN_MULTI_CONN    (1 << NBD_FLAG_CAN_MULTI_CONN_BIT)
-#define NBD_FLAG_SEND_RESIZE       (1 << NBD_FLAG_SEND_RESIZE_BIT)
-#define NBD_FLAG_SEND_CACHE        (1 << NBD_FLAG_SEND_CACHE_BIT)
-#define NBD_FLAG_SEND_FAST_ZERO    (1 << NBD_FLAG_SEND_FAST_ZERO_BIT)
+#define NBD_FLAG_HAS_FLAGS          (1 << NBD_FLAG_HAS_FLAGS_BIT)
+#define NBD_FLAG_READ_ONLY          (1 << NBD_FLAG_READ_ONLY_BIT)
+#define NBD_FLAG_SEND_FLUSH         (1 << NBD_FLAG_SEND_FLUSH_BIT)
+#define NBD_FLAG_SEND_FUA           (1 << NBD_FLAG_SEND_FUA_BIT)
+#define NBD_FLAG_ROTATIONAL         (1 << NBD_FLAG_ROTATIONAL_BIT)
+#define NBD_FLAG_SEND_TRIM          (1 << NBD_FLAG_SEND_TRIM_BIT)
+#define NBD_FLAG_SEND_WRITE_ZEROES  (1 << NBD_FLAG_SEND_WRITE_ZEROES_BIT)
+#define NBD_FLAG_SEND_DF            (1 << NBD_FLAG_SEND_DF_BIT)
+#define NBD_FLAG_CAN_MULTI_CONN     (1 << NBD_FLAG_CAN_MULTI_CONN_BIT)
+#define NBD_FLAG_SEND_RESIZE        (1 << NBD_FLAG_SEND_RESIZE_BIT)
+#define NBD_FLAG_SEND_CACHE         (1 << NBD_FLAG_SEND_CACHE_BIT)
+#define NBD_FLAG_SEND_FAST_ZERO     (1 << NBD_FLAG_SEND_FAST_ZERO_BIT)
+#define NBD_FLAG_BLOCK_STAT_PAYLOAD (1 << NBD_FLAG_BLOCK_STAT_PAYLOAD_BIT)

 /* New-style handshake (global) flags, sent from server to client, and
    control what will happen during handshake phase. */
@@ -193,6 +231,7 @@ enum {
 #define NBD_OPT_STRUCTURED_REPLY  (8)
 #define NBD_OPT_LIST_META_CONTEXT (9)
 #define NBD_OPT_SET_META_CONTEXT  (10)
+#define NBD_OPT_EXTENDED_HEADERS  (11)

 /* Option reply types. */
 #define NBD_REP_ERR(value) ((UINT32_C(1) << 31) | (value))
@@ -210,6 +249,8 @@ enum {
 #define NBD_REP_ERR_UNKNOWN         NBD_REP_ERR(6)  /* Export unknown */
 #define NBD_REP_ERR_SHUTDOWN        NBD_REP_ERR(7)  /* Server shutting down */
 #define NBD_REP_ERR_BLOCK_SIZE_REQD NBD_REP_ERR(8)  /* Need INFO_BLOCK_SIZE */
+#define NBD_REP_ERR_TOO_BIG         NBD_REP_ERR(9)  /* Payload size overflow */
+#define NBD_REP_ERR_EXT_HEADER_REQD NBD_REP_ERR(10) /* Need extended headers */

 /* Info types, used during NBD_REP_INFO */
 #define NBD_INFO_EXPORT         0
@@ -218,12 +259,14 @@ enum {
 #define NBD_INFO_BLOCK_SIZE     3

 /* Request flags, sent from client to server during transmission phase */
-#define NBD_CMD_FLAG_FUA        (1 << 0) /* 'force unit access' during write */
-#define NBD_CMD_FLAG_NO_HOLE    (1 << 1) /* don't punch hole on zero run */
-#define NBD_CMD_FLAG_DF         (1 << 2) /* don't fragment structured read */
-#define NBD_CMD_FLAG_REQ_ONE    (1 << 3) /* only one extent in BLOCK_STATUS
-                                          * reply chunk */
-#define NBD_CMD_FLAG_FAST_ZERO  (1 << 4) /* fail if WRITE_ZEROES is not fast */
+#define NBD_CMD_FLAG_FUA         (1 << 0) /* 'force unit access' during write */
+#define NBD_CMD_FLAG_NO_HOLE     (1 << 1) /* don't punch hole on zero run */
+#define NBD_CMD_FLAG_DF          (1 << 2) /* don't fragment structured read */
+#define NBD_CMD_FLAG_REQ_ONE     (1 << 3) \
+    /* only one extent in BLOCK_STATUS reply chunk */
+#define NBD_CMD_FLAG_FAST_ZERO   (1 << 4) /* fail if WRITE_ZEROES is not fast */
+#define NBD_CMD_FLAG_PAYLOAD_LEN (1 << 5) \
+    /* length describes payload, not effect; only with ext header */

 /* Supported request types */
 enum {
@@ -249,22 +292,31 @@ enum {
  */
 #define NBD_MAX_STRING_SIZE 4096

-/* Two types of reply structures */
+/* Two types of request structures, a given client will only use 1 */
+#define NBD_REQUEST_MAGIC           0x25609513
+#define NBD_EXTENDED_REQUEST_MAGIC  0x21e41c71
+
+/*
+ * Three types of reply structures, but what a client expects depends
+ * on NBD_OPT_STRUCTURED_REPLY and NBD_OPT_EXTENDED_HEADERS.
+ */
 #define NBD_SIMPLE_REPLY_MAGIC      0x67446698
 #define NBD_STRUCTURED_REPLY_MAGIC  0x668e33ef
+#define NBD_EXTENDED_REPLY_MAGIC    0x6e8a278c

-/* Structured reply flags */
+/* Chunk reply flags (for structured and extended replies) */
 #define NBD_REPLY_FLAG_DONE          (1 << 0) /* This reply-chunk is last */

-/* Structured reply types */
+/* Chunk reply types */
 #define NBD_REPLY_ERR(value)         ((1 << 15) | (value))

-#define NBD_REPLY_TYPE_NONE          0
-#define NBD_REPLY_TYPE_OFFSET_DATA   1
-#define NBD_REPLY_TYPE_OFFSET_HOLE   2
-#define NBD_REPLY_TYPE_BLOCK_STATUS  5
-#define NBD_REPLY_TYPE_ERROR         NBD_REPLY_ERR(1)
-#define NBD_REPLY_TYPE_ERROR_OFFSET  NBD_REPLY_ERR(2)
+#define NBD_REPLY_TYPE_NONE              0
+#define NBD_REPLY_TYPE_OFFSET_DATA       1
+#define NBD_REPLY_TYPE_OFFSET_HOLE       2
+#define NBD_REPLY_TYPE_BLOCK_STATUS      5
+#define NBD_REPLY_TYPE_BLOCK_STATUS_EXT  6
+#define NBD_REPLY_TYPE_ERROR             NBD_REPLY_ERR(1)
+#define NBD_REPLY_TYPE_ERROR_OFFSET      NBD_REPLY_ERR(2)

 /* Extent flags for base:allocation in NBD_REPLY_TYPE_BLOCK_STATUS */
 #define NBD_STATE_HOLE (1 << 0)
diff --git a/nbd/nbd-internal.h b/nbd/nbd-internal.h
index df42fef7066..133b1d94b50 100644
--- a/nbd/nbd-internal.h
+++ b/nbd/nbd-internal.h
@@ -1,7 +1,7 @@
 /*
  * NBD Internal Declarations
  *
- * Copyright (C) 2016 Red Hat, Inc.
+ * Copyright Red Hat
  *
  * This work is licensed under the terms of the GNU GPL, version 2 or later.
  * See the COPYING file in the top-level directory.
@@ -44,7 +44,6 @@
 #define NBD_OLDSTYLE_NEGOTIATE_SIZE (8 + 8 + 8 + 4 + 124)

 #define NBD_INIT_MAGIC              0x4e42444d41474943LL /* ASCII "NBDMAGIC" */
-#define NBD_REQUEST_MAGIC           0x25609513
 #define NBD_OPTS_MAGIC              0x49484156454F5054LL /* ASCII "IHAVEOPT" */
 #define NBD_CLIENT_MAGIC            0x0000420281861253LL
 #define NBD_REP_MAGIC               0x0003e889045565a9LL
diff --git a/block/nbd.c b/block/nbd.c
index ca5991f868a..c7581794873 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -611,7 +611,7 @@ static int nbd_parse_offset_hole_payload(BDRVNBDState *s,
 static int nbd_parse_blockstatus_payload(BDRVNBDState *s,
                                          NBDStructuredReplyChunk *chunk,
                                          uint8_t *payload, uint64_t orig_length,
-                                         NBDExtent *extent, Error **errp)
+                                         NBDExtent32 *extent, Error **errp)
 {
     uint32_t context_id;

@@ -1119,7 +1119,7 @@ nbd_co_receive_cmdread_reply(BDRVNBDState *s, uint64_t cookie,

 static int coroutine_fn
 nbd_co_receive_blockstatus_reply(BDRVNBDState *s, uint64_t cookie,
-                                 uint64_t length, NBDExtent *extent,
+                                 uint64_t length, NBDExtent32 *extent,
                                  int *request_ret, Error **errp)
 {
     NBDReplyChunkIter iter;
@@ -1374,7 +1374,7 @@ static int coroutine_fn GRAPH_RDLOCK nbd_client_co_block_status(
         int64_t *pnum, int64_t *map, BlockDriverState **file)
 {
     int ret, request_ret;
-    NBDExtent extent = { 0 };
+    NBDExtent32 extent = { 0 };
     BDRVNBDState *s = (BDRVNBDState *)bs->opaque;
     Error *local_err = NULL;

diff --git a/nbd/common.c b/nbd/common.c
index 989fbe54a19..3247c1d618a 100644
--- a/nbd/common.c
+++ b/nbd/common.c
@@ -79,6 +79,8 @@ const char *nbd_opt_lookup(uint32_t opt)
         return "list meta context";
     case NBD_OPT_SET_META_CONTEXT:
         return "set meta context";
+    case NBD_OPT_EXTENDED_HEADERS:
+        return "extended headers";
     default:
         return "<unknown>";
     }
@@ -112,6 +114,10 @@ const char *nbd_rep_lookup(uint32_t rep)
         return "server shutting down";
     case NBD_REP_ERR_BLOCK_SIZE_REQD:
         return "block size required";
+    case NBD_REP_ERR_TOO_BIG:
+        return "option payload too big";
+    case NBD_REP_ERR_EXT_HEADER_REQD:
+        return "extended headers required";
     default:
         return "<unknown>";
     }
@@ -170,7 +176,9 @@ const char *nbd_reply_type_lookup(uint16_t type)
     case NBD_REPLY_TYPE_OFFSET_HOLE:
         return "hole";
     case NBD_REPLY_TYPE_BLOCK_STATUS:
-        return "block status";
+        return "block status (32-bit)";
+    case NBD_REPLY_TYPE_BLOCK_STATUS_EXT:
+        return "block status (64-bit)";
     case NBD_REPLY_TYPE_ERROR:
         return "generic error";
     case NBD_REPLY_TYPE_ERROR_OFFSET:
@@ -261,6 +269,8 @@ const char *nbd_mode_lookup(NBDMode mode)
         return "simple headers";
     case NBD_MODE_STRUCTURED:
         return "structured replies";
+    case NBD_MODE_EXTENDED:
+        return "extended headers";
     default:
         return "<unknown>";
     }
diff --git a/nbd/server.c b/nbd/server.c
index bade4f7990c..9b16f7e5405 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -2082,7 +2082,7 @@ static int coroutine_fn nbd_co_send_sparse_read(NBDClient *client,
 }

 typedef struct NBDExtentArray {
-    NBDExtent *extents;
+    NBDExtent32 *extents;
     unsigned int nb_alloc;
     unsigned int count;
     uint64_t total_length;
@@ -2095,7 +2095,7 @@ static NBDExtentArray *nbd_extent_array_new(unsigned int nb_alloc)
     NBDExtentArray *ea = g_new0(NBDExtentArray, 1);

     ea->nb_alloc = nb_alloc;
-    ea->extents = g_new(NBDExtent, nb_alloc);
+    ea->extents = g_new(NBDExtent32, nb_alloc);
     ea->can_add = true;

     return ea;
@@ -2158,7 +2158,7 @@ static int nbd_extent_array_add(NBDExtentArray *ea,
     }

     ea->total_length += length;
-    ea->extents[ea->count] = (NBDExtent) {.length = length, .flags = flags};
+    ea->extents[ea->count] = (NBDExtent32) {.length = length, .flags = flags};
     ea->count++;

     return 0;
-- 
2.40.1



^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v4 12/24] nbd: Prepare for 64-bit request effect lengths
  2023-06-08 13:56 [PATCH v4 00/24] qemu patches for 64-bit NBD extensions Eric Blake
                   ` (10 preceding siblings ...)
  2023-06-08 13:56 ` [PATCH v4 11/24] nbd: Add types for extended headers Eric Blake
@ 2023-06-08 13:56 ` Eric Blake
  2023-06-08 18:26   ` [Libguestfs] " Eric Blake
  2023-06-16 18:16   ` Vladimir Sementsov-Ogievskiy
  2023-06-08 13:56 ` [PATCH v4 13/24] nbd/server: Refactor handling of request payload Eric Blake
                   ` (11 subsequent siblings)
  23 siblings, 2 replies; 62+ messages in thread
From: Eric Blake @ 2023-06-08 13:56 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, libguestfs, vsementsov, Kevin Wolf, Hanna Reitz

Widen the length field of NBDRequest to 64-bits, although we can
assert that all current uses are still under 32 bits, because nothing
ever puts us into NBD_MODE_EXTENDED yet.  Thus no semantic change.  No
semantic change yet.

Signed-off-by: Eric Blake <eblake@redhat.com>
---

v4: split off enum changes to earlier patches [Vladimir]
---
 include/block/nbd.h |  4 ++--
 block/nbd.c         | 25 +++++++++++++++++++------
 nbd/client.c        |  1 +
 nbd/server.c        | 11 ++++++++---
 nbd/trace-events    |  8 ++++----
 5 files changed, 34 insertions(+), 15 deletions(-)

diff --git a/include/block/nbd.h b/include/block/nbd.h
index f706e38dc72..dc05f5981fb 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -70,8 +70,8 @@ typedef enum NBDMode {
  */
 typedef struct NBDRequest {
     uint64_t cookie;
-    uint64_t from;
-    uint32_t len;
+    uint64_t from;  /* Offset touched by the command */
+    uint64_t len;   /* Effect length; 32 bit limit without extended headers */
     uint16_t flags; /* NBD_CMD_FLAG_* */
     uint16_t type;  /* NBD_CMD_* */
     NBDMode mode;   /* Determines which network representation to use */
diff --git a/block/nbd.c b/block/nbd.c
index c7581794873..57123c17f94 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -1306,10 +1306,11 @@ nbd_client_co_pwrite_zeroes(BlockDriverState *bs, int64_t offset, int64_t bytes,
     NBDRequest request = {
         .type = NBD_CMD_WRITE_ZEROES,
         .from = offset,
-        .len = bytes,  /* .len is uint32_t actually */
+        .len = bytes,
     };

-    assert(bytes <= UINT32_MAX); /* rely on max_pwrite_zeroes */
+    /* rely on max_pwrite_zeroes */
+    assert(bytes <= UINT32_MAX || s->info.mode >= NBD_MODE_EXTENDED);

     assert(!(s->info.flags & NBD_FLAG_READ_ONLY));
     if (!(s->info.flags & NBD_FLAG_SEND_WRITE_ZEROES)) {
@@ -1356,10 +1357,11 @@ nbd_client_co_pdiscard(BlockDriverState *bs, int64_t offset, int64_t bytes)
     NBDRequest request = {
         .type = NBD_CMD_TRIM,
         .from = offset,
-        .len = bytes, /* len is uint32_t */
+        .len = bytes,
     };

-    assert(bytes <= UINT32_MAX); /* rely on max_pdiscard */
+    /* rely on max_pdiscard */
+    assert(bytes <= UINT32_MAX || s->info.mode >= NBD_MODE_EXTENDED);

     assert(!(s->info.flags & NBD_FLAG_READ_ONLY));
     if (!(s->info.flags & NBD_FLAG_SEND_TRIM) || !bytes) {
@@ -1381,8 +1383,7 @@ static int coroutine_fn GRAPH_RDLOCK nbd_client_co_block_status(
     NBDRequest request = {
         .type = NBD_CMD_BLOCK_STATUS,
         .from = offset,
-        .len = MIN(QEMU_ALIGN_DOWN(INT_MAX, bs->bl.request_alignment),
-                   MIN(bytes, s->info.size - offset)),
+        .len = MIN(bytes, s->info.size - offset),
         .flags = NBD_CMD_FLAG_REQ_ONE,
     };

@@ -1392,6 +1393,10 @@ static int coroutine_fn GRAPH_RDLOCK nbd_client_co_block_status(
         *file = bs;
         return BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID;
     }
+    if (s->info.mode < NBD_MODE_EXTENDED) {
+        request.len = MIN(QEMU_ALIGN_DOWN(INT_MAX, bs->bl.request_alignment),
+                          request.len);
+    }

     /*
      * Work around the fact that the block layer doesn't do
@@ -1956,6 +1961,14 @@ static void nbd_refresh_limits(BlockDriverState *bs, Error **errp)
     bs->bl.max_pwrite_zeroes = max;
     bs->bl.max_transfer = max;

+    /*
+     * Assume that if the server supports extended headers, it also
+     * supports unlimited size zero and trim commands.
+     */
+    if (s->info.mode >= NBD_MODE_EXTENDED) {
+        bs->bl.max_pdiscard = bs->bl.max_pwrite_zeroes = 0;
+    }
+
     if (s->info.opt_block &&
         s->info.opt_block > bs->bl.opt_transfer) {
         bs->bl.opt_transfer = s->info.opt_block;
diff --git a/nbd/client.c b/nbd/client.c
index 40a1eb72346..1495a9b0ab1 100644
--- a/nbd/client.c
+++ b/nbd/client.c
@@ -1355,6 +1355,7 @@ int nbd_send_request(QIOChannel *ioc, NBDRequest *request)
     uint8_t buf[NBD_REQUEST_SIZE];

     assert(request->mode <= NBD_MODE_STRUCTURED); /* TODO handle extended */
+    assert(request->len <= UINT32_MAX);
     trace_nbd_send_request(request->from, request->len, request->cookie,
                            request->flags, request->type,
                            nbd_cmd_lookup(request->type));
diff --git a/nbd/server.c b/nbd/server.c
index 9b16f7e5405..4ac05d0cd7b 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -1439,7 +1439,7 @@ static int coroutine_fn nbd_receive_request(NBDClient *client, NBDRequest *reque
     request->type   = lduw_be_p(buf + 6);
     request->cookie = ldq_be_p(buf + 8);
     request->from   = ldq_be_p(buf + 16);
-    request->len    = ldl_be_p(buf + 24);
+    request->len    = ldl_be_p(buf + 24); /* widen 32 to 64 bits */

     trace_nbd_receive_request(magic, request->flags, request->type,
                               request->from, request->len);
@@ -2357,7 +2357,7 @@ static int coroutine_fn nbd_co_receive_request(NBDRequestData *req, NBDRequest *
         request->type == NBD_CMD_CACHE)
     {
         if (request->len > NBD_MAX_BUFFER_SIZE) {
-            error_setg(errp, "len (%" PRIu32" ) is larger than max len (%u)",
+            error_setg(errp, "len (%" PRIu64" ) is larger than max len (%u)",
                        request->len, NBD_MAX_BUFFER_SIZE);
             return -EINVAL;
         }
@@ -2373,6 +2373,7 @@ static int coroutine_fn nbd_co_receive_request(NBDRequestData *req, NBDRequest *
     }

     if (request->type == NBD_CMD_WRITE) {
+        assert(request->len <= NBD_MAX_BUFFER_SIZE);
         if (nbd_read(client->ioc, req->data, request->len, "CMD_WRITE data",
                      errp) < 0)
         {
@@ -2394,7 +2395,7 @@ static int coroutine_fn nbd_co_receive_request(NBDRequestData *req, NBDRequest *
     }
     if (request->from > client->exp->size ||
         request->len > client->exp->size - request->from) {
-        error_setg(errp, "operation past EOF; From: %" PRIu64 ", Len: %" PRIu32
+        error_setg(errp, "operation past EOF; From: %" PRIu64 ", Len: %" PRIu64
                    ", Size: %" PRIu64, request->from, request->len,
                    client->exp->size);
         return (request->type == NBD_CMD_WRITE ||
@@ -2456,6 +2457,7 @@ static coroutine_fn int nbd_do_cmd_read(NBDClient *client, NBDRequest *request,
     NBDExport *exp = client->exp;

     assert(request->type == NBD_CMD_READ);
+    assert(request->len <= NBD_MAX_BUFFER_SIZE);

     /* XXX: NBD Protocol only documents use of FUA with WRITE */
     if (request->flags & NBD_CMD_FLAG_FUA) {
@@ -2506,6 +2508,7 @@ static coroutine_fn int nbd_do_cmd_cache(NBDClient *client, NBDRequest *request,
     NBDExport *exp = client->exp;

     assert(request->type == NBD_CMD_CACHE);
+    assert(request->len <= NBD_MAX_BUFFER_SIZE);

     ret = blk_co_preadv(exp->common.blk, request->from, request->len,
                         NULL, BDRV_REQ_COPY_ON_READ | BDRV_REQ_PREFETCH);
@@ -2539,6 +2542,7 @@ static coroutine_fn int nbd_handle_request(NBDClient *client,
         if (request->flags & NBD_CMD_FLAG_FUA) {
             flags |= BDRV_REQ_FUA;
         }
+        assert(request->len <= NBD_MAX_BUFFER_SIZE);
         ret = blk_co_pwrite(exp->common.blk, request->from, request->len, data,
                             flags);
         return nbd_send_generic_reply(client, request, ret,
@@ -2582,6 +2586,7 @@ static coroutine_fn int nbd_handle_request(NBDClient *client,
             return nbd_send_generic_reply(client, request, -EINVAL,
                                           "need non-zero length", errp);
         }
+        assert(request->len <= UINT32_MAX);
         if (client->export_meta.count) {
             bool dont_fragment = request->flags & NBD_CMD_FLAG_REQ_ONE;
             int contexts_remaining = client->export_meta.count;
diff --git a/nbd/trace-events b/nbd/trace-events
index f19a4d0db39..3338da2be2a 100644
--- a/nbd/trace-events
+++ b/nbd/trace-events
@@ -31,7 +31,7 @@ nbd_client_loop(void) "Doing NBD loop"
 nbd_client_loop_ret(int ret, const char *error) "NBD loop returned %d: %s"
 nbd_client_clear_queue(void) "Clearing NBD queue"
 nbd_client_clear_socket(void) "Clearing NBD socket"
-nbd_send_request(uint64_t from, uint32_t len, uint64_t cookie, uint16_t flags, uint16_t type, const char *name) "Sending request to server: { .from = %" PRIu64", .len = %" PRIu32 ", .cookie = %" PRIu64 ", .flags = 0x%" PRIx16 ", .type = %" PRIu16 " (%s) }"
+nbd_send_request(uint64_t from, uint64_t len, uint64_t cookie, uint16_t flags, uint16_t type, const char *name) "Sending request to server: { .from = %" PRIu64", .len = %" PRIu64 ", .cookie = %" PRIu64 ", .flags = 0x%" PRIx16 ", .type = %" PRIu16 " (%s) }"
 nbd_receive_simple_reply(int32_t error, const char *errname, uint64_t cookie) "Got simple reply: { .error = %" PRId32 " (%s), cookie = %" PRIu64" }"
 nbd_receive_structured_reply_chunk(uint16_t flags, uint16_t type, const char *name, uint64_t cookie, uint32_t length) "Got structured reply chunk: { flags = 0x%" PRIx16 ", type = %d (%s), cookie = %" PRIu64 ", length = %" PRIu32 " }"

@@ -60,7 +60,7 @@ nbd_negotiate_options_check_option(uint32_t option, const char *name) "Checking
 nbd_negotiate_begin(void) "Beginning negotiation"
 nbd_negotiate_new_style_size_flags(uint64_t size, unsigned flags) "advertising size %" PRIu64 " and flags 0x%x"
 nbd_negotiate_success(void) "Negotiation succeeded"
-nbd_receive_request(uint32_t magic, uint16_t flags, uint16_t type, uint64_t from, uint32_t len) "Got request: { magic = 0x%" PRIx32 ", .flags = 0x%" PRIx16 ", .type = 0x%" PRIx16 ", from = %" PRIu64 ", len = %" PRIu32 " }"
+nbd_receive_request(uint32_t magic, uint16_t flags, uint16_t type, uint64_t from, uint64_t len) "Got request: { magic = 0x%" PRIx32 ", .flags = 0x%" PRIx16 ", .type = 0x%" PRIx16 ", from = %" PRIu64 ", len = %" PRIu64 " }"
 nbd_blk_aio_attached(const char *name, void *ctx) "Export %s: Attaching clients to AIO context %p"
 nbd_blk_aio_detach(const char *name, void *ctx) "Export %s: Detaching clients from AIO context %p"
 nbd_co_send_simple_reply(uint64_t cookie, uint32_t error, const char *errname, int len) "Send simple reply: cookie = %" PRIu64 ", error = %" PRIu32 " (%s), len = %d"
@@ -70,8 +70,8 @@ nbd_co_send_chunk_read_hole(uint64_t cookie, uint64_t offset, size_t size) "Send
 nbd_co_send_extents(uint64_t cookie, unsigned int extents, uint32_t id, uint64_t length, int last) "Send block status reply: cookie = %" PRIu64 ", extents = %u, context = %d (extents cover %" PRIu64 " bytes, last chunk = %d)"
 nbd_co_send_chunk_error(uint64_t cookie, int err, const char *errname, const char *msg) "Send structured error reply: cookie = %" PRIu64 ", error = %d (%s), msg = '%s'"
 nbd_co_receive_request_decode_type(uint64_t cookie, uint16_t type, const char *name) "Decoding type: cookie = %" PRIu64 ", type = %" PRIu16 " (%s)"
-nbd_co_receive_request_payload_received(uint64_t cookie, uint32_t len) "Payload received: cookie = %" PRIu64 ", len = %" PRIu32
-nbd_co_receive_align_compliance(const char *op, uint64_t from, uint32_t len, uint32_t align) "client sent non-compliant unaligned %s request: from=0x%" PRIx64 ", len=0x%" PRIx32 ", align=0x%" PRIx32
+nbd_co_receive_request_payload_received(uint64_t cookie, uint64_t len) "Payload received: cookie = %" PRIu64 ", len = %" PRIu64
+nbd_co_receive_align_compliance(const char *op, uint64_t from, uint64_t len, uint32_t align) "client sent non-compliant unaligned %s request: from=0x%" PRIx64 ", len=0x%" PRIx64 ", align=0x%" PRIx32
 nbd_trip(void) "Reading request"

 # client-connection.c
-- 
2.40.1



^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v4 13/24] nbd/server: Refactor handling of request payload
  2023-06-08 13:56 [PATCH v4 00/24] qemu patches for 64-bit NBD extensions Eric Blake
                   ` (11 preceding siblings ...)
  2023-06-08 13:56 ` [PATCH v4 12/24] nbd: Prepare for 64-bit request effect lengths Eric Blake
@ 2023-06-08 13:56 ` Eric Blake
  2023-06-08 18:29   ` [Libguestfs] " Eric Blake
  2023-06-08 13:56 ` [PATCH v4 14/24] nbd/server: Prepare to receive extended header requests Eric Blake
                   ` (10 subsequent siblings)
  23 siblings, 1 reply; 62+ messages in thread
From: Eric Blake @ 2023-06-08 13:56 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, libguestfs, vsementsov

Upcoming additions to support NBD 64-bit effect lengths allow for the
possibility to distinguish between payload length (capped at 32M) and
effect length (up to 63 bits).  Without that extension, only the
NBD_CMD_WRITE request has a payload; but with the extension, it makes
sense to allow at least NBD_CMD_BLOCK_STATUS to have both a payload
and effect length (where the payload is a limited-size struct that in
turns gives the real effect length as well as a subset of known ids
for which status is requested).  Other future NBD commands may also
have a request payload, so the 64-bit extension introduces a new
NBD_CMD_FLAG_PAYLOAD_LEN that distinguishes between whether the header
length is a payload length or an effect length, rather than
hard-coding the decision based on the command.  Note that we do not
support the payload version of BLOCK_STATUS yet.

For this patch, no semantic change is intended for a compliant client.
For a non-compliant client, it is possible that the error behavior
changes (a different message, a change on whether the connection is
killed or remains alive for the next command, or so forth), in part
because req->complete is set later on some paths, but all errors
should still be handled gracefully.

Signed-off-by: Eric Blake <eblake@redhat.com>
---

v4: less indentation on several 'if's [Vladimir]
---
 nbd/server.c     | 76 ++++++++++++++++++++++++++++++------------------
 nbd/trace-events |  1 +
 2 files changed, 49 insertions(+), 28 deletions(-)

diff --git a/nbd/server.c b/nbd/server.c
index 4ac05d0cd7b..d7dc29f0445 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -2329,6 +2329,8 @@ static int coroutine_fn nbd_co_receive_request(NBDRequestData *req, NBDRequest *
                                                Error **errp)
 {
     NBDClient *client = req->client;
+    bool extended_with_payload;
+    unsigned payload_len = 0;
     int valid_flags;
     int ret;

@@ -2342,48 +2344,63 @@ static int coroutine_fn nbd_co_receive_request(NBDRequestData *req, NBDRequest *
     trace_nbd_co_receive_request_decode_type(request->cookie, request->type,
                                              nbd_cmd_lookup(request->type));

-    if (request->type != NBD_CMD_WRITE) {
-        /* No payload, we are ready to read the next request.  */
-        req->complete = true;
-    }
-
     if (request->type == NBD_CMD_DISC) {
         /* Special case: we're going to disconnect without a reply,
          * whether or not flags, from, or len are bogus */
+        req->complete = true;
         return -EIO;
     }

-    if (request->type == NBD_CMD_READ || request->type == NBD_CMD_WRITE ||
-        request->type == NBD_CMD_CACHE)
-    {
-        if (request->len > NBD_MAX_BUFFER_SIZE) {
-            error_setg(errp, "len (%" PRIu64" ) is larger than max len (%u)",
-                       request->len, NBD_MAX_BUFFER_SIZE);
-            return -EINVAL;
+    /* Payload and buffer handling. */
+    extended_with_payload = client->mode >= NBD_MODE_EXTENDED &&
+        (request->flags & NBD_CMD_FLAG_PAYLOAD_LEN);
+    if ((request->type == NBD_CMD_READ || request->type == NBD_CMD_WRITE ||
+         request->type == NBD_CMD_CACHE || extended_with_payload) &&
+        request->len > NBD_MAX_BUFFER_SIZE) {
+        error_setg(errp, "len (%" PRIu64" ) is larger than max len (%u)",
+                   request->len, NBD_MAX_BUFFER_SIZE);
+        return -EINVAL;
+    }
+
+    if (request->type == NBD_CMD_WRITE || extended_with_payload) {
+        payload_len = request->len;
+        if (request->type != NBD_CMD_WRITE) {
+            /*
+             * For now, we don't support payloads on other
+             * commands; but we can keep the connection alive.
+             */
+            request->len = 0;
+        } else if (client->mode >= NBD_MODE_EXTENDED &&
+                   !extended_with_payload) {
+            /* The client is noncompliant. Trace it, but proceed. */
+            trace_nbd_co_receive_ext_payload_compliance(request->from,
+                                                        request->len);
         }
+    }

-        if (request->type != NBD_CMD_CACHE) {
-            req->data = blk_try_blockalign(client->exp->common.blk,
-                                           request->len);
-            if (req->data == NULL) {
-                error_setg(errp, "No memory");
-                return -ENOMEM;
-            }
+    if (request->type == NBD_CMD_WRITE || request->type == NBD_CMD_READ) {
+        req->data = blk_try_blockalign(client->exp->common.blk,
+                                       request->len);
+        if (req->data == NULL) {
+            error_setg(errp, "No memory");
+            return -ENOMEM;
         }
     }

-    if (request->type == NBD_CMD_WRITE) {
-        assert(request->len <= NBD_MAX_BUFFER_SIZE);
-        if (nbd_read(client->ioc, req->data, request->len, "CMD_WRITE data",
-                     errp) < 0)
-        {
+    if (payload_len) {
+        if (req->data) {
+            ret = nbd_read(client->ioc, req->data, payload_len,
+                           "CMD_WRITE data", errp);
+        } else {
+            ret = nbd_drop(client->ioc, payload_len, errp);
+        }
+        if (ret < 0) {
             return -EIO;
         }
-        req->complete = true;
-
         trace_nbd_co_receive_request_payload_received(request->cookie,
-                                                      request->len);
+                                                      payload_len);
     }
+    req->complete = true;

     /* Sanity checks. */
     if (client->exp->nbdflags & NBD_FLAG_READ_ONLY &&
@@ -2413,7 +2430,10 @@ static int coroutine_fn nbd_co_receive_request(NBDRequestData *req, NBDRequest *
                                               client->check_align);
     }
     valid_flags = NBD_CMD_FLAG_FUA;
-    if (request->type == NBD_CMD_READ && client->mode >= NBD_MODE_STRUCTURED) {
+    if (request->type == NBD_CMD_WRITE && client->mode >= NBD_MODE_EXTENDED) {
+        valid_flags |= NBD_CMD_FLAG_PAYLOAD_LEN;
+    } else if (request->type == NBD_CMD_READ &&
+               client->mode >= NBD_MODE_STRUCTURED) {
         valid_flags |= NBD_CMD_FLAG_DF;
     } else if (request->type == NBD_CMD_WRITE_ZEROES) {
         valid_flags |= NBD_CMD_FLAG_NO_HOLE | NBD_CMD_FLAG_FAST_ZERO;
diff --git a/nbd/trace-events b/nbd/trace-events
index 3338da2be2a..6a34d7f027a 100644
--- a/nbd/trace-events
+++ b/nbd/trace-events
@@ -71,6 +71,7 @@ nbd_co_send_extents(uint64_t cookie, unsigned int extents, uint32_t id, uint64_t
 nbd_co_send_chunk_error(uint64_t cookie, int err, const char *errname, const char *msg) "Send structured error reply: cookie = %" PRIu64 ", error = %d (%s), msg = '%s'"
 nbd_co_receive_request_decode_type(uint64_t cookie, uint16_t type, const char *name) "Decoding type: cookie = %" PRIu64 ", type = %" PRIu16 " (%s)"
 nbd_co_receive_request_payload_received(uint64_t cookie, uint64_t len) "Payload received: cookie = %" PRIu64 ", len = %" PRIu64
+nbd_co_receive_ext_payload_compliance(uint64_t from, uint64_t len) "client sent non-compliant write without payload flag: from=0x%" PRIx64 ", len=0x%" PRIx64
 nbd_co_receive_align_compliance(const char *op, uint64_t from, uint64_t len, uint32_t align) "client sent non-compliant unaligned %s request: from=0x%" PRIx64 ", len=0x%" PRIx64 ", align=0x%" PRIx32
 nbd_trip(void) "Reading request"

-- 
2.40.1



^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v4 14/24] nbd/server: Prepare to receive extended header requests
  2023-06-08 13:56 [PATCH v4 00/24] qemu patches for 64-bit NBD extensions Eric Blake
                   ` (12 preceding siblings ...)
  2023-06-08 13:56 ` [PATCH v4 13/24] nbd/server: Refactor handling of request payload Eric Blake
@ 2023-06-08 13:56 ` Eric Blake
  2023-06-16 18:35   ` Vladimir Sementsov-Ogievskiy
  2023-06-08 13:56 ` [PATCH v4 15/24] nbd/server: Prepare to send extended header replies Eric Blake
                   ` (9 subsequent siblings)
  23 siblings, 1 reply; 62+ messages in thread
From: Eric Blake @ 2023-06-08 13:56 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, libguestfs, vsementsov

Although extended mode is not yet enabled, once we do turn it on, we
need to accept extended requests for all messages.  Previous patches
have already taken care of supporting 64-bit lengths, now we just need
to read it off the wire.

Note that this implementation will block indefinitely on a buggy
client that sends a non-extended payload (that is, we try to read a
full packet before we ever check the magic number, but a client that
mistakenly sends a simple request after negotiating extended headers
doesn't send us enough bytes), but it's no different from any other
client that stops talking to us partway through a packet and thus not
worth coding around.

Signed-off-by: Eric Blake <eblake@redhat.com>
---

v4: new patch, split out from v3 9/14
---
 nbd/nbd-internal.h |  5 ++++-
 nbd/server.c       | 43 ++++++++++++++++++++++++++++++-------------
 2 files changed, 34 insertions(+), 14 deletions(-)

diff --git a/nbd/nbd-internal.h b/nbd/nbd-internal.h
index 133b1d94b50..dfa02f77ee4 100644
--- a/nbd/nbd-internal.h
+++ b/nbd/nbd-internal.h
@@ -34,8 +34,11 @@
  * https://github.com/yoe/nbd/blob/master/doc/proto.md
  */

-/* Size of all NBD_OPT_*, without payload */
+/* Size of all compact NBD_CMD_*, without payload */
 #define NBD_REQUEST_SIZE            (4 + 2 + 2 + 8 + 8 + 4)
+/* Size of all extended NBD_CMD_*, without payload */
+#define NBD_EXTENDED_REQUEST_SIZE   (4 + 2 + 2 + 8 + 8 + 8)
+
 /* Size of all NBD_REP_* sent in answer to most NBD_OPT_*, without payload */
 #define NBD_REPLY_SIZE              (4 + 4 + 8)
 /* Size of reply to NBD_OPT_EXPORT_NAME */
diff --git a/nbd/server.c b/nbd/server.c
index d7dc29f0445..119ac765f09 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -1413,11 +1413,13 @@ nbd_read_eof(NBDClient *client, void *buffer, size_t size, Error **errp)
 static int coroutine_fn nbd_receive_request(NBDClient *client, NBDRequest *request,
                                             Error **errp)
 {
-    uint8_t buf[NBD_REQUEST_SIZE];
-    uint32_t magic;
+    uint8_t buf[NBD_EXTENDED_REQUEST_SIZE];
+    uint32_t magic, expect;
     int ret;
+    size_t size = client->mode >= NBD_MODE_EXTENDED ?
+        NBD_EXTENDED_REQUEST_SIZE : NBD_REQUEST_SIZE;

-    ret = nbd_read_eof(client, buf, sizeof(buf), errp);
+    ret = nbd_read_eof(client, buf, size, errp);
     if (ret < 0) {
         return ret;
     }
@@ -1425,13 +1427,21 @@ static int coroutine_fn nbd_receive_request(NBDClient *client, NBDRequest *reque
         return -EIO;
     }

-    /* Request
-       [ 0 ..  3]   magic   (NBD_REQUEST_MAGIC)
-       [ 4 ..  5]   flags   (NBD_CMD_FLAG_FUA, ...)
-       [ 6 ..  7]   type    (NBD_CMD_READ, ...)
-       [ 8 .. 15]   cookie
-       [16 .. 23]   from
-       [24 .. 27]   len
+    /*
+     * Compact request
+     *  [ 0 ..  3]   magic   (NBD_REQUEST_MAGIC)
+     *  [ 4 ..  5]   flags   (NBD_CMD_FLAG_FUA, ...)
+     *  [ 6 ..  7]   type    (NBD_CMD_READ, ...)
+     *  [ 8 .. 15]   cookie
+     *  [16 .. 23]   from
+     *  [24 .. 27]   len
+     * Extended request
+     *  [ 0 ..  3]   magic   (NBD_EXTENDED_REQUEST_MAGIC)
+     *  [ 4 ..  5]   flags   (NBD_CMD_FLAG_FUA, NBD_CMD_FLAG_PAYLOAD_LEN, ...)
+     *  [ 6 ..  7]   type    (NBD_CMD_READ, ...)
+     *  [ 8 .. 15]   cookie
+     *  [16 .. 23]   from
+     *  [24 .. 31]   len
      */

     magic = ldl_be_p(buf);
@@ -1439,13 +1449,20 @@ static int coroutine_fn nbd_receive_request(NBDClient *client, NBDRequest *reque
     request->type   = lduw_be_p(buf + 6);
     request->cookie = ldq_be_p(buf + 8);
     request->from   = ldq_be_p(buf + 16);
-    request->len    = ldl_be_p(buf + 24); /* widen 32 to 64 bits */
+    if (client->mode >= NBD_MODE_EXTENDED) {
+        request->len = ldq_be_p(buf + 24);
+        expect = NBD_EXTENDED_REQUEST_MAGIC;
+    } else {
+        request->len = ldl_be_p(buf + 24); /* widen 32 to 64 bits */
+        expect = NBD_REQUEST_MAGIC;
+    }

     trace_nbd_receive_request(magic, request->flags, request->type,
                               request->from, request->len);

-    if (magic != NBD_REQUEST_MAGIC) {
-        error_setg(errp, "invalid magic (got 0x%" PRIx32 ")", magic);
+    if (magic != expect) {
+        error_setg(errp, "invalid magic (got 0x%" PRIx32 ", expected 0x%"
+                   PRIx32 ")", magic, expect);
         return -EINVAL;
     }
     return 0;
-- 
2.40.1



^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v4 15/24] nbd/server: Prepare to send extended header replies
  2023-06-08 13:56 [PATCH v4 00/24] qemu patches for 64-bit NBD extensions Eric Blake
                   ` (13 preceding siblings ...)
  2023-06-08 13:56 ` [PATCH v4 14/24] nbd/server: Prepare to receive extended header requests Eric Blake
@ 2023-06-08 13:56 ` Eric Blake
  2023-06-16 18:48   ` Vladimir Sementsov-Ogievskiy
  2023-06-08 13:56 ` [PATCH v4 16/24] nbd/server: Support 64-bit block status Eric Blake
                   ` (8 subsequent siblings)
  23 siblings, 1 reply; 62+ messages in thread
From: Eric Blake @ 2023-06-08 13:56 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, libguestfs, vsementsov

Although extended mode is not yet enabled, once we do turn it on, we
need to reply with extended headers to all messages.  Update the low
level entry points necessary so that all other callers automatically
get the right header based on the current mode.

Signed-off-by: Eric Blake <eblake@redhat.com>
---

v4: new patch, split out from v3 9/14
---
 nbd/server.c | 30 ++++++++++++++++++++++--------
 1 file changed, 22 insertions(+), 8 deletions(-)

diff --git a/nbd/server.c b/nbd/server.c
index 119ac765f09..84c848a31d3 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -1947,8 +1947,6 @@ static inline void set_be_chunk(NBDClient *client, struct iovec *iov,
                                 size_t niov, uint16_t flags, uint16_t type,
                                 NBDRequest *request)
 {
-    /* TODO - handle structured vs. extended replies */
-    NBDStructuredReplyChunk *chunk = iov->iov_base;
     size_t i, length = 0;

     for (i = 1; i < niov; i++) {
@@ -1956,12 +1954,26 @@ static inline void set_be_chunk(NBDClient *client, struct iovec *iov,
     }
     assert(length <= NBD_MAX_BUFFER_SIZE + sizeof(NBDStructuredReadData));

-    iov[0].iov_len = sizeof(*chunk);
-    stl_be_p(&chunk->magic, NBD_STRUCTURED_REPLY_MAGIC);
-    stw_be_p(&chunk->flags, flags);
-    stw_be_p(&chunk->type, type);
-    stq_be_p(&chunk->cookie, request->cookie);
-    stl_be_p(&chunk->length, length);
+    if (client->mode >= NBD_MODE_EXTENDED) {
+        NBDExtendedReplyChunk *chunk = iov->iov_base;
+
+        iov->iov_len = sizeof(*chunk);
+        stl_be_p(&chunk->magic, NBD_EXTENDED_REPLY_MAGIC);
+        stw_be_p(&chunk->flags, flags);
+        stw_be_p(&chunk->type, type);
+        stq_be_p(&chunk->cookie, request->cookie);
+        stq_be_p(&chunk->offset, request->from);
+        stq_be_p(&chunk->length, length);
+    } else {
+        NBDStructuredReplyChunk *chunk = iov->iov_base;
+
+        iov->iov_len = sizeof(*chunk);
+        stl_be_p(&chunk->magic, NBD_STRUCTURED_REPLY_MAGIC);
+        stw_be_p(&chunk->flags, flags);
+        stw_be_p(&chunk->type, type);
+        stq_be_p(&chunk->cookie, request->cookie);
+        stl_be_p(&chunk->length, length);
+    }
 }

 static int coroutine_fn nbd_co_send_chunk_done(NBDClient *client,
@@ -2478,6 +2490,8 @@ static coroutine_fn int nbd_send_generic_reply(NBDClient *client,
 {
     if (client->mode >= NBD_MODE_STRUCTURED && ret < 0) {
         return nbd_co_send_chunk_error(client, request, -ret, error_msg, errp);
+    } else if (client->mode >= NBD_MODE_EXTENDED) {
+        return nbd_co_send_chunk_done(client, request, errp);
     } else {
         return nbd_co_send_simple_reply(client, request, ret < 0 ? -ret : 0,
                                         NULL, 0, errp);
-- 
2.40.1



^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v4 16/24] nbd/server: Support 64-bit block status
  2023-06-08 13:56 [PATCH v4 00/24] qemu patches for 64-bit NBD extensions Eric Blake
                   ` (14 preceding siblings ...)
  2023-06-08 13:56 ` [PATCH v4 15/24] nbd/server: Prepare to send extended header replies Eric Blake
@ 2023-06-08 13:56 ` Eric Blake
  2023-06-27 13:23   ` Vladimir Sementsov-Ogievskiy
  2023-06-08 13:56 ` [PATCH v4 17/24] nbd/server: Enable initial support for extended headers Eric Blake
                   ` (7 subsequent siblings)
  23 siblings, 1 reply; 62+ messages in thread
From: Eric Blake @ 2023-06-08 13:56 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, libguestfs, vsementsov

The NBD spec states that if the client negotiates extended headers,
the server must avoid NBD_REPLY_TYPE_BLOCK_STATUS and instead use
NBD_REPLY_TYPE_BLOCK_STATUS_EXT which supports 64-bit lengths, even if
the reply does not need more than 32 bits.  As of this patch,
client->mode is still never NBD_MODE_EXTENDED, so the code added here
does not take effect until the next patch enables negotiation.

For now, all metacontexts that we know how to export never populate
more than 32 bits of information, so we don't have to worry about
NBD_REP_ERR_EXT_HEADER_REQD or filtering during handshake, and we
always send all zeroes for the upper 32 bits of status during
NBD_CMD_BLOCK_STATUS.

Note that we previously had some interesting size-juggling on call
chains, such as:

nbd_co_send_block_status(uint32_t length)
-> blockstatus_to_extents(uint32_t bytes)
  -> bdrv_block_status_above(bytes, &uint64_t num)
  -> nbd_extent_array_add(uint64_t num)
    -> store num in 32-bit length

But we were lucky that it never overflowed: bdrv_block_status_above
never sets num larger than bytes, and we had previously been capping
'bytes' at 32 bits (since the protocol does not allow sending a larger
request without extended headers).  This patch adds some assertions
that ensure we continue to avoid overflowing 32 bits for a narrow
client, while fully utilizing 64-bits all the way through when the
client understands that.

Signed-off-by: Eric Blake <eblake@redhat.com>
---

v4: split conversion to big-endian across two helper functions rather
than in-place union [Vladimir]
---
 nbd/server.c | 104 ++++++++++++++++++++++++++++++++++++++-------------
 1 file changed, 78 insertions(+), 26 deletions(-)

diff --git a/nbd/server.c b/nbd/server.c
index 84c848a31d3..3010ff0dca4 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -2111,20 +2111,24 @@ static int coroutine_fn nbd_co_send_sparse_read(NBDClient *client,
 }

 typedef struct NBDExtentArray {
-    NBDExtent32 *extents;
+    NBDExtent64 *extents;
     unsigned int nb_alloc;
     unsigned int count;
     uint64_t total_length;
+    bool extended;
     bool can_add;
     bool converted_to_be;
 } NBDExtentArray;

-static NBDExtentArray *nbd_extent_array_new(unsigned int nb_alloc)
+static NBDExtentArray *nbd_extent_array_new(unsigned int nb_alloc,
+                                            NBDMode mode)
 {
     NBDExtentArray *ea = g_new0(NBDExtentArray, 1);

+    assert(mode >= NBD_MODE_STRUCTURED);
     ea->nb_alloc = nb_alloc;
-    ea->extents = g_new(NBDExtent32, nb_alloc);
+    ea->extents = g_new(NBDExtent64, nb_alloc);
+    ea->extended = mode >= NBD_MODE_EXTENDED;
     ea->can_add = true;

     return ea;
@@ -2143,15 +2147,36 @@ static void nbd_extent_array_convert_to_be(NBDExtentArray *ea)
     int i;

     assert(!ea->converted_to_be);
+    assert(ea->extended);
     ea->can_add = false;
     ea->converted_to_be = true;

     for (i = 0; i < ea->count; i++) {
-        ea->extents[i].flags = cpu_to_be32(ea->extents[i].flags);
-        ea->extents[i].length = cpu_to_be32(ea->extents[i].length);
+        ea->extents[i].length = cpu_to_be64(ea->extents[i].length);
+        ea->extents[i].flags = cpu_to_be64(ea->extents[i].flags);
     }
 }

+/* Further modifications of the array after conversion are abandoned */
+static NBDExtent32 *nbd_extent_array_convert_to_narrow(NBDExtentArray *ea)
+{
+    int i;
+    NBDExtent32 *extents = g_new(NBDExtent32, ea->count);
+
+    assert(!ea->converted_to_be);
+    assert(!ea->extended);
+    ea->can_add = false;
+    ea->converted_to_be = true;
+
+    for (i = 0; i < ea->count; i++) {
+        assert((ea->extents[i].length | ea->extents[i].flags) <= UINT32_MAX);
+        extents[i].length = cpu_to_be32(ea->extents[i].length);
+        extents[i].flags = cpu_to_be32(ea->extents[i].flags);
+    }
+
+    return extents;
+}
+
 /*
  * Add extent to NBDExtentArray. If extent can't be added (no available space),
  * return -1.
@@ -2162,19 +2187,23 @@ static void nbd_extent_array_convert_to_be(NBDExtentArray *ea)
  * would result in an incorrect range reported to the client)
  */
 static int nbd_extent_array_add(NBDExtentArray *ea,
-                                uint32_t length, uint32_t flags)
+                                uint64_t length, uint32_t flags)
 {
     assert(ea->can_add);

     if (!length) {
         return 0;
     }
+    if (!ea->extended) {
+        assert(length <= UINT32_MAX);
+    }

     /* Extend previous extent if flags are the same */
     if (ea->count > 0 && flags == ea->extents[ea->count - 1].flags) {
-        uint64_t sum = (uint64_t)length + ea->extents[ea->count - 1].length;
+        uint64_t sum = length + ea->extents[ea->count - 1].length;

-        if (sum <= UINT32_MAX) {
+        assert(sum >= length);
+        if (sum <= UINT32_MAX || ea->extended) {
             ea->extents[ea->count - 1].length = sum;
             ea->total_length += length;
             return 0;
@@ -2187,7 +2216,7 @@ static int nbd_extent_array_add(NBDExtentArray *ea,
     }

     ea->total_length += length;
-    ea->extents[ea->count] = (NBDExtent32) {.length = length, .flags = flags};
+    ea->extents[ea->count] = (NBDExtent64) {.length = length, .flags = flags};
     ea->count++;

     return 0;
@@ -2256,20 +2285,39 @@ nbd_co_send_extents(NBDClient *client, NBDRequest *request, NBDExtentArray *ea,
                     bool last, uint32_t context_id, Error **errp)
 {
     NBDReply hdr;
-    NBDStructuredMeta chunk;
-    struct iovec iov[] = {
-        {.iov_base = &hdr},
-        {.iov_base = &chunk, .iov_len = sizeof(chunk)},
-        {.iov_base = ea->extents, .iov_len = ea->count * sizeof(ea->extents[0])}
-    };
-
-    nbd_extent_array_convert_to_be(ea);
+    NBDStructuredMeta meta;
+    NBDExtendedMeta meta_ext;
+    g_autofree NBDExtent32 *extents = NULL;
+    uint16_t type;
+    struct iovec iov[] = { {.iov_base = &hdr}, {0}, {0} };
+
+    if (client->mode >= NBD_MODE_EXTENDED) {
+        type = NBD_REPLY_TYPE_BLOCK_STATUS_EXT;
+
+        iov[1].iov_base = &meta_ext;
+        iov[1].iov_len = sizeof(meta_ext);
+        stl_be_p(&meta_ext.context_id, context_id);
+        stl_be_p(&meta_ext.count, ea->count);
+
+        nbd_extent_array_convert_to_be(ea);
+        iov[2].iov_base = ea->extents;
+        iov[2].iov_len = ea->count * sizeof(ea->extents[0]);
+    } else {
+        type = NBD_REPLY_TYPE_BLOCK_STATUS;
+
+        iov[1].iov_base = &meta;
+        iov[1].iov_len = sizeof(meta);
+        stl_be_p(&meta.context_id, context_id);
+
+        extents = nbd_extent_array_convert_to_narrow(ea);
+        iov[2].iov_base = extents;
+        iov[2].iov_len = ea->count * sizeof(extents[0]);
+    }

     trace_nbd_co_send_extents(request->cookie, ea->count, context_id,
                               ea->total_length, last);
-    set_be_chunk(client, iov, 3, last ? NBD_REPLY_FLAG_DONE : 0,
-                 NBD_REPLY_TYPE_BLOCK_STATUS, request);
-    stl_be_p(&chunk.context_id, context_id);
+    set_be_chunk(client, iov, 3, last ? NBD_REPLY_FLAG_DONE : 0, type,
+                 request);

     return nbd_co_send_iov(client, iov, 3, errp);
 }
@@ -2278,13 +2326,14 @@ nbd_co_send_extents(NBDClient *client, NBDRequest *request, NBDExtentArray *ea,
 static int
 coroutine_fn nbd_co_send_block_status(NBDClient *client, NBDRequest *request,
                                       BlockBackend *blk, uint64_t offset,
-                                      uint32_t length, bool dont_fragment,
+                                      uint64_t length, bool dont_fragment,
                                       bool last, uint32_t context_id,
                                       Error **errp)
 {
     int ret;
     unsigned int nb_extents = dont_fragment ? 1 : NBD_MAX_BLOCK_STATUS_EXTENTS;
-    g_autoptr(NBDExtentArray) ea = nbd_extent_array_new(nb_extents);
+    g_autoptr(NBDExtentArray) ea =
+        nbd_extent_array_new(nb_extents, client->mode);

     if (context_id == NBD_META_ID_BASE_ALLOCATION) {
         ret = blockstatus_to_extents(blk, offset, length, ea);
@@ -2307,11 +2356,12 @@ static void bitmap_to_extents(BdrvDirtyBitmap *bitmap,
     int64_t start, dirty_start, dirty_count;
     int64_t end = offset + length;
     bool full = false;
+    int64_t bound = es->extended ? INT64_MAX : INT32_MAX;

     bdrv_dirty_bitmap_lock(bitmap);

     for (start = offset;
-         bdrv_dirty_bitmap_next_dirty_area(bitmap, start, end, INT32_MAX,
+         bdrv_dirty_bitmap_next_dirty_area(bitmap, start, end, bound,
                                            &dirty_start, &dirty_count);
          start = dirty_start + dirty_count)
     {
@@ -2335,12 +2385,13 @@ static int coroutine_fn nbd_co_send_bitmap(NBDClient *client,
                                            NBDRequest *request,
                                            BdrvDirtyBitmap *bitmap,
                                            uint64_t offset,
-                                           uint32_t length, bool dont_fragment,
+                                           uint64_t length, bool dont_fragment,
                                            bool last, uint32_t context_id,
                                            Error **errp)
 {
     unsigned int nb_extents = dont_fragment ? 1 : NBD_MAX_BLOCK_STATUS_EXTENTS;
-    g_autoptr(NBDExtentArray) ea = nbd_extent_array_new(nb_extents);
+    g_autoptr(NBDExtentArray) ea =
+        nbd_extent_array_new(nb_extents, client->mode);

     bitmap_to_extents(bitmap, offset, length, ea);

@@ -2637,7 +2688,8 @@ static coroutine_fn int nbd_handle_request(NBDClient *client,
             return nbd_send_generic_reply(client, request, -EINVAL,
                                           "need non-zero length", errp);
         }
-        assert(request->len <= UINT32_MAX);
+        assert(client->mode >= NBD_MODE_EXTENDED ||
+               request->len <= UINT32_MAX);
         if (client->export_meta.count) {
             bool dont_fragment = request->flags & NBD_CMD_FLAG_REQ_ONE;
             int contexts_remaining = client->export_meta.count;
-- 
2.40.1



^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v4 17/24] nbd/server: Enable initial support for extended headers
  2023-06-08 13:56 [PATCH v4 00/24] qemu patches for 64-bit NBD extensions Eric Blake
                   ` (15 preceding siblings ...)
  2023-06-08 13:56 ` [PATCH v4 16/24] nbd/server: Support 64-bit block status Eric Blake
@ 2023-06-08 13:56 ` Eric Blake
  2023-06-27 13:26   ` Vladimir Sementsov-Ogievskiy
  2023-06-08 13:56 ` [PATCH v4 18/24] nbd/client: Plumb errp through nbd_receive_replies Eric Blake
                   ` (6 subsequent siblings)
  23 siblings, 1 reply; 62+ messages in thread
From: Eric Blake @ 2023-06-08 13:56 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, libguestfs, vsementsov

Time to start supporting clients that request extended headers.  Now
we can finally reach the code added across several previous patches.

Even though the NBD spec has been altered to allow us to accept
NBD_CMD_READ larger than the max payload size (provided our response
is a hole or broken up over more than one data chunk), we are not
planning to take advantage of that, and continue to cap NBD_CMD_READ
to 32M regardless of header size.

For NBD_CMD_WRITE_ZEROES and NBD_CMD_TRIM, the block layer already
supports 64-bit operations without any effort on our part.  For
NBD_CMD_BLOCK_STATUS, the client's length is a hint, and the previous
patch took care of implementing the required
NBD_REPLY_TYPE_BLOCK_STATUS_EXT.

We do not yet support clients that want to do request payload
filtering of NBD_CMD_BLOCK_STATUS; that will be added in later
patches, but is not essential for qemu as a client since qemu only
requests the single context base:allocation.

Signed-off-by: Eric Blake <eblake@redhat.com>
---

v4: split out parts into earlier patches, rebase to earlier changes,
simplify handling of generic replies, retitle (compare to v3 9/14)
---
 docs/interop/nbd.txt |  1 +
 nbd/server.c         | 21 +++++++++++++++++++++
 2 files changed, 22 insertions(+)

diff --git a/docs/interop/nbd.txt b/docs/interop/nbd.txt
index f5ca25174a6..abaf4c28a96 100644
--- a/docs/interop/nbd.txt
+++ b/docs/interop/nbd.txt
@@ -69,3 +69,4 @@ NBD_CMD_BLOCK_STATUS for "qemu:dirty-bitmap:", NBD_CMD_CACHE
 NBD_CMD_FLAG_FAST_ZERO
 * 5.2: NBD_CMD_BLOCK_STATUS for "qemu:allocation-depth"
 * 7.1: NBD_FLAG_CAN_MULTI_CONN for shareable writable exports
+* 8.1: NBD_OPT_EXTENDED_HEADERS
diff --git a/nbd/server.c b/nbd/server.c
index 3010ff0dca4..ae293663ca2 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -482,6 +482,10 @@ static int nbd_negotiate_handle_export_name(NBDClient *client, bool no_zeroes,
         [10 .. 133]   reserved     (0) [unless no_zeroes]
      */
     trace_nbd_negotiate_handle_export_name();
+    if (client->mode >= NBD_MODE_EXTENDED) {
+        error_setg(errp, "Extended headers already negotiated");
+        return -EINVAL;
+    }
     if (client->optlen > NBD_MAX_STRING_SIZE) {
         error_setg(errp, "Bad length received");
         return -EINVAL;
@@ -1262,6 +1266,10 @@ static int nbd_negotiate_options(NBDClient *client, Error **errp)
             case NBD_OPT_STRUCTURED_REPLY:
                 if (length) {
                     ret = nbd_reject_length(client, false, errp);
+                } else if (client->mode >= NBD_MODE_EXTENDED) {
+                    ret = nbd_negotiate_send_rep_err(
+                        client, NBD_REP_ERR_EXT_HEADER_REQD, errp,
+                        "extended headers already negotiated");
                 } else if (client->mode >= NBD_MODE_STRUCTURED) {
                     ret = nbd_negotiate_send_rep_err(
                         client, NBD_REP_ERR_INVALID, errp,
@@ -1278,6 +1286,19 @@ static int nbd_negotiate_options(NBDClient *client, Error **errp)
                                                  errp);
                 break;

+            case NBD_OPT_EXTENDED_HEADERS:
+                if (length) {
+                    ret = nbd_reject_length(client, false, errp);
+                } else if (client->mode >= NBD_MODE_EXTENDED) {
+                    ret = nbd_negotiate_send_rep_err(
+                        client, NBD_REP_ERR_INVALID, errp,
+                        "extended headers already negotiated");
+                } else {
+                    ret = nbd_negotiate_send_rep(client, NBD_REP_ACK, errp);
+                    client->mode = NBD_MODE_EXTENDED;
+                }
+                break;
+
             default:
                 ret = nbd_opt_drop(client, NBD_REP_ERR_UNSUP, errp,
                                    "Unsupported option %" PRIu32 " (%s)",
-- 
2.40.1



^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v4 18/24] nbd/client: Plumb errp through nbd_receive_replies
  2023-06-08 13:56 [PATCH v4 00/24] qemu patches for 64-bit NBD extensions Eric Blake
                   ` (16 preceding siblings ...)
  2023-06-08 13:56 ` [PATCH v4 17/24] nbd/server: Enable initial support for extended headers Eric Blake
@ 2023-06-08 13:56 ` Eric Blake
  2023-06-08 19:10   ` [Libguestfs] " Eric Blake
  2023-06-27 13:31   ` Vladimir Sementsov-Ogievskiy
  2023-06-08 13:56 ` [PATCH v4 19/24] nbd/client: Initial support for extended headers Eric Blake
                   ` (5 subsequent siblings)
  23 siblings, 2 replies; 62+ messages in thread
From: Eric Blake @ 2023-06-08 13:56 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, libguestfs, vsementsov, Kevin Wolf, Hanna Reitz

Instead of ignoring the low-level error just to refabricate our own
message to pass to the caller, we can just plump the caller's errp
down to the low level.

Signed-off-by: Eric Blake <eblake@redhat.com>
---

v4: new patch [Vladimir]
---
 block/nbd.c | 16 ++++++++++------
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/block/nbd.c b/block/nbd.c
index 57123c17f94..c17ce935f17 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -417,7 +417,8 @@ static void coroutine_fn GRAPH_RDLOCK nbd_reconnect_attempt(BDRVNBDState *s)
     reconnect_delay_timer_del(s);
 }

-static coroutine_fn int nbd_receive_replies(BDRVNBDState *s, uint64_t cookie)
+static coroutine_fn int nbd_receive_replies(BDRVNBDState *s, uint64_t cookie,
+                                            Error **errp)
 {
     int ret;
     uint64_t ind = COOKIE_TO_INDEX(cookie), ind2;
@@ -458,9 +459,12 @@ static coroutine_fn int nbd_receive_replies(BDRVNBDState *s, uint64_t cookie)

         /* We are under mutex and cookie is 0. We have to do the dirty work. */
         assert(s->reply.cookie == 0);
-        ret = nbd_receive_reply(s->bs, s->ioc, &s->reply, NULL);
-        if (ret <= 0) {
-            ret = ret ? ret : -EIO;
+        ret = nbd_receive_reply(s->bs, s->ioc, &s->reply, errp);
+        if (ret == 0) {
+            ret = -EIO;
+            error_setg(errp, "server dropped connection");
+        }
+        if (ret < 0) {
             nbd_channel_error(s, ret);
             return ret;
         }
@@ -843,9 +847,9 @@ static coroutine_fn int nbd_co_do_receive_one_chunk(
     }
     *request_ret = 0;

-    ret = nbd_receive_replies(s, cookie);
+    ret = nbd_receive_replies(s, cookie, errp);
     if (ret < 0) {
-        error_setg(errp, "Connection closed");
+        error_prepend(errp, "Connection closed: ");
         return -EIO;
     }
     assert(s->ioc);
-- 
2.40.1



^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v4 19/24] nbd/client: Initial support for extended headers
  2023-06-08 13:56 [PATCH v4 00/24] qemu patches for 64-bit NBD extensions Eric Blake
                   ` (17 preceding siblings ...)
  2023-06-08 13:56 ` [PATCH v4 18/24] nbd/client: Plumb errp through nbd_receive_replies Eric Blake
@ 2023-06-08 13:56 ` Eric Blake
  2023-06-27 14:22   ` Vladimir Sementsov-Ogievskiy
  2023-06-08 13:56 ` [PATCH v4 20/24] nbd/client: Accept 64-bit block status chunks Eric Blake
                   ` (4 subsequent siblings)
  23 siblings, 1 reply; 62+ messages in thread
From: Eric Blake @ 2023-06-08 13:56 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, libguestfs, vsementsov, Kevin Wolf, Hanna Reitz

Update the client code to be able to send an extended request, and
parse an extended header from the server.  Note that since we reject
any structured reply with a too-large payload, we can always normalize
a valid header back into the compact form, so that the caller need not
deal with two branches of a union.  Still, until a later patch lets
the client negotiate extended headers, the code added here should not
be reached.  Note that because of the different magic numbers, it is
just as easy to trace and then tolerate a non-compliant server sending
the wrong header reply as it would be to insist that the server is
compliant.

Signed-off-by: Eric Blake <eblake@redhat.com>
---

v4: split off errp handling to separate patch [Vladimir], better
function naming [Vladimir]
---
 include/block/nbd.h |   3 +-
 block/nbd.c         |   2 +-
 nbd/client.c        | 100 +++++++++++++++++++++++++++++---------------
 nbd/trace-events    |   3 +-
 4 files changed, 72 insertions(+), 36 deletions(-)

diff --git a/include/block/nbd.h b/include/block/nbd.h
index dc05f5981fb..af80087e2cd 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -389,7 +389,8 @@ int nbd_init(int fd, QIOChannelSocket *sioc, NBDExportInfo *info,
              Error **errp);
 int nbd_send_request(QIOChannel *ioc, NBDRequest *request);
 int coroutine_fn nbd_receive_reply(BlockDriverState *bs, QIOChannel *ioc,
-                                   NBDReply *reply, Error **errp);
+                                   NBDReply *reply, NBDMode mode,
+                                   Error **errp);
 int nbd_client(int fd);
 int nbd_disconnect(int fd);
 int nbd_errno_to_system_errno(int err);
diff --git a/block/nbd.c b/block/nbd.c
index c17ce935f17..e281fac43d1 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -459,7 +459,7 @@ static coroutine_fn int nbd_receive_replies(BDRVNBDState *s, uint64_t cookie,

         /* We are under mutex and cookie is 0. We have to do the dirty work. */
         assert(s->reply.cookie == 0);
-        ret = nbd_receive_reply(s->bs, s->ioc, &s->reply, errp);
+        ret = nbd_receive_reply(s->bs, s->ioc, &s->reply, s->info.mode, errp);
         if (ret == 0) {
             ret = -EIO;
             error_setg(errp, "server dropped connection");
diff --git a/nbd/client.c b/nbd/client.c
index 1495a9b0ab1..a4598a95427 100644
--- a/nbd/client.c
+++ b/nbd/client.c
@@ -1352,22 +1352,29 @@ int nbd_disconnect(int fd)

 int nbd_send_request(QIOChannel *ioc, NBDRequest *request)
 {
-    uint8_t buf[NBD_REQUEST_SIZE];
+    uint8_t buf[NBD_EXTENDED_REQUEST_SIZE];
+    size_t len;

-    assert(request->mode <= NBD_MODE_STRUCTURED); /* TODO handle extended */
-    assert(request->len <= UINT32_MAX);
     trace_nbd_send_request(request->from, request->len, request->cookie,
                            request->flags, request->type,
                            nbd_cmd_lookup(request->type));

-    stl_be_p(buf, NBD_REQUEST_MAGIC);
     stw_be_p(buf + 4, request->flags);
     stw_be_p(buf + 6, request->type);
     stq_be_p(buf + 8, request->cookie);
     stq_be_p(buf + 16, request->from);
-    stl_be_p(buf + 24, request->len);
+    if (request->mode >= NBD_MODE_EXTENDED) {
+        stl_be_p(buf, NBD_EXTENDED_REQUEST_MAGIC);
+        stq_be_p(buf + 24, request->len);
+        len = NBD_EXTENDED_REQUEST_SIZE;
+    } else {
+        assert(request->len <= UINT32_MAX);
+        stl_be_p(buf, NBD_REQUEST_MAGIC);
+        stl_be_p(buf + 24, request->len);
+        len = NBD_REQUEST_SIZE;
+    }

-    return nbd_write(ioc, buf, sizeof(buf), NULL);
+    return nbd_write(ioc, buf, len, NULL);
 }

 /* nbd_receive_simple_reply
@@ -1394,30 +1401,36 @@ static int nbd_receive_simple_reply(QIOChannel *ioc, NBDSimpleReply *reply,
     return 0;
 }

-/* nbd_receive_structured_reply_chunk
+/* nbd_receive_reply_chunk_header
  * Read structured reply chunk except magic field (which should be already
- * read).
+ * read).  Normalize into the compact form.
  * Payload is not read.
  */
-static int nbd_receive_structured_reply_chunk(QIOChannel *ioc,
-                                              NBDStructuredReplyChunk *chunk,
-                                              Error **errp)
+static int nbd_receive_reply_chunk_header(QIOChannel *ioc, NBDReply *chunk,
+                                          Error **errp)
 {
     int ret;
+    size_t len;
+    uint64_t payload_len;

-    assert(chunk->magic == NBD_STRUCTURED_REPLY_MAGIC);
+    if (chunk->magic == NBD_STRUCTURED_REPLY_MAGIC) {
+        len = sizeof(chunk->structured);
+    } else {
+        assert(chunk->magic == NBD_EXTENDED_REPLY_MAGIC);
+        len = sizeof(chunk->extended);
+    }

     ret = nbd_read(ioc, (uint8_t *)chunk + sizeof(chunk->magic),
-                   sizeof(*chunk) - sizeof(chunk->magic), "structured chunk",
+                   len - sizeof(chunk->magic), "structured chunk",
                    errp);
     if (ret < 0) {
         return ret;
     }

-    chunk->flags = be16_to_cpu(chunk->flags);
-    chunk->type = be16_to_cpu(chunk->type);
-    chunk->cookie = be64_to_cpu(chunk->cookie);
-    chunk->length = be32_to_cpu(chunk->length);
+    /* flags, type, and cookie occupy same space between forms */
+    chunk->structured.flags = be16_to_cpu(chunk->structured.flags);
+    chunk->structured.type = be16_to_cpu(chunk->structured.type);
+    chunk->structured.cookie = be64_to_cpu(chunk->structured.cookie);

     /*
      * Because we use BLOCK_STATUS with REQ_ONE, and cap READ requests
@@ -1425,11 +1438,20 @@ static int nbd_receive_structured_reply_chunk(QIOChannel *ioc,
      * this.  Even if we stopped using REQ_ONE, sane servers will cap
      * the number of extents they return for block status.
      */
-    if (chunk->length > NBD_MAX_BUFFER_SIZE + sizeof(NBDStructuredReadData)) {
+    if (chunk->magic == NBD_STRUCTURED_REPLY_MAGIC) {
+        payload_len = be32_to_cpu(chunk->structured.length);
+    } else {
+        /* For now, we are ignoring the extended header offset. */
+        payload_len = be64_to_cpu(chunk->extended.length);
+        chunk->magic = NBD_STRUCTURED_REPLY_MAGIC;
+    }
+    if (payload_len > NBD_MAX_BUFFER_SIZE + sizeof(NBDStructuredReadData)) {
         error_setg(errp, "server chunk %" PRIu32 " (%s) payload is too long",
-                   chunk->type, nbd_rep_lookup(chunk->type));
+                   chunk->structured.type,
+                   nbd_rep_lookup(chunk->structured.type));
         return -EINVAL;
     }
+    chunk->structured.length = payload_len;

     return 0;
 }
@@ -1476,19 +1498,21 @@ nbd_read_eof(BlockDriverState *bs, QIOChannel *ioc, void *buffer, size_t size,

 /* nbd_receive_reply
  *
- * Decreases bs->in_flight while waiting for a new reply. This yield is where
- * we wait indefinitely and the coroutine must be able to be safely reentered
- * for nbd_client_attach_aio_context().
+ * Wait for a new reply. If this yields, the coroutine must be able to be
+ * safely reentered for nbd_client_attach_aio_context().  @mode determines
+ * which reply magic we are expecting, although this normalizes the result
+ * so that the caller only has to work with compact headers.
  *
  * Returns 1 on success
- *         0 on eof, when no data was read (errp is not set)
- *         negative errno on failure (errp is set)
+ *         0 on eof, when no data was read
+ *         negative errno on failure
  */
 int coroutine_fn nbd_receive_reply(BlockDriverState *bs, QIOChannel *ioc,
-                                   NBDReply *reply, Error **errp)
+                                   NBDReply *reply, NBDMode mode, Error **errp)
 {
     int ret;
     const char *type;
+    uint32_t expected;

     ret = nbd_read_eof(bs, ioc, &reply->magic, sizeof(reply->magic), errp);
     if (ret <= 0) {
@@ -1497,8 +1521,13 @@ int coroutine_fn nbd_receive_reply(BlockDriverState *bs, QIOChannel *ioc,

     reply->magic = be32_to_cpu(reply->magic);

+    /* Diagnose but accept wrong-width header */
     switch (reply->magic) {
     case NBD_SIMPLE_REPLY_MAGIC:
+        if (mode >= NBD_MODE_EXTENDED) {
+            trace_nbd_receive_wrong_header(reply->magic,
+                                           nbd_mode_lookup(mode));
+        }
         ret = nbd_receive_simple_reply(ioc, &reply->simple, errp);
         if (ret < 0) {
             break;
@@ -1508,23 +1537,28 @@ int coroutine_fn nbd_receive_reply(BlockDriverState *bs, QIOChannel *ioc,
                                        reply->cookie);
         break;
     case NBD_STRUCTURED_REPLY_MAGIC:
-        ret = nbd_receive_structured_reply_chunk(ioc, &reply->structured, errp);
+    case NBD_EXTENDED_REPLY_MAGIC:
+        expected = mode >= NBD_MODE_EXTENDED ? NBD_EXTENDED_REPLY_MAGIC
+            : NBD_STRUCTURED_REPLY_MAGIC;
+        if (reply->magic != expected) {
+            trace_nbd_receive_wrong_header(reply->magic,
+                                           nbd_mode_lookup(mode));
+        }
+        ret = nbd_receive_reply_chunk_header(ioc, reply, errp);
         if (ret < 0) {
             break;
         }
         type = nbd_reply_type_lookup(reply->structured.type);
-        trace_nbd_receive_structured_reply_chunk(reply->structured.flags,
-                                                 reply->structured.type, type,
-                                                 reply->structured.cookie,
-                                                 reply->structured.length);
+        trace_nbd_receive_reply_chunk_header(reply->structured.flags,
+                                             reply->structured.type, type,
+                                             reply->structured.cookie,
+                                             reply->structured.length);
         break;
     default:
+        trace_nbd_receive_wrong_header(reply->magic, nbd_mode_lookup(mode));
         error_setg(errp, "invalid magic (got 0x%" PRIx32 ")", reply->magic);
         return -EINVAL;
     }
-    if (ret < 0) {
-        return ret;
-    }

     return 1;
 }
diff --git a/nbd/trace-events b/nbd/trace-events
index 6a34d7f027a..51bfb129c95 100644
--- a/nbd/trace-events
+++ b/nbd/trace-events
@@ -33,7 +33,8 @@ nbd_client_clear_queue(void) "Clearing NBD queue"
 nbd_client_clear_socket(void) "Clearing NBD socket"
 nbd_send_request(uint64_t from, uint64_t len, uint64_t cookie, uint16_t flags, uint16_t type, const char *name) "Sending request to server: { .from = %" PRIu64", .len = %" PRIu64 ", .cookie = %" PRIu64 ", .flags = 0x%" PRIx16 ", .type = %" PRIu16 " (%s) }"
 nbd_receive_simple_reply(int32_t error, const char *errname, uint64_t cookie) "Got simple reply: { .error = %" PRId32 " (%s), cookie = %" PRIu64" }"
-nbd_receive_structured_reply_chunk(uint16_t flags, uint16_t type, const char *name, uint64_t cookie, uint32_t length) "Got structured reply chunk: { flags = 0x%" PRIx16 ", type = %d (%s), cookie = %" PRIu64 ", length = %" PRIu32 " }"
+nbd_receive_reply_chunk_header(uint16_t flags, uint16_t type, const char *name, uint64_t cookie, uint32_t length) "Got reply chunk header: { flags = 0x%" PRIx16 ", type = %d (%s), cookie = %" PRIu64 ", length = %" PRIu32 " }"
+nbd_receive_wrong_header(uint32_t magic, const char *mode) "Server sent unexpected magic 0x%" PRIx32 " for negotiated mode %s"

 # common.c
 nbd_unknown_error(int err) "Squashing unexpected error %d to EINVAL"
-- 
2.40.1



^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v4 20/24] nbd/client: Accept 64-bit block status chunks
  2023-06-08 13:56 [PATCH v4 00/24] qemu patches for 64-bit NBD extensions Eric Blake
                   ` (18 preceding siblings ...)
  2023-06-08 13:56 ` [PATCH v4 19/24] nbd/client: Initial support for extended headers Eric Blake
@ 2023-06-08 13:56 ` Eric Blake
  2023-06-27 14:50   ` Vladimir Sementsov-Ogievskiy
  2023-06-08 13:56 ` [PATCH v4 21/24] nbd/client: Request extended headers during negotiation Eric Blake
                   ` (3 subsequent siblings)
  23 siblings, 1 reply; 62+ messages in thread
From: Eric Blake @ 2023-06-08 13:56 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, libguestfs, vsementsov, Kevin Wolf, Hanna Reitz

Once extended mode is enabled, we need to accept 64-bit status replies
(even for replies that don't exceed a 32-bit length).  It is easier to
normalize narrow replies into wide format so that the rest of our code
only has to handle one width.  Although a server is non-compliant if
it sends a 64-bit reply in compact mode, or a 32-bit reply in extended
mode, it is still easy enough to tolerate these mismatches.

In normal execution, we are only requesting "base:allocation" which
never exceeds 32 bits for flag values. But during testing with
x-dirty-bitmap, we can force qemu to connect to some other context
that might have 64-bit status bit; however, we ignore those upper bits
(other than mapping qemu:allocation-depth into something that
'qemu-img map --output=json' can expose), and since that only affects
testing, we really don't bother with checking whether more than the
two least-significant bits are set.

Signed-off-by: Eric Blake <eblake@redhat.com>
---

v4: tweak comments and error message about count mismatch, fix setting
of wide in loop [Vladimir]
---
 block/nbd.c        | 47 ++++++++++++++++++++++++++++++++--------------
 block/trace-events |  1 +
 2 files changed, 34 insertions(+), 14 deletions(-)

diff --git a/block/nbd.c b/block/nbd.c
index e281fac43d1..74c0a9d3b8c 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -614,13 +614,16 @@ static int nbd_parse_offset_hole_payload(BDRVNBDState *s,
  */
 static int nbd_parse_blockstatus_payload(BDRVNBDState *s,
                                          NBDStructuredReplyChunk *chunk,
-                                         uint8_t *payload, uint64_t orig_length,
-                                         NBDExtent32 *extent, Error **errp)
+                                         uint8_t *payload, bool wide,
+                                         uint64_t orig_length,
+                                         NBDExtent64 *extent, Error **errp)
 {
     uint32_t context_id;
+    uint32_t count;
+    size_t len = wide ? sizeof(*extent) : sizeof(NBDExtent32);

     /* The server succeeded, so it must have sent [at least] one extent */
-    if (chunk->length < sizeof(context_id) + sizeof(*extent)) {
+    if (chunk->length < sizeof(context_id) + wide * sizeof(count) + len) {
         error_setg(errp, "Protocol error: invalid payload for "
                          "NBD_REPLY_TYPE_BLOCK_STATUS");
         return -EINVAL;
@@ -635,8 +638,15 @@ static int nbd_parse_blockstatus_payload(BDRVNBDState *s,
         return -EINVAL;
     }

-    extent->length = payload_advance32(&payload);
-    extent->flags = payload_advance32(&payload);
+    if (wide) {
+        count = payload_advance32(&payload);
+        extent->length = payload_advance64(&payload);
+        extent->flags = payload_advance64(&payload);
+    } else {
+        count = 0;
+        extent->length = payload_advance32(&payload);
+        extent->flags = payload_advance32(&payload);
+    }

     if (extent->length == 0) {
         error_setg(errp, "Protocol error: server sent status chunk with "
@@ -671,13 +681,16 @@ static int nbd_parse_blockstatus_payload(BDRVNBDState *s,
     /*
      * We used NBD_CMD_FLAG_REQ_ONE, so the server should not have
      * sent us any more than one extent, nor should it have included
-     * status beyond our request in that extent. However, it's easy
-     * enough to ignore the server's noncompliance without killing the
+     * status beyond our request in that extent. Furthermore, a wide
+     * server should have replied with an accurate count (we left
+     * count at 0 for a narrow server).  However, it's easy enough to
+     * ignore the server's noncompliance without killing the
      * connection; just ignore trailing extents, and clamp things to
      * the length of our request.
      */
-    if (chunk->length > sizeof(context_id) + sizeof(*extent)) {
-        trace_nbd_parse_blockstatus_compliance("more than one extent");
+    if (count != wide ||
+        chunk->length > sizeof(context_id) + wide * sizeof(count) + len) {
+        trace_nbd_parse_blockstatus_compliance("unexpected extent count");
     }
     if (extent->length > orig_length) {
         extent->length = orig_length;
@@ -1123,7 +1136,7 @@ nbd_co_receive_cmdread_reply(BDRVNBDState *s, uint64_t cookie,

 static int coroutine_fn
 nbd_co_receive_blockstatus_reply(BDRVNBDState *s, uint64_t cookie,
-                                 uint64_t length, NBDExtent32 *extent,
+                                 uint64_t length, NBDExtent64 *extent,
                                  int *request_ret, Error **errp)
 {
     NBDReplyChunkIter iter;
@@ -1136,11 +1149,17 @@ nbd_co_receive_blockstatus_reply(BDRVNBDState *s, uint64_t cookie,
     NBD_FOREACH_REPLY_CHUNK(s, iter, cookie, false, NULL, &reply, &payload) {
         int ret;
         NBDStructuredReplyChunk *chunk = &reply.structured;
+        bool wide;

         assert(nbd_reply_is_structured(&reply));

         switch (chunk->type) {
+        case NBD_REPLY_TYPE_BLOCK_STATUS_EXT:
         case NBD_REPLY_TYPE_BLOCK_STATUS:
+            wide = chunk->type == NBD_REPLY_TYPE_BLOCK_STATUS_EXT;
+            if ((s->info.mode >= NBD_MODE_EXTENDED) != wide) {
+                trace_nbd_extended_headers_compliance("block_status");
+            }
             if (received) {
                 nbd_channel_error(s, -EINVAL);
                 error_setg(&local_err, "Several BLOCK_STATUS chunks in reply");
@@ -1148,9 +1167,9 @@ nbd_co_receive_blockstatus_reply(BDRVNBDState *s, uint64_t cookie,
             }
             received = true;

-            ret = nbd_parse_blockstatus_payload(s, &reply.structured,
-                                                payload, length, extent,
-                                                &local_err);
+            ret = nbd_parse_blockstatus_payload(
+                s, &reply.structured, payload, wide,
+                length, extent, &local_err);
             if (ret < 0) {
                 nbd_channel_error(s, ret);
                 nbd_iter_channel_error(&iter, ret, &local_err);
@@ -1380,7 +1399,7 @@ static int coroutine_fn GRAPH_RDLOCK nbd_client_co_block_status(
         int64_t *pnum, int64_t *map, BlockDriverState **file)
 {
     int ret, request_ret;
-    NBDExtent32 extent = { 0 };
+    NBDExtent64 extent = { 0 };
     BDRVNBDState *s = (BDRVNBDState *)bs->opaque;
     Error *local_err = NULL;

diff --git a/block/trace-events b/block/trace-events
index 6f121b76365..a0fc79153d3 100644
--- a/block/trace-events
+++ b/block/trace-events
@@ -166,6 +166,7 @@ iscsi_xcopy(void *src_lun, uint64_t src_off, void *dst_lun, uint64_t dst_off, ui
 # nbd.c
 nbd_parse_blockstatus_compliance(const char *err) "ignoring extra data from non-compliant server: %s"
 nbd_structured_read_compliance(const char *type) "server sent non-compliant unaligned read %s chunk"
+nbd_extended_headers_compliance(const char *type) "server sent non-compliant %s chunk not matching choice of extended headers"
 nbd_read_reply_entry_fail(int ret, const char *err) "ret = %d, err: %s"
 nbd_co_request_fail(uint64_t from, uint32_t len, uint64_t handle, uint16_t flags, uint16_t type, const char *name, int ret, const char *err) "Request failed { .from = %" PRIu64", .len = %" PRIu32 ", .handle = %" PRIu64 ", .flags = 0x%" PRIx16 ", .type = %" PRIu16 " (%s) } ret = %d, err: %s"
 nbd_client_handshake(const char *export_name) "export '%s'"
-- 
2.40.1



^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v4 21/24] nbd/client: Request extended headers during negotiation
  2023-06-08 13:56 [PATCH v4 00/24] qemu patches for 64-bit NBD extensions Eric Blake
                   ` (19 preceding siblings ...)
  2023-06-08 13:56 ` [PATCH v4 20/24] nbd/client: Accept 64-bit block status chunks Eric Blake
@ 2023-06-08 13:56 ` Eric Blake
  2023-06-27 14:55   ` Vladimir Sementsov-Ogievskiy
  2023-06-08 13:56 ` [PATCH v4 22/24] nbd/server: Refactor list of negotiated meta contexts Eric Blake
                   ` (2 subsequent siblings)
  23 siblings, 1 reply; 62+ messages in thread
From: Eric Blake @ 2023-06-08 13:56 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, libguestfs, vsementsov, Kevin Wolf, Hanna Reitz

All the pieces are in place for a client to finally request extended
headers.  Note that we must not request extended headers when qemu-nbd
is used to connect to the kernel module (as nbd.ko does not expect
them, but expects us to do the negotiation in userspace before handing
the socket over to the kernel), but there is no harm in all other
clients requesting them.

Extended headers are not essential to the information collected during
'qemu-nbd --list', but probing for it gives us one more piece of
information in that output.  Update the iotests affected by the new
line of output.

Signed-off-by: Eric Blake <eblake@redhat.com>
---

v4: rebase to earlier changes, tweak commit message for why qemu-nbd
connection to /dev/nbd cannot use extended mode [Vladimir]
---
 nbd/client-connection.c                       |  2 +-
 nbd/client.c                                  | 20 ++++++++++++++-----
 qemu-nbd.c                                    |  3 +++
 tests/qemu-iotests/223.out                    |  6 ++++++
 tests/qemu-iotests/233.out                    |  4 ++++
 tests/qemu-iotests/241.out                    |  3 +++
 tests/qemu-iotests/307.out                    |  5 +++++
 .../tests/nbd-qemu-allocation.out             |  1 +
 8 files changed, 38 insertions(+), 6 deletions(-)

diff --git a/nbd/client-connection.c b/nbd/client-connection.c
index 13e4cb6684b..d9d946da006 100644
--- a/nbd/client-connection.c
+++ b/nbd/client-connection.c
@@ -93,7 +93,7 @@ NBDClientConnection *nbd_client_connection_new(const SocketAddress *saddr,
         .do_negotiation = do_negotiation,

         .initial_info.request_sizes = true,
-        .initial_info.mode = NBD_MODE_STRUCTURED,
+        .initial_info.mode = NBD_MODE_EXTENDED,
         .initial_info.base_allocation = true,
         .initial_info.x_dirty_bitmap = g_strdup(x_dirty_bitmap),
         .initial_info.name = g_strdup(export_name ?: "")
diff --git a/nbd/client.c b/nbd/client.c
index a4598a95427..99c0e5c8114 100644
--- a/nbd/client.c
+++ b/nbd/client.c
@@ -958,15 +958,23 @@ static int nbd_start_negotiate(AioContext *aio_context, QIOChannel *ioc,
         if (fixedNewStyle) {
             int result = 0;

+            if (max_mode >= NBD_MODE_EXTENDED) {
+                result = nbd_request_simple_option(ioc,
+                                                   NBD_OPT_EXTENDED_HEADERS,
+                                                   false, errp);
+                if (result) {
+                    return result < 0 ? -EINVAL : NBD_MODE_EXTENDED;
+                }
+            }
             if (max_mode >= NBD_MODE_STRUCTURED) {
                 result = nbd_request_simple_option(ioc,
                                                    NBD_OPT_STRUCTURED_REPLY,
                                                    false, errp);
-                if (result < 0) {
-                    return -EINVAL;
+                if (result) {
+                    return result < 0 ? -EINVAL : NBD_MODE_STRUCTURED;
                 }
             }
-            return result ? NBD_MODE_STRUCTURED : NBD_MODE_SIMPLE;
+            return NBD_MODE_SIMPLE;
         } else {
             return NBD_MODE_EXPORT_NAME;
         }
@@ -1040,6 +1048,7 @@ int nbd_receive_negotiate(AioContext *aio_context, QIOChannel *ioc,
     }

     switch (info->mode) {
+    case NBD_MODE_EXTENDED:
     case NBD_MODE_STRUCTURED:
         if (base_allocation) {
             result = nbd_negotiate_simple_meta_context(ioc, info, errp);
@@ -1150,7 +1159,7 @@ int nbd_receive_export_list(QIOChannel *ioc, QCryptoTLSCreds *tlscreds,

     *info = NULL;
     result = nbd_start_negotiate(NULL, ioc, tlscreds, hostname, &sioc,
-                                 NBD_MODE_STRUCTURED, NULL, errp);
+                                 NBD_MODE_EXTENDED, NULL, errp);
     if (tlscreds && sioc) {
         ioc = sioc;
     }
@@ -1161,6 +1170,7 @@ int nbd_receive_export_list(QIOChannel *ioc, QCryptoTLSCreds *tlscreds,
     switch ((NBDMode)result) {
     case NBD_MODE_SIMPLE:
     case NBD_MODE_STRUCTURED:
+    case NBD_MODE_EXTENDED:
         /* newstyle - use NBD_OPT_LIST to populate array, then try
          * NBD_OPT_INFO on each array member. If structured replies
          * are enabled, also try NBD_OPT_LIST_META_CONTEXT. */
@@ -1197,7 +1207,7 @@ int nbd_receive_export_list(QIOChannel *ioc, QCryptoTLSCreds *tlscreds,
                 break;
             }

-            if (result == NBD_MODE_STRUCTURED &&
+            if (result >= NBD_MODE_STRUCTURED &&
                 nbd_list_meta_contexts(ioc, &array[i], errp) < 0) {
                 goto out;
             }
diff --git a/qemu-nbd.c b/qemu-nbd.c
index 3ddd0bf02b4..1d155fc2c66 100644
--- a/qemu-nbd.c
+++ b/qemu-nbd.c
@@ -238,6 +238,9 @@ static int qemu_nbd_client_list(SocketAddress *saddr, QCryptoTLSCreds *tls,
             printf("  opt block: %u\n", list[i].opt_block);
             printf("  max block: %u\n", list[i].max_block);
         }
+        printf("  transaction size: %s\n",
+               list[i].mode >= NBD_MODE_EXTENDED ?
+               "64-bit" : "32-bit");
         if (list[i].n_contexts) {
             printf("  available meta contexts: %d\n", list[i].n_contexts);
             for (j = 0; j < list[i].n_contexts; j++) {
diff --git a/tests/qemu-iotests/223.out b/tests/qemu-iotests/223.out
index 26fb347c5da..b98582c38ea 100644
--- a/tests/qemu-iotests/223.out
+++ b/tests/qemu-iotests/223.out
@@ -87,6 +87,7 @@ exports available: 3
   min block: 1
   opt block: 4096
   max block: 33554432
+  transaction size: 64-bit
   available meta contexts: 2
    base:allocation
    qemu:dirty-bitmap:b
@@ -97,6 +98,7 @@ exports available: 3
   min block: 1
   opt block: 4096
   max block: 33554432
+  transaction size: 64-bit
   available meta contexts: 2
    base:allocation
    qemu:dirty-bitmap:b2
@@ -106,6 +108,7 @@ exports available: 3
   min block: 1
   opt block: 4096
   max block: 33554432
+  transaction size: 64-bit
   available meta contexts: 2
    base:allocation
    qemu:dirty-bitmap:b3
@@ -206,6 +209,7 @@ exports available: 3
   min block: 1
   opt block: 4096
   max block: 33554432
+  transaction size: 64-bit
   available meta contexts: 2
    base:allocation
    qemu:dirty-bitmap:b
@@ -216,6 +220,7 @@ exports available: 3
   min block: 1
   opt block: 4096
   max block: 33554432
+  transaction size: 64-bit
   available meta contexts: 2
    base:allocation
    qemu:dirty-bitmap:b2
@@ -225,6 +230,7 @@ exports available: 3
   min block: 1
   opt block: 4096
   max block: 33554432
+  transaction size: 64-bit
   available meta contexts: 2
    base:allocation
    qemu:dirty-bitmap:b3
diff --git a/tests/qemu-iotests/233.out b/tests/qemu-iotests/233.out
index 237c82767ea..1910f7df20f 100644
--- a/tests/qemu-iotests/233.out
+++ b/tests/qemu-iotests/233.out
@@ -39,6 +39,7 @@ exports available: 1
  export: ''
   size:  67108864
   min block: 1
+  transaction size: 64-bit

 == check TLS fail over TCP with mismatched hostname ==
 qemu-img: Could not open 'driver=nbd,host=localhost,port=PORT,tls-creds=tls0': Certificate does not match the hostname localhost
@@ -53,6 +54,7 @@ exports available: 1
  export: ''
   size:  67108864
   min block: 1
+  transaction size: 64-bit

 == check TLS with different CA fails ==
 qemu-img: Could not open 'driver=nbd,host=127.0.0.1,port=PORT,tls-creds=tls0': The certificate hasn't got a known issuer
@@ -83,6 +85,7 @@ exports available: 1
  export: ''
   size:  67108864
   min block: 1
+  transaction size: 64-bit

 == check TLS works over UNIX with PSK ==
 image: nbd+unix://?socket=SOCK_DIR/qemu-nbd.sock
@@ -93,6 +96,7 @@ exports available: 1
  export: ''
   size:  67108864
   min block: 1
+  transaction size: 64-bit

 == check TLS fails over UNIX with mismatch PSK ==
 qemu-img: Could not open 'driver=nbd,path=SOCK_DIR/qemu-nbd.sock,tls-creds=tls0': TLS handshake failed: The TLS connection was non-properly terminated.
diff --git a/tests/qemu-iotests/241.out b/tests/qemu-iotests/241.out
index 88e8cfcd7e2..a9efb876521 100644
--- a/tests/qemu-iotests/241.out
+++ b/tests/qemu-iotests/241.out
@@ -6,6 +6,7 @@ exports available: 1
  export: ''
   size:  1024
   min block: 1
+  transaction size: 64-bit
 [{ "start": 0, "length": 1000, "depth": 0, "present": true, "zero": false, "data": true, "offset": OFFSET},
 { "start": 1000, "length": 24, "depth": 0, "present": true, "zero": true, "data": false, "offset": OFFSET}]
 1 KiB (0x400) bytes     allocated at offset 0 bytes (0x0)
@@ -16,6 +17,7 @@ exports available: 1
  export: ''
   size:  1024
   min block: 512
+  transaction size: 64-bit
 [{ "start": 0, "length": 1024, "depth": 0, "present": true, "zero": false, "data": true, "offset": OFFSET}]
 1 KiB (0x400) bytes     allocated at offset 0 bytes (0x0)
 WARNING: Image format was not specified for 'TEST_DIR/t.raw' and probing guessed raw.
@@ -28,6 +30,7 @@ exports available: 1
  export: ''
   size:  1024
   min block: 1
+  transaction size: 64-bit
 [{ "start": 0, "length": 1000, "depth": 0, "present": true, "zero": false, "data": true, "offset": OFFSET},
 { "start": 1000, "length": 24, "depth": 0, "present": true, "zero": true, "data": false, "offset": OFFSET}]
 1 KiB (0x400) bytes     allocated at offset 0 bytes (0x0)
diff --git a/tests/qemu-iotests/307.out b/tests/qemu-iotests/307.out
index 390f05d1b78..2b9a6a67a1a 100644
--- a/tests/qemu-iotests/307.out
+++ b/tests/qemu-iotests/307.out
@@ -19,6 +19,7 @@ exports available: 1
   min block: XXX
   opt block: XXX
   max block: XXX
+  transaction size: 64-bit
   available meta contexts: 1
    base:allocation

@@ -47,6 +48,7 @@ exports available: 1
   min block: XXX
   opt block: XXX
   max block: XXX
+  transaction size: 64-bit
   available meta contexts: 1
    base:allocation

@@ -78,6 +80,7 @@ exports available: 2
   min block: XXX
   opt block: XXX
   max block: XXX
+  transaction size: 64-bit
   available meta contexts: 1
    base:allocation
  export: 'export1'
@@ -87,6 +90,7 @@ exports available: 2
   min block: XXX
   opt block: XXX
   max block: XXX
+  transaction size: 64-bit
   available meta contexts: 1
    base:allocation

@@ -113,6 +117,7 @@ exports available: 1
   min block: XXX
   opt block: XXX
   max block: XXX
+  transaction size: 64-bit
   available meta contexts: 1
    base:allocation

diff --git a/tests/qemu-iotests/tests/nbd-qemu-allocation.out b/tests/qemu-iotests/tests/nbd-qemu-allocation.out
index 9d938db24e6..659276032b0 100644
--- a/tests/qemu-iotests/tests/nbd-qemu-allocation.out
+++ b/tests/qemu-iotests/tests/nbd-qemu-allocation.out
@@ -21,6 +21,7 @@ exports available: 1
   min block: 1
   opt block: 4096
   max block: 33554432
+  transaction size: 64-bit
   available meta contexts: 2
    base:allocation
    qemu:allocation-depth
-- 
2.40.1



^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v4 22/24] nbd/server: Refactor list of negotiated meta contexts
  2023-06-08 13:56 [PATCH v4 00/24] qemu patches for 64-bit NBD extensions Eric Blake
                   ` (20 preceding siblings ...)
  2023-06-08 13:56 ` [PATCH v4 21/24] nbd/client: Request extended headers during negotiation Eric Blake
@ 2023-06-08 13:56 ` Eric Blake
  2023-06-27 15:11   ` Vladimir Sementsov-Ogievskiy
  2023-06-08 13:56 ` [PATCH v4 23/24] nbd/server: Prepare for per-request filtering of BLOCK_STATUS Eric Blake
  2023-06-08 13:56 ` [PATCH v4 24/24] nbd/server: Add FLAG_PAYLOAD support to CMD_BLOCK_STATUS Eric Blake
  23 siblings, 1 reply; 62+ messages in thread
From: Eric Blake @ 2023-06-08 13:56 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, libguestfs, vsementsov, Kevin Wolf, Hanna Reitz

Peform several minor refactorings of how the list of negotiated meta
contexts is managed, to make upcoming patches easier: Promote the
internal type NBDExportMetaContexts to the public opaque type
NBDMetaContexts, and mark exp const.  Use a shorter member name in
NBDClient.  Hoist calls to nbd_check_meta_context() earlier in their
callers, as the number of negotiated contexts may impact the flags
exposed in regards to an export, which in turn requires a new
parameter.  Drop a redundant parameter to nbd_negotiate_meta_queries.
No semantic change intended.

Signed-off-by: Eric Blake <eblake@redhat.com>
---

v4: new patch split out from v3 13/14, with smaller impact (quit
trying to separate exp outside of NBDMeataContexts)
---
 include/block/nbd.h |  1 +
 nbd/server.c        | 55 ++++++++++++++++++++++++---------------------
 2 files changed, 31 insertions(+), 25 deletions(-)

diff --git a/include/block/nbd.h b/include/block/nbd.h
index af80087e2cd..f240707f646 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -28,6 +28,7 @@

 typedef struct NBDExport NBDExport;
 typedef struct NBDClient NBDClient;
+typedef struct NBDMetaContexts NBDMetaContexts;

 extern const BlockExportDriver blk_exp_nbd;

diff --git a/nbd/server.c b/nbd/server.c
index ae293663ca2..42a4300c95e 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -105,11 +105,13 @@ struct NBDExport {

 static QTAILQ_HEAD(, NBDExport) exports = QTAILQ_HEAD_INITIALIZER(exports);

-/* NBDExportMetaContexts represents a list of contexts to be exported,
+/*
+ * NBDMetaContexts represents a list of meta contexts in use,
  * as selected by NBD_OPT_SET_META_CONTEXT. Also used for
- * NBD_OPT_LIST_META_CONTEXT. */
-typedef struct NBDExportMetaContexts {
-    NBDExport *exp;
+ * NBD_OPT_LIST_META_CONTEXT.
+ */
+struct NBDMetaContexts {
+    const NBDExport *exp; /* associated export */
     size_t count; /* number of negotiated contexts */
     bool base_allocation; /* export base:allocation context (block status) */
     bool allocation_depth; /* export qemu:allocation-depth */
@@ -117,7 +119,7 @@ typedef struct NBDExportMetaContexts {
                     * export qemu:dirty-bitmap:<export bitmap name>,
                     * sized by exp->nr_export_bitmaps
                     */
-} NBDExportMetaContexts;
+};

 struct NBDClient {
     int refcount;
@@ -144,7 +146,7 @@ struct NBDClient {
     uint32_t check_align; /* If non-zero, check for aligned client requests */

     NBDMode mode;
-    NBDExportMetaContexts export_meta;
+    NBDMetaContexts contexts; /* Negotiated meta contexts */

     uint32_t opt; /* Current option being negotiated */
     uint32_t optlen; /* remaining length of data in ioc for the option being
@@ -455,10 +457,10 @@ static int nbd_negotiate_handle_list(NBDClient *client, Error **errp)
     return nbd_negotiate_send_rep(client, NBD_REP_ACK, errp);
 }

-static void nbd_check_meta_export(NBDClient *client)
+static void nbd_check_meta_export(NBDClient *client, NBDExport *exp)
 {
-    if (client->exp != client->export_meta.exp) {
-        client->export_meta.count = 0;
+    if (exp != client->contexts.exp) {
+        client->contexts.count = 0;
     }
 }

@@ -504,6 +506,7 @@ static int nbd_negotiate_handle_export_name(NBDClient *client, bool no_zeroes,
         error_setg(errp, "export not found");
         return -EINVAL;
     }
+    nbd_check_meta_export(client, client->exp);

     myflags = client->exp->nbdflags;
     if (client->mode >= NBD_MODE_STRUCTURED) {
@@ -521,7 +524,6 @@ static int nbd_negotiate_handle_export_name(NBDClient *client, bool no_zeroes,

     QTAILQ_INSERT_TAIL(&client->exp->clients, client, next);
     blk_exp_ref(&client->exp->common);
-    nbd_check_meta_export(client);

     return 0;
 }
@@ -641,6 +643,9 @@ static int nbd_negotiate_handle_info(NBDClient *client, Error **errp)
                                           errp, "export '%s' not present",
                                           sane_name);
     }
+    if (client->opt == NBD_OPT_GO) {
+        nbd_check_meta_export(client, exp);
+    }

     /* Don't bother sending NBD_INFO_NAME unless client requested it */
     if (sendname) {
@@ -729,7 +734,6 @@ static int nbd_negotiate_handle_info(NBDClient *client, Error **errp)
         client->check_align = check_align;
         QTAILQ_INSERT_TAIL(&client->exp->clients, client, next);
         blk_exp_ref(&client->exp->common);
-        nbd_check_meta_export(client);
         rc = 1;
     }
     return rc;
@@ -852,7 +856,7 @@ static bool nbd_strshift(const char **str, const char *prefix)
  * Handle queries to 'base' namespace. For now, only the base:allocation
  * context is available.  Return true if @query has been handled.
  */
-static bool nbd_meta_base_query(NBDClient *client, NBDExportMetaContexts *meta,
+static bool nbd_meta_base_query(NBDClient *client, NBDMetaContexts *meta,
                                 const char *query)
 {
     if (!nbd_strshift(&query, "base:")) {
@@ -872,7 +876,7 @@ static bool nbd_meta_base_query(NBDClient *client, NBDExportMetaContexts *meta,
  * and qemu:allocation-depth contexts are available.  Return true if @query
  * has been handled.
  */
-static bool nbd_meta_qemu_query(NBDClient *client, NBDExportMetaContexts *meta,
+static bool nbd_meta_qemu_query(NBDClient *client, NBDMetaContexts *meta,
                                 const char *query)
 {
     size_t i;
@@ -938,7 +942,7 @@ static bool nbd_meta_qemu_query(NBDClient *client, NBDExportMetaContexts *meta,
  * Return -errno on I/O error, 0 if option was completely handled by
  * sending a reply about inconsistent lengths, or 1 on success. */
 static int nbd_negotiate_meta_query(NBDClient *client,
-                                    NBDExportMetaContexts *meta, Error **errp)
+                                    NBDMetaContexts *meta, Error **errp)
 {
     int ret;
     g_autofree char *query = NULL;
@@ -977,14 +981,14 @@ static int nbd_negotiate_meta_query(NBDClient *client,
  * Handle NBD_OPT_LIST_META_CONTEXT and NBD_OPT_SET_META_CONTEXT
  *
  * Return -errno on I/O error, or 0 if option was completely handled. */
-static int nbd_negotiate_meta_queries(NBDClient *client,
-                                      NBDExportMetaContexts *meta, Error **errp)
+static int nbd_negotiate_meta_queries(NBDClient *client, Error **errp)
 {
     int ret;
     g_autofree char *export_name = NULL;
     /* Mark unused to work around https://bugs.llvm.org/show_bug.cgi?id=3888 */
     g_autofree G_GNUC_UNUSED bool *bitmaps = NULL;
-    NBDExportMetaContexts local_meta = {0};
+    NBDMetaContexts local_meta = {0};
+    NBDMetaContexts *meta;
     uint32_t nb_queries;
     size_t i;
     size_t count = 0;
@@ -1000,6 +1004,8 @@ static int nbd_negotiate_meta_queries(NBDClient *client,
     if (client->opt == NBD_OPT_LIST_META_CONTEXT) {
         /* Only change the caller's meta on SET. */
         meta = &local_meta;
+    } else {
+        meta = &client->contexts;
     }

     g_free(meta->bitmaps);
@@ -1282,8 +1288,7 @@ static int nbd_negotiate_options(NBDClient *client, Error **errp)

             case NBD_OPT_LIST_META_CONTEXT:
             case NBD_OPT_SET_META_CONTEXT:
-                ret = nbd_negotiate_meta_queries(client, &client->export_meta,
-                                                 errp);
+                ret = nbd_negotiate_meta_queries(client, errp);
                 break;

             case NBD_OPT_EXTENDED_HEADERS:
@@ -1515,7 +1520,7 @@ void nbd_client_put(NBDClient *client)
             QTAILQ_REMOVE(&client->exp->clients, client, next);
             blk_exp_unref(&client->exp->common);
         }
-        g_free(client->export_meta.bitmaps);
+        g_free(client->contexts.bitmaps);
         g_free(client);
     }
 }
@@ -2711,11 +2716,11 @@ static coroutine_fn int nbd_handle_request(NBDClient *client,
         }
         assert(client->mode >= NBD_MODE_EXTENDED ||
                request->len <= UINT32_MAX);
-        if (client->export_meta.count) {
+        if (client->contexts.count) {
             bool dont_fragment = request->flags & NBD_CMD_FLAG_REQ_ONE;
-            int contexts_remaining = client->export_meta.count;
+            int contexts_remaining = client->contexts.count;

-            if (client->export_meta.base_allocation) {
+            if (client->contexts.base_allocation) {
                 ret = nbd_co_send_block_status(client, request,
                                                exp->common.blk,
                                                request->from,
@@ -2728,7 +2733,7 @@ static coroutine_fn int nbd_handle_request(NBDClient *client,
                 }
             }

-            if (client->export_meta.allocation_depth) {
+            if (client->contexts.allocation_depth) {
                 ret = nbd_co_send_block_status(client, request,
                                                exp->common.blk,
                                                request->from, request->len,
@@ -2742,7 +2747,7 @@ static coroutine_fn int nbd_handle_request(NBDClient *client,
             }

             for (i = 0; i < client->exp->nr_export_bitmaps; i++) {
-                if (!client->export_meta.bitmaps[i]) {
+                if (!client->contexts.bitmaps[i]) {
                     continue;
                 }
                 ret = nbd_co_send_bitmap(client, request,
-- 
2.40.1



^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v4 23/24] nbd/server: Prepare for per-request filtering of BLOCK_STATUS
  2023-06-08 13:56 [PATCH v4 00/24] qemu patches for 64-bit NBD extensions Eric Blake
                   ` (21 preceding siblings ...)
  2023-06-08 13:56 ` [PATCH v4 22/24] nbd/server: Refactor list of negotiated meta contexts Eric Blake
@ 2023-06-08 13:56 ` Eric Blake
  2023-06-08 19:15   ` [Libguestfs] " Eric Blake
  2023-06-08 13:56 ` [PATCH v4 24/24] nbd/server: Add FLAG_PAYLOAD support to CMD_BLOCK_STATUS Eric Blake
  23 siblings, 1 reply; 62+ messages in thread
From: Eric Blake @ 2023-06-08 13:56 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, libguestfs, vsementsov, Kevin Wolf, Hanna Reitz

The next commit will add support for the optional extension
NBD_CMD_FLAG_PAYLOAD during NBD_CMD_BLOCK_STATUS, where the client can
request that the server only return a subset of negotiated contexts,
rather than all contexts.  To make that task easier, this patch
populates the list of contexts to return on a per-command basis (for
now, identical to the full set of negotiated contexts).

Signed-off-by: Eric Blake <eblake@redhat.com>
---

v4: split out NBDMetaContexts refactoring to its own patch, track
NBDRequests.contexts as a pointer rather than inline
---
 include/block/nbd.h |  1 +
 nbd/server.c        | 22 +++++++++++++++++-----
 2 files changed, 18 insertions(+), 5 deletions(-)

diff --git a/include/block/nbd.h b/include/block/nbd.h
index f240707f646..47850be5a66 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -76,6 +76,7 @@ typedef struct NBDRequest {
     uint16_t flags; /* NBD_CMD_FLAG_* */
     uint16_t type;  /* NBD_CMD_* */
     NBDMode mode;   /* Determines which network representation to use */
+    NBDMetaContexts *contexts; /* Used by NBD_CMD_BLOCK_STATUS */
 } NBDRequest;

 typedef struct NBDSimpleReply {
diff --git a/nbd/server.c b/nbd/server.c
index 42a4300c95e..308846fe46b 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -2491,6 +2491,8 @@ static int coroutine_fn nbd_co_receive_request(NBDRequestData *req, NBDRequest *
             error_setg(errp, "No memory");
             return -ENOMEM;
         }
+    } else if (request->type == NBD_CMD_BLOCK_STATUS) {
+        request->contexts = &client->contexts;
     }

     if (payload_len) {
@@ -2716,11 +2718,11 @@ static coroutine_fn int nbd_handle_request(NBDClient *client,
         }
         assert(client->mode >= NBD_MODE_EXTENDED ||
                request->len <= UINT32_MAX);
-        if (client->contexts.count) {
+        if (request->contexts->count) {
             bool dont_fragment = request->flags & NBD_CMD_FLAG_REQ_ONE;
-            int contexts_remaining = client->contexts.count;
+            int contexts_remaining = request->contexts->count;

-            if (client->contexts.base_allocation) {
+            if (request->contexts->base_allocation) {
                 ret = nbd_co_send_block_status(client, request,
                                                exp->common.blk,
                                                request->from,
@@ -2733,7 +2735,7 @@ static coroutine_fn int nbd_handle_request(NBDClient *client,
                 }
             }

-            if (client->contexts.allocation_depth) {
+            if (request->contexts->allocation_depth) {
                 ret = nbd_co_send_block_status(client, request,
                                                exp->common.blk,
                                                request->from, request->len,
@@ -2746,8 +2748,9 @@ static coroutine_fn int nbd_handle_request(NBDClient *client,
                 }
             }

+            assert(request->contexts->exp == client->exp);
             for (i = 0; i < client->exp->nr_export_bitmaps; i++) {
-                if (!client->contexts.bitmaps[i]) {
+                if (!request->contexts->bitmaps[i]) {
                     continue;
                 }
                 ret = nbd_co_send_bitmap(client, request,
@@ -2763,6 +2766,10 @@ static coroutine_fn int nbd_handle_request(NBDClient *client,
             assert(!contexts_remaining);

             return 0;
+        } else if (client->contexts.count) {
+            return nbd_send_generic_reply(client, request, -EINVAL,
+                                          "CMD_BLOCK_STATUS payload not valid",
+                                          errp);
         } else {
             return nbd_send_generic_reply(client, request, -EINVAL,
                                           "CMD_BLOCK_STATUS not negotiated",
@@ -2841,6 +2848,11 @@ static coroutine_fn void nbd_trip(void *opaque)
     } else {
         ret = nbd_handle_request(client, &request, req->data, &local_err);
     }
+    if (request.type == NBD_CMD_BLOCK_STATUS &&
+        request.contexts != &client->contexts) {
+        g_free(request.contexts->bitmaps);
+        g_free(request.contexts);
+    }
     if (ret < 0) {
         error_prepend(&local_err, "Failed to send reply: ");
         goto disconnect;
-- 
2.40.1



^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v4 24/24] nbd/server: Add FLAG_PAYLOAD support to CMD_BLOCK_STATUS
  2023-06-08 13:56 [PATCH v4 00/24] qemu patches for 64-bit NBD extensions Eric Blake
                   ` (22 preceding siblings ...)
  2023-06-08 13:56 ` [PATCH v4 23/24] nbd/server: Prepare for per-request filtering of BLOCK_STATUS Eric Blake
@ 2023-06-08 13:56 ` Eric Blake
  2023-06-08 19:19   ` [Libguestfs] " Eric Blake
  2023-06-27 19:42   ` Vladimir Sementsov-Ogievskiy
  23 siblings, 2 replies; 62+ messages in thread
From: Eric Blake @ 2023-06-08 13:56 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, libguestfs, vsementsov, Kevin Wolf, Hanna Reitz

Allow a client to request a subset of negotiated meta contexts.  For
example, a client may ask to use a single connection to learn about
both block status and dirty bitmaps, but where the dirty bitmap
queries only need to be performed on a subset of the disk; forcing the
server to compute that information on block status queries in the rest
of the disk is wasted effort (both at the server, and on the amount of
traffic sent over the wire to be parsed and ignored by the client).

Qemu as an NBD client never requests to use more than one meta
context, so it has no need to use block status payloads.  Testing this
instead requires support from libnbd, which CAN access multiple meta
contexts in parallel from a single NBD connection; an interop test
submitted to the libnbd project at the same time as this patch
demonstrates the feature working, as well as testing some corner cases
(for example, when the payload length is longer than the export
length), although other corner cases (like passing the same id
duplicated) requires a protocol fuzzer because libnbd is not wired up
to break the protocol that badly.

This also includes tweaks to 'qemu-nbd --list' to show when a server
is advertising the capability, and to the testsuite to reflect the
addition to that output.

Signed-off-by: Eric Blake <eblake@redhat.com>
---
 docs/interop/nbd.txt                          |  2 +-
 nbd/server.c                                  | 99 ++++++++++++++++++-
 qemu-nbd.c                                    |  1 +
 nbd/trace-events                              |  1 +
 tests/qemu-iotests/223.out                    | 12 +--
 tests/qemu-iotests/307.out                    | 10 +-
 .../tests/nbd-qemu-allocation.out             |  2 +-
 7 files changed, 111 insertions(+), 16 deletions(-)

diff --git a/docs/interop/nbd.txt b/docs/interop/nbd.txt
index abaf4c28a96..83d85ce8d13 100644
--- a/docs/interop/nbd.txt
+++ b/docs/interop/nbd.txt
@@ -69,4 +69,4 @@ NBD_CMD_BLOCK_STATUS for "qemu:dirty-bitmap:", NBD_CMD_CACHE
 NBD_CMD_FLAG_FAST_ZERO
 * 5.2: NBD_CMD_BLOCK_STATUS for "qemu:allocation-depth"
 * 7.1: NBD_FLAG_CAN_MULTI_CONN for shareable writable exports
-* 8.1: NBD_OPT_EXTENDED_HEADERS
+* 8.1: NBD_OPT_EXTENDED_HEADERS, NBD_FLAG_BLOCK_STATUS_PAYLOAD
diff --git a/nbd/server.c b/nbd/server.c
index 308846fe46b..696afcf5c46 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -512,6 +512,9 @@ static int nbd_negotiate_handle_export_name(NBDClient *client, bool no_zeroes,
     if (client->mode >= NBD_MODE_STRUCTURED) {
         myflags |= NBD_FLAG_SEND_DF;
     }
+    if (client->mode >= NBD_MODE_EXTENDED && client->contexts.count) {
+        myflags |= NBD_FLAG_BLOCK_STAT_PAYLOAD;
+    }
     trace_nbd_negotiate_new_style_size_flags(client->exp->size, myflags);
     stq_be_p(buf, client->exp->size);
     stw_be_p(buf + 8, myflags);
@@ -699,6 +702,10 @@ static int nbd_negotiate_handle_info(NBDClient *client, Error **errp)
     if (client->mode >= NBD_MODE_STRUCTURED) {
         myflags |= NBD_FLAG_SEND_DF;
     }
+    if (client->mode >= NBD_MODE_EXTENDED &&
+        (client->contexts.count || client->opt == NBD_OPT_INFO)) {
+        myflags |= NBD_FLAG_BLOCK_STAT_PAYLOAD;
+    }
     trace_nbd_negotiate_new_style_size_flags(exp->size, myflags);
     stq_be_p(buf, exp->size);
     stw_be_p(buf + 8, myflags);
@@ -2424,6 +2431,81 @@ static int coroutine_fn nbd_co_send_bitmap(NBDClient *client,
     return nbd_co_send_extents(client, request, ea, last, context_id, errp);
 }

+/*
+ * nbd_co_block_status_payload_read
+ * Called when a client wants a subset of negotiated contexts via a
+ * BLOCK_STATUS payload.  Check the payload for valid length and
+ * contents.  On success, return 0 with request updated to effective
+ * length.  If request was invalid but payload consumed, return 0 with
+ * request->len and request->contexts->count set to 0 (which will
+ * trigger an appropriate NBD_EINVAL response later on).  On I/O
+ * error, return -EIO.
+ */
+static int
+nbd_co_block_status_payload_read(NBDClient *client, NBDRequest *request,
+                                 Error **errp)
+{
+    int payload_len = request->len;
+    g_autofree char *buf = NULL;
+    size_t count, i, nr_bitmaps;
+    uint32_t id;
+
+    assert(request->len <= NBD_MAX_BUFFER_SIZE);
+    assert(client->contexts.exp == client->exp);
+    nr_bitmaps = client->exp->nr_export_bitmaps;
+    request->contexts = g_new0(NBDMetaContexts, 1);
+    request->contexts->exp = client->exp;
+
+    if (payload_len % sizeof(uint32_t) ||
+        payload_len < sizeof(NBDBlockStatusPayload) ||
+        payload_len > (sizeof(NBDBlockStatusPayload) +
+                       sizeof(id) * client->contexts.count)) {
+        goto skip;
+    }
+
+    buf = g_malloc(payload_len);
+    if (nbd_read(client->ioc, buf, payload_len,
+                 "CMD_BLOCK_STATUS data", errp) < 0) {
+        return -EIO;
+    }
+    trace_nbd_co_receive_request_payload_received(request->cookie,
+                                                  payload_len);
+    request->contexts->bitmaps = g_new0(bool, nr_bitmaps);
+    count = (payload_len - sizeof(NBDBlockStatusPayload)) / sizeof(id);
+    payload_len = 0;
+
+    for (i = 0; i < count; i++) {
+        id = ldl_be_p(buf + sizeof(NBDBlockStatusPayload) + sizeof(id) * i);
+        if (id == NBD_META_ID_BASE_ALLOCATION) {
+            if (request->contexts->base_allocation) {
+                goto skip;
+            }
+            request->contexts->base_allocation = true;
+        } else if (id == NBD_META_ID_ALLOCATION_DEPTH) {
+            if (request->contexts->allocation_depth) {
+                goto skip;
+            }
+            request->contexts->allocation_depth = true;
+        } else {
+            if (id - NBD_META_ID_DIRTY_BITMAP > nr_bitmaps ||
+                request->contexts->bitmaps[id - NBD_META_ID_DIRTY_BITMAP]) {
+                goto skip;
+            }
+            request->contexts->bitmaps[id - NBD_META_ID_DIRTY_BITMAP] = true;
+        }
+    }
+
+    request->len = ldq_be_p(buf);
+    request->contexts->count = count;
+    return 0;
+
+ skip:
+    trace_nbd_co_receive_block_status_payload_compliance(request->from,
+                                                         request->len);
+    request->len = request->contexts->count = 0;
+    return nbd_drop(client->ioc, payload_len, errp);
+}
+
 /* nbd_co_receive_request
  * Collect a client request. Return 0 if request looks valid, -EIO to drop
  * connection right away, -EAGAIN to indicate we were interrupted and the
@@ -2470,7 +2552,13 @@ static int coroutine_fn nbd_co_receive_request(NBDRequestData *req, NBDRequest *

     if (request->type == NBD_CMD_WRITE || extended_with_payload) {
         payload_len = request->len;
-        if (request->type != NBD_CMD_WRITE) {
+        if (request->type == NBD_CMD_BLOCK_STATUS) {
+            ret = nbd_co_block_status_payload_read(client, request, errp);
+            if (ret < 0) {
+                return ret;
+            }
+            payload_len = 0;
+        } else if (request->type != NBD_CMD_WRITE) {
             /*
              * For now, we don't support payloads on other
              * commands; but we can keep the connection alive.
@@ -2491,7 +2579,8 @@ static int coroutine_fn nbd_co_receive_request(NBDRequestData *req, NBDRequest *
             error_setg(errp, "No memory");
             return -ENOMEM;
         }
-    } else if (request->type == NBD_CMD_BLOCK_STATUS) {
+    } else if (request->type == NBD_CMD_BLOCK_STATUS &&
+               !extended_with_payload) {
         request->contexts = &client->contexts;
     }

@@ -2547,6 +2636,9 @@ static int coroutine_fn nbd_co_receive_request(NBDRequestData *req, NBDRequest *
         valid_flags |= NBD_CMD_FLAG_NO_HOLE | NBD_CMD_FLAG_FAST_ZERO;
     } else if (request->type == NBD_CMD_BLOCK_STATUS) {
         valid_flags |= NBD_CMD_FLAG_REQ_ONE;
+        if (client->mode >= NBD_MODE_EXTENDED && client->contexts.count) {
+            valid_flags |= NBD_CMD_FLAG_PAYLOAD_LEN;
+        }
     }
     if (request->flags & ~valid_flags) {
         error_setg(errp, "unsupported flags for command %s (got 0x%x)",
@@ -2712,7 +2804,8 @@ static coroutine_fn int nbd_handle_request(NBDClient *client,
                                       "discard failed", errp);

     case NBD_CMD_BLOCK_STATUS:
-        if (!request->len) {
+        assert(request->contexts);
+        if (!request->len && !(request->flags & NBD_CMD_FLAG_PAYLOAD_LEN)) {
             return nbd_send_generic_reply(client, request, -EINVAL,
                                           "need non-zero length", errp);
         }
diff --git a/qemu-nbd.c b/qemu-nbd.c
index 1d155fc2c66..cbca0eeee62 100644
--- a/qemu-nbd.c
+++ b/qemu-nbd.c
@@ -222,6 +222,7 @@ static int qemu_nbd_client_list(SocketAddress *saddr, QCryptoTLSCreds *tls,
                 [NBD_FLAG_SEND_RESIZE_BIT]          = "resize",
                 [NBD_FLAG_SEND_CACHE_BIT]           = "cache",
                 [NBD_FLAG_SEND_FAST_ZERO_BIT]       = "fast-zero",
+                [NBD_FLAG_BLOCK_STAT_PAYLOAD_BIT]   = "block-status-payload",
             };

             printf("  size:  %" PRIu64 "\n", list[i].size);
diff --git a/nbd/trace-events b/nbd/trace-events
index 51bfb129c95..a1af6d003b4 100644
--- a/nbd/trace-events
+++ b/nbd/trace-events
@@ -70,6 +70,7 @@ nbd_co_send_chunk_read(uint64_t cookie, uint64_t offset, void *data, size_t size
 nbd_co_send_chunk_read_hole(uint64_t cookie, uint64_t offset, size_t size) "Send structured read hole reply: cookie = %" PRIu64 ", offset = %" PRIu64 ", len = %zu"
 nbd_co_send_extents(uint64_t cookie, unsigned int extents, uint32_t id, uint64_t length, int last) "Send block status reply: cookie = %" PRIu64 ", extents = %u, context = %d (extents cover %" PRIu64 " bytes, last chunk = %d)"
 nbd_co_send_chunk_error(uint64_t cookie, int err, const char *errname, const char *msg) "Send structured error reply: cookie = %" PRIu64 ", error = %d (%s), msg = '%s'"
+nbd_co_receive_block_status_payload_compliance(uint64_t from, int len) "client sent unusable block status payload: from=0x%" PRIx64 ", len=0x%x"
 nbd_co_receive_request_decode_type(uint64_t cookie, uint16_t type, const char *name) "Decoding type: cookie = %" PRIu64 ", type = %" PRIu16 " (%s)"
 nbd_co_receive_request_payload_received(uint64_t cookie, uint64_t len) "Payload received: cookie = %" PRIu64 ", len = %" PRIu64
 nbd_co_receive_ext_payload_compliance(uint64_t from, uint64_t len) "client sent non-compliant write without payload flag: from=0x%" PRIx64 ", len=0x%" PRIx64
diff --git a/tests/qemu-iotests/223.out b/tests/qemu-iotests/223.out
index b98582c38ea..b38f0b7963b 100644
--- a/tests/qemu-iotests/223.out
+++ b/tests/qemu-iotests/223.out
@@ -83,7 +83,7 @@ exports available: 0
 exports available: 3
  export: 'n'
   size:  4194304
-  flags: 0x58f ( readonly flush fua df multi cache )
+  flags: 0x158f ( readonly flush fua df multi cache block-status-payload )
   min block: 1
   opt block: 4096
   max block: 33554432
@@ -94,7 +94,7 @@ exports available: 3
  export: 'n2'
   description: some text
   size:  4194304
-  flags: 0xded ( flush fua trim zeroes df multi cache fast-zero )
+  flags: 0x1ded ( flush fua trim zeroes df multi cache fast-zero block-status-payload )
   min block: 1
   opt block: 4096
   max block: 33554432
@@ -104,7 +104,7 @@ exports available: 3
    qemu:dirty-bitmap:b2
  export: 'n3'
   size:  4194304
-  flags: 0x58f ( readonly flush fua df multi cache )
+  flags: 0x158f ( readonly flush fua df multi cache block-status-payload )
   min block: 1
   opt block: 4096
   max block: 33554432
@@ -205,7 +205,7 @@ exports available: 0
 exports available: 3
  export: 'n'
   size:  4194304
-  flags: 0x58f ( readonly flush fua df multi cache )
+  flags: 0x158f ( readonly flush fua df multi cache block-status-payload )
   min block: 1
   opt block: 4096
   max block: 33554432
@@ -216,7 +216,7 @@ exports available: 3
  export: 'n2'
   description: some text
   size:  4194304
-  flags: 0xded ( flush fua trim zeroes df multi cache fast-zero )
+  flags: 0x1ded ( flush fua trim zeroes df multi cache fast-zero block-status-payload )
   min block: 1
   opt block: 4096
   max block: 33554432
@@ -226,7 +226,7 @@ exports available: 3
    qemu:dirty-bitmap:b2
  export: 'n3'
   size:  4194304
-  flags: 0x58f ( readonly flush fua df multi cache )
+  flags: 0x158f ( readonly flush fua df multi cache block-status-payload )
   min block: 1
   opt block: 4096
   max block: 33554432
diff --git a/tests/qemu-iotests/307.out b/tests/qemu-iotests/307.out
index 2b9a6a67a1a..f645f3315f8 100644
--- a/tests/qemu-iotests/307.out
+++ b/tests/qemu-iotests/307.out
@@ -15,7 +15,7 @@ wrote 4096/4096 bytes at offset 0
 exports available: 1
  export: 'fmt'
   size:  67108864
-  flags: 0x58f ( readonly flush fua df multi cache )
+  flags: 0x158f ( readonly flush fua df multi cache block-status-payload )
   min block: XXX
   opt block: XXX
   max block: XXX
@@ -44,7 +44,7 @@ exports available: 1
 exports available: 1
  export: 'fmt'
   size:  67108864
-  flags: 0x58f ( readonly flush fua df multi cache )
+  flags: 0x158f ( readonly flush fua df multi cache block-status-payload )
   min block: XXX
   opt block: XXX
   max block: XXX
@@ -76,7 +76,7 @@ exports available: 1
 exports available: 2
  export: 'fmt'
   size:  67108864
-  flags: 0x58f ( readonly flush fua df multi cache )
+  flags: 0x158f ( readonly flush fua df multi cache block-status-payload )
   min block: XXX
   opt block: XXX
   max block: XXX
@@ -86,7 +86,7 @@ exports available: 2
  export: 'export1'
   description: This is the writable second export
   size:  67108864
-  flags: 0xded ( flush fua trim zeroes df multi cache fast-zero )
+  flags: 0x1ded ( flush fua trim zeroes df multi cache fast-zero block-status-payload )
   min block: XXX
   opt block: XXX
   max block: XXX
@@ -113,7 +113,7 @@ exports available: 1
  export: 'export1'
   description: This is the writable second export
   size:  67108864
-  flags: 0xded ( flush fua trim zeroes df multi cache fast-zero )
+  flags: 0x1ded ( flush fua trim zeroes df multi cache fast-zero block-status-payload )
   min block: XXX
   opt block: XXX
   max block: XXX
diff --git a/tests/qemu-iotests/tests/nbd-qemu-allocation.out b/tests/qemu-iotests/tests/nbd-qemu-allocation.out
index 659276032b0..794d1bfce62 100644
--- a/tests/qemu-iotests/tests/nbd-qemu-allocation.out
+++ b/tests/qemu-iotests/tests/nbd-qemu-allocation.out
@@ -17,7 +17,7 @@ wrote 2097152/2097152 bytes at offset 1048576
 exports available: 1
  export: ''
   size:  4194304
-  flags: 0x48f ( readonly flush fua df cache )
+  flags: 0x148f ( readonly flush fua df cache block-status-payload )
   min block: 1
   opt block: 4096
   max block: 33554432
-- 
2.40.1



^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: [Libguestfs] [PATCH v4 02/24] nbd: Consistent typedef usage in header
  2023-06-08 13:56 ` [PATCH v4 02/24] nbd: Consistent typedef usage in header Eric Blake
@ 2023-06-08 14:17   ` Eric Blake
  2023-06-12 11:59     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 62+ messages in thread
From: Eric Blake @ 2023-06-08 14:17 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Hanna Reitz, libguestfs, qemu-block, vsementsov

On Thu, Jun 08, 2023 at 08:56:31AM -0500, Eric Blake wrote:
> We had a mix of struct declarataions followed by typedefs, and direct

declarations

> struct definitions as part of a typedef.  Pick a single style.  Also
> float a couple of opaque typedefs earlier in the file, as a later
> patch wants to refer NBDExport* in NBDRequest.  No semantic impact.

The curse of writing a commit message and then rebasing to a different
idea; in patch 22, I had originally intended to make NBDMetaContexts a
concrete type in nbd.h (which depends on NBDExport*, and would be
directly used in NBDRequest, which in turn is declared before the
pre-patch mention of NBDExport), but then changed my mind to instead
have NBDMetaContexts itself also be an opaque type with NBDRequest
only using NBDMetaContexts*.  And I missed floating the typedef for
NBDClientConnection to the same point, because we somewhat separated
opaque types along the lines of which .c files provide various
functions and opaque types.

> @@ -26,24 +26,25 @@
>  #include "qapi/error.h"
>  #include "qemu/bswap.h"
> 
> +typedef struct NBDExport NBDExport;
> +typedef struct NBDClient NBDClient;
> +

Preferences on how I should tweak that aspect of this patch?  Options:

- Don't float NBDExport or NBDClient, and drop that part of the commit
  message.  However, the later patch that adds the typedef for
  NBDMetaContexts still has to do it earlier than the definition of
  NBDRequest, rather than alongside the other opaque types relevant to
  server.c

- Also float NBDClientConnection up here, and reword the commit
  message along the lines of: Also float forward declarations of
  opaque types to the top of the file, rather than interspersed with
  function declarations, which will help a future patch that wants to
  expose yet another opaque type that will be referenced in
  NBDRequest.

- something else?

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [Libguestfs] [PATCH v4 05/24] nbd: s/handle/cookie/ to match NBD spec
  2023-06-08 13:56 ` [PATCH v4 05/24] nbd: s/handle/cookie/ to match NBD spec Eric Blake
@ 2023-06-08 14:32   ` Eric Blake
  2023-06-12 14:12   ` Vladimir Sementsov-Ogievskiy
  1 sibling, 0 replies; 62+ messages in thread
From: Eric Blake @ 2023-06-08 14:32 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Hanna Reitz, libguestfs, qemu-block, vsementsov

On Thu, Jun 08, 2023 at 08:56:34AM -0500, Eric Blake wrote:
> Externally, libnbd exposed the 64-bit opaque marker for each client
> NBD packet as the "cookie", because it was less confusing when
> contrasted with 'struct nbd_handle *' holding all libnbd state.  It
> also avoids confusion between the nown 'handle' as a way to identify a

noun

> packet and the verb 'handle' for reacting to things like signals.
> Upstream NBD changed their spec to favor the name "cookie" based on
> libnbd's recommendations[1], so we can do likewise.
> 
> [1] https://github.com/NetworkBlockDevice/nbd/commit/ca4392eb2b
> 
> Signed-off-by: Eric Blake <eblake@redhat.com>
> ---

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [Libguestfs] [PATCH v4 12/24] nbd: Prepare for 64-bit request effect lengths
  2023-06-08 13:56 ` [PATCH v4 12/24] nbd: Prepare for 64-bit request effect lengths Eric Blake
@ 2023-06-08 18:26   ` Eric Blake
  2023-06-16 18:16   ` Vladimir Sementsov-Ogievskiy
  1 sibling, 0 replies; 62+ messages in thread
From: Eric Blake @ 2023-06-08 18:26 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Hanna Reitz, libguestfs, qemu-block, vsementsov

On Thu, Jun 08, 2023 at 08:56:41AM -0500, Eric Blake wrote:
> Widen the length field of NBDRequest to 64-bits, although we can
> assert that all current uses are still under 32 bits, because nothing
> ever puts us into NBD_MODE_EXTENDED yet.  Thus no semantic change.  No
> semantic change yet.

I should quit repeating myself ;) That last sentence is leftover from
a squash.

> 
> Signed-off-by: Eric Blake <eblake@redhat.com>
> ---
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [Libguestfs] [PATCH v4 13/24] nbd/server: Refactor handling of request payload
  2023-06-08 13:56 ` [PATCH v4 13/24] nbd/server: Refactor handling of request payload Eric Blake
@ 2023-06-08 18:29   ` Eric Blake
  2023-06-16 18:29     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 62+ messages in thread
From: Eric Blake @ 2023-06-08 18:29 UTC (permalink / raw)
  To: qemu-devel; +Cc: libguestfs, qemu-block, vsementsov

On Thu, Jun 08, 2023 at 08:56:42AM -0500, Eric Blake wrote:
> Upcoming additions to support NBD 64-bit effect lengths allow for the
> possibility to distinguish between payload length (capped at 32M) and
> effect length (up to 63 bits).  Without that extension, only the

Technically, the NBD spec does not (yet) have a documented cap of a
63-bit size limit; although I should probably propose a patch upstream
that does that (I had one in my draft work, but it hasn't yet made it
upstream)

> NBD_CMD_WRITE request has a payload; but with the extension, it makes
> sense to allow at least NBD_CMD_BLOCK_STATUS to have both a payload
> and effect length (where the payload is a limited-size struct that in
> turns gives the real effect length as well as a subset of known ids
> for which status is requested).  Other future NBD commands may also
> have a request payload, so the 64-bit extension introduces a new
> NBD_CMD_FLAG_PAYLOAD_LEN that distinguishes between whether the header
> length is a payload length or an effect length, rather than
> hard-coding the decision based on the command.  Note that we do not
> support the payload version of BLOCK_STATUS yet.
> 
> For this patch, no semantic change is intended for a compliant client.
> For a non-compliant client, it is possible that the error behavior
> changes (a different message, a change on whether the connection is
> killed or remains alive for the next command, or so forth), in part
> because req->complete is set later on some paths, but all errors
> should still be handled gracefully.
> 
> Signed-off-by: Eric Blake <eblake@redhat.com>
> ---
> 
> v4: less indentation on several 'if's [Vladimir]
> ---
>  nbd/server.c     | 76 ++++++++++++++++++++++++++++++------------------
>  nbd/trace-events |  1 +
>  2 files changed, 49 insertions(+), 28 deletions(-)
> 
> diff --git a/nbd/server.c b/nbd/server.c
> index 4ac05d0cd7b..d7dc29f0445 100644
> --- a/nbd/server.c
> +++ b/nbd/server.c
> @@ -2329,6 +2329,8 @@ static int coroutine_fn nbd_co_receive_request(NBDRequestData *req, NBDRequest *
>                                                 Error **errp)
>  {
>      NBDClient *client = req->client;
> +    bool extended_with_payload;
> +    unsigned payload_len = 0;
>      int valid_flags;
>      int ret;
> 
> @@ -2342,48 +2344,63 @@ static int coroutine_fn nbd_co_receive_request(NBDRequestData *req, NBDRequest *
>      trace_nbd_co_receive_request_decode_type(request->cookie, request->type,
>                                               nbd_cmd_lookup(request->type));
> 
> -    if (request->type != NBD_CMD_WRITE) {
> -        /* No payload, we are ready to read the next request.  */
> -        req->complete = true;
> -    }
> -
>      if (request->type == NBD_CMD_DISC) {
>          /* Special case: we're going to disconnect without a reply,
>           * whether or not flags, from, or len are bogus */
> +        req->complete = true;
>          return -EIO;
>      }
> 
> -    if (request->type == NBD_CMD_READ || request->type == NBD_CMD_WRITE ||
> -        request->type == NBD_CMD_CACHE)
> -    {
> -        if (request->len > NBD_MAX_BUFFER_SIZE) {
> -            error_setg(errp, "len (%" PRIu64" ) is larger than max len (%u)",
> -                       request->len, NBD_MAX_BUFFER_SIZE);
> -            return -EINVAL;
> +    /* Payload and buffer handling. */
> +    extended_with_payload = client->mode >= NBD_MODE_EXTENDED &&
> +        (request->flags & NBD_CMD_FLAG_PAYLOAD_LEN);
> +    if ((request->type == NBD_CMD_READ || request->type == NBD_CMD_WRITE ||
> +         request->type == NBD_CMD_CACHE || extended_with_payload) &&
> +        request->len > NBD_MAX_BUFFER_SIZE) {

I'm still debating about rewriting this series of slightly-different
'if' into a single switch (request->type) block; the benefit is that
all actions for one command will be localized instead of split over
multiple lines of if, the drawback is that some code will now have to
be duplicated across commands (such as the buffer allocation code for
CMD_READ and CMD_WRITE).

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [Libguestfs] [PATCH v4 18/24] nbd/client: Plumb errp through nbd_receive_replies
  2023-06-08 13:56 ` [PATCH v4 18/24] nbd/client: Plumb errp through nbd_receive_replies Eric Blake
@ 2023-06-08 19:10   ` Eric Blake
  2023-06-27 13:31   ` Vladimir Sementsov-Ogievskiy
  1 sibling, 0 replies; 62+ messages in thread
From: Eric Blake @ 2023-06-08 19:10 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Hanna Reitz, libguestfs, qemu-block, vsementsov

On Thu, Jun 08, 2023 at 08:56:47AM -0500, Eric Blake wrote:
> Instead of ignoring the low-level error just to refabricate our own
> message to pass to the caller, we can just plump the caller's errp

plumb

(at least I got it right in the subject line...)

> down to the low level.
> 
> Signed-off-by: Eric Blake <eblake@redhat.com>
> ---
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [Libguestfs] [PATCH v4 23/24] nbd/server: Prepare for per-request filtering of BLOCK_STATUS
  2023-06-08 13:56 ` [PATCH v4 23/24] nbd/server: Prepare for per-request filtering of BLOCK_STATUS Eric Blake
@ 2023-06-08 19:15   ` Eric Blake
  2023-06-27 15:19     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 62+ messages in thread
From: Eric Blake @ 2023-06-08 19:15 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Hanna Reitz, libguestfs, qemu-block, vsementsov

On Thu, Jun 08, 2023 at 08:56:52AM -0500, Eric Blake wrote:
> The next commit will add support for the optional extension
> NBD_CMD_FLAG_PAYLOAD during NBD_CMD_BLOCK_STATUS, where the client can
> request that the server only return a subset of negotiated contexts,
> rather than all contexts.  To make that task easier, this patch
> populates the list of contexts to return on a per-command basis (for
> now, identical to the full set of negotiated contexts).
> 
> Signed-off-by: Eric Blake <eblake@redhat.com>
> ---
> 

> +++ b/nbd/server.c
> @@ -2491,6 +2491,8 @@ static int coroutine_fn nbd_co_receive_request(NBDRequestData *req, NBDRequest *
>              error_setg(errp, "No memory");
>              return -ENOMEM;
>          }
> +    } else if (request->type == NBD_CMD_BLOCK_STATUS) {
> +        request->contexts = &client->contexts;
>      }

THere are paths where request->contexts is left NULL but request->type
was set...

> @@ -2841,6 +2848,11 @@ static coroutine_fn void nbd_trip(void *opaque)
>      } else {
>          ret = nbd_handle_request(client, &request, req->data, &local_err);
>      }
> +    if (request.type == NBD_CMD_BLOCK_STATUS &&
> +        request.contexts != &client->contexts) {
> +        g_free(request.contexts->bitmaps);
> +        g_free(request.contexts);
> +    }

so I think this also needs to be tweaked to check that
request.contexts is non-NULL before dereferencing to free bitmaps.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [Libguestfs] [PATCH v4 24/24] nbd/server: Add FLAG_PAYLOAD support to CMD_BLOCK_STATUS
  2023-06-08 13:56 ` [PATCH v4 24/24] nbd/server: Add FLAG_PAYLOAD support to CMD_BLOCK_STATUS Eric Blake
@ 2023-06-08 19:19   ` Eric Blake
  2023-06-27 19:42   ` Vladimir Sementsov-Ogievskiy
  1 sibling, 0 replies; 62+ messages in thread
From: Eric Blake @ 2023-06-08 19:19 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Hanna Reitz, libguestfs, qemu-block, vsementsov

On Thu, Jun 08, 2023 at 08:56:53AM -0500, Eric Blake wrote:
> Allow a client to request a subset of negotiated meta contexts.  For
> example, a client may ask to use a single connection to learn about
> both block status and dirty bitmaps, but where the dirty bitmap
> queries only need to be performed on a subset of the disk; forcing the
> server to compute that information on block status queries in the rest
> of the disk is wasted effort (both at the server, and on the amount of
> traffic sent over the wire to be parsed and ignored by the client).
> 
> Qemu as an NBD client never requests to use more than one meta
> context, so it has no need to use block status payloads.  Testing this
> instead requires support from libnbd, which CAN access multiple meta
> contexts in parallel from a single NBD connection; an interop test
> submitted to the libnbd project at the same time as this patch
> demonstrates the feature working, as well as testing some corner cases
> (for example, when the payload length is longer than the export
> length), although other corner cases (like passing the same id
> duplicated) requires a protocol fuzzer because libnbd is not wired up
> to break the protocol that badly.
> 
> This also includes tweaks to 'qemu-nbd --list' to show when a server
> is advertising the capability, and to the testsuite to reflect the
> addition to that output.
> 
> Signed-off-by: Eric Blake <eblake@redhat.com>
> ---

> +++ b/nbd/server.c
> @@ -512,6 +512,9 @@ static int nbd_negotiate_handle_export_name(NBDClient *client, bool no_zeroes,
>      if (client->mode >= NBD_MODE_STRUCTURED) {
>          myflags |= NBD_FLAG_SEND_DF;
>      }
> +    if (client->mode >= NBD_MODE_EXTENDED && client->contexts.count) {
> +        myflags |= NBD_FLAG_BLOCK_STAT_PAYLOAD;
> +    }

This one's awkward.  At the last minute, I changed what went into
upstream NBD to state that a client that successfully negotiates
extended headers MUST NOT use NBD_OPT_EXPORT_NAME, which means this
branch should never fire for a compliant client.  I don't think it
hurts to leave this in, but it does point out that I am missing code
(either here or in 17/24) that at least logs when we detect a
non-compliant client trying to connect with the wrong command.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [Libguestfs] [PATCH v4 02/24] nbd: Consistent typedef usage in header
  2023-06-08 14:17   ` [Libguestfs] " Eric Blake
@ 2023-06-12 11:59     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 62+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2023-06-12 11:59 UTC (permalink / raw)
  To: Eric Blake, qemu-devel; +Cc: Kevin Wolf, Hanna Reitz, libguestfs, qemu-block

On 08.06.23 17:17, Eric Blake wrote:
> On Thu, Jun 08, 2023 at 08:56:31AM -0500, Eric Blake wrote:
>> We had a mix of struct declarataions followed by typedefs, and direct
> 
> declarations
> 
>> struct definitions as part of a typedef.  Pick a single style.  Also
>> float a couple of opaque typedefs earlier in the file, as a later
>> patch wants to refer NBDExport* in NBDRequest.  No semantic impact.
> 
> The curse of writing a commit message and then rebasing to a different
> idea; in patch 22, I had originally intended to make NBDMetaContexts a
> concrete type in nbd.h (which depends on NBDExport*, and would be
> directly used in NBDRequest, which in turn is declared before the
> pre-patch mention of NBDExport), but then changed my mind to instead
> have NBDMetaContexts itself also be an opaque type with NBDRequest
> only using NBDMetaContexts*.  And I missed floating the typedef for
> NBDClientConnection to the same point, because we somewhat separated
> opaque types along the lines of which .c files provide various
> functions and opaque types.
> 
>> @@ -26,24 +26,25 @@
>>   #include "qapi/error.h"
>>   #include "qemu/bswap.h"
>>
>> +typedef struct NBDExport NBDExport;
>> +typedef struct NBDClient NBDClient;
>> +
> 
> Preferences on how I should tweak that aspect of this patch?  Options:
> 
> - Don't float NBDExport or NBDClient, and drop that part of the commit
>    message.  However, the later patch that adds the typedef for
>    NBDMetaContexts still has to do it earlier than the definition of
>    NBDRequest, rather than alongside the other opaque types relevant to
>    server.c
> 
> - Also float NBDClientConnection up here, and reword the commit
>    message along the lines of: Also float forward declarations of
>    opaque types to the top of the file, rather than interspersed with
>    function declarations, which will help a future patch that wants to
>    expose yet another opaque type that will be referenced in
>    NBDRequest.

Sounds good to me. Anyway:
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>

> 
> - something else?
> 

-- 
Best regards,
Vladimir



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 03/24] nbd/server: Prepare for alternate-size headers
  2023-06-08 13:56 ` [PATCH v4 03/24] nbd/server: Prepare for alternate-size headers Eric Blake
@ 2023-06-12 13:53   ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 62+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2023-06-12 13:53 UTC (permalink / raw)
  To: Eric Blake, qemu-devel; +Cc: qemu-block, libguestfs, Kevin Wolf, Hanna Reitz

On 08.06.23 16:56, Eric Blake wrote:
> Upstream NBD now documents[1] an extension that supports 64-bit effect
> lengths in requests.  As part of that extension, the size of the reply
> headers will change in order to permit a 64-bit length in the reply
> for symmetry[2].  Additionally, where the reply header is currently 16
> bytes for simple reply, and 20 bytes for structured reply; with the
> extension enabled, there will only be one extended reply header, of 32
> bytes, with both structured and extended modes sending identical
> payloads for chunked replies.
> 
> Since we are already wired up to use iovecs, it is easiest to allow
> for this change in header size by splitting each structured reply
> across multiple iovecs, one for the header (which will become wider in
> a future patch according to client negotiation), and the other(s) for
> the chunk payload, and removing the header from the payload struct
> definitions.  Rename the affected functions with s/structured/chunk/
> to make it obvious that the code will be reused in extended mode.
> 
> Interestingly, the client side code never utilized the packed types,
> so only the server code needs to be updated.
> 
> [1]https://github.com/NetworkBlockDevice/nbd/blob/extension-ext-header/doc/proto.md
> as of NBD commit e6f3b94a934
> 
> [2] Note that on the surface, this is because some future server might
> permit a 4G+ NBD_CMD_READ and need to reply with that much data in one
> transaction.  But even though the extended reply length is widened to
> 64 bits, for now the NBD spec is clear that servers will not reply
> with more than a maximum payload bounded by the 32-bit
> NBD_INFO_BLOCK_SIZE field; allowing a client and server to mutually
> agree to transactions larger than 4G would require yet another
> extension.
> 
> Signed-off-by: Eric Blake<eblake@redhat.com>

Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>

-- 
Best regards,
Vladimir



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 05/24] nbd: s/handle/cookie/ to match NBD spec
  2023-06-08 13:56 ` [PATCH v4 05/24] nbd: s/handle/cookie/ to match NBD spec Eric Blake
  2023-06-08 14:32   ` [Libguestfs] " Eric Blake
@ 2023-06-12 14:12   ` Vladimir Sementsov-Ogievskiy
  1 sibling, 0 replies; 62+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2023-06-12 14:12 UTC (permalink / raw)
  To: Eric Blake, qemu-devel; +Cc: qemu-block, libguestfs, Kevin Wolf, Hanna Reitz

On 08.06.23 16:56, Eric Blake wrote:
> Externally, libnbd exposed the 64-bit opaque marker for each client
> NBD packet as the "cookie", because it was less confusing when
> contrasted with 'struct nbd_handle *' holding all libnbd state.  It
> also avoids confusion between the nown 'handle' as a way to identify a
> packet and the verb 'handle' for reacting to things like signals.
> Upstream NBD changed their spec to favor the name "cookie" based on
> libnbd's recommendations[1], so we can do likewise.
> 
> [1]https://github.com/NetworkBlockDevice/nbd/commit/ca4392eb2b
> 
> Signed-off-by: Eric Blake<eblake@redhat.com>

Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>

-- 
Best regards,
Vladimir



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 06/24] nbd/client: Simplify cookie vs. index computation
  2023-06-08 13:56 ` [PATCH v4 06/24] nbd/client: Simplify cookie vs. index computation Eric Blake
@ 2023-06-12 14:27   ` Vladimir Sementsov-Ogievskiy
  2023-06-12 19:13     ` Eric Blake
  0 siblings, 1 reply; 62+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2023-06-12 14:27 UTC (permalink / raw)
  To: Eric Blake, qemu-devel; +Cc: qemu-block, libguestfs, Kevin Wolf, Hanna Reitz

On 08.06.23 16:56, Eric Blake wrote:
> Our code relies on a sentinel cookie value of zero for deciding when a
> packet has been handled, as well as relying on array indices between 0
> and MAX_NBD_REQUESTS-1 for dereferencing purposes.  As long as we can
> symmetrically convert between two forms, there is no reason to go with
> the odd choice of using XOR with a random pointer, when we can instead
> simplify the mappings with a mere offset of 1.

Should we go further and use (uint64)-1 as a sentinel cookie value, and just use index as a cookie?  Or, using zero cookie in a wire looks too asymmetric?

> 
> Signed-off-by: Eric Blake <eblake@redhat.com>

Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>

> ---
> 
> v4: new patch
> ---
>   block/nbd.c | 16 ++++++++--------
>   1 file changed, 8 insertions(+), 8 deletions(-)
> 
> diff --git a/block/nbd.c b/block/nbd.c
> index be3c46c6fee..5322e66166c 100644
> --- a/block/nbd.c
> +++ b/block/nbd.c
> @@ -50,8 +50,8 @@
>   #define EN_OPTSTR ":exportname="
>   #define MAX_NBD_REQUESTS    16
> 
> -#define COOKIE_TO_INDEX(bs, cookie) ((cookie) ^ (uint64_t)(intptr_t)(bs))
> -#define INDEX_TO_COOKIE(bs, index)  ((index)  ^ (uint64_t)(intptr_t)(bs))

That looked like some security trick to hide real indices. But I don't think that index of request in a list is a secret information.

> +#define COOKIE_TO_INDEX(cookie) ((cookie) - 1)
> +#define INDEX_TO_COOKIE(index)  ((index) + 1)
> 

[..]

-- 
Best regards,
Vladimir



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 08/24] nbd: Use enum for various negotiation modes
  2023-06-08 13:56 ` [PATCH v4 08/24] nbd: Use enum for various negotiation modes Eric Blake
@ 2023-06-12 14:39   ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 62+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2023-06-12 14:39 UTC (permalink / raw)
  To: Eric Blake, qemu-devel; +Cc: qemu-block, libguestfs, Kevin Wolf, Hanna Reitz

On 08.06.23 16:56, Eric Blake wrote:
> Deciphering the hard-coded list of integer return values from
> nbd_start_negotiate() will only get more confusing when adding support
> for 64-bit extended headers.  Better is to name things in an enum.
> Although the function in question is private to client.c, putting the
> enum in a public header and including an enum-to-string conversion
> will allow its use in more places in upcoming patches.
> 
> The enum is intentionally laid out so that operators like <= can be
> used to group multiple modes with similar characteristics, and where
> the least powerful mode has value 0, even though this patch does not
> exploit that.  No semantic change intended.
> 
> Signed-off-by: Eric Blake<eblake@redhat.com>


Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>

-- 
Best regards,
Vladimir



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 09/24] nbd: Replace bool structured_reply with mode enum
  2023-06-08 13:56 ` [PATCH v4 09/24] nbd: Replace bool structured_reply with mode enum Eric Blake
@ 2023-06-12 15:07   ` Vladimir Sementsov-Ogievskiy
  2023-06-12 19:24     ` [Libguestfs] " Eric Blake
  0 siblings, 1 reply; 62+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2023-06-12 15:07 UTC (permalink / raw)
  To: Eric Blake, qemu-devel; +Cc: qemu-block, libguestfs, Kevin Wolf, Hanna Reitz

On 08.06.23 16:56, Eric Blake wrote:
> The upcoming patches for 64-bit extensions requires various points in
> the protocol to make decisions based on what was negotiated.  While we
> could easily add a 'bool extended_headers' alongside the existing
> 'bool structured_reply', this does not scale well if more modes are
> added in the future.  Better is to expose the mode enum added in the
> previous patch out to a wider use in the code base.
> 
> Where the code previously checked for structured_reply being set or
> clear, it now prefers checking for an inequality; this works because
> the nodes are in a continuum of increasing abilities, and allows us to
> touch fewer places if we ever insert other modes in the middle of the
> enum.  There should be no semantic change in this patch.
> 
> Signed-off-by: Eric Blake <eblake@redhat.com>
> ---
> 
> v4: new patch, expanding enum idea from v3 4/14
> ---

[..]

> diff --git a/nbd/server.c b/nbd/server.c
> index 8486b64b15d..bade4f7990c 100644
> --- a/nbd/server.c
> +++ b/nbd/server.c
> @@ -143,7 +143,7 @@ struct NBDClient {
> 
>       uint32_t check_align; /* If non-zero, check for aligned client requests */
> 
> -    bool structured_reply;
> +    NBDMode mode;
>       NBDExportMetaContexts export_meta;
> 
>       uint32_t opt; /* Current option being negotiated */
> @@ -502,7 +502,7 @@ static int nbd_negotiate_handle_export_name(NBDClient *client, bool no_zeroes,
>       }
> 
>       myflags = client->exp->nbdflags;
> -    if (client->structured_reply) {
> +    if (client->mode >= NBD_MODE_STRUCTURED) {
>           myflags |= NBD_FLAG_SEND_DF;
>       }
>       trace_nbd_negotiate_new_style_size_flags(client->exp->size, myflags);
> @@ -687,7 +687,7 @@ static int nbd_negotiate_handle_info(NBDClient *client, Error **errp)
> 
>       /* Send NBD_INFO_EXPORT always */
>       myflags = exp->nbdflags;
> -    if (client->structured_reply) {
> +    if (client->mode >= NBD_MODE_STRUCTURED) {
>           myflags |= NBD_FLAG_SEND_DF;
>       }
>       trace_nbd_negotiate_new_style_size_flags(exp->size, myflags);
> @@ -985,7 +985,8 @@ static int nbd_negotiate_meta_queries(NBDClient *client,
>       size_t i;
>       size_t count = 0;
> 
> -    if (client->opt == NBD_OPT_SET_META_CONTEXT && !client->structured_reply) {
> +    if (client->opt == NBD_OPT_SET_META_CONTEXT &&
> +        client->mode < NBD_MODE_STRUCTURED) {
>           return nbd_opt_invalid(client, errp,
>                                  "request option '%s' when structured reply "
>                                  "is not negotiated",
> @@ -1261,13 +1262,13 @@ static int nbd_negotiate_options(NBDClient *client, Error **errp)
>               case NBD_OPT_STRUCTURED_REPLY:
>                   if (length) {
>                       ret = nbd_reject_length(client, false, errp);
> -                } else if (client->structured_reply) {
> +                } else if (client->mode >= NBD_MODE_STRUCTURED) {
>                       ret = nbd_negotiate_send_rep_err(
>                           client, NBD_REP_ERR_INVALID, errp,
>                           "structured reply already negotiated");
>                   } else {
>                       ret = nbd_negotiate_send_rep(client, NBD_REP_ACK, errp);
> -                    client->structured_reply = true;
> +                    client->mode = NBD_MODE_STRUCTURED;

Hmm. in all other cases in server code client.mode remains zero = OLDSTYLE, which is not quite correct.

>                   }
>                   break;
> 
> @@ -1907,7 +1908,9 @@ static int coroutine_fn nbd_co_send_simple_reply(NBDClient *client,
>       };
> 
>       assert(!len || !nbd_err);
> -    assert(!client->structured_reply || request->type != NBD_CMD_READ);
> +    assert(client->mode < NBD_MODE_STRUCTURED ||
> +           (client->mode == NBD_MODE_STRUCTURED &&
> +            request->type != NBD_CMD_READ));
>       trace_nbd_co_send_simple_reply(request->cookie, nbd_err,
>                                      nbd_err_lookup(nbd_err), len);
>       set_be_simple_reply(&reply, nbd_err, request->cookie);
> @@ -2409,7 +2412,7 @@ static int coroutine_fn nbd_co_receive_request(NBDRequestData *req, NBDRequest *
>                                                 client->check_align);
>       }
>       valid_flags = NBD_CMD_FLAG_FUA;

[..]

-- 
Best regards,
Vladimir



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 10/24] nbd/client: Pass mode through to nbd_send_request
  2023-06-08 13:56 ` [PATCH v4 10/24] nbd/client: Pass mode through to nbd_send_request Eric Blake
@ 2023-06-12 15:48   ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 62+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2023-06-12 15:48 UTC (permalink / raw)
  To: Eric Blake, qemu-devel; +Cc: qemu-block, libguestfs, Kevin Wolf, Hanna Reitz

On 08.06.23 16:56, Eric Blake wrote:
> Once the 64-bit headers extension is enabled, the data layout we send
> over the wire for a client request depends on the mode negotiated with
> the server.  Rather than adding a parameter to nbd_send_request, we
> can add a member to struct NBDRequest, since it already does not
> reflect on-wire format.  Some callers initialize it directly; many
> others rely on a common initialization point during
> nbd_co_send_request().  At this point, there is no semantic change.
> 
> Signed-off-by: Eric Blake<eblake@redhat.com>

Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>

-- 
Best regards,
Vladimir



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 11/24] nbd: Add types for extended headers
  2023-06-08 13:56 ` [PATCH v4 11/24] nbd: Add types for extended headers Eric Blake
@ 2023-06-12 16:11   ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 62+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2023-06-12 16:11 UTC (permalink / raw)
  To: Eric Blake, qemu-devel; +Cc: qemu-block, libguestfs, Kevin Wolf, Hanna Reitz

On 08.06.23 16:56, Eric Blake wrote:
> Add the constants and structs necessary for later patches to start
> implementing the NBD_OPT_EXTENDED_HEADERS extension in both the client
> and server, matching recent upstream nbd.git (through commit
> e6f3b94a934).  This patch does not change any existing behavior, but
> merely sets the stage for upcoming patches.
> 
> This patch does not change the status quo that neither the client nor
> server use a packed-struct representation for the request header.
> While most of the patch adds new types, there is also some churn for
> renaming the existing NBDExtent to NBDExtent32 to contrast it with
> NBDExtent64, which I thought was a nicer name than NBDExtentExt.
> 
> Signed-off-by: Eric Blake<eblake@redhat.com>


Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>

-- 
Best regards,
Vladimir



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 06/24] nbd/client: Simplify cookie vs. index computation
  2023-06-12 14:27   ` Vladimir Sementsov-Ogievskiy
@ 2023-06-12 19:13     ` Eric Blake
  0 siblings, 0 replies; 62+ messages in thread
From: Eric Blake @ 2023-06-12 19:13 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: qemu-devel, qemu-block, libguestfs, Kevin Wolf, Hanna Reitz

On Mon, Jun 12, 2023 at 05:27:19PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> On 08.06.23 16:56, Eric Blake wrote:
> > Our code relies on a sentinel cookie value of zero for deciding when a
> > packet has been handled, as well as relying on array indices between 0
> > and MAX_NBD_REQUESTS-1 for dereferencing purposes.  As long as we can
> > symmetrically convert between two forms, there is no reason to go with
> > the odd choice of using XOR with a random pointer, when we can instead
> > simplify the mappings with a mere offset of 1.
> 
> Should we go further and use (uint64)-1 as a sentinel cookie value, and just use index as a cookie?  Or, using zero cookie in a wire looks too asymmetric?

I thought about that too, but in the end I decided it would require
auditing more lines of code to make sure I was catching all places
where we currently expected a zero sentinel (where some of those uses
are not obvious, because of things like hiding behind g_new0).  And
there is indeed the argument that if data corruption is going to
happen, it's harder to tell if an all-zero field on the wire was
intentional than a non-zero field.

> 
> > 
> > Signed-off-by: Eric Blake <eblake@redhat.com>
> 
> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>

Thanks; for now, I'll just leave this one as-is.


-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [Libguestfs] [PATCH v4 09/24] nbd: Replace bool structured_reply with mode enum
  2023-06-12 15:07   ` Vladimir Sementsov-Ogievskiy
@ 2023-06-12 19:24     ` Eric Blake
  2023-07-19 20:11       ` Eric Blake
  0 siblings, 1 reply; 62+ messages in thread
From: Eric Blake @ 2023-06-12 19:24 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: qemu-devel, Kevin Wolf, Hanna Reitz, libguestfs, qemu-block

On Mon, Jun 12, 2023 at 06:07:59PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> On 08.06.23 16:56, Eric Blake wrote:
> > The upcoming patches for 64-bit extensions requires various points in
> > the protocol to make decisions based on what was negotiated.  While we
> > could easily add a 'bool extended_headers' alongside the existing
> > 'bool structured_reply', this does not scale well if more modes are
> > added in the future.  Better is to expose the mode enum added in the
> > previous patch out to a wider use in the code base.
> > 
> > Where the code previously checked for structured_reply being set or
> > clear, it now prefers checking for an inequality; this works because
> > the nodes are in a continuum of increasing abilities, and allows us to
> > touch fewer places if we ever insert other modes in the middle of the
> > enum.  There should be no semantic change in this patch.
> > 
> > Signed-off-by: Eric Blake <eblake@redhat.com>
> > ---
> > 
> > v4: new patch, expanding enum idea from v3 4/14
> > ---
> 
> [..]
> 
> > diff --git a/nbd/server.c b/nbd/server.c
> > index 8486b64b15d..bade4f7990c 100644
> > --- a/nbd/server.c
> > +++ b/nbd/server.c

> > @@ -1261,13 +1262,13 @@ static int nbd_negotiate_options(NBDClient *client, Error **errp)
> >               case NBD_OPT_STRUCTURED_REPLY:
> >                   if (length) {
> >                       ret = nbd_reject_length(client, false, errp);
> > -                } else if (client->structured_reply) {
> > +                } else if (client->mode >= NBD_MODE_STRUCTURED) {
> >                       ret = nbd_negotiate_send_rep_err(
> >                           client, NBD_REP_ERR_INVALID, errp,
> >                           "structured reply already negotiated");
> >                   } else {
> >                       ret = nbd_negotiate_send_rep(client, NBD_REP_ACK, errp);
> > -                    client->structured_reply = true;
> > +                    client->mode = NBD_MODE_STRUCTURED;
> 
> Hmm. in all other cases in server code client.mode remains zero = OLDSTYLE, which is not quite correct.

Good catch.  Consider this squashed in (note that as a server we NEVER
talk NBD_MODE_OLDSTYLE - we ripped that out back in commit 7f7dfe2a;
but whether we end up on EXPORT_NAME or SIMPLE depends on the client's
response to our initial flag advertisement.  The only reason I didn't
spot it sooner is that in the server, all subsequent checks of
client->mode grouped OLDSTYLE, EXPORT_NAME, and SIMPLE into the same
handling.

diff --git a/nbd/server.c b/nbd/server.c
index bade4f7990c..bc6858cafe6 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -1123,10 +1123,12 @@ static int nbd_negotiate_options(NBDClient *client, Error **errp)
     if (nbd_read32(client->ioc, &flags, "flags", errp) < 0) {
         return -EIO;
     }
+    client->mode = NBD_MODE_EXPORT_NAME;
     trace_nbd_negotiate_options_flags(flags);
     if (flags & NBD_FLAG_C_FIXED_NEWSTYLE) {
         fixedNewstyle = true;
         flags &= ~NBD_FLAG_C_FIXED_NEWSTYLE;
+        client->mode = NBD_MODE_SIMPLE;
     }
     if (flags & NBD_FLAG_C_NO_ZEROES) {
         no_zeroes = true;


-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 12/24] nbd: Prepare for 64-bit request effect lengths
  2023-06-08 13:56 ` [PATCH v4 12/24] nbd: Prepare for 64-bit request effect lengths Eric Blake
  2023-06-08 18:26   ` [Libguestfs] " Eric Blake
@ 2023-06-16 18:16   ` Vladimir Sementsov-Ogievskiy
  1 sibling, 0 replies; 62+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2023-06-16 18:16 UTC (permalink / raw)
  To: Eric Blake, qemu-devel; +Cc: qemu-block, libguestfs, Kevin Wolf, Hanna Reitz

On 08.06.23 16:56, Eric Blake wrote:
> Widen the length field of NBDRequest to 64-bits, although we can
> assert that all current uses are still under 32 bits, because nothing
> ever puts us into NBD_MODE_EXTENDED yet.  Thus no semantic change.  No
> semantic change yet.
> 
> Signed-off-by: Eric Blake <eblake@redhat.com>

assertions about NBD_MAX_BUFFER_SIZE worth a note in commit message?

trace eventsjk nbd_co_request_fail, nbd_co_request_fail, nbd_co_request_fail missed an update to 64bit

Also, some functions use size_t. Should we move them to uint64_t for consistancy?

  - nbd_co_send_sparse_read
  - nbd_co_send_chunk_read
  - nbd_co_send_simple_reply

but that may be another story



-- 
Best regards,
Vladimir



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [Libguestfs] [PATCH v4 13/24] nbd/server: Refactor handling of request payload
  2023-06-08 18:29   ` [Libguestfs] " Eric Blake
@ 2023-06-16 18:29     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 62+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2023-06-16 18:29 UTC (permalink / raw)
  To: Eric Blake, qemu-devel; +Cc: libguestfs, qemu-block

On 08.06.23 21:29, Eric Blake wrote:
> On Thu, Jun 08, 2023 at 08:56:42AM -0500, Eric Blake wrote:
>> Upcoming additions to support NBD 64-bit effect lengths allow for the
>> possibility to distinguish between payload length (capped at 32M) and
>> effect length (up to 63 bits).  Without that extension, only the
> 
> Technically, the NBD spec does not (yet) have a documented cap of a
> 63-bit size limit; although I should probably propose a patch upstream
> that does that (I had one in my draft work, but it hasn't yet made it
> upstream)
> 
>> NBD_CMD_WRITE request has a payload; but with the extension, it makes
>> sense to allow at least NBD_CMD_BLOCK_STATUS to have both a payload
>> and effect length (where the payload is a limited-size struct that in
>> turns gives the real effect length as well as a subset of known ids
>> for which status is requested).  Other future NBD commands may also
>> have a request payload, so the 64-bit extension introduces a new
>> NBD_CMD_FLAG_PAYLOAD_LEN that distinguishes between whether the header
>> length is a payload length or an effect length, rather than
>> hard-coding the decision based on the command.  Note that we do not
>> support the payload version of BLOCK_STATUS yet.
>>
>> For this patch, no semantic change is intended for a compliant client.
>> For a non-compliant client, it is possible that the error behavior
>> changes (a different message, a change on whether the connection is
>> killed or remains alive for the next command, or so forth), in part
>> because req->complete is set later on some paths, but all errors
>> should still be handled gracefully.
>>
>> Signed-off-by: Eric Blake <eblake@redhat.com>
>> ---
>>
>> v4: less indentation on several 'if's [Vladimir]
>> ---
>>   nbd/server.c     | 76 ++++++++++++++++++++++++++++++------------------
>>   nbd/trace-events |  1 +
>>   2 files changed, 49 insertions(+), 28 deletions(-)
>>
>> diff --git a/nbd/server.c b/nbd/server.c
>> index 4ac05d0cd7b..d7dc29f0445 100644
>> --- a/nbd/server.c
>> +++ b/nbd/server.c
>> @@ -2329,6 +2329,8 @@ static int coroutine_fn nbd_co_receive_request(NBDRequestData *req, NBDRequest *
>>                                                  Error **errp)
>>   {
>>       NBDClient *client = req->client;
>> +    bool extended_with_payload;
>> +    unsigned payload_len = 0;
>>       int valid_flags;
>>       int ret;
>>
>> @@ -2342,48 +2344,63 @@ static int coroutine_fn nbd_co_receive_request(NBDRequestData *req, NBDRequest *
>>       trace_nbd_co_receive_request_decode_type(request->cookie, request->type,
>>                                                nbd_cmd_lookup(request->type));
>>
>> -    if (request->type != NBD_CMD_WRITE) {
>> -        /* No payload, we are ready to read the next request.  */
>> -        req->complete = true;
>> -    }
>> -
>>       if (request->type == NBD_CMD_DISC) {
>>           /* Special case: we're going to disconnect without a reply,
>>            * whether or not flags, from, or len are bogus */
>> +        req->complete = true;
>>           return -EIO;
>>       }
>>
>> -    if (request->type == NBD_CMD_READ || request->type == NBD_CMD_WRITE ||
>> -        request->type == NBD_CMD_CACHE)
>> -    {
>> -        if (request->len > NBD_MAX_BUFFER_SIZE) {
>> -            error_setg(errp, "len (%" PRIu64" ) is larger than max len (%u)",
>> -                       request->len, NBD_MAX_BUFFER_SIZE);
>> -            return -EINVAL;
>> +    /* Payload and buffer handling. */
>> +    extended_with_payload = client->mode >= NBD_MODE_EXTENDED &&
>> +        (request->flags & NBD_CMD_FLAG_PAYLOAD_LEN);
>> +    if ((request->type == NBD_CMD_READ || request->type == NBD_CMD_WRITE ||
>> +         request->type == NBD_CMD_CACHE || extended_with_payload) &&
>> +        request->len > NBD_MAX_BUFFER_SIZE) {
> 
> I'm still debating about rewriting this series of slightly-different
> 'if' into a single switch (request->type) block; the benefit is that

I vote for switch!) Really, seems it would be a lot simpler to read.

> all actions for one command will be localized instead of split over
> multiple lines of if, the drawback is that some code will now have to
> be duplicated across commands (such as the buffer allocation code for
> CMD_READ and CMD_WRITE).
> 

-- 
Best regards,
Vladimir



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 14/24] nbd/server: Prepare to receive extended header requests
  2023-06-08 13:56 ` [PATCH v4 14/24] nbd/server: Prepare to receive extended header requests Eric Blake
@ 2023-06-16 18:35   ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 62+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2023-06-16 18:35 UTC (permalink / raw)
  To: Eric Blake, qemu-devel; +Cc: qemu-block, libguestfs

On 08.06.23 16:56, Eric Blake wrote:
> Although extended mode is not yet enabled, once we do turn it on, we
> need to accept extended requests for all messages.  Previous patches
> have already taken care of supporting 64-bit lengths, now we just need
> to read it off the wire.
> 
> Note that this implementation will block indefinitely on a buggy
> client that sends a non-extended payload (that is, we try to read a
> full packet before we ever check the magic number, but a client that
> mistakenly sends a simple request after negotiating extended headers
> doesn't send us enough bytes), but it's no different from any other
> client that stops talking to us partway through a packet and thus not
> worth coding around.
> 
> Signed-off-by: Eric Blake<eblake@redhat.com>


Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>

-- 
Best regards,
Vladimir



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 15/24] nbd/server: Prepare to send extended header replies
  2023-06-08 13:56 ` [PATCH v4 15/24] nbd/server: Prepare to send extended header replies Eric Blake
@ 2023-06-16 18:48   ` Vladimir Sementsov-Ogievskiy
  2023-08-04 19:28     ` Eric Blake
  0 siblings, 1 reply; 62+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2023-06-16 18:48 UTC (permalink / raw)
  To: Eric Blake, qemu-devel; +Cc: qemu-block, libguestfs

On 08.06.23 16:56, Eric Blake wrote:
> Although extended mode is not yet enabled, once we do turn it on, we
> need to reply with extended headers to all messages.  Update the low
> level entry points necessary so that all other callers automatically
> get the right header based on the current mode.
> 
> Signed-off-by: Eric Blake <eblake@redhat.com>
> ---
> 
> v4: new patch, split out from v3 9/14
> ---
>   nbd/server.c | 30 ++++++++++++++++++++++--------
>   1 file changed, 22 insertions(+), 8 deletions(-)
> 
> diff --git a/nbd/server.c b/nbd/server.c
> index 119ac765f09..84c848a31d3 100644
> --- a/nbd/server.c
> +++ b/nbd/server.c
> @@ -1947,8 +1947,6 @@ static inline void set_be_chunk(NBDClient *client, struct iovec *iov,
>                                   size_t niov, uint16_t flags, uint16_t type,
>                                   NBDRequest *request)
>   {
> -    /* TODO - handle structured vs. extended replies */
> -    NBDStructuredReplyChunk *chunk = iov->iov_base;
>       size_t i, length = 0;
> 
>       for (i = 1; i < niov; i++) {
> @@ -1956,12 +1954,26 @@ static inline void set_be_chunk(NBDClient *client, struct iovec *iov,
>       }
>       assert(length <= NBD_MAX_BUFFER_SIZE + sizeof(NBDStructuredReadData));
> 
> -    iov[0].iov_len = sizeof(*chunk);
> -    stl_be_p(&chunk->magic, NBD_STRUCTURED_REPLY_MAGIC);
> -    stw_be_p(&chunk->flags, flags);
> -    stw_be_p(&chunk->type, type);
> -    stq_be_p(&chunk->cookie, request->cookie);
> -    stl_be_p(&chunk->length, length);
> +    if (client->mode >= NBD_MODE_EXTENDED) {
> +        NBDExtendedReplyChunk *chunk = iov->iov_base;
> +
> +        iov->iov_len = sizeof(*chunk);

I'd prefer to keep iov[0].iov_len notation, to stress that iov is an array

anyway:
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>

> +        stl_be_p(&chunk->magic, NBD_EXTENDED_REPLY_MAGIC);
> +        stw_be_p(&chunk->flags, flags);
> +        stw_be_p(&chunk->type, type);
> +        stq_be_p(&chunk->cookie, request->cookie);

Hm. Not about this patch:

we now moved to simple cookies. And it seems that actually, 64bit is too much for number of request.

> +        stq_be_p(&chunk->offset, request->from);
> +        stq_be_p(&chunk->length, length);
> +    } else {
> +        NBDStructuredReplyChunk *chunk = iov->iov_base;
> +
> +        iov->iov_len = sizeof(*chunk);
> +        stl_be_p(&chunk->magic, NBD_STRUCTURED_REPLY_MAGIC);
> +        stw_be_p(&chunk->flags, flags);
> +        stw_be_p(&chunk->type, type);
> +        stq_be_p(&chunk->cookie, request->cookie);
> +        stl_be_p(&chunk->length, length);
> +    }
>   }
> 
>   static int coroutine_fn nbd_co_send_chunk_done(NBDClient *client,
> @@ -2478,6 +2490,8 @@ static coroutine_fn int nbd_send_generic_reply(NBDClient *client,
>   {
>       if (client->mode >= NBD_MODE_STRUCTURED && ret < 0) {
>           return nbd_co_send_chunk_error(client, request, -ret, error_msg, errp);
> +    } else if (client->mode >= NBD_MODE_EXTENDED) {
> +        return nbd_co_send_chunk_done(client, request, errp);
>       } else {
>           return nbd_co_send_simple_reply(client, request, ret < 0 ? -ret : 0,
>                                           NULL, 0, errp);

-- 
Best regards,
Vladimir



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 16/24] nbd/server: Support 64-bit block status
  2023-06-08 13:56 ` [PATCH v4 16/24] nbd/server: Support 64-bit block status Eric Blake
@ 2023-06-27 13:23   ` Vladimir Sementsov-Ogievskiy
  2023-08-04 19:36     ` Eric Blake
  0 siblings, 1 reply; 62+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2023-06-27 13:23 UTC (permalink / raw)
  To: Eric Blake, qemu-devel; +Cc: qemu-block, libguestfs

On 08.06.23 16:56, Eric Blake wrote:
> The NBD spec states that if the client negotiates extended headers,
> the server must avoid NBD_REPLY_TYPE_BLOCK_STATUS and instead use
> NBD_REPLY_TYPE_BLOCK_STATUS_EXT which supports 64-bit lengths, even if
> the reply does not need more than 32 bits.  As of this patch,
> client->mode is still never NBD_MODE_EXTENDED, so the code added here
> does not take effect until the next patch enables negotiation.
> 
> For now, all metacontexts that we know how to export never populate
> more than 32 bits of information, so we don't have to worry about
> NBD_REP_ERR_EXT_HEADER_REQD or filtering during handshake, and we
> always send all zeroes for the upper 32 bits of status during
> NBD_CMD_BLOCK_STATUS.
> 
> Note that we previously had some interesting size-juggling on call
> chains, such as:
> 
> nbd_co_send_block_status(uint32_t length)
> -> blockstatus_to_extents(uint32_t bytes)
>    -> bdrv_block_status_above(bytes, &uint64_t num)
>    -> nbd_extent_array_add(uint64_t num)
>      -> store num in 32-bit length
> 
> But we were lucky that it never overflowed: bdrv_block_status_above
> never sets num larger than bytes, and we had previously been capping
> 'bytes' at 32 bits (since the protocol does not allow sending a larger
> request without extended headers).  This patch adds some assertions
> that ensure we continue to avoid overflowing 32 bits for a narrow


[..]

> @@ -2162,19 +2187,23 @@ static void nbd_extent_array_convert_to_be(NBDExtentArray *ea)
>    * would result in an incorrect range reported to the client)
>    */
>   static int nbd_extent_array_add(NBDExtentArray *ea,
> -                                uint32_t length, uint32_t flags)
> +                                uint64_t length, uint32_t flags)
>   {
>       assert(ea->can_add);
> 
>       if (!length) {
>           return 0;
>       }
> +    if (!ea->extended) {
> +        assert(length <= UINT32_MAX);
> +    }
> 
>       /* Extend previous extent if flags are the same */
>       if (ea->count > 0 && flags == ea->extents[ea->count - 1].flags) {
> -        uint64_t sum = (uint64_t)length + ea->extents[ea->count - 1].length;
> +        uint64_t sum = length + ea->extents[ea->count - 1].length;
> 
> -        if (sum <= UINT32_MAX) {
> +        assert(sum >= length);
> +        if (sum <= UINT32_MAX || ea->extended) {

that "if" and uint64_t sum was to avoid overflow. I think, we can't just assert, instead include the check into if:

if (sum >= length && (sum <= UINT32_MAX || ea->extended) {

  ...

}

with this:
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>

-- 
Best regards,
Vladimir



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 17/24] nbd/server: Enable initial support for extended headers
  2023-06-08 13:56 ` [PATCH v4 17/24] nbd/server: Enable initial support for extended headers Eric Blake
@ 2023-06-27 13:26   ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 62+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2023-06-27 13:26 UTC (permalink / raw)
  To: Eric Blake, qemu-devel; +Cc: qemu-block, libguestfs

On 08.06.23 16:56, Eric Blake wrote:
> Time to start supporting clients that request extended headers.  Now
> we can finally reach the code added across several previous patches.
> 
> Even though the NBD spec has been altered to allow us to accept
> NBD_CMD_READ larger than the max payload size (provided our response
> is a hole or broken up over more than one data chunk), we are not
> planning to take advantage of that, and continue to cap NBD_CMD_READ
> to 32M regardless of header size.
> 
> For NBD_CMD_WRITE_ZEROES and NBD_CMD_TRIM, the block layer already
> supports 64-bit operations without any effort on our part.  For
> NBD_CMD_BLOCK_STATUS, the client's length is a hint, and the previous
> patch took care of implementing the required
> NBD_REPLY_TYPE_BLOCK_STATUS_EXT.
> 
> We do not yet support clients that want to do request payload
> filtering of NBD_CMD_BLOCK_STATUS; that will be added in later
> patches, but is not essential for qemu as a client since qemu only
> requests the single context base:allocation.
> 
> Signed-off-by: Eric Blake<eblake@redhat.com>

Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>

-- 
Best regards,
Vladimir



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 18/24] nbd/client: Plumb errp through nbd_receive_replies
  2023-06-08 13:56 ` [PATCH v4 18/24] nbd/client: Plumb errp through nbd_receive_replies Eric Blake
  2023-06-08 19:10   ` [Libguestfs] " Eric Blake
@ 2023-06-27 13:31   ` Vladimir Sementsov-Ogievskiy
  1 sibling, 0 replies; 62+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2023-06-27 13:31 UTC (permalink / raw)
  To: Eric Blake, qemu-devel; +Cc: qemu-block, libguestfs, Kevin Wolf, Hanna Reitz

On 08.06.23 16:56, Eric Blake wrote:
> Instead of ignoring the low-level error just to refabricate our own
> message to pass to the caller, we can just plump the caller's errp
> down to the low level.
> 
> Signed-off-by: Eric Blake <eblake@redhat.com>
> ---
> 
> v4: new patch [Vladimir]
> ---
>   block/nbd.c | 16 ++++++++++------
>   1 file changed, 10 insertions(+), 6 deletions(-)
> 
> diff --git a/block/nbd.c b/block/nbd.c
> index 57123c17f94..c17ce935f17 100644
> --- a/block/nbd.c
> +++ b/block/nbd.c
> @@ -417,7 +417,8 @@ static void coroutine_fn GRAPH_RDLOCK nbd_reconnect_attempt(BDRVNBDState *s)
>       reconnect_delay_timer_del(s);
>   }
> 
> -static coroutine_fn int nbd_receive_replies(BDRVNBDState *s, uint64_t cookie)
> +static coroutine_fn int nbd_receive_replies(BDRVNBDState *s, uint64_t cookie,
> +                                            Error **errp)
>   {
>       int ret;
>       uint64_t ind = COOKIE_TO_INDEX(cookie), ind2;
> @@ -458,9 +459,12 @@ static coroutine_fn int nbd_receive_replies(BDRVNBDState *s, uint64_t cookie)
> 
>           /* We are under mutex and cookie is 0. We have to do the dirty work. */
>           assert(s->reply.cookie == 0);
> -        ret = nbd_receive_reply(s->bs, s->ioc, &s->reply, NULL);
> -        if (ret <= 0) {
> -            ret = ret ? ret : -EIO;
> +        ret = nbd_receive_reply(s->bs, s->ioc, &s->reply, errp);
> +        if (ret == 0) {
> +            ret = -EIO;
> +            error_setg(errp, "server dropped connection");

we also need to set errp for further negative returns in the function

> +        }
> +        if (ret < 0) {
>               nbd_channel_error(s, ret);
>               return ret;
>           }
> @@ -843,9 +847,9 @@ static coroutine_fn int nbd_co_do_receive_one_chunk(
>       }
>       *request_ret = 0;
> 
> -    ret = nbd_receive_replies(s, cookie);
> +    ret = nbd_receive_replies(s, cookie, errp);
>       if (ret < 0) {
> -        error_setg(errp, "Connection closed");
> +        error_prepend(errp, "Connection closed: ");
>           return -EIO;
>       }
>       assert(s->ioc);

-- 
Best regards,
Vladimir



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 19/24] nbd/client: Initial support for extended headers
  2023-06-08 13:56 ` [PATCH v4 19/24] nbd/client: Initial support for extended headers Eric Blake
@ 2023-06-27 14:22   ` Vladimir Sementsov-Ogievskiy
  2023-08-07 19:20     ` Eric Blake
  0 siblings, 1 reply; 62+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2023-06-27 14:22 UTC (permalink / raw)
  To: Eric Blake, qemu-devel; +Cc: qemu-block, libguestfs, Kevin Wolf, Hanna Reitz

On 08.06.23 16:56, Eric Blake wrote:
> Update the client code to be able to send an extended request, and
> parse an extended header from the server.  Note that since we reject
> any structured reply with a too-large payload, we can always normalize
> a valid header back into the compact form, so that the caller need not
> deal with two branches of a union.  Still, until a later patch lets
> the client negotiate extended headers, the code added here should not
> be reached.  Note that because of the different magic numbers, it is
> just as easy to trace and then tolerate a non-compliant server sending
> the wrong header reply as it would be to insist that the server is
> compliant.
> 
> Signed-off-by: Eric Blake <eblake@redhat.com>
> ---
> 
> v4: split off errp handling to separate patch [Vladimir], better
> function naming [Vladimir]
> ---
>   include/block/nbd.h |   3 +-
>   block/nbd.c         |   2 +-
>   nbd/client.c        | 100 +++++++++++++++++++++++++++++---------------
>   nbd/trace-events    |   3 +-
>   4 files changed, 72 insertions(+), 36 deletions(-)
> 
> diff --git a/include/block/nbd.h b/include/block/nbd.h
> index dc05f5981fb..af80087e2cd 100644
> --- a/include/block/nbd.h
> +++ b/include/block/nbd.h
> @@ -389,7 +389,8 @@ int nbd_init(int fd, QIOChannelSocket *sioc, NBDExportInfo *info,
>                Error **errp);
>   int nbd_send_request(QIOChannel *ioc, NBDRequest *request);
>   int coroutine_fn nbd_receive_reply(BlockDriverState *bs, QIOChannel *ioc,
> -                                   NBDReply *reply, Error **errp);
> +                                   NBDReply *reply, NBDMode mode,
> +                                   Error **errp);
>   int nbd_client(int fd);
>   int nbd_disconnect(int fd);
>   int nbd_errno_to_system_errno(int err);
> diff --git a/block/nbd.c b/block/nbd.c
> index c17ce935f17..e281fac43d1 100644
> --- a/block/nbd.c
> +++ b/block/nbd.c
> @@ -459,7 +459,7 @@ static coroutine_fn int nbd_receive_replies(BDRVNBDState *s, uint64_t cookie,
> 
>           /* We are under mutex and cookie is 0. We have to do the dirty work. */
>           assert(s->reply.cookie == 0);
> -        ret = nbd_receive_reply(s->bs, s->ioc, &s->reply, errp);
> +        ret = nbd_receive_reply(s->bs, s->ioc, &s->reply, s->info.mode, errp);
>           if (ret == 0) {
>               ret = -EIO;
>               error_setg(errp, "server dropped connection");
> diff --git a/nbd/client.c b/nbd/client.c
> index 1495a9b0ab1..a4598a95427 100644
> --- a/nbd/client.c
> +++ b/nbd/client.c
> @@ -1352,22 +1352,29 @@ int nbd_disconnect(int fd)
> 
>   int nbd_send_request(QIOChannel *ioc, NBDRequest *request)
>   {
> -    uint8_t buf[NBD_REQUEST_SIZE];
> +    uint8_t buf[NBD_EXTENDED_REQUEST_SIZE];
> +    size_t len;
> 
> -    assert(request->mode <= NBD_MODE_STRUCTURED); /* TODO handle extended */
> -    assert(request->len <= UINT32_MAX);
>       trace_nbd_send_request(request->from, request->len, request->cookie,
>                              request->flags, request->type,
>                              nbd_cmd_lookup(request->type));
> 
> -    stl_be_p(buf, NBD_REQUEST_MAGIC);
>       stw_be_p(buf + 4, request->flags);
>       stw_be_p(buf + 6, request->type);
>       stq_be_p(buf + 8, request->cookie);
>       stq_be_p(buf + 16, request->from);
> -    stl_be_p(buf + 24, request->len);
> +    if (request->mode >= NBD_MODE_EXTENDED) {
> +        stl_be_p(buf, NBD_EXTENDED_REQUEST_MAGIC);
> +        stq_be_p(buf + 24, request->len);
> +        len = NBD_EXTENDED_REQUEST_SIZE;
> +    } else {
> +        assert(request->len <= UINT32_MAX);
> +        stl_be_p(buf, NBD_REQUEST_MAGIC);
> +        stl_be_p(buf + 24, request->len);
> +        len = NBD_REQUEST_SIZE;
> +    }
> 
> -    return nbd_write(ioc, buf, sizeof(buf), NULL);
> +    return nbd_write(ioc, buf, len, NULL);
>   }
> 
>   /* nbd_receive_simple_reply
> @@ -1394,30 +1401,36 @@ static int nbd_receive_simple_reply(QIOChannel *ioc, NBDSimpleReply *reply,
>       return 0;
>   }
> 
> -/* nbd_receive_structured_reply_chunk
> +/* nbd_receive_reply_chunk_header
>    * Read structured reply chunk except magic field (which should be already
> - * read).
> + * read).  Normalize into the compact form.
>    * Payload is not read.
>    */
> -static int nbd_receive_structured_reply_chunk(QIOChannel *ioc,
> -                                              NBDStructuredReplyChunk *chunk,
> -                                              Error **errp)
> +static int nbd_receive_reply_chunk_header(QIOChannel *ioc, NBDReply *chunk,
> +                                          Error **errp)
>   {
>       int ret;
> +    size_t len;
> +    uint64_t payload_len;
> 
> -    assert(chunk->magic == NBD_STRUCTURED_REPLY_MAGIC);
> +    if (chunk->magic == NBD_STRUCTURED_REPLY_MAGIC) {
> +        len = sizeof(chunk->structured);
> +    } else {
> +        assert(chunk->magic == NBD_EXTENDED_REPLY_MAGIC);
> +        len = sizeof(chunk->extended);
> +    }
> 
>       ret = nbd_read(ioc, (uint8_t *)chunk + sizeof(chunk->magic),
> -                   sizeof(*chunk) - sizeof(chunk->magic), "structured chunk",
> +                   len - sizeof(chunk->magic), "structured chunk",
>                      errp);
>       if (ret < 0) {
>           return ret;
>       }
> 
> -    chunk->flags = be16_to_cpu(chunk->flags);
> -    chunk->type = be16_to_cpu(chunk->type);
> -    chunk->cookie = be64_to_cpu(chunk->cookie);
> -    chunk->length = be32_to_cpu(chunk->length);
> +    /* flags, type, and cookie occupy same space between forms */
> +    chunk->structured.flags = be16_to_cpu(chunk->structured.flags);
> +    chunk->structured.type = be16_to_cpu(chunk->structured.type);
> +    chunk->structured.cookie = be64_to_cpu(chunk->structured.cookie);
> 
>       /*
>        * Because we use BLOCK_STATUS with REQ_ONE, and cap READ requests
> @@ -1425,11 +1438,20 @@ static int nbd_receive_structured_reply_chunk(QIOChannel *ioc,
>        * this.  Even if we stopped using REQ_ONE, sane servers will cap
>        * the number of extents they return for block status.
>        */
> -    if (chunk->length > NBD_MAX_BUFFER_SIZE + sizeof(NBDStructuredReadData)) {
> +    if (chunk->magic == NBD_STRUCTURED_REPLY_MAGIC) {
> +        payload_len = be32_to_cpu(chunk->structured.length);
> +    } else {
> +        /* For now, we are ignoring the extended header offset. */
> +        payload_len = be64_to_cpu(chunk->extended.length);
> +        chunk->magic = NBD_STRUCTURED_REPLY_MAGIC;
> +    }
> +    if (payload_len > NBD_MAX_BUFFER_SIZE + sizeof(NBDStructuredReadData)) {
>           error_setg(errp, "server chunk %" PRIu32 " (%s) payload is too long",
> -                   chunk->type, nbd_rep_lookup(chunk->type));
> +                   chunk->structured.type,
> +                   nbd_rep_lookup(chunk->structured.type));
>           return -EINVAL;
>       }
> +    chunk->structured.length = payload_len;
> 
>       return 0;
>   }
> @@ -1476,19 +1498,21 @@ nbd_read_eof(BlockDriverState *bs, QIOChannel *ioc, void *buffer, size_t size,
> 
>   /* nbd_receive_reply
>    *
> - * Decreases bs->in_flight while waiting for a new reply. This yield is where
> - * we wait indefinitely and the coroutine must be able to be safely reentered
> - * for nbd_client_attach_aio_context().
> + * Wait for a new reply. If this yields, the coroutine must be able to be
> + * safely reentered for nbd_client_attach_aio_context().  @mode determines
> + * which reply magic we are expecting, although this normalizes the result
> + * so that the caller only has to work with compact headers.
>    *
>    * Returns 1 on success
> - *         0 on eof, when no data was read (errp is not set)
> - *         negative errno on failure (errp is set)
> + *         0 on eof, when no data was read
> + *         negative errno on failure
>    */
>   int coroutine_fn nbd_receive_reply(BlockDriverState *bs, QIOChannel *ioc,
> -                                   NBDReply *reply, Error **errp)
> +                                   NBDReply *reply, NBDMode mode, Error **errp)
>   {
>       int ret;
>       const char *type;
> +    uint32_t expected;
> 
>       ret = nbd_read_eof(bs, ioc, &reply->magic, sizeof(reply->magic), errp);
>       if (ret <= 0) {
> @@ -1497,8 +1521,13 @@ int coroutine_fn nbd_receive_reply(BlockDriverState *bs, QIOChannel *ioc,
> 
>       reply->magic = be32_to_cpu(reply->magic);
> 
> +    /* Diagnose but accept wrong-width header */
>       switch (reply->magic) {
>       case NBD_SIMPLE_REPLY_MAGIC:
> +        if (mode >= NBD_MODE_EXTENDED) {
> +            trace_nbd_receive_wrong_header(reply->magic,
> +                                           nbd_mode_lookup(mode));
> +        }
>           ret = nbd_receive_simple_reply(ioc, &reply->simple, errp);
>           if (ret < 0) {
>               break;
> @@ -1508,23 +1537,28 @@ int coroutine_fn nbd_receive_reply(BlockDriverState *bs, QIOChannel *ioc,
>                                          reply->cookie);
>           break;
>       case NBD_STRUCTURED_REPLY_MAGIC:
> -        ret = nbd_receive_structured_reply_chunk(ioc, &reply->structured, errp);
> +    case NBD_EXTENDED_REPLY_MAGIC:
> +        expected = mode >= NBD_MODE_EXTENDED ? NBD_EXTENDED_REPLY_MAGIC
> +            : NBD_STRUCTURED_REPLY_MAGIC;
> +        if (reply->magic != expected) {
> +            trace_nbd_receive_wrong_header(reply->magic,
> +                                           nbd_mode_lookup(mode));
> +        }
> +        ret = nbd_receive_reply_chunk_header(ioc, reply, errp);
>           if (ret < 0) {
>               break;
>           }

maybe

assert(reply->magic == NBD_STRUCTURED_REPLY_MAGIC)

>           type = nbd_reply_type_lookup(reply->structured.type);
> -        trace_nbd_receive_structured_reply_chunk(reply->structured.flags,
> -                                                 reply->structured.type, type,
> -                                                 reply->structured.cookie,
> -                                                 reply->structured.length);
> +        trace_nbd_receive_reply_chunk_header(reply->structured.flags,
> +                                             reply->structured.type, type,
> +                                             reply->structured.cookie,
> +                                             reply->structured.length);
>           break;
>       default:
> +        trace_nbd_receive_wrong_header(reply->magic, nbd_mode_lookup(mode));
>           error_setg(errp, "invalid magic (got 0x%" PRIx32 ")", reply->magic);
>           return -EINVAL;
>       }
> -    if (ret < 0) {
> -        return ret;
> -    }

hmm, mistake? Seems, we'll return 1 on error with set errp this way

> 
>       return 1;
>   }
> diff --git a/nbd/trace-events b/nbd/trace-events
> index 6a34d7f027a..51bfb129c95 100644
> --- a/nbd/trace-events
> +++ b/nbd/trace-events
> @@ -33,7 +33,8 @@ nbd_client_clear_queue(void) "Clearing NBD queue"
>   nbd_client_clear_socket(void) "Clearing NBD socket"
>   nbd_send_request(uint64_t from, uint64_t len, uint64_t cookie, uint16_t flags, uint16_t type, const char *name) "Sending request to server: { .from = %" PRIu64", .len = %" PRIu64 ", .cookie = %" PRIu64 ", .flags = 0x%" PRIx16 ", .type = %" PRIu16 " (%s) }"
>   nbd_receive_simple_reply(int32_t error, const char *errname, uint64_t cookie) "Got simple reply: { .error = %" PRId32 " (%s), cookie = %" PRIu64" }"
> -nbd_receive_structured_reply_chunk(uint16_t flags, uint16_t type, const char *name, uint64_t cookie, uint32_t length) "Got structured reply chunk: { flags = 0x%" PRIx16 ", type = %d (%s), cookie = %" PRIu64 ", length = %" PRIu32 " }"
> +nbd_receive_reply_chunk_header(uint16_t flags, uint16_t type, const char *name, uint64_t cookie, uint32_t length) "Got reply chunk header: { flags = 0x%" PRIx16 ", type = %d (%s), cookie = %" PRIu64 ", length = %" PRIu32 " }"
> +nbd_receive_wrong_header(uint32_t magic, const char *mode) "Server sent unexpected magic 0x%" PRIx32 " for negotiated mode %s"
> 
>   # common.c
>   nbd_unknown_error(int err) "Squashing unexpected error %d to EINVAL"

-- 
Best regards,
Vladimir



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 20/24] nbd/client: Accept 64-bit block status chunks
  2023-06-08 13:56 ` [PATCH v4 20/24] nbd/client: Accept 64-bit block status chunks Eric Blake
@ 2023-06-27 14:50   ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 62+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2023-06-27 14:50 UTC (permalink / raw)
  To: Eric Blake, qemu-devel; +Cc: qemu-block, libguestfs, Kevin Wolf, Hanna Reitz

On 08.06.23 16:56, Eric Blake wrote:
> Once extended mode is enabled, we need to accept 64-bit status replies
> (even for replies that don't exceed a 32-bit length).  It is easier to
> normalize narrow replies into wide format so that the rest of our code
> only has to handle one width.  Although a server is non-compliant if
> it sends a 64-bit reply in compact mode, or a 32-bit reply in extended
> mode, it is still easy enough to tolerate these mismatches.
> 
> In normal execution, we are only requesting "base:allocation" which
> never exceeds 32 bits for flag values. But during testing with
> x-dirty-bitmap, we can force qemu to connect to some other context
> that might have 64-bit status bit; however, we ignore those upper bits
> (other than mapping qemu:allocation-depth into something that
> 'qemu-img map --output=json' can expose), and since that only affects
> testing, we really don't bother with checking whether more than the
> two least-significant bits are set.
> 
> Signed-off-by: Eric Blake <eblake@redhat.com>

Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>

[..]

> +     * count at 0 for a narrow server).  However, it's easy enough to
> +     * ignore the server's noncompliance without killing the
>        * connection; just ignore trailing extents, and clamp things to
>        * the length of our request.
>        */
> -    if (chunk->length > sizeof(context_id) + sizeof(*extent)) {
> -        trace_nbd_parse_blockstatus_compliance("more than one extent");
> +    if (count != wide ||
> +        chunk->length > sizeof(context_id) + wide * sizeof(count) + len) {

this calculation is done twice. I'd put it into expected_chunk_length variable.

> +        trace_nbd_parse_blockstatus_compliance("unexpected extent count");
>       }
>       if (extent->length > orig_length) {
>           extent->length = orig_length;
> @@ -1123,7 +1136,7 @@ nbd_co_receive_cmdread_reply(BDRVNBDState *s, uint64_t cookie,
> 
>   static int coroutine_fn
>   nbd_co_receive_blockstatus_reply(BDRVNBDState *s, uint64_t cookie,

[..]

-- 
Best regards,
Vladimir



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 21/24] nbd/client: Request extended headers during negotiation
  2023-06-08 13:56 ` [PATCH v4 21/24] nbd/client: Request extended headers during negotiation Eric Blake
@ 2023-06-27 14:55   ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 62+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2023-06-27 14:55 UTC (permalink / raw)
  To: Eric Blake, qemu-devel; +Cc: qemu-block, libguestfs, Kevin Wolf, Hanna Reitz

On 08.06.23 16:56, Eric Blake wrote:
> All the pieces are in place for a client to finally request extended
> headers.  Note that we must not request extended headers when qemu-nbd
> is used to connect to the kernel module (as nbd.ko does not expect
> them, but expects us to do the negotiation in userspace before handing
> the socket over to the kernel), but there is no harm in all other
> clients requesting them.
> 
> Extended headers are not essential to the information collected during
> 'qemu-nbd --list', but probing for it gives us one more piece of
> information in that output.  Update the iotests affected by the new
> line of output.
> 
> Signed-off-by: Eric Blake<eblake@redhat.com>

Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>

-- 
Best regards,
Vladimir



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 22/24] nbd/server: Refactor list of negotiated meta contexts
  2023-06-08 13:56 ` [PATCH v4 22/24] nbd/server: Refactor list of negotiated meta contexts Eric Blake
@ 2023-06-27 15:11   ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 62+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2023-06-27 15:11 UTC (permalink / raw)
  To: Eric Blake, qemu-devel; +Cc: qemu-block, libguestfs, Kevin Wolf, Hanna Reitz

On 08.06.23 16:56, Eric Blake wrote:
> Peform several minor refactorings of how the list of negotiated meta
> contexts is managed, to make upcoming patches easier: Promote the
> internal type NBDExportMetaContexts to the public opaque type
> NBDMetaContexts, and mark exp const.  Use a shorter member name in
> NBDClient.  Hoist calls to nbd_check_meta_context() earlier in their
> callers, as the number of negotiated contexts may impact the flags
> exposed in regards to an export, which in turn requires a new
> parameter.

Hmm, a bit of semantic change: we drop context in nbd_check_meta_export() even when we report error. Pre-patch we call nbd_check_meta_export() only on success path.. Still, it seems even more safe.

Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>

   Drop a redundant parameter to nbd_negotiate_meta_queries.
> No semantic change intended.
> 
> Signed-off-by: Eric Blake <eblake@redhat.com>


-- 
Best regards,
Vladimir



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [Libguestfs] [PATCH v4 23/24] nbd/server: Prepare for per-request filtering of BLOCK_STATUS
  2023-06-08 19:15   ` [Libguestfs] " Eric Blake
@ 2023-06-27 15:19     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 62+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2023-06-27 15:19 UTC (permalink / raw)
  To: Eric Blake, qemu-devel; +Cc: Kevin Wolf, Hanna Reitz, libguestfs, qemu-block

On 08.06.23 22:15, Eric Blake wrote:
> On Thu, Jun 08, 2023 at 08:56:52AM -0500, Eric Blake wrote:
>> The next commit will add support for the optional extension
>> NBD_CMD_FLAG_PAYLOAD during NBD_CMD_BLOCK_STATUS, where the client can
>> request that the server only return a subset of negotiated contexts,
>> rather than all contexts.  To make that task easier, this patch
>> populates the list of contexts to return on a per-command basis (for
>> now, identical to the full set of negotiated contexts).
>>
>> Signed-off-by: Eric Blake <eblake@redhat.com>
>> ---
>>
> 
>> +++ b/nbd/server.c
>> @@ -2491,6 +2491,8 @@ static int coroutine_fn nbd_co_receive_request(NBDRequestData *req, NBDRequest *
>>               error_setg(errp, "No memory");
>>               return -ENOMEM;
>>           }
>> +    } else if (request->type == NBD_CMD_BLOCK_STATUS) {
>> +        request->contexts = &client->contexts;
>>       }
> 
> THere are paths where request->contexts is left NULL but request->type
> was set...
> 
>> @@ -2841,6 +2848,11 @@ static coroutine_fn void nbd_trip(void *opaque)
>>       } else {
>>           ret = nbd_handle_request(client, &request, req->data, &local_err);
>>       }
>> +    if (request.type == NBD_CMD_BLOCK_STATUS &&
>> +        request.contexts != &client->contexts) {
>> +        g_free(request.contexts->bitmaps);
>> +        g_free(request.contexts);
>> +    }
> 
> so I think this also needs to be tweaked to check that
> request.contexts is non-NULL before dereferencing to free bitmaps.
> 

agree (and I missed this during my review :)

suggest:

if (request.contexts && request.contexts != &client->contexts) {
   g_free(..)
   g_free(..)
}


-- 
Best regards,
Vladimir



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 24/24] nbd/server: Add FLAG_PAYLOAD support to CMD_BLOCK_STATUS
  2023-06-08 13:56 ` [PATCH v4 24/24] nbd/server: Add FLAG_PAYLOAD support to CMD_BLOCK_STATUS Eric Blake
  2023-06-08 19:19   ` [Libguestfs] " Eric Blake
@ 2023-06-27 19:42   ` Vladimir Sementsov-Ogievskiy
  2023-08-07 20:23     ` [Libguestfs] " Eric Blake
  1 sibling, 1 reply; 62+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2023-06-27 19:42 UTC (permalink / raw)
  To: Eric Blake, qemu-devel; +Cc: qemu-block, libguestfs, Kevin Wolf, Hanna Reitz

On 08.06.23 16:56, Eric Blake wrote:
> Allow a client to request a subset of negotiated meta contexts.  For
> example, a client may ask to use a single connection to learn about
> both block status and dirty bitmaps, but where the dirty bitmap
> queries only need to be performed on a subset of the disk; forcing the
> server to compute that information on block status queries in the rest
> of the disk is wasted effort (both at the server, and on the amount of
> traffic sent over the wire to be parsed and ignored by the client).
> 
> Qemu as an NBD client never requests to use more than one meta
> context, so it has no need to use block status payloads.  Testing this
> instead requires support from libnbd, which CAN access multiple meta
> contexts in parallel from a single NBD connection; an interop test
> submitted to the libnbd project at the same time as this patch
> demonstrates the feature working, as well as testing some corner cases
> (for example, when the payload length is longer than the export
> length), although other corner cases (like passing the same id
> duplicated) requires a protocol fuzzer because libnbd is not wired up
> to break the protocol that badly.
> 
> This also includes tweaks to 'qemu-nbd --list' to show when a server
> is advertising the capability, and to the testsuite to reflect the
> addition to that output.
> 
> Signed-off-by: Eric Blake <eblake@redhat.com>
> ---

[..]

> +static int
> +nbd_co_block_status_payload_read(NBDClient *client, NBDRequest *request,
> +                                 Error **errp)
> +{
> +    int payload_len = request->len;
> +    g_autofree char *buf = NULL;
> +    size_t count, i, nr_bitmaps;
> +    uint32_t id;
> +
> +    assert(request->len <= NBD_MAX_BUFFER_SIZE);
> +    assert(client->contexts.exp == client->exp);
> +    nr_bitmaps = client->exp->nr_export_bitmaps;
> +    request->contexts = g_new0(NBDMetaContexts, 1);
> +    request->contexts->exp = client->exp;
> +
> +    if (payload_len % sizeof(uint32_t) ||
> +        payload_len < sizeof(NBDBlockStatusPayload) ||
> +        payload_len > (sizeof(NBDBlockStatusPayload) +
> +                       sizeof(id) * client->contexts.count)) {
> +        goto skip;
> +    }
> +
> +    buf = g_malloc(payload_len);
> +    if (nbd_read(client->ioc, buf, payload_len,
> +                 "CMD_BLOCK_STATUS data", errp) < 0) {
> +        return -EIO;
> +    }
> +    trace_nbd_co_receive_request_payload_received(request->cookie,
> +                                                  payload_len);
> +    request->contexts->bitmaps = g_new0(bool, nr_bitmaps);
> +    count = (payload_len - sizeof(NBDBlockStatusPayload)) / sizeof(id);
> +    payload_len = 0;
> +
> +    for (i = 0; i < count; i++) {
> +        id = ldl_be_p(buf + sizeof(NBDBlockStatusPayload) + sizeof(id) * i);
> +        if (id == NBD_META_ID_BASE_ALLOCATION) {
> +            if (request->contexts->base_allocation) {
> +                goto skip;
> +            }
> +            request->contexts->base_allocation = true;
> +        } else if (id == NBD_META_ID_ALLOCATION_DEPTH) {
> +            if (request->contexts->allocation_depth) {
> +                goto skip;
> +            }
> +            request->contexts->allocation_depth = true;
> +        } else {

maybe,

int ind = id - NBD_META_ID_DIRTY_BITMAP;


> +            if (id - NBD_META_ID_DIRTY_BITMAP > nr_bitmaps ||

>= ?

> +                request->contexts->bitmaps[id - NBD_META_ID_DIRTY_BITMAP]) {
> +                goto skip;
> +            }
> +            request->contexts->bitmaps[id - NBD_META_ID_DIRTY_BITMAP] = true;
> +        }
> +    }
> +
> +    request->len = ldq_be_p(buf);
> +    request->contexts->count = count;
> +    return 0;
> +
> + skip:
> +    trace_nbd_co_receive_block_status_payload_compliance(request->from,
> +                                                         request->len);
> +    request->len = request->contexts->count = 0;
> +    return nbd_drop(client->ioc, payload_len, errp);
> +}
> +
>   /* nbd_co_receive_request
>    * Collect a client request. Return 0 if request looks valid, -EIO to drop
>    * connection right away, -EAGAIN to indicate we were interrupted and the
> @@ -2470,7 +2552,13 @@ static int coroutine_fn nbd_co_receive_request(NBDRequestData *req, NBDRequest *

[..]

> @@ -2712,7 +2804,8 @@ static coroutine_fn int nbd_handle_request(NBDClient *client,
>                                         "discard failed", errp);
> 
>       case NBD_CMD_BLOCK_STATUS:
> -        if (!request->len) {
> +        assert(request->contexts);
> +        if (!request->len && !(request->flags & NBD_CMD_FLAG_PAYLOAD_LEN)) {

why not error-out for len=0 in case of payload?

>               return nbd_send_generic_reply(client, request, -EINVAL,
>                                             "need non-zero length", errp);
>           }
> diff --git a/qemu-nbd.c b/qemu-nbd.c
> index 1d155fc2c66..cbca0eeee62 100644


-- 
Best regards,
Vladimir



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [Libguestfs] [PATCH v4 09/24] nbd: Replace bool structured_reply with mode enum
  2023-06-12 19:24     ` [Libguestfs] " Eric Blake
@ 2023-07-19 20:11       ` Eric Blake
  0 siblings, 0 replies; 62+ messages in thread
From: Eric Blake @ 2023-07-19 20:11 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: Kevin Wolf, Hanna Reitz, qemu-block, qemu-devel, libguestfs

On Mon, Jun 12, 2023 at 02:24:52PM -0500, Eric Blake wrote:
> On Mon, Jun 12, 2023 at 06:07:59PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> > On 08.06.23 16:56, Eric Blake wrote:
> > > The upcoming patches for 64-bit extensions requires various points in
> > > the protocol to make decisions based on what was negotiated.  While we
> > > could easily add a 'bool extended_headers' alongside the existing
> > > 'bool structured_reply', this does not scale well if more modes are
> > > added in the future.  Better is to expose the mode enum added in the
> > > previous patch out to a wider use in the code base.
> > > 
> > > Where the code previously checked for structured_reply being set or
> > > clear, it now prefers checking for an inequality; this works because
> > > the nodes are in a continuum of increasing abilities, and allows us to
> > > touch fewer places if we ever insert other modes in the middle of the
> > > enum.  There should be no semantic change in this patch.
> > > 
> > > Signed-off-by: Eric Blake <eblake@redhat.com>
> > > ---
> > > 
> > > v4: new patch, expanding enum idea from v3 4/14
> > > ---
> > 
> > [..]
> > 
> > > diff --git a/nbd/server.c b/nbd/server.c
> > > index 8486b64b15d..bade4f7990c 100644
> > > --- a/nbd/server.c
> > > +++ b/nbd/server.c
> 
> > > @@ -1261,13 +1262,13 @@ static int nbd_negotiate_options(NBDClient *client, Error **errp)
> > >               case NBD_OPT_STRUCTURED_REPLY:
> > >                   if (length) {
> > >                       ret = nbd_reject_length(client, false, errp);
> > > -                } else if (client->structured_reply) {
> > > +                } else if (client->mode >= NBD_MODE_STRUCTURED) {
> > >                       ret = nbd_negotiate_send_rep_err(
> > >                           client, NBD_REP_ERR_INVALID, errp,
> > >                           "structured reply already negotiated");
> > >                   } else {
> > >                       ret = nbd_negotiate_send_rep(client, NBD_REP_ACK, errp);
> > > -                    client->structured_reply = true;
> > > +                    client->mode = NBD_MODE_STRUCTURED;
> > 
> > Hmm. in all other cases in server code client.mode remains zero = OLDSTYLE, which is not quite correct.
> 
> Good catch.  Consider this squashed in (note that as a server we NEVER
> talk NBD_MODE_OLDSTYLE - we ripped that out back in commit 7f7dfe2a;
> but whether we end up on EXPORT_NAME or SIMPLE depends on the client's
> response to our initial flag advertisement.  The only reason I didn't
> spot it sooner is that in the server, all subsequent checks of
> client->mode grouped OLDSTYLE, EXPORT_NAME, and SIMPLE into the same
> handling.

To move things along, I have now staged 1-8 in my NBD queue for a pull
request, and will then repost this patch and the remainder of the
series as v5, to make it easier to pick up the final needed R-b.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 15/24] nbd/server: Prepare to send extended header replies
  2023-06-16 18:48   ` Vladimir Sementsov-Ogievskiy
@ 2023-08-04 19:28     ` Eric Blake
  2023-08-07 17:20       ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 62+ messages in thread
From: Eric Blake @ 2023-08-04 19:28 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy; +Cc: qemu-devel, qemu-block, libguestfs

On Fri, Jun 16, 2023 at 09:48:18PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> On 08.06.23 16:56, Eric Blake wrote:
> > Although extended mode is not yet enabled, once we do turn it on, we
> > need to reply with extended headers to all messages.  Update the low
> > level entry points necessary so that all other callers automatically
> > get the right header based on the current mode.
> > 
> > Signed-off-by: Eric Blake <eblake@redhat.com>
> > ---
> > 
> > v4: new patch, split out from v3 9/14
> > ---
> >   nbd/server.c | 30 ++++++++++++++++++++++--------
> >   1 file changed, 22 insertions(+), 8 deletions(-)
> > 
> > diff --git a/nbd/server.c b/nbd/server.c
> > index 119ac765f09..84c848a31d3 100644
> > --- a/nbd/server.c
> > +++ b/nbd/server.c
> > @@ -1947,8 +1947,6 @@ static inline void set_be_chunk(NBDClient *client, struct iovec *iov,
> >                                   size_t niov, uint16_t flags, uint16_t type,
> >                                   NBDRequest *request)
> >   {
> > -    /* TODO - handle structured vs. extended replies */
> > -    NBDStructuredReplyChunk *chunk = iov->iov_base;
> >       size_t i, length = 0;
> > 
> >       for (i = 1; i < niov; i++) {
> > @@ -1956,12 +1954,26 @@ static inline void set_be_chunk(NBDClient *client, struct iovec *iov,
> >       }
> >       assert(length <= NBD_MAX_BUFFER_SIZE + sizeof(NBDStructuredReadData));
> > 
> > -    iov[0].iov_len = sizeof(*chunk);
> > -    stl_be_p(&chunk->magic, NBD_STRUCTURED_REPLY_MAGIC);
> > -    stw_be_p(&chunk->flags, flags);
> > -    stw_be_p(&chunk->type, type);
> > -    stq_be_p(&chunk->cookie, request->cookie);
> > -    stl_be_p(&chunk->length, length);
> > +    if (client->mode >= NBD_MODE_EXTENDED) {
> > +        NBDExtendedReplyChunk *chunk = iov->iov_base;
> > +
> > +        iov->iov_len = sizeof(*chunk);
> 
> I'd prefer to keep iov[0].iov_len notation, to stress that iov is an array
> 
> anyway:
> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>

I can make that change, and keep your R-b.

> 
> > +        stl_be_p(&chunk->magic, NBD_EXTENDED_REPLY_MAGIC);
> > +        stw_be_p(&chunk->flags, flags);
> > +        stw_be_p(&chunk->type, type);
> > +        stq_be_p(&chunk->cookie, request->cookie);
> 
> Hm. Not about this patch:
> 
> we now moved to simple cookies. And it seems that actually, 64bit is too much for number of request.

You're right that it's more than qemu cared about.  But there may be
other implementations that really do store a 64-bit pointer as their
opaque cookie, for ease of reverse-lookup on which command the
server's reply corresponds to, so I don't see it changing any time
soon in the NBD protocol.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 16/24] nbd/server: Support 64-bit block status
  2023-06-27 13:23   ` Vladimir Sementsov-Ogievskiy
@ 2023-08-04 19:36     ` Eric Blake
  2023-08-07 17:28       ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 62+ messages in thread
From: Eric Blake @ 2023-08-04 19:36 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy; +Cc: qemu-devel, qemu-block, libguestfs

On Tue, Jun 27, 2023 at 04:23:49PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> On 08.06.23 16:56, Eric Blake wrote:
> > The NBD spec states that if the client negotiates extended headers,
> > the server must avoid NBD_REPLY_TYPE_BLOCK_STATUS and instead use
> > NBD_REPLY_TYPE_BLOCK_STATUS_EXT which supports 64-bit lengths, even if
> > the reply does not need more than 32 bits.  As of this patch,
> > client->mode is still never NBD_MODE_EXTENDED, so the code added here
> > does not take effect until the next patch enables negotiation.
> > 
> > For now, all metacontexts that we know how to export never populate
> > more than 32 bits of information, so we don't have to worry about
> > NBD_REP_ERR_EXT_HEADER_REQD or filtering during handshake, and we
> > always send all zeroes for the upper 32 bits of status during
> > NBD_CMD_BLOCK_STATUS.
> > 
> > Note that we previously had some interesting size-juggling on call
> > chains, such as:
> > 
> > nbd_co_send_block_status(uint32_t length)
> > -> blockstatus_to_extents(uint32_t bytes)
> >    -> bdrv_block_status_above(bytes, &uint64_t num)
> >    -> nbd_extent_array_add(uint64_t num)
> >      -> store num in 32-bit length
> > 
> > But we were lucky that it never overflowed: bdrv_block_status_above
> > never sets num larger than bytes, and we had previously been capping
> > 'bytes' at 32 bits (since the protocol does not allow sending a larger
> > request without extended headers).  This patch adds some assertions
> > that ensure we continue to avoid overflowing 32 bits for a narrow
> 
> 
> [..]
> 
> > @@ -2162,19 +2187,23 @@ static void nbd_extent_array_convert_to_be(NBDExtentArray *ea)
> >    * would result in an incorrect range reported to the client)
> >    */
> >   static int nbd_extent_array_add(NBDExtentArray *ea,
> > -                                uint32_t length, uint32_t flags)
> > +                                uint64_t length, uint32_t flags)
> >   {
> >       assert(ea->can_add);
> > 
> >       if (!length) {
> >           return 0;
> >       }
> > +    if (!ea->extended) {
> > +        assert(length <= UINT32_MAX);
> > +    }
> > 
> >       /* Extend previous extent if flags are the same */
> >       if (ea->count > 0 && flags == ea->extents[ea->count - 1].flags) {
> > -        uint64_t sum = (uint64_t)length + ea->extents[ea->count - 1].length;
> > +        uint64_t sum = length + ea->extents[ea->count - 1].length;
> > 
> > -        if (sum <= UINT32_MAX) {
> > +        assert(sum >= length);
> > +        if (sum <= UINT32_MAX || ea->extended) {
> 
> that "if" and uint64_t sum was to avoid overflow. I think, we can't just assert, instead include the check into if:
> 
> if (sum >= length && (sum <= UINT32_MAX || ea->extended) {

Why?  The assertion is stating that there was no overflow, because we
are in control of ea->extents[ea->count - 1].length (it came from
local code performing block status, and our block layer guarantees
that no block status returns more than 2^63 bytes because we don't
support images larger than off_t).  At best, all I need is a comment
why the assertion is valid.

> 
>  ...
> 
> }
> 
> with this:
> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
> 
> -- 
> Best regards,
> Vladimir
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 15/24] nbd/server: Prepare to send extended header replies
  2023-08-04 19:28     ` Eric Blake
@ 2023-08-07 17:20       ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 62+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2023-08-07 17:20 UTC (permalink / raw)
  To: Eric Blake; +Cc: qemu-devel, qemu-block, libguestfs

On 04.08.23 22:28, Eric Blake wrote:
> On Fri, Jun 16, 2023 at 09:48:18PM +0300, Vladimir Sementsov-Ogievskiy wrote:
>> On 08.06.23 16:56, Eric Blake wrote:
>>> Although extended mode is not yet enabled, once we do turn it on, we
>>> need to reply with extended headers to all messages.  Update the low
>>> level entry points necessary so that all other callers automatically
>>> get the right header based on the current mode.
>>>
>>> Signed-off-by: Eric Blake <eblake@redhat.com>
>>> ---
>>>
>>> v4: new patch, split out from v3 9/14
>>> ---
>>>    nbd/server.c | 30 ++++++++++++++++++++++--------
>>>    1 file changed, 22 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/nbd/server.c b/nbd/server.c
>>> index 119ac765f09..84c848a31d3 100644
>>> --- a/nbd/server.c
>>> +++ b/nbd/server.c
>>> @@ -1947,8 +1947,6 @@ static inline void set_be_chunk(NBDClient *client, struct iovec *iov,
>>>                                    size_t niov, uint16_t flags, uint16_t type,
>>>                                    NBDRequest *request)
>>>    {
>>> -    /* TODO - handle structured vs. extended replies */
>>> -    NBDStructuredReplyChunk *chunk = iov->iov_base;
>>>        size_t i, length = 0;
>>>
>>>        for (i = 1; i < niov; i++) {
>>> @@ -1956,12 +1954,26 @@ static inline void set_be_chunk(NBDClient *client, struct iovec *iov,
>>>        }
>>>        assert(length <= NBD_MAX_BUFFER_SIZE + sizeof(NBDStructuredReadData));
>>>
>>> -    iov[0].iov_len = sizeof(*chunk);
>>> -    stl_be_p(&chunk->magic, NBD_STRUCTURED_REPLY_MAGIC);
>>> -    stw_be_p(&chunk->flags, flags);
>>> -    stw_be_p(&chunk->type, type);
>>> -    stq_be_p(&chunk->cookie, request->cookie);
>>> -    stl_be_p(&chunk->length, length);
>>> +    if (client->mode >= NBD_MODE_EXTENDED) {
>>> +        NBDExtendedReplyChunk *chunk = iov->iov_base;
>>> +
>>> +        iov->iov_len = sizeof(*chunk);
>>
>> I'd prefer to keep iov[0].iov_len notation, to stress that iov is an array
>>
>> anyway:
>> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
> 
> I can make that change, and keep your R-b.

OK

> 
>>
>>> +        stl_be_p(&chunk->magic, NBD_EXTENDED_REPLY_MAGIC);
>>> +        stw_be_p(&chunk->flags, flags);
>>> +        stw_be_p(&chunk->type, type);
>>> +        stq_be_p(&chunk->cookie, request->cookie);
>>
>> Hm. Not about this patch:
>>
>> we now moved to simple cookies. And it seems that actually, 64bit is too much for number of request.
> 
> You're right that it's more than qemu cared about.  But there may be
> other implementations that really do store a 64-bit pointer as their
> opaque cookie, for ease of reverse-lookup on which command the
> server's reply corresponds to, so I don't see it changing any time
> soon in the NBD protocol.
> 

Agree

-- 
Best regards,
Vladimir



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 16/24] nbd/server: Support 64-bit block status
  2023-08-04 19:36     ` Eric Blake
@ 2023-08-07 17:28       ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 62+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2023-08-07 17:28 UTC (permalink / raw)
  To: Eric Blake; +Cc: qemu-devel, qemu-block, libguestfs

On 04.08.23 22:36, Eric Blake wrote:
> On Tue, Jun 27, 2023 at 04:23:49PM +0300, Vladimir Sementsov-Ogievskiy wrote:
>> On 08.06.23 16:56, Eric Blake wrote:
>>> The NBD spec states that if the client negotiates extended headers,
>>> the server must avoid NBD_REPLY_TYPE_BLOCK_STATUS and instead use
>>> NBD_REPLY_TYPE_BLOCK_STATUS_EXT which supports 64-bit lengths, even if
>>> the reply does not need more than 32 bits.  As of this patch,
>>> client->mode is still never NBD_MODE_EXTENDED, so the code added here
>>> does not take effect until the next patch enables negotiation.
>>>
>>> For now, all metacontexts that we know how to export never populate
>>> more than 32 bits of information, so we don't have to worry about
>>> NBD_REP_ERR_EXT_HEADER_REQD or filtering during handshake, and we
>>> always send all zeroes for the upper 32 bits of status during
>>> NBD_CMD_BLOCK_STATUS.
>>>
>>> Note that we previously had some interesting size-juggling on call
>>> chains, such as:
>>>
>>> nbd_co_send_block_status(uint32_t length)
>>> -> blockstatus_to_extents(uint32_t bytes)
>>>     -> bdrv_block_status_above(bytes, &uint64_t num)
>>>     -> nbd_extent_array_add(uint64_t num)
>>>       -> store num in 32-bit length
>>>
>>> But we were lucky that it never overflowed: bdrv_block_status_above
>>> never sets num larger than bytes, and we had previously been capping
>>> 'bytes' at 32 bits (since the protocol does not allow sending a larger
>>> request without extended headers).  This patch adds some assertions
>>> that ensure we continue to avoid overflowing 32 bits for a narrow
>>
>>
>> [..]
>>
>>> @@ -2162,19 +2187,23 @@ static void nbd_extent_array_convert_to_be(NBDExtentArray *ea)
>>>     * would result in an incorrect range reported to the client)
>>>     */
>>>    static int nbd_extent_array_add(NBDExtentArray *ea,
>>> -                                uint32_t length, uint32_t flags)
>>> +                                uint64_t length, uint32_t flags)
>>>    {
>>>        assert(ea->can_add);
>>>
>>>        if (!length) {
>>>            return 0;
>>>        }
>>> +    if (!ea->extended) {
>>> +        assert(length <= UINT32_MAX);
>>> +    }
>>>
>>>        /* Extend previous extent if flags are the same */
>>>        if (ea->count > 0 && flags == ea->extents[ea->count - 1].flags) {
>>> -        uint64_t sum = (uint64_t)length + ea->extents[ea->count - 1].length;
>>> +        uint64_t sum = length + ea->extents[ea->count - 1].length;
>>>
>>> -        if (sum <= UINT32_MAX) {
>>> +        assert(sum >= length);
>>> +        if (sum <= UINT32_MAX || ea->extended) {
>>
>> that "if" and uint64_t sum was to avoid overflow. I think, we can't just assert, instead include the check into if:
>>
>> if (sum >= length && (sum <= UINT32_MAX || ea->extended) {
> 
> Why?  The assertion is stating that there was no overflow, because we
> are in control of ea->extents[ea->count - 1].length (it came from
> local code performing block status, and our block layer guarantees
> that no block status returns more than 2^63 bytes because we don't
> support images larger than off_t).  At best, all I need is a comment
> why the assertion is valid.

OK. Small comment would be good. Keep my r-b.

The only my point is that you make this small "extent API" block-layer dependent. But I'm not sure that is the only dependency, and I don't insist anyway.

> 
>>
>>   ...
>>
>> }
>>
>> with this:
>> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
>>
>> -- 
>> Best regards,
>> Vladimir
>>
> 

-- 
Best regards,
Vladimir



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 19/24] nbd/client: Initial support for extended headers
  2023-06-27 14:22   ` Vladimir Sementsov-Ogievskiy
@ 2023-08-07 19:20     ` Eric Blake
  0 siblings, 0 replies; 62+ messages in thread
From: Eric Blake @ 2023-08-07 19:20 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: qemu-devel, qemu-block, libguestfs, Kevin Wolf, Hanna Reitz

On Tue, Jun 27, 2023 at 05:22:09PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> On 08.06.23 16:56, Eric Blake wrote:
> > Update the client code to be able to send an extended request, and
> > parse an extended header from the server.  Note that since we reject
> > any structured reply with a too-large payload, we can always normalize
> > a valid header back into the compact form, so that the caller need not
> > deal with two branches of a union.  Still, until a later patch lets
> > the client negotiate extended headers, the code added here should not
> > be reached.  Note that because of the different magic numbers, it is
> > just as easy to trace and then tolerate a non-compliant server sending
> > the wrong header reply as it would be to insist that the server is
> > compliant.
> > 
> > Signed-off-by: Eric Blake <eblake@redhat.com>
> > ---
> > @@ -1508,23 +1537,28 @@ int coroutine_fn nbd_receive_reply(BlockDriverState *bs, QIOChannel *ioc,
> >                                          reply->cookie);
        case NBD_SIMPLE_REPLY_MAGIC:
            if (mode >= NBD_MODE_EXTENDED) {
                trace_nbd_receive_wrong_header(reply->magic,
                                               nbd_mode_lookup(mode));
            }
            ret = nbd_receive_simple_reply(ioc, &reply->simple, errp);
            if (ret < 0) {
                break;
            }
            trace_nbd_receive_simple_reply(reply->simple.error,
                                           nbd_err_lookup(reply->simple.error),
                                           reply->cookie);
> >           break;
> >       case NBD_STRUCTURED_REPLY_MAGIC:
> > -        ret = nbd_receive_structured_reply_chunk(ioc, &reply->structured, errp);
> > +    case NBD_EXTENDED_REPLY_MAGIC:
> > +        expected = mode >= NBD_MODE_EXTENDED ? NBD_EXTENDED_REPLY_MAGIC
> > +            : NBD_STRUCTURED_REPLY_MAGIC;
> > +        if (reply->magic != expected) {
> > +            trace_nbd_receive_wrong_header(reply->magic,
> > +                                           nbd_mode_lookup(mode));
> > +        }
> > +        ret = nbd_receive_reply_chunk_header(ioc, reply, errp);
> >           if (ret < 0) {
> >               break;
> >           }
> 
> maybe
> 
> assert(reply->magic == NBD_STRUCTURED_REPLY_MAGIC)

No, not even as a temporary assertion.  You are correct that as of
this patch, a compliant server will only be sending structured
replies, not extended (because we haven't turned on a request for
extended headers yet); but we should never assert something that a
non-compliant server can send over the wire.  So the best we can do is
trace the server's unusual response.

> 
> >           type = nbd_reply_type_lookup(reply->structured.type);
> > -        trace_nbd_receive_structured_reply_chunk(reply->structured.flags,
> > -                                                 reply->structured.type, type,
> > -                                                 reply->structured.cookie,
> > -                                                 reply->structured.length);
> > +        trace_nbd_receive_reply_chunk_header(reply->structured.flags,
> > +                                             reply->structured.type, type,
> > +                                             reply->structured.cookie,
> > +                                             reply->structured.length);
> >           break;
> >       default:
> > +        trace_nbd_receive_wrong_header(reply->magic, nbd_mode_lookup(mode));
> >           error_setg(errp, "invalid magic (got 0x%" PRIx32 ")", reply->magic);
> >           return -EINVAL;
> >       }
> > -    if (ret < 0) {
> > -        return ret;
> > -    }
> 
> hmm, mistake? Seems, we'll return 1 on error with set errp this way
> 
> > 
> >       return 1;

Indeed, I botched that logic (either keep the 'ret < 0' filter after
the switch, or change 'break' to 'return ret' within the switch).
Will fix in v5.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [Libguestfs] [PATCH v4 24/24] nbd/server: Add FLAG_PAYLOAD support to CMD_BLOCK_STATUS
  2023-06-27 19:42   ` Vladimir Sementsov-Ogievskiy
@ 2023-08-07 20:23     ` Eric Blake
  0 siblings, 0 replies; 62+ messages in thread
From: Eric Blake @ 2023-08-07 20:23 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: qemu-devel, Kevin Wolf, Hanna Reitz, libguestfs, qemu-block

On Tue, Jun 27, 2023 at 10:42:20PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> On 08.06.23 16:56, Eric Blake wrote:
> > Allow a client to request a subset of negotiated meta contexts.  For
> > example, a client may ask to use a single connection to learn about
> > both block status and dirty bitmaps, but where the dirty bitmap
> > queries only need to be performed on a subset of the disk; forcing the
> > server to compute that information on block status queries in the rest
> > of the disk is wasted effort (both at the server, and on the amount of
> > traffic sent over the wire to be parsed and ignored by the client).
> > 
> > Qemu as an NBD client never requests to use more than one meta
> > context, so it has no need to use block status payloads.  Testing this
> > instead requires support from libnbd, which CAN access multiple meta
> > contexts in parallel from a single NBD connection; an interop test
> > submitted to the libnbd project at the same time as this patch
> > demonstrates the feature working, as well as testing some corner cases
> > (for example, when the payload length is longer than the export
> > length), although other corner cases (like passing the same id
> > duplicated) requires a protocol fuzzer because libnbd is not wired up
> > to break the protocol that badly.
> > 
> > This also includes tweaks to 'qemu-nbd --list' to show when a server
> > is advertising the capability, and to the testsuite to reflect the
> > addition to that output.
> > 
> > Signed-off-by: Eric Blake <eblake@redhat.com>
> > ---
> 
> [..]
> 
> > +static int
> > +nbd_co_block_status_payload_read(NBDClient *client, NBDRequest *request,
> > +                                 Error **errp)
> > +{
> > +    int payload_len = request->len;
> > +    g_autofree char *buf = NULL;
> > +    size_t count, i, nr_bitmaps;
> > +    uint32_t id;
> > +
> > +    assert(request->len <= NBD_MAX_BUFFER_SIZE);
> > +    assert(client->contexts.exp == client->exp);
> > +    nr_bitmaps = client->exp->nr_export_bitmaps;
> > +    request->contexts = g_new0(NBDMetaContexts, 1);
> > +    request->contexts->exp = client->exp;
> > +
> > +    if (payload_len % sizeof(uint32_t) ||
> > +        payload_len < sizeof(NBDBlockStatusPayload) ||
> > +        payload_len > (sizeof(NBDBlockStatusPayload) +
> > +                       sizeof(id) * client->contexts.count)) {
> > +        goto skip;
> > +    }
> > @@ -2470,7 +2552,13 @@ static int coroutine_fn nbd_co_receive_request(NBDRequestData *req, NBDRequest *
> 
> [..]
> 
> > @@ -2712,7 +2804,8 @@ static coroutine_fn int nbd_handle_request(NBDClient *client,
> >                                         "discard failed", errp);
> > 
> >       case NBD_CMD_BLOCK_STATUS:
> > -        if (!request->len) {
> > +        assert(request->contexts);
> > +        if (!request->len && !(request->flags & NBD_CMD_FLAG_PAYLOAD_LEN)) {
> 
> why not error-out for len=0 in case of payload?

Because nbd_co_block_status_payload_read() already rejected that case.
But an assertion would not hurt to make it obvious.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org



^ permalink raw reply	[flat|nested] 62+ messages in thread

end of thread, other threads:[~2023-08-07 20:24 UTC | newest]

Thread overview: 62+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-06-08 13:56 [PATCH v4 00/24] qemu patches for 64-bit NBD extensions Eric Blake
2023-06-08 13:56 ` [PATCH v4 01/24] nbd/client: Use smarter assert Eric Blake
2023-06-08 13:56 ` [PATCH v4 02/24] nbd: Consistent typedef usage in header Eric Blake
2023-06-08 14:17   ` [Libguestfs] " Eric Blake
2023-06-12 11:59     ` Vladimir Sementsov-Ogievskiy
2023-06-08 13:56 ` [PATCH v4 03/24] nbd/server: Prepare for alternate-size headers Eric Blake
2023-06-12 13:53   ` Vladimir Sementsov-Ogievskiy
2023-06-08 13:56 ` [PATCH v4 04/24] nbd/server: Refactor to pass full request around Eric Blake
2023-06-08 13:56 ` [PATCH v4 05/24] nbd: s/handle/cookie/ to match NBD spec Eric Blake
2023-06-08 14:32   ` [Libguestfs] " Eric Blake
2023-06-12 14:12   ` Vladimir Sementsov-Ogievskiy
2023-06-08 13:56 ` [PATCH v4 06/24] nbd/client: Simplify cookie vs. index computation Eric Blake
2023-06-12 14:27   ` Vladimir Sementsov-Ogievskiy
2023-06-12 19:13     ` Eric Blake
2023-06-08 13:56 ` [PATCH v4 07/24] nbd/client: Add safety check on chunk payload length Eric Blake
2023-06-08 13:56 ` [PATCH v4 08/24] nbd: Use enum for various negotiation modes Eric Blake
2023-06-12 14:39   ` Vladimir Sementsov-Ogievskiy
2023-06-08 13:56 ` [PATCH v4 09/24] nbd: Replace bool structured_reply with mode enum Eric Blake
2023-06-12 15:07   ` Vladimir Sementsov-Ogievskiy
2023-06-12 19:24     ` [Libguestfs] " Eric Blake
2023-07-19 20:11       ` Eric Blake
2023-06-08 13:56 ` [PATCH v4 10/24] nbd/client: Pass mode through to nbd_send_request Eric Blake
2023-06-12 15:48   ` Vladimir Sementsov-Ogievskiy
2023-06-08 13:56 ` [PATCH v4 11/24] nbd: Add types for extended headers Eric Blake
2023-06-12 16:11   ` Vladimir Sementsov-Ogievskiy
2023-06-08 13:56 ` [PATCH v4 12/24] nbd: Prepare for 64-bit request effect lengths Eric Blake
2023-06-08 18:26   ` [Libguestfs] " Eric Blake
2023-06-16 18:16   ` Vladimir Sementsov-Ogievskiy
2023-06-08 13:56 ` [PATCH v4 13/24] nbd/server: Refactor handling of request payload Eric Blake
2023-06-08 18:29   ` [Libguestfs] " Eric Blake
2023-06-16 18:29     ` Vladimir Sementsov-Ogievskiy
2023-06-08 13:56 ` [PATCH v4 14/24] nbd/server: Prepare to receive extended header requests Eric Blake
2023-06-16 18:35   ` Vladimir Sementsov-Ogievskiy
2023-06-08 13:56 ` [PATCH v4 15/24] nbd/server: Prepare to send extended header replies Eric Blake
2023-06-16 18:48   ` Vladimir Sementsov-Ogievskiy
2023-08-04 19:28     ` Eric Blake
2023-08-07 17:20       ` Vladimir Sementsov-Ogievskiy
2023-06-08 13:56 ` [PATCH v4 16/24] nbd/server: Support 64-bit block status Eric Blake
2023-06-27 13:23   ` Vladimir Sementsov-Ogievskiy
2023-08-04 19:36     ` Eric Blake
2023-08-07 17:28       ` Vladimir Sementsov-Ogievskiy
2023-06-08 13:56 ` [PATCH v4 17/24] nbd/server: Enable initial support for extended headers Eric Blake
2023-06-27 13:26   ` Vladimir Sementsov-Ogievskiy
2023-06-08 13:56 ` [PATCH v4 18/24] nbd/client: Plumb errp through nbd_receive_replies Eric Blake
2023-06-08 19:10   ` [Libguestfs] " Eric Blake
2023-06-27 13:31   ` Vladimir Sementsov-Ogievskiy
2023-06-08 13:56 ` [PATCH v4 19/24] nbd/client: Initial support for extended headers Eric Blake
2023-06-27 14:22   ` Vladimir Sementsov-Ogievskiy
2023-08-07 19:20     ` Eric Blake
2023-06-08 13:56 ` [PATCH v4 20/24] nbd/client: Accept 64-bit block status chunks Eric Blake
2023-06-27 14:50   ` Vladimir Sementsov-Ogievskiy
2023-06-08 13:56 ` [PATCH v4 21/24] nbd/client: Request extended headers during negotiation Eric Blake
2023-06-27 14:55   ` Vladimir Sementsov-Ogievskiy
2023-06-08 13:56 ` [PATCH v4 22/24] nbd/server: Refactor list of negotiated meta contexts Eric Blake
2023-06-27 15:11   ` Vladimir Sementsov-Ogievskiy
2023-06-08 13:56 ` [PATCH v4 23/24] nbd/server: Prepare for per-request filtering of BLOCK_STATUS Eric Blake
2023-06-08 19:15   ` [Libguestfs] " Eric Blake
2023-06-27 15:19     ` Vladimir Sementsov-Ogievskiy
2023-06-08 13:56 ` [PATCH v4 24/24] nbd/server: Add FLAG_PAYLOAD support to CMD_BLOCK_STATUS Eric Blake
2023-06-08 19:19   ` [Libguestfs] " Eric Blake
2023-06-27 19:42   ` Vladimir Sementsov-Ogievskiy
2023-08-07 20:23     ` [Libguestfs] " Eric Blake

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.