All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH v2 00/10] nbd minimal structured read
@ 2017-10-09 17:27 Vladimir Sementsov-Ogievskiy
  2017-10-09 17:27 ` [Qemu-devel] [PATCH v2 01/10] block/nbd-client: assert qiov len once in nbd_co_request Vladimir Sementsov-Ogievskiy
                   ` (17 more replies)
  0 siblings, 18 replies; 31+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2017-10-09 17:27 UTC (permalink / raw)
  To: qemu-block, qemu-devel; +Cc: mreitz, kwolf, pbonzini, eblake, vsementsov, den

Minimally implement nbd structured read extension.

v2:

clone: tag up-nbd-minimal-structured-read-v2 from https://src.openvz.org/scm/~vsementsov/qemu.git
online: https://src.openvz.org/users/vsementsov/repos/qemu/browse?at=up-nbd-minimal-structured-read-v2

01: improve assertion (Eric)
02: add Eric's r-b
06: define NBD_SREP_TYPE_OFFSET_HOLE and NBD_SREP_TYPE_ERROR_OFFSET
    here (Eric)
07: fix return value of nbd_request_simple_option
08-09: new patches
10: mostly rewritten

Vladimir Sementsov-Ogievskiy (10):
  block/nbd-client: assert qiov len once in nbd_co_request
  block/nbd-client: refactor nbd_co_receive_reply
  nbd: rename NBD_REPLY_MAGIC to NBD_SIMPLE_REPLY_MAGIC
  nbd-server: refactor simple reply sending
  nbd: header constants indenting
  nbd: Minimal structured read for server
  nbd/client: refactor nbd_receive_starttls
  nbd: share some nbd entities to be reused in block/nbd-client.c
  nbd/client: prepare nbd_receive_reply for structured reply
  nbd: Minimal structured read for client

 include/block/nbd.h                      | 132 ++++++++++-
 nbd/nbd-internal.h                       |  65 +++---
 block/nbd-client.c                       | 364 ++++++++++++++++++++++++++++---
 nbd/client.c                             | 205 +++++++++++------
 nbd/server.c                             | 189 +++++++++++-----
 nbd/trace-events                         |  10 +-
 tests/qemu-iotests/nbd-fault-injector.py |   4 +-
 7 files changed, 756 insertions(+), 213 deletions(-)

-- 
2.11.1

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Qemu-devel] [PATCH v2 01/10] block/nbd-client: assert qiov len once in nbd_co_request
  2017-10-09 17:27 [Qemu-devel] [PATCH v2 00/10] nbd minimal structured read Vladimir Sementsov-Ogievskiy
@ 2017-10-09 17:27 ` Vladimir Sementsov-Ogievskiy
  2017-10-09 18:37   ` Eric Blake
  2017-10-09 17:27 ` [Qemu-devel] [PATCH v2 1/7] block/nbd-client: refactor nbd_co_receive_reply Vladimir Sementsov-Ogievskiy
                   ` (16 subsequent siblings)
  17 siblings, 1 reply; 31+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2017-10-09 17:27 UTC (permalink / raw)
  To: qemu-block, qemu-devel; +Cc: mreitz, kwolf, pbonzini, eblake, vsementsov, den

Also improve the assertion: check that qiov is NULL for other commands
than CMD_READ and CMD_WRITE.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/nbd-client.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/block/nbd-client.c b/block/nbd-client.c
index 72651dcdb1..ddf273a6a4 100644
--- a/block/nbd-client.c
+++ b/block/nbd-client.c
@@ -156,7 +156,6 @@ static int nbd_co_send_request(BlockDriverState *bs,
         qio_channel_set_cork(s->ioc, true);
         rc = nbd_send_request(s->ioc, request);
         if (rc >= 0 && !s->quit) {
-            assert(request->len == iov_size(qiov->iov, qiov->niov));
             if (qio_channel_writev_all(s->ioc, qiov->iov, qiov->niov,
                                        NULL) < 0) {
                 rc = -EIO;
@@ -197,7 +196,6 @@ static int nbd_co_receive_reply(NBDClientSession *s,
         assert(s->reply.handle == request->handle);
         ret = -s->reply.error;
         if (qiov && s->reply.error == 0) {
-            assert(request->len == iov_size(qiov->iov, qiov->niov));
             if (qio_channel_readv_all(s->ioc, qiov->iov, qiov->niov,
                                       NULL) < 0) {
                 ret = -EIO;
@@ -231,8 +229,12 @@ static int nbd_co_request(BlockDriverState *bs,
     NBDClientSession *client = nbd_get_client_session(bs);
     int ret;
 
-    assert(!qiov || request->type == NBD_CMD_WRITE ||
-           request->type == NBD_CMD_READ);
+    if (qiov) {
+        assert(request->type == NBD_CMD_WRITE || request->type == NBD_CMD_READ);
+        assert(request->len == iov_size(qiov->iov, qiov->niov));
+    } else {
+        assert(request->type != NBD_CMD_WRITE && request->type != NBD_CMD_READ);
+    }
     ret = nbd_co_send_request(bs, request,
                               request->type == NBD_CMD_WRITE ? qiov : NULL);
     if (ret < 0) {
-- 
2.11.1

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [Qemu-devel] [PATCH v2 1/7] block/nbd-client: refactor nbd_co_receive_reply
  2017-10-09 17:27 [Qemu-devel] [PATCH v2 00/10] nbd minimal structured read Vladimir Sementsov-Ogievskiy
  2017-10-09 17:27 ` [Qemu-devel] [PATCH v2 01/10] block/nbd-client: assert qiov len once in nbd_co_request Vladimir Sementsov-Ogievskiy
@ 2017-10-09 17:27 ` Vladimir Sementsov-Ogievskiy
  2017-10-09 17:27 ` [Qemu-devel] [PATCH v2 2/7] block/nbd-client: exit reply-reading coroutine on incorrect handle Vladimir Sementsov-Ogievskiy
                   ` (15 subsequent siblings)
  17 siblings, 0 replies; 31+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2017-10-09 17:27 UTC (permalink / raw)
  To: qemu-block, qemu-devel; +Cc: mreitz, kwolf, pbonzini, eblake, vsementsov, den

"NBDReply *reply" parameter of nbd_co_receive_reply is used only
to pass return value for nbd_co_request (reply.error). Remove it
and use function return value instead.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/nbd-client.c | 30 +++++++++++++++---------------
 1 file changed, 15 insertions(+), 15 deletions(-)

diff --git a/block/nbd-client.c b/block/nbd-client.c
index ee7f758e68..acd8e5d007 100644
--- a/block/nbd-client.c
+++ b/block/nbd-client.c
@@ -178,26 +178,26 @@ err:
     return rc;
 }
 
-static void nbd_co_receive_reply(NBDClientSession *s,
-                                 NBDRequest *request,
-                                 NBDReply *reply,
-                                 QEMUIOVector *qiov)
+static int nbd_co_receive_reply(NBDClientSession *s,
+                                NBDRequest *request,
+                                QEMUIOVector *qiov)
 {
+    int ret;
     int i = HANDLE_TO_INDEX(s, request->handle);
 
     /* Wait until we're woken up by nbd_read_reply_entry.  */
     s->requests[i].receiving = true;
     qemu_coroutine_yield();
     s->requests[i].receiving = false;
-    *reply = s->reply;
-    if (reply->handle != request->handle || !s->ioc || s->quit) {
-        reply->error = EIO;
+    if (s->reply.handle != request->handle || !s->ioc || s->quit) {
+        ret = -EIO;
     } else {
-        if (qiov && reply->error == 0) {
+        ret = -s->reply.error;
+        if (qiov && s->reply.error == 0) {
             assert(request->len == iov_size(qiov->iov, qiov->niov));
             if (qio_channel_readv_all(s->ioc, qiov->iov, qiov->niov,
                                       NULL) < 0) {
-                reply->error = EIO;
+                ret = -EIO;
                 s->quit = true;
             }
         }
@@ -217,6 +217,8 @@ static void nbd_co_receive_reply(NBDClientSession *s,
     s->in_flight--;
     qemu_co_queue_next(&s->free_sema);
     qemu_co_mutex_unlock(&s->send_mutex);
+
+    return ret;
 }
 
 static int nbd_co_request(BlockDriverState *bs,
@@ -224,7 +226,6 @@ static int nbd_co_request(BlockDriverState *bs,
                           QEMUIOVector *qiov)
 {
     NBDClientSession *client = nbd_get_client_session(bs);
-    NBDReply reply;
     int ret;
 
     assert(!qiov || request->type == NBD_CMD_WRITE ||
@@ -232,12 +233,11 @@ static int nbd_co_request(BlockDriverState *bs,
     ret = nbd_co_send_request(bs, request,
                               request->type == NBD_CMD_WRITE ? qiov : NULL);
     if (ret < 0) {
-        reply.error = -ret;
-    } else {
-        nbd_co_receive_reply(client, request, &reply,
-                             request->type == NBD_CMD_READ ? qiov : NULL);
+        return ret;
     }
-    return -reply.error;
+
+    return nbd_co_receive_reply(client, request,
+                                request->type == NBD_CMD_READ ? qiov : NULL);
 }
 
 int nbd_client_co_preadv(BlockDriverState *bs, uint64_t offset,
-- 
2.11.1

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [Qemu-devel] [PATCH v2 2/7] block/nbd-client: exit reply-reading coroutine on incorrect handle
  2017-10-09 17:27 [Qemu-devel] [PATCH v2 00/10] nbd minimal structured read Vladimir Sementsov-Ogievskiy
  2017-10-09 17:27 ` [Qemu-devel] [PATCH v2 01/10] block/nbd-client: assert qiov len once in nbd_co_request Vladimir Sementsov-Ogievskiy
  2017-10-09 17:27 ` [Qemu-devel] [PATCH v2 1/7] block/nbd-client: refactor nbd_co_receive_reply Vladimir Sementsov-Ogievskiy
@ 2017-10-09 17:27 ` Vladimir Sementsov-Ogievskiy
  2017-10-09 17:27 ` [Qemu-devel] [PATCH v2 02/10] block/nbd-client: refactor nbd_co_receive_reply Vladimir Sementsov-Ogievskiy
                   ` (14 subsequent siblings)
  17 siblings, 0 replies; 31+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2017-10-09 17:27 UTC (permalink / raw)
  To: qemu-block, qemu-devel; +Cc: mreitz, kwolf, pbonzini, eblake, vsementsov, den

Check reply-handle == request-handle in the same place, where
recv coroutine number is calculated from reply->handle and it's
correctness checked - in nbd_read_reply_entry.

Also finish nbd_read_reply_entry in case of reply-handle !=
request-handle in the same way as in case of incorrect reply-handle.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/nbd-client.h | 1 +
 block/nbd-client.c | 8 ++++++--
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/block/nbd-client.h b/block/nbd-client.h
index b435754b82..b1900e6a6d 100644
--- a/block/nbd-client.h
+++ b/block/nbd-client.h
@@ -20,6 +20,7 @@
 typedef struct {
     Coroutine *coroutine;
     bool receiving;         /* waiting for read_reply_co? */
+    NBDRequest *request;
 } NBDClientRequest;
 
 typedef struct NBDClientSession {
diff --git a/block/nbd-client.c b/block/nbd-client.c
index acd8e5d007..5f241ecc22 100644
--- a/block/nbd-client.c
+++ b/block/nbd-client.c
@@ -92,7 +92,9 @@ static coroutine_fn void nbd_read_reply_entry(void *opaque)
         i = HANDLE_TO_INDEX(s, s->reply.handle);
         if (i >= MAX_NBD_REQUESTS ||
             !s->requests[i].coroutine ||
-            !s->requests[i].receiving) {
+            !s->requests[i].receiving ||
+            s->reply.handle != s->requests[i].request->handle)
+        {
             break;
         }
 
@@ -142,6 +144,7 @@ static int nbd_co_send_request(BlockDriverState *bs,
     s->requests[i].receiving = false;
 
     request->handle = INDEX_TO_HANDLE(s, i);
+    s->requests[i].request = request;
 
     if (s->quit) {
         rc = -EIO;
@@ -189,9 +192,10 @@ static int nbd_co_receive_reply(NBDClientSession *s,
     s->requests[i].receiving = true;
     qemu_coroutine_yield();
     s->requests[i].receiving = false;
-    if (s->reply.handle != request->handle || !s->ioc || s->quit) {
+    if (!s->ioc || s->quit) {
         ret = -EIO;
     } else {
+        assert(s->reply.handle == request->handle);
         ret = -s->reply.error;
         if (qiov && s->reply.error == 0) {
             assert(request->len == iov_size(qiov->iov, qiov->niov));
-- 
2.11.1

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [Qemu-devel] [PATCH v2 02/10] block/nbd-client: refactor nbd_co_receive_reply
  2017-10-09 17:27 [Qemu-devel] [PATCH v2 00/10] nbd minimal structured read Vladimir Sementsov-Ogievskiy
                   ` (2 preceding siblings ...)
  2017-10-09 17:27 ` [Qemu-devel] [PATCH v2 2/7] block/nbd-client: exit reply-reading coroutine on incorrect handle Vladimir Sementsov-Ogievskiy
@ 2017-10-09 17:27 ` Vladimir Sementsov-Ogievskiy
  2017-10-09 17:27 ` [Qemu-devel] [PATCH v2 3/7] block/nbd-client: refactor reading reply Vladimir Sementsov-Ogievskiy
                   ` (13 subsequent siblings)
  17 siblings, 0 replies; 31+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2017-10-09 17:27 UTC (permalink / raw)
  To: qemu-block, qemu-devel; +Cc: mreitz, kwolf, pbonzini, eblake, vsementsov, den

Pass handle parameter directly, as the whole request isn't needed.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 block/nbd-client.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/block/nbd-client.c b/block/nbd-client.c
index ddf273a6a4..c0683c3c83 100644
--- a/block/nbd-client.c
+++ b/block/nbd-client.c
@@ -180,11 +180,11 @@ err:
 }
 
 static int nbd_co_receive_reply(NBDClientSession *s,
-                                NBDRequest *request,
+                                uint64_t handle,
                                 QEMUIOVector *qiov)
 {
     int ret;
-    int i = HANDLE_TO_INDEX(s, request->handle);
+    int i = HANDLE_TO_INDEX(s, handle);
 
     /* Wait until we're woken up by nbd_read_reply_entry.  */
     s->requests[i].receiving = true;
@@ -193,7 +193,7 @@ static int nbd_co_receive_reply(NBDClientSession *s,
     if (!s->ioc || s->quit) {
         ret = -EIO;
     } else {
-        assert(s->reply.handle == request->handle);
+        assert(s->reply.handle == handle);
         ret = -s->reply.error;
         if (qiov && s->reply.error == 0) {
             if (qio_channel_readv_all(s->ioc, qiov->iov, qiov->niov,
@@ -241,7 +241,7 @@ static int nbd_co_request(BlockDriverState *bs,
         return ret;
     }
 
-    return nbd_co_receive_reply(client, request,
+    return nbd_co_receive_reply(client, request->handle,
                                 request->type == NBD_CMD_READ ? qiov : NULL);
 }
 
-- 
2.11.1

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [Qemu-devel] [PATCH v2 3/7] block/nbd-client: refactor reading reply
  2017-10-09 17:27 [Qemu-devel] [PATCH v2 00/10] nbd minimal structured read Vladimir Sementsov-Ogievskiy
                   ` (3 preceding siblings ...)
  2017-10-09 17:27 ` [Qemu-devel] [PATCH v2 02/10] block/nbd-client: refactor nbd_co_receive_reply Vladimir Sementsov-Ogievskiy
@ 2017-10-09 17:27 ` Vladimir Sementsov-Ogievskiy
  2017-10-09 17:27 ` [Qemu-devel] [PATCH v2 03/10] nbd: rename NBD_REPLY_MAGIC to NBD_SIMPLE_REPLY_MAGIC Vladimir Sementsov-Ogievskiy
                   ` (12 subsequent siblings)
  17 siblings, 0 replies; 31+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2017-10-09 17:27 UTC (permalink / raw)
  To: qemu-block, qemu-devel; +Cc: mreitz, kwolf, pbonzini, eblake, vsementsov, den

Read the whole reply in one place - in nbd_read_reply_entry.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/nbd-client.h |  1 +
 block/nbd-client.c | 42 ++++++++++++++++++++++++------------------
 2 files changed, 25 insertions(+), 18 deletions(-)

diff --git a/block/nbd-client.h b/block/nbd-client.h
index b1900e6a6d..3f97d31013 100644
--- a/block/nbd-client.h
+++ b/block/nbd-client.h
@@ -21,6 +21,7 @@ typedef struct {
     Coroutine *coroutine;
     bool receiving;         /* waiting for read_reply_co? */
     NBDRequest *request;
+    QEMUIOVector *qiov;
 } NBDClientRequest;
 
 typedef struct NBDClientSession {
diff --git a/block/nbd-client.c b/block/nbd-client.c
index 5f241ecc22..f7b642f079 100644
--- a/block/nbd-client.c
+++ b/block/nbd-client.c
@@ -98,6 +98,21 @@ static coroutine_fn void nbd_read_reply_entry(void *opaque)
             break;
         }
 
+        if (s->reply.error == 0 &&
+            s->requests[i].request->type == NBD_CMD_READ)
+        {
+            QEMUIOVector *qiov = s->requests[i].qiov;
+            assert(qiov != NULL);
+            assert(s->requests[i].request->len ==
+                   iov_size(qiov->iov, qiov->niov));
+            if (qio_channel_readv_all(s->ioc, qiov->iov, qiov->niov,
+                                      NULL) < 0)
+            {
+                s->reply.error = EIO;
+                break;
+            }
+        }
+
         /* We're woken up again by the request itself.  Note that there
          * is no race between yielding and reentering read_reply_co.  This
          * is because:
@@ -118,6 +133,7 @@ static coroutine_fn void nbd_read_reply_entry(void *opaque)
     s->read_reply_co = NULL;
 }
 
+/* qiov should be provided for READ and WRITE */
 static int nbd_co_send_request(BlockDriverState *bs,
                                NBDRequest *request,
                                QEMUIOVector *qiov)
@@ -145,6 +161,7 @@ static int nbd_co_send_request(BlockDriverState *bs,
 
     request->handle = INDEX_TO_HANDLE(s, i);
     s->requests[i].request = request;
+    s->requests[i].qiov = qiov;
 
     if (s->quit) {
         rc = -EIO;
@@ -155,7 +172,8 @@ static int nbd_co_send_request(BlockDriverState *bs,
         goto err;
     }
 
-    if (qiov) {
+    if (s->requests[i].request->type == NBD_CMD_WRITE) {
+        assert(qiov);
         qio_channel_set_cork(s->ioc, true);
         rc = nbd_send_request(s->ioc, request);
         if (rc >= 0 && !s->quit) {
@@ -181,9 +199,7 @@ err:
     return rc;
 }
 
-static int nbd_co_receive_reply(NBDClientSession *s,
-                                NBDRequest *request,
-                                QEMUIOVector *qiov)
+static int nbd_co_receive_reply(NBDClientSession *s, NBDRequest *request)
 {
     int ret;
     int i = HANDLE_TO_INDEX(s, request->handle);
@@ -197,14 +213,6 @@ static int nbd_co_receive_reply(NBDClientSession *s,
     } else {
         assert(s->reply.handle == request->handle);
         ret = -s->reply.error;
-        if (qiov && s->reply.error == 0) {
-            assert(request->len == iov_size(qiov->iov, qiov->niov));
-            if (qio_channel_readv_all(s->ioc, qiov->iov, qiov->niov,
-                                      NULL) < 0) {
-                ret = -EIO;
-                s->quit = true;
-            }
-        }
 
         /* Tell the read handler to read another header.  */
         s->reply.handle = 0;
@@ -232,16 +240,14 @@ static int nbd_co_request(BlockDriverState *bs,
     NBDClientSession *client = nbd_get_client_session(bs);
     int ret;
 
-    assert(!qiov || request->type == NBD_CMD_WRITE ||
-           request->type == NBD_CMD_READ);
-    ret = nbd_co_send_request(bs, request,
-                              request->type == NBD_CMD_WRITE ? qiov : NULL);
+    assert((qiov == NULL) !=
+           (request->type == NBD_CMD_WRITE || request->type == NBD_CMD_READ));
+    ret = nbd_co_send_request(bs, request, qiov);
     if (ret < 0) {
         return ret;
     }
 
-    return nbd_co_receive_reply(client, request,
-                                request->type == NBD_CMD_READ ? qiov : NULL);
+    return nbd_co_receive_reply(client, request);
 }
 
 int nbd_client_co_preadv(BlockDriverState *bs, uint64_t offset,
-- 
2.11.1

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [Qemu-devel] [PATCH v2 03/10] nbd: rename NBD_REPLY_MAGIC to NBD_SIMPLE_REPLY_MAGIC
  2017-10-09 17:27 [Qemu-devel] [PATCH v2 00/10] nbd minimal structured read Vladimir Sementsov-Ogievskiy
                   ` (4 preceding siblings ...)
  2017-10-09 17:27 ` [Qemu-devel] [PATCH v2 3/7] block/nbd-client: refactor reading reply Vladimir Sementsov-Ogievskiy
@ 2017-10-09 17:27 ` Vladimir Sementsov-Ogievskiy
  2017-10-09 17:27 ` [Qemu-devel] [PATCH v2 4/7] block/nbd-client: drop reply field from NBDClientSession Vladimir Sementsov-Ogievskiy
                   ` (11 subsequent siblings)
  17 siblings, 0 replies; 31+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2017-10-09 17:27 UTC (permalink / raw)
  To: qemu-block, qemu-devel; +Cc: mreitz, kwolf, pbonzini, eblake, vsementsov, den

To be consistent when NBD_STRUCTURED_REPLY_MAGIC will be introduced.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 nbd/nbd-internal.h                       | 2 +-
 nbd/client.c                             | 4 ++--
 nbd/server.c                             | 4 ++--
 tests/qemu-iotests/nbd-fault-injector.py | 4 ++--
 4 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/nbd/nbd-internal.h b/nbd/nbd-internal.h
index 8a609a227f..2d6663de23 100644
--- a/nbd/nbd-internal.h
+++ b/nbd/nbd-internal.h
@@ -47,7 +47,7 @@
 #define NBD_OLDSTYLE_NEGOTIATE_SIZE (8 + 8 + 8 + 4 + 124)
 
 #define NBD_REQUEST_MAGIC       0x25609513
-#define NBD_REPLY_MAGIC         0x67446698
+#define NBD_SIMPLE_REPLY_MAGIC  0x67446698
 #define NBD_OPTS_MAGIC          0x49484156454F5054LL
 #define NBD_CLIENT_MAGIC        0x0000420281861253LL
 #define NBD_REP_MAGIC           0x0003e889045565a9LL
diff --git a/nbd/client.c b/nbd/client.c
index 68a0bc1ffc..cd5a2c80ac 100644
--- a/nbd/client.c
+++ b/nbd/client.c
@@ -931,7 +931,7 @@ int nbd_receive_reply(QIOChannel *ioc, NBDReply *reply, Error **errp)
     }
 
     /* Reply
-       [ 0 ..  3]    magic   (NBD_REPLY_MAGIC)
+       [ 0 ..  3]    magic   (NBD_SIMPLE_REPLY_MAGIC)
        [ 4 ..  7]    error   (0 == no error)
        [ 7 .. 15]    handle
      */
@@ -949,7 +949,7 @@ int nbd_receive_reply(QIOChannel *ioc, NBDReply *reply, Error **errp)
     }
     trace_nbd_receive_reply(magic, reply->error, reply->handle);
 
-    if (magic != NBD_REPLY_MAGIC) {
+    if (magic != NBD_SIMPLE_REPLY_MAGIC) {
         error_setg(errp, "invalid magic (got 0x%" PRIx32 ")", magic);
         return -EINVAL;
     }
diff --git a/nbd/server.c b/nbd/server.c
index 993ade30bb..a1b21a6951 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -911,11 +911,11 @@ static int nbd_send_reply(QIOChannel *ioc, NBDReply *reply, Error **errp)
     trace_nbd_send_reply(reply->error, reply->handle);
 
     /* Reply
-       [ 0 ..  3]    magic   (NBD_REPLY_MAGIC)
+       [ 0 ..  3]    magic   (NBD_SIMPLE_REPLY_MAGIC)
        [ 4 ..  7]    error   (0 == no error)
        [ 7 .. 15]    handle
      */
-    stl_be_p(buf, NBD_REPLY_MAGIC);
+    stl_be_p(buf, NBD_SIMPLE_REPLY_MAGIC);
     stl_be_p(buf + 4, reply->error);
     stq_be_p(buf + 8, reply->handle);
 
diff --git a/tests/qemu-iotests/nbd-fault-injector.py b/tests/qemu-iotests/nbd-fault-injector.py
index 1c10dcb51c..8a04d979aa 100755
--- a/tests/qemu-iotests/nbd-fault-injector.py
+++ b/tests/qemu-iotests/nbd-fault-injector.py
@@ -56,7 +56,7 @@ NBD_CMD_READ = 0
 NBD_CMD_WRITE = 1
 NBD_CMD_DISC = 2
 NBD_REQUEST_MAGIC = 0x25609513
-NBD_REPLY_MAGIC = 0x67446698
+NBD_SIMPLE_REPLY_MAGIC = 0x67446698
 NBD_PASSWD = 0x4e42444d41474943
 NBD_OPTS_MAGIC = 0x49484156454F5054
 NBD_CLIENT_MAGIC = 0x0000420281861253
@@ -166,7 +166,7 @@ def read_request(conn):
     return req
 
 def write_reply(conn, error, handle):
-    buf = reply_struct.pack(NBD_REPLY_MAGIC, error, handle)
+    buf = reply_struct.pack(NBD_SIMPLE_REPLY_MAGIC, error, handle)
     conn.send(buf, event='reply')
 
 def handle_connection(conn, use_export):
-- 
2.11.1

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [Qemu-devel] [PATCH v2 4/7] block/nbd-client: drop reply field from NBDClientSession
  2017-10-09 17:27 [Qemu-devel] [PATCH v2 00/10] nbd minimal structured read Vladimir Sementsov-Ogievskiy
                   ` (5 preceding siblings ...)
  2017-10-09 17:27 ` [Qemu-devel] [PATCH v2 03/10] nbd: rename NBD_REPLY_MAGIC to NBD_SIMPLE_REPLY_MAGIC Vladimir Sementsov-Ogievskiy
@ 2017-10-09 17:27 ` Vladimir Sementsov-Ogievskiy
  2017-10-09 17:27 ` [Qemu-devel] [PATCH v2 04/10] nbd-server: refactor simple reply sending Vladimir Sementsov-Ogievskiy
                   ` (10 subsequent siblings)
  17 siblings, 0 replies; 31+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2017-10-09 17:27 UTC (permalink / raw)
  To: qemu-block, qemu-devel; +Cc: mreitz, kwolf, pbonzini, eblake, vsementsov, den

Drop 'reply' from NBDClientSession. It's used to only deliver request
return code to receiving coroutine. Instead introduce per-request ret
variable to simplify the scheme and make further refactoring possible.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/nbd-client.h |  2 +-
 block/nbd-client.c | 22 +++++++++-------------
 2 files changed, 10 insertions(+), 14 deletions(-)

diff --git a/block/nbd-client.h b/block/nbd-client.h
index 3f97d31013..4bc8fe3993 100644
--- a/block/nbd-client.h
+++ b/block/nbd-client.h
@@ -22,6 +22,7 @@ typedef struct {
     bool receiving;         /* waiting for read_reply_co? */
     NBDRequest *request;
     QEMUIOVector *qiov;
+    int ret;
 } NBDClientRequest;
 
 typedef struct NBDClientSession {
@@ -35,7 +36,6 @@ typedef struct NBDClientSession {
     int in_flight;
 
     NBDClientRequest requests[MAX_NBD_REQUESTS];
-    NBDReply reply;
     bool quit;
 } NBDClientSession;
 
diff --git a/block/nbd-client.c b/block/nbd-client.c
index f7b642f079..f331f08a9a 100644
--- a/block/nbd-client.c
+++ b/block/nbd-client.c
@@ -74,10 +74,10 @@ static coroutine_fn void nbd_read_reply_entry(void *opaque)
     uint64_t i;
     int ret = 0;
     Error *local_err = NULL;
+    NBDReply reply;
 
     while (!s->quit) {
-        assert(s->reply.handle == 0);
-        ret = nbd_receive_reply(s->ioc, &s->reply, &local_err);
+        ret = nbd_receive_reply(s->ioc, &reply, &local_err);
         if (ret < 0) {
             error_report_err(local_err);
         }
@@ -89,18 +89,18 @@ static coroutine_fn void nbd_read_reply_entry(void *opaque)
          * handler acts as a synchronization point and ensures that only
          * one coroutine is called until the reply finishes.
          */
-        i = HANDLE_TO_INDEX(s, s->reply.handle);
+        i = HANDLE_TO_INDEX(s, reply.handle);
         if (i >= MAX_NBD_REQUESTS ||
             !s->requests[i].coroutine ||
             !s->requests[i].receiving ||
-            s->reply.handle != s->requests[i].request->handle)
+            reply.handle != s->requests[i].request->handle)
         {
             break;
         }
 
-        if (s->reply.error == 0 &&
-            s->requests[i].request->type == NBD_CMD_READ)
-        {
+        s->requests[i].ret = -reply.error;
+
+        if (reply.error == 0 && s->requests[i].request->type == NBD_CMD_READ) {
             QEMUIOVector *qiov = s->requests[i].qiov;
             assert(qiov != NULL);
             assert(s->requests[i].request->len ==
@@ -108,7 +108,7 @@ static coroutine_fn void nbd_read_reply_entry(void *opaque)
             if (qio_channel_readv_all(s->ioc, qiov->iov, qiov->niov,
                                       NULL) < 0)
             {
-                s->reply.error = EIO;
+                s->requests[i].ret = -EIO;
                 break;
             }
         }
@@ -211,11 +211,7 @@ static int nbd_co_receive_reply(NBDClientSession *s, NBDRequest *request)
     if (!s->ioc || s->quit) {
         ret = -EIO;
     } else {
-        assert(s->reply.handle == request->handle);
-        ret = -s->reply.error;
-
-        /* Tell the read handler to read another header.  */
-        s->reply.handle = 0;
+        ret = s->requests[i].ret;
     }
 
     s->requests[i].coroutine = NULL;
-- 
2.11.1

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [Qemu-devel] [PATCH v2 04/10] nbd-server: refactor simple reply sending
  2017-10-09 17:27 [Qemu-devel] [PATCH v2 00/10] nbd minimal structured read Vladimir Sementsov-Ogievskiy
                   ` (6 preceding siblings ...)
  2017-10-09 17:27 ` [Qemu-devel] [PATCH v2 4/7] block/nbd-client: drop reply field from NBDClientSession Vladimir Sementsov-Ogievskiy
@ 2017-10-09 17:27 ` Vladimir Sementsov-Ogievskiy
  2017-10-09 19:18   ` Eric Blake
  2017-10-09 17:27 ` [Qemu-devel] [PATCH v2 5/7] block/nbd-client: nbd_co_send_request: return -EIO if s->quit was set in parallel Vladimir Sementsov-Ogievskiy
                   ` (9 subsequent siblings)
  17 siblings, 1 reply; 31+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2017-10-09 17:27 UTC (permalink / raw)
  To: qemu-block, qemu-devel; +Cc: mreitz, kwolf, pbonzini, eblake, vsementsov, den

Get rid of calculating structure fields offsets by hand and set_cork,
rename nbd_co_send_reply to nbd_co_send_simple_reply. Do not use
NBDReply which will be upgraded in future patches to handle both simple
and structured replies and will be used only in the client.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 include/block/nbd.h |  6 ++++
 nbd/nbd-internal.h  |  9 +++++
 nbd/server.c        | 97 ++++++++++++++++++++++-------------------------------
 3 files changed, 56 insertions(+), 56 deletions(-)

diff --git a/include/block/nbd.h b/include/block/nbd.h
index 707fd37575..49008980f4 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -63,6 +63,12 @@ struct NBDReply {
 };
 typedef struct NBDReply NBDReply;
 
+typedef struct NBDSimpleReply {
+    uint32_t magic;  /* NBD_SIMPLE_REPLY_MAGIC */
+    uint32_t error;
+    uint64_t handle;
+} QEMU_PACKED NBDSimpleReply;
+
 /* Transmission (export) flags: sent from server to client during handshake,
    but describe what will happen during transmission */
 #define NBD_FLAG_HAS_FLAGS      (1 << 0)        /* Flags are there */
diff --git a/nbd/nbd-internal.h b/nbd/nbd-internal.h
index 2d6663de23..320abef296 100644
--- a/nbd/nbd-internal.h
+++ b/nbd/nbd-internal.h
@@ -113,6 +113,15 @@ static inline int nbd_write(QIOChannel *ioc, const void *buffer, size_t size,
     return qio_channel_write_all(ioc, buffer, size, errp) < 0 ? -EIO : 0;
 }
 
+/* nbd_writev
+ * Writes @size bytes to @ioc. Returns 0 on success.
+ */
+static inline int nbd_writev(QIOChannel *ioc, const struct iovec *iov,
+                             size_t niov, Error **errp)
+{
+    return qio_channel_writev_all(ioc, iov, niov, errp) < 0 ? -EIO : 0;
+}
+
 struct NBDTLSHandshakeData {
     GMainLoop *loop;
     bool complete;
diff --git a/nbd/server.c b/nbd/server.c
index a1b21a6951..57d5598e0f 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -902,26 +902,6 @@ static int nbd_receive_request(QIOChannel *ioc, NBDRequest *request,
     return 0;
 }
 
-static int nbd_send_reply(QIOChannel *ioc, NBDReply *reply, Error **errp)
-{
-    uint8_t buf[NBD_REPLY_SIZE];
-
-    reply->error = system_errno_to_nbd_errno(reply->error);
-
-    trace_nbd_send_reply(reply->error, reply->handle);
-
-    /* Reply
-       [ 0 ..  3]    magic   (NBD_SIMPLE_REPLY_MAGIC)
-       [ 4 ..  7]    error   (0 == no error)
-       [ 7 .. 15]    handle
-     */
-    stl_be_p(buf, NBD_SIMPLE_REPLY_MAGIC);
-    stl_be_p(buf + 4, reply->error);
-    stq_be_p(buf + 8, reply->handle);
-
-    return nbd_write(ioc, buf, sizeof(buf), errp);
-}
-
 #define MAX_NBD_REQUESTS 16
 
 void nbd_client_get(NBDClient *client)
@@ -1208,38 +1188,51 @@ void nbd_export_close_all(void)
     }
 }
 
-static int nbd_co_send_reply(NBDRequestData *req, NBDReply *reply, int len,
-                             Error **errp)
+static int coroutine_fn nbd_co_send_iov(NBDClient *client, struct iovec *iov,
+                                        unsigned niov, Error **errp)
 {
-    NBDClient *client = req->client;
     int ret;
 
     g_assert(qemu_in_coroutine());
-
-    trace_nbd_co_send_reply(reply->handle, reply->error, len);
-
     qemu_co_mutex_lock(&client->send_lock);
     client->send_coroutine = qemu_coroutine_self();
 
-    if (!len) {
-        ret = nbd_send_reply(client->ioc, reply, errp);
-    } else {
-        qio_channel_set_cork(client->ioc, true);
-        ret = nbd_send_reply(client->ioc, reply, errp);
-        if (ret == 0) {
-            ret = nbd_write(client->ioc, req->data, len, errp);
-            if (ret < 0) {
-                ret = -EIO;
-            }
-        }
-        qio_channel_set_cork(client->ioc, false);
-    }
+    ret = nbd_writev(client->ioc, iov, niov, errp);
 
     client->send_coroutine = NULL;
     qemu_co_mutex_unlock(&client->send_lock);
+
     return ret;
 }
 
+static inline void set_be_simple_reply(NBDSimpleReply *reply, uint64_t error,
+                                       uint64_t handle)
+{
+    stl_be_p(&reply->magic, NBD_SIMPLE_REPLY_MAGIC);
+    stl_be_p(&reply->error, error);
+    stq_be_p(&reply->handle, handle);
+}
+
+static int nbd_co_send_simple_reply(NBDClient *client,
+                                    uint64_t handle,
+                                    uint32_t error,
+                                    void *data,
+                                    size_t size,
+                                    Error **errp)
+{
+    NBDSimpleReply reply;
+    struct iovec iov[] = {
+        {.iov_base = &reply, .iov_len = sizeof(reply)},
+        {.iov_base = data, .iov_len = size}
+    };
+
+    trace_nbd_co_send_reply(handle, error, size);
+
+    set_be_simple_reply(&reply, system_errno_to_nbd_errno(error), handle);
+
+    return nbd_co_send_iov(client, iov, size ? 2 : 1, errp);
+}
+
 /* nbd_co_receive_request
  * Collect a client request. Return 0 if request looks valid, -EIO to drop
  * connection right away, and any other negative value to report an error to
@@ -1331,7 +1324,6 @@ static coroutine_fn void nbd_trip(void *opaque)
     NBDExport *exp = client->exp;
     NBDRequestData *req;
     NBDRequest request = { 0 };    /* GCC thinks it can be used uninitialized */
-    NBDReply reply;
     int ret;
     int flags;
     int reply_data_len = 0;
@@ -1351,11 +1343,7 @@ static coroutine_fn void nbd_trip(void *opaque)
         goto disconnect;
     }
 
-    reply.handle = request.handle;
-    reply.error = 0;
-
     if (ret < 0) {
-        reply.error = -ret;
         goto reply;
     }
 
@@ -1374,7 +1362,6 @@ static coroutine_fn void nbd_trip(void *opaque)
             ret = blk_co_flush(exp->blk);
             if (ret < 0) {
                 error_setg_errno(&local_err, -ret, "flush failed");
-                reply.error = -ret;
                 break;
             }
         }
@@ -1383,7 +1370,6 @@ static coroutine_fn void nbd_trip(void *opaque)
                         req->data, request.len);
         if (ret < 0) {
             error_setg_errno(&local_err, -ret, "reading from file failed");
-            reply.error = -ret;
             break;
         }
 
@@ -1392,7 +1378,7 @@ static coroutine_fn void nbd_trip(void *opaque)
         break;
     case NBD_CMD_WRITE:
         if (exp->nbdflags & NBD_FLAG_READ_ONLY) {
-            reply.error = EROFS;
+            ret = -EROFS;
             break;
         }
 
@@ -1404,14 +1390,13 @@ static coroutine_fn void nbd_trip(void *opaque)
                          req->data, request.len, flags);
         if (ret < 0) {
             error_setg_errno(&local_err, -ret, "writing to file failed");
-            reply.error = -ret;
         }
 
         break;
     case NBD_CMD_WRITE_ZEROES:
         if (exp->nbdflags & NBD_FLAG_READ_ONLY) {
             error_setg(&local_err, "Server is read-only, return error");
-            reply.error = EROFS;
+            ret = -EROFS;
             break;
         }
 
@@ -1426,7 +1411,6 @@ static coroutine_fn void nbd_trip(void *opaque)
                                 request.len, flags);
         if (ret < 0) {
             error_setg_errno(&local_err, -ret, "writing to file failed");
-            reply.error = -ret;
         }
 
         break;
@@ -1438,7 +1422,6 @@ static coroutine_fn void nbd_trip(void *opaque)
         ret = blk_co_flush(exp->blk);
         if (ret < 0) {
             error_setg_errno(&local_err, -ret, "flush failed");
-            reply.error = -ret;
         }
 
         break;
@@ -1447,25 +1430,27 @@ static coroutine_fn void nbd_trip(void *opaque)
                               request.len);
         if (ret < 0) {
             error_setg_errno(&local_err, -ret, "discard failed");
-            reply.error = -ret;
         }
 
         break;
     default:
         error_setg(&local_err, "invalid request type (%" PRIu32 ") received",
                    request.type);
-        reply.error = EINVAL;
+        ret = -EINVAL;
     }
 
 reply:
     if (local_err) {
-        /* If we are here local_err is not fatal error, already stored in
-         * reply.error */
+        /* If we are here local_err is not fatal error, which should be sent
+         * to client. */
         error_report_err(local_err);
         local_err = NULL;
     }
 
-    if (nbd_co_send_reply(req, &reply, reply_data_len, &local_err) < 0) {
+    if (nbd_co_send_simple_reply(req->client, request.handle,
+                                 ret < 0 ? -ret : 0,
+                                 req->data, reply_data_len, &local_err) < 0)
+    {
         error_prepend(&local_err, "Failed to send reply: ");
         goto disconnect;
     }
-- 
2.11.1

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [Qemu-devel] [PATCH v2 5/7] block/nbd-client: nbd_co_send_request: return -EIO if s->quit was set in parallel
  2017-10-09 17:27 [Qemu-devel] [PATCH v2 00/10] nbd minimal structured read Vladimir Sementsov-Ogievskiy
                   ` (7 preceding siblings ...)
  2017-10-09 17:27 ` [Qemu-devel] [PATCH v2 04/10] nbd-server: refactor simple reply sending Vladimir Sementsov-Ogievskiy
@ 2017-10-09 17:27 ` Vladimir Sementsov-Ogievskiy
  2017-10-09 17:27 ` [Qemu-devel] [PATCH v2 05/10] nbd: header constants indenting Vladimir Sementsov-Ogievskiy
                   ` (8 subsequent siblings)
  17 siblings, 0 replies; 31+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2017-10-09 17:27 UTC (permalink / raw)
  To: qemu-block, qemu-devel; +Cc: mreitz, kwolf, pbonzini, eblake, vsementsov, den

It's incorrect to return success rc >= 0 if we skip qio_channel_writev_all()
call due to s->quit.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/nbd-client.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/block/nbd-client.c b/block/nbd-client.c
index f331f08a9a..280147e6a7 100644
--- a/block/nbd-client.c
+++ b/block/nbd-client.c
@@ -189,6 +189,9 @@ static int nbd_co_send_request(BlockDriverState *bs,
     }
 
 err:
+    if (rc >= 0 && s->quit) {
+        rc = -EIO;
+    }
     if (rc < 0) {
         s->quit = true;
         s->requests[i].coroutine = NULL;
-- 
2.11.1

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [Qemu-devel] [PATCH v2 05/10] nbd: header constants indenting
  2017-10-09 17:27 [Qemu-devel] [PATCH v2 00/10] nbd minimal structured read Vladimir Sementsov-Ogievskiy
                   ` (8 preceding siblings ...)
  2017-10-09 17:27 ` [Qemu-devel] [PATCH v2 5/7] block/nbd-client: nbd_co_send_request: return -EIO if s->quit was set in parallel Vladimir Sementsov-Ogievskiy
@ 2017-10-09 17:27 ` Vladimir Sementsov-Ogievskiy
  2017-10-09 19:21   ` Eric Blake
  2017-10-09 17:27 ` [Qemu-devel] [PATCH v2 6/7] block/nbd-client: early fail nbd_read_reply_entry if s->quit is set Vladimir Sementsov-Ogievskiy
                   ` (7 subsequent siblings)
  17 siblings, 1 reply; 31+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2017-10-09 17:27 UTC (permalink / raw)
  To: qemu-block, qemu-devel; +Cc: mreitz, kwolf, pbonzini, eblake, vsementsov, den

Prepare indenting for the following commit.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 include/block/nbd.h | 15 ++++++++-------
 nbd/nbd-internal.h  | 34 +++++++++++++++++-----------------
 2 files changed, 25 insertions(+), 24 deletions(-)

diff --git a/include/block/nbd.h b/include/block/nbd.h
index 49008980f4..a6df5ce8b5 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -71,13 +71,14 @@ typedef struct NBDSimpleReply {
 
 /* Transmission (export) flags: sent from server to client during handshake,
    but describe what will happen during transmission */
-#define NBD_FLAG_HAS_FLAGS      (1 << 0)        /* Flags are there */
-#define NBD_FLAG_READ_ONLY      (1 << 1)        /* Device is read-only */
-#define NBD_FLAG_SEND_FLUSH     (1 << 2)        /* Send FLUSH */
-#define NBD_FLAG_SEND_FUA       (1 << 3)        /* Send FUA (Force Unit Access) */
-#define NBD_FLAG_ROTATIONAL     (1 << 4)        /* Use elevator algorithm - rotational media */
-#define NBD_FLAG_SEND_TRIM      (1 << 5)        /* Send TRIM (discard) */
-#define NBD_FLAG_SEND_WRITE_ZEROES (1 << 6)     /* Send WRITE_ZEROES */
+#define NBD_FLAG_HAS_FLAGS         (1 << 0) /* Flags are there */
+#define NBD_FLAG_READ_ONLY         (1 << 1) /* Device is read-only */
+#define NBD_FLAG_SEND_FLUSH        (1 << 2) /* Send FLUSH */
+#define NBD_FLAG_SEND_FUA          (1 << 3) /* Send FUA (Force Unit Access) */
+#define NBD_FLAG_ROTATIONAL        (1 << 4) /* Use elevator algorithm -
+                                               rotational media */
+#define NBD_FLAG_SEND_TRIM         (1 << 5) /* Send TRIM (discard) */
+#define NBD_FLAG_SEND_WRITE_ZEROES (1 << 6) /* Send WRITE_ZEROES */
 
 /* New-style handshake (global) flags, sent from server to client, and
    control what will happen during handshake phase. */
diff --git a/nbd/nbd-internal.h b/nbd/nbd-internal.h
index 320abef296..d96c9cc7fd 100644
--- a/nbd/nbd-internal.h
+++ b/nbd/nbd-internal.h
@@ -46,23 +46,23 @@
 /* Size of oldstyle negotiation */
 #define NBD_OLDSTYLE_NEGOTIATE_SIZE (8 + 8 + 8 + 4 + 124)
 
-#define NBD_REQUEST_MAGIC       0x25609513
-#define NBD_SIMPLE_REPLY_MAGIC  0x67446698
-#define NBD_OPTS_MAGIC          0x49484156454F5054LL
-#define NBD_CLIENT_MAGIC        0x0000420281861253LL
-#define NBD_REP_MAGIC           0x0003e889045565a9LL
-
-#define NBD_SET_SOCK            _IO(0xab, 0)
-#define NBD_SET_BLKSIZE         _IO(0xab, 1)
-#define NBD_SET_SIZE            _IO(0xab, 2)
-#define NBD_DO_IT               _IO(0xab, 3)
-#define NBD_CLEAR_SOCK          _IO(0xab, 4)
-#define NBD_CLEAR_QUE           _IO(0xab, 5)
-#define NBD_PRINT_DEBUG         _IO(0xab, 6)
-#define NBD_SET_SIZE_BLOCKS     _IO(0xab, 7)
-#define NBD_DISCONNECT          _IO(0xab, 8)
-#define NBD_SET_TIMEOUT         _IO(0xab, 9)
-#define NBD_SET_FLAGS           _IO(0xab, 10)
+#define NBD_REQUEST_MAGIC           0x25609513
+#define NBD_SIMPLE_REPLY_MAGIC      0x67446698
+#define NBD_OPTS_MAGIC              0x49484156454F5054LL
+#define NBD_CLIENT_MAGIC            0x0000420281861253LL
+#define NBD_REP_MAGIC               0x0003e889045565a9LL
+
+#define NBD_SET_SOCK                _IO(0xab, 0)
+#define NBD_SET_BLKSIZE             _IO(0xab, 1)
+#define NBD_SET_SIZE                _IO(0xab, 2)
+#define NBD_DO_IT                   _IO(0xab, 3)
+#define NBD_CLEAR_SOCK              _IO(0xab, 4)
+#define NBD_CLEAR_QUE               _IO(0xab, 5)
+#define NBD_PRINT_DEBUG             _IO(0xab, 6)
+#define NBD_SET_SIZE_BLOCKS         _IO(0xab, 7)
+#define NBD_DISCONNECT              _IO(0xab, 8)
+#define NBD_SET_TIMEOUT             _IO(0xab, 9)
+#define NBD_SET_FLAGS               _IO(0xab, 10)
 
 /* NBD errors are based on errno numbers, so there is a 1:1 mapping,
  * but only a limited set of errno values is specified in the protocol.
-- 
2.11.1

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [Qemu-devel] [PATCH v2 6/7] block/nbd-client: early fail nbd_read_reply_entry if s->quit is set
  2017-10-09 17:27 [Qemu-devel] [PATCH v2 00/10] nbd minimal structured read Vladimir Sementsov-Ogievskiy
                   ` (9 preceding siblings ...)
  2017-10-09 17:27 ` [Qemu-devel] [PATCH v2 05/10] nbd: header constants indenting Vladimir Sementsov-Ogievskiy
@ 2017-10-09 17:27 ` Vladimir Sementsov-Ogievskiy
  2017-10-09 17:27 ` [Qemu-devel] [PATCH v2 06/10] nbd: Minimal structured read for server Vladimir Sementsov-Ogievskiy
                   ` (6 subsequent siblings)
  17 siblings, 0 replies; 31+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2017-10-09 17:27 UTC (permalink / raw)
  To: qemu-block, qemu-devel; +Cc: mreitz, kwolf, pbonzini, eblake, vsementsov, den

Do not continue any operation if s->quit is set in parallel.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/nbd-client.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/block/nbd-client.c b/block/nbd-client.c
index 280147e6a7..f80a4c5564 100644
--- a/block/nbd-client.c
+++ b/block/nbd-client.c
@@ -81,7 +81,7 @@ static coroutine_fn void nbd_read_reply_entry(void *opaque)
         if (ret < 0) {
             error_report_err(local_err);
         }
-        if (ret <= 0) {
+        if (ret <= 0 || s->quit) {
             break;
         }
 
@@ -105,9 +105,8 @@ static coroutine_fn void nbd_read_reply_entry(void *opaque)
             assert(qiov != NULL);
             assert(s->requests[i].request->len ==
                    iov_size(qiov->iov, qiov->niov));
-            if (qio_channel_readv_all(s->ioc, qiov->iov, qiov->niov,
-                                      NULL) < 0)
-            {
+            ret = qio_channel_readv_all(s->ioc, qiov->iov, qiov->niov, NULL);
+            if (ret < 0 || s->quit) {
                 s->requests[i].ret = -EIO;
                 break;
             }
-- 
2.11.1

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [Qemu-devel] [PATCH v2 06/10] nbd: Minimal structured read for server
  2017-10-09 17:27 [Qemu-devel] [PATCH v2 00/10] nbd minimal structured read Vladimir Sementsov-Ogievskiy
                   ` (10 preceding siblings ...)
  2017-10-09 17:27 ` [Qemu-devel] [PATCH v2 6/7] block/nbd-client: early fail nbd_read_reply_entry if s->quit is set Vladimir Sementsov-Ogievskiy
@ 2017-10-09 17:27 ` Vladimir Sementsov-Ogievskiy
  2017-10-09 17:27 ` [Qemu-devel] [PATCH v2 7/7] block/nbd-client: do not yield from nbd_read_reply_entry Vladimir Sementsov-Ogievskiy
                   ` (5 subsequent siblings)
  17 siblings, 0 replies; 31+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2017-10-09 17:27 UTC (permalink / raw)
  To: qemu-block, qemu-devel; +Cc: mreitz, kwolf, pbonzini, eblake, vsementsov, den

Minimal implementation of structured read: one structured reply chunk,
no segmentation.
Minimal structured error implementation: no text message.
Support DF flag, but just ignore it, as there is no segmentation any
way.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 include/block/nbd.h |  33 +++++++++++++++++
 nbd/nbd-internal.h  |   1 +
 nbd/server.c        | 100 ++++++++++++++++++++++++++++++++++++++++++++++++----
 3 files changed, 128 insertions(+), 6 deletions(-)

diff --git a/include/block/nbd.h b/include/block/nbd.h
index a6df5ce8b5..dd261f66f0 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -69,6 +69,25 @@ typedef struct NBDSimpleReply {
     uint64_t handle;
 } QEMU_PACKED NBDSimpleReply;
 
+typedef struct NBDStructuredReplyChunk {
+    uint32_t magic;  /* NBD_STRUCTURED_REPLY_MAGIC */
+    uint16_t flags;  /* combination of NBD_SREP_FLAG_* */
+    uint16_t type;   /* NBD_SREP_TYPE_* */
+    uint64_t handle; /* request handle */
+    uint32_t length; /* length of payload */
+} QEMU_PACKED NBDStructuredReplyChunk;
+
+typedef struct NBDStructuredRead {
+    NBDStructuredReplyChunk h;
+    uint64_t offset;
+} QEMU_PACKED NBDStructuredRead;
+
+typedef struct NBDStructuredError {
+    NBDStructuredReplyChunk h;
+    uint32_t error;
+    uint16_t message_length;
+} QEMU_PACKED NBDStructuredError;
+
 /* Transmission (export) flags: sent from server to client during handshake,
    but describe what will happen during transmission */
 #define NBD_FLAG_HAS_FLAGS         (1 << 0) /* Flags are there */
@@ -79,6 +98,7 @@ typedef struct NBDSimpleReply {
                                                rotational media */
 #define NBD_FLAG_SEND_TRIM         (1 << 5) /* Send TRIM (discard) */
 #define NBD_FLAG_SEND_WRITE_ZEROES (1 << 6) /* Send WRITE_ZEROES */
+#define NBD_FLAG_SEND_DF           (1 << 7) /* Send DF (Do not Fragment) */
 
 /* New-style handshake (global) flags, sent from server to client, and
    control what will happen during handshake phase. */
@@ -125,6 +145,7 @@ typedef struct NBDSimpleReply {
 /* Request flags, sent from client to server during transmission phase */
 #define NBD_CMD_FLAG_FUA        (1 << 0) /* 'force unit access' during write */
 #define NBD_CMD_FLAG_NO_HOLE    (1 << 1) /* don't punch hole on zero run */
+#define NBD_CMD_FLAG_DF         (1 << 2) /* don't fragment structured read */
 
 /* Supported request types */
 enum {
@@ -149,6 +170,18 @@ enum {
  * aren't overflowing some other buffer. */
 #define NBD_MAX_NAME_SIZE 256
 
+/* Structured reply flags */
+#define NBD_SREP_FLAG_DONE          (1 << 0) /* This reply-chunk is last */
+
+/* Structured reply types */
+#define NBD_SREP_ERR(value)         ((1 << 15) | (value))
+
+#define NBD_SREP_TYPE_NONE          0
+#define NBD_SREP_TYPE_OFFSET_DATA   1
+#define NBD_SREP_TYPE_OFFSET_HOLE   2
+#define NBD_SREP_TYPE_ERROR         NBD_SREP_ERR(1)
+#define NBD_SREP_TYPE_ERROR_OFFSET  NBD_SREP_ERR(2)
+
 /* Details collected by NBD_OPT_EXPORT_NAME and NBD_OPT_GO */
 struct NBDExportInfo {
     /* Set by client before nbd_receive_negotiate() */
diff --git a/nbd/nbd-internal.h b/nbd/nbd-internal.h
index d96c9cc7fd..6b0d1183ba 100644
--- a/nbd/nbd-internal.h
+++ b/nbd/nbd-internal.h
@@ -48,6 +48,7 @@
 
 #define NBD_REQUEST_MAGIC           0x25609513
 #define NBD_SIMPLE_REPLY_MAGIC      0x67446698
+#define NBD_STRUCTURED_REPLY_MAGIC  0x668e33ef
 #define NBD_OPTS_MAGIC              0x49484156454F5054LL
 #define NBD_CLIENT_MAGIC            0x0000420281861253LL
 #define NBD_REP_MAGIC               0x0003e889045565a9LL
diff --git a/nbd/server.c b/nbd/server.c
index 57d5598e0f..0af94a293d 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -98,6 +98,8 @@ struct NBDClient {
     QTAILQ_ENTRY(NBDClient) next;
     int nb_requests;
     bool closing;
+
+    bool structured_reply;
 };
 
 /* That's all folks */
@@ -760,6 +762,20 @@ static int nbd_negotiate_options(NBDClient *client, uint16_t myflags,
                     return ret;
                 }
                 break;
+
+            case NBD_OPT_STRUCTURED_REPLY:
+                if (client->structured_reply) {
+                    error_setg(errp, "Double negotiation of structured reply");
+                    return -EINVAL;
+                }
+                ret = nbd_negotiate_send_rep(client->ioc, NBD_REP_ACK, option,
+                                             errp);
+                if (ret < 0) {
+                    return ret;
+                }
+                client->structured_reply = true;
+                break;
+
             default:
                 if (nbd_drop(client->ioc, length, errp) < 0) {
                     return -EIO;
@@ -1233,6 +1249,61 @@ static int nbd_co_send_simple_reply(NBDClient *client,
     return nbd_co_send_iov(client, iov, size ? 2 : 1, errp);
 }
 
+static inline void set_be_chunk(NBDStructuredReplyChunk *chunk, uint16_t flags,
+                                uint16_t type, uint64_t handle, uint32_t length)
+{
+    stl_be_p(&chunk->magic, NBD_STRUCTURED_REPLY_MAGIC);
+    stw_be_p(&chunk->flags, flags);
+    stw_be_p(&chunk->type, type);
+    stq_be_p(&chunk->handle, handle);
+    stl_be_p(&chunk->length, length);
+}
+
+static inline int coroutine_fn nbd_co_send_buf(NBDClient *client, void *buf,
+                                               size_t size, Error **errp)
+{
+    struct iovec iov[] = {
+        {.iov_base = buf, .iov_len = size}
+    };
+
+    return nbd_co_send_iov(client, iov, 1, errp);
+}
+
+static int coroutine_fn nbd_co_send_structured_read(NBDClient *client,
+                                                    uint64_t handle,
+                                                    uint64_t offset,
+                                                    void *data,
+                                                    size_t size,
+                                                    Error **errp)
+{
+    NBDStructuredRead chunk;
+    struct iovec iov[] = {
+        {.iov_base = &chunk, .iov_len = sizeof(chunk)},
+        {.iov_base = data, .iov_len = size}
+    };
+
+    set_be_chunk(&chunk.h, NBD_SREP_FLAG_DONE, NBD_SREP_TYPE_OFFSET_DATA,
+                 handle, sizeof(chunk) - sizeof(chunk.h) + size);
+    stq_be_p(&chunk.offset, offset);
+
+    return nbd_co_send_iov(client, iov, 2, errp);
+}
+
+static int coroutine_fn nbd_co_send_structured_error(NBDClient *client,
+                                                     uint64_t handle,
+                                                     uint32_t error,
+                                                     Error **errp)
+{
+    NBDStructuredError chunk;
+
+    set_be_chunk(&chunk.h, NBD_SREP_FLAG_DONE, NBD_SREP_TYPE_ERROR, handle,
+                 sizeof(chunk) - sizeof(chunk.h));
+    stl_be_p(&chunk.error, error);
+    stw_be_p(&chunk.message_length, 0);
+
+    return nbd_co_send_buf(client, &chunk, sizeof(chunk), errp);
+}
+
 /* nbd_co_receive_request
  * Collect a client request. Return 0 if request looks valid, -EIO to drop
  * connection right away, and any other negative value to report an error to
@@ -1304,10 +1375,17 @@ static int nbd_co_receive_request(NBDRequestData *req, NBDRequest *request,
                    (uint64_t)client->exp->size);
         return request->type == NBD_CMD_WRITE ? -ENOSPC : -EINVAL;
     }
-    if (request->flags & ~(NBD_CMD_FLAG_FUA | NBD_CMD_FLAG_NO_HOLE)) {
+    if (request->flags & ~(NBD_CMD_FLAG_FUA | NBD_CMD_FLAG_NO_HOLE |
+                           NBD_CMD_FLAG_DF))
+    {
         error_setg(errp, "unsupported flags (got 0x%x)", request->flags);
         return -EINVAL;
     }
+    if (request->type != NBD_CMD_READ && (request->flags & NBD_CMD_FLAG_DF)) {
+        error_setg(errp, "DF flag used with command %d (%s) which is not READ",
+                   request->type, nbd_cmd_lookup(request->type));
+        return -EINVAL;
+    }
     if (request->type != NBD_CMD_WRITE_ZEROES &&
         (request->flags & NBD_CMD_FLAG_NO_HOLE)) {
         error_setg(errp, "unexpected flags (got 0x%x)", request->flags);
@@ -1374,7 +1452,6 @@ static coroutine_fn void nbd_trip(void *opaque)
         }
 
         reply_data_len = request.len;
-
         break;
     case NBD_CMD_WRITE:
         if (exp->nbdflags & NBD_FLAG_READ_ONLY) {
@@ -1447,10 +1524,21 @@ reply:
         local_err = NULL;
     }
 
-    if (nbd_co_send_simple_reply(req->client, request.handle,
-                                 ret < 0 ? -ret : 0,
-                                 req->data, reply_data_len, &local_err) < 0)
-    {
+    if (client->structured_reply && request.type == NBD_CMD_READ) {
+        if (ret < 0) {
+            ret = nbd_co_send_structured_error(req->client, request.handle,
+                                               -ret, &local_err);
+        } else {
+            ret = nbd_co_send_structured_read(req->client, request.handle,
+                                              request.from, req->data,
+                                              reply_data_len, &local_err);
+        }
+    } else {
+        ret = nbd_co_send_simple_reply(req->client, request.handle,
+                                       ret < 0 ? -ret : 0,
+                                       req->data, reply_data_len, &local_err);
+    }
+    if (ret < 0) {
         error_prepend(&local_err, "Failed to send reply: ");
         goto disconnect;
     }
-- 
2.11.1

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [Qemu-devel] [PATCH v2 7/7] block/nbd-client: do not yield from nbd_read_reply_entry
  2017-10-09 17:27 [Qemu-devel] [PATCH v2 00/10] nbd minimal structured read Vladimir Sementsov-Ogievskiy
                   ` (11 preceding siblings ...)
  2017-10-09 17:27 ` [Qemu-devel] [PATCH v2 06/10] nbd: Minimal structured read for server Vladimir Sementsov-Ogievskiy
@ 2017-10-09 17:27 ` Vladimir Sementsov-Ogievskiy
  2017-10-09 17:27 ` [Qemu-devel] [PATCH v2 07/10] nbd/client: refactor nbd_receive_starttls Vladimir Sementsov-Ogievskiy
                   ` (4 subsequent siblings)
  17 siblings, 0 replies; 31+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2017-10-09 17:27 UTC (permalink / raw)
  To: qemu-block, qemu-devel; +Cc: mreitz, kwolf, pbonzini, eblake, vsementsov, den

Refactor nbd client to not yield from nbd_read_reply_entry. It's
possible now as all reading is done in nbd_read_reply_entry and
all request-related data is stored in per-request s->requests[i].

We need here some additional work with s->requests[i].ret and
s->quit to not fail requests on normal disconnet and to not report
reading errors on normal disconnect.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/nbd-client.c | 29 ++++++++++-------------------
 1 file changed, 10 insertions(+), 19 deletions(-)

diff --git a/block/nbd-client.c b/block/nbd-client.c
index f80a4c5564..bf20d5d5e6 100644
--- a/block/nbd-client.c
+++ b/block/nbd-client.c
@@ -79,7 +79,11 @@ static coroutine_fn void nbd_read_reply_entry(void *opaque)
     while (!s->quit) {
         ret = nbd_receive_reply(s->ioc, &reply, &local_err);
         if (ret < 0) {
-            error_report_err(local_err);
+            if (s->quit) {
+                error_free(local_err);
+            } else {
+                error_report_err(local_err);
+            }
         }
         if (ret <= 0 || s->quit) {
             break;
@@ -112,19 +116,8 @@ static coroutine_fn void nbd_read_reply_entry(void *opaque)
             }
         }
 
-        /* We're woken up again by the request itself.  Note that there
-         * is no race between yielding and reentering read_reply_co.  This
-         * is because:
-         *
-         * - if the request runs on the same AioContext, it is only
-         *   entered after we yield
-         *
-         * - if the request runs on a different AioContext, reentering
-         *   read_reply_co happens through a bottom half, which can only
-         *   run after we yield.
-         */
+        s->requests[i].receiving = false;
         aio_co_wake(s->requests[i].coroutine);
-        qemu_coroutine_yield();
     }
 
     s->quit = true;
@@ -157,6 +150,7 @@ static int nbd_co_send_request(BlockDriverState *bs,
 
     s->requests[i].coroutine = qemu_coroutine_self();
     s->requests[i].receiving = false;
+    s->requests[i].ret = -EIO;
 
     request->handle = INDEX_TO_HANDLE(s, i);
     s->requests[i].request = request;
@@ -210,7 +204,7 @@ static int nbd_co_receive_reply(NBDClientSession *s, NBDRequest *request)
     s->requests[i].receiving = true;
     qemu_coroutine_yield();
     s->requests[i].receiving = false;
-    if (!s->ioc || s->quit) {
+    if (!s->ioc) {
         ret = -EIO;
     } else {
         ret = s->requests[i].ret;
@@ -218,11 +212,6 @@ static int nbd_co_receive_reply(NBDClientSession *s, NBDRequest *request)
 
     s->requests[i].coroutine = NULL;
 
-    /* Kick the read_reply_co to get the next reply.  */
-    if (s->read_reply_co) {
-        aio_co_wake(s->read_reply_co);
-    }
-
     qemu_co_mutex_lock(&s->send_mutex);
     s->in_flight--;
     qemu_co_queue_next(&s->free_sema);
@@ -364,6 +353,8 @@ void nbd_client_close(BlockDriverState *bs)
 
     nbd_send_request(client->ioc, &request);
 
+    client->quit = true;
+
     nbd_teardown_connection(bs);
 }
 
-- 
2.11.1

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [Qemu-devel] [PATCH v2 07/10] nbd/client: refactor nbd_receive_starttls
  2017-10-09 17:27 [Qemu-devel] [PATCH v2 00/10] nbd minimal structured read Vladimir Sementsov-Ogievskiy
                   ` (12 preceding siblings ...)
  2017-10-09 17:27 ` [Qemu-devel] [PATCH v2 7/7] block/nbd-client: do not yield from nbd_read_reply_entry Vladimir Sementsov-Ogievskiy
@ 2017-10-09 17:27 ` Vladimir Sementsov-Ogievskiy
  2017-10-09 17:27 ` [Qemu-devel] [PATCH v2 08/10] nbd: share some nbd entities to be reused in block/nbd-client.c Vladimir Sementsov-Ogievskiy
                   ` (3 subsequent siblings)
  17 siblings, 0 replies; 31+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2017-10-09 17:27 UTC (permalink / raw)
  To: qemu-block, qemu-devel; +Cc: mreitz, kwolf, pbonzini, eblake, vsementsov, den

Split out nbd_receive_simple_option to be reused for structured reply
option.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 nbd/client.c     | 64 ++++++++++++++++++++++++++++++++++++++++----------------
 nbd/trace-events |  7 ++++---
 2 files changed, 50 insertions(+), 21 deletions(-)

diff --git a/nbd/client.c b/nbd/client.c
index cd5a2c80ac..c8702a80b1 100644
--- a/nbd/client.c
+++ b/nbd/client.c
@@ -540,35 +540,63 @@ static int nbd_receive_query_exports(QIOChannel *ioc,
     }
 }
 
-static QIOChannel *nbd_receive_starttls(QIOChannel *ioc,
-                                        QCryptoTLSCreds *tlscreds,
-                                        const char *hostname, Error **errp)
+/* nbd_request_simple_option
+ * return 1 for successful negotiation,
+ *        0 if operation is unsupported,
+ *        -1 with errp set for any other error
+ */
+static int nbd_request_simple_option(QIOChannel *ioc, int opt, Error **errp)
 {
     nbd_opt_reply reply;
-    QIOChannelTLS *tioc;
-    struct NBDTLSHandshakeData data = { 0 };
 
-    trace_nbd_receive_starttls_request();
-    if (nbd_send_option_request(ioc, NBD_OPT_STARTTLS, 0, NULL, errp) < 0) {
-        return NULL;
+    trace_nbd_receive_simple_option_request(opt, nbd_opt_lookup(opt));
+    if (nbd_send_option_request(ioc, opt, 0, NULL, errp) < 0) {
+        return -1;
     }
 
-    trace_nbd_receive_starttls_reply();
-    if (nbd_receive_option_reply(ioc, NBD_OPT_STARTTLS, &reply, errp) < 0) {
-        return NULL;
+    trace_nbd_receive_simple_option_reply(opt, nbd_opt_lookup(opt));
+    if (nbd_receive_option_reply(ioc, opt, &reply, errp) < 0) {
+        return -1;
     }
 
-    if (reply.type != NBD_REP_ACK) {
-        error_setg(errp, "Server rejected request to start TLS %" PRIx32,
-                   reply.type);
+    if (reply.length != 0) {
+        error_setg(errp, "Option %d ('%s') response length is %" PRIu32
+                   " (it should be zero)", opt, nbd_opt_lookup(opt),
+                   reply.length);
         nbd_send_opt_abort(ioc);
-        return NULL;
+        return -1;
     }
 
-    if (reply.length != 0) {
-        error_setg(errp, "Start TLS response was not zero %" PRIu32,
-                   reply.length);
+    if (reply.type == NBD_REP_ERR_UNSUP) {
+        return 0;
+    }
+
+    if (reply.type != NBD_REP_ACK) {
+        error_setg(errp, "Server rejected request for option %d (%s) "
+                   "with reply %" PRIx32 " (%s)", opt, nbd_opt_lookup(opt),
+                   reply.type, nbd_rep_lookup(reply.type));
         nbd_send_opt_abort(ioc);
+        return -1;
+    }
+
+    trace_nbd_receive_simple_option_approved(opt, nbd_opt_lookup(opt));
+    return 1;
+}
+
+static QIOChannel *nbd_receive_starttls(QIOChannel *ioc,
+                                        QCryptoTLSCreds *tlscreds,
+                                        const char *hostname, Error **errp)
+{
+    int ret;
+    QIOChannelTLS *tioc;
+    struct NBDTLSHandshakeData data = { 0 };
+
+    ret = nbd_request_simple_option(ioc, NBD_OPT_STARTTLS, errp);
+    if (ret <= 0) {
+        if (ret == 0) {
+            error_setg(errp, "Server don't support STARTTLS option");
+            nbd_send_opt_abort(ioc);
+        }
         return NULL;
     }
 
diff --git a/nbd/trace-events b/nbd/trace-events
index 48a4f27682..ea44e6963f 100644
--- a/nbd/trace-events
+++ b/nbd/trace-events
@@ -9,9 +9,10 @@ nbd_opt_go_info_unknown(int info, const char *name) "Ignoring unknown info %d (%
 nbd_opt_go_info_block_size(uint32_t minimum, uint32_t preferred, uint32_t maximum) "Block sizes are 0x%" PRIx32 ", 0x%" PRIx32 ", 0x%" PRIx32
 nbd_receive_query_exports_start(const char *wantname) "Querying export list for '%s'"
 nbd_receive_query_exports_success(const char *wantname) "Found desired export name '%s'"
-nbd_receive_starttls_request(void) "Requesting TLS from server"
-nbd_receive_starttls_reply(void) "Getting TLS reply from server"
-nbd_receive_starttls_new_client(void) "TLS request approved, setting up TLS"
+nbd_receive_simple_option_request(int opt, const char *name) "Requesting option %d (%s) from server"
+nbd_receive_simple_option_reply(int opt, const char *name) "Getting reply for option %d (%s) from server"
+nbd_receive_simple_option_approved(int opt, const char *name) "Option %d (%s) approved"
+nbd_receive_starttls_new_client(void) "Setting up TLS"
 nbd_receive_starttls_tls_handshake(void) "Starting TLS handshake"
 nbd_receive_negotiate(void *tlscreds, const char *hostname) "Receiving negotiation tlscreds=%p hostname=%s"
 nbd_receive_negotiate_magic(uint64_t magic) "Magic is 0x%" PRIx64
-- 
2.11.1

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [Qemu-devel] [PATCH v2 08/10] nbd: share some nbd entities to be reused in block/nbd-client.c
  2017-10-09 17:27 [Qemu-devel] [PATCH v2 00/10] nbd minimal structured read Vladimir Sementsov-Ogievskiy
                   ` (13 preceding siblings ...)
  2017-10-09 17:27 ` [Qemu-devel] [PATCH v2 07/10] nbd/client: refactor nbd_receive_starttls Vladimir Sementsov-Ogievskiy
@ 2017-10-09 17:27 ` Vladimir Sementsov-Ogievskiy
  2017-10-09 17:27 ` [Qemu-devel] [PATCH v2 09/10] nbd/client: prepare nbd_receive_reply for structured reply Vladimir Sementsov-Ogievskiy
                   ` (2 subsequent siblings)
  17 siblings, 0 replies; 31+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2017-10-09 17:27 UTC (permalink / raw)
  To: qemu-block, qemu-devel; +Cc: mreitz, kwolf, pbonzini, eblake, vsementsov, den

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 include/block/nbd.h | 48 ++++++++++++++++++++++++++++++++++++++++++++++++
 nbd/nbd-internal.h  | 25 -------------------------
 nbd/client.c        | 32 --------------------------------
 3 files changed, 48 insertions(+), 57 deletions(-)

diff --git a/include/block/nbd.h b/include/block/nbd.h
index dd261f66f0..09e4592971 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -77,6 +77,9 @@ typedef struct NBDStructuredReplyChunk {
     uint32_t length; /* length of payload */
 } QEMU_PACKED NBDStructuredReplyChunk;
 
+#define NBD_SIMPLE_REPLY_MAGIC      0x67446698
+#define NBD_STRUCTURED_REPLY_MAGIC  0x668e33ef
+
 typedef struct NBDStructuredRead {
     NBDStructuredReplyChunk h;
     uint64_t offset;
@@ -182,6 +185,40 @@ enum {
 #define NBD_SREP_TYPE_ERROR         NBD_SREP_ERR(1)
 #define NBD_SREP_TYPE_ERROR_OFFSET  NBD_SREP_ERR(2)
 
+/* NBD errors are based on errno numbers, so there is a 1:1 mapping,
+ * but only a limited set of errno values is specified in the protocol.
+ * Everything else is squashed to EINVAL.
+ */
+#define NBD_SUCCESS    0
+#define NBD_EPERM      1
+#define NBD_EIO        5
+#define NBD_ENOMEM     12
+#define NBD_EINVAL     22
+#define NBD_ENOSPC     28
+#define NBD_ESHUTDOWN  108
+
+static inline int nbd_errno_to_system_errno(int err)
+{
+    switch (err) {
+    case NBD_SUCCESS:
+        return 0;
+    case NBD_EPERM:
+        return EPERM;
+    case NBD_EIO:
+        return EIO;
+    case NBD_ENOMEM:
+        return ENOMEM;
+    case NBD_ENOSPC:
+        return ENOSPC;
+    case NBD_ESHUTDOWN:
+        return ESHUTDOWN;
+    case NBD_EINVAL:
+        return EINVAL;
+    }
+
+    return EINVAL;
+}
+
 /* Details collected by NBD_OPT_EXPORT_NAME and NBD_OPT_GO */
 struct NBDExportInfo {
     /* Set by client before nbd_receive_negotiate() */
@@ -235,4 +272,15 @@ void nbd_client_put(NBDClient *client);
 void nbd_server_start(SocketAddress *addr, const char *tls_creds,
                       Error **errp);
 
+/* nbd_read
+ * Reads @size bytes from @ioc. Returns 0 on success.
+ */
+static inline int nbd_read(QIOChannel *ioc, void *buffer, size_t size,
+                           Error **errp)
+{
+    return qio_channel_read_all(ioc, buffer, size, errp) < 0 ? -EIO : 0;
+}
+
+int nbd_drop(QIOChannel *ioc, size_t size, Error **errp);
+
 #endif
diff --git a/nbd/nbd-internal.h b/nbd/nbd-internal.h
index 6b0d1183ba..9f7b6b68e8 100644
--- a/nbd/nbd-internal.h
+++ b/nbd/nbd-internal.h
@@ -47,8 +47,6 @@
 #define NBD_OLDSTYLE_NEGOTIATE_SIZE (8 + 8 + 8 + 4 + 124)
 
 #define NBD_REQUEST_MAGIC           0x25609513
-#define NBD_SIMPLE_REPLY_MAGIC      0x67446698
-#define NBD_STRUCTURED_REPLY_MAGIC  0x668e33ef
 #define NBD_OPTS_MAGIC              0x49484156454F5054LL
 #define NBD_CLIENT_MAGIC            0x0000420281861253LL
 #define NBD_REP_MAGIC               0x0003e889045565a9LL
@@ -65,18 +63,6 @@
 #define NBD_SET_TIMEOUT             _IO(0xab, 9)
 #define NBD_SET_FLAGS               _IO(0xab, 10)
 
-/* NBD errors are based on errno numbers, so there is a 1:1 mapping,
- * but only a limited set of errno values is specified in the protocol.
- * Everything else is squashed to EINVAL.
- */
-#define NBD_SUCCESS    0
-#define NBD_EPERM      1
-#define NBD_EIO        5
-#define NBD_ENOMEM     12
-#define NBD_EINVAL     22
-#define NBD_ENOSPC     28
-#define NBD_ESHUTDOWN  108
-
 /* nbd_read_eof
  * Tries to read @size bytes from @ioc.
  * Returns 1 on success
@@ -96,15 +82,6 @@ static inline int nbd_read_eof(QIOChannel *ioc, void *buffer, size_t size,
     return ret;
 }
 
-/* nbd_read
- * Reads @size bytes from @ioc. Returns 0 on success.
- */
-static inline int nbd_read(QIOChannel *ioc, void *buffer, size_t size,
-                           Error **errp)
-{
-    return qio_channel_read_all(ioc, buffer, size, errp) < 0 ? -EIO : 0;
-}
-
 /* nbd_write
  * Writes @size bytes to @ioc. Returns 0 on success.
  */
@@ -137,6 +114,4 @@ const char *nbd_rep_lookup(uint32_t rep);
 const char *nbd_info_lookup(uint16_t info);
 const char *nbd_cmd_lookup(uint16_t info);
 
-int nbd_drop(QIOChannel *ioc, size_t size, Error **errp);
-
 #endif
diff --git a/nbd/client.c b/nbd/client.c
index c8702a80b1..f0f3075569 100644
--- a/nbd/client.c
+++ b/nbd/client.c
@@ -22,38 +22,6 @@
 #include "trace.h"
 #include "nbd-internal.h"
 
-static int nbd_errno_to_system_errno(int err)
-{
-    int ret;
-    switch (err) {
-    case NBD_SUCCESS:
-        ret = 0;
-        break;
-    case NBD_EPERM:
-        ret = EPERM;
-        break;
-    case NBD_EIO:
-        ret = EIO;
-        break;
-    case NBD_ENOMEM:
-        ret = ENOMEM;
-        break;
-    case NBD_ENOSPC:
-        ret = ENOSPC;
-        break;
-    case NBD_ESHUTDOWN:
-        ret = ESHUTDOWN;
-        break;
-    default:
-        trace_nbd_unknown_error(err);
-        /* fallthrough */
-    case NBD_EINVAL:
-        ret = EINVAL;
-        break;
-    }
-    return ret;
-}
-
 /* Definitions for opaque data types */
 
 static QTAILQ_HEAD(, NBDExport) exports = QTAILQ_HEAD_INITIALIZER(exports);
-- 
2.11.1

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [Qemu-devel] [PATCH v2 09/10] nbd/client: prepare nbd_receive_reply for structured reply
  2017-10-09 17:27 [Qemu-devel] [PATCH v2 00/10] nbd minimal structured read Vladimir Sementsov-Ogievskiy
                   ` (14 preceding siblings ...)
  2017-10-09 17:27 ` [Qemu-devel] [PATCH v2 08/10] nbd: share some nbd entities to be reused in block/nbd-client.c Vladimir Sementsov-Ogievskiy
@ 2017-10-09 17:27 ` Vladimir Sementsov-Ogievskiy
  2017-10-09 17:27 ` [Qemu-devel] [PATCH v2 10/10] nbd: Minimal structured read for client Vladimir Sementsov-Ogievskiy
  2017-10-09 17:33 ` [Qemu-devel] [PATCH v2 00/10] nbd minimal structured read Vladimir Sementsov-Ogievskiy
  17 siblings, 0 replies; 31+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2017-10-09 17:27 UTC (permalink / raw)
  To: qemu-block, qemu-devel; +Cc: mreitz, kwolf, pbonzini, eblake, vsementsov, den

In following patch nbd_receive_reply will be used both for simple
and structured reply header receiving.
NBDReply is altered into union of simple reply header and structured
reply chunk header, simple error translation moved to block/nbd-client
to be consistent with further structured reply error translation.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 include/block/nbd.h |  30 ++++++++++++----
 block/nbd-client.c  |   8 +++--
 nbd/client.c        | 102 +++++++++++++++++++++++++++++++++++++++++-----------
 nbd/trace-events    |   3 +-
 4 files changed, 113 insertions(+), 30 deletions(-)

diff --git a/include/block/nbd.h b/include/block/nbd.h
index 09e4592971..1ef8c8897f 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -57,12 +57,6 @@ struct NBDRequest {
 };
 typedef struct NBDRequest NBDRequest;
 
-struct NBDReply {
-    uint64_t handle;
-    uint32_t error;
-};
-typedef struct NBDReply NBDReply;
-
 typedef struct NBDSimpleReply {
     uint32_t magic;  /* NBD_SIMPLE_REPLY_MAGIC */
     uint32_t error;
@@ -77,9 +71,33 @@ typedef struct NBDStructuredReplyChunk {
     uint32_t length; /* length of payload */
 } QEMU_PACKED NBDStructuredReplyChunk;
 
+typedef union NBDReply {
+    NBDSimpleReply simple;
+    NBDStructuredReplyChunk structured;
+    struct {
+        /* @magic and @handle fields have the same offset and size both in
+         * simple reply and structured reply chunk, so let them be accessible
+         * without ".simple." or ".structured." specification
+         */
+        uint32_t magic;
+        uint32_t _skip;
+        uint64_t handle;
+    } QEMU_PACKED;
+} NBDReply;
+
 #define NBD_SIMPLE_REPLY_MAGIC      0x67446698
 #define NBD_STRUCTURED_REPLY_MAGIC  0x668e33ef
 
+static inline bool nbd_reply_is_simple(NBDReply *reply)
+{
+    return reply->magic == NBD_SIMPLE_REPLY_MAGIC;
+}
+
+static inline bool nbd_reply_is_structured(NBDReply *reply)
+{
+    return reply->magic == NBD_STRUCTURED_REPLY_MAGIC;
+}
+
 typedef struct NBDStructuredRead {
     NBDStructuredReplyChunk h;
     uint64_t offset;
diff --git a/block/nbd-client.c b/block/nbd-client.c
index c0683c3c83..58493b7ac4 100644
--- a/block/nbd-client.c
+++ b/block/nbd-client.c
@@ -92,7 +92,9 @@ static coroutine_fn void nbd_read_reply_entry(void *opaque)
         i = HANDLE_TO_INDEX(s, s->reply.handle);
         if (i >= MAX_NBD_REQUESTS ||
             !s->requests[i].coroutine ||
-            !s->requests[i].receiving) {
+            !s->requests[i].receiving ||
+            nbd_reply_is_structured(&s->reply))
+        {
             break;
         }
 
@@ -194,8 +196,8 @@ static int nbd_co_receive_reply(NBDClientSession *s,
         ret = -EIO;
     } else {
         assert(s->reply.handle == handle);
-        ret = -s->reply.error;
-        if (qiov && s->reply.error == 0) {
+        ret = -nbd_errno_to_system_errno(s->reply.simple.error);
+        if (qiov && ret == 0) {
             if (qio_channel_readv_all(s->ioc, qiov->iov, qiov->niov,
                                       NULL) < 0) {
                 ret = -EIO;
diff --git a/nbd/client.c b/nbd/client.c
index f0f3075569..a38e1a7d8e 100644
--- a/nbd/client.c
+++ b/nbd/client.c
@@ -910,6 +910,57 @@ int nbd_send_request(QIOChannel *ioc, NBDRequest *request)
     return nbd_write(ioc, buf, sizeof(buf), NULL);
 }
 
+/* nbd_receive_simple_reply
+ * Read simple reply except magic field (which should be already read).
+ * Payload is not read (payload is possible for CMD_READ, but here we even
+ * don't know whether it take place or not).
+ */
+static int nbd_receive_simple_reply(QIOChannel *ioc, NBDSimpleReply *reply,
+                                    Error **errp)
+{
+    int ret;
+
+    assert(reply->magic == NBD_SIMPLE_REPLY_MAGIC);
+
+    ret = nbd_read(ioc, (uint8_t *)reply + sizeof(reply->magic),
+                   sizeof(*reply) - sizeof(reply->magic), errp);
+    if (ret < 0) {
+        return ret;
+    }
+
+    be32_to_cpus(&reply->error);
+    be64_to_cpus(&reply->handle);
+
+    return 0;
+}
+
+/* nbd_receive_structured_reply_chunk
+ * Read structured reply chunk except magic field (which should be already
+ * read).
+ * Payload is not read.
+ */
+static int nbd_receive_structured_reply_chunk(QIOChannel *ioc,
+                                              NBDStructuredReplyChunk *chunk,
+                                              Error **errp)
+{
+    int ret;
+
+    assert(chunk->magic == NBD_STRUCTURED_REPLY_MAGIC);
+
+    ret = nbd_read(ioc, (uint8_t *)chunk + sizeof(chunk->magic),
+                   sizeof(*chunk) - sizeof(chunk->magic), errp);
+    if (ret < 0) {
+        return ret;
+    }
+
+    be16_to_cpus(&chunk->flags);
+    be16_to_cpus(&chunk->type);
+    be64_to_cpus(&chunk->handle);
+    be32_to_cpus(&chunk->length);
+
+    return 0;
+}
+
 /* nbd_receive_reply
  * Returns 1 on success
  *         0 on eof, when no data was read (errp is not set)
@@ -917,37 +968,48 @@ int nbd_send_request(QIOChannel *ioc, NBDRequest *request)
  */
 int nbd_receive_reply(QIOChannel *ioc, NBDReply *reply, Error **errp)
 {
-    uint8_t buf[NBD_REPLY_SIZE];
-    uint32_t magic;
     int ret;
 
-    ret = nbd_read_eof(ioc, buf, sizeof(buf), errp);
+    ret = nbd_read_eof(ioc, &reply->magic, sizeof(reply->magic), errp);
     if (ret <= 0) {
         return ret;
     }
 
-    /* Reply
-       [ 0 ..  3]    magic   (NBD_SIMPLE_REPLY_MAGIC)
-       [ 4 ..  7]    error   (0 == no error)
-       [ 7 .. 15]    handle
-     */
+    be32_to_cpus(&reply->magic);
 
-    magic = ldl_be_p(buf);
-    reply->error  = ldl_be_p(buf + 4);
-    reply->handle = ldq_be_p(buf + 8);
+    switch (reply->magic) {
+    case NBD_SIMPLE_REPLY_MAGIC:
+        ret = nbd_receive_simple_reply(ioc, &reply->simple, errp);
+        if (ret < 0) {
+            break;
+        }
 
-    reply->error = nbd_errno_to_system_errno(reply->error);
+        if (reply->simple.error == NBD_ESHUTDOWN) {
+            /* This works even on mingw which lacks a native ESHUTDOWN */
+            error_setg(errp, "server shutting down");
+            return -EINVAL;
+        }
 
-    if (reply->error == ESHUTDOWN) {
-        /* This works even on mingw which lacks a native ESHUTDOWN */
-        error_setg(errp, "server shutting down");
+        trace_nbd_receive_simple_reply(
+                nbd_errno_to_system_errno(reply->simple.error),
+                reply->simple.handle);
+        break;
+    case NBD_STRUCTURED_REPLY_MAGIC:
+        ret = nbd_receive_structured_reply_chunk(ioc, &reply->structured, errp);
+        if (ret < 0) {
+            break;
+        }
+        trace_nbd_receive_structured_reply_chunk(reply->structured.flags,
+                                                 reply->structured.type,
+                                                 reply->structured.handle,
+                                                 reply->structured.length);
+        break;
+    default:
+        error_setg(errp, "invalid magic (got 0x%" PRIx32 ")", reply->magic);
         return -EINVAL;
     }
-    trace_nbd_receive_reply(magic, reply->error, reply->handle);
-
-    if (magic != NBD_SIMPLE_REPLY_MAGIC) {
-        error_setg(errp, "invalid magic (got 0x%" PRIx32 ")", magic);
-        return -EINVAL;
+    if (ret < 0) {
+        return ret;
     }
 
     return 1;
diff --git a/nbd/trace-events b/nbd/trace-events
index ea44e6963f..647df4152c 100644
--- a/nbd/trace-events
+++ b/nbd/trace-events
@@ -30,7 +30,8 @@ nbd_client_loop_ret(int ret, const char *error) "NBD loop returned %d: %s"
 nbd_client_clear_queue(void) "Clearing NBD queue"
 nbd_client_clear_socket(void) "Clearing NBD socket"
 nbd_send_request(uint64_t from, uint32_t len, uint64_t handle, uint16_t flags, uint16_t type, const char *name) "Sending request to server: { .from = %" PRIu64", .len = %" PRIu32 ", .handle = %" PRIu64 ", .flags = 0x%" PRIx16 ", .type = %" PRIu16 " (%s) }"
-nbd_receive_reply(uint32_t magic, int32_t error, uint64_t handle) "Got reply: { magic = 0x%" PRIx32 ", .error = % " PRId32 ", handle = %" PRIu64" }"
+nbd_receive_simple_reply(int32_t error, uint64_t handle) "Got simple reply: { error = % " PRId32 ", handle = %" PRIu64" }"
+nbd_receive_structured_reply_chunk(uint16_t flags, uint16_t type, uint64_t handle, uint32_t length) "Got structured reply chunk: { flags = 0x%" PRIx16 ", type = %d, handle = %" PRIu64 ", length = %" PRIu32 " }"
 
 # nbd/server.c
 nbd_negotiate_send_rep_len(uint32_t opt, const char *optname, uint32_t type, const char *typename, uint32_t len) "Reply opt=0x%" PRIx32 " (%s), type=0x%" PRIx32 " (%s), len=%" PRIu32
-- 
2.11.1

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [Qemu-devel] [PATCH v2 10/10] nbd: Minimal structured read for client
  2017-10-09 17:27 [Qemu-devel] [PATCH v2 00/10] nbd minimal structured read Vladimir Sementsov-Ogievskiy
                   ` (15 preceding siblings ...)
  2017-10-09 17:27 ` [Qemu-devel] [PATCH v2 09/10] nbd/client: prepare nbd_receive_reply for structured reply Vladimir Sementsov-Ogievskiy
@ 2017-10-09 17:27 ` Vladimir Sementsov-Ogievskiy
  2017-10-10  8:02   ` Paolo Bonzini
  2017-10-10 14:43   ` Vladimir Sementsov-Ogievskiy
  2017-10-09 17:33 ` [Qemu-devel] [PATCH v2 00/10] nbd minimal structured read Vladimir Sementsov-Ogievskiy
  17 siblings, 2 replies; 31+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2017-10-09 17:27 UTC (permalink / raw)
  To: qemu-block, qemu-devel; +Cc: mreitz, kwolf, pbonzini, eblake, vsementsov, den

Minimal implementation: for structured error only error_report error
message.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 include/block/nbd.h |   6 +
 block/nbd-client.c  | 358 ++++++++++++++++++++++++++++++++++++++++++++++++----
 nbd/client.c        |   7 +
 3 files changed, 343 insertions(+), 28 deletions(-)

diff --git a/include/block/nbd.h b/include/block/nbd.h
index 1ef8c8897f..e3350b67a4 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -203,6 +203,11 @@ enum {
 #define NBD_SREP_TYPE_ERROR         NBD_SREP_ERR(1)
 #define NBD_SREP_TYPE_ERROR_OFFSET  NBD_SREP_ERR(2)
 
+static inline bool nbd_srep_type_is_error(int type)
+{
+    return type & (1 << 15);
+}
+
 /* NBD errors are based on errno numbers, so there is a 1:1 mapping,
  * but only a limited set of errno values is specified in the protocol.
  * Everything else is squashed to EINVAL.
@@ -241,6 +246,7 @@ static inline int nbd_errno_to_system_errno(int err)
 struct NBDExportInfo {
     /* Set by client before nbd_receive_negotiate() */
     bool request_sizes;
+    bool structured_reply;
     /* Set by server results during nbd_receive_negotiate() */
     uint64_t size;
     uint16_t flags;
diff --git a/block/nbd-client.c b/block/nbd-client.c
index 58493b7ac4..eda78130a2 100644
--- a/block/nbd-client.c
+++ b/block/nbd-client.c
@@ -29,6 +29,7 @@
 
 #include "qemu/osdep.h"
 #include "qapi/error.h"
+#include "qemu/error-report.h"
 #include "nbd-client.h"
 
 #define HANDLE_TO_INDEX(bs, handle) ((handle) ^ (uint64_t)(intptr_t)(bs))
@@ -93,7 +94,7 @@ static coroutine_fn void nbd_read_reply_entry(void *opaque)
         if (i >= MAX_NBD_REQUESTS ||
             !s->requests[i].coroutine ||
             !s->requests[i].receiving ||
-            nbd_reply_is_structured(&s->reply))
+            (nbd_reply_is_structured(&s->reply) && !s->info.structured_reply))
         {
             break;
         }
@@ -181,41 +182,336 @@ err:
     return rc;
 }
 
-static int nbd_co_receive_reply(NBDClientSession *s,
-                                uint64_t handle,
-                                QEMUIOVector *qiov)
+static inline void payload_advance16(uint8_t **payload, uint16_t **ptr)
+{
+    *ptr = (uint16_t *)*payload;
+    be16_to_cpus(*ptr);
+    *payload += sizeof(**ptr);
+}
+
+static inline void payload_advance32(uint8_t **payload, uint32_t **ptr)
+{
+    *ptr = (uint32_t *)*payload;
+    be32_to_cpus(*ptr);
+    *payload += sizeof(**ptr);
+}
+
+static inline void payload_advance64(uint8_t **payload, uint64_t **ptr)
+{
+    *ptr = (uint64_t *)*payload;
+    be64_to_cpus(*ptr);
+    *payload += sizeof(**ptr);
+}
+
+static int nbd_parse_offset_hole_payload(NBDStructuredReplyChunk *chunk,
+                                         uint8_t *payload, QEMUIOVector *qiov)
+{
+    uint64_t *offset;
+    uint32_t *hole_size;
+
+    if (chunk->length != sizeof(*offset) + sizeof(*hole_size)) {
+        return -EINVAL;
+    }
+
+    payload_advance64(&payload, &offset);
+    payload_advance32(&payload, &hole_size);
+
+    if (*offset + *hole_size > qiov->size) {
+        return -EINVAL;
+    }
+
+    qemu_iovec_memset(qiov, *offset, 0, *hole_size);
+
+    return 0;
+}
+
+static int nbd_parse_error_payload(NBDStructuredReplyChunk *chunk,
+                                   uint8_t *payload, int *request_ret)
+{
+    uint32_t *error;
+    uint16_t *message_size;
+
+    assert(chunk->type & (1 << 15));
+
+    if (chunk->length < sizeof(error) + sizeof(message_size)) {
+        return -EINVAL;
+    }
+
+    payload_advance32(&payload, &error);
+    payload_advance16(&payload, &message_size);
+
+    error_report("%.*s", *message_size, payload);
+
+    /* TODO add special case for ERROR_OFFSET */
+
+    *request_ret = nbd_errno_to_system_errno(*error);
+
+    return 0;
+}
+
+static int nbd_co_receive_offset_data_payload(NBDClientSession *s,
+                                              QEMUIOVector *qiov)
+{
+    QEMUIOVector sub_qiov;
+    uint64_t offset;
+    size_t data_size;
+    int ret;
+    NBDStructuredReplyChunk *chunk = &s->reply.structured;
+
+    assert(nbd_reply_is_structured(&s->reply));
+
+    if (chunk->length < sizeof(offset)) {
+        return -EINVAL;
+    }
+
+    if (nbd_read(s->ioc, &offset, sizeof(offset), NULL) < 0) {
+        return -EIO;
+    }
+    be64_to_cpus(&offset);
+
+    data_size = chunk->length - sizeof(offset);
+    if (offset + data_size > qiov->size) {
+        return -EINVAL;
+    }
+
+    qemu_iovec_init(&sub_qiov, qiov->niov);
+    qemu_iovec_concat(&sub_qiov, qiov, offset, data_size);
+    ret = qio_channel_readv_all(s->ioc, sub_qiov.iov, sub_qiov.niov, NULL);
+    qemu_iovec_destroy(&sub_qiov);
+
+    return ret < 0 ? -EIO : 0;
+}
+
+#define NBD_MAX_MALLOC_PAYLOAD 1000
+static int nbd_co_receive_structured_payload(NBDClientSession *s,
+                                             void **payload)
+{
+    int ret;
+    uint32_t len;
+
+    assert(nbd_reply_is_structured(&s->reply));
+
+    len = s->reply.structured.length;
+
+    if (len == 0) {
+        return 0;
+    }
+
+    if (payload == NULL) {
+        return -EINVAL;
+    }
+
+    if (len > NBD_MAX_MALLOC_PAYLOAD) {
+        return -EINVAL;
+    }
+
+    *payload = qemu_memalign(8, len);
+    ret = nbd_read(s->ioc, *payload, len, NULL);
+    if (ret < 0) {
+        qemu_vfree(*payload);
+        *payload = NULL;
+        return ret;
+    }
+
+    return 0;
+}
+
+/* nbd_co_do_receive_one_chunk
+ * for simple reply:
+ *   set request_ret to received reply error
+ *   if qiov is not NULL: read payload to @qiov
+ * for structured reply chunk:
+ *   if error chunk: read payload, set @request_ret, do not set @payload
+ *   else if offset_data chunk: read payload data to @qiov, do not set @payload
+ *   else: read payload to @payload
+ */
+static int nbd_co_do_receive_one_chunk(NBDClientSession *s, uint64_t handle,
+                                       bool only_structured, int *request_ret,
+                                       QEMUIOVector *qiov, void **payload)
 {
     int ret;
     int i = HANDLE_TO_INDEX(s, handle);
+    void *local_payload = NULL;
+
+    if (payload) {
+        *payload = NULL;
+    }
+    *request_ret = 0;
 
     /* Wait until we're woken up by nbd_read_reply_entry.  */
     s->requests[i].receiving = true;
     qemu_coroutine_yield();
     s->requests[i].receiving = false;
     if (!s->ioc || s->quit) {
-        ret = -EIO;
-    } else {
-        assert(s->reply.handle == handle);
-        ret = -nbd_errno_to_system_errno(s->reply.simple.error);
-        if (qiov && ret == 0) {
-            if (qio_channel_readv_all(s->ioc, qiov->iov, qiov->niov,
-                                      NULL) < 0) {
-                ret = -EIO;
-                s->quit = true;
-            }
+        return -EIO;
+    }
+
+    assert(s->reply.handle == handle);
+
+    if (nbd_reply_is_simple(&s->reply)) {
+        if (only_structured) {
+            return -EINVAL;
         }
 
-        /* Tell the read handler to read another header.  */
-        s->reply.handle = 0;
+        *request_ret = -nbd_errno_to_system_errno(s->reply.simple.error);
+        if (*request_ret < 0 || !qiov) {
+            return 0;
+        }
+
+        return qio_channel_readv_all(s->ioc, qiov->iov, qiov->niov,
+                                     NULL) < 0 ? -EIO : 0;
     }
 
-    s->requests[i].coroutine = NULL;
+    /* handle structured reply chunk */
+    assert(s->info.structured_reply);
+
+    if (s->reply.structured.type == NBD_SREP_TYPE_NONE) {
+        return 0;
+    }
+
+    if (s->reply.structured.type == NBD_SREP_TYPE_OFFSET_DATA) {
+        if (!qiov) {
+            return -EINVAL;
+        }
+
+        return nbd_co_receive_offset_data_payload(s, qiov);
+    }
+
+    if (nbd_srep_type_is_error(s->reply.structured.type)) {
+        payload = &local_payload;
+    }
+
+    ret = nbd_co_receive_structured_payload(s, payload);
+    if (ret < 0) {
+        return ret;
+    }
+
+    if (nbd_srep_type_is_error(s->reply.structured.type)) {
+        ret = nbd_parse_error_payload(&s->reply.structured, local_payload,
+                                      request_ret);
+        qemu_vfree(local_payload);
+        return ret;
+    }
+
+    return 0;
+}
+
+/* nbd_co_receive_one_chunk
+ * Read reply, wake up read_reply_co and set s->quit if needed.
+ * Return value is a fatal error code or normal nbd reply error code
+ */
+static int nbd_co_receive_one_chunk(NBDClientSession *s, uint64_t handle,
+                                    bool only_structured,
+                                    QEMUIOVector *qiov, NBDReply *reply,
+                                    void **payload)
+{
+    int request_ret;
+    int ret = nbd_co_do_receive_one_chunk(s, handle, only_structured,
+                                          &request_ret, qiov, payload);
+
+    if (ret < 0) {
+        s->quit = true;
+    } else {
+        /* For assert at loop start in nbd_read_reply_entry */
+        if (reply) {
+            *reply = s->reply;
+        }
+        s->reply.handle = 0;
+        ret = request_ret;
+    }
 
-    /* Kick the read_reply_co to get the next reply.  */
     if (s->read_reply_co) {
         aio_co_wake(s->read_reply_co);
     }
 
+    return ret;
+}
+
+static int nbd_co_receive_return_code(NBDClientSession *s, uint64_t handle)
+{
+    int ret = 0;
+    int i = HANDLE_TO_INDEX(s, handle);
+    NBDReply reply;
+    bool only_structured = false;
+
+    do {
+        int rc = nbd_co_receive_one_chunk(s, handle, only_structured,
+                                          NULL, &reply, NULL);
+        if (rc < 0 || nbd_reply_is_simple(&reply)) {
+            if (ret == 0) {
+                ret = rc;
+            }
+            only_structured = true;
+            continue;
+        }
+
+        if (reply.structured.type != NBD_SREP_TYPE_NONE) {
+            /* not allowed reply type */
+            ret = -EIO;
+            s->quit = true;
+            break;
+        }
+
+        only_structured = true;
+    } while (!s->quit && nbd_reply_is_structured(&s->reply) &&
+             !(s->reply.structured.flags & NBD_SREP_FLAG_DONE));
+
+    s->requests[i].coroutine = NULL;
+
+    qemu_co_mutex_lock(&s->send_mutex);
+    s->in_flight--;
+    qemu_co_queue_next(&s->free_sema);
+    qemu_co_mutex_unlock(&s->send_mutex);
+
+    return ret;
+}
+
+static int nbd_co_receive_cmdread_reply(NBDClientSession *s, uint64_t handle,
+                                        QEMUIOVector *qiov)
+{
+    int ret = 0;
+    int i = HANDLE_TO_INDEX(s, handle);
+    bool only_structured = s->info.structured_reply;
+    NBDReply reply;
+
+    do {
+        void *payload = NULL;
+        int rc = nbd_co_receive_one_chunk(s, handle, only_structured,
+                                          qiov, &reply, &payload);
+        if (rc < 0 || nbd_reply_is_simple(&reply)) {
+            if (ret == 0) {
+                ret = rc;
+            }
+            only_structured = true;
+            continue;
+        }
+
+        switch (reply.structured.type) {
+        case NBD_SREP_TYPE_OFFSET_DATA:
+            /* special cased in nbd_co_receive_one_chunk, data is already
+             * in qiov */
+            break;
+        case NBD_SREP_TYPE_OFFSET_HOLE:
+            ret = nbd_parse_offset_hole_payload(&reply.structured, payload,
+                                                qiov);
+            if (ret < 0) {
+                s->quit = true;
+            }
+            break;
+        default:
+            /* not allowed reply type */
+            ret = -EINVAL;
+            s->quit = true;
+        }
+
+        qemu_vfree(payload);
+        only_structured = true;
+    } while (!s->quit && nbd_reply_is_structured(&s->reply) &&
+             !(s->reply.structured.flags & NBD_SREP_FLAG_DONE));
+
+
+    s->requests[i].coroutine = NULL;
+
     qemu_co_mutex_lock(&s->send_mutex);
     s->in_flight--;
     qemu_co_queue_next(&s->free_sema);
@@ -226,30 +522,31 @@ static int nbd_co_receive_reply(NBDClientSession *s,
 
 static int nbd_co_request(BlockDriverState *bs,
                           NBDRequest *request,
-                          QEMUIOVector *qiov)
+                          QEMUIOVector *write_qiov)
 {
     NBDClientSession *client = nbd_get_client_session(bs);
     int ret;
 
-    if (qiov) {
-        assert(request->type == NBD_CMD_WRITE || request->type == NBD_CMD_READ);
-        assert(request->len == iov_size(qiov->iov, qiov->niov));
+    assert(request->type != NBD_CMD_READ);
+    if (write_qiov) {
+        assert(request->type == NBD_CMD_WRITE);
+        assert(request->len == iov_size(write_qiov->iov, write_qiov->niov));
     } else {
-        assert(request->type != NBD_CMD_WRITE && request->type != NBD_CMD_READ);
+        assert(request->type != NBD_CMD_WRITE);
     }
-    ret = nbd_co_send_request(bs, request,
-                              request->type == NBD_CMD_WRITE ? qiov : NULL);
+    ret = nbd_co_send_request(bs, request, write_qiov);
     if (ret < 0) {
         return ret;
     }
 
-    return nbd_co_receive_reply(client, request->handle,
-                                request->type == NBD_CMD_READ ? qiov : NULL);
+    return nbd_co_receive_return_code(client, request->handle);
 }
 
 int nbd_client_co_preadv(BlockDriverState *bs, uint64_t offset,
                          uint64_t bytes, QEMUIOVector *qiov, int flags)
 {
+    int ret;
+    NBDClientSession *client = nbd_get_client_session(bs);
     NBDRequest request = {
         .type = NBD_CMD_READ,
         .from = offset,
@@ -259,7 +556,12 @@ int nbd_client_co_preadv(BlockDriverState *bs, uint64_t offset,
     assert(bytes <= NBD_MAX_BUFFER_SIZE);
     assert(!flags);
 
-    return nbd_co_request(bs, &request, qiov);
+    ret = nbd_co_send_request(bs, &request, NULL);
+    if (ret < 0) {
+        return ret;
+    }
+
+    return nbd_co_receive_cmdread_reply(client, request.handle, qiov);
 }
 
 int nbd_client_co_pwritev(BlockDriverState *bs, uint64_t offset,
diff --git a/nbd/client.c b/nbd/client.c
index a38e1a7d8e..2f256ee771 100644
--- a/nbd/client.c
+++ b/nbd/client.c
@@ -687,6 +687,13 @@ int nbd_receive_negotiate(QIOChannel *ioc, const char *name,
         if (fixedNewStyle) {
             int result;
 
+            result = nbd_request_simple_option(ioc, NBD_OPT_STRUCTURED_REPLY,
+                                               errp);
+            if (result < 0) {
+                goto fail;
+            }
+            info->structured_reply = result == 1;
+
             /* Try NBD_OPT_GO first - if it works, we are done (it
              * also gives us a good message if the server requires
              * TLS).  If it is not available, fall back to
-- 
2.11.1

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [Qemu-devel] [PATCH v2 00/10] nbd minimal structured read
  2017-10-09 17:27 [Qemu-devel] [PATCH v2 00/10] nbd minimal structured read Vladimir Sementsov-Ogievskiy
                   ` (16 preceding siblings ...)
  2017-10-09 17:27 ` [Qemu-devel] [PATCH v2 10/10] nbd: Minimal structured read for client Vladimir Sementsov-Ogievskiy
@ 2017-10-09 17:33 ` Vladimir Sementsov-Ogievskiy
  17 siblings, 0 replies; 31+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2017-10-09 17:33 UTC (permalink / raw)
  To: qemu-block, qemu-devel; +Cc: mreitz, kwolf, pbonzini, eblake, den

Ohm, sorry for trash patches (with */7 in prefix). Consider only patches 
with */10 in prefix.
I can resend if you want.


09.10.2017 20:27, Vladimir Sementsov-Ogievskiy wrote:
> Minimally implement nbd structured read extension.
>
> v2:
>
> clone: tag up-nbd-minimal-structured-read-v2 from https://src.openvz.org/scm/~vsementsov/qemu.git
> online: https://src.openvz.org/users/vsementsov/repos/qemu/browse?at=up-nbd-minimal-structured-read-v2
>
> 01: improve assertion (Eric)
> 02: add Eric's r-b
> 06: define NBD_SREP_TYPE_OFFSET_HOLE and NBD_SREP_TYPE_ERROR_OFFSET
>      here (Eric)
> 07: fix return value of nbd_request_simple_option
> 08-09: new patches
> 10: mostly rewritten
>
> Vladimir Sementsov-Ogievskiy (10):
>    block/nbd-client: assert qiov len once in nbd_co_request
>    block/nbd-client: refactor nbd_co_receive_reply
>    nbd: rename NBD_REPLY_MAGIC to NBD_SIMPLE_REPLY_MAGIC
>    nbd-server: refactor simple reply sending
>    nbd: header constants indenting
>    nbd: Minimal structured read for server
>    nbd/client: refactor nbd_receive_starttls
>    nbd: share some nbd entities to be reused in block/nbd-client.c
>    nbd/client: prepare nbd_receive_reply for structured reply
>    nbd: Minimal structured read for client
>
>   include/block/nbd.h                      | 132 ++++++++++-
>   nbd/nbd-internal.h                       |  65 +++---
>   block/nbd-client.c                       | 364 ++++++++++++++++++++++++++++---
>   nbd/client.c                             | 205 +++++++++++------
>   nbd/server.c                             | 189 +++++++++++-----
>   nbd/trace-events                         |  10 +-
>   tests/qemu-iotests/nbd-fault-injector.py |   4 +-
>   7 files changed, 756 insertions(+), 213 deletions(-)
>


-- 
Best regards,
Vladimir

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Qemu-devel] [PATCH v2 01/10] block/nbd-client: assert qiov len once in nbd_co_request
  2017-10-09 17:27 ` [Qemu-devel] [PATCH v2 01/10] block/nbd-client: assert qiov len once in nbd_co_request Vladimir Sementsov-Ogievskiy
@ 2017-10-09 18:37   ` Eric Blake
  0 siblings, 0 replies; 31+ messages in thread
From: Eric Blake @ 2017-10-09 18:37 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block, qemu-devel
  Cc: mreitz, kwolf, pbonzini, den

[-- Attachment #1: Type: text/plain, Size: 522 bytes --]

On 10/09/2017 12:27 PM, Vladimir Sementsov-Ogievskiy wrote:
> Also improve the assertion: check that qiov is NULL for other commands
> than CMD_READ and CMD_WRITE.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>  block/nbd-client.c | 10 ++++++----
>  1 file changed, 6 insertions(+), 4 deletions(-)

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 619 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Qemu-devel] [PATCH v2 04/10] nbd-server: refactor simple reply sending
  2017-10-09 17:27 ` [Qemu-devel] [PATCH v2 04/10] nbd-server: refactor simple reply sending Vladimir Sementsov-Ogievskiy
@ 2017-10-09 19:18   ` Eric Blake
  2017-10-10  8:40     ` Daniel P. Berrange
  2017-10-11 17:24     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 2 replies; 31+ messages in thread
From: Eric Blake @ 2017-10-09 19:18 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block, qemu-devel
  Cc: mreitz, kwolf, pbonzini, den, Daniel P. Berrange

[-- Attachment #1: Type: text/plain, Size: 8769 bytes --]

[adding Dan for a qio question - search for your name below]

On 10/09/2017 12:27 PM, Vladimir Sementsov-Ogievskiy wrote:
> Get rid of calculating structure fields offsets by hand and set_cork,
> rename nbd_co_send_reply to nbd_co_send_simple_reply. Do not use
> NBDReply which will be upgraded in future patches to handle both simple
> and structured replies and will be used only in the client.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>  include/block/nbd.h |  6 ++++
>  nbd/nbd-internal.h  |  9 +++++
>  nbd/server.c        | 97 ++++++++++++++++++++++-------------------------------
>  3 files changed, 56 insertions(+), 56 deletions(-)

Feels a bit big for one patch, although I'm borderline for being able to
take it as-is.

> 
> diff --git a/include/block/nbd.h b/include/block/nbd.h
> index 707fd37575..49008980f4 100644
> --- a/include/block/nbd.h
> +++ b/include/block/nbd.h
> @@ -63,6 +63,12 @@ struct NBDReply {
>  };
>  typedef struct NBDReply NBDReply;
>  
> +typedef struct NBDSimpleReply {
> +    uint32_t magic;  /* NBD_SIMPLE_REPLY_MAGIC */
> +    uint32_t error;
> +    uint64_t handle;
> +} QEMU_PACKED NBDSimpleReply;
> +

So (one of?) the goal of this patch is to use a packed struct for
on-the-wire transmissions, instead of manually packing things into
offsets by hand.  I can live with that.

>  /* Transmission (export) flags: sent from server to client during handshake,
>     but describe what will happen during transmission */
>  #define NBD_FLAG_HAS_FLAGS      (1 << 0)        /* Flags are there */
> diff --git a/nbd/nbd-internal.h b/nbd/nbd-internal.h
> index 2d6663de23..320abef296 100644
> --- a/nbd/nbd-internal.h
> +++ b/nbd/nbd-internal.h
> @@ -113,6 +113,15 @@ static inline int nbd_write(QIOChannel *ioc, const void *buffer, size_t size,
>      return qio_channel_write_all(ioc, buffer, size, errp) < 0 ? -EIO : 0;
>  }
>  
> +/* nbd_writev
> + * Writes @size bytes to @ioc. Returns 0 on success.
> + */
> +static inline int nbd_writev(QIOChannel *ioc, const struct iovec *iov,
> +                             size_t niov, Error **errp)

Comment is wrong; you don't have a @size parameter.

> +{
> +    return qio_channel_writev_all(ioc, iov, niov, errp) < 0 ? -EIO : 0;
> +}

Do we really need this, or can we just directly use
qio_channel_writev_all() at the lone site that uses this new interface?

> +
>  struct NBDTLSHandshakeData {
>      GMainLoop *loop;
>      bool complete;
> diff --git a/nbd/server.c b/nbd/server.c
> index a1b21a6951..57d5598e0f 100644
> --- a/nbd/server.c
> +++ b/nbd/server.c
> @@ -902,26 +902,6 @@ static int nbd_receive_request(QIOChannel *ioc, NBDRequest *request,
>      return 0;
>  }
>  
> -static int nbd_send_reply(QIOChannel *ioc, NBDReply *reply, Error **errp)

So instead of taking a non-packed struct, doing the hand-packing here,
then sending the message, you split this up...

> -static int nbd_co_send_reply(NBDRequestData *req, NBDReply *reply, int len,
> -                             Error **errp)
> +static int coroutine_fn nbd_co_send_iov(NBDClient *client, struct iovec *iov,
> +                                        unsigned niov, Error **errp)
>  {
> -    NBDClient *client = req->client;
>      int ret;
>  
>      g_assert(qemu_in_coroutine());
> -
> -    trace_nbd_co_send_reply(reply->handle, reply->error, len);
> -
>      qemu_co_mutex_lock(&client->send_lock);
>      client->send_coroutine = qemu_coroutine_self();
>  
> -    if (!len) {
> -        ret = nbd_send_reply(client->ioc, reply, errp);
> -    } else {
> -        qio_channel_set_cork(client->ioc, true);
> -        ret = nbd_send_reply(client->ioc, reply, errp);
> -        if (ret == 0) {
> -            ret = nbd_write(client->ioc, req->data, len, errp);
> -            if (ret < 0) {
> -                ret = -EIO;
> -            }
> -        }
> -        qio_channel_set_cork(client->ioc, false);
> -    }
> +    ret = nbd_writev(client->ioc, iov, niov, errp);

...the transmission is now done via iov so that you can send two buffers
at once without having to play with the cork,...

Dan - are we guaranteed that qio_channel_writev_all() with a multi-iov
argument called by one thread will send all its data in order, without
being interleaved with any other parallel coroutine also calling
qio_channel_writev_all()?  In other words, it looks like the NBD code
was previously using manual cork changes to ensure that two halves of a
message were sent without interleave, but Vladimir is now relying on the
qio code to do that under the hood.

>  
>      client->send_coroutine = NULL;
>      qemu_co_mutex_unlock(&client->send_lock);
> +
>      return ret;
>  }
>  
> +static inline void set_be_simple_reply(NBDSimpleReply *reply, uint64_t error,
> +                                       uint64_t handle)
> +{
> +    stl_be_p(&reply->magic, NBD_SIMPLE_REPLY_MAGIC);
> +    stl_be_p(&reply->error, error);
> +    stq_be_p(&reply->handle, handle);
> +}

...loading the packed struct is now its own helper function,...

> +
> +static int nbd_co_send_simple_reply(NBDClient *client,
> +                                    uint64_t handle,
> +                                    uint32_t error,
> +                                    void *data,
> +                                    size_t size,
> +                                    Error **errp)
> +{
> +    NBDSimpleReply reply;
> +    struct iovec iov[] = {
> +        {.iov_base = &reply, .iov_len = sizeof(reply)},
> +        {.iov_base = data, .iov_len = size}
> +    };
> +
> +    trace_nbd_co_send_reply(handle, error, size);
> +
> +    set_be_simple_reply(&reply, system_errno_to_nbd_errno(error), handle);
> +
> +    return nbd_co_send_iov(client, iov, size ? 2 : 1, errp);
> +}

...and the function gets renamed, all at once.  Okay, maybe it IS easier
to review if split.  A good split might be:

Focus on naming: Rename nbd_co_send_reply() to
nbd_co_send_simple_reply() in one patch (in fact, this part should be
squashed with the rename of the magic number in 3/10)

Focus on conversion to on-the-wire representation: Add the packed struct
and set_be_simple_reply() in one patch, but still using corks

Focus on simplified transmission: Switch to using iov to send both
halves of a two-part message in one transaction

> +
>  /* nbd_co_receive_request
>   * Collect a client request. Return 0 if request looks valid, -EIO to drop
>   * connection right away, and any other negative value to report an error to
> @@ -1331,7 +1324,6 @@ static coroutine_fn void nbd_trip(void *opaque)
>      NBDExport *exp = client->exp;
>      NBDRequestData *req;
>      NBDRequest request = { 0 };    /* GCC thinks it can be used uninitialized */
> -    NBDReply reply;
>      int ret;
>      int flags;
>      int reply_data_len = 0;
> @@ -1351,11 +1343,7 @@ static coroutine_fn void nbd_trip(void *opaque)
>          goto disconnect;
>      }
>  
> -    reply.handle = request.handle;
> -    reply.error = 0;
> -
>      if (ret < 0) {
> -        reply.error = -ret;
>          goto reply;
>      }

Oh, and you're doing a fourth thing - tracking the error directly in ret
instead of buried in reply.error.  Could also be split.

>  
>  reply:
>      if (local_err) {
> -        /* If we are here local_err is not fatal error, already stored in
> -         * reply.error */
> +        /* If we are here local_err is not fatal error, which should be sent
> +         * to client. */

Reads awkwardly (even pre-patch); maybe this is better:

/* If we get here, local_err was not a fatal error, and should be sent
to the client. */

>          error_report_err(local_err);
>          local_err = NULL;
>      }
>  
> -    if (nbd_co_send_reply(req, &reply, reply_data_len, &local_err) < 0) {
> +    if (nbd_co_send_simple_reply(req->client, request.handle,
> +                                 ret < 0 ? -ret : 0,
> +                                 req->data, reply_data_len, &local_err) < 0)
> +    {
>          error_prepend(&local_err, "Failed to send reply: ");
>          goto disconnect;
>      }
> 

I'm not necessarily going to insist that you split things if I don't see
any other reasons to respin the series; but I might also try to do the
split locally and post it for you to see my thinking (remember, a series
of small patches that each do one thing well can be easier to review
than large patches, even if it results in more emails).

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 619 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Qemu-devel] [PATCH v2 05/10] nbd: header constants indenting
  2017-10-09 17:27 ` [Qemu-devel] [PATCH v2 05/10] nbd: header constants indenting Vladimir Sementsov-Ogievskiy
@ 2017-10-09 19:21   ` Eric Blake
  0 siblings, 0 replies; 31+ messages in thread
From: Eric Blake @ 2017-10-09 19:21 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block, qemu-devel
  Cc: mreitz, kwolf, pbonzini, den

[-- Attachment #1: Type: text/plain, Size: 536 bytes --]

On 10/09/2017 12:27 PM, Vladimir Sementsov-Ogievskiy wrote:
> Prepare indenting for the following commit.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>  include/block/nbd.h | 15 ++++++++-------
>  nbd/nbd-internal.h  | 34 +++++++++++++++++-----------------
>  2 files changed, 25 insertions(+), 24 deletions(-)

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 619 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Qemu-devel] [PATCH v2 10/10] nbd: Minimal structured read for client
  2017-10-09 17:27 ` [Qemu-devel] [PATCH v2 10/10] nbd: Minimal structured read for client Vladimir Sementsov-Ogievskiy
@ 2017-10-10  8:02   ` Paolo Bonzini
  2017-10-10 14:55     ` Vladimir Sementsov-Ogievskiy
  2017-10-11 16:59     ` Vladimir Sementsov-Ogievskiy
  2017-10-10 14:43   ` Vladimir Sementsov-Ogievskiy
  1 sibling, 2 replies; 31+ messages in thread
From: Paolo Bonzini @ 2017-10-10  8:02 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block, qemu-devel
  Cc: mreitz, kwolf, eblake, den

On 09/10/2017 19:27, Vladimir Sementsov-Ogievskiy wrote:
> 
> +    int ret = nbd_co_do_receive_one_chunk(s, handle, only_structured,
> +                                          &request_ret, qiov, payload);
> +
> +    if (ret < 0) {
> +        s->quit = true;
> +    } else {
> +        /* For assert at loop start in nbd_read_reply_entry */
> +        if (reply) {
> +            *reply = s->reply;
> +        }

reply is always non-NULL, right?

> +        s->reply.handle = 0;
> +        ret = request_ret;
> +    }
>  

...

> +static int nbd_co_receive_return_code(NBDClientSession *s, uint64_t handle)
> +{
> +    int ret = 0;
> +    int i = HANDLE_TO_INDEX(s, handle);
> +    NBDReply reply;
> +    bool only_structured = false;
> +
> +    do {
> +        int rc = nbd_co_receive_one_chunk(s, handle, only_structured,
> +                                          NULL, &reply, NULL);

Is rc > 0 propagated correctly?

> +        if (rc < 0 || nbd_reply_is_simple(&reply)) {
> +            if (ret == 0) {
> +                ret = rc;
> +            }
> +            only_structured = true;
> +            continue;
> +        }

> +    } while (!s->quit && nbd_reply_is_structured(&s->reply) &&
> +             !(s->reply.structured.flags & NBD_SREP_FLAG_DONE));
> +
> +    s->requests[i].coroutine = NULL;
> +
> +    qemu_co_mutex_lock(&s->send_mutex);
> +    s->in_flight--;
> +    qemu_co_queue_next(&s->free_sema);
> +    qemu_co_mutex_unlock(&s->send_mutex);

The code after the loop is duplicated.

An idea could be to wrap nbd_co_receive_one_chunk with an iterator like

typedef struct NBDReplyChunkIter {
    int ret;
    bool done, only_structured;
} NBDReplyChunkIter;

#define NBD_FOREACH_REPLY_CHUNK(s, handle, iter, qiov, reply, payload, \
                                structured)                            \
    for (iter = (NBDReplyChunkIter) { .only_structured = structured }; \
         nbd_reply_chunk_iter_receive(s, &iter, qiov, &reply, handle,  \
                                      payload);;)

bool coroutine_fn nbd_reply_chunk_iter_receive(...)
{
    if (s->quit) {
        if (iter->ret == 0) {
            iter->ret = -EIO;
        }
        goto break_loop;
    }
    /* Handle the end of the previous iteration.  */
    if (iter->done) {
        goto break_loop;
    }

    rc = nbd_co_receive_one_chunk(s, handle, iter->only_structured,
                                  qiov, reply, payload);
    if (rc < 0) {
        assert(s->quit);
        if (iter->ret == 0) {
            iter->ret = rc;
        }
        goto break_loop;
    }

    /* No structured reply?  Do not execute the body of
     * NBD_FOREACH_REPLY_CHUNK.
     */
    if (nbd_reply_is_simple(&s->reply) || s->quit) {
        goto break_loop;
    }

    iter->only_structured = true;
    if (s->reply.structured.flags & NBD_SREP_FLAG_DONE) {
        iter->ret = rc;
        iter->done = true;
        /* Execute the loop body, but this iteration is the last.  */
        return true;
    }

    /* An error reply must have NBD_SREP_FLAG_DONE set, otherwise it is
     * a protocol error (FIXME: is this true?  but if not, how do you
     * cope with multiple error replies?)
     */
    if (rc > 0) {
        s->quit = true;
        iter->ret = -EIO;
        goto break_loop;
    }
    return true;

break_loop:
    iter->done = true;

    s->requests[HANDLE_TO_INDEX(s, handle)].coroutine = NULL;

    qemu_co_mutex_lock(&s->send_mutex);
    s->in_flight--;
    qemu_co_queue_next(&s->free_sema);
    qemu_co_mutex_unlock(&s->send_mutex);
    return false;
}

so this loop could be written like

    FOR_EACH_REPLY_CHUNK(s, handle, iter, NULL, &reply, NULL) {
        if (reply.structured.type != NBD_SREP_TYPE_NONE) {
            /* not allowed reply type */
            s->quit = true;
            break;
        }
    }
    return iter.ret;

and likewise for nbd_co_receive_cmdread_reply.  The iterator is a bit
more complex, but it abstracts all the protocol handling and avoids lots
of duplicated code.

Paolo

> +    return ret;
> +}

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Qemu-devel] [PATCH v2 04/10] nbd-server: refactor simple reply sending
  2017-10-09 19:18   ` Eric Blake
@ 2017-10-10  8:40     ` Daniel P. Berrange
  2017-10-10  9:13       ` Paolo Bonzini
  2017-10-11 17:24     ` Vladimir Sementsov-Ogievskiy
  1 sibling, 1 reply; 31+ messages in thread
From: Daniel P. Berrange @ 2017-10-10  8:40 UTC (permalink / raw)
  To: Eric Blake
  Cc: Vladimir Sementsov-Ogievskiy, qemu-block, qemu-devel, mreitz,
	kwolf, pbonzini, den

On Mon, Oct 09, 2017 at 02:18:10PM -0500, Eric Blake wrote:
> [adding Dan for a qio question - search for your name below]
> 
> On 10/09/2017 12:27 PM, Vladimir Sementsov-Ogievskiy wrote:
> > Get rid of calculating structure fields offsets by hand and set_cork,
> > rename nbd_co_send_reply to nbd_co_send_simple_reply. Do not use
> > NBDReply which will be upgraded in future patches to handle both simple
> > and structured replies and will be used only in the client.
> > 
> > Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> > ---
> >  include/block/nbd.h |  6 ++++
> >  nbd/nbd-internal.h  |  9 +++++
> >  nbd/server.c        | 97 ++++++++++++++++++++++-------------------------------
> >  3 files changed, 56 insertions(+), 56 deletions(-)

> > -static int nbd_co_send_reply(NBDRequestData *req, NBDReply *reply, int len,
> > -                             Error **errp)
> > +static int coroutine_fn nbd_co_send_iov(NBDClient *client, struct iovec *iov,
> > +                                        unsigned niov, Error **errp)
> >  {
> > -    NBDClient *client = req->client;
> >      int ret;
> >  
> >      g_assert(qemu_in_coroutine());
> > -
> > -    trace_nbd_co_send_reply(reply->handle, reply->error, len);
> > -
> >      qemu_co_mutex_lock(&client->send_lock);
> >      client->send_coroutine = qemu_coroutine_self();
> >  
> > -    if (!len) {
> > -        ret = nbd_send_reply(client->ioc, reply, errp);
> > -    } else {
> > -        qio_channel_set_cork(client->ioc, true);
> > -        ret = nbd_send_reply(client->ioc, reply, errp);
> > -        if (ret == 0) {
> > -            ret = nbd_write(client->ioc, req->data, len, errp);
> > -            if (ret < 0) {
> > -                ret = -EIO;
> > -            }
> > -        }
> > -        qio_channel_set_cork(client->ioc, false);
> > -    }
> > +    ret = nbd_writev(client->ioc, iov, niov, errp);
> 
> ...the transmission is now done via iov so that you can send two buffers
> at once without having to play with the cork,...
> 
> Dan - are we guaranteed that qio_channel_writev_all() with a multi-iov
> argument called by one thread will send all its data in order, without
> being interleaved with any other parallel coroutine also calling
> qio_channel_writev_all()?  In other words, it looks like the NBD code
> was previously using manual cork changes to ensure that two halves of a
> message were sent without interleave, but Vladimir is now relying on the
> qio code to do that under the hood.

The I/O channels code does not make guarantees wrt concurrent usage of
threads or coroutines. It is the callers responsibility to avoid any
concurrent usage for all APIs. With coroutines you are at least avoiding
the danger of corrupting memory state, but you still have risk of unexpected
data ordering.

IOW, if you have 2 coroutines each doing a series writes on the same QIOChannel
object, and one does a yield, I/O can certainly be interleaved between the two.
This is true whether doing multiple qio_channel_writev() calls directly, or
whether using qio_channel_writev_all().

Unless I'm misunderstanding though, cork did not offer you protection in
this scenario either. cork just prevents the data being transmitted onto
the wire immediately - ie it ensures that if you do 2 writes, they get
sent in 1 TCP packet instead of many TCP packets.

If you have multiple coroutines writing to the channel at the same time
and one yielded in its series of writes, the I/O from multiple coroutines
will get merged into that 1 single TCP packet when uncorked.

So if this concurrent usage is a problem NBD was already broken AFAICT.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Qemu-devel] [PATCH v2 04/10] nbd-server: refactor simple reply sending
  2017-10-10  8:40     ` Daniel P. Berrange
@ 2017-10-10  9:13       ` Paolo Bonzini
  0 siblings, 0 replies; 31+ messages in thread
From: Paolo Bonzini @ 2017-10-10  9:13 UTC (permalink / raw)
  To: Daniel P. Berrange, Eric Blake
  Cc: Vladimir Sementsov-Ogievskiy, qemu-block, qemu-devel, mreitz, kwolf, den

On 10/10/2017 10:40, Daniel P. Berrange wrote:
> The I/O channels code does not make guarantees wrt concurrent usage of
> threads or coroutines. It is the callers responsibility to avoid any
> concurrent usage for all APIs. With coroutines you are at least avoiding
> the danger of corrupting memory state, but you still have risk of unexpected
> data ordering.
> 
> IOW, if you have 2 coroutines each doing a series writes on the same QIOChannel
> object, and one does a yield, I/O can certainly be interleaved between the two.
> This is true whether doing multiple qio_channel_writev() calls directly, or
> whether using qio_channel_writev_all().
> 
> Unless I'm misunderstanding though, cork did not offer you protection in
> this scenario either. cork just prevents the data being transmitted onto
> the wire immediately - ie it ensures that if you do 2 writes, they get
> sent in 1 TCP packet instead of many TCP packets.
> 
> If you have multiple coroutines writing to the channel at the same time
> and one yielded in its series of writes, the I/O from multiple coroutines
> will get merged into that 1 single TCP packet when uncorked.
> 
> So if this concurrent usage is a problem NBD was already broken AFAICT.

I don't think there was an issue here.  nbd_co_send_reply is entirely
protected by client->send_lock.

Paolo

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Qemu-devel] [PATCH v2 10/10] nbd: Minimal structured read for client
  2017-10-09 17:27 ` [Qemu-devel] [PATCH v2 10/10] nbd: Minimal structured read for client Vladimir Sementsov-Ogievskiy
  2017-10-10  8:02   ` Paolo Bonzini
@ 2017-10-10 14:43   ` Vladimir Sementsov-Ogievskiy
  1 sibling, 0 replies; 31+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2017-10-10 14:43 UTC (permalink / raw)
  To: qemu-block, qemu-devel; +Cc: mreitz, kwolf, pbonzini, eblake, den

09.10.2017 20:27, Vladimir Sementsov-Ogievskiy wrote:
> Minimal implementation: for structured error only error_report error
> message.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>   include/block/nbd.h |   6 +
>   block/nbd-client.c  | 358 ++++++++++++++++++++++++++++++++++++++++++++++++----
>   nbd/client.c        |   7 +
>   3 files changed, 343 insertions(+), 28 deletions(-)
>
> diff --git a/include/block/nbd.h b/include/block/nbd.h
> index 1ef8c8897f..e3350b67a4 100644
> --- a/include/block/nbd.h
> +++ b/include/block/nbd.h
> @@ -203,6 +203,11 @@ enum {
>   #define NBD_SREP_TYPE_ERROR         NBD_SREP_ERR(1)
>   #define NBD_SREP_TYPE_ERROR_OFFSET  NBD_SREP_ERR(2)
>   
> +static inline bool nbd_srep_type_is_error(int type)
> +{
> +    return type & (1 << 15);
> +}
> +
>   /* NBD errors are based on errno numbers, so there is a 1:1 mapping,
>    * but only a limited set of errno values is specified in the protocol.
>    * Everything else is squashed to EINVAL.
> @@ -241,6 +246,7 @@ static inline int nbd_errno_to_system_errno(int err)
>   struct NBDExportInfo {
>       /* Set by client before nbd_receive_negotiate() */
>       bool request_sizes;
> +    bool structured_reply;
>       /* Set by server results during nbd_receive_negotiate() */
>       uint64_t size;
>       uint16_t flags;
> diff --git a/block/nbd-client.c b/block/nbd-client.c
> index 58493b7ac4..eda78130a2 100644
> --- a/block/nbd-client.c
> +++ b/block/nbd-client.c
> @@ -29,6 +29,7 @@
>   
>   #include "qemu/osdep.h"
>   #include "qapi/error.h"
> +#include "qemu/error-report.h"
>   #include "nbd-client.h"
>   
>   #define HANDLE_TO_INDEX(bs, handle) ((handle) ^ (uint64_t)(intptr_t)(bs))
> @@ -93,7 +94,7 @@ static coroutine_fn void nbd_read_reply_entry(void *opaque)
>           if (i >= MAX_NBD_REQUESTS ||
>               !s->requests[i].coroutine ||
>               !s->requests[i].receiving ||
> -            nbd_reply_is_structured(&s->reply))
> +            (nbd_reply_is_structured(&s->reply) && !s->info.structured_reply))
>           {
>               break;
>           }
> @@ -181,41 +182,336 @@ err:
>       return rc;
>   }
>   
> -static int nbd_co_receive_reply(NBDClientSession *s,
> -                                uint64_t handle,
> -                                QEMUIOVector *qiov)
> +static inline void payload_advance16(uint8_t **payload, uint16_t **ptr)
> +{
> +    *ptr = (uint16_t *)*payload;
> +    be16_to_cpus(*ptr);
> +    *payload += sizeof(**ptr);
> +}
> +
> +static inline void payload_advance32(uint8_t **payload, uint32_t **ptr)
> +{
> +    *ptr = (uint32_t *)*payload;
> +    be32_to_cpus(*ptr);
> +    *payload += sizeof(**ptr);
> +}
> +
> +static inline void payload_advance64(uint8_t **payload, uint64_t **ptr)
> +{
> +    *ptr = (uint64_t *)*payload;
> +    be64_to_cpus(*ptr);
> +    *payload += sizeof(**ptr);
> +}
> +
> +static int nbd_parse_offset_hole_payload(NBDStructuredReplyChunk *chunk,
> +                                         uint8_t *payload, QEMUIOVector *qiov)
> +{
> +    uint64_t *offset;
> +    uint32_t *hole_size;
> +
> +    if (chunk->length != sizeof(*offset) + sizeof(*hole_size)) {
> +        return -EINVAL;
> +    }
> +
> +    payload_advance64(&payload, &offset);
> +    payload_advance32(&payload, &hole_size);
> +
> +    if (*offset + *hole_size > qiov->size) {
> +        return -EINVAL;
> +    }
> +
> +    qemu_iovec_memset(qiov, *offset, 0, *hole_size);
> +
> +    return 0;
> +}
> +
> +static int nbd_parse_error_payload(NBDStructuredReplyChunk *chunk,
> +                                   uint8_t *payload, int *request_ret)
> +{
> +    uint32_t *error;
> +    uint16_t *message_size;
> +
> +    assert(chunk->type & (1 << 15));
> +
> +    if (chunk->length < sizeof(error) + sizeof(message_size)) {
> +        return -EINVAL;
> +    }
> +
> +    payload_advance32(&payload, &error);
> +    payload_advance16(&payload, &message_size);
> +
> +    error_report("%.*s", *message_size, payload);
> +
> +    /* TODO add special case for ERROR_OFFSET */
> +
> +    *request_ret = nbd_errno_to_system_errno(*error);
> +
> +    return 0;
> +}
> +
> +static int nbd_co_receive_offset_data_payload(NBDClientSession *s,
> +                                              QEMUIOVector *qiov)
> +{
> +    QEMUIOVector sub_qiov;
> +    uint64_t offset;
> +    size_t data_size;
> +    int ret;
> +    NBDStructuredReplyChunk *chunk = &s->reply.structured;
> +
> +    assert(nbd_reply_is_structured(&s->reply));
> +
> +    if (chunk->length < sizeof(offset)) {
> +        return -EINVAL;
> +    }
> +
> +    if (nbd_read(s->ioc, &offset, sizeof(offset), NULL) < 0) {
> +        return -EIO;
> +    }
> +    be64_to_cpus(&offset);
> +
> +    data_size = chunk->length - sizeof(offset);
> +    if (offset + data_size > qiov->size) {
> +        return -EINVAL;
> +    }
> +
> +    qemu_iovec_init(&sub_qiov, qiov->niov);
> +    qemu_iovec_concat(&sub_qiov, qiov, offset, data_size);
> +    ret = qio_channel_readv_all(s->ioc, sub_qiov.iov, sub_qiov.niov, NULL);
> +    qemu_iovec_destroy(&sub_qiov);
> +
> +    return ret < 0 ? -EIO : 0;
> +}
> +
> +#define NBD_MAX_MALLOC_PAYLOAD 1000
> +static int nbd_co_receive_structured_payload(NBDClientSession *s,
> +                                             void **payload)
> +{
> +    int ret;
> +    uint32_t len;
> +
> +    assert(nbd_reply_is_structured(&s->reply));
> +
> +    len = s->reply.structured.length;
> +
> +    if (len == 0) {
> +        return 0;
> +    }
> +
> +    if (payload == NULL) {
> +        return -EINVAL;
> +    }
> +
> +    if (len > NBD_MAX_MALLOC_PAYLOAD) {
> +        return -EINVAL;
> +    }
> +
> +    *payload = qemu_memalign(8, len);
> +    ret = nbd_read(s->ioc, *payload, len, NULL);
> +    if (ret < 0) {
> +        qemu_vfree(*payload);
> +        *payload = NULL;
> +        return ret;
> +    }
> +
> +    return 0;
> +}
> +
> +/* nbd_co_do_receive_one_chunk
> + * for simple reply:
> + *   set request_ret to received reply error
> + *   if qiov is not NULL: read payload to @qiov
> + * for structured reply chunk:
> + *   if error chunk: read payload, set @request_ret, do not set @payload
> + *   else if offset_data chunk: read payload data to @qiov, do not set @payload
> + *   else: read payload to @payload
> + */
> +static int nbd_co_do_receive_one_chunk(NBDClientSession *s, uint64_t handle,
> +                                       bool only_structured, int *request_ret,
> +                                       QEMUIOVector *qiov, void **payload)
>   {
>       int ret;
>       int i = HANDLE_TO_INDEX(s, handle);
> +    void *local_payload = NULL;
> +
> +    if (payload) {
> +        *payload = NULL;
> +    }
> +    *request_ret = 0;
>   
>       /* Wait until we're woken up by nbd_read_reply_entry.  */
>       s->requests[i].receiving = true;
>       qemu_coroutine_yield();
>       s->requests[i].receiving = false;
>       if (!s->ioc || s->quit) {
> -        ret = -EIO;
> -    } else {
> -        assert(s->reply.handle == handle);
> -        ret = -nbd_errno_to_system_errno(s->reply.simple.error);
> -        if (qiov && ret == 0) {
> -            if (qio_channel_readv_all(s->ioc, qiov->iov, qiov->niov,
> -                                      NULL) < 0) {
> -                ret = -EIO;
> -                s->quit = true;
> -            }
> +        return -EIO;
> +    }
> +
> +    assert(s->reply.handle == handle);
> +
> +    if (nbd_reply_is_simple(&s->reply)) {
> +        if (only_structured) {
> +            return -EINVAL;
>           }
>   
> -        /* Tell the read handler to read another header.  */
> -        s->reply.handle = 0;
> +        *request_ret = -nbd_errno_to_system_errno(s->reply.simple.error);
> +        if (*request_ret < 0 || !qiov) {
> +            return 0;
> +        }
> +
> +        return qio_channel_readv_all(s->ioc, qiov->iov, qiov->niov,
> +                                     NULL) < 0 ? -EIO : 0;
>       }
>   
> -    s->requests[i].coroutine = NULL;
> +    /* handle structured reply chunk */
> +    assert(s->info.structured_reply);
> +
> +    if (s->reply.structured.type == NBD_SREP_TYPE_NONE) {
> +        return 0;
> +    }
> +
> +    if (s->reply.structured.type == NBD_SREP_TYPE_OFFSET_DATA) {
> +        if (!qiov) {
> +            return -EINVAL;
> +        }
> +
> +        return nbd_co_receive_offset_data_payload(s, qiov);
> +    }
> +
> +    if (nbd_srep_type_is_error(s->reply.structured.type)) {
> +        payload = &local_payload;
> +    }
> +
> +    ret = nbd_co_receive_structured_payload(s, payload);
> +    if (ret < 0) {
> +        return ret;
> +    }
> +
> +    if (nbd_srep_type_is_error(s->reply.structured.type)) {
> +        ret = nbd_parse_error_payload(&s->reply.structured, local_payload,
> +                                      request_ret);
> +        qemu_vfree(local_payload);
> +        return ret;
> +    }
> +
> +    return 0;
> +}
> +
> +/* nbd_co_receive_one_chunk
> + * Read reply, wake up read_reply_co and set s->quit if needed.
> + * Return value is a fatal error code or normal nbd reply error code
> + */
> +static int nbd_co_receive_one_chunk(NBDClientSession *s, uint64_t handle,
> +                                    bool only_structured,
> +                                    QEMUIOVector *qiov, NBDReply *reply,
> +                                    void **payload)
> +{
> +    int request_ret;
> +    int ret = nbd_co_do_receive_one_chunk(s, handle, only_structured,
> +                                          &request_ret, qiov, payload);
> +
> +    if (ret < 0) {
> +        s->quit = true;
> +    } else {
> +        /* For assert at loop start in nbd_read_reply_entry */
> +        if (reply) {
> +            *reply = s->reply;
> +        }
> +        s->reply.handle = 0;
> +        ret = request_ret;
> +    }
>   
> -    /* Kick the read_reply_co to get the next reply.  */
>       if (s->read_reply_co) {
>           aio_co_wake(s->read_reply_co);
>       }
>   
> +    return ret;
> +}
> +
> +static int nbd_co_receive_return_code(NBDClientSession *s, uint64_t handle)
> +{
> +    int ret = 0;
> +    int i = HANDLE_TO_INDEX(s, handle);
> +    NBDReply reply;
> +    bool only_structured = false;
> +
> +    do {
> +        int rc = nbd_co_receive_one_chunk(s, handle, only_structured,
> +                                          NULL, &reply, NULL);
> +        if (rc < 0 || nbd_reply_is_simple(&reply)) {
> +            if (ret == 0) {
> +                ret = rc;
> +            }
> +            only_structured = true;
> +            continue;
> +        }
> +
> +        if (reply.structured.type != NBD_SREP_TYPE_NONE) {
> +            /* not allowed reply type */
> +            ret = -EIO;
> +            s->quit = true;
> +            break;
> +        }
> +
> +        only_structured = true;
> +    } while (!s->quit && nbd_reply_is_structured(&s->reply) &&
> +             !(s->reply.structured.flags & NBD_SREP_FLAG_DONE));
> +
> +    s->requests[i].coroutine = NULL;
> +
> +    qemu_co_mutex_lock(&s->send_mutex);
> +    s->in_flight--;
> +    qemu_co_queue_next(&s->free_sema);
> +    qemu_co_mutex_unlock(&s->send_mutex);
> +
> +    return ret;
> +}
> +
> +static int nbd_co_receive_cmdread_reply(NBDClientSession *s, uint64_t handle,
> +                                        QEMUIOVector *qiov)
> +{
> +    int ret = 0;
> +    int i = HANDLE_TO_INDEX(s, handle);
> +    bool only_structured = s->info.structured_reply;
> +    NBDReply reply;
> +
> +    do {
> +        void *payload = NULL;
> +        int rc = nbd_co_receive_one_chunk(s, handle, only_structured,
> +                                          qiov, &reply, &payload);
> +        if (rc < 0 || nbd_reply_is_simple(&reply)) {
> +            if (ret == 0) {
> +                ret = rc;
> +            }
> +            only_structured = true;
> +            continue;
> +        }
> +
> +        switch (reply.structured.type) {
> +        case NBD_SREP_TYPE_OFFSET_DATA:
> +            /* special cased in nbd_co_receive_one_chunk, data is already
> +             * in qiov */
> +            break;
> +        case NBD_SREP_TYPE_OFFSET_HOLE:
> +            ret = nbd_parse_offset_hole_payload(&reply.structured, payload,
> +                                                qiov);

we can loose previous error here, if ret is already < 0. rc should be used.

> +            if (ret < 0) {
> +                s->quit = true;
> +            }
> +            break;
> +        default:
> +            /* not allowed reply type */
> +            ret = -EINVAL;
> +            s->quit = true;
> +        }
> +
> +        qemu_vfree(payload);
> +        only_structured = true;
> +    } while (!s->quit && nbd_reply_is_structured(&s->reply) &&
> +             !(s->reply.structured.flags & NBD_SREP_FLAG_DONE));
> +
> +
> +    s->requests[i].coroutine = NULL;
> +
>       qemu_co_mutex_lock(&s->send_mutex);
>       s->in_flight--;
>       qemu_co_queue_next(&s->free_sema);
> @@ -226,30 +522,31 @@ static int nbd_co_receive_reply(NBDClientSession *s,
>   
>   static int nbd_co_request(BlockDriverState *bs,
>                             NBDRequest *request,
> -                          QEMUIOVector *qiov)
> +                          QEMUIOVector *write_qiov)
>   {
>       NBDClientSession *client = nbd_get_client_session(bs);
>       int ret;
>   
> -    if (qiov) {
> -        assert(request->type == NBD_CMD_WRITE || request->type == NBD_CMD_READ);
> -        assert(request->len == iov_size(qiov->iov, qiov->niov));
> +    assert(request->type != NBD_CMD_READ);
> +    if (write_qiov) {
> +        assert(request->type == NBD_CMD_WRITE);
> +        assert(request->len == iov_size(write_qiov->iov, write_qiov->niov));
>       } else {
> -        assert(request->type != NBD_CMD_WRITE && request->type != NBD_CMD_READ);
> +        assert(request->type != NBD_CMD_WRITE);
>       }
> -    ret = nbd_co_send_request(bs, request,
> -                              request->type == NBD_CMD_WRITE ? qiov : NULL);
> +    ret = nbd_co_send_request(bs, request, write_qiov);
>       if (ret < 0) {
>           return ret;
>       }
>   
> -    return nbd_co_receive_reply(client, request->handle,
> -                                request->type == NBD_CMD_READ ? qiov : NULL);
> +    return nbd_co_receive_return_code(client, request->handle);
>   }
>   
>   int nbd_client_co_preadv(BlockDriverState *bs, uint64_t offset,
>                            uint64_t bytes, QEMUIOVector *qiov, int flags)
>   {
> +    int ret;
> +    NBDClientSession *client = nbd_get_client_session(bs);
>       NBDRequest request = {
>           .type = NBD_CMD_READ,
>           .from = offset,
> @@ -259,7 +556,12 @@ int nbd_client_co_preadv(BlockDriverState *bs, uint64_t offset,
>       assert(bytes <= NBD_MAX_BUFFER_SIZE);
>       assert(!flags);
>   
> -    return nbd_co_request(bs, &request, qiov);
> +    ret = nbd_co_send_request(bs, &request, NULL);
> +    if (ret < 0) {
> +        return ret;
> +    }
> +
> +    return nbd_co_receive_cmdread_reply(client, request.handle, qiov);
>   }
>   
>   int nbd_client_co_pwritev(BlockDriverState *bs, uint64_t offset,
> diff --git a/nbd/client.c b/nbd/client.c
> index a38e1a7d8e..2f256ee771 100644
> --- a/nbd/client.c
> +++ b/nbd/client.c
> @@ -687,6 +687,13 @@ int nbd_receive_negotiate(QIOChannel *ioc, const char *name,
>           if (fixedNewStyle) {
>               int result;
>   
> +            result = nbd_request_simple_option(ioc, NBD_OPT_STRUCTURED_REPLY,
> +                                               errp);
> +            if (result < 0) {
> +                goto fail;
> +            }
> +            info->structured_reply = result == 1;
> +
>               /* Try NBD_OPT_GO first - if it works, we are done (it
>                * also gives us a good message if the server requires
>                * TLS).  If it is not available, fall back to


-- 
Best regards,
Vladimir

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Qemu-devel] [PATCH v2 10/10] nbd: Minimal structured read for client
  2017-10-10  8:02   ` Paolo Bonzini
@ 2017-10-10 14:55     ` Vladimir Sementsov-Ogievskiy
  2017-10-10 15:00       ` Paolo Bonzini
  2017-10-11 16:59     ` Vladimir Sementsov-Ogievskiy
  1 sibling, 1 reply; 31+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2017-10-10 14:55 UTC (permalink / raw)
  To: Paolo Bonzini, qemu-block, qemu-devel; +Cc: mreitz, kwolf, eblake, den

10.10.2017 11:02, Paolo Bonzini wrote:
> On 09/10/2017 19:27, Vladimir Sementsov-Ogievskiy wrote:
>> +    int ret = nbd_co_do_receive_one_chunk(s, handle, only_structured,
>> +                                          &request_ret, qiov, payload);
>> +
>> +    if (ret < 0) {
>> +        s->quit = true;
>> +    } else {
>> +        /* For assert at loop start in nbd_read_reply_entry */
>> +        if (reply) {
>> +            *reply = s->reply;
>> +        }
> reply is always non-NULL, right?
>
>> +        s->reply.handle = 0;
>> +        ret = request_ret;
>> +    }
>>   
> ...
>
>> +static int nbd_co_receive_return_code(NBDClientSession *s, uint64_t handle)
>> +{
>> +    int ret = 0;
>> +    int i = HANDLE_TO_INDEX(s, handle);
>> +    NBDReply reply;
>> +    bool only_structured = false;
>> +
>> +    do {
>> +        int rc = nbd_co_receive_one_chunk(s, handle, only_structured,
>> +                                          NULL, &reply, NULL);
> Is rc > 0 propagated correctly?
>
>> +        if (rc < 0 || nbd_reply_is_simple(&reply)) {
>> +            if (ret == 0) {
>> +                ret = rc;
>> +            }
>> +            only_structured = true;
>> +            continue;
>> +        }
>> +    } while (!s->quit && nbd_reply_is_structured(&s->reply) &&
>> +             !(s->reply.structured.flags & NBD_SREP_FLAG_DONE));
>> +
>> +    s->requests[i].coroutine = NULL;
>> +
>> +    qemu_co_mutex_lock(&s->send_mutex);
>> +    s->in_flight--;
>> +    qemu_co_queue_next(&s->free_sema);
>> +    qemu_co_mutex_unlock(&s->send_mutex);
> The code after the loop is duplicated.
>
> An idea could be to wrap nbd_co_receive_one_chunk with an iterator like
>
> typedef struct NBDReplyChunkIter {
>      int ret;
>      bool done, only_structured;
> } NBDReplyChunkIter;
>
> #define NBD_FOREACH_REPLY_CHUNK(s, handle, iter, qiov, reply, payload, \
>                                  structured)                            \
>      for (iter = (NBDReplyChunkIter) { .only_structured = structured }; \
>           nbd_reply_chunk_iter_receive(s, &iter, qiov, &reply, handle,  \
>                                        payload);;)
>
> bool coroutine_fn nbd_reply_chunk_iter_receive(...)
> {
>      if (s->quit) {
>          if (iter->ret == 0) {
>              iter->ret = -EIO;
>          }
>          goto break_loop;
>      }
>      /* Handle the end of the previous iteration.  */
>      if (iter->done) {
>          goto break_loop;
>      }
>
>      rc = nbd_co_receive_one_chunk(s, handle, iter->only_structured,
>                                    qiov, reply, payload);
>      if (rc < 0) {
>          assert(s->quit);
>          if (iter->ret == 0) {
>              iter->ret = rc;
>          }
>          goto break_loop;
>      }
>
>      /* No structured reply?  Do not execute the body of
>       * NBD_FOREACH_REPLY_CHUNK.
>       */
>      if (nbd_reply_is_simple(&s->reply) || s->quit) {
>          goto break_loop;
>      }
>
>      iter->only_structured = true;
>      if (s->reply.structured.flags & NBD_SREP_FLAG_DONE) {
>          iter->ret = rc;
>          iter->done = true;
>          /* Execute the loop body, but this iteration is the last.  */
>          return true;
>      }
>
>      /* An error reply must have NBD_SREP_FLAG_DONE set, otherwise it is
>       * a protocol error (FIXME: is this true?  but if not, how do you
>       * cope with multiple error replies?)
>       */
>      if (rc > 0) {
>          s->quit = true;
>          iter->ret = -EIO;
>          goto break_loop;
>      }
>      return true;
>
> break_loop:
>      iter->done = true;
>
>      s->requests[HANDLE_TO_INDEX(s, handle)].coroutine = NULL;
>
>      qemu_co_mutex_lock(&s->send_mutex);
>      s->in_flight--;
>      qemu_co_queue_next(&s->free_sema);
>      qemu_co_mutex_unlock(&s->send_mutex);
>      return false;
> }
>
> so this loop could be written like
>
>      FOR_EACH_REPLY_CHUNK(s, handle, iter, NULL, &reply, NULL) {
>          if (reply.structured.type != NBD_SREP_TYPE_NONE) {
>              /* not allowed reply type */
>              s->quit = true;
>              break;
>          }
>      }
>      return iter.ret;
>
> and likewise for nbd_co_receive_cmdread_reply.  The iterator is a bit
> more complex, but it abstracts all the protocol handling and avoids lots
> of duplicated code.
>
> Paolo
>
>> +    return ret;
>> +}

Hmm, would it be simpler just pass a function pointer, which should be 
called on each loop iteration?
So, we will return to one common func nbd_co_receive_reply, but with two 
additional parameters: func and opaque?

-- 
Best regards,
Vladimir

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Qemu-devel] [PATCH v2 10/10] nbd: Minimal structured read for client
  2017-10-10 14:55     ` Vladimir Sementsov-Ogievskiy
@ 2017-10-10 15:00       ` Paolo Bonzini
  2017-10-11  9:22         ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 31+ messages in thread
From: Paolo Bonzini @ 2017-10-10 15:00 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block, qemu-devel
  Cc: mreitz, kwolf, eblake, den

On 10/10/2017 16:55, Vladimir Sementsov-Ogievskiy wrote:
> Hmm, would it be simpler just pass a function pointer, which should be
> called on each loop iteration?
> So, we will return to one common func nbd_co_receive_reply, but with two
> additional parameters: func and opaque?

Function pointers typically result in having to pass the state around in
a structure, for all the callers.

An iterator also has to package the state in a structure, but it is only
done once.

So function pointers would be simpler in the beginning, but would not
scale as well.

Paolo

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Qemu-devel] [PATCH v2 10/10] nbd: Minimal structured read for client
  2017-10-10 15:00       ` Paolo Bonzini
@ 2017-10-11  9:22         ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 31+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2017-10-11  9:22 UTC (permalink / raw)
  To: Paolo Bonzini, qemu-block, qemu-devel; +Cc: mreitz, kwolf, eblake, den

10.10.2017 18:00, Paolo Bonzini wrote:
> On 10/10/2017 16:55, Vladimir Sementsov-Ogievskiy wrote:
>> Hmm, would it be simpler just pass a function pointer, which should be
>> called on each loop iteration?
>> So, we will return to one common func nbd_co_receive_reply, but with two
>> additional parameters: func and opaque?
> Function pointers typically result in having to pass the state around in
> a structure, for all the callers.
>
> An iterator also has to package the state in a structure, but it is only
> done once.
>
> So function pointers would be simpler in the beginning, but would not
> scale as well.
>
> Paolo

Understand. OK, I'll try your suggestion.

-- 
Best regards,
Vladimir

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Qemu-devel] [PATCH v2 10/10] nbd: Minimal structured read for client
  2017-10-10  8:02   ` Paolo Bonzini
  2017-10-10 14:55     ` Vladimir Sementsov-Ogievskiy
@ 2017-10-11 16:59     ` Vladimir Sementsov-Ogievskiy
  1 sibling, 0 replies; 31+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2017-10-11 16:59 UTC (permalink / raw)
  To: Paolo Bonzini, qemu-block, qemu-devel; +Cc: mreitz, kwolf, eblake, den

10.10.2017 11:02, Paolo Bonzini wrote:
> On 09/10/2017 19:27, Vladimir Sementsov-Ogievskiy wrote:
>> +    int ret = nbd_co_do_receive_one_chunk(s, handle, only_structured,
>> +                                          &request_ret, qiov, payload);
>> +
>> +    if (ret < 0) {
>> +        s->quit = true;
>> +    } else {
>> +        /* For assert at loop start in nbd_read_reply_entry */
>> +        if (reply) {
>> +            *reply = s->reply;
>> +        }
> reply is always non-NULL, right?
>
>> +        s->reply.handle = 0;
>> +        ret = request_ret;
>> +    }
>>   
> ...
>
>> +static int nbd_co_receive_return_code(NBDClientSession *s, uint64_t handle)
>> +{
>> +    int ret = 0;
>> +    int i = HANDLE_TO_INDEX(s, handle);
>> +    NBDReply reply;
>> +    bool only_structured = false;
>> +
>> +    do {
>> +        int rc = nbd_co_receive_one_chunk(s, handle, only_structured,
>> +                                          NULL, &reply, NULL);
> Is rc > 0 propagated correctly?
>
>> +        if (rc < 0 || nbd_reply_is_simple(&reply)) {
>> +            if (ret == 0) {
>> +                ret = rc;
>> +            }
>> +            only_structured = true;
>> +            continue;
>> +        }
>> +    } while (!s->quit && nbd_reply_is_structured(&s->reply) &&
>> +             !(s->reply.structured.flags & NBD_SREP_FLAG_DONE));
>> +
>> +    s->requests[i].coroutine = NULL;
>> +
>> +    qemu_co_mutex_lock(&s->send_mutex);
>> +    s->in_flight--;
>> +    qemu_co_queue_next(&s->free_sema);
>> +    qemu_co_mutex_unlock(&s->send_mutex);
> The code after the loop is duplicated.
>
> An idea could be to wrap nbd_co_receive_one_chunk with an iterator like
>
> typedef struct NBDReplyChunkIter {
>      int ret;
>      bool done, only_structured;
> } NBDReplyChunkIter;
>
> #define NBD_FOREACH_REPLY_CHUNK(s, handle, iter, qiov, reply, payload, \
>                                  structured)                            \
>      for (iter = (NBDReplyChunkIter) { .only_structured = structured }; \
>           nbd_reply_chunk_iter_receive(s, &iter, qiov, &reply, handle,  \
>                                        payload);;)
>
> bool coroutine_fn nbd_reply_chunk_iter_receive(...)
> {
>      if (s->quit) {
>          if (iter->ret == 0) {
>              iter->ret = -EIO;
>          }
>          goto break_loop;
>      }
>      /* Handle the end of the previous iteration.  */
>      if (iter->done) {
>          goto break_loop;
>      }
>
>      rc = nbd_co_receive_one_chunk(s, handle, iter->only_structured,
>                                    qiov, reply, payload);
>      if (rc < 0) {
>          assert(s->quit);
>          if (iter->ret == 0) {
>              iter->ret = rc;
>          }
>          goto break_loop;
>      }
>
>      /* No structured reply?  Do not execute the body of
>       * NBD_FOREACH_REPLY_CHUNK.
>       */
>      if (nbd_reply_is_simple(&s->reply) || s->quit) {
>          goto break_loop;
>      }
>
>      iter->only_structured = true;
>      if (s->reply.structured.flags & NBD_SREP_FLAG_DONE) {
>          iter->ret = rc;
>          iter->done = true;
>          /* Execute the loop body, but this iteration is the last.  */
>          return true;
>      }
>
>      /* An error reply must have NBD_SREP_FLAG_DONE set, otherwise it is
>       * a protocol error (FIXME: is this true?  but if not, how do you
>       * cope with multiple error replies?)

unfortunately it's not true, spec doesn't forbid several error replies 
on one request.
Furthermore, we have NBD_REPLY_TYPE_ERROR_OFFSET, which have ability of 
several error chunks it its nature.

>       */
>      if (rc > 0) {
>          s->quit = true;
>          iter->ret = -EIO;
>          goto break_loop;
>      }
>      return true;
>
> break_loop:
>      iter->done = true;
>
>      s->requests[HANDLE_TO_INDEX(s, handle)].coroutine = NULL;
>
>      qemu_co_mutex_lock(&s->send_mutex);
>      s->in_flight--;
>      qemu_co_queue_next(&s->free_sema);
>      qemu_co_mutex_unlock(&s->send_mutex);
>      return false;
> }
>
> so this loop could be written like
>
>      FOR_EACH_REPLY_CHUNK(s, handle, iter, NULL, &reply, NULL) {
>          if (reply.structured.type != NBD_SREP_TYPE_NONE) {
>              /* not allowed reply type */
>              s->quit = true;
>              break;
>          }
>      }
>      return iter.ret;
>
> and likewise for nbd_co_receive_cmdread_reply.  The iterator is a bit
> more complex, but it abstracts all the protocol handling and avoids lots
> of duplicated code.
>
> Paolo
>
>> +    return ret;
>> +}


-- 
Best regards,
Vladimir

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Qemu-devel] [PATCH v2 04/10] nbd-server: refactor simple reply sending
  2017-10-09 19:18   ` Eric Blake
  2017-10-10  8:40     ` Daniel P. Berrange
@ 2017-10-11 17:24     ` Vladimir Sementsov-Ogievskiy
  1 sibling, 0 replies; 31+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2017-10-11 17:24 UTC (permalink / raw)
  To: Eric Blake, qemu-block, qemu-devel
  Cc: mreitz, kwolf, pbonzini, den, Daniel P. Berrange

09.10.2017 22:18, Eric Blake wrote:
> [adding Dan for a qio question - search for your name below]
>
> On 10/09/2017 12:27 PM, Vladimir Sementsov-Ogievskiy wrote:
>> Get rid of calculating structure fields offsets by hand and set_cork,
>> rename nbd_co_send_reply to nbd_co_send_simple_reply. Do not use
>> NBDReply which will be upgraded in future patches to handle both simple
>> and structured replies and will be used only in the client.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> ---
>>   include/block/nbd.h |  6 ++++
>>   nbd/nbd-internal.h  |  9 +++++
>>   nbd/server.c        | 97 ++++++++++++++++++++++-------------------------------
>>   3 files changed, 56 insertions(+), 56 deletions(-)
> Feels a bit big for one patch, although I'm borderline for being able to
> take it as-is.
>
>> diff --git a/include/block/nbd.h b/include/block/nbd.h
>> index 707fd37575..49008980f4 100644
>> --- a/include/block/nbd.h
>> +++ b/include/block/nbd.h
>> @@ -63,6 +63,12 @@ struct NBDReply {
>>   };
>>   typedef struct NBDReply NBDReply;
>>   
>> +typedef struct NBDSimpleReply {
>> +    uint32_t magic;  /* NBD_SIMPLE_REPLY_MAGIC */
>> +    uint32_t error;
>> +    uint64_t handle;
>> +} QEMU_PACKED NBDSimpleReply;
>> +
> So (one of?) the goal of this patch is to use a packed struct for
> on-the-wire transmissions, instead of manually packing things into
> offsets by hand.  I can live with that.
>
>>   /* Transmission (export) flags: sent from server to client during handshake,
>>      but describe what will happen during transmission */
>>   #define NBD_FLAG_HAS_FLAGS      (1 << 0)        /* Flags are there */
>> diff --git a/nbd/nbd-internal.h b/nbd/nbd-internal.h
>> index 2d6663de23..320abef296 100644
>> --- a/nbd/nbd-internal.h
>> +++ b/nbd/nbd-internal.h
>> @@ -113,6 +113,15 @@ static inline int nbd_write(QIOChannel *ioc, const void *buffer, size_t size,
>>       return qio_channel_write_all(ioc, buffer, size, errp) < 0 ? -EIO : 0;
>>   }
>>   
>> +/* nbd_writev
>> + * Writes @size bytes to @ioc. Returns 0 on success.
>> + */
>> +static inline int nbd_writev(QIOChannel *ioc, const struct iovec *iov,
>> +                             size_t niov, Error **errp)
> Comment is wrong; you don't have a @size parameter.
>
>> +{
>> +    return qio_channel_writev_all(ioc, iov, niov, errp) < 0 ? -EIO : 0;
>> +}
> Do we really need this, or can we just directly use
> qio_channel_writev_all() at the lone site that uses this new interface?
>
>> +
>>   struct NBDTLSHandshakeData {
>>       GMainLoop *loop;
>>       bool complete;
>> diff --git a/nbd/server.c b/nbd/server.c
>> index a1b21a6951..57d5598e0f 100644
>> --- a/nbd/server.c
>> +++ b/nbd/server.c
>> @@ -902,26 +902,6 @@ static int nbd_receive_request(QIOChannel *ioc, NBDRequest *request,
>>       return 0;
>>   }
>>   
>> -static int nbd_send_reply(QIOChannel *ioc, NBDReply *reply, Error **errp)
> So instead of taking a non-packed struct, doing the hand-packing here,
> then sending the message, you split this up...
>
>> -static int nbd_co_send_reply(NBDRequestData *req, NBDReply *reply, int len,
>> -                             Error **errp)
>> +static int coroutine_fn nbd_co_send_iov(NBDClient *client, struct iovec *iov,
>> +                                        unsigned niov, Error **errp)
>>   {
>> -    NBDClient *client = req->client;
>>       int ret;
>>   
>>       g_assert(qemu_in_coroutine());
>> -
>> -    trace_nbd_co_send_reply(reply->handle, reply->error, len);
>> -
>>       qemu_co_mutex_lock(&client->send_lock);
>>       client->send_coroutine = qemu_coroutine_self();
>>   
>> -    if (!len) {
>> -        ret = nbd_send_reply(client->ioc, reply, errp);
>> -    } else {
>> -        qio_channel_set_cork(client->ioc, true);
>> -        ret = nbd_send_reply(client->ioc, reply, errp);
>> -        if (ret == 0) {
>> -            ret = nbd_write(client->ioc, req->data, len, errp);
>> -            if (ret < 0) {
>> -                ret = -EIO;
>> -            }
>> -        }
>> -        qio_channel_set_cork(client->ioc, false);
>> -    }
>> +    ret = nbd_writev(client->ioc, iov, niov, errp);
> ...the transmission is now done via iov so that you can send two buffers
> at once without having to play with the cork,...
>
> Dan - are we guaranteed that qio_channel_writev_all() with a multi-iov
> argument called by one thread will send all its data in order, without
> being interleaved with any other parallel coroutine also calling
> qio_channel_writev_all()?  In other words, it looks like the NBD code
> was previously using manual cork changes to ensure that two halves of a
> message were sent without interleave, but Vladimir is now relying on the
> qio code to do that under the hood.
>
>>   
>>       client->send_coroutine = NULL;
>>       qemu_co_mutex_unlock(&client->send_lock);
>> +
>>       return ret;
>>   }
>>   
>> +static inline void set_be_simple_reply(NBDSimpleReply *reply, uint64_t error,
>> +                                       uint64_t handle)
>> +{
>> +    stl_be_p(&reply->magic, NBD_SIMPLE_REPLY_MAGIC);
>> +    stl_be_p(&reply->error, error);
>> +    stq_be_p(&reply->handle, handle);
>> +}
> ...loading the packed struct is now its own helper function,...
>
>> +
>> +static int nbd_co_send_simple_reply(NBDClient *client,
>> +                                    uint64_t handle,
>> +                                    uint32_t error,
>> +                                    void *data,
>> +                                    size_t size,
>> +                                    Error **errp)
>> +{
>> +    NBDSimpleReply reply;
>> +    struct iovec iov[] = {
>> +        {.iov_base = &reply, .iov_len = sizeof(reply)},
>> +        {.iov_base = data, .iov_len = size}
>> +    };
>> +
>> +    trace_nbd_co_send_reply(handle, error, size);
>> +
>> +    set_be_simple_reply(&reply, system_errno_to_nbd_errno(error), handle);
>> +
>> +    return nbd_co_send_iov(client, iov, size ? 2 : 1, errp);
>> +}
> ...and the function gets renamed, all at once.  Okay, maybe it IS easier
> to review if split.  A good split might be:
>
> Focus on naming: Rename nbd_co_send_reply() to
> nbd_co_send_simple_reply() in one patch (in fact, this part should be
> squashed with the rename of the magic number in 3/10)
>
> Focus on conversion to on-the-wire representation: Add the packed struct
> and set_be_simple_reply() in one patch, but still using corks
>
> Focus on simplified transmission: Switch to using iov to send both
> halves of a two-part message in one transaction
>
>> +
>>   /* nbd_co_receive_request
>>    * Collect a client request. Return 0 if request looks valid, -EIO to drop
>>    * connection right away, and any other negative value to report an error to
>> @@ -1331,7 +1324,6 @@ static coroutine_fn void nbd_trip(void *opaque)
>>       NBDExport *exp = client->exp;
>>       NBDRequestData *req;
>>       NBDRequest request = { 0 };    /* GCC thinks it can be used uninitialized */
>> -    NBDReply reply;
>>       int ret;
>>       int flags;
>>       int reply_data_len = 0;
>> @@ -1351,11 +1343,7 @@ static coroutine_fn void nbd_trip(void *opaque)
>>           goto disconnect;
>>       }
>>   
>> -    reply.handle = request.handle;
>> -    reply.error = 0;
>> -
>>       if (ret < 0) {
>> -        reply.error = -ret;
>>           goto reply;
>>       }
> Oh, and you're doing a fourth thing - tracking the error directly in ret
> instead of buried in reply.error.  Could also be split.
>
>>   
>>   reply:
>>       if (local_err) {
>> -        /* If we are here local_err is not fatal error, already stored in
>> -         * reply.error */
>> +        /* If we are here local_err is not fatal error, which should be sent
>> +         * to client. */
> Reads awkwardly (even pre-patch); maybe this is better:
>
> /* If we get here, local_err was not a fatal error, and should be sent
> to the client. */
>
>>           error_report_err(local_err);
>>           local_err = NULL;
>>       }
>>   
>> -    if (nbd_co_send_reply(req, &reply, reply_data_len, &local_err) < 0) {
>> +    if (nbd_co_send_simple_reply(req->client, request.handle,
>> +                                 ret < 0 ? -ret : 0,
>> +                                 req->data, reply_data_len, &local_err) < 0)
>> +    {
>>           error_prepend(&local_err, "Failed to send reply: ");
>>           goto disconnect;
>>       }
>>
> I'm not necessarily going to insist that you split things if I don't see
> any other reasons to respin the series; but I might also try to do the
> split locally and post it for you to see my thinking (remember, a series
> of small patches that each do one thing well can be easier to review
> than large patches, even if it results in more emails).
>

Done, will send tomorrow. I agree, there are too many things in one patch.

-- 
Best regards,
Vladimir

^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2017-10-11 17:24 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-10-09 17:27 [Qemu-devel] [PATCH v2 00/10] nbd minimal structured read Vladimir Sementsov-Ogievskiy
2017-10-09 17:27 ` [Qemu-devel] [PATCH v2 01/10] block/nbd-client: assert qiov len once in nbd_co_request Vladimir Sementsov-Ogievskiy
2017-10-09 18:37   ` Eric Blake
2017-10-09 17:27 ` [Qemu-devel] [PATCH v2 1/7] block/nbd-client: refactor nbd_co_receive_reply Vladimir Sementsov-Ogievskiy
2017-10-09 17:27 ` [Qemu-devel] [PATCH v2 2/7] block/nbd-client: exit reply-reading coroutine on incorrect handle Vladimir Sementsov-Ogievskiy
2017-10-09 17:27 ` [Qemu-devel] [PATCH v2 02/10] block/nbd-client: refactor nbd_co_receive_reply Vladimir Sementsov-Ogievskiy
2017-10-09 17:27 ` [Qemu-devel] [PATCH v2 3/7] block/nbd-client: refactor reading reply Vladimir Sementsov-Ogievskiy
2017-10-09 17:27 ` [Qemu-devel] [PATCH v2 03/10] nbd: rename NBD_REPLY_MAGIC to NBD_SIMPLE_REPLY_MAGIC Vladimir Sementsov-Ogievskiy
2017-10-09 17:27 ` [Qemu-devel] [PATCH v2 4/7] block/nbd-client: drop reply field from NBDClientSession Vladimir Sementsov-Ogievskiy
2017-10-09 17:27 ` [Qemu-devel] [PATCH v2 04/10] nbd-server: refactor simple reply sending Vladimir Sementsov-Ogievskiy
2017-10-09 19:18   ` Eric Blake
2017-10-10  8:40     ` Daniel P. Berrange
2017-10-10  9:13       ` Paolo Bonzini
2017-10-11 17:24     ` Vladimir Sementsov-Ogievskiy
2017-10-09 17:27 ` [Qemu-devel] [PATCH v2 5/7] block/nbd-client: nbd_co_send_request: return -EIO if s->quit was set in parallel Vladimir Sementsov-Ogievskiy
2017-10-09 17:27 ` [Qemu-devel] [PATCH v2 05/10] nbd: header constants indenting Vladimir Sementsov-Ogievskiy
2017-10-09 19:21   ` Eric Blake
2017-10-09 17:27 ` [Qemu-devel] [PATCH v2 6/7] block/nbd-client: early fail nbd_read_reply_entry if s->quit is set Vladimir Sementsov-Ogievskiy
2017-10-09 17:27 ` [Qemu-devel] [PATCH v2 06/10] nbd: Minimal structured read for server Vladimir Sementsov-Ogievskiy
2017-10-09 17:27 ` [Qemu-devel] [PATCH v2 7/7] block/nbd-client: do not yield from nbd_read_reply_entry Vladimir Sementsov-Ogievskiy
2017-10-09 17:27 ` [Qemu-devel] [PATCH v2 07/10] nbd/client: refactor nbd_receive_starttls Vladimir Sementsov-Ogievskiy
2017-10-09 17:27 ` [Qemu-devel] [PATCH v2 08/10] nbd: share some nbd entities to be reused in block/nbd-client.c Vladimir Sementsov-Ogievskiy
2017-10-09 17:27 ` [Qemu-devel] [PATCH v2 09/10] nbd/client: prepare nbd_receive_reply for structured reply Vladimir Sementsov-Ogievskiy
2017-10-09 17:27 ` [Qemu-devel] [PATCH v2 10/10] nbd: Minimal structured read for client Vladimir Sementsov-Ogievskiy
2017-10-10  8:02   ` Paolo Bonzini
2017-10-10 14:55     ` Vladimir Sementsov-Ogievskiy
2017-10-10 15:00       ` Paolo Bonzini
2017-10-11  9:22         ` Vladimir Sementsov-Ogievskiy
2017-10-11 16:59     ` Vladimir Sementsov-Ogievskiy
2017-10-10 14:43   ` Vladimir Sementsov-Ogievskiy
2017-10-09 17:33 ` [Qemu-devel] [PATCH v2 00/10] nbd minimal structured read Vladimir Sementsov-Ogievskiy

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.