* [PATCH v8 0/4] vhost-user block device backend implementation
@ 2020-06-04 23:35 Coiby Xu
  2020-06-04 23:35 ` [PATCH v8 1/4] Allow vu_message_read to be replaced Coiby Xu
                   ` (5 more replies)
  0 siblings, 6 replies; 21+ messages in thread
From: Coiby Xu @ 2020-06-04 23:35 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, bharatlkmlkvm, Coiby Xu, stefanha

v8
 - retry connecting to the socket server to fix an ASan error
 - fix license naming issue

v7
 - fix docker-test-debug@fedora errors by freeing malloced memory

v6
 - add missing license header and include guard
 - vhost-user server only serves one client at a time
 - fix a bug in custom vu_message_read
 - using qemu-storage-daemon to start vhost-user-blk-server
 - a bug fix to pass docker-test-clang@ubuntu

v5:
 * re-use vu_kick_cb in libvhost-user
 * keep processing VhostUserMsg in the same coroutine until the
   AioContext is detached or attached
 * Spawn separate coroutine for each VuVirtqElement
 * Other changes including relocating vhost-user-blk-server.c, coding
   style etc.

v4:
 * add object properties in class_init
 * relocate vhost-user-blk-test
 * other changes including using SocketAddress, coding style, etc.

v3:
 * separate generic vhost-user-server code from vhost-user-blk-server
   code
 * rewrite vu_message_read and the kick handler function as coroutines to
   directly call blk_co_preadv, blk_co_pwritev, etc.
 * add aio_context notifier functions to support multi-threading model
 * other fixes regarding coding style, warning report, etc.

v2:
 * Only enable this feature for Linux because eventfd is a Linux-specific
   feature


This patch series is an implementation of a vhost-user block device
backend server, thanks to Stefan's and Kevin's guidance.

The vhost-user block device backend server is a UserCreatable object and
can be started using object_add,

 (qemu) object_add vhost-user-blk-server,id=ID,unix-socket=/tmp/vhost-user-blk_vhost.socket,node-name=DRIVE_NAME,writable=off,blk-size=512
 (qemu) object_del ID

or appending the "-object" option when starting QEMU,

  $ -object vhost-user-blk-server,id=disk,unix-socket=/tmp/vhost-user-blk_vhost.socket,node-name=DRIVE_NAME,writable=off,blk-size=512

Then a vhost-user client can connect to the server backend.
For example, QEMU could act as a client,

  $ -m 256 -object memory-backend-memfd,id=mem,size=256M,share=on -numa node,memdev=mem -chardev socket,id=char1,path=/tmp/vhost-user-blk_vhost.socket -device vhost-user-blk-pci,id=blk0,chardev=char1

The guest OS can then access this vhost-user block device after mounting it.

Coiby Xu (4):
  Allow vu_message_read to be replaced
  generic vhost user server
  vhost-user block device backend server
  new qTest case to test the vhost-user-blk-server

 block/Makefile.objs                        |   1 +
 block/export/vhost-user-blk-server.c       | 716 ++++++++++++++++++++
 block/export/vhost-user-blk-server.h       |  34 +
 contrib/libvhost-user/libvhost-user-glib.c |   2 +-
 contrib/libvhost-user/libvhost-user.c      |  11 +-
 contrib/libvhost-user/libvhost-user.h      |  21 +
 softmmu/vl.c                               |   4 +
 tests/Makefile.include                     |   3 +-
 tests/qtest/Makefile.include               |   2 +
 tests/qtest/libqos/vhost-user-blk.c        | 130 ++++
 tests/qtest/libqos/vhost-user-blk.h        |  44 ++
 tests/qtest/libqtest.c                     |  54 +-
 tests/qtest/libqtest.h                     |  38 ++
 tests/qtest/vhost-user-blk-test.c          | 737 +++++++++++++++++++++
 tests/vhost-user-bridge.c                  |   2 +
 tools/virtiofsd/fuse_virtio.c              |   4 +-
 util/Makefile.objs                         |   1 +
 util/vhost-user-server.c                   | 406 ++++++++++++
 util/vhost-user-server.h                   |  59 ++
 19 files changed, 2229 insertions(+), 40 deletions(-)
 create mode 100644 block/export/vhost-user-blk-server.c
 create mode 100644 block/export/vhost-user-blk-server.h
 create mode 100644 tests/qtest/libqos/vhost-user-blk.c
 create mode 100644 tests/qtest/libqos/vhost-user-blk.h
 create mode 100644 tests/qtest/vhost-user-blk-test.c
 create mode 100644 util/vhost-user-server.c
 create mode 100644 util/vhost-user-server.h

--
2.26.2



^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH v8 1/4] Allow vu_message_read to be replaced
  2020-06-04 23:35 [PATCH v8 0/4] vhost-user block device backend implementation Coiby Xu
@ 2020-06-04 23:35 ` Coiby Xu
  2020-06-11 10:45   ` Stefan Hajnoczi
  2020-06-11 11:26   ` Marc-André Lureau
  2020-06-04 23:35 ` [PATCH v8 2/4] generic vhost user server Coiby Xu
                   ` (4 subsequent siblings)
  5 siblings, 2 replies; 21+ messages in thread
From: Coiby Xu @ 2020-06-04 23:35 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, bharatlkmlkvm, Coiby Xu, stefanha, Dr. David Alan Gilbert

Allow vu_message_read to be replaced by one which will make use of the
QIOChannel functions. Thus reading a vhost-user message won't stall the
guest.

Signed-off-by: Coiby Xu <coiby.xu@gmail.com>
---
 contrib/libvhost-user/libvhost-user-glib.c |  2 +-
 contrib/libvhost-user/libvhost-user.c      | 11 ++++++-----
 contrib/libvhost-user/libvhost-user.h      | 21 +++++++++++++++++++++
 tests/vhost-user-bridge.c                  |  2 ++
 tools/virtiofsd/fuse_virtio.c              |  4 ++--
 5 files changed, 32 insertions(+), 8 deletions(-)

diff --git a/contrib/libvhost-user/libvhost-user-glib.c b/contrib/libvhost-user/libvhost-user-glib.c
index 53f1ca4cdd..0df2ec9271 100644
--- a/contrib/libvhost-user/libvhost-user-glib.c
+++ b/contrib/libvhost-user/libvhost-user-glib.c
@@ -147,7 +147,7 @@ vug_init(VugDev *dev, uint16_t max_queues, int socket,
     g_assert(dev);
     g_assert(iface);
 
-    if (!vu_init(&dev->parent, max_queues, socket, panic, set_watch,
+    if (!vu_init(&dev->parent, max_queues, socket, panic, NULL, set_watch,
                  remove_watch, iface)) {
         return false;
     }
diff --git a/contrib/libvhost-user/libvhost-user.c b/contrib/libvhost-user/libvhost-user.c
index 3bca996c62..0c7368baa2 100644
--- a/contrib/libvhost-user/libvhost-user.c
+++ b/contrib/libvhost-user/libvhost-user.c
@@ -67,8 +67,6 @@
 /* The version of inflight buffer */
 #define INFLIGHT_VERSION 1
 
-#define VHOST_USER_HDR_SIZE offsetof(VhostUserMsg, payload.u64)
-
 /* The version of the protocol we support */
 #define VHOST_USER_VERSION 1
 #define LIBVHOST_USER_DEBUG 0
@@ -412,7 +410,7 @@ vu_process_message_reply(VuDev *dev, const VhostUserMsg *vmsg)
         goto out;
     }
 
-    if (!vu_message_read(dev, dev->slave_fd, &msg_reply)) {
+    if (!dev->read_msg(dev, dev->slave_fd, &msg_reply)) {
         goto out;
     }
 
@@ -647,7 +645,7 @@ vu_set_mem_table_exec_postcopy(VuDev *dev, VhostUserMsg *vmsg)
     /* Wait for QEMU to confirm that it's registered the handler for the
      * faults.
      */
-    if (!vu_message_read(dev, dev->sock, vmsg) ||
+    if (!dev->read_msg(dev, dev->sock, vmsg) ||
         vmsg->size != sizeof(vmsg->payload.u64) ||
         vmsg->payload.u64 != 0) {
         vu_panic(dev, "failed to receive valid ack for postcopy set-mem-table");
@@ -1653,7 +1651,7 @@ vu_dispatch(VuDev *dev)
     int reply_requested;
     bool need_reply, success = false;
 
-    if (!vu_message_read(dev, dev->sock, &vmsg)) {
+    if (!dev->read_msg(dev, dev->sock, &vmsg)) {
         goto end;
     }
 
@@ -1704,6 +1702,7 @@ vu_deinit(VuDev *dev)
         }
 
         if (vq->kick_fd != -1) {
+            dev->remove_watch(dev, vq->kick_fd);
             close(vq->kick_fd);
             vq->kick_fd = -1;
         }
@@ -1751,6 +1750,7 @@ vu_init(VuDev *dev,
         uint16_t max_queues,
         int socket,
         vu_panic_cb panic,
+        vu_read_msg_cb read_msg,
         vu_set_watch_cb set_watch,
         vu_remove_watch_cb remove_watch,
         const VuDevIface *iface)
@@ -1768,6 +1768,7 @@ vu_init(VuDev *dev,
 
     dev->sock = socket;
     dev->panic = panic;
+    dev->read_msg = read_msg ? read_msg : vu_message_read;
     dev->set_watch = set_watch;
     dev->remove_watch = remove_watch;
     dev->iface = iface;
diff --git a/contrib/libvhost-user/libvhost-user.h b/contrib/libvhost-user/libvhost-user.h
index f30394fab6..d756da8548 100644
--- a/contrib/libvhost-user/libvhost-user.h
+++ b/contrib/libvhost-user/libvhost-user.h
@@ -30,6 +30,8 @@
 
 #define VHOST_MEMORY_MAX_NREGIONS 8
 
+#define VHOST_USER_HDR_SIZE offsetof(VhostUserMsg, payload.u64)
+
 typedef enum VhostSetConfigType {
     VHOST_SET_CONFIG_TYPE_MASTER = 0,
     VHOST_SET_CONFIG_TYPE_MIGRATION = 1,
@@ -205,6 +207,7 @@ typedef uint64_t (*vu_get_features_cb) (VuDev *dev);
 typedef void (*vu_set_features_cb) (VuDev *dev, uint64_t features);
 typedef int (*vu_process_msg_cb) (VuDev *dev, VhostUserMsg *vmsg,
                                   int *do_reply);
+typedef bool (*vu_read_msg_cb) (VuDev *dev, int sock, VhostUserMsg *vmsg);
 typedef void (*vu_queue_set_started_cb) (VuDev *dev, int qidx, bool started);
 typedef bool (*vu_queue_is_processed_in_order_cb) (VuDev *dev, int qidx);
 typedef int (*vu_get_config_cb) (VuDev *dev, uint8_t *config, uint32_t len);
@@ -373,6 +376,23 @@ struct VuDev {
     bool broken;
     uint16_t max_queues;
 
+    /* @read_msg: custom method to read vhost-user message
+     *
+     * Read data from vhost_user socket fd and fill up
+     * the passed VhostUserMsg *vmsg struct.
+     *
+     * If reading fails, it should close the file descriptors received
+     * as the socket message's ancillary data.
+     *
+     * For details, please refer to vu_message_read in libvhost-user.c,
+     * which will be used by default if no custom method is provided
+     * when calling vu_init.
+     *
+     * Returns: true if a vhost-user message was successfully received,
+     *          false otherwise.
+     *
+     */
+    vu_read_msg_cb read_msg;
     /* @set_watch: add or update the given fd to the watch set,
      * call cb when condition is met */
     vu_set_watch_cb set_watch;
@@ -416,6 +436,7 @@ bool vu_init(VuDev *dev,
              uint16_t max_queues,
              int socket,
              vu_panic_cb panic,
+             vu_read_msg_cb read_msg,
              vu_set_watch_cb set_watch,
              vu_remove_watch_cb remove_watch,
              const VuDevIface *iface);
diff --git a/tests/vhost-user-bridge.c b/tests/vhost-user-bridge.c
index 6c3d490611..bd43607a4d 100644
--- a/tests/vhost-user-bridge.c
+++ b/tests/vhost-user-bridge.c
@@ -520,6 +520,7 @@ vubr_accept_cb(int sock, void *ctx)
                  VHOST_USER_BRIDGE_MAX_QUEUES,
                  conn_fd,
                  vubr_panic,
+                 NULL,
                  vubr_set_watch,
                  vubr_remove_watch,
                  &vuiface)) {
@@ -573,6 +574,7 @@ vubr_new(const char *path, bool client)
                      VHOST_USER_BRIDGE_MAX_QUEUES,
                      dev->sock,
                      vubr_panic,
+                     NULL,
                      vubr_set_watch,
                      vubr_remove_watch,
                      &vuiface)) {
diff --git a/tools/virtiofsd/fuse_virtio.c b/tools/virtiofsd/fuse_virtio.c
index 3b6d16a041..666945c897 100644
--- a/tools/virtiofsd/fuse_virtio.c
+++ b/tools/virtiofsd/fuse_virtio.c
@@ -980,8 +980,8 @@ int virtio_session_mount(struct fuse_session *se)
     se->vu_socketfd = data_sock;
     se->virtio_dev->se = se;
     pthread_rwlock_init(&se->virtio_dev->vu_dispatch_rwlock, NULL);
-    vu_init(&se->virtio_dev->dev, 2, se->vu_socketfd, fv_panic, fv_set_watch,
-            fv_remove_watch, &fv_iface);
+    vu_init(&se->virtio_dev->dev, 2, se->vu_socketfd, fv_panic, NULL,
+            fv_set_watch, fv_remove_watch, &fv_iface);
 
     return 0;
 }
-- 
2.26.2




* [PATCH v8 2/4] generic vhost user server
  2020-06-04 23:35 [PATCH v8 0/4] vhost-user block device backend implementation Coiby Xu
  2020-06-04 23:35 ` [PATCH v8 1/4] Allow vu_message_read to be replaced Coiby Xu
@ 2020-06-04 23:35 ` Coiby Xu
  2020-06-11 13:14   ` Stefan Hajnoczi
  2020-06-04 23:35 ` [PATCH v8 3/4] vhost-user block device backend server Coiby Xu
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 21+ messages in thread
From: Coiby Xu @ 2020-06-04 23:35 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, bharatlkmlkvm, Coiby Xu, stefanha

Share QEMU devices via the vhost-user protocol.

Only one vhost-user client can connect to the server at a time.

Signed-off-by: Coiby Xu <coiby.xu@gmail.com>
---
 util/Makefile.objs       |   1 +
 util/vhost-user-server.c | 406 +++++++++++++++++++++++++++++++++++++++
 util/vhost-user-server.h |  59 ++++++
 3 files changed, 466 insertions(+)
 create mode 100644 util/vhost-user-server.c
 create mode 100644 util/vhost-user-server.h

diff --git a/util/Makefile.objs b/util/Makefile.objs
index fe339c2636..f54a6c80ec 100644
--- a/util/Makefile.objs
+++ b/util/Makefile.objs
@@ -40,6 +40,7 @@ util-obj-y += readline.o
 util-obj-y += rcu.o
 util-obj-$(CONFIG_MEMBARRIER) += sys_membarrier.o
 util-obj-y += qemu-coroutine.o qemu-coroutine-lock.o qemu-coroutine-io.o
+util-obj-$(CONFIG_LINUX) += vhost-user-server.o
 util-obj-y += qemu-coroutine-sleep.o
 util-obj-y += qemu-co-shared-resource.o
 util-obj-y += coroutine-$(CONFIG_COROUTINE_BACKEND).o
diff --git a/util/vhost-user-server.c b/util/vhost-user-server.c
new file mode 100644
index 0000000000..8fafd97bdc
--- /dev/null
+++ b/util/vhost-user-server.c
@@ -0,0 +1,406 @@
+/*
+ * Sharing QEMU devices via vhost-user protocol
+ *
+ * Author: Coiby Xu <coiby.xu@gmail.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * later.  See the COPYING file in the top-level directory.
+ */
+#include "qemu/osdep.h"
+#include <sys/eventfd.h>
+#include "qemu/main-loop.h"
+#include "vhost-user-server.h"
+
+static void vmsg_close_fds(VhostUserMsg *vmsg)
+{
+    int i;
+    for (i = 0; i < vmsg->fd_num; i++) {
+        close(vmsg->fds[i]);
+    }
+}
+
+static void vmsg_unblock_fds(VhostUserMsg *vmsg)
+{
+    int i;
+    for (i = 0; i < vmsg->fd_num; i++) {
+        qemu_set_nonblock(vmsg->fds[i]);
+    }
+}
+
+static void vu_accept(QIONetListener *listener, QIOChannelSocket *sioc,
+                      gpointer opaque);
+
+static void close_client(VuServer *server)
+{
+    vu_deinit(&server->vu_dev);
+    server->sioc = NULL;
+    object_unref(OBJECT(server->ioc));
+
+    server->sioc_slave = NULL;
+    object_unref(OBJECT(server->ioc_slave));
+    /*
+     * Set the callback function for network listener so another
+     * vhost-user client can connect to this server
+     */
+    qio_net_listener_set_client_func(server->listener,
+                                     vu_accept,
+                                     server,
+                                     NULL);
+}
+
+static void panic_cb(VuDev *vu_dev, const char *buf)
+{
+    VuServer *server = container_of(vu_dev, VuServer, vu_dev);
+
+    if (buf) {
+        error_report("vu_panic: %s", buf);
+    }
+
+    if (server->sioc) {
+        close_client(server);
+    }
+
+    if (server->device_panic_notifier) {
+        server->device_panic_notifier(server);
+    }
+}
+
+static QIOChannel *slave_io_channel(VuServer *server, int fd,
+                                    Error **local_err)
+{
+    if (server->sioc_slave) {
+        if (fd == server->sioc_slave->fd) {
+            return server->ioc_slave;
+        }
+    } else {
+        server->sioc_slave = qio_channel_socket_new_fd(fd, local_err);
+        if (!*local_err) {
+            server->ioc_slave = QIO_CHANNEL(server->sioc_slave);
+            return server->ioc_slave;
+        }
+    }
+
+    return NULL;
+}
+
+static bool coroutine_fn
+vu_message_read(VuDev *vu_dev, int conn_fd, VhostUserMsg *vmsg)
+{
+    struct iovec iov = {
+        .iov_base = (char *)vmsg,
+        .iov_len = VHOST_USER_HDR_SIZE,
+    };
+    int rc, read_bytes = 0;
+    Error *local_err = NULL;
+    /*
+     * Store fds/nfds returned from qio_channel_readv_full into
+     * temporary variables.
+     *
+     * VhostUserMsg is a packed structure, gcc will complain about passing
+     * pointer to a packed structure member if we pass &VhostUserMsg.fd_num
+     * and &VhostUserMsg.fds directly when calling qio_channel_readv_full,
+     * thus two temporary variables nfds and fds are used here.
+     */
+    size_t nfds = 0, nfds_t = 0;
+    int *fds = NULL, *fds_t = NULL;
+    VuServer *server = container_of(vu_dev, VuServer, vu_dev);
+    QIOChannel *ioc = NULL;
+
+    if (conn_fd == server->sioc->fd) {
+        ioc = server->ioc;
+    } else {
+        /* Slave communication will also use this function to read msg */
+        ioc = slave_io_channel(server, conn_fd, &local_err);
+    }
+
+    if (!ioc) {
+        error_report_err(local_err);
+        goto fail;
+    }
+
+    assert(qemu_in_coroutine());
+    do {
+        /*
+         * qio_channel_readv_full may have short reads; keep calling it
+         * until VHOST_USER_HDR_SIZE or 0 bytes have been read in total
+         */
+        rc = qio_channel_readv_full(ioc, &iov, 1, &fds_t, &nfds_t, &local_err);
+        if (rc < 0) {
+            if (rc == QIO_CHANNEL_ERR_BLOCK) {
+                qio_channel_yield(ioc, G_IO_IN);
+                continue;
+            } else {
+                error_report_err(local_err);
+                return false;
+            }
+        }
+        read_bytes += rc;
+        if (nfds_t > 0) {
+            fds = g_renew(int, fds, nfds + nfds_t);
+            memcpy(fds + nfds, fds_t, nfds_t *sizeof(int));
+            nfds += nfds_t;
+            if (nfds > VHOST_MEMORY_MAX_NREGIONS) {
+                error_report("A maximum of %d fds are allowed, "
+                             "but got %zu fds now",
+                             VHOST_MEMORY_MAX_NREGIONS, nfds);
+                goto fail;
+            }
+            g_free(fds_t);
+        }
+        if (read_bytes == VHOST_USER_HDR_SIZE || rc == 0) {
+            break;
+        }
+        iov.iov_base = (char *)vmsg + read_bytes;
+        iov.iov_len = VHOST_USER_HDR_SIZE - read_bytes;
+    } while (true);
+
+    vmsg->fd_num = nfds;
+    if (nfds > 0) {
+        memcpy(vmsg->fds, fds, nfds * sizeof(int));
+    }
+    g_free(fds);
+    /* qio_channel_readv_full will make socket fds blocking, unblock them */
+    vmsg_unblock_fds(vmsg);
+    if (vmsg->size > sizeof(vmsg->payload)) {
+        error_report("Error: too big message request: %d, "
+                     "size: vmsg->size: %u, "
+                     "while sizeof(vmsg->payload) = %zu",
+                     vmsg->request, vmsg->size, sizeof(vmsg->payload));
+        goto fail;
+    }
+
+    struct iovec iov_payload = {
+        .iov_base = (char *)&vmsg->payload,
+        .iov_len = vmsg->size,
+    };
+    if (vmsg->size) {
+        rc = qio_channel_readv_all_eof(ioc, &iov_payload, 1, &local_err);
+        if (rc == -1) {
+            error_report_err(local_err);
+            goto fail;
+        }
+    }
+
+    return true;
+
+fail:
+    vmsg_close_fds(vmsg);
+
+    return false;
+}
+
+
+static void vu_client_start(VuServer *server);
+static coroutine_fn void vu_client_trip(void *opaque)
+{
+    VuServer *server = opaque;
+
+    while (!server->aio_context_changed && server->sioc) {
+        vu_dispatch(&server->vu_dev);
+    }
+
+    if (server->aio_context_changed && server->sioc) {
+        server->aio_context_changed = false;
+        vu_client_start(server);
+    }
+}
+
+static void vu_client_start(VuServer *server)
+{
+    server->co_trip = qemu_coroutine_create(vu_client_trip, server);
+    aio_co_enter(server->ctx, server->co_trip);
+}
+
+/*
+ * a wrapper for vu_kick_cb
+ *
+ * since aio_dispatch can only pass one user data pointer to the
+ * callback function, pack VuDev and pvt into a struct. Then unpack it
+ * and pass them to vu_kick_cb
+ */
+static void kick_handler(void *opaque)
+{
+    KickInfo *kick_info = opaque;
+    kick_info->cb(kick_info->vu_dev, 0, (void *) kick_info->index);
+}
+
+
+static void
+set_watch(VuDev *vu_dev, int fd, int vu_evt,
+          vu_watch_cb cb, void *pvt)
+{
+
+    VuServer *server = container_of(vu_dev, VuServer, vu_dev);
+    g_assert(vu_dev);
+    g_assert(fd >= 0);
+    long index = (intptr_t) pvt;
+    g_assert(cb);
+    KickInfo *kick_info = &server->kick_info[index];
+    if (!kick_info->cb) {
+        kick_info->fd = fd;
+        kick_info->cb = cb;
+        qemu_set_nonblock(fd);
+        aio_set_fd_handler(server->ioc->ctx, fd, false, kick_handler,
+                           NULL, NULL, kick_info);
+        kick_info->vu_dev = vu_dev;
+    }
+}
+
+
+static void remove_watch(VuDev *vu_dev, int fd)
+{
+    VuServer *server;
+    int i;
+    int index = -1;
+    g_assert(vu_dev);
+    g_assert(fd >= 0);
+
+    server = container_of(vu_dev, VuServer, vu_dev);
+    for (i = 0; i < vu_dev->max_queues; i++) {
+        if (server->kick_info[i].fd == fd) {
+            index = i;
+            break;
+        }
+    }
+
+    if (index == -1) {
+        return;
+    }
+    server->kick_info[i].cb = NULL;
+    aio_set_fd_handler(server->ioc->ctx, fd, false, NULL, NULL, NULL, NULL);
+}
+
+
+static void vu_accept(QIONetListener *listener, QIOChannelSocket *sioc,
+                      gpointer opaque)
+{
+    VuServer *server = opaque;
+
+    if (server->sioc) {
+        warn_report("Only one vhost-user client is allowed to "
+                    "connect to the server at a time");
+        return;
+    }
+
+    if (!vu_init(&server->vu_dev, server->max_queues, sioc->fd, panic_cb,
+                 vu_message_read, set_watch, remove_watch, server->vu_iface)) {
+        error_report("Failed to initialize libvhost-user");
+        return;
+    }
+
+    /*
+     * Unset the callback function for the network listener so that other
+     * vhost-user clients keep waiting until this client disconnects
+     */
+    qio_net_listener_set_client_func(server->listener,
+                                     NULL,
+                                     NULL,
+                                     NULL);
+    server->sioc = sioc;
+    server->kick_info = g_new0(KickInfo, server->max_queues);
+    /*
+     * Increase the object reference count so sioc will not be freed by
+     * qio_net_listener_channel_func, which calls object_unref(OBJECT(sioc))
+     */
+    object_ref(OBJECT(server->sioc));
+    qio_channel_set_name(QIO_CHANNEL(sioc), "vhost-user client");
+    server->ioc = QIO_CHANNEL(sioc);
+    object_ref(OBJECT(server->ioc));
+    object_ref(OBJECT(sioc));
+    qio_channel_attach_aio_context(server->ioc, server->ctx);
+    qio_channel_set_blocking(QIO_CHANNEL(server->sioc), false, NULL);
+    vu_client_start(server);
+}
+
+
+void vhost_user_server_stop(VuServer *server)
+{
+    if (!server) {
+        return;
+    }
+
+    if (server->sioc) {
+        close_client(server);
+        object_unref(OBJECT(server->sioc));
+    }
+
+    if (server->listener) {
+        qio_net_listener_disconnect(server->listener);
+        object_unref(OBJECT(server->listener));
+    }
+}
+
+static void detach_context(VuServer *server)
+{
+    int i;
+    AioContext *ctx = server->ioc->ctx;
+    qio_channel_detach_aio_context(server->ioc);
+    for (i = 0; i < server->vu_dev.max_queues; i++) {
+        if (server->kick_info[i].cb) {
+            aio_set_fd_handler(ctx, server->kick_info[i].fd, false, NULL,
+                               NULL, NULL, NULL);
+        }
+    }
+}
+
+static void attach_context(VuServer *server, AioContext *ctx)
+{
+    int i;
+    qio_channel_attach_aio_context(server->ioc, ctx);
+    server->aio_context_changed = true;
+    if (server->co_trip) {
+        aio_co_schedule(ctx, server->co_trip);
+    }
+    for (i = 0; i < server->vu_dev.max_queues; i++) {
+        if (server->kick_info[i].cb) {
+            aio_set_fd_handler(ctx, server->kick_info[i].fd, false,
+                               kick_handler, NULL, NULL,
+                               &server->kick_info[i]);
+        }
+    }
+}
+
+void vhost_user_server_set_aio_context(AioContext *ctx, VuServer *server)
+{
+    server->ctx = ctx ? ctx : qemu_get_aio_context();
+    if (!server->sioc) {
+        return;
+    }
+    if (ctx) {
+        attach_context(server, ctx);
+    } else {
+        detach_context(server);
+    }
+}
+
+
+bool vhost_user_server_start(uint16_t max_queues,
+                             SocketAddress *socket_addr,
+                             AioContext *ctx,
+                             VuServer *server,
+                             void *device_panic_notifier,
+                             const VuDevIface *vu_iface,
+                             Error **errp)
+{
+    server->listener = qio_net_listener_new();
+    if (qio_net_listener_open_sync(server->listener, socket_addr, 1,
+                                   errp) < 0) {
+        goto error;
+    }
+
+    qio_net_listener_set_name(server->listener, "vhost-user-backend-listener");
+
+    server->vu_iface = vu_iface;
+    server->max_queues = max_queues;
+    server->ctx = ctx;
+    server->device_panic_notifier = device_panic_notifier;
+    qio_net_listener_set_client_func(server->listener,
+                                     vu_accept,
+                                     server,
+                                     NULL);
+
+    return true;
+error:
+    g_free(server);
+    return false;
+}
diff --git a/util/vhost-user-server.h b/util/vhost-user-server.h
new file mode 100644
index 0000000000..4315556b66
--- /dev/null
+++ b/util/vhost-user-server.h
@@ -0,0 +1,59 @@
+/*
+ * Sharing QEMU devices via vhost-user protocol
+ *
+ * Author: Coiby Xu <coiby.xu@gmail.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * later.  See the COPYING file in the top-level directory.
+ */
+
+#ifndef VHOST_USER_SERVER_H
+#define VHOST_USER_SERVER_H
+
+#include "contrib/libvhost-user/libvhost-user.h"
+#include "io/channel-socket.h"
+#include "io/channel-file.h"
+#include "io/net-listener.h"
+#include "qemu/error-report.h"
+#include "qapi/error.h"
+#include "standard-headers/linux/virtio_blk.h"
+
+typedef struct KickInfo {
+    VuDev *vu_dev;
+    int fd; /*kick fd*/
+    long index; /*queue index*/
+    vu_watch_cb cb;
+} KickInfo;
+
+typedef struct VuServer {
+    QIONetListener *listener;
+    AioContext *ctx;
+    void (*device_panic_notifier)(struct VuServer *server);
+    int max_queues;
+    const VuDevIface *vu_iface;
+    VuDev vu_dev;
+    QIOChannel *ioc; /* The I/O channel with the client */
+    QIOChannelSocket *sioc; /* The underlying data channel with the client */
+    /* IOChannel for fd provided via VHOST_USER_SET_SLAVE_REQ_FD */
+    QIOChannel *ioc_slave;
+    QIOChannelSocket *sioc_slave;
+    Coroutine *co_trip; /* coroutine for processing VhostUserMsg */
+    KickInfo *kick_info; /* an array with the length of the queue number */
+    /* restart coroutine co_trip if AIOContext is changed */
+    bool aio_context_changed;
+} VuServer;
+
+
+bool vhost_user_server_start(uint16_t max_queues,
+                             SocketAddress *unix_socket,
+                             AioContext *ctx,
+                             VuServer *server,
+                             void *device_panic_notifier,
+                             const VuDevIface *vu_iface,
+                             Error **errp);
+
+void vhost_user_server_stop(VuServer *server);
+
+void vhost_user_server_set_aio_context(AioContext *ctx, VuServer *server);
+
+#endif /* VHOST_USER_SERVER_H */
-- 
2.26.2




* [PATCH v8 3/4] vhost-user block device backend server
  2020-06-04 23:35 [PATCH v8 0/4] vhost-user block device backend implementation Coiby Xu
  2020-06-04 23:35 ` [PATCH v8 1/4] Allow vu_message_read to be replaced Coiby Xu
  2020-06-04 23:35 ` [PATCH v8 2/4] generic vhost user server Coiby Xu
@ 2020-06-04 23:35 ` Coiby Xu
  2020-06-11 15:24   ` Stefan Hajnoczi
  2020-06-04 23:35 ` [PATCH v8 4/4] new qTest case to test the vhost-user-blk-server Coiby Xu
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 21+ messages in thread
From: Coiby Xu @ 2020-06-04 23:35 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, open list:Block layer core, Coiby Xu, Max Reitz,
	bharatlkmlkvm, stefanha, Paolo Bonzini

By making use of libvhost-user, a block drive can be shared with the
connected vhost-user client. Only one client can connect to the server
at a time.

Since the vhost-user server needs a block drive to be created first,
delay the creation of this object.

Signed-off-by: Coiby Xu <coiby.xu@gmail.com>
---
 block/Makefile.objs                  |   1 +
 block/export/vhost-user-blk-server.c | 716 +++++++++++++++++++++++++++
 block/export/vhost-user-blk-server.h |  34 ++
 softmmu/vl.c                         |   4 +
 4 files changed, 755 insertions(+)
 create mode 100644 block/export/vhost-user-blk-server.c
 create mode 100644 block/export/vhost-user-blk-server.h

diff --git a/block/Makefile.objs b/block/Makefile.objs
index 3635b6b4c1..0eb7eff470 100644
--- a/block/Makefile.objs
+++ b/block/Makefile.objs
@@ -24,6 +24,7 @@ block-obj-y += throttle-groups.o
 block-obj-$(CONFIG_LINUX) += nvme.o
 
 block-obj-y += nbd.o
+block-obj-$(CONFIG_LINUX) += export/vhost-user-blk-server.o ../contrib/libvhost-user/libvhost-user.o
 block-obj-$(CONFIG_SHEEPDOG) += sheepdog.o
 block-obj-$(CONFIG_LIBISCSI) += iscsi.o
 block-obj-$(if $(CONFIG_LIBISCSI),y,n) += iscsi-opts.o
diff --git a/block/export/vhost-user-blk-server.c b/block/export/vhost-user-blk-server.c
new file mode 100644
index 0000000000..a9dec0625f
--- /dev/null
+++ b/block/export/vhost-user-blk-server.c
@@ -0,0 +1,716 @@
+/*
+ * Sharing QEMU block devices via vhost-user protocol
+ *
+ * Author: Coiby Xu <coiby.xu@gmail.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * later.  See the COPYING file in the top-level directory.
+ */
+#include "qemu/osdep.h"
+#include "block/block.h"
+#include "vhost-user-blk-server.h"
+#include "qapi/error.h"
+#include "qom/object_interfaces.h"
+#include "sysemu/block-backend.h"
+
+enum {
+    VHOST_USER_BLK_MAX_QUEUES = 1,
+};
+struct virtio_blk_inhdr {
+    unsigned char status;
+};
+
+static QTAILQ_HEAD(, VuBlockDev) vu_block_devs =
+                                 QTAILQ_HEAD_INITIALIZER(vu_block_devs);
+
+
+typedef struct VuBlockReq {
+    VuVirtqElement *elem;
+    int64_t sector_num;
+    size_t size;
+    struct virtio_blk_inhdr *in;
+    struct virtio_blk_outhdr out;
+    VuServer *server;
+    struct VuVirtq *vq;
+} VuBlockReq;
+
+
+static void vu_block_req_complete(VuBlockReq *req)
+{
+    VuDev *vu_dev = &req->server->vu_dev;
+
+    /* IO size with 1 extra status byte */
+    vu_queue_push(vu_dev, req->vq, req->elem, req->size + 1);
+    vu_queue_notify(vu_dev, req->vq);
+
+    if (req->elem) {
+        free(req->elem);
+    }
+
+    g_free(req);
+}
+
+static VuBlockDev *get_vu_block_device_by_server(VuServer *server)
+{
+    return container_of(server, VuBlockDev, vu_server);
+}
+
+static int coroutine_fn
+vu_block_discard_write_zeroes(VuBlockReq *req, struct iovec *iov,
+                              uint32_t iovcnt, uint32_t type)
+{
+    struct virtio_blk_discard_write_zeroes desc;
+    ssize_t size = iov_to_buf(iov, iovcnt, 0, &desc, sizeof(desc));
+    if (unlikely(size != sizeof(desc))) {
+        error_report("Invalid size %zd, expected %zu", size, sizeof(desc));
+        return -EINVAL;
+    }
+
+    VuBlockDev *vdev_blk = get_vu_block_device_by_server(req->server);
+    uint64_t range[2] = { le64toh(desc.sector) << 9,
+                          le32toh(desc.num_sectors) << 9 };
+    if (type == VIRTIO_BLK_T_DISCARD) {
+        if (blk_co_pdiscard(vdev_blk->backend, range[0], range[1]) == 0) {
+            return 0;
+        }
+    } else if (type == VIRTIO_BLK_T_WRITE_ZEROES) {
+        if (blk_co_pwrite_zeroes(vdev_blk->backend,
+                                 range[0], range[1], 0) == 0) {
+            return 0;
+        }
+    }
+
+    return -EINVAL;
+}
+
+
+static void coroutine_fn vu_block_flush(VuBlockReq *req)
+{
+    VuBlockDev *vdev_blk = get_vu_block_device_by_server(req->server);
+    BlockBackend *backend = vdev_blk->backend;
+    blk_co_flush(backend);
+}
+
+
+struct req_data {
+    VuServer *server;
+    VuVirtq *vq;
+    VuVirtqElement *elem;
+};
+
+static void coroutine_fn vu_block_virtio_process_req(void *opaque)
+{
+    struct req_data *data = opaque;
+    VuServer *server = data->server;
+    VuVirtq *vq = data->vq;
+    VuVirtqElement *elem = data->elem;
+    uint32_t type;
+    VuBlockReq *req;
+
+    VuBlockDev *vdev_blk = get_vu_block_device_by_server(server);
+    BlockBackend *backend = vdev_blk->backend;
+
+    struct iovec *in_iov = elem->in_sg;
+    struct iovec *out_iov = elem->out_sg;
+    unsigned in_num = elem->in_num;
+    unsigned out_num = elem->out_num;
+    /* refer to hw/block/virtio_blk.c */
+    if (elem->out_num < 1 || elem->in_num < 1) {
+        error_report("virtio-blk request missing headers");
+        free(elem);
+        return;
+    }
+
+    req = g_new0(VuBlockReq, 1);
+    req->server = server;
+    req->vq = vq;
+    req->elem = elem;
+
+    if (unlikely(iov_to_buf(out_iov, out_num, 0, &req->out,
+                            sizeof(req->out)) != sizeof(req->out))) {
+        error_report("virtio-blk request outhdr too short");
+        goto err;
+    }
+
+    iov_discard_front(&out_iov, &out_num, sizeof(req->out));
+
+    if (in_iov[in_num - 1].iov_len < sizeof(struct virtio_blk_inhdr)) {
+        error_report("virtio-blk request inhdr too short");
+        goto err;
+    }
+
+    /* We always touch the last byte, so just see how big in_iov is.  */
+    req->in = (void *)in_iov[in_num - 1].iov_base
+              + in_iov[in_num - 1].iov_len
+              - sizeof(struct virtio_blk_inhdr);
+    iov_discard_back(in_iov, &in_num, sizeof(struct virtio_blk_inhdr));
+
+
+    type = le32toh(req->out.type);
+    switch (type & ~VIRTIO_BLK_T_BARRIER) {
+    case VIRTIO_BLK_T_IN:
+    case VIRTIO_BLK_T_OUT: {
+        ssize_t ret = 0;
+        bool is_write = type & VIRTIO_BLK_T_OUT;
+        req->sector_num = le64toh(req->out.sector);
+
+        int64_t offset = req->sector_num * vdev_blk->blk_size;
+        QEMUIOVector *qiov = g_new0(QEMUIOVector, 1);
+        if (is_write) {
+            qemu_iovec_init_external(qiov, out_iov, out_num);
+            ret = blk_co_pwritev(backend, offset, qiov->size,
+                                 qiov, 0);
+        } else {
+            qemu_iovec_init_external(qiov, in_iov, in_num);
+            ret = blk_co_preadv(backend, offset, qiov->size,
+                                qiov, 0);
+        }
+        if (ret >= 0) {
+            req->in->status = VIRTIO_BLK_S_OK;
+        } else {
+            req->in->status = VIRTIO_BLK_S_IOERR;
+        }
+        g_free(qiov);
+        break;
+    }
+    case VIRTIO_BLK_T_FLUSH:
+        vu_block_flush(req);
+        req->in->status = VIRTIO_BLK_S_OK;
+        break;
+    case VIRTIO_BLK_T_GET_ID: {
+        size_t size = MIN(iov_size(&elem->in_sg[0], in_num),
+                          VIRTIO_BLK_ID_BYTES);
+        snprintf(elem->in_sg[0].iov_base, size, "%s", "vhost_user_blk_server");
+        req->in->status = VIRTIO_BLK_S_OK;
+        req->size = elem->in_sg[0].iov_len;
+        break;
+    }
+    case VIRTIO_BLK_T_DISCARD:
+    case VIRTIO_BLK_T_WRITE_ZEROES: {
+        int rc;
+        rc = vu_block_discard_write_zeroes(req, &elem->out_sg[1],
+                                           out_num, type);
+        if (rc == 0) {
+            req->in->status = VIRTIO_BLK_S_OK;
+        } else {
+            req->in->status = VIRTIO_BLK_S_IOERR;
+        }
+        break;
+    }
+    default:
+        req->in->status = VIRTIO_BLK_S_UNSUPP;
+        break;
+    }
+
+    vu_block_req_complete(req);
+    return;
+
+err:
+    free(elem);
+    g_free(req);
+    return;
+}
+
+
+
+static void vu_block_process_vq(VuDev *vu_dev, int idx)
+{
+    VuServer *server;
+    VuVirtq *vq;
+
+    server = container_of(vu_dev, VuServer, vu_dev);
+    assert(server);
+
+    vq = vu_get_queue(vu_dev, idx);
+    assert(vq);
+    VuVirtqElement *elem;
+    while (1) {
+        elem = vu_queue_pop(vu_dev, vq, sizeof(VuVirtqElement) +
+                                    sizeof(VuBlockReq));
+        if (elem) {
+            struct req_data req_data = {
+                .server = server,
+                .vq = vq,
+                .elem = elem
+            };
+            Coroutine *co = qemu_coroutine_create(vu_block_virtio_process_req,
+                                                  &req_data);
+            aio_co_enter(server->ioc->ctx, co);
+        } else {
+            break;
+        }
+    }
+}
+
+static void vu_block_queue_set_started(VuDev *vu_dev, int idx, bool started)
+{
+    VuVirtq *vq;
+
+    assert(vu_dev);
+
+    vq = vu_get_queue(vu_dev, idx);
+    vu_set_queue_handler(vu_dev, vq, started ? vu_block_process_vq : NULL);
+}
+
+static uint64_t vu_block_get_features(VuDev *dev)
+{
+    uint64_t features;
+    VuServer *server = container_of(dev, VuServer, vu_dev);
+    VuBlockDev *vdev_blk = get_vu_block_device_by_server(server);
+    features = 1ull << VIRTIO_BLK_F_SIZE_MAX |
+               1ull << VIRTIO_BLK_F_SEG_MAX |
+               1ull << VIRTIO_BLK_F_TOPOLOGY |
+               1ull << VIRTIO_BLK_F_BLK_SIZE |
+               1ull << VIRTIO_BLK_F_FLUSH |
+               1ull << VIRTIO_BLK_F_DISCARD |
+               1ull << VIRTIO_BLK_F_WRITE_ZEROES |
+               1ull << VIRTIO_BLK_F_CONFIG_WCE |
+               1ull << VIRTIO_F_VERSION_1 |
+               1ull << VIRTIO_RING_F_INDIRECT_DESC |
+               1ull << VIRTIO_RING_F_EVENT_IDX |
+               1ull << VHOST_USER_F_PROTOCOL_FEATURES;
+
+    if (!vdev_blk->writable) {
+        features |= 1ull << VIRTIO_BLK_F_RO;
+    }
+
+    return features;
+}
+
+static uint64_t vu_block_get_protocol_features(VuDev *dev)
+{
+    return 1ull << VHOST_USER_PROTOCOL_F_CONFIG |
+           1ull << VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD;
+}
+
+static int
+vu_block_get_config(VuDev *vu_dev, uint8_t *config, uint32_t len)
+{
+    VuServer *server = container_of(vu_dev, VuServer, vu_dev);
+    VuBlockDev *vdev_blk = get_vu_block_device_by_server(server);
+    memcpy(config, &vdev_blk->blkcfg, len);
+
+    return 0;
+}
+
+static int
+vu_block_set_config(VuDev *vu_dev, const uint8_t *data,
+                    uint32_t offset, uint32_t size, uint32_t flags)
+{
+    VuServer *server = container_of(vu_dev, VuServer, vu_dev);
+    VuBlockDev *vdev_blk = get_vu_block_device_by_server(server);
+    uint8_t wce;
+
+    /* don't support live migration */
+    if (flags != VHOST_SET_CONFIG_TYPE_MASTER) {
+        return -EINVAL;
+    }
+
+
+    if (offset != offsetof(struct virtio_blk_config, wce) ||
+        size != 1) {
+        return -EINVAL;
+    }
+
+    wce = *data;
+    if (wce == vdev_blk->blkcfg.wce) {
+        /* Nothing to do; the write-cache setting is unchanged */
+        return 0;
+    }
+
+    vdev_blk->blkcfg.wce = wce;
+    blk_set_enable_write_cache(vdev_blk->backend, wce);
+    return 0;
+}
+
+
+/*
+ * When the client disconnects, it sends a VHOST_USER_NONE request
+ * and vu_process_message() will simply call exit(), which causes the
+ * process to exit abruptly.
+ * To avoid this issue, handle the VHOST_USER_NONE request ahead of
+ * vu_process_message().
+ */
+static int vu_block_process_msg(VuDev *dev, VhostUserMsg *vmsg, int *do_reply)
+{
+    if (vmsg->request == VHOST_USER_NONE) {
+        dev->panic(dev, "disconnect");
+        return true;
+    }
+    return false;
+}
+
+
+static const VuDevIface vu_block_iface = {
+    .get_features          = vu_block_get_features,
+    .queue_set_started     = vu_block_queue_set_started,
+    .get_protocol_features = vu_block_get_protocol_features,
+    .get_config            = vu_block_get_config,
+    .set_config            = vu_block_set_config,
+    .process_msg           = vu_block_process_msg,
+};
+
+static void blk_aio_attached(AioContext *ctx, void *opaque)
+{
+    VuBlockDev *vub_dev = opaque;
+    aio_context_acquire(ctx);
+    vhost_user_server_set_aio_context(ctx, &vub_dev->vu_server);
+    aio_context_release(ctx);
+}
+
+static void blk_aio_detach(void *opaque)
+{
+    VuBlockDev *vub_dev = opaque;
+    AioContext *ctx = vub_dev->vu_server.ctx;
+    aio_context_acquire(ctx);
+    vhost_user_server_set_aio_context(NULL, &vub_dev->vu_server);
+    aio_context_release(ctx);
+}
+
+static void vu_block_free(VuBlockDev *vu_block_dev)
+{
+    if (!vu_block_dev) {
+        return;
+    }
+
+    if (vu_block_dev->backend) {
+        blk_remove_aio_context_notifier(vu_block_dev->backend, blk_aio_attached,
+                                        blk_aio_detach, vu_block_dev);
+    }
+
+    blk_unref(vu_block_dev->backend);
+
+    if (vu_block_dev->next.tqe_circ.tql_prev) {
+        /*
+         * remove vu_block_device from the list
+         *
+         * if vu_block_dev->next.tqe_circ.tql_prev = null,
+         * vu_block_dev hasn't been inserted into the queue and
+         * vu_block_free is called by obj->instance_finalize.
+         */
+        QTAILQ_REMOVE(&vu_block_devs, vu_block_dev, next);
+    }
+}
+
+static void
+vu_block_initialize_config(BlockDriverState *bs,
+                           struct virtio_blk_config *config, uint32_t blk_size)
+{
+    config->capacity = bdrv_getlength(bs) >> BDRV_SECTOR_BITS;
+    config->blk_size = blk_size;
+    config->size_max = 0;
+    config->seg_max = 128 - 2;
+    config->min_io_size = 1;
+    config->opt_io_size = 1;
+    config->num_queues = VHOST_USER_BLK_MAX_QUEUES;
+    config->max_discard_sectors = 32768;
+    config->max_discard_seg = 1;
+    config->discard_sector_alignment = config->blk_size >> 9;
+    config->max_write_zeroes_sectors = 32768;
+    config->max_write_zeroes_seg = 1;
+}
+
+
+static VuBlockDev *vu_block_init(VuBlockDev *vu_block_device, Error **errp)
+{
+
+    BlockBackend *blk;
+    Error *local_error = NULL;
+    const char *node_name = vu_block_device->node_name;
+    bool writable = vu_block_device->writable;
+    /*
+     * Don't allow resize while the vhost user server is running,
+     * otherwise we don't care what happens with the node.
+     */
+    uint64_t perm = BLK_PERM_CONSISTENT_READ;
+    int ret;
+
+    AioContext *ctx;
+
+    BlockDriverState *bs = bdrv_lookup_bs(node_name, node_name, &local_error);
+
+    if (!bs) {
+        error_propagate(errp, local_error);
+        return NULL;
+    }
+
+    if (bdrv_is_read_only(bs)) {
+        writable = false;
+    }
+
+    if (writable) {
+        perm |= BLK_PERM_WRITE;
+    }
+
+    ctx = bdrv_get_aio_context(bs);
+    aio_context_acquire(ctx);
+    bdrv_invalidate_cache(bs, NULL);
+    aio_context_release(ctx);
+
+    blk = blk_new(bdrv_get_aio_context(bs), perm,
+                  BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE_UNCHANGED |
+                  BLK_PERM_WRITE | BLK_PERM_GRAPH_MOD);
+    ret = blk_insert_bs(blk, bs, errp);
+
+    if (ret < 0) {
+        goto fail;
+    }
+
+    blk_set_enable_write_cache(blk, false);
+
+    blk_set_allow_aio_context_change(blk, true);
+
+    vu_block_device->blkcfg.wce = 0;
+    vu_block_device->backend = blk;
+    if (!vu_block_device->blk_size) {
+        vu_block_device->blk_size = BDRV_SECTOR_SIZE;
+    }
+    vu_block_device->blkcfg.blk_size = vu_block_device->blk_size;
+    blk_set_guest_block_size(blk, vu_block_device->blk_size);
+    vu_block_initialize_config(bs, &vu_block_device->blkcfg,
+                                   vu_block_device->blk_size);
+    return vu_block_device;
+
+fail:
+    blk_unref(blk);
+    return NULL;
+}
+
+static void vhost_user_blk_server_free(VuBlockDev *vu_block_device)
+{
+    if (!vu_block_device) {
+        return;
+    }
+    vhost_user_server_stop(&vu_block_device->vu_server);
+    vu_block_free(vu_block_device);
+
+}
+
+/*
+ * An exported drive can serve multiple clients simultaneously,
+ * so there is no need to export the same drive twice.
+ */
+static VuBlockDev *vu_block_dev_find(const char *node_name)
+{
+    VuBlockDev *vu_block_device;
+    QTAILQ_FOREACH(vu_block_device, &vu_block_devs, next) {
+        if (strcmp(node_name, vu_block_device->node_name) == 0) {
+            return vu_block_device;
+        }
+    }
+
+    return NULL;
+}
+
+
+static VuBlockDev
+*vu_block_dev_find_by_unix_socket(const char *unix_socket)
+{
+    VuBlockDev *vu_block_device;
+    QTAILQ_FOREACH(vu_block_device, &vu_block_devs, next) {
+        if (strcmp(unix_socket, vu_block_device->addr->u.q_unix.path) == 0) {
+            return vu_block_device;
+        }
+    }
+
+    return NULL;
+}
+
+
+static void vhost_user_blk_server_start(VuBlockDev *vu_block_device,
+                                        Error **errp)
+{
+
+    const char *name = vu_block_device->node_name;
+    SocketAddress *addr = vu_block_device->addr;
+    char *unix_socket = vu_block_device->addr->u.q_unix.path;
+
+    if (vu_block_dev_find(name)) {
+        error_setg(errp, "Vhost-user-blk server with node-name '%s' "
+                   "has already been started",
+                   name);
+        return;
+    }
+
+    if (vu_block_dev_find_by_unix_socket(unix_socket)) {
+        error_setg(errp, "Vhost-user-blk server with socket_path '%s' "
+                   "has already been started", unix_socket);
+        return;
+    }
+
+    if (!vu_block_init(vu_block_device, errp)) {
+        return;
+    }
+
+
+    AioContext *ctx = bdrv_get_aio_context(blk_bs(vu_block_device->backend));
+
+    if (!vhost_user_server_start(VHOST_USER_BLK_MAX_QUEUES, addr, ctx,
+                                 &vu_block_device->vu_server,
+                                 NULL, &vu_block_iface,
+                                 errp)) {
+        goto error;
+    }
+
+    QTAILQ_INSERT_TAIL(&vu_block_devs, vu_block_device, next);
+    blk_add_aio_context_notifier(vu_block_device->backend, blk_aio_attached,
+                                 blk_aio_detach, vu_block_device);
+    return;
+
+ error:
+    vu_block_free(vu_block_device);
+}
+
+static void vu_set_node_name(Object *obj, const char *value, Error **errp)
+{
+    VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
+
+    if (vus->node_name) {
+        error_setg(errp, "node-name property already set");
+        return;
+    }
+
+    vus->node_name = g_strdup(value);
+}
+
+static char *vu_get_node_name(Object *obj, Error **errp)
+{
+    VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
+    return g_strdup(vus->node_name);
+}
+
+
+static void vu_set_unix_socket(Object *obj, const char *value,
+                               Error **errp)
+{
+    VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
+
+    if (vus->addr) {
+        error_setg(errp, "unix_socket property already set");
+        return;
+    }
+
+    SocketAddress *addr = g_new0(SocketAddress, 1);
+    addr->type = SOCKET_ADDRESS_TYPE_UNIX;
+    addr->u.q_unix.path = g_strdup(value);
+    vus->addr = addr;
+}
+
+static char *vu_get_unix_socket(Object *obj, Error **errp)
+{
+    VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
+    return g_strdup(vus->addr->u.q_unix.path);
+}
+
+static bool vu_get_block_writable(Object *obj, Error **errp)
+{
+    VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
+    return vus->writable;
+}
+
+static void vu_set_block_writable(Object *obj, bool value, Error **errp)
+{
+    VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
+
+    vus->writable = value;
+}
+
+static void vu_get_blk_size(Object *obj, Visitor *v, const char *name,
+                            void *opaque, Error **errp)
+{
+    VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
+    uint32_t value = vus->blk_size;
+
+    visit_type_uint32(v, name, &value, errp);
+}
+
+static void vu_set_blk_size(Object *obj, Visitor *v, const char *name,
+                            void *opaque, Error **errp)
+{
+    VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
+
+    Error *local_err = NULL;
+    uint32_t value;
+
+    visit_type_uint32(v, name, &value, &local_err);
+    if (local_err) {
+        goto out;
+    }
+    if (value != BDRV_SECTOR_SIZE && value != 4096) {
+        error_setg(&local_err,
+                   "Property '%s.%s' can only take value 512 or 4096",
+                   object_get_typename(obj), name);
+        goto out;
+    }
+
+    vus->blk_size = value;
+
+out:
+    error_propagate(errp, local_err);
+}
+
+
+static void vhost_user_blk_server_instance_finalize(Object *obj)
+{
+    VuBlockDev *vub = VHOST_USER_BLK_SERVER(obj);
+
+    vhost_user_blk_server_free(vub);
+}
+
+static void vhost_user_blk_server_complete(UserCreatable *obj, Error **errp)
+{
+    Error *local_error = NULL;
+    VuBlockDev *vub = VHOST_USER_BLK_SERVER(obj);
+
+    vhost_user_blk_server_start(vub, &local_error);
+
+    if (local_error) {
+        error_propagate(errp, local_error);
+        return;
+    }
+}
+
+static void vhost_user_blk_server_class_init(ObjectClass *klass,
+                                             void *class_data)
+{
+    UserCreatableClass *ucc = USER_CREATABLE_CLASS(klass);
+    ucc->complete = vhost_user_blk_server_complete;
+
+    object_class_property_add_bool(klass, "writable",
+                                   vu_get_block_writable,
+                                   vu_set_block_writable);
+
+    object_class_property_add_str(klass, "node-name",
+                                  vu_get_node_name,
+                                  vu_set_node_name);
+
+    object_class_property_add_str(klass, "unix-socket",
+                                  vu_get_unix_socket,
+                                  vu_set_unix_socket);
+
+    object_class_property_add(klass, "blk-size", "uint32",
+                              vu_get_blk_size, vu_set_blk_size,
+                              NULL, NULL);
+}
+
+static const TypeInfo vhost_user_blk_server_info = {
+    .name = TYPE_VHOST_USER_BLK_SERVER,
+    .parent = TYPE_OBJECT,
+    .instance_size = sizeof(VuBlockDev),
+    .instance_finalize = vhost_user_blk_server_instance_finalize,
+    .class_init = vhost_user_blk_server_class_init,
+    .interfaces = (InterfaceInfo[]) {
+        {TYPE_USER_CREATABLE},
+        {}
+    },
+};
+
+static void vhost_user_blk_server_register_types(void)
+{
+    type_register_static(&vhost_user_blk_server_info);
+}
+
+type_init(vhost_user_blk_server_register_types)
diff --git a/block/export/vhost-user-blk-server.h b/block/export/vhost-user-blk-server.h
new file mode 100644
index 0000000000..6a393ed49d
--- /dev/null
+++ b/block/export/vhost-user-blk-server.h
@@ -0,0 +1,34 @@
+/*
+ * Sharing QEMU block devices via vhost-user protocol
+ *
+ * Author: Coiby Xu <coiby.xu@gmail.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * later.  See the COPYING file in the top-level directory.
+ */
+
+#ifndef VHOST_USER_BLK_SERVER_H
+#define VHOST_USER_BLK_SERVER_H
+#include "util/vhost-user-server.h"
+
+typedef struct VuBlockDev VuBlockDev;
+#define TYPE_VHOST_USER_BLK_SERVER "vhost-user-blk-server"
+#define VHOST_USER_BLK_SERVER(obj) \
+   OBJECT_CHECK(VuBlockDev, obj, TYPE_VHOST_USER_BLK_SERVER)
+
+/* vhost user block device */
+struct VuBlockDev {
+    Object parent_obj;
+    char *node_name;
+    SocketAddress *addr;
+    AioContext *ctx;
+    VuServer vu_server;
+    uint32_t blk_size;
+    BlockBackend *backend;
+    QIOChannelSocket *sioc;
+    QTAILQ_ENTRY(VuBlockDev) next;
+    struct virtio_blk_config blkcfg;
+    bool writable;
+};
+
+#endif /* VHOST_USER_BLK_SERVER_H */
diff --git a/softmmu/vl.c b/softmmu/vl.c
index ae5451bc23..e4549871e1 100644
--- a/softmmu/vl.c
+++ b/softmmu/vl.c
@@ -2520,6 +2520,10 @@ static bool object_create_initial(const char *type, QemuOpts *opts)
     }
 #endif
 
+    /* Reason: vhost-user-blk-server property "node-name" */
+    if (g_str_equal(type, "vhost-user-blk-server")) {
+        return false;
+    }
     /*
      * Reason: filter-* property "netdev" etc.
      */
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v8 4/4] new qTest case to test the vhost-user-blk-server
  2020-06-04 23:35 [PATCH v8 0/4] vhost-user block device backend implementation Coiby Xu
                   ` (2 preceding siblings ...)
  2020-06-04 23:35 ` [PATCH v8 3/4] vhost-user block device backend server Coiby Xu
@ 2020-06-04 23:35 ` Coiby Xu
  2020-06-05  5:01   ` Thomas Huth
  2020-06-11 12:37 ` [PATCH v8 0/4] vhost-user block device backend implementation Stefano Garzarella
  2020-06-11 15:27 ` Stefan Hajnoczi
  5 siblings, 1 reply; 21+ messages in thread
From: Coiby Xu @ 2020-06-04 23:35 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, Laurent Vivier, Thomas Huth, Coiby Xu, bharatlkmlkvm,
	stefanha, Paolo Bonzini

This test case has the same tests as tests/virtio-blk-test.c except for
the tests that involve block_resize. Since the vhost-user server can
only serve one client at a time, two instances of qemu-storage-daemon
are launched for the hotplug test.

In order not to block scripts/tap-driver.pl, the vhost-user-blk test
sends a "quit" command to qemu-storage-daemon's QMP monitor. A function
is therefore added to libqtest.c to establish a socket connection with a
socket server.

Signed-off-by: Coiby Xu <coiby.xu@gmail.com>
---
 tests/Makefile.include              |   3 +-
 tests/qtest/Makefile.include        |   2 +
 tests/qtest/libqos/vhost-user-blk.c | 130 +++++
 tests/qtest/libqos/vhost-user-blk.h |  44 ++
 tests/qtest/libqtest.c              |  54 +-
 tests/qtest/libqtest.h              |  38 ++
 tests/qtest/vhost-user-blk-test.c   | 737 ++++++++++++++++++++++++++++
 7 files changed, 976 insertions(+), 32 deletions(-)
 create mode 100644 tests/qtest/libqos/vhost-user-blk.c
 create mode 100644 tests/qtest/libqos/vhost-user-blk.h
 create mode 100644 tests/qtest/vhost-user-blk-test.c

diff --git a/tests/Makefile.include b/tests/Makefile.include
index 6e3d6370df..d8578346b0 100644
--- a/tests/Makefile.include
+++ b/tests/Makefile.include
@@ -636,7 +636,8 @@ endef
 $(patsubst %, check-qtest-%, $(QTEST_TARGETS)): check-qtest-%: %-softmmu/all $(check-qtest-y)
 	$(call do_test_human,$(check-qtest-$*-y:%=tests/qtest/%$(EXESUF)) $(check-qtest-generic-y:%=tests/qtest/%$(EXESUF)), \
 	  QTEST_QEMU_BINARY=$*-softmmu/qemu-system-$* \
-	  QTEST_QEMU_IMG=qemu-img$(EXESUF))
+	  QTEST_QEMU_IMG=./qemu-img$(EXESUF) \
+	  QTEST_QEMU_STORAGE_DAEMON_BINARY=./qemu-storage-daemon$(EXESUF))
 
 check-unit: $(check-unit-y)
 	$(call do_test_human, $^)
diff --git a/tests/qtest/Makefile.include b/tests/qtest/Makefile.include
index 9e5a51d033..b6f081cb26 100644
--- a/tests/qtest/Makefile.include
+++ b/tests/qtest/Makefile.include
@@ -186,6 +186,7 @@ libqos-obj-y += tests/qtest/libqos/virtio.o
 libqos-obj-$(CONFIG_VIRTFS) += tests/qtest/libqos/virtio-9p.o
 libqos-obj-y += tests/qtest/libqos/virtio-balloon.o
 libqos-obj-y += tests/qtest/libqos/virtio-blk.o
+libqos-obj-$(CONFIG_LINUX) += tests/qtest/libqos/vhost-user-blk.o
 libqos-obj-y += tests/qtest/libqos/virtio-mmio.o
 libqos-obj-y += tests/qtest/libqos/virtio-net.o
 libqos-obj-y += tests/qtest/libqos/virtio-pci.o
@@ -230,6 +231,7 @@ qos-test-obj-$(CONFIG_VHOST_NET_USER) += tests/qtest/vhost-user-test.o $(chardev
 qos-test-obj-y += tests/qtest/virtio-test.o
 qos-test-obj-$(CONFIG_VIRTFS) += tests/qtest/virtio-9p-test.o
 qos-test-obj-y += tests/qtest/virtio-blk-test.o
+qos-test-obj-$(CONFIG_LINUX) += tests/qtest/vhost-user-blk-test.o
 qos-test-obj-y += tests/qtest/virtio-net-test.o
 qos-test-obj-y += tests/qtest/virtio-rng-test.o
 qos-test-obj-y += tests/qtest/virtio-scsi-test.o
diff --git a/tests/qtest/libqos/vhost-user-blk.c b/tests/qtest/libqos/vhost-user-blk.c
new file mode 100644
index 0000000000..3de9c59194
--- /dev/null
+++ b/tests/qtest/libqos/vhost-user-blk.c
@@ -0,0 +1,130 @@
+/*
+ * libqos driver framework
+ *
+ * Based on tests/qtest/libqos/virtio-blk.c
+ *
+ * Copyright (c) 2020 Coiby Xu <coiby.xu@gmail.com>
+ *
+ * Copyright (c) 2018 Emanuele Giuseppe Esposito <e.emanuelegiuseppe@gmail.com>
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License version 2.1 as published by the Free Software Foundation.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>
+ */
+
+#include "qemu/osdep.h"
+#include "libqtest.h"
+#include "qemu/module.h"
+#include "standard-headers/linux/virtio_blk.h"
+#include "libqos/qgraph.h"
+#include "libqos/vhost-user-blk.h"
+
+#define PCI_SLOT                0x04
+#define PCI_FN                  0x00
+
+/* virtio-blk-device */
+static void *qvhost_user_blk_get_driver(QVhostUserBlk *v_blk,
+                                    const char *interface)
+{
+    if (!g_strcmp0(interface, "vhost-user-blk")) {
+        return v_blk;
+    }
+    if (!g_strcmp0(interface, "virtio")) {
+        return v_blk->vdev;
+    }
+
+    fprintf(stderr, "%s not present in vhost-user-blk-device\n", interface);
+    g_assert_not_reached();
+}
+
+static void *qvhost_user_blk_device_get_driver(void *object,
+                                           const char *interface)
+{
+    QVhostUserBlkDevice *v_blk = object;
+    return qvhost_user_blk_get_driver(&v_blk->blk, interface);
+}
+
+static void *vhost_user_blk_device_create(void *virtio_dev,
+                                      QGuestAllocator *t_alloc,
+                                      void *addr)
+{
+    QVhostUserBlkDevice *vhost_user_blk = g_new0(QVhostUserBlkDevice, 1);
+    QVhostUserBlk *interface = &vhost_user_blk->blk;
+
+    interface->vdev = virtio_dev;
+
+    vhost_user_blk->obj.get_driver = qvhost_user_blk_device_get_driver;
+
+    return &vhost_user_blk->obj;
+}
+
+/* virtio-blk-pci */
+static void *qvhost_user_blk_pci_get_driver(void *object, const char *interface)
+{
+    QVhostUserBlkPCI *v_blk = object;
+    if (!g_strcmp0(interface, "pci-device")) {
+        return v_blk->pci_vdev.pdev;
+    }
+    return qvhost_user_blk_get_driver(&v_blk->blk, interface);
+}
+
+static void *vhost_user_blk_pci_create(void *pci_bus, QGuestAllocator *t_alloc,
+                                      void *addr)
+{
+    QVhostUserBlkPCI *vhost_user_blk = g_new0(QVhostUserBlkPCI, 1);
+    QVhostUserBlk *interface = &vhost_user_blk->blk;
+    QOSGraphObject *obj = &vhost_user_blk->pci_vdev.obj;
+
+    virtio_pci_init(&vhost_user_blk->pci_vdev, pci_bus, addr);
+    interface->vdev = &vhost_user_blk->pci_vdev.vdev;
+
+    g_assert_cmphex(interface->vdev->device_type, ==, VIRTIO_ID_BLOCK);
+
+    obj->get_driver = qvhost_user_blk_pci_get_driver;
+
+    return obj;
+}
+
+static void vhost_user_blk_register_nodes(void)
+{
+    /*
+     * FIXME: every test using these two nodes needs to set up a
+     * -drive,id=drive0; otherwise QEMU is not going to start.
+     * Therefore, we do not include "produces" edge for virtio
+     * and pci-device yet.
+     */
+
+    char *arg = g_strdup_printf("id=drv0,chardev=char1,addr=%x.%x",
+                                PCI_SLOT, PCI_FN);
+
+    QPCIAddress addr = {
+        .devfn = QPCI_DEVFN(PCI_SLOT, PCI_FN),
+    };
+
+    QOSGraphEdgeOptions opts = { };
+
+    /* virtio-blk-device */
+    /* opts.extra_device_opts = "drive=drive0"; */
+    qos_node_create_driver("vhost-user-blk-device", vhost_user_blk_device_create);
+    qos_node_consumes("vhost-user-blk-device", "virtio-bus", &opts);
+    qos_node_produces("vhost-user-blk-device", "vhost-user-blk");
+
+    /* virtio-blk-pci */
+    opts.extra_device_opts = arg;
+    add_qpci_address(&opts, &addr);
+    qos_node_create_driver("vhost-user-blk-pci", vhost_user_blk_pci_create);
+    qos_node_consumes("vhost-user-blk-pci", "pci-bus", &opts);
+    qos_node_produces("vhost-user-blk-pci", "vhost-user-blk");
+
+    g_free(arg);
+}
+
+libqos_init(vhost_user_blk_register_nodes);
diff --git a/tests/qtest/libqos/vhost-user-blk.h b/tests/qtest/libqos/vhost-user-blk.h
new file mode 100644
index 0000000000..ef4ef09cca
--- /dev/null
+++ b/tests/qtest/libqos/vhost-user-blk.h
@@ -0,0 +1,44 @@
+/*
+ * libqos driver framework
+ *
+ * Copyright (c) 2018 Emanuele Giuseppe Esposito <e.emanuelegiuseppe@gmail.com>
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License version 2 as published by the Free Software Foundation.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>
+ */
+
+#ifndef TESTS_LIBQOS_VHOST_USER_BLK_H
+#define TESTS_LIBQOS_VHOST_USER_BLK_H
+
+#include "libqos/qgraph.h"
+#include "libqos/virtio.h"
+#include "libqos/virtio-pci.h"
+
+typedef struct QVhostUserBlk QVhostUserBlk;
+typedef struct QVhostUserBlkPCI QVhostUserBlkPCI;
+typedef struct QVhostUserBlkDevice QVhostUserBlkDevice;
+
+struct QVhostUserBlk {
+    QVirtioDevice *vdev;
+};
+
+struct QVhostUserBlkPCI {
+    QVirtioPCIDevice pci_vdev;
+    QVhostUserBlk blk;
+};
+
+struct QVhostUserBlkDevice {
+    QOSGraphObject obj;
+    QVhostUserBlk blk;
+};
+
+#endif
diff --git a/tests/qtest/libqtest.c b/tests/qtest/libqtest.c
index 49075b55a1..a7b7c96206 100644
--- a/tests/qtest/libqtest.c
+++ b/tests/qtest/libqtest.c
@@ -31,40 +31,9 @@
 #include "qapi/qmp/qlist.h"
 #include "qapi/qmp/qstring.h"
 
-#define MAX_IRQ 256
 #define SOCKET_TIMEOUT 50
 #define SOCKET_MAX_FDS 16
 
-
-typedef void (*QTestSendFn)(QTestState *s, const char *buf);
-typedef void (*ExternalSendFn)(void *s, const char *buf);
-typedef GString* (*QTestRecvFn)(QTestState *);
-
-typedef struct QTestClientTransportOps {
-    QTestSendFn     send;      /* for sending qtest commands */
-
-    /*
-     * use external_send to send qtest command strings through functions which
-     * do not accept a QTestState as the first parameter.
-     */
-    ExternalSendFn  external_send;
-
-    QTestRecvFn     recv_line; /* for receiving qtest command responses */
-} QTestTransportOps;
-
-struct QTestState
-{
-    int fd;
-    int qmp_fd;
-    pid_t qemu_pid;  /* our child QEMU process */
-    int wstatus;
-    int expected_status;
-    bool big_endian;
-    bool irq_level[MAX_IRQ];
-    GString *rx;
-    QTestTransportOps ops;
-};
-
 static GHookList abrt_hooks;
 static struct sigaction sigact_old;
 
@@ -101,6 +70,29 @@ static int init_socket(const char *socket_path)
     return sock;
 }
 
+int qtest_socket_client(char *server_socket_path)
+{
+    struct sockaddr_un serv_addr;
+    int sock;
+    int ret;
+    int retries = 0;
+    sock = socket(PF_UNIX, SOCK_STREAM, 0);
+    g_assert_cmpint(sock, !=, -1);
+    serv_addr.sun_family = AF_UNIX;
+    snprintf(serv_addr.sun_path, sizeof(serv_addr.sun_path), "%s", server_socket_path);
+
+    do {
+        ret = connect(sock, (struct sockaddr *)&serv_addr, sizeof(serv_addr));
+        if (ret == 0) {
+            break;
+        }
+        retries += 1;
+        g_usleep(G_USEC_PER_SEC);
+    } while (retries < 3);
+
+    g_assert_cmpint(ret, ==, 0);
+    return sock;
+}
 static int socket_accept(int sock)
 {
     struct sockaddr_un addr;
diff --git a/tests/qtest/libqtest.h b/tests/qtest/libqtest.h
index f5cf93c386..a6774d8781 100644
--- a/tests/qtest/libqtest.h
+++ b/tests/qtest/libqtest.h
@@ -20,8 +20,38 @@
 #include "qapi/qmp/qobject.h"
 #include "qapi/qmp/qdict.h"
 
+#define MAX_IRQ 256
+
 typedef struct QTestState QTestState;
 
+typedef void (*QTestSendFn)(QTestState *s, const char *buf);
+typedef void (*ExternalSendFn)(void *s, const char *buf);
+typedef GString* (*QTestRecvFn)(QTestState *);
+
+typedef struct QTestClientTransportOps {
+    QTestSendFn     send;      /* for sending qtest commands */
+
+    /*
+     * use external_send to send qtest command strings through functions which
+     * do not accept a QTestState as the first parameter.
+     */
+    ExternalSendFn  external_send;
+
+    QTestRecvFn     recv_line; /* for receiving qtest command responses */
+} QTestTransportOps;
+
+struct QTestState {
+    int fd;
+    int qmp_fd;
+    pid_t qemu_pid;  /* our child QEMU process */
+    int wstatus;
+    int expected_status;
+    bool big_endian;
+    bool irq_level[MAX_IRQ];
+    GString *rx;
+    QTestTransportOps ops;
+};
+
 /**
  * qtest_initf:
  * @fmt...: Format for creating other arguments to pass to QEMU, formatted
@@ -626,6 +656,14 @@ void qtest_add_data_func_full(const char *str, void *data,
 
 void qtest_add_abrt_handler(GHookFunc fn, const void *data);
 
+/**
+ * qtest_socket_client:
+ * @server_socket_path: the socket server's path
+ *
+ * Connect to a socket server; returns the connected socket's fd.
+ */
+int qtest_socket_client(char *server_socket_path);
+
 /**
  * qtest_qmp_assert_success:
  * @qts: QTestState instance to operate on
diff --git a/tests/qtest/vhost-user-blk-test.c b/tests/qtest/vhost-user-blk-test.c
new file mode 100644
index 0000000000..07fbe1508f
--- /dev/null
+++ b/tests/qtest/vhost-user-blk-test.c
@@ -0,0 +1,737 @@
+/*
+ * QTest testcase for VirtIO Block Device
+ *
+ * Copyright (c) 2014 SUSE LINUX Products GmbH
+ * Copyright (c) 2014 Marc Marí
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "libqtest-single.h"
+#include "qemu/bswap.h"
+#include "qemu/module.h"
+#include "standard-headers/linux/virtio_blk.h"
+#include "standard-headers/linux/virtio_pci.h"
+#include "libqos/qgraph.h"
+#include "libqos/vhost-user-blk.h"
+#include "libqos/libqos-pc.h"
+
+/* TODO actually test the results and get rid of this */
+#define qmp_discard_response(...) qobject_unref(qmp(__VA_ARGS__))
+
+#define TEST_IMAGE_SIZE         (64 * 1024 * 1024)
+#define QVIRTIO_BLK_TIMEOUT_US  (30 * 1000 * 1000)
+#define PCI_SLOT_HP             0x06
+
+typedef struct QVirtioBlkReq {
+    uint32_t type;
+    uint32_t ioprio;
+    uint64_t sector;
+    char *data;
+    uint8_t status;
+} QVirtioBlkReq;
+
+
+#ifdef HOST_WORDS_BIGENDIAN
+static const bool host_is_big_endian = true;
+#else
+static const bool host_is_big_endian; /* false */
+#endif
+
+static inline void virtio_blk_fix_request(QVirtioDevice *d, QVirtioBlkReq *req)
+{
+    if (qvirtio_is_big_endian(d) != host_is_big_endian) {
+        req->type = bswap32(req->type);
+        req->ioprio = bswap32(req->ioprio);
+        req->sector = bswap64(req->sector);
+    }
+}
+
+
+static inline void virtio_blk_fix_dwz_hdr(QVirtioDevice *d,
+    struct virtio_blk_discard_write_zeroes *dwz_hdr)
+{
+    if (qvirtio_is_big_endian(d) != host_is_big_endian) {
+        dwz_hdr->sector = bswap64(dwz_hdr->sector);
+        dwz_hdr->num_sectors = bswap32(dwz_hdr->num_sectors);
+        dwz_hdr->flags = bswap32(dwz_hdr->flags);
+    }
+}
+
+static uint64_t virtio_blk_request(QGuestAllocator *alloc, QVirtioDevice *d,
+                                   QVirtioBlkReq *req, uint64_t data_size)
+{
+    uint64_t addr;
+    uint8_t status = 0xFF;
+
+    switch (req->type) {
+    case VIRTIO_BLK_T_IN:
+    case VIRTIO_BLK_T_OUT:
+        g_assert_cmpuint(data_size % 512, ==, 0);
+        break;
+    case VIRTIO_BLK_T_DISCARD:
+    case VIRTIO_BLK_T_WRITE_ZEROES:
+        g_assert_cmpuint(data_size %
+                         sizeof(struct virtio_blk_discard_write_zeroes), ==, 0);
+        break;
+    default:
+        g_assert_cmpuint(data_size, ==, 0);
+    }
+
+    addr = guest_alloc(alloc, sizeof(*req) + data_size);
+
+    virtio_blk_fix_request(d, req);
+
+    memwrite(addr, req, 16);
+    memwrite(addr + 16, req->data, data_size);
+    memwrite(addr + 16 + data_size, &status, sizeof(status));
+
+    return addr;
+}
+
+/* Returns the request virtqueue so the caller can perform further tests */
+static QVirtQueue *test_basic(QVirtioDevice *dev, QGuestAllocator *alloc)
+{
+    QVirtioBlkReq req;
+    uint64_t req_addr;
+    uint64_t capacity;
+    uint64_t features;
+    uint32_t free_head;
+    uint8_t status;
+    char *data;
+    QTestState *qts = global_qtest;
+    QVirtQueue *vq;
+
+    features = qvirtio_get_features(dev);
+    features = features & ~(QVIRTIO_F_BAD_FEATURE |
+                    (1u << VIRTIO_RING_F_INDIRECT_DESC) |
+                    (1u << VIRTIO_RING_F_EVENT_IDX) |
+                    (1u << VIRTIO_BLK_F_SCSI));
+    qvirtio_set_features(dev, features);
+
+    capacity = qvirtio_config_readq(dev, 0);
+    g_assert_cmpint(capacity, ==, TEST_IMAGE_SIZE / 512);
+
+    vq = qvirtqueue_setup(dev, alloc, 0);
+
+    qvirtio_set_driver_ok(dev);
+
+    /* Write and read with 3 descriptor layout */
+    /* Write request */
+    req.type = VIRTIO_BLK_T_OUT;
+    req.ioprio = 1;
+    req.sector = 0;
+    req.data = g_malloc0(512);
+    strcpy(req.data, "TEST");
+
+    req_addr = virtio_blk_request(alloc, dev, &req, 512);
+
+    g_free(req.data);
+
+    free_head = qvirtqueue_add(qts, vq, req_addr, 16, false, true);
+    qvirtqueue_add(qts, vq, req_addr + 16, 512, false, true);
+    qvirtqueue_add(qts, vq, req_addr + 528, 1, true, false);
+
+    qvirtqueue_kick(qts, dev, vq, free_head);
+
+    qvirtio_wait_used_elem(qts, dev, vq, free_head, NULL,
+                           QVIRTIO_BLK_TIMEOUT_US);
+    status = readb(req_addr + 528);
+    g_assert_cmpint(status, ==, 0);
+
+    guest_free(alloc, req_addr);
+
+    /* Read request */
+    req.type = VIRTIO_BLK_T_IN;
+    req.ioprio = 1;
+    req.sector = 0;
+    req.data = g_malloc0(512);
+
+    req_addr = virtio_blk_request(alloc, dev, &req, 512);
+
+    g_free(req.data);
+
+    free_head = qvirtqueue_add(qts, vq, req_addr, 16, false, true);
+    qvirtqueue_add(qts, vq, req_addr + 16, 512, true, true);
+    qvirtqueue_add(qts, vq, req_addr + 528, 1, true, false);
+
+    qvirtqueue_kick(qts, dev, vq, free_head);
+
+    qvirtio_wait_used_elem(qts, dev, vq, free_head, NULL,
+                           QVIRTIO_BLK_TIMEOUT_US);
+    status = readb(req_addr + 528);
+    g_assert_cmpint(status, ==, 0);
+
+    data = g_malloc0(512);
+    memread(req_addr + 16, data, 512);
+    g_assert_cmpstr(data, ==, "TEST");
+    g_free(data);
+
+    guest_free(alloc, req_addr);
+
+    if (features & (1u << VIRTIO_BLK_F_WRITE_ZEROES)) {
+        struct virtio_blk_discard_write_zeroes dwz_hdr;
+        void *expected;
+
+        /*
+         * WRITE_ZEROES request on the same sector of previous test where
+         * we wrote "TEST".
+         */
+        req.type = VIRTIO_BLK_T_WRITE_ZEROES;
+        req.data = (char *) &dwz_hdr;
+        dwz_hdr.sector = 0;
+        dwz_hdr.num_sectors = 1;
+        dwz_hdr.flags = 0;
+
+        virtio_blk_fix_dwz_hdr(dev, &dwz_hdr);
+
+        req_addr = virtio_blk_request(alloc, dev, &req, sizeof(dwz_hdr));
+
+        free_head = qvirtqueue_add(qts, vq, req_addr, 16, false, true);
+        qvirtqueue_add(qts, vq, req_addr + 16, sizeof(dwz_hdr), false, true);
+        qvirtqueue_add(qts, vq, req_addr + 16 + sizeof(dwz_hdr), 1, true,
+                       false);
+
+        qvirtqueue_kick(qts, dev, vq, free_head);
+
+        qvirtio_wait_used_elem(qts, dev, vq, free_head, NULL,
+                               QVIRTIO_BLK_TIMEOUT_US);
+        status = readb(req_addr + 16 + sizeof(dwz_hdr));
+        g_assert_cmpint(status, ==, 0);
+
+        guest_free(alloc, req_addr);
+
+        /* Read request to check if the sector contains all zeroes */
+        req.type = VIRTIO_BLK_T_IN;
+        req.ioprio = 1;
+        req.sector = 0;
+        req.data = g_malloc0(512);
+
+        req_addr = virtio_blk_request(alloc, dev, &req, 512);
+
+        g_free(req.data);
+
+        free_head = qvirtqueue_add(qts, vq, req_addr, 16, false, true);
+        qvirtqueue_add(qts, vq, req_addr + 16, 512, true, true);
+        qvirtqueue_add(qts, vq, req_addr + 528, 1, true, false);
+
+        qvirtqueue_kick(qts, dev, vq, free_head);
+
+        qvirtio_wait_used_elem(qts, dev, vq, free_head, NULL,
+                               QVIRTIO_BLK_TIMEOUT_US);
+        status = readb(req_addr + 528);
+        g_assert_cmpint(status, ==, 0);
+
+        data = g_malloc(512);
+        expected = g_malloc0(512);
+        memread(req_addr + 16, data, 512);
+        g_assert_cmpmem(data, 512, expected, 512);
+        g_free(expected);
+        g_free(data);
+
+        guest_free(alloc, req_addr);
+    }
+
+    if (features & (1u << VIRTIO_BLK_F_DISCARD)) {
+        struct virtio_blk_discard_write_zeroes dwz_hdr;
+
+        req.type = VIRTIO_BLK_T_DISCARD;
+        req.data = (char *) &dwz_hdr;
+        dwz_hdr.sector = 0;
+        dwz_hdr.num_sectors = 1;
+        dwz_hdr.flags = 0;
+
+        virtio_blk_fix_dwz_hdr(dev, &dwz_hdr);
+
+        req_addr = virtio_blk_request(alloc, dev, &req, sizeof(dwz_hdr));
+
+        free_head = qvirtqueue_add(qts, vq, req_addr, 16, false, true);
+        qvirtqueue_add(qts, vq, req_addr + 16, sizeof(dwz_hdr), false, true);
+        qvirtqueue_add(qts, vq, req_addr + 16 + sizeof(dwz_hdr),
+                       1, true, false);
+
+        qvirtqueue_kick(qts, dev, vq, free_head);
+
+        qvirtio_wait_used_elem(qts, dev, vq, free_head, NULL,
+                               QVIRTIO_BLK_TIMEOUT_US);
+        status = readb(req_addr + 16 + sizeof(dwz_hdr));
+        g_assert_cmpint(status, ==, 0);
+
+        guest_free(alloc, req_addr);
+    }
+
+    if (features & (1u << VIRTIO_F_ANY_LAYOUT)) {
+        /* Write and read with 2 descriptor layout */
+        /* Write request */
+        req.type = VIRTIO_BLK_T_OUT;
+        req.ioprio = 1;
+        req.sector = 1;
+        req.data = g_malloc0(512);
+        strcpy(req.data, "TEST");
+
+        req_addr = virtio_blk_request(alloc, dev, &req, 512);
+
+        g_free(req.data);
+
+        free_head = qvirtqueue_add(qts, vq, req_addr, 528, false, true);
+        qvirtqueue_add(qts, vq, req_addr + 528, 1, true, false);
+        qvirtqueue_kick(qts, dev, vq, free_head);
+
+        qvirtio_wait_used_elem(qts, dev, vq, free_head, NULL,
+                               QVIRTIO_BLK_TIMEOUT_US);
+        status = readb(req_addr + 528);
+        g_assert_cmpint(status, ==, 0);
+
+        guest_free(alloc, req_addr);
+
+        /* Read request */
+        req.type = VIRTIO_BLK_T_IN;
+        req.ioprio = 1;
+        req.sector = 1;
+        req.data = g_malloc0(512);
+
+        req_addr = virtio_blk_request(alloc, dev, &req, 512);
+
+        g_free(req.data);
+
+        free_head = qvirtqueue_add(qts, vq, req_addr, 16, false, true);
+        qvirtqueue_add(qts, vq, req_addr + 16, 513, true, false);
+
+        qvirtqueue_kick(qts, dev, vq, free_head);
+
+        qvirtio_wait_used_elem(qts, dev, vq, free_head, NULL,
+                               QVIRTIO_BLK_TIMEOUT_US);
+        status = readb(req_addr + 528);
+        g_assert_cmpint(status, ==, 0);
+
+        data = g_malloc0(512);
+        memread(req_addr + 16, data, 512);
+        g_assert_cmpstr(data, ==, "TEST");
+        g_free(data);
+
+        guest_free(alloc, req_addr);
+    }
+
+    return vq;
+}
+
+static void basic(void *obj, void *data, QGuestAllocator *t_alloc)
+{
+    QVhostUserBlk *blk_if = obj;
+    QVirtQueue *vq;
+
+    vq = test_basic(blk_if->vdev, t_alloc);
+    qvirtqueue_cleanup(blk_if->vdev->bus, vq, t_alloc);
+
+}
+
+static void indirect(void *obj, void *u_data, QGuestAllocator *t_alloc)
+{
+    QVirtQueue *vq;
+    QVhostUserBlk *blk_if = obj;
+    QVirtioDevice *dev = blk_if->vdev;
+    QVirtioBlkReq req;
+    QVRingIndirectDesc *indirect;
+    uint64_t req_addr;
+    uint64_t capacity;
+    uint64_t features;
+    uint32_t free_head;
+    uint8_t status;
+    char *data;
+    QTestState *qts = global_qtest;
+
+    features = qvirtio_get_features(dev);
+    g_assert_cmphex(features & (1u << VIRTIO_RING_F_INDIRECT_DESC), !=, 0);
+    features = features & ~(QVIRTIO_F_BAD_FEATURE |
+                            (1u << VIRTIO_RING_F_EVENT_IDX) |
+                            (1u << VIRTIO_BLK_F_SCSI));
+    qvirtio_set_features(dev, features);
+
+    capacity = qvirtio_config_readq(dev, 0);
+    g_assert_cmpint(capacity, ==, TEST_IMAGE_SIZE / 512);
+
+    vq = qvirtqueue_setup(dev, t_alloc, 0);
+    qvirtio_set_driver_ok(dev);
+
+    /* Write request */
+    req.type = VIRTIO_BLK_T_OUT;
+    req.ioprio = 1;
+    req.sector = 0;
+    req.data = g_malloc0(512);
+    strcpy(req.data, "TEST");
+
+    req_addr = virtio_blk_request(t_alloc, dev, &req, 512);
+
+    g_free(req.data);
+
+    indirect = qvring_indirect_desc_setup(qts, dev, t_alloc, 2);
+    qvring_indirect_desc_add(dev, qts, indirect, req_addr, 528, false);
+    qvring_indirect_desc_add(dev, qts, indirect, req_addr + 528, 1, true);
+    free_head = qvirtqueue_add_indirect(qts, vq, indirect);
+    qvirtqueue_kick(qts, dev, vq, free_head);
+
+    qvirtio_wait_used_elem(qts, dev, vq, free_head, NULL,
+                           QVIRTIO_BLK_TIMEOUT_US);
+    status = readb(req_addr + 528);
+    g_assert_cmpint(status, ==, 0);
+
+    g_free(indirect);
+    guest_free(t_alloc, req_addr);
+
+    /* Read request */
+    req.type = VIRTIO_BLK_T_IN;
+    req.ioprio = 1;
+    req.sector = 0;
+    req.data = g_malloc0(512);
+    strcpy(req.data, "TEST");
+
+    req_addr = virtio_blk_request(t_alloc, dev, &req, 512);
+
+    g_free(req.data);
+
+    indirect = qvring_indirect_desc_setup(qts, dev, t_alloc, 2);
+    qvring_indirect_desc_add(dev, qts, indirect, req_addr, 16, false);
+    qvring_indirect_desc_add(dev, qts, indirect, req_addr + 16, 513, true);
+    free_head = qvirtqueue_add_indirect(qts, vq, indirect);
+    qvirtqueue_kick(qts, dev, vq, free_head);
+
+    qvirtio_wait_used_elem(qts, dev, vq, free_head, NULL,
+                           QVIRTIO_BLK_TIMEOUT_US);
+    status = readb(req_addr + 528);
+    g_assert_cmpint(status, ==, 0);
+
+    data = g_malloc0(512);
+    memread(req_addr + 16, data, 512);
+    g_assert_cmpstr(data, ==, "TEST");
+    g_free(data);
+
+    g_free(indirect);
+    guest_free(t_alloc, req_addr);
+    qvirtqueue_cleanup(dev->bus, vq, t_alloc);
+}
+
+
+static void idx(void *obj, void *u_data, QGuestAllocator *t_alloc)
+{
+    QVirtQueue *vq;
+    QVhostUserBlkPCI *blk = obj;
+    QVirtioPCIDevice *pdev = &blk->pci_vdev;
+    QVirtioDevice *dev = &pdev->vdev;
+    QVirtioBlkReq req;
+    uint64_t req_addr;
+    uint64_t capacity;
+    uint64_t features;
+    uint32_t free_head;
+    uint32_t write_head;
+    uint32_t desc_idx;
+    uint8_t status;
+    char *data;
+    QOSGraphObject *blk_object = obj;
+    QPCIDevice *pci_dev = blk_object->get_driver(blk_object, "pci-device");
+    QTestState *qts = global_qtest;
+
+    if (qpci_check_buggy_msi(pci_dev)) {
+        return;
+    }
+
+    qpci_msix_enable(pdev->pdev);
+    qvirtio_pci_set_msix_configuration_vector(pdev, t_alloc, 0);
+
+    features = qvirtio_get_features(dev);
+    features = features & ~(QVIRTIO_F_BAD_FEATURE |
+                            (1u << VIRTIO_RING_F_INDIRECT_DESC) |
+                            (1u << VIRTIO_F_NOTIFY_ON_EMPTY) |
+                            (1u << VIRTIO_BLK_F_SCSI));
+    qvirtio_set_features(dev, features);
+
+    capacity = qvirtio_config_readq(dev, 0);
+    g_assert_cmpint(capacity, ==, TEST_IMAGE_SIZE / 512);
+
+    vq = qvirtqueue_setup(dev, t_alloc, 0);
+    qvirtqueue_pci_msix_setup(pdev, (QVirtQueuePCI *)vq, t_alloc, 1);
+
+    qvirtio_set_driver_ok(dev);
+
+    /* Write request */
+    req.type = VIRTIO_BLK_T_OUT;
+    req.ioprio = 1;
+    req.sector = 0;
+    req.data = g_malloc0(512);
+    strcpy(req.data, "TEST");
+
+    req_addr = virtio_blk_request(t_alloc, dev, &req, 512);
+
+    g_free(req.data);
+
+    free_head = qvirtqueue_add(qts, vq, req_addr, 16, false, true);
+    qvirtqueue_add(qts, vq, req_addr + 16, 512, false, true);
+    qvirtqueue_add(qts, vq, req_addr + 528, 1, true, false);
+    qvirtqueue_kick(qts, dev, vq, free_head);
+
+    qvirtio_wait_used_elem(qts, dev, vq, free_head, NULL,
+                           QVIRTIO_BLK_TIMEOUT_US);
+
+    /* Write request */
+    req.type = VIRTIO_BLK_T_OUT;
+    req.ioprio = 1;
+    req.sector = 1;
+    req.data = g_malloc0(512);
+    strcpy(req.data, "TEST");
+
+    req_addr = virtio_blk_request(t_alloc, dev, &req, 512);
+
+    g_free(req.data);
+
+    /* Notify after processing the third request */
+    qvirtqueue_set_used_event(qts, vq, 2);
+    free_head = qvirtqueue_add(qts, vq, req_addr, 16, false, true);
+    qvirtqueue_add(qts, vq, req_addr + 16, 512, false, true);
+    qvirtqueue_add(qts, vq, req_addr + 528, 1, true, false);
+    qvirtqueue_kick(qts, dev, vq, free_head);
+    write_head = free_head;
+
+    /* No notification expected */
+    status = qvirtio_wait_status_byte_no_isr(qts, dev,
+                                             vq, req_addr + 528,
+                                             QVIRTIO_BLK_TIMEOUT_US);
+    g_assert_cmpint(status, ==, 0);
+
+    guest_free(t_alloc, req_addr);
+
+    /* Read request */
+    req.type = VIRTIO_BLK_T_IN;
+    req.ioprio = 1;
+    req.sector = 1;
+    req.data = g_malloc0(512);
+
+    req_addr = virtio_blk_request(t_alloc, dev, &req, 512);
+
+    g_free(req.data);
+
+    free_head = qvirtqueue_add(qts, vq, req_addr, 16, false, true);
+    qvirtqueue_add(qts, vq, req_addr + 16, 512, true, true);
+    qvirtqueue_add(qts, vq, req_addr + 528, 1, true, false);
+
+    qvirtqueue_kick(qts, dev, vq, free_head);
+
+    /* We get just one notification for both requests */
+    qvirtio_wait_used_elem(qts, dev, vq, write_head, NULL,
+                           QVIRTIO_BLK_TIMEOUT_US);
+    g_assert(qvirtqueue_get_buf(qts, vq, &desc_idx, NULL));
+    g_assert_cmpint(desc_idx, ==, free_head);
+
+    status = readb(req_addr + 528);
+    g_assert_cmpint(status, ==, 0);
+
+    data = g_malloc0(512);
+    memread(req_addr + 16, data, 512);
+    g_assert_cmpstr(data, ==, "TEST");
+    g_free(data);
+
+    guest_free(t_alloc, req_addr);
+
+    /* End test */
+    qpci_msix_disable(pdev->pdev);
+
+    qvirtqueue_cleanup(dev->bus, vq, t_alloc);
+}
+
+static void pci_hotplug(void *obj, void *data, QGuestAllocator *t_alloc)
+{
+    QVirtioPCIDevice *dev1 = obj;
+    QVirtioPCIDevice *dev;
+    QTestState *qts = dev1->pdev->bus->qts;
+
+    /* plug secondary disk */
+    qtest_qmp_device_add(qts, "vhost-user-blk-pci", "drv1",
+                         "{'addr': %s, 'chardev': 'char2'}",
+                         stringify(PCI_SLOT_HP) ".0");
+
+    dev = virtio_pci_new(dev1->pdev->bus,
+                         &(QPCIAddress) { .devfn = QPCI_DEVFN(PCI_SLOT_HP, 0)
+                                        });
+    g_assert_nonnull(dev);
+    g_assert_cmpint(dev->vdev.device_type, ==, VIRTIO_ID_BLOCK);
+    qvirtio_pci_device_disable(dev);
+    qos_object_destroy((QOSGraphObject *)dev);
+
+    /* unplug secondary disk */
+    qpci_unplug_acpi_device_test(qts, "drv1", PCI_SLOT_HP);
+}
+
+/*
+ * Check that setting the vring addr on a non-existent virtqueue does
+ * not crash.
+ */
+static void test_nonexistent_virtqueue(void *obj, void *data,
+                                       QGuestAllocator *t_alloc)
+{
+    QVhostUserBlkPCI *blk = obj;
+    QVirtioPCIDevice *pdev = &blk->pci_vdev;
+    QPCIBar bar0;
+    QPCIDevice *dev;
+
+    dev = qpci_device_find(pdev->pdev->bus, QPCI_DEVFN(4, 0));
+    g_assert(dev != NULL);
+    qpci_device_enable(dev);
+
+    bar0 = qpci_iomap(dev, 0, NULL);
+
+    qpci_io_writeb(dev, bar0, VIRTIO_PCI_QUEUE_SEL, 2);
+    qpci_io_writel(dev, bar0, VIRTIO_PCI_QUEUE_PFN, 1);
+
+    g_free(dev);
+}
+
+static const char *qtest_qemu_storage_daemon_binary(void)
+{
+    const char *qemu_storage_daemon_bin;
+
+    qemu_storage_daemon_bin = getenv("QTEST_QEMU_STORAGE_DAEMON_BINARY");
+    if (!qemu_storage_daemon_bin) {
+        fprintf(stderr, "Environment variable "
+                        "QTEST_QEMU_STORAGE_DAEMON_BINARY required\n");
+        exit(0);
+    }
+
+    return qemu_storage_daemon_bin;
+}
+
+static void drive_destroy(void *path)
+{
+    unlink(path);
+    g_free(path);
+    qos_invalidate_command_line();
+}
+
+
+static char *drive_create(void)
+{
+    int fd, ret;
+    /* vhost-user-blk won't recognize a drive located in /tmp */
+    char *t_path = g_strdup("qtest.XXXXXX");
+
+    /* Create a temporary raw image */
+    fd = mkstemp(t_path);
+    g_assert_cmpint(fd, >=, 0);
+    ret = ftruncate(fd, TEST_IMAGE_SIZE);
+    g_assert_cmpint(ret, ==, 0);
+    close(fd);
+
+    g_test_queue_destroy(drive_destroy, t_path);
+    return t_path;
+}
+
+static char sock_path_template[] = "/tmp/qtest.vhost_user_blk.XXXXXX";
+static char qmp_sock_path_template[] = "/tmp/qtest.vhost_user_blk.qmp.XXXXXX";
+
+
+static void quit_storage_daemon(void *qmp_test_state)
+{
+    qobject_unref(qtest_qmp((QTestState *)qmp_test_state, "{ 'execute': 'quit' }"));
+    g_free(qmp_test_state);
+}
+
+static char *start_vhost_user_blk(void)
+{
+    int fd, qmp_fd;
+    char *sock_path = g_strdup(sock_path_template);
+    char *qmp_sock_path = g_strdup(qmp_sock_path_template);
+    fd = mkstemp(sock_path);
+    g_assert_cmpint(fd, >=, 0);
+    g_test_queue_destroy(drive_destroy, sock_path);
+
+
+    qmp_fd = mkstemp(qmp_sock_path);
+    g_assert_cmpint(qmp_fd, >=, 0);
+    g_test_queue_destroy(drive_destroy, qmp_sock_path);
+    QTestState *qmp_test_state = g_new0(QTestState, 1);
+
+    /*
+     * Ask qemu-storage-daemon to quit so it
+     * will not block scripts/tap-driver.pl.
+     */
+    g_test_queue_destroy(quit_storage_daemon, qmp_test_state);
+    /* create image file */
+    const char *img_path = drive_create();
+
+    const char *vhost_user_blk_bin = qtest_qemu_storage_daemon_binary();
+    gchar *command = g_strdup_printf(
+            "exec %s "
+            "--blockdev driver=file,node-name=disk,filename=%s "
+            "--object vhost-user-blk-server,id=disk,unix-socket=%s,"
+            "node-name=disk,writable=on "
+            "--chardev socket,id=qmp,path=%s,server,nowait --monitor chardev=qmp",
+            vhost_user_blk_bin, img_path, sock_path, qmp_sock_path);
+
+
+    g_test_message("starting vhost-user backend: %s", command);
+    pid_t pid = fork();
+    if (pid == 0) {
+        execlp("/bin/sh", "sh", "-c", command, NULL);
+        exit(1);
+    }
+    g_free(command);
+    qmp_test_state->qmp_fd = qtest_socket_client(qmp_sock_path);
+
+    qobject_unref(qtest_qmp(qmp_test_state,
+                  "{ 'execute': 'qmp_capabilities' }"));
+    return sock_path;
+}
+
+
+static void *vhost_user_blk_test_setup(GString *cmd_line, void *arg)
+{
+    char *sock_path1 = start_vhost_user_blk();
+    g_string_append_printf(cmd_line,
+                           " -object memory-backend-memfd,id=mem,size=128M,share=on -numa node,memdev=mem "
+                           "-chardev socket,id=char1,path=%s ", sock_path1);
+    return arg;
+}
+
+
+/*
+ * Setup for hotplug.
+ *
+ * Since the vhost-user server only serves one vhost-user client at a
+ * time, a second qemu-storage-daemon instance provides another export
+ * for the hotplug test.
+ */
+static void *vhost_user_blk_hotplug_test_setup(GString *cmd_line, void *arg)
+{
+    vhost_user_blk_test_setup(cmd_line, arg);
+    char *sock_path2 = start_vhost_user_blk();
+    /* "-chardev socket,id=char2" is used for pci_hotplug */
+    g_string_append_printf(cmd_line, "-chardev socket,id=char2,path=%s",
+                           sock_path2);
+    return arg;
+}
+
+static void register_vhost_user_blk_test(void)
+{
+    QOSGraphTestOptions opts = {
+        .before = vhost_user_blk_test_setup,
+    };
+
+    /*
+     * tests for vhost-user-blk and vhost-user-blk-pci
+     * The tests are borrowed from tests/virtio-blk-test.c. But some tests
+     * regarding block_resize don't work for vhost-user-blk.
+     * vhost-user-blk device doesn't have -drive, so tests containing
+     * block_resize are also abandoned,
+     *  - config
+     *  - resize
+     */
+    qos_add_test("basic", "vhost-user-blk", basic, &opts);
+    qos_add_test("indirect", "vhost-user-blk", indirect, &opts);
+    qos_add_test("idx", "vhost-user-blk-pci", idx, &opts);
+    qos_add_test("nxvirtq", "vhost-user-blk-pci",
+                 test_nonexistent_virtqueue, &opts);
+
+    opts.before = vhost_user_blk_hotplug_test_setup;
+    qos_add_test("hotplug", "vhost-user-blk-pci", pci_hotplug, &opts);
+}
+
+libqos_init(register_vhost_user_blk_test);
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH v8 4/4] new qTest case to test the vhost-user-blk-server
  2020-06-04 23:35 ` [PATCH v8 4/4] new qTest case to test the vhost-user-blk-server Coiby Xu
@ 2020-06-05  5:01   ` Thomas Huth
  2020-06-05  6:22     ` Coiby Xu
  0 siblings, 1 reply; 21+ messages in thread
From: Thomas Huth @ 2020-06-05  5:01 UTC (permalink / raw)
  To: Coiby Xu, qemu-devel
  Cc: kwolf, Laurent Vivier, bharatlkmlkvm, stefanha, Paolo Bonzini

On 05/06/2020 01.35, Coiby Xu wrote:
> This test case has the same tests as tests/virtio-blk-test.c except for
> tests that use block_resize. Since the vhost-user server can only serve
> one client at a time, two instances of qemu-storage-daemon are launched
> for the hotplug test.
> 
> In order not to block scripts/tap-driver.pl, the test sends a "quit"
> command to qemu-storage-daemon's QMP monitor. So a function is added
> to libqtest.c to establish a socket connection with a socket
> server.
> 
> Signed-off-by: Coiby Xu <coiby.xu@gmail.com>
> ---
>  tests/Makefile.include              |   3 +-
>  tests/qtest/Makefile.include        |   2 +
>  tests/qtest/libqos/vhost-user-blk.c | 130 +++++
>  tests/qtest/libqos/vhost-user-blk.h |  44 ++
>  tests/qtest/libqtest.c              |  54 +-
>  tests/qtest/libqtest.h              |  38 ++
>  tests/qtest/vhost-user-blk-test.c   | 737 ++++++++++++++++++++++++++++
>  7 files changed, 976 insertions(+), 32 deletions(-)
>  create mode 100644 tests/qtest/libqos/vhost-user-blk.c
>  create mode 100644 tests/qtest/libqos/vhost-user-blk.h
>  create mode 100644 tests/qtest/vhost-user-blk-test.c
[...]
> new file mode 100644
> index 0000000000..3de9c59194
> --- /dev/null
> +++ b/tests/qtest/libqos/vhost-user-blk.c
> @@ -0,0 +1,130 @@
> +/*
> + * libqos driver framework
> + *
> + * Based on tests/qtest/libqos/virtio-blk.c
> + *
> + * Copyright (c) 2020 Coiby Xu <coiby.xu@gmail.com>
> + *
> + * Copyright (c) 2018 Emanuele Giuseppe Esposito <e.emanuelegiuseppe@gmail.com>
> + *
> + * This library is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License version 2.1 as published by the Free Software Foundation.

Thanks for the update! ...

[...]
> diff --git a/tests/qtest/libqos/vhost-user-blk.h b/tests/qtest/libqos/vhost-user-blk.h
> new file mode 100644
> index 0000000000..ef4ef09cca
> --- /dev/null
> +++ b/tests/qtest/libqos/vhost-user-blk.h
> @@ -0,0 +1,44 @@
> +/*
> + * libqos driver framework
> + *
> + * Copyright (c) 2018 Emanuele Giuseppe Esposito <e.emanuelegiuseppe@gmail.com>
> + *
> + * This library is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License version 2 as published by the Free Software Foundation.

... but you've missed the header here :-(

[...]
> diff --git a/tests/qtest/libqtest.c b/tests/qtest/libqtest.c
> index 49075b55a1..a7b7c96206 100644
> --- a/tests/qtest/libqtest.c
> +++ b/tests/qtest/libqtest.c
> @@ -31,40 +31,9 @@
>  #include "qapi/qmp/qlist.h"
>  #include "qapi/qmp/qstring.h"
>  
> -#define MAX_IRQ 256
>  #define SOCKET_TIMEOUT 50
>  #define SOCKET_MAX_FDS 16
>  
> -
> -typedef void (*QTestSendFn)(QTestState *s, const char *buf);
> -typedef void (*ExternalSendFn)(void *s, const char *buf);
> -typedef GString* (*QTestRecvFn)(QTestState *);
> -
> -typedef struct QTestClientTransportOps {
> -    QTestSendFn     send;      /* for sending qtest commands */
> -
> -    /*
> -     * use external_send to send qtest command strings through functions which
> -     * do not accept a QTestState as the first parameter.
> -     */
> -    ExternalSendFn  external_send;
> -
> -    QTestRecvFn     recv_line; /* for receiving qtest command responses */
> -} QTestTransportOps;
> -
> -struct QTestState
> -{
> -    int fd;
> -    int qmp_fd;
> -    pid_t qemu_pid;  /* our child QEMU process */
> -    int wstatus;
> -    int expected_status;
> -    bool big_endian;
> -    bool irq_level[MAX_IRQ];
> -    GString *rx;
> -    QTestTransportOps ops;
> -};

Why do you have to move struct QTestState and friends to the header
instead? I'd prefer if we could keep it here if possible?

 Thomas




* Re: [PATCH v8 4/4] new qTest case to test the vhost-user-blk-server
  2020-06-05  5:01   ` Thomas Huth
@ 2020-06-05  6:22     ` Coiby Xu
  2020-06-05  9:25       ` Thomas Huth
  0 siblings, 1 reply; 21+ messages in thread
From: Coiby Xu @ 2020-06-05  6:22 UTC (permalink / raw)
  To: Thomas Huth
  Cc: kwolf, Laurent Vivier, qemu-devel, bharatlkmlkvm, stefanha,
	Paolo Bonzini

On Fri, Jun 05, 2020 at 07:01:33AM +0200, Thomas Huth wrote:
>> diff --git a/tests/qtest/libqos/vhost-user-blk.h b/tests/qtest/libqos/vhost-user-blk.h
>> new file mode 100644
>> index 0000000000..ef4ef09cca
>> --- /dev/null
>> +++ b/tests/qtest/libqos/vhost-user-blk.h
>> @@ -0,0 +1,44 @@
>> +/*
>> + * libqos driver framework
>> + *
>> + * Copyright (c) 2018 Emanuele Giuseppe Esposito <e.emanuelegiuseppe@gmail.com>
>> + *
>> + * This library is free software; you can redistribute it and/or
>> + * modify it under the terms of the GNU Lesser General Public
>> + * License version 2 as published by the Free Software Foundation.
>
>... but you've missed the header here :-(

Thank you for reminding me of this issue!

>> diff --git a/tests/qtest/libqtest.c b/tests/qtest/libqtest.c
>> index 49075b55a1..a7b7c96206 100644
>> --- a/tests/qtest/libqtest.c
>> +++ b/tests/qtest/libqtest.c
>> @@ -31,40 +31,9 @@
>>  #include "qapi/qmp/qlist.h"
>>  #include "qapi/qmp/qstring.h"
>>
>> -#define MAX_IRQ 256
>>  #define SOCKET_TIMEOUT 50
>>  #define SOCKET_MAX_FDS 16
>>
>> -
>> -typedef void (*QTestSendFn)(QTestState *s, const char *buf);
>> -typedef void (*ExternalSendFn)(void *s, const char *buf);
>> -typedef GString* (*QTestRecvFn)(QTestState *);
>> -
>> -typedef struct QTestClientTransportOps {
>> -    QTestSendFn     send;      /* for sending qtest commands */
>> -
>> -    /*
>> -     * use external_send to send qtest command strings through functions which
>> -     * do not accept a QTestState as the first parameter.
>> -     */
>> -    ExternalSendFn  external_send;
>> -
>> -    QTestRecvFn     recv_line; /* for receiving qtest command responses */
>> -} QTestTransportOps;
>> -
>> -struct QTestState
>> -{
>> -    int fd;
>> -    int qmp_fd;
>> -    pid_t qemu_pid;  /* our child QEMU process */
>> -    int wstatus;
>> -    int expected_status;
>> -    bool big_endian;
>> -    bool irq_level[MAX_IRQ];
>> -    GString *rx;
>> -    QTestTransportOps ops;
>> -};
>
>Why do you have to move struct QTestState and friends to the header
>instead? I'd prefer if we could keep it here if possible?

tests/qtest/vhost-user-blk-test.c needs to talk to qemu-storage-daemon's
QMP. Thus I allocate a QTestState struct with g_new0() so I can reuse
related functions like qtest_qmp(), and that requires the QTestState
struct definition to be visible.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v8 4/4] new qTest case to test the vhost-user-blk-server
  2020-06-05  6:22     ` Coiby Xu
@ 2020-06-05  9:25       ` Thomas Huth
  2020-06-05 13:27         ` Coiby Xu
  0 siblings, 1 reply; 21+ messages in thread
From: Thomas Huth @ 2020-06-05  9:25 UTC (permalink / raw)
  To: Coiby Xu
  Cc: kwolf, Laurent Vivier, qemu-devel, bharatlkmlkvm, stefanha,
	Paolo Bonzini

On 05/06/2020 08.22, Coiby Xu wrote:
> On Fri, Jun 05, 2020 at 07:01:33AM +0200, Thomas Huth wrote:
>>> diff --git a/tests/qtest/libqos/vhost-user-blk.h
>>> b/tests/qtest/libqos/vhost-user-blk.h
>>> new file mode 100644
>>> index 0000000000..ef4ef09cca
>>> --- /dev/null
>>> +++ b/tests/qtest/libqos/vhost-user-blk.h
>>> @@ -0,0 +1,44 @@
>>> +/*
>>> + * libqos driver framework
>>> + *
>>> + * Copyright (c) 2018 Emanuele Giuseppe Esposito
>>> <e.emanuelegiuseppe@gmail.com>
>>> + *
>>> + * This library is free software; you can redistribute it and/or
>>> + * modify it under the terms of the GNU Lesser General Public
>>> + * License version 2 as published by the Free Software Foundation.
>>
>> ... but you've missed the header here :-(
> 
> Thank you for reminding me of this issue!
> 
>>> diff --git a/tests/qtest/libqtest.c b/tests/qtest/libqtest.c
>>> index 49075b55a1..a7b7c96206 100644
>>> --- a/tests/qtest/libqtest.c
>>> +++ b/tests/qtest/libqtest.c
>>> @@ -31,40 +31,9 @@
>>>  #include "qapi/qmp/qlist.h"
>>>  #include "qapi/qmp/qstring.h"
>>>
>>> -#define MAX_IRQ 256
>>>  #define SOCKET_TIMEOUT 50
>>>  #define SOCKET_MAX_FDS 16
>>>
>>> -
>>> -typedef void (*QTestSendFn)(QTestState *s, const char *buf);
>>> -typedef void (*ExternalSendFn)(void *s, const char *buf);
>>> -typedef GString* (*QTestRecvFn)(QTestState *);
>>> -
>>> -typedef struct QTestClientTransportOps {
>>> -    QTestSendFn     send;      /* for sending qtest commands */
>>> -
>>> -    /*
>>> -     * use external_send to send qtest command strings through
>>> functions which
>>> -     * do not accept a QTestState as the first parameter.
>>> -     */
>>> -    ExternalSendFn  external_send;
>>> -
>>> -    QTestRecvFn     recv_line; /* for receiving qtest command
>>> responses */
>>> -} QTestTransportOps;
>>> -
>>> -struct QTestState
>>> -{
>>> -    int fd;
>>> -    int qmp_fd;
>>> -    pid_t qemu_pid;  /* our child QEMU process */
>>> -    int wstatus;
>>> -    int expected_status;
>>> -    bool big_endian;
>>> -    bool irq_level[MAX_IRQ];
>>> -    GString *rx;
>>> -    QTestTransportOps ops;
>>> -};
>>
>> Why do you have to move struct QTestState and friends to the header
>> instead? I'd prefer if we could keep it here if possible?
> 
> tests/qtest/vhost-user-blk-test.c needs to talk to qemu-storage-daemon's
> QMP. Thus I g_new0 a QTestState struct to make use of related functions
> like qtest_qmp and this requires the QTestState struct definition.

Hm, ok, could that maybe be solved by introducing a wrapper function to
libqtest.c instead? Something like qtest_create_state_with_qmp_fd() or so?
Moving a define with a generic name like MAX_IRQ to a header really does
not sound like a good idea to me, so if that idea with the wrapper
function does not work out, could you please at least rename MAX_IRQ to
QTEST_MAX_IRQ or something similar?

 Thanks,
  Thomas




* Re: [PATCH v8 4/4] new qTest case to test the vhost-user-blk-server
  2020-06-05  9:25       ` Thomas Huth
@ 2020-06-05 13:27         ` Coiby Xu
  0 siblings, 0 replies; 21+ messages in thread
From: Coiby Xu @ 2020-06-05 13:27 UTC (permalink / raw)
  To: Thomas Huth
  Cc: kwolf, Laurent Vivier, qemu-devel, bharatlkmlkvm, stefanha,
	Paolo Bonzini

On Fri, Jun 05, 2020 at 11:25:26AM +0200, Thomas Huth wrote:
>On 05/06/2020 08.22, Coiby Xu wrote:
>> On Fri, Jun 05, 2020 at 07:01:33AM +0200, Thomas Huth wrote:
>>>> diff --git a/tests/qtest/libqos/vhost-user-blk.h
>>>> b/tests/qtest/libqos/vhost-user-blk.h
>>>> new file mode 100644
>>>> index 0000000000..ef4ef09cca
>>>> --- /dev/null
>>>> +++ b/tests/qtest/libqos/vhost-user-blk.h
>>>> @@ -0,0 +1,44 @@
>>>> +/*
>>>> + * libqos driver framework
>>>> + *
>>>> + * Copyright (c) 2018 Emanuele Giuseppe Esposito
>>>> <e.emanuelegiuseppe@gmail.com>
>>>> + *
>>>> + * This library is free software; you can redistribute it and/or
>>>> + * modify it under the terms of the GNU Lesser General Public
>>>> + * License version 2 as published by the Free Software Foundation.
>>>
>>> ... but you've missed the header here :-(
>>
>> Thank you for reminding me of this issue!
>>
>>>> diff --git a/tests/qtest/libqtest.c b/tests/qtest/libqtest.c
>>>> index 49075b55a1..a7b7c96206 100644
>>>> --- a/tests/qtest/libqtest.c
>>>> +++ b/tests/qtest/libqtest.c
>>>> @@ -31,40 +31,9 @@
>>>>  #include "qapi/qmp/qlist.h"
>>>>  #include "qapi/qmp/qstring.h"
>>>>
>>>> -#define MAX_IRQ 256
>>>>  #define SOCKET_TIMEOUT 50
>>>>  #define SOCKET_MAX_FDS 16
>>>>
>>>> -
>>>> -typedef void (*QTestSendFn)(QTestState *s, const char *buf);
>>>> -typedef void (*ExternalSendFn)(void *s, const char *buf);
>>>> -typedef GString* (*QTestRecvFn)(QTestState *);
>>>> -
>>>> -typedef struct QTestClientTransportOps {
>>>> -    QTestSendFn     send;      /* for sending qtest commands */
>>>> -
>>>> -    /*
>>>> -     * use external_send to send qtest command strings through
>>>> functions which
>>>> -     * do not accept a QTestState as the first parameter.
>>>> -     */
>>>> -    ExternalSendFn  external_send;
>>>> -
>>>> -    QTestRecvFn     recv_line; /* for receiving qtest command
>>>> responses */
>>>> -} QTestTransportOps;
>>>> -
>>>> -struct QTestState
>>>> -{
>>>> -    int fd;
>>>> -    int qmp_fd;
>>>> -    pid_t qemu_pid;  /* our child QEMU process */
>>>> -    int wstatus;
>>>> -    int expected_status;
>>>> -    bool big_endian;
>>>> -    bool irq_level[MAX_IRQ];
>>>> -    GString *rx;
>>>> -    QTestTransportOps ops;
>>>> -};
>>>
>>> Why do you have to move struct QTestState and friends to the header
>>> instead? I'd prefer if we could keep it here if possible?
>>
>> tests/qtest/vhost-user-blk-test.c needs to talk to qemu-storage-daemon's
>> QMP. Thus I g_new0 a QTestState struct to make use of related functions
>> like qtest_qmp and this requires the QTestState struct definition.
>
>Hm, ok, could that maybe be solved by introducing a wrapper function to
>libqtest.c instead? Something like qtest_create_state_with_qmp_fd() or so?
>Moving a define with a generic name like MAX_IRQ to a header really does
>not sound like a good idea to me, so if that idea with the wrapper
>function does not work out, could you please at least rename MAX_IRQ to
>QTEST_MAX_IRQ or something similar?
I didn't realize the QTestState struct is supposed to be hidden from the user and
not directly accessible. Declaring only a typedef for a struct in the header file
and defining the struct itself in the .c file is a new trick for me :)

This idea of creating a wrapper function qtest_create_state_with_qmp_fd
works as expected. Thank you!





* Re: [PATCH v8 1/4] Allow vu_message_read to be replaced
  2020-06-04 23:35 ` [PATCH v8 1/4] Allow vu_message_read to be replaced Coiby Xu
@ 2020-06-11 10:45   ` Stefan Hajnoczi
  2020-06-11 11:26   ` Marc-André Lureau
  1 sibling, 0 replies; 21+ messages in thread
From: Stefan Hajnoczi @ 2020-06-11 10:45 UTC (permalink / raw)
  To: Coiby Xu; +Cc: kwolf, bharatlkmlkvm, qemu-devel, Dr. David Alan Gilbert


On Fri, Jun 05, 2020 at 07:35:35AM +0800, Coiby Xu wrote:
> Allow vu_message_read to be replaced by one which will make use of the
> QIOChannel functions. Thus reading vhost-user message won't stall the
> guest.
> 
> Signed-off-by: Coiby Xu <coiby.xu@gmail.com>
> ---
>  contrib/libvhost-user/libvhost-user-glib.c |  2 +-
>  contrib/libvhost-user/libvhost-user.c      | 11 ++++++-----
>  contrib/libvhost-user/libvhost-user.h      | 21 +++++++++++++++++++++
>  tests/vhost-user-bridge.c                  |  2 ++
>  tools/virtiofsd/fuse_virtio.c              |  4 ++--
>  5 files changed, 32 insertions(+), 8 deletions(-)

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>



* Re: [PATCH v8 1/4] Allow vu_message_read to be replaced
  2020-06-04 23:35 ` [PATCH v8 1/4] Allow vu_message_read to be replaced Coiby Xu
  2020-06-11 10:45   ` Stefan Hajnoczi
@ 2020-06-11 11:26   ` Marc-André Lureau
  1 sibling, 0 replies; 21+ messages in thread
From: Marc-André Lureau @ 2020-06-11 11:26 UTC (permalink / raw)
  To: Coiby Xu
  Cc: Kevin Wolf, bharatlkmlkvm, QEMU, Stefan Hajnoczi, Dr. David Alan Gilbert


On Fri, Jun 5, 2020 at 3:36 AM Coiby Xu <coiby.xu@gmail.com> wrote:

> Allow vu_message_read to be replaced by one which will make use of the
> QIOChannel functions. Thus reading vhost-user message won't stall the
> guest.
>
> Signed-off-by: Coiby Xu <coiby.xu@gmail.com>
>

Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>

---
>  contrib/libvhost-user/libvhost-user-glib.c |  2 +-
>  contrib/libvhost-user/libvhost-user.c      | 11 ++++++-----
>  contrib/libvhost-user/libvhost-user.h      | 21 +++++++++++++++++++++
>  tests/vhost-user-bridge.c                  |  2 ++
>  tools/virtiofsd/fuse_virtio.c              |  4 ++--
>  5 files changed, 32 insertions(+), 8 deletions(-)
>
> diff --git a/contrib/libvhost-user/libvhost-user-glib.c
> b/contrib/libvhost-user/libvhost-user-glib.c
> index 53f1ca4cdd..0df2ec9271 100644
> --- a/contrib/libvhost-user/libvhost-user-glib.c
> +++ b/contrib/libvhost-user/libvhost-user-glib.c
> @@ -147,7 +147,7 @@ vug_init(VugDev *dev, uint16_t max_queues, int socket,
>      g_assert(dev);
>      g_assert(iface);
>
> -    if (!vu_init(&dev->parent, max_queues, socket, panic, set_watch,
> +    if (!vu_init(&dev->parent, max_queues, socket, panic, NULL, set_watch,
>                   remove_watch, iface)) {
>          return false;
>      }
> diff --git a/contrib/libvhost-user/libvhost-user.c
> b/contrib/libvhost-user/libvhost-user.c
> index 3bca996c62..0c7368baa2 100644
> --- a/contrib/libvhost-user/libvhost-user.c
> +++ b/contrib/libvhost-user/libvhost-user.c
> @@ -67,8 +67,6 @@
>  /* The version of inflight buffer */
>  #define INFLIGHT_VERSION 1
>
> -#define VHOST_USER_HDR_SIZE offsetof(VhostUserMsg, payload.u64)
> -
>  /* The version of the protocol we support */
>  #define VHOST_USER_VERSION 1
>  #define LIBVHOST_USER_DEBUG 0
> @@ -412,7 +410,7 @@ vu_process_message_reply(VuDev *dev, const
> VhostUserMsg *vmsg)
>          goto out;
>      }
>
> -    if (!vu_message_read(dev, dev->slave_fd, &msg_reply)) {
> +    if (!dev->read_msg(dev, dev->slave_fd, &msg_reply)) {
>          goto out;
>      }
>
> @@ -647,7 +645,7 @@ vu_set_mem_table_exec_postcopy(VuDev *dev,
> VhostUserMsg *vmsg)
>      /* Wait for QEMU to confirm that it's registered the handler for the
>       * faults.
>       */
> -    if (!vu_message_read(dev, dev->sock, vmsg) ||
> +    if (!dev->read_msg(dev, dev->sock, vmsg) ||
>          vmsg->size != sizeof(vmsg->payload.u64) ||
>          vmsg->payload.u64 != 0) {
>          vu_panic(dev, "failed to receive valid ack for postcopy
> set-mem-table");
> @@ -1653,7 +1651,7 @@ vu_dispatch(VuDev *dev)
>      int reply_requested;
>      bool need_reply, success = false;
>
> -    if (!vu_message_read(dev, dev->sock, &vmsg)) {
> +    if (!dev->read_msg(dev, dev->sock, &vmsg)) {
>          goto end;
>      }
>
> @@ -1704,6 +1702,7 @@ vu_deinit(VuDev *dev)
>          }
>
>          if (vq->kick_fd != -1) {
> +            dev->remove_watch(dev, vq->kick_fd);
>              close(vq->kick_fd);
>              vq->kick_fd = -1;
>          }
> @@ -1751,6 +1750,7 @@ vu_init(VuDev *dev,
>          uint16_t max_queues,
>          int socket,
>          vu_panic_cb panic,
> +        vu_read_msg_cb read_msg,
>          vu_set_watch_cb set_watch,
>          vu_remove_watch_cb remove_watch,
>          const VuDevIface *iface)
> @@ -1768,6 +1768,7 @@ vu_init(VuDev *dev,
>
>      dev->sock = socket;
>      dev->panic = panic;
> +    dev->read_msg = read_msg ? read_msg : vu_message_read;
>      dev->set_watch = set_watch;
>      dev->remove_watch = remove_watch;
>      dev->iface = iface;
> diff --git a/contrib/libvhost-user/libvhost-user.h
> b/contrib/libvhost-user/libvhost-user.h
> index f30394fab6..d756da8548 100644
> --- a/contrib/libvhost-user/libvhost-user.h
> +++ b/contrib/libvhost-user/libvhost-user.h
> @@ -30,6 +30,8 @@
>
>  #define VHOST_MEMORY_MAX_NREGIONS 8
>
> +#define VHOST_USER_HDR_SIZE offsetof(VhostUserMsg, payload.u64)
> +
>  typedef enum VhostSetConfigType {
>      VHOST_SET_CONFIG_TYPE_MASTER = 0,
>      VHOST_SET_CONFIG_TYPE_MIGRATION = 1,
> @@ -205,6 +207,7 @@ typedef uint64_t (*vu_get_features_cb) (VuDev *dev);
>  typedef void (*vu_set_features_cb) (VuDev *dev, uint64_t features);
>  typedef int (*vu_process_msg_cb) (VuDev *dev, VhostUserMsg *vmsg,
>                                    int *do_reply);
> +typedef bool (*vu_read_msg_cb) (VuDev *dev, int sock, VhostUserMsg *vmsg);
>  typedef void (*vu_queue_set_started_cb) (VuDev *dev, int qidx, bool
> started);
>  typedef bool (*vu_queue_is_processed_in_order_cb) (VuDev *dev, int qidx);
>  typedef int (*vu_get_config_cb) (VuDev *dev, uint8_t *config, uint32_t
> len);
> @@ -373,6 +376,23 @@ struct VuDev {
>      bool broken;
>      uint16_t max_queues;
>
> +    /* @read_msg: custom method to read vhost-user message
> +     *
> +     * Read data from vhost_user socket fd and fill up
> +     * the passed VhostUserMsg *vmsg struct.
> +     *
> +     * If reading fails, it should close the received set of file
> +     * descriptors as socket message's auxiliary data.
> +     *
> +     * For the details, please refer to vu_message_read in libvhost-user.c
> +     * which will be used by default if not custom method is provided when
> +     * calling vu_init
> +     *
> +     * Returns: true if vhost-user message successfully received,
> +     *          otherwise return false.
> +     *
> +     */
> +    vu_read_msg_cb read_msg;
>      /* @set_watch: add or update the given fd to the watch set,
>       * call cb when condition is met */
>      vu_set_watch_cb set_watch;
> @@ -416,6 +436,7 @@ bool vu_init(VuDev *dev,
>               uint16_t max_queues,
>               int socket,
>               vu_panic_cb panic,
> +             vu_read_msg_cb read_msg,
>               vu_set_watch_cb set_watch,
>               vu_remove_watch_cb remove_watch,
>               const VuDevIface *iface);
> diff --git a/tests/vhost-user-bridge.c b/tests/vhost-user-bridge.c
> index 6c3d490611..bd43607a4d 100644
> --- a/tests/vhost-user-bridge.c
> +++ b/tests/vhost-user-bridge.c
> @@ -520,6 +520,7 @@ vubr_accept_cb(int sock, void *ctx)
>                   VHOST_USER_BRIDGE_MAX_QUEUES,
>                   conn_fd,
>                   vubr_panic,
> +                 NULL,
>                   vubr_set_watch,
>                   vubr_remove_watch,
>                   &vuiface)) {
> @@ -573,6 +574,7 @@ vubr_new(const char *path, bool client)
>                       VHOST_USER_BRIDGE_MAX_QUEUES,
>                       dev->sock,
>                       vubr_panic,
> +                     NULL,
>                       vubr_set_watch,
>                       vubr_remove_watch,
>                       &vuiface)) {
> diff --git a/tools/virtiofsd/fuse_virtio.c b/tools/virtiofsd/fuse_virtio.c
> index 3b6d16a041..666945c897 100644
> --- a/tools/virtiofsd/fuse_virtio.c
> +++ b/tools/virtiofsd/fuse_virtio.c
> @@ -980,8 +980,8 @@ int virtio_session_mount(struct fuse_session *se)
>      se->vu_socketfd = data_sock;
>      se->virtio_dev->se = se;
>      pthread_rwlock_init(&se->virtio_dev->vu_dispatch_rwlock, NULL);
> -    vu_init(&se->virtio_dev->dev, 2, se->vu_socketfd, fv_panic,
> fv_set_watch,
> -            fv_remove_watch, &fv_iface);
> +    vu_init(&se->virtio_dev->dev, 2, se->vu_socketfd, fv_panic, NULL,
> +            fv_set_watch, fv_remove_watch, &fv_iface);
>
>      return 0;
>  }
> --
> 2.26.2
>
>
>

-- 
Marc-André Lureau



* Re: [PATCH v8 0/4] vhost-user block device backend implementation
  2020-06-04 23:35 [PATCH v8 0/4] vhost-user block device backend implementation Coiby Xu
                   ` (3 preceding siblings ...)
  2020-06-04 23:35 ` [PATCH v8 4/4] new qTest case to test the vhost-user-blk-server Coiby Xu
@ 2020-06-11 12:37 ` Stefano Garzarella
  2020-06-14 18:46   ` Coiby Xu
  2020-06-11 15:27 ` Stefan Hajnoczi
  5 siblings, 1 reply; 21+ messages in thread
From: Stefano Garzarella @ 2020-06-11 12:37 UTC (permalink / raw)
  To: Coiby Xu; +Cc: kwolf, bharatlkmlkvm, qemu-devel, stefanha

Hi Coiby Xu,

On Fri, Jun 05, 2020 at 07:35:34AM +0800, Coiby Xu wrote:
> v8
>  - re-try connecting to socket server to fix asan error
>  - fix license naming issue
> 
> v7
>  - fix docker-test-debug@fedora errors by freeing malloced memory
> 
> v6
>  - add missing license header and include guard
>  - vhost-user server only serve one client one time
>  - fix a bug in custom vu_message_read
>  - using qemu-storage-daemon to start vhost-user-blk-server
>  - a bug fix to pass docker-test-clang@ubuntu
> 
> v5:
>  * re-use vu_kick_cb in libvhost-user
>  * keeping processing VhostUserMsg in the same coroutine until there is
>    detachment/attachment of AIOContext
>  * Spawn separate coroutine for each VuVirtqElement
>  * Other changes including relocating vhost-user-blk-server.c, coding
>    style etc.
> 
> v4:
>  * add object properties in class_init
>  * relocate vhost-user-blk-test
>  * other changes including using SocketAddress, coding style, etc.
> 
> v3:
>  * separate generic vhost-user-server code from vhost-user-blk-server
>    code
>  * re-write vu_message_read and kick hander function as coroutines to
>    directly call blk_co_preadv, blk_co_pwritev, etc.
>  * add aio_context notifier functions to support multi-threading model
>  * other fixes regarding coding style, warning report, etc.
> 
> v2:
>  * Only enable this feature for Linux because eventfd is a Linux-specific
>    feature
> 
> 
> This patch series is an implementation of vhost-user block device
> backend server, thanks to Stefan and Kevin's guidance.
> 
> Vhost-user block device backend server is a UserCreatable object and can be
> started using object_add,
> 
>  (qemu) object_add vhost-user-blk-server,id=ID,unix-socket=/tmp/vhost-user-blk_vhost.socket,node-name=DRIVE_NAME,writable=off,blk-size=512
>  (qemu) object_del ID
> 
> or appending the "-object" option when starting QEMU,
> 
>   $ -object vhost-user-blk-server,id=disk,unix-socket=/tmp/vhost-user-blk_vhost.socket,node-name=DRIVE_NAME,writable=off,blk-size=512
> 
> Then vhost-user client can connect to the server backend.
> For example, QEMU could act as a client,
> 
>   $ -m 256 -object memory-backend-memfd,id=mem,size=256M,share=on -numa node,memdev=mem -chardev socket,id=char1,path=/tmp/vhost-user-blk_vhost.socket -device vhost-user-blk-pci,id=blk0,chardev=char1
> 
> And guest OS could access this vhost-user block device after mounting it.
> 
> Coiby Xu (4):
>   Allow vu_message_read to be replaced
>   generic vhost user server
>   vhost-user block device backend server
>   new qTest case to test the vhost-user-blk-server
> 
>  block/Makefile.objs                        |   1 +
>  block/export/vhost-user-blk-server.c       | 716 ++++++++++++++++++++
>  block/export/vhost-user-blk-server.h       |  34 +
>  contrib/libvhost-user/libvhost-user-glib.c |   2 +-
>  contrib/libvhost-user/libvhost-user.c      |  11 +-
>  contrib/libvhost-user/libvhost-user.h      |  21 +
>  softmmu/vl.c                               |   4 +
>  tests/Makefile.include                     |   3 +-
>  tests/qtest/Makefile.include               |   2 +
>  tests/qtest/libqos/vhost-user-blk.c        | 130 ++++
>  tests/qtest/libqos/vhost-user-blk.h        |  44 ++
>  tests/qtest/libqtest.c                     |  54 +-
>  tests/qtest/libqtest.h                     |  38 ++
>  tests/qtest/vhost-user-blk-test.c          | 737 +++++++++++++++++++++
>  tests/vhost-user-bridge.c                  |   2 +
>  tools/virtiofsd/fuse_virtio.c              |   4 +-
>  util/Makefile.objs                         |   1 +
>  util/vhost-user-server.c                   | 406 ++++++++++++
>  util/vhost-user-server.h                   |  59 ++
>  19 files changed, 2229 insertions(+), 40 deletions(-)
>  create mode 100644 block/export/vhost-user-blk-server.c
>  create mode 100644 block/export/vhost-user-blk-server.h
>  create mode 100644 tests/qtest/libqos/vhost-user-blk.c
>  create mode 100644 tests/qtest/libqos/vhost-user-blk.h
>  create mode 100644 tests/qtest/vhost-user-blk-test.c
>  create mode 100644 util/vhost-user-server.c
>  create mode 100644 util/vhost-user-server.h
> 

Should we add an entry in the MAINTAINERS file for some of the new files?
(e.g. util/vhost-user-server.*)
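For illustration, such a MAINTAINERS entry might look like the sketch below (the section title and `S:` status are assumptions; the file paths come from the diffstat above):

```
Vhost-user block device backend
M: Coiby Xu <coiby.xu@gmail.com>
S: Maintained
F: block/export/vhost-user-blk-server.c
F: block/export/vhost-user-blk-server.h
F: util/vhost-user-server.c
F: util/vhost-user-server.h
```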

Thanks,
Stefano




* Re: [PATCH v8 2/4] generic vhost user server
  2020-06-04 23:35 ` [PATCH v8 2/4] generic vhost user server Coiby Xu
@ 2020-06-11 13:14   ` Stefan Hajnoczi
  2020-06-14 18:43     ` Coiby Xu
  0 siblings, 1 reply; 21+ messages in thread
From: Stefan Hajnoczi @ 2020-06-11 13:14 UTC (permalink / raw)
  To: Coiby Xu; +Cc: kwolf, bharatlkmlkvm, qemu-devel


On Fri, Jun 05, 2020 at 07:35:36AM +0800, Coiby Xu wrote:
> +static bool coroutine_fn
> +vu_message_read(VuDev *vu_dev, int conn_fd, VhostUserMsg *vmsg)
> +{
> +    struct iovec iov = {
> +        .iov_base = (char *)vmsg,
> +        .iov_len = VHOST_USER_HDR_SIZE,
> +    };
> +    int rc, read_bytes = 0;
> +    Error *local_err = NULL;
> +    /*
> +     * Store fds/nfds returned from qio_channel_readv_full into
> +     * temporary variables.
> +     *
> +     * VhostUserMsg is a packed structure, gcc will complain about passing
> +     * pointer to a packed structure member if we pass &VhostUserMsg.fd_num
> +     * and &VhostUserMsg.fds directly when calling qio_channel_readv_full,
> +     * thus two temporary variables nfds and fds are used here.
> +     */
> +    size_t nfds = 0, nfds_t = 0;
> +    int *fds = NULL, *fds_t = NULL;
> +    VuServer *server = container_of(vu_dev, VuServer, vu_dev);
> +    QIOChannel *ioc = NULL;
> +
> +    if (conn_fd == server->sioc->fd) {
> +        ioc = server->ioc;
> +    } else {
> +        /* Slave communication will also use this function to read msg */
> +        ioc = slave_io_channel(server, conn_fd, &local_err);
> +    }
> +
> +    if (!ioc) {
> +        error_report_err(local_err);
> +        goto fail;
> +    }
> +
> +    assert(qemu_in_coroutine());
> +    do {
> +        /*
> +         * qio_channel_readv_full may have short reads, keeping calling it
> +         * until getting VHOST_USER_HDR_SIZE or 0 bytes in total
> +         */
> +        rc = qio_channel_readv_full(ioc, &iov, 1, &fds_t, &nfds_t, &local_err);
> +        if (rc < 0) {
> +            if (rc == QIO_CHANNEL_ERR_BLOCK) {
> +                qio_channel_yield(ioc, G_IO_IN);
> +                continue;
> +            } else {
> +                error_report_err(local_err);
> +                return false;
> +            }
> +        }
> +        read_bytes += rc;
> +        if (nfds_t > 0) {
> +            fds = g_renew(int, fds, nfds + nfds_t);
> +            memcpy(fds + nfds, fds_t, nfds_t *sizeof(int));
> +            nfds += nfds_t;
> +            if (nfds > VHOST_MEMORY_MAX_NREGIONS) {
> +                error_report("A maximum of %d fds are allowed, "
> +                             "however got %lu fds now",
> +                             VHOST_MEMORY_MAX_NREGIONS, nfds);
> +                goto fail;
> +            }
> +            g_free(fds_t);

I'm not sure why the temporary fds[] array is necessary. Copying the fds
directly into vmsg->fds would be simpler:

  if (nfds + nfds_t > G_N_ELEMENTS(vmsg->fds)) {
      error_report("A maximum of %d fds are allowed, "
                   "however got %lu fds now",
                   VHOST_MEMORY_MAX_NREGIONS, nfds);
      goto fail;
  }
  memcpy(vmsg->fds + nfds, fds_t, nfds_t * sizeof(vmsg->fds[0]));
  nfds += nfds_t;

Did I misunderstand how this works?

> +        }
> +        if (read_bytes == VHOST_USER_HDR_SIZE || rc == 0) {
> +            break;
> +        }
> +        iov.iov_base = (char *)vmsg + read_bytes;
> +        iov.iov_len = VHOST_USER_HDR_SIZE - read_bytes;
> +    } while (true);
> +
> +    vmsg->fd_num = nfds;
> +    if (nfds > 0) {
> +        memcpy(vmsg->fds, fds, nfds * sizeof(int));
> +    }
> +    g_free(fds);
> +    /* qio_channel_readv_full will make socket fds blocking, unblock them */
> +    vmsg_unblock_fds(vmsg);
> +    if (vmsg->size > sizeof(vmsg->payload)) {
> +        error_report("Error: too big message request: %d, "
> +                     "size: vmsg->size: %u, "
> +                     "while sizeof(vmsg->payload) = %zu",
> +                     vmsg->request, vmsg->size, sizeof(vmsg->payload));
> +        goto fail;
> +    }
> +
> +    struct iovec iov_payload = {
> +        .iov_base = (char *)&vmsg->payload,
> +        .iov_len = vmsg->size,
> +    };
> +    if (vmsg->size) {
> +        rc = qio_channel_readv_all_eof(ioc, &iov_payload, 1, &local_err);
> +        if (rc == -1) {
> +            error_report_err(local_err);
> +            goto fail;
> +        }
> +    }
> +
> +    return true;
> +
> +fail:
> +    vmsg_close_fds(vmsg);
> +
> +    return false;
> +}
> +
> +
> +static void vu_client_start(VuServer *server);
> +static coroutine_fn void vu_client_trip(void *opaque)
> +{
> +    VuServer *server = opaque;
> +
> +    while (!server->aio_context_changed && server->sioc) {
> +        vu_dispatch(&server->vu_dev);
> +    }
> +
> +    if (server->aio_context_changed && server->sioc) {
> +        server->aio_context_changed = false;
> +        vu_client_start(server);
> +    }
> +}
> +
> +static void vu_client_start(VuServer *server)
> +{
> +    server->co_trip = qemu_coroutine_create(vu_client_trip, server);
> +    aio_co_enter(server->ctx, server->co_trip);
> +}
> +
> +/*
> + * a wrapper for vu_kick_cb
> + *
> + * since aio_dispatch can only pass one user data pointer to the
> + * callback function, pack VuDev and pvt into a struct. Then unpack it
> + * and pass them to vu_kick_cb
> + */
> +static void kick_handler(void *opaque)
> +{
> +    KickInfo *kick_info = opaque;
> +    kick_info->cb(kick_info->vu_dev, 0, (void *) kick_info->index);
> +}
> +
> +
> +static void
> +set_watch(VuDev *vu_dev, int fd, int vu_evt,
> +          vu_watch_cb cb, void *pvt)
> +{
> +
> +    VuServer *server = container_of(vu_dev, VuServer, vu_dev);
> +    g_assert(vu_dev);
> +    g_assert(fd >= 0);
> +    long index = (intptr_t) pvt;
> +    g_assert(cb);
> +    KickInfo *kick_info = &server->kick_info[index];
> +    if (!kick_info->cb) {
> +        kick_info->fd = fd;
> +        kick_info->cb = cb;
> +        qemu_set_nonblock(fd);
> +        aio_set_fd_handler(server->ioc->ctx, fd, false, kick_handler,
> +                           NULL, NULL, kick_info);
> +        kick_info->vu_dev = vu_dev;
> +    }
> +}
> +
> +
> +static void remove_watch(VuDev *vu_dev, int fd)
> +{
> +    VuServer *server;
> +    int i;
> +    int index = -1;
> +    g_assert(vu_dev);
> +    g_assert(fd >= 0);
> +
> +    server = container_of(vu_dev, VuServer, vu_dev);
> +    for (i = 0; i < vu_dev->max_queues; i++) {
> +        if (server->kick_info[i].fd == fd) {
> +            index = i;
> +            break;
> +        }
> +    }
> +
> +    if (index == -1) {
> +        return;
> +    }
> +    server->kick_info[i].cb = NULL;
> +    aio_set_fd_handler(server->ioc->ctx, fd, false, NULL, NULL, NULL, NULL);
> +}
> +
> +
> +static void vu_accept(QIONetListener *listener, QIOChannelSocket *sioc,
> +                      gpointer opaque)
> +{
> +    VuServer *server = opaque;
> +
> +    if (server->sioc) {
> +        warn_report("Only one vhost-user client is allowed to "
> +                    "connect the server one time");
> +        return;
> +    }
> +
> +    if (!vu_init(&server->vu_dev, server->max_queues, sioc->fd, panic_cb,
> +                 vu_message_read, set_watch, remove_watch, server->vu_iface)) {
> +        error_report("Failed to initialized libvhost-user");
> +        return;
> +    }
> +
> +    /*
> +     * Unset the callback function for network listener to make another
> +     * vhost-user client keeping waiting until this client disconnects
> +     */
> +    qio_net_listener_set_client_func(server->listener,
> +                                     NULL,
> +                                     NULL,
> +                                     NULL);
> +    server->sioc = sioc;
> +    server->kick_info = g_new0(KickInfo, server->max_queues);

Where is kick_info freed?

> +    /*
> +     * Increase the object reference, so cioc will not freed by

s/cioc/sioc/

> +     * qio_net_listener_channel_func which will call object_unref(OBJECT(sioc))
> +     */
> +    object_ref(OBJECT(server->sioc));
> +    qio_channel_set_name(QIO_CHANNEL(sioc), "vhost-user client");
> +    server->ioc = QIO_CHANNEL(sioc);
> +    object_ref(OBJECT(server->ioc));
> +    object_ref(OBJECT(sioc));

Why are there two object_refs for sioc and where is unref called?

> +    qio_channel_attach_aio_context(server->ioc, server->ctx);
> +    qio_channel_set_blocking(QIO_CHANNEL(server->sioc), false, NULL);
> +    vu_client_start(server);
> +}
> +
> +
> +void vhost_user_server_stop(VuServer *server)
> +{
> +    if (!server) {
> +        return;
> +    }
> +
> +    if (server->sioc) {
> +        close_client(server);
> +        object_unref(OBJECT(server->sioc));

This call is object_unref(NULL) since close_client() does server->sioc =
NULL.

> +    }
> +
> +    if (server->listener) {
> +        qio_net_listener_disconnect(server->listener);
> +        object_unref(OBJECT(server->listener));
> +    }
> +}
> +
> +static void detach_context(VuServer *server)
> +{
> +    int i;
> +    AioContext *ctx = server->ioc->ctx;
> +    qio_channel_detach_aio_context(server->ioc);
> +    for (i = 0; i < server->vu_dev.max_queues; i++) {
> +        if (server->kick_info[i].cb) {
> +            aio_set_fd_handler(ctx, server->kick_info[i].fd, false, NULL,
> +                               NULL, NULL, NULL);
> +        }
> +    }
> +}
> +
> +static void attach_context(VuServer *server, AioContext *ctx)
> +{
> +    int i;
> +    qio_channel_attach_aio_context(server->ioc, ctx);
> +    server->aio_context_changed = true;
> +    if (server->co_trip) {
> +        aio_co_schedule(ctx, server->co_trip);
> +    }
> +    for (i = 0; i < server->vu_dev.max_queues; i++) {
> +        if (server->kick_info[i].cb) {
> +            aio_set_fd_handler(ctx, server->kick_info[i].fd, false,
> +                               kick_handler, NULL, NULL,
> +                               &server->kick_info[i]);
> +        }
> +    }
> +}
> +
> +void vhost_user_server_set_aio_context(AioContext *ctx, VuServer *server)
> +{
> +    server->ctx = ctx ? ctx : qemu_get_aio_context();
> +    if (!server->sioc) {
> +        return;
> +    }
> +    if (ctx) {
> +        attach_context(server, ctx);
> +    } else {
> +        detach_context(server);
> +    }
> +}
> +
> +
> +bool vhost_user_server_start(uint16_t max_queues,
> +                             SocketAddress *socket_addr,
> +                             AioContext *ctx,
> +                             VuServer *server,
> +                             void *device_panic_notifier,
> +                             const VuDevIface *vu_iface,
> +                             Error **errp)
> +{
> +    server->listener = qio_net_listener_new();
> +    if (qio_net_listener_open_sync(server->listener, socket_addr, 1,
> +                                   errp) < 0) {
> +        goto error;
> +    }
> +
> +    qio_net_listener_set_name(server->listener, "vhost-user-backend-listener");
> +
> +    server->vu_iface = vu_iface;
> +    server->max_queues = max_queues;
> +    server->ctx = ctx;
> +    server->device_panic_notifier = device_panic_notifier;
> +    qio_net_listener_set_client_func(server->listener,
> +                                     vu_accept,
> +                                     server,
> +                                     NULL);

The qio_net_listener_set_client_func() call uses the default
GMainContext but we have an AioContext *ctx argument. This is
surprising. I would expect the socket to be handled in the AioContext.

Can you clarify how this should work?

> +
> +    return true;
> +error:
> +    g_free(server);

It's surprising that this function frees the server argument when an
error occurs. vhost_user_server_stop() does not free server. I suggest
letting the caller free server since they own the object.

> +    return false;
> +}
> diff --git a/util/vhost-user-server.h b/util/vhost-user-server.h
> new file mode 100644
> index 0000000000..4315556b66
> --- /dev/null
> +++ b/util/vhost-user-server.h
> @@ -0,0 +1,59 @@
> +/*
> + * Sharing QEMU devices via vhost-user protocol
> + *
> + * Author: Coiby Xu <coiby.xu@gmail.com>
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or
> + * later.  See the COPYING file in the top-level directory.
> + */
> +
> +#ifndef VHOST_USER_SERVER_H
> +#define VHOST_USER_SERVER_H
> +
> +#include "contrib/libvhost-user/libvhost-user.h"
> +#include "io/channel-socket.h"
> +#include "io/channel-file.h"
> +#include "io/net-listener.h"
> +#include "qemu/error-report.h"
> +#include "qapi/error.h"
> +#include "standard-headers/linux/virtio_blk.h"
> +
> +typedef struct KickInfo {
> +    VuDev *vu_dev;
> +    int fd; /*kick fd*/
> +    long index; /*queue index*/
> +    vu_watch_cb cb;
> +} KickInfo;
> +
> +typedef struct VuServer {
> +    QIONetListener *listener;
> +    AioContext *ctx;
> +    void (*device_panic_notifier)(struct VuServer *server) ;
> +    int max_queues;
> +    const VuDevIface *vu_iface;
> +    VuDev vu_dev;
> +    QIOChannel *ioc; /* The I/O channel with the client */
> +    QIOChannelSocket *sioc; /* The underlying data channel with the client */
> +    /* IOChannel for fd provided via VHOST_USER_SET_SLAVE_REQ_FD */
> +    QIOChannel *ioc_slave;
> +    QIOChannelSocket *sioc_slave;
> +    Coroutine *co_trip; /* coroutine for processing VhostUserMsg */
> +    KickInfo *kick_info; /* an array with the length of the queue number */
> +    /* restart coroutine co_trip if AIOContext is changed */
> +    bool aio_context_changed;
> +} VuServer;
> +
> +
> +bool vhost_user_server_start(uint16_t max_queues,
> +                             SocketAddress *unix_socket,
> +                             AioContext *ctx,
> +                             VuServer *server,
> +                             void *device_panic_notifier,

Please declare the function pointer type:

typedef void DevicePanicNotifierFn(struct VuServer *server);

Then the argument list can use DevicePanicNotifierFn
*device_panic_notifier instead of void *.

> +                             const VuDevIface *vu_iface,
> +                             Error **errp);
> +
> +void vhost_user_server_stop(VuServer *server);
> +
> +void vhost_user_server_set_aio_context(AioContext *ctx, VuServer *server);

If you send another revision, please make VuServer *server the first
argument of vhost_user_server_start() and
vhost_user_server_set_aio_context(). Functions usually have the object
they act on as the first argument.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v8 3/4] vhost-user block device backend server
  2020-06-04 23:35 ` [PATCH v8 3/4] vhost-user block device backend server Coiby Xu
@ 2020-06-11 15:24   ` Stefan Hajnoczi
  2020-06-14 19:04     ` Coiby Xu
  0 siblings, 1 reply; 21+ messages in thread
From: Stefan Hajnoczi @ 2020-06-11 15:24 UTC (permalink / raw)
  To: Coiby Xu
  Cc: kwolf, open list:Block layer core, qemu-devel, Max Reitz,
	bharatlkmlkvm, Paolo Bonzini

[-- Attachment #1: Type: text/plain, Size: 9655 bytes --]

On Fri, Jun 05, 2020 at 07:35:37AM +0800, Coiby Xu wrote:
> +static void coroutine_fn vu_block_virtio_process_req(void *opaque)
> +{
> +    struct req_data *data = opaque;
> +    VuServer *server = data->server;
> +    VuVirtq *vq = data->vq;
> +    VuVirtqElement *elem = data->elem;
> +    uint32_t type;
> +    VuBlockReq *req;
> +
> +    VuBlockDev *vdev_blk = get_vu_block_device_by_server(server);
> +    BlockBackend *backend = vdev_blk->backend;
> +
> +    struct iovec *in_iov = elem->in_sg;
> +    struct iovec *out_iov = elem->out_sg;
> +    unsigned in_num = elem->in_num;
> +    unsigned out_num = elem->out_num;
> +    /* refer to hw/block/virtio_blk.c */
> +    if (elem->out_num < 1 || elem->in_num < 1) {
> +        error_report("virtio-blk request missing headers");
> +        free(elem);
> +        return;
> +    }
> +
> +    req = g_new0(VuBlockReq, 1);

elem was allocated with enough space for VuBlockReq. Can this allocation
be eliminated?

  typedef struct VuBlockReq {
-     VuVirtqElement *elem;
+     VuVirtqElement elem;
      int64_t sector_num;
      size_t size;
      struct virtio_blk_inhdr *in;
      struct virtio_blk_outhdr out;
      VuServer *server;
      struct VuVirtq *vq;
  } VuBlockReq;

  req = vu_queue_pop(vu_dev, vq, sizeof(*req));

> +    req->server = server;
> +    req->vq = vq;
> +    req->elem = elem;
> +
> +    if (unlikely(iov_to_buf(out_iov, out_num, 0, &req->out,
> +                            sizeof(req->out)) != sizeof(req->out))) {
> +        error_report("virtio-blk request outhdr too short");
> +        goto err;
> +    }
> +
> +    iov_discard_front(&out_iov, &out_num, sizeof(req->out));
> +
> +    if (in_iov[in_num - 1].iov_len < sizeof(struct virtio_blk_inhdr)) {
> +        error_report("virtio-blk request inhdr too short");
> +        goto err;
> +    }
> +
> +    /* We always touch the last byte, so just see how big in_iov is.  */
> +    req->in = (void *)in_iov[in_num - 1].iov_base
> +              + in_iov[in_num - 1].iov_len
> +              - sizeof(struct virtio_blk_inhdr);
> +    iov_discard_back(in_iov, &in_num, sizeof(struct virtio_blk_inhdr));
> +
> +
> +    type = le32toh(req->out.type);

This implementation assumes the request is always little-endian. This is
true for VIRTIO 1.0+ but not for older versions. Please check that
VIRTIO_F_VERSION_1 has been set.

In QEMU code, le32_to_cpu(), le64_to_cpu(), etc. are commonly used
instead of le32toh(), etc.

> +    switch (type & ~VIRTIO_BLK_T_BARRIER) {
> +    case VIRTIO_BLK_T_IN:
> +    case VIRTIO_BLK_T_OUT: {
> +        ssize_t ret = 0;
> +        bool is_write = type & VIRTIO_BLK_T_OUT;
> +        req->sector_num = le64toh(req->out.sector);
> +
> +        int64_t offset = req->sector_num * vdev_blk->blk_size;
> +        QEMUIOVector *qiov = g_new0(QEMUIOVector, 1);

This can be allocated on the stack:

  QEMUIOVector qiov;

> +static void vhost_user_blk_server_free(VuBlockDev *vu_block_device)
> +{

I'm unsure why this is separate from vu_block_free(). Neither of these
functions actually free VuBlockDev, so the name could be changed to
vhost_user_blk_server_stop().

> +    if (!vu_block_device) {
> +        return;
> +    }
> +    vhost_user_server_stop(&vu_block_device->vu_server);
> +    vu_block_free(vu_block_device);
> +
> +}
> +
> +/*
> + * A exported drive can serve multiple multiple clients simutateously,
> + * thus no need to export the same drive twice.

This comment is outdated. Only one client is served at a time.

> +static void vhost_user_blk_server_start(VuBlockDev *vu_block_device,
> +                                        Error **errp)
> +{
> +
> +    const char *name = vu_block_device->node_name;
> +    SocketAddress *addr = vu_block_device->addr;
> +    char *unix_socket = vu_block_device->addr->u.q_unix.path;
> +
> +    if (vu_block_dev_find(name)) {
> +        error_setg(errp, "Vhost-user-blk server with node-name '%s' "
> +                   "has already been started",
> +                   name);
> +        return;
> +    }

I think blk_new() permissions should prevent multiple writers. Having
multiple readers would be okay. Therefore this check can be removed.

> +
> +    if (vu_block_dev_find_by_unix_socket(unix_socket)) {
> +        error_setg(errp, "Vhost-user-blk server with with socket_path '%s' "
> +                   "has already been started", unix_socket);
> +        return;
> +    }

Is it a problem if the same path is reused? I don't see an issue if the
user creates a vhost-user-blk server, connects a client, unlinks the
UNIX domain socket, and creates a new vhost-user-blk server with the
same path. It might be a little confusing but if the user wants to do
it, I don't see a reason to stop them.

> +
> +    if (!vu_block_init(vu_block_device, errp)) {
> +        return;
> +    }
> +
> +
> +    AioContext *ctx = bdrv_get_aio_context(blk_bs(vu_block_device->backend));
> +
> +    if (!vhost_user_server_start(VHOST_USER_BLK_MAX_QUEUES, addr, ctx,
> +                                 &vu_block_device->vu_server,
> +                                 NULL, &vu_block_iface,
> +                                 errp)) {

In the previous patch I mentioned that calling g_free(server) is
probably unexpected and here is an example of why that can be a problem.
vu_server is a struct field, not an independent heap-allocated object,
so calling g_free(server) will result in undefined behavior (freeing an
object that was not allocated with g_new()).

> +        goto error;
> +    }
> +
> +    QTAILQ_INSERT_TAIL(&vu_block_devs, vu_block_device, next);
> +    blk_add_aio_context_notifier(vu_block_device->backend, blk_aio_attached,
> +                                 blk_aio_detach, vu_block_device);
> +    return;
> +
> + error:
> +    vu_block_free(vu_block_device);
> +}
> +
> +static void vu_set_node_name(Object *obj, const char *value, Error **errp)
> +{
> +    VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
> +
> +    if (vus->node_name) {
> +        error_setg(errp, "evdev property already set");
> +        return;
> +    }

Setting it twice is okay, we just need to g_free(vus->node_name).

> +
> +    vus->node_name = g_strdup(value);
> +}
> +
> +static char *vu_get_node_name(Object *obj, Error **errp)
> +{
> +    VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
> +    return g_strdup(vus->node_name);
> +}
> +
> +
> +static void vu_set_unix_socket(Object *obj, const char *value,
> +                               Error **errp)
> +{
> +    VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
> +
> +    if (vus->addr) {
> +        error_setg(errp, "unix_socket property already set");
> +        return;
> +    }

Setting it twice is okay, we just need to
g_free(vus->addr->u.q_unix.path) and g_free(vus->addr).

> +static void vu_set_blk_size(Object *obj, Visitor *v, const char *name,
> +                            void *opaque, Error **errp)
> +{
> +    VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
> +
> +    Error *local_err = NULL;
> +    uint32_t value;
> +
> +    visit_type_uint32(v, name, &value, &local_err);
> +    if (local_err) {
> +        goto out;
> +    }
> +    if (value != BDRV_SECTOR_SIZE && value != 4096) {
> +        error_setg(&local_err,
> +                   "Property '%s.%s' can only take value 512 or 4096",
> +                   object_get_typename(obj), name);
> +        goto out;
> +    }

Please see hw/core/qdev-properties.c:set_blocksize() for input
validation checks (min=512, max=32768, must be a power of 2). This code
can be moved into a common utility function so that both
hw/core/qdev-properties.c and vhost-user-blk-server.c can use it.

> +
> +    vus->blk_size = value;
> +
> +out:
> +    error_propagate(errp, local_err);
> +    vus->blk_size = value;
> +}
> +
> +
> +static void vhost_user_blk_server_instance_finalize(Object *obj)
> +{
> +    VuBlockDev *vub = VHOST_USER_BLK_SERVER(obj);
> +
> +    vhost_user_blk_server_free(vub);
> +}
> +
> +static void vhost_user_blk_server_complete(UserCreatable *obj, Error **errp)
> +{
> +    Error *local_error = NULL;
> +    VuBlockDev *vub = VHOST_USER_BLK_SERVER(obj);
> +
> +    vhost_user_blk_server_start(vub, &local_error);

After this call succeeds the properties should become read-only
("writable", "node-name", "unix-socket", etc) to prevent modification at
runtime.

I think the easiest way to do that is by keeping a bool field in
VuBlockDev that the property setter functions can check.

> +
> +    if (local_error) {
> +        error_propagate(errp, local_error);
> +        return;
> +    }
> +}
> +
> +static void vhost_user_blk_server_class_init(ObjectClass *klass,
> +                                             void *class_data)
> +{
> +    UserCreatableClass *ucc = USER_CREATABLE_CLASS(klass);
> +    ucc->complete = vhost_user_blk_server_complete;
> +
> +    object_class_property_add_bool(klass, "writable",
> +                                   vu_get_block_writable,
> +                                   vu_set_block_writable);
> +
> +    object_class_property_add_str(klass, "node-name",
> +                                  vu_get_node_name,
> +                                  vu_set_node_name);
> +
> +    object_class_property_add_str(klass, "unix-socket",
> +                                  vu_get_unix_socket,
> +                                  vu_set_unix_socket);
> +
> +    object_class_property_add(klass, "blk-size", "uint32",
> +                              vu_get_blk_size, vu_set_blk_size,
> +                              NULL, NULL);

include/hw/block/block.h:DEFINE_BLOCK_PROPERTIES_BASE calls this
property "logical_block_size". Please use the same name for consistency.



* Re: [PATCH v8 0/4] vhost-user block device backend implementation
  2020-06-04 23:35 [PATCH v8 0/4] vhost-user block device backend implementation Coiby Xu
                   ` (4 preceding siblings ...)
  2020-06-11 12:37 ` [PATCH v8 0/4] vhost-user block device backend implementation Stefano Garzarella
@ 2020-06-11 15:27 ` Stefan Hajnoczi
  2020-06-12 15:58   ` Coiby Xu
  5 siblings, 1 reply; 21+ messages in thread
From: Stefan Hajnoczi @ 2020-06-11 15:27 UTC (permalink / raw)
  To: Coiby Xu; +Cc: kwolf, bharatlkmlkvm, qemu-devel


On Fri, Jun 05, 2020 at 07:35:34AM +0800, Coiby Xu wrote:
> v8
>  - re-try connecting to socket server to fix asan error
>  - fix license naming issue

Great, thanks for posting these patches!

I have posted feedback. I'd like to merge this soon. If you are busy I
can send you patches that address the comments I've made, please let me
know.

Thanks,
Stefan



* Re: [PATCH v8 0/4] vhost-user block device backend implementation
  2020-06-11 15:27 ` Stefan Hajnoczi
@ 2020-06-12 15:58   ` Coiby Xu
  0 siblings, 0 replies; 21+ messages in thread
From: Coiby Xu @ 2020-06-12 15:58 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: kwolf, bharatlkmlkvm, qemu-devel

On Thu, Jun 11, 2020 at 04:27:44PM +0100, Stefan Hajnoczi wrote:
>On Fri, Jun 05, 2020 at 07:35:34AM +0800, Coiby Xu wrote:
>> v8
>>  - re-try connecting to socket server to fix asan error
>>  - fix license naming issue
>
>Great, thanks for posting these patches!
>
>I have posted feedback. I'd like to merge this soon. If you are busy I
>can send you patches that address the comments I've made, please let me
>know.

Thank you for reviewing my work! I'll post v9 to address all the comments this
weekend; do you think that's soon enough?

Best regards,
Coiby



* Re: [PATCH v8 2/4] generic vhost user server
  2020-06-11 13:14   ` Stefan Hajnoczi
@ 2020-06-14 18:43     ` Coiby Xu
  0 siblings, 0 replies; 21+ messages in thread
From: Coiby Xu @ 2020-06-14 18:43 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: kwolf, bharatlkmlkvm, qemu-devel

On Thu, Jun 11, 2020 at 02:14:49PM +0100, Stefan Hajnoczi wrote:
>On Fri, Jun 05, 2020 at 07:35:36AM +0800, Coiby Xu wrote:
>> +static bool coroutine_fn
>> +vu_message_read(VuDev *vu_dev, int conn_fd, VhostUserMsg *vmsg)
>> +{
>> +    struct iovec iov = {
>> +        .iov_base = (char *)vmsg,
>> +        .iov_len = VHOST_USER_HDR_SIZE,
>> +    };
>> +    int rc, read_bytes = 0;
>> +    Error *local_err = NULL;
>> +    /*
>> +     * Store fds/nfds returned from qio_channel_readv_full into
>> +     * temporary variables.
>> +     *
>> +     * VhostUserMsg is a packed structure, gcc will complain about passing
>> +     * pointer to a packed structure member if we pass &VhostUserMsg.fd_num
>> +     * and &VhostUserMsg.fds directly when calling qio_channel_readv_full,
>> +     * thus two temporary variables nfds and fds are used here.
>> +     */
>> +    size_t nfds = 0, nfds_t = 0;
>> +    int *fds = NULL, *fds_t = NULL;
>> +    VuServer *server = container_of(vu_dev, VuServer, vu_dev);
>> +    QIOChannel *ioc = NULL;
>> +
>> +    if (conn_fd == server->sioc->fd) {
>> +        ioc = server->ioc;
>> +    } else {
>> +        /* Slave communication will also use this function to read msg */
>> +        ioc = slave_io_channel(server, conn_fd, &local_err);
>> +    }
>> +
>> +    if (!ioc) {
>> +        error_report_err(local_err);
>> +        goto fail;
>> +    }
>> +
>> +    assert(qemu_in_coroutine());
>> +    do {
>> +        /*
>> +         * qio_channel_readv_full may have short reads, keeping calling it
>> +         * until getting VHOST_USER_HDR_SIZE or 0 bytes in total
>> +         */
>> +        rc = qio_channel_readv_full(ioc, &iov, 1, &fds_t, &nfds_t, &local_err);
>> +        if (rc < 0) {
>> +            if (rc == QIO_CHANNEL_ERR_BLOCK) {
>> +                qio_channel_yield(ioc, G_IO_IN);
>> +                continue;
>> +            } else {
>> +                error_report_err(local_err);
>> +                return false;
>> +            }
>> +        }
>> +        read_bytes += rc;
>> +        if (nfds_t > 0) {
>> +            fds = g_renew(int, fds, nfds + nfds_t);
>> +            memcpy(fds + nfds, fds_t, nfds_t *sizeof(int));
>> +            nfds += nfds_t;
>> +            if (nfds > VHOST_MEMORY_MAX_NREGIONS) {
>> +                error_report("A maximum of %d fds are allowed, "
>> +                             "however got %lu fds now",
>> +                             VHOST_MEMORY_MAX_NREGIONS, nfds);
>> +                goto fail;
>> +            }
>> +            g_free(fds_t);
>
>I'm not sure why the temporary fds[] array is necessary. Copying the fds
>directly into vmsg->fds would be simpler:
>
>  if (nfds + nfds_t > G_N_ELEMENTS(vmsg->fds)) {
>      error_report("A maximum of %d fds are allowed, "
>                   "however got %lu fds now",
>                   VHOST_MEMORY_MAX_NREGIONS, nfds);
>      goto fail;
>  }
>  memcpy(vmsg->fds + nfds, fds_t, nfds_t * sizeof(vmsg->fds[0]));
>  nfds += nfds_t;
>
>Did I misunderstand how this works?

No, the temporary fds[] array is not necessary. Thank you for the
simplification!

>> +        }
>> +        if (read_bytes == VHOST_USER_HDR_SIZE || rc == 0) {
>> +            break;
>> +        }
>> +        iov.iov_base = (char *)vmsg + read_bytes;
>> +        iov.iov_len = VHOST_USER_HDR_SIZE - read_bytes;
>> +    } while (true);
>> +
>> +    vmsg->fd_num = nfds;
>> +    if (nfds > 0) {
>> +        memcpy(vmsg->fds, fds, nfds * sizeof(int));
>> +    }
>> +    g_free(fds);
>> +    /* qio_channel_readv_full will make socket fds blocking, unblock them */
>> +    vmsg_unblock_fds(vmsg);
>> +    if (vmsg->size > sizeof(vmsg->payload)) {
>> +        error_report("Error: too big message request: %d, "
>> +                     "size: vmsg->size: %u, "
>> +                     "while sizeof(vmsg->payload) = %zu",
>> +                     vmsg->request, vmsg->size, sizeof(vmsg->payload));
>> +        goto fail;
>> +    }
>> +
>> +    struct iovec iov_payload = {
>> +        .iov_base = (char *)&vmsg->payload,
>> +        .iov_len = vmsg->size,
>> +    };
>> +    if (vmsg->size) {
>> +        rc = qio_channel_readv_all_eof(ioc, &iov_payload, 1, &local_err);
>> +        if (rc == -1) {
>> +            error_report_err(local_err);
>> +            goto fail;
>> +        }
>> +    }
>> +
>> +    return true;
>> +
>> +fail:
>> +    vmsg_close_fds(vmsg);
>> +
>> +    return false;
>> +}
>> +
>> +
>> +static void vu_client_start(VuServer *server);
>> +static coroutine_fn void vu_client_trip(void *opaque)
>> +{
>> +    VuServer *server = opaque;
>> +
>> +    while (!server->aio_context_changed && server->sioc) {
>> +        vu_dispatch(&server->vu_dev);
>> +    }
>> +
>> +    if (server->aio_context_changed && server->sioc) {
>> +        server->aio_context_changed = false;
>> +        vu_client_start(server);
>> +    }
>> +}
>> +
>> +static void vu_client_start(VuServer *server)
>> +{
>> +    server->co_trip = qemu_coroutine_create(vu_client_trip, server);
>> +    aio_co_enter(server->ctx, server->co_trip);
>> +}
>> +
>> +/*
>> + * a wrapper for vu_kick_cb
>> + *
>> + * since aio_dispatch can only pass one user data pointer to the
>> + * callback function, pack VuDev and pvt into a struct. Then unpack it
>> + * and pass them to vu_kick_cb
>> + */
>> +static void kick_handler(void *opaque)
>> +{
>> +    KickInfo *kick_info = opaque;
>> +    kick_info->cb(kick_info->vu_dev, 0, (void *) kick_info->index);
>> +}
>> +
>> +
>> +static void
>> +set_watch(VuDev *vu_dev, int fd, int vu_evt,
>> +          vu_watch_cb cb, void *pvt)
>> +{
>> +
>> +    VuServer *server = container_of(vu_dev, VuServer, vu_dev);
>> +    g_assert(vu_dev);
>> +    g_assert(fd >= 0);
>> +    long index = (intptr_t) pvt;
>> +    g_assert(cb);
>> +    KickInfo *kick_info = &server->kick_info[index];
>> +    if (!kick_info->cb) {
>> +        kick_info->fd = fd;
>> +        kick_info->cb = cb;
>> +        qemu_set_nonblock(fd);
>> +        aio_set_fd_handler(server->ioc->ctx, fd, false, kick_handler,
>> +                           NULL, NULL, kick_info);
>> +        kick_info->vu_dev = vu_dev;
>> +    }
>> +}
>> +
>> +
>> +static void remove_watch(VuDev *vu_dev, int fd)
>> +{
>> +    VuServer *server;
>> +    int i;
>> +    int index = -1;
>> +    g_assert(vu_dev);
>> +    g_assert(fd >= 0);
>> +
>> +    server = container_of(vu_dev, VuServer, vu_dev);
>> +    for (i = 0; i < vu_dev->max_queues; i++) {
>> +        if (server->kick_info[i].fd == fd) {
>> +            index = i;
>> +            break;
>> +        }
>> +    }
>> +
>> +    if (index == -1) {
>> +        return;
>> +    }
>> +    server->kick_info[i].cb = NULL;
>> +    aio_set_fd_handler(server->ioc->ctx, fd, false, NULL, NULL, NULL, NULL);
>> +}
>> +
>> +
>> +static void vu_accept(QIONetListener *listener, QIOChannelSocket *sioc,
>> +                      gpointer opaque)
>> +{
>> +    VuServer *server = opaque;
>> +
>> +    if (server->sioc) {
>> +        warn_report("Only one vhost-user client is allowed to "
>> +                    "connect the server one time");
>> +        return;
>> +    }
>> +
>> +    if (!vu_init(&server->vu_dev, server->max_queues, sioc->fd, panic_cb,
>> +                 vu_message_read, set_watch, remove_watch, server->vu_iface)) {
>> +        error_report("Failed to initialized libvhost-user");
>> +        return;
>> +    }
>> +
>> +    /*
>> +     * Unset the callback function for network listener to make another
>> +     * vhost-user client keeping waiting until this client disconnects
>> +     */
>> +    qio_net_listener_set_client_func(server->listener,
>> +                                     NULL,
>> +                                     NULL,
>> +                                     NULL);
>> +    server->sioc = sioc;
>> +    server->kick_info = g_new0(KickInfo, server->max_queues);
>
>Where is kick_info freed?
>
>> +    /*
>> +     * Increase the object reference, so cioc will not freed by
>
>s/cioc/sioc/
>
>> +     * qio_net_listener_channel_func which will call object_unref(OBJECT(sioc))
>> +     */
>> +    object_ref(OBJECT(server->sioc));
>> +    qio_channel_set_name(QIO_CHANNEL(sioc), "vhost-user client");
>> +    server->ioc = QIO_CHANNEL(sioc);
>> +    object_ref(OBJECT(server->ioc));
>> +    object_ref(OBJECT(sioc));
>
>Why are there two object_refs for sioc and where is unref called?

Thank you for pointing out the errors regarding memory deallocation and
the typo.
>> +    qio_channel_attach_aio_context(server->ioc, server->ctx);
>> +    qio_channel_set_blocking(QIO_CHANNEL(server->sioc), false, NULL);
>> +    vu_client_start(server);
>> +}
>> +
>> +
>> +void vhost_user_server_stop(VuServer *server)
>> +{
>> +    if (!server) {
>> +        return;
>> +    }
>> +
>> +    if (server->sioc) {
>> +        close_client(server);
>> +        object_unref(OBJECT(server->sioc));
>
>This call is object_unref(NULL) since close_client() does server->sioc =
>NULL.
>
>> +    }
>> +
>> +    if (server->listener) {
>> +        qio_net_listener_disconnect(server->listener);
>> +        object_unref(OBJECT(server->listener));
>> +    }
>> +}
>> +
>> +static void detach_context(VuServer *server)
>> +{
>> +    int i;
>> +    AioContext *ctx = server->ioc->ctx;
>> +    qio_channel_detach_aio_context(server->ioc);
>> +    for (i = 0; i < server->vu_dev.max_queues; i++) {
>> +        if (server->kick_info[i].cb) {
>> +            aio_set_fd_handler(ctx, server->kick_info[i].fd, false, NULL,
>> +                               NULL, NULL, NULL);
>> +        }
>> +    }
>> +}
>> +
>> +static void attach_context(VuServer *server, AioContext *ctx)
>> +{
>> +    int i;
>> +    qio_channel_attach_aio_context(server->ioc, ctx);
>> +    server->aio_context_changed = true;
>> +    if (server->co_trip) {
>> +        aio_co_schedule(ctx, server->co_trip);
>> +    }
>> +    for (i = 0; i < server->vu_dev.max_queues; i++) {
>> +        if (server->kick_info[i].cb) {
>> +            aio_set_fd_handler(ctx, server->kick_info[i].fd, false,
>> +                               kick_handler, NULL, NULL,
>> +                               &server->kick_info[i]);
>> +        }
>> +    }
>> +}
>> +
>> +void vhost_user_server_set_aio_context(AioContext *ctx, VuServer *server)
>> +{
>> +    server->ctx = ctx ? ctx : qemu_get_aio_context();
>> +    if (!server->sioc) {
>> +        return;
>> +    }
>> +    if (ctx) {
>> +        attach_context(server, ctx);
>> +    } else {
>> +        detach_context(server);
>> +    }
>> +}
>> +
>> +
>> +bool vhost_user_server_start(uint16_t max_queues,
>> +                             SocketAddress *socket_addr,
>> +                             AioContext *ctx,
>> +                             VuServer *server,
>> +                             void *device_panic_notifier,
>> +                             const VuDevIface *vu_iface,
>> +                             Error **errp)
>> +{
>> +    server->listener = qio_net_listener_new();
>> +    if (qio_net_listener_open_sync(server->listener, socket_addr, 1,
>> +                                   errp) < 0) {
>> +        goto error;
>> +    }
>> +
>> +    qio_net_listener_set_name(server->listener, "vhost-user-backend-listener");
>> +
>> +    server->vu_iface = vu_iface;
>> +    server->max_queues = max_queues;
>> +    server->ctx = ctx;
>> +    server->device_panic_notifier = device_panic_notifier;
>> +    qio_net_listener_set_client_func(server->listener,
>> +                                     vu_accept,
>> +                                     server,
>> +                                     NULL);
>
>The qio_net_listener_set_client_func() call uses the default
>GMainContext but we have an AioContext *ctx argument. This is
>surprising. I would expect the socket to be handled in the AioContext.
>
>Can you clarify how this should work?
Yes, the vhost-user server will accept new client connections in the
default GMainContext, but vhost-user messages and kick events will be
processed in the block drive's AioContext; if the block drive's
AioContext is changed, these tasks will also be moved to the new AioContext.

Btw, I intended to use chardev to help manage client connections after
limiting one vhost-user server to serving one client at a time. But according
to Mark and other people[1], converting all chardev functions to AIO doesn't
seem to be worth the effort.

[1] https://lists.gnu.org/archive/html/qemu-devel/2020-04/msg01485.html


>> +
>> +    return true;
>> +error:
>> +    g_free(server);
>
>It's surprising that this function frees the server argument when an
>error occurs. vhost_user_server_stop() does not free server. I suggest
>letting the caller free server since they own the object.
>
>> +    return false;
>> +}
>> diff --git a/util/vhost-user-server.h b/util/vhost-user-server.h
>> new file mode 100644
>> index 0000000000..4315556b66
>> --- /dev/null
>> +++ b/util/vhost-user-server.h
>> @@ -0,0 +1,59 @@
>> +/*
>> + * Sharing QEMU devices via vhost-user protocol
>> + *
>> + * Author: Coiby Xu <coiby.xu@gmail.com>
>> + *
>> + * This work is licensed under the terms of the GNU GPL, version 2 or
>> + * later.  See the COPYING file in the top-level directory.
>> + */
>> +
>> +#ifndef VHOST_USER_SERVER_H
>> +#define VHOST_USER_SERVER_H
>> +
>> +#include "contrib/libvhost-user/libvhost-user.h"
>> +#include "io/channel-socket.h"
>> +#include "io/channel-file.h"
>> +#include "io/net-listener.h"
>> +#include "qemu/error-report.h"
>> +#include "qapi/error.h"
>> +#include "standard-headers/linux/virtio_blk.h"
>> +
>> +typedef struct KickInfo {
>> +    VuDev *vu_dev;
>> +    int fd; /*kick fd*/
>> +    long index; /*queue index*/
>> +    vu_watch_cb cb;
>> +} KickInfo;
>> +
>> +typedef struct VuServer {
>> +    QIONetListener *listener;
>> +    AioContext *ctx;
>> +    void (*device_panic_notifier)(struct VuServer *server) ;
>> +    int max_queues;
>> +    const VuDevIface *vu_iface;
>> +    VuDev vu_dev;
>> +    QIOChannel *ioc; /* The I/O channel with the client */
>> +    QIOChannelSocket *sioc; /* The underlying data channel with the client */
>> +    /* IOChannel for fd provided via VHOST_USER_SET_SLAVE_REQ_FD */
>> +    QIOChannel *ioc_slave;
>> +    QIOChannelSocket *sioc_slave;
>> +    Coroutine *co_trip; /* coroutine for processing VhostUserMsg */
>> +    KickInfo *kick_info; /* an array with the length of the queue number */
>> +    /* restart coroutine co_trip if AIOContext is changed */
>> +    bool aio_context_changed;
>> +} VuServer;
>> +
>> +
>> +bool vhost_user_server_start(uint16_t max_queues,
>> +                             SocketAddress *unix_socket,
>> +                             AioContext *ctx,
>> +                             VuServer *server,
>> +                             void *device_panic_notifier,
>
>Please declare the function pointer type:
>
>typedef void DevicePanicNotifierFn(struct VuServer *server);
>
>Then the argument list can use DevicePanicNotifierFn
>*device_panic_notifier instead of void *.
>
>> +                             const VuDevIface *vu_iface,
>> +                             Error **errp);
>> +
>> +void vhost_user_server_stop(VuServer *server);
>> +
>> +void vhost_user_server_set_aio_context(AioContext *ctx, VuServer *server);
>
>If you send another revision, please make VuServer *server the first
>argument of vhost_user_server_start() and
>vhost_user_server_set_aio_context(). Functions usually have the object
>they act on as the first argument.

Thank you! These issues have been addressed in v9.


--
Best regards,
Coiby


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v8 0/4] vhost-user block device backend implementation
  2020-06-11 12:37 ` [PATCH v8 0/4] vhost-user block device backend implementation Stefano Garzarella
@ 2020-06-14 18:46   ` Coiby Xu
  2020-06-15  8:46     ` Stefano Garzarella
  0 siblings, 1 reply; 21+ messages in thread
From: Coiby Xu @ 2020-06-14 18:46 UTC (permalink / raw)
  To: Stefano Garzarella; +Cc: kwolf, bharatlkmlkvm, qemu-devel, stefanha

Hi Stefano Garzarella,

On Thu, Jun 11, 2020 at 02:37:03PM +0200, Stefano Garzarella wrote:
>Hi Coiby Xu,
>
>On Fri, Jun 05, 2020 at 07:35:34AM +0800, Coiby Xu wrote:
>> [...]
>
>Should we add an entry in the MAINTAINERS file for some of the new files?
>(e.g. util/vhost-user-server.*)

Yes, please. Thank you!

--
Best regards,
Coiby



* Re: [PATCH v8 3/4] vhost-user block device backend server
  2020-06-11 15:24   ` Stefan Hajnoczi
@ 2020-06-14 19:04     ` Coiby Xu
  0 siblings, 0 replies; 21+ messages in thread
From: Coiby Xu @ 2020-06-14 19:04 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: kwolf, open list:Block layer core, qemu-devel, Max Reitz,
	bharatlkmlkvm, Paolo Bonzini

On Thu, Jun 11, 2020 at 04:24:52PM +0100, Stefan Hajnoczi wrote:
>On Fri, Jun 05, 2020 at 07:35:37AM +0800, Coiby Xu wrote:
>> +static void coroutine_fn vu_block_virtio_process_req(void *opaque)
>> +{
>> +    struct req_data *data = opaque;
>> +    VuServer *server = data->server;
>> +    VuVirtq *vq = data->vq;
>> +    VuVirtqElement *elem = data->elem;
>> +    uint32_t type;
>> +    VuBlockReq *req;
>> +
>> +    VuBlockDev *vdev_blk = get_vu_block_device_by_server(server);
>> +    BlockBackend *backend = vdev_blk->backend;
>> +
>> +    struct iovec *in_iov = elem->in_sg;
>> +    struct iovec *out_iov = elem->out_sg;
>> +    unsigned in_num = elem->in_num;
>> +    unsigned out_num = elem->out_num;
>> +    /* refer to hw/block/virtio_blk.c */
>> +    if (elem->out_num < 1 || elem->in_num < 1) {
>> +        error_report("virtio-blk request missing headers");
>> +        free(elem);
>> +        return;
>> +    }
>> +
>> +    req = g_new0(VuBlockReq, 1);
>
>elem was allocated with enough space for VuBlockReq. Can this allocation
>be eliminated?
>
>  typedef struct VuBlockReq {
>-     VuVirtqElement *elem;
>+     VuVirtqElement elem;
>      int64_t sector_num;
>      size_t size;
>      struct virtio_blk_inhdr *in;
>      struct virtio_blk_outhdr out;
>      VuServer *server;
>      struct VuVirtq *vq;
>  } VuBlockReq;

Thank you for reviewing this patch. The other issues have been addressed
in v9, except for this one. I'm not sure what you mean; I can't find a
way that avoids allocating a VuBlockReq struct.


--
Best regards,
Coiby



* Re: [PATCH v8 0/4] vhost-user block device backend implementation
  2020-06-14 18:46   ` Coiby Xu
@ 2020-06-15  8:46     ` Stefano Garzarella
  2020-06-16  6:55       ` Coiby Xu
  0 siblings, 1 reply; 21+ messages in thread
From: Stefano Garzarella @ 2020-06-15  8:46 UTC (permalink / raw)
  To: Coiby Xu; +Cc: kwolf, bharatlkmlkvm, qemu-devel, stefanha

On Mon, Jun 15, 2020 at 02:46:40AM +0800, Coiby Xu wrote:
> Hi Stefano Garzarella,
> 
> On Thu, Jun 11, 2020 at 02:37:03PM +0200, Stefano Garzarella wrote:
> > Hi Coiby Xu,
> > 
> > On Fri, Jun 05, 2020 at 07:35:34AM +0800, Coiby Xu wrote:
> > > [...]
> > 
> > Should we add an entry in the MAINTAINERS file for some of the new files?
> > (e.g. util/vhost-user-server.*)
> 
> Yes, please. Thank you!

I think the best approach would be to edit MAINTAINERS in this series,
since you're adding new files, but I don't know who will maintain them ;-)

Thanks,
Stefano




* Re: [PATCH v8 0/4] vhost-user block device backend implementation
  2020-06-15  8:46     ` Stefano Garzarella
@ 2020-06-16  6:55       ` Coiby Xu
  0 siblings, 0 replies; 21+ messages in thread
From: Coiby Xu @ 2020-06-16  6:55 UTC (permalink / raw)
  To: Stefano Garzarella; +Cc: kwolf, bharatlkmlkvm, qemu-devel, stefanha

On Mon, Jun 15, 2020 at 10:46:10AM +0200, Stefano Garzarella wrote:
>On Mon, Jun 15, 2020 at 02:46:40AM +0800, Coiby Xu wrote:
>> Hi Stefano Garzarella,
>>
>> On Thu, Jun 11, 2020 at 02:37:03PM +0200, Stefano Garzarella wrote:
>> > Hi Coiby Xu,
>> >
>> > On Fri, Jun 05, 2020 at 07:35:34AM +0800, Coiby Xu wrote:
>> > > [...]
>> >
>> > Should we add an entry in the MAINTAINERS file for some of the new files?
>> > (e.g. util/vhost-user-server.*)
>>
>> Yes, please. Thank you!
>
>I think the best thing should be to edit MAINTAINERS in this series,
>since you're adding new files, but I don't know who will maintain them ;-)

Thank you for the explanation! I thought the MAINTAINERS file was
supposed to be treated in a special way. :)
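For what it's worth, a MAINTAINERS entry for the new files could look
something like the fragment below; the section name, status, and
contact line are placeholders for whoever steps up as maintainer:

```
vhost-user block device backend server
M: Your Name <your.email@example.com>
S: Supported
F: block/export/vhost-user-blk-server.c
F: block/export/vhost-user-blk-server.h
F: util/vhost-user-server.c
F: util/vhost-user-server.h
```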

--
Best regards,
Coiby



end of thread, other threads:[~2020-06-16  7:10 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
2020-06-04 23:35 [PATCH v8 0/4] vhost-user block device backend implementation Coiby Xu
2020-06-04 23:35 ` [PATCH v8 1/4] Allow vu_message_read to be replaced Coiby Xu
2020-06-11 10:45   ` Stefan Hajnoczi
2020-06-11 11:26   ` Marc-André Lureau
2020-06-04 23:35 ` [PATCH v8 2/4] generic vhost user server Coiby Xu
2020-06-11 13:14   ` Stefan Hajnoczi
2020-06-14 18:43     ` Coiby Xu
2020-06-04 23:35 ` [PATCH v8 3/4] vhost-user block device backend server Coiby Xu
2020-06-11 15:24   ` Stefan Hajnoczi
2020-06-14 19:04     ` Coiby Xu
2020-06-04 23:35 ` [PATCH v8 4/4] new qTest case to test the vhost-user-blk-server Coiby Xu
2020-06-05  5:01   ` Thomas Huth
2020-06-05  6:22     ` Coiby Xu
2020-06-05  9:25       ` Thomas Huth
2020-06-05 13:27         ` Coiby Xu
2020-06-11 12:37 ` [PATCH v8 0/4] vhost-user block device backend implementation Stefano Garzarella
2020-06-14 18:46   ` Coiby Xu
2020-06-15  8:46     ` Stefano Garzarella
2020-06-16  6:55       ` Coiby Xu
2020-06-11 15:27 ` Stefan Hajnoczi
2020-06-12 15:58   ` Coiby Xu
