* [Qemu-devel] [PATCH v7 00/13] Vhost and vhost-net support for userspace based backends
@ 2014-01-31 17:34 Antonios Motakis
  2014-01-31 17:34 ` [Qemu-devel] [PATCH v7 01/13] Convert -mem-path to QemuOpts and add prealloc and share properties Antonios Motakis
                   ` (13 more replies)
  0 siblings, 14 replies; 20+ messages in thread
From: Antonios Motakis @ 2014-01-31 17:34 UTC (permalink / raw)
  To: qemu-devel, snabb-devel; +Cc: lukego, Antonios Motakis, tech, n.nikolaev, mst

In this patch series we would like to introduce our approach for putting a
virtio-net backend in an external userspace process. Our eventual target is to
run the network backend in the Snabbswitch ethernet switch, while receiving
traffic from a guest inside QEMU/KVM which runs an unmodified virtio-net
implementation.

For this, we are working on extending vhost to allow equivalent functionality
for userspace. Vhost already passes control of the data plane of virtio-net to
the host kernel; we want to realize a similar model, but for userspace.

In this patch series the concept of a vhost-backend is introduced.

We define two vhost backend types - vhost-kernel and vhost-user. The former is
the interface to the current kernel module implementation. Its control plane is
ioctl based. In the data plane, the kernel directly accesses the QEMU-allocated
guest memory.

In the new vhost-user backend, the control plane is based on communication
between QEMU and another userspace process using a unix domain socket. This
allows a virtio backend for a guest running in QEMU to be implemented inside
the other userspace process. For this communication we use a chardev with a
unix socket backend. Vhost-user is client/server agnostic regarding the
chardev; however, it does not support the 'nowait' and 'telnet' options.
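
As a rough illustration of the two backend types, the sketch below selects a
table of control-plane callbacks by backend type. All names (DemoOps,
demo_set_backend, and so on) are hypothetical and are not taken from the
patches; the callbacks only record the request instead of issuing an ioctl or
writing to a socket.

```c
#include <assert.h>
#include <stddef.h>

/* All names below are hypothetical; they only mirror the idea of vhost_ops. */
typedef enum {
    DEMO_BACKEND_KERNEL,
    DEMO_BACKEND_USER,
} DemoBackendType;

struct demo_dev;

typedef struct DemoOps {
    DemoBackendType type;
    /* Issue one control-plane request; returns 0 on success. */
    int (*call)(struct demo_dev *dev, unsigned long request, void *arg);
} DemoOps;

struct demo_dev {
    const DemoOps *ops;
    unsigned long last_request;   /* recorded instead of doing real I/O */
};

/* A kernel backend would call ioctl() on the vhost fd here. */
static int demo_kernel_call(struct demo_dev *dev, unsigned long request,
                            void *arg)
{
    (void)arg;
    dev->last_request = request;
    return 0;
}

/* A user backend would encode the request and write it to the socket here. */
static int demo_user_call(struct demo_dev *dev, unsigned long request,
                          void *arg)
{
    (void)arg;
    dev->last_request = request;
    return 0;
}

static const DemoOps demo_kernel_ops = { DEMO_BACKEND_KERNEL, demo_kernel_call };
static const DemoOps demo_user_ops = { DEMO_BACKEND_USER, demo_user_call };

/* Pick the callback table for the requested backend type. */
static int demo_set_backend(struct demo_dev *dev, DemoBackendType type)
{
    assert(dev != NULL);

    switch (type) {
    case DEMO_BACKEND_KERNEL:
        dev->ops = &demo_kernel_ops;
        break;
    case DEMO_BACKEND_USER:
        dev->ops = &demo_user_ops;
        break;
    default:
        return -1;
    }
    dev->last_request = 0;
    return 0;
}
```

Callers then go through dev->ops->call() and do not need to know which
backend is in use.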

We change -mem-path to QemuOpts and add prealloc and share as properties
to it. HugeTLBFS is required for this option to work.
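
To show what the share property changes, here is a minimal sketch that maps a
file with MAP_SHARED or MAP_PRIVATE and checks whether a write through the
mapping reaches the file. It assumes Linux and uses an ordinary temporary file
in place of hugetlbfs; the helper names are hypothetical and not part of the
series.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: map `size` bytes of `fd`, choosing MAP_SHARED or
 * MAP_PRIVATE the way share=on|off would.
 */
static void *map_backing_file(int fd, size_t size, bool share)
{
    int flags = share ? MAP_SHARED : MAP_PRIVATE;
    void *area;

    assert(fd >= 0 && size > 0);
    area = mmap(NULL, size, PROT_READ | PROT_WRITE, flags, fd, 0);
    return area == MAP_FAILED ? NULL : area;
}

/*
 * Write one byte through the mapping, then read the same offset back from
 * the file. Returns 1 if the write reached the file, 0 if it stayed private
 * to this mapping, and -1 on any setup error.
 */
static int write_reaches_file(bool share)
{
    const size_t size = 4096;
    FILE *tmp = tmpfile();   /* ordinary temp file stands in for hugetlbfs */
    char *area;
    char byte = 0;
    int ret = -1;

    if (!tmp) {
        return -1;
    }
    if (ftruncate(fileno(tmp), size) == 0) {
        area = map_backing_file(fileno(tmp), size, share);
        if (area) {
            area[0] = 'x';
            if (pread(fileno(tmp), &byte, 1, 0) == 1) {
                ret = (byte == 'x');
            }
            munmap(area, size);
        }
    }
    fclose(tmp);
    return ret;
}
```

Only a shared mapping lets a second process that maps the same file observe
the guest's writes, which is why the userspace backend needs share=on.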

The data path is realized by directly accessing the vrings and the buffer data
off the guest's memory.

Currently the only user of vhost-user is vhost-net. We add a new netdev backend
that initializes vhost-net with the vhost-user backend.

Example usage:

qemu -m 1024 -mem-path /hugetlbfs,share=on \
     -chardev socket,id=chr0,path=/path/to/socket \
     -netdev type=vhost-user,id=net0,chardev=chr0 \
     -device virtio-net-pci,netdev=net0

This code can be pulled from git@github.com:virtualopensystems/qemu.git vhost-user-v7

A reference vhost-user slave for testing is available from git@github.com:virtualopensystems/vapp.git

TODOs include:
 - Include a test in QEMU to avoid regressions
 - Slave reconnection and nowait support

Changes from v6:
 - Remove the 'unlink' property of '-mem-path'
 - Extend qemu-char: blocking read, send fds, monitor for connection close
 - Vhost-user uses chardev as a backend
 - Poll and reconnect removed (no VHOST_USER_ECHO).
 - Disconnect is detected by the chardev (G_IO_HUP event)
 - vhost-backend.c split to vhost-user.c

Changes from v5:
 - Split -mem-path unlink option to a separate patch
 - Fds are passed only in the ancillary data
 - Stricter message size checks on receive/send
 - Netdev vhost-user now includes path and poll_time options
 - The connection probing interval is configurable

Changes from v4:
 - Use error_report for errors
 - VhostUserMsg has new field `size` indicating the following payload length.
   Field `flags` now has version and reply bits. The structure is packed.
 - Send data is of variable length (`size` field in message)
 - Receive in 2 steps, header and payload
 - Add new message type VHOST_USER_ECHO, to check connection status
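
The header-then-payload receive described in this list can be sketched as
follows. This is only an illustration over a plain socket: the struct and
function names, the `request` field, the bit positions chosen for the version
and reply bits, and the 64-byte payload limit are all assumptions, not the
layout defined by the series.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumed layout: version in the low two bits, reply flag in bit 2. */
#define DEMO_VERSION_MASK 0x3u
#define DEMO_REPLY_BIT    (1u << 2)
#define DEMO_VERSION      1u
#define DEMO_MAX_PAYLOAD  64u

struct demo_hdr {
    uint32_t request;
    uint32_t flags;
    uint32_t size;      /* length of the payload that follows */
} __attribute__((packed));

struct demo_msg {
    struct demo_hdr hdr;
    uint8_t payload[DEMO_MAX_PAYLOAD];
};

/* Read exactly `len` bytes; 0 on success, -1 on error or early EOF. */
static int read_exact(int fd, void *buf, size_t len)
{
    uint8_t *p = buf;

    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n < 0 && errno == EINTR) {
            continue;
        }
        if (n <= 0) {
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Step 1: read and validate the header. Step 2: read `size` payload bytes. */
static int demo_recv(int fd, struct demo_msg *msg)
{
    assert(msg != NULL);

    if (read_exact(fd, &msg->hdr, sizeof(msg->hdr)) < 0) {
        return -1;
    }
    if ((msg->hdr.flags & DEMO_VERSION_MASK) != DEMO_VERSION) {
        return -1;
    }
    if (msg->hdr.size > DEMO_MAX_PAYLOAD) {
        return -1;      /* strict size check before touching the payload */
    }
    return read_exact(fd, msg->payload, msg->hdr.size);
}

/* Send a header followed by exactly `size` payload bytes. */
static int demo_send(int fd, uint32_t request, const void *payload,
                     uint32_t size)
{
    struct demo_msg msg;
    size_t total;

    if (size > DEMO_MAX_PAYLOAD) {
        return -1;
    }
    memset(&msg, 0, sizeof(msg));
    msg.hdr.request = request;
    msg.hdr.flags = DEMO_VERSION;
    msg.hdr.size = size;
    if (size) {
        memcpy(msg.payload, payload, size);
    }
    total = sizeof(msg.hdr) + size;
    return write(fd, &msg, total) == (ssize_t)total ? 0 : -1;
}
```

Validating `size` before the second read keeps a malformed header from
overrunning the payload buffer.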

Changes from v3:
 - Convert -mem-path to QemuOpts with prealloc, share and unlink properties
 - Set 1 sec timeout when read/write to the unix domain socket
 - Fix file descriptor leak

Changes from v2:
 - Reconnect when the backend disappears

Changes from v1:
 - Implementation of vhost-user netdev backend
 - Code improvements

Antonios Motakis (13):
  Convert -mem-path to QemuOpts and add prealloc and share properties
  Add chardev API qemu_chr_fe_read_all
  Add chardev API qemu_chr_fe_set_msgfds
  Add G_IO_HUP handler for socket chardev
  vhost_net should call the poll callback only when it is set
  Refactor virtio-net to use a generic get_vhost_net
  vhost_net_init will use VhostNetOptions to get all its arguments
  Add vhost_ops to the vhost_dev struct and replace all relevant ioctls
  Add vhost-backend and VhostBackendType
  Add vhost-user as a vhost backend.
  Add new vhost-user netdev backend
  Add the vhost-user netdev backend to command line
  Add vhost-user protocol documentation

 docs/specs/vhost-user.txt         | 249 ++++++++++++++++++++++++++++
 exec.c                            |  30 +++-
 hmp-commands.hx                   |   4 +-
 hw/net/vhost_net.c                | 142 +++++++++++-----
 hw/net/virtio-net.c               |  42 ++---
 hw/scsi/vhost-scsi.c              |  20 ++-
 hw/virtio/Makefile.objs           |   2 +-
 hw/virtio/vhost-backend.c         |  71 ++++++++
 hw/virtio/vhost-user.c            | 331 ++++++++++++++++++++++++++++++++++++++
 hw/virtio/vhost.c                 |  55 ++++---
 include/exec/cpu-all.h            |   3 -
 include/hw/virtio/vhost-backend.h |  38 +++++
 include/hw/virtio/vhost.h         |   8 +-
 include/net/vhost-user.h          |  17 ++
 include/net/vhost_net.h           |  11 +-
 include/sysemu/char.h             |  28 ++++
 net/Makefile.objs                 |   2 +-
 net/clients.h                     |   3 +
 net/hub.c                         |   1 +
 net/net.c                         |   2 +
 net/tap.c                         |  18 ++-
 net/vhost-user.c                  | 217 +++++++++++++++++++++++++
 qapi-schema.json                  |  18 ++-
 qemu-char.c                       | 185 ++++++++++++++++++++-
 qemu-options.hx                   |  25 ++-
 vl.c                              |  37 ++++-
 26 files changed, 1425 insertions(+), 134 deletions(-)
 create mode 100644 docs/specs/vhost-user.txt
 create mode 100644 hw/virtio/vhost-backend.c
 create mode 100644 hw/virtio/vhost-user.c
 create mode 100644 include/hw/virtio/vhost-backend.h
 create mode 100644 include/net/vhost-user.h
 create mode 100644 net/vhost-user.c

-- 
1.8.3.2


* [Qemu-devel] [PATCH v7 01/13] Convert -mem-path to QemuOpts and add prealloc and share properties
  2014-01-31 17:34 [Qemu-devel] [PATCH v7 00/13] Vhost and vhost-net support for userspace based backends Antonios Motakis
@ 2014-01-31 17:34 ` Antonios Motakis
  2014-01-31 17:34 ` [Qemu-devel] [PATCH v7 02/13] Add chardev API qemu_chr_fe_read_all Antonios Motakis
                   ` (12 subsequent siblings)
  13 siblings, 0 replies; 20+ messages in thread
From: Antonios Motakis @ 2014-01-31 17:34 UTC (permalink / raw)
  To: qemu-devel, snabb-devel
  Cc: Peter Maydell, Stefan Hajnoczi, mst, Juan Quintela,
	Michael Tokarev, Alexander Graf, n.nikolaev, Markus Armbruster,
	Anthony Liguori, Paolo Bonzini, lukego, Antonios Motakis, tech,
	Andreas Färber, Richard Henderson

Extend -mem-path with additional properties:

 - prealloc=on|off - default off, same as -mem-prealloc
 - share=on|off - default off, memory is mmapped with MAP_SHARED flag

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
---
 exec.c                 | 30 ++++++++++++++++++++++++++++--
 include/exec/cpu-all.h |  3 ---
 qemu-options.hx        |  9 +++++++--
 vl.c                   | 37 ++++++++++++++++++++++++++++++++-----
 4 files changed, 67 insertions(+), 12 deletions(-)

diff --git a/exec.c b/exec.c
index 2435d9e..324f3d8 100644
--- a/exec.c
+++ b/exec.c
@@ -990,7 +990,10 @@ static void *file_ram_alloc(RAMBlock *block,
     char *c;
     void *area;
     int fd;
+    int flags;
     unsigned long hpagesize;
+    QemuOpts *opts;
+    unsigned int mem_prealloc = 0, mem_share = 0;
 
     hpagesize = gethugepagesize(path);
     if (!hpagesize) {
@@ -1006,6 +1009,13 @@ static void *file_ram_alloc(RAMBlock *block,
         return NULL;
     }
 
+    /* Fill config options */
+    opts = qemu_opts_find(qemu_find_opts("mem-path"), NULL);
+    if (opts) {
+        mem_prealloc = qemu_opt_get_bool(opts, "prealloc", 0);
+        mem_share = qemu_opt_get_bool(opts, "share", 0);
+    }
+
     /* Make name safe to use with mkstemp by replacing '/' with '_'. */
     sanitized_name = g_strdup(block->mr->name);
     for (c = sanitized_name; *c != '\0'; c++) {
@@ -1026,7 +1036,7 @@ static void *file_ram_alloc(RAMBlock *block,
     unlink(filename);
     g_free(filename);
 
-    memory = (memory+hpagesize-1) & ~(hpagesize-1);
+    memory = (memory + hpagesize - 1) & ~(hpagesize - 1);
 
     /*
      * ftruncate is not supported by hugetlbfs in older
@@ -1037,7 +1047,8 @@ static void *file_ram_alloc(RAMBlock *block,
     if (ftruncate(fd, memory))
         perror("ftruncate");
 
-    area = mmap(0, memory, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
+    flags = mem_share ? MAP_SHARED : MAP_PRIVATE;
+    area = mmap(0, memory, PROT_READ | PROT_WRITE, flags, fd, 0);
     if (area == MAP_FAILED) {
         perror("file_ram_alloc: can't mmap RAM pages");
         close(fd);
@@ -1207,6 +1218,8 @@ ram_addr_t qemu_ram_alloc_from_ptr(ram_addr_t size, void *host,
                                    MemoryRegion *mr)
 {
     RAMBlock *block, *new_block;
+    QemuOpts *opts;
+    const char *mem_path = 0;
     ram_addr_t old_ram_size, new_ram_size;
 
     old_ram_size = last_ram_offset() >> TARGET_PAGE_BITS;
@@ -1215,6 +1228,11 @@ ram_addr_t qemu_ram_alloc_from_ptr(ram_addr_t size, void *host,
     new_block = g_malloc0(sizeof(*new_block));
     new_block->fd = -1;
 
+    opts = qemu_opts_find(qemu_find_opts("mem-path"), NULL);
+    if (opts) {
+        mem_path = qemu_opt_get(opts, "path");
+    }
+
     /* This assumes the iothread lock is taken here too.  */
     qemu_mutex_lock_ramlist();
     new_block->mr = mr;
@@ -1353,6 +1371,14 @@ void qemu_ram_remap(ram_addr_t addr, ram_addr_t length)
     ram_addr_t offset;
     int flags;
     void *area, *vaddr;
+    QemuOpts *opts;
+    unsigned int mem_prealloc = 0;
+
+    /* Fill config options */
+    opts = qemu_opts_find(qemu_find_opts("mem-path"), NULL);
+    if (opts) {
+        mem_prealloc = qemu_opt_get_bool(opts, "prealloc", 0);
+    }
 
     QTAILQ_FOREACH(block, &ram_list.blocks, next) {
         offset = addr - block->offset;
diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h
index 4cb4b4a..b46055d 100644
--- a/include/exec/cpu-all.h
+++ b/include/exec/cpu-all.h
@@ -468,9 +468,6 @@ typedef struct RAMList {
 } RAMList;
 extern RAMList ram_list;
 
-extern const char *mem_path;
-extern int mem_prealloc;
-
 /* Flags stored in the low bits of the TLB virtual address.  These are
    defined so that fast path ram access is all zeros.  */
 /* Zero if TLB entry is valid.  */
diff --git a/qemu-options.hx b/qemu-options.hx
index 56e5fdf..60ecc95 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -221,9 +221,14 @@ gigabytes respectively.
 ETEXI
 
 DEF("mem-path", HAS_ARG, QEMU_OPTION_mempath,
-    "-mem-path FILE  provide backing storage for guest RAM\n", QEMU_ARCH_ALL)
+    "-mem-path [path=]path[,prealloc=on|off][,share=on|off]\n"
+    "                provide backing storage for guest RAM\n"
+    "                path= a directory path for the backing store\n"
+    "                prealloc= preallocate guest memory [default disabled]\n"
+    "                share= enable mmap share flag [default disabled]\n",
+        QEMU_ARCH_ALL)
 STEXI
-@item -mem-path @var{path}
+@item -mem-path [path=]@var{path}[,prealloc=on|off][,share=on|off]
 @findex -mem-path
 Allocate guest RAM from a temporarily created file in @var{path}.
 ETEXI
diff --git a/vl.c b/vl.c
index 2b47866..7e67cdd 100644
--- a/vl.c
+++ b/vl.c
@@ -187,8 +187,6 @@ DisplayType display_type = DT_DEFAULT;
 static int display_remote;
 const char* keyboard_layout = NULL;
 ram_addr_t ram_size;
-const char *mem_path = NULL;
-int mem_prealloc = 0; /* force preallocation of physical target memory */
 int nb_nics;
 NICInfo nd_table[MAX_NICS];
 int autostart;
@@ -531,6 +529,27 @@ static QemuOptsList qemu_msg_opts = {
     },
 };
 
+static QemuOptsList qemu_mem_path_opts = {
+    .name = "mem-path",
+    .implied_opt_name = "path",
+    .head = QTAILQ_HEAD_INITIALIZER(qemu_mem_path_opts.head),
+    .desc = {
+        {
+            .name = "path",
+            .type = QEMU_OPT_STRING,
+        },
+        {
+            .name = "prealloc",
+            .type = QEMU_OPT_BOOL,
+        },
+        {
+            .name = "share",
+            .type = QEMU_OPT_BOOL,
+        },
+        { /* end of list */ }
+    },
+};
+
 /**
  * Get machine options
  *
@@ -2895,6 +2914,7 @@ int main(int argc, char **argv, char **envp)
     qemu_add_opts(&qemu_tpmdev_opts);
     qemu_add_opts(&qemu_realtime_opts);
     qemu_add_opts(&qemu_msg_opts);
+    qemu_add_opts(&qemu_mem_path_opts);
 
     runstate_init();
 
@@ -3212,11 +3232,18 @@ int main(int argc, char **argv, char **envp)
                 break;
 #endif
             case QEMU_OPTION_mempath:
-                mem_path = optarg;
+                if (!qemu_opts_parse(qemu_find_opts("mem-path"), optarg, 1)) {
+                    exit(1);
+                }
                 break;
-            case QEMU_OPTION_mem_prealloc:
-                mem_prealloc = 1;
+            case QEMU_OPTION_mem_prealloc: {
+                QemuOpts *mem_opts = qemu_opts_find(qemu_find_opts("mem-path"),
+                                                    NULL);
+                if (mem_opts) {
+                    qemu_opt_set(mem_opts, "prealloc", "on");
+                }
                 break;
+            }
             case QEMU_OPTION_d:
                 log_mask = optarg;
                 break;
-- 
1.8.3.2


* [Qemu-devel] [PATCH v7 02/13] Add chardev API qemu_chr_fe_read_all
  2014-01-31 17:34 [Qemu-devel] [PATCH v7 00/13] Vhost and vhost-net support for userspace based backends Antonios Motakis
  2014-01-31 17:34 ` [Qemu-devel] [PATCH v7 01/13] Convert -mem-path to QemuOpts and add prealloc and share properties Antonios Motakis
@ 2014-01-31 17:34 ` Antonios Motakis
  2014-01-31 17:34 ` [Qemu-devel] [PATCH v7 03/13] Add chardev API qemu_chr_fe_set_msgfds Antonios Motakis
                   ` (11 subsequent siblings)
  13 siblings, 0 replies; 20+ messages in thread
From: Antonios Motakis @ 2014-01-31 17:34 UTC (permalink / raw)
  To: qemu-devel, snabb-devel
  Cc: mst, Amit Shah, Michael Roth, n.nikolaev, Hans de Goede,
	Gerd Hoffmann, Anthony Liguori, lukego, Antonios Motakis, tech

This function will attempt to read data from the chardev trying
to fill the buffer up to the given length.

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
---
 include/sysemu/char.h | 14 +++++++++++
 qemu-char.c           | 65 +++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 79 insertions(+)

diff --git a/include/sysemu/char.h b/include/sysemu/char.h
index b81a6ff..9981a6a 100644
--- a/include/sysemu/char.h
+++ b/include/sysemu/char.h
@@ -56,6 +56,8 @@ typedef void IOEventHandler(void *opaque, int event);
 struct CharDriverState {
     void (*init)(struct CharDriverState *s);
     int (*chr_write)(struct CharDriverState *s, const uint8_t *buf, int len);
+    int (*chr_sync_read)(struct CharDriverState *s,
+                         const uint8_t *buf, int len);
     GSource *(*chr_add_watch)(struct CharDriverState *s, GIOCondition cond);
     void (*chr_update_read_handler)(struct CharDriverState *s);
     int (*chr_ioctl)(struct CharDriverState *s, int cmd, void *arg);
@@ -189,6 +191,18 @@ int qemu_chr_fe_write(CharDriverState *s, const uint8_t *buf, int len);
 int qemu_chr_fe_write_all(CharDriverState *s, const uint8_t *buf, int len);
 
 /**
+ * @qemu_chr_fe_read_all:
+ *
+ * Read data to a buffer from the back end.
+ *
+ * @buf the data buffer
+ * @len the number of bytes to read
+ *
+ * Returns: the number of bytes read
+ */
+int qemu_chr_fe_read_all(CharDriverState *s, uint8_t *buf, int len);
+
+/**
  * @qemu_chr_fe_ioctl:
  *
  * Issue a device specific ioctl to a backend.
diff --git a/qemu-char.c b/qemu-char.c
index 30c5a6a..54ac03c 100644
--- a/qemu-char.c
+++ b/qemu-char.c
@@ -84,6 +84,7 @@
 #include "ui/qemu-spice.h"
 
 #define READ_BUF_LEN 4096
+#define READ_RETRIES 10
 
 /***********************************************************/
 /* character device */
@@ -145,6 +146,41 @@ int qemu_chr_fe_write_all(CharDriverState *s, const uint8_t *buf, int len)
     return offset;
 }
 
+int qemu_chr_fe_read_all(CharDriverState *s, uint8_t *buf, int len)
+{
+    int offset = 0, counter = READ_RETRIES;
+    int res;
+
+    if (!s->chr_sync_read) {
+        return 0;
+    }
+
+    while (offset < len) {
+        do {
+            res = s->chr_sync_read(s, buf + offset, len - offset);
+            if (res == -1 && errno == EAGAIN) {
+                g_usleep(100);
+            }
+        } while (res == -1 && errno == EAGAIN);
+
+        if (res == 0) {
+            break;
+        }
+
+        if (res < 0) {
+            return res;
+        }
+
+        offset += res;
+
+        if (!counter--) {
+            break;
+        }
+    }
+
+    return offset;
+}
+
 int qemu_chr_fe_ioctl(CharDriverState *s, int cmd, void *arg)
 {
     if (!s->chr_ioctl)
@@ -2489,6 +2525,34 @@ static gboolean tcp_chr_read(GIOChannel *chan, GIOCondition cond, void *opaque)
     return TRUE;
 }
 
+static int tcp_chr_sync_read(CharDriverState *chr, const uint8_t *buf, int len)
+{
+    TCPCharDriver *s = chr->opaque;
+    int size;
+
+    if (!s->connected) {
+        return 0;
+    }
+
+    size = tcp_chr_recv(chr, (void *) buf, len);
+    if (size == 0) {
+        /* connection closed */
+        s->connected = 0;
+        if (s->listen_chan) {
+            s->listen_tag = g_io_add_watch(s->listen_chan, G_IO_IN,
+                    tcp_chr_accept, chr);
+        }
+        remove_fd_in_watch(chr);
+        g_io_channel_unref(s->chan);
+        s->chan = NULL;
+        closesocket(s->fd);
+        s->fd = -1;
+        qemu_chr_be_event(chr, CHR_EVENT_CLOSED);
+    }
+
+    return size;
+}
+
 #ifndef _WIN32
 CharDriverState *qemu_chr_open_eventfd(int eventfd)
 {
@@ -2660,6 +2724,7 @@ static CharDriverState *qemu_chr_open_socket_fd(int fd, bool do_nodelay,
 
     chr->opaque = s;
     chr->chr_write = tcp_chr_write;
+    chr->chr_sync_read = tcp_chr_sync_read;
     chr->chr_close = tcp_chr_close;
     chr->get_msgfd = tcp_get_msgfd;
     chr->chr_add_client = tcp_chr_add_client;
-- 
1.8.3.2


* [Qemu-devel] [PATCH v7 03/13] Add chardev API qemu_chr_fe_set_msgfds
  2014-01-31 17:34 [Qemu-devel] [PATCH v7 00/13] Vhost and vhost-net support for userspace based backends Antonios Motakis
  2014-01-31 17:34 ` [Qemu-devel] [PATCH v7 01/13] Convert -mem-path to QemuOpts and add prealloc and share properties Antonios Motakis
  2014-01-31 17:34 ` [Qemu-devel] [PATCH v7 02/13] Add chardev API qemu_chr_fe_read_all Antonios Motakis
@ 2014-01-31 17:34 ` Antonios Motakis
  2014-01-31 17:34 ` [Qemu-devel] [PATCH v7 04/13] Add G_IO_HUP handler for socket chardev Antonios Motakis
                   ` (10 subsequent siblings)
  13 siblings, 0 replies; 20+ messages in thread
From: Antonios Motakis @ 2014-01-31 17:34 UTC (permalink / raw)
  To: qemu-devel, snabb-devel
  Cc: mst, Amit Shah, Michael Roth, n.nikolaev, Hans de Goede,
	Gerd Hoffmann, Anthony Liguori, lukego, Antonios Motakis, tech

This stores an array of file descriptors in the chardev's internal structures.
The next time a message is sent, the array will be sent as ancillary
data. This feature works only with the unix domain socket backend.
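
For readers unfamiliar with fd passing, the following self-contained sketch
shows both ends of the mechanism using sendmsg/recvmsg with SCM_RIGHTS. It is
not the chardev code from this patch; the function names and the limit of 8
descriptors per message are assumptions made for the example.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

#define DEMO_MAX_FDS 8

/* Control buffer, aligned for struct cmsghdr. */
union demo_control {
    char buf[CMSG_SPACE(DEMO_MAX_FDS * sizeof(int))];
    struct cmsghdr align;
};

/* Send `len` payload bytes plus `num` descriptors as ancillary data. */
static ssize_t demo_send_fds(int sock, const void *buf, size_t len,
                             const int *fds, int num)
{
    union demo_control control;
    struct iovec iov = { .iov_base = (void *)buf, .iov_len = len };
    struct msghdr msgh;
    struct cmsghdr *cmsg;
    size_t fd_size;

    assert(num > 0 && num <= DEMO_MAX_FDS);
    fd_size = (size_t)num * sizeof(int);

    memset(&msgh, 0, sizeof(msgh));
    memset(&control, 0, sizeof(control));
    msgh.msg_iov = &iov;
    msgh.msg_iovlen = 1;
    msgh.msg_control = control.buf;
    msgh.msg_controllen = CMSG_SPACE(fd_size);

    cmsg = CMSG_FIRSTHDR(&msgh);
    cmsg->cmsg_len = CMSG_LEN(fd_size);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    memcpy(CMSG_DATA(cmsg), fds, fd_size);

    return sendmsg(sock, &msgh, 0);
}

/*
 * Receive payload bytes and any descriptors sent with them.
 * `fds` must have room for DEMO_MAX_FDS entries; *num_fds is set to the
 * number of descriptors received.
 */
static ssize_t demo_recv_fds(int sock, void *buf, size_t len,
                             int *fds, int *num_fds)
{
    union demo_control control;
    struct iovec iov = { .iov_base = buf, .iov_len = len };
    struct msghdr msgh;
    struct cmsghdr *cmsg;
    ssize_t n;

    *num_fds = 0;
    memset(&msgh, 0, sizeof(msgh));
    msgh.msg_iov = &iov;
    msgh.msg_iovlen = 1;
    msgh.msg_control = control.buf;
    msgh.msg_controllen = sizeof(control.buf);

    n = recvmsg(sock, &msgh, 0);
    if (n < 0) {
        return n;
    }
    for (cmsg = CMSG_FIRSTHDR(&msgh); cmsg; cmsg = CMSG_NXTHDR(&msgh, cmsg)) {
        if (cmsg->cmsg_level == SOL_SOCKET && cmsg->cmsg_type == SCM_RIGHTS) {
            int count = (int)((cmsg->cmsg_len - CMSG_LEN(0)) / sizeof(int));
            if (count > DEMO_MAX_FDS) {
                count = DEMO_MAX_FDS;
            }
            memcpy(fds, CMSG_DATA(cmsg), (size_t)count * sizeof(int));
            *num_fds = count;
        }
    }
    return n;
}
```

The receiver gets new descriptor numbers that refer to the same open files,
so a backend process can mmap guest memory from a descriptor it was handed.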

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
---
 include/sysemu/char.h |  14 +++++++
 qemu-char.c           | 105 ++++++++++++++++++++++++++++++++++++++++++++++----
 2 files changed, 111 insertions(+), 8 deletions(-)

diff --git a/include/sysemu/char.h b/include/sysemu/char.h
index 9981a6a..d99dcf6 100644
--- a/include/sysemu/char.h
+++ b/include/sysemu/char.h
@@ -62,6 +62,7 @@ struct CharDriverState {
     void (*chr_update_read_handler)(struct CharDriverState *s);
     int (*chr_ioctl)(struct CharDriverState *s, int cmd, void *arg);
     int (*get_msgfd)(struct CharDriverState *s);
+    int (*set_msgfds)(struct CharDriverState *s, int *fds, int num);
     int (*chr_add_client)(struct CharDriverState *chr, int fd);
     IOEventHandler *chr_event;
     IOCanReadHandler *chr_can_read;
@@ -229,6 +230,19 @@ int qemu_chr_fe_ioctl(CharDriverState *s, int cmd, void *arg);
 int qemu_chr_fe_get_msgfd(CharDriverState *s);
 
 /**
+ * @qemu_chr_fe_set_msgfds:
+ *
+ * For backends capable of fd passing, set an array of fds to be passed with
+ * the next send operation.
+ * A subsequent call to this function before calling a write function will
+ * result in overwriting the fd array with the new value without it being sent.
+ * Upon writing the message the fd array is freed.
+ *
+ * Returns: -1 if fd passing isn't supported.
+ */
+int qemu_chr_fe_set_msgfds(CharDriverState *s, int *fds, int num);
+
+/**
  * @qemu_chr_fe_claim:
  *
  * Claim a backend before using it, should be called before calling
diff --git a/qemu-char.c b/qemu-char.c
index 54ac03c..c2e599e 100644
--- a/qemu-char.c
+++ b/qemu-char.c
@@ -207,6 +207,11 @@ int qemu_chr_fe_get_msgfd(CharDriverState *s)
     return s->get_msgfd ? s->get_msgfd(s) : -1;
 }
 
+int qemu_chr_fe_set_msgfds(CharDriverState *s, int *fds, int num)
+{
+    return s->set_msgfds ? s->set_msgfds(s, fds, num) : -1;
+}
+
 int qemu_chr_add_client(CharDriverState *s, int fd)
 {
     return s->chr_add_client ? s->chr_add_client(s, fd) : -1;
@@ -2331,16 +2336,77 @@ typedef struct {
     int do_telnetopt;
     int do_nodelay;
     int is_unix;
-    int msgfd;
+    int read_msgfd;
+    int *write_msgfds;
+    int write_msgfds_num;
 } TCPCharDriver;
 
 static gboolean tcp_chr_accept(GIOChannel *chan, GIOCondition cond, void *opaque);
 
+#ifndef _WIN32
+static int unix_send_msgfds(CharDriverState *chr, const uint8_t *buf, int len)
+{
+    TCPCharDriver *s = chr->opaque;
+    struct msghdr msgh;
+    struct iovec iov;
+    int r;
+
+    size_t fd_size = s->write_msgfds_num * sizeof(int);
+    char control[CMSG_SPACE(fd_size)];
+    struct cmsghdr *cmsg;
+
+    memset(&msgh, 0, sizeof(msgh));
+    memset(control, 0, sizeof(control));
+
+    /* set the payload */
+    iov.iov_base = (uint8_t *) buf;
+    iov.iov_len = len;
+
+    msgh.msg_iov = &iov;
+    msgh.msg_iovlen = 1;
+
+    msgh.msg_control = control;
+    msgh.msg_controllen = sizeof(control);
+
+    cmsg = CMSG_FIRSTHDR(&msgh);
+
+    cmsg->cmsg_len = CMSG_LEN(fd_size);
+    cmsg->cmsg_level = SOL_SOCKET;
+    cmsg->cmsg_type = SCM_RIGHTS;
+    memcpy(CMSG_DATA(cmsg), s->write_msgfds, fd_size);
+
+    do {
+        r = sendmsg(s->fd, &msgh, 0);
+    } while (r < 0 && errno == EINTR);
+
+    /* free the written msgfds, no matter what */
+    if (s->write_msgfds_num) {
+        g_free(s->write_msgfds);
+        s->write_msgfds = 0;
+        s->write_msgfds_num = 0;
+    }
+
+    if (r < 0) {
+        error_report("Failed to send fds, reason: %s", strerror(errno));
+        return -1;
+    }
+
+    return r;
+}
+#endif
+
 static int tcp_chr_write(CharDriverState *chr, const uint8_t *buf, int len)
 {
     TCPCharDriver *s = chr->opaque;
     if (s->connected) {
-        return io_channel_send(s->chan, buf, len);
+#ifndef _WIN32
+        if (s->is_unix && s->write_msgfds_num) {
+            return unix_send_msgfds(chr, buf, len);
+        } else
+#endif
+        {
+            return io_channel_send(s->chan, buf, len);
+        }
     } else {
         /* XXX: indicate an error ? */
         return len;
@@ -2410,11 +2476,27 @@ static void tcp_chr_process_IAC_bytes(CharDriverState *chr,
 static int tcp_get_msgfd(CharDriverState *chr)
 {
     TCPCharDriver *s = chr->opaque;
-    int fd = s->msgfd;
-    s->msgfd = -1;
+    int fd = s->read_msgfd;
+    s->read_msgfd = -1;
     return fd;
 }
 
+static int tcp_set_msgfds(CharDriverState *chr, int *fds, int num)
+{
+    TCPCharDriver *s = chr->opaque;
+
+    /* clear old pending fd array */
+    if (s->write_msgfds) {
+        g_free(s->write_msgfds);
+    }
+
+    s->write_msgfds = g_malloc(num * sizeof(int));
+    memcpy(s->write_msgfds, fds, num * sizeof(int));
+    s->write_msgfds_num = num;
+
+    return 0;
+}
+
 #ifndef _WIN32
 static void unix_process_msgfd(CharDriverState *chr, struct msghdr *msg)
 {
@@ -2439,9 +2521,10 @@ static void unix_process_msgfd(CharDriverState *chr, struct msghdr *msg)
 #ifndef MSG_CMSG_CLOEXEC
         qemu_set_cloexec(fd);
 #endif
-        if (s->msgfd != -1)
-            close(s->msgfd);
-        s->msgfd = fd;
+        if (s->read_msgfd != -1) {
+            close(s->read_msgfd);
+        }
+        s->read_msgfd = fd;
     }
 }
 
@@ -2667,6 +2750,9 @@ static void tcp_chr_close(CharDriverState *chr)
         }
         closesocket(s->listen_fd);
     }
+    if (s->write_msgfds_num) {
+        g_free(s->write_msgfds);
+    }
     g_free(s);
     qemu_chr_be_event(chr, CHR_EVENT_CLOSED);
 }
@@ -2695,7 +2781,9 @@ static CharDriverState *qemu_chr_open_socket_fd(int fd, bool do_nodelay,
     s->connected = 0;
     s->fd = -1;
     s->listen_fd = -1;
-    s->msgfd = -1;
+    s->read_msgfd = -1;
+    s->write_msgfds = 0;
+    s->write_msgfds_num = 0;
 
     chr->filename = g_malloc(256);
     switch (ss.ss_family) {
@@ -2727,6 +2815,7 @@ static CharDriverState *qemu_chr_open_socket_fd(int fd, bool do_nodelay,
     chr->chr_sync_read = tcp_chr_sync_read;
     chr->chr_close = tcp_chr_close;
     chr->get_msgfd = tcp_get_msgfd;
+    chr->set_msgfds = tcp_set_msgfds;
     chr->chr_add_client = tcp_chr_add_client;
     chr->chr_add_watch = tcp_chr_add_watch;
     /* be isn't opened until we get a connection */
-- 
1.8.3.2


* [Qemu-devel] [PATCH v7 04/13] Add G_IO_HUP handler for socket chardev
  2014-01-31 17:34 [Qemu-devel] [PATCH v7 00/13] Vhost and vhost-net support for userspace based backends Antonios Motakis
                   ` (2 preceding siblings ...)
  2014-01-31 17:34 ` [Qemu-devel] [PATCH v7 03/13] Add chardev API qemu_chr_fe_set_msgfds Antonios Motakis
@ 2014-01-31 17:34 ` Antonios Motakis
  2014-01-31 17:34 ` [Qemu-devel] [PATCH v7 05/13] vhost_net should call the poll callback only when it is set Antonios Motakis
                   ` (9 subsequent siblings)
  13 siblings, 0 replies; 20+ messages in thread
From: Antonios Motakis @ 2014-01-31 17:34 UTC (permalink / raw)
  To: qemu-devel, snabb-devel
  Cc: mst, n.nikolaev, Anthony Liguori, lukego, Antonios Motakis, tech

Close the chardev on receiving this event.

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
---
 qemu-char.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/qemu-char.c b/qemu-char.c
index c2e599e..1c34b2b 100644
--- a/qemu-char.c
+++ b/qemu-char.c
@@ -2643,6 +2643,20 @@ CharDriverState *qemu_chr_open_eventfd(int eventfd)
 }
 #endif
 
+static gboolean tcp_chr_chan_close(GIOChannel *channel, GIOCondition cond,
+                                   void *opaque)
+{
+    CharDriverState *chr = opaque;
+
+    if (cond == G_IO_HUP) {
+        if (chr->chr_close) {
+            chr->chr_close(chr);
+        }
+    }
+
+    return TRUE;
+}
+
 static void tcp_chr_connect(void *opaque)
 {
     CharDriverState *chr = opaque;
@@ -2652,6 +2666,7 @@ static void tcp_chr_connect(void *opaque)
     if (s->chan) {
         chr->fd_in_tag = io_add_watch_poll(s->chan, tcp_chr_read_poll,
                                            tcp_chr_read, chr);
+        g_io_add_watch(s->chan, G_IO_HUP, tcp_chr_chan_close, chr);
     }
     qemu_chr_be_generic_open(chr);
 }
-- 
1.8.3.2


* [Qemu-devel] [PATCH v7 05/13] vhost_net should call the poll callback only when it is set
  2014-01-31 17:34 [Qemu-devel] [PATCH v7 00/13] Vhost and vhost-net support for userspace based backends Antonios Motakis
                   ` (3 preceding siblings ...)
  2014-01-31 17:34 ` [Qemu-devel] [PATCH v7 04/13] Add G_IO_HUP handler for socket chardev Antonios Motakis
@ 2014-01-31 17:34 ` Antonios Motakis
  2014-01-31 17:34 ` [Qemu-devel] [PATCH v7 06/13] Refactor virtio-net to use a generic get_vhost_net Antonios Motakis
                   ` (8 subsequent siblings)
  13 siblings, 0 replies; 20+ messages in thread
From: Antonios Motakis @ 2014-01-31 17:34 UTC (permalink / raw)
  To: qemu-devel, snabb-devel; +Cc: lukego, Antonios Motakis, tech, n.nikolaev, mst

The poll callback needs to be called when bringing the vhost_net
instance up or down. As it is not mandatory for a NetClient
to implement it, invoke it only when it is set.

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
---
 hw/net/vhost_net.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
index 006576d..6aa6e87 100644
--- a/hw/net/vhost_net.c
+++ b/hw/net/vhost_net.c
@@ -166,7 +166,10 @@ static int vhost_net_start_one(struct vhost_net *net,
         goto fail_start;
     }
 
-    net->nc->info->poll(net->nc, false);
+    if (net->nc->info->poll) {
+        net->nc->info->poll(net->nc, false);
+    }
+
     qemu_set_fd_handler(net->backend, NULL, NULL, NULL);
     file.fd = net->backend;
     for (file.index = 0; file.index < net->dev.nvqs; ++file.index) {
@@ -183,7 +186,9 @@ fail:
         int r = ioctl(net->dev.control, VHOST_NET_SET_BACKEND, &file);
         assert(r >= 0);
     }
-    net->nc->info->poll(net->nc, true);
+    if (net->nc->info->poll) {
+        net->nc->info->poll(net->nc, true);
+    }
     vhost_dev_stop(&net->dev, dev);
 fail_start:
     vhost_dev_disable_notifiers(&net->dev, dev);
@@ -204,7 +209,9 @@ static void vhost_net_stop_one(struct vhost_net *net,
         int r = ioctl(net->dev.control, VHOST_NET_SET_BACKEND, &file);
         assert(r >= 0);
     }
-    net->nc->info->poll(net->nc, true);
+    if (net->nc->info->poll) {
+        net->nc->info->poll(net->nc, true);
+    }
     vhost_dev_stop(&net->dev, dev);
     vhost_dev_disable_notifiers(&net->dev, dev);
 }
-- 
1.8.3.2


* [Qemu-devel] [PATCH v7 06/13] Refactor virtio-net to use a generic get_vhost_net
  2014-01-31 17:34 [Qemu-devel] [PATCH v7 00/13] Vhost and vhost-net support for userspace based backends Antonios Motakis
                   ` (4 preceding siblings ...)
  2014-01-31 17:34 ` [Qemu-devel] [PATCH v7 05/13] vhost_net should call the poll callback only when it is set Antonios Motakis
@ 2014-01-31 17:34 ` Antonios Motakis
  2014-01-31 17:34 ` [Qemu-devel] [PATCH v7 07/13] vhost_net_init will use VhostNetOptions to get all its arguments Antonios Motakis
                   ` (7 subsequent siblings)
  13 siblings, 0 replies; 20+ messages in thread
From: Antonios Motakis @ 2014-01-31 17:34 UTC (permalink / raw)
  To: qemu-devel, snabb-devel
  Cc: mst, Jason Wang, n.nikolaev, Anthony Liguori, Paolo Bonzini,
	lukego, Antonios Motakis, tech

This decouples virtio-net from the TAP netdev backend and allows support
for other backends to be implemented.

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
---
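Illustrative note, not part of the patch: a minimal, self-contained sketch
of the lookup pattern that get_vhost_net() follows. All type and function
names below (ClientKind, tap_get_vhost_net_sketch, get_vhost_net_sketch)
are hypothetical stand-ins and not QEMU's real definitions. The point is
only that a missing peer and an unknown backend kind both yield NULL, so
callers need a single check.

```c
#include <stddef.h>

/* Hypothetical stand-ins for QEMU's types; not the real definitions. */
typedef enum { KIND_TAP, KIND_USER, KIND_OTHER } ClientKind;

typedef struct VHostNetState { int id; } VHostNetState;

typedef struct NetClientState {
    ClientKind kind;
    VHostNetState *vhost_net;   /* NULL when vhost is not in use */
} NetClientState;

/* Stand-in for tap_get_vhost_net(): returns the TAP client's vhost state. */
static VHostNetState *tap_get_vhost_net_sketch(NetClientState *nc)
{
    return nc->vhost_net;
}

/* One lookup that tolerates a missing peer and unknown backend kinds. */
static VHostNetState *get_vhost_net_sketch(NetClientState *nc)
{
    if (!nc) {
        return NULL;
    }
    switch (nc->kind) {
    case KIND_TAP:
        return tap_get_vhost_net_sketch(nc);
    default:
        /* Backends without vhost support fall through to NULL. */
        return NULL;
    }
}
```

With this shape, a new backend kind would only need one more case in the
switch; the call sites in virtio-net would stay unchanged.
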
 hw/net/vhost_net.c      | 30 +++++++++++++++++++++++++++---
 hw/net/virtio-net.c     | 39 ++++++++++++++-------------------------
 include/net/vhost_net.h |  1 +
 3 files changed, 42 insertions(+), 28 deletions(-)

diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
index 6aa6e87..7ee904e 100644
--- a/hw/net/vhost_net.c
+++ b/hw/net/vhost_net.c
@@ -231,7 +231,7 @@ int vhost_net_start(VirtIODevice *dev, NetClientState *ncs,
     }
 
     for (i = 0; i < total_queues; i++) {
-        r = vhost_net_start_one(tap_get_vhost_net(ncs[i].peer), dev, i * 2);
+        r = vhost_net_start_one(get_vhost_net(ncs[i].peer), dev, i * 2);
 
         if (r < 0) {
             goto err;
@@ -248,7 +248,7 @@ int vhost_net_start(VirtIODevice *dev, NetClientState *ncs,
 
 err:
     while (--i >= 0) {
-        vhost_net_stop_one(tap_get_vhost_net(ncs[i].peer), dev);
+        vhost_net_stop_one(get_vhost_net(ncs[i].peer), dev);
     }
     return r;
 }
@@ -269,7 +269,7 @@ void vhost_net_stop(VirtIODevice *dev, NetClientState *ncs,
     assert(r >= 0);
 
     for (i = 0; i < total_queues; i++) {
-        vhost_net_stop_one(tap_get_vhost_net(ncs[i].peer), dev);
+        vhost_net_stop_one(get_vhost_net(ncs[i].peer), dev);
     }
 }
 
@@ -289,6 +289,25 @@ void vhost_net_virtqueue_mask(VHostNetState *net, VirtIODevice *dev,
 {
     vhost_virtqueue_mask(&net->dev, dev, idx, mask);
 }
+
+VHostNetState *get_vhost_net(NetClientState *nc)
+{
+    VHostNetState *vhost_net = 0;
+
+    if (!nc) {
+        return 0;
+    }
+
+    switch (nc->info->type) {
+    case NET_CLIENT_OPTIONS_KIND_TAP:
+        vhost_net = tap_get_vhost_net(nc);
+        break;
+    default:
+        break;
+    }
+
+    return vhost_net;
+}
 #else
 struct vhost_net *vhost_net_init(NetClientState *backend, int devfd,
                                  bool force)
@@ -335,4 +354,9 @@ void vhost_net_virtqueue_mask(VHostNetState *net, VirtIODevice *dev,
                               int idx, bool mask)
 {
 }
+
+VHostNetState *get_vhost_net(NetClientState *nc)
+{
+    return 0;
+}
 #endif
diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 3626608..72acd15 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -105,14 +105,7 @@ static void virtio_net_vhost_status(VirtIONet *n, uint8_t status)
     NetClientState *nc = qemu_get_queue(n->nic);
     int queues = n->multiqueue ? n->max_queues : 1;
 
-    if (!nc->peer) {
-        return;
-    }
-    if (nc->peer->info->type != NET_CLIENT_OPTIONS_KIND_TAP) {
-        return;
-    }
-
-    if (!tap_get_vhost_net(nc->peer)) {
+    if (!get_vhost_net(nc->peer)) {
         return;
     }
 
@@ -122,7 +115,7 @@ static void virtio_net_vhost_status(VirtIONet *n, uint8_t status)
     }
     if (!n->vhost_started) {
         int r;
-        if (!vhost_net_query(tap_get_vhost_net(nc->peer), vdev)) {
+        if (!vhost_net_query(get_vhost_net(nc->peer), vdev)) {
             return;
         }
         n->vhost_started = 1;
@@ -325,11 +318,13 @@ static void peer_test_vnet_hdr(VirtIONet *n)
         return;
     }
 
-    if (nc->peer->info->type != NET_CLIENT_OPTIONS_KIND_TAP) {
-        return;
+    switch (nc->peer->info->type) {
+    case NET_CLIENT_OPTIONS_KIND_TAP:
+        n->has_vnet_hdr = tap_has_vnet_hdr(nc->peer);
+        break;
+    default:
+        break;
     }
-
-    n->has_vnet_hdr = tap_has_vnet_hdr(nc->peer);
 }
 
 static int peer_has_vnet_hdr(VirtIONet *n)
@@ -437,13 +432,10 @@ static uint32_t virtio_net_get_features(VirtIODevice *vdev, uint32_t features)
         features &= ~(0x1 << VIRTIO_NET_F_HOST_UFO);
     }
 
-    if (!nc->peer || nc->peer->info->type != NET_CLIENT_OPTIONS_KIND_TAP) {
+    if (!get_vhost_net(nc->peer)) {
         return features;
     }
-    if (!tap_get_vhost_net(nc->peer)) {
-        return features;
-    }
-    return vhost_net_get_features(tap_get_vhost_net(nc->peer), features);
+    return vhost_net_get_features(get_vhost_net(nc->peer), features);
 }
 
 static uint32_t virtio_net_bad_features(VirtIODevice *vdev)
@@ -507,13 +499,10 @@ static void virtio_net_set_features(VirtIODevice *vdev, uint32_t features)
     for (i = 0;  i < n->max_queues; i++) {
         NetClientState *nc = qemu_get_subqueue(n->nic, i);
 
-        if (!nc->peer || nc->peer->info->type != NET_CLIENT_OPTIONS_KIND_TAP) {
-            continue;
-        }
-        if (!tap_get_vhost_net(nc->peer)) {
+        if (!get_vhost_net(nc->peer)) {
             continue;
         }
-        vhost_net_ack_features(tap_get_vhost_net(nc->peer), features);
+        vhost_net_ack_features(get_vhost_net(nc->peer), features);
     }
 }
 
@@ -1443,7 +1432,7 @@ static bool virtio_net_guest_notifier_pending(VirtIODevice *vdev, int idx)
     VirtIONet *n = VIRTIO_NET(vdev);
     NetClientState *nc = qemu_get_subqueue(n->nic, vq2q(idx));
     assert(n->vhost_started);
-    return vhost_net_virtqueue_pending(tap_get_vhost_net(nc->peer), idx);
+    return vhost_net_virtqueue_pending(get_vhost_net(nc->peer), idx);
 }
 
 static void virtio_net_guest_notifier_mask(VirtIODevice *vdev, int idx,
@@ -1452,7 +1441,7 @@ static void virtio_net_guest_notifier_mask(VirtIODevice *vdev, int idx,
     VirtIONet *n = VIRTIO_NET(vdev);
     NetClientState *nc = qemu_get_subqueue(n->nic, vq2q(idx));
     assert(n->vhost_started);
-    vhost_net_virtqueue_mask(tap_get_vhost_net(nc->peer),
+    vhost_net_virtqueue_mask(get_vhost_net(nc->peer),
                              vdev, idx, mask);
 }
 
diff --git a/include/net/vhost_net.h b/include/net/vhost_net.h
index 2d936bb..e2bd61c 100644
--- a/include/net/vhost_net.h
+++ b/include/net/vhost_net.h
@@ -20,4 +20,5 @@ void vhost_net_ack_features(VHostNetState *net, unsigned features);
 bool vhost_net_virtqueue_pending(VHostNetState *net, int n);
 void vhost_net_virtqueue_mask(VHostNetState *net, VirtIODevice *dev,
                               int idx, bool mask);
+VHostNetState *get_vhost_net(NetClientState *nc);
 #endif
-- 
1.8.3.2

* [Qemu-devel] [PATCH v7 07/13] vhost_net_init will use VhostNetOptions to get all its arguments
  2014-01-31 17:34 [Qemu-devel] [PATCH v7 00/13] Vhost and vhost-net support for userspace based backends Antonios Motakis
                   ` (5 preceding siblings ...)
  2014-01-31 17:34 ` [Qemu-devel] [PATCH v7 06/13] Refactor virtio-net to use a generic get_vhost_net Antonios Motakis
@ 2014-01-31 17:34 ` Antonios Motakis
  2014-01-31 17:34 ` [Qemu-devel] [PATCH v7 08/13] Add vhost_ops to the vhost_dev struct and replace all relevant ioctls Antonios Motakis
                   ` (6 subsequent siblings)
  13 siblings, 0 replies; 20+ messages in thread
From: Antonios Motakis @ 2014-01-31 17:34 UTC (permalink / raw)
  To: qemu-devel, snabb-devel
  Cc: Stefan Hajnoczi, mst, Jason Wang, n.nikolaev, Anthony Liguori,
	Paolo Bonzini, lukego, Antonios Motakis, tech

vhost_net_init now takes all its arguments through a VhostNetOptions
struct, and vhost_dev_init replaces devfd and devpath with a single opaque
argument. For now the opaque argument carries a file descriptor. When TAP
is used (through vhost_net) and no vhostfd is given, open /dev/vhost-net
and pass the fd as the opaque field of VhostNetOptions. The same applies
to vhost-scsi: open /dev/vhost-scsi and pass the fd.

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
---
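Illustrative note, not part of the patch: a small sketch of how a file
descriptor can travel through a void * argument by way of uintptr_t, which
is the cast pair used in this patch. OptionsSketch, options_set_fd and
options_get_fd are hypothetical names chosen for the example. The sketch
assumes the caller has already checked that the fd is non-negative, as the
TAP and vhost-scsi code here do before packing it.

```c
#include <stdint.h>

/* Hypothetical options struct mirroring the shape of VhostNetOptions. */
typedef struct OptionsSketch {
    void *opaque;   /* carries a file descriptor for the kernel backend */
    int force;
} OptionsSketch;

/* Pack an fd into the opaque pointer (caller side, e.g. the TAP code). */
static void options_set_fd(OptionsSketch *o, int fd)
{
    o->opaque = (void *)(uintptr_t)fd;
}

/* Unpack it again (callee side, e.g. device initialisation). */
static int options_get_fd(const OptionsSketch *o)
{
    return (int)(uintptr_t)o->opaque;
}
```

Going through uintptr_t avoids a direct int-to-pointer cast, which
compilers commonly warn about when the two types differ in size.
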
 hw/net/vhost_net.c        | 24 +++++++++++++-----------
 hw/scsi/vhost-scsi.c      | 10 +++++++++-
 hw/virtio/vhost.c         | 12 +++---------
 include/hw/virtio/vhost.h |  2 +-
 include/net/vhost_net.h   |  8 +++++++-
 net/tap.c                 | 17 +++++++++++++----
 6 files changed, 46 insertions(+), 27 deletions(-)

diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
index 7ee904e..c705fe6 100644
--- a/hw/net/vhost_net.c
+++ b/hw/net/vhost_net.c
@@ -92,32 +92,35 @@ static int vhost_net_get_fd(NetClientState *backend)
     }
 }
 
-struct vhost_net *vhost_net_init(NetClientState *backend, int devfd,
-                                 bool force)
+struct vhost_net *vhost_net_init(VhostNetOptions *options)
 {
     int r;
     struct vhost_net *net = g_malloc(sizeof *net);
-    if (!backend) {
-        fprintf(stderr, "vhost-net requires backend to be setup\n");
+
+    if (!options->net_backend) {
+        fprintf(stderr, "vhost-net requires net backend to be setup\n");
         goto fail;
     }
-    r = vhost_net_get_fd(backend);
+
+    r = vhost_net_get_fd(options->net_backend);
     if (r < 0) {
         goto fail;
     }
-    net->nc = backend;
-    net->dev.backend_features = tap_has_vnet_hdr(backend) ? 0 :
+
+    net->nc = options->net_backend;
+    net->dev.backend_features = tap_has_vnet_hdr(options->net_backend) ? 0 :
         (1 << VHOST_NET_F_VIRTIO_NET_HDR);
     net->backend = r;
 
     net->dev.nvqs = 2;
     net->dev.vqs = net->vqs;
 
-    r = vhost_dev_init(&net->dev, devfd, "/dev/vhost-net", force);
+    r = vhost_dev_init(&net->dev, options->opaque,
+                       options->force);
     if (r < 0) {
         goto fail;
     }
-    if (!tap_has_vnet_hdr_len(backend,
+    if (!tap_has_vnet_hdr_len(options->net_backend,
                               sizeof(struct virtio_net_hdr_mrg_rxbuf))) {
         net->dev.features &= ~(1 << VIRTIO_NET_F_MRG_RXBUF);
     }
@@ -309,8 +312,7 @@ VHostNetState *get_vhost_net(NetClientState *nc)
     return vhost_net;
 }
 #else
-struct vhost_net *vhost_net_init(NetClientState *backend, int devfd,
-                                 bool force)
+struct vhost_net *vhost_net_init(VhostNetOptions *options)
 {
     error_report("vhost-net support is not compiled in");
     return NULL;
diff --git a/hw/scsi/vhost-scsi.c b/hw/scsi/vhost-scsi.c
index 3983a5b..9b03fb6 100644
--- a/hw/scsi/vhost-scsi.c
+++ b/hw/scsi/vhost-scsi.c
@@ -215,6 +215,13 @@ static void vhost_scsi_realize(DeviceState *dev, Error **errp)
             error_setg(errp, "vhost-scsi: unable to parse vhostfd");
             return;
         }
+    } else {
+        vhostfd = open("/dev/vhost-scsi", O_RDWR);
+        if (vhostfd < 0) {
+            error_setg(errp, "vhost-scsi: open vhost char device failed: %s",
+                       strerror(errno));
+            return;
+        }
     }
 
     virtio_scsi_common_realize(dev, &err);
@@ -227,7 +234,8 @@ static void vhost_scsi_realize(DeviceState *dev, Error **errp)
     s->dev.vqs = g_new(struct vhost_virtqueue, s->dev.nvqs);
     s->dev.vq_index = 0;
 
-    ret = vhost_dev_init(&s->dev, vhostfd, "/dev/vhost-scsi", true);
+    ret = vhost_dev_init(&s->dev, (void *)(uintptr_t)vhostfd,
+                         true);
     if (ret < 0) {
         error_setg(errp, "vhost-scsi: vhost initialization failed: %s",
                    strerror(-ret));
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 9e336ad..7636836 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -814,19 +814,13 @@ static void vhost_virtqueue_cleanup(struct vhost_virtqueue *vq)
     event_notifier_cleanup(&vq->masked_notifier);
 }
 
-int vhost_dev_init(struct vhost_dev *hdev, int devfd, const char *devpath,
+int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
                    bool force)
 {
     uint64_t features;
     int i, r;
-    if (devfd >= 0) {
-        hdev->control = devfd;
-    } else {
-        hdev->control = open(devpath, O_RDWR);
-        if (hdev->control < 0) {
-            return -errno;
-        }
-    }
+    hdev->control = (uintptr_t) opaque;;
+
     r = ioctl(hdev->control, VHOST_SET_OWNER, NULL);
     if (r < 0) {
         goto fail;
diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
index de24746..eb25ffa 100644
--- a/include/hw/virtio/vhost.h
+++ b/include/hw/virtio/vhost.h
@@ -50,7 +50,7 @@ struct vhost_dev {
     hwaddr mem_changed_end_addr;
 };
 
-int vhost_dev_init(struct vhost_dev *hdev, int devfd, const char *devpath,
+int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
                    bool force);
 void vhost_dev_cleanup(struct vhost_dev *hdev);
 bool vhost_dev_query(struct vhost_dev *hdev, VirtIODevice *vdev);
diff --git a/include/net/vhost_net.h b/include/net/vhost_net.h
index e2bd61c..2067ee2 100644
--- a/include/net/vhost_net.h
+++ b/include/net/vhost_net.h
@@ -6,7 +6,13 @@
 struct vhost_net;
 typedef struct vhost_net VHostNetState;
 
-VHostNetState *vhost_net_init(NetClientState *backend, int devfd, bool force);
+typedef struct VhostNetOptions {
+    NetClientState *net_backend;
+    void *opaque;
+    bool force;
+} VhostNetOptions;
+
+struct vhost_net *vhost_net_init(VhostNetOptions *options);
 
 bool vhost_net_query(VHostNetState *net, VirtIODevice *dev);
 int vhost_net_start(VirtIODevice *dev, NetClientState *ncs, int total_queues);
diff --git a/net/tap.c b/net/tap.c
index 39c1cda..0840093 100644
--- a/net/tap.c
+++ b/net/tap.c
@@ -591,6 +591,7 @@ static int net_init_tap_one(const NetdevTapOptions *tap, NetClientState *peer,
                             int vnet_hdr, int fd)
 {
     TAPState *s;
+    int vhostfd;
 
     s = net_tap_fd_init(peer, model, name, fd, vnet_hdr);
     if (!s) {
@@ -621,7 +622,10 @@ static int net_init_tap_one(const NetdevTapOptions *tap, NetClientState *peer,
 
     if (tap->has_vhost ? tap->vhost :
         vhostfdname || (tap->has_vhostforce && tap->vhostforce)) {
-        int vhostfd;
+        VhostNetOptions options;
+
+        options.net_backend = &s->nc;
+        options.force = tap->has_vhostforce && tap->vhostforce;
 
         if (tap->has_vhostfd || tap->has_vhostfds) {
             vhostfd = monitor_handle_fd_param(cur_mon, vhostfdname);
@@ -629,11 +633,16 @@ static int net_init_tap_one(const NetdevTapOptions *tap, NetClientState *peer,
                 return -1;
             }
         } else {
-            vhostfd = -1;
+            vhostfd = open("/dev/vhost-net", O_RDWR);
+            if (vhostfd < 0) {
+                error_report("tap: open vhost char device failed: %s",
+                           strerror(errno));
+                return -1;
+            }
         }
+        options.opaque = (void *)(uintptr_t)vhostfd;
 
-        s->vhost_net = vhost_net_init(&s->nc, vhostfd,
-                                      tap->has_vhostforce && tap->vhostforce);
+        s->vhost_net = vhost_net_init(&options);
         if (!s->vhost_net) {
             error_report("vhost-net requested but could not be initialized");
             return -1;
-- 
1.8.3.2

* [Qemu-devel] [PATCH v7 08/13] Add vhost_ops to the vhost_dev struct and replace all relevant ioctls
  2014-01-31 17:34 [Qemu-devel] [PATCH v7 00/13] Vhost and vhost-net support for userspace based backends Antonios Motakis
                   ` (6 preceding siblings ...)
  2014-01-31 17:34 ` [Qemu-devel] [PATCH v7 07/13] vhost_net_init will use VhostNetOptions to get all its arguments Antonios Motakis
@ 2014-01-31 17:34 ` Antonios Motakis
  2014-01-31 17:34 ` [Qemu-devel] [PATCH v7 09/13] Add vhost-backend and VhostBackendType Antonios Motakis
                   ` (5 subsequent siblings)
  13 siblings, 0 replies; 20+ messages in thread
From: Antonios Motakis @ 2014-01-31 17:34 UTC (permalink / raw)
  To: qemu-devel, snabb-devel
  Cc: mst, n.nikolaev, Anthony Liguori, Paolo Bonzini, lukego,
	Antonios Motakis, tech

Decouple vhost from the Linux kernel by introducing vhost_ops. The
intention is to provide different backends: a 'kernel' backend based on
the ioctl interface, and a 'user' backend based on a unix domain socket
and shared memory.

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
---
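Illustrative note, not part of the patch: a self-contained sketch of the
function-pointer table idea behind vhost_ops. Every name here (OpsSketch,
dev_sketch, mock_ops, dev_set_owner, REQ_SET_OWNER) is hypothetical, and
the mock backend merely records requests instead of issuing ioctls. It is
meant to show how generic code can stay unaware of which backend is behind
the table.

```c
#include <stddef.h>

struct dev_sketch;

/* Hypothetical ops table with the same three roles as VhostOps. */
typedef struct OpsSketch {
    int (*call)(struct dev_sketch *dev, unsigned long request, void *arg);
    int (*init)(struct dev_sketch *dev, void *opaque);
    int (*cleanup)(struct dev_sketch *dev);
} OpsSketch;

struct dev_sketch {
    const OpsSketch *ops;
    void *opaque;
    unsigned long last_request;
    int calls;
};

/* A mock backend that records requests instead of issuing ioctls. */
static int mock_init(struct dev_sketch *dev, void *opaque)
{
    dev->opaque = opaque;
    dev->calls = 0;
    return 0;
}

static int mock_call(struct dev_sketch *dev, unsigned long request, void *arg)
{
    (void)arg;                  /* unused by the mock */
    dev->last_request = request;
    dev->calls++;
    return 0;
}

static int mock_cleanup(struct dev_sketch *dev)
{
    dev->opaque = NULL;
    return 0;
}

static const OpsSketch mock_ops = {
    .call = mock_call,
    .init = mock_init,
    .cleanup = mock_cleanup,
};

/* Generic code only talks to the table, never to a specific backend. */
static int dev_set_owner(struct dev_sketch *dev)
{
    enum { REQ_SET_OWNER = 1 };  /* made-up request number */
    return dev->ops->call(dev, REQ_SET_OWNER, NULL);
}
```

A kernel-style backend would implement call as a thin wrapper around
ioctl() on a stored fd, while a userspace backend could serialise the same
request over a socket; dev_set_owner would not change in either case.
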
 hw/net/vhost_net.c                | 10 ++++++----
 hw/scsi/vhost-scsi.c              | 10 +++++++---
 hw/virtio/vhost.c                 | 41 ++++++++++++++++++++-------------------
 include/hw/virtio/vhost-backend.h | 27 ++++++++++++++++++++++++++
 include/hw/virtio/vhost.h         |  2 ++
 5 files changed, 63 insertions(+), 27 deletions(-)
 create mode 100644 include/hw/virtio/vhost-backend.h

diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
index c705fe6..8e19b8f 100644
--- a/hw/net/vhost_net.c
+++ b/hw/net/vhost_net.c
@@ -27,7 +27,6 @@
 #include <sys/socket.h>
 #include <linux/kvm.h>
 #include <fcntl.h>
-#include <sys/ioctl.h>
 #include <linux/virtio_ring.h>
 #include <netpacket/packet.h>
 #include <net/ethernet.h>
@@ -176,7 +175,8 @@ static int vhost_net_start_one(struct vhost_net *net,
     qemu_set_fd_handler(net->backend, NULL, NULL, NULL);
     file.fd = net->backend;
     for (file.index = 0; file.index < net->dev.nvqs; ++file.index) {
-        r = ioctl(net->dev.control, VHOST_NET_SET_BACKEND, &file);
+        const VhostOps *vhost_ops = net->dev.vhost_ops;
+        r = vhost_ops->vhost_call(&net->dev, VHOST_NET_SET_BACKEND, &file);
         if (r < 0) {
             r = -errno;
             goto fail;
@@ -186,7 +186,8 @@ static int vhost_net_start_one(struct vhost_net *net,
 fail:
     file.fd = -1;
     while (file.index-- > 0) {
-        int r = ioctl(net->dev.control, VHOST_NET_SET_BACKEND, &file);
+        const VhostOps *vhost_ops = net->dev.vhost_ops;
+        int r = vhost_ops->vhost_call(&net->dev, VHOST_NET_SET_BACKEND, &file);
         assert(r >= 0);
     }
     if (net->nc->info->poll) {
@@ -209,7 +210,8 @@ static void vhost_net_stop_one(struct vhost_net *net,
     }
 
     for (file.index = 0; file.index < net->dev.nvqs; ++file.index) {
-        int r = ioctl(net->dev.control, VHOST_NET_SET_BACKEND, &file);
+        const VhostOps *vhost_ops = net->dev.vhost_ops;
+        int r = vhost_ops->vhost_call(&net->dev, VHOST_NET_SET_BACKEND, &file);
         assert(r >= 0);
     }
     if (net->nc->info->poll) {
diff --git a/hw/scsi/vhost-scsi.c b/hw/scsi/vhost-scsi.c
index 9b03fb6..48a9ced 100644
--- a/hw/scsi/vhost-scsi.c
+++ b/hw/scsi/vhost-scsi.c
@@ -27,12 +27,13 @@
 static int vhost_scsi_set_endpoint(VHostSCSI *s)
 {
     VirtIOSCSICommon *vs = VIRTIO_SCSI_COMMON(s);
+    const VhostOps *vhost_ops = s->dev.vhost_ops;
     struct vhost_scsi_target backend;
     int ret;
 
     memset(&backend, 0, sizeof(backend));
     pstrcpy(backend.vhost_wwpn, sizeof(backend.vhost_wwpn), vs->conf.wwpn);
-    ret = ioctl(s->dev.control, VHOST_SCSI_SET_ENDPOINT, &backend);
+    ret = vhost_ops->vhost_call(&s->dev, VHOST_SCSI_SET_ENDPOINT, &backend);
     if (ret < 0) {
         return -errno;
     }
@@ -43,10 +44,11 @@ static void vhost_scsi_clear_endpoint(VHostSCSI *s)
 {
     VirtIOSCSICommon *vs = VIRTIO_SCSI_COMMON(s);
     struct vhost_scsi_target backend;
+    const VhostOps *vhost_ops = s->dev.vhost_ops;
 
     memset(&backend, 0, sizeof(backend));
     pstrcpy(backend.vhost_wwpn, sizeof(backend.vhost_wwpn), vs->conf.wwpn);
-    ioctl(s->dev.control, VHOST_SCSI_CLEAR_ENDPOINT, &backend);
+    vhost_ops->vhost_call(&s->dev, VHOST_SCSI_CLEAR_ENDPOINT, &backend);
 }
 
 static int vhost_scsi_start(VHostSCSI *s)
@@ -55,13 +57,15 @@ static int vhost_scsi_start(VHostSCSI *s)
     VirtIODevice *vdev = VIRTIO_DEVICE(s);
     BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(vdev)));
     VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
+    const VhostOps *vhost_ops = s->dev.vhost_ops;
 
     if (!k->set_guest_notifiers) {
         error_report("binding does not support guest notifiers");
         return -ENOSYS;
     }
 
-    ret = ioctl(s->dev.control, VHOST_SCSI_GET_ABI_VERSION, &abi_version);
+    ret = vhost_ops->vhost_call(&s->dev,
+                                VHOST_SCSI_GET_ABI_VERSION, &abi_version);
     if (ret < 0) {
         return -errno;
     }
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 7636836..a2c76a4 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -13,7 +13,6 @@
  * GNU GPL, version 2 or (at your option) any later version.
  */
 
-#include <sys/ioctl.h>
 #include "hw/virtio/vhost.h"
 #include "hw/hw.h"
 #include "qemu/atomic.h"
@@ -291,7 +290,7 @@ static inline void vhost_dev_log_resize(struct vhost_dev* dev, uint64_t size)
 
     log = g_malloc0(size * sizeof *log);
     log_base = (uint64_t)(unsigned long)log;
-    r = ioctl(dev->control, VHOST_SET_LOG_BASE, &log_base);
+    r = dev->vhost_ops->vhost_call(dev, VHOST_SET_LOG_BASE, &log_base);
     assert(r >= 0);
     /* Sync only the range covered by the old log */
     if (dev->log_size) {
@@ -460,7 +459,7 @@ static void vhost_commit(MemoryListener *listener)
     }
 
     if (!dev->log_enabled) {
-        r = ioctl(dev->control, VHOST_SET_MEM_TABLE, dev->mem);
+        r = dev->vhost_ops->vhost_call(dev, VHOST_SET_MEM_TABLE, dev->mem);
         assert(r >= 0);
         dev->memory_changed = false;
         return;
@@ -473,7 +472,7 @@ static void vhost_commit(MemoryListener *listener)
     if (dev->log_size < log_size) {
         vhost_dev_log_resize(dev, log_size + VHOST_LOG_BUFFER);
     }
-    r = ioctl(dev->control, VHOST_SET_MEM_TABLE, dev->mem);
+    r = dev->vhost_ops->vhost_call(dev, VHOST_SET_MEM_TABLE, dev->mem);
     assert(r >= 0);
     /* To log less, can only decrease log size after table update. */
     if (dev->log_size > log_size + VHOST_LOG_BUFFER) {
@@ -541,7 +540,7 @@ static int vhost_virtqueue_set_addr(struct vhost_dev *dev,
         .log_guest_addr = vq->used_phys,
         .flags = enable_log ? (1 << VHOST_VRING_F_LOG) : 0,
     };
-    int r = ioctl(dev->control, VHOST_SET_VRING_ADDR, &addr);
+    int r = dev->vhost_ops->vhost_call(dev, VHOST_SET_VRING_ADDR, &addr);
     if (r < 0) {
         return -errno;
     }
@@ -555,7 +554,7 @@ static int vhost_dev_set_features(struct vhost_dev *dev, bool enable_log)
     if (enable_log) {
         features |= 0x1 << VHOST_F_LOG_ALL;
     }
-    r = ioctl(dev->control, VHOST_SET_FEATURES, &features);
+    r = dev->vhost_ops->vhost_call(dev, VHOST_SET_FEATURES, &features);
     return r < 0 ? -errno : 0;
 }
 
@@ -670,13 +669,13 @@ static int vhost_virtqueue_start(struct vhost_dev *dev,
     assert(idx >= dev->vq_index && idx < dev->vq_index + dev->nvqs);
 
     vq->num = state.num = virtio_queue_get_num(vdev, idx);
-    r = ioctl(dev->control, VHOST_SET_VRING_NUM, &state);
+    r = dev->vhost_ops->vhost_call(dev, VHOST_SET_VRING_NUM, &state);
     if (r) {
         return -errno;
     }
 
     state.num = virtio_queue_get_last_avail_idx(vdev, idx);
-    r = ioctl(dev->control, VHOST_SET_VRING_BASE, &state);
+    r = dev->vhost_ops->vhost_call(dev, VHOST_SET_VRING_BASE, &state);
     if (r) {
         return -errno;
     }
@@ -718,7 +717,7 @@ static int vhost_virtqueue_start(struct vhost_dev *dev,
     }
 
     file.fd = event_notifier_get_fd(virtio_queue_get_host_notifier(vvq));
-    r = ioctl(dev->control, VHOST_SET_VRING_KICK, &file);
+    r = dev->vhost_ops->vhost_call(dev, VHOST_SET_VRING_KICK, &file);
     if (r) {
         r = -errno;
         goto fail_kick;
@@ -756,7 +755,7 @@ static void vhost_virtqueue_stop(struct vhost_dev *dev,
     };
     int r;
     assert(idx >= dev->vq_index && idx < dev->vq_index + dev->nvqs);
-    r = ioctl(dev->control, VHOST_GET_VRING_BASE, &state);
+    r = dev->vhost_ops->vhost_call(dev, VHOST_GET_VRING_BASE, &state);
     if (r < 0) {
         fprintf(stderr, "vhost VQ %d ring restore failed: %d\n", idx, r);
         fflush(stderr);
@@ -798,7 +797,7 @@ static int vhost_virtqueue_init(struct vhost_dev *dev,
     }
 
     file.fd = event_notifier_get_fd(&vq->masked_notifier);
-    r = ioctl(dev->control, VHOST_SET_VRING_CALL, &file);
+    r = dev->vhost_ops->vhost_call(dev, VHOST_SET_VRING_CALL, &file);
     if (r) {
         r = -errno;
         goto fail_call;
@@ -819,14 +818,17 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
 {
     uint64_t features;
     int i, r;
-    hdev->control = (uintptr_t) opaque;;
 
-    r = ioctl(hdev->control, VHOST_SET_OWNER, NULL);
+    if (hdev->vhost_ops->vhost_backend_init(hdev, opaque) < 0) {
+        return -errno;
+    }
+
+    r = hdev->vhost_ops->vhost_call(hdev, VHOST_SET_OWNER, NULL);
     if (r < 0) {
         goto fail;
     }
 
-    r = ioctl(hdev->control, VHOST_GET_FEATURES, &features);
+    r = hdev->vhost_ops->vhost_call(hdev, VHOST_GET_FEATURES, &features);
     if (r < 0) {
         goto fail;
     }
@@ -871,7 +873,7 @@ fail_vq:
     }
 fail:
     r = -errno;
-    close(hdev->control);
+    hdev->vhost_ops->vhost_backend_cleanup(hdev);
     return r;
 }
 
@@ -884,7 +886,7 @@ void vhost_dev_cleanup(struct vhost_dev *hdev)
     memory_listener_unregister(&hdev->memory_listener);
     g_free(hdev->mem);
     g_free(hdev->mem_sections);
-    close(hdev->control);
+    hdev->vhost_ops->vhost_backend_cleanup(hdev);
 }
 
 bool vhost_dev_query(struct vhost_dev *hdev, VirtIODevice *vdev)
@@ -986,7 +988,7 @@ void vhost_virtqueue_mask(struct vhost_dev *hdev, VirtIODevice *vdev, int n,
     } else {
         file.fd = event_notifier_get_fd(virtio_queue_get_guest_notifier(vvq));
     }
-    r = ioctl(hdev->control, VHOST_SET_VRING_CALL, &file);
+    r = hdev->vhost_ops->vhost_call(hdev, VHOST_SET_VRING_CALL, &file);
     assert(r >= 0);
 }
 
@@ -1001,7 +1003,7 @@ int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev)
     if (r < 0) {
         goto fail_features;
     }
-    r = ioctl(hdev->control, VHOST_SET_MEM_TABLE, hdev->mem);
+    r = hdev->vhost_ops->vhost_call(hdev, VHOST_SET_MEM_TABLE, hdev->mem);
     if (r < 0) {
         r = -errno;
         goto fail_mem;
@@ -1020,8 +1022,7 @@ int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev)
         hdev->log_size = vhost_get_log_size(hdev);
         hdev->log = hdev->log_size ?
             g_malloc0(hdev->log_size * sizeof *hdev->log) : NULL;
-        r = ioctl(hdev->control, VHOST_SET_LOG_BASE,
-                  (uint64_t)(unsigned long)hdev->log);
+        r = hdev->vhost_ops->vhost_call(hdev, VHOST_SET_LOG_BASE, hdev->log);
         if (r < 0) {
             r = -errno;
             goto fail_log;
diff --git a/include/hw/virtio/vhost-backend.h b/include/hw/virtio/vhost-backend.h
new file mode 100644
index 0000000..14e5878
--- /dev/null
+++ b/include/hw/virtio/vhost-backend.h
@@ -0,0 +1,27 @@
+/*
+ * vhost-backend
+ *
+ * Copyright (c) 2013 Virtual Open Systems Sarl.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#ifndef VHOST_BACKEND_H_
+#define VHOST_BACKEND_H_
+
+struct vhost_dev;
+
+typedef int (*vhost_call)(struct vhost_dev *dev, unsigned long int request,
+             void *arg);
+typedef int (*vhost_backend_init)(struct vhost_dev *dev, void *opaque);
+typedef int (*vhost_backend_cleanup)(struct vhost_dev *dev);
+
+typedef struct VhostOps {
+    vhost_call vhost_call;
+    vhost_backend_init vhost_backend_init;
+    vhost_backend_cleanup vhost_backend_cleanup;
+} VhostOps;
+
+#endif /* VHOST_BACKEND_H_ */
diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
index eb25ffa..97641b6 100644
--- a/include/hw/virtio/vhost.h
+++ b/include/hw/virtio/vhost.h
@@ -2,6 +2,7 @@
 #define VHOST_H
 
 #include "hw/hw.h"
+#include "hw/virtio/vhost-backend.h"
 #include "hw/virtio/virtio.h"
 #include "exec/memory.h"
 
@@ -48,6 +49,7 @@ struct vhost_dev {
     bool memory_changed;
     hwaddr mem_changed_start_addr;
     hwaddr mem_changed_end_addr;
+    const VhostOps *vhost_ops;
 };
 
 int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
-- 
1.8.3.2

* [Qemu-devel] [PATCH v7 09/13] Add vhost-backend and VhostBackendType
  2014-01-31 17:34 [Qemu-devel] [PATCH v7 00/13] Vhost and vhost-net support for userspace based backends Antonios Motakis
                   ` (7 preceding siblings ...)
  2014-01-31 17:34 ` [Qemu-devel] [PATCH v7 08/13] Add vhost_ops to the vhost_dev struct and replace all relevant ioctls Antonios Motakis
@ 2014-01-31 17:34 ` Antonios Motakis
  2014-01-31 17:34 ` [Qemu-devel] [PATCH v7 10/13] Add vhost-user as a vhost backend Antonios Motakis
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 20+ messages in thread
From: Antonios Motakis @ 2014-01-31 17:34 UTC (permalink / raw)
  To: qemu-devel, snabb-devel
  Cc: Peter Maydell, Stefan Hajnoczi, mst, Jason Wang, n.nikolaev,
	Anthony Liguori, Paolo Bonzini, lukego, Antonios Motakis, tech,
	KONRAD Frederic

Use vhost_set_backend_type to initialise the appropriate vhost_ops
structure. In vhost_net_init and vhost_net_start_one, call the TAP-related
initialisation conditionally, depending on the vhost backend type.

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
---
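Illustrative note, not part of the patch: a sketch of selecting an ops
table from a backend type, in the spirit of vhost_set_backend_type. The
enum values, the BackendOps and DevSketch types and the function
set_backend_type_sketch are all hypothetical. In particular, the 'user'
entry is an assumption included only to show a second choice. The sketch
rejects unknown types and leaves the device untouched in that case.

```c
#include <stddef.h>

/* Hypothetical backend type enum; values are made up for the example. */
typedef enum {
    BACKEND_NONE_SKETCH = 0,
    BACKEND_KERNEL_SKETCH,
    BACKEND_USER_SKETCH,
} BackendTypeSketch;

/* Minimal ops table; a real one would hold function pointers. */
typedef struct BackendOps {
    const char *name;
} BackendOps;

typedef struct DevSketch {
    const BackendOps *ops;
} DevSketch;

static const BackendOps kernel_ops = { "kernel" };
static const BackendOps user_ops = { "user" };

/*
 * Select the ops table for the given type.
 * Returns 0 on success, -1 for an unknown type; on failure dev->ops is
 * left as it was.
 */
static int set_backend_type_sketch(DevSketch *dev, BackendTypeSketch type)
{
    switch (type) {
    case BACKEND_KERNEL_SKETCH:
        dev->ops = &kernel_ops;
        return 0;
    case BACKEND_USER_SKETCH:
        dev->ops = &user_ops;
        return 0;
    default:
        return -1;
    }
}
```

Selecting the table once at init time keeps later code free of
per-backend branches, apart from steps that only one backend needs, such
as the TAP-related setup guarded in this patch.
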
 hw/net/vhost_net.c                | 81 +++++++++++++++++++++++----------------
 hw/scsi/vhost-scsi.c              |  2 +-
 hw/virtio/Makefile.objs           |  2 +-
 hw/virtio/vhost-backend.c         | 66 +++++++++++++++++++++++++++++++
 hw/virtio/vhost.c                 |  6 ++-
 include/hw/virtio/vhost-backend.h | 11 ++++++
 include/hw/virtio/vhost.h         |  4 +-
 include/net/vhost_net.h           |  2 +
 net/tap.c                         |  1 +
 9 files changed, 137 insertions(+), 38 deletions(-)
 create mode 100644 hw/virtio/vhost-backend.c

diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
index 8e19b8f..6b6268b 100644
--- a/hw/net/vhost_net.c
+++ b/hw/net/vhost_net.c
@@ -93,7 +93,7 @@ static int vhost_net_get_fd(NetClientState *backend)
 
 struct vhost_net *vhost_net_init(VhostNetOptions *options)
 {
-    int r;
+    int r = -1;
     struct vhost_net *net = g_malloc(sizeof *net);
 
     if (!options->net_backend) {
@@ -101,35 +101,41 @@ struct vhost_net *vhost_net_init(VhostNetOptions *options)
         goto fail;
     }
 
-    r = vhost_net_get_fd(options->net_backend);
-    if (r < 0) {
-        goto fail;
+    if (options->backend_type == VHOST_BACKEND_TYPE_KERNEL) {
+        r = vhost_net_get_fd(options->net_backend);
+        if (r < 0) {
+            goto fail;
+        }
+
+        net->dev.backend_features =
+                tap_has_vnet_hdr(options->net_backend) ? 0 :
+                                (1 << VHOST_NET_F_VIRTIO_NET_HDR);
     }
 
     net->nc = options->net_backend;
-    net->dev.backend_features = tap_has_vnet_hdr(options->net_backend) ? 0 :
-        (1 << VHOST_NET_F_VIRTIO_NET_HDR);
     net->backend = r;
 
     net->dev.nvqs = 2;
     net->dev.vqs = net->vqs;
 
     r = vhost_dev_init(&net->dev, options->opaque,
-                       options->force);
+                       options->backend_type, options->force);
     if (r < 0) {
         goto fail;
     }
-    if (!tap_has_vnet_hdr_len(options->net_backend,
-                              sizeof(struct virtio_net_hdr_mrg_rxbuf))) {
-        net->dev.features &= ~(1 << VIRTIO_NET_F_MRG_RXBUF);
-    }
-    if (~net->dev.features & net->dev.backend_features) {
-        fprintf(stderr, "vhost lacks feature mask %" PRIu64 " for backend\n",
-                (uint64_t)(~net->dev.features & net->dev.backend_features));
-        vhost_dev_cleanup(&net->dev);
-        goto fail;
+    if (options->backend_type == VHOST_BACKEND_TYPE_KERNEL) {
+        if (!tap_has_vnet_hdr_len(options->net_backend,
+                        sizeof(struct virtio_net_hdr_mrg_rxbuf))) {
+            net->dev.features &= ~(1 << VIRTIO_NET_F_MRG_RXBUF);
+        }
+        if (~net->dev.features & net->dev.backend_features) {
+            fprintf(stderr, "vhost lacks feature mask %" PRIu64
+                   " for backend\n",
+                   (uint64_t)(~net->dev.features & net->dev.backend_features));
+            vhost_dev_cleanup(&net->dev);
+            goto fail;
+        }
     }
-
     /* Set sane init value. Override when guest acks. */
     vhost_net_ack_features(net, 0);
     return net;
@@ -172,23 +178,29 @@ static int vhost_net_start_one(struct vhost_net *net,
         net->nc->info->poll(net->nc, false);
     }
 
-    qemu_set_fd_handler(net->backend, NULL, NULL, NULL);
-    file.fd = net->backend;
-    for (file.index = 0; file.index < net->dev.nvqs; ++file.index) {
-        const VhostOps *vhost_ops = net->dev.vhost_ops;
-        r = vhost_ops->vhost_call(&net->dev, VHOST_NET_SET_BACKEND, &file);
-        if (r < 0) {
-            r = -errno;
-            goto fail;
+    if (net->nc->info->type == NET_CLIENT_OPTIONS_KIND_TAP) {
+        qemu_set_fd_handler(net->backend, NULL, NULL, NULL);
+        file.fd = net->backend;
+        for (file.index = 0; file.index < net->dev.nvqs; ++file.index) {
+            const VhostOps *vhost_ops = net->dev.vhost_ops;
+            r = vhost_ops->vhost_call(&net->dev, VHOST_NET_SET_BACKEND,
+                                      &file);
+            if (r < 0) {
+                r = -errno;
+                goto fail;
+            }
         }
     }
     return 0;
 fail:
     file.fd = -1;
-    while (file.index-- > 0) {
-        const VhostOps *vhost_ops = net->dev.vhost_ops;
-        int r = vhost_ops->vhost_call(&net->dev, VHOST_NET_SET_BACKEND, &file);
-        assert(r >= 0);
+    if (net->nc->info->type == NET_CLIENT_OPTIONS_KIND_TAP) {
+        while (file.index-- > 0) {
+            const VhostOps *vhost_ops = net->dev.vhost_ops;
+            int r = vhost_ops->vhost_call(&net->dev, VHOST_NET_SET_BACKEND,
+                                          &file);
+            assert(r >= 0);
+        }
     }
     if (net->nc->info->poll) {
         net->nc->info->poll(net->nc, true);
@@ -209,10 +221,13 @@ static void vhost_net_stop_one(struct vhost_net *net,
         return;
     }
 
-    for (file.index = 0; file.index < net->dev.nvqs; ++file.index) {
-        const VhostOps *vhost_ops = net->dev.vhost_ops;
-        int r = vhost_ops->vhost_call(&net->dev, VHOST_NET_SET_BACKEND, &file);
-        assert(r >= 0);
+    if (net->nc->info->type == NET_CLIENT_OPTIONS_KIND_TAP) {
+        for (file.index = 0; file.index < net->dev.nvqs; ++file.index) {
+            const VhostOps *vhost_ops = net->dev.vhost_ops;
+            int r = vhost_ops->vhost_call(&net->dev, VHOST_NET_SET_BACKEND,
+                                          &file);
+            assert(r >= 0);
+        }
     }
     if (net->nc->info->poll) {
         net->nc->info->poll(net->nc, true);
diff --git a/hw/scsi/vhost-scsi.c b/hw/scsi/vhost-scsi.c
index 48a9ced..c099fb6 100644
--- a/hw/scsi/vhost-scsi.c
+++ b/hw/scsi/vhost-scsi.c
@@ -239,7 +239,7 @@ static void vhost_scsi_realize(DeviceState *dev, Error **errp)
     s->dev.vq_index = 0;
 
     ret = vhost_dev_init(&s->dev, (void *)(uintptr_t)vhostfd,
-                         true);
+                         VHOST_BACKEND_TYPE_KERNEL, true);
     if (ret < 0) {
         error_setg(errp, "vhost-scsi: vhost initialization failed: %s",
                    strerror(-ret));
diff --git a/hw/virtio/Makefile.objs b/hw/virtio/Makefile.objs
index 1ba53d9..51e5bdb 100644
--- a/hw/virtio/Makefile.objs
+++ b/hw/virtio/Makefile.objs
@@ -5,4 +5,4 @@ common-obj-y += virtio-mmio.o
 common-obj-$(CONFIG_VIRTIO_BLK_DATA_PLANE) += dataplane/
 
 obj-y += virtio.o virtio-balloon.o 
-obj-$(CONFIG_LINUX) += vhost.o
+obj-$(CONFIG_LINUX) += vhost.o vhost-backend.o
diff --git a/hw/virtio/vhost-backend.c b/hw/virtio/vhost-backend.c
new file mode 100644
index 0000000..509e103
--- /dev/null
+++ b/hw/virtio/vhost-backend.c
@@ -0,0 +1,66 @@
+/*
+ * vhost-backend
+ *
+ * Copyright (c) 2013 Virtual Open Systems Sarl.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#include "hw/virtio/vhost.h"
+#include "hw/virtio/vhost-backend.h"
+#include "qemu/error-report.h"
+
+#include <sys/ioctl.h>
+
+static int vhost_kernel_call(struct vhost_dev *dev, unsigned long int request,
+                             void *arg)
+{
+    int fd = (uintptr_t) dev->opaque;
+
+    assert(dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_KERNEL);
+
+    return ioctl(fd, request, arg);
+}
+
+static int vhost_kernel_init(struct vhost_dev *dev, void *opaque)
+{
+    assert(dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_KERNEL);
+
+    dev->opaque = opaque;
+
+    return 0;
+}
+
+static int vhost_kernel_cleanup(struct vhost_dev *dev)
+{
+    int fd = (uintptr_t) dev->opaque;
+
+    assert(dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_KERNEL);
+
+    return close(fd);
+}
+
+static const VhostOps kernel_ops = {
+        .backend_type = VHOST_BACKEND_TYPE_KERNEL,
+        .vhost_call = vhost_kernel_call,
+        .vhost_backend_init = vhost_kernel_init,
+        .vhost_backend_cleanup = vhost_kernel_cleanup
+};
+
+int vhost_set_backend_type(struct vhost_dev *dev, VhostBackendType backend_type)
+{
+    int r = 0;
+
+    switch (backend_type) {
+    case VHOST_BACKEND_TYPE_KERNEL:
+        dev->vhost_ops = &kernel_ops;
+        break;
+    default:
+        error_report("Unknown vhost backend type\n");
+        r = -1;
+    }
+
+    return r;
+}
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index a2c76a4..d572a4e 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -814,11 +814,15 @@ static void vhost_virtqueue_cleanup(struct vhost_virtqueue *vq)
 }
 
 int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
-                   bool force)
+                   VhostBackendType backend_type, bool force)
 {
     uint64_t features;
     int i, r;
 
+    if (vhost_set_backend_type(hdev, backend_type) < 0) {
+        return -1;
+    }
+
     if (hdev->vhost_ops->vhost_backend_init(hdev, opaque) < 0) {
         return -errno;
     }
diff --git a/include/hw/virtio/vhost-backend.h b/include/hw/virtio/vhost-backend.h
index 14e5878..d31768a 100644
--- a/include/hw/virtio/vhost-backend.h
+++ b/include/hw/virtio/vhost-backend.h
@@ -11,6 +11,13 @@
 #ifndef VHOST_BACKEND_H_
 #define VHOST_BACKEND_H_
 
+typedef enum VhostBackendType {
+    VHOST_BACKEND_TYPE_NONE = 0,
+    VHOST_BACKEND_TYPE_KERNEL = 1,
+    VHOST_BACKEND_TYPE_USER = 2,
+    VHOST_BACKEND_TYPE_MAX = 3,
+} VhostBackendType;
+
 struct vhost_dev;
 
 typedef int (*vhost_call)(struct vhost_dev *dev, unsigned long int request,
@@ -19,9 +26,13 @@ typedef int (*vhost_backend_init)(struct vhost_dev *dev, void *opaque);
 typedef int (*vhost_backend_cleanup)(struct vhost_dev *dev);
 
 typedef struct VhostOps {
+    VhostBackendType backend_type;
     vhost_call vhost_call;
     vhost_backend_init vhost_backend_init;
     vhost_backend_cleanup vhost_backend_cleanup;
 } VhostOps;
 
+int vhost_set_backend_type(struct vhost_dev *dev,
+                           VhostBackendType backend_type);
+
 #endif /* VHOST_BACKEND_H_ */
diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
index 97641b6..4806f0d 100644
--- a/include/hw/virtio/vhost.h
+++ b/include/hw/virtio/vhost.h
@@ -30,7 +30,6 @@ typedef unsigned long vhost_log_chunk_t;
 struct vhost_memory;
 struct vhost_dev {
     MemoryListener memory_listener;
-    int control;
     struct vhost_memory *mem;
     int n_mem_sections;
     MemoryRegionSection *mem_sections;
@@ -50,10 +49,11 @@ struct vhost_dev {
     hwaddr mem_changed_start_addr;
     hwaddr mem_changed_end_addr;
     const VhostOps *vhost_ops;
+    void *opaque;
 };
 
 int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
-                   bool force);
+                   VhostBackendType backend_type, bool force);
 void vhost_dev_cleanup(struct vhost_dev *hdev);
 bool vhost_dev_query(struct vhost_dev *hdev, VirtIODevice *vdev);
 int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev);
diff --git a/include/net/vhost_net.h b/include/net/vhost_net.h
index 2067ee2..b1c18a3 100644
--- a/include/net/vhost_net.h
+++ b/include/net/vhost_net.h
@@ -2,11 +2,13 @@
 #define VHOST_NET_H
 
 #include "net/net.h"
+#include "hw/virtio/vhost-backend.h"
 
 struct vhost_net;
 typedef struct vhost_net VHostNetState;
 
 typedef struct VhostNetOptions {
+    VhostBackendType backend_type;
     NetClientState *net_backend;
     void *opaque;
     bool force;
diff --git a/net/tap.c b/net/tap.c
index 0840093..eda4039 100644
--- a/net/tap.c
+++ b/net/tap.c
@@ -624,6 +624,7 @@ static int net_init_tap_one(const NetdevTapOptions *tap, NetClientState *peer,
         vhostfdname || (tap->has_vhostforce && tap->vhostforce)) {
         VhostNetOptions options;
 
+        options.backend_type = VHOST_BACKEND_TYPE_KERNEL;
         options.net_backend = &s->nc;
         options.force = tap->has_vhostforce && tap->vhostforce;
 
-- 
1.8.3.2


* [Qemu-devel] [PATCH v7 10/13] Add vhost-user as a vhost backend.
  2014-01-31 17:34 [Qemu-devel] [PATCH v7 00/13] Vhost and vhost-net support for userspace based backends Antonios Motakis
                   ` (8 preceding siblings ...)
  2014-01-31 17:34 ` [Qemu-devel] [PATCH v7 09/13] Add vhost-backend and VhostBackendType Antonios Motakis
@ 2014-01-31 17:34 ` Antonios Motakis
  2014-01-31 17:34 ` [Qemu-devel] [PATCH v7 11/13] Add new vhost-user netdev backend Antonios Motakis
                   ` (3 subsequent siblings)
  13 siblings, 0 replies; 20+ messages in thread
From: Antonios Motakis @ 2014-01-31 17:34 UTC (permalink / raw)
  To: qemu-devel, snabb-devel
  Cc: Peter Maydell, mst, n.nikolaev, Paolo Bonzini, lukego,
	Antonios Motakis, tech, KONRAD Frederic

The initialization takes a chardev backed by a unix domain socket.
The chardev should support qemu_chr_fe_set_msgfds in order to be able to
pass file descriptors to the remote process.

Each ioctl request of vhost-kernel has a vhost-user message equivalent,
which is sent over the control socket.
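
As a rough illustration of this mapping, the sketch below frames one
request the way the VhostUserMsg struct in this patch lays it out: a
header of three 32-bit words (request, flags, size) followed by the
payload. The ex_ names and the fixed-width request field are assumptions
made for the example (the patch declares the request as an enum); this is
not the code added by the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Values taken from the patch; the EX_ prefix marks example-only names. */
enum { EX_VHOST_USER_SET_FEATURES = 2 };
#define EX_VHOST_USER_VERSION 0x1u

/* Header layout: request, flags, payload size. */
struct ex_vu_hdr {
    uint32_t request;
    uint32_t flags;
    uint32_t size;
} __attribute__((packed));

static_assert(sizeof(struct ex_vu_hdr) == 12, "header is three 32-bit words");

/*
 * Hypothetical helper: write a header plus a u64 payload into buf.
 * Returns the number of bytes to send, or 0 if buf is too small.
 */
static size_t ex_vu_frame_u64(uint8_t *buf, size_t buflen,
                              uint32_t request, uint64_t value)
{
    struct ex_vu_hdr hdr = {
        .request = request,
        .flags = EX_VHOST_USER_VERSION,
        .size = sizeof(value),
    };

    if (buflen < sizeof(hdr) + sizeof(value)) {
        return 0;
    }
    memcpy(buf, &hdr, sizeof(hdr));
    memcpy(buf + sizeof(hdr), &value, sizeof(value));
    return sizeof(hdr) + sizeof(value);
}
```

A VHOST_SET_FEATURES ioctl would then become a 20-byte message: the
12-byte header with request 2 and size 8, followed by the feature bits.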

The general approach is to copy the data from the supplied argument
pointer to a designated field in the message. If a file descriptor is
to be passed, it is placed in the fds array for inclusion in
the sendmsg control header.
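
For readers unfamiliar with descriptor passing, here is a minimal sketch
of the underlying mechanism using plain POSIX sockets: the descriptor
travels as SCM_RIGHTS ancillary data next to the message bytes. The patch
itself goes through the chardev layer instead of calling sendmsg
directly; the ex_ helper names below are hypothetical and only one
descriptor is carried per message.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Send len bytes from data together with one descriptor (SCM_RIGHTS). */
static ssize_t ex_send_with_fd(int sock, const void *data, size_t len, int fd)
{
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;           /* forces correct alignment */
    } ctrl;
    struct iovec iov = { .iov_base = (void *)data, .iov_len = len };
    struct msghdr mh;
    struct cmsghdr *cmsg;

    memset(&mh, 0, sizeof(mh));
    memset(&ctrl, 0, sizeof(ctrl));
    mh.msg_iov = &iov;
    mh.msg_iovlen = 1;
    mh.msg_control = ctrl.buf;
    mh.msg_controllen = sizeof(ctrl.buf);

    cmsg = CMSG_FIRSTHDR(&mh);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &mh, 0);
}

/* Receive up to len bytes; *fd gets a passed descriptor, or -1 if none. */
static ssize_t ex_recv_with_fd(int sock, void *data, size_t len, int *fd)
{
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } ctrl;
    struct iovec iov = { .iov_base = data, .iov_len = len };
    struct msghdr mh;
    struct cmsghdr *cmsg;
    ssize_t n;

    memset(&mh, 0, sizeof(mh));
    mh.msg_iov = &iov;
    mh.msg_iovlen = 1;
    mh.msg_control = ctrl.buf;
    mh.msg_controllen = sizeof(ctrl.buf);

    *fd = -1;
    n = recvmsg(sock, &mh, 0);
    if (n < 0) {
        return n;
    }
    for (cmsg = CMSG_FIRSTHDR(&mh); cmsg; cmsg = CMSG_NXTHDR(&mh, cmsg)) {
        if (cmsg->cmsg_level == SOL_SOCKET && cmsg->cmsg_type == SCM_RIGHTS) {
            memcpy(fd, CMSG_DATA(cmsg), sizeof(int));
            break;
        }
    }
    return n;
}
```

The receiver gets its own descriptor number referring to the same open
file, which is how an eventfd or a memory-backing fd can be shared with
the backend process.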

VHOST_SET_MEM_TABLE ignores the supplied vhost_memory structure and scans
the global ram_list for ram blocks with a valid fd field set. The fd is
set when the -mem-path option is used with the share=on property.
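
The scan can be sketched as follows. The block type here is a
hypothetical stand-in, not QEMU's RAMBlock, and the sketch differs from
the patch in two labeled ways: it treats any fd >= 0 as valid, and it
refuses to collect more regions than the message can hold.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EX_MAX_NREGIONS 8   /* mirrors VHOST_MEMORY_MAX_NREGIONS */

/* Hypothetical stand-in for a RAM block. */
struct ex_ram_block {
    uint64_t guest_addr;
    uint64_t size;
    uint64_t host_addr;
    int fd;                 /* -1 when the block is not file-backed */
};

struct ex_region {
    uint64_t guest_phys_addr;
    uint64_t memory_size;
    uint64_t userspace_addr;
};

struct ex_mem_table {
    uint32_t nregions;
    uint32_t padding;
    struct ex_region regions[EX_MAX_NREGIONS];
};

/*
 * Collect file-backed blocks into table and their descriptors into fds
 * (fds must have room for EX_MAX_NREGIONS entries).
 * Returns the payload size in bytes, or -1 if no block can be shared
 * or there are more shareable blocks than the table holds.
 */
static int ex_build_mem_table(const struct ex_ram_block *blocks,
                              size_t nblocks,
                              struct ex_mem_table *table, int *fds)
{
    size_t i;
    uint32_t n = 0;

    for (i = 0; i < nblocks; i++) {
        if (blocks[i].fd < 0) {
            continue;       /* not file-backed, cannot be shared */
        }
        if (n == EX_MAX_NREGIONS) {
            return -1;      /* assumption: fail instead of overflowing */
        }
        table->regions[n].guest_phys_addr = blocks[i].guest_addr;
        table->regions[n].memory_size = blocks[i].size;
        table->regions[n].userspace_addr = blocks[i].host_addr;
        fds[n++] = blocks[i].fd;
    }

    if (n == 0) {
        return -1;          /* nothing to share, e.g. no -mem-path */
    }

    table->nregions = n;
    table->padding = 0;
    /* Only the used regions are sent, not the whole array. */
    return (int)(sizeof(table->nregions) + sizeof(table->padding)
                 + n * sizeof(struct ex_region));
}
```

The descriptors collected in fds are the ones that accompany the message,
so that the remote process can mmap the same guest memory.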

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
---
 hw/virtio/Makefile.objs   |   2 +-
 hw/virtio/vhost-backend.c |   5 +
 hw/virtio/vhost-user.c    | 331 ++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 337 insertions(+), 1 deletion(-)
 create mode 100644 hw/virtio/vhost-user.c

diff --git a/hw/virtio/Makefile.objs b/hw/virtio/Makefile.objs
index 51e5bdb..ec9e855 100644
--- a/hw/virtio/Makefile.objs
+++ b/hw/virtio/Makefile.objs
@@ -5,4 +5,4 @@ common-obj-y += virtio-mmio.o
 common-obj-$(CONFIG_VIRTIO_BLK_DATA_PLANE) += dataplane/
 
 obj-y += virtio.o virtio-balloon.o 
-obj-$(CONFIG_LINUX) += vhost.o vhost-backend.o
+obj-$(CONFIG_LINUX) += vhost.o vhost-backend.o vhost-user.o
diff --git a/hw/virtio/vhost-backend.c b/hw/virtio/vhost-backend.c
index 509e103..35316c4 100644
--- a/hw/virtio/vhost-backend.c
+++ b/hw/virtio/vhost-backend.c
@@ -14,6 +14,8 @@
 
 #include <sys/ioctl.h>
 
+extern const VhostOps user_ops;
+
 static int vhost_kernel_call(struct vhost_dev *dev, unsigned long int request,
                              void *arg)
 {
@@ -57,6 +59,9 @@ int vhost_set_backend_type(struct vhost_dev *dev, VhostBackendType backend_type)
     case VHOST_BACKEND_TYPE_KERNEL:
         dev->vhost_ops = &kernel_ops;
         break;
+    case VHOST_BACKEND_TYPE_USER:
+        dev->vhost_ops = &user_ops;
+        break;
     default:
         error_report("Unknown vhost backend type\n");
         r = -1;
diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
new file mode 100644
index 0000000..1483647
--- /dev/null
+++ b/hw/virtio/vhost-user.c
@@ -0,0 +1,331 @@
+/*
+ * vhost-user
+ *
+ * Copyright (c) 2013 Virtual Open Systems Sarl.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#include "hw/virtio/vhost.h"
+#include "hw/virtio/vhost-backend.h"
+#include "sysemu/char.h"
+#include "qemu/error-report.h"
+#include "qemu/sockets.h"
+
+#include <fcntl.h>
+#include <unistd.h>
+#include <sys/ioctl.h>
+#include <sys/socket.h>
+#include <sys/un.h>
+#include <linux/vhost.h>
+
+#define VHOST_MEMORY_MAX_NREGIONS    8
+
+typedef enum VhostUserRequest {
+    VHOST_USER_NONE = 0,
+    VHOST_USER_GET_FEATURES = 1,
+    VHOST_USER_SET_FEATURES = 2,
+    VHOST_USER_SET_OWNER = 3,
+    VHOST_USER_RESET_OWNER = 4,
+    VHOST_USER_SET_MEM_TABLE = 5,
+    VHOST_USER_SET_LOG_BASE = 6,
+    VHOST_USER_SET_LOG_FD = 7,
+    VHOST_USER_SET_VRING_NUM = 8,
+    VHOST_USER_SET_VRING_ADDR = 9,
+    VHOST_USER_SET_VRING_BASE = 10,
+    VHOST_USER_GET_VRING_BASE = 11,
+    VHOST_USER_SET_VRING_KICK = 12,
+    VHOST_USER_SET_VRING_CALL = 13,
+    VHOST_USER_SET_VRING_ERR = 14,
+    VHOST_USER_MAX
+} VhostUserRequest;
+
+typedef struct VhostUserMemoryRegion {
+    uint64_t guest_phys_addr;
+    uint64_t memory_size;
+    uint64_t userspace_addr;
+} VhostUserMemoryRegion;
+
+typedef struct VhostUserMemory {
+    uint32_t nregions;
+    uint32_t padding;
+    VhostUserMemoryRegion regions[VHOST_MEMORY_MAX_NREGIONS];
+} VhostUserMemory;
+
+typedef struct VhostUserMsg {
+    VhostUserRequest request;
+
+#define VHOST_USER_VERSION_MASK     (0x3)
+#define VHOST_USER_REPLY_MASK       (0x1<<2)
+    uint32_t flags;
+    uint32_t size; /* the following payload size */
+    union {
+        uint64_t u64;
+        struct vhost_vring_state state;
+        struct vhost_vring_addr addr;
+        VhostUserMemory memory;
+    };
+} QEMU_PACKED VhostUserMsg;
+
+static VhostUserMsg m __attribute__ ((unused));
+#define VHOST_USER_HDR_SIZE (sizeof(m.request) \
+                            + sizeof(m.flags) \
+                            + sizeof(m.size))
+
+#define VHOST_USER_PAYLOAD_SIZE (sizeof(m) - VHOST_USER_HDR_SIZE)
+
+/* The version of the protocol we support */
+#define VHOST_USER_VERSION    (0x1)
+
+static unsigned long int ioctl_to_vhost_user_request[VHOST_USER_MAX] = {
+    -1,                     /* VHOST_USER_NONE */
+    VHOST_GET_FEATURES,     /* VHOST_USER_GET_FEATURES */
+    VHOST_SET_FEATURES,     /* VHOST_USER_SET_FEATURES */
+    VHOST_SET_OWNER,        /* VHOST_USER_SET_OWNER */
+    VHOST_RESET_OWNER,      /* VHOST_USER_RESET_OWNER */
+    VHOST_SET_MEM_TABLE,    /* VHOST_USER_SET_MEM_TABLE */
+    VHOST_SET_LOG_BASE,     /* VHOST_USER_SET_LOG_BASE */
+    VHOST_SET_LOG_FD,       /* VHOST_USER_SET_LOG_FD */
+    VHOST_SET_VRING_NUM,    /* VHOST_USER_SET_VRING_NUM */
+    VHOST_SET_VRING_ADDR,   /* VHOST_USER_SET_VRING_ADDR */
+    VHOST_SET_VRING_BASE,   /* VHOST_USER_SET_VRING_BASE */
+    VHOST_GET_VRING_BASE,   /* VHOST_USER_GET_VRING_BASE */
+    VHOST_SET_VRING_KICK,   /* VHOST_USER_SET_VRING_KICK */
+    VHOST_SET_VRING_CALL,   /* VHOST_USER_SET_VRING_CALL */
+    VHOST_SET_VRING_ERR     /* VHOST_USER_SET_VRING_ERR */
+};
+
+static VhostUserRequest vhost_user_request_translate(unsigned long int request)
+{
+    VhostUserRequest idx;
+
+    for (idx = 0; idx < VHOST_USER_MAX; idx++) {
+        if (ioctl_to_vhost_user_request[idx] == request) {
+            break;
+        }
+    }
+
+    return (idx == VHOST_USER_MAX) ? VHOST_USER_NONE : idx;
+}
+
+static int vhost_user_read(struct vhost_dev *dev, VhostUserMsg *msg)
+{
+    CharDriverState *chr = dev->opaque;
+    uint8_t *p = (uint8_t *) msg;
+    int r, size = VHOST_USER_HDR_SIZE;
+
+    r = qemu_chr_fe_read_all(chr, p, size);
+    if (r != size) {
+        error_report("Failed to read msg header. Read %d instead of %d.\n", r,
+                size);
+        goto fail;
+    }
+
+    /* validate received flags */
+    if (msg->flags != (VHOST_USER_REPLY_MASK | VHOST_USER_VERSION)) {
+        error_report("Failed to read msg header."
+                " Flags 0x%x instead of 0x%x.\n", msg->flags,
+                VHOST_USER_REPLY_MASK | VHOST_USER_VERSION);
+        goto fail;
+    }
+
+    /* validate message size is sane */
+    if (msg->size > VHOST_USER_PAYLOAD_SIZE) {
+        error_report("Failed to read msg header."
+                " Size %d exceeds the maximum %zu.\n", msg->size,
+                VHOST_USER_PAYLOAD_SIZE);
+        goto fail;
+    }
+
+    if (msg->size) {
+        p += VHOST_USER_HDR_SIZE;
+        size = msg->size;
+        r = qemu_chr_fe_read_all(chr, p, size);
+        if (r != size) {
+            error_report("Failed to read msg payload."
+                         " Read %d instead of %d.\n", r, msg->size);
+            goto fail;
+        }
+    }
+
+    return 0;
+
+fail:
+    return -1;
+}
+
+static int vhost_user_write(struct vhost_dev *dev, VhostUserMsg *msg,
+                            int *fds, int fd_num)
+{
+    CharDriverState *chr = dev->opaque;
+    int size = VHOST_USER_HDR_SIZE + msg->size;
+
+    if (fd_num) {
+        qemu_chr_fe_set_msgfds(chr, fds, fd_num);
+    }
+
+    return qemu_chr_fe_write_all(chr, (const uint8_t *) msg, size) == size ?
+            0 : -1;
+}
+
+static int vhost_user_call(struct vhost_dev *dev, unsigned long int request,
+        void *arg)
+{
+    VhostUserMsg msg;
+    VhostUserRequest msg_request;
+    RAMBlock *block = 0;
+    struct vhost_vring_file *file = 0;
+    int need_reply = 0;
+    int fds[VHOST_MEMORY_MAX_NREGIONS];
+    size_t fd_num = 0;
+
+    assert(dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_USER);
+
+    msg_request = vhost_user_request_translate(request);
+    msg.request = msg_request;
+    msg.flags = VHOST_USER_VERSION;
+    msg.size = 0;
+
+    switch (request) {
+    case VHOST_GET_FEATURES:
+        need_reply = 1;
+        break;
+
+    case VHOST_SET_FEATURES:
+    case VHOST_SET_LOG_BASE:
+        msg.u64 = *((__u64 *) arg);
+        msg.size = sizeof(m.u64);
+        break;
+
+    case VHOST_SET_OWNER:
+    case VHOST_RESET_OWNER:
+        break;
+
+    case VHOST_SET_MEM_TABLE:
+        QTAILQ_FOREACH(block, &ram_list.blocks, next)
+        {
+            if (block->fd > 0) {
+                msg.memory.regions[fd_num].userspace_addr = (__u64) block->host;
+                msg.memory.regions[fd_num].memory_size = block->length;
+                msg.memory.regions[fd_num].guest_phys_addr = block->offset;
+                fds[fd_num++] = block->fd;
+            }
+        }
+
+        msg.memory.nregions = fd_num;
+
+        if (!fd_num) {
+            error_report("Failed initializing vhost-user memory map\n"
+                    "consider using -mem-path option\n");
+            return -1;
+        }
+
+        msg.size = sizeof(m.memory.nregions);
+        msg.size += sizeof(m.memory.padding);
+        msg.size += fd_num * sizeof(VhostUserMemoryRegion);
+
+        break;
+
+    case VHOST_SET_LOG_FD:
+        fds[fd_num++] = *((int *) arg);
+        break;
+
+    case VHOST_SET_VRING_NUM:
+    case VHOST_SET_VRING_BASE:
+        memcpy(&msg.state, arg, sizeof(struct vhost_vring_state));
+        msg.size = sizeof(m.state);
+        break;
+
+    case VHOST_GET_VRING_BASE:
+        memcpy(&msg.state, arg, sizeof(struct vhost_vring_state));
+        msg.size = sizeof(m.state);
+        need_reply = 1;
+        break;
+
+    case VHOST_SET_VRING_ADDR:
+        memcpy(&msg.addr, arg, sizeof(struct vhost_vring_addr));
+        msg.size = sizeof(m.addr);
+        break;
+
+    case VHOST_SET_VRING_KICK:
+    case VHOST_SET_VRING_CALL:
+    case VHOST_SET_VRING_ERR:
+        file = arg;
+        msg.u64 = file->index;
+        msg.size = sizeof(m.u64);
+        if (file->fd > 0) {
+            fds[fd_num++] = file->fd;
+        }
+        break;
+    default:
+        error_report("vhost-user trying to send unhandled ioctl\n");
+        return -1;
+        break;
+    }
+
+    if (vhost_user_write(dev, &msg, fds, fd_num) < 0) {
+        return 0;
+    }
+
+    if (need_reply) {
+        if (vhost_user_read(dev, &msg) < 0) {
+            return 0;
+        }
+
+        if (msg_request != msg.request) {
+            error_report("Received unexpected msg type."
+                    " Expected %d received %d\n", msg_request, msg.request);
+            return -1;
+        }
+
+        switch (msg_request) {
+        case VHOST_USER_GET_FEATURES:
+            if (msg.size != sizeof(m.u64)) {
+                error_report("Received bad msg size.\n");
+                return -1;
+            }
+            *((__u64 *) arg) = msg.u64;
+            break;
+        case VHOST_USER_GET_VRING_BASE:
+            if (msg.size != sizeof(m.state)) {
+                error_report("Received bad msg size.\n");
+                return -1;
+            }
+            memcpy(arg, &msg.state, sizeof(struct vhost_vring_state));
+            break;
+        default:
+            error_report("Received unexpected msg type.\n");
+            return -1;
+            break;
+        }
+    }
+
+    return 0;
+}
+
+static int vhost_user_init(struct vhost_dev *dev, void *opaque)
+{
+    assert(dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_USER);
+
+    dev->opaque = opaque;
+
+    return 0;
+}
+
+static int vhost_user_cleanup(struct vhost_dev *dev)
+{
+    assert(dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_USER);
+
+    dev->opaque = 0;
+
+    return 0;
+}
+
+const VhostOps user_ops = {
+        .backend_type = VHOST_BACKEND_TYPE_USER,
+        .vhost_call = vhost_user_call,
+        .vhost_backend_init = vhost_user_init,
+        .vhost_backend_cleanup = vhost_user_cleanup
+        };
-- 
1.8.3.2


* [Qemu-devel] [PATCH v7 11/13] Add new vhost-user netdev backend
  2014-01-31 17:34 [Qemu-devel] [PATCH v7 00/13] Vhost and vhost-net support for userspace based backends Antonios Motakis
                   ` (9 preceding siblings ...)
  2014-01-31 17:34 ` [Qemu-devel] [PATCH v7 10/13] Add vhost-user as a vhost backend Antonios Motakis
@ 2014-01-31 17:34 ` Antonios Motakis
  2014-02-10  8:42   ` Michael S. Tsirkin
  2014-01-31 17:34 ` [Qemu-devel] [PATCH v7 12/13] Add the vhost-user netdev backend to command line Antonios Motakis
                   ` (2 subsequent siblings)
  13 siblings, 1 reply; 20+ messages in thread
From: Antonios Motakis @ 2014-01-31 17:34 UTC (permalink / raw)
  To: qemu-devel, snabb-devel
  Cc: Stefan Hajnoczi, mst, n.nikolaev, Anthony Liguori, lukego,
	Antonios Motakis, tech

Add a new QEMU netdev backend that is intended to invoke vhost_net with the
vhost-user backend.

At runtime the netdev detects whether the vhost backend is up or down.
Upon disconnection it sets link_down accordingly and notifies virtio-net,
so the virtio-net interface goes down.
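
The intended behaviour can be modelled with a small state machine. All
types and names below are hypothetical simplifications for illustration,
not QEMU's NetClientState or chardev event API. As in this patch, the
sketch only marks the link down on close; it does not bring the link back
up on open.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum ex_chr_event { EX_EVENT_OPENED, EX_EVENT_CLOSED };

/* Hypothetical, much reduced model of a net client. */
struct ex_netdev {
    bool backend_running;
    bool link_down;
    struct ex_netdev *peer;     /* e.g. the virtio-net NIC */
    void (*link_status_changed)(struct ex_netdev *nc);
};

/* Example callback for the peer: counts notifications. */
static int ex_notify_count;
static void ex_count_notify(struct ex_netdev *nc)
{
    (void)nc;
    ex_notify_count++;
}

static void ex_handle_event(struct ex_netdev *nc, enum ex_chr_event ev)
{
    switch (ev) {
    case EX_EVENT_OPENED:
        /* Socket connected: start the vhost backend. */
        nc->backend_running = true;
        break;
    case EX_EVENT_CLOSED:
        /* Socket lost: mark both ends down and notify them. */
        nc->link_down = true;
        if (nc->peer) {
            nc->peer->link_down = true;
        }
        if (nc->link_status_changed) {
            nc->link_status_changed(nc);
        }
        if (nc->peer && nc->peer->link_status_changed) {
            nc->peer->link_status_changed(nc->peer);
        }
        nc->backend_running = false;
        break;
    }
}
```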

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
---
 include/net/vhost-user.h |  17 +++++++
 net/Makefile.objs        |   2 +-
 net/clients.h            |   3 ++
 net/vhost-user.c         | 130 +++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 151 insertions(+), 1 deletion(-)
 create mode 100644 include/net/vhost-user.h
 create mode 100644 net/vhost-user.c

diff --git a/include/net/vhost-user.h b/include/net/vhost-user.h
new file mode 100644
index 0000000..85109f6
--- /dev/null
+++ b/include/net/vhost-user.h
@@ -0,0 +1,17 @@
+/*
+ * vhost-user.h
+ *
+ * Copyright (c) 2013 Virtual Open Systems Sarl.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#ifndef VHOST_USER_H_
+#define VHOST_USER_H_
+
+struct vhost_net;
+struct vhost_net *vhost_user_get_vhost_net(NetClientState *nc);
+
+#endif /* VHOST_USER_H_ */
diff --git a/net/Makefile.objs b/net/Makefile.objs
index c25fe69..301f6b6 100644
--- a/net/Makefile.objs
+++ b/net/Makefile.objs
@@ -2,7 +2,7 @@ common-obj-y = net.o queue.o checksum.o util.o hub.o
 common-obj-y += socket.o
 common-obj-y += dump.o
 common-obj-y += eth.o
-common-obj-$(CONFIG_POSIX) += tap.o
+common-obj-$(CONFIG_POSIX) += tap.o vhost-user.o
 common-obj-$(CONFIG_LINUX) += tap-linux.o
 common-obj-$(CONFIG_WIN32) += tap-win32.o
 common-obj-$(CONFIG_BSD) += tap-bsd.o
diff --git a/net/clients.h b/net/clients.h
index 7322ff5..7f3d4ae 100644
--- a/net/clients.h
+++ b/net/clients.h
@@ -57,4 +57,7 @@ int net_init_netmap(const NetClientOptions *opts, const char *name,
                     NetClientState *peer);
 #endif
 
+int net_init_vhost_user(const NetClientOptions *opts, const char *name,
+                        NetClientState *peer);
+
 #endif /* QEMU_NET_CLIENTS_H */
diff --git a/net/vhost-user.c b/net/vhost-user.c
new file mode 100644
index 0000000..b25722c
--- /dev/null
+++ b/net/vhost-user.c
@@ -0,0 +1,130 @@
+/*
+ * vhost-user.c
+ *
+ * Copyright (c) 2013 Virtual Open Systems Sarl.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#include "clients.h"
+#include "net/vhost_net.h"
+#include "net/vhost-user.h"
+#include "sysemu/char.h"
+#include "qemu/error-report.h"
+
+typedef struct VhostUserState {
+    NetClientState nc;
+    CharDriverState *chr;
+    VHostNetState *vhost_net;
+} VhostUserState;
+
+VHostNetState *vhost_user_get_vhost_net(NetClientState *nc)
+{
+    VhostUserState *s = DO_UPCAST(VhostUserState, nc, nc);
+    return s->vhost_net;
+}
+
+static int vhost_user_running(VhostUserState *s)
+{
+    return (s->vhost_net) ? 1 : 0;
+}
+
+static int vhost_user_start(VhostUserState *s)
+{
+    VhostNetOptions options;
+
+    if (vhost_user_running(s)) {
+        return 0;
+    }
+
+    options.backend_type = VHOST_BACKEND_TYPE_USER;
+    options.net_backend = &s->nc;
+    options.opaque = s->chr;
+    options.force = 1;
+
+    s->vhost_net = vhost_net_init(&options);
+
+    return vhost_user_running(s) ? 0 : -1;
+}
+
+static void vhost_user_stop(VhostUserState *s)
+{
+    if (vhost_user_running(s)) {
+        vhost_net_cleanup(s->vhost_net);
+    }
+
+    s->vhost_net = 0;
+}
+
+static void vhost_user_cleanup(NetClientState *nc)
+{
+    VhostUserState *s = DO_UPCAST(VhostUserState, nc, nc);
+
+    vhost_user_stop(s);
+    qemu_purge_queued_packets(nc);
+}
+
+static NetClientInfo net_vhost_user_info = {
+        .type = 0,
+        .size = sizeof(VhostUserState),
+        .cleanup = vhost_user_cleanup,
+};
+
+static void net_vhost_user_event(void *opaque, int event)
+{
+    VhostUserState *s = opaque;
+
+    switch (event) {
+    case CHR_EVENT_OPENED:
+        vhost_user_start(s);
+        break;
+    case CHR_EVENT_CLOSED:
+        s->nc.link_down = 1;
+
+        if (s->nc.peer) {
+            s->nc.peer->link_down = 1;
+        }
+
+        if (s->nc.info->link_status_changed) {
+            s->nc.info->link_status_changed(&s->nc);
+        }
+
+        if (s->nc.peer && s->nc.peer->info->link_status_changed) {
+            s->nc.peer->info->link_status_changed(s->nc.peer);
+        }
+
+        vhost_user_stop(s);
+        error_report("chardev \"%s\" went down\n", s->chr->label);
+        break;
+    }
+}
+
+static int net_vhost_user_init(NetClientState *peer, const char *device,
+                               const char *name, CharDriverState *chr)
+{
+    NetClientState *nc;
+    VhostUserState *s;
+
+    nc = qemu_new_net_client(&net_vhost_user_info, peer, device, name);
+
+    snprintf(nc->info_str, sizeof(nc->info_str), "vhost-user to %s",
+             chr->label);
+
+    s = DO_UPCAST(VhostUserState, nc, nc);
+
+    /* We don't provide a receive callback */
+    s->nc.receive_disabled = 1;
+    s->chr = chr;
+
+    qemu_chr_add_handlers(s->chr, NULL, NULL, net_vhost_user_event, s);
+
+    return 0;
+}
+
+int net_init_vhost_user(const NetClientOptions *opts, const char *name,
+                   NetClientState *peer)
+{
+    return net_vhost_user_init(peer, "vhost_user", 0, 0);
+}
-- 
1.8.3.2


* [Qemu-devel] [PATCH v7 12/13] Add the vhost-user netdev backend to command line
  2014-01-31 17:34 [Qemu-devel] [PATCH v7 00/13] Vhost and vhost-net support for userspace based backends Antonios Motakis
                   ` (10 preceding siblings ...)
  2014-01-31 17:34 ` [Qemu-devel] [PATCH v7 11/13] Add new vhost-user netdev backend Antonios Motakis
@ 2014-01-31 17:34 ` Antonios Motakis
  2014-02-10  8:49   ` Michael S. Tsirkin
  2014-02-10 16:43   ` Eric Blake
  2014-01-31 17:34 ` [Qemu-devel] [PATCH v7 13/13] Add vhost-user protocol documentation Antonios Motakis
  2014-02-10  8:57 ` [Qemu-devel] [PATCH v7 00/13] Vhost and vhost-net support for userspace based backends Michael S. Tsirkin
  13 siblings, 2 replies; 20+ messages in thread
From: Antonios Motakis @ 2014-01-31 17:34 UTC (permalink / raw)
  To: qemu-devel, snabb-devel
  Cc: Stefan Hajnoczi, mst, Michael Tokarev, Markus Armbruster,
	n.nikolaev, Luiz Capitulino, Anthony Liguori, Paolo Bonzini,
	lukego, Antonios Motakis, tech

The chardev referenced by the supplied id is inspected for supported
options. Only a socket backend with a path set (i.e. a unix socket), and
optionally the server parameter, is allowed. Other options (nowait,
telnet) make the chardev unusable, and the netdev will not be initialised.
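
A minimal sketch of such an inspection is shown below, assuming the
options arrive as simple name/value pairs. The ex_ types are hypothetical
and the props struct is only modelled on the VhostUserChardevProps added
by this patch. Treating every option other than backend, path and server
as unsupported is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Modelled on VhostUserChardevProps; example-only. */
struct ex_chardev_props {
    bool is_socket;
    bool is_unix;
    bool is_server;
    bool has_unsupported;
};

/* Hypothetical name/value pair standing in for one chardev option. */
struct ex_opt {
    const char *name;
    const char *value;
};

/* Record what a single option tells us about the chardev. */
static void ex_inspect_opt(struct ex_chardev_props *p,
                           const char *name, const char *value)
{
    if (strcmp(name, "backend") == 0) {
        if (strcmp(value, "socket") == 0) {
            p->is_socket = true;
        }
    } else if (strcmp(name, "path") == 0) {
        p->is_unix = true;          /* a path implies a unix socket */
    } else if (strcmp(name, "server") == 0) {
        p->is_server = true;        /* value is ignored in this sketch */
    } else {
        p->has_unsupported = true;  /* e.g. nowait, telnet */
    }
}

/* Usable only for a unix socket backend without unsupported options. */
static bool ex_chardev_usable(const struct ex_opt *opts, size_t n,
                              struct ex_chardev_props *props)
{
    size_t i;

    memset(props, 0, sizeof(*props));
    for (i = 0; i < n; i++) {
        ex_inspect_opt(props, opts[i].name, opts[i].value);
    }
    return props->is_socket && props->is_unix && !props->has_unsupported;
}
```

With this shape, a chardev described as backend=socket,path=...,server
passes, while adding nowait or telnet to the same description makes it
fail the check.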

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
---
 hmp-commands.hx     |  4 +--
 hw/net/vhost_net.c  |  4 +++
 hw/net/virtio-net.c |  3 ++
 net/hub.c           |  1 +
 net/net.c           |  2 ++
 net/vhost-user.c    | 91 +++++++++++++++++++++++++++++++++++++++++++++++++++--
 qapi-schema.json    | 18 ++++++++++-
 qemu-options.hx     | 16 ++++++++++
 8 files changed, 134 insertions(+), 5 deletions(-)

diff --git a/hmp-commands.hx b/hmp-commands.hx
index f3fc514..68128c1 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -1195,7 +1195,7 @@ ETEXI
     {
         .name       = "host_net_add",
         .args_type  = "device:s,opts:s?",
-        .params     = "tap|user|socket|vde|netmap|dump [options]",
+        .params     = "tap|user|socket|vde|netmap|vhost-user|dump [options]",
         .help       = "add host VLAN client",
         .mhandler.cmd = net_host_device_add,
     },
@@ -1223,7 +1223,7 @@ ETEXI
     {
         .name       = "netdev_add",
         .args_type  = "netdev:O",
-        .params     = "[user|tap|socket|hubport|netmap],id=str[,prop=value][,...]",
+        .params     = "[user|tap|socket|hubport|netmap|vhost-user],id=str[,prop=value][,...]",
         .help       = "add host network device",
         .mhandler.cmd = hmp_netdev_add,
     },
diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
index 6b6268b..e630407 100644
--- a/hw/net/vhost_net.c
+++ b/hw/net/vhost_net.c
@@ -15,6 +15,7 @@
 
 #include "net/net.h"
 #include "net/tap.h"
+#include "net/vhost-user.h"
 
 #include "hw/virtio/virtio-net.h"
 #include "net/vhost_net.h"
@@ -322,6 +323,9 @@ VHostNetState *get_vhost_net(NetClientState *nc)
     case NET_CLIENT_OPTIONS_KIND_TAP:
         vhost_net = tap_get_vhost_net(nc);
         break;
+    case NET_CLIENT_OPTIONS_KIND_VHOST_USER:
+        vhost_net = vhost_user_get_vhost_net(nc);
+        break;
     default:
         break;
     }
diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 72acd15..d49ee82 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -322,6 +322,9 @@ static void peer_test_vnet_hdr(VirtIONet *n)
     case NET_CLIENT_OPTIONS_KIND_TAP:
         n->has_vnet_hdr = tap_has_vnet_hdr(nc->peer);
         break;
+    case NET_CLIENT_OPTIONS_KIND_VHOST_USER:
+        n->has_vnet_hdr = 0;
+        break;
     default:
         break;
     }
diff --git a/net/hub.c b/net/hub.c
index 33a99c9..7e0f2d6 100644
--- a/net/hub.c
+++ b/net/hub.c
@@ -322,6 +322,7 @@ void net_hub_check_clients(void)
             case NET_CLIENT_OPTIONS_KIND_TAP:
             case NET_CLIENT_OPTIONS_KIND_SOCKET:
             case NET_CLIENT_OPTIONS_KIND_VDE:
+            case NET_CLIENT_OPTIONS_KIND_VHOST_USER:
                 has_host_dev = 1;
                 break;
             default:
diff --git a/net/net.c b/net/net.c
index 2c3af20..30f1273 100644
--- a/net/net.c
+++ b/net/net.c
@@ -731,6 +731,7 @@ static int (* const net_client_init_fun[NET_CLIENT_OPTIONS_KIND_MAX])(
         [NET_CLIENT_OPTIONS_KIND_BRIDGE]    = net_init_bridge,
 #endif
         [NET_CLIENT_OPTIONS_KIND_HUBPORT]   = net_init_hubport,
+        [NET_CLIENT_OPTIONS_KIND_VHOST_USER] = net_init_vhost_user,
 };
 
 
@@ -764,6 +765,7 @@ static int net_client_init1(const void *object, int is_netdev, Error **errp)
         case NET_CLIENT_OPTIONS_KIND_BRIDGE:
 #endif
         case NET_CLIENT_OPTIONS_KIND_HUBPORT:
+        case NET_CLIENT_OPTIONS_KIND_VHOST_USER:
             break;
 
         default:
diff --git a/net/vhost-user.c b/net/vhost-user.c
index b25722c..f5bd211 100644
--- a/net/vhost-user.c
+++ b/net/vhost-user.c
@@ -12,6 +12,7 @@
 #include "net/vhost_net.h"
 #include "net/vhost-user.h"
 #include "sysemu/char.h"
+#include "qemu/config-file.h"
 #include "qemu/error-report.h"
 
 typedef struct VhostUserState {
@@ -20,9 +21,17 @@ typedef struct VhostUserState {
     VHostNetState *vhost_net;
 } VhostUserState;
 
+typedef struct VhostUserChardevProps {
+    bool is_socket;
+    bool is_unix;
+    bool is_server;
+    bool has_unsupported;
+} VhostUserChardevProps;
+
 VHostNetState *vhost_user_get_vhost_net(NetClientState *nc)
 {
     VhostUserState *s = DO_UPCAST(VhostUserState, nc, nc);
+    assert(nc->info->type == NET_CLIENT_OPTIONS_KIND_VHOST_USER);
     return s->vhost_net;
 }
 
@@ -67,7 +76,7 @@ static void vhost_user_cleanup(NetClientState *nc)
 }
 
 static NetClientInfo net_vhost_user_info = {
-        .type = 0,
+        .type = NET_CLIENT_OPTIONS_KIND_VHOST_USER,
         .size = sizeof(VhostUserState),
         .cleanup = vhost_user_cleanup,
 };
@@ -123,8 +132,86 @@ static int net_vhost_user_init(NetClientState *peer, const char *device,
     return 0;
 }
 
+static int net_vhost_chardev_opts(const char *name, const char *value,
+        void *opaque)
+{
+    VhostUserChardevProps *props = opaque;
+
+    if (strcmp(name, "backend") == 0 && strcmp(value, "socket") == 0) {
+        props->is_socket = 1;
+    } else if (strcmp(name, "path") == 0) {
+        props->is_unix = 1;
+    } else if (strcmp(name, "server") == 0) {
+        props->is_server = 1;
+    } else {
+        error_report("vhost-user does not support a chardev"
+                     " with the following option:\n %s = %s",
+                     name, value);
+        props->has_unsupported = 1;
+        return -1;
+    }
+    return 0;
+}
+
+static CharDriverState *net_vhost_parse_chardev(
+        const NetdevVhostUserOptions *opts)
+{
+    CharDriverState *chr = qemu_chr_find(opts->chardev);
+    VhostUserChardevProps props;
+
+    if (chr == NULL) {
+        error_report("chardev \"%s\" not found\n", opts->chardev);
+        return 0;
+    }
+
+    /* inspect chardev opts */
+    memset(&props, 0, sizeof(props));
+    qemu_opt_foreach(chr->opts, net_vhost_chardev_opts, &props, false);
+
+    if (!props.is_socket || !props.is_unix) {
+        error_report("chardev \"%s\" is not a unix socket\n",
+                     opts->chardev);
+        return 0;
+    }
+
+    if (props.has_unsupported) {
+        error_report("chardev \"%s\" has an unsupported option\n",
+                opts->chardev);
+        return 0;
+    }
+
+    qemu_chr_fe_claim_no_fail(chr);
+
+    return chr;
+}
+
 int net_init_vhost_user(const NetClientOptions *opts, const char *name,
                    NetClientState *peer)
 {
-    return net_vhost_user_init(peer, "vhost_user", 0, 0);
+    const NetdevVhostUserOptions *vhost_user_opts;
+    CharDriverState *chr;
+    QemuOpts *mem_opts;
+    unsigned int mem_share = 0;
+
+    assert(opts->kind == NET_CLIENT_OPTIONS_KIND_VHOST_USER);
+    vhost_user_opts = opts->vhost_user;
+
+    chr = net_vhost_parse_chardev(vhost_user_opts);
+    if (!chr) {
+        error_report("No suitable chardev found\n");
+        return -1;
+    }
+
+    /* verify mem-path is set and shared */
+    mem_opts = qemu_opts_find(qemu_find_opts("mem-path"), NULL);
+    if (mem_opts) {
+        mem_share = qemu_opt_get_bool(mem_opts, "share", 0);
+    }
+
+    if (!mem_share) {
+        error_report("vhost-user requires -mem-path /path,share=on");
+        return -1;
+    }
+
+    return net_vhost_user_init(peer, "vhost_user", name, chr);
 }
diff --git a/qapi-schema.json b/qapi-schema.json
index 05ced9d..51609a4 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -3104,6 +3104,21 @@
     '*devname':    'str' } }
 
 ##
+# @NetdevVhostUserOptions
+#
+# Vhost-user network backend
+#
+# @chardev: name of a unix socket chardev
+#
+# Since 2.0
+##
+{ 'type': 'NetdevVhostUserOptions',
+  'data': {
+    'chardev': 'str' } }
+
+##
+
+##
 # @NetClientOptions
 #
 # A discriminated record of network device traits.
@@ -3121,7 +3136,8 @@
     'dump':     'NetdevDumpOptions',
     'bridge':   'NetdevBridgeOptions',
     'hubport':  'NetdevHubPortOptions',
-    'netmap':   'NetdevNetmapOptions' } }
+    'netmap':   'NetdevNetmapOptions',
+    'vhost-user': 'NetdevVhostUserOptions' } }
 
 ##
 # @NetLegacy
diff --git a/qemu-options.hx b/qemu-options.hx
index 60ecc95..2c59164 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -1435,6 +1435,7 @@ DEF("netdev", HAS_ARG, QEMU_OPTION_netdev,
 #ifdef CONFIG_NETMAP
     "netmap|"
 #endif
+    "vhost-user|"
     "socket|"
     "hubport],id=str[,option][,option][,...]\n", QEMU_ARCH_ALL)
 STEXI
@@ -1766,6 +1767,21 @@ The hubport netdev lets you connect a NIC to a QEMU "vlan" instead of a single
 netdev.  @code{-net} and @code{-device} with parameter @option{vlan} create the
 required hub automatically.
 
+@item -netdev vhost-user,chardev=@var{id}
+
+Establish a vhost-user netdev, backed by a chardev @var{id}. The chardev should
+be a unix domain socket backed one. The vhost-user uses a specifically defined
+protocol to pass vhost ioctl replacement messages to an application on the other
+end of the socket.
+
+Example:
+@example
+qemu -m 1024 -mem-path /hugetlbfs,prealloc=on,share=on \
+     -chardev socket,id=chr0,path=/path/to/socket \
+     -netdev type=vhost-user,id=net0,chardev=chr0 \
+     -device virtio-net-pci,netdev=net0
+@end example
+
 @item -net dump[,vlan=@var{n}][,file=@var{file}][,len=@var{len}]
 Dump network traffic on VLAN @var{n} to file @var{file} (@file{qemu-vlan0.pcap} by default).
 At most @var{len} bytes (64k by default) per packet are stored. The file format is
-- 
1.8.3.2


* [Qemu-devel] [PATCH v7 13/13] Add vhost-user protocol documentation
  2014-01-31 17:34 [Qemu-devel] [PATCH v7 00/13] Vhost and vhost-net support for userspace based backends Antonios Motakis
                   ` (11 preceding siblings ...)
  2014-01-31 17:34 ` [Qemu-devel] [PATCH v7 12/13] Add the vhost-user netdev backend to command line Antonios Motakis
@ 2014-01-31 17:34 ` Antonios Motakis
  2014-02-10  8:57 ` [Qemu-devel] [PATCH v7 00/13] Vhost and vhost-net support for userspace based backends Michael S. Tsirkin
  13 siblings, 0 replies; 20+ messages in thread
From: Antonios Motakis @ 2014-01-31 17:34 UTC (permalink / raw)
  To: qemu-devel, snabb-devel; +Cc: lukego, Antonios Motakis, tech, n.nikolaev, mst

This document describes the basic message format used by vhost-user
for communication over a unix domain socket. The protocol is based
on the existing ioctl interface used for the kernel version of vhost.

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
---
 docs/specs/vhost-user.txt | 249 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 249 insertions(+)
 create mode 100644 docs/specs/vhost-user.txt
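
Note for readers of the archive: the message layout specified in the file below
(three 32-bit header fields in native byte order, with the version in the low
two flag bits and the reply flag in bit 2) can be sketched in a few lines of C.
This is only an illustration under stated assumptions; the SKETCH_* macros, the
SketchHdr type and both helper functions are hypothetical names and are not
part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Flag layout as described in the spec text: bits 0-1 carry the
 * version, bit 2 marks a reply. The macro names are hypothetical. */
#define SKETCH_VERSION_MASK 0x3u
#define SKETCH_REPLY_MASK   (0x1u << 2)
#define SKETCH_VERSION      0x1u

/* Hypothetical header-only view of a message: three 32-bit fields,
 * all in the machine's native byte order. */
typedef struct {
    uint32_t request; /* request type, e.g. 2 for GET_FEATURES */
    uint32_t flags;   /* version bits plus reply flag */
    uint32_t size;    /* number of payload bytes after the header */
} SketchHdr;

/* Build the header of a slave reply that carries a u64 payload. */
static SketchHdr sketch_make_u64_reply(uint32_t request)
{
    SketchHdr h;

    h.request = request;
    h.flags = SKETCH_VERSION | SKETCH_REPLY_MASK;
    h.size = (uint32_t)sizeof(uint64_t);
    return h;
}

/* Return 1 if the header looks like a reply to `request` with the
 * expected protocol version, 0 otherwise. */
static int sketch_is_valid_reply(const SketchHdr *h, uint32_t request)
{
    if ((h->flags & SKETCH_VERSION_MASK) != SKETCH_VERSION) {
        return 0;
    }
    if (!(h->flags & SKETCH_REPLY_MASK)) {
        return 0;
    }
    return h->request == request;
}
```

A master that has sent request 2 would read one header from the socket, pass it
to sketch_is_valid_reply(&hdr, 2), and close the connection if the check fails,
which matches the "receives a wrong reply" rule in the Communication section.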

diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt
new file mode 100644
index 0000000..236ec45
--- /dev/null
+++ b/docs/specs/vhost-user.txt
@@ -0,0 +1,249 @@
+Vhost-user Protocol
+===================
+
+This protocol aims to complement the ioctl interface used to control the
+vhost implementation in the Linux kernel. It implements the control plane needed
+to establish virtqueue sharing with a user space process on the same host. It
+uses communication over a Unix domain socket to share file descriptors in the
+ancillary data of the message.
+
+The protocol defines 2 sides of the communication, master and slave. Master is
+the application that shares its virtqueues, in our case QEMU. Slave is the
+consumer of the virtqueues.
+
+In the current implementation QEMU is the Master, and the Slave is intended to
+be a software ethernet switch running in user space, such as Snabbswitch.
+
+Master and slave can be either a client (i.e. connecting) or server (listening)
+in the socket communication.
+
+Message Specification
+---------------------
+
+Note that all numbers are in the machine native byte order. A vhost-user message
+consists of 3 header fields and a payload:
+
+------------------------------------
+| request | flags | size | payload |
+------------------------------------
+
+ * Request: 32-bit type of the request
+ * Flags: 32-bit bit field:
+   - Lower 2 bits are the version (currently 0x01)
+   - Bit 2 is the reply flag - must be set on each reply from the slave
+ * Size - 32-bit size of the payload
+
+
+Depending on the request type, payload can be:
+
+ * A single 64-bit integer
+   -------
+   | u64 |
+   -------
+
+   u64: a 64-bit unsigned integer
+
+ * A vring state description
+   ---------------
+   | index | num |
+   ---------------
+
+   Index: a 32-bit index
+   Num: a 32-bit number
+
+ * A vring address description
+   --------------------------------------------------------------
+   | index | flags | size | descriptor | used | available | log |
+   --------------------------------------------------------------
+
+   Index: a 32-bit vring index
+   Flags: a 32-bit vring flags
+   Descriptor: a 64-bit user address of the vring descriptor table
+   Used: a 64-bit user address of the vring used ring
+   Available: a 64-bit user address of the vring available ring
+   Log: a 64-bit guest address for logging
+
+ * Memory regions description
+   ---------------------------------------------------
+   | num regions | padding | region0 | ... | region7 |
+   ---------------------------------------------------
+
+   Num regions: a 32-bit number of regions
+   Padding: 32-bit
+
+   A region is:
+   ---------------------------------------
+   | guest address | size | user address |
+   ---------------------------------------
+
+   Guest address: a 64-bit guest address of the region
+   Size: a 64-bit size
+   User address: a 64-bit user address
+
+
+In QEMU the vhost-user message is implemented with the following struct:
+
+typedef struct VhostUserMsg {
+    VhostUserRequest request;
+    uint32_t flags;
+    uint32_t size;
+    union {
+        uint64_t u64;
+        struct vhost_vring_state state;
+        struct vhost_vring_addr addr;
+        VhostUserMemory memory;
+    };
+} QEMU_PACKED VhostUserMsg;
+
+Communication
+-------------
+
+The protocol for vhost-user is based on the existing implementation of vhost
+for the Linux Kernel. Most messages that can be sent via the Unix domain socket
+implementing vhost-user have an equivalent ioctl to the kernel implementation.
+
+The communication consists of master sending message requests and slave sending
+message replies. Most of the requests don't require replies. Here is a list of
+the ones that do:
+
+ * VHOST_USER_GET_FEATURES
+ * VHOST_USER_GET_VRING_BASE
+
+There are several messages that the master sends with file descriptors passed
+in the ancillary data:
+
+ * VHOST_USER_SET_MEM_TABLE
+ * VHOST_USER_SET_LOG_FD
+ * VHOST_USER_SET_VRING_KICK
+ * VHOST_USER_SET_VRING_CALL
+ * VHOST_USER_SET_VRING_ERR
+
+If Master is unable to send the full message or receives a wrong reply it will
+close the connection. An optional reconnection mechanism can be implemented.
+
+Message types
+-------------
+
+ * VHOST_USER_GET_FEATURES
+
+      Id: 2
+      Equivalent ioctl: VHOST_GET_FEATURES
+      Master payload: N/A
+      Slave payload: u64
+
+      Get from the underlying vhost implementation the features bitmask.
+
+ * VHOST_USER_SET_FEATURES
+
+      Id: 3
+      Equivalent ioctl: VHOST_SET_FEATURES
+      Master payload: u64
+
+      Enable features in the underlying vhost implementation using a bitmask.
+
+ * VHOST_USER_SET_OWNER
+
+      Id: 4
+      Equivalent ioctl: VHOST_SET_OWNER
+      Master payload: N/A
+
+      Issued when a new connection is established. It sets the current Master
+      as an owner of the session. This can be used on the Slave as a
+      "session start" flag.
+
+ * VHOST_USER_RESET_OWNER
+
+      Id: 5
+      Equivalent ioctl: VHOST_RESET_OWNER
+      Master payload: N/A
+
+      Issued when the connection is about to be closed. The Master will no
+      longer own this connection (and will usually close it).
+
+ * VHOST_USER_SET_MEM_TABLE
+
+      Id: 6
+      Equivalent ioctl: VHOST_SET_MEM_TABLE
+      Master payload: memory regions description
+
+      Sets the memory map regions on the slave so it can translate the vring
+      addresses. In the ancillary data there is an array of file descriptors
+      for each memory mapped region. The size and ordering of the fds matches
+      the number and ordering of memory regions.
+
+ * VHOST_USER_SET_LOG_BASE
+
+      Id: 7
+      Equivalent ioctl: VHOST_SET_LOG_BASE
+      Master payload: u64
+
+      Sets the logging base address.
+
+ * VHOST_USER_SET_LOG_FD
+
+      Id: 8
+      Equivalent ioctl: VHOST_SET_LOG_FD
+      Master payload: N/A
+
+      Sets the logging file descriptor, which is passed as ancillary data.
+
+ * VHOST_USER_SET_VRING_NUM
+
+      Id: 9
+      Equivalent ioctl: VHOST_SET_VRING_NUM
+      Master payload: vring state description
+
+      Sets the number of descriptors in the vring (the ring size).
+
+ * VHOST_USER_SET_VRING_ADDR
+
+      Id: 10
+      Equivalent ioctl: VHOST_SET_VRING_ADDR
+      Master payload: vring address description
+      Slave payload: N/A
+
+      Sets the addresses of the different aspects of the vring.
+
+ * VHOST_USER_SET_VRING_BASE
+
+      Id: 11
+      Equivalent ioctl: VHOST_SET_VRING_BASE
+      Master payload: vring state description
+
+      Sets the base index in the available ring where the slave starts.
+
+ * VHOST_USER_GET_VRING_BASE
+
+      Id: 12
+      Equivalent ioctl: VHOST_GET_VRING_BASE
+      Master payload: vring state description
+      Slave payload: vring state description
+
+      Get the current base index of the available ring.
+
+ * VHOST_USER_SET_VRING_KICK
+
+      Id: 13
+      Equivalent ioctl: VHOST_SET_VRING_KICK
+      Master payload: N/A
+
+      Set the event file descriptor for adding buffers to the vring. It
+      is passed in the ancillary data.
+
+ * VHOST_USER_SET_VRING_CALL
+
+      Id: 14
+      Equivalent ioctl: VHOST_SET_VRING_CALL
+      Master payload: N/A
+
+      Set the event file descriptor to signal when buffers are used. It
+      is passed in the ancillary data.
+
+ * VHOST_USER_SET_VRING_ERR
+
+      Id: 15
+      Equivalent ioctl: VHOST_SET_VRING_ERR
+      Master payload: N/A
+
+      Set the event file descriptor to signal when an error occurs. It
+      is passed in the ancillary data.
-- 
1.8.3.2
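
Note for readers of the archive: several of the messages above (for example
VHOST_USER_SET_VRING_KICK) carry a file descriptor in the ancillary data of the
socket message. On Linux and other POSIX systems this is normally done with
sendmsg()/recvmsg() and an SCM_RIGHTS control message. The following is a
minimal sketch of that mechanism only, not the code from this series; the
sketch_* function names are hypothetical, and a real implementation would also
handle short reads and multiple descriptors.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Send `len` bytes from `buf` together with one file descriptor.
 * Returns 0 on success, -1 on failure. */
static int sketch_send_with_fd(int sock, const void *buf, size_t len, int fd)
{
    struct iovec iov = { .iov_base = (void *)buf, .iov_len = len };
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align; /* forces correct alignment of buf */
    } ctrl;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    memset(&msg, 0, sizeof(msg));
    memset(&ctrl, 0, sizeof(ctrl));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    /* One SCM_RIGHTS control message holding a single descriptor. */
    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == (ssize_t)len ? 0 : -1;
}

/* Receive up to `len` bytes and at most one descriptor. *fd is set
 * to -1 when no descriptor arrived. Returns bytes read, or -1. */
static ssize_t sketch_recv_with_fd(int sock, void *buf, size_t len, int *fd)
{
    struct iovec iov = { .iov_base = buf, .iov_len = len };
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } ctrl;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    ssize_t n;

    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    *fd = -1;
    n = recvmsg(sock, &msg, 0);
    if (n < 0) {
        return -1;
    }
    for (cmsg = CMSG_FIRSTHDR(&msg); cmsg; cmsg = CMSG_NXTHDR(&msg, cmsg)) {
        if (cmsg->cmsg_level == SOL_SOCKET && cmsg->cmsg_type == SCM_RIGHTS) {
            memcpy(fd, CMSG_DATA(cmsg), sizeof(int));
        }
    }
    return n;
}
```

The descriptor number seen by the receiver generally differs from the sender's,
but both refer to the same open file, which is what lets the slave wait on the
kick eventfd or mmap the guest memory regions announced by the master.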


* Re: [Qemu-devel] [PATCH v7 11/13] Add new vhost-user netdev backend
  2014-01-31 17:34 ` [Qemu-devel] [PATCH v7 11/13] Add new vhost-user netdev backend Antonios Motakis
@ 2014-02-10  8:42   ` Michael S. Tsirkin
  2014-02-10 16:05     ` Antonios Motakis
  0 siblings, 1 reply; 20+ messages in thread
From: Michael S. Tsirkin @ 2014-02-10  8:42 UTC (permalink / raw)
  To: Antonios Motakis
  Cc: snabb-devel, Anthony Liguori, qemu-devel, n.nikolaev,
	Stefan Hajnoczi, lukego, tech

On Fri, Jan 31, 2014 at 06:34:40PM +0100, Antonios Motakis wrote:
> Add a new QEMU netdev backend that is intended to invoke vhost_net with the
> vhost-user backend.
> 
> At runtime the netdev will detect if the vhost backend is up or down. Upon
> disconnection it will set link_down accordingly and notify virtio-net. The
> virtio-net interface goes down.
> 
> Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
> Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>

What happens if users try to configure e.g. e1000 with this
netdev backend?
I would expect some code in the backend checking that
frontend is virtio, but I don't see such.

> ---
>  include/net/vhost-user.h |  17 +++++++
>  net/Makefile.objs        |   2 +-
>  net/clients.h            |   3 ++
>  net/vhost-user.c         | 130 +++++++++++++++++++++++++++++++++++++++++++++++
>  4 files changed, 151 insertions(+), 1 deletion(-)
>  create mode 100644 include/net/vhost-user.h
>  create mode 100644 net/vhost-user.c
> 
> diff --git a/include/net/vhost-user.h b/include/net/vhost-user.h
> new file mode 100644
> index 0000000..85109f6
> --- /dev/null
> +++ b/include/net/vhost-user.h
> @@ -0,0 +1,17 @@
> +/*
> + * vhost-user.h
> + *
> + * Copyright (c) 2013 Virtual Open Systems Sarl.
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + *
> + */
> +
> +#ifndef VHOST_USER_H_
> +#define VHOST_USER_H_
> +
> +struct vhost_net;
> +struct vhost_net *vhost_user_get_vhost_net(NetClientState *nc);
> +
> +#endif /* VHOST_USER_H_ */
> diff --git a/net/Makefile.objs b/net/Makefile.objs
> index c25fe69..301f6b6 100644
> --- a/net/Makefile.objs
> +++ b/net/Makefile.objs
> @@ -2,7 +2,7 @@ common-obj-y = net.o queue.o checksum.o util.o hub.o
>  common-obj-y += socket.o
>  common-obj-y += dump.o
>  common-obj-y += eth.o
> -common-obj-$(CONFIG_POSIX) += tap.o
> +common-obj-$(CONFIG_POSIX) += tap.o vhost-user.o
>  common-obj-$(CONFIG_LINUX) += tap-linux.o
>  common-obj-$(CONFIG_WIN32) += tap-win32.o
>  common-obj-$(CONFIG_BSD) += tap-bsd.o
> diff --git a/net/clients.h b/net/clients.h
> index 7322ff5..7f3d4ae 100644
> --- a/net/clients.h
> +++ b/net/clients.h
> @@ -57,4 +57,7 @@ int net_init_netmap(const NetClientOptions *opts, const char *name,
>                      NetClientState *peer);
>  #endif
>  
> +int net_init_vhost_user(const NetClientOptions *opts, const char *name,
> +                        NetClientState *peer);
> +
>  #endif /* QEMU_NET_CLIENTS_H */
> diff --git a/net/vhost-user.c b/net/vhost-user.c
> new file mode 100644
> index 0000000..b25722c
> --- /dev/null
> +++ b/net/vhost-user.c
> @@ -0,0 +1,130 @@
> +/*
> + * vhost-user.c
> + *
> + * Copyright (c) 2013 Virtual Open Systems Sarl.
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + *
> + */
> +
> +#include "clients.h"
> +#include "net/vhost_net.h"
> +#include "net/vhost-user.h"
> +#include "sysemu/char.h"
> +#include "qemu/error-report.h"
> +
> +typedef struct VhostUserState {
> +    NetClientState nc;
> +    CharDriverState *chr;
> +    VHostNetState *vhost_net;
> +} VhostUserState;
> +
> +VHostNetState *vhost_user_get_vhost_net(NetClientState *nc)
> +{
> +    VhostUserState *s = DO_UPCAST(VhostUserState, nc, nc);
> +    return s->vhost_net;
> +}
> +
> +static int vhost_user_running(VhostUserState *s)
> +{
> +    return (s->vhost_net) ? 1 : 0;
> +}
> +
> +static int vhost_user_start(VhostUserState *s)
> +{
> +    VhostNetOptions options;
> +
> +    if (vhost_user_running(s)) {
> +        return 0;
> +    }
> +
> +    options.backend_type = VHOST_BACKEND_TYPE_USER;
> +    options.net_backend = &s->nc;
> +    options.opaque = s->chr;
> +    options.force = 1;
> +
> +    s->vhost_net = vhost_net_init(&options);
> +
> +    return vhost_user_running(s) ? 0 : -1;
> +}
> +
> +static void vhost_user_stop(VhostUserState *s)
> +{
> +    if (vhost_user_running(s)) {
> +        vhost_net_cleanup(s->vhost_net);
> +    }
> +
> +    s->vhost_net = 0;
> +}
> +
> +static void vhost_user_cleanup(NetClientState *nc)
> +{
> +    VhostUserState *s = DO_UPCAST(VhostUserState, nc, nc);
> +
> +    vhost_user_stop(s);
> +    qemu_purge_queued_packets(nc);
> +}
> +
> +static NetClientInfo net_vhost_user_info = {
> +        .type = 0,
> +        .size = sizeof(VhostUserState),
> +        .cleanup = vhost_user_cleanup,
> +};
> +
> +static void net_vhost_user_event(void *opaque, int event)
> +{
> +    VhostUserState *s = opaque;
> +
> +    switch (event) {
> +    case CHR_EVENT_OPENED:
> +        vhost_user_start(s);
> +        break;
> +    case CHR_EVENT_CLOSED:
> +        s->nc.link_down = 1;
> +
> +        if (s->nc.peer) {
> +            s->nc.peer->link_down = 1;
> +        }
> +
> +        if (s->nc.info->link_status_changed) {
> +            s->nc.info->link_status_changed(&s->nc);
> +        }
> +
> +        if (s->nc.peer && s->nc.peer->info->link_status_changed) {
> +            s->nc.peer->info->link_status_changed(s->nc.peer);
> +        }
> +
> +        vhost_user_stop(s);
> +        error_report("chardev \"%s\" went down\n", s->chr->label);
> +        break;
> +    }
> +}
> +
> +static int net_vhost_user_init(NetClientState *peer, const char *device,
> +                               const char *name, CharDriverState *chr)
> +{
> +    NetClientState *nc;
> +    VhostUserState *s;
> +
> +    nc = qemu_new_net_client(&net_vhost_user_info, peer, device, name);
> +
> +    snprintf(nc->info_str, sizeof(nc->info_str), "vhost-user to %s",
> +             chr->label);
> +
> +    s = DO_UPCAST(VhostUserState, nc, nc);
> +
> +    /* We don't provide a receive callback */
> +    s->nc.receive_disabled = 1;
> +    s->chr = chr;
> +
> +    qemu_chr_add_handlers(s->chr, NULL, NULL, net_vhost_user_event, s);
> +
> +    return 0;
> +}
> +
> +int net_init_vhost_user(const NetClientOptions *opts, const char *name,
> +                   NetClientState *peer)
> +{
> +    return net_vhost_user_init(peer, "vhost_user", 0, 0);
> +}
> -- 
> 1.8.3.2
> 


* Re: [Qemu-devel] [PATCH v7 12/13] Add the vhost-user netdev backend to command line
  2014-01-31 17:34 ` [Qemu-devel] [PATCH v7 12/13] Add the vhost-user netdev backend to command line Antonios Motakis
@ 2014-02-10  8:49   ` Michael S. Tsirkin
  2014-02-10 16:43   ` Eric Blake
  1 sibling, 0 replies; 20+ messages in thread
From: Michael S. Tsirkin @ 2014-02-10  8:49 UTC (permalink / raw)
  To: Antonios Motakis
  Cc: snabb-devel, Anthony Liguori, tech, Michael Tokarev, qemu-devel,
	n.nikolaev, Markus Armbruster, Stefan Hajnoczi, lukego,
	Paolo Bonzini, Luiz Capitulino

On Fri, Jan 31, 2014 at 06:34:41PM +0100, Antonios Motakis wrote:
> The supplied chardev id will be inspected for supported options. Only
> a socket backend, with a set path (i.e. a unix socket) and optionally
> the server parameter set, will be allowed. Other options (nowait, telnet)
> will make the chardev unusable and the netdev will not be initialised.
> 
> Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
> Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
> ---
>  hmp-commands.hx     |  4 +--
>  hw/net/vhost_net.c  |  4 +++
>  hw/net/virtio-net.c |  3 ++
>  net/hub.c           |  1 +
>  net/net.c           |  2 ++
>  net/vhost-user.c    | 91 +++++++++++++++++++++++++++++++++++++++++++++++++++--
>  qapi-schema.json    | 18 ++++++++++-
>  qemu-options.hx     | 16 ++++++++++
>  8 files changed, 134 insertions(+), 5 deletions(-)
> 
> diff --git a/hmp-commands.hx b/hmp-commands.hx
> index f3fc514..68128c1 100644
> --- a/hmp-commands.hx
> +++ b/hmp-commands.hx
> @@ -1195,7 +1195,7 @@ ETEXI
>      {
>          .name       = "host_net_add",
>          .args_type  = "device:s,opts:s?",
> -        .params     = "tap|user|socket|vde|netmap|dump [options]",
> +        .params     = "tap|user|socket|vde|netmap|vhost-user|dump [options]",
>          .help       = "add host VLAN client",
>          .mhandler.cmd = net_host_device_add,
>      },
> @@ -1223,7 +1223,7 @@ ETEXI
>      {
>          .name       = "netdev_add",
>          .args_type  = "netdev:O",
> -        .params     = "[user|tap|socket|hubport|netmap],id=str[,prop=value][,...]",
> +        .params     = "[user|tap|socket|hubport|netmap|vhost-user],id=str[,prop=value][,...]",
>          .help       = "add host network device",
>          .mhandler.cmd = hmp_netdev_add,
>      },
> diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
> index 6b6268b..e630407 100644
> --- a/hw/net/vhost_net.c
> +++ b/hw/net/vhost_net.c
> @@ -15,6 +15,7 @@
>  
>  #include "net/net.h"
>  #include "net/tap.h"
> +#include "net/vhost-user.h"
>  
>  #include "hw/virtio/virtio-net.h"
>  #include "net/vhost_net.h"
> @@ -322,6 +323,9 @@ VHostNetState *get_vhost_net(NetClientState *nc)
>      case NET_CLIENT_OPTIONS_KIND_TAP:
>          vhost_net = tap_get_vhost_net(nc);
>          break;
> +    case NET_CLIENT_OPTIONS_KIND_VHOST_USER:
> +        vhost_net = vhost_user_get_vhost_net(nc);
> +        break;
>      default:
>          break;
>      }
> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> index 72acd15..d49ee82 100644
> --- a/hw/net/virtio-net.c
> +++ b/hw/net/virtio-net.c
> @@ -322,6 +322,9 @@ static void peer_test_vnet_hdr(VirtIONet *n)
>      case NET_CLIENT_OPTIONS_KIND_TAP:
>          n->has_vnet_hdr = tap_has_vnet_hdr(nc->peer);
>          break;
> +    case NET_CLIENT_OPTIONS_KIND_VHOST_USER:
> +        n->has_vnet_hdr = 0;
> +        break;
>      default:
>          break;
>      }
> diff --git a/net/hub.c b/net/hub.c
> index 33a99c9..7e0f2d6 100644
> --- a/net/hub.c
> +++ b/net/hub.c
> @@ -322,6 +322,7 @@ void net_hub_check_clients(void)
>              case NET_CLIENT_OPTIONS_KIND_TAP:
>              case NET_CLIENT_OPTIONS_KIND_SOCKET:
>              case NET_CLIENT_OPTIONS_KIND_VDE:
> +            case NET_CLIENT_OPTIONS_KIND_VHOST_USER:
>                  has_host_dev = 1;
>                  break;
>              default:
> diff --git a/net/net.c b/net/net.c
> index 2c3af20..30f1273 100644
> --- a/net/net.c
> +++ b/net/net.c
> @@ -731,6 +731,7 @@ static int (* const net_client_init_fun[NET_CLIENT_OPTIONS_KIND_MAX])(
>          [NET_CLIENT_OPTIONS_KIND_BRIDGE]    = net_init_bridge,
>  #endif
>          [NET_CLIENT_OPTIONS_KIND_HUBPORT]   = net_init_hubport,
> +        [NET_CLIENT_OPTIONS_KIND_VHOST_USER] = net_init_vhost_user,
>  };
>  
>  

Please align other options at =.

> @@ -764,6 +765,7 @@ static int net_client_init1(const void *object, int is_netdev, Error **errp)
>          case NET_CLIENT_OPTIONS_KIND_BRIDGE:
>  #endif
>          case NET_CLIENT_OPTIONS_KIND_HUBPORT:
> +        case NET_CLIENT_OPTIONS_KIND_VHOST_USER:
>              break;
>  
>          default:
> diff --git a/net/vhost-user.c b/net/vhost-user.c
> index b25722c..f5bd211 100644
> --- a/net/vhost-user.c
> +++ b/net/vhost-user.c
> @@ -12,6 +12,7 @@
>  #include "net/vhost_net.h"
>  #include "net/vhost-user.h"
>  #include "sysemu/char.h"
> +#include "qemu/config-file.h"
>  #include "qemu/error-report.h"
>  
>  typedef struct VhostUserState {
> @@ -20,9 +21,17 @@ typedef struct VhostUserState {
>      VHostNetState *vhost_net;
>  } VhostUserState;
>  
> +typedef struct VhostUserChardevProps {
> +    bool is_socket;
> +    bool is_unix;
> +    bool is_server;
> +    bool has_unsupported;
> +} VhostUserChardevProps;
> +
>  VHostNetState *vhost_user_get_vhost_net(NetClientState *nc)
>  {
>      VhostUserState *s = DO_UPCAST(VhostUserState, nc, nc);
> +    assert(nc->info->type == NET_CLIENT_OPTIONS_KIND_VHOST_USER);
>      return s->vhost_net;
>  }
>  
> @@ -67,7 +76,7 @@ static void vhost_user_cleanup(NetClientState *nc)
>  }
>  
>  static NetClientInfo net_vhost_user_info = {
> -        .type = 0,
> +        .type = NET_CLIENT_OPTIONS_KIND_VHOST_USER,
>          .size = sizeof(VhostUserState),
>          .cleanup = vhost_user_cleanup,
>  };
> @@ -123,8 +132,86 @@ static int net_vhost_user_init(NetClientState *peer, const char *device,
>      return 0;
>  }
>  
> +static int net_vhost_chardev_opts(const char *name, const char *value,
> +        void *opaque)
> +{
> +    VhostUserChardevProps *props = opaque;
> +
> +    if (strcmp(name, "backend") == 0 && strcmp(value, "socket") == 0) {
> +        props->is_socket = 1;
> +    } else if (strcmp(name, "path") == 0) {
> +        props->is_unix = 1;
> +    } else if (strcmp(name, "server") == 0) {
> +        props->is_server = 1;
> +    } else {
> +        error_report("vhost-user does not support a chardev"
> +                     " with the following option:\n %s = %s",
> +                     name, value);
> +        props->has_unsupported = 1;
> +        return -1;
> +    }
> +    return 0;
> +}
> +
> +static CharDriverState *net_vhost_parse_chardev(
> +        const NetdevVhostUserOptions *opts)
> +{
> +    CharDriverState *chr = qemu_chr_find(opts->chardev);
> +    VhostUserChardevProps props;
> +
> +    if (chr == NULL) {
> +        error_report("chardev \"%s\" not found\n", opts->chardev);
> +        return 0;
> +    }
> +
> +    /* inspect chardev opts */
> +    memset(&props, 0, sizeof(props));
> +    qemu_opt_foreach(chr->opts, net_vhost_chardev_opts, &props, false);
> +
> +    if (!props.is_socket || !props.is_unix) {
> +        error_report("chardev \"%s\" is not a unix socket\n",
> +                     opts->chardev);
> +        return 0;
> +    }
> +
> +    if (props.has_unsupported) {
> +        error_report("chardev \"%s\" has an unsupported option\n",
> +                opts->chardev);
> +        return 0;
> +    }
> +
> +    qemu_chr_fe_claim_no_fail(chr);
> +
> +    return chr;
> +}
> +
>  int net_init_vhost_user(const NetClientOptions *opts, const char *name,
>                     NetClientState *peer)
>  {
> -    return net_vhost_user_init(peer, "vhost_user", 0, 0);
> +    const NetdevVhostUserOptions *vhost_user_opts;
> +    CharDriverState *chr;
> +    QemuOpts *mem_opts;
> +    unsigned int mem_share = 0;
> +
> +    assert(opts->kind == NET_CLIENT_OPTIONS_KIND_VHOST_USER);
> +    vhost_user_opts = opts->vhost_user;
> +
> +    chr = net_vhost_parse_chardev(vhost_user_opts);
> +    if (!chr) {
> +        error_report("No suitable chardev found\n");
> +        return -1;
> +    }
> +
> +    /* verify mem-path is set and shared */
> +    mem_opts = qemu_opts_find(qemu_find_opts("mem-path"), NULL);
> +    if (mem_opts) {
> +        mem_share = qemu_opt_get_bool(mem_opts, "share", 0);
> +    }
> +
> +    if (!mem_share) {
> +        error_report("vhost-user requires -mem-path /path,share=on");
> +        return -1;
> +    }
> +
> +    return net_vhost_user_init(peer, "vhost_user", name, chr);
>  }
> diff --git a/qapi-schema.json b/qapi-schema.json
> index 05ced9d..51609a4 100644
> --- a/qapi-schema.json
> +++ b/qapi-schema.json
> @@ -3104,6 +3104,21 @@
>      '*devname':    'str' } }
>  
>  ##
> +# @NetdevVhostUserOptions
> +#
> +# Vhost-user network backend
> +#
> +# @path: control socket path
> +#
> +# Since 2.0
> +##
> +{ 'type': 'NetdevVhostUserOptions',
> +  'data': {
> +    'chardev': 'str' } }
> +
> +##
> +
> +##
>  # @NetClientOptions
>  #
>  # A discriminated record of network device traits.
> @@ -3121,7 +3136,8 @@
>      'dump':     'NetdevDumpOptions',
>      'bridge':   'NetdevBridgeOptions',
>      'hubport':  'NetdevHubPortOptions',
> -    'netmap':   'NetdevNetmapOptions' } }
> +    'netmap':   'NetdevNetmapOptions',
> +    'vhost-user': 'NetdevVhostUserOptions' } }
>  
>  ##
>  # @NetLegacy
> diff --git a/qemu-options.hx b/qemu-options.hx
> index 60ecc95..2c59164 100644
> --- a/qemu-options.hx
> +++ b/qemu-options.hx
> @@ -1435,6 +1435,7 @@ DEF("netdev", HAS_ARG, QEMU_OPTION_netdev,
>  #ifdef CONFIG_NETMAP
>      "netmap|"
>  #endif
> +    "vhost-user|"
>      "socket|"
>      "hubport],id=str[,option][,option][,...]\n", QEMU_ARCH_ALL)
>  STEXI
> @@ -1766,6 +1767,21 @@ The hubport netdev lets you connect a NIC to a QEMU "vlan" instead of a single
>  netdev.  @code{-net} and @code{-device} with parameter @option{vlan} create the
>  required hub automatically.
>  
> +@item -netdev vhost-user,chardev=@var{id}
> +
> +Establish a vhost-user netdev, backedb by a chardev @var{id}. The chardev should

typo: "backedb" should be "backed"

> +be a unix domain socket backed one. The vhost-user uses a specifically defined
> +protocol to pass vhost ioctl replacement messages to an application on the other
> +end of the socket.
> +
> +Example:
> +@example
> +qemu -m 1024 -mem-path /hugetlbfs,prealloc=on,share=on \
> +     -chardev socket,path=/path/to/socket \
> +     -netdev type=vhost-user,id=net0,chardev=chr0 \
> +     -device virtio-net-pci,netdev=net0
> +@end example
> +
>  @item -net dump[,vlan=@var{n}][,file=@var{file}][,len=@var{len}]
>  Dump network traffic on VLAN @var{n} to file @var{file} (@file{qemu-vlan0.pcap} by default).
>  At most @var{len} bytes (64k by default) per packet are stored. The file format is
> -- 
> 1.8.3.2
> 

* Re: [Qemu-devel] [PATCH v7 00/13] Vhost and vhost-net support for userspace based backends
  2014-01-31 17:34 [Qemu-devel] [PATCH v7 00/13] Vhost and vhost-net support for userspace based backends Antonios Motakis
                   ` (12 preceding siblings ...)
  2014-01-31 17:34 ` [Qemu-devel] [PATCH v7 13/13] Add vhost-user protocol documentation Antonios Motakis
@ 2014-02-10  8:57 ` Michael S. Tsirkin
  2014-02-10 16:02   ` Antonios Motakis
  13 siblings, 1 reply; 20+ messages in thread
From: Michael S. Tsirkin @ 2014-02-10  8:57 UTC (permalink / raw)
  To: Antonios Motakis; +Cc: lukego, snabb-devel, n.nikolaev, qemu-devel, tech

On Fri, Jan 31, 2014 at 06:34:29PM +0100, Antonios Motakis wrote:
> In this patch series we would like to introduce our approach for putting a
> virtio-net backend in an external userspace process. Our eventual target is to
> run the network backend in the Snabbswitch ethernet switch, while receiving
> traffic from a guest inside QEMU/KVM which runs an unmodified virtio-net
> implementation.
> 
> For this, we are working into extending vhost to allow equivalent functionality
> for userspace. Vhost already passes control of the data plane of virtio-net to
> the host kernel; we want to realize a similar model, but for userspace.
> 
> In this patch series the concept of a vhost-backend is introduced.
> 
> We define two vhost backend types - vhost-kernel and vhost-user. The former is
> the interface to the current kernel module implementation. Its control plane is
> ioctl based. The data plane is the kernel directly accessing the QEMU allocated,
> guest memory.
> 
> In the new vhost-user backend, the control plane is based on communication
> between QEMU and another userspace process using a unix domain socket. This
> allows to implement a virtio backend for a guest running in QEMU, inside the
> other userspace process. For this communication we use a chardev with a unix socket
> backend. Vhost-user is client/server agnostic regarding the chardev, however
> it does not support the 'nowait' and 'telnet' options.
> 
> We change -mem-path to QemuOpts and add prealloc and share as properties
> to it. HugeTLBFS is required for this option to work.
> 
> The data path is realized by directly accessing the vrings and the buffer data
> off the guest's memory.
> 
> The current user of vhost-user is only vhost-net. We add new netdev backend
> that is intended to initialize vhost-net with vhost-user backend.


You mentioned that there will be an in-tree utility that can
communicate over this channel from the other side.
Did I miss it in this patchset or is it not included yet?

> Example usage:
> 
> qemu -m 1024 -mem-path /hugetlbfs,share=on \
>      -chardev socket,id=chr0,path=/path/to/socket \
>      -netdev type=vhost-user,id=net0,chardev=chr0 \
>      -device virtio-net-pci,netdev=net0
> 
> This code can be pulled from git@github.com:virtualopensystems/qemu.git vhost-user-v7
> 
> A reference vhost-user slave for testing is available from git@github.com:virtualopensystems/vapp.git
> 
> TODOs include:
>  - Include a test in QEMU to avoid regressions
>  - Slave reconnection and nowait support
> 
> Changes from v6:
>  - Remove the 'unlink' property of '-mem-path'
>  - Extend qemu-char: blocking read, send fds, monitor for connection close
>  - Vhost-user uses chardev as a backend
>  - Poll and reconnect removed (no VHOST_USER_ECHO).
>  - Disconnect is deteced by the chardev (G_IO_HUP event)
>  - vhost-backend.c split to vhost-user.c
> 
> Changes from v5:
>  - Split -mem-path unlink option to a separate patch
>  - Fds are passed only in the ancillary data
>  - Stricter message size checks on receive/send
>  - Netdev vhost-user now includes path and poll_time options
>  - The connection probing interval is configurable
> 
> Changes from v4:
>  - Use error_report for errors
>  - VhostUserMsg has new field `size` indicating the following payload length.
>    Field `flags` now has version and reply bits. The structure is packed.
>  - Send data is of variable length (`size` field in message)
>  - Receive in 2 steps, header and payload
>  - Add new message type VHOST_USER_ECHO, to check connection status
> 
> Changes from v3:
>  - Convert -mem-path to QemuOpts with prealloc, share and unlink properties
>  - Set 1 sec timeout when read/write to the unix domain socket
>  - Fix file descriptor leak
> 
> Changes from v2:
>  - Reconnect when the backend disappears
> 
> Changes from v1:
>  - Implementation of vhost-user netdev backend
>  - Code improvements
> 
> Antonios Motakis (13):
>   Convert -mem-path to QemuOpts and add prealloc and share properties
>   Add chardev API  qemu_chr_fe_read_all
>   Add chardev API qemu_chr_fe_set_msgfds
>   Add G_IO_HUP handler for socket chardev
>   vhost_net should call the poll callback only when it is set
>   Refactor virtio-net to use a generic get_vhost_net
>   vhost_net_init will use VhostNetOptions to get all its arguments
>   Add vhost_ops to the vhost_dev struct and replace all relevant ioctls
>   Add vhost-backend and VhostBackendType
>   Add vhost-user as a vhost backend.
>   Add new vhost-user netdev backend
>   Add the vhost-user netdev backend to command line
>   Add vhost-user protocol documentation
> 
>  docs/specs/vhost-user.txt         | 249 ++++++++++++++++++++++++++++
>  exec.c                            |  30 +++-
>  hmp-commands.hx                   |   4 +-
>  hw/net/vhost_net.c                | 142 +++++++++++-----
>  hw/net/virtio-net.c               |  42 ++---
>  hw/scsi/vhost-scsi.c              |  20 ++-
>  hw/virtio/Makefile.objs           |   2 +-
>  hw/virtio/vhost-backend.c         |  71 ++++++++
>  hw/virtio/vhost-user.c            | 331 ++++++++++++++++++++++++++++++++++++++
>  hw/virtio/vhost.c                 |  55 ++++---
>  include/exec/cpu-all.h            |   3 -
>  include/hw/virtio/vhost-backend.h |  38 +++++
>  include/hw/virtio/vhost.h         |   8 +-
>  include/net/vhost-user.h          |  17 ++
>  include/net/vhost_net.h           |  11 +-
>  include/sysemu/char.h             |  28 ++++
>  net/Makefile.objs                 |   2 +-
>  net/clients.h                     |   3 +
>  net/hub.c                         |   1 +
>  net/net.c                         |   2 +
>  net/tap.c                         |  18 ++-
>  net/vhost-user.c                  | 217 +++++++++++++++++++++++++
>  qapi-schema.json                  |  18 ++-
>  qemu-char.c                       | 185 ++++++++++++++++++++-
>  qemu-options.hx                   |  25 ++-
>  vl.c                              |  37 ++++-
>  26 files changed, 1425 insertions(+), 134 deletions(-)
>  create mode 100644 docs/specs/vhost-user.txt
>  create mode 100644 hw/virtio/vhost-backend.c
>  create mode 100644 hw/virtio/vhost-user.c
>  create mode 100644 include/hw/virtio/vhost-backend.h
>  create mode 100644 include/net/vhost-user.h
>  create mode 100644 net/vhost-user.c
> 
> -- 
> 1.8.3.2
> 

* Re: [Qemu-devel] [PATCH v7 00/13] Vhost and vhost-net support for userspace based backends
  2014-02-10  8:57 ` [Qemu-devel] [PATCH v7 00/13] Vhost and vhost-net support for userspace based backends Michael S. Tsirkin
@ 2014-02-10 16:02   ` Antonios Motakis
  0 siblings, 0 replies; 20+ messages in thread
From: Antonios Motakis @ 2014-02-10 16:02 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Luke Gorrie, snabb-devel, Nikolay Nikolaev,
	qemu-devel qemu-devel, VirtualOpenSystems Technical Team

Hello,


On Mon, Feb 10, 2014 at 9:57 AM, Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Fri, Jan 31, 2014 at 06:34:29PM +0100, Antonios Motakis wrote:
> > In this patch series we would like to introduce our approach for putting a
> > virtio-net backend in an external userspace process. Our eventual target is to
> > run the network backend in the Snabbswitch ethernet switch, while receiving
> > traffic from a guest inside QEMU/KVM which runs an unmodified virtio-net
> > implementation.
> >
> > For this, we are working into extending vhost to allow equivalent functionality
> > for userspace. Vhost already passes control of the data plane of virtio-net to
> > the host kernel; we want to realize a similar model, but for userspace.
> >
> > In this patch series the concept of a vhost-backend is introduced.
> >
> > We define two vhost backend types - vhost-kernel and vhost-user. The former is
> > the interface to the current kernel module implementation. Its control plane is
> > ioctl based. The data plane is the kernel directly accessing the QEMU allocated,
> > guest memory.
> >
> > In the new vhost-user backend, the control plane is based on communication
> > between QEMU and another userspace process using a unix domain socket. This
> > allows to implement a virtio backend for a guest running in QEMU, inside the
> > other userspace process. For this communication we use a chardev with a unix socket
> > backend. Vhost-user is client/server agnostic regarding the chardev, however
> > it does not support the 'nowait' and 'telnet' options.
> >
> > We change -mem-path to QemuOpts and add prealloc and share as properties
> > to it. HugeTLBFS is required for this option to work.
> >
> > The data path is realized by directly accessing the vrings and the buffer data
> > off the guest's memory.
> >
> > The current user of vhost-user is only vhost-net. We add new netdev backend
> > that is intended to initialize vhost-net with vhost-user backend.
>
>
> You mentioned that there will be an in-tree utility that can
> communicate over this channel from the other side.
> Did I miss it in this patchset or is it not included yet?

You haven't missed it; we just wanted to push our intermediate changes
for review, while we were still implementing the in-tree test. The
next version (v8), which we will post quite soon, will include it.


>
>
> > Example usage:
> >
> > qemu -m 1024 -mem-path /hugetlbfs,share=on \
> >      -chardev socket,id=chr0,path=/path/to/socket \
> >      -netdev type=vhost-user,id=net0,chardev=chr0 \
> >      -device virtio-net-pci,netdev=net0
> >
> > This code can be pulled from git@github.com:virtualopensystems/qemu.git vhost-user-v7
> >
> > A reference vhost-user slave for testing is available from git@github.com:virtualopensystems/vapp.git
> >
> > TODOs include:
> >  - Include a test in QEMU to avoid regressions
> >  - Slave reconnection and nowait support
> >
> > Changes from v6:
> >  - Remove the 'unlink' property of '-mem-path'
> >  - Extend qemu-char: blocking read, send fds, monitor for connection close
> >  - Vhost-user uses chardev as a backend
> >  - Poll and reconnect removed (no VHOST_USER_ECHO).
> >  - Disconnect is deteced by the chardev (G_IO_HUP event)
> >  - vhost-backend.c split to vhost-user.c
> >
> > Changes from v5:
> >  - Split -mem-path unlink option to a separate patch
> >  - Fds are passed only in the ancillary data
> >  - Stricter message size checks on receive/send
> >  - Netdev vhost-user now includes path and poll_time options
> >  - The connection probing interval is configurable
> >
> > Changes from v4:
> >  - Use error_report for errors
> >  - VhostUserMsg has new field `size` indicating the following payload length.
> >    Field `flags` now has version and reply bits. The structure is packed.
> >  - Send data is of variable length (`size` field in message)
> >  - Receive in 2 steps, header and payload
> >  - Add new message type VHOST_USER_ECHO, to check connection status
> >
> > Changes from v3:
> >  - Convert -mem-path to QemuOpts with prealloc, share and unlink properties
> >  - Set 1 sec timeout when read/write to the unix domain socket
> >  - Fix file descriptor leak
> >
> > Changes from v2:
> >  - Reconnect when the backend disappears
> >
> > Changes from v1:
> >  - Implementation of vhost-user netdev backend
> >  - Code improvements
> >
> > Antonios Motakis (13):
> >   Convert -mem-path to QemuOpts and add prealloc and share properties
> >   Add chardev API  qemu_chr_fe_read_all
> >   Add chardev API qemu_chr_fe_set_msgfds
> >   Add G_IO_HUP handler for socket chardev
> >   vhost_net should call the poll callback only when it is set
> >   Refactor virtio-net to use a generic get_vhost_net
> >   vhost_net_init will use VhostNetOptions to get all its arguments
> >   Add vhost_ops to the vhost_dev struct and replace all relevant ioctls
> >   Add vhost-backend and VhostBackendType
> >   Add vhost-user as a vhost backend.
> >   Add new vhost-user netdev backend
> >   Add the vhost-user netdev backend to command line
> >   Add vhost-user protocol documentation
> >
> >  docs/specs/vhost-user.txt         | 249 ++++++++++++++++++++++++++++
> >  exec.c                            |  30 +++-
> >  hmp-commands.hx                   |   4 +-
> >  hw/net/vhost_net.c                | 142 +++++++++++-----
> >  hw/net/virtio-net.c               |  42 ++---
> >  hw/scsi/vhost-scsi.c              |  20 ++-
> >  hw/virtio/Makefile.objs           |   2 +-
> >  hw/virtio/vhost-backend.c         |  71 ++++++++
> >  hw/virtio/vhost-user.c            | 331 ++++++++++++++++++++++++++++++++++++++
> >  hw/virtio/vhost.c                 |  55 ++++---
> >  include/exec/cpu-all.h            |   3 -
> >  include/hw/virtio/vhost-backend.h |  38 +++++
> >  include/hw/virtio/vhost.h         |   8 +-
> >  include/net/vhost-user.h          |  17 ++
> >  include/net/vhost_net.h           |  11 +-
> >  include/sysemu/char.h             |  28 ++++
> >  net/Makefile.objs                 |   2 +-
> >  net/clients.h                     |   3 +
> >  net/hub.c                         |   1 +
> >  net/net.c                         |   2 +
> >  net/tap.c                         |  18 ++-
> >  net/vhost-user.c                  | 217 +++++++++++++++++++++++++
> >  qapi-schema.json                  |  18 ++-
> >  qemu-char.c                       | 185 ++++++++++++++++++++-
> >  qemu-options.hx                   |  25 ++-
> >  vl.c                              |  37 ++++-
> >  26 files changed, 1425 insertions(+), 134 deletions(-)
> >  create mode 100644 docs/specs/vhost-user.txt
> >  create mode 100644 hw/virtio/vhost-backend.c
> >  create mode 100644 hw/virtio/vhost-user.c
> >  create mode 100644 include/hw/virtio/vhost-backend.h
> >  create mode 100644 include/net/vhost-user.h
> >  create mode 100644 net/vhost-user.c
> >
> > --
> > 1.8.3.2
> >




-- 
Antonios Motakis
Virtual Open Systems

* Re: [Qemu-devel] [PATCH v7 11/13] Add new vhost-user netdev backend
  2014-02-10  8:42   ` Michael S. Tsirkin
@ 2014-02-10 16:05     ` Antonios Motakis
  0 siblings, 0 replies; 20+ messages in thread
From: Antonios Motakis @ 2014-02-10 16:05 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: snabb-devel, Anthony Liguori, qemu-devel qemu-devel,
	Nikolay Nikolaev, Stefan Hajnoczi, Luke Gorrie,
	VirtualOpenSystems Technical Team

On Mon, Feb 10, 2014 at 9:42 AM, Michael S. Tsirkin <mst@redhat.com> wrote:
> On Fri, Jan 31, 2014 at 06:34:40PM +0100, Antonios Motakis wrote:
>> Add a new QEMU netdev backend that is intended to invoke vhost_net with the
>> vhost-user backend.
>>
>> At runtime the netdev will detect if the vhost backend is up or down. Upon
>> disconnection it will set link_down accordingly and notify virtio-net. The
>> virtio-net interface goes down.
>>
>> Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
>> Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
>
> What happens if users try to configure e.g. e1000 with this
> netdev backend?
> I would expect some code in the backend checking that
> frontend is virtio, but I don't see such.

Good point, we are looking into it. Thanks.
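
For illustration, a check along those lines might look roughly like the sketch below. It uses a hypothetical stand-in struct and made-up function names rather than QEMU's real NetClientState API, and it assumes the frontend can be identified by a model string beginning with "virtio-net". The actual fix may well look different.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a NIC frontend; not QEMU's NetClientState. */
struct frontend {
    const char *model;  /* e.g. "virtio-net-pci" or "e1000" */
};

/* Return true only when the peer looks like a virtio-net frontend. */
static bool frontend_is_virtio_net(const struct frontend *fe)
{
    static const char prefix[] = "virtio-net";

    if (fe == NULL || fe->model == NULL) {
        return false;
    }
    return strncmp(fe->model, prefix, sizeof(prefix) - 1) == 0;
}

/* Hypothetical helper: 0 if the backend may start, -1 otherwise. */
static int vhost_user_check_frontend(const struct frontend *fe)
{
    return frontend_is_virtio_net(fe) ? 0 : -1;
}
```

The backend would run such a check on its peer before starting vhost and report an error when it returns -1, so that an e1000 paired with this netdev is rejected up front.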

>
>> ---
>>  include/net/vhost-user.h |  17 +++++++
>>  net/Makefile.objs        |   2 +-
>>  net/clients.h            |   3 ++
>>  net/vhost-user.c         | 130 +++++++++++++++++++++++++++++++++++++++++++++++
>>  4 files changed, 151 insertions(+), 1 deletion(-)
>>  create mode 100644 include/net/vhost-user.h
>>  create mode 100644 net/vhost-user.c
>>
>> diff --git a/include/net/vhost-user.h b/include/net/vhost-user.h
>> new file mode 100644
>> index 0000000..85109f6
>> --- /dev/null
>> +++ b/include/net/vhost-user.h
>> @@ -0,0 +1,17 @@
>> +/*
>> + * vhost-user.h
>> + *
>> + * Copyright (c) 2013 Virtual Open Systems Sarl.
>> + *
>> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
>> + * See the COPYING file in the top-level directory.
>> + *
>> + */
>> +
>> +#ifndef VHOST_USER_H_
>> +#define VHOST_USER_H_
>> +
>> +struct vhost_net;
>> +struct vhost_net *vhost_user_get_vhost_net(NetClientState *nc);
>> +
>> +#endif /* VHOST_USER_H_ */
>> diff --git a/net/Makefile.objs b/net/Makefile.objs
>> index c25fe69..301f6b6 100644
>> --- a/net/Makefile.objs
>> +++ b/net/Makefile.objs
>> @@ -2,7 +2,7 @@ common-obj-y = net.o queue.o checksum.o util.o hub.o
>>  common-obj-y += socket.o
>>  common-obj-y += dump.o
>>  common-obj-y += eth.o
>> -common-obj-$(CONFIG_POSIX) += tap.o
>> +common-obj-$(CONFIG_POSIX) += tap.o vhost-user.o
>>  common-obj-$(CONFIG_LINUX) += tap-linux.o
>>  common-obj-$(CONFIG_WIN32) += tap-win32.o
>>  common-obj-$(CONFIG_BSD) += tap-bsd.o
>> diff --git a/net/clients.h b/net/clients.h
>> index 7322ff5..7f3d4ae 100644
>> --- a/net/clients.h
>> +++ b/net/clients.h
>> @@ -57,4 +57,7 @@ int net_init_netmap(const NetClientOptions *opts, const char *name,
>>                      NetClientState *peer);
>>  #endif
>>
>> +int net_init_vhost_user(const NetClientOptions *opts, const char *name,
>> +                        NetClientState *peer);
>> +
>>  #endif /* QEMU_NET_CLIENTS_H */
>> diff --git a/net/vhost-user.c b/net/vhost-user.c
>> new file mode 100644
>> index 0000000..b25722c
>> --- /dev/null
>> +++ b/net/vhost-user.c
>> @@ -0,0 +1,130 @@
>> +/*
>> + * vhost-user.c
>> + *
>> + * Copyright (c) 2013 Virtual Open Systems Sarl.
>> + *
>> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
>> + * See the COPYING file in the top-level directory.
>> + *
>> + */
>> +
>> +#include "clients.h"
>> +#include "net/vhost_net.h"
>> +#include "net/vhost-user.h"
>> +#include "sysemu/char.h"
>> +#include "qemu/error-report.h"
>> +
>> +typedef struct VhostUserState {
>> +    NetClientState nc;
>> +    CharDriverState *chr;
>> +    VHostNetState *vhost_net;
>> +} VhostUserState;
>> +
>> +VHostNetState *vhost_user_get_vhost_net(NetClientState *nc)
>> +{
>> +    VhostUserState *s = DO_UPCAST(VhostUserState, nc, nc);
>> +    return s->vhost_net;
>> +}
>> +
>> +static int vhost_user_running(VhostUserState *s)
>> +{
>> +    return (s->vhost_net) ? 1 : 0;
>> +}
>> +
>> +static int vhost_user_start(VhostUserState *s)
>> +{
>> +    VhostNetOptions options;
>> +
>> +    if (vhost_user_running(s)) {
>> +        return 0;
>> +    }
>> +
>> +    options.backend_type = VHOST_BACKEND_TYPE_USER;
>> +    options.net_backend = &s->nc;
>> +    options.opaque = s->chr;
>> +    options.force = 1;
>> +
>> +    s->vhost_net = vhost_net_init(&options);
>> +
>> +    return vhost_user_running(s) ? 0 : -1;
>> +}
>> +
>> +static void vhost_user_stop(VhostUserState *s)
>> +{
>> +    if (vhost_user_running(s)) {
>> +        vhost_net_cleanup(s->vhost_net);
>> +    }
>> +
>> +    s->vhost_net = 0;
>> +}
>> +
>> +static void vhost_user_cleanup(NetClientState *nc)
>> +{
>> +    VhostUserState *s = DO_UPCAST(VhostUserState, nc, nc);
>> +
>> +    vhost_user_stop(s);
>> +    qemu_purge_queued_packets(nc);
>> +}
>> +
>> +static NetClientInfo net_vhost_user_info = {
>> +        .type = 0,
>> +        .size = sizeof(VhostUserState),
>> +        .cleanup = vhost_user_cleanup,
>> +};
>> +
>> +static void net_vhost_user_event(void *opaque, int event)
>> +{
>> +    VhostUserState *s = opaque;
>> +
>> +    switch (event) {
>> +    case CHR_EVENT_OPENED:
>> +        vhost_user_start(s);
>> +        break;
>> +    case CHR_EVENT_CLOSED:
>> +        s->nc.link_down = 1;
>> +
>> +        if (s->nc.peer) {
>> +            s->nc.peer->link_down = 1;
>> +        }
>> +
>> +        if (s->nc.info->link_status_changed) {
>> +            s->nc.info->link_status_changed(&s->nc);
>> +        }
>> +
>> +        if (s->nc.peer && s->nc.peer->info->link_status_changed) {
>> +            s->nc.peer->info->link_status_changed(s->nc.peer);
>> +        }
>> +
>> +        vhost_user_stop(s);
>> +        error_report("chardev \"%s\" went down\n", s->chr->label);
>> +        break;
>> +    }
>> +}
>> +
>> +static int net_vhost_user_init(NetClientState *peer, const char *device,
>> +                               const char *name, CharDriverState *chr)
>> +{
>> +    NetClientState *nc;
>> +    VhostUserState *s;
>> +
>> +    nc = qemu_new_net_client(&net_vhost_user_info, peer, device, name);
>> +
>> +    snprintf(nc->info_str, sizeof(nc->info_str), "vhost-user to %s",
>> +             chr->label);
>> +
>> +    s = DO_UPCAST(VhostUserState, nc, nc);
>> +
>> +    /* We don't provide a receive callback */
>> +    s->nc.receive_disabled = 1;
>> +    s->chr = chr;
>> +
>> +    qemu_chr_add_handlers(s->chr, NULL, NULL, net_vhost_user_event, s);
>> +
>> +    return 0;
>> +}
>> +
>> +int net_init_vhost_user(const NetClientOptions *opts, const char *name,
>> +                   NetClientState *peer)
>> +{
>> +    return net_vhost_user_init(peer, "vhost_user", 0, 0);
>> +}
>> --
>> 1.8.3.2
>>



-- 
Antonios Motakis
Virtual Open Systems

* Re: [Qemu-devel] [PATCH v7 12/13] Add the vhost-user netdev backend to command line
  2014-01-31 17:34 ` [Qemu-devel] [PATCH v7 12/13] Add the vhost-user netdev backend to command line Antonios Motakis
  2014-02-10  8:49   ` Michael S. Tsirkin
@ 2014-02-10 16:43   ` Eric Blake
  1 sibling, 0 replies; 20+ messages in thread
From: Eric Blake @ 2014-02-10 16:43 UTC (permalink / raw)
  To: Antonios Motakis, qemu-devel, snabb-devel
  Cc: Stefan Hajnoczi, mst, Michael Tokarev, Markus Armbruster,
	n.nikolaev, Luiz Capitulino, Anthony Liguori, lukego,
	Paolo Bonzini, tech

On 01/31/2014 10:34 AM, Antonios Motakis wrote:
> The supplied chardev id will be inspected for supported options. Only
> a socket backend, with a set path (i.e. a unix socket) and optionally
> the server parameter set, will be allowed. Other options (nowait, telnet)
> will make the chardev unusable and the netdev will not be initialised.
> 
> Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
> Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
> ---

> +++ b/qapi-schema.json
> @@ -3104,6 +3104,21 @@
>      '*devname':    'str' } }
>  
>  ##
> +# @NetdevVhostUserOptions
> +#
> +# Vhost-user network backend
> +#
> +# @path: control socket path
> +#
> +# Since 2.0
> +##
> +{ 'type': 'NetdevVhostUserOptions',
> +  'data': {
> +    'chardev': 'str' } }
> +
> +##
> +

This comment line looks spurious.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


Thread overview: 20+ messages
2014-01-31 17:34 [Qemu-devel] [PATCH v7 00/13] Vhost and vhost-net support for userspace based backends Antonios Motakis
2014-01-31 17:34 ` [Qemu-devel] [PATCH v7 01/13] Convert -mem-path to QemuOpts and add prealloc and share properties Antonios Motakis
2014-01-31 17:34 ` [Qemu-devel] [PATCH v7 02/13] Add chardev API qemu_chr_fe_read_all Antonios Motakis
2014-01-31 17:34 ` [Qemu-devel] [PATCH v7 03/13] Add chardev API qemu_chr_fe_set_msgfds Antonios Motakis
2014-01-31 17:34 ` [Qemu-devel] [PATCH v7 04/13] Add G_IO_HUP handler for socket chardev Antonios Motakis
2014-01-31 17:34 ` [Qemu-devel] [PATCH v7 05/13] vhost_net should call the poll callback only when it is set Antonios Motakis
2014-01-31 17:34 ` [Qemu-devel] [PATCH v7 06/13] Refactor virtio-net to use a generic get_vhost_net Antonios Motakis
2014-01-31 17:34 ` [Qemu-devel] [PATCH v7 07/13] vhost_net_init will use VhostNetOptions to get all its arguments Antonios Motakis
2014-01-31 17:34 ` [Qemu-devel] [PATCH v7 08/13] Add vhost_ops to the vhost_dev struct and replace all relevant ioctls Antonios Motakis
2014-01-31 17:34 ` [Qemu-devel] [PATCH v7 09/13] Add vhost-backend and VhostBackendType Antonios Motakis
2014-01-31 17:34 ` [Qemu-devel] [PATCH v7 10/13] Add vhost-user as a vhost backend Antonios Motakis
2014-01-31 17:34 ` [Qemu-devel] [PATCH v7 11/13] Add new vhost-user netdev backend Antonios Motakis
2014-02-10  8:42   ` Michael S. Tsirkin
2014-02-10 16:05     ` Antonios Motakis
2014-01-31 17:34 ` [Qemu-devel] [PATCH v7 12/13] Add the vhost-user netdev backend to command line Antonios Motakis
2014-02-10  8:49   ` Michael S. Tsirkin
2014-02-10 16:43   ` Eric Blake
2014-01-31 17:34 ` [Qemu-devel] [PATCH v7 13/13] Add vhost-user protocol documentation Antonios Motakis
2014-02-10  8:57 ` [Qemu-devel] [PATCH v7 00/13] Vhost and vhost-net support for userspace based backends Michael S. Tsirkin
2014-02-10 16:02   ` Antonios Motakis