* [PATCH v3 0/2] ivshmem: update documentation, add client/server tools
From: David Marchand @ 2014-08-08  8:55 UTC (permalink / raw)
  To: qemu-devel
  Cc: kvm, pbonzini, claudio.fontana, jani.kokkonen, eblake, cam, armbru

Here is a patch set that updates the ivshmem spec documentation and imports
the ivshmem server and client tools.
These tools have been written from scratch and are not related to what is
available in the nahanni repository.
I put them in the contrib/ directory, as qemu-doc.texi already states that
the server is supposed to live there.

Changes since v2:
- fixed license issues in ivshmem client/server (I took hw/virtio/virtio-rng.c
  file as a reference).

Changes since v1:
- moved the client/server import patch before the doc update,
- tried to re-organise the ivshmem_device_spec.txt file based on Claudio's
  comments (still not sure the result is that great, comments welcome),
- incorporated comments from Claudio, Eric and Cam,
- added more details on the server <-> client message exchange (but sorry, no
  ASCII art here).

By the way, some functionality still needs to be described (the use of
ioeventfd, the lack of irqfd support) and some parts of the ivshmem code
clearly need cleanup. I will try to address this in future patches once these
first patches are accepted.


-- 
David Marchand

David Marchand (2):
  contrib: add ivshmem client and server
  docs: update ivshmem device spec

 contrib/ivshmem-client/Makefile         |   29 +++
 contrib/ivshmem-client/ivshmem-client.c |  418 ++++++++++++++++++++++++++++++
 contrib/ivshmem-client/ivshmem-client.h |  238 ++++++++++++++++++
 contrib/ivshmem-client/main.c           |  246 ++++++++++++++++++
 contrib/ivshmem-server/Makefile         |   29 +++
 contrib/ivshmem-server/ivshmem-server.c |  420 +++++++++++++++++++++++++++++++
 contrib/ivshmem-server/ivshmem-server.h |  185 ++++++++++++++
 contrib/ivshmem-server/main.c           |  296 ++++++++++++++++++++++
 docs/specs/ivshmem_device_spec.txt      |  124 ++++++---
 qemu-doc.texi                           |   10 +-
 10 files changed, 1961 insertions(+), 34 deletions(-)
 create mode 100644 contrib/ivshmem-client/Makefile
 create mode 100644 contrib/ivshmem-client/ivshmem-client.c
 create mode 100644 contrib/ivshmem-client/ivshmem-client.h
 create mode 100644 contrib/ivshmem-client/main.c
 create mode 100644 contrib/ivshmem-server/Makefile
 create mode 100644 contrib/ivshmem-server/ivshmem-server.c
 create mode 100644 contrib/ivshmem-server/ivshmem-server.h
 create mode 100644 contrib/ivshmem-server/main.c

-- 
1.7.10.4



* [PATCH v3 1/2] contrib: add ivshmem client and server
From: David Marchand @ 2014-08-08  8:55 UTC (permalink / raw)
  To: qemu-devel
  Cc: kvm, pbonzini, claudio.fontana, jani.kokkonen, eblake, cam,
	armbru, Olivier Matz

When using ivshmem devices, notifications between guests can be sent as
interrupts using an ivshmem server (typical use is described in the
documentation). The client is provided as a debug tool.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Signed-off-by: David Marchand <david.marchand@6wind.com>
---
 contrib/ivshmem-client/Makefile         |   29 +++
 contrib/ivshmem-client/ivshmem-client.c |  418 ++++++++++++++++++++++++++++++
 contrib/ivshmem-client/ivshmem-client.h |  238 ++++++++++++++++++
 contrib/ivshmem-client/main.c           |  246 ++++++++++++++++++
 contrib/ivshmem-server/Makefile         |   29 +++
 contrib/ivshmem-server/ivshmem-server.c |  420 +++++++++++++++++++++++++++++++
 contrib/ivshmem-server/ivshmem-server.h |  185 ++++++++++++++
 contrib/ivshmem-server/main.c           |  296 ++++++++++++++++++++++
 qemu-doc.texi                           |   10 +-
 9 files changed, 1868 insertions(+), 3 deletions(-)
 create mode 100644 contrib/ivshmem-client/Makefile
 create mode 100644 contrib/ivshmem-client/ivshmem-client.c
 create mode 100644 contrib/ivshmem-client/ivshmem-client.h
 create mode 100644 contrib/ivshmem-client/main.c
 create mode 100644 contrib/ivshmem-server/Makefile
 create mode 100644 contrib/ivshmem-server/ivshmem-server.c
 create mode 100644 contrib/ivshmem-server/ivshmem-server.h
 create mode 100644 contrib/ivshmem-server/main.c

diff --git a/contrib/ivshmem-client/Makefile b/contrib/ivshmem-client/Makefile
new file mode 100644
index 0000000..eee97c6
--- /dev/null
+++ b/contrib/ivshmem-client/Makefile
@@ -0,0 +1,29 @@
+# Copyright 6WIND S.A., 2014
+#
+# This work is licensed under the terms of the GNU GPL, version 2 or
+# (at your option) any later version.  See the COPYING file in the
+# top-level directory.
+
+S ?= $(CURDIR)
+O ?= $(CURDIR)
+
+CFLAGS += -Wall -Wextra -Werror -g
+LDFLAGS +=
+LDLIBS += -lrt
+
+VPATH = $(S)
+PROG = ivshmem-client
+OBJS := $(O)/ivshmem-client.o
+OBJS += $(O)/main.o
+
+$(O)/%.o: %.c
+	$(CC) $(CFLAGS) -o $@ -c $<
+
+$(O)/$(PROG): $(OBJS)
+	$(CC) $(LDFLAGS) -o $@ $^ $(LDLIBS)
+
+.PHONY: all
+all: $(O)/$(PROG)
+
+clean:
+	rm -f $(OBJS) $(O)/$(PROG)
diff --git a/contrib/ivshmem-client/ivshmem-client.c b/contrib/ivshmem-client/ivshmem-client.c
new file mode 100644
index 0000000..2166b64
--- /dev/null
+++ b/contrib/ivshmem-client/ivshmem-client.c
@@ -0,0 +1,418 @@
+/*
+ * Copyright 6WIND S.A., 2014
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * (at your option) any later version.  See the COPYING file in the
+ * top-level directory.
+ */
+
+#include <stdio.h>
+#include <stdint.h>
+#include <stdlib.h>
+#include <errno.h>
+#include <string.h>
+#include <signal.h>
+#include <unistd.h>
+#include <inttypes.h>
+#include <sys/queue.h>
+
+#include <sys/types.h>
+#include <sys/socket.h>
+#include <sys/un.h>
+
+#include "ivshmem-client.h"
+
+/* log a message on stdout if verbose=1 */
+#define debug_log(client, fmt, ...) do { \
+        if ((client)->verbose) {         \
+            printf(fmt, ## __VA_ARGS__); \
+        }                                \
+    } while (0)
+
+/* read message from the unix socket */
+static int
+read_one_msg(struct ivshmem_client *client, long *index, int *fd)
+{
+    int ret;
+    struct msghdr msg;
+    struct iovec iov[1];
+    union {
+        struct cmsghdr cmsg;
+        char control[CMSG_SPACE(sizeof(int))];
+    } msg_control;
+    struct cmsghdr *cmsg;
+
+    iov[0].iov_base = index;
+    iov[0].iov_len = sizeof(*index);
+
+    memset(&msg, 0, sizeof(msg));
+    msg.msg_iov = iov;
+    msg.msg_iovlen = 1;
+    msg.msg_control = &msg_control;
+    msg.msg_controllen = sizeof(msg_control);
+
+    ret = recvmsg(client->sock_fd, &msg, 0);
+    if (ret < 0) {
+        debug_log(client, "cannot read message: %s\n", strerror(errno));
+        return -1;
+    }
+    if (ret == 0) {
+        debug_log(client, "lost connection to server\n");
+        return -1;
+    }
+
+    *fd = -1;
+
+    for (cmsg = CMSG_FIRSTHDR(&msg); cmsg; cmsg = CMSG_NXTHDR(&msg, cmsg)) {
+
+        if (cmsg->cmsg_len != CMSG_LEN(sizeof(int)) ||
+            cmsg->cmsg_level != SOL_SOCKET ||
+            cmsg->cmsg_type != SCM_RIGHTS) {
+            continue;
+        }
+
+        memcpy(fd, CMSG_DATA(cmsg), sizeof(*fd));
+    }
+
+    return 0;
+}
+
+/* free a peer when the server advertises a disconnection or when the
+ * client is freed */
+static void
+free_peer(struct ivshmem_client *client, struct ivshmem_client_peer *peer)
+{
+    unsigned vector;
+
+    TAILQ_REMOVE(&client->peer_list, peer, next);
+    for (vector = 0; vector < peer->vectors_count; vector++) {
+        close(peer->vectors[vector]);
+    }
+
+    free(peer);
+}
+
+/* handle message coming from server (new peer, new vectors) */
+static int
+handle_server_msg(struct ivshmem_client *client)
+{
+    struct ivshmem_client_peer *peer;
+    long peer_id;
+    int ret, fd;
+
+    ret = read_one_msg(client, &peer_id, &fd);
+    if (ret < 0) {
+        return -1;
+    }
+
+    /* can return a peer or the local client */
+    peer = ivshmem_client_search_peer(client, peer_id);
+
+    /* delete peer */
+    if (fd == -1) {
+
+        if (peer == NULL || peer == &client->local) {
+            debug_log(client, "received delete for invalid peer %ld\n", peer_id);
+            return -1;
+        }
+
+        debug_log(client, "delete peer id = %ld\n", peer_id);
+        free_peer(client, peer);
+        return 0;
+    }
+
+    /* new peer */
+    if (peer == NULL) {
+        peer = malloc(sizeof(*peer));
+        if (peer == NULL) {
+            debug_log(client, "cannot allocate new peer\n");
+            return -1;
+        }
+        memset(peer, 0, sizeof(*peer));
+        peer->id = peer_id;
+        peer->vectors_count = 0;
+        TAILQ_INSERT_TAIL(&client->peer_list, peer, next);
+        debug_log(client, "new peer id = %ld\n", peer_id);
+    }
+
+    /* new vector */
+    debug_log(client, "  new vector %d (fd=%d) for peer id %ld\n",
+              peer->vectors_count, fd, peer->id);
+    peer->vectors[peer->vectors_count] = fd;
+    peer->vectors_count++;
+
+    return 0;
+}
+
+/* init a new ivshmem client */
+int
+ivshmem_client_init(struct ivshmem_client *client, const char *unix_sock_path,
+                    ivshmem_client_notif_cb_t notif_cb, void *notif_arg,
+                    int verbose)
+{
+    unsigned i;
+
+    memset(client, 0, sizeof(*client));
+
+    snprintf(client->unix_sock_path, sizeof(client->unix_sock_path),
+             "%s", unix_sock_path);
+
+    for (i = 0; i < IVSHMEM_CLIENT_MAX_VECTORS; i++) {
+        client->local.vectors[i] = -1;
+    }
+
+    TAILQ_INIT(&client->peer_list);
+    client->local.id = -1;
+
+    client->notif_cb = notif_cb;
+    client->notif_arg = notif_arg;
+    client->verbose = verbose;
+
+    return 0;
+}
+
+/* create and connect to the unix socket */
+int
+ivshmem_client_connect(struct ivshmem_client *client)
+{
+    struct sockaddr_un sun;
+    int fd;
+    long tmp;
+
+    debug_log(client, "connect to server %s\n", client->unix_sock_path);
+
+    client->sock_fd = socket(AF_UNIX, SOCK_STREAM, 0);
+    if (client->sock_fd < 0) {
+        debug_log(client, "cannot create socket: %s\n", strerror(errno));
+        return -1;
+    }
+
+    sun.sun_family = AF_UNIX;
+    snprintf(sun.sun_path, sizeof(sun.sun_path), "%s", client->unix_sock_path);
+    if (connect(client->sock_fd, (struct sockaddr *)&sun, sizeof(sun)) < 0) {
+        debug_log(client, "cannot connect to %s: %s\n", sun.sun_path,
+                  strerror(errno));
+        close(client->sock_fd);
+        client->sock_fd = -1;
+        return -1;
+    }
+
+    /* first, we expect our index + a fd == -1 */
+    if (read_one_msg(client, &client->local.id, &fd) < 0 ||
+        client->local.id < 0 || fd != -1) {
+        debug_log(client, "cannot read from server\n");
+        close(client->sock_fd);
+        client->sock_fd = -1;
+        return -1;
+    }
+    debug_log(client, "our_id=%ld\n", client->local.id);
+
+    /* now, we expect shared mem fd + a -1 index, note that shm fd
+     * is not used */
+    if (read_one_msg(client, &tmp, &fd) < 0 ||
+        tmp != -1 || fd < 0) {
+        debug_log(client, "cannot read from server (2)\n");
+        close(client->sock_fd);
+        client->sock_fd = -1;
+        return -1;
+    }
+    debug_log(client, "shm_fd=%d\n", fd);
+
+    return 0;
+}
+
+/* close connection to the server, and free all peer structures */
+void
+ivshmem_client_close(struct ivshmem_client *client)
+{
+    struct ivshmem_client_peer *peer;
+    unsigned i;
+
+    debug_log(client, "close client\n");
+
+    while ((peer = TAILQ_FIRST(&client->peer_list)) != NULL) {
+        free_peer(client, peer);
+    }
+
+    close(client->sock_fd);
+    client->sock_fd = -1;
+    client->local.id = -1;
+    for (i = 0; i < IVSHMEM_CLIENT_MAX_VECTORS; i++) {
+        client->local.vectors[i] = -1;
+    }
+}
+
+/* get the fd_set according to the unix socket and peer list */
+void
+ivshmem_client_get_fds(const struct ivshmem_client *client, fd_set *fds,
+                       int *maxfd)
+{
+    int fd;
+    unsigned vector;
+
+    FD_SET(client->sock_fd, fds);
+    if (client->sock_fd >= *maxfd) {
+        *maxfd = client->sock_fd + 1;
+    }
+
+    for (vector = 0; vector < client->local.vectors_count; vector++) {
+        fd = client->local.vectors[vector];
+        FD_SET(fd, fds);
+        if (fd >= *maxfd) {
+            *maxfd = fd + 1;
+        }
+    }
+}
+
+/* handle events from eventfd: just print a message on notification */
+static int
+handle_event(struct ivshmem_client *client, const fd_set *cur, int maxfd)
+{
+    struct ivshmem_client_peer *peer;
+    uint64_t kick;
+    unsigned i;
+    int ret;
+
+    peer = &client->local;
+
+    for (i = 0; i < peer->vectors_count; i++) {
+        if (peer->vectors[i] >= maxfd || !FD_ISSET(peer->vectors[i], cur)) {
+            continue;
+        }
+
+        ret = read(peer->vectors[i], &kick, sizeof(kick));
+        if (ret < 0) {
+            return ret;
+        }
+        if (ret != sizeof(kick)) {
+            debug_log(client, "invalid read size = %d\n", ret);
+            errno = EINVAL;
+            return -1;
+        }
+        debug_log(client, "received event on fd %d vector %u: %" PRIu64 "\n",
+                  peer->vectors[i], i, kick);
+        if (client->notif_cb != NULL) {
+            client->notif_cb(client, peer, i, client->notif_arg);
+        }
+    }
+
+    return 0;
+}
+
+/* read and handle new messages on the given fd_set */
+int
+ivshmem_client_handle_fds(struct ivshmem_client *client, fd_set *fds, int maxfd)
+{
+    if (client->sock_fd < maxfd && FD_ISSET(client->sock_fd, fds) &&
+        handle_server_msg(client) < 0 && errno != EINTR) {
+        debug_log(client, "handle_server_msg() failed\n");
+        return -1;
+    } else if (handle_event(client, fds, maxfd) < 0 && errno != EINTR) {
+        debug_log(client, "handle_event() failed\n");
+        return -1;
+    }
+
+    return 0;
+}
+
+/* send a notification on a vector of a peer */
+int
+ivshmem_client_notify(const struct ivshmem_client *client,
+                      const struct ivshmem_client_peer *peer, unsigned vector)
+{
+    uint64_t kick;
+    int fd;
+
+    if (vector >= peer->vectors_count) {
+        debug_log(client, "invalid vector %u on peer %ld\n", vector, peer->id);
+        return -1;
+    }
+    fd = peer->vectors[vector];
+    debug_log(client, "notify peer %ld on vector %u, fd %d\n", peer->id,
+              vector, fd);
+
+    kick = 1;
+    if (write(fd, &kick, sizeof(kick)) != sizeof(kick)) {
+        fprintf(stderr, "could not write to %d: %s\n", peer->vectors[vector],
+                strerror(errno));
+        return -1;
+    }
+    return 0;
+}
+
+/* send a notification to all vectors of a peer */
+int
+ivshmem_client_notify_all_vects(const struct ivshmem_client *client,
+                                const struct ivshmem_client_peer *peer)
+{
+    unsigned vector;
+    int ret = 0;
+
+    for (vector = 0; vector < peer->vectors_count; vector++) {
+        if (ivshmem_client_notify(client, peer, vector) < 0) {
+            ret = -1;
+        }
+    }
+
+    return ret;
+}
+
+/* send a notification to all peers */
+int
+ivshmem_client_notify_broadcast(const struct ivshmem_client *client)
+{
+    struct ivshmem_client_peer *peer;
+    int ret = 0;
+
+    TAILQ_FOREACH(peer, &client->peer_list, next) {
+        if (ivshmem_client_notify_all_vects(client, peer) < 0) {
+            ret = -1;
+        }
+    }
+
+    return ret;
+}
+
+/* lookup peer from its id */
+struct ivshmem_client_peer *
+ivshmem_client_search_peer(struct ivshmem_client *client, long peer_id)
+{
+    struct ivshmem_client_peer *peer;
+
+    if (peer_id == client->local.id) {
+        return &client->local;
+    }
+
+    TAILQ_FOREACH(peer, &client->peer_list, next) {
+        if (peer->id == peer_id) {
+            return peer;
+        }
+    }
+    return NULL;
+}
+
+/* dump our info and the list of peers and their vectors on stdout */
+void
+ivshmem_client_dump(const struct ivshmem_client *client)
+{
+    const struct ivshmem_client_peer *peer;
+    unsigned vector;
+
+    /* dump our local info */
+    peer = &client->local;
+    printf("our_id = %ld\n", peer->id);
+    for (vector = 0; vector < peer->vectors_count; vector++) {
+        printf("  vector %d is enabled (fd=%d)\n", vector,
+               peer->vectors[vector]);
+    }
+
+    /* dump peers */
+    TAILQ_FOREACH(peer, &client->peer_list, next) {
+        printf("peer_id = %ld\n", peer->id);
+
+        for (vector = 0; vector < peer->vectors_count; vector++) {
+            printf("  vector %d is enabled (fd=%d)\n", vector,
+                   peer->vectors[vector]);
+        }
+    }
+}
diff --git a/contrib/ivshmem-client/ivshmem-client.h b/contrib/ivshmem-client/ivshmem-client.h
new file mode 100644
index 0000000..d27222b
--- /dev/null
+++ b/contrib/ivshmem-client/ivshmem-client.h
@@ -0,0 +1,238 @@
+/*
+ * Copyright 6WIND S.A., 2014
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * (at your option) any later version.  See the COPYING file in the
+ * top-level directory.
+ */
+
+#ifndef IVSHMEM_CLIENT_H
+#define IVSHMEM_CLIENT_H
+
+/**
+ * This file provides helpers to implement an ivshmem client. It is used
+ * on the host to ask QEMU to send an interrupt to an ivshmem PCI device in a
+ * guest. QEMU also implements an ivshmem client similar to this one; they
+ * both connect to an ivshmem server.
+ *
+ * A standalone ivshmem client based on this file is provided for debug/test
+ * purposes.
+ */
+
+#include <limits.h>
+#include <sys/select.h>
+#include <sys/queue.h>
+
+/**
+ * Maximum number of notification vectors supported by the client
+ */
+#define IVSHMEM_CLIENT_MAX_VECTORS 64
+
+/**
+ * Structure storing a peer
+ *
+ * Each time a client connects to an ivshmem server, it is advertised to
+ * all connected clients through the unix socket. When our ivshmem
+ * client receives a notification, it creates an ivshmem_client_peer
+ * structure to store the info about this peer.
+ *
+ * This structure is also used to store the information of our own
+ * client in (struct ivshmem_client)->local.
+ */
+struct ivshmem_client_peer {
+    TAILQ_ENTRY(ivshmem_client_peer) next;    /**< next in list */
+    long id;                                  /**< the id of the peer */
+    int vectors[IVSHMEM_CLIENT_MAX_VECTORS];  /**< one fd per vector */
+    unsigned vectors_count;                   /**< number of vectors */
+};
+TAILQ_HEAD(ivshmem_client_peer_list, ivshmem_client_peer);
+
+struct ivshmem_client;
+
+/**
+ * Typedef of callback function used when our ivshmem_client receives a
+ * notification from a peer.
+ */
+typedef void (*ivshmem_client_notif_cb_t)(
+    const struct ivshmem_client *client,
+    const struct ivshmem_client_peer *peer,
+    unsigned vect, void *arg);
+
+/**
+ * Structure describing an ivshmem client
+ *
+ * This structure stores all information related to our client: the name
+ * of the server unix socket, the list of peers advertised by the
+ * server, our own client information, and a pointer to the notification
+ * callback function used when we receive a notification from a peer.
+ */
+struct ivshmem_client {
+    char unix_sock_path[PATH_MAX];        /**< path to unix sock */
+    int sock_fd;                          /**< unix sock filedesc */
+
+    struct ivshmem_client_peer_list peer_list;  /**< list of peers */
+    struct ivshmem_client_peer local;           /**< our own info */
+
+    ivshmem_client_notif_cb_t notif_cb; /**< notification callback */
+    void *notif_arg;                      /**< notification argument */
+
+    int verbose;                          /**< true to enable debug */
+};
+
+/**
+ * Initialize an ivshmem client
+ *
+ * @param client
+ *   A pointer to an uninitialized ivshmem_client structure
+ * @param unix_sock_path
+ *   The pointer to the unix socket file name
+ * @param notif_cb
+ *   If not NULL, the pointer to the function to be called when our
+ *   ivshmem_client receives a notification from a peer
+ * @param notif_arg
+ *   Opaque pointer given as-is to the notification callback function
+ * @param verbose
+ *   True to enable debug
+ *
+ * @return
+ *   0 on success, or a negative value on error
+ */
+int ivshmem_client_init(struct ivshmem_client *client,
+    const char *unix_sock_path, ivshmem_client_notif_cb_t notif_cb,
+    void *notif_arg, int verbose);
+
+/**
+ * Connect to the server
+ *
+ * Connect to the server unix socket, and read the first initial
+ * messages sent by the server, giving the ID of the client and the file
+ * descriptor of the shared memory.
+ *
+ * @param client
+ *   The ivshmem client
+ *
+ * @return
+ *   0 on success, or a negative value on error
+ */
+int ivshmem_client_connect(struct ivshmem_client *client);
+
+/**
+ * Close connection to the server and free all peer structures
+ *
+ * @param client
+ *   The ivshmem client
+ */
+void ivshmem_client_close(struct ivshmem_client *client);
+
+/**
+ * Fill a fd_set with file descriptors to be monitored
+ *
+ * This function fills an fd_set with all file descriptors that must be
+ * polled (the unix server socket and the peers' eventfds). The
+ * function does not initialize the fd_set; it is up to the caller
+ * to do this.
+ *
+ * @param client
+ *   The ivshmem client
+ * @param fds
+ *   The fd_set to be updated
+ * @param maxfd
+ *   Must be set to the max file descriptor + 1 in fd_set. This value is
+ *   updated if this function adds a greater fd in fd_set.
+ */
+void ivshmem_client_get_fds(const struct ivshmem_client *client, fd_set *fds,
+                            int *maxfd);
+
+/**
+ * Read and handle new messages
+ *
+ * Given a fd_set filled by select(), handle incoming messages from
+ * server or peers.
+ *
+ * @param client
+ *   The ivshmem client
+ * @param fds
+ *   The fd_set containing the file descriptors to be checked. Note
+ *   that file descriptors that are not related to our client are
+ *   ignored.
+ * @param maxfd
+ *   The maximum fd in fd_set, plus one.
+ *
+ * @return
+ *   0 on success, negative value on failure.
+ */
+int ivshmem_client_handle_fds(struct ivshmem_client *client, fd_set *fds,
+    int maxfd);
+
+/**
+ * Send a notification to a vector of a peer
+ *
+ * @param client
+ *   The ivshmem client
+ * @param peer
+ *   The peer to be notified
+ * @param vector
+ *   The number of the vector
+ *
+ * @return
+ *   0 on success, and a negative error on failure.
+ */
+int ivshmem_client_notify(const struct ivshmem_client *client,
+    const struct ivshmem_client_peer *peer, unsigned vector);
+
+/**
+ * Send a notification to all vectors of a peer
+ *
+ * @param client
+ *   The ivshmem client
+ * @param peer
+ *   The peer to be notified
+ *
+ * @return
+ *   0 on success, and a negative error on failure (at least one
+ *   notification failed).
+ */
+int ivshmem_client_notify_all_vects(const struct ivshmem_client *client,
+    const struct ivshmem_client_peer *peer);
+
+/**
+ * Broadcast a notification to all vectors of all peers
+ *
+ * @param client
+ *   The ivshmem client
+ *
+ * @return
+ *   0 on success, and a negative error on failure (at least one
+ *   notification failed).
+ */
+int ivshmem_client_notify_broadcast(const struct ivshmem_client *client);
+
+/**
+ * Search a peer from its identifier
+ *
+ * Return the peer structure from its peer_id. If the given peer_id is
+ * the local id, the function returns the local peer structure.
+ *
+ * @param client
+ *   The ivshmem client
+ * @param peer_id
+ *   The identifier of the peer structure
+ *
+ * @return
+ *   The peer structure, or NULL if not found
+ */
+struct ivshmem_client_peer *
+ivshmem_client_search_peer(struct ivshmem_client *client, long peer_id);
+
+/**
+ * Dump information of this ivshmem client on stdout
+ *
+ * Dump the id and the vectors of the given ivshmem client and the list
+ * of its peers and their vectors on stdout.
+ *
+ * @param client
+ *   The ivshmem client
+ */
+void ivshmem_client_dump(const struct ivshmem_client *client);
+
+#endif /* IVSHMEM_CLIENT_H */
diff --git a/contrib/ivshmem-client/main.c b/contrib/ivshmem-client/main.c
new file mode 100644
index 0000000..0d53f55
--- /dev/null
+++ b/contrib/ivshmem-client/main.c
@@ -0,0 +1,246 @@
+/*
+ * Copyright 6WIND S.A., 2014
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * (at your option) any later version.  See the COPYING file in the
+ * top-level directory.
+ */
+
+#include <stdio.h>
+#include <stdint.h>
+#include <stdlib.h>
+#include <errno.h>
+#include <string.h>
+#include <signal.h>
+#include <unistd.h>
+#include <inttypes.h>
+#include <getopt.h>
+
+#include "ivshmem-client.h"
+
+#define DEFAULT_VERBOSE        0
+#define DEFAULT_UNIX_SOCK_PATH "/tmp/ivshmem_socket"
+
+struct ivshmem_client_args {
+    int verbose;
+    char *unix_sock_path;
+};
+
+/* show usage and exit with given error code */
+static void
+usage(const char *name, int code)
+{
+    fprintf(stderr, "%s [opts]\n", name);
+    fprintf(stderr, "  -h: show this help\n");
+    fprintf(stderr, "  -v: verbose mode\n");
+    fprintf(stderr, "  -S <unix_sock_path>: path to the unix socket\n"
+                    "     to connect to.\n"
+                    "     default=%s\n", DEFAULT_UNIX_SOCK_PATH);
+    exit(code);
+}
+
+/* parse the program arguments, exit on error */
+static void
+parse_args(struct ivshmem_client_args *args, int argc, char *argv[])
+{
+    int c;
+
+    while ((c = getopt(argc, argv,
+                       "h"  /* help */
+                       "v"  /* verbose */
+                       "S:" /* unix_sock_path */
+                      )) != -1) {
+
+        switch (c) {
+        case 'h': /* help */
+            usage(argv[0], 0);
+            break;
+
+        case 'v': /* verbose */
+            args->verbose = 1;
+            break;
+
+        case 'S': /* unix_sock_path */
+            args->unix_sock_path = strdup(optarg);
+            break;
+
+        default:
+            usage(argv[0], 1);
+            break;
+        }
+    }
+}
+
+/* show command line help */
+static void
+cmdline_help(void)
+{
+    printf("dump: dump peers (including us)\n"
+           "int <peer> <vector>: notify one vector on a peer\n"
+           "int <peer> all: notify all vectors of a peer\n"
+           "int all: notify all vectors of all peers (except us)\n");
+}
+
+/* read stdin and handle commands */
+static int
+handle_stdin_command(struct ivshmem_client *client)
+{
+    struct ivshmem_client_peer *peer;
+    char buf[128];
+    char *s, *token;
+    int ret;
+    int peer_id, vector;
+
+    memset(buf, 0, sizeof(buf));
+    ret = read(0, buf, sizeof(buf) - 1);
+    if (ret <= 0) {
+        return -1;
+    }
+
+    s = buf;
+    while ((token = strsep(&s, "\n\r;")) != NULL) {
+        if (!strcmp(token, "")) {
+            continue;
+        }
+        if (!strcmp(token, "?")) {
+            cmdline_help();
+            continue;
+        } else if (!strcmp(token, "help")) {
+            cmdline_help();
+        } else if (!strcmp(token, "dump")) {
+            ivshmem_client_dump(client);
+        } else if (!strcmp(token, "int all")) {
+            ivshmem_client_notify_broadcast(client);
+        } else if (sscanf(token, "int %d %d", &peer_id, &vector) == 2) {
+            peer = ivshmem_client_search_peer(client, peer_id);
+            if (peer == NULL) {
+                printf("cannot find peer_id = %d\n", peer_id);
+                continue;
+            }
+            ivshmem_client_notify(client, peer, vector);
+        } else if (sscanf(token, "int %d all", &peer_id) == 1) {
+            peer = ivshmem_client_search_peer(client, peer_id);
+            if (peer == NULL) {
+                printf("cannot find peer_id = %d\n", peer_id);
+                continue;
+            }
+            ivshmem_client_notify_all_vects(client, peer);
+        } else {
+            printf("invalid command, type help\n");
+        }
+    }
+
+    printf("cmd> ");
+    fflush(stdout);
+    return 0;
+}
+
+/* listen on stdin (command line), on unix socket (notifications of new
+ * and dead peers), and on eventfd (IRQ request) */
+int
+poll_events(struct ivshmem_client *client)
+{
+    fd_set fds;
+    int ret, maxfd;
+
+    while (1) {
+
+        FD_ZERO(&fds);
+        FD_SET(0, &fds); /* add stdin in fd_set */
+        maxfd = 1;
+
+        ivshmem_client_get_fds(client, &fds, &maxfd);
+
+        ret = select(maxfd, &fds, NULL, NULL, NULL);
+        if (ret < 0) {
+            if (errno == EINTR) {
+                continue;
+            }
+
+            fprintf(stderr, "select error: %s\n", strerror(errno));
+            break;
+        }
+        if (ret == 0) {
+            continue;
+        }
+
+        if (FD_ISSET(0, &fds) &&
+            handle_stdin_command(client) < 0 && errno != EINTR) {
+            fprintf(stderr, "handle_stdin_command() failed\n");
+            break;
+        }
+
+        if (ivshmem_client_handle_fds(client, &fds, maxfd) < 0) {
+            fprintf(stderr, "ivshmem_client_handle_fds() failed\n");
+            break;
+        }
+    }
+
+    return ret;
+}
+
+/* callback when we receive a notification (just display it) */
+void
+notification_cb(const struct ivshmem_client *client,
+                const struct ivshmem_client_peer *peer, unsigned vect,
+                void *arg)
+{
+    (void)client;
+    (void)arg;
+    printf("received notification from peer_id=%ld vector=%u\n", peer->id, vect);
+}
+
+int
+main(int argc, char *argv[])
+{
+    struct sigaction sa;
+    struct ivshmem_client client;
+    struct ivshmem_client_args args = {
+        .verbose = DEFAULT_VERBOSE,
+        .unix_sock_path = DEFAULT_UNIX_SOCK_PATH,
+    };
+
+    /* parse arguments, will exit on error */
+    parse_args(&args, argc, argv);
+
+    /* Ignore SIGPIPE, see this link for more info:
+     * http://www.mail-archive.com/libevent-users@monkey.org/msg01606.html */
+    sa.sa_handler = SIG_IGN;
+    sa.sa_flags = 0;
+    if (sigemptyset(&sa.sa_mask) == -1 ||
+        sigaction(SIGPIPE, &sa, 0) == -1) {
+        perror("failed to ignore SIGPIPE; sigaction");
+        return 1;
+    }
+
+    cmdline_help();
+    printf("cmd> ");
+    fflush(stdout);
+
+    if (ivshmem_client_init(&client, args.unix_sock_path, notification_cb,
+                            NULL, args.verbose) < 0) {
+        fprintf(stderr, "cannot init client\n");
+        return 1;
+    }
+
+    while (1) {
+        if (ivshmem_client_connect(&client) < 0) {
+            fprintf(stderr, "cannot connect to server, retry in 1 second\n");
+            sleep(1);
+            continue;
+        }
+
+        fprintf(stdout, "listen on server socket %d\n", client.sock_fd);
+
+        if (poll_events(&client) == 0) {
+            continue;
+        }
+
+        /* disconnected from server, reset all peers */
+        fprintf(stdout, "disconnected from server\n");
+
+        ivshmem_client_close(&client);
+    }
+
+    return 0;
+}
diff --git a/contrib/ivshmem-server/Makefile b/contrib/ivshmem-server/Makefile
new file mode 100644
index 0000000..26b4a72
--- /dev/null
+++ b/contrib/ivshmem-server/Makefile
@@ -0,0 +1,29 @@
+# Copyright 6WIND S.A., 2014
+#
+# This work is licensed under the terms of the GNU GPL, version 2 or
+# (at your option) any later version.  See the COPYING file in the
+# top-level directory.
+
+S ?= $(CURDIR)
+O ?= $(CURDIR)
+
+CFLAGS += -Wall -Wextra -Werror -g
+LDFLAGS +=
+LDLIBS += -lrt
+
+VPATH = $(S)
+PROG = ivshmem-server
+OBJS := $(O)/ivshmem-server.o
+OBJS += $(O)/main.o
+
+$(O)/%.o: %.c
+	$(CC) $(CFLAGS) -o $@ -c $<
+
+$(O)/$(PROG): $(OBJS)
+	$(CC) $(LDFLAGS) -o $@ $^ $(LDLIBS)
+
+.PHONY: all
+all: $(O)/$(PROG)
+
+clean:
+	rm -f $(OBJS) $(O)/$(PROG)
diff --git a/contrib/ivshmem-server/ivshmem-server.c b/contrib/ivshmem-server/ivshmem-server.c
new file mode 100644
index 0000000..f6497bb
--- /dev/null
+++ b/contrib/ivshmem-server/ivshmem-server.c
@@ -0,0 +1,420 @@
+/*
+ * Copyright 6WIND S.A., 2014
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * (at your option) any later version.  See the COPYING file in the
+ * top-level directory.
+ */
+
+#include <stdio.h>
+#include <stdint.h>
+#include <stdlib.h>
+#include <errno.h>
+#include <string.h>
+#include <signal.h>
+#include <unistd.h>
+#include <inttypes.h>
+#include <fcntl.h>
+
+#include <sys/queue.h>
+#include <sys/mman.h>
+#include <sys/stat.h>
+#include <sys/types.h>
+#include <sys/socket.h>
+#include <sys/un.h>
+#include <sys/eventfd.h>
+
+#include "ivshmem-server.h"
+
+/* log a message on stdout if verbose=1 */
+#define debug_log(server, fmt, ...) do { \
+        if ((server)->verbose) {         \
+            printf(fmt, ## __VA_ARGS__); \
+        }                                \
+    } while (0)
+
+/* iterate over the queue, allowing removal/free of the current element */
+#define    TAILQ_FOREACH_SAFE(var, var2, head, field)            \
+    for ((var) = TAILQ_FIRST((head)),                            \
+             (var2) = ((var) ? TAILQ_NEXT((var), field) : NULL); \
+         (var);                                                  \
+         (var) = (var2),                                         \
+             (var2) = ((var2) ? TAILQ_NEXT((var2), field) : NULL))
+
+/** maximum size of a huge page, used by ivshmem_ftruncate() */
+#define MAX_HUGEPAGE_SIZE (1024 * 1024 * 1024)
+
+/** default listen backlog (number of sockets not accepted) */
+#define IVSHMEM_SERVER_LISTEN_BACKLOG 10
+
+/* send message to a client unix socket */
+static int
+send_one_msg(int sock_fd, long peer_id, int fd)
+{
+    int ret;
+    struct msghdr msg;
+    struct iovec iov[1];
+    union {
+        struct cmsghdr cmsg;
+        char control[CMSG_SPACE(sizeof(int))];
+    } msg_control;
+    struct cmsghdr *cmsg;
+
+    iov[0].iov_base = &peer_id;
+    iov[0].iov_len = sizeof(peer_id);
+
+    memset(&msg, 0, sizeof(msg));
+    msg.msg_iov = iov;
+    msg.msg_iovlen = 1;
+
+    /* if fd is specified, add it in a cmsg */
+    if (fd >= 0) {
+        msg.msg_control = &msg_control;
+        msg.msg_controllen = sizeof(msg_control);
+        cmsg = CMSG_FIRSTHDR(&msg);
+        cmsg->cmsg_level = SOL_SOCKET;
+        cmsg->cmsg_type = SCM_RIGHTS;
+        cmsg->cmsg_len = CMSG_LEN(sizeof(int));
+        memcpy(CMSG_DATA(cmsg), &fd, sizeof(fd));
+    }
+
+    ret = sendmsg(sock_fd, &msg, 0);
+    if (ret <= 0) {
+        return -1;
+    }
+
+    return 0;
+}
+
+/* free a peer when it disconnects (the deletion is advertised to the
+ * other peers) or when the server is closed */
+static void
+free_peer(struct ivshmem_server *server, struct ivshmem_server_peer *peer)
+{
+    unsigned vector;
+    struct ivshmem_server_peer *other_peer;
+
+    debug_log(server, "free peer %ld\n", peer->id);
+    close(peer->sock_fd);
+    TAILQ_REMOVE(&server->peer_list, peer, next);
+
+    /* advertise the deletion to other peers */
+    TAILQ_FOREACH(other_peer, &server->peer_list, next) {
+        send_one_msg(other_peer->sock_fd, peer->id, -1);
+    }
+
+    for (vector = 0; vector < peer->vectors_count; vector++) {
+        close(peer->vectors[vector]);
+    }
+
+    free(peer);
+}
+
+/* send the peer id and the shm_fd just after a new client connection */
+static int
+send_initial_info(struct ivshmem_server *server,
+                  struct ivshmem_server_peer *peer)
+{
+    int ret;
+
+    /* send the peer id to the client */
+    ret = send_one_msg(peer->sock_fd, peer->id, -1);
+    if (ret < 0) {
+        debug_log(server, "cannot send peer id: %s\n", strerror(errno));
+        return -1;
+    }
+
+    /* send the shm_fd */
+    ret = send_one_msg(peer->sock_fd, -1, server->shm_fd);
+    if (ret < 0) {
+        debug_log(server, "cannot send shm fd: %s\n", strerror(errno));
+        return -1;
+    }
+
+    return 0;
+}
+
+/* handle message on listening unix socket (new client connection) */
+static int
+handle_new_conn(struct ivshmem_server *server)
+{
+    struct ivshmem_server_peer *peer, *other_peer;
+    struct sockaddr_un unaddr;
+    socklen_t unaddr_len;
+    int newfd;
+    unsigned i;
+
+    /* accept the incoming connection */
+    unaddr_len = sizeof(unaddr);
+    newfd = accept(server->sock_fd, (struct sockaddr *)&unaddr, &unaddr_len);
+    if (newfd < 0) {
+        debug_log(server, "cannot accept() %s\n", strerror(errno));
+        return -1;
+    }
+
+    debug_log(server, "accept()=%d\n", newfd);
+
+    /* allocate new structure for this peer */
+    peer = malloc(sizeof(*peer));
+    if (peer == NULL) {
+        debug_log(server, "cannot allocate new peer\n");
+        close(newfd);
+        return -1;
+    }
+
+    /* initialize the peer struct, one eventfd per vector */
+    memset(peer, 0, sizeof(*peer));
+    peer->sock_fd = newfd;
+
+    /* get an unused peer id */
+    while (ivshmem_server_search_peer(server, server->cur_id) != NULL) {
+        server->cur_id++;
+    }
+    peer->id = server->cur_id++;
+
+    /* create eventfd, one per vector */
+    peer->vectors_count = server->n_vectors;
+    for (i = 0; i < peer->vectors_count; i++) {
+        peer->vectors[i] = eventfd(0, 0);
+        if (peer->vectors[i] < 0) {
+            debug_log(server, "cannot create eventfd\n");
+            goto fail;
+        }
+    }
+
+    /* send peer id and shm fd */
+    if (send_initial_info(server, peer) < 0) {
+        debug_log(server, "cannot send initial info\n");
+        goto fail;
+    }
+
+    /* advertise the new peer to others */
+    TAILQ_FOREACH(other_peer, &server->peer_list, next) {
+        for (i = 0; i < peer->vectors_count; i++) {
+            send_one_msg(other_peer->sock_fd, peer->id, peer->vectors[i]);
+        }
+    }
+
+    /* advertise the other peers to the new one */
+    TAILQ_FOREACH(other_peer, &server->peer_list, next) {
+        for (i = 0; i < peer->vectors_count; i++) {
+            send_one_msg(peer->sock_fd, other_peer->id, other_peer->vectors[i]);
+        }
+    }
+
+    /* advertise the new peer to itself */
+    for (i = 0; i < peer->vectors_count; i++) {
+        send_one_msg(peer->sock_fd, peer->id, peer->vectors[i]);
+    }
+
+    TAILQ_INSERT_TAIL(&server->peer_list, peer, next);
+    debug_log(server, "new peer id = %ld\n", peer->id);
+    return 0;
+
+fail:
+    while (i--) {
+        close(peer->vectors[i]);
+    }
+    free(peer);
+    close(newfd);
+    return -1;
+}
+
+/* Try to ftruncate a file to the next power of 2 of shmsize.
+ * If it fails, all powers of 2 above shmsize are tested until
+ * we reach the maximum huge page size. This is useful
+ * if the shm file is in a hugetlbfs that cannot be truncated to the
+ * shm_size value. */
+static int
+ivshmem_ftruncate(int fd, unsigned shmsize)
+{
+    int ret;
+
+    /* align shmsize to next power of 2 */
+    shmsize--;
+    shmsize |= shmsize >> 1;
+    shmsize |= shmsize >> 2;
+    shmsize |= shmsize >> 4;
+    shmsize |= shmsize >> 8;
+    shmsize |= shmsize >> 16;
+    shmsize++;
+
+    while (shmsize <= MAX_HUGEPAGE_SIZE) {
+        ret = ftruncate(fd, shmsize);
+        if (ret == 0) {
+            return ret;
+        }
+        shmsize *= 2;
+    }
+
+    return -1;
+}
+
+/* Init a new ivshmem server */
+int
+ivshmem_server_init(struct ivshmem_server *server, const char *unix_sock_path,
+                    const char *shm_path, size_t shm_size, unsigned n_vectors,
+                    int verbose)
+{
+    memset(server, 0, sizeof(*server));
+
+    snprintf(server->unix_sock_path, sizeof(server->unix_sock_path),
+             "%s", unix_sock_path);
+    snprintf(server->shm_path, sizeof(server->shm_path),
+             "%s", shm_path);
+
+    server->shm_size = shm_size;
+    server->n_vectors = n_vectors;
+    server->verbose = verbose;
+
+    TAILQ_INIT(&server->peer_list);
+
+    return 0;
+}
+
+/* open shm, create and bind to the unix socket */
+int
+ivshmem_server_start(struct ivshmem_server *server)
+{
+    struct sockaddr_un sun;
+    int shm_fd, sock_fd;
+
+    /* open shm file */
+    shm_fd = shm_open(server->shm_path, O_CREAT|O_RDWR, S_IRWXU);
+    if (shm_fd < 0) {
+        fprintf(stderr, "cannot open shm file %s: %s\n", server->shm_path,
+                strerror(errno));
+        return -1;
+    }
+    if (ivshmem_ftruncate(shm_fd, server->shm_size) < 0) {
+        fprintf(stderr, "ftruncate(%s) failed: %s\n", server->shm_path,
+                strerror(errno));
+        close(shm_fd);
+        return -1;
+    }
+
+    debug_log(server, "create & bind socket %s\n", server->unix_sock_path);
+
+    /* create the unix listening socket */
+    sock_fd = socket(AF_UNIX, SOCK_STREAM, 0);
+    if (sock_fd < 0) {
+        debug_log(server, "cannot create socket: %s\n", strerror(errno));
+        close(shm_fd);
+        return -1;
+    }
+
+    sun.sun_family = AF_UNIX;
+    snprintf(sun.sun_path, sizeof(sun.sun_path), "%s", server->unix_sock_path);
+    unlink(sun.sun_path);
+    if (bind(sock_fd, (struct sockaddr *)&sun, sizeof(sun)) < 0) {
+        debug_log(server, "cannot bind to %s: %s\n", sun.sun_path,
+                  strerror(errno));
+        close(sock_fd);
+        close(shm_fd);
+        return -1;
+    }
+
+    if (listen(sock_fd, IVSHMEM_SERVER_LISTEN_BACKLOG) < 0) {
+        debug_log(server, "listen() failed: %s\n", strerror(errno));
+        close(sock_fd);
+        close(shm_fd);
+        return -1;
+    }
+
+    server->sock_fd = sock_fd;
+    server->shm_fd = shm_fd;
+
+    return 0;
+}
+
+/* close connections to clients, the unix socket and the shm fd */
+void
+ivshmem_server_close(struct ivshmem_server *server)
+{
+    struct ivshmem_server_peer *peer, *peer_next;
+
+    debug_log(server, "close server\n");
+
+    TAILQ_FOREACH_SAFE(peer, peer_next, &server->peer_list, next) {
+        free_peer(server, peer);
+    }
+
+    close(server->sock_fd);
+    close(server->shm_fd);
+    server->sock_fd = -1;
+    server->shm_fd = -1;
+}
+
+/* get the fd_set according to the unix socket and the peer list */
+void
+ivshmem_server_get_fds(const struct ivshmem_server *server, fd_set *fds,
+                       int *maxfd)
+{
+    struct ivshmem_server_peer *peer;
+
+    FD_SET(server->sock_fd, fds);
+    if (server->sock_fd >= *maxfd) {
+        *maxfd = server->sock_fd + 1;
+    }
+
+    TAILQ_FOREACH(peer, &server->peer_list, next) {
+        FD_SET(peer->sock_fd, fds);
+        if (peer->sock_fd >= *maxfd) {
+            *maxfd = peer->sock_fd + 1;
+        }
+    }
+}
+
+/* process incoming messages on the sockets in fd_set */
+int
+ivshmem_server_handle_fds(struct ivshmem_server *server, fd_set *fds, int maxfd)
+{
+    struct ivshmem_server_peer *peer, *peer_next;
+
+    if (server->sock_fd < maxfd && FD_ISSET(server->sock_fd, fds) &&
+        handle_new_conn(server) < 0 && errno != EINTR) {
+        debug_log(server, "handle_new_conn() failed\n");
+        return -1;
+    }
+
+    TAILQ_FOREACH_SAFE(peer, peer_next, &server->peer_list, next) {
+        /* any message from a peer socket results in a close() */
+        debug_log(server, "peer->sock_fd=%d\n", peer->sock_fd);
+        if (peer->sock_fd < maxfd && FD_ISSET(peer->sock_fd, fds)) {
+            free_peer(server, peer);
+        }
+    }
+
+    return 0;
+}
+
+/* lookup peer from its id */
+struct ivshmem_server_peer *
+ivshmem_server_search_peer(struct ivshmem_server *server, long peer_id)
+{
+    struct ivshmem_server_peer *peer;
+
+    TAILQ_FOREACH(peer, &server->peer_list, next) {
+        if (peer->id == peer_id) {
+            return peer;
+        }
+    }
+    return NULL;
+}
+
+/* dump our info and the list of peers with their vectors on stdout */
+void
+ivshmem_server_dump(const struct ivshmem_server *server)
+{
+    const struct ivshmem_server_peer *peer;
+    unsigned vector;
+
+    /* dump peers */
+    TAILQ_FOREACH(peer, &server->peer_list, next) {
+        printf("peer_id = %ld\n", peer->id);
+
+        for (vector = 0; vector < peer->vectors_count; vector++) {
+            printf("  vector %d is enabled (fd=%d)\n", vector,
+                   peer->vectors[vector]);
+        }
+    }
+}
diff --git a/contrib/ivshmem-server/ivshmem-server.h b/contrib/ivshmem-server/ivshmem-server.h
new file mode 100644
index 0000000..cd74bbf
--- /dev/null
+++ b/contrib/ivshmem-server/ivshmem-server.h
@@ -0,0 +1,185 @@
+/*
+ * Copyright 6WIND S.A., 2014
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * (at your option) any later version.  See the COPYING file in the
+ * top-level directory.
+ */
+
+#ifndef _IVSHMEM_SERVER_
+#define _IVSHMEM_SERVER_
+
+/**
+ * The ivshmem server is a daemon that creates a unix socket in listen
+ * mode. The ivshmem clients (qemu or ivshmem-client) connect to this
+ * unix socket. For each client, the server creates several eventfds
+ * (see EVENTFD(2)), one per vector. These fds are transmitted to all
+ * clients using SCM_RIGHTS cmsg messages. Therefore, each client is
+ * able to send a notification to another client without being
+ * "proxied" by the server.
+ *
+ * We use this mechanism to send interrupts between guests.
+ * qemu is able to transform an event on an eventfd into a PCI MSI-X
+ * interrupt in the guest.
+ *
+ * The ivshmem server is also able to share the file descriptor
+ * associated with the ivshmem shared memory.
+ */
+
+#include <limits.h>
+#include <sys/select.h>
+#include <sys/queue.h>
+
+/**
+ * Maximum number of notification vectors supported by the server
+ */
+#define IVSHMEM_SERVER_MAX_VECTORS 64
+
+/**
+ * Structure storing a peer
+ *
+ * Each time a client connects to an ivshmem server, a new
+ * ivshmem_server_peer structure is created. This peer and all its
+ * vectors are advertised to all connected clients through the connected
+ * unix sockets.
+ */
+struct ivshmem_server_peer {
+    TAILQ_ENTRY(ivshmem_server_peer) next;    /**< next in list*/
+    int sock_fd;                                /**< connected unix sock */
+    long id;                                    /**< the id of the peer */
+    int vectors[IVSHMEM_SERVER_MAX_VECTORS];  /**< one fd per vector */
+    unsigned vectors_count;                     /**< number of vectors */
+};
+TAILQ_HEAD(ivshmem_server_peer_list, ivshmem_server_peer);
+
+/**
+ * Structure describing an ivshmem server
+ *
+ * This structure stores all information related to our server: the name
+ * of the server unix socket and the list of connected peers.
+ */
+struct ivshmem_server {
+    char unix_sock_path[PATH_MAX];  /**< path to unix socket */
+    int sock_fd;                    /**< unix sock file descriptor */
+    char shm_path[PATH_MAX];        /**< path to shm */
+    size_t shm_size;                /**< size of shm */
+    int shm_fd;                     /**< shm file descriptor */
+    unsigned n_vectors;             /**< number of vectors */
+    long cur_id;                    /**< id to be given to next client */
+    int verbose;                    /**< true in verbose mode */
+    struct ivshmem_server_peer_list peer_list;  /**< list of peers */
+};
+
+/**
+ * Initialize an ivshmem server
+ *
+ * @param server
+ *   A pointer to an uninitialized ivshmem_server structure
+ * @param unix_sock_path
+ *   The pointer to the unix socket file name
+ * @param shm_path
+ *   Path to the shared memory. The path corresponds to a POSIX shm name.
+ *   To use a real file, for instance in a hugetlbfs, it is possible to
+ *   use /../../abspath/to/file.
+ * @param shm_size
+ *   Size of shared memory
+ * @param n_vectors
+ *   Number of interrupt vectors per client
+ * @param verbose
+ *   True to enable verbose mode
+ *
+ * @return
+ *   0 on success, negative value on error
+ */
+int
+ivshmem_server_init(struct ivshmem_server *server,
+    const char *unix_sock_path, const char *shm_path, size_t shm_size,
+    unsigned n_vectors, int verbose);
+
+/**
+ * Open the shm, then create and bind to the unix socket
+ *
+ * @param server
+ *   The pointer to the initialized ivshmem server structure
+ *
+ * @return
+ *   0 on success, or a negative value on error
+ */
+int ivshmem_server_start(struct ivshmem_server *server);
+
+/**
+ * Close the server
+ *
+ * Close connections to all clients, close the unix socket and the
+ * shared memory file descriptor. The structure remains initialized, so
+ * it is possible to call ivshmem_server_start() again after a call to
+ * ivshmem_server_close().
+ *
+ * @param server
+ *   The ivshmem server
+ */
+void ivshmem_server_close(struct ivshmem_server *server);
+
+/**
+ * Fill an fd_set with the file descriptors to be monitored
+ *
+ * This function will fill an fd_set with all file descriptors that must
+ * be polled (the unix server socket and the peer unix sockets). The
+ * function will not initialize the fd_set; it is up to the caller to do it.
+ *
+ * @param server
+ *   The ivshmem server
+ * @param fds
+ *   The fd_set to be updated
+ * @param maxfd
+ *   Must be set to the max file descriptor + 1 in fd_set. This value is
+ *   updated if this function adds a greater fd to fd_set.
+ */
+void
+ivshmem_server_get_fds(const struct ivshmem_server *server,
+    fd_set *fds, int *maxfd);
+
+/**
+ * Read and handle new messages
+ *
+ * Given an fd_set (for instance filled by a call to select()), handle
+ * incoming messages from peers.
+ *
+ * @param server
+ *   The ivshmem server
+ * @param fds
+ *   The fd_set containing the file descriptors to be checked. Note
+ *   that file descriptors that are not related to our server are
+ *   ignored.
+ * @param maxfd
+ *   The maximum fd in fd_set, plus one.
+ *
+ * @return
+ *   0 on success, negative value on failure.
+ */
+int ivshmem_server_handle_fds(struct ivshmem_server *server, fd_set *fds,
+    int maxfd);
+
+/**
+ * Search a peer from its identifier
+ *
+ * @param server
+ *   The ivshmem server
+ * @param peer_id
+ *   The identifier of the peer structure
+ *
+ * @return
+ *   The peer structure, or NULL if not found
+ */
+struct ivshmem_server_peer *
+ivshmem_server_search_peer(struct ivshmem_server *server, long peer_id);
+
+/**
+ * Dump information of this ivshmem server and its peers on stdout
+ *
+ * @param server
+ *   The ivshmem server
+ */
+void ivshmem_server_dump(const struct ivshmem_server *server);
+
+#endif /* _IVSHMEM_SERVER_ */
diff --git a/contrib/ivshmem-server/main.c b/contrib/ivshmem-server/main.c
new file mode 100644
index 0000000..36b7028
--- /dev/null
+++ b/contrib/ivshmem-server/main.c
@@ -0,0 +1,296 @@
+/*
+ * Copyright 6WIND S.A., 2014
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * (at your option) any later version.  See the COPYING file in the
+ * top-level directory.
+ */
+
+#include <stdio.h>
+#include <stdint.h>
+#include <stdlib.h>
+#include <errno.h>
+#include <string.h>
+#include <signal.h>
+#include <unistd.h>
+#include <inttypes.h>
+#include <sys/types.h>
+#include <limits.h>
+#include <getopt.h>
+
+#include "ivshmem-server.h"
+
+#define DEFAULT_VERBOSE        0
+#define DEFAULT_FOREGROUND     0
+#define DEFAULT_PID_FILE       "/var/run/ivshmem-server.pid"
+#define DEFAULT_UNIX_SOCK_PATH "/tmp/ivshmem_socket"
+#define DEFAULT_SHM_PATH       "ivshmem"
+#define DEFAULT_SHM_SIZE       (1024*1024)
+#define DEFAULT_N_VECTORS      16
+
+/* arguments given by the user */
+struct ivshmem_server_args {
+    int verbose;
+    int foreground;
+    char *pid_file;
+    char *unix_socket_path;
+    char *shm_path;
+    size_t shm_size;
+    unsigned n_vectors;
+};
+
+/* show usage and exit with given error code */
+static void
+usage(const char *name, int code)
+{
+    fprintf(stderr, "%s [opts]\n", name);
+    fprintf(stderr, "  -h: show this help\n");
+    fprintf(stderr, "  -v: verbose mode\n");
+    fprintf(stderr, "  -F: foreground mode (default is to daemonize)\n");
+    fprintf(stderr, "  -p <pid_file>: path to the PID file (used in daemon\n"
+                    "     mode only).\n"
+                    "     Default=%s\n", DEFAULT_PID_FILE);
+    fprintf(stderr, "  -S <unix_socket_path>: path to the unix socket\n"
+                    "     to listen to.\n"
+                    "     Default=%s\n", DEFAULT_UNIX_SOCK_PATH);
+    fprintf(stderr, "  -m <shm_path>: path to the shared memory.\n"
+                    "     The path corresponds to a POSIX shm name. To use a\n"
+                    "     real file, for instance in a hugetlbfs, use\n"
+                    "     /../../abspath/to/file.\n"
+                    "     default=%s\n", DEFAULT_SHM_PATH);
+    fprintf(stderr, "  -l <size>: size of shared memory in bytes. The suffix\n"
+                    "     K, M and G can be used (ex: 1K means 1024).\n"
+                    "     default=%u\n", DEFAULT_SHM_SIZE);
+    fprintf(stderr, "  -n <n_vects>: number of vectors.\n"
+                    "     default=%u\n", DEFAULT_N_VECTORS);
+
+    exit(code);
+}
+
+/* parse the size of shm */
+static int
+parse_size(const char *val_str, size_t *val)
+{
+    char *endptr;
+    unsigned long long tmp;
+
+    errno = 0;
+    tmp = strtoull(val_str, &endptr, 0);
+    if ((errno == ERANGE && tmp == ULLONG_MAX) || (errno != 0 && tmp == 0)) {
+        return -1;
+    }
+    if (endptr == val_str) {
+        return -1;
+    }
+    if (endptr[0] == 'K' && endptr[1] == '\0') {
+        tmp *= 1024;
+    } else if (endptr[0] == 'M' && endptr[1] == '\0') {
+        tmp *= 1024 * 1024;
+    } else if (endptr[0] == 'G' && endptr[1] == '\0') {
+        tmp *= 1024 * 1024 * 1024;
+    } else if (endptr[0] != '\0') {
+        return -1;
+    }
+
+    *val = tmp;
+    return 0;
+}
+
+/* parse an unsigned int */
+static int
+parse_uint(const char *val_str, unsigned *val)
+{
+    char *endptr;
+    unsigned long tmp;
+
+    errno = 0;
+    tmp = strtoul(val_str, &endptr, 0);
+    if ((errno == ERANGE && tmp == ULONG_MAX) || (errno != 0 && tmp == 0)) {
+        return -1;
+    }
+    if (endptr == val_str || endptr[0] != '\0') {
+        return -1;
+    }
+    *val = tmp;
+    return 0;
+}
+
+/* parse the program arguments, exit on error */
+static void
+parse_args(struct ivshmem_server_args *args, int argc, char *argv[])
+{
+    int c;
+
+    while ((c = getopt(argc, argv,
+                       "h"  /* help */
+                       "v"  /* verbose */
+                       "F"  /* foreground */
+                       "p:" /* pid_file */
+                       "S:" /* unix_socket_path */
+                       "m:" /* shm_path */
+                       "l:" /* shm_size */
+                       "n:" /* n_vectors */
+                      )) != -1) {
+
+        switch (c) {
+        case 'h': /* help */
+            usage(argv[0], 0);
+            break;
+
+        case 'v': /* verbose */
+            args->verbose = 1;
+            break;
+
+        case 'F': /* foreground */
+            args->foreground = 1;
+            break;
+
+        case 'p': /* pid_file */
+            args->pid_file = strdup(optarg);
+            break;
+
+        case 'S': /* unix_socket_path */
+            args->unix_socket_path = strdup(optarg);
+            break;
+
+        case 'm': /* shm_path */
+            args->shm_path = strdup(optarg);
+            break;
+
+        case 'l': /* shm_size */
+            if (parse_size(optarg, &args->shm_size) < 0) {
+                fprintf(stderr, "cannot parse shm size\n");
+                usage(argv[0], 1);
+            }
+            break;
+
+        case 'n': /* n_vectors */
+            if (parse_uint(optarg, &args->n_vectors) < 0) {
+                fprintf(stderr, "cannot parse n_vectors\n");
+                usage(argv[0], 1);
+            }
+            break;
+
+        default:
+            usage(argv[0], 1);
+            break;
+        }
+    }
+
+    if (args->n_vectors > IVSHMEM_SERVER_MAX_VECTORS) {
+        fprintf(stderr, "too many requested vectors (max is %d)\n",
+                IVSHMEM_SERVER_MAX_VECTORS);
+        usage(argv[0], 1);
+    }
+
+    if (args->verbose == 1 && args->foreground == 0) {
+        fprintf(stderr, "cannot use verbose in daemon mode\n");
+        usage(argv[0], 1);
+    }
+}
+
+/* wait for events on listening server unix socket and connected client
+ * sockets */
+int
+poll_events(struct ivshmem_server *server)
+{
+    fd_set fds;
+    int ret, maxfd;
+
+    while (1) {
+
+        FD_ZERO(&fds);
+        maxfd = 0;
+        ivshmem_server_get_fds(server, &fds, &maxfd);
+
+        ret = select(maxfd, &fds, NULL, NULL, NULL);
+
+        if (ret < 0) {
+            if (errno == EINTR) {
+                continue;
+            }
+
+            fprintf(stderr, "select error: %s\n", strerror(errno));
+            break;
+        }
+        if (ret == 0) {
+            continue;
+        }
+
+        if (ivshmem_server_handle_fds(server, &fds, maxfd) < 0) {
+            fprintf(stderr, "ivshmem_server_handle_fds() failed\n");
+            break;
+        }
+    }
+
+    return ret;
+}
+
+int
+main(int argc, char *argv[])
+{
+    struct ivshmem_server server;
+    struct sigaction sa;
+    struct ivshmem_server_args args = {
+        .verbose = DEFAULT_VERBOSE,
+        .foreground = DEFAULT_FOREGROUND,
+        .pid_file = DEFAULT_PID_FILE,
+        .unix_socket_path = DEFAULT_UNIX_SOCK_PATH,
+        .shm_path = DEFAULT_SHM_PATH,
+        .shm_size = DEFAULT_SHM_SIZE,
+        .n_vectors = DEFAULT_N_VECTORS,
+    };
+
+    /* parse arguments, will exit on error */
+    parse_args(&args, argc, argv);
+
+    /* Ignore SIGPIPE, see this link for more info:
+     * http://www.mail-archive.com/libevent-users@monkey.org/msg01606.html */
+    sa.sa_handler = SIG_IGN;
+    sa.sa_flags = 0;
+    if (sigemptyset(&sa.sa_mask) == -1 ||
+        sigaction(SIGPIPE, &sa, 0) == -1) {
+        perror("failed to ignore SIGPIPE; sigaction");
+        return 1;
+    }
+
+    /* init the ivshmem server structure */
+    if (ivshmem_server_init(&server, args.unix_socket_path, args.shm_path,
+                            args.shm_size, args.n_vectors, args.verbose) < 0) {
+        fprintf(stderr, "cannot init server\n");
+        return 1;
+    }
+
+    /* start the ivshmem server (open shm & unix socket) */
+    if (ivshmem_server_start(&server) < 0) {
+        fprintf(stderr, "cannot bind\n");
+        return 1;
+    }
+
+    /* daemonize if asked to */
+    if (!args.foreground) {
+        FILE *fp;
+
+        if (daemon(1, 1) < 0) {
+            fprintf(stderr, "cannot daemonize: %s\n", strerror(errno));
+            return 1;
+        }
+
+        /* write pid file */
+        fp = fopen(args.pid_file, "w");
+        if (fp == NULL) {
+            fprintf(stderr, "cannot write pid file: %s\n", strerror(errno));
+            return 1;
+        }
+
+        fprintf(fp, "%d\n", (int) getpid());
+        fclose(fp);
+    }
+
+    poll_events(&server);
+
+    fprintf(stdout, "server disconnected\n");
+    ivshmem_server_close(&server);
+
+    return 0;
+}
diff --git a/qemu-doc.texi b/qemu-doc.texi
index 2b232ae..380d573 100644
--- a/qemu-doc.texi
+++ b/qemu-doc.texi
@@ -1250,9 +1250,13 @@ is qemu.git/contrib/ivshmem-server.  An example syntax when using the shared
 memory server is:
 
 @example
-qemu-system-i386 -device ivshmem,size=<size in format accepted by -m>[,chardev=<id>]
-                 [,msi=on][,ioeventfd=on][,vectors=n][,role=peer|master]
-qemu-system-i386 -chardev socket,path=<path>,id=<id>
+# First start the ivshmem server once and for all
+ivshmem-server -p <pidfile> -S <path> -m <shm name> -l <shm size> -n <vectors n>
+
+# Then start your qemu instances with matching arguments
+qemu-system-i386 -device ivshmem,size=<shm size>,vectors=<vectors n>,chardev=<id>
+                 [,msi=on][,ioeventfd=on][,role=peer|master]
+                 -chardev socket,path=<path>,id=<id>
 @end example
 
 When using the server, the guest will be assigned a VM ID (>=0) that allows guests
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [Qemu-devel] [PATCH v3 1/2] contrib: add ivshmem client and server
@ 2014-08-08  8:55   ` David Marchand
  0 siblings, 0 replies; 36+ messages in thread
From: David Marchand @ 2014-08-08  8:55 UTC (permalink / raw)
  To: qemu-devel
  Cc: Olivier Matz, kvm, claudio.fontana, armbru, pbonzini, jani.kokkonen, cam

When using ivshmem devices, notifications between guests can be sent as
interrupts using an ivshmem-server (typical use is described in the
documentation). The client is provided as a debug tool.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Signed-off-by: David Marchand <david.marchand@6wind.com>
---
 contrib/ivshmem-client/Makefile         |   29 +++
 contrib/ivshmem-client/ivshmem-client.c |  418 ++++++++++++++++++++++++++++++
 contrib/ivshmem-client/ivshmem-client.h |  238 ++++++++++++++++++
 contrib/ivshmem-client/main.c           |  246 ++++++++++++++++++
 contrib/ivshmem-server/Makefile         |   29 +++
 contrib/ivshmem-server/ivshmem-server.c |  420 +++++++++++++++++++++++++++++++
 contrib/ivshmem-server/ivshmem-server.h |  185 ++++++++++++++
 contrib/ivshmem-server/main.c           |  296 ++++++++++++++++++++++
 qemu-doc.texi                           |   10 +-
 9 files changed, 1868 insertions(+), 3 deletions(-)
 create mode 100644 contrib/ivshmem-client/Makefile
 create mode 100644 contrib/ivshmem-client/ivshmem-client.c
 create mode 100644 contrib/ivshmem-client/ivshmem-client.h
 create mode 100644 contrib/ivshmem-client/main.c
 create mode 100644 contrib/ivshmem-server/Makefile
 create mode 100644 contrib/ivshmem-server/ivshmem-server.c
 create mode 100644 contrib/ivshmem-server/ivshmem-server.h
 create mode 100644 contrib/ivshmem-server/main.c

diff --git a/contrib/ivshmem-client/Makefile b/contrib/ivshmem-client/Makefile
new file mode 100644
index 0000000..eee97c6
--- /dev/null
+++ b/contrib/ivshmem-client/Makefile
@@ -0,0 +1,29 @@
+# Copyright 6WIND S.A., 2014
+#
+# This work is licensed under the terms of the GNU GPL, version 2 or
+# (at your option) any later version.  See the COPYING file in the
+# top-level directory.
+
+S ?= $(CURDIR)
+O ?= $(CURDIR)
+
+CFLAGS += -Wall -Wextra -Werror -g
+LDFLAGS +=
+LDLIBS += -lrt
+
+VPATH = $(S)
+PROG = ivshmem-client
+OBJS := $(O)/ivshmem-client.o
+OBJS += $(O)/main.o
+
+$(O)/%.o: %.c
+	$(CC) $(CFLAGS) -o $@ -c $<
+
+$(O)/$(PROG): $(OBJS)
+	$(CC) $(LDFLAGS) -o $@ $^ $(LDLIBS)
+
+.PHONY: all
+all: $(O)/$(PROG)
+
+clean:
+	rm -f $(OBJS) $(O)/$(PROG)
diff --git a/contrib/ivshmem-client/ivshmem-client.c b/contrib/ivshmem-client/ivshmem-client.c
new file mode 100644
index 0000000..2166b64
--- /dev/null
+++ b/contrib/ivshmem-client/ivshmem-client.c
@@ -0,0 +1,418 @@
+/*
+ * Copyright 6WIND S.A., 2014
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * (at your option) any later version.  See the COPYING file in the
+ * top-level directory.
+ */
+
+#include <stdio.h>
+#include <stdint.h>
+#include <stdlib.h>
+#include <errno.h>
+#include <string.h>
+#include <signal.h>
+#include <unistd.h>
+#include <inttypes.h>
+#include <sys/queue.h>
+
+#include <sys/types.h>
+#include <sys/socket.h>
+#include <sys/un.h>
+
+#include "ivshmem-client.h"
+
+/* log a message on stdout if verbose=1 */
+#define debug_log(client, fmt, ...) do { \
+        if ((client)->verbose) {         \
+            printf(fmt, ## __VA_ARGS__); \
+        }                                \
+    } while (0)
+
+/* read message from the unix socket */
+static int
+read_one_msg(struct ivshmem_client *client, long *index, int *fd)
+{
+    int ret;
+    struct msghdr msg;
+    struct iovec iov[1];
+    union {
+        struct cmsghdr cmsg;
+        char control[CMSG_SPACE(sizeof(int))];
+    } msg_control;
+    struct cmsghdr *cmsg;
+
+    iov[0].iov_base = index;
+    iov[0].iov_len = sizeof(*index);
+
+    memset(&msg, 0, sizeof(msg));
+    msg.msg_iov = iov;
+    msg.msg_iovlen = 1;
+    msg.msg_control = &msg_control;
+    msg.msg_controllen = sizeof(msg_control);
+
+    ret = recvmsg(client->sock_fd, &msg, 0);
+    if (ret < 0) {
+        debug_log(client, "cannot read message: %s\n", strerror(errno));
+        return -1;
+    }
+    if (ret == 0) {
+        debug_log(client, "lost connection to server\n");
+        return -1;
+    }
+
+    *fd = -1;
+
+    for (cmsg = CMSG_FIRSTHDR(&msg); cmsg; cmsg = CMSG_NXTHDR(&msg, cmsg)) {
+
+        if (cmsg->cmsg_len != CMSG_LEN(sizeof(int)) ||
+            cmsg->cmsg_level != SOL_SOCKET ||
+            cmsg->cmsg_type != SCM_RIGHTS) {
+            continue;
+        }
+
+        memcpy(fd, CMSG_DATA(cmsg), sizeof(*fd));
+    }
+
+    return 0;
+}
+
+/* free a peer when the server advertises a disconnection or when the
+ * client is freed */
+static void
+free_peer(struct ivshmem_client *client, struct ivshmem_client_peer *peer)
+{
+    unsigned vector;
+
+    TAILQ_REMOVE(&client->peer_list, peer, next);
+    for (vector = 0; vector < peer->vectors_count; vector++) {
+        close(peer->vectors[vector]);
+    }
+
+    free(peer);
+}
+
+/* handle message coming from server (new peer, new vectors) */
+static int
+handle_server_msg(struct ivshmem_client *client)
+{
+    struct ivshmem_client_peer *peer;
+    long peer_id;
+    int ret, fd;
+
+    ret = read_one_msg(client, &peer_id, &fd);
+    if (ret < 0) {
+        return -1;
+    }
+
+    /* can return a peer or the local client */
+    peer = ivshmem_client_search_peer(client, peer_id);
+
+    /* delete peer */
+    if (fd == -1) {
+
+        if (peer == NULL || peer == &client->local) {
+            debug_log(client, "received delete for invalid peer %ld\n",
+                      peer_id);
+            return -1;
+        }
+
+        debug_log(client, "delete peer id = %ld\n", peer_id);
+        free_peer(client, peer);
+        return 0;
+    }
+
+    /* new peer */
+    if (peer == NULL) {
+        peer = malloc(sizeof(*peer));
+        if (peer == NULL) {
+            debug_log(client, "cannot allocate new peer\n");
+            return -1;
+        }
+        memset(peer, 0, sizeof(*peer));
+        peer->id = peer_id;
+        peer->vectors_count = 0;
+        TAILQ_INSERT_TAIL(&client->peer_list, peer, next);
+        debug_log(client, "new peer id = %ld\n", peer_id);
+    }
+
+    /* new vector */
+    if (peer->vectors_count >= IVSHMEM_CLIENT_MAX_VECTORS) {
+        debug_log(client, "too many vectors for peer id %ld\n", peer->id);
+        close(fd);
+        return -1;
+    }
+    debug_log(client, "  new vector %u (fd=%d) for peer id %ld\n",
+              peer->vectors_count, fd, peer->id);
+    peer->vectors[peer->vectors_count] = fd;
+    peer->vectors_count++;
+
+    return 0;
+}
+
+/* init a new ivshmem client */
+int
+ivshmem_client_init(struct ivshmem_client *client, const char *unix_sock_path,
+                    ivshmem_client_notif_cb_t notif_cb, void *notif_arg,
+                    int verbose)
+{
+    unsigned i;
+
+    memset(client, 0, sizeof(*client));
+
+    snprintf(client->unix_sock_path, sizeof(client->unix_sock_path),
+             "%s", unix_sock_path);
+
+    for (i = 0; i < IVSHMEM_CLIENT_MAX_VECTORS; i++) {
+        client->local.vectors[i] = -1;
+    }
+
+    TAILQ_INIT(&client->peer_list);
+    client->local.id = -1;
+
+    client->notif_cb = notif_cb;
+    client->notif_arg = notif_arg;
+    client->verbose = verbose;
+
+    return 0;
+}
+
+/* create and connect to the unix socket */
+int
+ivshmem_client_connect(struct ivshmem_client *client)
+{
+    struct sockaddr_un sun;
+    int fd;
+    long tmp;
+
+    debug_log(client, "connect to server %s\n", client->unix_sock_path);
+
+    client->sock_fd = socket(AF_UNIX, SOCK_STREAM, 0);
+    if (client->sock_fd < 0) {
+        debug_log(client, "cannot create socket: %s\n", strerror(errno));
+        return -1;
+    }
+
+    sun.sun_family = AF_UNIX;
+    snprintf(sun.sun_path, sizeof(sun.sun_path), "%s", client->unix_sock_path);
+    if (connect(client->sock_fd, (struct sockaddr *)&sun, sizeof(sun)) < 0) {
+        debug_log(client, "cannot connect to %s: %s\n", sun.sun_path,
+                  strerror(errno));
+        close(client->sock_fd);
+        client->sock_fd = -1;
+        return -1;
+    }
+
+    /* first, we expect our index + a fd == -1 */
+    if (read_one_msg(client, &client->local.id, &fd) < 0 ||
+        client->local.id < 0 || fd != -1) {
+        debug_log(client, "cannot read from server\n");
+        close(client->sock_fd);
+        client->sock_fd = -1;
+        return -1;
+    }
+    debug_log(client, "our_id=%ld\n", client->local.id);
+
+    /* now, we expect shared mem fd + a -1 index, note that shm fd
+     * is not used */
+    if (read_one_msg(client, &tmp, &fd) < 0 ||
+        tmp != -1 || fd < 0) {
+        debug_log(client, "cannot read from server (2)\n");
+        close(client->sock_fd);
+        client->sock_fd = -1;
+        return -1;
+    }
+    debug_log(client, "shm_fd=%d\n", fd);
+
+    return 0;
+}
+
+/* close connection to the server, and free all peer structures */
+void
+ivshmem_client_close(struct ivshmem_client *client)
+{
+    struct ivshmem_client_peer *peer;
+    unsigned i;
+
+    debug_log(client, "close client\n");
+
+    while ((peer = TAILQ_FIRST(&client->peer_list)) != NULL) {
+        free_peer(client, peer);
+    }
+
+    close(client->sock_fd);
+    client->sock_fd = -1;
+    client->local.id = -1;
+    for (i = 0; i < IVSHMEM_CLIENT_MAX_VECTORS; i++) {
+        client->local.vectors[i] = -1;
+    }
+}
+
+/* get the fd_set according to the unix socket and peer list */
+void
+ivshmem_client_get_fds(const struct ivshmem_client *client, fd_set *fds,
+                       int *maxfd)
+{
+    int fd;
+    unsigned vector;
+
+    FD_SET(client->sock_fd, fds);
+    if (client->sock_fd >= *maxfd) {
+        *maxfd = client->sock_fd + 1;
+    }
+
+    for (vector = 0; vector < client->local.vectors_count; vector++) {
+        fd = client->local.vectors[vector];
+        FD_SET(fd, fds);
+        if (fd >= *maxfd) {
+            *maxfd = fd + 1;
+        }
+    }
+}
+
+/* handle events from eventfd: just print a message on notification */
+static int
+handle_event(struct ivshmem_client *client, const fd_set *cur, int maxfd)
+{
+    struct ivshmem_client_peer *peer;
+    uint64_t kick;
+    unsigned i;
+    int ret;
+
+    peer = &client->local;
+
+    for (i = 0; i < peer->vectors_count; i++) {
+        if (peer->vectors[i] >= maxfd || !FD_ISSET(peer->vectors[i], cur)) {
+            continue;
+        }
+
+        ret = read(peer->vectors[i], &kick, sizeof(kick));
+        if (ret < 0) {
+            return ret;
+        }
+        if (ret != sizeof(kick)) {
+            debug_log(client, "invalid read size = %d\n", ret);
+            errno = EINVAL;
+            return -1;
+        }
+        debug_log(client, "received event on fd %d vector %u: %" PRIu64 "\n",
+                  peer->vectors[i], i, kick);
+        if (client->notif_cb != NULL) {
+            client->notif_cb(client, peer, i, client->notif_arg);
+        }
+    }
+
+    return 0;
+}
+
+/* read and handle new messages on the given fd_set */
+int
+ivshmem_client_handle_fds(struct ivshmem_client *client, fd_set *fds, int maxfd)
+{
+    if (client->sock_fd < maxfd && FD_ISSET(client->sock_fd, fds) &&
+        handle_server_msg(client) < 0 && errno != EINTR) {
+        debug_log(client, "handle_server_msg() failed\n");
+        return -1;
+    } else if (handle_event(client, fds, maxfd) < 0 && errno != EINTR) {
+        debug_log(client, "handle_event() failed\n");
+        return -1;
+    }
+
+    return 0;
+}
+
+/* send a notification on a vector of a peer */
+int
+ivshmem_client_notify(const struct ivshmem_client *client,
+                      const struct ivshmem_client_peer *peer, unsigned vector)
+{
+    uint64_t kick;
+    int fd;
+
+    if (vector >= peer->vectors_count) {
+        debug_log(client, "invalid vector %u on peer %ld\n", vector, peer->id);
+        return -1;
+    }
+    fd = peer->vectors[vector];
+    debug_log(client, "notify peer %ld on vector %d, fd %d\n", peer->id, vector,
+              fd);
+
+    kick = 1;
+    if (write(fd, &kick, sizeof(kick)) != sizeof(kick)) {
+        fprintf(stderr, "could not write to %d: %s\n", peer->vectors[vector],
+                strerror(errno));
+        return -1;
+    }
+    return 0;
+}
+
+/* send a notification to all vectors of a peer */
+int
+ivshmem_client_notify_all_vects(const struct ivshmem_client *client,
+                                const struct ivshmem_client_peer *peer)
+{
+    unsigned vector;
+    int ret = 0;
+
+    for (vector = 0; vector < peer->vectors_count; vector++) {
+        if (ivshmem_client_notify(client, peer, vector) < 0) {
+            ret = -1;
+        }
+    }
+
+    return ret;
+}
+
+/* send a notification to all peers */
+int
+ivshmem_client_notify_broadcast(const struct ivshmem_client *client)
+{
+    struct ivshmem_client_peer *peer;
+    int ret = 0;
+
+    TAILQ_FOREACH(peer, &client->peer_list, next) {
+        if (ivshmem_client_notify_all_vects(client, peer) < 0) {
+            ret = -1;
+        }
+    }
+
+    return ret;
+}
+
+/* lookup peer from its id */
+struct ivshmem_client_peer *
+ivshmem_client_search_peer(struct ivshmem_client *client, long peer_id)
+{
+    struct ivshmem_client_peer *peer;
+
+    if (peer_id == client->local.id) {
+        return &client->local;
+    }
+
+    TAILQ_FOREACH(peer, &client->peer_list, next) {
+        if (peer->id == peer_id) {
+            return peer;
+        }
+    }
+    return NULL;
+}
+
+/* dump our info and the list of peers and their vectors on stdout */
+void
+ivshmem_client_dump(const struct ivshmem_client *client)
+{
+    const struct ivshmem_client_peer *peer;
+    unsigned vector;
+
+    /* dump local info */
+    peer = &client->local;
+    printf("our_id = %ld\n", peer->id);
+    for (vector = 0; vector < peer->vectors_count; vector++) {
+        printf("  vector %d is enabled (fd=%d)\n", vector,
+               peer->vectors[vector]);
+    }
+
+    /* dump peers */
+    TAILQ_FOREACH(peer, &client->peer_list, next) {
+        printf("peer_id = %ld\n", peer->id);
+
+        for (vector = 0; vector < peer->vectors_count; vector++) {
+            printf("  vector %d is enabled (fd=%d)\n", vector,
+                   peer->vectors[vector]);
+        }
+    }
+}
diff --git a/contrib/ivshmem-client/ivshmem-client.h b/contrib/ivshmem-client/ivshmem-client.h
new file mode 100644
index 0000000..d27222b
--- /dev/null
+++ b/contrib/ivshmem-client/ivshmem-client.h
@@ -0,0 +1,238 @@
+/*
+ * Copyright 6WIND S.A., 2014
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * (at your option) any later version.  See the COPYING file in the
+ * top-level directory.
+ */
+
+#ifndef IVSHMEM_CLIENT_H
+#define IVSHMEM_CLIENT_H
+
+/**
+ * This file provides helpers to implement an ivshmem client. It is used
+ * on the host to ask QEMU to send an interrupt to an ivshmem PCI device in a
+ * guest. QEMU also implements an ivshmem client similar to this one; both
+ * connect to an ivshmem server.
+ *
+ * A standalone ivshmem client based on this file is provided for debug/test
+ * purposes.
+ */
+
+#include <limits.h>
+#include <sys/select.h>
+#include <sys/queue.h>
+
+/**
+ * Maximum number of notification vectors supported by the client
+ */
+#define IVSHMEM_CLIENT_MAX_VECTORS 64
+
+/**
+ * Structure storing a peer
+ *
+ * Each time a client connects to an ivshmem server, it is advertised to
+ * all connected clients through the unix socket. When our ivshmem
+ * client receives a notification, it creates an ivshmem_client_peer
+ * structure to store the information about this peer.
+ *
+ * This structure is also used to store the information of our own
+ * client in (struct ivshmem_client)->local.
+ */
+struct ivshmem_client_peer {
+    TAILQ_ENTRY(ivshmem_client_peer) next;   /**< next in list */
+    long id;                                 /**< the id of the peer */
+    int vectors[IVSHMEM_CLIENT_MAX_VECTORS]; /**< one fd per vector */
+    unsigned vectors_count;                  /**< number of vectors */
+};
+TAILQ_HEAD(ivshmem_client_peer_list, ivshmem_client_peer);
+
+struct ivshmem_client;
+
+/**
+ * Typedef of callback function used when our ivshmem_client receives a
+ * notification from a peer.
+ */
+typedef void (*ivshmem_client_notif_cb_t)(
+    const struct ivshmem_client *client,
+    const struct ivshmem_client_peer *peer,
+    unsigned vect, void *arg);
+
+/**
+ * Structure describing an ivshmem client
+ *
+ * This structure stores all information related to our client: the name
+ * of the server unix socket, the list of peers advertised by the
+ * server, our own client information, and a pointer to the notification
+ * callback function used when we receive a notification from a peer.
+ */
+struct ivshmem_client {
+    char unix_sock_path[PATH_MAX];        /**< path to unix sock */
+    int sock_fd;                          /**< unix sock filedesc */
+
+    struct ivshmem_client_peer_list peer_list;  /**< list of peers */
+    struct ivshmem_client_peer local;   /**< our own infos */
+
+    ivshmem_client_notif_cb_t notif_cb; /**< notification callback */
+    void *notif_arg;                      /**< notification argument */
+
+    int verbose;                          /**< true to enable debug */
+};
+
+/**
+ * Initialize an ivshmem client
+ *
+ * @param client
+ *   A pointer to an uninitialized ivshmem_client structure
+ * @param unix_sock_path
+ *   The pointer to the unix socket file name
+ * @param notif_cb
+ *   If not NULL, the pointer to the function to be called when our
+ *   ivshmem_client receives a notification from a peer
+ * @param notif_arg
+ *   Opaque pointer given as-is to the notification callback function
+ * @param verbose
+ *   True to enable debug
+ *
+ * @return
+ *   0 on success, or a negative value on error
+ */
+int ivshmem_client_init(struct ivshmem_client *client,
+    const char *unix_sock_path, ivshmem_client_notif_cb_t notif_cb,
+    void *notif_arg, int verbose);
+
+/**
+ * Connect to the server
+ *
+ * Connect to the server unix socket, and read the first initial
+ * messages sent by the server, giving the ID of the client and the file
+ * descriptor of the shared memory.
+ *
+ * @param client
+ *   The ivshmem client
+ *
+ * @return
+ *   0 on success, or a negative value on error
+ */
+int ivshmem_client_connect(struct ivshmem_client *client);
+
+/**
+ * Close connection to the server and free all peer structures
+ *
+ * @param client
+ *   The ivshmem client
+ */
+void ivshmem_client_close(struct ivshmem_client *client);
+
+/**
+ * Fill a fd_set with file descriptors to be monitored
+ *
+ * This function will fill a fd_set with all file descriptors
+ * that must be polled (unix server socket and peers eventfd). The
+ * function will not initialize the fd_set, it is up to the caller
+ * to do this.
+ *
+ * @param client
+ *   The ivshmem client
+ * @param fds
+ *   The fd_set to be updated
+ * @param maxfd
+ *   Must be set to the max file descriptor + 1 in fd_set. This value is
+ *   updated if this function adds a greater fd to fd_set.
+ */
+void ivshmem_client_get_fds(const struct ivshmem_client *client, fd_set *fds,
+                            int *maxfd);
+
+/**
+ * Read and handle new messages
+ *
+ * Given a fd_set filled by select(), handle incoming messages from
+ * server or peers.
+ *
+ * @param client
+ *   The ivshmem client
+ * @param fds
+ *   The fd_set containing the file descriptors to be checked. Note
+ *   that file descriptors that are not related to our client are
+ *   ignored.
+ * @param maxfd
+ *   The maximum fd in fd_set, plus one.
+ *
+ * @return
+ *   0 on success, negative value on failure.
+ */
+int ivshmem_client_handle_fds(struct ivshmem_client *client, fd_set *fds,
+    int maxfd);
+
+/**
+ * Send a notification to a vector of a peer
+ *
+ * @param client
+ *   The ivshmem client
+ * @param peer
+ *   The peer to be notified
+ * @param vector
+ *   The number of the vector
+ *
+ * @return
+ *   0 on success, and a negative error on failure.
+ */
+int ivshmem_client_notify(const struct ivshmem_client *client,
+    const struct ivshmem_client_peer *peer, unsigned vector);
+
+/**
+ * Send a notification to all vectors of a peer
+ *
+ * @param client
+ *   The ivshmem client
+ * @param peer
+ *   The peer to be notified
+ *
+ * @return
+ *   0 on success, and a negative error on failure (at least one
+ *   notification failed).
+ */
+int ivshmem_client_notify_all_vects(const struct ivshmem_client *client,
+    const struct ivshmem_client_peer *peer);
+
+/**
+ * Broadcast a notification to all vectors of all peers
+ *
+ * @param client
+ *   The ivshmem client
+ *
+ * @return
+ *   0 on success, and a negative error on failure (at least one
+ *   notification failed).
+ */
+int ivshmem_client_notify_broadcast(const struct ivshmem_client *client);
+
+/**
+ * Search a peer from its identifier
+ *
+ * Return the peer structure from its peer_id. If the given peer_id is
+ * the local id, the function returns the local peer structure.
+ *
+ * @param client
+ *   The ivshmem client
+ * @param peer_id
+ *   The identifier of the peer structure
+ *
+ * @return
+ *   The peer structure, or NULL if not found
+ */
+struct ivshmem_client_peer *
+ivshmem_client_search_peer(struct ivshmem_client *client, long peer_id);
+
+/**
+ * Dump information of this ivshmem client on stdout
+ *
+ * Dump the id and the vectors of the given ivshmem client and the list
+ * of its peers and their vectors on stdout.
+ *
+ * @param client
+ *   The ivshmem client
+ */
+void ivshmem_client_dump(const struct ivshmem_client *client);
+
+#endif /* IVSHMEM_CLIENT_H */
diff --git a/contrib/ivshmem-client/main.c b/contrib/ivshmem-client/main.c
new file mode 100644
index 0000000..0d53f55
--- /dev/null
+++ b/contrib/ivshmem-client/main.c
@@ -0,0 +1,246 @@
+/*
+ * Copyright 6WIND S.A., 2014
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * (at your option) any later version.  See the COPYING file in the
+ * top-level directory.
+ */
+
+#include <stdio.h>
+#include <stdint.h>
+#include <stdlib.h>
+#include <errno.h>
+#include <string.h>
+#include <signal.h>
+#include <unistd.h>
+#include <inttypes.h>
+#include <getopt.h>
+
+#include "ivshmem-client.h"
+
+#define DEFAULT_VERBOSE        0
+#define DEFAULT_UNIX_SOCK_PATH "/tmp/ivshmem_socket"
+
+struct ivshmem_client_args {
+    int verbose;
+    char *unix_sock_path;
+};
+
+/* show usage and exit with given error code */
+static void
+usage(const char *name, int code)
+{
+    fprintf(stderr, "%s [opts]\n", name);
+    fprintf(stderr, "  -h: show this help\n");
+    fprintf(stderr, "  -v: verbose mode\n");
+    fprintf(stderr, "  -S <unix_sock_path>: path to the unix socket\n"
+                    "     to connect to.\n"
+                    "     default=%s\n", DEFAULT_UNIX_SOCK_PATH);
+    exit(code);
+}
+
+/* parse the program arguments, exit on error */
+static void
+parse_args(struct ivshmem_client_args *args, int argc, char *argv[])
+{
+    int c;
+
+    while ((c = getopt(argc, argv,
+                       "h"  /* help */
+                       "v"  /* verbose */
+                       "S:" /* unix_sock_path */
+                      )) != -1) {
+
+        switch (c) {
+        case 'h': /* help */
+            usage(argv[0], 0);
+            break;
+
+        case 'v': /* verbose */
+            args->verbose = 1;
+            break;
+
+        case 'S': /* unix_sock_path */
+            args->unix_sock_path = strdup(optarg);
+            break;
+
+        default:
+            usage(argv[0], 1);
+            break;
+        }
+    }
+}
+
+/* show command line help */
+static void
+cmdline_help(void)
+{
+    printf("dump: dump peers (including us)\n"
+           "int <peer> <vector>: notify one vector on a peer\n"
+           "int <peer> all: notify all vectors of a peer\n"
+           "int all: notify all vectors of all peers (except us)\n");
+}
+
+/* read stdin and handle commands */
+static int
+handle_stdin_command(struct ivshmem_client *client)
+{
+    struct ivshmem_client_peer *peer;
+    char buf[128];
+    char *s, *token;
+    int ret;
+    int peer_id, vector;
+
+    memset(buf, 0, sizeof(buf));
+    ret = read(0, buf, sizeof(buf) - 1);
+    if (ret < 0) {
+        return -1;
+    }
+
+    s = buf;
+    while ((token = strsep(&s, "\n\r;")) != NULL) {
+        if (!strcmp(token, "")) {
+            continue;
+        }
+        if (!strcmp(token, "?") || !strcmp(token, "help")) {
+            cmdline_help();
+        } else if (!strcmp(token, "dump")) {
+            ivshmem_client_dump(client);
+        } else if (!strcmp(token, "int all")) {
+            ivshmem_client_notify_broadcast(client);
+        } else if (sscanf(token, "int %d %d", &peer_id, &vector) == 2) {
+            peer = ivshmem_client_search_peer(client, peer_id);
+            if (peer == NULL) {
+                printf("cannot find peer_id = %d\n", peer_id);
+                continue;
+            }
+            ivshmem_client_notify(client, peer, vector);
+        } else if (sscanf(token, "int %d all", &peer_id) == 1) {
+            peer = ivshmem_client_search_peer(client, peer_id);
+            if (peer == NULL) {
+                printf("cannot find peer_id = %d\n", peer_id);
+                continue;
+            }
+            ivshmem_client_notify_all_vects(client, peer);
+        } else {
+            printf("invalid command, type help\n");
+        }
+    }
+
+    printf("cmd> ");
+    fflush(stdout);
+    return 0;
+}
+
+/* listen on stdin (command line), on unix socket (notifications of new
+ * and dead peers), and on eventfd (IRQ request) */
+int
+poll_events(struct ivshmem_client *client)
+{
+    fd_set fds;
+    int ret, maxfd;
+
+    while (1) {
+
+        FD_ZERO(&fds);
+        FD_SET(0, &fds); /* add stdin in fd_set */
+        maxfd = 1;
+
+        ivshmem_client_get_fds(client, &fds, &maxfd);
+
+        ret = select(maxfd, &fds, NULL, NULL, NULL);
+        if (ret < 0) {
+            if (errno == EINTR) {
+                continue;
+            }
+
+            fprintf(stderr, "select error: %s\n", strerror(errno));
+            break;
+        }
+        if (ret == 0) {
+            continue;
+        }
+
+        if (FD_ISSET(0, &fds) &&
+            handle_stdin_command(client) < 0 && errno != EINTR) {
+            fprintf(stderr, "handle_stdin_command() failed\n");
+            break;
+        }
+
+        if (ivshmem_client_handle_fds(client, &fds, maxfd) < 0) {
+            fprintf(stderr, "ivshmem_client_handle_fds() failed\n");
+            break;
+        }
+    }
+
+    return ret;
+}
+
+/* callback when we receive a notification (just display it) */
+void
+notification_cb(const struct ivshmem_client *client,
+                const struct ivshmem_client_peer *peer, unsigned vect,
+                void *arg)
+{
+    (void)client;
+    (void)arg;
+    printf("received notification from peer_id=%ld vector=%u\n",
+           peer->id, vect);
+}
+
+int
+main(int argc, char *argv[])
+{
+    struct sigaction sa;
+    struct ivshmem_client client;
+    struct ivshmem_client_args args = {
+        .verbose = DEFAULT_VERBOSE,
+        .unix_sock_path = DEFAULT_UNIX_SOCK_PATH,
+    };
+
+    /* parse arguments, will exit on error */
+    parse_args(&args, argc, argv);
+
+    /* Ignore SIGPIPE, see this link for more info:
+     * http://www.mail-archive.com/libevent-users@monkey.org/msg01606.html */
+    sa.sa_handler = SIG_IGN;
+    sa.sa_flags = 0;
+    if (sigemptyset(&sa.sa_mask) == -1 ||
+        sigaction(SIGPIPE, &sa, 0) == -1) {
+        perror("failed to ignore SIGPIPE; sigaction");
+        return 1;
+    }
+
+    cmdline_help();
+    printf("cmd> ");
+    fflush(stdout);
+
+    if (ivshmem_client_init(&client, args.unix_sock_path, notification_cb,
+                            NULL, args.verbose) < 0) {
+        fprintf(stderr, "cannot init client\n");
+        return 1;
+    }
+
+    while (1) {
+        if (ivshmem_client_connect(&client) < 0) {
+            fprintf(stderr, "cannot connect to server, retry in 1 second\n");
+            sleep(1);
+            continue;
+        }
+
+        fprintf(stdout, "listen on server socket %d\n", client.sock_fd);
+
+        if (poll_events(&client) == 0) {
+            continue;
+        }
+
+        /* disconnected from server, reset all peers */
+        fprintf(stdout, "disconnected from server\n");
+
+        ivshmem_client_close(&client);
+    }
+
+    return 0;
+}
diff --git a/contrib/ivshmem-server/Makefile b/contrib/ivshmem-server/Makefile
new file mode 100644
index 0000000..26b4a72
--- /dev/null
+++ b/contrib/ivshmem-server/Makefile
@@ -0,0 +1,29 @@
+# Copyright 6WIND S.A., 2014
+#
+# This work is licensed under the terms of the GNU GPL, version 2 or
+# (at your option) any later version.  See the COPYING file in the
+# top-level directory.
+
+S ?= $(CURDIR)
+O ?= $(CURDIR)
+
+CFLAGS += -Wall -Wextra -Werror -g
+LDFLAGS +=
+LDLIBS += -lrt
+
+VPATH = $(S)
+PROG = ivshmem-server
+OBJS := $(O)/ivshmem-server.o
+OBJS += $(O)/main.o
+
+$(O)/%.o: %.c
+	$(CC) $(CFLAGS) -o $@ -c $<
+
+$(O)/$(PROG): $(OBJS)
+	$(CC) $(LDFLAGS) -o $@ $^ $(LDLIBS)
+
+.PHONY: all
+all: $(O)/$(PROG)
+
+clean:
+	rm -f $(OBJS) $(O)/$(PROG)
diff --git a/contrib/ivshmem-server/ivshmem-server.c b/contrib/ivshmem-server/ivshmem-server.c
new file mode 100644
index 0000000..f6497bb
--- /dev/null
+++ b/contrib/ivshmem-server/ivshmem-server.c
@@ -0,0 +1,420 @@
+/*
+ * Copyright 6WIND S.A., 2014
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * (at your option) any later version.  See the COPYING file in the
+ * top-level directory.
+ */
+
+#include <stdio.h>
+#include <stdint.h>
+#include <stdlib.h>
+#include <errno.h>
+#include <string.h>
+#include <signal.h>
+#include <unistd.h>
+#include <inttypes.h>
+#include <fcntl.h>
+
+#include <sys/queue.h>
+#include <sys/mman.h>
+#include <sys/stat.h>
+#include <sys/types.h>
+#include <sys/socket.h>
+#include <sys/un.h>
+#include <sys/eventfd.h>
+
+#include "ivshmem-server.h"
+
+/* log a message on stdout if verbose=1 */
+#define debug_log(server, fmt, ...) do { \
+        if ((server)->verbose) {         \
+            printf(fmt, ## __VA_ARGS__); \
+        }                                \
+    } while (0)
+
+/* iterate over the queue, allowing removal of the current element */
+#ifndef TAILQ_FOREACH_SAFE
+#define TAILQ_FOREACH_SAFE(var, var2, head, field)               \
+    for ((var) = TAILQ_FIRST((head)),                            \
+             (var2) = ((var) ? TAILQ_NEXT((var), field) : NULL); \
+         (var);                                                  \
+         (var) = (var2),                                         \
+             (var2) = ((var2) ? TAILQ_NEXT((var2), field) : NULL))
+#endif
+
+/** maximum size of a huge page, used by ivshmem_ftruncate() */
+#define MAX_HUGEPAGE_SIZE (1024 * 1024 * 1024)
+
+/** default listen backlog (number of sockets not accepted) */
+#define IVSHMEM_SERVER_LISTEN_BACKLOG 10
+
+/* send message to a client unix socket */
+static int
+send_one_msg(int sock_fd, long peer_id, int fd)
+{
+    int ret;
+    struct msghdr msg;
+    struct iovec iov[1];
+    union {
+        struct cmsghdr cmsg;
+        char control[CMSG_SPACE(sizeof(int))];
+    } msg_control;
+    struct cmsghdr *cmsg;
+
+    iov[0].iov_base = &peer_id;
+    iov[0].iov_len = sizeof(peer_id);
+
+    memset(&msg, 0, sizeof(msg));
+    msg.msg_iov = iov;
+    msg.msg_iovlen = 1;
+
+    /* if fd is specified, add it in a cmsg */
+    if (fd >= 0) {
+        msg.msg_control = &msg_control;
+        msg.msg_controllen = sizeof(msg_control);
+        cmsg = CMSG_FIRSTHDR(&msg);
+        cmsg->cmsg_level = SOL_SOCKET;
+        cmsg->cmsg_type = SCM_RIGHTS;
+        cmsg->cmsg_len = CMSG_LEN(sizeof(int));
+        memcpy(CMSG_DATA(cmsg), &fd, sizeof(fd));
+    }
+
+    ret = sendmsg(sock_fd, &msg, 0);
+    if (ret <= 0) {
+        return -1;
+    }
+
+    return 0;
+}
+
+/* free a peer when the server advertises a disconnection or when the
+ * server is freed */
+static void
+free_peer(struct ivshmem_server *server, struct ivshmem_server_peer *peer)
+{
+    unsigned vector;
+    struct ivshmem_server_peer *other_peer;
+
+    debug_log(server, "free peer %ld\n", peer->id);
+    close(peer->sock_fd);
+    TAILQ_REMOVE(&server->peer_list, peer, next);
+
+    /* advertise the deletion to other peers */
+    TAILQ_FOREACH(other_peer, &server->peer_list, next) {
+        send_one_msg(other_peer->sock_fd, peer->id, -1);
+    }
+
+    for (vector = 0; vector < peer->vectors_count; vector++) {
+        close(peer->vectors[vector]);
+    }
+
+    free(peer);
+}
+
+/* send the peer id and the shm_fd just after a new client connection */
+static int
+send_initial_info(struct ivshmem_server *server,
+                  struct ivshmem_server_peer *peer)
+{
+    int ret;
+
+    /* send the peer id to the client */
+    ret = send_one_msg(peer->sock_fd, peer->id, -1);
+    if (ret < 0) {
+        debug_log(server, "cannot send peer id: %s\n", strerror(errno));
+        return -1;
+    }
+
+    /* send the shm_fd */
+    ret = send_one_msg(peer->sock_fd, -1, server->shm_fd);
+    if (ret < 0) {
+        debug_log(server, "cannot send shm fd: %s\n", strerror(errno));
+        return -1;
+    }
+
+    return 0;
+}
+
+/* handle message on listening unix socket (new client connection) */
+static int
+handle_new_conn(struct ivshmem_server *server)
+{
+    struct ivshmem_server_peer *peer, *other_peer;
+    struct sockaddr_un unaddr;
+    socklen_t unaddr_len;
+    int newfd;
+    unsigned i;
+
+    /* accept the incoming connection */
+    unaddr_len = sizeof(unaddr);
+    newfd = accept(server->sock_fd, (struct sockaddr *)&unaddr, &unaddr_len);
+    if (newfd < 0) {
+        debug_log(server, "cannot accept() %s\n", strerror(errno));
+        return -1;
+    }
+
+    debug_log(server, "accept()=%d\n", newfd);
+
+    /* allocate new structure for this peer */
+    peer = malloc(sizeof(*peer));
+    if (peer == NULL) {
+        debug_log(server, "cannot allocate new peer\n");
+        close(newfd);
+        return -1;
+    }
+
+    /* initialize the peer struct, one eventfd per vector */
+    memset(peer, 0, sizeof(*peer));
+    peer->sock_fd = newfd;
+
+    /* get an unused peer id */
+    while (ivshmem_server_search_peer(server, server->cur_id) != NULL) {
+        server->cur_id++;
+    }
+    peer->id = server->cur_id++;
+
+    /* create eventfd, one per vector */
+    peer->vectors_count = server->n_vectors;
+    for (i = 0; i < peer->vectors_count; i++) {
+        peer->vectors[i] = eventfd(0, 0);
+        if (peer->vectors[i] < 0) {
+            debug_log(server, "cannot create eventfd\n");
+            goto fail;
+        }
+    }
+
+    /* send peer id and shm fd */
+    if (send_initial_info(server, peer) < 0) {
+        debug_log(server, "cannot send initial info\n");
+        goto fail;
+    }
+
+    /* advertise the new peer to others */
+    TAILQ_FOREACH(other_peer, &server->peer_list, next) {
+        for (i = 0; i < peer->vectors_count; i++) {
+            send_one_msg(other_peer->sock_fd, peer->id, peer->vectors[i]);
+        }
+    }
+
+    /* advertise the other peers to the new one */
+    TAILQ_FOREACH(other_peer, &server->peer_list, next) {
+        for (i = 0; i < peer->vectors_count; i++) {
+            send_one_msg(peer->sock_fd, other_peer->id, other_peer->vectors[i]);
+        }
+    }
+
+    /* advertise the new peer to itself */
+    for (i = 0; i < peer->vectors_count; i++) {
+        send_one_msg(peer->sock_fd, peer->id, peer->vectors[i]);
+    }
+
+    TAILQ_INSERT_TAIL(&server->peer_list, peer, next);
+    debug_log(server, "new peer id = %ld\n", peer->id);
+    return 0;
+
+fail:
+    while (i--) {
+        close(peer->vectors[i]);
+    }
+    close(newfd);
+    free(peer);
+    return -1;
+}
+
+/* Try to ftruncate the file to the next power of 2 of shmsize.
+ * If that fails, all powers of 2 above shmsize are tested until
+ * we reach the maximum huge page size. This is useful
+ * if the shm file is in a hugetlbfs that cannot be truncated to the
+ * shm_size value. */
+static int
+ivshmem_ftruncate(int fd, unsigned shmsize)
+{
+    int ret;
+
+    /* align shmsize to next power of 2 */
+    shmsize--;
+    shmsize |= shmsize >> 1;
+    shmsize |= shmsize >> 2;
+    shmsize |= shmsize >> 4;
+    shmsize |= shmsize >> 8;
+    shmsize |= shmsize >> 16;
+    shmsize++;
+
+    while (shmsize <= MAX_HUGEPAGE_SIZE) {
+        ret = ftruncate(fd, shmsize);
+        if (ret == 0) {
+            return ret;
+        }
+        shmsize *= 2;
+    }
+
+    return -1;
+}
+
+/* Init a new ivshmem server */
+int
+ivshmem_server_init(struct ivshmem_server *server, const char *unix_sock_path,
+                    const char *shm_path, size_t shm_size, unsigned n_vectors,
+                    int verbose)
+{
+    memset(server, 0, sizeof(*server));
+
+    snprintf(server->unix_sock_path, sizeof(server->unix_sock_path),
+             "%s", unix_sock_path);
+    snprintf(server->shm_path, sizeof(server->shm_path),
+             "%s", shm_path);
+
+    server->shm_size = shm_size;
+    server->n_vectors = n_vectors;
+    server->verbose = verbose;
+
+    TAILQ_INIT(&server->peer_list);
+
+    return 0;
+}
+
+/* open shm, create and bind to the unix socket */
+int
+ivshmem_server_start(struct ivshmem_server *server)
+{
+    struct sockaddr_un sun;
+    int shm_fd, sock_fd;
+
+    /* open shm file */
+    shm_fd = shm_open(server->shm_path, O_CREAT|O_RDWR, S_IRWXU);
+    if (shm_fd < 0) {
+        fprintf(stderr, "cannot open shm file %s: %s\n", server->shm_path,
+                strerror(errno));
+        return -1;
+    }
+    if (ivshmem_ftruncate(shm_fd, server->shm_size) < 0) {
+        fprintf(stderr, "ftruncate(%s) failed: %s\n", server->shm_path,
+                strerror(errno));
+        close(shm_fd);
+        return -1;
+    }
+
+    debug_log(server, "create & bind socket %s\n", server->unix_sock_path);
+
+    /* create the unix listening socket */
+    sock_fd = socket(AF_UNIX, SOCK_STREAM, 0);
+    if (sock_fd < 0) {
+        debug_log(server, "cannot create socket: %s\n", strerror(errno));
+        close(shm_fd);
+        return -1;
+    }
+
+    sun.sun_family = AF_UNIX;
+    snprintf(sun.sun_path, sizeof(sun.sun_path), "%s", server->unix_sock_path);
+    unlink(sun.sun_path);
+    if (bind(sock_fd, (struct sockaddr *)&sun, sizeof(sun)) < 0) {
+        debug_log(server, "cannot bind to %s: %s\n", sun.sun_path,
+                  strerror(errno));
+        close(sock_fd);
+        close(shm_fd);
+        return -1;
+    }
+
+    if (listen(sock_fd, IVSHMEM_SERVER_LISTEN_BACKLOG) < 0) {
+        debug_log(server, "listen() failed: %s\n", strerror(errno));
+        close(sock_fd);
+        close(shm_fd);
+        return -1;
+    }
+
+    server->sock_fd = sock_fd;
+    server->shm_fd = shm_fd;
+
+    return 0;
+}
+
+/* close connections to clients, the unix socket and the shm fd */
+void
+ivshmem_server_close(struct ivshmem_server *server)
+{
+    struct ivshmem_server_peer *peer;
+
+    debug_log(server, "close server\n");
+
+    /* free_peer() removes the peer from the list, so TAILQ_FOREACH
+     * would dereference freed memory here */
+    while ((peer = TAILQ_FIRST(&server->peer_list)) != NULL) {
+        free_peer(server, peer);
+    }
+
+    close(server->sock_fd);
+    close(server->shm_fd);
+    server->sock_fd = -1;
+    server->shm_fd = -1;
+}
+
+/* get the fd_set according to the unix socket and the peer list */
+void
+ivshmem_server_get_fds(const struct ivshmem_server *server, fd_set *fds,
+                       int *maxfd)
+{
+    struct ivshmem_server_peer *peer;
+
+    FD_SET(server->sock_fd, fds);
+    if (server->sock_fd >= *maxfd) {
+        *maxfd = server->sock_fd + 1;
+    }
+
+    TAILQ_FOREACH(peer, &server->peer_list, next) {
+        FD_SET(peer->sock_fd, fds);
+        if (peer->sock_fd >= *maxfd) {
+            *maxfd = peer->sock_fd + 1;
+        }
+    }
+}
+
+/* process incoming messages on the sockets in fd_set */
+int
+ivshmem_server_handle_fds(struct ivshmem_server *server, fd_set *fds, int maxfd)
+{
+    struct ivshmem_server_peer *peer, *peer_next;
+
+    if (server->sock_fd < maxfd && FD_ISSET(server->sock_fd, fds) &&
+        handle_new_conn(server) < 0 && errno != EINTR) {
+        debug_log(server, "handle_new_conn() failed\n");
+        return -1;
+    }
+
+    TAILQ_FOREACH_SAFE(peer, peer_next, &server->peer_list, next) {
+        /* any message from a peer socket results in a close() */
+        debug_log(server, "peer->sock_fd=%d\n", peer->sock_fd);
+        if (peer->sock_fd < maxfd && FD_ISSET(peer->sock_fd, fds)) {
+            free_peer(server, peer);
+        }
+    }
+
+    return 0;
+}
+
+/* lookup peer from its id */
+struct ivshmem_server_peer *
+ivshmem_server_search_peer(struct ivshmem_server *server, long peer_id)
+{
+    struct ivshmem_server_peer *peer;
+
+    TAILQ_FOREACH(peer, &server->peer_list, next) {
+        if (peer->id == peer_id) {
+            return peer;
+        }
+    }
+    return NULL;
+}
+
+/* dump the list of peers and their vectors on stdout */
+void
+ivshmem_server_dump(const struct ivshmem_server *server)
+{
+    const struct ivshmem_server_peer *peer;
+    unsigned vector;
+
+    /* dump peers */
+    TAILQ_FOREACH(peer, &server->peer_list, next) {
+        printf("peer_id = %ld\n", peer->id);
+
+        for (vector = 0; vector < peer->vectors_count; vector++) {
+            printf("  vector %u is enabled (fd=%d)\n", vector,
+                   peer->vectors[vector]);
+        }
+    }
+}
diff --git a/contrib/ivshmem-server/ivshmem-server.h b/contrib/ivshmem-server/ivshmem-server.h
new file mode 100644
index 0000000..cd74bbf
--- /dev/null
+++ b/contrib/ivshmem-server/ivshmem-server.h
@@ -0,0 +1,185 @@
+/*
+ * Copyright 6WIND S.A., 2014
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * (at your option) any later version.  See the COPYING file in the
+ * top-level directory.
+ */
+
+#ifndef _IVSHMEM_SERVER_
+#define _IVSHMEM_SERVER_
+
+/**
+ * The ivshmem server is a daemon that creates a unix socket in listen
+ * mode. The ivshmem clients (qemu or ivshmem-client) connect to this
+ * unix socket. For each client, the server creates several eventfds
+ * (see EVENTFD(2)), one per vector. These fds are transmitted to all
+ * clients using SCM_RIGHTS cmsg messages. Therefore, each client is
+ * able to send a notification to another client without being
+ * proxied by the server.
+ *
+ * We use this mechanism to send interrupts between guests:
+ * qemu is able to transform an event on an eventfd into a PCI MSI-X
+ * interrupt in the guest.
+ *
+ * The ivshmem server is also able to share the file descriptor
+ * associated with the ivshmem shared memory.
+ */
+
+#include <limits.h>
+#include <sys/select.h>
+#include <sys/queue.h>
+
+/**
+ * Maximum number of notification vectors supported by the server
+ */
+#define IVSHMEM_SERVER_MAX_VECTORS 64
+
+/**
+ * Structure storing a peer
+ *
+ * Each time a client connects to an ivshmem server, a new
+ * ivshmem_server_peer structure is created. This peer and all its
+ * vectors are advertised to all connected clients through the connected
+ * unix sockets.
+ */
+struct ivshmem_server_peer {
+    TAILQ_ENTRY(ivshmem_server_peer) next;    /**< next in list */
+    int sock_fd;                              /**< connected unix sock */
+    long id;                                  /**< the id of the peer */
+    int vectors[IVSHMEM_SERVER_MAX_VECTORS];  /**< one fd per vector */
+    unsigned vectors_count;                   /**< number of vectors */
+};
+TAILQ_HEAD(ivshmem_server_peer_list, ivshmem_server_peer);
+
+/**
+ * Structure describing an ivshmem server
+ *
+ * This structure stores all information related to our server: the name
+ * of the server unix socket and the list of connected peers.
+ */
+struct ivshmem_server {
+    char unix_sock_path[PATH_MAX];  /**< path to unix socket */
+    int sock_fd;                    /**< unix sock file descriptor */
+    char shm_path[PATH_MAX];        /**< path to shm */
+    size_t shm_size;                /**< size of shm */
+    int shm_fd;                     /**< shm file descriptor */
+    unsigned n_vectors;             /**< number of vectors */
+    long cur_id;                    /**< id to be given to next client */
+    int verbose;                    /**< true in verbose mode */
+    struct ivshmem_server_peer_list peer_list;  /**< list of peers */
+};
+
+/**
+ * Initialize an ivshmem server
+ *
+ * @param server
+ *   A pointer to an uninitialized ivshmem_server structure
+ * @param unix_sock_path
+ *   The pointer to the unix socket file name
+ * @param shm_path
+ *   Path to the shared memory. The path corresponds to a POSIX shm name.
+ *   To use a real file, for instance in a hugetlbfs, it is possible to
+ *   use /../../abspath/to/file.
+ * @param shm_size
+ *   Size of shared memory
+ * @param n_vectors
+ *   Number of interrupt vectors per client
+ * @param verbose
+ *   True to enable verbose mode
+ *
+ * @return
+ *   0 on success, negative value on error
+ */
+int
+ivshmem_server_init(struct ivshmem_server *server,
+    const char *unix_sock_path, const char *shm_path, size_t shm_size,
+    unsigned n_vectors, int verbose);
+
+/**
+ * Open the shm, then create and bind to the unix socket
+ *
+ * @param server
+ *   The pointer to the initialized ivshmem server structure
+ *
+ * @return
+ *   0 on success, or a negative value on error
+ */
+int ivshmem_server_start(struct ivshmem_server *server);
+
+/**
+ * Close the server
+ *
+ * Close connections to all clients, close the unix socket and the
+ * shared memory file descriptor. The structure remains initialized, so
+ * it is possible to call ivshmem_server_start() again after a call to
+ * ivshmem_server_close().
+ *
+ * @param server
+ *   The ivshmem server
+ */
+void ivshmem_server_close(struct ivshmem_server *server);
+
+/**
+ * Fill a fd_set with file descriptors to be monitored
+ *
+ * This function will fill a fd_set with all file descriptors that must
+ * be polled (the unix server socket and the peers' unix sockets). The
+ * function will not initialize the fd_set; it is up to the caller to do it.
+ *
+ * @param server
+ *   The ivshmem server
+ * @param fds
+ *   The fd_set to be updated
+ * @param maxfd
+ *   Must be set to the max file descriptor + 1 in fd_set. This value is
+ *   updated if this function adds a greater fd to fd_set.
+ */
+void
+ivshmem_server_get_fds(const struct ivshmem_server *server,
+    fd_set *fds, int *maxfd);
+
+/**
+ * Read and handle new messages
+ *
+ * Given a fd_set (for instance filled by a call to select()), handle
+ * incoming messages from peers.
+ *
+ * @param server
+ *   The ivshmem server
+ * @param fds
+ *   The fd_set containing the file descriptors to be checked. Note
+ *   that file descriptors that are not related to our server are
+ *   ignored.
+ * @param maxfd
+ *   The maximum fd in fd_set, plus one.
+ *
+ * @return
+ *   0 on success, negative value on failure.
+ */
+int ivshmem_server_handle_fds(struct ivshmem_server *server, fd_set *fds,
+    int maxfd);
+
+/**
+ * Search a peer from its identifier
+ *
+ * @param server
+ *   The ivshmem server
+ * @param peer_id
+ *   The identifier of the peer structure
+ *
+ * @return
+ *   The peer structure, or NULL if not found
+ */
+struct ivshmem_server_peer *
+ivshmem_server_search_peer(struct ivshmem_server *server, long peer_id);
+
+/**
+ * Dump information of this ivshmem server and its peers on stdout
+ *
+ * @param server
+ *   The ivshmem server
+ */
+void ivshmem_server_dump(const struct ivshmem_server *server);
+
+#endif /* _IVSHMEM_SERVER_ */
diff --git a/contrib/ivshmem-server/main.c b/contrib/ivshmem-server/main.c
new file mode 100644
index 0000000..36b7028
--- /dev/null
+++ b/contrib/ivshmem-server/main.c
@@ -0,0 +1,296 @@
+/*
+ * Copyright 6WIND S.A., 2014
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * (at your option) any later version.  See the COPYING file in the
+ * top-level directory.
+ */
+
+#include <stdio.h>
+#include <stdint.h>
+#include <stdlib.h>
+#include <errno.h>
+#include <string.h>
+#include <signal.h>
+#include <unistd.h>
+#include <inttypes.h>
+#include <sys/types.h>
+#include <limits.h>
+#include <getopt.h>
+
+#include "ivshmem-server.h"
+
+#define DEFAULT_VERBOSE        0
+#define DEFAULT_FOREGROUND     0
+#define DEFAULT_PID_FILE       "/var/run/ivshmem-server.pid"
+#define DEFAULT_UNIX_SOCK_PATH "/tmp/ivshmem_socket"
+#define DEFAULT_SHM_PATH       "ivshmem"
+#define DEFAULT_SHM_SIZE       (1024*1024)
+#define DEFAULT_N_VECTORS      16
+
+/* arguments given by the user */
+struct ivshmem_server_args {
+    int verbose;
+    int foreground;
+    char *pid_file;
+    char *unix_socket_path;
+    char *shm_path;
+    size_t shm_size;
+    unsigned n_vectors;
+};
+
+/* show usage and exit with given error code */
+static void
+usage(const char *name, int code)
+{
+    fprintf(stderr, "%s [opts]\n", name);
+    fprintf(stderr, "  -h: show this help\n");
+    fprintf(stderr, "  -v: verbose mode\n");
+    fprintf(stderr, "  -F: foreground mode (default is to daemonize)\n");
+    fprintf(stderr, "  -p <pid_file>: path to the PID file (used in daemon\n"
+                    "     mode only).\n"
+                    "     Default=%s\n", DEFAULT_PID_FILE);
+    fprintf(stderr, "  -S <unix_socket_path>: path to the unix socket\n"
+                    "     to listen to.\n"
+                    "     Default=%s\n", DEFAULT_UNIX_SOCK_PATH);
+    fprintf(stderr, "  -m <shm_path>: path to the shared memory.\n"
+                    "     The path corresponds to a POSIX shm name. To use a\n"
+                    "     real file, for instance in a hugetlbfs, use\n"
+                    "     /../../abspath/to/file.\n"
+                    "     default=%s\n", DEFAULT_SHM_PATH);
+    fprintf(stderr, "  -l <size>: size of shared memory in bytes. The suffix\n"
+                    "     K, M and G can be used (ex: 1K means 1024).\n"
+                    "     default=%u\n", DEFAULT_SHM_SIZE);
+    fprintf(stderr, "  -n <n_vects>: number of vectors.\n"
+                    "     default=%u\n", DEFAULT_N_VECTORS);
+
+    exit(code);
+}
+
+/* parse the size of shm */
+static int
+parse_size(const char *val_str, size_t *val)
+{
+    char *endptr;
+    unsigned long long tmp;
+
+    errno = 0;
+    tmp = strtoull(val_str, &endptr, 0);
+    if ((errno == ERANGE && tmp == ULLONG_MAX) || (errno != 0 && tmp == 0)) {
+        return -1;
+    }
+    if (endptr == val_str) {
+        return -1;
+    }
+    if (endptr[0] == 'K' && endptr[1] == '\0') {
+        tmp *= 1024;
+    } else if (endptr[0] == 'M' && endptr[1] == '\0') {
+        tmp *= 1024 * 1024;
+    } else if (endptr[0] == 'G' && endptr[1] == '\0') {
+        tmp *= 1024 * 1024 * 1024;
+    } else if (endptr[0] != '\0') {
+        return -1;
+    }
+
+    *val = tmp;
+    return 0;
+}
+
+/* parse an unsigned int */
+static int
+parse_uint(const char *val_str, unsigned *val)
+{
+    char *endptr;
+    unsigned long tmp;
+
+    errno = 0;
+    tmp = strtoul(val_str, &endptr, 0);
+    if ((errno == ERANGE && tmp == ULONG_MAX) || (errno != 0 && tmp == 0)) {
+        return -1;
+    }
+    if (endptr == val_str || endptr[0] != '\0') {
+        return -1;
+    }
+    *val = tmp;
+    return 0;
+}
+
+/* parse the program arguments, exit on error */
+static void
+parse_args(struct ivshmem_server_args *args, int argc, char *argv[])
+{
+    int c;
+
+    while ((c = getopt(argc, argv,
+                       "h"  /* help */
+                       "v"  /* verbose */
+                       "F"  /* foreground */
+                       "p:" /* pid_file */
+                       "S:" /* unix_socket_path */
+                       "m:" /* shm_path */
+                       "l:" /* shm_size */
+                       "n:" /* n_vectors */
+                      )) != -1) {
+
+        switch (c) {
+        case 'h': /* help */
+            usage(argv[0], 0);
+            break;
+
+        case 'v': /* verbose */
+            args->verbose = 1;
+            break;
+
+        case 'F': /* foreground */
+            args->foreground = 1;
+            break;
+
+        case 'p': /* pid_file */
+            args->pid_file = strdup(optarg);
+            break;
+
+        case 'S': /* unix_socket_path */
+            args->unix_socket_path = strdup(optarg);
+            break;
+
+        case 'm': /* shm_path */
+            args->shm_path = strdup(optarg);
+            break;
+
+        case 'l': /* shm_size */
+            if (parse_size(optarg, &args->shm_size) < 0) {
+                fprintf(stderr, "cannot parse shm size\n");
+                usage(argv[0], 1);
+            }
+            break;
+
+        case 'n': /* n_vectors */
+            if (parse_uint(optarg, &args->n_vectors) < 0) {
+                fprintf(stderr, "cannot parse n_vectors\n");
+                usage(argv[0], 1);
+            }
+            break;
+
+        default:
+            usage(argv[0], 1);
+            break;
+        }
+    }
+
+    if (args->n_vectors > IVSHMEM_SERVER_MAX_VECTORS) {
+        fprintf(stderr, "too many requested vectors (max is %d)\n",
+                IVSHMEM_SERVER_MAX_VECTORS);
+        usage(argv[0], 1);
+    }
+
+    if (args->verbose == 1 && args->foreground == 0) {
+        fprintf(stderr, "cannot use verbose in daemon mode\n");
+        usage(argv[0], 1);
+    }
+}
+
+/* wait for events on listening server unix socket and connected client
+ * sockets */
+int
+poll_events(struct ivshmem_server *server)
+{
+    fd_set fds;
+    int ret, maxfd;
+
+    while (1) {
+
+        FD_ZERO(&fds);
+        maxfd = 0;
+        ivshmem_server_get_fds(server, &fds, &maxfd);
+
+        ret = select(maxfd, &fds, NULL, NULL, NULL);
+
+        if (ret < 0) {
+            if (errno == EINTR) {
+                continue;
+            }
+
+            fprintf(stderr, "select error: %s\n", strerror(errno));
+            break;
+        }
+        if (ret == 0) {
+            continue;
+        }
+
+        if (ivshmem_server_handle_fds(server, &fds, maxfd) < 0) {
+            fprintf(stderr, "ivshmem_server_handle_fds() failed\n");
+            break;
+        }
+    }
+
+    return ret;
+}
+
+int
+main(int argc, char *argv[])
+{
+    struct ivshmem_server server;
+    struct sigaction sa;
+    struct ivshmem_server_args args = {
+        .verbose = DEFAULT_VERBOSE,
+        .foreground = DEFAULT_FOREGROUND,
+        .pid_file = DEFAULT_PID_FILE,
+        .unix_socket_path = DEFAULT_UNIX_SOCK_PATH,
+        .shm_path = DEFAULT_SHM_PATH,
+        .shm_size = DEFAULT_SHM_SIZE,
+        .n_vectors = DEFAULT_N_VECTORS,
+    };
+
+    /* parse arguments, will exit on error */
+    parse_args(&args, argc, argv);
+
+    /* Ignore SIGPIPE, see this link for more info:
+     * http://www.mail-archive.com/libevent-users@monkey.org/msg01606.html */
+    sa.sa_handler = SIG_IGN;
+    sa.sa_flags = 0;
+    if (sigemptyset(&sa.sa_mask) == -1 ||
+        sigaction(SIGPIPE, &sa, 0) == -1) {
+        perror("failed to ignore SIGPIPE; sigaction");
+        return 1;
+    }
+
+    /* init the ivshmem server structure */
+    if (ivshmem_server_init(&server, args.unix_socket_path, args.shm_path,
+                            args.shm_size, args.n_vectors, args.verbose) < 0) {
+        fprintf(stderr, "cannot init server\n");
+        return 1;
+    }
+
+    /* start the ivshmem server (open shm & unix socket) */
+    if (ivshmem_server_start(&server) < 0) {
+        fprintf(stderr, "cannot bind\n");
+        return 1;
+    }
+
+    /* daemonize if asked to */
+    if (!args.foreground) {
+        FILE *fp;
+
+        if (daemon(1, 1) < 0) {
+            fprintf(stderr, "cannot daemonize: %s\n", strerror(errno));
+            return 1;
+        }
+
+        /* write pid file */
+        fp = fopen(args.pid_file, "w");
+        if (fp == NULL) {
+            fprintf(stderr, "cannot write pid file: %s\n", strerror(errno));
+            return 1;
+        }
+
+        fprintf(fp, "%d\n", (int) getpid());
+        fclose(fp);
+    }
+
+    poll_events(&server);
+
+    fprintf(stdout, "server disconnected\n");
+    ivshmem_server_close(&server);
+
+    return 0;
+}
diff --git a/qemu-doc.texi b/qemu-doc.texi
index 2b232ae..380d573 100644
--- a/qemu-doc.texi
+++ b/qemu-doc.texi
@@ -1250,9 +1250,13 @@ is qemu.git/contrib/ivshmem-server.  An example syntax when using the shared
 memory server is:
 
 @example
-qemu-system-i386 -device ivshmem,size=<size in format accepted by -m>[,chardev=<id>]
-                 [,msi=on][,ioeventfd=on][,vectors=n][,role=peer|master]
-qemu-system-i386 -chardev socket,path=<path>,id=<id>
+# First start the ivshmem server once and for all
+ivshmem-server -p <pidfile> -S <path> -m <shm name> -l <shm size> -n <vectors n>
+
+# Then start your qemu instances with matching arguments
+qemu-system-i386 -device ivshmem,size=<shm size>,vectors=<vectors n>,chardev=<id>
+                 [,msi=on][,ioeventfd=on][,role=peer|master]
+                 -chardev socket,path=<path>,id=<id>
 @end example
 
 When using the server, the guest will be assigned a VM ID (>=0) that allows guests
-- 
1.7.10.4


* [PATCH v3 2/2] docs: update ivshmem device spec
  2014-08-08  8:55 ` [Qemu-devel] " David Marchand
@ 2014-08-08  8:55   ` David Marchand
  -1 siblings, 0 replies; 36+ messages in thread
From: David Marchand @ 2014-08-08  8:55 UTC (permalink / raw)
  To: qemu-devel
  Cc: kvm, pbonzini, claudio.fontana, jani.kokkonen, eblake, cam, armbru

Add some notes on the parts needed to use ivshmem devices: more specifically,
explain the purpose of an ivshmem server and the basic concept to use the
ivshmem devices in guests.
Move some parts of the documentation and re-organise it.

Signed-off-by: David Marchand <david.marchand@6wind.com>
---
 docs/specs/ivshmem_device_spec.txt |  124 +++++++++++++++++++++++++++---------
 1 file changed, 93 insertions(+), 31 deletions(-)

diff --git a/docs/specs/ivshmem_device_spec.txt b/docs/specs/ivshmem_device_spec.txt
index 667a862..f5f2b95 100644
--- a/docs/specs/ivshmem_device_spec.txt
+++ b/docs/specs/ivshmem_device_spec.txt
@@ -2,30 +2,103 @@
 Device Specification for Inter-VM shared memory device
 ------------------------------------------------------
 
-The Inter-VM shared memory device is designed to share a region of memory to
-userspace in multiple virtual guests.  The memory region does not belong to any
-guest, but is a POSIX memory object on the host.  Optionally, the device may
-support sending interrupts to other guests sharing the same memory region.
+The Inter-VM shared memory device is designed to share a memory region (created
+on the host via the POSIX shared memory API) between multiple QEMU processes
+running different guests. In order for all guests to be able to pick up the
+shared memory area, it is modeled by QEMU as a PCI device exposing said memory
+to the guest as a PCI BAR.
+The memory region does not belong to any guest, but is a POSIX memory object on
+the host. The host can access this shared memory if needed.
+
+The device also provides an optional communication mechanism between guests
+sharing the same memory object. More details about this mechanism are given in
+the 'Guest to guest communication' section.
 
 
 The Inter-VM PCI device
 -----------------------
 
-*BARs*
+From the VM point of view, the ivshmem PCI device supports three BARs.
+
+- BAR0 is a 1 Kbyte MMIO region to support registers and interrupts when MSI is
+  not used.
+- BAR1 is used for MSI-X when it is enabled in the device.
+- BAR2 is used to access the shared memory object.
+
+How to use the device is up to you, but you must choose between two
+behaviors:
+
+- if you only need the shared memory part, you will map BAR2.
+  This way, you have access to the shared memory in the guest and can use it
+  as you see fit (memnic, for example, uses it in userland
+  http://dpdk.org/browse/memnic).
+
+- BAR0 and BAR1 are used to implement an optional communication mechanism
+  through interrupts in the guests. If you need an event mechanism between the
+  guests accessing the shared memory, you will most likely want to write a
+  kernel driver that will handle interrupts. See details in the 'Guest to
+  guest communication' section.
+
+The behavior is chosen when starting your QEMU processes:
+- if no communication mechanism is needed, the first QEMU process to start
+  creates the shared memory on the host; subsequent QEMU processes will use it.
+
+- if a communication mechanism is needed, an ivshmem server must be started
+  before any QEMU process; each QEMU process then connects to the server unix
+  socket.
+
+For more details on the QEMU ivshmem parameters, see qemu-doc documentation.
+
+
+Guest to guest communication
+----------------------------
+
+This section details the communication mechanism between the guests accessing
+the ivshmem shared memory.
 
-The device supports three BARs.  BAR0 is a 1 Kbyte MMIO region to support
-registers.  BAR1 is used for MSI-X when it is enabled in the device.  BAR2 is
-used to map the shared memory object from the host.  The size of BAR2 is
-specified when the guest is started and must be a power of 2 in size.
+*ivshmem server*
 
-*Registers*
+This server code is available in qemu.git/contrib/ivshmem-server.
 
-The device currently supports 4 registers of 32-bits each.  Registers
-are used for synchronization between guests sharing the same memory object when
-interrupts are supported (this requires using the shared memory server).
+The server must be started on the host before any guest.
+It creates a shared memory object, then waits for clients to connect on a unix
+socket.
 
-The server assigns each VM an ID number and sends this ID number to the QEMU
-process when the guest starts.
+For each client (QEMU process) that connects to the server:
+- the server assigns an ID to this client and sends this ID to it as the first
+  message,
+- the server sends the client an fd referring to the shared memory object,
+- the server creates a new set of host eventfds associated with the new client
+  and sends this set to all already connected clients,
+- finally, the server sends the eventfd sets of all existing clients to the new
+  client.
+
+The server signals all clients when one of them disconnects.
+
+The client IDs are limited to 16 bits because of the current implementation (see
+the Doorbell register in the 'PCI device registers' subsection). Hence, at most
+65536 clients are supported.
+
+All the file descriptors (fd to the shared memory, eventfds for each client)
+are passed to clients using SCM_RIGHTS over the server unix socket.
+
+Apart from the current ivshmem implementation in QEMU, an ivshmem client has
+been provided in qemu.git/contrib/ivshmem-client for debugging purposes.
+
+*QEMU as an ivshmem client*
+
+At initialisation, when creating the ivshmem device, QEMU gets its ID from the
+server, then makes it available through the BAR0 IVPosition register for the VM
+to use (see the 'PCI device registers' subsection).
+QEMU then uses the shared memory fd to map the shared memory into BAR2.
+The eventfds received from the server for all other clients are stored to
+implement the BAR0 Doorbell register (see the 'PCI device registers'
+subsection).
+Finally, the eventfds assigned to this QEMU process are used to trigger
+interrupts in this VM.
+
+*PCI device registers*
+
+From the VM point of view, the ivshmem PCI device supports 4 registers of
+32-bits each.
 
 enum ivshmem_registers {
     IntrMask = 0,
@@ -49,8 +122,8 @@ bit to 0 and unmasked by setting the first bit to 1.
 IVPosition Register: The IVPosition register is read-only and reports the
 guest's ID number.  The guest IDs are non-negative integers.  When using the
 server, since the server is a separate process, the VM ID will only be set when
-the device is ready (shared memory is received from the server and accessible via
-the device).  If the device is not ready, the IVPosition will return -1.
+the device is ready (shared memory is received from the server and accessible
+via the device).  If the device is not ready, the IVPosition will return -1.
 Applications should ensure that they have a valid VM ID before accessing the
 shared memory.
 
@@ -59,8 +132,8 @@ Doorbell register.  The doorbell register is 32-bits, logically divided into
 two 16-bit fields.  The high 16-bits are the guest ID to interrupt and the low
 16-bits are the interrupt vector to trigger.  The semantics of the value
 written to the doorbell depends on whether the device is using MSI or a regular
-pin-based interrupt.  In short, MSI uses vectors while regular interrupts set the
-status register.
+pin-based interrupt.  In short, MSI uses vectors while regular interrupts set
+the status register.
 
 Regular Interrupts
 
@@ -71,7 +144,7 @@ interrupt in the destination guest.
 
 Message Signalled Interrupts
 
-A ivshmem device may support multiple MSI vectors.  If so, the lower 16-bits
+An ivshmem device may support multiple MSI vectors.  If so, the lower 16-bits
 written to the Doorbell register must be between 0 and the maximum number of
 vectors the guest supports.  The lower 16 bits written to the doorbell is the
 MSI vector that will be raised in the destination guest.  The number of MSI
@@ -83,14 +156,3 @@ interrupt itself should be communicated via the shared memory region.  Devices
 supporting multiple MSI vectors can use different vectors to indicate different
 events have occurred.  The semantics of interrupt vectors are left to the
 user's discretion.
-
-
-Usage in the Guest
-------------------
-
-The shared memory device is intended to be used with the provided UIO driver.
-Very little configuration is needed.  The guest should map BAR0 to access the
-registers (an array of 32-bit ints allows simple writing) and map BAR2 to
-access the shared memory region itself.  The size of the shared memory region
-is specified when the guest (or shared memory server) is started.  A guest may
-map the whole shared memory region or only part of it.
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [Qemu-devel] [PATCH v3 2/2] docs: update ivshmem device spec
@ 2014-08-08  8:55   ` David Marchand
  0 siblings, 0 replies; 36+ messages in thread
From: David Marchand @ 2014-08-08  8:55 UTC (permalink / raw)
  To: qemu-devel; +Cc: kvm, claudio.fontana, armbru, pbonzini, jani.kokkonen, cam

Add some notes on the parts needed to use ivshmem devices: more specifically,
explain the purpose of an ivshmem server and the basic concept to use the
ivshmem devices in guests.
Move some parts of the documentation and re-organise it.

Signed-off-by: David Marchand <david.marchand@6wind.com>
---
 docs/specs/ivshmem_device_spec.txt |  124 +++++++++++++++++++++++++++---------
 1 file changed, 93 insertions(+), 31 deletions(-)

diff --git a/docs/specs/ivshmem_device_spec.txt b/docs/specs/ivshmem_device_spec.txt
index 667a862..f5f2b95 100644
--- a/docs/specs/ivshmem_device_spec.txt
+++ b/docs/specs/ivshmem_device_spec.txt
@@ -2,30 +2,103 @@
 Device Specification for Inter-VM shared memory device
 ------------------------------------------------------
 
-The Inter-VM shared memory device is designed to share a region of memory to
-userspace in multiple virtual guests.  The memory region does not belong to any
-guest, but is a POSIX memory object on the host.  Optionally, the device may
-support sending interrupts to other guests sharing the same memory region.
+The Inter-VM shared memory device is designed to share a memory region (created
+on the host via the POSIX shared memory API) between multiple QEMU processes
+running different guests. In order for all guests to be able to pick up the
+shared memory area, it is modeled by QEMU as a PCI device exposing said memory
+to the guest as a PCI BAR.
+The memory region does not belong to any guest, but is a POSIX memory object on
+the host. The host can access this shared memory if needed.
+
+The device also provides an optional communication mechanism between guests
+sharing the same memory object. More details are given in the 'Guest to guest
+communication' section.
 
 
 The Inter-VM PCI device
 -----------------------
 
-*BARs*
+From the VM point of view, the ivshmem PCI device supports three BARs.
+
+- BAR0 is a 1 Kbyte MMIO region to support registers and interrupts when MSI is
+  not used.
+- BAR1 is used for MSI-X when it is enabled in the device.
+- BAR2 is used to access the shared memory object.
+
+It is your choice how to use the device, but you must choose between two
+behaviors:
+
+- if you only need the shared memory part, you will map BAR2.
+  This gives you access to the shared memory in the guest, which you can use
+  as you see fit (memnic, for example, uses it in userland:
+  http://dpdk.org/browse/memnic).
+
+- BAR0 and BAR1 are used to implement an optional communication mechanism
+  through interrupts in the guests. If you need an event mechanism between the
+  guests accessing the shared memory, you will most likely want to write a
+  kernel driver that will handle interrupts. See details in the 'Guest to
+  guest communication' section.
+
+The behavior is chosen when starting your QEMU processes:
+- if no communication mechanism is needed, the first QEMU process to start
+  creates the shared memory on the host; subsequent QEMU processes use it.
+
+- if a communication mechanism is needed, an ivshmem server must be started
+  before any QEMU process; each then connects to the server's unix socket.
+
+For more details on the QEMU ivshmem parameters, see qemu-doc documentation.
+
+
+Guest to guest communication
+----------------------------
+
+This section details the communication mechanism between the guests accessing
+the ivshmem shared memory.
 
-The device supports three BARs.  BAR0 is a 1 Kbyte MMIO region to support
-registers.  BAR1 is used for MSI-X when it is enabled in the device.  BAR2 is
-used to map the shared memory object from the host.  The size of BAR2 is
-specified when the guest is started and must be a power of 2 in size.
+*ivshmem server*
 
-*Registers*
+This server code is available in qemu.git/contrib/ivshmem-server.
 
-The device currently supports 4 registers of 32-bits each.  Registers
-are used for synchronization between guests sharing the same memory object when
-interrupts are supported (this requires using the shared memory server).
+The server must be started on the host before any guest.
+It creates a shared memory object, then waits for clients to connect on a unix
+socket.
 
-The server assigns each VM an ID number and sends this ID number to the QEMU
-process when the guest starts.
+For each client (QEMU process) that connects to the server:
+- the server assigns an ID to this client and sends this ID to it as the first
+  message,
+- the server sends this client an fd referring to the shared memory object,
+- the server creates a new set of host eventfds associated with the new client
+  and sends this set to all already connected clients,
+- finally, the server sends all the eventfds sets for all clients to the new
+  client.
+
+The server signals all clients when one of them disconnects.
+
+The client IDs are limited to 16 bits by the current implementation (see the
+Doorbell register in the 'PCI device registers' subsection). Hence only 65536
+clients are supported.
+
+All the file descriptors (fd to the shared memory, eventfds for each client)
+are passed to clients using SCM_RIGHTS over the server unix socket.
+
+In addition to the QEMU ivshmem implementation itself, a standalone ivshmem
+client is provided in qemu.git/contrib/ivshmem-client for debugging.
+
+*QEMU as an ivshmem client*
+
+At initialisation, when creating the ivshmem device, QEMU gets its ID from the
+server, then makes it available to the VM through the BAR0 IVPosition register
+(see 'PCI device registers' subsection).
+QEMU then uses the fd to the shared memory to map it to BAR2.
+The eventfds received from the server for all other clients are stored to
+implement the BAR0 Doorbell register (see 'PCI device registers' subsection).
+Finally, eventfds assigned to this QEMU process are used to send interrupts in
+this VM.
+
+*PCI device registers*
+
+From the VM point of view, the ivshmem PCI device supports four registers of
+32 bits each.
 
 enum ivshmem_registers {
     IntrMask = 0,
@@ -49,8 +122,8 @@ bit to 0 and unmasked by setting the first bit to 1.
 IVPosition Register: The IVPosition register is read-only and reports the
 guest's ID number.  The guest IDs are non-negative integers.  When using the
 server, since the server is a separate process, the VM ID will only be set when
-the device is ready (shared memory is received from the server and accessible via
-the device).  If the device is not ready, the IVPosition will return -1.
+the device is ready (shared memory is received from the server and accessible
+via the device).  If the device is not ready, the IVPosition will return -1.
 Applications should ensure that they have a valid VM ID before accessing the
 shared memory.
 
@@ -59,8 +132,8 @@ Doorbell register.  The doorbell register is 32-bits, logically divided into
 two 16-bit fields.  The high 16-bits are the guest ID to interrupt and the low
 16-bits are the interrupt vector to trigger.  The semantics of the value
 written to the doorbell depends on whether the device is using MSI or a regular
-pin-based interrupt.  In short, MSI uses vectors while regular interrupts set the
-status register.
+pin-based interrupt.  In short, MSI uses vectors while regular interrupts set
+the status register.
 
 Regular Interrupts
 
@@ -71,7 +144,7 @@ interrupt in the destination guest.
 
 Message Signalled Interrupts
 
-A ivshmem device may support multiple MSI vectors.  If so, the lower 16-bits
+An ivshmem device may support multiple MSI vectors.  If so, the lower 16-bits
 written to the Doorbell register must be between 0 and the maximum number of
 vectors the guest supports.  The lower 16 bits written to the doorbell is the
 MSI vector that will be raised in the destination guest.  The number of MSI
@@ -83,14 +156,3 @@ interrupt itself should be communicated via the shared memory region.  Devices
 supporting multiple MSI vectors can use different vectors to indicate different
 events have occurred.  The semantics of interrupt vectors are left to the
 user's discretion.
-
-
-Usage in the Guest
-------------------
-
-The shared memory device is intended to be used with the provided UIO driver.
-Very little configuration is needed.  The guest should map BAR0 to access the
-registers (an array of 32-bit ints allows simple writing) and map BAR2 to
-access the shared memory region itself.  The size of the shared memory region
-is specified when the guest (or shared memory server) is started.  A guest may
-map the whole shared memory region or only part of it.
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [PATCH v3 2/2] docs: update ivshmem device spec
  2014-08-08  8:55   ` [Qemu-devel] " David Marchand
@ 2014-08-08  9:04     ` Claudio Fontana
  -1 siblings, 0 replies; 36+ messages in thread
From: Claudio Fontana @ 2014-08-08  9:04 UTC (permalink / raw)
  To: David Marchand, qemu-devel
  Cc: kvm, pbonzini, jani.kokkonen, eblake, cam, armbru

Hello David,

On 08.08.2014 10:55, David Marchand wrote:
> Add some notes on the parts needed to use ivshmem devices: more specifically,
> explain the purpose of an ivshmem server and the basic concept to use the
> ivshmem devices in guests.
> Move some parts of the documentation and re-organise it.
> 
> Signed-off-by: David Marchand <david.marchand@6wind.com>

You did not include my Reviewed-by: tag, did you change this from v2?

Ciao,

Claudio


> ---
>  docs/specs/ivshmem_device_spec.txt |  124 +++++++++++++++++++++++++++---------
>  1 file changed, 93 insertions(+), 31 deletions(-)
> 
> diff --git a/docs/specs/ivshmem_device_spec.txt b/docs/specs/ivshmem_device_spec.txt
> index 667a862..f5f2b95 100644
> --- a/docs/specs/ivshmem_device_spec.txt
> +++ b/docs/specs/ivshmem_device_spec.txt
> @@ -2,30 +2,103 @@
>  Device Specification for Inter-VM shared memory device
>  ------------------------------------------------------
>  
> -The Inter-VM shared memory device is designed to share a region of memory to
> -userspace in multiple virtual guests.  The memory region does not belong to any
> -guest, but is a POSIX memory object on the host.  Optionally, the device may
> -support sending interrupts to other guests sharing the same memory region.
> +The Inter-VM shared memory device is designed to share a memory region (created
> +on the host via the POSIX shared memory API) between multiple QEMU processes
> +running different guests. In order for all guests to be able to pick up the
> +shared memory area, it is modeled by QEMU as a PCI device exposing said memory
> +to the guest as a PCI BAR.
> +The memory region does not belong to any guest, but is a POSIX memory object on
> +the host. The host can access this shared memory if needed.
> +
> +The device also provides an optional communication mechanism between guests
> +sharing the same memory object. More details about that in the section 'Guest to
> +guest communication' section.
>  
>  
>  The Inter-VM PCI device
>  -----------------------
>  
> -*BARs*
> +From the VM point of view, the ivshmem PCI device supports three BARs.
> +
> +- BAR0 is a 1 Kbyte MMIO region to support registers and interrupts when MSI is
> +  not used.
> +- BAR1 is used for MSI-X when it is enabled in the device.
> +- BAR2 is used to access the shared memory object.
> +
> +It is your choice how to use the device but you must choose between two
> +behaviors :
> +
> +- basically, if you only need the shared memory part, you will map BAR2.
> +  This way, you have access to the shared memory in guest and can use it as you
> +  see fit (memnic, for example, uses it in userland
> +  http://dpdk.org/browse/memnic).
> +
> +- BAR0 and BAR1 are used to implement an optional communication mechanism
> +  through interrupts in the guests. If you need an event mechanism between the
> +  guests accessing the shared memory, you will most likely want to write a
> +  kernel driver that will handle interrupts. See details in the section 'Guest
> +  to guest communication' section.
> +
> +The behavior is chosen when starting your QEMU processes:
> +- no communication mechanism needed, the first QEMU to start creates the shared
> +  memory on the host, subsequent QEMU processes will use it.
> +
> +- communication mechanism needed, an ivshmem server must be started before any
> +  QEMU processes, then each QEMU process connects to the server unix socket.
> +
> +For more details on the QEMU ivshmem parameters, see qemu-doc documentation.
> +
> +
> +Guest to guest communication
> +----------------------------
> +
> +This section details the communication mechanism between the guests accessing
> +the ivhsmem shared memory.
>  
> -The device supports three BARs.  BAR0 is a 1 Kbyte MMIO region to support
> -registers.  BAR1 is used for MSI-X when it is enabled in the device.  BAR2 is
> -used to map the shared memory object from the host.  The size of BAR2 is
> -specified when the guest is started and must be a power of 2 in size.
> +*ivshmem server*
>  
> -*Registers*
> +This server code is available in qemu.git/contrib/ivshmem-server.
>  
> -The device currently supports 4 registers of 32-bits each.  Registers
> -are used for synchronization between guests sharing the same memory object when
> -interrupts are supported (this requires using the shared memory server).
> +The server must be started on the host before any guest.
> +It creates a shared memory object then waits for clients to connect on an unix
> +socket.
>  
> -The server assigns each VM an ID number and sends this ID number to the QEMU
> -process when the guest starts.
> +For each client (QEMU processes) that connects to the server:
> +- the server assigns an ID for this client and sends this ID to him as the first
> +  message,
> +- the server sends a fd to the shared memory object to this client,
> +- the server creates a new set of host eventfds associated to the new client and
> +  sends this set to all already connected clients,
> +- finally, the server sends all the eventfds sets for all clients to the new
> +  client.
> +
> +The server signals all clients when one of them disconnects.
> +
> +The client IDs are limited to 16 bits because of the current implementation (see
> +Doorbell register in 'PCI device registers' subsection). Hence on 65536 clients
> +are supported.
> +
> +All the file descriptors (fd to the shared memory, eventfds for each client)
> +are passed to clients using SCM_RIGHTS over the server unix socket.
> +
> +Apart from the current ivshmem implementation in QEMU, an ivshmem client has
> +been provided in qemu.git/contrib/ivshmem-client for debug.
> +
> +*QEMU as an ivshmem client*
> +
> +At initialisation, when creating the ivshmem device, QEMU gets its ID from the
> +server then make it available through BAR0 IVPosition register for the VM to use
> +(see 'PCI device registers' subsection).
> +QEMU then uses the fd to the shared memory to map it to BAR2.
> +eventfds for all other clients received from the server are stored to implement
> +BAR0 Doorbell register (see 'PCI device registers' subsection).
> +Finally, eventfds assigned to this QEMU process are used to send interrupts in
> +this VM.
> +
> +*PCI device registers*
> +
> +From the VM point of view, the ivshmem PCI device supports 4 registers of
> +32-bits each.
>  
>  enum ivshmem_registers {
>      IntrMask = 0,
> @@ -49,8 +122,8 @@ bit to 0 and unmasked by setting the first bit to 1.
>  IVPosition Register: The IVPosition register is read-only and reports the
>  guest's ID number.  The guest IDs are non-negative integers.  When using the
>  server, since the server is a separate process, the VM ID will only be set when
> -the device is ready (shared memory is received from the server and accessible via
> -the device).  If the device is not ready, the IVPosition will return -1.
> +the device is ready (shared memory is received from the server and accessible
> +via the device).  If the device is not ready, the IVPosition will return -1.
>  Applications should ensure that they have a valid VM ID before accessing the
>  shared memory.
>  
> @@ -59,8 +132,8 @@ Doorbell register.  The doorbell register is 32-bits, logically divided into
>  two 16-bit fields.  The high 16-bits are the guest ID to interrupt and the low
>  16-bits are the interrupt vector to trigger.  The semantics of the value
>  written to the doorbell depends on whether the device is using MSI or a regular
> -pin-based interrupt.  In short, MSI uses vectors while regular interrupts set the
> -status register.
> +pin-based interrupt.  In short, MSI uses vectors while regular interrupts set
> +the status register.
>  
>  Regular Interrupts
>  
> @@ -71,7 +144,7 @@ interrupt in the destination guest.
>  
>  Message Signalled Interrupts
>  
> -A ivshmem device may support multiple MSI vectors.  If so, the lower 16-bits
> +An ivshmem device may support multiple MSI vectors.  If so, the lower 16-bits
>  written to the Doorbell register must be between 0 and the maximum number of
>  vectors the guest supports.  The lower 16 bits written to the doorbell is the
>  MSI vector that will be raised in the destination guest.  The number of MSI
> @@ -83,14 +156,3 @@ interrupt itself should be communicated via the shared memory region.  Devices
>  supporting multiple MSI vectors can use different vectors to indicate different
>  events have occurred.  The semantics of interrupt vectors are left to the
>  user's discretion.
> -
> -
> -Usage in the Guest
> -------------------
> -
> -The shared memory device is intended to be used with the provided UIO driver.
> -Very little configuration is needed.  The guest should map BAR0 to access the
> -registers (an array of 32-bit ints allows simple writing) and map BAR2 to
> -access the shared memory region itself.  The size of the shared memory region
> -is specified when the guest (or shared memory server) is started.  A guest may
> -map the whole shared memory region or only part of it.
> 


-- 
Claudio Fontana
Server Virtualization Architect
Huawei Technologies Duesseldorf GmbH
Riesstraße 25 - 80992 München

office: +49 89 158834 4135
mobile: +49 15253060158


^ permalink raw reply	[flat|nested] 36+ messages in thread


* RE: [Qemu-devel] [PATCH v3 0/2] ivshmem: update documentation, add client/server tools
  2014-08-08  8:55 ` [Qemu-devel] " David Marchand
@ 2014-08-08  9:30   ` Gonglei (Arei)
  -1 siblings, 0 replies; 36+ messages in thread
From: Gonglei (Arei) @ 2014-08-08  9:30 UTC (permalink / raw)
  To: David Marchand, qemu-devel
  Cc: kvm, Claudio Fontana, armbru, pbonzini, Jani Kokkonen, cam

Hi,

> Subject: [Qemu-devel] [PATCH v3 0/2] ivshmem: update documentation, add
> client/server tools
> 
> Here is a patchset containing an update on ivshmem specs documentation and
> importing ivshmem server and client tools.
> These tools have been written from scratch and are not related to what is
> available in nahanni repository.
> I put them in contrib/ directory as the qemu-doc.texi was already telling the
> server was supposed to be there.
> 
> Changes since v2:
> - fixed license issues in ivshmem client/server (I took hw/virtio/virtio-rng.c
>   file as a reference).
> 
> Changes since v1:
> - moved client/server import patch before doc update,
> - tried to re-organise the ivshmem_device_spec.txt file based on Claudio
>   comments (still not sure if the result is that great, comments welcome),
> - incorporated comments from Claudio, Eric and Cam,
> - added more details on the server <-> client messages exchange (but sorry, no
>   ASCII art here).
> 
> By the way, there are still some functionnalities that need description (use of
> ioeventfd, the lack of irqfd support) and some parts of the ivshmem code clearly
> need cleanup. I will try to address this in future patches when these first
> patches are ok.
> 
> 
It would be great IMHO if you could describe, with an example, the steps for
using your ivshmem-client and ivshmem-server.

Best regards,
-Gonglei

> --
> David Marchand
> 
> David Marchand (2):
>   contrib: add ivshmem client and server
>   docs: update ivshmem device spec
> 
>  contrib/ivshmem-client/Makefile         |   29 +++
>  contrib/ivshmem-client/ivshmem-client.c |  418
> ++++++++++++++++++++++++++++++
>  contrib/ivshmem-client/ivshmem-client.h |  238 ++++++++++++++++++
>  contrib/ivshmem-client/main.c           |  246 ++++++++++++++++++
>  contrib/ivshmem-server/Makefile         |   29 +++
>  contrib/ivshmem-server/ivshmem-server.c |  420
> +++++++++++++++++++++++++++++++
>  contrib/ivshmem-server/ivshmem-server.h |  185 ++++++++++++++
>  contrib/ivshmem-server/main.c           |  296
> ++++++++++++++++++++++
>  docs/specs/ivshmem_device_spec.txt      |  124 ++++++---
>  qemu-doc.texi                           |   10 +-
>  10 files changed, 1961 insertions(+), 34 deletions(-)
>  create mode 100644 contrib/ivshmem-client/Makefile
>  create mode 100644 contrib/ivshmem-client/ivshmem-client.c
>  create mode 100644 contrib/ivshmem-client/ivshmem-client.h
>  create mode 100644 contrib/ivshmem-client/main.c
>  create mode 100644 contrib/ivshmem-server/Makefile
>  create mode 100644 contrib/ivshmem-server/ivshmem-server.c
>  create mode 100644 contrib/ivshmem-server/ivshmem-server.h
>  create mode 100644 contrib/ivshmem-server/main.c
> 
> --
> 1.7.10.4
> 


^ permalink raw reply	[flat|nested] 36+ messages in thread


* Re: [PATCH v3 2/2] docs: update ivshmem device spec
  2014-08-08  9:04     ` [Qemu-devel] " Claudio Fontana
@ 2014-08-08  9:32       ` David Marchand
  -1 siblings, 0 replies; 36+ messages in thread
From: David Marchand @ 2014-08-08  9:32 UTC (permalink / raw)
  To: Claudio Fontana, qemu-devel
  Cc: kvm, pbonzini, jani.kokkonen, eblake, cam, armbru

Hello Claudio,

On 08/08/2014 11:04 AM, Claudio Fontana wrote:
> On 08.08.2014 10:55, David Marchand wrote:
>> Add some notes on the parts needed to use ivshmem devices: more specifically,
>> explain the purpose of an ivshmem server and the basic concept to use the
>> ivshmem devices in guests.
>> Move some parts of the documentation and re-organise it.
>>
>> Signed-off-by: David Marchand <david.marchand@6wind.com>
>
> You did not include my Reviewed-by: tag, did you change this from v2?
>

No, I did not change anything in the documentation patch; I only touched 
the client/server patch.

I forgot to add your Reviewed-by tag ... (added to my tree for now, I will 
send a v4 once I have some feedback on the client/server code).


Thanks Claudio.


-- 
David Marchand


^ permalink raw reply	[flat|nested] 36+ messages in thread


* Re: [Qemu-devel] [PATCH v3 0/2] ivshmem: update documentation, add client/server tools
  2014-08-08  9:30   ` Gonglei (Arei)
@ 2014-08-08  9:54     ` David Marchand
  -1 siblings, 0 replies; 36+ messages in thread
From: David Marchand @ 2014-08-08  9:54 UTC (permalink / raw)
  To: Gonglei (Arei), qemu-devel
  Cc: kvm, Claudio Fontana, armbru, pbonzini, Jani Kokkonen, cam

Hello Gonglei,

On 08/08/2014 11:30 AM, Gonglei (Arei) wrote:
> If you could describe, with an example, the steps for using your
> ivshmem-client and ivshmem-server, that would be great IMHO.

I have already included a note in the qemu-doc.texi file on how to start 
the ivshmem-server.
The (debug) client is started by simply specifying -S /path/to/ivshmem_socket.

We put comments in the source code, so I am not sure what could be 
added. What do you think is missing?


-- 
David Marchand

^ permalink raw reply	[flat|nested] 36+ messages in thread


* RE: [Qemu-devel] [PATCH v3 0/2] ivshmem: update documentation, add client/server tools
  2014-08-08  9:54     ` David Marchand
@ 2014-08-08 10:26       ` Gonglei (Arei)
  -1 siblings, 0 replies; 36+ messages in thread
From: Gonglei (Arei) @ 2014-08-08 10:26 UTC (permalink / raw)
  To: David Marchand, qemu-devel
  Cc: kvm, Claudio Fontana, armbru, pbonzini, Jani Kokkonen, cam

Hi,

> Subject: Re: [Qemu-devel] [PATCH v3 0/2] ivshmem: update documentation,
> add client/server tools
> 
> Hello Gonglei,
> 
> On 08/08/2014 11:30 AM, Gonglei (Arei) wrote:
> > If you could describe, with an example, the steps for using your
> > ivshmem-client and ivshmem-server, that would be great IMHO.
> 
> I have already included a note in the qemu-doc.texi file on how to start
> the ivshmem-server.
> The (debug) client is started by simply specifying -S /path/to/ivshmem_socket.
> 
> We put comments in the source code, so I am not sure what could be
> added. What do you think is missing?
> 
OK, thanks. 
I will test it and review the patch sets during the next few days.

Best regards,
-Gonglei

^ permalink raw reply	[flat|nested] 36+ messages in thread


* Re: [Qemu-devel] [PATCH v3 1/2] contrib: add ivshmem client and server
  2014-08-08  8:55   ` [Qemu-devel] " David Marchand
@ 2014-08-08 14:51     ` Stefan Hajnoczi
  -1 siblings, 0 replies; 36+ messages in thread
From: Stefan Hajnoczi @ 2014-08-08 14:51 UTC (permalink / raw)
  To: David Marchand
  Cc: qemu-devel, Olivier Matz, kvm, claudio.fontana, armbru, pbonzini,
	jani.kokkonen, cam

[-- Attachment #1: Type: text/plain, Size: 4157 bytes --]

On Fri, Aug 08, 2014 at 10:55:17AM +0200, David Marchand wrote:

Looks good, a few minor comments:

> diff --git a/contrib/ivshmem-client/Makefile b/contrib/ivshmem-client/Makefile
> new file mode 100644
> index 0000000..eee97c6
> --- /dev/null
> +++ b/contrib/ivshmem-client/Makefile
> @@ -0,0 +1,29 @@
> +# Copyright 6WIND S.A., 2014
> +#
> +# This work is licensed under the terms of the GNU GPL, version 2 or
> +# (at your option) any later version.  See the COPYING file in the
> +# top-level directory.
> +
> +S ?= $(CURDIR)
> +O ?= $(CURDIR)
> +
> +CFLAGS += -Wall -Wextra -Werror -g
> +LDFLAGS +=
> +LDLIBS += -lrt
> +
> +VPATH = $(S)
> +PROG = ivshmem-client
> +OBJS := $(O)/ivshmem-client.o
> +OBJS += $(O)/main.o
> +
> +$(O)/%.o: %.c
> +	$(CC) $(CFLAGS) -o $@ -c $<
> +
> +$(O)/$(PROG): $(OBJS)
> +	$(CC) $(LDFLAGS) -o $@ $^ $(LDLIBS)
> +
> +.PHONY: all
> +all: $(O)/$(PROG)
> +
> +clean:
> +	rm -f $(OBJS) $(O)/$(PROG)

CCed Peter Maydell for a second opinion.  I'd suggest hooking into
QEMU's top-level ./Makefile; QEMU does not do recursive make.

The advantages of hooking up QEMU's Makefile are:

1. So that ivshmem client/server code is built by default (on supported
   host platforms) and bitrot is avoided.

2. So that you don't have to duplicate rules.mak or any other build
   infrastructure.

> +/**
> + * Structure storing a peer
> + *
> + * Each time a client connects to an ivshmem server, it is advertised to
> + * all connected clients through the unix socket. When our ivshmem
> + * client receives a notification, it creates an ivshmem_client_peer
> + * structure to store the info of this peer.
> + *
> + * This structure is also used to store the information of our own
> + * client in (struct ivshmem_client)->local.
> + */
> +struct ivshmem_client_peer {
> +    TAILQ_ENTRY(ivshmem_client_peer) next;    /**< next in list*/
> +    long id;                                    /**< the id of the peer */
> +    int vectors[IVSHMEM_CLIENT_MAX_VECTORS];  /**< one fd per vector */
> +    unsigned vectors_count;                     /**< number of vectors */
> +};

It would be nice to follow QEMU coding style:
typedef struct IvshmemClientPeer {
    ...
} IvshmemClientPeer;

(Use scripts/checkpatch.pl to check coding style)

> +/* browse the queue, allowing to remove/free the current element */
> +#define    TAILQ_FOREACH_SAFE(var, var2, head, field)            \
> +    for ((var) = TAILQ_FIRST((head)),                            \
> +             (var2) = ((var) ? TAILQ_NEXT((var), field) : NULL); \
> +         (var);                                                  \
> +         (var) = (var2),                                         \
> +             (var2) = ((var2) ? TAILQ_NEXT((var2), field) : NULL))

Please reuse include/qemu/queue.h.  It's a copy of the BSD <sys/queue.h>
and it has QTAILQ_FOREACH_SAFE().

> +    ret = sendmsg(sock_fd, &msg, 0);
> +    if (ret <= 0) {
> +        return -1;
> +    }

This is a blocking sendmsg(2) so it could hang the server if sock_fd's
sndbuf fills up.  This shouldn't happen since the amount of data that
gets sent in the lifetime of a session is relatively small, but there is
a chance.

If hung clients should not be able to block the server then sock_fd
needs to be non-blocking.

> +struct ivshmem_server {
> +    char unix_sock_path[PATH_MAX];  /**< path to unix socket */
> +    int sock_fd;                    /**< unix sock file descriptor */
> +    char shm_path[PATH_MAX];        /**< path to shm */
> +    size_t shm_size;                /**< size of shm */
> +    int shm_fd;                     /**< shm file descriptor */
> +    unsigned n_vectors;             /**< number of vectors */
> +    long cur_id;                    /**< id to be given to next client */
> +    int verbose;                    /**< true in verbose mode */

C99 bool is fine to use in QEMU code.  It makes the code easier to read
because you can be sure something is just true/false and not a bitmap or
integer counter.

> +/* parse the size of shm */
> +static int
> +parse_size(const char *val_str, size_t *val)

Looks similar to QEMU's util/qemu-option.c:parse_option_size().

[-- Attachment #2: Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply	[flat|nested] 36+ messages in thread


* Re: [Qemu-devel] [PATCH v3 2/2] docs: update ivshmem device spec
  2014-08-08  8:55   ` [Qemu-devel] " David Marchand
@ 2014-08-08 15:02     ` Stefan Hajnoczi
  -1 siblings, 0 replies; 36+ messages in thread
From: Stefan Hajnoczi @ 2014-08-08 15:02 UTC (permalink / raw)
  To: David Marchand
  Cc: qemu-devel, kvm, claudio.fontana, armbru, pbonzini, jani.kokkonen, cam

[-- Attachment #1: Type: text/plain, Size: 1152 bytes --]

On Fri, Aug 08, 2014 at 10:55:18AM +0200, David Marchand wrote:
> +For each client (QEMU processes) that connects to the server:
> +- the server assigns an ID for this client and sends this ID to him as the first
> +  message,
> +- the server sends a fd to the shared memory object to this client,
> +- the server creates a new set of host eventfds associated to the new client and
> +  sends this set to all already connected clients,
> +- finally, the server sends all the eventfds sets for all clients to the new
> +  client.

The protocol is not extensible and no version number is exchanged.  For
the most part this should be okay because clients must run on the same
machine as the server.  It is assumed clients and server are compatible
with each other.

I wonder if we'll get into trouble later if the protocol needs to be
extended or some operation needs to happen, like upgrading QEMU or the
ivshmem-server.  At the very least someone building from source but
using system QEMU or ivshmem-server could get confusing failures if the
protocol doesn't match.

How about sending a version message as the first thing during a
connection?

Stefan

[-- Attachment #2: Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply	[flat|nested] 36+ messages in thread


* Re: [PATCH v3 1/2] contrib: add ivshmem client and server
  2014-08-08  8:55   ` [Qemu-devel] " David Marchand
@ 2014-08-10  3:57     ` Gonglei
  -1 siblings, 0 replies; 36+ messages in thread
From: Gonglei @ 2014-08-10  3:57 UTC (permalink / raw)
  To: 'David Marchand', qemu-devel
  Cc: 'Olivier Matz',
	kvm, claudio.fontana, armbru, arei.gonglei, pbonzini,
	jani.kokkonen, cam

Hi,

> Subject: [Qemu-devel] [PATCH v3 1/2] contrib: add ivshmem client and server
> 
> When using ivshmem devices, notifications between guests can be sent as
> interrupts using an ivshmem-server (typical use described in documentation).
> The client is provided as a debug tool.
> 
> Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
> Signed-off-by: David Marchand <david.marchand@6wind.com>
> ---
>  contrib/ivshmem-client/Makefile         |   29 +++
>  contrib/ivshmem-client/ivshmem-client.c |  418 ++++++++++++++++++++++++++++++
>  contrib/ivshmem-client/ivshmem-client.h |  238 ++++++++++++++++++
>  contrib/ivshmem-client/main.c           |  246 ++++++++++++++++++
>  contrib/ivshmem-server/Makefile         |   29 +++
>  contrib/ivshmem-server/ivshmem-server.c |  420 +++++++++++++++++++++++++++++++
>  contrib/ivshmem-server/ivshmem-server.h |  185 ++++++++++++++
>  contrib/ivshmem-server/main.c           |  296 ++++++++++++++++++++++
>  qemu-doc.texi                           |   10 +-
>  9 files changed, 1868 insertions(+), 3 deletions(-)
>  create mode 100644 contrib/ivshmem-client/Makefile
>  create mode 100644 contrib/ivshmem-client/ivshmem-client.c
>  create mode 100644 contrib/ivshmem-client/ivshmem-client.h
>  create mode 100644 contrib/ivshmem-client/main.c
>  create mode 100644 contrib/ivshmem-server/Makefile
>  create mode 100644 contrib/ivshmem-server/ivshmem-server.c
>  create mode 100644 contrib/ivshmem-server/ivshmem-server.h
>  create mode 100644 contrib/ivshmem-server/main.c
> 
> diff --git a/contrib/ivshmem-client/Makefile b/contrib/ivshmem-client/Makefile
> new file mode 100644
> index 0000000..eee97c6
> --- /dev/null
> +++ b/contrib/ivshmem-client/Makefile
> @@ -0,0 +1,29 @@
> +# Copyright 6WIND S.A., 2014
> +#
> +# This work is licensed under the terms of the GNU GPL, version 2 or
> +# (at your option) any later version.  See the COPYING file in the
> +# top-level directory.
> +
> +S ?= $(CURDIR)
> +O ?= $(CURDIR)
> +
> +CFLAGS += -Wall -Wextra -Werror -g
> +LDFLAGS +=
> +LDLIBS += -lrt
> +
> +VPATH = $(S)
> +PROG = ivshmem-client
> +OBJS := $(O)/ivshmem-client.o
> +OBJS += $(O)/main.o
> +
> +$(O)/%.o: %.c
> +	$(CC) $(CFLAGS) -o $@ -c $<
> +
> +$(O)/$(PROG): $(OBJS)
> +	$(CC) $(LDFLAGS) -o $@ $^ $(LDLIBS)
> +
> +.PHONY: all
> +all: $(O)/$(PROG)
> +
> +clean:
> +	rm -f $(OBJS) $(O)/$(PROG)
> diff --git a/contrib/ivshmem-client/ivshmem-client.c b/contrib/ivshmem-client/ivshmem-client.c
> new file mode 100644
> index 0000000..2166b64
> --- /dev/null
> +++ b/contrib/ivshmem-client/ivshmem-client.c
> @@ -0,0 +1,418 @@
> +/*
> + * Copyright 6WIND S.A., 2014
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or
> + * (at your option) any later version.  See the COPYING file in the
> + * top-level directory.
> + */
> +
> +#include <stdio.h>
> +#include <stdint.h>
> +#include <stdlib.h>
> +#include <errno.h>
> +#include <string.h>
> +#include <signal.h>
> +#include <unistd.h>
> +#include <inttypes.h>
> +#include <sys/queue.h>
> +
> +#include <sys/types.h>
> +#include <sys/socket.h>
> +#include <sys/un.h>
> +
> +#include "ivshmem-client.h"
> +
> +/* log a message on stdout if verbose=1 */
> +#define debug_log(client, fmt, ...) do { \
> +        if ((client)->verbose) {         \
> +            printf(fmt, ## __VA_ARGS__); \
> +        }                                \
> +    } while (0)
> +
> +/* read message from the unix socket */
> +static int
> +read_one_msg(struct ivshmem_client *client, long *index, int *fd)
> +{
> +    int ret;
> +    struct msghdr msg;
> +    struct iovec iov[1];
> +    union {
> +        struct cmsghdr cmsg;
> +        char control[CMSG_SPACE(sizeof(int))];
> +    } msg_control;
> +    struct cmsghdr *cmsg;
> +
> +    iov[0].iov_base = index;
> +    iov[0].iov_len = sizeof(*index);
> +
> +    memset(&msg, 0, sizeof(msg));
> +    msg.msg_iov = iov;
> +    msg.msg_iovlen = 1;
> +    msg.msg_control = &msg_control;
> +    msg.msg_controllen = sizeof(msg_control);
> +
> +    ret = recvmsg(client->sock_fd, &msg, 0);
> +    if (ret < 0) {
> +        debug_log(client, "cannot read message: %s\n", strerror(errno));
> +        return -1;
> +    }
> +    if (ret == 0) {
> +        debug_log(client, "lost connection to server\n");
> +        return -1;
> +    }
> +
> +    *fd = -1;
> +
> +    for (cmsg = CMSG_FIRSTHDR(&msg); cmsg; cmsg = CMSG_NXTHDR(&msg, cmsg)) {
> +
> +        if (cmsg->cmsg_len != CMSG_LEN(sizeof(int)) ||
> +            cmsg->cmsg_level != SOL_SOCKET ||
> +            cmsg->cmsg_type != SCM_RIGHTS) {
> +            continue;
> +        }
> +
> +        memcpy(fd, CMSG_DATA(cmsg), sizeof(*fd));
> +    }
> +
> +    return 0;
> +}
> +
> +/* free a peer when the server advertise a disconnection or when the
> + * client is freed */
> +static void
> +free_peer(struct ivshmem_client *client, struct ivshmem_client_peer *peer)
> +{
> +    unsigned vector;
> +
> +    TAILQ_REMOVE(&client->peer_list, peer, next);
> +    for (vector = 0; vector < peer->vectors_count; vector++) {
> +        close(peer->vectors[vector]);
> +    }
> +
> +    free(peer);
> +}
> +
> +/* handle message coming from server (new peer, new vectors) */
> +static int
> +handle_server_msg(struct ivshmem_client *client)
> +{
> +    struct ivshmem_client_peer *peer;
> +    long peer_id;
> +    int ret, fd;
> +
> +    ret = read_one_msg(client, &peer_id, &fd);
> +    if (ret < 0) {
> +        return -1;
> +    }
> +
> +    /* can return a peer or the local client */
> +    peer = ivshmem_client_search_peer(client, peer_id);
> +
> +    /* delete peer */
> +    if (fd == -1) {
> +
Maybe the above check should be moved before looking up the peer;
the peer check below would then be unnecessary.

> +        if (peer == NULL || peer == &client->local) {
> +            debug_log(client, "receive delete for invalid peer %ld",

Missing '\n'?

> peer_id);
> +            return -1;
> +        }
> +
> +        debug_log(client, "delete peer id = %ld\n", peer_id);
> +        free_peer(client, peer);
> +        return 0;
> +    }
> +
> +    /* new peer */
> +    if (peer == NULL) {
> +        peer = malloc(sizeof(*peer));

g_malloc0?

> +        if (peer == NULL) {
> +            debug_log(client, "cannot allocate new peer\n");
> +            return -1;
> +        }
> +        memset(peer, 0, sizeof(*peer));
> +        peer->id = peer_id;
> +        peer->vectors_count = 0;
> +        TAILQ_INSERT_TAIL(&client->peer_list, peer, next);
> +        debug_log(client, "new peer id = %ld\n", peer_id);
> +    }
> +
> +    /* new vector */
> +    debug_log(client, "  new vector %d (fd=%d) for peer id %ld\n",
> +              peer->vectors_count, fd, peer->id);
> +    peer->vectors[peer->vectors_count] = fd;
> +    peer->vectors_count++;
> +
> +    return 0;
> +}
> +
> +/* init a new ivshmem client */
> +int
> +ivshmem_client_init(struct ivshmem_client *client, const char *unix_sock_path,
> +                    ivshmem_client_notif_cb_t notif_cb, void *notif_arg,
> +                    int verbose)
> +{
> +    unsigned i;
> +
> +    memset(client, 0, sizeof(*client));
> +
> +    snprintf(client->unix_sock_path, sizeof(client->unix_sock_path),
> +             "%s", unix_sock_path);
> +
> +    for (i = 0; i < IVSHMEM_CLIENT_MAX_VECTORS; i++) {
> +        client->local.vectors[i] = -1;
> +    }
> +
> +    TAILQ_INIT(&client->peer_list);
> +    client->local.id = -1;
> +
> +    client->notif_cb = notif_cb;
> +    client->notif_arg = notif_arg;
> +    client->verbose = verbose;

Missing client->sock_fd = -1; ?
> +
> +    return 0;
> +}
> +
> +/* create and connect to the unix socket */
> +int
> +ivshmem_client_connect(struct ivshmem_client *client)
> +{
> +    struct sockaddr_un sun;
> +    int fd;
> +    long tmp;
> +
> +    debug_log(client, "connect to client %s\n", client->unix_sock_path);
> +
> +    client->sock_fd = socket(AF_UNIX, SOCK_STREAM, 0);
> +    if (client->sock_fd < 0) {
> +        debug_log(client, "cannot create socket: %s\n", strerror(errno));
> +        return -1;
> +    }
> +
> +    sun.sun_family = AF_UNIX;
> +    snprintf(sun.sun_path, sizeof(sun.sun_path), "%s", client->unix_sock_path);
> +    if (connect(client->sock_fd, (struct sockaddr *)&sun, sizeof(sun)) < 0) {
> +        debug_log(client, "cannot connect to %s: %s\n", sun.sun_path,
> +                  strerror(errno));
> +        close(client->sock_fd);
> +        client->sock_fd = -1;
> +        return -1;
> +    }
> +
> +    /* first, we expect our index + a fd == -1 */
> +    if (read_one_msg(client, &client->local.id, &fd) < 0 ||
> +        client->local.id < 0 || fd != -1) {
Why not fd < 0?

> +        debug_log(client, "cannot read from server\n");
> +        close(client->sock_fd);
> +        client->sock_fd = -1;
> +        return -1;
> +    }
> +    debug_log(client, "our_id=%ld\n", client->local.id);
> +
> +    /* now, we expect shared mem fd + a -1 index, note that shm fd
> +     * is not used */
> +    if (read_one_msg(client, &tmp, &fd) < 0 ||
> +        tmp != -1 || fd < 0) {
> +        debug_log(client, "cannot read from server (2)\n");
> +        close(client->sock_fd);
> +        client->sock_fd = -1;
> +        return -1;
I think the error-handling logic could be moved to the end of this
function, reducing duplicated code. Something like this:

        goto err;
    }
    ...
    return 0;
err:
    debug_log(client, "cannot read from server (2)\n");
    close(client->sock_fd);
    client->sock_fd = -1;
    return -1;

> +    }
> +    debug_log(client, "shm_fd=%d\n", fd);
> +
> +    return 0;
> +}
> +
> +/* close connection to the server, and free all peer structures */
> +void
> +ivshmem_client_close(struct ivshmem_client *client)
> +{
> +    struct ivshmem_client_peer *peer;
> +    unsigned i;
> +
> +    debug_log(client, "close client\n");
> +
> +    while ((peer = TAILQ_FIRST(&client->peer_list)) != NULL) {
> +        free_peer(client, peer);
> +    }
> +
> +    close(client->sock_fd);
> +    client->sock_fd = -1;
> +    client->local.id = -1;
> +    for (i = 0; i < IVSHMEM_CLIENT_MAX_VECTORS; i++) {
> +        client->local.vectors[i] = -1;
> +    }
> +}
> +
> +/* get the fd_set according to the unix socket and peer list */
> +void
> +ivshmem_client_get_fds(const struct ivshmem_client *client, fd_set *fds,
> +                       int *maxfd)
> +{
> +    int fd;
> +    unsigned vector;
> +
> +    FD_SET(client->sock_fd, fds);
> +    if (client->sock_fd >= *maxfd) {
> +        *maxfd = client->sock_fd + 1;
> +    }
> +
> +    for (vector = 0; vector < client->local.vectors_count; vector++) {
> +        fd = client->local.vectors[vector];
> +        FD_SET(fd, fds);
> +        if (fd >= *maxfd) {
> +            *maxfd = fd + 1;
> +        }
> +    }
> +}
> +
> +/* handle events from eventfd: just print a message on notification */
> +static int
> +handle_event(struct ivshmem_client *client, const fd_set *cur, int maxfd)
> +{
> +    struct ivshmem_client_peer *peer;
> +    uint64_t kick;
> +    unsigned i;
> +    int ret;
> +
> +    peer = &client->local;
> +
> +    for (i = 0; i < peer->vectors_count; i++) {
> +        if (peer->vectors[i] >= maxfd || !FD_ISSET(peer->vectors[i], cur)) {
> +            continue;
> +        }
> +
> +        ret = read(peer->vectors[i], &kick, sizeof(kick));
> +        if (ret < 0) {
> +            return ret;
> +        }
> +        if (ret != sizeof(kick)) {
> +            debug_log(client, "invalid read size = %d\n", ret);
> +            errno = EINVAL;
> +            return -1;
> +        }
> +        debug_log(client, "received event on fd %d vector %d: %ld\n",
> +                  peer->vectors[i], i, kick);
> +        if (client->notif_cb != NULL) {
> +            client->notif_cb(client, peer, i, client->notif_arg);
> +        }
> +    }
> +
> +    return 0;
> +}
> +
> +/* read and handle new messages on the given fd_set */
> +int
> +ivshmem_client_handle_fds(struct ivshmem_client *client, fd_set *fds, int
> maxfd)
> +{
> +    if (client->sock_fd < maxfd && FD_ISSET(client->sock_fd, fds) &&
> +        handle_server_msg(client) < 0 && errno != EINTR) {
> +        debug_log(client, "handle_server_msg() failed\n");
> +        return -1;
> +    } else if (handle_event(client, fds, maxfd) < 0 && errno != EINTR) {
> +        debug_log(client, "handle_event() failed\n");
> +        return -1;
> +    }
> +
> +    return 0;
> +}
> +
> +/* send a notification on a vector of a peer */
> +int
> +ivshmem_client_notify(const struct ivshmem_client *client,
> +                      const struct ivshmem_client_peer *peer, unsigned
> vector)
> +{
> +    uint64_t kick;
> +    int fd;
> +
> +    if (vector > peer->vectors_count) {

Maybe if (vector >= peer->vectors_count)? Otherwise vector == vectors_count
reads past the end of the peer->vectors[] array.

> +        debug_log(client, "invalid vector %u on peer %ld\n", vector,
> peer->id);
> +        return -1;
> +    }
> +    fd = peer->vectors[vector];

Or fd = peer->vectors[vector - 1]; ?

> +    debug_log(client, "notify peer %ld on vector %d, fd %d\n", peer->id,
> vector,
> +              fd);
> +
> +    kick = 1;
> +    if (write(fd, &kick, sizeof(kick)) != sizeof(kick)) {
> +        fprintf(stderr, "could not write to %d: %s\n", peer->vectors[vector],
> +                strerror(errno));
Why not use debug_log() here?

> +        return -1;
> +    }
> +    return 0;
> +}
> +
> +/* send a notification to all vectors of a peer */
> +int
> +ivshmem_client_notify_all_vects(const struct ivshmem_client *client,
> +                                const struct ivshmem_client_peer
> *peer)
> +{
> +    unsigned vector;
> +    int ret = 0;
> +
> +    for (vector = 0; vector < peer->vectors_count; vector++) {
> +        if (ivshmem_client_notify(client, peer, vector) < 0) {
> +            ret = -1;
The ret value will be overwritten when multiple vectors fail. Do we need
to store the failed status for the server?

> +        }
> +    }
> +
> +    return ret;
> +}
> +
> +/* send a notification to all peers */
> +int
> +ivshmem_client_notify_broadcast(const struct ivshmem_client *client)
> +{
> +    struct ivshmem_client_peer *peer;
> +    int ret = 0;
> +
> +    TAILQ_FOREACH(peer, &client->peer_list, next) {
> +        if (ivshmem_client_notify_all_vects(client, peer) < 0) {
> +            ret = -1;
> +        }
> +    }
> +
> +    return ret;
> +}
> +
> +/* lookup peer from its id */
> +struct ivshmem_client_peer *
> +ivshmem_client_search_peer(struct ivshmem_client *client, long peer_id)
> +{
> +    struct ivshmem_client_peer *peer;
> +
> +    if (peer_id == client->local.id) {
> +        return &client->local;
> +    }
> +
> +    TAILQ_FOREACH(peer, &client->peer_list, next) {
> +        if (peer->id == peer_id) {
> +            return peer;
> +        }
> +    }
> +    return NULL;
> +}
> +
> +/* dump our info, the list of peers their vectors on stdout */
> +void
> +ivshmem_client_dump(const struct ivshmem_client *client)
> +{
> +    const struct ivshmem_client_peer *peer;
> +    unsigned vector;
> +
> +    /* dump local infos */
> +    peer = &client->local;
> +    printf("our_id = %ld\n", peer->id);
> +    for (vector = 0; vector < peer->vectors_count; vector++) {
> +        printf("  vector %d is enabled (fd=%d)\n", vector,
> +               peer->vectors[vector]);
> +    }
> +
> +    /* dump peers */
> +    TAILQ_FOREACH(peer, &client->peer_list, next) {
> +        printf("peer_id = %ld\n", peer->id);
> +
> +        for (vector = 0; vector < peer->vectors_count; vector++) {
> +            printf("  vector %d is enabled (fd=%d)\n", vector,
> +                   peer->vectors[vector]);
> +        }
> +    }
> +}

To be continued...

Best regards,
-Gonglei

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Qemu-devel] [PATCH v3 1/2] contrib: add ivshmem client and server
  2014-08-08 14:51     ` Stefan Hajnoczi
@ 2014-08-18 12:09       ` David Marchand
  -1 siblings, 0 replies; 36+ messages in thread
From: David Marchand @ 2014-08-18 12:09 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: qemu-devel, Olivier Matz, kvm, claudio.fontana, armbru, pbonzini,
	jani.kokkonen, cam


On 08/08/2014 04:51 PM, Stefan Hajnoczi wrote:
> On Fri, Aug 08, 2014 at 10:55:17AM +0200, David Marchand wrote:
>
> Looks good, a few minor comments:
>
>> diff --git a/contrib/ivshmem-client/Makefile b/contrib/ivshmem-client/Makefile
>> new file mode 100644
>> index 0000000..eee97c6
>> --- /dev/null
>> +++ b/contrib/ivshmem-client/Makefile
>> @@ -0,0 +1,29 @@
>
> CCed Peter Maydell for a second opinion, I'd suggest hooking up to
> QEMU's top-level ./Makefile.  QEMU does not do recursive make.
>
> The advantages of hooking up QEMU's Makefile are:
>
> 1. So that ivshmem client/server code is built by default (on supported
>     host platforms) and bitrot is avoided.
>
> 2. So that you don't have to duplicate rules.mak or any other build
>     infrastructure.
>

Ok, done.
But in this case, should we really keep the files in contrib/ ?
I used this directory but I am not too sure about this.

Maybe hw/misc/ivshmem/ would be better ?


The rest of your comments have been integrated in my tree.

Is it preferred to send fixes alongside the original patches, or should I
send a squashed version?
(I suppose the former is better as it is easier to read).


Thanks Stefan.


-- 
David Marchand

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Qemu-devel] [PATCH v3 1/2] contrib: add ivshmem client and server
  2014-08-10  3:57     ` [Qemu-devel] " Gonglei
@ 2014-08-18 12:19       ` David Marchand
  -1 siblings, 0 replies; 36+ messages in thread
From: David Marchand @ 2014-08-18 12:19 UTC (permalink / raw)
  To: Gonglei, qemu-devel
  Cc: 'Olivier Matz',
	kvm, claudio.fontana, armbru, pbonzini, jani.kokkonen, cam,
	arei.gonglei

On 08/10/2014 05:57 AM, Gonglei wrote:
>> +    /* can return a peer or the local client */
>> +    peer = ivshmem_client_search_peer(client, peer_id);
>> +
>> +    /* delete peer */
>> +    if (fd == -1) {
>> +
> Maybe the above check should be moved before getting the peer.
> And the next peer check would be redundant.

We always need to know the peer, either for a deletion, creation or update.


>
>> +        if (peer == NULL || peer == &client->local) {
>> +            debug_log(client, "receive delete for invalid peer %ld",
>
> Missing '\n' ?

Ok.

>
>> peer_id);
>> +            return -1;
>> +        }
>> +
>> +        debug_log(client, "delete peer id = %ld\n", peer_id);
>> +        free_peer(client, peer);
>> +        return 0;
>> +    }
>> +
>> +    /* new peer */
>> +    if (peer == NULL) {
>> +        peer = malloc(sizeof(*peer));
>
> g_malloc0 ?.

Ok, replaced malloc/free with g_malloc/g_free in client and server.


>> +    client->notif_cb = notif_cb;
>> +    client->notif_arg = notif_arg;
>> +    client->verbose = verbose;
>
> Missing client->sock_fd = -1; ?

Ok.


>> +
>> +    /* first, we expect our index + a fd == -1 */
>> +    if (read_one_msg(client, &client->local.id, &fd) < 0 ||
>> +        client->local.id < 0 || fd != -1) {
> Why not fd < 0 ?

Because the server will send us an id and a fd == -1; see 
ivshmem-server.c send_initial_info().

>
>> +        debug_log(client, "cannot read from server\n");
>> +        close(client->sock_fd);
>> +        client->sock_fd = -1;
>> +        return -1;
>> +    }
>> +    debug_log(client, "our_id=%ld\n", client->local.id);
>> +
>> +    /* now, we expect shared mem fd + a -1 index, note that shm fd
>> +     * is not used */
>> +    if (read_one_msg(client, &tmp, &fd) < 0 ||
>> +        tmp != -1 || fd < 0) {
>> +        debug_log(client, "cannot read from server (2)\n");
>> +        close(client->sock_fd);
>> +        client->sock_fd = -1;
>> +        return -1;
> I think the error-handling logic can be moved to the end of this function,
> reducing duplicated code. Something like this:
>
> 	goto err;
>    }
> err:
> 	debug_log(client, "cannot read from server (2)\n");
>      close(client->sock_fd);
>      client->sock_fd = -1;
>      return -1;

Ok, I also updated the server.

>> +    int fd;
>> +
>> +    if (vector > peer->vectors_count) {
>
> Maybe if (vector >= peer->vectors_count)? Otherwise vector == vectors_count
> reads past the end of the peer->vectors[] array.

Oh yes, good catch.
It should not happen, at the moment, but it is wrong, indeed.

>> +/* send a notification to all vectors of a peer */
>> +int
>> +ivshmem_client_notify_all_vects(const struct ivshmem_client *client,
>> +                                const struct ivshmem_client_peer
>> *peer)
>> +{
>> +    unsigned vector;
>> +    int ret = 0;
>> +
>> +    for (vector = 0; vector < peer->vectors_count; vector++) {
>> +        if (ivshmem_client_notify(client, peer, vector) < 0) {
>> +            ret = -1;
> The ret value will be overwritten when multiple vectors fail. Do we need
> to store the failed status for the server?

It indicates that we could not notify *all* clients.



Thanks Gonglei.


-- 
David Marchand

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Qemu-devel] [PATCH v3 1/2] contrib: add ivshmem client and server
@ 2014-08-18 12:19       ` David Marchand
  0 siblings, 0 replies; 36+ messages in thread
From: David Marchand @ 2014-08-18 12:19 UTC (permalink / raw)
  To: Gonglei, qemu-devel
  Cc: 'Olivier Matz',
	kvm, claudio.fontana, armbru, arei.gonglei, pbonzini,
	jani.kokkonen, cam

On 08/10/2014 05:57 AM, Gonglei wrote:
>> +    /* can return a peer or the local client */
>> +    peer = ivshmem_client_search_peer(client, peer_id);
>> +
>> +    /* delete peer */
>> +    if (fd == -1) {
>> +
> Maybe the above check should be moved before getting the peer.
> And the next check peer is extra.

We always need to know the peer, either for a deletion, creation or update.


>
>> +        if (peer == NULL || peer == &client->local) {
>> +            debug_log(client, "receive delete for invalid peer %ld",
>
> Missing '\n' ?

Ok.

>
>> peer_id);
>> +            return -1;
>> +        }
>> +
>> +        debug_log(client, "delete peer id = %ld\n", peer_id);
>> +        free_peer(client, peer);
>> +        return 0;
>> +    }
>> +
>> +    /* new peer */
>> +    if (peer == NULL) {
>> +        peer = malloc(sizeof(*peer));
>
> g_malloc0 ?.

Ok, replaced malloc/free with g_malloc/g_free in client and server.


>> +    client->notif_cb = notif_cb;
>> +    client->notif_arg = notif_arg;
>> +    client->verbose = verbose;
>
> Missing client->sock_fd = -1; ?

Ok.


>> +
>> +    /* first, we expect our index + a fd == -1 */
>> +    if (read_one_msg(client, &client->local.id, &fd) < 0 ||
>> +        client->local.id < 0 || fd != -1) {
> Why not fd < 0 ?

Because the server will send us an id and a fd == -1 see 
ivshmem-server.c send_initial_info().

>
>> +        debug_log(client, "cannot read from server\n");
>> +        close(client->sock_fd);
>> +        client->sock_fd = -1;
>> +        return -1;
>> +    }
>> +    debug_log(client, "our_id=%ld\n", client->local.id);
>> +
>> +    /* now, we expect shared mem fd + a -1 index, note that shm fd
>> +     * is not used */
>> +    if (read_one_msg(client, &tmp, &fd) < 0 ||
>> +        tmp != -1 || fd < 0) {
>> +        debug_log(client, "cannot read from server (2)\n");
>> +        close(client->sock_fd);
>> +        client->sock_fd = -1;
>> +        return -1;
> I think the error logic handle can move the end of this function, reducing
> duplicated code. Something like this:
>
> 	goto err;
>    }
> err:
> 	debug_log(client, "cannot read from server (2)\n");
>      close(client->sock_fd);
>      client->sock_fd = -1;
>      return -1;

Ok, I also updated the server.

>> +    int fd;
>> +
>> +    if (vector > peer->vectors_count) {
>
> Maybe if (vector >= peer->vectors_count) , otherwise the
> peer->vectors[] array bounds.

Oh yes, good catch.
It should not happen, at the moment, but it is wrong, indeed.
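The off-by-one is easy to see with a cut-down stand-in for the peer structure (the field names mirror the patch, the fixed array size is an assumption): with vectors_count == 8, valid indices are 0..7, so the guard must reject vector == 8, which the original `>` check let through.

```c
/* Cut-down stand-in for ivshmem_client_peer. */
struct peer {
    int vectors[8];
    unsigned vectors_count;
};

/* Corrected guard: indices 0..vectors_count-1 are valid, so anything
 * >= vectors_count must be rejected before indexing peer->vectors[]. */
static int vector_in_range(const struct peer *peer, unsigned vector)
{
    return vector < peer->vectors_count;
}
```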

>> +/* send a notification to all vectors of a peer */
>> +int
>> +ivshmem_client_notify_all_vects(const struct ivshmem_client *client,
>> +                                const struct ivshmem_client_peer
>> *peer)
>> +{
>> +    unsigned vector;
>> +    int ret = 0;
>> +
>> +    for (vector = 0; vector < peer->vectors_count; vector++) {
>> +        if (ivshmem_client_notify(client, peer, vector) < 0) {
>> +            ret = -1;
> The ret value will be the same no matter how many clients failed. Do we
> need to store the failed status for the server?

It indicates that we could not notify *all* clients.



Thanks Gonglei.


-- 
David Marchand

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Qemu-devel] [PATCH v3 2/2] docs: update ivshmem device spec
  2014-08-08 15:02     ` Stefan Hajnoczi
@ 2014-08-26  6:47       ` David Marchand
  -1 siblings, 0 replies; 36+ messages in thread
From: David Marchand @ 2014-08-26  6:47 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: qemu-devel, kvm, claudio.fontana, armbru, pbonzini, jani.kokkonen, cam

Hello Stefan,

On 08/08/2014 05:02 PM, Stefan Hajnoczi wrote:
> On Fri, Aug 08, 2014 at 10:55:18AM +0200, David Marchand wrote:
>> +For each client (a QEMU process) that connects to the server:
>> +- the server assigns an ID for this client and sends this ID to it as the first
>> +  message,
>> +- the server sends a fd to the shared memory object to this client,
>> +- the server creates a new set of host eventfds associated to the new client and
>> +  sends this set to all already connected clients,
>> +- finally, the server sends all the eventfds sets for all clients to the new
>> +  client.
>
> The protocol is not extensible and no version number is exchanged.  For
> the most part this should be okay because clients must run on the same
> machine as the server.  It is assumed clients and server are compatible
> with each other.
>
> I wonder if we'll get into trouble later if the protocol needs to be
> extended or some operation needs to happen, like upgrading QEMU or the
> ivshmem-server.  At the very least someone building from source but
> using system QEMU or ivshmem-server could get confusing failures if the
> protocol doesn't match.
>
> How about sending a version message as the first thing during a
> connection?

I am not too sure about this.

This would break the current base version of the protocol.

Using a version message supposes that we want to keep ivshmem-server and QEMU 
separated (for example, in two distribution packages) while we can avoid 
this, so why would we do so?

If we want the ivshmem-server to come with QEMU, then both are supposed 
to be aligned on your system.
If you want to test local modifications, then it means you know what you 
are doing and you will call the right ivshmem-server binary with the 
right QEMU binary.
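For reference, the connect sequence quoted at the top of this message can be sketched as a scripted message stream: each message is a long index plus an optional fd (passed via SCM_RIGHTS on the real socket). Everything here is illustrative, driven by an array instead of a real server; the peer ids and fd values are made up.

```c
/* One protocol message: an index plus an optional fd (-1 = no fd). */
struct msg { long index; int fd; };

/* What a server with one already-connected peer (id 1, one vector)
 * might send a new client that is assigned id 2. */
static const struct msg script[] = {
    { 2, -1 },   /* step 1: our id, no fd */
    { -1, 10 },  /* step 2: shm fd, index -1 */
    { 1, 11 },   /* step 3+: eventfd for peer 1, vector 0 */
    { 2, 12 },   /* our own eventfd set is sent back to us too */
};

/* Validate the framing described in the spec; returns the number of
 * eventfd messages seen, or -1 on bad framing. */
static int run_handshake(const struct msg *m, int n, long *our_id)
{
    int i, nevents = 0;

    if (n < 2 || m[0].fd != -1 || m[0].index < 0) {
        return -1;          /* first message must be our id, fd == -1 */
    }
    *our_id = m[0].index;
    if (m[1].index != -1 || m[1].fd < 0) {
        return -1;          /* second message must carry the shm fd */
    }
    for (i = 2; i < n; i++) {
        if (m[i].fd < 0) {
            return -1;      /* remaining messages each carry an eventfd */
        }
        nevents++;
    }
    return nevents;
}
```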


Thanks.


-- 
David Marchand


* Re: [Qemu-devel] [PATCH v3 2/2] docs: update ivshmem device spec
  2014-08-26  6:47       ` David Marchand
@ 2014-08-26 11:04         ` Paolo Bonzini
  -1 siblings, 0 replies; 36+ messages in thread
From: Paolo Bonzini @ 2014-08-26 11:04 UTC (permalink / raw)
  To: David Marchand, Stefan Hajnoczi
  Cc: qemu-devel, kvm, claudio.fontana, armbru, jani.kokkonen, cam

Il 26/08/2014 08:47, David Marchand ha scritto:
> 
> Using a version message supposes we want to keep ivshmem-server and QEMU
> separated (for example, in two distribution packages) while we can avoid
> this, so why would we do so ?
> 
> If we want the ivshmem-server to come with QEMU, then both are supposed
> to be aligned on your system.

What about upgrading QEMU and ivshmem-server while you have existing
guests?  You cannot restart ivshmem-server, and the new QEMU would have
to talk to the old ivshmem-server.

Paolo

> If you want to test local modifications, then it means you know what you
> are doing and you will call the right ivshmem-server binary with the
> right QEMU binary.



* Re: [Qemu-devel] [PATCH v3 2/2] docs: update ivshmem device spec
  2014-08-26 11:04         ` Paolo Bonzini
@ 2014-08-28  9:49           ` Stefan Hajnoczi
  -1 siblings, 0 replies; 36+ messages in thread
From: Stefan Hajnoczi @ 2014-08-28  9:49 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: David Marchand, qemu-devel, kvm, claudio.fontana, armbru,
	jani.kokkonen, cam


On Tue, Aug 26, 2014 at 01:04:30PM +0200, Paolo Bonzini wrote:
> Il 26/08/2014 08:47, David Marchand ha scritto:
> > 
> > Using a version message supposes we want to keep ivshmem-server and QEMU
> > separated (for example, in two distribution packages) while we can avoid
> > this, so why would we do so ?
> > 
> > If we want the ivshmem-server to come with QEMU, then both are supposed
> > to be aligned on your system.
> 
> What about upgrading QEMU and ivshmem-server while you have existing
> guests?  You cannot restart ivshmem-server, and the new QEMU would have
> to talk to the old ivshmem-server.

Version negotiation also helps avoid confusion if someone combines
ivshmem-server and QEMU from different origins (e.g. built from source
and distro packaged).

It's a safeguard to prevent hard-to-diagnose failures when the system is
misconfigured.

Stefan



* Re: [Qemu-devel] [PATCH v3 2/2] docs: update ivshmem device spec
  2014-08-28  9:49           ` Stefan Hajnoczi
@ 2014-09-01  9:52             ` David Marchand
  -1 siblings, 0 replies; 36+ messages in thread
From: David Marchand @ 2014-09-01  9:52 UTC (permalink / raw)
  To: Stefan Hajnoczi, Paolo Bonzini
  Cc: qemu-devel, kvm, claudio.fontana, armbru, jani.kokkonen, cam

On 08/28/2014 11:49 AM, Stefan Hajnoczi wrote:
> On Tue, Aug 26, 2014 at 01:04:30PM +0200, Paolo Bonzini wrote:
>> Il 26/08/2014 08:47, David Marchand ha scritto:
>>>
>>> Using a version message supposes we want to keep ivshmem-server and QEMU
>>> separated (for example, in two distribution packages) while we can avoid
>>> this, so why would we do so ?
>>>
>>> If we want the ivshmem-server to come with QEMU, then both are supposed
>>> to be aligned on your system.
>>
>> What about upgrading QEMU and ivshmem-server while you have existing
>> guests?  You cannot restart ivshmem-server, and the new QEMU would have
>> to talk to the old ivshmem-server.
>
> Version negotiation also helps avoid confusion if someone combines
> ivshmem-server and QEMU from different origins (e.g. built from source
> and distro packaged).
>
> It's a safeguard to prevent hard-to-diagnose failures when the system is
> misconfigured.
>

Hmm, so you want the code to be defensive against misuse, why not.

I wanted to keep modifications to ivshmem as small as possible in a 
first phase (all the more so as there are existing ivshmem users out 
there who, I think, would be impacted by a protocol change).

Sending the version as the first "vm_id" with an associated fd of -1 
before sending the real client id should work with the existing QEMU 
client code (hw/misc/ivshmem.c).

Do you have a better idea?
Is there a best practice in QEMU for "version negotiation" that could 
work with the ivshmem protocol?
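The idea could look like the sketch below: the server prepends one extra (index, fd == -1) message carrying the protocol version before the message carrying the client id. This is only an illustration of the proposal, not the eventual patch; the constant name IVSHMEM_PROTOCOL_VERSION and the message values are assumptions.

```c
#include <stdio.h>

/* Assumed name for the negotiated protocol version. */
enum { IVSHMEM_PROTOCOL_VERSION = 0 };

/* One protocol message: an index plus an optional fd (-1 = no fd). */
struct msg { long index; int fd; };

/* New-style client: the first message is the version (fd == -1), the
 * second is our id (fd == -1); reject a server speaking another version. */
static int check_version(const struct msg *m, int n, long *our_id)
{
    if (n < 2 || m[0].fd != -1 || m[1].fd != -1) {
        return -1;
    }
    if (m[0].index != IVSHMEM_PROTOCOL_VERSION) {
        fprintf(stderr, "server protocol version mismatch\n");
        return -1;
    }
    *our_id = m[1].index;
    return 0;
}
```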

I have a v4 ready with this (and all the pending comments), I will send 
it later unless a better idea is exposed.


Thanks.

-- 
David Marchand


* Re: [PATCH v3 2/2] docs: update ivshmem device spec
  2014-09-01  9:52             ` David Marchand
@ 2014-09-09 19:04               ` Eric Blake
  -1 siblings, 0 replies; 36+ messages in thread
From: Eric Blake @ 2014-09-09 19:04 UTC (permalink / raw)
  To: David Marchand, Stefan Hajnoczi, Paolo Bonzini
  Cc: kvm, claudio.fontana, qemu-devel, armbru, jani.kokkonen, cam


On 09/01/2014 03:52 AM, David Marchand wrote:

>>> What about upgrading QEMU and ivshmem-server while you have existing
>>> guests?  You cannot restart ivshmem-server, and the new QEMU would have
>>> to talk to the old ivshmem-server.
>>
>> Version negotiation also helps avoid confusion if someone combines
>> ivshmem-server and QEMU from different origins (e.g. built from source
>> and distro packaged).

Don't underestimate the likelihood of this happening.  Any long-running
process (which an ivshmem-server will be) continues running at the old
version, even when a package upgrade installs a new qemu binary; the new
binary should still be able to manage connections to the already-running
server.

Even neater would be a solution where an existing ivshmem-server could
re-exec an updated ivshmem-server binary that resulted from a distro
upgrade and hand over all state required for the new server to take over
from the point managed by the old server, so that you aren't stuck
running the old binary forever.  But that's a lot trickier to write, so
it is not necessary for a first implementation; and if you do that, then
you have the reverse situation to worry about (the new server must still
accept communication from existing old qemu binaries).

Note that the goal here is to support upgrades; it is probably okay if
downgrading from a new binary back to an old doesn't work correctly
(because the new software was using a feature not present in the old).

>>
>> It's a safeguard to prevent hard-to-diagnose failures when the system is
>> misconfigured.
>>
> 
> Hum, so you want the code to be defensive against mis-use, why not.
> 
> I wanted to keep modifications on ivshmem as little as possible in a
> first phase (all the more so as there are potential ivshmem users out
> there that I think will be impacted by a protocol change).

Existing ivshmem users MUST be aware that they are using something that
is not yet polished, and be prepared to make the upgrade to the polished
version.  It's best to minimize the hassle by making them upgrade
exactly once to a fully-robust version, rather than to have them upgrade
to a slightly-more robust version only to find out we didn't plan ahead
well enough to make further extensions in a back-compatible manner.

> 
> Sending the version as the first "vm_id" with an associated fd to -1
> before sending the real client id should work with existing QEMU client
> code (hw/misc/ivshmem.c).
> 
> Do you have a better idea ?
> Is there a best practice in QEMU for "version negotiation" that could
> work with ivshmem protocol ?

QMP starts off with a mandatory "qmp_capabilities" handshake, although
we haven't yet had to define any capabilities where cross-versioned
communication differs as a result.  Migration is somewhat of an example,
except that it is one-directional (we don't have a feedback path), so it
is somewhat best effort.  The qcow2 v3 file format is an example of
declaring features, rather than version numbers, and making decisions
about whether a feature is compatible (older clients can safely ignore
the bit, without corrupting the image but possibly having worse
performance) vs. incompatible (older clients must reject the image,
because not handling the feature correctly would corrupt the image).
The best handshakes are bi-directional - both sides advertise their
version (or better, their features), and a well-defined algorithm for
settling on the common subset of advertised features then ensures that
the two sides know how to talk to each other, or give a reason for
either side to disconnect early because of a missing feature.
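The bi-directional, feature-based handshake described above can be reduced to a few lines: each side advertises a bitmask, the usable set is the intersection, and a missing "incompatible" (required) feature is grounds to disconnect early. The feature-bit names below are invented for illustration; nothing here is from the ivshmem patches.

```c
#include <stdint.h>

/* Hypothetical feature bits for the ivshmem protocol. */
enum {
    FEAT_BASE        = 1u << 0,   /* original protocol */
    FEAT_VERSION_MSG = 1u << 1,   /* leading version message */
    FEAT_IRQFD       = 1u << 2,   /* imagined future feature */
};

/* Settle on the common subset of advertised features.  Returns the
 * negotiated set, or 0 if a required ("incompatible"-style) feature
 * is not shared, in which case the caller should disconnect. */
static uint32_t negotiate(uint32_t ours, uint32_t theirs, uint32_t required)
{
    uint32_t common = ours & theirs;

    if ((common & required) != required) {
        return 0;   /* disconnect early, with a clear reason */
    }
    return common;
}
```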

> 
> I have a v4 ready with this (and all the pending comments), I will send
> it later unless a better idea is exposed.
> 
> 
> Thanks.
> 

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org




end of thread, other threads:[~2014-09-09 19:04 UTC | newest]

Thread overview: 36+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-08-08  8:55 [PATCH v3 0/2] ivshmem: update documentation, add client/server tools David Marchand
2014-08-08  8:55 ` [Qemu-devel] " David Marchand
2014-08-08  8:55 ` [PATCH v3 1/2] contrib: add ivshmem client and server David Marchand
2014-08-08  8:55   ` [Qemu-devel] " David Marchand
2014-08-08 14:51   ` Stefan Hajnoczi
2014-08-08 14:51     ` Stefan Hajnoczi
2014-08-18 12:09     ` David Marchand
2014-08-18 12:09       ` David Marchand
2014-08-10  3:57   ` Gonglei
2014-08-10  3:57     ` [Qemu-devel] " Gonglei
2014-08-18 12:19     ` David Marchand
2014-08-18 12:19       ` David Marchand
2014-08-08  8:55 ` [PATCH v3 2/2] docs: update ivshmem device spec David Marchand
2014-08-08  8:55   ` [Qemu-devel] " David Marchand
2014-08-08  9:04   ` Claudio Fontana
2014-08-08  9:04     ` [Qemu-devel] " Claudio Fontana
2014-08-08  9:32     ` David Marchand
2014-08-08  9:32       ` [Qemu-devel] " David Marchand
2014-08-08 15:02   ` Stefan Hajnoczi
2014-08-08 15:02     ` Stefan Hajnoczi
2014-08-26  6:47     ` David Marchand
2014-08-26  6:47       ` David Marchand
2014-08-26 11:04       ` Paolo Bonzini
2014-08-26 11:04         ` Paolo Bonzini
2014-08-28  9:49         ` Stefan Hajnoczi
2014-08-28  9:49           ` Stefan Hajnoczi
2014-09-01  9:52           ` David Marchand
2014-09-01  9:52             ` David Marchand
2014-09-09 19:04             ` Eric Blake
2014-09-09 19:04               ` [Qemu-devel] " Eric Blake
2014-08-08  9:30 ` [Qemu-devel] [PATCH v3 0/2] ivshmem: update documentation, add client/server tools Gonglei (Arei)
2014-08-08  9:30   ` Gonglei (Arei)
2014-08-08  9:54   ` David Marchand
2014-08-08  9:54     ` David Marchand
2014-08-08 10:26     ` Gonglei (Arei)
2014-08-08 10:26       ` Gonglei (Arei)
