* [PATCH v3 0/4] Add virtio transport for AF_VSOCK
@ 2015-12-09 12:03 Stefan Hajnoczi
  2015-12-09 12:03 ` [PATCH v3 1/4] VSOCK: Introduce virtio-vsock-common.ko Stefan Hajnoczi
                   ` (9 more replies)
  0 siblings, 10 replies; 23+ messages in thread
From: Stefan Hajnoczi @ 2015-12-09 12:03 UTC (permalink / raw)
  To: kvm
  Cc: Stefan Hajnoczi, Michael S. Tsirkin, netdev, virtualization,
	Matt Benjamin, Christoffer Dall, matt.ma

Note: the virtio-vsock device specification is currently under review but not
yet finalized.  Please review this code but don't merge until I send an update
when the spec is finalized.  Thanks!

v3:
 * Remove unnecessary 3-way handshake, just do REQUEST/RESPONSE instead
   of REQUEST/RESPONSE/ACK
 * Remove SOCK_DGRAM support and focus on SOCK_STREAM first
   (also drop v2 Patch 1, it's only needed for SOCK_DGRAM)
 * Only allow host->guest connections (same security model as latest
   VMware)
 * Don't put vhost vsock driver into staging
 * Add missing Kconfig dependencies (Arnd Bergmann <arnd@arndb.de>)
 * Remove unneeded variable used to store return value
   (Fengguang Wu <fengguang.wu@intel.com> and Julia Lawall
   <julia.lawall@lip6.fr>)

v2:
 * Rebased onto Linux v4.4-rc2
 * vhost: Refuse to assign reserved CIDs
 * vhost: Refuse guest CID if already in use
 * vhost: Only accept correctly addressed packets (no spoofing!)
 * vhost: Support flexible rx/tx descriptor layout
 * vhost: Add missing total_tx_buf decrement
 * virtio_transport: Fix total_tx_buf accounting
 * virtio_transport: Add virtio_transport global mutex to prevent races
 * common: Notify other side of SOCK_STREAM disconnect (fixes shutdown
   semantics)
 * common: Avoid recursive mutex_lock(tx_lock) for write_space (fixes deadlock)
 * common: Define VIRTIO_VSOCK_TYPE_STREAM/DGRAM hardware interface constants
 * common: Define VIRTIO_VSOCK_SHUTDOWN_RCV/SEND hardware interface constants
 * common: Fix peer_buf_alloc inheritance on child socket

This patch series adds a virtio transport for AF_VSOCK (net/vmw_vsock/).
AF_VSOCK is designed for communication between virtual machines and
hypervisors.  Today its only in-tree transport is VMware's VMCI.

This series implements the proposed virtio-vsock device specification from
here:
http://permalink.gmane.org/gmane.comp.emulators.virtio.devel/980

Most of the work was done by Asias He and Gerd Hoffmann a while back.  I have
picked up the series again.

The QEMU userspace changes are here:
https://github.com/stefanha/qemu/commits/vsock

Why virtio-vsock?
-----------------
Guest<->host communication is currently done over the virtio-serial device.
This makes it hard to port applications written against the sockets API and
restricts communication to a static set of ports.

virtio-vsock uses the sockets API so that applications can rely on familiar
SOCK_STREAM semantics.  Applications on the host can easily connect to guest
agents because the sockets API allows multiple connections to a listen socket
(unlike virtio-serial).  This simplifies the guest<->host communication and
eliminates the need for extra processes on the host to arbitrate virtio-serial
ports.

Overview
--------
This series adds 3 pieces:

1. virtio_transport_common.ko - core virtio vsock code that uses vsock.ko

2. virtio_transport.ko - guest driver

3. drivers/vhost/vsock.ko - host driver

Howto
-----
The following kernel options are needed:
  CONFIG_VSOCKETS=y
  CONFIG_VIRTIO_VSOCKETS=y
  CONFIG_VIRTIO_VSOCKETS_COMMON=y
  CONFIG_VHOST_VSOCK=m

Launch QEMU as follows:
  # qemu ... -device vhost-vsock-pci,id=vhost-vsock-pci0,guest-cid=3

Guest and host can communicate via AF_VSOCK sockets.  The host's CID (address)
is 2 and the guest must be assigned a CID (3 in the example above).

Status
------
This patch series implements the latest draft specification.  Please review.

Asias He (4):
  VSOCK: Introduce virtio-vsock-common.ko
  VSOCK: Introduce virtio-vsock.ko
  VSOCK: Introduce vhost-vsock.ko
  VSOCK: Add Makefile and Kconfig

 drivers/vhost/Kconfig                   |  10 +
 drivers/vhost/Makefile                  |   4 +
 drivers/vhost/vsock.c                   | 628 +++++++++++++++++++++++
 drivers/vhost/vsock.h                   |   4 +
 include/linux/virtio_vsock.h            | 203 ++++++++
 include/uapi/linux/virtio_ids.h         |   1 +
 include/uapi/linux/virtio_vsock.h       |  87 ++++
 net/vmw_vsock/Kconfig                   |  18 +
 net/vmw_vsock/Makefile                  |   2 +
 net/vmw_vsock/virtio_transport.c        | 466 +++++++++++++++++
 net/vmw_vsock/virtio_transport_common.c | 854 ++++++++++++++++++++++++++++++++
 11 files changed, 2277 insertions(+)
 create mode 100644 drivers/vhost/vsock.c
 create mode 100644 drivers/vhost/vsock.h
 create mode 100644 include/linux/virtio_vsock.h
 create mode 100644 include/uapi/linux/virtio_vsock.h
 create mode 100644 net/vmw_vsock/virtio_transport.c
 create mode 100644 net/vmw_vsock/virtio_transport_common.c

-- 
2.5.0


* [PATCH v3 1/4] VSOCK: Introduce virtio-vsock-common.ko
From: Stefan Hajnoczi @ 2015-12-09 12:03 UTC (permalink / raw)
  To: kvm
  Cc: Matt Benjamin, Christoffer Dall, netdev, Michael S. Tsirkin,
	matt.ma, virtualization, Asias He, Stefan Hajnoczi

From: Asias He <asias@redhat.com>

This module contains the common code and header files shared by the
virtio-vsock (guest) and vhost-vsock (host) kernel modules.

Signed-off-by: Asias He <asias@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
v3:
 * Remove unnecessary 3-way handshake, just do REQUEST/RESPONSE instead
   of REQUEST/RESPONSE/ACK
 * Remove SOCK_DGRAM support and focus on SOCK_STREAM first
 * Only allow host->guest connections (same security model as latest
   VMware)
v2:
 * Fix peer_buf_alloc inheritance on child socket
 * Notify other side of SOCK_STREAM disconnect (fixes shutdown
   semantics)
 * Avoid recursive mutex_lock(tx_lock) for write_space (fixes deadlock)
 * Define VIRTIO_VSOCK_TYPE_STREAM/DGRAM hardware interface constants
 * Define VIRTIO_VSOCK_SHUTDOWN_RCV/SEND hardware interface constants
---
 include/linux/virtio_vsock.h            | 203 ++++++++
 include/uapi/linux/virtio_ids.h         |   1 +
 include/uapi/linux/virtio_vsock.h       |  87 ++++
 net/vmw_vsock/virtio_transport_common.c | 854 ++++++++++++++++++++++++++++++++
 4 files changed, 1145 insertions(+)
 create mode 100644 include/linux/virtio_vsock.h
 create mode 100644 include/uapi/linux/virtio_vsock.h
 create mode 100644 net/vmw_vsock/virtio_transport_common.c

diff --git a/include/linux/virtio_vsock.h b/include/linux/virtio_vsock.h
new file mode 100644
index 0000000..e54eb45
--- /dev/null
+++ b/include/linux/virtio_vsock.h
@@ -0,0 +1,203 @@
+/*
+ * This header, excluding the #ifdef __KERNEL__ part, is BSD licensed so
+ * anyone can use the definitions to implement compatible drivers/servers:
+ *
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ * 3. Neither the name of IBM nor the names of its contributors
+ *    may be used to endorse or promote products derived from this software
+ *    without specific prior written permission.
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS ``AS IS''
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED.  IN NO EVENT SHALL IBM OR CONTRIBUTORS BE LIABLE
+ * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+ * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+ * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ *
+ * Copyright (C) Red Hat, Inc., 2013-2015
+ * Copyright (C) Asias He <asias@redhat.com>, 2013
+ * Copyright (C) Stefan Hajnoczi <stefanha@redhat.com>, 2015
+ */
+
+#ifndef _LINUX_VIRTIO_VSOCK_H
+#define _LINUX_VIRTIO_VSOCK_H
+
+#include <uapi/linux/virtio_vsock.h>
+#include <linux/socket.h>
+#include <net/sock.h>
+
+#define VIRTIO_VSOCK_DEFAULT_MIN_BUF_SIZE	128
+#define VIRTIO_VSOCK_DEFAULT_BUF_SIZE		(1024 * 256)
+#define VIRTIO_VSOCK_DEFAULT_MAX_BUF_SIZE	(1024 * 256)
+#define VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE	(1024 * 4)
+#define VIRTIO_VSOCK_MAX_BUF_SIZE		0xFFFFFFFFUL
+#define VIRTIO_VSOCK_MAX_PKT_BUF_SIZE		(1024 * 64)
+#define VIRTIO_VSOCK_MAX_TX_BUF_SIZE		(1024 * 1024 * 16)
+#define VIRTIO_VSOCK_MAX_DGRAM_SIZE		(1024 * 64)
+
+struct vsock_transport_recv_notify_data;
+struct vsock_transport_send_notify_data;
+struct sockaddr_vm;
+struct vsock_sock;
+
+enum {
+	VSOCK_VQ_CTRL	= 0,
+	VSOCK_VQ_RX	= 1, /* for host to guest data */
+	VSOCK_VQ_TX	= 2, /* for guest to host data */
+	VSOCK_VQ_MAX	= 3,
+};
+
+/* virtio transport socket state */
+struct virtio_transport {
+	struct virtio_transport_pkt_ops	*ops;
+	struct vsock_sock *vsk;
+
+	u32 buf_size;
+	u32 buf_size_min;
+	u32 buf_size_max;
+
+	struct mutex tx_lock;
+	struct mutex rx_lock;
+
+	struct list_head rx_queue;
+	u32 rx_bytes;
+
+	/* Protected by trans->tx_lock */
+	u32 tx_cnt;
+	u32 buf_alloc;
+	u32 peer_fwd_cnt;
+	u32 peer_buf_alloc;
+	/* Protected by trans->rx_lock */
+	u32 fwd_cnt;
+};
+
+struct virtio_vsock_pkt {
+	struct virtio_vsock_hdr	hdr;
+	struct virtio_transport	*trans;
+	struct work_struct work;
+	struct list_head list;
+	void *buf;
+	u32 len;
+	u32 off;
+};
+
+struct virtio_vsock_pkt_info {
+	u32 remote_cid, remote_port;
+	struct msghdr *msg;
+	u32 pkt_len;
+	u16 type;
+	u16 op;
+	u32 flags;
+};
+
+struct virtio_transport_pkt_ops {
+	int (*send_pkt)(struct vsock_sock *vsk,
+			struct virtio_vsock_pkt_info *info);
+};
+
+void virtio_vsock_dumppkt(const char *func,
+			  const struct virtio_vsock_pkt *pkt);
+
+struct sock *
+virtio_transport_get_pending(struct sock *listener,
+			     struct virtio_vsock_pkt *pkt);
+struct virtio_vsock_pkt *
+virtio_transport_alloc_pkt(struct vsock_sock *vsk,
+			   struct virtio_vsock_pkt_info *info,
+			   size_t len,
+			   u32 src_cid,
+			   u32 src_port,
+			   u32 dst_cid,
+			   u32 dst_port);
+ssize_t
+virtio_transport_stream_dequeue(struct vsock_sock *vsk,
+				struct msghdr *msg,
+				size_t len,
+				int flags);
+int
+virtio_transport_dgram_dequeue(struct vsock_sock *vsk,
+			       struct msghdr *msg,
+			       size_t len, int flags);
+
+s64 virtio_transport_stream_has_data(struct vsock_sock *vsk);
+s64 virtio_transport_stream_has_space(struct vsock_sock *vsk);
+
+int virtio_transport_do_socket_init(struct vsock_sock *vsk,
+				 struct vsock_sock *psk);
+u64 virtio_transport_get_buffer_size(struct vsock_sock *vsk);
+u64 virtio_transport_get_min_buffer_size(struct vsock_sock *vsk);
+u64 virtio_transport_get_max_buffer_size(struct vsock_sock *vsk);
+void virtio_transport_set_buffer_size(struct vsock_sock *vsk, u64 val);
+void virtio_transport_set_min_buffer_size(struct vsock_sock *vsk, u64 val);
+void virtio_transport_set_max_buffer_size(struct vsock_sock *vsk, u64 val);
+int
+virtio_transport_notify_poll_in(struct vsock_sock *vsk,
+				size_t target,
+				bool *data_ready_now);
+int
+virtio_transport_notify_poll_out(struct vsock_sock *vsk,
+				 size_t target,
+				 bool *space_available_now);
+
+int virtio_transport_notify_recv_init(struct vsock_sock *vsk,
+	size_t target, struct vsock_transport_recv_notify_data *data);
+int virtio_transport_notify_recv_pre_block(struct vsock_sock *vsk,
+	size_t target, struct vsock_transport_recv_notify_data *data);
+int virtio_transport_notify_recv_pre_dequeue(struct vsock_sock *vsk,
+	size_t target, struct vsock_transport_recv_notify_data *data);
+int virtio_transport_notify_recv_post_dequeue(struct vsock_sock *vsk,
+	size_t target, ssize_t copied, bool data_read,
+	struct vsock_transport_recv_notify_data *data);
+int virtio_transport_notify_send_init(struct vsock_sock *vsk,
+	struct vsock_transport_send_notify_data *data);
+int virtio_transport_notify_send_pre_block(struct vsock_sock *vsk,
+	struct vsock_transport_send_notify_data *data);
+int virtio_transport_notify_send_pre_enqueue(struct vsock_sock *vsk,
+	struct vsock_transport_send_notify_data *data);
+int virtio_transport_notify_send_post_enqueue(struct vsock_sock *vsk,
+	ssize_t written, struct vsock_transport_send_notify_data *data);
+
+u64 virtio_transport_stream_rcvhiwat(struct vsock_sock *vsk);
+bool virtio_transport_stream_is_active(struct vsock_sock *vsk);
+bool virtio_transport_stream_allow(u32 cid, u32 port);
+int virtio_transport_dgram_bind(struct vsock_sock *vsk,
+				struct sockaddr_vm *addr);
+bool virtio_transport_dgram_allow(u32 cid, u32 port);
+
+int virtio_transport_connect(struct vsock_sock *vsk);
+
+int virtio_transport_shutdown(struct vsock_sock *vsk, int mode);
+
+void virtio_transport_release(struct vsock_sock *vsk);
+
+ssize_t
+virtio_transport_stream_enqueue(struct vsock_sock *vsk,
+				struct msghdr *msg,
+				size_t len);
+int
+virtio_transport_dgram_enqueue(struct vsock_sock *vsk,
+			       struct sockaddr_vm *remote_addr,
+			       struct msghdr *msg,
+			       size_t len);
+
+void virtio_transport_destruct(struct vsock_sock *vsk);
+
+void virtio_transport_recv_pkt(struct virtio_vsock_pkt *pkt);
+void virtio_transport_free_pkt(struct virtio_vsock_pkt *pkt);
+void virtio_transport_inc_tx_pkt(struct virtio_vsock_pkt *pkt);
+void virtio_transport_dec_tx_pkt(struct virtio_vsock_pkt *pkt);
+u32 virtio_transport_get_credit(struct virtio_transport *trans, u32 wanted);
+void virtio_transport_put_credit(struct virtio_transport *trans, u32 credit);
+#endif /* _LINUX_VIRTIO_VSOCK_H */
diff --git a/include/uapi/linux/virtio_ids.h b/include/uapi/linux/virtio_ids.h
index 77925f5..16dcf5d 100644
--- a/include/uapi/linux/virtio_ids.h
+++ b/include/uapi/linux/virtio_ids.h
@@ -39,6 +39,7 @@
 #define VIRTIO_ID_9P		9 /* 9p virtio console */
 #define VIRTIO_ID_RPROC_SERIAL 11 /* virtio remoteproc serial link */
 #define VIRTIO_ID_CAIF	       12 /* Virtio caif */
+#define VIRTIO_ID_VSOCK        13 /* virtio vsock transport */
 #define VIRTIO_ID_GPU          16 /* virtio GPU */
 #define VIRTIO_ID_INPUT        18 /* virtio input */
 
diff --git a/include/uapi/linux/virtio_vsock.h b/include/uapi/linux/virtio_vsock.h
new file mode 100644
index 0000000..ac6483d
--- /dev/null
+++ b/include/uapi/linux/virtio_vsock.h
@@ -0,0 +1,87 @@
+/*
+ * This header, excluding the #ifdef __KERNEL__ part, is BSD licensed so
+ * anyone can use the definitions to implement compatible drivers/servers:
+ *
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ * 3. Neither the name of IBM nor the names of its contributors
+ *    may be used to endorse or promote products derived from this software
+ *    without specific prior written permission.
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS ``AS IS''
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED.  IN NO EVENT SHALL IBM OR CONTRIBUTORS BE LIABLE
+ * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+ * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+ * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ *
+ * Copyright (C) Red Hat, Inc., 2013-2015
+ * Copyright (C) Asias He <asias@redhat.com>, 2013
+ * Copyright (C) Stefan Hajnoczi <stefanha@redhat.com>, 2015
+ */
+
+#ifndef _UAPI_LINUX_VIRTIO_VSOCK_H
+#define _UAPI_LINUX_VIRTIO_VSOCK_H
+
+#include <linux/types.h>
+#include <linux/virtio_ids.h>
+#include <linux/virtio_config.h>
+
+struct virtio_vsock_config {
+	__le32 guest_cid;
+	__le32 max_virtqueue_pairs;
+};
+
+struct virtio_vsock_hdr {
+	__le32	src_cid;
+	__le32	src_port;
+	__le32	dst_cid;
+	__le32	dst_port;
+	__le32	len;
+	__le16	type;		/* enum virtio_vsock_type */
+	__le16	op;		/* enum virtio_vsock_op */
+	__le32	flags;
+	__le32	buf_alloc;
+	__le32	fwd_cnt;
+};
+
+enum virtio_vsock_type {
+	VIRTIO_VSOCK_TYPE_STREAM = 1,
+};
+
+enum virtio_vsock_op {
+	VIRTIO_VSOCK_OP_INVALID = 0,
+
+	/* Connect operations */
+	VIRTIO_VSOCK_OP_REQUEST = 1,
+	VIRTIO_VSOCK_OP_RESPONSE = 2,
+	VIRTIO_VSOCK_OP_RST = 3,
+	VIRTIO_VSOCK_OP_SHUTDOWN = 4,
+
+	/* To send payload */
+	VIRTIO_VSOCK_OP_RW = 5,
+
+	/* Tell the peer our credit info */
+	VIRTIO_VSOCK_OP_CREDIT_UPDATE = 6,
+	/* Request the peer to send the credit info to us */
+	VIRTIO_VSOCK_OP_CREDIT_REQUEST = 7,
+};
+
+/* VIRTIO_VSOCK_OP_SHUTDOWN flags values */
+enum virtio_vsock_shutdown {
+	VIRTIO_VSOCK_SHUTDOWN_RCV = 1,
+	VIRTIO_VSOCK_SHUTDOWN_SEND = 2,
+};
+
+#endif /* _UAPI_LINUX_VIRTIO_VSOCK_H */
diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
new file mode 100644
index 0000000..025a323
--- /dev/null
+++ b/net/vmw_vsock/virtio_transport_common.c
@@ -0,0 +1,854 @@
+/*
+ * common code for virtio vsock
+ *
+ * Copyright (C) 2013-2015 Red Hat, Inc.
+ * Author: Asias He <asias@redhat.com>
+ *         Stefan Hajnoczi <stefanha@redhat.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.
+ */
+#include <linux/module.h>
+#include <linux/ctype.h>
+#include <linux/list.h>
+#include <linux/virtio.h>
+#include <linux/virtio_ids.h>
+#include <linux/virtio_config.h>
+#include <linux/virtio_vsock.h>
+
+#include <net/sock.h>
+#include <net/af_vsock.h>
+
+void virtio_vsock_dumppkt(const char *func,  const struct virtio_vsock_pkt *pkt)
+{
+	pr_debug("%s: pkt=%p, op=%d, len=%d, %d:%d---%d:%d, len=%d\n",
+		 func, pkt,
+		 le16_to_cpu(pkt->hdr.op),
+		 le32_to_cpu(pkt->hdr.len),
+		 le32_to_cpu(pkt->hdr.src_cid),
+		 le32_to_cpu(pkt->hdr.src_port),
+		 le32_to_cpu(pkt->hdr.dst_cid),
+		 le32_to_cpu(pkt->hdr.dst_port),
+		 pkt->len);
+}
+EXPORT_SYMBOL_GPL(virtio_vsock_dumppkt);
+
+struct virtio_vsock_pkt *
+virtio_transport_alloc_pkt(struct vsock_sock *vsk,
+			   struct virtio_vsock_pkt_info *info,
+			   size_t len,
+			   u32 src_cid,
+			   u32 src_port,
+			   u32 dst_cid,
+			   u32 dst_port)
+{
+	struct virtio_transport *trans = vsk->trans;
+	struct virtio_vsock_pkt *pkt;
+	int err;
+
+	BUG_ON(!trans);
+
+	pkt = kzalloc(sizeof(*pkt), GFP_KERNEL);
+	if (!pkt)
+		return NULL;
+
+	pkt->hdr.type		= cpu_to_le16(info->type);
+	pkt->hdr.op		= cpu_to_le16(info->op);
+	pkt->hdr.src_cid	= cpu_to_le32(src_cid);
+	pkt->hdr.src_port	= cpu_to_le32(src_port);
+	pkt->hdr.dst_cid	= cpu_to_le32(dst_cid);
+	pkt->hdr.dst_port	= cpu_to_le32(dst_port);
+	pkt->hdr.flags		= cpu_to_le32(info->flags);
+	pkt->len		= len;
+	pkt->trans		= trans;
+	pkt->hdr.len		= cpu_to_le32(len);
+
+	if (info->msg && len > 0) {
+		pkt->buf = kmalloc(len, GFP_KERNEL);
+		if (!pkt->buf)
+			goto out_pkt;
+		err = memcpy_from_msg(pkt->buf, info->msg, len);
+		if (err)
+			goto out;
+	}
+
+	return pkt;
+
+out:
+	kfree(pkt->buf);
+out_pkt:
+	kfree(pkt);
+	return NULL;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_alloc_pkt);
+
+struct sock *
+virtio_transport_get_pending(struct sock *listener,
+			     struct virtio_vsock_pkt *pkt)
+{
+	struct vsock_sock *vlistener;
+	struct vsock_sock *vpending;
+	struct sockaddr_vm src;
+	struct sockaddr_vm dst;
+	struct sock *pending;
+
+	vsock_addr_init(&src, le32_to_cpu(pkt->hdr.src_cid), le32_to_cpu(pkt->hdr.src_port));
+	vsock_addr_init(&dst, le32_to_cpu(pkt->hdr.dst_cid), le32_to_cpu(pkt->hdr.dst_port));
+
+	vlistener = vsock_sk(listener);
+	list_for_each_entry(vpending, &vlistener->pending_links,
+			    pending_links) {
+		if (vsock_addr_equals_addr(&src, &vpending->remote_addr) &&
+		    vsock_addr_equals_addr(&dst, &vpending->local_addr)) {
+			pending = sk_vsock(vpending);
+			sock_hold(pending);
+			return pending;
+		}
+	}
+
+	return NULL;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_get_pending);
+
+static void virtio_transport_inc_rx_pkt(struct virtio_vsock_pkt *pkt)
+{
+	pkt->trans->rx_bytes += pkt->len;
+}
+
+static void virtio_transport_dec_rx_pkt(struct virtio_vsock_pkt *pkt)
+{
+	pkt->trans->rx_bytes -= pkt->len;
+	pkt->trans->fwd_cnt += pkt->len;
+}
+
+void virtio_transport_inc_tx_pkt(struct virtio_vsock_pkt *pkt)
+{
+	mutex_lock(&pkt->trans->tx_lock);
+	pkt->hdr.fwd_cnt = cpu_to_le32(pkt->trans->fwd_cnt);
+	pkt->hdr.buf_alloc = cpu_to_le32(pkt->trans->buf_alloc);
+	mutex_unlock(&pkt->trans->tx_lock);
+}
+EXPORT_SYMBOL_GPL(virtio_transport_inc_tx_pkt);
+
+void virtio_transport_dec_tx_pkt(struct virtio_vsock_pkt *pkt)
+{
+}
+EXPORT_SYMBOL_GPL(virtio_transport_dec_tx_pkt);
+
+u32 virtio_transport_get_credit(struct virtio_transport *trans, u32 credit)
+{
+	u32 ret;
+
+	mutex_lock(&trans->tx_lock);
+	ret = trans->peer_buf_alloc - (trans->tx_cnt - trans->peer_fwd_cnt);
+	if (ret > credit)
+		ret = credit;
+	trans->tx_cnt += ret;
+	mutex_unlock(&trans->tx_lock);
+
+	pr_debug("%s: ret=%d, buf_alloc=%d, peer_buf_alloc=%d,"
+		 "tx_cnt=%d, fwd_cnt=%d, peer_fwd_cnt=%d\n", __func__,
+		 ret, trans->buf_alloc, trans->peer_buf_alloc,
+		 trans->tx_cnt, trans->fwd_cnt, trans->peer_fwd_cnt);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_get_credit);
+
+void virtio_transport_put_credit(struct virtio_transport *trans, u32 credit)
+{
+	mutex_lock(&trans->tx_lock);
+	trans->tx_cnt -= credit;
+	mutex_unlock(&trans->tx_lock);
+}
+EXPORT_SYMBOL_GPL(virtio_transport_put_credit);
+
+static int virtio_transport_send_credit_update(struct vsock_sock *vsk, int type, struct virtio_vsock_hdr *hdr)
+{
+	struct virtio_transport *trans = vsk->trans;
+	struct virtio_vsock_pkt_info info = {
+		.op = VIRTIO_VSOCK_OP_CREDIT_UPDATE,
+		.type = type,
+	};
+
+	pr_debug("%s: sk=%p send_credit_update\n", __func__, vsk);
+	return trans->ops->send_pkt(vsk, &info);
+}
+
+static ssize_t
+virtio_transport_stream_do_dequeue(struct vsock_sock *vsk,
+				   struct msghdr *msg,
+				   size_t len)
+{
+	struct virtio_transport *trans = vsk->trans;
+	struct virtio_vsock_pkt *pkt;
+	size_t bytes, total = 0;
+	int err = -EFAULT;
+
+	mutex_lock(&trans->rx_lock);
+	while (total < len && trans->rx_bytes > 0 &&
+			!list_empty(&trans->rx_queue)) {
+		pkt = list_first_entry(&trans->rx_queue,
+				       struct virtio_vsock_pkt, list);
+
+		bytes = len - total;
+		if (bytes > pkt->len - pkt->off)
+			bytes = pkt->len - pkt->off;
+
+		err = memcpy_to_msg(msg, pkt->buf + pkt->off, bytes);
+		if (err)
+			goto out;
+		total += bytes;
+		pkt->off += bytes;
+		if (pkt->off == pkt->len) {
+			virtio_transport_dec_rx_pkt(pkt);
+			list_del(&pkt->list);
+			virtio_transport_free_pkt(pkt);
+		}
+	}
+	mutex_unlock(&trans->rx_lock);
+
+	/* Send a credit pkt to peer */
+	virtio_transport_send_credit_update(vsk, VIRTIO_VSOCK_TYPE_STREAM,
+					    NULL);
+
+	return total;
+
+out:
+	mutex_unlock(&trans->rx_lock);
+	if (total)
+		err = total;
+	return err;
+}
+
+ssize_t
+virtio_transport_stream_dequeue(struct vsock_sock *vsk,
+				struct msghdr *msg,
+				size_t len, int flags)
+{
+	if (flags & MSG_PEEK)
+		return -EOPNOTSUPP;
+
+	return virtio_transport_stream_do_dequeue(vsk, msg, len);
+}
+EXPORT_SYMBOL_GPL(virtio_transport_stream_dequeue);
+
+int
+virtio_transport_dgram_dequeue(struct vsock_sock *vsk,
+			       struct msghdr *msg,
+			       size_t len, int flags)
+{
+	return -EOPNOTSUPP;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_dgram_dequeue);
+
+s64 virtio_transport_stream_has_data(struct vsock_sock *vsk)
+{
+	struct virtio_transport *trans = vsk->trans;
+	s64 bytes;
+
+	mutex_lock(&trans->rx_lock);
+	bytes = trans->rx_bytes;
+	mutex_unlock(&trans->rx_lock);
+
+	return bytes;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_stream_has_data);
+
+static s64 virtio_transport_has_space(struct vsock_sock *vsk)
+{
+	struct virtio_transport *trans = vsk->trans;
+	s64 bytes;
+
+	bytes = trans->peer_buf_alloc - (trans->tx_cnt - trans->peer_fwd_cnt);
+	if (bytes < 0)
+		bytes = 0;
+
+	return bytes;
+}
+
+s64 virtio_transport_stream_has_space(struct vsock_sock *vsk)
+{
+	struct virtio_transport *trans = vsk->trans;
+	s64 bytes;
+
+	mutex_lock(&trans->tx_lock);
+	bytes = virtio_transport_has_space(vsk);
+	mutex_unlock(&trans->tx_lock);
+
+	pr_debug("%s: bytes=%lld\n", __func__, bytes);
+
+	return bytes;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_stream_has_space);
+
+int virtio_transport_do_socket_init(struct vsock_sock *vsk,
+				    struct vsock_sock *psk)
+{
+	struct virtio_transport *trans;
+
+	trans = kzalloc(sizeof(*trans), GFP_KERNEL);
+	if (!trans)
+		return -ENOMEM;
+
+	vsk->trans = trans;
+	trans->vsk = vsk;
+	if (psk) {
+		struct virtio_transport *ptrans = psk->trans;
+		trans->buf_size	= ptrans->buf_size;
+		trans->buf_size_min = ptrans->buf_size_min;
+		trans->buf_size_max = ptrans->buf_size_max;
+		trans->peer_buf_alloc = ptrans->peer_buf_alloc;
+	} else {
+		trans->buf_size = VIRTIO_VSOCK_DEFAULT_BUF_SIZE;
+		trans->buf_size_min = VIRTIO_VSOCK_DEFAULT_MIN_BUF_SIZE;
+		trans->buf_size_max = VIRTIO_VSOCK_DEFAULT_MAX_BUF_SIZE;
+	}
+
+	trans->buf_alloc = trans->buf_size;
+
+	pr_debug("%s: trans->buf_alloc=%d\n", __func__, trans->buf_alloc);
+
+	mutex_init(&trans->rx_lock);
+	mutex_init(&trans->tx_lock);
+	INIT_LIST_HEAD(&trans->rx_queue);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_do_socket_init);
+
+u64 virtio_transport_get_buffer_size(struct vsock_sock *vsk)
+{
+	struct virtio_transport *trans = vsk->trans;
+
+	return trans->buf_size;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_get_buffer_size);
+
+u64 virtio_transport_get_min_buffer_size(struct vsock_sock *vsk)
+{
+	struct virtio_transport *trans = vsk->trans;
+
+	return trans->buf_size_min;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_get_min_buffer_size);
+
+u64 virtio_transport_get_max_buffer_size(struct vsock_sock *vsk)
+{
+	struct virtio_transport *trans = vsk->trans;
+
+	return trans->buf_size_max;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_get_max_buffer_size);
+
+void virtio_transport_set_buffer_size(struct vsock_sock *vsk, u64 val)
+{
+	struct virtio_transport *trans = vsk->trans;
+
+	if (val > VIRTIO_VSOCK_MAX_BUF_SIZE)
+		val = VIRTIO_VSOCK_MAX_BUF_SIZE;
+	if (val < trans->buf_size_min)
+		trans->buf_size_min = val;
+	if (val > trans->buf_size_max)
+		trans->buf_size_max = val;
+	trans->buf_size = val;
+	trans->buf_alloc = val;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_set_buffer_size);
+
+void virtio_transport_set_min_buffer_size(struct vsock_sock *vsk, u64 val)
+{
+	struct virtio_transport *trans = vsk->trans;
+
+	if (val > VIRTIO_VSOCK_MAX_BUF_SIZE)
+		val = VIRTIO_VSOCK_MAX_BUF_SIZE;
+	if (val > trans->buf_size)
+		trans->buf_size = val;
+	trans->buf_size_min = val;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_set_min_buffer_size);
+
+void virtio_transport_set_max_buffer_size(struct vsock_sock *vsk, u64 val)
+{
+	struct virtio_transport *trans = vsk->trans;
+
+	if (val > VIRTIO_VSOCK_MAX_BUF_SIZE)
+		val = VIRTIO_VSOCK_MAX_BUF_SIZE;
+	if (val < trans->buf_size)
+		trans->buf_size = val;
+	trans->buf_size_max = val;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_set_max_buffer_size);
+
+int
+virtio_transport_notify_poll_in(struct vsock_sock *vsk,
+				size_t target,
+				bool *data_ready_now)
+{
+	if (vsock_stream_has_data(vsk))
+		*data_ready_now = true;
+	else
+		*data_ready_now = false;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_notify_poll_in);
+
+int
+virtio_transport_notify_poll_out(struct vsock_sock *vsk,
+				 size_t target,
+				 bool *space_avail_now)
+{
+	s64 free_space;
+
+	free_space = vsock_stream_has_space(vsk);
+	if (free_space > 0)
+		*space_avail_now = true;
+	else if (free_space == 0)
+		*space_avail_now = false;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_notify_poll_out);
+
+int virtio_transport_notify_recv_init(struct vsock_sock *vsk,
+	size_t target, struct vsock_transport_recv_notify_data *data)
+{
+	return 0;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_notify_recv_init);
+
+int virtio_transport_notify_recv_pre_block(struct vsock_sock *vsk,
+	size_t target, struct vsock_transport_recv_notify_data *data)
+{
+	return 0;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_notify_recv_pre_block);
+
+int virtio_transport_notify_recv_pre_dequeue(struct vsock_sock *vsk,
+	size_t target, struct vsock_transport_recv_notify_data *data)
+{
+	return 0;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_notify_recv_pre_dequeue);
+
+int virtio_transport_notify_recv_post_dequeue(struct vsock_sock *vsk,
+	size_t target, ssize_t copied, bool data_read,
+	struct vsock_transport_recv_notify_data *data)
+{
+	return 0;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_notify_recv_post_dequeue);
+
+int virtio_transport_notify_send_init(struct vsock_sock *vsk,
+	struct vsock_transport_send_notify_data *data)
+{
+	return 0;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_notify_send_init);
+
+int virtio_transport_notify_send_pre_block(struct vsock_sock *vsk,
+	struct vsock_transport_send_notify_data *data)
+{
+	return 0;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_notify_send_pre_block);
+
+int virtio_transport_notify_send_pre_enqueue(struct vsock_sock *vsk,
+	struct vsock_transport_send_notify_data *data)
+{
+	return 0;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_notify_send_pre_enqueue);
+
+int virtio_transport_notify_send_post_enqueue(struct vsock_sock *vsk,
+	ssize_t written, struct vsock_transport_send_notify_data *data)
+{
+	return 0;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_notify_send_post_enqueue);
+
+u64 virtio_transport_stream_rcvhiwat(struct vsock_sock *vsk)
+{
+	struct virtio_transport *trans = vsk->trans;
+
+	return trans->buf_size;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_stream_rcvhiwat);
+
+bool virtio_transport_stream_is_active(struct vsock_sock *vsk)
+{
+	return true;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_stream_is_active);
+
+bool virtio_transport_stream_allow(u32 cid, u32 port)
+{
+	/* Only allow host->guest connections (refuse the host CID as peer) */
+	return cid != VMADDR_CID_HOST;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_stream_allow);
+
+int virtio_transport_dgram_bind(struct vsock_sock *vsk,
+				struct sockaddr_vm *addr)
+{
+	return -EOPNOTSUPP;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_dgram_bind);
+
+bool virtio_transport_dgram_allow(u32 cid, u32 port)
+{
+	return false;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_dgram_allow);
+
+int virtio_transport_connect(struct vsock_sock *vsk)
+{
+	struct virtio_transport *trans = vsk->trans;
+	struct virtio_vsock_pkt_info info = {
+		.op = VIRTIO_VSOCK_OP_REQUEST,
+		.type = VIRTIO_VSOCK_TYPE_STREAM,
+	};
+
+	pr_debug("%s: vsk=%p send_request\n", __func__, vsk);
+	return trans->ops->send_pkt(vsk, &info);
+}
+EXPORT_SYMBOL_GPL(virtio_transport_connect);
+
+int virtio_transport_shutdown(struct vsock_sock *vsk, int mode)
+{
+	struct virtio_transport *trans = vsk->trans;
+	struct virtio_vsock_pkt_info info = {
+		.op = VIRTIO_VSOCK_OP_SHUTDOWN,
+		.type = VIRTIO_VSOCK_TYPE_STREAM,
+		.flags = (mode & RCV_SHUTDOWN ?
+			  VIRTIO_VSOCK_SHUTDOWN_RCV : 0) |
+			 (mode & SEND_SHUTDOWN ?
+			  VIRTIO_VSOCK_SHUTDOWN_SEND : 0),
+	};
+
+	pr_debug("%s: vsk=%p: send_shutdown\n", __func__, vsk);
+	return trans->ops->send_pkt(vsk, &info);
+}
+EXPORT_SYMBOL_GPL(virtio_transport_shutdown);
+
+void virtio_transport_release(struct vsock_sock *vsk)
+{
+	struct sock *sk = &vsk->sk;
+
+	pr_debug("%s: vsk=%p\n", __func__, vsk);
+
+	/* Tell other side to terminate connection */
+	if (sk->sk_type == SOCK_STREAM && sk->sk_state == SS_CONNECTED) {
+		virtio_transport_shutdown(vsk, SHUTDOWN_MASK);
+	}
+}
+EXPORT_SYMBOL_GPL(virtio_transport_release);
+
+int
+virtio_transport_dgram_enqueue(struct vsock_sock *vsk,
+			       struct sockaddr_vm *remote_addr,
+			       struct msghdr *msg,
+			       size_t dgram_len)
+{
+	return -EOPNOTSUPP;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_dgram_enqueue);
+
+ssize_t
+virtio_transport_stream_enqueue(struct vsock_sock *vsk,
+				struct msghdr *msg,
+				size_t len)
+{
+	struct virtio_transport *trans = vsk->trans;
+	struct virtio_vsock_pkt_info info = {
+		.op = VIRTIO_VSOCK_OP_RW,
+		.type = VIRTIO_VSOCK_TYPE_STREAM,
+		.msg = msg,
+		.pkt_len = len,
+	};
+
+	return trans->ops->send_pkt(vsk, &info);
+}
+EXPORT_SYMBOL_GPL(virtio_transport_stream_enqueue);
+
+void virtio_transport_destruct(struct vsock_sock *vsk)
+{
+	struct virtio_transport *trans = vsk->trans;
+
+	pr_debug("%s: vsk=%p\n", __func__, vsk);
+	kfree(trans);
+}
+EXPORT_SYMBOL_GPL(virtio_transport_destruct);
+
+static int virtio_transport_send_reset(struct vsock_sock *vsk,
+				       struct virtio_vsock_pkt *pkt)
+{
+	struct virtio_transport *trans = vsk->trans;
+	struct virtio_vsock_pkt_info info = {
+		.op = VIRTIO_VSOCK_OP_RST,
+		.type = VIRTIO_VSOCK_TYPE_STREAM,
+	};
+
+	pr_debug("%s\n", __func__);
+
+	/* Send RST only if the original pkt is not a RST pkt */
+	if (le16_to_cpu(pkt->hdr.op) == VIRTIO_VSOCK_OP_RST)
+		return 0;
+
+	return trans->ops->send_pkt(vsk, &info);
+}
+
+static int
+virtio_transport_recv_connecting(struct sock *sk,
+				 struct virtio_vsock_pkt *pkt)
+{
+	struct vsock_sock *vsk = vsock_sk(sk);
+	int err;
+	int skerr;
+
+	pr_debug("%s: vsk=%p\n", __func__, vsk);
+	switch (le16_to_cpu(pkt->hdr.op)) {
+	case VIRTIO_VSOCK_OP_RESPONSE:
+		pr_debug("%s: got RESPONSE\n", __func__);
+		sk->sk_state = SS_CONNECTED;
+		sk->sk_socket->state = SS_CONNECTED;
+		vsock_insert_connected(vsk);
+		sk->sk_state_change(sk);
+		break;
+	case VIRTIO_VSOCK_OP_INVALID:
+		pr_debug("%s: got invalid\n", __func__);
+		break;
+	case VIRTIO_VSOCK_OP_RST:
+		pr_debug("%s: got rst\n", __func__);
+		skerr = ECONNRESET;
+		err = 0;
+		goto destroy;
+	default:
+		pr_debug("%s: got unexpected op\n", __func__);
+		skerr = EPROTO;
+		err = -EINVAL;
+		goto destroy;
+	}
+	return 0;
+
+destroy:
+	virtio_transport_send_reset(vsk, pkt);
+	sk->sk_state = SS_UNCONNECTED;
+	sk->sk_err = skerr;
+	sk->sk_error_report(sk);
+	return err;
+}
+
+static int
+virtio_transport_recv_connected(struct sock *sk,
+				struct virtio_vsock_pkt *pkt)
+{
+	struct vsock_sock *vsk = vsock_sk(sk);
+	struct virtio_transport *trans = vsk->trans;
+	int err = 0;
+
+	switch (le16_to_cpu(pkt->hdr.op)) {
+	case VIRTIO_VSOCK_OP_RW:
+		pkt->len = le32_to_cpu(pkt->hdr.len);
+		pkt->off = 0;
+		pkt->trans = trans;
+
+		mutex_lock(&trans->rx_lock);
+		virtio_transport_inc_rx_pkt(pkt);
+		list_add_tail(&pkt->list, &trans->rx_queue);
+		mutex_unlock(&trans->rx_lock);
+
+		sk->sk_data_ready(sk);
+		return err;
+	case VIRTIO_VSOCK_OP_CREDIT_UPDATE:
+		sk->sk_write_space(sk);
+		break;
+	case VIRTIO_VSOCK_OP_SHUTDOWN:
+		pr_debug("%s: got shutdown\n", __func__);
+		if (le32_to_cpu(pkt->hdr.flags) & VIRTIO_VSOCK_SHUTDOWN_RCV)
+			vsk->peer_shutdown |= RCV_SHUTDOWN;
+		if (le32_to_cpu(pkt->hdr.flags) & VIRTIO_VSOCK_SHUTDOWN_SEND)
+			vsk->peer_shutdown |= SEND_SHUTDOWN;
+		if (le32_to_cpu(pkt->hdr.flags))
+			sk->sk_state_change(sk);
+		break;
+	case VIRTIO_VSOCK_OP_RST:
+		pr_debug("%s: got rst\n", __func__);
+		sock_set_flag(sk, SOCK_DONE);
+		vsk->peer_shutdown = SHUTDOWN_MASK;
+		if (vsock_stream_has_data(vsk) <= 0)
+			sk->sk_state = SS_DISCONNECTING;
+		sk->sk_state_change(sk);
+		break;
+	default:
+		err = -EINVAL;
+		break;
+	}
+
+	virtio_transport_free_pkt(pkt);
+	return err;
+}
+
+static int
+virtio_transport_send_response(struct vsock_sock *vsk,
+			       struct virtio_vsock_pkt *pkt)
+{
+	struct virtio_transport *trans = vsk->trans;
+	struct virtio_vsock_pkt_info info = {
+		.op = VIRTIO_VSOCK_OP_RESPONSE,
+		.type = VIRTIO_VSOCK_TYPE_STREAM,
+		.remote_cid = le32_to_cpu(pkt->hdr.src_cid),
+		.remote_port = le32_to_cpu(pkt->hdr.src_port),
+	};
+
+	pr_debug("%s: send_response\n", __func__);
+
+	return trans->ops->send_pkt(vsk, &info);
+}
+
+/* Handle a connection request arriving on a listening socket */
+static int
+virtio_transport_recv_listen(struct sock *sk, struct virtio_vsock_pkt *pkt)
+{
+	struct vsock_sock *vsk = vsock_sk(sk);
+	struct vsock_sock *vchild;
+	struct sock *child;
+
+	if (le16_to_cpu(pkt->hdr.op) != VIRTIO_VSOCK_OP_REQUEST) {
+		virtio_transport_send_reset(vsk, pkt);
+		return -EINVAL;
+	}
+
+	if (sk_acceptq_is_full(sk)) {
+		virtio_transport_send_reset(vsk, pkt);
+		return -ENOMEM;
+	}
+
+	pr_debug("%s: create pending\n", __func__);
+	child = __vsock_create(sock_net(sk), NULL, sk, GFP_KERNEL,
+			       sk->sk_type, 0);
+	if (!child) {
+		virtio_transport_send_reset(vsk, pkt);
+		return -ENOMEM;
+	}
+
+	sk->sk_ack_backlog++;
+
+	lock_sock(child);
+
+	child->sk_state = SS_CONNECTED;
+
+	vchild = vsock_sk(child);
+	vsock_addr_init(&vchild->local_addr, le32_to_cpu(pkt->hdr.dst_cid),
+			le32_to_cpu(pkt->hdr.dst_port));
+	vsock_addr_init(&vchild->remote_addr, le32_to_cpu(pkt->hdr.src_cid),
+			le32_to_cpu(pkt->hdr.src_port));
+
+	vsock_insert_connected(vchild);
+	vsock_enqueue_accept(sk, child);
+	virtio_transport_send_response(vchild, pkt);
+
+	release_sock(child);
+
+	sk->sk_data_ready(sk);
+	return 0;
+}
+
+static void virtio_transport_space_update(struct sock *sk,
+					  struct virtio_vsock_pkt *pkt)
+{
+	struct vsock_sock *vsk = vsock_sk(sk);
+	struct virtio_transport *trans = vsk->trans;
+	bool space_available;
+
+	/* buf_alloc and fwd_cnt are always included in the hdr */
+	mutex_lock(&trans->tx_lock);
+	trans->peer_buf_alloc = le32_to_cpu(pkt->hdr.buf_alloc);
+	trans->peer_fwd_cnt = le32_to_cpu(pkt->hdr.fwd_cnt);
+	space_available = virtio_transport_has_space(vsk);
+	mutex_unlock(&trans->tx_lock);
+
+	if (space_available)
+		sk->sk_write_space(sk);
+}
+
+/* We are under the virtio-vsock's vsock->rx_lock or
+ * vhost-vsock's vq->mutex lock */
+void virtio_transport_recv_pkt(struct virtio_vsock_pkt *pkt)
+{
+	struct virtio_transport *trans;
+	struct sockaddr_vm src, dst;
+	struct vsock_sock *vsk;
+	struct sock *sk;
+
+	vsock_addr_init(&src, le32_to_cpu(pkt->hdr.src_cid), le32_to_cpu(pkt->hdr.src_port));
+	vsock_addr_init(&dst, le32_to_cpu(pkt->hdr.dst_cid), le32_to_cpu(pkt->hdr.dst_port));
+
+	virtio_vsock_dumppkt(__func__, pkt);
+
+	if (le16_to_cpu(pkt->hdr.type) != VIRTIO_VSOCK_TYPE_STREAM) {
+		/* TODO send RST */
+		goto free_pkt;
+	}
+
+	/* Look up the socket in the connected table first, then in
+	 * the bound table; if it is in neither, drop the packet
+	 */
+	sk = vsock_find_connected_socket(&src, &dst);
+	if (!sk) {
+		sk = vsock_find_bound_socket(&dst);
+		if (!sk) {
+			pr_debug("%s: cannot find bound socket\n", __func__);
+			virtio_vsock_dumppkt(__func__, pkt);
+			/* Ignore this pkt instead of sending reset back */
+			/* TODO send a RST unless this packet is a RST (to avoid infinite loops) */
+			goto free_pkt;
+		}
+	}
+
+	vsk = vsock_sk(sk);
+	trans = vsk->trans;
+	BUG_ON(!trans);
+
+	virtio_transport_space_update(sk, pkt);
+
+	lock_sock(sk);
+	switch (sk->sk_state) {
+	case VSOCK_SS_LISTEN:
+		virtio_transport_recv_listen(sk, pkt);
+		virtio_transport_free_pkt(pkt);
+		break;
+	case SS_CONNECTING:
+		virtio_transport_recv_connecting(sk, pkt);
+		virtio_transport_free_pkt(pkt);
+		break;
+	case SS_CONNECTED:
+		virtio_transport_recv_connected(sk, pkt);
+		break;
+	default:
+		virtio_transport_free_pkt(pkt);
+		break;
+	}
+	release_sock(sk);
+
+	/* Release refcnt obtained when we fetched this socket out of the
+	 * bound or connected list.
+	 */
+	sock_put(sk);
+	return;
+
+free_pkt:
+	virtio_transport_free_pkt(pkt);
+}
+EXPORT_SYMBOL_GPL(virtio_transport_recv_pkt);
+
+void virtio_transport_free_pkt(struct virtio_vsock_pkt *pkt)
+{
+	kfree(pkt->buf);
+	kfree(pkt);
+}
+EXPORT_SYMBOL_GPL(virtio_transport_free_pkt);
+
+MODULE_LICENSE("GPL v2");
+MODULE_AUTHOR("Asias He");
+MODULE_DESCRIPTION("common code for virtio vsock");
-- 
2.5.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v3 1/4] VSOCK: Introduce virtio-vsock-common.ko
  2015-12-09 12:03 [PATCH v3 0/4] Add virtio transport for AF_VSOCK Stefan Hajnoczi
@ 2015-12-09 12:03 ` Stefan Hajnoczi
  2015-12-09 12:03 ` Stefan Hajnoczi
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 23+ messages in thread
From: Stefan Hajnoczi @ 2015-12-09 12:03 UTC (permalink / raw)
  To: kvm
  Cc: Stefan Hajnoczi, Michael S. Tsirkin, netdev, virtualization,
	Matt Benjamin, Asias He, Christoffer Dall, matt.ma

From: Asias He <asias@redhat.com>

This module contains the common code and header files shared by the
virtio-vsock (guest) and vhost-vsock (host) kernel modules.

Signed-off-by: Asias He <asias@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
v3:
 * Remove unnecessary 3-way handshake, just do REQUEST/RESPONSE instead
   of REQUEST/RESPONSE/ACK
 * Remove SOCK_DGRAM support and focus on SOCK_STREAM first
 * Only allow host->guest connections (same security model as latest
   VMware)
v2:
 * Fix peer_buf_alloc inheritance on child socket
 * Notify other side of SOCK_STREAM disconnect (fixes shutdown
   semantics)
 * Avoid recursive mutex_lock(tx_lock) for write_space (fixes deadlock)
 * Define VIRTIO_VSOCK_TYPE_STREAM/DGRAM hardware interface constants
 * Define VIRTIO_VSOCK_SHUTDOWN_RCV/SEND hardware interface constants
---
 include/linux/virtio_vsock.h            | 203 ++++++++
 include/uapi/linux/virtio_ids.h         |   1 +
 include/uapi/linux/virtio_vsock.h       |  87 ++++
 net/vmw_vsock/virtio_transport_common.c | 854 ++++++++++++++++++++++++++++++++
 4 files changed, 1145 insertions(+)
 create mode 100644 include/linux/virtio_vsock.h
 create mode 100644 include/uapi/linux/virtio_vsock.h
 create mode 100644 net/vmw_vsock/virtio_transport_common.c

diff --git a/include/linux/virtio_vsock.h b/include/linux/virtio_vsock.h
new file mode 100644
index 0000000..e54eb45
--- /dev/null
+++ b/include/linux/virtio_vsock.h
@@ -0,0 +1,203 @@
+/*
+ * This header, excluding the #ifdef __KERNEL__ part, is BSD licensed so
+ * anyone can use the definitions to implement compatible drivers/servers:
+ *
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ * 3. Neither the name of IBM nor the names of its contributors
+ *    may be used to endorse or promote products derived from this software
+ *    without specific prior written permission.
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS ``AS IS''
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED.  IN NO EVENT SHALL IBM OR CONTRIBUTORS BE LIABLE
+ * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+ * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+ * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ *
+ * Copyright (C) Red Hat, Inc., 2013-2015
+ * Copyright (C) Asias He <asias@redhat.com>, 2013
+ * Copyright (C) Stefan Hajnoczi <stefanha@redhat.com>, 2015
+ */
+
+#ifndef _LINUX_VIRTIO_VSOCK_H
+#define _LINUX_VIRTIO_VSOCK_H
+
+#include <uapi/linux/virtio_vsock.h>
+#include <linux/socket.h>
+#include <net/sock.h>
+
+#define VIRTIO_VSOCK_DEFAULT_MIN_BUF_SIZE	128
+#define VIRTIO_VSOCK_DEFAULT_BUF_SIZE		(1024 * 256)
+#define VIRTIO_VSOCK_DEFAULT_MAX_BUF_SIZE	(1024 * 256)
+#define VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE	(1024 * 4)
+#define VIRTIO_VSOCK_MAX_BUF_SIZE		0xFFFFFFFFUL
+#define VIRTIO_VSOCK_MAX_PKT_BUF_SIZE		(1024 * 64)
+#define VIRTIO_VSOCK_MAX_TX_BUF_SIZE		(1024 * 1024 * 16)
+#define VIRTIO_VSOCK_MAX_DGRAM_SIZE		(1024 * 64)
+
+struct vsock_transport_recv_notify_data;
+struct vsock_transport_send_notify_data;
+struct sockaddr_vm;
+struct vsock_sock;
+
+enum {
+	VSOCK_VQ_CTRL	= 0,
+	VSOCK_VQ_RX	= 1, /* for host to guest data */
+	VSOCK_VQ_TX	= 2, /* for guest to host data */
+	VSOCK_VQ_MAX	= 3,
+};
+
+/* virtio transport socket state */
+struct virtio_transport {
+	struct virtio_transport_pkt_ops	*ops;
+	struct vsock_sock *vsk;
+
+	u32 buf_size;
+	u32 buf_size_min;
+	u32 buf_size_max;
+
+	struct mutex tx_lock;
+	struct mutex rx_lock;
+
+	struct list_head rx_queue;
+	u32 rx_bytes;
+
+	/* Protected by trans->tx_lock */
+	u32 tx_cnt;
+	u32 buf_alloc;
+	u32 peer_fwd_cnt;
+	u32 peer_buf_alloc;
+	/* Protected by trans->rx_lock */
+	u32 fwd_cnt;
+};
+
+struct virtio_vsock_pkt {
+	struct virtio_vsock_hdr	hdr;
+	struct virtio_transport	*trans;
+	struct work_struct work;
+	struct list_head list;
+	void *buf;
+	u32 len;
+	u32 off;
+};
+
+struct virtio_vsock_pkt_info {
+	u32 remote_cid, remote_port;
+	struct msghdr *msg;
+	u32 pkt_len;
+	u16 type;
+	u16 op;
+	u32 flags;
+};
+
+struct virtio_transport_pkt_ops {
+	int (*send_pkt)(struct vsock_sock *vsk,
+			struct virtio_vsock_pkt_info *info);
+};
+
+void virtio_vsock_dumppkt(const char *func,
+			  const struct virtio_vsock_pkt *pkt);
+
+struct sock *
+virtio_transport_get_pending(struct sock *listener,
+			     struct virtio_vsock_pkt *pkt);
+struct virtio_vsock_pkt *
+virtio_transport_alloc_pkt(struct vsock_sock *vsk,
+			   struct virtio_vsock_pkt_info *info,
+			   size_t len,
+			   u32 src_cid,
+			   u32 src_port,
+			   u32 dst_cid,
+			   u32 dst_port);
+ssize_t
+virtio_transport_stream_dequeue(struct vsock_sock *vsk,
+				struct msghdr *msg,
+				size_t len,
+				int flags);
+int
+virtio_transport_dgram_dequeue(struct vsock_sock *vsk,
+			       struct msghdr *msg,
+			       size_t len, int flags);
+
+s64 virtio_transport_stream_has_data(struct vsock_sock *vsk);
+s64 virtio_transport_stream_has_space(struct vsock_sock *vsk);
+
+int virtio_transport_do_socket_init(struct vsock_sock *vsk,
+				 struct vsock_sock *psk);
+u64 virtio_transport_get_buffer_size(struct vsock_sock *vsk);
+u64 virtio_transport_get_min_buffer_size(struct vsock_sock *vsk);
+u64 virtio_transport_get_max_buffer_size(struct vsock_sock *vsk);
+void virtio_transport_set_buffer_size(struct vsock_sock *vsk, u64 val);
+void virtio_transport_set_min_buffer_size(struct vsock_sock *vsk, u64 val);
+void virtio_transport_set_max_buffer_size(struct vsock_sock *vsk, u64 val);
+int
+virtio_transport_notify_poll_in(struct vsock_sock *vsk,
+				size_t target,
+				bool *data_ready_now);
+int
+virtio_transport_notify_poll_out(struct vsock_sock *vsk,
+				 size_t target,
+				 bool *space_available_now);
+
+int virtio_transport_notify_recv_init(struct vsock_sock *vsk,
+	size_t target, struct vsock_transport_recv_notify_data *data);
+int virtio_transport_notify_recv_pre_block(struct vsock_sock *vsk,
+	size_t target, struct vsock_transport_recv_notify_data *data);
+int virtio_transport_notify_recv_pre_dequeue(struct vsock_sock *vsk,
+	size_t target, struct vsock_transport_recv_notify_data *data);
+int virtio_transport_notify_recv_post_dequeue(struct vsock_sock *vsk,
+	size_t target, ssize_t copied, bool data_read,
+	struct vsock_transport_recv_notify_data *data);
+int virtio_transport_notify_send_init(struct vsock_sock *vsk,
+	struct vsock_transport_send_notify_data *data);
+int virtio_transport_notify_send_pre_block(struct vsock_sock *vsk,
+	struct vsock_transport_send_notify_data *data);
+int virtio_transport_notify_send_pre_enqueue(struct vsock_sock *vsk,
+	struct vsock_transport_send_notify_data *data);
+int virtio_transport_notify_send_post_enqueue(struct vsock_sock *vsk,
+	ssize_t written, struct vsock_transport_send_notify_data *data);
+
+u64 virtio_transport_stream_rcvhiwat(struct vsock_sock *vsk);
+bool virtio_transport_stream_is_active(struct vsock_sock *vsk);
+bool virtio_transport_stream_allow(u32 cid, u32 port);
+int virtio_transport_dgram_bind(struct vsock_sock *vsk,
+				struct sockaddr_vm *addr);
+bool virtio_transport_dgram_allow(u32 cid, u32 port);
+
+int virtio_transport_connect(struct vsock_sock *vsk);
+
+int virtio_transport_shutdown(struct vsock_sock *vsk, int mode);
+
+void virtio_transport_release(struct vsock_sock *vsk);
+
+ssize_t
+virtio_transport_stream_enqueue(struct vsock_sock *vsk,
+				struct msghdr *msg,
+				size_t len);
+int
+virtio_transport_dgram_enqueue(struct vsock_sock *vsk,
+			       struct sockaddr_vm *remote_addr,
+			       struct msghdr *msg,
+			       size_t len);
+
+void virtio_transport_destruct(struct vsock_sock *vsk);
+
+void virtio_transport_recv_pkt(struct virtio_vsock_pkt *pkt);
+void virtio_transport_free_pkt(struct virtio_vsock_pkt *pkt);
+void virtio_transport_inc_tx_pkt(struct virtio_vsock_pkt *pkt);
+void virtio_transport_dec_tx_pkt(struct virtio_vsock_pkt *pkt);
+u32 virtio_transport_get_credit(struct virtio_transport *trans, u32 wanted);
+void virtio_transport_put_credit(struct virtio_transport *trans, u32 credit);
+#endif /* _LINUX_VIRTIO_VSOCK_H */
diff --git a/include/uapi/linux/virtio_ids.h b/include/uapi/linux/virtio_ids.h
index 77925f5..16dcf5d 100644
--- a/include/uapi/linux/virtio_ids.h
+++ b/include/uapi/linux/virtio_ids.h
@@ -39,6 +39,7 @@
 #define VIRTIO_ID_9P		9 /* 9p virtio console */
 #define VIRTIO_ID_RPROC_SERIAL 11 /* virtio remoteproc serial link */
 #define VIRTIO_ID_CAIF	       12 /* Virtio caif */
+#define VIRTIO_ID_VSOCK        13 /* virtio vsock transport */
 #define VIRTIO_ID_GPU          16 /* virtio GPU */
 #define VIRTIO_ID_INPUT        18 /* virtio input */
 
diff --git a/include/uapi/linux/virtio_vsock.h b/include/uapi/linux/virtio_vsock.h
new file mode 100644
index 0000000..ac6483d
--- /dev/null
+++ b/include/uapi/linux/virtio_vsock.h
@@ -0,0 +1,87 @@
+/*
+ * This header, excluding the #ifdef __KERNEL__ part, is BSD licensed so
+ * anyone can use the definitions to implement compatible drivers/servers:
+ *
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ * 3. Neither the name of IBM nor the names of its contributors
+ *    may be used to endorse or promote products derived from this software
+ *    without specific prior written permission.
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS ``AS IS''
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED.  IN NO EVENT SHALL IBM OR CONTRIBUTORS BE LIABLE
+ * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+ * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+ * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ *
+ * Copyright (C) Red Hat, Inc., 2013-2015
+ * Copyright (C) Asias He <asias@redhat.com>, 2013
+ * Copyright (C) Stefan Hajnoczi <stefanha@redhat.com>, 2015
+ */
+
+#ifndef _UAPI_LINUX_VIRTIO_VSOCK_H
+#define _UAPI_LINUX_VIRTIO_VSOCK_H
+
+#include <linux/types.h>
+#include <linux/virtio_ids.h>
+#include <linux/virtio_config.h>
+
+struct virtio_vsock_config {
+	__le32 guest_cid;
+	__le32 max_virtqueue_pairs;
+};
+
+struct virtio_vsock_hdr {
+	__le32	src_cid;
+	__le32	src_port;
+	__le32	dst_cid;
+	__le32	dst_port;
+	__le32	len;
+	__le16	type;		/* enum virtio_vsock_type */
+	__le16	op;		/* enum virtio_vsock_op */
+	__le32	flags;
+	__le32	buf_alloc;
+	__le32	fwd_cnt;
+};
+
+enum virtio_vsock_type {
+	VIRTIO_VSOCK_TYPE_STREAM = 1,
+};
+
+enum virtio_vsock_op {
+	VIRTIO_VSOCK_OP_INVALID = 0,
+
+	/* Connect operations */
+	VIRTIO_VSOCK_OP_REQUEST = 1,
+	VIRTIO_VSOCK_OP_RESPONSE = 2,
+	VIRTIO_VSOCK_OP_RST = 3,
+	VIRTIO_VSOCK_OP_SHUTDOWN = 4,
+
+	/* To send payload */
+	VIRTIO_VSOCK_OP_RW = 5,
+
+	/* Tell the peer our credit info */
+	VIRTIO_VSOCK_OP_CREDIT_UPDATE = 6,
+	/* Request the peer to send the credit info to us */
+	VIRTIO_VSOCK_OP_CREDIT_REQUEST = 7,
+};
+
+/* VIRTIO_VSOCK_OP_SHUTDOWN flags values */
+enum virtio_vsock_shutdown {
+	VIRTIO_VSOCK_SHUTDOWN_RCV = 1,
+	VIRTIO_VSOCK_SHUTDOWN_SEND = 2,
+};
+
+#endif /* _UAPI_LINUX_VIRTIO_VSOCK_H */
diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
new file mode 100644
index 0000000..025a323
--- /dev/null
+++ b/net/vmw_vsock/virtio_transport_common.c
@@ -0,0 +1,854 @@
+/*
+ * common code for virtio vsock
+ *
+ * Copyright (C) 2013-2015 Red Hat, Inc.
+ * Author: Asias He <asias@redhat.com>
+ *         Stefan Hajnoczi <stefanha@redhat.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.
+ */
+#include <linux/module.h>
+#include <linux/ctype.h>
+#include <linux/list.h>
+#include <linux/virtio.h>
+#include <linux/virtio_ids.h>
+#include <linux/virtio_config.h>
+#include <linux/virtio_vsock.h>
+
+#include <net/sock.h>
+#include <net/af_vsock.h>
+
+void virtio_vsock_dumppkt(const char *func, const struct virtio_vsock_pkt *pkt)
+{
+	pr_debug("%s: pkt=%p, op=%d, hdr.len=%d, %d:%d---%d:%d, pkt.len=%d\n",
+		 func, pkt,
+		 le16_to_cpu(pkt->hdr.op),
+		 le32_to_cpu(pkt->hdr.len),
+		 le32_to_cpu(pkt->hdr.src_cid),
+		 le32_to_cpu(pkt->hdr.src_port),
+		 le32_to_cpu(pkt->hdr.dst_cid),
+		 le32_to_cpu(pkt->hdr.dst_port),
+		 pkt->len);
+}
+EXPORT_SYMBOL_GPL(virtio_vsock_dumppkt);
+
+struct virtio_vsock_pkt *
+virtio_transport_alloc_pkt(struct vsock_sock *vsk,
+			   struct virtio_vsock_pkt_info *info,
+			   size_t len,
+			   u32 src_cid,
+			   u32 src_port,
+			   u32 dst_cid,
+			   u32 dst_port)
+{
+	struct virtio_transport *trans = vsk->trans;
+	struct virtio_vsock_pkt *pkt;
+	int err;
+
+	BUG_ON(!trans);
+
+	pkt = kzalloc(sizeof(*pkt), GFP_KERNEL);
+	if (!pkt)
+		return NULL;
+
+	pkt->hdr.type		= cpu_to_le16(info->type);
+	pkt->hdr.op		= cpu_to_le16(info->op);
+	pkt->hdr.src_cid	= cpu_to_le32(src_cid);
+	pkt->hdr.src_port	= cpu_to_le32(src_port);
+	pkt->hdr.dst_cid	= cpu_to_le32(dst_cid);
+	pkt->hdr.dst_port	= cpu_to_le32(dst_port);
+	pkt->hdr.flags		= cpu_to_le32(info->flags);
+	pkt->len		= len;
+	pkt->trans		= trans;
+	pkt->hdr.len		= cpu_to_le32(len);
+
+	if (info->msg && len > 0) {
+		pkt->buf = kmalloc(len, GFP_KERNEL);
+		if (!pkt->buf)
+			goto out_pkt;
+		err = memcpy_from_msg(pkt->buf, info->msg, len);
+		if (err)
+			goto out;
+	}
+
+	return pkt;
+
+out:
+	kfree(pkt->buf);
+out_pkt:
+	kfree(pkt);
+	return NULL;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_alloc_pkt);
+
+struct sock *
+virtio_transport_get_pending(struct sock *listener,
+			     struct virtio_vsock_pkt *pkt)
+{
+	struct vsock_sock *vlistener;
+	struct vsock_sock *vpending;
+	struct sockaddr_vm src;
+	struct sockaddr_vm dst;
+	struct sock *pending;
+
+	vsock_addr_init(&src, le32_to_cpu(pkt->hdr.src_cid), le32_to_cpu(pkt->hdr.src_port));
+	vsock_addr_init(&dst, le32_to_cpu(pkt->hdr.dst_cid), le32_to_cpu(pkt->hdr.dst_port));
+
+	vlistener = vsock_sk(listener);
+	list_for_each_entry(vpending, &vlistener->pending_links,
+			    pending_links) {
+		if (vsock_addr_equals_addr(&src, &vpending->remote_addr) &&
+		    vsock_addr_equals_addr(&dst, &vpending->local_addr)) {
+			pending = sk_vsock(vpending);
+			sock_hold(pending);
+			return pending;
+		}
+	}
+
+	return NULL;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_get_pending);
+
+static void virtio_transport_inc_rx_pkt(struct virtio_vsock_pkt *pkt)
+{
+	pkt->trans->rx_bytes += pkt->len;
+}
+
+static void virtio_transport_dec_rx_pkt(struct virtio_vsock_pkt *pkt)
+{
+	pkt->trans->rx_bytes -= pkt->len;
+	pkt->trans->fwd_cnt += pkt->len;
+}
+
+void virtio_transport_inc_tx_pkt(struct virtio_vsock_pkt *pkt)
+{
+	mutex_lock(&pkt->trans->tx_lock);
+	pkt->hdr.fwd_cnt = cpu_to_le32(pkt->trans->fwd_cnt);
+	pkt->hdr.buf_alloc = cpu_to_le32(pkt->trans->buf_alloc);
+	mutex_unlock(&pkt->trans->tx_lock);
+}
+EXPORT_SYMBOL_GPL(virtio_transport_inc_tx_pkt);
+
+void virtio_transport_dec_tx_pkt(struct virtio_vsock_pkt *pkt)
+{
+}
+EXPORT_SYMBOL_GPL(virtio_transport_dec_tx_pkt);
+
+u32 virtio_transport_get_credit(struct virtio_transport *trans, u32 credit)
+{
+	u32 ret;
+
+	mutex_lock(&trans->tx_lock);
+	ret = trans->peer_buf_alloc - (trans->tx_cnt - trans->peer_fwd_cnt);
+	if (ret > credit)
+		ret = credit;
+	trans->tx_cnt += ret;
+	mutex_unlock(&trans->tx_lock);
+
+	pr_debug("%s: ret=%d, buf_alloc=%d, peer_buf_alloc=%d, "
+		 "tx_cnt=%d, fwd_cnt=%d, peer_fwd_cnt=%d\n", __func__,
+		 ret, trans->buf_alloc, trans->peer_buf_alloc,
+		 trans->tx_cnt, trans->fwd_cnt, trans->peer_fwd_cnt);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_get_credit);
+
+void virtio_transport_put_credit(struct virtio_transport *trans, u32 credit)
+{
+	mutex_lock(&trans->tx_lock);
+	trans->tx_cnt -= credit;
+	mutex_unlock(&trans->tx_lock);
+}
+EXPORT_SYMBOL_GPL(virtio_transport_put_credit);
+
+static int virtio_transport_send_credit_update(struct vsock_sock *vsk, int type, struct virtio_vsock_hdr *hdr)
+{
+	struct virtio_transport *trans = vsk->trans;
+	struct virtio_vsock_pkt_info info = {
+		.op = VIRTIO_VSOCK_OP_CREDIT_UPDATE,
+		.type = type,
+	};
+
+	pr_debug("%s: vsk=%p send_credit_update\n", __func__, vsk);
+	return trans->ops->send_pkt(vsk, &info);
+}
+
+static ssize_t
+virtio_transport_stream_do_dequeue(struct vsock_sock *vsk,
+				   struct msghdr *msg,
+				   size_t len)
+{
+	struct virtio_transport *trans = vsk->trans;
+	struct virtio_vsock_pkt *pkt;
+	size_t bytes, total = 0;
+	int err = -EFAULT;
+
+	mutex_lock(&trans->rx_lock);
+	while (total < len && trans->rx_bytes > 0 &&
+			!list_empty(&trans->rx_queue)) {
+		pkt = list_first_entry(&trans->rx_queue,
+				       struct virtio_vsock_pkt, list);
+
+		bytes = len - total;
+		if (bytes > pkt->len - pkt->off)
+			bytes = pkt->len - pkt->off;
+
+		err = memcpy_to_msg(msg, pkt->buf + pkt->off, bytes);
+		if (err)
+			goto out;
+		total += bytes;
+		pkt->off += bytes;
+		if (pkt->off == pkt->len) {
+			virtio_transport_dec_rx_pkt(pkt);
+			list_del(&pkt->list);
+			virtio_transport_free_pkt(pkt);
+		}
+	}
+	mutex_unlock(&trans->rx_lock);
+
+	/* Send a credit pkt to peer */
+	virtio_transport_send_credit_update(vsk, VIRTIO_VSOCK_TYPE_STREAM,
+					    NULL);
+
+	return total;
+
+out:
+	mutex_unlock(&trans->rx_lock);
+	if (total)
+		err = total;
+	return err;
+}
+
+ssize_t
+virtio_transport_stream_dequeue(struct vsock_sock *vsk,
+				struct msghdr *msg,
+				size_t len, int flags)
+{
+	if (flags & MSG_PEEK)
+		return -EOPNOTSUPP;
+
+	return virtio_transport_stream_do_dequeue(vsk, msg, len);
+}
+EXPORT_SYMBOL_GPL(virtio_transport_stream_dequeue);
+
+int
+virtio_transport_dgram_dequeue(struct vsock_sock *vsk,
+			       struct msghdr *msg,
+			       size_t len, int flags)
+{
+	return -EOPNOTSUPP;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_dgram_dequeue);
+
+s64 virtio_transport_stream_has_data(struct vsock_sock *vsk)
+{
+	struct virtio_transport *trans = vsk->trans;
+	s64 bytes;
+
+	mutex_lock(&trans->rx_lock);
+	bytes = trans->rx_bytes;
+	mutex_unlock(&trans->rx_lock);
+
+	return bytes;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_stream_has_data);
+
+static s64 virtio_transport_has_space(struct vsock_sock *vsk)
+{
+	struct virtio_transport *trans = vsk->trans;
+	s64 bytes;
+
+	bytes = trans->peer_buf_alloc - (trans->tx_cnt - trans->peer_fwd_cnt);
+	if (bytes < 0)
+		bytes = 0;
+
+	return bytes;
+}
+
+s64 virtio_transport_stream_has_space(struct vsock_sock *vsk)
+{
+	struct virtio_transport *trans = vsk->trans;
+	s64 bytes;
+
+	mutex_lock(&trans->tx_lock);
+	bytes = virtio_transport_has_space(vsk);
+	mutex_unlock(&trans->tx_lock);
+
+	pr_debug("%s: bytes=%lld\n", __func__, bytes);
+
+	return bytes;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_stream_has_space);
+
+int virtio_transport_do_socket_init(struct vsock_sock *vsk,
+				    struct vsock_sock *psk)
+{
+	struct virtio_transport *trans;
+
+	trans = kzalloc(sizeof(*trans), GFP_KERNEL);
+	if (!trans)
+		return -ENOMEM;
+
+	vsk->trans = trans;
+	trans->vsk = vsk;
+	if (psk) {
+		struct virtio_transport *ptrans = psk->trans;
+		trans->buf_size	= ptrans->buf_size;
+		trans->buf_size_min = ptrans->buf_size_min;
+		trans->buf_size_max = ptrans->buf_size_max;
+		trans->peer_buf_alloc = ptrans->peer_buf_alloc;
+	} else {
+		trans->buf_size = VIRTIO_VSOCK_DEFAULT_BUF_SIZE;
+		trans->buf_size_min = VIRTIO_VSOCK_DEFAULT_MIN_BUF_SIZE;
+		trans->buf_size_max = VIRTIO_VSOCK_DEFAULT_MAX_BUF_SIZE;
+	}
+
+	trans->buf_alloc = trans->buf_size;
+
+	pr_debug("%s: trans->buf_alloc=%d\n", __func__, trans->buf_alloc);
+
+	mutex_init(&trans->rx_lock);
+	mutex_init(&trans->tx_lock);
+	INIT_LIST_HEAD(&trans->rx_queue);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_do_socket_init);
+
+u64 virtio_transport_get_buffer_size(struct vsock_sock *vsk)
+{
+	struct virtio_transport *trans = vsk->trans;
+
+	return trans->buf_size;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_get_buffer_size);
+
+u64 virtio_transport_get_min_buffer_size(struct vsock_sock *vsk)
+{
+	struct virtio_transport *trans = vsk->trans;
+
+	return trans->buf_size_min;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_get_min_buffer_size);
+
+u64 virtio_transport_get_max_buffer_size(struct vsock_sock *vsk)
+{
+	struct virtio_transport *trans = vsk->trans;
+
+	return trans->buf_size_max;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_get_max_buffer_size);
+
+void virtio_transport_set_buffer_size(struct vsock_sock *vsk, u64 val)
+{
+	struct virtio_transport *trans = vsk->trans;
+
+	if (val > VIRTIO_VSOCK_MAX_BUF_SIZE)
+		val = VIRTIO_VSOCK_MAX_BUF_SIZE;
+	if (val < trans->buf_size_min)
+		trans->buf_size_min = val;
+	if (val > trans->buf_size_max)
+		trans->buf_size_max = val;
+	trans->buf_size = val;
+	trans->buf_alloc = val;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_set_buffer_size);
+
+void virtio_transport_set_min_buffer_size(struct vsock_sock *vsk, u64 val)
+{
+	struct virtio_transport *trans = vsk->trans;
+
+	if (val > VIRTIO_VSOCK_MAX_BUF_SIZE)
+		val = VIRTIO_VSOCK_MAX_BUF_SIZE;
+	if (val > trans->buf_size)
+		trans->buf_size = val;
+	trans->buf_size_min = val;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_set_min_buffer_size);
+
+void virtio_transport_set_max_buffer_size(struct vsock_sock *vsk, u64 val)
+{
+	struct virtio_transport *trans = vsk->trans;
+
+	if (val > VIRTIO_VSOCK_MAX_BUF_SIZE)
+		val = VIRTIO_VSOCK_MAX_BUF_SIZE;
+	if (val < trans->buf_size)
+		trans->buf_size = val;
+	trans->buf_size_max = val;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_set_max_buffer_size);
+
+int
+virtio_transport_notify_poll_in(struct vsock_sock *vsk,
+				size_t target,
+				bool *data_ready_now)
+{
+	if (vsock_stream_has_data(vsk))
+		*data_ready_now = true;
+	else
+		*data_ready_now = false;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_notify_poll_in);
+
+int
+virtio_transport_notify_poll_out(struct vsock_sock *vsk,
+				 size_t target,
+				 bool *space_avail_now)
+{
+	s64 free_space;
+
+	free_space = vsock_stream_has_space(vsk);
+	if (free_space > 0)
+		*space_avail_now = true;
+	else if (free_space == 0)
+		*space_avail_now = false;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_notify_poll_out);
+
+int virtio_transport_notify_recv_init(struct vsock_sock *vsk,
+	size_t target, struct vsock_transport_recv_notify_data *data)
+{
+	return 0;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_notify_recv_init);
+
+int virtio_transport_notify_recv_pre_block(struct vsock_sock *vsk,
+	size_t target, struct vsock_transport_recv_notify_data *data)
+{
+	return 0;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_notify_recv_pre_block);
+
+int virtio_transport_notify_recv_pre_dequeue(struct vsock_sock *vsk,
+	size_t target, struct vsock_transport_recv_notify_data *data)
+{
+	return 0;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_notify_recv_pre_dequeue);
+
+int virtio_transport_notify_recv_post_dequeue(struct vsock_sock *vsk,
+	size_t target, ssize_t copied, bool data_read,
+	struct vsock_transport_recv_notify_data *data)
+{
+	return 0;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_notify_recv_post_dequeue);
+
+int virtio_transport_notify_send_init(struct vsock_sock *vsk,
+	struct vsock_transport_send_notify_data *data)
+{
+	return 0;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_notify_send_init);
+
+int virtio_transport_notify_send_pre_block(struct vsock_sock *vsk,
+	struct vsock_transport_send_notify_data *data)
+{
+	return 0;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_notify_send_pre_block);
+
+int virtio_transport_notify_send_pre_enqueue(struct vsock_sock *vsk,
+	struct vsock_transport_send_notify_data *data)
+{
+	return 0;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_notify_send_pre_enqueue);
+
+int virtio_transport_notify_send_post_enqueue(struct vsock_sock *vsk,
+	ssize_t written, struct vsock_transport_send_notify_data *data)
+{
+	return 0;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_notify_send_post_enqueue);
+
+u64 virtio_transport_stream_rcvhiwat(struct vsock_sock *vsk)
+{
+	struct virtio_transport *trans = vsk->trans;
+
+	return trans->buf_size;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_stream_rcvhiwat);
+
+bool virtio_transport_stream_is_active(struct vsock_sock *vsk)
+{
+	return true;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_stream_is_active);
+
+bool virtio_transport_stream_allow(u32 cid, u32 port)
+{
+	/* Only allow host->guest connections (connects to the host CID
+	 * are rejected, matching the v3 security model) */
+	return cid != VMADDR_CID_HOST;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_stream_allow);
+
+int virtio_transport_dgram_bind(struct vsock_sock *vsk,
+				struct sockaddr_vm *addr)
+{
+	return -EOPNOTSUPP;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_dgram_bind);
+
+bool virtio_transport_dgram_allow(u32 cid, u32 port)
+{
+	return false;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_dgram_allow);
+
+int virtio_transport_connect(struct vsock_sock *vsk)
+{
+	struct virtio_transport *trans = vsk->trans;
+	struct virtio_vsock_pkt_info info = {
+		.op = VIRTIO_VSOCK_OP_REQUEST,
+		.type = VIRTIO_VSOCK_TYPE_STREAM,
+	};
+
+	pr_debug("%s: vsk=%p send_request\n", __func__, vsk);
+	return trans->ops->send_pkt(vsk, &info);
+}
+EXPORT_SYMBOL_GPL(virtio_transport_connect);
+
+int virtio_transport_shutdown(struct vsock_sock *vsk, int mode)
+{
+	struct virtio_transport *trans = vsk->trans;
+	struct virtio_vsock_pkt_info info = {
+		.op = VIRTIO_VSOCK_OP_SHUTDOWN,
+		.type = VIRTIO_VSOCK_TYPE_STREAM,
+		.flags = (mode & RCV_SHUTDOWN ?
+			  VIRTIO_VSOCK_SHUTDOWN_RCV : 0) |
+			 (mode & SEND_SHUTDOWN ?
+			  VIRTIO_VSOCK_SHUTDOWN_SEND : 0),
+	};
+
+	pr_debug("%s: vsk=%p: send_shutdown\n", __func__, vsk);
+	return trans->ops->send_pkt(vsk, &info);
+}
+EXPORT_SYMBOL_GPL(virtio_transport_shutdown);
+
+void virtio_transport_release(struct vsock_sock *vsk)
+{
+	struct sock *sk = &vsk->sk;
+
+	pr_debug("%s: vsk=%p\n", __func__, vsk);
+
+	/* Tell other side to terminate connection */
+	if (sk->sk_type == SOCK_STREAM && sk->sk_state == SS_CONNECTED)
+		virtio_transport_shutdown(vsk, SHUTDOWN_MASK);
+}
+EXPORT_SYMBOL_GPL(virtio_transport_release);
+
+int
+virtio_transport_dgram_enqueue(struct vsock_sock *vsk,
+			       struct sockaddr_vm *remote_addr,
+			       struct msghdr *msg,
+			       size_t dgram_len)
+{
+	return -EOPNOTSUPP;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_dgram_enqueue);
+
+ssize_t
+virtio_transport_stream_enqueue(struct vsock_sock *vsk,
+				struct msghdr *msg,
+				size_t len)
+{
+	struct virtio_transport *trans = vsk->trans;
+	struct virtio_vsock_pkt_info info = {
+		.op = VIRTIO_VSOCK_OP_RW,
+		.type = VIRTIO_VSOCK_TYPE_STREAM,
+		.msg = msg,
+		.pkt_len = len,
+	};
+
+	return trans->ops->send_pkt(vsk, &info);
+}
+EXPORT_SYMBOL_GPL(virtio_transport_stream_enqueue);
+
+void virtio_transport_destruct(struct vsock_sock *vsk)
+{
+	struct virtio_transport *trans = vsk->trans;
+
+	pr_debug("%s: vsk=%p\n", __func__, vsk);
+	kfree(trans);
+}
+EXPORT_SYMBOL_GPL(virtio_transport_destruct);
+
+static int virtio_transport_send_reset(struct vsock_sock *vsk,
+				       struct virtio_vsock_pkt *pkt)
+{
+	struct virtio_transport *trans = vsk->trans;
+	struct virtio_vsock_pkt_info info = {
+		.op = VIRTIO_VSOCK_OP_RST,
+		.type = VIRTIO_VSOCK_TYPE_STREAM,
+	};
+
+	pr_debug("%s\n", __func__);
+
+	/* Send RST only if the original pkt is not a RST pkt */
+	if (le16_to_cpu(pkt->hdr.op) == VIRTIO_VSOCK_OP_RST)
+		return 0;
+
+	return trans->ops->send_pkt(vsk, &info);
+}
+
+static int
+virtio_transport_recv_connecting(struct sock *sk,
+				 struct virtio_vsock_pkt *pkt)
+{
+	struct vsock_sock *vsk = vsock_sk(sk);
+	int err;
+	int skerr;
+
+	pr_debug("%s: vsk=%p\n", __func__, vsk);
+	switch (le16_to_cpu(pkt->hdr.op)) {
+	case VIRTIO_VSOCK_OP_RESPONSE:
+		pr_debug("%s: got RESPONSE\n", __func__);
+		sk->sk_state = SS_CONNECTED;
+		sk->sk_socket->state = SS_CONNECTED;
+		vsock_insert_connected(vsk);
+		sk->sk_state_change(sk);
+		break;
+	case VIRTIO_VSOCK_OP_INVALID:
+		pr_debug("%s: got invalid\n", __func__);
+		break;
+	case VIRTIO_VSOCK_OP_RST:
+		pr_debug("%s: got rst\n", __func__);
+		skerr = ECONNRESET;
+		err = 0;
+		goto destroy;
+	default:
+		pr_debug("%s: got unexpected op\n", __func__);
+		skerr = EPROTO;
+		err = -EINVAL;
+		goto destroy;
+	}
+	return 0;
+
+destroy:
+	virtio_transport_send_reset(vsk, pkt);
+	sk->sk_state = SS_UNCONNECTED;
+	sk->sk_err = skerr;
+	sk->sk_error_report(sk);
+	return err;
+}
+
+static int
+virtio_transport_recv_connected(struct sock *sk,
+				struct virtio_vsock_pkt *pkt)
+{
+	struct vsock_sock *vsk = vsock_sk(sk);
+	struct virtio_transport *trans = vsk->trans;
+	int err = 0;
+
+	switch (le16_to_cpu(pkt->hdr.op)) {
+	case VIRTIO_VSOCK_OP_RW:
+		pkt->len = le32_to_cpu(pkt->hdr.len);
+		pkt->off = 0;
+		pkt->trans = trans;
+
+		mutex_lock(&trans->rx_lock);
+		virtio_transport_inc_rx_pkt(pkt);
+		list_add_tail(&pkt->list, &trans->rx_queue);
+		mutex_unlock(&trans->rx_lock);
+
+		sk->sk_data_ready(sk);
+		return err;
+	case VIRTIO_VSOCK_OP_CREDIT_UPDATE:
+		sk->sk_write_space(sk);
+		break;
+	case VIRTIO_VSOCK_OP_SHUTDOWN:
+		pr_debug("%s: got shutdown\n", __func__);
+		if (le32_to_cpu(pkt->hdr.flags) & VIRTIO_VSOCK_SHUTDOWN_RCV)
+			vsk->peer_shutdown |= RCV_SHUTDOWN;
+		if (le32_to_cpu(pkt->hdr.flags) & VIRTIO_VSOCK_SHUTDOWN_SEND)
+			vsk->peer_shutdown |= SEND_SHUTDOWN;
+		if (le32_to_cpu(pkt->hdr.flags))
+			sk->sk_state_change(sk);
+		break;
+	case VIRTIO_VSOCK_OP_RST:
+		pr_debug("%s: got rst\n", __func__);
+		sock_set_flag(sk, SOCK_DONE);
+		vsk->peer_shutdown = SHUTDOWN_MASK;
+		if (vsock_stream_has_data(vsk) <= 0)
+			sk->sk_state = SS_DISCONNECTING;
+		sk->sk_state_change(sk);
+		break;
+	default:
+		err = -EINVAL;
+		break;
+	}
+
+	virtio_transport_free_pkt(pkt);
+	return err;
+}
+
+static int
+virtio_transport_send_response(struct vsock_sock *vsk,
+			       struct virtio_vsock_pkt *pkt)
+{
+	struct virtio_transport *trans = vsk->trans;
+	struct virtio_vsock_pkt_info info = {
+		.op = VIRTIO_VSOCK_OP_RESPONSE,
+		.type = VIRTIO_VSOCK_TYPE_STREAM,
+		.remote_cid = le32_to_cpu(pkt->hdr.src_cid),
+		.remote_port = le32_to_cpu(pkt->hdr.src_port),
+	};
+
+	pr_debug("%s: send_response\n", __func__);
+
+	return trans->ops->send_pkt(vsk, &info);
+}
+
+/* Handle server socket */
+static int
+virtio_transport_recv_listen(struct sock *sk, struct virtio_vsock_pkt *pkt)
+{
+	struct vsock_sock *vsk = vsock_sk(sk);
+	struct vsock_sock *vchild;
+	struct sock *child;
+
+	if (le16_to_cpu(pkt->hdr.op) != VIRTIO_VSOCK_OP_REQUEST) {
+		virtio_transport_send_reset(vsk, pkt);
+		return -EINVAL;
+	}
+
+	if (sk_acceptq_is_full(sk)) {
+		virtio_transport_send_reset(vsk, pkt);
+		return -ENOMEM;
+	}
+
+	pr_debug("%s: create pending\n", __func__);
+	child = __vsock_create(sock_net(sk), NULL, sk, GFP_KERNEL,
+			       sk->sk_type, 0);
+	if (!child) {
+		virtio_transport_send_reset(vsk, pkt);
+		return -ENOMEM;
+	}
+
+	sk->sk_ack_backlog++;
+
+	lock_sock(child);
+
+	child->sk_state = SS_CONNECTED;
+
+	vchild = vsock_sk(child);
+	vsock_addr_init(&vchild->local_addr, le32_to_cpu(pkt->hdr.dst_cid),
+			le32_to_cpu(pkt->hdr.dst_port));
+	vsock_addr_init(&vchild->remote_addr, le32_to_cpu(pkt->hdr.src_cid),
+			le32_to_cpu(pkt->hdr.src_port));
+
+	vsock_insert_connected(vchild);
+	vsock_enqueue_accept(sk, child);
+	virtio_transport_send_response(vchild, pkt);
+
+	release_sock(child);
+
+	sk->sk_data_ready(sk);
+	return 0;
+}
+
+static void virtio_transport_space_update(struct sock *sk,
+					  struct virtio_vsock_pkt *pkt)
+{
+	struct vsock_sock *vsk = vsock_sk(sk);
+	struct virtio_transport *trans = vsk->trans;
+	bool space_available;
+
+	/* buf_alloc and fwd_cnt are always included in the hdr */
+	mutex_lock(&trans->tx_lock);
+	trans->peer_buf_alloc = le32_to_cpu(pkt->hdr.buf_alloc);
+	trans->peer_fwd_cnt = le32_to_cpu(pkt->hdr.fwd_cnt);
+	space_available = virtio_transport_has_space(vsk);
+	mutex_unlock(&trans->tx_lock);
+
+	if (space_available)
+		sk->sk_write_space(sk);
+}
+
+/* We are under the virtio-vsock's vsock->rx_lock or
+ * vhost-vsock's vq->mutex lock.
+ */
+void virtio_transport_recv_pkt(struct virtio_vsock_pkt *pkt)
+{
+	struct virtio_transport *trans;
+	struct sockaddr_vm src, dst;
+	struct vsock_sock *vsk;
+	struct sock *sk;
+
+	vsock_addr_init(&src, le32_to_cpu(pkt->hdr.src_cid), le32_to_cpu(pkt->hdr.src_port));
+	vsock_addr_init(&dst, le32_to_cpu(pkt->hdr.dst_cid), le32_to_cpu(pkt->hdr.dst_port));
+
+	virtio_vsock_dumppkt(__func__, pkt);
+
+	if (le16_to_cpu(pkt->hdr.type) != VIRTIO_VSOCK_TYPE_STREAM) {
+		/* TODO send RST */
+		goto free_pkt;
+	}
+
+	/* The socket must be in connected or bound table
+	 * otherwise send reset back
+	 */
+	sk = vsock_find_connected_socket(&src, &dst);
+	if (!sk) {
+		sk = vsock_find_bound_socket(&dst);
+		if (!sk) {
+			pr_debug("%s: can not find bound_socket\n", __func__);
+			virtio_vsock_dumppkt(__func__, pkt);
+			/* Ignore this pkt instead of sending reset back */
+			/* TODO send a RST unless this packet is a RST (to avoid infinite loops) */
+			goto free_pkt;
+		}
+	}
+
+	vsk = vsock_sk(sk);
+	trans = vsk->trans;
+	BUG_ON(!trans);
+
+	virtio_transport_space_update(sk, pkt);
+
+	lock_sock(sk);
+	switch (sk->sk_state) {
+	case VSOCK_SS_LISTEN:
+		virtio_transport_recv_listen(sk, pkt);
+		virtio_transport_free_pkt(pkt);
+		break;
+	case SS_CONNECTING:
+		virtio_transport_recv_connecting(sk, pkt);
+		virtio_transport_free_pkt(pkt);
+		break;
+	case SS_CONNECTED:
+		virtio_transport_recv_connected(sk, pkt);
+		break;
+	default:
+		virtio_transport_free_pkt(pkt);
+		break;
+	}
+	release_sock(sk);
+
+	/* Release refcnt obtained when we fetched this socket out of the
+	 * bound or connected list.
+	 */
+	sock_put(sk);
+	return;
+
+free_pkt:
+	virtio_transport_free_pkt(pkt);
+}
+EXPORT_SYMBOL_GPL(virtio_transport_recv_pkt);
+
+void virtio_transport_free_pkt(struct virtio_vsock_pkt *pkt)
+{
+	kfree(pkt->buf);
+	kfree(pkt);
+}
+EXPORT_SYMBOL_GPL(virtio_transport_free_pkt);
+
+MODULE_LICENSE("GPL v2");
+MODULE_AUTHOR("Asias He");
+MODULE_DESCRIPTION("common code for virtio vsock");
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v3 2/4] VSOCK: Introduce virtio-vsock.ko
  2015-12-09 12:03 [PATCH v3 0/4] Add virtio transport for AF_VSOCK Stefan Hajnoczi
                   ` (2 preceding siblings ...)
  2015-12-09 12:03 ` [PATCH v3 2/4] VSOCK: Introduce virtio-vsock.ko Stefan Hajnoczi
@ 2015-12-09 12:03 ` Stefan Hajnoczi
  2015-12-10 21:23   ` Alex Bennée
  2015-12-09 12:03 ` [PATCH v3 3/4] VSOCK: Introduce vhost-vsock.ko Stefan Hajnoczi
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 23+ messages in thread
From: Stefan Hajnoczi @ 2015-12-09 12:03 UTC (permalink / raw)
  To: kvm
  Cc: Matt Benjamin, Christoffer Dall, netdev, Michael S. Tsirkin,
	matt.ma, virtualization, Asias He, Stefan Hajnoczi

From: Asias He <asias@redhat.com>

VM sockets virtio transport implementation. This module runs in the
guest kernel.

Signed-off-by: Asias He <asias@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
v2:
 * Fix total_tx_buf accounting
 * Add virtio_transport global mutex to prevent races
---
 net/vmw_vsock/virtio_transport.c | 466 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 466 insertions(+)
 create mode 100644 net/vmw_vsock/virtio_transport.c

diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
new file mode 100644
index 0000000..df65dca
--- /dev/null
+++ b/net/vmw_vsock/virtio_transport.c
@@ -0,0 +1,466 @@
+/*
+ * virtio transport for vsock
+ *
+ * Copyright (C) 2013-2015 Red Hat, Inc.
+ * Author: Asias He <asias@redhat.com>
+ *         Stefan Hajnoczi <stefanha@redhat.com>
+ *
+ * Some of the code is taken from Gerd Hoffmann <kraxel@redhat.com>'s
+ * early virtio-vsock proof-of-concept bits.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.
+ */
+#include <linux/spinlock.h>
+#include <linux/module.h>
+#include <linux/list.h>
+#include <linux/virtio.h>
+#include <linux/virtio_ids.h>
+#include <linux/virtio_config.h>
+#include <linux/virtio_vsock.h>
+#include <net/sock.h>
+#include <linux/mutex.h>
+#include <net/af_vsock.h>
+
+static struct workqueue_struct *virtio_vsock_workqueue;
+static struct virtio_vsock *the_virtio_vsock;
+static DEFINE_MUTEX(the_virtio_vsock_mutex); /* protects the_virtio_vsock */
+static void virtio_vsock_rx_fill(struct virtio_vsock *vsock);
+
+struct virtio_vsock {
+	/* Virtio device */
+	struct virtio_device *vdev;
+	/* Virtio virtqueue */
+	struct virtqueue *vqs[VSOCK_VQ_MAX];
+	/* Wait queue for send pkt */
+	wait_queue_head_t queue_wait;
+	/* Work item to send pkt */
+	struct work_struct tx_work;
+	/* Work item to recv pkt */
+	struct work_struct rx_work;
+	/* Mutex to protect send pkt */
+	struct mutex tx_lock;
+	/* Mutex to protect recv pkt */
+	struct mutex rx_lock;
+	/* Number of recv buffers */
+	int rx_buf_nr;
+	/* Number of max recv buffers */
+	int rx_buf_max_nr;
+	/* Used for global tx buf limitation */
+	u32 total_tx_buf;
+	/* Guest context id, just like guest ip address */
+	u32 guest_cid;
+};
+
+static struct virtio_vsock *virtio_vsock_get(void)
+{
+	return the_virtio_vsock;
+}
+
+static u32 virtio_transport_get_local_cid(void)
+{
+	struct virtio_vsock *vsock = virtio_vsock_get();
+
+	return vsock->guest_cid;
+}
+
+static int
+virtio_transport_send_pkt(struct vsock_sock *vsk,
+			  struct virtio_vsock_pkt_info *info)
+{
+	u32 src_cid, src_port, dst_cid, dst_port;
+	int ret, in_sg = 0, out_sg = 0;
+	struct virtio_transport *trans;
+	struct virtio_vsock_pkt *pkt;
+	struct virtio_vsock *vsock;
+	struct scatterlist hdr, buf, *sgs[2];
+	struct virtqueue *vq;
+	u32 pkt_len = info->pkt_len;
+	DEFINE_WAIT(wait);
+
+	vsock = virtio_vsock_get();
+	if (!vsock)
+		return -ENODEV;
+
+	src_cid	= virtio_transport_get_local_cid();
+	src_port = vsk->local_addr.svm_port;
+	if (!info->remote_cid) {
+		dst_cid	= vsk->remote_addr.svm_cid;
+		dst_port = vsk->remote_addr.svm_port;
+	} else {
+		dst_cid = info->remote_cid;
+		dst_port = info->remote_port;
+	}
+
+	trans = vsk->trans;
+	vq = vsock->vqs[VSOCK_VQ_TX];
+
+	if (pkt_len > VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE)
+		pkt_len = VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE;
+	pkt_len = virtio_transport_get_credit(trans, pkt_len);
+	/* Do not send zero length OP_RW pkt */
+	if (pkt_len == 0 && info->op == VIRTIO_VSOCK_OP_RW)
+		return pkt_len;
+
+	/* Respect global tx buf limitation */
+	mutex_lock(&vsock->tx_lock);
+	while (pkt_len + vsock->total_tx_buf > VIRTIO_VSOCK_MAX_TX_BUF_SIZE) {
+		prepare_to_wait_exclusive(&vsock->queue_wait, &wait,
+					  TASK_UNINTERRUPTIBLE);
+		mutex_unlock(&vsock->tx_lock);
+		schedule();
+		mutex_lock(&vsock->tx_lock);
+		finish_wait(&vsock->queue_wait, &wait);
+	}
+	vsock->total_tx_buf += pkt_len;
+	mutex_unlock(&vsock->tx_lock);
+
+	pkt = virtio_transport_alloc_pkt(vsk, info, pkt_len,
+					 src_cid, src_port,
+					 dst_cid, dst_port);
+	if (!pkt) {
+		mutex_lock(&vsock->tx_lock);
+		vsock->total_tx_buf -= pkt_len;
+		mutex_unlock(&vsock->tx_lock);
+		virtio_transport_put_credit(trans, pkt_len);
+		return -ENOMEM;
+	}
+
+	pr_debug("%s: info->pkt_len=%u\n", __func__, info->pkt_len);
+
+	/* Will be released in virtio_transport_send_pkt_work */
+	sock_hold(&trans->vsk->sk);
+	virtio_transport_inc_tx_pkt(pkt);
+
+	/* Put pkt in the virtqueue */
+	sg_init_one(&hdr, &pkt->hdr, sizeof(pkt->hdr));
+	sgs[out_sg++] = &hdr;
+	if (info->msg && info->pkt_len > 0) {
+		sg_init_one(&buf, pkt->buf, pkt->len);
+		sgs[out_sg++] = &buf;
+	}
+
+	mutex_lock(&vsock->tx_lock);
+	while ((ret = virtqueue_add_sgs(vq, sgs, out_sg, in_sg, pkt,
+					GFP_KERNEL)) < 0) {
+		prepare_to_wait_exclusive(&vsock->queue_wait, &wait,
+					  TASK_UNINTERRUPTIBLE);
+		mutex_unlock(&vsock->tx_lock);
+		schedule();
+		mutex_lock(&vsock->tx_lock);
+		finish_wait(&vsock->queue_wait, &wait);
+	}
+	virtqueue_kick(vq);
+	mutex_unlock(&vsock->tx_lock);
+
+	return pkt_len;
+}
+
+static struct virtio_transport_pkt_ops virtio_ops = {
+	.send_pkt = virtio_transport_send_pkt,
+};
+
+static void virtio_vsock_rx_fill(struct virtio_vsock *vsock)
+{
+	int buf_len = VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE;
+	struct virtio_vsock_pkt *pkt;
+	struct scatterlist hdr, buf, *sgs[2];
+	struct virtqueue *vq;
+	int ret;
+
+	vq = vsock->vqs[VSOCK_VQ_RX];
+
+	do {
+		pkt = kzalloc(sizeof(*pkt), GFP_KERNEL);
+		if (!pkt) {
+			pr_debug("%s: fail to allocate pkt\n", __func__);
+			goto out;
+		}
+
+		/* TODO: use mergeable rx buffer */
+		pkt->buf = kmalloc(buf_len, GFP_KERNEL);
+		if (!pkt->buf) {
+			pr_debug("%s: fail to allocate pkt->buf\n", __func__);
+			goto err;
+		}
+
+		sg_init_one(&hdr, &pkt->hdr, sizeof(pkt->hdr));
+		sgs[0] = &hdr;
+
+		sg_init_one(&buf, pkt->buf, buf_len);
+		sgs[1] = &buf;
+		ret = virtqueue_add_sgs(vq, sgs, 0, 2, pkt, GFP_KERNEL);
+		if (ret)
+			goto err;
+		vsock->rx_buf_nr++;
+	} while (vq->num_free);
+	if (vsock->rx_buf_nr > vsock->rx_buf_max_nr)
+		vsock->rx_buf_max_nr = vsock->rx_buf_nr;
+out:
+	virtqueue_kick(vq);
+	return;
+err:
+	virtqueue_kick(vq);
+	virtio_transport_free_pkt(pkt);
+	return;
+}
+
+static void virtio_transport_send_pkt_work(struct work_struct *work)
+{
+	struct virtio_vsock *vsock =
+		container_of(work, struct virtio_vsock, tx_work);
+	struct virtio_vsock_pkt *pkt;
+	bool added = false;
+	struct virtqueue *vq;
+	unsigned int len;
+	struct sock *sk;
+
+	vq = vsock->vqs[VSOCK_VQ_TX];
+	mutex_lock(&vsock->tx_lock);
+	do {
+		virtqueue_disable_cb(vq);
+		while ((pkt = virtqueue_get_buf(vq, &len)) != NULL) {
+			sk = &pkt->trans->vsk->sk;
+			virtio_transport_dec_tx_pkt(pkt);
+			/* Release refcnt taken in virtio_transport_send_pkt */
+			sock_put(sk);
+			vsock->total_tx_buf -= pkt->len;
+			virtio_transport_free_pkt(pkt);
+			added = true;
+		}
+	} while (!virtqueue_enable_cb(vq));
+	mutex_unlock(&vsock->tx_lock);
+
+	if (added)
+		wake_up(&vsock->queue_wait);
+}
+
+static void virtio_transport_recv_pkt_work(struct work_struct *work)
+{
+	struct virtio_vsock *vsock =
+		container_of(work, struct virtio_vsock, rx_work);
+	struct virtio_vsock_pkt *pkt;
+	struct virtqueue *vq;
+	unsigned int len;
+
+	vq = vsock->vqs[VSOCK_VQ_RX];
+	mutex_lock(&vsock->rx_lock);
+	do {
+		virtqueue_disable_cb(vq);
+		while ((pkt = virtqueue_get_buf(vq, &len)) != NULL) {
+			pkt->len = len;
+			virtio_transport_recv_pkt(pkt);
+			vsock->rx_buf_nr--;
+		}
+	} while (!virtqueue_enable_cb(vq));
+
+	if (vsock->rx_buf_nr < vsock->rx_buf_max_nr / 2)
+		virtio_vsock_rx_fill(vsock);
+	mutex_unlock(&vsock->rx_lock);
+}
+
+static void virtio_vsock_ctrl_done(struct virtqueue *vq)
+{
+}
+
+static void virtio_vsock_tx_done(struct virtqueue *vq)
+{
+	struct virtio_vsock *vsock = vq->vdev->priv;
+
+	if (!vsock)
+		return;
+	queue_work(virtio_vsock_workqueue, &vsock->tx_work);
+}
+
+static void virtio_vsock_rx_done(struct virtqueue *vq)
+{
+	struct virtio_vsock *vsock = vq->vdev->priv;
+
+	if (!vsock)
+		return;
+	queue_work(virtio_vsock_workqueue, &vsock->rx_work);
+}
+
+static int
+virtio_transport_socket_init(struct vsock_sock *vsk, struct vsock_sock *psk)
+{
+	struct virtio_transport *trans;
+	int ret;
+
+	ret = virtio_transport_do_socket_init(vsk, psk);
+	if (ret)
+		return ret;
+
+	trans = vsk->trans;
+	trans->ops = &virtio_ops;
+	return ret;
+}
+
+static struct vsock_transport virtio_transport = {
+	.get_local_cid            = virtio_transport_get_local_cid,
+
+	.init                     = virtio_transport_socket_init,
+	.destruct                 = virtio_transport_destruct,
+	.release                  = virtio_transport_release,
+	.connect                  = virtio_transport_connect,
+	.shutdown                 = virtio_transport_shutdown,
+
+	.dgram_bind               = virtio_transport_dgram_bind,
+	.dgram_dequeue            = virtio_transport_dgram_dequeue,
+	.dgram_enqueue            = virtio_transport_dgram_enqueue,
+	.dgram_allow              = virtio_transport_dgram_allow,
+
+	.stream_dequeue           = virtio_transport_stream_dequeue,
+	.stream_enqueue           = virtio_transport_stream_enqueue,
+	.stream_has_data          = virtio_transport_stream_has_data,
+	.stream_has_space         = virtio_transport_stream_has_space,
+	.stream_rcvhiwat          = virtio_transport_stream_rcvhiwat,
+	.stream_is_active         = virtio_transport_stream_is_active,
+	.stream_allow             = virtio_transport_stream_allow,
+
+	.notify_poll_in           = virtio_transport_notify_poll_in,
+	.notify_poll_out          = virtio_transport_notify_poll_out,
+	.notify_recv_init         = virtio_transport_notify_recv_init,
+	.notify_recv_pre_block    = virtio_transport_notify_recv_pre_block,
+	.notify_recv_pre_dequeue  = virtio_transport_notify_recv_pre_dequeue,
+	.notify_recv_post_dequeue = virtio_transport_notify_recv_post_dequeue,
+	.notify_send_init         = virtio_transport_notify_send_init,
+	.notify_send_pre_block    = virtio_transport_notify_send_pre_block,
+	.notify_send_pre_enqueue  = virtio_transport_notify_send_pre_enqueue,
+	.notify_send_post_enqueue = virtio_transport_notify_send_post_enqueue,
+
+	.set_buffer_size          = virtio_transport_set_buffer_size,
+	.set_min_buffer_size      = virtio_transport_set_min_buffer_size,
+	.set_max_buffer_size      = virtio_transport_set_max_buffer_size,
+	.get_buffer_size          = virtio_transport_get_buffer_size,
+	.get_min_buffer_size      = virtio_transport_get_min_buffer_size,
+	.get_max_buffer_size      = virtio_transport_get_max_buffer_size,
+};
+
+static int virtio_vsock_probe(struct virtio_device *vdev)
+{
+	vq_callback_t *callbacks[] = {
+		virtio_vsock_ctrl_done,
+		virtio_vsock_rx_done,
+		virtio_vsock_tx_done,
+	};
+	const char *names[] = {
+		"ctrl",
+		"rx",
+		"tx",
+	};
+	struct virtio_vsock *vsock = NULL;
+	u32 guest_cid;
+	int ret;
+
+	ret = mutex_lock_interruptible(&the_virtio_vsock_mutex);
+	if (ret)
+		return ret;
+
+	/* Only one virtio-vsock device per guest is supported */
+	if (the_virtio_vsock) {
+		ret = -EBUSY;
+		goto out;
+	}
+
+	vsock = kzalloc(sizeof(*vsock), GFP_KERNEL);
+	if (!vsock) {
+		ret = -ENOMEM;
+		goto out;
+	}
+
+	vsock->vdev = vdev;
+
+	ret = vsock->vdev->config->find_vqs(vsock->vdev, VSOCK_VQ_MAX,
+					    vsock->vqs, callbacks, names);
+	if (ret < 0)
+		goto out;
+
+	vdev->config->get(vdev, offsetof(struct virtio_vsock_config, guest_cid),
+			  &guest_cid, sizeof(guest_cid));
+	vsock->guest_cid = le32_to_cpu(guest_cid);
+	pr_debug("%s: guest_cid=%u\n", __func__, vsock->guest_cid);
+
+	ret = vsock_core_init(&virtio_transport);
+	if (ret < 0)
+		goto out_vqs;
+
+	vsock->rx_buf_nr = 0;
+	vsock->rx_buf_max_nr = 0;
+
+	vdev->priv = the_virtio_vsock = vsock;
+	init_waitqueue_head(&vsock->queue_wait);
+	mutex_init(&vsock->tx_lock);
+	mutex_init(&vsock->rx_lock);
+	INIT_WORK(&vsock->rx_work, virtio_transport_recv_pkt_work);
+	INIT_WORK(&vsock->tx_work, virtio_transport_send_pkt_work);
+
+	mutex_lock(&vsock->rx_lock);
+	virtio_vsock_rx_fill(vsock);
+	mutex_unlock(&vsock->rx_lock);
+
+	mutex_unlock(&the_virtio_vsock_mutex);
+	return 0;
+
+out_vqs:
+	vsock->vdev->config->del_vqs(vsock->vdev);
+out:
+	kfree(vsock);
+	mutex_unlock(&the_virtio_vsock_mutex);
+	return ret;
+}
+
+static void virtio_vsock_remove(struct virtio_device *vdev)
+{
+	struct virtio_vsock *vsock = vdev->priv;
+
+	mutex_lock(&the_virtio_vsock_mutex);
+	the_virtio_vsock = NULL;
+	vsock_core_exit();
+	mutex_unlock(&the_virtio_vsock_mutex);
+
+	kfree(vsock);
+}
+
+static struct virtio_device_id id_table[] = {
+	{ VIRTIO_ID_VSOCK, VIRTIO_DEV_ANY_ID },
+	{ 0 },
+};
+
+static unsigned int features[] = {
+};
+
+static struct virtio_driver virtio_vsock_driver = {
+	.feature_table = features,
+	.feature_table_size = ARRAY_SIZE(features),
+	.driver.name = KBUILD_MODNAME,
+	.driver.owner = THIS_MODULE,
+	.id_table = id_table,
+	.probe = virtio_vsock_probe,
+	.remove = virtio_vsock_remove,
+};
+
+static int __init virtio_vsock_init(void)
+{
+	int ret;
+
+	virtio_vsock_workqueue = alloc_workqueue("virtio_vsock", 0, 0);
+	if (!virtio_vsock_workqueue)
+		return -ENOMEM;
+	ret = register_virtio_driver(&virtio_vsock_driver);
+	if (ret)
+		destroy_workqueue(virtio_vsock_workqueue);
+	return ret;
+}
+
+static void __exit virtio_vsock_exit(void)
+{
+	unregister_virtio_driver(&virtio_vsock_driver);
+	destroy_workqueue(virtio_vsock_workqueue);
+}
+
+module_init(virtio_vsock_init);
+module_exit(virtio_vsock_exit);
+MODULE_LICENSE("GPL v2");
+MODULE_AUTHOR("Asias He");
+MODULE_DESCRIPTION("virtio transport for vsock");
+MODULE_DEVICE_TABLE(virtio, id_table);
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v3 2/4] VSOCK: Introduce virtio-vsock.ko
  2015-12-09 12:03 [PATCH v3 0/4] Add virtio transport for AF_VSOCK Stefan Hajnoczi
  2015-12-09 12:03 ` [PATCH v3 1/4] VSOCK: Introduce virtio-vsock-common.ko Stefan Hajnoczi
  2015-12-09 12:03 ` Stefan Hajnoczi
@ 2015-12-09 12:03 ` Stefan Hajnoczi
  2015-12-09 12:03 ` Stefan Hajnoczi
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 23+ messages in thread
From: Stefan Hajnoczi @ 2015-12-09 12:03 UTC (permalink / raw)
  To: kvm
  Cc: Stefan Hajnoczi, Michael S. Tsirkin, netdev, virtualization,
	Matt Benjamin, Asias He, Christoffer Dall, matt.ma

From: Asias He <asias@redhat.com>

VM sockets virtio transport implementation. This module runs in guest
kernel.

Signed-off-by: Asias He <asias@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
v2:
 * Fix total_tx_buf accounting
 * Add virtio_transport global mutex to prevent races
---
 net/vmw_vsock/virtio_transport.c | 466 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 466 insertions(+)
 create mode 100644 net/vmw_vsock/virtio_transport.c

diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
new file mode 100644
index 0000000..df65dca
--- /dev/null
+++ b/net/vmw_vsock/virtio_transport.c
@@ -0,0 +1,466 @@
+/*
+ * virtio transport for vsock
+ *
+ * Copyright (C) 2013-2015 Red Hat, Inc.
+ * Author: Asias He <asias@redhat.com>
+ *         Stefan Hajnoczi <stefanha@redhat.com>
+ *
+ * Some of the code is take from Gerd Hoffmann <kraxel@redhat.com>'s
+ * early virtio-vsock proof-of-concept bits.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.
+ */
+#include <linux/spinlock.h>
+#include <linux/module.h>
+#include <linux/list.h>
+#include <linux/virtio.h>
+#include <linux/virtio_ids.h>
+#include <linux/virtio_config.h>
+#include <linux/virtio_vsock.h>
+#include <net/sock.h>
+#include <linux/mutex.h>
+#include <net/af_vsock.h>
+
+static struct workqueue_struct *virtio_vsock_workqueue;
+static struct virtio_vsock *the_virtio_vsock;
+static DEFINE_MUTEX(the_virtio_vsock_mutex); /* protects the_virtio_vsock */
+static void virtio_vsock_rx_fill(struct virtio_vsock *vsock);
+
+struct virtio_vsock {
+	/* Virtio device */
+	struct virtio_device *vdev;
+	/* Virtio virtqueue */
+	struct virtqueue *vqs[VSOCK_VQ_MAX];
+	/* Wait queue for send pkt */
+	wait_queue_head_t queue_wait;
+	/* Work item to send pkt */
+	struct work_struct tx_work;
+	/* Work item to recv pkt */
+	struct work_struct rx_work;
+	/* Mutex to protect send pkt*/
+	struct mutex tx_lock;
+	/* Mutex to protect recv pkt*/
+	struct mutex rx_lock;
+	/* Number of recv buffers */
+	int rx_buf_nr;
+	/* Number of max recv buffers */
+	int rx_buf_max_nr;
+	/* Used for global tx buf limitation */
+	u32 total_tx_buf;
+	/* Guest context id, just like guest ip address */
+	u32 guest_cid;
+};
+
+static struct virtio_vsock *virtio_vsock_get(void)
+{
+	return the_virtio_vsock;
+}
+
+static u32 virtio_transport_get_local_cid(void)
+{
+	struct virtio_vsock *vsock = virtio_vsock_get();
+
+	return vsock->guest_cid;
+}
+
+static int
+virtio_transport_send_pkt(struct vsock_sock *vsk,
+			  struct virtio_vsock_pkt_info *info)
+{
+	u32 src_cid, src_port, dst_cid, dst_port;
+	int ret, in_sg = 0, out_sg = 0;
+	struct virtio_transport *trans;
+	struct virtio_vsock_pkt *pkt;
+	struct virtio_vsock *vsock;
+	struct scatterlist hdr, buf, *sgs[2];
+	struct virtqueue *vq;
+	u32 pkt_len = info->pkt_len;
+	DEFINE_WAIT(wait);
+
+	vsock = virtio_vsock_get();
+	if (!vsock)
+		return -ENODEV;
+
+	src_cid	= virtio_transport_get_local_cid();
+	src_port = vsk->local_addr.svm_port;
+	if (!info->remote_cid) {
+		dst_cid	= vsk->remote_addr.svm_cid;
+		dst_port = vsk->remote_addr.svm_port;
+	} else {
+		dst_cid = info->remote_cid;
+		dst_port = info->remote_port;
+	}
+
+	trans = vsk->trans;
+	vq = vsock->vqs[VSOCK_VQ_TX];
+
+	if (pkt_len > VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE)
+		pkt_len = VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE;
+	pkt_len = virtio_transport_get_credit(trans, pkt_len);
+	/* Do not send zero length OP_RW pkt*/
+	if (pkt_len == 0 && info->op == VIRTIO_VSOCK_OP_RW)
+		return pkt_len;
+
+	/* Respect global tx buf limitation */
+	mutex_lock(&vsock->tx_lock);
+	while (pkt_len + vsock->total_tx_buf > VIRTIO_VSOCK_MAX_TX_BUF_SIZE) {
+		prepare_to_wait_exclusive(&vsock->queue_wait, &wait,
+					  TASK_UNINTERRUPTIBLE);
+		mutex_unlock(&vsock->tx_lock);
+		schedule();
+		mutex_lock(&vsock->tx_lock);
+		finish_wait(&vsock->queue_wait, &wait);
+	}
+	vsock->total_tx_buf += pkt_len;
+	mutex_unlock(&vsock->tx_lock);
+
+	pkt = virtio_transport_alloc_pkt(vsk, info, pkt_len,
+					 src_cid, src_port,
+					 dst_cid, dst_port);
+	if (!pkt) {
+		mutex_lock(&vsock->tx_lock);
+		vsock->total_tx_buf -= pkt_len;
+		mutex_unlock(&vsock->tx_lock);
+		virtio_transport_put_credit(trans, pkt_len);
+		return -ENOMEM;
+	}
+
+	pr_debug("%s: pkt_len=%d\n", __func__, info->pkt_len);
+
+	/* Will be released in virtio_transport_send_pkt_work */
+	sock_hold(&trans->vsk->sk);
+	virtio_transport_inc_tx_pkt(pkt);
+
+	/* Put pkt in the virtqueue */
+	sg_init_one(&hdr, &pkt->hdr, sizeof(pkt->hdr));
+	sgs[out_sg++] = &hdr;
+	if (info->msg && info->pkt_len > 0) {
+		sg_init_one(&buf, pkt->buf, pkt->len);
+		sgs[out_sg++] = &buf;
+	}
+
+	mutex_lock(&vsock->tx_lock);
+	while ((ret = virtqueue_add_sgs(vq, sgs, out_sg, in_sg, pkt,
+					GFP_KERNEL)) < 0) {
+		prepare_to_wait_exclusive(&vsock->queue_wait, &wait,
+					  TASK_UNINTERRUPTIBLE);
+		mutex_unlock(&vsock->tx_lock);
+		schedule();
+		mutex_lock(&vsock->tx_lock);
+		finish_wait(&vsock->queue_wait, &wait);
+	}
+	virtqueue_kick(vq);
+	mutex_unlock(&vsock->tx_lock);
+
+	return pkt_len;
+}
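For reviewers, the send path above does three clamps in sequence: pkt_len is limited to one rx buffer, then to the peer's available credit, and finally charged against a global tx buffer budget (sleeping if the budget is exhausted). A minimal userspace sketch of that arithmetic — constants and helper names here are illustrative stand-ins, not the kernel's:

```c
#include <assert.h>
#include <stdint.h>

#define RX_BUF_SIZE (1024u * 4u)  /* stand-in for VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE */
#define MAX_TX_BUF  (1024u * 16u) /* stand-in for VIRTIO_VSOCK_MAX_TX_BUF_SIZE */

/* Clamp a requested length to one rx buffer and to the remaining credit. */
static uint32_t clamp_pkt_len(uint32_t want, uint32_t peer_credit)
{
	uint32_t len = want;

	if (len > RX_BUF_SIZE)
		len = RX_BUF_SIZE;
	if (len > peer_credit)
		len = peer_credit;
	return len;
}

/* Charge len against the global tx budget; return 0 on success, -1 if the
 * caller would have to wait (the kernel sleeps on queue_wait instead). */
static int charge_tx_buf(uint32_t *total_tx_buf, uint32_t len)
{
	if (*total_tx_buf + len > MAX_TX_BUF)
		return -1;
	*total_tx_buf += len;
	return 0;
}
```

Note that a zero result from the credit clamp is why the "zero length OP_RW" early return exists: there is nothing left to send.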
+
+static struct virtio_transport_pkt_ops virtio_ops = {
+	.send_pkt = virtio_transport_send_pkt,
+};
+
+static void virtio_vsock_rx_fill(struct virtio_vsock *vsock)
+{
+	int buf_len = VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE;
+	struct virtio_vsock_pkt *pkt;
+	struct scatterlist hdr, buf, *sgs[2];
+	struct virtqueue *vq;
+	int ret;
+
+	vq = vsock->vqs[VSOCK_VQ_RX];
+
+	do {
+		pkt = kzalloc(sizeof(*pkt), GFP_KERNEL);
+		if (!pkt) {
+			pr_debug("%s: failed to allocate pkt\n", __func__);
+			goto out;
+		}
+
+		/* TODO: use mergeable rx buffer */
+		pkt->buf = kmalloc(buf_len, GFP_KERNEL);
+		if (!pkt->buf) {
+			pr_debug("%s: failed to allocate pkt->buf\n", __func__);
+			goto err;
+		}
+
+		sg_init_one(&hdr, &pkt->hdr, sizeof(pkt->hdr));
+		sgs[0] = &hdr;
+
+		sg_init_one(&buf, pkt->buf, buf_len);
+		sgs[1] = &buf;
+		ret = virtqueue_add_sgs(vq, sgs, 0, 2, pkt, GFP_KERNEL);
+		if (ret)
+			goto err;
+		vsock->rx_buf_nr++;
+	} while (vq->num_free);
+	if (vsock->rx_buf_nr > vsock->rx_buf_max_nr)
+		vsock->rx_buf_max_nr = vsock->rx_buf_nr;
+out:
+	virtqueue_kick(vq);
+	return;
+err:
+	virtqueue_kick(vq);
+	virtio_transport_free_pkt(pkt);
+	return;
+}
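The rx refill bookkeeping above is simple but easy to misread: buffers are posted until the virtqueue is full, the high-water mark is remembered, and the recv worker later refills once fewer than half of that mark remain posted. A toy model of just the counters (field names mirror the driver, the ring is faked):

```c
#include <assert.h>

/* Toy model of the rx ring refill bookkeeping. */
struct rx_state {
	int num_free;      /* free ring slots */
	int rx_buf_nr;     /* buffers currently posted */
	int rx_buf_max_nr; /* high-water mark */
};

static void rx_fill(struct rx_state *s)
{
	/* Post buffers until the ring is full, then update the mark. */
	while (s->num_free > 0) {
		s->num_free--;
		s->rx_buf_nr++;
	}
	if (s->rx_buf_nr > s->rx_buf_max_nr)
		s->rx_buf_max_nr = s->rx_buf_nr;
}

/* Condition checked at the end of virtio_transport_recv_pkt_work(). */
static int need_refill(const struct rx_state *s)
{
	return s->rx_buf_nr < s->rx_buf_max_nr / 2;
}
```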
+
+static void virtio_transport_send_pkt_work(struct work_struct *work)
+{
+	struct virtio_vsock *vsock =
+		container_of(work, struct virtio_vsock, tx_work);
+	struct virtio_vsock_pkt *pkt;
+	bool added = false;
+	struct virtqueue *vq;
+	unsigned int len;
+	struct sock *sk;
+
+	vq = vsock->vqs[VSOCK_VQ_TX];
+	mutex_lock(&vsock->tx_lock);
+	do {
+		virtqueue_disable_cb(vq);
+		while ((pkt = virtqueue_get_buf(vq, &len)) != NULL) {
+			sk = &pkt->trans->vsk->sk;
+			virtio_transport_dec_tx_pkt(pkt);
+			/* Release refcnt taken in virtio_transport_send_pkt */
+			sock_put(sk);
+			vsock->total_tx_buf -= pkt->len;
+			virtio_transport_free_pkt(pkt);
+			added = true;
+		}
+	} while (!virtqueue_enable_cb(vq));
+	mutex_unlock(&vsock->tx_lock);
+
+	if (added)
+		wake_up(&vsock->queue_wait);
+}
+
+static void virtio_transport_recv_pkt_work(struct work_struct *work)
+{
+	struct virtio_vsock *vsock =
+		container_of(work, struct virtio_vsock, rx_work);
+	struct virtio_vsock_pkt *pkt;
+	struct virtqueue *vq;
+	unsigned int len;
+
+	vq = vsock->vqs[VSOCK_VQ_RX];
+	mutex_lock(&vsock->rx_lock);
+	do {
+		virtqueue_disable_cb(vq);
+		while ((pkt = virtqueue_get_buf(vq, &len)) != NULL) {
+			pkt->len = len;
+			virtio_transport_recv_pkt(pkt);
+			vsock->rx_buf_nr--;
+		}
+	} while (!virtqueue_enable_cb(vq));
+
+	if (vsock->rx_buf_nr < vsock->rx_buf_max_nr / 2)
+		virtio_vsock_rx_fill(vsock);
+	mutex_unlock(&vsock->rx_lock);
+}
+
+static void virtio_vsock_ctrl_done(struct virtqueue *vq)
+{
+}
+
+static void virtio_vsock_tx_done(struct virtqueue *vq)
+{
+	struct virtio_vsock *vsock = vq->vdev->priv;
+
+	if (!vsock)
+		return;
+	queue_work(virtio_vsock_workqueue, &vsock->tx_work);
+}
+
+static void virtio_vsock_rx_done(struct virtqueue *vq)
+{
+	struct virtio_vsock *vsock = vq->vdev->priv;
+
+	if (!vsock)
+		return;
+	queue_work(virtio_vsock_workqueue, &vsock->rx_work);
+}
+
+static int
+virtio_transport_socket_init(struct vsock_sock *vsk, struct vsock_sock *psk)
+{
+	struct virtio_transport *trans;
+	int ret;
+
+	ret = virtio_transport_do_socket_init(vsk, psk);
+	if (ret)
+		return ret;
+
+	trans = vsk->trans;
+	trans->ops = &virtio_ops;
+	return ret;
+}
+
+static struct vsock_transport virtio_transport = {
+	.get_local_cid            = virtio_transport_get_local_cid,
+
+	.init                     = virtio_transport_socket_init,
+	.destruct                 = virtio_transport_destruct,
+	.release                  = virtio_transport_release,
+	.connect                  = virtio_transport_connect,
+	.shutdown                 = virtio_transport_shutdown,
+
+	.dgram_bind               = virtio_transport_dgram_bind,
+	.dgram_dequeue            = virtio_transport_dgram_dequeue,
+	.dgram_enqueue            = virtio_transport_dgram_enqueue,
+	.dgram_allow              = virtio_transport_dgram_allow,
+
+	.stream_dequeue           = virtio_transport_stream_dequeue,
+	.stream_enqueue           = virtio_transport_stream_enqueue,
+	.stream_has_data          = virtio_transport_stream_has_data,
+	.stream_has_space         = virtio_transport_stream_has_space,
+	.stream_rcvhiwat          = virtio_transport_stream_rcvhiwat,
+	.stream_is_active         = virtio_transport_stream_is_active,
+	.stream_allow             = virtio_transport_stream_allow,
+
+	.notify_poll_in           = virtio_transport_notify_poll_in,
+	.notify_poll_out          = virtio_transport_notify_poll_out,
+	.notify_recv_init         = virtio_transport_notify_recv_init,
+	.notify_recv_pre_block    = virtio_transport_notify_recv_pre_block,
+	.notify_recv_pre_dequeue  = virtio_transport_notify_recv_pre_dequeue,
+	.notify_recv_post_dequeue = virtio_transport_notify_recv_post_dequeue,
+	.notify_send_init         = virtio_transport_notify_send_init,
+	.notify_send_pre_block    = virtio_transport_notify_send_pre_block,
+	.notify_send_pre_enqueue  = virtio_transport_notify_send_pre_enqueue,
+	.notify_send_post_enqueue = virtio_transport_notify_send_post_enqueue,
+
+	.set_buffer_size          = virtio_transport_set_buffer_size,
+	.set_min_buffer_size      = virtio_transport_set_min_buffer_size,
+	.set_max_buffer_size      = virtio_transport_set_max_buffer_size,
+	.get_buffer_size          = virtio_transport_get_buffer_size,
+	.get_min_buffer_size      = virtio_transport_get_min_buffer_size,
+	.get_max_buffer_size      = virtio_transport_get_max_buffer_size,
+};
+
+static int virtio_vsock_probe(struct virtio_device *vdev)
+{
+	vq_callback_t *callbacks[] = {
+		virtio_vsock_ctrl_done,
+		virtio_vsock_rx_done,
+		virtio_vsock_tx_done,
+	};
+	const char *names[] = {
+		"ctrl",
+		"rx",
+		"tx",
+	};
+	struct virtio_vsock *vsock = NULL;
+	u32 guest_cid;
+	int ret;
+
+	ret = mutex_lock_interruptible(&the_virtio_vsock_mutex);
+	if (ret)
+		return ret;
+
+	/* Only one virtio-vsock device per guest is supported */
+	if (the_virtio_vsock) {
+		ret = -EBUSY;
+		goto out;
+	}
+
+	vsock = kzalloc(sizeof(*vsock), GFP_KERNEL);
+	if (!vsock) {
+		ret = -ENOMEM;
+		goto out;
+	}
+
+	vsock->vdev = vdev;
+
+	ret = vsock->vdev->config->find_vqs(vsock->vdev, VSOCK_VQ_MAX,
+					    vsock->vqs, callbacks, names);
+	if (ret < 0)
+		goto out;
+
+	vdev->config->get(vdev, offsetof(struct virtio_vsock_config, guest_cid),
+			  &guest_cid, sizeof(guest_cid));
+	vsock->guest_cid = le32_to_cpu(guest_cid);
+	pr_debug("%s: guest_cid=%u\n", __func__, vsock->guest_cid);
+
+	ret = vsock_core_init(&virtio_transport);
+	if (ret < 0)
+		goto out_vqs;
+
+	vsock->rx_buf_nr = 0;
+	vsock->rx_buf_max_nr = 0;
+
+	vdev->priv = the_virtio_vsock = vsock;
+	init_waitqueue_head(&vsock->queue_wait);
+	mutex_init(&vsock->tx_lock);
+	mutex_init(&vsock->rx_lock);
+	INIT_WORK(&vsock->rx_work, virtio_transport_recv_pkt_work);
+	INIT_WORK(&vsock->tx_work, virtio_transport_send_pkt_work);
+
+	mutex_lock(&vsock->rx_lock);
+	virtio_vsock_rx_fill(vsock);
+	mutex_unlock(&vsock->rx_lock);
+
+	mutex_unlock(&the_virtio_vsock_mutex);
+	return 0;
+
+out_vqs:
+	vsock->vdev->config->del_vqs(vsock->vdev);
+out:
+	kfree(vsock);
+	mutex_unlock(&the_virtio_vsock_mutex);
+	return ret;
+}
+
+static void virtio_vsock_remove(struct virtio_device *vdev)
+{
+	struct virtio_vsock *vsock = vdev->priv;
+
+	mutex_lock(&the_virtio_vsock_mutex);
+	the_virtio_vsock = NULL;
+	vsock_core_exit();
+	mutex_unlock(&the_virtio_vsock_mutex);
+
+	kfree(vsock);
+}
+
+static struct virtio_device_id id_table[] = {
+	{ VIRTIO_ID_VSOCK, VIRTIO_DEV_ANY_ID },
+	{ 0 },
+};
+
+static unsigned int features[] = {
+};
+
+static struct virtio_driver virtio_vsock_driver = {
+	.feature_table = features,
+	.feature_table_size = ARRAY_SIZE(features),
+	.driver.name = KBUILD_MODNAME,
+	.driver.owner = THIS_MODULE,
+	.id_table = id_table,
+	.probe = virtio_vsock_probe,
+	.remove = virtio_vsock_remove,
+};
+
+static int __init virtio_vsock_init(void)
+{
+	int ret;
+
+	virtio_vsock_workqueue = alloc_workqueue("virtio_vsock", 0, 0);
+	if (!virtio_vsock_workqueue)
+		return -ENOMEM;
+	ret = register_virtio_driver(&virtio_vsock_driver);
+	if (ret)
+		destroy_workqueue(virtio_vsock_workqueue);
+	return ret;
+}
+
+static void __exit virtio_vsock_exit(void)
+{
+	unregister_virtio_driver(&virtio_vsock_driver);
+	destroy_workqueue(virtio_vsock_workqueue);
+}
+
+module_init(virtio_vsock_init);
+module_exit(virtio_vsock_exit);
+MODULE_LICENSE("GPL v2");
+MODULE_AUTHOR("Asias He");
+MODULE_DESCRIPTION("virtio transport for vsock");
+MODULE_DEVICE_TABLE(virtio, id_table);
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v3 3/4] VSOCK: Introduce vhost-vsock.ko
  2015-12-09 12:03 [PATCH v3 0/4] Add virtio transport for AF_VSOCK Stefan Hajnoczi
                   ` (3 preceding siblings ...)
  2015-12-09 12:03 ` Stefan Hajnoczi
@ 2015-12-09 12:03 ` Stefan Hajnoczi
  2015-12-11 13:45   ` Alex Bennée
  2015-12-11 13:45   ` Alex Bennée
  2015-12-09 12:03 ` Stefan Hajnoczi
                   ` (4 subsequent siblings)
  9 siblings, 2 replies; 23+ messages in thread
From: Stefan Hajnoczi @ 2015-12-09 12:03 UTC (permalink / raw)
  To: kvm
  Cc: Matt Benjamin, Christoffer Dall, netdev, Michael S. Tsirkin,
	matt.ma, virtualization, Asias He, Stefan Hajnoczi

From: Asias He <asias@redhat.com>

VM sockets vhost transport implementation. This module runs in host
kernel.

Signed-off-by: Asias He <asias@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
v3:
 * Remove unneeded variable used to store return value
   (Fengguang Wu <fengguang.wu@intel.com> and Julia Lawall
   <julia.lawall@lip6.fr>)
v2:
 * Add missing total_tx_buf decrement
 * Support flexible rx/tx descriptor layout
 * Refuse to assign reserved CIDs
 * Refuse guest CID if already in use
 * Only accept correctly addressed packets
---
 drivers/vhost/vsock.c | 628 ++++++++++++++++++++++++++++++++++++++++++++++++++
 drivers/vhost/vsock.h |   4 +
 2 files changed, 632 insertions(+)
 create mode 100644 drivers/vhost/vsock.c
 create mode 100644 drivers/vhost/vsock.h

diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
new file mode 100644
index 0000000..3c0034a
--- /dev/null
+++ b/drivers/vhost/vsock.c
@@ -0,0 +1,628 @@
+/*
+ * vhost transport for vsock
+ *
+ * Copyright (C) 2013-2015 Red Hat, Inc.
+ * Author: Asias He <asias@redhat.com>
+ *         Stefan Hajnoczi <stefanha@redhat.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.
+ */
+#include <linux/miscdevice.h>
+#include <linux/module.h>
+#include <linux/mutex.h>
+#include <net/sock.h>
+#include <linux/virtio_vsock.h>
+#include <linux/vhost.h>
+
+#include <net/af_vsock.h>
+#include "vhost.h"
+#include "vsock.h"
+
+#define VHOST_VSOCK_DEFAULT_HOST_CID	2
+
+static int vhost_transport_socket_init(struct vsock_sock *vsk,
+				       struct vsock_sock *psk);
+
+enum {
+	VHOST_VSOCK_FEATURES = VHOST_FEATURES,
+};
+
+/* Used to track all the vhost_vsock instances on the system. */
+static LIST_HEAD(vhost_vsock_list);
+static DEFINE_MUTEX(vhost_vsock_mutex);
+
+struct vhost_vsock_virtqueue {
+	struct vhost_virtqueue vq;
+};
+
+struct vhost_vsock {
+	/* Vhost device */
+	struct vhost_dev dev;
+	/* Vhost vsock virtqueue */
+	struct vhost_vsock_virtqueue vqs[VSOCK_VQ_MAX];
+	/* Link to global vhost_vsock_list */
+	struct list_head list;
+	/* Head for pkt from host to guest */
+	struct list_head send_pkt_list;
+	/* Work item to send pkt */
+	struct vhost_work send_pkt_work;
+	/* Wait queue for send pkt */
+	wait_queue_head_t queue_wait;
+	/* Used for global tx buf limitation */
+	u32 total_tx_buf;
+	/* Guest context id this vhost_vsock instance handles */
+	u32 guest_cid;
+};
+
+static u32 vhost_transport_get_local_cid(void)
+{
+	return VHOST_VSOCK_DEFAULT_HOST_CID;
+}
+
+static struct vhost_vsock *vhost_vsock_get(u32 guest_cid)
+{
+	struct vhost_vsock *vsock;
+
+	mutex_lock(&vhost_vsock_mutex);
+	list_for_each_entry(vsock, &vhost_vsock_list, list) {
+		if (vsock->guest_cid == guest_cid) {
+			mutex_unlock(&vhost_vsock_mutex);
+			return vsock;
+		}
+	}
+	mutex_unlock(&vhost_vsock_mutex);
+
+	return NULL;
+}
+
+static void
+vhost_transport_do_send_pkt(struct vhost_vsock *vsock,
+			    struct vhost_virtqueue *vq)
+{
+	bool added = false;
+
+	mutex_lock(&vq->mutex);
+	vhost_disable_notify(&vsock->dev, vq);
+	for (;;) {
+		struct virtio_vsock_pkt *pkt;
+		struct iov_iter iov_iter;
+		unsigned out, in;
+		struct sock *sk;
+		size_t nbytes;
+		size_t len;
+		int head;
+
+		if (list_empty(&vsock->send_pkt_list)) {
+			vhost_enable_notify(&vsock->dev, vq);
+			break;
+		}
+
+		head = vhost_get_vq_desc(vq, vq->iov, ARRAY_SIZE(vq->iov),
+					 &out, &in, NULL, NULL);
+		pr_debug("%s: head = %d\n", __func__, head);
+		if (head < 0)
+			break;
+
+		if (head == vq->num) {
+			if (unlikely(vhost_enable_notify(&vsock->dev, vq))) {
+				vhost_disable_notify(&vsock->dev, vq);
+				continue;
+			}
+			break;
+		}
+
+		pkt = list_first_entry(&vsock->send_pkt_list,
+				       struct virtio_vsock_pkt, list);
+		list_del_init(&pkt->list);
+
+		if (out) {
+			virtio_transport_free_pkt(pkt);
+			vq_err(vq, "Expected 0 output buffers, got %u\n", out);
+			break;
+		}
+
+		len = iov_length(&vq->iov[out], in);
+		iov_iter_init(&iov_iter, READ, &vq->iov[out], in, len);
+
+		nbytes = copy_to_iter(&pkt->hdr, sizeof(pkt->hdr), &iov_iter);
+		if (nbytes != sizeof(pkt->hdr)) {
+			virtio_transport_free_pkt(pkt);
+			vq_err(vq, "Faulted on copying pkt hdr\n");
+			break;
+		}
+
+		nbytes = copy_to_iter(pkt->buf, pkt->len, &iov_iter);
+		if (nbytes != pkt->len) {
+			virtio_transport_free_pkt(pkt);
+			vq_err(vq, "Faulted on copying pkt buf\n");
+			break;
+		}
+
+		vhost_add_used(vq, head, pkt->len); /* TODO should this be sizeof(pkt->hdr) + pkt->len? */
+		added = true;
+
+		virtio_transport_dec_tx_pkt(pkt);
+		vsock->total_tx_buf -= pkt->len;
+
+		sk = sk_vsock(pkt->trans->vsk);
+		/* Release refcnt taken in vhost_transport_send_pkt */
+		sock_put(sk);
+
+		virtio_transport_free_pkt(pkt);
+	}
+	if (added)
+		vhost_signal(&vsock->dev, vq);
+	mutex_unlock(&vq->mutex);
+
+	if (added)
+		wake_up(&vsock->queue_wait);
+}
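The copy sequence in vhost_transport_do_send_pkt() writes the header and then the payload into the guest's rx descriptor, reporting pkt->len as the used length (the inline TODO notes the header size question). A toy serializer over a flat buffer shows the same shape — the kernel streams through an iov_iter instead, and `toy_hdr` is a made-up stand-in for the real wire header:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct toy_hdr { uint32_t src_cid, dst_cid, len; };

/* Write header then payload into descriptor memory; fail (as the vq_err
 * paths do) if the descriptor cannot hold both. Returns bytes written. */
static int fill_rx_desc(uint8_t *dst, size_t cap,
			const struct toy_hdr *hdr,
			const uint8_t *payload)
{
	if (cap < sizeof(*hdr) + hdr->len)
		return -1; /* would fault: descriptor too small */
	memcpy(dst, hdr, sizeof(*hdr));
	memcpy(dst + sizeof(*hdr), payload, hdr->len);
	return (int)(sizeof(*hdr) + hdr->len);
}
```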
+
+static void vhost_transport_send_pkt_work(struct vhost_work *work)
+{
+	struct vhost_virtqueue *vq;
+	struct vhost_vsock *vsock;
+
+	vsock = container_of(work, struct vhost_vsock, send_pkt_work);
+	vq = &vsock->vqs[VSOCK_VQ_RX].vq;
+
+	vhost_transport_do_send_pkt(vsock, vq);
+}
+
+static int
+vhost_transport_send_pkt(struct vsock_sock *vsk,
+			 struct virtio_vsock_pkt_info *info)
+{
+	u32 src_cid, src_port, dst_cid, dst_port;
+	struct virtio_transport *trans;
+	struct virtio_vsock_pkt *pkt;
+	struct vhost_virtqueue *vq;
+	struct vhost_vsock *vsock;
+	u32 pkt_len = info->pkt_len;
+	DEFINE_WAIT(wait);
+
+	src_cid = vhost_transport_get_local_cid();
+	src_port = vsk->local_addr.svm_port;
+	if (!info->remote_cid) {
+		dst_cid	= vsk->remote_addr.svm_cid;
+		dst_port = vsk->remote_addr.svm_port;
+	} else {
+		dst_cid = info->remote_cid;
+		dst_port = info->remote_port;
+	}
+
+	/* Find the vhost_vsock according to guest context id */
+	vsock = vhost_vsock_get(dst_cid);
+	if (!vsock)
+		return -ENODEV;
+
+	trans = vsk->trans;
+	vq = &vsock->vqs[VSOCK_VQ_RX].vq;
+
+	/* We may send fewer than pkt_len bytes */
+	if (pkt_len > VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE)
+		pkt_len = VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE;
+
+	/* virtio_transport_get_credit might return less than pkt_len credit */
+	pkt_len = virtio_transport_get_credit(trans, pkt_len);
+
+	/* Do not send zero length OP_RW pkt */
+	if (pkt_len == 0 && info->op == VIRTIO_VSOCK_OP_RW)
+		return pkt_len;
+
+	/* Respect global tx buf limitation */
+	mutex_lock(&vq->mutex);
+	while (pkt_len + vsock->total_tx_buf > VIRTIO_VSOCK_MAX_TX_BUF_SIZE) {
+		prepare_to_wait_exclusive(&vsock->queue_wait, &wait,
+					  TASK_UNINTERRUPTIBLE);
+		mutex_unlock(&vq->mutex);
+		schedule();
+		mutex_lock(&vq->mutex);
+		finish_wait(&vsock->queue_wait, &wait);
+	}
+	vsock->total_tx_buf += pkt_len;
+	mutex_unlock(&vq->mutex);
+
+	pkt = virtio_transport_alloc_pkt(vsk, info, pkt_len,
+					 src_cid, src_port,
+					 dst_cid, dst_port);
+	if (!pkt) {
+		mutex_lock(&vq->mutex);
+		vsock->total_tx_buf -= pkt_len;
+		mutex_unlock(&vq->mutex);
+		virtio_transport_put_credit(trans, pkt_len);
+		return -ENOMEM;
+	}
+
+	pr_debug("%s: pkt_len=%d\n", __func__, pkt_len);
+	/* Released in vhost_transport_do_send_pkt */
+	sock_hold(&trans->vsk->sk);
+	virtio_transport_inc_tx_pkt(pkt);
+
+	/* Queue it up in vhost work */
+	mutex_lock(&vq->mutex);
+	list_add_tail(&pkt->list, &vsock->send_pkt_list);
+	vhost_work_queue(&vsock->dev, &vsock->send_pkt_work);
+	mutex_unlock(&vq->mutex);
+
+	return pkt_len;
+}
+
+static struct virtio_transport_pkt_ops vhost_ops = {
+	.send_pkt = vhost_transport_send_pkt,
+};
+
+static struct virtio_vsock_pkt *
+vhost_vsock_alloc_pkt(struct vhost_virtqueue *vq,
+		      unsigned int out, unsigned int in)
+{
+	struct virtio_vsock_pkt *pkt;
+	struct iov_iter iov_iter;
+	size_t nbytes;
+	size_t len;
+
+	if (in != 0) {
+		vq_err(vq, "Expected 0 input buffers, got %u\n", in);
+		return NULL;
+	}
+
+	pkt = kzalloc(sizeof(*pkt), GFP_KERNEL);
+	if (!pkt)
+		return NULL;
+
+	len = iov_length(vq->iov, out);
+	iov_iter_init(&iov_iter, WRITE, vq->iov, out, len);
+
+	nbytes = copy_from_iter(&pkt->hdr, sizeof(pkt->hdr), &iov_iter);
+	if (nbytes != sizeof(pkt->hdr)) {
+		vq_err(vq, "Expected %zu bytes for pkt->hdr, got %zu bytes\n",
+		       sizeof(pkt->hdr), nbytes);
+		kfree(pkt);
+		return NULL;
+	}
+
+	if (le16_to_cpu(pkt->hdr.type) == VIRTIO_VSOCK_TYPE_STREAM)
+		pkt->len = le32_to_cpu(pkt->hdr.len);
+
+	/* No payload */
+	if (!pkt->len)
+		return pkt;
+
+	/* The pkt is too big */
+	if (pkt->len > VIRTIO_VSOCK_MAX_PKT_BUF_SIZE) {
+		kfree(pkt);
+		return NULL;
+	}
+
+	pkt->buf = kmalloc(pkt->len, GFP_KERNEL);
+	if (!pkt->buf) {
+		kfree(pkt);
+		return NULL;
+	}
+
+	nbytes = copy_from_iter(pkt->buf, pkt->len, &iov_iter);
+	if (nbytes != pkt->len) {
+		vq_err(vq, "Expected %u byte payload, got %zu bytes\n",
+		       pkt->len, nbytes);
+		virtio_transport_free_pkt(pkt);
+		return NULL;
+	}
+
+	return pkt;
+}
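The validation order in vhost_vsock_alloc_pkt() matters: hdr.len is only trusted for stream packets, empty payloads are accepted as header-only packets, and anything over the per-packet cap is rejected before allocating a buffer for untrusted guest input. A compact userspace restatement of that check — the struct, constants, and helper are illustrative, and the le16/le32 conversions are elided (assumes a little-endian host):

```c
#include <assert.h>
#include <stdint.h>

#define MAX_PKT_BUF_SIZE (1024u * 64u) /* stand-in for VIRTIO_VSOCK_MAX_PKT_BUF_SIZE */
#define TYPE_STREAM 1                  /* stand-in for VIRTIO_VSOCK_TYPE_STREAM */

struct hdr {
	uint16_t type;
	uint32_t len;
};

/* Return payload length to copy, 0 for header-only packets, -1 to drop. */
static int payload_len(const struct hdr *h)
{
	uint32_t len = 0;

	if (h->type == TYPE_STREAM)
		len = h->len; /* only stream packets carry a trusted length */
	if (len == 0)
		return 0;     /* no payload to allocate or copy */
	if (len > MAX_PKT_BUF_SIZE)
		return -1;    /* guest-supplied length too large */
	return (int)len;
}
```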
+
+static void vhost_vsock_handle_ctl_kick(struct vhost_work *work)
+{
+	struct vhost_virtqueue *vq = container_of(work, struct vhost_virtqueue,
+						  poll.work);
+	struct vhost_vsock *vsock = container_of(vq->dev, struct vhost_vsock,
+						 dev);
+
+	pr_debug("%s vq=%p, vsock=%p\n", __func__, vq, vsock);
+}
+
+static void vhost_vsock_handle_tx_kick(struct vhost_work *work)
+{
+	struct vhost_virtqueue *vq = container_of(work, struct vhost_virtqueue,
+						  poll.work);
+	struct vhost_vsock *vsock = container_of(vq->dev, struct vhost_vsock,
+						 dev);
+	struct virtio_vsock_pkt *pkt;
+	int head;
+	unsigned int out, in;
+	bool added = false;
+	u32 len;
+
+	mutex_lock(&vq->mutex);
+	vhost_disable_notify(&vsock->dev, vq);
+	for (;;) {
+		head = vhost_get_vq_desc(vq, vq->iov, ARRAY_SIZE(vq->iov),
+					 &out, &in, NULL, NULL);
+		if (head < 0)
+			break;
+
+		if (head == vq->num) {
+			if (unlikely(vhost_enable_notify(&vsock->dev, vq))) {
+				vhost_disable_notify(&vsock->dev, vq);
+				continue;
+			}
+			break;
+		}
+
+		pkt = vhost_vsock_alloc_pkt(vq, out, in);
+		if (!pkt) {
+			vq_err(vq, "Faulted on pkt\n");
+			continue;
+		}
+
+		len = pkt->len;
+
+		/* Only accept correctly addressed packets */
+		if (le32_to_cpu(pkt->hdr.src_cid) == vsock->guest_cid &&
+		    le32_to_cpu(pkt->hdr.dst_cid) == vhost_transport_get_local_cid())
+			virtio_transport_recv_pkt(pkt);
+		else
+			virtio_transport_free_pkt(pkt);
+
+		vhost_add_used(vq, head, len);
+		added = true;
+	}
+	if (added)
+		vhost_signal(&vsock->dev, vq);
+	mutex_unlock(&vq->mutex);
+}
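The anti-spoofing filter in the tx kick handler reduces to one predicate: a packet is only delivered if its source CID matches the guest that sent it and its destination is the host. Sketched in isolation (HOST_CID mirrors VHOST_VSOCK_DEFAULT_HOST_CID; everything else is dropped via virtio_transport_free_pkt()):

```c
#include <assert.h>
#include <stdint.h>

#define HOST_CID 2 /* stand-in for VHOST_VSOCK_DEFAULT_HOST_CID */

/* Accept only correctly addressed guest->host packets (no spoofing). */
static int addr_ok(uint32_t src_cid, uint32_t dst_cid, uint32_t guest_cid)
{
	return src_cid == guest_cid && dst_cid == HOST_CID;
}
```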
+
+static void vhost_vsock_handle_rx_kick(struct vhost_work *work)
+{
+	struct vhost_virtqueue *vq = container_of(work, struct vhost_virtqueue,
+						poll.work);
+	struct vhost_vsock *vsock = container_of(vq->dev, struct vhost_vsock,
+						 dev);
+
+	vhost_transport_do_send_pkt(vsock, vq);
+}
+
+static int vhost_vsock_dev_open(struct inode *inode, struct file *file)
+{
+	struct vhost_virtqueue **vqs;
+	struct vhost_vsock *vsock;
+	int ret;
+
+	vsock = kzalloc(sizeof(*vsock), GFP_KERNEL);
+	if (!vsock)
+		return -ENOMEM;
+
+	pr_debug("%s:vsock=%p\n", __func__, vsock);
+
+	vqs = kmalloc(VSOCK_VQ_MAX * sizeof(*vqs), GFP_KERNEL);
+	if (!vqs) {
+		ret = -ENOMEM;
+		goto out;
+	}
+
+	vqs[VSOCK_VQ_CTRL] = &vsock->vqs[VSOCK_VQ_CTRL].vq;
+	vqs[VSOCK_VQ_TX] = &vsock->vqs[VSOCK_VQ_TX].vq;
+	vqs[VSOCK_VQ_RX] = &vsock->vqs[VSOCK_VQ_RX].vq;
+	vsock->vqs[VSOCK_VQ_CTRL].vq.handle_kick = vhost_vsock_handle_ctl_kick;
+	vsock->vqs[VSOCK_VQ_TX].vq.handle_kick = vhost_vsock_handle_tx_kick;
+	vsock->vqs[VSOCK_VQ_RX].vq.handle_kick = vhost_vsock_handle_rx_kick;
+
+	vhost_dev_init(&vsock->dev, vqs, VSOCK_VQ_MAX);
+
+	file->private_data = vsock;
+	init_waitqueue_head(&vsock->queue_wait);
+	INIT_LIST_HEAD(&vsock->send_pkt_list);
+	vhost_work_init(&vsock->send_pkt_work, vhost_transport_send_pkt_work);
+
+	mutex_lock(&vhost_vsock_mutex);
+	list_add_tail(&vsock->list, &vhost_vsock_list);
+	mutex_unlock(&vhost_vsock_mutex);
+	return 0;
+
+out:
+	kfree(vsock);
+	return ret;
+}
+
+static void vhost_vsock_flush(struct vhost_vsock *vsock)
+{
+	int i;
+
+	for (i = 0; i < VSOCK_VQ_MAX; i++)
+		vhost_poll_flush(&vsock->vqs[i].vq.poll);
+	vhost_work_flush(&vsock->dev, &vsock->send_pkt_work);
+}
+
+static int vhost_vsock_dev_release(struct inode *inode, struct file *file)
+{
+	struct vhost_vsock *vsock = file->private_data;
+
+	mutex_lock(&vhost_vsock_mutex);
+	list_del(&vsock->list);
+	mutex_unlock(&vhost_vsock_mutex);
+
+	vhost_dev_stop(&vsock->dev);
+	vhost_vsock_flush(vsock);
+	vhost_dev_cleanup(&vsock->dev, false);
+	kfree(vsock->dev.vqs);
+	kfree(vsock);
+	return 0;
+}
+
+static int vhost_vsock_set_cid(struct vhost_vsock *vsock, u32 guest_cid)
+{
+	struct vhost_vsock *other;
+
+	/* Refuse reserved CIDs */
+	if (guest_cid <= VMADDR_CID_HOST)
+		return -EINVAL;
+
+	/* Refuse if CID is already in use */
+	other = vhost_vsock_get(guest_cid);
+	if (other && other != vsock)
+		return -EADDRINUSE;
+
+	mutex_lock(&vhost_vsock_mutex);
+	vsock->guest_cid = guest_cid;
+	pr_debug("%s: guest_cid=%u\n", __func__, guest_cid);
+	mutex_unlock(&vhost_vsock_mutex);
+
+	return 0;
+}
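The two checks in vhost_vsock_set_cid() can be restated compactly: CIDs up to and including the host's own are reserved, and a CID already claimed by another instance is refused. An illustrative userspace version, where `in_use_by_other` stands in for the vhost_vsock_get() lookup and the errno values are spelled out numerically:

```c
#include <assert.h>
#include <stdint.h>

#define CID_HOST 2 /* stand-in for VMADDR_CID_HOST */

/* Return 0 if the guest CID may be assigned, or a negative errno. */
static int check_guest_cid(uint32_t cid, int in_use_by_other)
{
	if (cid <= CID_HOST)
		return -22; /* -EINVAL: reserved CID */
	if (in_use_by_other)
		return -98; /* -EADDRINUSE: claimed by another vhost_vsock */
	return 0;
}
```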
+
+static int vhost_vsock_set_features(struct vhost_vsock *vsock, u64 features)
+{
+	struct vhost_virtqueue *vq;
+	int i;
+
+	if (features & ~VHOST_VSOCK_FEATURES)
+		return -EOPNOTSUPP;
+
+	mutex_lock(&vsock->dev.mutex);
+	if ((features & (1 << VHOST_F_LOG_ALL)) &&
+	    !vhost_log_access_ok(&vsock->dev)) {
+		mutex_unlock(&vsock->dev.mutex);
+		return -EFAULT;
+	}
+
+	for (i = 0; i < VSOCK_VQ_MAX; i++) {
+		vq = &vsock->vqs[i].vq;
+		mutex_lock(&vq->mutex);
+		vq->acked_features = features;
+		mutex_unlock(&vq->mutex);
+	}
+	mutex_unlock(&vsock->dev.mutex);
+	return 0;
+}
+
+static long vhost_vsock_dev_ioctl(struct file *f, unsigned int ioctl,
+				  unsigned long arg)
+{
+	struct vhost_vsock *vsock = f->private_data;
+	void __user *argp = (void __user *)arg;
+	u64 __user *featurep = argp;
+	u32 __user *cidp = argp;
+	u32 guest_cid;
+	u64 features;
+	int r;
+
+	switch (ioctl) {
+	case VHOST_VSOCK_SET_GUEST_CID:
+		if (get_user(guest_cid, cidp))
+			return -EFAULT;
+		return vhost_vsock_set_cid(vsock, guest_cid);
+	case VHOST_GET_FEATURES:
+		features = VHOST_VSOCK_FEATURES;
+		if (copy_to_user(featurep, &features, sizeof(features)))
+			return -EFAULT;
+		return 0;
+	case VHOST_SET_FEATURES:
+		if (copy_from_user(&features, featurep, sizeof(features)))
+			return -EFAULT;
+		return vhost_vsock_set_features(vsock, features);
+	default:
+		mutex_lock(&vsock->dev.mutex);
+		r = vhost_dev_ioctl(&vsock->dev, ioctl, argp);
+		if (r == -ENOIOCTLCMD)
+			r = vhost_vring_ioctl(&vsock->dev, ioctl, argp);
+		else
+			vhost_vsock_flush(vsock);
+		mutex_unlock(&vsock->dev.mutex);
+		return r;
+	}
+}
+
+static const struct file_operations vhost_vsock_fops = {
+	.owner          = THIS_MODULE,
+	.open           = vhost_vsock_dev_open,
+	.release        = vhost_vsock_dev_release,
+	.llseek		= noop_llseek,
+	.unlocked_ioctl = vhost_vsock_dev_ioctl,
+};
+
+static struct miscdevice vhost_vsock_misc = {
+	.minor = MISC_DYNAMIC_MINOR,
+	.name = "vhost-vsock",
+	.fops = &vhost_vsock_fops,
+};
+
+static int
+vhost_transport_socket_init(struct vsock_sock *vsk, struct vsock_sock *psk)
+{
+	struct virtio_transport *trans;
+	int ret;
+
+	ret = virtio_transport_do_socket_init(vsk, psk);
+	if (ret)
+		return ret;
+
+	trans = vsk->trans;
+	trans->ops = &vhost_ops;
+
+	return ret;
+}
+
+static struct vsock_transport vhost_transport = {
+	.get_local_cid            = vhost_transport_get_local_cid,
+
+	.init                     = vhost_transport_socket_init,
+	.destruct                 = virtio_transport_destruct,
+	.release                  = virtio_transport_release,
+	.connect                  = virtio_transport_connect,
+	.shutdown                 = virtio_transport_shutdown,
+
+	.dgram_enqueue            = virtio_transport_dgram_enqueue,
+	.dgram_dequeue            = virtio_transport_dgram_dequeue,
+	.dgram_bind               = virtio_transport_dgram_bind,
+	.dgram_allow              = virtio_transport_dgram_allow,
+
+	.stream_enqueue           = virtio_transport_stream_enqueue,
+	.stream_dequeue           = virtio_transport_stream_dequeue,
+	.stream_has_data          = virtio_transport_stream_has_data,
+	.stream_has_space         = virtio_transport_stream_has_space,
+	.stream_rcvhiwat          = virtio_transport_stream_rcvhiwat,
+	.stream_is_active         = virtio_transport_stream_is_active,
+	.stream_allow             = virtio_transport_stream_allow,
+
+	.notify_poll_in           = virtio_transport_notify_poll_in,
+	.notify_poll_out          = virtio_transport_notify_poll_out,
+	.notify_recv_init         = virtio_transport_notify_recv_init,
+	.notify_recv_pre_block    = virtio_transport_notify_recv_pre_block,
+	.notify_recv_pre_dequeue  = virtio_transport_notify_recv_pre_dequeue,
+	.notify_recv_post_dequeue = virtio_transport_notify_recv_post_dequeue,
+	.notify_send_init         = virtio_transport_notify_send_init,
+	.notify_send_pre_block    = virtio_transport_notify_send_pre_block,
+	.notify_send_pre_enqueue  = virtio_transport_notify_send_pre_enqueue,
+	.notify_send_post_enqueue = virtio_transport_notify_send_post_enqueue,
+
+	.set_buffer_size          = virtio_transport_set_buffer_size,
+	.set_min_buffer_size      = virtio_transport_set_min_buffer_size,
+	.set_max_buffer_size      = virtio_transport_set_max_buffer_size,
+	.get_buffer_size          = virtio_transport_get_buffer_size,
+	.get_min_buffer_size      = virtio_transport_get_min_buffer_size,
+	.get_max_buffer_size      = virtio_transport_get_max_buffer_size,
+};
+
+static int __init vhost_vsock_init(void)
+{
+	int ret;
+
+	ret = vsock_core_init(&vhost_transport);
+	if (ret < 0)
+		return ret;
+	return misc_register(&vhost_vsock_misc);
+}
+
+static void __exit vhost_vsock_exit(void)
+{
+	misc_deregister(&vhost_vsock_misc);
+	vsock_core_exit();
+}
+
+module_init(vhost_vsock_init);
+module_exit(vhost_vsock_exit);
+MODULE_LICENSE("GPL v2");
+MODULE_AUTHOR("Asias He");
+MODULE_DESCRIPTION("vhost transport for vsock");
diff --git a/drivers/vhost/vsock.h b/drivers/vhost/vsock.h
new file mode 100644
index 0000000..0ddb107
--- /dev/null
+++ b/drivers/vhost/vsock.h
@@ -0,0 +1,4 @@
+#ifndef VHOST_VSOCK_H
+#define VHOST_VSOCK_H
+#define VHOST_VSOCK_SET_GUEST_CID _IOW(VHOST_VIRTIO, 0x60, __u32)
+#endif
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v3 3/4] VSOCK: Introduce vhost-vsock.ko
  2015-12-09 12:03 [PATCH v3 0/4] Add virtio transport for AF_VSOCK Stefan Hajnoczi
                   ` (4 preceding siblings ...)
  2015-12-09 12:03 ` [PATCH v3 3/4] VSOCK: Introduce vhost-vsock.ko Stefan Hajnoczi
@ 2015-12-09 12:03 ` Stefan Hajnoczi
  2015-12-09 12:03 ` [PATCH v3 4/4] VSOCK: Add Makefile and Kconfig Stefan Hajnoczi
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 23+ messages in thread
From: Stefan Hajnoczi @ 2015-12-09 12:03 UTC (permalink / raw)
  To: kvm
  Cc: Stefan Hajnoczi, Michael S. Tsirkin, netdev, virtualization,
	Matt Benjamin, Asias He, Christoffer Dall, matt.ma

From: Asias He <asias@redhat.com>

VM sockets vhost transport implementation. This module runs in host
kernel.

Signed-off-by: Asias He <asias@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
v3:
 * Remove unneeded variable used to store return value
   (Fengguang Wu <fengguang.wu@intel.com> and Julia Lawall
   <julia.lawall@lip6.fr>)
v2:
 * Add missing total_tx_buf decrement
 * Support flexible rx/tx descriptor layout
 * Refuse to assign reserved CIDs
 * Refuse guest CID if already in use
 * Only accept correctly addressed packets
---
 drivers/vhost/vsock.c | 628 ++++++++++++++++++++++++++++++++++++++++++++++++++
 drivers/vhost/vsock.h |   4 +
 2 files changed, 632 insertions(+)
 create mode 100644 drivers/vhost/vsock.c
 create mode 100644 drivers/vhost/vsock.h

diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
new file mode 100644
index 0000000..3c0034a
--- /dev/null
+++ b/drivers/vhost/vsock.c
@@ -0,0 +1,628 @@
+/*
+ * vhost transport for vsock
+ *
+ * Copyright (C) 2013-2015 Red Hat, Inc.
+ * Author: Asias He <asias@redhat.com>
+ *         Stefan Hajnoczi <stefanha@redhat.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.
+ */
+#include <linux/miscdevice.h>
+#include <linux/module.h>
+#include <linux/mutex.h>
+#include <net/sock.h>
+#include <linux/virtio_vsock.h>
+#include <linux/vhost.h>
+
+#include <net/af_vsock.h>
+#include "vhost.h"
+#include "vsock.h"
+
+#define VHOST_VSOCK_DEFAULT_HOST_CID	2
+
+static int vhost_transport_socket_init(struct vsock_sock *vsk,
+				       struct vsock_sock *psk);
+
+enum {
+	VHOST_VSOCK_FEATURES = VHOST_FEATURES,
+};
+
+/* Used to track all the vhost_vsock instances on the system. */
+static LIST_HEAD(vhost_vsock_list);
+static DEFINE_MUTEX(vhost_vsock_mutex);
+
+struct vhost_vsock_virtqueue {
+	struct vhost_virtqueue vq;
+};
+
+struct vhost_vsock {
+	/* Vhost device */
+	struct vhost_dev dev;
+	/* Vhost vsock virtqueue*/
+	struct vhost_vsock_virtqueue vqs[VSOCK_VQ_MAX];
+	/* Link to global vhost_vsock_list*/
+	struct list_head list;
+	/* Head for pkt from host to guest */
+	struct list_head send_pkt_list;
+	/* Work item to send pkt */
+	struct vhost_work send_pkt_work;
+	/* Wait queue for send pkt */
+	wait_queue_head_t queue_wait;
+	/* Used for global tx buf limitation */
+	u32 total_tx_buf;
+	/* Guest contex id this vhost_vsock instance handles */
+	u32 guest_cid;
+};
+
+static u32 vhost_transport_get_local_cid(void)
+{
+	return VHOST_VSOCK_DEFAULT_HOST_CID;
+}
+
+static struct vhost_vsock *vhost_vsock_get(u32 guest_cid)
+{
+	struct vhost_vsock *vsock;
+
+	mutex_lock(&vhost_vsock_mutex);
+	list_for_each_entry(vsock, &vhost_vsock_list, list) {
+		if (vsock->guest_cid == guest_cid) {
+			mutex_unlock(&vhost_vsock_mutex);
+			return vsock;
+		}
+	}
+	mutex_unlock(&vhost_vsock_mutex);
+
+	return NULL;
+}
+
+static void
+vhost_transport_do_send_pkt(struct vhost_vsock *vsock,
+			    struct vhost_virtqueue *vq)
+{
+	bool added = false;
+
+	mutex_lock(&vq->mutex);
+	vhost_disable_notify(&vsock->dev, vq);
+	for (;;) {
+		struct virtio_vsock_pkt *pkt;
+		struct iov_iter iov_iter;
+		unsigned out, in;
+		struct sock *sk;
+		size_t nbytes;
+		size_t len;
+		int head;
+
+		if (list_empty(&vsock->send_pkt_list)) {
+			vhost_enable_notify(&vsock->dev, vq);
+			break;
+		}
+
+		head = vhost_get_vq_desc(vq, vq->iov, ARRAY_SIZE(vq->iov),
+					 &out, &in, NULL, NULL);
+		pr_debug("%s: head = %d\n", __func__, head);
+		if (head < 0)
+			break;
+
+		if (head == vq->num) {
+			if (unlikely(vhost_enable_notify(&vsock->dev, vq))) {
+				vhost_disable_notify(&vsock->dev, vq);
+				continue;
+			}
+			break;
+		}
+
+		pkt = list_first_entry(&vsock->send_pkt_list,
+				       struct virtio_vsock_pkt, list);
+		list_del_init(&pkt->list);
+
+		if (out) {
+			virtio_transport_free_pkt(pkt);
+			vq_err(vq, "Expected 0 output buffers, got %u\n", out);
+			break;
+		}
+
+		len = iov_length(&vq->iov[out], in);
+		iov_iter_init(&iov_iter, READ, &vq->iov[out], in, len);
+
+		nbytes = copy_to_iter(&pkt->hdr, sizeof(pkt->hdr), &iov_iter);
+		if (nbytes != sizeof(pkt->hdr)) {
+			virtio_transport_free_pkt(pkt);
+			vq_err(vq, "Faulted on copying pkt hdr\n");
+			break;
+		}
+
+		nbytes = copy_to_iter(pkt->buf, pkt->len, &iov_iter);
+		if (nbytes != pkt->len) {
+			virtio_transport_free_pkt(pkt);
+			vq_err(vq, "Faulted on copying pkt buf\n");
+			break;
+		}
+
+		vhost_add_used(vq, head, sizeof(pkt->hdr) + pkt->len);
+		added = true;
+
+		virtio_transport_dec_tx_pkt(pkt);
+		vsock->total_tx_buf -= pkt->len;
+
+		sk = sk_vsock(pkt->trans->vsk);
+		/* Release refcnt taken in vhost_transport_send_pkt */
+		sock_put(sk);
+
+		virtio_transport_free_pkt(pkt);
+	}
+	if (added)
+		vhost_signal(&vsock->dev, vq);
+	mutex_unlock(&vq->mutex);
+
+	if (added)
+		wake_up(&vsock->queue_wait);
+}
+
+static void vhost_transport_send_pkt_work(struct vhost_work *work)
+{
+	struct vhost_virtqueue *vq;
+	struct vhost_vsock *vsock;
+
+	vsock = container_of(work, struct vhost_vsock, send_pkt_work);
+	vq = &vsock->vqs[VSOCK_VQ_RX].vq;
+
+	vhost_transport_do_send_pkt(vsock, vq);
+}
+
+static int
+vhost_transport_send_pkt(struct vsock_sock *vsk,
+			 struct virtio_vsock_pkt_info *info)
+{
+	u32 src_cid, src_port, dst_cid, dst_port;
+	struct virtio_transport *trans;
+	struct virtio_vsock_pkt *pkt;
+	struct vhost_virtqueue *vq;
+	struct vhost_vsock *vsock;
+	u32 pkt_len = info->pkt_len;
+	DEFINE_WAIT(wait);
+
+	src_cid = vhost_transport_get_local_cid();
+	src_port = vsk->local_addr.svm_port;
+	if (!info->remote_cid) {
+		dst_cid	= vsk->remote_addr.svm_cid;
+		dst_port = vsk->remote_addr.svm_port;
+	} else {
+		dst_cid = info->remote_cid;
+		dst_port = info->remote_port;
+	}
+
+	/* Find the vhost_vsock according to the guest context ID */
+	vsock = vhost_vsock_get(dst_cid);
+	if (!vsock)
+		return -ENODEV;
+
+	trans = vsk->trans;
+	vq = &vsock->vqs[VSOCK_VQ_RX].vq;
+
+	/* We may send fewer than pkt_len bytes */
+	if (pkt_len > VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE)
+		pkt_len = VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE;
+
+	/* virtio_transport_get_credit might return less than pkt_len credit */
+	pkt_len = virtio_transport_get_credit(trans, pkt_len);
+
+	/* Do not send a zero-length OP_RW pkt */
+	if (pkt_len == 0 && info->op == VIRTIO_VSOCK_OP_RW)
+		return pkt_len;
+
+	/* Respect global tx buf limitation */
+	mutex_lock(&vq->mutex);
+	while (pkt_len + vsock->total_tx_buf > VIRTIO_VSOCK_MAX_TX_BUF_SIZE) {
+		prepare_to_wait_exclusive(&vsock->queue_wait, &wait,
+					  TASK_UNINTERRUPTIBLE);
+		mutex_unlock(&vq->mutex);
+		schedule();
+		mutex_lock(&vq->mutex);
+		finish_wait(&vsock->queue_wait, &wait);
+	}
+	vsock->total_tx_buf += pkt_len;
+	mutex_unlock(&vq->mutex);
+
+	pkt = virtio_transport_alloc_pkt(vsk, info, pkt_len,
+					 src_cid, src_port,
+					 dst_cid, dst_port);
+	if (!pkt) {
+		mutex_lock(&vq->mutex);
+		vsock->total_tx_buf -= pkt_len;
+		mutex_unlock(&vq->mutex);
+		virtio_transport_put_credit(trans, pkt_len);
+		return -ENOMEM;
+	}
+
+	pr_debug("%s: pkt_len=%u\n", __func__, pkt_len);
+	/* Released in vhost_transport_do_send_pkt */
+	sock_hold(&trans->vsk->sk);
+	virtio_transport_inc_tx_pkt(pkt);
+
+	/* Queue it up in vhost work */
+	mutex_lock(&vq->mutex);
+	list_add_tail(&pkt->list, &vsock->send_pkt_list);
+	vhost_work_queue(&vsock->dev, &vsock->send_pkt_work);
+	mutex_unlock(&vq->mutex);
+
+	return pkt_len;
+}
+
+static struct virtio_transport_pkt_ops vhost_ops = {
+	.send_pkt = vhost_transport_send_pkt,
+};
+
+static struct virtio_vsock_pkt *
+vhost_vsock_alloc_pkt(struct vhost_virtqueue *vq,
+		      unsigned int out, unsigned int in)
+{
+	struct virtio_vsock_pkt *pkt;
+	struct iov_iter iov_iter;
+	size_t nbytes;
+	size_t len;
+
+	if (in != 0) {
+		vq_err(vq, "Expected 0 input buffers, got %u\n", in);
+		return NULL;
+	}
+
+	pkt = kzalloc(sizeof(*pkt), GFP_KERNEL);
+	if (!pkt)
+		return NULL;
+
+	len = iov_length(vq->iov, out);
+	iov_iter_init(&iov_iter, WRITE, vq->iov, out, len);
+
+	nbytes = copy_from_iter(&pkt->hdr, sizeof(pkt->hdr), &iov_iter);
+	if (nbytes != sizeof(pkt->hdr)) {
+		vq_err(vq, "Expected %zu bytes for pkt->hdr, got %zu bytes\n",
+		       sizeof(pkt->hdr), nbytes);
+		kfree(pkt);
+		return NULL;
+	}
+
+	if (le16_to_cpu(pkt->hdr.type) == VIRTIO_VSOCK_TYPE_STREAM)
+		pkt->len = le32_to_cpu(pkt->hdr.len);
+
+	/* No payload */
+	if (!pkt->len)
+		return pkt;
+
+	/* The pkt is too big */
+	if (pkt->len > VIRTIO_VSOCK_MAX_PKT_BUF_SIZE) {
+		kfree(pkt);
+		return NULL;
+	}
+
+	pkt->buf = kmalloc(pkt->len, GFP_KERNEL);
+	if (!pkt->buf) {
+		kfree(pkt);
+		return NULL;
+	}
+
+	nbytes = copy_from_iter(pkt->buf, pkt->len, &iov_iter);
+	if (nbytes != pkt->len) {
+		vq_err(vq, "Expected %u byte payload, got %zu bytes\n",
+		       pkt->len, nbytes);
+		virtio_transport_free_pkt(pkt);
+		return NULL;
+	}
+
+	return pkt;
+}
+
+static void vhost_vsock_handle_ctl_kick(struct vhost_work *work)
+{
+	struct vhost_virtqueue *vq = container_of(work, struct vhost_virtqueue,
+						  poll.work);
+	struct vhost_vsock *vsock = container_of(vq->dev, struct vhost_vsock,
+						 dev);
+
+	pr_debug("%s vq=%p, vsock=%p\n", __func__, vq, vsock);
+}
+
+static void vhost_vsock_handle_tx_kick(struct vhost_work *work)
+{
+	struct vhost_virtqueue *vq = container_of(work, struct vhost_virtqueue,
+						  poll.work);
+	struct vhost_vsock *vsock = container_of(vq->dev, struct vhost_vsock,
+						 dev);
+	struct virtio_vsock_pkt *pkt;
+	int head;
+	unsigned int out, in;
+	bool added = false;
+	u32 len;
+
+	mutex_lock(&vq->mutex);
+	vhost_disable_notify(&vsock->dev, vq);
+	for (;;) {
+		head = vhost_get_vq_desc(vq, vq->iov, ARRAY_SIZE(vq->iov),
+					 &out, &in, NULL, NULL);
+		if (head < 0)
+			break;
+
+		if (head == vq->num) {
+			if (unlikely(vhost_enable_notify(&vsock->dev, vq))) {
+				vhost_disable_notify(&vsock->dev, vq);
+				continue;
+			}
+			break;
+		}
+
+		pkt = vhost_vsock_alloc_pkt(vq, out, in);
+		if (!pkt) {
+			vq_err(vq, "Faulted on pkt\n");
+			continue;
+		}
+
+		len = pkt->len;
+
+		/* Only accept correctly addressed packets */
+		if (le32_to_cpu(pkt->hdr.src_cid) == vsock->guest_cid &&
+		    le32_to_cpu(pkt->hdr.dst_cid) == vhost_transport_get_local_cid())
+			virtio_transport_recv_pkt(pkt);
+		else
+			virtio_transport_free_pkt(pkt);
+
+		vhost_add_used(vq, head, len);
+		added = true;
+	}
+	if (added)
+		vhost_signal(&vsock->dev, vq);
+	mutex_unlock(&vq->mutex);
+}
+
+static void vhost_vsock_handle_rx_kick(struct vhost_work *work)
+{
+	struct vhost_virtqueue *vq = container_of(work, struct vhost_virtqueue,
+						poll.work);
+	struct vhost_vsock *vsock = container_of(vq->dev, struct vhost_vsock,
+						 dev);
+
+	vhost_transport_do_send_pkt(vsock, vq);
+}
+
+static int vhost_vsock_dev_open(struct inode *inode, struct file *file)
+{
+	struct vhost_virtqueue **vqs;
+	struct vhost_vsock *vsock;
+	int ret;
+
+	vsock = kzalloc(sizeof(*vsock), GFP_KERNEL);
+	if (!vsock)
+		return -ENOMEM;
+
+	pr_debug("%s: vsock=%p\n", __func__, vsock);
+
+	vqs = kmalloc(VSOCK_VQ_MAX * sizeof(*vqs), GFP_KERNEL);
+	if (!vqs) {
+		ret = -ENOMEM;
+		goto out;
+	}
+
+	vqs[VSOCK_VQ_CTRL] = &vsock->vqs[VSOCK_VQ_CTRL].vq;
+	vqs[VSOCK_VQ_TX] = &vsock->vqs[VSOCK_VQ_TX].vq;
+	vqs[VSOCK_VQ_RX] = &vsock->vqs[VSOCK_VQ_RX].vq;
+	vsock->vqs[VSOCK_VQ_CTRL].vq.handle_kick = vhost_vsock_handle_ctl_kick;
+	vsock->vqs[VSOCK_VQ_TX].vq.handle_kick = vhost_vsock_handle_tx_kick;
+	vsock->vqs[VSOCK_VQ_RX].vq.handle_kick = vhost_vsock_handle_rx_kick;
+
+	vhost_dev_init(&vsock->dev, vqs, VSOCK_VQ_MAX);
+
+	file->private_data = vsock;
+	init_waitqueue_head(&vsock->queue_wait);
+	INIT_LIST_HEAD(&vsock->send_pkt_list);
+	vhost_work_init(&vsock->send_pkt_work, vhost_transport_send_pkt_work);
+
+	mutex_lock(&vhost_vsock_mutex);
+	list_add_tail(&vsock->list, &vhost_vsock_list);
+	mutex_unlock(&vhost_vsock_mutex);
+	return 0;
+
+out:
+	kfree(vsock);
+	return ret;
+}
+
+static void vhost_vsock_flush(struct vhost_vsock *vsock)
+{
+	int i;
+
+	for (i = 0; i < VSOCK_VQ_MAX; i++)
+		vhost_poll_flush(&vsock->vqs[i].vq.poll);
+	vhost_work_flush(&vsock->dev, &vsock->send_pkt_work);
+}
+
+static int vhost_vsock_dev_release(struct inode *inode, struct file *file)
+{
+	struct vhost_vsock *vsock = file->private_data;
+
+	mutex_lock(&vhost_vsock_mutex);
+	list_del(&vsock->list);
+	mutex_unlock(&vhost_vsock_mutex);
+
+	vhost_dev_stop(&vsock->dev);
+	vhost_vsock_flush(vsock);
+	vhost_dev_cleanup(&vsock->dev, false);
+	kfree(vsock->dev.vqs);
+	kfree(vsock);
+	return 0;
+}
+
+static int vhost_vsock_set_cid(struct vhost_vsock *vsock, u32 guest_cid)
+{
+	struct vhost_vsock *other;
+
+	/* Refuse reserved CIDs */
+	if (guest_cid <= VMADDR_CID_HOST) {
+		return -EINVAL;
+	}
+
+	/* Refuse if CID is already in use */
+	other = vhost_vsock_get(guest_cid);
+	if (other && other != vsock) {
+		return -EADDRINUSE;
+	}
+
+	mutex_lock(&vhost_vsock_mutex);
+	vsock->guest_cid = guest_cid;
+	pr_debug("%s: guest_cid=%u\n", __func__, guest_cid);
+	mutex_unlock(&vhost_vsock_mutex);
+
+	return 0;
+}
+
+static int vhost_vsock_set_features(struct vhost_vsock *vsock, u64 features)
+{
+	struct vhost_virtqueue *vq;
+	int i;
+
+	if (features & ~VHOST_VSOCK_FEATURES)
+		return -EOPNOTSUPP;
+
+	mutex_lock(&vsock->dev.mutex);
+	if ((features & (1 << VHOST_F_LOG_ALL)) &&
+	    !vhost_log_access_ok(&vsock->dev)) {
+		mutex_unlock(&vsock->dev.mutex);
+		return -EFAULT;
+	}
+
+	for (i = 0; i < VSOCK_VQ_MAX; i++) {
+		vq = &vsock->vqs[i].vq;
+		mutex_lock(&vq->mutex);
+		vq->acked_features = features;
+		mutex_unlock(&vq->mutex);
+	}
+	mutex_unlock(&vsock->dev.mutex);
+	return 0;
+}
+
+static long vhost_vsock_dev_ioctl(struct file *f, unsigned int ioctl,
+				  unsigned long arg)
+{
+	struct vhost_vsock *vsock = f->private_data;
+	void __user *argp = (void __user *)arg;
+	u64 __user *featurep = argp;
+	u32 __user *cidp = argp;
+	u32 guest_cid;
+	u64 features;
+	int r;
+
+	switch (ioctl) {
+	case VHOST_VSOCK_SET_GUEST_CID:
+		if (get_user(guest_cid, cidp))
+			return -EFAULT;
+		return vhost_vsock_set_cid(vsock, guest_cid);
+	case VHOST_GET_FEATURES:
+		features = VHOST_VSOCK_FEATURES;
+		if (copy_to_user(featurep, &features, sizeof(features)))
+			return -EFAULT;
+		return 0;
+	case VHOST_SET_FEATURES:
+		if (copy_from_user(&features, featurep, sizeof(features)))
+			return -EFAULT;
+		return vhost_vsock_set_features(vsock, features);
+	default:
+		mutex_lock(&vsock->dev.mutex);
+		r = vhost_dev_ioctl(&vsock->dev, ioctl, argp);
+		if (r == -ENOIOCTLCMD)
+			r = vhost_vring_ioctl(&vsock->dev, ioctl, argp);
+		else
+			vhost_vsock_flush(vsock);
+		mutex_unlock(&vsock->dev.mutex);
+		return r;
+	}
+}
+
+static const struct file_operations vhost_vsock_fops = {
+	.owner          = THIS_MODULE,
+	.open           = vhost_vsock_dev_open,
+	.release        = vhost_vsock_dev_release,
+	.llseek		= noop_llseek,
+	.unlocked_ioctl = vhost_vsock_dev_ioctl,
+};
+
+static struct miscdevice vhost_vsock_misc = {
+	.minor = MISC_DYNAMIC_MINOR,
+	.name = "vhost-vsock",
+	.fops = &vhost_vsock_fops,
+};
+
+static int
+vhost_transport_socket_init(struct vsock_sock *vsk, struct vsock_sock *psk)
+{
+	struct virtio_transport *trans;
+	int ret;
+
+	ret = virtio_transport_do_socket_init(vsk, psk);
+	if (ret)
+		return ret;
+
+	trans = vsk->trans;
+	trans->ops = &vhost_ops;
+
+	return ret;
+}
+
+static struct vsock_transport vhost_transport = {
+	.get_local_cid            = vhost_transport_get_local_cid,
+
+	.init                     = vhost_transport_socket_init,
+	.destruct                 = virtio_transport_destruct,
+	.release                  = virtio_transport_release,
+	.connect                  = virtio_transport_connect,
+	.shutdown                 = virtio_transport_shutdown,
+
+	.dgram_enqueue            = virtio_transport_dgram_enqueue,
+	.dgram_dequeue            = virtio_transport_dgram_dequeue,
+	.dgram_bind               = virtio_transport_dgram_bind,
+	.dgram_allow              = virtio_transport_dgram_allow,
+
+	.stream_enqueue           = virtio_transport_stream_enqueue,
+	.stream_dequeue           = virtio_transport_stream_dequeue,
+	.stream_has_data          = virtio_transport_stream_has_data,
+	.stream_has_space         = virtio_transport_stream_has_space,
+	.stream_rcvhiwat          = virtio_transport_stream_rcvhiwat,
+	.stream_is_active         = virtio_transport_stream_is_active,
+	.stream_allow             = virtio_transport_stream_allow,
+
+	.notify_poll_in           = virtio_transport_notify_poll_in,
+	.notify_poll_out          = virtio_transport_notify_poll_out,
+	.notify_recv_init         = virtio_transport_notify_recv_init,
+	.notify_recv_pre_block    = virtio_transport_notify_recv_pre_block,
+	.notify_recv_pre_dequeue  = virtio_transport_notify_recv_pre_dequeue,
+	.notify_recv_post_dequeue = virtio_transport_notify_recv_post_dequeue,
+	.notify_send_init         = virtio_transport_notify_send_init,
+	.notify_send_pre_block    = virtio_transport_notify_send_pre_block,
+	.notify_send_pre_enqueue  = virtio_transport_notify_send_pre_enqueue,
+	.notify_send_post_enqueue = virtio_transport_notify_send_post_enqueue,
+
+	.set_buffer_size          = virtio_transport_set_buffer_size,
+	.set_min_buffer_size      = virtio_transport_set_min_buffer_size,
+	.set_max_buffer_size      = virtio_transport_set_max_buffer_size,
+	.get_buffer_size          = virtio_transport_get_buffer_size,
+	.get_min_buffer_size      = virtio_transport_get_min_buffer_size,
+	.get_max_buffer_size      = virtio_transport_get_max_buffer_size,
+};
+
+static int __init vhost_vsock_init(void)
+{
+	int ret;
+
+	ret = vsock_core_init(&vhost_transport);
+	if (ret < 0)
+		return ret;
+	return misc_register(&vhost_vsock_misc);
+}
+
+static void __exit vhost_vsock_exit(void)
+{
+	misc_deregister(&vhost_vsock_misc);
+	vsock_core_exit();
+}
+
+module_init(vhost_vsock_init);
+module_exit(vhost_vsock_exit);
+MODULE_LICENSE("GPL v2");
+MODULE_AUTHOR("Asias He");
+MODULE_DESCRIPTION("vhost transport for vsock");
diff --git a/drivers/vhost/vsock.h b/drivers/vhost/vsock.h
new file mode 100644
index 0000000..0ddb107
--- /dev/null
+++ b/drivers/vhost/vsock.h
@@ -0,0 +1,4 @@
+#ifndef VHOST_VSOCK_H
+#define VHOST_VSOCK_H
+#define VHOST_VSOCK_SET_GUEST_CID _IOW(VHOST_VIRTIO, 0x60, __u32)
+#endif
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v3 4/4] VSOCK: Add Makefile and Kconfig
  2015-12-09 12:03 [PATCH v3 0/4] Add virtio transport for AF_VSOCK Stefan Hajnoczi
                   ` (5 preceding siblings ...)
  2015-12-09 12:03 ` Stefan Hajnoczi
@ 2015-12-09 12:03 ` Stefan Hajnoczi
  2015-12-11 17:19   ` Alex Bennée
  2015-12-09 12:03 ` Stefan Hajnoczi
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 23+ messages in thread
From: Stefan Hajnoczi @ 2015-12-09 12:03 UTC (permalink / raw)
  To: kvm
  Cc: Matt Benjamin, Christoffer Dall, netdev, Michael S. Tsirkin,
	matt.ma, virtualization, Asias He, Stefan Hajnoczi

From: Asias He <asias@redhat.com>

Enable virtio-vsock and vhost-vsock.

Signed-off-by: Asias He <asias@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
v3:
 * Don't put vhost vsock driver into staging
 * Add missing Kconfig dependencies (Arnd Bergmann <arnd@arndb.de>)
---
 drivers/vhost/Kconfig  | 10 ++++++++++
 drivers/vhost/Makefile |  4 ++++
 net/vmw_vsock/Kconfig  | 18 ++++++++++++++++++
 net/vmw_vsock/Makefile |  2 ++
 4 files changed, 34 insertions(+)

diff --git a/drivers/vhost/Kconfig b/drivers/vhost/Kconfig
index 533eaf0..a1bb4c2 100644
--- a/drivers/vhost/Kconfig
+++ b/drivers/vhost/Kconfig
@@ -21,6 +21,16 @@ config VHOST_SCSI
 	Say M here to enable the vhost_scsi TCM fabric module
 	for use with virtio-scsi guests
 
+config VHOST_VSOCK
+	tristate "vhost virtio-vsock driver"
+	depends on VSOCKETS && EVENTFD
+	select VIRTIO_VSOCKETS_COMMON
+	select VHOST
+	select VHOST_RING
+	default n
+	---help---
+	Say M here to enable the vhost-vsock driver for virtio-vsock guests
+
 config VHOST_RING
 	tristate
 	---help---
diff --git a/drivers/vhost/Makefile b/drivers/vhost/Makefile
index e0441c3..6b012b9 100644
--- a/drivers/vhost/Makefile
+++ b/drivers/vhost/Makefile
@@ -4,5 +4,9 @@ vhost_net-y := net.o
 obj-$(CONFIG_VHOST_SCSI) += vhost_scsi.o
 vhost_scsi-y := scsi.o
 
+obj-$(CONFIG_VHOST_VSOCK) += vhost_vsock.o
+vhost_vsock-y := vsock.o
+
 obj-$(CONFIG_VHOST_RING) += vringh.o
+
 obj-$(CONFIG_VHOST)	+= vhost.o
diff --git a/net/vmw_vsock/Kconfig b/net/vmw_vsock/Kconfig
index 14810ab..74e0bc8 100644
--- a/net/vmw_vsock/Kconfig
+++ b/net/vmw_vsock/Kconfig
@@ -26,3 +26,21 @@ config VMWARE_VMCI_VSOCKETS
 
 	  To compile this driver as a module, choose M here: the module
 	  will be called vmw_vsock_vmci_transport. If unsure, say N.
+
+config VIRTIO_VSOCKETS
+	tristate "virtio transport for Virtual Sockets"
+	depends on VSOCKETS && VIRTIO
+	select VIRTIO_VSOCKETS_COMMON
+	help
+	  This module implements a virtio transport for Virtual Sockets.
+
+	  Enable this transport if your Virtual Machine runs on QEMU/KVM.
+
+	  To compile this driver as a module, choose M here: the module
+	  will be called virtio_vsock_transport. If unsure, say N.
+
+config VIRTIO_VSOCKETS_COMMON
+       tristate
+       ---help---
+         This option is selected by any driver which needs to access
+         the virtio_vsock common code.
diff --git a/net/vmw_vsock/Makefile b/net/vmw_vsock/Makefile
index 2ce52d7..cf4c294 100644
--- a/net/vmw_vsock/Makefile
+++ b/net/vmw_vsock/Makefile
@@ -1,5 +1,7 @@
 obj-$(CONFIG_VSOCKETS) += vsock.o
 obj-$(CONFIG_VMWARE_VMCI_VSOCKETS) += vmw_vsock_vmci_transport.o
+obj-$(CONFIG_VIRTIO_VSOCKETS) += virtio_transport.o
+obj-$(CONFIG_VIRTIO_VSOCKETS_COMMON) += virtio_transport_common.o
 
 vsock-y += af_vsock.o vsock_addr.o
 
-- 
2.5.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 0/4] Add virtio transport for AF_VSOCK
  2015-12-09 12:03 [PATCH v3 0/4] Add virtio transport for AF_VSOCK Stefan Hajnoczi
                   ` (8 preceding siblings ...)
  2015-12-09 20:12 ` [PATCH v3 0/4] Add virtio transport for AF_VSOCK Michael S. Tsirkin
@ 2015-12-09 20:12 ` Michael S. Tsirkin
  9 siblings, 0 replies; 23+ messages in thread
From: Michael S. Tsirkin @ 2015-12-09 20:12 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: kvm, Matt Benjamin, Christoffer Dall, netdev, matt.ma, virtualization

On Wed, Dec 09, 2015 at 08:03:49PM +0800, Stefan Hajnoczi wrote:
> Note: the virtio-vsock device specification is currently under review but not
> yet finalized.  Please review this code but don't merge until I send an update
> when the spec is finalized.  Thanks!

Yes, this should have RFC in the subject.

> v3:
>  * Remove unnecessary 3-way handshake, just do REQUEST/RESPONSE instead
>    of REQUEST/RESPONSE/ACK
>  * Remove SOCK_DGRAM support and focus on SOCK_STREAM first
>    (also drop v2 Patch 1, it's only needed for SOCK_DGRAM)
>  * Only allow host->guest connections (same security model as latest
>    VMware)
>  * Don't put vhost vsock driver into staging
>  * Add missing Kconfig dependencies (Arnd Bergmann <arnd@arndb.de>)
>  * Remove unneeded variable used to store return value
>    (Fengguang Wu <fengguang.wu@intel.com> and Julia Lawall
>    <julia.lawall@lip6.fr>)
> 
> v2:
>  * Rebased onto Linux v4.4-rc2
>  * vhost: Refuse to assign reserved CIDs
>  * vhost: Refuse guest CID if already in use
>  * vhost: Only accept correctly addressed packets (no spoofing!)
>  * vhost: Support flexible rx/tx descriptor layout
>  * vhost: Add missing total_tx_buf decrement
>  * virtio_transport: Fix total_tx_buf accounting
>  * virtio_transport: Add virtio_transport global mutex to prevent races
>  * common: Notify other side of SOCK_STREAM disconnect (fixes shutdown
>    semantics)
>  * common: Avoid recursive mutex_lock(tx_lock) for write_space (fixes deadlock)
>  * common: Define VIRTIO_VSOCK_TYPE_STREAM/DGRAM hardware interface constants
>  * common: Define VIRTIO_VSOCK_SHUTDOWN_RCV/SEND hardware interface constants
>  * common: Fix peer_buf_alloc inheritance on child socket
> 
> This patch series adds a virtio transport for AF_VSOCK (net/vmw_vsock/).
> AF_VSOCK is designed for communication between virtual machines and
> hypervisors.  It is currently only implemented for VMware's VMCI transport.
> 
> This series implements the proposed virtio-vsock device specification from
> here:
> http://permalink.gmane.org/gmane.comp.emulators.virtio.devel/980
> 
> Most of the work was done by Asias He and Gerd Hoffmann a while back.  I have
> picked up the series again.
> 
> The QEMU userspace changes are here:
> https://github.com/stefanha/qemu/commits/vsock
> 
> Why virtio-vsock?
> -----------------
> Guest<->host communication is currently done over the virtio-serial device.
> This makes it hard to port sockets API-based applications and is limited to
> static ports.
> 
> virtio-vsock uses the sockets API so that applications can rely on familiar
> SOCK_STREAM semantics.  Applications on the host can easily connect to guest
> agents because the sockets API allows multiple connections to a listen socket
> (unlike virtio-serial).  This simplifies the guest<->host communication and
> eliminates the need for extra processes on the host to arbitrate virtio-serial
> ports.
> 
> Overview
> --------
> This series adds 3 pieces:
> 
> 1. virtio_transport_common.ko - core virtio vsock code that uses vsock.ko
> 
> 2. virtio_transport.ko - guest driver
> 
> 3. drivers/vhost/vsock.ko - host driver
> 
> Howto
> -----
> The following kernel options are needed:
>   CONFIG_VSOCKETS=y
>   CONFIG_VIRTIO_VSOCKETS=y
>   CONFIG_VIRTIO_VSOCKETS_COMMON=y
>   CONFIG_VHOST_VSOCK=m
> 
> Launch QEMU as follows:
>   # qemu ... -device vhost-vsock-pci,id=vhost-vsock-pci0,guest-cid=3
> 
> Guest and host can communicate via AF_VSOCK sockets.  The host's CID (address)
> is 2 and the guest must be assigned a CID (3 in the example above).
> 
> Status
> ------
> This patch series implements the latest draft specification.  Please review.
> 
> Asias He (4):
>   VSOCK: Introduce virtio-vsock-common.ko
>   VSOCK: Introduce virtio-vsock.ko
>   VSOCK: Introduce vhost-vsock.ko
>   VSOCK: Add Makefile and Kconfig
> 
>  drivers/vhost/Kconfig                   |  10 +
>  drivers/vhost/Makefile                  |   4 +
>  drivers/vhost/vsock.c                   | 628 +++++++++++++++++++++++
>  drivers/vhost/vsock.h                   |   4 +
>  include/linux/virtio_vsock.h            | 203 ++++++++
>  include/uapi/linux/virtio_ids.h         |   1 +
>  include/uapi/linux/virtio_vsock.h       |  87 ++++
>  net/vmw_vsock/Kconfig                   |  18 +
>  net/vmw_vsock/Makefile                  |   2 +
>  net/vmw_vsock/virtio_transport.c        | 466 +++++++++++++++++
>  net/vmw_vsock/virtio_transport_common.c | 854 ++++++++++++++++++++++++++++++++
>  11 files changed, 2277 insertions(+)
>  create mode 100644 drivers/vhost/vsock.c
>  create mode 100644 drivers/vhost/vsock.h
>  create mode 100644 include/linux/virtio_vsock.h
>  create mode 100644 include/uapi/linux/virtio_vsock.h
>  create mode 100644 net/vmw_vsock/virtio_transport.c
>  create mode 100644 net/vmw_vsock/virtio_transport_common.c
> 
> -- 
> 2.5.0

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 0/4] Add virtio transport for AF_VSOCK
  2015-12-09 12:03 [PATCH v3 0/4] Add virtio transport for AF_VSOCK Stefan Hajnoczi
                   ` (7 preceding siblings ...)
  2015-12-09 12:03 ` Stefan Hajnoczi
@ 2015-12-09 20:12 ` Michael S. Tsirkin
  2015-12-09 20:12 ` Michael S. Tsirkin
  9 siblings, 0 replies; 23+ messages in thread
From: Michael S. Tsirkin @ 2015-12-09 20:12 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: kvm, netdev, virtualization, Matt Benjamin, Christoffer Dall, matt.ma

On Wed, Dec 09, 2015 at 08:03:49PM +0800, Stefan Hajnoczi wrote:
> Note: the virtio-vsock device specification is currently under review but not
> yet finalized.  Please review this code but don't merge until I send an update
> when the spec is finalized.  Thanks!

Yes, this should have RFC in the subject.

> v3:
>  * Remove unnecessary 3-way handshake, just do REQUEST/RESPONSE instead
>    of REQUEST/RESPONSE/ACK
>  * Remove SOCK_DGRAM support and focus on SOCK_STREAM first
>    (also drop v2 Patch 1, it's only needed for SOCK_DGRAM)
>  * Only allow host->guest connections (same security model as latest
>    VMware)
>  * Don't put vhost vsock driver into staging
>  * Add missing Kconfig dependencies (Arnd Bergmann <arnd@arndb.de>)
>  * Remove unneeded variable used to store return value
>    (Fengguang Wu <fengguang.wu@intel.com> and Julia Lawall
>    <julia.lawall@lip6.fr>)
> 
> v2:
>  * Rebased onto Linux v4.4-rc2
>  * vhost: Refuse to assign reserved CIDs
>  * vhost: Refuse guest CID if already in use
>  * vhost: Only accept correctly addressed packets (no spoofing!)
>  * vhost: Support flexible rx/tx descriptor layout
>  * vhost: Add missing total_tx_buf decrement
>  * virtio_transport: Fix total_tx_buf accounting
>  * virtio_transport: Add virtio_transport global mutex to prevent races
>  * common: Notify other side of SOCK_STREAM disconnect (fixes shutdown
>    semantics)
>  * common: Avoid recursive mutex_lock(tx_lock) for write_space (fixes deadlock)
>  * common: Define VIRTIO_VSOCK_TYPE_STREAM/DGRAM hardware interface constants
>  * common: Define VIRTIO_VSOCK_SHUTDOWN_RCV/SEND hardware interface constants
>  * common: Fix peer_buf_alloc inheritance on child socket
> 
> This patch series adds a virtio transport for AF_VSOCK (net/vmw_vsock/).
> AF_VSOCK is designed for communication between virtual machines and
> hypervisors.  It is currently only implemented for VMware's VMCI transport.
> 
> This series implements the proposed virtio-vsock device specification from
> here:
> http://permalink.gmane.org/gmane.comp.emulators.virtio.devel/980
> 
> Most of the work was done by Asias He and Gerd Hoffmann a while back.  I have
> picked up the series again.
> 
> The QEMU userspace changes are here:
> https://github.com/stefanha/qemu/commits/vsock
> 
> Why virtio-vsock?
> -----------------
> Guest<->host communication is currently done over the virtio-serial device.
> This makes it hard to port sockets API-based applications and is limited to
> static ports.
> 
> virtio-vsock uses the sockets API so that applications can rely on familiar
> SOCK_STREAM semantics.  Applications on the host can easily connect to guest
> agents because the sockets API allows multiple connections to a listen socket
> (unlike virtio-serial).  This simplifies the guest<->host communication and
> eliminates the need for extra processes on the host to arbitrate virtio-serial
> ports.
> 
> Overview
> --------
> This series adds 3 pieces:
> 
> 1. virtio_transport_common.ko - core virtio vsock code that uses vsock.ko
> 
> 2. virtio_transport.ko - guest driver
> 
> 3. drivers/vhost/vsock.ko - host driver
> 
> Howto
> -----
> The following kernel options are needed:
>   CONFIG_VSOCKETS=y
>   CONFIG_VIRTIO_VSOCKETS=y
>   CONFIG_VIRTIO_VSOCKETS_COMMON=y
>   CONFIG_VHOST_VSOCK=m
> 
> Launch QEMU as follows:
>   # qemu ... -device vhost-vsock-pci,id=vhost-vsock-pci0,guest-cid=3
> 
> Guest and host can communicate via AF_VSOCK sockets.  The host's CID (address)
> is 2 and the guest must be assigned a CID (3 in the example above).
> 
> Status
> ------
> This patch series implements the latest draft specification.  Please review.
> 
> Asias He (4):
>   VSOCK: Introduce virtio-vsock-common.ko
>   VSOCK: Introduce virtio-vsock.ko
>   VSOCK: Introduce vhost-vsock.ko
>   VSOCK: Add Makefile and Kconfig
> 
>  drivers/vhost/Kconfig                   |  10 +
>  drivers/vhost/Makefile                  |   4 +
>  drivers/vhost/vsock.c                   | 628 +++++++++++++++++++++++
>  drivers/vhost/vsock.h                   |   4 +
>  include/linux/virtio_vsock.h            | 203 ++++++++
>  include/uapi/linux/virtio_ids.h         |   1 +
>  include/uapi/linux/virtio_vsock.h       |  87 ++++
>  net/vmw_vsock/Kconfig                   |  18 +
>  net/vmw_vsock/Makefile                  |   2 +
>  net/vmw_vsock/virtio_transport.c        | 466 +++++++++++++++++
>  net/vmw_vsock/virtio_transport_common.c | 854 ++++++++++++++++++++++++++++++++
>  11 files changed, 2277 insertions(+)
>  create mode 100644 drivers/vhost/vsock.c
>  create mode 100644 drivers/vhost/vsock.h
>  create mode 100644 include/linux/virtio_vsock.h
>  create mode 100644 include/uapi/linux/virtio_vsock.h
>  create mode 100644 net/vmw_vsock/virtio_transport.c
>  create mode 100644 net/vmw_vsock/virtio_transport_common.c
> 
> -- 
> 2.5.0


* Re: [PATCH v3 1/4] VSOCK: Introduce virtio-vsock-common.ko
  2015-12-09 12:03 ` Stefan Hajnoczi
@ 2015-12-10 10:17   ` Alex Bennée
  2015-12-11  2:51     ` Stefan Hajnoczi
  0 siblings, 1 reply; 23+ messages in thread
From: Alex Bennée @ 2015-12-10 10:17 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: kvm, Michael S. Tsirkin, netdev, virtualization, Matt Benjamin,
	Asias He, Christoffer Dall, matt.ma


Stefan Hajnoczi <stefanha@redhat.com> writes:

> From: Asias He <asias@redhat.com>
>
> This module contains the common code and header files for the following
> virtio-vsock and virtio-vhost kernel modules.

General comment: checkpatch reports a number of warnings about 80-character
limits, extra braces and BUG_ON usage.

>
> Signed-off-by: Asias He <asias@redhat.com>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
> v3:
>  * Remove unnecessary 3-way handshake, just do REQUEST/RESPONSE instead
>    of REQUEST/RESPONSE/ACK
>  * Remove SOCK_DGRAM support and focus on SOCK_STREAM first
>  * Only allow host->guest connections (same security model as latest
>    VMware)
> v2:
>  * Fix peer_buf_alloc inheritance on child socket
>  * Notify other side of SOCK_STREAM disconnect (fixes shutdown
>    semantics)
>  * Avoid recursive mutex_lock(tx_lock) for write_space (fixes deadlock)
>  * Define VIRTIO_VSOCK_TYPE_STREAM/DGRAM hardware interface constants
>  * Define VIRTIO_VSOCK_SHUTDOWN_RCV/SEND hardware interface constants
> ---
>  include/linux/virtio_vsock.h            | 203 ++++++++
>  include/uapi/linux/virtio_ids.h         |   1 +
>  include/uapi/linux/virtio_vsock.h       |  87 ++++
>  net/vmw_vsock/virtio_transport_common.c | 854 ++++++++++++++++++++++++++++++++
>  4 files changed, 1145 insertions(+)
>  create mode 100644 include/linux/virtio_vsock.h
>  create mode 100644 include/uapi/linux/virtio_vsock.h
>  create mode 100644 net/vmw_vsock/virtio_transport_common.c
>
> diff --git a/include/linux/virtio_vsock.h b/include/linux/virtio_vsock.h
> new file mode 100644
> index 0000000..e54eb45
> --- /dev/null
> +++ b/include/linux/virtio_vsock.h
> @@ -0,0 +1,203 @@
> +/*
> + * This header, excluding the #ifdef __KERNEL__ part, is BSD licensed so
> + * anyone can use the definitions to implement compatible
> drivers/servers:

Is anything in here actually exposed to userspace or the guest? The
#ifdef __KERNEL__ statement seems redundant for this file at least.

> + *
> + *
> + * Redistribution and use in source and binary forms, with or without
> + * modification, are permitted provided that the following conditions
> + * are met:
> + * 1. Redistributions of source code must retain the above copyright
> + *    notice, this list of conditions and the following disclaimer.
> + * 2. Redistributions in binary form must reproduce the above copyright
> + *    notice, this list of conditions and the following disclaimer in the
> + *    documentation and/or other materials provided with the distribution.
> + * 3. Neither the name of IBM nor the names of its contributors
> + *    may be used to endorse or promote products derived from this software
> + *    without specific prior written permission.
> + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS ``AS IS''
> + * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
> + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
> + * ARE DISCLAIMED.  IN NO EVENT SHALL IBM OR CONTRIBUTORS BE LIABLE
> + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
> + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
> + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
> + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
> + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
> + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
> + * SUCH DAMAGE.
> + *
> + * Copyright (C) Red Hat, Inc., 2013-2015
> + * Copyright (C) Asias He <asias@redhat.com>, 2013
> + * Copyright (C) Stefan Hajnoczi <stefanha@redhat.com>, 2015
> + */
> +
> +#ifndef _LINUX_VIRTIO_VSOCK_H
> +#define _LINUX_VIRTIO_VSOCK_H
> +
> +#include <uapi/linux/virtio_vsock.h>
> +#include <linux/socket.h>
> +#include <net/sock.h>
> +
> +#define VIRTIO_VSOCK_DEFAULT_MIN_BUF_SIZE	128
> +#define VIRTIO_VSOCK_DEFAULT_BUF_SIZE		(1024 * 256)
> +#define VIRTIO_VSOCK_DEFAULT_MAX_BUF_SIZE	(1024 * 256)
> +#define VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE	(1024 * 4)
> +#define VIRTIO_VSOCK_MAX_BUF_SIZE		0xFFFFFFFFUL
> +#define VIRTIO_VSOCK_MAX_PKT_BUF_SIZE		(1024 * 64)
> +#define VIRTIO_VSOCK_MAX_TX_BUF_SIZE		(1024 * 1024 * 16)
> +#define VIRTIO_VSOCK_MAX_DGRAM_SIZE		(1024 * 64)
> +
> +struct vsock_transport_recv_notify_data;
> +struct vsock_transport_send_notify_data;
> +struct sockaddr_vm;
> +struct vsock_sock;
> +
> +enum {
> +	VSOCK_VQ_CTRL	= 0,
> +	VSOCK_VQ_RX	= 1, /* for host to guest data */
> +	VSOCK_VQ_TX	= 2, /* for guest to host data */
> +	VSOCK_VQ_MAX	= 3,
> +};
> +
> +/* virtio transport socket state */
> +struct virtio_transport {
> +	struct virtio_transport_pkt_ops	*ops;
> +	struct vsock_sock *vsk;
> +
> +	u32 buf_size;
> +	u32 buf_size_min;
> +	u32 buf_size_max;
> +
> +	struct mutex tx_lock;
> +	struct mutex rx_lock;
> +
> +	struct list_head rx_queue;
> +	u32 rx_bytes;
> +
> +	/* Protected by trans->tx_lock */
> +	u32 tx_cnt;
> +	u32 buf_alloc;
> +	u32 peer_fwd_cnt;
> +	u32 peer_buf_alloc;
> +	/* Protected by trans->rx_lock */
> +	u32 fwd_cnt;
> +};
> +
> +struct virtio_vsock_pkt {
> +	struct virtio_vsock_hdr	hdr;
> +	struct virtio_transport	*trans;
> +	struct work_struct work;
> +	struct list_head list;
> +	void *buf;
> +	u32 len;
> +	u32 off;
> +};
> +
> +struct virtio_vsock_pkt_info {
> +	u32 remote_cid, remote_port;
> +	struct msghdr *msg;
> +	u32 pkt_len;
> +	u16 type;
> +	u16 op;
> +	u32 flags;
> +};
> +
> +struct virtio_transport_pkt_ops {
> +	int (*send_pkt)(struct vsock_sock *vsk,
> +			struct virtio_vsock_pkt_info *info);
> +};
> +
> +void virtio_vsock_dumppkt(const char *func,
> +			  const struct virtio_vsock_pkt *pkt);
> +
> +struct sock *
> +virtio_transport_get_pending(struct sock *listener,
> +			     struct virtio_vsock_pkt *pkt);
> +struct virtio_vsock_pkt *
> +virtio_transport_alloc_pkt(struct vsock_sock *vsk,
> +			   struct virtio_vsock_pkt_info *info,
> +			   size_t len,
> +			   u32 src_cid,
> +			   u32 src_port,
> +			   u32 dst_cid,
> +			   u32 dst_port);
> +ssize_t
> +virtio_transport_stream_dequeue(struct vsock_sock *vsk,
> +				struct msghdr *msg,
> +				size_t len,
> +				int type);
> +int
> +virtio_transport_dgram_dequeue(struct vsock_sock *vsk,
> +			       struct msghdr *msg,
> +			       size_t len, int flags);
> +
> +s64 virtio_transport_stream_has_data(struct vsock_sock *vsk);
> +s64 virtio_transport_stream_has_space(struct vsock_sock *vsk);
> +
> +int virtio_transport_do_socket_init(struct vsock_sock *vsk,
> +				 struct vsock_sock *psk);
> +u64 virtio_transport_get_buffer_size(struct vsock_sock *vsk);
> +u64 virtio_transport_get_min_buffer_size(struct vsock_sock *vsk);
> +u64 virtio_transport_get_max_buffer_size(struct vsock_sock *vsk);
> +void virtio_transport_set_buffer_size(struct vsock_sock *vsk, u64 val);
> +void virtio_transport_set_min_buffer_size(struct vsock_sock *vsk, u64 val);
> +void virtio_transport_set_max_buffer_size(struct vsock_sock *vs, u64 val);
> +int
> +virtio_transport_notify_poll_in(struct vsock_sock *vsk,
> +				size_t target,
> +				bool *data_ready_now);
> +int
> +virtio_transport_notify_poll_out(struct vsock_sock *vsk,
> +				 size_t target,
> +				 bool *space_available_now);
> +
> +int virtio_transport_notify_recv_init(struct vsock_sock *vsk,
> +	size_t target, struct vsock_transport_recv_notify_data *data);
> +int virtio_transport_notify_recv_pre_block(struct vsock_sock *vsk,
> +	size_t target, struct vsock_transport_recv_notify_data *data);
> +int virtio_transport_notify_recv_pre_dequeue(struct vsock_sock *vsk,
> +	size_t target, struct vsock_transport_recv_notify_data *data);
> +int virtio_transport_notify_recv_post_dequeue(struct vsock_sock *vsk,
> +	size_t target, ssize_t copied, bool data_read,
> +	struct vsock_transport_recv_notify_data *data);
> +int virtio_transport_notify_send_init(struct vsock_sock *vsk,
> +	struct vsock_transport_send_notify_data *data);
> +int virtio_transport_notify_send_pre_block(struct vsock_sock *vsk,
> +	struct vsock_transport_send_notify_data *data);
> +int virtio_transport_notify_send_pre_enqueue(struct vsock_sock *vsk,
> +	struct vsock_transport_send_notify_data *data);
> +int virtio_transport_notify_send_post_enqueue(struct vsock_sock *vsk,
> +	ssize_t written, struct vsock_transport_send_notify_data *data);
> +
> +u64 virtio_transport_stream_rcvhiwat(struct vsock_sock *vsk);
> +bool virtio_transport_stream_is_active(struct vsock_sock *vsk);
> +bool virtio_transport_stream_allow(u32 cid, u32 port);
> +int virtio_transport_dgram_bind(struct vsock_sock *vsk,
> +				struct sockaddr_vm *addr);
> +bool virtio_transport_dgram_allow(u32 cid, u32 port);
> +
> +int virtio_transport_connect(struct vsock_sock *vsk);
> +
> +int virtio_transport_shutdown(struct vsock_sock *vsk, int mode);
> +
> +void virtio_transport_release(struct vsock_sock *vsk);
> +
> +ssize_t
> +virtio_transport_stream_enqueue(struct vsock_sock *vsk,
> +				struct msghdr *msg,
> +				size_t len);
> +int
> +virtio_transport_dgram_enqueue(struct vsock_sock *vsk,
> +			       struct sockaddr_vm *remote_addr,
> +			       struct msghdr *msg,
> +			       size_t len);
> +
> +void virtio_transport_destruct(struct vsock_sock *vsk);
> +
> +void virtio_transport_recv_pkt(struct virtio_vsock_pkt *pkt);
> +void virtio_transport_free_pkt(struct virtio_vsock_pkt *pkt);
> +void virtio_transport_inc_tx_pkt(struct virtio_vsock_pkt *pkt);
> +void virtio_transport_dec_tx_pkt(struct virtio_vsock_pkt *pkt);
> +u32 virtio_transport_get_credit(struct virtio_transport *trans, u32 wanted);
> +void virtio_transport_put_credit(struct virtio_transport *trans, u32 credit);
> +#endif /* _LINUX_VIRTIO_VSOCK_H */
> diff --git a/include/uapi/linux/virtio_ids.h b/include/uapi/linux/virtio_ids.h
> index 77925f5..16dcf5d 100644
> --- a/include/uapi/linux/virtio_ids.h
> +++ b/include/uapi/linux/virtio_ids.h
> @@ -39,6 +39,7 @@
>  #define VIRTIO_ID_9P		9 /* 9p virtio console */
>  #define VIRTIO_ID_RPROC_SERIAL 11 /* virtio remoteproc serial link */
>  #define VIRTIO_ID_CAIF	       12 /* Virtio caif */
> +#define VIRTIO_ID_VSOCK        13 /* virtio vsock transport */
>  #define VIRTIO_ID_GPU          16 /* virtio GPU */
>  #define VIRTIO_ID_INPUT        18 /* virtio input */
>
> diff --git a/include/uapi/linux/virtio_vsock.h b/include/uapi/linux/virtio_vsock.h
> new file mode 100644
> index 0000000..ac6483d
> --- /dev/null
> +++ b/include/uapi/linux/virtio_vsock.h
> @@ -0,0 +1,87 @@
> +/*
> + * This header, excluding the #ifdef __KERNEL__ part, is BSD licensed so
> + * anyone can use the definitions to implement compatible drivers/servers:
> + *
> + *
> + * Redistribution and use in source and binary forms, with or without
> + * modification, are permitted provided that the following conditions
> + * are met:
> + * 1. Redistributions of source code must retain the above copyright
> + *    notice, this list of conditions and the following disclaimer.
> + * 2. Redistributions in binary form must reproduce the above copyright
> + *    notice, this list of conditions and the following disclaimer in the
> + *    documentation and/or other materials provided with the distribution.
> + * 3. Neither the name of IBM nor the names of its contributors
> + *    may be used to endorse or promote products derived from this software
> + *    without specific prior written permission.
> + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS ``AS IS''
> + * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
> + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
> + * ARE DISCLAIMED.  IN NO EVENT SHALL IBM OR CONTRIBUTORS BE LIABLE
> + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
> + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
> + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
> + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
> + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
> + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
> + * SUCH DAMAGE.
> + *
> + * Copyright (C) Red Hat, Inc., 2013-2015
> + * Copyright (C) Asias He <asias@redhat.com>, 2013
> + * Copyright (C) Stefan Hajnoczi <stefanha@redhat.com>, 2015
> + */
> +
> +#ifndef _UAPI_LINUX_VIRTIO_VSOCK_H
> +#define _UAPI_LINUX_VIRTIO_VSOCK_H
> +
> +#include <linux/types.h>
> +#include <linux/virtio_ids.h>
> +#include <linux/virtio_config.h>
> +
> +struct virtio_vsock_config {
> +	__le32 guest_cid;
> +	__le32 max_virtqueue_pairs;
> +};
> +
> +struct virtio_vsock_hdr {
> +	__le32	src_cid;
> +	__le32	src_port;
> +	__le32	dst_cid;
> +	__le32	dst_port;
> +	__le32	len;
> +	__le16	type;		/* enum virtio_vsock_type */
> +	__le16	op;		/* enum virtio_vsock_op */
> +	__le32	flags;
> +	__le32	buf_alloc;
> +	__le32	fwd_cnt;
> +};
> +
> +enum virtio_vsock_type {
> +	VIRTIO_VSOCK_TYPE_STREAM = 1,
> +};
> +
> +enum virtio_vsock_op {
> +	VIRTIO_VSOCK_OP_INVALID = 0,
> +
> +	/* Connect operations */
> +	VIRTIO_VSOCK_OP_REQUEST = 1,
> +	VIRTIO_VSOCK_OP_RESPONSE = 2,
> +	VIRTIO_VSOCK_OP_RST = 3,
> +	VIRTIO_VSOCK_OP_SHUTDOWN = 4,
> +
> +	/* To send payload */
> +	VIRTIO_VSOCK_OP_RW = 5,
> +
> +	/* Tell the peer our credit info */
> +	VIRTIO_VSOCK_OP_CREDIT_UPDATE = 6,
> +	/* Request the peer to send the credit info to us */
> +	VIRTIO_VSOCK_OP_CREDIT_REQUEST = 7,
> +};
> +
> +/* VIRTIO_VSOCK_OP_SHUTDOWN flags values */
> +enum virtio_vsock_shutdown {
> +	VIRTIO_VSOCK_SHUTDOWN_RCV = 1,
> +	VIRTIO_VSOCK_SHUTDOWN_SEND = 2,
> +};
> +
> +#endif /* _UAPI_LINUX_VIRTIO_VSOCK_H */
> diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
> new file mode 100644
> index 0000000..025a323
> --- /dev/null
> +++ b/net/vmw_vsock/virtio_transport_common.c
> @@ -0,0 +1,854 @@
> +/*
> + * common code for virtio vsock
> + *
> + * Copyright (C) 2013-2015 Red Hat, Inc.
> + * Author: Asias He <asias@redhat.com>
> + *         Stefan Hajnoczi <stefanha@redhat.com>
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2.
> + */
> +#include <linux/module.h>
> +#include <linux/ctype.h>
> +#include <linux/list.h>
> +#include <linux/virtio.h>
> +#include <linux/virtio_ids.h>
> +#include <linux/virtio_config.h>
> +#include <linux/virtio_vsock.h>
> +
> +#include <net/sock.h>
> +#include <net/af_vsock.h>
> +
> +void virtio_vsock_dumppkt(const char *func,  const struct virtio_vsock_pkt *pkt)
> +{
> +	pr_debug("%s: pkt=%p, op=%d, len=%d, %d:%d---%d:%d, len=%d\n",
> +		 func, pkt,
> +		 le16_to_cpu(pkt->hdr.op),
> +		 le32_to_cpu(pkt->hdr.len),
> +		 le32_to_cpu(pkt->hdr.src_cid),
> +		 le32_to_cpu(pkt->hdr.src_port),
> +		 le32_to_cpu(pkt->hdr.dst_cid),
> +		 le32_to_cpu(pkt->hdr.dst_port),
> +		 pkt->len);
> +}
> +EXPORT_SYMBOL_GPL(virtio_vsock_dumppkt);

Why export this at all? The only users are in this file, so you could
make it static.

> +
> +struct virtio_vsock_pkt *
> +virtio_transport_alloc_pkt(struct vsock_sock *vsk,
> +			   struct virtio_vsock_pkt_info *info,
> +			   size_t len,
> +			   u32 src_cid,
> +			   u32 src_port,
> +			   u32 dst_cid,
> +			   u32 dst_port)

cf. checkpatch

> +{
> +	struct virtio_transport *trans = vsk->trans;
> +	struct virtio_vsock_pkt *pkt;
> +	int err;
> +
> +	BUG_ON(!trans);

So checkpatch flags up BUG_ON usage as a potential problem. Should a
badly configured socket really take out the kernel rather than warn
gracefully and fail?

> +
> +	pkt = kzalloc(sizeof(*pkt), GFP_KERNEL);
> +	if (!pkt)
> +		return NULL;
> +
> +	pkt->hdr.type		= cpu_to_le16(info->type);
> +	pkt->hdr.op		= cpu_to_le16(info->op);
> +	pkt->hdr.src_cid	= cpu_to_le32(src_cid);
> +	pkt->hdr.src_port	= cpu_to_le32(src_port);
> +	pkt->hdr.dst_cid	= cpu_to_le32(dst_cid);
> +	pkt->hdr.dst_port	= cpu_to_le32(dst_port);
> +	pkt->hdr.flags		= cpu_to_le32(info->flags);
> +	pkt->len		= len;
> +	pkt->trans		= trans;
> +	pkt->hdr.len		= cpu_to_le32(len);
> +
> +	if (info->msg && len > 0) {
> +		pkt->buf = kmalloc(len, GFP_KERNEL);
> +		if (!pkt->buf)
> +			goto out_pkt;
> +		err = memcpy_from_msg(pkt->buf, info->msg, len);
> +		if (err)
> +			goto out;
> +	}
> +
> +	return pkt;
> +
> +out:
> +	kfree(pkt->buf);
> +out_pkt:
> +	kfree(pkt);
> +	return NULL;
> +}
> +EXPORT_SYMBOL_GPL(virtio_transport_alloc_pkt);
> +
> +struct sock *
> +virtio_transport_get_pending(struct sock *listener,
> +			     struct virtio_vsock_pkt *pkt)
> +{
> +	struct vsock_sock *vlistener;
> +	struct vsock_sock *vpending;
> +	struct sockaddr_vm src;
> +	struct sockaddr_vm dst;
> +	struct sock *pending;
> +
> +	vsock_addr_init(&src, le32_to_cpu(pkt->hdr.src_cid), le32_to_cpu(pkt->hdr.src_port));
> +	vsock_addr_init(&dst, le32_to_cpu(pkt->hdr.dst_cid), le32_to_cpu(pkt->hdr.dst_port));
> +
> +	vlistener = vsock_sk(listener);
> +	list_for_each_entry(vpending, &vlistener->pending_links,
> +			    pending_links) {
> +		if (vsock_addr_equals_addr(&src, &vpending->remote_addr) &&
> +		    vsock_addr_equals_addr(&dst, &vpending->local_addr)) {
> +			pending = sk_vsock(vpending);
> +			sock_hold(pending);
> +			return pending;
> +		}
> +	}
> +
> +	return NULL;
> +}
> +EXPORT_SYMBOL_GPL(virtio_transport_get_pending);
> +
> +static void virtio_transport_inc_rx_pkt(struct virtio_vsock_pkt *pkt)
> +{
> +	pkt->trans->rx_bytes += pkt->len;
> +}
> +
> +static void virtio_transport_dec_rx_pkt(struct virtio_vsock_pkt *pkt)
> +{
> +	pkt->trans->rx_bytes -= pkt->len;
> +	pkt->trans->fwd_cnt += pkt->len;
> +}
> +
> +void virtio_transport_inc_tx_pkt(struct virtio_vsock_pkt *pkt)
> +{
> +	mutex_lock(&pkt->trans->tx_lock);
> +	pkt->hdr.fwd_cnt = cpu_to_le32(pkt->trans->fwd_cnt);
> +	pkt->hdr.buf_alloc = cpu_to_le32(pkt->trans->buf_alloc);
> +	mutex_unlock(&pkt->trans->tx_lock);
> +}
> +EXPORT_SYMBOL_GPL(virtio_transport_inc_tx_pkt);
> +
> +void virtio_transport_dec_tx_pkt(struct virtio_vsock_pkt *pkt)
> +{
> +}
> +EXPORT_SYMBOL_GPL(virtio_transport_dec_tx_pkt);
> +
> +u32 virtio_transport_get_credit(struct virtio_transport *trans, u32 credit)
> +{
> +	u32 ret;
> +
> +	mutex_lock(&trans->tx_lock);
> +	ret = trans->peer_buf_alloc - (trans->tx_cnt - trans->peer_fwd_cnt);
> +	if (ret > credit)
> +		ret = credit;
> +	trans->tx_cnt += ret;
> +	mutex_unlock(&trans->tx_lock);
> +
> +	pr_debug("%s: ret=%d, buf_alloc=%d, peer_buf_alloc=%d,"
> +		 "tx_cnt=%d, fwd_cnt=%d, peer_fwd_cnt=%d\n", __func__,

I think __func__ is superfluous here, as the dynamic print code already
has it and can print it when required. Having said that, there seems to
be plenty of code already in the kernel that uses __func__ :-/

> +		 ret, trans->buf_alloc, trans->peer_buf_alloc,
> +		 trans->tx_cnt, trans->fwd_cnt, trans->peer_fwd_cnt);
> +
> +	return ret;
> +}
> +EXPORT_SYMBOL_GPL(virtio_transport_get_credit);
> +
> +void virtio_transport_put_credit(struct virtio_transport *trans, u32 credit)
> +{
> +	mutex_lock(&trans->tx_lock);
> +	trans->tx_cnt -= credit;
> +	mutex_unlock(&trans->tx_lock);
> +}
> +EXPORT_SYMBOL_GPL(virtio_transport_put_credit);
> +
> +static int virtio_transport_send_credit_update(struct vsock_sock *vsk, int type, struct virtio_vsock_hdr *hdr)
> +{
> +	struct virtio_transport *trans = vsk->trans;
> +	struct virtio_vsock_pkt_info info = {
> +		.op = VIRTIO_VSOCK_OP_CREDIT_UPDATE,
> +		.type = type,
> +	};
> +
> +	pr_debug("%s: sk=%p send_credit_update\n", __func__, vsk);

Again, __func__ is superfluous here.

> +	return trans->ops->send_pkt(vsk, &info);
> +}
> +
> +static ssize_t
> +virtio_transport_stream_do_dequeue(struct vsock_sock *vsk,
> +				   struct msghdr *msg,
> +				   size_t len)
> +{
> +	struct virtio_transport *trans = vsk->trans;
> +	struct virtio_vsock_pkt *pkt;
> +	size_t bytes, total = 0;
> +	int err = -EFAULT;
> +
> +	mutex_lock(&trans->rx_lock);
> +	while (total < len && trans->rx_bytes > 0  &&
> +			!list_empty(&trans->rx_queue)) {
> +		pkt = list_first_entry(&trans->rx_queue,
> +				       struct virtio_vsock_pkt, list);
> +
> +		bytes = len - total;
> +		if (bytes > pkt->len - pkt->off)
> +			bytes = pkt->len - pkt->off;
> +
> +		err = memcpy_to_msg(msg, pkt->buf + pkt->off, bytes);
> +		if (err)
> +			goto out;
> +		total += bytes;
> +		pkt->off += bytes;
> +		if (pkt->off == pkt->len) {
> +			virtio_transport_dec_rx_pkt(pkt);
> +			list_del(&pkt->list);
> +			virtio_transport_free_pkt(pkt);
> +		}
> +	}
> +	mutex_unlock(&trans->rx_lock);
> +
> +	/* Send a credit pkt to peer */
> +	virtio_transport_send_credit_update(vsk, VIRTIO_VSOCK_TYPE_STREAM,
> +					    NULL);
> +
> +	return total;
> +
> +out:
> +	mutex_unlock(&trans->rx_lock);
> +	if (total)
> +		err = total;
> +	return err;
> +}
> +
> +ssize_t
> +virtio_transport_stream_dequeue(struct vsock_sock *vsk,
> +				struct msghdr *msg,
> +				size_t len, int flags)
> +{
> +	if (flags & MSG_PEEK)
> +		return -EOPNOTSUPP;
> +
> +	return virtio_transport_stream_do_dequeue(vsk, msg, len);
> +}
> +EXPORT_SYMBOL_GPL(virtio_transport_stream_dequeue);
> +
> +int
> +virtio_transport_dgram_dequeue(struct vsock_sock *vsk,
> +			       struct msghdr *msg,
> +			       size_t len, int flags)
> +{
> +	return -EOPNOTSUPP;
> +}
> +EXPORT_SYMBOL_GPL(virtio_transport_dgram_dequeue);
> +
> +s64 virtio_transport_stream_has_data(struct vsock_sock *vsk)
> +{
> +	struct virtio_transport *trans = vsk->trans;
> +	s64 bytes;
> +
> +	mutex_lock(&trans->rx_lock);
> +	bytes = trans->rx_bytes;
> +	mutex_unlock(&trans->rx_lock);
> +
> +	return bytes;
> +}
> +EXPORT_SYMBOL_GPL(virtio_transport_stream_has_data);
> +
> +static s64 virtio_transport_has_space(struct vsock_sock *vsk)
> +{
> +	struct virtio_transport *trans = vsk->trans;
> +	s64 bytes;
> +
> +	bytes = trans->peer_buf_alloc - (trans->tx_cnt - trans->peer_fwd_cnt);
> +	if (bytes < 0)
> +		bytes = 0;
> +
> +	return bytes;
> +}
> +
> +s64 virtio_transport_stream_has_space(struct vsock_sock *vsk)
> +{
> +	struct virtio_transport *trans = vsk->trans;
> +	s64 bytes;
> +
> +	mutex_lock(&trans->tx_lock);
> +	bytes = virtio_transport_has_space(vsk);
> +	mutex_unlock(&trans->tx_lock);
> +
> +	pr_debug("%s: bytes=%lld\n", __func__, bytes);
> +
> +	return bytes;
> +}
> +EXPORT_SYMBOL_GPL(virtio_transport_stream_has_space);
> +
> +int virtio_transport_do_socket_init(struct vsock_sock *vsk,
> +				    struct vsock_sock *psk)
> +{
> +	struct virtio_transport *trans;
> +
> +	trans = kzalloc(sizeof(*trans), GFP_KERNEL);
> +	if (!trans)
> +		return -ENOMEM;
> +
> +	vsk->trans = trans;
> +	trans->vsk = vsk;
> +	if (psk) {
> +		struct virtio_transport *ptrans = psk->trans;
> +		trans->buf_size	= ptrans->buf_size;
> +		trans->buf_size_min = ptrans->buf_size_min;
> +		trans->buf_size_max = ptrans->buf_size_max;
> +		trans->peer_buf_alloc = ptrans->peer_buf_alloc;
> +	} else {
> +		trans->buf_size = VIRTIO_VSOCK_DEFAULT_BUF_SIZE;
> +		trans->buf_size_min = VIRTIO_VSOCK_DEFAULT_MIN_BUF_SIZE;
> +		trans->buf_size_max = VIRTIO_VSOCK_DEFAULT_MAX_BUF_SIZE;
> +	}
> +
> +	trans->buf_alloc = trans->buf_size;
> +
> +	pr_debug("%s: trans->buf_alloc=%d\n", __func__, trans->buf_alloc);
> +
> +	mutex_init(&trans->rx_lock);
> +	mutex_init(&trans->tx_lock);
> +	INIT_LIST_HEAD(&trans->rx_queue);
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(virtio_transport_do_socket_init);
> +
> +u64 virtio_transport_get_buffer_size(struct vsock_sock *vsk)
> +{
> +	struct virtio_transport *trans = vsk->trans;
> +
> +	return trans->buf_size;
> +}
> +EXPORT_SYMBOL_GPL(virtio_transport_get_buffer_size);
> +
> +u64 virtio_transport_get_min_buffer_size(struct vsock_sock *vsk)
> +{
> +	struct virtio_transport *trans = vsk->trans;
> +
> +	return trans->buf_size_min;
> +}
> +EXPORT_SYMBOL_GPL(virtio_transport_get_min_buffer_size);
> +
> +u64 virtio_transport_get_max_buffer_size(struct vsock_sock *vsk)
> +{
> +	struct virtio_transport *trans = vsk->trans;
> +
> +	return trans->buf_size_max;
> +}
> +EXPORT_SYMBOL_GPL(virtio_transport_get_max_buffer_size);

All these accessor functions seem pretty simple. Maybe they should be
inline header functions or even #define macros?

> +
> +void virtio_transport_set_buffer_size(struct vsock_sock *vsk, u64 val)
> +{
> +	struct virtio_transport *trans = vsk->trans;
> +
> +	if (val > VIRTIO_VSOCK_MAX_BUF_SIZE)
> +		val = VIRTIO_VSOCK_MAX_BUF_SIZE;
> +	if (val < trans->buf_size_min)
> +		trans->buf_size_min = val;
> +	if (val > trans->buf_size_max)
> +		trans->buf_size_max = val;
> +	trans->buf_size = val;
> +	trans->buf_alloc = val;
> +}
> +EXPORT_SYMBOL_GPL(virtio_transport_set_buffer_size);
> +
> +void virtio_transport_set_min_buffer_size(struct vsock_sock *vsk, u64 val)
> +{
> +	struct virtio_transport *trans = vsk->trans;
> +
> +	if (val > VIRTIO_VSOCK_MAX_BUF_SIZE)
> +		val = VIRTIO_VSOCK_MAX_BUF_SIZE;
> +	if (val > trans->buf_size)
> +		trans->buf_size = val;
> +	trans->buf_size_min = val;
> +}
> +EXPORT_SYMBOL_GPL(virtio_transport_set_min_buffer_size);
> +
> +void virtio_transport_set_max_buffer_size(struct vsock_sock *vsk, u64 val)
> +{
> +	struct virtio_transport *trans = vsk->trans;
> +
> +	if (val > VIRTIO_VSOCK_MAX_BUF_SIZE)
> +		val = VIRTIO_VSOCK_MAX_BUF_SIZE;
> +	if (val < trans->buf_size)
> +		trans->buf_size = val;
> +	trans->buf_size_max = val;
> +}
> +EXPORT_SYMBOL_GPL(virtio_transport_set_max_buffer_size);
> +
> +int
> +virtio_transport_notify_poll_in(struct vsock_sock *vsk,
> +				size_t target,
> +				bool *data_ready_now)

cf. checkpatch indentation

> +{
> +	if (vsock_stream_has_data(vsk))
> +		*data_ready_now = true;
> +	else
> +		*data_ready_now = false;
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(virtio_transport_notify_poll_in);
> +
> +int
> +virtio_transport_notify_poll_out(struct vsock_sock *vsk,
> +				 size_t target,
> +				 bool *space_avail_now)

checkpatch

> +{
> +	s64 free_space;
> +
> +	free_space = vsock_stream_has_space(vsk);
> +	if (free_space > 0)
> +		*space_avail_now = true;
> +	else if (free_space == 0)
> +		*space_avail_now = false;
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(virtio_transport_notify_poll_out);
> +
> +int virtio_transport_notify_recv_init(struct vsock_sock *vsk,
> +	size_t target, struct vsock_transport_recv_notify_data *data)
> +{
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(virtio_transport_notify_recv_init);
> +
> +int virtio_transport_notify_recv_pre_block(struct vsock_sock *vsk,
> +	size_t target, struct vsock_transport_recv_notify_data *data)
> +{
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(virtio_transport_notify_recv_pre_block);
> +
> +int virtio_transport_notify_recv_pre_dequeue(struct vsock_sock *vsk,
> +	size_t target, struct vsock_transport_recv_notify_data *data)
> +{
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(virtio_transport_notify_recv_pre_dequeue);
> +
> +int virtio_transport_notify_recv_post_dequeue(struct vsock_sock *vsk,
> +	size_t target, ssize_t copied, bool data_read,
> +	struct vsock_transport_recv_notify_data *data)
> +{
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(virtio_transport_notify_recv_post_dequeue);
> +
> +int virtio_transport_notify_send_init(struct vsock_sock *vsk,
> +	struct vsock_transport_send_notify_data *data)
> +{
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(virtio_transport_notify_send_init);
> +
> +int virtio_transport_notify_send_pre_block(struct vsock_sock *vsk,
> +	struct vsock_transport_send_notify_data *data)
> +{
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(virtio_transport_notify_send_pre_block);
> +
> +int virtio_transport_notify_send_pre_enqueue(struct vsock_sock *vsk,
> +	struct vsock_transport_send_notify_data *data)
> +{
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(virtio_transport_notify_send_pre_enqueue);
> +
> +int virtio_transport_notify_send_post_enqueue(struct vsock_sock *vsk,
> +	ssize_t written, struct vsock_transport_send_notify_data *data)
> +{
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(virtio_transport_notify_send_post_enqueue);

This makes me wonder if the calling code should have if (transport->fn)
checks rather than filling stuff out with null implementations, but I
guess that's a question better aimed at the maintainers.

> +
> +u64 virtio_transport_stream_rcvhiwat(struct vsock_sock *vsk)
> +{
> +	struct virtio_transport *trans = vsk->trans;
> +
> +	return trans->buf_size;
> +}
> +EXPORT_SYMBOL_GPL(virtio_transport_stream_rcvhiwat);
> +
> +bool virtio_transport_stream_is_active(struct vsock_sock *vsk)
> +{
> +	return true;
> +}
> +EXPORT_SYMBOL_GPL(virtio_transport_stream_is_active);
> +
> +bool virtio_transport_stream_allow(u32 cid, u32 port)
> +{
> +	/* Only allow guest->host connections */
> +	return cid != VMADDR_CID_HOST;
> +}
> +EXPORT_SYMBOL_GPL(virtio_transport_stream_allow);
> +
> +int virtio_transport_dgram_bind(struct vsock_sock *vsk,
> +				struct sockaddr_vm *addr)
> +{
> +	return -EOPNOTSUPP;
> +}
> +EXPORT_SYMBOL_GPL(virtio_transport_dgram_bind);
> +
> +bool virtio_transport_dgram_allow(u32 cid, u32 port)
> +{
> +	return false;
> +}
> +EXPORT_SYMBOL_GPL(virtio_transport_dgram_allow);
> +
> +int virtio_transport_connect(struct vsock_sock *vsk)
> +{
> +	struct virtio_transport *trans = vsk->trans;
> +	struct virtio_vsock_pkt_info info = {
> +		.op = VIRTIO_VSOCK_OP_REQUEST,
> +		.type = VIRTIO_VSOCK_TYPE_STREAM,
> +	};
> +
> +	pr_debug("%s: vsk=%p send_request\n", __func__, vsk);
> +	return trans->ops->send_pkt(vsk, &info);
> +}
> +EXPORT_SYMBOL_GPL(virtio_transport_connect);
> +
> +int virtio_transport_shutdown(struct vsock_sock *vsk, int mode)
> +{
> +	struct virtio_transport *trans = vsk->trans;
> +	struct virtio_vsock_pkt_info info = {
> +		.op = VIRTIO_VSOCK_OP_SHUTDOWN,
> +		.type = VIRTIO_VSOCK_TYPE_STREAM,
> +		.flags = (mode & RCV_SHUTDOWN ?
> +			  VIRTIO_VSOCK_SHUTDOWN_RCV : 0) |
> +			 (mode & SEND_SHUTDOWN ?
> +			  VIRTIO_VSOCK_SHUTDOWN_SEND : 0),
> +	};
> +
> +	pr_debug("%s: vsk=%p: send_shutdown\n", __func__, vsk);
> +	return trans->ops->send_pkt(vsk, &info);
> +}
> +EXPORT_SYMBOL_GPL(virtio_transport_shutdown);
> +
> +void virtio_transport_release(struct vsock_sock *vsk)
> +{
> +	struct sock *sk = &vsk->sk;
> +
> +	pr_debug("%s: vsk=%p\n", __func__, vsk);
> +
> +	/* Tell other side to terminate connection */
> +	if (sk->sk_type == SOCK_STREAM && sk->sk_state == SS_CONNECTED) {
> +		virtio_transport_shutdown(vsk, SHUTDOWN_MASK);
> +	}
> +}
> +EXPORT_SYMBOL_GPL(virtio_transport_release);
> +
> +int
> +virtio_transport_dgram_enqueue(struct vsock_sock *vsk,
> +			       struct sockaddr_vm *remote_addr,
> +			       struct msghdr *msg,
> +			       size_t dgram_len)
> +{
> +	return -EOPNOTSUPP;
> +}
> +EXPORT_SYMBOL_GPL(virtio_transport_dgram_enqueue);
> +
> +ssize_t
> +virtio_transport_stream_enqueue(struct vsock_sock *vsk,
> +				struct msghdr *msg,
> +				size_t len)
> +{
> +	struct virtio_transport *trans = vsk->trans;
> +	struct virtio_vsock_pkt_info info = {
> +		.op = VIRTIO_VSOCK_OP_RW,
> +		.type = VIRTIO_VSOCK_TYPE_STREAM,
> +		.msg = msg,
> +		.pkt_len = len,
> +	};
> +
> +	return trans->ops->send_pkt(vsk, &info);
> +}
> +EXPORT_SYMBOL_GPL(virtio_transport_stream_enqueue);
> +
> +void virtio_transport_destruct(struct vsock_sock *vsk)
> +{
> +	struct virtio_transport *trans = vsk->trans;
> +
> +	pr_debug("%s: vsk=%p\n", __func__, vsk);
> +	kfree(trans);
> +}
> +EXPORT_SYMBOL_GPL(virtio_transport_destruct);
> +
> +static int virtio_transport_send_reset(struct vsock_sock *vsk,
> +				       struct virtio_vsock_pkt *pkt)
> +{
> +	struct virtio_transport *trans = vsk->trans;
> +	struct virtio_vsock_pkt_info info = {
> +		.op = VIRTIO_VSOCK_OP_RST,
> +		.type = VIRTIO_VSOCK_TYPE_STREAM,
> +	};
> +
> +	pr_debug("%s\n", __func__);
> +
> +	/* Send RST only if the original pkt is not a RST pkt */
> +	if (le16_to_cpu(pkt->hdr.op) == VIRTIO_VSOCK_OP_RST)
> +		return 0;
> +
> +	return trans->ops->send_pkt(vsk, &info);
> +}
> +
> +static int
> +virtio_transport_recv_connecting(struct sock *sk,
> +				 struct virtio_vsock_pkt *pkt)
> +{
> +	struct vsock_sock *vsk = vsock_sk(sk);
> +	int err;
> +	int skerr;
> +
> +	pr_debug("%s: vsk=%p\n", __func__, vsk);
> +	switch (le16_to_cpu(pkt->hdr.op)) {
> +	case VIRTIO_VSOCK_OP_RESPONSE:
> +		pr_debug("%s: got RESPONSE\n", __func__);
> +		sk->sk_state = SS_CONNECTED;
> +		sk->sk_socket->state = SS_CONNECTED;
> +		vsock_insert_connected(vsk);
> +		sk->sk_state_change(sk);
> +		break;
> +	case VIRTIO_VSOCK_OP_INVALID:
> +		pr_debug("%s: got invalid\n", __func__);
> +		break;
> +	case VIRTIO_VSOCK_OP_RST:
> +		pr_debug("%s: got rst\n", __func__);
> +		skerr = ECONNRESET;
> +		err = 0;
> +		goto destroy;
> +	default:
> +		pr_debug("%s: got def\n", __func__);
> +		skerr = EPROTO;
> +		err = -EINVAL;
> +		goto destroy;
> +	}
> +	return 0;
> +
> +destroy:
> +	virtio_transport_send_reset(vsk, pkt);
> +	sk->sk_state = SS_UNCONNECTED;
> +	sk->sk_err = skerr;
> +	sk->sk_error_report(sk);
> +	return err;
> +}
> +
> +static int
> +virtio_transport_recv_connected(struct sock *sk,
> +				struct virtio_vsock_pkt *pkt)
> +{
> +	struct vsock_sock *vsk = vsock_sk(sk);
> +	struct virtio_transport *trans = vsk->trans;
> +	int err = 0;
> +
> +	switch (le16_to_cpu(pkt->hdr.op)) {
> +	case VIRTIO_VSOCK_OP_RW:
> +		pkt->len = le32_to_cpu(pkt->hdr.len);
> +		pkt->off = 0;
> +		pkt->trans = trans;
> +
> +		mutex_lock(&trans->rx_lock);
> +		virtio_transport_inc_rx_pkt(pkt);
> +		list_add_tail(&pkt->list, &trans->rx_queue);
> +		mutex_unlock(&trans->rx_lock);
> +
> +		sk->sk_data_ready(sk);
> +		return err;
> +	case VIRTIO_VSOCK_OP_CREDIT_UPDATE:
> +		sk->sk_write_space(sk);
> +		break;
> +	case VIRTIO_VSOCK_OP_SHUTDOWN:
> +		pr_debug("%s: got shutdown\n", __func__);
> +		if (le32_to_cpu(pkt->hdr.flags) & VIRTIO_VSOCK_SHUTDOWN_RCV)
> +			vsk->peer_shutdown |= RCV_SHUTDOWN;
> +		if (le32_to_cpu(pkt->hdr.flags) & VIRTIO_VSOCK_SHUTDOWN_SEND)
> +			vsk->peer_shutdown |= SEND_SHUTDOWN;
> +		if (le32_to_cpu(pkt->hdr.flags))
> +			sk->sk_state_change(sk);
> +		break;
> +	case VIRTIO_VSOCK_OP_RST:
> +		pr_debug("%s: got rst\n", __func__);
> +		sock_set_flag(sk, SOCK_DONE);
> +		vsk->peer_shutdown = SHUTDOWN_MASK;
> +		if (vsock_stream_has_data(vsk) <= 0)
> +			sk->sk_state = SS_DISCONNECTING;
> +		sk->sk_state_change(sk);
> +		break;
> +	default:
> +		err = -EINVAL;
> +		break;
> +	}
> +
> +	virtio_transport_free_pkt(pkt);
> +	return err;
> +}
> +
> +static int
> +virtio_transport_send_response(struct vsock_sock *vsk,
> +			       struct virtio_vsock_pkt *pkt)
> +{
> +	struct virtio_transport *trans = vsk->trans;
> +	struct virtio_vsock_pkt_info info = {
> +		.op = VIRTIO_VSOCK_OP_RESPONSE,
> +		.type = VIRTIO_VSOCK_TYPE_STREAM,
> +		.remote_cid = le32_to_cpu(pkt->hdr.src_cid),
> +		.remote_port = le32_to_cpu(pkt->hdr.src_port),
> +	};
> +
> +	pr_debug("%s: send_response\n", __func__);
> +
> +	return trans->ops->send_pkt(vsk, &info);
> +}
> +
> +/* Handle server socket */
> +static int
> +virtio_transport_recv_listen(struct sock *sk, struct virtio_vsock_pkt *pkt)
> +{
> +	struct vsock_sock *vsk = vsock_sk(sk);
> +	struct vsock_sock *vchild;
> +	struct sock *child;
> +
> +	if (le16_to_cpu(pkt->hdr.op) != VIRTIO_VSOCK_OP_REQUEST) {
> +		virtio_transport_send_reset(vsk, pkt);
> +		return -EINVAL;
> +	}
> +
> +	if (sk_acceptq_is_full(sk)) {
> +		virtio_transport_send_reset(vsk, pkt);
> +		return -ENOMEM;
> +	}
> +
> +	pr_debug("%s: create pending\n", __func__);
> +	child = __vsock_create(sock_net(sk), NULL, sk, GFP_KERNEL,
> +			       sk->sk_type, 0);
> +	if (!child) {
> +		virtio_transport_send_reset(vsk, pkt);
> +		return -ENOMEM;
> +	}
> +
> +	sk->sk_ack_backlog++;
> +
> +	lock_sock(child);
> +
> +	child->sk_state = SS_CONNECTED;
> +
> +	vchild = vsock_sk(child);
> +	vsock_addr_init(&vchild->local_addr, le32_to_cpu(pkt->hdr.dst_cid),
> +			le32_to_cpu(pkt->hdr.dst_port));
> +	vsock_addr_init(&vchild->remote_addr, le32_to_cpu(pkt->hdr.src_cid),
> +			le32_to_cpu(pkt->hdr.src_port));
> +
> +	vsock_insert_connected(vchild);
> +	vsock_enqueue_accept(sk, child);
> +	virtio_transport_send_response(vchild, pkt);
> +
> +	release_sock(child);
> +
> +	sk->sk_data_ready(sk);
> +	return 0;
> +}
> +
> +static void virtio_transport_space_update(struct sock *sk,
> +					  struct virtio_vsock_pkt *pkt)
> +{
> +	struct vsock_sock *vsk = vsock_sk(sk);
> +	struct virtio_transport *trans = vsk->trans;
> +	bool space_available;
> +
> +	/* buf_alloc and fwd_cnt are always included in the hdr */
> +	mutex_lock(&trans->tx_lock);
> +	trans->peer_buf_alloc = le32_to_cpu(pkt->hdr.buf_alloc);
> +	trans->peer_fwd_cnt = le32_to_cpu(pkt->hdr.fwd_cnt);
> +	space_available = virtio_transport_has_space(vsk);
> +	mutex_unlock(&trans->tx_lock);
> +
> +	if (space_available)
> +		sk->sk_write_space(sk);
> +}
> +
> +/* We are under the virtio-vsock's vsock->rx_lock or
> + * vhost-vsock's vq->mutex lock */
> +void virtio_transport_recv_pkt(struct virtio_vsock_pkt *pkt)
> +{
> +	struct virtio_transport *trans;
> +	struct sockaddr_vm src, dst;
> +	struct vsock_sock *vsk;
> +	struct sock *sk;
> +
> +	vsock_addr_init(&src, le32_to_cpu(pkt->hdr.src_cid), le32_to_cpu(pkt->hdr.src_port));
> +	vsock_addr_init(&dst, le32_to_cpu(pkt->hdr.dst_cid), le32_to_cpu(pkt->hdr.dst_port));
> +
> +	virtio_vsock_dumppkt(__func__, pkt);
> +
> +	if (le16_to_cpu(pkt->hdr.type) != VIRTIO_VSOCK_TYPE_STREAM) {
> +		/* TODO send RST */

TODOs shouldn't make it into final submissions.

> +		goto free_pkt;
> +	}
> +
> +	/* The socket must be in the connected or bound table,
> +	 * otherwise send a reset back
> +	 */
> +	sk = vsock_find_connected_socket(&src, &dst);
> +	if (!sk) {
> +		sk = vsock_find_bound_socket(&dst);
> +		if (!sk) {
> +			pr_debug("%s: can not find bound_socket\n", __func__);
> +			virtio_vsock_dumppkt(__func__, pkt);
> +			/* Ignore this pkt instead of sending reset back */
> +			/* TODO send a RST unless this packet is a RST
> (to avoid infinite loops) */

Ditto.

> +			goto free_pkt;
> +		}
> +	}
> +
> +	vsk = vsock_sk(sk);
> +	trans = vsk->trans;
> +	BUG_ON(!trans);

See above re: BUG_ON

> +
> +	virtio_transport_space_update(sk, pkt);
> +
> +	lock_sock(sk);
> +	switch (sk->sk_state) {
> +	case VSOCK_SS_LISTEN:
> +		virtio_transport_recv_listen(sk, pkt);
> +		virtio_transport_free_pkt(pkt);
> +		break;
> +	case SS_CONNECTING:
> +		virtio_transport_recv_connecting(sk, pkt);
> +		virtio_transport_free_pkt(pkt);
> +		break;
> +	case SS_CONNECTED:
> +		virtio_transport_recv_connected(sk, pkt);
> +		break;
> +	default:
> +		virtio_transport_free_pkt(pkt);
> +		break;
> +	}
> +	release_sock(sk);
> +
> +	/* Release refcnt obtained when we fetched this socket out of the
> +	 * bound or connected list.
> +	 */
> +	sock_put(sk);
> +	return;
> +
> +free_pkt:
> +	virtio_transport_free_pkt(pkt);
> +}
> +EXPORT_SYMBOL_GPL(virtio_transport_recv_pkt);
> +
> +void virtio_transport_free_pkt(struct virtio_vsock_pkt *pkt)
> +{
> +	kfree(pkt->buf);
> +	kfree(pkt);
> +}
> +EXPORT_SYMBOL_GPL(virtio_transport_free_pkt);
> +
> +MODULE_LICENSE("GPL v2");
> +MODULE_AUTHOR("Asias He");
> +MODULE_DESCRIPTION("common code for virtio vsock");
> --
> 2.5.0


--
Alex Bennée
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 2/4] VSOCK: Introduce virtio-vsock.ko
  2015-12-09 12:03 ` Stefan Hajnoczi
@ 2015-12-10 21:23   ` Alex Bennée
  2015-12-11  3:00     ` Stefan Hajnoczi
  2015-12-11  3:00     ` Stefan Hajnoczi
  0 siblings, 2 replies; 23+ messages in thread
From: Alex Bennée @ 2015-12-10 21:23 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: kvm, Michael S. Tsirkin, netdev, virtualization, Matt Benjamin,
	Asias He, Christoffer Dall, matt.ma


Stefan Hajnoczi <stefanha@redhat.com> writes:

> From: Asias He <asias@redhat.com>
>
> VM sockets virtio transport implementation. This module runs in guest
> kernel.

checkpatch warns on a bunch of whitespace/tab issues.

>
> Signed-off-by: Asias He <asias@redhat.com>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
> v2:
>  * Fix total_tx_buf accounting
>  * Add virtio_transport global mutex to prevent races
> ---
>  net/vmw_vsock/virtio_transport.c | 466 +++++++++++++++++++++++++++++++++++++++
>  1 file changed, 466 insertions(+)
>  create mode 100644 net/vmw_vsock/virtio_transport.c
>
> diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
> new file mode 100644
> index 0000000..df65dca
> --- /dev/null
> +++ b/net/vmw_vsock/virtio_transport.c
> @@ -0,0 +1,466 @@
> +/*
> + * virtio transport for vsock
> + *
> + * Copyright (C) 2013-2015 Red Hat, Inc.
> + * Author: Asias He <asias@redhat.com>
> + *         Stefan Hajnoczi <stefanha@redhat.com>
> + *
> + * Some of the code is taken from Gerd Hoffmann <kraxel@redhat.com>'s
> + * early virtio-vsock proof-of-concept bits.
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2.
> + */
> +#include <linux/spinlock.h>
> +#include <linux/module.h>
> +#include <linux/list.h>
> +#include <linux/virtio.h>
> +#include <linux/virtio_ids.h>
> +#include <linux/virtio_config.h>
> +#include <linux/virtio_vsock.h>
> +#include <net/sock.h>
> +#include <linux/mutex.h>
> +#include <net/af_vsock.h>
> +
> +static struct workqueue_struct *virtio_vsock_workqueue;
> +static struct virtio_vsock *the_virtio_vsock;
> +static DEFINE_MUTEX(the_virtio_vsock_mutex); /* protects the_virtio_vsock */
> +static void virtio_vsock_rx_fill(struct virtio_vsock *vsock);
> +
> +struct virtio_vsock {
> +	/* Virtio device */
> +	struct virtio_device *vdev;
> +	/* Virtio virtqueue */
> +	struct virtqueue *vqs[VSOCK_VQ_MAX];
> +	/* Wait queue for send pkt */
> +	wait_queue_head_t queue_wait;
> +	/* Work item to send pkt */
> +	struct work_struct tx_work;
> +	/* Work item to recv pkt */
> +	struct work_struct rx_work;
> +	/* Mutex to protect send pkt*/
> +	struct mutex tx_lock;
> +	/* Mutex to protect recv pkt*/
> +	struct mutex rx_lock;

Further down I got confused by what lock was what and exactly what was
being protected. If the receive and transmit paths touch separate things
it might be worth re-arranging the structure to make it clearer, eg:

   /* The transmit path is protected by tx_lock */
   struct mutex tx_lock;
   struct work_struct tx_work;
   ..
   ..

   /* The receive path is protected by rx_lock */
   wait_queue_head_t queue_wait;
   ..
   ..

 Which might make things a little clearer. Then all the redundant
 information in the comments can be removed. I don't need to know what
 is a Virtio device, virtqueue or wait_queue etc as they are implicit in
 the structure name.

> +	/* Number of recv buffers */
> +	int rx_buf_nr;
> +	/* Number of max recv buffers */
> +	int rx_buf_max_nr;
> +	/* Used for global tx buf limitation */
> +	u32 total_tx_buf;
> +	/* Guest context id, just like guest ip address */
> +	u32 guest_cid;
> +};
> +
> +static struct virtio_vsock *virtio_vsock_get(void)
> +{
> +	return the_virtio_vsock;
> +}
> +
> +static u32 virtio_transport_get_local_cid(void)
> +{
> +	struct virtio_vsock *vsock = virtio_vsock_get();
> +
> +	return vsock->guest_cid;
> +}
> +
> +static int
> +virtio_transport_send_pkt(struct vsock_sock *vsk,
> +			  struct virtio_vsock_pkt_info *info)
> +{
> +	u32 src_cid, src_port, dst_cid, dst_port;
> +	int ret, in_sg = 0, out_sg = 0;
> +	struct virtio_transport *trans;
> +	struct virtio_vsock_pkt *pkt;
> +	struct virtio_vsock *vsock;
> +	struct scatterlist hdr, buf, *sgs[2];
> +	struct virtqueue *vq;
> +	u32 pkt_len = info->pkt_len;
> +	DEFINE_WAIT(wait);
> +
> +	vsock = virtio_vsock_get();
> +	if (!vsock)
> +		return -ENODEV;
> +
> +	src_cid	= virtio_transport_get_local_cid();
> +	src_port = vsk->local_addr.svm_port;
> +	if (!info->remote_cid) {
> +		dst_cid	= vsk->remote_addr.svm_cid;
> +		dst_port = vsk->remote_addr.svm_port;
> +	} else {
> +		dst_cid = info->remote_cid;
> +		dst_port = info->remote_port;
> +	}
> +
> +	trans = vsk->trans;
> +	vq = vsock->vqs[VSOCK_VQ_TX];
> +
> +	if (pkt_len > VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE)
> +		pkt_len = VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE;
> +	pkt_len = virtio_transport_get_credit(trans, pkt_len);
> +	/* Do not send zero length OP_RW pkt*/
> +	if (pkt_len == 0 && info->op == VIRTIO_VSOCK_OP_RW)
> +		return pkt_len;
> +
> +	/* Respect global tx buf limitation */
> +	mutex_lock(&vsock->tx_lock);
> +	while (pkt_len + vsock->total_tx_buf > VIRTIO_VSOCK_MAX_TX_BUF_SIZE) {
> +		prepare_to_wait_exclusive(&vsock->queue_wait, &wait,
> +					  TASK_UNINTERRUPTIBLE);
> +		mutex_unlock(&vsock->tx_lock);
> +		schedule();
> +		mutex_lock(&vsock->tx_lock);
> +		finish_wait(&vsock->queue_wait, &wait);
> +	}
> +	vsock->total_tx_buf += pkt_len;
> +	mutex_unlock(&vsock->tx_lock);
> +
> +	pkt = virtio_transport_alloc_pkt(vsk, info, pkt_len,
> +					 src_cid, src_port,
> +					 dst_cid, dst_port);
> +	if (!pkt) {
> +		mutex_lock(&vsock->tx_lock);
> +		vsock->total_tx_buf -= pkt_len;
> +		mutex_unlock(&vsock->tx_lock);
> +		virtio_transport_put_credit(trans, pkt_len);
> +		return -ENOMEM;
> +	}
> +
> +	pr_debug("%s:info->pkt_len= %d\n", __func__, info->pkt_len);
> +
> +	/* Will be released in virtio_transport_send_pkt_work */
> +	sock_hold(&trans->vsk->sk);
> +	virtio_transport_inc_tx_pkt(pkt);
> +
> +	/* Put pkt in the virtqueue */
> +	sg_init_one(&hdr, &pkt->hdr, sizeof(pkt->hdr));
> +	sgs[out_sg++] = &hdr;
> +	if (info->msg && info->pkt_len > 0) {
> +		sg_init_one(&buf, pkt->buf, pkt->len);
> +	        sgs[out_sg++] = &buf;
> +	}
> +
> +	mutex_lock(&vsock->tx_lock);
> +	while ((ret = virtqueue_add_sgs(vq, sgs, out_sg, in_sg, pkt,
> +					GFP_KERNEL)) < 0) {
> +		prepare_to_wait_exclusive(&vsock->queue_wait, &wait,
> +					  TASK_UNINTERRUPTIBLE);
> +		mutex_unlock(&vsock->tx_lock);
> +		schedule();
> +		mutex_lock(&vsock->tx_lock);
> +		finish_wait(&vsock->queue_wait, &wait);
> +	}
> +	virtqueue_kick(vq);
> +	mutex_unlock(&vsock->tx_lock);

What are we protecting with tx_lock here? See comments above about
making the lock usage semantics clearer.

> +
> +	return pkt_len;
> +}
> +
> +static struct virtio_transport_pkt_ops virtio_ops = {
> +	.send_pkt = virtio_transport_send_pkt,
> +};
> +
> +static void virtio_vsock_rx_fill(struct virtio_vsock *vsock)
> +{
> +	int buf_len = VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE;
> +	struct virtio_vsock_pkt *pkt;
> +	struct scatterlist hdr, buf, *sgs[2];
> +	struct virtqueue *vq;
> +	int ret;
> +
> +	vq = vsock->vqs[VSOCK_VQ_RX];
> +
> +	do {
> +		pkt = kzalloc(sizeof(*pkt), GFP_KERNEL);
> +		if (!pkt) {
> +			pr_debug("%s: fail to allocate pkt\n", __func__);
> +			goto out;
> +		}
> +
> +		/* TODO: use mergeable rx buffer */

TODOs shouldn't end up in merged code.

> +		pkt->buf = kmalloc(buf_len, GFP_KERNEL);
> +		if (!pkt->buf) {
> +			pr_debug("%s: fail to allocate pkt->buf\n", __func__);
> +			goto err;
> +		}
> +
> +		sg_init_one(&hdr, &pkt->hdr, sizeof(pkt->hdr));
> +		sgs[0] = &hdr;
> +
> +		sg_init_one(&buf, pkt->buf, buf_len);
> +	        sgs[1] = &buf;
> +		ret = virtqueue_add_sgs(vq, sgs, 0, 2, pkt, GFP_KERNEL);
> +		if (ret)
> +			goto err;
> +		vsock->rx_buf_nr++;
> +	} while (vq->num_free);
> +	if (vsock->rx_buf_nr > vsock->rx_buf_max_nr)
> +		vsock->rx_buf_max_nr = vsock->rx_buf_nr;
> +out:
> +	virtqueue_kick(vq);
> +	return;
> +err:
> +	virtqueue_kick(vq);
> +	virtio_transport_free_pkt(pkt);

You could free the pkt memory at the fail site and just have one exit path.
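Roughly this shape, I mean — a deliberately generic userspace sketch of the
pattern, not the kernel code itself (kick() stands in for virtqueue_kick(),
and the names are made up):

```c
#include <stdlib.h>

struct pkt { void *buf; };

static int nkicks;                  /* counts calls standing in for virtqueue_kick() */

static void kick(void) { nkicks++; }

/* Allocate one pkt plus payload.  On a partial failure, free what
 * was already allocated right at the failure site, set the result
 * to NULL, and fall through to the single exit that always kicks --
 * no separate err: label is needed. */
static struct pkt *alloc_pkt(size_t payload, size_t limit)
{
	struct pkt *p = calloc(1, sizeof(*p));

	if (!p)
		goto out;
	if (payload > limit) {          /* simulated allocation failure */
		free(p);                /* cleanup at the fail site */
		p = NULL;
		goto out;
	}
	p->buf = malloc(payload);
out:
	kick();                         /* one exit path */
	return p;
}
```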

> +	return;
> +}
> +
> +static void virtio_transport_send_pkt_work(struct work_struct *work)
> +{
> +	struct virtio_vsock *vsock =
> +		container_of(work, struct virtio_vsock, tx_work);
> +	struct virtio_vsock_pkt *pkt;
> +	bool added = false;
> +	struct virtqueue *vq;
> +	unsigned int len;
> +	struct sock *sk;
> +
> +	vq = vsock->vqs[VSOCK_VQ_TX];
> +	mutex_lock(&vsock->tx_lock);
> +	do {

You can move the declarations of pkt/len into the do block.

> +		virtqueue_disable_cb(vq);
> +		while ((pkt = virtqueue_get_buf(vq, &len)) != NULL) {

And the sk declaration here

> +			sk = &pkt->trans->vsk->sk;
> +			virtio_transport_dec_tx_pkt(pkt);
> +			/* Release refcnt taken in virtio_transport_send_pkt */
> +			sock_put(sk);
> +			vsock->total_tx_buf -= pkt->len;
> +			virtio_transport_free_pkt(pkt);
> +			added = true;
> +		}
> +	} while (!virtqueue_enable_cb(vq));
> +	mutex_unlock(&vsock->tx_lock);
> +
> +	if (added)
> +		wake_up(&vsock->queue_wait);
> +}
> +
> +static void virtio_transport_recv_pkt_work(struct work_struct *work)
> +{
> +	struct virtio_vsock *vsock =
> +		container_of(work, struct virtio_vsock, rx_work);
> +	struct virtio_vsock_pkt *pkt;
> +	struct virtqueue *vq;
> +	unsigned int len;

Same as above for pkt, len.

> +
> +	vq = vsock->vqs[VSOCK_VQ_RX];
> +	mutex_lock(&vsock->rx_lock);
> +	do {
> +		virtqueue_disable_cb(vq);
> +		while ((pkt = virtqueue_get_buf(vq, &len)) != NULL) {
> +			pkt->len = len;
> +			virtio_transport_recv_pkt(pkt);
> +			vsock->rx_buf_nr--;
> +		}
> +	} while (!virtqueue_enable_cb(vq));
> +
> +	if (vsock->rx_buf_nr < vsock->rx_buf_max_nr / 2)
> +		virtio_vsock_rx_fill(vsock);
> +	mutex_unlock(&vsock->rx_lock);
> +}
> +
> +static void virtio_vsock_ctrl_done(struct virtqueue *vq)
> +{
> +}
> +
> +static void virtio_vsock_tx_done(struct virtqueue *vq)
> +{
> +	struct virtio_vsock *vsock = vq->vdev->priv;
> +
> +	if (!vsock)
> +		return;
> +	queue_work(virtio_vsock_workqueue, &vsock->tx_work);
> +}
> +
> +static void virtio_vsock_rx_done(struct virtqueue *vq)
> +{
> +	struct virtio_vsock *vsock = vq->vdev->priv;
> +
> +	if (!vsock)
> +		return;
> +	queue_work(virtio_vsock_workqueue, &vsock->rx_work);
> +}
> +
> +static int
> +virtio_transport_socket_init(struct vsock_sock *vsk, struct vsock_sock *psk)
> +{
> +	struct virtio_transport *trans;
> +	int ret;
> +
> +	ret = virtio_transport_do_socket_init(vsk, psk);
> +	if (ret)
> +		return ret;
> +
> +	trans = vsk->trans;
> +	trans->ops = &virtio_ops;
> +	return ret;
> +}
> +
> +static struct vsock_transport virtio_transport = {
> +	.get_local_cid            = virtio_transport_get_local_cid,
> +
> +	.init                     = virtio_transport_socket_init,
> +	.destruct                 = virtio_transport_destruct,
> +	.release                  = virtio_transport_release,
> +	.connect                  = virtio_transport_connect,
> +	.shutdown                 = virtio_transport_shutdown,
> +
> +	.dgram_bind               = virtio_transport_dgram_bind,
> +	.dgram_dequeue            = virtio_transport_dgram_dequeue,
> +	.dgram_enqueue            = virtio_transport_dgram_enqueue,
> +	.dgram_allow              = virtio_transport_dgram_allow,
> +
> +	.stream_dequeue           = virtio_transport_stream_dequeue,
> +	.stream_enqueue           = virtio_transport_stream_enqueue,
> +	.stream_has_data          = virtio_transport_stream_has_data,
> +	.stream_has_space         = virtio_transport_stream_has_space,
> +	.stream_rcvhiwat          = virtio_transport_stream_rcvhiwat,
> +	.stream_is_active         = virtio_transport_stream_is_active,
> +	.stream_allow             = virtio_transport_stream_allow,
> +
> +	.notify_poll_in           = virtio_transport_notify_poll_in,
> +	.notify_poll_out          = virtio_transport_notify_poll_out,
> +	.notify_recv_init         = virtio_transport_notify_recv_init,
> +	.notify_recv_pre_block    = virtio_transport_notify_recv_pre_block,
> +	.notify_recv_pre_dequeue  = virtio_transport_notify_recv_pre_dequeue,
> +	.notify_recv_post_dequeue = virtio_transport_notify_recv_post_dequeue,
> +	.notify_send_init         = virtio_transport_notify_send_init,
> +	.notify_send_pre_block    = virtio_transport_notify_send_pre_block,
> +	.notify_send_pre_enqueue  = virtio_transport_notify_send_pre_enqueue,
> +	.notify_send_post_enqueue = virtio_transport_notify_send_post_enqueue,
> +
> +	.set_buffer_size          = virtio_transport_set_buffer_size,
> +	.set_min_buffer_size      = virtio_transport_set_min_buffer_size,
> +	.set_max_buffer_size      = virtio_transport_set_max_buffer_size,
> +	.get_buffer_size          = virtio_transport_get_buffer_size,
> +	.get_min_buffer_size      = virtio_transport_get_min_buffer_size,
> +	.get_max_buffer_size      = virtio_transport_get_max_buffer_size,
> +};
> +
> +static int virtio_vsock_probe(struct virtio_device *vdev)
> +{
> +	vq_callback_t *callbacks[] = {
> +		virtio_vsock_ctrl_done,
> +		virtio_vsock_rx_done,
> +		virtio_vsock_tx_done,
> +	};
> +	const char *names[] = {
> +		"ctrl",
> +		"rx",
> +		"tx",
> +	};
> +	struct virtio_vsock *vsock = NULL;
> +	u32 guest_cid;
> +	int ret;
> +
> +	ret = mutex_lock_interruptible(&the_virtio_vsock_mutex);
> +	if (ret)
> +		return ret;
> +
> +	/* Only one virtio-vsock device per guest is supported */
> +	if (the_virtio_vsock) {
> +		ret = -EBUSY;
> +		goto out;
> +	}
> +
> +	vsock = kzalloc(sizeof(*vsock), GFP_KERNEL);
> +	if (!vsock) {
> +		ret = -ENOMEM;
> +		goto out;

Won't this attempt to kfree a NULL vsock?

> +	}
> +
> +	vsock->vdev = vdev;
> +
> +	ret = vsock->vdev->config->find_vqs(vsock->vdev, VSOCK_VQ_MAX,
> +					    vsock->vqs, callbacks, names);
> +	if (ret < 0)
> +		goto out;
> +
> +	vdev->config->get(vdev, offsetof(struct virtio_vsock_config, guest_cid),
> +			  &guest_cid, sizeof(guest_cid));
> +	vsock->guest_cid = le32_to_cpu(guest_cid);
> +	pr_debug("%s:guest_cid=%d\n", __func__, vsock->guest_cid);
> +
> +	ret = vsock_core_init(&virtio_transport);
> +	if (ret < 0)
> +		goto out_vqs;
> +
> +	vsock->rx_buf_nr = 0;
> +	vsock->rx_buf_max_nr = 0;
> +
> +	vdev->priv = the_virtio_vsock = vsock;
> +	init_waitqueue_head(&vsock->queue_wait);
> +	mutex_init(&vsock->tx_lock);
> +	mutex_init(&vsock->rx_lock);
> +	INIT_WORK(&vsock->rx_work, virtio_transport_recv_pkt_work);
> +	INIT_WORK(&vsock->tx_work, virtio_transport_send_pkt_work);
> +
> +	mutex_lock(&vsock->rx_lock);
> +	virtio_vsock_rx_fill(vsock);
> +	mutex_unlock(&vsock->rx_lock);
> +
> +	mutex_unlock(&the_virtio_vsock_mutex);
> +	return 0;
> +
> +out_vqs:
> +	vsock->vdev->config->del_vqs(vsock->vdev);
> +out:
> +	kfree(vsock);
> +	mutex_unlock(&the_virtio_vsock_mutex);
> +	return ret;
> +}
> +
> +static void virtio_vsock_remove(struct virtio_device *vdev)
> +{
> +	struct virtio_vsock *vsock = vdev->priv;
> +
> +	mutex_lock(&the_virtio_vsock_mutex);
> +	the_virtio_vsock = NULL;
> +	vsock_core_exit();
> +	mutex_unlock(&the_virtio_vsock_mutex);
> +
> +	kfree(vsock);
> +}
> +
> +static struct virtio_device_id id_table[] = {
> +	{ VIRTIO_ID_VSOCK, VIRTIO_DEV_ANY_ID },
> +	{ 0 },
> +};
> +
> +static unsigned int features[] = {
> +};
> +
> +static struct virtio_driver virtio_vsock_driver = {
> +	.feature_table = features,
> +	.feature_table_size = ARRAY_SIZE(features),
> +	.driver.name = KBUILD_MODNAME,
> +	.driver.owner = THIS_MODULE,
> +	.id_table = id_table,
> +	.probe = virtio_vsock_probe,
> +	.remove = virtio_vsock_remove,
> +};
> +
> +static int __init virtio_vsock_init(void)
> +{
> +	int ret;
> +
> +	virtio_vsock_workqueue = alloc_workqueue("virtio_vsock", 0, 0);
> +	if (!virtio_vsock_workqueue)
> +		return -ENOMEM;
> +	ret = register_virtio_driver(&virtio_vsock_driver);
> +	if (ret)
> +		destroy_workqueue(virtio_vsock_workqueue);
> +	return ret;
> +}
> +
> +static void __exit virtio_vsock_exit(void)
> +{
> +	unregister_virtio_driver(&virtio_vsock_driver);
> +	destroy_workqueue(virtio_vsock_workqueue);
> +}
> +
> +module_init(virtio_vsock_init);
> +module_exit(virtio_vsock_exit);
> +MODULE_LICENSE("GPL v2");
> +MODULE_AUTHOR("Asias He");
> +MODULE_DESCRIPTION("virtio transport for vsock");
> +MODULE_DEVICE_TABLE(virtio, id_table);


--
Alex Bennée


* Re: [PATCH v3 1/4] VSOCK: Introduce virtio-vsock-common.ko
  2015-12-10 10:17   ` Alex Bennée
@ 2015-12-11  2:51     ` Stefan Hajnoczi
  0 siblings, 0 replies; 23+ messages in thread
From: Stefan Hajnoczi @ 2015-12-11  2:51 UTC (permalink / raw)
  To: Alex Bennée
  Cc: kvm, Michael S. Tsirkin, netdev, virtualization, Matt Benjamin,
	Asias He, Christoffer Dall, matt.ma


On Thu, Dec 10, 2015 at 10:17:07AM +0000, Alex Bennée wrote:
> Stefan Hajnoczi <stefanha@redhat.com> writes:
> 
> > From: Asias He <asias@redhat.com>
> >
> > This module contains the common code and header files for the following
> > virtio-vsock and virtio-vhost kernel modules.
> 
> General comment: checkpatch has a bunch of warnings about 80 character
> limits, extra braces and BUG_ON usage.

Will fix in the next version.

> > diff --git a/include/linux/virtio_vsock.h b/include/linux/virtio_vsock.h
> > new file mode 100644
> > index 0000000..e54eb45
> > --- /dev/null
> > +++ b/include/linux/virtio_vsock.h
> > @@ -0,0 +1,203 @@
> > +/*
> > + * This header, excluding the #ifdef __KERNEL__ part, is BSD licensed so
> > + * anyone can use the definitions to implement compatible
> > drivers/servers:
> 
> Is anything in here actually exposed to userspace or the guest? The
> #ifdef __KERNEL__ statement seems redundant for this file at least.

You are right.  I think the header was copied from a uapi file.

I'll compare against other virtio code and apply an appropriate header.

> > +void virtio_vsock_dumppkt(const char *func,  const struct virtio_vsock_pkt *pkt)
> > +{
> > +	pr_debug("%s: pkt=%p, op=%d, len=%d, %d:%d---%d:%d, len=%d\n",
> > +		 func, pkt,
> > +		 le16_to_cpu(pkt->hdr.op),
> > +		 le32_to_cpu(pkt->hdr.len),
> > +		 le32_to_cpu(pkt->hdr.src_cid),
> > +		 le32_to_cpu(pkt->hdr.src_port),
> > +		 le32_to_cpu(pkt->hdr.dst_cid),
> > +		 le32_to_cpu(pkt->hdr.dst_port),
> > +		 pkt->len);
> > +}
> > +EXPORT_SYMBOL_GPL(virtio_vsock_dumppkt);
> 
> Why export this at all? The only users are in this file so you could
> make it static.

I'll make it static.

> > +u32 virtio_transport_get_credit(struct virtio_transport *trans, u32 credit)
> > +{
> > +	u32 ret;
> > +
> > +	mutex_lock(&trans->tx_lock);
> > +	ret = trans->peer_buf_alloc - (trans->tx_cnt - trans->peer_fwd_cnt);
> > +	if (ret > credit)
> > +		ret = credit;
> > +	trans->tx_cnt += ret;
> > +	mutex_unlock(&trans->tx_lock);
> > +
> > +	pr_debug("%s: ret=%d, buf_alloc=%d, peer_buf_alloc=%d,"
> > +		 "tx_cnt=%d, fwd_cnt=%d, peer_fwd_cnt=%d\n", __func__,
> 
> I think __func__ is superfluous here as the dynamic print code already
> has it and can print it when required. Having said that there seems to
> be plenty of code already in the kernel that uses __func__ :-/

I'll convert most printks to tracepoints in the next revision.
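
In the meantime dynamic debug can already prepend the function name at
runtime, which is what makes the literal __func__ in the format string
redundant.  Something like the following (assuming CONFIG_DYNAMIC_DEBUG
and a mounted debugfs; the file name is from this series):

```shell
# Enable pr_debug() output for this file (+p), prefixed with the
# function name (+f).
echo 'file virtio_transport_common.c +pf' > /sys/kernel/debug/dynamic_debug/control
```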

> > +u64 virtio_transport_get_max_buffer_size(struct vsock_sock *vsk)
> > +{
> > +	struct virtio_transport *trans = vsk->trans;
> > +
> > +	return trans->buf_size_max;
> > +}
> > +EXPORT_SYMBOL_GPL(virtio_transport_get_max_buffer_size);
> 
> All these accesses functions seem pretty simple. Maybe they should be
> inline header functions or even #define macros?

They are used as struct vsock_transport function pointers.  What is the
advantage to inlining them?

> > +int virtio_transport_notify_send_post_enqueue(struct vsock_sock *vsk,
> > +	ssize_t written, struct vsock_transport_send_notify_data *data)
> > +{
> > +	return 0;
> > +}
> > +EXPORT_SYMBOL_GPL(virtio_transport_notify_send_post_enqueue);
> 
> This makes me wonder if the calling code should be doing
> if (transport->fn) checks rather than filling things out with null
> implementations, but I guess that's a question better aimed at the
> maintainers.

I've considered it too.  I'll try to streamline this in the next
revision.
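
For illustration, the guarded-call variant would look something like this
at the call sites (a userspace sketch with made-up names, not the actual
af_vsock core):

```c
#include <stddef.h>

/* Hypothetical ops table: optional hooks stay NULL instead of
 * pointing at empty stub functions. */
struct transport_ops {
	int (*notify_send_post_enqueue)(size_t written);
};

static int demo_hook(size_t written)
{
	return written > 0 ? 1 : 0;
}

/* The core checks the pointer before calling, so a transport can
 * simply omit the hooks it does not implement. */
static int do_notify_send_post_enqueue(const struct transport_ops *ops,
				       size_t written)
{
	if (ops->notify_send_post_enqueue)
		return ops->notify_send_post_enqueue(written);
	return 0;	/* same result as today's empty stub */
}
```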

> > +/* We are under the virtio-vsock's vsock->rx_lock or
> > + * vhost-vsock's vq->mutex lock */
> > +void virtio_transport_recv_pkt(struct virtio_vsock_pkt *pkt)
> > +{
> > +	struct virtio_transport *trans;
> > +	struct sockaddr_vm src, dst;
> > +	struct vsock_sock *vsk;
> > +	struct sock *sk;
> > +
> > +	vsock_addr_init(&src, le32_to_cpu(pkt->hdr.src_cid), le32_to_cpu(pkt->hdr.src_port));
> > +	vsock_addr_init(&dst, le32_to_cpu(pkt->hdr.dst_cid), le32_to_cpu(pkt->hdr.dst_port));
> > +
> > +	virtio_vsock_dumppkt(__func__, pkt);
> > +
> > +	if (le16_to_cpu(pkt->hdr.type) != VIRTIO_VSOCK_TYPE_STREAM) {
> > +		/* TODO send RST */
> 
> TODO's shouldn't make it into final submissions.
> 
> > +		goto free_pkt;
> > +	}
> > +
> > +	/* The socket must be in connected or bound table
> > +	 * otherwise send reset back
> > +	 */
> > +	sk = vsock_find_connected_socket(&src, &dst);
> > +	if (!sk) {
> > +		sk = vsock_find_bound_socket(&dst);
> > +		if (!sk) {
> > +			pr_debug("%s: can not find bound_socket\n", __func__);
> > +			virtio_vsock_dumppkt(__func__, pkt);
> > +			/* Ignore this pkt instead of sending reset back */
> > +			/* TODO send a RST unless this packet is a RST
> > (to avoid infinite loops) */
> 
> Ditto.

Thanks, I'll complete the RST code in the next revision.



* Re: [PATCH v3 2/4] VSOCK: Introduce virtio-vsock.ko
  2015-12-10 21:23   ` Alex Bennée
  2015-12-11  3:00     ` Stefan Hajnoczi
@ 2015-12-11  3:00     ` Stefan Hajnoczi
  1 sibling, 0 replies; 23+ messages in thread
From: Stefan Hajnoczi @ 2015-12-11  3:00 UTC (permalink / raw)
  To: Alex Bennée
  Cc: kvm, Michael S. Tsirkin, netdev, virtualization, Matt Benjamin,
	Asias He, Christoffer Dall, matt.ma


On Thu, Dec 10, 2015 at 09:23:25PM +0000, Alex Bennée wrote:
> Stefan Hajnoczi <stefanha@redhat.com> writes:
> 
> > From: Asias He <asias@redhat.com>
> >
> > VM sockets virtio transport implementation. This module runs in guest
> > kernel.
> 
> checkpatch warns on a bunch of whitespace/tab issues.

Will fix in the next version.

> > +struct virtio_vsock {
> > +	/* Virtio device */
> > +	struct virtio_device *vdev;
> > +	/* Virtio virtqueue */
> > +	struct virtqueue *vqs[VSOCK_VQ_MAX];
> > +	/* Wait queue for send pkt */
> > +	wait_queue_head_t queue_wait;
> > +	/* Work item to send pkt */
> > +	struct work_struct tx_work;
> > +	/* Work item to recv pkt */
> > +	struct work_struct rx_work;
> > +	/* Mutex to protect send pkt*/
> > +	struct mutex tx_lock;
> > +	/* Mutex to protect recv pkt*/
> > +	struct mutex rx_lock;
> 
> Further down I got confused by what lock was what and exactly what was
> being protected. If the receive and transmit paths touch separate things
> it might be worth re-arranging the structure to make it clearer, eg:
> 
>    /* The transmit path is protected by tx_lock */
>    struct mutex tx_lock;
>    struct work_struct tx_work;
>    ..
>    ..
> 
>    /* The receive path is protected by rx_lock */
>    wait_queue_head_t queue_wait;
>    ..
>    ..
> 
>  Which might make things a little clearer. Then all the redundant
>  information in the comments can be removed. I don't need to know what
>  is a Virtio device, virtqueue or wait_queue etc as they are implicit in
>  the structure name.

Thanks, that is a nice idea.

> > +	mutex_lock(&vsock->tx_lock);
> > +	while ((ret = virtqueue_add_sgs(vq, sgs, out_sg, in_sg, pkt,
> > +					GFP_KERNEL)) < 0) {
> > +		prepare_to_wait_exclusive(&vsock->queue_wait, &wait,
> > +					  TASK_UNINTERRUPTIBLE);
> > +		mutex_unlock(&vsock->tx_lock);
> > +		schedule();
> > +		mutex_lock(&vsock->tx_lock);
> > +		finish_wait(&vsock->queue_wait, &wait);
> > +	}
> > +	virtqueue_kick(vq);
> > +	mutex_unlock(&vsock->tx_lock);
> 
> What are we protecting with tx_lock here? See comments above about
> making the lock usage semantics clearer.

vq (vsock->vqs[VSOCK_VQ_TX]) is being protected.  Concurrent calls to
virtqueue_add_sgs() are not allowed.

> > +
> > +	return pkt_len;
> > +}
> > +
> > +static struct virtio_transport_pkt_ops virtio_ops = {
> > +	.send_pkt = virtio_transport_send_pkt,
> > +};
> > +
> > +static void virtio_vsock_rx_fill(struct virtio_vsock *vsock)
> > +{
> > +	int buf_len = VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE;
> > +	struct virtio_vsock_pkt *pkt;
> > +	struct scatterlist hdr, buf, *sgs[2];
> > +	struct virtqueue *vq;
> > +	int ret;
> > +
> > +	vq = vsock->vqs[VSOCK_VQ_RX];
> > +
> > +	do {
> > +		pkt = kzalloc(sizeof(*pkt), GFP_KERNEL);
> > +		if (!pkt) {
> > +			pr_debug("%s: fail to allocate pkt\n", __func__);
> > +			goto out;
> > +		}
> > +
> > +		/* TODO: use mergeable rx buffer */
> 
> TODOs shouldn't end up in merged code.

Will fix in next revision.

> > +		pkt->buf = kmalloc(buf_len, GFP_KERNEL);
> > +		if (!pkt->buf) {
> > +			pr_debug("%s: fail to allocate pkt->buf\n", __func__);
> > +			goto err;
> > +		}
> > +
> > +		sg_init_one(&hdr, &pkt->hdr, sizeof(pkt->hdr));
> > +		sgs[0] = &hdr;
> > +
> > +		sg_init_one(&buf, pkt->buf, buf_len);
> > +	        sgs[1] = &buf;
> > +		ret = virtqueue_add_sgs(vq, sgs, 0, 2, pkt, GFP_KERNEL);
> > +		if (ret)
> > +			goto err;
> > +		vsock->rx_buf_nr++;
> > +	} while (vq->num_free);
> > +	if (vsock->rx_buf_nr > vsock->rx_buf_max_nr)
> > +		vsock->rx_buf_max_nr = vsock->rx_buf_nr;
> > +out:
> > +	virtqueue_kick(vq);
> > +	return;
> > +err:
> > +	virtqueue_kick(vq);
> > +	virtio_transport_free_pkt(pkt);
> 
> You could free the pkt memory at the fail site and just have one exit path.

Okay, I agree the err label is of marginal use.  Let's get rid of it.

> 
> > +	return;
> > +}
> > +
> > +static void virtio_transport_send_pkt_work(struct work_struct *work)
> > +{
> > +	struct virtio_vsock *vsock =
> > +		container_of(work, struct virtio_vsock, tx_work);
> > +	struct virtio_vsock_pkt *pkt;
> > +	bool added = false;
> > +	struct virtqueue *vq;
> > +	unsigned int len;
> > +	struct sock *sk;
> > +
> > +	vq = vsock->vqs[VSOCK_VQ_TX];
> > +	mutex_lock(&vsock->tx_lock);
> > +	do {
> 
> You can move the declarations of pkt/len into the do block.

Okay.

> 
> > +		virtqueue_disable_cb(vq);
> > +		while ((pkt = virtqueue_get_buf(vq, &len)) != NULL) {
> 
> And the sk declaration here

Okay.

> > +static void virtio_transport_recv_pkt_work(struct work_struct *work)
> > +{
> > +	struct virtio_vsock *vsock =
> > +		container_of(work, struct virtio_vsock, rx_work);
> > +	struct virtio_vsock_pkt *pkt;
> > +	struct virtqueue *vq;
> > +	unsigned int len;
> 
> Same as above for pkt, len.

Okay.

> > +	vsock = kzalloc(sizeof(*vsock), GFP_KERNEL);
> > +	if (!vsock) {
> > +		ret = -ENOMEM;
> > +		goto out;
> 
> Won't this attempt to kfree a NULL vsock?

kfree(NULL) is a nop so this is safe.


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 3/4] VSOCK: Introduce vhost-vsock.ko
  2015-12-09 12:03 ` [PATCH v3 3/4] VSOCK: Introduce vhost-vsock.ko Stefan Hajnoczi
  2015-12-11 13:45   ` Alex Bennée
@ 2015-12-11 13:45   ` Alex Bennée
  2015-12-15  7:47     ` Stefan Hajnoczi
  2015-12-15  7:47     ` Stefan Hajnoczi
  1 sibling, 2 replies; 23+ messages in thread
From: Alex Bennée @ 2015-12-11 13:45 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: kvm, Matt Benjamin, Christoffer Dall, netdev, Michael S. Tsirkin,
	matt.ma, virtualization, Asias He


Stefan Hajnoczi <stefanha@redhat.com> writes:

> From: Asias He <asias@redhat.com>
>
> VM sockets vhost transport implementation. This module runs in host
> kernel.

As per previous checkpatch comments.

>
> Signed-off-by: Asias He <asias@redhat.com>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
> v3:
>  * Remove unneeded variable used to store return value
>    (Fengguang Wu <fengguang.wu@intel.com> and Julia Lawall
>    <julia.lawall@lip6.fr>)
> v2:
>  * Add missing total_tx_buf decrement
>  * Support flexible rx/tx descriptor layout
>  * Refuse to assign reserved CIDs
>  * Refuse guest CID if already in use
>  * Only accept correctly addressed packets
> ---
>  drivers/vhost/vsock.c | 628 ++++++++++++++++++++++++++++++++++++++++++++++++++
>  drivers/vhost/vsock.h |   4 +
>  2 files changed, 632 insertions(+)
>  create mode 100644 drivers/vhost/vsock.c
>  create mode 100644 drivers/vhost/vsock.h
>
> diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
> new file mode 100644
> index 0000000..3c0034a
> --- /dev/null
> +++ b/drivers/vhost/vsock.c
> @@ -0,0 +1,628 @@
> +/*
> + * vhost transport for vsock
> + *
> + * Copyright (C) 2013-2015 Red Hat, Inc.
> + * Author: Asias He <asias@redhat.com>
> + *         Stefan Hajnoczi <stefanha@redhat.com>
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2.
> + */
> +#include <linux/miscdevice.h>
> +#include <linux/module.h>
> +#include <linux/mutex.h>
> +#include <net/sock.h>
> +#include <linux/virtio_vsock.h>
> +#include <linux/vhost.h>
> +
> +#include <net/af_vsock.h>
> +#include "vhost.h"
> +#include "vsock.h"
> +
> +#define VHOST_VSOCK_DEFAULT_HOST_CID	2
> +
> +static int vhost_transport_socket_init(struct vsock_sock *vsk,
> +				       struct vsock_sock *psk);
> +
> +enum {
> +	VHOST_VSOCK_FEATURES = VHOST_FEATURES,
> +};
> +
> +/* Used to track all the vhost_vsock instances on the system. */
> +static LIST_HEAD(vhost_vsock_list);
> +static DEFINE_MUTEX(vhost_vsock_mutex);
> +
> +struct vhost_vsock_virtqueue {
> +	struct vhost_virtqueue vq;
> +};
> +
> +struct vhost_vsock {
> +	/* Vhost device */
> +	struct vhost_dev dev;
> +	/* Vhost vsock virtqueue*/
> +	struct vhost_vsock_virtqueue vqs[VSOCK_VQ_MAX];
> +	/* Link to global vhost_vsock_list*/
> +	struct list_head list;
> +	/* Head for pkt from host to guest */
> +	struct list_head send_pkt_list;
> +	/* Work item to send pkt */
> +	struct vhost_work send_pkt_work;
> +	/* Wait queue for send pkt */
> +	wait_queue_head_t queue_wait;
> +	/* Used for global tx buf limitation */
> +	u32 total_tx_buf;
> +	/* Guest contex id this vhost_vsock instance handles */
> +	u32 guest_cid;
> +};

As with 2/4 there is a fair bit of redundancy in the comments but I
don't see any obvious grouping here that could streamline it.

> +
> +static u32 vhost_transport_get_local_cid(void)
> +{
> +	return VHOST_VSOCK_DEFAULT_HOST_CID;
> +}
> +
> +static struct vhost_vsock *vhost_vsock_get(u32 guest_cid)
> +{
> +	struct vhost_vsock *vsock;
> +
> +	mutex_lock(&vhost_vsock_mutex);
> +	list_for_each_entry(vsock, &vhost_vsock_list, list) {
> +		if (vsock->guest_cid == guest_cid) {
> +			mutex_unlock(&vhost_vsock_mutex);
> +			return vsock;
> +		}
> +	}
> +	mutex_unlock(&vhost_vsock_mutex);
> +
> +	return NULL;
> +}
> +
> +static void
> +vhost_transport_do_send_pkt(struct vhost_vsock *vsock,
> +			    struct vhost_virtqueue *vq)
> +{
> +	bool added = false;
> +
> +	mutex_lock(&vq->mutex);
> +	vhost_disable_notify(&vsock->dev, vq);
> +	for (;;) {
> +		struct virtio_vsock_pkt *pkt;
> +		struct iov_iter iov_iter;
> +		unsigned out, in;
> +		struct sock *sk;
> +		size_t nbytes;
> +		size_t len;
> +		int head;
> +
> +		if (list_empty(&vsock->send_pkt_list)) {
> +			vhost_enable_notify(&vsock->dev, vq);
> +			break;
> +		}
> +
> +		head = vhost_get_vq_desc(vq, vq->iov, ARRAY_SIZE(vq->iov),
> +					 &out, &in, NULL, NULL);
> +		pr_debug("%s: head = %d\n", __func__, head);
> +		if (head < 0)
> +			break;
> +
> +		if (head == vq->num) {
> +			if (unlikely(vhost_enable_notify(&vsock->dev, vq))) {
> +				vhost_disable_notify(&vsock->dev, vq);
> +				continue;

Why are we doing this? We enable notifications only to disable them again. A
comment explaining what is going on here would be useful.

> +			}
> +			break;
> +		}
> +
> +		pkt = list_first_entry(&vsock->send_pkt_list,
> +				       struct virtio_vsock_pkt, list);
> +		list_del_init(&pkt->list);
> +
> +		if (out) {
> +			virtio_transport_free_pkt(pkt);
> +			vq_err(vq, "Expected 0 output buffers, got %u\n", out);
> +			break;
> +		}
> +
> +		len = iov_length(&vq->iov[out], in);
> +		iov_iter_init(&iov_iter, READ, &vq->iov[out], in, len);
> +
> +		nbytes = copy_to_iter(&pkt->hdr, sizeof(pkt->hdr), &iov_iter);
> +		if (nbytes != sizeof(pkt->hdr)) {
> +			virtio_transport_free_pkt(pkt);
> +			vq_err(vq, "Faulted on copying pkt hdr\n");
> +			break;
> +		}
> +
> +		nbytes = copy_to_iter(pkt->buf, pkt->len, &iov_iter);
> +		if (nbytes != pkt->len) {
> +			virtio_transport_free_pkt(pkt);
> +			vq_err(vq, "Faulted on copying pkt buf\n");
> +			break;
> +		}
> +
> > +		vhost_add_used(vq, head, pkt->len); /* TODO should this be sizeof(pkt->hdr) + pkt->len? */

TODO needs sorting out or removing.

> +		added = true;
> +
> +		virtio_transport_dec_tx_pkt(pkt);
> +		vsock->total_tx_buf -= pkt->len;
> +
> +		sk = sk_vsock(pkt->trans->vsk);
> +		/* Release refcnt taken in vhost_transport_send_pkt */
> +		sock_put(sk);
> +
> +		virtio_transport_free_pkt(pkt);
> +	}
> +	if (added)
> +		vhost_signal(&vsock->dev, vq);
> +	mutex_unlock(&vq->mutex);
> +
> +	if (added)
> +		wake_up(&vsock->queue_wait);
> +}
> +
> +static void vhost_transport_send_pkt_work(struct vhost_work *work)
> +{
> +	struct vhost_virtqueue *vq;
> +	struct vhost_vsock *vsock;
> +
> +	vsock = container_of(work, struct vhost_vsock, send_pkt_work);
> +	vq = &vsock->vqs[VSOCK_VQ_RX].vq;
> +
> +	vhost_transport_do_send_pkt(vsock, vq);
> +}
> +
> +static int
> +vhost_transport_send_pkt(struct vsock_sock *vsk,
> +			 struct virtio_vsock_pkt_info *info)
> +{
> +	u32 src_cid, src_port, dst_cid, dst_port;
> +	struct virtio_transport *trans;
> +	struct virtio_vsock_pkt *pkt;
> +	struct vhost_virtqueue *vq;
> +	struct vhost_vsock *vsock;
> +	u32 pkt_len = info->pkt_len;
> +	DEFINE_WAIT(wait);
> +
> +	src_cid = vhost_transport_get_local_cid();
> +	src_port = vsk->local_addr.svm_port;
> +	if (!info->remote_cid) {
> +		dst_cid	= vsk->remote_addr.svm_cid;
> +		dst_port = vsk->remote_addr.svm_port;
> +	} else {
> +		dst_cid = info->remote_cid;
> +		dst_port = info->remote_port;
> +	}
> +
> +	/* Find the vhost_vsock according to guest context id  */
> +	vsock = vhost_vsock_get(dst_cid);
> +	if (!vsock)
> +		return -ENODEV;
> +
> +	trans = vsk->trans;
> +	vq = &vsock->vqs[VSOCK_VQ_RX].vq;
> +
> +	/* we can send less than pkt_len bytes */
> +	if (pkt_len > VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE)
> +		pkt_len = VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE;
> +
> +	/* virtio_transport_get_credit might return less than pkt_len credit */
> +	pkt_len = virtio_transport_get_credit(trans, pkt_len);
> +
> +	/* Do not send zero length OP_RW pkt*/
> +	if (pkt_len == 0 && info->op == VIRTIO_VSOCK_OP_RW)
> +		return pkt_len;
> +
> +	/* Respect global tx buf limitation */
> +	mutex_lock(&vq->mutex);
> > +	while (pkt_len + vsock->total_tx_buf > VIRTIO_VSOCK_MAX_TX_BUF_SIZE) {

I'm curious about the relationship between
VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE above and VIRTIO_VSOCK_MAX_TX_BUF_SIZE
just here. Why do we need to limit pkt_len to the smaller when really
all that matters is pkt_len + vsock->total_tx_buf >
VIRTIO_VSOCK_MAX_TX_BUF_SIZE?


> +		prepare_to_wait_exclusive(&vsock->queue_wait, &wait,
> +					  TASK_UNINTERRUPTIBLE);
> +		mutex_unlock(&vq->mutex);
> +		schedule();
> +		mutex_lock(&vq->mutex);
> +		finish_wait(&vsock->queue_wait, &wait);
> +	}
> +	vsock->total_tx_buf += pkt_len;
> +	mutex_unlock(&vq->mutex);
> +
> +	pkt = virtio_transport_alloc_pkt(vsk, info, pkt_len,
> +					 src_cid, src_port,
> +					 dst_cid, dst_port);
> +	if (!pkt) {
> +		mutex_lock(&vq->mutex);
> +		vsock->total_tx_buf -= pkt_len;
> +		mutex_unlock(&vq->mutex);
> +		virtio_transport_put_credit(trans, pkt_len);
> +		return -ENOMEM;
> +	}
> +
> +	pr_debug("%s:info->pkt_len= %d\n", __func__, pkt_len);
> +	/* Released in vhost_transport_do_send_pkt */
> +	sock_hold(&trans->vsk->sk);
> +	virtio_transport_inc_tx_pkt(pkt);
> +
> +	/* Queue it up in vhost work */
> +	mutex_lock(&vq->mutex);
> +	list_add_tail(&pkt->list, &vsock->send_pkt_list);
> +	vhost_work_queue(&vsock->dev, &vsock->send_pkt_work);
> +	mutex_unlock(&vq->mutex);
> +
> +	return pkt_len;
> +}
> +
> +static struct virtio_transport_pkt_ops vhost_ops = {
> +	.send_pkt = vhost_transport_send_pkt,
> +};
> +
> +static struct virtio_vsock_pkt *
> +vhost_vsock_alloc_pkt(struct vhost_virtqueue *vq,
> +		      unsigned int out, unsigned int in)
> +{
> +	struct virtio_vsock_pkt *pkt;
> +	struct iov_iter iov_iter;
> +	size_t nbytes;
> +	size_t len;
> +
> +	if (in != 0) {
> +		vq_err(vq, "Expected 0 input buffers, got %u\n", in);
> +		return NULL;
> +	}
> +
> +	pkt = kzalloc(sizeof(*pkt), GFP_KERNEL);
> +	if (!pkt)
> +		return NULL;
> +
> +	len = iov_length(vq->iov, out);
> +	iov_iter_init(&iov_iter, WRITE, vq->iov, out, len);
> +
> +	nbytes = copy_from_iter(&pkt->hdr, sizeof(pkt->hdr), &iov_iter);
> +	if (nbytes != sizeof(pkt->hdr)) {
> +		vq_err(vq, "Expected %zu bytes for pkt->hdr, got %zu bytes\n",
> +		       sizeof(pkt->hdr), nbytes);
> +		kfree(pkt);
> +		return NULL;
> +	}
> +
> +	if (le16_to_cpu(pkt->hdr.type) == VIRTIO_VSOCK_TYPE_STREAM)
> +		pkt->len = le32_to_cpu(pkt->hdr.len);
> +
> +	/* No payload */
> +	if (!pkt->len)
> +		return pkt;
> +
> +	/* The pkt is too big */
> +	if (pkt->len > VIRTIO_VSOCK_MAX_PKT_BUF_SIZE) {
> +		kfree(pkt);
> +		return NULL;
> +	}
> +
> +	pkt->buf = kmalloc(pkt->len, GFP_KERNEL);
> +	if (!pkt->buf) {
> +		kfree(pkt);
> +		return NULL;
> +	}
> +
> +	nbytes = copy_from_iter(pkt->buf, pkt->len, &iov_iter);
> +	if (nbytes != pkt->len) {
> +		vq_err(vq, "Expected %u byte payload, got %zu bytes\n",
> +		       pkt->len, nbytes);
> +		virtio_transport_free_pkt(pkt);
> +		return NULL;
> +	}
> +
> +	return pkt;
> +}
> +
> +static void vhost_vsock_handle_ctl_kick(struct vhost_work *work)
> +{
> +	struct vhost_virtqueue *vq = container_of(work, struct vhost_virtqueue,
> +						  poll.work);
> +	struct vhost_vsock *vsock = container_of(vq->dev, struct vhost_vsock,
> +						 dev);
> +
> +	pr_debug("%s vq=%p, vsock=%p\n", __func__, vq, vsock);
> +}

This doesn't handle anything; it just prints debug stuff. Should this be
a no-op function?

> +
> +static void vhost_vsock_handle_tx_kick(struct vhost_work *work)
> +{
> +	struct vhost_virtqueue *vq = container_of(work, struct vhost_virtqueue,
> +						  poll.work);
> +	struct vhost_vsock *vsock = container_of(vq->dev, struct vhost_vsock,
> +						 dev);
> +	struct virtio_vsock_pkt *pkt;
> +	int head;
> +	unsigned int out, in;
> +	bool added = false;
> +	u32 len;
> +
> +	mutex_lock(&vq->mutex);
> +	vhost_disable_notify(&vsock->dev, vq);
> +	for (;;) {
> +		head = vhost_get_vq_desc(vq, vq->iov, ARRAY_SIZE(vq->iov),
> +					 &out, &in, NULL, NULL);
> +		if (head < 0)
> +			break;
> +
> +		if (head == vq->num) {
> +			if (unlikely(vhost_enable_notify(&vsock->dev, vq))) {
> +				vhost_disable_notify(&vsock->dev, vq);
> +				continue;

Same question about the enable/disable dance as above.

> +			}
> +			break;
> +		}
> +
> +		pkt = vhost_vsock_alloc_pkt(vq, out, in);
> +		if (!pkt) {
> +			vq_err(vq, "Faulted on pkt\n");
> +			continue;
> +		}
> +
> +		len = pkt->len;
> +
> +		/* Only accept correctly addressed packets */
> +		if (le32_to_cpu(pkt->hdr.src_cid) == vsock->guest_cid &&
> +		    le32_to_cpu(pkt->hdr.dst_cid) == vhost_transport_get_local_cid())
> +			virtio_transport_recv_pkt(pkt);
> +		else
> +			virtio_transport_free_pkt(pkt);
> +
> +		vhost_add_used(vq, head, len);
> +		added = true;
> +	}
> +	if (added)
> +		vhost_signal(&vsock->dev, vq);
> +	mutex_unlock(&vq->mutex);
> +}
> +
> +static void vhost_vsock_handle_rx_kick(struct vhost_work *work)
> +{
> +	struct vhost_virtqueue *vq = container_of(work, struct vhost_virtqueue,
> +						poll.work);
> +	struct vhost_vsock *vsock = container_of(vq->dev, struct vhost_vsock,
> +						 dev);
> +
> +	vhost_transport_do_send_pkt(vsock, vq);
> +}
> +
> +static int vhost_vsock_dev_open(struct inode *inode, struct file *file)
> +{
> +	struct vhost_virtqueue **vqs;
> +	struct vhost_vsock *vsock;
> +	int ret;
> +
> +	vsock = kzalloc(sizeof(*vsock), GFP_KERNEL);
> +	if (!vsock)
> +		return -ENOMEM;
> +
> +	pr_debug("%s:vsock=%p\n", __func__, vsock);
> +
> +	vqs = kmalloc(VSOCK_VQ_MAX * sizeof(*vqs), GFP_KERNEL);
> +	if (!vqs) {
> +		ret = -ENOMEM;
> +		goto out;
> +	}
> +
> +	vqs[VSOCK_VQ_CTRL] = &vsock->vqs[VSOCK_VQ_CTRL].vq;
> +	vqs[VSOCK_VQ_TX] = &vsock->vqs[VSOCK_VQ_TX].vq;
> +	vqs[VSOCK_VQ_RX] = &vsock->vqs[VSOCK_VQ_RX].vq;
> +	vsock->vqs[VSOCK_VQ_CTRL].vq.handle_kick = vhost_vsock_handle_ctl_kick;
> +	vsock->vqs[VSOCK_VQ_TX].vq.handle_kick = vhost_vsock_handle_tx_kick;
> +	vsock->vqs[VSOCK_VQ_RX].vq.handle_kick = vhost_vsock_handle_rx_kick;
> +
> +	vhost_dev_init(&vsock->dev, vqs, VSOCK_VQ_MAX);
> +
> +	file->private_data = vsock;
> +	init_waitqueue_head(&vsock->queue_wait);
> +	INIT_LIST_HEAD(&vsock->send_pkt_list);
> +	vhost_work_init(&vsock->send_pkt_work, vhost_transport_send_pkt_work);
> +
> +	mutex_lock(&vhost_vsock_mutex);
> +	list_add_tail(&vsock->list, &vhost_vsock_list);
> +	mutex_unlock(&vhost_vsock_mutex);
> +	return 0;
> +
> +out:
> +	kfree(vsock);
> +	return ret;
> +}
> +
> +static void vhost_vsock_flush(struct vhost_vsock *vsock)
> +{
> +	int i;
> +
> +	for (i = 0; i < VSOCK_VQ_MAX; i++)
> +		vhost_poll_flush(&vsock->vqs[i].vq.poll);
> +	vhost_work_flush(&vsock->dev, &vsock->send_pkt_work);
> +}
> +
> +static int vhost_vsock_dev_release(struct inode *inode, struct file *file)
> +{
> +	struct vhost_vsock *vsock = file->private_data;
> +
> +	mutex_lock(&vhost_vsock_mutex);
> +	list_del(&vsock->list);
> +	mutex_unlock(&vhost_vsock_mutex);
> +
> +	vhost_dev_stop(&vsock->dev);
> +	vhost_vsock_flush(vsock);
> +	vhost_dev_cleanup(&vsock->dev, false);
> +	kfree(vsock->dev.vqs);
> +	kfree(vsock);
> +	return 0;
> +}
> +
> +static int vhost_vsock_set_cid(struct vhost_vsock *vsock, u32 guest_cid)
> +{
> +	struct vhost_vsock *other;
> +
> +	/* Refuse reserved CIDs */
> +	if (guest_cid <= VMADDR_CID_HOST) {
> +		return -EINVAL;
> +	}
> +
> +	/* Refuse if CID is already in use */
> +	other = vhost_vsock_get(guest_cid);
> +	if (other && other != vsock) {
> +		return -EADDRINUSE;
> +	}
> +
> +	mutex_lock(&vhost_vsock_mutex);
> +	vsock->guest_cid = guest_cid;
> +	pr_debug("%s:guest_cid=%d\n", __func__, guest_cid);
> +	mutex_unlock(&vhost_vsock_mutex);
> +
> +	return 0;
> +}
> +
> +static int vhost_vsock_set_features(struct vhost_vsock *vsock, u64 features)
> +{
> +	struct vhost_virtqueue *vq;
> +	int i;
> +
> +	if (features & ~VHOST_VSOCK_FEATURES)
> +		return -EOPNOTSUPP;
> +
> +	mutex_lock(&vsock->dev.mutex);
> +	if ((features & (1 << VHOST_F_LOG_ALL)) &&
> +	    !vhost_log_access_ok(&vsock->dev)) {
> +		mutex_unlock(&vsock->dev.mutex);
> +		return -EFAULT;
> +	}
> +
> +	for (i = 0; i < VSOCK_VQ_MAX; i++) {
> +		vq = &vsock->vqs[i].vq;
> +		mutex_lock(&vq->mutex);
> +		vq->acked_features = features;

Is this a user supplied flag? Should it be masked to valid values?

> +		mutex_unlock(&vq->mutex);
> +	}
> +	mutex_unlock(&vsock->dev.mutex);
> +	return 0;
> +}
> +
> +static long vhost_vsock_dev_ioctl(struct file *f, unsigned int ioctl,
> +				  unsigned long arg)
> +{
> +	struct vhost_vsock *vsock = f->private_data;
> +	void __user *argp = (void __user *)arg;
> +	u64 __user *featurep = argp;
> +	u32 __user *cidp = argp;
> +	u32 guest_cid;
> +	u64 features;
> +	int r;
> +
> +	switch (ioctl) {
> +	case VHOST_VSOCK_SET_GUEST_CID:
> +		if (get_user(guest_cid, cidp))
> +			return -EFAULT;
> +		return vhost_vsock_set_cid(vsock, guest_cid);
> +	case VHOST_GET_FEATURES:
> +		features = VHOST_VSOCK_FEATURES;
> +		if (copy_to_user(featurep, &features, sizeof(features)))
> +			return -EFAULT;
> +		return 0;
> +	case VHOST_SET_FEATURES:
> +		if (copy_from_user(&features, featurep, sizeof(features)))
> +			return -EFAULT;
> +		return vhost_vsock_set_features(vsock, features);
> +	default:
> +		mutex_lock(&vsock->dev.mutex);
> +		r = vhost_dev_ioctl(&vsock->dev, ioctl, argp);
> +		if (r == -ENOIOCTLCMD)
> +			r = vhost_vring_ioctl(&vsock->dev, ioctl, argp);
> +		else
> +			vhost_vsock_flush(vsock);
> +		mutex_unlock(&vsock->dev.mutex);
> +		return r;
> +	}
> +}
> +
> +static const struct file_operations vhost_vsock_fops = {
> +	.owner          = THIS_MODULE,
> +	.open           = vhost_vsock_dev_open,
> +	.release        = vhost_vsock_dev_release,
> +	.llseek		= noop_llseek,
> +	.unlocked_ioctl = vhost_vsock_dev_ioctl,
> +};
> +
> +static struct miscdevice vhost_vsock_misc = {
> +	.minor = MISC_DYNAMIC_MINOR,
> +	.name = "vhost-vsock",
> +	.fops = &vhost_vsock_fops,
> +};
> +
> +static int
> +vhost_transport_socket_init(struct vsock_sock *vsk, struct vsock_sock *psk)
> +{
> +	struct virtio_transport *trans;
> +	int ret;
> +
> +	ret = virtio_transport_do_socket_init(vsk, psk);
> +	if (ret)
> +		return ret;
> +
> +	trans = vsk->trans;
> +	trans->ops = &vhost_ops;
> +
> +	return ret;
> +}
> +
> +static struct vsock_transport vhost_transport = {
> +	.get_local_cid            = vhost_transport_get_local_cid,
> +
> +	.init                     = vhost_transport_socket_init,
> +	.destruct                 = virtio_transport_destruct,
> +	.release                  = virtio_transport_release,
> +	.connect                  = virtio_transport_connect,
> +	.shutdown                 = virtio_transport_shutdown,
> +
> +	.dgram_enqueue            = virtio_transport_dgram_enqueue,
> +	.dgram_dequeue            = virtio_transport_dgram_dequeue,
> +	.dgram_bind               = virtio_transport_dgram_bind,
> +	.dgram_allow              = virtio_transport_dgram_allow,
> +
> +	.stream_enqueue           = virtio_transport_stream_enqueue,
> +	.stream_dequeue           = virtio_transport_stream_dequeue,
> +	.stream_has_data          = virtio_transport_stream_has_data,
> +	.stream_has_space         = virtio_transport_stream_has_space,
> +	.stream_rcvhiwat          = virtio_transport_stream_rcvhiwat,
> +	.stream_is_active         = virtio_transport_stream_is_active,
> +	.stream_allow             = virtio_transport_stream_allow,
> +
> +	.notify_poll_in           = virtio_transport_notify_poll_in,
> +	.notify_poll_out          = virtio_transport_notify_poll_out,
> +	.notify_recv_init         = virtio_transport_notify_recv_init,
> +	.notify_recv_pre_block    = virtio_transport_notify_recv_pre_block,
> +	.notify_recv_pre_dequeue  = virtio_transport_notify_recv_pre_dequeue,
> +	.notify_recv_post_dequeue = virtio_transport_notify_recv_post_dequeue,
> +	.notify_send_init         = virtio_transport_notify_send_init,
> +	.notify_send_pre_block    = virtio_transport_notify_send_pre_block,
> +	.notify_send_pre_enqueue  = virtio_transport_notify_send_pre_enqueue,
> +	.notify_send_post_enqueue = virtio_transport_notify_send_post_enqueue,
> +
> +	.set_buffer_size          = virtio_transport_set_buffer_size,
> +	.set_min_buffer_size      = virtio_transport_set_min_buffer_size,
> +	.set_max_buffer_size      = virtio_transport_set_max_buffer_size,
> +	.get_buffer_size          = virtio_transport_get_buffer_size,
> +	.get_min_buffer_size      = virtio_transport_get_min_buffer_size,
> +	.get_max_buffer_size      = virtio_transport_get_max_buffer_size,
> +};
> +
> +static int __init vhost_vsock_init(void)
> +{
> +	int ret;
> +
> +	ret = vsock_core_init(&vhost_transport);
> +	if (ret < 0)
> +		return ret;
> +	return misc_register(&vhost_vsock_misc);
> +};
> +
> +static void __exit vhost_vsock_exit(void)
> +{
> +	misc_deregister(&vhost_vsock_misc);
> +	vsock_core_exit();
> +};
> +
> +module_init(vhost_vsock_init);
> +module_exit(vhost_vsock_exit);
> +MODULE_LICENSE("GPL v2");
> +MODULE_AUTHOR("Asias He");
> +MODULE_DESCRIPTION("vhost transport for vsock ");
> diff --git a/drivers/vhost/vsock.h b/drivers/vhost/vsock.h
> new file mode 100644
> index 0000000..0ddb107
> --- /dev/null
> +++ b/drivers/vhost/vsock.h
> @@ -0,0 +1,4 @@
> +#ifndef VHOST_VSOCK_H
> +#define VHOST_VSOCK_H
> +#define VHOST_VSOCK_SET_GUEST_CID _IOW(VHOST_VIRTIO, 0x60, __u32)
> +#endif
> --
> 2.5.0


--
Alex Bennée

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 3/4] VSOCK: Introduce vhost-vsock.ko
  2015-12-09 12:03 ` [PATCH v3 3/4] VSOCK: Introduce vhost-vsock.ko Stefan Hajnoczi
@ 2015-12-11 13:45   ` Alex Bennée
  2015-12-11 13:45   ` Alex Bennée
  1 sibling, 0 replies; 23+ messages in thread
From: Alex Bennée @ 2015-12-11 13:45 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: kvm, Michael S. Tsirkin, netdev, virtualization, Matt Benjamin,
	Asias He, Christoffer Dall, matt.ma


Stefan Hajnoczi <stefanha@redhat.com> writes:

> From: Asias He <asias@redhat.com>
>
> VM sockets vhost transport implementation. This module runs in host
> kernel.

As per previous checkpatch comments.

>
> Signed-off-by: Asias He <asias@redhat.com>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
> v3:
>  * Remove unneeded variable used to store return value
>    (Fengguang Wu <fengguang.wu@intel.com> and Julia Lawall
>    <julia.lawall@lip6.fr>)
> v2:
>  * Add missing total_tx_buf decrement
>  * Support flexible rx/tx descriptor layout
>  * Refuse to assign reserved CIDs
>  * Refuse guest CID if already in use
>  * Only accept correctly addressed packets
> ---
>  drivers/vhost/vsock.c | 628 ++++++++++++++++++++++++++++++++++++++++++++++++++
>  drivers/vhost/vsock.h |   4 +
>  2 files changed, 632 insertions(+)
>  create mode 100644 drivers/vhost/vsock.c
>  create mode 100644 drivers/vhost/vsock.h
>
> diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
> new file mode 100644
> index 0000000..3c0034a
> --- /dev/null
> +++ b/drivers/vhost/vsock.c
> @@ -0,0 +1,628 @@
> +/*
> + * vhost transport for vsock
> + *
> + * Copyright (C) 2013-2015 Red Hat, Inc.
> + * Author: Asias He <asias@redhat.com>
> + *         Stefan Hajnoczi <stefanha@redhat.com>
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2.
> + */
> +#include <linux/miscdevice.h>
> +#include <linux/module.h>
> +#include <linux/mutex.h>
> +#include <net/sock.h>
> +#include <linux/virtio_vsock.h>
> +#include <linux/vhost.h>
> +
> +#include <net/af_vsock.h>
> +#include "vhost.h"
> +#include "vsock.h"
> +
> +#define VHOST_VSOCK_DEFAULT_HOST_CID	2
> +
> +static int vhost_transport_socket_init(struct vsock_sock *vsk,
> +				       struct vsock_sock *psk);
> +
> +enum {
> +	VHOST_VSOCK_FEATURES = VHOST_FEATURES,
> +};
> +
> +/* Used to track all the vhost_vsock instances on the system. */
> +static LIST_HEAD(vhost_vsock_list);
> +static DEFINE_MUTEX(vhost_vsock_mutex);
> +
> +struct vhost_vsock_virtqueue {
> +	struct vhost_virtqueue vq;
> +};
> +
> +struct vhost_vsock {
> +	/* Vhost device */
> +	struct vhost_dev dev;
> +	/* Vhost vsock virtqueue */
> +	struct vhost_vsock_virtqueue vqs[VSOCK_VQ_MAX];
> +	/* Link to global vhost_vsock_list */
> +	struct list_head list;
> +	/* Head for pkt from host to guest */
> +	struct list_head send_pkt_list;
> +	/* Work item to send pkt */
> +	struct vhost_work send_pkt_work;
> +	/* Wait queue for send pkt */
> +	wait_queue_head_t queue_wait;
> +	/* Used for global tx buf limitation */
> +	u32 total_tx_buf;
> +	/* Guest context ID this vhost_vsock instance handles */
> +	u32 guest_cid;
> +};

As with 2/4 there is a fair bit of redundancy in the comments but I
don't see any obvious grouping here that could streamline it.

> +
> +static u32 vhost_transport_get_local_cid(void)
> +{
> +	return VHOST_VSOCK_DEFAULT_HOST_CID;
> +}
> +
> +static struct vhost_vsock *vhost_vsock_get(u32 guest_cid)
> +{
> +	struct vhost_vsock *vsock;
> +
> +	mutex_lock(&vhost_vsock_mutex);
> +	list_for_each_entry(vsock, &vhost_vsock_list, list) {
> +		if (vsock->guest_cid == guest_cid) {
> +			mutex_unlock(&vhost_vsock_mutex);
> +			return vsock;
> +		}
> +	}
> +	mutex_unlock(&vhost_vsock_mutex);
> +
> +	return NULL;
> +}
> +
> +static void
> +vhost_transport_do_send_pkt(struct vhost_vsock *vsock,
> +			    struct vhost_virtqueue *vq)
> +{
> +	bool added = false;
> +
> +	mutex_lock(&vq->mutex);
> +	vhost_disable_notify(&vsock->dev, vq);
> +	for (;;) {
> +		struct virtio_vsock_pkt *pkt;
> +		struct iov_iter iov_iter;
> +		unsigned out, in;
> +		struct sock *sk;
> +		size_t nbytes;
> +		size_t len;
> +		int head;
> +
> +		if (list_empty(&vsock->send_pkt_list)) {
> +			vhost_enable_notify(&vsock->dev, vq);
> +			break;
> +		}
> +
> +		head = vhost_get_vq_desc(vq, vq->iov, ARRAY_SIZE(vq->iov),
> +					 &out, &in, NULL, NULL);
> +		pr_debug("%s: head = %d\n", __func__, head);
> +		if (head < 0)
> +			break;
> +
> +		if (head == vq->num) {
> +			if (unlikely(vhost_enable_notify(&vsock->dev, vq))) {
> +				vhost_disable_notify(&vsock->dev, vq);
> +				continue;

Why are we doing this? If we enable notifications, why do we immediately
disable them again? A comment as to what is going on here would be useful.

> +			}
> +			break;
> +		}
> +
> +		pkt = list_first_entry(&vsock->send_pkt_list,
> +				       struct virtio_vsock_pkt, list);
> +		list_del_init(&pkt->list);
> +
> +		if (out) {
> +			virtio_transport_free_pkt(pkt);
> +			vq_err(vq, "Expected 0 output buffers, got %u\n", out);
> +			break;
> +		}
> +
> +		len = iov_length(&vq->iov[out], in);
> +		iov_iter_init(&iov_iter, READ, &vq->iov[out], in, len);
> +
> +		nbytes = copy_to_iter(&pkt->hdr, sizeof(pkt->hdr), &iov_iter);
> +		if (nbytes != sizeof(pkt->hdr)) {
> +			virtio_transport_free_pkt(pkt);
> +			vq_err(vq, "Faulted on copying pkt hdr\n");
> +			break;
> +		}
> +
> +		nbytes = copy_to_iter(pkt->buf, pkt->len, &iov_iter);
> +		if (nbytes != pkt->len) {
> +			virtio_transport_free_pkt(pkt);
> +			vq_err(vq, "Faulted on copying pkt buf\n");
> +			break;
> +		}
> +
> +		vhost_add_used(vq, head, pkt->len); /* TODO should this be sizeof(pkt->hdr) + pkt->len? */

TODO needs sorting out or removing.

> +		added = true;
> +
> +		virtio_transport_dec_tx_pkt(pkt);
> +		vsock->total_tx_buf -= pkt->len;
> +
> +		sk = sk_vsock(pkt->trans->vsk);
> +		/* Release refcnt taken in vhost_transport_send_pkt */
> +		sock_put(sk);
> +
> +		virtio_transport_free_pkt(pkt);
> +	}
> +	if (added)
> +		vhost_signal(&vsock->dev, vq);
> +	mutex_unlock(&vq->mutex);
> +
> +	if (added)
> +		wake_up(&vsock->queue_wait);
> +}
> +
> +static void vhost_transport_send_pkt_work(struct vhost_work *work)
> +{
> +	struct vhost_virtqueue *vq;
> +	struct vhost_vsock *vsock;
> +
> +	vsock = container_of(work, struct vhost_vsock, send_pkt_work);
> +	vq = &vsock->vqs[VSOCK_VQ_RX].vq;
> +
> +	vhost_transport_do_send_pkt(vsock, vq);
> +}
> +
> +static int
> +vhost_transport_send_pkt(struct vsock_sock *vsk,
> +			 struct virtio_vsock_pkt_info *info)
> +{
> +	u32 src_cid, src_port, dst_cid, dst_port;
> +	struct virtio_transport *trans;
> +	struct virtio_vsock_pkt *pkt;
> +	struct vhost_virtqueue *vq;
> +	struct vhost_vsock *vsock;
> +	u32 pkt_len = info->pkt_len;
> +	DEFINE_WAIT(wait);
> +
> +	src_cid = vhost_transport_get_local_cid();
> +	src_port = vsk->local_addr.svm_port;
> +	if (!info->remote_cid) {
> +		dst_cid	= vsk->remote_addr.svm_cid;
> +		dst_port = vsk->remote_addr.svm_port;
> +	} else {
> +		dst_cid = info->remote_cid;
> +		dst_port = info->remote_port;
> +	}
> +
> +	/* Find the vhost_vsock according to guest context ID */
> +	vsock = vhost_vsock_get(dst_cid);
> +	if (!vsock)
> +		return -ENODEV;
> +
> +	trans = vsk->trans;
> +	vq = &vsock->vqs[VSOCK_VQ_RX].vq;
> +
> +	/* we can send less than pkt_len bytes */
> +	if (pkt_len > VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE)
> +		pkt_len = VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE;
> +
> +	/* virtio_transport_get_credit might return less than pkt_len credit */
> +	pkt_len = virtio_transport_get_credit(trans, pkt_len);
> +
> +	/* Do not send zero length OP_RW pkt */
> +	if (pkt_len == 0 && info->op == VIRTIO_VSOCK_OP_RW)
> +		return pkt_len;
> +
> +	/* Respect global tx buf limitation */
> +	mutex_lock(&vq->mutex);
> +	while (pkt_len + vsock->total_tx_buf > VIRTIO_VSOCK_MAX_TX_BUF_SIZE) {

I'm curious about the relationship between
VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE above and VIRTIO_VSOCK_MAX_TX_BUF_SIZE
just here. Why do we need to limit pkt_len to the smaller when really
all that matters is pkt_len + vsock->total_tx_buf >
VIRTIO_VSOCK_MAX_TX_BUF_SIZE?


> +		prepare_to_wait_exclusive(&vsock->queue_wait, &wait,
> +					  TASK_UNINTERRUPTIBLE);
> +		mutex_unlock(&vq->mutex);
> +		schedule();
> +		mutex_lock(&vq->mutex);
> +		finish_wait(&vsock->queue_wait, &wait);
> +	}
> +	vsock->total_tx_buf += pkt_len;
> +	mutex_unlock(&vq->mutex);
> +
> +	pkt = virtio_transport_alloc_pkt(vsk, info, pkt_len,
> +					 src_cid, src_port,
> +					 dst_cid, dst_port);
> +	if (!pkt) {
> +		mutex_lock(&vq->mutex);
> +		vsock->total_tx_buf -= pkt_len;
> +		mutex_unlock(&vq->mutex);
> +		virtio_transport_put_credit(trans, pkt_len);
> +		return -ENOMEM;
> +	}
> +
> +	pr_debug("%s:info->pkt_len= %d\n", __func__, pkt_len);
> +	/* Released in vhost_transport_do_send_pkt */
> +	sock_hold(&trans->vsk->sk);
> +	virtio_transport_inc_tx_pkt(pkt);
> +
> +	/* Queue it up in vhost work */
> +	mutex_lock(&vq->mutex);
> +	list_add_tail(&pkt->list, &vsock->send_pkt_list);
> +	vhost_work_queue(&vsock->dev, &vsock->send_pkt_work);
> +	mutex_unlock(&vq->mutex);
> +
> +	return pkt_len;
> +}
> +
> +static struct virtio_transport_pkt_ops vhost_ops = {
> +	.send_pkt = vhost_transport_send_pkt,
> +};
> +
> +static struct virtio_vsock_pkt *
> +vhost_vsock_alloc_pkt(struct vhost_virtqueue *vq,
> +		      unsigned int out, unsigned int in)
> +{
> +	struct virtio_vsock_pkt *pkt;
> +	struct iov_iter iov_iter;
> +	size_t nbytes;
> +	size_t len;
> +
> +	if (in != 0) {
> +		vq_err(vq, "Expected 0 input buffers, got %u\n", in);
> +		return NULL;
> +	}
> +
> +	pkt = kzalloc(sizeof(*pkt), GFP_KERNEL);
> +	if (!pkt)
> +		return NULL;
> +
> +	len = iov_length(vq->iov, out);
> +	iov_iter_init(&iov_iter, WRITE, vq->iov, out, len);
> +
> +	nbytes = copy_from_iter(&pkt->hdr, sizeof(pkt->hdr), &iov_iter);
> +	if (nbytes != sizeof(pkt->hdr)) {
> +		vq_err(vq, "Expected %zu bytes for pkt->hdr, got %zu bytes\n",
> +		       sizeof(pkt->hdr), nbytes);
> +		kfree(pkt);
> +		return NULL;
> +	}
> +
> +	if (le16_to_cpu(pkt->hdr.type) == VIRTIO_VSOCK_TYPE_STREAM)
> +		pkt->len = le32_to_cpu(pkt->hdr.len);
> +
> +	/* No payload */
> +	if (!pkt->len)
> +		return pkt;
> +
> +	/* The pkt is too big */
> +	if (pkt->len > VIRTIO_VSOCK_MAX_PKT_BUF_SIZE) {
> +		kfree(pkt);
> +		return NULL;
> +	}
> +
> +	pkt->buf = kmalloc(pkt->len, GFP_KERNEL);
> +	if (!pkt->buf) {
> +		kfree(pkt);
> +		return NULL;
> +	}
> +
> +	nbytes = copy_from_iter(pkt->buf, pkt->len, &iov_iter);
> +	if (nbytes != pkt->len) {
> +		vq_err(vq, "Expected %u byte payload, got %zu bytes\n",
> +		       pkt->len, nbytes);
> +		virtio_transport_free_pkt(pkt);
> +		return NULL;
> +	}
> +
> +	return pkt;
> +}
> +
> +static void vhost_vsock_handle_ctl_kick(struct vhost_work *work)
> +{
> +	struct vhost_virtqueue *vq = container_of(work, struct vhost_virtqueue,
> +						  poll.work);
> +	struct vhost_vsock *vsock = container_of(vq->dev, struct vhost_vsock,
> +						 dev);
> +
> +	pr_debug("%s vq=%p, vsock=%p\n", __func__, vq, vsock);
> +}

This doesn't handle anything, it just prints debug stuff. Should this be
a NOP function?

> +
> +static void vhost_vsock_handle_tx_kick(struct vhost_work *work)
> +{
> +	struct vhost_virtqueue *vq = container_of(work, struct vhost_virtqueue,
> +						  poll.work);
> +	struct vhost_vsock *vsock = container_of(vq->dev, struct vhost_vsock,
> +						 dev);
> +	struct virtio_vsock_pkt *pkt;
> +	int head;
> +	unsigned int out, in;
> +	bool added = false;
> +	u32 len;
> +
> +	mutex_lock(&vq->mutex);
> +	vhost_disable_notify(&vsock->dev, vq);
> +	for (;;) {
> +		head = vhost_get_vq_desc(vq, vq->iov, ARRAY_SIZE(vq->iov),
> +					 &out, &in, NULL, NULL);
> +		if (head < 0)
> +			break;
> +
> +		if (head == vq->num) {
> +			if (unlikely(vhost_enable_notify(&vsock->dev, vq))) {
> +				vhost_disable_notify(&vsock->dev, vq);
> +				continue;

Same question about the enable/disable dance as above.

> +			}
> +			break;
> +		}
> +
> +		pkt = vhost_vsock_alloc_pkt(vq, out, in);
> +		if (!pkt) {
> +			vq_err(vq, "Faulted on pkt\n");
> +			continue;
> +		}
> +
> +		len = pkt->len;
> +
> +		/* Only accept correctly addressed packets */
> +		if (le32_to_cpu(pkt->hdr.src_cid) == vsock->guest_cid &&
> +		    le32_to_cpu(pkt->hdr.dst_cid) == vhost_transport_get_local_cid())
> +			virtio_transport_recv_pkt(pkt);
> +		else
> +			virtio_transport_free_pkt(pkt);
> +
> +		vhost_add_used(vq, head, len);
> +		added = true;
> +	}
> +	if (added)
> +		vhost_signal(&vsock->dev, vq);
> +	mutex_unlock(&vq->mutex);
> +}
> +
> +static void vhost_vsock_handle_rx_kick(struct vhost_work *work)
> +{
> +	struct vhost_virtqueue *vq = container_of(work, struct vhost_virtqueue,
> +						poll.work);
> +	struct vhost_vsock *vsock = container_of(vq->dev, struct vhost_vsock,
> +						 dev);
> +
> +	vhost_transport_do_send_pkt(vsock, vq);
> +}
> +
> +static int vhost_vsock_dev_open(struct inode *inode, struct file *file)
> +{
> +	struct vhost_virtqueue **vqs;
> +	struct vhost_vsock *vsock;
> +	int ret;
> +
> +	vsock = kzalloc(sizeof(*vsock), GFP_KERNEL);
> +	if (!vsock)
> +		return -ENOMEM;
> +
> +	pr_debug("%s:vsock=%p\n", __func__, vsock);
> +
> +	vqs = kmalloc(VSOCK_VQ_MAX * sizeof(*vqs), GFP_KERNEL);
> +	if (!vqs) {
> +		ret = -ENOMEM;
> +		goto out;
> +	}
> +
> +	vqs[VSOCK_VQ_CTRL] = &vsock->vqs[VSOCK_VQ_CTRL].vq;
> +	vqs[VSOCK_VQ_TX] = &vsock->vqs[VSOCK_VQ_TX].vq;
> +	vqs[VSOCK_VQ_RX] = &vsock->vqs[VSOCK_VQ_RX].vq;
> +	vsock->vqs[VSOCK_VQ_CTRL].vq.handle_kick = vhost_vsock_handle_ctl_kick;
> +	vsock->vqs[VSOCK_VQ_TX].vq.handle_kick = vhost_vsock_handle_tx_kick;
> +	vsock->vqs[VSOCK_VQ_RX].vq.handle_kick = vhost_vsock_handle_rx_kick;
> +
> +	vhost_dev_init(&vsock->dev, vqs, VSOCK_VQ_MAX);
> +
> +	file->private_data = vsock;
> +	init_waitqueue_head(&vsock->queue_wait);
> +	INIT_LIST_HEAD(&vsock->send_pkt_list);
> +	vhost_work_init(&vsock->send_pkt_work, vhost_transport_send_pkt_work);
> +
> +	mutex_lock(&vhost_vsock_mutex);
> +	list_add_tail(&vsock->list, &vhost_vsock_list);
> +	mutex_unlock(&vhost_vsock_mutex);
> +	return 0;
> +
> +out:
> +	kfree(vsock);
> +	return ret;
> +}
> +
> +static void vhost_vsock_flush(struct vhost_vsock *vsock)
> +{
> +	int i;
> +
> +	for (i = 0; i < VSOCK_VQ_MAX; i++)
> +		vhost_poll_flush(&vsock->vqs[i].vq.poll);
> +	vhost_work_flush(&vsock->dev, &vsock->send_pkt_work);
> +}
> +
> +static int vhost_vsock_dev_release(struct inode *inode, struct file *file)
> +{
> +	struct vhost_vsock *vsock = file->private_data;
> +
> +	mutex_lock(&vhost_vsock_mutex);
> +	list_del(&vsock->list);
> +	mutex_unlock(&vhost_vsock_mutex);
> +
> +	vhost_dev_stop(&vsock->dev);
> +	vhost_vsock_flush(vsock);
> +	vhost_dev_cleanup(&vsock->dev, false);
> +	kfree(vsock->dev.vqs);
> +	kfree(vsock);
> +	return 0;
> +}
> +
> +static int vhost_vsock_set_cid(struct vhost_vsock *vsock, u32 guest_cid)
> +{
> +	struct vhost_vsock *other;
> +
> +	/* Refuse reserved CIDs */
> +	if (guest_cid <= VMADDR_CID_HOST) {
> +		return -EINVAL;
> +	}
> +
> +	/* Refuse if CID is already in use */
> +	other = vhost_vsock_get(guest_cid);
> +	if (other && other != vsock) {
> +		return -EADDRINUSE;
> +	}
> +
> +	mutex_lock(&vhost_vsock_mutex);
> +	vsock->guest_cid = guest_cid;
> +	pr_debug("%s:guest_cid=%d\n", __func__, guest_cid);
> +	mutex_unlock(&vhost_vsock_mutex);
> +
> +	return 0;
> +}
> +
> +static int vhost_vsock_set_features(struct vhost_vsock *vsock, u64 features)
> +{
> +	struct vhost_virtqueue *vq;
> +	int i;
> +
> +	if (features & ~VHOST_VSOCK_FEATURES)
> +		return -EOPNOTSUPP;
> +
> +	mutex_lock(&vsock->dev.mutex);
> +	if ((features & (1 << VHOST_F_LOG_ALL)) &&
> +	    !vhost_log_access_ok(&vsock->dev)) {
> +		mutex_unlock(&vsock->dev.mutex);
> +		return -EFAULT;
> +	}
> +
> +	for (i = 0; i < VSOCK_VQ_MAX; i++) {
> +		vq = &vsock->vqs[i].vq;
> +		mutex_lock(&vq->mutex);
> +		vq->acked_features = features;

Is this a user supplied flag? Should it be masked to valid values?

> +		mutex_unlock(&vq->mutex);
> +	}
> +	mutex_unlock(&vsock->dev.mutex);
> +	return 0;
> +}
> +
> +static long vhost_vsock_dev_ioctl(struct file *f, unsigned int ioctl,
> +				  unsigned long arg)
> +{
> +	struct vhost_vsock *vsock = f->private_data;
> +	void __user *argp = (void __user *)arg;
> +	u64 __user *featurep = argp;
> +	u32 __user *cidp = argp;
> +	u32 guest_cid;
> +	u64 features;
> +	int r;
> +
> +	switch (ioctl) {
> +	case VHOST_VSOCK_SET_GUEST_CID:
> +		if (get_user(guest_cid, cidp))
> +			return -EFAULT;
> +		return vhost_vsock_set_cid(vsock, guest_cid);
> +	case VHOST_GET_FEATURES:
> +		features = VHOST_VSOCK_FEATURES;
> +		if (copy_to_user(featurep, &features, sizeof(features)))
> +			return -EFAULT;
> +		return 0;
> +	case VHOST_SET_FEATURES:
> +		if (copy_from_user(&features, featurep, sizeof(features)))
> +			return -EFAULT;
> +		return vhost_vsock_set_features(vsock, features);
> +	default:
> +		mutex_lock(&vsock->dev.mutex);
> +		r = vhost_dev_ioctl(&vsock->dev, ioctl, argp);
> +		if (r == -ENOIOCTLCMD)
> +			r = vhost_vring_ioctl(&vsock->dev, ioctl, argp);
> +		else
> +			vhost_vsock_flush(vsock);
> +		mutex_unlock(&vsock->dev.mutex);
> +		return r;
> +	}
> +}
> +
> +static const struct file_operations vhost_vsock_fops = {
> +	.owner          = THIS_MODULE,
> +	.open           = vhost_vsock_dev_open,
> +	.release        = vhost_vsock_dev_release,
> +	.llseek		= noop_llseek,
> +	.unlocked_ioctl = vhost_vsock_dev_ioctl,
> +};
> +
> +static struct miscdevice vhost_vsock_misc = {
> +	.minor = MISC_DYNAMIC_MINOR,
> +	.name = "vhost-vsock",
> +	.fops = &vhost_vsock_fops,
> +};
> +
> +static int
> +vhost_transport_socket_init(struct vsock_sock *vsk, struct vsock_sock *psk)
> +{
> +	struct virtio_transport *trans;
> +	int ret;
> +
> +	ret = virtio_transport_do_socket_init(vsk, psk);
> +	if (ret)
> +		return ret;
> +
> +	trans = vsk->trans;
> +	trans->ops = &vhost_ops;
> +
> +	return ret;
> +}
> +
> +static struct vsock_transport vhost_transport = {
> +	.get_local_cid            = vhost_transport_get_local_cid,
> +
> +	.init                     = vhost_transport_socket_init,
> +	.destruct                 = virtio_transport_destruct,
> +	.release                  = virtio_transport_release,
> +	.connect                  = virtio_transport_connect,
> +	.shutdown                 = virtio_transport_shutdown,
> +
> +	.dgram_enqueue            = virtio_transport_dgram_enqueue,
> +	.dgram_dequeue            = virtio_transport_dgram_dequeue,
> +	.dgram_bind               = virtio_transport_dgram_bind,
> +	.dgram_allow              = virtio_transport_dgram_allow,
> +
> +	.stream_enqueue           = virtio_transport_stream_enqueue,
> +	.stream_dequeue           = virtio_transport_stream_dequeue,
> +	.stream_has_data          = virtio_transport_stream_has_data,
> +	.stream_has_space         = virtio_transport_stream_has_space,
> +	.stream_rcvhiwat          = virtio_transport_stream_rcvhiwat,
> +	.stream_is_active         = virtio_transport_stream_is_active,
> +	.stream_allow             = virtio_transport_stream_allow,
> +
> +	.notify_poll_in           = virtio_transport_notify_poll_in,
> +	.notify_poll_out          = virtio_transport_notify_poll_out,
> +	.notify_recv_init         = virtio_transport_notify_recv_init,
> +	.notify_recv_pre_block    = virtio_transport_notify_recv_pre_block,
> +	.notify_recv_pre_dequeue  = virtio_transport_notify_recv_pre_dequeue,
> +	.notify_recv_post_dequeue = virtio_transport_notify_recv_post_dequeue,
> +	.notify_send_init         = virtio_transport_notify_send_init,
> +	.notify_send_pre_block    = virtio_transport_notify_send_pre_block,
> +	.notify_send_pre_enqueue  = virtio_transport_notify_send_pre_enqueue,
> +	.notify_send_post_enqueue = virtio_transport_notify_send_post_enqueue,
> +
> +	.set_buffer_size          = virtio_transport_set_buffer_size,
> +	.set_min_buffer_size      = virtio_transport_set_min_buffer_size,
> +	.set_max_buffer_size      = virtio_transport_set_max_buffer_size,
> +	.get_buffer_size          = virtio_transport_get_buffer_size,
> +	.get_min_buffer_size      = virtio_transport_get_min_buffer_size,
> +	.get_max_buffer_size      = virtio_transport_get_max_buffer_size,
> +};
> +
> +static int __init vhost_vsock_init(void)
> +{
> +	int ret;
> +
> +	ret = vsock_core_init(&vhost_transport);
> +	if (ret < 0)
> +		return ret;
> +	return misc_register(&vhost_vsock_misc);
> +}
> +
> +static void __exit vhost_vsock_exit(void)
> +{
> +	misc_deregister(&vhost_vsock_misc);
> +	vsock_core_exit();
> +}
> +
> +module_init(vhost_vsock_init);
> +module_exit(vhost_vsock_exit);
> +MODULE_LICENSE("GPL v2");
> +MODULE_AUTHOR("Asias He");
> +MODULE_DESCRIPTION("vhost transport for vsock");
> diff --git a/drivers/vhost/vsock.h b/drivers/vhost/vsock.h
> new file mode 100644
> index 0000000..0ddb107
> --- /dev/null
> +++ b/drivers/vhost/vsock.h
> @@ -0,0 +1,4 @@
> +#ifndef VHOST_VSOCK_H
> +#define VHOST_VSOCK_H
> +#define VHOST_VSOCK_SET_GUEST_CID _IOW(VHOST_VIRTIO, 0x60, __u32)
> +#endif
> --
> 2.5.0


--
Alex Bennée
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 4/4] VSOCK: Add Makefile and Kconfig
  2015-12-09 12:03 ` [PATCH v3 4/4] VSOCK: Add Makefile and Kconfig Stefan Hajnoczi
@ 2015-12-11 17:19   ` Alex Bennée
  2015-12-15  8:19     ` Stefan Hajnoczi
  2015-12-15  8:19     ` Stefan Hajnoczi
  0 siblings, 2 replies; 23+ messages in thread
From: Alex Bennée @ 2015-12-11 17:19 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: kvm, Michael S. Tsirkin, netdev, virtualization, Matt Benjamin,
	Asias He, Christoffer Dall, matt.ma


Stefan Hajnoczi <stefanha@redhat.com> writes:

> From: Asias He <asias@redhat.com>
>
> Enable virtio-vsock and vhost-vsock.
>
> Signed-off-by: Asias He <asias@redhat.com>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
> v3:
>  * Don't put vhost vsock driver into staging
>  * Add missing Kconfig dependencies (Arnd Bergmann <arnd@arndb.de>)
> ---
>  drivers/vhost/Kconfig  | 10 ++++++++++
>  drivers/vhost/Makefile |  4 ++++
>  net/vmw_vsock/Kconfig  | 18 ++++++++++++++++++
>  net/vmw_vsock/Makefile |  2 ++
>  4 files changed, 34 insertions(+)
>
> diff --git a/drivers/vhost/Kconfig b/drivers/vhost/Kconfig
> index 533eaf0..a1bb4c2 100644
> --- a/drivers/vhost/Kconfig
> +++ b/drivers/vhost/Kconfig
> @@ -21,6 +21,16 @@ config VHOST_SCSI
>  	Say M here to enable the vhost_scsi TCM fabric module
>  	for use with virtio-scsi guests
>
> +config VHOST_VSOCK
> +	tristate "vhost virtio-vsock driver"
> +	depends on VSOCKETS && EVENTFD
> +	select VIRTIO_VSOCKETS_COMMON
> +	select VHOST
> +	select VHOST_RING
> +	default n
> +	---help---
> +	Say M here to enable the vhost-vsock for virtio-vsock guests

I think checkpatch prefers a few more words for the feature but I'm
happy with it.

> +
>  config VHOST_RING
>  	tristate
>  	---help---
> diff --git a/drivers/vhost/Makefile b/drivers/vhost/Makefile
> index e0441c3..6b012b9 100644
> --- a/drivers/vhost/Makefile
> +++ b/drivers/vhost/Makefile
> @@ -4,5 +4,9 @@ vhost_net-y := net.o
>  obj-$(CONFIG_VHOST_SCSI) += vhost_scsi.o
>  vhost_scsi-y := scsi.o
>
> +obj-$(CONFIG_VHOST_VSOCK) += vhost_vsock.o
> +vhost_vsock-y := vsock.o
> +
>  obj-$(CONFIG_VHOST_RING) += vringh.o
> +
>  obj-$(CONFIG_VHOST)	+= vhost.o
> diff --git a/net/vmw_vsock/Kconfig b/net/vmw_vsock/Kconfig
> index 14810ab..74e0bc8 100644
> --- a/net/vmw_vsock/Kconfig
> +++ b/net/vmw_vsock/Kconfig
> @@ -26,3 +26,21 @@ config VMWARE_VMCI_VSOCKETS
>
>  	  To compile this driver as a module, choose M here: the module
>  	  will be called vmw_vsock_vmci_transport. If unsure, say N.
> +
> +config VIRTIO_VSOCKETS
> +	tristate "virtio transport for Virtual Sockets"
> +	depends on VSOCKETS && VIRTIO
> +	select VIRTIO_VSOCKETS_COMMON
> +	help
> +	  This module implements a virtio transport for Virtual Sockets.
> +
> +	  Enable this transport if your Virtual Machine runs on Qemu/KVM.

Is this better worded as:

"Enable this transport if your Virtual Machine host supports vsockets
over virtio."

?

Otherwise:

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

> +
> +	  To compile this driver as a module, choose M here: the module
> +	  will be called virtio_vsock_transport. If unsure, say N.
> +
> +config VIRTIO_VSOCKETS_COMMON
> +       tristate
> +       ---help---
> +         This option is selected by any driver which needs to access
> +         the virtio_vsock.
> diff --git a/net/vmw_vsock/Makefile b/net/vmw_vsock/Makefile
> index 2ce52d7..cf4c294 100644
> --- a/net/vmw_vsock/Makefile
> +++ b/net/vmw_vsock/Makefile
> @@ -1,5 +1,7 @@
>  obj-$(CONFIG_VSOCKETS) += vsock.o
>  obj-$(CONFIG_VMWARE_VMCI_VSOCKETS) += vmw_vsock_vmci_transport.o
> +obj-$(CONFIG_VIRTIO_VSOCKETS) += virtio_transport.o
> +obj-$(CONFIG_VIRTIO_VSOCKETS_COMMON) += virtio_transport_common.o
>
>  vsock-y += af_vsock.o vsock_addr.o
>
> --
> 2.5.0


--
Alex Bennée

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 3/4] VSOCK: Introduce vhost-vsock.ko
  2015-12-11 13:45   ` Alex Bennée
  2015-12-15  7:47     ` Stefan Hajnoczi
@ 2015-12-15  7:47     ` Stefan Hajnoczi
  1 sibling, 0 replies; 23+ messages in thread
From: Stefan Hajnoczi @ 2015-12-15  7:47 UTC (permalink / raw)
  To: Alex Bennée
  Cc: kvm, Matt Benjamin, Christoffer Dall, netdev, Michael S. Tsirkin,
	matt.ma, virtualization, Asias He

[-- Attachment #1: Type: text/plain, Size: 3986 bytes --]

On Fri, Dec 11, 2015 at 01:45:29PM +0000, Alex Bennée wrote:
> > +		if (head == vq->num) {
> > +			if (unlikely(vhost_enable_notify(&vsock->dev, vq))) {
> > +				vhost_disable_notify(&vsock->dev, vq);
> > +				continue;
> 
> Why are we doing this? If we enable something we then disable it? A
> comment as to what is going on here would be useful.

This is a standard optimization to avoid vmexits that other vhost
devices and QEMU implement too.

When the host begins pulling buffers off a virtqueue it first disables
guest->host notifications.  If the guest adds additional buffers while
the host is processing, the notification (vmexit) is skipped.  The host
re-enables guest->host notifications when it finishes virtqueue
processing.

If the guest added buffers after vhost_get_vq_desc() but before
vhost_enable_notify(), then vhost_enable_notify() returns true and the
host must process the buffers (i.e. restart the loop).  Failure to do so
could result in deadlocks because the guest didn't notify and the host
would be waiting for a notification.

I will add comments to the code.

> > +		vhost_add_used(vq, head, pkt->len); /* TODO should this be sizeof(pkt->hdr) + pkt->len? */
> 
> TODO needs sorting out or removing.

Will fix in the next revision.

> > +	/* Respect global tx buf limitation */
> > +	mutex_lock(&vq->mutex);
> > +	while (pkt_len + vsock->total_tx_buf >
> > VIRTIO_VSOCK_MAX_TX_BUF_SIZE) {
> 
> I'm curious about the relationship between
> VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE above and VIRTIO_VSOCK_MAX_TX_BUF_SIZE
> just here. Why do we need to limit pkt_len to the smaller when really
> all that matters is pkt_len + vsock->total_tx_buf >
> VIRTIO_VSOCK_MAX_TX_BUF_SIZE?

There are two separate issues:

1. The total amount of pending data.  The idea is to stop queuing
   packets and make the caller wait until resources become available so
   that vhost_vsock.ko memory consumption is bounded.

   The total_tx_buf limit is artificially lower than the actual
   virtqueue maximum data size.  Otherwise we could just rely on the
   virtqueue to limit the size, but it can be very large.

2. Splitting data into packets that fit into rx virtqueue buffers.  The
   guest sets up the rx virtqueue with VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE
   buffers.  Here, vhost_vsock.ko is assuming that the rx virtqueue
   buffers are always VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE bytes so it
   splits data along this boundary.

   This is ugly because the guest could choose a different buffer size
   and the host has VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE hardcoded.  I'll
   look into eliminating this assumption.

> > +static void vhost_vsock_handle_ctl_kick(struct vhost_work *work)
> > +{
> > +	struct vhost_virtqueue *vq = container_of(work, struct vhost_virtqueue,
> > +						  poll.work);
> > +	struct vhost_vsock *vsock = container_of(vq->dev, struct vhost_vsock,
> > +						 dev);
> > +
> > +	pr_debug("%s vq=%p, vsock=%p\n", __func__, vq, vsock);
> > +}
> 
> This doesn't handle anything, it just prints debug stuff. Should this be
> a NOP function?

The control virtqueue is currently not used.  In the next revision this
function will be dropped.

> > +static int vhost_vsock_set_features(struct vhost_vsock *vsock, u64 features)
> > +{
> > +	struct vhost_virtqueue *vq;
> > +	int i;
> > +
> > +	if (features & ~VHOST_VSOCK_FEATURES)
> > +		return -EOPNOTSUPP;
> > +
> > +	mutex_lock(&vsock->dev.mutex);
> > +	if ((features & (1 << VHOST_F_LOG_ALL)) &&
> > +	    !vhost_log_access_ok(&vsock->dev)) {
> > +		mutex_unlock(&vsock->dev.mutex);
> > +		return -EFAULT;
> > +	}
> > +
> > +	for (i = 0; i < VSOCK_VQ_MAX; i++) {
> > +		vq = &vsock->vqs[i].vq;
> > +		mutex_lock(&vq->mutex);
> > +		vq->acked_features = features;
> 
> Is this a user supplied flag? Should it be masked to valid values?

That is already done above where VHOST_VSOCK_FEATURES is checked.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 3/4] VSOCK: Introduce vhost-vsock.ko
  2015-12-11 13:45   ` Alex Bennée
@ 2015-12-15  7:47     ` Stefan Hajnoczi
  2015-12-15  7:47     ` Stefan Hajnoczi
  1 sibling, 0 replies; 23+ messages in thread
From: Stefan Hajnoczi @ 2015-12-15  7:47 UTC (permalink / raw)
  To: Alex Bennée
  Cc: kvm, Michael S. Tsirkin, netdev, virtualization, Matt Benjamin,
	Asias He, Christoffer Dall, matt.ma


[-- Attachment #1.1: Type: text/plain, Size: 3986 bytes --]

On Fri, Dec 11, 2015 at 01:45:29PM +0000, Alex Bennée wrote:
> > +		if (head == vq->num) {
> > +			if (unlikely(vhost_enable_notify(&vsock->dev, vq))) {
> > +				vhost_disable_notify(&vsock->dev, vq);
> > +				continue;
> 
> Why are we doing this? If we enable something we then disable it? A
> comment as to what is going on here would be useful.

This is a standard optimization to avoid vmexits that other vhost
devices and QEMU implement too.

When the host begins pulling buffers off a virtqueue it first disables
guest->host notifications.  If the guest adds additional buffers while
the host is processing, the notification (vmexit) is skipped.  The host
re-enables guest->host notifications when it finishes virtqueue
processing.

If the guest added buffers after vhost_get_vq_desc() but before
vhost_enable_notify(), then vhost_enable_notify() returns true and the
host must process the buffers (i.e. restart the loop).  Failure to do so
could result in deadlocks because the guest didn't notify and the host
would be waiting for a notification.

I will add comments to the code.

> > +		vhost_add_used(vq, head, pkt->len); /* TODO should this
> > be sizeof(pkt->hdr) + pkt->len? */
> 
> TODO needs sorting our or removing.

Will fix in the next revision.

> > +	/* Respect global tx buf limitation */
> > +	mutex_lock(&vq->mutex);
> > +	while (pkt_len + vsock->total_tx_buf >
> > VIRTIO_VSOCK_MAX_TX_BUF_SIZE) {
> 
> I'm curious about the relationship between
> VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE above and VIRTIO_VSOCK_MAX_TX_BUF_SIZE
> just here. Why do we need to limit pkt_len to the smaller when really
> all that matters is pkt_len + vsock->total_tx_buf >
> VIRTIO_VSOCK_MAX_TX_BUF_SIZE?

There are two separate issues:

1. The total amount of pending data.  The idea is to stop queuing
   packets and make the caller wait until resources become available so
   that vhost_vsock.ko memory consumption is bounded.

   total_tx_buf len is an artificial limit that is lower than the actual
   virtqueue maximum data size.  Otherwise we could just rely on the
   virtqueue to limit the size but it can be very large.

2. Splitting data into packets that fit into rx virtqueue buffers.  The
   guest sets up the rx virtqueue with VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE
   buffers.  Here, vhost_vsock.ko is assuming that the rx virtqueue
   buffers are always VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE bytes so it
   splits data along this boundary.

   This is ugly because the guest could choose a different buffer size
   and the host has VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE hardcoded.  I'll
   look into eliminating this assumption.

> > +static void vhost_vsock_handle_ctl_kick(struct vhost_work *work)
> > +{
> > +	struct vhost_virtqueue *vq = container_of(work, struct vhost_virtqueue,
> > +						  poll.work);
> > +	struct vhost_vsock *vsock = container_of(vq->dev, struct vhost_vsock,
> > +						 dev);
> > +
> > +	pr_debug("%s vq=%p, vsock=%p\n", __func__, vq, vsock);
> > +}
> 
> This doesn't handle anything, it just prints debug stuff. Should this be
> a NOP function?

The control virtqueue is currently not used.  In the next revision this
function will be dropped.

> > +static int vhost_vsock_set_features(struct vhost_vsock *vsock, u64 features)
> > +{
> > +	struct vhost_virtqueue *vq;
> > +	int i;
> > +
> > +	if (features & ~VHOST_VSOCK_FEATURES)
> > +		return -EOPNOTSUPP;
> > +
> > +	mutex_lock(&vsock->dev.mutex);
> > +	if ((features & (1 << VHOST_F_LOG_ALL)) &&
> > +	    !vhost_log_access_ok(&vsock->dev)) {
> > +		mutex_unlock(&vsock->dev.mutex);
> > +		return -EFAULT;
> > +	}
> > +
> > +	for (i = 0; i < VSOCK_VQ_MAX; i++) {
> > +		vq = &vsock->vqs[i].vq;
> > +		mutex_lock(&vq->mutex);
> > +		vq->acked_features = features;
> 
> Is this a user supplied flag? Should it be masked to valid values?

That is already done above where VHOST_VSOCK_FEATURES is checked.


_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


* Re: [PATCH v3 4/4] VSOCK: Add Makefile and Kconfig
  2015-12-11 17:19   ` Alex Bennée
@ 2015-12-15  8:19     ` Stefan Hajnoczi
  2015-12-15  8:19     ` Stefan Hajnoczi
  1 sibling, 0 replies; 23+ messages in thread
From: Stefan Hajnoczi @ 2015-12-15  8:19 UTC (permalink / raw)
  To: Alex Bennée
  Cc: kvm, Matt Benjamin, Christoffer Dall, netdev, Michael S. Tsirkin,
	matt.ma, virtualization, Asias He


On Fri, Dec 11, 2015 at 05:19:08PM +0000, Alex Bennée wrote:
> > +config VHOST_VSOCK
> > +	tristate "vhost virtio-vsock driver"
> > +	depends on VSOCKETS && EVENTFD
> > +	select VIRTIO_VSOCKETS_COMMON
> > +	select VHOST
> > +	select VHOST_RING
> > +	default n
> > +	---help---
> > +	Say M here to enable the vhost-vsock for virtio-vsock guests
> 
> I think checkpatch prefers a few more words for the feature but I'm
> happy with it.

I have expanded the description.

> > diff --git a/net/vmw_vsock/Kconfig b/net/vmw_vsock/Kconfig
> > index 14810ab..74e0bc8 100644
> > --- a/net/vmw_vsock/Kconfig
> > +++ b/net/vmw_vsock/Kconfig
> > @@ -26,3 +26,21 @@ config VMWARE_VMCI_VSOCKETS
> >
> >  	  To compile this driver as a module, choose M here: the module
> >  	  will be called vmw_vsock_vmci_transport. If unsure, say N.
> > +
> > +config VIRTIO_VSOCKETS
> > +	tristate "virtio transport for Virtual Sockets"
> > +	depends on VSOCKETS && VIRTIO
> > +	select VIRTIO_VSOCKETS_COMMON
> > +	help
> > +	  This module implements a virtio transport for Virtual Sockets.
> > +
> > +	  Enable this transport if your Virtual Machine runs on
> >  	  Qemu/KVM.
> 
> Is this better worded as:
> 
> "Enable this transport if your Virtual Machine host supports vsockets
> over virtio."

Good idea.  Will fix in the next revision.



