* [RFC v4 0/5] Add virtio transport for AF_VSOCK
@ 2015-12-22 9:07 Stefan Hajnoczi
2015-12-22 9:07 ` [RFC v4 1/5] VSOCK: transport-specific vsock_transport functions Stefan Hajnoczi
` (6 more replies)
0 siblings, 7 replies; 9+ messages in thread
From: Stefan Hajnoczi @ 2015-12-22 9:07 UTC (permalink / raw)
To: kvm
Cc: Stefan Hajnoczi, Michael S. Tsirkin, netdev, virtualization,
Matt Benjamin, Christoffer Dall, matt.ma
This series is based on v4.4-rc2 and the "virtio: make find_vqs()
checkpatch.pl-friendly" patch I recently submitted.
v4:
* Addressed code review comments from Alex Bennee
* MAINTAINERS file entries for new files
* Trace events instead of pr_debug()
* RST packet is sent when there is no listen socket
* Allow guest->host connections again (began discussing netfilter support with
Matt Benjamin instead of hard-coding security policy in virtio-vsock code)
* Many checkpatch.pl cleanups (will be 100% clean in v5)
v3:
* Remove unnecessary 3-way handshake, just do REQUEST/RESPONSE instead
of REQUEST/RESPONSE/ACK
* Remove SOCK_DGRAM support and focus on SOCK_STREAM first
(also drop v2 Patch 1, it's only needed for SOCK_DGRAM)
* Only allow host->guest connections (same security model as latest
VMware)
* Don't put vhost vsock driver into staging
* Add missing Kconfig dependencies (Arnd Bergmann <arnd@arndb.de>)
* Remove unneeded variable used to store return value
(Fengguang Wu <fengguang.wu@intel.com> and Julia Lawall
<julia.lawall@lip6.fr>)
v2:
* Rebased onto Linux v4.4-rc2
* vhost: Refuse to assign reserved CIDs
* vhost: Refuse guest CID if already in use
* vhost: Only accept correctly addressed packets (no spoofing!)
* vhost: Support flexible rx/tx descriptor layout
* vhost: Add missing total_tx_buf decrement
* virtio_transport: Fix total_tx_buf accounting
* virtio_transport: Add virtio_transport global mutex to prevent races
* common: Notify other side of SOCK_STREAM disconnect (fixes shutdown
semantics)
* common: Avoid recursive mutex_lock(tx_lock) for write_space (fixes deadlock)
* common: Define VIRTIO_VSOCK_TYPE_STREAM/DGRAM hardware interface constants
* common: Define VIRTIO_VSOCK_SHUTDOWN_RCV/SEND hardware interface constants
* common: Fix peer_buf_alloc inheritance on child socket
This patch series adds a virtio transport for AF_VSOCK (net/vmw_vsock/).
AF_VSOCK is designed for communication between virtual machines and
hypervisors. It is currently only implemented for VMware's VMCI transport.
This series implements the proposed virtio-vsock device specification from
here:
http://permalink.gmane.org/gmane.comp.emulators.virtio.devel/980
Most of the work was done by Asias He and Gerd Hoffmann a while back. I have
picked up the series again.
The QEMU userspace changes are here:
https://github.com/stefanha/qemu/commits/vsock
Why virtio-vsock?
-----------------
Guest<->host communication is currently done over the virtio-serial device.
This makes it hard to port sockets API-based applications and is limited to
static ports.
virtio-vsock uses the sockets API so that applications can rely on familiar
SOCK_STREAM semantics. Applications on the host can easily connect to guest
agents because the sockets API allows multiple connections to a listen socket
(unlike virtio-serial). This simplifies the guest<->host communication and
eliminates the need for extra processes on the host to arbitrate virtio-serial
ports.
Overview
--------
This series adds 3 pieces:
1. virtio_transport_common.ko - core virtio vsock code that uses vsock.ko
2. virtio_transport.ko - guest driver
3. drivers/vhost/vsock.ko - host driver
Howto
-----
The following kernel options are needed:
CONFIG_VSOCKETS=y
CONFIG_VIRTIO_VSOCKETS=y
CONFIG_VIRTIO_VSOCKETS_COMMON=y
CONFIG_VHOST_VSOCK=m
Launch QEMU as follows:
# qemu ... -device vhost-vsock-pci,id=vhost-vsock-pci0,guest-cid=3
Guest and host can communicate via AF_VSOCK sockets. The host's CID (address)
is 2 and the guest must be assigned a CID (3 in the example above).
Status
------
This patch series implements the latest draft specification. Please review.
Asias He (4):
VSOCK: Introduce virtio_vsock_common.ko
VSOCK: Introduce virtio_transport.ko
VSOCK: Introduce vhost_vsock.ko
VSOCK: Add Makefile and Kconfig
Stefan Hajnoczi (1):
VSOCK: transport-specific vsock_transport functions
MAINTAINERS | 13 +
drivers/vhost/Kconfig | 15 +
drivers/vhost/Makefile | 4 +
drivers/vhost/vsock.c | 607 +++++++++++++++
drivers/vhost/vsock.h | 4 +
include/linux/virtio_vsock.h | 167 +++++
include/net/af_vsock.h | 3 +
.../trace/events/vsock_virtio_transport_common.h | 144 ++++
include/uapi/linux/virtio_ids.h | 1 +
include/uapi/linux/virtio_vsock.h | 87 +++
net/vmw_vsock/Kconfig | 19 +
net/vmw_vsock/Makefile | 2 +
net/vmw_vsock/af_vsock.c | 9 +
net/vmw_vsock/virtio_transport.c | 481 ++++++++++++
net/vmw_vsock/virtio_transport_common.c | 834 +++++++++++++++++++++
15 files changed, 2390 insertions(+)
create mode 100644 drivers/vhost/vsock.c
create mode 100644 drivers/vhost/vsock.h
create mode 100644 include/linux/virtio_vsock.h
create mode 100644 include/trace/events/vsock_virtio_transport_common.h
create mode 100644 include/uapi/linux/virtio_vsock.h
create mode 100644 net/vmw_vsock/virtio_transport.c
create mode 100644 net/vmw_vsock/virtio_transport_common.c
--
2.5.0
^ permalink raw reply [flat|nested] 9+ messages in thread
* [RFC v4 1/5] VSOCK: transport-specific vsock_transport functions
2015-12-22 9:07 [RFC v4 0/5] Add virtio transport for AF_VSOCK Stefan Hajnoczi
@ 2015-12-22 9:07 ` Stefan Hajnoczi
2015-12-22 9:07 ` [RFC v4 2/5] VSOCK: Introduce virtio_vsock_common.ko Stefan Hajnoczi
` (5 subsequent siblings)
6 siblings, 0 replies; 9+ messages in thread
From: Stefan Hajnoczi @ 2015-12-22 9:07 UTC (permalink / raw)
To: kvm
Cc: Matt Benjamin, Christoffer Dall, netdev, Michael S. Tsirkin,
matt.ma, virtualization, Stefan Hajnoczi
struct vsock_transport contains function pointers called by AF_VSOCK
core code. The transport may want its own transport-specific function
pointers and they can be added after struct vsock_transport.
Allow the transport to fetch vsock_transport. It can downcast it to
access transport-specific function pointers.
The virtio transport will use this.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
include/net/af_vsock.h | 3 +++
net/vmw_vsock/af_vsock.c | 9 +++++++++
2 files changed, 12 insertions(+)
diff --git a/include/net/af_vsock.h b/include/net/af_vsock.h
index e9eb2d6..23f5525 100644
--- a/include/net/af_vsock.h
+++ b/include/net/af_vsock.h
@@ -165,6 +165,9 @@ static inline int vsock_core_init(const struct vsock_transport *t)
}
void vsock_core_exit(void);
+/* The transport may downcast this to access transport-specific functions */
+const struct vsock_transport *vsock_core_get_transport(void);
+
/**** UTILS ****/
void vsock_release_pending(struct sock *pending);
diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index 7fd1220..9783a38 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -1994,6 +1994,15 @@ void vsock_core_exit(void)
}
EXPORT_SYMBOL_GPL(vsock_core_exit);
+const struct vsock_transport *vsock_core_get_transport(void)
+{
+ /* vsock_register_mutex not taken since only the transport uses this
+ * function and only while registered.
+ */
+ return transport;
+}
+EXPORT_SYMBOL_GPL(vsock_core_get_transport);
+
MODULE_AUTHOR("VMware, Inc.");
MODULE_DESCRIPTION("VMware Virtual Socket Family");
MODULE_VERSION("1.0.1.0-k");
--
2.5.0
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [RFC v4 2/5] VSOCK: Introduce virtio_vsock_common.ko
2015-12-22 9:07 [RFC v4 0/5] Add virtio transport for AF_VSOCK Stefan Hajnoczi
2015-12-22 9:07 ` [RFC v4 1/5] VSOCK: transport-specific vsock_transport functions Stefan Hajnoczi
@ 2015-12-22 9:07 ` Stefan Hajnoczi
2015-12-22 9:07 ` [RFC v4 3/5] VSOCK: Introduce virtio_transport.ko Stefan Hajnoczi
` (4 subsequent siblings)
6 siblings, 0 replies; 9+ messages in thread
From: Stefan Hajnoczi @ 2015-12-22 9:07 UTC (permalink / raw)
To: kvm
Cc: Stefan Hajnoczi, Michael S. Tsirkin, netdev, virtualization,
Matt Benjamin, Asias He, Christoffer Dall, matt.ma
From: Asias He <asias@redhat.com>
This module contains the common code and header files for the following
virtio_transporto and vhost_vsock kernel modules.
Signed-off-by: Asias He <asias@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
v4:
* Add MAINTAINERS file entry
* checkpatch.pl cleanups
* linux_vsock.h: drop wrong copy-pasted license header
* Move tx sock refcounting to virtio_transport_alloc/free_pkt() to fix
leaks in error paths
* Add send_pkt_no_sock() to send RST packets with no listen socket
* Rename per-socket state from virtio_transport to virtio_vsock_sock
* Move send_pkt_ops to new virtio_transport struct
* Drop dumppkt function, packet capture will be added in the future
* Drop empty virtio_transport_dec_tx_pkt()
* Allow guest->host connections again
* Use trace events instead of pr_debug()
v3:
* Remove unnecessary 3-way handshake, just do REQUEST/RESPONSE instead
of REQUEST/RESPONSE/ACK
* Remove SOCK_DGRAM support and focus on SOCK_STREAM first
* Only allow host->guest connections (same security model as latest
VMware)
v2:
* Fix peer_buf_alloc inheritance on child socket
* Notify other side of SOCK_STREAM disconnect (fixes shutdown
semantics)
* Avoid recursive mutex_lock(tx_lock) for write_space (fixes deadlock)
* Define VIRTIO_VSOCK_TYPE_STREAM/DGRAM hardware interface constants
* Define VIRTIO_VSOCK_SHUTDOWN_RCV/SEND hardware interface constants
---
MAINTAINERS | 10 +
include/linux/virtio_vsock.h | 167 +++++
.../trace/events/vsock_virtio_transport_common.h | 144 ++++
include/uapi/linux/virtio_ids.h | 1 +
include/uapi/linux/virtio_vsock.h | 87 +++
net/vmw_vsock/virtio_transport_common.c | 834 +++++++++++++++++++++
6 files changed, 1243 insertions(+)
create mode 100644 include/linux/virtio_vsock.h
create mode 100644 include/trace/events/vsock_virtio_transport_common.h
create mode 100644 include/uapi/linux/virtio_vsock.h
create mode 100644 net/vmw_vsock/virtio_transport_common.c
diff --git a/MAINTAINERS b/MAINTAINERS
index 050d0e7..d42db78 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -11360,6 +11360,16 @@ S: Maintained
F: drivers/media/v4l2-core/videobuf2-*
F: include/media/videobuf2-*
+VIRTIO AND VHOST VSOCK DRIVER
+M: Stefan Hajnoczi <stefanha@redhat.com>
+L: kvm@vger.kernel.org
+L: virtualization@lists.linux-foundation.org
+L: netdev@vger.kernel.org
+S: Maintained
+F: include/linux/virtio_vsock.h
+F: include/uapi/linux/virtio_vsock.h
+F: net/vmw_vsock/virtio_transport_common.c
+
VIRTUAL SERIO DEVICE DRIVER
M: Stephen Chandler Paul <thatslyude@gmail.com>
S: Maintained
diff --git a/include/linux/virtio_vsock.h b/include/linux/virtio_vsock.h
new file mode 100644
index 0000000..4acf1ad
--- /dev/null
+++ b/include/linux/virtio_vsock.h
@@ -0,0 +1,167 @@
+#ifndef _LINUX_VIRTIO_VSOCK_H
+#define _LINUX_VIRTIO_VSOCK_H
+
+#include <uapi/linux/virtio_vsock.h>
+#include <linux/socket.h>
+#include <net/sock.h>
+#include <net/af_vsock.h>
+
+#define VIRTIO_VSOCK_DEFAULT_MIN_BUF_SIZE 128
+#define VIRTIO_VSOCK_DEFAULT_BUF_SIZE (1024 * 256)
+#define VIRTIO_VSOCK_DEFAULT_MAX_BUF_SIZE (1024 * 256)
+#define VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE (1024 * 4)
+#define VIRTIO_VSOCK_MAX_BUF_SIZE 0xFFFFFFFFUL
+#define VIRTIO_VSOCK_MAX_PKT_BUF_SIZE (1024 * 64)
+#define VIRTIO_VSOCK_MAX_TX_BUF_SIZE (1024 * 1024 * 16)
+#define VIRTIO_VSOCK_MAX_DGRAM_SIZE (1024 * 64)
+
+enum {
+ VSOCK_VQ_CTRL = 0,
+ VSOCK_VQ_RX = 1, /* for host to guest data */
+ VSOCK_VQ_TX = 2, /* for guest to host data */
+ VSOCK_VQ_MAX = 3,
+};
+
+/* Per-socket state (accessed via vsk->trans) */
+struct virtio_vsock_sock {
+ struct vsock_sock *vsk;
+
+ /* Protected by lock_sock(sk_vsock(trans->vsk)) */
+ u32 buf_size;
+ u32 buf_size_min;
+ u32 buf_size_max;
+
+ struct mutex tx_lock;
+ struct mutex rx_lock;
+
+ /* Protected by tx_lock */
+ u32 tx_cnt;
+ u32 buf_alloc;
+ u32 peer_fwd_cnt;
+ u32 peer_buf_alloc;
+
+ /* Protected by rx_lock */
+ u32 fwd_cnt;
+ u32 rx_bytes;
+ struct list_head rx_queue;
+};
+
+struct virtio_vsock_pkt {
+ struct virtio_vsock_hdr hdr;
+ struct work_struct work;
+ struct list_head list;
+ void *buf;
+ u32 len;
+ u32 off;
+};
+
+struct virtio_vsock_pkt_info {
+ u32 remote_cid, remote_port;
+ struct msghdr *msg;
+ u32 pkt_len;
+ u16 type;
+ u16 op;
+ u32 flags;
+};
+
+struct virtio_transport {
+ /* This must be the first field */
+ struct vsock_transport transport;
+
+ /* Send packet for a specific socket */
+ int (*send_pkt)(struct vsock_sock *vsk,
+ struct virtio_vsock_pkt_info *info);
+
+ /* Send packet without a socket (e.g. RST). Prefer send_pkt() over
+ * send_pkt_no_sock() when a socket exists.
+ */
+ int (*send_pkt_no_sock)(struct virtio_vsock_pkt *pkt);
+};
+
+struct virtio_vsock_pkt *
+virtio_transport_alloc_pkt(struct virtio_vsock_pkt_info *info,
+ size_t len,
+ u32 src_cid,
+ u32 src_port,
+ u32 dst_cid,
+ u32 dst_port);
+ssize_t
+virtio_transport_stream_dequeue(struct vsock_sock *vsk,
+ struct msghdr *msg,
+ size_t len,
+ int type);
+int
+virtio_transport_dgram_dequeue(struct vsock_sock *vsk,
+ struct msghdr *msg,
+ size_t len, int flags);
+
+s64 virtio_transport_stream_has_data(struct vsock_sock *vsk);
+s64 virtio_transport_stream_has_space(struct vsock_sock *vsk);
+
+int virtio_transport_do_socket_init(struct vsock_sock *vsk,
+ struct vsock_sock *psk);
+u64 virtio_transport_get_buffer_size(struct vsock_sock *vsk);
+u64 virtio_transport_get_min_buffer_size(struct vsock_sock *vsk);
+u64 virtio_transport_get_max_buffer_size(struct vsock_sock *vsk);
+void virtio_transport_set_buffer_size(struct vsock_sock *vsk, u64 val);
+void virtio_transport_set_min_buffer_size(struct vsock_sock *vsk, u64 val);
+void virtio_transport_set_max_buffer_size(struct vsock_sock *vs, u64 val);
+int
+virtio_transport_notify_poll_in(struct vsock_sock *vsk,
+ size_t target,
+ bool *data_ready_now);
+int
+virtio_transport_notify_poll_out(struct vsock_sock *vsk,
+ size_t target,
+ bool *space_available_now);
+
+int virtio_transport_notify_recv_init(struct vsock_sock *vsk,
+ size_t target, struct vsock_transport_recv_notify_data *data);
+int virtio_transport_notify_recv_pre_block(struct vsock_sock *vsk,
+ size_t target, struct vsock_transport_recv_notify_data *data);
+int virtio_transport_notify_recv_pre_dequeue(struct vsock_sock *vsk,
+ size_t target, struct vsock_transport_recv_notify_data *data);
+int virtio_transport_notify_recv_post_dequeue(struct vsock_sock *vsk,
+ size_t target, ssize_t copied, bool data_read,
+ struct vsock_transport_recv_notify_data *data);
+int virtio_transport_notify_send_init(struct vsock_sock *vsk,
+ struct vsock_transport_send_notify_data *data);
+int virtio_transport_notify_send_pre_block(struct vsock_sock *vsk,
+ struct vsock_transport_send_notify_data *data);
+int virtio_transport_notify_send_pre_enqueue(struct vsock_sock *vsk,
+ struct vsock_transport_send_notify_data *data);
+int virtio_transport_notify_send_post_enqueue(struct vsock_sock *vsk,
+ ssize_t written, struct vsock_transport_send_notify_data *data);
+
+u64 virtio_transport_stream_rcvhiwat(struct vsock_sock *vsk);
+bool virtio_transport_stream_is_active(struct vsock_sock *vsk);
+bool virtio_transport_stream_allow(u32 cid, u32 port);
+int virtio_transport_dgram_bind(struct vsock_sock *vsk,
+ struct sockaddr_vm *addr);
+bool virtio_transport_dgram_allow(u32 cid, u32 port);
+
+int virtio_transport_connect(struct vsock_sock *vsk);
+
+int virtio_transport_shutdown(struct vsock_sock *vsk, int mode);
+
+void virtio_transport_release(struct vsock_sock *vsk);
+
+ssize_t
+virtio_transport_stream_enqueue(struct vsock_sock *vsk,
+ struct msghdr *msg,
+ size_t len);
+int
+virtio_transport_dgram_enqueue(struct vsock_sock *vsk,
+ struct sockaddr_vm *remote_addr,
+ struct msghdr *msg,
+ size_t len);
+
+void virtio_transport_destruct(struct vsock_sock *vsk);
+
+void virtio_transport_recv_pkt(struct virtio_vsock_pkt *pkt);
+void virtio_transport_free_pkt(struct virtio_vsock_pkt *pkt);
+void virtio_transport_inc_tx_pkt(struct virtio_vsock_sock *vvs, struct virtio_vsock_pkt *pkt);
+u32 virtio_transport_get_credit(struct virtio_vsock_sock *vvs, u32 wanted);
+void virtio_transport_put_credit(struct virtio_vsock_sock *vvs, u32 credit);
+
+#endif /* _LINUX_VIRTIO_VSOCK_H */
diff --git a/include/trace/events/vsock_virtio_transport_common.h b/include/trace/events/vsock_virtio_transport_common.h
new file mode 100644
index 0000000..b7f1d62
--- /dev/null
+++ b/include/trace/events/vsock_virtio_transport_common.h
@@ -0,0 +1,144 @@
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM vsock
+
+#if !defined(_TRACE_VSOCK_VIRTIO_TRANSPORT_COMMON_H) || \
+ defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_VSOCK_VIRTIO_TRANSPORT_COMMON_H
+
+#include <linux/tracepoint.h>
+
+TRACE_DEFINE_ENUM(VIRTIO_VSOCK_TYPE_STREAM);
+
+#define show_type(val) \
+ __print_symbolic(val, { VIRTIO_VSOCK_TYPE_STREAM, "STREAM" })
+
+TRACE_DEFINE_ENUM(VIRTIO_VSOCK_OP_INVALID);
+TRACE_DEFINE_ENUM(VIRTIO_VSOCK_OP_REQUEST);
+TRACE_DEFINE_ENUM(VIRTIO_VSOCK_OP_RESPONSE);
+TRACE_DEFINE_ENUM(VIRTIO_VSOCK_OP_RST);
+TRACE_DEFINE_ENUM(VIRTIO_VSOCK_OP_SHUTDOWN);
+TRACE_DEFINE_ENUM(VIRTIO_VSOCK_OP_RW);
+TRACE_DEFINE_ENUM(VIRTIO_VSOCK_OP_CREDIT_UPDATE);
+TRACE_DEFINE_ENUM(VIRTIO_VSOCK_OP_CREDIT_REQUEST);
+
+#define show_op(val) \
+ __print_symbolic(val, \
+ { VIRTIO_VSOCK_OP_INVALID, "INVALID" }, \
+ { VIRTIO_VSOCK_OP_REQUEST, "REQUEST" }, \
+ { VIRTIO_VSOCK_OP_RESPONSE, "RESPONSE" }, \
+ { VIRTIO_VSOCK_OP_RST, "RST" }, \
+ { VIRTIO_VSOCK_OP_SHUTDOWN, "SHUTDOWN" }, \
+ { VIRTIO_VSOCK_OP_RW, "RW" }, \
+ { VIRTIO_VSOCK_OP_CREDIT_UPDATE, "CREDIT_UPDATE" }, \
+ { VIRTIO_VSOCK_OP_CREDIT_REQUEST, "CREDIT_REQUEST" })
+
+TRACE_EVENT(virtio_transport_alloc_pkt,
+ TP_PROTO(
+ __u32 src_cid, __u32 src_port,
+ __u32 dst_cid, __u32 dst_port,
+ __u32 len,
+ __u16 type,
+ __u16 op,
+ __u32 flags
+ ),
+ TP_ARGS(
+ src_cid, src_port,
+ dst_cid, dst_port,
+ len,
+ type,
+ op,
+ flags
+ ),
+ TP_STRUCT__entry(
+ __field(__u32, src_cid)
+ __field(__u32, src_port)
+ __field(__u32, dst_cid)
+ __field(__u32, dst_port)
+ __field(__u32, len)
+ __field(__u16, type)
+ __field(__u16, op)
+ __field(__u32, flags)
+ ),
+ TP_fast_assign(
+ __entry->src_cid = src_cid;
+ __entry->src_port = src_port;
+ __entry->dst_cid = dst_cid;
+ __entry->dst_port = dst_port;
+ __entry->len = len;
+ __entry->type = type;
+ __entry->op = op;
+ __entry->flags = flags;
+ ),
+ TP_printk("%u:%u -> %u:%u len=%u type=%s op=%s flags=%#x",
+ __entry->src_cid, __entry->src_port,
+ __entry->dst_cid, __entry->dst_port,
+ __entry->len,
+ show_type(__entry->type),
+ show_op(__entry->op),
+ __entry->flags)
+);
+
+TRACE_EVENT(virtio_transport_recv_pkt,
+ TP_PROTO(
+ __u32 src_cid, __u32 src_port,
+ __u32 dst_cid, __u32 dst_port,
+ __u32 len,
+ __u16 type,
+ __u16 op,
+ __u32 flags,
+ __u32 buf_alloc,
+ __u32 fwd_cnt
+ ),
+ TP_ARGS(
+ src_cid, src_port,
+ dst_cid, dst_port,
+ len,
+ type,
+ op,
+ flags,
+ buf_alloc,
+ fwd_cnt
+ ),
+ TP_STRUCT__entry(
+ __field(__u32, src_cid)
+ __field(__u32, src_port)
+ __field(__u32, dst_cid)
+ __field(__u32, dst_port)
+ __field(__u32, len)
+ __field(__u16, type)
+ __field(__u16, op)
+ __field(__u32, flags)
+ __field(__u32, buf_alloc)
+ __field(__u32, fwd_cnt)
+ ),
+ TP_fast_assign(
+ __entry->src_cid = src_cid;
+ __entry->src_port = src_port;
+ __entry->dst_cid = dst_cid;
+ __entry->dst_port = dst_port;
+ __entry->len = len;
+ __entry->type = type;
+ __entry->op = op;
+ __entry->flags = flags;
+ __entry->buf_alloc = buf_alloc;
+ __entry->fwd_cnt = fwd_cnt;
+ ),
+ TP_printk("%u:%u -> %u:%u len=%u type=%s op=%s flags=%#x "
+ "buf_alloc=%u fwd_cnt=%u",
+ __entry->src_cid, __entry->src_port,
+ __entry->dst_cid, __entry->dst_port,
+ __entry->len,
+ show_type(__entry->type),
+ show_op(__entry->op),
+ __entry->flags,
+ __entry->buf_alloc,
+ __entry->fwd_cnt)
+);
+
+#endif /* _TRACE_VSOCK_VIRTIO_TRANSPORT_COMMON_H */
+
+#undef TRACE_INCLUDE_FILE
+#define TRACE_INCLUDE_FILE vsock_virtio_transport_common
+
+/* This part must be outside protection */
+#include <trace/define_trace.h>
diff --git a/include/uapi/linux/virtio_ids.h b/include/uapi/linux/virtio_ids.h
index 77925f5..16dcf5d 100644
--- a/include/uapi/linux/virtio_ids.h
+++ b/include/uapi/linux/virtio_ids.h
@@ -39,6 +39,7 @@
#define VIRTIO_ID_9P 9 /* 9p virtio console */
#define VIRTIO_ID_RPROC_SERIAL 11 /* virtio remoteproc serial link */
#define VIRTIO_ID_CAIF 12 /* Virtio caif */
+#define VIRTIO_ID_VSOCK 13 /* virtio vsock transport */
#define VIRTIO_ID_GPU 16 /* virtio GPU */
#define VIRTIO_ID_INPUT 18 /* virtio input */
diff --git a/include/uapi/linux/virtio_vsock.h b/include/uapi/linux/virtio_vsock.h
new file mode 100644
index 0000000..ac6483d
--- /dev/null
+++ b/include/uapi/linux/virtio_vsock.h
@@ -0,0 +1,87 @@
+/*
+ * This header, excluding the #ifdef __KERNEL__ part, is BSD licensed so
+ * anyone can use the definitions to implement compatible drivers/servers:
+ *
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in the
+ * documentation and/or other materials provided with the distribution.
+ * 3. Neither the name of IBM nor the names of its contributors
+ * may be used to endorse or promote products derived from this software
+ * without specific prior written permission.
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS ``AS IS''
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL IBM OR CONTRIBUTORS BE LIABLE
+ * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+ * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+ * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ *
+ * Copyright (C) Red Hat, Inc., 2013-2015
+ * Copyright (C) Asias He <asias@redhat.com>, 2013
+ * Copyright (C) Stefan Hajnoczi <stefanha@redhat.com>, 2015
+ */
+
+#ifndef _UAPI_LINUX_VIRTIO_VSOCK_H
+#define _UAPI_LINUX_VIRTIO_VOSCK_H
+
+#include <linux/types.h>
+#include <linux/virtio_ids.h>
+#include <linux/virtio_config.h>
+
+struct virtio_vsock_config {
+ __le32 guest_cid;
+ __le32 max_virtqueue_pairs;
+};
+
+struct virtio_vsock_hdr {
+ __le32 src_cid;
+ __le32 src_port;
+ __le32 dst_cid;
+ __le32 dst_port;
+ __le32 len;
+ __le16 type; /* enum virtio_vsock_type */
+ __le16 op; /* enum virtio_vsock_op */
+ __le32 flags;
+ __le32 buf_alloc;
+ __le32 fwd_cnt;
+};
+
+enum virtio_vsock_type {
+ VIRTIO_VSOCK_TYPE_STREAM = 1,
+};
+
+enum virtio_vsock_op {
+ VIRTIO_VSOCK_OP_INVALID = 0,
+
+ /* Connect operations */
+ VIRTIO_VSOCK_OP_REQUEST = 1,
+ VIRTIO_VSOCK_OP_RESPONSE = 2,
+ VIRTIO_VSOCK_OP_RST = 3,
+ VIRTIO_VSOCK_OP_SHUTDOWN = 4,
+
+ /* To send payload */
+ VIRTIO_VSOCK_OP_RW = 5,
+
+ /* Tell the peer our credit info */
+ VIRTIO_VSOCK_OP_CREDIT_UPDATE = 6,
+ /* Request the peer to send the credit info to us */
+ VIRTIO_VSOCK_OP_CREDIT_REQUEST = 7,
+};
+
+/* VIRTIO_VSOCK_OP_SHUTDOWN flags values */
+enum virtio_vsock_shutdown {
+ VIRTIO_VSOCK_SHUTDOWN_RCV = 1,
+ VIRTIO_VSOCK_SHUTDOWN_SEND = 2,
+};
+
+#endif /* _UAPI_LINUX_VIRTIO_VSOCK_H */
diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
new file mode 100644
index 0000000..000f12b
--- /dev/null
+++ b/net/vmw_vsock/virtio_transport_common.c
@@ -0,0 +1,834 @@
+/*
+ * common code for virtio vsock
+ *
+ * Copyright (C) 2013-2015 Red Hat, Inc.
+ * Author: Asias He <asias@redhat.com>
+ * Stefan Hajnoczi <stefanha@redhat.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.
+ */
+#include <linux/module.h>
+#include <linux/ctype.h>
+#include <linux/list.h>
+#include <linux/virtio.h>
+#include <linux/virtio_ids.h>
+#include <linux/virtio_config.h>
+#include <linux/virtio_vsock.h>
+
+#include <net/sock.h>
+#include <net/af_vsock.h>
+
+#define CREATE_TRACE_POINTS
+#include <trace/events/vsock_virtio_transport_common.h>
+
+static const struct virtio_transport *virtio_transport_get_ops(void)
+{
+ const struct vsock_transport *t = vsock_core_get_transport();
+
+ return container_of(t, struct virtio_transport, transport);
+}
+
+static int virtio_transport_send_pkt(struct vsock_sock *vsk,
+ struct virtio_vsock_pkt_info *info)
+{
+ return virtio_transport_get_ops()->send_pkt(vsk, info);
+}
+
+static int virtio_transport_send_pkt_no_sock(struct virtio_vsock_pkt *pkt)
+{
+ return virtio_transport_get_ops()->send_pkt_no_sock(pkt);
+}
+
+struct virtio_vsock_pkt *
+virtio_transport_alloc_pkt(struct virtio_vsock_pkt_info *info,
+ size_t len,
+ u32 src_cid,
+ u32 src_port,
+ u32 dst_cid,
+ u32 dst_port)
+{
+ struct virtio_vsock_pkt *pkt;
+ int err;
+
+ pkt = kzalloc(sizeof(*pkt), GFP_KERNEL);
+ if (!pkt)
+ return NULL;
+
+ pkt->hdr.type = cpu_to_le16(info->type);
+ pkt->hdr.op = cpu_to_le16(info->op);
+ pkt->hdr.src_cid = cpu_to_le32(src_cid);
+ pkt->hdr.src_port = cpu_to_le32(src_port);
+ pkt->hdr.dst_cid = cpu_to_le32(dst_cid);
+ pkt->hdr.dst_port = cpu_to_le32(dst_port);
+ pkt->hdr.flags = cpu_to_le32(info->flags);
+ pkt->len = len;
+ pkt->hdr.len = cpu_to_le32(len);
+
+ if (info->msg && len > 0) {
+ pkt->buf = kmalloc(len, GFP_KERNEL);
+ if (!pkt->buf)
+ goto out_pkt;
+ err = memcpy_from_msg(pkt->buf, info->msg, len);
+ if (err)
+ goto out;
+ }
+
+ trace_virtio_transport_alloc_pkt(src_cid, src_port,
+ dst_cid, dst_port,
+ len,
+ info->type,
+ info->op,
+ info->flags);
+
+ return pkt;
+
+out:
+ kfree(pkt->buf);
+out_pkt:
+ kfree(pkt);
+ return NULL;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_alloc_pkt);
+
+static void virtio_transport_inc_rx_pkt(struct virtio_vsock_sock *vvs,
+ struct virtio_vsock_pkt *pkt)
+{
+ vvs->rx_bytes += pkt->len;
+}
+
+static void virtio_transport_dec_rx_pkt(struct virtio_vsock_sock *vvs,
+ struct virtio_vsock_pkt *pkt)
+{
+ vvs->rx_bytes -= pkt->len;
+ vvs->fwd_cnt += pkt->len;
+}
+
+void virtio_transport_inc_tx_pkt(struct virtio_vsock_sock *vvs, struct virtio_vsock_pkt *pkt)
+{
+ mutex_lock(&vvs->tx_lock);
+ pkt->hdr.fwd_cnt = cpu_to_le32(vvs->fwd_cnt);
+ pkt->hdr.buf_alloc = cpu_to_le32(vvs->buf_alloc);
+ mutex_unlock(&vvs->tx_lock);
+}
+EXPORT_SYMBOL_GPL(virtio_transport_inc_tx_pkt);
+
+u32 virtio_transport_get_credit(struct virtio_vsock_sock *vvs, u32 credit)
+{
+ u32 ret;
+
+ mutex_lock(&vvs->tx_lock);
+ ret = vvs->peer_buf_alloc - (vvs->tx_cnt - vvs->peer_fwd_cnt);
+ if (ret > credit)
+ ret = credit;
+ vvs->tx_cnt += ret;
+ mutex_unlock(&vvs->tx_lock);
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_get_credit);
+
+void virtio_transport_put_credit(struct virtio_vsock_sock *vvs, u32 credit)
+{
+ mutex_lock(&vvs->tx_lock);
+ vvs->tx_cnt -= credit;
+ mutex_unlock(&vvs->tx_lock);
+}
+EXPORT_SYMBOL_GPL(virtio_transport_put_credit);
+
+static int virtio_transport_send_credit_update(struct vsock_sock *vsk,
+ int type,
+ struct virtio_vsock_hdr *hdr)
+{
+ struct virtio_vsock_pkt_info info = {
+ .op = VIRTIO_VSOCK_OP_CREDIT_UPDATE,
+ .type = type,
+ };
+
+ return virtio_transport_send_pkt(vsk, &info);
+}
+
+static ssize_t
+virtio_transport_stream_do_dequeue(struct vsock_sock *vsk,
+ struct msghdr *msg,
+ size_t len)
+{
+ struct virtio_vsock_sock *vvs = vsk->trans;
+ struct virtio_vsock_pkt *pkt;
+ size_t bytes, total = 0;
+ int err = -EFAULT;
+
+ mutex_lock(&vvs->rx_lock);
+ while (total < len &&
+ vvs->rx_bytes > 0 &&
+ !list_empty(&vvs->rx_queue)) {
+ pkt = list_first_entry(&vvs->rx_queue,
+ struct virtio_vsock_pkt, list);
+
+ bytes = len - total;
+ if (bytes > pkt->len - pkt->off)
+ bytes = pkt->len - pkt->off;
+
+ err = memcpy_to_msg(msg, pkt->buf + pkt->off, bytes);
+ if (err)
+ goto out;
+ total += bytes;
+ pkt->off += bytes;
+ if (pkt->off == pkt->len) {
+ virtio_transport_dec_rx_pkt(vvs, pkt);
+ list_del(&pkt->list);
+ virtio_transport_free_pkt(pkt);
+ }
+ }
+ mutex_unlock(&vvs->rx_lock);
+
+ /* Send a credit pkt to peer */
+ virtio_transport_send_credit_update(vsk, VIRTIO_VSOCK_TYPE_STREAM,
+ NULL);
+
+ return total;
+
+out:
+ mutex_unlock(&vvs->rx_lock);
+ if (total)
+ err = total;
+ return err;
+}
+
+ssize_t
+virtio_transport_stream_dequeue(struct vsock_sock *vsk,
+ struct msghdr *msg,
+ size_t len, int flags)
+{
+ if (flags & MSG_PEEK)
+ return -EOPNOTSUPP;
+
+ return virtio_transport_stream_do_dequeue(vsk, msg, len);
+}
+EXPORT_SYMBOL_GPL(virtio_transport_stream_dequeue);
+
+int
+virtio_transport_dgram_dequeue(struct vsock_sock *vsk,
+ struct msghdr *msg,
+ size_t len, int flags)
+{
+ return -EOPNOTSUPP;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_dgram_dequeue);
+
+s64 virtio_transport_stream_has_data(struct vsock_sock *vsk)
+{
+ struct virtio_vsock_sock *vvs = vsk->trans;
+ s64 bytes;
+
+ mutex_lock(&vvs->rx_lock);
+ bytes = vvs->rx_bytes;
+ mutex_unlock(&vvs->rx_lock);
+
+ return bytes;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_stream_has_data);
+
+static s64 virtio_transport_has_space(struct vsock_sock *vsk)
+{
+ struct virtio_vsock_sock *vvs = vsk->trans;
+ s64 bytes;
+
+ bytes = vvs->peer_buf_alloc - (vvs->tx_cnt - vvs->peer_fwd_cnt);
+ if (bytes < 0)
+ bytes = 0;
+
+ return bytes;
+}
+
+s64 virtio_transport_stream_has_space(struct vsock_sock *vsk)
+{
+ struct virtio_vsock_sock *vvs = vsk->trans;
+ s64 bytes;
+
+ mutex_lock(&vvs->tx_lock);
+ bytes = virtio_transport_has_space(vsk);
+ mutex_unlock(&vvs->tx_lock);
+
+ return bytes;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_stream_has_space);
+
+int virtio_transport_do_socket_init(struct vsock_sock *vsk,
+ struct vsock_sock *psk)
+{
+ struct virtio_vsock_sock *vvs;
+
+ vvs = kzalloc(sizeof(*vvs), GFP_KERNEL);
+ if (!vvs)
+ return -ENOMEM;
+
+ vsk->trans = vvs;
+ vvs->vsk = vsk;
+ if (psk) {
+ struct virtio_vsock_sock *ptrans = psk->trans;
+
+ vvs->buf_size = ptrans->buf_size;
+ vvs->buf_size_min = ptrans->buf_size_min;
+ vvs->buf_size_max = ptrans->buf_size_max;
+ vvs->peer_buf_alloc = ptrans->peer_buf_alloc;
+ } else {
+ vvs->buf_size = VIRTIO_VSOCK_DEFAULT_BUF_SIZE;
+ vvs->buf_size_min = VIRTIO_VSOCK_DEFAULT_MIN_BUF_SIZE;
+ vvs->buf_size_max = VIRTIO_VSOCK_DEFAULT_MAX_BUF_SIZE;
+ }
+
+ vvs->buf_alloc = vvs->buf_size;
+
+ mutex_init(&vvs->rx_lock);
+ mutex_init(&vvs->tx_lock);
+ INIT_LIST_HEAD(&vvs->rx_queue);
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_do_socket_init);
+
+u64 virtio_transport_get_buffer_size(struct vsock_sock *vsk)
+{
+ struct virtio_vsock_sock *vvs = vsk->trans;
+
+ return vvs->buf_size;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_get_buffer_size);
+
+u64 virtio_transport_get_min_buffer_size(struct vsock_sock *vsk)
+{
+ struct virtio_vsock_sock *vvs = vsk->trans;
+
+ return vvs->buf_size_min;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_get_min_buffer_size);
+
+u64 virtio_transport_get_max_buffer_size(struct vsock_sock *vsk)
+{
+ struct virtio_vsock_sock *vvs = vsk->trans;
+
+ return vvs->buf_size_max;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_get_max_buffer_size);
+
+void virtio_transport_set_buffer_size(struct vsock_sock *vsk, u64 val)
+{
+ struct virtio_vsock_sock *vvs = vsk->trans;
+
+ if (val > VIRTIO_VSOCK_MAX_BUF_SIZE)
+ val = VIRTIO_VSOCK_MAX_BUF_SIZE;
+ if (val < vvs->buf_size_min)
+ vvs->buf_size_min = val;
+ if (val > vvs->buf_size_max)
+ vvs->buf_size_max = val;
+ vvs->buf_size = val;
+ vvs->buf_alloc = val;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_set_buffer_size);
+
+void virtio_transport_set_min_buffer_size(struct vsock_sock *vsk, u64 val)
+{
+ struct virtio_vsock_sock *vvs = vsk->trans;
+
+ if (val > VIRTIO_VSOCK_MAX_BUF_SIZE)
+ val = VIRTIO_VSOCK_MAX_BUF_SIZE;
+ if (val > vvs->buf_size)
+ vvs->buf_size = val;
+ vvs->buf_size_min = val;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_set_min_buffer_size);
+
+void virtio_transport_set_max_buffer_size(struct vsock_sock *vsk, u64 val)
+{
+ struct virtio_vsock_sock *vvs = vsk->trans;
+
+ if (val > VIRTIO_VSOCK_MAX_BUF_SIZE)
+ val = VIRTIO_VSOCK_MAX_BUF_SIZE;
+ if (val < vvs->buf_size)
+ vvs->buf_size = val;
+ vvs->buf_size_max = val;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_set_max_buffer_size);
+
+int
+virtio_transport_notify_poll_in(struct vsock_sock *vsk,
+ size_t target,
+ bool *data_ready_now)
+{
+ if (vsock_stream_has_data(vsk))
+ *data_ready_now = true;
+ else
+ *data_ready_now = false;
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_notify_poll_in);
+
+int
+virtio_transport_notify_poll_out(struct vsock_sock *vsk,
+ size_t target,
+ bool *space_avail_now)
+{
+ s64 free_space;
+
+ free_space = vsock_stream_has_space(vsk);
+ if (free_space > 0)
+ *space_avail_now = true;
+ else if (free_space == 0)
+ *space_avail_now = false;
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_notify_poll_out);
+
+int virtio_transport_notify_recv_init(struct vsock_sock *vsk,
+ size_t target, struct vsock_transport_recv_notify_data *data)
+{
+ return 0;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_notify_recv_init);
+
+int virtio_transport_notify_recv_pre_block(struct vsock_sock *vsk,
+ size_t target, struct vsock_transport_recv_notify_data *data)
+{
+ return 0;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_notify_recv_pre_block);
+
+int virtio_transport_notify_recv_pre_dequeue(struct vsock_sock *vsk,
+ size_t target, struct vsock_transport_recv_notify_data *data)
+{
+ return 0;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_notify_recv_pre_dequeue);
+
+int virtio_transport_notify_recv_post_dequeue(struct vsock_sock *vsk,
+ size_t target, ssize_t copied, bool data_read,
+ struct vsock_transport_recv_notify_data *data)
+{
+ return 0;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_notify_recv_post_dequeue);
+
+int virtio_transport_notify_send_init(struct vsock_sock *vsk,
+ struct vsock_transport_send_notify_data *data)
+{
+ return 0;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_notify_send_init);
+
+int virtio_transport_notify_send_pre_block(struct vsock_sock *vsk,
+ struct vsock_transport_send_notify_data *data)
+{
+ return 0;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_notify_send_pre_block);
+
+int virtio_transport_notify_send_pre_enqueue(struct vsock_sock *vsk,
+ struct vsock_transport_send_notify_data *data)
+{
+ return 0;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_notify_send_pre_enqueue);
+
+int virtio_transport_notify_send_post_enqueue(struct vsock_sock *vsk,
+ ssize_t written, struct vsock_transport_send_notify_data *data)
+{
+ return 0;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_notify_send_post_enqueue);
+
+u64 virtio_transport_stream_rcvhiwat(struct vsock_sock *vsk)
+{
+ struct virtio_vsock_sock *vvs = vsk->trans;
+
+ return vvs->buf_size;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_stream_rcvhiwat);
+
+bool virtio_transport_stream_is_active(struct vsock_sock *vsk)
+{
+ return true;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_stream_is_active);
+
+bool virtio_transport_stream_allow(u32 cid, u32 port)
+{
+ return true;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_stream_allow);
+
+int virtio_transport_dgram_bind(struct vsock_sock *vsk,
+ struct sockaddr_vm *addr)
+{
+ return -EOPNOTSUPP;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_dgram_bind);
+
+bool virtio_transport_dgram_allow(u32 cid, u32 port)
+{
+ return false;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_dgram_allow);
+
+int virtio_transport_connect(struct vsock_sock *vsk)
+{
+ struct virtio_vsock_pkt_info info = {
+ .op = VIRTIO_VSOCK_OP_REQUEST,
+ .type = VIRTIO_VSOCK_TYPE_STREAM,
+ };
+
+ return virtio_transport_send_pkt(vsk, &info);
+}
+EXPORT_SYMBOL_GPL(virtio_transport_connect);
+
+int virtio_transport_shutdown(struct vsock_sock *vsk, int mode)
+{
+ struct virtio_vsock_pkt_info info = {
+ .op = VIRTIO_VSOCK_OP_SHUTDOWN,
+ .type = VIRTIO_VSOCK_TYPE_STREAM,
+ .flags = (mode & RCV_SHUTDOWN ?
+ VIRTIO_VSOCK_SHUTDOWN_RCV : 0) |
+ (mode & SEND_SHUTDOWN ?
+ VIRTIO_VSOCK_SHUTDOWN_SEND : 0),
+ };
+
+ return virtio_transport_send_pkt(vsk, &info);
+}
+EXPORT_SYMBOL_GPL(virtio_transport_shutdown);
+
+void virtio_transport_release(struct vsock_sock *vsk)
+{
+ struct sock *sk = &vsk->sk;
+
+ /* Tell other side to terminate connection */
+ if (sk->sk_type == SOCK_STREAM &&
+ vsk->peer_shutdown != SHUTDOWN_MASK &&
+ sk->sk_state == SS_CONNECTED)
+ (void)virtio_transport_shutdown(vsk, SHUTDOWN_MASK);
+}
+EXPORT_SYMBOL_GPL(virtio_transport_release);
+
+int
+virtio_transport_dgram_enqueue(struct vsock_sock *vsk,
+ struct sockaddr_vm *remote_addr,
+ struct msghdr *msg,
+ size_t dgram_len)
+{
+ return -EOPNOTSUPP;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_dgram_enqueue);
+
+ssize_t
+virtio_transport_stream_enqueue(struct vsock_sock *vsk,
+ struct msghdr *msg,
+ size_t len)
+{
+ struct virtio_vsock_pkt_info info = {
+ .op = VIRTIO_VSOCK_OP_RW,
+ .type = VIRTIO_VSOCK_TYPE_STREAM,
+ .msg = msg,
+ .pkt_len = len,
+ };
+
+ return virtio_transport_send_pkt(vsk, &info);
+}
+EXPORT_SYMBOL_GPL(virtio_transport_stream_enqueue);
+
+void virtio_transport_destruct(struct vsock_sock *vsk)
+{
+ struct virtio_vsock_sock *vvs = vsk->trans;
+
+ kfree(vvs);
+}
+EXPORT_SYMBOL_GPL(virtio_transport_destruct);
+
+static int virtio_transport_send_reset(struct vsock_sock *vsk,
+ struct virtio_vsock_pkt *pkt)
+{
+ struct virtio_vsock_pkt_info info = {
+ .op = VIRTIO_VSOCK_OP_RST,
+ .type = VIRTIO_VSOCK_TYPE_STREAM,
+ };
+
+ /* Send RST only if the original pkt is not a RST pkt */
+ if (le16_to_cpu(pkt->hdr.op) == VIRTIO_VSOCK_OP_RST)
+ return 0;
+
+ return virtio_transport_send_pkt(vsk, &info);
+}
+
+/* Normally packets are associated with a socket. There may be no socket if an
+ * attempt was made to connect to a socket that does not exist.
+ */
+static int virtio_transport_send_reset_no_sock(struct virtio_vsock_pkt *pkt)
+{
+ struct virtio_vsock_pkt_info info = {
+ .op = VIRTIO_VSOCK_OP_RST,
+ .type = le16_to_cpu(pkt->hdr.type),
+ };
+
+ /* Send RST only if the original pkt is not a RST pkt */
+ if (le16_to_cpu(pkt->hdr.op) == VIRTIO_VSOCK_OP_RST)
+ return 0;
+
+ pkt = virtio_transport_alloc_pkt(&info, 0,
+ le32_to_cpu(pkt->hdr.dst_cid),
+ le32_to_cpu(pkt->hdr.dst_port),
+ le32_to_cpu(pkt->hdr.src_cid),
+ le32_to_cpu(pkt->hdr.src_port));
+ if (!pkt)
+ return -ENOMEM;
+
+ return virtio_transport_send_pkt_no_sock(pkt);
+}
+
+static int
+virtio_transport_recv_connecting(struct sock *sk,
+ struct virtio_vsock_pkt *pkt)
+{
+ struct vsock_sock *vsk = vsock_sk(sk);
+ int err;
+ int skerr;
+
+ switch (le16_to_cpu(pkt->hdr.op)) {
+ case VIRTIO_VSOCK_OP_RESPONSE:
+ sk->sk_state = SS_CONNECTED;
+ sk->sk_socket->state = SS_CONNECTED;
+ vsock_insert_connected(vsk);
+ sk->sk_state_change(sk);
+ break;
+ case VIRTIO_VSOCK_OP_INVALID:
+ break;
+ case VIRTIO_VSOCK_OP_RST:
+ skerr = ECONNRESET;
+ err = 0;
+ goto destroy;
+ default:
+ skerr = EPROTO;
+ err = -EINVAL;
+ goto destroy;
+ }
+ return 0;
+
+destroy:
+ virtio_transport_send_reset(vsk, pkt);
+ sk->sk_state = SS_UNCONNECTED;
+ sk->sk_err = skerr;
+ sk->sk_error_report(sk);
+ return err;
+}
+
+static int
+virtio_transport_recv_connected(struct sock *sk,
+ struct virtio_vsock_pkt *pkt)
+{
+ struct vsock_sock *vsk = vsock_sk(sk);
+ struct virtio_vsock_sock *vvs = vsk->trans;
+ int err = 0;
+
+ switch (le16_to_cpu(pkt->hdr.op)) {
+ case VIRTIO_VSOCK_OP_RW:
+ pkt->len = le32_to_cpu(pkt->hdr.len);
+ pkt->off = 0;
+
+ mutex_lock(&vvs->rx_lock);
+ virtio_transport_inc_rx_pkt(vvs, pkt);
+ list_add_tail(&pkt->list, &vvs->rx_queue);
+ mutex_unlock(&vvs->rx_lock);
+
+ sk->sk_data_ready(sk);
+ return err;
+ case VIRTIO_VSOCK_OP_CREDIT_UPDATE:
+ sk->sk_write_space(sk);
+ break;
+ case VIRTIO_VSOCK_OP_SHUTDOWN:
+ if (le32_to_cpu(pkt->hdr.flags) & VIRTIO_VSOCK_SHUTDOWN_RCV)
+ vsk->peer_shutdown |= RCV_SHUTDOWN;
+ if (le32_to_cpu(pkt->hdr.flags) & VIRTIO_VSOCK_SHUTDOWN_SEND)
+ vsk->peer_shutdown |= SEND_SHUTDOWN;
+ if (vsk->peer_shutdown == SHUTDOWN_MASK &&
+ vsock_stream_has_data(vsk) <= 0)
+ sk->sk_state = SS_DISCONNECTING;
+ if (le32_to_cpu(pkt->hdr.flags))
+ sk->sk_state_change(sk);
+ break;
+ case VIRTIO_VSOCK_OP_RST:
+ sock_set_flag(sk, SOCK_DONE);
+ vsk->peer_shutdown = SHUTDOWN_MASK;
+ if (vsock_stream_has_data(vsk) <= 0)
+ sk->sk_state = SS_DISCONNECTING;
+ sk->sk_state_change(sk);
+ break;
+ default:
+ err = -EINVAL;
+ break;
+ }
+
+ virtio_transport_free_pkt(pkt);
+ return err;
+}
+
+static int
+virtio_transport_send_response(struct vsock_sock *vsk,
+ struct virtio_vsock_pkt *pkt)
+{
+ struct virtio_vsock_pkt_info info = {
+ .op = VIRTIO_VSOCK_OP_RESPONSE,
+ .type = VIRTIO_VSOCK_TYPE_STREAM,
+ .remote_cid = le32_to_cpu(pkt->hdr.src_cid),
+ .remote_port = le32_to_cpu(pkt->hdr.src_port),
+ };
+
+ return virtio_transport_send_pkt(vsk, &info);
+}
+
+/* Handle server socket */
+static int
+virtio_transport_recv_listen(struct sock *sk, struct virtio_vsock_pkt *pkt)
+{
+ struct vsock_sock *vsk = vsock_sk(sk);
+ struct vsock_sock *vchild;
+ struct sock *child;
+
+ if (le16_to_cpu(pkt->hdr.op) != VIRTIO_VSOCK_OP_REQUEST) {
+ virtio_transport_send_reset(vsk, pkt);
+ return -EINVAL;
+ }
+
+ if (sk_acceptq_is_full(sk)) {
+ virtio_transport_send_reset(vsk, pkt);
+ return -ENOMEM;
+ }
+
+ child = __vsock_create(sock_net(sk), NULL, sk, GFP_KERNEL,
+ sk->sk_type, 0);
+ if (!child) {
+ virtio_transport_send_reset(vsk, pkt);
+ return -ENOMEM;
+ }
+
+ sk->sk_ack_backlog++;
+
+ lock_sock(child);
+
+ child->sk_state = SS_CONNECTED;
+
+ vchild = vsock_sk(child);
+ vsock_addr_init(&vchild->local_addr, le32_to_cpu(pkt->hdr.dst_cid),
+ le32_to_cpu(pkt->hdr.dst_port));
+ vsock_addr_init(&vchild->remote_addr, le32_to_cpu(pkt->hdr.src_cid),
+ le32_to_cpu(pkt->hdr.src_port));
+
+ vsock_insert_connected(vchild);
+ vsock_enqueue_accept(sk, child);
+ virtio_transport_send_response(vchild, pkt);
+
+ release_sock(child);
+
+ sk->sk_data_ready(sk);
+ return 0;
+}
+
+static void virtio_transport_space_update(struct sock *sk,
+ struct virtio_vsock_pkt *pkt)
+{
+ struct vsock_sock *vsk = vsock_sk(sk);
+ struct virtio_vsock_sock *vvs = vsk->trans;
+ bool space_available;
+
+ /* buf_alloc and fwd_cnt is always included in the hdr */
+ mutex_lock(&vvs->tx_lock);
+ vvs->peer_buf_alloc = le32_to_cpu(pkt->hdr.buf_alloc);
+ vvs->peer_fwd_cnt = le32_to_cpu(pkt->hdr.fwd_cnt);
+ space_available = virtio_transport_has_space(vsk);
+ mutex_unlock(&vvs->tx_lock);
+
+ if (space_available)
+ sk->sk_write_space(sk);
+}
+
+/* We are under the virtio-vsock's vsock->rx_lock or vhost-vsock's vq->mutex
+ * lock.
+ */
+void virtio_transport_recv_pkt(struct virtio_vsock_pkt *pkt)
+{
+ struct sockaddr_vm src, dst;
+ struct vsock_sock *vsk;
+ struct sock *sk;
+
+ vsock_addr_init(&src, le32_to_cpu(pkt->hdr.src_cid),
+ le32_to_cpu(pkt->hdr.src_port));
+ vsock_addr_init(&dst, le32_to_cpu(pkt->hdr.dst_cid),
+ le32_to_cpu(pkt->hdr.dst_port));
+
+ trace_virtio_transport_recv_pkt(src.svm_cid, src.svm_port,
+ dst.svm_cid, dst.svm_port,
+ le32_to_cpu(pkt->hdr.len),
+ le16_to_cpu(pkt->hdr.type),
+ le16_to_cpu(pkt->hdr.op),
+ le32_to_cpu(pkt->hdr.flags),
+ le32_to_cpu(pkt->hdr.buf_alloc),
+ le32_to_cpu(pkt->hdr.fwd_cnt));
+
+ if (le16_to_cpu(pkt->hdr.type) != VIRTIO_VSOCK_TYPE_STREAM) {
+ (void)virtio_transport_send_reset_no_sock(pkt);
+ goto free_pkt;
+ }
+
+ /* The socket must be in connected or bound table
+ * otherwise send reset back
+ */
+ sk = vsock_find_connected_socket(&src, &dst);
+ if (!sk) {
+ sk = vsock_find_bound_socket(&dst);
+ if (!sk) {
+ (void)virtio_transport_send_reset_no_sock(pkt);
+ goto free_pkt;
+ }
+ }
+
+ vsk = vsock_sk(sk);
+
+ virtio_transport_space_update(sk, pkt);
+
+ lock_sock(sk);
+ switch (sk->sk_state) {
+ case VSOCK_SS_LISTEN:
+ virtio_transport_recv_listen(sk, pkt);
+ virtio_transport_free_pkt(pkt);
+ break;
+ case SS_CONNECTING:
+ virtio_transport_recv_connecting(sk, pkt);
+ virtio_transport_free_pkt(pkt);
+ break;
+ case SS_CONNECTED:
+ virtio_transport_recv_connected(sk, pkt);
+ break;
+ default:
+ virtio_transport_free_pkt(pkt);
+ break;
+ }
+ release_sock(sk);
+
+ /* Release refcnt obtained when we fetched this socket out of the
+ * bound or connected list.
+ */
+ sock_put(sk);
+ return;
+
+free_pkt:
+ virtio_transport_free_pkt(pkt);
+}
+EXPORT_SYMBOL_GPL(virtio_transport_recv_pkt);
+
+void virtio_transport_free_pkt(struct virtio_vsock_pkt *pkt)
+{
+ kfree(pkt->buf);
+ kfree(pkt);
+}
+EXPORT_SYMBOL_GPL(virtio_transport_free_pkt);
+
+MODULE_LICENSE("GPL v2");
+MODULE_AUTHOR("Asias He");
+MODULE_DESCRIPTION("common code for virtio vsock");
--
2.5.0
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [RFC v4 3/5] VSOCK: Introduce virtio_transport.ko
2015-12-22 9:07 [RFC v4 0/5] Add virtio transport for AF_VSOCK Stefan Hajnoczi
2015-12-22 9:07 ` [RFC v4 1/5] VSOCK: transport-specific vsock_transport functions Stefan Hajnoczi
2015-12-22 9:07 ` [RFC v4 2/5] VSOCK: Introduce virtio_vsock_common.ko Stefan Hajnoczi
@ 2015-12-22 9:07 ` Stefan Hajnoczi
2015-12-22 9:07 ` [RFC v4 4/5] VSOCK: Introduce vhost_vsock.ko Stefan Hajnoczi
` (3 subsequent siblings)
6 siblings, 0 replies; 9+ messages in thread
From: Stefan Hajnoczi @ 2015-12-22 9:07 UTC (permalink / raw)
To: kvm
Cc: Stefan Hajnoczi, Michael S. Tsirkin, netdev, virtualization,
Matt Benjamin, Asias He, Christoffer Dall, matt.ma
From: Asias He <asias@redhat.com>
VM sockets virtio transport implementation. This driver runs in the
guest.
Signed-off-by: Asias He <asias@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
v4:
* Add MAINTAINERS file entry
* Drop short/long rx packets
* checkpatch.pl cleanups
* Clarify locking in struct virtio_vsock
* Narrow local variable scopes as suggested by Alex Bennee
* Call wake_up() after decrementing total_tx_buf to avoid deadlock
v2:
* Fix total_tx_buf accounting
* Add virtio_transport global mutex to prevent races
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
MAINTAINERS | 1 +
net/vmw_vsock/virtio_transport.c | 481 +++++++++++++++++++++++++++++++++++++++
2 files changed, 482 insertions(+)
create mode 100644 net/vmw_vsock/virtio_transport.c
diff --git a/MAINTAINERS b/MAINTAINERS
index d42db78..67d8504 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -11369,6 +11369,7 @@ S: Maintained
F: include/linux/virtio_vsock.h
F: include/uapi/linux/virtio_vsock.h
F: net/vmw_vsock/virtio_transport_common.c
+F: net/vmw_vsock/virtio_transport.c
VIRTUAL SERIO DEVICE DRIVER
M: Stephen Chandler Paul <thatslyude@gmail.com>
diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
new file mode 100644
index 0000000..e4787bf
--- /dev/null
+++ b/net/vmw_vsock/virtio_transport.c
@@ -0,0 +1,481 @@
+/*
+ * virtio transport for vsock
+ *
+ * Copyright (C) 2013-2015 Red Hat, Inc.
+ * Author: Asias He <asias@redhat.com>
+ * Stefan Hajnoczi <stefanha@redhat.com>
+ *
+ * Some of the code is take from Gerd Hoffmann <kraxel@redhat.com>'s
+ * early virtio-vsock proof-of-concept bits.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.
+ */
+#include <linux/spinlock.h>
+#include <linux/module.h>
+#include <linux/list.h>
+#include <linux/virtio.h>
+#include <linux/virtio_ids.h>
+#include <linux/virtio_config.h>
+#include <linux/virtio_vsock.h>
+#include <net/sock.h>
+#include <linux/mutex.h>
+#include <net/af_vsock.h>
+
+static struct workqueue_struct *virtio_vsock_workqueue;
+static struct virtio_vsock *the_virtio_vsock;
+static DEFINE_MUTEX(the_virtio_vsock_mutex); /* protects the_virtio_vsock */
+static void virtio_vsock_rx_fill(struct virtio_vsock *vsock);
+
+struct virtio_vsock {
+ struct virtio_device *vdev;
+ struct virtqueue *vqs[VSOCK_VQ_MAX];
+
+ /* Virtqueue processing is deferred to a workqueue */
+ struct work_struct tx_work;
+ struct work_struct rx_work;
+
+ wait_queue_head_t tx_wait; /* for waiting for tx resources */
+
+ /* The following fields are protected by tx_lock. vqs[VSOCK_VQ_TX]
+ * must be accessed with tx_lock held.
+ */
+ struct mutex tx_lock;
+ u32 total_tx_buf;
+
+ /* The following fields are protected by rx_lock. vqs[VSOCK_VQ_RX]
+ * must be accessed with rx_lock held.
+ */
+ struct mutex rx_lock;
+ int rx_buf_nr;
+ int rx_buf_max_nr;
+
+ u32 guest_cid;
+};
+
+static struct virtio_vsock *virtio_vsock_get(void)
+{
+ return the_virtio_vsock;
+}
+
+static u32 virtio_transport_get_local_cid(void)
+{
+ struct virtio_vsock *vsock = virtio_vsock_get();
+
+ return vsock->guest_cid;
+}
+
+static int
+virtio_transport_send_one_pkt(struct virtio_vsock *vsock,
+ struct virtio_vsock_pkt *pkt)
+{
+ struct scatterlist hdr, buf, *sgs[2];
+ int ret, in_sg = 0, out_sg = 0;
+ struct virtqueue *vq;
+ DEFINE_WAIT(wait);
+
+ vq = vsock->vqs[VSOCK_VQ_TX];
+
+ /* Put pkt in the virtqueue */
+ sg_init_one(&hdr, &pkt->hdr, sizeof(pkt->hdr));
+ sgs[out_sg++] = &hdr;
+ if (pkt->buf) {
+ sg_init_one(&buf, pkt->buf, pkt->len);
+ sgs[out_sg++] = &buf;
+ }
+
+ mutex_lock(&vsock->tx_lock);
+ while ((ret = virtqueue_add_sgs(vq, sgs, out_sg, in_sg, pkt,
+ GFP_KERNEL)) < 0) {
+ prepare_to_wait_exclusive(&vsock->tx_wait, &wait,
+ TASK_UNINTERRUPTIBLE);
+ mutex_unlock(&vsock->tx_lock);
+ schedule();
+ mutex_lock(&vsock->tx_lock);
+ finish_wait(&vsock->tx_wait, &wait);
+ }
+ virtqueue_kick(vq);
+ mutex_unlock(&vsock->tx_lock);
+
+ return pkt->len;
+}
+
+static int
+virtio_transport_send_pkt_no_sock(struct virtio_vsock_pkt *pkt)
+{
+ struct virtio_vsock *vsock;
+
+ vsock = virtio_vsock_get();
+ if (!vsock) {
+ virtio_transport_free_pkt(pkt);
+ return -ENODEV;
+ }
+
+ return virtio_transport_send_one_pkt(vsock, pkt);
+}
+
+static int
+virtio_transport_send_pkt(struct vsock_sock *vsk,
+ struct virtio_vsock_pkt_info *info)
+{
+ u32 src_cid, src_port, dst_cid, dst_port;
+ struct virtio_vsock_sock *vvs;
+ struct virtio_vsock_pkt *pkt;
+ struct virtio_vsock *vsock;
+ u32 pkt_len = info->pkt_len;
+ DEFINE_WAIT(wait);
+
+ vsock = virtio_vsock_get();
+ if (!vsock)
+ return -ENODEV;
+
+ src_cid = virtio_transport_get_local_cid();
+ src_port = vsk->local_addr.svm_port;
+ if (!info->remote_cid) {
+ dst_cid = vsk->remote_addr.svm_cid;
+ dst_port = vsk->remote_addr.svm_port;
+ } else {
+ dst_cid = info->remote_cid;
+ dst_port = info->remote_port;
+ }
+
+ vvs = vsk->trans;
+
+ if (pkt_len > VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE)
+ pkt_len = VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE;
+ pkt_len = virtio_transport_get_credit(vvs, pkt_len);
+ /* Do not send zero length OP_RW pkt*/
+ if (pkt_len == 0 && info->op == VIRTIO_VSOCK_OP_RW)
+ return pkt_len;
+
+ /* Respect global tx buf limitation */
+ mutex_lock(&vsock->tx_lock);
+ while (pkt_len + vsock->total_tx_buf > VIRTIO_VSOCK_MAX_TX_BUF_SIZE) {
+ prepare_to_wait_exclusive(&vsock->tx_wait, &wait,
+ TASK_UNINTERRUPTIBLE);
+ mutex_unlock(&vsock->tx_lock);
+ schedule();
+ mutex_lock(&vsock->tx_lock);
+ finish_wait(&vsock->tx_wait, &wait);
+ }
+ vsock->total_tx_buf += pkt_len;
+ mutex_unlock(&vsock->tx_lock);
+
+ pkt = virtio_transport_alloc_pkt(info, pkt_len,
+ src_cid, src_port,
+ dst_cid, dst_port);
+ if (!pkt) {
+ mutex_lock(&vsock->tx_lock);
+ vsock->total_tx_buf -= pkt_len;
+ mutex_unlock(&vsock->tx_lock);
+ virtio_transport_put_credit(vvs, pkt_len);
+ wake_up(&vsock->tx_wait);
+ return -ENOMEM;
+ }
+
+ virtio_transport_inc_tx_pkt(vvs, pkt);
+
+ return virtio_transport_send_one_pkt(vsock, pkt);
+}
+
+static void virtio_vsock_rx_fill(struct virtio_vsock *vsock)
+{
+ int buf_len = VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE;
+ struct virtio_vsock_pkt *pkt;
+ struct scatterlist hdr, buf, *sgs[2];
+ struct virtqueue *vq;
+ int ret;
+
+ vq = vsock->vqs[VSOCK_VQ_RX];
+
+ do {
+ pkt = kzalloc(sizeof(*pkt), GFP_KERNEL);
+ if (!pkt)
+ break;
+
+ pkt->buf = kmalloc(buf_len, GFP_KERNEL);
+ if (!pkt->buf) {
+ virtio_transport_free_pkt(pkt);
+ break;
+ }
+
+ pkt->len = buf_len;
+
+ sg_init_one(&hdr, &pkt->hdr, sizeof(pkt->hdr));
+ sgs[0] = &hdr;
+
+ sg_init_one(&buf, pkt->buf, buf_len);
+ sgs[1] = &buf;
+ ret = virtqueue_add_sgs(vq, sgs, 0, 2, pkt, GFP_KERNEL);
+ if (ret) {
+ virtio_transport_free_pkt(pkt);
+ break;
+ }
+ vsock->rx_buf_nr++;
+ } while (vq->num_free);
+ if (vsock->rx_buf_nr > vsock->rx_buf_max_nr)
+ vsock->rx_buf_max_nr = vsock->rx_buf_nr;
+ virtqueue_kick(vq);
+}
+
+static void virtio_transport_send_pkt_work(struct work_struct *work)
+{
+ struct virtio_vsock *vsock =
+ container_of(work, struct virtio_vsock, tx_work);
+ struct virtqueue *vq;
+ bool added = false;
+
+ vq = vsock->vqs[VSOCK_VQ_TX];
+ mutex_lock(&vsock->tx_lock);
+ do {
+ struct virtio_vsock_pkt *pkt;
+ unsigned int len;
+
+ virtqueue_disable_cb(vq);
+ while ((pkt = virtqueue_get_buf(vq, &len)) != NULL) {
+ vsock->total_tx_buf -= pkt->len;
+ virtio_transport_free_pkt(pkt);
+ added = true;
+ }
+ } while (!virtqueue_enable_cb(vq));
+ mutex_unlock(&vsock->tx_lock);
+
+ if (added)
+ wake_up(&vsock->tx_wait);
+}
+
+static void virtio_transport_recv_pkt_work(struct work_struct *work)
+{
+ struct virtio_vsock *vsock =
+ container_of(work, struct virtio_vsock, rx_work);
+ struct virtqueue *vq;
+
+ vq = vsock->vqs[VSOCK_VQ_RX];
+ mutex_lock(&vsock->rx_lock);
+ do {
+ struct virtio_vsock_pkt *pkt;
+ unsigned int len;
+
+ virtqueue_disable_cb(vq);
+ while ((pkt = virtqueue_get_buf(vq, &len)) != NULL) {
+ vsock->rx_buf_nr--;
+
+ /* Drop short/long packets */
+ if (unlikely(len < sizeof(pkt->hdr) ||
+ len > sizeof(pkt->hdr) + pkt->len)) {
+ virtio_transport_free_pkt(pkt);
+ continue;
+ }
+
+ pkt->len = len - sizeof(pkt->hdr);
+ virtio_transport_recv_pkt(pkt);
+ }
+ } while (!virtqueue_enable_cb(vq));
+
+ if (vsock->rx_buf_nr < vsock->rx_buf_max_nr / 2)
+ virtio_vsock_rx_fill(vsock);
+ mutex_unlock(&vsock->rx_lock);
+}
+
+static void virtio_vsock_ctrl_done(struct virtqueue *vq)
+{
+}
+
+static void virtio_vsock_tx_done(struct virtqueue *vq)
+{
+ struct virtio_vsock *vsock = vq->vdev->priv;
+
+ if (!vsock)
+ return;
+ queue_work(virtio_vsock_workqueue, &vsock->tx_work);
+}
+
+static void virtio_vsock_rx_done(struct virtqueue *vq)
+{
+ struct virtio_vsock *vsock = vq->vdev->priv;
+
+ if (!vsock)
+ return;
+ queue_work(virtio_vsock_workqueue, &vsock->rx_work);
+}
+
+static struct virtio_transport virtio_transport = {
+ .transport = {
+ .get_local_cid = virtio_transport_get_local_cid,
+
+ .init = virtio_transport_do_socket_init,
+ .destruct = virtio_transport_destruct,
+ .release = virtio_transport_release,
+ .connect = virtio_transport_connect,
+ .shutdown = virtio_transport_shutdown,
+
+ .dgram_bind = virtio_transport_dgram_bind,
+ .dgram_dequeue = virtio_transport_dgram_dequeue,
+ .dgram_enqueue = virtio_transport_dgram_enqueue,
+ .dgram_allow = virtio_transport_dgram_allow,
+
+ .stream_dequeue = virtio_transport_stream_dequeue,
+ .stream_enqueue = virtio_transport_stream_enqueue,
+ .stream_has_data = virtio_transport_stream_has_data,
+ .stream_has_space = virtio_transport_stream_has_space,
+ .stream_rcvhiwat = virtio_transport_stream_rcvhiwat,
+ .stream_is_active = virtio_transport_stream_is_active,
+ .stream_allow = virtio_transport_stream_allow,
+
+ .notify_poll_in = virtio_transport_notify_poll_in,
+ .notify_poll_out = virtio_transport_notify_poll_out,
+ .notify_recv_init = virtio_transport_notify_recv_init,
+ .notify_recv_pre_block = virtio_transport_notify_recv_pre_block,
+ .notify_recv_pre_dequeue = virtio_transport_notify_recv_pre_dequeue,
+ .notify_recv_post_dequeue = virtio_transport_notify_recv_post_dequeue,
+ .notify_send_init = virtio_transport_notify_send_init,
+ .notify_send_pre_block = virtio_transport_notify_send_pre_block,
+ .notify_send_pre_enqueue = virtio_transport_notify_send_pre_enqueue,
+ .notify_send_post_enqueue = virtio_transport_notify_send_post_enqueue,
+
+ .set_buffer_size = virtio_transport_set_buffer_size,
+ .set_min_buffer_size = virtio_transport_set_min_buffer_size,
+ .set_max_buffer_size = virtio_transport_set_max_buffer_size,
+ .get_buffer_size = virtio_transport_get_buffer_size,
+ .get_min_buffer_size = virtio_transport_get_min_buffer_size,
+ .get_max_buffer_size = virtio_transport_get_max_buffer_size,
+ },
+
+ .send_pkt = virtio_transport_send_pkt,
+ .send_pkt_no_sock = virtio_transport_send_pkt_no_sock,
+};
+
+static int virtio_vsock_probe(struct virtio_device *vdev)
+{
+ vq_callback_t *callbacks[] = {
+ virtio_vsock_ctrl_done,
+ virtio_vsock_rx_done,
+ virtio_vsock_tx_done,
+ };
+ static const char * const names[] = {
+ "ctrl",
+ "rx",
+ "tx",
+ };
+ struct virtio_vsock *vsock = NULL;
+ u32 guest_cid;
+ int ret;
+
+ ret = mutex_lock_interruptible(&the_virtio_vsock_mutex);
+ if (ret)
+ return ret;
+
+ /* Only one virtio-vsock device per guest is supported */
+ if (the_virtio_vsock) {
+ ret = -EBUSY;
+ goto out;
+ }
+
+ vsock = kzalloc(sizeof(*vsock), GFP_KERNEL);
+ if (!vsock) {
+ ret = -ENOMEM;
+ goto out;
+ }
+
+ vsock->vdev = vdev;
+
+ ret = vsock->vdev->config->find_vqs(vsock->vdev, VSOCK_VQ_MAX,
+ vsock->vqs, callbacks, names);
+ if (ret < 0)
+ goto out;
+
+ vdev->config->get(vdev, offsetof(struct virtio_vsock_config, guest_cid),
+ &guest_cid, sizeof(guest_cid));
+ vsock->guest_cid = le32_to_cpu(guest_cid);
+
+ ret = vsock_core_init(&virtio_transport.transport);
+ if (ret < 0)
+ goto out_vqs;
+
+ vsock->rx_buf_nr = 0;
+ vsock->rx_buf_max_nr = 0;
+
+ vdev->priv = vsock;
+ the_virtio_vsock = vsock;
+ init_waitqueue_head(&vsock->tx_wait);
+ mutex_init(&vsock->tx_lock);
+ mutex_init(&vsock->rx_lock);
+ INIT_WORK(&vsock->rx_work, virtio_transport_recv_pkt_work);
+ INIT_WORK(&vsock->tx_work, virtio_transport_send_pkt_work);
+
+ mutex_lock(&vsock->rx_lock);
+ virtio_vsock_rx_fill(vsock);
+ mutex_unlock(&vsock->rx_lock);
+
+ mutex_unlock(&the_virtio_vsock_mutex);
+ return 0;
+
+out_vqs:
+ vsock->vdev->config->del_vqs(vsock->vdev);
+out:
+ kfree(vsock);
+ mutex_unlock(&the_virtio_vsock_mutex);
+ return ret;
+}
+
+static void virtio_vsock_remove(struct virtio_device *vdev)
+{
+ struct virtio_vsock *vsock = vdev->priv;
+
+ flush_work(&vsock->rx_work);
+ flush_work(&vsock->tx_work);
+
+ vdev->config->reset(vdev);
+
+ mutex_lock(&the_virtio_vsock_mutex);
+ the_virtio_vsock = NULL;
+ vsock_core_exit();
+ mutex_unlock(&the_virtio_vsock_mutex);
+
+ vdev->config->del_vqs(vdev);
+
+ kfree(vsock);
+}
+
+static struct virtio_device_id id_table[] = {
+ { VIRTIO_ID_VSOCK, VIRTIO_DEV_ANY_ID },
+ { 0 },
+};
+
+static unsigned int features[] = {
+};
+
+static struct virtio_driver virtio_vsock_driver = {
+ .feature_table = features,
+ .feature_table_size = ARRAY_SIZE(features),
+ .driver.name = KBUILD_MODNAME,
+ .driver.owner = THIS_MODULE,
+ .id_table = id_table,
+ .probe = virtio_vsock_probe,
+ .remove = virtio_vsock_remove,
+};
+
+static int __init virtio_vsock_init(void)
+{
+ int ret;
+
+ virtio_vsock_workqueue = alloc_workqueue("virtio_vsock", 0, 0);
+ if (!virtio_vsock_workqueue)
+ return -ENOMEM;
+ ret = register_virtio_driver(&virtio_vsock_driver);
+ if (ret)
+ destroy_workqueue(virtio_vsock_workqueue);
+ return ret;
+}
+
+static void __exit virtio_vsock_exit(void)
+{
+ unregister_virtio_driver(&virtio_vsock_driver);
+ destroy_workqueue(virtio_vsock_workqueue);
+}
+
+module_init(virtio_vsock_init);
+module_exit(virtio_vsock_exit);
+MODULE_LICENSE("GPL v2");
+MODULE_AUTHOR("Asias He");
+MODULE_DESCRIPTION("virtio transport for vsock");
+MODULE_DEVICE_TABLE(virtio, id_table);
--
2.5.0
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [RFC v4 4/5] VSOCK: Introduce vhost_vsock.ko
2015-12-22 9:07 [RFC v4 0/5] Add virtio transport for AF_VSOCK Stefan Hajnoczi
` (2 preceding siblings ...)
2015-12-22 9:07 ` [RFC v4 3/5] VSOCK: Introduce virtio_transport.ko Stefan Hajnoczi
@ 2015-12-22 9:07 ` Stefan Hajnoczi
2015-12-22 9:07 ` [RFC v4 5/5] VSOCK: Add Makefile and Kconfig Stefan Hajnoczi
` (2 subsequent siblings)
6 siblings, 0 replies; 9+ messages in thread
From: Stefan Hajnoczi @ 2015-12-22 9:07 UTC (permalink / raw)
To: kvm
Cc: Stefan Hajnoczi, Michael S. Tsirkin, netdev, virtualization,
Matt Benjamin, Asias He, Christoffer Dall, matt.ma
From: Asias He <asias@redhat.com>
VM sockets vhost transport implementation. This driver runs on the
host.
Signed-off-by: Asias He <asias@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
v4:
* Add MAINTAINERS file entry
* virtqueue used len is now sizeof(pkt->hdr) + pkt->len instead of just
pkt->len
* checkpatch.pl cleanups
* Clarify struct vhost_vsock locking
* Add comments about optimization that disables virtqueue notify
* Drop unused vhost_vsock_handle_ctl_kick()
* Call wake_up() after decrementing total_tx_buf to prevent deadlock
v3:
* Remove unneeded variable used to store return value
(Fengguang Wu <fengguang.wu@intel.com> and Julia Lawall
<julia.lawall@lip6.fr>)
v2:
* Add missing total_tx_buf decrement
* Support flexible rx/tx descriptor layout
* Refuse to assign reserved CIDs
* Refuse guest CID if already in use
* Only accept correctly addressed packets
vhost: checkpatch.pl cleanups
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
MAINTAINERS | 2 +
drivers/vhost/vsock.c | 607 ++++++++++++++++++++++++++++++++++++++++++++++++++
drivers/vhost/vsock.h | 4 +
3 files changed, 613 insertions(+)
create mode 100644 drivers/vhost/vsock.c
create mode 100644 drivers/vhost/vsock.h
diff --git a/MAINTAINERS b/MAINTAINERS
index 67d8504..0181dc2 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -11370,6 +11370,8 @@ F: include/linux/virtio_vsock.h
F: include/uapi/linux/virtio_vsock.h
F: net/vmw_vsock/virtio_transport_common.c
F: net/vmw_vsock/virtio_transport.c
+F: drivers/vhost/vsock.c
+F: drivers/vhost/vsock.h
VIRTUAL SERIO DEVICE DRIVER
M: Stephen Chandler Paul <thatslyude@gmail.com>
diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
new file mode 100644
index 0000000..2c5963c
--- /dev/null
+++ b/drivers/vhost/vsock.c
@@ -0,0 +1,607 @@
+/*
+ * vhost transport for vsock
+ *
+ * Copyright (C) 2013-2015 Red Hat, Inc.
+ * Author: Asias He <asias@redhat.com>
+ * Stefan Hajnoczi <stefanha@redhat.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.
+ */
+#include <linux/miscdevice.h>
+#include <linux/module.h>
+#include <linux/mutex.h>
+#include <net/sock.h>
+#include <linux/virtio_vsock.h>
+#include <linux/vhost.h>
+
+#include <net/af_vsock.h>
+#include "vhost.h"
+#include "vsock.h"
+
+#define VHOST_VSOCK_DEFAULT_HOST_CID 2
+
+enum {
+ VHOST_VSOCK_FEATURES = VHOST_FEATURES,
+};
+
+/* Used to track all the vhost_vsock instances on the system. */
+static LIST_HEAD(vhost_vsock_list);
+static DEFINE_MUTEX(vhost_vsock_mutex);
+
+struct vhost_vsock {
+ struct vhost_dev dev;
+ struct vhost_virtqueue vqs[VSOCK_VQ_MAX];
+
+ /* Link to global vhost_vsock_list, protected by vhost_vsock_mutex */
+ struct list_head list;
+
+ struct vhost_work send_pkt_work;
+ wait_queue_head_t send_wait;
+
+ /* Fields protected by vqs[VSOCK_VQ_RX].mutex */
+ struct list_head send_pkt_list; /* host->guest pending packets */
+ u32 total_tx_buf;
+
+ u32 guest_cid;
+};
+
+static u32 vhost_transport_get_local_cid(void)
+{
+ return VHOST_VSOCK_DEFAULT_HOST_CID;
+}
+
+static struct vhost_vsock *vhost_vsock_get(u32 guest_cid)
+{
+ struct vhost_vsock *vsock;
+
+ mutex_lock(&vhost_vsock_mutex);
+ list_for_each_entry(vsock, &vhost_vsock_list, list) {
+ if (vsock->guest_cid == guest_cid) {
+ mutex_unlock(&vhost_vsock_mutex);
+ return vsock;
+ }
+ }
+ mutex_unlock(&vhost_vsock_mutex);
+
+ return NULL;
+}
+
+static void
+vhost_transport_do_send_pkt(struct vhost_vsock *vsock,
+ struct vhost_virtqueue *vq)
+{
+ bool added = false;
+
+ mutex_lock(&vq->mutex);
+
+ /* Avoid further vmexits, we're already processing the virtqueue */
+ vhost_disable_notify(&vsock->dev, vq);
+
+ for (;;) {
+ struct virtio_vsock_pkt *pkt;
+ struct iov_iter iov_iter;
+ unsigned out, in;
+ size_t nbytes;
+ size_t len;
+ int head;
+
+ if (list_empty(&vsock->send_pkt_list)) {
+ vhost_enable_notify(&vsock->dev, vq);
+ break;
+ }
+
+ head = vhost_get_vq_desc(vq, vq->iov, ARRAY_SIZE(vq->iov),
+ &out, &in, NULL, NULL);
+ if (head < 0)
+ break;
+
+ if (head == vq->num) {
+ /* We cannot finish yet if more buffers snuck in while
+ * re-enabling notify.
+ */
+ if (unlikely(vhost_enable_notify(&vsock->dev, vq))) {
+ vhost_disable_notify(&vsock->dev, vq);
+ continue;
+ }
+ break;
+ }
+
+ pkt = list_first_entry(&vsock->send_pkt_list,
+ struct virtio_vsock_pkt, list);
+ list_del_init(&pkt->list);
+
+ if (out) {
+ virtio_transport_free_pkt(pkt);
+ vq_err(vq, "Expected 0 output buffers, got %u\n", out);
+ break;
+ }
+
+ len = iov_length(&vq->iov[out], in);
+ iov_iter_init(&iov_iter, READ, &vq->iov[out], in, len);
+
+ nbytes = copy_to_iter(&pkt->hdr, sizeof(pkt->hdr), &iov_iter);
+ if (nbytes != sizeof(pkt->hdr)) {
+ virtio_transport_free_pkt(pkt);
+ vq_err(vq, "Faulted on copying pkt hdr\n");
+ break;
+ }
+
+ nbytes = copy_to_iter(pkt->buf, pkt->len, &iov_iter);
+ if (nbytes != pkt->len) {
+ virtio_transport_free_pkt(pkt);
+ vq_err(vq, "Faulted on copying pkt buf\n");
+ break;
+ }
+
+ vhost_add_used(vq, head, sizeof(pkt->hdr) + pkt->len);
+ added = true;
+
+ vsock->total_tx_buf -= pkt->len;
+
+ virtio_transport_free_pkt(pkt);
+ }
+ if (added)
+ vhost_signal(&vsock->dev, vq);
+ mutex_unlock(&vq->mutex);
+
+ if (added)
+ wake_up(&vsock->send_wait);
+}
+
+static void vhost_transport_send_pkt_work(struct vhost_work *work)
+{
+ struct vhost_virtqueue *vq;
+ struct vhost_vsock *vsock;
+
+ vsock = container_of(work, struct vhost_vsock, send_pkt_work);
+ vq = &vsock->vqs[VSOCK_VQ_RX];
+
+ vhost_transport_do_send_pkt(vsock, vq);
+}
+
+static int
+vhost_transport_send_one_pkt(struct vhost_vsock *vsock,
+ struct virtio_vsock_pkt *pkt)
+{
+ struct vhost_virtqueue *vq = &vsock->vqs[VSOCK_VQ_RX];
+
+ /* Queue it up in vhost work */
+ mutex_lock(&vq->mutex);
+ list_add_tail(&pkt->list, &vsock->send_pkt_list);
+ vhost_work_queue(&vsock->dev, &vsock->send_pkt_work);
+ mutex_unlock(&vq->mutex);
+
+ return pkt->len;
+}
+
+static int
+vhost_transport_send_pkt_no_sock(struct virtio_vsock_pkt *pkt)
+{
+ struct vhost_vsock *vsock;
+
+ /* Find the vhost_vsock according to guest context id */
+ vsock = vhost_vsock_get(le32_to_cpu(pkt->hdr.dst_cid));
+ if (!vsock) {
+ virtio_transport_free_pkt(pkt);
+ return -ENODEV;
+ }
+
+ return vhost_transport_send_one_pkt(vsock, pkt);
+}
+
+static int
+vhost_transport_send_pkt(struct vsock_sock *vsk,
+ struct virtio_vsock_pkt_info *info)
+{
+ u32 src_cid, src_port, dst_cid, dst_port;
+ struct virtio_vsock_sock *vvs;
+ struct virtio_vsock_pkt *pkt;
+ struct vhost_virtqueue *vq;
+ struct vhost_vsock *vsock;
+ u32 pkt_len = info->pkt_len;
+ DEFINE_WAIT(wait);
+
+ src_cid = vhost_transport_get_local_cid();
+ src_port = vsk->local_addr.svm_port;
+ if (!info->remote_cid) {
+ dst_cid = vsk->remote_addr.svm_cid;
+ dst_port = vsk->remote_addr.svm_port;
+ } else {
+ dst_cid = info->remote_cid;
+ dst_port = info->remote_port;
+ }
+
+ /* Find the vhost_vsock according to guest context id */
+ vsock = vhost_vsock_get(dst_cid);
+ if (!vsock)
+ return -ENODEV;
+
+ vvs = vsk->trans;
+ vq = &vsock->vqs[VSOCK_VQ_RX];
+
+ /* we can send less than pkt_len bytes */
+ if (pkt_len > VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE)
+ pkt_len = VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE;
+
+ /* virtio_transport_get_credit might return less than pkt_len credit */
+ pkt_len = virtio_transport_get_credit(vvs, pkt_len);
+
+ /* Do not send zero length OP_RW pkt*/
+ if (pkt_len == 0 && info->op == VIRTIO_VSOCK_OP_RW)
+ return pkt_len;
+
+ /* Respect global tx buf limitation */
+ mutex_lock(&vq->mutex);
+ while (pkt_len + vsock->total_tx_buf > VIRTIO_VSOCK_MAX_TX_BUF_SIZE) {
+ prepare_to_wait_exclusive(&vsock->send_wait, &wait,
+ TASK_UNINTERRUPTIBLE);
+ mutex_unlock(&vq->mutex);
+ schedule();
+ mutex_lock(&vq->mutex);
+ finish_wait(&vsock->send_wait, &wait);
+ }
+ vsock->total_tx_buf += pkt_len;
+ mutex_unlock(&vq->mutex);
+
+ pkt = virtio_transport_alloc_pkt(info, pkt_len,
+ src_cid, src_port,
+ dst_cid, dst_port);
+ if (!pkt) {
+ mutex_lock(&vq->mutex);
+ vsock->total_tx_buf -= pkt_len;
+ mutex_unlock(&vq->mutex);
+ virtio_transport_put_credit(vvs, pkt_len);
+ wake_up(&vsock->send_wait);
+ return -ENOMEM;
+ }
+
+ virtio_transport_inc_tx_pkt(vvs, pkt);
+
+ return vhost_transport_send_one_pkt(vsock, pkt);
+}
+
+static struct virtio_vsock_pkt *
+vhost_vsock_alloc_pkt(struct vhost_virtqueue *vq,
+ unsigned int out, unsigned int in)
+{
+ struct virtio_vsock_pkt *pkt;
+ struct iov_iter iov_iter;
+ size_t nbytes;
+ size_t len;
+
+ if (in != 0) {
+ vq_err(vq, "Expected 0 input buffers, got %u\n", in);
+ return NULL;
+ }
+
+ pkt = kzalloc(sizeof(*pkt), GFP_KERNEL);
+ if (!pkt)
+ return NULL;
+
+ len = iov_length(vq->iov, out);
+ iov_iter_init(&iov_iter, WRITE, vq->iov, out, len);
+
+ nbytes = copy_from_iter(&pkt->hdr, sizeof(pkt->hdr), &iov_iter);
+ if (nbytes != sizeof(pkt->hdr)) {
+ vq_err(vq, "Expected %zu bytes for pkt->hdr, got %zu bytes\n",
+ sizeof(pkt->hdr), nbytes);
+ kfree(pkt);
+ return NULL;
+ }
+
+ if (le16_to_cpu(pkt->hdr.type) == VIRTIO_VSOCK_TYPE_STREAM)
+ pkt->len = le32_to_cpu(pkt->hdr.len);
+
+ /* No payload */
+ if (!pkt->len)
+ return pkt;
+
+ /* The pkt is too big */
+ if (pkt->len > VIRTIO_VSOCK_MAX_PKT_BUF_SIZE) {
+ kfree(pkt);
+ return NULL;
+ }
+
+ pkt->buf = kmalloc(pkt->len, GFP_KERNEL);
+ if (!pkt->buf) {
+ kfree(pkt);
+ return NULL;
+ }
+
+ nbytes = copy_from_iter(pkt->buf, pkt->len, &iov_iter);
+ if (nbytes != pkt->len) {
+ vq_err(vq, "Expected %u byte payload, got %zu bytes\n",
+ pkt->len, nbytes);
+ virtio_transport_free_pkt(pkt);
+ return NULL;
+ }
+
+ return pkt;
+}
+
+static void vhost_vsock_handle_tx_kick(struct vhost_work *work)
+{
+ struct vhost_virtqueue *vq = container_of(work, struct vhost_virtqueue,
+ poll.work);
+ struct vhost_vsock *vsock = container_of(vq->dev, struct vhost_vsock,
+ dev);
+ struct virtio_vsock_pkt *pkt;
+ int head;
+ unsigned int out, in;
+ bool added = false;
+
+ mutex_lock(&vq->mutex);
+ vhost_disable_notify(&vsock->dev, vq);
+ for (;;) {
+ head = vhost_get_vq_desc(vq, vq->iov, ARRAY_SIZE(vq->iov),
+ &out, &in, NULL, NULL);
+ if (head < 0)
+ break;
+
+ if (head == vq->num) {
+ if (unlikely(vhost_enable_notify(&vsock->dev, vq))) {
+ vhost_disable_notify(&vsock->dev, vq);
+ continue;
+ }
+ break;
+ }
+
+ pkt = vhost_vsock_alloc_pkt(vq, out, in);
+ if (!pkt) {
+ vq_err(vq, "Faulted on pkt\n");
+ continue;
+ }
+
+ /* Only accept correctly addressed packets */
+ if (le32_to_cpu(pkt->hdr.src_cid) == vsock->guest_cid)
+ virtio_transport_recv_pkt(pkt);
+ else
+ virtio_transport_free_pkt(pkt);
+
+ vhost_add_used(vq, head, sizeof(pkt->hdr) + pkt->len);
+ added = true;
+ }
+ if (added)
+ vhost_signal(&vsock->dev, vq);
+ mutex_unlock(&vq->mutex);
+}
+
+static void vhost_vsock_handle_rx_kick(struct vhost_work *work)
+{
+ struct vhost_virtqueue *vq = container_of(work, struct vhost_virtqueue,
+ poll.work);
+ struct vhost_vsock *vsock = container_of(vq->dev, struct vhost_vsock,
+ dev);
+
+ vhost_transport_do_send_pkt(vsock, vq);
+}
+
+static int vhost_vsock_dev_open(struct inode *inode, struct file *file)
+{
+ struct vhost_virtqueue **vqs;
+ struct vhost_vsock *vsock;
+ int ret;
+
+ vsock = kzalloc(sizeof(*vsock), GFP_KERNEL);
+ if (!vsock)
+ return -ENOMEM;
+
+ vqs = kmalloc_array(VSOCK_VQ_MAX, sizeof(*vqs), GFP_KERNEL);
+ if (!vqs) {
+ ret = -ENOMEM;
+ goto out;
+ }
+
+ vqs[VSOCK_VQ_CTRL] = &vsock->vqs[VSOCK_VQ_CTRL];
+ vqs[VSOCK_VQ_TX] = &vsock->vqs[VSOCK_VQ_TX];
+ vqs[VSOCK_VQ_RX] = &vsock->vqs[VSOCK_VQ_RX];
+ /* The CTRL virtqueue is currently not used so skip it. */
+ vsock->vqs[VSOCK_VQ_TX].handle_kick = vhost_vsock_handle_tx_kick;
+ vsock->vqs[VSOCK_VQ_RX].handle_kick = vhost_vsock_handle_rx_kick;
+
+ vhost_dev_init(&vsock->dev, vqs, VSOCK_VQ_MAX);
+
+ file->private_data = vsock;
+ init_waitqueue_head(&vsock->send_wait);
+ INIT_LIST_HEAD(&vsock->send_pkt_list);
+ vhost_work_init(&vsock->send_pkt_work, vhost_transport_send_pkt_work);
+
+ mutex_lock(&vhost_vsock_mutex);
+ list_add_tail(&vsock->list, &vhost_vsock_list);
+ mutex_unlock(&vhost_vsock_mutex);
+ return 0;
+
+out:
+ kfree(vsock);
+ return ret;
+}
+
+static void vhost_vsock_flush(struct vhost_vsock *vsock)
+{
+ int i;
+
+ for (i = 0; i < VSOCK_VQ_MAX; i++)
+ if (vsock->vqs[i].handle_kick)
+ vhost_poll_flush(&vsock->vqs[i].poll);
+ vhost_work_flush(&vsock->dev, &vsock->send_pkt_work);
+}
+
+static int vhost_vsock_dev_release(struct inode *inode, struct file *file)
+{
+ struct vhost_vsock *vsock = file->private_data;
+
+ mutex_lock(&vhost_vsock_mutex);
+ list_del(&vsock->list);
+ mutex_unlock(&vhost_vsock_mutex);
+
+ vhost_dev_stop(&vsock->dev);
+ vhost_vsock_flush(vsock);
+ vhost_dev_cleanup(&vsock->dev, false);
+ kfree(vsock->dev.vqs);
+ kfree(vsock);
+ return 0;
+}
+
+static int vhost_vsock_set_cid(struct vhost_vsock *vsock, u32 guest_cid)
+{
+ struct vhost_vsock *other;
+
+ /* Refuse reserved CIDs */
+ if (guest_cid <= VMADDR_CID_HOST)
+ return -EINVAL;
+
+ /* Refuse if CID is already in use */
+ other = vhost_vsock_get(guest_cid);
+ if (other && other != vsock)
+ return -EADDRINUSE;
+
+ mutex_lock(&vhost_vsock_mutex);
+ vsock->guest_cid = guest_cid;
+ mutex_unlock(&vhost_vsock_mutex);
+
+ return 0;
+}
+
+static int vhost_vsock_set_features(struct vhost_vsock *vsock, u64 features)
+{
+ struct vhost_virtqueue *vq;
+ int i;
+
+ if (features & ~VHOST_VSOCK_FEATURES)
+ return -EOPNOTSUPP;
+
+ mutex_lock(&vsock->dev.mutex);
+ if ((features & (1 << VHOST_F_LOG_ALL)) &&
+ !vhost_log_access_ok(&vsock->dev)) {
+ mutex_unlock(&vsock->dev.mutex);
+ return -EFAULT;
+ }
+
+ for (i = 0; i < VSOCK_VQ_MAX; i++) {
+ vq = &vsock->vqs[i];
+ mutex_lock(&vq->mutex);
+ vq->acked_features = features;
+ mutex_unlock(&vq->mutex);
+ }
+ mutex_unlock(&vsock->dev.mutex);
+ return 0;
+}
+
+static long vhost_vsock_dev_ioctl(struct file *f, unsigned int ioctl,
+ unsigned long arg)
+{
+ struct vhost_vsock *vsock = f->private_data;
+ void __user *argp = (void __user *)arg;
+ u64 __user *featurep = argp;
+ u32 __user *cidp = argp;
+ u32 guest_cid;
+ u64 features;
+ int r;
+
+ switch (ioctl) {
+ case VHOST_VSOCK_SET_GUEST_CID:
+ if (get_user(guest_cid, cidp))
+ return -EFAULT;
+ return vhost_vsock_set_cid(vsock, guest_cid);
+ case VHOST_GET_FEATURES:
+ features = VHOST_VSOCK_FEATURES;
+ if (copy_to_user(featurep, &features, sizeof(features)))
+ return -EFAULT;
+ return 0;
+ case VHOST_SET_FEATURES:
+ if (copy_from_user(&features, featurep, sizeof(features)))
+ return -EFAULT;
+ return vhost_vsock_set_features(vsock, features);
+ default:
+ mutex_lock(&vsock->dev.mutex);
+ r = vhost_dev_ioctl(&vsock->dev, ioctl, argp);
+ if (r == -ENOIOCTLCMD)
+ r = vhost_vring_ioctl(&vsock->dev, ioctl, argp);
+ else
+ vhost_vsock_flush(vsock);
+ mutex_unlock(&vsock->dev.mutex);
+ return r;
+ }
+}
+
+static const struct file_operations vhost_vsock_fops = {
+ .owner = THIS_MODULE,
+ .open = vhost_vsock_dev_open,
+ .release = vhost_vsock_dev_release,
+ .llseek = noop_llseek,
+ .unlocked_ioctl = vhost_vsock_dev_ioctl,
+};
+
+static struct miscdevice vhost_vsock_misc = {
+ .minor = MISC_DYNAMIC_MINOR,
+ .name = "vhost-vsock",
+ .fops = &vhost_vsock_fops,
+};
+
+static struct virtio_transport vhost_transport = {
+ .transport = {
+ .get_local_cid = vhost_transport_get_local_cid,
+
+ .init = virtio_transport_do_socket_init,
+ .destruct = virtio_transport_destruct,
+ .release = virtio_transport_release,
+ .connect = virtio_transport_connect,
+ .shutdown = virtio_transport_shutdown,
+
+ .dgram_enqueue = virtio_transport_dgram_enqueue,
+ .dgram_dequeue = virtio_transport_dgram_dequeue,
+ .dgram_bind = virtio_transport_dgram_bind,
+ .dgram_allow = virtio_transport_dgram_allow,
+
+ .stream_enqueue = virtio_transport_stream_enqueue,
+ .stream_dequeue = virtio_transport_stream_dequeue,
+ .stream_has_data = virtio_transport_stream_has_data,
+ .stream_has_space = virtio_transport_stream_has_space,
+ .stream_rcvhiwat = virtio_transport_stream_rcvhiwat,
+ .stream_is_active = virtio_transport_stream_is_active,
+ .stream_allow = virtio_transport_stream_allow,
+
+ .notify_poll_in = virtio_transport_notify_poll_in,
+ .notify_poll_out = virtio_transport_notify_poll_out,
+ .notify_recv_init = virtio_transport_notify_recv_init,
+ .notify_recv_pre_block = virtio_transport_notify_recv_pre_block,
+ .notify_recv_pre_dequeue = virtio_transport_notify_recv_pre_dequeue,
+ .notify_recv_post_dequeue = virtio_transport_notify_recv_post_dequeue,
+ .notify_send_init = virtio_transport_notify_send_init,
+ .notify_send_pre_block = virtio_transport_notify_send_pre_block,
+ .notify_send_pre_enqueue = virtio_transport_notify_send_pre_enqueue,
+ .notify_send_post_enqueue = virtio_transport_notify_send_post_enqueue,
+
+ .set_buffer_size = virtio_transport_set_buffer_size,
+ .set_min_buffer_size = virtio_transport_set_min_buffer_size,
+ .set_max_buffer_size = virtio_transport_set_max_buffer_size,
+ .get_buffer_size = virtio_transport_get_buffer_size,
+ .get_min_buffer_size = virtio_transport_get_min_buffer_size,
+ .get_max_buffer_size = virtio_transport_get_max_buffer_size,
+ },
+
+ .send_pkt = vhost_transport_send_pkt,
+ .send_pkt_no_sock = vhost_transport_send_pkt_no_sock,
+};
+
+static int __init vhost_vsock_init(void)
+{
+ int ret;
+
+ ret = vsock_core_init(&vhost_transport.transport);
+ if (ret < 0)
+ return ret;
+ return misc_register(&vhost_vsock_misc);
+};
+
+static void __exit vhost_vsock_exit(void)
+{
+ misc_deregister(&vhost_vsock_misc);
+ vsock_core_exit();
+};
+
+module_init(vhost_vsock_init);
+module_exit(vhost_vsock_exit);
+MODULE_LICENSE("GPL v2");
+MODULE_AUTHOR("Asias He");
+MODULE_DESCRIPTION("vhost transport for vsock ");
diff --git a/drivers/vhost/vsock.h b/drivers/vhost/vsock.h
new file mode 100644
index 0000000..0ddb107
--- /dev/null
+++ b/drivers/vhost/vsock.h
@@ -0,0 +1,4 @@
+#ifndef VHOST_VSOCK_H
+#define VHOST_VSOCK_H
+#define VHOST_VSOCK_SET_GUEST_CID _IOW(VHOST_VIRTIO, 0x60, __u32)
+#endif
--
2.5.0
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [RFC v4 5/5] VSOCK: Add Makefile and Kconfig
2015-12-22 9:07 [RFC v4 0/5] Add virtio transport for AF_VSOCK Stefan Hajnoczi
` (3 preceding siblings ...)
2015-12-22 9:07 ` [RFC v4 4/5] VSOCK: Introduce vhost_vsock.ko Stefan Hajnoczi
@ 2015-12-22 9:07 ` Stefan Hajnoczi
2016-01-04 7:55 ` [RFC v4 0/5] Add virtio transport for AF_VSOCK Jason Wang
2016-01-28 15:19 ` Stefan Hajnoczi
6 siblings, 0 replies; 9+ messages in thread
From: Stefan Hajnoczi @ 2015-12-22 9:07 UTC (permalink / raw)
To: kvm
Cc: Stefan Hajnoczi, Michael S. Tsirkin, netdev, virtualization,
Matt Benjamin, Asias He, Christoffer Dall, matt.ma
From: Asias He <asias@redhat.com>
Enable virtio-vsock and vhost-vsock.
Signed-off-by: Asias He <asias@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
v4:
* Make checkpatch.pl happy with longer option description
* Clarify dependency on virtio rather than QEMU as suggested by Alex
Bennee
v3:
* Don't put vhost vsock driver into staging
* Add missing Kconfig dependencies (Arnd Bergmann <arnd@arndb.de>)
---
drivers/vhost/Kconfig | 15 +++++++++++++++
drivers/vhost/Makefile | 4 ++++
net/vmw_vsock/Kconfig | 19 +++++++++++++++++++
net/vmw_vsock/Makefile | 2 ++
4 files changed, 40 insertions(+)
diff --git a/drivers/vhost/Kconfig b/drivers/vhost/Kconfig
index 533eaf0..d7aae9e 100644
--- a/drivers/vhost/Kconfig
+++ b/drivers/vhost/Kconfig
@@ -21,6 +21,21 @@ config VHOST_SCSI
Say M here to enable the vhost_scsi TCM fabric module
for use with virtio-scsi guests
+config VHOST_VSOCK
+ tristate "vhost virtio-vsock driver"
+ depends on VSOCKETS && EVENTFD
+ select VIRTIO_VSOCKETS_COMMON
+ select VHOST
+ select VHOST_RING
+ default n
+ ---help---
+ This kernel module can be loaded in the host kernel to provide AF_VSOCK
+ sockets for communicating with guests. The guests must have the
+ virtio_transport.ko driver loaded to use the virtio-vsock device.
+
+ To compile this driver as a module, choose M here: the module will be called
+ vhost_vsock.
+
config VHOST_RING
tristate
---help---
diff --git a/drivers/vhost/Makefile b/drivers/vhost/Makefile
index e0441c3..6b012b9 100644
--- a/drivers/vhost/Makefile
+++ b/drivers/vhost/Makefile
@@ -4,5 +4,9 @@ vhost_net-y := net.o
obj-$(CONFIG_VHOST_SCSI) += vhost_scsi.o
vhost_scsi-y := scsi.o
+obj-$(CONFIG_VHOST_VSOCK) += vhost_vsock.o
+vhost_vsock-y := vsock.o
+
obj-$(CONFIG_VHOST_RING) += vringh.o
+
obj-$(CONFIG_VHOST) += vhost.o
diff --git a/net/vmw_vsock/Kconfig b/net/vmw_vsock/Kconfig
index 14810ab..f27e74b 100644
--- a/net/vmw_vsock/Kconfig
+++ b/net/vmw_vsock/Kconfig
@@ -26,3 +26,22 @@ config VMWARE_VMCI_VSOCKETS
To compile this driver as a module, choose M here: the module
will be called vmw_vsock_vmci_transport. If unsure, say N.
+
+config VIRTIO_VSOCKETS
+ tristate "virtio transport for Virtual Sockets"
+ depends on VSOCKETS && VIRTIO
+ select VIRTIO_VSOCKETS_COMMON
+ help
+ This module implements a virtio transport for Virtual Sockets.
+
+ Enable this transport if your Virtual Machine host supports Virtual
+ Sockets over virtio.
+
+ To compile this driver as a module, choose M here: the module
+ will be called virtio_vsock_transport. If unsure, say N.
+
+config VIRTIO_VSOCKETS_COMMON
+ tristate
+ ---help---
+ This option is selected by any driver which needs to access
+ the virtio_vsock.
diff --git a/net/vmw_vsock/Makefile b/net/vmw_vsock/Makefile
index 2ce52d7..cf4c294 100644
--- a/net/vmw_vsock/Makefile
+++ b/net/vmw_vsock/Makefile
@@ -1,5 +1,7 @@
obj-$(CONFIG_VSOCKETS) += vsock.o
obj-$(CONFIG_VMWARE_VMCI_VSOCKETS) += vmw_vsock_vmci_transport.o
+obj-$(CONFIG_VIRTIO_VSOCKETS) += virtio_transport.o
+obj-$(CONFIG_VIRTIO_VSOCKETS_COMMON) += virtio_transport_common.o
vsock-y += af_vsock.o vsock_addr.o
--
2.5.0
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [RFC v4 0/5] Add virtio transport for AF_VSOCK
2015-12-22 9:07 [RFC v4 0/5] Add virtio transport for AF_VSOCK Stefan Hajnoczi
` (4 preceding siblings ...)
2015-12-22 9:07 ` [RFC v4 5/5] VSOCK: Add Makefile and Kconfig Stefan Hajnoczi
@ 2016-01-04 7:55 ` Jason Wang
2016-01-07 8:34 ` Stefan Hajnoczi
2016-01-28 15:19 ` Stefan Hajnoczi
6 siblings, 1 reply; 9+ messages in thread
From: Jason Wang @ 2016-01-04 7:55 UTC (permalink / raw)
To: Stefan Hajnoczi, kvm
Cc: Michael S. Tsirkin, netdev, virtualization, Matt Benjamin,
Christoffer Dall, matt.ma
On 12/22/2015 05:07 PM, Stefan Hajnoczi wrote:
> This series is based on v4.4-rc2 and the "virtio: make find_vqs()
> checkpatch.pl-friendly" patch I recently submitted.
>
> v4:
> * Addressed code review comments from Alex Bennee
> * MAINTAINERS file entries for new files
> * Trace events instead of pr_debug()
> * RST packet is sent when there is no listen socket
> * Allow guest->host connections again (began discussing netfilter support with
> Matt Benjamin instead of hard-coding security policy in virtio-vsock code)
> * Many checkpatch.pl cleanups (will be 100% clean in v5)
>
> v3:
> * Remove unnecessary 3-way handshake, just do REQUEST/RESPONSE instead
> of REQUEST/RESPONSE/ACK
> * Remove SOCK_DGRAM support and focus on SOCK_STREAM first
> (also drop v2 Patch 1, it's only needed for SOCK_DGRAM)
> * Only allow host->guest connections (same security model as latest
> VMware)
> * Don't put vhost vsock driver into staging
> * Add missing Kconfig dependencies (Arnd Bergmann <arnd@arndb.de>)
> * Remove unneeded variable used to store return value
> (Fengguang Wu <fengguang.wu@intel.com> and Julia Lawall
> <julia.lawall@lip6.fr>)
>
> v2:
> * Rebased onto Linux v4.4-rc2
> * vhost: Refuse to assign reserved CIDs
> * vhost: Refuse guest CID if already in use
> * vhost: Only accept correctly addressed packets (no spoofing!)
> * vhost: Support flexible rx/tx descriptor layout
> * vhost: Add missing total_tx_buf decrement
> * virtio_transport: Fix total_tx_buf accounting
> * virtio_transport: Add virtio_transport global mutex to prevent races
> * common: Notify other side of SOCK_STREAM disconnect (fixes shutdown
> semantics)
> * common: Avoid recursive mutex_lock(tx_lock) for write_space (fixes deadlock)
> * common: Define VIRTIO_VSOCK_TYPE_STREAM/DGRAM hardware interface constants
> * common: Define VIRTIO_VSOCK_SHUTDOWN_RCV/SEND hardware interface constants
> * common: Fix peer_buf_alloc inheritance on child socket
>
> This patch series adds a virtio transport for AF_VSOCK (net/vmw_vsock/).
> AF_VSOCK is designed for communication between virtual machines and
> hypervisors. It is currently only implemented for VMware's VMCI transport.
>
> This series implements the proposed virtio-vsock device specification from
> here:
> http://permalink.gmane.org/gmane.comp.emulators.virtio.devel/980
>
> Most of the work was done by Asias He and Gerd Hoffmann a while back. I have
> picked up the series again.
>
> The QEMU userspace changes are here:
> https://github.com/stefanha/qemu/commits/vsock
>
> Why virtio-vsock?
> -----------------
> Guest<->host communication is currently done over the virtio-serial device.
> This makes it hard to port sockets API-based applications and is limited to
> static ports.
>
> virtio-vsock uses the sockets API so that applications can rely on familiar
> SOCK_STREAM semantics. Applications on the host can easily connect to guest
> agents because the sockets API allows multiple connections to a listen socket
> (unlike virtio-serial). This simplifies the guest<->host communication and
> eliminates the need for extra processes on the host to arbitrate virtio-serial
> ports.
>
> Overview
> --------
> This series adds 3 pieces:
>
> 1. virtio_transport_common.ko - core virtio vsock code that uses vsock.ko
>
> 2. virtio_transport.ko - guest driver
>
> 3. drivers/vhost/vsock.ko - host driver
Have a (dumb maybe) question after a quick glance at the codes:
Is there any chance to reuse existed virtio-net/vhost-net codes? For
example, using virito-net instead of a new device as a transport in
guest and using vhost-net (especially consider it uses a socket as
backend) in host. Maybe just a new virtio-net header type for vsock. I'm
asking since I don't see any blocker for doing this.
Thanks
> Howto
> -----
> The following kernel options are needed:
> CONFIG_VSOCKETS=y
> CONFIG_VIRTIO_VSOCKETS=y
> CONFIG_VIRTIO_VSOCKETS_COMMON=y
> CONFIG_VHOST_VSOCK=m
>
> Launch QEMU as follows:
> # qemu ... -device vhost-vsock-pci,id=vhost-vsock-pci0,guest-cid=3
>
> Guest and host can communicate via AF_VSOCK sockets. The host's CID (address)
> is 2 and the guest must be assigned a CID (3 in the example above).
>
> Status
> ------
> This patch series implements the latest draft specification. Please review.
>
> Asias He (4):
> VSOCK: Introduce virtio_vsock_common.ko
> VSOCK: Introduce virtio_transport.ko
> VSOCK: Introduce vhost_vsock.ko
> VSOCK: Add Makefile and Kconfig
>
> Stefan Hajnoczi (1):
> VSOCK: transport-specific vsock_transport functions
>
> MAINTAINERS | 13 +
> drivers/vhost/Kconfig | 15 +
> drivers/vhost/Makefile | 4 +
> drivers/vhost/vsock.c | 607 +++++++++++++++
> drivers/vhost/vsock.h | 4 +
> include/linux/virtio_vsock.h | 167 +++++
> include/net/af_vsock.h | 3 +
> .../trace/events/vsock_virtio_transport_common.h | 144 ++++
> include/uapi/linux/virtio_ids.h | 1 +
> include/uapi/linux/virtio_vsock.h | 87 +++
> net/vmw_vsock/Kconfig | 19 +
> net/vmw_vsock/Makefile | 2 +
> net/vmw_vsock/af_vsock.c | 9 +
> net/vmw_vsock/virtio_transport.c | 481 ++++++++++++
> net/vmw_vsock/virtio_transport_common.c | 834 +++++++++++++++++++++
> 15 files changed, 2390 insertions(+)
> create mode 100644 drivers/vhost/vsock.c
> create mode 100644 drivers/vhost/vsock.h
> create mode 100644 include/linux/virtio_vsock.h
> create mode 100644 include/trace/events/vsock_virtio_transport_common.h
> create mode 100644 include/uapi/linux/virtio_vsock.h
> create mode 100644 net/vmw_vsock/virtio_transport.c
> create mode 100644 net/vmw_vsock/virtio_transport_common.c
>
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [RFC v4 0/5] Add virtio transport for AF_VSOCK
2016-01-04 7:55 ` [RFC v4 0/5] Add virtio transport for AF_VSOCK Jason Wang
@ 2016-01-07 8:34 ` Stefan Hajnoczi
0 siblings, 0 replies; 9+ messages in thread
From: Stefan Hajnoczi @ 2016-01-07 8:34 UTC (permalink / raw)
To: Jason Wang
Cc: Stefan Hajnoczi, kvm, Michael S. Tsirkin, netdev, virtualization,
Matt Benjamin, Christoffer Dall, matt.ma
[-- Attachment #1: Type: text/plain, Size: 4683 bytes --]
On Mon, Jan 04, 2016 at 03:55:35PM +0800, Jason Wang wrote:
>
>
> On 12/22/2015 05:07 PM, Stefan Hajnoczi wrote:
> > This series is based on v4.4-rc2 and the "virtio: make find_vqs()
> > checkpatch.pl-friendly" patch I recently submitted.
> >
> > v4:
> > * Addressed code review comments from Alex Bennee
> > * MAINTAINERS file entries for new files
> > * Trace events instead of pr_debug()
> > * RST packet is sent when there is no listen socket
> > * Allow guest->host connections again (began discussing netfilter support with
> > Matt Benjamin instead of hard-coding security policy in virtio-vsock code)
> > * Many checkpatch.pl cleanups (will be 100% clean in v5)
> >
> > v3:
> > * Remove unnecessary 3-way handshake, just do REQUEST/RESPONSE instead
> > of REQUEST/RESPONSE/ACK
> > * Remove SOCK_DGRAM support and focus on SOCK_STREAM first
> > (also drop v2 Patch 1, it's only needed for SOCK_DGRAM)
> > * Only allow host->guest connections (same security model as latest
> > VMware)
> > * Don't put vhost vsock driver into staging
> > * Add missing Kconfig dependencies (Arnd Bergmann <arnd@arndb.de>)
> > * Remove unneeded variable used to store return value
> > (Fengguang Wu <fengguang.wu@intel.com> and Julia Lawall
> > <julia.lawall@lip6.fr>)
> >
> > v2:
> > * Rebased onto Linux v4.4-rc2
> > * vhost: Refuse to assign reserved CIDs
> > * vhost: Refuse guest CID if already in use
> > * vhost: Only accept correctly addressed packets (no spoofing!)
> > * vhost: Support flexible rx/tx descriptor layout
> > * vhost: Add missing total_tx_buf decrement
> > * virtio_transport: Fix total_tx_buf accounting
> > * virtio_transport: Add virtio_transport global mutex to prevent races
> > * common: Notify other side of SOCK_STREAM disconnect (fixes shutdown
> > semantics)
> > * common: Avoid recursive mutex_lock(tx_lock) for write_space (fixes deadlock)
> > * common: Define VIRTIO_VSOCK_TYPE_STREAM/DGRAM hardware interface constants
> > * common: Define VIRTIO_VSOCK_SHUTDOWN_RCV/SEND hardware interface constants
> > * common: Fix peer_buf_alloc inheritance on child socket
> >
> > This patch series adds a virtio transport for AF_VSOCK (net/vmw_vsock/).
> > AF_VSOCK is designed for communication between virtual machines and
> > hypervisors. It is currently only implemented for VMware's VMCI transport.
> >
> > This series implements the proposed virtio-vsock device specification from
> > here:
> > http://permalink.gmane.org/gmane.comp.emulators.virtio.devel/980
> >
> > Most of the work was done by Asias He and Gerd Hoffmann a while back. I have
> > picked up the series again.
> >
> > The QEMU userspace changes are here:
> > https://github.com/stefanha/qemu/commits/vsock
> >
> > Why virtio-vsock?
> > -----------------
> > Guest<->host communication is currently done over the virtio-serial device.
> > This makes it hard to port sockets API-based applications and is limited to
> > static ports.
> >
> > virtio-vsock uses the sockets API so that applications can rely on familiar
> > SOCK_STREAM semantics. Applications on the host can easily connect to guest
> > agents because the sockets API allows multiple connections to a listen socket
> > (unlike virtio-serial). This simplifies the guest<->host communication and
> > eliminates the need for extra processes on the host to arbitrate virtio-serial
> > ports.
> >
> > Overview
> > --------
> > This series adds 3 pieces:
> >
> > 1. virtio_transport_common.ko - core virtio vsock code that uses vsock.ko
> >
> > 2. virtio_transport.ko - guest driver
> >
> > 3. drivers/vhost/vsock.ko - host driver
>
> Have a (dumb maybe) question after a quick glance at the codes:
>
> Is there any chance to reuse existed virtio-net/vhost-net codes? For
> example, using virito-net instead of a new device as a transport in
> guest and using vhost-net (especially consider it uses a socket as
> backend) in host. Maybe just a new virtio-net header type for vsock. I'm
> asking since I don't see any blocker for doing this.
virtio-net is an Ethernet device with best-effort delivery while
virtio-vsock is guaranteed delivery. That's the fundamental difference
at the protocol level.
At the driver level AF_VSOCK isn't supposed to have a netdev or
user-visible network interface that can be misconfigured by the guest
administrator.
Trying to make the existing virtio-net device work for this use case is
probably nastier than using a dedicated device since lot of the
virtio-net configuration space and control features don't make sense for
AF_VSOCK.
Stefan
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 473 bytes --]
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [RFC v4 0/5] Add virtio transport for AF_VSOCK
2015-12-22 9:07 [RFC v4 0/5] Add virtio transport for AF_VSOCK Stefan Hajnoczi
` (5 preceding siblings ...)
2016-01-04 7:55 ` [RFC v4 0/5] Add virtio transport for AF_VSOCK Jason Wang
@ 2016-01-28 15:19 ` Stefan Hajnoczi
6 siblings, 0 replies; 9+ messages in thread
From: Stefan Hajnoczi @ 2016-01-28 15:19 UTC (permalink / raw)
To: kvm
Cc: Michael S. Tsirkin, netdev, virtualization, Matt Benjamin,
Christoffer Dall, matt.ma
[-- Attachment #1.1: Type: text/plain, Size: 5793 bytes --]
On Tue, Dec 22, 2015 at 05:07:33PM +0800, Stefan Hajnoczi wrote:
> This series is based on v4.4-rc2 and the "virtio: make find_vqs()
> checkpatch.pl-friendly" patch I recently submitted.
>
> v4:
> * Addressed code review comments from Alex Bennee
> * MAINTAINERS file entries for new files
> * Trace events instead of pr_debug()
> * RST packet is sent when there is no listen socket
> * Allow guest->host connections again (began discussing netfilter support with
> Matt Benjamin instead of hard-coding security policy in virtio-vsock code)
> * Many checkpatch.pl cleanups (will be 100% clean in v5)
>
> v3:
> * Remove unnecessary 3-way handshake, just do REQUEST/RESPONSE instead
> of REQUEST/RESPONSE/ACK
> * Remove SOCK_DGRAM support and focus on SOCK_STREAM first
> (also drop v2 Patch 1, it's only needed for SOCK_DGRAM)
> * Only allow host->guest connections (same security model as latest
> VMware)
> * Don't put vhost vsock driver into staging
> * Add missing Kconfig dependencies (Arnd Bergmann <arnd@arndb.de>)
> * Remove unneeded variable used to store return value
> (Fengguang Wu <fengguang.wu@intel.com> and Julia Lawall
> <julia.lawall@lip6.fr>)
>
> v2:
> * Rebased onto Linux v4.4-rc2
> * vhost: Refuse to assign reserved CIDs
> * vhost: Refuse guest CID if already in use
> * vhost: Only accept correctly addressed packets (no spoofing!)
> * vhost: Support flexible rx/tx descriptor layout
> * vhost: Add missing total_tx_buf decrement
> * virtio_transport: Fix total_tx_buf accounting
> * virtio_transport: Add virtio_transport global mutex to prevent races
> * common: Notify other side of SOCK_STREAM disconnect (fixes shutdown
> semantics)
> * common: Avoid recursive mutex_lock(tx_lock) for write_space (fixes deadlock)
> * common: Define VIRTIO_VSOCK_TYPE_STREAM/DGRAM hardware interface constants
> * common: Define VIRTIO_VSOCK_SHUTDOWN_RCV/SEND hardware interface constants
> * common: Fix peer_buf_alloc inheritance on child socket
>
> This patch series adds a virtio transport for AF_VSOCK (net/vmw_vsock/).
> AF_VSOCK is designed for communication between virtual machines and
> hypervisors. It is currently only implemented for VMware's VMCI transport.
>
> This series implements the proposed virtio-vsock device specification from
> here:
> http://permalink.gmane.org/gmane.comp.emulators.virtio.devel/980
>
> Most of the work was done by Asias He and Gerd Hoffmann a while back. I have
> picked up the series again.
>
> The QEMU userspace changes are here:
> https://github.com/stefanha/qemu/commits/vsock
>
> Why virtio-vsock?
> -----------------
> Guest<->host communication is currently done over the virtio-serial device.
> This makes it hard to port sockets API-based applications and is limited to
> static ports.
>
> virtio-vsock uses the sockets API so that applications can rely on familiar
> SOCK_STREAM semantics. Applications on the host can easily connect to guest
> agents because the sockets API allows multiple connections to a listen socket
> (unlike virtio-serial). This simplifies the guest<->host communication and
> eliminates the need for extra processes on the host to arbitrate virtio-serial
> ports.
>
> Overview
> --------
> This series adds 3 pieces:
>
> 1. virtio_transport_common.ko - core virtio vsock code that uses vsock.ko
>
> 2. virtio_transport.ko - guest driver
>
> 3. drivers/vhost/vsock.ko - host driver
>
> Howto
> -----
> The following kernel options are needed:
> CONFIG_VSOCKETS=y
> CONFIG_VIRTIO_VSOCKETS=y
> CONFIG_VIRTIO_VSOCKETS_COMMON=y
> CONFIG_VHOST_VSOCK=m
>
> Launch QEMU as follows:
> # qemu ... -device vhost-vsock-pci,id=vhost-vsock-pci0,guest-cid=3
>
> Guest and host can communicate via AF_VSOCK sockets. The host's CID (address)
> is 2 and the guest must be assigned a CID (3 in the example above).
>
> Status
> ------
> This patch series implements the latest draft specification. Please review.
>
> Asias He (4):
> VSOCK: Introduce virtio_vsock_common.ko
> VSOCK: Introduce virtio_transport.ko
> VSOCK: Introduce vhost_vsock.ko
> VSOCK: Add Makefile and Kconfig
>
> Stefan Hajnoczi (1):
> VSOCK: transport-specific vsock_transport functions
>
> MAINTAINERS | 13 +
> drivers/vhost/Kconfig | 15 +
> drivers/vhost/Makefile | 4 +
> drivers/vhost/vsock.c | 607 +++++++++++++++
> drivers/vhost/vsock.h | 4 +
> include/linux/virtio_vsock.h | 167 +++++
> include/net/af_vsock.h | 3 +
> .../trace/events/vsock_virtio_transport_common.h | 144 ++++
> include/uapi/linux/virtio_ids.h | 1 +
> include/uapi/linux/virtio_vsock.h | 87 +++
> net/vmw_vsock/Kconfig | 19 +
> net/vmw_vsock/Makefile | 2 +
> net/vmw_vsock/af_vsock.c | 9 +
> net/vmw_vsock/virtio_transport.c | 481 ++++++++++++
> net/vmw_vsock/virtio_transport_common.c | 834 +++++++++++++++++++++
> 15 files changed, 2390 insertions(+)
> create mode 100644 drivers/vhost/vsock.c
> create mode 100644 drivers/vhost/vsock.h
> create mode 100644 include/linux/virtio_vsock.h
> create mode 100644 include/trace/events/vsock_virtio_transport_common.h
> create mode 100644 include/uapi/linux/virtio_vsock.h
> create mode 100644 net/vmw_vsock/virtio_transport.c
> create mode 100644 net/vmw_vsock/virtio_transport_common.c
Michael and Jason: Ping
[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 473 bytes --]
[-- Attachment #2: Type: text/plain, Size: 183 bytes --]
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2016-01-28 15:19 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-12-22 9:07 [RFC v4 0/5] Add virtio transport for AF_VSOCK Stefan Hajnoczi
2015-12-22 9:07 ` [RFC v4 1/5] VSOCK: transport-specific vsock_transport functions Stefan Hajnoczi
2015-12-22 9:07 ` [RFC v4 2/5] VSOCK: Introduce virtio_vsock_common.ko Stefan Hajnoczi
2015-12-22 9:07 ` [RFC v4 3/5] VSOCK: Introduce virtio_transport.ko Stefan Hajnoczi
2015-12-22 9:07 ` [RFC v4 4/5] VSOCK: Introduce vhost_vsock.ko Stefan Hajnoczi
2015-12-22 9:07 ` [RFC v4 5/5] VSOCK: Add Makefile and Kconfig Stefan Hajnoczi
2016-01-04 7:55 ` [RFC v4 0/5] Add virtio transport for AF_VSOCK Jason Wang
2016-01-07 8:34 ` Stefan Hajnoczi
2016-01-28 15:19 ` Stefan Hajnoczi
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).