All of lore.kernel.org
 help / color / mirror / Atom feed
From: Bobby Eshleman <bobby.eshleman@bytedance.com>
To: Stefan Hajnoczi <stefanha@redhat.com>,
	Stefano Garzarella <sgarzare@redhat.com>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	Jason Wang <jasowang@redhat.com>,
	Xuan Zhuo <xuanzhuo@linux.alibaba.com>,
	"David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	"K. Y. Srinivasan" <kys@microsoft.com>,
	Haiyang Zhang <haiyangz@microsoft.com>,
	Wei Liu <wei.liu@kernel.org>, Dexuan Cui <decui@microsoft.com>,
	Bryan Tan <bryantan@vmware.com>, Vishnu Dasa <vdasa@vmware.com>,
	VMware PV-Drivers Reviewers <pv-drivers@vmware.com>
Cc: Dan Carpenter <dan.carpenter@linaro.org>,
	Simon Horman <simon.horman@corigine.com>,
	Krasnov Arseniy <oxffffaa@gmail.com>,
	kvm@vger.kernel.org, virtualization@lists.linux-foundation.org,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-hyperv@vger.kernel.org, bpf@vger.kernel.org,
	Bobby Eshleman <bobby.eshleman@bytedance.com>,
	Jiang Wang <jiang.wang@bytedance.com>
Subject: [PATCH RFC net-next v4 0/8] virtio/vsock: support datagrams
Date: Sat, 10 Jun 2023 00:58:27 +0000	[thread overview]
Message-ID: <20230413-b4-vsock-dgram-v4-0-0cebbb2ae899@bytedance.com> (raw)

Hey all!

This series introduces support for datagrams to virtio/vsock.

It is a spin-off (and smaller version) of this series from the summer:
  https://lore.kernel.org/all/cover.1660362668.git.bobby.eshleman@bytedance.com/

Please note that this is an RFC and should not be merged until
associated changes are made to the virtio specification, which will
follow after discussion from this series.

Another aside, the v4 of the series has only been mildly tested with a
run of tools/testing/vsock/vsock_test. Some code likely needs cleaning
up, but I'm hoping to get some of the design choices agreed upon before
spending too much time making it pretty.

This series first supports datagrams in a basic form for virtio, and
then optimizes the sendpath for all datagram transports.

The result is a very fast datagram communication protocol that
outperforms even UDP on multi-queue virtio-net w/ vhost on a variety
of multi-threaded workload samples.

For those that are curious, some summary data comparing UDP and VSOCK
DGRAM (N=5):

	vCPUS: 16
	virtio-net queues: 16
	payload size: 4KB
	Setup: bare metal + vm (non-nested)

	UDP: 287.59 MB/s
	VSOCK DGRAM: 509.2 MB/s

Some notes about the implementation...

This datagram implementation forces datagrams to self-throttle according
to the threshold set by sk_sndbuf. It behaves similar to the credits
used by streams in its effect on throughput and memory consumption, but
it is not influenced by the receiving socket as credits are.

The device drops packets silently.

As discussed previously, this series introduces datagrams and defers
fairness to future work. See discussion in v2 for more context around
datagrams, fairness, and this implementation.

Signed-off-by: Bobby Eshleman <bobby.eshleman@bytedance.com>
---
Changes in v4:
- style changes
  - vsock: use sk_vsock(vsk) in vsock_dgram_recvmsg instead of
    &sk->vsk
  - vsock: fix xmas tree declaration
  - vsock: fix spacing issues
  - virtio/vsock: virtio_transport_recv_dgram returns void because err
    unused
- sparse analysis warnings/errors
  - virtio/vsock: fix unitialized skerr on destroy
  - virtio/vsock: fix uninitialized err var on goto out
  - vsock: fix declarations that need static
  - vsock: fix __rcu annotation order
- bugs
  - vsock: fix null ptr in remote_info code
  - vsock/dgram: make transport_dgram a fallback instead of first
    priority
  - vsock: remove redundant rcu read lock acquire in getname()
- tests
  - add more tests (message bounds and more)
  - add vsock_dgram_bind() helper
  - add vsock_dgram_connect() helper

Changes in v3:
- Support multi-transport dgram, changing logic in connect/bind
  to support VMCI case
- Support per-pkt transport lookup for sendto() case
- Fix dgram_allow() implementation
- Fix dgram feature bit number (now it is 3)
- Fix binding so dgram and connectible (cid,port) spaces are
  non-overlapping
- RCU protect transport ptr so connect() calls never leave
  a lockless read of the transport and remote_addr are always
  in sync
- Link to v2: https://lore.kernel.org/r/20230413-b4-vsock-dgram-v2-0-079cc7cee62e@bytedance.com

---
Bobby Eshleman (7):
      vsock/dgram: generalize recvmsg and drop transport->dgram_dequeue
      vsock: refactor transport lookup code
      vsock: support multi-transport datagrams
      vsock: make vsock bind reusable
      virtio/vsock: add VIRTIO_VSOCK_F_DGRAM feature bit
      virtio/vsock: support dgrams
      vsock: Add lockless sendmsg() support

Jiang Wang (1):
      tests: add vsock dgram tests

 drivers/vhost/vsock.c                   |  44 ++-
 include/linux/virtio_vsock.h            |  13 +-
 include/net/af_vsock.h                  |  52 ++-
 include/uapi/linux/virtio_vsock.h       |   2 +
 net/vmw_vsock/af_vsock.c                | 616 ++++++++++++++++++++++++++------
 net/vmw_vsock/diag.c                    |  10 +-
 net/vmw_vsock/hyperv_transport.c        |  42 ++-
 net/vmw_vsock/virtio_transport.c        |  28 +-
 net/vmw_vsock/virtio_transport_common.c | 226 +++++++++---
 net/vmw_vsock/vmci_transport.c          | 152 ++++----
 net/vmw_vsock/vsock_bpf.c               |  10 +-
 net/vmw_vsock/vsock_loopback.c          |  13 +-
 tools/testing/vsock/util.c              | 141 +++++++-
 tools/testing/vsock/util.h              |   6 +
 tools/testing/vsock/vsock_test.c        | 432 ++++++++++++++++++++++
 15 files changed, 1533 insertions(+), 254 deletions(-)
---
base-commit: 28cfea989d6f55c3d10608eba2a2bae609c5bf3e
change-id: 20230413-b4-vsock-dgram-3b6eba6a64e5

Best regards,
-- 
Bobby Eshleman <bobby.eshleman@bytedance.com>


             reply	other threads:[~2023-06-10  0:59 UTC|newest]

Thread overview: 57+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-06-10  0:58 Bobby Eshleman [this message]
2023-06-10  0:58 ` [PATCH RFC net-next v4 1/8] vsock/dgram: generalize recvmsg and drop transport->dgram_dequeue Bobby Eshleman
2023-06-11 20:43   ` Arseniy Krasnov
2023-06-22 14:51     ` Stefano Garzarella
2023-06-22 14:51       ` Stefano Garzarella
2023-06-22 19:23       ` Arseniy Krasnov
2023-06-22 23:34         ` Bobby Eshleman
2023-06-22 23:37       ` Bobby Eshleman
2023-06-23  8:14         ` Stefano Garzarella
2023-06-23  8:14           ` Stefano Garzarella
2023-06-22 23:25     ` Bobby Eshleman
2023-06-10  0:58 ` [PATCH RFC net-next v4 2/8] vsock: refactor transport lookup code Bobby Eshleman
2023-06-22 14:57   ` Stefano Garzarella
2023-06-22 14:57     ` Stefano Garzarella
2023-06-10  0:58 ` [PATCH RFC net-next v4 3/8] vsock: support multi-transport datagrams Bobby Eshleman
2023-06-22 15:19   ` Stefano Garzarella
2023-06-22 15:19     ` Stefano Garzarella
2023-06-23  2:50     ` Bobby Eshleman
2023-06-23  2:59       ` Bobby Eshleman
2023-06-26 14:50         ` Stefano Garzarella
2023-06-26 14:50           ` Stefano Garzarella
2023-06-10  0:58 ` [PATCH RFC net-next v4 4/8] vsock: make vsock bind reusable Bobby Eshleman
2023-06-12  9:49   ` Simon Horman
2023-06-22 23:00     ` Bobby Eshleman
2023-06-22 15:25   ` Stefano Garzarella
2023-06-22 15:25     ` Stefano Garzarella
2023-06-22 23:05     ` Bobby Eshleman
2023-06-23  8:15       ` Stefano Garzarella
2023-06-23  8:15         ` Stefano Garzarella
2023-06-10  0:58 ` [PATCH RFC net-next v4 5/8] virtio/vsock: add VIRTIO_VSOCK_F_DGRAM feature bit Bobby Eshleman
2023-06-22 15:29   ` Stefano Garzarella
2023-06-22 15:29     ` Stefano Garzarella
2023-06-22 23:06     ` Bobby Eshleman
2023-06-10  0:58 ` [PATCH RFC net-next v4 6/8] virtio/vsock: support dgrams Bobby Eshleman
2023-06-11 20:49   ` Arseniy Krasnov
2023-06-22 16:09     ` Stefano Garzarella
2023-06-22 16:09       ` Stefano Garzarella
2023-06-22 18:46       ` Arseniy Krasnov
2023-06-23  4:37       ` Bobby Eshleman
2023-06-26 15:03         ` Stefano Garzarella
2023-06-26 15:03           ` Stefano Garzarella
2023-06-27  1:19           ` Bobby Eshleman
2023-06-29 12:30             ` Stefano Garzarella
2023-06-29 12:30               ` Stefano Garzarella
2023-06-22 16:31   ` Stefano Garzarella
2023-06-22 16:31     ` Stefano Garzarella
2023-06-10  0:58 ` [PATCH RFC net-next v4 7/8] vsock: Add lockless sendmsg() support Bobby Eshleman
2023-06-12  9:53   ` Simon Horman
2023-06-22 22:59     ` Bobby Eshleman
2023-06-22 16:37   ` Stefano Garzarella
2023-06-22 16:37     ` Stefano Garzarella
2023-06-22 22:57     ` Bobby Eshleman
2023-06-10  0:58 ` [PATCH RFC net-next v4 8/8] tests: add vsock dgram tests Bobby Eshleman
2023-06-11 20:54   ` Arseniy Krasnov
2023-06-22 23:16     ` Bobby Eshleman
2023-06-23 18:34       ` Arseniy Krasnov
2023-06-23  6:33         ` Bobby Eshleman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230413-b4-vsock-dgram-v4-0-0cebbb2ae899@bytedance.com \
    --to=bobby.eshleman@bytedance.com \
    --cc=bpf@vger.kernel.org \
    --cc=bryantan@vmware.com \
    --cc=dan.carpenter@linaro.org \
    --cc=davem@davemloft.net \
    --cc=decui@microsoft.com \
    --cc=edumazet@google.com \
    --cc=haiyangz@microsoft.com \
    --cc=jasowang@redhat.com \
    --cc=jiang.wang@bytedance.com \
    --cc=kuba@kernel.org \
    --cc=kvm@vger.kernel.org \
    --cc=kys@microsoft.com \
    --cc=linux-hyperv@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mst@redhat.com \
    --cc=netdev@vger.kernel.org \
    --cc=oxffffaa@gmail.com \
    --cc=pabeni@redhat.com \
    --cc=pv-drivers@vmware.com \
    --cc=sgarzare@redhat.com \
    --cc=simon.horman@corigine.com \
    --cc=stefanha@redhat.com \
    --cc=vdasa@vmware.com \
    --cc=virtualization@lists.linux-foundation.org \
    --cc=wei.liu@kernel.org \
    --cc=xuanzhuo@linux.alibaba.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.