From: Stefano Garzarella <sgarzare@redhat.com>
To: Bobby Eshleman <bobbyeshleman@gmail.com>
Cc: linux-hyperv@vger.kernel.org,
	Bobby Eshleman <bobby.eshleman@bytedance.com>,
	kvm@vger.kernel.org, "Michael S. Tsirkin" <mst@redhat.com>,
	VMware PV-Drivers Reviewers <pv-drivers@vmware.com>,
	Simon Horman <simon.horman@corigine.com>,
	Stefan Hajnoczi <stefanha@redhat.com>,
	virtualization@lists.linux-foundation.org,
	Eric Dumazet <edumazet@google.com>,
	Dan Carpenter <dan.carpenter@linaro.org>,
	Xuan Zhuo <xuanzhuo@linux.alibaba.com>,
	Wei Liu <wei.liu@kernel.org>, Dexuan Cui <decui@microsoft.com>,
	Bryan Tan <bryantan@vmware.com>, Jakub Kicinski <kuba@kernel.org>,
	Paolo Abeni <pabeni@redhat.com>,
	Haiyang Zhang <haiyangz@microsoft.com>,
	Arseniy Krasnov <oxffffaa@gmail.com>,
	Vishnu Dasa <vdasa@vmware.com>,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	bpf@vger.kernel.org, "David S. Miller" <davem@davemloft.net>
Subject: Re: [PATCH RFC net-next v4 6/8] virtio/vsock: support dgrams
Date: Mon, 26 Jun 2023 17:03:15 +0200
Message-ID: <d53tgo4igvz34pycgs36xikjosrncejlzuvh47bszk55milq52@whcyextsxfka>
In-Reply-To: <ZJUho6NbpCgGatap@bullseye>

On Fri, Jun 23, 2023 at 04:37:55AM +0000, Bobby Eshleman wrote:
>On Thu, Jun 22, 2023 at 06:09:12PM +0200, Stefano Garzarella wrote:
>> On Sun, Jun 11, 2023 at 11:49:02PM +0300, Arseniy Krasnov wrote:
>> > Hello Bobby!
>> >
>> > On 10.06.2023 03:58, Bobby Eshleman wrote:
>> > > This commit adds support for datagrams over virtio/vsock.
>> > >
>> > > Message boundaries are preserved on a per-skb and per-vq entry basis.
>> >
>> > I'm a little bit confused about the following case: say vhost sends a 4097-byte
>> > datagram to the guest. The guest uses 4096-byte RX buffers in its virtio queue, each
>> > with an empty skb attached. Vhost places the first 4096 bytes in the first
>> > buffer of the guest's RX queue, and the last byte in the second buffer. Now, IIUC,
>> > the guest has two skbs in its RX queue, and a user in the guest wants to read data:
>> > does it read 4097 bytes, while the guest has two skbs of 4096 bytes and 1 byte? In
>> > seqpacket there is a special marker in the header which shows where the message ends;
>> > how does it work here?
>>
>> I think the main difference is that DGRAM is not connection-oriented, so
>> we don't have a stream and we can't split the packet in two (maybe we
>> could, but we have no guarantee that the second one, for example, will
>> not be discarded because there is no space).
>>
>> So I think it is acceptable as a restriction to keep it simple.
>>
>> My only doubt is, should we make the RX buffer size configurable,
>> instead of always using 4k?
>>
>I think that is a really good idea. What mechanism do you imagine?

Some parameter in sysfs?

>
>For sendmsg() with buflen > VQ_BUF_SIZE, I think I'd like -ENOBUFS

For the guest it should be easy since it allocates the buffers, but for
the host?

Maybe we should add a field in the configuration space that reports some
sort of MTU.

Something in addition to what Laura had proposed here:
https://markmail.org/message/ymhz7wllutdxji3e
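
To make the idea concrete, here is a minimal userspace sketch of the sender-side check, assuming the peer reports a maximum datagram payload size. The function name dgram_check_len() and the peer_mtu parameter are hypothetical; neither exists in the patch or in the virtio spec. The return value follows the -ENOBUFS suggestion quoted above and is not a settled policy.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical helper, not taken from the patch series.
 *
 * peer_mtu stands in for whatever limit the peer would advertise
 * (e.g. a config space field); today the RX buffers are always 4k.
 *
 * Returns 0 if a datagram of len bytes fits in a single peer buffer,
 * -ENOBUFS otherwise, so the sender fails early instead of splitting
 * the datagram across two buffers.
 */
static int dgram_check_len(size_t len, size_t peer_mtu)
{
	if (len > peer_mtu)
		return -ENOBUFS;

	return 0;
}
```

With a 4096-byte limit, the 4097-byte datagram from the earlier example would be rejected at sendmsg() time rather than arriving as two skbs.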

>returned even though it is uncharacteristic of Linux sockets.
>Alternatively, silently dropping is okay... but seems needlessly
>unhelpful.

UDP takes advantage of IP fragmentation, right?
But what happens if a fragment is lost?

We should try to behave in a similar way.
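
For reference, IP reassembly is all-or-nothing: if any fragment is lost, the receiver never delivers the datagram to the UDP socket and discards the partial fragments once the reassembly timer expires. A minimal sketch of that rule follows, assuming a fixed upper bound on the fragment count. The struct reasm type and the reasm_*() helpers are made up for illustration and do not correspond to kernel code.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_FRAGS 8	/* arbitrary bound for this sketch */

/* Hypothetical tracker for one datagram split into nfrags pieces. */
struct reasm {
	size_t nfrags;		/* number of fragments expected */
	bool seen[MAX_FRAGS];	/* which fragments have arrived */
};

static void reasm_init(struct reasm *r, size_t nfrags)
{
	size_t i;

	r->nfrags = nfrags > MAX_FRAGS ? MAX_FRAGS : nfrags;
	for (i = 0; i < MAX_FRAGS; i++)
		r->seen[i] = false;
}

/* Record the arrival of fragment idx; out-of-range indexes are ignored. */
static void reasm_add(struct reasm *r, size_t idx)
{
	if (idx < r->nfrags)
		r->seen[idx] = true;
}

/* The datagram may be delivered only when every fragment has arrived. */
static bool reasm_complete(const struct reasm *r)
{
	size_t i;

	for (i = 0; i < r->nfrags; i++)
		if (!r->seen[i])
			return false;

	return true;
}
```

Applied to the 4097-byte case: with only the 4096-byte piece received, reasm_complete() stays false and the reader sees nothing, which matches what a UDP receiver observes when a fragment is lost.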

>
>FYI, this patch is broken for h2g because it requeues partially sent
>skbs, so it probably doesn't need much code review until we decide on
>the policy.

Got it.

Thanks,
Stefano
