virtualization.lists.linux-foundation.org archive mirror
 help / color / mirror / Atom feed
* Re: [virtio-comment] [RFC PATCH v4 2/2] virtio-vsock: SOCK_SEQPACKET description
       [not found] ` <20210326090254.1144486-1-arseny.krasnov@kaspersky.com>
@ 2021-03-29 16:11   ` Stefan Hajnoczi
       [not found]     ` <230d95fd-29e8-465b-0ab2-b406d614c11b@kaspersky.com>
  0 siblings, 1 reply; 7+ messages in thread
From: Stefan Hajnoczi @ 2021-03-29 16:11 UTC (permalink / raw)
  To: Arseny Krasnov
  Cc: Andra Paraschiv, Michael S. Tsirkin, cohuck, Colin Ian King,
	Norbert Slusarek, oxffffaa, virtio-comment, Jakub Kicinski,
	Alexander Popov, virtualization, David S. Miller, Jorgen Hansen


[-- Attachment #1.1: Type: text/plain, Size: 9221 bytes --]

On Fri, Mar 26, 2021 at 12:02:50PM +0300, Arseny Krasnov wrote:
> This adds description of SOCK_SEQPACKET socket type
> support for virtio-vsock.
> 
> Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
> ---
>  virtio-vsock.tex | 65 +++++++++++++++++++++++++++++++++++++++++++-----
>  1 file changed, 59 insertions(+), 6 deletions(-)
> 
> diff --git a/virtio-vsock.tex b/virtio-vsock.tex
> index ad57f9d..c366de7 100644
> --- a/virtio-vsock.tex
> +++ b/virtio-vsock.tex
> @@ -17,6 +17,10 @@ \subsection{Virtqueues}\label{sec:Device Types / Socket Device / Virtqueues}
>  \subsection{Feature bits}\label{sec:Device Types / Socket Device / Feature bits}
>  
>  There are currently no feature bits defined for this device.

^ This line is now out of date :)

> +\begin{description}
> +\item VIRTIO_VSOCK_F_SEQPACKET (0) SOCK_SEQPACKET socket type is
> +    supported.
> +\end{description}
>  
>  \subsection{Device configuration layout}\label{sec:Device Types / Socket Device / Device configuration layout}
>  
> @@ -98,6 +102,10 @@ \subsection{Device Operation}\label{sec:Device Types / Socket Device / Device Op
>  #define VIRTIO_VSOCK_OP_CREDIT_UPDATE  6
>  /* Request the peer to send the credit info to us */
>  #define VIRTIO_VSOCK_OP_CREDIT_REQUEST 7
> +/* Message begin for SOCK_SEQPACKET */
> +#define VIRTIO_VSOCK_OP_SEQ_BEGIN      8
> +/* Message end for SOCK_SEQPACKET */
> +#define VIRTIO_VSOCK_OP_SEQ_END        9

The struct virtio_vsock_hdr->flags field is le32 and currently unused.
Could 24 bits be used for a unique message id and 8 bits for flags? 1
flag bit could be used for end-of-message and the remaining 7 bits could
be reserved. That way SEQ_BEGIN and SEQ_END are not necessary. Pressure
on the virtqueue would be reduced and performance should be comparable
to SOCK_STREAM.

>  \end{lstlisting}
>  
>  \subsubsection{Virtqueue Flow Control}\label{sec:Device Types / Socket Device / Device Operation / Virtqueue Flow Control}
> @@ -135,15 +143,16 @@ \subsubsection{Addressing}\label{sec:Device Types / Socket Device / Device Opera
>  consists of a (cid, port number) tuple. The header fields used for this are
>  \field{src_cid}, \field{src_port}, \field{dst_cid}, and \field{dst_port}.
>  
> -Currently only stream sockets are supported. \field{type} is 1 (VIRTIO_VSOCK_TYPE_STREAM)
> -for stream socket types.
> +Currently stream and seqpacket sockets are supported. \field{type} is 1 (VIRTIO_VSOCK_TYPE_STREAM)
> +for stream socket types, and 2 (VIRTIO_VSOCK_TYPE_SEQPACKET) for seqpacket socket types.
>  
>  \begin{lstlisting}
> -#define VIRTIO_VSOCK_TYPE_STREAM 1
> +#define VIRTIO_VSOCK_TYPE_STREAM    1
> +#define VIRTIO_VSOCK_TYPE_SEQPACKET 2
>  \end{lstlisting}
>  
>  Stream sockets provide in-order, guaranteed, connection-oriented delivery
> -without message boundaries.
> +without message boundaries. Seqpacket sockets also provide message boundaries.

"also" is ambiguous, I guess it means "is the same except ..." here. I
suggest writing out the characteristics of seqpacket sockets in full:

  Seqpacket sockets provide in-order, guaranteed, connection-oriented
  delivery with message boundaries.

If "also" is intended to mean that message boundaries are optional, then
my understanding is that they are mandatory, not optional. Seqpacket
sockets deliver entire messages and therefore communication is
impossible without message boundaries.

>  
>  \subsubsection{Buffer Space Management}\label{sec:Device Types / Socket Device / Device Operation / Buffer Space Management}
>  \field{buf_alloc} and \field{fwd_cnt} are used for buffer space management of
> @@ -170,8 +179,10 @@ \subsubsection{Buffer Space Management}\label{sec:Device Types / Socket Device /
>  communicating updates any time a change in buffer space occurs.
>  
>  \drivernormative{\paragraph}{Device Operation: Buffer Space Management}{Device Types / Socket Device / Device Operation / Buffer Space Management}
> -VIRTIO_VSOCK_OP_RW data packets MUST only be transmitted when the peer has
> -sufficient free buffer space for the payload.
> +VIRTIO_VSOCK_OP_RW data packets as well as VIRTIO_VSOCK_OP_SEQ_BEGIN and
> +VIRTIO_VSOCK_OP_SEQ_END message boundary packets MUST only be
> +transmitted when the peer has sufficient free buffer space for the
> +payload.
>  
>  All packets associated with a stream flow MUST contain valid information in
>  \field{buf_alloc} and \field{fwd_cnt} fields.
> @@ -244,6 +255,48 @@ \subsubsection{Stream Sockets}\label{sec:Device Types / Socket Device / Device O
>  destination) address tuple for a new connection while the other peer is still
>  processing the old connection.
>  
> +\subsubsection{Seqpacket Sockets}\label{sec:Device Types / Socket Device / Device Operation / Seqpacket Sockets}
> +
> +\paragraph{Message boundaries}\label{sec:Device Types / Socket Device / Device Operation / Seqpacket Sockets / Message boundaries}
> +
> +Seqpacket sockets differ from stream sockets only in data transmission way: in

s/in data transmission way/in how data is transmitted/

> +stream sockets all data is sent using only VIRTIO_VSOCK_OP_RW packets. In
> +seqpacket sockets, to provide message boundaries, every sequence of
> +VIRTIO_VSOCK_OP_RW packets of each message MUST be headed with
> +VIRTIO_VSOCK_OP_SEQ_BEGIN and tailed with VIRTIO_VSOCK_OP_SEQ_END packets.
> +Both VIRTIO_VSOCK_OP_SEQ_BEGIN and VIRTIO_VSOCK_OP_SEQ_END packets MUST carry

s/MUST//

MUST/MAY/CAN/SHOULD/etc are only used in normative sections of the spec.

> +special structure in payload:

s/carry special structure in payload/carry the following payload/

> +
> +\begin{lstlisting}
> +struct virtio_vsock_seq_hdr {
> +	__le32  msg_id;
> +	__le32  msg_len;
> +};
> +\end{lstlisting}
> +
> +This data MUST be placed starting from the first byte of payload and no more

s/MUST be/is/

s/starting from/starting at/

> +data is allowed to be beyond it. Also payload of such packet MUST be transmitted

s/to be//

s/MUST be/is/

Which "payload" is this sentence referring to? The previous sentence
referred to the payload of
VIRTIO_VSOCK_OP_SEQ_BEGIN/VIRTIO_VSOCK_OP_SEQ_END packets, but here I'm
not sure if the text is referring to VIRTIO_VSOCK_OP_RW packets.

> +without fragmentation. \field{len} of packet header MUST be set to the size of

The word "fragmentation" is not used in the Socket device section and
it's not clear to me what it means. It seems like SEQ_BEGIN is followed
by one or more RW packets and then SEQ_END. But does splitting a message
across multiple RW packets count as "fragmentation"?

s/\field{len} of packet header/The packet header \field{len} field/

s/MUST be/is/

> +the struct virtio_vsock_seq_hdr.
> +
> +\field{msg_id} is value to identify message. It MUST be same for pair of

s/is value to identify message/is a unique valid to identify a message/

s/MUST be/is the/

s/for pair/for a pair/

> +VIRTIO_VSOCK_OP_SEQ_BEGIN and VIRTIO_VSOCK_OP_SEQ_END of one message, and MUST

s/of one message/packets related to one message/

> +differ in next messages. \field{msg_len} contains message length. This header

"MUST differ in next messages" can be removed here. If we describe
msg_id as "unique" above the the intent is clear and sentence can be
added to the driver normative section indicating that all msg_id values
in use at a given time MUST be unique.

s/contains message length/contains the message length/

> +is used to check integrity of each message: message is valid if length of data

s/check integrity/check the integrity/

s/if length/if the total length/

(I suggest adding "total" to make it clear that the lengths of all RW
packets is summed. This rules out the interpretation that each RW
packet's length must be msg_len.)

> +in VIRTIO_VSOCK_OP_RW packets is equal to \field{msg_len} of prepending

s/in VIRTIO_VSOCK_OP_RW packets/in the VIRTIO_VSOCK_OP_RW packets/

s/prepending/the corresponding/

> +VIRTIO_VSOCK_OP_SEQ_BEGIN and \field{msg_id} of VIRTIO_VSOCK_OP_SEQ_END is equal
> +to \field{msg_id} of VIRTIO_VSOCK_OP_SEQ_BEGIN.
> +
> +\paragraph{POSIX MSG_EOR flag}\label{sec:Device Types / Socket Device / Device Operation / Seqpacket Sockets / MSG_EOR flag}
> +
> +When message is sent using one of POSIX functions: send(), sendto() or sendmsg() and
> +MSG_EOR flag is set in \field{flags} parameter of these functions, then all VIRTIO_VSOCK_OP_RW
> +packets of this message MUST have \field{flags} field in header set to special constanst:
> +
> +\begin{lstlisting}
> +#define VIRTIO_VSOCK_RW_EOR 1
> +\end{lstlisting}

I'm not familiar with MSG_EOR. It seems to be a hint to send buffered
data with Linux TCP sockets and it's not implemented for Linux AF_UNIX
sockets.

How is MSG_EOR useful if SEQ_END already deliminates message boundaries?

It seems like passing the MSG_EOR flag in this way provides an
additional layer on top of message boundaries? It's not clear to me that
MSG_EOR has to work like that according to POSIX. Do you have a link to
the POSIX spec pages that describe how MSG_EOR works?

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

[-- Attachment #2: Type: text/plain, Size: 183 bytes --]

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [virtio-comment] [RFC PATCH v4 2/2] virtio-vsock: SOCK_SEQPACKET description
       [not found]     ` <230d95fd-29e8-465b-0ab2-b406d614c11b@kaspersky.com>
@ 2021-03-29 21:28       ` Stefano Garzarella
       [not found]         ` <d6d92105-f7d4-74a3-4acc-fcfb40872b76@kaspersky.com>
  0 siblings, 1 reply; 7+ messages in thread
From: Stefano Garzarella @ 2021-03-29 21:28 UTC (permalink / raw)
  To: Arseny Krasnov
  Cc: Andra Paraschiv, Michael S. Tsirkin, cohuck, Colin Ian King,
	oxffffaa, Norbert Slusarek, Stefan Hajnoczi, virtio-comment,
	Jakub Kicinski, virtualization, David S. Miller, Jorgen Hansen,
	Alexander Popov

On Mon, Mar 29, 2021 at 08:33:27PM +0300, Arseny Krasnov wrote:
>
>On 29.03.2021 19:11, Stefan Hajnoczi wrote:
>> On Fri, Mar 26, 2021 at 12:02:50PM +0300, Arseny Krasnov wrote:
>>> This adds description of SOCK_SEQPACKET socket type
>>> support for virtio-vsock.
>>>
>>> Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
>>> ---
>>>  virtio-vsock.tex | 65 +++++++++++++++++++++++++++++++++++++++++++-----
>>>  1 file changed, 59 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/virtio-vsock.tex b/virtio-vsock.tex
>>> index ad57f9d..c366de7 100644
>>> --- a/virtio-vsock.tex
>>> +++ b/virtio-vsock.tex
>>> @@ -17,6 +17,10 @@ \subsection{Virtqueues}\label{sec:Device Types / Socket Device / Virtqueues}
>>>  \subsection{Feature bits}\label{sec:Device Types / Socket Device / Feature bits}
>>>
>>>  There are currently no feature bits defined for this device.
>> ^ This line is now out of date :)
>Ack
>>
>>> +\begin{description}
>>> +\item VIRTIO_VSOCK_F_SEQPACKET (0) SOCK_SEQPACKET socket type is
>>> +    supported.
>>> +\end{description}
>>>
>>>  \subsection{Device configuration layout}\label{sec:Device Types / Socket Device / Device configuration layout}
>>>
>>> @@ -98,6 +102,10 @@ \subsection{Device Operation}\label{sec:Device Types / Socket Device / Device Op
>>>  #define VIRTIO_VSOCK_OP_CREDIT_UPDATE  6
>>>  /* Request the peer to send the credit info to us */
>>>  #define VIRTIO_VSOCK_OP_CREDIT_REQUEST 7
>>> +/* Message begin for SOCK_SEQPACKET */
>>> +#define VIRTIO_VSOCK_OP_SEQ_BEGIN      8
>>> +/* Message end for SOCK_SEQPACKET */
>>> +#define VIRTIO_VSOCK_OP_SEQ_END        9
>> The struct virtio_vsock_hdr->flags field is le32 and currently unused.
>> Could 24 bits be used for a unique message id and 8 bits for flags? 1
>> flag bit could be used for end-of-message and the remaining 7 bits could
>> be reserved. That way SEQ_BEGIN and SEQ_END are not necessary.  
>> Pressure
>> on the virtqueue would be reduced and performance should be comparable
>> to SOCK_STREAM.
>
>Well, my first versions of SOCK_SEQPACKET implementation, worked
>something like this: i used flags field of header as length of whole
>message. I discussed it with Stefano Garzarella, and he told that it 
>will
>be better to use special "header" in packet's payload, to keep some
>SOCK_SEQPACKET specific data, instead of reusing packet's header
>fields.

IIRC in the first implementation SEQ_BEGIN was an empty message and we 
didn't added the msg_id yet. So since we needed to carry both id and 
total length, I suggested to use the payload to put these extra 
information.

IIUC what Stefan is suggesting is a bit different and I think it should 
be cool to implement: we can remove the boundary packets, use only 8 
bits for the flags, and add a new field to reuse the 24 unused bits, 
maybe also 16 bits would be enough.
At that point we will only use the EOR flag to know the last packet.

The main difference will be that the receiver will know the total size 
only when the last packet is received.

Do you see any issue on that approach?

Thanks,
Stefano

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [virtio-comment] Re: [MASSMAIL KLMS] Re: [virtio-comment] [RFC PATCH v4 2/2] virtio-vsock: SOCK_SEQPACKET description
       [not found]         ` <d6d92105-f7d4-74a3-4acc-fcfb40872b76@kaspersky.com>
@ 2021-03-30  7:32           ` Stefano Garzarella
  2021-03-30  8:55           ` Stefan Hajnoczi
  1 sibling, 0 replies; 7+ messages in thread
From: Stefano Garzarella @ 2021-03-30  7:32 UTC (permalink / raw)
  To: Arseny Krasnov
  Cc: Andra Paraschiv, Michael S. Tsirkin, cohuck, Colin Ian King,
	oxffffaa, Norbert Slusarek, Stefan Hajnoczi, virtio-comment,
	Jakub Kicinski, virtualization, David S. Miller, Jorgen Hansen,
	Alexander Popov

On Tue, Mar 30, 2021 at 09:15:39AM +0300, Arseny Krasnov wrote:
>
>On 30.03.2021 00:28, Stefano Garzarella wrote:
>> On Mon, Mar 29, 2021 at 08:33:27PM +0300, Arseny Krasnov wrote:
>>> On 29.03.2021 19:11, Stefan Hajnoczi wrote:
>>>> On Fri, Mar 26, 2021 at 12:02:50PM +0300, Arseny Krasnov wrote:
>>>>> This adds description of SOCK_SEQPACKET socket type
>>>>> support for virtio-vsock.
>>>>>
>>>>> Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
>>>>> ---
>>>>>  virtio-vsock.tex | 65 +++++++++++++++++++++++++++++++++++++++++++-----
>>>>>  1 file changed, 59 insertions(+), 6 deletions(-)
>>>>>
>>>>> diff --git a/virtio-vsock.tex b/virtio-vsock.tex
>>>>> index ad57f9d..c366de7 100644
>>>>> --- a/virtio-vsock.tex
>>>>> +++ b/virtio-vsock.tex
>>>>> @@ -17,6 +17,10 @@ \subsection{Virtqueues}\label{sec:Device Types / Socket Device / Virtqueues}
>>>>>  \subsection{Feature bits}\label{sec:Device Types / Socket Device / Feature bits}
>>>>>
>>>>>  There are currently no feature bits defined for this device.
>>>> ^ This line is now out of date :)
>>> Ack
>>>>> +\begin{description}
>>>>> +\item VIRTIO_VSOCK_F_SEQPACKET (0) SOCK_SEQPACKET socket type is
>>>>> +    supported.
>>>>> +\end{description}
>>>>>
>>>>>  \subsection{Device configuration layout}\label{sec:Device Types / Socket Device / Device configuration layout}
>>>>>
>>>>> @@ -98,6 +102,10 @@ \subsection{Device Operation}\label{sec:Device Types / Socket Device / Device Op
>>>>>  #define VIRTIO_VSOCK_OP_CREDIT_UPDATE  6
>>>>>  /* Request the peer to send the credit info to us */
>>>>>  #define VIRTIO_VSOCK_OP_CREDIT_REQUEST 7
>>>>> +/* Message begin for SOCK_SEQPACKET */
>>>>> +#define VIRTIO_VSOCK_OP_SEQ_BEGIN      8
>>>>> +/* Message end for SOCK_SEQPACKET */
>>>>> +#define VIRTIO_VSOCK_OP_SEQ_END        9
>>>> The struct virtio_vsock_hdr->flags field is le32 and currently unused.
>>>> Could 24 bits be used for a unique message id and 8 bits for flags? 1
>>>> flag bit could be used for end-of-message and the remaining 7 bits could
>>>> be reserved. That way SEQ_BEGIN and SEQ_END are not necessary.
>>>> Pressure
>>>> on the virtqueue would be reduced and performance should be comparable
>>>> to SOCK_STREAM.
>>> Well, my first versions of SOCK_SEQPACKET implementation, worked
>>> something like this: i used flags field of header as length of whole
>>> message. I discussed it with Stefano Garzarella, and he told that it
>>> will
>>> be better to use special "header" in packet's payload, to keep some
>>> SOCK_SEQPACKET specific data, instead of reusing packet's header
>>> fields.
>> IIRC in the first implementation SEQ_BEGIN was an empty message and we
>> didn't added the msg_id yet. So since we needed to carry both id and
>> total length, I suggested to use the payload to put these extra
>> information.
>>
>> IIUC what Stefan is suggesting is a bit different and I think it should
>> be cool to implement: we can remove the boundary packets, use only 8
>> bits for the flags, and add a new field to reuse the 24 unused bits,
>> maybe also 16 bits would be enough.
>> At that point we will only use the EOR flag to know the last packet.
>>
>> The main difference will be that the receiver will know the total size
>> only when the last packet is received.
>>
>> Do you see any issue on that approach?
>
>It will work, except we can't check that any packet of message,
>
>except last(with EOR bit) was dropped, since receiver don't know
>
>real length of message. If it is ok, then i can implement it.
>

This is true, but where can a packet be lost?

The channel is lossless, so it can be lost either by the transmitter or 
the receiver, but only in critical cases where for example the whole 
system has run out of memory, but in this case we can't do much, maybe 
only reset the connection.

Thanks,
Stefano

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [virtio-comment] Re: [MASSMAIL KLMS] Re: [virtio-comment] [RFC PATCH v4 2/2] virtio-vsock: SOCK_SEQPACKET description
       [not found]         ` <d6d92105-f7d4-74a3-4acc-fcfb40872b76@kaspersky.com>
  2021-03-30  7:32           ` [virtio-comment] Re: [MASSMAIL KLMS] " Stefano Garzarella
@ 2021-03-30  8:55           ` Stefan Hajnoczi
       [not found]             ` <2061f2ab-f3fc-c059-7cfc-a34b06f061fe@kaspersky.com>
  1 sibling, 1 reply; 7+ messages in thread
From: Stefan Hajnoczi @ 2021-03-30  8:55 UTC (permalink / raw)
  To: Arseny Krasnov
  Cc: Andra Paraschiv, Michael S. Tsirkin, cohuck, Colin Ian King,
	Norbert Slusarek, oxffffaa, virtio-comment, Jakub Kicinski,
	Alexander Popov, virtualization, David S. Miller, Jorgen Hansen


[-- Attachment #1.1: Type: text/plain, Size: 3437 bytes --]

On Tue, Mar 30, 2021 at 09:15:39AM +0300, Arseny Krasnov wrote:
> On 30.03.2021 00:28, Stefano Garzarella wrote:
> > On Mon, Mar 29, 2021 at 08:33:27PM +0300, Arseny Krasnov wrote:
> >> On 29.03.2021 19:11, Stefan Hajnoczi wrote:
> >>> On Fri, Mar 26, 2021 at 12:02:50PM +0300, Arseny Krasnov wrote:
> >>>> @@ -98,6 +102,10 @@ \subsection{Device Operation}\label{sec:Device Types / Socket Device / Device Op
> >>>>  #define VIRTIO_VSOCK_OP_CREDIT_UPDATE  6
> >>>>  /* Request the peer to send the credit info to us */
> >>>>  #define VIRTIO_VSOCK_OP_CREDIT_REQUEST 7
> >>>> +/* Message begin for SOCK_SEQPACKET */
> >>>> +#define VIRTIO_VSOCK_OP_SEQ_BEGIN      8
> >>>> +/* Message end for SOCK_SEQPACKET */
> >>>> +#define VIRTIO_VSOCK_OP_SEQ_END        9
> >>> The struct virtio_vsock_hdr->flags field is le32 and currently unused.
> >>> Could 24 bits be used for a unique message id and 8 bits for flags? 1
> >>> flag bit could be used for end-of-message and the remaining 7 bits could
> >>> be reserved. That way SEQ_BEGIN and SEQ_END are not necessary.  
> >>> Pressure
> >>> on the virtqueue would be reduced and performance should be comparable
> >>> to SOCK_STREAM.
> >> Well, my first versions of SOCK_SEQPACKET implementation, worked
> >> something like this: i used flags field of header as length of whole
> >> message. I discussed it with Stefano Garzarella, and he told that it 
> >> will
> >> be better to use special "header" in packet's payload, to keep some
> >> SOCK_SEQPACKET specific data, instead of reusing packet's header
> >> fields.
> > IIRC in the first implementation SEQ_BEGIN was an empty message and we 
> > didn't added the msg_id yet. So since we needed to carry both id and 
> > total length, I suggested to use the payload to put these extra 
> > information.
> >
> > IIUC what Stefan is suggesting is a bit different and I think it should 
> > be cool to implement: we can remove the boundary packets, use only 8 
> > bits for the flags, and add a new field to reuse the 24 unused bits, 
> > maybe also 16 bits would be enough.
> > At that point we will only use the EOR flag to know the last packet.
> >
> > The main difference will be that the receiver will know the total size 
> > only when the last packet is received.
> >
> > Do you see any issue on that approach?
> 
> It will work, except we can't check that any packet of message,
> 
> except last(with EOR bit) was dropped, since receiver don't know
> 
> real length of message. If it is ok, then i can implement it.

The credit mechanism ensures that packets are not dropped, so it's not
necessary to check for this condition.

By the way, is a unique message ID needed? My understanding is:

If two messages are being sent on a socket at the same time either their
order is serialized (whichever message began first) or it is undefined
(whichever message completes first). I wonder if POSIX specifies this
and if Linux implements it (e.g. with AF_UNIX SOCK_SEQPACKET messages
that are multiple pages long and exceed sndbuf)?

Depending on these semantics maybe we don't need a unique message ID.
Instead the driver transmits messages sequentially. RW packets for
different messages on the same socket will never be interleaved.
Therefore the unique message ID is not needed and just the MSG_EOR flag
is enough to indicate message boundaries.

What do you think?

Stefan

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

[-- Attachment #2: Type: text/plain, Size: 183 bytes --]

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [virtio-comment] Re: [MASSMAIL KLMS] Re: [virtio-comment] [RFC PATCH v4 2/2] virtio-vsock: SOCK_SEQPACKET description
       [not found]             ` <2061f2ab-f3fc-c059-7cfc-a34b06f061fe@kaspersky.com>
@ 2021-03-30 13:57               ` Stefan Hajnoczi
       [not found]                 ` <64023aef-2e6b-b4bf-6569-ea71f7ee53de@kaspersky.com>
  0 siblings, 1 reply; 7+ messages in thread
From: Stefan Hajnoczi @ 2021-03-30 13:57 UTC (permalink / raw)
  To: Arseny Krasnov
  Cc: Andra Paraschiv, Michael S. Tsirkin, cohuck, Colin Ian King,
	Norbert Slusarek, oxffffaa, virtio-comment, Jakub Kicinski,
	Alexander Popov, virtualization, David S. Miller, Jorgen Hansen


[-- Attachment #1.1: Type: text/plain, Size: 4238 bytes --]

On Tue, Mar 30, 2021 at 12:50:06PM +0300, Arseny Krasnov wrote:
> 
> On 30.03.2021 11:55, Stefan Hajnoczi wrote:
> > On Tue, Mar 30, 2021 at 09:15:39AM +0300, Arseny Krasnov wrote:
> >> On 30.03.2021 00:28, Stefano Garzarella wrote:
> >>> On Mon, Mar 29, 2021 at 08:33:27PM +0300, Arseny Krasnov wrote:
> >>>> On 29.03.2021 19:11, Stefan Hajnoczi wrote:
> >>>>> On Fri, Mar 26, 2021 at 12:02:50PM +0300, Arseny Krasnov wrote:
> >>>>>> @@ -98,6 +102,10 @@ \subsection{Device Operation}\label{sec:Device Types / Socket Device / Device Op
> >>>>>>  #define VIRTIO_VSOCK_OP_CREDIT_UPDATE  6
> >>>>>>  /* Request the peer to send the credit info to us */
> >>>>>>  #define VIRTIO_VSOCK_OP_CREDIT_REQUEST 7
> >>>>>> +/* Message begin for SOCK_SEQPACKET */
> >>>>>> +#define VIRTIO_VSOCK_OP_SEQ_BEGIN      8
> >>>>>> +/* Message end for SOCK_SEQPACKET */
> >>>>>> +#define VIRTIO_VSOCK_OP_SEQ_END        9
> >>>>> The struct virtio_vsock_hdr->flags field is le32 and currently unused.
> >>>>> Could 24 bits be used for a unique message id and 8 bits for flags? 1
> >>>>> flag bit could be used for end-of-message and the remaining 7 bits could
> >>>>> be reserved. That way SEQ_BEGIN and SEQ_END are not necessary.  
> >>>>> Pressure
> >>>>> on the virtqueue would be reduced and performance should be comparable
> >>>>> to SOCK_STREAM.
> >>>> Well, my first versions of SOCK_SEQPACKET implementation, worked
> >>>> something like this: i used flags field of header as length of whole
> >>>> message. I discussed it with Stefano Garzarella, and he told that it 
> >>>> will
> >>>> be better to use special "header" in packet's payload, to keep some
> >>>> SOCK_SEQPACKET specific data, instead of reusing packet's header
> >>>> fields.
> >>> IIRC in the first implementation SEQ_BEGIN was an empty message and we 
> >>> didn't added the msg_id yet. So since we needed to carry both id and 
> >>> total length, I suggested to use the payload to put these extra 
> >>> information.
> >>>
> >>> IIUC what Stefan is suggesting is a bit different and I think it should 
> >>> be cool to implement: we can remove the boundary packets, use only 8 
> >>> bits for the flags, and add a new field to reuse the 24 unused bits, 
> >>> maybe also 16 bits would be enough.
> >>> At that point we will only use the EOR flag to know the last packet.
> >>>
> >>> The main difference will be that the receiver will know the total size 
> >>> only when the last packet is received.
> >>>
> >>> Do you see any issue on that approach?
> >> It will work, except we can't check that any packet of message,
> >>
> >> except last(with EOR bit) was dropped, since receiver don't know
> >>
> >> real length of message. If it is ok, then i can implement it.
> > The credit mechanism ensures that packets are not dropped, so it's not
> > necessary to check for this condition.
> >
> > By the way, is a unique message ID needed? My understanding is:
> >
> > If two messages are being sent on a socket at the same time either their
> > order is serialized (whichever message began first) or it is undefined
> > (whichever message completes first).
> 
> If we are talking about case, when two threads writes to one socket at the same time,
> 
> in Linux it is possible that two message will interleave(for vsock). But as i know, for example
> 
> when TCP socket is used, both 'write()' calls will be serialized. May be it is bug of vsock: when
> 
> first writer goes out of space, it will sleep. Then second writer tries to send something, but
> 
> as free space is over, it will sleep too. Then, credit update is received from peer. Both sender's
> 
> will be woken up, but sender which grab socket lock first will continue to send it's message.
> 
> So may be we can add something like semaphore to new/vmw_vsock/af_vsock.c which will
> 
> serialize two 'write()' calls: second sender enters 'write()' path, only when first left this path.
> 
> My implementation doesn't care about that, because i wanted to add semaphore later, by another
> 
> patch.

Yes, that is definitely an issue that the driver needs to take care of
if we don't have unique message IDs. Thanks for explaining!

Stefan

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

[-- Attachment #2: Type: text/plain, Size: 183 bytes --]

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [virtio-comment] Re: [MASSMAIL KLMS] Re: [virtio-comment] [RFC PATCH v4 2/2] virtio-vsock: SOCK_SEQPACKET description
       [not found]                 ` <64023aef-2e6b-b4bf-6569-ea71f7ee53de@kaspersky.com>
@ 2021-03-31 14:48                   ` Stefan Hajnoczi
       [not found]                     ` <486fe58d-4472-d5e6-11d9-062408c59c1e@kaspersky.com>
  0 siblings, 1 reply; 7+ messages in thread
From: Stefan Hajnoczi @ 2021-03-31 14:48 UTC (permalink / raw)
  To: Arseny Krasnov
  Cc: Andra Paraschiv, Michael S. Tsirkin, cohuck, Colin Ian King,
	Norbert Slusarek, oxffffaa, virtio-comment, Jakub Kicinski,
	Alexander Popov, virtualization, David S. Miller, Jorgen Hansen


[-- Attachment #1.1: Type: text/plain, Size: 4612 bytes --]

On Tue, Mar 30, 2021 at 05:24:19PM +0300, Arseny Krasnov wrote:
> 
> On 30.03.2021 16:57, Stefan Hajnoczi wrote:
> > On Tue, Mar 30, 2021 at 12:50:06PM +0300, Arseny Krasnov wrote:
> >> On 30.03.2021 11:55, Stefan Hajnoczi wrote:
> >>> On Tue, Mar 30, 2021 at 09:15:39AM +0300, Arseny Krasnov wrote:
> >>>> On 30.03.2021 00:28, Stefano Garzarella wrote:
> >>>>> On Mon, Mar 29, 2021 at 08:33:27PM +0300, Arseny Krasnov wrote:
> >>>>>> On 29.03.2021 19:11, Stefan Hajnoczi wrote:
> >>>>>>> On Fri, Mar 26, 2021 at 12:02:50PM +0300, Arseny Krasnov wrote:
> >>>>>>>> @@ -98,6 +102,10 @@ \subsection{Device Operation}\label{sec:Device Types / Socket Device / Device Op
> >>>>>>>>  #define VIRTIO_VSOCK_OP_CREDIT_UPDATE  6
> >>>>>>>>  /* Request the peer to send the credit info to us */
> >>>>>>>>  #define VIRTIO_VSOCK_OP_CREDIT_REQUEST 7
> >>>>>>>> +/* Message begin for SOCK_SEQPACKET */
> >>>>>>>> +#define VIRTIO_VSOCK_OP_SEQ_BEGIN      8
> >>>>>>>> +/* Message end for SOCK_SEQPACKET */
> >>>>>>>> +#define VIRTIO_VSOCK_OP_SEQ_END        9
> >>>>>>> The struct virtio_vsock_hdr->flags field is le32 and currently unused.
> >>>>>>> Could 24 bits be used for a unique message id and 8 bits for flags? 1
> >>>>>>> flag bit could be used for end-of-message and the remaining 7 bits could
> >>>>>>> be reserved. That way SEQ_BEGIN and SEQ_END are not necessary.  
> >>>>>>> Pressure
> >>>>>>> on the virtqueue would be reduced and performance should be comparable
> >>>>>>> to SOCK_STREAM.
> >>>>>> Well, my first versions of SOCK_SEQPACKET implementation, worked
> >>>>>> something like this: i used flags field of header as length of whole
> >>>>>> message. I discussed it with Stefano Garzarella, and he told that it 
> >>>>>> will
> >>>>>> be better to use special "header" in packet's payload, to keep some
> >>>>>> SOCK_SEQPACKET specific data, instead of reusing packet's header
> >>>>>> fields.
> >>>>> IIRC in the first implementation SEQ_BEGIN was an empty message and we 
> >>>>> didn't added the msg_id yet. So since we needed to carry both id and 
> >>>>> total length, I suggested to use the payload to put these extra 
> >>>>> information.
> >>>>>
> >>>>> IIUC what Stefan is suggesting is a bit different and I think it should 
> >>>>> be cool to implement: we can remove the boundary packets, use only 8 
> >>>>> bits for the flags, and add a new field to reuse the 24 unused bits, 
> >>>>> maybe also 16 bits would be enough.
> >>>>> At that point we will only use the EOR flag to know the last packet.
> >>>>>
> >>>>> The main difference will be that the receiver will know the total size 
> >>>>> only when the last packet is received.
> >>>>>
> >>>>> Do you see any issue on that approach?
> >>>> It will work, except we can't check that any packet of message,
> >>>>
> >>>> except last(with EOR bit) was dropped, since receiver don't know
> >>>>
> >>>> real length of message. If it is ok, then i can implement it.
> >>> The credit mechanism ensures that packets are not dropped, so it's not
> >>> necessary to check for this condition.
> >>>
> >>> By the way, is a unique message ID needed? My understanding is:
> >>>
> >>> If two messages are being sent on a socket at the same time either their
> >>> order is serialized (whichever message began first) or it is undefined
> >>> (whichever message completes first).
> >> If we are talking about case, when two threads writes to one socket at the same time,
> >>
> >> in Linux it is possible that two message will interleave(for vsock). But as i know, for example
> >>
> >> when TCP socket is used, both 'write()' calls will be serialized. May be it is bug of vsock: when
> >>
> >> first writer goes out of space, it will sleep. Then second writer tries to send something, but
> >>
> >> as free space is over, it will sleep too. Then, credit update is received from peer. Both sender's
> >>
> >> will be woken up, but sender which grab socket lock first will continue to send it's message.
> >>
> >> So may be we can add something like semaphore to new/vmw_vsock/af_vsock.c which will
> >>
> >> serialize two 'write()' calls: second sender enters 'write()' path, only when first left this path.
> >>
> >> My implementation doesn't care about that, because i wanted to add semaphore later, by another
> >>
> >> patch.
> > Yes, that is definitely an issue that the driver needs to take care of
> > if we don't have unique message IDs. Thanks for explaining!
> 
> So may I  include patch with serializer to next version of my patchset?

Sounds good!

Stefan

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

[-- Attachment #2: Type: text/plain, Size: 183 bytes --]

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [virtio-comment] Re: [MASSMAIL KLMS] Re: [virtio-comment] Re: [MASSMAIL KLMS] Re: [virtio-comment] [RFC PATCH v4 2/2] virtio-vsock: SOCK_SEQPACKET description
       [not found]                     ` <486fe58d-4472-d5e6-11d9-062408c59c1e@kaspersky.com>
@ 2021-04-09 14:27                       ` Stefano Garzarella
  0 siblings, 0 replies; 7+ messages in thread
From: Stefano Garzarella @ 2021-04-09 14:27 UTC (permalink / raw)
  To: Arseny Krasnov
  Cc: Andra Paraschiv, Michael S. Tsirkin, cohuck, Colin Ian King,
	oxffffaa, Norbert Slusarek, Stefan Hajnoczi, virtio-comment,
	Jakub Kicinski, virtualization, David S. Miller, Jorgen Hansen,
	Alexander Popov

On Sat, Apr 03, 2021 at 12:45:54PM +0300, Arseny Krasnov wrote:
>
>On 31.03.2021 17:48, Stefan Hajnoczi wrote:
>> On Tue, Mar 30, 2021 at 05:24:19PM +0300, Arseny Krasnov wrote:
>>> On 30.03.2021 16:57, Stefan Hajnoczi wrote:
>>>> On Tue, Mar 30, 2021 at 12:50:06PM +0300, Arseny Krasnov wrote:
>>>>> On 30.03.2021 11:55, Stefan Hajnoczi wrote:
>>>>>> On Tue, Mar 30, 2021 at 09:15:39AM +0300, Arseny Krasnov wrote:
>>>>>>> On 30.03.2021 00:28, Stefano Garzarella wrote:
>>>>>>>> On Mon, Mar 29, 2021 at 08:33:27PM +0300, Arseny Krasnov wrote:
>>>>>>>>> On 29.03.2021 19:11, Stefan Hajnoczi wrote:
>>>>>>>>>> On Fri, Mar 26, 2021 at 12:02:50PM +0300, Arseny Krasnov wrote:
>>>>>>>>>>> @@ -98,6 +102,10 @@ \subsection{Device Operation}\label{sec:Device Types / Socket Device / Device Op
>>>>>>>>>>>  #define VIRTIO_VSOCK_OP_CREDIT_UPDATE  6
>>>>>>>>>>>  /* Request the peer to send the credit info to us */
>>>>>>>>>>>  #define VIRTIO_VSOCK_OP_CREDIT_REQUEST 7
>>>>>>>>>>> +/* Message begin for SOCK_SEQPACKET */
>>>>>>>>>>> +#define VIRTIO_VSOCK_OP_SEQ_BEGIN      8
>>>>>>>>>>> +/* Message end for SOCK_SEQPACKET */
>>>>>>>>>>> +#define VIRTIO_VSOCK_OP_SEQ_END        9
>>>>>>>>>> The struct virtio_vsock_hdr->flags field is le32 and currently unused.
>>>>>>>>>> Could 24 bits be used for a unique message id and 8 bits for flags? 1
>>>>>>>>>> flag bit could be used for end-of-message and the remaining 7 bits could
>>>>>>>>>> be reserved. That way SEQ_BEGIN and SEQ_END are not necessary.
>>>>>>>>>> Pressure
>>>>>>>>>> on the virtqueue would be reduced and performance should be comparable
>>>>>>>>>> to SOCK_STREAM.
>>>>>>>>> Well, my first versions of SOCK_SEQPACKET implementation, worked
>>>>>>>>> something like this: i used flags field of header as length of whole
>>>>>>>>> message. I discussed it with Stefano Garzarella, and he told that it
>>>>>>>>> will
>>>>>>>>> be better to use special "header" in packet's payload, to keep some
>>>>>>>>> SOCK_SEQPACKET specific data, instead of reusing packet's header
>>>>>>>>> fields.
>>>>>>>> IIRC in the first implementation SEQ_BEGIN was an empty message and we
>>>>>>>> didn't added the msg_id yet. So since we needed to carry both id and
>>>>>>>> total length, I suggested to use the payload to put these extra
>>>>>>>> information.
>>>>>>>>
>>>>>>>> IIUC what Stefan is suggesting is a bit different and I think it should
>>>>>>>> be cool to implement: we can remove the boundary packets, use only 8
>>>>>>>> bits for the flags, and add a new field to reuse the 24 unused bits,
>>>>>>>> maybe also 16 bits would be enough.
>>>>>>>> At that point we will only use the EOR flag to know the last packet.
>>>>>>>>
>>>>>>>> The main difference will be that the receiver will know the total size
>>>>>>>> only when the last packet is received.
>>>>>>>>
>>>>>>>> Do you see any issue on that approach?
>>>>>>> It will work, except we can't check that any packet of message,
>>>>>>>
>>>>>>> except last(with EOR bit) was dropped, since receiver don't know
>>>>>>>
>>>>>>> real length of message. If it is ok, then i can implement it.
>>>>>> The credit mechanism ensures that packets are not dropped, so it's not
>>>>>> necessary to check for this condition.
>>>>>>
>>>>>> By the way, is a unique message ID needed? My understanding is:
>>>>>>
>>>>>> If two messages are being sent on a socket at the same time either their
>>>>>> order is serialized (whichever message began first) or it is undefined
>>>>>> (whichever message completes first).
>>>>> If we are talking about case, when two threads writes to one socket at the same time,
>>>>>
>>>>> in Linux it is possible that two message will interleave(for vsock). But as i know, for example
>>>>>
>>>>> when TCP socket is used, both 'write()' calls will be serialized. May be it is bug of vsock: when
>>>>>
>>>>> first writer goes out of space, it will sleep. Then second writer tries to send something, but
>>>>>
>>>>> as free space is over, it will sleep too. Then, credit update is received from peer. Both sender's
>>>>>
>>>>> will be woken up, but sender which grab socket lock first will continue to send it's message.
>>>>>
>>>>> So may be we can add something like semaphore to new/vmw_vsock/af_vsock.c which will
>>>>>
>>>>> serialize two 'write()' calls: second sender enters 'write()' path, only when first left this path.
>>>>>
>>>>> My implementation doesn't care about that, because i wanted to add semaphore later, by another
>>>>>
>>>>> patch.
>>>> Yes, that is definitely an issue that the driver needs to take care of
>>>> if we don't have unique message IDs. Thanks for explaining!
>>> So may I  include patch with serializer to next version of my patchset?
>> Sounds good!
>
>There is small problem with approach, when we remove SEQ_BEGIN/SEQ_END 
>
>bounds of messages in flags field(EOR bit ) of packet header: consider case, when vhost sends data
>
>to guest(let it be 64kb). In vhost vsock driver, big buffer of packet(for example 64kb) will
>
>be splitted to small buffers, to fit guests's virtio rx descriptors(in current implementation of
>
>Linux it is 4kb). All fields of new headers in rx queue will be copied from fields of header
>
>of big packet(except len field).  Thus we get 16 headers with  EOR bit set.
>set.
>
>
>May be it is possible to:
>
>1) Handle such bit in vhost driver(set EOR bit in flags only for last small buffer of guests rx queue)
>
>OR
>
>2) I can remove SEQ_BEGIN, msg_len and msg_id. But keep SEQ_END op. This will be special
>
>packet which marks end of message without any payload. In this case, such packet will be
>
>processed by vhost "as is".
>
>
>What do You think?
>

IMHO option 1 is the best and should not be too complicated to 
implement.

Do you see a specific issue?

Thanks,
Stefano

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2021-04-09 14:27 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20210326090154.1144100-1-arseny.krasnov@kaspersky.com>
     [not found] ` <20210326090254.1144486-1-arseny.krasnov@kaspersky.com>
2021-03-29 16:11   ` [virtio-comment] [RFC PATCH v4 2/2] virtio-vsock: SOCK_SEQPACKET description Stefan Hajnoczi
     [not found]     ` <230d95fd-29e8-465b-0ab2-b406d614c11b@kaspersky.com>
2021-03-29 21:28       ` Stefano Garzarella
     [not found]         ` <d6d92105-f7d4-74a3-4acc-fcfb40872b76@kaspersky.com>
2021-03-30  7:32           ` [virtio-comment] Re: [MASSMAIL KLMS] " Stefano Garzarella
2021-03-30  8:55           ` Stefan Hajnoczi
     [not found]             ` <2061f2ab-f3fc-c059-7cfc-a34b06f061fe@kaspersky.com>
2021-03-30 13:57               ` Stefan Hajnoczi
     [not found]                 ` <64023aef-2e6b-b4bf-6569-ea71f7ee53de@kaspersky.com>
2021-03-31 14:48                   ` Stefan Hajnoczi
     [not found]                     ` <486fe58d-4472-d5e6-11d9-062408c59c1e@kaspersky.com>
2021-04-09 14:27                       ` [virtio-comment] Re: [MASSMAIL KLMS] " Stefano Garzarella

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).