* [RFC PATCH v4 00/17] virtio/vsock: introduce SOCK_SEQPACKET support
@ 2021-02-07 15:12 Arseny Krasnov
  2021-02-07 15:14 ` [RFC PATCH v4 01/17] af_vsock: update functions for connectible socket Arseny Krasnov
                   ` (17 more replies)
  0 siblings, 18 replies; 61+ messages in thread
From: Arseny Krasnov @ 2021-02-07 15:12 UTC (permalink / raw)
  To: Stefan Hajnoczi, Stefano Garzarella, Michael S. Tsirkin,
	Jason Wang, David S. Miller, Jakub Kicinski, Arseny Krasnov,
	Jorgen Hansen, Andra Paraschiv, Colin Ian King, Alexander Popov
  Cc: kvm, virtualization, netdev, linux-kernel, stsp2, oxffffaa

	This patchset implements support for SOCK_SEQPACKET in the virtio
transport.
	SOCK_SEQPACKET guarantees that record boundaries are preserved. To
do this, two new packet operations were added: one marks the start of a
record and the other marks its end (SEQ_BEGIN and SEQ_END below). Both
operations also carry metadata used to maintain boundaries and payload
integrity. The metadata is carried in a special header with two
fields, message count and message length:

	struct virtio_vsock_seq_hdr {
		__le32  msg_cnt;
		__le32  msg_len;
	} __attribute__((packed));

	This header is transmitted as the payload of SEQ_BEGIN and SEQ_END
packets (the buffer of the second virtio descriptor in the chain), in the
same way as data is transmitted in RW packets. The payload was chosen to
carry this header to avoid touching the first virtio buffer, which
carries the packet header, because someone could check that the size of
that buffer is equal to the size of the packet header. To send a record,
a packet with the start marker is sent first (its header contains the
length of the record and the counter), then the counter is incremented
and all data is sent as usual 'RW' packets. Finally SEQ_END is sent (it
also carries the message counter, which is the counter of SEQ_BEGIN + 1),
and after sending SEQ_END the counter is incremented again. On the
receiver's side, the length of the record is known from the packet with
the start-of-record marker. To check that no packets were dropped by the
transport, the counters of two sequential SEQ_BEGIN and SEQ_END packets
are checked (the counter of SEQ_END must be bigger than the counter of
SEQ_BEGIN by 1) and the length of data between the two markers is
compared to the length in the SEQ_BEGIN header.
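
	The counter handling and the receiver-side check described above can
be sketched in plain userspace C. This is only an illustration, not code
from the patches: 'struct seq_hdr' is a host-endian stand-in for
'virtio_vsock_seq_hdr', and 'struct seq_sender', 'seq_make_begin()',
'seq_make_end()' and 'seq_record_intact()' are hypothetical names. The
description does not say what 'msg_len' holds in SEQ_END, so the sketch
sets it to 0 as an assumption.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Host-endian stand-in for struct virtio_vsock_seq_hdr (illustration only). */
struct seq_hdr {
	uint32_t msg_cnt;
	uint32_t msg_len;
};

/* Hypothetical per-socket sender state: the running marker counter. */
struct seq_sender {
	uint32_t cnt;
};

/* Build the SEQ_BEGIN header for a record of 'len' bytes, then bump the counter. */
static struct seq_hdr seq_make_begin(struct seq_sender *s, uint32_t len)
{
	struct seq_hdr h = { .msg_cnt = s->cnt, .msg_len = len };

	s->cnt++;
	return h;
}

/* Build the SEQ_END header, then bump the counter again.
 * Assumption: msg_len is unused in SEQ_END and left as 0 here.
 */
static struct seq_hdr seq_make_end(struct seq_sender *s)
{
	struct seq_hdr h = { .msg_cnt = s->cnt, .msg_len = 0 };

	s->cnt++;
	return h;
}

/* Receiver-side check: SEQ_END counter must be SEQ_BEGIN counter + 1
 * (with 32-bit wraparound), and the number of RW bytes seen between the
 * two markers must match the length announced in SEQ_BEGIN.
 */
static bool seq_record_intact(const struct seq_hdr *begin,
			      const struct seq_hdr *end,
			      size_t rw_bytes)
{
	if (end->msg_cnt != (uint32_t)(begin->msg_cnt + 1))
		return false;

	return rw_bytes == begin->msg_len;
}
```

	With this model, a record whose RW payload is shorter than the
announced length, or whose SEQ_END counter does not follow its SEQ_BEGIN
counter, is rejected.
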
	As packets of one socket are not reordered on either the vsock or
the vhost transport layer, such markers allow the original record to be
restored on the receiver's side. If the user's buffer is smaller than the
record length, then all data that does not fit is dropped.
	As with stream sockets, the maximum length of a datagram is not
limited, because the same credit logic is used. The difference from a
stream socket is that the user is not woken up until the whole record is
received or an error occurs. The implementation also supports the
'MSG_EOR' and 'MSG_TRUNC' flags.
	Tests are also implemented.

 Arseny Krasnov (17):
  af_vsock: update functions for connectible socket
  af_vsock: separate wait data loop
  af_vsock: separate receive data loop
  af_vsock: implement SEQPACKET receive loop
  af_vsock: separate wait space loop
  af_vsock: implement send logic for SEQPACKET
  af_vsock: rest of SEQPACKET support
  af_vsock: update comments for stream sockets
  virtio/vsock: dequeue callback for SOCK_SEQPACKET
  virtio/vsock: fetch length for SEQPACKET record
  virtio/vsock: add SEQPACKET receive logic
  virtio/vsock: rest of SOCK_SEQPACKET support
  virtio/vsock: setup SEQPACKET ops for transport
  vhost/vsock: setup SEQPACKET ops for transport
  vsock_test: add SOCK_SEQPACKET tests
  loopback/vsock: setup SEQPACKET ops for transport
  virtio/vsock: simplify credit update function API

 drivers/vhost/vsock.c                   |   8 +-
 include/linux/virtio_vsock.h            |  15 +
 include/net/af_vsock.h                  |   9 +
 include/uapi/linux/virtio_vsock.h       |  16 +
 net/vmw_vsock/af_vsock.c                | 588 +++++++++++++++-------
 net/vmw_vsock/virtio_transport.c        |   5 +
 net/vmw_vsock/virtio_transport_common.c | 316 ++++++++++--
 net/vmw_vsock/vsock_loopback.c          |   5 +
 tools/testing/vsock/util.c              |  32 +-
 tools/testing/vsock/util.h              |   3 +
 tools/testing/vsock/vsock_test.c        | 126 +++++
 11 files changed, 895 insertions(+), 228 deletions(-)

 TODO:
 - What to do when the server doesn't support SOCK_SEQPACKET. In the
   current implementation RST is replied, in the same way as when a
   listening port is not found. I think that the current RST is enough,
   because the case when the server doesn't support SOCK_SEQPACKET is the
   same as when the listener is missing (e.g. no listener in both cases).

 v3 -> v4:
 - callbacks for loopback transport
 - SEQPACKET specific metadata moved from packet header to payload
   and called 'virtio_vsock_seq_hdr'
 - record integrity check:
   1) SEQ_END operation was added, which marks end of record.
   2) Both SEQ_BEGIN and SEQ_END carry a counter which is incremented
      on every marker send.
 - af_vsock.c: socket operations for STREAM and SEQPACKET call the same
   functions instead of having their own "gates" that differ only by
   name: 'vsock_seqpacket/stream_getsockopt()' is now replaced with
   'vsock_connectible_getsockopt()'.
 - af_vsock.c: 'seqpacket_dequeue' callback returns an error and a flag
   that the record is ready. There is no need to return the number of
   copied bytes, because the case when a record is received successfully
   is checked at the virtio transport layer, when SEQ_END is processed.
   Also, the user doesn't need the number of copied bytes, because
   'recv()' from SEQPACKET can return an error, the length of the user's
   buffer or the length of the whole record (both are known in
   af_vsock.c).
 - af_vsock.c: both wait loops in af_vsock.c (for data and space) moved
   to separate functions because both are now called from several places.
 - af_vsock.c: 'vsock_assign_transport()' checks that 'new_transport'
   pointer is not NULL and returns 'ESOCKTNOSUPPORT' instead of 'ENODEV'
   if it failed to use the transport.
 - tools/testing/vsock/vsock_test.c: rename tests

 v2 -> v3:
 - patches reorganized: split for prepare and implementation patches
 - local variables are declared in "Reverse Christmas tree" manner
 - virtio_transport_common.c: valid leXX_to_cpu() for vsock header
   fields access
 - af_vsock.c: 'vsock_connectible_*sockopt()' added as shared code
   between stream and seqpacket sockets.
 - af_vsock.c: loops in '__vsock_*_recvmsg()' refactored.
 - af_vsock.c: 'vsock_wait_data()' refactored.

 v1 -> v2:
 - patches reordered: af_vsock.c related changes now before virtio vsock
 - patches reorganized: more small patches, where +/- are not mixed
 - tests for SOCK_SEQPACKET added
 - all commit messages updated
 - af_vsock.c: 'vsock_pre_recv_check()' inlined to
   'vsock_connectible_recvmsg()'
 - af_vsock.c: 'vsock_assign_transport()' returns ENODEV if transport
   was not found
 - virtio_transport_common.c: transport callback for seqpacket dequeue
 - virtio_transport_common.c: simplified
   'virtio_transport_recv_connected()'
 - virtio_transport_common.c: send reset on socket and packet type
			      mismatch.

-- 
2.25.1



* [RFC PATCH v4 01/17] af_vsock: update functions for connectible socket
  2021-02-07 15:12 [RFC PATCH v4 00/17] virtio/vsock: introduce SOCK_SEQPACKET support Arseny Krasnov
@ 2021-02-07 15:14 ` Arseny Krasnov
  2021-02-11 10:52     ` Stefano Garzarella
  2021-02-07 15:14 ` [RFC PATCH v4 02/17] af_vsock: separate wait data loop Arseny Krasnov
                   ` (16 subsequent siblings)
  17 siblings, 1 reply; 61+ messages in thread
From: Arseny Krasnov @ 2021-02-07 15:14 UTC (permalink / raw)
  To: Stefan Hajnoczi, Stefano Garzarella, Michael S. Tsirkin,
	Jason Wang, David S. Miller, Jakub Kicinski, Arseny Krasnov,
	Jorgen Hansen, Colin Ian King, Andra Paraschiv,
	Jeff Vander Stoep
  Cc: kvm, virtualization, netdev, linux-kernel, stsp2, oxffffaa

This prepares af_vsock.c for SEQPACKET support: some functions, such
as setsockopt(), getsockopt(), connect(), recvmsg() and sendmsg(), are
shared between both types of sockets, so rename them in a generic
manner.
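
In this patch the new 'sock_type_connectible()' helper still accepts only
SOCK_STREAM. A minimal sketch of how such a helper could look once
SEQPACKET is wired in is given below. It is an assumption about the later
patches in the series, written as a standalone userspace function with the
hypothetical name 'sock_type_connectible_sketch()'.

```c
#include <stdbool.h>
#include <sys/socket.h>	/* SOCK_STREAM, SOCK_SEQPACKET, SOCK_DGRAM */

/* Assumed final form of the helper: both connection-oriented socket
 * types take the "connectible" code paths, datagram sockets do not.
 */
static bool sock_type_connectible_sketch(unsigned short type)
{
	return type == SOCK_STREAM || type == SOCK_SEQPACKET;
}
```

Keeping the check in one helper means the call sites touched by this patch
would not need to change again when the second socket type is added.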

Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
---
 net/vmw_vsock/af_vsock.c | 64 +++++++++++++++++++++-------------------
 1 file changed, 34 insertions(+), 30 deletions(-)

diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index 6894f21dc147..f4fabec50650 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -604,8 +604,8 @@ static void vsock_pending_work(struct work_struct *work)
 
 /**** SOCKET OPERATIONS ****/
 
-static int __vsock_bind_stream(struct vsock_sock *vsk,
-			       struct sockaddr_vm *addr)
+static int __vsock_bind_connectible(struct vsock_sock *vsk,
+				    struct sockaddr_vm *addr)
 {
 	static u32 port;
 	struct sockaddr_vm new_addr;
@@ -685,7 +685,7 @@ static int __vsock_bind(struct sock *sk, struct sockaddr_vm *addr)
 	switch (sk->sk_socket->type) {
 	case SOCK_STREAM:
 		spin_lock_bh(&vsock_table_lock);
-		retval = __vsock_bind_stream(vsk, addr);
+		retval = __vsock_bind_connectible(vsk, addr);
 		spin_unlock_bh(&vsock_table_lock);
 		break;
 
@@ -767,6 +767,11 @@ static struct sock *__vsock_create(struct net *net,
 	return sk;
 }
 
+static bool sock_type_connectible(u16 type)
+{
+	return type == SOCK_STREAM;
+}
+
 static void __vsock_release(struct sock *sk, int level)
 {
 	if (sk) {
@@ -785,7 +790,7 @@ static void __vsock_release(struct sock *sk, int level)
 
 		if (vsk->transport)
 			vsk->transport->release(vsk);
-		else if (sk->sk_type == SOCK_STREAM)
+		else if (sock_type_connectible(sk->sk_type))
 			vsock_remove_sock(vsk);
 
 		sock_orphan(sk);
@@ -945,7 +950,7 @@ static int vsock_shutdown(struct socket *sock, int mode)
 	sk = sock->sk;
 	if (sock->state == SS_UNCONNECTED) {
 		err = -ENOTCONN;
-		if (sk->sk_type == SOCK_STREAM)
+		if (sock_type_connectible(sk->sk_type))
 			return err;
 	} else {
 		sock->state = SS_DISCONNECTING;
@@ -960,7 +965,7 @@ static int vsock_shutdown(struct socket *sock, int mode)
 		sk->sk_state_change(sk);
 		release_sock(sk);
 
-		if (sk->sk_type == SOCK_STREAM) {
+		if (sock_type_connectible(sk->sk_type)) {
 			sock_reset_flag(sk, SOCK_DONE);
 			vsock_send_shutdown(sk, mode);
 		}
@@ -1013,7 +1018,7 @@ static __poll_t vsock_poll(struct file *file, struct socket *sock,
 		if (!(sk->sk_shutdown & SEND_SHUTDOWN))
 			mask |= EPOLLOUT | EPOLLWRNORM | EPOLLWRBAND;
 
-	} else if (sock->type == SOCK_STREAM) {
+	} else if (sock_type_connectible(sk->sk_type)) {
 		const struct vsock_transport *transport;
 
 		lock_sock(sk);
@@ -1263,8 +1268,8 @@ static void vsock_connect_timeout(struct work_struct *work)
 	sock_put(sk);
 }
 
-static int vsock_stream_connect(struct socket *sock, struct sockaddr *addr,
-				int addr_len, int flags)
+static int vsock_connect(struct socket *sock, struct sockaddr *addr,
+			 int addr_len, int flags)
 {
 	int err;
 	struct sock *sk;
@@ -1414,7 +1419,7 @@ static int vsock_accept(struct socket *sock, struct socket *newsock, int flags,
 
 	lock_sock(listener);
 
-	if (sock->type != SOCK_STREAM) {
+	if (!sock_type_connectible(sock->type)) {
 		err = -EOPNOTSUPP;
 		goto out;
 	}
@@ -1491,7 +1496,7 @@ static int vsock_listen(struct socket *sock, int backlog)
 
 	lock_sock(sk);
 
-	if (sock->type != SOCK_STREAM) {
+	if (!sock_type_connectible(sk->sk_type)) {
 		err = -EOPNOTSUPP;
 		goto out;
 	}
@@ -1535,11 +1540,11 @@ static void vsock_update_buffer_size(struct vsock_sock *vsk,
 	vsk->buffer_size = val;
 }
 
-static int vsock_stream_setsockopt(struct socket *sock,
-				   int level,
-				   int optname,
-				   sockptr_t optval,
-				   unsigned int optlen)
+static int vsock_connectible_setsockopt(struct socket *sock,
+					int level,
+					int optname,
+					sockptr_t optval,
+					unsigned int optlen)
 {
 	int err;
 	struct sock *sk;
@@ -1617,10 +1622,10 @@ static int vsock_stream_setsockopt(struct socket *sock,
 	return err;
 }
 
-static int vsock_stream_getsockopt(struct socket *sock,
-				   int level, int optname,
-				   char __user *optval,
-				   int __user *optlen)
+static int vsock_connectible_getsockopt(struct socket *sock,
+					int level, int optname,
+					char __user *optval,
+					int __user *optlen)
 {
 	int err;
 	int len;
@@ -1688,8 +1693,8 @@ static int vsock_stream_getsockopt(struct socket *sock,
 	return 0;
 }
 
-static int vsock_stream_sendmsg(struct socket *sock, struct msghdr *msg,
-				size_t len)
+static int vsock_connectible_sendmsg(struct socket *sock, struct msghdr *msg,
+				     size_t len)
 {
 	struct sock *sk;
 	struct vsock_sock *vsk;
@@ -1828,10 +1833,9 @@ static int vsock_stream_sendmsg(struct socket *sock, struct msghdr *msg,
 	return err;
 }
 
-
 static int
-vsock_stream_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
-		     int flags)
+vsock_connectible_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
+			  int flags)
 {
 	struct sock *sk;
 	struct vsock_sock *vsk;
@@ -2007,7 +2011,7 @@ static const struct proto_ops vsock_stream_ops = {
 	.owner = THIS_MODULE,
 	.release = vsock_release,
 	.bind = vsock_bind,
-	.connect = vsock_stream_connect,
+	.connect = vsock_connect,
 	.socketpair = sock_no_socketpair,
 	.accept = vsock_accept,
 	.getname = vsock_getname,
@@ -2015,10 +2019,10 @@ static const struct proto_ops vsock_stream_ops = {
 	.ioctl = sock_no_ioctl,
 	.listen = vsock_listen,
 	.shutdown = vsock_shutdown,
-	.setsockopt = vsock_stream_setsockopt,
-	.getsockopt = vsock_stream_getsockopt,
-	.sendmsg = vsock_stream_sendmsg,
-	.recvmsg = vsock_stream_recvmsg,
+	.setsockopt = vsock_connectible_setsockopt,
+	.getsockopt = vsock_connectible_getsockopt,
+	.sendmsg = vsock_connectible_sendmsg,
+	.recvmsg = vsock_connectible_recvmsg,
 	.mmap = sock_no_mmap,
 	.sendpage = sock_no_sendpage,
 };
-- 
2.25.1



* [RFC PATCH v4 02/17] af_vsock: separate wait data loop
  2021-02-07 15:12 [RFC PATCH v4 00/17] virtio/vsock: introduce SOCK_SEQPACKET support Arseny Krasnov
  2021-02-07 15:14 ` [RFC PATCH v4 01/17] af_vsock: update functions for connectible socket Arseny Krasnov
@ 2021-02-07 15:14 ` Arseny Krasnov
  2021-02-11 11:24     ` Stefano Garzarella
  2021-02-11 15:11     ` Jorgen Hansen
  2021-02-07 15:15 ` [RFC PATCH v4 03/17] af_vsock: separate receive " Arseny Krasnov
                   ` (15 subsequent siblings)
  17 siblings, 2 replies; 61+ messages in thread
From: Arseny Krasnov @ 2021-02-07 15:14 UTC (permalink / raw)
  To: Stefan Hajnoczi, Stefano Garzarella, Michael S. Tsirkin,
	Jason Wang, David S. Miller, Jakub Kicinski, Arseny Krasnov,
	Jorgen Hansen, Colin Ian King, Andra Paraschiv, Alexander Popov
  Cc: kvm, virtualization, netdev, linux-kernel, stsp2, oxffffaa

This moves the wait loop for data to a dedicated function, because later
it will be used by the SEQPACKET data receive loop.
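
As a reading aid, the return convention of the new 'vsock_wait_data()' as
it appears in the diff below can be modelled like this. The enum and the
function 'classify_wait_result()' are hypothetical names used only for
illustration; they do not exist in the patch.

```c
#include <errno.h>

/* Hypothetical classification of a vsock_wait_data() return value. */
enum wait_outcome {
	WAIT_DATA_READY,	/* > 0: bytes are available to dequeue */
	WAIT_NOTHING,		/* == 0: socket error pending or shutdown seen */
	WAIT_ERROR,		/* < 0: negative errno (-EAGAIN, -ENOMEM, signal) */
};

static enum wait_outcome classify_wait_result(long ret)
{
	if (ret > 0)
		return WAIT_DATA_READY;
	if (ret == 0)
		return WAIT_NOTHING;
	return WAIT_ERROR;
}
```

The stream receive loop in the diff treats both non-positive outcomes the
same way ('if (err <= 0) break;') and only continues to dequeue when data
is ready.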

Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
---
 net/vmw_vsock/af_vsock.c | 158 +++++++++++++++++++++------------------
 1 file changed, 86 insertions(+), 72 deletions(-)

diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index f4fabec50650..38927695786f 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -1833,6 +1833,71 @@ static int vsock_connectible_sendmsg(struct socket *sock, struct msghdr *msg,
 	return err;
 }
 
+static int vsock_wait_data(struct sock *sk, struct wait_queue_entry *wait,
+			   long timeout,
+			   struct vsock_transport_recv_notify_data *recv_data,
+			   size_t target)
+{
+	const struct vsock_transport *transport;
+	struct vsock_sock *vsk;
+	s64 data;
+	int err;
+
+	vsk = vsock_sk(sk);
+	err = 0;
+	transport = vsk->transport;
+	prepare_to_wait(sk_sleep(sk), wait, TASK_INTERRUPTIBLE);
+
+	while ((data = vsock_stream_has_data(vsk)) == 0) {
+		if (sk->sk_err != 0 ||
+		    (sk->sk_shutdown & RCV_SHUTDOWN) ||
+		    (vsk->peer_shutdown & SEND_SHUTDOWN)) {
+			goto out;
+		}
+
+		/* Don't wait for non-blocking sockets. */
+		if (timeout == 0) {
+			err = -EAGAIN;
+			goto out;
+		}
+
+		if (recv_data) {
+			err = transport->notify_recv_pre_block(vsk, target, recv_data);
+			if (err < 0)
+				goto out;
+		}
+
+		release_sock(sk);
+		timeout = schedule_timeout(timeout);
+		lock_sock(sk);
+
+		if (signal_pending(current)) {
+			err = sock_intr_errno(timeout);
+			goto out;
+		} else if (timeout == 0) {
+			err = -EAGAIN;
+			goto out;
+		}
+	}
+
+	finish_wait(sk_sleep(sk), wait);
+
+	/* Invalid queue pair content. XXX This should
+	 * be changed to a connection reset in a later
+	 * change.
+	 */
+	if (data < 0)
+		return -ENOMEM;
+
+	/* Have some data, return. */
+	if (data)
+		return data;
+
+out:
+	finish_wait(sk_sleep(sk), wait);
+	return err;
+}
+
 static int
 vsock_connectible_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
 			  int flags)
@@ -1912,85 +1977,34 @@ vsock_connectible_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
 
 
 	while (1) {
-		s64 ready;
+		ssize_t read;
 
-		prepare_to_wait(sk_sleep(sk), &wait, TASK_INTERRUPTIBLE);
-		ready = vsock_stream_has_data(vsk);
-
-		if (ready == 0) {
-			if (sk->sk_err != 0 ||
-			    (sk->sk_shutdown & RCV_SHUTDOWN) ||
-			    (vsk->peer_shutdown & SEND_SHUTDOWN)) {
-				finish_wait(sk_sleep(sk), &wait);
-				break;
-			}
-			/* Don't wait for non-blocking sockets. */
-			if (timeout == 0) {
-				err = -EAGAIN;
-				finish_wait(sk_sleep(sk), &wait);
-				break;
-			}
-
-			err = transport->notify_recv_pre_block(
-					vsk, target, &recv_data);
-			if (err < 0) {
-				finish_wait(sk_sleep(sk), &wait);
-				break;
-			}
-			release_sock(sk);
-			timeout = schedule_timeout(timeout);
-			lock_sock(sk);
-
-			if (signal_pending(current)) {
-				err = sock_intr_errno(timeout);
-				finish_wait(sk_sleep(sk), &wait);
-				break;
-			} else if (timeout == 0) {
-				err = -EAGAIN;
-				finish_wait(sk_sleep(sk), &wait);
-				break;
-			}
-		} else {
-			ssize_t read;
+		err = vsock_wait_data(sk, &wait, timeout, &recv_data, target);
+		if (err <= 0)
+			break;
 
-			finish_wait(sk_sleep(sk), &wait);
-
-			if (ready < 0) {
-				/* Invalid queue pair content. XXX This should
-				* be changed to a connection reset in a later
-				* change.
-				*/
-
-				err = -ENOMEM;
-				goto out;
-			}
-
-			err = transport->notify_recv_pre_dequeue(
-					vsk, target, &recv_data);
-			if (err < 0)
-				break;
+		err = transport->notify_recv_pre_dequeue(vsk, target,
+							 &recv_data);
+		if (err < 0)
+			break;
 
-			read = transport->stream_dequeue(
-					vsk, msg,
-					len - copied, flags);
-			if (read < 0) {
-				err = -ENOMEM;
-				break;
-			}
+		read = transport->stream_dequeue(vsk, msg, len - copied, flags);
+		if (read < 0) {
+			err = -ENOMEM;
+			break;
+		}
 
-			copied += read;
+		copied += read;
 
-			err = transport->notify_recv_post_dequeue(
-					vsk, target, read,
-					!(flags & MSG_PEEK), &recv_data);
-			if (err < 0)
-				goto out;
+		err = transport->notify_recv_post_dequeue(vsk, target, read,
+						!(flags & MSG_PEEK), &recv_data);
+		if (err < 0)
+			goto out;
 
-			if (read >= target || flags & MSG_PEEK)
-				break;
+		if (read >= target || flags & MSG_PEEK)
+			break;
 
-			target -= read;
-		}
+		target -= read;
 	}
 
 	if (sk->sk_err)
-- 
2.25.1



* [RFC PATCH v4 03/17] af_vsock: separate receive data loop
  2021-02-07 15:12 [RFC PATCH v4 00/17] virtio/vsock: introduce SOCK_SEQPACKET support Arseny Krasnov
  2021-02-07 15:14 ` [RFC PATCH v4 01/17] af_vsock: update functions for connectible socket Arseny Krasnov
  2021-02-07 15:14 ` [RFC PATCH v4 02/17] af_vsock: separate wait data loop Arseny Krasnov
@ 2021-02-07 15:15 ` Arseny Krasnov
  2021-02-11 11:37     ` Stefano Garzarella
  2021-02-07 15:15 ` [RFC PATCH v4 04/17] af_vsock: implement SEQPACKET receive loop Arseny Krasnov
                   ` (14 subsequent siblings)
  17 siblings, 1 reply; 61+ messages in thread
From: Arseny Krasnov @ 2021-02-07 15:15 UTC (permalink / raw)
  To: Stefan Hajnoczi, Stefano Garzarella, Michael S. Tsirkin,
	Jason Wang, David S. Miller, Jakub Kicinski, Arseny Krasnov,
	Jorgen Hansen, Colin Ian King, Andra Paraschiv,
	Jeff Vander Stoep
  Cc: kvm, virtualization, netdev, linux-kernel, stsp2, oxffffaa

This moves the STREAM-specific data receive logic to a dedicated function,
'__vsock_stream_recvmsg()', while the checks that are the same for both
types of socket stay in the shared function 'vsock_connectible_recvmsg()'.

Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
---
 net/vmw_vsock/af_vsock.c | 117 +++++++++++++++++++++++----------------
 1 file changed, 68 insertions(+), 49 deletions(-)

diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index 38927695786f..66c8a932f49b 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -1898,65 +1898,22 @@ static int vsock_wait_data(struct sock *sk, struct wait_queue_entry *wait,
 	return err;
 }
 
-static int
-vsock_connectible_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
-			  int flags)
+static int __vsock_stream_recvmsg(struct sock *sk, struct msghdr *msg,
+				  size_t len, int flags)
 {
-	struct sock *sk;
-	struct vsock_sock *vsk;
+	struct vsock_transport_recv_notify_data recv_data;
 	const struct vsock_transport *transport;
-	int err;
-	size_t target;
+	struct vsock_sock *vsk;
 	ssize_t copied;
+	size_t target;
 	long timeout;
-	struct vsock_transport_recv_notify_data recv_data;
+	int err;
 
 	DEFINE_WAIT(wait);
 
-	sk = sock->sk;
 	vsk = vsock_sk(sk);
-	err = 0;
-
-	lock_sock(sk);
-
 	transport = vsk->transport;
 
-	if (!transport || sk->sk_state != TCP_ESTABLISHED) {
-		/* Recvmsg is supposed to return 0 if a peer performs an
-		 * orderly shutdown. Differentiate between that case and when a
-		 * peer has not connected or a local shutdown occured with the
-		 * SOCK_DONE flag.
-		 */
-		if (sock_flag(sk, SOCK_DONE))
-			err = 0;
-		else
-			err = -ENOTCONN;
-
-		goto out;
-	}
-
-	if (flags & MSG_OOB) {
-		err = -EOPNOTSUPP;
-		goto out;
-	}
-
-	/* We don't check peer_shutdown flag here since peer may actually shut
-	 * down, but there can be data in the queue that a local socket can
-	 * receive.
-	 */
-	if (sk->sk_shutdown & RCV_SHUTDOWN) {
-		err = 0;
-		goto out;
-	}
-
-	/* It is valid on Linux to pass in a zero-length receive buffer.  This
-	 * is not an error.  We may as well bail out now.
-	 */
-	if (!len) {
-		err = 0;
-		goto out;
-	}
-
 	/* We must not copy less than target bytes into the user's buffer
 	 * before returning successfully, so we wait for the consume queue to
 	 * have that much data to consume before dequeueing.  Note that this
@@ -2020,6 +1977,68 @@ vsock_connectible_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
 	return err;
 }
 
+static int
+vsock_connectible_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
+			  int flags)
+{
+	struct sock *sk;
+	struct vsock_sock *vsk;
+	const struct vsock_transport *transport;
+	int err;
+
+	DEFINE_WAIT(wait);
+
+	sk = sock->sk;
+	vsk = vsock_sk(sk);
+	err = 0;
+
+	lock_sock(sk);
+
+	transport = vsk->transport;
+
+	if (!transport || sk->sk_state != TCP_ESTABLISHED) {
+		/* Recvmsg is supposed to return 0 if a peer performs an
+		 * orderly shutdown. Differentiate between that case and when a
+		 * peer has not connected or a local shutdown occurred with the
+		 * SOCK_DONE flag.
+		 */
+		if (sock_flag(sk, SOCK_DONE))
+			err = 0;
+		else
+			err = -ENOTCONN;
+
+		goto out;
+	}
+
+	if (flags & MSG_OOB) {
+		err = -EOPNOTSUPP;
+		goto out;
+	}
+
+	/* We don't check peer_shutdown flag here since peer may actually shut
+	 * down, but there can be data in the queue that a local socket can
+	 * receive.
+	 */
+	if (sk->sk_shutdown & RCV_SHUTDOWN) {
+		err = 0;
+		goto out;
+	}
+
+	/* It is valid on Linux to pass in a zero-length receive buffer.  This
+	 * is not an error.  We may as well bail out now.
+	 */
+	if (!len) {
+		err = 0;
+		goto out;
+	}
+
+	err = __vsock_stream_recvmsg(sk, msg, len, flags);
+
+out:
+	release_sock(sk);
+	return err;
+}
+
 static const struct proto_ops vsock_stream_ops = {
 	.family = PF_VSOCK,
 	.owner = THIS_MODULE,
-- 
2.25.1



* [RFC PATCH v4 04/17] af_vsock: implement SEQPACKET receive loop
  2021-02-07 15:12 [RFC PATCH v4 00/17] virtio/vsock: introduce SOCK_SEQPACKET support Arseny Krasnov
                   ` (2 preceding siblings ...)
  2021-02-07 15:15 ` [RFC PATCH v4 03/17] af_vsock: separate receive " Arseny Krasnov
@ 2021-02-07 15:15 ` Arseny Krasnov
  2021-02-11 11:47     ` Stefano Garzarella
  2021-02-07 15:15 ` [RFC PATCH v4 05/17] af_vsock: separate wait space loop Arseny Krasnov
                   ` (13 subsequent siblings)
  17 siblings, 1 reply; 61+ messages in thread
From: Arseny Krasnov @ 2021-02-07 15:15 UTC (permalink / raw)
  To: Stefan Hajnoczi, Stefano Garzarella, Michael S. Tsirkin,
	Jason Wang, David S. Miller, Jakub Kicinski, Arseny Krasnov,
	Jorgen Hansen, Colin Ian King, Andra Paraschiv, Alexander Popov
  Cc: kvm, virtualization, netdev, linux-kernel, stsp2, oxffffaa

This adds a receive loop for SEQPACKET. It looks like the receive loop for
STREAM, but there are a few differences:
1) It doesn't call notify callbacks.
2) It doesn't care about 'SO_SNDLOWAT' and 'SO_RCVLOWAT' values, because
   these values make no sense in the SEQPACKET case.
3) It waits until the whole record is received or an error is found
   during receiving.
4) It processes and sets the 'MSG_TRUNC' flag.

So, to avoid extra conditions for two types of socket inside one loop, two
independent functions were created.
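
The 'MSG_TRUNC' handling from item 4 can be sketched as a standalone
function. This is a simplified userspace model of the tail of
'__vsock_seqpacket_recvmsg()' in the diff below, under the assumption that
the number of bytes copied is the smaller of the record length and the
buffer length. 'seqpacket_recv_result()' is a hypothetical name.

```c
#include <stddef.h>
#include <sys/socket.h>	/* MSG_TRUNC */
#include <sys/types.h>	/* ssize_t */

/* Compute what recvmsg() would report for a complete record.
 *
 * record_len: length of the record announced by the sender
 * buf_len:    size of the user's buffer
 * flags:      flags passed by the caller (only MSG_TRUNC is examined)
 * msg_flags:  output flags, MSG_TRUNC is OR-ed in when data was cut off
 */
static ssize_t seqpacket_recv_result(size_t record_len, size_t buf_len,
				     int flags, int *msg_flags)
{
	size_t copied = record_len < buf_len ? record_len : buf_len;

	/* Always report truncation when the record did not fit. */
	if (record_len > buf_len)
		*msg_flags |= MSG_TRUNC;

	/* Caller asked for the real record length rather than bytes copied. */
	if (flags & MSG_TRUNC)
		return (ssize_t)record_len;

	return (ssize_t)copied;
}
```

For a 10-byte record and a 4-byte buffer this model returns 4 without
'MSG_TRUNC' in 'flags' and 10 with it, and sets 'MSG_TRUNC' in the output
flags in both cases.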

Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
---
 include/net/af_vsock.h   |  5 +++
 net/vmw_vsock/af_vsock.c | 96 +++++++++++++++++++++++++++++++++++++++-
 2 files changed, 100 insertions(+), 1 deletion(-)

diff --git a/include/net/af_vsock.h b/include/net/af_vsock.h
index b1c717286993..bb6a0e52be86 100644
--- a/include/net/af_vsock.h
+++ b/include/net/af_vsock.h
@@ -135,6 +135,11 @@ struct vsock_transport {
 	bool (*stream_is_active)(struct vsock_sock *);
 	bool (*stream_allow)(u32 cid, u32 port);
 
+	/* SEQ_PACKET. */
+	size_t (*seqpacket_seq_get_len)(struct vsock_sock *);
+	int (*seqpacket_dequeue)(struct vsock_sock *, struct msghdr *,
+				     int flags, bool *msg_ready);
+
 	/* Notification. */
 	int (*notify_poll_in)(struct vsock_sock *, size_t, bool *);
 	int (*notify_poll_out)(struct vsock_sock *, size_t, bool *);
diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index 66c8a932f49b..3d8af987216a 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -1977,6 +1977,97 @@ static int __vsock_stream_recvmsg(struct sock *sk, struct msghdr *msg,
 	return err;
 }
 
+static int __vsock_seqpacket_recvmsg(struct sock *sk, struct msghdr *msg,
+				     size_t len, int flags)
+{
+	const struct vsock_transport *transport;
+	const struct iovec *orig_iov;
+	unsigned long orig_nr_segs;
+	bool msg_ready;
+	struct vsock_sock *vsk;
+	size_t record_len;
+	long timeout;
+	int err = 0;
+	DEFINE_WAIT(wait);
+
+	vsk = vsock_sk(sk);
+	transport = vsk->transport;
+
+	timeout = sock_rcvtimeo(sk, flags & MSG_DONTWAIT);
+	orig_nr_segs = msg->msg_iter.nr_segs;
+	orig_iov = msg->msg_iter.iov;
+	msg_ready = false;
+	record_len = 0;
+
+	while (1) {
+		err = vsock_wait_data(sk, &wait, timeout, NULL, 0);
+
+		if (err <= 0) {
+			/* In case of any loop break(timeout, signal
+			 * interrupt or shutdown), we report user that
+			 * nothing was copied.
+			 */
+			err = 0;
+			break;
+		}
+
+		if (record_len == 0) {
+			record_len =
+				transport->seqpacket_seq_get_len(vsk);
+
+			if (record_len == 0)
+				continue;
+		}
+
+		err = transport->seqpacket_dequeue(vsk, msg,
+					flags, &msg_ready);
+		if (err < 0) {
+			if (err == -EAGAIN) {
+				iov_iter_init(&msg->msg_iter, READ,
+					      orig_iov, orig_nr_segs,
+					      len);
+				/* Clear 'MSG_EOR' here, because dequeue
+				 * callback above set it again if it was
+				 * set by sender. This 'MSG_EOR' is from
+				 * dropped record.
+				 */
+				msg->msg_flags &= ~MSG_EOR;
+				record_len = 0;
+				continue;
+			}
+
+			err = -ENOMEM;
+			break;
+		}
+
+		if (msg_ready)
+			break;
+	}
+
+	if (sk->sk_err)
+		err = -sk->sk_err;
+	else if (sk->sk_shutdown & RCV_SHUTDOWN)
+		err = 0;
+
+	if (msg_ready) {
+		/* User sets MSG_TRUNC, so return real length of
+		 * packet.
+		 */
+		if (flags & MSG_TRUNC)
+			err = record_len;
+		else
+			err = len - msg->msg_iter.count;
+
+		/* Always set MSG_TRUNC if real length of packet is
+		 * bigger than user's buffer.
+		 */
+		if (record_len > len)
+			msg->msg_flags |= MSG_TRUNC;
+	}
+
+	return err;
+}
+
 static int
 vsock_connectible_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
 			  int flags)
@@ -2032,7 +2123,10 @@ vsock_connectible_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
 		goto out;
 	}
 
-	err = __vsock_stream_recvmsg(sk, msg, len, flags);
+	if (sk->sk_type == SOCK_STREAM)
+		err = __vsock_stream_recvmsg(sk, msg, len, flags);
+	else
+		err = __vsock_seqpacket_recvmsg(sk, msg, len, flags);
 
 out:
 	release_sock(sk);
-- 
2.25.1



* [RFC PATCH v4 05/17] af_vsock: separate wait space loop
  2021-02-07 15:12 [RFC PATCH v4 00/17] virtio/vsock: introduce SOCK_SEQPACKET support Arseny Krasnov
                   ` (3 preceding siblings ...)
  2021-02-07 15:15 ` [RFC PATCH v4 04/17] af_vsock: implement SEQPACKET receive loop Arseny Krasnov
@ 2021-02-07 15:15 ` Arseny Krasnov
  2021-02-07 16:58   ` kernel test robot
  2021-02-11 12:14     ` Stefano Garzarella
  2021-02-07 15:15 ` [RFC PATCH v4 06/17] af_vsock: implement send logic for SEQPACKET Arseny Krasnov
                   ` (12 subsequent siblings)
  17 siblings, 2 replies; 61+ messages in thread
From: Arseny Krasnov @ 2021-02-07 15:15 UTC (permalink / raw)
  To: Stefan Hajnoczi, Stefano Garzarella, Michael S. Tsirkin,
	Jason Wang, David S. Miller, Jakub Kicinski, Arseny Krasnov,
	Jorgen Hansen, Colin Ian King, Andra Paraschiv,
	Jeff Vander Stoep
  Cc: kvm, virtualization, netdev, linux-kernel, stsp2, oxffffaa

This moves the loop that waits for space on send to a separate function,
because it will be used for sending SEQ_BEGIN/SEQ_END before and
after data transmission. Waiting is needed for SEQ_BEGIN/SEQ_END
because such packets carry the SEQPACKET header, which must not be
fragmented by the credit mechanism; to avoid that, the sender waits until
enough space is available.
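
The change in the wait condition can be illustrated with a small model.
The old loop waited while free space was zero; the new helper waits while
free space is below a requested amount, so a marker header is never split.
'struct seq_hdr' is a stand-in for the 8-byte SEQPACKET header and
'must_wait_for_space()' is a hypothetical name, neither is taken from the
patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the SEQPACKET header carried by SEQ_BEGIN/SEQ_END. */
struct seq_hdr {
	uint32_t msg_cnt;
	uint32_t msg_len;
};

/* The sender keeps waiting while the peer's free space cannot hold
 * 'needed' bytes in one piece. Passing needed == 1 reproduces the old
 * "wait while space == 0" behaviour of the stream path.
 */
static bool must_wait_for_space(size_t free_space, size_t needed)
{
	return free_space < needed;
}
```

With 4 bytes of free space a stream sender may proceed, while a sender of
SEQ_BEGIN or SEQ_END has to wait until the whole header fits.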

Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
---
 include/net/af_vsock.h   |  2 +
 net/vmw_vsock/af_vsock.c | 93 ++++++++++++++++++++++++++--------------
 2 files changed, 62 insertions(+), 33 deletions(-)

diff --git a/include/net/af_vsock.h b/include/net/af_vsock.h
index bb6a0e52be86..19f6f22821ec 100644
--- a/include/net/af_vsock.h
+++ b/include/net/af_vsock.h
@@ -205,6 +205,8 @@ void vsock_remove_sock(struct vsock_sock *vsk);
 void vsock_for_each_connected_socket(void (*fn)(struct sock *sk));
 int vsock_assign_transport(struct vsock_sock *vsk, struct vsock_sock *psk);
 bool vsock_find_cid(unsigned int cid);
+int vsock_wait_space(struct sock *sk, size_t space, int flags,
+		     struct vsock_transport_send_notify_data *send_data);
 
 /**** TAP ****/
 
diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index 3d8af987216a..ea99261e88ac 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -1693,6 +1693,64 @@ static int vsock_connectible_getsockopt(struct socket *sock,
 	return 0;
 }
 
+int vsock_wait_space(struct sock *sk, size_t space, int flags,
+		     struct vsock_transport_send_notify_data *send_data)
+{
+	const struct vsock_transport *transport;
+	struct vsock_sock *vsk;
+	long timeout;
+	int err;
+
+	DEFINE_WAIT_FUNC(wait, woken_wake_function);
+
+	vsk = vsock_sk(sk);
+	transport = vsk->transport;
+	timeout = sock_sndtimeo(sk, flags & MSG_DONTWAIT);
+	err = 0;
+
+	add_wait_queue(sk_sleep(sk), &wait);
+
+	while (vsock_stream_has_space(vsk) < space &&
+	       sk->sk_err == 0 &&
+	       !(sk->sk_shutdown & SEND_SHUTDOWN) &&
+	       !(vsk->peer_shutdown & RCV_SHUTDOWN)) {
+		/* Don't wait for non-blocking sockets. */
+		if (timeout == 0) {
+			err = -EAGAIN;
+			goto out_err;
+		}
+
+		if (send_data) {
+			err = transport->notify_send_pre_block(vsk, send_data);
+			if (err < 0)
+				goto out_err;
+		}
+
+		release_sock(sk);
+		timeout = wait_woken(&wait, TASK_INTERRUPTIBLE, timeout);
+		lock_sock(sk);
+		if (signal_pending(current)) {
+			err = sock_intr_errno(timeout);
+			goto out_err;
+		} else if (timeout == 0) {
+			err = -EAGAIN;
+			goto out_err;
+		}
+	}
+
+	if (sk->sk_err) {
+		err = -sk->sk_err;
+	} else if ((sk->sk_shutdown & SEND_SHUTDOWN) ||
+		   (vsk->peer_shutdown & RCV_SHUTDOWN)) {
+		err = -EPIPE;
+	}
+
+out_err:
+	remove_wait_queue(sk_sleep(sk), &wait);
+	return err;
+}
+EXPORT_SYMBOL_GPL(vsock_wait_space);
+
 static int vsock_connectible_sendmsg(struct socket *sock, struct msghdr *msg,
 				     size_t len)
 {
@@ -1751,39 +1809,8 @@ static int vsock_connectible_sendmsg(struct socket *sock, struct msghdr *msg,
 	while (total_written < len) {
 		ssize_t written;
 
-		add_wait_queue(sk_sleep(sk), &wait);
-		while (vsock_stream_has_space(vsk) == 0 &&
-		       sk->sk_err == 0 &&
-		       !(sk->sk_shutdown & SEND_SHUTDOWN) &&
-		       !(vsk->peer_shutdown & RCV_SHUTDOWN)) {
-
-			/* Don't wait for non-blocking sockets. */
-			if (timeout == 0) {
-				err = -EAGAIN;
-				remove_wait_queue(sk_sleep(sk), &wait);
-				goto out_err;
-			}
-
-			err = transport->notify_send_pre_block(vsk, &send_data);
-			if (err < 0) {
-				remove_wait_queue(sk_sleep(sk), &wait);
-				goto out_err;
-			}
-
-			release_sock(sk);
-			timeout = wait_woken(&wait, TASK_INTERRUPTIBLE, timeout);
-			lock_sock(sk);
-			if (signal_pending(current)) {
-				err = sock_intr_errno(timeout);
-				remove_wait_queue(sk_sleep(sk), &wait);
-				goto out_err;
-			} else if (timeout == 0) {
-				err = -EAGAIN;
-				remove_wait_queue(sk_sleep(sk), &wait);
-				goto out_err;
-			}
-		}
-		remove_wait_queue(sk_sleep(sk), &wait);
+		if (vsock_wait_space(sk, 1, msg->msg_flags, &send_data))
+			goto out_err;
 
 		/* These checks occur both as part of and after the loop
 		 * conditional since we need to check before and after
-- 
2.25.1



* [RFC PATCH v4 06/17] af_vsock: implement send logic for SEQPACKET
  2021-02-07 15:12 [RFC PATCH v4 00/17] virtio/vsock: introduce SOCK_SEQPACKET support Arseny Krasnov
                   ` (4 preceding siblings ...)
  2021-02-07 15:15 ` [RFC PATCH v4 05/17] af_vsock: separate wait space loop Arseny Krasnov
@ 2021-02-07 15:15 ` Arseny Krasnov
  2021-02-11 12:17     ` Stefano Garzarella
  2021-02-07 15:16 ` [RFC PATCH v4 07/17] af_vsock: rest of SEQPACKET support Arseny Krasnov
                   ` (11 subsequent siblings)
  17 siblings, 1 reply; 61+ messages in thread
From: Arseny Krasnov @ 2021-02-07 15:15 UTC (permalink / raw)
  To: Stefan Hajnoczi, Stefano Garzarella, Michael S. Tsirkin,
	Jason Wang, David S. Miller, Jakub Kicinski, Arseny Krasnov,
	Jorgen Hansen, Andra Paraschiv, Colin Ian King,
	Jeff Vander Stoep
  Cc: kvm, virtualization, netdev, linux-kernel, stsp2, oxffffaa

This adds logic to the current stream enqueue function for SEQPACKET
support:
1) Send the record's begin/end markers.
2) For SOCK_SEQPACKET, the return value of the enqueue function is
   either the whole record length or an error.
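
For illustration only, the return value rule from 2) can be written as
a standalone helper. The function name is hypothetical; 'err' is
assumed to be 0 or a negative errno value, as in the patched function.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper mirroring the 'out_err' logic of the patch:
 * a partial write is reported only for SOCK_STREAM; SOCK_SEQPACKET
 * reports the byte count only when the whole record was written.
 */
static ssize_t sendmsg_result(int sk_type, size_t len,
			      size_t total_written, int err)
{
	if (total_written > 0) {
		if (sk_type == SOCK_STREAM || total_written == len)
			return (ssize_t)total_written;
	}

	return err;
}
```

So a stream socket that wrote 40 of 100 bytes before hitting -EAGAIN
returns 40, while a SEQPACKET socket in the same situation returns
-EAGAIN.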

Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
---
 include/net/af_vsock.h   |  2 ++
 net/vmw_vsock/af_vsock.c | 22 ++++++++++++++++++++--
 2 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/include/net/af_vsock.h b/include/net/af_vsock.h
index 19f6f22821ec..198d58c4c7ee 100644
--- a/include/net/af_vsock.h
+++ b/include/net/af_vsock.h
@@ -136,6 +136,8 @@ struct vsock_transport {
 	bool (*stream_allow)(u32 cid, u32 port);
 
 	/* SEQ_PACKET. */
+	int (*seqpacket_seq_send_len)(struct vsock_sock *, size_t len, int flags);
+	int (*seqpacket_seq_send_eor)(struct vsock_sock *, int flags);
 	size_t (*seqpacket_seq_get_len)(struct vsock_sock *);
 	int (*seqpacket_dequeue)(struct vsock_sock *, struct msghdr *,
 				     int flags, bool *msg_ready);
diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index ea99261e88ac..a033d3340ac4 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -1806,6 +1806,12 @@ static int vsock_connectible_sendmsg(struct socket *sock, struct msghdr *msg,
 	if (err < 0)
 		goto out;
 
+	if (sk->sk_type == SOCK_SEQPACKET) {
+		err = transport->seqpacket_seq_send_len(vsk, len, msg->msg_flags);
+		if (err < 0)
+			goto out;
+	}
+
 	while (total_written < len) {
 		ssize_t written;
 
@@ -1852,9 +1858,21 @@ static int vsock_connectible_sendmsg(struct socket *sock, struct msghdr *msg,
 
 	}
 
+	if (sk->sk_type == SOCK_SEQPACKET) {
+		err = transport->seqpacket_seq_send_eor(vsk, msg->msg_flags);
+		if (err < 0)
+			goto out;
+	}
+
 out_err:
-	if (total_written > 0)
-		err = total_written;
+	if (total_written > 0) {
+		/* Return number of written bytes only if:
+		 * 1) SOCK_STREAM socket.
+		 * 2) SOCK_SEQPACKET socket when whole buffer is sent.
+		 */
+		if (sk->sk_type == SOCK_STREAM || total_written == len)
+			err = total_written;
+	}
 out:
 	release_sock(sk);
 	return err;
-- 
2.25.1



* [RFC PATCH v4 07/17] af_vsock: rest of SEQPACKET support
  2021-02-07 15:12 [RFC PATCH v4 00/17] virtio/vsock: introduce SOCK_SEQPACKET support Arseny Krasnov
                   ` (5 preceding siblings ...)
  2021-02-07 15:15 ` [RFC PATCH v4 06/17] af_vsock: implement send logic for SEQPACKET Arseny Krasnov
@ 2021-02-07 15:16 ` Arseny Krasnov
  2021-02-11 12:27     ` Stefano Garzarella
  2021-02-07 15:16 ` [RFC PATCH v4 08/17] af_vsock: update comments for stream sockets Arseny Krasnov
                   ` (10 subsequent siblings)
  17 siblings, 1 reply; 61+ messages in thread
From: Arseny Krasnov @ 2021-02-07 15:16 UTC (permalink / raw)
  To: Stefan Hajnoczi, Stefano Garzarella, Michael S. Tsirkin,
	Jason Wang, David S. Miller, Jakub Kicinski, Arseny Krasnov,
	Jorgen Hansen, Colin Ian King, Andra Paraschiv,
	Jeff Vander Stoep
  Cc: kvm, virtualization, netdev, linux-kernel, stsp2, oxffffaa

This adds the rest of SOCK_SEQPACKET support:
1) Adds socket ops for the SEQPACKET type.
2) Allows creating a socket of the SEQPACKET type.
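
For illustration only, a userspace program could request such a socket
as sketched below. The helper name is hypothetical. On a kernel
without this series the call is expected to fail (the 'default' branch
of vsock_create() returns -ESOCKTNOSUPPORT), so the helper returns a
negative errno value instead of assuming success.

```c
#include <errno.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: returns a new AF_VSOCK SOCK_SEQPACKET socket
 * descriptor, or a negative errno value if the kernel refuses.
 */
static int open_vsock_seqpacket(void)
{
	int fd = socket(AF_VSOCK, SOCK_SEQPACKET, 0);

	if (fd < 0)
		return -errno;

	return fd;
}
```

Note that with this patch a later connect() may still fail with
ESOCKTNOSUPPORT if the selected transport does not provide all four
seqpacket callbacks.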

Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
---
 net/vmw_vsock/af_vsock.c | 37 ++++++++++++++++++++++++++++++++++++-
 1 file changed, 36 insertions(+), 1 deletion(-)

diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index a033d3340ac4..c77998a14018 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -452,6 +452,7 @@ int vsock_assign_transport(struct vsock_sock *vsk, struct vsock_sock *psk)
 		new_transport = transport_dgram;
 		break;
 	case SOCK_STREAM:
+	case SOCK_SEQPACKET:
 		if (vsock_use_local_transport(remote_cid))
 			new_transport = transport_local;
 		else if (remote_cid <= VMADDR_CID_HOST || !transport_h2g ||
@@ -459,6 +460,15 @@ int vsock_assign_transport(struct vsock_sock *vsk, struct vsock_sock *psk)
 			new_transport = transport_g2h;
 		else
 			new_transport = transport_h2g;
+
+		if (sk->sk_type == SOCK_SEQPACKET) {
+			if (!new_transport ||
+			    !new_transport->seqpacket_seq_send_len ||
+			    !new_transport->seqpacket_seq_send_eor ||
+			    !new_transport->seqpacket_seq_get_len ||
+			    !new_transport->seqpacket_dequeue)
+				return -ESOCKTNOSUPPORT;
+		}
 		break;
 	default:
 		return -ESOCKTNOSUPPORT;
@@ -684,6 +694,7 @@ static int __vsock_bind(struct sock *sk, struct sockaddr_vm *addr)
 
 	switch (sk->sk_socket->type) {
 	case SOCK_STREAM:
+	case SOCK_SEQPACKET:
 		spin_lock_bh(&vsock_table_lock);
 		retval = __vsock_bind_connectible(vsk, addr);
 		spin_unlock_bh(&vsock_table_lock);
@@ -769,7 +780,7 @@ static struct sock *__vsock_create(struct net *net,
 
 static bool sock_type_connectible(u16 type)
 {
-	return type == SOCK_STREAM;
+	return (type == SOCK_STREAM) || (type == SOCK_SEQPACKET);
 }
 
 static void __vsock_release(struct sock *sk, int level)
@@ -2199,6 +2210,27 @@ static const struct proto_ops vsock_stream_ops = {
 	.sendpage = sock_no_sendpage,
 };
 
+static const struct proto_ops vsock_seqpacket_ops = {
+	.family = PF_VSOCK,
+	.owner = THIS_MODULE,
+	.release = vsock_release,
+	.bind = vsock_bind,
+	.connect = vsock_connect,
+	.socketpair = sock_no_socketpair,
+	.accept = vsock_accept,
+	.getname = vsock_getname,
+	.poll = vsock_poll,
+	.ioctl = sock_no_ioctl,
+	.listen = vsock_listen,
+	.shutdown = vsock_shutdown,
+	.setsockopt = vsock_connectible_setsockopt,
+	.getsockopt = vsock_connectible_getsockopt,
+	.sendmsg = vsock_connectible_sendmsg,
+	.recvmsg = vsock_connectible_recvmsg,
+	.mmap = sock_no_mmap,
+	.sendpage = sock_no_sendpage,
+};
+
 static int vsock_create(struct net *net, struct socket *sock,
 			int protocol, int kern)
 {
@@ -2219,6 +2251,9 @@ static int vsock_create(struct net *net, struct socket *sock,
 	case SOCK_STREAM:
 		sock->ops = &vsock_stream_ops;
 		break;
+	case SOCK_SEQPACKET:
+		sock->ops = &vsock_seqpacket_ops;
+		break;
 	default:
 		return -ESOCKTNOSUPPORT;
 	}
-- 
2.25.1



* [RFC PATCH v4 08/17] af_vsock: update comments for stream sockets
  2021-02-07 15:12 [RFC PATCH v4 00/17] virtio/vsock: introduce SOCK_SEQPACKET support Arseny Krasnov
                   ` (6 preceding siblings ...)
  2021-02-07 15:16 ` [RFC PATCH v4 07/17] af_vsock: rest of SEQPACKET support Arseny Krasnov
@ 2021-02-07 15:16 ` Arseny Krasnov
  2021-02-11 13:19     ` Stefano Garzarella
  2021-02-07 15:16 ` [RFC PATCH v4 09/17] virtio/vsock: dequeue callback for SOCK_SEQPACKET Arseny Krasnov
                   ` (9 subsequent siblings)
  17 siblings, 1 reply; 61+ messages in thread
From: Arseny Krasnov @ 2021-02-07 15:16 UTC (permalink / raw)
  To: Stefan Hajnoczi, Stefano Garzarella, Michael S. Tsirkin,
	Jason Wang, David S. Miller, Jakub Kicinski, Arseny Krasnov,
	Jorgen Hansen, Colin Ian King, Andra Paraschiv,
	Jeff Vander Stoep
  Cc: kvm, virtualization, netdev, linux-kernel, stsp2, oxffffaa

This replaces 'stream' with 'connect oriented' in comments, as
SEQPACKET is also connect oriented.

Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
---
 net/vmw_vsock/af_vsock.c | 31 +++++++++++++++++--------------
 1 file changed, 17 insertions(+), 14 deletions(-)

diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index c77998a14018..6e5e192cb703 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -415,8 +415,8 @@ static void vsock_deassign_transport(struct vsock_sock *vsk)
 
 /* Assign a transport to a socket and call the .init transport callback.
  *
- * Note: for stream socket this must be called when vsk->remote_addr is set
- * (e.g. during the connect() or when a connection request on a listener
+ * Note: for connect oriented socket this must be called when vsk->remote_addr
+ * is set (e.g. during the connect() or when a connection request on a listener
  * socket is received).
  * The vsk->remote_addr is used to decide which transport to use:
  *  - remote CID == VMADDR_CID_LOCAL or g2h->local_cid or VMADDR_CID_HOST if
@@ -479,10 +479,10 @@ int vsock_assign_transport(struct vsock_sock *vsk, struct vsock_sock *psk)
 			return 0;
 
 		/* transport->release() must be called with sock lock acquired.
-		 * This path can only be taken during vsock_stream_connect(),
-		 * where we have already held the sock lock.
-		 * In the other cases, this function is called on a new socket
-		 * which is not assigned to any transport.
+		 * This path can only be taken during vsock_connect(), where we
+		 * have already held the sock lock. In the other cases, this
+		 * function is called on a new socket which is not assigned to
+		 * any transport.
 		 */
 		vsk->transport->release(vsk);
 		vsock_deassign_transport(vsk);
@@ -659,9 +659,10 @@ static int __vsock_bind_connectible(struct vsock_sock *vsk,
 
 	vsock_addr_init(&vsk->local_addr, new_addr.svm_cid, new_addr.svm_port);
 
-	/* Remove stream sockets from the unbound list and add them to the hash
-	 * table for easy lookup by its address.  The unbound list is simply an
-	 * extra entry at the end of the hash table, a trick used by AF_UNIX.
+	/* Remove connect oriented sockets from the unbound list and add them
+	 * to the hash table for easy lookup by its address.  The unbound list
+	 * is simply an extra entry at the end of the hash table, a trick used
+	 * by AF_UNIX.
 	 */
 	__vsock_remove_bound(vsk);
 	__vsock_insert_bound(vsock_bound_sockets(&vsk->local_addr), vsk);
@@ -952,10 +953,10 @@ static int vsock_shutdown(struct socket *sock, int mode)
 	if ((mode & ~SHUTDOWN_MASK) || !mode)
 		return -EINVAL;
 
-	/* If this is a STREAM socket and it is not connected then bail out
-	 * immediately.  If it is a DGRAM socket then we must first kick the
-	 * socket so that it wakes up from any sleeping calls, for example
-	 * recv(), and then afterwards return the error.
+	/* If this is a connect oriented socket and it is not connected then
+	 * bail out immediately.  If it is a DGRAM socket then we must first
+	 * kick the socket so that it wakes up from any sleeping calls, for
+	 * example recv(), and then afterwards return the error.
 	 */
 
 	sk = sock->sk;
@@ -1786,7 +1787,9 @@ static int vsock_connectible_sendmsg(struct socket *sock, struct msghdr *msg,
 
 	transport = vsk->transport;
 
-	/* Callers should not provide a destination with stream sockets. */
+	/* Callers should not provide a destination with connect oriented
+	 * sockets.
+	 */
 	if (msg->msg_namelen) {
 		err = sk->sk_state == TCP_ESTABLISHED ? -EISCONN : -EOPNOTSUPP;
 		goto out;
-- 
2.25.1



* [RFC PATCH v4 09/17] virtio/vsock: dequeue callback for SOCK_SEQPACKET
  2021-02-07 15:12 [RFC PATCH v4 00/17] virtio/vsock: introduce SOCK_SEQPACKET support Arseny Krasnov
                   ` (7 preceding siblings ...)
  2021-02-07 15:16 ` [RFC PATCH v4 08/17] af_vsock: update comments for stream sockets Arseny Krasnov
@ 2021-02-07 15:16 ` Arseny Krasnov
  2021-02-11 13:54     ` Stefano Garzarella
  2021-02-07 15:17 ` [RFC PATCH v4 10/17] virtio/vsock: fetch length for SEQPACKET record Arseny Krasnov
                   ` (8 subsequent siblings)
  17 siblings, 1 reply; 61+ messages in thread
From: Arseny Krasnov @ 2021-02-07 15:16 UTC (permalink / raw)
  To: Stefan Hajnoczi, Stefano Garzarella, Michael S. Tsirkin,
	Jason Wang, David S. Miller, Jakub Kicinski, Arseny Krasnov,
	Jorgen Hansen, Andra Paraschiv, Colin Ian King,
	Jeff Vander Stoep
  Cc: kvm, virtualization, netdev, linux-kernel, stsp2, oxffffaa

This adds the transport callback and its logic for SEQPACKET dequeue.
The callback fetches RW packets from the socket's rx queue until the
whole record is copied (if the user's buffer is full, the user is not
woken up). This is done to avoid stalling the sender: if we wake up the
user and it leaves the syscall, nobody will send a credit update for
the rest of the record, and the sender will wait until the receiver
enters the read syscall again. So if the user's buffer is full, we just
send a credit update and drop the data. If SEQ_BEGIN is found during
the copy (and not all data was copied), copying is restarted by
resetting the user's iov iterator (the previous unfinished data is
dropped).
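
For illustration only, the dequeue loop can be modelled in userspace
with an array standing in for the rx queue. All names below are
hypothetical, locking and credit updates are left out, and the record
length and counter are assumed to have been read from SEQ_BEGIN
already.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum model_op { MODEL_OP_RW, MODEL_OP_SEQ_BEGIN, MODEL_OP_SEQ_END };

struct model_pkt {
	enum model_op op;
	uint32_t len;		/* payload length of an RW packet */
	uint32_t msg_cnt;	/* counter carried by SEQ_BEGIN/SEQ_END */
};

struct model_rx {
	uint32_t seq_len;	/* record length taken from SEQ_BEGIN */
	uint32_t copied;	/* RW bytes consumed for this record */
	uint32_t begin_cnt;	/* msg_cnt taken from SEQ_BEGIN */
};

/* Consume packets starting at pkts[*pos]. Returns 0 (with *msg_ready
 * telling whether the record is complete) or -EAGAIN when the current
 * record must be dropped and the receive restarted.
 */
static int model_dequeue(struct model_rx *rx, const struct model_pkt *pkts,
			 size_t npkts, size_t *pos, size_t user_buf_len,
			 size_t *user_copied, bool *msg_ready)
{
	*msg_ready = false;
	*user_copied = 0;

	while (!*msg_ready && *pos < npkts) {
		const struct model_pkt *pkt = &pkts[*pos];

		switch (pkt->op) {
		case MODEL_OP_SEQ_BEGIN:
			/* Unexpected begin: leave it queued, restart. */
			return -EAGAIN;
		case MODEL_OP_SEQ_END:
			(*pos)++;
			if (rx->copied != rx->seq_len ||
			    pkt->msg_cnt - rx->begin_cnt != 1)
				return -EAGAIN;
			*msg_ready = true;
			break;
		case MODEL_OP_RW: {
			size_t room = user_buf_len - *user_copied;
			size_t n = pkt->len < room ? pkt->len : room;

			/* Bytes beyond the user's buffer are dropped,
			 * but still counted against the record length.
			 */
			*user_copied += n;
			rx->copied += pkt->len;
			(*pos)++;
			break;
		}
		}
	}

	return 0;
}
```

For a 10 byte record split into RW packets of 6 and 4 bytes and an
8 byte user buffer, the model hands 8 bytes to the user, consumes all
three packets and reports the record as ready.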

Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
---
 include/linux/virtio_vsock.h            |   5 +
 include/uapi/linux/virtio_vsock.h       |  16 ++++
 net/vmw_vsock/virtio_transport_common.c | 120 ++++++++++++++++++++++++
 3 files changed, 141 insertions(+)

diff --git a/include/linux/virtio_vsock.h b/include/linux/virtio_vsock.h
index dc636b727179..4d0de3dee9a4 100644
--- a/include/linux/virtio_vsock.h
+++ b/include/linux/virtio_vsock.h
@@ -36,6 +36,11 @@ struct virtio_vsock_sock {
 	u32 rx_bytes;
 	u32 buf_alloc;
 	struct list_head rx_queue;
+
+	/* For SOCK_SEQPACKET */
+	u32 user_read_seq_len;
+	u32 user_read_copied;
+	u32 curr_rx_msg_cnt;
 };
 
 struct virtio_vsock_pkt {
diff --git a/include/uapi/linux/virtio_vsock.h b/include/uapi/linux/virtio_vsock.h
index 1d57ed3d84d2..cf9c165e5cca 100644
--- a/include/uapi/linux/virtio_vsock.h
+++ b/include/uapi/linux/virtio_vsock.h
@@ -63,8 +63,14 @@ struct virtio_vsock_hdr {
 	__le32	fwd_cnt;
 } __attribute__((packed));
 
+struct virtio_vsock_seq_hdr {
+	__le32  msg_cnt;
+	__le32  msg_len;
+} __attribute__((packed));
+
 enum virtio_vsock_type {
 	VIRTIO_VSOCK_TYPE_STREAM = 1,
+	VIRTIO_VSOCK_TYPE_SEQPACKET = 2,
 };
 
 enum virtio_vsock_op {
@@ -83,6 +89,11 @@ enum virtio_vsock_op {
 	VIRTIO_VSOCK_OP_CREDIT_UPDATE = 6,
 	/* Request the peer to send the credit info to us */
 	VIRTIO_VSOCK_OP_CREDIT_REQUEST = 7,
+
+	/* Record begin for SOCK_SEQPACKET */
+	VIRTIO_VSOCK_OP_SEQ_BEGIN = 8,
+	/* Record end for SOCK_SEQPACKET */
+	VIRTIO_VSOCK_OP_SEQ_END = 9,
 };
 
 /* VIRTIO_VSOCK_OP_SHUTDOWN flags values */
@@ -91,4 +102,9 @@ enum virtio_vsock_shutdown {
 	VIRTIO_VSOCK_SHUTDOWN_SEND = 2,
 };
 
+/* VIRTIO_VSOCK_OP_RW flags values */
+enum virtio_vsock_rw {
+	VIRTIO_VSOCK_RW_EOR = 1,
+};
+
 #endif /* _UAPI_LINUX_VIRTIO_VSOCK_H */
diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
index 5956939eebb7..4572d01c8ea5 100644
--- a/net/vmw_vsock/virtio_transport_common.c
+++ b/net/vmw_vsock/virtio_transport_common.c
@@ -397,6 +397,126 @@ virtio_transport_stream_do_dequeue(struct vsock_sock *vsk,
 	return err;
 }
 
+static inline void virtio_transport_remove_pkt(struct virtio_vsock_pkt *pkt)
+{
+	list_del(&pkt->list);
+	virtio_transport_free_pkt(pkt);
+}
+
+static size_t virtio_transport_drop_until_seq_begin(struct virtio_vsock_sock *vvs)
+{
+	struct virtio_vsock_pkt *pkt, *n;
+	size_t bytes_dropped = 0;
+
+	list_for_each_entry_safe(pkt, n, &vvs->rx_queue, list) {
+		if (le16_to_cpu(pkt->hdr.op) == VIRTIO_VSOCK_OP_SEQ_BEGIN)
+			break;
+
+		bytes_dropped += le32_to_cpu(pkt->hdr.len);
+		virtio_transport_dec_rx_pkt(vvs, pkt);
+		virtio_transport_remove_pkt(pkt);
+	}
+
+	return bytes_dropped;
+}
+
+static int virtio_transport_seqpacket_do_dequeue(struct vsock_sock *vsk,
+						 struct msghdr *msg,
+						 bool *msg_ready)
+{
+	struct virtio_vsock_sock *vvs = vsk->trans;
+	struct virtio_vsock_pkt *pkt;
+	int err = 0;
+	size_t user_buf_len = msg->msg_iter.count;
+
+	*msg_ready = false;
+	spin_lock_bh(&vvs->rx_lock);
+
+	while (!*msg_ready && !list_empty(&vvs->rx_queue) && !err) {
+		pkt = list_first_entry(&vvs->rx_queue, struct virtio_vsock_pkt, list);
+
+		switch (le16_to_cpu(pkt->hdr.op)) {
+		case VIRTIO_VSOCK_OP_SEQ_BEGIN: {
+			/* Unexpected 'SEQ_BEGIN' during record copy:
+			 * Leave receive loop, 'EAGAIN' will restart it from
+			 * outer receive loop, packet is still in queue and
+			 * counters are cleared. So in next loop enter,
+			 * 'SEQ_BEGIN' will be dequeued first. User's iov
+			 * iterator will be reset in outer loop. Also
+			 * send credit update, because some bytes could be
+			 * copied. User will never see unfinished record.
+			 */
+			err = -EAGAIN;
+			break;
+		}
+		case VIRTIO_VSOCK_OP_SEQ_END: {
+			struct virtio_vsock_seq_hdr *seq_hdr;
+
+			seq_hdr = (struct virtio_vsock_seq_hdr *)pkt->buf;
+			/* First check that whole record is received. */
+
+			if (vvs->user_read_copied != vvs->user_read_seq_len ||
+			    (le32_to_cpu(seq_hdr->msg_cnt) - vvs->curr_rx_msg_cnt) != 1) {
+				/* Tail of current record and head of next missed,
+				 * so this EOR is from next record. Restart receive.
+				 * Current record will be dropped, next headless will
+				 * be dropped on next attempt to get record length.
+				 */
+				err = -EAGAIN;
+			} else {
+				/* Success. */
+				*msg_ready = true;
+			}
+
+			break;
+		}
+		case VIRTIO_VSOCK_OP_RW: {
+			size_t bytes_to_copy;
+			size_t pkt_len;
+
+			pkt_len = (size_t)le32_to_cpu(pkt->hdr.len);
+			bytes_to_copy = min(user_buf_len, pkt_len);
+
+			/* sk_lock is held by caller so no one else can dequeue.
+			 * Unlock rx_lock since memcpy_to_msg() may sleep.
+			 */
+			spin_unlock_bh(&vvs->rx_lock);
+
+			if (memcpy_to_msg(msg, pkt->buf, bytes_to_copy)) {
+				spin_lock_bh(&vvs->rx_lock);
+				err = -EINVAL;
+				break;
+			}
+
+			spin_lock_bh(&vvs->rx_lock);
+			user_buf_len -= bytes_to_copy;
+			vvs->user_read_copied += pkt_len;
+
+			if (le32_to_cpu(pkt->hdr.flags) & VIRTIO_VSOCK_RW_EOR)
+				msg->msg_flags |= MSG_EOR;
+			break;
+		}
+		default:
+			;
+		}
+
+		/* For unexpected 'SEQ_BEGIN', keep such packet in queue,
+		 * but drop any other type of packet.
+		 */
+		if (le16_to_cpu(pkt->hdr.op) != VIRTIO_VSOCK_OP_SEQ_BEGIN) {
+			virtio_transport_dec_rx_pkt(vvs, pkt);
+			virtio_transport_remove_pkt(pkt);
+		}
+	}
+
+	spin_unlock_bh(&vvs->rx_lock);
+
+	virtio_transport_send_credit_update(vsk, VIRTIO_VSOCK_TYPE_SEQPACKET,
+					    NULL);
+
+	return err;
+}
+
 ssize_t
 virtio_transport_stream_dequeue(struct vsock_sock *vsk,
 				struct msghdr *msg,
-- 
2.25.1



* [RFC PATCH v4 10/17] virtio/vsock: fetch length for SEQPACKET record
  2021-02-07 15:12 [RFC PATCH v4 00/17] virtio/vsock: introduce SOCK_SEQPACKET support Arseny Krasnov
                   ` (8 preceding siblings ...)
  2021-02-07 15:16 ` [RFC PATCH v4 09/17] virtio/vsock: dequeue callback for SOCK_SEQPACKET Arseny Krasnov
@ 2021-02-07 15:17 ` Arseny Krasnov
  2021-02-11 13:58     ` Stefano Garzarella
  2021-02-07 15:17 ` [RFC PATCH v4 11/17] virtio/vsock: add SEQPACKET receive logic Arseny Krasnov
                   ` (7 subsequent siblings)
  17 siblings, 1 reply; 61+ messages in thread
From: Arseny Krasnov @ 2021-02-07 15:17 UTC (permalink / raw)
  To: Stefan Hajnoczi, Stefano Garzarella, Michael S. Tsirkin,
	Jason Wang, David S. Miller, Jakub Kicinski, Arseny Krasnov,
	Jorgen Hansen, Colin Ian King, Andra Paraschiv,
	Jeff Vander Stoep
  Cc: kvm, virtualization, netdev, linux-kernel, stsp2, oxffffaa

This adds a transport callback which tries to fetch the record begin
marker from the socket's rx queue. It is called from af_vsock.c before
reading the data packets of a record.
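
For illustration only, decoding the two little-endian fields of the
SEQPACKET header from a raw payload buffer can be sketched in portable
C as below. The names are hypothetical; the kernel code casts pkt->buf
to struct virtio_vsock_seq_hdr and uses le32_to_cpu() instead.

```c
#include <stdint.h>

/* Host-endian copy of the header fields (hypothetical type). */
struct seq_hdr_host {
	uint32_t msg_cnt;
	uint32_t msg_len;
};

/* Read a 32-bit little-endian value byte by byte. */
static uint32_t load_le32(const uint8_t *p)
{
	return (uint32_t)p[0] |
	       ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) |
	       ((uint32_t)p[3] << 24);
}

/* 'buf' must hold at least 8 bytes: msg_cnt followed by msg_len. */
static struct seq_hdr_host parse_seq_hdr(const uint8_t *buf)
{
	struct seq_hdr_host h;

	h.msg_cnt = load_le32(buf);
	h.msg_len = load_le32(buf + 4);

	return h;
}
```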

Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
---
 include/linux/virtio_vsock.h            |  1 +
 net/vmw_vsock/virtio_transport_common.c | 40 +++++++++++++++++++++++++
 2 files changed, 41 insertions(+)

diff --git a/include/linux/virtio_vsock.h b/include/linux/virtio_vsock.h
index 4d0de3dee9a4..a5e8681bfc6a 100644
--- a/include/linux/virtio_vsock.h
+++ b/include/linux/virtio_vsock.h
@@ -85,6 +85,7 @@ virtio_transport_dgram_dequeue(struct vsock_sock *vsk,
 			       struct msghdr *msg,
 			       size_t len, int flags);
 
+size_t virtio_transport_seqpacket_seq_get_len(struct vsock_sock *vsk);
 s64 virtio_transport_stream_has_data(struct vsock_sock *vsk);
 s64 virtio_transport_stream_has_space(struct vsock_sock *vsk);
 
diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
index 4572d01c8ea5..7ac552bfd90b 100644
--- a/net/vmw_vsock/virtio_transport_common.c
+++ b/net/vmw_vsock/virtio_transport_common.c
@@ -420,6 +420,46 @@ static size_t virtio_transport_drop_until_seq_begin(struct virtio_vsock_sock *vv
 	return bytes_dropped;
 }
 
+size_t virtio_transport_seqpacket_seq_get_len(struct vsock_sock *vsk)
+{
+	struct virtio_vsock_seq_hdr *seq_hdr;
+	struct virtio_vsock_sock *vvs;
+	struct virtio_vsock_pkt *pkt;
+	size_t bytes_dropped;
+
+	vvs = vsk->trans;
+
+	spin_lock_bh(&vvs->rx_lock);
+
+	/* Drop all orphaned 'RW' packets and
+	 * send credit update.
+	 */
+	bytes_dropped = virtio_transport_drop_until_seq_begin(vvs);
+
+	if (list_empty(&vvs->rx_queue))
+		goto out;
+
+	pkt = list_first_entry(&vvs->rx_queue, struct virtio_vsock_pkt, list);
+
+	vvs->user_read_copied = 0;
+
+	seq_hdr = (struct virtio_vsock_seq_hdr *)pkt->buf;
+	vvs->user_read_seq_len = le32_to_cpu(seq_hdr->msg_len);
+	vvs->curr_rx_msg_cnt = le32_to_cpu(seq_hdr->msg_cnt);
+	virtio_transport_dec_rx_pkt(vvs, pkt);
+	virtio_transport_remove_pkt(pkt);
+out:
+	spin_unlock_bh(&vvs->rx_lock);
+
+	if (bytes_dropped)
+		virtio_transport_send_credit_update(vsk,
+						    VIRTIO_VSOCK_TYPE_SEQPACKET,
+						    NULL);
+
+	return vvs->user_read_seq_len;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_seqpacket_seq_get_len);
+
 static int virtio_transport_seqpacket_do_dequeue(struct vsock_sock *vsk,
 						 struct msghdr *msg,
 						 bool *msg_ready)
-- 
2.25.1



* [RFC PATCH v4 11/17] virtio/vsock: add SEQPACKET receive logic
  2021-02-07 15:12 [RFC PATCH v4 00/17] virtio/vsock: introduce SOCK_SEQPACKET support Arseny Krasnov
                   ` (9 preceding siblings ...)
  2021-02-07 15:17 ` [RFC PATCH v4 10/17] virtio/vsock: fetch length for SEQPACKET record Arseny Krasnov
@ 2021-02-07 15:17 ` Arseny Krasnov
  2021-02-07 15:17 ` [RFC PATCH v4 12/17] virtio/vsock: rest of SOCK_SEQPACKET support Arseny Krasnov
                   ` (6 subsequent siblings)
  17 siblings, 0 replies; 61+ messages in thread
From: Arseny Krasnov @ 2021-02-07 15:17 UTC (permalink / raw)
  To: Stefan Hajnoczi, Stefano Garzarella, Michael S. Tsirkin,
	Jason Wang, David S. Miller, Jakub Kicinski, Arseny Krasnov,
	Jorgen Hansen, Andra Paraschiv, Colin Ian King, Alexander Popov
  Cc: kvm, virtualization, netdev, linux-kernel, stsp2, oxffffaa

This modifies the current receive logic for SEQPACKET support:
1) Inserts 'SEQ_BEGIN' packets into the socket's rx queue.
2) Inserts 'RW' packets into the socket's rx queue, but without merging
   them into the buffer of the last packet in the queue.
3) Checks packet and socket types on receive (on mismatch, the
   connection is reset).
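
For illustration only, the type check from 3) can be modelled in
userspace as below. The enum and function names are hypothetical; the
numeric values follow VIRTIO_VSOCK_TYPE_STREAM and
VIRTIO_VSOCK_TYPE_SEQPACKET from this series.

```c
#include <stdbool.h>
#include <stdint.h>
#include <sys/socket.h>

/* Wire values as defined in enum virtio_vsock_type by this series. */
enum { WIRE_TYPE_STREAM = 1, WIRE_TYPE_SEQPACKET = 2 };

static bool wire_type_valid(uint16_t type)
{
	return type == WIRE_TYPE_STREAM || type == WIRE_TYPE_SEQPACKET;
}

/* Anything that is not SOCK_STREAM is treated as SEQPACKET here. */
static uint16_t wire_type_for_socket(int sk_type)
{
	return sk_type == SOCK_STREAM ? WIRE_TYPE_STREAM :
					WIRE_TYPE_SEQPACKET;
}

/* A packet is accepted only if its type is known and matches the
 * type of the receiving socket; otherwise the connection is reset.
 */
static bool pkt_matches_socket(int sk_type, uint16_t pkt_type)
{
	return wire_type_valid(pkt_type) &&
	       wire_type_for_socket(sk_type) == pkt_type;
}
```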

Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
---
 net/vmw_vsock/virtio_transport_common.c | 63 +++++++++++++++++--------
 1 file changed, 44 insertions(+), 19 deletions(-)

diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
index 7ac552bfd90b..51b66f8dd7c7 100644
--- a/net/vmw_vsock/virtio_transport_common.c
+++ b/net/vmw_vsock/virtio_transport_common.c
@@ -397,6 +397,14 @@ virtio_transport_stream_do_dequeue(struct vsock_sock *vsk,
 	return err;
 }
 
+static u16 virtio_transport_get_type(struct sock *sk)
+{
+	if (sk->sk_type == SOCK_STREAM)
+		return VIRTIO_VSOCK_TYPE_STREAM;
+	else
+		return VIRTIO_VSOCK_TYPE_SEQPACKET;
+}
+
 static inline void virtio_transport_remove_pkt(struct virtio_vsock_pkt *pkt)
 {
 	list_del(&pkt->list);
@@ -1062,25 +1070,27 @@ virtio_transport_recv_enqueue(struct vsock_sock *vsk,
 		goto out;
 	}
 
-	/* Try to copy small packets into the buffer of last packet queued,
-	 * to avoid wasting memory queueing the entire buffer with a small
-	 * payload.
-	 */
-	if (pkt->len <= GOOD_COPY_LEN && !list_empty(&vvs->rx_queue)) {
-		struct virtio_vsock_pkt *last_pkt;
+	if (le16_to_cpu(pkt->hdr.type) == VIRTIO_VSOCK_TYPE_STREAM) {
+		/* Try to copy small packets into the buffer of last packet queued,
+		 * to avoid wasting memory queueing the entire buffer with a small
+		 * payload.
+		 */
+		if (pkt->len <= GOOD_COPY_LEN && !list_empty(&vvs->rx_queue)) {
+			struct virtio_vsock_pkt *last_pkt;
 
-		last_pkt = list_last_entry(&vvs->rx_queue,
-					   struct virtio_vsock_pkt, list);
+			last_pkt = list_last_entry(&vvs->rx_queue,
+						   struct virtio_vsock_pkt, list);
 
-		/* If there is space in the last packet queued, we copy the
-		 * new packet in its buffer.
-		 */
-		if (pkt->len <= last_pkt->buf_len - last_pkt->len) {
-			memcpy(last_pkt->buf + last_pkt->len, pkt->buf,
-			       pkt->len);
-			last_pkt->len += pkt->len;
-			free_pkt = true;
-			goto out;
+			/* If there is space in the last packet queued, we copy the
+			 * new packet in its buffer.
+			 */
+			if (pkt->len <= last_pkt->buf_len - last_pkt->len) {
+				memcpy(last_pkt->buf + last_pkt->len, pkt->buf,
+				       pkt->len);
+				last_pkt->len += pkt->len;
+				free_pkt = true;
+				goto out;
+			}
 		}
 	}
 
@@ -1100,9 +1110,13 @@ virtio_transport_recv_connected(struct sock *sk,
 	int err = 0;
 
 	switch (le16_to_cpu(pkt->hdr.op)) {
+	case VIRTIO_VSOCK_OP_SEQ_BEGIN:
+	case VIRTIO_VSOCK_OP_SEQ_END:
 	case VIRTIO_VSOCK_OP_RW:
 		virtio_transport_recv_enqueue(vsk, pkt);
-		sk->sk_data_ready(sk);
+
+		if (le16_to_cpu(pkt->hdr.op) != VIRTIO_VSOCK_OP_SEQ_BEGIN)
+			sk->sk_data_ready(sk);
 		return err;
 	case VIRTIO_VSOCK_OP_CREDIT_UPDATE:
 		sk->sk_write_space(sk);
@@ -1246,6 +1260,12 @@ virtio_transport_recv_listen(struct sock *sk, struct virtio_vsock_pkt *pkt,
 	return 0;
 }
 
+static bool virtio_transport_valid_type(u16 type)
+{
+	return (type == VIRTIO_VSOCK_TYPE_STREAM) ||
+	       (type == VIRTIO_VSOCK_TYPE_SEQPACKET);
+}
+
 /* We are under the virtio-vsock's vsock->rx_lock or vhost-vsock's vq->mutex
  * lock.
  */
@@ -1271,7 +1291,7 @@ void virtio_transport_recv_pkt(struct virtio_transport *t,
 					le32_to_cpu(pkt->hdr.buf_alloc),
 					le32_to_cpu(pkt->hdr.fwd_cnt));
 
-	if (le16_to_cpu(pkt->hdr.type) != VIRTIO_VSOCK_TYPE_STREAM) {
+	if (!virtio_transport_valid_type(le16_to_cpu(pkt->hdr.type))) {
 		(void)virtio_transport_reset_no_sock(t, pkt);
 		goto free_pkt;
 	}
@@ -1288,6 +1308,11 @@ void virtio_transport_recv_pkt(struct virtio_transport *t,
 		}
 	}
 
+	if (virtio_transport_get_type(sk) != le16_to_cpu(pkt->hdr.type)) {
+		(void)virtio_transport_reset_no_sock(t, pkt);
+		goto free_pkt;
+	}
+
 	vsk = vsock_sk(sk);
 
 	space_available = virtio_transport_space_update(sk, pkt);
-- 
2.25.1



* [RFC PATCH v4 12/17] virtio/vsock: rest of SOCK_SEQPACKET support
  2021-02-07 15:12 [RFC PATCH v4 00/17] virtio/vsock: introduce SOCK_SEQPACKET support Arseny Krasnov
                   ` (10 preceding siblings ...)
  2021-02-07 15:17 ` [RFC PATCH v4 11/17] virtio/vsock: add SEQPACKET receive logic Arseny Krasnov
@ 2021-02-07 15:17 ` Arseny Krasnov
  2021-02-09  4:34   ` kernel test robot
                     ` (2 more replies)
  2021-02-07 15:18 ` [RFC PATCH v4 13/17] virtio/vsock: setup SEQPACKET ops for transport Arseny Krasnov
                   ` (5 subsequent siblings)
  17 siblings, 3 replies; 61+ messages in thread
From: Arseny Krasnov @ 2021-02-07 15:17 UTC (permalink / raw)
  To: Stefan Hajnoczi, Stefano Garzarella, Michael S. Tsirkin,
	Jason Wang, David S. Miller, Jakub Kicinski, Arseny Krasnov,
	Jorgen Hansen, Colin Ian King, Andra Paraschiv, Alexander Popov
  Cc: kvm, virtualization, netdev, linux-kernel, stsp2, oxffffaa

This adds the rest of the logic for SEQPACKET:
1) The packet's type is now set in 'virtio_transport_send_pkt_info()'
   using the type of the socket.
2) Adds SEQPACKET specific functions which send SEQ_BEGIN/SEQ_END.
   Note that both functions may sleep while waiting for enough space
   for the SEQPACKET header.
3) Adds SEQ_BEGIN/SEQ_END to TAP packet capture.
4) Sends SHUTDOWN on socket close for the SEQPACKET type.
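
For illustration only, the counter handling described in the cover
letter (SEQ_END carries the counter of SEQ_BEGIN + 1, and the counter
is incremented again after SEQ_END) can be modelled as below. The
names are hypothetical, and setting msg_len to 0 in SEQ_END is an
assumption of this sketch, not something stated in the series.

```c
#include <stdint.h>

/* Hypothetical stand-in for the sender's 'next_tx_msg_cnt' field. */
struct tx_counter_model {
	uint32_t next_tx_msg_cnt;
};

/* Host-endian view of the header carried by a marker packet. */
struct marker_model {
	uint32_t msg_cnt;
	uint32_t msg_len;
};

/* SEQ_BEGIN carries the current counter and the record length;
 * the counter is incremented after the marker is built.
 */
static struct marker_model make_seq_begin(struct tx_counter_model *tx,
					  uint32_t record_len)
{
	struct marker_model m = { tx->next_tx_msg_cnt++, record_len };

	return m;
}

/* SEQ_END carries the incremented counter; msg_len is assumed 0. */
static struct marker_model make_seq_end(struct tx_counter_model *tx)
{
	struct marker_model m = { tx->next_tx_msg_cnt++, 0 };

	return m;
}
```

A receiver can then detect a lost marker by checking that the counter
of SEQ_END minus the counter of SEQ_BEGIN equals 1.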

Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
---
 include/linux/virtio_vsock.h            |  9 +++
 net/vmw_vsock/virtio_transport_common.c | 99 +++++++++++++++++++++----
 2 files changed, 95 insertions(+), 13 deletions(-)

diff --git a/include/linux/virtio_vsock.h b/include/linux/virtio_vsock.h
index a5e8681bfc6a..c4a39424686d 100644
--- a/include/linux/virtio_vsock.h
+++ b/include/linux/virtio_vsock.h
@@ -41,6 +41,7 @@ struct virtio_vsock_sock {
 	u32 user_read_seq_len;
 	u32 user_read_copied;
 	u32 curr_rx_msg_cnt;
+	u32 next_tx_msg_cnt;
 };
 
 struct virtio_vsock_pkt {
@@ -85,7 +86,15 @@ virtio_transport_dgram_dequeue(struct vsock_sock *vsk,
 			       struct msghdr *msg,
 			       size_t len, int flags);
 
+int virtio_transport_seqpacket_seq_send_len(struct vsock_sock *vsk, size_t len, int flags);
+int virtio_transport_seqpacket_seq_send_eor(struct vsock_sock *vsk, int flags);
 size_t virtio_transport_seqpacket_seq_get_len(struct vsock_sock *vsk);
+int
+virtio_transport_seqpacket_dequeue(struct vsock_sock *vsk,
+				   struct msghdr *msg,
+				   int flags,
+				   bool *msg_ready);
+
 s64 virtio_transport_stream_has_data(struct vsock_sock *vsk);
 s64 virtio_transport_stream_has_space(struct vsock_sock *vsk);
 
diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
index 51b66f8dd7c7..0aa0fd33e9d6 100644
--- a/net/vmw_vsock/virtio_transport_common.c
+++ b/net/vmw_vsock/virtio_transport_common.c
@@ -139,6 +139,8 @@ static struct sk_buff *virtio_transport_build_skb(void *opaque)
 		break;
 	case VIRTIO_VSOCK_OP_CREDIT_UPDATE:
 	case VIRTIO_VSOCK_OP_CREDIT_REQUEST:
+	case VIRTIO_VSOCK_OP_SEQ_BEGIN:
+	case VIRTIO_VSOCK_OP_SEQ_END:
 		hdr->op = cpu_to_le16(AF_VSOCK_OP_CONTROL);
 		break;
 	default:
@@ -165,6 +167,14 @@ void virtio_transport_deliver_tap_pkt(struct virtio_vsock_pkt *pkt)
 }
 EXPORT_SYMBOL_GPL(virtio_transport_deliver_tap_pkt);
 
+static u16 virtio_transport_get_type(struct sock *sk)
+{
+	if (sk->sk_type == SOCK_STREAM)
+		return VIRTIO_VSOCK_TYPE_STREAM;
+	else
+		return VIRTIO_VSOCK_TYPE_SEQPACKET;
+}
+
 /* This function can only be used on connecting/connected sockets,
  * since a socket assigned to a transport is required.
  *
@@ -179,6 +189,13 @@ static int virtio_transport_send_pkt_info(struct vsock_sock *vsk,
 	struct virtio_vsock_pkt *pkt;
 	u32 pkt_len = info->pkt_len;
 
+	info->type = virtio_transport_get_type(sk_vsock(vsk));
+
+	if (info->type == VIRTIO_VSOCK_TYPE_SEQPACKET &&
+	    info->msg &&
+	    info->msg->msg_flags & MSG_EOR)
+		info->flags |= VIRTIO_VSOCK_RW_EOR;
+
 	t_ops = virtio_transport_get_ops(vsk);
 	if (unlikely(!t_ops))
 		return -EFAULT;
@@ -397,13 +414,61 @@ virtio_transport_stream_do_dequeue(struct vsock_sock *vsk,
 	return err;
 }
 
-static u16 virtio_transport_get_type(struct sock *sk)
+static int virtio_transport_seqpacket_send_ctrl(struct vsock_sock *vsk,
+						int type,
+						size_t len,
+						int flags)
 {
-	if (sk->sk_type == SOCK_STREAM)
-		return VIRTIO_VSOCK_TYPE_STREAM;
-	else
-		return VIRTIO_VSOCK_TYPE_SEQPACKET;
+	struct virtio_vsock_sock *vvs = vsk->trans;
+	struct virtio_vsock_pkt_info info = {
+		.op = type,
+		.vsk = vsk,
+		.pkt_len = sizeof(struct virtio_vsock_seq_hdr)
+	};
+
+	struct virtio_vsock_seq_hdr seq_hdr = {
+		.msg_cnt = cpu_to_le32(vvs->next_tx_msg_cnt),
+		.msg_len = cpu_to_le32(len)
+	};
+
+	struct kvec seq_hdr_kiov = {
+		.iov_base = (void *)&seq_hdr,
+		.iov_len = sizeof(struct virtio_vsock_seq_hdr)
+	};
+
+	struct msghdr msg = {0};
+
+	//XXX: do we need 'vsock_transport_send_notify_data' pointer?
+	if (vsock_wait_space(sk_vsock(vsk),
+			     sizeof(struct virtio_vsock_seq_hdr),
+			     flags, NULL))
+		return -1;
+
+	iov_iter_kvec(&msg.msg_iter, WRITE, &seq_hdr_kiov, 1, sizeof(seq_hdr));
+
+	info.msg = &msg;
+	vvs->next_tx_msg_cnt++;
+
+	return virtio_transport_send_pkt_info(vsk, &info);
+}
+
+int virtio_transport_seqpacket_seq_send_len(struct vsock_sock *vsk, size_t len, int flags)
+{
+	return virtio_transport_seqpacket_send_ctrl(vsk,
+						    VIRTIO_VSOCK_OP_SEQ_BEGIN,
+						    len,
+						    flags);
 }
+EXPORT_SYMBOL_GPL(virtio_transport_seqpacket_seq_send_len);
+
+int virtio_transport_seqpacket_seq_send_eor(struct vsock_sock *vsk, int flags)
+{
+	return virtio_transport_seqpacket_send_ctrl(vsk,
+						    VIRTIO_VSOCK_OP_SEQ_END,
+						    0,
+						    flags);
+}
+EXPORT_SYMBOL_GPL(virtio_transport_seqpacket_seq_send_eor);
 
 static inline void virtio_transport_remove_pkt(struct virtio_vsock_pkt *pkt)
 {
@@ -577,6 +642,18 @@ virtio_transport_stream_dequeue(struct vsock_sock *vsk,
 }
 EXPORT_SYMBOL_GPL(virtio_transport_stream_dequeue);
 
+int
+virtio_transport_seqpacket_dequeue(struct vsock_sock *vsk,
+				   struct msghdr *msg,
+				   int flags, bool *msg_ready)
+{
+	if (flags & MSG_PEEK)
+		return -EOPNOTSUPP;
+
+	return virtio_transport_seqpacket_do_dequeue(vsk, msg, msg_ready);
+}
+EXPORT_SYMBOL_GPL(virtio_transport_seqpacket_dequeue);
+
 int
 virtio_transport_dgram_dequeue(struct vsock_sock *vsk,
 			       struct msghdr *msg,
@@ -658,14 +735,15 @@ EXPORT_SYMBOL_GPL(virtio_transport_do_socket_init);
 void virtio_transport_notify_buffer_size(struct vsock_sock *vsk, u64 *val)
 {
 	struct virtio_vsock_sock *vvs = vsk->trans;
+	int type;
 
 	if (*val > VIRTIO_VSOCK_MAX_BUF_SIZE)
 		*val = VIRTIO_VSOCK_MAX_BUF_SIZE;
 
 	vvs->buf_alloc = *val;
 
-	virtio_transport_send_credit_update(vsk, VIRTIO_VSOCK_TYPE_STREAM,
-					    NULL);
+	type = virtio_transport_get_type(sk_vsock(vsk));
+	virtio_transport_send_credit_update(vsk, type, NULL);
 }
 EXPORT_SYMBOL_GPL(virtio_transport_notify_buffer_size);
 
@@ -792,7 +870,6 @@ int virtio_transport_connect(struct vsock_sock *vsk)
 {
 	struct virtio_vsock_pkt_info info = {
 		.op = VIRTIO_VSOCK_OP_REQUEST,
-		.type = VIRTIO_VSOCK_TYPE_STREAM,
 		.vsk = vsk,
 	};
 
@@ -804,7 +881,6 @@ int virtio_transport_shutdown(struct vsock_sock *vsk, int mode)
 {
 	struct virtio_vsock_pkt_info info = {
 		.op = VIRTIO_VSOCK_OP_SHUTDOWN,
-		.type = VIRTIO_VSOCK_TYPE_STREAM,
 		.flags = (mode & RCV_SHUTDOWN ?
 			  VIRTIO_VSOCK_SHUTDOWN_RCV : 0) |
 			 (mode & SEND_SHUTDOWN ?
@@ -833,7 +909,6 @@ virtio_transport_stream_enqueue(struct vsock_sock *vsk,
 {
 	struct virtio_vsock_pkt_info info = {
 		.op = VIRTIO_VSOCK_OP_RW,
-		.type = VIRTIO_VSOCK_TYPE_STREAM,
 		.msg = msg,
 		.pkt_len = len,
 		.vsk = vsk,
@@ -856,7 +931,6 @@ static int virtio_transport_reset(struct vsock_sock *vsk,
 {
 	struct virtio_vsock_pkt_info info = {
 		.op = VIRTIO_VSOCK_OP_RST,
-		.type = VIRTIO_VSOCK_TYPE_STREAM,
 		.reply = !!pkt,
 		.vsk = vsk,
 	};
@@ -1001,7 +1075,7 @@ void virtio_transport_release(struct vsock_sock *vsk)
 	struct sock *sk = &vsk->sk;
 	bool remove_sock = true;
 
-	if (sk->sk_type == SOCK_STREAM)
+	if (sk->sk_type == SOCK_STREAM || sk->sk_type == SOCK_SEQPACKET)
 		remove_sock = virtio_transport_close(vsk);
 
 	list_for_each_entry_safe(pkt, tmp, &vvs->rx_queue, list) {
@@ -1164,7 +1238,6 @@ virtio_transport_send_response(struct vsock_sock *vsk,
 {
 	struct virtio_vsock_pkt_info info = {
 		.op = VIRTIO_VSOCK_OP_RESPONSE,
-		.type = VIRTIO_VSOCK_TYPE_STREAM,
 		.remote_cid = le64_to_cpu(pkt->hdr.src_cid),
 		.remote_port = le32_to_cpu(pkt->hdr.src_port),
 		.reply = true,
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [RFC PATCH v4 13/17] virtio/vsock: setup SEQPACKET ops for transport
  2021-02-07 15:12 [RFC PATCH v4 00/17] virtio/vsock: introduce SOCK_SEQPACKET support Arseny Krasnov
                   ` (11 preceding siblings ...)
  2021-02-07 15:17 ` [RFC PATCH v4 12/17] virtio/vsock: rest of SOCK_SEQPACKET support Arseny Krasnov
@ 2021-02-07 15:18 ` Arseny Krasnov
  2021-02-07 15:18 ` [RFC PATCH v4 14/17] vhost/vsock: " Arseny Krasnov
                   ` (4 subsequent siblings)
  17 siblings, 0 replies; 61+ messages in thread
From: Arseny Krasnov @ 2021-02-07 15:18 UTC (permalink / raw)
  To: Stefan Hajnoczi, Stefano Garzarella, Michael S. Tsirkin,
	Jason Wang, David S. Miller, Jakub Kicinski, Arseny Krasnov,
	Jorgen Hansen, Colin Ian King, Andra Paraschiv,
	Jeff Vander Stoep
  Cc: kvm, virtualization, netdev, linux-kernel, stsp2, oxffffaa

This adds SEQPACKET ops for the virtio transport.

Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
---
 net/vmw_vsock/virtio_transport.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
index 2700a63ab095..bd3a854bb366 100644
--- a/net/vmw_vsock/virtio_transport.c
+++ b/net/vmw_vsock/virtio_transport.c
@@ -469,6 +469,11 @@ static struct virtio_transport virtio_transport = {
 		.stream_is_active         = virtio_transport_stream_is_active,
 		.stream_allow             = virtio_transport_stream_allow,
 
+		.seqpacket_seq_send_len	  = virtio_transport_seqpacket_seq_send_len,
+		.seqpacket_seq_send_eor	  = virtio_transport_seqpacket_seq_send_eor,
+		.seqpacket_seq_get_len	  = virtio_transport_seqpacket_seq_get_len,
+		.seqpacket_dequeue        = virtio_transport_seqpacket_dequeue,
+
 		.notify_poll_in           = virtio_transport_notify_poll_in,
 		.notify_poll_out          = virtio_transport_notify_poll_out,
 		.notify_recv_init         = virtio_transport_notify_recv_init,
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [RFC PATCH v4 14/17] vhost/vsock: setup SEQPACKET ops for transport
  2021-02-07 15:12 [RFC PATCH v4 00/17] virtio/vsock: introduce SOCK_SEQPACKET support Arseny Krasnov
                   ` (12 preceding siblings ...)
  2021-02-07 15:18 ` [RFC PATCH v4 13/17] virtio/vsock: setup SEQPACKET ops for transport Arseny Krasnov
@ 2021-02-07 15:18 ` Arseny Krasnov
  2021-02-07 15:18 ` [RFC PATCH v4 15/17] vsock_test: add SOCK_SEQPACKET tests Arseny Krasnov
                   ` (3 subsequent siblings)
  17 siblings, 0 replies; 61+ messages in thread
From: Arseny Krasnov @ 2021-02-07 15:18 UTC (permalink / raw)
  To: Stefan Hajnoczi, Stefano Garzarella, Michael S. Tsirkin,
	Jason Wang, David S. Miller, Jakub Kicinski, Arseny Krasnov,
	Jorgen Hansen, Andra Paraschiv, Colin Ian King,
	Jeff Vander Stoep
  Cc: kvm, virtualization, netdev, linux-kernel, stsp2, oxffffaa

This also removes the check that ignored packets of non-stream type.

Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
---
 drivers/vhost/vsock.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
index 5e78fb719602..5c86d09e36d9 100644
--- a/drivers/vhost/vsock.c
+++ b/drivers/vhost/vsock.c
@@ -354,8 +354,7 @@ vhost_vsock_alloc_pkt(struct vhost_virtqueue *vq,
 		return NULL;
 	}
 
-	if (le16_to_cpu(pkt->hdr.type) == VIRTIO_VSOCK_TYPE_STREAM)
-		pkt->len = le32_to_cpu(pkt->hdr.len);
+	pkt->len = le32_to_cpu(pkt->hdr.len);
 
 	/* No payload */
 	if (!pkt->len)
@@ -424,6 +423,11 @@ static struct virtio_transport vhost_transport = {
 		.stream_is_active         = virtio_transport_stream_is_active,
 		.stream_allow             = virtio_transport_stream_allow,
 
+		.seqpacket_seq_send_len	  = virtio_transport_seqpacket_seq_send_len,
+		.seqpacket_seq_send_eor	  = virtio_transport_seqpacket_seq_send_eor,
+		.seqpacket_seq_get_len	  = virtio_transport_seqpacket_seq_get_len,
+		.seqpacket_dequeue        = virtio_transport_seqpacket_dequeue,
+
 		.notify_poll_in           = virtio_transport_notify_poll_in,
 		.notify_poll_out          = virtio_transport_notify_poll_out,
 		.notify_recv_init         = virtio_transport_notify_recv_init,
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [RFC PATCH v4 15/17] vsock_test: add SOCK_SEQPACKET tests
  2021-02-07 15:12 [RFC PATCH v4 00/17] virtio/vsock: introduce SOCK_SEQPACKET support Arseny Krasnov
                   ` (13 preceding siblings ...)
  2021-02-07 15:18 ` [RFC PATCH v4 14/17] vhost/vsock: " Arseny Krasnov
@ 2021-02-07 15:18 ` Arseny Krasnov
  2021-02-07 15:18 ` [RFC PATCH v4 16/17] loopback/vsock: setup SEQPACKET ops for transport Arseny Krasnov
                   ` (2 subsequent siblings)
  17 siblings, 0 replies; 61+ messages in thread
From: Arseny Krasnov @ 2021-02-07 15:18 UTC (permalink / raw)
  To: Stefan Hajnoczi, Stefano Garzarella, Michael S. Tsirkin,
	Jason Wang, David S. Miller, Jakub Kicinski, Arseny Krasnov,
	Jorgen Hansen, Colin Ian King, Andra Paraschiv, Alexander Popov
  Cc: kvm, virtualization, netdev, linux-kernel, stsp2, oxffffaa

This adds two tests for SOCK_SEQPACKET sockets: both transfer data and
then check the MSG_EOR and MSG_TRUNC flags. Cases for connect(), bind(),
etc. are not tested, because they are the same as for stream sockets.

Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
---
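Note: the MSG_TRUNC behaviour exercised by the second test can be tried
without a vsock-capable setup. The sketch below is only an illustration
and makes one loud assumption: it uses an AF_UNIX SOCK_SEQPACKET
socketpair as a stand-in for AF_VSOCK, relying on Linux returning the real
record length from recvmsg() when MSG_TRUNC is passed on such sockets. The
helper name 'demo_seqpacket_trunc' is hypothetical and is not part of
tools/testing/vsock.

```c
#include <stdbool.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Send one record of 'send_len' bytes over a SOCK_SEQPACKET socketpair and
 * receive it into a buffer of 'recv_len' bytes with MSG_TRUNC.
 * Returns the recvmsg() result (or -1 on setup failure) and reports
 * whether the kernel set MSG_TRUNC in msg_flags.
 */
ssize_t demo_seqpacket_trunc(size_t send_len, size_t recv_len,
			     bool *truncated)
{
	char tx[64];
	char rx[64];
	struct msghdr msg = {0};
	struct iovec iov = {0};
	ssize_t ret = -1;
	int sv[2];

	if (send_len > sizeof(tx) || recv_len > sizeof(rx))
		return -1;

	/* AF_UNIX is an assumption here, standing in for AF_VSOCK. */
	if (socketpair(AF_UNIX, SOCK_SEQPACKET, 0, sv) < 0)
		return -1;

	memset(tx, 'A', sizeof(tx));
	if (send(sv[0], tx, send_len, 0) != (ssize_t)send_len)
		goto out;

	iov.iov_base = rx;
	iov.iov_len = recv_len;
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;

	/* With MSG_TRUNC the return value is the full record length. */
	ret = recvmsg(sv[1], &msg, MSG_TRUNC);
	if (ret >= 0)
		*truncated = !!(msg.msg_flags & MSG_TRUNC);
out:
	close(sv[0]);
	close(sv[1]);
	return ret;
}
```

Calling demo_seqpacket_trunc(32, 16, &t) mirrors the sizes used by
test_seqpacket_msg_trunc_server(): a 32-byte record read into a 16-byte
buffer.
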
 tools/testing/vsock/util.c       |  32 ++++++--
 tools/testing/vsock/util.h       |   3 +
 tools/testing/vsock/vsock_test.c | 126 +++++++++++++++++++++++++++++++
 3 files changed, 156 insertions(+), 5 deletions(-)

diff --git a/tools/testing/vsock/util.c b/tools/testing/vsock/util.c
index 93cbd6f603f9..2acbb7703c6a 100644
--- a/tools/testing/vsock/util.c
+++ b/tools/testing/vsock/util.c
@@ -84,7 +84,7 @@ void vsock_wait_remote_close(int fd)
 }
 
 /* Connect to <cid, port> and return the file descriptor. */
-int vsock_stream_connect(unsigned int cid, unsigned int port)
+static int vsock_connect(unsigned int cid, unsigned int port, int type)
 {
 	union {
 		struct sockaddr sa;
@@ -101,7 +101,7 @@ int vsock_stream_connect(unsigned int cid, unsigned int port)
 
 	control_expectln("LISTENING");
 
-	fd = socket(AF_VSOCK, SOCK_STREAM, 0);
+	fd = socket(AF_VSOCK, type, 0);
 
 	timeout_begin(TIMEOUT);
 	do {
@@ -120,11 +120,21 @@ int vsock_stream_connect(unsigned int cid, unsigned int port)
 	return fd;
 }
 
+int vsock_stream_connect(unsigned int cid, unsigned int port)
+{
+	return vsock_connect(cid, port, SOCK_STREAM);
+}
+
+int vsock_seqpacket_connect(unsigned int cid, unsigned int port)
+{
+	return vsock_connect(cid, port, SOCK_SEQPACKET);
+}
+
 /* Listen on <cid, port> and return the first incoming connection.  The remote
  * address is stored to clientaddrp.  clientaddrp may be NULL.
  */
-int vsock_stream_accept(unsigned int cid, unsigned int port,
-			struct sockaddr_vm *clientaddrp)
+static int vsock_accept(unsigned int cid, unsigned int port,
+			struct sockaddr_vm *clientaddrp, int type)
 {
 	union {
 		struct sockaddr sa;
@@ -145,7 +155,7 @@ int vsock_stream_accept(unsigned int cid, unsigned int port,
 	int client_fd;
 	int old_errno;
 
-	fd = socket(AF_VSOCK, SOCK_STREAM, 0);
+	fd = socket(AF_VSOCK, type, 0);
 
 	if (bind(fd, &addr.sa, sizeof(addr.svm)) < 0) {
 		perror("bind");
@@ -189,6 +199,18 @@ int vsock_stream_accept(unsigned int cid, unsigned int port,
 	return client_fd;
 }
 
+int vsock_stream_accept(unsigned int cid, unsigned int port,
+			struct sockaddr_vm *clientaddrp)
+{
+	return vsock_accept(cid, port, clientaddrp, SOCK_STREAM);
+}
+
+int vsock_seqpacket_accept(unsigned int cid, unsigned int port,
+			   struct sockaddr_vm *clientaddrp)
+{
+	return vsock_accept(cid, port, clientaddrp, SOCK_SEQPACKET);
+}
+
 /* Transmit one byte and check the return value.
  *
  * expected_ret:
diff --git a/tools/testing/vsock/util.h b/tools/testing/vsock/util.h
index e53dd09d26d9..a3375ad2fb7f 100644
--- a/tools/testing/vsock/util.h
+++ b/tools/testing/vsock/util.h
@@ -36,8 +36,11 @@ struct test_case {
 void init_signals(void);
 unsigned int parse_cid(const char *str);
 int vsock_stream_connect(unsigned int cid, unsigned int port);
+int vsock_seqpacket_connect(unsigned int cid, unsigned int port);
 int vsock_stream_accept(unsigned int cid, unsigned int port,
 			struct sockaddr_vm *clientaddrp);
+int vsock_seqpacket_accept(unsigned int cid, unsigned int port,
+			   struct sockaddr_vm *clientaddrp);
 void vsock_wait_remote_close(int fd);
 void send_byte(int fd, int expected_ret, int flags);
 void recv_byte(int fd, int expected_ret, int flags);
diff --git a/tools/testing/vsock/vsock_test.c b/tools/testing/vsock/vsock_test.c
index 5a4fb80fa832..5fca9be5b1dd 100644
--- a/tools/testing/vsock/vsock_test.c
+++ b/tools/testing/vsock/vsock_test.c
@@ -14,6 +14,8 @@
 #include <errno.h>
 #include <unistd.h>
 #include <linux/kernel.h>
+#include <sys/types.h>
+#include <sys/socket.h>
 
 #include "timeout.h"
 #include "control.h"
@@ -279,6 +281,120 @@ static void test_stream_msg_peek_server(const struct test_opts *opts)
 	close(fd);
 }
 
+#define MESSAGES_CNT 7
+#define MESSAGE_EOR_IDX (MESSAGES_CNT / 2)
+static void test_seqpacket_msg_eor_client(const struct test_opts *opts)
+{
+	int fd;
+
+	fd = vsock_seqpacket_connect(opts->peer_cid, 1234);
+	if (fd < 0) {
+		perror("connect");
+		exit(EXIT_FAILURE);
+	}
+
+	/* Send several messages, one with MSG_EOR flag */
+	for (int i = 0; i < MESSAGES_CNT; i++)
+		send_byte(fd, 1, (i != MESSAGE_EOR_IDX) ? 0 : MSG_EOR);
+
+	control_writeln("SENDDONE");
+	close(fd);
+}
+
+static void test_seqpacket_msg_eor_server(const struct test_opts *opts)
+{
+	int fd;
+	char buf[16];
+	struct msghdr msg = {0};
+	struct iovec iov = {0};
+
+	fd = vsock_seqpacket_accept(VMADDR_CID_ANY, 1234, NULL);
+	if (fd < 0) {
+		perror("accept");
+		exit(EXIT_FAILURE);
+	}
+
+	control_expectln("SENDDONE");
+	iov.iov_base = buf;
+	iov.iov_len = sizeof(buf);
+	msg.msg_iov = &iov;
+	msg.msg_iovlen = 1;
+
+	for (int i = 0; i < MESSAGES_CNT; i++) {
+		if (recvmsg(fd, &msg, 0) != 1) {
+			perror("message bound violated");
+			exit(EXIT_FAILURE);
+		}
+
+		if (i == MESSAGE_EOR_IDX) {
+			if (!(msg.msg_flags & MSG_EOR)) {
+				fprintf(stderr, "MSG_EOR flag expected\n");
+				exit(EXIT_FAILURE);
+			}
+		} else {
+			if (msg.msg_flags & MSG_EOR) {
+				fprintf(stderr, "unexpected MSG_EOR flag\n");
+				exit(EXIT_FAILURE);
+			}
+		}
+	}
+
+	close(fd);
+}
+
+#define MESSAGE_TRUNC_SZ 32
+static void test_seqpacket_msg_trunc_client(const struct test_opts *opts)
+{
+	int fd;
+	char buf[MESSAGE_TRUNC_SZ];
+
+	fd = vsock_seqpacket_connect(opts->peer_cid, 1234);
+	if (fd < 0) {
+		perror("connect");
+		exit(EXIT_FAILURE);
+	}
+
+	if (send(fd, buf, sizeof(buf), 0) != sizeof(buf)) {
+		perror("send failed");
+		exit(EXIT_FAILURE);
+	}
+
+	control_writeln("SENDDONE");
+	close(fd);
+}
+
+static void test_seqpacket_msg_trunc_server(const struct test_opts *opts)
+{
+	int fd;
+	char buf[MESSAGE_TRUNC_SZ / 2];
+	struct msghdr msg = {0};
+	struct iovec iov = {0};
+
+	fd = vsock_seqpacket_accept(VMADDR_CID_ANY, 1234, NULL);
+	if (fd < 0) {
+		perror("accept");
+		exit(EXIT_FAILURE);
+	}
+
+	control_expectln("SENDDONE");
+	iov.iov_base = buf;
+	iov.iov_len = sizeof(buf);
+	msg.msg_iov = &iov;
+	msg.msg_iovlen = 1;
+
+	if (recvmsg(fd, &msg, MSG_TRUNC) != MESSAGE_TRUNC_SZ) {
+		perror("MSG_TRUNC doesn't work");
+		exit(EXIT_FAILURE);
+	}
+
+	if (!(msg.msg_flags & MSG_TRUNC)) {
+		fprintf(stderr, "MSG_TRUNC expected\n");
+		exit(EXIT_FAILURE);
+	}
+
+	close(fd);
+}
+
 static struct test_case test_cases[] = {
 	{
 		.name = "SOCK_STREAM connection reset",
@@ -309,6 +425,16 @@ static struct test_case test_cases[] = {
 		.run_client = test_stream_msg_peek_client,
 		.run_server = test_stream_msg_peek_server,
 	},
+	{
+		.name = "SOCK_SEQPACKET send data MSG_EOR",
+		.run_client = test_seqpacket_msg_eor_client,
+		.run_server = test_seqpacket_msg_eor_server,
+	},
+	{
+		.name = "SOCK_SEQPACKET send data MSG_TRUNC",
+		.run_client = test_seqpacket_msg_trunc_client,
+		.run_server = test_seqpacket_msg_trunc_server,
+	},
 	{},
 };
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [RFC PATCH v4 16/17] loopback/vsock: setup SEQPACKET ops for transport
  2021-02-07 15:12 [RFC PATCH v4 00/17] virtio/vsock: introduce SOCK_SEQPACKET support Arseny Krasnov
                   ` (14 preceding siblings ...)
  2021-02-07 15:18 ` [RFC PATCH v4 15/17] vsock_test: add SOCK_SEQPACKET tests Arseny Krasnov
@ 2021-02-07 15:18 ` Arseny Krasnov
  2021-02-11 14:31     ` Stefano Garzarella
  2021-02-07 15:19 ` [RFC PATCH v4 17/17] virtio/vsock: simplify credit update function API Arseny Krasnov
  2021-02-07 16:20   ` Michael S. Tsirkin
  17 siblings, 1 reply; 61+ messages in thread
From: Arseny Krasnov @ 2021-02-07 15:18 UTC (permalink / raw)
  To: Stefan Hajnoczi, Stefano Garzarella, Michael S. Tsirkin,
	Jason Wang, David S. Miller, Jakub Kicinski, Arseny Krasnov,
	Jorgen Hansen, Colin Ian King, Andra Paraschiv,
	Jeff Vander Stoep
  Cc: kvm, virtualization, netdev, linux-kernel, stsp2, oxffffaa

This adds SEQPACKET ops for the loopback transport.

Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
---
 net/vmw_vsock/vsock_loopback.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/net/vmw_vsock/vsock_loopback.c b/net/vmw_vsock/vsock_loopback.c
index a45f7ffca8c5..c0da94119f74 100644
--- a/net/vmw_vsock/vsock_loopback.c
+++ b/net/vmw_vsock/vsock_loopback.c
@@ -89,6 +89,11 @@ static struct virtio_transport loopback_transport = {
 		.stream_is_active         = virtio_transport_stream_is_active,
 		.stream_allow             = virtio_transport_stream_allow,
 
+		.seqpacket_seq_send_len	  = virtio_transport_seqpacket_seq_send_len,
+		.seqpacket_seq_send_eor	  = virtio_transport_seqpacket_seq_send_eor,
+		.seqpacket_seq_get_len	  = virtio_transport_seqpacket_seq_get_len,
+		.seqpacket_dequeue        = virtio_transport_seqpacket_dequeue,
+
 		.notify_poll_in           = virtio_transport_notify_poll_in,
 		.notify_poll_out          = virtio_transport_notify_poll_out,
 		.notify_recv_init         = virtio_transport_notify_recv_init,
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [RFC PATCH v4 17/17] virtio/vsock: simplify credit update function API
  2021-02-07 15:12 [RFC PATCH v4 00/17] virtio/vsock: introduce SOCK_SEQPACKET support Arseny Krasnov
                   ` (15 preceding siblings ...)
  2021-02-07 15:18 ` [RFC PATCH v4 16/17] loopback/vsock: setup SEQPACKET ops for transport Arseny Krasnov
@ 2021-02-07 15:19 ` Arseny Krasnov
  2021-02-11 14:39     ` Stefano Garzarella
  2021-02-07 16:20   ` Michael S. Tsirkin
  17 siblings, 1 reply; 61+ messages in thread
From: Arseny Krasnov @ 2021-02-07 15:19 UTC (permalink / raw)
  To: Stefan Hajnoczi, Stefano Garzarella, Michael S. Tsirkin,
	Jason Wang, David S. Miller, Jakub Kicinski, Arseny Krasnov,
	Jorgen Hansen, Andra Paraschiv, Colin Ian King, Alexander Popov
  Cc: kvm, virtualization, netdev, linux-kernel, stsp2, oxffffaa

'virtio_transport_send_credit_update()' has some unneeded args:
1) 'type' can be set in 'virtio_transport_send_pkt_info()' using the
   type of the socket.
2) This function is static and the 'hdr' arg was always NULL.

Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
---
 net/vmw_vsock/virtio_transport_common.c | 20 +++++---------------
 1 file changed, 5 insertions(+), 15 deletions(-)

diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
index 0aa0fd33e9d6..46308679c8a4 100644
--- a/net/vmw_vsock/virtio_transport_common.c
+++ b/net/vmw_vsock/virtio_transport_common.c
@@ -286,13 +286,10 @@ void virtio_transport_put_credit(struct virtio_vsock_sock *vvs, u32 credit)
 }
 EXPORT_SYMBOL_GPL(virtio_transport_put_credit);
 
-static int virtio_transport_send_credit_update(struct vsock_sock *vsk,
-					       int type,
-					       struct virtio_vsock_hdr *hdr)
+static int virtio_transport_send_credit_update(struct vsock_sock *vsk)
 {
 	struct virtio_vsock_pkt_info info = {
 		.op = VIRTIO_VSOCK_OP_CREDIT_UPDATE,
-		.type = type,
 		.vsk = vsk,
 	};
 
@@ -401,9 +398,7 @@ virtio_transport_stream_do_dequeue(struct vsock_sock *vsk,
 	 * with different values.
 	 */
 	if (free_space < VIRTIO_VSOCK_MAX_PKT_BUF_SIZE) {
-		virtio_transport_send_credit_update(vsk,
-						    VIRTIO_VSOCK_TYPE_STREAM,
-						    NULL);
+		virtio_transport_send_credit_update(vsk);
 	}
 
 	return total;
@@ -525,9 +520,7 @@ size_t virtio_transport_seqpacket_seq_get_len(struct vsock_sock *vsk)
 	spin_unlock_bh(&vvs->rx_lock);
 
 	if (bytes_dropped)
-		virtio_transport_send_credit_update(vsk,
-						    VIRTIO_VSOCK_TYPE_SEQPACKET,
-						    NULL);
+		virtio_transport_send_credit_update(vsk);
 
 	return vvs->user_read_seq_len;
 }
@@ -624,8 +617,7 @@ static int virtio_transport_seqpacket_do_dequeue(struct vsock_sock *vsk,
 
 	spin_unlock_bh(&vvs->rx_lock);
 
-	virtio_transport_send_credit_update(vsk, VIRTIO_VSOCK_TYPE_SEQPACKET,
-					    NULL);
+	virtio_transport_send_credit_update(vsk);
 
 	return err;
 }
@@ -735,15 +727,13 @@ EXPORT_SYMBOL_GPL(virtio_transport_do_socket_init);
 void virtio_transport_notify_buffer_size(struct vsock_sock *vsk, u64 *val)
 {
 	struct virtio_vsock_sock *vvs = vsk->trans;
-	int type;
 
 	if (*val > VIRTIO_VSOCK_MAX_BUF_SIZE)
 		*val = VIRTIO_VSOCK_MAX_BUF_SIZE;
 
 	vvs->buf_alloc = *val;
 
-	type = virtio_transport_get_type(sk_vsock(vsk));
-	virtio_transport_send_credit_update(vsk, type, NULL);
+	virtio_transport_send_credit_update(vsk);
 }
 EXPORT_SYMBOL_GPL(virtio_transport_notify_buffer_size);
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 61+ messages in thread

* Re: [RFC PATCH v4 00/17] virtio/vsock: introduce SOCK_SEQPACKET support
  2021-02-07 15:12 [RFC PATCH v4 00/17] virtio/vsock: introduce SOCK_SEQPACKET support Arseny Krasnov
@ 2021-02-07 16:20   ` Michael S. Tsirkin
  2021-02-07 15:14 ` [RFC PATCH v4 02/17] af_vsock: separate wait data loop Arseny Krasnov
                     ` (16 subsequent siblings)
  17 siblings, 0 replies; 61+ messages in thread
From: Michael S. Tsirkin @ 2021-02-07 16:20 UTC (permalink / raw)
  To: Arseny Krasnov
  Cc: Stefan Hajnoczi, Stefano Garzarella, Jason Wang, David S. Miller,
	Jakub Kicinski, Jorgen Hansen, Andra Paraschiv, Colin Ian King,
	Alexander Popov, kvm, virtualization, netdev, linux-kernel,
	stsp2, oxffffaa

On Sun, Feb 07, 2021 at 06:12:56PM +0300, Arseny Krasnov wrote:
> 	This patchset implements support of SOCK_SEQPACKET for the virtio
> transport.
> 	As SOCK_SEQPACKET guarantees that record boundaries are preserved,
> two new packet operations were added: the first marks the start of a
> record and the second marks its end (SEQ_BEGIN and SEQ_END below). Both
> operations also carry metadata to maintain boundaries and payload
> integrity. The metadata is introduced by adding a special header with
> two fields - message count and message length:
> 
> 	struct virtio_vsock_seq_hdr {
> 		__le32  msg_cnt;
> 		__le32  msg_len;
> 	} __attribute__((packed));
> 
> 	This header is transmitted as the payload of SEQ_BEGIN and SEQ_END
> packets (the buffer of the second virtio descriptor in the chain) in the
> same way as data is transmitted in RW packets. The payload was chosen as
> the buffer for this header to avoid touching the first virtio buffer,
> which carries the packet header, because someone could check that the
> size of this buffer is equal to the size of the packet header. To send a
> record, a packet with the start marker is sent first (its header
> contains the length of the record and the counter), then the counter is
> incremented and all data is sent as usual 'RW' packets, and finally
> SEQ_END is sent (it also carries the message counter, which is the
> counter of SEQ_BEGIN + 1); after sending SEQ_END the counter is
> incremented again. On the receiver's side, the length of the record is
> known from the packet with the start-of-record marker. To check that no
> packets were dropped by the transport, the counters of two sequential
> SEQ_BEGIN and SEQ_END packets are checked (the counter of SEQ_END must
> be greater than the counter of SEQ_BEGIN by 1) and the length of data
> between the two markers is compared to the length in the SEQ_BEGIN
> header.
> 	As packets of one socket are not reordered on either the vsock or
> the vhost transport layer, such markers allow the original record to be
> restored on the receiver's side. If the user's buffer is smaller than
> the record length, then all data beyond the buffer size is dropped.
> 	The maximum length of a datagram is not limited, as in a stream
> socket, because the same credit logic is used. The difference from a
> stream socket is that the user is not woken up until the whole record
> is received or an error occurs. The implementation also supports the
> 'MSG_EOR' and 'MSG_TRUNC' flags.
> 	Tests are also implemented.
> 
>  Arseny Krasnov (17):
>   af_vsock: update functions for connectible socket
>   af_vsock: separate wait data loop
>   af_vsock: separate receive data loop
>   af_vsock: implement SEQPACKET receive loop
>   af_vsock: separate wait space loop
>   af_vsock: implement send logic for SEQPACKET
>   af_vsock: rest of SEQPACKET support
>   af_vsock: update comments for stream sockets
>   virtio/vsock: dequeue callback for SOCK_SEQPACKET
>   virtio/vsock: fetch length for SEQPACKET record
>   virtio/vsock: add SEQPACKET receive logic
>   virtio/vsock: rest of SOCK_SEQPACKET support
>   virtio/vsock: setup SEQPACKET ops for transport
>   vhost/vsock: setup SEQPACKET ops for transport
>   vsock_test: add SOCK_SEQPACKET tests
>   loopback/vsock: setup SEQPACKET ops for transport
>   virtio/vsock: simplify credit update function API
> 
>  drivers/vhost/vsock.c                   |   8 +-
>  include/linux/virtio_vsock.h            |  15 +
>  include/net/af_vsock.h                  |   9 +
>  include/uapi/linux/virtio_vsock.h       |  16 +
>  net/vmw_vsock/af_vsock.c                | 588 +++++++++++++++-------
>  net/vmw_vsock/virtio_transport.c        |   5 +
>  net/vmw_vsock/virtio_transport_common.c | 316 ++++++++++--
>  net/vmw_vsock/vsock_loopback.c          |   5 +
>  tools/testing/vsock/util.c              |  32 +-
>  tools/testing/vsock/util.h              |   3 +
>  tools/testing/vsock/vsock_test.c        | 126 +++++
>  11 files changed, 895 insertions(+), 228 deletions(-)
> 
>  TODO:
>  - What to do when the server doesn't support SOCK_SEQPACKET. In the
>    current implementation, RST is sent in reply, in the same way as
>    when a listening port is not found. I think the current RST is
>    enough, because the case when the server doesn't support
>    SOCK_SEQPACKET is the same as when the listener is missing (i.e. no
>    listener in both cases).

   - virtio spec patch

>  v3 -> v4:
>  - callbacks for loopback transport
>  - SEQPACKET specific metadata moved from packet header to payload
>    and called 'virtio_vsock_seq_hdr'
>  - record integrity check:
>    1) SEQ_END operation was added, which marks the end of a record.
>    2) Both SEQ_BEGIN and SEQ_END carry a counter which is incremented
>       on every marker send.
>  - af_vsock.c: socket operations for STREAM and SEQPACKET call the same
>    functions instead of having their own "gates" that differ only by
>    name: 'vsock_seqpacket/stream_getsockopt()' are now replaced with
>    'vsock_connectible_getsockopt()'.
>  - af_vsock.c: 'seqpacket_dequeue' callback returns an error and a flag
>    that the record is ready. There is no need to return the number of
>    copied bytes, because the case when a record is received
>    successfully is checked at the virtio transport layer, when SEQ_END
>    is processed. Also the user doesn't need the number of copied bytes,
>    because 'recv()' from SEQPACKET could return an error, the length of
>    the user's buffer or the length of the whole record (both are known
>    in af_vsock.c).
>  - af_vsock.c: both wait loops in af_vsock.c (for data and space) moved
>    to separate functions because both are now called from several
>    places.
>  - af_vsock.c: 'vsock_assign_transport()' checks that the
>    'new_transport' pointer is not NULL and returns 'ESOCKTNOSUPPORT'
>    instead of 'ENODEV' if it failed to use the transport.
>  - tools/testing/vsock/vsock_test.c: rename tests
> 
>  v2 -> v3:
>  - patches reorganized: split for prepare and implementation patches
>  - local variables are declared in "Reverse Christmas tree" manner
>  - virtio_transport_common.c: valid leXX_to_cpu() for vsock header
>    fields access
>  - af_vsock.c: 'vsock_connectible_*sockopt()' added as shared code
>    between stream and seqpacket sockets.
>  - af_vsock.c: loops in '__vsock_*_recvmsg()' refactored.
>  - af_vsock.c: 'vsock_wait_data()' refactored.
> 
>  v1 -> v2:
>  - patches reordered: af_vsock.c related changes now before virtio vsock
>  - patches reorganized: more small patches, where +/- are not mixed
>  - tests for SOCK_SEQPACKET added
>  - all commit messages updated
>  - af_vsock.c: 'vsock_pre_recv_check()' inlined to
>    'vsock_connectible_recvmsg()'
>  - af_vsock.c: 'vsock_assign_transport()' returns ENODEV if transport
>    was not found
>  - virtio_transport_common.c: transport callback for seqpacket dequeue
>  - virtio_transport_common.c: simplified
>    'virtio_transport_recv_connected()'
>  - virtio_transport_common.c: send reset on socket and packet type
> 			      mismatch.
> 
> -- 
> 2.25.1


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC PATCH v4 00/17] virtio/vsock: introduce SOCK_SEQPACKET support
@ 2021-02-07 16:20   ` Michael S. Tsirkin
  0 siblings, 0 replies; 61+ messages in thread
From: Michael S. Tsirkin @ 2021-02-07 16:20 UTC (permalink / raw)
  To: Arseny Krasnov
  Cc: Andra Paraschiv, kvm, netdev, stsp2, linux-kernel,
	virtualization, oxffffaa, Stefan Hajnoczi, Colin Ian King,
	Jakub Kicinski, Alexander Popov, David S. Miller, Jorgen Hansen

On Sun, Feb 07, 2021 at 06:12:56PM +0300, Arseny Krasnov wrote:
> 	This patchset implements support of SOCK_SEQPACKET for virtio
> transport.
> 	As SOCK_SEQPACKET guarantees to preserve record boundaries, two
> new packet operations were added: the first marks the start of a record
> and the second marks its end (SEQ_BEGIN and SEQ_END below). Both
> operations carry metadata to maintain boundaries and payload
> integrity. The metadata is a special header with two fields, message
> count and message length:
> 
> 	struct virtio_vsock_seq_hdr {
> 		__le32  msg_cnt;
> 		__le32  msg_len;
> 	} __attribute__((packed));
> 
> 	This header is transmitted as the payload of SEQ_BEGIN and SEQ_END
> packets (the buffer of the second virtio descriptor in the chain), in
> the same way as data is transmitted in RW packets. The payload was
> chosen for this header to avoid touching the first virtio buffer, which
> carries the packet header, because someone could check that the size of
> that buffer equals the size of the packet header. To send a record, the
> packet with the start marker is sent first (its header contains the
> record length and the counter), then the counter is incremented and all
> data is sent as usual 'RW' packets, and finally SEQ_END is sent (it
> also carries the message counter, which is the counter of SEQ_BEGIN
> + 1); after sending SEQ_END the counter is incremented again. On the
> receiver's side, the record length is known from the packet with the
> start marker. To check that no packets were dropped by the transport,
> the counters of two sequential SEQ_BEGIN and SEQ_END packets are
> checked (the counter of SEQ_END must be bigger than the counter of
> SEQ_BEGIN by 1) and the length of data between the two markers is
> compared to the length in the SEQ_BEGIN header.
> 	Since packets of one socket are not reordered on either the
> vsock or the vhost transport layer, such markers allow the original
> record to be restored on the receiver's side. If the user's buffer is
> smaller than the record length, all data beyond the buffer size is
> dropped.
> 	The maximum length of a datagram is not limited, just as with a
> stream socket, because the same credit logic is used. The difference
> from a stream socket is that the user is not woken up until the whole
> record is received or an error occurs. The implementation also supports
> the 'MSG_EOR' and 'MSG_TRUNC' flags.
> 	Tests are also implemented.
> 
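
The counter and length checks described above can be sketched on the
receive side roughly as follows. This is only a userspace illustration,
not code from the patchset: 'struct seq_hdr' is a host-endian copy of
'virtio_vsock_seq_hdr', and 'seq_rx_state' plus the 'seq_rx_*()' helpers
are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Host-endian copy of the two fields carried by SEQ_BEGIN / SEQ_END. */
struct seq_hdr {
	uint32_t msg_cnt;
	uint32_t msg_len;
};

/* Same size as the wire header from the cover letter: two 32-bit fields. */
static_assert(sizeof(struct seq_hdr) == 8, "header must be 8 bytes");

/* Hypothetical per-socket receive state; names are illustrative only. */
struct seq_rx_state {
	struct seq_hdr begin;	/* header taken from SEQ_BEGIN */
	uint32_t bytes_seen;	/* RW payload bytes since SEQ_BEGIN */
};

/* SEQ_BEGIN arrived: remember its header and restart byte accounting. */
static void seq_rx_begin(struct seq_rx_state *st, struct seq_hdr begin)
{
	st->begin = begin;
	st->bytes_seen = 0;
}

/* An RW packet with 'len' payload bytes arrived. */
static void seq_rx_data(struct seq_rx_state *st, uint32_t len)
{
	st->bytes_seen += len;
}

/*
 * SEQ_END arrived: the record is intact only if the END counter is the
 * BEGIN counter + 1 and the payload seen matches the announced length.
 */
static bool seq_rx_end(const struct seq_rx_state *st, struct seq_hdr end)
{
	return end.msg_cnt == st->begin.msg_cnt + 1 &&
	       st->bytes_seen == st->begin.msg_len;
}
```

Because the counter is an unsigned 32-bit value, the "+ 1" comparison
wraps naturally at UINT32_MAX.
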
>  Arseny Krasnov (17):
>   af_vsock: update functions for connectible socket
>   af_vsock: separate wait data loop
>   af_vsock: separate receive data loop
>   af_vsock: implement SEQPACKET receive loop
>   af_vsock: separate wait space loop
>   af_vsock: implement send logic for SEQPACKET
>   af_vsock: rest of SEQPACKET support
>   af_vsock: update comments for stream sockets
>   virtio/vsock: dequeue callback for SOCK_SEQPACKET
>   virtio/vsock: fetch length for SEQPACKET record
>   virtio/vsock: add SEQPACKET receive logic
>   virtio/vsock: rest of SOCK_SEQPACKET support
>   virtio/vsock: setup SEQPACKET ops for transport
>   vhost/vsock: setup SEQPACKET ops for transport
>   vsock_test: add SOCK_SEQPACKET tests
>   loopback/vsock: setup SEQPACKET ops for transport
>   virtio/vsock: simplify credit update function API
> 
>  drivers/vhost/vsock.c                   |   8 +-
>  include/linux/virtio_vsock.h            |  15 +
>  include/net/af_vsock.h                  |   9 +
>  include/uapi/linux/virtio_vsock.h       |  16 +
>  net/vmw_vsock/af_vsock.c                | 588 +++++++++++++++-------
>  net/vmw_vsock/virtio_transport.c        |   5 +
>  net/vmw_vsock/virtio_transport_common.c | 316 ++++++++++--
>  net/vmw_vsock/vsock_loopback.c          |   5 +
>  tools/testing/vsock/util.c              |  32 +-
>  tools/testing/vsock/util.h              |   3 +
>  tools/testing/vsock/vsock_test.c        | 126 +++++
>  11 files changed, 895 insertions(+), 228 deletions(-)
> 
>  TODO:
>  - What to do when the server doesn't support SOCK_SEQPACKET. In the
>    current implementation RST is replied, in the same way as when the
>    listening port is not found. I think the current RST is enough,
>    because the case when the server doesn't support SOCK_SEQPACKET is
>    the same as when the listener is missing (i.e. no listener in both
>    cases).

   - virtio spec patch

>  v3 -> v4:
>  - callbacks for loopback transport
>  - SEQPACKET specific metadata moved from packet header to payload
>    and called 'virtio_vsock_seq_hdr'
>  - record integrity check:
>    1) SEQ_END operation was added, which marks the end of a record.
>    2) Both SEQ_BEGIN and SEQ_END carry a counter which is incremented
>       every time a marker is sent.
>  - af_vsock.c: socket operations for STREAM and SEQPACKET call the same
>    functions instead of having their own "gates" that differ only by
>    name: 'vsock_seqpacket/stream_getsockopt()' is now replaced with
>    'vsock_connectible_getsockopt()'.
>  - af_vsock.c: 'seqpacket_dequeue' callback returns an error and a flag
>    that the record is ready. There is no need to return the number of
>    copied bytes, because successful receipt of a record is checked at
>    the virtio transport layer when SEQ_END is processed. The user also
>    doesn't need the number of copied bytes, because 'recv()' from
>    SEQPACKET can return an error, the length of the user's buffer or
>    the length of the whole record (both are known in af_vsock.c).
>  - af_vsock.c: both wait loops (for data and space) moved to separate
>    functions because both are now called from several places.
>  - af_vsock.c: 'vsock_assign_transport()' checks that the
>    'new_transport' pointer is not NULL and returns 'ESOCKTNOSUPPORT'
>    instead of 'ENODEV' if it fails to use the transport.
>  - tools/testing/vsock/vsock_test.c: rename tests
> 
>  v2 -> v3:
>  - patches reorganized: split for prepare and implementation patches
>  - local variables are declared in "Reverse Christmas tree" manner
>  - virtio_transport_common.c: valid leXX_to_cpu() for vsock header
>    fields access
>  - af_vsock.c: 'vsock_connectible_*sockopt()' added as shared code
>    between stream and seqpacket sockets.
>  - af_vsock.c: loops in '__vsock_*_recvmsg()' refactored.
>  - af_vsock.c: 'vsock_wait_data()' refactored.
> 
>  v1 -> v2:
>  - patches reordered: af_vsock.c related changes now before virtio vsock
>  - patches reorganized: more small patches, where +/- are not mixed
>  - tests for SOCK_SEQPACKET added
>  - all commit messages updated
>  - af_vsock.c: 'vsock_pre_recv_check()' inlined to
>    'vsock_connectible_recvmsg()'
>  - af_vsock.c: 'vsock_assign_transport()' returns ENODEV if transport
>    was not found
>  - virtio_transport_common.c: transport callback for seqpacket dequeue
>  - virtio_transport_common.c: simplified
>    'virtio_transport_recv_connected()'
>  - virtio_transport_common.c: send reset on socket and packet type
> 			      mismatch.
> 
> -- 
> 2.25.1


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC PATCH v4 05/17] af_vsock: separate wait space loop
  2021-02-07 15:15 ` [RFC PATCH v4 05/17] af_vsock: separate wait space loop Arseny Krasnov
@ 2021-02-07 16:58   ` kernel test robot
  2021-02-11 12:14     ` Stefano Garzarella
  1 sibling, 0 replies; 61+ messages in thread
From: kernel test robot @ 2021-02-07 16:58 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 10048 bytes --]

Hi Arseny,

[FYI, it's a private test report for your RFC patch.]
[auto build test WARNING on linus/master]
[cannot apply to vhost/linux-next v5.11-rc6 next-20210125]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Arseny-Krasnov/virtio-vsock-introduce-SOCK_SEQPACKET-support/20210207-232655
base:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 61556703b610a104de324e4f061dc6cf7b218b46
config: x86_64-randconfig-a002-20210207 (attached as .config)
compiler: gcc-9 (Debian 9.3.0-15) 9.3.0
reproduce (this is a W=1 build):
        # https://github.com/0day-ci/linux/commit/2fc3a79be4e6633693da8cf6f889ebb0581f95e4
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Arseny-Krasnov/virtio-vsock-introduce-SOCK_SEQPACKET-support/20210207-232655
        git checkout 2fc3a79be4e6633693da8cf6f889ebb0581f95e4
        # save the attached .config to linux build tree
        make W=1 ARCH=x86_64 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

   net/vmw_vsock/af_vsock.c: In function 'vsock_connectible_sendmsg':
>> net/vmw_vsock/af_vsock.c:1761:7: warning: variable 'timeout' set but not used [-Wunused-but-set-variable]
    1761 |  long timeout;
         |       ^~~~~~~


vim +/timeout +1761 net/vmw_vsock/af_vsock.c

2fc3a79be4e663 Arseny Krasnov     2021-02-07  1753  
91571ee8147192 Arseny Krasnov     2021-02-07  1754  static int vsock_connectible_sendmsg(struct socket *sock, struct msghdr *msg,
1b784140474e4f Ying Xue           2015-03-02  1755  				     size_t len)
d021c344051af9 Andy King          2013-02-06  1756  {
d021c344051af9 Andy King          2013-02-06  1757  	struct sock *sk;
d021c344051af9 Andy King          2013-02-06  1758  	struct vsock_sock *vsk;
fe502c4a38d97e Stefano Garzarella 2019-11-14  1759  	const struct vsock_transport *transport;
d021c344051af9 Andy King          2013-02-06  1760  	ssize_t total_written;
d021c344051af9 Andy King          2013-02-06 @1761  	long timeout;
d021c344051af9 Andy King          2013-02-06  1762  	int err;
d021c344051af9 Andy King          2013-02-06  1763  	struct vsock_transport_send_notify_data send_data;
499fde662f1957 WANG Cong          2017-05-19  1764  	DEFINE_WAIT_FUNC(wait, woken_wake_function);
d021c344051af9 Andy King          2013-02-06  1765  
d021c344051af9 Andy King          2013-02-06  1766  	sk = sock->sk;
d021c344051af9 Andy King          2013-02-06  1767  	vsk = vsock_sk(sk);
d021c344051af9 Andy King          2013-02-06  1768  	total_written = 0;
d021c344051af9 Andy King          2013-02-06  1769  	err = 0;
d021c344051af9 Andy King          2013-02-06  1770  
d021c344051af9 Andy King          2013-02-06  1771  	if (msg->msg_flags & MSG_OOB)
d021c344051af9 Andy King          2013-02-06  1772  		return -EOPNOTSUPP;
d021c344051af9 Andy King          2013-02-06  1773  
d021c344051af9 Andy King          2013-02-06  1774  	lock_sock(sk);
d021c344051af9 Andy King          2013-02-06  1775  
c518adafa39f37 Alexander Popov    2021-02-01  1776  	transport = vsk->transport;
c518adafa39f37 Alexander Popov    2021-02-01  1777  
d021c344051af9 Andy King          2013-02-06  1778  	/* Callers should not provide a destination with stream sockets. */
d021c344051af9 Andy King          2013-02-06  1779  	if (msg->msg_namelen) {
3b4477d2dcf270 Stefan Hajnoczi    2017-10-05  1780  		err = sk->sk_state == TCP_ESTABLISHED ? -EISCONN : -EOPNOTSUPP;
d021c344051af9 Andy King          2013-02-06  1781  		goto out;
d021c344051af9 Andy King          2013-02-06  1782  	}
d021c344051af9 Andy King          2013-02-06  1783  
d021c344051af9 Andy King          2013-02-06  1784  	/* Send data only if both sides are not shutdown in the direction. */
d021c344051af9 Andy King          2013-02-06  1785  	if (sk->sk_shutdown & SEND_SHUTDOWN ||
d021c344051af9 Andy King          2013-02-06  1786  	    vsk->peer_shutdown & RCV_SHUTDOWN) {
d021c344051af9 Andy King          2013-02-06  1787  		err = -EPIPE;
d021c344051af9 Andy King          2013-02-06  1788  		goto out;
d021c344051af9 Andy King          2013-02-06  1789  	}
d021c344051af9 Andy King          2013-02-06  1790  
c0cfa2d8a788fc Stefano Garzarella 2019-11-14  1791  	if (!transport || sk->sk_state != TCP_ESTABLISHED ||
d021c344051af9 Andy King          2013-02-06  1792  	    !vsock_addr_bound(&vsk->local_addr)) {
d021c344051af9 Andy King          2013-02-06  1793  		err = -ENOTCONN;
d021c344051af9 Andy King          2013-02-06  1794  		goto out;
d021c344051af9 Andy King          2013-02-06  1795  	}
d021c344051af9 Andy King          2013-02-06  1796  
d021c344051af9 Andy King          2013-02-06  1797  	if (!vsock_addr_bound(&vsk->remote_addr)) {
d021c344051af9 Andy King          2013-02-06  1798  		err = -EDESTADDRREQ;
d021c344051af9 Andy King          2013-02-06  1799  		goto out;
d021c344051af9 Andy King          2013-02-06  1800  	}
d021c344051af9 Andy King          2013-02-06  1801  
d021c344051af9 Andy King          2013-02-06  1802  	/* Wait for room in the produce queue to enqueue our user's data. */
d021c344051af9 Andy King          2013-02-06  1803  	timeout = sock_sndtimeo(sk, msg->msg_flags & MSG_DONTWAIT);
d021c344051af9 Andy King          2013-02-06  1804  
d021c344051af9 Andy King          2013-02-06  1805  	err = transport->notify_send_init(vsk, &send_data);
d021c344051af9 Andy King          2013-02-06  1806  	if (err < 0)
d021c344051af9 Andy King          2013-02-06  1807  		goto out;
d021c344051af9 Andy King          2013-02-06  1808  
d021c344051af9 Andy King          2013-02-06  1809  	while (total_written < len) {
d021c344051af9 Andy King          2013-02-06  1810  		ssize_t written;
d021c344051af9 Andy King          2013-02-06  1811  
2fc3a79be4e663 Arseny Krasnov     2021-02-07  1812  		if (vsock_wait_space(sk, 1, msg->msg_flags, &send_data))
f7f9b5e7f8eccf Claudio Imbrenda   2016-03-22  1813  			goto out_err;
d021c344051af9 Andy King          2013-02-06  1814  
d021c344051af9 Andy King          2013-02-06  1815  		/* These checks occur both as part of and after the loop
d021c344051af9 Andy King          2013-02-06  1816  		 * conditional since we need to check before and after
d021c344051af9 Andy King          2013-02-06  1817  		 * sleeping.
d021c344051af9 Andy King          2013-02-06  1818  		 */
d021c344051af9 Andy King          2013-02-06  1819  		if (sk->sk_err) {
d021c344051af9 Andy King          2013-02-06  1820  			err = -sk->sk_err;
f7f9b5e7f8eccf Claudio Imbrenda   2016-03-22  1821  			goto out_err;
d021c344051af9 Andy King          2013-02-06  1822  		} else if ((sk->sk_shutdown & SEND_SHUTDOWN) ||
d021c344051af9 Andy King          2013-02-06  1823  			   (vsk->peer_shutdown & RCV_SHUTDOWN)) {
d021c344051af9 Andy King          2013-02-06  1824  			err = -EPIPE;
f7f9b5e7f8eccf Claudio Imbrenda   2016-03-22  1825  			goto out_err;
d021c344051af9 Andy King          2013-02-06  1826  		}
d021c344051af9 Andy King          2013-02-06  1827  
d021c344051af9 Andy King          2013-02-06  1828  		err = transport->notify_send_pre_enqueue(vsk, &send_data);
d021c344051af9 Andy King          2013-02-06  1829  		if (err < 0)
f7f9b5e7f8eccf Claudio Imbrenda   2016-03-22  1830  			goto out_err;
d021c344051af9 Andy King          2013-02-06  1831  
d021c344051af9 Andy King          2013-02-06  1832  		/* Note that enqueue will only write as many bytes as are free
d021c344051af9 Andy King          2013-02-06  1833  		 * in the produce queue, so we don't need to ensure len is
d021c344051af9 Andy King          2013-02-06  1834  		 * smaller than the queue size.  It is the caller's
d021c344051af9 Andy King          2013-02-06  1835  		 * responsibility to check how many bytes we were able to send.
d021c344051af9 Andy King          2013-02-06  1836  		 */
d021c344051af9 Andy King          2013-02-06  1837  
d021c344051af9 Andy King          2013-02-06  1838  		written = transport->stream_enqueue(
0f7db23a07af5d Al Viro            2014-11-20  1839  				vsk, msg,
d021c344051af9 Andy King          2013-02-06  1840  				len - total_written);
d021c344051af9 Andy King          2013-02-06  1841  		if (written < 0) {
d021c344051af9 Andy King          2013-02-06  1842  			err = -ENOMEM;
f7f9b5e7f8eccf Claudio Imbrenda   2016-03-22  1843  			goto out_err;
d021c344051af9 Andy King          2013-02-06  1844  		}
d021c344051af9 Andy King          2013-02-06  1845  
d021c344051af9 Andy King          2013-02-06  1846  		total_written += written;
d021c344051af9 Andy King          2013-02-06  1847  
d021c344051af9 Andy King          2013-02-06  1848  		err = transport->notify_send_post_enqueue(
d021c344051af9 Andy King          2013-02-06  1849  				vsk, written, &send_data);
d021c344051af9 Andy King          2013-02-06  1850  		if (err < 0)
f7f9b5e7f8eccf Claudio Imbrenda   2016-03-22  1851  			goto out_err;
d021c344051af9 Andy King          2013-02-06  1852  
d021c344051af9 Andy King          2013-02-06  1853  	}
d021c344051af9 Andy King          2013-02-06  1854  
f7f9b5e7f8eccf Claudio Imbrenda   2016-03-22  1855  out_err:
d021c344051af9 Andy King          2013-02-06  1856  	if (total_written > 0)
d021c344051af9 Andy King          2013-02-06  1857  		err = total_written;
d021c344051af9 Andy King          2013-02-06  1858  out:
d021c344051af9 Andy King          2013-02-06  1859  	release_sock(sk);
d021c344051af9 Andy King          2013-02-06  1860  	return err;
d021c344051af9 Andy King          2013-02-06  1861  }
d021c344051af9 Andy King          2013-02-06  1862  
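
The warning arises because line 1803 still assigns 'timeout' while the
loop now calls vsock_wait_space() with the message flags instead of
using that value. One plausible cleanup (an assumption, the thread does
not show the final version) is to drop the local and its assignment and
leave timeout handling to the helper. A minimal userspace sketch of that
shape, where the 'demo_*' functions and DEMO_DONTWAIT are hypothetical
stand-ins and not kernel code:

```c
#include <errno.h>

/* Illustrative flag value; stands in for MSG_DONTWAIT. */
#define DEMO_DONTWAIT 0x40

/*
 * Hypothetical stand-in for vsock_wait_space(): it derives the timeout
 * from the flags itself, so callers need no 'timeout' local. The sketch
 * never sleeps; it only reports how a real wait would end.
 */
static int demo_wait_space(long sndtimeo, int flags, long space_free)
{
	long timeout = (flags & DEMO_DONTWAIT) ? 0 : sndtimeo;

	if (space_free > 0)
		return 0;

	return timeout ? -ETIMEDOUT : -EAGAIN;
}

/* Caller after the cleanup: no 'long timeout;' and no assignment to it. */
static int demo_sendmsg(long sndtimeo, int flags, long space_free)
{
	int err;

	err = demo_wait_space(sndtimeo, flags, space_free);
	if (err)
		return err;

	return 0; /* the enqueue loop would run here */
}
```
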

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org

[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 32214 bytes --]

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC PATCH v4 00/17] virtio/vsock: introduce SOCK_SEQPACKET support
  2021-02-07 16:20   ` Michael S. Tsirkin
  (?)
@ 2021-02-08  6:32   ` Arseny Krasnov
  2021-02-11 14:57       ` Stefano Garzarella
  -1 siblings, 1 reply; 61+ messages in thread
From: Arseny Krasnov @ 2021-02-08  6:32 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Stefan Hajnoczi, Stefano Garzarella, Jason Wang, David S. Miller,
	Jakub Kicinski, Jorgen Hansen, Andra Paraschiv, Colin Ian King,
	Alexander Popov, kvm, virtualization, netdev, linux-kernel,
	stsp2, oxffffaa


On 07.02.2021 19:20, Michael S. Tsirkin wrote:
> On Sun, Feb 07, 2021 at 06:12:56PM +0300, Arseny Krasnov wrote:
>> 	This patchset implements support of SOCK_SEQPACKET for virtio
>> transport.
>> 	As SOCK_SEQPACKET guarantees to preserve record boundaries, two
>> new packet operations were added: the first marks the start of a record
>> and the second marks its end (SEQ_BEGIN and SEQ_END below). Both
>> operations carry metadata to maintain boundaries and payload
>> integrity. The metadata is a special header with two fields, message
>> count and message length:
>>
>> 	struct virtio_vsock_seq_hdr {
>> 		__le32  msg_cnt;
>> 		__le32  msg_len;
>> 	} __attribute__((packed));
>>
>> 	This header is transmitted as the payload of SEQ_BEGIN and SEQ_END
>> packets (the buffer of the second virtio descriptor in the chain), in
>> the same way as data is transmitted in RW packets. The payload was
>> chosen for this header to avoid touching the first virtio buffer, which
>> carries the packet header, because someone could check that the size of
>> that buffer equals the size of the packet header. To send a record, the
>> packet with the start marker is sent first (its header contains the
>> record length and the counter), then the counter is incremented and all
>> data is sent as usual 'RW' packets, and finally SEQ_END is sent (it
>> also carries the message counter, which is the counter of SEQ_BEGIN
>> + 1); after sending SEQ_END the counter is incremented again. On the
>> receiver's side, the record length is known from the packet with the
>> start marker. To check that no packets were dropped by the transport,
>> the counters of two sequential SEQ_BEGIN and SEQ_END packets are
>> checked (the counter of SEQ_END must be bigger than the counter of
>> SEQ_BEGIN by 1) and the length of data between the two markers is
>> compared to the length in the SEQ_BEGIN header.
>> 	Since packets of one socket are not reordered on either the
>> vsock or the vhost transport layer, such markers allow the original
>> record to be restored on the receiver's side. If the user's buffer is
>> smaller than the record length, all data beyond the buffer size is
>> dropped.
>> 	The maximum length of a datagram is not limited, just as with a
>> stream socket, because the same credit logic is used. The difference
>> from a stream socket is that the user is not woken up until the whole
>> record is received or an error occurs. The implementation also supports
>> the 'MSG_EOR' and 'MSG_TRUNC' flags.
>> 	Tests are also implemented.
>>
>>  Arseny Krasnov (17):
>>   af_vsock: update functions for connectible socket
>>   af_vsock: separate wait data loop
>>   af_vsock: separate receive data loop
>>   af_vsock: implement SEQPACKET receive loop
>>   af_vsock: separate wait space loop
>>   af_vsock: implement send logic for SEQPACKET
>>   af_vsock: rest of SEQPACKET support
>>   af_vsock: update comments for stream sockets
>>   virtio/vsock: dequeue callback for SOCK_SEQPACKET
>>   virtio/vsock: fetch length for SEQPACKET record
>>   virtio/vsock: add SEQPACKET receive logic
>>   virtio/vsock: rest of SOCK_SEQPACKET support
>>   virtio/vsock: setup SEQPACKET ops for transport
>>   vhost/vsock: setup SEQPACKET ops for transport
>>   vsock_test: add SOCK_SEQPACKET tests
>>   loopback/vsock: setup SEQPACKET ops for transport
>>   virtio/vsock: simplify credit update function API
>>
>>  drivers/vhost/vsock.c                   |   8 +-
>>  include/linux/virtio_vsock.h            |  15 +
>>  include/net/af_vsock.h                  |   9 +
>>  include/uapi/linux/virtio_vsock.h       |  16 +
>>  net/vmw_vsock/af_vsock.c                | 588 +++++++++++++++-------
>>  net/vmw_vsock/virtio_transport.c        |   5 +
>>  net/vmw_vsock/virtio_transport_common.c | 316 ++++++++++--
>>  net/vmw_vsock/vsock_loopback.c          |   5 +
>>  tools/testing/vsock/util.c              |  32 +-
>>  tools/testing/vsock/util.h              |   3 +
>>  tools/testing/vsock/vsock_test.c        | 126 +++++
>>  11 files changed, 895 insertions(+), 228 deletions(-)
>>
>>  TODO:
>>  - What to do when the server doesn't support SOCK_SEQPACKET. In the
>>    current implementation RST is replied, in the same way as when the
>>    listening port is not found. I think the current RST is enough,
>>    because the case when the server doesn't support SOCK_SEQPACKET is
>>    the same as when the listener is missing (i.e. no listener in both
>>    cases).
>    - virtio spec patch
Ok
>
>>  v3 -> v4:
>>  - callbacks for loopback transport
>>  - SEQPACKET specific metadata moved from packet header to payload
>>    and called 'virtio_vsock_seq_hdr'
>>  - record integrity check:
>>    1) SEQ_END operation was added, which marks the end of a record.
>>    2) Both SEQ_BEGIN and SEQ_END carry a counter which is incremented
>>       every time a marker is sent.
>>  - af_vsock.c: socket operations for STREAM and SEQPACKET call the same
>>    functions instead of having their own "gates" that differ only by
>>    name: 'vsock_seqpacket/stream_getsockopt()' is now replaced with
>>    'vsock_connectible_getsockopt()'.
>>  - af_vsock.c: 'seqpacket_dequeue' callback returns an error and a flag
>>    that the record is ready. There is no need to return the number of
>>    copied bytes, because successful receipt of a record is checked at
>>    the virtio transport layer when SEQ_END is processed. The user also
>>    doesn't need the number of copied bytes, because 'recv()' from
>>    SEQPACKET can return an error, the length of the user's buffer or
>>    the length of the whole record (both are known in af_vsock.c).
>>  - af_vsock.c: both wait loops (for data and space) moved to separate
>>    functions because both are now called from several places.
>>  - af_vsock.c: 'vsock_assign_transport()' checks that the
>>    'new_transport' pointer is not NULL and returns 'ESOCKTNOSUPPORT'
>>    instead of 'ENODEV' if it fails to use the transport.
>>  - tools/testing/vsock/vsock_test.c: rename tests
>>
>>  v2 -> v3:
>>  - patches reorganized: split for prepare and implementation patches
>>  - local variables are declared in "Reverse Christmas tree" manner
>>  - virtio_transport_common.c: valid leXX_to_cpu() for vsock header
>>    fields access
>>  - af_vsock.c: 'vsock_connectible_*sockopt()' added as shared code
>>    between stream and seqpacket sockets.
>>  - af_vsock.c: loops in '__vsock_*_recvmsg()' refactored.
>>  - af_vsock.c: 'vsock_wait_data()' refactored.
>>
>>  v1 -> v2:
>>  - patches reordered: af_vsock.c related changes now before virtio vsock
>>  - patches reorganized: more small patches, where +/- are not mixed
>>  - tests for SOCK_SEQPACKET added
>>  - all commit messages updated
>>  - af_vsock.c: 'vsock_pre_recv_check()' inlined to
>>    'vsock_connectible_recvmsg()'
>>  - af_vsock.c: 'vsock_assign_transport()' returns ENODEV if transport
>>    was not found
>>  - virtio_transport_common.c: transport callback for seqpacket dequeue
>>  - virtio_transport_common.c: simplified
>>    'virtio_transport_recv_connected()'
>>  - virtio_transport_common.c: send reset on socket and packet type
>> 			      mismatch.
>>
>> -- 
>> 2.25.1
>

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC PATCH v4 12/17] virtio/vsock: rest of SOCK_SEQPACKET support
  2021-02-07 15:17 ` [RFC PATCH v4 12/17] virtio/vsock: rest of SOCK_SEQPACKET support Arseny Krasnov
@ 2021-02-09  4:34   ` kernel test robot
  2021-02-11 11:00   ` Arseny Krasnov
  2021-02-11 14:29     ` Stefano Garzarella
  2 siblings, 0 replies; 61+ messages in thread
From: kernel test robot @ 2021-02-09  4:34 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 3691 bytes --]

Hi Arseny,

[FYI, it's a private test report for your RFC patch.]
[auto build test WARNING on linus/master]
[cannot apply to vhost/linux-next v5.11-rc6 next-20210125]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Arseny-Krasnov/virtio-vsock-introduce-SOCK_SEQPACKET-support/20210207-232655
base:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 61556703b610a104de324e4f061dc6cf7b218b46
config: i386-randconfig-s001-20210209 (attached as .config)
compiler: gcc-9 (Debian 9.3.0-15) 9.3.0
reproduce:
        # apt-get install sparse
        # sparse version: v0.6.3-215-g0fb77bb6-dirty
        # https://github.com/0day-ci/linux/commit/0bfa48046cf3aa71cde18727f1ac90448308bfdd
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Arseny-Krasnov/virtio-vsock-introduce-SOCK_SEQPACKET-support/20210207-232655
        git checkout 0bfa48046cf3aa71cde18727f1ac90448308bfdd
        # save the attached .config to linux build tree
        make W=1 C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' ARCH=i386 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>


"sparse warnings: (new ones prefixed by >>)"
   net/vmw_vsock/virtio_transport_common.c:430:31: sparse: sparse: incorrect type in initializer (different base types) @@     expected restricted __le32 [usertype] msg_cnt @@     got unsigned int [usertype] next_tx_msg_cnt @@
   net/vmw_vsock/virtio_transport_common.c:430:31: sparse:     expected restricted __le32 [usertype] msg_cnt
   net/vmw_vsock/virtio_transport_common.c:430:31: sparse:     got unsigned int [usertype] next_tx_msg_cnt
>> net/vmw_vsock/virtio_transport_common.c:431:28: sparse: sparse: incorrect type in initializer (different base types) @@     expected restricted __le32 [usertype] msg_len @@     got unsigned int [usertype] len @@
   net/vmw_vsock/virtio_transport_common.c:431:28: sparse:     expected restricted __le32 [usertype] msg_len
   net/vmw_vsock/virtio_transport_common.c:431:28: sparse:     got unsigned int [usertype] len

vim +431 net/vmw_vsock/virtio_transport_common.c

   416	
   417	static int virtio_transport_seqpacket_send_ctrl(struct vsock_sock *vsk,
   418							int type,
   419							size_t len,
   420							int flags)
   421	{
   422		struct virtio_vsock_sock *vvs = vsk->trans;
   423		struct virtio_vsock_pkt_info info = {
   424			.op = type,
   425			.vsk = vsk,
   426			.pkt_len = sizeof(struct virtio_vsock_seq_hdr)
   427		};
   428	
   429		struct virtio_vsock_seq_hdr seq_hdr = {
   430			.msg_cnt = vvs->next_tx_msg_cnt,
 > 431			.msg_len = len
   432		};
   433	
   434		struct kvec seq_hdr_kiov = {
   435			.iov_base = (void *)&seq_hdr,
   436			.iov_len = sizeof(struct virtio_vsock_seq_hdr)
   437		};
   438	
   439		struct msghdr msg = {0};
   440	
   441		//XXX: do we need 'vsock_transport_send_notify_data' pointer?
   442		if (vsock_wait_space(sk_vsock(vsk),
   443				     sizeof(struct virtio_vsock_seq_hdr),
   444				     flags, NULL))
   445			return -1;
   446	
   447		iov_iter_kvec(&msg.msg_iter, WRITE, &seq_hdr_kiov, 1, sizeof(seq_hdr));
   448	
   449		info.msg = &msg;
   450		vvs->next_tx_msg_cnt++;
   451	
   452		return virtio_transport_send_pkt_info(vsk, &info);
   453	}
   454	
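
Both sparse messages come from storing CPU-order integers into '__le32'
fields. In kernel code the usual remedy is to wrap such values in
cpu_to_le32(); whether the author chose that fix is not shown in this
report. The portable userspace sketch below illustrates the same idea
with explicit byte stores. 'seq_hdr_wire', 'put_le32()', 'get_le32()'
and 'make_seq_hdr()' are hypothetical names used only for illustration.

```c
#include <stdint.h>

/* Wire image of the header: each field is 4 bytes, little-endian. */
struct seq_hdr_wire {
	uint8_t msg_cnt[4];
	uint8_t msg_len[4];
};

/* Store a CPU-order value as little-endian bytes, on any host. */
static void put_le32(uint8_t out[4], uint32_t v)
{
	out[0] = (uint8_t)(v & 0xff);
	out[1] = (uint8_t)((v >> 8) & 0xff);
	out[2] = (uint8_t)((v >> 16) & 0xff);
	out[3] = (uint8_t)((v >> 24) & 0xff);
}

/* Read little-endian bytes back into a CPU-order value. */
static uint32_t get_le32(const uint8_t in[4])
{
	return (uint32_t)in[0] | ((uint32_t)in[1] << 8) |
	       ((uint32_t)in[2] << 16) | ((uint32_t)in[3] << 24);
}

/* Build the header with both fields converted before they are stored. */
static struct seq_hdr_wire make_seq_hdr(uint32_t msg_cnt, uint32_t msg_len)
{
	struct seq_hdr_wire hdr;

	put_le32(hdr.msg_cnt, msg_cnt);
	put_le32(hdr.msg_len, msg_len);
	return hdr;
}
```

The point in both variants is that the conversion happens at the moment
the value is placed into the header, so the stored bytes are the same on
big-endian and little-endian hosts.
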

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org

[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 35484 bytes --]

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC PATCH v4 01/17] af_vsock: update functions for connectible socket
  2021-02-07 15:14 ` [RFC PATCH v4 01/17] af_vsock: update functions for connectible socket Arseny Krasnov
@ 2021-02-11 10:52     ` Stefano Garzarella
  0 siblings, 0 replies; 61+ messages in thread
From: Stefano Garzarella @ 2021-02-11 10:52 UTC (permalink / raw)
  To: Arseny Krasnov
  Cc: Stefan Hajnoczi, Michael S. Tsirkin, Jason Wang, David S. Miller,
	Jakub Kicinski, Jorgen Hansen, Colin Ian King, Andra Paraschiv,
	Jeff Vander Stoep, kvm, virtualization, netdev, linux-kernel,
	stsp2, oxffffaa

On Sun, Feb 07, 2021 at 06:14:23PM +0300, Arseny Krasnov wrote:
>This prepares af_vsock.c for SEQPACKET support: some functions such
>as setsockopt(), getsockopt(), connect(), recvmsg(), sendmsg() are
>shared between both types of sockets, so rename them in general
>manner.
>
>Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
>---
> net/vmw_vsock/af_vsock.c | 64 +++++++++++++++++++++-------------------
> 1 file changed, 34 insertions(+), 30 deletions(-)

This patch LGTM:

Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>

Thanks,
Stefano

>
>diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
>index 6894f21dc147..f4fabec50650 100644
>--- a/net/vmw_vsock/af_vsock.c
>+++ b/net/vmw_vsock/af_vsock.c
>@@ -604,8 +604,8 @@ static void vsock_pending_work(struct work_struct *work)
>
> /**** SOCKET OPERATIONS ****/
>
>-static int __vsock_bind_stream(struct vsock_sock *vsk,
>-			       struct sockaddr_vm *addr)
>+static int __vsock_bind_connectible(struct vsock_sock *vsk,
>+				    struct sockaddr_vm *addr)
> {
> 	static u32 port;
> 	struct sockaddr_vm new_addr;
>@@ -685,7 +685,7 @@ static int __vsock_bind(struct sock *sk, struct sockaddr_vm *addr)
> 	switch (sk->sk_socket->type) {
> 	case SOCK_STREAM:
> 		spin_lock_bh(&vsock_table_lock);
>-		retval = __vsock_bind_stream(vsk, addr);
>+		retval = __vsock_bind_connectible(vsk, addr);
> 		spin_unlock_bh(&vsock_table_lock);
> 		break;
>
>@@ -767,6 +767,11 @@ static struct sock *__vsock_create(struct net *net,
> 	return sk;
> }
>
>+static bool sock_type_connectible(u16 type)
>+{
>+	return type == SOCK_STREAM;
>+}
>+
> static void __vsock_release(struct sock *sk, int level)
> {
> 	if (sk) {
>@@ -785,7 +790,7 @@ static void __vsock_release(struct sock *sk, int level)
>
> 		if (vsk->transport)
> 			vsk->transport->release(vsk);
>-		else if (sk->sk_type == SOCK_STREAM)
>+		else if (sock_type_connectible(sk->sk_type))
> 			vsock_remove_sock(vsk);
>
> 		sock_orphan(sk);
>@@ -945,7 +950,7 @@ static int vsock_shutdown(struct socket *sock, int mode)
> 	sk = sock->sk;
> 	if (sock->state == SS_UNCONNECTED) {
> 		err = -ENOTCONN;
>-		if (sk->sk_type == SOCK_STREAM)
>+		if (sock_type_connectible(sk->sk_type))
> 			return err;
> 	} else {
> 		sock->state = SS_DISCONNECTING;
>@@ -960,7 +965,7 @@ static int vsock_shutdown(struct socket *sock, int mode)
> 		sk->sk_state_change(sk);
> 		release_sock(sk);
>
>-		if (sk->sk_type == SOCK_STREAM) {
>+		if (sock_type_connectible(sk->sk_type)) {
> 			sock_reset_flag(sk, SOCK_DONE);
> 			vsock_send_shutdown(sk, mode);
> 		}
>@@ -1013,7 +1018,7 @@ static __poll_t vsock_poll(struct file *file, struct socket *sock,
> 		if (!(sk->sk_shutdown & SEND_SHUTDOWN))
> 			mask |= EPOLLOUT | EPOLLWRNORM | EPOLLWRBAND;
>
>-	} else if (sock->type == SOCK_STREAM) {
>+	} else if (sock_type_connectible(sk->sk_type)) {
> 		const struct vsock_transport *transport;
>
> 		lock_sock(sk);
>@@ -1263,8 +1268,8 @@ static void vsock_connect_timeout(struct work_struct *work)
> 	sock_put(sk);
> }
>
>-static int vsock_stream_connect(struct socket *sock, struct sockaddr *addr,
>-				int addr_len, int flags)
>+static int vsock_connect(struct socket *sock, struct sockaddr *addr,
>+			 int addr_len, int flags)
> {
> 	int err;
> 	struct sock *sk;
>@@ -1414,7 +1419,7 @@ static int vsock_accept(struct socket *sock, struct socket *newsock, int flags,
>
> 	lock_sock(listener);
>
>-	if (sock->type != SOCK_STREAM) {
>+	if (!sock_type_connectible(sock->type)) {
> 		err = -EOPNOTSUPP;
> 		goto out;
> 	}
>@@ -1491,7 +1496,7 @@ static int vsock_listen(struct socket *sock, int backlog)
>
> 	lock_sock(sk);
>
>-	if (sock->type != SOCK_STREAM) {
>+	if (!sock_type_connectible(sk->sk_type)) {
> 		err = -EOPNOTSUPP;
> 		goto out;
> 	}
>@@ -1535,11 +1540,11 @@ static void vsock_update_buffer_size(struct vsock_sock *vsk,
> 	vsk->buffer_size = val;
> }
>
>-static int vsock_stream_setsockopt(struct socket *sock,
>-				   int level,
>-				   int optname,
>-				   sockptr_t optval,
>-				   unsigned int optlen)
>+static int vsock_connectible_setsockopt(struct socket *sock,
>+					int level,
>+					int optname,
>+					sockptr_t optval,
>+					unsigned int optlen)
> {
> 	int err;
> 	struct sock *sk;
>@@ -1617,10 +1622,10 @@ static int vsock_stream_setsockopt(struct socket *sock,
> 	return err;
> }
>
>-static int vsock_stream_getsockopt(struct socket *sock,
>-				   int level, int optname,
>-				   char __user *optval,
>-				   int __user *optlen)
>+static int vsock_connectible_getsockopt(struct socket *sock,
>+					int level, int optname,
>+					char __user *optval,
>+					int __user *optlen)
> {
> 	int err;
> 	int len;
>@@ -1688,8 +1693,8 @@ static int vsock_stream_getsockopt(struct socket *sock,
> 	return 0;
> }
>
>-static int vsock_stream_sendmsg(struct socket *sock, struct msghdr *msg,
>-				size_t len)
>+static int vsock_connectible_sendmsg(struct socket *sock, struct msghdr *msg,
>+				     size_t len)
> {
> 	struct sock *sk;
> 	struct vsock_sock *vsk;
>@@ -1828,10 +1833,9 @@ static int vsock_stream_sendmsg(struct socket *sock, struct msghdr *msg,
> 	return err;
> }
>
>-
> static int
>-vsock_stream_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
>-		     int flags)
>+vsock_connectible_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
>+			  int flags)
> {
> 	struct sock *sk;
> 	struct vsock_sock *vsk;
>@@ -2007,7 +2011,7 @@ static const struct proto_ops vsock_stream_ops = {
> 	.owner = THIS_MODULE,
> 	.release = vsock_release,
> 	.bind = vsock_bind,
>-	.connect = vsock_stream_connect,
>+	.connect = vsock_connect,
> 	.socketpair = sock_no_socketpair,
> 	.accept = vsock_accept,
> 	.getname = vsock_getname,
>@@ -2015,10 +2019,10 @@ static const struct proto_ops vsock_stream_ops = {
> 	.ioctl = sock_no_ioctl,
> 	.listen = vsock_listen,
> 	.shutdown = vsock_shutdown,
>-	.setsockopt = vsock_stream_setsockopt,
>-	.getsockopt = vsock_stream_getsockopt,
>-	.sendmsg = vsock_stream_sendmsg,
>-	.recvmsg = vsock_stream_recvmsg,
>+	.setsockopt = vsock_connectible_setsockopt,
>+	.getsockopt = vsock_connectible_getsockopt,
>+	.sendmsg = vsock_connectible_sendmsg,
>+	.recvmsg = vsock_connectible_recvmsg,
> 	.mmap = sock_no_mmap,
> 	.sendpage = sock_no_sendpage,
> };
>-- 
>2.25.1
>
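
For reference, the helper added in the quoted patch accepts only SOCK_STREAM for now. Since the commit message says the rename prepares af_vsock.c for SEQPACKET, a later patch would presumably widen the check. A minimal userspace sketch of that widened check follows; the function name and the use of uint16_t in place of the kernel's u16 are assumptions for illustration, not code from the series:

```c
#include <stdbool.h>
#include <stdint.h>
#include <sys/socket.h>

/* Hypothetical userspace mirror of sock_type_connectible(); the name
 * and the widened condition are assumptions, not taken from the patch.
 */
bool sock_type_connectible_sketch(uint16_t type)
{
	/* Both connection-oriented socket types share the same paths. */
	return type == SOCK_STREAM || type == SOCK_SEQPACKET;
}
```

Keeping the test in one helper means callers such as vsock_shutdown() or vsock_listen() would not need to change again when a new connectible type is added.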


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC PATCH v4 12/17] virtio/vsock: rest of SOCK_SEQPACKET support
  2021-02-07 15:17 ` [RFC PATCH v4 12/17] virtio/vsock: rest of SOCK_SEQPACKET support Arseny Krasnov
  2021-02-09  4:34   ` kernel test robot
@ 2021-02-11 11:00   ` Arseny Krasnov
  2021-02-11 14:29     ` Stefano Garzarella
  2 siblings, 0 replies; 61+ messages in thread
From: Arseny Krasnov @ 2021-02-11 11:00 UTC (permalink / raw)
  To: Stefan Hajnoczi, Stefano Garzarella, Michael S. Tsirkin,
	Jason Wang, David S. Miller, Jakub Kicinski, Jorgen Hansen,
	Colin Ian King, Andra Paraschiv, Alexander Popov
  Cc: kvm, virtualization, netdev, linux-kernel, stsp2, oxffffaa


On 07.02.2021 18:17, Arseny Krasnov wrote:
> This adds rest of logic for SEQPACKET:
> 1) Packet's type is now set in 'virtio_send_pkt_info()' using
>    type of socket.
> 2) SEQPACKET specific functions which send SEQ_BEGIN/SEQ_END.
>    Note that both functions may sleep to wait enough space for
>    SEQPACKET header.
> 3) SEQ_BEGIN/SEQ_END to TAP packet capture.
> 4) Send SHUTDOWN on socket close for SEQPACKET type.
>
> Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
> ---
>  include/linux/virtio_vsock.h            |  9 +++
>  net/vmw_vsock/virtio_transport_common.c | 99 +++++++++++++++++++++----
>  2 files changed, 95 insertions(+), 13 deletions(-)
>
> diff --git a/include/linux/virtio_vsock.h b/include/linux/virtio_vsock.h
> index a5e8681bfc6a..c4a39424686d 100644
> --- a/include/linux/virtio_vsock.h
> +++ b/include/linux/virtio_vsock.h
> @@ -41,6 +41,7 @@ struct virtio_vsock_sock {
>  	u32 user_read_seq_len;
>  	u32 user_read_copied;
>  	u32 curr_rx_msg_cnt;
> +	u32 next_tx_msg_cnt;
>  };
>  
>  struct virtio_vsock_pkt {
> @@ -85,7 +86,15 @@ virtio_transport_dgram_dequeue(struct vsock_sock *vsk,
>  			       struct msghdr *msg,
>  			       size_t len, int flags);
>  
> +int virtio_transport_seqpacket_seq_send_len(struct vsock_sock *vsk, size_t len, int flags);
> +int virtio_transport_seqpacket_seq_send_eor(struct vsock_sock *vsk, int flags);
>  size_t virtio_transport_seqpacket_seq_get_len(struct vsock_sock *vsk);
> +int
> +virtio_transport_seqpacket_dequeue(struct vsock_sock *vsk,
> +				   struct msghdr *msg,
> +				   int flags,
> +				   bool *msg_ready);
> +
>  s64 virtio_transport_stream_has_data(struct vsock_sock *vsk);
>  s64 virtio_transport_stream_has_space(struct vsock_sock *vsk);
>  
> diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
> index 51b66f8dd7c7..0aa0fd33e9d6 100644
> --- a/net/vmw_vsock/virtio_transport_common.c
> +++ b/net/vmw_vsock/virtio_transport_common.c
> @@ -139,6 +139,8 @@ static struct sk_buff *virtio_transport_build_skb(void *opaque)
>  		break;
>  	case VIRTIO_VSOCK_OP_CREDIT_UPDATE:
>  	case VIRTIO_VSOCK_OP_CREDIT_REQUEST:
> +	case VIRTIO_VSOCK_OP_SEQ_BEGIN:
> +	case VIRTIO_VSOCK_OP_SEQ_END:
>  		hdr->op = cpu_to_le16(AF_VSOCK_OP_CONTROL);
>  		break;
>  	default:
> @@ -165,6 +167,14 @@ void virtio_transport_deliver_tap_pkt(struct virtio_vsock_pkt *pkt)
>  }
>  EXPORT_SYMBOL_GPL(virtio_transport_deliver_tap_pkt);
>  
> +static u16 virtio_transport_get_type(struct sock *sk)
> +{
> +	if (sk->sk_type == SOCK_STREAM)
> +		return VIRTIO_VSOCK_TYPE_STREAM;
> +	else
> +		return VIRTIO_VSOCK_TYPE_SEQPACKET;
> +}
> +
>  /* This function can only be used on connecting/connected sockets,
>   * since a socket assigned to a transport is required.
>   *
> @@ -179,6 +189,13 @@ static int virtio_transport_send_pkt_info(struct vsock_sock *vsk,
>  	struct virtio_vsock_pkt *pkt;
>  	u32 pkt_len = info->pkt_len;
>  
> +	info->type = virtio_transport_get_type(sk_vsock(vsk));
> +
> +	if (info->type == VIRTIO_VSOCK_TYPE_SEQPACKET &&
> +	    info->msg &&
> +	    info->msg->msg_flags & MSG_EOR)
> +		info->flags |= VIRTIO_VSOCK_RW_EOR;
> +
>  	t_ops = virtio_transport_get_ops(vsk);
>  	if (unlikely(!t_ops))
>  		return -EFAULT;
> @@ -397,13 +414,61 @@ virtio_transport_stream_do_dequeue(struct vsock_sock *vsk,
>  	return err;
>  }
>  
> -static u16 virtio_transport_get_type(struct sock *sk)
> +static int virtio_transport_seqpacket_send_ctrl(struct vsock_sock *vsk,
> +						int type,
> +						size_t len,
> +						int flags)
>  {
> -	if (sk->sk_type == SOCK_STREAM)
> -		return VIRTIO_VSOCK_TYPE_STREAM;
> -	else
> -		return VIRTIO_VSOCK_TYPE_SEQPACKET;
> +	struct virtio_vsock_sock *vvs = vsk->trans;
> +	struct virtio_vsock_pkt_info info = {
> +		.op = type,
> +		.vsk = vsk,
> +		.pkt_len = sizeof(struct virtio_vsock_seq_hdr)
> +	};
> +
> +	struct virtio_vsock_seq_hdr seq_hdr = {
> +		.msg_cnt = vvs->next_tx_msg_cnt,
> +		.msg_len = len
Oops, I forgot to use 'cpu_to_le32()' here. Will fix in v5.
> +	};
> +
> +	struct kvec seq_hdr_kiov = {
> +		.iov_base = (void *)&seq_hdr,
> +		.iov_len = sizeof(struct virtio_vsock_seq_hdr)
> +	};
> +
> +	struct msghdr msg = {0};
> +
> +	//XXX: do we need 'vsock_transport_send_notify_data' pointer?
> +	if (vsock_wait_space(sk_vsock(vsk),
> +			     sizeof(struct virtio_vsock_seq_hdr),
> +			     flags, NULL))
> +		return -1;
> +
> +	iov_iter_kvec(&msg.msg_iter, WRITE, &seq_hdr_kiov, 1, sizeof(seq_hdr));
> +
> +	info.msg = &msg;
> +	vvs->next_tx_msg_cnt++;
> +
> +	return virtio_transport_send_pkt_info(vsk, &info);
> +}
> +
> +int virtio_transport_seqpacket_seq_send_len(struct vsock_sock *vsk, size_t len, int flags)
> +{
> +	return virtio_transport_seqpacket_send_ctrl(vsk,
> +						    VIRTIO_VSOCK_OP_SEQ_BEGIN,
> +						    len,
> +						    flags);
>  }
> +EXPORT_SYMBOL_GPL(virtio_transport_seqpacket_seq_send_len);
> +
> +int virtio_transport_seqpacket_seq_send_eor(struct vsock_sock *vsk, int flags)
> +{
> +	return virtio_transport_seqpacket_send_ctrl(vsk,
> +						    VIRTIO_VSOCK_OP_SEQ_END,
> +						    0,
> +						    flags);
> +}
> +EXPORT_SYMBOL_GPL(virtio_transport_seqpacket_seq_send_eor);
>  
>  static inline void virtio_transport_remove_pkt(struct virtio_vsock_pkt *pkt)
>  {
> @@ -577,6 +642,18 @@ virtio_transport_stream_dequeue(struct vsock_sock *vsk,
>  }
>  EXPORT_SYMBOL_GPL(virtio_transport_stream_dequeue);
>  
> +int
> +virtio_transport_seqpacket_dequeue(struct vsock_sock *vsk,
> +				   struct msghdr *msg,
> +				   int flags, bool *msg_ready)
> +{
> +	if (flags & MSG_PEEK)
> +		return -EOPNOTSUPP;
> +
> +	return virtio_transport_seqpacket_do_dequeue(vsk, msg, msg_ready);
> +}
> +EXPORT_SYMBOL_GPL(virtio_transport_seqpacket_dequeue);
> +
>  int
>  virtio_transport_dgram_dequeue(struct vsock_sock *vsk,
>  			       struct msghdr *msg,
> @@ -658,14 +735,15 @@ EXPORT_SYMBOL_GPL(virtio_transport_do_socket_init);
>  void virtio_transport_notify_buffer_size(struct vsock_sock *vsk, u64 *val)
>  {
>  	struct virtio_vsock_sock *vvs = vsk->trans;
> +	int type;
>  
>  	if (*val > VIRTIO_VSOCK_MAX_BUF_SIZE)
>  		*val = VIRTIO_VSOCK_MAX_BUF_SIZE;
>  
>  	vvs->buf_alloc = *val;
>  
> -	virtio_transport_send_credit_update(vsk, VIRTIO_VSOCK_TYPE_STREAM,
> -					    NULL);
> +	type = virtio_transport_get_type(sk_vsock(vsk));
> +	virtio_transport_send_credit_update(vsk, type, NULL);
>  }
>  EXPORT_SYMBOL_GPL(virtio_transport_notify_buffer_size);
>  
> @@ -792,7 +870,6 @@ int virtio_transport_connect(struct vsock_sock *vsk)
>  {
>  	struct virtio_vsock_pkt_info info = {
>  		.op = VIRTIO_VSOCK_OP_REQUEST,
> -		.type = VIRTIO_VSOCK_TYPE_STREAM,
>  		.vsk = vsk,
>  	};
>  
> @@ -804,7 +881,6 @@ int virtio_transport_shutdown(struct vsock_sock *vsk, int mode)
>  {
>  	struct virtio_vsock_pkt_info info = {
>  		.op = VIRTIO_VSOCK_OP_SHUTDOWN,
> -		.type = VIRTIO_VSOCK_TYPE_STREAM,
>  		.flags = (mode & RCV_SHUTDOWN ?
>  			  VIRTIO_VSOCK_SHUTDOWN_RCV : 0) |
>  			 (mode & SEND_SHUTDOWN ?
> @@ -833,7 +909,6 @@ virtio_transport_stream_enqueue(struct vsock_sock *vsk,
>  {
>  	struct virtio_vsock_pkt_info info = {
>  		.op = VIRTIO_VSOCK_OP_RW,
> -		.type = VIRTIO_VSOCK_TYPE_STREAM,
>  		.msg = msg,
>  		.pkt_len = len,
>  		.vsk = vsk,
> @@ -856,7 +931,6 @@ static int virtio_transport_reset(struct vsock_sock *vsk,
>  {
>  	struct virtio_vsock_pkt_info info = {
>  		.op = VIRTIO_VSOCK_OP_RST,
> -		.type = VIRTIO_VSOCK_TYPE_STREAM,
>  		.reply = !!pkt,
>  		.vsk = vsk,
>  	};
> @@ -1001,7 +1075,7 @@ void virtio_transport_release(struct vsock_sock *vsk)
>  	struct sock *sk = &vsk->sk;
>  	bool remove_sock = true;
>  
> -	if (sk->sk_type == SOCK_STREAM)
> +	if (sk->sk_type == SOCK_STREAM || sk->sk_type == SOCK_SEQPACKET)
>  		remove_sock = virtio_transport_close(vsk);
>  
>  	list_for_each_entry_safe(pkt, tmp, &vvs->rx_queue, list) {
> @@ -1164,7 +1238,6 @@ virtio_transport_send_response(struct vsock_sock *vsk,
>  {
>  	struct virtio_vsock_pkt_info info = {
>  		.op = VIRTIO_VSOCK_OP_RESPONSE,
> -		.type = VIRTIO_VSOCK_TYPE_STREAM,
>  		.remote_cid = le64_to_cpu(pkt->hdr.src_cid),
>  		.remote_port = le32_to_cpu(pkt->hdr.src_port),
>  		.reply = true,

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC PATCH v4 02/17] af_vsock: separate wait data loop
  2021-02-07 15:14 ` [RFC PATCH v4 02/17] af_vsock: separate wait data loop Arseny Krasnov
@ 2021-02-11 11:24     ` Stefano Garzarella
  2021-02-11 15:11     ` Jorgen Hansen
  1 sibling, 0 replies; 61+ messages in thread
From: Stefano Garzarella @ 2021-02-11 11:24 UTC (permalink / raw)
  To: Arseny Krasnov
  Cc: Stefan Hajnoczi, Michael S. Tsirkin, Jason Wang, David S. Miller,
	Jakub Kicinski, Jorgen Hansen, Colin Ian King, Andra Paraschiv,
	Alexander Popov, kvm, virtualization, netdev, linux-kernel,
	stsp2, oxffffaa

On Sun, Feb 07, 2021 at 06:14:48PM +0300, Arseny Krasnov wrote:
>This moves wait loop for data to dedicated function, because later
>it will be used by SEQPACKET data receive loop.
>
>Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
>---
> net/vmw_vsock/af_vsock.c | 158 +++++++++++++++++++++------------------
> 1 file changed, 86 insertions(+), 72 deletions(-)
>
>diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
>index f4fabec50650..38927695786f 100644
>--- a/net/vmw_vsock/af_vsock.c
>+++ b/net/vmw_vsock/af_vsock.c
>@@ -1833,6 +1833,71 @@ static int vsock_connectible_sendmsg(struct socket *sock, struct msghdr *msg,
> 	return err;
> }
>
>+static int vsock_wait_data(struct sock *sk, struct wait_queue_entry *wait,
>+			   long timeout,
>+			   struct vsock_transport_recv_notify_data *recv_data,
>+			   size_t target)
>+{
>+	const struct vsock_transport *transport;
>+	struct vsock_sock *vsk;
>+	s64 data;
>+	int err;
>+
>+	vsk = vsock_sk(sk);
>+	err = 0;
>+	transport = vsk->transport;
>+	prepare_to_wait(sk_sleep(sk), wait, TASK_INTERRUPTIBLE);
>+
>+	while ((data = vsock_stream_has_data(vsk)) == 0) {
>+		if (sk->sk_err != 0 ||
>+		    (sk->sk_shutdown & RCV_SHUTDOWN) ||
>+		    (vsk->peer_shutdown & SEND_SHUTDOWN)) {
>+			goto out;
>+		}
>+
>+		/* Don't wait for non-blocking sockets. */
>+		if (timeout == 0) {
>+			err = -EAGAIN;
>+			goto out;
>+		}
>+
>+		if (recv_data) {
>+			err = transport->notify_recv_pre_block(vsk, target, recv_data);
>+			if (err < 0)
>+				goto out;
>+		}
>+
>+		release_sock(sk);
>+		timeout = schedule_timeout(timeout);
>+		lock_sock(sk);
>+
>+		if (signal_pending(current)) {
>+			err = sock_intr_errno(timeout);
>+			goto out;
>+		} else if (timeout == 0) {
>+			err = -EAGAIN;
>+			goto out;
>+		}
>+	}
>+
>+	finish_wait(sk_sleep(sk), wait);
>+
>+	/* Invalid queue pair content. XXX This should
>+	 * be changed to a connection reset in a later
>+	 * change.
>+	 */
>+	if (data < 0)
>+		return -ENOMEM;
>+
>+	/* Have some data, return. */
>+	if (data)
>+		return data;

IIUC, 'data' must be != 0 here, so you can simply return it in any case.

Or, more cleanly, you can use 'break' instead of 'goto out' in the error
paths, and after the while loop do something like this:

	finish_wait(sk_sleep(sk), wait);

	if (err)
		return err;

	if (data < 0)
		return -ENOMEM;

	return data;
}
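
To make the control flow of that suggestion concrete, here is a simplified userspace model. Everything in it is an assumption for illustration: struct wait_model and its helpers are hypothetical stand-ins for the socket state, vsock_stream_has_data(), schedule_timeout() and finish_wait(), and the signal and notify_recv_pre_block() paths are left out.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the state the real loop inspects. */
struct wait_model {
	const int64_t *polls;	/* successive "has data" results */
	size_t npolls;
	size_t pos;
	int shutdown;		/* non-zero: stop waiting, no error */
	long timeout;		/* remaining ticks; 0 means non-blocking */
	int finish_calls;	/* how many times cleanup ran */
};

static int64_t model_has_data(struct wait_model *m)
{
	return m->pos < m->npolls ? m->polls[m->pos++] : 0;
}

static void model_finish_wait(struct wait_model *m)
{
	m->finish_calls++;
}

int64_t wait_data_model(struct wait_model *m)
{
	int64_t data;
	int err = 0;

	while ((data = model_has_data(m)) == 0) {
		if (m->shutdown)
			break;		/* err stays 0 */

		if (m->timeout == 0) {
			err = -EAGAIN;
			break;
		}

		m->timeout--;		/* stands in for schedule_timeout() */
	}

	model_finish_wait(m);		/* single cleanup point */

	if (err)
		return err;

	if (data < 0)
		return -ENOMEM;

	return data;
}
```

With 'break' every exit path goes through the same finish_wait() call, so the 'out:' label and the second cleanup call are no longer needed.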

>+
>+out:
>+	finish_wait(sk_sleep(sk), wait);
>+	return err;
>+}
>+
> static int
> vsock_connectible_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
> 			  int flags)
>@@ -1912,85 +1977,34 @@ vsock_connectible_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
>
>
> 	while (1) {
>-		s64 ready;
>+		ssize_t read;
>
>-		prepare_to_wait(sk_sleep(sk), &wait, TASK_INTERRUPTIBLE);
>-		ready = vsock_stream_has_data(vsk);
>-
>-		if (ready == 0) {
>-			if (sk->sk_err != 0 ||
>-			    (sk->sk_shutdown & RCV_SHUTDOWN) ||
>-			    (vsk->peer_shutdown & SEND_SHUTDOWN)) {
>-				finish_wait(sk_sleep(sk), &wait);
>-				break;
>-			}
>-			/* Don't wait for non-blocking sockets. */
>-			if (timeout == 0) {
>-				err = -EAGAIN;
>-				finish_wait(sk_sleep(sk), &wait);
>-				break;
>-			}
>-
>-			err = transport->notify_recv_pre_block(
>-					vsk, target, &recv_data);
>-			if (err < 0) {
>-				finish_wait(sk_sleep(sk), &wait);
>-				break;
>-			}
>-			release_sock(sk);
>-			timeout = schedule_timeout(timeout);
>-			lock_sock(sk);
>-
>-			if (signal_pending(current)) {
>-				err = sock_intr_errno(timeout);
>-				finish_wait(sk_sleep(sk), &wait);
>-				break;
>-			} else if (timeout == 0) {
>-				err = -EAGAIN;
>-				finish_wait(sk_sleep(sk), &wait);
>-				break;
>-			}
>-		} else {
>-			ssize_t read;
>+		err = vsock_wait_data(sk, &wait, timeout, &recv_data, target);
>+		if (err <= 0)
>+			break;
>
>-			finish_wait(sk_sleep(sk), &wait);
>-
>-			if (ready < 0) {
>-				/* Invalid queue pair content. XXX This should
>-				* be changed to a connection reset in a later
>-				* change.
>-				*/
>-
>-				err = -ENOMEM;
>-				goto out;
>-			}
>-
>-			err = transport->notify_recv_pre_dequeue(
>-					vsk, target, &recv_data);
>-			if (err < 0)
>-				break;
>+		err = transport->notify_recv_pre_dequeue(vsk, target,
>+							 &recv_data);
>+		if (err < 0)
>+			break;
>
>-			read = transport->stream_dequeue(
>-					vsk, msg,
>-					len - copied, flags);
>-			if (read < 0) {
>-				err = -ENOMEM;
>-				break;
>-			}
>+		read = transport->stream_dequeue(vsk, msg, len - copied, flags);
>+		if (read < 0) {
>+			err = -ENOMEM;
>+			break;
>+		}
>
>-			copied += read;
>+		copied += read;
>
>-			err = transport->notify_recv_post_dequeue(
>-					vsk, target, read,
>-					!(flags & MSG_PEEK), &recv_data);
>-			if (err < 0)
>-				goto out;
>+		err = transport->notify_recv_post_dequeue(vsk, target, read,
>+						!(flags & MSG_PEEK), &recv_data);
>+		if (err < 0)
>+			goto out;
>
>-			if (read >= target || flags & MSG_PEEK)
>-				break;
>+		if (read >= target || flags & MSG_PEEK)
>+			break;
>
>-			target -= read;
>-		}
>+		target -= read;
> 	}

This part looks okay. Maybe we could improve the loop a bit and make it
more readable, but that's out of the scope of this patch.

Thanks,
Stefano


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC PATCH v4 03/17] af_vsock: separate receive data loop
  2021-02-07 15:15 ` [RFC PATCH v4 03/17] af_vsock: separate receive " Arseny Krasnov
@ 2021-02-11 11:37     ` Stefano Garzarella
  0 siblings, 0 replies; 61+ messages in thread
From: Stefano Garzarella @ 2021-02-11 11:37 UTC (permalink / raw)
  To: Arseny Krasnov
  Cc: Stefan Hajnoczi, Michael S. Tsirkin, Jason Wang, David S. Miller,
	Jakub Kicinski, Jorgen Hansen, Colin Ian King, Andra Paraschiv,
	Jeff Vander Stoep, kvm, virtualization, netdev, linux-kernel,
	stsp2, oxffffaa

On Sun, Feb 07, 2021 at 06:15:05PM +0300, Arseny Krasnov wrote:
>This moves STREAM specific data receive logic to dedicated function:
>'__vsock_stream_recvmsg()', while checks that will be same for both
>types of socket are in shared function: 'vsock_connectible_recvmsg()'.
>
>Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
>---
> net/vmw_vsock/af_vsock.c | 117 +++++++++++++++++++++++----------------
> 1 file changed, 68 insertions(+), 49 deletions(-)
>
>diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
>index 38927695786f..66c8a932f49b 100644
>--- a/net/vmw_vsock/af_vsock.c
>+++ b/net/vmw_vsock/af_vsock.c
>@@ -1898,65 +1898,22 @@ static int vsock_wait_data(struct sock *sk, struct wait_queue_entry *wait,
> 	return err;
> }
>
>-static int
>-vsock_connectible_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
>-			  int flags)
>+static int __vsock_stream_recvmsg(struct sock *sk, struct msghdr *msg,
>+				  size_t len, int flags)
> {
>-	struct sock *sk;
>-	struct vsock_sock *vsk;
>+	struct vsock_transport_recv_notify_data recv_data;
> 	const struct vsock_transport *transport;
>-	int err;
>-	size_t target;
>+	struct vsock_sock *vsk;
> 	ssize_t copied;
>+	size_t target;
> 	long timeout;
>-	struct vsock_transport_recv_notify_data recv_data;
>+	int err;
>
> 	DEFINE_WAIT(wait);
>
>-	sk = sock->sk;
> 	vsk = vsock_sk(sk);
>-	err = 0;
>-
>-	lock_sock(sk);
>-
> 	transport = vsk->transport;
>
>-	if (!transport || sk->sk_state != TCP_ESTABLISHED) {
>-		/* Recvmsg is supposed to return 0 if a peer performs an
>-		 * orderly shutdown. Differentiate between that case and when a
>-		 * peer has not connected or a local shutdown occured with the
>-		 * SOCK_DONE flag.
>-		 */
>-		if (sock_flag(sk, SOCK_DONE))
>-			err = 0;
>-		else
>-			err = -ENOTCONN;
>-
>-		goto out;
>-	}
>-
>-	if (flags & MSG_OOB) {
>-		err = -EOPNOTSUPP;
>-		goto out;
>-	}
>-
>-	/* We don't check peer_shutdown flag here since peer may actually shut
>-	 * down, but there can be data in the queue that a local socket can
>-	 * receive.
>-	 */
>-	if (sk->sk_shutdown & RCV_SHUTDOWN) {
>-		err = 0;
>-		goto out;
>-	}
>-
>-	/* It is valid on Linux to pass in a zero-length receive buffer.  This
>-	 * is not an error.  We may as well bail out now.
>-	 */
>-	if (!len) {
>-		err = 0;
>-		goto out;
>-	}
>-
> 	/* We must not copy less than target bytes into the user's buffer
> 	 * before returning successfully, so we wait for the consume queue to
> 	 * have that much data to consume before dequeueing.  Note that this

At the end of __vsock_stream_recvmsg() you are calling release_sock(sk),
which is wrong because the lock is already released in
vsock_connectible_recvmsg(), so it would be released twice.

Please fix it.
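
To illustrate the locking convention, here is a minimal userspace
sketch (not the kernel code). The outer function is the only one that
takes and drops the lock, and the helper assumes it runs with the lock
held. The mock_* names and the lock_depth counter are made up for this
example; the real code uses lock_sock()/release_sock() on struct sock.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct sock: it only tracks the lock state. */
struct mock_sock {
	int lock_depth;	/* 1 while the lock is held, 0 otherwise */
};

static void mock_lock_sock(struct mock_sock *sk)
{
	assert(sk->lock_depth == 0);
	sk->lock_depth++;
}

static void mock_release_sock(struct mock_sock *sk)
{
	/* A second release of the same lock would trip this check. */
	assert(sk->lock_depth == 1);
	sk->lock_depth--;
}

/* Helper: called with the lock held, returns with the lock still held. */
static int mock_stream_recvmsg(struct mock_sock *sk, size_t len)
{
	assert(sk->lock_depth == 1);
	return (int)len;	/* pretend the whole buffer was filled */
}

/* Caller: the single owner of the lock/release pair. */
static int mock_connectible_recvmsg(struct mock_sock *sk, size_t len)
{
	int err = 0;

	mock_lock_sock(sk);

	/* A zero-length buffer is not an error, just bail out. */
	if (!len)
		goto out;

	err = mock_stream_recvmsg(sk, len);

out:
	mock_release_sock(sk);
	return err;
}
```

If mock_stream_recvmsg() also called mock_release_sock(), the release
at the 'out' label would hit the assertion, which is the double release
pointed out above.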

>@@ -2020,6 +1977,68 @@ vsock_connectible_recvmsg(struct socket *sock, 
>struct msghdr *msg, size_t len,
> 	return err;
> }
>
>+static int
>+vsock_connectible_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
>+			  int flags)
>+{
>+	struct sock *sk;
>+	struct vsock_sock *vsk;
>+	const struct vsock_transport *transport;
>+	int err;
>+
>+	DEFINE_WAIT(wait);
>+
>+	sk = sock->sk;
>+	vsk = vsock_sk(sk);
>+	err = 0;
>+
>+	lock_sock(sk);
>+
>+	transport = vsk->transport;
>+
>+	if (!transport || sk->sk_state != TCP_ESTABLISHED) {
>+		/* Recvmsg is supposed to return 0 if a peer performs an
>+		 * orderly shutdown. Differentiate between that case and when a
>+		 * peer has not connected or a local shutdown occurred with the
>+		 * SOCK_DONE flag.
>+		 */
>+		if (sock_flag(sk, SOCK_DONE))
>+			err = 0;
>+		else
>+			err = -ENOTCONN;
>+
>+		goto out;
>+	}
>+
>+	if (flags & MSG_OOB) {
>+		err = -EOPNOTSUPP;
>+		goto out;
>+	}
>+
>+	/* We don't check peer_shutdown flag here since peer may actually shut
>+	 * down, but there can be data in the queue that a local socket can
>+	 * receive.
>+	 */
>+	if (sk->sk_shutdown & RCV_SHUTDOWN) {
>+		err = 0;
>+		goto out;
>+	}
>+
>+	/* It is valid on Linux to pass in a zero-length receive buffer.  This
>+	 * is not an error.  We may as well bail out now.
>+	 */
>+	if (!len) {
>+		err = 0;
>+		goto out;
>+	}
>+
>+	err = __vsock_stream_recvmsg(sk, msg, len, flags);
>+
>+out:
>+	release_sock(sk);
>+	return err;
>+}
>+

The rest of the patch LGTM.

Stefano


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC PATCH v4 04/17] af_vsock: implement SEQPACKET receive loop
  2021-02-07 15:15 ` [RFC PATCH v4 04/17] af_vsock: implement SEQPACKET receive loop Arseny Krasnov
@ 2021-02-11 11:47     ` Stefano Garzarella
  0 siblings, 0 replies; 61+ messages in thread
From: Stefano Garzarella @ 2021-02-11 11:47 UTC (permalink / raw)
  To: Arseny Krasnov
  Cc: Stefan Hajnoczi, Michael S. Tsirkin, Jason Wang, David S. Miller,
	Jakub Kicinski, Jorgen Hansen, Colin Ian King, Andra Paraschiv,
	Alexander Popov, kvm, virtualization, netdev, linux-kernel,
	stsp2, oxffffaa

On Sun, Feb 07, 2021 at 06:15:22PM +0300, Arseny Krasnov wrote:
>This adds receive loop for SEQPACKET. It looks like receive loop for
>STREAM, but there is a little bit difference:
>1) It doesn't call notify callbacks.
>2) It doesn't care about 'SO_SNDLOWAT' and 'SO_RCVLOWAT' values, because
>   there is no sense for these values in SEQPACKET case.
>3) It waits until whole record is received or error is found during
>   receiving.
>4) It processes and sets 'MSG_TRUNC' flag.
>
>So to avoid extra conditions for two types of socket inside one loop, two
>independent functions were created.
>
>Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
>---
> include/net/af_vsock.h   |  5 +++
> net/vmw_vsock/af_vsock.c | 96 +++++++++++++++++++++++++++++++++++++++-
> 2 files changed, 100 insertions(+), 1 deletion(-)
>
>diff --git a/include/net/af_vsock.h b/include/net/af_vsock.h
>index b1c717286993..bb6a0e52be86 100644
>--- a/include/net/af_vsock.h
>+++ b/include/net/af_vsock.h
>@@ -135,6 +135,11 @@ struct vsock_transport {
> 	bool (*stream_is_active)(struct vsock_sock *);
> 	bool (*stream_allow)(u32 cid, u32 port);
>
>+	/* SEQ_PACKET. */
>+	size_t (*seqpacket_seq_get_len)(struct vsock_sock *);
>+	int (*seqpacket_dequeue)(struct vsock_sock *, struct msghdr *,
>+				     int flags, bool *msg_ready);

CHECK: Alignment should match open parenthesis
#35: FILE: include/net/af_vsock.h:141:
+	int (*seqpacket_dequeue)(struct vsock_sock *, struct msghdr *,
+				     int flags, bool *msg_ready);

And to make checkpatch.pl happy, please also use identifier names for
the other parameters. I know we haven't done this before, but for new
code I think we can do it.
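
Something along these lines is what I mean. This is only a sketch: the
struct name seqpacket_ops_sketch and the demo_* functions are made up
so that the fragment builds on its own in userspace, and the 'vsk' and
'msg' identifiers are just a suggestion. The continuation line is
aligned with the open parenthesis of the parameter list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Opaque stand-ins for the kernel types, only needed for this sketch. */
struct vsock_sock;
struct msghdr;

/* Hypothetical container; in the patch these live in struct vsock_transport. */
struct seqpacket_ops_sketch {
	size_t (*seqpacket_seq_get_len)(struct vsock_sock *vsk);
	int (*seqpacket_dequeue)(struct vsock_sock *vsk, struct msghdr *msg,
				 int flags, bool *msg_ready);
};

/* Dummy callbacks, just to show the declarations are usable. */
static size_t demo_get_len(struct vsock_sock *vsk)
{
	(void)vsk;
	return 0;
}

static int demo_dequeue(struct vsock_sock *vsk, struct msghdr *msg,
			int flags, bool *msg_ready)
{
	(void)vsk;
	(void)msg;
	(void)flags;
	assert(msg_ready != NULL);
	*msg_ready = true;
	return 0;
}

static const struct seqpacket_ops_sketch demo_ops = {
	.seqpacket_seq_get_len = demo_get_len,
	.seqpacket_dequeue = demo_dequeue,
};
```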

>+
> 	/* Notification. */
> 	int (*notify_poll_in)(struct vsock_sock *, size_t, bool *);
> 	int (*notify_poll_out)(struct vsock_sock *, size_t, bool *);
>diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
>index 66c8a932f49b..3d8af987216a 100644
>--- a/net/vmw_vsock/af_vsock.c
>+++ b/net/vmw_vsock/af_vsock.c
>@@ -1977,6 +1977,97 @@ static int __vsock_stream_recvmsg(struct sock *sk, struct msghdr *msg,
> 	return err;
> }
>
>+static int __vsock_seqpacket_recvmsg(struct sock *sk, struct msghdr *msg,
>+				     size_t len, int flags)
>+{
>+	const struct vsock_transport *transport;
>+	const struct iovec *orig_iov;
>+	unsigned long orig_nr_segs;
>+	bool msg_ready;
>+	struct vsock_sock *vsk;
>+	size_t record_len;
>+	long timeout;
>+	int err = 0;
>+	DEFINE_WAIT(wait);
>+
>+	vsk = vsock_sk(sk);
>+	transport = vsk->transport;
>+
>+	timeout = sock_rcvtimeo(sk, flags & MSG_DONTWAIT);
>+	orig_nr_segs = msg->msg_iter.nr_segs;
>+	orig_iov = msg->msg_iter.iov;
>+	msg_ready = false;
>+	record_len = 0;
>+
>+	while (1) {
>+		err = vsock_wait_data(sk, &wait, timeout, NULL, 0);
>+
>+		if (err <= 0) {
>+			/* In case of any loop break(timeout, signal
>+			 * interrupt or shutdown), we report user that
>+			 * nothing was copied.
>+			 */
>+			err = 0;
>+			break;
>+		}
>+
>+		if (record_len == 0) {
>+			record_len =
>+				transport->seqpacket_seq_get_len(vsk);
>+
>+			if (record_len == 0)
>+				continue;
>+		}
>+
>+		err = transport->seqpacket_dequeue(vsk, msg,
>+					flags, &msg_ready);

A single line here should be okay.

>+		if (err < 0) {
>+			if (err == -EAGAIN) {
>+				iov_iter_init(&msg->msg_iter, READ,
>+					      orig_iov, orig_nr_segs,
>+					      len);
>+				/* Clear 'MSG_EOR' here, because dequeue
>+				 * callback above set it again if it was
>+				 * set by sender. This 'MSG_EOR' is from
>+				 * dropped record.
>+				 */
>+				msg->msg_flags &= ~MSG_EOR;
>+				record_len = 0;
>+				continue;
>+			}
>+
>+			err = -ENOMEM;
>+			break;
>+		}
>+
>+		if (msg_ready)
>+			break;
>+	}
>+
>+	if (sk->sk_err)
>+		err = -sk->sk_err;
>+	else if (sk->sk_shutdown & RCV_SHUTDOWN)
>+		err = 0;
>+
>+	if (msg_ready) {
>+		/* User sets MSG_TRUNC, so return real length of
>+		 * packet.
>+		 */
>+		if (flags & MSG_TRUNC)
>+			err = record_len;
>+		else
>+			err = len - msg->msg_iter.count;
>+
>+		/* Always set MSG_TRUNC if real length of packet is
>+		 * bigger than user's buffer.
>+		 */
>+		if (record_len > len)
>+			msg->msg_flags |= MSG_TRUNC;
>+	}
>+
>+	return err;
>+}
>+
> static int
> vsock_connectible_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
> 			  int flags)
>@@ -2032,7 +2123,10 @@ vsock_connectible_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
> 		goto out;
> 	}
>
>-	err = __vsock_stream_recvmsg(sk, msg, len, flags);
>+	if (sk->sk_type == SOCK_STREAM)
>+		err = __vsock_stream_recvmsg(sk, msg, len, flags);
>+	else
>+		err = __vsock_seqpacket_recvmsg(sk, msg, len, flags);
>
> out:
> 	release_sock(sk);

The rest seems ok to me, but I need to get more familiar with SEQPACKET 
before giving my R-b.
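
For reference, this is how I read the MSG_TRUNC handling at the end of
__vsock_seqpacket_recvmsg(), written as a small userspace sketch. The
struct and function names are made up, and it assumes the transport
copies min(record_len, buf_len) bytes into the user's buffer, which the
hunk above does not show.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>	/* MSG_TRUNC */

/* Hypothetical result type, not part of the patch. */
struct seq_recv_result {
	size_t ret;	/* value recvmsg() would return */
	int msg_flags;	/* flags reported back in msg->msg_flags */
};

static struct seq_recv_result
seq_recv_result_sketch(size_t record_len, size_t buf_len, int flags)
{
	struct seq_recv_result res = { 0, 0 };
	size_t copied;

	/* The receive loop only gets here with a non-empty record. */
	assert(record_len > 0);

	/* Assumption: at most buf_len bytes end up in the user's buffer. */
	copied = record_len < buf_len ? record_len : buf_len;

	/* With MSG_TRUNC set by the caller, report the real record length. */
	res.ret = (flags & MSG_TRUNC) ? record_len : copied;

	/* Always flag truncation when the record did not fit. */
	if (record_len > buf_len)
		res.msg_flags |= MSG_TRUNC;

	return res;
}
```

So a 100-byte record read into a 64-byte buffer returns 64 without
MSG_TRUNC in 'flags' and 100 with it, and in both cases MSG_TRUNC is
set in the returned msg_flags.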

Thanks,
Stefano


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC PATCH v4 05/17] af_vsock: separate wait space loop
  2021-02-07 15:15 ` [RFC PATCH v4 05/17] af_vsock: separate wait space loop Arseny Krasnov
@ 2021-02-11 12:14     ` Stefano Garzarella
  2021-02-11 12:14     ` Stefano Garzarella
  1 sibling, 0 replies; 61+ messages in thread
From: Stefano Garzarella @ 2021-02-11 12:14 UTC (permalink / raw)
  To: Arseny Krasnov
  Cc: Stefan Hajnoczi, Michael S. Tsirkin, Jason Wang, David S. Miller,
	Jakub Kicinski, Jorgen Hansen, Colin Ian King, Andra Paraschiv,
	Jeff Vander Stoep, kvm, virtualization, netdev, linux-kernel,
	stsp2, oxffffaa

On Sun, Feb 07, 2021 at 06:15:41PM +0300, Arseny Krasnov wrote:
>This moves loop that waits for space on send to separate function,
>because it will be used for SEQ_BEGIN/SEQ_END sending before and
>after data transmission. Waiting for SEQ_BEGIN/SEQ_END is needed
>because such packets carries SEQPACKET header that couldn't be
>fragmented by credit mechanism, so to avoid it, sender waits until
>enough space will be ready.
>
>Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
>---
> include/net/af_vsock.h   |  2 +
> net/vmw_vsock/af_vsock.c | 93 ++++++++++++++++++++++++++--------------
> 2 files changed, 62 insertions(+), 33 deletions(-)
>
>diff --git a/include/net/af_vsock.h b/include/net/af_vsock.h
>index bb6a0e52be86..19f6f22821ec 100644
>--- a/include/net/af_vsock.h
>+++ b/include/net/af_vsock.h
>@@ -205,6 +205,8 @@ void vsock_remove_sock(struct vsock_sock *vsk);
> void vsock_for_each_connected_socket(void (*fn)(struct sock *sk));
> int vsock_assign_transport(struct vsock_sock *vsk, struct vsock_sock *psk);
> bool vsock_find_cid(unsigned int cid);
>+int vsock_wait_space(struct sock *sk, size_t space, int flags,
>+		     struct vsock_transport_send_notify_data *send_data);
>
> /**** TAP ****/
>
>diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
>index 3d8af987216a..ea99261e88ac 100644
>--- a/net/vmw_vsock/af_vsock.c
>+++ b/net/vmw_vsock/af_vsock.c
>@@ -1693,6 +1693,64 @@ static int vsock_connectible_getsockopt(struct socket *sock,
> 	return 0;
> }
>
>+int vsock_wait_space(struct sock *sk, size_t space, int flags,
>+		     struct vsock_transport_send_notify_data *send_data)
>+{
>+	const struct vsock_transport *transport;
>+	struct vsock_sock *vsk;
>+	long timeout;
>+	int err;
>+
>+	DEFINE_WAIT_FUNC(wait, woken_wake_function);
>+
>+	vsk = vsock_sk(sk);
>+	transport = vsk->transport;
>+	timeout = sock_sndtimeo(sk, flags & MSG_DONTWAIT);
>+	err = 0;
>+
>+	add_wait_queue(sk_sleep(sk), &wait);
>+
>+	while (vsock_stream_has_space(vsk) < space &&
>+	       sk->sk_err == 0 &&
>+	       !(sk->sk_shutdown & SEND_SHUTDOWN) &&
>+	       !(vsk->peer_shutdown & RCV_SHUTDOWN)) {

Maybe a blank line here, like in the original code, would help
readability.

>+		/* Don't wait for non-blocking sockets. */
>+		if (timeout == 0) {
>+			err = -EAGAIN;
>+			goto out_err;
>+		}
>+
>+		if (send_data) {
>+			err = transport->notify_send_pre_block(vsk, send_data);
>+			if (err < 0)
>+				goto out_err;
>+		}
>+
>+		release_sock(sk);
>+		timeout = wait_woken(&wait, TASK_INTERRUPTIBLE, timeout);
>+		lock_sock(sk);
>+		if (signal_pending(current)) {
>+			err = sock_intr_errno(timeout);
>+			goto out_err;
>+		} else if (timeout == 0) {
>+			err = -EAGAIN;
>+			goto out_err;
>+		}
>+	}
>+
>+	if (sk->sk_err) {
>+		err = -sk->sk_err;
>+	} else if ((sk->sk_shutdown & SEND_SHUTDOWN) ||
>+		   (vsk->peer_shutdown & RCV_SHUTDOWN)) {
>+		err = -EPIPE;
>+	}
>+
>+out_err:
>+	remove_wait_queue(sk_sleep(sk), &wait);
>+	return err;
>+}
>+EXPORT_SYMBOL_GPL(vsock_wait_space);
>+
> static int vsock_connectible_sendmsg(struct socket *sock, struct msghdr *msg,
> 				     size_t len)
> {

After removing the wait loop in vsock_connectible_sendmsg(), we should 
remove the 'timeout' variable because it is no longer used.

>@@ -1751,39 +1809,8 @@ static int vsock_connectible_sendmsg(struct socket *sock, struct msghdr *msg,
> 	while (total_written < len) {
> 		ssize_t written;
>
>-		add_wait_queue(sk_sleep(sk), &wait);
>-		while (vsock_stream_has_space(vsk) == 0 &&
>-		       sk->sk_err == 0 &&
>-		       !(sk->sk_shutdown & SEND_SHUTDOWN) &&
>-		       !(vsk->peer_shutdown & RCV_SHUTDOWN)) {
>-
>-			/* Don't wait for non-blocking sockets. */
>-			if (timeout == 0) {
>-				err = -EAGAIN;
>-				remove_wait_queue(sk_sleep(sk), &wait);
>-				goto out_err;
>-			}
>-
>-			err = transport->notify_send_pre_block(vsk, &send_data);
>-			if (err < 0) {
>-				remove_wait_queue(sk_sleep(sk), &wait);
>-				goto out_err;
>-			}
>-
>-			release_sock(sk);
>-			timeout = wait_woken(&wait, TASK_INTERRUPTIBLE, timeout);
>-			lock_sock(sk);
>-			if (signal_pending(current)) {
>-				err = sock_intr_errno(timeout);
>-				remove_wait_queue(sk_sleep(sk), &wait);
>-				goto out_err;
>-			} else if (timeout == 0) {
>-				err = -EAGAIN;
>-				remove_wait_queue(sk_sleep(sk), &wait);
>-				goto out_err;
>-			}
>-		}
>-		remove_wait_queue(sk_sleep(sk), &wait);
>+		if (vsock_wait_space(sk, 1, msg->msg_flags, &send_data))
>+			goto out_err;
>
> 		/* These checks occur both as part of and after the loop
> 		 * conditional since we need to check before and after
>-- 
>2.25.1
>


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC PATCH v4 06/17] af_vsock: implement send logic for SEQPACKET
  2021-02-07 15:15 ` [RFC PATCH v4 06/17] af_vsock: implement send logic for SEQPACKET Arseny Krasnov
@ 2021-02-11 12:17     ` Stefano Garzarella
  0 siblings, 0 replies; 61+ messages in thread
From: Stefano Garzarella @ 2021-02-11 12:17 UTC (permalink / raw)
  To: Arseny Krasnov
  Cc: Stefan Hajnoczi, Michael S. Tsirkin, Jason Wang, David S. Miller,
	Jakub Kicinski, Jorgen Hansen, Andra Paraschiv, Colin Ian King,
	Jeff Vander Stoep, kvm, virtualization, netdev, linux-kernel,
	stsp2, oxffffaa

On Sun, Feb 07, 2021 at 06:15:57PM +0300, Arseny Krasnov wrote:
>This adds some logic to current stream enqueue function for SEQPACKET
>support:
>1) Send record's begin/end marker.
>2) Return value from enqueue function is whole record length or error
>   for SOCK_SEQPACKET.
>
>Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
>---
> include/net/af_vsock.h   |  2 ++
> net/vmw_vsock/af_vsock.c | 22 ++++++++++++++++++++--
> 2 files changed, 22 insertions(+), 2 deletions(-)
>
>diff --git a/include/net/af_vsock.h b/include/net/af_vsock.h
>index 19f6f22821ec..198d58c4c7ee 100644
>--- a/include/net/af_vsock.h
>+++ b/include/net/af_vsock.h
>@@ -136,6 +136,8 @@ struct vsock_transport {
> 	bool (*stream_allow)(u32 cid, u32 port);
>
> 	/* SEQ_PACKET. */
>+	int (*seqpacket_seq_send_len)(struct vsock_sock *, size_t len, int flags);
>+	int (*seqpacket_seq_send_eor)(struct vsock_sock *, int flags);

As before, we could add identifiers for the parameters.
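
A minimal sketch of what named parameters could look like for the two new
callbacks. The identifier 'vsk' is an assumption borrowed from the caller in
af_vsock.c; the reduced struct vsock_transport_sketch and the stub functions
are hypothetical and exist only so the fragment compiles on its own. The real
struct vsock_transport has many more members.

```c
#include <assert.h>
#include <stddef.h>

/* Opaque here; the kernel defines the real struct vsock_sock. */
struct vsock_sock;

/* Hypothetical reduced view of the SEQ_PACKET send callbacks, with every
 * parameter named.
 */
struct vsock_transport_sketch {
	int (*seqpacket_seq_send_len)(struct vsock_sock *vsk, size_t len,
				      int flags);
	int (*seqpacket_seq_send_eor)(struct vsock_sock *vsk, int flags);
};

/* Stub callbacks (illustrative only): always report success. */
static int stub_send_len(struct vsock_sock *vsk, size_t len, int flags)
{
	(void)vsk;
	(void)len;
	(void)flags;
	return 0;
}

static int stub_send_eor(struct vsock_sock *vsk, int flags)
{
	(void)vsk;
	(void)flags;
	return 0;
}

static const struct vsock_transport_sketch sketch_transport = {
	.seqpacket_seq_send_len = stub_send_len,
	.seqpacket_seq_send_eor = stub_send_eor,
};
```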

Other than that, the patch LGTM.

Stefano

> 	size_t (*seqpacket_seq_get_len)(struct vsock_sock *);
> 	int (*seqpacket_dequeue)(struct vsock_sock *, struct msghdr *,
> 				     int flags, bool *msg_ready);
>diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
>index ea99261e88ac..a033d3340ac4 100644
>--- a/net/vmw_vsock/af_vsock.c
>+++ b/net/vmw_vsock/af_vsock.c
>@@ -1806,6 +1806,12 @@ static int vsock_connectible_sendmsg(struct socket *sock, struct msghdr *msg,
> 	if (err < 0)
> 		goto out;
>
>+	if (sk->sk_type == SOCK_SEQPACKET) {
>+		err = transport->seqpacket_seq_send_len(vsk, len, msg->msg_flags);
>+		if (err < 0)
>+			goto out;
>+	}
>+
> 	while (total_written < len) {
> 		ssize_t written;
>
>@@ -1852,9 +1858,21 @@ static int vsock_connectible_sendmsg(struct socket *sock, struct msghdr *msg,
>
> 	}
>
>+	if (sk->sk_type == SOCK_SEQPACKET) {
>+		err = transport->seqpacket_seq_send_eor(vsk, msg->msg_flags);
>+		if (err < 0)
>+			goto out;
>+	}
>+
> out_err:
>-	if (total_written > 0)
>-		err = total_written;
>+	if (total_written > 0) {
>+		/* Return number of written bytes only if:
>+		 * 1) SOCK_STREAM socket.
>+		 * 2) SOCK_SEQPACKET socket when whole buffer is sent.
>+		 */
>+		if (sk->sk_type == SOCK_STREAM || total_written == len)
>+			err = total_written;
>+	}
> out:
> 	release_sock(sk);
> 	return err;
>-- 
>2.25.1
>
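
The return-value rule at the out_err label can be modelled in user space
roughly as follows. This is only a sketch: sendmsg_result() and the SKETCH_*
constants are hypothetical names, and the 'err' argument stands for whatever
error the send loop left behind.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical stand-ins for SOCK_STREAM and SOCK_SEQPACKET. */
enum sketch_sock_type { SKETCH_STREAM, SKETCH_SEQPACKET };

/* Mirrors the out_err logic of the patch: a STREAM socket reports a partial
 * write as a byte count, a SEQPACKET socket reports a byte count only when
 * the whole record was written and otherwise returns the error.
 */
static ssize_t sendmsg_result(enum sketch_sock_type type,
			      size_t total_written, size_t len, int err)
{
	if (total_written > 0) {
		if (type == SKETCH_STREAM || total_written == len)
			return (ssize_t)total_written;
	}

	return err;
}
```

With this model, a SEQPACKET sender that managed to queue only part of a
record sees the error rather than a short count, so it never mistakes a
truncated record for a complete one.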


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC PATCH v4 07/17] af_vsock: rest of SEQPACKET support
  2021-02-07 15:16 ` [RFC PATCH v4 07/17] af_vsock: rest of SEQPACKET support Arseny Krasnov
@ 2021-02-11 12:27     ` Stefano Garzarella
  0 siblings, 0 replies; 61+ messages in thread
From: Stefano Garzarella @ 2021-02-11 12:27 UTC (permalink / raw)
  To: Arseny Krasnov
  Cc: Stefan Hajnoczi, Michael S. Tsirkin, Jason Wang, David S. Miller,
	Jakub Kicinski, Jorgen Hansen, Colin Ian King, Andra Paraschiv,
	Jeff Vander Stoep, kvm, virtualization, netdev, linux-kernel,
	stsp2, oxffffaa

On Sun, Feb 07, 2021 at 06:16:12PM +0300, Arseny Krasnov wrote:
>This adds the rest of SOCK_SEQPACKET support:
>1) Adds socket ops for SEQPACKET type.
>2) Allows creating a socket with SEQPACKET type.
>
>Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
>---
> net/vmw_vsock/af_vsock.c | 37 ++++++++++++++++++++++++++++++++++++-
> 1 file changed, 36 insertions(+), 1 deletion(-)
>
>diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
>index a033d3340ac4..c77998a14018 100644
>--- a/net/vmw_vsock/af_vsock.c
>+++ b/net/vmw_vsock/af_vsock.c
>@@ -452,6 +452,7 @@ int vsock_assign_transport(struct vsock_sock *vsk, struct vsock_sock *psk)
> 		new_transport = transport_dgram;
> 		break;
> 	case SOCK_STREAM:
>+	case SOCK_SEQPACKET:
> 		if (vsock_use_local_transport(remote_cid))
> 			new_transport = transport_local;
> 		else if (remote_cid <= VMADDR_CID_HOST || !transport_h2g ||
>@@ -459,6 +460,15 @@ int vsock_assign_transport(struct vsock_sock *vsk, struct vsock_sock *psk)
> 			new_transport = transport_g2h;
> 		else
> 			new_transport = transport_h2g;
>+
>+		if (sk->sk_type == SOCK_SEQPACKET) {
>+			if (!new_transport ||
>+			    !new_transport->seqpacket_seq_send_len ||
>+			    !new_transport->seqpacket_seq_send_eor ||
>+			    !new_transport->seqpacket_seq_get_len ||
>+			    !new_transport->seqpacket_dequeue)
>+				return -ESOCKTNOSUPPORT;
>+		}

Maybe we should move this check after the try_module_get() call, since 
the memory pointed to by the 'new_transport' pointer can be deallocated 
in the meantime.

Also, if the socket had a transport before, we should deassign it before 
returning an error.
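
A user-space sketch of the ordering suggested above: pin the module first,
inspect the callbacks only afterwards, and drop the reference again if the
transport turns out not to support SEQPACKET. Everything named sketch_* is
hypothetical, the refcount is a plain integer rather than the kernel's module
reference counting, and -ENODEV for a failed get is an assumption made for
illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a kernel module and its reference count. */
struct sketch_module {
	int refcnt;
	bool live;		/* false once the module is going away */
};

struct sketch_transport {
	struct sketch_module *module;
	bool has_seqpacket_ops;	/* stands for the four seqpacket_* callbacks */
};

/* Models try_module_get(): fails if the module is being unloaded. */
static bool sketch_module_get(struct sketch_module *m)
{
	if (!m->live)
		return false;
	m->refcnt++;
	return true;
}

static void sketch_module_put(struct sketch_module *m)
{
	m->refcnt--;
}

/* Take the reference before touching the callbacks, so the transport
 * cannot disappear between the check and its use.
 */
static int sketch_assign(struct sketch_transport *t, bool seqpacket)
{
	if (!t || !sketch_module_get(t->module))
		return -ENODEV;

	if (seqpacket && !t->has_seqpacket_ops) {
		sketch_module_put(t->module);
		return -ESOCKTNOSUPPORT;
	}

	return 0;
}
```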

> 		break;
> 	default:
> 		return -ESOCKTNOSUPPORT;
>@@ -684,6 +694,7 @@ static int __vsock_bind(struct sock *sk, struct sockaddr_vm *addr)
>
> 	switch (sk->sk_socket->type) {
> 	case SOCK_STREAM:
>+	case SOCK_SEQPACKET:
> 		spin_lock_bh(&vsock_table_lock);
> 		retval = __vsock_bind_connectible(vsk, addr);
> 		spin_unlock_bh(&vsock_table_lock);
>@@ -769,7 +780,7 @@ static struct sock *__vsock_create(struct net *net,
>
> static bool sock_type_connectible(u16 type)
> {
>-	return type == SOCK_STREAM;
>+	return (type == SOCK_STREAM) || (type == SOCK_SEQPACKET);
> }
>
> static void __vsock_release(struct sock *sk, int level)
>@@ -2199,6 +2210,27 @@ static const struct proto_ops vsock_stream_ops = {
> 	.sendpage = sock_no_sendpage,
> };
>
>+static const struct proto_ops vsock_seqpacket_ops = {
>+	.family = PF_VSOCK,
>+	.owner = THIS_MODULE,
>+	.release = vsock_release,
>+	.bind = vsock_bind,
>+	.connect = vsock_connect,
>+	.socketpair = sock_no_socketpair,
>+	.accept = vsock_accept,
>+	.getname = vsock_getname,
>+	.poll = vsock_poll,
>+	.ioctl = sock_no_ioctl,
>+	.listen = vsock_listen,
>+	.shutdown = vsock_shutdown,
>+	.setsockopt = vsock_connectible_setsockopt,
>+	.getsockopt = vsock_connectible_getsockopt,
>+	.sendmsg = vsock_connectible_sendmsg,
>+	.recvmsg = vsock_connectible_recvmsg,
>+	.mmap = sock_no_mmap,
>+	.sendpage = sock_no_sendpage,
>+};
>+
> static int vsock_create(struct net *net, struct socket *sock,
> 			int protocol, int kern)
> {
>@@ -2219,6 +2251,9 @@ static int vsock_create(struct net *net, struct socket *sock,
> 	case SOCK_STREAM:
> 		sock->ops = &vsock_stream_ops;
> 		break;
>+	case SOCK_SEQPACKET:
>+		sock->ops = &vsock_seqpacket_ops;
>+		break;
> 	default:
> 		return -ESOCKTNOSUPPORT;
> 	}
>-- 
>2.25.1
>


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC PATCH v4 08/17] af_vsock: update comments for stream sockets
  2021-02-07 15:16 ` [RFC PATCH v4 08/17] af_vsock: update comments for stream sockets Arseny Krasnov
@ 2021-02-11 13:19     ` Stefano Garzarella
  0 siblings, 0 replies; 61+ messages in thread
From: Stefano Garzarella @ 2021-02-11 13:19 UTC (permalink / raw)
  To: Arseny Krasnov
  Cc: Stefan Hajnoczi, Michael S. Tsirkin, Jason Wang, David S. Miller,
	Jakub Kicinski, Jorgen Hansen, Colin Ian King, Andra Paraschiv,
	Jeff Vander Stoep, kvm, virtualization, netdev, linux-kernel,
	stsp2, oxffffaa

On Sun, Feb 07, 2021 at 06:16:29PM +0300, Arseny Krasnov wrote:
>This replaces 'stream' with 'connect oriented' in comments, as SEQPACKET
>is also connect oriented.

I'm not a native speaker, but maybe 'connection oriented' is better; 
looking at the socket(2) man page, 'connection-based' is also fine.

Thanks,
Stefano

>
>Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
>---
> net/vmw_vsock/af_vsock.c | 31 +++++++++++++++++--------------
> 1 file changed, 17 insertions(+), 14 deletions(-)
>
>diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
>index c77998a14018..6e5e192cb703 100644
>--- a/net/vmw_vsock/af_vsock.c
>+++ b/net/vmw_vsock/af_vsock.c
>@@ -415,8 +415,8 @@ static void vsock_deassign_transport(struct vsock_sock *vsk)
>
> /* Assign a transport to a socket and call the .init transport callback.
>  *
>- * Note: for stream socket this must be called when vsk->remote_addr is set
>- * (e.g. during the connect() or when a connection request on a listener
>+ * Note: for connect oriented socket this must be called when vsk->remote_addr
>+ * is set (e.g. during the connect() or when a connection request on a listener
>  * socket is received).
>  * The vsk->remote_addr is used to decide which transport to use:
>  *  - remote CID == VMADDR_CID_LOCAL or g2h->local_cid or VMADDR_CID_HOST if
>@@ -479,10 +479,10 @@ int vsock_assign_transport(struct vsock_sock *vsk, struct vsock_sock *psk)
> 			return 0;
>
> 		/* transport->release() must be called with sock lock acquired.
>-		 * This path can only be taken during vsock_stream_connect(),
>-		 * where we have already held the sock lock.
>-		 * In the other cases, this function is called on a new socket
>-		 * which is not assigned to any transport.
>+		 * This path can only be taken during vsock_connect(), where we
>+		 * have already held the sock lock. In the other cases, this
>+		 * function is called on a new socket which is not assigned to
>+		 * any transport.
> 		 */
> 		vsk->transport->release(vsk);
> 		vsock_deassign_transport(vsk);
>@@ -659,9 +659,10 @@ static int __vsock_bind_connectible(struct vsock_sock *vsk,
>
> 	vsock_addr_init(&vsk->local_addr, new_addr.svm_cid, new_addr.svm_port);
>
>-	/* Remove stream sockets from the unbound list and add them to the hash
>-	 * table for easy lookup by its address.  The unbound list is simply an
>-	 * extra entry at the end of the hash table, a trick used by AF_UNIX.
>+	/* Remove connect oriented sockets from the unbound list and add them
>+	 * to the hash table for easy lookup by its address.  The unbound list
>+	 * is simply an extra entry at the end of the hash table, a trick used
>+	 * by AF_UNIX.
> 	 */
> 	__vsock_remove_bound(vsk);
> 	__vsock_insert_bound(vsock_bound_sockets(&vsk->local_addr), vsk);
>@@ -952,10 +953,10 @@ static int vsock_shutdown(struct socket *sock, int mode)
> 	if ((mode & ~SHUTDOWN_MASK) || !mode)
> 		return -EINVAL;
>
>-	/* If this is a STREAM socket and it is not connected then bail out
>-	 * immediately.  If it is a DGRAM socket then we must first kick the
>-	 * socket so that it wakes up from any sleeping calls, for example
>-	 * recv(), and then afterwards return the error.
>+	/* If this is a connect oriented socket and it is not connected then
>+	 * bail out immediately.  If it is a DGRAM socket then we must first
>+	 * kick the socket so that it wakes up from any sleeping calls, for
>+	 * example recv(), and then afterwards return the error.
> 	 */
>
> 	sk = sock->sk;
>@@ -1786,7 +1787,9 @@ static int vsock_connectible_sendmsg(struct socket *sock, struct msghdr *msg,
>
> 	transport = vsk->transport;
>
>-	/* Callers should not provide a destination with stream sockets. */
>+	/* Callers should not provide a destination with connect oriented
>+	 * sockets.
>+	 */
> 	if (msg->msg_namelen) {
> 		err = sk->sk_state == TCP_ESTABLISHED ? -EISCONN : -EOPNOTSUPP;
> 		goto out;
>-- 
>2.25.1
>


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC PATCH v4 09/17] virtio/vsock: dequeue callback for SOCK_SEQPACKET
  2021-02-07 15:16 ` [RFC PATCH v4 09/17] virtio/vsock: dequeue callback for SOCK_SEQPACKET Arseny Krasnov
@ 2021-02-11 13:54     ` Stefano Garzarella
  0 siblings, 0 replies; 61+ messages in thread
From: Stefano Garzarella @ 2021-02-11 13:54 UTC (permalink / raw)
  To: Arseny Krasnov
  Cc: Stefan Hajnoczi, Michael S. Tsirkin, Jason Wang, David S. Miller,
	Jakub Kicinski, Jorgen Hansen, Andra Paraschiv, Colin Ian King,
	Jeff Vander Stoep, kvm, virtualization, netdev, linux-kernel,
	stsp2, oxffffaa

On Sun, Feb 07, 2021 at 06:16:46PM +0300, Arseny Krasnov wrote:
>This adds a transport callback and its logic for SEQPACKET dequeue.
>The callback fetches RW packets from the socket's rx queue until the
>whole record is copied (if the user's buffer is full, the user is not
>woken up). This is done to avoid stalling the sender: if we wake up the
>user and it leaves the syscall, nobody will send a credit update for the
>rest of the record, and the sender will wait until the receiver next
>enters the read syscall. So if the user's buffer is full, we just send a
>credit update and drop the data. If SEQ_BEGIN is found during the copy
>(and not all data has been copied), copying is restarted by resetting
>the user's iov iterator (the previous unfinished data is dropped).
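
The copy-or-drop accounting described above can be sketched in user space as
follows. It assumes a flat destination buffer instead of an iov iterator;
struct sketch_rx_state and sketch_consume_rw() are hypothetical names, with
the field names loosely following the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-record receive state. */
struct sketch_rx_state {
	size_t user_buf_left;		/* room left in the user's buffer */
	size_t user_read_copied;	/* bytes of the record consumed so far */
};

/* Consume one RW payload: copy what still fits into the user's buffer,
 * silently drop the rest, and count the full packet length either way so
 * the record length check at SEQ_END still works after truncation.
 * Returns the number of bytes actually copied.
 */
static size_t sketch_consume_rw(struct sketch_rx_state *st, char *dst,
				const char *payload, size_t pkt_len)
{
	size_t bytes_to_copy = pkt_len < st->user_buf_left ?
			       pkt_len : st->user_buf_left;

	memcpy(dst, payload, bytes_to_copy);
	st->user_buf_left -= bytes_to_copy;
	st->user_read_copied += pkt_len;

	return bytes_to_copy;
}
```

Counting pkt_len rather than bytes_to_copy is the design point: the receiver
keeps draining the queue (and can keep returning credit) even when nothing
more fits into the user's buffer.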
>
>Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
>---
> include/linux/virtio_vsock.h            |   5 +
> include/uapi/linux/virtio_vsock.h       |  16 ++++
> net/vmw_vsock/virtio_transport_common.c | 120 ++++++++++++++++++++++++
> 3 files changed, 141 insertions(+)
>
>diff --git a/include/linux/virtio_vsock.h b/include/linux/virtio_vsock.h
>index dc636b727179..4d0de3dee9a4 100644
>--- a/include/linux/virtio_vsock.h
>+++ b/include/linux/virtio_vsock.h
>@@ -36,6 +36,11 @@ struct virtio_vsock_sock {
> 	u32 rx_bytes;
> 	u32 buf_alloc;
> 	struct list_head rx_queue;
>+
>+	/* For SOCK_SEQPACKET */
>+	u32 user_read_seq_len;
>+	u32 user_read_copied;
>+	u32 curr_rx_msg_cnt;
> };
>
> struct virtio_vsock_pkt {
>diff --git a/include/uapi/linux/virtio_vsock.h b/include/uapi/linux/virtio_vsock.h
>index 1d57ed3d84d2..cf9c165e5cca 100644
>--- a/include/uapi/linux/virtio_vsock.h
>+++ b/include/uapi/linux/virtio_vsock.h
>@@ -63,8 +63,14 @@ struct virtio_vsock_hdr {
> 	__le32	fwd_cnt;
> } __attribute__((packed));
>
>+struct virtio_vsock_seq_hdr {
>+	__le32  msg_cnt;
>+	__le32  msg_len;
>+} __attribute__((packed));
>+
> enum virtio_vsock_type {
> 	VIRTIO_VSOCK_TYPE_STREAM = 1,
>+	VIRTIO_VSOCK_TYPE_SEQPACKET = 2,
> };
>
> enum virtio_vsock_op {
>@@ -83,6 +89,11 @@ enum virtio_vsock_op {
> 	VIRTIO_VSOCK_OP_CREDIT_UPDATE = 6,
> 	/* Request the peer to send the credit info to us */
> 	VIRTIO_VSOCK_OP_CREDIT_REQUEST = 7,
>+
>+	/* Record begin for SOCK_SEQPACKET */
>+	VIRTIO_VSOCK_OP_SEQ_BEGIN = 8,
>+	/* Record end for SOCK_SEQPACKET */
>+	VIRTIO_VSOCK_OP_SEQ_END = 9,
> };
>
> /* VIRTIO_VSOCK_OP_SHUTDOWN flags values */
>@@ -91,4 +102,9 @@ enum virtio_vsock_shutdown {
> 	VIRTIO_VSOCK_SHUTDOWN_SEND = 2,
> };
>
>+/* VIRTIO_VSOCK_OP_RW flags values */
>+enum virtio_vsock_rw {
>+	VIRTIO_VSOCK_RW_EOR = 1,
>+};
>+
> #endif /* _UAPI_LINUX_VIRTIO_VSOCK_H */
>diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
>index 5956939eebb7..4572d01c8ea5 100644
>--- a/net/vmw_vsock/virtio_transport_common.c
>+++ b/net/vmw_vsock/virtio_transport_common.c
>@@ -397,6 +397,126 @@ virtio_transport_stream_do_dequeue(struct vsock_sock *vsk,
> 	return err;
> }
>
>+static inline void virtio_transport_remove_pkt(struct virtio_vsock_pkt *pkt)
>+{
>+	list_del(&pkt->list);
>+	virtio_transport_free_pkt(pkt);
>+}
>+
>+static size_t virtio_transport_drop_until_seq_begin(struct virtio_vsock_sock *vvs)
>+{

This function is not used here but in the next patch, so I'd add it 
with the next patch.

>+	struct virtio_vsock_pkt *pkt, *n;
>+	size_t bytes_dropped = 0;
>+
>+	list_for_each_entry_safe(pkt, n, &vvs->rx_queue, list) {
>+		if (le16_to_cpu(pkt->hdr.op) == VIRTIO_VSOCK_OP_SEQ_BEGIN)
>+			break;
>+
>+		bytes_dropped += le32_to_cpu(pkt->hdr.len);
>+		virtio_transport_dec_rx_pkt(vvs, pkt);
>+		virtio_transport_remove_pkt(pkt);
>+	}
>+
>+	return bytes_dropped;
>+}
>+
>+static int virtio_transport_seqpacket_do_dequeue(struct vsock_sock *vsk,
>+						 struct msghdr *msg,
>+						 bool *msg_ready)
>+{

This function is also unused here; maybe you can add the 
virtio_transport_seqpacket_dequeue() implementation in this patch.

>+	struct virtio_vsock_sock *vvs = vsk->trans;
>+	struct virtio_vsock_pkt *pkt;
>+	int err = 0;
>+	size_t user_buf_len = msg->msg_iter.count;
>+
>+	*msg_ready = false;
>+	spin_lock_bh(&vvs->rx_lock);
>+
>+	while (!*msg_ready && !list_empty(&vvs->rx_queue) && !err) {
>+		pkt = list_first_entry(&vvs->rx_queue, struct virtio_vsock_pkt, list);
>+
>+		switch (le16_to_cpu(pkt->hdr.op)) {
>+		case VIRTIO_VSOCK_OP_SEQ_BEGIN: {
>+			/* Unexpected 'SEQ_BEGIN' during record copy:
>+			 * Leave receive loop, 'EAGAIN' will restart it from
>+			 * outer receive loop, packet is still in queue and
>+			 * counters are cleared. So in next loop enter,
>+			 * 'SEQ_BEGIN' will be dequeued first. User's iov
>+			 * iterator will be reset in outer loop. Also
>+			 * send credit update, because some bytes could be
>+			 * copied. User will never see unfinished record.
>+			 */
>+			err = -EAGAIN;
>+			break;
>+		}
>+		case VIRTIO_VSOCK_OP_SEQ_END: {
>+			struct virtio_vsock_seq_hdr *seq_hdr;
>+
>+			seq_hdr = (struct virtio_vsock_seq_hdr *)pkt->buf;
>+			/* First check that whole record is received. */
>+
>+			if (vvs->user_read_copied != vvs->user_read_seq_len ||
>+			    (le32_to_cpu(seq_hdr->msg_cnt) - vvs->curr_rx_msg_cnt) != 1) {
>+				/* Tail of current record and head of next missed,
>+				 * so this EOR is from next record. Restart receive.
>+				 * Current record will be dropped, next headless will
>+				 * be dropped on next attempt to get record length.
>+				 */
>+				err = -EAGAIN;
>+			} else {
>+				/* Success. */
>+				*msg_ready = true;
>+			}
>+
>+			break;
>+		}
>+		case VIRTIO_VSOCK_OP_RW: {
>+			size_t bytes_to_copy;
>+			size_t pkt_len;
>+
>+			pkt_len = (size_t)le32_to_cpu(pkt->hdr.len);
>+			bytes_to_copy = min(user_buf_len, pkt_len);
>+
>+			/* sk_lock is held by caller so no one else can dequeue.
>+			 * Unlock rx_lock since memcpy_to_msg() may sleep.
>+			 */
>+			spin_unlock_bh(&vvs->rx_lock);
>+
>+			if (memcpy_to_msg(msg, pkt->buf, bytes_to_copy)) {
>+				spin_lock_bh(&vvs->rx_lock);
>+				err = -EINVAL;
>+				break;
>+			}
>+
>+			spin_lock_bh(&vvs->rx_lock);
>+			user_buf_len -= bytes_to_copy;
>+			vvs->user_read_copied += pkt_len;
>+
>+			if (le32_to_cpu(pkt->hdr.flags) & VIRTIO_VSOCK_RW_EOR)
>+				msg->msg_flags |= MSG_EOR;
>+			break;
>+		}
>+		default:
>+			;
>+		}
>+
>+		/* For unexpected 'SEQ_BEGIN', keep such packet in queue,
>+		 * but drop any other type of packet.
>+		 */
>+		if (le16_to_cpu(pkt->hdr.op) != VIRTIO_VSOCK_OP_SEQ_BEGIN) {
>+			virtio_transport_dec_rx_pkt(vvs, pkt);
>+			virtio_transport_remove_pkt(pkt);
>+		}
>+	}
>+
>+	spin_unlock_bh(&vvs->rx_lock);
>+
>+	virtio_transport_send_credit_update(vsk, VIRTIO_VSOCK_TYPE_SEQPACKET,
>+					    NULL);
>+
>+	return err;
>+}
>+
> ssize_t
> virtio_transport_stream_dequeue(struct vsock_sock *vsk,
> 				struct msghdr *msg,
>-- 
>2.25.1
>
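
The SEQ_END validity check in the patch can be modelled in user space like
this. It is a sketch under assumptions: the header is taken to be the 8-byte
payload laid out as in struct virtio_vsock_seq_hdr (msg_cnt, then msg_len,
both little-endian), and sketch_le32() and sketch_seq_end_ok() are
hypothetical helpers rather than kernel functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Decode a little-endian 32-bit value byte by byte, so the result does not
 * depend on the host's byte order.
 */
static uint32_t sketch_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* A record is complete only if every byte announced by SEQ_BEGIN was
 * consumed and the SEQ_END counter is exactly one past the counter taken
 * from SEQ_BEGIN. Unsigned subtraction keeps the comparison valid across
 * counter wrap-around.
 */
static bool sketch_seq_end_ok(const uint8_t *hdr, uint32_t curr_rx_msg_cnt,
			      uint32_t user_read_copied,
			      uint32_t user_read_seq_len)
{
	uint32_t msg_cnt = sketch_le32(hdr);

	return user_read_copied == user_read_seq_len &&
	       (uint32_t)(msg_cnt - curr_rx_msg_cnt) == 1;
}
```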


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC PATCH v4 09/17] virtio/vsock: dequeue callback for SOCK_SEQPACKET
@ 2021-02-11 13:54     ` Stefano Garzarella
  0 siblings, 0 replies; 61+ messages in thread
From: Stefano Garzarella @ 2021-02-11 13:54 UTC (permalink / raw)
  To: Arseny Krasnov
  Cc: Andra Paraschiv, kvm, Michael S. Tsirkin, Jeff Vander Stoep,
	stsp2, linux-kernel, virtualization, oxffffaa, netdev,
	Stefan Hajnoczi, Colin Ian King, Jakub Kicinski, David S. Miller,
	Jorgen Hansen

On Sun, Feb 07, 2021 at 06:16:46PM +0300, Arseny Krasnov wrote:
>This adds a transport callback and its logic for SEQPACKET dequeue.
>The callback fetches RW packets from the socket's rx queue until the
>whole record is copied (if the user's buffer is full, the user is not
>woken up). This is done to avoid stalling the sender: if we wake up the
>user and it leaves the syscall, nobody will send a credit update for the
>rest of the record, and the sender will wait until the receiver next
>enters the read syscall. So if the user's buffer is full, we just send a
>credit update and drop the data. If SEQ_BEGIN is found during the copy
>(and not all data has been copied), copying is restarted by resetting
>the user's iov iterator (the previous unfinished data is dropped).
>
>Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
>---
> include/linux/virtio_vsock.h            |   5 +
> include/uapi/linux/virtio_vsock.h       |  16 ++++
> net/vmw_vsock/virtio_transport_common.c | 120 ++++++++++++++++++++++++
> 3 files changed, 141 insertions(+)
>
>diff --git a/include/linux/virtio_vsock.h b/include/linux/virtio_vsock.h
>index dc636b727179..4d0de3dee9a4 100644
>--- a/include/linux/virtio_vsock.h
>+++ b/include/linux/virtio_vsock.h
>@@ -36,6 +36,11 @@ struct virtio_vsock_sock {
> 	u32 rx_bytes;
> 	u32 buf_alloc;
> 	struct list_head rx_queue;
>+
>+	/* For SOCK_SEQPACKET */
>+	u32 user_read_seq_len;
>+	u32 user_read_copied;
>+	u32 curr_rx_msg_cnt;
> };
>
> struct virtio_vsock_pkt {
>diff --git a/include/uapi/linux/virtio_vsock.h b/include/uapi/linux/virtio_vsock.h
>index 1d57ed3d84d2..cf9c165e5cca 100644
>--- a/include/uapi/linux/virtio_vsock.h
>+++ b/include/uapi/linux/virtio_vsock.h
>@@ -63,8 +63,14 @@ struct virtio_vsock_hdr {
> 	__le32	fwd_cnt;
> } __attribute__((packed));
>
>+struct virtio_vsock_seq_hdr {
>+	__le32  msg_cnt;
>+	__le32  msg_len;
>+} __attribute__((packed));
>+
> enum virtio_vsock_type {
> 	VIRTIO_VSOCK_TYPE_STREAM = 1,
>+	VIRTIO_VSOCK_TYPE_SEQPACKET = 2,
> };
>
> enum virtio_vsock_op {
>@@ -83,6 +89,11 @@ enum virtio_vsock_op {
> 	VIRTIO_VSOCK_OP_CREDIT_UPDATE = 6,
> 	/* Request the peer to send the credit info to us */
> 	VIRTIO_VSOCK_OP_CREDIT_REQUEST = 7,
>+
>+	/* Record begin for SOCK_SEQPACKET */
>+	VIRTIO_VSOCK_OP_SEQ_BEGIN = 8,
>+	/* Record end for SOCK_SEQPACKET */
>+	VIRTIO_VSOCK_OP_SEQ_END = 9,
> };
>
> /* VIRTIO_VSOCK_OP_SHUTDOWN flags values */
>@@ -91,4 +102,9 @@ enum virtio_vsock_shutdown {
> 	VIRTIO_VSOCK_SHUTDOWN_SEND = 2,
> };
>
>+/* VIRTIO_VSOCK_OP_RW flags values */
>+enum virtio_vsock_rw {
>+	VIRTIO_VSOCK_RW_EOR = 1,
>+};
>+
> #endif /* _UAPI_LINUX_VIRTIO_VSOCK_H */
>diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
>index 5956939eebb7..4572d01c8ea5 100644
>--- a/net/vmw_vsock/virtio_transport_common.c
>+++ b/net/vmw_vsock/virtio_transport_common.c
>@@ -397,6 +397,126 @@ virtio_transport_stream_do_dequeue(struct vsock_sock *vsk,
> 	return err;
> }
>
>+static inline void virtio_transport_remove_pkt(struct virtio_vsock_pkt *pkt)
>+{
>+	list_del(&pkt->list);
>+	virtio_transport_free_pkt(pkt);
>+}
>+
>+static size_t virtio_transport_drop_until_seq_begin(struct virtio_vsock_sock *vvs)
>+{

This function is not used in this patch, only in the next one, so I'd
add it together with the next patch.

>+	struct virtio_vsock_pkt *pkt, *n;
>+	size_t bytes_dropped = 0;
>+
>+	list_for_each_entry_safe(pkt, n, &vvs->rx_queue, list) {
>+		if (le16_to_cpu(pkt->hdr.op) == VIRTIO_VSOCK_OP_SEQ_BEGIN)
>+			break;
>+
>+		bytes_dropped += le32_to_cpu(pkt->hdr.len);
>+		virtio_transport_dec_rx_pkt(vvs, pkt);
>+		virtio_transport_remove_pkt(pkt);
>+	}
>+
>+	return bytes_dropped;
>+}
>+
>+static int virtio_transport_seqpacket_do_dequeue(struct vsock_sock *vsk,
>+						 struct msghdr *msg,
>+						 bool *msg_ready)
>+{

This function is not used in this patch either; maybe you can add the
virtio_transport_seqpacket_dequeue() implementation here.

>+	struct virtio_vsock_sock *vvs = vsk->trans;
>+	struct virtio_vsock_pkt *pkt;
>+	int err = 0;
>+	size_t user_buf_len = msg->msg_iter.count;
>+
>+	*msg_ready = false;
>+	spin_lock_bh(&vvs->rx_lock);
>+
>+	while (!*msg_ready && !list_empty(&vvs->rx_queue) && !err) {
>+		pkt = list_first_entry(&vvs->rx_queue, struct virtio_vsock_pkt, list);
>+
>+		switch (le16_to_cpu(pkt->hdr.op)) {
>+		case VIRTIO_VSOCK_OP_SEQ_BEGIN: {
>+			/* Unexpected 'SEQ_BEGIN' during record copy:
>+			 * Leave receive loop, 'EAGAIN' will restart it from
>+			 * outer receive loop, packet is still in queue and
>+			 * counters are cleared. So in next loop enter,
>+			 * 'SEQ_BEGIN' will be dequeued first. User's iov
>+			 * iterator will be reset in outer loop. Also
>+			 * send credit update, because some bytes could be
>+			 * copied. User will never see unfinished record.
>+			 */
>+			err = -EAGAIN;
>+			break;
>+		}
>+		case VIRTIO_VSOCK_OP_SEQ_END: {
>+			struct virtio_vsock_seq_hdr *seq_hdr;
>+
>+			seq_hdr = (struct virtio_vsock_seq_hdr *)pkt->buf;
>+			/* First check that whole record is received. */
>+
>+			if (vvs->user_read_copied != vvs->user_read_seq_len ||
>+			    (le32_to_cpu(seq_hdr->msg_cnt) - vvs->curr_rx_msg_cnt) != 1) {
>+				/* Tail of current record and head of next missed,
>+				 * so this EOR is from next record. Restart receive.
>+				 * Current record will be dropped, next headless will
>+				 * be dropped on next attempt to get record length.
>+				 */
>+				err = -EAGAIN;
>+			} else {
>+				/* Success. */
>+				*msg_ready = true;
>+			}
>+
>+			break;
>+		}
>+		case VIRTIO_VSOCK_OP_RW: {
>+			size_t bytes_to_copy;
>+			size_t pkt_len;
>+
>+			pkt_len = (size_t)le32_to_cpu(pkt->hdr.len);
>+			bytes_to_copy = min(user_buf_len, pkt_len);
>+
>+			/* sk_lock is held by caller so no one else can dequeue.
>+			 * Unlock rx_lock since memcpy_to_msg() may sleep.
>+			 */
>+			spin_unlock_bh(&vvs->rx_lock);
>+
>+			if (memcpy_to_msg(msg, pkt->buf, bytes_to_copy)) {
>+				spin_lock_bh(&vvs->rx_lock);
>+				err = -EINVAL;
>+				break;
>+			}
>+
>+			spin_lock_bh(&vvs->rx_lock);
>+			user_buf_len -= bytes_to_copy;
>+			vvs->user_read_copied += pkt_len;
>+
>+			if (le32_to_cpu(pkt->hdr.flags) & VIRTIO_VSOCK_RW_EOR)
>+				msg->msg_flags |= MSG_EOR;
>+			break;
>+		}
>+		default:
>+			;
>+		}
>+
>+		/* For unexpected 'SEQ_BEGIN', keep such packet in queue,
>+		 * but drop any other type of packet.
>+		 */
>+		if (le16_to_cpu(pkt->hdr.op) != VIRTIO_VSOCK_OP_SEQ_BEGIN) {
>+			virtio_transport_dec_rx_pkt(vvs, pkt);
>+			virtio_transport_remove_pkt(pkt);
>+		}
>+	}
>+
>+	spin_unlock_bh(&vvs->rx_lock);
>+
>+	virtio_transport_send_credit_update(vsk, VIRTIO_VSOCK_TYPE_SEQPACKET,
>+					    NULL);
>+
>+	return err;
>+}
>+
> ssize_t
> virtio_transport_stream_dequeue(struct vsock_sock *vsk,
> 				struct msghdr *msg,
>-- 
>2.25.1
>

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC PATCH v4 10/17] virtio/vsock: fetch length for SEQPACKET record
  2021-02-07 15:17 ` [RFC PATCH v4 10/17] virtio/vsock: fetch length for SEQPACKET record Arseny Krasnov
@ 2021-02-11 13:58     ` Stefano Garzarella
  0 siblings, 0 replies; 61+ messages in thread
From: Stefano Garzarella @ 2021-02-11 13:58 UTC (permalink / raw)
  To: Arseny Krasnov
  Cc: Stefan Hajnoczi, Michael S. Tsirkin, Jason Wang, David S. Miller,
	Jakub Kicinski, Jorgen Hansen, Colin Ian King, Andra Paraschiv,
	Jeff Vander Stoep, kvm, virtualization, netdev, linux-kernel,
	stsp2, oxffffaa

On Sun, Feb 07, 2021 at 06:17:08PM +0300, Arseny Krasnov wrote:
>This adds transport callback which tries to fetch record begin marker
>from socket's rx queue. It is called from af_vsock.c before reading data
>packets of record.
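
To make the intended behaviour concrete, here is a minimal user-space
sketch of the "skip orphaned packets, then read the record length" step.
It is only an illustration: `struct pkt`, `enum pkt_op` and
`fetch_record_len()` are hypothetical names, the rx queue is modelled as a
plain array, and locking, credit updates and endianness are left out.
Returning 0 when no SEQ_BEGIN is queued is an assumption of the sketch; the
function as posted returns whatever is stored in user_read_seq_len.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel packet types. */
enum pkt_op { OP_RW, OP_SEQ_BEGIN, OP_SEQ_END };

struct pkt {
	enum pkt_op op;
	uint32_t len;		/* payload length of this packet */
	uint32_t msg_len;	/* record length, meaningful for OP_SEQ_BEGIN only */
};

/* Skip every packet queued in front of the first SEQ_BEGIN and count the
 * skipped payload bytes in *dropped. If a SEQ_BEGIN is found, store its
 * record length in *record_len and return the index just past the marker.
 * If none is found, return n with *record_len set to 0 (sketch assumption).
 */
static size_t fetch_record_len(const struct pkt *q, size_t n,
			       uint32_t *record_len, size_t *dropped)
{
	size_t i;

	*record_len = 0;
	*dropped = 0;

	for (i = 0; i < n && q[i].op != OP_SEQ_BEGIN; i++)
		*dropped += q[i].len;

	if (i == n)
		return n;

	*record_len = q[i].msg_len;
	return i + 1;
}
```

A non-zero `*dropped` corresponds to the case where the posted code sends a
credit update, since the skipped bytes free space for the sender.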
>
>Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
>---
> include/linux/virtio_vsock.h            |  1 +
> net/vmw_vsock/virtio_transport_common.c | 40 +++++++++++++++++++++++++
> 2 files changed, 41 insertions(+)
>
>diff --git a/include/linux/virtio_vsock.h b/include/linux/virtio_vsock.h
>index 4d0de3dee9a4..a5e8681bfc6a 100644
>--- a/include/linux/virtio_vsock.h
>+++ b/include/linux/virtio_vsock.h
>@@ -85,6 +85,7 @@ virtio_transport_dgram_dequeue(struct vsock_sock *vsk,
> 			       struct msghdr *msg,
> 			       size_t len, int flags);
>
>+size_t virtio_transport_seqpacket_seq_get_len(struct vsock_sock *vsk);
> s64 virtio_transport_stream_has_data(struct vsock_sock *vsk);
> s64 virtio_transport_stream_has_space(struct vsock_sock *vsk);
>
>diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
>index 4572d01c8ea5..7ac552bfd90b 100644
>--- a/net/vmw_vsock/virtio_transport_common.c
>+++ b/net/vmw_vsock/virtio_transport_common.c
>@@ -420,6 +420,46 @@ static size_t virtio_transport_drop_until_seq_begin(struct virtio_vsock_sock *vv
> 	return bytes_dropped;
> }
>
>+size_t virtio_transport_seqpacket_seq_get_len(struct vsock_sock *vsk)
>+{
>+	struct virtio_vsock_seq_hdr *seq_hdr;
>+	struct virtio_vsock_sock *vvs;
>+	struct virtio_vsock_pkt *pkt;
>+	size_t bytes_dropped;
>+
>+	vvs = vsk->trans;
>+
>+	spin_lock_bh(&vvs->rx_lock);
>+
>+	/* Fetch all orphaned 'RW', packets, and
>+	 * send credit update.

Could this comment fit on a single line?

>+	 */
>+	bytes_dropped = virtio_transport_drop_until_seq_begin(vvs);
>+
>+	if (list_empty(&vvs->rx_queue))
>+		goto out;
>+
>+	pkt = list_first_entry(&vvs->rx_queue, struct virtio_vsock_pkt, list);
>+
>+	vvs->user_read_copied = 0;
>+
>+	seq_hdr = (struct virtio_vsock_seq_hdr *)pkt->buf;
>+	vvs->user_read_seq_len = le32_to_cpu(seq_hdr->msg_len);
>+	vvs->curr_rx_msg_cnt = le32_to_cpu(seq_hdr->msg_cnt);
>+	virtio_transport_dec_rx_pkt(vvs, pkt);
>+	virtio_transport_remove_pkt(pkt);
>+out:
>+	spin_unlock_bh(&vvs->rx_lock);
>+
>+	if (bytes_dropped)
>+		virtio_transport_send_credit_update(vsk,
>+						    VIRTIO_VSOCK_TYPE_SEQPACKET,
>+						    NULL);
>+
>+	return vvs->user_read_seq_len;
>+}
>+EXPORT_SYMBOL_GPL(virtio_transport_seqpacket_seq_get_len);
>+
> static int virtio_transport_seqpacket_do_dequeue(struct vsock_sock *vsk,
> 						 struct msghdr *msg,
> 						 bool *msg_ready)
>-- 
>2.25.1
>


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC PATCH v4 09/17] virtio/vsock: dequeue callback for SOCK_SEQPACKET
  2021-02-11 13:54     ` Stefano Garzarella
@ 2021-02-11 14:03       ` Stefano Garzarella
  -1 siblings, 0 replies; 61+ messages in thread
From: Stefano Garzarella @ 2021-02-11 14:03 UTC (permalink / raw)
  To: Arseny Krasnov
  Cc: Stefan Hajnoczi, Michael S. Tsirkin, Jason Wang, David S. Miller,
	Jakub Kicinski, Jorgen Hansen, Andra Paraschiv, Colin Ian King,
	Jeff Vander Stoep, kvm, virtualization, netdev, linux-kernel,
	stsp2, oxffffaa

On Thu, Feb 11, 2021 at 02:54:28PM +0100, Stefano Garzarella wrote:
>On Sun, Feb 07, 2021 at 06:16:46PM +0300, Arseny Krasnov wrote:
>>This adds transport callback and it's logic for SEQPACKET dequeue.
>>Callback fetches RW packets from rx queue of socket until whole record
>>is copied(if user's buffer is full, user is not woken up). This is done
>>to not stall sender, because if we wake up user and it leaves syscall,
>>nobody will send credit update for rest of record, and sender will wait
>>for next enter of read syscall at receiver's side. So if user buffer is
>>full, we just send credit update and drop data. If during copy SEQ_BEGIN
>>was found(and not all data was copied), copying is restarted by reset
>>user's iov iterator(previous unfinished data is dropped).
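
The behaviour described above can be sketched in user space roughly as
follows. This is a simplified model, not the kernel code: the rx queue is an
array, `struct pkt`, `enum pkt_op` and `dequeue_record()` are hypothetical
names, and locking, credit updates and the msg_cnt check are omitted. The
point it shows is that RW packets are still consumed after the user buffer
is full, so only the copy is truncated, not the walk over the record.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the kernel packet types. */
enum pkt_op { OP_RW, OP_SEQ_BEGIN, OP_SEQ_END };

struct pkt {
	enum pkt_op op;
	const char *data;	/* payload, RW packets only */
	size_t len;
};

/* Walk the queued packets of one record whose length is record_len.
 * At most buf_len bytes are copied to buf, but RW packets past that point
 * are still consumed. *consumed is the number of packets removed.
 * Returns 1 if SEQ_END closes a complete record, -1 if the record is cut
 * short (the caller would restart), 0 if the queue ran dry first.
 */
static int dequeue_record(const struct pkt *q, size_t n, size_t record_len,
			  char *buf, size_t buf_len,
			  size_t *copied, size_t *consumed)
{
	size_t seen = 0;	/* record bytes seen, like user_read_copied */
	size_t i;

	*copied = 0;

	for (i = 0; i < n; i++) {
		switch (q[i].op) {
		case OP_SEQ_BEGIN:
			/* Next record started early: keep the marker queued. */
			*consumed = i;
			return -1;
		case OP_SEQ_END:
			*consumed = i + 1;
			return seen == record_len ? 1 : -1;
		case OP_RW: {
			size_t room = buf_len - *copied;
			size_t chunk = q[i].len < room ? q[i].len : room;

			memcpy(buf + *copied, q[i].data, chunk);
			*copied += chunk;
			seen += q[i].len;	/* full length, even if truncated */
			break;
		}
		}
	}

	*consumed = n;
	return 0;
}
```

With an 8-byte buffer and a 10-byte record split over two RW packets, the
sketch copies 8 bytes, consumes all three packets including SEQ_END, and
reports the record as complete.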
>>
>>Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
>>---
>>include/linux/virtio_vsock.h            |   5 +
>>include/uapi/linux/virtio_vsock.h       |  16 ++++
>>net/vmw_vsock/virtio_transport_common.c | 120 ++++++++++++++++++++++++
>>3 files changed, 141 insertions(+)
>>
>>diff --git a/include/linux/virtio_vsock.h b/include/linux/virtio_vsock.h
>>index dc636b727179..4d0de3dee9a4 100644
>>--- a/include/linux/virtio_vsock.h
>>+++ b/include/linux/virtio_vsock.h
>>@@ -36,6 +36,11 @@ struct virtio_vsock_sock {
>>	u32 rx_bytes;
>>	u32 buf_alloc;
>>	struct list_head rx_queue;
>>+
>>+	/* For SOCK_SEQPACKET */
>>+	u32 user_read_seq_len;
>>+	u32 user_read_copied;
>>+	u32 curr_rx_msg_cnt;
>>};
>>
>>struct virtio_vsock_pkt {
>>diff --git a/include/uapi/linux/virtio_vsock.h b/include/uapi/linux/virtio_vsock.h
>>index 1d57ed3d84d2..cf9c165e5cca 100644
>>--- a/include/uapi/linux/virtio_vsock.h
>>+++ b/include/uapi/linux/virtio_vsock.h
>>@@ -63,8 +63,14 @@ struct virtio_vsock_hdr {
>>	__le32	fwd_cnt;
>>} __attribute__((packed));
>>
>>+struct virtio_vsock_seq_hdr {
>>+	__le32  msg_cnt;

Maybe 'msg_id' would be a better name for this field, since we use it to
identify a message. Whether we use a counter or a random number is then
just an implementation detail.

As Michael said, perhaps this detail should be discussed in the proposal
for VIRTIO spec changes.
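
For reference, the pairing rule that the SEQ_END branch of this patch
enforces could be written as the small user-space sketch below, with the
field renamed to msg_id. `struct seq_hdr` and `seq_end_matches()` are
hypothetical names, and endianness conversion is omitted. It assumes the
counter scheme from this series, where the id in SEQ_END is the id from
SEQ_BEGIN plus one; with a different id scheme only the second comparison
would change.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space mirror of the header, field renamed to msg_id. */
struct seq_hdr {
	uint32_t msg_id;
	uint32_t msg_len;
};

/* A record is accepted only if every byte announced by SEQ_BEGIN was seen
 * and the id in SEQ_END directly follows the id from SEQ_BEGIN. The
 * subtraction is done in uint32_t so the check survives wraparound.
 */
static bool seq_end_matches(const struct seq_hdr *begin,
			    const struct seq_hdr *end,
			    uint32_t bytes_seen)
{
	return bytes_seen == begin->msg_len &&
	       (uint32_t)(end->msg_id - begin->msg_id) == 1;
}
```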

>>+	__le32  msg_len;
>>+} __attribute__((packed));
>>+
>>enum virtio_vsock_type {
>>	VIRTIO_VSOCK_TYPE_STREAM = 1,
>>+	VIRTIO_VSOCK_TYPE_SEQPACKET = 2,
>>};
>>
>>enum virtio_vsock_op {
>>@@ -83,6 +89,11 @@ enum virtio_vsock_op {
>>	VIRTIO_VSOCK_OP_CREDIT_UPDATE = 6,
>>	/* Request the peer to send the credit info to us */
>>	VIRTIO_VSOCK_OP_CREDIT_REQUEST = 7,
>>+
>>+	/* Record begin for SOCK_SEQPACKET */
>>+	VIRTIO_VSOCK_OP_SEQ_BEGIN = 8,
>>+	/* Record end for SOCK_SEQPACKET */
>>+	VIRTIO_VSOCK_OP_SEQ_END = 9,
>>};
>>
>>/* VIRTIO_VSOCK_OP_SHUTDOWN flags values */
>>@@ -91,4 +102,9 @@ enum virtio_vsock_shutdown {
>>	VIRTIO_VSOCK_SHUTDOWN_SEND = 2,
>>};
>>
>>+/* VIRTIO_VSOCK_OP_RW flags values */
>>+enum virtio_vsock_rw {
>>+	VIRTIO_VSOCK_RW_EOR = 1,
>>+};
>>+
>>#endif /* _UAPI_LINUX_VIRTIO_VSOCK_H */
>>diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
>>index 5956939eebb7..4572d01c8ea5 100644
>>--- a/net/vmw_vsock/virtio_transport_common.c
>>+++ b/net/vmw_vsock/virtio_transport_common.c
>>@@ -397,6 +397,126 @@ virtio_transport_stream_do_dequeue(struct vsock_sock *vsk,
>>	return err;
>>}
>>
>>+static inline void virtio_transport_remove_pkt(struct virtio_vsock_pkt *pkt)
>>+{
>>+	list_del(&pkt->list);
>>+	virtio_transport_free_pkt(pkt);
>>+}
>>+
>>+static size_t virtio_transport_drop_until_seq_begin(struct virtio_vsock_sock *vvs)
>>+{
>
>This function is not used here, but in the next patch, so I'd add this 
>with the next patch.
>
>>+	struct virtio_vsock_pkt *pkt, *n;
>>+	size_t bytes_dropped = 0;
>>+
>>+	list_for_each_entry_safe(pkt, n, &vvs->rx_queue, list) {
>>+		if (le16_to_cpu(pkt->hdr.op) == VIRTIO_VSOCK_OP_SEQ_BEGIN)
>>+			break;
>>+
>>+		bytes_dropped += le32_to_cpu(pkt->hdr.len);
>>+		virtio_transport_dec_rx_pkt(vvs, pkt);
>>+		virtio_transport_remove_pkt(pkt);
>>+	}
>>+
>>+	return bytes_dropped;
>>+}
>>+
>>+static int virtio_transport_seqpacket_do_dequeue(struct vsock_sock *vsk,
>>+						 struct msghdr *msg,
>>+						 bool *msg_ready)
>>+{
>
>Also this function is not used, maybe you can add in this patch the 
>virtio_transport_seqpacket_dequeue() implementation.
>
>>+	struct virtio_vsock_sock *vvs = vsk->trans;
>>+	struct virtio_vsock_pkt *pkt;
>>+	int err = 0;
>>+	size_t user_buf_len = msg->msg_iter.count;
>>+
>>+	*msg_ready = false;
>>+	spin_lock_bh(&vvs->rx_lock);
>>+
>>+	while (!*msg_ready && !list_empty(&vvs->rx_queue) && !err) {
>>+		pkt = list_first_entry(&vvs->rx_queue, struct virtio_vsock_pkt, list);
>>+
>>+		switch (le16_to_cpu(pkt->hdr.op)) {
>>+		case VIRTIO_VSOCK_OP_SEQ_BEGIN: {
>>+			/* Unexpected 'SEQ_BEGIN' during record copy:
>>+			 * Leave receive loop, 'EAGAIN' will restart it from
>>+			 * outer receive loop, packet is still in queue and
>>+			 * counters are cleared. So in next loop enter,
>>+			 * 'SEQ_BEGIN' will be dequeued first. User's iov
>>+			 * iterator will be reset in outer loop. Also
>>+			 * send credit update, because some bytes could be
>>+			 * copied. User will never see unfinished record.
>>+			 */
>>+			err = -EAGAIN;
>>+			break;
>>+		}
>>+		case VIRTIO_VSOCK_OP_SEQ_END: {
>>+			struct virtio_vsock_seq_hdr *seq_hdr;
>>+
>>+			seq_hdr = (struct virtio_vsock_seq_hdr *)pkt->buf;
>>+			/* First check that whole record is received. */
>>+
>>+			if (vvs->user_read_copied != vvs->user_read_seq_len ||
>>+			    (le32_to_cpu(seq_hdr->msg_cnt) - vvs->curr_rx_msg_cnt) != 1) {
>>+				/* Tail of current record and head of next missed,
>>+				 * so this EOR is from next record. Restart receive.
>>+				 * Current record will be dropped, next headless will
>>+				 * be dropped on next attempt to get record length.
>>+				 */
>>+				err = -EAGAIN;
>>+			} else {
>>+				/* Success. */
>>+				*msg_ready = true;
>>+			}
>>+
>>+			break;
>>+		}
>>+		case VIRTIO_VSOCK_OP_RW: {
>>+			size_t bytes_to_copy;
>>+			size_t pkt_len;
>>+
>>+			pkt_len = (size_t)le32_to_cpu(pkt->hdr.len);
>>+			bytes_to_copy = min(user_buf_len, pkt_len);
>>+
>>+			/* sk_lock is held by caller so no one else can dequeue.
>>+			 * Unlock rx_lock since memcpy_to_msg() may sleep.
>>+			 */
>>+			spin_unlock_bh(&vvs->rx_lock);
>>+
>>+			if (memcpy_to_msg(msg, pkt->buf, bytes_to_copy)) {
>>+				spin_lock_bh(&vvs->rx_lock);
>>+				err = -EINVAL;
>>+				break;
>>+			}
>>+
>>+			spin_lock_bh(&vvs->rx_lock);
>>+			user_buf_len -= bytes_to_copy;
>>+			vvs->user_read_copied += pkt_len;
>>+
>>+			if (le32_to_cpu(pkt->hdr.flags) & VIRTIO_VSOCK_RW_EOR)
>>+				msg->msg_flags |= MSG_EOR;
>>+			break;
>>+		}
>>+		default:
>>+			;
>>+		}
>>+
>>+		/* For unexpected 'SEQ_BEGIN', keep such packet in queue,
>>+		 * but drop any other type of packet.
>>+		 */
>>+		if (le16_to_cpu(pkt->hdr.op) != VIRTIO_VSOCK_OP_SEQ_BEGIN) {
>>+			virtio_transport_dec_rx_pkt(vvs, pkt);
>>+			virtio_transport_remove_pkt(pkt);
>>+		}
>>+	}
>>+
>>+	spin_unlock_bh(&vvs->rx_lock);
>>+
>>+	virtio_transport_send_credit_update(vsk, VIRTIO_VSOCK_TYPE_SEQPACKET,
>>+					    NULL);
>>+
>>+	return err;
>>+}
>>+
>>ssize_t
>>virtio_transport_stream_dequeue(struct vsock_sock *vsk,
>>				struct msghdr *msg,
>>-- 
>>2.25.1
>>


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC PATCH v4 12/17] virtio/vsock: rest of SOCK_SEQPACKET support
  2021-02-07 15:17 ` [RFC PATCH v4 12/17] virtio/vsock: rest of SOCK_SEQPACKET support Arseny Krasnov
@ 2021-02-11 14:29     ` Stefano Garzarella
  2021-02-11 11:00   ` Arseny Krasnov
  2021-02-11 14:29     ` Stefano Garzarella
  2 siblings, 0 replies; 61+ messages in thread
From: Stefano Garzarella @ 2021-02-11 14:29 UTC (permalink / raw)
  To: Arseny Krasnov
  Cc: Stefan Hajnoczi, Michael S. Tsirkin, Jason Wang, David S. Miller,
	Jakub Kicinski, Jorgen Hansen, Colin Ian King, Andra Paraschiv,
	Alexander Popov, kvm, virtualization, netdev, linux-kernel,
	stsp2, oxffffaa

On Sun, Feb 07, 2021 at 06:17:44PM +0300, Arseny Krasnov wrote:
>This adds rest of logic for SEQPACKET:
>1) Packet's type is now set in 'virtio_send_pkt_info()' using
>   type of socket.
>2) SEQPACKET specific functions which send SEQ_BEGIN/SEQ_END.
>   Note that both functions may sleep to wait enough space for
>   SEQPACKET header.
>3) SEQ_BEGIN/SEQ_END to TAP packet capture.
>4) Send SHUTDOWN on socket close for SEQPACKET type.
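
As a rough illustration of point 2, the counter handling on the send side
can be modelled in user space as below. `struct tx_state` and
`make_ctrl_hdr()` are hypothetical names; the sketch only mirrors how
next_tx_msg_cnt is read and then incremented for each control packet, and
leaves out waiting for space, endianness and the actual transmission.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical user-space mirror of the SEQPACKET header. */
struct seq_hdr {
	uint32_t msg_cnt;
	uint32_t msg_len;
};

/* Hypothetical per-socket send state, holding only the counter. */
struct tx_state {
	uint32_t next_tx_msg_cnt;
};

/* Build the header for one SEQ_BEGIN or SEQ_END packet and advance the
 * counter. SEQ_BEGIN passes the record length, SEQ_END passes 0.
 */
static struct seq_hdr make_ctrl_hdr(struct tx_state *tx, uint32_t len)
{
	struct seq_hdr hdr = {
		.msg_cnt = tx->next_tx_msg_cnt,
		.msg_len = len,
	};

	tx->next_tx_msg_cnt++;
	return hdr;
}
```

Calling it once for SEQ_BEGIN and once for SEQ_END yields counters n and
n + 1, and the next record starts at n + 2.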
>
>Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
>---
> include/linux/virtio_vsock.h            |  9 +++
> net/vmw_vsock/virtio_transport_common.c | 99 +++++++++++++++++++++----
> 2 files changed, 95 insertions(+), 13 deletions(-)
>
>diff --git a/include/linux/virtio_vsock.h b/include/linux/virtio_vsock.h
>index a5e8681bfc6a..c4a39424686d 100644
>--- a/include/linux/virtio_vsock.h
>+++ b/include/linux/virtio_vsock.h
>@@ -41,6 +41,7 @@ struct virtio_vsock_sock {
> 	u32 user_read_seq_len;
> 	u32 user_read_copied;
> 	u32 curr_rx_msg_cnt;
>+	u32 next_tx_msg_cnt;
> };
>
> struct virtio_vsock_pkt {
>@@ -85,7 +86,15 @@ virtio_transport_dgram_dequeue(struct vsock_sock *vsk,
> 			       struct msghdr *msg,
> 			       size_t len, int flags);
>
>+int virtio_transport_seqpacket_seq_send_len(struct vsock_sock *vsk, size_t len, int flags);
>+int virtio_transport_seqpacket_seq_send_eor(struct vsock_sock *vsk, int flags);
> size_t virtio_transport_seqpacket_seq_get_len(struct vsock_sock *vsk);
>+int
>+virtio_transport_seqpacket_dequeue(struct vsock_sock *vsk,
>+				   struct msghdr *msg,
>+				   int flags,
>+				   bool *msg_ready);
>+
> s64 virtio_transport_stream_has_data(struct vsock_sock *vsk);
> s64 virtio_transport_stream_has_space(struct vsock_sock *vsk);
>
>diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
>index 51b66f8dd7c7..0aa0fd33e9d6 100644
>--- a/net/vmw_vsock/virtio_transport_common.c
>+++ b/net/vmw_vsock/virtio_transport_common.c
>@@ -139,6 +139,8 @@ static struct sk_buff *virtio_transport_build_skb(void *opaque)
> 		break;
> 	case VIRTIO_VSOCK_OP_CREDIT_UPDATE:
> 	case VIRTIO_VSOCK_OP_CREDIT_REQUEST:
>+	case VIRTIO_VSOCK_OP_SEQ_BEGIN:
>+	case VIRTIO_VSOCK_OP_SEQ_END:
> 		hdr->op = cpu_to_le16(AF_VSOCK_OP_CONTROL);
> 		break;
> 	default:
>@@ -165,6 +167,14 @@ void virtio_transport_deliver_tap_pkt(struct virtio_vsock_pkt *pkt)
> }
> EXPORT_SYMBOL_GPL(virtio_transport_deliver_tap_pkt);
>
>+static u16 virtio_transport_get_type(struct sock *sk)
>+{
>+	if (sk->sk_type == SOCK_STREAM)
>+		return VIRTIO_VSOCK_TYPE_STREAM;
>+	else
>+		return VIRTIO_VSOCK_TYPE_SEQPACKET;
>+}
>+

Maybe add this function at this position in the file in the first patch,
so you don't need to move it later in the series.

> /* This function can only be used on connecting/connected sockets,
>  * since a socket assigned to a transport is required.
>  *
>@@ -179,6 +189,13 @@ static int virtio_transport_send_pkt_info(struct vsock_sock *vsk,
> 	struct virtio_vsock_pkt *pkt;
> 	u32 pkt_len = info->pkt_len;
>
>+	info->type = virtio_transport_get_type(sk_vsock(vsk));

I'd make this change in a separate patch before this one, since it also
touches the stream part.

>+
>+	if (info->type == VIRTIO_VSOCK_TYPE_SEQPACKET &&
>+	    info->msg &&
>+	    info->msg->msg_flags & MSG_EOR)
>+		info->flags |= VIRTIO_VSOCK_RW_EOR;
>+
> 	t_ops = virtio_transport_get_ops(vsk);
> 	if (unlikely(!t_ops))
> 		return -EFAULT;
>@@ -397,13 +414,61 @@ virtio_transport_stream_do_dequeue(struct vsock_sock *vsk,
> 	return err;
> }
>
>-static u16 virtio_transport_get_type(struct sock *sk)
>+static int virtio_transport_seqpacket_send_ctrl(struct vsock_sock *vsk,
>+						int type,
>+						size_t len,
>+						int flags)
> {
>-	if (sk->sk_type == SOCK_STREAM)
>-		return VIRTIO_VSOCK_TYPE_STREAM;
>-	else
>-		return VIRTIO_VSOCK_TYPE_SEQPACKET;
>+	struct virtio_vsock_sock *vvs = vsk->trans;
>+	struct virtio_vsock_pkt_info info = {
>+		.op = type,
>+		.vsk = vsk,
>+		.pkt_len = sizeof(struct virtio_vsock_seq_hdr)
>+	};
>+
>+	struct virtio_vsock_seq_hdr seq_hdr = {
>+		.msg_cnt = vvs->next_tx_msg_cnt,
>+		.msg_len = len
>+	};
>+
>+	struct kvec seq_hdr_kiov = {
>+		.iov_base = (void *)&seq_hdr,
>+		.iov_len = sizeof(struct virtio_vsock_seq_hdr)
>+	};
>+
>+	struct msghdr msg = {0};
>+
>+	//XXX: do we need 'vsock_transport_send_notify_data' pointer?
>+	if (vsock_wait_space(sk_vsock(vsk),
>+			     sizeof(struct virtio_vsock_seq_hdr),
>+			     flags, NULL))
>+		return -1;
>+
>+	iov_iter_kvec(&msg.msg_iter, WRITE, &seq_hdr_kiov, 1, sizeof(seq_hdr));
>+
>+	info.msg = &msg;
>+	vvs->next_tx_msg_cnt++;
>+
>+	return virtio_transport_send_pkt_info(vsk, &info);
>+}
>+
>+int virtio_transport_seqpacket_seq_send_len(struct vsock_sock *vsk, size_t len, int flags)
>+{
>+	return virtio_transport_seqpacket_send_ctrl(vsk,
>+						    VIRTIO_VSOCK_OP_SEQ_BEGIN,
>+						    len,
>+						    flags);
> }
>+EXPORT_SYMBOL_GPL(virtio_transport_seqpacket_seq_send_len);
>+
>+int virtio_transport_seqpacket_seq_send_eor(struct vsock_sock *vsk, int flags)
>+{
>+	return virtio_transport_seqpacket_send_ctrl(vsk,
>+						    VIRTIO_VSOCK_OP_SEQ_END,
>+						    0,
>+						    flags);
>+}
>+EXPORT_SYMBOL_GPL(virtio_transport_seqpacket_seq_send_eor);
>
> static inline void virtio_transport_remove_pkt(struct virtio_vsock_pkt *pkt)
> {
>@@ -577,6 +642,18 @@ virtio_transport_stream_dequeue(struct vsock_sock *vsk,
> }
> EXPORT_SYMBOL_GPL(virtio_transport_stream_dequeue);
>
>+int
>+virtio_transport_seqpacket_dequeue(struct vsock_sock *vsk,
>+				   struct msghdr *msg,
>+				   int flags, bool *msg_ready)
>+{
>+	if (flags & MSG_PEEK)
>+		return -EOPNOTSUPP;
>+
>+	return virtio_transport_seqpacket_do_dequeue(vsk, msg, msg_ready);
>+}
>+EXPORT_SYMBOL_GPL(virtio_transport_seqpacket_dequeue);
>+
> int
> virtio_transport_dgram_dequeue(struct vsock_sock *vsk,
> 			       struct msghdr *msg,
>@@ -658,14 +735,15 @@ EXPORT_SYMBOL_GPL(virtio_transport_do_socket_init);
> void virtio_transport_notify_buffer_size(struct vsock_sock *vsk, u64 *val)
> {
> 	struct virtio_vsock_sock *vvs = vsk->trans;
>+	int type;
>
> 	if (*val > VIRTIO_VSOCK_MAX_BUF_SIZE)
> 		*val = VIRTIO_VSOCK_MAX_BUF_SIZE;
>
> 	vvs->buf_alloc = *val;
>
>-	virtio_transport_send_credit_update(vsk, VIRTIO_VSOCK_TYPE_STREAM,
>-					    NULL);
>+	type = virtio_transport_get_type(sk_vsock(vsk));
>+	virtio_transport_send_credit_update(vsk, type, NULL);

I think we can remove the 'type' parameter of 
virtio_transport_send_credit_update() since 
virtio_transport_send_pkt_info() will overwrite it.

> }
> EXPORT_SYMBOL_GPL(virtio_transport_notify_buffer_size);
>
>@@ -792,7 +870,6 @@ int virtio_transport_connect(struct vsock_sock *vsk)
> {
> 	struct virtio_vsock_pkt_info info = {
> 		.op = VIRTIO_VSOCK_OP_REQUEST,
>-		.type = VIRTIO_VSOCK_TYPE_STREAM,
> 		.vsk = vsk,
> 	};
>
>@@ -804,7 +881,6 @@ int virtio_transport_shutdown(struct vsock_sock *vsk, int mode)
> {
> 	struct virtio_vsock_pkt_info info = {
> 		.op = VIRTIO_VSOCK_OP_SHUTDOWN,
>-		.type = VIRTIO_VSOCK_TYPE_STREAM,
> 		.flags = (mode & RCV_SHUTDOWN ?
> 			  VIRTIO_VSOCK_SHUTDOWN_RCV : 0) |
> 			 (mode & SEND_SHUTDOWN ?
>@@ -833,7 +909,6 @@ virtio_transport_stream_enqueue(struct vsock_sock *vsk,
> {
> 	struct virtio_vsock_pkt_info info = {
> 		.op = VIRTIO_VSOCK_OP_RW,
>-		.type = VIRTIO_VSOCK_TYPE_STREAM,
> 		.msg = msg,
> 		.pkt_len = len,
> 		.vsk = vsk,
>@@ -856,7 +931,6 @@ static int virtio_transport_reset(struct vsock_sock *vsk,
> {
> 	struct virtio_vsock_pkt_info info = {
> 		.op = VIRTIO_VSOCK_OP_RST,
>-		.type = VIRTIO_VSOCK_TYPE_STREAM,
> 		.reply = !!pkt,
> 		.vsk = vsk,
> 	};

These changes could go into the new patch that handles the type directly 
in virtio_transport_send_pkt_info().


>@@ -1001,7 +1075,7 @@ void virtio_transport_release(struct vsock_sock *vsk)
> 	struct sock *sk = &vsk->sk;
> 	bool remove_sock = true;
>
>-	if (sk->sk_type == SOCK_STREAM)
>+	if (sk->sk_type == SOCK_STREAM || sk->sk_type == SOCK_SEQPACKET)
> 		remove_sock = virtio_transport_close(vsk);
>
> 	list_for_each_entry_safe(pkt, tmp, &vvs->rx_queue, list) {
>@@ -1164,7 +1238,6 @@ virtio_transport_send_response(struct vsock_sock *vsk,
> {
> 	struct virtio_vsock_pkt_info info = {
> 		.op = VIRTIO_VSOCK_OP_RESPONSE,
>-		.type = VIRTIO_VSOCK_TYPE_STREAM,
> 		.remote_cid = le64_to_cpu(pkt->hdr.src_cid),
> 		.remote_port = le32_to_cpu(pkt->hdr.src_port),
> 		.reply = true,

Also this one.

Thanks,
Stefano


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC PATCH v4 16/17] loopback/vsock: setup SEQPACKET ops for transport
  2021-02-07 15:18 ` [RFC PATCH v4 16/17] loopback/vsock: setup SEQPACKET ops for transport Arseny Krasnov
@ 2021-02-11 14:31     ` Stefano Garzarella
  0 siblings, 0 replies; 61+ messages in thread
From: Stefano Garzarella @ 2021-02-11 14:31 UTC (permalink / raw)
  To: Arseny Krasnov
  Cc: Stefan Hajnoczi, Michael S. Tsirkin, Jason Wang, David S. Miller,
	Jakub Kicinski, Jorgen Hansen, Colin Ian King, Andra Paraschiv,
	Jeff Vander Stoep, kvm, virtualization, netdev, linux-kernel,
	stsp2, oxffffaa

Please move this patch before the test patch, and I'd change the prefix 
to "vsock_loopback" or "vsock/loopback".

Thanks,
Stefano

On Sun, Feb 07, 2021 at 06:18:48PM +0300, Arseny Krasnov wrote:
>This adds SEQPACKET ops for loopback transport
>
>Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
>---
> net/vmw_vsock/vsock_loopback.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
>diff --git a/net/vmw_vsock/vsock_loopback.c b/net/vmw_vsock/vsock_loopback.c
>index a45f7ffca8c5..c0da94119f74 100644
>--- a/net/vmw_vsock/vsock_loopback.c
>+++ b/net/vmw_vsock/vsock_loopback.c
>@@ -89,6 +89,11 @@ static struct virtio_transport loopback_transport = {
> 		.stream_is_active         = virtio_transport_stream_is_active,
> 		.stream_allow             = virtio_transport_stream_allow,
>
>+		.seqpacket_seq_send_len	  = virtio_transport_seqpacket_seq_send_len,
>+		.seqpacket_seq_send_eor	  = virtio_transport_seqpacket_seq_send_eor,
>+		.seqpacket_seq_get_len	  = virtio_transport_seqpacket_seq_get_len,
>+		.seqpacket_dequeue        = virtio_transport_seqpacket_dequeue,
>+
> 		.notify_poll_in           = virtio_transport_notify_poll_in,
> 		.notify_poll_out          = virtio_transport_notify_poll_out,
> 		.notify_recv_init         = virtio_transport_notify_recv_init,
>-- 
>2.25.1
>


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC PATCH v4 17/17] virtio/vsock: simplify credit update function API
  2021-02-07 15:19 ` [RFC PATCH v4 17/17] virtio/vsock: simplify credit update function API Arseny Krasnov
@ 2021-02-11 14:39     ` Stefano Garzarella
  0 siblings, 0 replies; 61+ messages in thread
From: Stefano Garzarella @ 2021-02-11 14:39 UTC (permalink / raw)
  To: Arseny Krasnov
  Cc: Stefan Hajnoczi, Michael S. Tsirkin, Jason Wang, David S. Miller,
	Jakub Kicinski, Jorgen Hansen, Andra Paraschiv, Colin Ian King,
	Alexander Popov, kvm, virtualization, netdev, linux-kernel,
	stsp2, oxffffaa

On Sun, Feb 07, 2021 at 06:19:03PM +0300, Arseny Krasnov wrote:
>'virtio_transport_send_credit_update()' has some extra args:
>1) 'type' may be set in 'virtio_transport_send_pkt_info()' using type
>   of socket.
>2) This function is static and 'hdr' arg was always NULL.
>

Okay, I saw this patch after my previous comment.

I think this looks good, but please move this before your changes (e.g.  
before patch 'virtio/vsock: dequeue callback for SOCK_SEQPACKET').

This way you don't need to modify 
virtio_transport_notify_buffer_size() to call 
virtio_transport_get_type() and then remove those changes again.

It's generally not a good idea to make changes in a patch and then 
remove them a few patches later in the same series. This should ring a 
bell about moving these changes before others.

Thanks,
Stefano

>Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
>---
> net/vmw_vsock/virtio_transport_common.c | 20 +++++---------------
> 1 file changed, 5 insertions(+), 15 deletions(-)
>
>diff --git a/net/vmw_vsock/virtio_transport_common.c 
>b/net/vmw_vsock/virtio_transport_common.c
>index 0aa0fd33e9d6..46308679c8a4 100644
>--- a/net/vmw_vsock/virtio_transport_common.c
>+++ b/net/vmw_vsock/virtio_transport_common.c
>@@ -286,13 +286,10 @@ void virtio_transport_put_credit(struct virtio_vsock_sock *vvs, u32 credit)
> }
> EXPORT_SYMBOL_GPL(virtio_transport_put_credit);
>
>-static int virtio_transport_send_credit_update(struct vsock_sock *vsk,
>-					       int type,
>-					       struct virtio_vsock_hdr *hdr)
>+static int virtio_transport_send_credit_update(struct vsock_sock *vsk)
> {
> 	struct virtio_vsock_pkt_info info = {
> 		.op = VIRTIO_VSOCK_OP_CREDIT_UPDATE,
>-		.type = type,
> 		.vsk = vsk,
> 	};
>
>@@ -401,9 +398,7 @@ virtio_transport_stream_do_dequeue(struct vsock_sock *vsk,
> 	 * with different values.
> 	 */
> 	if (free_space < VIRTIO_VSOCK_MAX_PKT_BUF_SIZE) {
>-		virtio_transport_send_credit_update(vsk,
>-						    VIRTIO_VSOCK_TYPE_STREAM,
>-						    NULL);
>+		virtio_transport_send_credit_update(vsk);
> 	}
>
> 	return total;
>@@ -525,9 +520,7 @@ size_t virtio_transport_seqpacket_seq_get_len(struct vsock_sock *vsk)
> 	spin_unlock_bh(&vvs->rx_lock);
>
> 	if (bytes_dropped)
>-		virtio_transport_send_credit_update(vsk,
>-						    VIRTIO_VSOCK_TYPE_SEQPACKET,
>-						    NULL);
>+		virtio_transport_send_credit_update(vsk);
>
> 	return vvs->user_read_seq_len;
> }
>@@ -624,8 +617,7 @@ static int virtio_transport_seqpacket_do_dequeue(struct vsock_sock *vsk,
>
> 	spin_unlock_bh(&vvs->rx_lock);
>
>-	virtio_transport_send_credit_update(vsk, VIRTIO_VSOCK_TYPE_SEQPACKET,
>-					    NULL);
>+	virtio_transport_send_credit_update(vsk);
>
> 	return err;
> }
>@@ -735,15 +727,13 @@ EXPORT_SYMBOL_GPL(virtio_transport_do_socket_init);
> void virtio_transport_notify_buffer_size(struct vsock_sock *vsk, u64 *val)
> {
> 	struct virtio_vsock_sock *vvs = vsk->trans;
>-	int type;
>
> 	if (*val > VIRTIO_VSOCK_MAX_BUF_SIZE)
> 		*val = VIRTIO_VSOCK_MAX_BUF_SIZE;
>
> 	vvs->buf_alloc = *val;
>
>-	type = virtio_transport_get_type(sk_vsock(vsk));
>-	virtio_transport_send_credit_update(vsk, type, NULL);
>+	virtio_transport_send_credit_update(vsk);
> }
> EXPORT_SYMBOL_GPL(virtio_transport_notify_buffer_size);
>
>-- 
>2.25.1
>


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC PATCH v4 00/17] virtio/vsock: introduce SOCK_SEQPACKET support
  2021-02-08  6:32   ` Arseny Krasnov
@ 2021-02-11 14:57       ` Stefano Garzarella
  0 siblings, 0 replies; 61+ messages in thread
From: Stefano Garzarella @ 2021-02-11 14:57 UTC (permalink / raw)
  To: Arseny Krasnov
  Cc: Michael S. Tsirkin, Stefan Hajnoczi, Jason Wang, David S. Miller,
	Jakub Kicinski, Jorgen Hansen, Andra Paraschiv, Colin Ian King,
	Alexander Popov, kvm, virtualization, netdev, linux-kernel,
	stsp2, oxffffaa

Hi Arseny,

On Mon, Feb 08, 2021 at 09:32:59AM +0300, Arseny Krasnov wrote:
>
>On 07.02.2021 19:20, Michael S. Tsirkin wrote:
>> On Sun, Feb 07, 2021 at 06:12:56PM +0300, Arseny Krasnov wrote:
>>> 	This patchset impelements support of SOCK_SEQPACKET for virtio
>>> transport.
>>> 	As SOCK_SEQPACKET guarantees to save record boundaries, so to
>>> do it, two new packet operations were added: first for start of record
>>>  and second to mark end of record(SEQ_BEGIN and SEQ_END later). Also,
>>> both operations carries metadata - to maintain boundaries and payload
>>> integrity. Metadata is introduced by adding special header with two
>>> fields - message count and message length:
>>>
>>> 	struct virtio_vsock_seq_hdr {
>>> 		__le32  msg_cnt;
>>> 		__le32  msg_len;
>>> 	} __attribute__((packed));
>>>
>>> 	This header is transmitted as payload of SEQ_BEGIN and SEQ_END
>>> packets(buffer of second virtio descriptor in chain) in the same way as
>>> data transmitted in RW packets. Payload was chosen as buffer for this
>>> header to avoid touching first virtio buffer which carries header of
>>> packet, because someone could check that size of this buffer is equal
>>> to size of packet header. To send record, packet with start marker is
>>> sent first(it's header contains length of record and counter), then
>>> counter is incremented and all data is sent as usual 'RW' packets and
>>> finally SEQ_END is sent(it also carries counter of message, which is
>>> counter of SEQ_BEGIN + 1), also after sedning SEQ_END counter is
>>> incremented again. On receiver's side, length of record is known from
>>> packet with start record marker. To check that no packets were dropped
>>> by transport, counters of two sequential SEQ_BEGIN and SEQ_END are
>>> checked(counter of SEQ_END must be bigger that counter of SEQ_BEGIN by
>>> 1) and length of data between two markers is compared to length in
>>> SEQ_BEGIN header.
>>> 	Now as  packets of one socket are not reordered neither on
>>> vsock nor on vhost transport layers, such markers allows to restore
>>> original record on receiver's side. If user's buffer is smaller that
>>> record length, when all out of size data is dropped.
>>> 	Maximum length of datagram is not limited as in stream socket,
>>> because same credit logic is used. Difference with stream socket is
>>> that user is not woken up until whole record is received or error
>>> occurred. Implementation also supports 'MSG_EOR' and 'MSG_TRUNC' flags.
>>> 	Tests also implemented.
>>>
>>>  Arseny Krasnov (17):
>>>   af_vsock: update functions for connectible socket
>>>   af_vsock: separate wait data loop
>>>   af_vsock: separate receive data loop
>>>   af_vsock: implement SEQPACKET receive loop
>>>   af_vsock: separate wait space loop
>>>   af_vsock: implement send logic for SEQPACKET
>>>   af_vsock: rest of SEQPACKET support
>>>   af_vsock: update comments for stream sockets
>>>   virtio/vsock: dequeue callback for SOCK_SEQPACKET
>>>   virtio/vsock: fetch length for SEQPACKET record
>>>   virtio/vsock: add SEQPACKET receive logic
>>>   virtio/vsock: rest of SOCK_SEQPACKET support
>>>   virtio/vsock: setup SEQPACKET ops for transport
>>>   vhost/vsock: setup SEQPACKET ops for transport
>>>   vsock_test: add SOCK_SEQPACKET tests
>>>   loopback/vsock: setup SEQPACKET ops for transport
>>>   virtio/vsock: simplify credit update function API
>>>
>>>  drivers/vhost/vsock.c                   |   8 +-
>>>  include/linux/virtio_vsock.h            |  15 +
>>>  include/net/af_vsock.h                  |   9 +
>>>  include/uapi/linux/virtio_vsock.h       |  16 +
>>>  net/vmw_vsock/af_vsock.c                | 588 +++++++++++++++-------
>>>  net/vmw_vsock/virtio_transport.c        |   5 +
>>>  net/vmw_vsock/virtio_transport_common.c | 316 ++++++++++--
>>>  net/vmw_vsock/vsock_loopback.c          |   5 +
>>>  tools/testing/vsock/util.c              |  32 +-
>>>  tools/testing/vsock/util.h              |   3 +
>>>  tools/testing/vsock/vsock_test.c        | 126 +++++
>>>  11 files changed, 895 insertions(+), 228 deletions(-)
>>>
>>>  TODO:
>>>  - What to do, when server doesn't support SOCK_SEQPACKET. In current
>>>    implementation RST is replied in the same way when listening port
>>>    is not found. I think that current RST is enough,because case when
>>>    server doesn't support SEQ_PACKET is same when listener missed(e.g.
>>>    no listener in both cases).

I think this is fine.

>>    - virtio spec patch
>Ok

Yes, please prepare a patch to discuss the VIRTIO spec changes.

For example, for 'virtio_vsock_seq_hdr' I left a comment about the 'msg_cnt' 
naming, which would be better discussed with the virtio folks.
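
As a starting point for that discussion, the receive-side rule from the
cover letter (the SEQ_END counter must be the SEQ_BEGIN counter plus one, and
the data between the markers must match the length announced in SEQ_BEGIN)
can be sketched as below. This is a userspace illustration under my own
assumptions, not the code from the series: the demo_* names are hypothetical,
the fields are kept in host byte order for simplicity, and treating counter
wrap-around as valid is my assumption, not something the cover letter states.

```c
#include <stdbool.h>
#include <stdint.h>

/* Host-endian mirror of the two fields of struct virtio_vsock_seq_hdr;
 * in the proposed header both fields are little-endian (__le32).
 */
struct demo_seq_hdr {
	uint32_t msg_cnt;
	uint32_t msg_len;
};

/* Return true when the record delimited by 'begin' and 'end' is intact:
 * the END counter follows the BEGIN counter by exactly one, and the number
 * of RW payload bytes seen between the markers equals the announced length.
 */
static bool demo_record_is_intact(const struct demo_seq_hdr *begin,
				  const struct demo_seq_hdr *end,
				  uint32_t rw_bytes_seen)
{
	/* Unsigned arithmetic keeps the comparison valid across wrap-around
	 * (assumption: a wrapping counter is acceptable).
	 */
	if ((uint32_t)(begin->msg_cnt + 1) != end->msg_cnt)
		return false;

	return rw_bytes_seen == begin->msg_len;
}
```

Whatever name the field ends up with, spelling the rule out like this in the
spec patch should make it easier to review.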

Anyway, I reviewed this series and left some comments.
I think we are in good shape :-)

Thanks,
Stefano


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC PATCH v4 02/17] af_vsock: separate wait data loop
  2021-02-07 15:14 ` [RFC PATCH v4 02/17] af_vsock: separate wait data loop Arseny Krasnov
@ 2021-02-11 15:11     ` Jorgen Hansen
  2021-02-11 15:11     ` Jorgen Hansen
  1 sibling, 0 replies; 61+ messages in thread
From: Jorgen Hansen @ 2021-02-11 15:11 UTC (permalink / raw)
  To: Arseny Krasnov
  Cc: Stefan Hajnoczi, Stefano Garzarella, Michael S. Tsirkin,
	Jason Wang, David S. Miller, Jakub Kicinski, Colin Ian King,
	Andra Paraschiv, Alexander Popov, kvm, virtualization, netdev,
	linux-kernel, stsp2, oxffffaa


> On 7 Feb 2021, at 16:14, Arseny Krasnov <arseny.krasnov@kaspersky.com> wrote:
> 
> This moves wait loop for data to dedicated function, because later
> it will be used by SEQPACKET data receive loop.
> 
> Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
> ---
> net/vmw_vsock/af_vsock.c | 158 +++++++++++++++++++++------------------
> 1 file changed, 86 insertions(+), 72 deletions(-)
> 
> diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
> index f4fabec50650..38927695786f 100644
> --- a/net/vmw_vsock/af_vsock.c
> +++ b/net/vmw_vsock/af_vsock.c
> @@ -1833,6 +1833,71 @@ static int vsock_connectible_sendmsg(struct socket *sock, struct msghdr *msg,
> 	return err;
> }
> 
> +static int vsock_wait_data(struct sock *sk, struct wait_queue_entry *wait,
> +			   long timeout,
> +			   struct vsock_transport_recv_notify_data *recv_data,
> +			   size_t target)
> +{
> +	const struct vsock_transport *transport;
> +	struct vsock_sock *vsk;
> +	s64 data;
> +	int err;
> +
> +	vsk = vsock_sk(sk);
> +	err = 0;
> +	transport = vsk->transport;
> +	prepare_to_wait(sk_sleep(sk), wait, TASK_INTERRUPTIBLE);
> +
> +	while ((data = vsock_stream_has_data(vsk)) == 0) {
> +		if (sk->sk_err != 0 ||
> +		    (sk->sk_shutdown & RCV_SHUTDOWN) ||
> +		    (vsk->peer_shutdown & SEND_SHUTDOWN)) {
> +			goto out;
> +		}
> +
> +		/* Don't wait for non-blocking sockets. */
> +		if (timeout == 0) {
> +			err = -EAGAIN;
> +			goto out;
> +		}
> +
> +		if (recv_data) {
> +			err = transport->notify_recv_pre_block(vsk, target, recv_data);
> +			if (err < 0)
> +				goto out;
> +		}
> +
> +		release_sock(sk);
> +		timeout = schedule_timeout(timeout);
> +		lock_sock(sk);
> +
> +		if (signal_pending(current)) {
> +			err = sock_intr_errno(timeout);
> +			goto out;
> +		} else if (timeout == 0) {
> +			err = -EAGAIN;
> +			goto out;
> +		}
> +	}
> +
> +	finish_wait(sk_sleep(sk), wait);
> +
> +	/* Invalid queue pair content. XXX This should
> +	 * be changed to a connection reset in a later
> +	 * change.
> +	 */

Since you are here, could you update this comment to something like:

/* Internal transport error when checking for available
 * data. XXX This should be changed to a connection
 * reset in a later change.
 */

> +	if (data < 0)
> +		return -ENOMEM;
> +
> +	/* Have some data, return. */
> +	if (data)
> +		return data;
> +
> +out:
> +	finish_wait(sk_sleep(sk), wait);
> +	return err;
> +}

I agree with Stefano's suggestion to get rid of the out: part and just have a single finish_wait().
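
The single-exit shape suggested above can be sketched in plain userspace C.
This is only an illustrative model, not the kernel code: struct fake_sock,
fake_has_data(), fake_finish_wait() and wait_data_single_exit() are
hypothetical stand-ins, and decrementing the timeout stands in for
schedule_timeout(). Every early exit becomes a break, so the cleanup call
appears exactly once.

```c
#include <errno.h>

#define FAKE_POLLS 4

/* Hypothetical stand-in for the socket state; not a kernel structure. */
struct fake_sock {
	int data_seq[FAKE_POLLS]; /* values returned by successive polls */
	int polls;                /* number of polls performed so far */
	int shutdown;             /* non-zero models a receive shutdown */
	long timeout;             /* remaining "ticks"; 0 means non-blocking */
	int finish_calls;         /* how often the cleanup helper ran */
};

/* Models vsock_stream_has_data(): <0 error, 0 no data, >0 bytes ready. */
static int fake_has_data(struct fake_sock *s)
{
	int idx = s->polls < FAKE_POLLS ? s->polls : FAKE_POLLS - 1;

	s->polls++;
	return s->data_seq[idx];
}

/* Models finish_wait(); only counts invocations. */
static void fake_finish_wait(struct fake_sock *s)
{
	s->finish_calls++;
}

static int wait_data_single_exit(struct fake_sock *s)
{
	int err = 0;
	int data;

	while ((data = fake_has_data(s)) == 0) {
		if (s->shutdown)
			break;

		/* Don't wait for non-blocking sockets or expired timeouts. */
		if (s->timeout == 0) {
			err = -EAGAIN;
			break;
		}

		s->timeout--; /* stands in for schedule_timeout() */
	}

	/* The only cleanup site: every path above falls through to here. */
	fake_finish_wait(s);

	if (err)
		return err;

	/* Internal transport error when checking for available data. */
	if (data < 0)
		return -ENOMEM;

	return data;
}
```

With this shape there is no out: label to keep in sync with the normal
return path, at the cost of one extra check of err after the loop.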

> +
> static int
> vsock_connectible_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
> 			  int flags)
> @@ -1912,85 +1977,34 @@ vsock_connectible_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
> 
> 
> 	while (1) {
> -		s64 ready;
> +		ssize_t read;
> 
> -		prepare_to_wait(sk_sleep(sk), &wait, TASK_INTERRUPTIBLE);
> -		ready = vsock_stream_has_data(vsk);
> -
> -		if (ready == 0) {
> -			if (sk->sk_err != 0 ||
> -			    (sk->sk_shutdown & RCV_SHUTDOWN) ||
> -			    (vsk->peer_shutdown & SEND_SHUTDOWN)) {
> -				finish_wait(sk_sleep(sk), &wait);
> -				break;
> -			}
> -			/* Don't wait for non-blocking sockets. */
> -			if (timeout == 0) {
> -				err = -EAGAIN;
> -				finish_wait(sk_sleep(sk), &wait);
> -				break;
> -			}
> -
> -			err = transport->notify_recv_pre_block(
> -					vsk, target, &recv_data);
> -			if (err < 0) {
> -				finish_wait(sk_sleep(sk), &wait);
> -				break;
> -			}
> -			release_sock(sk);
> -			timeout = schedule_timeout(timeout);
> -			lock_sock(sk);
> -
> -			if (signal_pending(current)) {
> -				err = sock_intr_errno(timeout);
> -				finish_wait(sk_sleep(sk), &wait);
> -				break;
> -			} else if (timeout == 0) {
> -				err = -EAGAIN;
> -				finish_wait(sk_sleep(sk), &wait);
> -				break;
> -			}
> -		} else {
> -			ssize_t read;
> +		err = vsock_wait_data(sk, &wait, timeout, &recv_data, target);
> +		if (err <= 0)
> +			break;

There is a small change in behaviour here if vsock_stream_has_data(vsk)
returns something < 0. The old code did a goto out in that case; since you now
just do a break, the err value can be overwritten after the loop if there is an
sk->sk_err, a receive shutdown has been performed, or data has already been
copied. That should be ok, though.
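
To make the difference concrete, here is a small userspace model of the two
exit paths. It is a sketch under assumptions: struct loop_exit_state and
recv_result() are hypothetical names, and the post-loop checks are modeled
only on the three conditions listed above (sk->sk_err, receive shutdown,
data already copied), not copied from the kernel source.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the state examined after the receive loop. */
struct loop_exit_state {
	int sk_err;        /* pending socket error as a positive errno, or 0 */
	bool rcv_shutdown; /* receive side has been shut down */
	long copied;       /* bytes already copied to the user buffer */
};

/*
 * skip_post_loop == true models the old "goto out" (err returned as is);
 * skip_post_loop == false models the new "break" (post-loop checks run).
 */
static long recv_result(const struct loop_exit_state *st, long err,
			bool skip_post_loop)
{
	if (skip_post_loop)
		return err;

	if (st->sk_err)
		err = -st->sk_err;
	else if (st->rcv_shutdown)
		err = 0;

	/* Data that was already copied takes precedence over any error. */
	if (st->copied > 0)
		err = st->copied;

	return err;
}
```

In this model a transport error after a partial read is reported as
-ENOMEM on the old path, while the new path returns the number of bytes
already copied.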

> -			finish_wait(sk_sleep(sk), &wait);
> -
> -			if (ready < 0) {
> -				/* Invalid queue pair content. XXX This should
> -				* be changed to a connection reset in a later
> -				* change.
> -				*/
> -
> -				err = -ENOMEM;
> -				goto out;
> -			}
> -
> -			err = transport->notify_recv_pre_dequeue(
> -					vsk, target, &recv_data);
> -			if (err < 0)
> -				break;
> +		err = transport->notify_recv_pre_dequeue(vsk, target,
> +							 &recv_data);
> +		if (err < 0)
> +			break;
> 
> -			read = transport->stream_dequeue(
> -					vsk, msg,
> -					len - copied, flags);
> -			if (read < 0) {
> -				err = -ENOMEM;
> -				break;
> -			}
> +		read = transport->stream_dequeue(vsk, msg, len - copied, flags);
> +		if (read < 0) {
> +			err = -ENOMEM;
> +			break;
> +		}
> 
> -			copied += read;
> +		copied += read;
> 
> -			err = transport->notify_recv_post_dequeue(
> -					vsk, target, read,
> -					!(flags & MSG_PEEK), &recv_data);
> -			if (err < 0)
> -				goto out;
> +		err = transport->notify_recv_post_dequeue(vsk, target, read,
> +						!(flags & MSG_PEEK), &recv_data);
> +		if (err < 0)
> +			goto out;
> 
> -			if (read >= target || flags & MSG_PEEK)
> -				break;
> +		if (read >= target || flags & MSG_PEEK)
> +			break;
> 
> -			target -= read;
> -		}
> +		target -= read;
> 	}
> 
> 	if (sk->sk_err)
> -- 
> 2.25.1
> 


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC PATCH v4 00/17] virtio/vsock: introduce SOCK_SEQPACKET support
  2021-02-11 14:57       ` Stefano Garzarella
  (?)
@ 2021-02-12  6:11       ` Arseny Krasnov
  2021-02-12  8:07           ` Stefano Garzarella
  -1 siblings, 1 reply; 61+ messages in thread
From: Arseny Krasnov @ 2021-02-12  6:11 UTC (permalink / raw)
  To: Stefano Garzarella
  Cc: Michael S. Tsirkin, Stefan Hajnoczi, Jason Wang, David S. Miller,
	Jakub Kicinski, Jorgen Hansen, Andra Paraschiv, Colin Ian King,
	Alexander Popov, kvm, virtualization, netdev, linux-kernel,
	stsp2, oxffffaa


On 11.02.2021 17:57, Stefano Garzarella wrote:
> Hi Arseny,
>
> On Mon, Feb 08, 2021 at 09:32:59AM +0300, Arseny Krasnov wrote:
>> On 07.02.2021 19:20, Michael S. Tsirkin wrote:
>>> On Sun, Feb 07, 2021 at 06:12:56PM +0300, Arseny Krasnov wrote:
>>>> 	This patchset impelements support of SOCK_SEQPACKET for virtio
>>>> transport.
>>>> 	As SOCK_SEQPACKET guarantees to save record boundaries, so to
>>>> do it, two new packet operations were added: first for start of record
>>>>  and second to mark end of record(SEQ_BEGIN and SEQ_END later). Also,
>>>> both operations carries metadata - to maintain boundaries and payload
>>>> integrity. Metadata is introduced by adding special header with two
>>>> fields - message count and message length:
>>>>
>>>> 	struct virtio_vsock_seq_hdr {
>>>> 		__le32  msg_cnt;
>>>> 		__le32  msg_len;
>>>> 	} __attribute__((packed));
>>>>
>>>> 	This header is transmitted as payload of SEQ_BEGIN and SEQ_END
>>>> packets(buffer of second virtio descriptor in chain) in the same way as
>>>> data transmitted in RW packets. Payload was chosen as buffer for this
>>>> header to avoid touching first virtio buffer which carries header of
>>>> packet, because someone could check that size of this buffer is equal
>>>> to size of packet header. To send record, packet with start marker is
>>>> sent first(it's header contains length of record and counter), then
>>>> counter is incremented and all data is sent as usual 'RW' packets and
>>>> finally SEQ_END is sent(it also carries counter of message, which is
>>>> counter of SEQ_BEGIN + 1), also after sedning SEQ_END counter is
>>>> incremented again. On receiver's side, length of record is known from
>>>> packet with start record marker. To check that no packets were dropped
>>>> by transport, counters of two sequential SEQ_BEGIN and SEQ_END are
>>>> checked(counter of SEQ_END must be bigger that counter of SEQ_BEGIN by
>>>> 1) and length of data between two markers is compared to length in
>>>> SEQ_BEGIN header.
>>>> 	Now as  packets of one socket are not reordered neither on
>>>> vsock nor on vhost transport layers, such markers allows to restore
>>>> original record on receiver's side. If user's buffer is smaller that
>>>> record length, when all out of size data is dropped.
>>>> 	Maximum length of datagram is not limited as in stream socket,
>>>> because same credit logic is used. Difference with stream socket is
>>>> that user is not woken up until whole record is received or error
>>>> occurred. Implementation also supports 'MSG_EOR' and 'MSG_TRUNC' flags.
>>>> 	Tests also implemented.
>>>>
>>>>  Arseny Krasnov (17):
>>>>   af_vsock: update functions for connectible socket
>>>>   af_vsock: separate wait data loop
>>>>   af_vsock: separate receive data loop
>>>>   af_vsock: implement SEQPACKET receive loop
>>>>   af_vsock: separate wait space loop
>>>>   af_vsock: implement send logic for SEQPACKET
>>>>   af_vsock: rest of SEQPACKET support
>>>>   af_vsock: update comments for stream sockets
>>>>   virtio/vsock: dequeue callback for SOCK_SEQPACKET
>>>>   virtio/vsock: fetch length for SEQPACKET record
>>>>   virtio/vsock: add SEQPACKET receive logic
>>>>   virtio/vsock: rest of SOCK_SEQPACKET support
>>>>   virtio/vsock: setup SEQPACKET ops for transport
>>>>   vhost/vsock: setup SEQPACKET ops for transport
>>>>   vsock_test: add SOCK_SEQPACKET tests
>>>>   loopback/vsock: setup SEQPACKET ops for transport
>>>>   virtio/vsock: simplify credit update function API
>>>>
>>>>  drivers/vhost/vsock.c                   |   8 +-
>>>>  include/linux/virtio_vsock.h            |  15 +
>>>>  include/net/af_vsock.h                  |   9 +
>>>>  include/uapi/linux/virtio_vsock.h       |  16 +
>>>>  net/vmw_vsock/af_vsock.c                | 588 +++++++++++++++-------
>>>>  net/vmw_vsock/virtio_transport.c        |   5 +
>>>>  net/vmw_vsock/virtio_transport_common.c | 316 ++++++++++--
>>>>  net/vmw_vsock/vsock_loopback.c          |   5 +
>>>>  tools/testing/vsock/util.c              |  32 +-
>>>>  tools/testing/vsock/util.h              |   3 +
>>>>  tools/testing/vsock/vsock_test.c        | 126 +++++
>>>>  11 files changed, 895 insertions(+), 228 deletions(-)
>>>>
>>>>  TODO:
>>>>  - What to do, when server doesn't support SOCK_SEQPACKET. In current
>>>>    implementation RST is replied in the same way when listening port
>>>>    is not found. I think that current RST is enough,because case when
>>>>    server doesn't support SEQ_PACKET is same when listener missed(e.g.
>>>>    no listener in both cases).
> I think is fine.
>
>>>    - virtio spec patch
>> Ok
> Yes, please prepare a patch to discuss the VIRTIO spec changes.
>
> For example for 'virtio_vsock_seq_hdr', I left a comment about 'msg_cnt' 
> naming that should be better to discuss with virtio guys.

Ok, I'll prepare it in v5. So should I send it to both LKML (as one of the
patches) and the virtio mailing lists (e.g. virtio-comment@lists.oasis-open.org)?

>
> Anyway, I reviewed this series and I left some comments.
> I think we are in a good shape :-)
Great, thanks for the review. I'll address all review comments in the next version.
>
> Thanks,
> Stefano
>
>

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC PATCH v4 00/17] virtio/vsock: introduce SOCK_SEQPACKET support
  2021-02-12  6:11       ` Arseny Krasnov
@ 2021-02-12  8:07           ` Stefano Garzarella
  0 siblings, 0 replies; 61+ messages in thread
From: Stefano Garzarella @ 2021-02-12  8:07 UTC (permalink / raw)
  To: Arseny Krasnov
  Cc: Michael S. Tsirkin, Stefan Hajnoczi, Jason Wang, David S. Miller,
	Jakub Kicinski, Jorgen Hansen, Andra Paraschiv, Colin Ian King,
	Alexander Popov, kvm, virtualization, netdev, linux-kernel,
	stsp2, oxffffaa

On Fri, Feb 12, 2021 at 09:11:50AM +0300, Arseny Krasnov wrote:
>
>On 11.02.2021 17:57, Stefano Garzarella wrote:
>> Hi Arseny,
>>
>> On Mon, Feb 08, 2021 at 09:32:59AM +0300, Arseny Krasnov wrote:
>>> On 07.02.2021 19:20, Michael S. Tsirkin wrote:
>>>> On Sun, Feb 07, 2021 at 06:12:56PM +0300, Arseny Krasnov wrote:
>>>>> 	This patchset impelements support of SOCK_SEQPACKET for virtio
>>>>> transport.
>>>>> 	As SOCK_SEQPACKET guarantees to save record boundaries, so to
>>>>> do it, two new packet operations were added: first for start of record
>>>>>  and second to mark end of record(SEQ_BEGIN and SEQ_END later). Also,
>>>>> both operations carries metadata - to maintain boundaries and payload
>>>>> integrity. Metadata is introduced by adding special header with two
>>>>> fields - message count and message length:
>>>>>
>>>>> 	struct virtio_vsock_seq_hdr {
>>>>> 		__le32  msg_cnt;
>>>>> 		__le32  msg_len;
>>>>> 	} __attribute__((packed));
>>>>>
>>>>> 	This header is transmitted as payload of SEQ_BEGIN and SEQ_END
>>>>> packets(buffer of second virtio descriptor in chain) in the same way as
>>>>> data transmitted in RW packets. Payload was chosen as buffer for this
>>>>> header to avoid touching first virtio buffer which carries header of
>>>>> packet, because someone could check that size of this buffer is equal
>>>>> to size of packet header. To send record, packet with start marker is
>>>>> sent first(it's header contains length of record and counter), then
>>>>> counter is incremented and all data is sent as usual 'RW' packets and
>>>>> finally SEQ_END is sent(it also carries counter of message, which is
>>>>> counter of SEQ_BEGIN + 1), also after sedning SEQ_END counter is
>>>>> incremented again. On receiver's side, length of record is known from
>>>>> packet with start record marker. To check that no packets were dropped
>>>>> by transport, counters of two sequential SEQ_BEGIN and SEQ_END are
>>>>> checked(counter of SEQ_END must be bigger that counter of SEQ_BEGIN by
>>>>> 1) and length of data between two markers is compared to length in
>>>>> SEQ_BEGIN header.
>>>>> 	Now as  packets of one socket are not reordered neither on
>>>>> vsock nor on vhost transport layers, such markers allows to restore
>>>>> original record on receiver's side. If user's buffer is smaller that
>>>>> record length, when all out of size data is dropped.
>>>>> 	Maximum length of datagram is not limited as in stream socket,
>>>>> because same credit logic is used. Difference with stream socket is
>>>>> that user is not woken up until whole record is received or error
>>>>> occurred. Implementation also supports 'MSG_EOR' and 'MSG_TRUNC' flags.
>>>>> 	Tests also implemented.
>>>>>
>>>>>  Arseny Krasnov (17):
>>>>>   af_vsock: update functions for connectible socket
>>>>>   af_vsock: separate wait data loop
>>>>>   af_vsock: separate receive data loop
>>>>>   af_vsock: implement SEQPACKET receive loop
>>>>>   af_vsock: separate wait space loop
>>>>>   af_vsock: implement send logic for SEQPACKET
>>>>>   af_vsock: rest of SEQPACKET support
>>>>>   af_vsock: update comments for stream sockets
>>>>>   virtio/vsock: dequeue callback for SOCK_SEQPACKET
>>>>>   virtio/vsock: fetch length for SEQPACKET record
>>>>>   virtio/vsock: add SEQPACKET receive logic
>>>>>   virtio/vsock: rest of SOCK_SEQPACKET support
>>>>>   virtio/vsock: setup SEQPACKET ops for transport
>>>>>   vhost/vsock: setup SEQPACKET ops for transport
>>>>>   vsock_test: add SOCK_SEQPACKET tests
>>>>>   loopback/vsock: setup SEQPACKET ops for transport
>>>>>   virtio/vsock: simplify credit update function API
>>>>>
>>>>>  drivers/vhost/vsock.c                   |   8 +-
>>>>>  include/linux/virtio_vsock.h            |  15 +
>>>>>  include/net/af_vsock.h                  |   9 +
>>>>>  include/uapi/linux/virtio_vsock.h       |  16 +
>>>>>  net/vmw_vsock/af_vsock.c                | 588 +++++++++++++++-------
>>>>>  net/vmw_vsock/virtio_transport.c        |   5 +
>>>>>  net/vmw_vsock/virtio_transport_common.c | 316 ++++++++++--
>>>>>  net/vmw_vsock/vsock_loopback.c          |   5 +
>>>>>  tools/testing/vsock/util.c              |  32 +-
>>>>>  tools/testing/vsock/util.h              |   3 +
>>>>>  tools/testing/vsock/vsock_test.c        | 126 +++++
>>>>>  11 files changed, 895 insertions(+), 228 deletions(-)
>>>>>
>>>>>  TODO:
>>>>>  - What to do, when server doesn't support SOCK_SEQPACKET. In current
>>>>>    implementation RST is replied in the same way when listening port
>>>>>    is not found. I think that current RST is enough,because case when
>>>>>    server doesn't support SEQ_PACKET is same when listener missed(e.g.
>>>>>    no listener in both cases).
>> I think is fine.
>>
>>>>    - virtio spec patch
>>> Ok
>> Yes, please prepare a patch to discuss the VIRTIO spec changes.
>>
>> For example for 'virtio_vsock_seq_hdr', I left a comment about 'msg_cnt'
>> naming that should be better to discuss with virtio guys.
>
>Ok, i'll prepare it in v5. So I have to send it both LKML(as one of patches) and
>
>virtio mailing lists? (e.g. virtio-comment@lists.oasis-open.org)

I think you can send the VIRTIO spec patch separately from this series 
to virtio-comment, maybe CCing virtualization@lists.linux-foundation.org

But Michael could correct me :-)

>
>>
>> Anyway, I reviewed this series and I left some comments.
>> I think we are in a good shape :-)
>Great, thanks for review. I'll consider all review comments in next 
>version.

Great!

Stefano


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC PATCH v4 00/17] virtio/vsock: introduce SOCK_SEQPACKET support
@ 2021-02-12  8:07           ` Stefano Garzarella
  0 siblings, 0 replies; 61+ messages in thread
From: Stefano Garzarella @ 2021-02-12  8:07 UTC (permalink / raw)
  To: Arseny Krasnov
  Cc: Andra Paraschiv, kvm, Michael S. Tsirkin, netdev, stsp2,
	linux-kernel, virtualization, oxffffaa, Stefan Hajnoczi,
	Colin Ian King, Jakub Kicinski, David S. Miller, Jorgen Hansen,
	Alexander Popov

On Fri, Feb 12, 2021 at 09:11:50AM +0300, Arseny Krasnov wrote:
>
>On 11.02.2021 17:57, Stefano Garzarella wrote:
>> Hi Arseny,
>>
>> On Mon, Feb 08, 2021 at 09:32:59AM +0300, Arseny Krasnov wrote:
>>> On 07.02.2021 19:20, Michael S. Tsirkin wrote:
>>>> On Sun, Feb 07, 2021 at 06:12:56PM +0300, Arseny Krasnov wrote:
>>>>> 	This patchset impelements support of SOCK_SEQPACKET for virtio
>>>>> transport.
>>>>> 	As SOCK_SEQPACKET guarantees to save record boundaries, so to
>>>>> do it, two new packet operations were added: first for start of record
>>>>>  and second to mark end of record(SEQ_BEGIN and SEQ_END later). Also,
>>>>> both operations carries metadata - to maintain boundaries and payload
>>>>> integrity. Metadata is introduced by adding special header with two
>>>>> fields - message count and message length:
>>>>>
>>>>> 	struct virtio_vsock_seq_hdr {
>>>>> 		__le32  msg_cnt;
>>>>> 		__le32  msg_len;
>>>>> 	} __attribute__((packed));
>>>>>
>>>>> 	This header is transmitted as payload of SEQ_BEGIN and SEQ_END
>>>>> packets(buffer of second virtio descriptor in chain) in the same way as
>>>>> data transmitted in RW packets. Payload was chosen as buffer for this
>>>>> header to avoid touching first virtio buffer which carries header of
>>>>> packet, because someone could check that size of this buffer is equal
>>>>> to size of packet header. To send record, packet with start marker is
>>>>> sent first(it's header contains length of record and counter), then
>>>>> counter is incremented and all data is sent as usual 'RW' packets and
>>>>> finally SEQ_END is sent(it also carries counter of message, which is
>>>>> counter of SEQ_BEGIN + 1), also after sedning SEQ_END counter is
>>>>> incremented again. On receiver's side, length of record is known from
>>>>> packet with start record marker. To check that no packets were dropped
>>>>> by transport, counters of two sequential SEQ_BEGIN and SEQ_END are
>>>>> checked(counter of SEQ_END must be bigger that counter of SEQ_BEGIN by
>>>>> 1) and length of data between two markers is compared to length in
>>>>> SEQ_BEGIN header.
>>>>> 	Now as  packets of one socket are not reordered neither on
>>>>> vsock nor on vhost transport layers, such markers allows to restore
>>>>> original record on receiver's side. If user's buffer is smaller that
>>>>> record length, when all out of size data is dropped.
>>>>> 	Maximum length of datagram is not limited as in stream socket,
>>>>> because same credit logic is used. Difference with stream socket is
>>>>> that user is not woken up until whole record is received or error
>>>>> occurred. Implementation also supports 'MSG_EOR' and 'MSG_TRUNC' flags.
>>>>> 	Tests also implemented.
>>>>>
>>>>>  Arseny Krasnov (17):
>>>>>   af_vsock: update functions for connectible socket
>>>>>   af_vsock: separate wait data loop
>>>>>   af_vsock: separate receive data loop
>>>>>   af_vsock: implement SEQPACKET receive loop
>>>>>   af_vsock: separate wait space loop
>>>>>   af_vsock: implement send logic for SEQPACKET
>>>>>   af_vsock: rest of SEQPACKET support
>>>>>   af_vsock: update comments for stream sockets
>>>>>   virtio/vsock: dequeue callback for SOCK_SEQPACKET
>>>>>   virtio/vsock: fetch length for SEQPACKET record
>>>>>   virtio/vsock: add SEQPACKET receive logic
>>>>>   virtio/vsock: rest of SOCK_SEQPACKET support
>>>>>   virtio/vsock: setup SEQPACKET ops for transport
>>>>>   vhost/vsock: setup SEQPACKET ops for transport
>>>>>   vsock_test: add SOCK_SEQPACKET tests
>>>>>   loopback/vsock: setup SEQPACKET ops for transport
>>>>>   virtio/vsock: simplify credit update function API
>>>>>
>>>>>  drivers/vhost/vsock.c                   |   8 +-
>>>>>  include/linux/virtio_vsock.h            |  15 +
>>>>>  include/net/af_vsock.h                  |   9 +
>>>>>  include/uapi/linux/virtio_vsock.h       |  16 +
>>>>>  net/vmw_vsock/af_vsock.c                | 588 +++++++++++++++-------
>>>>>  net/vmw_vsock/virtio_transport.c        |   5 +
>>>>>  net/vmw_vsock/virtio_transport_common.c | 316 ++++++++++--
>>>>>  net/vmw_vsock/vsock_loopback.c          |   5 +
>>>>>  tools/testing/vsock/util.c              |  32 +-
>>>>>  tools/testing/vsock/util.h              |   3 +
>>>>>  tools/testing/vsock/vsock_test.c        | 126 +++++
>>>>>  11 files changed, 895 insertions(+), 228 deletions(-)
>>>>>
>>>>>  TODO:
>>>>>  - What to do when the server doesn't support SOCK_SEQPACKET. In the current
>>>>>    implementation RST is replied, in the same way as when the listening port
>>>>>    is not found. I think that the current RST is enough, because the case when
>>>>>    the server doesn't support SOCK_SEQPACKET is the same as when the listener
>>>>>    is missing (i.e. no listener in both cases).
>> I think it is fine.
>>
>>>>    - virtio spec patch
>>> Ok
>> Yes, please prepare a patch to discuss the VIRTIO spec changes.
>>
>> For example, for 'virtio_vsock_seq_hdr' I left a comment about the 'msg_cnt'
>> naming, which would be better discussed with the virtio guys.
>
>Ok, I'll prepare it in v5. So do I have to send it to both LKML (as one of the
>patches) and the virtio mailing lists (e.g. virtio-comment@lists.oasis-open.org)?

I think you can send the VIRTIO spec patch separately from this series 
to virtio-comment, maybe CCing virtualization@lists.linux-foundation.org

But Michael could correct me :-)

>
>>
>> Anyway, I reviewed this series and I left some comments.
>> I think we are in a good shape :-)
>Great, thanks for review. I'll consider all review comments in next 
>version.

Great!

Stefano


* Re: [RFC PATCH v4 07/17] af_vsock: rest of SEQPACKET support
  2021-02-11 12:27     ` Stefano Garzarella
  (?)
@ 2021-02-15  9:11     ` Arseny Krasnov
  -1 siblings, 0 replies; 61+ messages in thread
From: Arseny Krasnov @ 2021-02-15  9:11 UTC (permalink / raw)
  To: Stefano Garzarella
  Cc: Stefan Hajnoczi, Michael S. Tsirkin, Jason Wang, David S. Miller,
	Jakub Kicinski, Jorgen Hansen, Colin Ian King, Andra Paraschiv,
	Jeff Vander Stoep, kvm, virtualization, netdev, linux-kernel,
	stsp2, oxffffaa


On 11.02.2021 15:27, Stefano Garzarella wrote:
> On Sun, Feb 07, 2021 at 06:16:12PM +0300, Arseny Krasnov wrote:
>> This adds the rest of SOCK_SEQPACKET support:
>> 1) Adds socket ops for SEQPACKET type.
>> 2) Allows creating a socket with SEQPACKET type.
>>
>> Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
>> ---
>> net/vmw_vsock/af_vsock.c | 37 ++++++++++++++++++++++++++++++++++++-
>> 1 file changed, 36 insertions(+), 1 deletion(-)
>>
>> diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
>> index a033d3340ac4..c77998a14018 100644
>> --- a/net/vmw_vsock/af_vsock.c
>> +++ b/net/vmw_vsock/af_vsock.c
>> @@ -452,6 +452,7 @@ int vsock_assign_transport(struct vsock_sock *vsk, struct vsock_sock *psk)
>> 		new_transport = transport_dgram;
>> 		break;
>> 	case SOCK_STREAM:
>> +	case SOCK_SEQPACKET:
>> 		if (vsock_use_local_transport(remote_cid))
>> 			new_transport = transport_local;
>> 		else if (remote_cid <= VMADDR_CID_HOST || !transport_h2g ||
>> @@ -459,6 +460,15 @@ int vsock_assign_transport(struct vsock_sock *vsk, struct vsock_sock *psk)
>> 			new_transport = transport_g2h;
>> 		else
>> 			new_transport = transport_h2g;
>> +
>> +		if (sk->sk_type == SOCK_SEQPACKET) {
>> +			if (!new_transport ||
>> +			    !new_transport->seqpacket_seq_send_len ||
>> +			    !new_transport->seqpacket_seq_send_eor ||
>> +			    !new_transport->seqpacket_seq_get_len ||
>> +			    !new_transport->seqpacket_dequeue)
>> +				return -ESOCKTNOSUPPORT;
>> +		}
> Maybe we should move this check after the try_module_get() call, since 
> the memory pointed to by the 'new_transport' pointer can be deallocated in 
> the meantime.
>
> Also, if the socket had a transport before, we should deassign it before 
> returning an error.

I think the previous transport is deassigned immediately after this
'switch()' on sk->sk_type:

if (vsk->transport) {
    ...
    vsock_deassign_transport(vsk);
}

Ok, the check will be moved after 'try_module_get()'.
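
For illustration only, here is a rough userspace sketch of the ordering
discussed above: take the module reference first, then check the SEQPACKET
callbacks, and drop the reference again if the check fails. Everything
prefixed with 'mock_' is a hypothetical stand-in and not the kernel API; the
callback names are the ones from this patch, and returning -ENODEV when the
reference cannot be taken is an assumption of the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct vsock_transport. */
struct mock_transport {
	bool module_alive;	/* models whether try_module_get() may succeed */
	int refcount;		/* models the module reference count */
	int (*seqpacket_seq_send_len)(void);
	int (*seqpacket_seq_send_eor)(void);
	int (*seqpacket_seq_get_len)(void);
	int (*seqpacket_dequeue)(void);
};

/* Placeholder callback so that a "complete" transport can be built. */
static int mock_op(void)
{
	return 0;
}

/* Models try_module_get(): fails if the module is already going away. */
static bool mock_try_module_get(struct mock_transport *t)
{
	if (!t->module_alive)
		return false;
	t->refcount++;
	return true;
}

/* Models module_put(): drops a reference taken earlier. */
static void mock_module_put(struct mock_transport *t)
{
	assert(t->refcount > 0);
	t->refcount--;
}

static bool mock_has_seqpacket_ops(const struct mock_transport *t)
{
	return t->seqpacket_seq_send_len && t->seqpacket_seq_send_eor &&
	       t->seqpacket_seq_get_len && t->seqpacket_dequeue;
}

/* The callbacks are inspected only while the reference is held. */
static int mock_assign_transport(struct mock_transport *t, bool seqpacket)
{
	if (!t || !mock_try_module_get(t))
		return -ENODEV;

	if (seqpacket && !mock_has_seqpacket_ops(t)) {
		mock_module_put(t);
		return -ESOCKTNOSUPPORT;
	}

	return 0;
}
```

The point of the ordering is that the transport's fields are read only after
the reference is held, and that the error path gives the reference back.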

>
>> 		break;
>> 	default:
>> 		return -ESOCKTNOSUPPORT;
>> @@ -684,6 +694,7 @@ static int __vsock_bind(struct sock *sk, struct sockaddr_vm *addr)
>>
>> 	switch (sk->sk_socket->type) {
>> 	case SOCK_STREAM:
>> +	case SOCK_SEQPACKET:
>> 		spin_lock_bh(&vsock_table_lock);
>> 		retval = __vsock_bind_connectible(vsk, addr);
>> 		spin_unlock_bh(&vsock_table_lock);
>> @@ -769,7 +780,7 @@ static struct sock *__vsock_create(struct net *net,
>>
>> static bool sock_type_connectible(u16 type)
>> {
>> -	return type == SOCK_STREAM;
>> +	return (type == SOCK_STREAM) || (type == SOCK_SEQPACKET);
>> }
>>
>> static void __vsock_release(struct sock *sk, int level)
>> @@ -2199,6 +2210,27 @@ static const struct proto_ops vsock_stream_ops = {
>> 	.sendpage = sock_no_sendpage,
>> };
>>
>> +static const struct proto_ops vsock_seqpacket_ops = {
>> +	.family = PF_VSOCK,
>> +	.owner = THIS_MODULE,
>> +	.release = vsock_release,
>> +	.bind = vsock_bind,
>> +	.connect = vsock_connect,
>> +	.socketpair = sock_no_socketpair,
>> +	.accept = vsock_accept,
>> +	.getname = vsock_getname,
>> +	.poll = vsock_poll,
>> +	.ioctl = sock_no_ioctl,
>> +	.listen = vsock_listen,
>> +	.shutdown = vsock_shutdown,
>> +	.setsockopt = vsock_connectible_setsockopt,
>> +	.getsockopt = vsock_connectible_getsockopt,
>> +	.sendmsg = vsock_connectible_sendmsg,
>> +	.recvmsg = vsock_connectible_recvmsg,
>> +	.mmap = sock_no_mmap,
>> +	.sendpage = sock_no_sendpage,
>> +};
>> +
>> static int vsock_create(struct net *net, struct socket *sock,
>> 			int protocol, int kern)
>> {
>> @@ -2219,6 +2251,9 @@ static int vsock_create(struct net *net, struct socket *sock,
>> 	case SOCK_STREAM:
>> 		sock->ops = &vsock_stream_ops;
>> 		break;
>> +	case SOCK_SEQPACKET:
>> +		sock->ops = &vsock_seqpacket_ops;
>> +		break;
>> 	default:
>> 		return -ESOCKTNOSUPPORT;
>> 	}
>> -- 
>> 2.25.1
>>
>


* Re: [RFC PATCH v4 02/17] af_vsock: separate wait data loop
  2021-02-11 15:11     ` Jorgen Hansen
  (?)
@ 2021-02-16  6:58     ` Arseny Krasnov
  -1 siblings, 0 replies; 61+ messages in thread
From: Arseny Krasnov @ 2021-02-16  6:58 UTC (permalink / raw)
  To: Jorgen Hansen
  Cc: Stefan Hajnoczi, Stefano Garzarella, Michael S. Tsirkin,
	Jason Wang, David S. Miller, Jakub Kicinski, Colin Ian King,
	Andra Paraschiv, Alexander Popov, kvm, virtualization, netdev,
	linux-kernel, stsp2, oxffffaa


On 11.02.2021 18:11, Jorgen Hansen wrote:
>> On 7 Feb 2021, at 16:14, Arseny Krasnov <arseny.krasnov@kaspersky.com> wrote:
>>
>> This moves the wait loop for data to a dedicated function, because later
>> it will be used by the SEQPACKET data receive loop.
>>
>> Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
>> ---
>> net/vmw_vsock/af_vsock.c | 158 +++++++++++++++++++++------------------
>> 1 file changed, 86 insertions(+), 72 deletions(-)
>>
>> diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
>> index f4fabec50650..38927695786f 100644
>> --- a/net/vmw_vsock/af_vsock.c
>> +++ b/net/vmw_vsock/af_vsock.c
>> @@ -1833,6 +1833,71 @@ static int vsock_connectible_sendmsg(struct socket *sock, struct msghdr *msg,
>> 	return err;
>> }
>>
>> +static int vsock_wait_data(struct sock *sk, struct wait_queue_entry *wait,
>> +			   long timeout,
>> +			   struct vsock_transport_recv_notify_data *recv_data,
>> +			   size_t target)
>> +{
>> +	const struct vsock_transport *transport;
>> +	struct vsock_sock *vsk;
>> +	s64 data;
>> +	int err;
>> +
>> +	vsk = vsock_sk(sk);
>> +	err = 0;
>> +	transport = vsk->transport;
>> +	prepare_to_wait(sk_sleep(sk), wait, TASK_INTERRUPTIBLE);
>> +
>> +	while ((data = vsock_stream_has_data(vsk)) == 0) {
>> +		if (sk->sk_err != 0 ||
>> +		    (sk->sk_shutdown & RCV_SHUTDOWN) ||
>> +		    (vsk->peer_shutdown & SEND_SHUTDOWN)) {
>> +			goto out;
>> +		}
>> +
>> +		/* Don't wait for non-blocking sockets. */
>> +		if (timeout == 0) {
>> +			err = -EAGAIN;
>> +			goto out;
>> +		}
>> +
>> +		if (recv_data) {
>> +			err = transport->notify_recv_pre_block(vsk, target, recv_data);
>> +			if (err < 0)
>> +				goto out;
>> +		}
>> +
>> +		release_sock(sk);
>> +		timeout = schedule_timeout(timeout);
>> +		lock_sock(sk);
>> +
>> +		if (signal_pending(current)) {
>> +			err = sock_intr_errno(timeout);
>> +			goto out;
>> +		} else if (timeout == 0) {
>> +			err = -EAGAIN;
>> +			goto out;
>> +		}
>> +	}
>> +
>> +	finish_wait(sk_sleep(sk), wait);
>> +
>> +	/* Invalid queue pair content. XXX This should
>> +	 * be changed to a connection reset in a later
>> +	 * change.
>> +	 */
> Since you are here, could you update this comment to something like:
>
> /* Internal transport error when checking for available
>  * data. XXX This should be changed to a connection
>  * reset in a later change.
>  */
>
>> +	if (data < 0)
>> +		return -ENOMEM;
>> +
>> +	/* Have some data, return. */
>> +	if (data)
>> +		return data;
>> +
>> +out:
>> +	finish_wait(sk_sleep(sk), wait);
>> +	return err;
>> +}
> I agree with Stefano's suggestion to get rid of the out: part and just have the single finish_wait().
>
>> +
>> static int
>> vsock_connectible_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
>> 			  int flags)
>> @@ -1912,85 +1977,34 @@ vsock_connectible_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
>>
>>
>> 	while (1) {
>> -		s64 ready;
>> +		ssize_t read;
>>
>> -		prepare_to_wait(sk_sleep(sk), &wait, TASK_INTERRUPTIBLE);
>> -		ready = vsock_stream_has_data(vsk);
>> -
>> -		if (ready == 0) {
>> -			if (sk->sk_err != 0 ||
>> -			    (sk->sk_shutdown & RCV_SHUTDOWN) ||
>> -			    (vsk->peer_shutdown & SEND_SHUTDOWN)) {
>> -				finish_wait(sk_sleep(sk), &wait);
>> -				break;
>> -			}
>> -			/* Don't wait for non-blocking sockets. */
>> -			if (timeout == 0) {
>> -				err = -EAGAIN;
>> -				finish_wait(sk_sleep(sk), &wait);
>> -				break;
>> -			}
>> -
>> -			err = transport->notify_recv_pre_block(
>> -					vsk, target, &recv_data);
>> -			if (err < 0) {
>> -				finish_wait(sk_sleep(sk), &wait);
>> -				break;
>> -			}
>> -			release_sock(sk);
>> -			timeout = schedule_timeout(timeout);
>> -			lock_sock(sk);
>> -
>> -			if (signal_pending(current)) {
>> -				err = sock_intr_errno(timeout);
>> -				finish_wait(sk_sleep(sk), &wait);
>> -				break;
>> -			} else if (timeout == 0) {
>> -				err = -EAGAIN;
>> -				finish_wait(sk_sleep(sk), &wait);
>> -				break;
>> -			}
>> -		} else {
>> -			ssize_t read;
>> +		err = vsock_wait_data(sk, &wait, timeout, &recv_data, target);
>> +		if (err <= 0)
>> +			break;
> There is a small change in the behaviour here if vsock_stream_has_data(vsk)
> returned something < 0. Since you just do a break, the err value can be updated
> if there is an sk->sk_err, a receive shutdown has been performed or data has
> already been copied. That should be ok, though.

Maybe I can add the following 'if' after the while (1) loop.

There was:

if (sk->sk_err)
    err = -sk->sk_err;
else if (sk->sk_shutdown & RCV_SHUTDOWN)
    err = 0;

if (copied > 0)
    err = copied;

Will be:

if (err == 0) {
    if (sk->sk_err)
        err = -sk->sk_err;
    else if (sk->sk_shutdown & RCV_SHUTDOWN)
        err = 0;

    if (copied > 0)
        err = copied;
}

I.e. update 'err' only if it is still clear (0), and don't touch it otherwise.
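
To make the intended behaviour concrete, here is a small userspace sketch of
that final 'err' selection. It is not the kernel code: 'struct mock_sock',
'MOCK_RCV_SHUTDOWN' and 'mock_recv_result()' are hypothetical names used only
for this illustration, and the sketch assumes 'err' is 0 whenever the loop
ended without an error.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>	/* ssize_t */

#define MOCK_RCV_SHUTDOWN 1

/* Hypothetical stand-in holding only the two fields the check reads. */
struct mock_sock {
	int sk_err;
	unsigned int sk_shutdown;
};

/*
 * Pick the value to return from the receive path. An 'err' that was
 * already set inside the loop is kept as is; only a clear 'err' is
 * replaced by the socket error or by the number of bytes copied.
 */
static ssize_t mock_recv_result(const struct mock_sock *sk, ssize_t err,
				ssize_t copied)
{
	if (err == 0) {
		if (sk->sk_err)
			err = -sk->sk_err;
		else if (sk->sk_shutdown & MOCK_RCV_SHUTDOWN)
			err = 0;

		if (copied > 0)
			err = copied;
	}

	return err;
}

/* Usage example: two of the cases discussed in this thread. */
static void mock_recv_result_selfcheck(void)
{
	struct mock_sock sk = { .sk_err = ECONNRESET, .sk_shutdown = 0 };

	/* An earlier error is preserved even though sk_err is set. */
	assert(mock_recv_result(&sk, -ENOMEM, 0) == -ENOMEM);
	/* A clear 'err' picks up the socket error. */
	assert(mock_recv_result(&sk, 0, 0) == -ECONNRESET);
}
```

With this shape, a negative value from the wait-for-data step is no longer
overwritten by sk->sk_err or by 'copied'.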


>
>> -			finish_wait(sk_sleep(sk), &wait);
>> -
>> -			if (ready < 0) {
>> -				/* Invalid queue pair content. XXX This should
>> -				* be changed to a connection reset in a later
>> -				* change.
>> -				*/
>> -
>> -				err = -ENOMEM;
>> -				goto out;
>> -			}
>> -
>> -			err = transport->notify_recv_pre_dequeue(
>> -					vsk, target, &recv_data);
>> -			if (err < 0)
>> -				break;
>> +		err = transport->notify_recv_pre_dequeue(vsk, target,
>> +							 &recv_data);
>> +		if (err < 0)
>> +			break;
>>
>> -			read = transport->stream_dequeue(
>> -					vsk, msg,
>> -					len - copied, flags);
>> -			if (read < 0) {
>> -				err = -ENOMEM;
>> -				break;
>> -			}
>> +		read = transport->stream_dequeue(vsk, msg, len - copied, flags);
>> +		if (read < 0) {
>> +			err = -ENOMEM;
>> +			break;
>> +		}
>>
>> -			copied += read;
>> +		copied += read;
>>
>> -			err = transport->notify_recv_post_dequeue(
>> -					vsk, target, read,
>> -					!(flags & MSG_PEEK), &recv_data);
>> -			if (err < 0)
>> -				goto out;
>> +		err = transport->notify_recv_post_dequeue(vsk, target, read,
>> +						!(flags & MSG_PEEK), &recv_data);
>> +		if (err < 0)
>> +			goto out;
>>
>> -			if (read >= target || flags & MSG_PEEK)
>> -				break;
>> +		if (read >= target || flags & MSG_PEEK)
>> +			break;
>>
>> -			target -= read;
>> -		}
>> +		target -= read;
>> 	}
>>
>> 	if (sk->sk_err)
>> -- 
>> 2.25.1
>>
>
