linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH v2 0/6] Improve SOCK_SEQPACKET receive logic
@ 2021-07-04  8:08 Arseny Krasnov
  2021-07-04  8:09 ` [RFC PATCH v2 1/6] af_vsock/virtio/vsock: change seqpacket " Arseny Krasnov
                   ` (7 more replies)
  0 siblings, 8 replies; 13+ messages in thread
From: Arseny Krasnov @ 2021-07-04  8:08 UTC (permalink / raw)
  To: Stefan Hajnoczi, Stefano Garzarella, Michael S. Tsirkin,
	Jason Wang, David S. Miller, Jakub Kicinski, Arseny Krasnov,
	Andra Paraschiv, Norbert Slusarek, Colin Ian King
  Cc: kvm, virtualization, netdev, linux-kernel, oxffffaa

	This patchset modifies receive logic for SOCK_SEQPACKET.
Difference between current implementation and this version is that
now reader is woken up when there is at least one RW packet in rx
queue of socket and data is copied to user's buffer, while merged
approach wake up user only when whole message is received and kept
in queue. New implementation has several advantages:
 1) There is no limit for message length. Merged approach requires
    that length must be smaller than 'peer_buf_alloc', otherwise
    transmission will stuck.
 2) There is no need to keep whole message in queue, thus no
    'kmalloc()' memory will be wasted until EOR is received.

    Also new approach has some feature: as fragments of message
are copied until EOR is received, it is possible that part of
message will be already in user's buffer, while rest of message
still not received. And if user will be interrupted by signal or
timeout with part of message in buffer, it will exit receive loop,
leaving rest of message in queue. To solve this problem special
callback was added to transport: it is called when user was forced
to leave exit loop and tells transport to drop any packet until
EOR met. When EOR is found, this mode is disabled and normal packet
processing started. Note, that when 'drop until EOR' mode is on,
incoming packets still inserted in queue, reader will be woken up,
tries to copy data, but nothing will be copied until EOR found.
It was possible to drain such unneeded packets it rx work without
kicking user, but implemented way is simplest. Anyway, i think
such cases are rare.

    New test also added - it tries to copy to invalid user's
buffer.

Arseny Krasnov (16):
 af_vsock/virtio/vsock: change seqpacket receive logic
 af_vsock/virtio/vsock: remove 'seqpacket_has_data' callback
 virtio/vsock: remove 'msg_count' based logic
 af_vsock/virtio/vsock: add 'seqpacket_drop()' callback
 virtio/vsock: remove record size limit for SEQPACKET
 vsock_test: SEQPACKET read to broken buffer

 drivers/vhost/vsock.c                   |   2 +-
 include/linux/virtio_vsock.h            |   7 +-
 include/net/af_vsock.h                  |   4 +-
 net/vmw_vsock/af_vsock.c                |  44 ++++----
 net/vmw_vsock/virtio_transport.c        |   2 +-
 net/vmw_vsock/virtio_transport_common.c | 103 ++++++++-----------
 net/vmw_vsock/vsock_loopback.c          |   2 +-
 tools/testing/vsock/vsock_test.c        | 120 ++++++++++++++++++++++
 8 files changed, 193 insertions(+), 91 deletions(-)

 v1 -> v2:
 Patches reordered and reorganized.

Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
---
 cv.txt | 0
 1 file changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 cv.txt

diff --git a/cv.txt b/cv.txt
new file mode 100644
index 000000000000..e69de29bb2d1
-- 
2.25.1


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [RFC PATCH v2 1/6] af_vsock/virtio/vsock: change seqpacket receive logic
  2021-07-04  8:08 [RFC PATCH v2 0/6] Improve SOCK_SEQPACKET receive logic Arseny Krasnov
@ 2021-07-04  8:09 ` Arseny Krasnov
  2021-07-04  8:10 ` [RFC PATCH v2 2/6] af_vsock/virtio/vsock: remove 'seqpacket_has_data' callback Arseny Krasnov
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 13+ messages in thread
From: Arseny Krasnov @ 2021-07-04  8:09 UTC (permalink / raw)
  To: Stefan Hajnoczi, Stefano Garzarella, Michael S. Tsirkin,
	Jason Wang, David S. Miller, Jakub Kicinski, Arseny Krasnov,
	Colin Ian King, Norbert Slusarek, Andra Paraschiv
  Cc: kvm, virtualization, netdev, linux-kernel, oxffffaa

1) In af_vsock "loop" now is really loop: it receives
   message fragments one by one, until 'msg_ready'
   value is returned by transport.
2) In virtio transport, dequeue callback is called
   everytime when at least one fragment of message is
   received.

Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
---
 include/linux/virtio_vsock.h            |  3 +-
 include/net/af_vsock.h                  |  2 +-
 net/vmw_vsock/af_vsock.c                | 33 +++++++++----
 net/vmw_vsock/virtio_transport_common.c | 62 +++++++++++--------------
 4 files changed, 52 insertions(+), 48 deletions(-)

diff --git a/include/linux/virtio_vsock.h b/include/linux/virtio_vsock.h
index 35d7eedb5e8e..e68b4029f038 100644
--- a/include/linux/virtio_vsock.h
+++ b/include/linux/virtio_vsock.h
@@ -88,7 +88,8 @@ virtio_transport_seqpacket_enqueue(struct vsock_sock *vsk,
 ssize_t
 virtio_transport_seqpacket_dequeue(struct vsock_sock *vsk,
 				   struct msghdr *msg,
-				   int flags);
+				   int flags,
+				   bool *msg_ready);
 s64 virtio_transport_stream_has_data(struct vsock_sock *vsk);
 s64 virtio_transport_stream_has_space(struct vsock_sock *vsk);
 u32 virtio_transport_seqpacket_has_data(struct vsock_sock *vsk);
diff --git a/include/net/af_vsock.h b/include/net/af_vsock.h
index ab207677e0a8..c40d341611b0 100644
--- a/include/net/af_vsock.h
+++ b/include/net/af_vsock.h
@@ -137,7 +137,7 @@ struct vsock_transport {
 
 	/* SEQ_PACKET. */
 	ssize_t (*seqpacket_dequeue)(struct vsock_sock *vsk, struct msghdr *msg,
-				     int flags);
+				     int flags, bool *msg_ready);
 	int (*seqpacket_enqueue)(struct vsock_sock *vsk, struct msghdr *msg,
 				 size_t len);
 	bool (*seqpacket_allow)(u32 remote_cid);
diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index 3e02cc3b24f8..b66884def8e8 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -1881,7 +1881,7 @@ static int vsock_connectible_wait_data(struct sock *sk,
 	err = 0;
 	transport = vsk->transport;
 
-	while ((data = vsock_connectible_has_data(vsk)) == 0) {
+	while ((data = vsock_stream_has_data(vsk)) == 0) {
 		prepare_to_wait(sk_sleep(sk), wait, TASK_INTERRUPTIBLE);
 
 		if (sk->sk_err != 0 ||
@@ -2013,6 +2013,7 @@ static int __vsock_seqpacket_recvmsg(struct sock *sk, struct msghdr *msg,
 				     size_t len, int flags)
 {
 	const struct vsock_transport *transport;
+	bool msg_ready;
 	struct vsock_sock *vsk;
 	ssize_t record_len;
 	long timeout;
@@ -2023,23 +2024,36 @@ static int __vsock_seqpacket_recvmsg(struct sock *sk, struct msghdr *msg,
 	transport = vsk->transport;
 
 	timeout = sock_rcvtimeo(sk, flags & MSG_DONTWAIT);
+	msg_ready = false;
+	record_len = 0;
 
-	err = vsock_connectible_wait_data(sk, &wait, timeout, NULL, 0);
-	if (err <= 0)
-		goto out;
+	while (!msg_ready) {
+		ssize_t fragment_len;
+		int intr_err;
 
-	record_len = transport->seqpacket_dequeue(vsk, msg, flags);
+		intr_err = vsock_connectible_wait_data(sk, &wait, timeout, NULL, 0);
+		if (intr_err <= 0) {
+			err = intr_err;
+			break;
+		}
 
-	if (record_len < 0) {
-		err = -ENOMEM;
-		goto out;
+		fragment_len = transport->seqpacket_dequeue(vsk, msg, flags, &msg_ready);
+
+		if (fragment_len < 0) {
+			err = -ENOMEM;
+			break;
+		}
+
+		record_len += fragment_len;
 	}
 
 	if (sk->sk_err) {
 		err = -sk->sk_err;
 	} else if (sk->sk_shutdown & RCV_SHUTDOWN) {
 		err = 0;
-	} else {
+	}
+
+	if (msg_ready && !err) {
 		/* User sets MSG_TRUNC, so return real length of
 		 * packet.
 		 */
@@ -2055,7 +2069,6 @@ static int __vsock_seqpacket_recvmsg(struct sock *sk, struct msghdr *msg,
 			msg->msg_flags |= MSG_TRUNC;
 	}
 
-out:
 	return err;
 }
 
diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
index 169ba8b72a63..053bcea1a03f 100644
--- a/net/vmw_vsock/virtio_transport_common.c
+++ b/net/vmw_vsock/virtio_transport_common.c
@@ -407,58 +407,48 @@ virtio_transport_stream_do_dequeue(struct vsock_sock *vsk,
 
 static int virtio_transport_seqpacket_do_dequeue(struct vsock_sock *vsk,
 						 struct msghdr *msg,
-						 int flags)
+						 int flags,
+						 bool *msg_ready)
 {
 	struct virtio_vsock_sock *vvs = vsk->trans;
 	struct virtio_vsock_pkt *pkt;
 	int dequeued_len = 0;
 	size_t user_buf_len = msg_data_left(msg);
-	bool msg_ready = false;
 
+	*msg_ready = false;
 	spin_lock_bh(&vvs->rx_lock);
 
-	if (vvs->msg_count == 0) {
-		spin_unlock_bh(&vvs->rx_lock);
-		return 0;
-	}
+	while (!*msg_ready && !list_empty(&vvs->rx_queue) && dequeued_len >= 0) {
+		size_t pkt_len;
+		size_t bytes_to_copy;
 
-	while (!msg_ready) {
 		pkt = list_first_entry(&vvs->rx_queue, struct virtio_vsock_pkt, list);
+		pkt_len = (size_t)le32_to_cpu(pkt->hdr.len);
 
-		if (dequeued_len >= 0) {
-			size_t pkt_len;
-			size_t bytes_to_copy;
+		bytes_to_copy = min(user_buf_len, pkt_len);
 
-			pkt_len = (size_t)le32_to_cpu(pkt->hdr.len);
-			bytes_to_copy = min(user_buf_len, pkt_len);
-
-			if (bytes_to_copy) {
-				int err;
-
-				/* sk_lock is held by caller so no one else can dequeue.
-				 * Unlock rx_lock since memcpy_to_msg() may sleep.
-				 */
-				spin_unlock_bh(&vvs->rx_lock);
+		if (bytes_to_copy) {
+			int err;
+			/* sk_lock is held by caller so no one else can dequeue.
+			 * Unlock rx_lock since memcpy_to_msg() may sleep.
+			 */
+			spin_unlock_bh(&vvs->rx_lock);
 
-				err = memcpy_to_msg(msg, pkt->buf, bytes_to_copy);
-				if (err) {
-					/* Copy of message failed. Rest of
-					 * fragments will be freed without copy.
-					 */
-					dequeued_len = err;
-				} else {
-					user_buf_len -= bytes_to_copy;
-				}
+			err = memcpy_to_msg(msg, pkt->buf, bytes_to_copy);
 
-				spin_lock_bh(&vvs->rx_lock);
-			}
+			spin_lock_bh(&vvs->rx_lock);
 
-			if (dequeued_len >= 0)
-				dequeued_len += pkt_len;
+			if (err)
+				dequeued_len = err;
+			else
+				user_buf_len -= bytes_to_copy;
 		}
 
+		if (dequeued_len >= 0)
+			dequeued_len += pkt_len;
+
 		if (le32_to_cpu(pkt->hdr.flags) & VIRTIO_VSOCK_SEQ_EOR) {
-			msg_ready = true;
+			*msg_ready = true;
 			vvs->msg_count--;
 		}
 
@@ -489,12 +479,12 @@ EXPORT_SYMBOL_GPL(virtio_transport_stream_dequeue);
 ssize_t
 virtio_transport_seqpacket_dequeue(struct vsock_sock *vsk,
 				   struct msghdr *msg,
-				   int flags)
+				   int flags, bool *msg_ready)
 {
 	if (flags & MSG_PEEK)
 		return -EOPNOTSUPP;
 
-	return virtio_transport_seqpacket_do_dequeue(vsk, msg, flags);
+	return virtio_transport_seqpacket_do_dequeue(vsk, msg, flags, msg_ready);
 }
 EXPORT_SYMBOL_GPL(virtio_transport_seqpacket_dequeue);
 
-- 
2.25.1


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [RFC PATCH v2 2/6] af_vsock/virtio/vsock: remove 'seqpacket_has_data' callback
  2021-07-04  8:08 [RFC PATCH v2 0/6] Improve SOCK_SEQPACKET receive logic Arseny Krasnov
  2021-07-04  8:09 ` [RFC PATCH v2 1/6] af_vsock/virtio/vsock: change seqpacket " Arseny Krasnov
@ 2021-07-04  8:10 ` Arseny Krasnov
  2021-07-04  8:10 ` [RFC PATCH v2 3/6] virtio/vsock: remove 'msg_count' based logic Arseny Krasnov
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 13+ messages in thread
From: Arseny Krasnov @ 2021-07-04  8:10 UTC (permalink / raw)
  To: Stefan Hajnoczi, Stefano Garzarella, Michael S. Tsirkin,
	Jason Wang, David S. Miller, Jakub Kicinski, Arseny Krasnov,
	Andra Paraschiv, Norbert Slusarek, Colin Ian King
  Cc: kvm, virtualization, netdev, linux-kernel, oxffffaa

'rx_bytes' is used to check data presence on both SOCK_STREAM
and SOCK_SEQPACKET socket types for virtio/vsock.

Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
---
 drivers/vhost/vsock.c                   |  1 -
 include/linux/virtio_vsock.h            |  1 -
 include/net/af_vsock.h                  |  1 -
 net/vmw_vsock/af_vsock.c                | 10 ----------
 net/vmw_vsock/virtio_transport.c        |  1 -
 net/vmw_vsock/virtio_transport_common.c | 13 -------------
 net/vmw_vsock/vsock_loopback.c          |  1 -
 7 files changed, 28 deletions(-)

diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
index d38c996b4f46..c9713d8db0f4 100644
--- a/drivers/vhost/vsock.c
+++ b/drivers/vhost/vsock.c
@@ -451,7 +451,6 @@ static struct virtio_transport vhost_transport = {
 		.seqpacket_dequeue        = virtio_transport_seqpacket_dequeue,
 		.seqpacket_enqueue        = virtio_transport_seqpacket_enqueue,
 		.seqpacket_allow          = vhost_transport_seqpacket_allow,
-		.seqpacket_has_data       = virtio_transport_seqpacket_has_data,
 
 		.notify_poll_in           = virtio_transport_notify_poll_in,
 		.notify_poll_out          = virtio_transport_notify_poll_out,
diff --git a/include/linux/virtio_vsock.h b/include/linux/virtio_vsock.h
index e68b4029f038..8d34f3d73bbb 100644
--- a/include/linux/virtio_vsock.h
+++ b/include/linux/virtio_vsock.h
@@ -92,7 +92,6 @@ virtio_transport_seqpacket_dequeue(struct vsock_sock *vsk,
 				   bool *msg_ready);
 s64 virtio_transport_stream_has_data(struct vsock_sock *vsk);
 s64 virtio_transport_stream_has_space(struct vsock_sock *vsk);
-u32 virtio_transport_seqpacket_has_data(struct vsock_sock *vsk);
 
 int virtio_transport_do_socket_init(struct vsock_sock *vsk,
 				 struct vsock_sock *psk);
diff --git a/include/net/af_vsock.h b/include/net/af_vsock.h
index c40d341611b0..1747c0b564ef 100644
--- a/include/net/af_vsock.h
+++ b/include/net/af_vsock.h
@@ -141,7 +141,6 @@ struct vsock_transport {
 	int (*seqpacket_enqueue)(struct vsock_sock *vsk, struct msghdr *msg,
 				 size_t len);
 	bool (*seqpacket_allow)(u32 remote_cid);
-	u32 (*seqpacket_has_data)(struct vsock_sock *vsk);
 
 	/* Notification. */
 	int (*notify_poll_in)(struct vsock_sock *, size_t, bool *);
diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index b66884def8e8..87955f9ff065 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -860,16 +860,6 @@ s64 vsock_stream_has_data(struct vsock_sock *vsk)
 }
 EXPORT_SYMBOL_GPL(vsock_stream_has_data);
 
-static s64 vsock_connectible_has_data(struct vsock_sock *vsk)
-{
-	struct sock *sk = sk_vsock(vsk);
-
-	if (sk->sk_type == SOCK_SEQPACKET)
-		return vsk->transport->seqpacket_has_data(vsk);
-	else
-		return vsock_stream_has_data(vsk);
-}
-
 s64 vsock_stream_has_space(struct vsock_sock *vsk)
 {
 	return vsk->transport->stream_has_space(vsk);
diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
index e0c2c992ad9c..2a7c56fcb062 100644
--- a/net/vmw_vsock/virtio_transport.c
+++ b/net/vmw_vsock/virtio_transport.c
@@ -475,7 +475,6 @@ static struct virtio_transport virtio_transport = {
 		.seqpacket_dequeue        = virtio_transport_seqpacket_dequeue,
 		.seqpacket_enqueue        = virtio_transport_seqpacket_enqueue,
 		.seqpacket_allow          = virtio_transport_seqpacket_allow,
-		.seqpacket_has_data       = virtio_transport_seqpacket_has_data,
 
 		.notify_poll_in           = virtio_transport_notify_poll_in,
 		.notify_poll_out          = virtio_transport_notify_poll_out,
diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
index 053bcea1a03f..37d4ed526750 100644
--- a/net/vmw_vsock/virtio_transport_common.c
+++ b/net/vmw_vsock/virtio_transport_common.c
@@ -530,19 +530,6 @@ s64 virtio_transport_stream_has_data(struct vsock_sock *vsk)
 }
 EXPORT_SYMBOL_GPL(virtio_transport_stream_has_data);
 
-u32 virtio_transport_seqpacket_has_data(struct vsock_sock *vsk)
-{
-	struct virtio_vsock_sock *vvs = vsk->trans;
-	u32 msg_count;
-
-	spin_lock_bh(&vvs->rx_lock);
-	msg_count = vvs->msg_count;
-	spin_unlock_bh(&vvs->rx_lock);
-
-	return msg_count;
-}
-EXPORT_SYMBOL_GPL(virtio_transport_seqpacket_has_data);
-
 static s64 virtio_transport_has_space(struct vsock_sock *vsk)
 {
 	struct virtio_vsock_sock *vvs = vsk->trans;
diff --git a/net/vmw_vsock/vsock_loopback.c b/net/vmw_vsock/vsock_loopback.c
index 169a8cf65b39..809f807d0710 100644
--- a/net/vmw_vsock/vsock_loopback.c
+++ b/net/vmw_vsock/vsock_loopback.c
@@ -94,7 +94,6 @@ static struct virtio_transport loopback_transport = {
 		.seqpacket_dequeue        = virtio_transport_seqpacket_dequeue,
 		.seqpacket_enqueue        = virtio_transport_seqpacket_enqueue,
 		.seqpacket_allow          = vsock_loopback_seqpacket_allow,
-		.seqpacket_has_data       = virtio_transport_seqpacket_has_data,
 
 		.notify_poll_in           = virtio_transport_notify_poll_in,
 		.notify_poll_out          = virtio_transport_notify_poll_out,
-- 
2.25.1


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [RFC PATCH v2 3/6] virtio/vsock: remove 'msg_count' based logic
  2021-07-04  8:08 [RFC PATCH v2 0/6] Improve SOCK_SEQPACKET receive logic Arseny Krasnov
  2021-07-04  8:09 ` [RFC PATCH v2 1/6] af_vsock/virtio/vsock: change seqpacket " Arseny Krasnov
  2021-07-04  8:10 ` [RFC PATCH v2 2/6] af_vsock/virtio/vsock: remove 'seqpacket_has_data' callback Arseny Krasnov
@ 2021-07-04  8:10 ` Arseny Krasnov
  2021-07-04  8:10 ` [RFC PATCH v2 4/6] af_vsock/virtio/vsock: add 'seqpacket_drop()' callback Arseny Krasnov
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 13+ messages in thread
From: Arseny Krasnov @ 2021-07-04  8:10 UTC (permalink / raw)
  To: Stefan Hajnoczi, Stefano Garzarella, Michael S. Tsirkin,
	Jason Wang, David S. Miller, Jakub Kicinski, Arseny Krasnov,
	Jorgen Hansen, Andra Paraschiv, Norbert Slusarek, Colin Ian King
  Cc: kvm, virtualization, netdev, linux-kernel, oxffffaa

'msg_count' is obsolete, because 'rx_bytes' used instead.

Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
---
 include/linux/virtio_vsock.h            | 1 -
 net/vmw_vsock/virtio_transport_common.c | 4 ----
 2 files changed, 5 deletions(-)

diff --git a/include/linux/virtio_vsock.h b/include/linux/virtio_vsock.h
index 8d34f3d73bbb..7360ab7ea0af 100644
--- a/include/linux/virtio_vsock.h
+++ b/include/linux/virtio_vsock.h
@@ -36,7 +36,6 @@ struct virtio_vsock_sock {
 	u32 rx_bytes;
 	u32 buf_alloc;
 	struct list_head rx_queue;
-	u32 msg_count;
 };
 
 struct virtio_vsock_pkt {
diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
index 37d4ed526750..ce67cf449ef8 100644
--- a/net/vmw_vsock/virtio_transport_common.c
+++ b/net/vmw_vsock/virtio_transport_common.c
@@ -449,7 +449,6 @@ static int virtio_transport_seqpacket_do_dequeue(struct vsock_sock *vsk,
 
 		if (le32_to_cpu(pkt->hdr.flags) & VIRTIO_VSOCK_SEQ_EOR) {
 			*msg_ready = true;
-			vvs->msg_count--;
 		}
 
 		virtio_transport_dec_rx_pkt(vvs, pkt);
@@ -1006,9 +1005,6 @@ virtio_transport_recv_enqueue(struct vsock_sock *vsk,
 		goto out;
 	}
 
-	if (le32_to_cpu(pkt->hdr.flags) & VIRTIO_VSOCK_SEQ_EOR)
-		vvs->msg_count++;
-
 	/* Try to copy small packets into the buffer of last packet queued,
 	 * to avoid wasting memory queueing the entire buffer with a small
 	 * payload.
-- 
2.25.1


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [RFC PATCH v2 4/6] af_vsock/virtio/vsock: add 'seqpacket_drop()' callback
  2021-07-04  8:08 [RFC PATCH v2 0/6] Improve SOCK_SEQPACKET receive logic Arseny Krasnov
                   ` (2 preceding siblings ...)
  2021-07-04  8:10 ` [RFC PATCH v2 3/6] virtio/vsock: remove 'msg_count' based logic Arseny Krasnov
@ 2021-07-04  8:10 ` Arseny Krasnov
  2021-07-04  8:10 ` [RFC PATCH v2 5/6] virtio/vsock: remove record size limit for SEQPACKET Arseny Krasnov
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 13+ messages in thread
From: Arseny Krasnov @ 2021-07-04  8:10 UTC (permalink / raw)
  To: Stefan Hajnoczi, Stefano Garzarella, Michael S. Tsirkin,
	Jason Wang, David S. Miller, Jakub Kicinski, Arseny Krasnov,
	Norbert Slusarek, Colin Ian King, Andra Paraschiv
  Cc: kvm, virtualization, netdev, linux-kernel, oxffffaa

Add special callback for SEQPACKET socket which is called when
we need to drop current in-progress record: part of record was
copied successfully, reader wait rest of record, but signal
interrupts it and reader leaves it's loop, leaving packets of
current record still in queue. So to avoid copy of "orphaned"
record, we tell transport to drop every packet until EOR will
be found.

Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
---
 drivers/vhost/vsock.c                   |  1 +
 include/linux/virtio_vsock.h            |  2 ++
 include/net/af_vsock.h                  |  1 +
 net/vmw_vsock/af_vsock.c                |  1 +
 net/vmw_vsock/virtio_transport.c        |  1 +
 net/vmw_vsock/virtio_transport_common.c | 23 +++++++++++++++++++----
 net/vmw_vsock/vsock_loopback.c          |  1 +
 7 files changed, 26 insertions(+), 4 deletions(-)

diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
index c9713d8db0f4..731b9fe07cd3 100644
--- a/drivers/vhost/vsock.c
+++ b/drivers/vhost/vsock.c
@@ -447,6 +447,7 @@ static struct virtio_transport vhost_transport = {
 		.stream_rcvhiwat          = virtio_transport_stream_rcvhiwat,
 		.stream_is_active         = virtio_transport_stream_is_active,
 		.stream_allow             = virtio_transport_stream_allow,
+		.seqpacket_drop           = virtio_transport_seqpacket_drop,
 
 		.seqpacket_dequeue        = virtio_transport_seqpacket_dequeue,
 		.seqpacket_enqueue        = virtio_transport_seqpacket_enqueue,
diff --git a/include/linux/virtio_vsock.h b/include/linux/virtio_vsock.h
index 7360ab7ea0af..18a50f64bf54 100644
--- a/include/linux/virtio_vsock.h
+++ b/include/linux/virtio_vsock.h
@@ -36,6 +36,7 @@ struct virtio_vsock_sock {
 	u32 rx_bytes;
 	u32 buf_alloc;
 	struct list_head rx_queue;
+	bool drop_until_eor;
 };
 
 struct virtio_vsock_pkt {
@@ -89,6 +90,7 @@ virtio_transport_seqpacket_dequeue(struct vsock_sock *vsk,
 				   struct msghdr *msg,
 				   int flags,
 				   bool *msg_ready);
+void virtio_transport_seqpacket_drop(struct vsock_sock *vsk);
 s64 virtio_transport_stream_has_data(struct vsock_sock *vsk);
 s64 virtio_transport_stream_has_space(struct vsock_sock *vsk);
 
diff --git a/include/net/af_vsock.h b/include/net/af_vsock.h
index 1747c0b564ef..356878aabbd4 100644
--- a/include/net/af_vsock.h
+++ b/include/net/af_vsock.h
@@ -141,6 +141,7 @@ struct vsock_transport {
 	int (*seqpacket_enqueue)(struct vsock_sock *vsk, struct msghdr *msg,
 				 size_t len);
 	bool (*seqpacket_allow)(u32 remote_cid);
+	void (*seqpacket_drop)(struct vsock_sock *vsk);
 
 	/* Notification. */
 	int (*notify_poll_in)(struct vsock_sock *, size_t, bool *);
diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index 87955f9ff065..380a90c758c4 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -2024,6 +2024,7 @@ static int __vsock_seqpacket_recvmsg(struct sock *sk, struct msghdr *msg,
 		intr_err = vsock_connectible_wait_data(sk, &wait, timeout, NULL, 0);
 		if (intr_err <= 0) {
 			err = intr_err;
+			transport->seqpacket_drop(vsk);
 			break;
 		}
 
diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
index 2a7c56fcb062..2f7d54071ee2 100644
--- a/net/vmw_vsock/virtio_transport.c
+++ b/net/vmw_vsock/virtio_transport.c
@@ -475,6 +475,7 @@ static struct virtio_transport virtio_transport = {
 		.seqpacket_dequeue        = virtio_transport_seqpacket_dequeue,
 		.seqpacket_enqueue        = virtio_transport_seqpacket_enqueue,
 		.seqpacket_allow          = virtio_transport_seqpacket_allow,
+		.seqpacket_drop           = virtio_transport_seqpacket_drop,
 
 		.notify_poll_in           = virtio_transport_notify_poll_in,
 		.notify_poll_out          = virtio_transport_notify_poll_out,
diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
index ce67cf449ef8..52765754edcd 100644
--- a/net/vmw_vsock/virtio_transport_common.c
+++ b/net/vmw_vsock/virtio_transport_common.c
@@ -425,7 +425,7 @@ static int virtio_transport_seqpacket_do_dequeue(struct vsock_sock *vsk,
 		pkt = list_first_entry(&vvs->rx_queue, struct virtio_vsock_pkt, list);
 		pkt_len = (size_t)le32_to_cpu(pkt->hdr.len);
 
-		bytes_to_copy = min(user_buf_len, pkt_len);
+		bytes_to_copy = vvs->drop_until_eor ? 0 : min(user_buf_len, pkt_len);
 
 		if (bytes_to_copy) {
 			int err;
@@ -438,17 +438,22 @@ static int virtio_transport_seqpacket_do_dequeue(struct vsock_sock *vsk,
 
 			spin_lock_bh(&vvs->rx_lock);
 
-			if (err)
+			if (err) {
 				dequeued_len = err;
-			else
+				vvs->drop_until_eor = true;
+			} else {
 				user_buf_len -= bytes_to_copy;
+			}
 		}
 
 		if (dequeued_len >= 0)
 			dequeued_len += pkt_len;
 
 		if (le32_to_cpu(pkt->hdr.flags) & VIRTIO_VSOCK_SEQ_EOR) {
-			*msg_ready = true;
+			if (vvs->drop_until_eor)
+				vvs->drop_until_eor = false;
+			else
+				*msg_ready = true;
 		}
 
 		virtio_transport_dec_rx_pkt(vvs, pkt);
@@ -487,6 +492,16 @@ virtio_transport_seqpacket_dequeue(struct vsock_sock *vsk,
 }
 EXPORT_SYMBOL_GPL(virtio_transport_seqpacket_dequeue);
 
+void virtio_transport_seqpacket_drop(struct vsock_sock *vsk)
+{
+	struct virtio_vsock_sock *vvs = vsk->trans;
+
+	spin_lock_bh(&vvs->rx_lock);
+	vvs->drop_until_eor = true;
+	spin_unlock_bh(&vvs->rx_lock);
+}
+EXPORT_SYMBOL_GPL(virtio_transport_seqpacket_drop);
+
 int
 virtio_transport_seqpacket_enqueue(struct vsock_sock *vsk,
 				   struct msghdr *msg,
diff --git a/net/vmw_vsock/vsock_loopback.c b/net/vmw_vsock/vsock_loopback.c
index 809f807d0710..d9030a46e4b9 100644
--- a/net/vmw_vsock/vsock_loopback.c
+++ b/net/vmw_vsock/vsock_loopback.c
@@ -94,6 +94,7 @@ static struct virtio_transport loopback_transport = {
 		.seqpacket_dequeue        = virtio_transport_seqpacket_dequeue,
 		.seqpacket_enqueue        = virtio_transport_seqpacket_enqueue,
 		.seqpacket_allow          = vsock_loopback_seqpacket_allow,
+		.seqpacket_drop           = virtio_transport_seqpacket_drop,
 
 		.notify_poll_in           = virtio_transport_notify_poll_in,
 		.notify_poll_out          = virtio_transport_notify_poll_out,
-- 
2.25.1


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [RFC PATCH v2 5/6] virtio/vsock: remove record size limit for SEQPACKET
  2021-07-04  8:08 [RFC PATCH v2 0/6] Improve SOCK_SEQPACKET receive logic Arseny Krasnov
                   ` (3 preceding siblings ...)
  2021-07-04  8:10 ` [RFC PATCH v2 4/6] af_vsock/virtio/vsock: add 'seqpacket_drop()' callback Arseny Krasnov
@ 2021-07-04  8:10 ` Arseny Krasnov
  2021-07-04  8:11 ` [RFC PATCH v2 6/6] vsock_test: SEQPACKET read to broken buffer Arseny Krasnov
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 13+ messages in thread
From: Arseny Krasnov @ 2021-07-04  8:10 UTC (permalink / raw)
  To: Stefan Hajnoczi, Stefano Garzarella, Michael S. Tsirkin,
	Jason Wang, David S. Miller, Jakub Kicinski, Arseny Krasnov,
	Jorgen Hansen, Colin Ian King, Norbert Slusarek, Andra Paraschiv
  Cc: kvm, virtualization, netdev, linux-kernel, oxffffaa

Remove record size limit which was 'peer_buf_alloc' value.
New approach doesn't need this, because data is copied to
user's buffer in stream manner(we don't wait until whole
record is received).

Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
---
 net/vmw_vsock/virtio_transport_common.c | 11 -----------
 1 file changed, 11 deletions(-)

diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
index 52765754edcd..5cfdf701a8af 100644
--- a/net/vmw_vsock/virtio_transport_common.c
+++ b/net/vmw_vsock/virtio_transport_common.c
@@ -507,17 +507,6 @@ virtio_transport_seqpacket_enqueue(struct vsock_sock *vsk,
 				   struct msghdr *msg,
 				   size_t len)
 {
-	struct virtio_vsock_sock *vvs = vsk->trans;
-
-	spin_lock_bh(&vvs->tx_lock);
-
-	if (len > vvs->peer_buf_alloc) {
-		spin_unlock_bh(&vvs->tx_lock);
-		return -EMSGSIZE;
-	}
-
-	spin_unlock_bh(&vvs->tx_lock);
-
 	return virtio_transport_stream_enqueue(vsk, msg, len);
 }
 EXPORT_SYMBOL_GPL(virtio_transport_seqpacket_enqueue);
-- 
2.25.1


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [RFC PATCH v2 6/6] vsock_test: SEQPACKET read to broken buffer
  2021-07-04  8:08 [RFC PATCH v2 0/6] Improve SOCK_SEQPACKET receive logic Arseny Krasnov
                   ` (4 preceding siblings ...)
  2021-07-04  8:10 ` [RFC PATCH v2 5/6] virtio/vsock: remove record size limit for SEQPACKET Arseny Krasnov
@ 2021-07-04  8:11 ` Arseny Krasnov
  2021-07-04  8:13 ` [RFC PATCH v2 0/6] Improve SOCK_SEQPACKET receive logic Arseny Krasnov
  2021-07-04  8:30 ` Michael S. Tsirkin
  7 siblings, 0 replies; 13+ messages in thread
From: Arseny Krasnov @ 2021-07-04  8:11 UTC (permalink / raw)
  To: Stefan Hajnoczi, Stefano Garzarella, Michael S. Tsirkin,
	Jason Wang, David S. Miller, Jakub Kicinski, Arseny Krasnov,
	Colin Ian King, Norbert Slusarek, Andra Paraschiv
  Cc: kvm, virtualization, netdev, linux-kernel, oxffffaa

Add test where sender sends two message, each with own
data pattern. Reader tries to read first to broken buffer:
it has three pages size, but middle page is unmapped. Then,
reader tries to read second message to valid buffer. Test
checks, that uncopied part of first message was dropped
and thus not copied as part of second message.

Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
---
 tools/testing/vsock/vsock_test.c | 120 +++++++++++++++++++++++++++++++
 1 file changed, 120 insertions(+)

diff --git a/tools/testing/vsock/vsock_test.c b/tools/testing/vsock/vsock_test.c
index 67766bfe176f..cdaa154fc3a9 100644
--- a/tools/testing/vsock/vsock_test.c
+++ b/tools/testing/vsock/vsock_test.c
@@ -16,6 +16,7 @@
 #include <linux/kernel.h>
 #include <sys/types.h>
 #include <sys/socket.h>
+#include <sys/mman.h>
 
 #include "timeout.h"
 #include "control.h"
@@ -385,6 +386,120 @@ static void test_seqpacket_msg_trunc_server(const struct test_opts *opts)
 	close(fd);
 }
 
+#define BUF_PATTERN_1 'a'
+#define BUF_PATTERN_2 'b'
+
+static void test_seqpacket_invalid_rec_buffer_client(const struct test_opts *opts)
+{
+	int fd;
+	unsigned char *buf1;
+	unsigned char *buf2;
+	int buf_size = getpagesize() * 3;
+
+	fd = vsock_seqpacket_connect(opts->peer_cid, 1234);
+	if (fd < 0) {
+		perror("connect");
+		exit(EXIT_FAILURE);
+	}
+
+	buf1 = malloc(buf_size);
+	if (!buf1) {
+		perror("'malloc()' for 'buf1'");
+		exit(EXIT_FAILURE);
+	}
+
+	buf2 = malloc(buf_size);
+	if (!buf2) {
+		perror("'malloc()' for 'buf2'");
+		exit(EXIT_FAILURE);
+	}
+
+	memset(buf1, BUF_PATTERN_1, buf_size);
+	memset(buf2, BUF_PATTERN_2, buf_size);
+
+	if (send(fd, buf1, buf_size, 0) != buf_size) {
+		perror("send failed");
+		exit(EXIT_FAILURE);
+	}
+
+	if (send(fd, buf2, buf_size, 0) != buf_size) {
+		perror("send failed");
+		exit(EXIT_FAILURE);
+	}
+
+	close(fd);
+}
+
+static void test_seqpacket_invalid_rec_buffer_server(const struct test_opts *opts)
+{
+	int fd;
+	unsigned char *broken_buf;
+	unsigned char *valid_buf;
+	int page_size = getpagesize();
+	int buf_size = page_size * 3;
+	ssize_t res;
+	int prot = PROT_READ | PROT_WRITE;
+	int flags = MAP_PRIVATE | MAP_ANONYMOUS;
+	int i;
+
+	fd = vsock_seqpacket_accept(VMADDR_CID_ANY, 1234, NULL);
+	if (fd < 0) {
+		perror("accept");
+		exit(EXIT_FAILURE);
+	}
+
+	/* Setup first buffer. */
+	broken_buf = mmap(NULL, buf_size, prot, flags, -1, 0);
+	if (broken_buf == MAP_FAILED) {
+		perror("mmap for 'broken_buf'");
+		exit(EXIT_FAILURE);
+	}
+
+	/* Unmap "hole" in buffer. */
+	if (munmap(broken_buf + page_size, page_size)) {
+		perror("'broken_buf' setup");
+		exit(EXIT_FAILURE);
+	}
+
+	valid_buf = mmap(NULL, buf_size, prot, flags, -1, 0);
+	if (valid_buf == MAP_FAILED) {
+		perror("mmap for 'valid_buf'");
+		exit(EXIT_FAILURE);
+	}
+
+	/* Try to fill buffer with unmapped middle. */
+	res = read(fd, broken_buf, buf_size);
+	if (res != -1) {
+		perror("invalid read result of 'broken_buf'");
+		exit(EXIT_FAILURE);
+	}
+
+	if (errno != ENOMEM) {
+		perror("invalid errno of 'broken_buf'");
+		exit(EXIT_FAILURE);
+	}
+
+	/* Try to fill valid buffer. */
+	res = read(fd, valid_buf, buf_size);
+	if (res != buf_size) {
+		perror("invalid read result of 'valid_buf'");
+		exit(EXIT_FAILURE);
+	}
+
+	for (i = 0; i < buf_size; i++) {
+		if (valid_buf[i] != BUF_PATTERN_2) {
+			perror("invalid pattern for valid buf");
+			exit(EXIT_FAILURE);
+		}
+	}
+
+	/* Unmap buffers. */
+	munmap(broken_buf, page_size);
+	munmap(broken_buf + page_size * 2, page_size);
+	munmap(valid_buf, buf_size);
+	close(fd);
+}
+
 static struct test_case test_cases[] = {
 	{
 		.name = "SOCK_STREAM connection reset",
@@ -425,6 +540,11 @@ static struct test_case test_cases[] = {
 		.run_client = test_seqpacket_msg_trunc_client,
 		.run_server = test_seqpacket_msg_trunc_server,
 	},
+	{
+		.name = "SOCK_SEQPACKET invalid receive buffer",
+		.run_client = test_seqpacket_invalid_rec_buffer_client,
+		.run_server = test_seqpacket_invalid_rec_buffer_server,
+	},
 	{},
 };
 
-- 
2.25.1


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC PATCH v2 0/6] Improve SOCK_SEQPACKET receive logic
  2021-07-04  8:08 [RFC PATCH v2 0/6] Improve SOCK_SEQPACKET receive logic Arseny Krasnov
                   ` (5 preceding siblings ...)
  2021-07-04  8:11 ` [RFC PATCH v2 6/6] vsock_test: SEQPACKET read to broken buffer Arseny Krasnov
@ 2021-07-04  8:13 ` Arseny Krasnov
  2021-07-04  8:30 ` Michael S. Tsirkin
  7 siblings, 0 replies; 13+ messages in thread
From: Arseny Krasnov @ 2021-07-04  8:13 UTC (permalink / raw)
  To: Stefan Hajnoczi, Stefano Garzarella, Michael S. Tsirkin,
	Jason Wang, David S. Miller, Jakub Kicinski, Andra Paraschiv,
	Norbert Slusarek, Colin Ian King
  Cc: kvm, virtualization, netdev, linux-kernel, oxffffaa


On 04.07.2021 11:08, Arseny Krasnov wrote:
> 	This patchset modifies receive logic for SOCK_SEQPACKET.
> Difference between current implementation and this version is that
> now reader is woken up when there is at least one RW packet in rx
> queue of socket and data is copied to user's buffer, while merged
> approach wake up user only when whole message is received and kept
> in queue. New implementation has several advantages:
>  1) There is no limit for message length. Merged approach requires
>     that length must be smaller than 'peer_buf_alloc', otherwise
>     transmission will stuck.
>  2) There is no need to keep whole message in queue, thus no
>     'kmalloc()' memory will be wasted until EOR is received.
>
>     Also new approach has some feature: as fragments of message
> are copied until EOR is received, it is possible that part of
> message will be already in user's buffer, while rest of message
> still not received. And if user will be interrupted by signal or
> timeout with part of message in buffer, it will exit receive loop,
> leaving rest of message in queue. To solve this problem special
> callback was added to transport: it is called when user was forced
> to leave exit loop and tells transport to drop any packet until
> EOR met. When EOR is found, this mode is disabled and normal packet
> processing started. Note, that when 'drop until EOR' mode is on,
> incoming packets still inserted in queue, reader will be woken up,
> tries to copy data, but nothing will be copied until EOR found.
> It was possible to drain such unneeded packets it rx work without
> kicking user, but implemented way is simplest. Anyway, i think
> such cases are rare.
>
>     New test also added - it tries to copy to invalid user's
> buffer.
>
> Arseny Krasnov (16):
>  af_vsock/virtio/vsock: change seqpacket receive logic
>  af_vsock/virtio/vsock: remove 'seqpacket_has_data' callback
>  virtio/vsock: remove 'msg_count' based logic
>  af_vsock/virtio/vsock: add 'seqpacket_drop()' callback
>  virtio/vsock: remove record size limit for SEQPACKET
>  vsock_test: SEQPACKET read to broken buffer
>
>  drivers/vhost/vsock.c                   |   2 +-
>  include/linux/virtio_vsock.h            |   7 +-
>  include/net/af_vsock.h                  |   4 +-
>  net/vmw_vsock/af_vsock.c                |  44 ++++----
>  net/vmw_vsock/virtio_transport.c        |   2 +-
>  net/vmw_vsock/virtio_transport_common.c | 103 ++++++++-----------
>  net/vmw_vsock/vsock_loopback.c          |   2 +-
>  tools/testing/vsock/vsock_test.c        | 120 ++++++++++++++++++++++
>  8 files changed, 193 insertions(+), 91 deletions(-)
>
>  v1 -> v2:
>  Patches reordered and reorganized.
>
> Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
> ---
>  cv.txt | 0
>  1 file changed, 0 insertions(+), 0 deletions(-)
>  create mode 100644 cv.txt
>
> diff --git a/cv.txt b/cv.txt
> new file mode 100644
> index 000000000000..e69de29bb2d1
Sorry, forget to remove my tmp file

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC PATCH v2 0/6] Improve SOCK_SEQPACKET receive logic
  2021-07-04  8:08 [RFC PATCH v2 0/6] Improve SOCK_SEQPACKET receive logic Arseny Krasnov
                   ` (6 preceding siblings ...)
  2021-07-04  8:13 ` [RFC PATCH v2 0/6] Improve SOCK_SEQPACKET receive logic Arseny Krasnov
@ 2021-07-04  8:30 ` Michael S. Tsirkin
  2021-07-04  9:23   ` Arseny Krasnov
  7 siblings, 1 reply; 13+ messages in thread
From: Michael S. Tsirkin @ 2021-07-04  8:30 UTC (permalink / raw)
  To: Arseny Krasnov
  Cc: Stefan Hajnoczi, Stefano Garzarella, Jason Wang, David S. Miller,
	Jakub Kicinski, Andra Paraschiv, Norbert Slusarek,
	Colin Ian King, kvm, virtualization, netdev, linux-kernel,
	oxffffaa

On Sun, Jul 04, 2021 at 11:08:13AM +0300, Arseny Krasnov wrote:
> 	This patchset modifies receive logic for SOCK_SEQPACKET.
> Difference between current implementation and this version is that
> now reader is woken up when there is at least one RW packet in rx
> queue of socket and data is copied to user's buffer, while merged
> approach wake up user only when whole message is received and kept
> in queue. New implementation has several advantages:
>  1) There is no limit for message length. Merged approach requires
>     that length must be smaller than 'peer_buf_alloc', otherwise
>     transmission will stuck.
>  2) There is no need to keep whole message in queue, thus no
>     'kmalloc()' memory will be wasted until EOR is received.
> 
>     Also new approach has some feature: as fragments of message
> are copied until EOR is received, it is possible that part of
> message will be already in user's buffer, while rest of message
> still not received. And if user will be interrupted by signal or
> timeout with part of message in buffer, it will exit receive loop,
> leaving rest of message in queue. To solve this problem special
> callback was added to transport: it is called when user was forced
> to leave exit loop and tells transport to drop any packet until
> EOR met.

Sorry about commenting late in the game.  I'm a bit lost


SOCK_SEQPACKET
Provides sequenced, reliable, bidirectional, connection-mode transmission paths for records. A record can be sent using one or more output operations and received using one or more input operations, but a single operation never transfers part of more than one record. Record boundaries are visible to the receiver via the MSG_EOR flag.

it's supposed to be reliable - how is it legal to drop packets?


> When EOR is found, this mode is disabled and normal packet
> processing started. Note, that when 'drop until EOR' mode is on,
> incoming packets still inserted in queue, reader will be woken up,
> tries to copy data, but nothing will be copied until EOR found.
> It was possible to drain such unneeded packets it rx work without
> kicking user, but implemented way is simplest. Anyway, i think
> such cases are rare.


>     New test also added - it tries to copy to invalid user's
> buffer.
> 
> Arseny Krasnov (16):
>  af_vsock/virtio/vsock: change seqpacket receive logic
>  af_vsock/virtio/vsock: remove 'seqpacket_has_data' callback
>  virtio/vsock: remove 'msg_count' based logic
>  af_vsock/virtio/vsock: add 'seqpacket_drop()' callback
>  virtio/vsock: remove record size limit for SEQPACKET
>  vsock_test: SEQPACKET read to broken buffer
> 
>  drivers/vhost/vsock.c                   |   2 +-
>  include/linux/virtio_vsock.h            |   7 +-
>  include/net/af_vsock.h                  |   4 +-
>  net/vmw_vsock/af_vsock.c                |  44 ++++----
>  net/vmw_vsock/virtio_transport.c        |   2 +-
>  net/vmw_vsock/virtio_transport_common.c | 103 ++++++++-----------
>  net/vmw_vsock/vsock_loopback.c          |   2 +-
>  tools/testing/vsock/vsock_test.c        | 120 ++++++++++++++++++++++
>  8 files changed, 193 insertions(+), 91 deletions(-)
> 
>  v1 -> v2:
>  Patches reordered and reorganized.
> 
> Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
> ---
>  cv.txt | 0
>  1 file changed, 0 insertions(+), 0 deletions(-)
>  create mode 100644 cv.txt
> 
> diff --git a/cv.txt b/cv.txt
> new file mode 100644
> index 000000000000..e69de29bb2d1
> -- 
> 2.25.1


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC PATCH v2 0/6] Improve SOCK_SEQPACKET receive logic
  2021-07-04  8:30 ` Michael S. Tsirkin
@ 2021-07-04  9:23   ` Arseny Krasnov
  2021-07-04  9:54     ` Michael S. Tsirkin
  0 siblings, 1 reply; 13+ messages in thread
From: Arseny Krasnov @ 2021-07-04  9:23 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Stefan Hajnoczi, Stefano Garzarella, Jason Wang, David S. Miller,
	Jakub Kicinski, Andra Paraschiv, Norbert Slusarek,
	Colin Ian King, kvm, virtualization, netdev, linux-kernel,
	oxffffaa


On 04.07.2021 11:30, Michael S. Tsirkin wrote:
> On Sun, Jul 04, 2021 at 11:08:13AM +0300, Arseny Krasnov wrote:
>> 	This patchset modifies receive logic for SOCK_SEQPACKET.
>> Difference between current implementation and this version is that
>> now reader is woken up when there is at least one RW packet in rx
>> queue of socket and data is copied to user's buffer, while merged
>> approach wake up user only when whole message is received and kept
>> in queue. New implementation has several advantages:
>>  1) There is no limit for message length. Merged approach requires
>>     that length must be smaller than 'peer_buf_alloc', otherwise
>>     transmission will stuck.
>>  2) There is no need to keep whole message in queue, thus no
>>     'kmalloc()' memory will be wasted until EOR is received.
>>
>>     Also new approach has some feature: as fragments of message
>> are copied until EOR is received, it is possible that part of
>> message will be already in user's buffer, while rest of message
>> still not received. And if user will be interrupted by signal or
>> timeout with part of message in buffer, it will exit receive loop,
>> leaving rest of message in queue. To solve this problem special
>> callback was added to transport: it is called when user was forced
>> to leave exit loop and tells transport to drop any packet until
>> EOR met.
> Sorry about commenting late in the game.  I'm a bit lost
>
>
> SOCK_SEQPACKET
> Provides sequenced, reliable, bidirectional, connection-mode transmission paths for records. A record can be sent using one or more output operations and received using one or more input operations, but a single operation never transfers part of more than one record. Record boundaries are visible to the receiver via the MSG_EOR flag.
>
> it's supposed to be reliable - how is it legal to drop packets?

Sorry, seems i need to rephrase description. "Packet" here means fragment of record(message) at transport

layer. As this is SEQPACKET mode, receiver could get only whole message or error, so if only several fragments

of message was copied (if signal received for example) we can't return it to user - it breaks SEQPACKET sense. I think,

in this case we can drop rest of record's fragments legally.


Thank You

>
>
>> When EOR is found, this mode is disabled and normal packet
>> processing started. Note, that when 'drop until EOR' mode is on,
>> incoming packets still inserted in queue, reader will be woken up,
>> tries to copy data, but nothing will be copied until EOR found.
>> It was possible to drain such unneeded packets it rx work without
>> kicking user, but implemented way is simplest. Anyway, i think
>> such cases are rare.
>
>>     New test also added - it tries to copy to invalid user's
>> buffer.
>>
>> Arseny Krasnov (16):
>>  af_vsock/virtio/vsock: change seqpacket receive logic
>>  af_vsock/virtio/vsock: remove 'seqpacket_has_data' callback
>>  virtio/vsock: remove 'msg_count' based logic
>>  af_vsock/virtio/vsock: add 'seqpacket_drop()' callback
>>  virtio/vsock: remove record size limit for SEQPACKET
>>  vsock_test: SEQPACKET read to broken buffer
>>
>>  drivers/vhost/vsock.c                   |   2 +-
>>  include/linux/virtio_vsock.h            |   7 +-
>>  include/net/af_vsock.h                  |   4 +-
>>  net/vmw_vsock/af_vsock.c                |  44 ++++----
>>  net/vmw_vsock/virtio_transport.c        |   2 +-
>>  net/vmw_vsock/virtio_transport_common.c | 103 ++++++++-----------
>>  net/vmw_vsock/vsock_loopback.c          |   2 +-
>>  tools/testing/vsock/vsock_test.c        | 120 ++++++++++++++++++++++
>>  8 files changed, 193 insertions(+), 91 deletions(-)
>>
>>  v1 -> v2:
>>  Patches reordered and reorganized.
>>
>> Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
>> ---
>>  cv.txt | 0
>>  1 file changed, 0 insertions(+), 0 deletions(-)
>>  create mode 100644 cv.txt
>>
>> diff --git a/cv.txt b/cv.txt
>> new file mode 100644
>> index 000000000000..e69de29bb2d1
>> -- 
>> 2.25.1
>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC PATCH v2 0/6] Improve SOCK_SEQPACKET receive logic
  2021-07-04  9:23   ` Arseny Krasnov
@ 2021-07-04  9:54     ` Michael S. Tsirkin
  2021-07-05 10:48       ` [MASSMAIL KLMS]Re: " Arseny Krasnov
  0 siblings, 1 reply; 13+ messages in thread
From: Michael S. Tsirkin @ 2021-07-04  9:54 UTC (permalink / raw)
  To: Arseny Krasnov
  Cc: Stefan Hajnoczi, Stefano Garzarella, Jason Wang, David S. Miller,
	Jakub Kicinski, Andra Paraschiv, Norbert Slusarek,
	Colin Ian King, kvm, virtualization, netdev, linux-kernel,
	oxffffaa

On Sun, Jul 04, 2021 at 12:23:03PM +0300, Arseny Krasnov wrote:
> 
> On 04.07.2021 11:30, Michael S. Tsirkin wrote:
> > On Sun, Jul 04, 2021 at 11:08:13AM +0300, Arseny Krasnov wrote:
> >> 	This patchset modifies receive logic for SOCK_SEQPACKET.
> >> Difference between current implementation and this version is that
> >> now reader is woken up when there is at least one RW packet in rx
> >> queue of socket and data is copied to user's buffer, while merged
> >> approach wake up user only when whole message is received and kept
> >> in queue. New implementation has several advantages:
> >>  1) There is no limit for message length. Merged approach requires
> >>     that length must be smaller than 'peer_buf_alloc', otherwise
> >>     transmission will stuck.
> >>  2) There is no need to keep whole message in queue, thus no
> >>     'kmalloc()' memory will be wasted until EOR is received.
> >>
> >>     Also new approach has some feature: as fragments of message
> >> are copied until EOR is received, it is possible that part of
> >> message will be already in user's buffer, while rest of message
> >> still not received. And if user will be interrupted by signal or
> >> timeout with part of message in buffer, it will exit receive loop,
> >> leaving rest of message in queue. To solve this problem special
> >> callback was added to transport: it is called when user was forced
> >> to leave exit loop and tells transport to drop any packet until
> >> EOR met.
> > Sorry about commenting late in the game.  I'm a bit lost
> >
> >
> > SOCK_SEQPACKET
> > Provides sequenced, reliable, bidirectional, connection-mode transmission paths for records. A record can be sent using one or more output operations and received using one or more input operations, but a single operation never transfers part of more than one record. Record boundaries are visible to the receiver via the MSG_EOR flag.
> >
> > it's supposed to be reliable - how is it legal to drop packets?
> 
> Sorry, seems i need to rephrase description. "Packet" here means fragment of record(message) at transport
> 
> layer. As this is SEQPACKET mode, receiver could get only whole message or error, so if only several fragments
> 
> of message was copied (if signal received for example) we can't return it to user - it breaks SEQPACKET sense. I think,
> 
> in this case we can drop rest of record's fragments legally.
> 
> 
> Thank You

Would not that violate the reliable property? IIUC it's only ok to
return an error if socket gets closed. Just like e.g. TCP ...



> >
> >
> >> When EOR is found, this mode is disabled and normal packet
> >> processing started. Note, that when 'drop until EOR' mode is on,
> >> incoming packets still inserted in queue, reader will be woken up,
> >> tries to copy data, but nothing will be copied until EOR found.
> >> It was possible to drain such unneeded packets it rx work without
> >> kicking user, but implemented way is simplest. Anyway, i think
> >> such cases are rare.
> >
> >>     New test also added - it tries to copy to invalid user's
> >> buffer.
> >>
> >> Arseny Krasnov (16):
> >>  af_vsock/virtio/vsock: change seqpacket receive logic
> >>  af_vsock/virtio/vsock: remove 'seqpacket_has_data' callback
> >>  virtio/vsock: remove 'msg_count' based logic
> >>  af_vsock/virtio/vsock: add 'seqpacket_drop()' callback
> >>  virtio/vsock: remove record size limit for SEQPACKET
> >>  vsock_test: SEQPACKET read to broken buffer
> >>
> >>  drivers/vhost/vsock.c                   |   2 +-
> >>  include/linux/virtio_vsock.h            |   7 +-
> >>  include/net/af_vsock.h                  |   4 +-
> >>  net/vmw_vsock/af_vsock.c                |  44 ++++----
> >>  net/vmw_vsock/virtio_transport.c        |   2 +-
> >>  net/vmw_vsock/virtio_transport_common.c | 103 ++++++++-----------
> >>  net/vmw_vsock/vsock_loopback.c          |   2 +-
> >>  tools/testing/vsock/vsock_test.c        | 120 ++++++++++++++++++++++
> >>  8 files changed, 193 insertions(+), 91 deletions(-)
> >>
> >>  v1 -> v2:
> >>  Patches reordered and reorganized.
> >>
> >> Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
> >> ---
> >>  cv.txt | 0
> >>  1 file changed, 0 insertions(+), 0 deletions(-)
> >>  create mode 100644 cv.txt
> >>
> >> diff --git a/cv.txt b/cv.txt
> >> new file mode 100644
> >> index 000000000000..e69de29bb2d1
> >> -- 
> >> 2.25.1
> >


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [MASSMAIL KLMS]Re: [RFC PATCH v2 0/6] Improve SOCK_SEQPACKET receive logic
  2021-07-04  9:54     ` Michael S. Tsirkin
@ 2021-07-05 10:48       ` Arseny Krasnov
  2021-07-05 15:23         ` Stefano Garzarella
  0 siblings, 1 reply; 13+ messages in thread
From: Arseny Krasnov @ 2021-07-05 10:48 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Stefan Hajnoczi, Stefano Garzarella, Jason Wang, David S. Miller,
	Jakub Kicinski, Andra Paraschiv, Norbert Slusarek,
	Colin Ian King, kvm, virtualization, netdev, linux-kernel,
	oxffffaa


On 04.07.2021 12:54, Michael S. Tsirkin wrote:
> On Sun, Jul 04, 2021 at 12:23:03PM +0300, Arseny Krasnov wrote:
>> On 04.07.2021 11:30, Michael S. Tsirkin wrote:
>>> On Sun, Jul 04, 2021 at 11:08:13AM +0300, Arseny Krasnov wrote:
>>>> 	This patchset modifies receive logic for SOCK_SEQPACKET.
>>>> Difference between current implementation and this version is that
>>>> now reader is woken up when there is at least one RW packet in rx
>>>> queue of socket and data is copied to user's buffer, while merged
>>>> approach wake up user only when whole message is received and kept
>>>> in queue. New implementation has several advantages:
>>>>  1) There is no limit for message length. Merged approach requires
>>>>     that length must be smaller than 'peer_buf_alloc', otherwise
>>>>     transmission will stuck.
>>>>  2) There is no need to keep whole message in queue, thus no
>>>>     'kmalloc()' memory will be wasted until EOR is received.
>>>>
>>>>     Also new approach has some feature: as fragments of message
>>>> are copied until EOR is received, it is possible that part of
>>>> message will be already in user's buffer, while rest of message
>>>> still not received. And if user will be interrupted by signal or
>>>> timeout with part of message in buffer, it will exit receive loop,
>>>> leaving rest of message in queue. To solve this problem special
>>>> callback was added to transport: it is called when user was forced
>>>> to leave exit loop and tells transport to drop any packet until
>>>> EOR met.
>>> Sorry about commenting late in the game.  I'm a bit lost
>>>
>>>
>>> SOCK_SEQPACKET
>>> Provides sequenced, reliable, bidirectional, connection-mode transmission paths for records. A record can be sent using one or more output operations and received using one or more input operations, but a single operation never transfers part of more than one record. Record boundaries are visible to the receiver via the MSG_EOR flag.
>>>
>>> it's supposed to be reliable - how is it legal to drop packets?
>> Sorry, seems i need to rephrase description. "Packet" here means fragment of record(message) at transport
>>
>> layer. As this is SEQPACKET mode, receiver could get only whole message or error, so if only several fragments
>>
>> of message was copied (if signal received for example) we can't return it to user - it breaks SEQPACKET sense. I think,
>>
>> in this case we can drop rest of record's fragments legally.
>>
>>
>> Thank You
> Would not that violate the reliable property? IIUC it's only ok to
> return an error if socket gets closed. Just like e.g. TCP ...
>
Sorry for late answer, yes You're right, seems this is unwanted drop...

Lets wait for Stefano Garzarella feedback


Thank You

>
>>>
>>>> When EOR is found, this mode is disabled and normal packet
>>>> processing started. Note, that when 'drop until EOR' mode is on,
>>>> incoming packets still inserted in queue, reader will be woken up,
>>>> tries to copy data, but nothing will be copied until EOR found.
>>>> It was possible to drain such unneeded packets it rx work without
>>>> kicking user, but implemented way is simplest. Anyway, i think
>>>> such cases are rare.
>>>>     New test also added - it tries to copy to invalid user's
>>>> buffer.
>>>>
>>>> Arseny Krasnov (16):
>>>>  af_vsock/virtio/vsock: change seqpacket receive logic
>>>>  af_vsock/virtio/vsock: remove 'seqpacket_has_data' callback
>>>>  virtio/vsock: remove 'msg_count' based logic
>>>>  af_vsock/virtio/vsock: add 'seqpacket_drop()' callback
>>>>  virtio/vsock: remove record size limit for SEQPACKET
>>>>  vsock_test: SEQPACKET read to broken buffer
>>>>
>>>>  drivers/vhost/vsock.c                   |   2 +-
>>>>  include/linux/virtio_vsock.h            |   7 +-
>>>>  include/net/af_vsock.h                  |   4 +-
>>>>  net/vmw_vsock/af_vsock.c                |  44 ++++----
>>>>  net/vmw_vsock/virtio_transport.c        |   2 +-
>>>>  net/vmw_vsock/virtio_transport_common.c | 103 ++++++++-----------
>>>>  net/vmw_vsock/vsock_loopback.c          |   2 +-
>>>>  tools/testing/vsock/vsock_test.c        | 120 ++++++++++++++++++++++
>>>>  8 files changed, 193 insertions(+), 91 deletions(-)
>>>>
>>>>  v1 -> v2:
>>>>  Patches reordered and reorganized.
>>>>
>>>> Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
>>>> ---
>>>>  cv.txt | 0
>>>>  1 file changed, 0 insertions(+), 0 deletions(-)
>>>>  create mode 100644 cv.txt
>>>>
>>>> diff --git a/cv.txt b/cv.txt
>>>> new file mode 100644
>>>> index 000000000000..e69de29bb2d1
>>>> -- 
>>>> 2.25.1
>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [MASSMAIL KLMS]Re: [RFC PATCH v2 0/6] Improve SOCK_SEQPACKET receive logic
  2021-07-05 10:48       ` [MASSMAIL KLMS]Re: " Arseny Krasnov
@ 2021-07-05 15:23         ` Stefano Garzarella
  0 siblings, 0 replies; 13+ messages in thread
From: Stefano Garzarella @ 2021-07-05 15:23 UTC (permalink / raw)
  To: Arseny Krasnov
  Cc: Michael S. Tsirkin, Stefan Hajnoczi, Jason Wang, David S. Miller,
	Jakub Kicinski, Andra Paraschiv, Norbert Slusarek,
	Colin Ian King, kvm, virtualization, netdev, linux-kernel,
	oxffffaa

On Mon, Jul 05, 2021 at 01:48:28PM +0300, Arseny Krasnov wrote:
>
>On 04.07.2021 12:54, Michael S. Tsirkin wrote:
>> On Sun, Jul 04, 2021 at 12:23:03PM +0300, Arseny Krasnov wrote:
>>> On 04.07.2021 11:30, Michael S. Tsirkin wrote:
>>>> On Sun, Jul 04, 2021 at 11:08:13AM +0300, Arseny Krasnov wrote:
>>>>> 	This patchset modifies receive logic for SOCK_SEQPACKET.
>>>>> Difference between current implementation and this version is that
>>>>> now reader is woken up when there is at least one RW packet in rx
>>>>> queue of socket and data is copied to user's buffer, while merged
>>>>> approach wake up user only when whole message is received and kept
>>>>> in queue. New implementation has several advantages:
>>>>>  1) There is no limit for message length. Merged approach requires
>>>>>     that length must be smaller than 'peer_buf_alloc', otherwise
>>>>>     transmission will stuck.
>>>>>  2) There is no need to keep whole message in queue, thus no
>>>>>     'kmalloc()' memory will be wasted until EOR is received.
>>>>>
>>>>>     Also new approach has some feature: as fragments of message
>>>>> are copied until EOR is received, it is possible that part of
>>>>> message will be already in user's buffer, while rest of message
>>>>> still not received. And if user will be interrupted by signal or
>>>>> timeout with part of message in buffer, it will exit receive loop,
>>>>> leaving rest of message in queue. To solve this problem special
>>>>> callback was added to transport: it is called when user was forced
>>>>> to leave exit loop and tells transport to drop any packet until
>>>>> EOR met.
>>>> Sorry about commenting late in the game.  I'm a bit lost
>>>>
>>>>
>>>> SOCK_SEQPACKET
>>>> Provides sequenced, reliable, bidirectional, connection-mode transmission paths for records. A record can be sent using one or more output operations and received using one or more input operations, but a single operation never transfers part of more than one record. Record boundaries are visible to the receiver via the MSG_EOR flag.
>>>>
>>>> it's supposed to be reliable - how is it legal to drop packets?
>>> Sorry, seems i need to rephrase description. "Packet" here means fragment of record(message) at transport
>>>
>>> layer. As this is SEQPACKET mode, receiver could get only whole message or error, so if only several fragments
>>>
>>> of message was copied (if signal received for example) we can't return it to user - it breaks SEQPACKET sense. I think,
>>>
>>> in this case we can drop rest of record's fragments legally.
>>>
>>>
>>> Thank You
>> Would not that violate the reliable property? IIUC it's only ok to
>> return an error if socket gets closed. Just like e.g. TCP ...
>>
>Sorry for late answer, yes You're right, seems this is unwanted drop...
>
>Lets wait for Stefano Garzarella feedback

It was the same concern I had with the series that introduced SEQPACKET 
for vsock, which is why I suggested to wait until the message is 
complete, before copying it to the user's buffer.

IIUC, with the current upstream implementation, we don't have this 
problem, right?

I'm not sure how to fix this, other than by keeping all the fragments 
queued until we've successfully copied them to user space, which is what 
we should do without this series applied IIUC.

Stefano


^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2021-07-05 15:23 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-07-04  8:08 [RFC PATCH v2 0/6] Improve SOCK_SEQPACKET receive logic Arseny Krasnov
2021-07-04  8:09 ` [RFC PATCH v2 1/6] af_vsock/virtio/vsock: change seqpacket " Arseny Krasnov
2021-07-04  8:10 ` [RFC PATCH v2 2/6] af_vsock/virtio/vsock: remove 'seqpacket_has_data' callback Arseny Krasnov
2021-07-04  8:10 ` [RFC PATCH v2 3/6] virtio/vsock: remove 'msg_count' based logic Arseny Krasnov
2021-07-04  8:10 ` [RFC PATCH v2 4/6] af_vsock/virtio/vsock: add 'seqpacket_drop()' callback Arseny Krasnov
2021-07-04  8:10 ` [RFC PATCH v2 5/6] virtio/vsock: remove record size limit for SEQPACKET Arseny Krasnov
2021-07-04  8:11 ` [RFC PATCH v2 6/6] vsock_test: SEQPACKET read to broken buffer Arseny Krasnov
2021-07-04  8:13 ` [RFC PATCH v2 0/6] Improve SOCK_SEQPACKET receive logic Arseny Krasnov
2021-07-04  8:30 ` Michael S. Tsirkin
2021-07-04  9:23   ` Arseny Krasnov
2021-07-04  9:54     ` Michael S. Tsirkin
2021-07-05 10:48       ` [MASSMAIL KLMS]Re: " Arseny Krasnov
2021-07-05 15:23         ` Stefano Garzarella

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).