* [PATCH net-next] hv_sock: perf: loop in send() to maximize bandwidth
@ 2019-05-22 23:10 Sunil Muthuswamy
  2019-05-22 23:31 ` Dexuan Cui
  2019-05-23  1:00 ` David Miller
  0 siblings, 2 replies; 3+ messages in thread
From: Sunil Muthuswamy @ 2019-05-22 23:10 UTC (permalink / raw)
  To: KY Srinivasan, Haiyang Zhang, Stephen Hemminger, Sasha Levin,
	David S. Miller, Michael Kelley
  Cc: netdev, linux-hyperv, linux-kernel

Currently, the hv_sock send() iterates once over the buffer, puts data into
the VMBUS channel and returns. It doesn't take advantage of the case where
a simultaneous reader is draining data from the channel. In such a case,
send() can maximize the bandwidth (and consequently minimize the CPU
cycles) by iterating until the channel is found to be full.
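
Read without the diff markers, the loop added by the patch below boils down
to roughly the following sketch; the names are those used in the patch, and
the buffer allocation, BUILD_BUG_ON and the out label are elided:

	while (len) {
		/* Stop once the channel is full and report what was queued */
		max_writable = hvs_channel_writable_bytes(chan);
		if (!max_writable)
			break;

		to_write = min_t(ssize_t, len, max_writable);
		to_write = min_t(ssize_t, to_write, HVS_SEND_BUF_SIZE);

		/* The msg iterator advances with each copy, so the next pass
		 * continues where the previous one stopped.
		 */
		ret = memcpy_from_msg(send_buf->data, msg, to_write);
		if (ret < 0)
			break;

		ret = hvs_send_data(hvs->chan, send_buf, to_write);
		if (ret < 0)
			break;

		bytes_written += to_write;
		len -= to_write;
	}

	/* If anything was queued before an error or a full channel was hit,
	 * report the partial count rather than the error.
	 */
	if (bytes_written)
		ret = bytes_written;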

Perf data:
Total Data Transfer: 10GB/iteration
Single threaded reader/writer, Linux hvsocket writer with Windows hvsocket
reader
Packet size: 64KB
CPU sys time was captured using the 'time' command for the writer to send
10GB of data.
'Send Buffer Loop' is with the patch applied.
The values below are over 10 iterations.

|--------------------------------------------------------|
|        |        Current        |   Send Buffer Loop    |
|--------------------------------------------------------|
|        | Throughput | CPU sys  | Throughput | CPU sys  |
|        | (MB/s)     | time (s) | (MB/s)     | time (s) |
|--------------------------------------------------------|
| Min    |     407    |   7.048  |    401     |  5.958   |
|--------------------------------------------------------|
| Max    |     455    |   7.563  |    542     |  6.993   |
|--------------------------------------------------------|
| Avg    |     440    |   7.411  |    451     |  6.639   |
|--------------------------------------------------------|
| Median |     446    |   7.417  |    447     |  6.761   |
|--------------------------------------------------------|
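
For reference, the writer side of a run like the one measured above might
look roughly like the sketch below. This is a hypothetical illustration, not
the actual test program: the AF_VSOCK service port is a placeholder, and the
timing was done externally with 'time' as described above.

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/vm_sockets.h>

int main(void)
{
	struct sockaddr_vm addr = {
		.svm_family = AF_VSOCK,
		.svm_cid    = VMADDR_CID_HOST,	/* Windows host is the reader */
		.svm_port   = 5000,		/* placeholder service port */
	};
	static char buf[64 * 1024];		/* 64KB packets, as in the test */
	unsigned long long total = 10ULL << 30;	/* 10GB per iteration */
	unsigned long long sent = 0;
	int fd = socket(AF_VSOCK, SOCK_STREAM, 0);

	if (fd < 0 || connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
		perror("vsock");
		return 1;
	}

	memset(buf, 'x', sizeof(buf));

	while (sent < total) {
		ssize_t n = send(fd, buf, sizeof(buf), 0);

		if (n < 0) {
			perror("send");
			break;
		}
		sent += n;
	}

	close(fd);
	return 0;
}

Run as 'time ./writer', the reported 'sys' value corresponds to the CPU sys
time column in the table above.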

Observation:
1. The avg throughput doesn't really change much with this change for this
scenario. This is most probably because the bottleneck on throughput is
somewhere else.
2. The average system (or kernel) cpu time goes down by 10%+ with this
change, for the same amount of data transfer.
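(From the table above, the average sys time drops from 7.411s to 6.639s,
i.e. (7.411 - 6.639) / 7.411 ~= 10.4% less kernel CPU time for the same
10GB transfer.)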

Signed-off-by: Sunil Muthuswamy <sunilmut@microsoft.com>
---
- The patch was previously submitted to net and reviewed; the feedback was
to submit it to net-next.

 net/vmw_vsock/hyperv_transport.c | 45 +++++++++++++++++++++++++++-------------
 1 file changed, 31 insertions(+), 14 deletions(-)

diff --git a/net/vmw_vsock/hyperv_transport.c b/net/vmw_vsock/hyperv_transport.c
index 982a8dc..7c13032 100644
--- a/net/vmw_vsock/hyperv_transport.c
+++ b/net/vmw_vsock/hyperv_transport.c
@@ -55,8 +55,9 @@ struct hvs_recv_buf {
 };
 
 /* We can send up to HVS_MTU_SIZE bytes of payload to the host, but let's use
- * a small size, i.e. HVS_SEND_BUF_SIZE, to minimize the dynamically-allocated
- * buffer, because tests show there is no significant performance difference.
+ * a smaller size, i.e. HVS_SEND_BUF_SIZE, to maximize concurrency between the
+ * guest and the host processing as one VMBUS packet is the smallest processing
+ * unit.
  *
  * Note: the buffer can be eliminated in the future when we add new VMBus
  * ringbuffer APIs that allow us to directly copy data from userspace buffer
@@ -644,7 +645,9 @@ static ssize_t hvs_stream_enqueue(struct vsock_sock *vsk, struct msghdr *msg,
 	struct hvsock *hvs = vsk->trans;
 	struct vmbus_channel *chan = hvs->chan;
 	struct hvs_send_buf *send_buf;
-	ssize_t to_write, max_writable, ret;
+	ssize_t to_write, max_writable;
+	ssize_t ret = 0;
+	ssize_t bytes_written = 0;
 
 	BUILD_BUG_ON(sizeof(*send_buf) != PAGE_SIZE_4K);
 
@@ -652,20 +655,34 @@ static ssize_t hvs_stream_enqueue(struct vsock_sock *vsk, struct msghdr *msg,
 	if (!send_buf)
 		return -ENOMEM;
 
-	max_writable = hvs_channel_writable_bytes(chan);
-	to_write = min_t(ssize_t, len, max_writable);
-	to_write = min_t(ssize_t, to_write, HVS_SEND_BUF_SIZE);
-
-	ret = memcpy_from_msg(send_buf->data, msg, to_write);
-	if (ret < 0)
-		goto out;
+	/* Reader(s) could be draining data from the channel as we write.
+	 * Maximize bandwidth, by iterating until the channel is found to be
+	 * full.
+	 */
+	while (len) {
+		max_writable = hvs_channel_writable_bytes(chan);
+		if (!max_writable)
+			break;
+		to_write = min_t(ssize_t, len, max_writable);
+		to_write = min_t(ssize_t, to_write, HVS_SEND_BUF_SIZE);
+		/* memcpy_from_msg is safe for loop as it advances the offsets
+		 * within the message iterator.
+		 */
+		ret = memcpy_from_msg(send_buf->data, msg, to_write);
+		if (ret < 0)
+			goto out;
 
-	ret = hvs_send_data(hvs->chan, send_buf, to_write);
-	if (ret < 0)
-		goto out;
+		ret = hvs_send_data(hvs->chan, send_buf, to_write);
+		if (ret < 0)
+			goto out;
 
-	ret = to_write;
+		bytes_written += to_write;
+		len -= to_write;
+	}
 out:
+	/* If any data has been sent, return that */
+	if (bytes_written)
+		ret = bytes_written;
 	kfree(send_buf);
 	return ret;
 }
-- 
2.7.4



* RE: [PATCH net-next] hv_sock: perf: loop in send() to maximize bandwidth
  2019-05-22 23:10 [PATCH net-next] hv_sock: perf: loop in send() to maximize bandwidth Sunil Muthuswamy
@ 2019-05-22 23:31 ` Dexuan Cui
  2019-05-23  1:00 ` David Miller
  1 sibling, 0 replies; 3+ messages in thread
From: Dexuan Cui @ 2019-05-22 23:31 UTC (permalink / raw)
  To: Sunil Muthuswamy, KY Srinivasan, Haiyang Zhang,
	Stephen Hemminger, Sasha Levin, David S. Miller, Michael Kelley
  Cc: netdev, linux-hyperv, linux-kernel

> From: linux-hyperv-owner@vger.kernel.org
> <linux-hyperv-owner@vger.kernel.org> On Behalf Of Sunil Muthuswamy
> 
> Currently, the hv_sock send() iterates once over the buffer, puts data into
> the VMBUS channel and returns. It doesn't take advantage of the case where
> a simultaneous reader is draining data from the channel. In such a case,
> send() can maximize the bandwidth (and consequently minimize the CPU
> cycles) by iterating until the channel is found to be full.
 
Reviewed-by: Dexuan Cui <decui@microsoft.com>

The patch looks good. Thanks, Sunil!

Thanks,
-- Dexuan


* Re: [PATCH net-next] hv_sock: perf: loop in send() to maximize bandwidth
  2019-05-22 23:10 [PATCH net-next] hv_sock: perf: loop in send() to maximize bandwidth Sunil Muthuswamy
  2019-05-22 23:31 ` Dexuan Cui
@ 2019-05-23  1:00 ` David Miller
  1 sibling, 0 replies; 3+ messages in thread
From: David Miller @ 2019-05-23  1:00 UTC (permalink / raw)
  To: sunilmut
  Cc: kys, haiyangz, sthemmin, sashal, mikelley, netdev, linux-hyperv,
	linux-kernel

From: Sunil Muthuswamy <sunilmut@microsoft.com>
Date: Wed, 22 May 2019 23:10:44 +0000

> Currently, the hv_sock send() iterates once over the buffer, puts data into
> the VMBUS channel and returns. It doesn't take advantage of the case where
> a simultaneous reader is draining data from the channel. In such a case,
> send() can maximize the bandwidth (and consequently minimize the CPU
> cycles) by iterating until the channel is found to be full.
> 
> Perf data:
> Total Data Transfer: 10GB/iteration
> Single threaded reader/writer, Linux hvsocket writer with Windows hvsocket
> reader
> Packet size: 64KB
> CPU sys time was captured using the 'time' command for the writer to send
> 10GB of data.
> 'Send Buffer Loop' is with the patch applied.
> The values below are over 10 iterations.
 ...
> Observation:
> 1. The avg throughput doesn't really change much with this change for this
> scenario. This is most probably because the bottleneck on throughput is
> somewhere else.
> 2. The average system (or kernel) cpu time goes down by 10%+ with this
> change, for the same amount of data transfer.
> 
> Signed-off-by: Sunil Muthuswamy <sunilmut@microsoft.com>

Applied.

