From: Jason Wang <jasowang@redhat.com>
To: Rick Jones <rick.jones2@hp.com>
Cc: mst@redhat.com, mashirle@us.ibm.com, krkumar2@in.ibm.com,
habanero@linux.vnet.ibm.com, rusty@rustcorp.com.au,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
virtualization@lists.linux-foundation.org, edumazet@google.com,
tahm@linux.vnet.ibm.com, jwhan@filewood.snu.ac.kr,
davem@davemloft.net, akong@redhat.com, kvm@vger.kernel.org,
sri@us.ibm.com
Subject: Re: [net-next RFC V5 0/5] Multiqueue virtio-net
Date: Fri, 06 Jul 2012 15:42:01 +0800 [thread overview]
Message-ID: <4FF696C9.5070907@redhat.com> (raw)
In-Reply-To: <4FF5D2B7.6080602@hp.com>
On 07/06/2012 01:45 AM, Rick Jones wrote:
> On 07/05/2012 03:29 AM, Jason Wang wrote:
>
>>
>> Test result:
>>
>> 1) 1 vm 2 vcpu 1q vs 2q, 1 - 1q, 2 - 2q, no pinning
>>
>> - Guest to External Host TCP STREAM
> >> sessions size throughput1 throughput2 % norm1 norm2 %
>> 1 64 650.55 655.61 100% 24.88 24.86 99%
>> 2 64 1446.81 1309.44 90% 30.49 27.16 89%
>> 4 64 1430.52 1305.59 91% 30.78 26.80 87%
>> 8 64 1450.89 1270.82 87% 30.83 25.95 84%
>
> Was the -D test-specific option used to set TCP_NODELAY? I'm guessing
> from your description of how packet sizes were smaller with multiqueue
> and your need to hack tcp_write_xmit() it wasn't but since we don't
> have the specific netperf command lines (hint hint :) I wanted to make
> certain.
Hi Rick:
No, I didn't specify -D to disable Nagle. I also collected tx packet
counts and the average packet size:
Guest to External Host ( 2vcpu 1q vs 2q )
sessions size tput-sq tput-mq % norm-sq norm-mq % #tx-pkts-sq #tx-pkts-mq % avg-sz-sq avg-sz-mq %
1 64 668.85 671.13 100% 25.80 26.86 104% 629038 627126 99% 1395 1403 100%
2 64 1421.29 1345.40 94% 32.06 27.57 85% 1318498 1246721 94% 1413 1414 100%
4 64 1469.96 1365.42 92% 32.44 27.04 83% 1362542 1277848 93% 1414 1401 99%
8 64 1131.00 1361.58 120% 24.81 26.76 107% 1223700 1280970 104% 1395 1394 99%
1 256 1883.98 1649.87 87% 60.67 58.48 96% 1542775 1465836 95% 1592 1472 92%
2 256 4847.09 3539.74 73% 98.35 64.05 65% 2683346 3074046 114% 2323 1505 64%
4 256 5197.33 3283.48 63% 109.14 62.39 57% 1819814 2929486 160% 3636 1467 40%
8 256 5953.53 3359.22 56% 122.75 64.21 52% 906071 2924148 322% 8282 1502 18%
1 512 3019.70 2646.07 87% 93.89 86.78 92% 2003780 2256077 112% 1949 1532 78%
2 512 7455.83 5861.03 78% 173.79 104.43 60% 1200322 3577142 298% 7831 2114 26%
4 512 8962.28 7062.20 78% 213.08 127.82 59% 468142 2594812 554% 24030 3468 14%
8 512 7849.82 8523.85 108% 175.41 154.19 87% 304923 1662023 545% 38640 6479 16%
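For reference, the command lines were of roughly this shape (the destination IP and test length here are my reconstruction, not the exact invocation):

```shell
# Hypothetical netperf invocations; -m sets the send size used in the tables.
# The runs above did NOT pass the test-specific -D option, so Nagle stayed on:
netperf -H 192.168.0.2 -t TCP_STREAM -l 60 -- -m 64

# Adding -D after the -- separator would set TCP_NODELAY (disable Nagle):
netperf -H 192.168.0.2 -t TCP_STREAM -l 60 -- -m 64 -D
```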
When multiqueue is enabled, it does achieve a higher packet rate, but
with a much smaller average packet size. It looks to me like multiqueue
is faster, so guest TCP has less opportunity to build large skbs; it
ends up sending lots of small packets instead, which leads to many more
exits and much more vhost work. One interesting thing is that if I run
tcpdump on the host where the guest runs, I see an obvious throughput
increase. To verify this assumption, I hacked tcp_write_xmit() with the
following patch and set tcp_tso_win_divisor=1; with that, multiqueue can
outperform, or at least match, singlequeue throughput. It could
introduce extra latency, but I haven't measured that yet.
I'm not a TCP expert, but the changes look reasonable to me:
- do the full-sized TSO check in tcp_tso_should_defer() only for
Westwood, since that check was added for TCP Westwood
- run tcp_tso_should_defer() even when tso_segs == 1, as long as TSO is
enabled
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index c465d3e..166a888 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -1567,7 +1567,7 @@ static bool tcp_tso_should_defer(struct sock *sk, struct sk_buff *skb)
in_flight = tcp_packets_in_flight(tp);
- BUG_ON(tcp_skb_pcount(skb) <= 1 || (tp->snd_cwnd <= in_flight));
+ BUG_ON(tp->snd_cwnd <= in_flight);
send_win = tcp_wnd_end(tp) - TCP_SKB_CB(skb)->seq;
@@ -1576,9 +1576,11 @@ static bool tcp_tso_should_defer(struct sock *sk, struct sk_buff *skb)
limit = min(send_win, cong_win);
+#if 0
/* If a full-sized TSO skb can be sent, do it. */
if (limit >= sk->sk_gso_max_size)
goto send_now;
+#endif
/* Middle in queue won't get any more data, full sendable
already? */
if ((skb != tcp_write_queue_tail(sk)) && (limit >= skb->len))
@@ -1795,10 +1797,9 @@ static bool tcp_write_xmit(struct sock *sk, unsigned int mss_now, int nonagle,
(tcp_skb_is_last(sk, skb) ?
nonagle :
TCP_NAGLE_PUSH))))
break;
- } else {
- if (!push_one && tcp_tso_should_defer(sk, skb))
- break;
}
+ if (!push_one && tcp_tso_should_defer(sk, skb))
+ break;
limit = mss_now;
if (tso_segs > 1 && !tcp_urg_mode(tp))
>
> Instead of calling them throughput1 and throughput2, it might be more
> clear in future to identify them as singlequeue and multiqueue.
>
Sure.
> Also, how are you combining the concurrent netperf results? Are you
> taking sums of what netperf reports, or are you gathering statistics
> outside of netperf?
>
The throughputs were just summed from the netperf results, as the
netperf manual suggests. The CPU utilization was measured with mpstat.
>> - TCP RR
> >> sessions size throughput1 throughput2 % norm1 norm2 %
>> 50 1 54695.41 84164.98 153% 1957.33 1901.31 97%
>
> A single instance TCP_RR test would help confirm/refute any
> non-trivial change in (effective) path length between the two cases.
>
Yes, I will test this, thanks.
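For the record, a single-instance check along those lines might look like this (destination IP and duration are assumptions):

```shell
# Single-instance TCP_RR with a 1-byte request/response, to compare
# per-transaction path length between single queue and multiqueue:
netperf -H 192.168.0.2 -t TCP_RR -l 60 -- -r 1,1
```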
> happy benchmarking,
>
> rick jones