* KVM virtio network performance on RHEL5.4
From: Tom Lendacky @ 2009-10-28 17:43 UTC (permalink / raw)
To: kvm
I've been trying to understand why the performance from guest to guest over a
10GbE link using virtio, as measured by netperf, dramatically decreases when
the socket buffer size is increased on the receiving guest. This is an Intel
X3210 4-core 2.13GHz system running RHEL5.4. I don't see this drop in
performance when going from guest to host or host to guest over the 10GbE
link. Here are the results from netperf:
Default socket buffer sizes:

Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB

 87380  16384  16384    60.01      2268.47   47.69    99.95    1.722   3.609
Receiver 256K socket buffer size (actually rmem_max * 2):

Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB

262142  16384  16384    60.00      1583.75   39.00    74.09    2.018   3.832
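As an aside, the odd-looking 262142 is just how Linux reports SO_RCVBUF. A
minimal sketch, assuming the stock net.core.rmem_max of 131071 on these
guests (an assumption on my part), shows the requested 256K being capped at
rmem_max and then doubled by the kernel:

/* Sketch only: shows why a 256K SO_RCVBUF request shows up as 262142.
 * Assumes net.core.rmem_max is at its default of 131071. */
#include <stdio.h>
#include <sys/socket.h>
#include <netinet/in.h>

int main(void)
{
    int s = socket(AF_INET, SOCK_STREAM, 0);
    int req = 256 * 1024;
    int eff;
    socklen_t len = sizeof(eff);

    setsockopt(s, SOL_SOCKET, SO_RCVBUF, &req, sizeof(req));
    getsockopt(s, SOL_SOCKET, SO_RCVBUF, &eff, &len);

    /* The kernel caps the request at rmem_max and doubles it to account
     * for bookkeeping overhead: min(262144, 131071) * 2 == 262142. */
    printf("requested %d, effective SO_RCVBUF %d\n", req, eff);
    return 0;
}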
There is increased idle time on the receiver in the 256K case (receive-side
CPU utilization drops from 99.95% to 74.09%). Using systemtap I found that
the idle time is due to waiting for data (tcp_recvmsg calling sk_wait_data).
I instrumented qemu on the receiver side to print out some statistics related
to xmit/recv events (a rough sketch of this bookkeeping follows the list):

- "Rx-Could not receive" is incremented whenever "do_virtio_net_can_receive"
  returns 0.
- "Rx-Ring full" is incremented in "do_virtio_net_can_receive" whenever there
  are no available entries/space in the receive ring.
- "Rx-Count" is incremented whenever "virtio_net_receive2" is called (and can
  receive data).
- "Rx-Bytes" is increased in "virtio_net_receive2" by the number of bytes to
  be read from the tap device.
- "Rx-Ring buffers" is increased by the number of buffers used for the data
  in "virtio_net_receive2".
- "Tx-Notify" is incremented whenever "virtio_net_handle_tx" is invoked.
- "Tx-Sched BH" is incremented whenever "virtio_net_handle_tx" is invoked and
  the qemu_bh hasn't been scheduled yet.
- "Tx-Packets" is incremented in "virtio_net_flush_tx" whenever a packet is
  removed from the transmit ring and sent to qemu.
- "Tx-Bytes" is increased in "virtio_net_flush_tx" by the number of bytes
  sent to qemu.
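To make that concrete, here is a rough, self-contained sketch of the counter
bookkeeping. This is illustrative only, not the actual patch against the
RHEL5.4 qemu-kvm sources; the hook points in the comments are simply the
functions named above, and the helper is hypothetical:

/* Illustrative only -- models the counters described above. In the
 * instrumented qemu each field is bumped at the named hook point and
 * the totals are printed out. */
#include <stdio.h>
#include <stdint.h>

struct virtio_net_stats {
    uint64_t rx_could_not_receive; /* do_virtio_net_can_receive() returned 0   */
    uint64_t rx_ring_full;         /* no space left in the receive ring        */
    uint64_t rx_count;             /* calls into virtio_net_receive2()         */
    uint64_t rx_bytes;             /* bytes read from the tap device           */
    uint64_t rx_ring_buffers;      /* receive-ring buffers consumed            */
    uint64_t tx_notify;            /* virtio_net_handle_tx() invocations       */
    uint64_t tx_sched_bh;          /* times the tx qemu_bh was newly scheduled */
    uint64_t tx_packets;           /* packets flushed by virtio_net_flush_tx() */
    uint64_t tx_bytes;             /* bytes flushed by virtio_net_flush_tx()   */
};

static struct virtio_net_stats stats;

/* Example of the kind of update done from virtio_net_receive2(). */
static void account_rx(size_t tap_bytes, unsigned ring_buffers_used)
{
    stats.rx_count++;
    stats.rx_bytes += tap_bytes;
    stats.rx_ring_buffers += ring_buffers_used;
}

int main(void)
{
    account_rx(1514, 1);   /* pretend one full-sized frame arrived from tap */
    printf("Rx-Count %llu  Rx-Bytes %llu  Rx-Ring buffers %llu\n",
           (unsigned long long)stats.rx_count,
           (unsigned long long)stats.rx_bytes,
           (unsigned long long)stats.rx_ring_buffers);
    return 0;
}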
Here are the stats for the two cases:

                             Default            256K
Rx-Could not receive           3,559               0
Rx-Ring full                   3,559               0
Rx-Count                   1,063,056         805,012
Rx-Bytes              18,131,704,980  12,593,270,826
Rx-Ring buffers            4,963,793       3,541,010
Tx-Notify                    125,068         125,702
Tx-Sched BH                  125,068         125,702
Tx-Packets                   147,256         232,219
Tx-Bytes                  11,486,448      18,113,586
Dividing the Tx-Bytes by Tx-Packets in each case yields about 78 bytes/packet
so these are most likely ACKs. But why am I seeing almost 85,000 more of
these in the 256K socket buffer case? Also, dividing the Rx-Bytes by the
Rx-Count shows that the tap device is delivering about 1,413 bytes less per
call to qemu in the 256K socket buffer case.
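For reference, the arithmetic behind those two observations, using the values
from the table above:

  Tx-Bytes / Tx-Packets:  11,486,448 / 147,256 ~= 78.0 bytes/packet (default)
                          18,113,586 / 232,219 ~= 78.0 bytes/packet (256K)
                          (232,219 - 147,256 = 84,963 extra transmit packets)

  Rx-Bytes / Rx-Count:    18,131,704,980 / 1,063,056 ~= 17,056 bytes/call (default)
                          12,593,270,826 /   805,012 ~= 15,644 bytes/call (256K)
                          a difference of roughly 1,413 bytes per call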
Does anyone have some insight as to what is happening?
Thanks,
Tom Lendacky