KVM virtio network performance on RHEL5.4
From: Tom Lendacky @ 2009-10-28 17:43 UTC (permalink / raw)
  To: kvm

I've been trying to understand why the performance from guest to guest over a 
10GbE link using virtio, as measured by netperf, dramatically decreases when 
the socket buffer size is increased on the receiving guest.  This is an Intel 
X3210 4-core 2.13GHz system running RHEL5.4.  I don't see this drop in 
performance when going from guest to host or host to guest over the 10GbE 
link.  Here are the results from netperf:

Default socket buffer sizes:
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB

 87380  16384  16384    60.01      2268.47   47.69    99.95    1.722   3.609

Receiver 256K socket buffer size (actually rmem_max * 2):
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB

262142  16384  16384    60.00      1583.75   39.00    74.09    2.018   3.832
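For reference, netperf's service demand column is just CPU time per KB transferred, and the figures in the tables above can be reproduced from the throughput, elapsed time, and utilization columns. A quick check (a sketch; it assumes KB = 1024 bytes and that the utilization percentage is single-CPU-equivalent):

```python
# Reproduce netperf's service-demand figures (us of CPU per KB transferred)
# from the throughput and utilization columns above. Assumes KB = 1024 bytes
# and single-CPU-equivalent utilization percentages.

def service_demand(tput_mbps, elapsed_s, cpu_pct):
    kb_transferred = tput_mbps * 1e6 / 8 * elapsed_s / 1024  # KB moved in the run
    cpu_us = cpu_pct / 100 * elapsed_s * 1e6                 # CPU time consumed, in us
    return cpu_us / kb_transferred

# Default socket buffers: 2268.47 Mbit/s over 60.01 s
print(service_demand(2268.47, 60.01, 47.69))   # send -> ~1.72 us/KB
print(service_demand(2268.47, 60.01, 99.95))   # recv -> ~3.61 us/KB

# 256K receive buffer: 1583.75 Mbit/s over 60.00 s
print(service_demand(1583.75, 60.00, 39.00))   # send -> ~2.02 us/KB
print(service_demand(1583.75, 60.00, 74.09))   # recv -> ~3.83 us/KB
```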

There is increased idle time in the receiver.  Using systemtap, I found that 
the idle time comes from waiting for data (tcp_recvmsg calling 
sk_wait_data).

I instrumented qemu on the receiver side to print out some statistics related 
to xmit/recv events.

"Rx-Could not receive" is incremented whenever "do_virtio_net_can_receive" 
returns 0
"Rx-Ring full" is incremented in "do_virtio_net_can_receive" whenever there 
are no available entries/space in the receive ring
"Rx-Count" is incremented whenever "virtio_net_receive2" is called (and can 
receive data)
"Rx-Bytes" is increased in "virtio_net_receive2" by the number of bytes to be 
read from the tap device
"Rx-Ring buffers" is increased by the number of buffers used for the data in 
"virtio_net_receive2"
"Tx-Notify" is incremented whenever "virtio_net_handle_tx" is invoked
"Tx-Sched BH" is incremented whenever "virtio_net_handle_tx" is invoked and 
the qemu_bh hasn't been scheduled yet
"Tx-Packets" is incremented in "virtio_net_flush_tx" whenever a packet is 
removed from the transmit ring and sent to qemu
"Tx-Bytes" is increased in "virtio_net_flush_tx" by the number of bytes sent 
to qemu.

Here are the stats for the two cases:

                        Default                 256K
Rx-Could not receive    3,559                   0
Rx-Ring full            3,559                   0
Rx-Count                1,063,056               805,012
Rx-Bytes                18,131,704,980          12,593,270,826
Rx-Ring buffers         4,963,793               3,541,010
Tx-Notify               125,068                 125,702
Tx-Sched BH             125,068                 125,702
Tx-Packets              147,256                 232,219
Tx-Bytes                11,486,448              18,113,586

Dividing Tx-Bytes by Tx-Packets in each case yields about 78 bytes/packet, 
so these are most likely ACKs.  But why am I seeing almost 85,000 more of 
them in the 256K socket buffer case?  Also, dividing Rx-Bytes by Rx-Count 
shows that the tap device is delivering about 1413 bytes less per call to 
qemu in the 256K socket buffer case.
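The per-packet and per-call figures above can be checked directly from the stats table (a quick sketch; the counter values are copied from the table):

```python
# Derive the quoted per-packet and per-call figures from the counter table.
# tx maps run -> (Tx-Bytes, Tx-Packets); rx maps run -> (Rx-Bytes, Rx-Count).
tx = {"default": (11_486_448, 147_256), "256k": (18_113_586, 232_219)}
rx = {"default": (18_131_704_980, 1_063_056), "256k": (12_593_270_826, 805_012)}

for name, (nbytes, pkts) in tx.items():
    print(f"{name}: {nbytes / pkts:.1f} bytes/Tx packet")   # ~78 in both cases

per_call = {n: b / c for n, (b, c) in rx.items()}
print(per_call["default"] - per_call["256k"])  # ~1413 fewer bytes/call with 256K
print(tx["256k"][1] - tx["default"][1])        # 84,963 extra Tx packets (ACKs)
```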

Does anyone have some insight as to what is happening?

Thanks,
Tom Lendacky
