* Disabling XPS for 4.4.0-1+ixgbe+OpenStack VM over a VLAN means 65% increase in netperf TCP_STREAM
@ 2016-02-04 19:13 Rick Jones
  2016-02-04 19:38 ` Tom Herbert
  0 siblings, 1 reply; 7+ messages in thread
From: Rick Jones @ 2016-02-04 19:13 UTC
  To: netdev

Folks -

I was doing some performance work with OpenStack Liberty on systems with 
2x E5-2650L v3 @ 1.80GHz processors and 560FLR (Intel 82599ES) NICs onto 
which I'd placed a 4.4.0-1 kernel.  I was actually interested in the 
effect of removing the Linux bridge from all the plumbing OpenStack 
creates (it is there for the iptables-based implementation of security 
group rules, because OpenStack Liberty doesn't enable them on the OVS 
bridge(s) it creates), and I'd noticed that when I removed the Linux 
bridge from the "stack", instance-to-instance (VM-to-VM) performance 
across a VLAN-based Neutron private network dropped.  Quite unexpected.

On a lark, I tried explicitly binding the NIC's IRQs and Boom! the 
single-stream performance shot up to near link-rate.  I couldn't recall 
explicit binding of IRQs doing that much for single-stream netperf 
TCP_STREAM before.
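
A minimal sketch of one way to do that sort of binding, assuming the 
usual ixgbe "eth0-TxRx-N" interrupt naming; not necessarily the exact 
method used here:

# pin the IRQ for queue 0 to CPU 0; repeat per queue with the desired
# CPU bit (masks for CPUs above 31 need the comma-separated group format)
irq=$(grep eth0-TxRx-0 /proc/interrupts | awk '{print $1}' | tr -d :)
echo 1 | sudo tee /proc/irq/$irq/smp_affinity > /dev/null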

I asked the Intel folks about that, and they suggested I try disabling 
XPS.  So, with that I see the following on single-stream tests between 
the VMs on that VLAN-based private network as created by OpenStack 
Liberty:


           99% Confident within +/- 2.5% of "real" average
           TCP_RR in Trans/s, TCP_STREAM in Mbit/s

                    XPS Enabled   XPS Disabled   Delta
TCP_STREAM            5353          8841 (*)     65.2%
TCP_RR                8562          9666         12.9%

The Intel folks suggested something about the process scheduler moving 
the sender around and ultimately causing some packet re-ordering.  That 
could I suppose explain the TCP_STREAM difference, but not the TCP_RR 
since that has just a single segment in flight at one time.

I can try to get perf/whatnot installed on the systems - suggestions as 
to what metrics to look at are welcome.

happy benchmarking,

rick jones
* If I disable XPS on the sending side only, it is more like 7700 Mbit/s
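
Disabling XPS amounted to clearing each tx queue's xps_cpus mask in 
sysfs, something along these lines (a sketch assuming eth0; not 
necessarily the exact commands used):

# clear the XPS CPU mask on every tx queue; restoring XPS is a matter
# of writing the original masks back
for q in /sys/class/net/eth0/queues/tx-*/xps_cpus; do
    echo 0 | sudo tee $q > /dev/null
done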

netstats from the receiver over a netperf TCP_STREAM test's duration 
with XPS enabled:

$ netperf -H 10.240.50.191 -- -o throughput,local_transport_retrans
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 
10.240.50.191 () port 0 AF_INET : demo
Throughput,Local Transport Retransmissions
5292.74,4555


$ ./beforeafter before after
Ip:
     327837 total packets received
     0 with invalid addresses
     0 forwarded
     0 incoming packets discarded
     327837 incoming packets delivered
     293438 requests sent out
Icmp:
     0 ICMP messages received
     0 input ICMP message failed.
     ICMP input histogram:
         destination unreachable: 0
     0 ICMP messages sent
     0 ICMP messages failed
     ICMP output histogram:
         destination unreachable: 0
IcmpMsg:
         InType3: 0
         OutType3: 0
Tcp:
     0 active connections openings
     2 passive connection openings
     0 failed connection attempts
     0 connection resets received
     0 connections established
     327837 segments received
     293438 segments send out
     0 segments retransmited
     0 bad segments received.
     0 resets sent
Udp:
     0 packets received
     0 packets to unknown port received.
     0 packet receive errors
     0 packets sent
     IgnoredMulti: 0
UdpLite:
TcpExt:
     0 TCP sockets finished time wait in fast timer
     0 delayed acks sent
     Quick ack mode was activated 1016 times
     50386 packets directly queued to recvmsg prequeue.
     309545872 bytes directly in process context from backlog
     2874395424 bytes directly received in process context from prequeue
     86591 packet headers predicted
     84934 packets header predicted and directly queued to user
     6 acknowledgments not containing data payload received
     20 predicted acknowledgments
     1017 DSACKs sent for old packets
     TCPRcvCoalesce: 157097
     TCPOFOQueue: 78206
     TCPOrigDataSent: 24
IpExt:
     InBcastPkts: 0
     InOctets: 6643231012
     OutOctets: 17203936
     InBcastOctets: 0
     InNoECTPkts: 327837
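
The statistics above are deltas between two snapshots bracketing the 
test; a sketch of that sort of capture, assuming beforeafter simply 
subtracts the first snapshot from the second:

# on the receiver, bracketing the netperf run made from the sender
netstat -s > before
# ... run the netperf TCP_STREAM test from the sending VM ...
netstat -s > after
./beforeafter before after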

And now with it disabled on both sides:
$ netperf -H 10.240.50.191 -- -o throughput,local_transport_retrans
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 
10.240.50.191 () port 0 AF_INET : demo
Throughput,Local Transport Retransmissions
8656.84,1903
$ ./beforeafter noxps_before noxps_avter
Ip:
     251831 total packets received
     0 with invalid addresses
     0 forwarded
     0 incoming packets discarded
     251831 incoming packets delivered
     218415 requests sent out
Icmp:
     0 ICMP messages received
     0 input ICMP message failed.
     ICMP input histogram:
         destination unreachable: 0
     0 ICMP messages sent
     0 ICMP messages failed
     ICMP output histogram:
         destination unreachable: 0
IcmpMsg:
         InType3: 0
         OutType3: 0
Tcp:
     0 active connections openings
     2 passive connection openings
     0 failed connection attempts
     0 connection resets received
     0 connections established
     251831 segments received
     218415 segments send out
     0 segments retransmited
     0 bad segments received.
     0 resets sent
Udp:
     0 packets received
     0 packets to unknown port received.
     0 packet receive errors
     0 packets sent
     IgnoredMulti: 0
UdpLite:
TcpExt:
     0 TCP sockets finished time wait in fast timer
     0 delayed acks sent
     Quick ack mode was activated 48 times
     91752 packets directly queued to recvmsg prequeue.
     846851580 bytes directly in process context from backlog
     5442436572 bytes directly received in process context from prequeue
     102517 packet headers predicted
     146102 packets header predicted and directly queued to user
     6 acknowledgments not containing data payload received
     26 predicted acknowledgments
     TCPLossProbes: 0
     TCPLossProbeRecovery: 0
     48 DSACKs sent for old packets
     0 DSACKs received
     TCPRcvCoalesce: 45658
     TCPOFOQueue: 967
     TCPOrigDataSent: 30
IpExt:
     InBcastPkts: 0
     InOctets: 10837972268
     OutOctets: 11413100
     InBcastOctets: 0
     InNoECTPkts: 251831


* Re: Disabling XPS for 4.4.0-1+ixgbe+OpenStack VM over a VLAN means 65% increase in netperf TCP_STREAM
  2016-02-04 19:13 Disabling XPS for 4.4.0-1+ixgbe+OpenStack VM over a VLAN means 65% increase in netperf TCP_STREAM Rick Jones
@ 2016-02-04 19:38 ` Tom Herbert
  2016-02-04 19:57   ` Rick Jones
  2016-02-08 18:03   ` Rick Jones
  0 siblings, 2 replies; 7+ messages in thread
From: Tom Herbert @ 2016-02-04 19:38 UTC
  To: Rick Jones; +Cc: Linux Kernel Network Developers

On Thu, Feb 4, 2016 at 11:13 AM, Rick Jones <rick.jones2@hpe.com> wrote:
> [...]
> The Intel folks suggested something about the process scheduler moving the
> sender around and ultimately causing some packet re-ordering.  That could I
> suppose explain the TCP_STREAM difference, but not the TCP_RR since that has
> just a single segment in flight at one time.
>
XPS has OOO avoidance for TCP, that should not be a problem.

> I can try to get perf/whatnot installed on the systems - suggestions as to
> what metrics to look at are welcome.
>
I'd start with verifying that the XPS configuration is sane and then
trying to reproduce the issue outside of VMs; if both of those are
okay, then maybe look at some sort of bad interaction with the
OpenStack configuration.

Tom



* Re: Disabling XPS for 4.4.0-1+ixgbe+OpenStack VM over a VLAN means 65% increase in netperf TCP_STREAM
  2016-02-04 19:38 ` Tom Herbert
@ 2016-02-04 19:57   ` Rick Jones
  2016-02-04 20:13     ` Tom Herbert
  2016-02-08 18:03   ` Rick Jones
  1 sibling, 1 reply; 7+ messages in thread
From: Rick Jones @ 2016-02-04 19:57 UTC
  To: Tom Herbert; +Cc: Linux Kernel Network Developers

On 02/04/2016 11:38 AM, Tom Herbert wrote:
> On Thu, Feb 4, 2016 at 11:13 AM, Rick Jones <rick.jones2@hpe.com> wrote:
>> The Intel folks suggested something about the process scheduler moving the
>> sender around and ultimately causing some packet re-ordering.  That could I
>> suppose explain the TCP_STREAM difference, but not the TCP_RR since that has
>> just a single segment in flight at one time.
>>
> XPS has OOO avoidance for TCP, that should not be a problem.

What/how much should I read into:

With XPS    TCPOFOQueue: 78206
Without XPS TCPOFOQueue: 967

out of the netstat statistics on the receiving VM?
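
In case it is useful for reproducing this, one way to watch that 
counter live on the receiving side is a loop along these lines; a 
sketch using iproute2's nstat, not something actually run here:

# sample the receiver's out-of-order queue counter once a second
while sleep 1; do nstat -az TcpExtTCPOFOQueue; done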

>> I can try to get perf/whatnot installed on the systems - suggestions as to
>> what metrics to look at are welcome.
>>
> I'd start with verifying that the XPS configuration is sane and then
> trying to reproduce the issue outside of VMs; if both of those are
> okay, then maybe look at some sort of bad interaction with the
> OpenStack configuration.

Fair enough - what is the definition of "sane" for an XPS configuration?

Here is what it looks like before I disabled it:

$ for i in `find /sys/devices/pci0000:00/0000:00:02.2/0000:04:00.0/net/eth0 -name xps_cpus`; do echo $i `cat $i`; done
/sys/devices/pci0000:00/0000:00:02.2/0000:04:00.0/net/eth0/queues/tx-0/xps_cpus 0000,00000001
/sys/devices/pci0000:00/0000:00:02.2/0000:04:00.0/net/eth0/queues/tx-1/xps_cpus 0000,00000002
/sys/devices/pci0000:00/0000:00:02.2/0000:04:00.0/net/eth0/queues/tx-2/xps_cpus 0000,00000004
/sys/devices/pci0000:00/0000:00:02.2/0000:04:00.0/net/eth0/queues/tx-3/xps_cpus 0000,00000008
/sys/devices/pci0000:00/0000:00:02.2/0000:04:00.0/net/eth0/queues/tx-4/xps_cpus 0000,00000010
/sys/devices/pci0000:00/0000:00:02.2/0000:04:00.0/net/eth0/queues/tx-5/xps_cpus 0000,00000020
/sys/devices/pci0000:00/0000:00:02.2/0000:04:00.0/net/eth0/queues/tx-6/xps_cpus 0000,00000040
/sys/devices/pci0000:00/0000:00:02.2/0000:04:00.0/net/eth0/queues/tx-7/xps_cpus 0000,00000080
/sys/devices/pci0000:00/0000:00:02.2/0000:04:00.0/net/eth0/queues/tx-8/xps_cpus 0000,00000100
/sys/devices/pci0000:00/0000:00:02.2/0000:04:00.0/net/eth0/queues/tx-9/xps_cpus 0000,00000200
/sys/devices/pci0000:00/0000:00:02.2/0000:04:00.0/net/eth0/queues/tx-10/xps_cpus 0000,00000400
/sys/devices/pci0000:00/0000:00:02.2/0000:04:00.0/net/eth0/queues/tx-11/xps_cpus 0000,00000800
/sys/devices/pci0000:00/0000:00:02.2/0000:04:00.0/net/eth0/queues/tx-12/xps_cpus 0000,00001000
/sys/devices/pci0000:00/0000:00:02.2/0000:04:00.0/net/eth0/queues/tx-13/xps_cpus 0000,00002000
/sys/devices/pci0000:00/0000:00:02.2/0000:04:00.0/net/eth0/queues/tx-14/xps_cpus 0000,00004000
/sys/devices/pci0000:00/0000:00:02.2/0000:04:00.0/net/eth0/queues/tx-15/xps_cpus 0000,00008000
/sys/devices/pci0000:00/0000:00:02.2/0000:04:00.0/net/eth0/queues/tx-16/xps_cpus 0000,00010000
/sys/devices/pci0000:00/0000:00:02.2/0000:04:00.0/net/eth0/queues/tx-17/xps_cpus 0000,00020000
/sys/devices/pci0000:00/0000:00:02.2/0000:04:00.0/net/eth0/queues/tx-18/xps_cpus 0000,00040000
/sys/devices/pci0000:00/0000:00:02.2/0000:04:00.0/net/eth0/queues/tx-19/xps_cpus 0000,00080000
/sys/devices/pci0000:00/0000:00:02.2/0000:04:00.0/net/eth0/queues/tx-20/xps_cpus 0000,00100000
/sys/devices/pci0000:00/0000:00:02.2/0000:04:00.0/net/eth0/queues/tx-21/xps_cpus 0000,00200000
/sys/devices/pci0000:00/0000:00:02.2/0000:04:00.0/net/eth0/queues/tx-22/xps_cpus 0000,00400000
/sys/devices/pci0000:00/0000:00:02.2/0000:04:00.0/net/eth0/queues/tx-23/xps_cpus 0000,00800000
/sys/devices/pci0000:00/0000:00:02.2/0000:04:00.0/net/eth0/queues/tx-24/xps_cpus 0000,01000000
/sys/devices/pci0000:00/0000:00:02.2/0000:04:00.0/net/eth0/queues/tx-25/xps_cpus 0000,02000000
/sys/devices/pci0000:00/0000:00:02.2/0000:04:00.0/net/eth0/queues/tx-26/xps_cpus 0000,04000000
/sys/devices/pci0000:00/0000:00:02.2/0000:04:00.0/net/eth0/queues/tx-27/xps_cpus 0000,08000000
/sys/devices/pci0000:00/0000:00:02.2/0000:04:00.0/net/eth0/queues/tx-28/xps_cpus 0000,10000000
/sys/devices/pci0000:00/0000:00:02.2/0000:04:00.0/net/eth0/queues/tx-29/xps_cpus 0000,20000000
/sys/devices/pci0000:00/0000:00:02.2/0000:04:00.0/net/eth0/queues/tx-30/xps_cpus 0000,40000000
/sys/devices/pci0000:00/0000:00:02.2/0000:04:00.0/net/eth0/queues/tx-31/xps_cpus 0000,80000000
/sys/devices/pci0000:00/0000:00:02.2/0000:04:00.0/net/eth0/queues/tx-32/xps_cpus 0001,00000000
/sys/devices/pci0000:00/0000:00:02.2/0000:04:00.0/net/eth0/queues/tx-33/xps_cpus 0002,00000000
/sys/devices/pci0000:00/0000:00:02.2/0000:04:00.0/net/eth0/queues/tx-34/xps_cpus 0004,00000000
/sys/devices/pci0000:00/0000:00:02.2/0000:04:00.0/net/eth0/queues/tx-35/xps_cpus 0008,00000000
/sys/devices/pci0000:00/0000:00:02.2/0000:04:00.0/net/eth0/queues/tx-36/xps_cpus 0010,00000000
/sys/devices/pci0000:00/0000:00:02.2/0000:04:00.0/net/eth0/queues/tx-37/xps_cpus 0020,00000000
/sys/devices/pci0000:00/0000:00:02.2/0000:04:00.0/net/eth0/queues/tx-38/xps_cpus 0040,00000000
/sys/devices/pci0000:00/0000:00:02.2/0000:04:00.0/net/eth0/queues/tx-39/xps_cpus 0080,00000000
/sys/devices/pci0000:00/0000:00:02.2/0000:04:00.0/net/eth0/queues/tx-40/xps_cpus 0100,00000000
/sys/devices/pci0000:00/0000:00:02.2/0000:04:00.0/net/eth0/queues/tx-41/xps_cpus 0200,00000000
/sys/devices/pci0000:00/0000:00:02.2/0000:04:00.0/net/eth0/queues/tx-42/xps_cpus 0400,00000000
/sys/devices/pci0000:00/0000:00:02.2/0000:04:00.0/net/eth0/queues/tx-43/xps_cpus 0800,00000000
/sys/devices/pci0000:00/0000:00:02.2/0000:04:00.0/net/eth0/queues/tx-44/xps_cpus 1000,00000000
/sys/devices/pci0000:00/0000:00:02.2/0000:04:00.0/net/eth0/queues/tx-45/xps_cpus 2000,00000000
/sys/devices/pci0000:00/0000:00:02.2/0000:04:00.0/net/eth0/queues/tx-46/xps_cpus 4000,00000000
/sys/devices/pci0000:00/0000:00:02.2/0000:04:00.0/net/eth0/queues/tx-47/xps_cpus 8000,00000000


Which looks like it simply got spread across all the "CPUs" (HTs 
included) in the system.
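
A more compact way to eyeball that same spread, for what it's worth 
(not something run above):

# print each tx queue's path and mask on one line; with this default
# layout each mask should be a distinct single-CPU bit
grep . /sys/class/net/eth0/queues/tx-*/xps_cpus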

happy benchmarking,

rick jones


* Re: Disabling XPS for 4.4.0-1+ixgbe+OpenStack VM over a VLAN means 65% increase in netperf TCP_STREAM
  2016-02-04 19:57   ` Rick Jones
@ 2016-02-04 20:13     ` Tom Herbert
  2016-02-04 20:28       ` Rick Jones
  0 siblings, 1 reply; 7+ messages in thread
From: Tom Herbert @ 2016-02-04 20:13 UTC
  To: Rick Jones; +Cc: Linux Kernel Network Developers

On Thu, Feb 4, 2016 at 11:57 AM, Rick Jones <rick.jones2@hpe.com> wrote:
> On 02/04/2016 11:38 AM, Tom Herbert wrote:
>>
>> On Thu, Feb 4, 2016 at 11:13 AM, Rick Jones <rick.jones2@hpe.com> wrote:
>>>
>>> The Intel folks suggested something about the process scheduler moving
>>> the sender around and ultimately causing some packet re-ordering.  That
>>> could I suppose explain the TCP_STREAM difference, but not the TCP_RR
>>> since that has just a single segment in flight at one time.
>>>
>> XPS has OOO avoidance for TCP, that should not be a problem.
>
>
> What/how much should I read into:
>
> With XPS    TCPOFOQueue: 78206
> Without XPS TCPOFOQueue: 967
>
> out of the netstat statistics on the receiving VM?
>
Okay, that makes sense. The OOO avoidance only applies to TCP sockets
in the stack; that doesn't cross into the VM. Presumably, packets coming
from the VM don't have a socket, so sk_tx_queue_get always returns -1
and netdev_pick_tx will steer each packet to a queue based on the
currently running CPU, without any memory of the previous choice.



* Re: Disabling XPS for 4.4.0-1+ixgbe+OpenStack VM over a VLAN means 65% increase in netperf TCP_STREAM
  2016-02-04 20:13     ` Tom Herbert
@ 2016-02-04 20:28       ` Rick Jones
  0 siblings, 0 replies; 7+ messages in thread
From: Rick Jones @ 2016-02-04 20:28 UTC
  To: Tom Herbert; +Cc: Linux Kernel Network Developers

On 02/04/2016 12:13 PM, Tom Herbert wrote:
> On Thu, Feb 4, 2016 at 11:57 AM, Rick Jones <rick.jones2@hpe.com> wrote:
>> On 02/04/2016 11:38 AM, Tom Herbert wrote:
>>> XPS has OOO avoidance for TCP, that should not be a problem.
>>
>>
>> What/how much should I read into:
>>
>> With XPS    TCPOFOQueue: 78206
>> Without XPS TCPOFOQueue: 967
>>
>> out of the netstat statistics on the receiving VM?
>>
> Okay, that makes sense. The OOO avoidance only applies to TCP sockets
> in the stack; that doesn't cross into the VM. Presumably, packets coming
> from the VM don't have a socket, so sk_tx_queue_get always returns -1
> and netdev_pick_tx will steer each packet to a queue based on the
> currently running CPU, without any memory of the previous choice.

Any thoughts as to why explicitly binding the IRQs made things better, 
or for that matter why the scheduler would be moving the VM (or its 
vhost-net kernel thread I suppose?) around so much?

happy benchmarking,

rick jones


* Re: Disabling XPS for 4.4.0-1+ixgbe+OpenStack VM over a VLAN means 65% increase in netperf TCP_STREAM
  2016-02-04 19:38 ` Tom Herbert
  2016-02-04 19:57   ` Rick Jones
@ 2016-02-08 18:03   ` Rick Jones
  2016-02-08 19:03     ` Rick Jones
  1 sibling, 1 reply; 7+ messages in thread
From: Rick Jones @ 2016-02-08 18:03 UTC
  To: Tom Herbert; +Cc: Linux Kernel Network Developers

On 02/04/2016 11:38 AM, Tom Herbert wrote:
> I'd start with verifying that the XPS configuration is sane and then
> trying to reproduce the issue outside of VMs; if both of those are
> okay, then maybe look at some sort of bad interaction with the
> OpenStack configuration.

So, looking at bare iron, I can see something similar but not to the 
same degree (well, depending on one's metric of interest, I guess):


XPS being enabled for ixgbe here looks to be increasing receive-side 
service demand by 30%, but there is enough CPU available in this setup 
that it is only a loss of 2.5% or so in throughput.

stack@fcperf-cp1-comp0001-mgmt:~$ grep 87380 xps_on_* | awk '{t+=$6;r+=$9;s+=$10}END{print "throughput",t/NR,"recv sd",r/NR,"send sd",s/NR}'
throughput 9072.52 recv sd 0.8623 send sd 0.3686
stack@fcperf-cp1-comp0001-mgmt:~$ grep TCPOFO xps_on_* | awk '{sum += $NF}END{print "sum",sum/NR}'
sum 1621.1
stack@fcperf-cp1-comp0001-mgmt:~$ grep 87380 xps_off_* | awk '{t+=$6;r+=$9;s+=$10}END{print "throughput",t/NR,"recv sd",r/NR,"send sd",s/NR}'
throughput 9300.48 recv sd 0.6543 send sd 0.3606
stack@fcperf-cp1-comp0001-mgmt:~$ grep TCPOFO xps_off_* | awk '{sum += $NF}END{print "sum",sum/NR}'
sum 173.9
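
The awk fields above line up with netperf's classic-format output when 
CPU utilization is measured on both ends; each xps_on/xps_off file 
presumably came from a run along these lines (a reconstruction, not the 
exact harness used):

# classic-format TCP_STREAM run with local (-c) and remote (-C) CPU
# measurement; the result line starting with the 87380 receive socket
# size then carries throughput plus the two service demands in usec/KB
# ($PEER is a placeholder for the other bare-iron host)
netperf -c -C -H $PEER -t TCP_STREAM -l 30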

happy benchmarking,

rick jones

raw results at ftp://ftp.netperf.org/xps_4.4.0-1_ixgbe.tgz


* Re: Disabling XPS for 4.4.0-1+ixgbe+OpenStack VM over a VLAN means 65% increase in netperf TCP_STREAM
  2016-02-08 18:03   ` Rick Jones
@ 2016-02-08 19:03     ` Rick Jones
  0 siblings, 0 replies; 7+ messages in thread
From: Rick Jones @ 2016-02-08 19:03 UTC
  To: Tom Herbert; +Cc: Linux Kernel Network Developers


Shame on me for not including bare-iron TCP_RR:

stack@fcperf-cp1-comp0001-mgmt:~$ grep "1       1" xps_tcp_rr_on_* | awk 
'{t+=$6;r+=$9;s+=$10}END{print "throughput",t/NR,"recv sd",r/NR,"send 
sd",s/NR}'
throughput 18589.4 recv sd 21.6296 send sd 20.5931
stack@fcperf-cp1-comp0001-mgmt:~$ grep "1       1" xps_tcp_rr_off_* | 
awk '{t+=$6;r+=$9;s+=$10}END{print "throughput",t/NR,"recv 
sd",r/NR,"send sd",s/NR}'
throughput 20883.6 recv sd 19.6255 send sd 20.0178

So that is 12% on TCP_RR throughput.

Looks like XPS shouldn't be enabled by default for ixgbe.

happy benchmarking,

rick jones

