[BUG] net: performance regression on ixgbe (Intel 82599EB 10-Gigabit NIC)


* [BUG] net: performance regression on ixgbe (Intel 82599EB 10-Gigabit NIC)
@ 2015-12-03 16:26 Otto Sabart
  2015-12-03 17:15 ` Alexander Duyck
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Otto Sabart @ 2015-12-03 16:26 UTC (permalink / raw)
  To: netdev; +Cc: Jeff Kirsher, Jirka Hladky, Adam Okuliar, Kamil Kolakowski

Hello netdev,
I think I have found a performance regression on ixgbe (Intel 82599EB
10-Gigabit NIC) on v4.4-rc3. I have been seeing this problem since
v4.4-rc1.

You can find the bug report here [0].

Can somebody take a look at it?

[0] https://bugzilla.redhat.com/show_bug.cgi?id=1288124

thanks,
Ota


* Re: [BUG] net: performance regression on ixgbe (Intel 82599EB 10-Gigabit NIC)
  2015-12-03 16:26 [BUG] net: performance regression on ixgbe (Intel 82599EB 10-Gigabit NIC) Otto Sabart
@ 2015-12-03 17:15 ` Alexander Duyck
  2015-12-07 11:28   ` Otto Sabart
  2015-12-04 18:13 ` Rick Jones
  2015-12-04 21:31 ` Rustad, Mark D
  2 siblings, 1 reply; 8+ messages in thread
From: Alexander Duyck @ 2015-12-03 17:15 UTC (permalink / raw)
  To: Otto Sabart, netdev
  Cc: Jirka Hladky, e1000-devel, Adam Okuliar, Kamil Kolakowski

On 12/03/2015 08:26 AM, Otto Sabart wrote:
> Hello netdev,
> I probably found a performance regression on ixgbe (Intel 82599EB
> 10-Gigabit NIC) on v4.4-rc3. I am able to see this problem since
> v4.4-rc1.
>
> The bug report you can find here [0].
>
> Can somebody take a look at it?
>
> [0] https://bugzilla.redhat.com/show_bug.cgi?id=1288124
>
>
> thanks,
> Ota

Hi Ota,

It looks like there were a few changes that went through that could be
causing the regression.  The most obvious one that jumps out at me is
commit 72bfd32d2f84 ("ixgbe: disable LRO by default").  As such, one
thing you might try is turning LRO back on via ethtool -K to see if
that is the issue you are seeing.

If that doesn't resolve the issue, it would be useful if you could try
doing a git bisect to narrow this down to a specific patch.
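
As a rough sketch of the two suggestions above (not something taken from
the bug report): the interface name eth0 and the good tag v4.3 are
assumptions, and run is a hypothetical helper that only prints each
command unless DRY_RUN=0 is set. Note that lowercase -k lists the offload
settings while uppercase -K changes them.

```shell
#!/bin/sh
# Echo each command; execute it only when DRY_RUN=0 (default is a dry run).
# "run" is a hypothetical helper for this sketch, not part of any tool.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

# Show the current offload settings, then switch LRO on.
# $1: interface name (eth0 in the usage note is an assumption)
try_lro() {
    run ethtool -k "$1"          # lowercase -k: list offload features
    run ethtool -K "$1" lro on   # uppercase -K: change a feature
}

# Start a bisect between a known-bad and a known-good revision.
# $1: bad revision, $2: good revision (v4.3 as "good" is an assumption)
start_bisect() {
    run git bisect start
    run git bisect bad "$1"
    run git bisect good "$2"
}
```

For example, "try_lro eth0" prints the two ethtool commands, and
"DRY_RUN=0 try_lro eth0" would actually apply the change (root needed).
"start_bisect v4.4-rc1 v4.3" prints the bisect setup for the range
discussed in this thread.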

Thanks.

- Alex




* Re: [BUG] net: performance regression on ixgbe (Intel 82599EB 10-Gigabit NIC)
  2015-12-03 16:26 [BUG] net: performance regression on ixgbe (Intel 82599EB 10-Gigabit NIC) Otto Sabart
  2015-12-03 17:15 ` Alexander Duyck
@ 2015-12-04 18:13 ` Rick Jones
  2015-12-10 14:18   ` Otto Sabart
  2015-12-04 21:31 ` Rustad, Mark D
  2 siblings, 1 reply; 8+ messages in thread
From: Rick Jones @ 2015-12-04 18:13 UTC (permalink / raw)
  To: Otto Sabart, netdev
  Cc: Jeff Kirsher, Jirka Hladky, Adam Okuliar, Kamil Kolakowski

On 12/03/2015 08:26 AM, Otto Sabart wrote:
> Hello netdev,
> I probably found a performance regression on ixgbe (Intel 82599EB
> 10-Gigabit NIC) on v4.4-rc3. I am able to see this problem since
> v4.4-rc1.
>
> The bug report you can find here [0].
>
> Can somebody take a look at it?
>
> [0] https://bugzilla.redhat.com/show_bug.cgi?id=1288124

A few comments/questions based on reading that bug report:

*)  It is good to be binding netperf and netserver - helps with 
reproducibility, but why the two -T options?  A brief look at 
src/netsh.c suggests it will indeed set the two binding options 
separately but that is merely a side-effect of how I wrote the code.  It 
wasn't an intentional thing.

*) Is irqbalance disabled and the IRQs set the same each time, or might 
there be variability possible there?  Each of the five netperf runs will 
be a different four-tuple which means each may (or may not) get RSS 
hashed/etc differently.

*) It is perhaps adding duct tape to already-present belt and 
suspenders, but is power-management set to a fixed state on the systems 
involved? (Since this seems to be ProLiant G7s going by the legends on 
the charts, either static high perf or static low power I would imagine)

*) What is the difference before/after for the service demands?  The 
netperf tests being run are asking for CPU utilization but I don't see 
the service demand change being summarized.

*) Does a specific CPU on one side or the other saturate? 
(LOCAL_CPU_PEAK_UTIL, LOCAL_CPU_PEAK_ID, REMOTE_CPU_PEAK_UTIL, 
REMOTE_CPU_PEAK_ID output selectors)

*) What are the processors involved?  Presumably the "other system" is 
fixed?

*) It is important to remember that the socket buffer sizes reported with
the default output are *just* what they were when the data socket was
created.  If you want to see what they became by the end of the test,
you need to use the appropriate output selectors (or, IIRC, invoking the
tests as "omni" rather than tcp_stream/tcp_maerts will report the end
values rather than the start ones).

happy benchmarking,

rick jones


* Re: [BUG] net: performance regression on ixgbe (Intel 82599EB 10-Gigabit NIC)
  2015-12-03 16:26 [BUG] net: performance regression on ixgbe (Intel 82599EB 10-Gigabit NIC) Otto Sabart
  2015-12-03 17:15 ` Alexander Duyck
  2015-12-04 18:13 ` Rick Jones
@ 2015-12-04 21:31 ` Rustad, Mark D
  2 siblings, 0 replies; 8+ messages in thread
From: Rustad, Mark D @ 2015-12-04 21:31 UTC (permalink / raw)
  To: Otto Sabart
  Cc: netdev, Kirsher, Jeffrey T, Jirka Hladky, Adam Okuliar, Kamil Kolakowski


Otto Sabart <osabart@redhat.com> wrote:

> I probably found a performance regression on ixgbe (Intel 82599EB
> 10-Gigabit NIC) on v4.4-rc3. I am able to see this problem since
> v4.4-rc1.
> 
> The bug report you can find here [0].
> 
> Can somebody take a look at it?
> 
> [0] https://bugzilla.redhat.com/show_bug.cgi?id=1288124

A recent patch has disabled LRO by default because it is incompatible with forwarding. If you aren't interested in forwarding, you might try enabling LRO with ethtool.

--
Mark Rustad, Networking Division, Intel Corporation



* Re: [BUG] net: performance regression on ixgbe (Intel 82599EB 10-Gigabit NIC)
  2015-12-03 17:15 ` Alexander Duyck
@ 2015-12-07 11:28   ` Otto Sabart
  2015-12-07 18:25     ` Rick Jones
  0 siblings, 1 reply; 8+ messages in thread
From: Otto Sabart @ 2015-12-07 11:28 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: netdev, Jeff Kirsher, Jirka Hladky, Adam Okuliar,
	Kamil Kolakowski, e1000-devel

> Hi Ota,
> 
> It looks like there were a few changes that went through that could be
> causing the regression.  The most obvious one that jumps out at me is commit
> 72bfd32d2f84 ("ixgbe: disable LRO by default").  As such one thing you might
> try doing is turning on LRO support via ethtool -K to see if that is the
> issue you are seeing.
> 

Hi Alex,
enabling LRO resolved the problem.

Thank you!

Ota


* Re: [BUG] net: performance regression on ixgbe (Intel 82599EB 10-Gigabit NIC)
  2015-12-07 11:28   ` Otto Sabart
@ 2015-12-07 18:25     ` Rick Jones
  0 siblings, 0 replies; 8+ messages in thread
From: Rick Jones @ 2015-12-07 18:25 UTC (permalink / raw)
  To: Otto Sabart, Alexander Duyck
  Cc: netdev, Jeff Kirsher, Jirka Hladky, Adam Okuliar,
	Kamil Kolakowski, e1000-devel

On 12/07/2015 03:28 AM, Otto Sabart wrote:
>> Hi Ota,
>>
>> It looks like there were a few changes that went through that could be
>> causing the regression.  The most obvious one that jumps out at me is commit
>> 72bfd32d2f84 ("ixgbe: disable LRO by default").  As such one thing you might
>> try doing is turning on LRO support via ethtool -K to see if that is the
>> issue you are seeing.
>>
>
> Hi Alex,
> enabling LRO resolved the problem.

So you had the same NIC and CPUs and whatnot on both sides?

rick jones


* Re: [BUG] net: performance regression on ixgbe (Intel 82599EB 10-Gigabit NIC)
  2015-12-04 18:13 ` Rick Jones
@ 2015-12-10 14:18   ` Otto Sabart
  2015-12-10 17:49     ` Rick Jones
  0 siblings, 1 reply; 8+ messages in thread
From: Otto Sabart @ 2015-12-10 14:18 UTC (permalink / raw)
  To: Rick Jones
  Cc: netdev, Jeff Kirsher, Jirka Hladky, Adam Okuliar, Kamil Kolakowski

Hi Rick,

> *)  It is good to be binding netperf and netserver - helps with
> reproducibility, but why the two -T options?  A brief look at src/netsh.c
> suggests it will indeed set the two binding options separately but that is
> merely a side-effect of how I wrote the code.  It wasn't an intentional
> thing.

It's because of the way we generate arguments for netperf.
'-T 0, -T ,0' does the same as '-T 0,0', but the first option is more
convenient for us.

> *) Is irqbalance disabled and the IRQs set the same each time, or might
> there be variability possible there?  Each of the five netperf runs will be
> a different four-tuple which means each may (or may not) get RSS hashed/etc
> differently.

The irqbalance is disabled on all systems.

Can you suggest whether we need to assign IRQs manually? Which IRQs
should we pin to which CPU?

> *) It is perhaps adding duct tape to already-present belt and suspenders,
> but is power-management set to a fixed state on the systems involved? (Since
> this seems to be ProLiant G7s going by the legends on the charts, either
> static high perf or static low power I would imagine)

Power management is set to OS-Control in the BIOS, which effectively
means that the _BIOS_ does not do any power management at all.

> *) What is the difference before/after for the service demands?  The netperf
> tests being run are asking for CPU utilization but I don't see the service
> demand change being summarized.

Unfortunately we do not have any summary chart for service demands;
we will add one shortly.

> *) Does a specific CPU on one side or the other saturate?
> (LOCAL_CPU_PEAK_UTIL, LOCAL_CPU_PEAK_ID, REMOTE_CPU_PEAK_UTIL,
> REMOTE_CPU_PEAK_ID output selectors)

We are sort of stuck in the stone age. We still use the old-fashioned
tcp/udp migrated tests, but we plan to switch to omni.

> *) What are the processors involved?  Presumably the "other system" is
> fixed?

In this case:

hp-dl380g7 - $ lscpu:
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                24
On-line CPU(s) list:   0-23
Thread(s) per core:    2
Core(s) per socket:    6
Socket(s):             2
NUMA node(s):          2
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 44
Model name:            Intel(R) Xeon(R) CPU           X5650  @ 2.67GHz
Stepping:              2
CPU MHz:               2660.000
BogoMIPS:              5331.27
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              12288K
NUMA node0 CPU(s):     0,2,4,6,8,10,12,14,16,18,20,22
NUMA node1 CPU(s):     1,3,5,7,9,11,13,15,17,19,21,23


hp-dl385g7 - $ lscpu:
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                24
On-line CPU(s) list:   0-23
Thread(s) per core:    1
Core(s) per socket:    12
Socket(s):             2
NUMA node(s):          4
Vendor ID:             AuthenticAMD
CPU family:            16
Model:                 9
Model name:            AMD Opteron(tm) Processor 6172
Stepping:              1
CPU MHz:               2100.000
BogoMIPS:              4200.39
Virtualization:        AMD-V
L1d cache:             64K
L1i cache:             64K
L2 cache:              512K
L3 cache:              5118K
NUMA node0 CPU(s):     0,2,4,6,8,10
NUMA node1 CPU(s):     12,14,16,18,20,22
NUMA node2 CPU(s):     13,15,17,19,21,23
NUMA node3 CPU(s):     1,3,5,7,9,11


Thank you for your hints!

Ota


* Re: [BUG] net: performance regression on ixgbe (Intel 82599EB 10-Gigabit NIC)
  2015-12-10 14:18   ` Otto Sabart
@ 2015-12-10 17:49     ` Rick Jones
  0 siblings, 0 replies; 8+ messages in thread
From: Rick Jones @ 2015-12-10 17:49 UTC (permalink / raw)
  To: Otto Sabart
  Cc: netdev, Jeff Kirsher, Jirka Hladky, Adam Okuliar, Kamil Kolakowski

On 12/10/2015 06:18 AM, Otto Sabart wrote:
>> *) Is irqbalance disabled and the IRQs set the same each time, or might
>> there be variability possible there?  Each of the five netperf runs will be
>> a different four-tuple which means each may (or may not) get RSS hashed/etc
>> differently.
>
> The irqbalance is disabled on all systems.
>
> Can you suggest whether we need to assign IRQs manually? Which IRQs
> should we pin to which CPU?

Likely as not it will depend on your goals.  When I want single-stream 
results, I will tend to disable irqbalance and set all the IRQs to one 
CPU in the system (often as not CPU0 but that is as much habit as 
anything else).  The idea is to clamp-down on any source of run-to-run 
variation.  I will also sometimes alter where I bind netperf/netserver 
to show the effects (especially on service demand) when 
netperf/netserver run on the same CPU as the IRQ, a thread in the same 
core as the IRQ, a core in the same processor as the IRQ and/or a core 
in another processor.  Unless all the IRQs are pointed at the same CPU 
(or I always specify the same, full four-tuple for addressing and wait 
for TIME_WAIT) that can be a challenge to keep straight.

When I want to measure aggregate, I either let irqbalance do its thing 
and run a bunch of warm-up tests, or simply peanut-butter the IRQs 
across the CPUs with variations on the theme of:

grep 'eth[23]' /proc/interrupts | \
  awk -F ":" -v cpus=12 '{mask = 1 * 2^(count++ % cpus);
    printf("echo %x > /proc/irq/%d/smp_affinity\n",mask,$1)}' | sh

How one might structure/alter that pipeline will depend on the CPU
enumeration.  That one was from a 2x6 core system where I didn't want to
hit the second thread of each core, and the enumeration put the first
twelve CPUs on thread 0 of each core of both processors.
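
A minimal sketch of the same pipeline wrapped in a function, so the
generated commands can be inspected before they are piped to sh. The
function name irq_affinity_cmds and its optional third argument (an
alternative interrupts table, handy for testing) are illustrative
assumptions, not part of the one-liner above. With a CPU count of 1 it
degenerates to the single-stream setup, with every IRQ on CPU0.

```shell
#!/bin/sh
# Print (do not execute) the commands that would spread the IRQs of the
# matching interfaces round-robin over the first N CPUs.
#   $1: extended regex matching interface names, e.g. 'eth[23]'
#   $2: number of CPUs to spread over (1 pins everything to CPU0)
#   $3: interrupts table to read (defaults to /proc/interrupts)
# Assumes $2 <= 32 so each mask fits in a single 32-bit hex word.
irq_affinity_cmds() {
    grep -E "$1" "${3:-/proc/interrupts}" |
        awk -F ':' -v cpus="$2" '{
            # one bit per CPU: 1, 2, 4, ... wrapping after "cpus" entries
            mask = 2 ^ (count++ % cpus)
            # $1 is the IRQ number (the text before the first colon)
            printf("echo %x > /proc/irq/%d/smp_affinity\n", mask, $1)
        }'
}
```

Running "irq_affinity_cmds 'eth[23]' 12" only lists the echo commands;
appending "| sh" as root would apply them.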

>> *) It is perhaps adding duct tape to already-present belt and suspenders,
>> but is power-management set to a fixed state on the systems involved? (Since
>> this seems to be ProLiant G7s going by the legends on the charts, either
>> static high perf or static low power I would imagine)
>
> Power management is set to OS-Control in the BIOS, which effectively
> means that the _BIOS_ does not do any power management at all.

Probably just as well :)

>> *) What is the difference before/after for the service demands?  The netperf
>> tests being run are asking for CPU utilization but I don't see the service
>> demand change being summarized.
>
> Unfortunately we do not have any summary chart for service demands;
> we will add one shortly.
>
>> *) Does a specific CPU on one side or the other saturate?
>> (LOCAL_CPU_PEAK_UTIL, LOCAL_CPU_PEAK_ID, REMOTE_CPU_PEAK_UTIL,
>> REMOTE_CPU_PEAK_ID output selectors)
>
> We are sort of stuck in the stone age. We still use the old-fashioned
> tcp/udp migrated tests, but we plan to switch to omni.

Well, you don't have to invoke with -t omni to make use of the output 
selectors - just add the -O (or -o or -k) test-specific option.
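
For instance, a classic TCP_STREAM invocation with the peak-CPU selectors
might look like the sketch below. The helper name netperf_cmd and the
address 192.0.2.1 are placeholders, THROUGHPUT is included on the
assumption that it is a valid selector name in your netperf version, and
the function only prints the command line rather than running it.

```shell
#!/bin/sh
# Print a netperf command line that keeps the classic test name but adds
# output selectors through the test-specific -O option.
#   $1: remote host running netserver
#   $2: test name, e.g. TCP_STREAM or TCP_MAERTS
netperf_cmd() {
    # Peak-CPU selectors as named in this thread; THROUGHPUT is an assumption.
    selectors="THROUGHPUT,LOCAL_CPU_PEAK_UTIL,LOCAL_CPU_PEAK_ID"
    selectors="$selectors,REMOTE_CPU_PEAK_UTIL,REMOTE_CPU_PEAK_ID"
    # -c/-C request local/remote CPU utilization; -T 0,0 binds both ends
    # to CPU 0; everything after "--" is test-specific.
    echo "netperf -H $1 -t $2 -c -C -T 0,0 -- -O $selectors"
}
```

"netperf_cmd 192.0.2.1 TCP_STREAM" prints the command so it can be
reviewed or pasted into the existing test harness.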

>
>> *) What are the processors involved?  Presumably the "other system" is
>> fixed?
>
> In this case:
>
> hp-dl380g7 - $ lscpu:
> Architecture:          x86_64
> CPU op-mode(s):        32-bit, 64-bit
> Byte Order:            Little Endian
> CPU(s):                24
> On-line CPU(s) list:   0-23
> Thread(s) per core:    2
> Core(s) per socket:    6
> Socket(s):             2
> NUMA node(s):          2
> Vendor ID:             GenuineIntel
> CPU family:            6
> Model:                 44
> Model name:            Intel(R) Xeon(R) CPU           X5650  @ 2.67GHz
> Stepping:              2
> CPU MHz:               2660.000
> BogoMIPS:              5331.27
> Virtualization:        VT-x
> L1d cache:             32K
> L1i cache:             32K
> L2 cache:              256K
> L3 cache:              12288K
> NUMA node0 CPU(s):     0,2,4,6,8,10,12,14,16,18,20,22
> NUMA node1 CPU(s):     1,3,5,7,9,11,13,15,17,19,21,23
>
>
> hp-dl385g7 - $ lscpu:
> Architecture:          x86_64
> CPU op-mode(s):        32-bit, 64-bit
> Byte Order:            Little Endian
> CPU(s):                24
> On-line CPU(s) list:   0-23
> Thread(s) per core:    1
> Core(s) per socket:    12
> Socket(s):             2
> NUMA node(s):          4
> Vendor ID:             AuthenticAMD
> CPU family:            16
> Model:                 9
> Model name:            AMD Opteron(tm) Processor 6172
> Stepping:              1
> CPU MHz:               2100.000
> BogoMIPS:              4200.39
> Virtualization:        AMD-V
> L1d cache:             64K
> L1i cache:             64K
> L2 cache:              512K
> L3 cache:              5118K
> NUMA node0 CPU(s):     0,2,4,6,8,10
> NUMA node1 CPU(s):     12,14,16,18,20,22
> NUMA node2 CPU(s):     13,15,17,19,21,23
> NUMA node3 CPU(s):     1,3,5,7,9,11

I guess that helps explain why there were such large differences in the
deltas between TCP_STREAM and TCP_MAERTS, since the per-core
"horsepower" was not the same on both sides, and why LRO on/off could
also have affected the TCP_STREAM results. (When LRO was off, it was off
on both sides, and when it was on, it was on on both, yes?)

happy benchmarking,

rick jones

