* HTB, HFSC, PIE, FIFO stuck on 2.4Gbit on default values
@ 2015-11-03 17:33 Denys Fedoryshchenko
  2015-11-03 19:11 ` Eric Dumazet
  0 siblings, 1 reply; 13+ messages in thread
From: Denys Fedoryshchenko @ 2015-11-03 17:33 UTC (permalink / raw)
  To: Netdev

Hi

Recently I was testing shaping over single 10G cards, for speeds up to 
3-4Gbps, and noticed an interesting effect.

Shaping scheme:
Incoming bandwidth comes to switch port, with access vlan 100
Outgoing bandwidth leaves switch port with access vlan 200
Linux with Intel X710 connected to trunk port, bridge created, eth0.100 
bridged to eth0.200
gso/gro/tso disabled (they don't work well with shapers)
Latest kernel, of course

Shapers are installed on eth0.200, and multiqueue seems to work on eth0 
in general (I see packets distributed over each queue). CPU load is 
very low (max 20% on a core, but usually below 5%).
I tried:
HTB with fq, pfifo, pie qdisc
HFSC with fq, pfifo, pie qdisc

After I run the shaper with default values, I can see traffic start to 
queue in the classes, and total traffic doesn't reach more than 2.4Gbit; 
if I remove the shaper, it goes straight to 4Gbit.
The only trick I found is running pie with burst 10000 cburst 10000 
in the leaf classes and 100000 in the root class (I think 10000 in the 
root class might work as well). If I change the discipline to fq, I am 
back at 2.4Gbit, but that might just be because fq is not intended to 
be used in an HTB leaf class.
So in my case burst/cburst solved the issue, but I suspect there may be 
a more elegant solution/tuning than putting in some random values?
Is there any particular reason why I am limited to ~2.4Gbit with any 
other settings?
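
To get a feel for the scale of the burst values mentioned above, the arithmetic can be sketched as follows. This is only an illustration: it assumes a 4 Gbit/s target rate, 1500-byte packets, and that a bare number given to burst/cburst is a byte count. The function names are made up for this example, and the sketch does not explain where the 2.4Gbit ceiling comes from.

```python
def burst_duration_us(burst_bytes: int, rate_bit_per_s: int) -> float:
    """Time in microseconds needed to send burst_bytes at rate_bit_per_s."""
    return burst_bytes * 8 * 1_000_000 / rate_bit_per_s


def packets_per_burst(burst_bytes: int, packet_bytes: int = 1500) -> int:
    """Number of whole packets of packet_bytes that fit into one burst."""
    return burst_bytes // packet_bytes


if __name__ == "__main__":
    rate = 4_000_000_000  # assumption: 4 Gbit/s target rate
    for burst in (10_000, 100_000):  # leaf and root values from the post
        print(burst, burst_duration_us(burst, rate), packets_per_burst(burst))
```

Under these assumptions a 10000-byte burst covers six full-size packets, or 20 microseconds of traffic at 4 Gbit/s, and a 100000-byte burst covers 200 microseconds.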


* Re: HTB, HFSC, PIE, FIFO stuck on 2.4Gbit on default values
  2015-11-03 17:33 HTB, HFSC, PIE, FIFO stuck on 2.4Gbit on default values Denys Fedoryshchenko
@ 2015-11-03 19:11 ` Eric Dumazet
  2015-11-03 19:17   ` Denys Fedoryshchenko
  0 siblings, 1 reply; 13+ messages in thread
From: Eric Dumazet @ 2015-11-03 19:11 UTC (permalink / raw)
  To: Denys Fedoryshchenko; +Cc: Netdev

On Tue, 2015-11-03 at 19:33 +0200, Denys Fedoryshchenko wrote:
> Hi
> 
> Recently i was testing shaping over single 10G cards, for speeds up to 
> 3-4Gbps, and noticed interesting effect.
> 
> Shaping scheme:
> Incoming bandwidth comes to switch port, with access vlan 100
> Outgoing bandwidth leaves switch port with access vlan 200
> Linux with Intel X710 connected to trunk port, bridge created, eth0.100 
> bridged to eth0.200
> gso/gro/tso disabled (they doesn't work nice with shapers)

Well, this seems like an urban legend to me.

Something that has been repeatedly copied/pasted on many web pages
since last century.

Given the nature of a qdisc (being protected by a spinlock), you
absolutely want to have some kind of aggregation.

I have a patch to allow a sysadmin to set a max GRO segs value for
incoming packets. You could play with it. Start with 4 segments,
allow GSO/TSO on the output and watch performance coming back.
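
The aggregation argument can be put into rough numbers. The sketch below is an assumption-laden illustration, not a measurement: it assumes full-size 1500-byte segments, ignores framing overhead, and assumes that every packet passing through the qdisc is handled once under the qdisc lock. The function name is hypothetical.

```python
def qdisc_packets_per_second(rate_bit_per_s: int,
                             segment_bytes: int = 1500,
                             segments_per_packet: int = 1) -> float:
    """Packets per second the qdisc must handle at a given bit rate.

    segments_per_packet models GRO/GSO aggregation: one aggregated packet
    carrying N segments passes through the qdisc as a single unit.
    """
    return rate_bit_per_s / (8 * segment_bytes * segments_per_packet)


if __name__ == "__main__":
    rate = 2_400_000_000  # the 2.4Gbit ceiling reported in the thread
    print(qdisc_packets_per_second(rate))                         # no aggregation
    print(qdisc_packets_per_second(rate, segments_per_packet=4))  # 4 segments
```

With four segments per packet, the qdisc sees a quarter of the packets for the same bit rate, which is the effect the suggested starting value of 4 segments is aiming at.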


* Re: HTB, HFSC, PIE, FIFO stuck on 2.4Gbit on default values
  2015-11-03 19:11 ` Eric Dumazet
@ 2015-11-03 19:17   ` Denys Fedoryshchenko
  2015-11-03 19:49     ` Eric Dumazet
  0 siblings, 1 reply; 13+ messages in thread
From: Denys Fedoryshchenko @ 2015-11-03 19:17 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Netdev

On 2015-11-03 21:11, Eric Dumazet wrote:
> On Tue, 2015-11-03 at 19:33 +0200, Denys Fedoryshchenko wrote:
>> Hi
>> 
>> Recently i was testing shaping over single 10G cards, for speeds up to
>> 3-4Gbps, and noticed interesting effect.
>> 
>> Shaping scheme:
>> Incoming bandwidth comes to switch port, with access vlan 100
>> Outgoing bandwidth leaves switch port with access vlan 200
>> Linux with Intel X710 connected to trunk port, bridge created, 
>> eth0.100
>> bridged to eth0.200
>> gso/gro/tso disabled (they doesn't work nice with shapers)
> 
> Well, this seems urban legend to me.
> 
> Something that is repeatedly copied/pasted on many web pages since last
> century.
> 
> Given the nature of qdisc (being protected by a spinlock), you
> absolutely want to have some kind of aggregation.
> 
> I have a patch to allow a sysadmin to set a max gro segs value to
> incoming packets. You could play with it. Start with 4 segments,
> allow GSO/TSO on the output and watch performance coming back.

It is not: I have more than 120 servers installed across the country 
(most of them handle little traffic) in forwarding mode, and the first 
thing I do on a forwarding setup is disable gro/gso/tso. It has also 
helped many ISPs on a forum I visit often: the first step in 
troubleshooting unreliable traffic forwarding is disabling offloading.
The problems start with incorrect shaping and end, in some cases, with 
network drivers spitting watchdog errors. Sometimes a shaper is not even 
necessary: plain forwarding with offload enabled can cause issues, but 
that might be a bug in the network drivers.
Should I try to reproduce and report? Sure, if anybody can look into 
this issue.


* Re: HTB, HFSC, PIE, FIFO stuck on 2.4Gbit on default values
  2015-11-03 19:17   ` Denys Fedoryshchenko
@ 2015-11-03 19:49     ` Eric Dumazet
  2015-11-03 20:01       ` David Miller
                         ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Eric Dumazet @ 2015-11-03 19:49 UTC (permalink / raw)
  To: Denys Fedoryshchenko; +Cc: Netdev

On Tue, 2015-11-03 at 21:17 +0200, Denys Fedoryshchenko wrote:
>  GSO/TSO on the output and watch performance coming back.
> 
> It is not, since i have more than 120 servers installed over country 
> (most of them handle small traffic), in forwarding mode, first thing i 
> am doing on forwarding setup - disabling gro/gso/tso. It is helped also 
> many ISP on their forum where i visit often, first thing in 
> troubleshooting unreliable network traffic forwarding - disabling 
> offloading.
> Because problem starts from incorrect shaping, and ends in some cases 
> with network drivers spitting watchdog errors. Sometimes even shaper not 
> necessary, just plain forwarding with offload enabled can cause issues, 
> but it might be bug in networking drivers.
> Should i try to reproduce and report? Sure if anybody can look into this 
> issue.


Well, I am telling you.

Say no to people advising to turn off GRO/TSO.

If you were the guy advising others to do so, it is time to see the
light.

Let's fix the bugs, if any, instead of spreading disinformation.

I am so tired of telling these very simple facts, guys.

If you prefer, continue to work on linux-2.0, but don't ask for help
on netdev.


* Re: HTB, HFSC, PIE, FIFO stuck on 2.4Gbit on default values
  2015-11-03 19:49     ` Eric Dumazet
@ 2015-11-03 20:01       ` David Miller
  2015-11-03 20:24       ` Denys Fedoryshchenko
  2015-11-03 21:04       ` Andrew
  2 siblings, 0 replies; 13+ messages in thread
From: David Miller @ 2015-11-03 20:01 UTC (permalink / raw)
  To: eric.dumazet; +Cc: nuclearcat, netdev

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Tue, 03 Nov 2015 11:49:16 -0800

> If you prefer, continue to work on linux-2.0 but don't ask help on
> netdev.

+1


* Re: HTB, HFSC, PIE, FIFO stuck on 2.4Gbit on default values
  2015-11-03 19:49     ` Eric Dumazet
  2015-11-03 20:01       ` David Miller
@ 2015-11-03 20:24       ` Denys Fedoryshchenko
  2015-11-03 21:23         ` Eric Dumazet
  2015-11-03 21:04       ` Andrew
  2 siblings, 1 reply; 13+ messages in thread
From: Denys Fedoryshchenko @ 2015-11-03 20:24 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Netdev

On 2015-11-03 21:49, Eric Dumazet wrote:
> 
> Well, I am telling you.
> 
> Say no to people advising to turn off GRO/TSO.
> 
> If you were the guy adviding others to do so, it is time to see the
> light.
> 
> Lets fix the bugs if any, instead of spreading disinformation.
> 
> I am so tired of telling these very simple facts guys.
> 
> If you prefer, continue to work on linux-2.0 but don't ask help on
> netdev.
I won't argue with that, you are right.
OK, this is a bit off-topic for the current case, since it is a 
different setup, but I know this one has easy-to-reproduce issues with 
offloading. The bug appears as soon as I enable tso/gso/gro. I lose 
access to the remote box, so the most I can do right now is:
ethtool -K eth0 tso on gso on gro on; sleep 5; ethtool -K eth0 tso off gso off gro off

No shapers, just plain NAT. I suspect it might be specific to the 
network card, but I am not sure.
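
The enable, wait, disable sequence above can also be scripted so that the offloads are switched off again even if the wait is interrupted. The following is a minimal sketch, not a tested tool: the helper names are hypothetical, the only external command used is the `ethtool -K` invocation quoted above, and it has to run as root on the box itself.

```python
import subprocess
import time

# The three offload features toggled in the command quoted above.
FEATURES = ("tso", "gso", "gro")


def ethtool_args(dev: str, state: str) -> list:
    """Build: ethtool -K <dev> tso <state> gso <state> gro <state>."""
    args = ["ethtool", "-K", dev]
    for feature in FEATURES:
        args += [feature, state]
    return args


def toggle_offloads(dev: str, seconds: float = 5,
                    run=subprocess.check_call, sleep=time.sleep) -> None:
    """Enable the offloads, wait, then always switch them off again."""
    run(ethtool_args(dev, "on"))
    try:
        sleep(seconds)
    finally:
        # Runs even if the sleep is interrupted, e.g. by Ctrl-C.
        run(ethtool_args(dev, "off"))
```

Calling `toggle_offloads("eth0")` would reproduce the five-second window from the one-liner; the `run` and `sleep` parameters exist only so the sequence can be checked without touching a real interface.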
Kernel 4.1.4
02:00.0 "Class 0200" "8086" "10d3" "8086" "357a"

driver: e1000e
version: 2.3.2-k
firmware-version: 0.13-4
bus-info: 0000:00:19.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: no

But after these messages, honestly, I don't know where to dig.

[6606122.904234] e1000e 0000:00:19.0 eth0: Detected Hardware Unit Hang:
[6606122.904234]   TDH                  <a5>
[6606122.904234]   TDT                  <ad>
[6606122.904234]   next_to_use          <ad>
[6606122.904234]   next_to_clean        <a3>
[6606122.904234] buffer_info[next_to_clean]:
[6606122.904234]   time_stamp           <12761e88c>
[6606122.904234]   next_to_watch        <a5>
[6606122.904234]   jiffies              <12761e928>
[6606122.904234]   next_to_watch.status <0>
[6606122.904234] MAC Status             <40080083>
[6606122.904234] PHY Status             <796d>
[6606122.904234] PHY 1000BASE-T Status  <3800>
[6606122.904234] PHY Extended Status    <3000>
[6606122.904234] PCI Status             <10>
[6606124.903733] e1000e 0000:00:19.0 eth0: Detected Hardware Unit Hang:
[6606124.903733]   TDH                  <a5>
[6606124.903733]   TDT                  <ad>
[6606124.903733]   next_to_use          <ad>
[6606124.903733]   next_to_clean        <a3>
[6606124.903733] buffer_info[next_to_clean]:
[6606124.903733]   time_stamp           <12761e88c>
[6606124.903733]   next_to_watch        <a5>
[6606124.903733]   jiffies              <12761e9f0>
[6606124.903733]   next_to_watch.status <0>
[6606124.903733] MAC Status             <40080083>
[6606124.903733] PHY Status             <796d>
[6606124.903733] PHY 1000BASE-T Status  <3800>
[6606124.903733] PHY Extended Status    <3000>
[6606124.903733] PCI Status             <10>
[6606126.903291] e1000e 0000:00:19.0 eth0: Detected Hardware Unit Hang:
[6606126.903291]   TDH                  <a5>
[6606126.903291]   TDT                  <ad>
[6606126.903291]   next_to_use          <ad>
[6606126.903291]   next_to_clean        <a3>
[6606126.903291] buffer_info[next_to_clean]:
[6606126.903291]   time_stamp           <12761e88c>
[6606126.903291]   next_to_watch        <a5>
[6606126.903291]   jiffies              <12761eab8>
[6606126.903291]   next_to_watch.status <0>
[6606126.903291] MAC Status             <40080083>
[6606126.903291] PHY Status             <796d>
[6606126.903291] PHY 1000BASE-T Status  <3800>
[6606126.903291] PHY Extended Status    <3000>
[6606126.903291] PCI Status             <10>
[6606127.912352] ------------[ cut here ]------------
[6606127.912566] WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:303 dev_watchdog+0x180/0x1e6()
[6606127.912877] NETDEV WATCHDOG: eth0 (e1000e): transmit queue 0 timed out
[6606127.913067] Modules linked in: xt_CLASSIFY xt_set ipt_REJECT nf_reject_ipv4 nf_log_ipv4 nf_log_common xt_LOG xt_recent ipt_MASQUERADE nf_nat_masquerade_ipv4 xt_nat xt_tcpudp nf_nat_pptp nf_nat_proto_gre nf_conntrack_pptp nf_conntrack_proto_gre ip_set_hash_net ip_set nfnetlink iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_filter ip_tables x_tables act_nat cls_u32 sch_ingress
[6606127.915843] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.1.4-build-0084 #1
[6606127.916035] Hardware name: Intel Corporation SandyBridge Platform/To be filled by O.E.M., BIOS S1200BT.86B.02.00.0041.120520121743 12/05/2012
[6606127.916356]  0000000000000009 ffff88042f003dd8 ffffffff81896390 00000000000000fb
[6606127.916903]  ffff88042f003e28 ffff88042f003e18 ffffffff810bc024 ffffffff820aad98
[6606127.917451]  ffffffff81830ab3 ffff8800be47c000 ffff88042a8dce00 0000000000000001
[6606127.917991] Call Trace:
[6606127.918175]  <IRQ>  [<ffffffff81896390>] dump_stack+0x45/0x57
[6606127.918429]  [<ffffffff810bc024>] warn_slowpath_common+0x97/0xb1
[6606127.918621]  [<ffffffff81830ab3>] ? dev_watchdog+0x180/0x1e6
[6606127.918812]  [<ffffffff810bc07f>] warn_slowpath_fmt+0x41/0x43
[6606127.919007]  [<ffffffffa003930f>] ? nf_ct_delete+0x1ef/0x202 [nf_conntrack]
[6606127.919201]  [<ffffffff81830ab3>] dev_watchdog+0x180/0x1e6
[6606127.919396]  [<ffffffffa0039322>] ? nf_ct_delete+0x202/0x202 [nf_conntrack]
[6606127.919589]  [<ffffffff81830933>] ? dev_graft_qdisc+0x65/0x65
[6606127.919781]  [<ffffffff810ef971>] call_timer_fn.isra.27+0x17/0x6d
[6606127.919975]  [<ffffffff810f0235>] run_timer_softirq+0x1a0/0x1c4
[6606127.920169]  [<ffffffff810be861>] __do_softirq+0xc3/0x1b2
[6606127.920359]  [<ffffffff810bea9c>] irq_exit+0x37/0x7c
[6606127.920551]  [<ffffffff81028c2a>] smp_apic_timer_interrupt+0x3e/0x4a
[6606127.920745]  [<ffffffff8189d9b8>] apic_timer_interrupt+0x68/0x70
[6606127.920937]  <EOI>  [<ffffffff8100a34c>] ? mwait_idle+0x68/0x7e
[6606127.921189]  [<ffffffff8100aa07>] arch_cpu_idle+0xa/0xc
[6606127.921379]  [<ffffffff810e22af>] cpu_startup_entry+0x207/0x238
[6606127.921574]  [<ffffffff818915d0>] rest_init+0x77/0x79
[6606127.921768]  [<ffffffff820c7de8>] start_kernel+0x3f6/0x403
[6606127.921957]  [<ffffffff820c77ea>] ? set_init_arg+0x55/0x55
[6606127.922149]  [<ffffffff820c7442>] x86_64_start_reservations+0x2a/0x2c
[6606127.922341]  [<ffffffff820c7500>] x86_64_start_kernel+0xbc/0xc0
[6606127.922532] ---[ end trace c83a65027c0499d1 ]---
[6606127.922734] e1000e 0000:00:19.0 eth0: Reset adapter unexpectedly
[6606131.827524] e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
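
The "Detected Hardware Unit Hang" reports above can be reduced to two numbers that are easier to compare between reports. The sketch below is an illustration under assumptions: it treats `time_stamp` as the jiffies value at which the stuck buffer was queued, treats TDH/TDT as the transmit descriptor head and tail, and ignores ring wrap-around. The parser and its names are hypothetical; only the sample lines are taken from the log.

```python
import re

# Matches lines such as "[6606122.904234]   TDH                  <a5>".
FIELD = re.compile(r"\]\s+(\S.*?)\s+<([0-9a-f]+)>\s*$")

# First hang report from the log above (only the fields used here).
SAMPLE = """\
[6606122.904234] e1000e 0000:00:19.0 eth0: Detected Hardware Unit Hang:
[6606122.904234]   TDH                  <a5>
[6606122.904234]   TDT                  <ad>
[6606122.904234]   next_to_use          <ad>
[6606122.904234]   next_to_clean        <a3>
[6606122.904234] buffer_info[next_to_clean]:
[6606122.904234]   time_stamp           <12761e88c>
[6606122.904234]   next_to_watch        <a5>
[6606122.904234]   jiffies              <12761e928>
"""


def parse_hang_report(text: str) -> dict:
    """Return {field name: integer value} for every "name <hex>" line."""
    fields = {}
    for line in text.splitlines():
        match = FIELD.search(line)
        if match:
            fields[match.group(1)] = int(match.group(2), 16)
    return fields


if __name__ == "__main__":
    report = parse_hang_report(SAMPLE)
    # Jiffies elapsed since the buffer at next_to_clean was queued.
    print(report["jiffies"] - report["time_stamp"])
    # Descriptors handed to the hardware but not yet fetched (no wrap-around).
    print(report["TDT"] - report["TDH"])
```

In all three reports TDH stays at a5 while jiffies keeps advancing, which is consistent with the transmit ring not making progress between reports.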


* Re: HTB, HFSC, PIE, FIFO stuck on 2.4Gbit on default values
  2015-11-03 19:49     ` Eric Dumazet
  2015-11-03 20:01       ` David Miller
  2015-11-03 20:24       ` Denys Fedoryshchenko
@ 2015-11-03 21:04       ` Andrew
  2015-11-03 22:02         ` Eric Dumazet
  2 siblings, 1 reply; 13+ messages in thread
From: Andrew @ 2015-11-03 21:04 UTC (permalink / raw)
  To: Netdev

Hi.

This is a common problem caused by the way hierarchical shapers are 
implemented (a global tree lock on packet dequeuing, so while one CPU 
looks for a parent class from which tokens can be borrowed, the other 
CPUs are waiting). It's mentioned even in academic publications :) You 
can read about it here: 
http://www.ijcset.com/docs/IJCSET13-04-04-113.pdf

I think that simply removing the lock would greatly improve performance, 
and race conditions on packet dequeuing shouldn't hurt anything except 
shaping accuracy. Other solutions look more complex.
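
The borrowing-under-one-lock pattern described above can be shown with a toy model. This is deliberately not the kernel's HTB code: the class names, the single `tree_lock`, and the rule that a borrowed packet is charged only to the lending ancestor are all simplifying assumptions made for illustration.

```python
import threading
from dataclasses import dataclass
from typing import Optional


@dataclass
class ShaperClass:
    name: str
    tokens: int                              # bytes this class may still send
    parent: Optional["ShaperClass"] = None   # class to borrow from, if any


class ToyShaper:
    """Toy hierarchical shaper guarded by one tree-wide lock."""

    def __init__(self) -> None:
        self.tree_lock = threading.Lock()

    def try_send(self, leaf: ShaperClass, packet_bytes: int) -> Optional[str]:
        """Return the name of the class that paid for the packet, or None."""
        # Every caller serializes here, whichever leaf it is sending from.
        with self.tree_lock:
            node = leaf
            while node is not None:
                if node.tokens >= packet_bytes:
                    node.tokens -= packet_bytes
                    return node.name
                node = node.parent   # out of tokens: try to borrow upwards
            return None
```

Because the walk up the tree reads and modifies token counts shared by several leaves, it happens entirely inside the lock, and that is the section in which the other CPUs wait.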

03.11.2015 21:49, Eric Dumazet wrote:
> On Tue, 2015-11-03 at 21:17 +0200, Denys Fedoryshchenko wrote:
>>   GSO/TSO on the output and watch performance coming back.
>>
>> It is not, since i have more than 120 servers installed over country
>> (most of them handle small traffic), in forwarding mode, first thing i
>> am doing on forwarding setup - disabling gro/gso/tso. It is helped also
>> many ISP on their forum where i visit often, first thing in
>> troubleshooting unreliable network traffic forwarding - disabling
>> offloading.
>> Because problem starts from incorrect shaping, and ends in some cases
>> with network drivers spitting watchdog errors. Sometimes even shaper not
>> necessary, just plain forwarding with offload enabled can cause issues,
>> but it might be bug in networking drivers.
>> Should i try to reproduce and report? Sure if anybody can look into this
>> issue.
>
> Well, I am telling you.
>
> Say no to people advising to turn off GRO/TSO.
>
> If you were the guy adviding others to do so, it is time to see the
> light.
>
> Lets fix the bugs if any, instead of spreading disinformation.
>
> I am so tired of telling these very simple facts guys.
>
> If you prefer, continue to work on linux-2.0 but don't ask help on
> netdev.


* Re: HTB, HFSC, PIE, FIFO stuck on 2.4Gbit on default values
  2015-11-03 20:24       ` Denys Fedoryshchenko
@ 2015-11-03 21:23         ` Eric Dumazet
  2015-11-04  4:12           ` Denys Fedoryshchenko
  0 siblings, 1 reply; 13+ messages in thread
From: Eric Dumazet @ 2015-11-03 21:23 UTC (permalink / raw)
  To: Denys Fedoryshchenko; +Cc: Netdev

On Tue, 2015-11-03 at 22:24 +0200, Denys Fedoryshchenko wrote:

> I wont argue on that, you are right.
> Ok, then it is a bit offtopic in current case, different setup, but i 
> know this one has easy to reproduce issues with offloading. but this is 
> bug related to that, directly appearing when i enable tso/gso/gro. I am 
> losing access to remote box, so max i can do right now:
> ethtool -K eth0 tso on gso on gro on; sleep 5;ethtool -K eth0 tso off 
> gso off gro off
> 
> No shapers, just plain nat. I suspect it might be specific to network 
> card, but not sure.
> 



What happens if you enable gro, but disable tso?

With GRO enabled, you'll get a good performance increase, as forwarding
and qdisc will use big packets.


* Re: HTB, HFSC, PIE, FIFO stuck on 2.4Gbit on default values
  2015-11-03 21:04       ` Andrew
@ 2015-11-03 22:02         ` Eric Dumazet
  0 siblings, 0 replies; 13+ messages in thread
From: Eric Dumazet @ 2015-11-03 22:02 UTC (permalink / raw)
  To: Andrew; +Cc: Netdev

On Tue, 2015-11-03 at 23:04 +0200, Andrew wrote:
> Hi.
> 
> This is common trouble due to hierarchical shapers realization (global 
> tree lock on packet dequeuing - so when one CPU looks for parent class 
> where tokens can be borrowed, other CPUs are waiting). It's mentioned 
> even in academic publications :) You can read about it here: 
> http://www.ijcset.com/docs/IJCSET13-04-04-113.pdf
> 
> I think that simple lock removing will greatly improve performance; and 
> race conditions on packets dequeuing shouldn't hurt anything except 
> shaping accuracy. Another solutions looks more complex.

Thanks Andrew, I am very well aware of qdisc spinlock contention.

And race conditions on packets dequeueing will _crash_ your host.
Plainly.

This is why I am advising using GRO in the first place.


* Re: HTB, HFSC, PIE, FIFO stuck on 2.4Gbit on default values
  2015-11-03 21:23         ` Eric Dumazet
@ 2015-11-04  4:12           ` Denys Fedoryshchenko
  2015-11-04  4:28             ` Eric Dumazet
  0 siblings, 1 reply; 13+ messages in thread
From: Denys Fedoryshchenko @ 2015-11-04  4:12 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Netdev

On 2015-11-03 23:23, Eric Dumazet wrote:
> On Tue, 2015-11-03 at 22:24 +0200, Denys Fedoryshchenko wrote:
> 
>> I wont argue on that, you are right.
>> Ok, then it is a bit offtopic in current case, different setup, but i
>> know this one has easy to reproduce issues with offloading. but this 
>> is
>> bug related to that, directly appearing when i enable tso/gso/gro. I 
>> am
>> losing access to remote box, so max i can do right now:
>> ethtool -K eth0 tso on gso on gro on; sleep 5;ethtool -K eth0 tso off
>> gso off gro off
>> 
>> No shapers, just plain nat. I suspect it might be specific to network
>> card, but not sure.
>> 
> 
> 
> 
> What happens if you enable gro, but disable tso ?
> 
> With GRO enabled, you'll get a good performance increase, as forwarding
> and qdisc will use big packets.
Just enabling gro or gso (or both together) is fine there. Thanks for 
the advice. It seems only tso is causing problems.
Also, I guess keeping tso disabled will solve my MTU issues (I once had 
an issue where traffic heading to PPPoE users with a 14xx MTU was 
blocked when offloading was enabled on the transit server, but I can't 
quickly reproduce it again).
Should I report this bug to the e1000e maintainers? On similar setups 
it happens only at specific locations, but I am not really sure what 
the reason could be.


* Re: HTB, HFSC, PIE, FIFO stuck on 2.4Gbit on default values
  2015-11-04  4:12           ` Denys Fedoryshchenko
@ 2015-11-04  4:28             ` Eric Dumazet
  2015-11-04  4:49               ` Denys Fedoryshchenko
  0 siblings, 1 reply; 13+ messages in thread
From: Eric Dumazet @ 2015-11-04  4:28 UTC (permalink / raw)
  To: Denys Fedoryshchenko; +Cc: Netdev

On Wed, 2015-11-04 at 06:12 +0200, Denys Fedoryshchenko wrote:
> Just enabling gro or gso (or together) is fine there. Thanks for advice. 
> Seems only tso causing problems.
> Also i guess if i keep tso disabled, it will solve my MTU issues (i had 
> once issue, that traffic heading to pppoe users,
> who have 14xx mtu, was blocked, when offloading enabled on transit 
> server, but can't reproduce it quickly again).
> Should i try to report to e1000e maintainers this bug? On similar setup 
> it is happening only at specific locations,
> but i am not definitely sure what can be the reason.

Not sure. Have you by any chance tried the latest kernel (linux-4.3)
for this e1000e issue?

Are you using vlan tags on this NIC?


* Re: HTB, HFSC, PIE, FIFO stuck on 2.4Gbit on default values
  2015-11-04  4:28             ` Eric Dumazet
@ 2015-11-04  4:49               ` Denys Fedoryshchenko
  2015-11-04  5:02                 ` Eric Dumazet
  0 siblings, 1 reply; 13+ messages in thread
From: Denys Fedoryshchenko @ 2015-11-04  4:49 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Netdev

On 2015-11-04 06:28, Eric Dumazet wrote:
> On Wed, 2015-11-04 at 06:12 +0200, Denys Fedoryshchenko wrote:
>> Just enabling gro or gso (or together) is fine there. Thanks for 
>> advice.
>> Seems only tso causing problems.
>> Also i guess if i keep tso disabled, it will solve my MTU issues (i 
>> had
>> once issue, that traffic heading to pppoe users,
>> who have 14xx mtu, was blocked, when offloading enabled on transit
>> server, but can't reproduce it quickly again).
>> Should i try to report to e1000e maintainers this bug? On similar 
>> setup
>> it is happening only at specific locations,
>> but i am not definitely sure what can be the reason.
> 
> Not sure, have you tried per chance latest kernel (linux-4.3) for this
> e1000e issue ?
> 
> Are you using vlan tags on this NIC ?
Tested now; it can be reproduced on 4.3 as well.
What is interesting: if I enable tso alone and leave gso/gro off, it 
works fine. gso+gro on, tso off: also fine.
But if I enable them all together, I trigger the bug.

[   71.699687] e1000e 0000:00:19.0 eth0: Detected Hardware Unit Hang:
[   71.699687]   TDH                  <96>
[   71.699687]   TDT                  <9c>
[   71.699687]   next_to_use          <9c>
[   71.699687]   next_to_clean        <92>
[   71.699687] buffer_info[next_to_clean]:
[   71.699687]   time_stamp           <fffc78bd>
[   71.699687]   next_to_watch        <96>
[   71.699687]   jiffies              <fffc843c>
[   71.699687]   next_to_watch.status <0>
[   71.699687] MAC Status             <40080083>
[   71.699687] PHY Status             <796d>
[   71.699687] PHY 1000BASE-T Status  <3800>
[   71.699687] PHY Extended Status    <3000>
[   71.699687] PCI Status             <10>
[   73.699241] e1000e 0000:00:19.0 eth0: Detected Hardware Unit Hang:
[   73.699241]   TDH                  <96>
[   73.699241]   TDT                  <9c>
[   73.699241]   next_to_use          <9c>
[   73.699241]   next_to_clean        <92>
[   73.699241] buffer_info[next_to_clean]:
[   73.699241]   time_stamp           <fffc78bd>
[   73.699241]   next_to_watch        <96>
[   73.699241]   jiffies              <fffc8c0c>
[   73.699241]   next_to_watch.status <0>
[   73.699241] MAC Status             <40080083>
[   73.699241] PHY Status             <796d>
[   73.699241] PHY 1000BASE-T Status  <3800>
[   73.699241] PHY Extended Status    <3000>
[   73.699241] PCI Status             <10>
[   75.698775] e1000e 0000:00:19.0 eth0: Detected Hardware Unit Hang:
[   75.698775]   TDH                  <96>
[   75.698775]   TDT                  <9c>
[   75.698775]   next_to_use          <9c>
[   75.698775]   next_to_clean        <92>
[   75.698775] buffer_info[next_to_clean]:
[   75.698775]   time_stamp           <fffc78bd>
[   75.698775]   next_to_watch        <96>
[   75.698775]   jiffies              <fffc93dc>
[   75.698775]   next_to_watch.status <0>
[   75.698775] MAC Status             <40080083>
[   75.698775] PHY Status             <796d>
[   75.698775] PHY 1000BASE-T Status  <3800>
[   75.698775] PHY Extended Status    <3000>
[   75.698775] PCI Status             <10>
[   76.709871] ------------[ cut here ]------------
[   76.710075] WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:303 dev_watchdog+0x17c/0x1e2()
[   76.710383] NETDEV WATCHDOG: eth0 (e1000e): transmit queue 0 timed out
[   76.710572] Modules linked in: xt_CLASSIFY xt_set ipt_REJECT nf_reject_ipv4 nf_log_ipv4 nf_log_common xt_LOG xt_recent ipt_MASQUERADE nf_nat_masquerade_ipv4 xt_nat xt_tcpudp nf_nat_pptp nf_nat_proto_gre nf_conntrack_pptp nf_conntrack_proto_gre ip_set_hash_net ip_set nfnetlink iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_filter ip_tables x_tables act_nat cls_u32 sch_ingress
[   76.713354] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.3.0-build-0087 #1
[   76.713547] Hardware name: Intel Corporation SandyBridge Platform/To be filled by O.E.M., BIOS S1200BT.86B.02.00.0041.120520121743 12/05/2012
[   76.713868]  0000000000000000 ffff88042f003e08 ffffffff81259d1d ffff88042f003e50
[   76.714413]  ffff88042f003e40 ffffffff810bda73 ffffffff818654a3 ffff88042c290000
[   76.714946]  ffff8800be758c00 0000000000000001 0000000000000000 ffff88042f003ea0
[   76.715481] Call Trace:
[   76.715657]  <IRQ>  [<ffffffff81259d1d>] dump_stack+0x44/0x55
[   76.715908]  [<ffffffff810bda73>] warn_slowpath_common+0x95/0xae
[   76.716095]  [<ffffffff818654a3>] ? dev_watchdog+0x17c/0x1e2
[   76.716281]  [<ffffffff810bdad3>] warn_slowpath_fmt+0x47/0x49
[   76.716470]  [<ffffffff810f4bcc>] ? mod_timer_pinned+0xaf/0xbe
[   76.716662]  [<ffffffff818654a3>] dev_watchdog+0x17c/0x1e2
[   76.716850]  [<ffffffff81865327>] ? dev_graft_qdisc+0x65/0x65
[   76.717039]  [<ffffffff810f4db8>] call_timer_fn.isra.26+0x17/0x6d
[   76.717227]  [<ffffffff810f4f80>] run_timer_softirq+0x172/0x193
[   76.717418]  [<ffffffff810c0588>] __do_softirq+0xba/0x1a9
[   76.717606]  [<ffffffff810c07bf>] irq_exit+0x37/0x7c
[   76.717795]  [<ffffffff81029c06>] smp_apic_timer_interrupt+0x3d/0x48
[   76.717988]  [<ffffffff818cdccc>] apic_timer_interrupt+0x7c/0x90
[   76.718179]  <EOI>  [<ffffffff8100aed5>] ? mwait_idle+0x68/0x7e
[   76.718436]  [<ffffffff8100b2d8>] arch_cpu_idle+0xa/0xc
[   76.718625]  [<ffffffff810e5822>] default_idle_call+0x27/0x29
[   76.718816]  [<ffffffff810e5945>] cpu_startup_entry+0x121/0x1da
[   76.719008]  [<ffffffff818c8970>] rest_init+0x77/0x79
[   76.719195]  [<ffffffff820cde02>] start_kernel+0x40f/0x41c
[   76.719384]  [<ffffffff820cd7e2>] ? set_init_arg+0x55/0x55
[   76.719572]  [<ffffffff820cd442>] x86_64_start_reservations+0x2a/0x2c
[   76.719764]  [<ffffffff820cd4ff>] x86_64_start_kernel+0xbb/0xbe
[   76.719955] ---[ end trace 6e1862989bd54a50 ]---
[   76.720145] e1000e 0000:00:19.0 eth0: Reset adapter unexpectedly
[   80.568051] e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
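
The jiffies values in the hang reports are only meaningful as time once the tick rate is known, and the thread does not state it. As a rough, hedged estimate, the tick rate can be inferred from two consecutive reports by dividing the jiffies difference by the difference of the log timestamps. The function name is hypothetical; the input values are copied from the 4.1.4 and 4.3 logs in this thread.

```python
def estimate_hz(t1_s: float, jiffies1: int, t2_s: float, jiffies2: int) -> float:
    """Estimate ticks per second from two (log timestamp, jiffies) pairs."""
    return (jiffies2 - jiffies1) / (t2_s - t1_s)


if __name__ == "__main__":
    # First two hang reports of the 4.1.4 log.
    hz_old = estimate_hz(6606122.904234, 0x12761e928,
                         6606124.903733, 0x12761e9f0)
    # First two hang reports of the 4.3 log.
    hz_new = estimate_hz(71.699687, 0xfffc843c,
                         73.699241, 0xfffc8c0c)
    print(round(hz_old), round(hz_new))
    # Age of the stuck buffer at the first 4.3 report, in seconds.
    print((0xfffc843c - 0xfffc78bd) / round(hz_new))
```

If the estimate holds, the two kernels were built with different tick rates, so raw jiffies differences from the two logs should not be compared directly.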


* Re: HTB, HFSC, PIE, FIFO stuck on 2.4Gbit on default values
  2015-11-04  4:49               ` Denys Fedoryshchenko
@ 2015-11-04  5:02                 ` Eric Dumazet
  0 siblings, 0 replies; 13+ messages in thread
From: Eric Dumazet @ 2015-11-04  5:02 UTC (permalink / raw)
  To: Denys Fedoryshchenko; +Cc: Netdev

On Wed, 2015-11-04 at 06:49 +0200, Denys Fedoryshchenko wrote:
> Tested now, can be reproduced on 4.3 as well.
> What is interesting, if i enable tso alone, and leave gso/gro off - it 
> is working fine. gso+gro on, tso off - fine also.
> But if i enable them all together - i trigger the bug.

If GRO is disabled, no TSO packet ever arrives at the NIC to trigger
the TSO bug ;) So leaving TSO on or off does not really matter.

If GRO is enabled and TSO disabled, the core networking stack will
'segment' the GRO packets right after qdisc dequeue, before giving
individual (non-TSO) segments to the NIC.

It enables skb->xmit_more processing on e1000e, so you should get better
performance, even if TSO happens to be broken on e1000e.
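
The explanation above amounts to a small truth table, sketched here as code. It is a simplified model under assumptions: it covers forwarded traffic only (as on the NAT box in this thread), ignores the separate gso flag, and the function names are invented for the example.

```python
def tso_frames_reach_nic(gro: bool, tso: bool) -> bool:
    """True if aggregated packets are handed to the NIC for hardware TSO.

    On a forwarding box, large packets exist only if GRO built them on
    receive, and they reach the NIC unsegmented only if TSO is enabled.
    """
    return gro and tso


def segmented_by_stack(gro: bool, tso: bool) -> bool:
    """True if the core stack splits GRO packets after qdisc dequeue."""
    return gro and not tso


if __name__ == "__main__":
    for gro in (False, True):
        for tso in (False, True):
            print(gro, tso, tso_frames_reach_nic(gro, tso),
                  segmented_by_stack(gro, tso))
```

The model agrees with the three combinations reported in the thread: tso alone and gro without tso never hand a TSO packet to the NIC, while everything enabled does.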
