* Re: 3.4.1 and 3.5-rc1 Packet lost at 250Mb/s
@ 2012-07-06 9:47 Nieścierowicz Adam
2012-07-06 10:13 ` Eric Dumazet
0 siblings, 1 reply; 18+ messages in thread
From: Nieścierowicz Adam @ 2012-07-06 9:47 UTC (permalink / raw)
To: Eric Dumazet, Netdev
Hello,
Can I send something that will help determine the cause of the problem?
On 08.06.2012 11:41, Eric Dumazet wrote:
> On Fri, 2012-06-08 at 10:58 +0200, Nieścierowicz Adam wrote:
>
>> Hello, recently we changed on the router kernel from 2.6.38.1 to
>> 3.4.1
>> and noticed 30% packet loss when traffic increases up to 250MB / s.
>> Similar is for kernel 3.5-rc1 Here a link to ifstat
>> http://wklej.org/id/767577/
>
> You should give as many details as possible about your setup (hardware,
> software)
>
> lspci
> cat /proc/cpuinfo
> cat /proc/interrupts
> ifconfig -a
> tc -s -d qdisc
> dmesg
> netstat -s
We are currently running 2.6.38.1 and traffic is 100 Mb/s.
lspci: http://wklej.org/id/769102/
/proc/cpuinfo: http://wklej.org/id/769104/
/proc/interrupts: http://wklej.org/id/769106/
ifconfig -a: http://wklej.org/id/769108/
tc -s -d qdisc: http://wklej.org/id/769109/
dmesg: it only contains some logs from iptables
netstat -s: http://wklej.org/id/769110/
lsmod: http://wklej.org/id/769117/
/proc/net/softnet_stat: http://wklej.org/id/769116/
* Re: 3.4.1 and 3.5-rc1 Packet lost at 250Mb/s
2012-07-06 9:47 3.4.1 and 3.5-rc1 Packet lost at 250Mb/s Nieścierowicz Adam
@ 2012-07-06 10:13 ` Eric Dumazet
0 siblings, 0 replies; 18+ messages in thread
From: Eric Dumazet @ 2012-07-06 10:13 UTC (permalink / raw)
To: adam.niescierowicz; +Cc: Netdev
On Fri, 2012-07-06 at 11:47 +0200, Nieścierowicz Adam wrote:
> Hello,
> Can I send something that will help determine the cause of the problem?
>
>
> On 08.06.2012 11:41, Eric Dumazet wrote:
>
> > On Fri, 2012-06-08 at 10:58 +0200, Nieścierowicz Adam wrote:
> >
> >> Hello, recently we changed on the router kernel from 2.6.38.1 to
> >> 3.4.1
> >> and noticed 30% packet loss when traffic increases up to 250MB / s.
> >> Similar is for kernel 3.5-rc1 Here a link to ifstat
> >> http://wklej.org/id/767577/
> >
> > You should give as many details as possible about your setup (hardware,
> > software)
> >
> > lspci
> > cat /proc/cpuinfo
> > cat /proc/interrupts
> > ifconfig -a
> > tc -s -d qdisc
> > dmesg
> > netstat -s
>
> currently running on 2.6.38.1 and traffic is 100Mb / s
>
> lspci: http://wklej.org/id/769102/
> /proc/cpuinfo: http://wklej.org/id/769104/
> /proc/interrupts: http://wklej.org/id/769106/
> ifconfig -a: http://wklej.org/id/769108/
> tc -s -d qdisc: http://wklej.org/id/769109/
> dmesg: here are some logs from iptables
> netstat -s: http://wklej.org/id/769110/
> lsmod: http://wklej.org/id/769117/
> /proc/net/softnet_stat: http://wklej.org/id/769116/
The same information for the 3.5-rcX kernel would be nice.
What NIC is eth0? (dmesg please)
It seems all network traffic on 2.6.38 is handled by a single CPU (cpu0),
as seen in /proc/interrupts.
I suspect that with 3.4 or 3.5 kernels, traffic is handled by many CPUs
and they hit false sharing and contention.
You would probably get better performance with some affinity tuning.
For example:
eth0 serviced by cpu0
eth2 serviced by cpu1
eth3 serviced by cpu2
eth5 serviced by cpu3
and so on...
Check and/or set /proc/irq/${NUM}/smp_affinity
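The per-interface pinning suggested above comes down to writing a one-bit hexadecimal CPU mask into each IRQ's smp_affinity file. A minimal Python sketch of that step follows. The IRQ numbers (40 to 43) are hypothetical placeholders that would have to be read from /proc/interrupts on the real machine, and the helper names are made up for illustration. It only prints the commands and does not touch /proc.

```python
from typing import Dict, List


def cpu_mask(cpu: int) -> str:
    """Return the hex bitmask selecting one CPU (bit N stands for cpuN)."""
    # Kept to CPUs 0-31 so the mask fits a single 32-bit group; larger
    # masks use comma-separated 32-bit groups, which this sketch skips.
    if not 0 <= cpu < 32:
        raise ValueError("this sketch only handles cpu0..cpu31")
    return format(1 << cpu, "x")


def affinity_commands(irq_to_cpu: Dict[int, int]) -> List[str]:
    """Build one shell command per IRQ that pins it to a single CPU."""
    return [
        f"echo {cpu_mask(cpu)} > /proc/irq/{irq}/smp_affinity"
        for irq, cpu in sorted(irq_to_cpu.items())
    ]


# Hypothetical IRQ numbers for eth0, eth2, eth3, eth5 (an assumption,
# not taken from the thread).
plan = {40: 0, 41: 1, 42: 2, 43: 3}
for command in affinity_commands(plan):
    print(command)
```

The printed commands need root to take effect, and a running irqbalance daemon may overwrite the masks again.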
* Re: 3.4.1 and 3.5-rc1 Packet lost at 250Mb/s
2012-10-09 19:56 ` Nieścierowicz Adam
@ 2012-10-10 4:59 ` Jeff Kirsher
0 siblings, 0 replies; 18+ messages in thread
From: Jeff Kirsher @ 2012-10-10 4:59 UTC (permalink / raw)
To: adam.niescierowicz; +Cc: Andre Tomt, Eric Dumazet, Netdev, Jesse Brandeburg
On 10/09/2012 12:56 PM, Nieścierowicz Adam wrote:
> On 08.10.2012 14:59, Andre Tomt wrote:
>
>> On 08. okt. 2012 14:32, Andre Tomt wrote:
>>
>>> On 08. okt. 2012 14:13, Eric Dumazet wrote:
>>>
>>>> On Mon, 2012-10-08 at 14:00 +0200, Andre Tomt wrote:
>>>>
>>>>> On 08. okt. 2012 12:49, Nieścierowicz Adam wrote:
>>>>>
>>>>>> On 08.10.2012 11:47, Eric Dumazet wrote:
>>>>>>
>>>>>>> Anyway you dont say where are drops, (ifconfig give us very few
>>>>>>> drops)
>>>>>> you can see no losses(drop), but a temporary decline in traffic
>>>>>> on the interface to 0kb/s
>>>>> This sounds very familiar, could it be something similar to:
>>>>> http://marc.info/?l=linux-netdev&m=134594936016796&w=3
>>>>> The chip seems to be of the same family (though not model)
>>>> Yes, but Adam says 3.4.1 already has a problem, while commit
>>>> 2cb7a9cc008c25dc03314de563c00c107b3e5432 is in 3.5 only. Since Adam
>>>> uses Intel e1000e, it could be the BQL related problem.
>>> The other chips have had DMA burst flag enabled for longer, so that he
>>> sees the same problem in 3.4 while I'm not makes sense. Hmm, as 3.4 is
>>> when BQL went in (IIRC) it seems very likely that this BQL issue is the
>>> problem for both of us.
>>
>> To clarify; I think the DMA burst flag in the driver triggers the BQL
>> related issue. Judging by the patchwork link for wthresh=1 this seems
>> very related indeed.
>>
>> Removing the FLAG2_DMA_BURST flag for 82574 in the driver works for me.
>> Adam, it might be worth testing out a build on your system too with the
>> flag removed. If you try the attached patch (for 3.6, probably OK for
>> 3.5) and the problem disappears, we are probably at least talking about
>> the same bug.
>
> after applying the patch everything looks good, no visible loss
>
> Do you expect to correct the bug in mainline?
Jesse Brandeburg is currently working on a patch for upstream to fix
the issue.
* Re: 3.4.1 and 3.5-rc1 Packet lost at 250Mb/s
2012-10-08 12:59 ` Andre Tomt
@ 2012-10-09 19:56 ` Nieścierowicz Adam
2012-10-10 4:59 ` Jeff Kirsher
0 siblings, 1 reply; 18+ messages in thread
From: Nieścierowicz Adam @ 2012-10-09 19:56 UTC (permalink / raw)
To: Andre Tomt; +Cc: Eric Dumazet, Netdev
On 08.10.2012 14:59, Andre Tomt wrote:
> On 08. okt. 2012 14:32, Andre Tomt wrote:
>
>> On 08. okt. 2012 14:13, Eric Dumazet wrote:
>>
>>> On Mon, 2012-10-08 at 14:00 +0200, Andre Tomt wrote:
>>>
>>>> On 08. okt. 2012 12:49, Nieścierowicz Adam wrote:
>>>>
>>>>> On 08.10.2012 11:47, Eric Dumazet wrote:
>>>>>
>>>>>> Anyway you dont say where are drops, (ifconfig give us very few
>>>>>> drops)
>>>>> you can see no losses(drop), but a temporary decline in traffic
>>>>> on the interface to 0kb/s
>>>> This sounds very familiar, could it be something similar to:
>>>> http://marc.info/?l=linux-netdev&m=134594936016796&w=3
>>>> The chip seems to be of the same family (though not model)
>>> Yes, but Adam says 3.4.1 already has a problem, while commit
>>> 2cb7a9cc008c25dc03314de563c00c107b3e5432 is in 3.5 only. Since Adam
>>> uses Intel e1000e, it could be the BQL related problem.
>> The other chips have had DMA burst flag enabled for longer, so that he
>> sees the same problem in 3.4 while I'm not makes sense. Hmm, as 3.4 is
>> when BQL went in (IIRC) it seems very likely that this BQL issue is the
>> problem for both of us.
>
> To clarify; I think the DMA burst flag in the driver triggers the BQL
> related issue. Judging by the patchwork link for wthresh=1 this seems
> very related indeed.
>
> Removing the FLAG2_DMA_BURST flag for 82574 in the driver works for me.
> Adam, it might be worth testing out a build on your system too with the
> flag removed. If you try the attached patch (for 3.6, probably OK for
> 3.5) and the problem disappears, we are probably at least talking about
> the same bug.
After applying the patch everything looks good, with no visible loss.
Do you expect the bug to be fixed in mainline?
* Re: 3.4.1 and 3.5-rc1 Packet lost at 250Mb/s
2012-10-08 12:32 ` Andre Tomt
@ 2012-10-08 12:59 ` Andre Tomt
2012-10-09 19:56 ` Nieścierowicz Adam
0 siblings, 1 reply; 18+ messages in thread
From: Andre Tomt @ 2012-10-08 12:59 UTC (permalink / raw)
To: Eric Dumazet; +Cc: adam.niescierowicz, Netdev
[-- Attachment #1: Type: text/plain, Size: 1551 bytes --]
On 08. okt. 2012 14:32, Andre Tomt wrote:
> On 08. okt. 2012 14:13, Eric Dumazet wrote:
>> On Mon, 2012-10-08 at 14:00 +0200, Andre Tomt wrote:
>>> On 08. okt. 2012 12:49, Nieścierowicz Adam wrote:
>>>> On 08.10.2012 11:47, Eric Dumazet wrote:
>>>>> Anyway you dont say where are drops, (ifconfig give us very few drops)
>>>>
>>>> you can see no losses(drop), but a temporary decline in traffic on the
>>>> interface to 0kb/s
>>>
>>> This sounds very familiar, could it be something similar to:
>>> http://marc.info/?l=linux-netdev&m=134594936016796&w=3
>>>
>>> The chip seems to be of the same family (though not model)
>>
>> Yes, but Adam says 3.4.1 already has a problem, while
>> commit 2cb7a9cc008c25dc03314de563c00c107b3e5432 is in 3.5 only.
>>
>> Since Adam uses Intel e1000e, it could be the BQL related problem.
>
> The other chips have had DMA burst flag enabled for longer, so that he
> sees the same problem in 3.4 while I'm not makes sense. Hmm, as 3.4 is
> when BQL went in (IIRC) it seems very likely that this BQL issue is the
> problem for both of us.
To clarify: I think the DMA burst flag in the driver triggers the
BQL-related issue. Judging by the patchwork link for wthresh=1, this
seems very much related.
Removing the FLAG2_DMA_BURST flag for 82574 in the driver works for me.
Adam, it might be worth testing a build with the flag removed on your
system too. If you try the attached patch (for 3.6, probably OK for
3.5) and the problem disappears, we are probably at least talking about
the same bug.
[-- Attachment #2: e1000e-disable-dma-burst.patch --]
[-- Type: text/x-patch, Size: 1355 bytes --]
diff -Naur linux-3.6.1/drivers/net/ethernet/intel/e1000e/82571.c linux-3.6.1-2/drivers/net/ethernet/intel/e1000e/82571.c
--- linux-3.6.1/drivers/net/ethernet/intel/e1000e/82571.c 2012-10-07 17:41:28.000000000 +0200
+++ linux-3.6.1-2/drivers/net/ethernet/intel/e1000e/82571.c 2012-10-08 14:54:08.853095363 +0200
@@ -2031,8 +2031,7 @@
                   | FLAG_RESET_OVERWRITES_LAA /* errata */
                   | FLAG_TARC_SPEED_MODE_BIT /* errata */
                   | FLAG_APME_CHECK_PORT_B,
-    .flags2       = FLAG2_DISABLE_ASPM_L1 /* errata 13 */
-                  | FLAG2_DMA_BURST,
+    .flags2       = FLAG2_DISABLE_ASPM_L1, /* errata 13 */
     .pba          = 38,
     .max_hw_frame_size = DEFAULT_JUMBO,
     .get_variants = e1000_get_variants_82571,
@@ -2049,8 +2048,7 @@
                   | FLAG_APME_IN_CTRL3
                   | FLAG_HAS_CTRLEXT_ON_LOAD
                   | FLAG_TARC_SPEED_MODE_BIT, /* errata */
-    .flags2       = FLAG2_DISABLE_ASPM_L1 /* errata 13 */
-                  | FLAG2_DMA_BURST,
+    .flags2       = FLAG2_DISABLE_ASPM_L1, /* errata 13 */
     .pba          = 38,
     .max_hw_frame_size = DEFAULT_JUMBO,
     .get_variants = e1000_get_variants_82571,
@@ -2090,8 +2088,7 @@
     .flags2       = FLAG2_CHECK_PHY_HANG
                   | FLAG2_DISABLE_ASPM_L0S
                   | FLAG2_DISABLE_ASPM_L1
-                  | FLAG2_NO_DISABLE_RX
-                  | FLAG2_DMA_BURST,
+                  | FLAG2_NO_DISABLE_RX,
     .pba          = 32,
     .max_hw_frame_size = DEFAULT_JUMBO,
     .get_variants = e1000_get_variants_82571,
* Re: 3.4.1 and 3.5-rc1 Packet lost at 250Mb/s
2012-10-08 12:13 ` Eric Dumazet
@ 2012-10-08 12:32 ` Andre Tomt
2012-10-08 12:59 ` Andre Tomt
0 siblings, 1 reply; 18+ messages in thread
From: Andre Tomt @ 2012-10-08 12:32 UTC (permalink / raw)
To: Eric Dumazet; +Cc: adam.niescierowicz, Netdev
On 08. okt. 2012 14:13, Eric Dumazet wrote:
> On Mon, 2012-10-08 at 14:00 +0200, Andre Tomt wrote:
>> On 08. okt. 2012 12:49, Nieścierowicz Adam wrote:
>>> On 08.10.2012 11:47, Eric Dumazet wrote:
>>>> Anyway you dont say where are drops, (ifconfig give us very few drops)
>>>
>>> you can see no losses(drop), but a temporary decline in traffic on the
>>> interface to 0kb/s
>>
>> This sounds very familiar, could it be something similar to:
>> http://marc.info/?l=linux-netdev&m=134594936016796&w=3
>>
>> The chip seems to be of the same family (though not model)
>
> Yes, but Adam says 3.4.1 already has a problem, while
> commit 2cb7a9cc008c25dc03314de563c00c107b3e5432 is in 3.5 only.
>
> Since Adam uses Intel e1000e, it could be the BQL related problem.
The other chips have had the DMA burst flag enabled for longer, so it
makes sense that he sees the problem in 3.4 while I do not. Hmm, as 3.4
is when BQL went in (IIRC), it seems very likely that this BQL issue is
the problem for both of us.
> (Not sure if Intel guys finally fixed the problem, if not, its really
> insane)
>
> http://patchwork.ozlabs.org/patch/163298/
Ugh. :)
* Re: 3.4.1 and 3.5-rc1 Packet lost at 250Mb/s
2012-10-08 12:00 ` Andre Tomt
2012-10-08 12:06 ` Nieścierowicz Adam
@ 2012-10-08 12:13 ` Eric Dumazet
2012-10-08 12:32 ` Andre Tomt
1 sibling, 1 reply; 18+ messages in thread
From: Eric Dumazet @ 2012-10-08 12:13 UTC (permalink / raw)
To: Andre Tomt; +Cc: adam.niescierowicz, Netdev
On Mon, 2012-10-08 at 14:00 +0200, Andre Tomt wrote:
> On 08. okt. 2012 12:49, Nieścierowicz Adam wrote:
> > On 08.10.2012 11:47, Eric Dumazet wrote:
> >> Anyway you dont say where are drops, (ifconfig give us very few drops)
> >
> > you can see no losses(drop), but a temporary decline in traffic on the
> > interface to 0kb/s
>
> This sounds very familiar, could it be something similar to:
> http://marc.info/?l=linux-netdev&m=134594936016796&w=3
>
> The chip seems to be of the same family (though not model)
Yes, but Adam says 3.4.1 already has a problem, while
commit 2cb7a9cc008c25dc03314de563c00c107b3e5432 is in 3.5 only.
Since Adam uses Intel e1000e, it could be the BQL-related problem.
(Not sure if the Intel guys finally fixed the problem; if not, it's really
insane)
http://patchwork.ozlabs.org/patch/163298/
* Re: 3.4.1 and 3.5-rc1 Packet lost at 250Mb/s
2012-10-08 12:00 ` Andre Tomt
@ 2012-10-08 12:06 ` Nieścierowicz Adam
2012-10-08 12:13 ` Eric Dumazet
1 sibling, 0 replies; 18+ messages in thread
From: Nieścierowicz Adam @ 2012-10-08 12:06 UTC (permalink / raw)
To: Andre Tomt; +Cc: Eric Dumazet, Netdev
On 08.10.2012 14:00, Andre Tomt wrote:
> On 08. okt. 2012 12:49, Nieścierowicz Adam wrote:
>
>> On 08.10.2012 11:47, Eric Dumazet wrote:
>>
>>> Anyway you dont say where are drops, (ifconfig give us very few
>>> drops)
>> you can see no losses(drop), but a temporary decline in traffic on the
>> interface to 0kb/s
>
> This sounds very familiar, could it be something similar to:
> http://marc.info/?l=linux-netdev&m=134594936016796&w=3
>
> The chip seems to be of the same family (though not model)
Indeed, it looks similar.
Here the problem has occurred with kernels 3.4.1, 3.5-rcX, and 3.6.
* Re: 3.4.1 and 3.5-rc1 Packet lost at 250Mb/s
2012-10-08 10:49 ` Nieścierowicz Adam
@ 2012-10-08 12:00 ` Andre Tomt
2012-10-08 12:06 ` Nieścierowicz Adam
2012-10-08 12:13 ` Eric Dumazet
0 siblings, 2 replies; 18+ messages in thread
From: Andre Tomt @ 2012-10-08 12:00 UTC (permalink / raw)
To: adam.niescierowicz; +Cc: Eric Dumazet, Netdev
On 08. okt. 2012 12:49, Nieścierowicz Adam wrote:
> On 08.10.2012 11:47, Eric Dumazet wrote:
>> Anyway you dont say where are drops, (ifconfig give us very few drops)
>
> you can see no losses(drop), but a temporary decline in traffic on the
> interface to 0kb/s
This sounds very familiar, could it be something similar to:
http://marc.info/?l=linux-netdev&m=134594936016796&w=3
The chip seems to be of the same family (though not model)
* Re: 3.4.1 and 3.5-rc1 Packet lost at 250Mb/s
2012-10-08 9:47 ` Eric Dumazet
@ 2012-10-08 10:49 ` Nieścierowicz Adam
2012-10-08 12:00 ` Andre Tomt
0 siblings, 1 reply; 18+ messages in thread
From: Nieścierowicz Adam @ 2012-10-08 10:49 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Netdev
On 08.10.2012 11:47, Eric Dumazet wrote:
> On Mon, 2012-10-08 at 11:29 +0200, Nieścierowicz Adam wrote:
>
>>> You should use RPS on eth2/eth3 because they are non multi queue.
>>> Documentation/networking/scaling.txt should give you all the needed
>>> info
>> I set processors for rps such as affinity, unfortunately it did not
>> help
>> ---
>> cat /sys/class/net/eth{2,3,4,5}/queues/rx-0/rps_cpus
>> 0040
>> 0080
>> 0100
>> 0200
>> ---
>> CPU affinity http://wklej.org/id/843161/
>
> I said eth2 and eth3
>
eth2/eth3 and eth4/eth5 are the same card, so I changed them as well. Or
did I make a mistake?
> And you should use cpu11->cpu15 instead of cpu6->cpu9 since they are in
> use...
>
I read this in Documentation/networking/scaling.txt:
---
if the rps_cpus for each queue are the ones that
share the same memory domain as the interrupting CPU for that queue
---
so I used the same CPUs. Did I misunderstand?
> Anyway you dont say where are drops, (ifconfig give us very few
> drops)
No losses (drops) are visible, just a temporary decline in traffic on the
interface to 0 kb/s.
>
> Also your eth0 seems to have a strange balance :
>
> RX interrupts seems to be well balanced on 4 queues :
>
> 76: 503 0 169271690 0 0 0 PCI-MSI-edge eth0-rx-0
> 77: 405 0 0 164532538 0 0 PCI-MSI-edge eth0-rx-1
> 78: 408 0 0 0 152778723 0 PCI-MSI-edge eth0-rx-2
> 79: 349 0 0 0 0 155011301 PCI-MSI-edge eth0-rx-3
> 80: 144 0 443432394 0 0 0 PCI-MSI-edge eth0-tx-0
> 81: 18 0 0 2043311 0 0 PCI-MSI-edge eth0-tx-1
> 82: 30 0 0 0 1934537 0 PCI-MSI-edge eth0-tx-2
> 83: 137 0 0 0 0 1968272 PCI-MSI-edge eth0-tx-3
>
> But TX seems to mostly use queue 0
>
> Packets sent to eth0 are coming from where ?
Packets come mainly from two routers (edge BGP and local NAT).
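Since the symptom described in this message is traffic briefly falling to 0 kb/s without drop counters moving, one way to pin down when it happens is to sample the interface byte counters and look for intervals in which they do not advance. The following Python sketch illustrates the idea. The counters are read from /sys/class/net/<dev>/statistics/, while the helper names, the one-second interval and the sample values are assumptions made for illustration.

```python
from pathlib import Path
from typing import List, Sequence, Tuple


def read_counter(dev: str, direction: str = "rx") -> int:
    """Read the cumulative rx or tx byte counter of an interface."""
    path = Path("/sys/class/net") / dev / "statistics" / f"{direction}_bytes"
    return int(path.read_text())


def stalls(counters: Sequence[int],
           interval_s: float = 1.0) -> List[Tuple[float, float]]:
    """Return (start_time, duration) for each run of unchanged counters."""
    runs = []
    start = None
    for i in range(1, len(counters)):
        if counters[i] == counters[i - 1]:
            if start is None:
                start = i - 1  # the stall begins at the previous sample
        elif start is not None:
            runs.append((start * interval_s, (i - 1 - start) * interval_s))
            start = None
    if start is not None:  # stall still ongoing at the last sample
        runs.append((start * interval_s,
                     (len(counters) - 1 - start) * interval_s))
    return runs


# Made-up samples taken once per second: traffic stops for two seconds.
samples = [0, 31_000_000, 62_000_000, 62_000_000, 62_000_000, 93_000_000]
print(stalls(samples))
```

On a live system the samples would come from calling read_counter("eth0") in a loop with a sleep between calls, which is left out here so the sketch runs anywhere.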
* Re: 3.4.1 and 3.5-rc1 Packet lost at 250Mb/s
2012-10-08 9:29 ` Nieścierowicz Adam
@ 2012-10-08 9:47 ` Eric Dumazet
2012-10-08 10:49 ` Nieścierowicz Adam
0 siblings, 1 reply; 18+ messages in thread
From: Eric Dumazet @ 2012-10-08 9:47 UTC (permalink / raw)
To: adam.niescierowicz; +Cc: Netdev
On Mon, 2012-10-08 at 11:29 +0200, Nieścierowicz Adam wrote:
> >
> > You should use RPS on eth2/eth3 because they are non multi queue.
> >
> > Documentation/networking/scaling.txt should give you all the needed
> > info
>
> I set processors for rps such as affinity, unfortunately it did not
> help
>
> ---
> cat /sys/class/net/eth{2,3,4,5}/queues/rx-0/rps_cpus
> 0040
> 0080
> 0100
> 0200
> ---
> CPU affinity http://wklej.org/id/843161/
>
>
I said eth2 and eth3
And you should use cpu11->cpu15 instead of cpu6->cpu9, since those are
already in use...
Anyway, you don't say where the drops are (ifconfig shows very few drops)
Also, your eth0 seems to have a strange balance:
RX interrupts seem to be well balanced across the 4 queues:
76:  503  0  169271690          0          0          0  PCI-MSI-edge  eth0-rx-0
77:  405  0          0  164532538          0          0  PCI-MSI-edge  eth0-rx-1
78:  408  0          0          0  152778723          0  PCI-MSI-edge  eth0-rx-2
79:  349  0          0          0          0  155011301  PCI-MSI-edge  eth0-rx-3
80:  144  0  443432394          0          0          0  PCI-MSI-edge  eth0-tx-0
81:   18  0          0    2043311          0          0  PCI-MSI-edge  eth0-tx-1
82:   30  0          0          0    1934537          0  PCI-MSI-edge  eth0-tx-2
83:  137  0          0          0          0    1968272  PCI-MSI-edge  eth0-tx-3
But TX seems to mostly use queue 0
Where are the packets sent to eth0 coming from?
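The imbalance can be quantified directly from the eight /proc/interrupts lines quoted in this message. The Python sketch below sums the per-CPU counters of each line and prints every queue's share of the RX and TX totals. The function names are illustrative, and because the CPU header row is not part of the quote, the counts are summed per IRQ and not attributed to specific CPUs.

```python
from typing import Dict

# The eight lines quoted above, copied verbatim.
INTERRUPTS = """\
76: 503 0 169271690 0 0 0 PCI-MSI-edge eth0-rx-0
77: 405 0 0 164532538 0 0 PCI-MSI-edge eth0-rx-1
78: 408 0 0 0 152778723 0 PCI-MSI-edge eth0-rx-2
79: 349 0 0 0 0 155011301 PCI-MSI-edge eth0-rx-3
80: 144 0 443432394 0 0 0 PCI-MSI-edge eth0-tx-0
81: 18 0 0 2043311 0 0 PCI-MSI-edge eth0-tx-1
82: 30 0 0 0 1934537 0 PCI-MSI-edge eth0-tx-2
83: 137 0 0 0 0 1968272 PCI-MSI-edge eth0-tx-3
"""


def queue_totals(text: str) -> Dict[str, int]:
    """Map each queue name (last column) to its total interrupt count."""
    result = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        # fields[0] is the IRQ number ("76:"); the numeric fields after
        # it are the per-CPU counters.
        result[fields[-1]] = sum(int(f) for f in fields[1:-1] if f.isdigit())
    return result


def shares(totals: Dict[str, int], direction: str) -> Dict[str, float]:
    """Return each queue's fraction of all 'rx' or 'tx' interrupts."""
    picked = {q: n for q, n in totals.items() if f"-{direction}-" in q}
    whole = sum(picked.values())
    return {q: n / whole for q, n in picked.items()}


totals = queue_totals(INTERRUPTS)
for direction in ("rx", "tx"):
    for queue, share in shares(totals, direction).items():
        print(f"{queue}: {share:.1%}")
```

With these figures each RX queue handles roughly a quarter of the RX interrupts, while eth0-tx-0 alone accounts for about 98.7% of the TX interrupts.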
* Re: 3.4.1 and 3.5-rc1 Packet lost at 250Mb/s
2012-10-08 6:22 ` Eric Dumazet
@ 2012-10-08 9:29 ` Nieścierowicz Adam
2012-10-08 9:47 ` Eric Dumazet
0 siblings, 1 reply; 18+ messages in thread
From: Nieścierowicz Adam @ 2012-10-08 9:29 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Netdev
>
> You should use RPS on eth2/eth3 because they are non multi queue.
>
> Documentation/networking/scaling.txt should give you all the needed
> info
I set the CPUs for RPS the same way as for the IRQ affinity; unfortunately
it did not help
---
cat /sys/class/net/eth{2,3,4,5}/queues/rx-0/rps_cpus
0040
0080
0100
0200
---
CPU affinity http://wklej.org/id/843161/
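For reference, each rps_cpus value listed above is a hexadecimal bitmask in which bit N selects cpuN. A short Python sketch (the helper name is illustrative, not from the thread) decodes the four masks:

```python
from typing import List


def cpus_in_mask(mask_hex: str) -> List[int]:
    """Return the CPU numbers whose bits are set in a hex CPU mask."""
    # Masks wider than 32 bits are shown as comma-separated 32-bit groups,
    # most significant first, so dropping the commas gives one hex number.
    value = int(mask_hex.replace(",", ""), 16)
    return [bit for bit in range(value.bit_length()) if (value >> bit) & 1]


# Masks reported for eth2, eth3, eth4 and eth5 (rx-0 queue of each).
for dev, mask in zip(("eth2", "eth3", "eth4", "eth5"),
                     ("0040", "0080", "0100", "0200")):
    print(dev, mask, cpus_in_mask(mask))
```

This gives cpu6, cpu7, cpu8 and cpu9 for eth2 to eth5 respectively.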
* Re: 3.4.1 and 3.5-rc1 Packet lost at 250Mb/s
2012-10-07 19:18 Nieścierowicz Adam
@ 2012-10-08 6:22 ` Eric Dumazet
2012-10-08 9:29 ` Nieścierowicz Adam
0 siblings, 1 reply; 18+ messages in thread
From: Eric Dumazet @ 2012-10-08 6:22 UTC (permalink / raw)
To: adam.niescierowicz; +Cc: Netdev
On Sun, 2012-10-07 at 21:18 +0200, Nieścierowicz Adam wrote:
> On 06.07.2012 12:13, Eric Dumazet wrote:
>
> > On Fri, 2012-07-06 at 11:47 +0200, Nieścierowicz Adam wrote:
> >
> >> Hello, Can I send something that will help determine the cause of
> >> the problem?
> >>
> >> On 08.06.2012 11:41, Eric Dumazet wrote:
> >>
> >>> On Fri, 2012-06-08 at 10:58 +0200, Nieścierowicz Adam wrote:
> >>>
> >>>> Hello, recently we changed on the router kernel from 2.6.38.1 to
> >>>> 3.4.1 and noticed 30% packet loss when traffic increases up to
> >>>> 250MB / s. Similar is for kernel 3.5-rc1 Here a link to ifstat
> >>>> http://wklej.org/id/767577/
> >>>
> >>> You should give as many details as possible about your setup
> >>> (hardware, software)
> >>>
> >>> lspci
> >>> cat /proc/cpuinfo
> >>> cat /proc/interrupts
> >>> ifconfig -a
> >>> tc -s -d qdisc
> >>> dmesg
> >>> netstat -s
> >>
> >> currently running on 2.6.38.1 and traffic is 100Mb / s
> >>
> >> lspci: http://wklej.org/id/769102/
> >> /proc/cpuinfo: http://wklej.org/id/769104/
> >> /proc/interrupts: http://wklej.org/id/769106/
> >> ifconfig -a: http://wklej.org/id/769108/
> >> tc -s -d qdisc: http://wklej.org/id/769109/
> >> dmesg: here are some logs from iptables
> >> netstat -s: http://wklej.org/id/769110/
> >> lsmod: http://wklej.org/id/769117/
> >> /proc/net/softnet_stat: http://wklej.org/id/769116/
> >
> > Same infos of 3.5-rcX kernel would be nice.
> >
> > What NIC is eth0 ? (dmesg please)
> >
> > It seems all network traffic on 2.6.38 is handled by a single cpu
> > (cpu0)
> >
> > (seen in /proc/interrupts)
> >
> > I suspect that with 3.4 or 3.5 kernels, traffic is handled by many
> > cpus
> > and they hit false sharing and contention.
> >
> > You probably get better performance doing some affinity tuning :
> >
> > For example,
> > eth0 serviced by cpu0
> > eth2 serviced by cpu1
> > eth3 serviced by cpu2
> > eth5 serviced by cpu3
> >
> > and so on...
> >
> > check and/or set /proc/irq/${NUM}/smp_affinity
>
> hello
> I would go back to an earlier thread.
>
> Currently is installed kernel 3.6.0 and symptoms are the same
>
> about configuration:
>
> - affinity on
>
> - lspci: http://wklej.org/id/843156/ [10]
>
> - /proc/cpuinfo: http://wklej.org/id/843158/ [11]
>
> - /proc/interrupts: http://wklej.org/id/843161/ [12]
>
> - ifconfig -a: http://wklej.org/id/843162/ [13]
>
> - tc -s -d qdisc: http://wklej.org/id/843164/ [14]
>
> - dmesg: http://wklej.org/id/843166/ [15]
>
> - lsmod: http://wklej.org/id/843167/ [16]
>
> - /proc/net/softnet_stat: /proc/net/softnet_stat
>
> attach something else?
>
> Thanks
You should use RPS on eth2/eth3 because they are not multiqueue.
Documentation/networking/scaling.txt should give you all the needed info
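As a rough illustration of what enabling RPS on a single-queue interface involves, the Python sketch below builds a hexadecimal CPU mask from a list of CPUs and writes it to the rx-0 queue's rps_cpus file under /sys/class/net. The function names and the example CPU list are assumptions; which CPUs to choose depends on the machine. Writing the file requires root, so the sketch defines the helper without calling it.

```python
from pathlib import Path
from typing import Iterable


def rps_mask(cpus: Iterable[int]) -> str:
    """Return the hex bitmask with one bit set per listed CPU."""
    value = 0
    for cpu in cpus:
        # Kept to CPUs 0-31 so the mask fits a single 32-bit group.
        if not 0 <= cpu < 32:
            raise ValueError("this sketch only handles cpu0..cpu31")
        value |= 1 << cpu
    return format(value, "x")


def set_rps(dev: str, cpus: Iterable[int], queue: int = 0,
            sysfs_root: str = "/sys/class/net") -> Path:
    """Write the RPS mask for one receive queue and return the file path."""
    path = Path(sysfs_root) / dev / "queues" / f"rx-{queue}" / "rps_cpus"
    path.write_text(rps_mask(cpus))
    return path


# Example of the mask for a hypothetical choice of CPUs 1 and 2.
print(rps_mask([1, 2]))

# On the router itself (as root), something like:
#   set_rps("eth2", [1, 2])
#   set_rps("eth3", [1, 2])
```

The sysfs_root parameter exists only so the helper can be pointed at a scratch directory for testing.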
* Re: 3.4.1 and 3.5-rc1 Packet lost at 250Mb/s
@ 2012-10-07 19:18 Nieścierowicz Adam
2012-10-08 6:22 ` Eric Dumazet
0 siblings, 1 reply; 18+ messages in thread
From: Nieścierowicz Adam @ 2012-10-07 19:18 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Netdev
On 06.07.2012 12:13, Eric Dumazet wrote:
> On Fri, 2012-07-06 at 11:47 +0200, Nieścierowicz Adam wrote:
>
>> Hello, Can I send something that will help determine the cause of
>> the problem?
>>
>> On 08.06.2012 11:41, Eric Dumazet wrote:
>>
>>> On Fri, 2012-06-08 at 10:58 +0200, Nieścierowicz Adam wrote:
>>>
>>>> Hello, recently we changed on the router kernel from 2.6.38.1 to
>>>> 3.4.1 and noticed 30% packet loss when traffic increases up to
>>>> 250MB / s. Similar is for kernel 3.5-rc1 Here a link to ifstat
>>>> http://wklej.org/id/767577/
>>>
>>> You should give as many details as possible about your setup
>>> (hardware, software)
>>>
>>> lspci
>>> cat /proc/cpuinfo
>>> cat /proc/interrupts
>>> ifconfig -a
>>> tc -s -d qdisc
>>> dmesg
>>> netstat -s
>>
>> currently running on 2.6.38.1 and traffic is 100Mb / s
>>
>> lspci: http://wklej.org/id/769102/
>> /proc/cpuinfo: http://wklej.org/id/769104/
>> /proc/interrupts: http://wklej.org/id/769106/
>> ifconfig -a: http://wklej.org/id/769108/
>> tc -s -d qdisc: http://wklej.org/id/769109/
>> dmesg: here are some logs from iptables
>> netstat -s: http://wklej.org/id/769110/
>> lsmod: http://wklej.org/id/769117/
>> /proc/net/softnet_stat: http://wklej.org/id/769116/
>
> Same infos of 3.5-rcX kernel would be nice.
>
> What NIC is eth0 ? (dmesg please)
>
> It seems all network traffic on 2.6.38 is handled by a single cpu
> (cpu0)
>
> (seen in /proc/interrupts)
>
> I suspect that with 3.4 or 3.5 kernels, traffic is handled by many
> cpus
> and they hit false sharing and contention.
>
> You probably get better performance doing some affinity tuning :
>
> For example,
> eth0 serviced by cpu0
> eth2 serviced by cpu1
> eth3 serviced by cpu2
> eth5 serviced by cpu3
>
> and so on...
>
> check and/or set /proc/irq/${NUM}/smp_affinity
Hello,
I would like to return to this earlier thread.
Kernel 3.6.0 is now installed and the symptoms are the same.
Configuration:
- affinity on
- lspci: http://wklej.org/id/843156/
- /proc/cpuinfo: http://wklej.org/id/843158/
- /proc/interrupts: http://wklej.org/id/843161/
- ifconfig -a: http://wklej.org/id/843162/
- tc -s -d qdisc: http://wklej.org/id/843164/
- dmesg: http://wklej.org/id/843166/
- lsmod: http://wklej.org/id/843167/
- /proc/net/softnet_stat: /proc/net/softnet_stat
Should I attach anything else?
Thanks
* Re: 3.4.1 and 3.5-rc1 Packet lost at 250Mb/s
2012-06-08 9:41 ` Eric Dumazet
@ 2012-06-08 9:43 ` Eric Dumazet
0 siblings, 0 replies; 18+ messages in thread
From: Eric Dumazet @ 2012-06-08 9:43 UTC (permalink / raw)
To: adam.niescierowicz; +Cc: Netdev
On Fri, 2012-06-08 at 11:41 +0200, Eric Dumazet wrote:
> On Fri, 2012-06-08 at 10:58 +0200, Nieścierowicz Adam wrote:
> lspci
> cat /proc/cpuinfo
> cat /proc/interrupts
> ifconfig -a
> tc -s -d qdisc
> dmesg
> netstat -s
>
cat /proc/net/softnet_stat
lsmod
* Re: 3.4.1 and 3.5-rc1 Packet lost at 250Mb/s
2012-06-08 8:58 Nieścierowicz Adam
@ 2012-06-08 9:41 ` Eric Dumazet
2012-06-08 9:43 ` Eric Dumazet
0 siblings, 1 reply; 18+ messages in thread
From: Eric Dumazet @ 2012-06-08 9:41 UTC (permalink / raw)
To: adam.niescierowicz; +Cc: Netdev
On Fri, 2012-06-08 at 10:58 +0200, Nieścierowicz Adam wrote:
> Hello,
>
> recently we changed on the router kernel from 2.6.38.1 to 3.4.1 and
> noticed
> 30% packet loss when traffic increases up to 250MB / s.
>
> Similar is for kernel 3.5-rc1
>
> Here a link to ifstat http://wklej.org/id/767577/
You should give as many details as possible about your setup (hardware,
software)
lspci
cat /proc/cpuinfo
cat /proc/interrupts
ifconfig -a
tc -s -d qdisc
dmesg
netstat -s
* Re: 3.4.1 and 3.5-rc1 Packet lost at 250Mb/s
@ 2012-06-08 9:31 Nieścierowicz Adam
0 siblings, 0 replies; 18+ messages in thread
From: Nieścierowicz Adam @ 2012-06-08 9:31 UTC (permalink / raw)
To: Eric Dumazet, Netdev
On 08.06.2012 11:41, Eric Dumazet wrote:
> On Fri, 2012-06-08 at 10:58 +0200, Nieścierowicz Adam wrote:
>
>> Hello, recently we changed on the router kernel from 2.6.38.1 to
>> 3.4.1
>> and noticed 30% packet loss when traffic increases up to 250MB / s.
>> Similar is for kernel 3.5-rc1 Here a link to ifstat
>> http://wklej.org/id/767577/
>
> You should give as many details as possible about your setup (hardware,
> software)
>
> lspci
> cat /proc/cpuinfo
> cat /proc/interrupts
> ifconfig -a
> tc -s -d qdisc
> dmesg
> netstat -s
We are currently running 2.6.38.1 and traffic is 100 Mb/s.
lspci: http://wklej.org/id/769102/
/proc/cpuinfo: http://wklej.org/id/769104/
/proc/interrupts: http://wklej.org/id/769106/
ifconfig -a: http://wklej.org/id/769108/
tc -s -d qdisc: http://wklej.org/id/769109/
dmesg: it only contains some logs from iptables
netstat -s: http://wklej.org/id/769110/
lsmod: http://wklej.org/id/769117/
/proc/net/softnet_stat: http://wklej.org/id/769116/
* 3.4.1 and 3.5-rc1 Packet lost at 250Mb/s
@ 2012-06-08 8:58 Nieścierowicz Adam
2012-06-08 9:41 ` Eric Dumazet
0 siblings, 1 reply; 18+ messages in thread
From: Nieścierowicz Adam @ 2012-06-08 8:58 UTC (permalink / raw)
To: Netdev
Hello,
recently we changed the kernel on our router from 2.6.38.1 to 3.4.1 and
noticed 30% packet loss when traffic increases to 250 Mb/s.
The same happens with kernel 3.5-rc1.
Here is a link to ifstat output: http://wklej.org/id/767577/
Regards
Thread overview: 18+ messages
2012-07-06 9:47 3.4.1 and 3.5-rc1 Packet lost at 250Mb/s Nieścierowicz Adam
2012-07-06 10:13 ` Eric Dumazet
-- strict thread matches above, loose matches on Subject: below --
2012-10-07 19:18 Nieścierowicz Adam
2012-10-08 6:22 ` Eric Dumazet
2012-10-08 9:29 ` Nieścierowicz Adam
2012-10-08 9:47 ` Eric Dumazet
2012-10-08 10:49 ` Nieścierowicz Adam
2012-10-08 12:00 ` Andre Tomt
2012-10-08 12:06 ` Nieścierowicz Adam
2012-10-08 12:13 ` Eric Dumazet
2012-10-08 12:32 ` Andre Tomt
2012-10-08 12:59 ` Andre Tomt
2012-10-09 19:56 ` Nieścierowicz Adam
2012-10-10 4:59 ` Jeff Kirsher
2012-06-08 9:31 Nieścierowicz Adam
2012-06-08 8:58 Nieścierowicz Adam
2012-06-08 9:41 ` Eric Dumazet
2012-06-08 9:43 ` Eric Dumazet