* e1000e on thinkpad x60: interrupt problem
@ 2013-07-09 1:05 Pavel Machek
2013-07-09 15:51 ` Ronciak, John
0 siblings, 1 reply; 9+ messages in thread
From: Pavel Machek @ 2013-07-09 1:05 UTC (permalink / raw)
To: jeffrey.t.kirsher, jesse.brandeburg, bruce.w.allan,
carolyn.wyborny, donald.c.skidmore, gregory.v.rose,
peter.p.waskiewicz.jr, alexander.h.duyck, john.ronciak,
tushar.n.dave, e1000-devel, netdev
Hi!
I'm using
02:00.0 Ethernet controller: Intel Corporation 82573L Gigabit Ethernet Controller
on thinkpad x60. Kernel is 3.10.
# CONFIG_E100 is not set
# CONFIG_E1000 is not set
CONFIG_E1000E=y
Interrupts are like this:
pavel@amd:/data/l/linux-good$ cat /proc/interrupts
           CPU0       CPU1
  0:   95454037       5192   IO-APIC-edge      timer
  1:      16292         20   IO-APIC-edge      i8042
  3:          9          0   IO-APIC-edge
  4:          9          0   IO-APIC-edge
  7:          0          0   IO-APIC-edge      parport0
  8:          1          0   IO-APIC-edge      rtc0
  9:   19471974       1207   IO-APIC-fasteoi   acpi
 12:     168092         15   IO-APIC-edge      i8042
 14:    3568551        165   IO-APIC-edge      ata_piix
 15:          0          0   IO-APIC-edge      ata_piix
 16:   14033945        877   IO-APIC-fasteoi   i915, ahci, yenta, uhci_hcd:usb2, eth0
but it seems that eth0 is not generating interrupts at all:
...
64 bytes from 10.0.0.251: icmp_req=32 ttl=64 time=26.2 ms
64 bytes from 10.0.0.251: icmp_req=33 ttl=64 time=16.6 ms
64 bytes from 10.0.0.251: icmp_req=34 ttl=64 time=1.14 ms
64 bytes from 10.0.0.251: icmp_req=35 ttl=64 time=56.4 ms
^C
--- 10.0.0.251 ping statistics ---
35 packets transmitted, 35 received, 0% packet loss, time 34140ms
rtt min/avg/max/mdev = 1.024/20.173/88.285/25.281 ms
pavel@amd:/data/l/linux-good$ ping 10.0.0.251
Note the huge latencies. But as the interrupt is shared with ahci, I can
help by generating disk interrupts:
pavel@amd:~$ sudo cat /dev/sda > /dev/null
[sudo] password for pavel:
Then latencies drop to a high but reasonable range:
pavel@amd:/data/l/linux-good$ ping 10.0.0.251
PING 10.0.0.251 (10.0.0.251) 56(84) bytes of data.
64 bytes from 10.0.0.251: icmp_req=1 ttl=64 time=1.14 ms
64 bytes from 10.0.0.251: icmp_req=2 ttl=64 time=3.79 ms
64 bytes from 10.0.0.251: icmp_req=3 ttl=64 time=1.24 ms
64 bytes from 10.0.0.251: icmp_req=4 ttl=64 time=1.54 ms
64 bytes from 10.0.0.251: icmp_req=5 ttl=64 time=2.04 ms
64 bytes from 10.0.0.251: icmp_req=6 ttl=64 time=2.48 ms
64 bytes from 10.0.0.251: icmp_req=7 ttl=64 time=1.90 ms
64 bytes from 10.0.0.251: icmp_req=8 ttl=64 time=2.30 ms
(other attempt)
--- 10.0.0.251 ping statistics ---
22 packets transmitted, 22 received, 0% packet loss, time 21128ms
rtt min/avg/max/mdev = 0.940/1.733/3.604/0.685 ms
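To put a number on the difference between the two runs, the summary lines can be compared directly. This is a minimal sketch and not part of the original mail; `parse_rtt` is a hypothetical helper, and the regular expression assumes the iputils summary format shown above.

```python
import re

# Matches the summary line printed by ping in the transcripts above, e.g.
# "rtt min/avg/max/mdev = 1.024/20.173/88.285/25.281 ms"
RTT_RE = re.compile(r"rtt min/avg/max/mdev = ([\d.]+)/([\d.]+)/([\d.]+)/([\d.]+) ms")


def parse_rtt(line):
    """Return the four rtt figures (in ms) from a ping summary line."""
    m = RTT_RE.search(line)
    if m is None:
        raise ValueError("not a ping rtt summary line: %r" % line)
    return dict(zip(("min", "avg", "max", "mdev"), map(float, m.groups())))


# The two summary lines quoted in this mail.
idle = parse_rtt("rtt min/avg/max/mdev = 1.024/20.173/88.285/25.281 ms")
disk_busy = parse_rtt("rtt min/avg/max/mdev = 0.940/1.733/3.604/0.685 ms")

print("avg ratio: %.1fx" % (idle["avg"] / disk_busy["avg"]))
print("max ratio: %.1fx" % (idle["max"] / disk_busy["max"]))
```

The minimum rtt is about 1 ms in both runs; it is the average and the maximum that change, which fits packets waiting for some other event rather than a slow link.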
root@amd:/data/l/linux-good# ethtool -e eth0
Offset Values
------ ------
0x0000 00 16 d3 25 19 04 30 0b b2 ff 51 00 ff ff ff ff
0x0010 53 00 03 02 6b 02 7e 20 aa 17 9a 10 86 80 df 80
0x0020 00 00 00 20 54 7e 00 00 14 00 da 00 04 00 00 27
0x0030 c9 6c 50 31 3e 07 0b 04 8b 29 00 00 00 f0 02 0f
0x0040 08 10 00 00 04 0f ff 7f 01 4d ff ff ff ff ff ff
0x0050 14 00 1d 00 14 00 1d 00 af aa 1e 00 00 00 1d 00
0x0060 00 01 00 40 32 12 07 40 ff ff ff ff ff ff ff ff
0x0070 ff ff ff ff ff ff ff ff ff ff ff ff ff ff fd 5d
02:00.0 Ethernet controller: Intel Corporation 82573L Gigabit Ethernet Controller
Subsystem: Lenovo ThinkPad X60s
Flags: bus master, fast devsel, latency 0, IRQ 16
Memory at ee000000 (32-bit, non-prefetchable) [size=128K]
I/O ports at 2000 [size=32]
Capabilities: [c8] Power Management version 2
Capabilities: [d0] MSI: Enable- Count=1/1 Maskable- 64bit+
Capabilities: [e0] Express Endpoint, MSI 00
Capabilities: [100] Advanced Error Reporting
Capabilities: [140] Device Serial Number 00-16-d3-ff-ff-25-19-04
Kernel driver in use: e1000e
Any ideas how to debug/fix this?
[Maybe related: https://bugzilla.kernel.org/show_bug.cgi?id=6929 but
that was fixed in 2007. And yes, it _is_ better with bigger packets.]
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
* RE: e1000e on thinkpad x60: interrupt problem
2013-07-09 1:05 e1000e on thinkpad x60: interrupt problem Pavel Machek
@ 2013-07-09 15:51 ` Ronciak, John
2013-07-09 17:02 ` Pavel Machek
0 siblings, 1 reply; 9+ messages in thread
From: Ronciak, John @ 2013-07-09 15:51 UTC (permalink / raw)
To: Pavel Machek, Kirsher, Jeffrey T, Brandeburg, Jesse, Allan,
Bruce W, Wyborny, Carolyn, Skidmore, Donald C, Rose, Gregory V,
Waskiewicz Jr, Peter P, Duyck, Alexander H, Dave, Tushar N,
e1000-devel, netdev
Nothing appears to be wrong. If the system is seeing ping packets at all, that means the device is generating interrupts and that they are being processed. If you are looking at performance, then sharing interrupts is probably not a good idea. Since you are on a laptop, it may not be easy to separate the networking device onto its own interrupt. The interrupt is shared with a lot of other devices, not just ahci.
Cheers,
John
* Re: e1000e on thinkpad x60: interrupt problem
2013-07-09 15:51 ` Ronciak, John
@ 2013-07-09 17:02 ` Pavel Machek
2013-07-09 17:15 ` Waskiewicz Jr, Peter P
0 siblings, 1 reply; 9+ messages in thread
From: Pavel Machek @ 2013-07-09 17:02 UTC (permalink / raw)
To: Ronciak, John; +Cc: e1000-devel, Allan, Bruce W, Brandeburg, Jesse, netdev
Hi!
> Nothing appears to be wrong. If the system is seeing ping packets
>at all means that device is generating interrupts and that they are
>being processed. If you are looking at performance then sharing
No, that's not true. There is other interrupt load, and e1000e has big
enough buffers; that means that packets eventually get processed. I
strongly suspect e1000e generates few or no interrupts and packets
only get processed when other devices on the shared interrupt line
generate an interrupt.
>interrupts is probably not a good idea. Since you are on a laptop it
>may not be easy to separate the networking device onto its own
>interrupt. The interrupt is shared with a lot of other devices and
>not just ahci.
Yes, it is wrong. 100msec latency on unloaded core duo is not
right... and notice how ping latency goes _down_ when I load the other
devices.
[100msec latency is so big that it makes interactive work over
ssh hard.]
IOW if the interrupt was not shared, I'd be getting latencies in the
one-second range. It happened before on this machine, and other x60
users are seeing that, too.
It may have something to do with ASPM, because the hack below brings
latencies below 2msec. (And btw shared interrupt load does not
make them worse in a way that can be measured; it stays <2msec with
AHCI load.)
Any ideas for an acceptable solution?
Pavel
diff --git a/.config b/.config
index 149f713..d7f5a11 100644
--- a/.config
+++ b/.config
@@ -559,9 +559,9 @@ CONFIG_PCIEAER=y
 # CONFIG_PCIEAER_INJECT is not set
 CONFIG_PCIEASPM=y
 CONFIG_PCIEASPM_DEBUG=y
-CONFIG_PCIEASPM_DEFAULT=y
+# CONFIG_PCIEASPM_DEFAULT is not set
 # CONFIG_PCIEASPM_POWERSAVE is not set
-# CONFIG_PCIEASPM_PERFORMANCE is not set
+CONFIG_PCIEASPM_PERFORMANCE=y
 CONFIG_PCIE_PME=y
 CONFIG_ARCH_SUPPORTS_MSI=y
 # CONFIG_PCI_MSI is not set
diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c
index e4b1fb2..9a1b63e 100644
--- a/drivers/pci/pci-acpi.c
+++ b/drivers/pci/pci-acpi.c
@@ -382,7 +382,7 @@ static int __init acpi_pci_init(void)
 
 	if (acpi_gbl_FADT.boot_flags & ACPI_FADT_NO_ASPM) {
 		printk(KERN_INFO"ACPI FADT declares the system doesn't support PCIe ASPM, so disable it\n");
-		pcie_no_aspm();
+//		pcie_no_aspm();
 	}
 
 	ret = register_acpi_bus_type(&acpi_pci_bus);
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
* Re: e1000e on thinkpad x60: interrupt problem
2013-07-09 17:02 ` Pavel Machek
@ 2013-07-09 17:15 ` Waskiewicz Jr, Peter P
2013-07-09 20:48 ` Pavel Machek
0 siblings, 1 reply; 9+ messages in thread
From: Waskiewicz Jr, Peter P @ 2013-07-09 17:15 UTC (permalink / raw)
To: Pavel Machek
Cc: Ronciak, John, Kirsher, Jeffrey T, Brandeburg, Jesse, Allan,
Bruce W, Wyborny, Carolyn, Skidmore, Donald C, Rose, Gregory V,
Duyck, Alexander H, Dave, Tushar N, e1000-devel, netdev
On Tue, 2013-07-09 at 19:02 +0200, Pavel Machek wrote:
> Hi!
>
> > Nothing appears to be wrong. If the system is seeing ping packets
> >at all means that device is generating interrupts and that they are
> >being processed. If you are looking at performance then sharing
>
> No, that's not true. There is other interrupt load, and e1000e has big
> enough buffers; that means that packets eventually get processed. I
> strongly suspect e1000e generates little or no interrupts and packets
> only get processed when other devices on shared interrupt line
> generate interrupt.
If the interrupt is shared, e1000e checks if it's the hardware that
generated it before processing packets. Consuming an interrupt that
isn't meant for this device will throw major warnings in the kernel
about bad interrupt routing, etc. Here's the code from the interrupt
handler (note the last part of the pasted code):
/**
 * e1000_intr - Interrupt Handler
 * @irq: interrupt number
 * @data: pointer to a network interface device structure
 **/
static irqreturn_t e1000_intr(int __always_unused irq, void *data)
{
	struct net_device *netdev = data;
	struct e1000_adapter *adapter = netdev_priv(netdev);
	struct e1000_hw *hw = &adapter->hw;
	u32 rctl, icr = er32(ICR);

	if (!icr || test_bit(__E1000_DOWN, &adapter->state))
		return IRQ_NONE;  /* Not our interrupt */

	/* IMS will not auto-mask if INT_ASSERTED is not set, and if it is
	 * not set, then the adapter didn't send an interrupt
	 */
	if (!(icr & E1000_ICR_INT_ASSERTED))
		return IRQ_NONE;
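
The two early returns in the excerpt can be restated as a small userspace model. This is a sketch only, not driver code: `e1000_intr_model` is a hypothetical name, the numeric value of `E1000_ICR_INT_ASSERTED` (bit 31) is an assumption taken from the e1000e register definitions, and the `IRQ_NONE`/`IRQ_HANDLED` values are placeholders.

```python
# Assumed value: bit 31 of the Interrupt Cause Read register.
E1000_ICR_INT_ASSERTED = 0x80000000

# Placeholder return codes for the model.
IRQ_NONE, IRQ_HANDLED = 0, 1


def e1000_intr_model(icr, adapter_down=False):
    """Mirror the two checks quoted above: does the handler claim the IRQ?"""
    # No cause bits at all, or the interface is down: not our interrupt.
    if icr == 0 or adapter_down:
        return IRQ_NONE
    # Cause bits are present but the device says it did not assert the line.
    if not (icr & E1000_ICR_INT_ASSERTED):
        return IRQ_NONE
    # The real handler goes on to process the event; the model just claims it.
    return IRQ_HANDLED
```

Note that the model depends only on what the ICR read returns when the handler runs, not on which device actually raised the shared line.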
Cheers,
-PJ
* Re: e1000e on thinkpad x60: interrupt problem
2013-07-09 17:15 ` Waskiewicz Jr, Peter P
@ 2013-07-09 20:48 ` Pavel Machek
2013-07-09 20:59 ` Ronciak, John
2013-07-09 21:40 ` Jesse Brandeburg
0 siblings, 2 replies; 9+ messages in thread
From: Pavel Machek @ 2013-07-09 20:48 UTC (permalink / raw)
To: Waskiewicz Jr, Peter P
Cc: Ronciak, John, Kirsher, Jeffrey T, Brandeburg, Jesse, Allan,
Bruce W, Wyborny, Carolyn, Skidmore, Donald C, Rose, Gregory V,
Duyck, Alexander H, Dave, Tushar N, e1000-devel, netdev
On Tue 2013-07-09 17:15:48, Waskiewicz Jr, Peter P wrote:
> On Tue, 2013-07-09 at 19:02 +0200, Pavel Machek wrote:
> > Hi!
> >
> > > Nothing appears to be wrong. If the system is seeing ping packets
> > >at all means that device is generating interrupts and that they are
> > >being processed. If you are looking at performance then sharing
> >
> > No, that's not true. There is other interrupt load, and e1000e has big
> > enough buffers; that means that packets eventually get processed. I
> > strongly suspect e1000e generates little or no interrupts and packets
> > only get processed when other devices on shared interrupt line
> > generate interrupt.
>
> If the interrupt is shared, e1000e checks if it's the hardware that
> generated it before processing packets. Consuming an interrupt that
> isn't meant for this device will throw major warnings in the kernel
> about bad interrupt routing, etc. Here's the code from the interrupt
> handler (note the last part of the pasted code):
Yeah, of course you need to ask e1000e if it generated the
interrupt. That part works. The part that actually generates the
interrupt does not. Take a look at original mail...
packet comes
e1000e sets E1000_ICR_INT_ASSERTED bit
e1000e tries to generate an interrupt and fails
50msec passes
AHCI generates interrupt
all the handlers are called
AHCI processes its interrupt, handles disk read
e1000_intr notices E1000_ICR_INT_ASSERTED bit, delivers the packet.
Network still works, only slowly. Ping latency goes down when I use
the disk. That matches what I see.
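
The sequence above can be turned into a toy timeline. This is a sketch under stated assumptions, not a measurement: `service_delays` is a hypothetical helper, all times are made-up milliseconds, and the model assumes the NIC's own interrupt is never delivered, so a packet is handled only when another device on the shared line next fires.

```python
from bisect import bisect_left


def service_delays(packet_times, other_irq_times, nic_irq_delivered):
    """Delay between each packet arriving and the handler chain next running.

    If the NIC's own interrupt is delivered, the packet is handled at once.
    Otherwise it waits for the next interrupt from another device sharing
    the line; None means no such interrupt occurs in the timeline.
    """
    other = sorted(other_irq_times)
    delays = []
    for t in packet_times:
        if nic_irq_delivered:
            delays.append(0)
            continue
        i = bisect_left(other, t)  # first other-device interrupt at or after t
        delays.append(other[i] - t if i < len(other) else None)
    return delays


packets = [11, 61, 135]              # arrival times, ms (illustrative)
idle_disk = list(range(0, 200, 50))  # other devices fire every 50 ms
busy_disk = list(range(0, 200, 2))   # disk read running: every 2 ms

print("idle disk:", service_delays(packets, idle_disk, False))
print("busy disk:", service_delays(packets, busy_disk, False))
print("NIC irq ok:", service_delays(packets, idle_disk, True))
```

In this model the delay tracks the gap between interrupts from the other devices, which is the behaviour described: loading the disk shortens the wait.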
Do you have another explanation?
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
* RE: e1000e on thinkpad x60: interrupt problem
2013-07-09 20:48 ` Pavel Machek
@ 2013-07-09 20:59 ` Ronciak, John
2013-07-09 21:40 ` Jesse Brandeburg
1 sibling, 0 replies; 9+ messages in thread
From: Ronciak, John @ 2013-07-09 20:59 UTC (permalink / raw)
To: Pavel Machek, Waskiewicz Jr, Peter P
Cc: Kirsher, Jeffrey T, Brandeburg, Jesse, Allan, Bruce W, Wyborny,
Carolyn, Skidmore, Donald C, Rose, Gregory V, Duyck, Alexander H,
Dave, Tushar N, e1000-devel, netdev
So are you saying the HW isn't generating an interrupt? Clearly it is as the driver is taking action on it. I went and looked at the first mail I saw (the one I first responded to) but maybe you are talking about another email that I (we) didn't get. I don't see any of what you point to in this email (below). So I still don't think we know what you are asking. Is it that you think an interrupt isn't processed because of the interrupts being shared and the packets get processed on the next interrupt?
Cheers,
John
* Re: e1000e on thinkpad x60: interrupt problem
2013-07-09 20:48 ` Pavel Machek
2013-07-09 20:59 ` Ronciak, John
@ 2013-07-09 21:40 ` Jesse Brandeburg
2013-07-10 3:43 ` [E1000-devel] " Fujinaka, Todd
2013-07-10 11:37 ` Pavel Machek
1 sibling, 2 replies; 9+ messages in thread
From: Jesse Brandeburg @ 2013-07-09 21:40 UTC (permalink / raw)
To: Pavel Machek
Cc: Waskiewicz Jr, Peter P, Ronciak, John, Kirsher, Jeffrey T, Allan,
Bruce W, Wyborny, Carolyn, Skidmore, Donald C, Rose, Gregory V,
Duyck, Alexander H, Dave, Tushar N, e1000-devel, netdev,
jesse.brandeburg
On Tue, 9 Jul 2013 22:48:54 +0200
Pavel Machek <pavel@ucw.cz> wrote:
> Yeah, of course you need to ask e1000e if it generated the
> interrupt. That part works. The part that actually generates the
> interrupt does not. Take a look at original mail...
>
> packet comes
> e1000e sets E1000_ICR_INT_ASSERTED bit
> e1000e tries to generate an interrupt and fails
> 50msec passes
^^ that's the ASPM timeout length.
> AHCI generates interrupt
> all the handlers are called
> AHCI processes its interrupt, handles disk read
> e1000_intr notices E1000_ICR_INT_ASSERTED bit, delivers the packet.
>
> Network still works, only slowly. Ping goes lower when I use the
> disk. That matches what I see.
>
> Do you have other explanation?
Regardless of what others are saying, I believe you have an issue with
ASPM being enabled. All the discussion about shared interrupts is
just a distraction. This issue would still occur (and just be worse)
without a shared interrupt.
You already mentioned that a kernel hack to disable ASPM fixes it, but
you can just boot with different options to turn off ASPM.
pcie_aspm=off
There are known issues with ASPM on this part, and it definitely needs
to be off. If your BIOS has the option to turn it off, that is the
best way to disable it; the second choice is to turn it off using the
kernel option.
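
After rebooting, it is worth confirming that the option actually reached the kernel. A minimal sketch, assuming the command line is read from /proc/cmdline; `kernel_params` and `aspm_disabled_on_cmdline` are hypothetical helpers, and the sample strings in the comments are illustrative, not taken from the machine in this thread.

```python
def kernel_params(cmdline):
    """Split a kernel command line into {name: value-or-None}.

    Quoted values containing spaces are not handled; this is a sketch.
    """
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


def aspm_disabled_on_cmdline(cmdline):
    """True if pcie_aspm=off was passed on the kernel command line."""
    return kernel_params(cmdline).get("pcie_aspm") == "off"


# Illustrative inputs:
#   "BOOT_IMAGE=/vmlinuz root=/dev/sda1 ro pcie_aspm=off"  -> True
#   "BOOT_IMAGE=/vmlinuz root=/dev/sda1 ro quiet"          -> False
```

On a running system the input would be the contents of /proc/cmdline, for example `aspm_disabled_on_cmdline(open("/proc/cmdline").read())`. This only shows that the option was passed, not that the link state actually changed.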
Hope this helps,
Jesse
* RE: [E1000-devel] e1000e on thinkpad x60: interrupt problem
2013-07-09 21:40 ` Jesse Brandeburg
@ 2013-07-10 3:43 ` Fujinaka, Todd
2013-07-10 11:37 ` Pavel Machek
1 sibling, 0 replies; 9+ messages in thread
From: Fujinaka, Todd @ 2013-07-10 3:43 UTC (permalink / raw)
To: Brandeburg, Jesse, Pavel Machek
Cc: e1000-devel, Allan, Bruce W, Brandeburg, Jesse, Ronciak, John, netdev
The latest kernel should turn off ASPM as well, but you should be able to check by looking at lspci -vvv. I think LnkCtl should say "ASPM disabled."
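The check can be scripted by pulling the ASPM field out of the LnkCtl line. A minimal sketch: `lnkctl_aspm_states` is a hypothetical helper, the sample lines in the comments are illustrative rather than captured from the X60, and the exact LnkCtl wording may differ between lspci versions.

```python
import re


def lnkctl_aspm_states(lspci_text):
    """Return the ASPM field of every LnkCtl line in `lspci -vvv` output."""
    # Captures the text between "ASPM " and the first ";" on a LnkCtl line.
    return re.findall(r"LnkCtl:\s*ASPM ([^;]+);", lspci_text)


# Illustrative input lines (assumed format):
#   "LnkCtl: ASPM L0s L1 Enabled; RCB 64 bytes Disabled- Retrain- CommClk+"
#   "LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+"
```

The text would typically come from something like `lspci -vvv -s 02:00.0` run as root, since capability details may not be shown to unprivileged users.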
Sorry for top-posting.
Todd Fujinaka
Software Application Engineer
Networking Division (ND)
Intel Corporation
todd.fujinaka@intel.com
(503) 712-4565
* Re: e1000e on thinkpad x60: interrupt problem
2013-07-09 21:40 ` Jesse Brandeburg
2013-07-10 3:43 ` [E1000-devel] " Fujinaka, Todd
@ 2013-07-10 11:37 ` Pavel Machek
1 sibling, 0 replies; 9+ messages in thread
From: Pavel Machek @ 2013-07-10 11:37 UTC (permalink / raw)
To: Jesse Brandeburg
Cc: Waskiewicz Jr, Peter P, Ronciak, John, Kirsher, Jeffrey T, Allan,
Bruce W, Wyborny, Carolyn, Skidmore, Donald C, Rose, Gregory V,
Duyck, Alexander H, Dave, Tushar N, e1000-devel, netdev
Hi!
> > Yeah, of course you need to ask e1000e if it generated the
> > interrupt. That part works. The part that actually generates the
> > interrupt does not. Take a look at original mail...
> >
> > packet comes
> > e1000e sets E1000_ICR_INT_ASSERTED bit
> > e1000e tries to generate an interrupt and fails
> > 50msec passes
>
> ^^ thats the ASPM timeout length.
>
> > AHCI generates interrupt
> > all the handlers are called
> > AHCI processes its interrupt, handles disk read
> > e1000_intr notices E1000_ICR_INT_ASSERTED bit, delivers the packet.
> >
> > Network still works, only slowly. Ping goes lower when I use the
> > disk. That matches what I see.
> >
> > Do you have other explanation?
>
> Regardless of what others are saying I believe you have an issue with
> ASPM being enabled. All the discussion about shared interrupts, is
> just a distraction. This issue would still occur (and just be worse)
> without a shared interrupt.
Agreed.
> You already mentioned that a kernel hack to disable ASPM fixes it, but
> you can just boot with different options to turn off ASPM.
>
> pcie_aspm=off
Are you sure? AFAICT Linux will not turn off ASPM if ACPI says it is
not supported... hence the hack.
# From: Robert Hancock <hancockrwd@gmail.com>
# To: Pavel Machek <pavel@ucw.cz>
# CC: Greg KH <greg@kroah.com>, kernel list
# <linux-kernel@vger.kernel.org>,
# joe.lawrence@stratus.com, myron.stowe@redhat.com,
# bhelgaas@google.com
# Subject: Re: /sys/module/pcie_aspm/parameters/policy not writable?
# ...
# > pavel@amd:~$ dmesg | grep -i aspm
# > ACPI FADT declares the system doesn't support PCIe ASPM, so disable it
#
# IIRC, this message is somewhat misleading. When that FADT flag is set
# by the BIOS, the kernel doesn't so much disable ASPM as disable the
# kernel's control over ASPM. I believe this was to match Windows
# behavior.
It looks like ASPM needs to be off, but the BIOS enables ASPM and tells
the kernel it is not supported... and that means that the kernel will
not disable it :-(.
I guess there's no way to disable ASPM for a single device from the
driver?
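
For reference, the per-device ASPM setting lives in the low two bits of the PCIe Link Control register. The sketch below only shows the bit arithmetic involved; it is not a tested fix, and the helper names are hypothetical. The constants follow the PCIe capability layout (Link Control at offset 0x10, ASPM Control in bits 1:0). The kernel does have pci_disable_link_state(), but whether it takes effect when the FADT flag is set is exactly the open question here.

```python
PCI_EXP_LNKCTL = 0x10          # offset of Link Control in the PCIe capability
PCI_EXP_LNKCTL_ASPMC = 0x0003  # ASPM Control field, bits 1:0

ASPM_NAMES = {0: "disabled", 1: "L0s", 2: "L1", 3: "L0s+L1"}


def aspm_state(lnkctl):
    """Decode the ASPM Control field of a 16-bit Link Control value."""
    return ASPM_NAMES[lnkctl & PCI_EXP_LNKCTL_ASPMC]


def with_aspm_disabled(lnkctl):
    """Return the Link Control value with the ASPM bits cleared, others kept."""
    return lnkctl & ~PCI_EXP_LNKCTL_ASPMC & 0xFFFF
```

For example, a Link Control value of 0x0043 decodes as L0s+L1, and clearing the field gives 0x0040 with the remaining bits untouched.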
Thanks,
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html