* e1000e on thinkpad x60: interrupt problem
@ 2013-07-09  1:05 Pavel Machek
  2013-07-09 15:51 ` Ronciak, John
  0 siblings, 1 reply; 9+ messages in thread
From: Pavel Machek @ 2013-07-09  1:05 UTC
  To: jeffrey.t.kirsher, jesse.brandeburg, bruce.w.allan,
      carolyn.wyborny, donald.c.skidmore, gregory.v.rose,
      peter.p.waskiewicz.jr, alexander.h.duyck, john.ronciak,
      tushar.n.dave, e1000-devel, netdev

Hi!

I'm using

02:00.0 Ethernet controller: Intel Corporation 82573L Gigabit Ethernet Controller

on thinkpad x60. Kernel is 3.10.

# CONFIG_E100 is not set
# CONFIG_E1000 is not set
CONFIG_E1000E=y

Interrupts are like this:

pavel@amd:/data/l/linux-good$ cat /proc/interrupts
           CPU0       CPU1
  0:   95454037       5192   IO-APIC-edge      timer
  1:      16292         20   IO-APIC-edge      i8042
  3:          9          0   IO-APIC-edge
  4:          9          0   IO-APIC-edge
  7:          0          0   IO-APIC-edge      parport0
  8:          1          0   IO-APIC-edge      rtc0
  9:   19471974       1207   IO-APIC-fasteoi   acpi
 12:     168092         15   IO-APIC-edge      i8042
 14:    3568551        165   IO-APIC-edge      ata_piix
 15:          0          0   IO-APIC-edge      ata_piix
 16:   14033945        877   IO-APIC-fasteoi   i915, ahci, yenta, uhci_hcd:usb2, eth0

but it seems that eth0 is not generating interrupts at all:

...
64 bytes from 10.0.0.251: icmp_req=32 ttl=64 time=26.2 ms
64 bytes from 10.0.0.251: icmp_req=33 ttl=64 time=16.6 ms
64 bytes from 10.0.0.251: icmp_req=34 ttl=64 time=1.14 ms
64 bytes from 10.0.0.251: icmp_req=35 ttl=64 time=56.4 ms
^C
--- 10.0.0.251 ping statistics ---
35 packets transmitted, 35 received, 0% packet loss, time 34140ms
rtt min/avg/max/mdev = 1.024/20.173/88.285/25.281 ms
pavel@amd:/data/l/linux-good$ ping 10.0.0.251

Note huge latencies. But as interrupt is shared with ahci, I can help
with:

pavel@amd:~$ sudo cat /dev/sda > /dev/null
[sudo] password for pavel:

Then latencies get to high but reasonable range:

pavel@amd:/data/l/linux-good$ ping 10.0.0.251
PING 10.0.0.251 (10.0.0.251) 56(84) bytes of data.
64 bytes from 10.0.0.251: icmp_req=1 ttl=64 time=1.14 ms
64 bytes from 10.0.0.251: icmp_req=2 ttl=64 time=3.79 ms
64 bytes from 10.0.0.251: icmp_req=3 ttl=64 time=1.24 ms
64 bytes from 10.0.0.251: icmp_req=4 ttl=64 time=1.54 ms
64 bytes from 10.0.0.251: icmp_req=5 ttl=64 time=2.04 ms
64 bytes from 10.0.0.251: icmp_req=6 ttl=64 time=2.48 ms
64 bytes from 10.0.0.251: icmp_req=7 ttl=64 time=1.90 ms
64 bytes from 10.0.0.251: icmp_req=8 ttl=64 time=2.30 ms
(other attempt)
--- 10.0.0.251 ping statistics ---
22 packets transmitted, 22 received, 0% packet loss, time 21128ms
rtt min/avg/max/mdev = 0.940/1.733/3.604/0.685 ms

root@amd:/data/l/linux-good# ethtool -e eth0
Offset          Values
------          ------
0x0000          00 16 d3 25 19 04 30 0b b2 ff 51 00 ff ff ff ff
0x0010          53 00 03 02 6b 02 7e 20 aa 17 9a 10 86 80 df 80
0x0020          00 00 00 20 54 7e 00 00 14 00 da 00 04 00 00 27
0x0030          c9 6c 50 31 3e 07 0b 04 8b 29 00 00 00 f0 02 0f
0x0040          08 10 00 00 04 0f ff 7f 01 4d ff ff ff ff ff ff
0x0050          14 00 1d 00 14 00 1d 00 af aa 1e 00 00 00 1d 00
0x0060          00 01 00 40 32 12 07 40 ff ff ff ff ff ff ff ff
0x0070          ff ff ff ff ff ff ff ff ff ff ff ff ff ff fd 5d

02:00.0 Ethernet controller: Intel Corporation 82573L Gigabit Ethernet Controller
        Subsystem: Lenovo ThinkPad X60s
        Flags: bus master, fast devsel, latency 0, IRQ 16
        Memory at ee000000 (32-bit, non-prefetchable) [size=128K]
        I/O ports at 2000 [size=32]
        Capabilities: [c8] Power Management version 2
        Capabilities: [d0] MSI: Enable- Count=1/1 Maskable- 64bit+
        Capabilities: [e0] Express Endpoint, MSI 00
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [140] Device Serial Number 00-16-d3-ff-ff-25-19-04
        Kernel driver in use: e1000e

Any ideas how to debug/fix this?

[Maybe related: https://bugzilla.kernel.org/show_bug.cgi?id=6929 but
that was fixed in 2007. And yes, it _is_ better with bigger packets.]

									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
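
To see at a glance which handlers share a line with eth0, the /proc/interrupts listing in the report can be filtered. What follows is only a sketch: `irq_sharers` is a hypothetical helper name, and it assumes the 3.10-era column layout shown above (one count column per CPU, then a single chip-type column, then the handler names). Newer kernels print additional columns, so the field offset would need adjusting there.

```shell
# irq_sharers DEV
# Reads /proc/interrupts-style text on stdin and prints the IRQ number
# and every handler name on the line that lists DEV.
# Assumption: 3.10-style layout, i.e. "IRQ: <one count per CPU> <chip> <names...>".
irq_sharers() {
    awk -v dev="$1" '
        NR == 1 { ncpu = NF; next }      # header row: one column per CPU
        {
            names = ""; hit = 0
            # handler names start after IRQ, the per-CPU counts and the chip type
            for (i = ncpu + 3; i <= NF; i++) {
                name = $i
                sub(/,$/, "", name)      # strip the separating comma
                if (name == dev) hit = 1
                names = names (names == "" ? "" : " ") name
            }
            if (hit) {
                irq = $1
                sub(/:$/, "", irq)
                print "IRQ " irq ": " names
            }
        }'
}

# Usage on a live system:
#   irq_sharers eth0 < /proc/interrupts
```

Fed the table from the report, this prints `IRQ 16: i915 ahci yenta uhci_hcd:usb2 eth0`, which makes it clear that the disk controller is only one of four other handlers on the line.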
* RE: e1000e on thinkpad x60: interrupt problem
  2013-07-09  1:05 e1000e on thinkpad x60: interrupt problem Pavel Machek
@ 2013-07-09 15:51 ` Ronciak, John
  2013-07-09 17:02   ` Pavel Machek
  0 siblings, 1 reply; 9+ messages in thread
From: Ronciak, John @ 2013-07-09 15:51 UTC
  To: Pavel Machek, Kirsher, Jeffrey T, Brandeburg, Jesse,
      Allan, Bruce W, Wyborny, Carolyn, Skidmore, Donald C,
      Rose, Gregory V, Waskiewicz Jr, Peter P, Duyck, Alexander H,
      Dave, Tushar N, e1000-devel, netdev

Nothing appears to be wrong. If the system is seeing ping packets at
all, that means the device is generating interrupts and that they are
being processed. If you are looking at performance, then sharing
interrupts is probably not a good idea. Since you are on a laptop it
may not be easy to separate the networking device onto its own
interrupt. The interrupt is shared with a lot of other devices and not
just ahci.

Cheers,
John
* Re: e1000e on thinkpad x60: interrupt problem
  2013-07-09 15:51 ` Ronciak, John
@ 2013-07-09 17:02   ` Pavel Machek
  2013-07-09 17:15     ` Waskiewicz Jr, Peter P
  0 siblings, 1 reply; 9+ messages in thread
From: Pavel Machek @ 2013-07-09 17:02 UTC
  To: Ronciak, John; +Cc: e1000-devel, Allan, Bruce W, Brandeburg, Jesse, netdev

Hi!

> Nothing appears to be wrong. If the system is seeing ping packets
> at all means that device is generating interrupts and that they are
> being processed. If you are looking at performance then sharing

No, that's not true. There is other interrupt load, and e1000e has big
enough buffers; that means that packets eventually get processed. I
strongly suspect e1000e generates few or no interrupts and packets
only get processed when other devices on the shared interrupt line
generate an interrupt.

> interrupts is probably not a good idea. Since you are on a laptop it
> may not be easy to separate the networking device onto its own
> interrupt. The interrupt is shared with a lot of other devices and
> not just ahci.

Yes, it is wrong. 100msec latency on an unloaded core duo is not
right... and notice how ping latency goes _down_ when I load the other
devices. [100msec latency is so big that it makes interactive work
over ssh hard.]

IOW if the interrupt was not shared, I'd be getting latencies in the
one second range.

It happened before on this machine, and other x60 users are seeing
that, too.

It may have something to do with ASPM, because the hack below makes
latencies lower than 2msec. (And btw shared interrupt load does not
make them worse in a way that can be measured; it stays <2msec with
AHCI load.)

Any ideas for an acceptable solution?

									Pavel

diff --git a/.config b/.config
index 149f713..d7f5a11 100644
--- a/.config
+++ b/.config
@@ -559,9 +559,9 @@ CONFIG_PCIEAER=y
 # CONFIG_PCIEAER_INJECT is not set
 CONFIG_PCIEASPM=y
 CONFIG_PCIEASPM_DEBUG=y
-CONFIG_PCIEASPM_DEFAULT=y
+# CONFIG_PCIEASPM_DEFAULT is not set
 # CONFIG_PCIEASPM_POWERSAVE is not set
-# CONFIG_PCIEASPM_PERFORMANCE is not set
+CONFIG_PCIEASPM_PERFORMANCE=y
 CONFIG_PCIE_PME=y
 CONFIG_ARCH_SUPPORTS_MSI=y
 # CONFIG_PCI_MSI is not set
index e4b1fb2..9a1b63e 100644
--- a/drivers/pci/pci-acpi.c
+++ b/drivers/pci/pci-acpi.c
@@ -382,7 +382,7 @@ static int __init acpi_pci_init(void)
 	if (acpi_gbl_FADT.boot_flags & ACPI_FADT_NO_ASPM) {
 		printk(KERN_INFO"ACPI FADT declares the system doesn't support PCIe ASPM, so disable it\n");
-		pcie_no_aspm();
+//		pcie_no_aspm();
 	}

 	ret = register_acpi_bus_type(&acpi_pci_bus);

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

_______________________________________________
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel® Ethernet, visit http://communities.intel.com/community/wired
* Re: e1000e on thinkpad x60: interrupt problem
  2013-07-09 17:02   ` Pavel Machek
@ 2013-07-09 17:15     ` Waskiewicz Jr, Peter P
  2013-07-09 20:48       ` Pavel Machek
  0 siblings, 1 reply; 9+ messages in thread
From: Waskiewicz Jr, Peter P @ 2013-07-09 17:15 UTC
  To: Pavel Machek
  Cc: Ronciak, John, Kirsher, Jeffrey T, Brandeburg, Jesse,
      Allan, Bruce W, Wyborny, Carolyn, Skidmore, Donald C,
      Rose, Gregory V, Duyck, Alexander H, Dave, Tushar N,
      e1000-devel, netdev

On Tue, 2013-07-09 at 19:02 +0200, Pavel Machek wrote:
> Hi!
>
> > Nothing appears to be wrong. If the system is seeing ping packets
> > at all means that device is generating interrupts and that they are
> > being processed. If you are looking at performance then sharing
>
> No, that's not true. There is other interrupt load, and e1000e has big
> enough buffers; that means that packets eventually get processed. I
> strongly suspect e1000e generates little or no interrupts and packets
> only get processed when other devices on shared interrupt line
> generate interrupt.

If the interrupt is shared, e1000e checks if it's the hardware that
generated it before processing packets. Consuming an interrupt that
isn't meant for this device will throw major warnings in the kernel
about bad interrupt routing, etc. Here's the code from the interrupt
handler (note the last part of the pasted code):

/**
 * e1000_intr - Interrupt Handler
 * @irq: interrupt number
 * @data: pointer to a network interface device structure
 **/
static irqreturn_t e1000_intr(int __always_unused irq, void *data)
{
        struct net_device *netdev = data;
        struct e1000_adapter *adapter = netdev_priv(netdev);
        struct e1000_hw *hw = &adapter->hw;
        u32 rctl, icr = er32(ICR);

        if (!icr || test_bit(__E1000_DOWN, &adapter->state))
                return IRQ_NONE;  /* Not our interrupt */

        /* IMS will not auto-mask if INT_ASSERTED is not set, and if it is
         * not set, then the adapter didn't send an interrupt
         */
        if (!(icr & E1000_ICR_INT_ASSERTED))
                return IRQ_NONE;

Cheers,
-PJ
* Re: e1000e on thinkpad x60: interrupt problem
  2013-07-09 17:15     ` Waskiewicz Jr, Peter P
@ 2013-07-09 20:48       ` Pavel Machek
  2013-07-09 20:59         ` Ronciak, John
  2013-07-09 21:40         ` Jesse Brandeburg
  0 siblings, 2 replies; 9+ messages in thread
From: Pavel Machek @ 2013-07-09 20:48 UTC
  To: Waskiewicz Jr, Peter P
  Cc: Ronciak, John, Kirsher, Jeffrey T, Brandeburg, Jesse,
      Allan, Bruce W, Wyborny, Carolyn, Skidmore, Donald C,
      Rose, Gregory V, Duyck, Alexander H, Dave, Tushar N,
      e1000-devel, netdev

On Tue 2013-07-09 17:15:48, Waskiewicz Jr, Peter P wrote:
> > No, that's not true. There is other interrupt load, and e1000e has big
> > enough buffers; that means that packets eventually get processed. I
> > strongly suspect e1000e generates little or no interrupts and packets
> > only get processed when other devices on shared interrupt line
> > generate interrupt.
>
> If the interrupt is shared, e1000e checks if it's the hardware that
> generated it before processing packets. Consuming an interrupt that
> isn't meant for this device will throw major warnings in the kernel
> about bad interrupt routing, etc. Here's the code from the interrupt
> handler (note the last part of the pasted code):

Yeah, of course you need to ask e1000e if it generated the
interrupt. That part works. The part that actually generates the
interrupt does not. Take a look at the original mail...

  packet comes
  e1000e sets E1000_ICR_INT_ASSERTED bit
  e1000e tries to generate an interrupt and fails
  50msec passes
  AHCI generates interrupt
  all the handlers are called
  AHCI processes its interrupt, handles disk read
  e1000_intr notices E1000_ICR_INT_ASSERTED bit, delivers the packet.

Network still works, only slowly. Ping latency goes down when I use
the disk. That matches what I see.

Do you have another explanation?

									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
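
One rough way to probe the sequence above is to compare how often IRQ 16 fires before and after a ping run. The sketch below sums the per-CPU counters for one IRQ; `irq_total` is a hypothetical helper name, and the parsing assumes the /proc/interrupts layout shown in the original report. Because the line is shared with i915, ahci, yenta and uhci_hcd, the difference counts their interrupts too, so the result is only suggestive and is best read on an otherwise idle machine.

```shell
# irq_total IRQ
# Reads /proc/interrupts-style text on stdin and prints the sum of the
# per-CPU counters for the given IRQ number.
irq_total() {
    awk -v irq="$1:" '
        NR == 1 { ncpu = NF; next }      # header row: one column per CPU
        $1 == irq {
            total = 0
            for (i = 2; i <= ncpu + 1; i++) total += $i
            printf "%.0f\n", total       # avoid exponent notation for big counts
        }'
}

# Usage on a live system (IRQ number and address taken from the report):
#   before=$(irq_total 16 < /proc/interrupts)
#   ping -c 20 10.0.0.251
#   after=$(irq_total 16 < /proc/interrupts)
#   echo "IRQ 16 fired $((after - before)) times during the ping"
```

If the NIC were raising its own interrupts, one would expect the delta to grow at least roughly with the number of packets exchanged; a much smaller delta would be consistent with packets being picked up only when another device on the line fires.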
* RE: e1000e on thinkpad x60: interrupt problem
  2013-07-09 20:48       ` Pavel Machek
@ 2013-07-09 20:59         ` Ronciak, John
  2013-07-09 21:40         ` Jesse Brandeburg
  1 sibling, 0 replies; 9+ messages in thread
From: Ronciak, John @ 2013-07-09 20:59 UTC
  To: Pavel Machek, Waskiewicz Jr, Peter P
  Cc: Kirsher, Jeffrey T, Brandeburg, Jesse, Allan, Bruce W,
      Wyborny, Carolyn, Skidmore, Donald C, Rose, Gregory V,
      Duyck, Alexander H, Dave, Tushar N, e1000-devel, netdev

So are you saying the HW isn't generating an interrupt? Clearly it is
as the driver is taking action on it. I went and looked at the first
mail I saw (the one I first responded to) but maybe you are talking
about another email that I (we) didn't get. I don't see any of what
you point to in this email (below). So I still don't think we know
what you are asking. Is it that you think an interrupt isn't processed
because of the interrupts being shared and the packets get processed
on the next interrupt?

Cheers,
John

> -----Original Message-----
> From: Pavel Machek [mailto:pavel@ucw.cz]
> Sent: Tuesday, July 09, 2013 1:49 PM
> To: Waskiewicz Jr, Peter P
> Cc: Ronciak, John; Kirsher, Jeffrey T; Brandeburg, Jesse; Allan, Bruce
> W; Wyborny, Carolyn; Skidmore, Donald C; Rose, Gregory V; Duyck,
> Alexander H; Dave, Tushar N; e1000-devel@lists.sourceforge.net;
> netdev@vger.kernel.org
> Subject: Re: e1000e on thinkpad x60: interrupt problem
>
> Yeah, of course you need to ask e1000e if it generated the interrupt.
> That part works. The part that actually generates the interrupt does
> not. Take a look at original mail...
>
>   packet comes
>   e1000e sets E1000_ICR_INT_ASSERTED bit
>   e1000e tries to generate an interrupt and fails
>   50msec passes
>   AHCI generates interrupt
>   all the handlers are called
>   AHCI processes its interrupt, handles disk read
>   e1000_intr notices E1000_ICR_INT_ASSERTED bit, delivers the packet.
>
> Network still works, only slowly. Ping goes lower when I use the disk.
> That matches what I see.
>
> Do you have other explanation?
> 								Pavel
* Re: e1000e on thinkpad x60: interrupt problem
  2013-07-09 20:48       ` Pavel Machek
  2013-07-09 20:59         ` Ronciak, John
@ 2013-07-09 21:40         ` Jesse Brandeburg
  2013-07-10  3:43           ` [E1000-devel] " Fujinaka, Todd
  2013-07-10 11:37           ` Pavel Machek
  1 sibling, 2 replies; 9+ messages in thread
From: Jesse Brandeburg @ 2013-07-09 21:40 UTC
  To: Pavel Machek
  Cc: Waskiewicz Jr, Peter P, Ronciak, John, Kirsher, Jeffrey T,
      Allan, Bruce W, Wyborny, Carolyn, Skidmore, Donald C,
      Rose, Gregory V, Duyck, Alexander H, Dave, Tushar N,
      e1000-devel, netdev, jesse.brandeburg

On Tue, 9 Jul 2013 22:48:54 +0200
Pavel Machek <pavel@ucw.cz> wrote:
> Yeah, of course you need to ask e1000e if it generated the
> interrupt. That part works. The part that actually generates the
> interrupt does not. Take a look at original mail...
>
>   packet comes
>   e1000e sets E1000_ICR_INT_ASSERTED bit
>   e1000e tries to generate an interrupt and fails
>   50msec passes

^^ that's the ASPM timeout length.

>   AHCI generates interrupt
>   all the handlers are called
>   AHCI processes its interrupt, handles disk read
>   e1000_intr notices E1000_ICR_INT_ASSERTED bit, delivers the packet.
>
> Network still works, only slowly. Ping goes lower when I use the
> disk. That matches what I see.
>
> Do you have other explanation?

Regardless of what others are saying, I believe you have an issue with
ASPM being enabled. All the discussion about shared interrupts is
just a distraction. This issue would still occur (and just be worse)
without a shared interrupt.

You already mentioned that a kernel hack to disable ASPM fixes it, but
you can just boot with different options to turn off ASPM.

  pcie_aspm=off

There are known issues with ASPM on this part, and it definitely needs
to be off. If your BIOS has the option to turn it off, that is the
best way to disable it; the second choice is to turn it off using the
kernel option.

Hope this helps,
  Jesse
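
After rebooting with the option Jesse names, it can be worth confirming that it actually reached the kernel. The sketch below checks a command-line string for the `pcie_aspm=off` token; `cmdline_has_aspm_off` is a hypothetical helper name. It only shows that the parameter was passed, not that ASPM ended up disabled on the link.

```shell
# cmdline_has_aspm_off CMDLINE
# Succeeds (exit status 0) if the given kernel command line contains
# pcie_aspm=off as a separate, whole token.
cmdline_has_aspm_off() {
    case " $1 " in
        *" pcie_aspm=off "*) return 0 ;;
        *)                   return 1 ;;
    esac
}

# Usage on a live system:
#   if cmdline_has_aspm_off "$(cat /proc/cmdline)"; then
#       echo "booted with pcie_aspm=off"
#   else
#       echo "pcie_aspm=off not on the command line"
#   fi
```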
* RE: [E1000-devel] e1000e on thinkpad x60: interrupt problem
  2013-07-09 21:40         ` Jesse Brandeburg
@ 2013-07-10  3:43           ` Fujinaka, Todd
  2013-07-10 11:37           ` Pavel Machek
  1 sibling, 0 replies; 9+ messages in thread
From: Fujinaka, Todd @ 2013-07-10  3:43 UTC
  To: Brandeburg, Jesse, Pavel Machek
  Cc: e1000-devel, Allan, Bruce W, Brandeburg, Jesse, Ronciak, John, netdev

The latest kernel should turn off ASPM as well, but you should be able
to check by looking at lspci -vvv. I think LnkCtl should say "ASPM
disabled."

Sorry for top-posting.

Todd Fujinaka
Software Application Engineer
Networking Division (ND)
Intel Corporation
todd.fujinaka@intel.com
(503) 712-4565
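
The check Todd describes can be narrowed to the one field of interest. The filter below is a sketch under an assumption: that lspci prints the link control line in the form `LnkCtl: ASPM <state>; ...`, as the pciutils versions I am aware of do. `lnkctl_aspm` is a hypothetical helper name, and 02:00.0 is the device address from the original report. Reading the capability list generally needs root.

```shell
# lnkctl_aspm
# Reads `lspci -vvv` output on stdin and prints the ASPM state text from
# every LnkCtl line, e.g. "Disabled" or "L0s L1 Enabled".
# Assumption: the line looks like "LnkCtl: ASPM <state>; <other fields>".
lnkctl_aspm() {
    sed -n 's/^[[:space:]]*LnkCtl:[[:space:]]*ASPM \([^;]*\);.*/\1/p'
}

# Usage on a live system:
#   sudo lspci -vvv -s 02:00.0 | lnkctl_aspm
```

Since ASPM is negotiated per link, the upstream port that 02:00.0 hangs off is worth checking the same way.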
* Re: e1000e on thinkpad x60: interrupt problem
  2013-07-09 21:40         ` Jesse Brandeburg
  2013-07-10  3:43           ` [E1000-devel] " Fujinaka, Todd
@ 2013-07-10 11:37           ` Pavel Machek
  1 sibling, 0 replies; 9+ messages in thread
From: Pavel Machek @ 2013-07-10 11:37 UTC
  To: Jesse Brandeburg
  Cc: Waskiewicz Jr, Peter P, Ronciak, John, Kirsher, Jeffrey T,
      Allan, Bruce W, Wyborny, Carolyn, Skidmore, Donald C,
      Rose, Gregory V, Duyck, Alexander H, Dave, Tushar N,
      e1000-devel, netdev

Hi!

> > Yeah, of course you need to ask e1000e if it generated the
> > interrupt. That part works. The part that actually generates the
> > interrupt does not. Take a look at original mail...
> >
> >   packet comes
> >   e1000e sets E1000_ICR_INT_ASSERTED bit
> >   e1000e tries to generate an interrupt and fails
> >   50msec passes
>
> ^^ thats the ASPM timeout length.
>
> >   AHCI generates interrupt
> >   all the handlers are called
> >   AHCI processes its interrupt, handles disk read
> >   e1000_intr notices E1000_ICR_INT_ASSERTED bit, delivers the packet.
> >
> > Network still works, only slowly. Ping goes lower when I use the
> > disk. That matches what I see.
> >
> > Do you have other explanation?
>
> Regardless of what others are saying I believe you have an issue with
> ASPM being enabled. All the discussion about shared interrupts, is
> just a distraction. This issue would still occur (and just be worse)
> without a shared interrupt.

Agreed.

> You already mentioned that a kernel hack to disable ASPM fixes it, but
> you can just boot with different options to turn off ASPM.
>
> pcie_aspm=off

Are you sure? AFAICT linux will not turn off aspm if ACPI says
so.. hence the hack.

# From: Robert Hancock <hancockrwd@gmail.com>
# To: Pavel Machek <pavel@ucw.cz>
# CC: Greg KH <greg@kroah.com>, kernel list <linux-kernel@vger.kernel.org>,
#     joe.lawrence@stratus.com, myron.stowe@redhat.com,
#     bhelgaas@google.com
# Subject: Re: /sys/module/pcie_aspm/parameters/policy not writable?
# ...
# > pavel@amd:~$ dmesg | grep -i aspm
# > ACPI FADT declares the system doesn't support PCIe ASPM, so disable it
#
# IIRC, this message is somewhat misleading. When that FADT flag is set
# by the BIOS, the kernel doesn't so much disable ASPM as disable the
# kernel's control over ASPM. I believe this was to match Windows
# behavior.

It looks like ASPM needs to be off, but BIOS enables ASPM and tells
kernel it is not supported... and that means that kernel will not
disable it :-(.

I guess there's no way to do ASPM disable for single device from the
driver?

Thanks,
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
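
For reference when reading the question above: in the PCI Express Link Control register, the ASPM Control field is bits 1:0 (00 disabled, 01 L0s, 10 L1, 11 both). The sketch below only does that bit arithmetic on a hex register value; `aspm_state` and `aspm_cleared` are hypothetical helper names. It does not touch hardware, and it does not settle whether a driver can make the change stick on a machine whose BIOS sets the FADT no-ASPM flag.

```shell
# aspm_state HEX
# Decodes bits 1:0 (ASPM Control) of a Link Control value given as hex
# digits without a 0x prefix, e.g. 0043.
aspm_state() {
    case $(( 0x$1 & 0x3 )) in
        0) echo "Disabled" ;;
        1) echo "L0s" ;;
        2) echo "L1" ;;
        3) echo "L0s L1" ;;
    esac
}

# aspm_cleared HEX
# Prints the same register value with the two ASPM Control bits cleared,
# leaving all other Link Control bits untouched.
aspm_cleared() {
    printf '%04x\n' $(( 0x$1 & ~0x3 ))
}

# Example with a made-up register value (not read from the X60):
#   aspm_state 0043      # L0s L1
#   aspm_cleared 0043    # 0040
```

Both ends of the link carry their own Link Control register, so any real change would presumably have to be applied to the NIC and to its upstream port.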
end of thread, other threads:[~2013-07-10 11:37 UTC | newest]

Thread overview: 9+ messages
2013-07-09  1:05 e1000e on thinkpad x60: interrupt problem Pavel Machek
2013-07-09 15:51 ` Ronciak, John
2013-07-09 17:02   ` Pavel Machek
2013-07-09 17:15     ` Waskiewicz Jr, Peter P
2013-07-09 20:48       ` Pavel Machek
2013-07-09 20:59         ` Ronciak, John
2013-07-09 21:40         ` Jesse Brandeburg
2013-07-10  3:43           ` [E1000-devel] " Fujinaka, Todd
2013-07-10 11:37           ` Pavel Machek