* [Xenomai] Ethernet driver issue
@ 2013-05-21 18:58 Jeff Webb
  2013-05-23 16:30 ` [Xenomai] IRQ issue (was Ethernet driver issue) Jeff Webb
  0 siblings, 1 reply; 7+ messages in thread
From: Jeff Webb @ 2013-05-21 18:58 UTC (permalink / raw)
  To: Xenomai

I am setting up a new lab machine (x86-64) with two ethernet interfaces (one on-board, and one in a PCI slot).  The secondary PCI card works fine under standard linux, but does not work when running a xenomai-patched kernel.  In the latter case, the OS brings up the eth1 interface, but I am unable to ping anything.  No bytes are received as shown via 'ifconfig':

eth1      Link encap:Ethernet  HWaddr 90:e2:ba:1b:61:70
           inet addr:192.168.12.21  Bcast:192.168.12.255  Mask:255.255.255.0
           inet6 addr: fe80::92e2:baff:fe1b:6170/64 Scope:Link
           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
           RX packets:0 errors:0 dropped:84 overruns:0 frame:0
           TX packets:34 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:1000
           RX bytes:0 (0.0 B)  TX bytes:8040 (8.0 KB)

This machine is a quad-core Xeon W3520 @ 2.67GHz running Ubuntu 12.04.  I started out with a custom-built 3.5.7/xenomai-2.6.2.1 kernel package using Ubuntu's config as a starting point.  When that didn't work, I rebuilt a vanilla 3.5.7 kernel using the same configuration.  The ethernet worked fine under that kernel, so it seems to be a xenomai/i-pipe related issue.  I then built a kernel using code from the ipipe-core-3.5.7 and xenomai-2.6 git repositories, but this did not improve things.  I don't see any kernel panics, but I see a couple of spurious interrupt messages in the syslog:

[   28.585160] I-pipe: spurious interrupt 32
[   68.537855] I-pipe: spurious interrupt 32

That is not the IRQ associated with the ethernet card.  I have seen this same message on other machines, but I have not tracked down the cause.  Here is the output of 'sudo lspci -vv' for the problematic ethernet card under xenomai:

06:04.0 Ethernet controller: Intel Corporation 82541PI Gigabit Ethernet Controller (rev 05)
         Subsystem: Intel Corporation PRO/1000 GT Desktop Adapter
         Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
         Status: Cap+ 66MHz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx+
         Latency: 64 (63750ns min), Cache Line Size: 64 bytes
         Interrupt: pin A routed to IRQ 16
         Region 0: Memory at f3bc0000 (32-bit, non-prefetchable) [size=128K]
         Region 1: Memory at f3be0000 (32-bit, non-prefetchable) [size=128K]
         Region 2: I/O ports at ccc0 [size=64]
         Expansion ROM at f3c00000 [disabled] [size=128K]
         Capabilities: [dc] Power Management version 2
                 Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
                 Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=1 PME-
         Capabilities: [e4] PCI-X non-bridge device
                 Command: DPERE- ERO+ RBC=512 OST=1
                 Status: Dev=00:00.0 64bit- 133MHz- SCD- USC- DC=simple DMMRBC=2048 DMOST=1 DMCRS=8 RSCEM- 266MHz- 533MHz-
         Kernel driver in use: e1000
         Kernel modules: e1000

On a working kernel, the output is similar, except for the last character in the status line:

         Status: Cap+ 66MHz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-

This is the output of /proc/interrupts under xenomai:

             CPU0       CPU1       CPU2       CPU3       CPU4       CPU5       CPU6       CPU7
    0:         77          0          0          0          0          0          0          0   IO-APIC-edge      timer
    1:          3          0          0          0          0          0          0          0   IO-APIC-edge      i8042
    7:          1          0          0          0          0          0          0          0   IO-APIC-edge      parport0
    8:          1          0          0          0          0          0          0          0   IO-APIC-edge      rtc0
    9:          0          0          0          0          0          0          0          0   IO-APIC-fasteoi   acpi
   12:          4          0          0          0          0          0          0          0   IO-APIC-edge      i8042
   16:         38          0          0          0          0          0          0          0   IO-APIC-fasteoi   uhci_hcd:usb3, eth1
   17:        200       5351        351          0          0          0          0          0   IO-APIC-fasteoi   uhci_hcd:usb4, uhci_hcd:usb7
   18:          0          0          0          0          0          0          0          0   IO-APIC-fasteoi   uhci_hcd:usb8
   22:          3          0          0          0          0          0          0          0   IO-APIC-fasteoi   ehci_hcd:usb1, uhci_hcd:usb5
   23:         60          0          0          0          0          0          0          0   IO-APIC-fasteoi   ehci_hcd:usb2, uhci_hcd:usb6
   34:        113        131          0          0          0          0          0          0   IO-APIC-fasteoi   snd_hda_intel
   66:       6177          0       9890          0          0          0          0          0   PCI-MSI-edge      ahci
   67:        245          0          0        129          0          0          0          0   PCI-MSI-edge      snd_hda_intel
   68:         55          0          0          0          0       5804          0          0   PCI-MSI-edge      eth0
  NMI:         16         11         17          9         15         19         14         19   Non-maskable interrupts
  LOC:      25255      14281      21555      14835      16608      18121      14736      19355   Local timer interrupts
  SPU:          0          0          0          0          0          0          0          0   Spurious interrupts
  PMI:         16         11         17          9         15         19         14         19   Performance monitoring interrupts
  IWI:          0          0          0          0          0          0          0          0   IRQ work interrupts
  RTR:          7          0          0          0          0          0          0          0   APIC ICR read retries
  RES:      40927      39497      33477      29679       5827       6441       7713       5641   Rescheduling interrupts
  CAL:        242        360        425        393        418        416        393        339   Function call interrupts
  TLB:        743        739        826        819        659       1038       1074       1160   TLB shootdowns
  TRM:          0          0          0          0          0          0          0          0   Thermal event interrupts
  THR:          0          0          0          0          0          0          0          0   Threshold APIC interrupts
  MCE:          0          0          0          0          0          0          0          0   Machine check exceptions
  MCP:          4          4          4          4          4          4          4          4   Machine check polls
  ERR:          0
  MIS:          0

On a working kernel, I get this:

             CPU0       CPU1       CPU2       CPU3       CPU4       CPU5       CPU6       CPU7
    0:         76          0          0          0          0          0          0          0   IO-APIC-edge      timer
    1:          3          0          0          0          0          0          0          0   IO-APIC-edge      i8042
    7:          1          0          0          0          0          0          0          0   IO-APIC-edge      parport0
    8:          1          0          0          0          0          0          0          0   IO-APIC-edge      rtc0
    9:          0          0          0          0          0          0          0          0   IO-APIC-fasteoi   acpi
   12:          4          0          0          0          0          0          0          0   IO-APIC-edge      i8042
   16:         35          0          0          0       1032          0          0          0   IO-APIC-fasteoi   uhci_hcd:usb3, eth1
   17:         86        393        460          0          0          0          0          0   IO-APIC-fasteoi   uhci_hcd:usb4, uhci_hcd:usb7
   18:          0          0          0          0          0          0          0          0   IO-APIC-fasteoi   uhci_hcd:usb8
   22:          3          0          0          0          0          0          0          0   IO-APIC-fasteoi   ehci_hcd:usb1, uhci_hcd:usb5
   23:         61          0          0          0          0          0          0          0   IO-APIC-fasteoi   ehci_hcd:usb2, uhci_hcd:usb6
   34:        193        232          0          0          0          0          0          0   IO-APIC-fasteoi   snd_hda_intel
   66:       6247          0       8704          0          0          0          0          0   PCI-MSI-edge      ahci
   67:        245          0          0         57          0          0          0          0   PCI-MSI-edge      snd_hda_intel
   68:         80          0          0          0          0      11328          0          0   PCI-MSI-edge      eth0
  NMI:          7         10         11          9         14         17         14         14   Non-maskable interrupts
  LOC:      30352       9408      12686       7699       6434       9800       8633       6039   Local timer interrupts
  SPU:          0          0          0          0          0          0          0          0   Spurious interrupts
  PMI:          7         10         11          9         14         17         14         14   Performance monitoring interrupts
  IWI:          0          0          0          0          0          0          0          0   IRQ work interrupts
  RTR:          7          0          0          0          0          0          0          0   APIC ICR read retries
  RES:      12970       1023        383        210        176        123        114        157   Rescheduling interrupts
  CAL:        152        385        395        397        371        388        404        389   Function call interrupts
  TLB:        579        604        677        666        925        763       1179       1090   TLB shootdowns
  TRM:          0          0          0          0          0          0          0          0   Thermal event interrupts
  THR:          0          0          0          0          0          0          0          0   Threshold APIC interrupts
  MCE:          0          0          0          0          0          0          0          0   Machine check exceptions
  MCP:          8          8          8          8          8          8          8          8   Machine check polls

So, it seems to me that some IRQ 16 interrupts are not getting through to linux.
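
The comparison of the two /proc/interrupts dumps above can also be done mechanically by summing each IRQ's per-CPU counts and printing the totals side by side.  The following is only a minimal Python sketch under stated assumptions: the function names parse_interrupts and compare_totals are hypothetical, and each dump is assumed to have been saved to a text file beforehand.

```python
def parse_interrupts(text):
    """Map IRQ label -> (count summed over all CPUs, description)."""
    lines = text.strip().splitlines()
    ncpu = len(lines[0].split())           # header row: CPU0 CPU1 ...
    table = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        counts = []
        for field in fields[:ncpu]:        # rows such as ERR/MIS hold one count
            if not field.isdigit():
                break
            counts.append(int(field))
        table[label.strip()] = (sum(counts), " ".join(fields[len(counts):]))
    return table


def compare_totals(broken_text, working_text):
    """Rows of (label, total_broken, total_working, description) for shared IRQs."""
    broken = parse_interrupts(broken_text)
    working = parse_interrupts(working_text)
    return [
        (label, total, working[label][0], desc)
        for label, (total, desc) in broken.items()
        if label in working
    ]


# Hypothetical file names holding the two dumps:
# rows = compare_totals(open("xenomai.txt").read(), open("vanilla.txt").read())
# for row in rows:
#     print("%5s %10d %10d  %s" % row)
```

For the two dumps shown above, the IRQ 16 row would read 38 under xenomai versus 1067 (35 + 1032) under the working kernel.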

Can anyone tell me how to proceed in debugging this issue?  What other information do you need?

Thanks,

-Jeff



* Re: [Xenomai] IRQ issue (was Ethernet driver issue)
  2013-05-21 18:58 [Xenomai] Ethernet driver issue Jeff Webb
@ 2013-05-23 16:30 ` Jeff Webb
  2013-05-23 17:39   ` Jeroen Van den Keybus
  0 siblings, 1 reply; 7+ messages in thread
From: Jeff Webb @ 2013-05-23 16:30 UTC (permalink / raw)
  To: xenomai

It turns out that if I switch the PCI ethernet card to another slot (IRQ 17 instead of 16), it works fine.  Here's another data point:  If I plug this card into a PCI to PCIe adapter and put it in a PCIe slot, the card works, but almost immediately the mouse and keyboard become almost non-responsive.  By that, I mean most keystrokes are missed and the mouse position is only updated every second or two.  I have seen this behavior under Xenomai on this machine a couple times before after running for a much longer time.  Maybe there's an issue related to USB and interrupts?

Does the difference in the last character of the status line mentioned in my previous email indicate that the card may be requesting an interrupt, but never serviced?

Any ideas?

-Jeff



* Re: [Xenomai] IRQ issue (was Ethernet driver issue)
  2013-05-23 16:30 ` [Xenomai] IRQ issue (was Ethernet driver issue) Jeff Webb
@ 2013-05-23 17:39   ` Jeroen Van den Keybus
  2013-05-23 21:17     ` Jeff Webb
  0 siblings, 1 reply; 7+ messages in thread
From: Jeroen Van den Keybus @ 2013-05-23 17:39 UTC (permalink / raw)
  To: Jeff Webb; +Cc: xenomai

> It turns out that if I switch the PCI ethernet card to another slot (IRQ
> 17 instead of 16), it works fine.  Here's another data point:  If I plug
> this card into a PCI to PCIe adapter and put it in a PCIe slot, the card
> works, but almost immediately the mouse and keyboard become almost
> non-responsive.  By that, I mean most keystrokes are missed and the mouse
> position is only updated every second or two.  I have seen this behavior
> under Xenomai on this machine a couple times before after running for a
> much longer time.  Maybe there's an issue related to USB and interrupts?
>
> Does the difference in the last character of the status line mentioned in
> my previous email indicate that the card may be requesting an interrupt,
> but never serviced?
>

If you mean INTx+, yes. In combination with DisINTx- it indicates a pending
interrupt.
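
The two flags that lspci prints come from the PCI Command and Status registers: DisINTx is bit 10 of the Command register (offset 0x04) and INTx is bit 3 of the Status register (offset 0x06).  As a rough sketch, they could be read directly from sysfs as below.  The helper names intx_state and read_config are hypothetical, and the device address in the usage comment assumes PCI domain 0000.

```python
import struct

PCI_COMMAND_INTX_DISABLE = 1 << 10   # Command register (offset 0x04), bit 10
PCI_STATUS_INTERRUPT = 1 << 3        # Status register (offset 0x06), bit 3


def intx_state(config):
    """Decode the INTx flags from the start of a PCI configuration space dump."""
    command, status = struct.unpack_from("<HH", config, 4)
    disabled = bool(command & PCI_COMMAND_INTX_DISABLE)
    asserted = bool(status & PCI_STATUS_INTERRUPT)
    # "pending" mirrors the INTx+ / DisINTx- combination described above.
    return {"DisINTx": disabled, "INTx": asserted,
            "pending": asserted and not disabled}


def read_config(address):
    """Read the standard 64-byte header, e.g. address = "0000:06:04.0"."""
    with open("/sys/bus/pci/devices/%s/config" % address, "rb") as f:
        return f.read(64)


# Usage (assumed address of the e1000 card in this thread):
# print(intx_state(read_config("0000:06:04.0")))
```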

> Any ideas?

Check with lspci what your PCI-PCIe bridge is. I once had serious issues
with an ASMedia bridge that did not send a PCIe IRQ deassert message.
Could be a mainboard issue as well.

Since you experience very slow response with another PCI-PCIe bridge, also
check the number of IRQs in /proc/interrupts and /proc/xenomai/irq.
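
To see whether an IRQ line is firing far more often than expected, two snapshots of /proc/interrupts taken a few seconds apart can be turned into per-second rates.  This is a minimal sketch with hypothetical function names (irq_totals, irq_rates, sample).  It only handles /proc/interrupts, since the layout of /proc/xenomai/irq is not shown in this thread.

```python
import time


def irq_totals(text):
    """Map IRQ label -> count summed over CPUs, from /proc/interrupts text."""
    totals = {}
    for line in text.splitlines()[1:]:       # skip the CPU header row
        label, sep, rest = line.partition(":")
        if not sep:
            continue
        total = 0
        for field in rest.split():
            if not field.isdigit():          # first non-numeric field ends the counts
                break
            total += int(field)
        totals[label.strip()] = total
    return totals


def irq_rates(before, after, seconds):
    """Interrupts per second for every IRQ present in both snapshots."""
    b, a = irq_totals(before), irq_totals(after)
    return {irq: (a[irq] - b[irq]) / seconds for irq in a if irq in b}


def sample(seconds=5.0, path="/proc/interrupts"):
    """Take two snapshots `seconds` apart and return the per-IRQ rates."""
    with open(path) as f:
        first = f.read()
    time.sleep(seconds)
    with open(path) as f:
        second = f.read()
    return irq_rates(first, second, seconds)


# Usage: print the busiest lines first.
# for irq, rate in sorted(sample().items(), key=lambda kv: -kv[1])[:10]:
#     print(irq, round(rate, 1))
```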

Also check very thoroughly that the issue does not occur in plain Linux
(check dmesg for 'Nobody cared'). It can take hours of testing to trigger
the problem and maybe the I-pipe exposes it more quickly. My personal
feeling on the problem back then was that it could have something to do
with the interrupt being serviced very fast.
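
For a long soak test under plain Linux, the dmesg check can be scripted so that it runs periodically.  The sketch below assumes the usual mainline wording of the message ("irq N: nobody cared"), and the function names unhandled_irqs and kernel_log are hypothetical.

```python
import re
import subprocess

# Matches e.g.: irq 16: nobody cared (try booting with the "irqpoll" option)
NOBODY_CARED = re.compile(r"irq (\d+): nobody cared", re.IGNORECASE)


def unhandled_irqs(log_text):
    """Return the sorted IRQ numbers reported as 'nobody cared' in a kernel log."""
    return sorted({int(m.group(1)) for m in NOBODY_CARED.finditer(log_text)})


def kernel_log():
    """Capture the current kernel ring buffer (may need root on some systems)."""
    result = subprocess.run(["dmesg"], capture_output=True, text=True, check=True)
    return result.stdout


# Usage:
# print(unhandled_irqs(kernel_log()))
```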


Jeroen




* Re: [Xenomai] IRQ issue (was Ethernet driver issue)
  2013-05-23 17:39   ` Jeroen Van den Keybus
@ 2013-05-23 21:17     ` Jeff Webb
  2013-05-24 18:00       ` Jeff Webb
  0 siblings, 1 reply; 7+ messages in thread
From: Jeff Webb @ 2013-05-23 21:17 UTC (permalink / raw)
  To: xenomai

On 05/23/2013 12:39 PM, Jeroen Van den Keybus wrote:
> If you mean INTx+, yes. In combination with DisINTx- it indicates a pending interrupt.

Yes, that's what I meant.  Thanks for confirming -- that helps.

> Check with lspci what your PCI-PCIe bridge is. I once had serious issues with an ASMedia bridge that did not send a PCIe IRQ deassert message. Could be a mainboard issue as well.

I don't think I have that particular bridge.  Here is my (brief) lspci output to give a better indication of my chipset, in case anyone is interested.  The secondary PCIe to PCI adapter is not plugged in at the moment, and the PCI ethernet card is plugged into the IRQ 17 slot, which works.

00:00.0 Host bridge: Intel Corporation 5520/5500/X58 I/O Hub to ESI Port (rev 13)
00:01.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 1 (rev 13)
00:03.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 3 (rev 13)
00:07.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 7 (rev 13)
00:14.0 PIC: Intel Corporation 5520/5500/X58 I/O Hub System Management Registers (rev 13)
00:14.1 PIC: Intel Corporation 5520/5500/X58 I/O Hub GPIO and Scratch Pad Registers (rev 13)
00:14.2 PIC: Intel Corporation 5520/5500/X58 I/O Hub Control Status and RAS Registers (rev 13)
00:1a.0 USB controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #4
00:1a.1 USB controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #5
00:1a.2 USB controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #6
00:1a.7 USB controller: Intel Corporation 82801JI (ICH10 Family) USB2 EHCI Controller #2
00:1b.0 Audio device: Intel Corporation 82801JI (ICH10 Family) HD Audio Controller
00:1c.0 PCI bridge: Intel Corporation 82801JI (ICH10 Family) PCI Express Root Port 1
00:1c.5 PCI bridge: Intel Corporation 82801JI (ICH10 Family) PCI Express Root Port 6
00:1d.0 USB controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #1
00:1d.1 USB controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #2
00:1d.2 USB controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #3
00:1d.7 USB controller: Intel Corporation 82801JI (ICH10 Family) USB2 EHCI Controller #1
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 90)
00:1f.0 ISA bridge: Intel Corporation 82801JIR (ICH10R) LPC Interface Controller
00:1f.2 SATA controller: Intel Corporation 82801JI (ICH10 Family) SATA AHCI Controller
00:1f.3 SMBus: Intel Corporation 82801JI (ICH10 Family) SMBus Controller
02:00.0 VGA compatible controller: NVIDIA Corporation GF116 [GeForce GTX 550 Ti] (rev a1)
02:00.1 Audio device: NVIDIA Corporation GF116 High Definition Audio Controller (rev a1)
05:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5761 Gigabit Ethernet PCIe (rev 10)
06:04.0 Serial controller: Commtech, Inc. Fastcom 422/2-PCI-335 (rev 02)
06:05.0 Ethernet controller: Intel Corporation 82541PI Gigabit Ethernet Controller (rev 05)
3f:00.0 Host bridge: Intel Corporation Xeon 5500/Core i7 QuickPath Architecture Generic Non-Core Registers (rev 05)
3f:00.1 Host bridge: Intel Corporation Xeon 5500/Core i7 QuickPath Architecture System Address Decoder (rev 05)
3f:02.0 Host bridge: Intel Corporation Xeon 5500/Core i7 QPI Link 0 (rev 05)
3f:02.1 Host bridge: Intel Corporation Xeon 5500/Core i7 QPI Physical 0 (rev 05)
3f:03.0 Host bridge: Intel Corporation Xeon 5500/Core i7 Integrated Memory Controller (rev 05)
3f:03.1 Host bridge: Intel Corporation Xeon 5500/Core i7 Integrated Memory Controller Target Address Decoder (rev 05)
3f:03.4 Host bridge: Intel Corporation Xeon 5500/Core i7 Integrated Memory Controller Test Registers (rev 05)
3f:04.0 Host bridge: Intel Corporation Xeon 5500/Core i7 Integrated Memory Controller Channel 0 Control Registers (rev 05)
3f:04.1 Host bridge: Intel Corporation Xeon 5500/Core i7 Integrated Memory Controller Channel 0 Address Registers (rev 05)
3f:04.2 Host bridge: Intel Corporation Xeon 5500/Core i7 Integrated Memory Controller Channel 0 Rank Registers (rev 05)
3f:04.3 Host bridge: Intel Corporation Xeon 5500/Core i7 Integrated Memory Controller Channel 0 Thermal Control Registers (rev 05)
3f:05.0 Host bridge: Intel Corporation Xeon 5500/Core i7 Integrated Memory Controller Channel 1 Control Registers (rev 05)
3f:05.1 Host bridge: Intel Corporation Xeon 5500/Core i7 Integrated Memory Controller Channel 1 Address Registers (rev 05)
3f:05.2 Host bridge: Intel Corporation Xeon 5500/Core i7 Integrated Memory Controller Channel 1 Rank Registers (rev 05)
3f:05.3 Host bridge: Intel Corporation Xeon 5500/Core i7 Integrated Memory Controller Channel 1 Thermal Control Registers (rev 05)
3f:06.0 Host bridge: Intel Corporation Xeon 5500/Core i7 Integrated Memory Controller Channel 2 Control Registers (rev 05)
3f:06.1 Host bridge: Intel Corporation Xeon 5500/Core i7 Integrated Memory Controller Channel 2 Address Registers (rev 05)
3f:06.2 Host bridge: Intel Corporation Xeon 5500/Core i7 Integrated Memory Controller Channel 2 Rank Registers (rev 05)
3f:06.3 Host bridge: Intel Corporation Xeon 5500/Core i7 Integrated Memory Controller Channel 2 Thermal Control Registers (rev 05)

> Since you experience very slow response with another PCI-PCIe bridge, also check the number of IRQs in /proc/interrupts and /proc/xenomai/irq.

I will recreate this scenario and check on that.

> Also check very thoroughly that the issue does not occur in plain Linux (check dmesg for 'Nobody cared'). It can take hours of testing to trigger the problem and maybe the I-pipe exposes it more quickly. My personal feeling on the problem back then was that it could have something to do with the interrupt being serviced very fast.

Thanks for that tip.  I haven't seen that message yet, but I haven't tested the system thoroughly as you suggested.

> Jeroen

Thank you very much for your response.

-Jeff




* Re: [Xenomai] IRQ issue (was Ethernet driver issue)
  2013-05-23 21:17     ` Jeff Webb
@ 2013-05-24 18:00       ` Jeff Webb
  2013-05-24 20:44         ` Matthew Fornero
  0 siblings, 1 reply; 7+ messages in thread
From: Jeff Webb @ 2013-05-24 18:00 UTC (permalink / raw)
  To: xenomai

On 05/23/2013 04:17 PM, Jeff Webb wrote:
> On 05/23/2013 12:39 PM, Jeroen Van den Keybus wrote:
>> If you mean INTx+, yes. In combination with DisINT- it indicates a pending interrupt.
>
> Yes, that's what I meant.  Thanks for confirming -- that helps.
>
>> Check with lspci what your PCI-PCIe bridge is. I once had serious issues with an ASMedia bridge that did not send a PCIe IRQ deassert message. Could be a mainboard issue as well.



I recreated the scenario where I plug in a PCIe->PCI adapter and plug the ethernet card into that.  For what it's worth, this is the PCIe->PCI adapter info:

01:00.0 PCI bridge: Pericom Semiconductor Device e111 (rev 02) (prog-if 00 [Normal decode])
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 64 bytes
	Bus: primary=01, secondary=02, subordinate=02, sec-latency=64
	I/O behind bridge: 0000c000-0000cfff
	Memory behind bridge: f3d00000-f3efffff
	Secondary status: 66MHz+ FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- <SERR- <PERR-
	BridgeCtl: Parity- SERR+ NoISA- VGA- MAbort- >Reset- FastB2B-
		PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
	Capabilities: <access denied>
	Kernel modules: shpchp

(Remember, this is not the adapter on my motherboard, but just something I am plugging in for test purposes.)  When I'm running vanilla linux (3.5.7) with this configuration, everything seems to work fine (the ethernet card, the mouse/keyboard, and no spurious interrupts).  Under xenomai, the ethernet card works, but within a few seconds the mouse and keyboard become extremely delayed as I mentioned in a previous email.  This behavior is consistent and repeatable, so it seems to confirm that the problem has to do with xenomai.  In this case, the lspci output seems to indicate a pending interrupt like before, but this time it seems to be associated with the USB system, and not the PCI ethernet card.  This makes sense to me, since the USB is obviously malfunctioning under this test.

00:1a.1 USB controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #5 (prog-if 00 [UHCI])
         Subsystem: Dell Device 0293
         Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
         Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx+
         Latency: 0
         Interrupt: pin B routed to IRQ 17
         Region 4: I/O ports at ff00 [size=32]
         Capabilities: <access denied>
         Kernel driver in use: uhci_hcd

02:04.0 Ethernet controller: Intel Corporation 82541PI Gigabit Ethernet Controller (rev 05)
         Subsystem: Intel Corporation PRO/1000 GT Desktop Adapter
         Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
         Status: Cap+ 66MHz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
         Latency: 64 (63750ns min), Cache Line Size: 64 bytes
         Interrupt: pin A routed to IRQ 28
         Region 0: Memory at f3dc0000 (32-bit, non-prefetchable) [size=128K]
         Region 1: Memory at f3de0000 (32-bit, non-prefetchable) [size=128K]
         Region 2: I/O ports at ccc0 [size=64]
         Expansion ROM at f3e00000 [disabled] [size=128K]
         Capabilities: <access denied>
         Kernel driver in use: e1000
         Kernel modules: e1000
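
The check applied to the lspci output above (INTx+ in Status together with DisINTx- in Control, indicating a pending legacy interrupt) can be automated. The following is only a minimal sketch, assuming the text of 'lspci -vv' has already been captured into a string; the function name pending_intx_devices is hypothetical and not part of any tool.

```python
import re


def pending_intx_devices(lspci_text):
    """Return the header lines of devices that show INTx+ while DisINTx- is set.

    Assumes the layout printed by 'lspci -vv': a non-indented header line
    per device, followed by indented 'Control:' and 'Status:' lines.
    """
    pending = []
    device = None
    intx_disabled = None
    for line in lspci_text.splitlines():
        if line and not line[0].isspace():
            # A new device header starts; forget the previous Control state.
            device = line.strip()
            intx_disabled = None
            continue
        stripped = line.strip()
        if stripped.startswith("Control:"):
            intx_disabled = "DisINTx+" in stripped
        elif stripped.startswith("Status:") and device is not None:
            # 'Secondary status:' lines of bridges do not match this prefix.
            if re.search(r"\bINTx\+", stripped) and intx_disabled is False:
                pending.append(device)
    return pending
```

Applied to the two devices listed above, this would flag only the UHCI controller at 00:1a.1, which matches the manual reading of the output.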

>> Since you experience very slow response with another PCI-PCIe bridge, also check the number of IRQs in /proc/interrupts and /proc/xenomai/irq.

Here is /proc/interrupts for the above configuration:

             CPU0       CPU1       CPU2       CPU3       CPU4       CPU5       CPU6       CPU7
    0:         77          0          0          0          0          0          0          0   IO-APIC-edge      timer
    1:          3          0          0          0          0          0          0          0   IO-APIC-edge      i8042
    7:          1          0          0          0          0          0          0          0   IO-APIC-edge      parport0
    8:          1          0          0          0          0          0          0          0   IO-APIC-edge      rtc0
    9:          0          0          0          0          0          0          0          0   IO-APIC-fasteoi   acpi
   12:          4          0          0          0          0          0          0          0   IO-APIC-edge      i8042
   16:         42          0          0          0          0          0          0          0   IO-APIC-fasteoi   uhci_hcd:usb3
   17:        199         43          0        576          0          0          0          0   IO-APIC-fasteoi   uhci_hcd:usb4, uhci_hcd:usb7
   18:          0          0          0          0          0          0          0          0   IO-APIC-fasteoi   uhci_hcd:usb8
   22:          3          0          0          0          0          0          0          0   IO-APIC-fasteoi   ehci_hcd:usb1, uhci_hcd:usb5
   23:         60          0          0          0          0          0          0          0   IO-APIC-fasteoi   ehci_hcd:usb2, uhci_hcd:usb6
   28:         43          0          0          0        291          0          0          0   IO-APIC-fasteoi   eth1
   34:        243          0          0          0          0          0          0          0   IO-APIC-fasteoi   snd_hda_intel
   66:       6206          0       7582          0          0          0          0          0   PCI-MSI-edge      ahci
   67:        245         58          0         11          0          0          0          0   PCI-MSI-edge      snd_hda_intel
   68:        309          0          0          0          0       2365          0          0   PCI-MSI-edge      eth0
  NMI:         10          5         10          4         15         18         14         16   Non-maskable interrupts
  LOC:      11262       7593      11187       7486       9941      12152       9752      10769   Local timer interrupts
  SPU:          0          0          0          0          0          0          0          0   Spurious interrupts
  PMI:         10          5         10          4         15         18         14         16   Performance monitoring interrupts
  IWI:          0          0          0          0          0          0          0          0   IRQ work interrupts
  RTR:          7          0          0          0          0          0          0          0   APIC ICR read retries
  RES:      12821      12277      11275       7505       6297       5263       7275       4130   Rescheduling interrupts
  CAL:        224        359        402        406        385        420        411        355   Function call interrupts
  TLB:        494        611        637        526        616        889       1104        952   TLB shootdowns
  TRM:          0          0          0          0          0          0          0          0   Thermal event interrupts
  THR:          0          0          0          0          0          0          0          0   Threshold APIC interrupts
  MCE:          0          0          0          0          0          0          0          0   Machine check exceptions
  MCP:          2          2          2          2          2          2          2          2   Machine check polls
  ERR:          0
  MIS:          0
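
To see at a glance which IRQ lines in a table like the one above are shared between several handlers (IRQ 17 here carries both uhci_hcd:usb4 and uhci_hcd:usb7), a small parser can help. This is a rough sketch, assuming the usual /proc/interrupts layout with a header row of CPU names; parse_interrupts and shared_lines are hypothetical helper names.

```python
def parse_interrupts(text):
    """Map each IRQ label to (total count over all CPUs, description)."""
    lines = [line for line in text.splitlines() if line.strip()]
    ncpu = len(lines[0].split())  # header row: CPU0 CPU1 ...
    table = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        fields = rest.split()
        counts = []
        for field in fields[:ncpu]:
            if not field.isdigit():
                break  # rows such as ERR/MIS have fewer count columns
            counts.append(int(field))
        description = " ".join(fields[len(counts):])
        table[label.strip()] = (sum(counts), description)
    return table


def shared_lines(table):
    """Return numbered IRQs whose description lists more than one handler."""
    return {
        irq: description
        for irq, (_total, description) in table.items()
        if irq.isdigit() and "," in description
    }
```

Usage would be along the lines of shared_lines(parse_interrupts(open("/proc/interrupts").read())). Detecting sharing by looking for a comma in the description is a heuristic that relies on how the kernel joins handler names.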

Here is /proc/xenomai/irq:

IRQ         CPU0        CPU1        CPU2        CPU3        CPU4        CPU5        CPU6        CPU7
16672:      137861       40965       37428       38394       30406       57161       27699       26446         [timer]
16673:           0           1           1           1           1           1           1           1         [reschedule]
16674:           0           1           1           1           1           1           1           1         [timer-ipi]
16675:           0           0           0           0           0           0           0           0         [sync]
16707:           0           0           0           0           0           0           0           0         [virtual]
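
A single snapshot of these counters does not show how fast a line is firing. One way to spot an interrupt storm is to take two snapshots a known interval apart and compare them. The sketch below assumes only that each row starts with a label followed by ':' and per-CPU integer counts, as in both /proc/interrupts and /proc/xenomai/irq above; irq_totals and irq_rates are hypothetical names.

```python
def irq_totals(text):
    """Sum the per-CPU counts of every 'label: n n n ...' row."""
    totals = {}
    for line in text.splitlines():
        label, sep, rest = line.partition(":")
        if not sep:
            continue  # header rows carry no colon
        counts = []
        for field in rest.split():
            if not field.isdigit():
                break  # reached the controller or handler description
            counts.append(int(field))
        if counts:
            totals[label.strip()] = sum(counts)
    return totals


def irq_rates(before, after, seconds):
    """Interrupts per second for every label present in both snapshots."""
    first, second = irq_totals(before), irq_totals(after)
    return {
        irq: (second[irq] - first[irq]) / seconds
        for irq in second
        if irq in first
    }
```

A line whose rate is orders of magnitude above the others while its device is idle would be a candidate for the kind of stuck level-triggered interrupt discussed earlier in the thread.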

>> Also check very thoroughly that the issue does not occur in plain Linux (check dmesg for 'Nobody cared'). It can take hours of testing to trigger the problem and maybe the I-pipe exposes it more quickly. My personal feeling on the problem back then was that it could have something to do with the interrupt being serviced very fast.

Still no sign of this.  I haven't done hours of testing under standard linux, but linux seems to work fine with a configuration that reproducibly produces the problem under xenomai.

I still always get something like this under Xenomai:

[   26.589844] I-pipe: spurious interrupt 32
[   36.596341] I-pipe: spurious interrupt 32

I'm not sure whether this is related or not, because the interrupt number is always 32, but it seems fishy.
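
For longer test runs, the kernel log can be scanned for both kinds of message mentioned here: the stock kernel's "irq N: nobody cared" report and the I-pipe spurious interrupt line. A minimal sketch, assuming the log text (for example the output of dmesg) is already in a string; scan_kernel_log is a hypothetical name, and the I-pipe pattern is taken from the messages quoted in this thread.

```python
import re

PATTERNS = {
    # Emitted by the stock kernel when no handler claims a screaming IRQ.
    "nobody_cared": re.compile(r"irq (\d+): nobody cared", re.IGNORECASE),
    # Format as it appears in the log lines quoted in this thread.
    "ipipe_spurious": re.compile(r"I-pipe: spurious interrupt (\d+)"),
}


def scan_kernel_log(text):
    """Return (kind, irq_number) tuples in the order they appear in the log."""
    hits = []
    for line in text.splitlines():
        for kind, pattern in PATTERNS.items():
            match = pattern.search(line)
            if match:
                hits.append((kind, int(match.group(1))))
    return hits
```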

Any pointers on how to proceed next would be appreciated.

Thanks,

Jeff




* Re: [Xenomai] IRQ issue (was Ethernet driver issue)
  2013-05-24 18:00       ` Jeff Webb
@ 2013-05-24 20:44         ` Matthew Fornero
  2013-05-29 19:43           ` Jeff Webb
  0 siblings, 1 reply; 7+ messages in thread
From: Matthew Fornero @ 2013-05-24 20:44 UTC (permalink / raw)
  To: Jeff Webb; +Cc: xenomai

> (Remember, this is not the adapter on my motherboard, but just something I
> am plugging in for test purposes.)  When I'm running vanilla linux (3.5.7)
> with this configuration, everything seems to work fine (the ethernet card,
> the mouse/keyboard, and no spurious interrupts).  Under xenomai, the
> ethernet card works, but within a few seconds the mouse and keyboard become
> extremely delayed as I mentioned in a previous email.

Have you been able to look at dmesg output (or /var/log/messages) after
the USB mouse and keyboard stop responding? This sounds very similar
to an issue we had where a real-time PCI device was sharing an
interrupt with one of the USB controllers.

As a workaround, is it an option to simply use a PCIe network card?
This will give you an MSI interrupt that won't conflict with any other
devices in your system. I recall seeing some concerns with using MSI
for real-time interrupts under Xenomai, but I believe it can be done
successfully on x86 if you observe a few precautions (maybe Jan Kiszka
can comment?). We've successfully used MSI for non-real time
interrupts (ethernet cards) on many of our platforms.

-Matt



* Re: [Xenomai] IRQ issue (was Ethernet driver issue)
  2013-05-24 20:44         ` Matthew Fornero
@ 2013-05-29 19:43           ` Jeff Webb
  0 siblings, 0 replies; 7+ messages in thread
From: Jeff Webb @ 2013-05-29 19:43 UTC (permalink / raw)
  To: Matthew Fornero; +Cc: xenomai

On 05/24/2013 03:44 PM, Matthew Fornero wrote:
>> (Remember, this is not the adapter on my motherboard, but just something I
>> am plugging in for test purposes.)  When I'm running vanilla linux (3.5.7)
>> with this configuration, everything seems to work fine (the ethernet card,
>> the mouse/keyboard, and no spurious interrupts).  Under xenomai, the
>> ethernet card works, but within a few seconds the mouse and keyboard become
>> extremely delayed as I mentioned in a previous email.
>
> Have you been able to look at dmesg output (or /var/log/messages) after
> the USB mouse and keyboard stop responding?

Nothing looks unusual in dmesg or /var/log/syslog other than:

[   26.589844] I-pipe: spurious interrupt 32
[   36.596341] I-pipe: spurious interrupt 32

I get at least one of these messages (and sometimes two) every time I boot a Xenomai kernel on this machine, even if I don't see the mouse/keyboard issue, and even if the secondary PCI ethernet card works (IRQ 17).  The timing of the second spurious interrupt message could be coincident with the mouse/keyboard issue in this particular case, but I'm not sure.

> This sounds very similar
> to an issue we had where a real-time PCI device was sharing an
> interrupt with one of the USB controllers.

I'm not running any realtime code at the time of this test other than what runs when the machine is idle.  If there is something like that going on, it's not obvious to me.  I agree the behavior seems similar to the case you mentioned.  I have experienced cases like that as well in the past.

> As a workaround, is it an option to simply use a PCIe network card?
> This will give you an MSI interrupt that won't conflict with any other
> devices in your system.

I don't have a PCIe network card (other than the primary one on the motherboard), but I did try putting the PCI card in a PCIe-PCI adapter.  As you mentioned, it was assigned its own IRQ (28).  This is the configuration I used when I experienced the mouse/keyboard issue I mentioned above, so I don't see how the problem could be due to the sharing of the network card IRQ, but it could possibly be due to the sharing of a USB IRQ.  The original problem of the PCI network card not functioning when in the IRQ 16 slot could also be due to some kind of IRQ sharing issue, though.

> I recall seeing some concerns with using MSI
> for real-time interrupts under Xenomai, but I believe it can be done
> successfully on x86 if you observe a few precautions (maybe Jan Kiszka
> can comment?). We've successfully used MSI for non-real time
> interrupts (ethernet cards) on many of our platforms.

I have used MSI for ethernet cards on other RT systems as well.  Even if I can get a PCIe network card working without the USB issues I experienced above, I will eventually need to use both PCI slots (IRQs 16 and 17) for RT serial cards, so I need to figure out what's going on with those IRQ lines.

It seems to me that there's something fundamentally wrong with my system, since both USB and ethernet interrupts seem to be causing problems, depending on what IRQs are being utilized.

> -Matt

Thanks for the response, Matt!

-Jeff




Thread overview: 7+ messages
2013-05-21 18:58 [Xenomai] Ethernet driver issue Jeff Webb
2013-05-23 16:30 ` [Xenomai] IRQ issue (was Ethernet driver issue) Jeff Webb
2013-05-23 17:39   ` Jeroen Van den Keybus
2013-05-23 21:17     ` Jeff Webb
2013-05-24 18:00       ` Jeff Webb
2013-05-24 20:44         ` Matthew Fornero
2013-05-29 19:43           ` Jeff Webb
