* [BUG] netxen: Stops working between 2.6.30 and 2.6.31-rc1
@ 2009-11-19 16:39 Jens Rosenboom
  2009-11-19 18:07 ` Dhananjay Phadke
  0 siblings, 1 reply; 13+ messages in thread
From: Jens Rosenboom @ 2009-11-19 16:39 UTC (permalink / raw)
  To: netdev; +Cc: Dhananjay Phadke

My netxen 10G card stops working somewhere between 2.6.30 and 2.6.31-rc1. With the
newer kernel I can see packets being received on the switch it is connected to, but
the kernel doesn't report any sent packets in the interface counters and nothing
is being received either.

I've tried to bisect this, but I only seem to end up with kernels that do not boot
at all because some SCSI stuff goes bad.

Any hints on how to investigate this further? Please keep me on CC:.

Yours,
	Jens

^ permalink raw reply	[flat|nested] 13+ messages in thread

* RE: [BUG] netxen: Stops working between 2.6.30 and 2.6.31-rc1
  2009-11-19 16:39 [BUG] netxen: Stops working between 2.6.30 and 2.6.31-rc1 Jens Rosenboom
@ 2009-11-19 18:07 ` Dhananjay Phadke
  2009-11-19 18:36   ` Jens Rosenboom
  0 siblings, 1 reply; 13+ messages in thread
From: Dhananjay Phadke @ 2009-11-19 18:07 UTC (permalink / raw)
  To: Jens Rosenboom, netdev; +Cc: Amit Salecha

> My netxen 10G card stops working somewhere between 2.6.30 and 2.6.31-rc1.
> With the
> newer kernel I can see packets been received on the switch it is
> connected to, but
> the kernel doesn't report any sent packets in the interface counters and
> nothing
> is being received either.
> 
> I've tried to bisect this, but only seems the end up with kernels that do
> not boot
> at all because some SCSI stuff goes bad.

Any particular reason for using the -rc1 kernel and not the 2.6.31 stable kernel?

-Dhananjay


* Re: [BUG] netxen: Stops working between 2.6.30 and 2.6.31-rc1
  2009-11-19 18:07 ` Dhananjay Phadke
@ 2009-11-19 18:36   ` Jens Rosenboom
  2009-11-19 22:11     ` Dhananjay Phadke
  2009-11-20  1:19     ` Eric W. Biederman
  0 siblings, 2 replies; 13+ messages in thread
From: Jens Rosenboom @ 2009-11-19 18:36 UTC (permalink / raw)
  To: Dhananjay Phadke; +Cc: Jens Rosenboom, netdev, Amit Salecha

On Thu, Nov 19, 2009 at 10:07:21AM -0800, Dhananjay Phadke wrote:
> > My netxen 10G card stops working somewhere between 2.6.30 and 2.6.31-rc1.
> > With the
> > newer kernel I can see packets been received on the switch it is
> > connected to, but
> > the kernel doesn't report any sent packets in the interface counters and
> > nothing
> > is being received either.
> > 
> > I've tried to bisect this, but only seems the end up with kernels that do
> > not boot
> > at all because some SCSI stuff goes bad.
> 
> Any particular reason for using -rc1 kernel and not 2.6.31 stable kernel?

Sorry, I forgot to mention that all later kernels that I tested,
including 2.6.31 and the current net-2.6, also fail, so the badness
comes in somewhere between 2.6.30 and 2.6.31-rc1.

I also noticed that the newer kernels allocate four interrupts for the
card instead of only one, but none of them seem to get triggered; the
/proc/interrupts counters all stay at zero.


* RE: [BUG] netxen: Stops working between 2.6.30 and 2.6.31-rc1
  2009-11-19 18:36   ` Jens Rosenboom
@ 2009-11-19 22:11     ` Dhananjay Phadke
  2009-11-20  7:49       ` Jens Rosenboom
  2009-11-20  1:19     ` Eric W. Biederman
  1 sibling, 1 reply; 13+ messages in thread
From: Dhananjay Phadke @ 2009-11-19 22:11 UTC (permalink / raw)
  To: Jens Rosenboom; +Cc: netdev, Amit Salecha

> Sorry, I forgot to mention that all later kernels that I tested
> including 2.6.31 and the current net-2.6 also fail, so the badness
> comes in somewhere in between 2.6.30 and 2.6.31-rc1.
> 
> I also noticed that the newer kernel allocate four interrupts for the
> card instead of only one, but none of them seem to get triggered, the
> /proc/interrupts counters all stay at zero.

What firmware revision do you have? Since you say nothing is
transmitted either, I doubt that you have a link. Otherwise I
would expect the kernel to send some neighbor solicitation
crap as soon as you bring up the interface. What does
"ethtool ethX" say about the link?

It's possible to restrict the bisect to commits that touch
drivers/net/netxen. That way you have fewer commits to step through
and stay focused on the driver rather than tripping over SCSI.

Thanks,
Dhananjay


* Re: [BUG] netxen: Stops working between 2.6.30 and 2.6.31-rc1
  2009-11-19 18:36   ` Jens Rosenboom
  2009-11-19 22:11     ` Dhananjay Phadke
@ 2009-11-20  1:19     ` Eric W. Biederman
  2009-11-20  7:52       ` Jens Rosenboom
  1 sibling, 1 reply; 13+ messages in thread
From: Eric W. Biederman @ 2009-11-20  1:19 UTC (permalink / raw)
  To: Jens Rosenboom; +Cc: Dhananjay Phadke, netdev, Amit Salecha

Jens Rosenboom <me@jayr.de> writes:

> On Thu, Nov 19, 2009 at 10:07:21AM -0800, Dhananjay Phadke wrote:
>> > My netxen 10G card stops working somewhere between 2.6.30 and 2.6.31-rc1.
>> > With the
>> > newer kernel I can see packets been received on the switch it is
>> > connected to, but
>> > the kernel doesn't report any sent packets in the interface counters and
>> > nothing
>> > is being received either.
>> > 
>> > I've tried to bisect this, but only seems the end up with kernels that do
>> > not boot
>> > at all because some SCSI stuff goes bad.
>> 
>> Any particular reason for using -rc1 kernel and not 2.6.31 stable kernel?
>
> Sorry, I forgot to mention that all later kernels that I tested
> including 2.6.31 and the current net-2.6 also fail, so the badness 
> comes in somewhere in between 2.6.30 and 2.6.31-rc1.
>
> I also noticed that the newer kernel allocate four interrupts for the
> card instead of only one, but none of them seem to get triggered, the
> /proc/interrupts counters all stay at zero.

Hmm.  Have you tried disabling MSIs, i.e. putting pci=nomsi on the kernel
command line?

If you aren't getting interrupts it might be that your board simply
has problems with receiving msi interrupts.  That at least used to
be common.

Eric


* Re: [BUG] netxen: Stops working between 2.6.30 and 2.6.31-rc1
  2009-11-19 22:11     ` Dhananjay Phadke
@ 2009-11-20  7:49       ` Jens Rosenboom
  2009-11-20 16:11         ` Jens Rosenboom
  0 siblings, 1 reply; 13+ messages in thread
From: Jens Rosenboom @ 2009-11-20  7:49 UTC (permalink / raw)
  To: Dhananjay Phadke; +Cc: Jens Rosenboom, netdev, Amit Salecha

On Thu, Nov 19, 2009 at 02:11:33PM -0800, Dhananjay Phadke wrote:
> > Sorry, I forgot to mention that all later kernels that I tested
> > including 2.6.31 and the current net-2.6 also fail, so the badness
> > comes in somewhere in between 2.6.30 and 2.6.31-rc1.
> > 
> > I also noticed that the newer kernel allocate four interrupts for the
> > card instead of only one, but none of them seem to get triggered, the
> > /proc/interrupts counters all stay at zero.
> 
> What firmware revision you have? Since you are saying nothing
> transmitted as well, I doubt if you have a link. Otherwise I
> would imagine kernel tries to send some neighbor solicitation
> crap as soon as you bring up interface. What does your
> "ethtool ethx" say about the link? 

ethtool says "Link detected: yes". If I try to ping a different host on the
LAN, the MAC of the card appears in the FDB on the switch, so I'm pretty sure
that packets do get sent even if the kernel doesn't get a report for that
because of the broken interrupts. Firmware is 3.4.336, which is the only one
I could find, from IBM Japan; the original NetXen pages seem to have been dumped
by QLogic. :-( The firmware on the card itself is being rejected by the
kernel as too old.

> It's possible to bisect commits which applied in driver/net/netxen.
> That way you have fewer commits to rewind and remains focused on
> the driver rather than screwing scsi.

I did restrict the bisect to net/ + drivers/net and still ran into trouble;
I can retry with your suggestion.


* Re: [BUG] netxen: Stops working between 2.6.30 and 2.6.31-rc1
  2009-11-20  1:19     ` Eric W. Biederman
@ 2009-11-20  7:52       ` Jens Rosenboom
  2009-11-20 16:48         ` Eric W. Biederman
  0 siblings, 1 reply; 13+ messages in thread
From: Jens Rosenboom @ 2009-11-20  7:52 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: Jens Rosenboom, Dhananjay Phadke, netdev, Amit Salecha

On Thu, Nov 19, 2009 at 05:19:05PM -0800, Eric W. Biederman wrote:
> Jens Rosenboom <me@jayr.de> writes:
> 
> > On Thu, Nov 19, 2009 at 10:07:21AM -0800, Dhananjay Phadke wrote:
> >> > My netxen 10G card stops working somewhere between 2.6.30 and 2.6.31-rc1.
> >> > With the
> >> > newer kernel I can see packets been received on the switch it is
> >> > connected to, but
> >> > the kernel doesn't report any sent packets in the interface counters and
> >> > nothing
> >> > is being received either.
> >> > 
> >> > I've tried to bisect this, but only seems the end up with kernels that do
> >> > not boot
> >> > at all because some SCSI stuff goes bad.
> >> 
> >> Any particular reason for using -rc1 kernel and not 2.6.31 stable kernel?
> >
> > Sorry, I forgot to mention that all later kernels that I tested
> > including 2.6.31 and the current net-2.6 also fail, so the badness 
> > comes in somewhere in between 2.6.30 and 2.6.31-rc1.
> >
> > I also noticed that the newer kernel allocate four interrupts for the
> > card instead of only one, but none of them seem to get triggered, the
> > /proc/interrupts counters all stay at zero.
> 
> Hmm.  Have you tried disabling msi's? aka putting nomsi on the kernel
> command line.

I hadn't before, but I tried it now and it made no difference. The kernel still
seems to allocate four interrupts:

 kernel: [    2.980300] bus: 'pci': add driver netxen_nic
 kernel: [    2.980329] bus: 'pci': driver_probe_device: matched device 0000:22:00.0 with driver netxen_nic
 kernel: [    2.980333] bus: 'pci': really_probe: probing driver netxen_nic with device 0000:22:00.0
 kernel: [    2.980446] netxen_nic 0000:22:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
 kernel: [    2.980459] netxen_nic 0000:22:00.0: setting latency timer to 64
 kernel: [    2.981505] netxen_nic 0000:22:00.0: 128MB memory map
 kernel: [    2.981611] netxen_nic 0000:22:00.0: firmware: using built-in firmware nxromimg.bin
 kernel: [    4.144018] netxen_nic 0000:22:00.0: loading firmware from nxromimg.bin
 kernel: [   10.108208] NetXen XGb XFP Board S/N IF72MK0200  Chip rev 0x25
 kernel: [   10.108211] netxen_nic 0000:22:00.0: firmware version 3.4.336
 kernel: [   10.108262]   alloc irq_desc for 37 on node 0
 kernel: [   10.108265]   alloc kstat_irqs on node 0
 kernel: [   10.108273] netxen_nic 0000:22:00.0: irq 37 for MSI/MSI-X
 kernel: [   10.108275]   alloc irq_desc for 38 on node 0
 kernel: [   10.108277]   alloc kstat_irqs on node 0
 kernel: [   10.108281] netxen_nic 0000:22:00.0: irq 38 for MSI/MSI-X
 kernel: [   10.108284]   alloc irq_desc for 39 on node 0
 kernel: [   10.108286]   alloc kstat_irqs on node 0
 kernel: [   10.108289] netxen_nic 0000:22:00.0: irq 39 for MSI/MSI-X
 kernel: [   10.108291]   alloc irq_desc for 40 on node 0
 kernel: [   10.108293]   alloc kstat_irqs on node 0
 kernel: [   10.108296] netxen_nic 0000:22:00.0: irq 40 for MSI/MSI-X
 kernel: [   10.108311] netxen_nic 0000:22:00.0: using msi-x interrupts
 kernel: [   10.108371] device: 'eth2': device_add
 kernel: [   10.108442] PM: Adding info for No Bus:eth2
 kernel: [   10.109197] netxen_nic 0000:22:00.0: eth2: XGbE port initialized
 kernel: [   10.109219] driver: '0000:22:00.0': driver_bound: bound to device 'netxen_nic'
 kernel: [   10.109226] bus: 'pci': really_probe: bound device 0000:22:00.0 to driver netxen_nic

# grep eth2 /proc/interrupts
 37:          0          0          0          0   PCI-MSI-edge      eth2[0]
 38:          0          0          0          0   PCI-MSI-edge      eth2[1]
 39:          0          0          0          0   PCI-MSI-edge      eth2[2]
 40:          0          0          0          0   PCI-MSI-edge      eth2[3]
# ethtool eth2
Settings for eth2:
	Supported ports: [ FIBRE ]
	Supported link modes:   
	Supports auto-negotiation: No
	Advertised link modes:  10000baseT/Full 
	Advertised auto-negotiation: No
	Speed: 10000Mb/s
	Duplex: Full
	Port: FIBRE
	PHYAD: 0
	Transceiver: external
	Auto-negotiation: off
	Supports Wake-on: d
	Wake-on: d
	Link detected: yes
# ethtool -i eth2
driver: netxen_nic
version: 4.0.30
firmware-version: 3.4.336
bus-info: 0000:22:00.0
# uname -rvmpi
2.6.31.6 #5 SMP Wed Nov 18 09:15:48 CET 2009 x86_64 Dual-Core AMD Opteron(tm) Processor 2212 AuthenticAMD GNU/Linux

> If you aren't getting interrupts it might be that your board simply
> has problems with receiving msi interrupts.  That at least used to
> be common.

But it does work with the single-interrupt setup in 2.6.30. Is there a way to
tell the newer kernels to go back to this behaviour?

Here is the output with plain 2.6.30:

# uname -rvmpi
2.6.30 #2 SMP Wed Nov 18 16:41:15 CET 2009 x86_64 Dual-Core AMD Opteron(tm) Processor 2212 AuthenticAMD
# grep eth2 /proc/interrupts 
 37:          0          0          3       4836   PCI-MSI-edge                  eth2[0]
# ping 10.0.21.201
PING 10.0.21.201 (10.0.21.201) 56(84) bytes of data.
64 bytes from 10.0.21.201: icmp_seq=1 ttl=255 time=1.51 ms
64 bytes from 10.0.21.201: icmp_seq=2 ttl=255 time=0.170 ms
64 bytes from 10.0.21.201: icmp_seq=3 ttl=255 time=0.156 ms
^C
--- 10.0.21.201 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2001ms
rtt min/avg/max/mdev = 0.156/0.612/1.512/0.636 ms
# grep eth2 /proc/interrupts 
 37:          0          0          3       4985   PCI-MSI-edge                  eth2[0]
# 
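
The readings above can also be compared mechanically instead of by eye. A
minimal sketch, assuming the /proc/interrupts layout shown in this message
(IRQ number, one counter per CPU, then type and name); parse_irq_counts and
irq_delta are hypothetical helper names, not part of any existing tool:

```python
def parse_irq_counts(text, device):
    """Sum the per-CPU counters of every /proc/interrupts line for `device`.

    Returns {irq_number: total}. Lines whose last field does not start
    with `device` (e.g. "eth2") are ignored.
    """
    counts = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3 or not fields[-1].startswith(device):
            continue
        irq = fields[0].rstrip(":")
        # Per-CPU counters are the purely numeric fields after the IRQ number.
        counts[irq] = sum(int(f) for f in fields[1:] if f.isdigit())
    return counts


def irq_delta(before, after):
    """Return how much each IRQ counter grew between two snapshots."""
    return {irq: total - before.get(irq, 0) for irq, total in after.items()}


# Samples copied from the outputs in this message.
BEFORE_2_6_30 = " 37:          0          0          3       4836   PCI-MSI-edge                  eth2[0]"
AFTER_2_6_30 = " 37:          0          0          3       4985   PCI-MSI-edge                  eth2[0]"
SNAPSHOT_2_6_31 = """\
 37:          0          0          0          0   PCI-MSI-edge      eth2[0]
 38:          0          0          0          0   PCI-MSI-edge      eth2[1]
 39:          0          0          0          0   PCI-MSI-edge      eth2[2]
 40:          0          0          0          0   PCI-MSI-edge      eth2[3]
"""
```

On the 2.6.30 samples, irq_delta() reports a growth of 149 on IRQ 37 across
the ping run, while every total in the 2.6.31.6 snapshot is zero.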



* Re: [BUG] netxen: Stops working between 2.6.30 and 2.6.31-rc1
  2009-11-20  7:49       ` Jens Rosenboom
@ 2009-11-20 16:11         ` Jens Rosenboom
  0 siblings, 0 replies; 13+ messages in thread
From: Jens Rosenboom @ 2009-11-20 16:11 UTC (permalink / raw)
  To: Jens Rosenboom; +Cc: Dhananjay Phadke, netdev, Amit Salecha

On Fri, Nov 20, 2009 at 08:49:03AM +0100, Jens Rosenboom wrote:
> On Thu, Nov 19, 2009 at 02:11:33PM -0800, Dhananjay Phadke wrote:
> > > Sorry, I forgot to mention that all later kernels that I tested
> > > including 2.6.31 and the current net-2.6 also fail, so the badness
> > > comes in somewhere in between 2.6.30 and 2.6.31-rc1.
> > > 
> > > I also noticed that the newer kernel allocate four interrupts for the
> > > card instead of only one, but none of them seem to get triggered, the
> > > /proc/interrupts counters all stay at zero.
> > 
> > What firmware revision you have? Since you are saying nothing
> > transmitted as well, I doubt if you have a link. Otherwise I
> > would imagine kernel tries to send some neighbor solicitation
> > crap as soon as you bring up interface. What does your
> > "ethtool ethx" say about the link? 
> 
> ethtool says "Link detected: yes" , if I try to ping a different host on the 
> LAN the MAC of the card appears in the FDB on the switch, so I'm pretty sure 
> that packets do get sent even if the kernel doesn't get a report for that 
> because of the broken interrupts. Firmware is 3.4.336, which is the only one 
> I could find from IBM Japan, the original Netxen pages seem to have been dumped
> by Qlogic. :-( The firmware on the card itself is being rejected by the
> kernel as too old.
> 
> > It's possible to bisect commits which applied in driver/net/netxen.
> > That way you have fewer commits to rewind and remains focused on
> > the driver rather than screwing scsi.
> 
> I did restrict the bisect to net/ + driver/net and still ran into trouble,
> I can retry with your suggestion.

Sorry for following up to myself, but I made some progress. The bisect still
broke things, so I tried patching the latest kernel to use only a
single interrupt, but that didn't help either.

But I managed to find another firmware image, version 3.4.250, which the
kernel calls "legacy". Loading this firmware also results in the driver
using only one interrupt, and the good news is: It Works. ;-)

Maybe this helps you to narrow down the problem further. I'm also ready to
take some testing/debugging patches or send you any other information that
might be helpful.


* Re: [BUG] netxen: Stops working between 2.6.30 and 2.6.31-rc1
  2009-11-20  7:52       ` Jens Rosenboom
@ 2009-11-20 16:48         ` Eric W. Biederman
  2009-11-20 17:30           ` Dhananjay Phadke
  0 siblings, 1 reply; 13+ messages in thread
From: Eric W. Biederman @ 2009-11-20 16:48 UTC (permalink / raw)
  To: Jens Rosenboom; +Cc: Dhananjay Phadke, netdev, Amit Salecha

Jens Rosenboom <me@jayr.de> writes:

> On Thu, Nov 19, 2009 at 05:19:05PM -0800, Eric W. Biederman wrote:
>> Jens Rosenboom <me@jayr.de> writes:
>> 
>> > On Thu, Nov 19, 2009 at 10:07:21AM -0800, Dhananjay Phadke wrote:
>> >> > My netxen 10G card stops working somewhere between 2.6.30 and 2.6.31-rc1.
>> >> > With the
>> >> > newer kernel I can see packets been received on the switch it is
>> >> > connected to, but
>> >> > the kernel doesn't report any sent packets in the interface counters and
>> >> > nothing
>> >> > is being received either.
>> >> > 
>> >> > I've tried to bisect this, but only seems the end up with kernels that do
>> >> > not boot
>> >> > at all because some SCSI stuff goes bad.
>> >> 
>> >> Any particular reason for using -rc1 kernel and not 2.6.31 stable kernel?
>> >
>> > Sorry, I forgot to mention that all later kernels that I tested
>> > including 2.6.31 and the current net-2.6 also fail, so the badness 
>> > comes in somewhere in between 2.6.30 and 2.6.31-rc1.
>> >
>> > I also noticed that the newer kernel allocate four interrupts for the
>> > card instead of only one, but none of them seem to get triggered, the
>> > /proc/interrupts counters all stay at zero.
>> 
>> Hmm.  Have you tried disabling msi's? aka putting nomsi on the kernel
>> command line.
>
> I hadn't before but tried it now, but no difference. The kernel still seems to
> allocate four interrupts:

Weird. MSIs definitely weren't disabled.  Looking a little further at
your quoted setup, MSIs do work on your board.  This is definitely
something specific to the driver.  Except for a few initialization
races that are an issue for bonding, I am running 2.6.31 just fine.


>  kernel: [    2.980300] bus: 'pci': add driver netxen_nic
>  kernel: [    2.980329] bus: 'pci': driver_probe_device: matched device 0000:22:00.0 with driver netxen_nic
>  kernel: [    2.980333] bus: 'pci': really_probe: probing driver netxen_nic with device 0000:22:00.0
>  kernel: [    2.980446] netxen_nic 0000:22:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
>  kernel: [    2.980459] netxen_nic 0000:22:00.0: setting latency timer to 64
>  kernel: [    2.981505] netxen_nic 0000:22:00.0: 128MB memory map
>  kernel: [    2.981611] netxen_nic 0000:22:00.0: firmware: using built-in firmware nxromimg.bin
>  kernel: [    4.144018] netxen_nic 0000:22:00.0: loading firmware from nxromimg.bin
>  kernel: [   10.108208] NetXen XGb XFP Board S/N IF72MK0200  Chip rev 0x25
>  kernel: [   10.108211] netxen_nic 0000:22:00.0: firmware version 3.4.336
>  kernel: [   10.108262]   alloc irq_desc for 37 on node 0
>  kernel: [   10.108265]   alloc kstat_irqs on node 0
>  kernel: [   10.108273] netxen_nic 0000:22:00.0: irq 37 for MSI/MSI-X
>  kernel: [   10.108275]   alloc irq_desc for 38 on node 0
>  kernel: [   10.108277]   alloc kstat_irqs on node 0
>  kernel: [   10.108281] netxen_nic 0000:22:00.0: irq 38 for MSI/MSI-X
>  kernel: [   10.108284]   alloc irq_desc for 39 on node 0
>  kernel: [   10.108286]   alloc kstat_irqs on node 0
>  kernel: [   10.108289] netxen_nic 0000:22:00.0: irq 39 for MSI/MSI-X
>  kernel: [   10.108291]   alloc irq_desc for 40 on node 0
>  kernel: [   10.108293]   alloc kstat_irqs on node 0
>  kernel: [   10.108296] netxen_nic 0000:22:00.0: irq 40 for MSI/MSI-X
>  kernel: [   10.108311] netxen_nic 0000:22:00.0: using msi-x interrupts
>  kernel: [   10.108371] device: 'eth2': device_add
>  kernel: [   10.108442] PM: Adding info for No Bus:eth2
>  kernel: [   10.109197] netxen_nic 0000:22:00.0: eth2: XGbE port initialized
>  kernel: [   10.109219] driver: '0000:22:00.0': driver_bound: bound to device 'netxen_nic'
>  kernel: [   10.109226] bus: 'pci': really_probe: bound device 0000:22:00.0 to driver netxen_nic
>
> # grep eth2 /proc/interrupts
>  37:          0          0          0          0   PCI-MSI-edge      eth2[0]
>  38:          0          0          0          0   PCI-MSI-edge      eth2[1]
>  39:          0          0          0          0   PCI-MSI-edge      eth2[2]
>  40:          0          0          0          0   PCI-MSI-edge      eth2[3]
> # ethtool eth2
> Settings for eth2:
> 	Supported ports: [ FIBRE ]
> 	Supported link modes:   
> 	Supports auto-negotiation: No
> 	Advertised link modes:  10000baseT/Full 
> 	Advertised auto-negotiation: No
> 	Speed: 10000Mb/s
> 	Duplex: Full
> 	Port: FIBRE
> 	PHYAD: 0
> 	Transceiver: external
> 	Auto-negotiation: off
> 	Supports Wake-on: d
> 	Wake-on: d
> 	Link detected: yes
> # ethtool -i eth2
> driver: netxen_nic
> version: 4.0.30
> firmware-version: 3.4.336
> bus-info: 0000:22:00.0
> # uname -rvmpi
> 2.6.31.6 #5 SMP Wed Nov 18 09:15:48 CET 2009 x86_64 Dual-Core AMD Opteron(tm) Processor 2212 AuthenticAMD GNU/Linux

On my working setup I have:
driver: netxen_nic
version: 4.0.30
firmware-version: 4.0.305
bus-info: 0000:06:00.0

4.0.305 looks like the latest publicly available version on qlogic's
website.  If that will work on your card I recommend you pull it down
and update your firmware.

My card is: NetXen Dual XGb SFP+ LP Board S/N SF86BK0008  Chip rev 0x41
Which is a bit different at the physical hardware level.

Eric


* Re: [BUG] netxen: Stops working between 2.6.30 and 2.6.31-rc1
  2009-11-20 16:48         ` Eric W. Biederman
@ 2009-11-20 17:30           ` Dhananjay Phadke
  2009-11-20 17:43             ` Eric W. Biederman
  0 siblings, 1 reply; 13+ messages in thread
From: Dhananjay Phadke @ 2009-11-20 17:30 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: Jens Rosenboom, netdev, Amit Salecha, Ameen Rahman

> Weird. MSI's definitely weren't disabled.  Looking a little farther at
> your quoted setup MSI work on your board.  This is definitely
> something specific to the driver.  Except for a few initialization
> races that are an issue for bonding I am running 2.6.31 just fine.


Jens,

Even if /proc/interrupts says PCI-MSI in both cases, the single-interrupt
case is MSI and the 4-vector case is MSI-X. To confirm that MSI-X doesn't
work with your card/machine, you can stay on 2.6.31 and set
use_msi_x=0 in netxen_nic_main.c.
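
Which mode the driver picked can also be read from the boot log rather than
from /proc/interrupts. A minimal sketch, assuming the message formats of the
2.6.31.6 log quoted earlier in this thread ("irq N for MSI/MSI-X" and
"using msi-x interrupts"); other driver versions may word these lines
differently, and interrupt_setup is a hypothetical helper name:

```python
import re

# Patterns modelled on the 2.6.31.6 log lines quoted in this thread.
IRQ_LINE = re.compile(r"netxen_nic (\S+): irq (\d+) for MSI/MSI-X")
MODE_LINE = re.compile(r"netxen_nic (\S+): using (\S+) interrupts")


def interrupt_setup(log_text):
    """Collect, per PCI device, the reported interrupt mode and its vectors."""
    setup = {}
    for line in log_text.splitlines():
        match = IRQ_LINE.search(line)
        if match:
            entry = setup.setdefault(match.group(1), {"mode": None, "irqs": []})
            entry["irqs"].append(int(match.group(2)))
            continue
        match = MODE_LINE.search(line)
        if match:
            entry = setup.setdefault(match.group(1), {"mode": None, "irqs": []})
            entry["mode"] = match.group(2)
    return setup


# Excerpt of the log posted for 2.6.31.6.
SAMPLE_LOG = """\
 kernel: [   10.108273] netxen_nic 0000:22:00.0: irq 37 for MSI/MSI-X
 kernel: [   10.108281] netxen_nic 0000:22:00.0: irq 38 for MSI/MSI-X
 kernel: [   10.108289] netxen_nic 0000:22:00.0: irq 39 for MSI/MSI-X
 kernel: [   10.108296] netxen_nic 0000:22:00.0: irq 40 for MSI/MSI-X
 kernel: [   10.108311] netxen_nic 0000:22:00.0: using msi-x interrupts
"""
```

For the excerpt above this yields mode "msi-x" with vectors 37 through 40 for
device 0000:22:00.0, matching the 4-vector case.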

I had tried to make available a module param to disable MSI/MSI-X for
platforms where MSI-X doesn't work cleanly, but it was declined by David
Miller, et al.

Anyway, please note a few things -

Jens has the older generation (NX2031) of the NIC ASIC, so 4.0.xxx FW doesn't
apply:

NetXen XGb XFP Board S/N IF72MK0200  Chip rev 0x25

Eric has the newer generation (NX3031) of the NIC ASIC:

NetXen Dual XGb SFP+ LP Board S/N SF86BK0008  Chip rev 0x41


Anyhow, what I learned here is that Jens obtained the firmware from IBM Japan.
Could you please describe the source of your device? You should involve the
respective OEM (HP/IBM) to get the right revision of the firmware.

If it was purchased via a direct channel, call QLogic support about the
3.4.339 firmware. I can't imagine IBM Japan hosting firmware not
released for IBM-branded NIC boards.

Thanks,
Dhananjay



* Re: [BUG] netxen: Stops working between 2.6.30 and 2.6.31-rc1
  2009-11-20 17:30           ` Dhananjay Phadke
@ 2009-11-20 17:43             ` Eric W. Biederman
  2009-11-20 18:07               ` Dhananjay Phadke
  0 siblings, 1 reply; 13+ messages in thread
From: Eric W. Biederman @ 2009-11-20 17:43 UTC (permalink / raw)
  To: Dhananjay Phadke; +Cc: Jens Rosenboom, netdev, Amit Salecha, Ameen Rahman

Dhananjay Phadke <dhananjay.phadke@qlogic.com> writes:

>> Weird. MSI's definitely weren't disabled.  Looking a little farther at
>> your quoted setup MSI work on your board.  This is definitely
>> something specific to the driver.  Except for a few initialization
>> races that are an issue for bonding I am running 2.6.31 just fine.
>
>
> Jens,
>
> Even if /proc/interrupt says PCI-MSI in both cases, single interrupt case is msi
> vs. 4 vector case is msi-x. To confirm that msi-x doesn't work with your
> card/machine, you can still stay on 2.6.31 and set use_msi_x=0 in
> netxen_nic_main.c.
>
> I had tried to make a available a module param to disable msi/msi-x for
> platforms where msi-x doesn't work cleanly, but it was declined by David Miller,
> et al.

MSI-X uses the same messages on the wire as MSI; it is
only the programming interface that is different.  So if MSI works and
MSI-X does not, it is not a platform problem.

Eric


* Re: [BUG] netxen: Stops working between 2.6.30 and 2.6.31-rc1
  2009-11-20 17:43             ` Eric W. Biederman
@ 2009-11-20 18:07               ` Dhananjay Phadke
  2009-11-20 18:21                 ` Eric W. Biederman
  0 siblings, 1 reply; 13+ messages in thread
From: Dhananjay Phadke @ 2009-11-20 18:07 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: Jens Rosenboom, netdev, Amit Salecha, Ameen Rahman



Eric W. Biederman wrote:
> MSI-X uses the same messages on the wire is the same as MSI it is
> only the programming interface that is different.  So if MSI works and
> MSI-X does not it is not a platform problem.
> 

Sure, but firmware and driver see it differently. IMO, this points to
some firmware issue (which is why I asked him to get the right version from
the right source). All the driver did was try to enable MSI-X. From my
testing, I can tell that both MSI and MSI-X work on all revisions of the
NIC ASIC, but the firmware revision can make a difference. For chiprev 0x25,
3.4.339 is the right firmware version.
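
The version check implied here can be sketched in a few lines. This is an
illustration only: the table holds just the one pairing stated in this
message (chip rev 0x25, firmware 3.4.339), flagging anything older than that
is an assumption about how the check would be used, and parse_ethtool_info,
version_tuple and firmware_is_older are hypothetical helper names:

```python
def parse_ethtool_info(text):
    """Parse `ethtool -i` style "key: value" lines into a dict."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def version_tuple(version):
    """Turn "3.4.336" into (3, 4, 336) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


# Only the pairing stated in this message; other chip revisions are unknown here.
RECOMMENDED_FIRMWARE = {0x25: "3.4.339"}


def firmware_is_older(chip_rev, firmware_version):
    """True if the firmware predates the recommended one, None if unknown."""
    wanted = RECOMMENDED_FIRMWARE.get(chip_rev)
    if wanted is None:
        return None
    return version_tuple(firmware_version) < version_tuple(wanted)


# `ethtool -i eth2` output as posted earlier in this thread.
ETHTOOL_I_OUTPUT = """\
driver: netxen_nic
version: 4.0.30
firmware-version: 3.4.336
bus-info: 0000:22:00.0
"""
```

With the posted `ethtool -i` output, firmware 3.4.336 on chip rev 0x25 is
reported as older than 3.4.339; the 3.4.250 image is older as well.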

Thanks,
Dhananjay


* Re: [BUG] netxen: Stops working between 2.6.30 and 2.6.31-rc1
  2009-11-20 18:07               ` Dhananjay Phadke
@ 2009-11-20 18:21                 ` Eric W. Biederman
  0 siblings, 0 replies; 13+ messages in thread
From: Eric W. Biederman @ 2009-11-20 18:21 UTC (permalink / raw)
  To: Dhananjay Phadke; +Cc: Jens Rosenboom, netdev, Amit Salecha, Ameen Rahman

Dhananjay Phadke <dhananjay.phadke@qlogic.com> writes:

> Eric W. Biederman wrote:
>> MSI-X uses the same messages on the wire is the same as MSI it is
>> only the programming interface that is different.  So if MSI works and
>> MSI-X does not it is not a platform problem.
>>
>
> Sure, but firmware and driver see it differently. IMO, this points to some
> firmware issue (which is why I asked him to get right version from right
> source). All that driver did was tried to enable msi-x, from my testing, I can
> tell that both msi and msix work on all revisions of the nic asic, but firmware
> revision can make a difference. For chiprev 0x25, 3.4.339 is the right firmware
> version.

Sorry I meant that only as a clarification, not to derail the conversation.
This does sound like a firmware mismatch issue to me as well.

Eric

