* Problem with Infiniband adapter on IBM p550
@ 2010-10-08 2:57 Patrick Finnegan
0 siblings, 0 replies; 8+ messages in thread
From: Patrick Finnegan @ 2010-10-08 2:57 UTC (permalink / raw)
To: linuxppc-dev
I seem to be running into a problem getting a Mellanox Infinihost
Infiniband adapter working on my IBM p550 (a 9113-550). I'm using
Debian squeeze, and tried upgrading to the 2.6.35.7 kernel, which did
not help.
I get the following messages in dmesg:
[ 4.972548] ib_mthca: Mellanox InfiniBand HCA driver v1.0 (April 4, 2008)
[ 4.972564] ib_mthca: Initializing 0000:c1:00.0
[ 4.972674] ib_mthca 0000:c1:00.0: Missing DCS, aborting.
The problem looks the same as a problem I ran into with OpenFirmware on
a Sun V880, which was fixed with this patch by Dave Miller:
http://ns3.spinics.net/lists/linux-rdma/msg01779.html
I spent some time looking at the equivalent function on powerpc, but
didn't find a block of code that looked similar.
Any suggestions?
I have dmesg, the dev .properties from openfirmware, and lspci -v from
the machine:
http://ned.rcac.purdue.edu/p550-ib/dmesg
http://ned.rcac.purdue.edu/p550-ib/ib-of-device
http://ned.rcac.purdue.edu/p550-ib/lspci-v
Pat
--
Purdue University ITaP/Research Systems -- http://www.rcac.purdue.edu
* Re: Problem with Infiniband adapter on IBM p550
2010-11-03 13:34 ` Anton Blanchard
@ 2010-11-03 14:43 ` Patrick Finnegan
0 siblings, 0 replies; 8+ messages in thread
From: Patrick Finnegan @ 2010-11-03 14:43 UTC (permalink / raw)
To: Anton Blanchard; +Cc: paulus, linuxppc-dev
On Wednesday, November 03, 2010, Anton Blanchard wrote:
> Firmware has the concept of "super slots" which allow larger memory
> windows and TCE tables. Section 3.4.3 explains it:
>
> http://www.redbooks.ibm.com/redpapers/pdfs/redp4095.pdf
Aha! I tried moving the adapter from slot C3 to C5, which is listed in
that guide, and now it's working.
Thanks for the pointer!
Pat
--
Purdue University Research Computing --- http://www.rcac.purdue.edu/
The Computer Refuge --- http://computer-refuge.org
* Re: Problem with Infiniband adapter on IBM p550
2010-11-03 3:15 ` Patrick Finnegan
2010-11-03 13:34 ` Anton Blanchard
@ 2010-11-03 13:46 ` Benjamin Herrenschmidt
1 sibling, 0 replies; 8+ messages in thread
From: Benjamin Herrenschmidt @ 2010-11-03 13:46 UTC (permalink / raw)
To: Patrick Finnegan; +Cc: linuxppc-dev, paulus
On Tue, 2010-11-02 at 23:15 -0400, Patrick Finnegan wrote:
> > I don't know why, but it definitely looks like a firmware bug to me.
> > On those machines, PCI resource assignment is under hypervisor
> > control and so Linux cannot re-assign missing resources itself.
> >
> > I'll see if I can find a FW person to shed some light on this.
> >
> > Can you provide me (privately maybe) with the FW version on the
> > machine ?
>
> Ben,
>
> Have you found out anything more on this (firmware) bug?
No, not yet. Let me ping some folks again.
Cheers,
Ben.
* Re: Problem with Infiniband adapter on IBM p550
2010-11-03 3:15 ` Patrick Finnegan
@ 2010-11-03 13:34 ` Anton Blanchard
2010-11-03 14:43 ` Patrick Finnegan
2010-11-03 13:46 ` Benjamin Herrenschmidt
1 sibling, 1 reply; 8+ messages in thread
From: Anton Blanchard @ 2010-11-03 13:34 UTC (permalink / raw)
To: Patrick Finnegan; +Cc: paulus, linuxppc-dev
Hi,
> > Now, I think this is the problem.
> >
> > The "assigned-addresses" property seems to indicate that the
> > firmware only assigned BAR 4 and didn't assign anything to the
> > other ones.
> >
> > I don't know why, but it definitely looks like a firmware bug to me.
> > On those machines, PCI resource assignment is under hypervisor
> > control and so Linux cannot re-assign missing resources itself.
> >
> > I'll see if I can find a FW person to shed some light on this.
> >
> > Can you provide me (privately maybe) with the FW version on the
> > machine ?
>
> Ben,
>
> Have you found out anything more on this (firmware) bug?
Firmware has the concept of "super slots" which allow larger memory
windows and TCE tables. Section 3.4.3 explains it:
http://www.redbooks.ibm.com/redpapers/pdfs/redp4095.pdf
Anton
* Re: Problem with Infiniband adapter on IBM p550
2010-10-08 5:45 ` Benjamin Herrenschmidt
@ 2010-11-03 3:15 ` Patrick Finnegan
2010-11-03 13:34 ` Anton Blanchard
2010-11-03 13:46 ` Benjamin Herrenschmidt
0 siblings, 2 replies; 8+ messages in thread
From: Patrick Finnegan @ 2010-11-03 3:15 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: paulus, linuxppc-dev
On Friday, October 08, 2010, Benjamin Herrenschmidt wrote:
> > Ok, so from what I can tell, the driver is unhappy because either
> > BAR 0 hasn't been assigned a memory resource or the size doesn't
> > match what the driver expects.
>
> Ooops, accidentally sent too quickly...
>
> From your OF log I see:
> reg 00c10000 00000000 00000000 00000000 00000000
> 03c10010 00000000 00000000 00000000 00100000
> 43c10018 00000000 00000000 00000000 00800000
> 43c10020 00000000 00000000 00000000 08000000
> assigned-addresses 83c10020 00000000 e8000000 00000000 08000000
>
> Now, I think this is the problem.
>
> The "assigned-addresses" property seems to indicate that the firmware
> only assigned BAR 4 and didn't assign anything to the other ones.
>
> I don't know why, but it definitely looks like a firmware bug to me.
> On those machines, PCI resource assignment is under hypervisor
> control and so Linux cannot re-assign missing resources itself.
>
> I'll see if I can find a FW person to shed some light on this.
>
> Can you provide me (privately maybe) with the FW version on the
> machine ?
Ben,
Have you found out anything more on this (firmware) bug?
Pat
--
Purdue University Research Computing --- http://www.rcac.purdue.edu/
The Computer Refuge --- http://computer-refuge.org
* Re: Problem with Infiniband adapter on IBM p550
2010-10-08 5:41 ` Benjamin Herrenschmidt
@ 2010-10-08 5:45 ` Benjamin Herrenschmidt
2010-11-03 3:15 ` Patrick Finnegan
0 siblings, 1 reply; 8+ messages in thread
From: Benjamin Herrenschmidt @ 2010-10-08 5:45 UTC (permalink / raw)
To: Patrick Finnegan; +Cc: paulus, linuxppc-dev
> Ok, so from what I can tell, the driver is unhappy because either BAR 0
> hasn't been assigned a memory resource or the size doesn't match what
> the driver expects.
>
Ooops, accidentally sent too quickly...
From your OF log I see:
reg 00c10000 00000000 00000000 00000000 00000000
03c10010 00000000 00000000 00000000 00100000
43c10018 00000000 00000000 00000000 00800000
43c10020 00000000 00000000 00000000 08000000
assigned-addresses 83c10020 00000000 e8000000 00000000 08000000
Now, I think this is the problem.
The "assigned-addresses" property seems to indicate that the firmware only
assigned BAR 4 and didn't assign anything to the other ones.
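As a rough illustration of how the reg and assigned-addresses cells above can be read, here is a minimal Python sketch. It assumes the standard Open Firmware PCI binding layout for the first cell (phys.hi = npt000ss bbbbbbbb dddddfff rrrrrrrr) and five cells per entry (phys.hi, phys.mid, phys.lo, size.hi, size.lo). The names PciOfEntry and decode_cells are made up for this example; the cell values are copied from the OF log above.

```python
from dataclasses import dataclass

# Address-space codes from the "ss" field of phys.hi.
SPACE_NAMES = {0: "config", 1: "io", 2: "mem32", 3: "mem64"}


@dataclass
class PciOfEntry:
    """One decoded 5-cell entry (hypothetical helper type for this sketch)."""
    relocatable: bool
    prefetchable: bool
    space: str
    bus: int
    device: int
    function: int
    register: int
    address: int
    size: int

    @property
    def bar(self):
        """BAR index for config registers 0x10..0x24, otherwise None."""
        if 0x10 <= self.register <= 0x24 and self.register % 4 == 0:
            return (self.register - 0x10) // 4
        return None


def decode_cells(text):
    """Decode whitespace-separated hex cells into a list of PciOfEntry."""
    cells = [int(tok, 16) for tok in text.split()]
    if len(cells) % 5 != 0:
        raise ValueError("expected a multiple of 5 cells")
    entries = []
    for i in range(0, len(cells), 5):
        hi, mid, lo, size_hi, size_lo = cells[i:i + 5]
        entries.append(PciOfEntry(
            relocatable=((hi >> 31) & 1) == 0,   # n bit set means non-relocatable
            prefetchable=((hi >> 30) & 1) == 1,  # p bit
            space=SPACE_NAMES[(hi >> 24) & 0x3],
            bus=(hi >> 16) & 0xFF,
            device=(hi >> 11) & 0x1F,
            function=(hi >> 8) & 0x7,
            register=hi & 0xFF,
            address=(mid << 32) | lo,
            size=(size_hi << 32) | size_lo,
        ))
    return entries


# Values copied from the OF log quoted in this thread.
REG = """
00c10000 00000000 00000000 00000000 00000000
03c10010 00000000 00000000 00000000 00100000
43c10018 00000000 00000000 00000000 00800000
43c10020 00000000 00000000 00000000 08000000
"""
ASSIGNED = "83c10020 00000000 e8000000 00000000 08000000"

reg = decode_cells(REG)
assigned = decode_cells(ASSIGNED)

# BARs the device asks for in "reg" that firmware did not place.
assigned_bars = {e.bar for e in assigned}
missing = [e.bar for e in reg if e.bar is not None and e.bar not in assigned_bars]
print("BARs without an assigned address:", missing)
```

With the values from this machine, the sketch finds BARs 0 and 2 in reg but not in assigned-addresses, which is consistent with the reading that only BAR 4 was assigned.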
I don't know why, but it definitely looks like a firmware bug to me. On those
machines, PCI resource assignment is under hypervisor control and so Linux
cannot re-assign missing resources itself.
I'll see if I can find a FW person to shed some light on this.
Can you provide me (privately maybe) with the FW version on the machine ?
Cheers,
Ben.
* Re: Problem with Infiniband adapter on IBM p550
2010-10-08 3:24 Patrick Finnegan
@ 2010-10-08 5:41 ` Benjamin Herrenschmidt
2010-10-08 5:45 ` Benjamin Herrenschmidt
0 siblings, 1 reply; 8+ messages in thread
From: Benjamin Herrenschmidt @ 2010-10-08 5:41 UTC (permalink / raw)
To: Patrick Finnegan; +Cc: linuxppc-dev
On Thu, 2010-10-07 at 23:24 -0400, Patrick Finnegan wrote:
> I seem to be running into a problem getting a Mellanox Infinihost
> Infiniband adapter working on my IBM p550 (a 9113-550). I'm using
> Debian squeeze, and tried upgrading to the 2.6.35.7 kernel, which did
> not help.
>
> I get the following messages in dmesg:
> [ 4.972548] ib_mthca: Mellanox InfiniBand HCA driver v1.0 (April 4, 2008)
> [ 4.972564] ib_mthca: Initializing 0000:c1:00.0
> [ 4.972674] ib_mthca 0000:c1:00.0: Missing DCS, aborting.
Ok, so from what I can tell, the driver is unhappy because either BAR 0
hasn't been assigned a memory resource or the size doesn't match what
the driver expects.
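A check along those lines can be approximated from user space. The Python sketch below is only an illustration, not the driver's code: it parses text in the format of the sysfs file /sys/bus/pci/devices/<address>/resource (one "start end flags" line per resource, in hex) and tests whether a BAR is a memory resource of a given length. The helper names are hypothetical, the sample text is made up to resemble this machine, and the 1 MB length expected for BAR 0 is an assumption taken from the size cell of the reg property quoted elsewhere in this thread.

```python
IORESOURCE_MEM = 0x00000200  # memory-resource flag bit, as in include/linux/ioport.h


def parse_resource(text):
    """Return a list of (start, end, flags) tuples, one per resource line."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 3:
            rows.append(tuple(int(f, 16) for f in fields))
    return rows


def resource_len(start, end):
    """Length of a resource; an all-zero range counts as unassigned (0)."""
    if start == 0 and end == 0:
        return 0
    return end - start + 1


def bar_ok(rows, bar, expected_len):
    """True if the BAR is a memory resource of exactly expected_len bytes."""
    start, end, flags = rows[bar]
    return bool(flags & IORESOURCE_MEM) and resource_len(start, end) == expected_len


# Made-up sample: BARs 0-3 unassigned, BAR 4 placed at 0xe8000000 (128 MB).
SAMPLE = """\
0x0000000000000000 0x0000000000000000 0x0000000000000000
0x0000000000000000 0x0000000000000000 0x0000000000000000
0x0000000000000000 0x0000000000000000 0x0000000000000000
0x0000000000000000 0x0000000000000000 0x0000000000000000
0x00000000e8000000 0x00000000efffffff 0x0000000000000200
"""

rows = parse_resource(SAMPLE)
print("BAR 0 usable:", bar_ok(rows, 0, 1 << 20))
print("BAR 4 usable:", bar_ok(rows, 4, 0x08000000))
```

On a real system the text would come from reading the device's resource file instead of SAMPLE.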
Let's see...
> The problem looks the same as a problem I ran into with OpenFirmware on
> a Sun V880, which was fixed with this patch by Dave Miller:
> http://ns3.spinics.net/lists/linux-rdma/msg01779.html
>
> I spent some time looking at the equivalent function on powerpc, but
> didn't find a block of code that looked similar.
I don't think we are hitting the same problem. I believe our code in
that area differs enough.
In your lspci, however, I see:
Memory at <unassigned> (64-bit, non-prefetchable)
Memory at <unassigned> (64-bit, prefetchable)
Which doesn't look good...
From your OF log
> Any suggestions?
>
> I have dmesg, the dev .properties from openfirmware, and lspci -v from
> the machine:
>
> http://ned.rcac.purdue.edu/p550-ib/dmesg
> http://ned.rcac.purdue.edu/p550-ib/ib-of-device
> http://ned.rcac.purdue.edu/p550-ib/lspci-v
>
> Pat
* Problem with Infiniband adapter on IBM p550
@ 2010-10-08 3:24 Patrick Finnegan
2010-10-08 5:41 ` Benjamin Herrenschmidt
0 siblings, 1 reply; 8+ messages in thread
From: Patrick Finnegan @ 2010-10-08 3:24 UTC (permalink / raw)
To: linuxppc-dev
I seem to be running into a problem getting a Mellanox Infinihost
Infiniband adapter working on my IBM p550 (a 9113-550). I'm using
Debian squeeze, and tried upgrading to the 2.6.35.7 kernel, which did
not help.
I get the following messages in dmesg:
[ 4.972548] ib_mthca: Mellanox InfiniBand HCA driver v1.0 (April 4, 2008)
[ 4.972564] ib_mthca: Initializing 0000:c1:00.0
[ 4.972674] ib_mthca 0000:c1:00.0: Missing DCS, aborting.
The problem looks the same as a problem I ran into with OpenFirmware on
a Sun V880, which was fixed with this patch by Dave Miller:
http://ns3.spinics.net/lists/linux-rdma/msg01779.html
I spent some time looking at the equivalent function on powerpc, but
didn't find a block of code that looked similar.
Any suggestions?
I have dmesg, the dev .properties from openfirmware, and lspci -v from
the machine:
http://ned.rcac.purdue.edu/p550-ib/dmesg
http://ned.rcac.purdue.edu/p550-ib/ib-of-device
http://ned.rcac.purdue.edu/p550-ib/lspci-v
Pat
--
Purdue University Research Computing --- http://www.rcac.purdue.edu/
The Computer Refuge --- http://computer-refuge.org
Thread overview: 8+ messages
2010-10-08 2:57 Problem with Infiniband adapter on IBM p550 Patrick Finnegan
2010-10-08 3:24 Patrick Finnegan
2010-10-08 5:41 ` Benjamin Herrenschmidt
2010-10-08 5:45 ` Benjamin Herrenschmidt
2010-11-03 3:15 ` Patrick Finnegan
2010-11-03 13:34 ` Anton Blanchard
2010-11-03 14:43 ` Patrick Finnegan
2010-11-03 13:46 ` Benjamin Herrenschmidt