linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
* PPC: Possible bug in prom_parse.c:of_irq_map_raw?
@ 2010-02-10  3:21 John Williams
  2010-10-07 18:34 ` Grant Likely
  0 siblings, 1 reply; 3+ messages in thread
From: John Williams @ 2010-02-10  3:21 UTC (permalink / raw)
  To: Linux Kernel list, linuxppc-dev, Grant Likely, Michal Simek

Hi,

Perhaps this is my misunderstanding, but I'm looking at the bit of
code in of_irq_map_raw() that iterates the interrupt-map DTS node,
which I am seeing to behave strangely when handling the interrupt-map
property on a PCI bridge node.

Early in the function, we get the #address-cells value from the node -
in this case the PCI bridge, and store it in local var addrsize:

        /* Look for this #address-cells. We have to implement the old linux
         * trick of looking for the parent here as some device-trees rely on it
         */
        old = of_node_get(ipar);
        do {
                tmp = of_get_property(old, "#address-cells", NULL);
                tnode = of_get_parent(old);
                of_node_put(old);
                old = tnode;
        } while(old && tmp == NULL);
        of_node_put(old);
        old = NULL;
        addrsize = (tmp == NULL) ? 2 : *tmp;

        DBG(" -> addrsize=%d\n", addrsize);


Later, once we've found the interrupt-map and are iterating over it
attempting to match on PCI device addresses, there is this little
fragment:

                        /* Get the interrupt parent */
                        if (of_irq_workarounds & OF_IMAP_NO_PHANDLE)
                                newpar = of_node_get(of_irq_dflt_pic);
                        else
                                newpar =
of_find_node_by_phandle((phandle)*imap);
                        imap++;
                        --imaplen;

                        /* Check if not found */
                        if (newpar == NULL) {
                                DBG(" -> imap parent not found !\n");
                                goto fail;
                        }

                        /* Get #interrupt-cells and #address-cells of new
                         * parent
                         */
                        tmp = of_get_property(newpar, "#interrupt-cells", NULL);
                        if (tmp == NULL) {
                                DBG(" -> parent lacks #interrupt-cells !\n");
                                goto fail;
                        }
                        newintsize = *tmp;
                        tmp = of_get_property(newpar, "#address-cells", NULL);
                        newaddrsize = (tmp == NULL) ? 0 : *tmp;

Finally, later in the loop we update addrsize=newaddrsize

As I read this code, and the behaviour I'm seeing, is that if the
interrupt controller parent device doesn't have an #address-cells
entry, then this get_property will return fail, therefore setting
newaddrsize  to zero.  This then means that subsequent match attempts
when iterating the interrupt-map always fail, because the match length
(addrsize) is 0.

Maybe I'm missing something here, but shouldn't this last bit of code
instead be:

                        tmp = of_get_property(newpar, "#address-cells", NULL);
                        newaddrsize = (tmp == NULL) ? addrsize : *tmp;

 ^^^^^^^^^
?

In other words, if the parent node has an #address-cells value, then
use it, otherwise stick to the #address-cells value we picked up for
the actual node in question (ie the PCI bridge).

The way this is manifesting itself in the system I'm looking at is
that only the PCI device matching the first entry in the PCI bridge
interrupt-map property is correctly matching. For PCI devices that
require more than one pass through the interrupt-map loop, addrsize
goes to zero and no further match is possible.

I might be able to hack around this in the device-tree by setting
#address-cells=<3> in the interrupt controller node, but that doesn't
smell right either - it's only true for PCI devices for a start,
whereas this controller handles interrupts from many sources on the
32-bit PLB bus as well.  I looked at the ML510 DTS in boot/dts and
it's not doing anything like this.

As a footnote, I'm actually doing this in MicroBlaze whose PCI
infrastructure is not yet in mainline, however we copied this bit of
code directly from PPC and the logic is the same.

Thanks,

John
-- 
John Williams
PetaLogix - Linux Solutions for a Reconfigurable World
w: www.petalogix.com  p: +61-7-30090663  f: +61-7-30090663

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: PPC: Possible bug in prom_parse.c:of_irq_map_raw?
  2010-02-10  3:21 PPC: Possible bug in prom_parse.c:of_irq_map_raw? John Williams
@ 2010-10-07 18:34 ` Grant Likely
  2010-10-12  3:02   ` John Williams
  0 siblings, 1 reply; 3+ messages in thread
From: Grant Likely @ 2010-10-07 18:34 UTC (permalink / raw)
  To: John Williams
  Cc: linuxppc-dev, devicetree-discuss, Linux Kernel list, Michal Simek

Reaching way back into the past....

John, did you ever solve your issue here?  Comments below.

On Tue, Feb 9, 2010 at 8:21 PM, John Williams
<john.williams@petalogix.com> wrote:
> Hi,
>
> Perhaps this is my misunderstanding, but I'm looking at the bit of
> code in of_irq_map_raw() that iterates the interrupt-map DTS node,
> which I am seeing to behave strangely when handling the interrupt-map
> property on a PCI bridge node.
>
> Early in the function, we get the #address-cells value from the node -
> in this case the PCI bridge, and store it in local var addrsize:
>
> =A0 =A0 =A0 =A0/* Look for this #address-cells. We have to implement the =
old linux
> =A0 =A0 =A0 =A0 * trick of looking for the parent here as some device-tre=
es rely on it
> =A0 =A0 =A0 =A0 */
> =A0 =A0 =A0 =A0old =3D of_node_get(ipar);
> =A0 =A0 =A0 =A0do {
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0tmp =3D of_get_property(old, "#address-cel=
ls", NULL);
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0tnode =3D of_get_parent(old);
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0of_node_put(old);
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0old =3D tnode;
> =A0 =A0 =A0 =A0} while(old && tmp =3D=3D NULL);
> =A0 =A0 =A0 =A0of_node_put(old);
> =A0 =A0 =A0 =A0old =3D NULL;
> =A0 =A0 =A0 =A0addrsize =3D (tmp =3D=3D NULL) ? 2 : *tmp;
>
> =A0 =A0 =A0 =A0DBG(" -> addrsize=3D%d\n", addrsize);
>
>
> Later, once we've found the interrupt-map and are iterating over it
> attempting to match on PCI device addresses, there is this little
> fragment:
>
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0/* Get the interrupt paren=
t */
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0if (of_irq_workarounds & O=
F_IMAP_NO_PHANDLE)
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0newpar =3D=
 of_node_get(of_irq_dflt_pic);
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0else
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0newpar =3D
> of_find_node_by_phandle((phandle)*imap);
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0imap++;
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0--imaplen;
>
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0/* Check if not found */
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0if (newpar =3D=3D NULL) {
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0DBG(" -> i=
map parent not found !\n");
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0goto fail;
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0}
>
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0/* Get #interrupt-cells an=
d #address-cells of new
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 * parent
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 */
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0tmp =3D of_get_property(ne=
wpar, "#interrupt-cells", NULL);
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0if (tmp =3D=3D NULL) {
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0DBG(" -> p=
arent lacks #interrupt-cells !\n");
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0goto fail;
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0}
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0newintsize =3D *tmp;
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0tmp =3D of_get_property(ne=
wpar, "#address-cells", NULL);
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0newaddrsize =3D (tmp =3D=
=3D NULL) ? 0 : *tmp;
>
> Finally, later in the loop we update addrsize=3Dnewaddrsize
>
> As I read this code, and the behaviour I'm seeing, is that if the
> interrupt controller parent device doesn't have an #address-cells
> entry, then this get_property will return fail, therefore setting
> newaddrsize =A0to zero. =A0This then means that subsequent match attempts
> when iterating the interrupt-map always fail, because the match length
> (addrsize) is 0.

Correct.  The interrupt-map property contains the following fields:

child-unit-address child-irq irq-controller irq-parent-unit-address parent-=
irq

In the *vast majority* of cases, the irq-parent-unit-address is a
zero-length field because #address-cells isn't present on the
interrupt controller parent.  So effectively interrupt-map becomes:

child-unit-address child-irq irq-controller parent-irq

See epapr 1.0 for a full discussion

>
> Maybe I'm missing something here, but shouldn't this last bit of code
> instead be:
>
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0tmp =3D of_get_property(ne=
wpar, "#address-cells", NULL);
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0newaddrsize =3D (tmp =3D=
=3D NULL) ? addrsize : *tmp;
>
> =A0^^^^^^^^^
> ?
>
> In other words, if the parent node has an #address-cells value, then
> use it, otherwise stick to the #address-cells value we picked up for
> the actual node in question (ie the PCI bridge).

No, because at this point we absolutely do want to know how big the
parent #address-cells is, and if it is missing, we need to use 0.  If
the child's addrsize continued to be used, then the interrupt-map
parsing would get unaligned.

The inner loop is over the entries in interrupt-map.  addrsize and
intsize are only updated in the case where a match is found in the
table.  If a match isn't found, then it should bail out to the 'fail'
label.

> The way this is manifesting itself in the system I'm looking at is
> that only the PCI device matching the first entry in the PCI bridge
> interrupt-map property is correctly matching. For PCI devices that
> require more than one pass through the interrupt-map loop, addrsize
> goes to zero and no further match is possible.

Something sounds fishy.  If you're still having problems, can you
enable #define DEBUG in drivers/of/irq.c and post the output?

> I might be able to hack around this in the device-tree by setting
> #address-cells=3D<3> in the interrupt controller node, but that doesn't
> smell right either - it's only true for PCI devices for a start,
> whereas this controller handles interrupts from many sources on the
> 32-bit PLB bus as well. =A0I looked at the ML510 DTS in boot/dts and
> it's not doing anything like this.

Yeah, that's not right, and it would mess up the interrupt-map
parsing.  You'd end up with more hard to debug problems.

g.

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: PPC: Possible bug in prom_parse.c:of_irq_map_raw?
  2010-10-07 18:34 ` Grant Likely
@ 2010-10-12  3:02   ` John Williams
  0 siblings, 0 replies; 3+ messages in thread
From: John Williams @ 2010-10-12  3:02 UTC (permalink / raw)
  To: Grant Likely
  Cc: linuxppc-dev, devicetree-discuss, Linux Kernel list, Michal Simek

Hi Grant,

On Fri, Oct 8, 2010 at 4:34 AM, Grant Likely <grant.likely@secretlab.ca> wr=
ote:
> Reaching way back into the past....

Indeed

> John, did you ever solve your issue here? =A0Comments below.

The fix in our case was to explicitly add child nodes to the PCI
controller, with interrupt-parent and interrupts properties. The
product in question had a fixed set of devices on the PCI bus, so this
was not a problem.

So, it's not really resolved so much as avoided.

Other comments below:

>> Perhaps this is my misunderstanding, but I'm looking at the bit of
>> code in of_irq_map_raw() that iterates the interrupt-map DTS node,
>> which I am seeing to behave strangely when handling the interrupt-map
>> property on a PCI bridge node.
>>
>> Early in the function, we get the #address-cells value from the node -
>> in this case the PCI bridge, and store it in local var addrsize:
>>
>> =A0 =A0 =A0 =A0/* Look for this #address-cells. We have to implement the=
 old linux
>> =A0 =A0 =A0 =A0 * trick of looking for the parent here as some device-tr=
ees rely on it
>> =A0 =A0 =A0 =A0 */
>> =A0 =A0 =A0 =A0old =3D of_node_get(ipar);
>> =A0 =A0 =A0 =A0do {
>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0tmp =3D of_get_property(old, "#address-ce=
lls", NULL);
>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0tnode =3D of_get_parent(old);
>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0of_node_put(old);
>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0old =3D tnode;
>> =A0 =A0 =A0 =A0} while(old && tmp =3D=3D NULL);
>> =A0 =A0 =A0 =A0of_node_put(old);
>> =A0 =A0 =A0 =A0old =3D NULL;
>> =A0 =A0 =A0 =A0addrsize =3D (tmp =3D=3D NULL) ? 2 : *tmp;
>>
>> =A0 =A0 =A0 =A0DBG(" -> addrsize=3D%d\n", addrsize);
>>
>>
>> Later, once we've found the interrupt-map and are iterating over it
>> attempting to match on PCI device addresses, there is this little
>> fragment:
>>
>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0/* Get the interrupt pare=
nt */
>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0if (of_irq_workarounds & =
OF_IMAP_NO_PHANDLE)
>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0newpar =
=3D of_node_get(of_irq_dflt_pic);
>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0else
>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0newpar =
=3D
>> of_find_node_by_phandle((phandle)*imap);
>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0imap++;
>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0--imaplen;
>>
>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0/* Check if not found */
>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0if (newpar =3D=3D NULL) {
>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0DBG(" -> =
imap parent not found !\n");
>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0goto fail=
;
>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0}
>>
>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0/* Get #interrupt-cells a=
nd #address-cells of new
>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 * parent
>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 */
>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0tmp =3D of_get_property(n=
ewpar, "#interrupt-cells", NULL);
>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0if (tmp =3D=3D NULL) {
>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0DBG(" -> =
parent lacks #interrupt-cells !\n");
>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0goto fail=
;
>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0}
>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0newintsize =3D *tmp;
>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0tmp =3D of_get_property(n=
ewpar, "#address-cells", NULL);
>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0newaddrsize =3D (tmp =3D=
=3D NULL) ? 0 : *tmp;
>>
>> Finally, later in the loop we update addrsize=3Dnewaddrsize
>>
>> As I read this code, and the behaviour I'm seeing, is that if the
>> interrupt controller parent device doesn't have an #address-cells
>> entry, then this get_property will return fail, therefore setting
>> newaddrsize =A0to zero. =A0This then means that subsequent match attempt=
s
>> when iterating the interrupt-map always fail, because the match length
>> (addrsize) is 0.
>
> Correct. =A0The interrupt-map property contains the following fields:
>
> child-unit-address child-irq irq-controller irq-parent-unit-address paren=
t-irq
>
> In the *vast majority* of cases, the irq-parent-unit-address is a
> zero-length field because #address-cells isn't present on the
> interrupt controller parent. =A0So effectively interrupt-map becomes:
>
> child-unit-address child-irq irq-controller parent-irq
>
> See epapr 1.0 for a full discussion
>
>>
>> Maybe I'm missing something here, but shouldn't this last bit of code
>> instead be:
>>
>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0tmp =3D of_get_property(n=
ewpar, "#address-cells", NULL);
>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0newaddrsize =3D (tmp =3D=
=3D NULL) ? addrsize : *tmp;
>>
>> =A0^^^^^^^^^
>> ?
>>
>> In other words, if the parent node has an #address-cells value, then
>> use it, otherwise stick to the #address-cells value we picked up for
>> the actual node in question (ie the PCI bridge).
>
> No, because at this point we absolutely do want to know how big the
> parent #address-cells is, and if it is missing, we need to use 0. =A0If
> the child's addrsize continued to be used, then the interrupt-map
> parsing would get unaligned.
>
> The inner loop is over the entries in interrupt-map. =A0addrsize and
> intsize are only updated in the case where a match is found in the
> table. =A0If a match isn't found, then it should bail out to the 'fail'
> label.
>
>> The way this is manifesting itself in the system I'm looking at is
>> that only the PCI device matching the first entry in the PCI bridge
>> interrupt-map property is correctly matching. For PCI devices that
>> require more than one pass through the interrupt-map loop, addrsize
>> goes to zero and no further match is possible.
>
> Something sounds fishy. =A0If you're still having problems, can you
> enable #define DEBUG in drivers/of/irq.c and post the output?

OK, this may take me a lilttle while as the test system is in storage!

Thanks,

John

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2010-10-12  3:02 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-02-10  3:21 PPC: Possible bug in prom_parse.c:of_irq_map_raw? John Williams
2010-10-07 18:34 ` Grant Likely
2010-10-12  3:02   ` John Williams

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).