* for_each_xxx_of_node() - lots of refcounting bugs
@ 2015-09-01 11:07 Russell King - ARM Linux
       [not found] ` <20150901110743.GJ21084-l+eeeJia6m9vn6HldHNs0ANdhmdF6hFW@public.gmane.org>
  0 siblings, 1 reply; 3+ messages in thread
From: Russell King - ARM Linux @ 2015-09-01 11:07 UTC
  To: Grant Likely, Rob Herring; +Cc: devicetree-u79uwXL29TY76Z2rM5mHXA

Consider the following loop:

	for_each_child_of_node(pdev->dev.of_node, child) {
		if (some_condition)
			break;
	}

The use of for_each_..._of_node() leads people to believe that it's
like other for_each_...() loops - the continue and break statements
can be used.

However, with OF, "break" can't be used without disrupting the
reference counting on the nodes.  This is because:

#define for_each_child_of_node(parent, child) \
        for (child = of_get_next_child(parent, NULL); child != NULL; \
             child = of_get_next_child(parent, child))

of_get_next_child() takes a reference on the node it's about to return,
while dropping the reference on the node passed into it.  In the case
of the last iteration, where of_get_next_child() returns NULL, the
previous child will have its reference dropped, resulting in no child
nodes having a reference held.

However, if a 'break' statement is used, the reference on the current
child is not dropped unless code explicitly drops it.
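
To make the counting concrete, here is a minimal userspace sketch of the
behaviour described above.  It is not kernel code: struct node, node_get(),
node_put(), get_next_child() and for_each_child() are hypothetical stand-ins
for struct device_node, of_node_get(), of_node_put(), of_get_next_child() and
for_each_child_of_node(), and they only model the get/put ordering.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct device_node: only what the demo needs. */
struct node {
	int refcount;
	struct node *child;	/* first child */
	struct node *sibling;	/* next sibling */
};

static struct node *node_get(struct node *n)
{
	if (n)
		n->refcount++;
	return n;
}

static void node_put(struct node *n)
{
	if (n) {
		assert(n->refcount > 0);
		n->refcount--;
	}
}

/* Models of_get_next_child(): take a ref on the next child, drop prev's. */
static struct node *get_next_child(struct node *parent, struct node *prev)
{
	struct node *next = prev ? prev->sibling : parent->child;

	node_get(next);
	node_put(prev);
	return next;
}

#define for_each_child(parent, child) \
	for (child = get_next_child(parent, NULL); child != NULL; \
	     child = get_next_child(parent, child))

/*
 * Walk children a, b, c and stop at b.  Returns the total number of
 * child references still held once the loop has been left.
 */
static int refs_left_after_early_exit(int put_before_break)
{
	struct node c = { 0, NULL, NULL };
	struct node b = { 0, NULL, &c };
	struct node a = { 0, NULL, &b };
	struct node parent = { 1, &a, NULL };
	struct node *child;

	for_each_child(&parent, child) {
		if (child == &b) {
			if (put_before_break)
				node_put(child);	/* explicit drop */
			break;
		}
	}
	return a.refcount + b.refcount + c.refcount;
}

/* Same tree, but the loop runs until get_next_child() returns NULL. */
static int refs_left_after_full_walk(void)
{
	struct node c = { 0, NULL, NULL };
	struct node b = { 0, NULL, &c };
	struct node a = { 0, NULL, &b };
	struct node parent = { 1, &a, NULL };
	struct node *child;

	for_each_child(&parent, child)
		;
	return a.refcount + b.refcount + c.refcount;
}
```

In this model refs_left_after_full_walk() leaves nothing held, while
refs_left_after_early_exit(0) leaves one reference on the node the loop
stopped at; passing 1 adds the explicit put before the break and brings the
count back to zero.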

We have code which does exactly this kind of thing:

        for_each_child_of_node(cpus, cpu) {
                /*
                 * A device tree containing CPU nodes with missing "reg"
                 * properties is considered invalid to build the
                 * cpu_logical_map.
                 */
                if (of_property_read_u32(cpu, "reg", &hwid)) {
                        pr_debug(" * %s missing reg property\n",
                                     cpu->full_name);
                        return;
                }

                /*
                 * 8 MSBs must be set to 0 in the DT since the reg property
                 * defines the MPIDR[23:0].
                 */
                if (hwid & ~MPIDR_HWID_BITMASK)
                        return;
... more return statements

        for_each_child_of_node(np, np0) {
                struct device_node *fc;
                int i;

                res = of_dev_hwmod_lookup(np0, oh, &i, &fc);
                if (res == 0) {
                        *found = fc;
                        *index = i;
                        return 0;

        for_each_child_of_node(parent, np) {
                pd = kzalloc(sizeof(*pd), GFP_KERNEL);
                if (!pd)
                        return -ENOMEM;

Virtually _all_ uses of for_each_child_of_node() in the kernel today
where the loop is terminated early leak a reference on the child node.
Even some of the drivers/of code does it:

	... for_each_child_of_node(root, child) {
                if (!of_match_node(matches, child))
                        continue;
                rc = of_platform_bus_create(child, matches, NULL, parent, false);
                if (rc)
                        break;
        }

This pretty much shows the danger of using macros which hide details
like this from the programmer - it leads to the assumption that it's
fine to use 'break' and 'return' without any further consideration,
because that's what you can do in standard C loops.  The fact that
these loops are actually more complex than that is hidden behind the
macro, and thus gets forgotten.

We could go around and fix all these sites, but that's not going to
stop this continuing to happen into the future.  So, fixing the
existing bugs is not a fix at all, it's a papering over of a more
fundamental problem here.

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: for_each_xxx_of_node() - lots of refcounting bugs
       [not found] ` <20150901110743.GJ21084-l+eeeJia6m9vn6HldHNs0ANdhmdF6hFW@public.gmane.org>
@ 2015-09-01 22:00   ` Rob Herring
       [not found]     ` <CAL_JsqJE5seB=5-9_QJBFG1=ipDr_osQJusOSk00VZzwhs=CDA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 3+ messages in thread
From: Rob Herring @ 2015-09-01 22:00 UTC
  To: Russell King - ARM Linux
  Cc: Grant Likely, devicetree-u79uwXL29TY76Z2rM5mHXA, Frank Rowand

On Tue, Sep 1, 2015 at 6:07 AM, Russell King - ARM Linux
<linux-lFZ/pmaqli7XmaaqVzeoHQ@public.gmane.org> wrote:
> < snip - description of the for_each_child_of_node() refcount leak >
>
> We could go around and fix all these sites, but that's not going to
> stop this continuing to happen into the future.  So, fixing the
> existing bugs is not a fix at all, it's a papering over of a more
> fundamental problem here.

Yes, the ref counting for DT in general is difficult to get right and
needs to be redesigned. Geert did a checker and even the core and
unittests have 44 errors[1]. However, it is a nop in most cases, and
it only really matters on IBM pSeries and only for certain nodes on
those AIUI. We've had some discussions about it before, but no one has
come up with a solution. Managing this at a node level is probably too
fine grained when most nodes don't need ref counting. The implicit get
and explicit put are also a problem IMO. We need to be able to look at
code and see the calls are balanced.
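
As an illustration of what "visibly balanced" could look like, here is a
small userspace sketch.  It is only one possible shape, not a proposal from
this thread: the iterator takes no references at all, and the caller does an
explicit get that pairs with an explicit put.  All names (struct node,
node_get(), node_put(), for_each_child_noref(), find_child_get()) are
hypothetical, and the sketch assumes the tree cannot change while it is
being walked, which the real kernel would have to guarantee by other means.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct device_node. */
struct node {
	int id;
	int refcount;
	struct node *child;	/* first child */
	struct node *sibling;	/* next sibling */
};

static struct node *node_get(struct node *n)
{
	assert(n != NULL);
	n->refcount++;
	return n;
}

static void node_put(struct node *n)
{
	assert(n != NULL && n->refcount > 0);
	n->refcount--;
}

/* Plain pointer walk: no implicit get, so break/return leak nothing. */
#define for_each_child_noref(parent, child) \
	for (child = (parent)->child; child != NULL; child = child->sibling)

/* Returns the matching child with one explicit reference held, or NULL. */
static struct node *find_child_get(struct node *parent, int id)
{
	struct node *child;

	for_each_child_noref(parent, child) {
		if (child->id == id)
			return node_get(child);	/* the only get, in plain sight */
	}
	return NULL;
}

struct counts {
	int held;	/* child refs while the caller holds its lookup result */
	int after;	/* child refs after the caller's matching put */
};

static struct counts lookup_counts(int id)
{
	struct node b = { 2, 0, NULL, NULL };
	struct node a = { 1, 0, NULL, &b };
	struct node parent = { 0, 1, &a, NULL };
	struct counts c;
	struct node *hit = find_child_get(&parent, id);

	c.held = a.refcount + b.refcount;
	if (hit)
		node_put(hit);			/* pairs with find_child_get() */
	c.after = a.refcount + b.refcount;
	return c;
}
```

Here every get in the source has a put a reader can find next to it, and an
unsuccessful lookup (for example lookup_counts(9)) never touches a refcount.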

Rob

[1] https://lkml.org/lkml/2015/1/23/437


* Re: for_each_xxx_of_node() - lots of refcounting bugs
       [not found]     ` <CAL_JsqJE5seB=5-9_QJBFG1=ipDr_osQJusOSk00VZzwhs=CDA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2015-09-03 18:26       ` Frank Rowand
  0 siblings, 0 replies; 3+ messages in thread
From: Frank Rowand @ 2015-09-03 18:26 UTC
  To: Rob Herring
  Cc: Russell King - ARM Linux, Grant Likely,
	devicetree-u79uwXL29TY76Z2rM5mHXA

On 9/1/2015 3:00 PM, Rob Herring wrote:
> On Tue, Sep 1, 2015 at 6:07 AM, Russell King - ARM Linux
> <linux-lFZ/pmaqli7XmaaqVzeoHQ@public.gmane.org> wrote:
>> Consider the following loop:
>>
>>         for_each_child_of_node(pdev->dev.of_node, child) {
>>                 if (some_condition)
>>                         break;
>>         }
>>
>> The use of for_each_..._of_node() leads people to believe that it's
>> like other for_each_...() loops - the continue and break statements
>> can be used.
>>
>> However, with OF, "break" can't be used without disrupting the
>> reference counting on the nodes.  This is because:

< snip - nice explanation of the situation >

>> We could go around and fix all these sites, but that's not going to
>> stop this continuing to happen into the future.  So, fixing the
>> existing bugs is not a fix at all, it's a papering over of a more
>> fundamental problem here.
> 
> Yes, the ref counting for DT in general is difficult to get right and
> needs to be redesigned. Geert did a checker and even the core and
> unittests have 44 errors[1]. However, it is a nop in most cases, and
> it only really matters on IBM pSeries and only for certain nodes on
> those AIUI. We've had some discussions about it before, but no one has
> come up with a solution. Managing this at a node level is probably too
> fine grained when most nodes don't need ref counting. The implicit get
> and explicit put are also a problem IMO. We need to be able to look at
> code and see the calls are balanced.
> 
> Rob
> 
> [1] https://lkml.org/lkml/2015/1/23/437

I agree with all of the above.  I do not think that chasing after all
the broken sites is a long term solution.

But I have been poking at this a little bit this year.  I have some run
time debug data that collects where the refcounts are modified and what
the resulting values are.  In the unlikely case that someone has a short
term refcount issue and needs to fix the broken refcount of a specific
DT object as a temporary work around, I can share my hack tools.
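
For readers wondering what that kind of data looks like, the sketch below
shows the general idea in userspace C.  It is not the tooling mentioned
above, which is not shown in this thread: the struct, the log array and the
node_get_traced()/node_put_traced() macros are all hypothetical, and the
only technique illustrated is recording the call site and resulting value
on every refcount change.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for struct device_node. */
struct node {
	const char *name;
	int refcount;
};

/* One record per refcount modification. */
struct ref_event {
	const char *name;	/* node name at the time of the change */
	const char *file;	/* call site */
	int line;
	int delta;		/* +1 for get, -1 for put */
	int result;		/* refcount after the change */
};

#define MAX_EVENTS 64
static struct ref_event ref_log[MAX_EVENTS];
static size_t ref_log_len;

static void ref_record(struct node *n, int delta, const char *file, int line)
{
	assert(n != NULL);
	n->refcount += delta;
	if (ref_log_len < MAX_EVENTS) {
		struct ref_event ev = { n->name, file, line, delta, n->refcount };

		ref_log[ref_log_len++] = ev;
	}
}

/* Macros so that __FILE__/__LINE__ name the caller, not this helper. */
#define node_get_traced(n) ref_record((n), +1, __FILE__, __LINE__)
#define node_put_traced(n) ref_record((n), -1, __FILE__, __LINE__)

/* Print the collected history, one line per modification. */
static void ref_log_dump(FILE *out)
{
	size_t i;

	for (i = 0; i < ref_log_len; i++)
		fprintf(out, "%s:%d %s %+d -> %d\n",
			ref_log[i].file, ref_log[i].line, ref_log[i].name,
			ref_log[i].delta, ref_log[i].result);
}

/* Two gets and one put: the log shows which call site is unbalanced. */
static int trace_demo(void)
{
	struct node n = { "cpu@0", 0 };

	ref_log_len = 0;
	node_get_traced(&n);
	node_get_traced(&n);
	node_put_traced(&n);
	return n.refcount;
}
```

After trace_demo() the node is left with one reference and the log holds
three entries; calling ref_log_dump(stderr) prints them with the file and
line of each get and put.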

-Frank


