linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* of_node_put() usage is buggy all over drivers/of/base.c?!
@ 2021-08-14  1:01 Vladimir Oltean
  2021-08-16 14:33 ` Frank Rowand
  0 siblings, 1 reply; 8+ messages in thread
From: Vladimir Oltean @ 2021-08-14  1:01 UTC (permalink / raw)
  To: Rob Herring, Frank Rowand, Sascha Hauer, devicetree,
	linux-kernel, netdev

Hi,

I was debugging an RCU stall which happened during the probing of a
driver. Activating lock debugging, I see:

[  101.710694] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:938
[  101.719119] in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 1534, name: sh
[  101.726763] INFO: lockdep is turned off.
[  101.730674] irq event stamp: 0
[  101.733716] hardirqs last  enabled at (0): [<0000000000000000>] 0x0
[  101.739973] hardirqs last disabled at (0): [<ffffd3ebecb10120>] copy_process+0xa78/0x1a98
[  101.748146] softirqs last  enabled at (0): [<ffffd3ebecb10120>] copy_process+0xa78/0x1a98
[  101.756313] softirqs last disabled at (0): [<0000000000000000>] 0x0
[  101.762569] CPU: 4 PID: 1534 Comm: sh Not tainted 5.14.0-rc5+ #272
[  101.774558] Call trace:
[  101.794734]  __might_sleep+0x50/0x88
[  101.798297]  __mutex_lock+0x60/0x938
[  101.801863]  mutex_lock_nested+0x38/0x50
[  101.805775]  kernfs_remove+0x2c/0x50             <---- this takes mutex_lock(&kernfs_mutex);
[  101.809341]  sysfs_remove_dir+0x54/0x70
[  101.813166]  __kobject_del+0x3c/0x80
[  101.816733]  kobject_put+0xf8/0x108
[  101.820211]  of_node_put+0x18/0x28
[  101.823602]  of_find_compatible_node+0xa8/0xf8    <--- this takes raw_spin_lock_irqsave(&devtree_lock)
[  101.828036]  sja1105_mdiobus_register+0x264/0x7a8

The pattern of calling of_node_put from under the atomic devtree_lock
context is pretty widespread in drivers/of/base.c.

Just by inspecting the code, this seems to be an issue since commit:

commit 75b57ecf9d1d1e17d099ab13b8f48e6e038676be
Author: Grant Likely <grant.likely@linaro.org>
Date:   Thu Feb 20 18:02:11 2014 +0000

    of: Make device nodes kobjects so they show up in sysfs

    Device tree nodes are already treated as objects, and we already want to
    expose them to userspace which is done using the /proc filesystem today.
    Right now the kernel has to do a lot of work to keep the /proc view in
    sync with the in-kernel representation. If device_nodes are switched to
    be kobjects then the device tree code can be a whole lot simpler. It
    also turns out that switching to using /sysfs from /proc results in
    smaller code and data size, and the userspace ABI won't change if
    /proc/device-tree symlinks to /sys/firmware/devicetree/base.

    v7: Add missing sysfs_bin_attr_init()
    v6: Add __of_add_property() early init fixes from Pantelis
    v5: Rename firmware/ofw to firmware/devicetree
        Fix updating property values in sysfs
    v4: Fixed build error on Powerpc
        Fixed handling of dynamic nodes on powerpc
    v3: Fixed handling of duplicate attribute and child node names
    v2: switch to using sysfs bin_attributes which solve the problem of
        reporting incorrect property size.

    Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
    Tested-by: Sascha Hauer <s.hauer@pengutronix.de>
    Cc: Rob Herring <rob.herring@calxeda.com>
    Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    Cc: David S. Miller <davem@davemloft.net>
    Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com>
    Cc: Pantelis Antoniou <panto@antoniou-consulting.com>

because up until that point, of_node_put() was:

void of_node_put(struct device_node *node)
{
	if (node)
		kref_put(&node->kref, of_node_release);
}

and not:

void of_node_put(struct device_node *node)
{
	if (node)
		kobject_put(&node->kobj);
}

Either I'm holding it very, very wrong, or this is a very severe
oversight that just happened somehow to go unnoticed for 7 years.

Please tell me it's me.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: of_node_put() usage is buggy all over drivers/of/base.c?!
  2021-08-14  1:01 of_node_put() usage is buggy all over drivers/of/base.c?! Vladimir Oltean
@ 2021-08-16 14:33 ` Frank Rowand
  2021-08-16 14:46   ` Vladimir Oltean
  0 siblings, 1 reply; 8+ messages in thread
From: Frank Rowand @ 2021-08-16 14:33 UTC (permalink / raw)
  To: Vladimir Oltean, Rob Herring, Sascha Hauer, devicetree,
	linux-kernel, netdev

Hi Vladimir,

On 8/13/21 8:01 PM, Vladimir Oltean wrote:
> Hi,
> 
> I was debugging an RCU stall which happened during the probing of a
> driver. Activating lock debugging, I see:

I took a quick look at sja1105_mdiobus_register() in v5.14-rc1 and v5.14-rc6.

Looking at the following stack trace, I did not see any calls to
of_find_compatible_node() in sja1105_mdiobus_register().  I am
guessing that maybe there is an inlined function that calls
of_find_compatible_node().  This would likely be either
sja1105_mdiobus_base_tx_register() or sja1105_mdioux_base_t1_register().

> 
> [  101.710694] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:938
> [  101.719119] in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 1534, name: sh
> [  101.726763] INFO: lockdep is turned off.
> [  101.730674] irq event stamp: 0
> [  101.733716] hardirqs last  enabled at (0): [<0000000000000000>] 0x0
> [  101.739973] hardirqs last disabled at (0): [<ffffd3ebecb10120>] copy_process+0xa78/0x1a98
> [  101.748146] softirqs last  enabled at (0): [<ffffd3ebecb10120>] copy_process+0xa78/0x1a98
> [  101.756313] softirqs last disabled at (0): [<0000000000000000>] 0x0
> [  101.762569] CPU: 4 PID: 1534 Comm: sh Not tainted 5.14.0-rc5+ #272
> [  101.774558] Call trace:
> [  101.794734]  __might_sleep+0x50/0x88
> [  101.798297]  __mutex_lock+0x60/0x938
> [  101.801863]  mutex_lock_nested+0x38/0x50
> [  101.805775]  kernfs_remove+0x2c/0x50             <---- this takes mutex_lock(&kernfs_mutex);
> [  101.809341]  sysfs_remove_dir+0x54/0x70

The __kobject_del() occurs only if the refcount on the node
becomes zero.  This should never be true when of_find_compatible_node()
calls of_node_put() unless a "from" node is passed to of_find_compatible_node().

In both sja1105_mdiobus_base_tx_register() and sja1105_mdioux_base_t1_register()
a from node ("mdio") is passed to of_find_compatible_node() without first doing an
of_node_get(mdio).  If you add the of_node_get() calls the problem should be fixed.

-Frank


> [  101.813166]  __kobject_del+0x3c/0x80
> [  101.816733]  kobject_put+0xf8/0x108
> [  101.820211]  of_node_put+0x18/0x28
> [  101.823602]  of_find_compatible_node+0xa8/0xf8    <--- this takes raw_spin_lock_irqsave(&devtree_lock)
> [  101.828036]  sja1105_mdiobus_register+0x264/0x7a8
> 
> The pattern of calling of_node_put from under the atomic devtree_lock
> context is pretty widespread in drivers/of/base.c.
> 
> Just by inspecting the code, this seems to be an issue since commit:
> 
> commit 75b57ecf9d1d1e17d099ab13b8f48e6e038676be
> Author: Grant Likely <grant.likely@linaro.org>
> Date:   Thu Feb 20 18:02:11 2014 +0000
> 
>     of: Make device nodes kobjects so they show up in sysfs
> 
>     Device tree nodes are already treated as objects, and we already want to
>     expose them to userspace which is done using the /proc filesystem today.
>     Right now the kernel has to do a lot of work to keep the /proc view in
>     sync with the in-kernel representation. If device_nodes are switched to
>     be kobjects then the device tree code can be a whole lot simpler. It
>     also turns out that switching to using /sysfs from /proc results in
>     smaller code and data size, and the userspace ABI won't change if
>     /proc/device-tree symlinks to /sys/firmware/devicetree/base.
> 
>     v7: Add missing sysfs_bin_attr_init()
>     v6: Add __of_add_property() early init fixes from Pantelis
>     v5: Rename firmware/ofw to firmware/devicetree
>         Fix updating property values in sysfs
>     v4: Fixed build error on Powerpc
>         Fixed handling of dynamic nodes on powerpc
>     v3: Fixed handling of duplicate attribute and child node names
>     v2: switch to using sysfs bin_attributes which solve the problem of
>         reporting incorrect property size.
> 
>     Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
>     Tested-by: Sascha Hauer <s.hauer@pengutronix.de>
>     Cc: Rob Herring <rob.herring@calxeda.com>
>     Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
>     Cc: David S. Miller <davem@davemloft.net>
>     Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com>
>     Cc: Pantelis Antoniou <panto@antoniou-consulting.com>
> 
> because up until that point, of_node_put() was:
> 
> void of_node_put(struct device_node *node)
> {
> 	if (node)
> 		kref_put(&node->kref, of_node_release);
> }
> 
> and not:
> 
> void of_node_put(struct device_node *node)
> {
> 	if (node)
> 		kobject_put(&node->kobj);
> }
> 
> Either I'm holding it very, very wrong, or this is a very severe
> oversight that just happened somehow to go unnoticed for 7 years.
> 
> Please tell me it's me.
> 


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: of_node_put() usage is buggy all over drivers/of/base.c?!
  2021-08-16 14:33 ` Frank Rowand
@ 2021-08-16 14:46   ` Vladimir Oltean
  2021-08-16 15:14     ` Frank Rowand
  0 siblings, 1 reply; 8+ messages in thread
From: Vladimir Oltean @ 2021-08-16 14:46 UTC (permalink / raw)
  To: Frank Rowand; +Cc: Rob Herring, Sascha Hauer, devicetree, linux-kernel, netdev

Hi Frank,

On Mon, Aug 16, 2021 at 09:33:03AM -0500, Frank Rowand wrote:
> Hi Vladimir,
> 
> On 8/13/21 8:01 PM, Vladimir Oltean wrote:
> > Hi,
> > 
> > I was debugging an RCU stall which happened during the probing of a
> > driver. Activating lock debugging, I see:
> 
> I took a quick look at sja1105_mdiobus_register() in v5.14-rc1 and v5.14-rc6.
> 
> Looking at the following stack trace, I did not see any calls to
> of_find_compatible_node() in sja1105_mdiobus_register().  I am
> guessing that maybe there is an inlined function that calls
> of_find_compatible_node().  This would likely be either
> sja1105_mdiobus_base_tx_register() or sja1105_mdioux_base_t1_register().

Yes, it is sja1105_mdiobus_base_t1_register which is inlined.

> > 
> > [  101.710694] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:938
> > [  101.719119] in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 1534, name: sh
> > [  101.726763] INFO: lockdep is turned off.
> > [  101.730674] irq event stamp: 0
> > [  101.733716] hardirqs last  enabled at (0): [<0000000000000000>] 0x0
> > [  101.739973] hardirqs last disabled at (0): [<ffffd3ebecb10120>] copy_process+0xa78/0x1a98
> > [  101.748146] softirqs last  enabled at (0): [<ffffd3ebecb10120>] copy_process+0xa78/0x1a98
> > [  101.756313] softirqs last disabled at (0): [<0000000000000000>] 0x0
> > [  101.762569] CPU: 4 PID: 1534 Comm: sh Not tainted 5.14.0-rc5+ #272
> > [  101.774558] Call trace:
> > [  101.794734]  __might_sleep+0x50/0x88
> > [  101.798297]  __mutex_lock+0x60/0x938
> > [  101.801863]  mutex_lock_nested+0x38/0x50
> > [  101.805775]  kernfs_remove+0x2c/0x50             <---- this takes mutex_lock(&kernfs_mutex);
> > [  101.809341]  sysfs_remove_dir+0x54/0x70
> 
> The __kobject_del() occurs only if the refcount on the node
> becomes zero.  This should never be true when of_find_compatible_node()
> calls of_node_put() unless a "from" node is passed to of_find_compatible_node().

I figured that was the assumption, that the of_node_put would never
trigger a sysfs file / kobject deletion from there.

> In both sja1105_mdiobus_base_tx_register() and sja1105_mdioux_base_t1_register()
> a from node ("mdio") is passed to of_find_compatible_node() without first doing an
> of_node_get(mdio).  If you add the of_node_get() calls the problem should be fixed.

The answer seems simple enough, but stupid question, but why does
of_find_compatible_node call of_node_put on "from" in the first place?

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: of_node_put() usage is buggy all over drivers/of/base.c?!
  2021-08-16 14:46   ` Vladimir Oltean
@ 2021-08-16 15:14     ` Frank Rowand
  2021-08-16 19:20       ` Rob Herring
  0 siblings, 1 reply; 8+ messages in thread
From: Frank Rowand @ 2021-08-16 15:14 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Rob Herring, Sascha Hauer, devicetree, linux-kernel, netdev

On 8/16/21 9:46 AM, Vladimir Oltean wrote:
> Hi Frank,
> 
> On Mon, Aug 16, 2021 at 09:33:03AM -0500, Frank Rowand wrote:
>> Hi Vladimir,
>>
>> On 8/13/21 8:01 PM, Vladimir Oltean wrote:
>>> Hi,
>>>
>>> I was debugging an RCU stall which happened during the probing of a
>>> driver. Activating lock debugging, I see:
>>
>> I took a quick look at sja1105_mdiobus_register() in v5.14-rc1 and v5.14-rc6.
>>
>> Looking at the following stack trace, I did not see any calls to
>> of_find_compatible_node() in sja1105_mdiobus_register().  I am
>> guessing that maybe there is an inlined function that calls
>> of_find_compatible_node().  This would likely be either
>> sja1105_mdiobus_base_tx_register() or sja1105_mdioux_base_t1_register().
> 
> Yes, it is sja1105_mdiobus_base_t1_register which is inlined.
> 
>>>
>>> [  101.710694] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:938
>>> [  101.719119] in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 1534, name: sh
>>> [  101.726763] INFO: lockdep is turned off.
>>> [  101.730674] irq event stamp: 0
>>> [  101.733716] hardirqs last  enabled at (0): [<0000000000000000>] 0x0
>>> [  101.739973] hardirqs last disabled at (0): [<ffffd3ebecb10120>] copy_process+0xa78/0x1a98
>>> [  101.748146] softirqs last  enabled at (0): [<ffffd3ebecb10120>] copy_process+0xa78/0x1a98
>>> [  101.756313] softirqs last disabled at (0): [<0000000000000000>] 0x0
>>> [  101.762569] CPU: 4 PID: 1534 Comm: sh Not tainted 5.14.0-rc5+ #272
>>> [  101.774558] Call trace:
>>> [  101.794734]  __might_sleep+0x50/0x88
>>> [  101.798297]  __mutex_lock+0x60/0x938
>>> [  101.801863]  mutex_lock_nested+0x38/0x50
>>> [  101.805775]  kernfs_remove+0x2c/0x50             <---- this takes mutex_lock(&kernfs_mutex);
>>> [  101.809341]  sysfs_remove_dir+0x54/0x70
>>
>> The __kobject_del() occurs only if the refcount on the node
>> becomes zero.  This should never be true when of_find_compatible_node()
>> calls of_node_put() unless a "from" node is passed to of_find_compatible_node().
> 
> I figured that was the assumption, that the of_node_put would never
> trigger a sysfs file / kobject deletion from there.
> 
>> In both sja1105_mdiobus_base_tx_register() and sja1105_mdioux_base_t1_register()
>> a from node ("mdio") is passed to of_find_compatible_node() without first doing an
>> of_node_get(mdio).  If you add the of_node_get() calls the problem should be fixed.
> 
> The answer seems simple enough, but stupid question, but why does
> of_find_compatible_node call of_node_put on "from" in the first place?

Actually a good question.

I do not know why of_find_compatible_node() calls of_node_put() instead of making
the caller of of_find_compatible_node() responsible.  That pattern was created
long before I was involved in devicetree and I have not gone back to read the
review comments of when that code was created.

-Frank

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: of_node_put() usage is buggy all over drivers/of/base.c?!
  2021-08-16 15:14     ` Frank Rowand
@ 2021-08-16 19:20       ` Rob Herring
  2021-08-16 19:56         ` Frank Rowand
  0 siblings, 1 reply; 8+ messages in thread
From: Rob Herring @ 2021-08-16 19:20 UTC (permalink / raw)
  To: Frank Rowand, Vladimir Oltean
  Cc: Sascha Hauer, devicetree, linux-kernel, netdev

On Mon, Aug 16, 2021 at 10:14 AM Frank Rowand <frowand.list@gmail.com> wrote:
>
> On 8/16/21 9:46 AM, Vladimir Oltean wrote:
> > Hi Frank,
> >
> > On Mon, Aug 16, 2021 at 09:33:03AM -0500, Frank Rowand wrote:
> >> Hi Vladimir,
> >>
> >> On 8/13/21 8:01 PM, Vladimir Oltean wrote:
> >>> Hi,
> >>>
> >>> I was debugging an RCU stall which happened during the probing of a
> >>> driver. Activating lock debugging, I see:
> >>
> >> I took a quick look at sja1105_mdiobus_register() in v5.14-rc1 and v5.14-rc6.
> >>
> >> Looking at the following stack trace, I did not see any calls to
> >> of_find_compatible_node() in sja1105_mdiobus_register().  I am
> >> guessing that maybe there is an inlined function that calls
> >> of_find_compatible_node().  This would likely be either
> >> sja1105_mdiobus_base_tx_register() or sja1105_mdioux_base_t1_register().
> >
> > Yes, it is sja1105_mdiobus_base_t1_register which is inlined.
> >
> >>>
> >>> [  101.710694] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:938
> >>> [  101.719119] in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 1534, name: sh
> >>> [  101.726763] INFO: lockdep is turned off.
> >>> [  101.730674] irq event stamp: 0
> >>> [  101.733716] hardirqs last  enabled at (0): [<0000000000000000>] 0x0
> >>> [  101.739973] hardirqs last disabled at (0): [<ffffd3ebecb10120>] copy_process+0xa78/0x1a98
> >>> [  101.748146] softirqs last  enabled at (0): [<ffffd3ebecb10120>] copy_process+0xa78/0x1a98
> >>> [  101.756313] softirqs last disabled at (0): [<0000000000000000>] 0x0
> >>> [  101.762569] CPU: 4 PID: 1534 Comm: sh Not tainted 5.14.0-rc5+ #272
> >>> [  101.774558] Call trace:
> >>> [  101.794734]  __might_sleep+0x50/0x88
> >>> [  101.798297]  __mutex_lock+0x60/0x938
> >>> [  101.801863]  mutex_lock_nested+0x38/0x50
> >>> [  101.805775]  kernfs_remove+0x2c/0x50             <---- this takes mutex_lock(&kernfs_mutex);
> >>> [  101.809341]  sysfs_remove_dir+0x54/0x70
> >>
> >> The __kobject_del() occurs only if the refcount on the node
> >> becomes zero.  This should never be true when of_find_compatible_node()
> >> calls of_node_put() unless a "from" node is passed to of_find_compatible_node().
> >
> > I figured that was the assumption, that the of_node_put would never
> > trigger a sysfs file / kobject deletion from there.
> >
> >> In both sja1105_mdiobus_base_tx_register() and sja1105_mdioux_base_t1_register()
> >> a from node ("mdio") is passed to of_find_compatible_node() without first doing an
> >> of_node_get(mdio).  If you add the of_node_get() calls the problem should be fixed.
> >
> > The answer seems simple enough, but stupid question, but why does
> > of_find_compatible_node call of_node_put on "from" in the first place?
>
> Actually a good question.
>
> I do not know why of_find_compatible_node() calls of_node_put() instead of making
> the caller of of_find_compatible_node() responsible.  That pattern was created
> long before I was involved in devicetree and I have not gone back to read the
> review comments of when that code was created.

Because it is an iterator function and they all drop the ref from the
prior iteration.

I would say any open coded call where from is not NULL is an error.
It's not reliable because the DT search order is not defined and could
change. Someone want to write a coccinelle script to check that?

The above code should be using of_get_compatible_child() instead.

Rob

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: of_node_put() usage is buggy all over drivers/of/base.c?!
  2021-08-16 19:20       ` Rob Herring
@ 2021-08-16 19:56         ` Frank Rowand
  2021-08-16 20:00           ` Frank Rowand
  2021-08-16 20:25           ` Rob Herring
  0 siblings, 2 replies; 8+ messages in thread
From: Frank Rowand @ 2021-08-16 19:56 UTC (permalink / raw)
  To: Rob Herring, Vladimir Oltean
  Cc: Sascha Hauer, devicetree, linux-kernel, netdev

On 8/16/21 2:20 PM, Rob Herring wrote:
> On Mon, Aug 16, 2021 at 10:14 AM Frank Rowand <frowand.list@gmail.com> wrote:
>>
>> On 8/16/21 9:46 AM, Vladimir Oltean wrote:
>>> Hi Frank,
>>>
>>> On Mon, Aug 16, 2021 at 09:33:03AM -0500, Frank Rowand wrote:
>>>> Hi Vladimir,
>>>>
>>>> On 8/13/21 8:01 PM, Vladimir Oltean wrote:
>>>>> Hi,
>>>>>
>>>>> I was debugging an RCU stall which happened during the probing of a
>>>>> driver. Activating lock debugging, I see:
>>>>
>>>> I took a quick look at sja1105_mdiobus_register() in v5.14-rc1 and v5.14-rc6.
>>>>
>>>> Looking at the following stack trace, I did not see any calls to
>>>> of_find_compatible_node() in sja1105_mdiobus_register().  I am
>>>> guessing that maybe there is an inlined function that calls
>>>> of_find_compatible_node().  This would likely be either
>>>> sja1105_mdiobus_base_tx_register() or sja1105_mdioux_base_t1_register().
>>>
>>> Yes, it is sja1105_mdiobus_base_t1_register which is inlined.
>>>
>>>>>
>>>>> [  101.710694] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:938
>>>>> [  101.719119] in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 1534, name: sh
>>>>> [  101.726763] INFO: lockdep is turned off.
>>>>> [  101.730674] irq event stamp: 0
>>>>> [  101.733716] hardirqs last  enabled at (0): [<0000000000000000>] 0x0
>>>>> [  101.739973] hardirqs last disabled at (0): [<ffffd3ebecb10120>] copy_process+0xa78/0x1a98
>>>>> [  101.748146] softirqs last  enabled at (0): [<ffffd3ebecb10120>] copy_process+0xa78/0x1a98
>>>>> [  101.756313] softirqs last disabled at (0): [<0000000000000000>] 0x0
>>>>> [  101.762569] CPU: 4 PID: 1534 Comm: sh Not tainted 5.14.0-rc5+ #272
>>>>> [  101.774558] Call trace:
>>>>> [  101.794734]  __might_sleep+0x50/0x88
>>>>> [  101.798297]  __mutex_lock+0x60/0x938
>>>>> [  101.801863]  mutex_lock_nested+0x38/0x50
>>>>> [  101.805775]  kernfs_remove+0x2c/0x50             <---- this takes mutex_lock(&kernfs_mutex);
>>>>> [  101.809341]  sysfs_remove_dir+0x54/0x70
>>>>
>>>> The __kobject_del() occurs only if the refcount on the node
>>>> becomes zero.  This should never be true when of_find_compatible_node()
>>>> calls of_node_put() unless a "from" node is passed to of_find_compatible_node().
>>>
>>> I figured that was the assumption, that the of_node_put would never
>>> trigger a sysfs file / kobject deletion from there.
>>>
>>>> In both sja1105_mdiobus_base_tx_register() and sja1105_mdioux_base_t1_register()
>>>> a from node ("mdio") is passed to of_find_compatible_node() without first doing an
>>>> of_node_get(mdio).  If you add the of_node_get() calls the problem should be fixed.
>>>
>>> The answer seems simple enough, but stupid question, but why does
>>> of_find_compatible_node call of_node_put on "from" in the first place?
>>
>> Actually a good question.
>>
>> I do not know why of_find_compatible_node() calls of_node_put() instead of making
>> the caller of of_find_compatible_node() responsible.  That pattern was created
>> long before I was involved in devicetree and I have not gone back to read the
>> review comments of when that code was created.
> 

> Because it is an iterator function and they all drop the ref from the
> prior iteration.

That is what I was expecting before reading through the code.  But instead
I found of_find_compatible_node():

        raw_spin_lock_irqsave(&devtree_lock, flags);
        for_each_of_allnodes_from(from, np)
                if (__of_device_is_compatible(np, compatible, type, NULL) &&
                    of_node_get(np))
                        break;
        of_node_put(from);
        raw_spin_unlock_irqrestore(&devtree_lock, flags);


for_each_of_allnodes_fromir:

#define for_each_of_allnodes_from(from, dn) \
        for (dn = __of_find_all_nodes(from); dn; dn = __of_find_all_nodes(dn))


and __of_find_all_nodes() is:

struct device_node *__of_find_all_nodes(struct device_node *prev)
{
        struct device_node *np;
        if (!prev) {
                np = of_root;
        } else if (prev->child) {
                np = prev->child;
        } else {
                /* Walk back up looking for a sibling, or the end of the structure */
                np = prev;
                while (np->parent && !np->sibling)
                        np = np->parent;
                np = np->sibling; /* Might be null at the end of the tree */
        }
        return np;
}


So the iterator is not using of_node_get() and of_node_put() for each
node that is traversed.  The protection against a node disappearing
during the iteration is provided by holding devtree_lock.

> 
> I would say any open coded call where from is not NULL is an error.

I assume you mean any open coded call of of_find_compatible_node().  There are
at least a couple of instances of that.  I did only a partial grep while looking
at Vladimir's issue.

Doing the full grep now, I see 13 instances of architecture and driver code calling
of_find_compatible_node().

> It's not reliable because the DT search order is not defined and could
> change. Someone want to write a coccinelle script to check that?
> 

> The above code should be using of_get_compatible_child() instead.

Yes, of_get_compatible_child() should be used here.  Thanks for pointing
that out.

There are 13 instances of architecture and driver code calling
of_find_compatible_node().  If possible, it would be good to change all of
them to of_get_compatible_child().  If we could replace all driver
usage of of_find_compatible_node() with a from parameter of NULL to
a new wrapper without a from parameter, where the wrapper calls
of_find_compatible_node() with the from parameter set to NULL, then
we could prevent this problem from recurring.

(I did not look at all 13 instances yet, to see if this can be done.)

> 
> Rob
> 


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: of_node_put() usage is buggy all over drivers/of/base.c?!
  2021-08-16 19:56         ` Frank Rowand
@ 2021-08-16 20:00           ` Frank Rowand
  2021-08-16 20:25           ` Rob Herring
  1 sibling, 0 replies; 8+ messages in thread
From: Frank Rowand @ 2021-08-16 20:00 UTC (permalink / raw)
  To: Rob Herring, Vladimir Oltean
  Cc: Sascha Hauer, devicetree, linux-kernel, netdev


Hit send too soon, a couple of cleanups:

On 8/16/21 2:56 PM, Frank Rowand wrote:
> On 8/16/21 2:20 PM, Rob Herring wrote:
>> On Mon, Aug 16, 2021 at 10:14 AM Frank Rowand <frowand.list@gmail.com> wrote:
>>>
>>> On 8/16/21 9:46 AM, Vladimir Oltean wrote:
>>>> Hi Frank,
>>>>
>>>> On Mon, Aug 16, 2021 at 09:33:03AM -0500, Frank Rowand wrote:
>>>>> Hi Vladimir,
>>>>>
>>>>> On 8/13/21 8:01 PM, Vladimir Oltean wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I was debugging an RCU stall which happened during the probing of a
>>>>>> driver. Activating lock debugging, I see:
>>>>>
>>>>> I took a quick look at sja1105_mdiobus_register() in v5.14-rc1 and v5.14-rc6.
>>>>>
>>>>> Looking at the following stack trace, I did not see any calls to
>>>>> of_find_compatible_node() in sja1105_mdiobus_register().  I am
>>>>> guessing that maybe there is an inlined function that calls
>>>>> of_find_compatible_node().  This would likely be either
>>>>> sja1105_mdiobus_base_tx_register() or sja1105_mdioux_base_t1_register().
>>>>
>>>> Yes, it is sja1105_mdiobus_base_t1_register which is inlined.
>>>>
>>>>>>
>>>>>> [  101.710694] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:938
>>>>>> [  101.719119] in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 1534, name: sh
>>>>>> [  101.726763] INFO: lockdep is turned off.
>>>>>> [  101.730674] irq event stamp: 0
>>>>>> [  101.733716] hardirqs last  enabled at (0): [<0000000000000000>] 0x0
>>>>>> [  101.739973] hardirqs last disabled at (0): [<ffffd3ebecb10120>] copy_process+0xa78/0x1a98
>>>>>> [  101.748146] softirqs last  enabled at (0): [<ffffd3ebecb10120>] copy_process+0xa78/0x1a98
>>>>>> [  101.756313] softirqs last disabled at (0): [<0000000000000000>] 0x0
>>>>>> [  101.762569] CPU: 4 PID: 1534 Comm: sh Not tainted 5.14.0-rc5+ #272
>>>>>> [  101.774558] Call trace:
>>>>>> [  101.794734]  __might_sleep+0x50/0x88
>>>>>> [  101.798297]  __mutex_lock+0x60/0x938
>>>>>> [  101.801863]  mutex_lock_nested+0x38/0x50
>>>>>> [  101.805775]  kernfs_remove+0x2c/0x50             <---- this takes mutex_lock(&kernfs_mutex);
>>>>>> [  101.809341]  sysfs_remove_dir+0x54/0x70
>>>>>
>>>>> The __kobject_del() occurs only if the refcount on the node
>>>>> becomes zero.  This should never be true when of_find_compatible_node()
>>>>> calls of_node_put() unless a "from" node is passed to of_find_compatible_node().
>>>>
>>>> I figured that was the assumption, that the of_node_put would never
>>>> trigger a sysfs file / kobject deletion from there.
>>>>
>>>>> In both sja1105_mdiobus_base_tx_register() and sja1105_mdioux_base_t1_register()
>>>>> a from node ("mdio") is passed to of_find_compatible_node() without first doing an
>>>>> of_node_get(mdio).  If you add the of_node_get() calls the problem should be fixed.
>>>>
>>>> The answer seems simple enough, but stupid question, but why does
>>>> of_find_compatible_node call of_node_put on "from" in the first place?
>>>
>>> Actually a good question.
>>>
>>> I do not know why of_find_compatible_node() calls of_node_put() instead of making
>>> the caller of of_find_compatible_node() responsible.  That pattern was created
>>> long before I was involved in devicetree and I have not gone back to read the
>>> review comments of when that code was created.
>>
> 
>> Because it is an iterator function and they all drop the ref from the
>> prior iteration.
> 
> That is what I was expecting before reading through the code.  But instead
> I found of_find_compatible_node():
> 
>         raw_spin_lock_irqsave(&devtree_lock, flags);
>         for_each_of_allnodes_from(from, np)
>                 if (__of_device_is_compatible(np, compatible, type, NULL) &&
>                     of_node_get(np))
>                         break;
>         of_node_put(from);
>         raw_spin_unlock_irqrestore(&devtree_lock, flags);
> 
> 
> for_each_of_allnodes_fromir:

  for_each_of_allnodes_from():

> 
> #define for_each_of_allnodes_from(from, dn) \
>         for (dn = __of_find_all_nodes(from); dn; dn = __of_find_all_nodes(dn))
> 
> 
> and __of_find_all_nodes() is:
> 
> struct device_node *__of_find_all_nodes(struct device_node *prev)
> {
>         struct device_node *np;
>         if (!prev) {
>                 np = of_root;
>         } else if (prev->child) {
>                 np = prev->child;
>         } else {
>                 /* Walk back up looking for a sibling, or the end of the structure */
>                 np = prev;
>                 while (np->parent && !np->sibling)
>                         np = np->parent;
>                 np = np->sibling; /* Might be null at the end of the tree */
>         }
>         return np;
> }
> 
> 
> So the iterator is not using of_node_get() and of_node_put() for each
> node that is traversed.  The protection against a node disappearing
> during the iteration is provided by holding devtree_lock.
> 
>>
>> I would say any open coded call where from is not NULL is an error.
> 
> I assume you mean any open coded call of of_find_compatible_node().  There are
> at least a couple of instances of that.  I did only a partial grep while looking
> at Vladimir's issue.
> 
> Doing the full grep now, I see 13 instances of architecture and driver code calling
> of_find_compatible_node().

  of_find_compatible_node() with parameter "from" not NULL.

> 
>> It's not reliable because the DT search order is not defined and could
>> change. Someone want to write a coccinelle script to check that?
>>
> 
>> The above code should be using of_get_compatible_child() instead.
> 
> Yes, of_get_compatible_child() should be used here.  Thanks for pointing
> that out.
> 
> There are 13 instances of architecture and driver code calling
> of_find_compatible_node().  If possible, it would be good to change all of
> them to of_get_compatible_child().  If we could replace all driver
> usage of of_find_compatible_node() with a from parameter of NULL to
> a new wrapper without a from parameter, where the wrapper calls
> of_find_compatible_node() with the from parameter set to NULL, then
> we could prevent this problem from recurring.
> 
> (I did not look at all 13 instances yet, to see if this can be done.)
> 
>>
>> Rob
>>
> 


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: of_node_put() usage is buggy all over drivers/of/base.c?!
  2021-08-16 19:56         ` Frank Rowand
  2021-08-16 20:00           ` Frank Rowand
@ 2021-08-16 20:25           ` Rob Herring
  1 sibling, 0 replies; 8+ messages in thread
From: Rob Herring @ 2021-08-16 20:25 UTC (permalink / raw)
  To: Frank Rowand
  Cc: Vladimir Oltean, Sascha Hauer, devicetree, linux-kernel, netdev

On Mon, Aug 16, 2021 at 2:57 PM Frank Rowand <frowand.list@gmail.com> wrote:
>
> On 8/16/21 2:20 PM, Rob Herring wrote:
> > On Mon, Aug 16, 2021 at 10:14 AM Frank Rowand <frowand.list@gmail.com> wrote:
> >>
> >> On 8/16/21 9:46 AM, Vladimir Oltean wrote:
> >>> Hi Frank,
> >>>
> >>> On Mon, Aug 16, 2021 at 09:33:03AM -0500, Frank Rowand wrote:
> >>>> Hi Vladimir,
> >>>>
> >>>> On 8/13/21 8:01 PM, Vladimir Oltean wrote:
> >>>>> Hi,
> >>>>>
> >>>>> I was debugging an RCU stall which happened during the probing of a
> >>>>> driver. Activating lock debugging, I see:
> >>>>
> >>>> I took a quick look at sja1105_mdiobus_register() in v5.14-rc1 and v5.14-rc6.
> >>>>
> >>>> Looking at the following stack trace, I did not see any calls to
> >>>> of_find_compatible_node() in sja1105_mdiobus_register().  I am
> >>>> guessing that maybe there is an inlined function that calls
> >>>> of_find_compatible_node().  This would likely be either
> >>>> sja1105_mdiobus_base_tx_register() or sja1105_mdioux_base_t1_register().
> >>>
> >>> Yes, it is sja1105_mdiobus_base_t1_register which is inlined.
> >>>
> >>>>>
> >>>>> [  101.710694] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:938
> >>>>> [  101.719119] in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 1534, name: sh
> >>>>> [  101.726763] INFO: lockdep is turned off.
> >>>>> [  101.730674] irq event stamp: 0
> >>>>> [  101.733716] hardirqs last  enabled at (0): [<0000000000000000>] 0x0
> >>>>> [  101.739973] hardirqs last disabled at (0): [<ffffd3ebecb10120>] copy_process+0xa78/0x1a98
> >>>>> [  101.748146] softirqs last  enabled at (0): [<ffffd3ebecb10120>] copy_process+0xa78/0x1a98
> >>>>> [  101.756313] softirqs last disabled at (0): [<0000000000000000>] 0x0
> >>>>> [  101.762569] CPU: 4 PID: 1534 Comm: sh Not tainted 5.14.0-rc5+ #272
> >>>>> [  101.774558] Call trace:
> >>>>> [  101.794734]  __might_sleep+0x50/0x88
> >>>>> [  101.798297]  __mutex_lock+0x60/0x938
> >>>>> [  101.801863]  mutex_lock_nested+0x38/0x50
> >>>>> [  101.805775]  kernfs_remove+0x2c/0x50             <---- this takes mutex_lock(&kernfs_mutex);
> >>>>> [  101.809341]  sysfs_remove_dir+0x54/0x70
> >>>>
> >>>> The __kobject_del() occurs only if the refcount on the node
> >>>> becomes zero.  This should never be true when of_find_compatible_node()
> >>>> calls of_node_put() unless a "from" node is passed to of_find_compatible_node().
> >>>
> >>> I figured that was the assumption, that the of_node_put would never
> >>> trigger a sysfs file / kobject deletion from there.
> >>>
> >>>> In both sja1105_mdiobus_base_tx_register() and sja1105_mdioux_base_t1_register()
> >>>> a from node ("mdio") is passed to of_find_compatible_node() without first doing an
> >>>> of_node_get(mdio).  If you add the of_node_get() calls the problem should be fixed.
> >>>
> >>> The answer seems simple enough, but stupid question, but why does
> >>> of_find_compatible_node call of_node_put on "from" in the first place?
> >>
> >> Actually a good question.
> >>
> >> I do not know why of_find_compatible_node() calls of_node_put() instead of making
> >> the caller of of_find_compatible_node() responsible.  That pattern was created
> >> long before I was involved in devicetree and I have not gone back to read the
> >> review comments of when that code was created.
> >
>
> > Because it is an iterator function and they all drop the ref from the
> > prior iteration.
>
> That is what I was expecting before reading through the code.  But instead
> I found of_find_compatible_node():

No, I meant of_find_compatible_node() is the iterator for
for_each_compatible_node().

>
>         raw_spin_lock_irqsave(&devtree_lock, flags);
>         for_each_of_allnodes_from(from, np)
>                 if (__of_device_is_compatible(np, compatible, type, NULL) &&
>                     of_node_get(np))
>                         break;
>         of_node_put(from);
>         raw_spin_unlock_irqrestore(&devtree_lock, flags);
>
>
> for_each_of_allnodes_fromir:
>
> #define for_each_of_allnodes_from(from, dn) \
>         for (dn = __of_find_all_nodes(from); dn; dn = __of_find_all_nodes(dn))

This is only used internally, so it can rely on the caller holding the
lock. This should be moved into of_private.h.

> and __of_find_all_nodes() is:
>
> struct device_node *__of_find_all_nodes(struct device_node *prev)
> {
>         struct device_node *np;
>         if (!prev) {
>                 np = of_root;
>         } else if (prev->child) {
>                 np = prev->child;
>         } else {
>                 /* Walk back up looking for a sibling, or the end of the structure */
>                 np = prev;
>                 while (np->parent && !np->sibling)
>                         np = np->parent;
>                 np = np->sibling; /* Might be null at the end of the tree */
>         }
>         return np;
> }
>
>
> So the iterator is not using of_node_get() and of_node_put() for each
> node that is traversed.  The protection against a node disappearing
> during the iteration is provided by holding devtree_lock.

The lock is for traversing the nodes (i.e. a list lock), not keeping
nodes around.

>
> >
> > I would say any open coded call where from is not NULL is an error.
>
> I assume you mean any open coded call of of_find_compatible_node().  There are
> at least a couple of instances of that.  I did only a partial grep while looking
> at Vladimir's issue.
>
> Doing the full grep now, I see 13 instances of architecture and driver code calling
> of_find_compatible_node().
>
> > It's not reliable because the DT search order is not defined and could
> > change. Someone want to write a coccinelle script to check that?
> >
>
> > The above code should be using of_get_compatible_child() instead.
>
> Yes, of_get_compatible_child() should be used here.  Thanks for pointing
> that out.
>
> There are 13 instances of architecture and driver code calling
> of_find_compatible_node().  If possible, it would be good to change all of
> them to of_get_compatible_child().  If we could replace all driver
> usage of of_find_compatible_node() with a from parameter of NULL to
> a new wrapper without a from parameter, where the wrapper calls
> of_find_compatible_node() with the from parameter set to NULL, then
> we could prevent this problem from recurring.

Patches welcome.

I don't know if all 13 are only looking for child nodes. Could be open
coding for_each_compatible_node or looking for grandchild nodes in
addition (for which we don't have helpers).

Rob

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2021-08-16 20:26 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-14  1:01 of_node_put() usage is buggy all over drivers/of/base.c?! Vladimir Oltean
2021-08-16 14:33 ` Frank Rowand
2021-08-16 14:46   ` Vladimir Oltean
2021-08-16 15:14     ` Frank Rowand
2021-08-16 19:20       ` Rob Herring
2021-08-16 19:56         ` Frank Rowand
2021-08-16 20:00           ` Frank Rowand
2021-08-16 20:25           ` Rob Herring

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).