linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v1 1/2] driver core: Fix lockdep warning on wfs_lock
@ 2020-11-04 20:54 Saravana Kannan
  2020-11-04 20:54 ` [PATCH v1 2/2] spi: Populate fwnode in of_register_spi_device() Saravana Kannan
  0 siblings, 1 reply; 6+ messages in thread
From: Saravana Kannan @ 2020-11-04 20:54 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Rafael J. Wysocki, Matthias Brugger,
	Nathan Chancellor, Nick Desaulniers
  Cc: Saravana Kannan, Cheng-Jui.Wang, kernel-team, linux-kernel,
	linux-arm-kernel, linux-mediatek, clang-built-linux

There's a potential deadlock with the following cycle:
wfs_lock --> device_links_lock --> kn->count

Fix this by simply dropping the lock around a list_empty() check that's
just exported to a sysfs file. The sysfs file output is an instantaneous
check anyway and the lock doesn't really add any protection.

Lockdep log:

[   48.808132]
[   48.808132] the existing dependency chain (in reverse order) is:
[   48.809069]
[   48.809069] -> #2 (kn->count){++++}:
[   48.809707]        __kernfs_remove.llvm.7860393000964815146+0x2d4/0x460
[   48.810537]        kernfs_remove_by_name_ns+0x54/0x9c
[   48.811171]        sysfs_remove_file_ns+0x18/0x24
[   48.811762]        device_del+0x2b8/0x5a8
[   48.812269]        __device_link_del+0x98/0xb8
[   48.812829]        device_links_driver_bound+0x210/0x2d8
[   48.813496]        driver_bound+0x44/0xf8
[   48.814000]        really_probe+0x340/0x6e0
[   48.814526]        driver_probe_device+0xb8/0x100
[   48.815117]        device_driver_attach+0x78/0xb8
[   48.815708]        __driver_attach+0xe0/0x194
[   48.816255]        bus_for_each_dev+0xa8/0x11c
[   48.816816]        driver_attach+0x24/0x30
[   48.817331]        bus_add_driver+0x100/0x1e0
[   48.817880]        driver_register+0x78/0x114
[   48.818427]        __platform_driver_register+0x44/0x50
[   48.819089]        0xffffffdbb3227038
[   48.819551]        do_one_initcall+0xd8/0x1e0
[   48.820099]        do_init_module+0xd8/0x298
[   48.820636]        load_module+0x3afc/0x44c8
[   48.821173]        __arm64_sys_finit_module+0xbc/0xf0
[   48.821807]        el0_svc_common+0xbc/0x1d0
[   48.822344]        el0_svc_handler+0x74/0x98
[   48.822882]        el0_svc+0x8/0xc
[   48.823310]
[   48.823310] -> #1 (device_links_lock){+.+.}:
[   48.824036]        __mutex_lock_common+0xe0/0xe44
[   48.824626]        mutex_lock_nested+0x28/0x34
[   48.825185]        device_link_add+0xd4/0x4ec
[   48.825734]        of_link_to_suppliers+0x158/0x204
[   48.826347]        of_fwnode_add_links+0x50/0x64
[   48.826928]        device_link_add_missing_supplier_links+0x90/0x11c
[   48.827725]        fw_devlink_resume+0x58/0x130
[   48.828296]        of_platform_default_populate_init+0xb4/0xd0
[   48.829030]        do_one_initcall+0xd8/0x1e0
[   48.829578]        do_initcall_level+0xb8/0xcc
[   48.830137]        do_basic_setup+0x60/0x7c
[   48.830662]        kernel_init_freeable+0x128/0x1ac
[   48.831275]        kernel_init+0x18/0x29c
[   48.831781]        ret_from_fork+0x10/0x18
[   48.832297]
[   48.832297] -> #0 (wfs_lock){+.+.}:
[   48.832922]        __lock_acquire+0xe04/0x2e20
[   48.833480]        lock_acquire+0xbc/0xec
[   48.833984]        __mutex_lock_common+0xe0/0xe44
[   48.834577]        mutex_lock_nested+0x28/0x34
[   48.835136]        waiting_for_supplier_show+0x3c/0x98
[   48.835781]        dev_attr_show+0x48/0xb4
[   48.836295]        sysfs_kf_seq_show+0xe8/0x184
[   48.836864]        kernfs_seq_show+0x48/0x8c
[   48.837401]        seq_read+0x1c8/0x600
[   48.837884]        kernfs_fop_read+0x68/0x204
[   48.838431]        __vfs_read+0x60/0x214
[   48.838925]        vfs_read+0xbc/0x15c
[   48.839397]        ksys_read+0x78/0xe4
[   48.839869]        __arm64_sys_read+0x1c/0x28
[   48.840416]        el0_svc_common+0xbc/0x1d0
[   48.840953]        el0_svc_handler+0x74/0x98
[   48.841490]        el0_svc+0x8/0xc
[   48.841917]
[   48.841917] other info that might help us debug this:
[   48.841917]
[   48.842920] Chain exists of:
[   48.842920]   wfs_lock --> device_links_lock --> kn->count
[   48.842920]
[   48.844152]  Possible unsafe locking scenario:
[   48.844152]
[   48.844895]        CPU0                    CPU1
[   48.845463]        ----                    ----
[   48.846032]   lock(kn->count);
[   48.846417]                                lock(device_links_lock);
[   48.847203]                                lock(kn->count);
[   48.847902]   lock(wfs_lock);
[   48.848276]
[   48.848276]  *** DEADLOCK ***

Reported-by: Cheng-Jui.Wang@mediatek.com
Signed-off-by: Saravana Kannan <saravanak@google.com>
---
 drivers/base/core.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/base/core.c b/drivers/base/core.c
index 78114ddac755..df35b3206286 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -1064,10 +1064,8 @@ static ssize_t waiting_for_supplier_show(struct device *dev,
 	bool val;
 
 	device_lock(dev);
-	mutex_lock(&wfs_lock);
 	val = !list_empty(&dev->links.needs_suppliers)
 	      && dev->links.need_for_probe;
-	mutex_unlock(&wfs_lock);
 	device_unlock(dev);
 	return sysfs_emit(buf, "%u\n", val);
 }
-- 
2.29.1.341.ge80a0c044ae-goog


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH v1 2/2] spi: Populate fwnode in of_register_spi_device()
  2020-11-04 20:54 [PATCH v1 1/2] driver core: Fix lockdep warning on wfs_lock Saravana Kannan
@ 2020-11-04 20:54 ` Saravana Kannan
  2020-11-05 17:12   ` Mark Brown
  0 siblings, 1 reply; 6+ messages in thread
From: Saravana Kannan @ 2020-11-04 20:54 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Rafael J. Wysocki, Matthias Brugger,
	Nathan Chancellor, Nick Desaulniers, Mark Brown
  Cc: Saravana Kannan, Cheng-Jui.Wang, kernel-team, linux-kernel,
	linux-arm-kernel, linux-mediatek, clang-built-linux,
	Daniel Mentz, linux-spi

From: Daniel Mentz <danielmentz@google.com>

This allows the fw_devlink feature to work for spi devices
too.  This avoids unnecessary probe deferrals related to spi devices and
improves suspend/resume ordering for spi devices when fw_devlink=on.

Signed-off-by: Daniel Mentz <danielmentz@google.com>
[saravanak: Minor commit text cleanup]
Signed-off-by: Saravana Kannan <saravanak@google.com>
---
 drivers/spi/spi.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c
index 0cab239d8e7f..d533aa939cca 100644
--- a/drivers/spi/spi.c
+++ b/drivers/spi/spi.c
@@ -2037,6 +2037,7 @@ of_register_spi_device(struct spi_controller *ctlr, struct device_node *nc)
 	/* Store a pointer to the node in the device structure */
 	of_node_get(nc);
 	spi->dev.of_node = nc;
+	spi->dev.fwnode = of_fwnode_handle(nc);
 
 	/* Register the new device */
 	rc = spi_add_device(spi);
-- 
2.29.1.341.ge80a0c044ae-goog


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH v1 2/2] spi: Populate fwnode in of_register_spi_device()
  2020-11-04 20:54 ` [PATCH v1 2/2] spi: Populate fwnode in of_register_spi_device() Saravana Kannan
@ 2020-11-05 17:12   ` Mark Brown
  2020-11-05 19:26     ` Saravana Kannan
  0 siblings, 1 reply; 6+ messages in thread
From: Mark Brown @ 2020-11-05 17:12 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Greg Kroah-Hartman, Rafael J. Wysocki, Matthias Brugger,
	Nathan Chancellor, Nick Desaulniers, Cheng-Jui.Wang, kernel-team,
	linux-kernel, linux-arm-kernel, linux-mediatek,
	clang-built-linux, Daniel Mentz, linux-spi

[-- Attachment #1: Type: text/plain, Size: 723 bytes --]

On Wed, Nov 04, 2020 at 12:54:31PM -0800, Saravana Kannan wrote:
> From: Daniel Mentz <danielmentz@google.com>
> 
> This allows the fw_devlink feature to work for spi devices
> too.  This avoids unnecessary probe deferrals related to spi devices and
> improves suspend/resume ordering for spi devices when fw_devlink=on.

>  	of_node_get(nc);
>  	spi->dev.of_node = nc;
> +	spi->dev.fwnode = of_fwnode_handle(nc);

Why is this a manual step in an individual subsystem rather than
something done in the driver core - when would we not want to have the
fwnode correspond to the of_node, and wouldn't that just be a case of
checking to see if there is a fwnode already set and only initializing
if not anyway?

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v1 2/2] spi: Populate fwnode in of_register_spi_device()
  2020-11-05 17:12   ` Mark Brown
@ 2020-11-05 19:26     ` Saravana Kannan
  2020-11-06 15:10       ` Mark Brown
  0 siblings, 1 reply; 6+ messages in thread
From: Saravana Kannan @ 2020-11-05 19:26 UTC (permalink / raw)
  To: Mark Brown
  Cc: Greg Kroah-Hartman, Rafael J. Wysocki, Matthias Brugger,
	Nathan Chancellor, Nick Desaulniers, Cheng-Jui.Wang,
	Android Kernel Team, LKML, linux-arm-kernel,
	moderated list:ARM/Mediatek SoC support, clang-built-linux,
	Daniel Mentz, linux-spi

On Thu, Nov 5, 2020 at 9:12 AM Mark Brown <broonie@kernel.org> wrote:
>
> On Wed, Nov 04, 2020 at 12:54:31PM -0800, Saravana Kannan wrote:
> > From: Daniel Mentz <danielmentz@google.com>
> >
> > This allows the fw_devlink feature to work for spi devices
> > too.  This avoids unnecessary probe deferrals related to spi devices and
> > improves suspend/resume ordering for spi devices when fw_devlink=on.
>
> >       of_node_get(nc);
> >       spi->dev.of_node = nc;
> > +     spi->dev.fwnode = of_fwnode_handle(nc);
>
> Why is this a manual step in an individual subsystem rather than
> something done in the driver core

It can't be done in driver core because "fwnode" is the abstraction
driver core uses. It shouldn't care or know if the firmware is DT,
ACPI or something else -- that's the whole point of fwnode.

> - when would we not want to have the
> fwnode correspond to the of_node,

Never.

> and wouldn't that just be a case of
> checking to see if there is a fwnode already set and only initializing
> if not anyway?

Honestly, we should be deleting device.of_node and always use
device.fwnode. But that's a long way away (lots of clean up). The
"common" place to do this is where a struct device is created from a
firmware (device_node, acpi_device, etc). I don't see a "common place"
for when a device is created out of a device_node, so I think this
patch is a reasonable middle ground.

-Saravana

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v1 2/2] spi: Populate fwnode in of_register_spi_device()
  2020-11-05 19:26     ` Saravana Kannan
@ 2020-11-06 15:10       ` Mark Brown
  2020-11-06 19:12         ` Saravana Kannan
  0 siblings, 1 reply; 6+ messages in thread
From: Mark Brown @ 2020-11-06 15:10 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Greg Kroah-Hartman, Rafael J. Wysocki, Matthias Brugger,
	Nathan Chancellor, Nick Desaulniers, Cheng-Jui.Wang,
	Android Kernel Team, LKML, linux-arm-kernel,
	moderated list:ARM/Mediatek SoC support, clang-built-linux,
	Daniel Mentz, linux-spi

[-- Attachment #1: Type: text/plain, Size: 2506 bytes --]

On Thu, Nov 05, 2020 at 11:26:44AM -0800, Saravana Kannan wrote:
> On Thu, Nov 5, 2020 at 9:12 AM Mark Brown <broonie@kernel.org> wrote:

> > >       of_node_get(nc);
> > >       spi->dev.of_node = nc;
> > > +     spi->dev.fwnode = of_fwnode_handle(nc);

> > Why is this a manual step in an individual subsystem rather than
> > something done in the driver core

> It can't be done in driver core because "fwnode" is the abstraction
> driver core uses. It shouldn't care or know if the firmware is DT,
> ACPI or something else -- that's the whole point of fwnode.

Clearly it *can* be done in the driver core, the question is do we want
to.  The abstraction thing feels weaker at init than use after init,
"init from X" is a common enough pattern.  If it's done by the driver
core there would be no possibility of anything that creates devices
getting things wrong here, and the driver core already has a bunch of
integration with both DT and ACPI so it seems like a weird boundary to
have.

> > and wouldn't that just be a case of
> > checking to see if there is a fwnode already set and only initializing
> > if not anyway?

> Honestly, we should be deleting device.of_node and always use
> device.fwnode. But that's a long way away (lots of clean up). The
> "common" place to do this is where a struct device is created from a
> firmware (device_node, acpi_device, etc). I don't see a "common place"
> for when a device is created out of a device_node, so I think this
> patch is a reasonable middle ground.

That is obviously a much bigger job that's going to require going
through subsystems (and their drivers) properly to eliminate references
to of_node, I'm not clear how doing this little bit per subsystem rather
than in the core helps or hinders going through and doing that.  I don't
think you'll ever have a single place where a device is constructed, and
I'm not sure that that is even desirable, since there are per subsystem
things that need doing.

I'd be totally happy with eliminating all references to of_node from the
subsystem but for this it seems more sensible to do it in the driver
core and cover everything rather than running around everything that
creates a device from DT individually and having stuff fall through the
cracks - it's been a year since the equivalent change was made in I2C
for example, we've had new buses merged in that time never mind SPI not
being covered.

BTW I'm also missing patch 1 and the cover letter for this series, not
sure what's going on there?

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v1 2/2] spi: Populate fwnode in of_register_spi_device()
  2020-11-06 15:10       ` Mark Brown
@ 2020-11-06 19:12         ` Saravana Kannan
  0 siblings, 0 replies; 6+ messages in thread
From: Saravana Kannan @ 2020-11-06 19:12 UTC (permalink / raw)
  To: Mark Brown
  Cc: Greg Kroah-Hartman, Rafael J. Wysocki, Matthias Brugger,
	Nathan Chancellor, Nick Desaulniers, Cheng-Jui.Wang,
	Android Kernel Team, LKML, linux-arm-kernel,
	moderated list:ARM/Mediatek SoC support, clang-built-linux,
	Daniel Mentz, linux-spi

On Fri, Nov 6, 2020 at 7:10 AM Mark Brown <broonie@kernel.org> wrote:
>
> On Thu, Nov 05, 2020 at 11:26:44AM -0800, Saravana Kannan wrote:
> > On Thu, Nov 5, 2020 at 9:12 AM Mark Brown <broonie@kernel.org> wrote:
>
> > > >       of_node_get(nc);
> > > >       spi->dev.of_node = nc;
> > > > +     spi->dev.fwnode = of_fwnode_handle(nc);
>
> > > Why is this a manual step in an individual subsystem rather than
> > > something done in the driver core
>
> > It can't be done in driver core because "fwnode" is the abstraction
> > driver core uses. It shouldn't care or know if the firmware is DT,
> > ACPI or something else -- that's the whole point of fwnode.
>
> Clearly it *can* be done in the driver core, the question is do we want
> to.  The abstraction thing feels weaker at init than use after init,
> "init from X" is a common enough pattern.  If it's done by the driver
> core there would be no possibility of anything that creates devices
> getting things wrong here, and the driver core already has a bunch of
> integration with both DT and ACPI so it seems like a weird boundary to
> have.
>
> > > and wouldn't that just be a case of
> > > checking to see if there is a fwnode already set and only initializing
> > > if not anyway?
>
> > Honestly, we should be deleting device.of_node and always use
> > device.fwnode. But that's a long way away (lots of clean up). The
> > "common" place to do this is where a struct device is created from a
> > firmware (device_node, acpi_device, etc). I don't see a "common place"
> > for when a device is created out of a device_node, so I think this
> > patch is a reasonable middle ground.
>
> That is obviously a much bigger job that's going to require going
> through subsystems (and their drivers) properly to eliminate references
> to of_node, I'm not clear how doing this little bit per subsystem rather
> than in the core helps or hinders going through and doing that.  I don't
> think you'll ever have a single place where a device is constructed, and
> I'm not sure that that is even desirable, since there are per subsystem
> things that need doing.
>
> I'd be totally happy with eliminating all references to of_node from the
> subsystem but for this it seems more sensible to do it in the driver
> core and cover everything rather than running around everything that
> creates a device from DT individually and having stuff fall through the
> cracks - it's been a year since the equivalent change was made in I2C
> for example, we've had new buses merged in that time never mind SPI not
> being covered.

Since you kicked off another thread while we were still discussing it
here, I'll just move to that thread. I don't want to discuss the same
thing in two different places.

> BTW I'm also missing patch 1 and the cover letter for this series, not
> sure what's going on there?

Sorry, scripting error. There is no series.

-Saravana

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2020-11-06 19:13 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-11-04 20:54 [PATCH v1 1/2] driver core: Fix lockdep warning on wfs_lock Saravana Kannan
2020-11-04 20:54 ` [PATCH v1 2/2] spi: Populate fwnode in of_register_spi_device() Saravana Kannan
2020-11-05 17:12   ` Mark Brown
2020-11-05 19:26     ` Saravana Kannan
2020-11-06 15:10       ` Mark Brown
2020-11-06 19:12         ` Saravana Kannan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).