linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [BUG] Deferred probing in driver model is racy, resulting in lost probes
@ 2012-08-18 14:58 Russell King - ARM Linux
  2012-09-15 16:03 ` Greg Kroah-Hartman
  2012-09-16  6:41 ` Ming Lei
  0 siblings, 2 replies; 16+ messages in thread
From: Russell King - ARM Linux @ 2012-08-18 14:58 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel
  Cc: Greg Kroah-Hartman, Grant Likely, Arnd Bergmann, Mark Brown

Okay, so EPROBE_DEFER seems to work when I build everything into the
kernel, but when I build a pile of ASoC drivers as modules, it fails
every time I've tried booting the platform so far.

This is a v3.5 based kernel, with preempt enabled.

Okay, what I have is a bunch of devices already pre-registered in the
system:

	kirkwood-spdif-audio.1 (requiring module snd_soc_kirkwood_spdif)
	kirkwood-i2s.1 (requiring module snd_soc_kirkwood_i2s)
	kirkwood-pcm-audio.1 (requiring module snd_soc_kirkwood)
	spdif-dit (requiring module snd_soc_spdif)

and the modules were loaded in this order:

Module                  Size  Used by
snd_soc_spdif           1020  0		(spdif-dit driver)
dove                   24577  1		(dove-drm driver)
snd_soc_kirkwood_i2s     5209  0	(kirkwood-i2s driver)
snd_soc_kirkwood        3629  0		(kirkwood-pcm-audio driver)
snd_soc_kirkwood_spdif     1449  0	(kirkwood-spdif-audio driver)

snd_soc_kirkwood_spdif will return -EPROBE_DEFER if it can't find the
devices registered into ASoC by the other three modules (referred to as
the CPU DAI, and Codec DAI).

What I see in the debugging is this (I've added some to increase the
debuggability):

bus: 'platform': add driver kirkwood-spdif-audio
bus: 'platform': driver_probe_device: matched device kirkwood-spdif-audio.1 with driver kirkwood-spdif-audio
bus: 'platform': really_probe: probing driver kirkwood-spdif-audio with device kirkwood-spdif-audio.1
kirkwood-spdif-audio kirkwood-spdif-audio.1: CPU DAI kirkwood-i2s.1 not registered
bus: 'platform': add driver kirkwood-pcm-audio
bus: 'platform': add driver kirkwood-i2s
kirkwood-spdif-audio kirkwood-spdif-audio.1: failed to register card
bus: 'platform': add driver dove-drm
platform kirkwood-spdif-audio.1: Driver kirkwood-spdif-audio requests probe deferral
platform kirkwood-spdif-audio.1: Added to deferred list
bus: 'platform': driver_probe_device: matched device kirkwood-pcm-audio.1 with driver kirkwood-pcm-audio
bus: 'platform': really_probe: probing driver kirkwood-pcm-audio with device kirkwood-pcm-audio.1
driver: 'kirkwood-pcm-audio.1': driver_bound: bound to device 'kirkwood-pcm-audio'
bus: 'platform': really_probe: bound device kirkwood-pcm-audio.1 to driver kirkwood-pcm-audio
platform kirkwood-spdif-audio.1: Retrying from deferred list
bus: device_attach 'kirkwood-spdif-audio.1' unbound
bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'dove-temp'
bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'galcore'
bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'alarmtimer'
bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'mv_xor_shared'
bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'mv_xor'
bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'serial8250'
bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'sata_mv'
bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'orion_spi'
bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'mv643xx_eth'
bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'mv643xx_eth_port'
bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'ap510-vmeta'
bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'orion-ehci'
bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'rtc-mv'
bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'mv64xxx_i2c'
bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'gpio-rc-recv'
bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'sdhci-pxav2'
bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'sdhci-dove'
bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'leds-gpio'
bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'snd-soc-dummy'
bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'soc-audio'
bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'kirkwood-pcm-audio'

What you notice here is that the kirkwood-spdif-audio driver has gone
missing - it hasn't been unloaded, and it still appears in sysfs, and
it can still be cleanly unloaded and reloaded.

bus: 'platform': driver_probe_device: matched device kirkwood-i2s.1 with driver kirkwood-i2s
bus: 'platform': really_probe: probing driver kirkwood-i2s with device kirkwood-i2s.1
kirkwood-i2s kirkwood-i2s.1: found external clock
driver: 'kirkwood-i2s.1': driver_bound: bound to device 'kirkwood-i2s'
bus: 'platform': really_probe: bound device kirkwood-i2s.1 to driver kirkwood-i2s
bus: 'platform': driver_probe_device: matched device dove-drm with driver dove-drm
bus: 'platform': really_probe: probing driver dove-drm with device dove-drm
bus: 'platform': add driver spdif-dit
[drm] Supports vblank timestamp caching Rev 1 (10.10.2010).
[drm] No driver support for vblank timestamp query.
fb0: dove-drmfb frame buffer device
drm: registered panic notifier
[drm] Initialized dove-drm 1.0.0 20120730 on minor 0
driver: 'dove-drm': driver_bound: bound to device 'dove-drm'
bus: 'platform': really_probe: bound device dove-drm to driver dove-drm
bus: 'platform': driver_probe_device: matched device spdif-dit with driver spdif-dit
bus: 'platform': really_probe: probing driver spdif-dit with device spdif-dit
driver: 'spdif-dit': driver_bound: bound to device 'spdif-dit'
bus: 'platform': really_probe: bound device spdif-dit to driver spdif-dit

I suspect what is going on is that we have a race condition between the
device being added to the deferred list vs other drivers successfully
probing and running the deferred list, and the driver with the deferred
device being added to the driver list.

This allows the deferred list to be run *before* the driver is placed
onto the bus driver list, which means when we try to re-probe deferred
devices, we find no matching driver to probe the device with.  This
then results in the device being dropped off the deferred probe list
(as the only way it stays on that list is if a driver is probed _and_
it returns -EPROBE_DEFER.)

I do not know all the details of the locking in this area, so I'm not
going to try to come up with a patch... but as this stands, it's
unworkable.  I have not yet had the system come up once with these
drivers as modules and had it work without having to manually remove
and re-add the Kirkwood SPDIF driver.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [BUG] Deferred probing in driver model is racy, resulting in lost probes
  2012-08-18 14:58 [BUG] Deferred probing in driver model is racy, resulting in lost probes Russell King - ARM Linux
@ 2012-09-15 16:03 ` Greg Kroah-Hartman
  2012-09-15 18:32   ` Russell King - ARM Linux
  2012-09-16  6:41 ` Ming Lei
  1 sibling, 1 reply; 16+ messages in thread
From: Greg Kroah-Hartman @ 2012-09-15 16:03 UTC (permalink / raw)
  To: Russell King - ARM Linux, Grant Likely
  Cc: linux-arm-kernel, linux-kernel, Arnd Bergmann, Mark Brown

On Sat, Aug 18, 2012 at 03:58:56PM +0100, Russell King - ARM Linux wrote:
> Okay, so EPROBE_DEFER seems to work when I build everything into the
> kernel, but when I build a pile of ASoC drivers as modules, it fails
> every time I've tried booting the platform so far.

Ugh, sorry for the delay in this, but this is really Grant's area, as he
added this feature.

Actually, for some reason, I thought this was only supposed to work for
built-in drivers, not for modules at all.  The fact that it works for
them some times seems like an additional feature :)

> This is a v3.5 based kernel, with preempt enabled.
> 
> Okay, what I have is a bunch of devices already pre-registered in the
> system:
> 
> 	kirkwood-spdif-audio.1 (requiring module snd_soc_kirkwood_spdif)
> 	kirkwood-i2s.1 (requiring module snd_soc_kirkwood_i2s)
> 	kirkwood-pcm-audio.1 (requiring module snd_soc_kirkwood)
> 	spdif-dit (requiring module snd_soc_spdif)
> 
> and the modules were loaded in this order:
> 
> Module                  Size  Used by
> snd_soc_spdif           1020  0		(spdif-dit driver)
> dove                   24577  1		(dove-drm driver)
> snd_soc_kirkwood_i2s     5209  0	(kirkwood-i2s driver)
> snd_soc_kirkwood        3629  0		(kirkwood-pcm-audio driver)
> snd_soc_kirkwood_spdif     1449  0	(kirkwood-spdif-audio driver)
> 
> snd_soc_kirkwood_spdif will return -EPROBE_DEFER if it can't find the
> devices registered into ASoC by the other three modules (referred to as
> the CPU DAI, and Codec DAI).
> 
> What I see in the debugging is this (I've added some to increase the
> debuggability):
> 
> bus: 'platform': add driver kirkwood-spdif-audio
> bus: 'platform': driver_probe_device: matched device kirkwood-spdif-audio.1 with driver kirkwood-spdif-audio
> bus: 'platform': really_probe: probing driver kirkwood-spdif-audio with device kirkwood-spdif-audio.1
> kirkwood-spdif-audio kirkwood-spdif-audio.1: CPU DAI kirkwood-i2s.1 not registered
> bus: 'platform': add driver kirkwood-pcm-audio
> bus: 'platform': add driver kirkwood-i2s
> kirkwood-spdif-audio kirkwood-spdif-audio.1: failed to register card
> bus: 'platform': add driver dove-drm
> platform kirkwood-spdif-audio.1: Driver kirkwood-spdif-audio requests probe deferral
> platform kirkwood-spdif-audio.1: Added to deferred list
> bus: 'platform': driver_probe_device: matched device kirkwood-pcm-audio.1 with driver kirkwood-pcm-audio
> bus: 'platform': really_probe: probing driver kirkwood-pcm-audio with device kirkwood-pcm-audio.1
> driver: 'kirkwood-pcm-audio.1': driver_bound: bound to device 'kirkwood-pcm-audio'
> bus: 'platform': really_probe: bound device kirkwood-pcm-audio.1 to driver kirkwood-pcm-audio
> platform kirkwood-spdif-audio.1: Retrying from deferred list
> bus: device_attach 'kirkwood-spdif-audio.1' unbound
> bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'dove-temp'
> bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'galcore'
> bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'alarmtimer'
> bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'mv_xor_shared'
> bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'mv_xor'
> bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'serial8250'
> bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'sata_mv'
> bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'orion_spi'
> bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'mv643xx_eth'
> bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'mv643xx_eth_port'
> bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'ap510-vmeta'
> bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'orion-ehci'
> bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'rtc-mv'
> bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'mv64xxx_i2c'
> bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'gpio-rc-recv'
> bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'sdhci-pxav2'
> bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'sdhci-dove'
> bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'leds-gpio'
> bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'snd-soc-dummy'
> bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'soc-audio'
> bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'kirkwood-pcm-audio'
> 
> What you notice here is that the kirkwood-spdif-audio driver has gone
> missing - it hasn't been unloaded, and it still appears in sysfs, and
> it can still be cleanly unloaded and reloaded.
> 
> bus: 'platform': driver_probe_device: matched device kirkwood-i2s.1 with driver kirkwood-i2s
> bus: 'platform': really_probe: probing driver kirkwood-i2s with device kirkwood-i2s.1
> kirkwood-i2s kirkwood-i2s.1: found external clock
> driver: 'kirkwood-i2s.1': driver_bound: bound to device 'kirkwood-i2s'
> bus: 'platform': really_probe: bound device kirkwood-i2s.1 to driver kirkwood-i2s
> bus: 'platform': driver_probe_device: matched device dove-drm with driver dove-drm
> bus: 'platform': really_probe: probing driver dove-drm with device dove-drm
> bus: 'platform': add driver spdif-dit
> [drm] Supports vblank timestamp caching Rev 1 (10.10.2010).
> [drm] No driver support for vblank timestamp query.
> fb0: dove-drmfb frame buffer device
> drm: registered panic notifier
> [drm] Initialized dove-drm 1.0.0 20120730 on minor 0
> driver: 'dove-drm': driver_bound: bound to device 'dove-drm'
> bus: 'platform': really_probe: bound device dove-drm to driver dove-drm
> bus: 'platform': driver_probe_device: matched device spdif-dit with driver spdif-dit
> bus: 'platform': really_probe: probing driver spdif-dit with device spdif-dit
> driver: 'spdif-dit': driver_bound: bound to device 'spdif-dit'
> bus: 'platform': really_probe: bound device spdif-dit to driver spdif-dit
> 
> I suspect what is going on is that we have a race condition between the
> device being added to the deferred list vs other drivers successfully
> probing and running the deferred list, and the driver with the deferred
> device being added to the driver list.
> 
> This allows the deferred list to be run *before* the driver is placed
> onto the bus driver list, which means when we try to re-probe deferred
> devices, we find no matching driver to probe the device with.  This
> then results in the device being dropped off the deferred probe list
> (as the only way it stays on that list is if a driver is probed _and_
> it returns -EPROBE_DEFER.)
> 
> I do not know all the details of the locking in this area, so I'm not
> going to try to come up with a patch... but as this stands, it's
> unworkable.  I have not yet had the system come up once with these
> drivers as modules and had it work without having to manually remove
> and re-add the Kirkwood SPDIF driver.

Hm, I think you are right, there is probably a race here somewhere.

Grant, any chance you can look into this?  Has your testing showed this
as well?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [BUG] Deferred probing in driver model is racy, resulting in lost probes
  2012-09-15 16:03 ` Greg Kroah-Hartman
@ 2012-09-15 18:32   ` Russell King - ARM Linux
  0 siblings, 0 replies; 16+ messages in thread
From: Russell King - ARM Linux @ 2012-09-15 18:32 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Grant Likely, linux-arm-kernel, linux-kernel, Arnd Bergmann, Mark Brown

On Sat, Sep 15, 2012 at 09:03:04AM -0700, Greg Kroah-Hartman wrote:
> On Sat, Aug 18, 2012 at 03:58:56PM +0100, Russell King - ARM Linux wrote:
> > Okay, so EPROBE_DEFER seems to work when I build everything into the
> > kernel, but when I build a pile of ASoC drivers as modules, it fails
> > every time I've tried booting the platform so far.
> 
> Ugh, sorry for the delay in this, but this is really Grant's area, as he
> added this feature.
> 
> Actually, for some reason, I thought this was only supposed to work for
> built-in drivers, not for modules at all.  The fact that it works for
> them some times seems like an additional feature :)

It has to work for modules because there's no way to tell modprobe about
these kinds of dependencies - the modules make no direct reference to
their dependent drivers.

Think of a driver making use of gpiolib.  It makes reference only to
GPIO functions defined by gpiolib, which then redirects them to the
appropriate driver.

If upon requesting the GPIOs, it finds that the GPIOs it should be
using aren't there yet, it would decide to return -EPROBE_DEFER.
There is no direct reference to the GPIO module (which could be some
I2C peripheral also built as a module).

If you happen to have udev setup to automatically load all modules for
devices, this provides a definite case where EPROBE_DEFER sorts out the
probe ordering in this case - in theory, when the GPIO driver module
gets loaded, the GPIO-using driver will reprobe and hopefully initialize.

So, I think there's a valid use case by this for modules - in any case,
we can't really treat built-in drivers differently from modular drivers
unless we want them scattered with #ifdef MODULE...#endif stuff.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [BUG] Deferred probing in driver model is racy, resulting in lost probes
  2012-08-18 14:58 [BUG] Deferred probing in driver model is racy, resulting in lost probes Russell King - ARM Linux
  2012-09-15 16:03 ` Greg Kroah-Hartman
@ 2012-09-16  6:41 ` Ming Lei
  2012-09-16  8:25   ` Russell King - ARM Linux
  1 sibling, 1 reply; 16+ messages in thread
From: Ming Lei @ 2012-09-16  6:41 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: linux-arm-kernel, linux-kernel, Greg Kroah-Hartman, Grant Likely,
	Arnd Bergmann, Mark Brown

On Sat, Aug 18, 2012 at 10:58 PM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> Okay, so EPROBE_DEFER seems to work when I build everything into the
> kernel, but when I build a pile of ASoC drivers as modules, it fails
> every time I've tried booting the platform so far.
>
> This is a v3.5 based kernel, with preempt enabled.
>
> Okay, what I have is a bunch of devices already pre-registered in the
> system:
>
>         kirkwood-spdif-audio.1 (requiring module snd_soc_kirkwood_spdif)
>         kirkwood-i2s.1 (requiring module snd_soc_kirkwood_i2s)
>         kirkwood-pcm-audio.1 (requiring module snd_soc_kirkwood)
>         spdif-dit (requiring module snd_soc_spdif)
>
> and the modules were loaded in this order:
>
> Module                  Size  Used by
> snd_soc_spdif           1020  0         (spdif-dit driver)
> dove                   24577  1         (dove-drm driver)
> snd_soc_kirkwood_i2s     5209  0        (kirkwood-i2s driver)
> snd_soc_kirkwood        3629  0         (kirkwood-pcm-audio driver)
> snd_soc_kirkwood_spdif     1449  0      (kirkwood-spdif-audio driver)
>
> snd_soc_kirkwood_spdif will return -EPROBE_DEFER if it can't find the
> devices registered into ASoC by the other three modules (referred to as
> the CPU DAI, and Codec DAI).
>
> What I see in the debugging is this (I've added some to increase the
> debuggability):
>
> bus: 'platform': add driver kirkwood-spdif-audio
> bus: 'platform': driver_probe_device: matched device kirkwood-spdif-audio.1 with driver kirkwood-spdif-audio
> bus: 'platform': really_probe: probing driver kirkwood-spdif-audio with device kirkwood-spdif-audio.1
> kirkwood-spdif-audio kirkwood-spdif-audio.1: CPU DAI kirkwood-i2s.1 not registered
> bus: 'platform': add driver kirkwood-pcm-audio
> bus: 'platform': add driver kirkwood-i2s
> kirkwood-spdif-audio kirkwood-spdif-audio.1: failed to register card
> bus: 'platform': add driver dove-drm
> platform kirkwood-spdif-audio.1: Driver kirkwood-spdif-audio requests probe deferral
> platform kirkwood-spdif-audio.1: Added to deferred list
> bus: 'platform': driver_probe_device: matched device kirkwood-pcm-audio.1 with driver kirkwood-pcm-audio
> bus: 'platform': really_probe: probing driver kirkwood-pcm-audio with device kirkwood-pcm-audio.1
> driver: 'kirkwood-pcm-audio.1': driver_bound: bound to device 'kirkwood-pcm-audio'
> bus: 'platform': really_probe: bound device kirkwood-pcm-audio.1 to driver kirkwood-pcm-audio
> platform kirkwood-spdif-audio.1: Retrying from deferred list
> bus: device_attach 'kirkwood-spdif-audio.1' unbound

Looks the above message is added by yourself, where did you add
the message?

> bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'dove-temp'
> bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'galcore'
> bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'alarmtimer'
> bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'mv_xor_shared'
> bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'mv_xor'
> bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'serial8250'
> bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'sata_mv'
> bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'orion_spi'
> bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'mv643xx_eth'
> bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'mv643xx_eth_port'
> bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'ap510-vmeta'
> bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'orion-ehci'
> bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'rtc-mv'
> bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'mv64xxx_i2c'
> bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'gpio-rc-recv'
> bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'sdhci-pxav2'
> bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'sdhci-dove'
> bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'leds-gpio'
> bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'snd-soc-dummy'
> bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'soc-audio'
> bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'kirkwood-pcm-audio'
>
> What you notice here is that the kirkwood-spdif-audio driver has gone
> missing - it hasn't been unloaded, and it still appears in sysfs, and
> it can still be cleanly unloaded and reloaded.

It is a bit weird, and I am wondering why is the driver of  kirkwood-spdif-audio
deleted from bus->p->klist_drivers without being unregistered from
platform bus?

>
> bus: 'platform': driver_probe_device: matched device kirkwood-i2s.1 with driver kirkwood-i2s
> bus: 'platform': really_probe: probing driver kirkwood-i2s with device kirkwood-i2s.1
> kirkwood-i2s kirkwood-i2s.1: found external clock
> driver: 'kirkwood-i2s.1': driver_bound: bound to device 'kirkwood-i2s'
> bus: 'platform': really_probe: bound device kirkwood-i2s.1 to driver kirkwood-i2s
> bus: 'platform': driver_probe_device: matched device dove-drm with driver dove-drm
> bus: 'platform': really_probe: probing driver dove-drm with device dove-drm
> bus: 'platform': add driver spdif-dit
> [drm] Supports vblank timestamp caching Rev 1 (10.10.2010).
> [drm] No driver support for vblank timestamp query.
> fb0: dove-drmfb frame buffer device
> drm: registered panic notifier
> [drm] Initialized dove-drm 1.0.0 20120730 on minor 0
> driver: 'dove-drm': driver_bound: bound to device 'dove-drm'
> bus: 'platform': really_probe: bound device dove-drm to driver dove-drm
> bus: 'platform': driver_probe_device: matched device spdif-dit with driver spdif-dit
> bus: 'platform': really_probe: probing driver spdif-dit with device spdif-dit
> driver: 'spdif-dit': driver_bound: bound to device 'spdif-dit'
> bus: 'platform': really_probe: bound device spdif-dit to driver spdif-dit
>
> I suspect what is going on is that we have a race condition between the
> device being added to the deferred list vs other drivers successfully
> probing and running the deferred list, and the driver with the deferred
> device being added to the driver list.
>
> This allows the deferred list to be run *before* the driver is placed
> onto the bus driver list, which means when we try to re-probe deferred

Looks the above should not be a problem. After the driver is placed
on the bus, it still will call driver_attach to probe all unbound devices
on the bus.

> devices, we find no matching driver to probe the device with.  This
> then results in the device being dropped off the deferred probe list
> (as the only way it stays on that list is if a driver is probed _and_
> it returns -EPROBE_DEFER.)


Thanks,
-- 
Ming Lei

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [BUG] Deferred probing in driver model is racy, resulting in lost probes
  2012-09-16  6:41 ` Ming Lei
@ 2012-09-16  8:25   ` Russell King - ARM Linux
  2012-09-16 13:24     ` Ming Lei
  0 siblings, 1 reply; 16+ messages in thread
From: Russell King - ARM Linux @ 2012-09-16  8:25 UTC (permalink / raw)
  To: Ming Lei
  Cc: linux-arm-kernel, linux-kernel, Greg Kroah-Hartman, Grant Likely,
	Arnd Bergmann, Mark Brown

On Sun, Sep 16, 2012 at 02:41:28PM +0800, Ming Lei wrote:
> On Sat, Aug 18, 2012 at 10:58 PM, Russell King - ARM Linux
> <linux@arm.linux.org.uk> wrote:
> > Okay, so EPROBE_DEFER seems to work when I build everything into the
> > kernel, but when I build a pile of ASoC drivers as modules, it fails
> > every time I've tried booting the platform so far.
> >
> > This is a v3.5 based kernel, with preempt enabled.
> >
> > Okay, what I have is a bunch of devices already pre-registered in the
> > system:
> >
> >         kirkwood-spdif-audio.1 (requiring module snd_soc_kirkwood_spdif)
> >         kirkwood-i2s.1 (requiring module snd_soc_kirkwood_i2s)
> >         kirkwood-pcm-audio.1 (requiring module snd_soc_kirkwood)
> >         spdif-dit (requiring module snd_soc_spdif)
> >
> > and the modules were loaded in this order:
> >
> > Module                  Size  Used by
> > snd_soc_spdif           1020  0         (spdif-dit driver)
> > dove                   24577  1         (dove-drm driver)
> > snd_soc_kirkwood_i2s     5209  0        (kirkwood-i2s driver)
> > snd_soc_kirkwood        3629  0         (kirkwood-pcm-audio driver)
> > snd_soc_kirkwood_spdif     1449  0      (kirkwood-spdif-audio driver)
> >
> > snd_soc_kirkwood_spdif will return -EPROBE_DEFER if it can't find the
> > devices registered into ASoC by the other three modules (referred to as
> > the CPU DAI, and Codec DAI).
> >
> > What I see in the debugging is this (I've added some to increase the
> > debuggability):
> >
> > bus: 'platform': add driver kirkwood-spdif-audio
> > bus: 'platform': driver_probe_device: matched device kirkwood-spdif-audio.1 with driver kirkwood-spdif-audio
> > bus: 'platform': really_probe: probing driver kirkwood-spdif-audio with device kirkwood-spdif-audio.1
> > kirkwood-spdif-audio kirkwood-spdif-audio.1: CPU DAI kirkwood-i2s.1 not registered
> > bus: 'platform': add driver kirkwood-pcm-audio
> > bus: 'platform': add driver kirkwood-i2s
> > kirkwood-spdif-audio kirkwood-spdif-audio.1: failed to register card
> > bus: 'platform': add driver dove-drm
> > platform kirkwood-spdif-audio.1: Driver kirkwood-spdif-audio requests probe deferral
> > platform kirkwood-spdif-audio.1: Added to deferred list
> > bus: 'platform': driver_probe_device: matched device kirkwood-pcm-audio.1 with driver kirkwood-pcm-audio
> > bus: 'platform': really_probe: probing driver kirkwood-pcm-audio with device kirkwood-pcm-audio.1
> > driver: 'kirkwood-pcm-audio.1': driver_bound: bound to device 'kirkwood-pcm-audio'
> > bus: 'platform': really_probe: bound device kirkwood-pcm-audio.1 to driver kirkwood-pcm-audio
> > platform kirkwood-spdif-audio.1: Retrying from deferred list
> > bus: device_attach 'kirkwood-spdif-audio.1' unbound
> 
> Looks the above message is added by yourself, where did you add
> the message?

In... device_attach().  I don't remember where because it's been soo long
since I investigated and reported this problem.  If you're going to take
a month since the report to get around to replying, don't expect folk to
be very helpful when you ask for more information... especially when they
lose the debugging patch they were using to generate this.

> > bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'dove-temp'
> > bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'galcore'
> > bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'alarmtimer'
> > bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'mv_xor_shared'
> > bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'mv_xor'
> > bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'serial8250'
> > bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'sata_mv'
> > bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'orion_spi'
> > bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'mv643xx_eth'
> > bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'mv643xx_eth_port'
> > bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'ap510-vmeta'
> > bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'orion-ehci'
> > bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'rtc-mv'
> > bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'mv64xxx_i2c'
> > bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'gpio-rc-recv'
> > bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'sdhci-pxav2'
> > bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'sdhci-dove'
> > bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'leds-gpio'
> > bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'snd-soc-dummy'
> > bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'soc-audio'
> > bus: __device_attach 'kirkwood-spdif-audio.1' probing with 'kirkwood-pcm-audio'
> >
> > What you notice here is that the kirkwood-spdif-audio driver has gone
> > missing - it hasn't been unloaded, and it still appears in sysfs, and
> > it can still be cleanly unloaded and reloaded.
> 
> It is a bit weird, and I am wondering why is the driver of  kirkwood-spdif-audio
> deleted from bus->p->klist_drivers without being unregistered from
> platform bus?

It isn't.  As I said, it's a race condition due to lack of locking - the
driver hasn't been added to the list of drivers at this point:

int bus_add_driver(struct device_driver *drv)
{
...
        if (drv->bus->p->drivers_autoprobe) {
                error = driver_attach(drv);
                if (error)
                        goto out_unregister;
        }
        klist_add_tail(&priv->knode_bus, &bus->p->klist_drivers);
...
}

Notice that the attaching is done _before_ the driver is added to the
bus driver list.

> > bus: 'platform': driver_probe_device: matched device kirkwood-i2s.1 with driver kirkwood-i2s
> > bus: 'platform': really_probe: probing driver kirkwood-i2s with device kirkwood-i2s.1
> > kirkwood-i2s kirkwood-i2s.1: found external clock
> > driver: 'kirkwood-i2s.1': driver_bound: bound to device 'kirkwood-i2s'
> > bus: 'platform': really_probe: bound device kirkwood-i2s.1 to driver kirkwood-i2s
> > bus: 'platform': driver_probe_device: matched device dove-drm with driver dove-drm
> > bus: 'platform': really_probe: probing driver dove-drm with device dove-drm
> > bus: 'platform': add driver spdif-dit
> > [drm] Supports vblank timestamp caching Rev 1 (10.10.2010).
> > [drm] No driver support for vblank timestamp query.
> > fb0: dove-drmfb frame buffer device
> > drm: registered panic notifier
> > [drm] Initialized dove-drm 1.0.0 20120730 on minor 0
> > driver: 'dove-drm': driver_bound: bound to device 'dove-drm'
> > bus: 'platform': really_probe: bound device dove-drm to driver dove-drm
> > bus: 'platform': driver_probe_device: matched device spdif-dit with driver spdif-dit
> > bus: 'platform': really_probe: probing driver spdif-dit with device spdif-dit
> > driver: 'spdif-dit': driver_bound: bound to device 'spdif-dit'
> > bus: 'platform': really_probe: bound device spdif-dit to driver spdif-dit
> >
> > I suspect what is going on is that we have a race condition between the
> > device being added to the deferred list vs other drivers successfully
> > probing and running the deferred list, and the driver with the deferred
> > device being added to the driver list.
> >
> > This allows the deferred list to be run *before* the driver is placed
> > onto the bus driver list, which means when we try to re-probe deferred
> 
> Looks the above should not be a problem. After the driver is placed
> on the bus, it still will call driver_attach to probe all unbound devices
> on the bus.

Of course it's not a problem - of course, silly me, that's why it doesn't
work.  I'm sorry for reporting that something doesn't work, and now I see
it's all my fault.  Of course.  Stupid me.  Sorry for reporting a problem.

> > devices, we find no matching driver to probe the device with.  This
> > then results in the device being dropped off the deferred probe list
> > (as the only way it stays on that list is if a driver is probed _and_
> > it returns -EPROBE_DEFER.)
> 
> 
> Thanks,
> -- 
> Ming Lei

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [BUG] Deferred probing in driver model is racy, resulting in lost probes
  2012-09-16  8:25   ` Russell King - ARM Linux
@ 2012-09-16 13:24     ` Ming Lei
  2012-09-26 20:08       ` Greg Kroah-Hartman
  0 siblings, 1 reply; 16+ messages in thread
From: Ming Lei @ 2012-09-16 13:24 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: linux-arm-kernel, linux-kernel, Greg Kroah-Hartman, Grant Likely,
	Arnd Bergmann, Mark Brown

On Sun, Sep 16, 2012 at 4:25 PM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
>
> It isn't.  As I said, it's a race condition due to lack of locking - the
> driver hasn't been added to the list of drivers at this point:
>
> int bus_add_driver(struct device_driver *drv)
> {
> ...
>         if (drv->bus->p->drivers_autoprobe) {
>                 error = driver_attach(drv);
>                 if (error)
>                         goto out_unregister;
>         }
>         klist_add_tail(&priv->knode_bus, &bus->p->klist_drivers);
> ...
> }
>
> Notice that the attaching is done _before_ the driver is added to the
> bus driver list.

Yes, it is a problem since a new device may be added to bus
and bus_probe_device() may not see the new added driver.

So looks klist_add_tail() should complete before driver_attach()
inside bus_add_driver().

The attached one line change should fix the problem because the
below way can guarantee that no probe(dev) may be lost.


CPU0				CPU1
driver_register
	...
	write(bus->driver_list)
	smp_mb()
	read(bus->device_list)
	...
				device_add
					/* bus_add_device */	
					write(bus->device_list)
					smp_mb()
					/* bus_probe_device*/
					read(bus->driver_list)

And the smp_mb() has been implicit by UNLOCK+LOCK
of 'klist' according to 'VARIETIES OF MEMORY BARRIER' part
of Documentation/memory-barriers.txt.

Could you test the below patch to see if it can fix your problem?

diff --git a/drivers/base/bus.c b/drivers/base/bus.c
index 181ed26..17d7437 100644
--- a/drivers/base/bus.c
+++ b/drivers/base/bus.c
@@ -714,12 +714,12 @@ int bus_add_driver(struct device_driver *drv)
 	if (error)
 		goto out_unregister;

+	klist_add_tail(&priv->knode_bus, &bus->p->klist_drivers);
 	if (drv->bus->p->drivers_autoprobe) {
 		error = driver_attach(drv);
 		if (error)
 			goto out_unregister;
 	}
-	klist_add_tail(&priv->knode_bus, &bus->p->klist_drivers);
 	module_add_driver(drv->owner, drv);

 	error = driver_create_file(drv, &driver_attr_uevent);



Thanks,
-- 
Ming Lei

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [BUG] Deferred probing in driver model is racy, resulting in lost probes
  2012-09-16 13:24     ` Ming Lei
@ 2012-09-26 20:08       ` Greg Kroah-Hartman
  2012-09-26 20:23         ` Russell King - ARM Linux
  0 siblings, 1 reply; 16+ messages in thread
From: Greg Kroah-Hartman @ 2012-09-26 20:08 UTC (permalink / raw)
  To: Ming Lei
  Cc: Russell King - ARM Linux, linux-arm-kernel, linux-kernel,
	Grant Likely, Arnd Bergmann, Mark Brown

On Sun, Sep 16, 2012 at 09:24:43PM +0800, Ming Lei wrote:
> On Sun, Sep 16, 2012 at 4:25 PM, Russell King - ARM Linux
> <linux@arm.linux.org.uk> wrote:
> >
> > It isn't.  As I said, it's a race condition due to lack of locking - the
> > driver hasn't been added to the list of drivers at this point:
> >
> > int bus_add_driver(struct device_driver *drv)
> > {
> > ...
> >         if (drv->bus->p->drivers_autoprobe) {
> >                 error = driver_attach(drv);
> >                 if (error)
> >                         goto out_unregister;
> >         }
> >         klist_add_tail(&priv->knode_bus, &bus->p->klist_drivers);
> > ...
> > }
> >
> > Notice that the attaching is done _before_ the driver is added to the
> > bus driver list.
> 
> Yes, it is a problem since a new device may be added to bus
> and bus_probe_device() may not see the new added driver.
> 
> So looks klist_add_tail() should complete before driver_attach()
> inside bus_add_driver().
> 
> The attached one line change should fix the problem because the
> below way can guarantee that no probe(dev) may be lost.
> 
> 
> CPU0				CPU1
> driver_register
> 	...
> 	write(bus->driver_list)
> 	smp_mb()
> 	read(bus->device_list)
> 	...
> 				device_add
> 					/* bus_add_device */	
> 					write(bus->device_list)
> 					smp_mb()
> 					/* bus_probe_device*/
> 					read(bus->driver_list)
> 
> And the smp_mb() has been implicit by UNLOCK+LOCK
> of 'klist' according to 'VARIETIES OF MEMORY BARRIER' part
> of Documentation/memory-barriers.txt.
> 
> Could you test the below patch to see if it can fix your problem?
> 
> diff --git a/drivers/base/bus.c b/drivers/base/bus.c
> index 181ed26..17d7437 100644
> --- a/drivers/base/bus.c
> +++ b/drivers/base/bus.c
> @@ -714,12 +714,12 @@ int bus_add_driver(struct device_driver *drv)
>  	if (error)
>  		goto out_unregister;
> 
> +	klist_add_tail(&priv->knode_bus, &bus->p->klist_drivers);
>  	if (drv->bus->p->drivers_autoprobe) {
>  		error = driver_attach(drv);
>  		if (error)
>  			goto out_unregister;
>  	}
> -	klist_add_tail(&priv->knode_bus, &bus->p->klist_drivers);
>  	module_add_driver(drv->owner, drv);
> 
>  	error = driver_create_file(drv, &driver_attr_uevent);
> 
> 
> 

Did the above patch ever prove to solve the issue or not?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [BUG] Deferred probing in driver model is racy, resulting in lost probes
  2012-09-26 20:08       ` Greg Kroah-Hartman
@ 2012-09-26 20:23         ` Russell King - ARM Linux
  2012-09-26 20:36           ` Greg Kroah-Hartman
  2012-09-26 23:47           ` Ming Lei
  0 siblings, 2 replies; 16+ messages in thread
From: Russell King - ARM Linux @ 2012-09-26 20:23 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Ming Lei, linux-arm-kernel, linux-kernel, Grant Likely,
	Arnd Bergmann, Mark Brown

On Wed, Sep 26, 2012 at 01:08:33PM -0700, Greg Kroah-Hartman wrote:
> On Sun, Sep 16, 2012 at 09:24:43PM +0800, Ming Lei wrote:
> > diff --git a/drivers/base/bus.c b/drivers/base/bus.c
> > index 181ed26..17d7437 100644
> > --- a/drivers/base/bus.c
> > +++ b/drivers/base/bus.c
> > @@ -714,12 +714,12 @@ int bus_add_driver(struct device_driver *drv)
> >  	if (error)
> >  		goto out_unregister;
> > 
> > +	klist_add_tail(&priv->knode_bus, &bus->p->klist_drivers);
> >  	if (drv->bus->p->drivers_autoprobe) {
> >  		error = driver_attach(drv);
> >  		if (error)
> >  			goto out_unregister;
> >  	}
> > -	klist_add_tail(&priv->knode_bus, &bus->p->klist_drivers);
> >  	module_add_driver(drv->owner, drv);
> > 
> >  	error = driver_create_file(drv, &driver_attr_uevent);
> > 
> > 
> > 
> 
> Did the above patch ever prove to solve the issue or not?

To be honest, I've not bothered to test the above patch, and now when I
look at it, I notice it's broken - in that on error it will corrupt the
driver list.  Take a look at the error path.

priv is drv->p.  We add priv->knode_bus to the driver list.  If
driver_attach() returns an error, then we go to out_unregister, which
does:

out_unregister:
        kobject_put(&priv->kobj);
        kfree(drv->p);
        drv->p = NULL;

thereby freeing the node we just added to the driver list without first
removing it.

I suspect it will fix the problem, but let's get the patch to be correct
before it gets tested...

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [BUG] Deferred probing in driver model is racy, resulting in lost probes
  2012-09-26 20:23         ` Russell King - ARM Linux
@ 2012-09-26 20:36           ` Greg Kroah-Hartman
  2012-09-26 23:47           ` Ming Lei
  1 sibling, 0 replies; 16+ messages in thread
From: Greg Kroah-Hartman @ 2012-09-26 20:36 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Ming Lei, linux-arm-kernel, linux-kernel, Grant Likely,
	Arnd Bergmann, Mark Brown

On Wed, Sep 26, 2012 at 09:23:21PM +0100, Russell King - ARM Linux wrote:
> On Wed, Sep 26, 2012 at 01:08:33PM -0700, Greg Kroah-Hartman wrote:
> > On Sun, Sep 16, 2012 at 09:24:43PM +0800, Ming Lei wrote:
> > > diff --git a/drivers/base/bus.c b/drivers/base/bus.c
> > > index 181ed26..17d7437 100644
> > > --- a/drivers/base/bus.c
> > > +++ b/drivers/base/bus.c
> > > @@ -714,12 +714,12 @@ int bus_add_driver(struct device_driver *drv)
> > >  	if (error)
> > >  		goto out_unregister;
> > > 
> > > +	klist_add_tail(&priv->knode_bus, &bus->p->klist_drivers);
> > >  	if (drv->bus->p->drivers_autoprobe) {
> > >  		error = driver_attach(drv);
> > >  		if (error)
> > >  			goto out_unregister;
> > >  	}
> > > -	klist_add_tail(&priv->knode_bus, &bus->p->klist_drivers);
> > >  	module_add_driver(drv->owner, drv);
> > > 
> > >  	error = driver_create_file(drv, &driver_attr_uevent);
> > > 
> > > 
> > > 
> > 
> > Did the above patch ever prove to solve the issue or not?
> 
> To be honest, I've not bothered to test the above patch, and now when I
> look at it, I notice it's broken - in that on error it will corrupt the
> driver list.  Take a look at the error path.
> 
> priv is drv->p.  We add priv->knode_bus to the driver list.  If
> driver_attach() returns an error, then we go to out_unregister, which
> does:
> 
> out_unregister:
>         kobject_put(&priv->kobj);
>         kfree(drv->p);
>         drv->p = NULL;
> 
> thereby freeing the node we just added to the driver list without first
> removing it.
> 
> I suspect it will fix the problem, but let's get the patch to be correct
> before it gets tested...

Good catch.  Ming, care to redo this?

greg k-h

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [BUG] Deferred probing in driver model is racy, resulting in lost probes
  2012-09-26 20:23         ` Russell King - ARM Linux
  2012-09-26 20:36           ` Greg Kroah-Hartman
@ 2012-09-26 23:47           ` Ming Lei
  2012-09-27 13:58             ` Russell King - ARM Linux
  1 sibling, 1 reply; 16+ messages in thread
From: Ming Lei @ 2012-09-26 23:47 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Greg Kroah-Hartman, linux-arm-kernel, linux-kernel, Grant Likely,
	Arnd Bergmann, Mark Brown

On Thu, Sep 27, 2012 at 4:23 AM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Wed, Sep 26, 2012 at 01:08:33PM -0700, Greg Kroah-Hartman wrote:
>> On Sun, Sep 16, 2012 at 09:24:43PM +0800, Ming Lei wrote:
>> > diff --git a/drivers/base/bus.c b/drivers/base/bus.c
>> > index 181ed26..17d7437 100644
>> > --- a/drivers/base/bus.c
>> > +++ b/drivers/base/bus.c
>> > @@ -714,12 +714,12 @@ int bus_add_driver(struct device_driver *drv)
>> >     if (error)
>> >             goto out_unregister;
>> >
>> > +   klist_add_tail(&priv->knode_bus, &bus->p->klist_drivers);
>> >     if (drv->bus->p->drivers_autoprobe) {
>> >             error = driver_attach(drv);
>> >             if (error)
>> >                     goto out_unregister;
>> >     }
>> > -   klist_add_tail(&priv->knode_bus, &bus->p->klist_drivers);
>> >     module_add_driver(drv->owner, drv);
>> >
>> >     error = driver_create_file(drv, &driver_attr_uevent);
>> >
>> >
>> >
>>
>> Did the above patch ever prove to solve the issue or not?
>
> To be honest, I've not bothered to test the above patch, and now when I
> look at it, I notice it's broken - in that on error it will corrupt the
> driver list.  Take a look at the error path.
>
> priv is drv->p.  We add priv->knode_bus to the driver list.  If
> driver_attach() returns an error, then we go to out_unregister, which

In fact, driver_attach() always returns zero, so it does __not__ affect
your test.

I knew the failure shouldn't be handled in theory because the probe
failure on one device should not cause failure of driver_register, and
it should be fixed, IMO.

> does:
>
> out_unregister:
>         kobject_put(&priv->kobj);
>         kfree(drv->p);
>         drv->p = NULL;
>
> thereby freeing the node we just added to the driver list without first
> removing it.
>
> I suspect it will fix the problem, but let's get the patch to be correct
> before it gets tested...

Trust me, please go ahead to test and it doesn't affect it...

Thanks,
-- 
Ming Lei

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [BUG] Deferred probing in driver model is racy, resulting in lost probes
  2012-09-26 23:47           ` Ming Lei
@ 2012-09-27 13:58             ` Russell King - ARM Linux
  2012-09-27 14:03               ` Russell King - ARM Linux
  0 siblings, 1 reply; 16+ messages in thread
From: Russell King - ARM Linux @ 2012-09-27 13:58 UTC (permalink / raw)
  To: Ming Lei
  Cc: Greg Kroah-Hartman, linux-arm-kernel, linux-kernel, Grant Likely,
	Arnd Bergmann, Mark Brown

On Thu, Sep 27, 2012 at 07:47:46AM +0800, Ming Lei wrote:
> On Thu, Sep 27, 2012 at 4:23 AM, Russell King - ARM Linux
> <linux@arm.linux.org.uk> wrote:
> > To be honest, I've not bothered to test the above patch, and now when I
> > look at it, I notice it's broken - in that on error it will corrupt the
> > driver list.  Take a look at the error path.
> >
> > priv is drv->p.  We add priv->knode_bus to the driver list.  If
> > driver_attach() returns an error, then we go to out_unregister, which
> 
> In fact, driver_attach() always returns zero, so it does __not__ affect
> your test.

It may not affect my test, but the fact is, that patch lays down the
foundations for bugs to be introduced.  Now, while I can test it (and
have done) I don't think it's suitable for mainline because of _that_.

If driver_attach() always returns zero, there's no point in it returning
a value - it might as well return void and stop leading people to
believe that it might return an error.  So I suggest this alternative
patch instead, which gets rid of what is effectively dead code, and
probably a few warnings about not checking the return value from a
__must_check function (see usb-serial.c).

With this patch, no one is lead into a false sense of security that,
after your patch, if driver_attach() ever returns an error, proper
cleanup will happen - it's blatently obvious to anyone modifying it
that if they do want it to return an error, they have to audit all the
users of it, which IMHO is a good thing.

diff --git a/drivers/base/bus.c b/drivers/base/bus.c
index 2bcef65..39b3ef4 100644
--- a/drivers/base/bus.c
+++ b/drivers/base/bus.c
@@ -714,12 +714,10 @@ int bus_add_driver(struct device_driver *drv)
 	if (error)
 		goto out_unregister;
 
-	if (drv->bus->p->drivers_autoprobe) {
-		error = driver_attach(drv);
-		if (error)
-			goto out_unregister;
-	}
 	klist_add_tail(&priv->knode_bus, &bus->p->klist_drivers);
+	if (drv->bus->p->drivers_autoprobe)
+		driver_attach(drv);
+
 	module_add_driver(drv->owner, drv);
 
 	error = driver_create_file(drv, &driver_attr_uevent);
diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index 4b01ab3..2999df5 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -456,7 +456,7 @@ static int __driver_attach(struct device *dev, void *data)
  * returns 0 and the @dev->driver is set, we've found a
  * compatible pair.
  */
-int driver_attach(struct device_driver *drv)
+void driver_attach(struct device_driver *drv)
 {
 	return bus_for_each_dev(drv->bus, NULL, drv, __driver_attach);
 }
diff --git a/drivers/char/agp/amd64-agp.c b/drivers/char/agp/amd64-agp.c
index 444f8b6..4c16681 100644
--- a/drivers/char/agp/amd64-agp.c
+++ b/drivers/char/agp/amd64-agp.c
@@ -785,10 +785,12 @@ int __init agp_amd64_init(void)
 
 		/* Look for any AGP bridge */
 		agp_amd64_pci_driver.id_table = agp_amd64_pci_promisc_table;
-		err = driver_attach(&agp_amd64_pci_driver.driver);
-		if (err == 0 && agp_bridges_found == 0) {
+		driver_attach(&agp_amd64_pci_driver.driver);
+		if (agp_bridges_found == 0) {
 			pci_unregister_driver(&agp_amd64_pci_driver);
 			err = -ENODEV;
+		} else {
+			err = 0;
 		}
 	}
 	return err;
diff --git a/drivers/hid/hid-core.c b/drivers/hid/hid-core.c
index bb0608c..d2a8de5 100644
--- a/drivers/hid/hid-core.c
+++ b/drivers/hid/hid-core.c
@@ -1722,9 +1722,9 @@ static ssize_t store_new_id(struct device_driver *drv, const char *buf,
 	list_add_tail(&dynid->list, &hdrv->dyn_list);
 	spin_unlock(&hdrv->dyn_lock);
 
-	ret = driver_attach(&hdrv->driver);
+	driver_attach(&hdrv->driver);
 
-	return ret ? : count;
+	return count;
 }
 static DRIVER_ATTR(new_id, S_IWUSR, NULL, store_new_id);
 
diff --git a/drivers/input/gameport/gameport.c b/drivers/input/gameport/gameport.c
index da739d9..84f9878 100644
--- a/drivers/input/gameport/gameport.c
+++ b/drivers/input/gameport/gameport.c
@@ -670,12 +670,7 @@ static int gameport_driver_remove(struct device *dev)
 
 static void gameport_attach_driver(struct gameport_driver *drv)
 {
-	int error;
-
-	error = driver_attach(&drv->driver);
-	if (error)
-		pr_err("driver_attach() failed for %s, error: %d\n",
-			drv->driver.name, error);
+	driver_attach(&drv->driver);
 }
 
 int __gameport_register_driver(struct gameport_driver *drv, struct module *owner,
diff --git a/drivers/input/serio/serio.c b/drivers/input/serio/serio.c
index d0f7533..83f4eeb 100644
--- a/drivers/input/serio/serio.c
+++ b/drivers/input/serio/serio.c
@@ -802,12 +802,7 @@ static void serio_shutdown(struct device *dev)
 
 static void serio_attach_driver(struct serio_driver *drv)
 {
-	int error;
-
-	error = driver_attach(&drv->driver);
-	if (error)
-		pr_warning("driver_attach() failed for %s with error %d\n",
-			   drv->driver.name, error);
+	driver_attach(&drv->driver);
 }
 
 int __serio_register_driver(struct serio_driver *drv, struct module *owner, const char *mod_name)
diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index 099f46c..07b7ece 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -72,9 +72,9 @@ int pci_add_dynid(struct pci_driver *drv,
 	list_add_tail(&dynid->node, &drv->dynids.list);
 	spin_unlock(&drv->dynids.lock);
 
-	retval = driver_attach(&drv->driver);
+	driver_attach(&drv->driver);
 
-	return retval;
+	return 0;
 }
 
 static void pci_free_dynids(struct pci_driver *drv)
diff --git a/drivers/pcmcia/ds.c b/drivers/pcmcia/ds.c
index 079629b..ee631c9 100644
--- a/drivers/pcmcia/ds.c
+++ b/drivers/pcmcia/ds.c
@@ -103,7 +103,6 @@ pcmcia_store_new_id(struct device_driver *driver, const char *buf, size_t count)
 	__u8 func_id, function, device_no;
 	__u32 prod_id_hash[4] = {0, 0, 0, 0};
 	int fields = 0;
-	int retval = 0;
 
 	fields = sscanf(buf, "%hx %hx %hx %hhx %hhx %hhx %x %x %x %x",
 			&match_flags, &manf_id, &card_id, &func_id, &function, &device_no,
@@ -127,10 +126,8 @@ pcmcia_store_new_id(struct device_driver *driver, const char *buf, size_t count)
 	list_add_tail(&dynid->node, &pdrv->dynids.list);
 	mutex_unlock(&pdrv->dynids.lock);
 
-	retval = driver_attach(&pdrv->drv);
+	driver_attach(&pdrv->drv);
 
-	if (retval)
-		return retval;
 	return count;
 }
 static DRIVER_ATTR(new_id, S_IWUSR, NULL, pcmcia_store_new_id);
diff --git a/drivers/usb/core/driver.c b/drivers/usb/core/driver.c
index f536aeb..473c4fd 100644
--- a/drivers/usb/core/driver.c
+++ b/drivers/usb/core/driver.c
@@ -71,10 +71,8 @@ ssize_t usb_store_new_id(struct usb_dynids *dynids,
 	list_add_tail(&dynid->node, &dynids->list);
 	spin_unlock(&dynids->lock);
 
-	retval = driver_attach(driver);
+	driver_attach(driver);
 
-	if (retval)
-		return retval;
 	return count;
 }
 EXPORT_SYMBOL_GPL(usb_store_new_id);
diff --git a/drivers/usb/serial/usb-serial.c b/drivers/usb/serial/usb-serial.c
index 27483f9..c48ba9a 100644
--- a/drivers/usb/serial/usb-serial.c
+++ b/drivers/usb/serial/usb-serial.c
@@ -1444,7 +1444,7 @@ int usb_serial_register_drivers(struct usb_serial_driver *const serial_drivers[]
 
 	/* Now set udriver's id_table and look for matches */
 	udriver->id_table = id_table;
-	rc = driver_attach(&udriver->drvwrap.driver);
+	driver_attach(&udriver->drvwrap.driver);
 	return 0;
 
  failed:
diff --git a/include/linux/device.h b/include/linux/device.h
index 6de9415..ab2440d 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -829,7 +829,7 @@ static inline void *dev_get_platdata(const struct device *dev)
 extern int __must_check device_bind_driver(struct device *dev);
 extern void device_release_driver(struct device *dev);
 extern int  __must_check device_attach(struct device *dev);
-extern int __must_check driver_attach(struct device_driver *drv);
+extern void driver_attach(struct device_driver *drv);
 extern int __must_check device_reprobe(struct device *dev);
 
 /*


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [BUG] Deferred probing in driver model is racy, resulting in lost probes
  2012-09-27 13:58             ` Russell King - ARM Linux
@ 2012-09-27 14:03               ` Russell King - ARM Linux
  2012-09-27 14:15                 ` Ming Lei
  2012-09-27 17:22                 ` Joachim Eastwood
  0 siblings, 2 replies; 16+ messages in thread
From: Russell King - ARM Linux @ 2012-09-27 14:03 UTC (permalink / raw)
  To: Ming Lei
  Cc: Arnd Bergmann, Greg Kroah-Hartman, Mark Brown, linux-kernel,
	Grant Likely, linux-arm-kernel

On Thu, Sep 27, 2012 at 02:58:09PM +0100, Russell King - ARM Linux wrote:
> On Thu, Sep 27, 2012 at 07:47:46AM +0800, Ming Lei wrote:
> > On Thu, Sep 27, 2012 at 4:23 AM, Russell King - ARM Linux
> > <linux@arm.linux.org.uk> wrote:
> > > To be honest, I've not bothered to test the above patch, and now when I
> > > look at it, I notice it's broken - in that on error it will corrupt the
> > > driver list.  Take a look at the error path.
> > >
> > > priv is drv->p.  We add priv->knode_bus to the driver list.  If
> > > driver_attach() returns an error, then we go to out_unregister, which
> > 
> > In fact, driver_attach() always returns zero, so it does __not__ affect
> > your test.
> 
> It may not affect my test, but the fact is, that patch lays down the
> foundations for bugs to be introduced.  Now, while I can test it (and
> have done) I don't think it's suitable for mainline because of _that_.
> 
> If driver_attach() always returns zero, there's no point in it returning
> a value - it might as well return void and stop leading people to
> believe that it might return an error.  So I suggest this alternative
> patch instead, which gets rid of what is effectively dead code, and
> probably a few warnings about not checking the return value from a
> __must_check function (see usb-serial.c).
> 
> With this patch, no one is lead into a false sense of security that,
> after your patch, if driver_attach() ever returns an error, proper
> cleanup will happen - it's blatently obvious to anyone modifying it
> that if they do want it to return an error, they have to audit all the
> users of it, which IMHO is a good thing.

Sorry, old version of that patch.  Here's the right one.

diff --git a/drivers/base/bus.c b/drivers/base/bus.c
index 2bcef65..39b3ef4 100644
--- a/drivers/base/bus.c
+++ b/drivers/base/bus.c
@@ -714,12 +714,10 @@ int bus_add_driver(struct device_driver *drv)
 	if (error)
 		goto out_unregister;
 
-	if (drv->bus->p->drivers_autoprobe) {
-		error = driver_attach(drv);
-		if (error)
-			goto out_unregister;
-	}
 	klist_add_tail(&priv->knode_bus, &bus->p->klist_drivers);
+	if (drv->bus->p->drivers_autoprobe)
+		driver_attach(drv);
+
 	module_add_driver(drv->owner, drv);
 
 	error = driver_create_file(drv, &driver_attr_uevent);
diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index 4b01ab3..77e2412 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -456,9 +456,9 @@ static int __driver_attach(struct device *dev, void *data)
  * returns 0 and the @dev->driver is set, we've found a
  * compatible pair.
  */
-int driver_attach(struct device_driver *drv)
+void driver_attach(struct device_driver *drv)
 {
-	return bus_for_each_dev(drv->bus, NULL, drv, __driver_attach);
+	bus_for_each_dev(drv->bus, NULL, drv, __driver_attach);
 }
 EXPORT_SYMBOL_GPL(driver_attach);
 
diff --git a/drivers/char/agp/amd64-agp.c b/drivers/char/agp/amd64-agp.c
index 444f8b6..4c16681 100644
--- a/drivers/char/agp/amd64-agp.c
+++ b/drivers/char/agp/amd64-agp.c
@@ -785,10 +785,12 @@ int __init agp_amd64_init(void)
 
 		/* Look for any AGP bridge */
 		agp_amd64_pci_driver.id_table = agp_amd64_pci_promisc_table;
-		err = driver_attach(&agp_amd64_pci_driver.driver);
-		if (err == 0 && agp_bridges_found == 0) {
+		driver_attach(&agp_amd64_pci_driver.driver);
+		if (agp_bridges_found == 0) {
 			pci_unregister_driver(&agp_amd64_pci_driver);
 			err = -ENODEV;
+		} else {
+			err = 0;
 		}
 	}
 	return err;
diff --git a/drivers/hid/hid-core.c b/drivers/hid/hid-core.c
index bb0608c..d2a8de5 100644
--- a/drivers/hid/hid-core.c
+++ b/drivers/hid/hid-core.c
@@ -1722,9 +1722,9 @@ static ssize_t store_new_id(struct device_driver *drv, const char *buf,
 	list_add_tail(&dynid->list, &hdrv->dyn_list);
 	spin_unlock(&hdrv->dyn_lock);
 
-	ret = driver_attach(&hdrv->driver);
+	driver_attach(&hdrv->driver);
 
-	return ret ? : count;
+	return count;
 }
 static DRIVER_ATTR(new_id, S_IWUSR, NULL, store_new_id);
 
diff --git a/drivers/input/gameport/gameport.c b/drivers/input/gameport/gameport.c
index da739d9..84f9878 100644
--- a/drivers/input/gameport/gameport.c
+++ b/drivers/input/gameport/gameport.c
@@ -670,12 +670,7 @@ static int gameport_driver_remove(struct device *dev)
 
 static void gameport_attach_driver(struct gameport_driver *drv)
 {
-	int error;
-
-	error = driver_attach(&drv->driver);
-	if (error)
-		pr_err("driver_attach() failed for %s, error: %d\n",
-			drv->driver.name, error);
+	driver_attach(&drv->driver);
 }
 
 int __gameport_register_driver(struct gameport_driver *drv, struct module *owner,
diff --git a/drivers/input/serio/serio.c b/drivers/input/serio/serio.c
index d0f7533..83f4eeb 100644
--- a/drivers/input/serio/serio.c
+++ b/drivers/input/serio/serio.c
@@ -802,12 +802,7 @@ static void serio_shutdown(struct device *dev)
 
 static void serio_attach_driver(struct serio_driver *drv)
 {
-	int error;
-
-	error = driver_attach(&drv->driver);
-	if (error)
-		pr_warning("driver_attach() failed for %s with error %d\n",
-			   drv->driver.name, error);
+	driver_attach(&drv->driver);
 }
 
 int __serio_register_driver(struct serio_driver *drv, struct module *owner, const char *mod_name)
diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index 099f46c..07b7ece 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -72,9 +72,9 @@ int pci_add_dynid(struct pci_driver *drv,
 	list_add_tail(&dynid->node, &drv->dynids.list);
 	spin_unlock(&drv->dynids.lock);
 
-	retval = driver_attach(&drv->driver);
+	driver_attach(&drv->driver);
 
-	return retval;
+	return 0;
 }
 
 static void pci_free_dynids(struct pci_driver *drv)
diff --git a/drivers/pcmcia/ds.c b/drivers/pcmcia/ds.c
index 079629b..ee631c9 100644
--- a/drivers/pcmcia/ds.c
+++ b/drivers/pcmcia/ds.c
@@ -103,7 +103,6 @@ pcmcia_store_new_id(struct device_driver *driver, const char *buf, size_t count)
 	__u8 func_id, function, device_no;
 	__u32 prod_id_hash[4] = {0, 0, 0, 0};
 	int fields = 0;
-	int retval = 0;
 
 	fields = sscanf(buf, "%hx %hx %hx %hhx %hhx %hhx %x %x %x %x",
 			&match_flags, &manf_id, &card_id, &func_id, &function, &device_no,
@@ -127,10 +126,8 @@ pcmcia_store_new_id(struct device_driver *driver, const char *buf, size_t count)
 	list_add_tail(&dynid->node, &pdrv->dynids.list);
 	mutex_unlock(&pdrv->dynids.lock);
 
-	retval = driver_attach(&pdrv->drv);
+	driver_attach(&pdrv->drv);
 
-	if (retval)
-		return retval;
 	return count;
 }
 static DRIVER_ATTR(new_id, S_IWUSR, NULL, pcmcia_store_new_id);
diff --git a/drivers/usb/core/driver.c b/drivers/usb/core/driver.c
index f536aeb..473c4fd 100644
--- a/drivers/usb/core/driver.c
+++ b/drivers/usb/core/driver.c
@@ -71,10 +71,8 @@ ssize_t usb_store_new_id(struct usb_dynids *dynids,
 	list_add_tail(&dynid->node, &dynids->list);
 	spin_unlock(&dynids->lock);
 
-	retval = driver_attach(driver);
+	driver_attach(driver);
 
-	if (retval)
-		return retval;
 	return count;
 }
 EXPORT_SYMBOL_GPL(usb_store_new_id);
diff --git a/drivers/usb/serial/usb-serial.c b/drivers/usb/serial/usb-serial.c
index 27483f9..c48ba9a 100644
--- a/drivers/usb/serial/usb-serial.c
+++ b/drivers/usb/serial/usb-serial.c
@@ -1444,7 +1444,7 @@ int usb_serial_register_drivers(struct usb_serial_driver *const serial_drivers[]
 
 	/* Now set udriver's id_table and look for matches */
 	udriver->id_table = id_table;
-	rc = driver_attach(&udriver->drvwrap.driver);
+	driver_attach(&udriver->drvwrap.driver);
 	return 0;
 
  failed:
diff --git a/include/linux/device.h b/include/linux/device.h
index 6de9415..ab2440d 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -829,7 +829,7 @@ static inline void *dev_get_platdata(const struct device *dev)
 extern int __must_check device_bind_driver(struct device *dev);
 extern void device_release_driver(struct device *dev);
 extern int  __must_check device_attach(struct device *dev);
-extern int __must_check driver_attach(struct device_driver *drv);
+extern void driver_attach(struct device_driver *drv);
 extern int __must_check device_reprobe(struct device *dev);
 
 /*


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [BUG] Deferred probing in driver model is racy, resulting in lost probes
  2012-09-27 14:03               ` Russell King - ARM Linux
@ 2012-09-27 14:15                 ` Ming Lei
  2012-09-27 17:22                 ` Joachim Eastwood
  1 sibling, 0 replies; 16+ messages in thread
From: Ming Lei @ 2012-09-27 14:15 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Arnd Bergmann, Greg Kroah-Hartman, Mark Brown, linux-kernel,
	Grant Likely, linux-arm-kernel

On Thu, Sep 27, 2012 at 10:03 PM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Thu, Sep 27, 2012 at 02:58:09PM +0100, Russell King - ARM Linux wrote:
>> On Thu, Sep 27, 2012 at 07:47:46AM +0800, Ming Lei wrote:
>> > On Thu, Sep 27, 2012 at 4:23 AM, Russell King - ARM Linux
>> > <linux@arm.linux.org.uk> wrote:
>> > > To be honest, I've not bothered to test the above patch, and now when I
>> > > look at it, I notice it's broken - in that on error it will corrupt the
>> > > driver list.  Take a look at the error path.
>> > >
>> > > priv is drv->p.  We add priv->knode_bus to the driver list.  If
>> > > driver_attach() returns an error, then we go to out_unregister, which
>> >
>> > In fact, driver_attach() always returns zero, so it does __not__ affect
>> > your test.
>>
>> It may not affect my test, but the fact is, that patch lays down the
>> foundations for bugs to be introduced.  Now, while I can test it (and
>> have done) I don't think it's suitable for mainline because of _that_.
>>
>> If driver_attach() always returns zero, there's no point in it returning
>> a value - it might as well return void and stop leading people to
>> believe that it might return an error.  So I suggest this alternative
>> patch instead, which gets rid of what is effectively dead code, and
>> probably a few warnings about not checking the return value from a
>> __must_check function (see usb-serial.c).
>>
>> With this patch, no one is lead into a false sense of security that,
>> after your patch, if driver_attach() ever returns an error, proper
>> cleanup will happen - it's blatently obvious to anyone modifying it
>> that if they do want it to return an error, they have to audit all the
>> users of it, which IMHO is a good thing.
>
> Sorry, old version of that patch.  Here's the right one.

IMO, it is better to take the one line change first since it is really
correct and easy to be backported to previous stable releases.


Thanks,
-- 
Ming Lei

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [BUG] Deferred probing in driver model is racy, resulting in lost probes
  2012-09-27 14:03               ` Russell King - ARM Linux
  2012-09-27 14:15                 ` Ming Lei
@ 2012-09-27 17:22                 ` Joachim Eastwood
  2012-09-27 17:26                   ` Mark Brown
  2012-09-28  0:30                   ` Ming Lei
  1 sibling, 2 replies; 16+ messages in thread
From: Joachim Eastwood @ 2012-09-27 17:22 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Ming Lei, Arnd Bergmann, Greg Kroah-Hartman, Mark Brown,
	linux-kernel, Grant Likely, linux-arm-kernel

Hi,

On Thu, Sep 27, 2012 at 4:03 PM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Thu, Sep 27, 2012 at 02:58:09PM +0100, Russell King - ARM Linux wrote:
>> On Thu, Sep 27, 2012 at 07:47:46AM +0800, Ming Lei wrote:
>> > On Thu, Sep 27, 2012 at 4:23 AM, Russell King - ARM Linux
>> > <linux@arm.linux.org.uk> wrote:
>> > > To be honest, I've not bothered to test the above patch, and now when I
>> > > look at it, I notice it's broken - in that on error it will corrupt the
>> > > driver list.  Take a look at the error path.
>> > >
>> > > priv is drv->p.  We add priv->knode_bus to the driver list.  If
>> > > driver_attach() returns an error, then we go to out_unregister, which
>> >
>> > In fact, driver_attach() always returns zero, so it does __not__ affect
>> > your test.
>>
>> It may not affect my test, but the fact is, that patch lays down the
>> foundations for bugs to be introduced.  Now, while I can test it (and
>> have done) I don't think it's suitable for mainline because of _that_.
>>
>> If driver_attach() always returns zero, there's no point in it returning
>> a value - it might as well return void and stop leading people to
>> believe that it might return an error.  So I suggest this alternative
>> patch instead, which gets rid of what is effectively dead code, and
>> probably a few warnings about not checking the return value from a
>> __must_check function (see usb-serial.c).
>>
>> With this patch, no one is lead into a false sense of security that,
>> after your patch, if driver_attach() ever returns an error, proper
>> cleanup will happen - it's blatently obvious to anyone modifying it
>> that if they do want it to return an error, they have to audit all the
>> users of it, which IMHO is a good thing.
>
> Sorry, old version of that patch.  Here's the right one.

<snip patch>

I am also having problems with ASoC modules and deferred probing. This
patch seems to improve the a lot situation, though. Now it work most
of the time, but occasionally I get the warning below.

Not sure if this is directly related to the issue Russell have been
seeing. Guess it can be races in the Atmel ASoC drivers as well(?).

[   12.230000] mpa1600-audio mpa1600-audio.0: CODEC wm8776.0-001b not registered
[   12.390000] mpa1600-audio mpa1600-audio.0: snd_soc_register_card()
failed: -517
[   12.390000] platform mpa1600-audio.0: Driver mpa1600-audio requests
probe deferral
[   12.720000] ------------[ cut here ]------------
[   12.720000] WARNING: at fs/sysfs/dir.c:536 sysfs_add_one+0x7c/0xa0()
[   12.740000] mpa1600-audio mpa1600-audio.1: CODEC wm8776.0-001a not registered
[   12.910000] mpa1600-audio mpa1600-audio.1: snd_soc_register_card()
failed: -517
[   13.140000] sysfs: cannot create duplicate filename
'/devices/platform/ssc.0/atmel-ssc-dai.0'
[   13.140000] Modules linked in: snd_soc_mpa1600_wm8776(+)
snd_soc_atmel_ssc_dai snd_soc_core pcmcia regmap_i2c snd_pcm
snd_page_alloc ad525x_dpot_i2c snd_timer ad525x_dpot ohci_hcd
leds_pca9532 snd usbcore soundcore gpio_keys rege
[   13.630000] Backtrace:
[   13.630000] [<c000c46c>] (dump_backtrace+0x0/0x10c) from
[<c021ac48>] (dump_stack+0x18/0x1c)
[   13.980000]  r7:c3839ca8 r6:c00d0cf0 r5:c0278f34 r4:00000218
[   13.990000] [<c021ac30>] (dump_stack+0x0/0x1c) from [<c0015ad0>]
(warn_slowpath_common+0x54/0x6c)
[   14.240000] [<c0015a7c>] (warn_slowpath_common+0x0/0x6c) from
[<c0015b8c>] (warn_slowpath_fmt+0x38/0x40)
[   14.470000]  r8:c2d41110 r7:ffffffef r6:c3839ce8 r5:c3a47780 r4:c2c58000
[   14.480000] [<c0015b54>] (warn_slowpath_fmt+0x0/0x40) from
[<c00d0cf0>] (sysfs_add_one+0x7c/0xa0)
[   14.660000]  r3:c2c58000 r2:c0278f64
[   14.660000] [<c00d0c74>] (sysfs_add_one+0x0/0xa0) from [<c00d1898>]
(create_dir+0x6c/0xc0)
[   14.790000]  r7:00000000 r6:00000000 r5:c3a47780 r4:c3839ce8
[   14.790000] [<c00d182c>] (create_dir+0x0/0xc0) from [<c00d19c4>]
(sysfs_create_dir+0xd8/0xf0)
[   14.850000] [<c00d18ec>] (sysfs_create_dir+0x0/0xf0) from
[<c01252f8>] (kobject_add_internal+0xe4/0x1e0)
[   14.870000]  r7:c2d41b00 r6:c02cc2e0 r5:00000000 r4:c2d41110
[   14.880000] [<c0125214>] (kobject_add_internal+0x0/0x1e0) from
[<c0125644>] (kobject_add_varg+0x44/0x54)
[   14.890000] [<c0125600>] (kobject_add_varg+0x0/0x54) from
[<c01256dc>] (kobject_add+0x4c/0x58)
[   14.900000]  r6:00000000 r5:c2d41108 r4:c2d41100
[   14.910000] [<c0125690>] (kobject_add+0x0/0x58) from [<c015220c>]
(device_add+0xe4/0x5d0)
[   14.920000]  r3:c3839db4 r2:00000000
[   14.920000] [<c0152128>] (device_add+0x0/0x5d0) from [<c01561a8>]
(platform_device_add+0x10c/0x164)
[   14.930000] [<c015609c>] (platform_device_add+0x0/0x164) from
[<bf11a9f4>] (atmel_ssc_set_audio+0xac/0xdc [snd_soc_atmel_ssc_dai])
[   14.940000]  r7:c2d41b00 r6:00000000 r5:c2d41100 r4:c02ccf90
[   14.950000] [<bf11a948>] (atmel_ssc_set_audio+0x0/0xdc
[snd_soc_atmel_ssc_dai]) from [<bf11f038>]
(mpa1600_audio_probe+0x38/0x88 [snd_soc_mpa1600_wm8776])
[   14.960000]  r7:bf11f464 r6:c02ccf98 r5:bf11f4a0 r4:c02ccf90
[   14.960000] [<bf11f000>] (mpa1600_audio_probe+0x0/0x88
[snd_soc_mpa1600_wm8776]) from [<c0155b6c>]
(platform_drv_probe+0x20/0x24)
[   14.990000]  r6:c02ccf98 r5:c02ccf98 r4:bf11f464
[   14.990000] [<c0155b4c>] (platform_drv_probe+0x0/0x24) from
[<c0154680>] (driver_probe_device+0xb8/0x20c)
[   15.000000] [<c01545c8>] (driver_probe_device+0x0/0x20c) from
[<c01548a4>] (__device_attach+0x48/0x4c)
[   15.010000]  r7:c3839ea8 r6:c02ccf98 r5:c02ccf98 r4:bf11f464
[   15.020000] [<c015485c>] (__device_attach+0x0/0x4c) from
[<c0152dc0>] (bus_for_each_drv+0x5c/0x9c)
[   15.030000]  r5:c015485c r4:00000000
[   15.030000] [<c0152d64>] (bus_for_each_drv+0x0/0x9c) from
[<c015493c>] (device_attach+0x6c/0x8c)
[   15.040000]  r7:00000089 r6:c02ccfcc r5:c02d7518 r4:c02ccf98
[   15.050000] [<c01548d0>] (device_attach+0x0/0x8c) from [<c0153988>]
(bus_probe_device+0x34/0xa4)
[   15.060000]  r6:c02ccf98 r5:c02d7518 r4:c02ccf98
[   15.060000] [<c0153954>] (bus_probe_device+0x0/0xa4) from
[<c0154240>] (deferred_probe_work_func+0x60/0x94)
[   15.070000]  r6:c02e3750 r5:c02d7400 r4:c02ccf98
[   15.080000] [<c01541e0>] (deferred_probe_work_func+0x0/0x94) from
[<c002c338>] (process_one_work+0x310/0x4a0)
[   15.090000]  r5:c3804500 r4:c02d7408
[   15.100000] [<c002c028>] (process_one_work+0x0/0x4a0) from
[<c002cbc8>] (worker_thread+0x3b8/0x5e0)
[   15.110000] [<c002c810>] (worker_thread+0x0/0x5e0) from
[<c00336e4>] (kthread+0x8c/0x98)
[   15.110000] [<c0033658>] (kthread+0x0/0x98) from [<c001b4d4>]
(do_exit+0x0/0x720)
[   15.120000]  r7:00000013 r6:c001b4d4 r5:c0033658 r4:c3827ecc
[   15.130000] ---[ end trace d7472237284e57c2 ]---
[   15.130000] platform mpa1600-audio.1: Driver mpa1600-audio requests
probe deferral
[   15.150000] ------------[ cut here ]------------
[   15.150000] WARNING: at lib/kobject.c:196 kobject_add_internal+0x17c/0x1e0()
[   15.250000] kobject_add_internal failed for atmel-ssc-dai.0 with
-EEXIST, don't try to register things with the same name in the same
directory.
[   15.390000] Modules linked in: snd_soc_mpa1600_wm8776
snd_soc_atmel_ssc_dai snd_soc_core pcmcia regmap_i2c snd_pcm
snd_page_alloc ad525x_dpot_i2c snd_timer ad525x_dpot ohci_hcd
leds_pca9532 snd usbcore soundcore gpio_keys regmape
[   15.470000] Backtrace:
[   15.470000] [<c000c46c>] (dump_backtrace+0x0/0x10c) from
[<c021ac48>] (dump_stack+0x18/0x1c)
[   15.520000]  r7:c3839d28 r6:c0125390 r5:c027f187 r4:000000c4
[   15.520000] [<c021ac30>] (dump_stack+0x0/0x1c) from [<c0015ad0>]
(warn_slowpath_common+0x54/0x6c)
[   15.620000] [<c0015a7c>] (warn_slowpath_common+0x0/0x6c) from
[<c0015b8c>] (warn_slowpath_fmt+0x38/0x40)
[   15.640000]  r8:ffffffef r7:c2d41b00 r6:c02cc2e0 r5:ffffffef r4:c2d41110
[   15.650000] [<c0015b54>] (warn_slowpath_fmt+0x0/0x40) from
[<c0125390>] (kobject_add_internal+0x17c/0x1e0)
[   15.670000]  r3:c02247bc r2:c027f21f
[   15.680000] [<c0125214>] (kobject_add_internal+0x0/0x1e0) from
[<c0125644>] (kobject_add_varg+0x44/0x54)
[   15.700000] [<c0125600>] (kobject_add_varg+0x0/0x54) from
[<c01256dc>] (kobject_add+0x4c/0x58)
[   15.720000]  r6:00000000 r5:c2d41108 r4:c2d41100
[   15.720000] [<c0125690>] (kobject_add+0x0/0x58) from [<c015220c>]
(device_add+0xe4/0x5d0)
[   15.740000]  r3:c3839db4 r2:00000000
[   15.760000] [<c0152128>] (device_add+0x0/0x5d0) from [<c01561a8>]
(platform_device_add+0x10c/0x164)
[   15.760000] [<c015609c>] (platform_device_add+0x0/0x164) from
[<bf11a9f4>] (atmel_ssc_set_audio+0xac/0xdc [snd_soc_atmel_ssc_dai])
[   15.790000]  r7:c2d41b00 r6:00000000 r5:c2d41100 r4:c02ccf90
[   15.810000] [<bf11a948>] (atmel_ssc_set_audio+0x0/0xdc
[snd_soc_atmel_ssc_dai]) from [<bf11f038>]
(mpa1600_audio_probe+0x38/0x88 [snd_soc_mpa1600_wm8776])
[   15.840000]  r7:bf11f464 r6:c02ccf98 r5:bf11f4a0 r4:c02ccf90
[   15.840000] [<bf11f000>] (mpa1600_audio_probe+0x0/0x88
[snd_soc_mpa1600_wm8776]) from [<c0155b6c>]
(platform_drv_probe+0x20/0x24)
[   15.860000]  r6:c02ccf98 r5:c02ccf98 r4:bf11f464
[   15.880000] [<c0155b4c>] (platform_drv_probe+0x0/0x24) from
[<c0154680>] (driver_probe_device+0xb8/0x20c)
[   15.900000] [<c01545c8>] (driver_probe_device+0x0/0x20c) from
[<c01548a4>] (__device_attach+0x48/0x4c)
[   15.910000]  r7:c3839ea8 r6:c02ccf98 r5:c02ccf98 r4:bf11f464
[   15.920000] [<c015485c>] (__device_attach+0x0/0x4c) from
[<c0152dc0>] (bus_for_each_drv+0x5c/0x9c)
[   15.940000]  r5:c015485c r4:00000000
[   15.950000] [<c0152d64>] (bus_for_each_drv+0x0/0x9c) from
[<c015493c>] (device_attach+0x6c/0x8c)
[   15.960000]  r7:00000089 r6:c02ccfcc r5:c02d7518 r4:c02ccf98
[   15.970000] [<c01548d0>] (device_attach+0x0/0x8c) from [<c0153988>]
(bus_probe_device+0x34/0xa4)
[   16.000000]  r6:c02ccf98 r5:c02d7518 r4:c02ccf98
[   16.030000] [<c0153954>] (bus_probe_device+0x0/0xa4) from
[<c0154240>] (deferred_probe_work_func+0x60/0x94)
[   16.050000]  r6:c02e3750 r5:c02d7400 r4:c02ccf98
[   16.070000] [<c01541e0>] (deferred_probe_work_func+0x0/0x94) from
[<c002c338>] (process_one_work+0x310/0x4a0)
[   16.100000]  r5:c3804500 r4:c02d7408
[   16.100000] [<c002c028>] (process_one_work+0x0/0x4a0) from
[<c002cbc8>] (worker_thread+0x3b8/0x5e0)
[   16.120000] [<c002c810>] (worker_thread+0x0/0x5e0) from
[<c00336e4>] (kthread+0x8c/0x98)
[   16.140000] [<c0033658>] (kthread+0x0/0x98) from [<c001b4d4>]
(do_exit+0x0/0x720)
[   16.150000]  r7:00000013 r6:c001b4d4 r5:c0033658 r4:c3827ecc
[   16.160000] ---[ end trace d7472237284e57c3 ]---
[   16.160000] mpa1600-audio mpa1600-audio.0: Failed to set SSC for audio: -17
[   16.180000] mpa1600-audio: probe of mpa1600-audio.0 failed with error -17

regards
Joachim Eastwood

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [BUG] Deferred probing in driver model is racy, resulting in lost probes
  2012-09-27 17:22                 ` Joachim Eastwood
@ 2012-09-27 17:26                   ` Mark Brown
  2012-09-28  0:30                   ` Ming Lei
  1 sibling, 0 replies; 16+ messages in thread
From: Mark Brown @ 2012-09-27 17:26 UTC (permalink / raw)
  To: Joachim Eastwood
  Cc: Russell King - ARM Linux, Ming Lei, Arnd Bergmann,
	Greg Kroah-Hartman, linux-kernel, Grant Likely, linux-arm-kernel

On Thu, Sep 27, 2012 at 07:22:38PM +0200, Joachim Eastwood wrote:

> [   13.140000] sysfs: cannot create duplicate filename
> '/devices/platform/ssc.0/atmel-ssc-dai.0'
> [   13.140000] Modules linked in: snd_soc_mpa1600_wm8776(+)

Oh, dear.  This is probably a genuine bug, triggered by but not really
related to deferred probing.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [BUG] Deferred probing in driver model is racy, resulting in lost probes
  2012-09-27 17:22                 ` Joachim Eastwood
  2012-09-27 17:26                   ` Mark Brown
@ 2012-09-28  0:30                   ` Ming Lei
  1 sibling, 0 replies; 16+ messages in thread
From: Ming Lei @ 2012-09-28  0:30 UTC (permalink / raw)
  To: Joachim Eastwood
  Cc: Russell King - ARM Linux, Arnd Bergmann, Greg Kroah-Hartman,
	Mark Brown, linux-kernel, Grant Likely, linux-arm-kernel

On Fri, Sep 28, 2012 at 1:22 AM, Joachim  Eastwood <manabian@gmail.com> wrote:

>
> I am also having problems with ASoC modules and deferred probing. This
> patch seems to improve the a lot situation, though. Now it work most
> of the time, but occasionally I get the warning below.
>
> Not sure if this is directly related to the issue Russell have been
> seeing. Guess it can be races in the Atmel ASoC drivers as well(?).

Looks the driver involved is out of tree since mpa1600_audio_probe
can't be found in -next or linux tree.

Also your problem should be related the driver and probably caused
by calling atmel_ssc_set_audio more than one time.

Thanks,
-- 
Ming Lei

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2012-09-28  0:30 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-08-18 14:58 [BUG] Deferred probing in driver model is racy, resulting in lost probes Russell King - ARM Linux
2012-09-15 16:03 ` Greg Kroah-Hartman
2012-09-15 18:32   ` Russell King - ARM Linux
2012-09-16  6:41 ` Ming Lei
2012-09-16  8:25   ` Russell King - ARM Linux
2012-09-16 13:24     ` Ming Lei
2012-09-26 20:08       ` Greg Kroah-Hartman
2012-09-26 20:23         ` Russell King - ARM Linux
2012-09-26 20:36           ` Greg Kroah-Hartman
2012-09-26 23:47           ` Ming Lei
2012-09-27 13:58             ` Russell King - ARM Linux
2012-09-27 14:03               ` Russell King - ARM Linux
2012-09-27 14:15                 ` Ming Lei
2012-09-27 17:22                 ` Joachim Eastwood
2012-09-27 17:26                   ` Mark Brown
2012-09-28  0:30                   ` Ming Lei

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).