* [RFC PATCH 0/2] driver core: kick deferred probe from delayed context @ 2021-08-17 19:00 Pierre-Louis Bossart 2021-08-17 19:00 ` [RFC PATCH 1/2] driver core: export driver_deferred_probe_trigger() Pierre-Louis Bossart 2021-08-17 19:00 ` [RFC PATCH 2/2] ASoC: SOF: trigger re-probing of deferred devices from workqueue Pierre-Louis Bossart 0 siblings, 2 replies; 17+ messages in thread From: Pierre-Louis Bossart @ 2021-08-17 19:00 UTC (permalink / raw) To: alsa-devel Cc: tiwai, broonie, vkoul, liam.r.girdwood, Andy Shevchenko, Dan Williams, Jason Gunthorpe, Christoph Hellwig, Greg Kroah-Hartman, Rafael J . Wysocki, linux-kernel, Pierre-Louis Bossart The deferred probe mechanism uses a successful driver probe/attach as a trigger to revisit the list of deferred probe devices. This works in most cases, except when the probe success is not a valid indicator of resources being available. In that case, a race condition may occur, where the device/driver core framework will attempt to probe a device that depends on resources before those resources are available, resulting in a -EPROBE_DEFER error and a deferred probe device that will never be initialized. The example provided in this RFC relies on the probe workqueue used for the HDaudio support where we simultaneously: a) need to use request_module() b) cannot use an async probe due to the use of request_module() c) cannot block the probe of other drivers In this example, the deferred probe can be kicked when the workqueue completes. The use of request_firmware_nowait() is another conceptual example, where a domain-specific callback can enable resources *after* the probe returns, for example by downloading the firmware, booting a processor and waiting for the processor to be ready for interaction with the Linux host. In this second example, the deferred probe could be kicked when the 'cont' callback completes. This patchset suggests a 7-line change to solve race conditions in these examples with delayed work. Discussion: a) During Intel internal reviews, Andy Shevchenko pointed out another known issue with deferred probe [1]. This patchset is unrelated and does not claim to solve the problem raised by Andy. b) one possible objection is that this patchset does not suppress a possibly unnecessary round of evaluation of deferred probe devices. It did not feel necessary to any of us to minimize the occurrences of EPROBE_DEFER but instead to make sure the device waiting for resources successfully probes in the end. c) another objection might be that the driver core should know about such dependencies. It would be desirable but in the cases we've encountered such dependencies are highly domain-specific and not necessarily straightforward to describe. There's been multiple endeavors to improve the description of dependencies, this patchset only focuses on the deferred probe framework, with an improvement when the provider of resources makes these resources available after its probe returns. [1] https://lore.kernel.org/lkml/20200324175719.62496-1-andriy.shevchenko@linux.intel.com/T/#u Pierre-Louis Bossart (2): driver core: export driver_deferred_probe_trigger() ASoC: SOF: trigger re-probing of deferred devices from workqueue drivers/base/dd.c | 3 ++- include/linux/device/driver.h | 1 + sound/soc/sof/core.c | 3 +++ 3 files changed, 6 insertions(+), 1 deletion(-) base-commit: 8d1998893cd5e3488cd95529f60a187e3009d14b -- 2.25.1 ^ permalink raw reply [flat|nested] 17+ messages in thread
* [RFC PATCH 1/2] driver core: export driver_deferred_probe_trigger() 2021-08-17 19:00 [RFC PATCH 0/2] driver core: kick deferred probe from delayed context Pierre-Louis Bossart @ 2021-08-17 19:00 ` Pierre-Louis Bossart 2021-08-18 5:44 ` Greg Kroah-Hartman 2021-08-17 19:00 ` [RFC PATCH 2/2] ASoC: SOF: trigger re-probing of deferred devices from workqueue Pierre-Louis Bossart 1 sibling, 1 reply; 17+ messages in thread From: Pierre-Louis Bossart @ 2021-08-17 19:00 UTC (permalink / raw) To: alsa-devel Cc: tiwai, broonie, vkoul, liam.r.girdwood, Andy Shevchenko, Dan Williams, Jason Gunthorpe, Christoph Hellwig, Greg Kroah-Hartman, Rafael J . Wysocki, linux-kernel, Pierre-Louis Bossart, Geert Uytterhoeven The premise of the deferred probe implementation is that a successful driver binding is a proxy for the resources provided by this driver becoming available. While this is a correct assumption in most of the cases, there are exceptions to the rule such as a) the use of request_firmware_nowait(). In this case, the resources may become available when the 'cont' callback completes, for example when if the firmware needs to be downloaded and executed on a SoC core or DSP. b) a split implementation of the probe with a workqueue when one or ore request_module() calls are required: a synchronous probe prevents other drivers from probing, impacting boot time, and an async probe is not allowed to avoid a deadlock. This is the case on all Intel audio platforms, with request_module() being required for the i915 display audio and HDaudio external codecs. In these cases, there is no way to notify the deferred probe infrastructure of the enablement of resources after the driver binding. The driver_deferred_probe_trigger() function is currently used 'anytime a driver is successfully bound to a device', this patch suggest exporing by exporting it so that drivers can kick-off re-probing of deferred devices at the end of a deferred processing. Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> --- drivers/base/dd.c | 3 ++- include/linux/device/driver.h | 1 + 2 files changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/base/dd.c b/drivers/base/dd.c index 437cd61343b2..33eca45aa65a 100644 --- a/drivers/base/dd.c +++ b/drivers/base/dd.c @@ -171,7 +171,7 @@ static bool driver_deferred_probe_enable = false; * changes in the midst of a probe, then deferred processing should be triggered * again. */ -static void driver_deferred_probe_trigger(void) +void driver_deferred_probe_trigger(void) { if (!driver_deferred_probe_enable) return; @@ -193,6 +193,7 @@ static void driver_deferred_probe_trigger(void) */ queue_work(system_unbound_wq, &deferred_probe_work); } +EXPORT_SYMBOL_GPL(driver_deferred_probe_trigger); /** * device_block_probing() - Block/defer device's probes diff --git a/include/linux/device/driver.h b/include/linux/device/driver.h index a498ebcf4993..2eec79d752a9 100644 --- a/include/linux/device/driver.h +++ b/include/linux/device/driver.h @@ -240,6 +240,7 @@ extern int driver_deferred_probe_timeout; void driver_deferred_probe_add(struct device *dev); int driver_deferred_probe_check_state(struct device *dev); void driver_init(void); +void driver_deferred_probe_trigger(void); /** * module_driver() - Helper macro for drivers that don't do anything -- 2.25.1 ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [RFC PATCH 1/2] driver core: export driver_deferred_probe_trigger() 2021-08-17 19:00 ` [RFC PATCH 1/2] driver core: export driver_deferred_probe_trigger() Pierre-Louis Bossart @ 2021-08-18 5:44 ` Greg Kroah-Hartman 2021-08-18 11:57 ` Mark Brown 0 siblings, 1 reply; 17+ messages in thread From: Greg Kroah-Hartman @ 2021-08-18 5:44 UTC (permalink / raw) To: Pierre-Louis Bossart Cc: alsa-devel, tiwai, broonie, vkoul, liam.r.girdwood, Andy Shevchenko, Dan Williams, Jason Gunthorpe, Christoph Hellwig, Rafael J . Wysocki, linux-kernel, Geert Uytterhoeven On Tue, Aug 17, 2021 at 02:00:56PM -0500, Pierre-Louis Bossart wrote: > The premise of the deferred probe implementation is that a successful > driver binding is a proxy for the resources provided by this driver > becoming available. While this is a correct assumption in most of the > cases, there are exceptions to the rule such as > > a) the use of request_firmware_nowait(). In this case, the resources > may become available when the 'cont' callback completes, for example > when if the firmware needs to be downloaded and executed on a SoC > core or DSP. > > b) a split implementation of the probe with a workqueue when one or > ore request_module() calls are required: a synchronous probe prevents > other drivers from probing, impacting boot time, and an async probe is > not allowed to avoid a deadlock. This is the case on all Intel audio > platforms, with request_module() being required for the i915 display > audio and HDaudio external codecs. > > In these cases, there is no way to notify the deferred probe > infrastructure of the enablement of resources after the driver > binding. Then just wait for it to happen naturally? > The driver_deferred_probe_trigger() function is currently used > 'anytime a driver is successfully bound to a device', this patch > suggest exporing by exporting it so that drivers can kick-off > re-probing of deferred devices at the end of a deferred processing. I really do not want to export this as it will get really messy very quickly with different drivers/busses attempting to call this. Either handle it in your driver (why do you have to defer probe at all, just succeed and move on to register the needed stuff after you are initialized) or rely on the driver core here. thanks, greg k-h ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC PATCH 1/2] driver core: export driver_deferred_probe_trigger() 2021-08-18 5:44 ` Greg Kroah-Hartman @ 2021-08-18 11:57 ` Mark Brown 2021-08-18 13:22 ` Greg Kroah-Hartman 0 siblings, 1 reply; 17+ messages in thread From: Mark Brown @ 2021-08-18 11:57 UTC (permalink / raw) To: Greg Kroah-Hartman Cc: Pierre-Louis Bossart, alsa-devel, tiwai, vkoul, liam.r.girdwood, Andy Shevchenko, Dan Williams, Jason Gunthorpe, Christoph Hellwig, Rafael J . Wysocki, linux-kernel, Geert Uytterhoeven [-- Attachment #1: Type: text/plain, Size: 2504 bytes --] On Wed, Aug 18, 2021 at 07:44:39AM +0200, Greg Kroah-Hartman wrote: > On Tue, Aug 17, 2021 at 02:00:56PM -0500, Pierre-Louis Bossart wrote: > > In these cases, there is no way to notify the deferred probe > > infrastructure of the enablement of resources after the driver > > binding. > Then just wait for it to happen naturally? Through what mechanism will it happen naturally? Deferred probe currently only does things if things are being registered or if probes complete. > > The driver_deferred_probe_trigger() function is currently used > > 'anytime a driver is successfully bound to a device', this patch > > suggest exporing by exporting it so that drivers can kick-off > > re-probing of deferred devices at the end of a deferred processing. > I really do not want to export this as it will get really messy very > quickly with different drivers/busses attempting to call this. I'm not sure I see the mess here - it's just queueing some work, one of the things that the workqueue stuff does well is handle things getting scheduled while they're already queued. Honestly having understood their problem I think we need to be adding these calls into all the resource provider APIs. > Either handle it in your driver (why do you have to defer probe at all, > just succeed and move on to register the needed stuff after you are > initialized) or rely on the driver core here. That's exactly what they're doing currently and the driver core isn't delivering. Driver A is slow to start up and providing a resource to driver B, this gets handled in driver A by succeeding immediately and then registering the resource once the startup has completed. Unfortunately while that was happening not only has driver B registered and deferred but the rest of the probes/defers in the system have completed so the deferred probe mechanism is idle. Nothing currently tells the deferred probe mechanism that a new resource is now available so it never retries the probe of driver B. The only way I can see to fix this without modifying the driver core is to make driver A block during probe but that would at best slow down boot. The issue is that the driver core is using drivers completing probe as a proxy for resources becoming available. That works most of the time because most probes are fully synchronous but it breaks down if a resource provider registers resources outside of probe, we might still be fine if system boot is still happening and something else probes but only through luck. [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC PATCH 1/2] driver core: export driver_deferred_probe_trigger() 2021-08-18 11:57 ` Mark Brown @ 2021-08-18 13:22 ` Greg Kroah-Hartman 2021-08-18 13:48 ` Mark Brown 0 siblings, 1 reply; 17+ messages in thread From: Greg Kroah-Hartman @ 2021-08-18 13:22 UTC (permalink / raw) To: Mark Brown Cc: Pierre-Louis Bossart, alsa-devel, tiwai, vkoul, liam.r.girdwood, Andy Shevchenko, Dan Williams, Jason Gunthorpe, Christoph Hellwig, Rafael J . Wysocki, linux-kernel, Geert Uytterhoeven On Wed, Aug 18, 2021 at 12:57:36PM +0100, Mark Brown wrote: > On Wed, Aug 18, 2021 at 07:44:39AM +0200, Greg Kroah-Hartman wrote: > > On Tue, Aug 17, 2021 at 02:00:56PM -0500, Pierre-Louis Bossart wrote: > > > > In these cases, there is no way to notify the deferred probe > > > infrastructure of the enablement of resources after the driver > > > binding. > > > Then just wait for it to happen naturally? > > Through what mechanism will it happen naturally? Deferred probe > currently only does things if things are being registered or if probes > complete. > > > > The driver_deferred_probe_trigger() function is currently used > > > 'anytime a driver is successfully bound to a device', this patch > > > suggest exporing by exporting it so that drivers can kick-off > > > re-probing of deferred devices at the end of a deferred processing. > > > I really do not want to export this as it will get really messy very > > quickly with different drivers/busses attempting to call this. > > I'm not sure I see the mess here - it's just queueing some work, one of > the things that the workqueue stuff does well is handle things getting > scheduled while they're already queued. Honestly having understood > their problem I think we need to be adding these calls into all the > resource provider APIs. > > > Either handle it in your driver (why do you have to defer probe at all, > > just succeed and move on to register the needed stuff after you are > > initialized) or rely on the driver core here. > > That's exactly what they're doing currently and the driver core isn't > delivering. > > Driver A is slow to start up and providing a resource to driver B, this > gets handled in driver A by succeeding immediately and then registering > the resource once the startup has completed. Unfortunately while that > was happening not only has driver B registered and deferred but the rest > of the probes/defers in the system have completed so the deferred probe > mechanism is idle. Nothing currently tells the deferred probe mechanism > that a new resource is now available so it never retries the probe of > driver B. The only way I can see to fix this without modifying the > driver core is to make driver A block during probe but that would at > best slow down boot. > > The issue is that the driver core is using drivers completing probe as a > proxy for resources becoming available. That works most of the time > because most probes are fully synchronous but it breaks down if a > resource provider registers resources outside of probe, we might still > be fine if system boot is still happening and something else probes but > only through luck. The driver core is not using that as a proxy, that is up to the driver itself or not. All probe means is "yes, this driver binds to this device, thank you!" for that specific bus/class type. That's all, if the driver needs to go off and do real work before it can properly control the device, wonderful, have it go and do that async. So if you know you should be binding to the device, great, kick off some other work and return success from probe. There's no reason you have to delay or defer for no good reason, right? But yes, if you do get new resources, the probe should be called again, that's what the deferred logic is for (or is that the link logic, I can't recall) This shouldn't be a new thing, no needing to call the driver core directly like this at all, it should "just happen", right? thanks, greg k-h ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC PATCH 1/2] driver core: export driver_deferred_probe_trigger() 2021-08-18 13:22 ` Greg Kroah-Hartman @ 2021-08-18 13:48 ` Mark Brown 2021-08-18 14:51 ` Pierre-Louis Bossart 0 siblings, 1 reply; 17+ messages in thread From: Mark Brown @ 2021-08-18 13:48 UTC (permalink / raw) To: Greg Kroah-Hartman Cc: Pierre-Louis Bossart, alsa-devel, tiwai, vkoul, liam.r.girdwood, Andy Shevchenko, Dan Williams, Jason Gunthorpe, Christoph Hellwig, Rafael J . Wysocki, linux-kernel, Geert Uytterhoeven [-- Attachment #1: Type: text/plain, Size: 2242 bytes --] On Wed, Aug 18, 2021 at 03:22:19PM +0200, Greg Kroah-Hartman wrote: > On Wed, Aug 18, 2021 at 12:57:36PM +0100, Mark Brown wrote: > > The issue is that the driver core is using drivers completing probe as a > > proxy for resources becoming available. That works most of the time > > because most probes are fully synchronous but it breaks down if a > > resource provider registers resources outside of probe, we might still > > be fine if system boot is still happening and something else probes but > > only through luck. > The driver core is not using that as a proxy, that is up to the driver > itself or not. All probe means is "yes, this driver binds to this > device, thank you!" for that specific bus/class type. That's all, if > the driver needs to go off and do real work before it can properly > control the device, wonderful, have it go and do that async. Right, which is what is happening here - but the deferred probe machinery in the core is reading more into the probe succeeding than it should. > So if you know you should be binding to the device, great, kick off some > other work and return success from probe. There's no reason you have to > delay or defer for no good reason, right? The driver that's deferring isn't the one that takes a long time to probe - the driver that's deferring depends on the driver that takes a long time to probe, it defers because the resource it needs isn't available when it tries to probe as the slow device is still doing it's thing asynchronously. The problem is that the driver core isn't going back and attempting to probe the deferred device again once the driver that took a long time has provided resources. > But yes, if you do get new resources, the probe should be called again, > that's what the deferred logic is for (or is that the link logic, I > can't recall) This shouldn't be a new thing, no needing to call the > driver core directly like this at all, it should "just happen", right? How specifically does new resources becoming available directly cause a new probe deferral run at the moment? I can't see anything that resource provider APIs are doing to say that a new resource has become available, this patch is trying to provide something they can do. [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC PATCH 1/2] driver core: export driver_deferred_probe_trigger() 2021-08-18 13:48 ` Mark Brown @ 2021-08-18 14:51 ` Pierre-Louis Bossart 2021-08-18 14:59 ` Dan Williams 2021-08-18 15:28 ` Greg Kroah-Hartman 0 siblings, 2 replies; 17+ messages in thread From: Pierre-Louis Bossart @ 2021-08-18 14:51 UTC (permalink / raw) To: Mark Brown, Greg Kroah-Hartman Cc: alsa-devel, Rafael J . Wysocki, tiwai, linux-kernel, liam.r.girdwood, vkoul, Geert Uytterhoeven, Jason Gunthorpe, Dan Williams, Andy Shevchenko, Christoph Hellwig >>> The issue is that the driver core is using drivers completing probe as a >>> proxy for resources becoming available. That works most of the time >>> because most probes are fully synchronous but it breaks down if a >>> resource provider registers resources outside of probe, we might still >>> be fine if system boot is still happening and something else probes but >>> only through luck. > >> The driver core is not using that as a proxy, that is up to the driver >> itself or not. All probe means is "yes, this driver binds to this >> device, thank you!" for that specific bus/class type. That's all, if >> the driver needs to go off and do real work before it can properly >> control the device, wonderful, have it go and do that async. > > Right, which is what is happening here - but the deferred probe > machinery in the core is reading more into the probe succeeding than it > should. I think Greg was referring to the use of the PROBE_PREFER_ASYNCHRONOUS probe type. We tried just that and got a nice WARN_ON because we are using request_module() to deal with HDaudio codecs. The details are in [1] but the kernel code is unambiguous... /* * We don't allow synchronous module loading from async. Module * init may invoke async_synchronize_full() which will end up * waiting for this task which already is waiting for the module * loading to complete, leading to a deadlock. */ WARN_ON_ONCE(wait && current_is_async()); The reason why we use a workqueue is because we are otherwise painted in a corner by conflicting requirements. a) we have to use request_module() b) we cannot use the async probe because of the request_module() c) we have to avoid blocking on boot I understand the resistance to exporting this function, no one in our team was really happy about it, but no one could find an alternate solution. If there is something better, I am all ears. Thanks -Pierre [1] https://github.com/thesofproject/linux/pull/3079 ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC PATCH 1/2] driver core: export driver_deferred_probe_trigger() 2021-08-18 14:51 ` Pierre-Louis Bossart @ 2021-08-18 14:59 ` Dan Williams 2021-08-18 15:28 ` Greg Kroah-Hartman 1 sibling, 0 replies; 17+ messages in thread From: Dan Williams @ 2021-08-18 14:59 UTC (permalink / raw) To: Pierre-Louis Bossart Cc: Mark Brown, Greg Kroah-Hartman, alsa-devel, Rafael J . Wysocki, Takashi Iwai, Linux Kernel Mailing List, Liam Girdwood, Vinod Koul, Geert Uytterhoeven, Jason Gunthorpe, Andy Shevchenko, Christoph Hellwig On Wed, Aug 18, 2021 at 7:52 AM Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> wrote: > > > > >>> The issue is that the driver core is using drivers completing probe as a > >>> proxy for resources becoming available. That works most of the time > >>> because most probes are fully synchronous but it breaks down if a > >>> resource provider registers resources outside of probe, we might still > >>> be fine if system boot is still happening and something else probes but > >>> only through luck. > > > >> The driver core is not using that as a proxy, that is up to the driver > >> itself or not. All probe means is "yes, this driver binds to this > >> device, thank you!" for that specific bus/class type. That's all, if > >> the driver needs to go off and do real work before it can properly > >> control the device, wonderful, have it go and do that async. > > > > Right, which is what is happening here - but the deferred probe > > machinery in the core is reading more into the probe succeeding than it > > should. > > I think Greg was referring to the use of the PROBE_PREFER_ASYNCHRONOUS > probe type. We tried just that and got a nice WARN_ON because we are > using request_module() to deal with HDaudio codecs. The details are in > [1] but the kernel code is unambiguous... > > /* > * We don't allow synchronous module loading from async. Module > * init may invoke async_synchronize_full() which will end up > * waiting for this task which already is waiting for the module > * loading to complete, leading to a deadlock. > */ > WARN_ON_ONCE(wait && current_is_async()); > > > The reason why we use a workqueue is because we are otherwise painted in > a corner by conflicting requirements. > > a) we have to use request_module() > b) we cannot use the async probe because of the request_module() > c) we have to avoid blocking on boot > > I understand the resistance to exporting this function, no one in our > team was really happy about it, but no one could find an alternate > solution. If there is something better, I am all ears. Additionally you mentioned that the consumer is unknown to the producer, so you are not able, for example, to use the newly exported device_driver_attach() to directly trigger the unblocked dependency. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC PATCH 1/2] driver core: export driver_deferred_probe_trigger() 2021-08-18 14:51 ` Pierre-Louis Bossart 2021-08-18 14:59 ` Dan Williams @ 2021-08-18 15:28 ` Greg Kroah-Hartman 2021-08-18 15:53 ` Pierre-Louis Bossart 1 sibling, 1 reply; 17+ messages in thread From: Greg Kroah-Hartman @ 2021-08-18 15:28 UTC (permalink / raw) To: Pierre-Louis Bossart Cc: Mark Brown, alsa-devel, Rafael J . Wysocki, tiwai, linux-kernel, liam.r.girdwood, vkoul, Geert Uytterhoeven, Jason Gunthorpe, Dan Williams, Andy Shevchenko, Christoph Hellwig On Wed, Aug 18, 2021 at 09:51:51AM -0500, Pierre-Louis Bossart wrote: > > > >>> The issue is that the driver core is using drivers completing probe as a > >>> proxy for resources becoming available. That works most of the time > >>> because most probes are fully synchronous but it breaks down if a > >>> resource provider registers resources outside of probe, we might still > >>> be fine if system boot is still happening and something else probes but > >>> only through luck. > > > >> The driver core is not using that as a proxy, that is up to the driver > >> itself or not. All probe means is "yes, this driver binds to this > >> device, thank you!" for that specific bus/class type. That's all, if > >> the driver needs to go off and do real work before it can properly > >> control the device, wonderful, have it go and do that async. > > > > Right, which is what is happening here - but the deferred probe > > machinery in the core is reading more into the probe succeeding than it > > should. > > I think Greg was referring to the use of the PROBE_PREFER_ASYNCHRONOUS > probe type. We tried just that and got a nice WARN_ON because we are > using request_module() to deal with HDaudio codecs. The details are in > [1] but the kernel code is unambiguous... > > /* > * We don't allow synchronous module loading from async. Module > * init may invoke async_synchronize_full() which will end up > * waiting for this task which already is waiting for the module > * loading to complete, leading to a deadlock. > */ > WARN_ON_ONCE(wait && current_is_async()); > > > The reason why we use a workqueue is because we are otherwise painted in > a corner by conflicting requirements. > > a) we have to use request_module() Wait, why? module loading is async, use auto-loading when the hardware/device is found and reported to userspace. Forcing a module to load by the kernel is not always wise as the module is not always present in the filesystem at that point in time at boot (think modules on the filesystem, not in the initramfs). Try fixing this issue and maybe it will resolve itself as you should be working async. thanks, greg k-h ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC PATCH 1/2] driver core: export driver_deferred_probe_trigger() 2021-08-18 15:28 ` Greg Kroah-Hartman @ 2021-08-18 15:53 ` Pierre-Louis Bossart 2021-08-18 16:49 ` Greg Kroah-Hartman 0 siblings, 1 reply; 17+ messages in thread From: Pierre-Louis Bossart @ 2021-08-18 15:53 UTC (permalink / raw) To: Greg Kroah-Hartman Cc: Mark Brown, alsa-devel, Rafael J . Wysocki, tiwai, linux-kernel, liam.r.girdwood, vkoul, Geert Uytterhoeven, Jason Gunthorpe, Dan Williams, Andy Shevchenko, Christoph Hellwig >> a) we have to use request_module() > > Wait, why? > > module loading is async, use auto-loading when the hardware/device is > found and reported to userspace. Forcing a module to load by the kernel > is not always wise as the module is not always present in the filesystem > at that point in time at boot (think modules on the filesystem, not in > the initramfs). > > Try fixing this issue and maybe it will resolve itself as you should be > working async. It's been that way for a very long time (2015?) for HDAudio support, see sound/pci/hda/hda_bind.c. It's my understanding that it was a conscious design decision to use vendor-specific modules, if available, and fallback to generic modules if the first pass failed. Takashi, you may want to chime in... ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC PATCH 1/2] driver core: export driver_deferred_probe_trigger() 2021-08-18 15:53 ` Pierre-Louis Bossart @ 2021-08-18 16:49 ` Greg Kroah-Hartman 2021-08-18 17:52 ` Mark Brown 2021-08-18 18:09 ` Pierre-Louis Bossart 0 siblings, 2 replies; 17+ messages in thread From: Greg Kroah-Hartman @ 2021-08-18 16:49 UTC (permalink / raw) To: Pierre-Louis Bossart Cc: Mark Brown, alsa-devel, Rafael J . Wysocki, tiwai, linux-kernel, liam.r.girdwood, vkoul, Geert Uytterhoeven, Jason Gunthorpe, Dan Williams, Andy Shevchenko, Christoph Hellwig On Wed, Aug 18, 2021 at 10:53:07AM -0500, Pierre-Louis Bossart wrote: > > > > >> a) we have to use request_module() > > > > Wait, why? > > > > module loading is async, use auto-loading when the hardware/device is > > found and reported to userspace. Forcing a module to load by the kernel > > is not always wise as the module is not always present in the filesystem > > at that point in time at boot (think modules on the filesystem, not in > > the initramfs). > > > > Try fixing this issue and maybe it will resolve itself as you should be > > working async. > > It's been that way for a very long time (2015?) for HDAudio support, see > sound/pci/hda/hda_bind.c. It's my understanding that it was a conscious > design decision to use vendor-specific modules, if available, and > fallback to generic modules if the first pass failed. If it has been this way for so long, what has caused the sudden change to need to export this and call this function? ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC PATCH 1/2] driver core: export driver_deferred_probe_trigger() 2021-08-18 16:49 ` Greg Kroah-Hartman @ 2021-08-18 17:52 ` Mark Brown 2021-08-18 18:09 ` Pierre-Louis Bossart 1 sibling, 0 replies; 17+ messages in thread From: Mark Brown @ 2021-08-18 17:52 UTC (permalink / raw) To: Greg Kroah-Hartman Cc: Pierre-Louis Bossart, alsa-devel, Rafael J . Wysocki, tiwai, linux-kernel, liam.r.girdwood, vkoul, Geert Uytterhoeven, Jason Gunthorpe, Dan Williams, Andy Shevchenko, Christoph Hellwig [-- Attachment #1: Type: text/plain, Size: 630 bytes --] On Wed, Aug 18, 2021 at 06:49:51PM +0200, Greg Kroah-Hartman wrote: > On Wed, Aug 18, 2021 at 10:53:07AM -0500, Pierre-Louis Bossart wrote: > > It's been that way for a very long time (2015?) for HDAudio support, see > > sound/pci/hda/hda_bind.c. It's my understanding that it was a conscious > > design decision to use vendor-specific modules, if available, and > > fallback to generic modules if the first pass failed. > If it has been this way for so long, what has caused the sudden change > to need to export this and call this function? The usage predates the hardware that requires firmware downloads - that's very new. [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC PATCH 1/2] driver core: export driver_deferred_probe_trigger() 2021-08-18 16:49 ` Greg Kroah-Hartman 2021-08-18 17:52 ` Mark Brown @ 2021-08-18 18:09 ` Pierre-Louis Bossart 2021-08-18 18:28 ` Mark Brown 1 sibling, 1 reply; 17+ messages in thread From: Pierre-Louis Bossart @ 2021-08-18 18:09 UTC (permalink / raw) To: Greg Kroah-Hartman Cc: alsa-devel, Rafael J . Wysocki, tiwai, linux-kernel, liam.r.girdwood, vkoul, Mark Brown, Geert Uytterhoeven, Jason Gunthorpe, Dan Williams, Andy Shevchenko, Christoph Hellwig >>>> a) we have to use request_module() >>> >>> Wait, why? >>> >>> module loading is async, use auto-loading when the hardware/device is >>> found and reported to userspace. Forcing a module to load by the kernel >>> is not always wise as the module is not always present in the filesystem >>> at that point in time at boot (think modules on the filesystem, not in >>> the initramfs). >>> >>> Try fixing this issue and maybe it will resolve itself as you should be >>> working async. >> >> It's been that way for a very long time (2015?) for HDAudio support, see >> sound/pci/hda/hda_bind.c. It's my understanding that it was a conscious >> design decision to use vendor-specific modules, if available, and >> fallback to generic modules if the first pass failed. > > If it has been this way for so long, what has caused the sudden change > to need to export this and call this function? Fair question, I did not provide all the context with a cover letter that was already quite long. Here are more details: In the existing Intel audio drivers, we have a PCI device that first get probed. The PCI driver initializes the DSP and exposes what the audio DSP can do, but the platform-specific configuration for a given board is handled by a child device [1]. We have all kinds of hard-coded lookup tables to figure out what the board is and what machine driver should be used based on the presence of other ACPI devices and/or DMI quirks [2][3]. We must have used this solution since 2010, mainly because 'the other OS' does not rely on platform firmware for a description of the audio capabilities. In the 'soon' future, that machine driver will probed with its own ACPI ID and become generic, with all the information related to the board described in platform firmware and parsed by the driver. This is how the 'simple card' works today in Device Tree environments, platform firmware describes how host-provided components are connected to 3rd-party components. I cannot provide more details at this time since this is a not yet a publicly-available specification (this specification work does take place in a standardization body). That change in how the machine driver gets probed creates a new problem we didn't have before: this generic machine driver will probe in the early stages of the boot, long before the DSP and audio codecs are initialized/available. I initially looked at the component framework to try to express dependencies. It's really not clear to me if this is the 'right' direction, for ASoC-based solutions we already have components that register with a core. I also started looking at other proposals that were made over the years, this problem of expressing dependencies is not new. No real luck. In the end, since the DeviceTree-based solutions based on deferred probes work fine for the same type of usages, I tried to reuse the same deferred probe mechanism. The only reason why I needed to export this function is to work-around the request_module() use. I am not claiming any award for architecture, this is clearly a domain-specific corner case. I did try the async probe, I consulted with Marc Brown, had an internal review with Dan Williams and Andy Shevchenko. While nobody cheered, it seemed like this export was 'reasonable' compared to a re-architecture of the HDaudio/HDMI support - which is a really scary proposition. There is no immediate rush to make this change in this kernel cycle or the next, I am open to alternatives, but I wanted to make sure we don't have any Linux plumbing issues by the time the specification becomes public and is used by 'the other OS'. Does this help get more context? [1] https://elixir.bootlin.com/linux/latest/source/sound/soc/sof/core.c#L234 [2] https://elixir.bootlin.com/linux/latest/source/sound/soc/intel/common/soc-acpi-intel-tgl-match.c#L323 [3] https://elixir.bootlin.com/linux/latest/source/sound/soc/intel/boards/sof_sdw.c#L50 ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC PATCH 1/2] driver core: export driver_deferred_probe_trigger() 2021-08-18 18:09 ` Pierre-Louis Bossart @ 2021-08-18 18:28 ` Mark Brown 0 siblings, 0 replies; 17+ messages in thread From: Mark Brown @ 2021-08-18 18:28 UTC (permalink / raw) To: Pierre-Louis Bossart Cc: Greg Kroah-Hartman, alsa-devel, Rafael J . Wysocki, tiwai, linux-kernel, liam.r.girdwood, vkoul, Geert Uytterhoeven, Jason Gunthorpe, Dan Williams, Andy Shevchenko, Christoph Hellwig [-- Attachment #1: Type: text/plain, Size: 1422 bytes --] On Wed, Aug 18, 2021 at 01:09:44PM -0500, Pierre-Louis Bossart wrote: > I initially looked at the component framework to try to express > dependencies. It's really not clear to me if this is the 'right' > direction, for ASoC-based solutions we already have components that > register with a core. Historically (long before both deferred probe and the component framework) ASoC used to implement a mechanism that essentially did deferred probe for the dependencies - it'd maintain it's own lists of dependencies and then tell the machine driver and all the components when the card was ready. Once deferred probe was there we dropped all the open coded deferral stuff since it was just reimplementing what deferred probe does in a slightly more complicated fashion (it tracked the dependencies in a finer grained manner, though the result wasn't any different). See b19e6e7b76 (ASoC: core: Use driver core probe deferral) for the conversion. What ASoC is doing with the cards is fundamentally the same thing as what the component helpers are doing, we could in theory convert to using that but unlike with probe deferral it doesn't really save us any work and we'd still need all the card level tracking we've got to connect the various bits of the card together and order things. If we were starting from scratch we would probably use components but there's far more pressing things to be getting on with otherwise. [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 17+ messages in thread
* [RFC PATCH 2/2] ASoC: SOF: trigger re-probing of deferred devices from workqueue 2021-08-17 19:00 [RFC PATCH 0/2] driver core: kick deferred probe from delayed context Pierre-Louis Bossart 2021-08-17 19:00 ` [RFC PATCH 1/2] driver core: export driver_deferred_probe_trigger() Pierre-Louis Bossart @ 2021-08-17 19:00 ` Pierre-Louis Bossart 2021-08-18 12:07 ` Mark Brown 1 sibling, 1 reply; 17+ messages in thread From: Pierre-Louis Bossart @ 2021-08-17 19:00 UTC (permalink / raw) To: alsa-devel Cc: tiwai, broonie, vkoul, liam.r.girdwood, Andy Shevchenko, Dan Williams, Jason Gunthorpe, Christoph Hellwig, Greg Kroah-Hartman, Rafael J . Wysocki, linux-kernel, Pierre-Louis Bossart, Liam Girdwood, Ranjani Sridharan, Kai Vehmanen, Daniel Baluta, Jaroslav Kysela, Takashi Iwai, moderated list:SOUND - SOUND OPEN FIRMWARE (SOF) DRIVERS Audio drivers such as HDaudio legacy and SOF rely on a workqueue to split the probe into two, with a first pass returning success immediately, and the second pass taking a lot more time due to the use of request_module() and the DSP initializations. This workqueue-based solution helps deal with conflicting requirements a) other drivers should not be blocked by a long probe b) a PROBE_PREFER_ASYNCHRONOUS probe_type is explicitly not allowed to avoid a deadlock when request_module() is used. This patch makes sure the deferred probe framework is triggered when the provider of resources successfully completes its initialization. Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> --- sound/soc/sof/core.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/sound/soc/sof/core.c b/sound/soc/sof/core.c index 3e4dd4a86363..cecc0e914807 100644 --- a/sound/soc/sof/core.c +++ b/sound/soc/sof/core.c @@ -251,6 +251,9 @@ static int sof_probe_continue(struct snd_sof_dev *sdev) sdev->probe_completed = true; + /* kick-off re-probing of deferred devices */ + driver_deferred_probe_trigger(); + return 0; fw_trace_err: -- 2.25.1 ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [RFC PATCH 2/2] ASoC: SOF: trigger re-probing of deferred devices from workqueue 2021-08-17 19:00 ` [RFC PATCH 2/2] ASoC: SOF: trigger re-probing of deferred devices from workqueue Pierre-Louis Bossart @ 2021-08-18 12:07 ` Mark Brown 2021-08-18 15:25 ` Pierre-Louis Bossart 0 siblings, 1 reply; 17+ messages in thread From: Mark Brown @ 2021-08-18 12:07 UTC (permalink / raw) To: Pierre-Louis Bossart Cc: alsa-devel, tiwai, vkoul, liam.r.girdwood, Andy Shevchenko, Dan Williams, Jason Gunthorpe, Christoph Hellwig, Greg Kroah-Hartman, Rafael J . Wysocki, linux-kernel, Liam Girdwood, Ranjani Sridharan, Kai Vehmanen, Daniel Baluta, Jaroslav Kysela, Takashi Iwai, moderated list:SOUND - SOUND OPEN FIRMWARE (SOF) DRIVERS [-- Attachment #1: Type: text/plain, Size: 605 bytes --] On Tue, Aug 17, 2021 at 02:00:57PM -0500, Pierre-Louis Bossart wrote: > +++ b/sound/soc/sof/core.c > @@ -251,6 +251,9 @@ static int sof_probe_continue(struct snd_sof_dev *sdev) > > sdev->probe_completed = true; > > + /* kick-off re-probing of deferred devices */ > + driver_deferred_probe_trigger(); > + I think we should move this into snd_soc_register_component() - the same issue could occur with any other component, the only other thing I can see kicking in here is the machine driver registration but that ought to kick probe itself anyway. Or is there some other case here? [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC PATCH 2/2] ASoC: SOF: trigger re-probing of deferred devices from workqueue 2021-08-18 12:07 ` Mark Brown @ 2021-08-18 15:25 ` Pierre-Louis Bossart 0 siblings, 0 replies; 17+ messages in thread From: Pierre-Louis Bossart @ 2021-08-18 15:25 UTC (permalink / raw) To: Mark Brown Cc: alsa-devel, Kai Vehmanen, Rafael J . Wysocki, tiwai, Greg Kroah-Hartman, Takashi Iwai, linux-kernel, Liam Girdwood, liam.r.girdwood, vkoul, Ranjani Sridharan, Jason Gunthorpe, Dan Williams, Andy Shevchenko, Daniel Baluta, Christoph Hellwig, moderated list:SOUND - SOUND OPEN FIRMWARE (SOF) DRIVERS On 8/18/21 7:07 AM, Mark Brown wrote: > On Tue, Aug 17, 2021 at 02:00:57PM -0500, Pierre-Louis Bossart wrote: > >> +++ b/sound/soc/sof/core.c >> @@ -251,6 +251,9 @@ static int sof_probe_continue(struct snd_sof_dev *sdev) >> >> sdev->probe_completed = true; >> >> + /* kick-off re-probing of deferred devices */ >> + driver_deferred_probe_trigger(); >> + > > I think we should move this into snd_soc_register_component() - the same > issue could occur with any other component, the only other thing I can > see kicking in here is the machine driver registration but that ought to > kick probe itself anyway. Or is there some other case here? Thanks for the suggestion Mark, it would be more consistent indeed to kick a re-evaluation of the deferred probe list when ASoC components are successfully registered with something like this: diff --git a/sound/soc/soc-core.c b/sound/soc/soc-core.c index c830e96afba2..9d6feea7719c 100644 --- a/sound/soc/soc-core.c +++ b/sound/soc/soc-core.c @@ -2677,7 +2677,14 @@ int snd_soc_register_component(struct device *dev, if (ret < 0) return ret; - return snd_soc_add_component(component, dai_drv, num_dai); + ret = snd_soc_add_component(component, dai_drv, num_dai); + if (ret < 0) + return ret; + + /* kick-off re-probing of deferred devices */ + driver_deferred_probe_trigger(); + + return 0; } EXPORT_SYMBOL_GPL(snd_soc_register_component); In the case of this SOF driver, it'd be completely equivalent to what this patch suggested, the snd_soc_register_component() is what we do last in the workqueue. In the case of 'regular' drivers, the component registration is typically done last as well before the end of the probe. This would result in 2 evaluations (one on successful ASoC component registration and one on successful probe), and maybe on the second evaluation there's nothing to do. I can't think of any negative side-effects. ^ permalink raw reply related [flat|nested] 17+ messages in thread
end of thread, other threads:[~2021-08-18 18:29 UTC | newest] Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2021-08-17 19:00 [RFC PATCH 0/2] driver core: kick deferred probe from delayed context Pierre-Louis Bossart 2021-08-17 19:00 ` [RFC PATCH 1/2] driver core: export driver_deferred_probe_trigger() Pierre-Louis Bossart 2021-08-18 5:44 ` Greg Kroah-Hartman 2021-08-18 11:57 ` Mark Brown 2021-08-18 13:22 ` Greg Kroah-Hartman 2021-08-18 13:48 ` Mark Brown 2021-08-18 14:51 ` Pierre-Louis Bossart 2021-08-18 14:59 ` Dan Williams 2021-08-18 15:28 ` Greg Kroah-Hartman 2021-08-18 15:53 ` Pierre-Louis Bossart 2021-08-18 16:49 ` Greg Kroah-Hartman 2021-08-18 17:52 ` Mark Brown 2021-08-18 18:09 ` Pierre-Louis Bossart 2021-08-18 18:28 ` Mark Brown 2021-08-17 19:00 ` [RFC PATCH 2/2] ASoC: SOF: trigger re-probing of deferred devices from workqueue Pierre-Louis Bossart 2021-08-18 12:07 ` Mark Brown 2021-08-18 15:25 ` Pierre-Louis Bossart
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).