* [RFC PATCH 0/2] ASoC: core/Intel: fix module-in-use error
@ 2019-02-01 17:22 Pierre-Louis Bossart
  2019-02-01 17:22 ` [RFC PATCH 1/2] ASoC: core: don't increase component module refcount unconditionally Pierre-Louis Bossart
  2019-02-01 17:22 ` [RFC PATCH 2/2] ASoC: Intel: Skylake: set .ignore_module_refcount field in component Pierre-Louis Bossart
  0 siblings, 2 replies; 10+ messages in thread
From: Pierre-Louis Bossart @ 2019-02-01 17:22 UTC (permalink / raw)
  To: alsa-devel; +Cc: tiwai, liam.r.girdwood, vkoul, broonie, Pierre-Louis Bossart

Recent simplifications to the SOF driver, which now follows the same
device hierarchy as the Skylake driver, expose a conceptual issue with
the module refcount that blocks module rmmod/modprobe loop tests for
both drivers.

Sending as an RFC since I don't fully understand the initial design
and might have missed a simpler solution.

Pierre-Louis Bossart (2):
  ASoC: core: don't increase component module refcount unconditionally
  ASoC: Intel: Skylake: set .ignore_module_refcount field in component

 include/sound/soc.h               | 3 +++
 sound/soc/intel/skylake/skl-pcm.c | 1 +
 sound/soc/soc-core.c              | 6 ++++--
 3 files changed, 8 insertions(+), 2 deletions(-)

-- 
2.17.1


* [RFC PATCH 1/2] ASoC: core: don't increase component module refcount unconditionally
  2019-02-01 17:22 [RFC PATCH 0/2] ASoC: core/Intel: fix module-in-use error Pierre-Louis Bossart
@ 2019-02-01 17:22 ` Pierre-Louis Bossart
  2019-02-01 18:12   ` Vinod Koul
  2019-02-12 14:22   ` Applied "ASoC: core: don't increase component module refcount unconditionally" to the asoc tree Mark Brown
  2019-02-01 17:22 ` [RFC PATCH 2/2] ASoC: Intel: Skylake: set .ignore_module_refcount field in component Pierre-Louis Bossart
  1 sibling, 2 replies; 10+ messages in thread
From: Pierre-Louis Bossart @ 2019-02-01 17:22 UTC (permalink / raw)
  To: alsa-devel; +Cc: tiwai, liam.r.girdwood, vkoul, broonie, Pierre-Louis Bossart

The ASoC core has for the longest time increased the module reference
counts, even before the transition to the component model. This is
probably fine on most platforms, but it introduces a deadlock case on
Intel devices with the Skylake and SOF drivers which cannot be removed
due to their reference counts being modified by the core.

In these 2 cases, the PCI or ACPI driver .probe creates a platform
device to let the machine driver .probe register the audio
card. Conversely the PCI or ACPI driver .remove will unregister the
platform device which results in the card being removed by the machine
driver .remove.
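
The probe/remove chain described above can be sketched as a small
userspace model. This is only an illustration: every model_* name is
hypothetical, nothing here is a kernel API, and the relative order of
"register component" and "register platform device" inside the PCI/ACPI
probe is an assumption made for the sketch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
static char trace[256];

static void record(const char *event)
{
	/* Append "event;" to the trace, asserting that it still fits. */
	assert(strlen(trace) + strlen(event) + 2 <= sizeof(trace));
	strcat(trace, event);
	strcat(trace, ";");
}

/* The component probe runs as part of the card registration. */
static void model_component_probe(void) { record("component_probe"); }
static void model_component_remove(void) { record("component_remove"); }

/* Machine driver: .probe registers the card, .remove unregisters it. */
static void model_machine_probe(void)
{
	record("machine_probe");
	record("register_card");
	model_component_probe();
}

static void model_machine_remove(void)
{
	record("machine_remove");
	record("unregister_card");
	model_component_remove();
}

/* PCI/ACPI driver: registers the component and the machine platform device. */
static void model_pci_probe(void)
{
	record("pci_probe");
	record("register_component");		/* ordering is an assumption */
	record("register_platform_device");
	model_machine_probe();	/* the new device binds to the machine driver */
}

static void model_pci_remove(void)
{
	record("pci_remove");
	record("unregister_platform_device");
	model_machine_remove();	/* unbinding runs the machine driver .remove */
}

static const char *model_trace(void) { return trace; }
static void model_reset(void) { trace[0] = '\0'; }
```

Calling model_pci_probe() and then reading model_trace() gives the
top-down order of the diagram; model_pci_remove() walks the same chain
in the remove direction.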

With ascii art, this can be represented as

modprobe
snd_soc_skl/
sof-pci-dev/sof-acpi-dev   ----------> pci/acpi probe
       ^                                    |
       |                     ---------------|
       |                    |               |
       |                    V               V
    increase            register        register machine
    refcount            component       platform_device
       ^                                    |
       |                                    |
       |                                    V
    component <----   register card  <---- probe
    probe

The issue is that by playing with the component's module reference
counts during the card registration, it's no longer possible to remove
the module which controls the component. This can be shown, e.g. with
the following error:

root@plb-XPS-13-9350:~# lsmod | grep snd_soc_skl
snd_soc_skl           110592  1

root@plb-XPS-13-9350:~# rmmod snd_soc_skl
rmmod: ERROR: Module snd_soc_skl is in use

Increasing the reference count during the component probe is not
useful. If the PCI/ACPI module is removed, the card will be removed
anyway.

To avoid breaking existing platforms while still allowing Intel platforms
to safely deal with module load/unload cases, this patch introduces a
flag which needs to be set during the component initialization. This
is a strictly opt-in capability that should only be used when the
handling of the component module does not require a reference count
increase to prevent removal during use.
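
As a rough illustration of the opt-in behavior, here is a userspace
model of the two patched code paths. It is a sketch under stated
assumptions, not kernel code: the model_* types and helpers are
hypothetical stand-ins, the simplified try_module_get() always succeeds,
and "can rmmod" is reduced to "reference count is zero".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of the refcount handling; not kernel code. */
struct model_module {
	int refcnt;			/* stands in for the module reference count */
};

struct model_component {
	struct model_module *owner;	/* module that provides the component */
	bool ignore_module_refcount;	/* mirrors the flag added by this patch */
	bool bound;			/* true once attached to a card */
};

/* Simplified try_module_get(): always succeeds in this model. */
static bool model_try_module_get(struct model_module *mod)
{
	mod->refcnt++;
	return true;
}

static void model_module_put(struct model_module *mod)
{
	assert(mod->refcnt > 0);
	mod->refcnt--;
}

/* Mirrors the patched check in soc_probe_component(); returns 0 or -1. */
static int model_probe_component(struct model_component *c)
{
	if (!c->ignore_module_refcount && !model_try_module_get(c->owner))
		return -1;
	c->bound = true;
	return 0;
}

/* Mirrors the patched soc_cleanup_component(). */
static void model_cleanup_component(struct model_component *c)
{
	c->bound = false;
	if (!c->ignore_module_refcount)
		model_module_put(c->owner);
}

/* Model assumption: unload is refused while the count is non-zero. */
static bool model_can_rmmod(const struct model_module *mod)
{
	return mod->refcnt == 0;
}
```

With the flag cleared, probing a component raises the owner's count and
model_can_rmmod() returns false until the cleanup runs. With the flag
set, the count is never touched, so the owning module stays removable
while the card is registered.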

Note that this solution is not directly applicable to the legacy
Atom/SST driver, which uses a different device hierarchy. There are
however additional refcount issues which prevent the ACPI driver from
being removed. This is a different issue which would need a different
patch.

Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
---
 include/sound/soc.h  | 3 +++
 sound/soc/soc-core.c | 6 ++++--
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/include/sound/soc.h b/include/sound/soc.h
index 95689680336b..eb7db605955b 100644
--- a/include/sound/soc.h
+++ b/include/sound/soc.h
@@ -802,6 +802,9 @@ struct snd_soc_component_driver {
 	int probe_order;
 	int remove_order;
 
+	/* signal if the module handling the component cannot be removed */
+	unsigned int ignore_module_refcount:1;
+
 	/* bits */
 	unsigned int idle_bias_on:1;
 	unsigned int suspend_bias_off:1;
diff --git a/sound/soc/soc-core.c b/sound/soc/soc-core.c
index 9dad2b1498c1..80ab81f1df3d 100644
--- a/sound/soc/soc-core.c
+++ b/sound/soc/soc-core.c
@@ -947,7 +947,8 @@ static void soc_cleanup_component(struct snd_soc_component *component)
 	snd_soc_dapm_free(snd_soc_component_get_dapm(component));
 	soc_cleanup_component_debugfs(component);
 	component->card = NULL;
-	module_put(component->dev->driver->owner);
+	if (!component->driver->ignore_module_refcount)
+		module_put(component->dev->driver->owner);
 }
 
 static void soc_remove_component(struct snd_soc_component *component)
@@ -1362,7 +1363,8 @@ static int soc_probe_component(struct snd_soc_card *card,
 		return 0;
 	}
 
-	if (!try_module_get(component->dev->driver->owner))
+	if (!component->driver->ignore_module_refcount &&
+	    !try_module_get(component->dev->driver->owner))
 		return -ENODEV;
 
 	component->card = card;
-- 
2.17.1


* [RFC PATCH 2/2] ASoC: Intel: Skylake: set .ignore_module_refcount field in component
  2019-02-01 17:22 [RFC PATCH 0/2] ASoC: core/Intel: fix module-in-use error Pierre-Louis Bossart
  2019-02-01 17:22 ` [RFC PATCH 1/2] ASoC: core: don't increase component module refcount unconditionally Pierre-Louis Bossart
@ 2019-02-01 17:22 ` Pierre-Louis Bossart
  2019-02-12 14:22   ` Applied "ASoC: Intel: Skylake: set .ignore_module_refcount field in component" to the asoc tree Mark Brown
  1 sibling, 1 reply; 10+ messages in thread
From: Pierre-Louis Bossart @ 2019-02-01 17:22 UTC (permalink / raw)
  To: alsa-devel; +Cc: tiwai, liam.r.girdwood, vkoul, broonie, Pierre-Louis Bossart

There is no risk of the module being removed while the platform
components are in use. This solves the problem of the snd_soc_skl
module not being removable with rmmod.

Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
---
 sound/soc/intel/skylake/skl-pcm.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/sound/soc/intel/skylake/skl-pcm.c b/sound/soc/intel/skylake/skl-pcm.c
index 8e589d698c58..a4284778f117 100644
--- a/sound/soc/intel/skylake/skl-pcm.c
+++ b/sound/soc/intel/skylake/skl-pcm.c
@@ -1464,6 +1464,7 @@ static const struct snd_soc_component_driver skl_component  = {
 	.ops		= &skl_platform_ops,
 	.pcm_new	= skl_pcm_new,
 	.pcm_free	= skl_pcm_free,
+	.ignore_module_refcount = 1, /* do not increase the refcount in core */
 };
 
 int skl_platform_register(struct device *dev)
-- 
2.17.1


* Re: [RFC PATCH 1/2] ASoC: core: don't increase component module refcount unconditionally
  2019-02-01 17:22 ` [RFC PATCH 1/2] ASoC: core: don't increase component module refcount unconditionally Pierre-Louis Bossart
@ 2019-02-01 18:12   ` Vinod Koul
  2019-02-01 18:36     ` Pierre-Louis Bossart
  2019-02-01 20:07     ` Ranjani Sridharan
  2019-02-12 14:22   ` Applied "ASoC: core: don't increase component module refcount unconditionally" to the asoc tree Mark Brown
  1 sibling, 2 replies; 10+ messages in thread
From: Vinod Koul @ 2019-02-01 18:12 UTC (permalink / raw)
  To: Pierre-Louis Bossart; +Cc: tiwai, liam.r.girdwood, alsa-devel, broonie

On 01-02-19, 11:22, Pierre-Louis Bossart wrote:
> The ASoC core has for the longest time increased the module reference
> counts, even before the transition to the component model. This is
> probably fine on most platforms, but it introduces a deadlock case on
> Intel devices with the Skylake and SOF drivers which cannot be removed
> due to their reference counts being modified by the core.
> 
> In these 2 cases, the PCI or ACPI driver .probe creates a platform
> device to let the machine driver .probe register the audio
> card. Conversely the PCI or ACPI driver .remove will unregister the
> platform device which results in the card being removed by the machine
> driver .remove.
> 
> With ascii art, this can be represented as
> 
> modprobe
> snd_soc_skl/
> sof-pci-dev/sof-acpi-dev   ----------> pci/acpi probe
>        ^                                    |
>        |                     ---------------|
>        |                    |               |
>        |                    V               V
>     increase            register        register machine
>     refcount            component       platform_device
>        ^                                    |
>        |                                    |
>        |                                    V
>     component <----   register card  <---- probe
>     probe
> 
> The issue is that by playing with the component's module reference
> counts during the card registration, it's no longer possible to remove
> the module which controls the component. This can be shown, e.g. with
> the following error:
> 
> root@plb-XPS-13-9350:~# lsmod | grep snd_soc_skl
> snd_soc_skl           110592  1
> 
> root@plb-XPS-13-9350:~# rmmod snd_soc_skl
> rmmod: ERROR: Module snd_soc_skl is in use

Yup, that would be correct, the inuse is due to the fact the sound card
is up and someone needs to unload the sound card to remove the
reference.

That can be done by doing the rmmod of machine driver first and IIRC
that would remove the sound card and drop the reference and then
snd_soc_skl can be unloaded.

Now one can argue that is not required, but I feel that is correct for
core to hold references of modules the card is tied up to :)

> Increasing the reference count during the component probe is not
> useful. If the PCI/ACPI module is removed, the card will be removed
> anyway.
> 
> To avoid breaking existing platforms and allowing Intel platforms to
> safely deal with module load/unload cases, this patch introduces a
> flag which needs to be set during the component initialization. This
> is a strictly opt-in capability that should only be used when the
> handling of the component module does not require a reference count
> increase to prevent removal during use.
> 
> Note that this solution is not directly applicable to the legacy
> Atom/SST driver, which uses a different device hierarchy. There are
> however additional refcount issues which prevent the ACPI driver from
> being removed. This is a different issue which would need a different
> patch.
> 
> Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
> ---
>  include/sound/soc.h  | 3 +++
>  sound/soc/soc-core.c | 6 ++++--
>  2 files changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/include/sound/soc.h b/include/sound/soc.h
> index 95689680336b..eb7db605955b 100644
> --- a/include/sound/soc.h
> +++ b/include/sound/soc.h
> @@ -802,6 +802,9 @@ struct snd_soc_component_driver {
>  	int probe_order;
>  	int remove_order;
>  
> +	/* signal if the module handling the component cannot be removed */
> +	unsigned int ignore_module_refcount:1;
> +
>  	/* bits */
>  	unsigned int idle_bias_on:1;
>  	unsigned int suspend_bias_off:1;
> diff --git a/sound/soc/soc-core.c b/sound/soc/soc-core.c
> index 9dad2b1498c1..80ab81f1df3d 100644
> --- a/sound/soc/soc-core.c
> +++ b/sound/soc/soc-core.c
> @@ -947,7 +947,8 @@ static void soc_cleanup_component(struct snd_soc_component *component)
>  	snd_soc_dapm_free(snd_soc_component_get_dapm(component));
>  	soc_cleanup_component_debugfs(component);
>  	component->card = NULL;
> -	module_put(component->dev->driver->owner);
> +	if (!component->driver->ignore_module_refcount)
> +		module_put(component->dev->driver->owner);
>  }
>  
>  static void soc_remove_component(struct snd_soc_component *component)
> @@ -1362,7 +1363,8 @@ static int soc_probe_component(struct snd_soc_card *card,
>  		return 0;
>  	}
>  
> -	if (!try_module_get(component->dev->driver->owner))
> +	if (!component->driver->ignore_module_refcount &&
> +	    !try_module_get(component->dev->driver->owner))
>  		return -ENODEV;
>  
>  	component->card = card;
> -- 
> 2.17.1

-- 
~Vinod


* Re: [RFC PATCH 1/2] ASoC: core: don't increase component module refcount unconditionally
  2019-02-01 18:12   ` Vinod Koul
@ 2019-02-01 18:36     ` Pierre-Louis Bossart
  2019-02-01 20:07     ` Ranjani Sridharan
  1 sibling, 0 replies; 10+ messages in thread
From: Pierre-Louis Bossart @ 2019-02-01 18:36 UTC (permalink / raw)
  To: Vinod Koul; +Cc: tiwai, liam.r.girdwood, alsa-devel, broonie


On 2/1/19 12:12 PM, Vinod Koul wrote:
> On 01-02-19, 11:22, Pierre-Louis Bossart wrote:
>> The ASoC core has for the longest time increased the module reference
>> counts, even before the transition to the component model. This is
>> probably fine on most platforms, but it introduces a deadlock case on
>> Intel devices with the Skylake and SOF drivers which cannot be removed
>> due to their reference counts being modified by the core.
>>
>> In these 2 cases, the PCI or ACPI driver .probe creates a platform
>> device to let the machine driver .probe register the audio
>> card. Conversely the PCI or ACPI driver .remove will unregister the
>> platform device which results in the card being removed by the machine
>> driver .remove.
>>
>> With ascii art, this can be represented as
>>
>> modprobe
>> snd_soc_skl/
>> sof-pci-dev/sof-acpi-dev   ----------> pci/acpi probe
>>         ^                                    |
>>         |                     ---------------|
>>         |                    |               |
>>         |                    V               V
>>      increase            register        register machine
>>      refcount            component       platform_device
>>         ^                                    |
>>         |                                    |
>>         |                                    V
>>      component <----   register card  <---- probe
>>      probe
>>
>> The issue is that by playing with the component's module reference
>> counts during the card registration, it's no longer possible to remove
>> the module which controls the component. This can be shown, e.g. with
>> the following error:
>>
>> root@plb-XPS-13-9350:~# lsmod | grep snd_soc_skl
>> snd_soc_skl           110592  1
>>
>> root@plb-XPS-13-9350:~# rmmod snd_soc_skl
>> rmmod: ERROR: Module snd_soc_skl is in use
> Yup, that would be correct, the inuse is due to the fact the sound card
> is up and someone needs to unload the sound card to remove the
> reference.
>
> That can be done by doing the rmmod of machine driver first and IIRC
> that would remove the sound card and drop the reference and then
> snd_soc_skl can be unloaded.

It's a hack.

This doesn't follow the parent/child model and has issues with the 
topology cleanups, not to mention that you have a PCI/ACPI module which 
is completely crippled if you try things like suspend-resume.

In addition, this prevents you from testing the .remove path where you
unregister the platform_device. In other words, you have no clean way of
testing whether your allocations/releases are correct from the PCI/ACPI
level all the way to the component probe/free.
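
To make the point about allocations/releases concrete, here is a
minimal userspace sketch of what a load/unload loop is meant to catch.
Everything in it is hypothetical (the model_* helpers, the two
allocations, the sizes); it only models the idea that each probe must
be balanced by its remove, and is not taken from any driver.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical model: count live allocations across probe/remove cycles. */
static int live_allocations;

struct model_card {
	void *component_data;
	void *machine_data;
};

static void *model_alloc(size_t size)
{
	void *p = malloc(size);

	if (p)
		live_allocations++;
	return p;
}

static void model_free(void *p)
{
	if (p) {
		assert(live_allocations > 0);
		free(p);
		live_allocations--;
	}
}

/* Stands in for the whole probe path, from PCI/ACPI down to the component. */
static int model_probe(struct model_card *card)
{
	card->component_data = model_alloc(64);
	card->machine_data = model_alloc(32);
	if (!card->component_data || !card->machine_data) {
		model_free(card->component_data);
		model_free(card->machine_data);
		card->component_data = card->machine_data = NULL;
		return -1;
	}
	return 0;
}

/* Stands in for the matching remove path, releasing in reverse order. */
static void model_remove(struct model_card *card)
{
	model_free(card->machine_data);
	model_free(card->component_data);
	card->machine_data = card->component_data = NULL;
}

/* Run the load/unload loop; return the number of allocations left over. */
static int model_loop_test(int iterations)
{
	struct model_card card = { NULL, NULL };
	int i;

	for (i = 0; i < iterations; i++) {
		if (model_probe(&card))
			break;
		model_remove(&card);
	}
	return live_allocations;
}
```

A non-zero return from model_loop_test() would indicate a leak in one
of the paths. With the real drivers, the equivalent check needs rmmod
to succeed in the first place.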

>
> Now one can argue that is not required, but I feel that is correct for
> core to hold references of modules the card is tied up to :)
>
>> Increasing the reference count during the component probe is not
>> useful. If the PCI/ACPI module is removed, the card will be removed
>> anyway.
>>
>> To avoid breaking existing platforms and allowing Intel platforms to
>> safely deal with module load/unload cases, this patch introduces a
>> flag which needs to be set during the component initialization. This
>> is a strictly opt-in capability that should only be used when the
>> handling of the component module does not require a reference count
>> increase to prevent removal during use.
>>
>> Note that this solution is not directly applicable to the legacy
>> Atom/SST driver, which uses a different device hierarchy. There are
>> however additional refcount issues which prevent the ACPI driver from
>> being removed. This is a different issue which would need a different
>> patch.
>>
>> Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
>> ---
>>   include/sound/soc.h  | 3 +++
>>   sound/soc/soc-core.c | 6 ++++--
>>   2 files changed, 7 insertions(+), 2 deletions(-)
>>
>> diff --git a/include/sound/soc.h b/include/sound/soc.h
>> index 95689680336b..eb7db605955b 100644
>> --- a/include/sound/soc.h
>> +++ b/include/sound/soc.h
>> @@ -802,6 +802,9 @@ struct snd_soc_component_driver {
>>   	int probe_order;
>>   	int remove_order;
>>   
>> +	/* signal if the module handling the component cannot be removed */
>> +	unsigned int ignore_module_refcount:1;
>> +
>>   	/* bits */
>>   	unsigned int idle_bias_on:1;
>>   	unsigned int suspend_bias_off:1;
>> diff --git a/sound/soc/soc-core.c b/sound/soc/soc-core.c
>> index 9dad2b1498c1..80ab81f1df3d 100644
>> --- a/sound/soc/soc-core.c
>> +++ b/sound/soc/soc-core.c
>> @@ -947,7 +947,8 @@ static void soc_cleanup_component(struct snd_soc_component *component)
>>   	snd_soc_dapm_free(snd_soc_component_get_dapm(component));
>>   	soc_cleanup_component_debugfs(component);
>>   	component->card = NULL;
>> -	module_put(component->dev->driver->owner);
>> +	if (!component->driver->ignore_module_refcount)
>> +		module_put(component->dev->driver->owner);
>>   }
>>   
>>   static void soc_remove_component(struct snd_soc_component *component)
>> @@ -1362,7 +1363,8 @@ static int soc_probe_component(struct snd_soc_card *card,
>>   		return 0;
>>   	}
>>   
>> -	if (!try_module_get(component->dev->driver->owner))
>> +	if (!component->driver->ignore_module_refcount &&
>> +	    !try_module_get(component->dev->driver->owner))
>>   		return -ENODEV;
>>   
>>   	component->card = card;
>> -- 
>> 2.17.1


* Re: [RFC PATCH 1/2] ASoC: core: don't increase component module refcount unconditionally
  2019-02-01 18:12   ` Vinod Koul
  2019-02-01 18:36     ` Pierre-Louis Bossart
@ 2019-02-01 20:07     ` Ranjani Sridharan
  2019-02-05  4:25       ` Vinod Koul
  1 sibling, 1 reply; 10+ messages in thread
From: Ranjani Sridharan @ 2019-02-01 20:07 UTC (permalink / raw)
  To: Vinod Koul, Pierre-Louis Bossart
  Cc: tiwai, liam.r.girdwood, alsa-devel, broonie

On Fri, 2019-02-01 at 23:42 +0530, Vinod Koul wrote:
> On 01-02-19, 11:22, Pierre-Louis Bossart wrote:
> > The ASoC core has for the longest time increased the module
> > reference
> > counts, even before the transition to the component model. This is
> > probably fine on most platforms, but it introduces a deadlock case
> > on
> > Intel devices with the Skylake and SOF drivers which cannot be
> > removed
> > due to their reference counts being modified by the core.
> > 
> > In these 2 cases, the PCI or ACPI driver .probe creates a platform
> > device to let the machine driver .probe register the audio
> > card. Conversely the PCI or ACPI driver .remove will unregister the
> > platform device which results in the card being removed by the
> > machine
> > driver .remove.
> > 
> > With ascii art, this can be represented as
> > 
> > modprobe
> > snd_soc_skl/
> > sof-pci-dev/sof-acpi-dev   ----------> pci/acpi probe
> >        ^                                    |
> >        |                     ---------------|
> >        |                    |               |
> >        |                    V               V
> >     increase            register        register machine
> >     refcount            component       platform_device
> >        ^                                    |
> >        |                                    |
> >        |                                    V
> >     component <----   register card  <---- probe
> >     probe
> > 
> > The issue is that by playing with the component's module reference
> > counts during the card registration, it's no longer possible to
> > remove
> > the module which controls the component. This can be shown, e.g.
> > with
> > the following error:
> > 
> > root@plb-XPS-13-9350:~# lsmod | grep snd_soc_skl
> > snd_soc_skl           110592  1
> > 
> > root@plb-XPS-13-9350:~# rmmod snd_soc_skl
> > rmmod: ERROR: Module snd_soc_skl is in use
> 
> Yup, that would be correct, the inuse is due to the fact the sound
> card
> is up and someone needs to unload the sound card to remove the
> reference.
> 
> That can be done by doing the rmmod of machine driver first and IIRC
> that would remove the sound card and drop the reference and then
> snd_soc_skl can be unloaded.

This doesn't seem to be the case though. The machine driver module
cannot be removed either, because its refcnt is also > 0.

Thanks,
Ranjani

> 
> Now one can argue that is not required, but I feel that is correct
> for
> core to hold references of modules the card is tied up to :)
> 
> > Increasing the reference count during the component probe is not
> > useful. If the PCI/ACPI module is removed, the card will be removed
> > anyway.
> > 
> > To avoid breaking existing platforms and allowing Intel platforms
> > to
> > safely deal with module load/unload cases, this patch introduces a
> > flag which needs to be set during the component initialization.
> > This
> > is a strictly opt-in capability that should only be used when the
> > handling of the component module does not require a reference count
> > increase to prevent removal during use.
> > 
> > Note that this solution is not directly applicable to the legacy
> > Atom/SST driver, which uses a different device hierarchy. There are
> > however additional refcount issues which prevent the ACPI driver
> > from
> > being removed. This is a different issue which would need a
> > different
> > patch.
> > 
> > Signed-off-by: Pierre-Louis Bossart <
> > pierre-louis.bossart@linux.intel.com>
> > ---
> >  include/sound/soc.h  | 3 +++
> >  sound/soc/soc-core.c | 6 ++++--
> >  2 files changed, 7 insertions(+), 2 deletions(-)
> > 
> > diff --git a/include/sound/soc.h b/include/sound/soc.h
> > index 95689680336b..eb7db605955b 100644
> > --- a/include/sound/soc.h
> > +++ b/include/sound/soc.h
> > @@ -802,6 +802,9 @@ struct snd_soc_component_driver {
> >  	int probe_order;
> >  	int remove_order;
> >  
> > +	/* signal if the module handling the component cannot be
> > removed */
> > +	unsigned int ignore_module_refcount:1;
> > +
> >  	/* bits */
> >  	unsigned int idle_bias_on:1;
> >  	unsigned int suspend_bias_off:1;
> > diff --git a/sound/soc/soc-core.c b/sound/soc/soc-core.c
> > index 9dad2b1498c1..80ab81f1df3d 100644
> > --- a/sound/soc/soc-core.c
> > +++ b/sound/soc/soc-core.c
> > @@ -947,7 +947,8 @@ static void soc_cleanup_component(struct
> > snd_soc_component *component)
> >  	snd_soc_dapm_free(snd_soc_component_get_dapm(component));
> >  	soc_cleanup_component_debugfs(component);
> >  	component->card = NULL;
> > -	module_put(component->dev->driver->owner);
> > +	if (!component->driver->ignore_module_refcount)
> > +		module_put(component->dev->driver->owner);
> >  }
> >  
> >  static void soc_remove_component(struct snd_soc_component
> > *component)
> > @@ -1362,7 +1363,8 @@ static int soc_probe_component(struct
> > snd_soc_card *card,
> >  		return 0;
> >  	}
> >  
> > -	if (!try_module_get(component->dev->driver->owner))
> > +	if (!component->driver->ignore_module_refcount &&
> > +	    !try_module_get(component->dev->driver->owner))
> >  		return -ENODEV;
> >  
> >  	component->card = card;
> > -- 
> > 2.17.1
> 
> 


* Re: [RFC PATCH 1/2] ASoC: core: don't increase component module refcount unconditionally
  2019-02-01 20:07     ` Ranjani Sridharan
@ 2019-02-05  4:25       ` Vinod Koul
  2019-02-05 15:09         ` Pierre-Louis Bossart
  0 siblings, 1 reply; 10+ messages in thread
From: Vinod Koul @ 2019-02-05  4:25 UTC (permalink / raw)
  To: Ranjani Sridharan
  Cc: tiwai, liam.r.girdwood, alsa-devel, broonie, Pierre-Louis Bossart

On 01-02-19, 12:07, Ranjani Sridharan wrote:
> On Fri, 2019-02-01 at 23:42 +0530, Vinod Koul wrote:
> > On 01-02-19, 11:22, Pierre-Louis Bossart wrote:
> > > The ASoC core has for the longest time increased the module
> > > reference
> > > counts, even before the transition to the component model. This is
> > > probably fine on most platforms, but it introduces a deadlock case
> > > on
> > > Intel devices with the Skylake and SOF drivers which cannot be
> > > removed
> > > due to their reference counts being modified by the core.
> > > 
> > > In these 2 cases, the PCI or ACPI driver .probe creates a platform
> > > device to let the machine driver .probe register the audio
> > > card. Conversely the PCI or ACPI driver .remove will unregister the
> > > platform device which results in the card being removed by the
> > > machine
> > > driver .remove.
> > > 
> > > With ascii art, this can be represented as
> > > 
> > > modprobe
> > > snd_soc_skl/
> > > sof-pci-dev/sof-acpi-dev   ----------> pci/acpi probe
> > >        ^                                    |
> > >        |                     ---------------|
> > >        |                    |               |
> > >        |                    V               V
> > >     increase            register        register machine
> > >     refcount            component       platform_device
> > >        ^                                    |
> > >        |                                    |
> > >        |                                    V
> > >     component <----   register card  <---- probe
> > >     probe
> > > 
> > > The issue is that by playing with the component's module reference
> > > counts during the card registration, it's no longer possible to
> > > remove
> > > the module which controls the component. This can be shown, e.g.
> > > with
> > > the following error:
> > > 
> > > root@plb-XPS-13-9350:~# lsmod | grep snd_soc_skl
> > > snd_soc_skl           110592  1
> > > 
> > > root@plb-XPS-13-9350:~# rmmod snd_soc_skl
> > > rmmod: ERROR: Module snd_soc_skl is in use
> > 
> > Yup, that would be correct, the inuse is due to the fact the sound
> > card
> > is up and someone needs to unload the sound card to remove the
> > reference.
> > 
> > That can be done by doing the rmmod of machine driver first and IIRC
> > that would remove the sound card and drop the reference and then
> > snd_soc_skl can be unloaded.
> 
> This doesn't seem to be the case though. The machine driver module
> cannot be removed either, because its refcnt is also > 0.

At least this used to be the case when I used to try removal of modules
on skl, doing the reverse of load order seemed to work for me back then.

-- 
~Vinod


* Re: [RFC PATCH 1/2] ASoC: core: don't increase component module refcount unconditionally
  2019-02-05  4:25       ` Vinod Koul
@ 2019-02-05 15:09         ` Pierre-Louis Bossart
  0 siblings, 0 replies; 10+ messages in thread
From: Pierre-Louis Bossart @ 2019-02-05 15:09 UTC (permalink / raw)
  To: Vinod Koul, Ranjani Sridharan; +Cc: tiwai, liam.r.girdwood, alsa-devel, broonie


On 2/4/19 10:25 PM, Vinod Koul wrote:
> On 01-02-19, 12:07, Ranjani Sridharan wrote:
>> On Fri, 2019-02-01 at 23:42 +0530, Vinod Koul wrote:
>>> On 01-02-19, 11:22, Pierre-Louis Bossart wrote:
>>>> The ASoC core has for the longest time increased the module
>>>> reference
>>>> counts, even before the transition to the component model. This is
>>>> probably fine on most platforms, but it introduces a deadlock case
>>>> on
>>>> Intel devices with the Skylake and SOF drivers which cannot be
>>>> removed
>>>> due to their reference counts being modified by the core.
>>>>
>>>> In these 2 cases, the PCI or ACPI driver .probe creates a platform
>>>> device to let the machine driver .probe register the audio
>>>> card. Conversely the PCI or ACPI driver .remove will unregister the
>>>> platform device which results in the card being removed by the
>>>> machine
>>>> driver .remove.
>>>>
>>>> With ascii art, this can be represented as
>>>>
>>>> modprobe
>>>> snd_soc_skl/
>>>> sof-pci-dev/sof-acpi-dev   ----------> pci/acpi probe
>>>>         ^                                    |
>>>>         |                     ---------------|
>>>>         |                    |               |
>>>>         |                    V               V
>>>>      increase            register        register machine
>>>>      refcount            component       platform_device
>>>>         ^                                    |
>>>>         |                                    |
>>>>         |                                    V
>>>>      component <----   register card  <---- probe
>>>>      probe
>>>>
>>>> The issue is that by playing with the component's module reference
>>>> counts during the card registration, it's no longer possible to
>>>> remove
>>>> the module which controls the component. This can be shown, e.g.
>>>> with
>>>> the following error:
>>>>
>>>> root@plb-XPS-13-9350:~# lsmod | grep snd_soc_skl
>>>> snd_soc_skl           110592  1
>>>>
>>>> root@plb-XPS-13-9350:~# rmmod snd_soc_skl
>>>> rmmod: ERROR: Module snd_soc_skl is in use
>>> Yup, that would be correct, the inuse is due to the fact the sound
>>> card
>>> is up and someone needs to unload the sound card to remove the
>>> reference.
>>>
>>> That can be done by doing the rmmod of machine driver first and IIRC
>>> that would remove the sound card and drop the reference and then
>>> snd_soc_skl can be unloaded.
>> This doesn't seem to be the case though. The machine driver module
>> cannot be removed either, because its refcnt is also > 0.
> At least this used to be the case when I used to try removal of modules
> on skl, doing the reverse of load order seemed to work for me back then.

Unfortunately, module unload is broken with the Skylake driver (kernel
oops left and right), so there's no way of verifying your
assertion. Other folks are trying to restore the capability, but it
hasn't been working for a very long time.

Beyond the conceptual issue with the reference count, my other worry is
that the topology is created by the Skylake driver but freed when
you remove the card, so you end up with nonsensical data structures and
configurations when you remove the skl driver. It's *really* recommended
to remove the component which instantiated the topology first, before
removing the card.


* Applied "ASoC: Intel: Skylake: set .ignore_module_refcount field in component" to the asoc tree
  2019-02-01 17:22 ` [RFC PATCH 2/2] ASoC: Intel: Skylake: set .ignore_module_refcount field in component Pierre-Louis Bossart
@ 2019-02-12 14:22   ` Mark Brown
  0 siblings, 0 replies; 10+ messages in thread
From: Mark Brown @ 2019-02-12 14:22 UTC (permalink / raw)
  To: Pierre-Louis Bossart; +Cc: tiwai, liam.r.girdwood, alsa-devel, broonie, vkoul

The patch

   ASoC: Intel: Skylake: set .ignore_module_refcount field in component

has been applied to the asoc tree at

   https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git 

All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.  

You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.

If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.

Please add any relevant lists and maintainers to the CCs when replying
to this mail.

Thanks,
Mark

>From e0771fc98909096b65c9781c438ac9d9c98ac41a Mon Sep 17 00:00:00 2001
From: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Date: Fri, 1 Feb 2019 11:22:24 -0600
Subject: [PATCH] ASoC: Intel: Skylake: set .ignore_module_refcount field in
 component

There is no risk of the module being removed while the platform
components are in use. This solves the problem of the snd_soc_skl
module not being removable with rmmod.

Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
---
 sound/soc/intel/skylake/skl-pcm.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/sound/soc/intel/skylake/skl-pcm.c b/sound/soc/intel/skylake/skl-pcm.c
index 8e589d698c58..a4284778f117 100644
--- a/sound/soc/intel/skylake/skl-pcm.c
+++ b/sound/soc/intel/skylake/skl-pcm.c
@@ -1464,6 +1464,7 @@ static const struct snd_soc_component_driver skl_component  = {
 	.ops		= &skl_platform_ops,
 	.pcm_new	= skl_pcm_new,
 	.pcm_free	= skl_pcm_free,
+	.ignore_module_refcount = 1, /* do not increase the refcount in core */
 };
 
 int skl_platform_register(struct device *dev)
-- 
2.20.1


* Applied "ASoC: core: don't increase component module refcount unconditionally" to the asoc tree
  2019-02-01 17:22 ` [RFC PATCH 1/2] ASoC: core: don't increase component module refcount unconditionally Pierre-Louis Bossart
  2019-02-01 18:12   ` Vinod Koul
@ 2019-02-12 14:22   ` Mark Brown
  1 sibling, 0 replies; 10+ messages in thread
From: Mark Brown @ 2019-02-12 14:22 UTC (permalink / raw)
  To: Pierre-Louis Bossart; +Cc: tiwai, liam.r.girdwood, alsa-devel, broonie, vkoul

The patch

   ASoC: core: don't increase component module refcount unconditionally

has been applied to the asoc tree at

   https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git 

Thanks,
Mark

>From b450b87847b157d69dbf9af7aefb4cec29e89cc9 Mon Sep 17 00:00:00 2001
From: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Date: Fri, 1 Feb 2019 11:22:23 -0600
Subject: [PATCH] ASoC: core: don't increase component module refcount
 unconditionally

The ASoC core has for the longest time increased the module reference
counts, even before the transition to the component model. This is
probably fine on most platforms, but it introduces a deadlock case on
Intel devices with the Skylake and SOF drivers which cannot be removed
due to their reference counts being modified by the core.

In these 2 cases, the PCI or ACPI driver .probe creates a platform
device to let the machine driver .probe register the audio
card. Conversely the PCI or ACPI driver .remove will unregister the
platform device which results in the card being removed by the machine
driver .remove.

With ASCII art, this can be represented as:

modprobe
snd_soc_skl/
sof-pci-dev/sof-acpi-dev  -----------> pci/acpi probe
       ^                                    |
       |                     ---------------|
       |                    |               |
       |                    V               V
    increase            register        register machine
    refcount            component       platform_device
       ^                                    |
       |                                    |
       |                                    V
    component <----   register card  <---- probe
    probe

The issue is that by playing with the component's module reference
counts during the card registration, it's no longer possible to remove
the module which controls the component. This can be shown, e.g. with
the following error:

root@plb-XPS-13-9350:~# lsmod | grep snd_soc_skl
snd_soc_skl           110592  1

root@plb-XPS-13-9350:~# rmmod snd_soc_skl
rmmod: ERROR: Module snd_soc_skl is in use

Increasing the reference count during the component probe is not
useful. If the PCI/ACPI module is removed, the card will be removed
anyway.

To avoid breaking existing platforms while allowing Intel platforms to
safely deal with module load/unload cases, this patch introduces a
flag which needs to be set during the component initialization. This
is a strictly opt-in capability that should only be used when the
handling of the component module does not require a reference count
increase to prevent removal during use.

Note that this solution is not directly applicable to the legacy
Atom/SST driver, which uses a different device hierarchy. There are
however additional refcount issues which prevent the ACPI driver from
being removed. This is a different issue which would need a different
patch.

Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
---
 include/sound/soc.h  | 3 +++
 sound/soc/soc-core.c | 6 ++++--
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/include/sound/soc.h b/include/sound/soc.h
index 95689680336b..eb7db605955b 100644
--- a/include/sound/soc.h
+++ b/include/sound/soc.h
@@ -802,6 +802,9 @@ struct snd_soc_component_driver {
 	int probe_order;
 	int remove_order;
 
+	/* signal if the module handling the component cannot be removed */
+	unsigned int ignore_module_refcount:1;
+
 	/* bits */
 	unsigned int idle_bias_on:1;
 	unsigned int suspend_bias_off:1;
diff --git a/sound/soc/soc-core.c b/sound/soc/soc-core.c
index 994d21d7ba0f..93d316d5bf8e 100644
--- a/sound/soc/soc-core.c
+++ b/sound/soc/soc-core.c
@@ -947,7 +947,8 @@ static void soc_cleanup_component(struct snd_soc_component *component)
 	snd_soc_dapm_free(snd_soc_component_get_dapm(component));
 	soc_cleanup_component_debugfs(component);
 	component->card = NULL;
-	module_put(component->dev->driver->owner);
+	if (!component->driver->ignore_module_refcount)
+		module_put(component->dev->driver->owner);
 }
 
 static void soc_remove_component(struct snd_soc_component *component)
@@ -1380,7 +1381,8 @@ static int soc_probe_component(struct snd_soc_card *card,
 		return 0;
 	}
 
-	if (!try_module_get(component->dev->driver->owner))
+	if (!component->driver->ignore_module_refcount &&
+	    !try_module_get(component->dev->driver->owner))
 		return -ENODEV;
 
 	component->card = card;
-- 
2.20.1


