linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH] iio: core: fix a possible circular locking dependency
@ 2019-03-22 13:54 Fabrice Gasnier
  2019-03-24 15:47 ` Jonathan Cameron
  0 siblings, 1 reply; 4+ messages in thread
From: Fabrice Gasnier @ 2019-03-22 13:54 UTC (permalink / raw)
  To: jic23; +Cc: lars, knaack.h, pmeerw, fabrice.gasnier, linux-kernel, linux-iio

This fixes a possible circular locking dependency detected warning seen
with:
- CONFIG_PROVE_LOCKING=y
- consumer/provider IIO devices (ex: "voltage-divider" consumer of "adc")

When using the IIO consumer interface, e.g. iio_channel_get(),
the consumer device will likely call iio_read_channel_raw() or similar that
rely on 'info_exist_lock' mutex.

typically:
...
	mutex_lock(&chan->indio_dev->info_exist_lock);
	if (chan->indio_dev->info == NULL) {
		ret = -ENODEV;
		goto err_unlock;
	}
	ret = do_some_ops()
err_unlock:
	mutex_unlock(&chan->indio_dev->info_exist_lock);
	return ret;
...

Same mutex is also hold in iio_device_unregister().

The following deadlock warning happens when:
- the consumer device has called an API like iio_read_channel_raw()
  at least once.
- the consumer driver is unregistered, removed (unbind from sysfs)

======================================================
WARNING: possible circular locking dependency detected
4.19.24 #577 Not tainted
------------------------------------------------------
sh/372 is trying to acquire lock:
(kn->count#30){++++}, at: kernfs_remove_by_name_ns+0x3c/0x84

but task is already holding lock:
(&dev->info_exist_lock){+.+.}, at: iio_device_unregister+0x18/0x60

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (&dev->info_exist_lock){+.+.}:
       __mutex_lock+0x70/0xa3c
       mutex_lock_nested+0x1c/0x24
       iio_read_channel_raw+0x1c/0x60
       iio_read_channel_info+0xa8/0xb0
       dev_attr_show+0x1c/0x48
       sysfs_kf_seq_show+0x84/0xec
       seq_read+0x154/0x528
       __vfs_read+0x2c/0x15c
       vfs_read+0x8c/0x110
       ksys_read+0x4c/0xac
       ret_fast_syscall+0x0/0x28
       0xbedefb60

-> #0 (kn->count#30){++++}:
       lock_acquire+0xd8/0x268
       __kernfs_remove+0x288/0x374
       kernfs_remove_by_name_ns+0x3c/0x84
       remove_files+0x34/0x78
       sysfs_remove_group+0x40/0x9c
       sysfs_remove_groups+0x24/0x34
       device_remove_attrs+0x38/0x64
       device_del+0x11c/0x360
       cdev_device_del+0x14/0x2c
       iio_device_unregister+0x24/0x60
       release_nodes+0x1bc/0x200
       device_release_driver_internal+0x1a0/0x230
       unbind_store+0x80/0x130
       kernfs_fop_write+0x100/0x1e4
       __vfs_write+0x2c/0x160
       vfs_write+0xa4/0x17c
       ksys_write+0x4c/0xac
       ret_fast_syscall+0x0/0x28
       0xbe906840

other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&dev->info_exist_lock);
                               lock(kn->count#30);
                               lock(&dev->info_exist_lock);
  lock(kn->count#30);

 *** DEADLOCK ***
...

So only hold the mutex to:
- disable all buffers while 'info' is available
- set 'info' to NULL
Then release it to call cdev_device_del and so on.

Help to reproduce:
See example: Documentation/devicetree/bindings/iio/afe/voltage-divider.txt
sysv {
	compatible = "voltage-divider";
	io-channels = <&adc 0>;
	output-ohms = <22>;
	full-ohms = <222>;
};

First, go to iio:deviceX for the "voltage-divider", do one read:
$ cd /sys/bus/iio/devices/iio:deviceX
$ cat in_voltage0_raw

Then, unbind the consumer driver. It triggers above deadlock warning.
$ cd /sys/bus/platform/drivers/iio-rescale/
$ echo sysv > unbind

Signed-off-by: Fabrice Gasnier <fabrice.gasnier@st.com>
---
 drivers/iio/industrialio-core.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/iio/industrialio-core.c b/drivers/iio/industrialio-core.c
index 4700fd5..e03d6ff 100644
--- a/drivers/iio/industrialio-core.c
+++ b/drivers/iio/industrialio-core.c
@@ -1745,19 +1745,19 @@ void iio_device_unregister(struct iio_dev *indio_dev)
 {
 	mutex_lock(&indio_dev->info_exist_lock);
 
-	cdev_device_del(&indio_dev->chrdev, &indio_dev->dev);
-
-	iio_device_unregister_debugfs(indio_dev);
-
 	iio_disable_all_buffers(indio_dev);
 
 	indio_dev->info = NULL;
 
+	mutex_unlock(&indio_dev->info_exist_lock);
+
+	cdev_device_del(&indio_dev->chrdev, &indio_dev->dev);
+
+	iio_device_unregister_debugfs(indio_dev);
+
 	iio_device_wakeup_eventset(indio_dev);
 	iio_buffer_wakeup_poll(indio_dev);
 
-	mutex_unlock(&indio_dev->info_exist_lock);
-
 	iio_buffer_free_sysfs_and_mask(indio_dev);
 }
 EXPORT_SYMBOL(iio_device_unregister);
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [RFC PATCH] iio: core: fix a possible circular locking dependency
  2019-03-22 13:54 [RFC PATCH] iio: core: fix a possible circular locking dependency Fabrice Gasnier
@ 2019-03-24 15:47 ` Jonathan Cameron
  2019-03-25 12:51   ` Fabrice Gasnier
  0 siblings, 1 reply; 4+ messages in thread
From: Jonathan Cameron @ 2019-03-24 15:47 UTC (permalink / raw)
  To: Fabrice Gasnier; +Cc: lars, knaack.h, pmeerw, linux-kernel, linux-iio

On Fri, 22 Mar 2019 14:54:06 +0100
Fabrice Gasnier <fabrice.gasnier@st.com> wrote:

> This fixes a possible circular locking dependency detected warning seen
> with:
> - CONFIG_PROVE_LOCKING=y
> - consumer/provider IIO devices (ex: "voltage-divider" consumer of "adc")
> 
> When using the IIO consumer interface, e.g. iio_channel_get(),
> the consumer device will likely call iio_read_channel_raw() or similar that
> rely on 'info_exist_lock' mutex.
> 
> typically:
> ...
> 	mutex_lock(&chan->indio_dev->info_exist_lock);
> 	if (chan->indio_dev->info == NULL) {
> 		ret = -ENODEV;
> 		goto err_unlock;
> 	}
> 	ret = do_some_ops()
> err_unlock:
> 	mutex_unlock(&chan->indio_dev->info_exist_lock);
> 	return ret;
> ...
> 
> Same mutex is also hold in iio_device_unregister().
> 
> The following deadlock warning happens when:
> - the consumer device has called an API like iio_read_channel_raw()
>   at least once.
> - the consumer driver is unregistered, removed (unbind from sysfs)
> 
> ======================================================
> WARNING: possible circular locking dependency detected
> 4.19.24 #577 Not tainted
> ------------------------------------------------------
> sh/372 is trying to acquire lock:
> (kn->count#30){++++}, at: kernfs_remove_by_name_ns+0x3c/0x84
> 
> but task is already holding lock:
> (&dev->info_exist_lock){+.+.}, at: iio_device_unregister+0x18/0x60
> 
> which lock already depends on the new lock.
> 
> the existing dependency chain (in reverse order) is:
> 
> -> #1 (&dev->info_exist_lock){+.+.}:  
>        __mutex_lock+0x70/0xa3c
>        mutex_lock_nested+0x1c/0x24
>        iio_read_channel_raw+0x1c/0x60
>        iio_read_channel_info+0xa8/0xb0
>        dev_attr_show+0x1c/0x48
>        sysfs_kf_seq_show+0x84/0xec
>        seq_read+0x154/0x528
>        __vfs_read+0x2c/0x15c
>        vfs_read+0x8c/0x110
>        ksys_read+0x4c/0xac
>        ret_fast_syscall+0x0/0x28
>        0xbedefb60
> 
> -> #0 (kn->count#30){++++}:  
>        lock_acquire+0xd8/0x268
>        __kernfs_remove+0x288/0x374
>        kernfs_remove_by_name_ns+0x3c/0x84
>        remove_files+0x34/0x78
>        sysfs_remove_group+0x40/0x9c
>        sysfs_remove_groups+0x24/0x34
>        device_remove_attrs+0x38/0x64
>        device_del+0x11c/0x360
>        cdev_device_del+0x14/0x2c
>        iio_device_unregister+0x24/0x60
>        release_nodes+0x1bc/0x200
>        device_release_driver_internal+0x1a0/0x230
>        unbind_store+0x80/0x130
>        kernfs_fop_write+0x100/0x1e4
>        __vfs_write+0x2c/0x160
>        vfs_write+0xa4/0x17c
>        ksys_write+0x4c/0xac
>        ret_fast_syscall+0x0/0x28
>        0xbe906840
> 
> other info that might help us debug this:
> 
>  Possible unsafe locking scenario:
> 
>        CPU0                    CPU1
>        ----                    ----
>   lock(&dev->info_exist_lock);
>                                lock(kn->count#30);
>                                lock(&dev->info_exist_lock);
>   lock(kn->count#30);
> 
>  *** DEADLOCK ***
> ...
> 
> So only hold the mutex to:
> - disable all buffers while 'info' is available
> - set 'info' to NULL
> Then release it to call cdev_device_del and so on.
> 
> Help to reproduce:
> See example: Documentation/devicetree/bindings/iio/afe/voltage-divider.txt
> sysv {
> 	compatible = "voltage-divider";
> 	io-channels = <&adc 0>;
> 	output-ohms = <22>;
> 	full-ohms = <222>;
> };
> 
> First, go to iio:deviceX for the "voltage-divider", do one read:
> $ cd /sys/bus/iio/devices/iio:deviceX
> $ cat in_voltage0_raw
> 
> Then, unbind the consumer driver. It triggers above deadlock warning.
> $ cd /sys/bus/platform/drivers/iio-rescale/
> $ echo sysv > unbind
> 
> Signed-off-by: Fabrice Gasnier <fabrice.gasnier@st.com>
I'm not in principle against the fix. However it is reordering the
remove wrt to the probe which I'm not so keen on.

The cdev register is fundamentally the point where the device
becomes exposed to userspace, so we naturally want to do it last
(and remove it first).  I worry that we may have some paths
in which we don't sanity check the existence of info (which
is kind of our backup plan to indicate the device has gone
away).

Are we safe to instead of reordering, just not take the lock
until after the problem functions?  Info doesn't go
away until later so I think we are.  I haven't looked it in that
much detail though!

Thanks for raising this as it's a nasty little problem.

Jonathan


> ---
>  drivers/iio/industrialio-core.c | 12 ++++++------
>  1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/iio/industrialio-core.c b/drivers/iio/industrialio-core.c
> index 4700fd5..e03d6ff 100644
> --- a/drivers/iio/industrialio-core.c
> +++ b/drivers/iio/industrialio-core.c
> @@ -1745,19 +1745,19 @@ void iio_device_unregister(struct iio_dev *indio_dev)
>  {
>  	mutex_lock(&indio_dev->info_exist_lock);
>  
> -	cdev_device_del(&indio_dev->chrdev, &indio_dev->dev);
> -
> -	iio_device_unregister_debugfs(indio_dev);
> -
>  	iio_disable_all_buffers(indio_dev);
>  
>  	indio_dev->info = NULL;
>  
> +	mutex_unlock(&indio_dev->info_exist_lock);
> +
> +	cdev_device_del(&indio_dev->chrdev, &indio_dev->dev);
> +
> +	iio_device_unregister_debugfs(indio_dev);
> +
>  	iio_device_wakeup_eventset(indio_dev);
>  	iio_buffer_wakeup_poll(indio_dev);
>  
> -	mutex_unlock(&indio_dev->info_exist_lock);
> -
>  	iio_buffer_free_sysfs_and_mask(indio_dev);
>  }
>  EXPORT_SYMBOL(iio_device_unregister);


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [RFC PATCH] iio: core: fix a possible circular locking dependency
  2019-03-24 15:47 ` Jonathan Cameron
@ 2019-03-25 12:51   ` Fabrice Gasnier
  2019-03-30 18:46     ` Jonathan Cameron
  0 siblings, 1 reply; 4+ messages in thread
From: Fabrice Gasnier @ 2019-03-25 12:51 UTC (permalink / raw)
  To: Jonathan Cameron; +Cc: lars, knaack.h, pmeerw, linux-kernel, linux-iio

On 3/24/19 4:47 PM, Jonathan Cameron wrote:
> On Fri, 22 Mar 2019 14:54:06 +0100
> Fabrice Gasnier <fabrice.gasnier@st.com> wrote:
> 
>> This fixes a possible circular locking dependency detected warning seen
>> with:
>> - CONFIG_PROVE_LOCKING=y
>> - consumer/provider IIO devices (ex: "voltage-divider" consumer of "adc")
>>
>> When using the IIO consumer interface, e.g. iio_channel_get(),
>> the consumer device will likely call iio_read_channel_raw() or similar that
>> rely on 'info_exist_lock' mutex.
>>
>> typically:
>> ...
>> 	mutex_lock(&chan->indio_dev->info_exist_lock);
>> 	if (chan->indio_dev->info == NULL) {
>> 		ret = -ENODEV;
>> 		goto err_unlock;
>> 	}
>> 	ret = do_some_ops()
>> err_unlock:
>> 	mutex_unlock(&chan->indio_dev->info_exist_lock);
>> 	return ret;
>> ...
>>
>> Same mutex is also hold in iio_device_unregister().
>>
>> The following deadlock warning happens when:
>> - the consumer device has called an API like iio_read_channel_raw()
>>   at least once.
>> - the consumer driver is unregistered, removed (unbind from sysfs)
>>
>> ======================================================
>> WARNING: possible circular locking dependency detected
>> 4.19.24 #577 Not tainted
>> ------------------------------------------------------
>> sh/372 is trying to acquire lock:
>> (kn->count#30){++++}, at: kernfs_remove_by_name_ns+0x3c/0x84
>>
>> but task is already holding lock:
>> (&dev->info_exist_lock){+.+.}, at: iio_device_unregister+0x18/0x60
>>
>> which lock already depends on the new lock.
>>
>> the existing dependency chain (in reverse order) is:
>>
>> -> #1 (&dev->info_exist_lock){+.+.}:  
>>        __mutex_lock+0x70/0xa3c
>>        mutex_lock_nested+0x1c/0x24
>>        iio_read_channel_raw+0x1c/0x60
>>        iio_read_channel_info+0xa8/0xb0
>>        dev_attr_show+0x1c/0x48
>>        sysfs_kf_seq_show+0x84/0xec
>>        seq_read+0x154/0x528
>>        __vfs_read+0x2c/0x15c
>>        vfs_read+0x8c/0x110
>>        ksys_read+0x4c/0xac
>>        ret_fast_syscall+0x0/0x28
>>        0xbedefb60
>>
>> -> #0 (kn->count#30){++++}:  
>>        lock_acquire+0xd8/0x268
>>        __kernfs_remove+0x288/0x374
>>        kernfs_remove_by_name_ns+0x3c/0x84
>>        remove_files+0x34/0x78
>>        sysfs_remove_group+0x40/0x9c
>>        sysfs_remove_groups+0x24/0x34
>>        device_remove_attrs+0x38/0x64
>>        device_del+0x11c/0x360
>>        cdev_device_del+0x14/0x2c
>>        iio_device_unregister+0x24/0x60
>>        release_nodes+0x1bc/0x200
>>        device_release_driver_internal+0x1a0/0x230
>>        unbind_store+0x80/0x130
>>        kernfs_fop_write+0x100/0x1e4
>>        __vfs_write+0x2c/0x160
>>        vfs_write+0xa4/0x17c
>>        ksys_write+0x4c/0xac
>>        ret_fast_syscall+0x0/0x28
>>        0xbe906840
>>
>> other info that might help us debug this:
>>
>>  Possible unsafe locking scenario:
>>
>>        CPU0                    CPU1
>>        ----                    ----
>>   lock(&dev->info_exist_lock);
>>                                lock(kn->count#30);
>>                                lock(&dev->info_exist_lock);
>>   lock(kn->count#30);
>>
>>  *** DEADLOCK ***
>> ...
>>
>> So only hold the mutex to:
>> - disable all buffers while 'info' is available
>> - set 'info' to NULL
>> Then release it to call cdev_device_del and so on.
>>
>> Help to reproduce:
>> See example: Documentation/devicetree/bindings/iio/afe/voltage-divider.txt
>> sysv {
>> 	compatible = "voltage-divider";
>> 	io-channels = <&adc 0>;
>> 	output-ohms = <22>;
>> 	full-ohms = <222>;
>> };
>>
>> First, go to iio:deviceX for the "voltage-divider", do one read:
>> $ cd /sys/bus/iio/devices/iio:deviceX
>> $ cat in_voltage0_raw
>>
>> Then, unbind the consumer driver. It triggers above deadlock warning.
>> $ cd /sys/bus/platform/drivers/iio-rescale/
>> $ echo sysv > unbind
>>
>> Signed-off-by: Fabrice Gasnier <fabrice.gasnier@st.com>
> I'm not in principle against the fix. However it is reordering the
> remove wrt to the probe which I'm not so keen on.
> 

Hi Jonathan,

I also had it in mind. Just one thing confused me:
- The ->info struct is filled in by the device driver before calling one
of the "iio_device_register" routines.
- But it's assigned to NULL inside the unregister routine.
That's also why I've sent it as RFC.

> The cdev register is fundamentally the point where the device
> becomes exposed to userspace, so we naturally want to do it last
> (and remove it first).  I worry that we may have some paths
> in which we don't sanity check the existence of info (which
> is kind of our backup plan to indicate the device has gone
> away).
> 
> Are we safe to instead of reordering, just not take the lock
> until after the problem functions?  Info doesn't go
> away until later so I think we are.  I haven't looked it in that
> much detail though!

Ok, thanks for making up my mind :-). As far as I understand, the
"info_exist_lock" targets the inkern users (e.g. exported routines). So,
cdev_device_del() can probably be called unlocked, without reordering.

> 
> Thanks for raising this as it's a nasty little problem.

I'll send a v2 based on your proposal.

Thanks for your help,
Best Regards,
Fabrice

> 
> Jonathan
> 
> 
>> ---
>>  drivers/iio/industrialio-core.c | 12 ++++++------
>>  1 file changed, 6 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/iio/industrialio-core.c b/drivers/iio/industrialio-core.c
>> index 4700fd5..e03d6ff 100644
>> --- a/drivers/iio/industrialio-core.c
>> +++ b/drivers/iio/industrialio-core.c
>> @@ -1745,19 +1745,19 @@ void iio_device_unregister(struct iio_dev *indio_dev)
>>  {
>>  	mutex_lock(&indio_dev->info_exist_lock);
>>  
>> -	cdev_device_del(&indio_dev->chrdev, &indio_dev->dev);
>> -
>> -	iio_device_unregister_debugfs(indio_dev);
>> -
>>  	iio_disable_all_buffers(indio_dev);
>>  
>>  	indio_dev->info = NULL;
>>  
>> +	mutex_unlock(&indio_dev->info_exist_lock);
>> +
>> +	cdev_device_del(&indio_dev->chrdev, &indio_dev->dev);
>> +
>> +	iio_device_unregister_debugfs(indio_dev);
>> +
>>  	iio_device_wakeup_eventset(indio_dev);
>>  	iio_buffer_wakeup_poll(indio_dev);
>>  
>> -	mutex_unlock(&indio_dev->info_exist_lock);
>> -
>>  	iio_buffer_free_sysfs_and_mask(indio_dev);
>>  }
>>  EXPORT_SYMBOL(iio_device_unregister);
> 

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [RFC PATCH] iio: core: fix a possible circular locking dependency
  2019-03-25 12:51   ` Fabrice Gasnier
@ 2019-03-30 18:46     ` Jonathan Cameron
  0 siblings, 0 replies; 4+ messages in thread
From: Jonathan Cameron @ 2019-03-30 18:46 UTC (permalink / raw)
  To: Fabrice Gasnier; +Cc: lars, knaack.h, pmeerw, linux-kernel, linux-iio

On Mon, 25 Mar 2019 13:51:17 +0100
Fabrice Gasnier <fabrice.gasnier@st.com> wrote:

> On 3/24/19 4:47 PM, Jonathan Cameron wrote:
> > On Fri, 22 Mar 2019 14:54:06 +0100
> > Fabrice Gasnier <fabrice.gasnier@st.com> wrote:
> >   
> >> This fixes a possible circular locking dependency detected warning seen
> >> with:
> >> - CONFIG_PROVE_LOCKING=y
> >> - consumer/provider IIO devices (ex: "voltage-divider" consumer of "adc")
> >>
> >> When using the IIO consumer interface, e.g. iio_channel_get(),
> >> the consumer device will likely call iio_read_channel_raw() or similar that
> >> rely on 'info_exist_lock' mutex.
> >>
> >> typically:
> >> ...
> >> 	mutex_lock(&chan->indio_dev->info_exist_lock);
> >> 	if (chan->indio_dev->info == NULL) {
> >> 		ret = -ENODEV;
> >> 		goto err_unlock;
> >> 	}
> >> 	ret = do_some_ops()
> >> err_unlock:
> >> 	mutex_unlock(&chan->indio_dev->info_exist_lock);
> >> 	return ret;
> >> ...
> >>
> >> Same mutex is also hold in iio_device_unregister().
> >>
> >> The following deadlock warning happens when:
> >> - the consumer device has called an API like iio_read_channel_raw()
> >>   at least once.
> >> - the consumer driver is unregistered, removed (unbind from sysfs)
> >>
> >> ======================================================
> >> WARNING: possible circular locking dependency detected
> >> 4.19.24 #577 Not tainted
> >> ------------------------------------------------------
> >> sh/372 is trying to acquire lock:
> >> (kn->count#30){++++}, at: kernfs_remove_by_name_ns+0x3c/0x84
> >>
> >> but task is already holding lock:
> >> (&dev->info_exist_lock){+.+.}, at: iio_device_unregister+0x18/0x60
> >>
> >> which lock already depends on the new lock.
> >>
> >> the existing dependency chain (in reverse order) is:
> >>  
> >> -> #1 (&dev->info_exist_lock){+.+.}:    
> >>        __mutex_lock+0x70/0xa3c
> >>        mutex_lock_nested+0x1c/0x24
> >>        iio_read_channel_raw+0x1c/0x60
> >>        iio_read_channel_info+0xa8/0xb0
> >>        dev_attr_show+0x1c/0x48
> >>        sysfs_kf_seq_show+0x84/0xec
> >>        seq_read+0x154/0x528
> >>        __vfs_read+0x2c/0x15c
> >>        vfs_read+0x8c/0x110
> >>        ksys_read+0x4c/0xac
> >>        ret_fast_syscall+0x0/0x28
> >>        0xbedefb60
> >>  
> >> -> #0 (kn->count#30){++++}:    
> >>        lock_acquire+0xd8/0x268
> >>        __kernfs_remove+0x288/0x374
> >>        kernfs_remove_by_name_ns+0x3c/0x84
> >>        remove_files+0x34/0x78
> >>        sysfs_remove_group+0x40/0x9c
> >>        sysfs_remove_groups+0x24/0x34
> >>        device_remove_attrs+0x38/0x64
> >>        device_del+0x11c/0x360
> >>        cdev_device_del+0x14/0x2c
> >>        iio_device_unregister+0x24/0x60
> >>        release_nodes+0x1bc/0x200
> >>        device_release_driver_internal+0x1a0/0x230
> >>        unbind_store+0x80/0x130
> >>        kernfs_fop_write+0x100/0x1e4
> >>        __vfs_write+0x2c/0x160
> >>        vfs_write+0xa4/0x17c
> >>        ksys_write+0x4c/0xac
> >>        ret_fast_syscall+0x0/0x28
> >>        0xbe906840
> >>
> >> other info that might help us debug this:
> >>
> >>  Possible unsafe locking scenario:
> >>
> >>        CPU0                    CPU1
> >>        ----                    ----
> >>   lock(&dev->info_exist_lock);
> >>                                lock(kn->count#30);
> >>                                lock(&dev->info_exist_lock);
> >>   lock(kn->count#30);
> >>
> >>  *** DEADLOCK ***
> >> ...
> >>
> >> So only hold the mutex to:
> >> - disable all buffers while 'info' is available
> >> - set 'info' to NULL
> >> Then release it to call cdev_device_del and so on.
> >>
> >> Help to reproduce:
> >> See example: Documentation/devicetree/bindings/iio/afe/voltage-divider.txt
> >> sysv {
> >> 	compatible = "voltage-divider";
> >> 	io-channels = <&adc 0>;
> >> 	output-ohms = <22>;
> >> 	full-ohms = <222>;
> >> };
> >>
> >> First, go to iio:deviceX for the "voltage-divider", do one read:
> >> $ cd /sys/bus/iio/devices/iio:deviceX
> >> $ cat in_voltage0_raw
> >>
> >> Then, unbind the consumer driver. It triggers above deadlock warning.
> >> $ cd /sys/bus/platform/drivers/iio-rescale/
> >> $ echo sysv > unbind
> >>
> >> Signed-off-by: Fabrice Gasnier <fabrice.gasnier@st.com>  
> > I'm not in principle against the fix. However it is reordering the
> > remove wrt to the probe which I'm not so keen on.
> >   
> 
> Hi Jonathan,
> 
> I also had it in mind. Just one thing confused me:
> - The ->info struct is filled in by the device driver before calling one
> of the "iio_device_register" routines.
> - But it's assigned to NULL inside the unregister routine.
> That's also why I've sent it as RFC.

Yeah, IIRC we debated whether this was fair use of the pointer
when this handling was originally introduced.  The arguement Lars
(I think) made was that we always knew this pointer had no valid
use after this call, so why not use it even if we break the balance
of probe / remove.

> 
> > The cdev register is fundamentally the point where the device
> > becomes exposed to userspace, so we naturally want to do it last
> > (and remove it first).  I worry that we may have some paths
> > in which we don't sanity check the existence of info (which
> > is kind of our backup plan to indicate the device has gone
> > away).
> > 
> > Are we safe to instead of reordering, just not take the lock
> > until after the problem functions?  Info doesn't go
> > away until later so I think we are.  I haven't looked it in that
> > much detail though!  
> 
> Ok, thanks for making up my mind :-). As far as I understand, the
> "info_exist_lock" targets the inkern users (e.g. exported routines). So,
> cdev_device_del() can probably be called unlocked, without reordering.
Yes, I think you are right.

> 
> > 
> > Thanks for raising this as it's a nasty little problem.  
> 
> I'll send a v2 based on your proposal.
Cool. Thanks!

Jonathan
> 
> Thanks for your help,
> Best Regards,
> Fabrice
> 
> > 
> > Jonathan
> > 
> >   
> >> ---
> >>  drivers/iio/industrialio-core.c | 12 ++++++------
> >>  1 file changed, 6 insertions(+), 6 deletions(-)
> >>
> >> diff --git a/drivers/iio/industrialio-core.c b/drivers/iio/industrialio-core.c
> >> index 4700fd5..e03d6ff 100644
> >> --- a/drivers/iio/industrialio-core.c
> >> +++ b/drivers/iio/industrialio-core.c
> >> @@ -1745,19 +1745,19 @@ void iio_device_unregister(struct iio_dev *indio_dev)
> >>  {
> >>  	mutex_lock(&indio_dev->info_exist_lock);
> >>  
> >> -	cdev_device_del(&indio_dev->chrdev, &indio_dev->dev);
> >> -
> >> -	iio_device_unregister_debugfs(indio_dev);
> >> -
> >>  	iio_disable_all_buffers(indio_dev);
> >>  
> >>  	indio_dev->info = NULL;
> >>  
> >> +	mutex_unlock(&indio_dev->info_exist_lock);
> >> +
> >> +	cdev_device_del(&indio_dev->chrdev, &indio_dev->dev);
> >> +
> >> +	iio_device_unregister_debugfs(indio_dev);
> >> +
> >>  	iio_device_wakeup_eventset(indio_dev);
> >>  	iio_buffer_wakeup_poll(indio_dev);
> >>  
> >> -	mutex_unlock(&indio_dev->info_exist_lock);
> >> -
> >>  	iio_buffer_free_sysfs_and_mask(indio_dev);
> >>  }
> >>  EXPORT_SYMBOL(iio_device_unregister);  
> >   


^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2019-03-30 18:46 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-03-22 13:54 [RFC PATCH] iio: core: fix a possible circular locking dependency Fabrice Gasnier
2019-03-24 15:47 ` Jonathan Cameron
2019-03-25 12:51   ` Fabrice Gasnier
2019-03-30 18:46     ` Jonathan Cameron

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).