[1/7] regulator: push allocation in regulator_init_coupling() outside of lock

Message ID b305adf8bcde9417cdd5c9d84ef5ed99541f0e2c.1597107682.git.mirq-linux@rere.qmqm.pl
State Superseded
Series
  • regulator: fix deadlock vs memory reclaim

Commit Message

Michał Mirosław Aug. 11, 2020, 1:07 a.m. UTC
Allocating memory with regulator_list_mutex held makes lockdep unhappy
when memory pressure makes the system do fs_reclaim on e.g. eMMC using
a regulator. Push the lock inside regulator_init_coupling() after the
allocation.
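
To illustrate the pattern (a userspace sketch, not the kernel code): the allocation is done first, and the mutex is held only while the shared list is touched. The names demo_dev, demo_list_mutex, demo_register() and demo_count() are hypothetical; a pthread mutex and malloc() stand in for regulator_list_mutex and kmalloc().

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; not the regulator core's types. */
struct demo_dev {
	char *name;
	struct demo_dev *next;
};

static pthread_mutex_t demo_list_mutex = PTHREAD_MUTEX_INITIALIZER;
static struct demo_dev *demo_list;

/* Register a device: allocate with no lock held, then lock only to link. */
int demo_register(const char *name)
{
	struct demo_dev *dev;

	assert(name != NULL);

	/* All allocations happen before the mutex is taken. */
	dev = malloc(sizeof(*dev));
	if (!dev)
		return -1;

	dev->name = malloc(strlen(name) + 1);
	if (!dev->name) {
		free(dev);
		return -1;
	}
	strcpy(dev->name, name);

	/* The critical section only touches the shared list. */
	pthread_mutex_lock(&demo_list_mutex);
	dev->next = demo_list;
	demo_list = dev;
	pthread_mutex_unlock(&demo_list_mutex);

	return 0;
}

/* Count registered devices under the same lock. */
int demo_count(void)
{
	int n = 0;

	pthread_mutex_lock(&demo_list_mutex);
	for (struct demo_dev *d = demo_list; d; d = d->next)
		n++;
	pthread_mutex_unlock(&demo_list_mutex);

	return n;
}
```

With this ordering, whatever the allocator needs to do (in the kernel: direct reclaim) runs before the list lock is taken, so the lock is never ordered before fs_reclaim on this path.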

======================================================
WARNING: possible circular locking dependency detected
5.7.13+ #533 Not tainted
------------------------------------------------------
kswapd0/383 is trying to acquire lock:
cca78ca4 (&sbi->write_io[i][j].io_rwsem){++++}-{3:3}, at: __submit_merged_write_cond+0x104/0x154
but task is already holding lock:
c0e38518 (fs_reclaim){+.+.}-{0:0}, at: __fs_reclaim_acquire+0x0/0x50
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #2 (fs_reclaim){+.+.}-{0:0}:
       fs_reclaim_acquire.part.11+0x40/0x50
       fs_reclaim_acquire+0x24/0x28
       __kmalloc+0x54/0x218
       regulator_register+0x860/0x1584
       dummy_regulator_probe+0x60/0xa8
[...]
other info that might help us debug this:

Chain exists of:
  &sbi->write_io[i][j].io_rwsem --> regulator_list_mutex --> fs_reclaim

Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(fs_reclaim);
                               lock(regulator_list_mutex);
                               lock(fs_reclaim);
  lock(&sbi->write_io[i][j].io_rwsem);
 *** DEADLOCK ***

1 lock held by kswapd0/383:
 #0: c0e38518 (fs_reclaim){+.+.}-{0:0}, at: __fs_reclaim_acquire+0x0/0x50
[...]
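
The report above can be read as a cycle in a lock-order graph. Below is a toy model of that check, assuming only the three lock classes named in the chain. toy_add_dependency() and reaches() are hypothetical helpers, and this is not how lockdep itself is implemented.

```c
#include <assert.h>
#include <stdbool.h>

/* The three lock classes from the report, as graph nodes. */
enum {
	IO_RWSEM,
	REGULATOR_LIST_MUTEX,
	FS_RECLAIM,
	NR_LOCKS
};

/* edge[a][b] means "a was held while b was acquired". */
static bool edge[NR_LOCKS][NR_LOCKS];

/* Depth-first search: is 'to' reachable from 'from' via recorded edges? */
static bool reaches(int from, int to, bool *seen)
{
	if (from == to)
		return true;

	seen[from] = true;
	for (int next = 0; next < NR_LOCKS; next++) {
		if (edge[from][next] && !seen[next] && reaches(next, to, seen))
			return true;
	}
	return false;
}

/*
 * Record "held is taken before wanted". Returns false, without recording
 * the edge, if the new dependency would close a cycle.
 */
bool toy_add_dependency(int held, int wanted)
{
	bool seen[NR_LOCKS] = { false };

	assert(held >= 0 && held < NR_LOCKS);
	assert(wanted >= 0 && wanted < NR_LOCKS);

	if (reaches(wanted, held, seen))
		return false;

	edge[held][wanted] = true;
	return true;
}
```

Feeding in the existing chain (io_rwsem before regulator_list_mutex, regulator_list_mutex before fs_reclaim) and then kswapd's attempt to take io_rwsem under fs_reclaim makes the last call fail, which corresponds to the warning.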

Cc: stable@vger.kernel.org
Fixes: d8ca7d184b33 ("regulator: core: Introduce API for regulators coupling customization")
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
---
 drivers/regulator/core.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

Comments

Dmitry Osipenko Aug. 11, 2020, 3:59 p.m. UTC | #1
11.08.2020 04:07, Michał Mirosław wrote:
> Allocating memory with regulator_list_mutex held makes lockdep unhappy
> when memory pressure makes the system do fs_reclaim on eg. eMMC using
> a regulator. Push the lock inside regulator_init_coupling() after the
> allocation.
...

Reviewed-by: Dmitry Osipenko <digetx@gmail.com>

Dmitry Osipenko Aug. 11, 2020, 4:27 p.m. UTC | #2
11.08.2020 18:59, Dmitry Osipenko wrote:
> 11.08.2020 04:07, Michał Mirosław wrote:
>> Allocating memory with regulator_list_mutex held makes lockdep unhappy
>> when memory pressure makes the system do fs_reclaim on eg. eMMC using
>> a regulator. Push the lock inside regulator_init_coupling() after the
>> allocation.
> ...
> 
> Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
> 

On the other hand, couldn't it be better to just remove taking the
list_mutex from the regulator_lock_dependent()?

I think the list_mutex is only needed to protect from supply/couple
regulator being removed during the locking process, but maybe this is
not something we should worry about?

Michał Mirosław Aug. 11, 2020, 5:20 p.m. UTC | #3
On Tue, Aug 11, 2020 at 07:27:43PM +0300, Dmitry Osipenko wrote:
> 11.08.2020 18:59, Dmitry Osipenko wrote:
> > 11.08.2020 04:07, Michał Mirosław wrote:
> >> Allocating memory with regulator_list_mutex held makes lockdep unhappy
> >> when memory pressure makes the system do fs_reclaim on eg. eMMC using
> >> a regulator. Push the lock inside regulator_init_coupling() after the
> >> allocation.
> > ...
> > 
> > Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
> On the other hand, couldn't it be better to just remove taking the
> list_mutex from the regulator_lock_dependent()?
> 
> I think the list_mutex is only needed to protect from supply/couple
> regulator being removed during of the locking process, but maybe this is
> not something we should worry about?

This is what I would like to see in the end, but it requires more
thought, at least around interaction with regulator_resolve_coupling()
and the regulator removal.

Best Regards,
Michał Mirosław

Dmitry Osipenko Aug. 11, 2020, 9:02 p.m. UTC | #4
11.08.2020 20:20, Michał Mirosław wrote:
> On Tue, Aug 11, 2020 at 07:27:43PM +0300, Dmitry Osipenko wrote:
>> 11.08.2020 18:59, Dmitry Osipenko wrote:
>>> 11.08.2020 04:07, Michał Mirosław wrote:
>>>> Allocating memory with regulator_list_mutex held makes lockdep unhappy
>>>> when memory pressure makes the system do fs_reclaim on eg. eMMC using
>>>> a regulator. Push the lock inside regulator_init_coupling() after the
>>>> allocation.
>>> ...
>>>
>>> Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
>> On the other hand, couldn't it be better to just remove taking the
>> list_mutex from the regulator_lock_dependent()?
>>
>> I think the list_mutex is only needed to protect from supply/couple
>> regulator being removed during of the locking process, but maybe this is
>> not something we should worry about?
> 
> This is what I would like to see in the end, but it requires more
> thought, at least around interaction with regulator_resolve_coupling()
> and the regulator removal.

I meant that it's very unlikely for a regulator to go away while it's
in use. Hence it could be okay to ignore this rare case and thus
simplify the fix significantly by removing the offending lock.

Still, this won't solve the root of the problem, because reclaim could
potentially happen while the storage regulator (or its supply) is locked,
although that should be a very unlikely condition in practice.

Patch

diff --git a/drivers/regulator/core.c b/drivers/regulator/core.c
index 0a32c3da0e26..510d234f6c46 100644
--- a/drivers/regulator/core.c
+++ b/drivers/regulator/core.c
@@ -5010,7 +5010,10 @@  static int regulator_init_coupling(struct regulator_dev *rdev)
 	if (!of_check_coupling_data(rdev))
 		return -EPERM;
 
+	mutex_lock(&regulator_list_mutex);
 	rdev->coupling_desc.coupler = regulator_find_coupler(rdev);
+	mutex_unlock(&regulator_list_mutex);
+
 	if (IS_ERR(rdev->coupling_desc.coupler)) {
 		err = PTR_ERR(rdev->coupling_desc.coupler);
 		rdev_err(rdev, "failed to get coupler: %d\n", err);
@@ -5218,9 +5221,7 @@  regulator_register(const struct regulator_desc *regulator_desc,
 	if (ret < 0)
 		goto wash;
 
-	mutex_lock(&regulator_list_mutex);
 	ret = regulator_init_coupling(rdev);
-	mutex_unlock(&regulator_list_mutex);
 	if (ret < 0)
 		goto wash;