* [RFC PATCH 1/3] regulator: allocate memory outside of regulator_list mutex
[not found] <cover.1597089543.git.mirq-linux@rere.qmqm.pl>
@ 2020-08-10 20:09 ` Michał Mirosław
2020-08-10 20:09 ` [RFC PATCH 2/3] regulator: push enable_gpio allocation out from under regulator_list_mutex Michał Mirosław
` (2 subsequent siblings)
3 siblings, 0 replies; 18+ messages in thread
From: Michał Mirosław @ 2020-08-10 20:09 UTC (permalink / raw)
To: Dmitry Osipenko; +Cc: Mark Brown, Liam Girdwood, linux-kernel
Allocating memory with regulator_list_mutex held makes lockdep unhappy
when memory pressure triggers fs_reclaim on e.g. an eMMC device whose
voltage is controlled by a regulator. Push the lock inside
regulator_init_coupling(), after the allocation.
======================================================
WARNING: possible circular locking dependency detected
5.7.13+ #533 Not tainted
------------------------------------------------------
kswapd0/383 is trying to acquire lock:
cca78ca4 (&sbi->write_io[i][j].io_rwsem){++++}-{3:3}, at: __submit_merged_write_cond+0x104/0x154
but task is already holding lock:
c0e38518 (fs_reclaim){+.+.}-{0:0}, at: __fs_reclaim_acquire+0x0/0x50
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #2 (fs_reclaim){+.+.}-{0:0}:
fs_reclaim_acquire.part.11+0x40/0x50
fs_reclaim_acquire+0x24/0x28
__kmalloc+0x54/0x218
regulator_register+0x860/0x1584
dummy_regulator_probe+0x60/0xa8
[...]
other info that might help us debug this:
Chain exists of:
&sbi->write_io[i][j].io_rwsem --> regulator_list_mutex --> fs_reclaim
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(fs_reclaim);
lock(regulator_list_mutex);
lock(fs_reclaim);
lock(&sbi->write_io[i][j].io_rwsem);
*** DEADLOCK ***
1 lock held by kswapd0/383:
#0: c0e38518 (fs_reclaim){+.+.}-{0:0}, at: __fs_reclaim_acquire+0x0/0x50
[...]
Fixes: d8ca7d184b33 ("regulator: core: Introduce API for regulators coupling customization")
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
---
drivers/regulator/core.c | 19 ++++++++++---------
1 file changed, 10 insertions(+), 9 deletions(-)
diff --git a/drivers/regulator/core.c b/drivers/regulator/core.c
index 2ee109950352..915a727d8fc7 100644
--- a/drivers/regulator/core.c
+++ b/drivers/regulator/core.c
@@ -4900,7 +4900,7 @@ static void regulator_remove_coupling(struct regulator_dev *rdev)
static int regulator_init_coupling(struct regulator_dev *rdev)
{
struct regulator_dev **coupled;
- int err, n_phandles;
+ int err = 0, n_phandles;
if (!IS_ENABLED(CONFIG_OF))
n_phandles = 0;
@@ -4911,6 +4911,7 @@ static int regulator_init_coupling(struct regulator_dev *rdev)
if (!coupled)
return -ENOMEM;
+ mutex_lock(&regulator_list_mutex);
rdev->coupling_desc.coupled_rdevs = coupled;
/*
@@ -4923,19 +4924,21 @@ static int regulator_init_coupling(struct regulator_dev *rdev)
/* regulator isn't coupled */
if (n_phandles == 0)
- return 0;
+ goto out;
- if (!of_check_coupling_data(rdev))
- return -EPERM;
+ if (!of_check_coupling_data(rdev)) {
+ err = -EPERM;
+ goto out;
+ }
rdev->coupling_desc.coupler = regulator_find_coupler(rdev);
if (IS_ERR(rdev->coupling_desc.coupler)) {
err = PTR_ERR(rdev->coupling_desc.coupler);
rdev_err(rdev, "failed to get coupler: %d\n", err);
- return err;
}
-
- return 0;
+out:
+ mutex_unlock(&regulator_list_mutex);
+ return err;
}
static int generic_coupler_attach(struct regulator_coupler *coupler,
@@ -5135,9 +5138,7 @@ regulator_register(const struct regulator_desc *regulator_desc,
if (ret < 0)
goto wash;
- mutex_lock(&regulator_list_mutex);
ret = regulator_init_coupling(rdev);
- mutex_unlock(&regulator_list_mutex);
if (ret < 0)
goto wash;
--
2.20.1
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [RFC PATCH 2/3] regulator: push enable_gpio allocation out from under regulator_list_mutex
[not found] <cover.1597089543.git.mirq-linux@rere.qmqm.pl>
2020-08-10 20:09 ` [RFC PATCH 1/3] regulator: allocate memory outside of regulator_list mutex Michał Mirosław
@ 2020-08-10 20:09 ` Michał Mirosław
2020-08-10 20:09 ` [RFC PATCH 3/3] regulator: push supply_name allocation outside of lock Michał Mirosław
2020-08-10 20:15 ` regulator: deadlock vs memory reclaim Dmitry Osipenko
3 siblings, 0 replies; 18+ messages in thread
From: Michał Mirosław @ 2020-08-10 20:09 UTC (permalink / raw)
To: Dmitry Osipenko; +Cc: Mark Brown, Liam Girdwood, linux-kernel
Move another allocation out of the regulator_list_mutex-protected
region, as reclaim might want to take the same lock.
WARNING: possible circular locking dependency detected
5.7.13+ #534 Not tainted
------------------------------------------------------
kswapd0/383 is trying to acquire lock:
c0e5d920 (regulator_list_mutex){+.+.}-{3:3}, at: regulator_lock_dependent+0x54/0x2c0
but task is already holding lock:
c0e38518 (fs_reclaim){+.+.}-{0:0}, at: __fs_reclaim_acquire+0x0/0x50
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (fs_reclaim){+.+.}-{0:0}:
fs_reclaim_acquire.part.11+0x40/0x50
fs_reclaim_acquire+0x24/0x28
kmem_cache_alloc_trace+0x40/0x1e8
regulator_register+0x384/0x1630
devm_regulator_register+0x50/0x84
reg_fixed_voltage_probe+0x248/0x35c
[...]
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(fs_reclaim);
lock(regulator_list_mutex);
lock(fs_reclaim);
lock(regulator_list_mutex);
*** DEADLOCK ***
[...]
2 locks held by kswapd0/383:
#0: c0e38518 (fs_reclaim){+.+.}-{0:0}, at: __fs_reclaim_acquire+0x0/0x50
#1: cb70e5e0 (hctx->srcu){....}-{0:0}, at: hctx_lock+0x60/0xb8
[...]
Fixes: 541d052d7215 ("regulator: core: Only support passing enable GPIO descriptors")
[this commit only changes context]
Fixes: f8702f9e4aa7 ("regulator: core: Use ww_mutex for regulators locking")
[this is when the regulator_list_mutex was introduced in reclaim locking path]
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
---
drivers/regulator/core.c | 19 ++++++++++++++-----
1 file changed, 14 insertions(+), 5 deletions(-)
diff --git a/drivers/regulator/core.c b/drivers/regulator/core.c
index 915a727d8fc7..05c9657c99d9 100644
--- a/drivers/regulator/core.c
+++ b/drivers/regulator/core.c
@@ -2200,10 +2200,13 @@ EXPORT_SYMBOL_GPL(regulator_bulk_unregister_supply_alias);
static int regulator_ena_gpio_request(struct regulator_dev *rdev,
const struct regulator_config *config)
{
- struct regulator_enable_gpio *pin;
+ struct regulator_enable_gpio *pin, *new_pin;
struct gpio_desc *gpiod;
gpiod = config->ena_gpiod;
+ new_pin = kzalloc(sizeof(*new_pin), GFP_KERNEL);
+
+ mutex_lock(&regulator_list_mutex);
list_for_each_entry(pin, &regulator_ena_gpio_list, list) {
if (pin->gpiod == gpiod) {
@@ -2212,9 +2215,13 @@ static int regulator_ena_gpio_request(struct regulator_dev *rdev,
}
}
- pin = kzalloc(sizeof(struct regulator_enable_gpio), GFP_KERNEL);
- if (pin == NULL)
+ if (new_pin == NULL) {
+ mutex_unlock(&regulator_list_mutex);
return -ENOMEM;
+ }
+
+ pin = new_pin;
+ new_pin = NULL;
pin->gpiod = gpiod;
list_add(&pin->list, &regulator_ena_gpio_list);
@@ -2222,6 +2229,10 @@ static int regulator_ena_gpio_request(struct regulator_dev *rdev,
update_ena_gpio_to_rdev:
pin->request_count++;
rdev->ena_pin = pin;
+
+ mutex_unlock(&regulator_list_mutex);
+ kfree(new_pin);
+
return 0;
}
@@ -5098,9 +5109,7 @@ regulator_register(const struct regulator_desc *regulator_desc,
}
if (config->ena_gpiod) {
- mutex_lock(&regulator_list_mutex);
ret = regulator_ena_gpio_request(rdev, config);
- mutex_unlock(&regulator_list_mutex);
if (ret != 0) {
rdev_err(rdev, "Failed to request enable GPIO: %d\n",
ret);
--
2.20.1
* [RFC PATCH 3/3] regulator: push supply_name allocation outside of lock
[not found] <cover.1597089543.git.mirq-linux@rere.qmqm.pl>
2020-08-10 20:09 ` [RFC PATCH 1/3] regulator: allocate memory outside of regulator_list mutex Michał Mirosław
2020-08-10 20:09 ` [RFC PATCH 2/3] regulator: push enable_gpio allocation out from under regulator_list_mutex Michał Mirosław
@ 2020-08-10 20:09 ` Michał Mirosław
2020-08-10 20:15 ` regulator: deadlock vs memory reclaim Dmitry Osipenko
3 siblings, 0 replies; 18+ messages in thread
From: Michał Mirosław @ 2020-08-10 20:09 UTC (permalink / raw)
To: Dmitry Osipenko; +Cc: Mark Brown, Liam Girdwood, linux-kernel
Move all allocations outside of the regulator_lock().
======================================================
WARNING: possible circular locking dependency detected
5.7.13+ #535 Not tainted
------------------------------------------------------
f2fs_discard-179:7/702 is trying to acquire lock:
c0e5d920 (regulator_list_mutex){+.+.}-{3:3}, at: regulator_lock_dependent+0x54/0x2c0
but task is already holding lock:
cb95b080 (&dcc->cmd_lock){+.+.}-{3:3}, at: __issue_discard_cmd+0xec/0x5f8
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
[...]
-> #3 (fs_reclaim){+.+.}-{0:0}:
fs_reclaim_acquire.part.11+0x40/0x50
fs_reclaim_acquire+0x24/0x28
__kmalloc_track_caller+0x54/0x218
kstrdup+0x40/0x5c
create_regulator+0xf4/0x368
regulator_resolve_supply+0x1a0/0x200
regulator_register+0x9c8/0x163c
[...]
other info that might help us debug this:
Chain exists of:
regulator_list_mutex --> &sit_i->sentry_lock --> &dcc->cmd_lock
[...]
Fixes: f8702f9e4aa7 ("regulator: core: Use ww_mutex for regulators locking")
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
---
drivers/regulator/core.c | 46 +++++++++++++++++++++-------------------
1 file changed, 24 insertions(+), 22 deletions(-)
diff --git a/drivers/regulator/core.c b/drivers/regulator/core.c
index 05c9657c99d9..5f518e034c90 100644
--- a/drivers/regulator/core.c
+++ b/drivers/regulator/core.c
@@ -1557,41 +1557,48 @@ static struct regulator *create_regulator(struct regulator_dev *rdev,
const char *supply_name)
{
struct regulator *regulator;
- char buf[REG_STR_SIZE];
- int err, size;
+ int err;
+
+ if (dev) {
+ char buf[REG_STR_SIZE];
+ int size;
+
+ size = snprintf(buf, REG_STR_SIZE, "%s-%s",
+ dev->kobj.name, supply_name);
+ if (size >= REG_STR_SIZE)
+ return NULL;
+
+ supply_name = kstrdup(buf, GFP_KERNEL);
+ if (supply_name == NULL)
+ return NULL;
+ } else {
+ supply_name = kstrdup_const(supply_name, GFP_KERNEL);
+ if (supply_name == NULL)
+ return NULL;
+ }
regulator = kzalloc(sizeof(*regulator), GFP_KERNEL);
- if (regulator == NULL)
+ if (regulator == NULL) {
+ kfree(supply_name);
return NULL;
+ }
regulator_lock(rdev);
regulator->rdev = rdev;
+ regulator->supply_name = supply_name;
list_add(&regulator->list, &rdev->consumer_list);
if (dev) {
regulator->dev = dev;
/* Add a link to the device sysfs entry */
- size = snprintf(buf, REG_STR_SIZE, "%s-%s",
- dev->kobj.name, supply_name);
- if (size >= REG_STR_SIZE)
- goto overflow_err;
-
- regulator->supply_name = kstrdup(buf, GFP_KERNEL);
- if (regulator->supply_name == NULL)
- goto overflow_err;
-
err = sysfs_create_link_nowarn(&rdev->dev.kobj, &dev->kobj,
- buf);
+ supply_name);
if (err) {
rdev_dbg(rdev, "could not add device link %s err %d\n",
dev->kobj.name, err);
/* non-fatal */
}
- } else {
- regulator->supply_name = kstrdup_const(supply_name, GFP_KERNEL);
- if (regulator->supply_name == NULL)
- goto overflow_err;
}
regulator->debugfs = debugfs_create_dir(regulator->supply_name,
@@ -1621,11 +1628,6 @@ static struct regulator *create_regulator(struct regulator_dev *rdev,
regulator_unlock(rdev);
return regulator;
-overflow_err:
- list_del(&regulator->list);
- kfree(regulator);
- regulator_unlock(rdev);
- return NULL;
}
static int _regulator_get_enable_time(struct regulator_dev *rdev)
--
2.20.1
* Re: regulator: deadlock vs memory reclaim
[not found] <cover.1597089543.git.mirq-linux@rere.qmqm.pl>
` (2 preceding siblings ...)
2020-08-10 20:09 ` [RFC PATCH 3/3] regulator: push supply_name allocation outside of lock Michał Mirosław
@ 2020-08-10 20:15 ` Dmitry Osipenko
2020-08-10 20:18 ` Michał Mirosław
3 siblings, 1 reply; 18+ messages in thread
From: Dmitry Osipenko @ 2020-08-10 20:15 UTC (permalink / raw)
To: Michał Mirosław; +Cc: Mark Brown, Liam Girdwood, linux-kernel
On 10.08.2020 23:09, Michał Mirosław wrote:
> At first I also thought so, but there's more. Below is a lockdep
> complaint with your patch applied. I did a similar patch and then two more
> (following) and that is still not enough (sysfs/debugfs do allocations,
> too).
Then it should be good to move the locking for init_coupling() like I
suggested and use GFP_NOWAIT for the two other cases. It all could be a
single small patch. Could you please check whether GFP_NOWAIT helps?
* Re: regulator: deadlock vs memory reclaim
2020-08-10 20:15 ` regulator: deadlock vs memory reclaim Dmitry Osipenko
@ 2020-08-10 20:18 ` Michał Mirosław
2020-08-10 20:21 ` Dmitry Osipenko
0 siblings, 1 reply; 18+ messages in thread
From: Michał Mirosław @ 2020-08-10 20:18 UTC (permalink / raw)
To: Dmitry Osipenko; +Cc: Mark Brown, Liam Girdwood, linux-kernel
On Mon, Aug 10, 2020 at 11:15:28PM +0300, Dmitry Osipenko wrote:
> On 10.08.2020 23:09, Michał Mirosław wrote:
> > At first I also thought so, but there's more. Below is a lockdep
> > complaint with your patch applied. I did a similar patch and then two more
> > (following) and that is still not enough (sysfs/debugfs do allocations,
> > too).
> Then it should be good to move the locking for init_coupling() like I
> suggested and use GFP_NOWAIT for the two other cases. It all could be a
> single small patch. Could you please check whether GFP_NOWAIT helps?
This would be equivalent to my patches. Problem with sysfs and debugfs
remains as they don't have the option of GFP_NOWAIT. This needs to be
moved outside of the locks.
Best Regards,
Michał Mirosław
* Re: regulator: deadlock vs memory reclaim
2020-08-10 20:18 ` Michał Mirosław
@ 2020-08-10 20:21 ` Dmitry Osipenko
2020-08-10 20:56 ` Dmitry Osipenko
0 siblings, 1 reply; 18+ messages in thread
From: Dmitry Osipenko @ 2020-08-10 20:21 UTC (permalink / raw)
To: Michał Mirosław; +Cc: Mark Brown, Liam Girdwood, linux-kernel
On 10.08.2020 23:18, Michał Mirosław wrote:
> On Mon, Aug 10, 2020 at 11:15:28PM +0300, Dmitry Osipenko wrote:
>> On 10.08.2020 23:09, Michał Mirosław wrote:
>>> At first I also thought so, but there's more. Below is a lockdep
>>> complaint with your patch applied. I did a similar patch and then two more
>>> (following) and that is still not enough (sysfs/debugfs do allocations,
>>> too).
>> Then it should be good to move the locking for init_coupling() like I
>> suggested and use GFP_NOWAIT for the two other cases. It all could be a
>> single small patch. Could you please check whether GFP_NOWAIT helps?
>
> This would be equivalent to my patches. Problem with sysfs and debugfs
> remains as they don't have the option of GFP_NOWAIT. This needs to be
> moved outside of the locks.
Ah okay, you meant the debugfs core. I see now, thanks.
* Re: regulator: deadlock vs memory reclaim
2020-08-10 20:21 ` Dmitry Osipenko
@ 2020-08-10 20:56 ` Dmitry Osipenko
2020-08-10 21:23 ` Dmitry Osipenko
2020-08-11 0:07 ` Michał Mirosław
0 siblings, 2 replies; 18+ messages in thread
From: Dmitry Osipenko @ 2020-08-10 20:56 UTC (permalink / raw)
To: Michał Mirosław, Mark Brown; +Cc: Liam Girdwood, linux-kernel
On 10.08.2020 23:21, Dmitry Osipenko wrote:
> On 10.08.2020 23:18, Michał Mirosław wrote:
>> On Mon, Aug 10, 2020 at 11:15:28PM +0300, Dmitry Osipenko wrote:
>>> On 10.08.2020 23:09, Michał Mirosław wrote:
>>>> At first I also thought so, but there's more. Below is a lockdep
>>>> complaint with your patch applied. I did a similar patch and then two more
>>>> (following) and that is still not enough (sysfs/debugfs do allocations,
>>>> too).
>>> Then it should be good to move the locking for init_coupling() like I
>>> suggested and use GFP_NOWAIT for the two other cases. It all could be a
>>> single small patch. Could you please check whether GFP_NOWAIT helps?
>>
>> This would be equivalent to my patches. Problem with sysfs and debugfs
>> remains as they don't have the option of GFP_NOWAIT. This needs to be
>> moved outside of the locks.
>
> Ah okay, you meant the debugfs core. I see now, thanks.
>
This indeed needs a capital solution.
It's not obvious how to fix it... we can probably remove taking the
list_mutex from lock_dependent(), but this still won't help the case of
memory reclaiming because reclaim may cause touching the already locked
regulator. IIUC, the case of memory reclaiming under regulator lock was
always dangerous and happened to work by chance before, correct?
* Re: regulator: deadlock vs memory reclaim
2020-08-10 20:56 ` Dmitry Osipenko
@ 2020-08-10 21:23 ` Dmitry Osipenko
2020-08-11 0:07 ` Michał Mirosław
1 sibling, 0 replies; 18+ messages in thread
From: Dmitry Osipenko @ 2020-08-10 21:23 UTC (permalink / raw)
To: Michał Mirosław, Mark Brown; +Cc: Liam Girdwood, linux-kernel
On 10.08.2020 23:56, Dmitry Osipenko wrote:
> On 10.08.2020 23:21, Dmitry Osipenko wrote:
>> On 10.08.2020 23:18, Michał Mirosław wrote:
>>> On Mon, Aug 10, 2020 at 11:15:28PM +0300, Dmitry Osipenko wrote:
>>>> On 10.08.2020 23:09, Michał Mirosław wrote:
>>>>> At first I also thought so, but there's more. Below is a lockdep
>>>>> complaint with your patch applied. I did a similar patch and then two more
>>>>> (following) and that is still not enough (sysfs/debugfs do allocations,
>>>>> too).
>>>> Then it should be good to move the locking for init_coupling() like I
>>>> suggested and use GFP_NOWAIT for the two other cases. It all could be a
>>>> single small patch. Could you please check whether GFP_NOWAIT helps?
>>>
>>> This would be equivalent to my patches. Problem with sysfs and debugfs
>>> remains as they don't have the option of GFP_NOWAIT. This needs to be
>>> moved outside of the locks.
>>
>> Ah okay, you meant the debugfs core. I see now, thanks.
>>
>
> This indeed needs a capital solution.
>
> It's not obvious how to fix it... we can probably remove taking the
> list_mutex from lock_dependent(), but this still won't help the case of
> memory reclaiming because reclaim may cause touching the already locked
> regulator. IIUC, the case of memory reclaiming under regulator lock was
> always dangerous and happened to work by chance before, correct?
>
And as Mark mentioned before, this situation may potentially happen
via other paths as well.
* Re: regulator: deadlock vs memory reclaim
2020-08-10 20:56 ` Dmitry Osipenko
2020-08-10 21:23 ` Dmitry Osipenko
@ 2020-08-11 0:07 ` Michał Mirosław
2020-08-11 15:44 ` Dmitry Osipenko
1 sibling, 1 reply; 18+ messages in thread
From: Michał Mirosław @ 2020-08-11 0:07 UTC (permalink / raw)
To: Dmitry Osipenko; +Cc: Mark Brown, Liam Girdwood, linux-kernel
On Mon, Aug 10, 2020 at 11:56:13PM +0300, Dmitry Osipenko wrote:
> On 10.08.2020 23:21, Dmitry Osipenko wrote:
> > On 10.08.2020 23:18, Michał Mirosław wrote:
> >> On Mon, Aug 10, 2020 at 11:15:28PM +0300, Dmitry Osipenko wrote:
> >>> On 10.08.2020 23:09, Michał Mirosław wrote:
> >>>> At first I also thought so, but there's more. Below is a lockdep
> >>>> complaint with your patch applied. I did a similar patch and then two more
> >>>> (following) and that is still not enough (sysfs/debugfs do allocations,
> >>>> too).
> >>> Then it should be good to move the locking for init_coupling() like I
> >>> suggested and use GFP_NOWAIT for the two other cases. It all could be a
> >>> single small patch. Could you please check whether GFP_NOWAIT helps?
> >>
> >> This would be equivalent to my patches. Problem with sysfs and debugfs
> >> remains as they don't have the option of GFP_NOWAIT. This needs to be
> >> moved outside of the locks.
> >
> > Ah okay, you meant the debugfs core. I see now, thanks.
> >
>
> This indeed needs a capital solution.
>
> It's not obvious how to fix it... we can probably remove taking the
> list_mutex from lock_dependent(), but this still won't help the case of
> memory reclaiming because reclaim may cause touching the already locked
> regulator. IIUC, the case of memory reclaiming under regulator lock was
> always dangerous and happened to work by chance before, correct?
I just noticed that locking in regulator_resolve_coupling() is bogus.
This all holds up because regulator_list_mutex is held during the call.
Feel free to test a patch below.
I'm working my way to push allocations outside of the locks, but the
coupling-related locking will need to be fixed regardless.
Best Regards,
Michał Mirosław
---->8<----
[PATCH] regulator: remove superfluous lock in regulator_resolve_coupling()
The code modifies rdev, but locks c_rdev instead. The bug remains:
stored c_rdev could be freed just after unlock anyway. This doesn't blow
up because regulator_list_mutex taken outside holds it together.
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
---
drivers/regulator/core.c | 4 ----
1 file changed, 4 deletions(-)
diff --git a/drivers/regulator/core.c b/drivers/regulator/core.c
index 94f9225869da..e519bc9a860d 100644
--- a/drivers/regulator/core.c
+++ b/drivers/regulator/core.c
@@ -4859,13 +4859,9 @@ static void regulator_resolve_coupling(struct regulator_dev *rdev)
return;
}
- regulator_lock(c_rdev);
-
c_desc->coupled_rdevs[i] = c_rdev;
c_desc->n_resolved++;
- regulator_unlock(c_rdev);
-
regulator_resolve_coupling(c_rdev);
}
}
--
2.20.1
* Re: regulator: deadlock vs memory reclaim
2020-08-11 0:07 ` Michał Mirosław
@ 2020-08-11 15:44 ` Dmitry Osipenko
0 siblings, 0 replies; 18+ messages in thread
From: Dmitry Osipenko @ 2020-08-11 15:44 UTC (permalink / raw)
To: Michał Mirosław; +Cc: Mark Brown, Liam Girdwood, linux-kernel
On 11.08.2020 03:07, Michał Mirosław wrote:
...
> I just noticed that locking in regulator_resolve_coupling() is bogus.
> This all holds up because regulator_list_mutex is held during the call.
> Feel free to test a patch below.
>
> I'm working my way to push allocations outside of the locks, but the
> coupling-related locking will need to be fixed regardless.
>
> Best Regards,
> Michał Mirosław
>
> ---->8<----
>
> [PATCH] regulator: remove superfluous lock in regulator_resolve_coupling()
>
> The code modifies rdev, but locks c_rdev instead. The bug remains:
> stored c_rdev could be freed just after unlock anyway. This doesn't blow
> up because regulator_list_mutex taken outside holds it together.
>
> Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
> ---
> drivers/regulator/core.c | 4 ----
> 1 file changed, 4 deletions(-)
>
> diff --git a/drivers/regulator/core.c b/drivers/regulator/core.c
> index 94f9225869da..e519bc9a860d 100644
> --- a/drivers/regulator/core.c
> +++ b/drivers/regulator/core.c
> @@ -4859,13 +4859,9 @@ static void regulator_resolve_coupling(struct regulator_dev *rdev)
> return;
> }
>
> - regulator_lock(c_rdev);
> -
> c_desc->coupled_rdevs[i] = c_rdev;
> c_desc->n_resolved++;
>
> - regulator_unlock(c_rdev);
> -
> regulator_resolve_coupling(c_rdev);
> }
> }
>
The change looks like a good cleanup to me, thanks. I think that c_rdev
locking was accidentally left over from some older version of the patch
that introduced the coupling support. There shouldn't be any real bug in
this code.
IIRC, at some point I changed the code to disallow consumers from
getting a partially coupled regulator and then protected
resolve_coupling() with list_mutex, but it seems I missed removing that
c_rdev locking. Hence there shouldn't be a need to lock regulators
individually during the resolve, because nothing should touch the
coupled regulators until all the coupling has been resolved.
* Re: regulator: deadlock vs memory reclaim
2020-08-09 22:25 Michał Mirosław
@ 2020-08-10 0:09 ` Dmitry Osipenko
2020-08-10 15:39 ` Mark Brown
1 sibling, 0 replies; 18+ messages in thread
From: Dmitry Osipenko @ 2020-08-10 0:09 UTC (permalink / raw)
To: Michał Mirosław, Liam Girdwood, Mark Brown; +Cc: linux-kernel
On 10.08.2020 01:25, Michał Mirosław wrote:
> Hi guys,
>
> Commit f8702f9e4aa7 ("regulator: core: Use ww_mutex for regulators locking")
> from Nov 2018 tried to fix possible deadlocks when handling coupled
> regulators. Unfortunately it introduced another possible deadlock,
> as discovered by lockdep (see below), instead.
>
> regulator_lock_dependent() starts by taking regulator_list_mutex. The
> same mutex covers eg. regulator initialization, including memory allocations
> that happen there. This will deadlock when you have filesystem on eg. eMMC
> (which uses a regulator to control module voltages) and you register
> a new regulator (hotplug a device?) when under memory pressure.
>
> There is also another problem with regulator_lock_dependent(): all the
> w/w rollback stuff is useless: because of the outer lock, there can only
> be one contender doing the multiple-lock-grabbing procedure. In this setup,
> the procedure cannot detect other processes waiting on
> regulator_lock_dependent() and it cannot signal (wound a transaction of)
> current holders of locks taken by regulator_lock().
>
> Basically, we have a BKL for regulator_enable() and we're using ww_mutex
> as a recursive mutex with no deadlock prevention whatsoever. The locks
> also seem to cover way too much (e.g. initialization even before making the
> regulator visible to the system).
>
> To fix the regulator vs memory reclaim path I tried pushing all allocations
> out of protected sections. After doing a few patches, though, I'm not sure
> I'm going in the right direction. Your thoughts?
IIRC, taking the regulator_list_mutex within regulator_lock_dependent()
is needed in order to protect the case of decoupling regulators.
Perhaps moving out allocations or making them GFP_NOWAIT should be the
easiest solution.
* Re: regulator: deadlock vs memory reclaim
2020-08-09 22:25 Michał Mirosław
2020-08-10 0:09 ` Dmitry Osipenko
@ 2020-08-10 15:39 ` Mark Brown
2020-08-10 16:09 ` Michał Mirosław
1 sibling, 1 reply; 18+ messages in thread
From: Mark Brown @ 2020-08-10 15:39 UTC (permalink / raw)
To: Michał Mirosław; +Cc: Dmitry Osipenko, Liam Girdwood, linux-kernel
On Mon, Aug 10, 2020 at 12:25:37AM +0200, Michał Mirosław wrote:
> regulator_lock_dependent() starts by taking regulator_list_mutex. The
> same mutex covers eg. regulator initialization, including memory allocations
> that happen there. This will deadlock when you have filesystem on eg. eMMC
> (which uses a regulator to control module voltages) and you register
> a new regulator (hotplug a device?) when under memory pressure.
OK, that's very much a corner case, it only applies in the case of
coupled regulators. The most obvious thing here would be to move the
allocations on registration out of the locked region, we really only
need this in the regulator_find_coupler() call I think. If the
regulator isn't coupled we don't need to take the lock at all.
> Basically, we have a BKL for regulator_enable() and we're using ww_mutex
> as a recursive mutex with no deadlock prevention whatsoever. The locks
> also seem to cover way too much (e.g. initialization even before making the
> regulator visible to the system).
Could you be more specific about what you're looking at here? There's
nothing too obvious jumping out from the code here other than the bit
around the coupling allocation, otherwise it looks like we're locking
list walks.
* Re: regulator: deadlock vs memory reclaim
2020-08-10 15:39 ` Mark Brown
@ 2020-08-10 16:09 ` Michał Mirosław
2020-08-10 17:31 ` Mark Brown
0 siblings, 1 reply; 18+ messages in thread
From: Michał Mirosław @ 2020-08-10 16:09 UTC (permalink / raw)
To: Mark Brown; +Cc: Dmitry Osipenko, Liam Girdwood, linux-kernel
On Mon, Aug 10, 2020 at 04:39:28PM +0100, Mark Brown wrote:
> On Mon, Aug 10, 2020 at 12:25:37AM +0200, Michał Mirosław wrote:
>
> > regulator_lock_dependent() starts by taking regulator_list_mutex. The
> > same mutex covers eg. regulator initialization, including memory allocations
> > that happen there. This will deadlock when you have filesystem on eg. eMMC
> > (which uses a regulator to control module voltages) and you register
> > a new regulator (hotplug a device?) when under memory pressure.
>
> OK, that's very much a corner case, it only applies in the case of
> coupled regulators. The most obvious thing here would be to move the
> allocations on registration out of the locked region, we really only
> need this in the regulator_find_coupler() call I think. If the
> regulator isn't coupled we don't need to take the lock at all.
Currently, regulator_lock_dependent() is called by eg. regulator_enable() and
regulator_get_voltage(), so actually any regulator can deadlock this way.
I concur that the locking rules can (and need to) be relaxed.
> > Basically, we have a BKL for regulator_enable() and we're using ww_mutex
> > as a recursive mutex with no deadlock prevention whatsoever. The locks
> > also seem to cover way too much (e.g. initialization even before making the
> > regulator visible to the system).
>
> Could you be more specific about what you're looking at here? There's
> nothing too obvious jumping out from the code here other than the bit
> around the coupling allocation, otherwise it looks like we're locking
> list walks.
When you look at the regulator API (regulator_enable() and friends),
then in their implementation we always start by .._lock_dependent(),
which takes regulator_list_mutex around its work. This mutex is what
makes the code deadlock-prone vs memory allocations. I have a feeling
that this lock is a workaround for historical requirements (recursive
locking of regulator_dev) that might be no longer needed or is just
too defensive programming. Hence my other patches and this inquiry.
Best Regards,
Michał Mirosław
* Re: regulator: deadlock vs memory reclaim
2020-08-10 16:09 ` Michał Mirosław
@ 2020-08-10 17:31 ` Mark Brown
2020-08-10 19:25 ` Michał Mirosław
0 siblings, 1 reply; 18+ messages in thread
From: Mark Brown @ 2020-08-10 17:31 UTC (permalink / raw)
To: Michał Mirosław; +Cc: Dmitry Osipenko, Liam Girdwood, linux-kernel
On Mon, Aug 10, 2020 at 06:09:36PM +0200, Michał Mirosław wrote:
> On Mon, Aug 10, 2020 at 04:39:28PM +0100, Mark Brown wrote:
> > On Mon, Aug 10, 2020 at 12:25:37AM +0200, Michał Mirosław wrote:
> > > regulator_lock_dependent() starts by taking regulator_list_mutex. The
> > > same mutex covers eg. regulator initialization, including memory allocations
> > > that happen there. This will deadlock when you have filesystem on eg. eMMC
> > > (which uses a regulator to control module voltages) and you register
> > > a new regulator (hotplug a device?) when under memory pressure.
> > OK, that's very much a corner case, it only applies in the case of
> > coupled regulators. The most obvious thing here would be to move the
> > allocations on registration out of the locked region, we really only
> > need this in the regulator_find_coupler() call I think. If the
> > regulator isn't coupled we don't need to take the lock at all.
> Currently, regulator_lock_dependent() is called by eg. regulator_enable() and
> regulator_get_voltage(), so actually any regulator can deadlock this way.
The initialization cases that are the trigger are only done for coupled
regulators though AFAICT, otherwise we're not doing allocations with the
lock held and should be able to progress.
> > > Basically, we have a BKL for regulator_enable() and we're using ww_mutex
> > > as a recursive mutex with no deadlock prevention whatsoever. The locks
> > > also seem to cover way too much (eg. initialization even before making the
> > > regulator visible to the system).
> > Could you be more specific about what you're looking at here? There's
> > nothing too obvious jumping out from the code here other than the bit
> > around the coupling allocation, otherwise it looks like we're locking
> > list walks.
> When you look at the regulator API (regulator_enable() and friends),
> then in their implementation we always start with .._lock_dependent(),
> which takes regulator_list_mutex around its work. This mutex is what
> makes the code deadlock-prone versus memory allocation. I have a feeling
> that this lock is a workaround for historical requirements (recursive
> locking of regulator_dev) that may no longer be needed, or is just
> overly defensive programming. Hence my other patches and this inquiry.
We need to both walk up the tree for operations that involve the parent
and rope in any peers that are coupled, the idea is basically to avoid
changes to the coupling while we're trying to figure that out.
* Re: regulator: deadlock vs memory reclaim
2020-08-10 17:31 ` Mark Brown
@ 2020-08-10 19:25 ` Michał Mirosław
2020-08-10 19:41 ` Dmitry Osipenko
0 siblings, 1 reply; 18+ messages in thread
From: Michał Mirosław @ 2020-08-10 19:25 UTC (permalink / raw)
To: Mark Brown; +Cc: Dmitry Osipenko, Liam Girdwood, linux-kernel
On Mon, Aug 10, 2020 at 06:31:36PM +0100, Mark Brown wrote:
> On Mon, Aug 10, 2020 at 06:09:36PM +0200, Michał Mirosław wrote:
> > On Mon, Aug 10, 2020 at 04:39:28PM +0100, Mark Brown wrote:
> > > On Mon, Aug 10, 2020 at 12:25:37AM +0200, Michał Mirosław wrote:
> > > > regulator_lock_dependent() starts by taking regulator_list_mutex. The
> > > > same mutex covers eg. regulator initialization, including memory allocations
> > > > that happen there. This will deadlock when you have a filesystem on eg. eMMC
> > > > (which uses a regulator to control module voltages) and you register
> > > > a new regulator (hotplug a device?) while under memory pressure.
> > > OK, that's very much a corner case, it only applies in the case of
> > > coupled regulators. The most obvious thing here would be to move the
> > > allocations on registration out of the locked region, we really only
> > > need this in the regulator_find_coupler() call I think. If the
> > > regulator isn't coupled we don't need to take the lock at all.
> > Currently, regulator_lock_dependent() is called by eg. regulator_enable() and
> > regulator_get_voltage(), so actually any regulator can deadlock this way.
> The initialization cases that are the trigger are only done for coupled
> regulators though AFAICT, otherwise we're not doing allocations with the
> lock held and should be able to progress.
I caught a few lockdep complaints that suggest otherwise, but I'm still
looking into that.
> > > > Basically, we have a BKL for regulator_enable() and we're using ww_mutex
> > > > as a recursive mutex with no deadlock prevention whatsoever. The locks
> > > > also seem to cover way too much (eg. initialization even before making the
> > > > regulator visible to the system).
> > > Could you be more specific about what you're looking at here? There's
> > > nothing too obvious jumping out from the code here other than the bit
> > > around the coupling allocation, otherwise it looks like we're locking
> > > list walks.
> > When you look at the regulator API (regulator_enable() and friends),
> > then in their implementation we always start with .._lock_dependent(),
> > which takes regulator_list_mutex around its work. This mutex is what
> > makes the code deadlock-prone versus memory allocation. I have a feeling
> > that this lock is a workaround for historical requirements (recursive
> > locking of regulator_dev) that may no longer be needed, or is just
> > too defensive programming. Hence my other patches and this inquiry.
> We need to both walk up the tree for operations that involve the parent
> and rope in any peers that are coupled, the idea is basically to avoid
> changes to the coupling while we're trying to figure that out.
This part I understand, and it is what the ww_mutex should take care of.
But there is another lock taken around the ww_mutex locking transaction
that causes the trouble. I would expect rdev->mutex to be enough to
guard against changes in the coupled-regulators list, but this might
need a fix in the registration phase to guarantee that (it should
probably do the same ww_mutex locking dance as .._lock_dependent()
does now).
Best Regards,
Michał Mirosław
* Re: regulator: deadlock vs memory reclaim
2020-08-10 19:25 ` Michał Mirosław
@ 2020-08-10 19:41 ` Dmitry Osipenko
2020-08-10 19:51 ` Mark Brown
0 siblings, 1 reply; 18+ messages in thread
From: Dmitry Osipenko @ 2020-08-10 19:41 UTC (permalink / raw)
To: Michał Mirosław, Mark Brown; +Cc: Liam Girdwood, linux-kernel
On 10.08.2020 22:25, Michał Mirosław wrote:
>>>>> regulator_lock_dependent() starts by taking regulator_list_mutex. The
>>>>> same mutex covers eg. regulator initialization, including memory allocations
>>>>> that happen there. This will deadlock when you have a filesystem on eg. eMMC
>>>>> (which uses a regulator to control module voltages) and you register
>>>>> a new regulator (hotplug a device?) while under memory pressure.
>>>> OK, that's very much a corner case, it only applies in the case of
>>>> coupled regulators. The most obvious thing here would be to move the
>>>> allocations on registration out of the locked region, we really only
>>>> need this in the regulator_find_coupler() call I think. If the
>>>> regulator isn't coupled we don't need to take the lock at all.
>>> Currently, regulator_lock_dependent() is called by eg. regulator_enable() and
>>> regulator_get_voltage(), so actually any regulator can deadlock this way.
>> The initialization cases that are the trigger are only done for coupled
>> regulators though AFAICT, otherwise we're not doing allocations with the
>> lock held and should be able to progress.
>
> I caught a few lockdep complaints that suggest otherwise, but I'm still
> looking into that.
The problem looks obvious to me: regulator_init_coupling() is protected
by regulator_list_mutex, and regulator_lock_dependent() also takes
regulator_list_mutex. Hence, if the offending reclaim happens from
within init_coupling(), there is a lockup.
1. mutex_lock(&regulator_list_mutex);
2. regulator_init_coupling()
3. kzalloc()
4. reclaim ...
5. regulator_get_voltage()
6. regulator_lock_dependent()
7. mutex_lock(&regulator_list_mutex);
It should be enough just to keep regulator_find_coupler() under the
lock, or even to remove the locking around init_coupling() completely.
I think it would be better to keep find_coupler() protected.
Michał, does this fix your problem?
--- >8 ---
diff --git a/drivers/regulator/core.c b/drivers/regulator/core.c
index 75ff7c563c5d..513f95c6f837 100644
--- a/drivers/regulator/core.c
+++ b/drivers/regulator/core.c
@@ -5040,7 +5040,10 @@ static int regulator_init_coupling(struct regulator_dev *rdev)
 	if (!of_check_coupling_data(rdev))
 		return -EPERM;
 
+	mutex_lock(&regulator_list_mutex);
 	rdev->coupling_desc.coupler = regulator_find_coupler(rdev);
+	mutex_unlock(&regulator_list_mutex);
+
 	if (IS_ERR(rdev->coupling_desc.coupler)) {
 		err = PTR_ERR(rdev->coupling_desc.coupler);
 		rdev_err(rdev, "failed to get coupler: %d\n", err);
@@ -5248,9 +5251,7 @@ regulator_register(const struct regulator_desc *regulator_desc,
 	if (ret < 0)
 		goto wash;
 
-	mutex_lock(&regulator_list_mutex);
 	ret = regulator_init_coupling(rdev);
-	mutex_unlock(&regulator_list_mutex);
 	if (ret < 0)
 		goto wash;
--- >8 ---
* Re: regulator: deadlock vs memory reclaim
2020-08-10 19:41 ` Dmitry Osipenko
@ 2020-08-10 19:51 ` Mark Brown
0 siblings, 0 replies; 18+ messages in thread
From: Mark Brown @ 2020-08-10 19:51 UTC (permalink / raw)
To: Dmitry Osipenko; +Cc: Michał Mirosław, Liam Girdwood, linux-kernel
[-- Attachment #1: Type: text/plain, Size: 1229 bytes --]
On Mon, Aug 10, 2020 at 10:41:54PM +0300, Dmitry Osipenko wrote:
> On 10.08.2020 22:25, Michał Mirosław wrote:
> >> The initialization cases that are the trigger are only done for coupled
> >> regulators though AFAICT, otherwise we're not doing allocations with the
> >> lock held and should be able to progress.
> > I caught a few lockdep complaints that suggest otherwise, but I'm still
> > looking into that.
> The problem looks obvious to me. The regulator_init_coupling() is
> protected with the list_mutex, the regulator_lock_dependent() also
> protected with the list_mutex. Hence if offending reclaim happens from
> init_coupling(), then there is a lockup.
We may also have problems if I/O triggers allocations for some reason,
though that's also going to be a limited set of cases. Might be what
lockdep was showing though.
> It should be enough just to keep the regulator_find_coupler() under
> lock, or even completely remove the locking around init_coupling(). I
> think it should be better to keep the find_coupler() protected.
> Michał, does this fix yours problem?
That was the sort of thing I was thinking about here - it should at
least be an improvement if nothing else.