* [PATCH] cpuidle: psci: Fix suspicious RCU usage
@ 2020-09-04 6:47 Ulf Hansson
2020-09-04 14:13 ` Paul E. McKenney
2020-09-21 13:59 ` Rafael J. Wysocki
0 siblings, 2 replies; 5+ messages in thread
From: Ulf Hansson @ 2020-09-04 6:47 UTC (permalink / raw)
To: Peter Zijlstra, Rafael J . Wysocki, Paul E . McKenney, linux-pm
Cc: Ulf Hansson, Lorenzo Pieralisi, Vincent Guittot, Stephen Boyd,
Naresh Kamboju, Daniel Lezcano, Lina Iyer, Bjorn Andersson,
Steven Rostedt, Sudeep Holla, Thomas Gleixner, linux-arm-kernel
The commit eb1f00237aca ("lockdep,trace: Expose tracepoints"), started to
expose us for tracepoints. This lead to the following RCU splat on an ARM64
Qcom board.
[ 5.529634] WARNING: suspicious RCU usage
[ 5.537307] sdhci-pltfm: SDHCI platform and OF driver helper
[ 5.541092] 5.9.0-rc3 #86 Not tainted
[ 5.541098] -----------------------------
[ 5.541105] ../include/trace/events/lock.h:37 suspicious rcu_dereference_check() usage!
[ 5.541110]
[ 5.541110] other info that might help us debug this:
[ 5.541110]
[ 5.541116]
[ 5.541116] rcu_scheduler_active = 2, debug_locks = 1
[ 5.541122] RCU used illegally from extended quiescent state!
[ 5.541129] no locks held by swapper/0/0.
[ 5.541134]
[ 5.541134] stack backtrace:
[ 5.541143] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.9.0-rc3 #86
[ 5.541149] Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT)
[ 5.541157] Call trace:
[ 5.568185] sdhci_msm 7864900.sdhci: Got CD GPIO
[ 5.574186] dump_backtrace+0x0/0x1c8
[ 5.574206] show_stack+0x14/0x20
[ 5.574229] dump_stack+0xe8/0x154
[ 5.574250] lockdep_rcu_suspicious+0xd4/0xf8
[ 5.574269] lock_acquire+0x3f0/0x460
[ 5.574292] _raw_spin_lock_irqsave+0x80/0xb0
[ 5.574314] __pm_runtime_suspend+0x4c/0x188
[ 5.574341] psci_enter_domain_idle_state+0x40/0xa0
[ 5.574362] cpuidle_enter_state+0xc0/0x610
[ 5.646487] cpuidle_enter+0x38/0x50
[ 5.650651] call_cpuidle+0x18/0x40
[ 5.654467] do_idle+0x228/0x278
[ 5.657678] cpu_startup_entry+0x24/0x70
[ 5.661153] rest_init+0x1a4/0x278
[ 5.665061] arch_call_rest_init+0xc/0x14
[ 5.668272] start_kernel+0x508/0x540
Following the path in pm_runtime_put_sync_suspend() from
psci_enter_domain_idle_state(), it seems like we end up using the RCU.
Therefore, let's simply silence the splat by informing the RCU about it
with RCU_NONIDLE.
Note that, this is a temporary solution. Instead we should strive to avoid
using RCU_NONIDLE (and similar), but rather push rcu_idle_enter|exit()
further down, closer to the arch specific code. However, as the CPU PM
notifiers are also using the RCU, additional rework is needed.
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
---
drivers/cpuidle/cpuidle-psci.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/cpuidle/cpuidle-psci.c b/drivers/cpuidle/cpuidle-psci.c
index 74463841805f..d928b37718bd 100644
--- a/drivers/cpuidle/cpuidle-psci.c
+++ b/drivers/cpuidle/cpuidle-psci.c
@@ -66,7 +66,7 @@ static int psci_enter_domain_idle_state(struct cpuidle_device *dev,
return -1;
/* Do runtime PM to manage a hierarchical CPU toplogy. */
- pm_runtime_put_sync_suspend(pd_dev);
+ RCU_NONIDLE(pm_runtime_put_sync_suspend(pd_dev));
state = psci_get_domain_state();
if (!state)
@@ -74,7 +74,7 @@ static int psci_enter_domain_idle_state(struct cpuidle_device *dev,
ret = psci_cpu_suspend_enter(state) ? -1 : idx;
- pm_runtime_get_sync(pd_dev);
+ RCU_NONIDLE(pm_runtime_get_sync(pd_dev));
cpu_pm_exit();
--
2.25.1
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH] cpuidle: psci: Fix suspicious RCU usage
2020-09-04 6:47 [PATCH] cpuidle: psci: Fix suspicious RCU usage Ulf Hansson
@ 2020-09-04 14:13 ` Paul E. McKenney
2020-09-15 21:17 ` Ulf Hansson
2020-09-21 13:59 ` Rafael J. Wysocki
1 sibling, 1 reply; 5+ messages in thread
From: Paul E. McKenney @ 2020-09-04 14:13 UTC (permalink / raw)
To: Ulf Hansson
Cc: Lorenzo Pieralisi, Vincent Guittot, linux-pm, Peter Zijlstra,
Naresh Kamboju, Daniel Lezcano, Rafael J . Wysocki,
Steven Rostedt, Bjorn Andersson, Stephen Boyd, Lina Iyer,
Sudeep Holla, Thomas Gleixner, linux-arm-kernel
On Fri, Sep 04, 2020 at 08:47:05AM +0200, Ulf Hansson wrote:
> The commit eb1f00237aca ("lockdep,trace: Expose tracepoints"), started to
> expose us for tracepoints. This lead to the following RCU splat on an ARM64
> Qcom board.
>
> [ 5.529634] WARNING: suspicious RCU usage
> [ 5.537307] sdhci-pltfm: SDHCI platform and OF driver helper
> [ 5.541092] 5.9.0-rc3 #86 Not tainted
> [ 5.541098] -----------------------------
> [ 5.541105] ../include/trace/events/lock.h:37 suspicious rcu_dereference_check() usage!
> [ 5.541110]
> [ 5.541110] other info that might help us debug this:
> [ 5.541110]
> [ 5.541116]
> [ 5.541116] rcu_scheduler_active = 2, debug_locks = 1
> [ 5.541122] RCU used illegally from extended quiescent state!
> [ 5.541129] no locks held by swapper/0/0.
> [ 5.541134]
> [ 5.541134] stack backtrace:
> [ 5.541143] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.9.0-rc3 #86
> [ 5.541149] Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT)
> [ 5.541157] Call trace:
> [ 5.568185] sdhci_msm 7864900.sdhci: Got CD GPIO
> [ 5.574186] dump_backtrace+0x0/0x1c8
> [ 5.574206] show_stack+0x14/0x20
> [ 5.574229] dump_stack+0xe8/0x154
> [ 5.574250] lockdep_rcu_suspicious+0xd4/0xf8
> [ 5.574269] lock_acquire+0x3f0/0x460
> [ 5.574292] _raw_spin_lock_irqsave+0x80/0xb0
> [ 5.574314] __pm_runtime_suspend+0x4c/0x188
> [ 5.574341] psci_enter_domain_idle_state+0x40/0xa0
> [ 5.574362] cpuidle_enter_state+0xc0/0x610
> [ 5.646487] cpuidle_enter+0x38/0x50
> [ 5.650651] call_cpuidle+0x18/0x40
> [ 5.654467] do_idle+0x228/0x278
> [ 5.657678] cpu_startup_entry+0x24/0x70
> [ 5.661153] rest_init+0x1a4/0x278
> [ 5.665061] arch_call_rest_init+0xc/0x14
> [ 5.668272] start_kernel+0x508/0x540
>
> Following the path in pm_runtime_put_sync_suspend() from
> psci_enter_domain_idle_state(), it seems like we end up using the RCU.
> Therefore, let's simply silence the splat by informing the RCU about it
> with RCU_NONIDLE.
>
> Note that, this is a temporary solution. Instead we should strive to avoid
> using RCU_NONIDLE (and similar), but rather push rcu_idle_enter|exit()
> further down, closer to the arch specific code. However, as the CPU PM
> notifiers are also using the RCU, additional rework is needed.
>
> Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Paul E. McKenney <paulmck@kernel.org>
> ---
> drivers/cpuidle/cpuidle-psci.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/cpuidle/cpuidle-psci.c b/drivers/cpuidle/cpuidle-psci.c
> index 74463841805f..d928b37718bd 100644
> --- a/drivers/cpuidle/cpuidle-psci.c
> +++ b/drivers/cpuidle/cpuidle-psci.c
> @@ -66,7 +66,7 @@ static int psci_enter_domain_idle_state(struct cpuidle_device *dev,
> return -1;
>
> /* Do runtime PM to manage a hierarchical CPU toplogy. */
> - pm_runtime_put_sync_suspend(pd_dev);
> + RCU_NONIDLE(pm_runtime_put_sync_suspend(pd_dev));
>
> state = psci_get_domain_state();
> if (!state)
> @@ -74,7 +74,7 @@ static int psci_enter_domain_idle_state(struct cpuidle_device *dev,
>
> ret = psci_cpu_suspend_enter(state) ? -1 : idx;
>
> - pm_runtime_get_sync(pd_dev);
> + RCU_NONIDLE(pm_runtime_get_sync(pd_dev));
>
> cpu_pm_exit();
>
> --
> 2.25.1
>
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] cpuidle: psci: Fix suspicious RCU usage
2020-09-04 14:13 ` Paul E. McKenney
@ 2020-09-15 21:17 ` Ulf Hansson
2020-09-16 8:36 ` peterz
0 siblings, 1 reply; 5+ messages in thread
From: Ulf Hansson @ 2020-09-15 21:17 UTC (permalink / raw)
To: Peter Zijlstra, Rafael J . Wysocki
Cc: Lorenzo Pieralisi, Vincent Guittot, Paul E. McKenney, Linux PM,
Stephen Boyd, Naresh Kamboju, Daniel Lezcano, Steven Rostedt,
Bjorn Andersson, Lina Iyer, Sudeep Holla, Thomas Gleixner,
Linux ARM
On Fri, 4 Sep 2020 at 16:13, Paul E. McKenney <paulmck@kernel.org> wrote:
>
> On Fri, Sep 04, 2020 at 08:47:05AM +0200, Ulf Hansson wrote:
> > The commit eb1f00237aca ("lockdep,trace: Expose tracepoints"), started to
> > expose us for tracepoints. This lead to the following RCU splat on an ARM64
> > Qcom board.
> >
> > [ 5.529634] WARNING: suspicious RCU usage
> > [ 5.537307] sdhci-pltfm: SDHCI platform and OF driver helper
> > [ 5.541092] 5.9.0-rc3 #86 Not tainted
> > [ 5.541098] -----------------------------
> > [ 5.541105] ../include/trace/events/lock.h:37 suspicious rcu_dereference_check() usage!
> > [ 5.541110]
> > [ 5.541110] other info that might help us debug this:
> > [ 5.541110]
> > [ 5.541116]
> > [ 5.541116] rcu_scheduler_active = 2, debug_locks = 1
> > [ 5.541122] RCU used illegally from extended quiescent state!
> > [ 5.541129] no locks held by swapper/0/0.
> > [ 5.541134]
> > [ 5.541134] stack backtrace:
> > [ 5.541143] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.9.0-rc3 #86
> > [ 5.541149] Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT)
> > [ 5.541157] Call trace:
> > [ 5.568185] sdhci_msm 7864900.sdhci: Got CD GPIO
> > [ 5.574186] dump_backtrace+0x0/0x1c8
> > [ 5.574206] show_stack+0x14/0x20
> > [ 5.574229] dump_stack+0xe8/0x154
> > [ 5.574250] lockdep_rcu_suspicious+0xd4/0xf8
> > [ 5.574269] lock_acquire+0x3f0/0x460
> > [ 5.574292] _raw_spin_lock_irqsave+0x80/0xb0
> > [ 5.574314] __pm_runtime_suspend+0x4c/0x188
> > [ 5.574341] psci_enter_domain_idle_state+0x40/0xa0
> > [ 5.574362] cpuidle_enter_state+0xc0/0x610
> > [ 5.646487] cpuidle_enter+0x38/0x50
> > [ 5.650651] call_cpuidle+0x18/0x40
> > [ 5.654467] do_idle+0x228/0x278
> > [ 5.657678] cpu_startup_entry+0x24/0x70
> > [ 5.661153] rest_init+0x1a4/0x278
> > [ 5.665061] arch_call_rest_init+0xc/0x14
> > [ 5.668272] start_kernel+0x508/0x540
> >
> > Following the path in pm_runtime_put_sync_suspend() from
> > psci_enter_domain_idle_state(), it seems like we end up using the RCU.
> > Therefore, let's simply silence the splat by informing the RCU about it
> > with RCU_NONIDLE.
> >
> > Note that, this is a temporary solution. Instead we should strive to avoid
> > using RCU_NONIDLE (and similar), but rather push rcu_idle_enter|exit()
> > further down, closer to the arch specific code. However, as the CPU PM
> > notifiers are also using the RCU, additional rework is needed.
> >
> > Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
> > Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
>
> Acked-by: Paul E. McKenney <paulmck@kernel.org>
Peter, Rafael,
Is $subject patch okay as the short term solution? If so, I can
continue to look at the CPU PM notifiers, etc.
Kind regards
Uffe
>
> > ---
> > drivers/cpuidle/cpuidle-psci.c | 4 ++--
> > 1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/cpuidle/cpuidle-psci.c b/drivers/cpuidle/cpuidle-psci.c
> > index 74463841805f..d928b37718bd 100644
> > --- a/drivers/cpuidle/cpuidle-psci.c
> > +++ b/drivers/cpuidle/cpuidle-psci.c
> > @@ -66,7 +66,7 @@ static int psci_enter_domain_idle_state(struct cpuidle_device *dev,
> > return -1;
> >
> > /* Do runtime PM to manage a hierarchical CPU toplogy. */
> > - pm_runtime_put_sync_suspend(pd_dev);
> > + RCU_NONIDLE(pm_runtime_put_sync_suspend(pd_dev));
> >
> > state = psci_get_domain_state();
> > if (!state)
> > @@ -74,7 +74,7 @@ static int psci_enter_domain_idle_state(struct cpuidle_device *dev,
> >
> > ret = psci_cpu_suspend_enter(state) ? -1 : idx;
> >
> > - pm_runtime_get_sync(pd_dev);
> > + RCU_NONIDLE(pm_runtime_get_sync(pd_dev));
> >
> > cpu_pm_exit();
> >
> > --
> > 2.25.1
> >
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] cpuidle: psci: Fix suspicious RCU usage
2020-09-15 21:17 ` Ulf Hansson
@ 2020-09-16 8:36 ` peterz
0 siblings, 0 replies; 5+ messages in thread
From: peterz @ 2020-09-16 8:36 UTC (permalink / raw)
To: Ulf Hansson
Cc: Lorenzo Pieralisi, Vincent Guittot, Paul E. McKenney, Linux PM,
Stephen Boyd, Naresh Kamboju, Daniel Lezcano, Rafael J . Wysocki,
Steven Rostedt, Bjorn Andersson, Lina Iyer, Sudeep Holla,
Thomas Gleixner, Linux ARM
On Tue, Sep 15, 2020 at 11:17:20PM +0200, Ulf Hansson wrote:
> Peter, Rafael,
>
> Is $subject patch okay as the short term solution? If so, I can
> continue to look at the CPU PM notifiers, etc.
Yes, it's a proper band-aid :-)
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] cpuidle: psci: Fix suspicious RCU usage
2020-09-04 6:47 [PATCH] cpuidle: psci: Fix suspicious RCU usage Ulf Hansson
2020-09-04 14:13 ` Paul E. McKenney
@ 2020-09-21 13:59 ` Rafael J. Wysocki
1 sibling, 0 replies; 5+ messages in thread
From: Rafael J. Wysocki @ 2020-09-21 13:59 UTC (permalink / raw)
To: Ulf Hansson
Cc: Lorenzo Pieralisi, Vincent Guittot, Paul E . McKenney, Linux PM,
Peter Zijlstra, Naresh Kamboju, Daniel Lezcano,
Rafael J . Wysocki, Steven Rostedt, Bjorn Andersson,
Stephen Boyd, Lina Iyer, Sudeep Holla, Thomas Gleixner,
Linux ARM
On Fri, Sep 4, 2020 at 8:47 AM Ulf Hansson <ulf.hansson@linaro.org> wrote:
>
> The commit eb1f00237aca ("lockdep,trace: Expose tracepoints"), started to
> expose us for tracepoints. This lead to the following RCU splat on an ARM64
> Qcom board.
>
> [ 5.529634] WARNING: suspicious RCU usage
> [ 5.537307] sdhci-pltfm: SDHCI platform and OF driver helper
> [ 5.541092] 5.9.0-rc3 #86 Not tainted
> [ 5.541098] -----------------------------
> [ 5.541105] ../include/trace/events/lock.h:37 suspicious rcu_dereference_check() usage!
> [ 5.541110]
> [ 5.541110] other info that might help us debug this:
> [ 5.541110]
> [ 5.541116]
> [ 5.541116] rcu_scheduler_active = 2, debug_locks = 1
> [ 5.541122] RCU used illegally from extended quiescent state!
> [ 5.541129] no locks held by swapper/0/0.
> [ 5.541134]
> [ 5.541134] stack backtrace:
> [ 5.541143] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.9.0-rc3 #86
> [ 5.541149] Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT)
> [ 5.541157] Call trace:
> [ 5.568185] sdhci_msm 7864900.sdhci: Got CD GPIO
> [ 5.574186] dump_backtrace+0x0/0x1c8
> [ 5.574206] show_stack+0x14/0x20
> [ 5.574229] dump_stack+0xe8/0x154
> [ 5.574250] lockdep_rcu_suspicious+0xd4/0xf8
> [ 5.574269] lock_acquire+0x3f0/0x460
> [ 5.574292] _raw_spin_lock_irqsave+0x80/0xb0
> [ 5.574314] __pm_runtime_suspend+0x4c/0x188
> [ 5.574341] psci_enter_domain_idle_state+0x40/0xa0
> [ 5.574362] cpuidle_enter_state+0xc0/0x610
> [ 5.646487] cpuidle_enter+0x38/0x50
> [ 5.650651] call_cpuidle+0x18/0x40
> [ 5.654467] do_idle+0x228/0x278
> [ 5.657678] cpu_startup_entry+0x24/0x70
> [ 5.661153] rest_init+0x1a4/0x278
> [ 5.665061] arch_call_rest_init+0xc/0x14
> [ 5.668272] start_kernel+0x508/0x540
>
> Following the path in pm_runtime_put_sync_suspend() from
> psci_enter_domain_idle_state(), it seems like we end up using the RCU.
> Therefore, let's simply silence the splat by informing the RCU about it
> with RCU_NONIDLE.
>
> Note that, this is a temporary solution. Instead we should strive to avoid
> using RCU_NONIDLE (and similar), but rather push rcu_idle_enter|exit()
> further down, closer to the arch specific code. However, as the CPU PM
> notifiers are also using the RCU, additional rework is needed.
>
> Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
I missed this one before, sorry about that.
Now applied as 5.9-rc6 material with the Paul's ACK, thanks!
> ---
> drivers/cpuidle/cpuidle-psci.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/cpuidle/cpuidle-psci.c b/drivers/cpuidle/cpuidle-psci.c
> index 74463841805f..d928b37718bd 100644
> --- a/drivers/cpuidle/cpuidle-psci.c
> +++ b/drivers/cpuidle/cpuidle-psci.c
> @@ -66,7 +66,7 @@ static int psci_enter_domain_idle_state(struct cpuidle_device *dev,
> return -1;
>
> /* Do runtime PM to manage a hierarchical CPU toplogy. */
> - pm_runtime_put_sync_suspend(pd_dev);
> + RCU_NONIDLE(pm_runtime_put_sync_suspend(pd_dev));
>
> state = psci_get_domain_state();
> if (!state)
> @@ -74,7 +74,7 @@ static int psci_enter_domain_idle_state(struct cpuidle_device *dev,
>
> ret = psci_cpu_suspend_enter(state) ? -1 : idx;
>
> - pm_runtime_get_sync(pd_dev);
> + RCU_NONIDLE(pm_runtime_get_sync(pd_dev));
>
> cpu_pm_exit();
>
> --
> 2.25.1
>
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2020-09-21 14:01 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-04 6:47 [PATCH] cpuidle: psci: Fix suspicious RCU usage Ulf Hansson
2020-09-04 14:13 ` Paul E. McKenney
2020-09-15 21:17 ` Ulf Hansson
2020-09-16 8:36 ` peterz
2020-09-21 13:59 ` Rafael J. Wysocki
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).