From: Peter Zijlstra <peterz@infradead.org>
To: Alexey Klimov <alexey.klimov@linaro.org>
Cc: draszik@google.com, peter.griffin@linaro.org,
willmcvicker@google.com, mingo@kernel.org,
ulf.hansson@linaro.org, tony@atomide.com,
linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
axboe@kernel.dk, alim.akhtar@samsung.com,
regressions@lists.linux.dev, avri.altman@wdc.com,
bvanassche@acm.org, klimova@google.com
Subject: Re: [REGRESSION] CPUIDLE_FLAG_RCU_IDLE, blk_mq_freeze_queue_wait() and slow-stuck reboots
Date: Wed, 15 Mar 2023 12:16:06 +0100
Message-ID: <20230315111606.GB2006103@hirez.programming.kicks-ass.net>
In-Reply-To: <20230314230004.961993-1-alexey.klimov@linaro.org>

(could you wrap your email please)

On Tue, Mar 14, 2023 at 11:00:04PM +0000, Alexey Klimov wrote:
> #regzbot introduced: 0c5ffc3d7b15 #regzbot title:
> CPUIDLE_FLAG_RCU_IDLE, blk_mq_freeze_queue_wait() and slow-stuck
> reboots
>
> The upstream changes are being merged into android-mainline repo and
> at some point we started to observe kernel panics on reboot or long
> reboot times.

On what hardware? I find it somewhat hard to follow this DT code :/

> Looks like adding CPUIDLE_FLAG_RCU_IDLE flag to idle driver caused
> this behaviour. The minimal change that is required for this system
> to avoid the regression would be one liner that removes the flag
> (below).
>
> But if it is a real regression, then other idle drivers if used will
> likely cause this regression too with the same ufshcd driver. There is
> also a suspicion that CPUIDLE_FLAG_RCU_IDLE just revealed or uncovered
> some other problem.
>
> Any thoughts on this?

So ARM has a weird 'rule' in that idle state 0 (wfi) should not have
RCU_IDLE set, while others should have.

Of the dt_init_idle_driver() users:

 - cpuidle-arm: arm_enter_idle_state() uses CPU_PM_CPU_IDLE_ENTER()
 - cpuidle-big_little: bl_enter_powerdown() does ct_cpuidle_{enter,exit}()
 - cpuidle-psci: psci_enter_idle_state() uses CPU_PM_CPU_IDLE_ENTER_PARAM_RCU()
 - cpuidle-qcom-spm: spm_enter_idle_state() uses CPU_PM_CPU_IDLE_ENTER_PARAM()
 - cpuidle-riscv-sbi: sbi_cpuidle_enter_state() uses CPU_PM_CPU_IDLE_ENTER_*_PARAM()

All of them start on index 1 and hence should have RCU_IDLE set, but at
least the arm, qcom-spm and riscv-sbi don't actually appear to abide by
the rules.
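
To make the rule concrete, here is a minimal userspace sketch (not
kernel code) that checks a state table against it: state 0 must not
carry the flag, every later state must. The names demo_idle_state,
DEMO_FLAG_RCU_IDLE and demo_first_rule_violation() are hypothetical and
exist only for this illustration; the flag value is not the kernel's.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bit for this sketch; not the kernel's actual value. */
#define DEMO_FLAG_RCU_IDLE (1u << 0)

/* Minimal stand-in for an idle-state descriptor (assumption). */
struct demo_idle_state {
	const char *name;
	unsigned int flags;
};

/*
 * Return the index of the first state that breaks the rule described
 * above (state 0 without the flag, every later state with it), or -1
 * if the whole table conforms.
 */
int demo_first_rule_violation(const struct demo_idle_state *states,
			      size_t count)
{
	assert(states != NULL || count == 0);

	for (size_t i = 0; i < count; i++) {
		int has_flag = (states[i].flags & DEMO_FLAG_RCU_IDLE) != 0;
		int want_flag = (i != 0);

		if (has_flag != want_flag)
			return (int)i;
	}
	return -1;
}
```

A table such as { wfi: no flag, retention: flag, powerdown: flag }
passes; a table whose state 1 lacks the flag is reported at index 1.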

Fixing that gives me the below; does that help?

---
diff --git a/drivers/cpuidle/cpuidle-arm.c b/drivers/cpuidle/cpuidle-arm.c
index 7cfb980a357d..58fa81f0fa7d 100644
--- a/drivers/cpuidle/cpuidle-arm.c
+++ b/drivers/cpuidle/cpuidle-arm.c
@@ -39,7 +39,7 @@ static __cpuidle int arm_enter_idle_state(struct cpuidle_device *dev,
 	 * will call the CPU ops suspend protocol with idle index as a
 	 * parameter.
 	 */
-	return CPU_PM_CPU_IDLE_ENTER(arm_cpuidle_suspend, idx);
+	return CPU_PM_CPU_IDLE_ENTER_RCU(arm_cpuidle_suspend, idx);
 }
 
 static struct cpuidle_driver arm_idle_driver __initdata = {
diff --git a/drivers/cpuidle/cpuidle-qcom-spm.c b/drivers/cpuidle/cpuidle-qcom-spm.c
index c6e2e91bb4c3..429db2d40114 100644
--- a/drivers/cpuidle/cpuidle-qcom-spm.c
+++ b/drivers/cpuidle/cpuidle-qcom-spm.c
@@ -64,7 +64,7 @@ static __cpuidle int spm_enter_idle_state(struct cpuidle_device *dev,
 	struct cpuidle_qcom_spm_data *data = container_of(drv, struct cpuidle_qcom_spm_data,
 							  cpuidle_driver);
 
-	return CPU_PM_CPU_IDLE_ENTER_PARAM(qcom_cpu_spc, idx, data->spm);
+	return CPU_PM_CPU_IDLE_ENTER_PARAM_RCU(qcom_cpu_spc, idx, data->spm);
 }
 
 static struct cpuidle_driver qcom_spm_idle_driver = {
diff --git a/drivers/cpuidle/cpuidle-riscv-sbi.c b/drivers/cpuidle/cpuidle-riscv-sbi.c
index be383f4b6855..04a601cda06b 100644
--- a/drivers/cpuidle/cpuidle-riscv-sbi.c
+++ b/drivers/cpuidle/cpuidle-riscv-sbi.c
@@ -100,10 +100,9 @@ static __cpuidle int sbi_cpuidle_enter_state(struct cpuidle_device *dev,
 	u32 state = states[idx];
 
 	if (state & SBI_HSM_SUSP_NON_RET_BIT)
-		return CPU_PM_CPU_IDLE_ENTER_PARAM(sbi_suspend, idx, state);
-	else
-		return CPU_PM_CPU_IDLE_ENTER_RETENTION_PARAM(sbi_suspend,
-							     idx, state);
+		return CPU_PM_CPU_IDLE_ENTER_PARAM_RCU(sbi_suspend, idx, state);
+
+	return CPU_PM_CPU_IDLE_ENTER_RETENTION_PARAM_RCU(sbi_suspend, idx, state);
 }
 
 static __cpuidle int __sbi_enter_domain_idle_state(struct cpuidle_device *dev,
diff --git a/include/linux/cpuidle.h b/include/linux/cpuidle.h
index 3183aeb7f5b4..dd92bdafe2d3 100644
--- a/include/linux/cpuidle.h
+++ b/include/linux/cpuidle.h
@@ -334,6 +334,9 @@ extern s64 cpuidle_governor_latency_req(unsigned int cpu);
 #define CPU_PM_CPU_IDLE_ENTER(low_level_idle_enter, idx) \
 	__CPU_PM_CPU_IDLE_ENTER(low_level_idle_enter, idx, idx, 0, 0)
 
+#define CPU_PM_CPU_IDLE_ENTER_RCU(low_level_idle_enter, idx) \
+	__CPU_PM_CPU_IDLE_ENTER(low_level_idle_enter, idx, idx, 0, 1)
+
 #define CPU_PM_CPU_IDLE_ENTER_RETENTION(low_level_idle_enter, idx) \
 	__CPU_PM_CPU_IDLE_ENTER(low_level_idle_enter, idx, idx, 1, 0)
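
The cpuidle.h hunk follows a wrapper pattern: each public macro forwards
to one core macro and pins the two trailing arguments (0, 0 for the
plain variant, 0, 1 for _RCU, 1, 0 for _RETENTION), passing idx twice.
A rough userspace model of that pattern is sketched below. Everything
prefixed demo_/DEMO_ is hypothetical; the core here only records what it
was given and does not model what the kernel's __CPU_PM_CPU_IDLE_ENTER()
does with those arguments. Returning idx on success and -1 on failure is
an assumption made for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Records what the core received; purely for illustration. */
struct demo_call {
	int idx;
	int state;
	int is_retention;
	int is_rcu;
};

typedef int (*demo_enter_fn)(int state);

/* Hypothetical core: records its arguments, then calls fn(state). */
int demo_core_enter(demo_enter_fn fn, int idx, int state,
		    int is_retention, int is_rcu, struct demo_call *rec)
{
	assert(fn != NULL && rec != NULL);

	rec->idx = idx;
	rec->state = state;
	rec->is_retention = is_retention;
	rec->is_rcu = is_rcu;

	/* Sketch-only convention: idx on success, -1 on failure. */
	return fn(state) ? -1 : idx;
}

/* Wrappers pin the trailing flags, mirroring the layout in the hunk. */
#define DEMO_ENTER(fn, idx, rec) \
	demo_core_enter((fn), (idx), (idx), 0, 0, (rec))
#define DEMO_ENTER_RCU(fn, idx, rec) \
	demo_core_enter((fn), (idx), (idx), 0, 1, (rec))
#define DEMO_ENTER_RETENTION(fn, idx, rec) \
	demo_core_enter((fn), (idx), (idx), 1, 0, (rec))

/* Stand-in low-level enter callback that always succeeds. */
int demo_low_level_ok(int state)
{
	(void)state;
	return 0;
}
```

Because the wrappers differ only in the pinned flags, switching a driver
from one variant to another is a one-token change at the call site,
which is what the driver hunks do.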