* [PATCH v2] rt: cpufreq: Fix cpu hotplug hang
From: Ran Wang @ 2021-03-30 3:15 UTC
To: Sebastian Siewior, Thomas Gleixner
Cc: Jiafei Pan, linux-rt-users, Ingo Molnar, Peter Zijlstra,
Rafael J . Wysocki, Viresh Kumar, Ran Wang
With PREEMPT_RT selected, cpufreq_driver->stop_cpu(policy) might hang
because irq_work_sync() waits for work queued on lazy_list, which
sometimes never gets a chance to be served in softirq context.

lazy_list is not served because the nearest active timer may be set to
expire far in the future (100+ seconds, for example). In that case
run_local_timers() does not call raise_softirq(TIMER_SOFTIRQ), so the
enqueued irq_work is never handled.

This is observed on LX2160ARDB and LS1088ARDB with the ‘schedutil’ or
‘ondemand’ cpufreq governor.

Configuring the related irq_work to run in hard-irq context fixes this
issue.

Signed-off-by: Ran Wang <ran.wang_1@nxp.com>
---
Change in v2:
 - Update the commit message to explain the root cause more clearly.

drivers/cpufreq/cpufreq_governor.c | 2 +-
kernel/sched/cpufreq_schedutil.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
index 63f7c219062b..731a7b1434df 100644
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -360,7 +360,7 @@ static struct policy_dbs_info *alloc_policy_dbs_info(struct cpufreq_policy *poli
 	policy_dbs->policy = policy;
 	mutex_init(&policy_dbs->update_mutex);
 	atomic_set(&policy_dbs->work_count, 0);
-	init_irq_work(&policy_dbs->irq_work, dbs_irq_work);
+	policy_dbs->irq_work = IRQ_WORK_INIT_HARD(dbs_irq_work);
 	INIT_WORK(&policy_dbs->work, dbs_work_handler);
 
 	/* Set policy_dbs for all CPUs, online+offline */
diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
index 50cbad89f7fa..1d5af87ec92e 100644
--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -611,7 +611,7 @@ static int sugov_kthread_create(struct sugov_policy *sg_policy)
 
 	sg_policy->thread = thread;
 	kthread_bind_mask(thread, policy->related_cpus);
-	init_irq_work(&sg_policy->irq_work, sugov_irq_work);
+	sg_policy->irq_work = IRQ_WORK_INIT_HARD(sugov_irq_work);
 	mutex_init(&sg_policy->work_lock);
 
 	wake_up_process(thread);
--
2.25.1
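
The failure mode described in the commit message can be illustrated with a
small userspace model. This is a sketch under stated assumptions, not kernel
code: struct cpu_model and the model_* functions are hypothetical names, the
next tick stands in for whatever interrupt would run hard irq_work, and the
lazy list is assumed to be drained only when the modelled timer softirq is
raised.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct cpu_model {
	unsigned long jiffies;      /* current tick count */
	unsigned long next_expiry;  /* tick at which the nearest timer expires */
	int lazy_pending;           /* items waiting on the modelled lazy_list */
	int raised_pending;         /* items waiting on the modelled raised_list */
};

/* Queue one item; 'hard' models IRQ_WORK_HARD_IRQ on a PREEMPT_RT kernel. */
void model_queue(struct cpu_model *cpu, bool hard)
{
	if (hard)
		cpu->raised_pending++;
	else
		cpu->lazy_pending++;
}

/* One tick. Returns true if the modelled timer softirq was raised. */
bool model_tick(struct cpu_model *cpu)
{
	cpu->jiffies++;

	/* Assumption: hard items are run directly from the interrupt. */
	cpu->raised_pending = 0;

	/* No expired timer means the softirq is not raised at all. */
	if (cpu->jiffies < cpu->next_expiry)
		return false;

	/* Only the softirq drains the lazy list in this model. */
	cpu->lazy_pending = 0;
	return true;
}

/* Ticks needed until nothing is pending, or -1 if max_ticks is not enough. */
int model_ticks_until_idle(struct cpu_model *cpu, int max_ticks)
{
	for (int i = 0; i < max_ticks; i++) {
		if (!cpu->lazy_pending && !cpu->raised_pending)
			return i;
		model_tick(cpu);
	}
	return (cpu->lazy_pending || cpu->raised_pending) ? -1 : max_ticks;
}
```

With the nearest timer set 25000 ticks away (an arbitrary value chosen for
illustration), a lazily queued item is still pending after 1000 ticks, which
is the situation a waiter like irq_work_sync() would be stuck in. An item
queued as hard is gone after a single tick.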
* Re: [PATCH v2] rt: cpufreq: Fix cpu hotplug hang
From: Viresh Kumar @ 2021-03-30 4:46 UTC
To: Ran Wang
Cc: Sebastian Siewior, Thomas Gleixner, Jiafei Pan, linux-rt-users,
Ingo Molnar, Peter Zijlstra, Rafael J . Wysocki
On 30-03-21, 11:15, Ran Wang wrote:
> When selecting PREEMPT_RT, cpufreq_driver->stop_cpu(policy) might get
> stuck due to irq_work_sync() pending for work on lazy_list, which had
> no chance to be served in softirq context sometimes.
>
> The reason of lazy_list was not served is because the nearest activated
> timer might have been set to expire after long time (such as 100+ seconds).
> Then function run_local_timers() would not call raise_softirq(TIMER_SOFTIRQ)
> to handle enqueued irq_work.
>
> This is observed on LX2160ARDB and LS1088ARDB with cpufreq governor of
> ‘schedutil’ or ‘ondemand’.
>
> Configure related irqwork to run on raw-irq context could fix this issue.
>
> Signed-off-by: Ran Wang <ran.wang_1@nxp.com>
> ---
> Change in v2:
> - Update commit message to explain root cause more clear.
>
> drivers/cpufreq/cpufreq_governor.c | 2 +-
> kernel/sched/cpufreq_schedutil.c | 2 +-
> 2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
> index 63f7c219062b..731a7b1434df 100644
> --- a/drivers/cpufreq/cpufreq_governor.c
> +++ b/drivers/cpufreq/cpufreq_governor.c
> @@ -360,7 +360,7 @@ static struct policy_dbs_info *alloc_policy_dbs_info(struct cpufreq_policy *poli
> policy_dbs->policy = policy;
> mutex_init(&policy_dbs->update_mutex);
> atomic_set(&policy_dbs->work_count, 0);
> - init_irq_work(&policy_dbs->irq_work, dbs_irq_work);
> + policy_dbs->irq_work = IRQ_WORK_INIT_HARD(dbs_irq_work);
> INIT_WORK(&policy_dbs->work, dbs_work_handler);
>
> /* Set policy_dbs for all CPUs, online+offline */
> diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
> index 50cbad89f7fa..1d5af87ec92e 100644
> --- a/kernel/sched/cpufreq_schedutil.c
> +++ b/kernel/sched/cpufreq_schedutil.c
> @@ -611,7 +611,7 @@ static int sugov_kthread_create(struct sugov_policy *sg_policy)
>
> sg_policy->thread = thread;
> kthread_bind_mask(thread, policy->related_cpus);
> - init_irq_work(&sg_policy->irq_work, sugov_irq_work);
> + sg_policy->irq_work = IRQ_WORK_INIT_HARD(sugov_irq_work);
> mutex_init(&sg_policy->work_lock);
>
> wake_up_process(thread);
Will this have any impact on the non-preempt-rt case ? Otherwise,
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
--
viresh
* RE: [PATCH v2] rt: cpufreq: Fix cpu hotplug hang
From: Ran Wang @ 2021-03-30 5:24 UTC
To: Viresh Kumar
Cc: Sebastian Siewior, Thomas Gleixner, Jiafei Pan, linux-rt-users,
Ingo Molnar, Peter Zijlstra, Rafael J . Wysocki
Hi Kumar,
On Tuesday, March 30, 2021 12:46 PM, Viresh Kumar wrote:
>
> On 30-03-21, 11:15, Ran Wang wrote:
> > When selecting PREEMPT_RT, cpufreq_driver->stop_cpu(policy) might get
> > stuck due to irq_work_sync() pending for work on lazy_list, which had
> > no chance to be served in softirq context sometimes.
> >
> > The reason of lazy_list was not served is because the nearest
> > activated timer might have been set to expire after long time (such as 100+ seconds).
> > Then function run_local_timers() would not call
> > raise_softirq(TIMER_SOFTIRQ) to handle enqueued irq_work.
> >
> > This is observed on LX2160ARDB and LS1088ARDB with cpufreq governor of
> > ‘schedutil’ or ‘ondemand’.
> >
> > Configure related irqwork to run on raw-irq context could fix this issue.
> >
> > Signed-off-by: Ran Wang <ran.wang_1@nxp.com>
> > ---
> > Change in v2:
> > - Update commit message to explain root cause more clear.
> >
> > drivers/cpufreq/cpufreq_governor.c | 2 +-
> > kernel/sched/cpufreq_schedutil.c | 2 +-
> > 2 files changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
> > index 63f7c219062b..731a7b1434df 100644
> > --- a/drivers/cpufreq/cpufreq_governor.c
> > +++ b/drivers/cpufreq/cpufreq_governor.c
> > @@ -360,7 +360,7 @@ static struct policy_dbs_info *alloc_policy_dbs_info(struct cpufreq_policy *poli
> > policy_dbs->policy = policy;
> > mutex_init(&policy_dbs->update_mutex);
> > atomic_set(&policy_dbs->work_count, 0);
> > - init_irq_work(&policy_dbs->irq_work, dbs_irq_work);
> > + policy_dbs->irq_work = IRQ_WORK_INIT_HARD(dbs_irq_work);
> > INIT_WORK(&policy_dbs->work, dbs_work_handler);
> >
> > /* Set policy_dbs for all CPUs, online+offline */
> > diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
> > index 50cbad89f7fa..1d5af87ec92e 100644
> > --- a/kernel/sched/cpufreq_schedutil.c
> > +++ b/kernel/sched/cpufreq_schedutil.c
> > @@ -611,7 +611,7 @@ static int sugov_kthread_create(struct sugov_policy *sg_policy)
> >
> > sg_policy->thread = thread;
> > kthread_bind_mask(thread, policy->related_cpus);
> > - init_irq_work(&sg_policy->irq_work, sugov_irq_work);
> > + sg_policy->irq_work = IRQ_WORK_INIT_HARD(sugov_irq_work);
> > mutex_init(&sg_policy->work_lock);
> >
> > wake_up_process(thread);
>
> Will this have any impact on the non-preempt-rt case ? Otherwise,
My understanding is that in the non-PREEMPT_RT case the work is queued to
raised_list and arch_irq_work_raise() is called immediately to raise an IPI
that serves it. That is the same behavior this patch selects for the
PREEMPT_RT case; see __irq_work_queue_local():

/* Enqueue on current CPU, work must already be claimed and preempt disabled */
static void __irq_work_queue_local(struct irq_work *work)
{
	struct llist_head *list;
	bool lazy_work;
	int work_flags;

	work_flags = atomic_read(&work->node.a_flags);
	if (work_flags & IRQ_WORK_LAZY)
		lazy_work = true;
	else if (IS_ENABLED(CONFIG_PREEMPT_RT) &&
		 !(work_flags & IRQ_WORK_HARD_IRQ))
		lazy_work = true;
	else
		lazy_work = false;
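
The selection logic in the excerpt above can be checked in isolation with a
small userspace sketch. It is only a model: the MODEL_* flag values, enum
model_list and model_pick_list() are hypothetical stand-ins, not the kernel's
definitions, and preempt_rt takes the place of IS_ENABLED(CONFIG_PREEMPT_RT).

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in flag bits for this sketch; the kernel's values may differ. */
#define MODEL_WORK_LAZY     (1u << 0)
#define MODEL_WORK_HARD_IRQ (1u << 1)

enum model_list { MODEL_RAISED_LIST, MODEL_LAZY_LIST };

/* Same three-way decision as the excerpt, returning the chosen list. */
enum model_list model_pick_list(unsigned int work_flags, bool preempt_rt)
{
	/* Explicitly lazy work is always deferred. */
	if (work_flags & MODEL_WORK_LAZY)
		return MODEL_LAZY_LIST;

	/* On PREEMPT_RT, anything not marked hard is deferred as well. */
	if (preempt_rt && !(work_flags & MODEL_WORK_HARD_IRQ))
		return MODEL_LAZY_LIST;

	return MODEL_RAISED_LIST;
}
```

In this model, work without the hard flag lands on the raised list when
preempt_rt is false and on the lazy list when it is true, while work with the
hard flag lands on the raised list in both configurations.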

I have also tested mainline and the rt tree with CONFIG_PREEMPT selected and
could not reproduce the issue.

Regards,
Ran
> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
>
> --
> viresh