From: Ran Wang <ran.wang_1@nxp.com>
To: Sebastian Siewior <bigeasy@linutronix.de>
Cc: Viresh Kumar <viresh.kumar@linaro.org>,
"Rafael J . Wysocki" <rjw@rjwysocki.net>,
Thomas Gleixner <tglx@linutronix.de>,
Jiafei Pan <jiafei.pan@nxp.com>,
"linux-rt-users@vger.kernel.org" <linux-rt-users@vger.kernel.org>,
Ingo Molnar <mingo@redhat.com>,
Peter Zijlstra <peterz@infradead.org>
Subject: RE: [PATCH] rt: cpufreq: Fix cpu hotplug hang
Date: Tue, 23 Mar 2021 07:00:29 +0000 [thread overview]
Message-ID: <AM6PR04MB5413313F7577B63E8E930843F1649@AM6PR04MB5413.eurprd04.prod.outlook.com> (raw)
In-Reply-To: <20210224163745.5lqhnjviznsjojtl@linutronix.de>
Hi Sebastian,
Sorry for the late response due to limited bandwidth.
Please see my reply inline.
On Thursday, February 25, 2021 12:38 AM, Sebastian Siewior wrote:
>
> On 2021-02-23 08:43:08 [+0530], Viresh Kumar wrote:
> > On 23-02-21, 02:26, Ran Wang wrote:
> > > On Monday, February 22, 2021 10:01 PM, Sebastian Siewior wrote:
> > > > On 2021-02-19 16:44:20 [+0800], Ran Wang wrote:
> > > > > When selecting PREEMPT_RT, cpufreq_driver->stop_cpu(policy)
> > > > > might get stuck due to irq_work_sync() waiting for work on the
> > > > > lazy_list. That's because the lazy_list sometimes has no chance
> > > > > to be served in softirq context. Below is one of the scenarios captured:
> > > > >
> > > > > ...
> > > > > ret_from_fork
> > > > > kthread
> > > > > smpboot_thread_fn
> > > > > cpuhp_thread_fun
> > > > > cpuhp_invoke_callback: state: 193
> > > > > cpuhp_cpufreq_online
> > > > > cpufreq_online
> > > > > cpufreq_driver->stop_cpu(policy);
> > > > > cpufreq_dbs_governor_stop
> > > > > sugov_stop // kernel/sched/cpufreq_schedutil.c
> > > > > irq_work_sync(&sg_policy->irq_work);
> > > > >
> > > > > This is observed on LX2160ARDB (16 A72 cores) with the
> > > > > 'schedutil' or 'ondemand' cpufreq governor.
> > > >
> > > > While staring at it, why do we invoke schedule_work_on() and
> > > > kthread_queue_work() from inside irq_work() instead of invoking them directly? It raises an interrupt in which it kicks a user thread.
> > > > Couldn't we do it without irq_work?
> >
> > Because we reach there from the scheduler's context, which must be
> > hard-irq context.
>
> Oha. Too bad.
> Ran, any chance to figure out why the softirq can not be served?
> ksoftirqd and the timer should be up.
Further debugging shows that sometimes the nearest local timer is expected to expire only after a long time (100+ seconds).
In that case run_local_timers() gets no chance to call raise_softirq(TIMER_SOFTIRQ) to handle the enqueued irq_work
for a long time, leaving irq_work_sync() pending during the CPU hotplug operation.
Below is one of the traces I captured, showing that sometimes the [timeout=xxx] value is much bigger than the others (see [timeout=30070]).
On this particular CPU (CPU2), there is a 100+ second idle gap between two 'timer_expire_entry' events (see the timestamp 613.43867):
ksoftirqd/2-26 [002] d..s.13 35.046622: timer_expire_entry: timer=000000008f7d5140 function=delayed_work_timer_fn now=4294901034 baseclk=4294900752
ksoftirqd/2-26 [002] dnLs.13 35.046634: timer_expire_exit: timer=000000008f7d5140
kworker/2:1-106 [002] d...1.. 38.942670: timer_start: timer=000000008f7d5140 function=delayed_work_timer_fn expires=4294902244 [timeout=236] cpu=2 idx=125 flags=D|I
ksoftirqd/2-26 [002] d..s.13 316.486673: timer_expire_entry: timer=000000008f7d5140 function=delayed_work_timer_fn now=4294971394 baseclk=4294902248
ksoftirqd/2-26 [002] d..s.13 316.486688: timer_expire_exit: timer=000000008f7d5140
ksoftirqd/2-26 [002] ...s.13 316.486694: timer_expire_entry: timer=00000000b44e5013 function=idle_worker_timeout now=4294971394 baseclk=4294971392
ksoftirqd/2-26 [002] .nLs.13 316.486707: timer_expire_exit: timer=00000000b44e5013
kworker/u16:0-7 [002] d...1.. 492.606742: timer_start: timer=000000003fe6a257 function=delayed_work_timer_fn expires=4295045494 [timeout=30070] cpu=2 idx=217 flags=I
kworker/u16:0-7 [002] d...1.. 492.606756: timer_start: timer=000000003b28e768 function=delayed_work_timer_fn expires=4295022994 [timeout=7570] cpu=2 idx=237 flags=D|I
kworker/u16:0-7 [002] d...113 492.606766: timer_start: timer=0000000062a01a47 function=delayed_work_timer_fn expires=4295019174 [timeout=3750] cpu=2 idx=171 flags=D|I
kworker/u16:0-7 [002] d...113 492.606776: timer_start: timer=000000000bbd8bb1 function=delayed_work_timer_fn expires=4295019174 [timeout=3750] cpu=2 idx=171 flags=D|I
kworker/u16:0-7 [002] d...1.. 492.606860: timer_start: timer=00000000f1f3a9cb function=delayed_work_timer_fn expires=4295015455 [timeout=31] cpu=2 idx=32 flags=D|I
ksoftirqd/2-26 [002] d..s.13 613.438671: timer_expire_entry: timer=000000003fe6a257 function=delayed_work_timer_fn now=4295045632 baseclk=4295045632
ksoftirqd/2-26 [002] d..s.13 613.438695: timer_expire_exit: timer=000000003fe6a257
ksoftirqd/2-26 [002] d..s.13 613.438703: timer_expire_entry: timer=00000000f1f3a9cb function=delayed_work_timer_fn now=4295045632 baseclk=4295015456
ksoftirqd/2-26 [002] d..s.13 613.438717: timer_expire_exit: timer=00000000f1f3a9cb
ksoftirqd/2-26 [002] d..s.13 613.438723: timer_expire_entry: timer=000000000bbd8bb1 function=delayed_work_timer_fn now=4295045632 baseclk=4295019200
ksoftirqd/2-26 [002] d..s.13 613.438741: timer_expire_exit: timer=000000000bbd8bb1
ksoftirqd/2-26 [002] d..s.13 613.438745: timer_expire_entry: timer=0000000062a01a47 function=delayed_work_timer_fn now=4295045632 baseclk=4295019200
ksoftirqd/2-26 [002] d..s.13 613.438756: timer_expire_exit: timer=0000000062a01a47
ksoftirqd/2-26 [002] d..s.13 613.438762: timer_expire_entry: timer=000000003b28e768 function=delayed_work_timer_fn now=4295045632 baseclk=4295023104
> Any numbers on how often that irq_work is scheduled?
I added some prints in run_timer_softirq() and collected some of the worst cases (CPU5 and CPU3) below.
It looks like some irq_work may have to wait a long time to get served in such cases:
[ 13.040774] [005]:run_timer_softirq():call irq_work_tick_soft();
[ 17.078788] [005]:run_timer_softirq():call irq_work_tick_soft();
[ 19.222878] [005]:run_timer_softirq():call irq_work_tick_soft();
[ 49.608798] [005]:run_timer_softirq():call irq_work_tick_soft();
[ 56.564773] [005]:run_timer_softirq():call irq_work_tick_soft();
[ 124.368772] [005]:run_timer_softirq():call irq_work_tick_soft();
[ 124.880821] [005]:run_timer_softirq():call irq_work_tick_soft();
[ 125.392842] [005]:run_timer_softirq():call irq_work_tick_soft();
[ 129.556828] [005]:run_timer_softirq():call irq_work_tick_soft();
[ 132.336839] [005]:run_timer_softirq():call irq_work_tick_soft();
[ 253.168839] [005]:run_timer_softirq():call irq_work_tick_soft();
[ 376.048841] [005]:run_timer_softirq():call irq_work_tick_soft();
[ 13.072774] [003]:run_timer_softirq():call irq_work_tick_soft();
[ 13.232774] [003]:run_timer_softirq():call irq_work_tick_soft();
[ 14.064773] [003]:run_timer_softirq():call irq_work_tick_soft();
[ 15.056773] [003]:run_timer_softirq():call irq_work_tick_soft();
[ 16.080773] [003]:run_timer_softirq():call irq_work_tick_soft();
[ 17.072772] [003]:run_timer_softirq():call irq_work_tick_soft();
[ 17.840774] [003]:run_timer_softirq():call irq_work_tick_soft();
[ 18.352775] [003]:run_timer_softirq():call irq_work_tick_soft();
[ 18.864774] [003]:run_timer_softirq():call irq_work_tick_soft();
[ 19.208796] [003]:run_timer_softirq():call irq_work_tick_soft();
[ 19.376772] [003]:run_timer_softirq():call irq_work_tick_soft();
[ 19.888776] [003]:run_timer_softirq():call irq_work_tick_soft();
[ 20.118792] [003]:run_timer_softirq():call irq_work_tick_soft();
[ 82.992773] [003]:run_timer_softirq():call irq_work_tick_soft();
[ 125.072801] [003]:run_timer_softirq():call irq_work_tick_soft();
[ 126.064772] [003]:run_timer_softirq():call irq_work_tick_soft();
[ 127.056796] [003]:run_timer_softirq():call irq_work_tick_soft();
[ 128.976773] [003]:run_timer_softirq():call irq_work_tick_soft();
Regards,
Ran
Thread overview: 7+ messages
2021-02-19 8:44 [PATCH] rt: cpufreq: Fix cpu hotplug hang Ran Wang
2021-02-22 14:01 ` Sebastian Siewior
2021-02-23 2:26 ` Ran Wang
2021-02-23 3:13 ` Viresh Kumar
2021-02-24 16:37 ` Sebastian Siewior
2021-02-25 2:31 ` Ran Wang
2021-03-23 7:00 ` Ran Wang [this message]