From: Ran Wang <ran.wang_1@nxp.com>
To: Sebastian Siewior <bigeasy@linutronix.de>
Cc: Viresh Kumar <viresh.kumar@linaro.org>,
	"Rafael J . Wysocki" <rjw@rjwysocki.net>,
	Thomas Gleixner <tglx@linutronix.de>,
	Jiafei Pan <jiafei.pan@nxp.com>,
	"linux-rt-users@vger.kernel.org" <linux-rt-users@vger.kernel.org>,
	Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>
Subject: RE: [PATCH] rt: cpufreq: Fix cpu hotplug hang
Date: Tue, 23 Mar 2021 07:00:29 +0000
Message-ID: <AM6PR04MB5413313F7577B63E8E930843F1649@AM6PR04MB5413.eurprd04.prod.outlook.com>
In-Reply-To: <20210224163745.5lqhnjviznsjojtl@linutronix.de>

Hi Sebastian,

  Sorry for the late response due to bandwidth issues.
  Please see my reply inline.

On Thursday, February 25, 2021 12:38 AM, Sebastian Siewior wrote:
> 
> On 2021-02-23 08:43:08 [+0530], Viresh Kumar wrote:
> > On 23-02-21, 02:26, Ran Wang wrote:
> > > On Monday, February 22, 2021 10:01 PM, Sebastian Siewior wrote:
> > > > On 2021-02-19 16:44:20 [+0800], Ran Wang wrote:
> > > > > When selecting PREEMPT_RT, cpufreq_driver->stop_cpu(policy)
> > > > > might get stuck due to irq_work_sync() waiting for work pending
> > > > > on the lazy_list. That’s because the lazy_list sometimes has no
> > > > > chance to be served in softirq context. Below is one of the
> > > > > scenarios that was captured:
> > > > >
> > > > > ...
> > > > > ret_from_fork
> > > > >  kthread
> > > > >   smpboot_thread_fn
> > > > >    cpuhp_thread_fun
> > > > >     cpuhp_invoke_callback: state: 193
> > > > >      cpuhp_cpufreq_online
> > > > >       cpufreq_online
> > > > >        cpufreq_driver->stop_cpu(policy);
> > > > >         cpufreq_dbs_governor_stop
> > > > >          sugov_stop  // kernel/sched/cpufreq_schedutil.c
> > > > >           irq_work_sync(&sg_policy->irq_work);
> > > > >
> > > > > This is observed on LX2160ARDB (16 A72 cores) with the cpufreq
> > > > > governor set to ‘schedutil’ or ‘ondemand’.
> > > >
> > > > While staring at it, why do we invoke schedule_work_on() and
> > > > kthread_queue_work() from inside irq_work() instead of invoking them
> > > > directly? It raises an interrupt just to kick a user thread from it.
> > > > Couldn't we do it without irq_work?
> >
> > Because we reach there from the scheduler's context, which must be
> > hard-irq context.
> 
> Oha. Too bad.
> Ran, any chance to figure out why the softirq can not be served?
> ksoftirqd and the timer should be up.

Further debugging shows that sometimes the next pending local timer is not expected to expire for a long time (100+ seconds).
As a result, run_local_timers() gets no chance to call raise_softirq(TIMER_SOFTIRQ) to handle the enqueued irq_work
for all of that time, which leaves irq_work_sync() pending during the CPU hotplug operation.
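
To make the dependency chain clearer, here is a heavily simplified sketch. It is only a paraphrase of the functions named above, not the exact mainline or RT-patch source; next_pending_timer_is_due() and irq_work_is_busy() are placeholders standing in for the real checks:

/* Heavily simplified sketch -- not the exact kernel source. */

/* kernel/time/timer.c: invoked from the per-CPU tick */
void run_local_timers(void)
{
	/*
	 * TIMER_SOFTIRQ is only raised when a pending timer is actually
	 * due.  If the next local timer is 100+ seconds away, nothing
	 * raises the softirq here for all of that time.
	 */
	if (next_pending_timer_is_due())	/* placeholder for the real check */
		raise_softirq(TIMER_SOFTIRQ);
}

/* kernel/time/timer.c: TIMER_SOFTIRQ handler, runs in ksoftirqd on RT */
static void run_timer_softirq(struct softirq_action *h)
{
	/* On RT this call is what flushes the per-CPU lazy_list ... */
	irq_work_tick_soft();
	__run_timers(this_cpu_ptr(&timer_bases[BASE_STD]));
}

/* kernel/irq_work.c: sugov_stop() -> irq_work_sync() waits here */
void irq_work_sync(struct irq_work *work)
{
	/* ... so this wait only completes once that softirq finally runs. */
	while (irq_work_is_busy(work))	/* placeholder for the real flag test */
		cpu_relax();
}

So as long as no local timer expires, run_local_timers() never raises TIMER_SOFTIRQ, irq_work_tick_soft() is never reached, and the hotplug path stays blocked in irq_work_sync().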

Below is one of the traces I captured, showing that sometimes the [timeout=xxx] value is much bigger than the others (see [timeout=30070]).
In this particular example (CPU2) there is an idle gap of well over 100 seconds between two 'timer_expire_entry' events (see the timestamp 613.438671):

     ksoftirqd/2-26      [002] d..s.13    35.046622: timer_expire_entry: timer=000000008f7d5140 function=delayed_work_timer_fn now=4294901034 baseclk=4294900752
     ksoftirqd/2-26      [002] dnLs.13    35.046634: timer_expire_exit: timer=000000008f7d5140
     kworker/2:1-106     [002] d...1..    38.942670: timer_start: timer=000000008f7d5140 function=delayed_work_timer_fn expires=4294902244 [timeout=236] cpu=2 idx=125 flags=D|I
     ksoftirqd/2-26      [002] d..s.13   316.486673: timer_expire_entry: timer=000000008f7d5140 function=delayed_work_timer_fn now=4294971394 baseclk=4294902248
     ksoftirqd/2-26      [002] d..s.13   316.486688: timer_expire_exit: timer=000000008f7d5140
     ksoftirqd/2-26      [002] ...s.13   316.486694: timer_expire_entry: timer=00000000b44e5013 function=idle_worker_timeout now=4294971394 baseclk=4294971392
     ksoftirqd/2-26      [002] .nLs.13   316.486707: timer_expire_exit: timer=00000000b44e5013
   kworker/u16:0-7       [002] d...1..   492.606742: timer_start: timer=000000003fe6a257 function=delayed_work_timer_fn expires=4295045494 [timeout=30070] cpu=2 idx=217 flags=I
   kworker/u16:0-7       [002] d...1..   492.606756: timer_start: timer=000000003b28e768 function=delayed_work_timer_fn expires=4295022994 [timeout=7570] cpu=2 idx=237 flags=D|I
   kworker/u16:0-7       [002] d...113   492.606766: timer_start: timer=0000000062a01a47 function=delayed_work_timer_fn expires=4295019174 [timeout=3750] cpu=2 idx=171 flags=D|I
   kworker/u16:0-7       [002] d...113   492.606776: timer_start: timer=000000000bbd8bb1 function=delayed_work_timer_fn expires=4295019174 [timeout=3750] cpu=2 idx=171 flags=D|I
   kworker/u16:0-7       [002] d...1..   492.606860: timer_start: timer=00000000f1f3a9cb function=delayed_work_timer_fn expires=4295015455 [timeout=31] cpu=2 idx=32 flags=D|I
     ksoftirqd/2-26      [002] d..s.13   613.438671: timer_expire_entry: timer=000000003fe6a257 function=delayed_work_timer_fn now=4295045632 baseclk=4295045632
     ksoftirqd/2-26      [002] d..s.13   613.438695: timer_expire_exit: timer=000000003fe6a257
     ksoftirqd/2-26      [002] d..s.13   613.438703: timer_expire_entry: timer=00000000f1f3a9cb function=delayed_work_timer_fn now=4295045632 baseclk=4295015456
     ksoftirqd/2-26      [002] d..s.13   613.438717: timer_expire_exit: timer=00000000f1f3a9cb
     ksoftirqd/2-26      [002] d..s.13   613.438723: timer_expire_entry: timer=000000000bbd8bb1 function=delayed_work_timer_fn now=4295045632 baseclk=4295019200
     ksoftirqd/2-26      [002] d..s.13   613.438741: timer_expire_exit: timer=000000000bbd8bb1
     ksoftirqd/2-26      [002] d..s.13   613.438745: timer_expire_entry: timer=0000000062a01a47 function=delayed_work_timer_fn now=4295045632 baseclk=4295019200
     ksoftirqd/2-26      [002] d..s.13   613.438756: timer_expire_exit: timer=0000000062a01a47
     ksoftirqd/2-26      [002] d..s.13   613.438762: timer_expire_entry: timer=000000003b28e768 function=delayed_work_timer_fn now=4295045632 baseclk=4295023104

> Any numbers on how often that irq_work is scheduled?

I added some prints in run_timer_softirq() and collected some of the worst cases (CPU5 and CPU3) below; it looks like
some irq_work items may have to wait a long time to be served in such cases (a sketch of the kind of print used follows the log):

[   13.040774] [005]:run_timer_softirq():call irq_work_tick_soft();
[   17.078788] [005]:run_timer_softirq():call irq_work_tick_soft();
[   19.222878] [005]:run_timer_softirq():call irq_work_tick_soft();
[   49.608798] [005]:run_timer_softirq():call irq_work_tick_soft();
[   56.564773] [005]:run_timer_softirq():call irq_work_tick_soft();
[  124.368772] [005]:run_timer_softirq():call irq_work_tick_soft();
[  124.880821] [005]:run_timer_softirq():call irq_work_tick_soft();
[  125.392842] [005]:run_timer_softirq():call irq_work_tick_soft();
[  129.556828] [005]:run_timer_softirq():call irq_work_tick_soft();
[  132.336839] [005]:run_timer_softirq():call irq_work_tick_soft();
[  253.168839] [005]:run_timer_softirq():call irq_work_tick_soft();
[  376.048841] [005]:run_timer_softirq():call irq_work_tick_soft();

[   13.072774] [003]:run_timer_softirq():call irq_work_tick_soft();
[   13.232774] [003]:run_timer_softirq():call irq_work_tick_soft();
[   14.064773] [003]:run_timer_softirq():call irq_work_tick_soft();
[   15.056773] [003]:run_timer_softirq():call irq_work_tick_soft();
[   16.080773] [003]:run_timer_softirq():call irq_work_tick_soft();
[   17.072772] [003]:run_timer_softirq():call irq_work_tick_soft();
[   17.840774] [003]:run_timer_softirq():call irq_work_tick_soft();
[   18.352775] [003]:run_timer_softirq():call irq_work_tick_soft();
[   18.864774] [003]:run_timer_softirq():call irq_work_tick_soft();
[   19.208796] [003]:run_timer_softirq():call irq_work_tick_soft();
[   19.376772] [003]:run_timer_softirq():call irq_work_tick_soft();
[   19.888776] [003]:run_timer_softirq():call irq_work_tick_soft();
[   20.118792] [003]:run_timer_softirq():call irq_work_tick_soft();
[   82.992773] [003]:run_timer_softirq():call irq_work_tick_soft();
[  125.072801] [003]:run_timer_softirq():call irq_work_tick_soft();
[  126.064772] [003]:run_timer_softirq():call irq_work_tick_soft();
[  127.056796] [003]:run_timer_softirq():call irq_work_tick_soft();
[  128.976773] [003]:run_timer_softirq():call irq_work_tick_soft();
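
(For reference, log lines in the format above could come from a one-line debug print along these lines inside run_timer_softirq(); the actual debug patch is not part of this thread, so treat this as an illustrative sketch only. The leading "[   13.040774]" timestamps come from printk itself.)

/* Illustrative sketch only -- the actual debug patch is not shown here. */
static void run_timer_softirq(struct softirq_action *h)
{
	pr_info("[%03d]:%s():call irq_work_tick_soft();\n",
		smp_processor_id(), __func__);
	irq_work_tick_soft();
	__run_timers(this_cpu_ptr(&timer_bases[BASE_STD]));
}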

Regards,
Ran
