From: Alison Chaiken <achaiken@aurora.tech>
To: linux-rt-users <linux-rt-users@vger.kernel.org>
Cc: Rami Jaber <rjaber@aurora.tech>, Glenn Elliott <glenn@aurora.tech>
Subject: late ktimer events due to C-state transitions
Date: Mon, 27 Jun 2022 16:23:25 -0700 [thread overview]
Message-ID: <CAFzL-7v-NSFKAsyhxReEES7bMomSTwYyrZscnjbkydEP3CTXmQ@mail.gmail.com> (raw)
We observe late ktimer events due to C-state transitions from CPU Idle
mode. The test results come from a 5.15-rt kernel with two
timers-related patches backported:
"softirq: Use a dedicated thread for timer wakeups." from Sebastian via
https://lore.kernel.org/linux-rt-users/20211201173316.rm6o67qapqsx2n47@linutronix.de/
and
"tick: Fix timer storm since introduction of timersd" from Frederic via
https://lore.kernel.org/linux-rt-users/20220405010752.1347437-2-frederic@kernel.org/
The kernel is configured with NO_HZ_COMMON=y and NO_HZ_IDLE=y.
Expected sequence of events:
@ 0ms: userspace enqueues a timer event to fire in 10ms from now
@ 10ms: ktimer event fires and triggers the online_periodic callback
@ 10ms: userspace enqueues a timer event to fire in 10ms from now
@ 20ms: ktimer event fires and triggers the online_periodic callback
...
What we’re actually seeing via kprobes:
@ 0ms: userspace enqueues a timer event to fire in 10ms from now
@ 10ms: ktimer event fires and triggers the online_periodic callback
@ 10ms: userspace enqueues a timer event to fire in 10ms from now
@ 15ms: CPU enters a deep idle state and sleeps for a long time
(20-100+)ms, let's say 70ms for this example
@ 80ms: ktimer event fires and triggers the online_periodic callback
@ 80ms: userspace enqueues a timer event to fire in 10ms from now
The result is that the second timer callback runs after 75ms instead
of the expected 10ms.
When the timers work well, the hrtimer_interrupt causes
raise_hrtimer_softirq() to run, as expected and __do_softirq() runs:
<userspace>-13534 [007] 16947.039513: enqueue_hrtimer
<idle>-0 [007] 16947.039555:
hrtimer_next_event_without: 16947.037643167
<idle>-0 [007] 16947.039562: enqueue_hrtimer
<idle>-0 [007] 16947.049453: hrtimer_interrupt
<idle>-0 [007] 16947.049457: raise_hrtimer_softirq.
<idle>-0 [007] 16947.049474: enqueue_hrtimer
ktimers/7-80 [007] 16947.049483: __do_softirq
ktimers/7-80 [007] 16947.049485: hrtimer_run_softirq.
<userspace>-13534 [007] 16947.049523: enqueue_hrtimer
<idle>-0 [007] 16947.049567:
hrtimer_next_event_without: 16947.047643167
<idle>-0 [007] 16947.049573: enqueue_hrtimer
<idle>-0 [007] 16947.049586: enqueue_hrtimer
<idle>-0 [007] 16947.049594:
hrtimer_next_event_without: 16947.047643167
<idle>-0 [007] 16947.049598: enqueue_hrtimer
<idle>-0 [007] 16947.059418: hrtimer_interrupt
<idle>-0 [007] 16947.059423: raise_hrtimer_softirq.
<idle>-0 [007] 16947.059441: enqueue_hrtimer
ktimers/7-80 [007] 16947.059449: __do_softirq
ktimers/7-80 [007] 16947.059451: hrtimer_run_softirq.
<userspace>-13534 [007] 16947.059490: enqueue_hrtimer
<idle>-0 [007] 16947.059529:
hrtimer_next_event_without: 16947.057643167
<idle>-0 [007] 16947.059531: enqueue_hrtimer
<idle>-0 [007] 16947.069434: hrtimer_interrupt
When the timers are delayed, the trouble appears to begin when the
hrtimer_interrupt results in execution of hrtimer_run_queues() instead
of raise_hrtimer_softirq():
<userspace>-13534 [007] 16947.069504: enqueue_hrtimer
<idle>-0 [007] 16947.069547:
hrtimer_next_event_without: 16947.067643167
<idle>-0 [007] 16947.069553: enqueue_hrtimer
<idle>-0 [007] 16947.069567: enqueue_hrtimer
<idle>-0 [007] 16947.069575:
hrtimer_next_event_without: 16947.067643167
<idle>-0 [007] 16947.069579: enqueue_hrtimer
<idle>-0 [007] 16947.078270: hrtimer_interrupt
<idle>-0 [007] 16947.078278: hrtimer_run_queues.
<idle>-0 [007] 16947.078300: enqueue_hrtimer
ktimers/7-80 [007] 16947.078308: __do_softirq
ksoftirqd/7-81 [007] 16947.078338: ksoftirqd_should_run 0
<idle>-0 [007] 16947.078361:
hrtimer_next_event_without: 16947.067643167
<idle>-0 [007] 16947.079323: hrtimer_interrupt
<idle>-0 [007] 16947.079328: hrtimer_run_queues.
<idle>-0 [007] 16947.079334: enqueue_hrtimer
ksoftirqd/7-81 [007] 16947.079359: ksoftirqd_should_run 128
ksoftirqd/7-81 [007] 16947.079360: __do_softirq
ksoftirqd/7-81 [007] 16947.079361: hrtimer_interrupt
ksoftirqd/7-81 [007] 16947.079361: raise_hrtimer_softirq.
ksoftirqd/7-81 [007] 16947.079364: ksoftirqd_should_run 0
<idle>-0 [007] 16947.079375:
hrtimer_next_event_without: 9223372036854775807
<idle>-0 [007] 16947.079376:
tick_nohz_get_sleep_length: 86.838 ms
<idle>-0 [007] 16947.079378: enqueue_hrtimer
Is the problem perhaps that ksoftirqd_should_run() needs to check
local_timer_softirq_pending() created by Frederic's patch as well as
local_softirq_pending()?
For the moment, we find that replacing NO_HZ_COMMON=y and NO_HZ_IDLE=y
with NO_HZ_PERIODIC prevents the undesirable sleep-state transition,
but clearly this solution is not ideal.
Thanks for any suggestions,
Alison Chaiken
achaiken@aurora.tech
Aurora Innovation
reply other threads:[~2022-06-27 23:23 UTC|newest]
Thread overview: [no followups] expand[flat|nested] mbox.gz Atom feed
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAFzL-7v-NSFKAsyhxReEES7bMomSTwYyrZscnjbkydEP3CTXmQ@mail.gmail.com \
--to=achaiken@aurora.tech \
--cc=glenn@aurora.tech \
--cc=linux-rt-users@vger.kernel.org \
--cc=rjaber@aurora.tech \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).