From: Thomas Gleixner <tglx@linutronix.de>
To: LKML <linux-kernel@vger.kernel.org>
Cc: Anna-Maria Behnsen <anna-maria@linutronix.de>,
Peter Zijlstra <peterz@infradead.org>,
Marcelo Tosatti <mtosatti@redhat.com>,
Frederic Weisbecker <frederic@kernel.org>,
Peter Xu <peterx@redhat.com>,
Nitesh Narayan Lal <nitesh@redhat.com>,
Alex Belits <abelits@marvell.com>,
"Rafael J. Wysocki" <rjw@rjwysocki.net>,
John Stultz <john.stultz@linaro.org>
Subject: [patch 2/8] hrtimer: Force clock_was_set() handling for the HIGHRES=n, NOHZ=y case
Date: Tue, 27 Apr 2021 10:25:39 +0200
Message-ID: <20210427083724.180273544@linutronix.de>
In-Reply-To: 20210427082537.611978720@linutronix.de
When CONFIG_HIGH_RES_TIMERS is disabled but NOHZ is enabled,
clock_was_set() does not do anything. With HIGHRES=n the kernel relies on
the periodic tick to update the clock offsets. When NOHZ is enabled and
active, CPUs which are in a deep idle sleep have no periodic tick, so the
expiry of timers affected by clock_was_set() can be delayed arbitrarily,
up to the point where those CPUs are brought out of idle again.

Make the clock_was_set() logic unconditionally available so that idle CPUs
are kicked out of idle to handle the update.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
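Note for readers, not part of the patch: the decision that the reworked
clock_was_set() makes before sending the SMP function call can be sketched
in plain user-space C. This is a minimal model under stated assumptions,
not kernel code. The function name needs_retrigger_ipi() and its two
boolean parameters are hypothetical; they only stand in for
hrtimer_hres_active() and tick_nohz_active as used in the diff.

```c
#include <stdbool.h>

/*
 * Hypothetical user-space model of the check at the top of the patched
 * clock_was_set(). The parameters are illustration-only stand-ins for
 * hrtimer_hres_active() and tick_nohz_active.
 *
 * Returns true when the model would run
 * on_each_cpu(retrigger_next_event, NULL, 1).
 */
bool needs_retrigger_ipi(bool hres_active, bool nohz_active)
{
	/*
	 * Mirrors the patch:
	 *   if (!hrtimer_hres_active() && !tick_nohz_active)
	 *           goto out_timerfd;
	 */
	return hres_active || nohz_active;
}
```

In this model needs_retrigger_ipi(false, true) is true, which is the
HIGHRES=n, NOHZ=y case from the subject line. Only when both inputs are
false is the function call skipped, because the periodic tick then
updates the offsets.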
kernel/time/hrtimer.c | 87 +++++++++++++++++++++++++++++++++-----------------
1 file changed, 59 insertions(+), 28 deletions(-)
--- a/kernel/time/hrtimer.c
+++ b/kernel/time/hrtimer.c
@@ -720,23 +720,7 @@ static inline int hrtimer_is_hres_enable
return hrtimer_hres_enabled;
}
-/*
- * Retrigger next event is called after clock was set
- *
- * Called with interrupts disabled via on_each_cpu()
- */
-static void retrigger_next_event(void *arg)
-{
- struct hrtimer_cpu_base *base = this_cpu_ptr(&hrtimer_bases);
-
- if (!__hrtimer_hres_active(base))
- return;
-
- raw_spin_lock(&base->lock);
- hrtimer_update_base(base);
- hrtimer_force_reprogram(base, 0);
- raw_spin_unlock(&base->lock);
-}
+static void retrigger_next_event(void *arg);
/*
* Switch to high resolution mode
@@ -762,9 +746,50 @@ static void hrtimer_switch_to_hres(void)
static inline int hrtimer_is_hres_enabled(void) { return 0; }
static inline void hrtimer_switch_to_hres(void) { }
-static inline void retrigger_next_event(void *arg) { }
#endif /* CONFIG_HIGH_RES_TIMERS */
+/*
+ * Retrigger next event is called after clock was set with interrupts
+ * disabled through an SMP function call or directly from low level
+ * resume code.
+ *
+ * This is only invoked when:
+ * - CONFIG_HIGH_RES_TIMERS is enabled.
+ * - CONFIG_NOHZ_COMMON is enabled
+ *
+ * For the other cases this function is empty and because the call sites
+ * are optimized out it vanishes as well, i.e. no need for lots of
+ * #ifdeffery.
+ */
+static void retrigger_next_event(void *arg)
+{
+ struct hrtimer_cpu_base *base = this_cpu_ptr(&hrtimer_bases);
+
+ /*
+ * When high resolution mode or nohz is active, then the offsets of
+ * CLOCK_REALTIME/TAI/BOOTTIME have to be updated. Otherwise the
+ * next tick will take care of that.
+ *
+ * If high resolution mode is active then the next expiring timer
+ * must be reevaluated and the clock event device reprogrammed if
+ * necessary.
+ *
+ * In the NOHZ case the update of the offset and the reevaluation
+ * of the next expiring timer is enough. The return from the SMP
+ * function call will take care of the reprogramming in case the
+ * CPU was in a NOHZ idle sleep.
+ */
+ if (!__hrtimer_hres_active(base) && !tick_nohz_active)
+ return;
+
+ raw_spin_lock(&base->lock);
+ hrtimer_update_base(base);
+ if (__hrtimer_hres_active(base))
+ hrtimer_force_reprogram(base, 0);
+ else
+ hrtimer_update_next_event(base);
+ raw_spin_unlock(&base->lock);
+}
/*
* When a timer is enqueued and expires earlier than the already enqueued
@@ -856,22 +881,28 @@ static void hrtimer_reprogram(struct hrt
}
/*
- * Clock realtime was set
- *
- * Change the offset of the realtime clock vs. the monotonic
- * clock.
+ * Clock was set. This might affect CLOCK_REALTIME, CLOCK_TAI and
+ * CLOCK_BOOTTIME (for late sleep time injection).
*
- * We might have to reprogram the high resolution timer interrupt. On
- * SMP we call the architecture specific code to retrigger _all_ high
- * resolution timer interrupts. On UP we just disable interrupts and
- * call the high resolution interrupt code.
+ * This requires to update the offsets for these clocks
+ * vs. CLOCK_MONOTONIC. When high resolution timers are enabled, then this
+ * also requires to eventually reprogram the per CPU clock event devices
+ * when the change moves an affected timer ahead of the first expiring
+ * timer on that CPU. Obviously remote per CPU clock event devices cannot
+ * be reprogrammed. The other reason why an IPI has to be sent is when the
+ * system is in !HIGH_RES and NOHZ mode. The NOHZ mode updates the offsets
+ * in the tick, which obviously might be stopped, so this has to bring out
+ * the remote CPU which might sleep in idle to get this sorted.
*/
void clock_was_set(void)
{
-#ifdef CONFIG_HIGH_RES_TIMERS
+ if (!hrtimer_hres_active() && !tick_nohz_active)
+ goto out_timerfd;
+
/* Retrigger the CPU local events everywhere */
on_each_cpu(retrigger_next_event, NULL, 1);
-#endif
+
+out_timerfd:
timerfd_clock_was_set();
}
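Note for readers, not part of the patch: the per-CPU behaviour of the
patched retrigger_next_event() can be modelled in user-space C roughly as
follows. This is a sketch under stated assumptions. struct cpu_model, its
counter fields and model_retrigger_next_event() are hypothetical names;
the counters merely record which of the kernel helpers named in the diff
the real function would have called.

```c
#include <stdbool.h>

/* Hypothetical per-CPU state; all field names are illustrative only. */
struct cpu_model {
	bool hres_active;        /* stands in for __hrtimer_hres_active(base) */
	int offsets_updated;     /* counts hrtimer_update_base() calls */
	int reprogrammed;        /* counts hrtimer_force_reprogram() calls */
	int next_event_updated;  /* counts hrtimer_update_next_event() calls */
};

/*
 * Model of the patched retrigger_next_event(). The nohz_active
 * parameter stands in for the kernel's tick_nohz_active.
 */
void model_retrigger_next_event(struct cpu_model *cpu, bool nohz_active)
{
	/* Neither mode is active: the next periodic tick updates the offsets. */
	if (!cpu->hres_active && !nohz_active)
		return;

	/* Offsets of CLOCK_REALTIME/TAI/BOOTTIME are refreshed in both modes. */
	cpu->offsets_updated++;

	if (cpu->hres_active)
		cpu->reprogrammed++;        /* clock event device is reprogrammed */
	else
		cpu->next_event_updated++;  /* only the next expiry is reevaluated */
}
```

For a CPU with hres_active == false and nohz_active == true the model
updates the offsets and the next event but does not reprogram anything,
which matches the comment in the hunk: the return from the SMP function
call takes care of reprogramming if the CPU was in a NOHZ idle sleep.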