All of lore.kernel.org
 help / color / mirror / Atom feed
From: John Stultz <johnstul@us.ibm.com>
To: stable@vger.kernel.org
Cc: John Stultz <johnstul@us.ibm.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Prarit Bhargava <prarit@redhat.com>,
	Linux Kernel <linux-kernel@vger.kernel.org>
Subject: [PATCH 10/11] 2.6.35.x: hrtimer: Update hrtimer base offsets each hrtimer_interrupt
Date: Tue, 17 Jul 2012 14:05:41 -0400	[thread overview]
Message-ID: <1342548342-19683-11-git-send-email-johnstul@us.ibm.com> (raw)
In-Reply-To: <1342548342-19683-1-git-send-email-johnstul@us.ibm.com>

This is a backport of 5baefd6d84163443215f4a99f6a20f054ef11236

The update of the hrtimer base offsets on all cpus cannot be made
atomically from the timekeeper.lock held and interrupt disabled region
as smp function calls are not allowed there.

clock_was_set(), which enforces the update on all cpus, is called
either from preemptible process context in case of do_settimeofday()
or from the softirq context when the offset modification happened in
the timer interrupt itself due to a leap second.

In both cases there is a race window for an hrtimer interrupt between
dropping timekeeper lock, enabling interrupts and clock_was_set()
issuing the updates. Any interrupt which arrives in that window will
see the new time but operate on stale offsets.

So we need to make sure that an hrtimer interrupt always sees a
consistent state of time and offsets.

ktime_get_update_offsets() allows us to get the current monotonic time
and update the per cpu hrtimer base offsets from hrtimer_interrupt()
to capture a consistent state of monotonic time and the offsets. The
function replaces the existing ktime_get() calls in hrtimer_interrupt().

The overhead of the new function vs. ktime_get() is minimal as it just
adds two store operations.

This ensures that any changes to realtime or boottime offsets are
noticed and stored into the per-cpu hrtimer base structures, prior to
any hrtimer expiration and guarantees that timers are not expired early.

Signed-off-by: John Stultz <johnstul@us.ibm.com>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Prarit Bhargava <prarit@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1341960205-56738-8-git-send-email-johnstul@us.ibm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Linux Kernel <linux-kernel@vger.kernel.org>
Signed-off-by: John Stultz <johnstul@us.ibm.com>
---
 kernel/hrtimer.c |   27 ++++++++++++---------------
 1 file changed, 12 insertions(+), 15 deletions(-)

diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index 2f05eb1..85b5cfc 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -603,6 +603,12 @@ static int hrtimer_reprogram(struct hrtimer *timer,
 	return res;
 }
 
+static inline ktime_t hrtimer_update_base(struct hrtimer_cpu_base *base)
+{
+	ktime_t *offs_real = &base->clock_base[CLOCK_REALTIME].offset;
+
+	return ktime_get_update_offsets(offs_real);
+}
 
 /*
  * Retrigger next event is called after clock was set
@@ -612,26 +618,15 @@ static int hrtimer_reprogram(struct hrtimer *timer,
 static void retrigger_next_event(void *arg)
 {
 	struct hrtimer_cpu_base *base;
-	struct timespec realtime_offset;
-	unsigned long seq;
 
 	if (!hrtimer_hres_active())
 		return;
 
-	do {
-		seq = read_seqbegin(&xtime_lock);
-		set_normalized_timespec(&realtime_offset,
-					-wall_to_monotonic.tv_sec,
-					-wall_to_monotonic.tv_nsec);
-	} while (read_seqretry(&xtime_lock, seq));
-
 	base = &__get_cpu_var(hrtimer_bases);
 
 	/* Adjust CLOCK_REALTIME offset */
 	raw_spin_lock(&base->lock);
-	base->clock_base[CLOCK_REALTIME].offset =
-		timespec_to_ktime(realtime_offset);
-
+	hrtimer_update_base(base);
 	hrtimer_force_reprogram(base, 0);
 	raw_spin_unlock(&base->lock);
 }
@@ -731,7 +726,6 @@ static int hrtimer_switch_to_hres(void)
 	base->clock_base[CLOCK_MONOTONIC].resolution = KTIME_HIGH_RES;
 
 	tick_setup_sched_timer();
-
 	/* "Retrigger" the interrupt to get things going */
 	retrigger_next_event(NULL);
 	local_irq_restore(flags);
@@ -1280,7 +1274,7 @@ void hrtimer_interrupt(struct clock_event_device *dev)
 	dev->next_event.tv64 = KTIME_MAX;
 
 	raw_spin_lock(&cpu_base->lock);
-	entry_time = now = ktime_get();
+	entry_time = now = hrtimer_update_base(cpu_base);
 retry:
 	expires_next.tv64 = KTIME_MAX;
 	/*
@@ -1356,9 +1350,12 @@ retry:
 	 * We need to prevent that we loop forever in the hrtimer
 	 * interrupt routine. We give it 3 attempts to avoid
 	 * overreacting on some spurious event.
+	 *
+	 * Acquire base lock for updating the offsets and retrieving
+	 * the current time.
 	 */
 	raw_spin_lock(&cpu_base->lock);
-	now = ktime_get();
+	now = hrtimer_update_base(cpu_base);
 	cpu_base->nr_retries++;
 	if (++retries < 3)
 		goto retry;
-- 
1.7.9.5


  parent reply	other threads:[~2012-07-17 18:07 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-07-17 18:05 [PATCH 00/11] 2.6.35-stable: Fix for leapsecond deadlock & hrtimer/futex issue John Stultz
2012-07-17 18:05 ` [PATCH 01/11] 2.6.35.x: ntp: Fix leap-second hrtimer livelock John Stultz
2012-07-17 18:05 ` [PATCH 02/11] 2.6.35.x: ntp: Correct TAI offset during leap second John Stultz
2012-07-17 18:05 ` [PATCH 03/11] 2.6.35.x: timekeeping: Fix CLOCK_MONOTONIC inconsistency during leapsecond John Stultz
2012-07-17 18:05 ` [PATCH 04/11] 2.6.35.x: time: Move common updates to a function John Stultz
2012-07-17 18:05 ` [PATCH 05/11] 2.6.35.x: hrtimer: Provide clock_was_set_delayed() John Stultz
2012-07-17 18:05 ` [PATCH 06/11] 2.6.35.x: timekeeping: Fix leapsecond triggered load spike issue John Stultz
2012-07-17 18:05 ` [PATCH 07/11] 2.6.35.x: timekeeping: Maintain ktime_t based offsets for hrtimers John Stultz
2012-07-17 18:05 ` [PATCH 08/11] 2.6.35.x: hrtimers: Move lock held region in hrtimer_interrupt() John Stultz
2012-07-17 18:05 ` [PATCH 09/11] 2.6.35.x: timekeeping: Provide hrtimer update function John Stultz
2012-07-17 18:05 ` John Stultz [this message]
2012-07-17 18:05 ` [PATCH 11/11] 2.6.35.x: timekeeping: Add missing update call in timekeeping_resume() John Stultz

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1342548342-19683-11-git-send-email-johnstul@us.ibm.com \
    --to=johnstul@us.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=prarit@redhat.com \
    --cc=stable@vger.kernel.org \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.