linux-kernel.vger.kernel.org archive mirror
* [patch 0/7] tick: Cleanups and reduce jiffies_seq held times
@ 2020-11-17 13:19 Thomas Gleixner
  2020-11-17 13:19 ` [patch 1/7] tick/broadcast: Serialize access to tick_next_period Thomas Gleixner
                   ` (6 more replies)
  0 siblings, 7 replies; 15+ messages in thread
From: Thomas Gleixner @ 2020-11-17 13:19 UTC (permalink / raw)
  To: LKML; +Cc: Peter Zijlstra, Frederic Weisbecker, Yunfeng Ye

Yunfeng proposed a change to tick_do_update_jiffies64() to reduce the
contention on the jiffies_seq sequence counter:

  https://lore.kernel.org/r/ac822c72-673e-73e1-9622-c5f12591b373@huawei.com

This made me look deeper. There are lots of other things to optimize, and
also protection rules which are completely undocumented today.
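
For illustration, the pattern the series converges on for
tick_do_update_jiffies64() boils down to a lockless quick check followed by a
recheck under the lock, with the expensive part kept as short as possible. A
stripped down user space sketch (not kernel code, names are made up, the
mutex stands in for jiffies_lock and the slow path for long idle periods is
left out):

#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static uint64_t jiffies64;
static _Atomic int64_t next_period_ns;
static const int64_t period_ns = 1000000;	/* assumed 1ms tick */

static void update_jiffies(int64_t now_ns)
{
	/* Lockless quick check. Racy, but a wrong answer is harmless. */
	if (now_ns < atomic_load_explicit(&next_period_ns, memory_order_relaxed))
		return;

	pthread_mutex_lock(&lock);
	/* Recheck under the lock. Another CPU might have won the race. */
	if (now_ns < atomic_load_explicit(&next_period_ns, memory_order_relaxed)) {
		pthread_mutex_unlock(&lock);
		return;
	}
	/* Keep the write side (the jiffies_seq section in the kernel) short */
	jiffies64++;
	atomic_store_explicit(&next_period_ns,
			      atomic_load_explicit(&next_period_ns,
						   memory_order_relaxed) + period_ns,
			      memory_order_relaxed);
	pthread_mutex_unlock(&lock);
}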

The lot is based on 

   git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git timers/core

and also available via git from:

   git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git tick

Thanks,

	tglx



* [patch 1/7] tick/broadcast: Serialize access to tick_next_period
  2020-11-17 13:19 [patch 0/7] tick: Cleanups and reduce jiffies_seq held times Thomas Gleixner
@ 2020-11-17 13:19 ` Thomas Gleixner
  2020-11-19  9:55   ` [tip: timers/core] " tip-bot2 for Thomas Gleixner
  2020-11-17 13:19 ` [patch 2/7] tick: Document protections for tick related data Thomas Gleixner
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 15+ messages in thread
From: Thomas Gleixner @ 2020-11-17 13:19 UTC (permalink / raw)
  To: LKML; +Cc: Peter Zijlstra, Frederic Weisbecker, Yunfeng Ye

tick_broadcast_setup_oneshot() accesses tick_next_period twice without any
serialization. This is wrong in two aspects:

  - Reading it twice might make the broadcast data inconsistent if the
    variable is updated concurrently.

  - On 32bit systems the access might see a partial update (a user space
    sketch of such a torn read follows after the callchain below)

Protect it with jiffies_lock. That's safe as none of the callchains leading
up to this function can create a lock ordering violation:

timer interrupt
  run_local_timers()
    hrtimer_run_queues()
      hrtimer_switch_to_hres()
        tick_init_highres()
	  tick_switch_to_oneshot()
	    tick_broadcast_switch_to_oneshot()
or
     tick_check_oneshot_change()
       tick_nohz_switch_to_nohz()
         tick_switch_to_oneshot()
           tick_broadcast_switch_to_oneshot()
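
The 32bit partial update (store/load tearing) is easy to demonstrate in user
space. A deliberately racy sketch (my illustration, not kernel code; a torn
value can only show up when built for a 32bit target where a plain 64bit
access is split into two 32bit accesses):

#include <pthread.h>
#include <stdint.h>
#include <stdio.h>

/* Deliberately plain (non-atomic) 64bit variable: the race is the point */
static volatile uint64_t shared;

static void *writer(void *unused)
{
	for (;;) {
		shared = 0;
		shared = UINT64_MAX;
	}
	return NULL;
}

int main(void)
{
	pthread_t t;

	pthread_create(&t, NULL, writer, NULL);
	for (unsigned long i = 0; i < 100000000UL; i++) {
		uint64_t v = shared;

		/* A torn read shows up as a half old / half new value */
		if (v != 0 && v != UINT64_MAX) {
			printf("torn read: %#llx\n", (unsigned long long)v);
			return 1;
		}
	}
	puts("no tearing observed (expected on a 64bit build)");
	return 0;
}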

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/time/tick-broadcast.c |   23 ++++++++++++++++++++---
 1 file changed, 20 insertions(+), 3 deletions(-)

--- a/kernel/time/tick-broadcast.c
+++ b/kernel/time/tick-broadcast.c
@@ -877,6 +877,22 @@ static void tick_broadcast_init_next_eve
 	}
 }
 
+static inline ktime_t tick_get_next_period(void)
+{
+	ktime_t next;
+
+	/*
+	 * Protect against concurrent updates (store/load tearing on
+	 * 32bit). It does not matter if the time is already in the
+	 * past. The broadcast device which is about to be programmed will
+	 * fire in any case.
+	 */
+	raw_spin_lock(&jiffies_lock);
+	next = tick_next_period;
+	raw_spin_unlock(&jiffies_lock);
+	return next;
+}
+
 /**
  * tick_broadcast_setup_oneshot - setup the broadcast device
  */
@@ -905,10 +921,11 @@ static void tick_broadcast_setup_oneshot
 			   tick_broadcast_oneshot_mask, tmpmask);
 
 		if (was_periodic && !cpumask_empty(tmpmask)) {
+			ktime_t nextevt = tick_get_next_period();
+
 			clockevents_switch_state(bc, CLOCK_EVT_STATE_ONESHOT);
-			tick_broadcast_init_next_event(tmpmask,
-						       tick_next_period);
-			tick_broadcast_set_event(bc, cpu, tick_next_period);
+			tick_broadcast_init_next_event(tmpmask, nextevt);
+			tick_broadcast_set_event(bc, cpu, nextevt);
 		} else
 			bc->next_event = KTIME_MAX;
 	} else {



* [patch 2/7] tick: Document protections for tick related data
  2020-11-17 13:19 [patch 0/7] tick: Cleanups and reduce jiffies_seq held times Thomas Gleixner
  2020-11-17 13:19 ` [patch 1/7] tick/broadcast: Serialize access to tick_next_period Thomas Gleixner
@ 2020-11-17 13:19 ` Thomas Gleixner
  2020-11-19  9:55   ` [tip: timers/core] " tip-bot2 for Thomas Gleixner
  2020-11-17 13:19 ` [patch 3/7] tick/sched: Use tick_next_period for lockless quick check Thomas Gleixner
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 15+ messages in thread
From: Thomas Gleixner @ 2020-11-17 13:19 UTC (permalink / raw)
  To: LKML; +Cc: Peter Zijlstra, Frederic Weisbecker, Yunfeng Ye

The protection rules for tick_next_period and last_jiffies_update are blurry
at best. Clarify this.
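
For reference, the reader which needs that consistency is
tick_nohz_next_event(). Stripped down to the relevant part, it pairs jiffies
and last_jiffies_update into one snapshot via the seqcount:

	unsigned long basejiff;
	unsigned int seq;
	u64 basemono;

	do {
		seq = read_seqcount_begin(&jiffies_seq);
		basemono = last_jiffies_update;
		basejiff = jiffies;
	} while (read_seqcount_retry(&jiffies_seq, seq));

If a writer modified last_jiffies_update without holding the seqcount write
side, this loop could pair a new last_jiffies_update with stale jiffies or
vice versa.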

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/time/tick-common.c |    4 +++-
 kernel/time/tick-sched.c  |    4 +++-
 2 files changed, 6 insertions(+), 2 deletions(-)

--- a/kernel/time/tick-common.c
+++ b/kernel/time/tick-common.c
@@ -27,7 +27,9 @@
  */
 DEFINE_PER_CPU(struct tick_device, tick_cpu_device);
 /*
- * Tick next event: keeps track of the tick time
+ * Tick next event: keeps track of the tick time. It's updated by the
+ * CPU which handles the tick and protected by jiffies_lock. There is
+ * no requirement to write hold the jiffies seqcount for it.
  */
 ktime_t tick_next_period;
 ktime_t tick_period;
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -44,7 +44,9 @@ struct tick_sched *tick_get_tick_sched(i
 
 #if defined(CONFIG_NO_HZ_COMMON) || defined(CONFIG_HIGH_RES_TIMERS)
 /*
- * The time, when the last jiffy update happened. Protected by jiffies_lock.
+ * The time, when the last jiffy update happened. Write access must hold
+ * jiffies_lock and jiffies_seq. tick_nohz_next_event() needs to get a
+ * consistent view of jiffies and last_jiffies_update.
  */
 static ktime_t last_jiffies_update;
 



* [patch 3/7] tick/sched: Use tick_next_period for lockless quick check
  2020-11-17 13:19 [patch 0/7] tick: Cleanups and reduce jiffies_seq held times Thomas Gleixner
  2020-11-17 13:19 ` [patch 1/7] tick/broadcast: Serialize access to tick_next_period Thomas Gleixner
  2020-11-17 13:19 ` [patch 2/7] tick: Document protections for tick related data Thomas Gleixner
@ 2020-11-17 13:19 ` Thomas Gleixner
  2020-11-19  9:55   ` [tip: timers/core] " tip-bot2 for Thomas Gleixner
  2020-11-17 13:19 ` [patch 4/7] tick/sched: Reduce seqcount held scope in tick_do_update_jiffies64() Thomas Gleixner
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 15+ messages in thread
From: Thomas Gleixner @ 2020-11-17 13:19 UTC (permalink / raw)
  To: LKML; +Cc: Peter Zijlstra, Frederic Weisbecker, Yunfeng Ye

No point in doing calculations.

   tick_next_period = last_jiffies_update + tick_period

Just check whether now is before tick_next_period to figure out whether
jiffies need an update.

Add a comment explaining why the intentional data race in the quick check is
safe, or not quite so safe in a 32bit corner case, and why it's not worth
worrying about.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/time/tick-sched.c |   46 +++++++++++++++++++++++++++++++++-------------
 1 file changed, 33 insertions(+), 13 deletions(-)

--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -59,11 +59,29 @@ static void tick_do_update_jiffies64(kti
 	ktime_t delta;
 
 	/*
-	 * Do a quick check without holding jiffies_lock:
-	 * The READ_ONCE() pairs with two updates done later in this function.
+	 * Do a quick check without holding jiffies_lock. The READ_ONCE()
+	 * pairs with the update done later in this function.
+	 *
+	 * This is also an intentional data race which is even safe on
+	 * 32bit in theory. If there is a concurrent update then the check
+	 * might give a random answer. It does not matter because if it
+	 * returns then the concurrent update is already taking care, if it
+	 * falls through then it will pointlessly contend on jiffies_lock.
+	 *
+	 * Though there is one nasty case on 32bit due to store tearing of
+	 * the 64bit value. If the first 32bit store makes the quick check
+	 * return on all other CPUs and the writing CPU context gets
+	 * delayed to complete the second store (scheduled out on virt)
+	 * then jiffies can become stale for up to ~2^32 nanoseconds
+	 * without noticing. After that point all CPUs will wait for
+	 * jiffies lock.
+	 *
+	 * OTOH, this is not any different than the situation with NOHZ=off
+	 * where one CPU is responsible for updating jiffies and
+	 * timekeeping. If that CPU goes out for lunch then all other CPUs
+	 * will operate on stale jiffies until it decides to come back.
 	 */
-	delta = ktime_sub(now, READ_ONCE(last_jiffies_update));
-	if (delta < tick_period)
+	if (ktime_before(now, READ_ONCE(tick_next_period)))
 		return;
 
 	/* Reevaluate with jiffies_lock held */
@@ -74,9 +92,8 @@ static void tick_do_update_jiffies64(kti
 	if (delta >= tick_period) {
 
 		delta = ktime_sub(delta, tick_period);
-		/* Pairs with the lockless read in this function. */
-		WRITE_ONCE(last_jiffies_update,
-			   ktime_add(last_jiffies_update, tick_period));
+		last_jiffies_update = ktime_add(last_jiffies_update,
+						tick_period);
 
 		/* Slow path for long timeouts */
 		if (unlikely(delta >= tick_period)) {
@@ -84,15 +101,18 @@ static void tick_do_update_jiffies64(kti
 
 			ticks = ktime_divns(delta, incr);
 
-			/* Pairs with the lockless read in this function. */
-			WRITE_ONCE(last_jiffies_update,
-				   ktime_add_ns(last_jiffies_update,
-						incr * ticks));
+			last_jiffies_update = ktime_add_ns(last_jiffies_update,
+							   incr * ticks);
 		}
 		do_timer(++ticks);
 
-		/* Keep the tick_next_period variable up to date */
-		tick_next_period = ktime_add(last_jiffies_update, tick_period);
+		/*
+		 * Keep the tick_next_period variable up to date.
+		 * WRITE_ONCE() pairs with the READ_ONCE() in the lockless
+		 * quick check above.
+		 */
+		WRITE_ONCE(tick_next_period,
+			   ktime_add(last_jiffies_update, tick_period));
 	} else {
 		write_seqcount_end(&jiffies_seq);
 		raw_spin_unlock(&jiffies_lock);



* [patch 4/7] tick/sched: Reduce seqcount held scope in tick_do_update_jiffies64()
  2020-11-17 13:19 [patch 0/7] tick: Cleanups and reduce jiffies_seq held times Thomas Gleixner
                   ` (2 preceding siblings ...)
  2020-11-17 13:19 ` [patch 3/7] tick/sched: Use tick_next_period for lockless quick check Thomas Gleixner
@ 2020-11-17 13:19 ` Thomas Gleixner
  2020-11-19  9:55   ` [tip: timers/core] " tip-bot2 for Yunfeng Ye
  2020-11-17 13:19 ` [patch 5/7] tick/sched: Optimize tick_do_update_jiffies64() further Thomas Gleixner
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 15+ messages in thread
From: Thomas Gleixner @ 2020-11-17 13:19 UTC (permalink / raw)
  To: LKML; +Cc: Peter Zijlstra, Frederic Weisbecker, Yunfeng Ye

If jiffies are already up to date (the caller lost the race against another
CPU), there is no point in changing the sequence count. Doing that just forces
other CPUs into the seqcount retry loop in tick_nohz_next_event() for
nothing.

Just bail out early.

Reported-by: Yunfeng Ye <yeyunfeng@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/time/tick-sched.c |   53 ++++++++++++++++++++++-------------------------
 1 file changed, 25 insertions(+), 28 deletions(-)

--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -86,38 +86,35 @@ static void tick_do_update_jiffies64(kti
 
 	/* Reevaluate with jiffies_lock held */
 	raw_spin_lock(&jiffies_lock);
+	if (ktime_before(now, tick_next_period)) {
+		raw_spin_unlock(&jiffies_lock);
+		return;
+	}
+
 	write_seqcount_begin(&jiffies_seq);
 
-	delta = ktime_sub(now, last_jiffies_update);
-	if (delta >= tick_period) {
+	last_jiffies_update = ktime_add(last_jiffies_update, tick_period);
 
-		delta = ktime_sub(delta, tick_period);
-		last_jiffies_update = ktime_add(last_jiffies_update,
-						tick_period);
-
-		/* Slow path for long timeouts */
-		if (unlikely(delta >= tick_period)) {
-			s64 incr = ktime_to_ns(tick_period);
-
-			ticks = ktime_divns(delta, incr);
-
-			last_jiffies_update = ktime_add_ns(last_jiffies_update,
-							   incr * ticks);
-		}
-		do_timer(++ticks);
-
-		/*
-		 * Keep the tick_next_period variable up to date.
-		 * WRITE_ONCE() pairs with the READ_ONCE() in the lockless
-		 * quick check above.
-		 */
-		WRITE_ONCE(tick_next_period,
-			   ktime_add(last_jiffies_update, tick_period));
-	} else {
-		write_seqcount_end(&jiffies_seq);
-		raw_spin_unlock(&jiffies_lock);
-		return;
+	delta = ktime_sub(now, tick_next_period);
+	if (unlikely(delta >= tick_period)) {
+		/* Slow path for long idle sleep times */
+		s64 incr = ktime_to_ns(tick_period);
+
+		ticks = ktime_divns(delta, incr);
+
+		last_jiffies_update = ktime_add_ns(last_jiffies_update,
+						   incr * ticks);
 	}
+
+	do_timer(++ticks);
+
+	/*
+	 * Keep the tick_next_period variable up to date.  WRITE_ONCE()
+	 * pairs with the READ_ONCE() in the lockless quick check above.
+	 */
+	WRITE_ONCE(tick_next_period,
+		   ktime_add(last_jiffies_update, tick_period));
+
 	write_seqcount_end(&jiffies_seq);
 	raw_spin_unlock(&jiffies_lock);
 	update_wall_time();



* [patch 5/7] tick/sched: Optimize tick_do_update_jiffies64() further
  2020-11-17 13:19 [patch 0/7] tick: Cleanups and reduce jiffies_seq held times Thomas Gleixner
                   ` (3 preceding siblings ...)
  2020-11-17 13:19 ` [patch 4/7] tick/sched: Reduce seqcount held scope in tick_do_update_jiffies64() Thomas Gleixner
@ 2020-11-17 13:19 ` Thomas Gleixner
  2020-11-19  9:55   ` [tip: timers/core] " tip-bot2 for Thomas Gleixner
  2020-11-17 13:19 ` [patch 6/7] tick/sched: Release seqcount before invoking calc_load_global() Thomas Gleixner
  2020-11-17 13:19 ` [patch 7/7] tick: Get rid of tick_period Thomas Gleixner
  6 siblings, 1 reply; 15+ messages in thread
From: Thomas Gleixner @ 2020-11-17 13:19 UTC (permalink / raw)
  To: LKML; +Cc: Peter Zijlstra, Frederic Weisbecker, Yunfeng Ye

Now that it's clear that there is always one tick to account, simplify the
calculations some more.
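
To make that concrete with made up numbers: the early bail out guarantees
that now is not before tick_next_period, so at least one full period has
elapsed. If now is e.g. 2.4 tick periods past tick_next_period, the slow path
computes ticks = 1 + ktime_divns(delta, incr) = 1 + 2 = 3 and do_timer(3)
accounts the three periods which have elapsed since last_jiffies_update. If
now is less than one period past tick_next_period, the else path runs and
do_timer(1) accounts the single pending tick.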

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/time/tick-sched.c |   11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -55,7 +55,7 @@ static ktime_t last_jiffies_update;
  */
 static void tick_do_update_jiffies64(ktime_t now)
 {
-	unsigned long ticks = 0;
+	unsigned long ticks = 1;
 	ktime_t delta;
 
 	/*
@@ -93,20 +93,21 @@ static void tick_do_update_jiffies64(kti
 
 	write_seqcount_begin(&jiffies_seq);
 
-	last_jiffies_update = ktime_add(last_jiffies_update, tick_period);
-
 	delta = ktime_sub(now, tick_next_period);
 	if (unlikely(delta >= tick_period)) {
 		/* Slow path for long idle sleep times */
 		s64 incr = ktime_to_ns(tick_period);
 
-		ticks = ktime_divns(delta, incr);
+		ticks += ktime_divns(delta, incr);
 
 		last_jiffies_update = ktime_add_ns(last_jiffies_update,
 						   incr * ticks);
+	} else {
+		last_jiffies_update = ktime_add(last_jiffies_update,
+						tick_period);
 	}
 
-	do_timer(++ticks);
+	do_timer(ticks);
 
 	/*
 	 * Keep the tick_next_period variable up to date.  WRITE_ONCE()



* [patch 6/7] tick/sched: Release seqcount before invoking calc_load_global()
  2020-11-17 13:19 [patch 0/7] tick: Cleanups and reduce jiffies_seq held times Thomas Gleixner
                   ` (4 preceding siblings ...)
  2020-11-17 13:19 ` [patch 5/7] tick/sched: Optimize tick_do_update_jiffies64() further Thomas Gleixner
@ 2020-11-17 13:19 ` Thomas Gleixner
  2020-11-19  9:55   ` [tip: timers/core] " tip-bot2 for Yunfeng Ye
  2020-11-17 13:19 ` [patch 7/7] tick: Get rid of tick_period Thomas Gleixner
  6 siblings, 1 reply; 15+ messages in thread
From: Thomas Gleixner @ 2020-11-17 13:19 UTC (permalink / raw)
  To: LKML; +Cc: Peter Zijlstra, Frederic Weisbecker, Yunfeng Ye

calc_global_load() does not need the sequence count protection.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/time/tick-sched.c |   12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -20,6 +20,7 @@
 #include <linux/sched/clock.h>
 #include <linux/sched/stat.h>
 #include <linux/sched/nohz.h>
+#include <linux/sched/loadavg.h>
 #include <linux/module.h>
 #include <linux/irq_work.h>
 #include <linux/posix-timers.h>
@@ -107,7 +108,8 @@ static void tick_do_update_jiffies64(kti
 						tick_period);
 	}
 
-	do_timer(ticks);
+	/* Advance jiffies to complete the jiffies_seq protected job */
+	jiffies_64 += ticks;
 
 	/*
 	 * Keep the tick_next_period variable up to date.  WRITE_ONCE()
@@ -116,7 +118,15 @@ static void tick_do_update_jiffies64(kti
 	WRITE_ONCE(tick_next_period,
 		   ktime_add(last_jiffies_update, tick_period));
 
+	/*
+	 * Release the sequence count. calc_global_load() below is not
+	 * protected by it, but jiffies_lock needs to be held to prevent
+	 * concurrent invocations.
+	 */
 	write_seqcount_end(&jiffies_seq);
+
+	calc_global_load();
+
 	raw_spin_unlock(&jiffies_lock);
 	update_wall_time();
 }



* [patch 7/7] tick: Get rid of tick_period
  2020-11-17 13:19 [patch 0/7] tick: Cleanups and reduce jiffies_seq held times Thomas Gleixner
                   ` (5 preceding siblings ...)
  2020-11-17 13:19 ` [patch 6/7] tick/sched: Release seqcount before invoking calc_load_global() Thomas Gleixner
@ 2020-11-17 13:19 ` Thomas Gleixner
  2020-11-19  9:55   ` [tip: timers/core] " tip-bot2 for Thomas Gleixner
  6 siblings, 1 reply; 15+ messages in thread
From: Thomas Gleixner @ 2020-11-17 13:19 UTC (permalink / raw)
  To: LKML; +Cc: Peter Zijlstra, Frederic Weisbecker, Yunfeng Ye

The variable tick_period is initialized to NSEC_PER_SEC / HZ during boot
and never updated again.

If NSEC_PER_SEC is not an integer multiple of HZ this computation is less
accurate than TICK_NSEC which has proper rounding in place.

Aside from the inaccuracy there is no reason for having this variable at
all. It's just a pointless indirection and all usage sites can just use the
TICK_NSEC constant.
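
For illustration, a small user space check of the difference. HZ=1024 is just
a convenient value where truncation and rounding disagree, not a statement
about any particular configuration; the second line mirrors the TICK_NSEC
formula:

#include <stdio.h>

#define NSEC_PER_SEC	1000000000L
#define HZ		1024L	/* assumed, chosen so the results differ */

int main(void)
{
	/* What the old tick_period initialization computed (truncating) */
	printf("NSEC_PER_SEC / HZ          = %ld\n", NSEC_PER_SEC / HZ);

	/* What TICK_NSEC computes (rounded to the nearest nanosecond) */
	printf("(NSEC_PER_SEC + HZ/2) / HZ = %ld\n", (NSEC_PER_SEC + HZ / 2) / HZ);

	return 0;
}

which prints 976562 and 976563 respectively for the assumed HZ=1024.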

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/time/tick-broadcast.c |    2 +-
 kernel/time/tick-common.c    |    8 +++-----
 kernel/time/tick-internal.h  |    1 -
 kernel/time/tick-sched.c     |   22 +++++++++++-----------
 4 files changed, 15 insertions(+), 18 deletions(-)

--- a/kernel/time/tick-broadcast.c
+++ b/kernel/time/tick-broadcast.c
@@ -331,7 +331,7 @@ static void tick_handle_periodic_broadca
 	bc_local = tick_do_periodic_broadcast();
 
 	if (clockevent_state_oneshot(dev)) {
-		ktime_t next = ktime_add(dev->next_event, tick_period);
+		ktime_t next = ktime_add_ns(dev->next_event, TICK_NSEC);
 
 		clockevents_program_event(dev, next, true);
 	}
--- a/kernel/time/tick-common.c
+++ b/kernel/time/tick-common.c
@@ -32,7 +32,6 @@ DEFINE_PER_CPU(struct tick_device, tick_
  * no requirement to write hold the jiffies seqcount for it.
  */
 ktime_t tick_next_period;
-ktime_t tick_period;
 
 /*
  * tick_do_timer_cpu is a timer core internal variable which holds the CPU NR
@@ -90,7 +89,7 @@ static void tick_periodic(int cpu)
 		write_seqcount_begin(&jiffies_seq);
 
 		/* Keep track of the next tick event */
-		tick_next_period = ktime_add(tick_next_period, tick_period);
+		tick_next_period = ktime_add_ns(tick_next_period, TICK_NSEC);
 
 		do_timer(1);
 		write_seqcount_end(&jiffies_seq);
@@ -129,7 +128,7 @@ void tick_handle_periodic(struct clock_e
 		 * Setup the next period for devices, which do not have
 		 * periodic mode:
 		 */
-		next = ktime_add(next, tick_period);
+		next = ktime_add_ns(next, TICK_NSEC);
 
 		if (!clockevents_program_event(dev, next, false))
 			return;
@@ -175,7 +174,7 @@ void tick_setup_periodic(struct clock_ev
 		for (;;) {
 			if (!clockevents_program_event(dev, next, false))
 				return;
-			next = ktime_add(next, tick_period);
+			next = ktime_add_ns(next, TICK_NSEC);
 		}
 	}
 }
@@ -222,7 +221,6 @@ static void tick_setup_device(struct tic
 			tick_do_timer_cpu = cpu;
 
 			tick_next_period = ktime_get();
-			tick_period = NSEC_PER_SEC / HZ;
 #ifdef CONFIG_NO_HZ_FULL
 			/*
 			 * The boot CPU may be nohz_full, in which case set
--- a/kernel/time/tick-internal.h
+++ b/kernel/time/tick-internal.h
@@ -15,7 +15,6 @@
 
 DECLARE_PER_CPU(struct tick_device, tick_cpu_device);
 extern ktime_t tick_next_period;
-extern ktime_t tick_period;
 extern int tick_do_timer_cpu __read_mostly;
 
 extern void tick_setup_periodic(struct clock_event_device *dev, int broadcast);
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -95,17 +95,17 @@ static void tick_do_update_jiffies64(kti
 	write_seqcount_begin(&jiffies_seq);
 
 	delta = ktime_sub(now, tick_next_period);
-	if (unlikely(delta >= tick_period)) {
+	if (unlikely(delta >= TICK_NSEC)) {
 		/* Slow path for long idle sleep times */
-		s64 incr = ktime_to_ns(tick_period);
+		s64 incr = TICK_NSEC;
 
 		ticks += ktime_divns(delta, incr);
 
 		last_jiffies_update = ktime_add_ns(last_jiffies_update,
 						   incr * ticks);
 	} else {
-		last_jiffies_update = ktime_add(last_jiffies_update,
-						tick_period);
+		last_jiffies_update = ktime_add_ns(last_jiffies_update,
+						   TICK_NSEC);
 	}
 
 	/* Advance jiffies to complete the jiffies_seq protected job */
@@ -116,7 +116,7 @@ static void tick_do_update_jiffies64(kti
 	 * pairs with the READ_ONCE() in the lockless quick check above.
 	 */
 	WRITE_ONCE(tick_next_period,
-		   ktime_add(last_jiffies_update, tick_period));
+		   ktime_add_ns(last_jiffies_update, TICK_NSEC));
 
 	/*
 	 * Release the sequence count. calc_global_load() below is not
@@ -691,7 +691,7 @@ static void tick_nohz_restart(struct tic
 	hrtimer_set_expires(&ts->sched_timer, ts->last_tick);
 
 	/* Forward the time to expire in the future */
-	hrtimer_forward(&ts->sched_timer, now, tick_period);
+	hrtimer_forward(&ts->sched_timer, now, TICK_NSEC);
 
 	if (ts->nohz_mode == NOHZ_MODE_HIGHRES) {
 		hrtimer_start_expires(&ts->sched_timer,
@@ -1260,7 +1260,7 @@ static void tick_nohz_handler(struct clo
 	if (unlikely(ts->tick_stopped))
 		return;
 
-	hrtimer_forward(&ts->sched_timer, now, tick_period);
+	hrtimer_forward(&ts->sched_timer, now, TICK_NSEC);
 	tick_program_event(hrtimer_get_expires(&ts->sched_timer), 1);
 }
 
@@ -1297,7 +1297,7 @@ static void tick_nohz_switch_to_nohz(voi
 	next = tick_init_jiffy_update();
 
 	hrtimer_set_expires(&ts->sched_timer, next);
-	hrtimer_forward_now(&ts->sched_timer, tick_period);
+	hrtimer_forward_now(&ts->sched_timer, TICK_NSEC);
 	tick_program_event(hrtimer_get_expires(&ts->sched_timer), 1);
 	tick_nohz_activate(ts, NOHZ_MODE_LOWRES);
 }
@@ -1363,7 +1363,7 @@ static enum hrtimer_restart tick_sched_t
 	if (unlikely(ts->tick_stopped))
 		return HRTIMER_NORESTART;
 
-	hrtimer_forward(timer, now, tick_period);
+	hrtimer_forward(timer, now, TICK_NSEC);
 
 	return HRTIMER_RESTART;
 }
@@ -1397,13 +1397,13 @@ void tick_setup_sched_timer(void)
 
 	/* Offset the tick to avert jiffies_lock contention. */
 	if (sched_skew_tick) {
-		u64 offset = ktime_to_ns(tick_period) >> 1;
+		u64 offset = TICK_NSEC >> 1;
 		do_div(offset, num_possible_cpus());
 		offset *= smp_processor_id();
 		hrtimer_add_expires_ns(&ts->sched_timer, offset);
 	}
 
-	hrtimer_forward(&ts->sched_timer, now, tick_period);
+	hrtimer_forward(&ts->sched_timer, now, TICK_NSEC);
 	hrtimer_start_expires(&ts->sched_timer, HRTIMER_MODE_ABS_PINNED_HARD);
 	tick_nohz_activate(ts, NOHZ_MODE_HIGHRES);
 }



* [tip: timers/core] tick/sched: Release seqcount before invoking calc_load_global()
  2020-11-17 13:19 ` [patch 6/7] tick/sched: Release seqcount before invoking calc_load_global() Thomas Gleixner
@ 2020-11-19  9:55   ` tip-bot2 for Yunfeng Ye
  0 siblings, 0 replies; 15+ messages in thread
From: tip-bot2 for Yunfeng Ye @ 2020-11-19  9:55 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Yunfeng Ye, Thomas Gleixner, x86, linux-kernel

The following commit has been merged into the timers/core branch of tip:

Commit-ID:     896b969e6732b68ee3c12ae4e1aeddf5db99bc46
Gitweb:        https://git.kernel.org/tip/896b969e6732b68ee3c12ae4e1aeddf5db99bc46
Author:        Yunfeng Ye <yeyunfeng@huawei.com>
AuthorDate:    Tue, 17 Nov 2020 14:19:48 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Thu, 19 Nov 2020 10:48:29 +01:00

tick/sched: Release seqcount before invoking calc_load_global()

calc_global_load() does not need the sequence count protection.

[ tglx: Split it up properly and added comments ]

Signed-off-by: Yunfeng Ye <yeyunfeng@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20201117132006.660902274@linutronix.de

---
 kernel/time/tick-sched.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 306adeb..33c897b 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -20,6 +20,7 @@
 #include <linux/sched/clock.h>
 #include <linux/sched/stat.h>
 #include <linux/sched/nohz.h>
+#include <linux/sched/loadavg.h>
 #include <linux/module.h>
 #include <linux/irq_work.h>
 #include <linux/posix-timers.h>
@@ -107,7 +108,8 @@ static void tick_do_update_jiffies64(ktime_t now)
 						tick_period);
 	}
 
-	do_timer(ticks);
+	/* Advance jiffies to complete the jiffies_seq protected job */
+	jiffies_64 += ticks;
 
 	/*
 	 * Keep the tick_next_period variable up to date.  WRITE_ONCE()
@@ -116,7 +118,15 @@ static void tick_do_update_jiffies64(ktime_t now)
 	WRITE_ONCE(tick_next_period,
 		   ktime_add(last_jiffies_update, tick_period));
 
+	/*
+	 * Release the sequence count. calc_global_load() below is not
+	 * protected by it, but jiffies_lock needs to be held to prevent
+	 * concurrent invocations.
+	 */
 	write_seqcount_end(&jiffies_seq);
+
+	calc_global_load();
+
 	raw_spin_unlock(&jiffies_lock);
 	update_wall_time();
 }


* [tip: timers/core] tick: Get rid of tick_period
  2020-11-17 13:19 ` [patch 7/7] tick: Get rid of tick_period Thomas Gleixner
@ 2020-11-19  9:55   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 15+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2020-11-19  9:55 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, x86, linux-kernel

The following commit has been merged into the timers/core branch of tip:

Commit-ID:     b996544916429946bf4934c1c01a306d1690972c
Gitweb:        https://git.kernel.org/tip/b996544916429946bf4934c1c01a306d1690972c
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Tue, 17 Nov 2020 14:19:49 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Thu, 19 Nov 2020 10:48:29 +01:00

tick: Get rid of tick_period

The variable tick_period is initialized to NSEC_PER_SEC / HZ during boot
and never updated again.

If NSEC_PER_SEC is not an integer multiple of HZ this computation is less
accurate than TICK_NSEC which has proper rounding in place.

Aside from the inaccuracy there is no reason for having this variable at
all. It's just a pointless indirection and all usage sites can just use the
TICK_NSEC constant.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20201117132006.766643526@linutronix.de

---
 kernel/time/tick-broadcast.c |  2 +-
 kernel/time/tick-common.c    |  8 +++-----
 kernel/time/tick-internal.h  |  1 -
 kernel/time/tick-sched.c     | 22 +++++++++++-----------
 4 files changed, 15 insertions(+), 18 deletions(-)

diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c
index 2a47c8f..5a23829 100644
--- a/kernel/time/tick-broadcast.c
+++ b/kernel/time/tick-broadcast.c
@@ -331,7 +331,7 @@ static void tick_handle_periodic_broadcast(struct clock_event_device *dev)
 	bc_local = tick_do_periodic_broadcast();
 
 	if (clockevent_state_oneshot(dev)) {
-		ktime_t next = ktime_add(dev->next_event, tick_period);
+		ktime_t next = ktime_add_ns(dev->next_event, TICK_NSEC);
 
 		clockevents_program_event(dev, next, true);
 	}
diff --git a/kernel/time/tick-common.c b/kernel/time/tick-common.c
index 68504eb..a03764d 100644
--- a/kernel/time/tick-common.c
+++ b/kernel/time/tick-common.c
@@ -32,7 +32,6 @@ DEFINE_PER_CPU(struct tick_device, tick_cpu_device);
  * no requirement to write hold the jiffies seqcount for it.
  */
 ktime_t tick_next_period;
-ktime_t tick_period;
 
 /*
  * tick_do_timer_cpu is a timer core internal variable which holds the CPU NR
@@ -90,7 +89,7 @@ static void tick_periodic(int cpu)
 		write_seqcount_begin(&jiffies_seq);
 
 		/* Keep track of the next tick event */
-		tick_next_period = ktime_add(tick_next_period, tick_period);
+		tick_next_period = ktime_add_ns(tick_next_period, TICK_NSEC);
 
 		do_timer(1);
 		write_seqcount_end(&jiffies_seq);
@@ -129,7 +128,7 @@ void tick_handle_periodic(struct clock_event_device *dev)
 		 * Setup the next period for devices, which do not have
 		 * periodic mode:
 		 */
-		next = ktime_add(next, tick_period);
+		next = ktime_add_ns(next, TICK_NSEC);
 
 		if (!clockevents_program_event(dev, next, false))
 			return;
@@ -175,7 +174,7 @@ void tick_setup_periodic(struct clock_event_device *dev, int broadcast)
 		for (;;) {
 			if (!clockevents_program_event(dev, next, false))
 				return;
-			next = ktime_add(next, tick_period);
+			next = ktime_add_ns(next, TICK_NSEC);
 		}
 	}
 }
@@ -222,7 +221,6 @@ static void tick_setup_device(struct tick_device *td,
 			tick_do_timer_cpu = cpu;
 
 			tick_next_period = ktime_get();
-			tick_period = NSEC_PER_SEC / HZ;
 #ifdef CONFIG_NO_HZ_FULL
 			/*
 			 * The boot CPU may be nohz_full, in which case set
diff --git a/kernel/time/tick-internal.h b/kernel/time/tick-internal.h
index 7b24961..7a981c9 100644
--- a/kernel/time/tick-internal.h
+++ b/kernel/time/tick-internal.h
@@ -15,7 +15,6 @@
 
 DECLARE_PER_CPU(struct tick_device, tick_cpu_device);
 extern ktime_t tick_next_period;
-extern ktime_t tick_period;
 extern int tick_do_timer_cpu __read_mostly;
 
 extern void tick_setup_periodic(struct clock_event_device *dev, int broadcast);
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 33c897b..cc7cba2 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -95,17 +95,17 @@ static void tick_do_update_jiffies64(ktime_t now)
 	write_seqcount_begin(&jiffies_seq);
 
 	delta = ktime_sub(now, tick_next_period);
-	if (unlikely(delta >= tick_period)) {
+	if (unlikely(delta >= TICK_NSEC)) {
 		/* Slow path for long idle sleep times */
-		s64 incr = ktime_to_ns(tick_period);
+		s64 incr = TICK_NSEC;
 
 		ticks += ktime_divns(delta, incr);
 
 		last_jiffies_update = ktime_add_ns(last_jiffies_update,
 						   incr * ticks);
 	} else {
-		last_jiffies_update = ktime_add(last_jiffies_update,
-						tick_period);
+		last_jiffies_update = ktime_add_ns(last_jiffies_update,
+						   TICK_NSEC);
 	}
 
 	/* Advance jiffies to complete the jiffies_seq protected job */
@@ -116,7 +116,7 @@ static void tick_do_update_jiffies64(ktime_t now)
 	 * pairs with the READ_ONCE() in the lockless quick check above.
 	 */
 	WRITE_ONCE(tick_next_period,
-		   ktime_add(last_jiffies_update, tick_period));
+		   ktime_add_ns(last_jiffies_update, TICK_NSEC));
 
 	/*
 	 * Release the sequence count. calc_global_load() below is not
@@ -691,7 +691,7 @@ static void tick_nohz_restart(struct tick_sched *ts, ktime_t now)
 	hrtimer_set_expires(&ts->sched_timer, ts->last_tick);
 
 	/* Forward the time to expire in the future */
-	hrtimer_forward(&ts->sched_timer, now, tick_period);
+	hrtimer_forward(&ts->sched_timer, now, TICK_NSEC);
 
 	if (ts->nohz_mode == NOHZ_MODE_HIGHRES) {
 		hrtimer_start_expires(&ts->sched_timer,
@@ -1260,7 +1260,7 @@ static void tick_nohz_handler(struct clock_event_device *dev)
 	if (unlikely(ts->tick_stopped))
 		return;
 
-	hrtimer_forward(&ts->sched_timer, now, tick_period);
+	hrtimer_forward(&ts->sched_timer, now, TICK_NSEC);
 	tick_program_event(hrtimer_get_expires(&ts->sched_timer), 1);
 }
 
@@ -1297,7 +1297,7 @@ static void tick_nohz_switch_to_nohz(void)
 	next = tick_init_jiffy_update();
 
 	hrtimer_set_expires(&ts->sched_timer, next);
-	hrtimer_forward_now(&ts->sched_timer, tick_period);
+	hrtimer_forward_now(&ts->sched_timer, TICK_NSEC);
 	tick_program_event(hrtimer_get_expires(&ts->sched_timer), 1);
 	tick_nohz_activate(ts, NOHZ_MODE_LOWRES);
 }
@@ -1363,7 +1363,7 @@ static enum hrtimer_restart tick_sched_timer(struct hrtimer *timer)
 	if (unlikely(ts->tick_stopped))
 		return HRTIMER_NORESTART;
 
-	hrtimer_forward(timer, now, tick_period);
+	hrtimer_forward(timer, now, TICK_NSEC);
 
 	return HRTIMER_RESTART;
 }
@@ -1397,13 +1397,13 @@ void tick_setup_sched_timer(void)
 
 	/* Offset the tick to avert jiffies_lock contention. */
 	if (sched_skew_tick) {
-		u64 offset = ktime_to_ns(tick_period) >> 1;
+		u64 offset = TICK_NSEC >> 1;
 		do_div(offset, num_possible_cpus());
 		offset *= smp_processor_id();
 		hrtimer_add_expires_ns(&ts->sched_timer, offset);
 	}
 
-	hrtimer_forward(&ts->sched_timer, now, tick_period);
+	hrtimer_forward(&ts->sched_timer, now, TICK_NSEC);
 	hrtimer_start_expires(&ts->sched_timer, HRTIMER_MODE_ABS_PINNED_HARD);
 	tick_nohz_activate(ts, NOHZ_MODE_HIGHRES);
 }

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [tip: timers/core] tick/sched: Optimize tick_do_update_jiffies64() further
  2020-11-17 13:19 ` [patch 5/7] tick/sched: Optimize tick_do_update_jiffies64() further Thomas Gleixner
@ 2020-11-19  9:55   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 15+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2020-11-19  9:55 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, x86, linux-kernel

The following commit has been merged into the timers/core branch of tip:

Commit-ID:     7a35bf2a6a871cd0252cd371d741e7d070b53af9
Gitweb:        https://git.kernel.org/tip/7a35bf2a6a871cd0252cd371d741e7d070b53af9
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Tue, 17 Nov 2020 14:19:47 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Thu, 19 Nov 2020 10:48:29 +01:00

tick/sched: Optimize tick_do_update_jiffies64() further

Now that it's clear that there is always one tick to account, simplify the
calculations some more.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20201117132006.565663056@linutronix.de

---
 kernel/time/tick-sched.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index ca9191c..306adeb 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -55,7 +55,7 @@ static ktime_t last_jiffies_update;
  */
 static void tick_do_update_jiffies64(ktime_t now)
 {
-	unsigned long ticks = 0;
+	unsigned long ticks = 1;
 	ktime_t delta;
 
 	/*
@@ -93,20 +93,21 @@ static void tick_do_update_jiffies64(ktime_t now)
 
 	write_seqcount_begin(&jiffies_seq);
 
-	last_jiffies_update = ktime_add(last_jiffies_update, tick_period);
-
 	delta = ktime_sub(now, tick_next_period);
 	if (unlikely(delta >= tick_period)) {
 		/* Slow path for long idle sleep times */
 		s64 incr = ktime_to_ns(tick_period);
 
-		ticks = ktime_divns(delta, incr);
+		ticks += ktime_divns(delta, incr);
 
 		last_jiffies_update = ktime_add_ns(last_jiffies_update,
 						   incr * ticks);
+	} else {
+		last_jiffies_update = ktime_add(last_jiffies_update,
+						tick_period);
 	}
 
-	do_timer(++ticks);
+	do_timer(ticks);
 
 	/*
 	 * Keep the tick_next_period variable up to date.  WRITE_ONCE()


* [tip: timers/core] tick/sched: Reduce seqcount held scope in tick_do_update_jiffies64()
  2020-11-17 13:19 ` [patch 4/7] tick/sched: Reduce seqcount held scope in tick_do_update_jiffies64() Thomas Gleixner
@ 2020-11-19  9:55   ` tip-bot2 for Yunfeng Ye
  0 siblings, 0 replies; 15+ messages in thread
From: tip-bot2 for Yunfeng Ye @ 2020-11-19  9:55 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Yunfeng Ye, Thomas Gleixner, x86, linux-kernel

The following commit has been merged into the timers/core branch of tip:

Commit-ID:     94ad2e3cedb82af034f6d97c58022f162b669f9b
Gitweb:        https://git.kernel.org/tip/94ad2e3cedb82af034f6d97c58022f162b669f9b
Author:        Yunfeng Ye <yeyunfeng@huawei.com>
AuthorDate:    Tue, 17 Nov 2020 14:19:46 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Thu, 19 Nov 2020 10:48:29 +01:00

tick/sched: Reduce seqcount held scope in tick_do_update_jiffies64()

If jiffies are already up to date (the caller lost the race against another
CPU), there is no point in changing the sequence count. Doing that just forces
other CPUs into the seqcount retry loop in tick_nohz_next_event() for
nothing.

Just bail out early.

[ tglx: Rewrote most of it ]

Signed-off-by: Yunfeng Ye <yeyunfeng@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20201117132006.462195901@linutronix.de

---
 kernel/time/tick-sched.c | 47 ++++++++++++++++++---------------------
 1 file changed, 22 insertions(+), 25 deletions(-)

diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index b4b6abc..ca9191c 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -86,38 +86,35 @@ static void tick_do_update_jiffies64(ktime_t now)
 
 	/* Reevaluate with jiffies_lock held */
 	raw_spin_lock(&jiffies_lock);
+	if (ktime_before(now, tick_next_period)) {
+		raw_spin_unlock(&jiffies_lock);
+		return;
+	}
+
 	write_seqcount_begin(&jiffies_seq);
 
-	delta = ktime_sub(now, last_jiffies_update);
-	if (delta >= tick_period) {
+	last_jiffies_update = ktime_add(last_jiffies_update, tick_period);
 
-		delta = ktime_sub(delta, tick_period);
-		last_jiffies_update = ktime_add(last_jiffies_update,
-						tick_period);
+	delta = ktime_sub(now, tick_next_period);
+	if (unlikely(delta >= tick_period)) {
+		/* Slow path for long idle sleep times */
+		s64 incr = ktime_to_ns(tick_period);
 
-		/* Slow path for long timeouts */
-		if (unlikely(delta >= tick_period)) {
-			s64 incr = ktime_to_ns(tick_period);
+		ticks = ktime_divns(delta, incr);
 
-			ticks = ktime_divns(delta, incr);
+		last_jiffies_update = ktime_add_ns(last_jiffies_update,
+						   incr * ticks);
+	}
 
-			last_jiffies_update = ktime_add_ns(last_jiffies_update,
-							   incr * ticks);
-		}
-		do_timer(++ticks);
+	do_timer(++ticks);
+
+	/*
+	 * Keep the tick_next_period variable up to date.  WRITE_ONCE()
+	 * pairs with the READ_ONCE() in the lockless quick check above.
+	 */
+	WRITE_ONCE(tick_next_period,
+		   ktime_add(last_jiffies_update, tick_period));
 
-		/*
-		 * Keep the tick_next_period variable up to date.
-		 * WRITE_ONCE() pairs with the READ_ONCE() in the lockless
-		 * quick check above.
-		 */
-		WRITE_ONCE(tick_next_period,
-			   ktime_add(last_jiffies_update, tick_period));
-	} else {
-		write_seqcount_end(&jiffies_seq);
-		raw_spin_unlock(&jiffies_lock);
-		return;
-	}
 	write_seqcount_end(&jiffies_seq);
 	raw_spin_unlock(&jiffies_lock);
 	update_wall_time();


* [tip: timers/core] tick/sched: Use tick_next_period for lockless quick check
  2020-11-17 13:19 ` [patch 3/7] tick/sched: Use tick_next_period for lockless quick check Thomas Gleixner
@ 2020-11-19  9:55   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 15+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2020-11-19  9:55 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, x86, linux-kernel

The following commit has been merged into the timers/core branch of tip:

Commit-ID:     372acbbaa80940189593f9d69c7c069955f24f7a
Gitweb:        https://git.kernel.org/tip/372acbbaa80940189593f9d69c7c069955f24f7a
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Tue, 17 Nov 2020 14:19:45 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Thu, 19 Nov 2020 10:48:29 +01:00

tick/sched: Use tick_next_period for lockless quick check

No point in doing calculations.

   tick_next_period = last_jiffies_update + tick_period

Just check whether now is before tick_next_period to figure out whether
jiffies need an update.

Add a comment explaining why the intentional data race in the quick check is
safe, or not quite so safe in a 32bit corner case, and why it's not worth
worrying about.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20201117132006.337366695@linutronix.de

---
 kernel/time/tick-sched.c | 46 +++++++++++++++++++++++++++------------
 1 file changed, 33 insertions(+), 13 deletions(-)

diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 15360e6..b4b6abc 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -59,11 +59,29 @@ static void tick_do_update_jiffies64(ktime_t now)
 	ktime_t delta;
 
 	/*
-	 * Do a quick check without holding jiffies_lock:
-	 * The READ_ONCE() pairs with two updates done later in this function.
+	 * Do a quick check without holding jiffies_lock. The READ_ONCE()
+	 * pairs with the update done later in this function.
+	 *
+	 * This is also an intentional data race which is even safe on
+	 * 32bit in theory. If there is a concurrent update then the check
+	 * might give a random answer. It does not matter because if it
+	 * returns then the concurrent update is already taking care, if it
+	 * falls through then it will pointlessly contend on jiffies_lock.
+	 *
+	 * Though there is one nasty case on 32bit due to store tearing of
+	 * the 64bit value. If the first 32bit store makes the quick check
+	 * return on all other CPUs and the writing CPU context gets
+	 * delayed to complete the second store (scheduled out on virt)
+	 * then jiffies can become stale for up to ~2^32 nanoseconds
+	 * without noticing. After that point all CPUs will wait for
+	 * jiffies lock.
+	 *
+	 * OTOH, this is not any different than the situation with NOHZ=off
+	 * where one CPU is responsible for updating jiffies and
+	 * timekeeping. If that CPU goes out for lunch then all other CPUs
+	 * will operate on stale jiffies until it decides to come back.
 	 */
-	delta = ktime_sub(now, READ_ONCE(last_jiffies_update));
-	if (delta < tick_period)
+	if (ktime_before(now, READ_ONCE(tick_next_period)))
 		return;
 
 	/* Reevaluate with jiffies_lock held */
@@ -74,9 +92,8 @@ static void tick_do_update_jiffies64(ktime_t now)
 	if (delta >= tick_period) {
 
 		delta = ktime_sub(delta, tick_period);
-		/* Pairs with the lockless read in this function. */
-		WRITE_ONCE(last_jiffies_update,
-			   ktime_add(last_jiffies_update, tick_period));
+		last_jiffies_update = ktime_add(last_jiffies_update,
+						tick_period);
 
 		/* Slow path for long timeouts */
 		if (unlikely(delta >= tick_period)) {
@@ -84,15 +101,18 @@ static void tick_do_update_jiffies64(ktime_t now)
 
 			ticks = ktime_divns(delta, incr);
 
-			/* Pairs with the lockless read in this function. */
-			WRITE_ONCE(last_jiffies_update,
-				   ktime_add_ns(last_jiffies_update,
-						incr * ticks));
+			last_jiffies_update = ktime_add_ns(last_jiffies_update,
+							   incr * ticks);
 		}
 		do_timer(++ticks);
 
-		/* Keep the tick_next_period variable up to date */
-		tick_next_period = ktime_add(last_jiffies_update, tick_period);
+		/*
+		 * Keep the tick_next_period variable up to date.
+		 * WRITE_ONCE() pairs with the READ_ONCE() in the lockless
+		 * quick check above.
+		 */
+		WRITE_ONCE(tick_next_period,
+			   ktime_add(last_jiffies_update, tick_period));
 	} else {
 		write_seqcount_end(&jiffies_seq);
 		raw_spin_unlock(&jiffies_lock);


* [tip: timers/core] tick/broadcast: Serialize access to tick_next_period
  2020-11-17 13:19 ` [patch 1/7] tick/broadcast: Serialize access to tick_next_period Thomas Gleixner
@ 2020-11-19  9:55   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 15+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2020-11-19  9:55 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, x86, linux-kernel

The following commit has been merged into the timers/core branch of tip:

Commit-ID:     f73f64d5687192bc8eb7f3d9521ca6256b79f224
Gitweb:        https://git.kernel.org/tip/f73f64d5687192bc8eb7f3d9521ca6256b79f224
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Tue, 17 Nov 2020 14:19:43 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Thu, 19 Nov 2020 10:48:28 +01:00

tick/broadcast: Serialize access to tick_next_period

tick_broadcast_setup_oneshot() accesses tick_next_period twice without any
serialization. This is wrong in two aspects:

  - Reading it twice might make the broadcast data inconsistent if the
    variable is updated concurrently.

  - On 32bit systems the access might see a partial update

Protect it with jiffies_lock. That's safe as none of the callchains leading
up to this function can create a lock ordering violation:

timer interrupt
  run_local_timers()
    hrtimer_run_queues()
      hrtimer_switch_to_hres()
        tick_init_highres()
	  tick_switch_to_oneshot()
	    tick_broadcast_switch_to_oneshot()
or
     tick_check_oneshot_change()
       tick_nohz_switch_to_nohz()
         tick_switch_to_oneshot()
           tick_broadcast_switch_to_oneshot()

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20201117132006.061341507@linutronix.de

---
 kernel/time/tick-broadcast.c | 23 ++++++++++++++++++++---
 1 file changed, 20 insertions(+), 3 deletions(-)

diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c
index 36d7464..2a47c8f 100644
--- a/kernel/time/tick-broadcast.c
+++ b/kernel/time/tick-broadcast.c
@@ -877,6 +877,22 @@ static void tick_broadcast_init_next_event(struct cpumask *mask,
 	}
 }
 
+static inline ktime_t tick_get_next_period(void)
+{
+	ktime_t next;
+
+	/*
+	 * Protect against concurrent updates (store/load tearing on
+	 * 32bit). It does not matter if the time is already in the
+	 * past. The broadcast device which is about to be programmed will
+	 * fire in any case.
+	 */
+	raw_spin_lock(&jiffies_lock);
+	next = tick_next_period;
+	raw_spin_unlock(&jiffies_lock);
+	return next;
+}
+
 /**
  * tick_broadcast_setup_oneshot - setup the broadcast device
  */
@@ -905,10 +921,11 @@ static void tick_broadcast_setup_oneshot(struct clock_event_device *bc)
 			   tick_broadcast_oneshot_mask, tmpmask);
 
 		if (was_periodic && !cpumask_empty(tmpmask)) {
+			ktime_t nextevt = tick_get_next_period();
+
 			clockevents_switch_state(bc, CLOCK_EVT_STATE_ONESHOT);
-			tick_broadcast_init_next_event(tmpmask,
-						       tick_next_period);
-			tick_broadcast_set_event(bc, cpu, tick_next_period);
+			tick_broadcast_init_next_event(tmpmask, nextevt);
+			tick_broadcast_set_event(bc, cpu, nextevt);
 		} else
 			bc->next_event = KTIME_MAX;
 	} else {


* [tip: timers/core] tick: Document protections for tick related data
  2020-11-17 13:19 ` [patch 2/7] tick: Document protections for tick related data Thomas Gleixner
@ 2020-11-19  9:55   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 15+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2020-11-19  9:55 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, x86, linux-kernel

The following commit has been merged into the timers/core branch of tip:

Commit-ID:     c398960cd82b233886fbff163986f998b5a5c008
Gitweb:        https://git.kernel.org/tip/c398960cd82b233886fbff163986f998b5a5c008
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Tue, 17 Nov 2020 14:19:44 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Thu, 19 Nov 2020 10:48:28 +01:00

tick: Document protections for tick related data

The protection rules for tick_next_period and last_jiffies_update are blurry
at best. Clarify this.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20201117132006.197713794@linutronix.de

---
 kernel/time/tick-common.c | 4 +++-
 kernel/time/tick-sched.c  | 4 +++-
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/kernel/time/tick-common.c b/kernel/time/tick-common.c
index 6c9c342..68504eb 100644
--- a/kernel/time/tick-common.c
+++ b/kernel/time/tick-common.c
@@ -27,7 +27,9 @@
  */
 DEFINE_PER_CPU(struct tick_device, tick_cpu_device);
 /*
- * Tick next event: keeps track of the tick time
+ * Tick next event: keeps track of the tick time. It's updated by the
+ * CPU which handles the tick and protected by jiffies_lock. There is
+ * no requirement to write hold the jiffies seqcount for it.
  */
 ktime_t tick_next_period;
 ktime_t tick_period;
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 81632cd..15360e6 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -44,7 +44,9 @@ struct tick_sched *tick_get_tick_sched(int cpu)
 
 #if defined(CONFIG_NO_HZ_COMMON) || defined(CONFIG_HIGH_RES_TIMERS)
 /*
- * The time, when the last jiffy update happened. Protected by jiffies_lock.
+ * The time, when the last jiffy update happened. Write access must hold
+ * jiffies_lock and jiffies_seq. tick_nohz_next_event() needs to get a
+ * consistent view of jiffies and last_jiffies_update.
  */
 static ktime_t last_jiffies_update;
 


end of thread, newest: 2020-11-19  9:55 UTC

Thread overview: 15+ messages
2020-11-17 13:19 [patch 0/7] tick: Cleanups and reduce jiffies_seq held times Thomas Gleixner
2020-11-17 13:19 ` [patch 1/7] tick/broadcast: Serialize access to tick_next_period Thomas Gleixner
2020-11-19  9:55   ` [tip: timers/core] " tip-bot2 for Thomas Gleixner
2020-11-17 13:19 ` [patch 2/7] tick: Document protections for tick related data Thomas Gleixner
2020-11-19  9:55   ` [tip: timers/core] " tip-bot2 for Thomas Gleixner
2020-11-17 13:19 ` [patch 3/7] tick/sched: Use tick_next_period for lockless quick check Thomas Gleixner
2020-11-19  9:55   ` [tip: timers/core] " tip-bot2 for Thomas Gleixner
2020-11-17 13:19 ` [patch 4/7] tick/sched: Reduce seqcount held scope in tick_do_update_jiffies64() Thomas Gleixner
2020-11-19  9:55   ` [tip: timers/core] " tip-bot2 for Yunfeng Ye
2020-11-17 13:19 ` [patch 5/7] tick/sched: Optimize tick_do_update_jiffies64() further Thomas Gleixner
2020-11-19  9:55   ` [tip: timers/core] " tip-bot2 for Thomas Gleixner
2020-11-17 13:19 ` [patch 6/7] tick/sched: Release seqcount before invoking calc_load_global() Thomas Gleixner
2020-11-19  9:55   ` [tip: timers/core] " tip-bot2 for Yunfeng Ye
2020-11-17 13:19 ` [patch 7/7] tick: Get rid of tick_period Thomas Gleixner
2020-11-19  9:55   ` [tip: timers/core] " tip-bot2 for Thomas Gleixner
