* [PATCH v2] timers: Use only bucket expiry for base->next_expiry value
@ 2020-07-14  7:29 Anna-Maria Behnsen
  2020-07-15 12:40 ` Frederic Weisbecker
  2020-07-15 22:16 ` [PATCH] timer: Preserve higher bits of expiration on index calculation Frederic Weisbecker
  0 siblings, 2 replies; 4+ messages in thread
From: Anna-Maria Behnsen @ 2020-07-14  7:29 UTC (permalink / raw)
  To: linux-kernel
  Cc: Anna-Maria Behnsen, Frederic Weisbecker, Peter Zijlstra,
	Juri Lelli, Thomas Gleixner

The bucket expiry time is the effective expiry time of timers and is
greater than or equal to the requested timer expiry time. This is due
to the guarantee that timers never expire early and the reduced expiry
granularity in the secondary wheel levels.

When a timer is enqueued, trigger_dyntick_cpu() checks whether the
timer is the new first timer. This check compares next_expiry with
the requested timer expiry value and not with the effective expiry
value of the bucket into which the timer was queued.

Storing the requested timer expiry value in base->next_expiry can lead
to base->clk going backwards if the requested timer expiry value is
smaller than base->clk. Commit 30c66fc30ee7 ("timer: Prevent base->clk
from moving backward") worked around this by preventing the store when
timer->expires is before base->clk, but did not fix the underlying
problem.

Use the expiry value of the bucket into which the timer is queued to
do the new first timer check. This fixes the base->clk going backward
problem.

The workaround of commit 30c66fc30ee7 ("timer: Prevent base->clk from
moving backward") in trigger_dyntick_cpu() is no longer necessary as the
timer's bucket expiry is guaranteed to be greater than or equal to
base->clk.

Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Cc: Frederic Weisbecker <frederic@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
---

The patch applies on top of tip: timers/urgent
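
As an illustration of the first-timer check described in the changelog,
here is a minimal userspace sketch in C. It is not part of the patch. The
wheel constants are assumed to match kernel/time/timer.c, and
bucket_expiry_of(), after_eq(), is_new_first_timer() and
check_never_early() are hypothetical helper names modelled on
calc_index() and time_after_eq().

```c
#include <assert.h>

/* Assumed wheel geometry, modelled on kernel/time/timer.c (not in the diff). */
#define LVL_CLK_SHIFT	3
#define LVL_SHIFT(n)	((n) * LVL_CLK_SHIFT)
#define LVL_GRAN(n)	(1UL << LVL_SHIFT(n))

/* Hypothetical helper: effective expiry of the bucket used at level lvl. */
static inline unsigned long bucket_expiry_of(unsigned long expires,
					     unsigned int lvl)
{
	/* Round up to the next granule so the timer never fires early. */
	expires = (expires + LVL_GRAN(lvl)) >> LVL_SHIFT(lvl);
	return expires << LVL_SHIFT(lvl);
}

/* Wrap-safe "a >= b", in the style of the kernel's time_after_eq(). */
static inline int after_eq(unsigned long a, unsigned long b)
{
	return (long)(a - b) >= 0;
}

/* The "new first timer" test, fed with an effective expiry value. */
static inline int is_new_first_timer(unsigned long effective,
				     unsigned long next_expiry)
{
	return !after_eq(effective, next_expiry);
}

/* Self-check: the bucket never expires before the requested time. */
static inline void check_never_early(unsigned long expires, unsigned int lvl)
{
	assert(after_eq(bucket_expiry_of(expires, lvl), expires));
}
```

With next_expiry = 1006 and a timer requested for 1003 that is assumed to
be queued in level 1, is_new_first_timer(1003, 1006) reports a new first
timer, while is_new_first_timer(bucket_expiry_of(1003, 1), 1006) does not,
because the bucket only expires at 1008.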

 kernel/time/timer.c | 64 ++++++++++++++++++++++++---------------------
 1 file changed, 34 insertions(+), 30 deletions(-)

diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index 9a838d38dbe6..f29e84c0b9fc 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -487,35 +487,39 @@ static inline void timer_set_idx(struct timer_list *timer, unsigned int idx)
  * Helper function to calculate the array index for a given expiry
  * time.
  */
-static inline unsigned calc_index(unsigned expires, unsigned lvl)
+static inline unsigned calc_index(unsigned expires, unsigned lvl,
+				  unsigned long *bucket_expiry)
 {
 	expires = (expires + LVL_GRAN(lvl)) >> LVL_SHIFT(lvl);
+	*bucket_expiry = expires << LVL_SHIFT(lvl);
 	return LVL_OFFS(lvl) + (expires & LVL_MASK);
 }
 
-static int calc_wheel_index(unsigned long expires, unsigned long clk)
+static int calc_wheel_index(unsigned long expires, unsigned long clk,
+			    unsigned long *bucket_expiry)
 {
 	unsigned long delta = expires - clk;
 	unsigned int idx;
 
 	if (delta < LVL_START(1)) {
-		idx = calc_index(expires, 0);
+		idx = calc_index(expires, 0, bucket_expiry);
 	} else if (delta < LVL_START(2)) {
-		idx = calc_index(expires, 1);
+		idx = calc_index(expires, 1, bucket_expiry);
 	} else if (delta < LVL_START(3)) {
-		idx = calc_index(expires, 2);
+		idx = calc_index(expires, 2, bucket_expiry);
 	} else if (delta < LVL_START(4)) {
-		idx = calc_index(expires, 3);
+		idx = calc_index(expires, 3, bucket_expiry);
 	} else if (delta < LVL_START(5)) {
-		idx = calc_index(expires, 4);
+		idx = calc_index(expires, 4, bucket_expiry);
 	} else if (delta < LVL_START(6)) {
-		idx = calc_index(expires, 5);
+		idx = calc_index(expires, 5, bucket_expiry);
 	} else if (delta < LVL_START(7)) {
-		idx = calc_index(expires, 6);
+		idx = calc_index(expires, 6, bucket_expiry);
 	} else if (LVL_DEPTH > 8 && delta < LVL_START(8)) {
-		idx = calc_index(expires, 7);
+		idx = calc_index(expires, 7, bucket_expiry);
 	} else if ((long) delta < 0) {
 		idx = clk & LVL_MASK;
+		*bucket_expiry = clk;
 	} else {
 		/*
 		 * Force expire obscene large timeouts to expire at the
@@ -524,7 +528,7 @@ static int calc_wheel_index(unsigned long expires, unsigned long clk)
 		if (expires >= WHEEL_TIMEOUT_CUTOFF)
 			expires = WHEEL_TIMEOUT_MAX;
 
-		idx = calc_index(expires, LVL_DEPTH - 1);
+		idx = calc_index(expires, LVL_DEPTH - 1, bucket_expiry);
 	}
 	return idx;
 }
@@ -544,16 +548,18 @@ static void enqueue_timer(struct timer_base *base, struct timer_list *timer,
 }
 
 static void
-__internal_add_timer(struct timer_base *base, struct timer_list *timer)
+__internal_add_timer(struct timer_base *base, struct timer_list *timer,
+		     unsigned long *bucket_expiry)
 {
 	unsigned int idx;
 
-	idx = calc_wheel_index(timer->expires, base->clk);
+	idx = calc_wheel_index(timer->expires, base->clk, bucket_expiry);
 	enqueue_timer(base, timer, idx);
 }
 
 static void
-trigger_dyntick_cpu(struct timer_base *base, struct timer_list *timer)
+trigger_dyntick_cpu(struct timer_base *base, struct timer_list *timer,
+		    unsigned long bucket_expiry)
 {
 	if (!is_timers_nohz_active())
 		return;
@@ -576,31 +582,29 @@ trigger_dyntick_cpu(struct timer_base *base, struct timer_list *timer)
 	if (!base->is_idle)
 		return;
 
-	/* Check whether this is the new first expiring timer: */
-	if (time_after_eq(timer->expires, base->next_expiry))
+	/*
+	 * Check whether this is the new first expiring timer. The
+	 * effective expiry time of the timer is required here
+	 * (bucket_expiry) instead of timer->expires.
+	 */
+	if (time_after_eq(bucket_expiry, base->next_expiry))
 		return;
 
 	/*
 	 * Set the next expiry time and kick the CPU so it can reevaluate the
 	 * wheel:
 	 */
-	if (time_before(timer->expires, base->clk)) {
-		/*
-		 * Prevent from forward_timer_base() moving the base->clk
-		 * backward
-		 */
-		base->next_expiry = base->clk;
-	} else {
-		base->next_expiry = timer->expires;
-	}
+	base->next_expiry = bucket_expiry;
 	wake_up_nohz_cpu(base->cpu);
 }
 
 static void
 internal_add_timer(struct timer_base *base, struct timer_list *timer)
 {
-	__internal_add_timer(base, timer);
-	trigger_dyntick_cpu(base, timer);
+	unsigned long bucket_expiry;
+
+	__internal_add_timer(base, timer, &bucket_expiry);
+	trigger_dyntick_cpu(base, timer, bucket_expiry);
 }
 
 #ifdef CONFIG_DEBUG_OBJECTS_TIMERS
@@ -959,9 +963,9 @@ static struct timer_base *lock_timer_base(struct timer_list *timer,
 static inline int
 __mod_timer(struct timer_list *timer, unsigned long expires, unsigned int options)
 {
+	unsigned long clk = 0, flags, bucket_expiry;
 	struct timer_base *base, *new_base;
 	unsigned int idx = UINT_MAX;
-	unsigned long clk = 0, flags;
 	int ret = 0;
 
 	BUG_ON(!timer->function);
@@ -1000,7 +1004,7 @@ __mod_timer(struct timer_list *timer, unsigned long expires, unsigned int option
 		}
 
 		clk = base->clk;
-		idx = calc_wheel_index(expires, clk);
+		idx = calc_wheel_index(expires, clk, &bucket_expiry);
 
 		/*
 		 * Retrieve and compare the array index of the pending
@@ -1059,7 +1063,7 @@ __mod_timer(struct timer_list *timer, unsigned long expires, unsigned int option
 	 */
 	if (idx != UINT_MAX && clk == base->clk) {
 		enqueue_timer(base, timer, idx);
-		trigger_dyntick_cpu(base, timer);
+		trigger_dyntick_cpu(base, timer, bucket_expiry);
 	} else {
 		internal_add_timer(base, timer);
 	}
-- 
2.20.1



* Re: [PATCH v2] timers: Use only bucket expiry for base->next_expiry value
  2020-07-14  7:29 [PATCH v2] timers: Use only bucket expiry for base->next_expiry value Anna-Maria Behnsen
@ 2020-07-15 12:40 ` Frederic Weisbecker
  2020-07-15 22:16 ` [PATCH] timer: Preserve higher bits of expiration on index calculation Frederic Weisbecker
  1 sibling, 0 replies; 4+ messages in thread
From: Frederic Weisbecker @ 2020-07-15 12:40 UTC (permalink / raw)
  To: Anna-Maria Behnsen
  Cc: linux-kernel, Peter Zijlstra, Juri Lelli, Thomas Gleixner

On Tue, Jul 14, 2020 at 09:29:24AM +0200, Anna-Maria Behnsen wrote:
> The bucket expiry time is the effective expiry time of timers and is
> greater than or equal to the requested timer expiry time. This is due
> to the guarantee that timers never expire early and the reduced expiry
> granularity in the secondary wheel levels.
> 
> When a timer is enqueued, trigger_dyntick_cpu() checks whether the
> timer is the new first timer. This check compares next_expiry with
> the requested timer expiry value and not with the effective expiry
> value of the bucket into which the timer was queued.
> 
> Storing the requested timer expiry value in base->next_expiry can lead
> to base->clk going backwards if the requested timer expiry value is
> smaller than base->clk. Commit 30c66fc30ee7 ("timer: Prevent base->clk
> from moving backward") worked around this by preventing the store when
> timer->expires is before base->clk, but did not fix the underlying
> problem.
> 
> Use the expiry value of the bucket into which the timer is queued to
> do the new first timer check. This fixes the base->clk going backward
> problem.
> 
> The workaround of commit 30c66fc30ee7 ("timer: Prevent base->clk from
> moving backward") in trigger_dyntick_cpu() is no longer necessary as the
> timer's bucket expiry is guaranteed to be greater than or equal to
> base->clk.
> 
> Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>

Reviewed-by: Frederic Weisbecker <frederic@kernel.org>

Thanks a lot!


* [PATCH] timer: Preserve higher bits of expiration on index calculation
  2020-07-14  7:29 [PATCH v2] timers: Use only bucket expiry for base->next_expiry value Anna-Maria Behnsen
  2020-07-15 12:40 ` Frederic Weisbecker
@ 2020-07-15 22:16 ` Frederic Weisbecker
  2020-07-16 14:24   ` Thomas Gleixner
  1 sibling, 1 reply; 4+ messages in thread
From: Frederic Weisbecker @ 2020-07-15 22:16 UTC (permalink / raw)
  To: Anna-Maria Behnsen
  Cc: linux-kernel, Peter Zijlstra, Juri Lelli, Thomas Gleixner

[-- Attachment #1: Type: text/plain, Size: 1753 bytes --]

On Tue, Jul 14, 2020 at 09:29:24AM +0200, Anna-Maria Behnsen wrote:
> The bucket expiry time is the effective expiry time of timers and is
> greater than or equal to the requested timer expiry time. This is due
> to the guarantee that timers never expire early and the reduced expiry
> granularity in the secondary wheel levels.
> 
> When a timer is enqueued, trigger_dyntick_cpu() checks whether the
> timer is the new first timer. This check compares next_expiry with
> the requested timer expiry value and not with the effective expiry
> value of the bucket into which the timer was queued.
> 
> Storing the requested timer expiry value in base->next_expiry can lead
> to base->clk going backwards if the requested timer expiry value is
> smaller than base->clk. Commit 30c66fc30ee7 ("timer: Prevent base->clk
> from moving backward") worked around this by preventing the store when
> timer->expires is before base->clk, but did not fix the underlying
> problem.
> 
> Use the expiry value of the bucket into which the timer is queued to
> do the new first timer check. This fixes the base->clk going backward
> problem.
> 
> The workaround of commit 30c66fc30ee7 ("timer: Prevent base->clk from
> moving backward") in trigger_dyntick_cpu() is no longer necessary as the
> timer's bucket expiry is guaranteed to be greater than or equal to
> base->clk.
> 
> Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
> Cc: Frederic Weisbecker <frederic@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Juri Lelli <juri.lelli@redhat.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>

Actually I triggered the nasty warning in forward_timer_base()
with base->next_expiry < base->clk.

You'll need to first apply the following before your patch:


[-- Attachment #2: 0001-timer-Preserve-higher-bits-of-expiration-on-index-ca.patch --]
[-- Type: text/x-diff, Size: 1654 bytes --]

From 9770c3de69749cc2b8d1f295cecf12848212360e Mon Sep 17 00:00:00 2001
From: Frederic Weisbecker <frederic@kernel.org>
Date: Wed, 15 Jul 2020 23:49:09 +0200
Subject: [PATCH] timer: Preserve higher bits of expiration on index
 calculation

The higher bits of the timer expiration are cropped while calling
calc_index() due to the implicit cast from unsigned long to unsigned int.

This loss shouldn't have consequences on the current code since all the
computation to calculate the index is done on the lower 32 bits.

However we are preparing to return the actual bucket expiration from
calc_index() in order to properly fix base->next_expiry updates.
Preserving the higher bits is a requirement to achieve that.

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Cc: Frederic Weisbecker <frederic@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Anna-Maria Behnsen <anna-maria@linutronix.de>
---
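
To make the truncation described above concrete, here is a minimal
userspace sketch in C. It is not part of the patch. It assumes a 64-bit
unsigned long (modelled with uint64_t) and a 32-bit unsigned int
(modelled with uint32_t). The wheel constants are assumed to match
kernel/time/timer.c, and bucket_expiry_narrow(), bucket_expiry_wide() and
demo_truncation() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed wheel geometry, modelled on kernel/time/timer.c. */
#define LVL_CLK_SHIFT	3
#define LVL_SHIFT(n)	((n) * LVL_CLK_SHIFT)
#define LVL_GRAN(n)	(UINT64_C(1) << LVL_SHIFT(n))

/* Models a calc_index() taking "unsigned expires": high bits are lost. */
static inline uint64_t bucket_expiry_narrow(uint64_t requested,
					    unsigned int lvl)
{
	/* Stands in for the implicit conversion at the call site. */
	uint32_t expires = (uint32_t)requested;

	expires = (uint32_t)((expires + LVL_GRAN(lvl)) >> LVL_SHIFT(lvl));
	/* The shift is done in 32 bits, then widened on return. */
	return expires << LVL_SHIFT(lvl);
}

/* Models a calc_index() taking "unsigned long expires". */
static inline uint64_t bucket_expiry_wide(uint64_t expires, unsigned int lvl)
{
	expires = (expires + LVL_GRAN(lvl)) >> LVL_SHIFT(lvl);
	return expires << LVL_SHIFT(lvl);
}

/* A requested expiry just above 2^32, assumed to be queued in level 1. */
static inline void demo_truncation(void)
{
	uint64_t requested = (UINT64_C(1) << 32) + 1003;

	/* Wide parameter: the bucket expiry stays above 2^32. */
	assert(bucket_expiry_wide(requested, 1) == (UINT64_C(1) << 32) + 1008);
	/* Narrow parameter: the bucket expiry collapses to the low 32 bits. */
	assert(bucket_expiry_narrow(requested, 1) == 1008);
}
```

The lower bits agree in both variants, which is why the index calculation
alone is unaffected. Only a caller that uses the returned bucket expiry,
for example to compare it against base->clk, would see the difference.
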
 kernel/time/timer.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index 9a838d38dbe6..95e0b66f81ec 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -487,7 +487,7 @@ static inline void timer_set_idx(struct timer_list *timer, unsigned int idx)
  * Helper function to calculate the array index for a given expiry
  * time.
  */
-static inline unsigned calc_index(unsigned expires, unsigned lvl)
+static inline unsigned calc_index(unsigned long expires, unsigned lvl)
 {
 	expires = (expires + LVL_GRAN(lvl)) >> LVL_SHIFT(lvl);
 	return LVL_OFFS(lvl) + (expires & LVL_MASK);
-- 
2.26.2



* Re: [PATCH] timer: Preserve higher bits of expiration on index calculation
  2020-07-15 22:16 ` [PATCH] timer: Preserve higher bits of expiration on index calculation Frederic Weisbecker
@ 2020-07-16 14:24   ` Thomas Gleixner
  0 siblings, 0 replies; 4+ messages in thread
From: Thomas Gleixner @ 2020-07-16 14:24 UTC (permalink / raw)
  To: Frederic Weisbecker, Anna-Maria Behnsen
  Cc: linux-kernel, Peter Zijlstra, Juri Lelli

Frederic Weisbecker <frederic@kernel.org> writes:
> Subject: [PATCH] timer: Preserve higher bits of expiration on index
>  calculation
>
> The higher bits of the timer expiration are cropped while calling
> calc_index() due to the implicit cast from unsigned long to unsigned int.
>
> This loss shouldn't have consequences on the current code since all the
> computation to calculate the index is done on the lower 32 bits.
>
> However we are preparing to return the actual bucket expiration from
> calc_index() in order to properly fix base->next_expiry updates.
> Preserving the higher bits is a requirement to achieve that.

Nice catch!

