linux-kernel.vger.kernel.org archive mirror
* [PATCH V2 0/2] hrtimer: Iterate only over active clock-bases
@ 2015-04-07  2:10 Viresh Kumar
  2015-04-07  2:10 ` [PATCH V2 1/2] hrtimer: update '->active_bases' before calling hrtimer_force_reprogram() Viresh Kumar
  2015-04-07  2:10 ` [PATCH V2 2/2] hrtimer: Iterate only over active clock-bases Viresh Kumar
  0 siblings, 2 replies; 123+ messages in thread
From: Viresh Kumar @ 2015-04-07  2:10 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Peter Zijlstra
  Cc: linaro-kernel, linux-kernel, Preeti U Murthy, Viresh Kumar

Hi,

'active_bases' indicates which clock-bases have active timers. While it
is updated (almost) correctly, it is hardly used.

And so this is an attempt to improve the code that iterates over all
clock-bases.

The first patch fixes an issue that would turn into a bug once the second
patch is applied, and the second patch creates a macro,
for_each_active_base(), and uses it in multiple places.

V1->V2:
- Dropped ffs() and wrote own routine __next_bit().

Viresh Kumar (2):
  hrtimer: update '->active_bases' before calling
    hrtimer_force_reprogram()
  hrtimer: create for_each_active_base() to iterate over active
    clock-bases

 kernel/time/hrtimer.c | 70 ++++++++++++++++++++++++++++++++-------------------
 1 file changed, 44 insertions(+), 26 deletions(-)

-- 
2.3.0.rc0.44.ga94655d



* [PATCH V2 1/2] hrtimer: update '->active_bases' before calling hrtimer_force_reprogram()
  2015-04-07  2:10 [PATCH V2 0/2] hrtimer: Iterate only over active clock-bases Viresh Kumar
@ 2015-04-07  2:10 ` Viresh Kumar
  2015-04-22 19:04   ` [tip:timers/core] hrtimer: Update active_bases " tip-bot for Viresh Kumar
  2015-04-07  2:10 ` [PATCH V2 2/2] hrtimer: Iterate only over active clock-bases Viresh Kumar
  1 sibling, 1 reply; 123+ messages in thread
From: Viresh Kumar @ 2015-04-07  2:10 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Peter Zijlstra
  Cc: linaro-kernel, linux-kernel, Preeti U Murthy, Viresh Kumar

'active_bases' indicates which clock-bases have active timers. While it
is updated correctly, it is hardly used. The next commit will start using
it to make the code more efficient, but before that we need to fix a
problem.

While removing hrtimers, in __remove_hrtimer():
- We first remove the hrtimer from the queue.
- Then reprogram the clockevent device if required
  (hrtimer_force_reprogram()).
- And then finally clear 'active_bases', if no more timers are pending
  on the current clock base (from which we are removing the hrtimer).

hrtimer_force_reprogram() needs to loop over all active clock bases to
find the next expiry event, and while doing so it will use
'active_bases' (after next commit). And it will find the current base
active, as we haven't cleared it until now, even if current clock base
has no more hrtimers queued.

The next commit will skip validating what timerqueue_getnext() returns,
as that is guaranteed to be valid for an active base, and the problem
described above will then result in a crash (because
timerqueue_getnext() will return NULL for the current clock base).

So, fix this issue by clearing active_bases before calling
hrtimer_force_reprogram().
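
The ordering problem described above can be illustrated with a small
user-space model. This is only a sketch under stated assumptions: the
model_* names and MAX_BASES are hypothetical stand-ins and not kernel
symbols, the "queue" is reduced to a pending counter, and the reprogram
step is run on every removal for simplicity. The model counts how often
the reprogram step would see a base that is marked active but is
already empty, which is the case that would dereference NULL in the
kernel after the next commit.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_BASES 4	/* hypothetical stand-in for HRTIMER_MAX_CLOCK_BASES */

struct model_base {
	int pending;			/* number of queued timers on this base */
};

struct model_cpu_base {
	unsigned int active_bases;	/* bit i set => clock_base[i] has timers */
	struct model_base clock_base[MAX_BASES];
};

/*
 * Stand-in for a reprogram step that trusts active_bases. It returns the
 * number of bases that are marked active but have nothing queued.
 */
static int model_reprogram(const struct model_cpu_base *cpu_base)
{
	int i, stale = 0;

	for (i = 0; i < MAX_BASES; i++) {
		if (!(cpu_base->active_bases & (1U << i)))
			continue;
		if (cpu_base->clock_base[i].pending == 0)
			stale++;
	}
	return stale;
}

/*
 * Remove one timer from base 'index'. With clear_first != 0 the bit is
 * cleared before the reprogram step (the ordering this patch introduces),
 * otherwise after it (the old ordering).
 */
static int model_remove(struct model_cpu_base *cpu_base, int index,
			int clear_first)
{
	struct model_base *base = &cpu_base->clock_base[index];
	int stale;

	assert(base->pending > 0);
	base->pending--;

	if (clear_first && base->pending == 0)
		cpu_base->active_bases &= ~(1U << index);

	stale = model_reprogram(cpu_base);

	if (!clear_first && base->pending == 0)
		cpu_base->active_bases &= ~(1U << index);

	return stale;
}
```

In this model, removing the last timer of a base with the old ordering
makes model_reprogram() report one stale base, while the new ordering
reports none. Both orderings leave the same final active_bases value.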

Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 kernel/time/hrtimer.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
index bee0c1f78091..3152f327c988 100644
--- a/kernel/time/hrtimer.c
+++ b/kernel/time/hrtimer.c
@@ -879,6 +879,9 @@ static void __remove_hrtimer(struct hrtimer *timer,
 
 	next_timer = timerqueue_getnext(&base->active);
 	timerqueue_del(&base->active, &timer->node);
+	if (!timerqueue_getnext(&base->active))
+		base->cpu_base->active_bases &= ~(1 << base->index);
+
 	if (&timer->node == next_timer) {
 #ifdef CONFIG_HIGH_RES_TIMERS
 		/* Reprogram the clock event device. if enabled */
@@ -892,8 +895,6 @@ static void __remove_hrtimer(struct hrtimer *timer,
 		}
 #endif
 	}
-	if (!timerqueue_getnext(&base->active))
-		base->cpu_base->active_bases &= ~(1 << base->index);
 out:
 	timer->state = newstate;
 }
-- 
2.3.0.rc0.44.ga94655d



* [PATCH V2 2/2] hrtimer: Iterate only over active clock-bases
  2015-04-07  2:10 [PATCH V2 0/2] hrtimer: Iterate only over active clock-bases Viresh Kumar
  2015-04-07  2:10 ` [PATCH V2 1/2] hrtimer: update '->active_bases' before calling hrtimer_force_reprogram() Viresh Kumar
@ 2015-04-07  2:10 ` Viresh Kumar
  2015-04-08 12:10   ` Peter Zijlstra
  2015-04-08 20:11   ` Thomas Gleixner
  1 sibling, 2 replies; 123+ messages in thread
From: Viresh Kumar @ 2015-04-07  2:10 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Peter Zijlstra
  Cc: linaro-kernel, linux-kernel, Preeti U Murthy, Viresh Kumar

In several places we iterate over all possible clock-bases for a
particular cpu-base, whereas we only need to iterate over active bases.

We already have a per cpu-base 'active_bases' field, which is updated on
addition/removal of hrtimers.

This patch creates for_each_active_base(), which uses 'active_bases' to
iterate only over active bases.

This also updates code which iterates over clock-bases.
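
The idea of visiting only the set bits of 'active_bases' can be
sketched in user space as follows. This is not the macro from this
patch: next_active(), collect_active() and MAX_BASES are hypothetical
names used only for illustration, and the helper returns -1 when no
further bit is set instead of relying on the caller never asking.

```c
#include <assert.h>

#define MAX_BASES 4	/* hypothetical stand-in for HRTIMER_MAX_CLOCK_BASES */

/* Return the index of the lowest set bit at or above 'bit', or -1. */
static int next_active(unsigned int mask, int bit)
{
	for (; bit < MAX_BASES; bit++) {
		if (mask & (1U << bit))
			return bit;
	}
	return -1;
}

/*
 * Visit every active base in ascending order and record its index in
 * 'out', which must have room for MAX_BASES entries. Returns the number
 * of active bases found.
 */
static int collect_active(unsigned int active_bases, int *out)
{
	int n = 0;
	int bit;

	assert(out);

	for (bit = next_active(active_bases, 0); bit >= 0;
	     bit = next_active(active_bases, bit + 1))
		out[n++] = bit;

	return n;
}
```

For a mask with bits 1 and 3 set, collect_active() stores 1 and 3 and
returns 2. For an empty mask the loop body never runs.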

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 kernel/time/hrtimer.c | 65 ++++++++++++++++++++++++++++++++-------------------
 1 file changed, 41 insertions(+), 24 deletions(-)

diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
index 3152f327c988..9da63e9ee63b 100644
--- a/kernel/time/hrtimer.c
+++ b/kernel/time/hrtimer.c
@@ -110,6 +110,31 @@ static inline int hrtimer_clockid_to_base(clockid_t clock_id)
 }
 
 
+static inline int __next_bit(unsigned int active_bases, int bit)
+{
+	do {
+		if (active_bases & (1 << bit))
+			return bit;
+	} while (++bit < HRTIMER_MAX_CLOCK_BASES);
+
+	/* We should never reach here */
+	return 0;
+}
+
+/*
+ * for_each_active_base: iterate over all active clock bases
+ * @_bit: 'int' variable for internal purpose
+ * @_base: holds pointer to an active clock base
+ * @_cpu_base: cpu base to iterate on
+ * @_active_bases: 'unsigned int' variable for internal purpose
+ */
+#define for_each_active_base(_bit, _base, _cpu_base, _active_bases)	\
+	for ((_active_bases) = (_cpu_base)->active_bases, (_bit) = -1;	\
+		(_active_bases) &&					\
+		((_bit) = __next_bit(_active_bases, ++_bit),		\
+		(_base) = (_cpu_base)->clock_base + _bit);		\
+		(_active_bases) &= ~(1 << (_bit)))
+
 /*
  * Get the coarse grained time at the softirq based on xtime and
  * wall_to_monotonic.
@@ -443,19 +468,15 @@ static inline void debug_deactivate(struct hrtimer *timer)
 #if defined(CONFIG_NO_HZ_COMMON) || defined(CONFIG_HIGH_RES_TIMERS)
 static ktime_t __hrtimer_get_next_event(struct hrtimer_cpu_base *cpu_base)
 {
-	struct hrtimer_clock_base *base = cpu_base->clock_base;
+	struct hrtimer_clock_base *base;
 	ktime_t expires, expires_next = { .tv64 = KTIME_MAX };
+	struct hrtimer *timer;
+	unsigned int active_bases;
 	int i;
 
-	for (i = 0; i < HRTIMER_MAX_CLOCK_BASES; i++, base++) {
-		struct timerqueue_node *next;
-		struct hrtimer *timer;
-
-		next = timerqueue_getnext(&base->active);
-		if (!next)
-			continue;
-
-		timer = container_of(next, struct hrtimer, node);
+	for_each_active_base(i, base, cpu_base, active_bases) {
+		timer = container_of(timerqueue_getnext(&base->active),
+				     struct hrtimer, node);
 		expires = ktime_sub(hrtimer_get_expires(timer), base->offset);
 		if (expires.tv64 < expires_next.tv64)
 			expires_next = expires;
@@ -1245,6 +1266,8 @@ void hrtimer_interrupt(struct clock_event_device *dev)
 {
 	struct hrtimer_cpu_base *cpu_base = this_cpu_ptr(&hrtimer_bases);
 	ktime_t expires_next, now, entry_time, delta;
+	struct hrtimer_clock_base *base;
+	unsigned int active_bases;
 	int i, retries = 0;
 
 	BUG_ON(!cpu_base->hres_active);
@@ -1264,15 +1287,10 @@ void hrtimer_interrupt(struct clock_event_device *dev)
 	 */
 	cpu_base->expires_next.tv64 = KTIME_MAX;
 
-	for (i = 0; i < HRTIMER_MAX_CLOCK_BASES; i++) {
-		struct hrtimer_clock_base *base;
+	for_each_active_base(i, base, cpu_base, active_bases) {
 		struct timerqueue_node *node;
 		ktime_t basenow;
 
-		if (!(cpu_base->active_bases & (1 << i)))
-			continue;
-
-		base = cpu_base->clock_base + i;
 		basenow = ktime_add(now, base->offset);
 
 		while ((node = timerqueue_getnext(&base->active))) {
@@ -1435,16 +1453,13 @@ void hrtimer_run_queues(void)
 	struct timerqueue_node *node;
 	struct hrtimer_cpu_base *cpu_base = this_cpu_ptr(&hrtimer_bases);
 	struct hrtimer_clock_base *base;
+	unsigned int active_bases;
 	int index, gettime = 1;
 
 	if (hrtimer_hres_active())
 		return;
 
-	for (index = 0; index < HRTIMER_MAX_CLOCK_BASES; index++) {
-		base = &cpu_base->clock_base[index];
-		if (!timerqueue_getnext(&base->active))
-			continue;
-
+	for_each_active_base(index, base, cpu_base, active_bases) {
 		if (gettime) {
 			hrtimer_get_softirq_time(cpu_base);
 			gettime = 0;
@@ -1665,6 +1680,8 @@ static void migrate_hrtimer_list(struct hrtimer_clock_base *old_base,
 static void migrate_hrtimers(int scpu)
 {
 	struct hrtimer_cpu_base *old_base, *new_base;
+	struct hrtimer_clock_base *clock_base;
+	unsigned int active_bases;
 	int i;
 
 	BUG_ON(cpu_online(scpu));
@@ -1680,9 +1697,9 @@ static void migrate_hrtimers(int scpu)
 	raw_spin_lock(&new_base->lock);
 	raw_spin_lock_nested(&old_base->lock, SINGLE_DEPTH_NESTING);
 
-	for (i = 0; i < HRTIMER_MAX_CLOCK_BASES; i++) {
-		migrate_hrtimer_list(&old_base->clock_base[i],
-				     &new_base->clock_base[i]);
+	for_each_active_base(i, clock_base, old_base, active_bases) {
+		migrate_hrtimer_list(clock_base,
+				     &new_base->clock_base[clock_base->index]);
 	}
 
 	raw_spin_unlock(&old_base->lock);
-- 
2.3.0.rc0.44.ga94655d



* Re: [PATCH V2 2/2] hrtimer: Iterate only over active clock-bases
  2015-04-07  2:10 ` [PATCH V2 2/2] hrtimer: Iterate only over active clock-bases Viresh Kumar
@ 2015-04-08 12:10   ` Peter Zijlstra
  2015-04-08 20:11   ` Thomas Gleixner
  1 sibling, 0 replies; 123+ messages in thread
From: Peter Zijlstra @ 2015-04-08 12:10 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Thomas Gleixner, Ingo Molnar, linaro-kernel, linux-kernel,
	Preeti U Murthy

On Tue, Apr 07, 2015 at 07:40:53AM +0530, Viresh Kumar wrote:
> At several instances we iterate over all possible clock-bases for a
> particular cpu-base. Whereas, we only need to iterate over active bases.
> 
> We already have per cpu-base 'active_bases' field, which is updated on
> addition/removal of hrtimers.
> 
> This patch creates for_each_active_base(), which uses 'active_bases' to
> iterate only over active bases.
> 
> This also updates code which iterates over clock-bases.

This Changelog is very thin on compelling reasons to do this. Not to
mention you did no analysis on the code generated.



* Re: [PATCH V2 2/2] hrtimer: Iterate only over active clock-bases
  2015-04-07  2:10 ` [PATCH V2 2/2] hrtimer: Iterate only over active clock-bases Viresh Kumar
  2015-04-08 12:10   ` Peter Zijlstra
@ 2015-04-08 20:11   ` Thomas Gleixner
  2015-04-09  2:42     ` Viresh Kumar
  2015-04-09  6:28     ` [PATCH] hrtimer: Replace cpu_base->active_bases with a direct check of the active list Ingo Molnar
  1 sibling, 2 replies; 123+ messages in thread
From: Thomas Gleixner @ 2015-04-08 20:11 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Ingo Molnar, Peter Zijlstra, linaro-kernel, linux-kernel,
	Preeti U Murthy

On Tue, 7 Apr 2015, Viresh Kumar wrote:

> At several instances we iterate over all possible clock-bases for a
> particular cpu-base. Whereas, we only need to iterate over active bases.
> 
> We already have per cpu-base 'active_bases' field, which is updated on
> addition/removal of hrtimers.
> 
> This patch creates for_each_active_base(), which uses 'active_bases' to
> iterate only over active bases.

I'm really not too excited about this incomprehensible macro mess and
especially not about the code it generates.

		x86_64	i386	ARM	power

Mainline	7668	6942	8077	10253

+ Patch		8068	7294	8313	10861

		+400	+352	+236	 +608

That's insane.

What's wrong with just adding 

	if (!(cpu_base->active_bases & (1 << i)))
		continue;

to the iterating sites?

Index: linux/kernel/time/hrtimer.c
===================================================================
--- linux.orig/kernel/time/hrtimer.c
+++ linux/kernel/time/hrtimer.c
@@ -451,6 +451,9 @@ static ktime_t __hrtimer_get_next_event(
 		struct timerqueue_node *next;
 		struct hrtimer *timer;
 
+		if (!(cpu_base->active_bases & (1 << i)))
+			continue;
+
 		next = timerqueue_getnext(&base->active);
 		if (!next)
 			continue;
@@ -1441,6 +1444,9 @@ void hrtimer_run_queues(void)
 		return;
 
 	for (index = 0; index < HRTIMER_MAX_CLOCK_BASES; index++) {
+		if (!(cpu_base->active_bases & (1 << index)))
+			continue;
+
 		base = &cpu_base->clock_base[index];
 		if (!timerqueue_getnext(&base->active))
 			continue;
@@ -1681,6 +1687,8 @@ static void migrate_hrtimers(int scpu)
 	raw_spin_lock_nested(&old_base->lock, SINGLE_DEPTH_NESTING);
 
 	for (i = 0; i < HRTIMER_MAX_CLOCK_BASES; i++) {
+		if (!(old_base->active_bases & (1 << i)))
+			continue;
 		migrate_hrtimer_list(&old_base->clock_base[i],
 				     &new_base->clock_base[i]);
 	}

Now the code size increase is in a sane range:

		x86_64	i386	ARM	power

Mainline	7668	6942	8077	10253

+ patch		7748	6990	8113	10365

		 +80	 +48	 +36	 +112 

So your variant is at least 5 times larger than the simple and obvious
solution.
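
The "at least 5 times" figure follows directly from the two tables
above. As a sketch, the arithmetic can be reproduced like this; the
array and function names are made up for illustration, and the numbers
are copied from the tables in the same architecture order, in whatever
size unit was reported there.

```c
#include <assert.h>

enum { X86_64, I386, ARM, POWER, NR_ARCH };

/* Object sizes as reported in the two tables above. */
static const int mainline[NR_ARCH]   = { 7668, 6942, 8077, 10253 };
static const int with_macro[NR_ARCH] = { 8068, 7294, 8313, 10861 };
static const int with_check[NR_ARCH] = { 7748, 6990, 8113, 10365 };

/* Growth relative to mainline for one architecture. */
static int growth(const int *patched, int arch)
{
	assert(arch >= 0 && arch < NR_ARCH);
	return patched[arch] - mainline[arch];
}

/* Growth of the macro variant divided by growth of the plain bit check. */
static double growth_ratio(int arch)
{
	return (double)growth(with_macro, arch) / growth(with_check, arch);
}
```

The ratio is exactly 5 on x86_64 (400 / 80) and larger on the other
three architectures, so 5 is the lower bound across the table.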

I told you before to look at the resulting binary code changes and
sanity check whether they are in the right ball park.

Thanks,

	tglx


* Re: [PATCH V2 2/2] hrtimer: Iterate only over active clock-bases
  2015-04-08 20:11   ` Thomas Gleixner
@ 2015-04-09  2:42     ` Viresh Kumar
  2015-04-09  6:28     ` [PATCH] hrtimer: Replace cpu_base->active_bases with a direct check of the active list Ingo Molnar
  1 sibling, 0 replies; 123+ messages in thread
From: Viresh Kumar @ 2015-04-09  2:42 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Ingo Molnar, Peter Zijlstra, Linaro Kernel Mailman List,
	Linux Kernel Mailing List, Preeti U Murthy

On 9 April 2015 at 01:41, Thomas Gleixner <tglx@linutronix.de> wrote:
> I'm really not too excited about this incomprehensible macro mess and
> especially not about the code it generates.
>
>                 x86_64  i386    ARM     power
>
> Mainline        7668    6942    8077    10253
>
> + Patch         8068    7294    8313    10861
>
>                 +400    +352    +236     +608
>
> That's insane.

After Peter's mail yesterday, I did check it on x86_64 and it surely
looked a lot bigger.

> What's wrong with just adding
>
>         if (!(cpu_base->active_bases & (1 << i)))
>                 continue;
>
> to the iterating sites?
>
> Index: linux/kernel/time/hrtimer.c
> ===================================================================
> --- linux.orig/kernel/time/hrtimer.c
> +++ linux/kernel/time/hrtimer.c
> @@ -451,6 +451,9 @@ static ktime_t __hrtimer_get_next_event(
>                 struct timerqueue_node *next;
>                 struct hrtimer *timer;
>
> +               if (!(cpu_base->active_bases & (1 << i)))
> +                       continue;
> +
>                 next = timerqueue_getnext(&base->active);
>                 if (!next)
>                         continue;

Isn't the check we already have here lightweight enough for this?
timerqueue_getnext() returns head->next.

What benefit are we getting from this extra check?

Maybe we can drop 'active_bases' from struct hrtimer_cpu_base ?

'active_bases' can be used effectively, if we can quit early from this
loop, i.e. by checking for !active_bases on every iteration.

But that generates a lot more code and is probably not that helpful
for the small loop size that we have here.
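
To make the early-exit idea concrete, here is a minimal user-space
sketch. It assumes a plain index loop that additionally stops once no
set bits remain in a local copy of the mask; the function name and
MAX_BASES are hypothetical and only serve the illustration.

```c
#include <assert.h>

#define MAX_BASES 4	/* hypothetical stand-in for HRTIMER_MAX_CLOCK_BASES */

/*
 * Count how many loop iterations are needed to visit all active bases
 * when the loop quits as soon as the remaining mask is empty.
 */
static int iterations_with_early_exit(unsigned int active_bases)
{
	unsigned int remaining = active_bases & ((1U << MAX_BASES) - 1);
	int i, iterations = 0;

	for (i = 0; i < MAX_BASES && remaining; i++) {
		iterations++;
		remaining &= ~(1U << i);	/* done with base i, active or not */
	}

	assert(iterations <= MAX_BASES);
	return iterations;
}
```

With only base 0 active the loop runs once, with no active base it does
not run at all, and with only the highest base active it still runs
MAX_BASES times. The saving is therefore bounded by the small, fixed
number of bases.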

--
viresh


* [PATCH] hrtimer: Replace cpu_base->active_bases with a direct check of the active list
  2015-04-08 20:11   ` Thomas Gleixner
  2015-04-09  2:42     ` Viresh Kumar
@ 2015-04-09  6:28     ` Ingo Molnar
  2015-04-09  6:38       ` Ingo Molnar
                         ` (2 more replies)
  1 sibling, 3 replies; 123+ messages in thread
From: Ingo Molnar @ 2015-04-09  6:28 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Viresh Kumar, Ingo Molnar, Peter Zijlstra, linaro-kernel,
	linux-kernel, Preeti U Murthy


* Thomas Gleixner <tglx@linutronix.de> wrote:

> On Tue, 7 Apr 2015, Viresh Kumar wrote:
> 
> > At several instances we iterate over all possible clock-bases for a
> > particular cpu-base. Whereas, we only need to iterate over active bases.
> > 
> > We already have per cpu-base 'active_bases' field, which is updated on
> > addition/removal of hrtimers.
> > 
> > This patch creates for_each_active_base(), which uses 'active_bases' to
> > iterate only over active bases.
> 
> I'm really not too excited about this incomprehensible macro mess and
> especially not about the code it generates.
> 
> 		x86_64	i386	ARM	power
> 
> Mainline	7668	6942	8077	10253
> 
> + Patch		8068	7294	8313	10861
> 
> 		+400	+352	+236	 +608
> 
> That's insane.
> 
> What's wrong with just adding 
> 
> 	if (!(cpu_base->active_bases & (1 << i)))
> 		continue;
> 
> to the iterating sites?
> 
> Index: linux/kernel/time/hrtimer.c
> ===================================================================
> --- linux.orig/kernel/time/hrtimer.c
> +++ linux/kernel/time/hrtimer.c
> @@ -451,6 +451,9 @@ static ktime_t __hrtimer_get_next_event(
>  		struct timerqueue_node *next;
>  		struct hrtimer *timer;
>  
> +		if (!(cpu_base->active_bases & (1 << i)))
> +			continue;
> +

Btw., does cpu_base->active_bases even make sense? hrtimer bases are 
fundamentally percpu, and to check whether there are any pending 
timers is a very simple check:

	base->active->next != NULL

So I'd rather suggest taking a direct look at the head, instead of 
calculating bit positions, masks, etc.

Furthermore, we never actually use cpu_base->active_bases as a 
'summary' value (which is the main point of bitmasks in general), so 
I'd remove that complication altogether.

This would speed up various hrtimer primitives like 
hrtimer_remove()/add and simplify the code. It would be a net code 
shrink as well.

Totally untested patch below. It gives:

   text    data     bss     dec     hex filename
   7502     427       0    7929    1ef9 hrtimer.o.before
   7422     427       0    7849    1ea9 hrtimer.o.after

and half of that code removal is from hot paths.

This would simplify the followup step of skipping over inactive bases 
as well.

Thanks,

	Ingo

 include/linux/hrtimer.h | 2 --
 kernel/time/hrtimer.c   | 7 ++-----
 2 files changed, 2 insertions(+), 7 deletions(-)

diff --git a/include/linux/hrtimer.h b/include/linux/hrtimer.h
index 05f6df1fdf5b..d65b858a94c1 100644
--- a/include/linux/hrtimer.h
+++ b/include/linux/hrtimer.h
@@ -166,7 +166,6 @@ enum  hrtimer_base_type {
  * @lock:		lock protecting the base and associated clock bases
  *			and timers
  * @cpu:		cpu number
- * @active_bases:	Bitfield to mark bases with active timers
  * @clock_was_set:	Indicates that clock was set from irq context.
  * @expires_next:	absolute time of the next event which was scheduled
  *			via clock_set_next_event()
@@ -182,7 +181,6 @@ enum  hrtimer_base_type {
 struct hrtimer_cpu_base {
 	raw_spinlock_t			lock;
 	unsigned int			cpu;
-	unsigned int			active_bases;
 	unsigned int			clock_was_set;
 #ifdef CONFIG_HIGH_RES_TIMERS
 	ktime_t				expires_next;
diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
index 76d4bd962b19..63e21df6c086 100644
--- a/kernel/time/hrtimer.c
+++ b/kernel/time/hrtimer.c
@@ -848,7 +848,6 @@ static int enqueue_hrtimer(struct hrtimer *timer,
 	debug_activate(timer);
 
 	timerqueue_add(&base->active, &timer->node);
-	base->cpu_base->active_bases |= 1 << base->index;
 
 	/*
 	 * HRTIMER_STATE_ENQUEUED is or'ed to the current state to preserve the
@@ -892,8 +891,6 @@ static void __remove_hrtimer(struct hrtimer *timer,
 		}
 #endif
 	}
-	if (!timerqueue_getnext(&base->active))
-		base->cpu_base->active_bases &= ~(1 << base->index);
 out:
 	timer->state = newstate;
 }
@@ -1268,10 +1265,10 @@ void hrtimer_interrupt(struct clock_event_device *dev)
 		struct timerqueue_node *node;
 		ktime_t basenow;
 
-		if (!(cpu_base->active_bases & (1 << i)))
+		base = cpu_base->clock_base + i;
+		if (!base->active.next)
 			continue;
 
-		base = cpu_base->clock_base + i;
 		basenow = ktime_add(now, base->offset);
 
 		while ((node = timerqueue_getnext(&base->active))) {


* Re: [PATCH] hrtimer: Replace cpu_base->active_bases with a direct check of the active list
  2015-04-09  6:28     ` [PATCH] hrtimer: Replace cpu_base->active_bases with a direct check of the active list Ingo Molnar
@ 2015-04-09  6:38       ` Ingo Molnar
  2015-04-09  6:39         ` [PATCH] hrtimer: Only iterate over active bases in migrate_hrtimers() Ingo Molnar
                           ` (2 more replies)
  2015-04-09  6:57       ` [PATCH] hrtimer: Replace cpu_base->active_bases with a direct check of the active list Peter Zijlstra
  2015-04-09  8:53       ` Thomas Gleixner
  2 siblings, 3 replies; 123+ messages in thread
From: Ingo Molnar @ 2015-04-09  6:38 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Viresh Kumar, Ingo Molnar, Peter Zijlstra, linaro-kernel,
	linux-kernel, Preeti U Murthy


* Ingo Molnar <mingo@kernel.org> wrote:

> This would speed up various hrtimer primitives like 
> hrtimer_remove()/add and simplify the code. It would be a net code 
> shrink as well.
> 
> Totally untested patch below. It gives:
> 
>    text    data     bss     dec     hex filename
>    7502     427       0    7929    1ef9 hrtimer.o.before
>    7422     427       0    7849    1ea9 hrtimer.o.after
> 
> and half of that code removal is from hot paths.
> 
> This would simplify the followup step of skipping over inactive bases 
> as well.

The followup step is attached below (untested as well).

Note that all other iterations already had a check for active.next, so 
the patch doesn't even bloat anything:

   text    data     bss     dec     hex filename
   7422     427       0    7849    1ea9 hrtimer.o.before
   7422     427       0    7849    1ea9 hrtimer.o.after

(I did a rename within migrate_hrtimers() because it used 'cpu_base' 
vs 'clock_base' inconsistently in a confusing (to me) manner.)

I'd also suggest the removal of the timerqueue_getnext() obfuscation: 
it 'sounds' complex but in reality it's a simple dereference to 
active.next. I think this is what triggered this rather pointless 
maintenance of active_bases.
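
As a rough user-space model of that point (the model_* types and
functions are simplified, hypothetical stand-ins, not the kernel's
definitions), the accessor and the open-coded check look like this:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified node; only the expiry time is kept in this model. */
struct model_node {
	long expires;
};

/* Simplified queue head; the tree itself is left out of this model. */
struct model_head {
	struct model_node *next;	/* earliest-expiring node, NULL if empty */
};

/* Models the accessor as described in this thread: it returns head->next. */
static struct model_node *model_getnext(struct model_head *head)
{
	return head->next;
}

/* The open-coded emptiness test: a base is active if 'next' is set. */
static int model_base_is_active(const struct model_head *active)
{
	return active->next != NULL;
}
```

Both forms read the same field, so in this model they always agree on
whether a base has pending timers.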

Thanks,

	Ingo

---
 kernel/time/hrtimer.c |   24 +++++++++++++++---------
 1 file changed, 15 insertions(+), 9 deletions(-)

Index: tip/kernel/time/hrtimer.c
===================================================================
--- tip.orig/kernel/time/hrtimer.c
+++ tip/kernel/time/hrtimer.c
@@ -1660,29 +1660,35 @@ static void migrate_hrtimer_list(struct
 
 static void migrate_hrtimers(int scpu)
 {
-	struct hrtimer_cpu_base *old_base, *new_base;
+	struct hrtimer_cpu_base *old_cpu_base, *new_cpu_base;
 	int i;
 
 	BUG_ON(cpu_online(scpu));
 	tick_cancel_sched_timer(scpu);
 
 	local_irq_disable();
-	old_base = &per_cpu(hrtimer_bases, scpu);
-	new_base = this_cpu_ptr(&hrtimer_bases);
+	old_cpu_base = &per_cpu(hrtimer_bases, scpu);
+	new_cpu_base = this_cpu_ptr(&hrtimer_bases);
 	/*
 	 * The caller is globally serialized and nobody else
 	 * takes two locks at once, deadlock is not possible.
 	 */
-	raw_spin_lock(&new_base->lock);
-	raw_spin_lock_nested(&old_base->lock, SINGLE_DEPTH_NESTING);
+	raw_spin_lock(&new_cpu_base->lock);
+	raw_spin_lock_nested(&old_cpu_base->lock, SINGLE_DEPTH_NESTING);
 
 	for (i = 0; i < HRTIMER_MAX_CLOCK_BASES; i++) {
-		migrate_hrtimer_list(&old_base->clock_base[i],
-				     &new_base->clock_base[i]);
+		struct hrtimer_clock_base *old_base = old_cpu_base->clock_base + i;
+		struct hrtimer_clock_base *new_base;
+
+		if (!old_base->active.next)
+			continue;
+
+		new_base = new_cpu_base->clock_base + i;
+		migrate_hrtimer_list(old_base, new_base);
 	}
 
-	raw_spin_unlock(&old_base->lock);
-	raw_spin_unlock(&new_base->lock);
+	raw_spin_unlock(&old_cpu_base->lock);
+	raw_spin_unlock(&new_cpu_base->lock);
 
 	/* Check, if we got expired work to do */
 	__hrtimer_peek_ahead_timers();


* [PATCH] hrtimer: Only iterate over active bases in migrate_hrtimers()
  2015-04-09  6:38       ` Ingo Molnar
@ 2015-04-09  6:39         ` Ingo Molnar
  2015-04-09  6:53         ` [PATCH] hrtimer: Replace timerqueue_getnext() uses with direct access to 'active.next' Ingo Molnar
  2015-04-09  7:10         ` [PATCH] hrtimers: Use consistent variable names for timerqueue_node iterations Ingo Molnar
  2 siblings, 0 replies; 123+ messages in thread
From: Ingo Molnar @ 2015-04-09  6:39 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Viresh Kumar, Ingo Molnar, Peter Zijlstra, linaro-kernel,
	linux-kernel, Preeti U Murthy


(fixed the subject to include the patch title. Patch quoted below.)

* Ingo Molnar <mingo@kernel.org> wrote:

> 
> * Ingo Molnar <mingo@kernel.org> wrote:
> 
> > This would speed up various hrtimer primitives like 
> > hrtimer_remove()/add and simplify the code. It would be a net code 
> > shrink as well.
> > 
> > Totally untested patch below. It gives:
> > 
> >    text    data     bss     dec     hex filename
> >    7502     427       0    7929    1ef9 hrtimer.o.before
> >    7422     427       0    7849    1ea9 hrtimer.o.after
> > 
> > and half of that code removal is from hot paths.
> > 
> > This would simplify the followup step of skipping over inactive bases 
> > as well.
> 
> The followup step is attached below (untested as well).
> 
> Note that all other iterations already had a check for active.next, so 
> the patch doesn't even bloat anything:
> 
>    text    data     bss     dec     hex filename
>    7422     427       0    7849    1ea9 hrtimer.o.before
>    7422     427       0    7849    1ea9 hrtimer.o.after
> 
> (I did a rename within migrate_hrtimers() because it used 'cpu_base' 
> vs 'clock_base' inconsistently in a confusing (to me) manner.)
> 
> I'd also suggest the removal of the timerqueue_getnext() obfuscation: 
> it 'sounds' complex but in reality it's a simple dereference to 
> active.next. I think this is what triggered this rather pointless 
> maintenance of active_bases.
> 
> Thanks,
> 
> 	Ingo
> 
> ---
>  kernel/time/hrtimer.c |   24 +++++++++++++++---------
>  1 file changed, 15 insertions(+), 9 deletions(-)
> 
> Index: tip/kernel/time/hrtimer.c
> ===================================================================
> --- tip.orig/kernel/time/hrtimer.c
> +++ tip/kernel/time/hrtimer.c
> @@ -1660,29 +1660,35 @@ static void migrate_hrtimer_list(struct
>  
>  static void migrate_hrtimers(int scpu)
>  {
> -	struct hrtimer_cpu_base *old_base, *new_base;
> +	struct hrtimer_cpu_base *old_cpu_base, *new_cpu_base;
>  	int i;
>  
>  	BUG_ON(cpu_online(scpu));
>  	tick_cancel_sched_timer(scpu);
>  
>  	local_irq_disable();
> -	old_base = &per_cpu(hrtimer_bases, scpu);
> -	new_base = this_cpu_ptr(&hrtimer_bases);
> +	old_cpu_base = &per_cpu(hrtimer_bases, scpu);
> +	new_cpu_base = this_cpu_ptr(&hrtimer_bases);
>  	/*
>  	 * The caller is globally serialized and nobody else
>  	 * takes two locks at once, deadlock is not possible.
>  	 */
> -	raw_spin_lock(&new_base->lock);
> -	raw_spin_lock_nested(&old_base->lock, SINGLE_DEPTH_NESTING);
> +	raw_spin_lock(&new_cpu_base->lock);
> +	raw_spin_lock_nested(&old_cpu_base->lock, SINGLE_DEPTH_NESTING);
>  
>  	for (i = 0; i < HRTIMER_MAX_CLOCK_BASES; i++) {
> -		migrate_hrtimer_list(&old_base->clock_base[i],
> -				     &new_base->clock_base[i]);
> +		struct hrtimer_clock_base *old_base = old_cpu_base->clock_base + i;
> +		struct hrtimer_clock_base *new_base;
> +
> +		if (!old_base->active.next)
> +			continue;
> +
> +		new_base = new_cpu_base->clock_base + i;
> +		migrate_hrtimer_list(old_base, new_base);
>  	}
>  
> -	raw_spin_unlock(&old_base->lock);
> -	raw_spin_unlock(&new_base->lock);
> +	raw_spin_unlock(&old_cpu_base->lock);
> +	raw_spin_unlock(&new_cpu_base->lock);
>  
>  	/* Check, if we got expired work to do */
>  	__hrtimer_peek_ahead_timers();


* [PATCH] hrtimer: Replace timerqueue_getnext() uses with direct access to 'active.next'
  2015-04-09  6:38       ` Ingo Molnar
  2015-04-09  6:39         ` [PATCH] hrtimer: Only iterate over active bases in migrate_hrtimers() Ingo Molnar
@ 2015-04-09  6:53         ` Ingo Molnar
  2015-04-09  7:10         ` [PATCH] hrtimers: Use consistent variable names for timerqueue_node iterations Ingo Molnar
  2 siblings, 0 replies; 123+ messages in thread
From: Ingo Molnar @ 2015-04-09  6:53 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Viresh Kumar, Ingo Molnar, Peter Zijlstra, linaro-kernel,
	linux-kernel, Preeti U Murthy


* Ingo Molnar <mingo@kernel.org> wrote:

> I'd also suggest the removal of the timerqueue_getnext() 
> obfuscation: it 'sounds' complex but in reality it's a simple 
> dereference to active.next. I think this is what triggered this 
> rather pointless maintenance of active_bases.

The patch below does this - it's more readable to me in this fashion, 
the 'next' field is already very clearly named, no need for a 
(longer!) helper function there.

I lightly tested the 3 patches with hrtimers enabled under a virtual 
machine.

Thanks,

	Ingo

---
 kernel/time/hrtimer.c |   12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

Index: tip/kernel/time/hrtimer.c
===================================================================
--- tip.orig/kernel/time/hrtimer.c
+++ tip/kernel/time/hrtimer.c
@@ -451,7 +451,7 @@ static ktime_t __hrtimer_get_next_event(
 		struct timerqueue_node *next;
 		struct hrtimer *timer;
 
-		next = timerqueue_getnext(&base->active);
+		next = base->active.next;
 		if (!next)
 			continue;
 
@@ -876,7 +876,7 @@ static void __remove_hrtimer(struct hrti
 	if (!(timer->state & HRTIMER_STATE_ENQUEUED))
 		goto out;
 
-	next_timer = timerqueue_getnext(&base->active);
+	next_timer = base->active.next;
 	timerqueue_del(&base->active, &timer->node);
 	if (&timer->node == next_timer) {
 #ifdef CONFIG_HIGH_RES_TIMERS
@@ -1271,7 +1271,7 @@ retry:
 
 		basenow = ktime_add(now, base->offset);
 
-		while ((node = timerqueue_getnext(&base->active))) {
+		while ((node = base->active.next)) {
 			struct hrtimer *timer;
 
 			timer = container_of(node, struct hrtimer, node);
@@ -1438,7 +1438,7 @@ void hrtimer_run_queues(void)
 
 	for (index = 0; index < HRTIMER_MAX_CLOCK_BASES; index++) {
 		base = &cpu_base->clock_base[index];
-		if (!timerqueue_getnext(&base->active))
+		if (!base->active.next)
 			continue;
 
 		if (gettime) {
@@ -1448,7 +1448,7 @@ void hrtimer_run_queues(void)
 
 		raw_spin_lock(&cpu_base->lock);
 
-		while ((node = timerqueue_getnext(&base->active))) {
+		while ((node = base->active.next)) {
 			struct hrtimer *timer;
 
 			timer = container_of(node, struct hrtimer, node);
@@ -1631,7 +1631,7 @@ static void migrate_hrtimer_list(struct
 	struct hrtimer *timer;
 	struct timerqueue_node *node;
 
-	while ((node = timerqueue_getnext(&old_base->active))) {
+	while ((node = old_base->active.next)) {
 		timer = container_of(node, struct hrtimer, node);
 		BUG_ON(hrtimer_callback_running(timer));
 		debug_deactivate(timer);


* Re: [PATCH] hrtimer: Replace cpu_base->active_bases with a direct check of the active list
  2015-04-09  6:28     ` [PATCH] hrtimer: Replace cpu_base->active_bases with a direct check of the active list Ingo Molnar
  2015-04-09  6:38       ` Ingo Molnar
@ 2015-04-09  6:57       ` Peter Zijlstra
  2015-04-09  7:09         ` Ingo Molnar
  2015-04-09  8:53       ` Thomas Gleixner
  2 siblings, 1 reply; 123+ messages in thread
From: Peter Zijlstra @ 2015-04-09  6:57 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Thomas Gleixner, Viresh Kumar, Ingo Molnar, linaro-kernel,
	linux-kernel, Preeti U Murthy

On Thu, Apr 09, 2015 at 08:28:41AM +0200, Ingo Molnar wrote:
> Btw., does cpu_base->active_bases even make sense? hrtimer bases are 
> fundamentally percpu, and to check whether there are any pending 
> timers is a very simple check:
> 
> 	base->active->next != NULL
> 

Yeah, that's 3 pointer dereferences from cpu_base, iow you traded a
single bit test on an already loaded word for 3 potential cacheline
misses.
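
To make the trade-off concrete, here is a minimal userspace sketch (not
kernel code) of the two checks being compared. Only 'active_bases' and
'active.next' come from this thread; the struct layouts and the helpers
count_active_bitmask(), count_active_direct() and enqueue_first() are
simplified stand-ins invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CLOCK_BASES 4	/* mirrors HRTIMER_MAX_CLOCK_BASES in this thread */

/* Simplified stand-ins for the kernel structures; not the real layouts. */
struct tq_node { int unused; };
struct tq_head { struct tq_node *next; };	/* earliest node, or NULL */

struct clock_base {
	struct tq_head active;
};

struct cpu_base {
	unsigned int active_bases;	/* bit i set => clock_base[i] has timers */
	struct clock_base clock_base[MAX_CLOCK_BASES];
};

/* Variant A: test bits in a single word that is loaded once. */
static int count_active_bitmask(const struct cpu_base *cb)
{
	int i, n = 0;

	assert(cb != NULL);
	for (i = 0; i < MAX_CLOCK_BASES; i++)
		if (cb->active_bases & (1U << i))
			n++;
	return n;
}

/* Variant B: read every base's queue head, touching each base's memory. */
static int count_active_direct(const struct cpu_base *cb)
{
	int i, n = 0;

	assert(cb != NULL);
	for (i = 0; i < MAX_CLOCK_BASES; i++)
		if (cb->clock_base[i].active.next)
			n++;
	return n;
}

/* Hypothetical helper: keeps the bitmask and the queue head in sync. */
static void enqueue_first(struct cpu_base *cb, int idx, struct tq_node *node)
{
	assert(cb != NULL && idx >= 0 && idx < MAX_CLOCK_BASES);
	cb->clock_base[idx].active.next = node;
	cb->active_bases |= 1U << idx;
}
```

Both variants return the same answer as long as the bitmask is kept in
sync; the difference is only in how much memory each one has to read.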

^ permalink raw reply	[flat|nested] 123+ messages in thread

* Re: [PATCH] hrtimer: Replace cpu_base->active_bases with a direct check of the active list
  2015-04-09  6:57       ` [PATCH] hrtimer: Replace cpu_base->active_bases with a direct check of the active list Peter Zijlstra
@ 2015-04-09  7:09         ` Ingo Molnar
  2015-04-09  7:20           ` Ingo Molnar
  2015-04-09  8:03           ` Peter Zijlstra
  0 siblings, 2 replies; 123+ messages in thread
From: Ingo Molnar @ 2015-04-09  7:09 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Thomas Gleixner, Viresh Kumar, Ingo Molnar, linaro-kernel,
	linux-kernel, Preeti U Murthy


* Peter Zijlstra <peterz@infradead.org> wrote:

> On Thu, Apr 09, 2015 at 08:28:41AM +0200, Ingo Molnar wrote:
> > Btw., does cpu_base->active_bases even make sense? hrtimer bases are 
> > fundamentally percpu, and to check whether there are any pending 
> > timers is a very simple check:
> > 
> > 	base->active->next != NULL
> > 
> 
> Yeah, that's 3 pointer dereferences from cpu_base, iow you traded a 
> single bit test on an already loaded word for 3 potential cacheline 
> misses.

But the clock bases are not aligned to cachelines, and we have 4 of 
them. So in practice when we access one, we'll load the next one 
anyway.

Furthermore the simplification is measurable, and a fair bit of it is 
in various fast paths. I'd rather trade a bit of a cacheline footprint 
for less overall complexity and faster code.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 123+ messages in thread

* [PATCH] hrtimers: Use consistent variable names for timerqueue_node iterations
  2015-04-09  6:38       ` Ingo Molnar
  2015-04-09  6:39         ` [PATCH] hrtimer: Only iterate over active bases in migrate_hrtimers() Ingo Molnar
  2015-04-09  6:53         ` [PATCH] hrtimer: Replace timerqueue_getnext() uses with direct access to 'active.next' Ingo Molnar
@ 2015-04-09  7:10         ` Ingo Molnar
  2 siblings, 0 replies; 123+ messages in thread
From: Ingo Molnar @ 2015-04-09  7:10 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Viresh Kumar, Ingo Molnar, Peter Zijlstra, linaro-kernel,
	linux-kernel, Preeti U Murthy

We iterate over the timerqueue_node list of base->active.next 
frequently, but the iterator variables are a colorful and inconsistent 
mix: 'next', 'next_timer' (!), 'node' and 'node_next'.

For the 5 iterations we've invented 4 separate names for the same 
thing ... sigh.

So standardize the naming to 'node_next': this both tells us that it's 
an rbtree node, and that it's also a next element in a list.

No code changed:

   text	   data	    bss	    dec	    hex	filename
   7754	    427	      0	   8181	   1ff5	hrtimer.o.before
   7754	    427	      0	   8181	   1ff5	hrtimer.o.after

md5:
   de9af3b01d50c72af2d8b026ed22704a  hrtimer.o.before.asm
   de9af3b01d50c72af2d8b026ed22704a  hrtimer.o.after.asm

Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/time/hrtimer.c |   32 ++++++++++++++++----------------
 1 file changed, 16 insertions(+), 16 deletions(-)

Index: tip/kernel/time/hrtimer.c
===================================================================
--- tip.orig/kernel/time/hrtimer.c
+++ tip/kernel/time/hrtimer.c
@@ -448,14 +448,14 @@ static ktime_t __hrtimer_get_next_event(
 	int i;
 
 	for (i = 0; i < HRTIMER_MAX_CLOCK_BASES; i++, base++) {
-		struct timerqueue_node *next;
+		struct timerqueue_node *node_next;
 		struct hrtimer *timer;
 
-		next = base->active.next;
-		if (!next)
+		node_next = base->active.next;
+		if (!node_next)
 			continue;
 
-		timer = container_of(next, struct hrtimer, node);
+		timer = container_of(node_next, struct hrtimer, node);
 		expires = ktime_sub(hrtimer_get_expires(timer), base->offset);
 		if (expires.tv64 < expires_next.tv64)
 			expires_next = expires;
@@ -872,13 +872,13 @@ static void __remove_hrtimer(struct hrti
 			     struct hrtimer_clock_base *base,
 			     unsigned long newstate, int reprogram)
 {
-	struct timerqueue_node *next_timer;
+	struct timerqueue_node *node_next;
 	if (!(timer->state & HRTIMER_STATE_ENQUEUED))
 		goto out;
 
-	next_timer = base->active.next;
+	node_next = base->active.next;
 	timerqueue_del(&base->active, &timer->node);
-	if (&timer->node == next_timer) {
+	if (&timer->node == node_next) {
 #ifdef CONFIG_HIGH_RES_TIMERS
 		/* Reprogram the clock event device. if enabled */
 		if (reprogram && hrtimer_hres_active()) {
@@ -1262,7 +1262,7 @@ retry:
 
 	for (i = 0; i < HRTIMER_MAX_CLOCK_BASES; i++) {
 		struct hrtimer_clock_base *base;
-		struct timerqueue_node *node;
+		struct timerqueue_node *node_next;
 		ktime_t basenow;
 
 		base = cpu_base->clock_base + i;
@@ -1271,10 +1271,10 @@ retry:
 
 		basenow = ktime_add(now, base->offset);
 
-		while ((node = base->active.next)) {
+		while ((node_next = base->active.next)) {
 			struct hrtimer *timer;
 
-			timer = container_of(node, struct hrtimer, node);
+			timer = container_of(node_next, struct hrtimer, node);
 
 			/*
 			 * The immediate goal for using the softexpires is
@@ -1428,7 +1428,7 @@ void hrtimer_run_pending(void)
  */
 void hrtimer_run_queues(void)
 {
-	struct timerqueue_node *node;
+	struct timerqueue_node *node_next;
 	struct hrtimer_cpu_base *cpu_base = this_cpu_ptr(&hrtimer_bases);
 	struct hrtimer_clock_base *base;
 	int index, gettime = 1;
@@ -1448,10 +1448,10 @@ void hrtimer_run_queues(void)
 
 		raw_spin_lock(&cpu_base->lock);
 
-		while ((node = base->active.next)) {
+		while ((node_next = base->active.next)) {
 			struct hrtimer *timer;
 
-			timer = container_of(node, struct hrtimer, node);
+			timer = container_of(node_next, struct hrtimer, node);
 			if (base->softirq_time.tv64 <=
 					hrtimer_get_expires_tv64(timer))
 				break;
@@ -1629,10 +1629,10 @@ static void migrate_hrtimer_list(struct
 				struct hrtimer_clock_base *new_base)
 {
 	struct hrtimer *timer;
-	struct timerqueue_node *node;
+	struct timerqueue_node *node_next;
 
-	while ((node = old_base->active.next)) {
-		timer = container_of(node, struct hrtimer, node);
+	while ((node_next = old_base->active.next)) {
+		timer = container_of(node_next, struct hrtimer, node);
 		BUG_ON(hrtimer_callback_running(timer));
 		debug_deactivate(timer);
 

^ permalink raw reply	[flat|nested] 123+ messages in thread

* Re: [PATCH] hrtimer: Replace cpu_base->active_bases with a direct check of the active list
  2015-04-09  7:09         ` Ingo Molnar
@ 2015-04-09  7:20           ` Ingo Molnar
  2015-04-09  8:58             ` Thomas Gleixner
  2015-04-09  8:58             ` Peter Zijlstra
  2015-04-09  8:03           ` Peter Zijlstra
  1 sibling, 2 replies; 123+ messages in thread
From: Ingo Molnar @ 2015-04-09  7:20 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Thomas Gleixner, Viresh Kumar, Ingo Molnar, linaro-kernel,
	linux-kernel, Preeti U Murthy


* Ingo Molnar <mingo@kernel.org> wrote:

> 
> * Peter Zijlstra <peterz@infradead.org> wrote:
> 
> > On Thu, Apr 09, 2015 at 08:28:41AM +0200, Ingo Molnar wrote:
> > > Btw., does cpu_base->active_bases even make sense? hrtimer bases are 
> > > fundamentally percpu, and to check whether there are any pending 
> > > timers is a very simple check:
> > > 
> > > 	base->active->next != NULL
> > > 
> > 
> > Yeah, that's 3 pointer dereferences from cpu_base, iow you traded a 
> > single bit test on an already loaded word for 3 potential cacheline 
> > misses.
> 
> But the clock bases are not aligned to cachelines, and we have 4 of 
> them. So in practice when we access one, we'll load the next one 
> anyway.
> 
> Furthermore the simplification is measurable, and a fair bit of it is 
> in various fast paths. I'd rather trade a bit of a cacheline footprint 
> for less overall complexity and faster code.

Plus, look at this code in hrtimer_run_queues():

        for (index = 0; index < HRTIMER_MAX_CLOCK_BASES; index++) {
                base = &cpu_base->clock_base[index];
                if (!base->active.next)
                        continue;

                if (gettime) {
                        hrtimer_get_softirq_time(cpu_base);
                        gettime = 0;
                }

if at least one base is active (on my fairly standard system all cpus 
have at least one active hrtimer base all the time - and many cpus 
have two bases active), then we run hrtimer_get_softirq_time(), which 
dirties the cachelines of all 4 clock bases:

        base->clock_base[HRTIMER_BASE_REALTIME].softirq_time = xtim;
        base->clock_base[HRTIMER_BASE_MONOTONIC].softirq_time = mono;
        base->clock_base[HRTIMER_BASE_BOOTTIME].softirq_time = boot;
        base->clock_base[HRTIMER_BASE_TAI].softirq_time = tai;

so in practice we not only touch every cacheline in every timer 
interrupt, but we _dirty_ them, even the inactive ones.

So I'd strongly argue in favor of this patch series of simplification: 
it makes the code simpler and faster, and won't impact cache footprint 
in practice.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 123+ messages in thread

* Re: [PATCH] hrtimer: Replace cpu_base->active_bases with a direct check of the active list
  2015-04-09  7:09         ` Ingo Molnar
  2015-04-09  7:20           ` Ingo Molnar
@ 2015-04-09  8:03           ` Peter Zijlstra
  2015-04-09  8:10             ` Ingo Molnar
  1 sibling, 1 reply; 123+ messages in thread
From: Peter Zijlstra @ 2015-04-09  8:03 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Thomas Gleixner, Viresh Kumar, Ingo Molnar, linaro-kernel,
	linux-kernel, Preeti U Murthy

On Thu, Apr 09, 2015 at 09:09:17AM +0200, Ingo Molnar wrote:
> 
> * Peter Zijlstra <peterz@infradead.org> wrote:
> 
> > On Thu, Apr 09, 2015 at 08:28:41AM +0200, Ingo Molnar wrote:
> > > Btw., does cpu_base->active_bases even make sense? hrtimer bases are 
> > > fundamentally percpu, and to check whether there are any pending 
> > > timers is a very simple check:
> > > 
> > > 	base->active->next != NULL
> > > 
> > 
> > Yeah, that's 3 pointer dereferences from cpu_base, iow you traded a 
> > single bit test on an already loaded word for 3 potential cacheline 
> > misses.
> 
> But the clock bases are not aligned to cachelines, and we have 4 of 
> them. So in practice when we access one, we'll load the next one 
> anyway.

$ pahole -C hrtimer_clock_base defconfig-build/kernel/time/timer.o 
struct hrtimer_clock_base {
        struct hrtimer_cpu_base *  cpu_base;             /*     0     8 */
        int                        index;                /*     8     4 */
        clockid_t                  clockid;              /*    12     4 */
        struct timerqueue_head     active;               /*    16    16 */
        ktime_t                    resolution;           /*    32     8 */
        ktime_t                    (*get_time)(void);    /*    40     8 */
        ktime_t                    softirq_time;         /*    48     8 */
        ktime_t                    offset;               /*    56     8 */
        /* --- cacheline 1 boundary (64 bytes) --- */

        /* size: 64, cachelines: 1, members: 8 */
};

They _should_ be aligned :-)

> Furthermore the simplification is measurable, and a fair bit of it is 
> in various fast paths. I'd rather trade a bit of a cacheline footprint 
> for less overall complexity and faster code.

cacheline misses hurt a lot, and the bitmask isn't really complex.
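
Note that the pahole output above reports the size (64 bytes), which by
itself does not say where an instance starts in memory. A small C11
sketch of that distinction, assuming a 64-byte cacheline; the struct
names and cachelines_spanned() are hypothetical and are not taken from
hrtimer.c:

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

#define CACHELINE 64	/* assumed cacheline size, as in the pahole output */

/* 64 bytes of payload with only natural alignment. */
struct base_plain {
	uint64_t words[8];
};

/* Same payload, with the alignment forced to a full cacheline. */
struct base_aligned {
	alignas(CACHELINE) uint64_t words[8];
};

static_assert(sizeof(struct base_plain) == 64, "payload is one cacheline");
static_assert(sizeof(struct base_aligned) == 64, "no padding is added");

/* Number of cachelines overlapped by an object of 'size' bytes at 'addr'. */
static size_t cachelines_spanned(uintptr_t addr, size_t size)
{
	assert(size > 0);
	return (addr + size - 1) / CACHELINE - addr / CACHELINE + 1;
}
```

A 64-byte object that starts at offset 0 of a line covers one cacheline,
while the same object starting at offset 24 straddles two.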

^ permalink raw reply	[flat|nested] 123+ messages in thread

* Re: [PATCH] hrtimer: Replace cpu_base->active_bases with a direct check of the active list
  2015-04-09  8:03           ` Peter Zijlstra
@ 2015-04-09  8:10             ` Ingo Molnar
  0 siblings, 0 replies; 123+ messages in thread
From: Ingo Molnar @ 2015-04-09  8:10 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Thomas Gleixner, Viresh Kumar, Ingo Molnar, linaro-kernel,
	linux-kernel, Preeti U Murthy


* Peter Zijlstra <peterz@infradead.org> wrote:

> On Thu, Apr 09, 2015 at 09:09:17AM +0200, Ingo Molnar wrote:
> > 
> > * Peter Zijlstra <peterz@infradead.org> wrote:
> > 
> > > On Thu, Apr 09, 2015 at 08:28:41AM +0200, Ingo Molnar wrote:
> > > > Btw., does cpu_base->active_bases even make sense? hrtimer bases are 
> > > > fundamentally percpu, and to check whether there are any pending 
> > > > timers is a very simple check:
> > > > 
> > > > 	base->active->next != NULL
> > > > 
> > > 
> > > Yeah, that's 3 pointer dereferences from cpu_base, iow you traded a 
> > > single bit test on an already loaded word for 3 potential cacheline 
> > > misses.
> > 
> > But the clock bases are not aligned to cachelines, and we have 4 of 
> > them. So in practice when we access one, we'll load the next one 
> > anyway.
> 
> $ pahole -C hrtimer_clock_base defconfig-build/kernel/time/timer.o 
> struct hrtimer_clock_base {
>         struct hrtimer_cpu_base *  cpu_base;             /*     0     8 */
>         int                        index;                /*     8     4 */
>         clockid_t                  clockid;              /*    12     4 */
>         struct timerqueue_head     active;               /*    16    16 */
>         ktime_t                    resolution;           /*    32     8 */
>         ktime_t                    (*get_time)(void);    /*    40     8 */
>         ktime_t                    softirq_time;         /*    48     8 */
>         ktime_t                    offset;               /*    56     8 */
>         /* --- cacheline 1 boundary (64 bytes) --- */
> 
>         /* size: 64, cachelines: 1, members: 8 */
> };
> 
> They _should_ be aligned :-)

Maybe, but they aren't currently - and aligning them has costs as 
well.

> > Furthermore the simplification is measurable, and a fair bit of it 
> > is in various fast paths. I'd rather trade a bit of a cacheline 
> > footprint for less overall complexity and faster code.
> 
> cacheline misses hurt a lot, and the bitmask isn't really complex.

See my other mail: in practice we already dirty all of these 
cachelines in the hrtimer irq...

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 123+ messages in thread

* Re: [PATCH] hrtimer: Replace cpu_base->active_bases with a direct check of the active list
  2015-04-09  6:28     ` [PATCH] hrtimer: Replace cpu_base->active_bases with a direct check of the active list Ingo Molnar
  2015-04-09  6:38       ` Ingo Molnar
  2015-04-09  6:57       ` [PATCH] hrtimer: Replace cpu_base->active_bases with a direct check of the active list Peter Zijlstra
@ 2015-04-09  8:53       ` Thomas Gleixner
  2015-04-09  9:18         ` Ingo Molnar
  2 siblings, 1 reply; 123+ messages in thread
From: Thomas Gleixner @ 2015-04-09  8:53 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Viresh Kumar, Ingo Molnar, Peter Zijlstra, linaro-kernel,
	linux-kernel, Preeti U Murthy

On Thu, 9 Apr 2015, Ingo Molnar wrote:
> Btw., does cpu_base->active_bases even make sense? hrtimer bases are 
> fundamentally percpu, and to check whether there are any pending 
> timers is a very simple check:
> 
> 	base->active->next != NULL
> 
> So I'd rather suggest taking a direct look at the head, instead of 
> calculating bit positions, masks, etc.
>
> Furthermore, we never actually use cpu_base->active_bases as a 
> 'summary' value (which is the main point of bitmasks in general), so 
> I'd remove that complication altogether.
> 
> This would speed up various hrtimer primitives like 
> hrtimer_remove()/add and simplify the code. It would be a net code 
> shrink as well.

Well. You trade a bit more code against touching cache lines to figure
out whether the clock base has active timers or not. So for a lot of
scenarios where only clock monotonic is used you touch 3 cache lines
for nothing.

I'm about to send out a patch which actually makes better use of the
active_bases field without creating a code size explosion.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 123+ messages in thread

* Re: [PATCH] hrtimer: Replace cpu_base->active_bases with a direct check of the active list
  2015-04-09  7:20           ` Ingo Molnar
@ 2015-04-09  8:58             ` Thomas Gleixner
  2015-04-09  8:58             ` Peter Zijlstra
  1 sibling, 0 replies; 123+ messages in thread
From: Thomas Gleixner @ 2015-04-09  8:58 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Peter Zijlstra, Viresh Kumar, Ingo Molnar, linaro-kernel,
	linux-kernel, Preeti U Murthy

On Thu, 9 Apr 2015, Ingo Molnar wrote:
> Plus, look at this code in hrtimer_run_queues():
> 
>         for (index = 0; index < HRTIMER_MAX_CLOCK_BASES; index++) {
>                 base = &cpu_base->clock_base[index];
>                 if (!base->active.next)
>                         continue;
> 
>                 if (gettime) {
>                         hrtimer_get_softirq_time(cpu_base);
>                         gettime = 0;
>                 }
> 
> if at least one base is active (on my fairly standard system all cpus 
> have at least one active hrtimer base all the time - and many cpus 
> have two bases active), then we run hrtimer_get_softirq_time(), which 
> dirties the cachelines of all 4 clock bases:
> 
>         base->clock_base[HRTIMER_BASE_REALTIME].softirq_time = xtim;
>         base->clock_base[HRTIMER_BASE_MONOTONIC].softirq_time = mono;
>         base->clock_base[HRTIMER_BASE_BOOTTIME].softirq_time = boot;
>         base->clock_base[HRTIMER_BASE_TAI].softirq_time = tai;

This is the non highres case and we actually could optimize that case
as well to avoid the cache line pollution.
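
One way such an optimization could look, purely as a userspace sketch
and not as the change that was actually made: write 'softirq_time' only
for bases whose bit is set in 'active_bases'. The types, the ktime_ns
typedef and update_softirq_time_active() are assumptions made for this
example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_CLOCK_BASES 4	/* mirrors HRTIMER_MAX_CLOCK_BASES */

typedef int64_t ktime_ns;	/* hypothetical stand-in for ktime_t */

struct clock_base {
	ktime_ns softirq_time;	/* per-tick time snapshot */
	ktime_ns offset;	/* offset of this clock from monotonic */
};

struct cpu_base {
	unsigned int active_bases;	/* bit i set => clock_base[i] has timers */
	struct clock_base clock_base[MAX_CLOCK_BASES];
};

/*
 * Update the snapshot for active bases only, so inactive bases are
 * never written. Returns the number of bases that were written.
 */
static int update_softirq_time_active(struct cpu_base *cb, ktime_ns mono_now)
{
	unsigned int pending;
	int i, written = 0;

	assert(cb != NULL);
	assert((cb->active_bases >> MAX_CLOCK_BASES) == 0);

	pending = cb->active_bases;
	for (i = 0; pending; i++, pending >>= 1) {
		if (!(pending & 1U))
			continue;
		cb->clock_base[i].softirq_time =
			mono_now + cb->clock_base[i].offset;
		written++;
	}
	return written;
}
```

With only one base active, a single 'softirq_time' field is written and
the other three bases stay untouched.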

Thanks,

	tglx



 

^ permalink raw reply	[flat|nested] 123+ messages in thread

* Re: [PATCH] hrtimer: Replace cpu_base->active_bases with a direct check of the active list
  2015-04-09  7:20           ` Ingo Molnar
  2015-04-09  8:58             ` Thomas Gleixner
@ 2015-04-09  8:58             ` Peter Zijlstra
  2015-04-09  9:18               ` Thomas Gleixner
  1 sibling, 1 reply; 123+ messages in thread
From: Peter Zijlstra @ 2015-04-09  8:58 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Thomas Gleixner, Viresh Kumar, Ingo Molnar, linaro-kernel,
	linux-kernel, Preeti U Murthy

On Thu, Apr 09, 2015 at 09:20:39AM +0200, Ingo Molnar wrote:
> if at least one base is active (on my fairly standard system all cpus 
> have at least one active hrtimer base all the time - and many cpus 
> have two bases active), then we run hrtimer_get_softirq_time(), which 
> dirties the cachelines of all 4 clock bases:
> 
>         base->clock_base[HRTIMER_BASE_REALTIME].softirq_time = xtim;
>         base->clock_base[HRTIMER_BASE_MONOTONIC].softirq_time = mono;
>         base->clock_base[HRTIMER_BASE_BOOTTIME].softirq_time = boot;
>         base->clock_base[HRTIMER_BASE_TAI].softirq_time = tai;
> 
> so in practice we not only touch every cacheline in every timer 
> interrupt, but we _dirty_ them, even the inactive ones.
> 

Urgh we should really _really_ kill that entire softirq mess.

All it needs is hrtimer_start*() returning -ENOTIME when it cannot queue
the timer.

Of course, all that needs is auditing all hrtimer_start*() callsites,
which is a lot of work.

^ permalink raw reply	[flat|nested] 123+ messages in thread

* Re: [PATCH] hrtimer: Replace cpu_base->active_bases with a direct check of the active list
  2015-04-09  8:53       ` Thomas Gleixner
@ 2015-04-09  9:18         ` Ingo Molnar
  0 siblings, 0 replies; 123+ messages in thread
From: Ingo Molnar @ 2015-04-09  9:18 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Viresh Kumar, Ingo Molnar, Peter Zijlstra, linaro-kernel,
	linux-kernel, Preeti U Murthy


* Thomas Gleixner <tglx@linutronix.de> wrote:

> On Thu, 9 Apr 2015, Ingo Molnar wrote:
> > Btw., does cpu_base->active_bases even make sense? hrtimer bases are 
> > fundamentally percpu, and to check whether there are any pending 
> > timers is a very simple check:
> > 
> > 	base->active->next != NULL
> > 
> > So I'd rather suggest taking a direct look at the head, instead of 
> > calculating bit positions, masks, etc.
> >
> > Furthermore, we never actually use cpu_base->active_bases as a 
> > 'summary' value (which is the main point of bitmasks in general), 
> > so I'd remove that complication altogether.
> > 
> > This would speed up various hrtimer primitives like 
> > hrtimer_remove()/add and simplify the code. It would be a net code 
> > shrink as well.
> 
> Well. You trade a bit more code against touching cache lines to 
> figure out whether the clock base has active timers or not. So for a 
> lot of scenarios where only clock monotonic is used you touch 3 
> cache lines for nothing.

In the (typical) case it will touch one extra cacheline - and it removes 
a fair bit of complexity, worth 80 bytes of code (which itself touches 
two cachelines):

   text    data     bss     dec     hex filename
   7502     427       0    7929    1ef9 hrtimer.o.before
   7422     427       0    7849    1ea9 hrtimer.o.after

So even if we were to optimize for cache footprint (which isn't the 
only factor we optimize for) it looks like a win-win scenario to me, 
even if you ignore the speedup and the simpler code structure...

Ok?

> I'm about to send out a patch which actually makes better use of the 
> active_bases field without creating a code size explosion.

So please let's do this series first - it achieves the same thing, with 
less cache used and faster code.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 123+ messages in thread

* Re: [PATCH] hrtimer: Replace cpu_base->active_bases with a direct check of the active list
  2015-04-09  8:58             ` Peter Zijlstra
@ 2015-04-09  9:18               ` Thomas Gleixner
  2015-04-09  9:31                 ` Peter Zijlstra
  2015-04-13  5:53                 ` Preeti U Murthy
  0 siblings, 2 replies; 123+ messages in thread
From: Thomas Gleixner @ 2015-04-09  9:18 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Viresh Kumar, Ingo Molnar, linaro-kernel,
	linux-kernel, Preeti U Murthy

On Thu, 9 Apr 2015, Peter Zijlstra wrote:
> On Thu, Apr 09, 2015 at 09:20:39AM +0200, Ingo Molnar wrote:
> > if at least one base is active (on my fairly standard system all cpus 
> > have at least one active hrtimer base all the time - and many cpus 
> > have two bases active), then we run hrtimer_get_softirq_time(), which 
> > dirties the cachelines of all 4 clock bases:
> > 
> >         base->clock_base[HRTIMER_BASE_REALTIME].softirq_time = xtim;
> >         base->clock_base[HRTIMER_BASE_MONOTONIC].softirq_time = mono;
> >         base->clock_base[HRTIMER_BASE_BOOTTIME].softirq_time = boot;
> >         base->clock_base[HRTIMER_BASE_TAI].softirq_time = tai;
> > 
> > so in practice we not only touch every cacheline in every timer 
> > interrupt, but we _dirty_ them, even the inactive ones.
> > 
> 
> Urgh we should really _really_ kill that entire softirq mess.

That's the !highres part. We cannot kill that one unless we remove all
support for machines which do not provide hardware for highres
support.

Now the softirq_time thing is an optimization which we added back in
the days when hrtimer went into the tree and Roman complained about
the base->get_time() invocation being overkill.

The reasoning behind this was that low resolution systems do not need
accurate time for the expiry and the forwarding because everything
happens tick aligned.

So for !HIGHRES we have:

static inline ktime_t hrtimer_cb_get_time(struct hrtimer *timer)
{
	return timer->base->softirq_time;
}

and for the HIGHRES case:

static inline ktime_t hrtimer_cb_get_time(struct hrtimer *timer)
{
	return timer->base->get_time();
}

Here are the usage sites of this:

drivers/power/reset/ltc2952-poweroff.c: now = hrtimer_cb_get_time(timer);
kernel/sched/core.c:            now = hrtimer_cb_get_time(period_timer);
kernel/sched/deadline.c:        now = hrtimer_cb_get_time(&dl_se->dl_timer);
kernel/sched/fair.c:            now = hrtimer_cb_get_time(timer);
kernel/sched/rt.c:              now = hrtimer_cb_get_time(timer);
kernel/time/posix-timers.c:                     ktime_t now = hrtimer_cb_get_time(timer);
sound/drivers/dummy.c:  dpcm->base_time = hrtimer_cb_get_time(&dpcm->timer);
sound/drivers/dummy.c:  delta = ktime_us_delta(hrtimer_cb_get_time(&dpcm->timer),

So the only one where this optimization might matter is the clock
monotonic one. The few users of posix interval timers which use
something other than CLOCK_MONOTONIC should not matter much.

I'd be happy to kill all of this and consolidate everything on the
HIGHRES implementation.
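
As a side-by-side illustration of the two hrtimer_cb_get_time() variants
quoted above, here is a userspace sketch. The types are simplified
stand-ins, and fake_clock_ns, fake_get_time() and the two
cb_get_time_*() names are hypothetical; only the 'softirq_time' and
'get_time' members mirror the code in this mail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int64_t ktime_ns;	/* hypothetical stand-in for ktime_t */

struct clock_base {
	ktime_ns softirq_time;		/* snapshot taken once per tick */
	ktime_ns (*get_time)(void);	/* reads the clock on every call */
};

struct timer {
	struct clock_base *base;
};

/* Hypothetical clock source used only by this example. */
static ktime_ns fake_clock_ns;

static ktime_ns fake_get_time(void)
{
	return fake_clock_ns;
}

/* !HIGHRES flavour: return the cached, tick-aligned snapshot. */
static ktime_ns cb_get_time_lowres(const struct timer *t)
{
	assert(t != NULL && t->base != NULL);
	return t->base->softirq_time;
}

/* HIGHRES flavour: read the clock at the time of the call. */
static ktime_ns cb_get_time_highres(const struct timer *t)
{
	assert(t != NULL && t->base != NULL && t->base->get_time != NULL);
	return t->base->get_time();
}
```

If the clock advances after the snapshot is taken, the low resolution
variant keeps returning the snapshot while the high resolution variant
returns the current reading.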

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 123+ messages in thread

* Re: [PATCH] hrtimer: Replace cpu_base->active_bases with a direct check of the active list
  2015-04-09  9:18               ` Thomas Gleixner
@ 2015-04-09  9:31                 ` Peter Zijlstra
  2015-04-09  9:56                   ` Thomas Gleixner
  2015-04-13  5:53                 ` Preeti U Murthy
  1 sibling, 1 reply; 123+ messages in thread
From: Peter Zijlstra @ 2015-04-09  9:31 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Ingo Molnar, Viresh Kumar, Ingo Molnar, linaro-kernel,
	linux-kernel, Preeti U Murthy

On Thu, Apr 09, 2015 at 11:18:52AM +0200, Thomas Gleixner wrote:
> On Thu, 9 Apr 2015, Peter Zijlstra wrote:
> > On Thu, Apr 09, 2015 at 09:20:39AM +0200, Ingo Molnar wrote:
> > > if at least one base is active (on my fairly standard system all cpus 
> > > have at least one active hrtimer base all the time - and many cpus 
> > > have two bases active), then we run hrtimer_get_softirq_time(), which 
> > > dirties the cachelines of all 4 clock bases:
> > > 
> > >         base->clock_base[HRTIMER_BASE_REALTIME].softirq_time = xtim;
> > >         base->clock_base[HRTIMER_BASE_MONOTONIC].softirq_time = mono;
> > >         base->clock_base[HRTIMER_BASE_BOOTTIME].softirq_time = boot;
> > >         base->clock_base[HRTIMER_BASE_TAI].softirq_time = tai;
> > > 
> > > so in practice we not only touch every cacheline in every timer 
> > > interrupt, but we _dirty_ them, even the inactive ones.
> > > 
> > 
> > Urgh we should really _really_ kill that entire softirq mess.
> 
> That's the !highres part. We cannot kill that one unless we remove all
> support for machines which do not provide hardware for highres
> support.

Oops, sorry I mixed up the two. Doesn't take away from the fact that I'd
like to get rid of the highres softirq fallback path.

^ permalink raw reply	[flat|nested] 123+ messages in thread

* Re: [PATCH] hrtimer: Replace cpu_base->active_bases with a direct check of the active list
  2015-04-09  9:31                 ` Peter Zijlstra
@ 2015-04-09  9:56                   ` Thomas Gleixner
  0 siblings, 0 replies; 123+ messages in thread
From: Thomas Gleixner @ 2015-04-09  9:56 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Viresh Kumar, Ingo Molnar, linaro-kernel,
	linux-kernel, Preeti U Murthy

On Thu, 9 Apr 2015, Peter Zijlstra wrote:
> On Thu, Apr 09, 2015 at 11:18:52AM +0200, Thomas Gleixner wrote:
> > On Thu, 9 Apr 2015, Peter Zijlstra wrote:
> > > On Thu, Apr 09, 2015 at 09:20:39AM +0200, Ingo Molnar wrote:
> > > > if at least one base is active (on my fairly standard system all cpus 
> > > > have at least one active hrtimer base all the time - and many cpus 
> > > > have two bases active), then we run hrtimer_get_softirq_time(), which 
> > > > dirties the cachelines of all 4 clock bases:
> > > > 
> > > >         base->clock_base[HRTIMER_BASE_REALTIME].softirq_time = xtim;
> > > >         base->clock_base[HRTIMER_BASE_MONOTONIC].softirq_time = mono;
> > > >         base->clock_base[HRTIMER_BASE_BOOTTIME].softirq_time = boot;
> > > >         base->clock_base[HRTIMER_BASE_TAI].softirq_time = tai;
> > > > 
> > > > so in practice we not only touch every cacheline in every timer 
> > > > interrupt, but we _dirty_ them, even the inactive ones.
> > > > 
> > > 
> > > Urgh we should really _really_ kill that entire softirq mess.
> > 
> > That's the !highres part. We cannot kill that one unless we remove all
> > support for machines which do not provide hardware for highres
> > support.
> 
> Oops, sorry I mixed up the two. Doesn't take away from the fact that I'd
> like to get rid of the highres softirq fallback path.

Indeed.
 

^ permalink raw reply	[flat|nested] 123+ messages in thread

* Re: [PATCH] hrtimer: Replace cpu_base->active_bases with a direct check of the active list
  2015-04-09  9:18               ` Thomas Gleixner
  2015-04-09  9:31                 ` Peter Zijlstra
@ 2015-04-13  5:53                 ` Preeti U Murthy
  2015-04-13  7:53                   ` Thomas Gleixner
  1 sibling, 1 reply; 123+ messages in thread
From: Preeti U Murthy @ 2015-04-13  5:53 UTC (permalink / raw)
  To: Thomas Gleixner, Peter Zijlstra
  Cc: Ingo Molnar, Viresh Kumar, Ingo Molnar, linaro-kernel, linux-kernel

On 04/09/2015 02:48 PM, Thomas Gleixner wrote:
> On Thu, 9 Apr 2015, Peter Zijlstra wrote:
>> On Thu, Apr 09, 2015 at 09:20:39AM +0200, Ingo Molnar wrote:
>>> if at least one base is active (on my fairly standard system all cpus 
>>> have at least one active hrtimer base all the time - and many cpus 
>>> have two bases active), then we run hrtimer_get_softirq_time(), which 
>>> dirties the cachelines of all 4 clock bases:
>>>
>>>         base->clock_base[HRTIMER_BASE_REALTIME].softirq_time = xtim;
>>>         base->clock_base[HRTIMER_BASE_MONOTONIC].softirq_time = mono;
>>>         base->clock_base[HRTIMER_BASE_BOOTTIME].softirq_time = boot;
>>>         base->clock_base[HRTIMER_BASE_TAI].softirq_time = tai;
>>>
>>> so in practice we not only touch every cacheline in every timer 
>>> interrupt, but we _dirty_ them, even the inactive ones.
>>>
>>
>> Urgh we should really _really_ kill that entire softirq mess.
> 
> That's the !highres part. We cannot kill that one unless we remove all
> support for machines which do not provide hardware for highres
> support.
> 
> Now the softirq_time thing is an optimization which we added back in
> the days when hrtimer went into the tree and Roman complained about
> the base->get_time() invocation being overkill.
> 
> The reasoning behind this was that low resolution systems do not need
> accurate time for the expiry and the forwarding because everything
> happens tick aligned.
> 
> So for !HIGHRES we have:
> 
> static inline ktime_t hrtimer_cb_get_time(struct hrtimer *timer)
> {
> 	return timer->base->softirq_time;
> }

Why is this called softirq_time when the hrtimer is being serviced in
the hard irq context?

Regards
Preeti U Murthy


^ permalink raw reply	[flat|nested] 123+ messages in thread

* Re: [PATCH] hrtimer: Replace cpu_base->active_bases with a direct check of the active list
  2015-04-13  5:53                 ` Preeti U Murthy
@ 2015-04-13  7:53                   ` Thomas Gleixner
  0 siblings, 0 replies; 123+ messages in thread
From: Thomas Gleixner @ 2015-04-13  7:53 UTC (permalink / raw)
  To: Preeti U Murthy
  Cc: Peter Zijlstra, Ingo Molnar, Viresh Kumar, Ingo Molnar,
	linaro-kernel, linux-kernel

On Mon, 13 Apr 2015, Preeti U Murthy wrote:
> On 04/09/2015 02:48 PM, Thomas Gleixner wrote:
> > On Thu, 9 Apr 2015, Peter Zijlstra wrote:
> >> On Thu, Apr 09, 2015 at 09:20:39AM +0200, Ingo Molnar wrote:
> >>> if at least one base is active (on my fairly standard system all cpus 
> >>> have at least one active hrtimer base all the time - and many cpus 
> >>> have two bases active), then we run hrtimer_get_softirq_time(), which 
> >>> dirties the cachelines of all 4 clock bases:
> >>>
> >>>         base->clock_base[HRTIMER_BASE_REALTIME].softirq_time = xtim;
> >>>         base->clock_base[HRTIMER_BASE_MONOTONIC].softirq_time = mono;
> >>>         base->clock_base[HRTIMER_BASE_BOOTTIME].softirq_time = boot;
> >>>         base->clock_base[HRTIMER_BASE_TAI].softirq_time = tai;
> >>>
> >>> so in practice we not only touch every cacheline in every timer 
> >>> interrupt, but we _dirty_ them, even the inactive ones.
> >>>
> >>
> >> Urgh we should really _really_ kill that entire softirq mess.
> > 
> > That's the !highres part. We cannot kill that one unless we remove all
> > support for machines which do not provide hardware for highres
> > support.
> > 
> > Now the softirq_time thing is an optimization which we added back in
> > the days when hrtimer went into the tree and Roman complained about
> > the base->get_time() invocation being overkill.
> > 
> > The reasoning behind this was that low resolution systems do not need
> > accurate time for the expiry and the forwarding because everything
> > happens tick aligned.
> > 
> > So for !HIGHRES we have:
> > 
> > static inline ktime_t hrtimer_cb_get_time(struct hrtimer *timer)
> > {
> > 	return timer->base->softirq_time;
> > }
> 
> Why is this called softirq_time when the hrtimer is being serviced in
> the hard irq context?

For historical reasons.

Thanks,

	tglx


* [patch 00/39] hrtimer/tick: Optimizations, cleanups and solutions for various issues
@ 2015-04-14 21:08 Thomas Gleixner
  2015-04-14 21:08 ` [patch 01/39] hrtimer: Update active_bases before calling hrtimer_force_reprogram() Thomas Gleixner
                   ` (38 more replies)
  0 siblings, 39 replies; 123+ messages in thread
From: Thomas Gleixner @ 2015-04-14 21:08 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Preeti U Murthy, Viresh Kumar,
	Marcelo Tosatti, Frederic Weisbecker

When I returned from my break I got offended by a pile of patches
which kill the patient with the cure.

The issues at hand:

    - NOHZ: Get rid of the softirq invocation

    - hrtimer: Use the active_bases field in order to avoid evaluating
      inactive bases

    - hrtimer: Cache footprint issues

Aside from that, Peter and I have been discussing for a long time how
to get rid of the hrtimer softirq.

After staring at all of it for quite some time it occurred to me that
all issues are related in one way or the other. So I sat down and
reworked the code in various ways:

   - Reduce the data size, so the hrtimer clock bases can be made
     cache line aligned.

   - Consolidate everything on the high resolution timer
     implementation and get rid of dubious optimizations for the non
     highres case which bloat code and data.

   - Implement the active_bases mechanism properly and avoid touching
     inactive hrtimer clock bases. This includes a conditional update
     mechanism for the hrtimer clock offsets, which updates them only
     when they changed. That happens seldom enough, so we no longer
     pollute 4 cache lines in every tick/hrtimer interrupt.

   - Get rid of the softirq deferment and simply enforce a hrtimer
     interrupt when the timer was already expired. This allows us to
     remove the ugly __hrtimer_start_range_ns() interface and to
     clean up the usage sites (sched/perf). As a consequence this also
     gets rid of the forward loops in the tick nohz code.

   - Analogous to the hrtimer enforcement, force a tick interrupt for
     NOHZ non highres systems when the forwarding code tries to fire
     an expired tick. This allows us to get rid of the softirq
     invocation in the NOHZ code.

   - A cleanup of the code which evaluates the next timer event: Use
     nsec based calculations instead of the jiffy magic. That makes it
     actually readable by some definition of readable.

   - While doing the above I had to audit quite a few usage sites of
     various hrtimer interfaces, which revealed some entertaining
     bugs. The fixes have been posted in a separate series
     already. Some other bogosities have been removed as part of this
     series.
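
To illustrate the active_bases item in the list above, here is a
minimal userspace sketch of the idea, not the kernel code. The names
NR_BASES, NO_TIMER, struct cpu_base and next_expiry() are made up for
this example; the point is only that bases whose bit is clear are
never read.

```c
#include <stdint.h>

#define NR_BASES 4			/* stands in for HRTIMER_MAX_CLOCK_BASES */
#define NO_TIMER INT64_MAX		/* hypothetical "nothing queued" marker */

/* Hypothetical, simplified per-cpu state: one bit and one expiry per base */
struct cpu_base {
	unsigned int active_bases;	/* bit n set: base n has queued timers */
	int64_t next[NR_BASES];		/* earliest expiry of each base, in ns */
};

/* Return the earliest expiry, looking only at bases whose bit is set */
static int64_t next_expiry(const struct cpu_base *cb)
{
	unsigned int active = cb->active_bases;
	int64_t min = NO_TIMER;

	while (active) {
		unsigned int idx = 0;

		/* Find the lowest set bit by a plain linear scan */
		while (!(active & (1U << idx)))
			idx++;
		active &= ~(1U << idx);

		/* Only now is the data of this base touched */
		if (cb->next[idx] < min)
			min = cb->next[idx];
	}
	return min;
}
```

With active_bases == 0 the loop body never runs, so an idle cpu does
not read any of the per base data in this sketch.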

The total change size of this overhaul is:

   37 files changed, 515 insertions(+), 794 deletions(-)

The resulting text size of hrtimers.o shrinks in the range of 8-10%
depending on the architecture.

The cache footprint of the hrtimer per cpu data shrinks as well.

          x8664   i386    ARM   ARM64   power64
Before:     328    248    280     328       328   Bytes
              6      4      5       6         6   cache lines (64byte)

After:      320    192    192     320       320   Bytes
              5      3      3       5         5   cache lines (64byte)

Note that the new code avoids touching the inactive clock bases, which
are now cache line aligned, and therefore reduces the cache footprint
in normal usage scenarios significantly.
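
The cache line counts in the table above follow from rounding the byte
size up to the next multiple of 64. A small sketch of that arithmetic,
assuming the data starts on a cache line boundary (cache_lines() and
CACHE_LINE_BYTES are names chosen for this example only):

```c
#include <stddef.h>

#define CACHE_LINE_BYTES 64

/* Number of 64 byte cache lines spanned by an aligned object of @bytes */
static size_t cache_lines(size_t bytes)
{
	return (bytes + CACHE_LINE_BYTES - 1) / CACHE_LINE_BYTES;
}
```

For instance 328 bytes need 6 lines while 320 bytes fit into 5, which
is the x8664 row of the table.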

I did some perf measurements on an isolated core running

  - hrtimer centric workloads
  - idle scenarios with periodic wakeups of various lengths

The patches reduce the number of instructions executed during the test
runs by 2.5 to 6%, depending on the scenario, and the cache misses by
3 to 8%.

For your convenience this series is also available at:

   git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git timers/wip

   Note: The branch is temporary and not meant as a base for other work.

Thanks,

	tglx
---
 arch/x86/kernel/cpu/perf_event_intel_rapl.c   |    5 
 arch/x86/kernel/cpu/perf_event_intel_uncore.c |    5 
 drivers/power/reset/ltc2952-poweroff.c        |   18 
 drivers/staging/ozwpan/ozpd.c                 |    8 
 include/linux/alarmtimer.h                    |    4 
 include/linux/hrtimer.h                       |  101 ++---
 include/linux/interrupt.h                     |    7 
 include/linux/rcupdate.h                      |    6 
 include/linux/rcutree.h                       |    2 
 include/linux/timekeeper_internal.h           |    2 
 include/linux/timer.h                         |    7 
 include/linux/timerqueue.h                    |    8 
 include/trace/events/irq.h                    |    1 
 kernel/events/core.c                          |    9 
 kernel/futex.c                                |    5 
 kernel/locking/rtmutex.c                      |    5 
 kernel/rcu/tree_plugin.h                      |   14 
 kernel/sched/core.c                           |   28 -
 kernel/sched/deadline.c                       |   12 
 kernel/sched/fair.c                           |    2 
 kernel/softirq.c                              |    2 
 kernel/time/alarmtimer.c                      |   17 
 kernel/time/hrtimer.c                         |  525 +++++++++-----------------
 kernel/time/posix-timers.c                    |   17 
 kernel/time/tick-broadcast-hrtimer.c          |    8 
 kernel/time/tick-internal.h                   |    2 
 kernel/time/tick-sched.c                      |  288 +++++---------
 kernel/time/tick-sched.h                      |    2 
 kernel/time/timekeeping.c                     |   55 --
 kernel/time/timekeeping.h                     |   10 
 kernel/time/timer.c                           |   79 +--
 kernel/time/timer_list.c                      |   14 
 lib/timerqueue.c                              |   10 
 net/core/pktgen.c                             |    2 
 net/sched/sch_api.c                           |    5 
 sound/core/hrtimer.c                          |    9 
 sound/drivers/pcsp/pcsp.c                     |   15 
 37 files changed, 515 insertions(+), 794 deletions(-)





* [patch 01/39] hrtimer: Update active_bases before calling hrtimer_force_reprogram()
  2015-04-14 21:08 [patch 00/39] hrtimer/tick: Optimizations, cleanups and solutions for various issues Thomas Gleixner
@ 2015-04-14 21:08 ` Thomas Gleixner
  2015-04-14 21:08 ` [patch 02/39] hrtimer: Get rid of the resolution field in hrtimer_clock_base Thomas Gleixner
                   ` (37 subsequent siblings)
  38 siblings, 0 replies; 123+ messages in thread
From: Thomas Gleixner @ 2015-04-14 21:08 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Preeti U Murthy, Viresh Kumar,
	Marcelo Tosatti, Frederic Weisbecker, linaro-kernel

[-- Attachment #1: hrtimer-move-update-active-bases.patch --]
[-- Type: text/plain, Size: 1994 bytes --]

From: Viresh Kumar <viresh.kumar@linaro.org>

'active_bases' indicates which clock bases have active timers. The
intention of this bit field was to avoid evaluating inactive bases. It
was introduced together with the BOOTTIME and TAI clock bases, but it
was never brought into full use.

We want to use it now, but in __remove_hrtimer() the update happens
after the call to hrtimer_force_reprogram(), which has to evaluate all
clock bases for the next expiring timer. So in case the last timer of
a clock base got removed, we still see the active bit and therefore
evaluate the clock base for no benefit. There are further optimizations
possible when active_bases is updated in the right place.

Move the update before the call to hrtimer_force_reprogram().

[ tglx: Massaged changelog ]

Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Cc: linaro-kernel@lists.linaro.org
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/c7c8ebcd9ed88bb09d76059c745a1fafb48314e7.1428039899.git.viresh.kumar@linaro.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/time/hrtimer.c |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

Index: tip/kernel/time/hrtimer.c
===================================================================
--- tip.orig/kernel/time/hrtimer.c
+++ tip/kernel/time/hrtimer.c
@@ -879,6 +879,9 @@ static void __remove_hrtimer(struct hrti
 
 	next_timer = timerqueue_getnext(&base->active);
 	timerqueue_del(&base->active, &timer->node);
+	if (!timerqueue_getnext(&base->active))
+		base->cpu_base->active_bases &= ~(1 << base->index);
+
 	if (&timer->node == next_timer) {
 #ifdef CONFIG_HIGH_RES_TIMERS
 		/* Reprogram the clock event device. if enabled */
@@ -892,8 +895,6 @@ static void __remove_hrtimer(struct hrti
 		}
 #endif
 	}
-	if (!timerqueue_getnext(&base->active))
-		base->cpu_base->active_bases &= ~(1 << base->index);
 out:
 	timer->state = newstate;
 }
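
The effect of the reordering can be modelled in userspace. This is a
sketch under simplifying assumptions, not the kernel implementation:
struct cpu_base, bases_to_evaluate() and remove_timer() are
hypothetical names, timers are reduced to a per base counter, and the
reprogram step is assumed to run on every removal and to look only at
bases whose active bit is set.

```c
#include <stdbool.h>

#define NR_BASES 4

/* Hypothetical model: per base timer counts plus the active_bases mask */
struct cpu_base {
	unsigned int active_bases;
	unsigned int nr_timers[NR_BASES];
};

/* Stand-in for the reprogram path: count the bases it would evaluate */
static unsigned int bases_to_evaluate(const struct cpu_base *cb)
{
	unsigned int i, n = 0;

	for (i = 0; i < NR_BASES; i++)
		if (cb->active_bases & (1U << i))
			n++;
	return n;
}

/*
 * Remove one timer from base @idx (which must have at least one) and
 * return how many bases the reprogram step sees. @update_first selects
 * the ordering after the patch (true) or before it (false).
 */
static unsigned int remove_timer(struct cpu_base *cb, unsigned int idx,
				 bool update_first)
{
	unsigned int seen;

	cb->nr_timers[idx]--;

	/* New ordering: clear the bit before reprogramming */
	if (update_first && !cb->nr_timers[idx])
		cb->active_bases &= ~(1U << idx);

	seen = bases_to_evaluate(cb);

	/* Old ordering: the bit is cleared too late for the reprogram step */
	if (!update_first && !cb->nr_timers[idx])
		cb->active_bases &= ~(1U << idx);

	return seen;
}
```

Both orderings end with the same mask. Only the new one keeps the
reprogram step from looking at the base that just became empty.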




* [patch 02/39] hrtimer: Get rid of the resolution field in hrtimer_clock_base
  2015-04-14 21:08 [patch 00/39] hrtimer/tick: Optimizations, cleanups and solutions for various issues Thomas Gleixner
  2015-04-14 21:08 ` [patch 01/39] hrtimer: Update active_bases before calling hrtimer_force_reprogram() Thomas Gleixner
@ 2015-04-14 21:08 ` Thomas Gleixner
  2015-04-15  6:29   ` Frans Klaver
                     ` (2 more replies)
  2015-04-14 21:08 ` [patch 03/39] net: sched: Use hrtimer_resolution instead of hrtimer_get_res() Thomas Gleixner
                   ` (36 subsequent siblings)
  38 siblings, 3 replies; 123+ messages in thread
From: Thomas Gleixner @ 2015-04-14 21:08 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Preeti U Murthy, Viresh Kumar,
	Marcelo Tosatti, Frederic Weisbecker

[-- Attachment #1: hrtimer-get-rid-of-resolution.patch --]
[-- Type: text/plain, Size: 5431 bytes --]

The field has no value because all clock bases have the same
resolution. The resolution only changes when we switch to high
resolution timer mode. We can evaluate that from a single static
variable as well. In the !HIGHRES case it's simply a constant.

Export the variable, so we can simplify the usage sites.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/hrtimer.h  |    6 ++++--
 kernel/time/hrtimer.c    |   26 +++++++++-----------------
 kernel/time/timer_list.c |    8 ++++----
 3 files changed, 17 insertions(+), 23 deletions(-)

Index: tip/include/linux/hrtimer.h
===================================================================
--- tip.orig/include/linux/hrtimer.h
+++ tip/include/linux/hrtimer.h
@@ -137,7 +137,6 @@ struct hrtimer_sleeper {
  *			timer to a base on another cpu.
  * @clockid:		clock id for per_cpu support
  * @active:		red black tree root node for the active timers
- * @resolution:		the resolution of the clock, in nanoseconds
  * @get_time:		function to retrieve the current time of the clock
  * @softirq_time:	the time when running the hrtimer queue in the softirq
  * @offset:		offset of this clock to the monotonic base
@@ -147,7 +146,6 @@ struct hrtimer_clock_base {
 	int			index;
 	clockid_t		clockid;
 	struct timerqueue_head	active;
-	ktime_t			resolution;
 	ktime_t			(*get_time)(void);
 	ktime_t			softirq_time;
 	ktime_t			offset;
@@ -295,11 +293,15 @@ extern void hrtimer_peek_ahead_timers(vo
 
 extern void clock_was_set_delayed(void);
 
+extern unsigned int hrtimer_resolution;
+
 #else
 
 # define MONOTONIC_RES_NSEC	LOW_RES_NSEC
 # define KTIME_MONOTONIC_RES	KTIME_LOW_RES
 
+#define hrtimer_resolution	LOW_RES_NSEC
+
 static inline void hrtimer_peek_ahead_timers(void) { }
 
 /*
Index: tip/kernel/time/hrtimer.c
===================================================================
--- tip.orig/kernel/time/hrtimer.c
+++ tip/kernel/time/hrtimer.c
@@ -66,7 +66,6 @@
  */
 DEFINE_PER_CPU(struct hrtimer_cpu_base, hrtimer_bases) =
 {
-
 	.lock = __RAW_SPIN_LOCK_UNLOCKED(hrtimer_bases.lock),
 	.clock_base =
 	{
@@ -74,25 +73,21 @@ DEFINE_PER_CPU(struct hrtimer_cpu_base,
 			.index = HRTIMER_BASE_MONOTONIC,
 			.clockid = CLOCK_MONOTONIC,
 			.get_time = &ktime_get,
-			.resolution = KTIME_LOW_RES,
 		},
 		{
 			.index = HRTIMER_BASE_REALTIME,
 			.clockid = CLOCK_REALTIME,
 			.get_time = &ktime_get_real,
-			.resolution = KTIME_LOW_RES,
 		},
 		{
 			.index = HRTIMER_BASE_BOOTTIME,
 			.clockid = CLOCK_BOOTTIME,
 			.get_time = &ktime_get_boottime,
-			.resolution = KTIME_LOW_RES,
 		},
 		{
 			.index = HRTIMER_BASE_TAI,
 			.clockid = CLOCK_TAI,
 			.get_time = &ktime_get_clocktai,
-			.resolution = KTIME_LOW_RES,
 		},
 	}
 };
@@ -478,6 +473,8 @@ static ktime_t __hrtimer_get_next_event(
  * High resolution timer enabled ?
  */
 static int hrtimer_hres_enabled __read_mostly  = 1;
+unsigned int hrtimer_resolution __read_mostly = LOW_RES_NSEC;
+EXPORT_SYMBOL_GPL(hrtimer_resolution);
 
 /*
  * Enable / Disable high resolution mode
@@ -660,7 +657,7 @@ static void retrigger_next_event(void *a
  */
 static int hrtimer_switch_to_hres(void)
 {
-	int i, cpu = smp_processor_id();
+	int cpu = smp_processor_id();
 	struct hrtimer_cpu_base *base = &per_cpu(hrtimer_bases, cpu);
 	unsigned long flags;
 
@@ -676,8 +673,7 @@ static int hrtimer_switch_to_hres(void)
 		return 0;
 	}
 	base->hres_active = 1;
-	for (i = 0; i < HRTIMER_MAX_CLOCK_BASES; i++)
-		base->clock_base[i].resolution = KTIME_HIGH_RES;
+	hrtimer_resolution = HIGH_RES_NSEC;
 
 	tick_setup_sched_timer();
 	/* "Retrigger" the interrupt to get things going */
@@ -812,8 +808,8 @@ u64 hrtimer_forward(struct hrtimer *time
 	if (delta.tv64 < 0)
 		return 0;
 
-	if (interval.tv64 < timer->base->resolution.tv64)
-		interval.tv64 = timer->base->resolution.tv64;
+	if (interval.tv64 < hrtimer_resolution)
+		interval.tv64 = hrtimer_resolution;
 
 	if (unlikely(delta.tv64 >= interval.tv64)) {
 		s64 incr = ktime_to_ns(interval);
@@ -955,7 +951,7 @@ int __hrtimer_start_range_ns(struct hrti
 		 * timeouts. This will go away with the GTOD framework.
 		 */
 #ifdef CONFIG_TIME_LOW_RES
-		tim = ktime_add_safe(tim, base->resolution);
+		tim = ktime_add_safe(tim, ktime_set(0, hrtimer_resolution));
 #endif
 	}
 
@@ -1185,12 +1181,8 @@ EXPORT_SYMBOL_GPL(hrtimer_init);
  */
 int hrtimer_get_res(const clockid_t which_clock, struct timespec *tp)
 {
-	struct hrtimer_cpu_base *cpu_base;
-	int base = hrtimer_clockid_to_base(which_clock);
-
-	cpu_base = raw_cpu_ptr(&hrtimer_bases);
-	*tp = ktime_to_timespec(cpu_base->clock_base[base].resolution);
-
+	tp->tv_sec = 0;
+	tp->tv_nsec = hrtimer_resolution;
 	return 0;
 }
 EXPORT_SYMBOL_GPL(hrtimer_get_res);
Index: tip/kernel/time/timer_list.c
===================================================================
--- tip.orig/kernel/time/timer_list.c
+++ tip/kernel/time/timer_list.c
@@ -120,10 +120,10 @@ static void
 print_base(struct seq_file *m, struct hrtimer_clock_base *base, u64 now)
 {
 	SEQ_printf(m, "  .base:       %pK\n", base);
-	SEQ_printf(m, "  .index:      %d\n",
-			base->index);
-	SEQ_printf(m, "  .resolution: %Lu nsecs\n",
-			(unsigned long long)ktime_to_ns(base->resolution));
+	SEQ_printf(m, "  .index:      %d\n", base->index);
+
+	SEQ_printf(m, "  .resolution: %u nsecs\n", (unsigned) hrtimer_resolution);
+
 	SEQ_printf(m,   "  .get_time:   ");
 	print_name_offset(m, base->get_time);
 	SEQ_printf(m,   "\n");
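
For readers who want the hrtimer_forward() hunk in isolation: with a
single resolution value the clamp boils down to the comparison
below. This is a userspace sketch; resolution_ns and clamp_interval()
are stand-in names, and the initial value of 4000000 ns is an assumed
low resolution period picked for illustration only.

```c
#include <stdint.h>

/* Assumed value for illustration; the real one depends on the config */
static unsigned int resolution_ns = 4000000;

/* Clamp a rearm interval so it is never shorter than the resolution */
static int64_t clamp_interval(int64_t interval_ns)
{
	if (interval_ns < resolution_ns)
		interval_ns = resolution_ns;
	return interval_ns;
}
```

Once the variable is switched to a high resolution value such as 1,
short intervals pass through unchanged.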




* [patch 03/39] net: sched: Use hrtimer_resolution instead of hrtimer_get_res()
  2015-04-14 21:08 [patch 00/39] hrtimer/tick: Optimizations, cleanups and solutions for various issues Thomas Gleixner
  2015-04-14 21:08 ` [patch 01/39] hrtimer: Update active_bases before calling hrtimer_force_reprogram() Thomas Gleixner
  2015-04-14 21:08 ` [patch 02/39] hrtimer: Get rid of the resolution field in hrtimer_clock_base Thomas Gleixner
@ 2015-04-14 21:08 ` Thomas Gleixner
  2015-04-16 16:04   ` David Miller
  2015-04-22 19:05   ` [tip:timers/core] " tip-bot for Thomas Gleixner
  2015-04-14 21:08 ` [patch 04/39] sound: " Thomas Gleixner
                   ` (35 subsequent siblings)
  38 siblings, 2 replies; 123+ messages in thread
From: Thomas Gleixner @ 2015-04-14 21:08 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Preeti U Murthy, Viresh Kumar,
	Marcelo Tosatti, Frederic Weisbecker, Jamal Hadi Salim,
	David S. Miller, netdev

[-- Attachment #1: net-sched-use-hrtimer-resolution.patch --]
[-- Type: text/plain, Size: 986 bytes --]

No point in converting a timespec now that the value is directly
accessible.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: netdev@vger.kernel.org
---

@David: Depends on a previous patch in this series

 net/sched/sch_api.c |    5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

Index: tip/net/sched/sch_api.c
===================================================================
--- tip.orig/net/sched/sch_api.c
+++ tip/net/sched/sch_api.c
@@ -1879,13 +1879,10 @@ EXPORT_SYMBOL(tcf_destroy_chain);
 #ifdef CONFIG_PROC_FS
 static int psched_show(struct seq_file *seq, void *v)
 {
-	struct timespec ts;
-
-	hrtimer_get_res(CLOCK_MONOTONIC, &ts);
 	seq_printf(seq, "%08x %08x %08x %08x\n",
 		   (u32)NSEC_PER_USEC, (u32)PSCHED_TICKS2NS(1),
 		   1000000,
-		   (u32)NSEC_PER_SEC/(u32)ktime_to_ns(timespec_to_ktime(ts)));
+		   (u32)NSEC_PER_SEC / hrtimer_resolution);
 
 	return 0;
 }
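
As a worked example of the last value printed by psched_show(): it is
the number of resolution periods per second. The sketch below is
userspace code with a made-up helper name, psched_clock_res(), and
the sample resolutions are assumptions (1 ns for high resolution mode,
4000000 ns as a plausible low resolution period).

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000L

/* Resolution periods per second; @resolution_ns must be non-zero */
static uint32_t psched_clock_res(unsigned int resolution_ns)
{
	return (uint32_t)NSEC_PER_SEC / resolution_ns;
}
```

With a resolution of 4000000 ns this gives 250, and with 1 ns it
gives the full 1000000000.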




* [patch 04/39] sound: Use hrtimer_resolution instead of hrtimer_get_res()
  2015-04-14 21:08 [patch 00/39] hrtimer/tick: Optimizations, cleanups and solutions for various issues Thomas Gleixner
                   ` (2 preceding siblings ...)
  2015-04-14 21:08 ` [patch 03/39] net: sched: Use hrtimer_resolution instead of hrtimer_get_res() Thomas Gleixner
@ 2015-04-14 21:08 ` Thomas Gleixner
  2015-04-16  8:07   ` Takashi Iwai
  2015-04-22 19:05   ` [tip:timers/core] " tip-bot for Thomas Gleixner
  2015-04-14 21:08 ` [patch 05/39] hrtimer: Get rid " Thomas Gleixner
                   ` (34 subsequent siblings)
  38 siblings, 2 replies; 123+ messages in thread
From: Thomas Gleixner @ 2015-04-14 21:08 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Preeti U Murthy, Viresh Kumar,
	Marcelo Tosatti, Frederic Weisbecker, Jaroslav Kysela,
	Takashi Iwai, alsa-devel

[-- Attachment #1: sound-use-hrtimer-resolution.patch --]
[-- Type: text/plain, Size: 2536 bytes --]

No point in converting a timespec now that the value is directly
accessible. Get rid of the null check while at it. Resolution is
guaranteed to be > 0.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jaroslav Kysela <perex@perex.cz>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: alsa-devel@alsa-project.org
---
 sound/core/hrtimer.c      |    9 +--------
 sound/drivers/pcsp/pcsp.c |   15 ++++++---------
 2 files changed, 7 insertions(+), 17 deletions(-)

Index: tip/sound/core/hrtimer.c
===================================================================
--- tip.orig/sound/core/hrtimer.c
+++ tip/sound/core/hrtimer.c
@@ -121,16 +121,9 @@ static struct snd_timer *mytimer;
 static int __init snd_hrtimer_init(void)
 {
 	struct snd_timer *timer;
-	struct timespec tp;
 	int err;
 
-	hrtimer_get_res(CLOCK_MONOTONIC, &tp);
-	if (tp.tv_sec > 0 || !tp.tv_nsec) {
-		pr_err("snd-hrtimer: Invalid resolution %u.%09u",
-			   (unsigned)tp.tv_sec, (unsigned)tp.tv_nsec);
-		return -EINVAL;
-	}
-	resolution = tp.tv_nsec;
+	resolution = hrtimer_resolution;
 
 	/* Create a new timer and set up the fields */
 	err = snd_timer_global_new("hrtimer", SNDRV_TIMER_GLOBAL_HRTIMER,
Index: tip/sound/drivers/pcsp/pcsp.c
===================================================================
--- tip.orig/sound/drivers/pcsp/pcsp.c
+++ tip/sound/drivers/pcsp/pcsp.c
@@ -42,16 +42,13 @@ struct snd_pcsp pcsp_chip;
 static int snd_pcsp_create(struct snd_card *card)
 {
 	static struct snd_device_ops ops = { };
-	struct timespec tp;
-	int err;
-	int div, min_div, order;
-
-	hrtimer_get_res(CLOCK_MONOTONIC, &tp);
+	unsigned int resolution = hrtimer_resolution;
+	int err, div, min_div, order;
 
 	if (!nopcm) {
-		if (tp.tv_sec || tp.tv_nsec > PCSP_MAX_PERIOD_NS) {
+		if (resolution > PCSP_MAX_PERIOD_NS) {
 			printk(KERN_ERR "PCSP: Timer resolution is not sufficient "
-				"(%linS)\n", tp.tv_nsec);
+				"(%linS)\n", resolution);
 			printk(KERN_ERR "PCSP: Make sure you have HPET and ACPI "
 				"enabled.\n");
 			printk(KERN_ERR "PCSP: Turned into nopcm mode.\n");
@@ -59,13 +56,13 @@ static int snd_pcsp_create(struct snd_ca
 		}
 	}
 
-	if (loops_per_jiffy >= PCSP_MIN_LPJ && tp.tv_nsec <= PCSP_MIN_PERIOD_NS)
+	if (loops_per_jiffy >= PCSP_MIN_LPJ && resolution <= PCSP_MIN_PERIOD_NS)
 		min_div = MIN_DIV;
 	else
 		min_div = MAX_DIV;
 #if PCSP_DEBUG
 	printk(KERN_DEBUG "PCSP: lpj=%li, min_div=%i, res=%li\n",
-	       loops_per_jiffy, min_div, tp.tv_nsec);
+	       loops_per_jiffy, min_div, resolution);
 #endif
 
 	div = MAX_DIV / min_div;




* [patch 05/39] hrtimer: Get rid of hrtimer_get_res()
  2015-04-14 21:08 [patch 00/39] hrtimer/tick: Optimizations, cleanups and solutions for various issues Thomas Gleixner
                   ` (3 preceding siblings ...)
  2015-04-14 21:08 ` [patch 04/39] sound: " Thomas Gleixner
@ 2015-04-14 21:08 ` Thomas Gleixner
  2015-04-22 19:05   ` [tip:timers/core] " tip-bot for Thomas Gleixner
  2015-04-14 21:08 ` [patch 06/39] hrtimer: Make the statistics fields smaller Thomas Gleixner
                   ` (33 subsequent siblings)
  38 siblings, 1 reply; 123+ messages in thread
From: Thomas Gleixner @ 2015-04-14 21:08 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Preeti U Murthy, Viresh Kumar,
	Marcelo Tosatti, Frederic Weisbecker

[-- Attachment #1: hrtimer-simplify-hrtimer-get-res.patch --]
[-- Type: text/plain, Size: 4511 bytes --]

The resolution is directly accessible now. So it's simpler just to fill
in the values of the timespec and be done with it.

Text size reduction (combined with "hrtimer: Get rid of the resolution
field in hrtimer_clock_base"):
       x8664 -61, i386 -221, ARM -60, power64 -48

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/hrtimer.h    |    1 -
 kernel/time/alarmtimer.c   |    6 +++---
 kernel/time/hrtimer.c      |   16 ----------------
 kernel/time/posix-timers.c |   17 ++++++++++++-----
 4 files changed, 15 insertions(+), 25 deletions(-)

Index: tip/include/linux/hrtimer.h
===================================================================
--- tip.orig/include/linux/hrtimer.h
+++ tip/include/linux/hrtimer.h
@@ -385,7 +385,6 @@ static inline int hrtimer_restart(struct
 
 /* Query timers: */
 extern ktime_t hrtimer_get_remaining(const struct hrtimer *timer);
-extern int hrtimer_get_res(const clockid_t which_clock, struct timespec *tp);
 
 extern ktime_t hrtimer_get_next_event(void);
 
Index: tip/kernel/time/alarmtimer.c
===================================================================
--- tip.orig/kernel/time/alarmtimer.c
+++ tip/kernel/time/alarmtimer.c
@@ -495,12 +495,12 @@ static enum alarmtimer_restart alarm_han
  */
 static int alarm_clock_getres(const clockid_t which_clock, struct timespec *tp)
 {
-	clockid_t baseid = alarm_bases[clock2alarm(which_clock)].base_clockid;
-
 	if (!alarmtimer_get_rtcdev())
 		return -EINVAL;
 
-	return hrtimer_get_res(baseid, tp);
+	tp->tv_sec = 0;
+	tp->tv_nsec = hrtimer_resolution;
+	return 0;
 }
 
 /**
Index: tip/kernel/time/hrtimer.c
===================================================================
--- tip.orig/kernel/time/hrtimer.c
+++ tip/kernel/time/hrtimer.c
@@ -1171,22 +1171,6 @@ void hrtimer_init(struct hrtimer *timer,
 }
 EXPORT_SYMBOL_GPL(hrtimer_init);
 
-/**
- * hrtimer_get_res - get the timer resolution for a clock
- * @which_clock: which clock to query
- * @tp:		 pointer to timespec variable to store the resolution
- *
- * Store the resolution of the clock selected by @which_clock in the
- * variable pointed to by @tp.
- */
-int hrtimer_get_res(const clockid_t which_clock, struct timespec *tp)
-{
-	tp->tv_sec = 0;
-	tp->tv_nsec = hrtimer_resolution;
-	return 0;
-}
-EXPORT_SYMBOL_GPL(hrtimer_get_res);
-
 static void __run_hrtimer(struct hrtimer *timer, ktime_t *now)
 {
 	struct hrtimer_clock_base *base = timer->base;
Index: tip/kernel/time/posix-timers.c
===================================================================
--- tip.orig/kernel/time/posix-timers.c
+++ tip/kernel/time/posix-timers.c
@@ -272,13 +272,20 @@ static int posix_get_tai(clockid_t which
 	return 0;
 }
 
+static int posix_get_hrtimer_res(clockid_t which_clock, struct timespec *tp)
+{
+	tp->tv_sec = 0;
+	tp->tv_nsec = hrtimer_resolution;
+	return 0;
+}
+
 /*
  * Initialize everything, well, just everything in Posix clocks/timers ;)
  */
 static __init int init_posix_timers(void)
 {
 	struct k_clock clock_realtime = {
-		.clock_getres	= hrtimer_get_res,
+		.clock_getres	= posix_get_hrtimer_res,
 		.clock_get	= posix_clock_realtime_get,
 		.clock_set	= posix_clock_realtime_set,
 		.clock_adj	= posix_clock_realtime_adj,
@@ -290,7 +297,7 @@ static __init int init_posix_timers(void
 		.timer_del	= common_timer_del,
 	};
 	struct k_clock clock_monotonic = {
-		.clock_getres	= hrtimer_get_res,
+		.clock_getres	= posix_get_hrtimer_res,
 		.clock_get	= posix_ktime_get_ts,
 		.nsleep		= common_nsleep,
 		.nsleep_restart	= hrtimer_nanosleep_restart,
@@ -300,7 +307,7 @@ static __init int init_posix_timers(void
 		.timer_del	= common_timer_del,
 	};
 	struct k_clock clock_monotonic_raw = {
-		.clock_getres	= hrtimer_get_res,
+		.clock_getres	= posix_get_hrtimer_res,
 		.clock_get	= posix_get_monotonic_raw,
 	};
 	struct k_clock clock_realtime_coarse = {
@@ -312,7 +319,7 @@ static __init int init_posix_timers(void
 		.clock_get	= posix_get_monotonic_coarse,
 	};
 	struct k_clock clock_tai = {
-		.clock_getres	= hrtimer_get_res,
+		.clock_getres	= posix_get_hrtimer_res,
 		.clock_get	= posix_get_tai,
 		.nsleep		= common_nsleep,
 		.nsleep_restart	= hrtimer_nanosleep_restart,
@@ -322,7 +329,7 @@ static __init int init_posix_timers(void
 		.timer_del	= common_timer_del,
 	};
 	struct k_clock clock_boottime = {
-		.clock_getres	= hrtimer_get_res,
+		.clock_getres	= posix_get_hrtimer_res,
 		.clock_get	= posix_get_boottime,
 		.nsleep		= common_nsleep,
 		.nsleep_restart	= hrtimer_nanosleep_restart,




* [patch 06/39] hrtimer: Make the statistics fields smaller
  2015-04-14 21:08 [patch 00/39] hrtimer/tick: Optimizations, cleanups and solutions for various issues Thomas Gleixner
                   ` (4 preceding siblings ...)
  2015-04-14 21:08 ` [patch 05/39] hrtimer: Get rid " Thomas Gleixner
@ 2015-04-14 21:08 ` Thomas Gleixner
  2015-04-22 19:06   ` [tip:timers/core] " tip-bot for Thomas Gleixner
  2015-04-14 21:08 ` [patch 07/39] hrtimer: Get rid of softirq time Thomas Gleixner
                   ` (32 subsequent siblings)
  38 siblings, 1 reply; 123+ messages in thread
From: Thomas Gleixner @ 2015-04-14 21:08 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Preeti U Murthy, Viresh Kumar,
	Marcelo Tosatti, Frederic Weisbecker

[-- Attachment #1: hrtimer-optimize-cache-foot-print.patch --]
[-- Type: text/plain, Size: 1881 bytes --]

No point in having unsigned long for /proc/timer_list statistics. Make
them unsigned int.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/hrtimer.h  |    8 ++++----
 kernel/time/hrtimer.c    |    4 ++--
 kernel/time/timer_list.c |    2 +-
 3 files changed, 7 insertions(+), 7 deletions(-)

Index: tip/include/linux/hrtimer.h
===================================================================
--- tip.orig/include/linux/hrtimer.h
+++ tip/include/linux/hrtimer.h
@@ -187,10 +187,10 @@ struct hrtimer_cpu_base {
 	int				in_hrtirq;
 	int				hres_active;
 	int				hang_detected;
-	unsigned long			nr_events;
-	unsigned long			nr_retries;
-	unsigned long			nr_hangs;
-	ktime_t				max_hang_time;
+	unsigned int			nr_events;
+	unsigned int			nr_retries;
+	unsigned int			nr_hangs;
+	unsigned int			max_hang_time;
 #endif
 	struct hrtimer_clock_base	clock_base[HRTIMER_MAX_CLOCK_BASES];
 };
Index: tip/kernel/time/hrtimer.c
===================================================================
--- tip.orig/kernel/time/hrtimer.c
+++ tip/kernel/time/hrtimer.c
@@ -1319,8 +1319,8 @@ retry:
 	cpu_base->hang_detected = 1;
 	raw_spin_unlock(&cpu_base->lock);
 	delta = ktime_sub(now, entry_time);
-	if (delta.tv64 > cpu_base->max_hang_time.tv64)
-		cpu_base->max_hang_time = delta;
+	if ((unsigned int)delta.tv64 > cpu_base->max_hang_time)
+		cpu_base->max_hang_time = (unsigned int) delta.tv64;
 	/*
 	 * Limit it to a sensible value as we enforce a longer
 	 * delay. Give the CPU at least 100ms to catch up.
Index: tip/kernel/time/timer_list.c
===================================================================
--- tip.orig/kernel/time/timer_list.c
+++ tip/kernel/time/timer_list.c
@@ -158,7 +158,7 @@ static void print_cpu(struct seq_file *m
 	P(nr_events);
 	P(nr_retries);
 	P(nr_hangs);
-	P_ns(max_hang_time);
+	P(max_hang_time);
 #endif
 #undef P
 #undef P_ns
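
A note on the value range of the narrowed max_hang_time, shown as a
userspace sketch. It mirrors the cast in the hunk above and assumes a
32 bit unsigned int; store_hang_time() is a name invented for this
example. Under that assumption the field can hold up to 4294967295 ns,
roughly 4.29 seconds, and a larger delta keeps only its low 32 bits.

```c
#include <stdint.h>

/* Mirrors the cast in the patch: only the low 32 bits of the delta count */
static unsigned int store_hang_time(unsigned int cur_max, int64_t delta_ns)
{
	if ((unsigned int)delta_ns > cur_max)
		cur_max = (unsigned int)delta_ns;
	return cur_max;
}
```

A delta of 1000000 ns is stored as is, while 5000000000 ns would be
recorded as 705032704 in this model.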




* [patch 07/39] hrtimer: Get rid of softirq time
  2015-04-14 21:08 [patch 00/39] hrtimer/tick: Optimizations, cleanups and solutions for various issues Thomas Gleixner
                   ` (5 preceding siblings ...)
  2015-04-14 21:08 ` [patch 06/39] hrtimer: Make the statistics fields smaller Thomas Gleixner
@ 2015-04-14 21:08 ` Thomas Gleixner
  2015-04-22 19:06   ` [tip:timers/core] " tip-bot for Thomas Gleixner
  2015-04-14 21:08 ` [patch 08/39] hrtimer: Make offset update smarter Thomas Gleixner
                   ` (31 subsequent siblings)
  38 siblings, 1 reply; 123+ messages in thread
From: Thomas Gleixner @ 2015-04-14 21:08 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Preeti U Murthy, Viresh Kumar,
	Marcelo Tosatti, Frederic Weisbecker

[-- Attachment #1: hrtimer-get-rid-of-softirq-time.patch --]
[-- Type: text/plain, Size: 11160 bytes --]

The softirq time field in the clock bases is an optimization from the
early days of hrtimers. It provides a coarse "jiffies" like time
mostly for self rearming timers.

But that comes with a price:
    - Larger code size
    - Extra storage space
    - Duplicated functions with really small differences
   
The benefit of this optimization is marginal for contemporary
systems.

Consolidate everything on the high resolution timer
implementation. This makes further optimizations possible.

Text size reduction:
       x8664 -95, i386 -356, ARM -148, ARM64 -40, power64 -16

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/hrtimer.h   |   24 +------
 kernel/time/hrtimer.c     |  148 ++++++++++++++++++----------------------------
 kernel/time/timekeeping.c |   32 ---------
 kernel/time/timekeeping.h |    3 
 4 files changed, 64 insertions(+), 143 deletions(-)

Index: tip/include/linux/hrtimer.h
===================================================================
--- tip.orig/include/linux/hrtimer.h
+++ tip/include/linux/hrtimer.h
@@ -138,7 +138,6 @@ struct hrtimer_sleeper {
  * @clockid:		clock id for per_cpu support
  * @active:		red black tree root node for the active timers
  * @get_time:		function to retrieve the current time of the clock
- * @softirq_time:	the time when running the hrtimer queue in the softirq
  * @offset:		offset of this clock to the monotonic base
  */
 struct hrtimer_clock_base {
@@ -147,7 +146,6 @@ struct hrtimer_clock_base {
 	clockid_t		clockid;
 	struct timerqueue_head	active;
 	ktime_t			(*get_time)(void);
-	ktime_t			softirq_time;
 	ktime_t			offset;
 };
 
@@ -260,19 +258,16 @@ static inline ktime_t hrtimer_expires_re
 	return ktime_sub(timer->node.expires, timer->base->get_time());
 }
 
-#ifdef CONFIG_HIGH_RES_TIMERS
-struct clock_event_device;
-
-extern void hrtimer_interrupt(struct clock_event_device *dev);
-
-/*
- * In high resolution mode the time reference must be read accurate
- */
 static inline ktime_t hrtimer_cb_get_time(struct hrtimer *timer)
 {
 	return timer->base->get_time();
 }
 
+#ifdef CONFIG_HIGH_RES_TIMERS
+struct clock_event_device;
+
+extern void hrtimer_interrupt(struct clock_event_device *dev);
+
 static inline int hrtimer_is_hres_active(struct hrtimer *timer)
 {
 	return timer->base->cpu_base->hres_active;
@@ -304,15 +299,6 @@ extern unsigned int hrtimer_resolution;
 
 static inline void hrtimer_peek_ahead_timers(void) { }
 
-/*
- * In non high resolution mode the time reference is taken from
- * the base softirq time variable.
- */
-static inline ktime_t hrtimer_cb_get_time(struct hrtimer *timer)
-{
-	return timer->base->softirq_time;
-}
-
 static inline int hrtimer_is_hres_active(struct hrtimer *timer)
 {
 	return 0;
Index: tip/kernel/time/hrtimer.c
===================================================================
--- tip.orig/kernel/time/hrtimer.c
+++ tip/kernel/time/hrtimer.c
@@ -104,27 +104,6 @@ static inline int hrtimer_clockid_to_bas
 	return hrtimer_clock_to_base_table[clock_id];
 }
 
-
-/*
- * Get the coarse grained time at the softirq based on xtime and
- * wall_to_monotonic.
- */
-static void hrtimer_get_softirq_time(struct hrtimer_cpu_base *base)
-{
-	ktime_t xtim, mono, boot, tai;
-	ktime_t off_real, off_boot, off_tai;
-
-	mono = ktime_get_update_offsets_tick(&off_real, &off_boot, &off_tai);
-	boot = ktime_add(mono, off_boot);
-	xtim = ktime_add(mono, off_real);
-	tai = ktime_add(mono, off_tai);
-
-	base->clock_base[HRTIMER_BASE_REALTIME].softirq_time = xtim;
-	base->clock_base[HRTIMER_BASE_MONOTONIC].softirq_time = mono;
-	base->clock_base[HRTIMER_BASE_BOOTTIME].softirq_time = boot;
-	base->clock_base[HRTIMER_BASE_TAI].softirq_time = tai;
-}
-
 /*
  * Functions and macros which are different for UP/SMP systems are kept in a
  * single place
@@ -466,6 +445,15 @@ static ktime_t __hrtimer_get_next_event(
 }
 #endif
 
+static inline ktime_t hrtimer_update_base(struct hrtimer_cpu_base *base)
+{
+	ktime_t *offs_real = &base->clock_base[HRTIMER_BASE_REALTIME].offset;
+	ktime_t *offs_boot = &base->clock_base[HRTIMER_BASE_BOOTTIME].offset;
+	ktime_t *offs_tai = &base->clock_base[HRTIMER_BASE_TAI].offset;
+
+	return ktime_get_update_offsets_now(offs_real, offs_boot, offs_tai);
+}
+
 /* High resolution timer related functions */
 #ifdef CONFIG_HIGH_RES_TIMERS
 
@@ -516,7 +504,12 @@ static inline int hrtimer_hres_active(vo
 static void
 hrtimer_force_reprogram(struct hrtimer_cpu_base *cpu_base, int skip_equal)
 {
-	ktime_t expires_next = __hrtimer_get_next_event(cpu_base);
+	ktime_t expires_next;
+
+	if (!cpu_base->hres_active)
+		return;
+
+	expires_next = __hrtimer_get_next_event(cpu_base);
 
 	if (skip_equal && expires_next.tv64 == cpu_base->expires_next.tv64)
 		return;
@@ -625,15 +618,6 @@ static inline void hrtimer_init_hres(str
 	base->hres_active = 0;
 }
 
-static inline ktime_t hrtimer_update_base(struct hrtimer_cpu_base *base)
-{
-	ktime_t *offs_real = &base->clock_base[HRTIMER_BASE_REALTIME].offset;
-	ktime_t *offs_boot = &base->clock_base[HRTIMER_BASE_BOOTTIME].offset;
-	ktime_t *offs_tai = &base->clock_base[HRTIMER_BASE_TAI].offset;
-
-	return ktime_get_update_offsets_now(offs_real, offs_boot, offs_tai);
-}
-
 /*
  * Retrigger next event is called after clock was set
  *
@@ -1171,10 +1155,10 @@ void hrtimer_init(struct hrtimer *timer,
 }
 EXPORT_SYMBOL_GPL(hrtimer_init);
 
-static void __run_hrtimer(struct hrtimer *timer, ktime_t *now)
+static void __run_hrtimer(struct hrtimer_cpu_base *cpu_base,
+			  struct hrtimer_clock_base *base,
+			  struct hrtimer *timer, ktime_t *now)
 {
-	struct hrtimer_clock_base *base = timer->base;
-	struct hrtimer_cpu_base *cpu_base = base->cpu_base;
 	enum hrtimer_restart (*fn)(struct hrtimer *);
 	int restart;
 
@@ -1211,34 +1195,9 @@ static void __run_hrtimer(struct hrtimer
 	timer->state &= ~HRTIMER_STATE_CALLBACK;
 }
 
-#ifdef CONFIG_HIGH_RES_TIMERS
-
-/*
- * High resolution timer interrupt
- * Called with interrupts disabled
- */
-void hrtimer_interrupt(struct clock_event_device *dev)
+static void __hrtimer_run_queues(struct hrtimer_cpu_base *cpu_base, ktime_t now)
 {
-	struct hrtimer_cpu_base *cpu_base = this_cpu_ptr(&hrtimer_bases);
-	ktime_t expires_next, now, entry_time, delta;
-	int i, retries = 0;
-
-	BUG_ON(!cpu_base->hres_active);
-	cpu_base->nr_events++;
-	dev->next_event.tv64 = KTIME_MAX;
-
-	raw_spin_lock(&cpu_base->lock);
-	entry_time = now = hrtimer_update_base(cpu_base);
-retry:
-	cpu_base->in_hrtirq = 1;
-	/*
-	 * We set expires_next to KTIME_MAX here with cpu_base->lock
-	 * held to prevent that a timer is enqueued in our queue via
-	 * the migration code. This does not affect enqueueing of
-	 * timers which run their callback and need to be requeued on
-	 * this CPU.
-	 */
-	cpu_base->expires_next.tv64 = KTIME_MAX;
+	int i;
 
 	for (i = 0; i < HRTIMER_MAX_CLOCK_BASES; i++) {
 		struct hrtimer_clock_base *base;
@@ -1271,9 +1230,42 @@ retry:
 			if (basenow.tv64 < hrtimer_get_softexpires_tv64(timer))
 				break;
 
-			__run_hrtimer(timer, &basenow);
+			__run_hrtimer(cpu_base, base, timer, &basenow);
 		}
 	}
+}
+
+#ifdef CONFIG_HIGH_RES_TIMERS
+
+/*
+ * High resolution timer interrupt
+ * Called with interrupts disabled
+ */
+void hrtimer_interrupt(struct clock_event_device *dev)
+{
+	struct hrtimer_cpu_base *cpu_base = this_cpu_ptr(&hrtimer_bases);
+	ktime_t expires_next, now, entry_time, delta;
+	int retries = 0;
+
+	BUG_ON(!cpu_base->hres_active);
+	cpu_base->nr_events++;
+	dev->next_event.tv64 = KTIME_MAX;
+
+	raw_spin_lock(&cpu_base->lock);
+	entry_time = now = hrtimer_update_base(cpu_base);
+retry:
+	cpu_base->in_hrtirq = 1;
+	/*
+	 * We set expires_next to KTIME_MAX here with cpu_base->lock
+	 * held to prevent that a timer is enqueued in our queue via
+	 * the migration code. This does not affect enqueueing of
+	 * timers which run their callback and need to be requeued on
+	 * this CPU.
+	 */
+	cpu_base->expires_next.tv64 = KTIME_MAX;
+
+	__hrtimer_run_queues(cpu_base, now);
+
 	/* Reevaluate the clock bases for the next expiry */
 	expires_next = __hrtimer_get_next_event(cpu_base);
 	/*
@@ -1408,38 +1400,16 @@ void hrtimer_run_pending(void)
  */
 void hrtimer_run_queues(void)
 {
-	struct timerqueue_node *node;
 	struct hrtimer_cpu_base *cpu_base = this_cpu_ptr(&hrtimer_bases);
-	struct hrtimer_clock_base *base;
-	int index, gettime = 1;
+	ktime_t now;
 
 	if (hrtimer_hres_active())
 		return;
 
-	for (index = 0; index < HRTIMER_MAX_CLOCK_BASES; index++) {
-		base = &cpu_base->clock_base[index];
-		if (!timerqueue_getnext(&base->active))
-			continue;
-
-		if (gettime) {
-			hrtimer_get_softirq_time(cpu_base);
-			gettime = 0;
-		}
-
-		raw_spin_lock(&cpu_base->lock);
-
-		while ((node = timerqueue_getnext(&base->active))) {
-			struct hrtimer *timer;
-
-			timer = container_of(node, struct hrtimer, node);
-			if (base->softirq_time.tv64 <=
-					hrtimer_get_expires_tv64(timer))
-				break;
-
-			__run_hrtimer(timer, &base->softirq_time);
-		}
-		raw_spin_unlock(&cpu_base->lock);
-	}
+	raw_spin_lock(&cpu_base->lock);
+	now = hrtimer_update_base(cpu_base);
+	__hrtimer_run_queues(cpu_base, now);
+	raw_spin_unlock(&cpu_base->lock);
 }
 
 /*
Index: tip/kernel/time/timekeeping.c
===================================================================
--- tip.orig/kernel/time/timekeeping.c
+++ tip/kernel/time/timekeeping.c
@@ -1926,37 +1926,6 @@ void do_timer(unsigned long ticks)
 }
 
 /**
- * ktime_get_update_offsets_tick - hrtimer helper
- * @offs_real:	pointer to storage for monotonic -> realtime offset
- * @offs_boot:	pointer to storage for monotonic -> boottime offset
- * @offs_tai:	pointer to storage for monotonic -> clock tai offset
- *
- * Returns monotonic time at last tick and various offsets
- */
-ktime_t ktime_get_update_offsets_tick(ktime_t *offs_real, ktime_t *offs_boot,
-							ktime_t *offs_tai)
-{
-	struct timekeeper *tk = &tk_core.timekeeper;
-	unsigned int seq;
-	ktime_t base;
-	u64 nsecs;
-
-	do {
-		seq = read_seqcount_begin(&tk_core.seq);
-
-		base = tk->tkr_mono.base;
-		nsecs = tk->tkr_mono.xtime_nsec >> tk->tkr_mono.shift;
-
-		*offs_real = tk->offs_real;
-		*offs_boot = tk->offs_boot;
-		*offs_tai = tk->offs_tai;
-	} while (read_seqcount_retry(&tk_core.seq, seq));
-
-	return ktime_add_ns(base, nsecs);
-}
-
-#ifdef CONFIG_HIGH_RES_TIMERS
-/**
  * ktime_get_update_offsets_now - hrtimer helper
  * @offs_real:	pointer to storage for monotonic -> realtime offset
  * @offs_boot:	pointer to storage for monotonic -> boottime offset
@@ -1986,7 +1955,6 @@ ktime_t ktime_get_update_offsets_now(kti
 
 	return ktime_add_ns(base, nsecs);
 }
-#endif
 
 /**
  * do_adjtimex() - Accessor function to NTP __do_adjtimex function
Index: tip/kernel/time/timekeeping.h
===================================================================
--- tip.orig/kernel/time/timekeeping.h
+++ tip/kernel/time/timekeeping.h
@@ -3,9 +3,6 @@
 /*
  * Internal interfaces for kernel/time/
  */
-extern ktime_t ktime_get_update_offsets_tick(ktime_t *offs_real,
-						ktime_t *offs_boot,
-						ktime_t *offs_tai);
 extern ktime_t ktime_get_update_offsets_now(ktime_t *offs_real,
 						ktime_t *offs_boot,
 						ktime_t *offs_tai);




* [patch 08/39] hrtimer: Make offset update smarter
  2015-04-14 21:08 [patch 00/39] hrtimer/tick: Optimizations, cleanups and solutions for various issues Thomas Gleixner
                   ` (6 preceding siblings ...)
  2015-04-14 21:08 ` [patch 07/39] hrtimer: Get rid of softirq time Thomas Gleixner
@ 2015-04-14 21:08 ` Thomas Gleixner
  2015-04-20  9:30   ` Preeti U Murthy
  2015-04-22 19:06   ` [tip:timers/core] " tip-bot for Thomas Gleixner
  2015-04-14 21:08 ` [patch 09/39] hrtimer: Use a bits for various boolean indicators Thomas Gleixner
                   ` (30 subsequent siblings)
  38 siblings, 2 replies; 123+ messages in thread
From: Thomas Gleixner @ 2015-04-14 21:08 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Preeti U Murthy, Viresh Kumar,
	Marcelo Tosatti, Frederic Weisbecker, John Stultz

[-- Attachment #1: hrtimer-make-offset-update-smarter.patch --]
[-- Type: text/plain, Size: 5744 bytes --]

On every tick/hrtimer interrupt we update the offset variables of the
clock bases. That's silly because these offsets change very seldom.

Add a sequence counter to the time keeping code which keeps track of
the offset updates (clock_was_set()). Have a sequence cache in the
hrtimer cpu bases to evaluate whether the offsets must be updated or
not. This allows us later to avoid pointless cacheline pollution.
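
The sequence cache idea can be sketched in plain user-space C. This is
only an illustration under simplifying assumptions: the demo_* names
are hypothetical, offsets are plain integers instead of ktime_t, and
the seqcount retry loop that protects the real read side is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the timekeeper: offsets plus a change counter. */
struct demo_timekeeper {
	unsigned int clock_was_set_seq;
	int64_t offs_real;
	int64_t offs_boot;
	int64_t offs_tai;
};

/* Hypothetical stand-in for the per-cpu base: cached offsets and cached seq. */
struct demo_cpu_base {
	unsigned int clock_was_set_seq;
	int64_t offs_real;
	int64_t offs_boot;
	int64_t offs_tai;
};

/* Writer side: change an offset and bump the counter. */
static void demo_clock_was_set(struct demo_timekeeper *tk, int64_t new_offs_real)
{
	tk->offs_real = new_offs_real;
	tk->clock_was_set_seq++;
}

/*
 * Reader side: copy the offsets only when the cached sequence number
 * differs from the one in the timekeeper. Returns true if a copy happened.
 */
static bool demo_update_offsets(struct demo_cpu_base *base,
				const struct demo_timekeeper *tk)
{
	if (base->clock_was_set_seq == tk->clock_was_set_seq)
		return false;

	base->clock_was_set_seq = tk->clock_was_set_seq;
	base->offs_real = tk->offs_real;
	base->offs_boot = tk->offs_boot;
	base->offs_tai = tk->offs_tai;
	return true;
}
```

In the common case the reader compares two integers and leaves its
cached offsets untouched. Only a call after demo_clock_was_set() pays
for the copy.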

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: John Stultz <john.stultz@linaro.org>
---
 include/linux/hrtimer.h             |    4 ++--
 include/linux/timekeeper_internal.h |    2 ++
 kernel/time/hrtimer.c               |    3 ++-
 kernel/time/timekeeping.c           |   23 ++++++++++++++++-------
 kernel/time/timekeeping.h           |    7 ++++---
 5 files changed, 26 insertions(+), 13 deletions(-)

Index: tip/include/linux/hrtimer.h
===================================================================
--- tip.orig/include/linux/hrtimer.h
+++ tip/include/linux/hrtimer.h
@@ -163,7 +163,7 @@ enum  hrtimer_base_type {
  *			and timers
  * @cpu:		cpu number
  * @active_bases:	Bitfield to mark bases with active timers
- * @clock_was_set:	Indicates that clock was set from irq context.
+ * @clock_was_set_seq:	Sequence counter of clock was set events
  * @expires_next:	absolute time of the next event which was scheduled
  *			via clock_set_next_event()
  * @in_hrtirq:		hrtimer_interrupt() is currently executing
@@ -179,7 +179,7 @@ struct hrtimer_cpu_base {
 	raw_spinlock_t			lock;
 	unsigned int			cpu;
 	unsigned int			active_bases;
-	unsigned int			clock_was_set;
+	unsigned int			clock_was_set_seq;
 #ifdef CONFIG_HIGH_RES_TIMERS
 	ktime_t				expires_next;
 	int				in_hrtirq;
Index: tip/include/linux/timekeeper_internal.h
===================================================================
--- tip.orig/include/linux/timekeeper_internal.h
+++ tip/include/linux/timekeeper_internal.h
@@ -49,6 +49,7 @@ struct tk_read_base {
  * @offs_boot:		Offset clock monotonic -> clock boottime
  * @offs_tai:		Offset clock monotonic -> clock tai
  * @tai_offset:		The current UTC to TAI offset in seconds
+ * @clock_was_set_seq:	The sequence number of clock was set events
  * @raw_time:		Monotonic raw base time in timespec64 format
  * @cycle_interval:	Number of clock cycles in one NTP interval
  * @xtime_interval:	Number of clock shifted nano seconds in one NTP
@@ -85,6 +86,7 @@ struct timekeeper {
 	ktime_t			offs_boot;
 	ktime_t			offs_tai;
 	s32			tai_offset;
+	unsigned int		clock_was_set_seq;
 	struct timespec64	raw_time;
 
 	/* The following members are for timekeeping internal use */
Index: tip/kernel/time/hrtimer.c
===================================================================
--- tip.orig/kernel/time/hrtimer.c
+++ tip/kernel/time/hrtimer.c
@@ -451,7 +451,8 @@ static inline ktime_t hrtimer_update_bas
 	ktime_t *offs_boot = &base->clock_base[HRTIMER_BASE_BOOTTIME].offset;
 	ktime_t *offs_tai = &base->clock_base[HRTIMER_BASE_TAI].offset;
 
-	return ktime_get_update_offsets_now(offs_real, offs_boot, offs_tai);
+	return ktime_get_update_offsets_now(&base->clock_was_set_seq,
+					    offs_real, offs_boot, offs_tai);
 }
 
 /* High resolution timer related functions */
Index: tip/kernel/time/timekeeping.c
===================================================================
--- tip.orig/kernel/time/timekeeping.c
+++ tip/kernel/time/timekeeping.c
@@ -602,6 +602,9 @@ static void timekeeping_update(struct ti
 
 	update_fast_timekeeper(&tk->tkr_mono, &tk_fast_mono);
 	update_fast_timekeeper(&tk->tkr_raw,  &tk_fast_raw);
+
+	if (action & TK_CLOCK_WAS_SET)
+		tk->clock_was_set_seq++;
 }
 
 /**
@@ -1927,15 +1930,19 @@ void do_timer(unsigned long ticks)
 
 /**
  * ktime_get_update_offsets_now - hrtimer helper
+ * @cwsseq:	pointer to check and store the clock was set sequence number
  * @offs_real:	pointer to storage for monotonic -> realtime offset
  * @offs_boot:	pointer to storage for monotonic -> boottime offset
  * @offs_tai:	pointer to storage for monotonic -> clock tai offset
  *
- * Returns current monotonic time and updates the offsets
+ * Returns current monotonic time and updates the offsets if the
+ * sequence number in @cwsseq and timekeeper.clock_was_set_seq are
+ * different.
+ *
  * Called from hrtimer_interrupt() or retrigger_next_event()
  */
-ktime_t ktime_get_update_offsets_now(ktime_t *offs_real, ktime_t *offs_boot,
-							ktime_t *offs_tai)
+ktime_t ktime_get_update_offsets_now(unsigned int *cwsseq, ktime_t *offs_real,
+				     ktime_t *offs_boot, ktime_t *offs_tai)
 {
 	struct timekeeper *tk = &tk_core.timekeeper;
 	unsigned int seq;
@@ -1947,10 +1954,12 @@ ktime_t ktime_get_update_offsets_now(kti
 
 		base = tk->tkr_mono.base;
 		nsecs = timekeeping_get_ns(&tk->tkr_mono);
-
-		*offs_real = tk->offs_real;
-		*offs_boot = tk->offs_boot;
-		*offs_tai = tk->offs_tai;
+		if (*cwsseq != tk->clock_was_set_seq) {
+			*cwsseq = tk->clock_was_set_seq;
+			*offs_real = tk->offs_real;
+			*offs_boot = tk->offs_boot;
+			*offs_tai = tk->offs_tai;
+		}
 	} while (read_seqcount_retry(&tk_core.seq, seq));
 
 	return ktime_add_ns(base, nsecs);
Index: tip/kernel/time/timekeeping.h
===================================================================
--- tip.orig/kernel/time/timekeeping.h
+++ tip/kernel/time/timekeeping.h
@@ -3,9 +3,10 @@
 /*
  * Internal interfaces for kernel/time/
  */
-extern ktime_t ktime_get_update_offsets_now(ktime_t *offs_real,
-						ktime_t *offs_boot,
-						ktime_t *offs_tai);
+extern ktime_t ktime_get_update_offsets_now(unsigned int *cwsseq,
+					    ktime_t *offs_real,
+					    ktime_t *offs_boot,
+					    ktime_t *offs_tai);
 
 extern int timekeeping_valid_for_hres(void);
 extern u64 timekeeping_max_deferment(void);




* [patch 09/39] hrtimer: Use a bits for various boolean indicators
  2015-04-14 21:08 [patch 00/39] hrtimer/tick: Optimizations, cleanups and solutions for various issues Thomas Gleixner
                   ` (7 preceding siblings ...)
  2015-04-14 21:08 ` [patch 08/39] hrtimer: Make offset update smarter Thomas Gleixner
@ 2015-04-14 21:08 ` Thomas Gleixner
  2015-04-22 19:07   ` [tip:timers/core] hrtimer: Use " tip-bot for Thomas Gleixner
  2015-04-14 21:08 ` [patch 10/39] hrtimer: Use cpu_base->active_base for hotpath iterators Thomas Gleixner
                   ` (29 subsequent siblings)
  38 siblings, 1 reply; 123+ messages in thread
From: Thomas Gleixner @ 2015-04-14 21:08 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Preeti U Murthy, Viresh Kumar,
	Marcelo Tosatti, Frederic Weisbecker

[-- Attachment #1: hrt-use-bits.patch --]
[-- Type: text/plain, Size: 3677 bytes --]

No point in wasting 12 bytes of storage space. This generates better code as well.

Text size reduction:
       x8664 -64, i386 -16, ARM -132, ARM64 -0, power64 -48
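
A rough user-space illustration of the storage difference, assuming
nothing beyond standard C. The demo_* names are hypothetical, and the
exact sizes depend on the ABI because bitfield layout is
implementation-defined, so the sketch only compares the two layouts.

```c
#include <stddef.h>

/* Three boolean indicators stored as full ints (the old layout). */
struct demo_flags_int {
	int in_hrtirq;
	int hres_active;
	int hang_detected;
};

/* The same indicators packed into single bits of one unsigned int. */
struct demo_flags_bits {
	unsigned int in_hrtirq     : 1,
		     hres_active   : 1,
		     hang_detected : 1;
};

/* Number of bytes the bitfield layout saves on the build machine. */
static size_t demo_bytes_saved(void)
{
	return sizeof(struct demo_flags_int) - sizeof(struct demo_flags_bits);
}
```

The flags are still read and written by name, so call sites that only
test or set them do not need to change.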

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/hrtimer.h |    6 +++---
 kernel/time/hrtimer.c   |   24 ++++++++++++++++--------
 2 files changed, 19 insertions(+), 11 deletions(-)

Index: tip/include/linux/hrtimer.h
===================================================================
--- tip.orig/include/linux/hrtimer.h
+++ tip/include/linux/hrtimer.h
@@ -181,10 +181,10 @@ struct hrtimer_cpu_base {
 	unsigned int			active_bases;
 	unsigned int			clock_was_set_seq;
 #ifdef CONFIG_HIGH_RES_TIMERS
+	unsigned int			in_hrtirq	: 1,
+					hres_active	: 1,
+					hang_detected	: 1;
 	ktime_t				expires_next;
-	int				in_hrtirq;
-	int				hres_active;
-	int				hang_detected;
 	unsigned int			nr_events;
 	unsigned int			nr_retries;
 	unsigned int			nr_hangs;
Index: tip/kernel/time/hrtimer.c
===================================================================
--- tip.orig/kernel/time/hrtimer.c
+++ tip/kernel/time/hrtimer.c
@@ -492,9 +492,14 @@ static inline int hrtimer_is_hres_enable
 /*
  * Is the high resolution mode active ?
  */
+static inline int __hrtimer_hres_active(struct hrtimer_cpu_base *cpu_base)
+{
+	return cpu_base->hres_active;
+}
+
 static inline int hrtimer_hres_active(void)
 {
-	return __this_cpu_read(hrtimer_bases.hres_active);
+	return __hrtimer_hres_active(this_cpu_ptr(&hrtimer_bases));
 }
 
 /*
@@ -628,7 +633,7 @@ static void retrigger_next_event(void *a
 {
 	struct hrtimer_cpu_base *base = this_cpu_ptr(&hrtimer_bases);
 
-	if (!hrtimer_hres_active())
+	if (!base->hres_active)
 		return;
 
 	raw_spin_lock(&base->lock);
@@ -685,6 +690,7 @@ void clock_was_set_delayed(void)
 
 #else
 
+static inline int __hrtimer_hres_active(struct hrtimer_cpu_base *b) { return 0; }
 static inline int hrtimer_hres_active(void) { return 0; }
 static inline int hrtimer_is_hres_enabled(void) { return 0; }
 static inline int hrtimer_switch_to_hres(void) { return 0; }
@@ -854,25 +860,27 @@ static void __remove_hrtimer(struct hrti
 			     struct hrtimer_clock_base *base,
 			     unsigned long newstate, int reprogram)
 {
+	struct hrtimer_cpu_base *cpu_base = base->cpu_base;
 	struct timerqueue_node *next_timer;
+
 	if (!(timer->state & HRTIMER_STATE_ENQUEUED))
 		goto out;
 
 	next_timer = timerqueue_getnext(&base->active);
 	timerqueue_del(&base->active, &timer->node);
 	if (!timerqueue_getnext(&base->active))
-		base->cpu_base->active_bases &= ~(1 << base->index);
+		cpu_base->active_bases &= ~(1 << base->index);
 
 	if (&timer->node == next_timer) {
 #ifdef CONFIG_HIGH_RES_TIMERS
 		/* Reprogram the clock event device. if enabled */
-		if (reprogram && hrtimer_hres_active()) {
+		if (reprogram && cpu_base->hres_active) {
 			ktime_t expires;
 
 			expires = ktime_sub(hrtimer_get_expires(timer),
 					    base->offset);
-			if (base->cpu_base->expires_next.tv64 == expires.tv64)
-				hrtimer_force_reprogram(base->cpu_base, 1);
+			if (cpu_base->expires_next.tv64 == expires.tv64)
+				hrtimer_force_reprogram(cpu_base, 1);
 		}
 #endif
 	}
@@ -1106,7 +1114,7 @@ ktime_t hrtimer_get_next_event(void)
 
 	raw_spin_lock_irqsave(&cpu_base->lock, flags);
 
-	if (!hrtimer_hres_active())
+	if (!__hrtimer_hres_active(cpu_base))
 		mindelta = ktime_sub(__hrtimer_get_next_event(cpu_base),
 				     ktime_get());
 
@@ -1404,7 +1412,7 @@ void hrtimer_run_queues(void)
 	struct hrtimer_cpu_base *cpu_base = this_cpu_ptr(&hrtimer_bases);
 	ktime_t now;
 
-	if (hrtimer_hres_active())
+	if (__hrtimer_hres_active(cpu_base))
 		return;
 
 	raw_spin_lock(&cpu_base->lock);




* [patch 10/39] hrtimer: Use cpu_base->active_base for hotpath iterators
  2015-04-14 21:08 [patch 00/39] hrtimer/tick: Optimizations, cleanups and solutions for various issues Thomas Gleixner
                   ` (8 preceding siblings ...)
  2015-04-14 21:08 ` [patch 09/39] hrtimer: Use a bits for various boolean indicators Thomas Gleixner
@ 2015-04-14 21:08 ` Thomas Gleixner
  2015-04-20 11:16   ` Preeti U Murthy
  2015-04-22 19:07   ` [tip:timers/core] hrtimer: Use cpu_base-> active_base " tip-bot for Thomas Gleixner
  2015-04-14 21:08 ` [patch 11/39] hrtimer: Cache line align the hrtimer cpu base Thomas Gleixner
                   ` (28 subsequent siblings)
  38 siblings, 2 replies; 123+ messages in thread
From: Thomas Gleixner @ 2015-04-14 21:08 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Preeti U Murthy, Viresh Kumar,
	Marcelo Tosatti, Frederic Weisbecker

[-- Attachment #1: hrtimer-iterate-smarter.patch --]
[-- Type: text/plain, Size: 1936 bytes --]

Now that we have the active_bases field in sync we can use it for
iterating over the clock bases. This allows us to break out early if no
more active clock bases are available and avoids touching the cache
lines of inactive clock bases.
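
The loop shape can be tried outside the kernel with a small sketch.
The demo_* names and the long-based expiry are assumptions made for
the example, and the caller is assumed to set bits only for bases that
exist in the array.

```c
#include <limits.h>

#define DEMO_MAX_BASES 4

struct demo_clock_base {
	long next_expiry;	/* only meaningful when the base is active */
};

/*
 * Return the earliest expiry among the bases whose bit is set in
 * @active, or LONG_MAX if no base is active. The loop stops as soon
 * as no set bits remain and never reads an inactive base.
 */
static long demo_next_event(const struct demo_clock_base *base,
			    unsigned int active)
{
	long expires_next = LONG_MAX;

	for (; active; base++, active >>= 1) {
		if (!(active & 0x01))
			continue;
		if (base->next_expiry < expires_next)
			expires_next = base->next_expiry;
	}
	return expires_next;
}
```

With active == 0x5 only bases 0 and 2 are read, and the loop ends
after the third iteration because the shifted mask is zero by then.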

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/time/hrtimer.c |   17 ++++++++---------
 1 file changed, 8 insertions(+), 9 deletions(-)

Index: tip/kernel/time/hrtimer.c
===================================================================
--- tip.orig/kernel/time/hrtimer.c
+++ tip/kernel/time/hrtimer.c
@@ -419,16 +419,16 @@ static ktime_t __hrtimer_get_next_event(
 {
 	struct hrtimer_clock_base *base = cpu_base->clock_base;
 	ktime_t expires, expires_next = { .tv64 = KTIME_MAX };
-	int i;
+	unsigned int active = cpu_base->active_bases;
 
-	for (i = 0; i < HRTIMER_MAX_CLOCK_BASES; i++, base++) {
+	for (; active; base++, active >>= 1) {
 		struct timerqueue_node *next;
 		struct hrtimer *timer;
 
-		next = timerqueue_getnext(&base->active);
-		if (!next)
+		if (!(active & 0x01))
 			continue;
 
+		next = timerqueue_getnext(&base->active);
 		timer = container_of(next, struct hrtimer, node);
 		expires = ktime_sub(hrtimer_get_expires(timer), base->offset);
 		if (expires.tv64 < expires_next.tv64)
@@ -1206,17 +1206,16 @@ static void __run_hrtimer(struct hrtimer
 
 static void __hrtimer_run_queues(struct hrtimer_cpu_base *cpu_base, ktime_t now)
 {
-	int i;
+	struct hrtimer_clock_base *base = cpu_base->clock_base;
+	unsigned int active = cpu_base->active_bases;
 
-	for (i = 0; i < HRTIMER_MAX_CLOCK_BASES; i++) {
-		struct hrtimer_clock_base *base;
+	for (; active; base++, active >>= 1) {
 		struct timerqueue_node *node;
 		ktime_t basenow;
 
-		if (!(cpu_base->active_bases & (1 << i)))
+		if (!(active & 0x01))
 			continue;
 
-		base = cpu_base->clock_base + i;
 		basenow = ktime_add(now, base->offset);
 
 		while ((node = timerqueue_getnext(&base->active))) {




* [patch 11/39] hrtimer: Cache line align the hrtimer cpu base
  2015-04-14 21:08 [patch 00/39] hrtimer/tick: Optimizations, cleanups and solutions for various issues Thomas Gleixner
                   ` (9 preceding siblings ...)
  2015-04-14 21:08 ` [patch 10/39] hrtimer: Use cpu_base->active_base for hotpath iterators Thomas Gleixner
@ 2015-04-14 21:08 ` Thomas Gleixner
  2015-04-22 19:07   ` [tip:timers/core] " tip-bot for Thomas Gleixner
  2015-04-14 21:08 ` [patch 12/39] hrtimer: Align the hrtimer clock bases as well Thomas Gleixner
                   ` (27 subsequent siblings)
  38 siblings, 1 reply; 123+ messages in thread
From: Thomas Gleixner @ 2015-04-14 21:08 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Preeti U Murthy, Viresh Kumar,
	Marcelo Tosatti, Frederic Weisbecker

[-- Attachment #1: hrt-align.patch --]
[-- Type: text/plain, Size: 647 bytes --]

We really want that data structure to start at a cache line boundary.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/hrtimer.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: tip/include/linux/hrtimer.h
===================================================================
--- tip.orig/include/linux/hrtimer.h
+++ tip/include/linux/hrtimer.h
@@ -191,7 +191,7 @@ struct hrtimer_cpu_base {
 	unsigned int			max_hang_time;
 #endif
 	struct hrtimer_clock_base	clock_base[HRTIMER_MAX_CLOCK_BASES];
-};
+} ____cacheline_aligned;
 
 static inline void hrtimer_set_expires(struct hrtimer *timer, ktime_t time)
 {




* [patch 12/39] hrtimer: Align the hrtimer clock bases as well
  2015-04-14 21:08 [patch 00/39] hrtimer/tick: Optimizations, cleanups and solutions for various issues Thomas Gleixner
                   ` (10 preceding siblings ...)
  2015-04-14 21:08 ` [patch 11/39] hrtimer: Cache line align the hrtimer cpu base Thomas Gleixner
@ 2015-04-14 21:08 ` Thomas Gleixner
  2015-04-22 19:07   ` [tip:timers/core] " tip-bot for Thomas Gleixner
  2015-04-14 21:08 ` [patch 13/39] timerqueue: Let timerqueue_add/del return information Thomas Gleixner
                   ` (26 subsequent siblings)
  38 siblings, 1 reply; 123+ messages in thread
From: Thomas Gleixner @ 2015-04-14 21:08 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Preeti U Murthy, Viresh Kumar,
	Marcelo Tosatti, Frederic Weisbecker

[-- Attachment #1: hrt-align-clock-base.patch --]
[-- Type: text/plain, Size: 1435 bytes --]

We don't use ____cacheline_aligned here because that might waste a lot
of space on 32bit machines with 64 byte cachelines and on 64bit
machines with 128 byte cachelines.

The size of struct hrtimer_clock_base is 64 bytes on 64bit and 32 bytes
on 32bit machines, so aligning to that size utilizes the cache lines
properly.
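
A minimal sketch of the same technique for GCC or Clang in C11 mode.
The demo_* names and member layout are hypothetical, and the alignment
is fixed at 64 here rather than selected via CONFIG_64BIT.
_Static_assert plays the role of BUILD_BUG_ON().

```c
#include <stddef.h>

#define DEMO_CLOCK_BASE_ALIGN 64

/* Hypothetical structure that is smaller than the requested alignment. */
struct demo_clock_base {
	void *cpu_base;
	int index;
	long long offset;
} __attribute__((__aligned__(DEMO_CLOCK_BASE_ALIGN)));

/* Fail the build if the structure outgrows its alignment unit. */
_Static_assert(sizeof(struct demo_clock_base) <= DEMO_CLOCK_BASE_ALIGN,
	       "demo_clock_base must fit in one alignment unit");

/* Container with a small leading member followed by the aligned array. */
struct demo_cpu_base {
	unsigned int active_bases;
	struct demo_clock_base clock_base[4];
};
```

The compiler pads each array element to the alignment, so every
element starts on its own 64 byte boundary, provided the containing
object is placed at a suitably aligned address.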

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/hrtimer.h |   10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

Index: tip/include/linux/hrtimer.h
===================================================================
--- tip.orig/include/linux/hrtimer.h
+++ tip/include/linux/hrtimer.h
@@ -130,6 +130,12 @@ struct hrtimer_sleeper {
 	struct task_struct *task;
 };
 
+#ifdef CONFIG_64BIT
+# define HRTIMER_CLOCK_BASE_ALIGN	64
+#else
+# define HRTIMER_CLOCK_BASE_ALIGN	32
+#endif
+
 /**
  * struct hrtimer_clock_base - the timer base for a specific clock
  * @cpu_base:		per cpu clock base
@@ -147,7 +153,7 @@ struct hrtimer_clock_base {
 	struct timerqueue_head	active;
 	ktime_t			(*get_time)(void);
 	ktime_t			offset;
-};
+} __attribute__((__aligned__(HRTIMER_CLOCK_BASE_ALIGN)));
 
 enum  hrtimer_base_type {
 	HRTIMER_BASE_MONOTONIC,
@@ -195,6 +201,8 @@ struct hrtimer_cpu_base {
 
 static inline void hrtimer_set_expires(struct hrtimer *timer, ktime_t time)
 {
+	BUILD_BUG_ON(sizeof(struct hrtimer_clock_base) > HRTIMER_CLOCK_BASE_ALIGN);
+
 	timer->node.expires = time;
 	timer->_softexpires = time;
 }




* [patch 13/39] timerqueue: Let timerqueue_add/del return information
  2015-04-14 21:08 [patch 00/39] hrtimer/tick: Optimizations, cleanups and solutions for various issues Thomas Gleixner
                   ` (11 preceding siblings ...)
  2015-04-14 21:08 ` [patch 12/39] hrtimer: Align the hrtimer clock bases as well Thomas Gleixner
@ 2015-04-14 21:08 ` Thomas Gleixner
  2015-04-22 19:08   ` [tip:timers/core] timerqueue: Let timerqueue_add/ del " tip-bot for Thomas Gleixner
  2015-04-14 21:08 ` [patch 14/39] hrtimer: Make use of timerqueue_add/del return values Thomas Gleixner
                   ` (25 subsequent siblings)
  38 siblings, 1 reply; 123+ messages in thread
From: Thomas Gleixner @ 2015-04-14 21:08 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Preeti U Murthy, Viresh Kumar,
	Marcelo Tosatti, Frederic Weisbecker, John Stultz

[-- Attachment #1: timerqueue-return-first.patch --]
[-- Type: text/plain, Size: 2640 bytes --]

The hrtimer code is interested in whether the added timer is the first
one to expire and whether the removed timer was the last one in the
tree. The add/del routines have that information already. So we can
return it right away instead of reevaluating it at the call site.
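
The return value convention can be illustrated with a simplified
queue. This sketch assumes a sorted singly linked list instead of the
rbtree used by lib/timerqueue.c, and all demo_* names are
hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

struct demo_node {
	long expires;
	struct demo_node *link;
};

struct demo_head {
	struct demo_node *first;	/* earliest expiring node, or NULL */
};

/* Insert @node sorted by expiry. Returns true if it became the first node. */
static bool demo_add(struct demo_head *head, struct demo_node *node)
{
	struct demo_node **p = &head->first;

	/* Walk past all nodes which expire earlier or at the same time. */
	while (*p && (*p)->expires <= node->expires)
		p = &(*p)->link;

	node->link = *p;
	*p = node;
	return head->first == node;
}

/* Remove @node. Returns true if the queue still holds nodes afterwards. */
static bool demo_del(struct demo_head *head, struct demo_node *node)
{
	struct demo_node **p = &head->first;

	while (*p && *p != node)
		p = &(*p)->link;
	if (*p)
		*p = node->link;
	node->link = NULL;
	return head->first != NULL;
}
```

A caller can then decide from the return value of demo_add() whether
it needs to reprogram for an earlier expiry, and from demo_del()
whether the queue became empty, without looking at the queue again.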

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: John Stultz <john.stultz@linaro.org>
---
 include/linux/timerqueue.h |    8 ++++----
 lib/timerqueue.c           |   10 +++++++---
 2 files changed, 11 insertions(+), 7 deletions(-)

Index: linux/include/linux/timerqueue.h
===================================================================
--- linux.orig/include/linux/timerqueue.h
+++ linux/include/linux/timerqueue.h
@@ -16,10 +16,10 @@ struct timerqueue_head {
 };
 
 
-extern void timerqueue_add(struct timerqueue_head *head,
-				struct timerqueue_node *node);
-extern void timerqueue_del(struct timerqueue_head *head,
-				struct timerqueue_node *node);
+extern bool timerqueue_add(struct timerqueue_head *head,
+			   struct timerqueue_node *node);
+extern bool timerqueue_del(struct timerqueue_head *head,
+			   struct timerqueue_node *node);
 extern struct timerqueue_node *timerqueue_iterate_next(
 						struct timerqueue_node *node);
 
Index: linux/lib/timerqueue.c
===================================================================
--- linux.orig/lib/timerqueue.c
+++ linux/lib/timerqueue.c
@@ -36,7 +36,7 @@
  * Adds the timer node to the timerqueue, sorted by the
  * node's expires value.
  */
-void timerqueue_add(struct timerqueue_head *head, struct timerqueue_node *node)
+bool timerqueue_add(struct timerqueue_head *head, struct timerqueue_node *node)
 {
 	struct rb_node **p = &head->head.rb_node;
 	struct rb_node *parent = NULL;
@@ -56,8 +56,11 @@ void timerqueue_add(struct timerqueue_he
 	rb_link_node(&node->node, parent, p);
 	rb_insert_color(&node->node, &head->head);
 
-	if (!head->next || node->expires.tv64 < head->next->expires.tv64)
+	if (!head->next || node->expires.tv64 < head->next->expires.tv64) {
 		head->next = node;
+		return true;
+	}
+	return false;
 }
 EXPORT_SYMBOL_GPL(timerqueue_add);
 
@@ -69,7 +72,7 @@ EXPORT_SYMBOL_GPL(timerqueue_add);
  *
  * Removes the timer node from the timerqueue.
  */
-void timerqueue_del(struct timerqueue_head *head, struct timerqueue_node *node)
+bool timerqueue_del(struct timerqueue_head *head, struct timerqueue_node *node)
 {
 	WARN_ON_ONCE(RB_EMPTY_NODE(&node->node));
 
@@ -82,6 +85,7 @@ void timerqueue_del(struct timerqueue_he
 	}
 	rb_erase(&node->node, &head->head);
 	RB_CLEAR_NODE(&node->node);
+	return head->next != NULL;
 }
 EXPORT_SYMBOL_GPL(timerqueue_del);
 



^ permalink raw reply	[flat|nested] 123+ messages in thread

* [patch 14/39] hrtimer: Make use of timerqueue_add/del return values
  2015-04-14 21:08 [patch 00/39] hrtimer/tick: Optimizations, cleanups and solutions for various issues Thomas Gleixner
                   ` (12 preceding siblings ...)
  2015-04-14 21:08 ` [patch 13/39] timerqueue: Let timerqueue_add/del return information Thomas Gleixner
@ 2015-04-14 21:08 ` Thomas Gleixner
  2015-04-22 19:08   ` [tip:timers/core] hrtimer: Make use of timerqueue_add/ del " tip-bot for Thomas Gleixner
  2015-04-14 21:08 ` [patch 15/39] hrtimer: Keep pointer to first timer and simplify __remove_hrtimer() Thomas Gleixner
                   ` (24 subsequent siblings)
  38 siblings, 1 reply; 123+ messages in thread
From: Thomas Gleixner @ 2015-04-14 21:08 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Preeti U Murthy, Viresh Kumar,
	Marcelo Tosatti, Frederic Weisbecker

[-- Attachment #1: hrtimer-use-tq-retval.patch --]
[-- Type: text/plain, Size: 1146 bytes --]

Use the return value instead of reevaluating the information.
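
As a rough sketch of the pattern (not the kernel code), the
active_bases bit can be cleared straight from the delete result. The
per-base counter and the helpers base_add(), base_del() and
remove_timer() below are assumptions made up for this example; the
real code keeps an rbtree per clock base:

```c
#include <assert.h>
#include <stdbool.h>

#define NR_BASES 4

struct cpu_base {
	unsigned int active_bases;	/* bit N set: base N has queued timers */
	unsigned int nr_queued[NR_BASES];
};

/* Stand-in for timerqueue_del(): true if the base still holds timers. */
static bool base_del(struct cpu_base *cb, unsigned int index)
{
	assert(index < NR_BASES && cb->nr_queued[index] > 0);
	return --cb->nr_queued[index] != 0;
}

/* Queue one timer on a base and mark the base active. */
static void base_add(struct cpu_base *cb, unsigned int index)
{
	assert(index < NR_BASES);
	cb->nr_queued[index]++;
	cb->active_bases |= 1U << index;
}

/* Clear the active bit only when the delete reports an empty base. */
static void remove_timer(struct cpu_base *cb, unsigned int index)
{
	if (!base_del(cb, index))
		cb->active_bases &= ~(1U << index);
}
```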

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/time/hrtimer.c |    6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

Index: tip/kernel/time/hrtimer.c
===================================================================
--- tip.orig/kernel/time/hrtimer.c
+++ tip/kernel/time/hrtimer.c
@@ -834,7 +834,6 @@ static int enqueue_hrtimer(struct hrtime
 {
 	debug_activate(timer);
 
-	timerqueue_add(&base->active, &timer->node);
 	base->cpu_base->active_bases |= 1 << base->index;
 
 	/*
@@ -843,7 +842,7 @@ static int enqueue_hrtimer(struct hrtime
 	 */
 	timer->state |= HRTIMER_STATE_ENQUEUED;
 
-	return (&timer->node == base->active.next);
+	return timerqueue_add(&base->active, &timer->node);
 }
 
 /*
@@ -867,8 +866,7 @@ static void __remove_hrtimer(struct hrti
 		goto out;
 
 	next_timer = timerqueue_getnext(&base->active);
-	timerqueue_del(&base->active, &timer->node);
-	if (!timerqueue_getnext(&base->active))
+	if (!timerqueue_del(&base->active, &timer->node))
 		cpu_base->active_bases &= ~(1 << base->index);
 
 	if (&timer->node == next_timer) {



^ permalink raw reply	[flat|nested] 123+ messages in thread

* [patch 15/39] hrtimer: Keep pointer to first timer and simplify __remove_hrtimer()
  2015-04-14 21:08 [patch 00/39] hrtimer/tick: Optimizations, cleanups and solutions for various issues Thomas Gleixner
                   ` (13 preceding siblings ...)
  2015-04-14 21:08 ` [patch 14/39] hrtimer: Make use of timerqueue_add/del return values Thomas Gleixner
@ 2015-04-14 21:08 ` Thomas Gleixner
  2015-04-22 19:08   ` [tip:timers/core] " tip-bot for Thomas Gleixner
  2015-04-14 21:08 ` [patch 16/39] hrtimer: Get rid of hrtimer softirq Thomas Gleixner
                   ` (23 subsequent siblings)
  38 siblings, 1 reply; 123+ messages in thread
From: Thomas Gleixner @ 2015-04-14 21:08 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Preeti U Murthy, Viresh Kumar,
	Marcelo Tosatti, Frederic Weisbecker

[-- Attachment #1: hrtimer-keep-pointer-to-first.patch --]
[-- Type: text/plain, Size: 4871 bytes --]

__remove_hrtimer() needs to evaluate the expiry time to figure out
whether the timer being removed is the first expiring timer on the
cpu. Keep a lazily updated pointer to that timer, so we can avoid the
evaluation dance and retrieve the information from there.

Generates slightly better code.
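
The idea can be sketched in userspace roughly as follows. The flat
timer array, the nr_reprograms counter and the function names are
assumptions for this example only. The point it tries to show is that
the cached pointer is compared by identity and never dereferenced, so
a stale value costs at most one unneeded reprogram:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NR_TIMERS 4

struct timer {
	int64_t expires;
	bool queued;
};

struct cpu_base {
	struct timer *timers[NR_TIMERS];	/* hypothetical flat storage */
	const struct timer *next_timer;		/* compared only, never dereferenced */
	unsigned int nr_reprograms;		/* counts forced reprograms */
};

/* Scan all queued timers and refresh the cached first-expiring pointer. */
static int64_t get_next_event(struct cpu_base *cb)
{
	int64_t expires_next = INT64_MAX;
	size_t i;

	cb->next_timer = NULL;
	for (i = 0; i < NR_TIMERS; i++) {
		struct timer *t = cb->timers[i];

		if (!t || !t->queued)
			continue;
		if (t->expires < expires_next) {
			expires_next = t->expires;
			cb->next_timer = t;
		}
	}
	return expires_next;
}

/* Stand-in for hrtimer_force_reprogram(): reevaluate and count the call. */
static void force_reprogram(struct cpu_base *cb)
{
	cb->nr_reprograms++;
	get_next_event(cb);
}

/* Dequeue @t; reprogram only if it was the cached first-expiring timer. */
static void remove_timer(struct cpu_base *cb, struct timer *t, bool reprogram)
{
	assert(cb != NULL && t != NULL);
	if (!t->queued)
		return;
	t->queued = false;

	if (reprogram && t == cb->next_timer)
		force_reprogram(cb);
}
```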

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/hrtimer.h |    6 ++++++
 kernel/time/hrtimer.c   |   46 ++++++++++++++++++++++++++++------------------
 2 files changed, 34 insertions(+), 18 deletions(-)

Index: tip/include/linux/hrtimer.h
===================================================================
--- tip.orig/include/linux/hrtimer.h
+++ tip/include/linux/hrtimer.h
@@ -172,6 +172,7 @@ enum  hrtimer_base_type {
  * @clock_was_set_seq:	Sequence counter of clock was set events
  * @expires_next:	absolute time of the next event which was scheduled
  *			via clock_set_next_event()
+ * @next_timer:		Pointer to the first expiring timer
  * @in_hrtirq:		hrtimer_interrupt() is currently executing
  * @hres_active:	State of high resolution mode
  * @hang_detected:	The last hrtimer interrupt detected a hang
@@ -180,6 +181,10 @@ enum  hrtimer_base_type {
  * @nr_hangs:		Total number of hrtimer interrupt hangs
  * @max_hang_time:	Maximum time spent in hrtimer_interrupt
  * @clock_base:		array of clock bases for this cpu
+ *
+ * Note: next_timer is just an optimization for __remove_hrtimer().
+ *	 Do not dereference the pointer because it is not reliable on
+ *	 cross cpu removals.
  */
 struct hrtimer_cpu_base {
 	raw_spinlock_t			lock;
@@ -191,6 +196,7 @@ struct hrtimer_cpu_base {
 					hres_active	: 1,
 					hang_detected	: 1;
 	ktime_t				expires_next;
+	struct hrtimer			*next_timer;
 	unsigned int			nr_events;
 	unsigned int			nr_retries;
 	unsigned int			nr_hangs;
Index: tip/kernel/time/hrtimer.c
===================================================================
--- tip.orig/kernel/time/hrtimer.c
+++ tip/kernel/time/hrtimer.c
@@ -415,12 +415,21 @@ static inline void debug_deactivate(stru
 }
 
 #if defined(CONFIG_NO_HZ_COMMON) || defined(CONFIG_HIGH_RES_TIMERS)
+static inline void hrtimer_update_next_timer(struct hrtimer_cpu_base *cpu_base,
+					     struct hrtimer *timer)
+{
+#ifdef CONFIG_HIGH_RES_TIMERS
+	cpu_base->next_timer = timer;
+#endif
+}
+
 static ktime_t __hrtimer_get_next_event(struct hrtimer_cpu_base *cpu_base)
 {
 	struct hrtimer_clock_base *base = cpu_base->clock_base;
 	ktime_t expires, expires_next = { .tv64 = KTIME_MAX };
 	unsigned int active = cpu_base->active_bases;
 
+	hrtimer_update_next_timer(cpu_base, NULL);
 	for (; active; base++, active >>= 1) {
 		struct timerqueue_node *next;
 		struct hrtimer *timer;
@@ -431,8 +440,10 @@ static ktime_t __hrtimer_get_next_event(
 		next = timerqueue_getnext(&base->active);
 		timer = container_of(next, struct hrtimer, node);
 		expires = ktime_sub(hrtimer_get_expires(timer), base->offset);
-		if (expires.tv64 < expires_next.tv64)
+		if (expires.tv64 < expires_next.tv64) {
 			expires_next = expires;
+			hrtimer_update_next_timer(cpu_base, timer);
+		}
 	}
 	/*
 	 * clock_was_set() might have changed base->offset of any of
@@ -597,6 +608,8 @@ static int hrtimer_reprogram(struct hrti
 	if (cpu_base->in_hrtirq)
 		return 0;
 
+	cpu_base->next_timer = timer;
+
 	/*
 	 * If a hang was detected in the last timer interrupt then we
 	 * do not schedule a timer which is earlier than the expiry
@@ -860,30 +873,27 @@ static void __remove_hrtimer(struct hrti
 			     unsigned long newstate, int reprogram)
 {
 	struct hrtimer_cpu_base *cpu_base = base->cpu_base;
-	struct timerqueue_node *next_timer;
+	unsigned int state = timer->state;
 
-	if (!(timer->state & HRTIMER_STATE_ENQUEUED))
-		goto out;
+	timer->state = newstate;
+	if (!(state & HRTIMER_STATE_ENQUEUED))
+		return;
 
-	next_timer = timerqueue_getnext(&base->active);
 	if (!timerqueue_del(&base->active, &timer->node))
 		cpu_base->active_bases &= ~(1 << base->index);
 
-	if (&timer->node == next_timer) {
 #ifdef CONFIG_HIGH_RES_TIMERS
-		/* Reprogram the clock event device. if enabled */
-		if (reprogram && cpu_base->hres_active) {
-			ktime_t expires;
-
-			expires = ktime_sub(hrtimer_get_expires(timer),
-					    base->offset);
-			if (cpu_base->expires_next.tv64 == expires.tv64)
-				hrtimer_force_reprogram(cpu_base, 1);
-		}
+	/*
+	 * Note: If reprogram is false we do not update
+	 * cpu_base->next_timer. This happens when we remove the first
+	 * timer on a remote cpu. No harm as we never dereference
+	 * cpu_base->next_timer. So the worst thing that can happen is
+	 * a superfluous call to hrtimer_force_reprogram() on the
+	 * remote cpu later on if the same timer gets enqueued again.
+	 */
+	if (reprogram && timer == cpu_base->next_timer)
+		hrtimer_force_reprogram(cpu_base, 1);
 #endif
-	}
-out:
-	timer->state = newstate;
 }
 
 /*



^ permalink raw reply	[flat|nested] 123+ messages in thread

* [patch 16/39] hrtimer: Get rid of hrtimer softirq
  2015-04-14 21:08 [patch 00/39] hrtimer/tick: Optimizations, cleanups and solutions for various issues Thomas Gleixner
                   ` (14 preceding siblings ...)
  2015-04-14 21:08 ` [patch 15/39] hrtimer: Keep pointer to first timer and simplify __remove_hrtimer() Thomas Gleixner
@ 2015-04-14 21:08 ` Thomas Gleixner
  2015-04-22 19:09   ` [tip:timers/core] " tip-bot for Thomas Gleixner
  2015-04-14 21:08 ` [patch 17/39] tick: sched: Remove hrtimer_active() checks Thomas Gleixner
                   ` (22 subsequent siblings)
  38 siblings, 1 reply; 123+ messages in thread
From: Thomas Gleixner @ 2015-04-14 21:08 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Preeti U Murthy, Viresh Kumar,
	Marcelo Tosatti, Frederic Weisbecker

[-- Attachment #1: hrtimer-get-rid-of-softirq-crap.patch --]
[-- Type: text/plain, Size: 11047 bytes --]

The hrtimer softirq is a leftover from the initial implementation and
only serves to handle the enqueueing of already expired timers in
high resolution timer mode. We discussed whether to change the return
value and force all start sites to handle an already expired timer,
but that would be a Herculean task and I'm not sure whether it's a
good idea to enforce that handling on everyone.

A simpler solution is to enforce a timer interrupt instead of raising
and scheduling a softirq. Just use the existing infrastructure to do
so and remove all the softirq leftovers.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/hrtimer.h    |    1 
 include/linux/interrupt.h  |    1 
 include/trace/events/irq.h |    1 
 kernel/softirq.c           |    2 
 kernel/time/hrtimer.c      |  163 +++++++++++----------------------------------
 kernel/time/timer.c        |    2 
 6 files changed, 44 insertions(+), 126 deletions(-)

Index: tip/include/linux/hrtimer.h
===================================================================
--- tip.orig/include/linux/hrtimer.h
+++ tip/include/linux/hrtimer.h
@@ -444,7 +444,6 @@ extern int schedule_hrtimeout(ktime_t *e
 
 /* Soft interrupt function to run the hrtimer queues: */
 extern void hrtimer_run_queues(void);
-extern void hrtimer_run_pending(void);
 
 /* Bootup initialization: */
 extern void __init hrtimers_init(void);
Index: tip/include/linux/interrupt.h
===================================================================
--- tip.orig/include/linux/interrupt.h
+++ tip/include/linux/interrupt.h
@@ -401,7 +401,6 @@ enum
 	BLOCK_IOPOLL_SOFTIRQ,
 	TASKLET_SOFTIRQ,
 	SCHED_SOFTIRQ,
-	HRTIMER_SOFTIRQ,
 	RCU_SOFTIRQ,    /* Preferable RCU should always be the last softirq */
 
 	NR_SOFTIRQS
Index: tip/include/trace/events/irq.h
===================================================================
--- tip.orig/include/trace/events/irq.h
+++ tip/include/trace/events/irq.h
@@ -20,7 +20,6 @@ struct softirq_action;
 			 softirq_name(BLOCK_IOPOLL),	\
 			 softirq_name(TASKLET),		\
 			 softirq_name(SCHED),		\
-			 softirq_name(HRTIMER),		\
 			 softirq_name(RCU))
 
 /**
Index: tip/kernel/softirq.c
===================================================================
--- tip.orig/kernel/softirq.c
+++ tip/kernel/softirq.c
@@ -59,7 +59,7 @@ DEFINE_PER_CPU(struct task_struct *, kso
 
 const char * const softirq_to_name[NR_SOFTIRQS] = {
 	"HI", "TIMER", "NET_TX", "NET_RX", "BLOCK", "BLOCK_IOPOLL",
-	"TASKLET", "SCHED", "HRTIMER", "RCU"
+	"TASKLET", "SCHED", "RCU"
 };
 
 /*
Index: tip/kernel/time/hrtimer.c
===================================================================
--- tip.orig/kernel/time/hrtimer.c
+++ tip/kernel/time/hrtimer.c
@@ -555,59 +555,48 @@ hrtimer_force_reprogram(struct hrtimer_c
 }
 
 /*
- * Shared reprogramming for clock_realtime and clock_monotonic
- *
  * When a timer is enqueued and expires earlier than the already enqueued
  * timers, we have to check, whether it expires earlier than the timer for
  * which the clock event device was armed.
  *
- * Note, that in case the state has HRTIMER_STATE_CALLBACK set, no reprogramming
- * and no expiry check happens. The timer gets enqueued into the rbtree. The
- * reprogramming and expiry check is done in the hrtimer_interrupt or in the
- * softirq.
- *
  * Called with interrupts disabled and base->cpu_base.lock held
  */
-static int hrtimer_reprogram(struct hrtimer *timer,
-			     struct hrtimer_clock_base *base)
+static void hrtimer_reprogram(struct hrtimer *timer,
+			      struct hrtimer_clock_base *base)
 {
 	struct hrtimer_cpu_base *cpu_base = this_cpu_ptr(&hrtimer_bases);
 	ktime_t expires = ktime_sub(hrtimer_get_expires(timer), base->offset);
-	int res;
 
 	WARN_ON_ONCE(hrtimer_get_expires_tv64(timer) < 0);
 
 	/*
-	 * When the callback is running, we do not reprogram the clock event
-	 * device. The timer callback is either running on a different CPU or
-	 * the callback is executed in the hrtimer_interrupt context. The
-	 * reprogramming is handled either by the softirq, which called the
-	 * callback or at the end of the hrtimer_interrupt.
+	 * If the timer is not on the current cpu, we cannot reprogram
+	 * the other cpus clock event device.
 	 */
-	if (hrtimer_callback_running(timer))
-		return 0;
+	if (base->cpu_base != cpu_base)
+		return;
+
+	/*
+	 * If the hrtimer interrupt is running, then it will
+	 * reevaluate the clock bases and reprogram the clock event
+	 * device. The callbacks are always executed in hard interrupt
+	 * context so we don't need an extra check for a running
+	 * callback.
+	 */
+	if (cpu_base->in_hrtirq)
+		return;
 
 	/*
 	 * CLOCK_REALTIME timer might be requested with an absolute
-	 * expiry time which is less than base->offset. Nothing wrong
-	 * about that, just avoid to call into the tick code, which
-	 * has now objections against negative expiry values.
+	 * expiry time which is less than base->offset. Set it to 0.
 	 */
 	if (expires.tv64 < 0)
-		return -ETIME;
+		expires.tv64 = 0;
 
 	if (expires.tv64 >= cpu_base->expires_next.tv64)
-		return 0;
-
-	/*
-	 * When the target cpu of the timer is currently executing
-	 * hrtimer_interrupt(), then we do not touch the clock event
-	 * device. hrtimer_interrupt() will reevaluate all clock bases
-	 * before reprogramming the device.
-	 */
-	if (cpu_base->in_hrtirq)
-		return 0;
+		return;
 
+	/* Update the pointer to the next expiring timer */
 	cpu_base->next_timer = timer;
 
 	/*
@@ -617,15 +606,14 @@ static int hrtimer_reprogram(struct hrti
 	 * to make progress.
 	 */
 	if (cpu_base->hang_detected)
-		return 0;
+		return;
 
 	/*
-	 * Clockevents returns -ETIME, when the event was in the past.
+	 * Program the timer hardware. We enforce the expiry for
+	 * events which are already in the past.
 	 */
-	res = tick_program_event(expires, 0);
-	if (!IS_ERR_VALUE(res))
-		cpu_base->expires_next = expires;
-	return res;
+	cpu_base->expires_next = expires;
+	tick_program_event(expires, 1);
 }
 
 /*
@@ -660,19 +648,11 @@ static void retrigger_next_event(void *a
  */
 static int hrtimer_switch_to_hres(void)
 {
-	int cpu = smp_processor_id();
-	struct hrtimer_cpu_base *base = &per_cpu(hrtimer_bases, cpu);
-	unsigned long flags;
-
-	if (base->hres_active)
-		return 1;
-
-	local_irq_save(flags);
+	struct hrtimer_cpu_base *base = this_cpu_ptr(&hrtimer_bases);
 
 	if (tick_init_highres()) {
-		local_irq_restore(flags);
 		printk(KERN_WARNING "Could not switch to high resolution "
-				    "mode on CPU %d\n", cpu);
+				    "mode on CPU %d\n", base->cpu);
 		return 0;
 	}
 	base->hres_active = 1;
@@ -681,7 +661,6 @@ static int hrtimer_switch_to_hres(void)
 	tick_setup_sched_timer();
 	/* "Retrigger" the interrupt to get things going */
 	retrigger_next_event(NULL);
-	local_irq_restore(flags);
 	return 1;
 }
 
@@ -976,26 +955,8 @@ int __hrtimer_start_range_ns(struct hrti
 		 * on dynticks target.
 		 */
 		wake_up_nohz_cpu(new_base->cpu_base->cpu);
-	} else if (new_base->cpu_base == this_cpu_ptr(&hrtimer_bases) &&
-			hrtimer_reprogram(timer, new_base)) {
-		/*
-		 * Only allow reprogramming if the new base is on this CPU.
-		 * (it might still be on another CPU if the timer was pending)
-		 *
-		 * XXX send_remote_softirq() ?
-		 */
-		if (wakeup) {
-			/*
-			 * We need to drop cpu_base->lock to avoid a
-			 * lock ordering issue vs. rq->lock.
-			 */
-			raw_spin_unlock(&new_base->cpu_base->lock);
-			raise_softirq_irqoff(HRTIMER_SOFTIRQ);
-			local_irq_restore(flags);
-			return ret;
-		} else {
-			__raise_softirq_irqoff(HRTIMER_SOFTIRQ);
-		}
+	} else {
+		hrtimer_reprogram(timer, new_base);
 	}
 
 	unlock_hrtimer_base(timer, &flags);
@@ -1346,7 +1307,7 @@ retry:
  * local version of hrtimer_peek_ahead_timers() called with interrupts
  * disabled.
  */
-static void __hrtimer_peek_ahead_timers(void)
+static inline void __hrtimer_peek_ahead_timers(void)
 {
 	struct tick_device *td;
 
@@ -1358,29 +1319,6 @@ static void __hrtimer_peek_ahead_timers(
 		hrtimer_interrupt(td->evtdev);
 }
 
-/**
- * hrtimer_peek_ahead_timers -- run soft-expired timers now
- *
- * hrtimer_peek_ahead_timers will peek at the timer queue of
- * the current cpu and check if there are any timers for which
- * the soft expires time has passed. If any such timers exist,
- * they are run immediately and then removed from the timer queue.
- *
- */
-void hrtimer_peek_ahead_timers(void)
-{
-	unsigned long flags;
-
-	local_irq_save(flags);
-	__hrtimer_peek_ahead_timers();
-	local_irq_restore(flags);
-}
-
-static void run_hrtimer_softirq(struct softirq_action *h)
-{
-	hrtimer_peek_ahead_timers();
-}
-
 #else /* CONFIG_HIGH_RES_TIMERS */
 
 static inline void __hrtimer_peek_ahead_timers(void) { }
@@ -1388,31 +1326,7 @@ static inline void __hrtimer_peek_ahead_
 #endif	/* !CONFIG_HIGH_RES_TIMERS */
 
 /*
- * Called from timer softirq every jiffy, expire hrtimers:
- *
- * For HRT its the fall back code to run the softirq in the timer
- * softirq context in case the hrtimer initialization failed or has
- * not been done yet.
- */
-void hrtimer_run_pending(void)
-{
-	if (hrtimer_hres_active())
-		return;
-
-	/*
-	 * This _is_ ugly: We have to check in the softirq context,
-	 * whether we can switch to highres and / or nohz mode. The
-	 * clocksource switch happens in the timer interrupt with
-	 * xtime_lock held. Notification from there only sets the
-	 * check bit in the tick_oneshot code, otherwise we might
-	 * deadlock vs. xtime_lock.
-	 */
-	if (tick_check_oneshot_change(!hrtimer_is_hres_enabled()))
-		hrtimer_switch_to_hres();
-}
-
-/*
- * Called from hardirq context every jiffy
+ * Called from run_local_timers in hardirq context every jiffy
  */
 void hrtimer_run_queues(void)
 {
@@ -1422,6 +1336,18 @@ void hrtimer_run_queues(void)
 	if (__hrtimer_hres_active(cpu_base))
 		return;
 
+	/*
+	 * This _is_ ugly: We have to check periodically, whether we
+	 * can switch to highres and / or nohz mode. The clocksource
+	 * switch happens with xtime_lock held. Notification from
+	 * there only sets the check bit in the tick_oneshot code,
+	 * otherwise we might deadlock vs. xtime_lock.
+	 */
+	if (tick_check_oneshot_change(!hrtimer_is_hres_enabled())) {
+		hrtimer_switch_to_hres();
+		return;
+	}
+
 	raw_spin_lock(&cpu_base->lock);
 	now = hrtimer_update_base(cpu_base);
 	__hrtimer_run_queues(cpu_base, now);
@@ -1692,9 +1618,6 @@ void __init hrtimers_init(void)
 	hrtimer_cpu_notify(&hrtimers_nb, (unsigned long)CPU_UP_PREPARE,
 			  (void *)(long)smp_processor_id());
 	register_cpu_notifier(&hrtimers_nb);
-#ifdef CONFIG_HIGH_RES_TIMERS
-	open_softirq(HRTIMER_SOFTIRQ, run_hrtimer_softirq);
-#endif
 }
 
 /**
Index: tip/kernel/time/timer.c
===================================================================
--- tip.orig/kernel/time/timer.c
+++ tip/kernel/time/timer.c
@@ -1409,8 +1409,6 @@ static void run_timer_softirq(struct sof
 {
 	struct tvec_base *base = __this_cpu_read(tvec_bases);
 
-	hrtimer_run_pending();
-
 	if (time_after_eq(jiffies, base->timer_jiffies))
 		__run_timers(base);
 }



^ permalink raw reply	[flat|nested] 123+ messages in thread

* [patch 17/39] tick: sched: Remove hrtimer_active() checks
  2015-04-14 21:08 [patch 00/39] hrtimer/tick: Optimizations, cleanups and solutions for various issues Thomas Gleixner
                   ` (15 preceding siblings ...)
  2015-04-14 21:08 ` [patch 16/39] hrtimer: Get rid of hrtimer softirq Thomas Gleixner
@ 2015-04-14 21:08 ` Thomas Gleixner
  2015-04-16 13:37   ` Frederic Weisbecker
  2015-04-22 19:09   ` [tip:timers/core] " tip-bot for Thomas Gleixner
  2015-04-14 21:08 ` [patch 18/39] tick: sched: Force tick interrupt and get rid of softirq magic Thomas Gleixner
                   ` (21 subsequent siblings)
  38 siblings, 2 replies; 123+ messages in thread
From: Thomas Gleixner @ 2015-04-14 21:08 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Preeti U Murthy, Viresh Kumar,
	Marcelo Tosatti, Frederic Weisbecker, John Stultz,
	Marcelo Tosatti

[-- Attachment #1: tick-sched-remove-active-checks.patch --]
[-- Type: text/plain, Size: 2049 bytes --]

hrtimer_start() enforces a timer interrupt if the timer is already
expired. Get rid of the checks and the forward loop.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>

---
 kernel/time/tick-sched.c |   19 ++++---------------
 1 file changed, 4 insertions(+), 15 deletions(-)

Index: tip/kernel/time/tick-sched.c
===================================================================
--- tip.orig/kernel/time/tick-sched.c
+++ tip/kernel/time/tick-sched.c
@@ -696,11 +696,9 @@ static ktime_t tick_nohz_stop_sched_tick
 		if (ts->nohz_mode == NOHZ_MODE_HIGHRES) {
 			hrtimer_start(&ts->sched_timer, expires,
 				      HRTIMER_MODE_ABS_PINNED);
-			/* Check, if the timer was already in the past */
-			if (hrtimer_active(&ts->sched_timer))
-				goto out;
+			goto out;
 		} else if (!tick_program_event(expires, 0))
-				goto out;
+			goto out;
 		/*
 		 * We are past the event already. So we crossed a
 		 * jiffie boundary. Update jiffies and raise the
@@ -888,8 +886,6 @@ static void tick_nohz_restart(struct tic
 		if (ts->nohz_mode == NOHZ_MODE_HIGHRES) {
 			hrtimer_start_expires(&ts->sched_timer,
 					      HRTIMER_MODE_ABS_PINNED);
-			/* Check, if the timer was already in the past */
-			if (hrtimer_active(&ts->sched_timer))
 				break;
 		} else {
 			if (!tick_program_event(
@@ -1167,15 +1163,8 @@ void tick_setup_sched_timer(void)
 		hrtimer_add_expires_ns(&ts->sched_timer, offset);
 	}
 
-	for (;;) {
-		hrtimer_forward(&ts->sched_timer, now, tick_period);
-		hrtimer_start_expires(&ts->sched_timer,
-				      HRTIMER_MODE_ABS_PINNED);
-		/* Check, if the timer was already in the past */
-		if (hrtimer_active(&ts->sched_timer))
-			break;
-		now = ktime_get();
-	}
+	hrtimer_forward(&ts->sched_timer, now, tick_period);
+	hrtimer_start_expires(&ts->sched_timer, HRTIMER_MODE_ABS_PINNED);
 
 #ifdef CONFIG_NO_HZ_COMMON
 	if (tick_nohz_enabled) {



^ permalink raw reply	[flat|nested] 123+ messages in thread

* [patch 18/39] tick: sched: Force tick interrupt and get rid of softirq magic
  2015-04-14 21:08 [patch 00/39] hrtimer/tick: Optimizations, cleanups and solutions for various issues Thomas Gleixner
                   ` (16 preceding siblings ...)
  2015-04-14 21:08 ` [patch 17/39] tick: sched: Remove hrtimer_active() checks Thomas Gleixner
@ 2015-04-14 21:08 ` Thomas Gleixner
  2015-04-22 14:22   ` Frederic Weisbecker
  2015-04-22 19:09   ` [tip:timers/core] " tip-bot for Thomas Gleixner
  2015-04-14 21:08 ` [patch 19/39] tick: sched: Restructure code Thomas Gleixner
                   ` (20 subsequent siblings)
  38 siblings, 2 replies; 123+ messages in thread
From: Thomas Gleixner @ 2015-04-14 21:08 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Preeti U Murthy, Viresh Kumar,
	Marcelo Tosatti, Frederic Weisbecker, John Stultz,
	Marcelo Tosatti

[-- Attachment #1: tick-sched-force-tick-interrupt.patch --]
[-- Type: text/plain, Size: 4410 bytes --]

We already got rid of the hrtimer reprogramming loops and hoops as
hrtimer now enforces an interrupt if the enqueued time is in the past.

Do the same for the nohz non highres mode. That gets rid of the need
to raise the softirq which only serves the purpose of getting the
machine out of the inner idle loop.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>

---
 kernel/time/tick-sched.c |   83 ++++++++++++++++-------------------------------
 1 file changed, 29 insertions(+), 54 deletions(-)

Index: tip/kernel/time/tick-sched.c
===================================================================
--- tip.orig/kernel/time/tick-sched.c
+++ tip/kernel/time/tick-sched.c
@@ -565,6 +565,20 @@ u64 get_cpu_iowait_time_us(int cpu, u64
 }
 EXPORT_SYMBOL_GPL(get_cpu_iowait_time_us);
 
+static void tick_nohz_restart(struct tick_sched *ts, ktime_t now)
+{
+	hrtimer_cancel(&ts->sched_timer);
+	hrtimer_set_expires(&ts->sched_timer, ts->last_tick);
+
+	/* Forward the time to expire in the future */
+	hrtimer_forward(&ts->sched_timer, now, tick_period);
+
+	if (ts->nohz_mode == NOHZ_MODE_HIGHRES)
+		hrtimer_start_expires(&ts->sched_timer, HRTIMER_MODE_ABS_PINNED);
+	else
+		tick_program_event(hrtimer_get_expires(&ts->sched_timer), 1);
+}
+
 static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts,
 					 ktime_t now, int cpu)
 {
@@ -691,22 +705,18 @@ static ktime_t tick_nohz_stop_sched_tick
 			if (ts->nohz_mode == NOHZ_MODE_HIGHRES)
 				hrtimer_cancel(&ts->sched_timer);
 			goto out;
-		}
+		 }
 
-		if (ts->nohz_mode == NOHZ_MODE_HIGHRES) {
-			hrtimer_start(&ts->sched_timer, expires,
-				      HRTIMER_MODE_ABS_PINNED);
-			goto out;
-		} else if (!tick_program_event(expires, 0))
-			goto out;
-		/*
-		 * We are past the event already. So we crossed a
-		 * jiffie boundary. Update jiffies and raise the
-		 * softirq.
-		 */
-		tick_do_update_jiffies64(ktime_get());
+		 if (ts->nohz_mode == NOHZ_MODE_HIGHRES)
+			 hrtimer_start(&ts->sched_timer, expires,
+				       HRTIMER_MODE_ABS_PINNED);
+		 else
+			 tick_program_event(expires, 1);
+	} else {
+		/* Tick is stopped, but required now. Enforce it */
+		tick_nohz_restart(ts, now);
 	}
-	raise_softirq_irqoff(TIMER_SOFTIRQ);
+
 out:
 	ts->next_jiffies = next_jiffies;
 	ts->last_jiffies = last_jiffies;
@@ -874,30 +884,6 @@ ktime_t tick_nohz_get_sleep_length(void)
 	return ts->sleep_length;
 }
 
-static void tick_nohz_restart(struct tick_sched *ts, ktime_t now)
-{
-	hrtimer_cancel(&ts->sched_timer);
-	hrtimer_set_expires(&ts->sched_timer, ts->last_tick);
-
-	while (1) {
-		/* Forward the time to expire in the future */
-		hrtimer_forward(&ts->sched_timer, now, tick_period);
-
-		if (ts->nohz_mode == NOHZ_MODE_HIGHRES) {
-			hrtimer_start_expires(&ts->sched_timer,
-					      HRTIMER_MODE_ABS_PINNED);
-				break;
-		} else {
-			if (!tick_program_event(
-				hrtimer_get_expires(&ts->sched_timer), 0))
-				break;
-		}
-		/* Reread time and update jiffies */
-		now = ktime_get();
-		tick_do_update_jiffies64(now);
-	}
-}
-
 static void tick_nohz_restart_sched_tick(struct tick_sched *ts, ktime_t now)
 {
 	/* Update jiffies first */
@@ -968,12 +954,6 @@ void tick_nohz_idle_exit(void)
 	local_irq_enable();
 }
 
-static int tick_nohz_reprogram(struct tick_sched *ts, ktime_t now)
-{
-	hrtimer_forward(&ts->sched_timer, now, tick_period);
-	return tick_program_event(hrtimer_get_expires(&ts->sched_timer), 0);
-}
-
 /*
  * The nohz low res interrupt handler
  */
@@ -992,10 +972,8 @@ static void tick_nohz_handler(struct clo
 	if (unlikely(ts->tick_stopped))
 		return;
 
-	while (tick_nohz_reprogram(ts, now)) {
-		now = ktime_get();
-		tick_do_update_jiffies64(now);
-	}
+	hrtimer_forward(&ts->sched_timer, now, tick_period);
+	tick_program_event(hrtimer_get_expires(&ts->sched_timer), 1);
 }
 
 /**
@@ -1025,12 +1003,9 @@ static void tick_nohz_switch_to_nohz(voi
 	/* Get the next period */
 	next = tick_init_jiffy_update();
 
-	for (;;) {
-		hrtimer_set_expires(&ts->sched_timer, next);
-		if (!tick_program_event(next, 0))
-			break;
-		next = ktime_add(next, tick_period);
-	}
+	hrtimer_forward_now(&ts->sched_timer, tick_period);
+	hrtimer_set_expires(&ts->sched_timer, next);
+	tick_program_event(next, 1);
 	local_irq_enable();
 }
 



^ permalink raw reply	[flat|nested] 123+ messages in thread

* [patch 19/39] tick: sched: Restructure code
  2015-04-14 21:08 [patch 00/39] hrtimer/tick: Optimizations, cleanups and solutions for various issues Thomas Gleixner
                   ` (17 preceding siblings ...)
  2015-04-14 21:08 ` [patch 18/39] tick: sched: Force tick interrupt and get rid of softirq magic Thomas Gleixner
@ 2015-04-14 21:08 ` Thomas Gleixner
  2015-04-22 19:09   ` [tip:timers/core] tick: Sched: " tip-bot for Thomas Gleixner
  2015-04-14 21:08 ` [patch 20/39] tick: nohz: Rework next timer evaluation Thomas Gleixner
                   ` (19 subsequent siblings)
  38 siblings, 1 reply; 123+ messages in thread
From: Thomas Gleixner @ 2015-04-14 21:08 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Preeti U Murthy, Viresh Kumar,
	Marcelo Tosatti, Frederic Weisbecker, John Stultz,
	Marcelo Tosatti

[-- Attachment #1: tick-sched-restructure-code.patch --]
[-- Type: text/plain, Size: 7162 bytes --]

Get rid of one indentation level. Preparatory patch for a major
rework. No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>

---
 kernel/time/tick-sched.c |  173 ++++++++++++++++++++++-------------------------
 1 file changed, 82 insertions(+), 91 deletions(-)

Index: tip/kernel/time/tick-sched.c
===================================================================
--- tip.orig/kernel/time/tick-sched.c
+++ tip/kernel/time/tick-sched.c
@@ -611,112 +611,103 @@ static ktime_t tick_nohz_stop_sched_tick
 		}
 	}
 
+	if ((long)delta_jiffies <= 1) {
+		if (!ts->tick_stopped)
+			goto out;
+		if (delta_jiffies == 0) {
+			/* Tick is stopped, but required now. Enforce it */
+			tick_nohz_restart(ts, now);
+			goto out;
+		}
+	}
+
 	/*
-	 * Do not stop the tick, if we are only one off (or less)
-	 * or if the cpu is required for RCU:
+	 * If this cpu is the one which updates jiffies, then give up
+	 * the assignment and let it be taken by the cpu which runs
+	 * the tick timer next, which might be this cpu as well. If we
+	 * don't drop this here the jiffies might be stale and
+	 * do_timer() never invoked. Keep track of the fact that it
+	 * was the one which had the do_timer() duty last. If this cpu
+	 * is the one which had the do_timer() duty last, we limit the
+	 * sleep time to the timekeeping max_deferement value which we
+	 * retrieved above. Otherwise we can sleep as long as we want.
 	 */
-	if (!ts->tick_stopped && delta_jiffies <= 1)
-		goto out;
-
-	/* Schedule the tick, if we are at least one jiffie off */
-	if ((long)delta_jiffies >= 1) {
-
-		/*
-		 * If this cpu is the one which updates jiffies, then
-		 * give up the assignment and let it be taken by the
-		 * cpu which runs the tick timer next, which might be
-		 * this cpu as well. If we don't drop this here the
-		 * jiffies might be stale and do_timer() never
-		 * invoked. Keep track of the fact that it was the one
-		 * which had the do_timer() duty last. If this cpu is
-		 * the one which had the do_timer() duty last, we
-		 * limit the sleep time to the timekeeping
-		 * max_deferement value which we retrieved
-		 * above. Otherwise we can sleep as long as we want.
-		 */
-		if (cpu == tick_do_timer_cpu) {
-			tick_do_timer_cpu = TICK_DO_TIMER_NONE;
-			ts->do_timer_last = 1;
-		} else if (tick_do_timer_cpu != TICK_DO_TIMER_NONE) {
-			time_delta = KTIME_MAX;
-			ts->do_timer_last = 0;
-		} else if (!ts->do_timer_last) {
-			time_delta = KTIME_MAX;
-		}
+	if (cpu == tick_do_timer_cpu) {
+		tick_do_timer_cpu = TICK_DO_TIMER_NONE;
+		ts->do_timer_last = 1;
+	} else if (tick_do_timer_cpu != TICK_DO_TIMER_NONE) {
+		time_delta = KTIME_MAX;
+		ts->do_timer_last = 0;
+	} else if (!ts->do_timer_last) {
+		time_delta = KTIME_MAX;
+	}
 
 #ifdef CONFIG_NO_HZ_FULL
-		if (!ts->inidle) {
-			time_delta = min(time_delta,
-					 scheduler_tick_max_deferment());
-		}
+	if (!ts->inidle)
+		time_delta = min(time_delta, scheduler_tick_max_deferment());
 #endif
 
+	/*
+	 * calculate the expiry time for the next timer wheel
+	 * timer. delta_jiffies >= NEXT_TIMER_MAX_DELTA signals that
+	 * there is no timer pending or at least extremely far into
+	 * the future (12 days for HZ=1000). In this case we set the
+	 * expiry to the end of time.
+	 */
+	if (likely(delta_jiffies < NEXT_TIMER_MAX_DELTA)) {
 		/*
-		 * calculate the expiry time for the next timer wheel
-		 * timer. delta_jiffies >= NEXT_TIMER_MAX_DELTA signals
-		 * that there is no timer pending or at least extremely
-		 * far into the future (12 days for HZ=1000). In this
-		 * case we set the expiry to the end of time.
+		 * Calculate the time delta for the next timer event.
+		 * If the time delta exceeds the maximum time delta
+		 * permitted by the current clocksource then adjust
+		 * the time delta accordingly to ensure the
+		 * clocksource does not wrap.
 		 */
-		if (likely(delta_jiffies < NEXT_TIMER_MAX_DELTA)) {
-			/*
-			 * Calculate the time delta for the next timer event.
-			 * If the time delta exceeds the maximum time delta
-			 * permitted by the current clocksource then adjust
-			 * the time delta accordingly to ensure the
-			 * clocksource does not wrap.
-			 */
-			time_delta = min_t(u64, time_delta,
-					   tick_period.tv64 * delta_jiffies);
-		}
-
-		if (time_delta < KTIME_MAX)
-			expires = ktime_add_ns(last_update, time_delta);
-		else
-			expires.tv64 = KTIME_MAX;
+		time_delta = min_t(u64, time_delta,
+				   tick_period.tv64 * delta_jiffies);
+	}
 
-		/* Skip reprogram of event if its not changed */
-		if (ts->tick_stopped && ktime_equal(expires, dev->next_event))
-			goto out;
+	if (time_delta < KTIME_MAX)
+		expires = ktime_add_ns(last_update, time_delta);
+	else
+		expires.tv64 = KTIME_MAX;
 
-		ret = expires;
+	/* Skip reprogram of event if its not changed */
+	if (ts->tick_stopped && ktime_equal(expires, dev->next_event))
+		goto out;
 
-		/*
-		 * nohz_stop_sched_tick can be called several times before
-		 * the nohz_restart_sched_tick is called. This happens when
-		 * interrupts arrive which do not cause a reschedule. In the
-		 * first call we save the current tick time, so we can restart
-		 * the scheduler tick in nohz_restart_sched_tick.
-		 */
-		if (!ts->tick_stopped) {
-			nohz_balance_enter_idle(cpu);
-			calc_load_enter_idle();
-
-			ts->last_tick = hrtimer_get_expires(&ts->sched_timer);
-			ts->tick_stopped = 1;
-			trace_tick_stop(1, " ");
-		}
+	ret = expires;
 
-		/*
-		 * If the expiration time == KTIME_MAX, then
-		 * in this case we simply stop the tick timer.
-		 */
-		 if (unlikely(expires.tv64 == KTIME_MAX)) {
-			if (ts->nohz_mode == NOHZ_MODE_HIGHRES)
-				hrtimer_cancel(&ts->sched_timer);
-			goto out;
-		 }
+	/*
+	 * nohz_stop_sched_tick can be called several times before
+	 * the nohz_restart_sched_tick is called. This happens when
+	 * interrupts arrive which do not cause a reschedule. In the
+	 * first call we save the current tick time, so we can restart
+	 * the scheduler tick in nohz_restart_sched_tick.
+	 */
+	if (!ts->tick_stopped) {
+		nohz_balance_enter_idle(cpu);
+		calc_load_enter_idle();
+
+		ts->last_tick = hrtimer_get_expires(&ts->sched_timer);
+		ts->tick_stopped = 1;
+		trace_tick_stop(1, " ");
+	}
 
-		 if (ts->nohz_mode == NOHZ_MODE_HIGHRES)
-			 hrtimer_start(&ts->sched_timer, expires,
-				       HRTIMER_MODE_ABS_PINNED);
-		 else
-			 tick_program_event(expires, 1);
-	} else {
-		/* Tick is stopped, but required now. Enforce it */
-		tick_nohz_restart(ts, now);
+	/*
+	 * If the expiration time == KTIME_MAX, then
+	 * in this case we simply stop the tick timer.
+	 */
+	if (unlikely(expires.tv64 == KTIME_MAX)) {
+		if (ts->nohz_mode == NOHZ_MODE_HIGHRES)
+			hrtimer_cancel(&ts->sched_timer);
+		goto out;
 	}
 
+	if (ts->nohz_mode == NOHZ_MODE_HIGHRES)
+		hrtimer_start(&ts->sched_timer, expires,
+			      HRTIMER_MODE_ABS_PINNED);
+	else
+		tick_program_event(expires, 1);
 out:
 	ts->next_jiffies = next_jiffies;
 	ts->last_jiffies = last_jiffies;



^ permalink raw reply	[flat|nested] 123+ messages in thread

* [patch 20/39] tick: nohz: Rework next timer evaluation
  2015-04-14 21:08 [patch 00/39] hrtimer/tick: Optimizations, cleanups and solutions for various issues Thomas Gleixner
                   ` (18 preceding siblings ...)
  2015-04-14 21:08 ` [patch 19/39] tick: sched: Restructure code Thomas Gleixner
@ 2015-04-14 21:08 ` Thomas Gleixner
  2015-04-16 16:42   ` Paul E. McKenney
  2015-04-22 19:10   ` [tip:timers/core] tick: Nohz: " tip-bot for Thomas Gleixner
  2015-04-14 21:09 ` [patch 21/39] x86: perf: Use hrtimer_start() Thomas Gleixner
                   ` (18 subsequent siblings)
  38 siblings, 2 replies; 123+ messages in thread
From: Thomas Gleixner @ 2015-04-14 21:08 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Preeti U Murthy, Viresh Kumar,
	Marcelo Tosatti, Frederic Weisbecker, Paul E. McKenney,
	Josh Triplett, Lai Jiangshan, John Stultz, Marcelo Tosatti

[-- Attachment #1: tick-nohz-rework-next-timer-evaluation.patch --]
[-- Type: text/plain, Size: 17586 bytes --]

The evaluation of the next timer in the nohz code is based on jiffies,
while all the tick internals are nanosecond based. We also have to
convert hrtimer nanoseconds to jiffies in the !highres case. That's
just wrong and introduces interesting corner cases.

Turn it around and convert the next timer wheel timer expiry and the
rcu event to clock monotonic and base all calculations on
nanoseconds. That clearly identifies the case where no timer is
pending by an absolute expiry value of KTIME_MAX.
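
The conversion described above can be sketched in userspace C. This is
an illustration, not the kernel code: the function names are made up,
TICK_NSEC is hardcoded for an assumed HZ=1000, and KTIME_MAX is modelled
as INT64_MAX. The arithmetic follows what the diff below does in
get_next_timer_interrupt() and cmp_next_hrtimer_event().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TICK_NSEC	1000000ULL		/* assumption: HZ=1000 */
#define KTIME_MAX	((uint64_t)INT64_MAX)	/* "no timer pending" */

/*
 * Convert an absolute timer wheel expiry (jiffies) to clock monotonic
 * nanoseconds. basej/basem are the same instant in both units.
 */
static uint64_t wheel_expiry_to_mono(unsigned long basej, uint64_t basem,
				     unsigned long nextevt, bool pending)
{
	assert(basem <= KTIME_MAX);

	if (!pending)
		return KTIME_MAX;
	/* Already expired: report the base time so the tick fires now. */
	if ((long)(nextevt - basej) <= 0)
		return basem;
	return basem + (uint64_t)(nextevt - basej) * TICK_NSEC;
}

/*
 * Merge in the next hrtimer expiry for the !highres case. The result
 * is rounded up to a tick boundary because the tick expires hrtimers.
 */
static uint64_t merge_hrtimer_event(uint64_t basem, uint64_t expires,
				    uint64_t nextevt)
{
	if (expires <= nextevt)
		return expires;
	if (nextevt <= basem)
		return basem;
	return (nextevt + TICK_NSEC - 1) / TICK_NSEC * TICK_NSEC;
}

/* The earlier of the timer event and the rcu event wins. */
static uint64_t pick_next_tick(uint64_t next_tmr, uint64_t next_rcu)
{
	return next_rcu < next_tmr ? next_rcu : next_tmr;
}
```

With nothing pending, every input is KTIME_MAX and so is the result, so
the no-timer case needs no separate magic value.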

This makes the code more readable and gets rid of the jiffies magic in
the nohz code.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>

---
 include/linux/hrtimer.h     |    2 
 include/linux/rcupdate.h    |    6 +-
 include/linux/rcutree.h     |    2 
 include/linux/timer.h       |    7 --
 kernel/rcu/tree_plugin.h    |   14 +++--
 kernel/time/hrtimer.c       |   14 ++---
 kernel/time/tick-internal.h |    2 
 kernel/time/tick-sched.c    |  109 +++++++++++++++++++-------------------------
 kernel/time/tick-sched.h    |    2 
 kernel/time/timer.c         |   71 +++++++++++++---------------
 kernel/time/timer_list.c    |    4 -
 11 files changed, 107 insertions(+), 126 deletions(-)

Index: tip/include/linux/hrtimer.h
===================================================================
--- tip.orig/include/linux/hrtimer.h
+++ tip/include/linux/hrtimer.h
@@ -386,7 +386,7 @@ static inline int hrtimer_restart(struct
 /* Query timers: */
 extern ktime_t hrtimer_get_remaining(const struct hrtimer *timer);
 
-extern ktime_t hrtimer_get_next_event(void);
+extern u64 hrtimer_get_next_event(void);
 
 /*
  * A timer is active, when it is enqueued into the rbtree or the
Index: tip/include/linux/rcupdate.h
===================================================================
--- tip.orig/include/linux/rcupdate.h
+++ tip/include/linux/rcupdate.h
@@ -44,6 +44,8 @@
 #include <linux/debugobjects.h>
 #include <linux/bug.h>
 #include <linux/compiler.h>
+#include <linux/ktime.h>
+
 #include <asm/barrier.h>
 
 extern int rcu_expedited; /* for sysctl */
@@ -1122,9 +1124,9 @@ static inline notrace void rcu_read_unlo
 	__kfree_rcu(&((ptr)->rcu_head), offsetof(typeof(*(ptr)), rcu_head))
 
 #if defined(CONFIG_TINY_RCU) || defined(CONFIG_RCU_NOCB_CPU_ALL)
-static inline int rcu_needs_cpu(unsigned long *delta_jiffies)
+static inline int rcu_needs_cpu(u64 basemono, u64 *nextevt)
 {
-	*delta_jiffies = ULONG_MAX;
+	*nextevt = KTIME_MAX;
 	return 0;
 }
 #endif /* #if defined(CONFIG_TINY_RCU) || defined(CONFIG_RCU_NOCB_CPU_ALL) */
Index: tip/include/linux/rcutree.h
===================================================================
--- tip.orig/include/linux/rcutree.h
+++ tip/include/linux/rcutree.h
@@ -32,7 +32,7 @@
 
 void rcu_note_context_switch(void);
 #ifndef CONFIG_RCU_NOCB_CPU_ALL
-int rcu_needs_cpu(unsigned long *delta_jiffies);
+int rcu_needs_cpu(u64 basem, u64 *nextevt);
 #endif /* #ifndef CONFIG_RCU_NOCB_CPU_ALL */
 void rcu_cpu_stall_reset(void);
 
Index: tip/include/linux/timer.h
===================================================================
--- tip.orig/include/linux/timer.h
+++ tip/include/linux/timer.h
@@ -188,13 +188,6 @@ extern void set_timer_slack(struct timer
 #define NEXT_TIMER_MAX_DELTA	((1UL << 30) - 1)
 
 /*
- * Return when the next timer-wheel timeout occurs (in absolute jiffies),
- * locks the timer base and does the comparison against the given
- * jiffie.
- */
-extern unsigned long get_next_timer_interrupt(unsigned long now);
-
-/*
  * Timer-statistics info:
  */
 #ifdef CONFIG_TIMER_STATS
Index: tip/kernel/rcu/tree_plugin.h
===================================================================
--- tip.orig/kernel/rcu/tree_plugin.h
+++ tip/kernel/rcu/tree_plugin.h
@@ -1372,9 +1372,9 @@ static void rcu_prepare_kthreads(int cpu
  * any flavor of RCU.
  */
 #ifndef CONFIG_RCU_NOCB_CPU_ALL
-int rcu_needs_cpu(unsigned long *delta_jiffies)
+int rcu_needs_cpu(u64 basemono, u64 *nextevt)
 {
-	*delta_jiffies = ULONG_MAX;
+	*nextevt = KTIME_MAX;
 	return rcu_cpu_has_callbacks(NULL);
 }
 #endif /* #ifndef CONFIG_RCU_NOCB_CPU_ALL */
@@ -1485,16 +1485,17 @@ static bool __maybe_unused rcu_try_advan
  * The caller must have disabled interrupts.
  */
 #ifndef CONFIG_RCU_NOCB_CPU_ALL
-int rcu_needs_cpu(unsigned long *dj)
+int rcu_needs_cpu(u64 basemono, u64 *nextevt)
 {
 	struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks);
+	unsigned long dj;
 
 	/* Snapshot to detect later posting of non-lazy callback. */
 	rdtp->nonlazy_posted_snap = rdtp->nonlazy_posted;
 
 	/* If no callbacks, RCU doesn't need the CPU. */
 	if (!rcu_cpu_has_callbacks(&rdtp->all_lazy)) {
-		*dj = ULONG_MAX;
+		*nextevt = KTIME_MAX;
 		return 0;
 	}
 
@@ -1508,11 +1509,12 @@ int rcu_needs_cpu(unsigned long *dj)
 
 	/* Request timer delay depending on laziness, and round. */
 	if (!rdtp->all_lazy) {
-		*dj = round_up(rcu_idle_gp_delay + jiffies,
+		dj = round_up(rcu_idle_gp_delay + jiffies,
 			       rcu_idle_gp_delay) - jiffies;
 	} else {
-		*dj = round_jiffies(rcu_idle_lazy_gp_delay + jiffies) - jiffies;
+		dj = round_jiffies(rcu_idle_lazy_gp_delay + jiffies) - jiffies;
 	}
+	*nextevt = basemono + dj * TICK_NSEC;
 	return 0;
 }
 #endif /* #ifndef CONFIG_RCU_NOCB_CPU_ALL */
Index: tip/kernel/time/hrtimer.c
===================================================================
--- tip.orig/kernel/time/hrtimer.c
+++ tip/kernel/time/hrtimer.c
@@ -1072,26 +1072,22 @@ EXPORT_SYMBOL_GPL(hrtimer_get_remaining)
 /**
  * hrtimer_get_next_event - get the time until next expiry event
  *
- * Returns the delta to the next expiry event or KTIME_MAX if no timer
- * is pending.
+ * Returns the next expiry time or KTIME_MAX if no timer is pending.
  */
-ktime_t hrtimer_get_next_event(void)
+u64 hrtimer_get_next_event(void)
 {
 	struct hrtimer_cpu_base *cpu_base = this_cpu_ptr(&hrtimer_bases);
-	ktime_t mindelta = { .tv64 = KTIME_MAX };
+	u64 expires = KTIME_MAX;
 	unsigned long flags;
 
 	raw_spin_lock_irqsave(&cpu_base->lock, flags);
 
 	if (!__hrtimer_hres_active(cpu_base))
-		mindelta = ktime_sub(__hrtimer_get_next_event(cpu_base),
-				     ktime_get());
+		expires = __hrtimer_get_next_event(cpu_base).tv64;
 
 	raw_spin_unlock_irqrestore(&cpu_base->lock, flags);
 
-	if (mindelta.tv64 < 0)
-		mindelta.tv64 = 0;
-	return mindelta;
+	return expires;
 }
 #endif
 
Index: tip/kernel/time/tick-internal.h
===================================================================
--- tip.orig/kernel/time/tick-internal.h
+++ tip/kernel/time/tick-internal.h
@@ -137,3 +137,5 @@ extern void tick_nohz_init(void);
 # else
 static inline void tick_nohz_init(void) { }
 #endif
+
+extern u64 get_next_timer_interrupt(unsigned long basej, u64 basem);
Index: tip/kernel/time/tick-sched.c
===================================================================
--- tip.orig/kernel/time/tick-sched.c
+++ tip/kernel/time/tick-sched.c
@@ -582,39 +582,46 @@ static void tick_nohz_restart(struct tic
 static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts,
 					 ktime_t now, int cpu)
 {
-	unsigned long seq, last_jiffies, next_jiffies, delta_jiffies;
-	ktime_t last_update, expires, ret = { .tv64 = 0 };
-	unsigned long rcu_delta_jiffies;
 	struct clock_event_device *dev = __this_cpu_read(tick_cpu_device.evtdev);
-	u64 time_delta;
-
-	time_delta = timekeeping_max_deferment();
+	u64 basemono, next_tick, next_tmr, next_rcu, delta, expires;
+	unsigned long seq, basejiff;
+	ktime_t	tick;
 
 	/* Read jiffies and the time when jiffies were updated last */
 	do {
 		seq = read_seqbegin(&jiffies_lock);
-		last_update = last_jiffies_update;
-		last_jiffies = jiffies;
+		basemono = last_jiffies_update.tv64;
+		basejiff = jiffies;
 	} while (read_seqretry(&jiffies_lock, seq));
+	ts->last_jiffies = basejiff;
 
-	if (rcu_needs_cpu(&rcu_delta_jiffies) ||
+	if (rcu_needs_cpu(basemono, &next_rcu) ||
 	    arch_needs_cpu() || irq_work_needs_cpu()) {
-		next_jiffies = last_jiffies + 1;
-		delta_jiffies = 1;
+		next_tick = basemono + TICK_NSEC;
 	} else {
-		/* Get the next timer wheel timer */
-		next_jiffies = get_next_timer_interrupt(last_jiffies);
-		delta_jiffies = next_jiffies - last_jiffies;
-		if (rcu_delta_jiffies < delta_jiffies) {
-			next_jiffies = last_jiffies + rcu_delta_jiffies;
-			delta_jiffies = rcu_delta_jiffies;
-		}
+		/*
+		 * Get the next pending timer. If high resolution
+		 * timers are enabled this only takes the timer wheel
+		 * timers into account. If high resolution timers are
+		 * disabled this also looks at the next expiring
+		 * hrtimer.
+		 */
+		next_tmr = get_next_timer_interrupt(basejiff, basemono);
+		ts->next_timer = next_tmr;
+		/* Take the next rcu event into account */
+		next_tick = next_rcu < next_tmr ? next_rcu : next_tmr;
 	}
 
-	if ((long)delta_jiffies <= 1) {
+	/*
+	 * If the tick is due in the next period, keep it ticking or
+	 * restart it proper.
+	 */
+	delta = next_tick - basemono;
+	if (delta <= (u64)TICK_NSEC) {
+		tick.tv64 = 0;
 		if (!ts->tick_stopped)
 			goto out;
-		if (delta_jiffies == 0) {
+		if (delta == 0) {
 			/* Tick is stopped, but required now. Enforce it */
 			tick_nohz_restart(ts, now);
 			goto out;
@@ -629,54 +636,39 @@ static ktime_t tick_nohz_stop_sched_tick
 	 * do_timer() never invoked. Keep track of the fact that it
 	 * was the one which had the do_timer() duty last. If this cpu
 	 * is the one which had the do_timer() duty last, we limit the
-	 * sleep time to the timekeeping max_deferement value which we
-	 * retrieved above. Otherwise we can sleep as long as we want.
+	 * sleep time to the timekeeping max_deferement value.
+	 * Otherwise we can sleep as long as we want.
 	 */
+	delta = timekeeping_max_deferment();
 	if (cpu == tick_do_timer_cpu) {
 		tick_do_timer_cpu = TICK_DO_TIMER_NONE;
 		ts->do_timer_last = 1;
 	} else if (tick_do_timer_cpu != TICK_DO_TIMER_NONE) {
-		time_delta = KTIME_MAX;
+		delta = KTIME_MAX;
 		ts->do_timer_last = 0;
 	} else if (!ts->do_timer_last) {
-		time_delta = KTIME_MAX;
+		delta = KTIME_MAX;
 	}
 
 #ifdef CONFIG_NO_HZ_FULL
+	/* Limit the tick delta to the maximum scheduler deferment */
 	if (!ts->inidle)
-		time_delta = min(time_delta, scheduler_tick_max_deferment());
+		delta = min(delta, scheduler_tick_max_deferment());
 #endif
 
-	/*
-	 * calculate the expiry time for the next timer wheel
-	 * timer. delta_jiffies >= NEXT_TIMER_MAX_DELTA signals that
-	 * there is no timer pending or at least extremely far into
-	 * the future (12 days for HZ=1000). In this case we set the
-	 * expiry to the end of time.
-	 */
-	if (likely(delta_jiffies < NEXT_TIMER_MAX_DELTA)) {
-		/*
-		 * Calculate the time delta for the next timer event.
-		 * If the time delta exceeds the maximum time delta
-		 * permitted by the current clocksource then adjust
-		 * the time delta accordingly to ensure the
-		 * clocksource does not wrap.
-		 */
-		time_delta = min_t(u64, time_delta,
-				   tick_period.tv64 * delta_jiffies);
-	}
-
-	if (time_delta < KTIME_MAX)
-		expires = ktime_add_ns(last_update, time_delta);
+	/* Calculate the next expiry time */
+	if (delta < (KTIME_MAX - basemono))
+		expires = basemono + delta;
 	else
-		expires.tv64 = KTIME_MAX;
+		expires = KTIME_MAX;
+
+	expires = min_t(u64, expires, next_tick);
+	tick.tv64 = expires;
 
 	/* Skip reprogram of event if its not changed */
-	if (ts->tick_stopped && ktime_equal(expires, dev->next_event))
+	if (ts->tick_stopped && (expires == dev->next_event.tv64))
 		goto out;
 
-	ret = expires;
-
 	/*
 	 * nohz_stop_sched_tick can be called several times before
 	 * the nohz_restart_sched_tick is called. This happens when
@@ -694,26 +686,23 @@ static ktime_t tick_nohz_stop_sched_tick
 	}
 
 	/*
-	 * If the expiration time == KTIME_MAX, then
-	 * in this case we simply stop the tick timer.
+	 * If the expiration time == KTIME_MAX, then we simply stop
+	 * the tick timer.
 	 */
-	if (unlikely(expires.tv64 == KTIME_MAX)) {
+	if (unlikely(expires == KTIME_MAX)) {
 		if (ts->nohz_mode == NOHZ_MODE_HIGHRES)
 			hrtimer_cancel(&ts->sched_timer);
 		goto out;
 	}
 
 	if (ts->nohz_mode == NOHZ_MODE_HIGHRES)
-		hrtimer_start(&ts->sched_timer, expires,
-			      HRTIMER_MODE_ABS_PINNED);
+		hrtimer_start(&ts->sched_timer, tick, HRTIMER_MODE_ABS_PINNED);
 	else
-		tick_program_event(expires, 1);
+		tick_program_event(tick, 1);
 out:
-	ts->next_jiffies = next_jiffies;
-	ts->last_jiffies = last_jiffies;
+	/* Update the estimated sleep length */
 	ts->sleep_length = ktime_sub(dev->next_event, now);
-
-	return ret;
+	return tick;
 }
 
 static void tick_nohz_full_stop_tick(struct tick_sched *ts)
Index: tip/kernel/time/tick-sched.h
===================================================================
--- tip.orig/kernel/time/tick-sched.h
+++ tip/kernel/time/tick-sched.h
@@ -57,7 +57,7 @@ struct tick_sched {
 	ktime_t				iowait_sleeptime;
 	ktime_t				sleep_length;
 	unsigned long			last_jiffies;
-	unsigned long			next_jiffies;
+	u64				next_timer;
 	ktime_t				idle_expires;
 	int				do_timer_last;
 };
Index: tip/kernel/time/timer.c
===================================================================
--- tip.orig/kernel/time/timer.c
+++ tip/kernel/time/timer.c
@@ -49,6 +49,8 @@
 #include <asm/timex.h>
 #include <asm/io.h>
 
+#include "tick-internal.h"
+
 #define CREATE_TRACE_POINTS
 #include <trace/events/timer.h>
 
@@ -1311,54 +1313,48 @@ cascade:
  * Check, if the next hrtimer event is before the next timer wheel
  * event:
  */
-static unsigned long cmp_next_hrtimer_event(unsigned long now,
-					    unsigned long expires)
+static u64 cmp_next_hrtimer_event(u64 basem, u64 expires)
 {
-	ktime_t hr_delta = hrtimer_get_next_event();
-	struct timespec tsdelta;
-	unsigned long delta;
-
-	if (hr_delta.tv64 == KTIME_MAX)
-		return expires;
+	u64 nextevt = hrtimer_get_next_event();
 
 	/*
-	 * Expired timer available, let it expire in the next tick
+	 * If high resolution timers are enabled
+	 * hrtimer_get_next_event() returns KTIME_MAX.
 	 */
-	if (hr_delta.tv64 <= 0)
-		return now + 1;
-
-	tsdelta = ktime_to_timespec(hr_delta);
-	delta = timespec_to_jiffies(&tsdelta);
+	if (expires <= nextevt)
+		return expires;
 
 	/*
-	 * Limit the delta to the max value, which is checked in
-	 * tick_nohz_stop_sched_tick():
+	 * If the next timer is already expired, return the tick base
+	 * time so the tick is fired immediately.
 	 */
-	if (delta > NEXT_TIMER_MAX_DELTA)
-		delta = NEXT_TIMER_MAX_DELTA;
+	if (nextevt <= basem)
+		return basem;
 
 	/*
-	 * Take rounding errors in to account and make sure, that it
-	 * expires in the next tick. Otherwise we go into an endless
-	 * ping pong due to tick_nohz_stop_sched_tick() retriggering
-	 * the timer softirq
+	 * Round up to the next jiffie. High resolution timers are
+	 * off, so the hrtimers are expired in the tick and we need to
+	 * make sure that this tick really expires the timer to avoid
+	 * a ping pong of the nohz stop code.
+	 *
+	 * Use DIV_ROUND_UP_ULL to prevent gcc calling __divdi3
 	 */
-	if (delta < 1)
-		delta = 1;
-	now += delta;
-	if (time_before(now, expires))
-		return now;
-	return expires;
+	return DIV_ROUND_UP_ULL(nextevt, TICK_NSEC) * TICK_NSEC;
 }
 
 /**
- * get_next_timer_interrupt - return the jiffy of the next pending timer
- * @now: current time (in jiffies)
+ * get_next_timer_interrupt - return the time (clock mono) of the next timer
+ * @basej:	base time jiffies
+ * @basem:	base time clock monotonic
+ *
+ * Returns the tick aligned clock monotonic time of the next pending
+ * timer or KTIME_MAX if no timer is pending.
  */
-unsigned long get_next_timer_interrupt(unsigned long now)
+u64 get_next_timer_interrupt(unsigned long basej, u64 basem)
 {
 	struct tvec_base *base = __this_cpu_read(tvec_bases);
-	unsigned long expires = now + NEXT_TIMER_MAX_DELTA;
+	u64 expires = KTIME_MAX;
+	unsigned long nextevt;
 
 	/*
 	 * Pretend that there is no timer pending if the cpu is offline.
@@ -1371,14 +1367,15 @@ unsigned long get_next_timer_interrupt(u
 	if (base->active_timers) {
 		if (time_before_eq(base->next_timer, base->timer_jiffies))
 			base->next_timer = __next_timer_interrupt(base);
-		expires = base->next_timer;
+		nextevt = base->next_timer;
+		if (time_before_eq(nextevt, basej))
+			expires = basem;
+		else
+			expires = basem + (nextevt - basej) * TICK_NSEC;
 	}
 	spin_unlock(&base->lock);
 
-	if (time_before_eq(expires, now))
-		return now;
-
-	return cmp_next_hrtimer_event(now, expires);
+	return cmp_next_hrtimer_event(basem, expires);
 }
 #endif
 
Index: tip/kernel/time/timer_list.c
===================================================================
--- tip.orig/kernel/time/timer_list.c
+++ tip/kernel/time/timer_list.c
@@ -184,7 +184,7 @@ static void print_cpu(struct seq_file *m
 		P_ns(idle_sleeptime);
 		P_ns(iowait_sleeptime);
 		P(last_jiffies);
-		P(next_jiffies);
+		P(next_timer);
 		P_ns(idle_expires);
 		SEQ_printf(m, "jiffies: %Lu\n",
 			   (unsigned long long)jiffies);
@@ -282,7 +282,7 @@ static void timer_list_show_tickdevices_
 
 static inline void timer_list_header(struct seq_file *m, u64 now)
 {
-	SEQ_printf(m, "Timer List Version: v0.7\n");
+	SEQ_printf(m, "Timer List Version: v0.8\n");
 	SEQ_printf(m, "HRTIMER_MAX_CLOCK_BASES: %d\n", HRTIMER_MAX_CLOCK_BASES);
 	SEQ_printf(m, "now at %Ld nsecs\n", (unsigned long long)now);
 	SEQ_printf(m, "\n");



^ permalink raw reply	[flat|nested] 123+ messages in thread

* [patch 21/39] x86: perf: Use hrtimer_start()
  2015-04-14 21:08 [patch 00/39] hrtimer/tick: Optimizations, cleanups and solutions for various issues Thomas Gleixner
                   ` (19 preceding siblings ...)
  2015-04-14 21:08 ` [patch 20/39] tick: nohz: Rework next timer evaluation Thomas Gleixner
@ 2015-04-14 21:09 ` Thomas Gleixner
  2015-04-22 19:10   ` [tip:timers/core] " tip-bot for Thomas Gleixner
  2015-04-14 21:09 ` [patch 22/39] x86: perf: uncore: " Thomas Gleixner
                   ` (17 subsequent siblings)
  38 siblings, 1 reply; 123+ messages in thread
From: Thomas Gleixner @ 2015-04-14 21:09 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Preeti U Murthy, Viresh Kumar,
	Marcelo Tosatti, Frederic Weisbecker, x86

[-- Attachment #1: x86-perf-rapl-use-hrtimer-start.patch --]
[-- Type: text/plain, Size: 888 bytes --]

hrtimer_start() no longer defers already expired timers to the
softirq. Get rid of the __hrtimer_start_range_ns() invocation.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
---
 arch/x86/kernel/cpu/perf_event_intel_rapl.c |    5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

Index: tip/arch/x86/kernel/cpu/perf_event_intel_rapl.c
===================================================================
--- tip.orig/arch/x86/kernel/cpu/perf_event_intel_rapl.c
+++ tip/arch/x86/kernel/cpu/perf_event_intel_rapl.c
@@ -182,9 +182,8 @@ again:
 
 static void rapl_start_hrtimer(struct rapl_pmu *pmu)
 {
-	__hrtimer_start_range_ns(&pmu->hrtimer,
-			pmu->timer_interval, 0,
-			HRTIMER_MODE_REL_PINNED, 0);
+	hrtimer_start(&pmu->hrtimer, pmu->timer_interval,
+		     HRTIMER_MODE_REL_PINNED);
 }
 
 static void rapl_stop_hrtimer(struct rapl_pmu *pmu)



^ permalink raw reply	[flat|nested] 123+ messages in thread

* [patch 22/39] x86: perf: uncore: Use hrtimer_start()
  2015-04-14 21:08 [patch 00/39] hrtimer/tick: Optimizations, cleanups and solutions for various issues Thomas Gleixner
                   ` (20 preceding siblings ...)
  2015-04-14 21:09 ` [patch 21/39] x86: perf: Use hrtimer_start() Thomas Gleixner
@ 2015-04-14 21:09 ` Thomas Gleixner
  2015-04-22 19:10   ` [tip:timers/core] " tip-bot for Thomas Gleixner
  2015-04-14 21:09 ` [patch 23/39] perf: core: " Thomas Gleixner
                   ` (16 subsequent siblings)
  38 siblings, 1 reply; 123+ messages in thread
From: Thomas Gleixner @ 2015-04-14 21:09 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Preeti U Murthy, Viresh Kumar,
	Marcelo Tosatti, Frederic Weisbecker, x86

[-- Attachment #1: x86-perf-uncore-use-hrtimer-start.patch --]
[-- Type: text/plain, Size: 971 bytes --]

hrtimer_start() no longer defers already expired timers to the
softirq. Get rid of the __hrtimer_start_range_ns() invocation.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
---
 arch/x86/kernel/cpu/perf_event_intel_uncore.c |    5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

Index: tip/arch/x86/kernel/cpu/perf_event_intel_uncore.c
===================================================================
--- tip.orig/arch/x86/kernel/cpu/perf_event_intel_uncore.c
+++ tip/arch/x86/kernel/cpu/perf_event_intel_uncore.c
@@ -233,9 +233,8 @@ static enum hrtimer_restart uncore_pmu_h
 
 void uncore_pmu_start_hrtimer(struct intel_uncore_box *box)
 {
-	__hrtimer_start_range_ns(&box->hrtimer,
-			ns_to_ktime(box->hrtimer_duration), 0,
-			HRTIMER_MODE_REL_PINNED, 0);
+	hrtimer_start(&box->hrtimer, ns_to_ktime(box->hrtimer_duration),
+		      HRTIMER_MODE_REL_PINNED);
 }
 
 void uncore_pmu_cancel_hrtimer(struct intel_uncore_box *box)



^ permalink raw reply	[flat|nested] 123+ messages in thread

* [patch 23/39] perf: core: Use hrtimer_start()
  2015-04-14 21:08 [patch 00/39] hrtimer/tick: Optimizations, cleanups and solutions for various issues Thomas Gleixner
                   ` (21 preceding siblings ...)
  2015-04-14 21:09 ` [patch 22/39] x86: perf: uncore: " Thomas Gleixner
@ 2015-04-14 21:09 ` Thomas Gleixner
  2015-04-22 19:11   ` [tip:timers/core] " tip-bot for Thomas Gleixner
  2015-04-14 21:09 ` [patch 24/39] sched: core: Use hrtimer_start[_expires]() Thomas Gleixner
                   ` (15 subsequent siblings)
  38 siblings, 1 reply; 123+ messages in thread
From: Thomas Gleixner @ 2015-04-14 21:09 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Preeti U Murthy, Viresh Kumar,
	Marcelo Tosatti, Frederic Weisbecker

[-- Attachment #1: perf-core-use-hrtimer-start.patch --]
[-- Type: text/plain, Size: 1186 bytes --]

hrtimer_start() no longer defers already expired timers to the
softirq. Get rid of the __hrtimer_start_range_ns() invocation.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/events/core.c |    9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

Index: tip/kernel/events/core.c
===================================================================
--- tip.orig/kernel/events/core.c
+++ tip/kernel/events/core.c
@@ -853,9 +853,7 @@ static void perf_cpu_hrtimer_restart(str
 	if (hrtimer_active(hr))
 		return;
 
-	if (!hrtimer_callback_running(hr))
-		__hrtimer_start_range_ns(hr, cpuctx->hrtimer_interval,
-					 0, HRTIMER_MODE_REL_PINNED, 0);
+	hrtimer_start(hr, cpuctx->hrtimer_interval, HRTIMER_MODE_REL_PINNED);
 }
 
 void perf_pmu_disable(struct pmu *pmu)
@@ -6530,9 +6528,8 @@ static void perf_swevent_start_hrtimer(s
 	} else {
 		period = max_t(u64, 10000, hwc->sample_period);
 	}
-	__hrtimer_start_range_ns(&hwc->hrtimer,
-				ns_to_ktime(period), 0,
-				HRTIMER_MODE_REL_PINNED, 0);
+	hrtimer_start(&hwc->hrtimer, ns_to_ktime(period),
+		      HRTIMER_MODE_REL_PINNED);
 }
 
 static void perf_swevent_cancel_hrtimer(struct perf_event *event)



^ permalink raw reply	[flat|nested] 123+ messages in thread

* [patch 24/39] sched: core: Use hrtimer_start[_expires]()
  2015-04-14 21:08 [patch 00/39] hrtimer/tick: Optimizations, cleanups and solutions for various issues Thomas Gleixner
                   ` (22 preceding siblings ...)
  2015-04-14 21:09 ` [patch 23/39] perf: core: " Thomas Gleixner
@ 2015-04-14 21:09 ` Thomas Gleixner
  2015-04-22 19:11   ` [tip:timers/core] " tip-bot for Thomas Gleixner
  2015-04-14 21:09 ` [patch 25/39] sched: deadline: Use hrtimer_start() Thomas Gleixner
                   ` (14 subsequent siblings)
  38 siblings, 1 reply; 123+ messages in thread
From: Thomas Gleixner @ 2015-04-14 21:09 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Preeti U Murthy, Viresh Kumar,
	Marcelo Tosatti, Frederic Weisbecker

[-- Attachment #1: sched-core-use-hrtimer-start.patch --]
[-- Type: text/plain, Size: 2727 bytes --]

hrtimer_start() now enforces a timer interrupt when an already expired
timer is enqueued.

Get rid of the __hrtimer_start_range_ns() invocations and the loops
around it.
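
To illustrate why a single forward-and-start is enough, here is a
simplified userspace model of the forwarding step. It is a sketch
only: struct model_timer and model_forward() are hypothetical names, and
the model leaves out the minimum-resolution clamp and all locking of the
real hrtimer_forward().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the expiry field of struct hrtimer. */
struct model_timer {
	int64_t expires;	/* absolute expiry in nanoseconds */
};

/*
 * Push the expiry forward in whole intervals until it lies after
 * 'now'. Returns the number of intervals that were skipped, or 0 if
 * the timer had not expired yet.
 */
static uint64_t model_forward(struct model_timer *t, int64_t now,
			      int64_t interval)
{
	uint64_t orun = 1;
	int64_t delta = now - t->expires;

	assert(interval > 0);

	if (delta < 0)
		return 0;

	if (delta >= interval) {
		orun = (uint64_t)(delta / interval);
		t->expires += interval * (int64_t)orun;
		if (t->expires > now)
			return orun;
		orun++;
	}
	t->expires += interval;
	return orun;
}
```

After the call the expiry lies after the sampled 'now'. Should time move
past it before the timer is enqueued, the enforced timer interrupt
covers that case, so the caller has nothing to retry.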

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/sched/core.c |   28 ++++++++--------------------
 kernel/sched/fair.c |    2 +-
 2 files changed, 9 insertions(+), 21 deletions(-)

Index: tip/kernel/sched/core.c
===================================================================
--- tip.orig/kernel/sched/core.c
+++ tip/kernel/sched/core.c
@@ -92,22 +92,11 @@
 
 void start_bandwidth_timer(struct hrtimer *period_timer, ktime_t period)
 {
-	unsigned long delta;
-	ktime_t soft, hard, now;
-
-	for (;;) {
-		if (hrtimer_active(period_timer))
-			break;
-
-		now = hrtimer_cb_get_time(period_timer);
-		hrtimer_forward(period_timer, now, period);
+	if (hrtimer_active(period_timer))
+		return;
 
-		soft = hrtimer_get_softexpires(period_timer);
-		hard = hrtimer_get_expires(period_timer);
-		delta = ktime_to_ns(ktime_sub(hard, soft));
-		__hrtimer_start_range_ns(period_timer, soft, delta,
-					 HRTIMER_MODE_ABS_PINNED, 0);
-	}
+	hrtimer_forward_now(period_timer, period);
+	hrtimer_start_expires(period_timer, HRTIMER_MODE_ABS_PINNED);
 }
 
 DEFINE_MUTEX(sched_domains_mutex);
@@ -352,12 +341,11 @@ static enum hrtimer_restart hrtick(struc
 
 #ifdef CONFIG_SMP
 
-static int __hrtick_restart(struct rq *rq)
+static void __hrtick_restart(struct rq *rq)
 {
 	struct hrtimer *timer = &rq->hrtick_timer;
-	ktime_t time = hrtimer_get_softexpires(timer);
 
-	return __hrtimer_start_range_ns(timer, time, 0, HRTIMER_MODE_ABS_PINNED, 0);
+	hrtimer_start_expires(timer, HRTIMER_MODE_ABS_PINNED);
 }
 
 /*
@@ -437,8 +425,8 @@ void hrtick_start(struct rq *rq, u64 del
 	 * doesn't make sense. Rely on vruntime for fairness.
 	 */
 	delay = max_t(u64, delay, 10000LL);
-	__hrtimer_start_range_ns(&rq->hrtick_timer, ns_to_ktime(delay), 0,
-			HRTIMER_MODE_REL_PINNED, 0);
+	hrtimer_start(&rq->hrtick_timer, ns_to_ktime(delay),
+		      HRTIMER_MODE_REL_PINNED);
 }
 
 static inline void init_hrtick(void)
Index: tip/kernel/sched/fair.c
===================================================================
--- tip.orig/kernel/sched/fair.c
+++ tip/kernel/sched/fair.c
@@ -3785,7 +3785,7 @@ static const u64 cfs_bandwidth_slack_per
  * Are we near the end of the current quota period?
  *
  * Requires cfs_b->lock for hrtimer_expires_remaining to be safe against the
- * hrtimer base being cleared by __hrtimer_start_range_ns. In the case of
+ * hrtimer base being cleared by hrtimer_start. In the case of
  * migrate_hrtimers, base is never cleared, so we are fine.
  */
 static int runtime_refresh_within(struct cfs_bandwidth *cfs_b, u64 min_expire)
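
For readers following the start_bandwidth_timer() change above, here is
a minimal userspace sketch of what "forwarding" a periodic timer means.
It assumes plain 64-bit nanosecond values instead of ktime_t, and the
names model_timer and model_forward are hypothetical, not kernel APIs.
The real hrtimer_forward() also deals with minimum resolution and
overflow, which this sketch leaves out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the expiry field of a struct hrtimer. */
struct model_timer {
	int64_t expires_ns;
};

/*
 * Advance the expiry by whole periods until it lies after 'now_ns'.
 * Returns the number of periods added (0 if the timer has not
 * expired yet, in which case the expiry is left untouched).
 */
static uint64_t model_forward(struct model_timer *t, int64_t now_ns,
			      int64_t period_ns)
{
	int64_t delta = now_ns - t->expires_ns;
	uint64_t overruns;

	assert(period_ns > 0);
	if (delta < 0)
		return 0;

	overruns = (uint64_t)(delta / period_ns) + 1;
	t->expires_ns += (int64_t)overruns * period_ns;
	return overruns;
}
```

With an expiry of 100 ns, a period of 100 ns and a current time of
350 ns, the sketch adds three periods and leaves the expiry at 400 ns,
which is the first period boundary after "now".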




* [patch 25/39] sched: deadline: Use hrtimer_start()
  2015-04-14 21:08 [patch 00/39] hrtimer/tick: Optimizations, cleanups and solutions for various issues Thomas Gleixner
                   ` (23 preceding siblings ...)
  2015-04-14 21:09 ` [patch 24/39] sched: core: Use hrtimer_start[_expires]() Thomas Gleixner
@ 2015-04-14 21:09 ` Thomas Gleixner
  2015-04-22 19:11   ` [tip:timers/core] " tip-bot for Thomas Gleixner
  2015-04-14 21:09 ` [patch 26/39] hrtimer: Get rid of __hrtimer_start_range_ns() Thomas Gleixner
                   ` (13 subsequent siblings)
  38 siblings, 1 reply; 123+ messages in thread
From: Thomas Gleixner @ 2015-04-14 21:09 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Preeti U Murthy, Viresh Kumar,
	Marcelo Tosatti, Frederic Weisbecker

[-- Attachment #1: sched-deadline-use-hrtimer-start.patch --]
[-- Type: text/plain, Size: 1213 bytes --]

hrtimer_start() no longer defers already expired timers to the
softirq. Get rid of the __hrtimer_start_range_ns() invocation.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/sched/deadline.c |   12 ++----------
 1 file changed, 2 insertions(+), 10 deletions(-)

Index: tip/kernel/sched/deadline.c
===================================================================
--- tip.orig/kernel/sched/deadline.c
+++ tip/kernel/sched/deadline.c
@@ -457,8 +457,6 @@ static int start_dl_timer(struct sched_d
 	struct dl_rq *dl_rq = dl_rq_of_se(dl_se);
 	struct rq *rq = rq_of_dl_rq(dl_rq);
 	ktime_t now, act;
-	ktime_t soft, hard;
-	unsigned long range;
 	s64 delta;
 
 	if (boosted)
@@ -481,15 +479,9 @@ static int start_dl_timer(struct sched_d
 	if (ktime_us_delta(act, now) < 0)
 		return 0;
 
-	hrtimer_set_expires(&dl_se->dl_timer, act);
+	hrtimer_start(&dl_se->dl_timer, act, HRTIMER_MODE_ABS);
 
-	soft = hrtimer_get_softexpires(&dl_se->dl_timer);
-	hard = hrtimer_get_expires(&dl_se->dl_timer);
-	range = ktime_to_ns(ktime_sub(hard, soft));
-	__hrtimer_start_range_ns(&dl_se->dl_timer, soft,
-				 range, HRTIMER_MODE_ABS, 0);
-
-	return hrtimer_active(&dl_se->dl_timer);
+	return 1;
 }
 
 /*




* [patch 26/39] hrtimer: Get rid of __hrtimer_start_range_ns()
  2015-04-14 21:08 [patch 00/39] hrtimer/tick: Optimizations, cleanups and solutions for various issues Thomas Gleixner
                   ` (24 preceding siblings ...)
  2015-04-14 21:09 ` [patch 25/39] sched: deadline: Use hrtimer_start() Thomas Gleixner
@ 2015-04-14 21:09 ` Thomas Gleixner
  2015-04-22 19:11   ` [tip:timers/core] " tip-bot for Thomas Gleixner
  2015-04-14 21:09 ` [patch 27/39] hrtimer: Make hrtimer_start() an inline wrapper Thomas Gleixner
                   ` (12 subsequent siblings)
  38 siblings, 1 reply; 123+ messages in thread
From: Thomas Gleixner @ 2015-04-14 21:09 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Preeti U Murthy, Viresh Kumar,
	Marcelo Tosatti, Frederic Weisbecker

[-- Attachment #1: hrtimer-get-rid-of-hrtimer-start-range-ns-wakeup-arg.patch --]
[-- Type: text/plain, Size: 2844 bytes --]

No more callers. Remove the leftovers.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/hrtimer.h |    4 ----
 kernel/time/hrtimer.c   |   38 +++++++++++++++-----------------------
 2 files changed, 15 insertions(+), 27 deletions(-)

Index: tip/include/linux/hrtimer.h
===================================================================
--- tip.orig/include/linux/hrtimer.h
+++ tip/include/linux/hrtimer.h
@@ -359,10 +359,6 @@ extern int hrtimer_start(struct hrtimer
 			 const enum hrtimer_mode mode);
 extern int hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim,
 			unsigned long range_ns, const enum hrtimer_mode mode);
-extern int
-__hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim,
-			 unsigned long delta_ns,
-			 const enum hrtimer_mode mode, int wakeup);
 
 extern int hrtimer_cancel(struct hrtimer *timer);
 extern int hrtimer_try_to_cancel(struct hrtimer *timer);
Index: tip/kernel/time/hrtimer.c
===================================================================
--- tip.orig/kernel/time/hrtimer.c
+++ tip/kernel/time/hrtimer.c
@@ -908,9 +908,20 @@ remove_hrtimer(struct hrtimer *timer, st
 	return 0;
 }
 
-int __hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim,
-		unsigned long delta_ns, const enum hrtimer_mode mode,
-		int wakeup)
+/**
+ * hrtimer_start_range_ns - (re)start an hrtimer on the current CPU
+ * @timer:	the timer to be added
+ * @tim:	expiry time
+ * @delta_ns:	"slack" range for the timer
+ * @mode:	expiry mode: absolute (HRTIMER_MODE_ABS) or
+ *		relative (HRTIMER_MODE_REL)
+ *
+ * Returns:
+ *  0 on success
+ *  1 when the timer was active
+ */
+int hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim,
+			   unsigned long delta_ns, const enum hrtimer_mode mode)
 {
 	struct hrtimer_clock_base *base, *new_base;
 	unsigned long flags;
@@ -963,25 +974,6 @@ int __hrtimer_start_range_ns(struct hrti
 
 	return ret;
 }
-EXPORT_SYMBOL_GPL(__hrtimer_start_range_ns);
-
-/**
- * hrtimer_start_range_ns - (re)start an hrtimer on the current CPU
- * @timer:	the timer to be added
- * @tim:	expiry time
- * @delta_ns:	"slack" range for the timer
- * @mode:	expiry mode: absolute (HRTIMER_MODE_ABS) or
- *		relative (HRTIMER_MODE_REL)
- *
- * Returns:
- *  0 on success
- *  1 when the timer was active
- */
-int hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim,
-		unsigned long delta_ns, const enum hrtimer_mode mode)
-{
-	return __hrtimer_start_range_ns(timer, tim, delta_ns, mode, 1);
-}
 EXPORT_SYMBOL_GPL(hrtimer_start_range_ns);
 
 /**
@@ -998,7 +990,7 @@ EXPORT_SYMBOL_GPL(hrtimer_start_range_ns
 int
 hrtimer_start(struct hrtimer *timer, ktime_t tim, const enum hrtimer_mode mode)
 {
-	return __hrtimer_start_range_ns(timer, tim, 0, mode, 1);
+	return hrtimer_start_range_ns(timer, tim, 0, mode);
 }
 EXPORT_SYMBOL_GPL(hrtimer_start);
 




* [patch 27/39] hrtimer: Make hrtimer_start() an inline wrapper
  2015-04-14 21:08 [patch 00/39] hrtimer/tick: Optimizations, cleanups and solutions for various issues Thomas Gleixner
                   ` (25 preceding siblings ...)
  2015-04-14 21:09 ` [patch 26/39] hrtimer: Get rid of __hrtimer_start_range_ns() Thomas Gleixner
@ 2015-04-14 21:09 ` Thomas Gleixner
  2015-04-22 19:12   ` [tip:timers/core] " tip-bot for Thomas Gleixner
  2015-04-14 21:09 ` [patch 28/39] hrtimer: Remove bogus hrtimer_active() check Thomas Gleixner
                   ` (11 subsequent siblings)
  38 siblings, 1 reply; 123+ messages in thread
From: Thomas Gleixner @ 2015-04-14 21:09 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Preeti U Murthy, Viresh Kumar,
	Marcelo Tosatti, Frederic Weisbecker

[-- Attachment #1: hrtimer-make-hrtimer-start-an-inline-wrapper.patch --]
[-- Type: text/plain, Size: 2207 bytes --]

No point in an extra export just to set the extra argument of
hrtimer_start_range_ns() to 0.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/hrtimer.h |   19 +++++++++++++++++--
 kernel/time/hrtimer.c   |   19 -------------------
 2 files changed, 17 insertions(+), 21 deletions(-)

Index: tip/include/linux/hrtimer.h
===================================================================
--- tip.orig/include/linux/hrtimer.h
+++ tip/include/linux/hrtimer.h
@@ -355,11 +355,26 @@ static inline void destroy_hrtimer_on_st
 #endif
 
 /* Basic timer operations: */
-extern int hrtimer_start(struct hrtimer *timer, ktime_t tim,
-			 const enum hrtimer_mode mode);
 extern int hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim,
 			unsigned long range_ns, const enum hrtimer_mode mode);
 
+/**
+ * hrtimer_start - (re)start an hrtimer on the current CPU
+ * @timer:	the timer to be added
+ * @tim:	expiry time
+ * @mode:	expiry mode: absolute (HRTIMER_MODE_ABS) or
+ *		relative (HRTIMER_MODE_REL)
+ *
+ * Returns:
+ *  0 on success
+ *  1 when the timer was active
+ */
+static inline int hrtimer_start(struct hrtimer *timer, ktime_t tim,
+				const enum hrtimer_mode mode)
+{
+	return hrtimer_start_range_ns(timer, tim, 0, mode);
+}
+
 extern int hrtimer_cancel(struct hrtimer *timer);
 extern int hrtimer_try_to_cancel(struct hrtimer *timer);
 
Index: tip/kernel/time/hrtimer.c
===================================================================
--- tip.orig/kernel/time/hrtimer.c
+++ tip/kernel/time/hrtimer.c
@@ -977,25 +977,6 @@ int hrtimer_start_range_ns(struct hrtime
 EXPORT_SYMBOL_GPL(hrtimer_start_range_ns);
 
 /**
- * hrtimer_start - (re)start an hrtimer on the current CPU
- * @timer:	the timer to be added
- * @tim:	expiry time
- * @mode:	expiry mode: absolute (HRTIMER_MODE_ABS) or
- *		relative (HRTIMER_MODE_REL)
- *
- * Returns:
- *  0 on success
- *  1 when the timer was active
- */
-int
-hrtimer_start(struct hrtimer *timer, ktime_t tim, const enum hrtimer_mode mode)
-{
-	return hrtimer_start_range_ns(timer, tim, 0, mode);
-}
-EXPORT_SYMBOL_GPL(hrtimer_start);
-
-
-/**
  * hrtimer_try_to_cancel - try to deactivate a timer
  * @timer:	hrtimer to stop
  *
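
The wrapper pattern used here can be sketched in userspace as follows.
This is only an illustration under stated assumptions: model_timer,
model_start_range_ns() and model_start() are hypothetical names, times
are plain nanosecond integers, and the return value mirrors the
"0 on success, 1 when the timer was active" convention documented at
this point in the series.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical timer with a soft and a hard expiry, as in the range API. */
struct model_timer {
	int64_t soft_ns;	/* earliest acceptable expiry */
	int64_t hard_ns;	/* latest acceptable expiry (soft + slack) */
	int active;
};

/* Out-of-line worker: the only function that would need an export. */
int model_start_range_ns(struct model_timer *t, int64_t tim_ns,
			 uint64_t slack_ns)
{
	int was_active;

	assert(t);
	was_active = t->active;
	t->soft_ns = tim_ns;
	t->hard_ns = tim_ns + (int64_t)slack_ns;
	t->active = 1;
	return was_active;
}

/* Header-level wrapper: a start without slack is a range start with 0. */
static inline int model_start(struct model_timer *t, int64_t tim_ns)
{
	return model_start_range_ns(t, tim_ns, 0);
}
```

Because the wrapper only fixes one argument, keeping it inline in the
header costs nothing at the call site and removes one exported symbol.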




* [patch 28/39] hrtimer: Remove bogus hrtimer_active() check
  2015-04-14 21:08 [patch 00/39] hrtimer/tick: Optimizations, cleanups and solutions for various issues Thomas Gleixner
                   ` (26 preceding siblings ...)
  2015-04-14 21:09 ` [patch 27/39] hrtimer: Make hrtimer_start() an inline wrapper Thomas Gleixner
@ 2015-04-14 21:09 ` Thomas Gleixner
  2015-04-22 19:12   ` [tip:timers/core] " tip-bot for Thomas Gleixner
  2015-04-14 21:09 ` [patch 29/39] hrtimer: Remove " Thomas Gleixner
                   ` (10 subsequent siblings)
  38 siblings, 1 reply; 123+ messages in thread
From: Thomas Gleixner @ 2015-04-14 21:09 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Preeti U Murthy, Viresh Kumar,
	Marcelo Tosatti, Frederic Weisbecker

[-- Attachment #1: hrtimer-remove-bogus-hrtimer-is-active-check.patch --]
[-- Type: text/plain, Size: 953 bytes --]

The check for hrtimer_active() after starting the timer is
pointless. If the timer is inactive it has expired already and
therefore the task pointer is already NULL.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/time/hrtimer.c |    4 ----
 1 file changed, 4 deletions(-)

Index: tip/kernel/time/hrtimer.c
===================================================================
--- tip.orig/kernel/time/hrtimer.c
+++ tip/kernel/time/hrtimer.c
@@ -1357,8 +1357,6 @@ static int __sched do_nanosleep(struct h
 	do {
 		set_current_state(TASK_INTERRUPTIBLE);
 		hrtimer_start_expires(&t->timer, mode);
-		if (!hrtimer_active(&t->timer))
-			t->task = NULL;
 
 		if (likely(t->task))
 			freezable_schedule();
@@ -1629,8 +1627,6 @@ schedule_hrtimeout_range_clock(ktime_t *
 	hrtimer_init_sleeper(&t, current);
 
 	hrtimer_start_expires(&t.timer, mode);
-	if (!hrtimer_active(&t.timer))
-		t.task = NULL;
 
 	if (likely(t.task))
 		schedule();
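
The argument in the changelog can be illustrated with a small
single-threaded model. It is a sketch, not the kernel implementation:
model_sleeper, model_wakeup() and model_start_expires() are hypothetical
names, and the model assumes, as a simplification, that an already
expired timer fires synchronously from the start call, whereas the
kernel enqueues it and forces a timer interrupt.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct model_sleeper {
	int64_t expires_ns;
	int active;		/* models hrtimer_active() */
	const char *task;	/* stands in for the task pointer */
};

/* Models the wakeup callback: firing clears the task pointer. */
static void model_wakeup(struct model_sleeper *s)
{
	s->task = NULL;
	s->active = 0;
}

/*
 * Models starting the timer. Simplifying assumption: an expiry that is
 * already in the past runs the callback right away.
 */
static void model_start_expires(struct model_sleeper *s, int64_t now_ns)
{
	assert(s);
	if (s->expires_ns <= now_ns) {
		model_wakeup(s);
		return;
	}
	s->active = 1;
}
```

In this model the only way for the timer to be inactive after the start
call is that the callback ran, and the callback is what clears the task
pointer. A separate "if not active, clear task" step after the start
call therefore never changes anything.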




* [patch 29/39] hrtimer: Remove bogus hrtimer_active() check
  2015-04-14 21:08 [patch 00/39] hrtimer/tick: Optimizations, cleanups and solutions for various issues Thomas Gleixner
                   ` (27 preceding siblings ...)
  2015-04-14 21:09 ` [patch 28/39] hrtimer: Remove bogus hrtimer_active() check Thomas Gleixner
@ 2015-04-14 21:09 ` Thomas Gleixner
  2015-04-22 19:12   ` [tip:timers/core] futex: Remove " tip-bot for Thomas Gleixner
  2015-04-14 21:09 ` [patch 30/39] rtmutex: " Thomas Gleixner
                   ` (9 subsequent siblings)
  38 siblings, 1 reply; 123+ messages in thread
From: Thomas Gleixner @ 2015-04-14 21:09 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Preeti U Murthy, Viresh Kumar,
	Marcelo Tosatti, Frederic Weisbecker

[-- Attachment #1: futex-remove-bogus-hrtimer-is-active-check.patch --]
[-- Type: text/plain, Size: 792 bytes --]

The check for hrtimer_active() after starting the timer is
pointless. If the timer is inactive it has expired already and
therefore the task pointer is already NULL.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/futex.c |    5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

Index: tip/kernel/futex.c
===================================================================
--- tip.orig/kernel/futex.c
+++ tip/kernel/futex.c
@@ -2063,11 +2063,8 @@ static void futex_wait_queue_me(struct f
 	queue_me(q, hb);
 
 	/* Arm the timer */
-	if (timeout) {
+	if (timeout)
 		hrtimer_start_expires(&timeout->timer, HRTIMER_MODE_ABS);
-		if (!hrtimer_active(&timeout->timer))
-			timeout->task = NULL;
-	}
 
 	/*
 	 * If we have been removed from the hash list, then another task




* [patch 30/39] rtmutex: Remove bogus hrtimer_active() check
  2015-04-14 21:08 [patch 00/39] hrtimer/tick: Optimizations, cleanups and solutions for various issues Thomas Gleixner
                   ` (28 preceding siblings ...)
  2015-04-14 21:09 ` [patch 29/39] hrtimer: Remove " Thomas Gleixner
@ 2015-04-14 21:09 ` Thomas Gleixner
  2015-04-22 19:13   ` [tip:timers/core] " tip-bot for Thomas Gleixner
  2015-04-14 21:09 ` [patch 31/39] net: core: pktgen: " Thomas Gleixner
                   ` (8 subsequent siblings)
  38 siblings, 1 reply; 123+ messages in thread
From: Thomas Gleixner @ 2015-04-14 21:09 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Preeti U Murthy, Viresh Kumar,
	Marcelo Tosatti, Frederic Weisbecker

[-- Attachment #1: rtmutex-remove-bogus-hrtimer-is-active-check.patch --]
[-- Type: text/plain, Size: 880 bytes --]

The check for hrtimer_active() after starting the timer is
pointless. If the timer is inactive it has expired already and
therefore the task pointer is already NULL.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/locking/rtmutex.c |    5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

Index: tip/kernel/locking/rtmutex.c
===================================================================
--- tip.orig/kernel/locking/rtmutex.c
+++ tip/kernel/locking/rtmutex.c
@@ -1180,11 +1180,8 @@ rt_mutex_slowlock(struct rt_mutex *lock,
 	set_current_state(state);
 
 	/* Setup the timer, when timeout != NULL */
-	if (unlikely(timeout)) {
+	if (unlikely(timeout))
 		hrtimer_start_expires(&timeout->timer, HRTIMER_MODE_ABS);
-		if (!hrtimer_active(&timeout->timer))
-			timeout->task = NULL;
-	}
 
 	ret = task_blocks_on_rt_mutex(lock, &waiter, current, chwalk);
 




* [patch 31/39] net: core: pktgen: Remove bogus hrtimer_active() check
  2015-04-14 21:08 [patch 00/39] hrtimer/tick: Optimizations, cleanups and solutions for various issues Thomas Gleixner
                   ` (29 preceding siblings ...)
  2015-04-14 21:09 ` [patch 30/39] rtmutex: " Thomas Gleixner
@ 2015-04-14 21:09 ` Thomas Gleixner
  2015-04-16 16:04   ` David Miller
  2015-04-22 19:13   ` [tip:timers/core] net: core: pktgen: Remove bogus hrtimer_active() check tip-bot for Thomas Gleixner
  2015-04-14 21:09 ` [patch 32/39] alarmtimer: Get rid of unused return value Thomas Gleixner
                   ` (7 subsequent siblings)
  38 siblings, 2 replies; 123+ messages in thread
From: Thomas Gleixner @ 2015-04-14 21:09 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Preeti U Murthy, Viresh Kumar,
	Marcelo Tosatti, Frederic Weisbecker, David S. Miller, netdev

[-- Attachment #1: net-core-pktgen-remove-bogus-hrtimer-is-active-check.patch --]
[-- Type: text/plain, Size: 778 bytes --]

The check for hrtimer_active() after starting the timer is
pointless. If the timer is inactive it has expired already and
therefore the task pointer is already NULL.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: netdev@vger.kernel.org
---
 net/core/pktgen.c |    2 --
 1 file changed, 2 deletions(-)

Index: tip/net/core/pktgen.c
===================================================================
--- tip.orig/net/core/pktgen.c
+++ tip/net/core/pktgen.c
@@ -2212,8 +2212,6 @@ static void spin(struct pktgen_dev *pkt_
 		do {
 			set_current_state(TASK_INTERRUPTIBLE);
 			hrtimer_start_expires(&t.timer, HRTIMER_MODE_ABS);
-			if (!hrtimer_active(&t.timer))
-				t.task = NULL;
 
 			if (likely(t.task))
 				schedule();




* [patch 32/39] alarmtimer: Get rid of unused return value
  2015-04-14 21:08 [patch 00/39] hrtimer/tick: Optimizations, cleanups and solutions for various issues Thomas Gleixner
                   ` (30 preceding siblings ...)
  2015-04-14 21:09 ` [patch 31/39] net: core: pktgen: " Thomas Gleixner
@ 2015-04-14 21:09 ` Thomas Gleixner
  2015-04-22 19:13   ` [tip:timers/core] " tip-bot for Thomas Gleixner
  2015-04-14 21:09 ` [patch 34/39] tick: broadcast-hrtimer: Remove overly clever return value abuse Thomas Gleixner
                   ` (6 subsequent siblings)
  38 siblings, 1 reply; 123+ messages in thread
From: Thomas Gleixner @ 2015-04-14 21:09 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Preeti U Murthy, Viresh Kumar,
	Marcelo Tosatti, Frederic Weisbecker, John Stultz

[-- Attachment #1: alarmtimer-get-rid-of-unused-return-value.patch --]
[-- Type: text/plain, Size: 2407 bytes --]

We want to get rid of the hrtimer_start() return value, and the alarm
timer return value is not used anywhere. Remove it.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: John Stultz <john.stultz@linaro.org>
---
 include/linux/alarmtimer.h |    4 ++--
 kernel/time/alarmtimer.c   |   11 ++++-------
 2 files changed, 6 insertions(+), 9 deletions(-)

Index: tip/include/linux/alarmtimer.h
===================================================================
--- tip.orig/include/linux/alarmtimer.h
+++ tip/include/linux/alarmtimer.h
@@ -43,8 +43,8 @@ struct alarm {
 
 void alarm_init(struct alarm *alarm, enum alarmtimer_type type,
 		enum alarmtimer_restart (*function)(struct alarm *, ktime_t));
-int alarm_start(struct alarm *alarm, ktime_t start);
-int alarm_start_relative(struct alarm *alarm, ktime_t start);
+void alarm_start(struct alarm *alarm, ktime_t start);
+void alarm_start_relative(struct alarm *alarm, ktime_t start);
 void alarm_restart(struct alarm *alarm);
 int alarm_try_to_cancel(struct alarm *alarm);
 int alarm_cancel(struct alarm *alarm);
Index: tip/kernel/time/alarmtimer.c
===================================================================
--- tip.orig/kernel/time/alarmtimer.c
+++ tip/kernel/time/alarmtimer.c
@@ -317,19 +317,16 @@ EXPORT_SYMBOL_GPL(alarm_init);
  * @alarm: ptr to alarm to set
  * @start: time to run the alarm
  */
-int alarm_start(struct alarm *alarm, ktime_t start)
+void alarm_start(struct alarm *alarm, ktime_t start)
 {
 	struct alarm_base *base = &alarm_bases[alarm->type];
 	unsigned long flags;
-	int ret;
 
 	spin_lock_irqsave(&base->lock, flags);
 	alarm->node.expires = start;
 	alarmtimer_enqueue(base, alarm);
-	ret = hrtimer_start(&alarm->timer, alarm->node.expires,
-				HRTIMER_MODE_ABS);
+	hrtimer_start(&alarm->timer, alarm->node.expires, HRTIMER_MODE_ABS);
 	spin_unlock_irqrestore(&base->lock, flags);
-	return ret;
 }
 EXPORT_SYMBOL_GPL(alarm_start);
 
@@ -338,12 +335,12 @@ EXPORT_SYMBOL_GPL(alarm_start);
  * @alarm: ptr to alarm to set
  * @start: time relative to now to run the alarm
  */
-int alarm_start_relative(struct alarm *alarm, ktime_t start)
+void alarm_start_relative(struct alarm *alarm, ktime_t start)
 {
 	struct alarm_base *base = &alarm_bases[alarm->type];
 
 	start = ktime_add(start, base->gettime());
-	return alarm_start(alarm, start);
+	alarm_start(alarm, start);
 }
 EXPORT_SYMBOL_GPL(alarm_start_relative);
 




* [patch 34/39] tick: broadcast-hrtimer: Remove overly clever return value abuse
  2015-04-14 21:08 [patch 00/39] hrtimer/tick: Optimizations, cleanups and solutions for various issues Thomas Gleixner
                   ` (31 preceding siblings ...)
  2015-04-14 21:09 ` [patch 32/39] alarmtimer: Get rid of unused return value Thomas Gleixner
@ 2015-04-14 21:09 ` Thomas Gleixner
  2015-04-17 10:33   ` Preeti U Murthy
  2015-04-22 19:13   ` [tip:timers/core] " tip-bot for Thomas Gleixner
  2015-04-14 21:09 ` [patch 35/39] hrtimer: Remove hrtimer_start() return value Thomas Gleixner
                   ` (5 subsequent siblings)
  38 siblings, 2 replies; 123+ messages in thread
From: Thomas Gleixner @ 2015-04-14 21:09 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Preeti U Murthy, Viresh Kumar,
	Marcelo Tosatti, Frederic Weisbecker

[-- Attachment #1: tick-broadcast-hrtimer-remove-overly-clever-return-value-abuse.patch --]
[-- Type: text/plain, Size: 1306 bytes --]

The assignment of bc_moved in the conditional construct relies on the
fact that in the case of hrtimer_start() invocation the return value
is always 0. It took me a while to understand it.

We want to get rid of the hrtimer_start() return value. Open code the
logic which makes it readable as well.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc:  Preeti U Murthy <preeti@linux.vnet.ibm.com>
---
 kernel/time/tick-broadcast-hrtimer.c |    8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

Index: tip/kernel/time/tick-broadcast-hrtimer.c
===================================================================
--- tip.orig/kernel/time/tick-broadcast-hrtimer.c
+++ tip/kernel/time/tick-broadcast-hrtimer.c
@@ -66,9 +66,11 @@ static int bc_set_next(ktime_t expires,
 	 * hrtimer_{start/cancel} functions call into tracing,
 	 * calls to these functions must be bound within RCU_NONIDLE.
 	 */
-	RCU_NONIDLE(bc_moved = (hrtimer_try_to_cancel(&bctimer) >= 0) ?
-		!hrtimer_start(&bctimer, expires, HRTIMER_MODE_ABS_PINNED) :
-			0);
+	RCU_NONIDLE({
+			bc_moved = hrtimer_try_to_cancel(&bctimer) >= 0;
+			if (bc_moved)
+				hrtimer_start(&bctimer, expires,
+					      HRTIMER_MODE_ABS_PINNED);});
 	if (bc_moved) {
 		/* Bind the "device" to the cpu */
 		bc->bound_on = smp_processor_id();
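
To see that the open-coded version computes the same bc_moved as the
old ternary, consider this small model. It is a sketch under the
assumption stated in the changelog, namely that the start call returns 0
whenever it is reached; model_start(), bc_moved_old() and bc_moved_new()
are hypothetical names, and the cancel result is passed in as a plain
integer (-1, 0 or 1) instead of calling hrtimer_try_to_cancel().

```c
#include <assert.h>

/* Counts invocations of the hypothetical start function. */
static int model_start_calls;

/* Stand-in for the old hrtimer_start(): returns 0 on this path. */
static int model_start(void)
{
	model_start_calls++;
	return 0;
}

/* Old form: correctness depends on !model_start() always being 1. */
static int bc_moved_old(int try_to_cancel_ret)
{
	assert(try_to_cancel_ret >= -1 && try_to_cancel_ret <= 1);
	return (try_to_cancel_ret >= 0) ? !model_start() : 0;
}

/* New form: the condition is computed first, the start has no result. */
static int bc_moved_new(int try_to_cancel_ret)
{
	int bc_moved;

	assert(try_to_cancel_ret >= -1 && try_to_cancel_ret <= 1);
	bc_moved = try_to_cancel_ret >= 0;
	if (bc_moved)
		model_start();
	return bc_moved;
}
```

Both forms start the timer in exactly the same cases; the new one simply
no longer needs the start call to return anything.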




* [patch 35/39] hrtimer: Remove hrtimer_start() return value
  2015-04-14 21:08 [patch 00/39] hrtimer/tick: Optimizations, cleanups and solutions for various issues Thomas Gleixner
                   ` (32 preceding siblings ...)
  2015-04-14 21:09 ` [patch 34/39] tick: broadcast-hrtimer: Remove overly clever return value abuse Thomas Gleixner
@ 2015-04-14 21:09 ` Thomas Gleixner
  2015-04-22 19:14   ` [tip:timers/core] " tip-bot for Thomas Gleixner
  2015-04-14 21:09 ` [patch 36/39] hrtimer: Avoid locking in hrtimer_cancel() if timer not active Thomas Gleixner
                   ` (4 subsequent siblings)
  38 siblings, 1 reply; 123+ messages in thread
From: Thomas Gleixner @ 2015-04-14 21:09 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Preeti U Murthy, Viresh Kumar,
	Marcelo Tosatti, Frederic Weisbecker

[-- Attachment #1: hrtimer-remove-hrtimer-start-return-value.patch --]
[-- Type: text/plain, Size: 4543 bytes --]

No user was ever interested whether the timer was active or not when
it was started. All abusers of the return value are gone, so get rid
of it.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/hrtimer.h   |   22 +++++++++-------------
 include/linux/interrupt.h |    6 +++---
 kernel/time/hrtimer.c     |   23 +++++++----------------
 3 files changed, 19 insertions(+), 32 deletions(-)

Index: tip/include/linux/hrtimer.h
===================================================================
--- tip.orig/include/linux/hrtimer.h
+++ tip/include/linux/hrtimer.h
@@ -355,7 +355,7 @@ static inline void destroy_hrtimer_on_st
 #endif
 
 /* Basic timer operations: */
-extern int hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim,
+extern void hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim,
 			unsigned long range_ns, const enum hrtimer_mode mode);
 
 /**
@@ -364,34 +364,30 @@ extern int hrtimer_start_range_ns(struct
  * @tim:	expiry time
  * @mode:	expiry mode: absolute (HRTIMER_MODE_ABS) or
  *		relative (HRTIMER_MODE_REL)
- *
- * Returns:
- *  0 on success
- *  1 when the timer was active
  */
-static inline int hrtimer_start(struct hrtimer *timer, ktime_t tim,
-				const enum hrtimer_mode mode)
+static inline void hrtimer_start(struct hrtimer *timer, ktime_t tim,
+				 const enum hrtimer_mode mode)
 {
-	return hrtimer_start_range_ns(timer, tim, 0, mode);
+	hrtimer_start_range_ns(timer, tim, 0, mode);
 }
 
 extern int hrtimer_cancel(struct hrtimer *timer);
 extern int hrtimer_try_to_cancel(struct hrtimer *timer);
 
-static inline int hrtimer_start_expires(struct hrtimer *timer,
-						enum hrtimer_mode mode)
+static inline void hrtimer_start_expires(struct hrtimer *timer,
+					 enum hrtimer_mode mode)
 {
 	unsigned long delta;
 	ktime_t soft, hard;
 	soft = hrtimer_get_softexpires(timer);
 	hard = hrtimer_get_expires(timer);
 	delta = ktime_to_ns(ktime_sub(hard, soft));
-	return hrtimer_start_range_ns(timer, soft, delta, mode);
+	hrtimer_start_range_ns(timer, soft, delta, mode);
 }
 
-static inline int hrtimer_restart(struct hrtimer *timer)
+static inline void hrtimer_restart(struct hrtimer *timer)
 {
-	return hrtimer_start_expires(timer, HRTIMER_MODE_ABS);
+	hrtimer_start_expires(timer, HRTIMER_MODE_ABS);
 }
 
 /* Query timers: */
Index: tip/include/linux/interrupt.h
===================================================================
--- tip.orig/include/linux/interrupt.h
+++ tip/include/linux/interrupt.h
@@ -579,10 +579,10 @@ tasklet_hrtimer_init(struct tasklet_hrti
 		     clockid_t which_clock, enum hrtimer_mode mode);
 
 static inline
-int tasklet_hrtimer_start(struct tasklet_hrtimer *ttimer, ktime_t time,
-			  const enum hrtimer_mode mode)
+void tasklet_hrtimer_start(struct tasklet_hrtimer *ttimer, ktime_t time,
+			   const enum hrtimer_mode mode)
 {
-	return hrtimer_start(&ttimer->timer, time, mode);
+	hrtimer_start(&ttimer->timer, time, mode);
 }
 
 static inline
Index: tip/kernel/time/hrtimer.c
===================================================================
--- tip.orig/kernel/time/hrtimer.c
+++ tip/kernel/time/hrtimer.c
@@ -915,22 +915,18 @@ remove_hrtimer(struct hrtimer *timer, st
  * @delta_ns:	"slack" range for the timer
  * @mode:	expiry mode: absolute (HRTIMER_MODE_ABS) or
  *		relative (HRTIMER_MODE_REL)
- *
- * Returns:
- *  0 on success
- *  1 when the timer was active
  */
-int hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim,
-			   unsigned long delta_ns, const enum hrtimer_mode mode)
+void hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim,
+			    unsigned long delta_ns, const enum hrtimer_mode mode)
 {
 	struct hrtimer_clock_base *base, *new_base;
 	unsigned long flags;
-	int ret, leftmost;
+	int leftmost;
 
 	base = lock_hrtimer_base(timer, &flags);
 
 	/* Remove an active timer from the queue: */
-	ret = remove_hrtimer(timer, base);
+	remove_hrtimer(timer, base);
 
 	if (mode & HRTIMER_MODE_REL) {
 		tim = ktime_add_safe(tim, base->get_time());
@@ -954,11 +950,8 @@ int hrtimer_start_range_ns(struct hrtime
 	timer_stats_hrtimer_set_start_info(timer);
 
 	leftmost = enqueue_hrtimer(timer, new_base);
-
-	if (!leftmost) {
-		unlock_hrtimer_base(timer, &flags);
-		return ret;
-	}
+	if (!leftmost)
+		goto unlock;
 
 	if (!hrtimer_is_hres_active(timer)) {
 		/*
@@ -969,10 +962,8 @@ int hrtimer_start_range_ns(struct hrtime
 	} else {
 		hrtimer_reprogram(timer, new_base);
 	}
-
+unlock:
 	unlock_hrtimer_base(timer, &flags);
-
-	return ret;
 }
 EXPORT_SYMBOL_GPL(hrtimer_start_range_ns);
 




* [patch 36/39] hrtimer: Avoid locking in hrtimer_cancel() if timer not active
  2015-04-14 21:08 [patch 00/39] hrtimer/tick: Optimizations, cleanups and solutions for various issues Thomas Gleixner
                   ` (33 preceding siblings ...)
  2015-04-14 21:09 ` [patch 35/39] hrtimer: Remove hrtimer_start() return value Thomas Gleixner
@ 2015-04-14 21:09 ` Thomas Gleixner
  2015-04-22 19:14   ` [tip:timers/core] " tip-bot for Thomas Gleixner
  2015-04-14 21:09 ` [patch 37/39] staging: ozwpan: Remove hrtimer_active() check Thomas Gleixner
                   ` (3 subsequent siblings)
  38 siblings, 1 reply; 123+ messages in thread
From: Thomas Gleixner @ 2015-04-14 21:09 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Preeti U Murthy, Viresh Kumar,
	Marcelo Tosatti, Frederic Weisbecker

[-- Attachment #1: hrtimer-hrtimer-cancel-avoid-locking-if-not-active.patch --]
[-- Type: text/plain, Size: 1212 bytes --]

We can do a lockless check for hrtimer_active before actually taking
the lock in hrtimer[_try_to]_cancel. This is useful for hotpath users
like nanosleep as they avoid the lock dance when the timer has
expired.

This is safe because active is true when the timer is enqueued or the
callback is running. Taking the hrtimer base lock does not protect
against concurrent hrtimer_start calls; the call site has to do the
proper serialization itself.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/time/hrtimer.c |    9 +++++++++
 1 file changed, 9 insertions(+)

Index: tip/kernel/time/hrtimer.c
===================================================================
--- tip.orig/kernel/time/hrtimer.c
+++ tip/kernel/time/hrtimer.c
@@ -983,6 +983,15 @@ int hrtimer_try_to_cancel(struct hrtimer
 	unsigned long flags;
 	int ret = -1;
 
+	/*
+	 * Check lockless first. If the timer is not active (neither
+	 * enqueued nor running the callback), nothing to do here.  The
+	 * base lock does not serialize against a concurrent enqueue,
+	 * so we can avoid taking it.
+	 */
+	if (!hrtimer_active(timer))
+		return 0;
+
 	base = lock_hrtimer_base(timer, &flags);
 
 	if (!hrtimer_callback_running(timer))
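
The control flow of the early return can be modelled in userspace as
below. This is a single-threaded sketch only: model_timer,
model_active() and model_try_to_cancel() are hypothetical names, the
"lock" is a counter that records how often the slow path was entered,
and the return values follow the hrtimer_try_to_cancel() convention
(0 not active, 1 cancelled, -1 callback running).

```c
#include <assert.h>

#define MODEL_ENQUEUED	0x1
#define MODEL_CALLBACK	0x2

struct model_timer {
	int state;		/* MODEL_ENQUEUED and/or MODEL_CALLBACK */
	int lock_acquisitions;	/* how often the base lock was "taken" */
};

/* Models hrtimer_active(): enqueued or running its callback. */
static int model_active(const struct model_timer *t)
{
	return t->state != 0;
}

static int model_try_to_cancel(struct model_timer *t)
{
	int ret = -1;

	assert(t);
	/* Fast path: nothing to cancel, so the lock is never touched. */
	if (!model_active(t))
		return 0;

	t->lock_acquisitions++;		/* lock_hrtimer_base() would go here */
	if (!(t->state & MODEL_CALLBACK)) {
		ret = (t->state & MODEL_ENQUEUED) ? 1 : 0;
		t->state = 0;		/* dequeue */
	}
	/* unlock_hrtimer_base() would go here */
	return ret;
}
```

A caller such as a sleep path whose timer has already expired takes the
fast path and never pays for the lock, which is the saving the
changelog describes.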




* [patch 37/39] staging: ozwpan: Remove hrtimer_active() check
  2015-04-14 21:08 [patch 00/39] hrtimer/tick: Optimizations, cleanups and solutions for various issues Thomas Gleixner
                   ` (34 preceding siblings ...)
  2015-04-14 21:09 ` [patch 36/39] hrtimer: Avoid locking in hrtimer_cancel() if timer not active Thomas Gleixner
@ 2015-04-14 21:09 ` Thomas Gleixner
  2015-04-14 21:09 ` [patch 38/39] timer: Remove pointless return value of do_usleep_range() Thomas Gleixner
                   ` (2 subsequent siblings)
  38 siblings, 0 replies; 123+ messages in thread
From: Thomas Gleixner @ 2015-04-14 21:09 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Preeti U Murthy, Viresh Kumar,
	Marcelo Tosatti, Frederic Weisbecker

[-- Attachment #1: staging-ozwpan-remove-hrtimer-active-check.patch --]
[-- Type: text/plain, Size: 1105 bytes --]

hrtimer_cancel() can be called unconditionally and this code is hardly
a hotpath which wants to avoid the out of line call.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 drivers/staging/ozwpan/ozpd.c |    8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

Index: tip/drivers/staging/ozwpan/ozpd.c
===================================================================
--- tip.orig/drivers/staging/ozwpan/ozpd.c
+++ tip/drivers/staging/ozwpan/ozpd.c
@@ -175,10 +175,8 @@ static void oz_pd_free(struct work_struc
  */
 void oz_pd_destroy(struct oz_pd *pd)
 {
-	if (hrtimer_active(&pd->timeout))
-		hrtimer_cancel(&pd->timeout);
-	if (hrtimer_active(&pd->heartbeat))
-		hrtimer_cancel(&pd->heartbeat);
+	hrtimer_cancel(&pd->timeout);
+	hrtimer_cancel(&pd->heartbeat);
 
 	INIT_WORK(&pd->workitem, oz_pd_free);
 	if (!schedule_work(&pd->workitem))
@@ -247,7 +245,7 @@ void oz_pd_heartbeat(struct oz_pd *pd, u
 				more = 1;
 		}
 	}
-	if ((!more) && (hrtimer_active(&pd->heartbeat)))
+	if (!more)
 		hrtimer_cancel(&pd->heartbeat);
 	if (pd->mode & OZ_F_ISOC_ANYTIME) {
 		int count = 8;




* [patch 38/39] timer: Remove pointless return value of do_usleep_range()
  2015-04-14 21:08 [patch 00/39] hrtimer/tick: Optimizations, cleanups and solutions for various issues Thomas Gleixner
                   ` (35 preceding siblings ...)
  2015-04-14 21:09 ` [patch 37/39] staging: ozwpan: Remove hrtimer_active() check Thomas Gleixner
@ 2015-04-14 21:09 ` Thomas Gleixner
  2015-04-22 19:14   ` [tip:timers/core] " tip-bot for Thomas Gleixner
  2015-04-14 21:09 ` [patch 39/39] timer: Put usleep_range into the __sched section Thomas Gleixner
       [not found] ` <20150414203503.322172417@linutronix.de>
  38 siblings, 1 reply; 123+ messages in thread
From: Thomas Gleixner @ 2015-04-14 21:09 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Preeti U Murthy, Viresh Kumar,
	Marcelo Tosatti, Frederic Weisbecker

[-- Attachment #1: timer-kill-pointless-return-value.patch --]
[-- Type: text/plain, Size: 873 bytes --]

The only user ignores it anyway and rightfully so.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/time/timer.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Index: tip/kernel/time/timer.c
===================================================================
--- tip.orig/kernel/time/timer.c
+++ tip/kernel/time/timer.c
@@ -1695,14 +1695,14 @@ unsigned long msleep_interruptible(unsig
 
 EXPORT_SYMBOL(msleep_interruptible);
 
-static int __sched do_usleep_range(unsigned long min, unsigned long max)
+static void __sched do_usleep_range(unsigned long min, unsigned long max)
 {
 	ktime_t kmin;
 	unsigned long delta;
 
 	kmin = ktime_set(0, min * NSEC_PER_USEC);
 	delta = (max - min) * NSEC_PER_USEC;
-	return schedule_hrtimeout_range(&kmin, delta, HRTIMER_MODE_REL);
+	schedule_hrtimeout_range(&kmin, delta, HRTIMER_MODE_REL);
 }
 
 /**




* [patch 39/39] timer: Put usleep_range into the __sched section
  2015-04-14 21:08 [patch 00/39] hrtimer/tick: Optimizations, cleanups and solutions for various issues Thomas Gleixner
                   ` (36 preceding siblings ...)
  2015-04-14 21:09 ` [patch 38/39] timer: Remove pointless return value of do_usleep_range() Thomas Gleixner
@ 2015-04-14 21:09 ` Thomas Gleixner
  2015-04-22 19:14   ` [tip:timers/core] " tip-bot for Thomas Gleixner
       [not found] ` <20150414203503.322172417@linutronix.de>
  38 siblings, 1 reply; 123+ messages in thread
From: Thomas Gleixner @ 2015-04-14 21:09 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Preeti U Murthy, Viresh Kumar,
	Marcelo Tosatti, Frederic Weisbecker

[-- Attachment #1: timer-make-usleep_range-__sched.patch --]
[-- Type: text/plain, Size: 792 bytes --]

do_usleep_range() and schedule_hrtimeout_range() are __sched as
well. So it makes no sense to have the exported function in a
different section.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/time/timer.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: tip/kernel/time/timer.c
===================================================================
--- tip.orig/kernel/time/timer.c
+++ tip/kernel/time/timer.c
@@ -1707,7 +1707,7 @@ static void __sched do_usleep_range(unsi
  * @min: Minimum time in usecs to sleep
  * @max: Maximum time in usecs to sleep
  */
-void usleep_range(unsigned long min, unsigned long max)
+void __sched usleep_range(unsigned long min, unsigned long max)
 {
 	__set_current_state(TASK_UNINTERRUPTIBLE);
 	do_usleep_range(min, max);




* Re: [patch 33/39] power: reset: ltc2952: Remove bogus hrtimer_start() return value checks
       [not found] ` <20150414203503.322172417@linutronix.de>
@ 2015-04-14 21:38   ` Frans Klaver
  2015-04-30 15:49   ` Sebastian Reichel
  1 sibling, 0 replies; 123+ messages in thread
From: Frans Klaver @ 2015-04-14 21:38 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Peter Zijlstra, Ingo Molnar, Preeti U Murthy, Viresh Kumar,
	Marcelo Tosatti, Frederic Weisbecker, Sebastian Reichel,
	Dmitry Eremin-Solenikov, David Woodhouse, René Moll,
	Wolfram Sang, linux-pm

On Tue, Apr 14, 2015 at 09:09:20PM +0000, Thomas Gleixner wrote:
> The return value of hrtimer_start() tells whether the timer was
> inactive or active already when hrtimer_start() was called.
> 
> The code emits a bogus warning if the timer was active already
> claiming that the timer could not be started.
> 
> Remove it.

Thanks for catching this.

> Cc: "René Moll" <Rene.Moll@xsens.com>

Rene no longer works at Xsens, so this address is no longer in use. Instead:

Cc: "René Moll" <linux@r-moll.nl>

That said, I'm not sure if he's actively monitoring this mailbox, or
even if he should still be mentioned as maintainer of this driver.


> ---
>  drivers/power/reset/ltc2952-poweroff.c |   18 +++---------------
>  1 file changed, 3 insertions(+), 15 deletions(-)
> 
> Index: tip/drivers/power/reset/ltc2952-poweroff.c
> ===================================================================
> --- tip.orig/drivers/power/reset/ltc2952-poweroff.c
> +++ tip/drivers/power/reset/ltc2952-poweroff.c
> @@ -120,18 +120,7 @@ static enum hrtimer_restart ltc2952_powe
>  
>  static void ltc2952_poweroff_start_wde(struct ltc2952_poweroff *data)
>  {
> -	if (hrtimer_start(&data->timer_wde, data->wde_interval,
> -			  HRTIMER_MODE_REL)) {
> -		/*
> -		 * The device will not toggle the watchdog reset,
> -		 * thus shut down is only safe if the PowerPath controller
> -		 * has a long enough time-off before triggering a hardware
> -		 * power-off.
> -		 *
> -		 * Only sending a warning as the system will power-off anyway
> -		 */
> -		dev_err(data->dev, "unable to start the timer\n");
> -	}
> +	hrtimer_start(&data->timer_wde, data->wde_interval, HRTIMER_MODE_REL);
>  }
>  
>  static enum hrtimer_restart
> @@ -165,9 +154,8 @@ static irqreturn_t ltc2952_poweroff_hand
>  	}
>  
>  	if (gpiod_get_value(data->gpio_trigger)) {
> -		if (hrtimer_start(&data->timer_trigger, data->trigger_delay,
> -				  HRTIMER_MODE_REL))
> -			dev_err(data->dev, "unable to start the wait timer\n");
> +		hrtimer_start(&data->timer_trigger, data->trigger_delay,
> +			      HRTIMER_MODE_REL);
>  	} else {
>  		hrtimer_cancel(&data->timer_trigger);
>  		/* omitting return value check, timer should have been valid */

You might as well remove this bogus comment along with the rest.
The return value of hrtimer_cancel() says nothing about the validity
of the passed timer.

In any case

Acked-by: Frans Klaver <frans.klaver@xsens.com>

Thanks,
Frans


* Re: [patch 02/39] hrtimer: Get rid of the resolution field in hrtimer_clock_base
  2015-04-14 21:08 ` [patch 02/39] hrtimer: Get rid of the resolution field in hrtimer_clock_base Thomas Gleixner
@ 2015-04-15  6:29   ` Frans Klaver
  2015-04-15  6:32     ` Frans Klaver
  2015-04-20  8:34   ` Preeti U Murthy
  2015-04-22 19:05   ` [tip:timers/core] " tip-bot for Thomas Gleixner
  2 siblings, 1 reply; 123+ messages in thread
From: Frans Klaver @ 2015-04-15  6:29 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Peter Zijlstra, Ingo Molnar, Preeti U Murthy, Viresh Kumar,
	Marcelo Tosatti, Frederic Weisbecker

On Tue, Apr 14, 2015 at 11:08 PM, Thomas Gleixner <tglx@linutronix.de> wrote:
> The field has no value because all clock bases have the same
> resolution. The resolution only changes when we switch to high
> resolution timer mode. We can evaluate that from a single static
> variable as well. In the !HIGHRES case its simply a constant.
>
> Export the variable, so we can simplify the usage sites.

There seems to be only one usage site outside hrtimer.c itself. That
one only reads the value. Wouldn't it make sense to keep the variable
out of the interface and use a read function instead?


> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
>  include/linux/hrtimer.h  |    6 ++++--
>  kernel/time/hrtimer.c    |   26 +++++++++-----------------
>  kernel/time/timer_list.c |    8 ++++----
>  3 files changed, 17 insertions(+), 23 deletions(-)
>
> Index: tip/include/linux/hrtimer.h
> ===================================================================
> --- tip.orig/include/linux/hrtimer.h
> +++ tip/include/linux/hrtimer.h
> @@ -137,7 +137,6 @@ struct hrtimer_sleeper {
>   *                     timer to a base on another cpu.
>   * @clockid:           clock id for per_cpu support
>   * @active:            red black tree root node for the active timers
> - * @resolution:                the resolution of the clock, in nanoseconds
>   * @get_time:          function to retrieve the current time of the clock
>   * @softirq_time:      the time when running the hrtimer queue in the softirq
>   * @offset:            offset of this clock to the monotonic base
> @@ -147,7 +146,6 @@ struct hrtimer_clock_base {
>         int                     index;
>         clockid_t               clockid;
>         struct timerqueue_head  active;
> -       ktime_t                 resolution;
>         ktime_t                 (*get_time)(void);
>         ktime_t                 softirq_time;
>         ktime_t                 offset;
> @@ -295,11 +293,15 @@ extern void hrtimer_peek_ahead_timers(vo
>
>  extern void clock_was_set_delayed(void);
>
> +extern unsigned int hrtimer_resolution;
> +
>  #else
>
>  # define MONOTONIC_RES_NSEC    LOW_RES_NSEC
>  # define KTIME_MONOTONIC_RES   KTIME_LOW_RES
>
> +#define hrtimer_resolution     LOW_RES_NSEC
> +
>  static inline void hrtimer_peek_ahead_timers(void) { }
>
>  /*
> Index: tip/kernel/time/hrtimer.c
> ===================================================================
> --- tip.orig/kernel/time/hrtimer.c
> +++ tip/kernel/time/hrtimer.c
> @@ -66,7 +66,6 @@
>   */
>  DEFINE_PER_CPU(struct hrtimer_cpu_base, hrtimer_bases) =
>  {
> -
>         .lock = __RAW_SPIN_LOCK_UNLOCKED(hrtimer_bases.lock),
>         .clock_base =
>         {
> @@ -74,25 +73,21 @@ DEFINE_PER_CPU(struct hrtimer_cpu_base,
>                         .index = HRTIMER_BASE_MONOTONIC,
>                         .clockid = CLOCK_MONOTONIC,
>                         .get_time = &ktime_get,
> -                       .resolution = KTIME_LOW_RES,
>                 },
>                 {
>                         .index = HRTIMER_BASE_REALTIME,
>                         .clockid = CLOCK_REALTIME,
>                         .get_time = &ktime_get_real,
> -                       .resolution = KTIME_LOW_RES,
>                 },
>                 {
>                         .index = HRTIMER_BASE_BOOTTIME,
>                         .clockid = CLOCK_BOOTTIME,
>                         .get_time = &ktime_get_boottime,
> -                       .resolution = KTIME_LOW_RES,
>                 },
>                 {
>                         .index = HRTIMER_BASE_TAI,
>                         .clockid = CLOCK_TAI,
>                         .get_time = &ktime_get_clocktai,
> -                       .resolution = KTIME_LOW_RES,
>                 },
>         }
>  };
> @@ -478,6 +473,8 @@ static ktime_t __hrtimer_get_next_event(
>   * High resolution timer enabled ?
>   */
>  static int hrtimer_hres_enabled __read_mostly  = 1;
> +unsigned int hrtimer_resolution __read_mostly = LOW_RES_NSEC;
> +EXPORT_SYMBOL_GPL(hrtimer_resolution);
>
>  /*
>   * Enable / Disable high resolution mode
> @@ -660,7 +657,7 @@ static void retrigger_next_event(void *a
>   */
>  static int hrtimer_switch_to_hres(void)
>  {
> -       int i, cpu = smp_processor_id();
> +       int cpu = smp_processor_id();
>         struct hrtimer_cpu_base *base = &per_cpu(hrtimer_bases, cpu);
>         unsigned long flags;
>
> @@ -676,8 +673,7 @@ static int hrtimer_switch_to_hres(void)
>                 return 0;
>         }
>         base->hres_active = 1;
> -       for (i = 0; i < HRTIMER_MAX_CLOCK_BASES; i++)
> -               base->clock_base[i].resolution = KTIME_HIGH_RES;
> +       hrtimer_resolution = HIGH_RES_NSEC;
>
>         tick_setup_sched_timer();
>         /* "Retrigger" the interrupt to get things going */
> @@ -812,8 +808,8 @@ u64 hrtimer_forward(struct hrtimer *time
>         if (delta.tv64 < 0)
>                 return 0;
>
> -       if (interval.tv64 < timer->base->resolution.tv64)
> -               interval.tv64 = timer->base->resolution.tv64;
> +       if (interval.tv64 < hrtimer_resolution)
> +               interval.tv64 = hrtimer_resolution;
>
>         if (unlikely(delta.tv64 >= interval.tv64)) {
>                 s64 incr = ktime_to_ns(interval);
> @@ -955,7 +951,7 @@ int __hrtimer_start_range_ns(struct hrti
>                  * timeouts. This will go away with the GTOD framework.
>                  */
>  #ifdef CONFIG_TIME_LOW_RES
> -               tim = ktime_add_safe(tim, base->resolution);
> +               tim = ktime_add_safe(tim, ktime_set(0, hrtimer_resolution));
>  #endif
>         }
>
> @@ -1185,12 +1181,8 @@ EXPORT_SYMBOL_GPL(hrtimer_init);
>   */
>  int hrtimer_get_res(const clockid_t which_clock, struct timespec *tp)
>  {
> -       struct hrtimer_cpu_base *cpu_base;
> -       int base = hrtimer_clockid_to_base(which_clock);
> -
> -       cpu_base = raw_cpu_ptr(&hrtimer_bases);
> -       *tp = ktime_to_timespec(cpu_base->clock_base[base].resolution);
> -
> +       tp->tv_sec = 0;
> +       tp->tv_nsec = hrtimer_resolution;
>         return 0;
>  }
>  EXPORT_SYMBOL_GPL(hrtimer_get_res);
> Index: tip/kernel/time/timer_list.c
> ===================================================================
> --- tip.orig/kernel/time/timer_list.c
> +++ tip/kernel/time/timer_list.c
> @@ -120,10 +120,10 @@ static void
>  print_base(struct seq_file *m, struct hrtimer_clock_base *base, u64 now)
>  {
>         SEQ_printf(m, "  .base:       %pK\n", base);
> -       SEQ_printf(m, "  .index:      %d\n",
> -                       base->index);
> -       SEQ_printf(m, "  .resolution: %Lu nsecs\n",
> -                       (unsigned long long)ktime_to_ns(base->resolution));
> +       SEQ_printf(m, "  .index:      %d\n", base->index);
> +
> +       SEQ_printf(m, "  .resolution: %u nsecs\n", (unsigned) hrtimer_resolution);

hrtimer_resolution already is an unsigned int. Doesn't that make this
cast pointless?

Thanks,
Frans
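
The read-function idea raised above could look roughly like the
following. This is a userspace sketch, not a proposal for the actual
interface: every sketch_* name is hypothetical, and the low resolution
value assumes HZ=250. The variable stays private to one file, readers go
through an accessor, and the clamp mirrors the hrtimer_forward() hunk.

```c
#include <stdint.h>

#define SKETCH_LOW_RES_NSEC  4000000u	/* assumption: tick period at HZ=250 */
#define SKETCH_HIGH_RES_NSEC 1u		/* assumption: 1 ns in high resolution mode */

/* File-local, so only this translation unit can write it. */
static unsigned int sketch_resolution = SKETCH_LOW_RES_NSEC;

/* Read accessor offered to outside users instead of the variable itself. */
unsigned int sketch_hrtimer_resolution(void)
{
	return sketch_resolution;
}

/* Stand-in for the switch to high resolution mode; the only writer. */
void sketch_switch_to_hres(void)
{
	sketch_resolution = SKETCH_HIGH_RES_NSEC;
}

/* An interval shorter than the resolution is raised to the resolution. */
int64_t sketch_clamp_interval(int64_t interval_ns)
{
	int64_t res = sketch_hrtimer_resolution();

	return interval_ns < res ? res : interval_ns;
}
```

The trade-off is an out-of-line call per read against the guarantee
that nothing outside the file can modify the value.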


* Re: [patch 02/39] hrtimer: Get rid of the resolution field in hrtimer_clock_base
  2015-04-15  6:29   ` Frans Klaver
@ 2015-04-15  6:32     ` Frans Klaver
  0 siblings, 0 replies; 123+ messages in thread
From: Frans Klaver @ 2015-04-15  6:32 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Peter Zijlstra, Ingo Molnar, Preeti U Murthy, Viresh Kumar,
	Marcelo Tosatti, Frederic Weisbecker

On Wed, Apr 15, 2015 at 8:29 AM, Frans Klaver <fransklaver@gmail.com> wrote:
> On Tue, Apr 14, 2015 at 11:08 PM, Thomas Gleixner <tglx@linutronix.de> wrote:
>> The field has no value because all clock bases have the same
>> resolution. The resolution only changes when we switch to high
>> resolution timer mode. We can evaluate that from a single static
>> variable as well. In the !HIGHRES case its simply a constant.
>>
>> Export the variable, so we can simplify the usage sites.
>
> There seems to be only one usage site outside hrtimer.c itself. That
> one only reads the value.

OK, this is incorrect. Still, if it's only meant to be read from
outside hrtimer.c:

>  Wouldn't it make sense to keep the variable
> out of the interface and use a read function instead?

Thanks,
Frans


* Re: [patch 04/39] sound: Use hrtimer_resolution instead of hrtimer_get_res()
  2015-04-14 21:08 ` [patch 04/39] sound: " Thomas Gleixner
@ 2015-04-16  8:07   ` Takashi Iwai
  2015-04-16  9:08     ` Thomas Gleixner
  2015-04-22 19:05   ` [tip:timers/core] " tip-bot for Thomas Gleixner
  1 sibling, 1 reply; 123+ messages in thread
From: Takashi Iwai @ 2015-04-16  8:07 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Peter Zijlstra, Ingo Molnar, Preeti U Murthy, Viresh Kumar,
	Marcelo Tosatti, Frederic Weisbecker, Jaroslav Kysela,
	alsa-devel

At Tue, 14 Apr 2015 21:08:30 -0000,
Thomas Gleixner wrote:
> 
> No point in converting a timespec now that the value is directly
> accessible. Get rid of the null check while at it. Resolution is
> guaranteed to be > 0.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: Jaroslav Kysela <perex@perex.cz>
> Cc: Takashi Iwai <tiwai@suse.de>

Would you like me to pick this up through the sound git tree, or will
you apply the whole set through tip?  In the latter case, feel free to
take my ack:
  Acked-by: Takashi Iwai <tiwai@suse.de>


thanks,

Takashi


> Cc: alsa-devel@alsa-project.org
> ---
>  sound/core/hrtimer.c      |    9 +--------
>  sound/drivers/pcsp/pcsp.c |   15 ++++++---------
>  2 files changed, 7 insertions(+), 17 deletions(-)
> 
> Index: tip/sound/core/hrtimer.c
> ===================================================================
> --- tip.orig/sound/core/hrtimer.c
> +++ tip/sound/core/hrtimer.c
> @@ -121,16 +121,9 @@ static struct snd_timer *mytimer;
>  static int __init snd_hrtimer_init(void)
>  {
>  	struct snd_timer *timer;
> -	struct timespec tp;
>  	int err;
>  
> -	hrtimer_get_res(CLOCK_MONOTONIC, &tp);
> -	if (tp.tv_sec > 0 || !tp.tv_nsec) {
> -		pr_err("snd-hrtimer: Invalid resolution %u.%09u",
> -			   (unsigned)tp.tv_sec, (unsigned)tp.tv_nsec);
> -		return -EINVAL;
> -	}
> -	resolution = tp.tv_nsec;
> +	resolution = hrtimer_resolution;
>  
>  	/* Create a new timer and set up the fields */
>  	err = snd_timer_global_new("hrtimer", SNDRV_TIMER_GLOBAL_HRTIMER,
> Index: tip/sound/drivers/pcsp/pcsp.c
> ===================================================================
> --- tip.orig/sound/drivers/pcsp/pcsp.c
> +++ tip/sound/drivers/pcsp/pcsp.c
> @@ -42,16 +42,13 @@ struct snd_pcsp pcsp_chip;
>  static int snd_pcsp_create(struct snd_card *card)
>  {
>  	static struct snd_device_ops ops = { };
> -	struct timespec tp;
> -	int err;
> -	int div, min_div, order;
> -
> -	hrtimer_get_res(CLOCK_MONOTONIC, &tp);
> +	unsigned int resolution = hrtimer_resolution;
> +	int err, div, min_div, order;
>  
>  	if (!nopcm) {
> -		if (tp.tv_sec || tp.tv_nsec > PCSP_MAX_PERIOD_NS) {
> +		if (resolution > PCSP_MAX_PERIOD_NS) {
>  			printk(KERN_ERR "PCSP: Timer resolution is not sufficient "
> -				"(%linS)\n", tp.tv_nsec);
> +				"(%linS)\n", resolution);
>  			printk(KERN_ERR "PCSP: Make sure you have HPET and ACPI "
>  				"enabled.\n");
>  			printk(KERN_ERR "PCSP: Turned into nopcm mode.\n");
> @@ -59,13 +56,13 @@ static int snd_pcsp_create(struct snd_ca
>  		}
>  	}
>  
> -	if (loops_per_jiffy >= PCSP_MIN_LPJ && tp.tv_nsec <= PCSP_MIN_PERIOD_NS)
> +	if (loops_per_jiffy >= PCSP_MIN_LPJ && resolution <= PCSP_MIN_PERIOD_NS)
>  		min_div = MIN_DIV;
>  	else
>  		min_div = MAX_DIV;
>  #if PCSP_DEBUG
>  	printk(KERN_DEBUG "PCSP: lpj=%li, min_div=%i, res=%li\n",
> -	       loops_per_jiffy, min_div, tp.tv_nsec);
> +	       loops_per_jiffy, min_div, resolution);
>  #endif
>  
>  	div = MAX_DIV / min_div;
> 
> 
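
A side note on the pcsp.c hunk: resolution there is an unsigned int,
whereas the tv_nsec it replaces was a long, so the conversion specifier
that matches the new type is %u rather than %li. A minimal userspace
sketch of the matching format, with snprintf() standing in for printk()
and sketch_format_resolution() being a hypothetical helper that does not
exist in the driver:

```c
#include <stdio.h>

/*
 * Formats the "resolution is not sufficient" message for an unsigned int
 * value. %u matches unsigned int; the trailing "nS" is literal text.
 */
static int sketch_format_resolution(char *buf, size_t len,
				    unsigned int resolution)
{
	return snprintf(buf, len,
			"PCSP: Timer resolution is not sufficient (%unS)",
			resolution);
}
```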


* Re: [patch 04/39] sound: Use hrtimer_resolution instead of hrtimer_get_res()
  2015-04-16  8:07   ` Takashi Iwai
@ 2015-04-16  9:08     ` Thomas Gleixner
  0 siblings, 0 replies; 123+ messages in thread
From: Thomas Gleixner @ 2015-04-16  9:08 UTC (permalink / raw)
  To: Takashi Iwai
  Cc: LKML, Peter Zijlstra, Ingo Molnar, Preeti U Murthy, Viresh Kumar,
	Marcelo Tosatti, Frederic Weisbecker, Jaroslav Kysela,
	alsa-devel

On Thu, 16 Apr 2015, Takashi Iwai wrote:
> At Tue, 14 Apr 2015 21:08:30 -0000,
> Thomas Gleixner wrote:
> > 
> > No point in converting a timespec now that the value is directly
> > accessible. Get rid of the null check while at it. Resolution is
> > guaranteed to be > 0.
> > 
> > Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> > Cc: Jaroslav Kysela <perex@perex.cz>
> > Cc: Takashi Iwai <tiwai@suse.de>
> 
> Would you like me to pick this up through the sound git tree, or will
> you apply the whole set through tip?  In the latter case, feel free to
> take my ack:
>   Acked-by: Takashi Iwai <tiwai@suse.de>

I'll pick it up, as it depends on the core code changes.

Thanks,

	tglx


* Re: [patch 17/39] tick: sched: Remove hrtimer_active() checks
  2015-04-14 21:08 ` [patch 17/39] tick: sched: Remove hrtimer_active() checks Thomas Gleixner
@ 2015-04-16 13:37   ` Frederic Weisbecker
  2015-04-22 19:09   ` [tip:timers/core] " tip-bot for Thomas Gleixner
  1 sibling, 0 replies; 123+ messages in thread
From: Frederic Weisbecker @ 2015-04-16 13:37 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Peter Zijlstra, Ingo Molnar, Preeti U Murthy, Viresh Kumar,
	Marcelo Tosatti, John Stultz, Marcelo Tosatti

On Tue, Apr 14, 2015 at 09:08:52PM -0000, Thomas Gleixner wrote:
> hrtimer_start() enforces a timer interrupt if the timer is already
> expired. Get rid of the checks and the forward loop.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: John Stultz <john.stultz@linaro.org>
> Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
> Cc: Marcelo Tosatti <mtosatti@redhat.com>

Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com>


* Re: [patch 03/39] net: sched: Use hrtimer_resolution instead of hrtimer_get_res()
  2015-04-14 21:08 ` [patch 03/39] net: sched: Use hrtimer_resolution instead of hrtimer_get_res() Thomas Gleixner
@ 2015-04-16 16:04   ` David Miller
  2015-04-22 19:05   ` [tip:timers/core] " tip-bot for Thomas Gleixner
  1 sibling, 0 replies; 123+ messages in thread
From: David Miller @ 2015-04-16 16:04 UTC (permalink / raw)
  To: tglx
  Cc: linux-kernel, peterz, mingo, preeti, viresh.kumar, mtosatt,
	fweisbec, jhs, netdev

From: Thomas Gleixner <tglx@linutronix.de>
Date: Tue, 14 Apr 2015 21:08:28 -0000

> No point in converting a timespec now that the value is directly
> accessible.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

Acked-by: David S. Miller <davem@davemloft.net>


* Re: [patch 31/39] net: core: pktgen: Remove bogus hrtimer_active() check
  2015-04-14 21:09 ` [patch 31/39] net: core: pktgen: " Thomas Gleixner
@ 2015-04-16 16:04   ` David Miller
  2015-04-22 19:13   ` [tip:timers/core] net: core: pktgen: Remove bogus hrtimer_active( ) check tip-bot for Thomas Gleixner
  1 sibling, 0 replies; 123+ messages in thread
From: David Miller @ 2015-04-16 16:04 UTC (permalink / raw)
  To: tglx
  Cc: linux-kernel, peterz, mingo, preeti, viresh.kumar, mtosatt,
	fweisbec, netdev

From: Thomas Gleixner <tglx@linutronix.de>
Date: Tue, 14 Apr 2015 21:09:16 -0000

> The check for hrtimer_active() after starting the timer is
> pointless. If the timer is inactive it has expired already and
> therefore the task pointer is already NULL.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

Acked-by: David S. Miller <davem@davemloft.net>


* Re: [patch 20/39] tick: nohz: Rework next timer evaluation
  2015-04-14 21:08 ` [patch 20/39] tick: nohz: Rework next timer evaluation Thomas Gleixner
@ 2015-04-16 16:42   ` Paul E. McKenney
  2015-04-21 12:04     ` Thomas Gleixner
  2015-04-22 19:10   ` [tip:timers/core] tick: Nohz: " tip-bot for Thomas Gleixner
  1 sibling, 1 reply; 123+ messages in thread
From: Paul E. McKenney @ 2015-04-16 16:42 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Peter Zijlstra, Ingo Molnar, Preeti U Murthy, Viresh Kumar,
	Marcelo Tosatti, Frederic Weisbecker, Josh Triplett,
	Lai Jiangshan, John Stultz, Marcelo Tosatti

On Tue, Apr 14, 2015 at 09:08:58PM -0000, Thomas Gleixner wrote:
> The evaluation of the next timer in the nohz code is based on jiffies
> while all the tick internals are nanosecond based. We also have to
> convert hrtimer nanoseconds to jiffies in the !highres case. That's
> just wrong and introduces interesting corner cases.
> 
> Turn it around and convert the next timer wheel timer expiry and the
> rcu event to clock monotonic and base all calculations on
> nanoseconds. That identifies the case where no timer is pending
> clearly with an absolute expiry value of KTIME_MAX.
> 
> Makes the code more readable and gets rid of the jiffies magic in the
> nohz code.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> Cc: Josh Triplett <josh@joshtriplett.org>
> Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
> Cc: John Stultz <john.stultz@linaro.org>
> Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
> Cc: Marcelo Tosatti <mtosatti@redhat.com>
> Cc: Frederic Weisbecker <fweisbec@gmail.com>

Good stuff!

> ---
>  include/linux/hrtimer.h     |    2 
>  include/linux/rcupdate.h    |    6 +-
>  include/linux/rcutree.h     |    2 
>  include/linux/timer.h       |    7 --
>  kernel/rcu/tree_plugin.h    |   14 +++--

I guess that I had better take a look.  ;-)

Short summary:  No problems, looks like you found all of the
rcu_needs_cpu() functions.   (Yes, there are more of them than I would
have believed possible!)

>  kernel/time/hrtimer.c       |   14 ++---
>  kernel/time/tick-internal.h |    2 
>  kernel/time/tick-sched.c    |  109 +++++++++++++++++++-------------------------

I got a build error with CONFIG_NO_HZ_FULL on this one, looks like a
stray time_delta that needs to change to delta.  Builds OK with that
change, and one of the three scenarios runs fine as well (the other two
are running now).  Non-CONFIG_NO_HZ_FULL kernels pass a (very short)
five-minute rcutorture test.  (But note that rcutorture doesn't use
hrtimers directly, changing which is on my list.)

							Thanx, Paul

------------------------------------------------------------------------

diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 3af15c3..753c211 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -653,7 +653,7 @@ static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts,
 #ifdef CONFIG_NO_HZ_FULL
 	/* Limit the tick delta to the maximum scheduler deferment */
 	if (!ts->inidle)
-		delta = min(time_delta, scheduler_tick_max_deferment());
+		delta = min(delta, scheduler_tick_max_deferment());
 #endif
 
 	/* Calculate the next expiry time */

------------------------------------------------------------------------

>  kernel/time/tick-sched.h    |    2 
>  kernel/time/timer.c         |   71 +++++++++++++---------------
>  kernel/time/timer_list.c    |    4 -
>  11 files changed, 107 insertions(+), 126 deletions(-)
> 
> Index: tip/include/linux/hrtimer.h
> ===================================================================
> --- tip.orig/include/linux/hrtimer.h
> +++ tip/include/linux/hrtimer.h
> @@ -386,7 +386,7 @@ static inline int hrtimer_restart(struct
>  /* Query timers: */
>  extern ktime_t hrtimer_get_remaining(const struct hrtimer *timer);
> 
> -extern ktime_t hrtimer_get_next_event(void);
> +extern u64 hrtimer_get_next_event(void);
> 
>  /*
>   * A timer is active, when it is enqueued into the rbtree or the
> Index: tip/include/linux/rcupdate.h
> ===================================================================
> --- tip.orig/include/linux/rcupdate.h
> +++ tip/include/linux/rcupdate.h
> @@ -44,6 +44,8 @@
>  #include <linux/debugobjects.h>
>  #include <linux/bug.h>
>  #include <linux/compiler.h>
> +#include <linux/ktime.h>
> +
>  #include <asm/barrier.h>
> 
>  extern int rcu_expedited; /* for sysctl */
> @@ -1122,9 +1124,9 @@ static inline notrace void rcu_read_unlo
>  	__kfree_rcu(&((ptr)->rcu_head), offsetof(typeof(*(ptr)), rcu_head))
> 
>  #if defined(CONFIG_TINY_RCU) || defined(CONFIG_RCU_NOCB_CPU_ALL)
> -static inline int rcu_needs_cpu(unsigned long *delta_jiffies)
> +static inline int rcu_needs_cpu(u64 basemono, u64 *nextevt)
>  {
> -	*delta_jiffies = ULONG_MAX;
> +	*nextevt = KTIME_MAX;
>  	return 0;
>  }
>  #endif /* #if defined(CONFIG_TINY_RCU) || defined(CONFIG_RCU_NOCB_CPU_ALL) */
> Index: tip/include/linux/rcutree.h
> ===================================================================
> --- tip.orig/include/linux/rcutree.h
> +++ tip/include/linux/rcutree.h
> @@ -32,7 +32,7 @@
> 
>  void rcu_note_context_switch(void);
>  #ifndef CONFIG_RCU_NOCB_CPU_ALL
> -int rcu_needs_cpu(unsigned long *delta_jiffies);
> +int rcu_needs_cpu(u64 basem, u64 *nextevt);
>  #endif /* #ifndef CONFIG_RCU_NOCB_CPU_ALL */
>  void rcu_cpu_stall_reset(void);
> 
> Index: tip/include/linux/timer.h
> ===================================================================
> --- tip.orig/include/linux/timer.h
> +++ tip/include/linux/timer.h
> @@ -188,13 +188,6 @@ extern void set_timer_slack(struct timer
>  #define NEXT_TIMER_MAX_DELTA	((1UL << 30) - 1)
> 
>  /*
> - * Return when the next timer-wheel timeout occurs (in absolute jiffies),
> - * locks the timer base and does the comparison against the given
> - * jiffie.
> - */
> -extern unsigned long get_next_timer_interrupt(unsigned long now);
> -
> -/*
>   * Timer-statistics info:
>   */
>  #ifdef CONFIG_TIMER_STATS
> Index: tip/kernel/rcu/tree_plugin.h
> ===================================================================
> --- tip.orig/kernel/rcu/tree_plugin.h
> +++ tip/kernel/rcu/tree_plugin.h
> @@ -1372,9 +1372,9 @@ static void rcu_prepare_kthreads(int cpu
>   * any flavor of RCU.
>   */
>  #ifndef CONFIG_RCU_NOCB_CPU_ALL
> -int rcu_needs_cpu(unsigned long *delta_jiffies)
> +int rcu_needs_cpu(u64 basemono, u64 *nextevt)
>  {
> -	*delta_jiffies = ULONG_MAX;
> +	*nextevt = KTIME_MAX;

I was going to ask about basemono, but I see it in the next hunk.

>  	return rcu_cpu_has_callbacks(NULL);
>  }
>  #endif /* #ifndef CONFIG_RCU_NOCB_CPU_ALL */
> @@ -1485,16 +1485,17 @@ static bool __maybe_unused rcu_try_advan
>   * The caller must have disabled interrupts.
>   */
>  #ifndef CONFIG_RCU_NOCB_CPU_ALL
> -int rcu_needs_cpu(unsigned long *dj)
> +int rcu_needs_cpu(u64 basemono, u64 *nextevt)
>  {
>  	struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks);
> +	unsigned long dj;
> 
>  	/* Snapshot to detect later posting of non-lazy callback. */
>  	rdtp->nonlazy_posted_snap = rdtp->nonlazy_posted;
> 
>  	/* If no callbacks, RCU doesn't need the CPU. */
>  	if (!rcu_cpu_has_callbacks(&rdtp->all_lazy)) {
> -		*dj = ULONG_MAX;
> +		*nextevt = KTIME_MAX;
>  		return 0;
>  	}
> 
> @@ -1508,11 +1509,12 @@ int rcu_needs_cpu(unsigned long *dj)
> 
>  	/* Request timer delay depending on laziness, and round. */
>  	if (!rdtp->all_lazy) {
> -		*dj = round_up(rcu_idle_gp_delay + jiffies,
> +		dj = round_up(rcu_idle_gp_delay + jiffies,
>  			       rcu_idle_gp_delay) - jiffies;
>  	} else {
> -		*dj = round_jiffies(rcu_idle_lazy_gp_delay + jiffies) - jiffies;
> +		dj = round_jiffies(rcu_idle_lazy_gp_delay + jiffies) - jiffies;
>  	}
> +	*nextevt = basemono + dj * TICK_NSEC;

The multiply would have been a problem back in the day, but should
be just fine on modern hardware.  I suppose that slow hardware could
compensate by having the scheduling-clock period be an exact power of
two worth of nanoseconds.
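
For readers following the hunk above, the conversion from a relative
delay in jiffies to an absolute monotonic time can be sketched in
userspace C roughly as follows. The sketch_* names, the HZ=1000 tick
length and the 2^20 ns power-of-two tick are assumptions made for
illustration only; they are not taken from the patch.

```c
#include <stdint.h>

/* Assumed for illustration: HZ=1000, so one tick lasts 1,000,000 ns. */
#define SKETCH_TICK_NSEC	1000000ULL

/* Assumed for illustration: a tick of exactly 2^20 ns (about 1.05 ms). */
#define SKETCH_TICK_SHIFT	20

/*
 * Turn a relative delay in ticks into an absolute time in nanoseconds,
 * mirroring "*nextevt = basemono + dj * TICK_NSEC" with a 64-bit multiply.
 */
static uint64_t sketch_next_event_ns(uint64_t basemono, unsigned long dj)
{
	return basemono + (uint64_t)dj * SKETCH_TICK_NSEC;
}

/*
 * Same conversion when the tick period is a power of two worth of
 * nanoseconds: the multiply degenerates into a shift.
 */
static uint64_t sketch_next_event_ns_pow2(uint64_t basemono, unsigned long dj)
{
	return basemono + ((uint64_t)dj << SKETCH_TICK_SHIFT);
}
```

With basemono at 5,000,000 ns and a delay of 3 ticks, the first helper
yields 8,000,000 ns. The shift variant only applies when the tick
length really is a power of two, which is not the case for the usual
HZ values.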

>  	return 0;
>  }
>  #endif /* #ifndef CONFIG_RCU_NOCB_CPU_ALL */
> Index: tip/kernel/time/hrtimer.c
> ===================================================================
> --- tip.orig/kernel/time/hrtimer.c
> +++ tip/kernel/time/hrtimer.c
> @@ -1072,26 +1072,22 @@ EXPORT_SYMBOL_GPL(hrtimer_get_remaining)
>  /**
>   * hrtimer_get_next_event - get the time until next expiry event
>   *
> - * Returns the delta to the next expiry event or KTIME_MAX if no timer
> - * is pending.
> + * Returns the next expiry time or KTIME_MAX if no timer is pending.
>   */
> -ktime_t hrtimer_get_next_event(void)
> +u64 hrtimer_get_next_event(void)
>  {
>  	struct hrtimer_cpu_base *cpu_base = this_cpu_ptr(&hrtimer_bases);
> -	ktime_t mindelta = { .tv64 = KTIME_MAX };
> +	u64 expires = KTIME_MAX;
>  	unsigned long flags;
> 
>  	raw_spin_lock_irqsave(&cpu_base->lock, flags);
> 
>  	if (!__hrtimer_hres_active(cpu_base))
> -		mindelta = ktime_sub(__hrtimer_get_next_event(cpu_base),
> -				     ktime_get());
> +		expires = __hrtimer_get_next_event(cpu_base).tv64;
> 
>  	raw_spin_unlock_irqrestore(&cpu_base->lock, flags);
> 
> -	if (mindelta.tv64 < 0)
> -		mindelta.tv64 = 0;
> -	return mindelta;
> +	return expires;
>  }
>  #endif
> 
> Index: tip/kernel/time/tick-internal.h
> ===================================================================
> --- tip.orig/kernel/time/tick-internal.h
> +++ tip/kernel/time/tick-internal.h
> @@ -137,3 +137,5 @@ extern void tick_nohz_init(void);
>  # else
>  static inline void tick_nohz_init(void) { }
>  #endif
> +
> +extern u64 get_next_timer_interrupt(unsigned long basej, u64 basem);
> Index: tip/kernel/time/tick-sched.c
> ===================================================================
> --- tip.orig/kernel/time/tick-sched.c
> +++ tip/kernel/time/tick-sched.c
> @@ -582,39 +582,46 @@ static void tick_nohz_restart(struct tic
>  static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts,
>  					 ktime_t now, int cpu)
>  {
> -	unsigned long seq, last_jiffies, next_jiffies, delta_jiffies;
> -	ktime_t last_update, expires, ret = { .tv64 = 0 };
> -	unsigned long rcu_delta_jiffies;
>  	struct clock_event_device *dev = __this_cpu_read(tick_cpu_device.evtdev);
> -	u64 time_delta;
> -
> -	time_delta = timekeeping_max_deferment();
> +	u64 basemono, next_tick, next_tmr, next_rcu, delta, expires;
> +	unsigned long seq, basejiff;
> +	ktime_t	tick;
> 
>  	/* Read jiffies and the time when jiffies were updated last */
>  	do {
>  		seq = read_seqbegin(&jiffies_lock);
> -		last_update = last_jiffies_update;
> -		last_jiffies = jiffies;
> +		basemono = last_jiffies_update.tv64;
> +		basejiff = jiffies;
>  	} while (read_seqretry(&jiffies_lock, seq));
> +	ts->last_jiffies = basejiff;
> 
> -	if (rcu_needs_cpu(&rcu_delta_jiffies) ||
> +	if (rcu_needs_cpu(basemono, &next_rcu) ||
>  	    arch_needs_cpu() || irq_work_needs_cpu()) {
> -		next_jiffies = last_jiffies + 1;
> -		delta_jiffies = 1;
> +		next_tick = basemono + TICK_NSEC;
>  	} else {
> -		/* Get the next timer wheel timer */
> -		next_jiffies = get_next_timer_interrupt(last_jiffies);
> -		delta_jiffies = next_jiffies - last_jiffies;
> -		if (rcu_delta_jiffies < delta_jiffies) {
> -			next_jiffies = last_jiffies + rcu_delta_jiffies;
> -			delta_jiffies = rcu_delta_jiffies;
> -		}
> +		/*
> +		 * Get the next pending timer. If high resolution
> +		 * timers are enabled this only takes the timer wheel
> +		 * timers into account. If high resolution timers are
> +		 * disabled this also looks at the next expiring
> +		 * hrtimer.
> +		 */
> +		next_tmr = get_next_timer_interrupt(basejiff, basemono);
> +		ts->next_timer = next_tmr;
> +		/* Take the next rcu event into account */
> +		next_tick = next_rcu < next_tmr ? next_rcu : next_tmr;
>  	}
> 
> -	if ((long)delta_jiffies <= 1) {
> +	/*
> +	 * If the tick is due in the next period, keep it ticking or
> +	 * restart it proper.
> +	 */
> +	delta = next_tick - basemono;
> +	if (delta <= (u64)TICK_NSEC) {
> +		tick.tv64 = 0;
>  		if (!ts->tick_stopped)
>  			goto out;
> -		if (delta_jiffies == 0) {
> +		if (delta == 0) {
>  			/* Tick is stopped, but required now. Enforce it */
>  			tick_nohz_restart(ts, now);
>  			goto out;
> @@ -629,54 +636,39 @@ static ktime_t tick_nohz_stop_sched_tick
>  	 * do_timer() never invoked. Keep track of the fact that it
>  	 * was the one which had the do_timer() duty last. If this cpu
>  	 * is the one which had the do_timer() duty last, we limit the
> -	 * sleep time to the timekeeping max_deferement value which we
> -	 * retrieved above. Otherwise we can sleep as long as we want.
> +	 * sleep time to the timekeeping max_deferement value.
> +	 * Otherwise we can sleep as long as we want.
>  	 */
> +	delta = timekeeping_max_deferment();
>  	if (cpu == tick_do_timer_cpu) {
>  		tick_do_timer_cpu = TICK_DO_TIMER_NONE;
>  		ts->do_timer_last = 1;
>  	} else if (tick_do_timer_cpu != TICK_DO_TIMER_NONE) {
> -		time_delta = KTIME_MAX;
> +		delta = KTIME_MAX;
>  		ts->do_timer_last = 0;
>  	} else if (!ts->do_timer_last) {
> -		time_delta = KTIME_MAX;
> +		delta = KTIME_MAX;
>  	}
> 
>  #ifdef CONFIG_NO_HZ_FULL
> +	/* Limit the tick delta to the maximum scheduler deferment */
>  	if (!ts->inidle)
> -		time_delta = min(time_delta, scheduler_tick_max_deferment());
> +		delta = min(time_delta, scheduler_tick_max_deferment());

s/time_delta/delta/?

>  #endif
> 
> -	/*
> -	 * calculate the expiry time for the next timer wheel
> -	 * timer. delta_jiffies >= NEXT_TIMER_MAX_DELTA signals that
> -	 * there is no timer pending or at least extremely far into
> -	 * the future (12 days for HZ=1000). In this case we set the
> -	 * expiry to the end of time.
> -	 */
> -	if (likely(delta_jiffies < NEXT_TIMER_MAX_DELTA)) {
> -		/*
> -		 * Calculate the time delta for the next timer event.
> -		 * If the time delta exceeds the maximum time delta
> -		 * permitted by the current clocksource then adjust
> -		 * the time delta accordingly to ensure the
> -		 * clocksource does not wrap.
> -		 */
> -		time_delta = min_t(u64, time_delta,
> -				   tick_period.tv64 * delta_jiffies);
> -	}
> -
> -	if (time_delta < KTIME_MAX)
> -		expires = ktime_add_ns(last_update, time_delta);
> +	/* Calculate the next expiry time */
> +	if (delta < (KTIME_MAX - basemono))
> +		expires = basemono + delta;
>  	else
> -		expires.tv64 = KTIME_MAX;
> +		expires = KTIME_MAX;
> +
> +	expires = min_t(u64, expires, next_tick);
> +	tick.tv64 = expires;
> 
>  	/* Skip reprogram of event if its not changed */
> -	if (ts->tick_stopped && ktime_equal(expires, dev->next_event))
> +	if (ts->tick_stopped && (expires == dev->next_event.tv64))
>  		goto out;
> 
> -	ret = expires;
> -
>  	/*
>  	 * nohz_stop_sched_tick can be called several times before
>  	 * the nohz_restart_sched_tick is called. This happens when
> @@ -694,26 +686,23 @@ static ktime_t tick_nohz_stop_sched_tick
>  	}
> 
>  	/*
> -	 * If the expiration time == KTIME_MAX, then
> -	 * in this case we simply stop the tick timer.
> +	 * If the expiration time == KTIME_MAX, then we simply stop
> +	 * the tick timer.
>  	 */
> -	if (unlikely(expires.tv64 == KTIME_MAX)) {
> +	if (unlikely(expires == KTIME_MAX)) {
>  		if (ts->nohz_mode == NOHZ_MODE_HIGHRES)
>  			hrtimer_cancel(&ts->sched_timer);
>  		goto out;
>  	}
> 
>  	if (ts->nohz_mode == NOHZ_MODE_HIGHRES)
> -		hrtimer_start(&ts->sched_timer, expires,
> -			      HRTIMER_MODE_ABS_PINNED);
> +		hrtimer_start(&ts->sched_timer, tick, HRTIMER_MODE_ABS_PINNED);
>  	else
> -		tick_program_event(expires, 1);
> +		tick_program_event(tick, 1);
>  out:
> -	ts->next_jiffies = next_jiffies;
> -	ts->last_jiffies = last_jiffies;
> +	/* Update the estimated sleep length */
>  	ts->sleep_length = ktime_sub(dev->next_event, now);
> -
> -	return ret;
> +	return tick;
>  }
> 
>  static void tick_nohz_full_stop_tick(struct tick_sched *ts)
> Index: tip/kernel/time/tick-sched.h
> ===================================================================
> --- tip.orig/kernel/time/tick-sched.h
> +++ tip/kernel/time/tick-sched.h
> @@ -57,7 +57,7 @@ struct tick_sched {
>  	ktime_t				iowait_sleeptime;
>  	ktime_t				sleep_length;
>  	unsigned long			last_jiffies;
> -	unsigned long			next_jiffies;
> +	u64				next_timer;
>  	ktime_t				idle_expires;
>  	int				do_timer_last;
>  };
> Index: tip/kernel/time/timer.c
> ===================================================================
> --- tip.orig/kernel/time/timer.c
> +++ tip/kernel/time/timer.c
> @@ -49,6 +49,8 @@
>  #include <asm/timex.h>
>  #include <asm/io.h>
> 
> +#include "tick-internal.h"
> +
>  #define CREATE_TRACE_POINTS
>  #include <trace/events/timer.h>
> 
> @@ -1311,54 +1313,48 @@ cascade:
>   * Check, if the next hrtimer event is before the next timer wheel
>   * event:
>   */
> -static unsigned long cmp_next_hrtimer_event(unsigned long now,
> -					    unsigned long expires)
> +static u64 cmp_next_hrtimer_event(u64 basem, u64 expires)
>  {
> -	ktime_t hr_delta = hrtimer_get_next_event();
> -	struct timespec tsdelta;
> -	unsigned long delta;
> -
> -	if (hr_delta.tv64 == KTIME_MAX)
> -		return expires;
> +	u64 nextevt = hrtimer_get_next_event();
> 
>  	/*
> -	 * Expired timer available, let it expire in the next tick
> +	 * If high resolution timers are enabled
> +	 * hrtimer_get_next_event() returns KTIME_MAX.
>  	 */
> -	if (hr_delta.tv64 <= 0)
> -		return now + 1;
> -
> -	tsdelta = ktime_to_timespec(hr_delta);
> -	delta = timespec_to_jiffies(&tsdelta);
> +	if (expires <= nextevt)
> +		return expires;
> 
>  	/*
> -	 * Limit the delta to the max value, which is checked in
> -	 * tick_nohz_stop_sched_tick():
> +	 * If the next timer is already expired, return the tick base
> +	 * time so the tick is fired immediately.
>  	 */
> -	if (delta > NEXT_TIMER_MAX_DELTA)
> -		delta = NEXT_TIMER_MAX_DELTA;
> +	if (nextevt <= basem)
> +		return basem;
> 
>  	/*
> -	 * Take rounding errors in to account and make sure, that it
> -	 * expires in the next tick. Otherwise we go into an endless
> -	 * ping pong due to tick_nohz_stop_sched_tick() retriggering
> -	 * the timer softirq
> +	 * Round up to the next jiffie. High resolution timers are
> +	 * off, so the hrtimers are expired in the tick and we need to
> +	 * make sure that this tick really expires the timer to avoid
> +	 * a ping pong of the nohz stop code.
> +	 *
> +	 * Use DIV_ROUND_UP_ULL to prevent gcc calling __divdi3
>  	 */
> -	if (delta < 1)
> -		delta = 1;
> -	now += delta;
> -	if (time_before(now, expires))
> -		return now;
> -	return expires;
> +	return DIV_ROUND_UP_ULL(nextevt, TICK_NSEC) * TICK_NSEC;
>  }
> 
>  /**
> - * get_next_timer_interrupt - return the jiffy of the next pending timer
> - * @now: current time (in jiffies)
> + * get_next_timer_interrupt - return the time (clock mono) of the next timer
> + * @basej:	base time jiffies
> + * @basem:	base time clock monotonic
> + *
> + * Returns the tick aligned clock monotonic time of the next pending
> + * timer or KTIME_MAX if no timer is pending.
>   */
> -unsigned long get_next_timer_interrupt(unsigned long now)
> +u64 get_next_timer_interrupt(unsigned long basej, u64 basem)
>  {
>  	struct tvec_base *base = __this_cpu_read(tvec_bases);
> -	unsigned long expires = now + NEXT_TIMER_MAX_DELTA;
> +	u64 expires = KTIME_MAX;
> +	unsigned long nextevt;
> 
>  	/*
>  	 * Pretend that there is no timer pending if the cpu is offline.
> @@ -1371,14 +1367,15 @@ unsigned long get_next_timer_interrupt(u
>  	if (base->active_timers) {
>  		if (time_before_eq(base->next_timer, base->timer_jiffies))
>  			base->next_timer = __next_timer_interrupt(base);
> -		expires = base->next_timer;
> +		nextevt = base->next_timer;
> +		if (time_before_eq(nextevt, basej))
> +			expires = basem;
> +		else
> +			expires = basem + (nextevt - basej) * TICK_NSEC;
>  	}
>  	spin_unlock(&base->lock);
> 
> -	if (time_before_eq(expires, now))
> -		return now;
> -
> -	return cmp_next_hrtimer_event(now, expires);
> +	return cmp_next_hrtimer_event(basem, expires);
>  }
>  #endif
> 
> Index: tip/kernel/time/timer_list.c
> ===================================================================
> --- tip.orig/kernel/time/timer_list.c
> +++ tip/kernel/time/timer_list.c
> @@ -184,7 +184,7 @@ static void print_cpu(struct seq_file *m
>  		P_ns(idle_sleeptime);
>  		P_ns(iowait_sleeptime);
>  		P(last_jiffies);
> -		P(next_jiffies);
> +		P(next_timer);
>  		P_ns(idle_expires);
>  		SEQ_printf(m, "jiffies: %Lu\n",
>  			   (unsigned long long)jiffies);
> @@ -282,7 +282,7 @@ static void timer_list_show_tickdevices_
> 
>  static inline void timer_list_header(struct seq_file *m, u64 now)
>  {
> -	SEQ_printf(m, "Timer List Version: v0.7\n");
> +	SEQ_printf(m, "Timer List Version: v0.8\n");
>  	SEQ_printf(m, "HRTIMER_MAX_CLOCK_BASES: %d\n", HRTIMER_MAX_CLOCK_BASES);
>  	SEQ_printf(m, "now at %Ld nsecs\n", (unsigned long long)now);
>  	SEQ_printf(m, "\n");
> 
> 
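
Two pieces of arithmetic in the patch above may be easier to follow in
isolation: the overflow-safe computation of the expiry time in
tick_nohz_stop_sched_tick() and the round-up to the next tick boundary
in cmp_next_hrtimer_event(). What follows is a minimal userspace
sketch, assuming HZ=1000 and assuming basemono never exceeds KTIME_MAX.
All sketch_* names are hypothetical, and the plain division stands in
for DIV_ROUND_UP_ULL().

```c
#include <stdint.h>

/* KTIME_MAX is the largest signed 64-bit value. */
#define SKETCH_KTIME_MAX	((uint64_t)INT64_MAX)
/* Assumed for illustration: HZ=1000. */
#define SKETCH_TICK_NSEC	1000000ULL

/*
 * Add the allowed sleep delta to the base time without overflowing,
 * then clamp the result to the next pending tick event.
 * Assumes basemono <= SKETCH_KTIME_MAX.
 */
static uint64_t sketch_expiry(uint64_t basemono, uint64_t delta,
			      uint64_t next_tick)
{
	uint64_t expires;

	if (delta < SKETCH_KTIME_MAX - basemono)
		expires = basemono + delta;
	else
		expires = SKETCH_KTIME_MAX;

	return expires < next_tick ? expires : next_tick;
}

/*
 * Round an absolute time up to the next tick boundary so that the tick
 * which is programmed really expires the timer.
 */
static uint64_t sketch_round_up_to_tick(uint64_t nextevt)
{
	return (nextevt + SKETCH_TICK_NSEC - 1) / SKETCH_TICK_NSEC
		* SKETCH_TICK_NSEC;
}
```

Comparing delta against KTIME_MAX - basemono, rather than computing the
sum first, is what keeps the addition from wrapping when delta is
KTIME_MAX.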


^ permalink raw reply related	[flat|nested] 123+ messages in thread

* Re: [patch 34/39] tick: broadcast-hrtimer: Remove overly clever return value abuse
  2015-04-14 21:09 ` [patch 34/39] tick: broadcast-hrtimer: Remove overly clever return value abuse Thomas Gleixner
@ 2015-04-17 10:33   ` Preeti U Murthy
  2015-04-22 19:13   ` [tip:timers/core] " tip-bot for Thomas Gleixner
  1 sibling, 0 replies; 123+ messages in thread
From: Preeti U Murthy @ 2015-04-17 10:33 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: Peter Zijlstra, Ingo Molnar, Viresh Kumar, Marcelo Tosatti,
	Frederic Weisbecker

On 04/15/2015 02:39 AM, Thomas Gleixner wrote:
> The assignment of bc_moved in the conditional construct relies on the
> fact that in the case of hrtimer_start() invocation the return value
> is always 0. It took me a while to understand it.
> 
> We want to get rid of the hrtimer_start() return value. Open code the
> logic which makes it readable as well.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Cc:  Preeti U Murthy <preeti@linux.vnet.ibm.com>
> ---
>  kernel/time/tick-broadcast-hrtimer.c |    8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> Index: tip/kernel/time/tick-broadcast-hrtimer.c
> ===================================================================
> --- tip.orig/kernel/time/tick-broadcast-hrtimer.c
> +++ tip/kernel/time/tick-broadcast-hrtimer.c
> @@ -66,9 +66,11 @@ static int bc_set_next(ktime_t expires,
>  	 * hrtimer_{start/cancel} functions call into tracing,
>  	 * calls to these functions must be bound within RCU_NONIDLE.
>  	 */
> -	RCU_NONIDLE(bc_moved = (hrtimer_try_to_cancel(&bctimer) >= 0) ?
> -		!hrtimer_start(&bctimer, expires, HRTIMER_MODE_ABS_PINNED) :
> -			0);
> +	RCU_NONIDLE({
> +			bc_moved = hrtimer_try_to_cancel(&bctimer) >= 0;
> +			if (bc_moved)
> +				hrtimer_start(&bctimer, expires,
> +					      HRTIMER_MODE_ABS_PINNED);});
>  	if (bc_moved) {
>  		/* Bind the "device" to the cpu */
>  		bc->bound_on = smp_processor_id();

Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
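
The equivalence the changelog relies on can be checked with a small
userspace sketch. The two stub functions below are hypothetical
stand-ins for hrtimer_try_to_cancel() and hrtimer_start(); the only
properties assumed are that the cancel stub returns -1 while the
callback is running and 0 or 1 otherwise, and that the start stub
returns 0 when called on a timer that was just cancelled.

```c
#include <stdbool.h>

/* Stub: -1 = callback running, 0 = timer was inactive, 1 = timer cancelled. */
static int sketch_try_to_cancel(int state)
{
	return state;
}

/* Stub: counts invocations and returns 0, as the changelog describes. */
static int sketch_start(int *started)
{
	(*started)++;
	return 0;
}

/* The old form: bc_moved depends on the start stub returning 0. */
static bool sketch_move_clever(int state, int *started)
{
	return (sketch_try_to_cancel(state) >= 0) ?
		!sketch_start(started) : 0;
}

/* The open coded form: the start stub's return value is ignored. */
static bool sketch_move_open_coded(int state, int *started)
{
	bool moved = sketch_try_to_cancel(state) >= 0;

	if (moved)
		sketch_start(started);
	return moved;
}
```

Both helpers agree for all three cancel results, and the open coded
one keeps working if the start function stops returning a value.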


^ permalink raw reply	[flat|nested] 123+ messages in thread

* Re: [patch 02/39] hrtimer: Get rid of the resolution field in hrtimer_clock_base
  2015-04-14 21:08 ` [patch 02/39] hrtimer: Get rid of the resolution field in hrtimer_clock_base Thomas Gleixner
  2015-04-15  6:29   ` Frans Klaver
@ 2015-04-20  8:34   ` Preeti U Murthy
  2015-04-22 19:05   ` [tip:timers/core] " tip-bot for Thomas Gleixner
  2 siblings, 0 replies; 123+ messages in thread
From: Preeti U Murthy @ 2015-04-20  8:34 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: Peter Zijlstra, Ingo Molnar, Viresh Kumar, Marcelo Tosatti,
	Frederic Weisbecker

On 04/15/2015 02:38 AM, Thomas Gleixner wrote:
>>The field has no value because all clock bases have the same
>>resolution. The resolution only changes when we switch to high
>>resolution timer mode. We can evaluate that from a single static
>>variable as well. In the !HIGHRES case its simply a constant.
>>
>>Export the variable, so we can simplify the usage sites.
>>
>>Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
>>---

Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
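
A minimal sketch of the idea in the changelog, assuming HZ=1000 for
the low resolution value and 1 ns for the high resolution value. The
sketch_* names are hypothetical and only illustrate replacing a
per-clock-base field with one shared variable.

```c
/* Assumed for illustration: one tick at HZ=1000. */
#define SKETCH_LOW_RES_NSEC	1000000U
/* Assumed for illustration: 1 ns once high resolution mode is active. */
#define SKETCH_HIGH_RES_NSEC	1U

/* One variable for all clock bases instead of a field in each base. */
static unsigned int sketch_resolution = SKETCH_LOW_RES_NSEC;

/* The only place where the resolution changes. */
static void sketch_switch_to_highres(void)
{
	sketch_resolution = SKETCH_HIGH_RES_NSEC;
}

/* Usage sites no longer need to look at a particular clock base. */
static unsigned int sketch_clock_getres(int clock_id)
{
	(void)clock_id;	/* every clock base reports the same value */
	return sketch_resolution;
}
```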


^ permalink raw reply	[flat|nested] 123+ messages in thread

* Re: [patch 08/39] hrtimer: Make offset update smarter
  2015-04-14 21:08 ` [patch 08/39] hrtimer: Make offset update smarter Thomas Gleixner
@ 2015-04-20  9:30   ` Preeti U Murthy
  2015-04-22 19:06   ` [tip:timers/core] " tip-bot for Thomas Gleixner
  1 sibling, 0 replies; 123+ messages in thread
From: Preeti U Murthy @ 2015-04-20  9:30 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: Peter Zijlstra, Ingo Molnar, Viresh Kumar, Marcelo Tosatti,
	Frederic Weisbecker, John Stultz

On 04/15/2015 02:38 AM, Thomas Gleixner wrote:
>>On every tick/hrtimer interrupt we update the offset variables of the
>>clock bases. That's silly because these offsets change very seldom.
>>
>>Add a sequence counter to the time keeping code which keeps track of
>>the offset updates (clock_was_set()). Have a sequence cache in the
>>hrtimer cpu bases to evaluate whether the offsets must be updated or
>>not. This allows us later to avoid pointless cacheline pollution.
>>
>>Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
>>Cc: John Stultz <john.stultz@linaro.org>

Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
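
The sequence cache described in the changelog can be sketched in
userspace C as below. The struct layouts and sketch_* names are
assumptions for illustration, a single offset stands in for all of
them, and locking is left out entirely.

```c
#include <stdbool.h>
#include <stdint.h>

/* Timekeeping side: the counter is bumped whenever the offsets change. */
struct sketch_timekeeper {
	unsigned int	clock_was_set_seq;
	int64_t		offs_real;
};

/* Per-cpu side: remembers the sequence value of the last copy. */
struct sketch_cpu_base {
	unsigned int	clock_was_set_seq;
	int64_t		offs_real;
};

/* Called when the clock is set: store the new offset, bump the counter. */
static void sketch_clock_was_set(struct sketch_timekeeper *tk, int64_t offs)
{
	tk->offs_real = offs;
	tk->clock_was_set_seq++;
}

/*
 * Called on every tick/hrtimer interrupt. Copies the offset only when
 * the sequence numbers differ. Returns true if a copy took place.
 */
static bool sketch_update_offsets(const struct sketch_timekeeper *tk,
				  struct sketch_cpu_base *cb)
{
	if (cb->clock_was_set_seq == tk->clock_was_set_seq)
		return false;

	cb->offs_real = tk->offs_real;
	cb->clock_was_set_seq = tk->clock_was_set_seq;
	return true;
}
```

In the common case the interrupt path performs one integer comparison
and leaves the cached offsets alone.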


^ permalink raw reply	[flat|nested] 123+ messages in thread

* Re: [patch 10/39] hrtimer: Use cpu_base->active_base for hotpath iterators
  2015-04-14 21:08 ` [patch 10/39] hrtimer: Use cpu_base->active_base for hotpath iterators Thomas Gleixner
@ 2015-04-20 11:16   ` Preeti U Murthy
  2015-04-21 11:53     ` Thomas Gleixner
  2015-04-22 19:07   ` [tip:timers/core] hrtimer: Use cpu_base-> active_base " tip-bot for Thomas Gleixner
  1 sibling, 1 reply; 123+ messages in thread
From: Preeti U Murthy @ 2015-04-20 11:16 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: Peter Zijlstra, Ingo Molnar, Viresh Kumar, Marcelo Tosatti,
	Frederic Weisbecker

On 04/15/2015 02:38 AM, Thomas Gleixner wrote:
>>Now that we have the active_bases field in sync we can use it for

This sentence appears a bit ambiguous. I am guessing you are referring
to what the first patch in this series did, in which case wouldn't it be
better if it is stated a bit more elaborately like 'Now that it is
guaranteed that active_bases field will be in sync with the timerqueue
on the corresponding clock base' ? It took me a while to figure out what
the statement was referring to.

>>iterating over the clock bases. This allows to break out early if no
>>more active clock bases are available and avoids touching the cache
>>lines of inactive clock bases.
>>
>>Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
>>---

Regards
Preeti U Murthy

>> kernel/time/hrtimer.c |   17 ++++++++---------
>> 1 file changed, 8 insertions(+), 9 deletions(-)
>>
>>Index: tip/kernel/time/hrtimer.c
>>===================================================================
>>--- tip.orig/kernel/time/hrtimer.c
>>+++ tip/kernel/time/hrtimer.c
>>@@ -419,16 +419,16 @@ static ktime_t __hrtimer_get_next_event(
>> {
>> 	struct hrtimer_clock_base *base = cpu_base->clock_base;
>> 	ktime_t expires, expires_next = { .tv64 = KTIME_MAX };
>>-	int i;
>>+	unsigned int active = cpu_base->active_bases;
>>
>>-	for (i = 0; i < HRTIMER_MAX_CLOCK_BASES; i++, base++) {
>>+	for (; active; base++, active >>= 1) {
>> 		struct timerqueue_node *next;
>> 		struct hrtimer *timer;
>>
>>-		next = timerqueue_getnext(&base->active);
>>-		if (!next)
>>+		if (!(active & 0x01))
>> 			continue;
>>
>>+		next = timerqueue_getnext(&base->active);
>> 		timer = container_of(next, struct hrtimer, node);
>> 		expires = ktime_sub(hrtimer_get_expires(timer), base->offset);
>> 		if (expires.tv64 < expires_next.tv64)
>>@@ -1206,17 +1206,16 @@ static void __run_hrtimer(struct hrtimer
>>
>> static void __hrtimer_run_queues(struct hrtimer_cpu_base *cpu_base, ktime_t now)
>> {
>>-	int i;
>>+	struct hrtimer_clock_base *base = cpu_base->clock_base;
>>+	unsigned int active = cpu_base->active_bases;
>>
>>-	for (i = 0; i < HRTIMER_MAX_CLOCK_BASES; i++) {
>>-		struct hrtimer_clock_base *base;
>>+	for (; active; base++, active >>= 1) {
>> 		struct timerqueue_node *node;
>> 		ktime_t basenow;
>>
>>-		if (!(cpu_base->active_bases & (1 << i)))
>>+		if (!(active & 0x01))
>> 			continue;
>>
>>-		base = cpu_base->clock_base + i;
>> 		basenow = ktime_add(now, base->offset);
>>
>> 		while ((node = timerqueue_getnext(&base->active))) {
>>
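
The loop shape used in both hunks can be tried out in userspace with
the sketch below. The struct, the sketch_* names and the visited
counter are assumptions added for illustration; the base offset
handling of the real code is left out.

```c
#include <stdint.h>

#define SKETCH_MAX_BASES	4

/* Stand-in for a clock base: only the earliest expiry is kept. */
struct sketch_base {
	uint64_t next_expiry;
};

/*
 * Walk the clock bases using a copy of the active bitmask. Bit n set
 * means base n has a queued timer. The loop ends as soon as no set
 * bits remain, so trailing inactive bases are never looked at.
 */
static uint64_t sketch_earliest_expiry(const struct sketch_base *base,
				       unsigned int active,
				       unsigned int *visited)
{
	uint64_t next = UINT64_MAX;

	for (; active; base++, active >>= 1) {
		if (!(active & 0x01))
			continue;	/* inactive base: memory not read */

		(*visited)++;
		if (base->next_expiry < next)
			next = base->next_expiry;
	}
	return next;
}
```

With expiries {50, 10, 30, 5} and active = 0x5, only bases 0 and 2 are
read and the result is 30. With active = 0x2 the loop stops after
base 1.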


^ permalink raw reply	[flat|nested] 123+ messages in thread

* Re: [patch 10/39] hrtimer: Use cpu_base->active_base for hotpath iterators
  2015-04-20 11:16   ` Preeti U Murthy
@ 2015-04-21 11:53     ` Thomas Gleixner
  2015-04-22  3:13       ` Preeti U Murthy
  0 siblings, 1 reply; 123+ messages in thread
From: Thomas Gleixner @ 2015-04-21 11:53 UTC (permalink / raw)
  To: Preeti U Murthy
  Cc: LKML, Peter Zijlstra, Ingo Molnar, Viresh Kumar, Marcelo Tosatti,
	Frederic Weisbecker

On Mon, 20 Apr 2015, Preeti U Murthy wrote:

> On 04/15/2015 02:38 AM, Thomas Gleixner wrote:
> >>Now that we have the active_bases field in sync we can use it for
> 
> This sentence appears a bit ambiguous. I am guessing you are referring
> to what the first patch in this series did, in which case wouldn't it be
> better if it is stated a bit more elaborately like 'Now that it is
> guaranteed that active_bases field will be in sync with the timerqueue
> on the corresponding clock base' ? It took me a while to figure out what
> the statement was referring to.

Indeed. I'm going to rephrase that.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 123+ messages in thread

* Re: [patch 20/39] tick: nohz: Rework next timer evaluation
  2015-04-16 16:42   ` Paul E. McKenney
@ 2015-04-21 12:04     ` Thomas Gleixner
  0 siblings, 0 replies; 123+ messages in thread
From: Thomas Gleixner @ 2015-04-21 12:04 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: LKML, Peter Zijlstra, Ingo Molnar, Preeti U Murthy, Viresh Kumar,
	Marcelo Tosatti, Frederic Weisbecker, Josh Triplett,
	Lai Jiangshan, John Stultz, Marcelo Tosatti

On Thu, 16 Apr 2015, Paul E. McKenney wrote:
> > @@ -1508,11 +1509,12 @@ int rcu_needs_cpu(unsigned long *dj)
> > 
> >  	/* Request timer delay depending on laziness, and round. */
> >  	if (!rdtp->all_lazy) {
> > -		*dj = round_up(rcu_idle_gp_delay + jiffies,
> > +		dj = round_up(rcu_idle_gp_delay + jiffies,
> >  			       rcu_idle_gp_delay) - jiffies;
> >  	} else {
> > -		*dj = round_jiffies(rcu_idle_lazy_gp_delay + jiffies) - jiffies;
> > +		dj = round_jiffies(rcu_idle_lazy_gp_delay + jiffies) - jiffies;
> >  	}
> > +	*nextevt = basemono + dj * TICK_NSEC;
> 
> The multiply would have been a problem back in the day, but should
> be just fine on modern hardware.  I suppose that slow hardware could
> compensate by having the scheduling-clock period be an exact power of
> two worth of nanoseconds.

I don't think the extra multiply matters much. round_up() and
round_jiffies() are way more expensive ...
 
> >  #ifdef CONFIG_NO_HZ_FULL
> > +	/* Limit the tick delta to the maximum scheduler deferment */
> >  	if (!ts->inidle)
> > -		time_delta = min(time_delta, scheduler_tick_max_deferment());
> > +		delta = min(time_delta, scheduler_tick_max_deferment());
> 
> s/time_delta/delta/?

Doh, yes.

Btw. Could you please trim your replies? It's hard to find the single
line comment when forced to scroll down several pages.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 123+ messages in thread

* Re: [patch 10/39] hrtimer: Use cpu_base->active_base for hotpath iterators
  2015-04-21 11:53     ` Thomas Gleixner
@ 2015-04-22  3:13       ` Preeti U Murthy
  0 siblings, 0 replies; 123+ messages in thread
From: Preeti U Murthy @ 2015-04-22  3:13 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Peter Zijlstra, Ingo Molnar, Viresh Kumar, Marcelo Tosatti,
	Frederic Weisbecker

On 04/21/2015 05:23 PM, Thomas Gleixner wrote:
> On Mon, 20 Apr 2015, Preeti U Murthy wrote:
> 
>> On 04/15/2015 02:38 AM, Thomas Gleixner wrote:
>>>> Now that we have the active_bases field in sync we can use it for
>>
>> This sentence appears a bit ambiguous. I am guessing you are referring
>> to what the first patch in this series did, in which case wouldn't it be
>> better if it is stated a bit more elaborately like 'Now that it is
>> guaranteed that active_bases field will be in sync with the timerqueue
>> on the corresponding clock base' ? It took me a while to figure out what
>> the statement was referring to.
> 
> Indeed. I'm going to rephrase that.

Thanks.

Reviewed-by: Preeti U. Murthy <preeti@linux.vnet.ibm.com>
> 
> Thanks,
> 
> 	tglx
> 



^ permalink raw reply	[flat|nested] 123+ messages in thread

* Re: [patch 18/39] tick: sched: Force tick interrupt and get rid of softirq magic
  2015-04-14 21:08 ` [patch 18/39] tick: sched: Force tick interrupt and get rid of softirq magic Thomas Gleixner
@ 2015-04-22 14:22   ` Frederic Weisbecker
  2015-04-22 14:32     ` Thomas Gleixner
  2015-04-22 19:09   ` [tip:timers/core] " tip-bot for Thomas Gleixner
  1 sibling, 1 reply; 123+ messages in thread
From: Frederic Weisbecker @ 2015-04-22 14:22 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Peter Zijlstra, Ingo Molnar, Preeti U Murthy, Viresh Kumar,
	Marcelo Tosatti, John Stultz, Marcelo Tosatti

On Tue, Apr 14, 2015 at 09:08:54PM -0000, Thomas Gleixner wrote:
> We already got rid of the hrtimer reprogramming loops and hoops as
> hrtimer now enforces an interrupt if the enqueued time is in the past.
> 
> Do the same for the nohz non highres mode. That gets rid of the need
> to raise the softirq which only serves the purpose of getting the
> machine out of the inner idle loop.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: John Stultz <john.stultz@linaro.org>
> Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
> Cc: Marcelo Tosatti <mtosatti@redhat.com>
> Cc: Frederic Weisbecker <fweisbec@gmail.com>
> 
> ---
>  kernel/time/tick-sched.c |   83 ++++++++++++++++-------------------------------
>  1 file changed, 29 insertions(+), 54 deletions(-)
> 
> Index: tip/kernel/time/tick-sched.c
> ===================================================================
> --- tip.orig/kernel/time/tick-sched.c
> +++ tip/kernel/time/tick-sched.c
> @@ -565,6 +565,20 @@ u64 get_cpu_iowait_time_us(int cpu, u64
>  }
>  EXPORT_SYMBOL_GPL(get_cpu_iowait_time_us);
>  
> +static void tick_nohz_restart(struct tick_sched *ts, ktime_t now)
> +{
> +	hrtimer_cancel(&ts->sched_timer);
> +	hrtimer_set_expires(&ts->sched_timer, ts->last_tick);
> +
> +	/* Forward the time to expire in the future */
> +	hrtimer_forward(&ts->sched_timer, now, tick_period);
> +
> +	if (ts->nohz_mode == NOHZ_MODE_HIGHRES)
> +		hrtimer_start_expires(&ts->sched_timer, HRTIMER_MODE_ABS_PINNED);
> +	else
> +		tick_program_event(hrtimer_get_expires(&ts->sched_timer), 1);
> +}
> +
>  static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts,
>  					 ktime_t now, int cpu)
>  {
> @@ -691,22 +705,18 @@ static ktime_t tick_nohz_stop_sched_tick
>  			if (ts->nohz_mode == NOHZ_MODE_HIGHRES)
>  				hrtimer_cancel(&ts->sched_timer);
>  			goto out;
> -		}
> +		 }
>  
> -		if (ts->nohz_mode == NOHZ_MODE_HIGHRES) {
> -			hrtimer_start(&ts->sched_timer, expires,
> -				      HRTIMER_MODE_ABS_PINNED);
> -			goto out;
> -		} else if (!tick_program_event(expires, 0))
> -			goto out;
> -		/*
> -		 * We are past the event already. So we crossed a
> -		 * jiffie boundary. Update jiffies and raise the
> -		 * softirq.
> -		 */
> -		tick_do_update_jiffies64(ktime_get());
> +		 if (ts->nohz_mode == NOHZ_MODE_HIGHRES)
> +			 hrtimer_start(&ts->sched_timer, expires,
> +				       HRTIMER_MODE_ABS_PINNED);
> +		 else
> +			 tick_program_event(expires, 1);
> +	} else {
> +		/* Tick is stopped, but required now. Enforce it */
> +		tick_nohz_restart(ts, now);
>  	}
> -	raise_softirq_irqoff(TIMER_SOFTIRQ);
> +

But the reprogramming happens only under "if ((long)delta_jiffies >= 1)".
Probably this condition should go away as well.

In the end, the possible side effect, at least on low-res, is that timers
which are already expired will be handled on the next tick instead of now.
But probably it doesn't matter much to have a one-tick delay.

^ permalink raw reply	[flat|nested] 123+ messages in thread

* Re: [patch 18/39] tick: sched: Force tick interrupt and get rid of softirq magic
  2015-04-22 14:22   ` Frederic Weisbecker
@ 2015-04-22 14:32     ` Thomas Gleixner
  2015-04-23 11:47       ` Frederic Weisbecker
  0 siblings, 1 reply; 123+ messages in thread
From: Thomas Gleixner @ 2015-04-22 14:32 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Peter Zijlstra, Ingo Molnar, Preeti U Murthy, Viresh Kumar,
	Marcelo Tosatti, John Stultz, Marcelo Tosatti

On Wed, 22 Apr 2015, Frederic Weisbecker wrote:

> On Tue, Apr 14, 2015 at 09:08:54PM -0000, Thomas Gleixner wrote:
> > We already got rid of the hrtimer reprogramming loops and hoops as
> > hrtimer now enforces an interrupt if the enqueued time is in the past.
> > 
> > Do the same for the nohz non highres mode. That gets rid of the need
> > to raise the softirq which only serves the purpose of getting the
> > machine out of the inner idle loop.
> > 
> > Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> > Cc: John Stultz <john.stultz@linaro.org>
> > Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
> > Cc: Marcelo Tosatti <mtosatti@redhat.com>
> > Cc: Frederic Weisbecker <fweisbec@gmail.com>
> > 
> > ---
> >  kernel/time/tick-sched.c |   83 ++++++++++++++++-------------------------------
> >  1 file changed, 29 insertions(+), 54 deletions(-)
> > 
> > Index: tip/kernel/time/tick-sched.c
> > ===================================================================
> > --- tip.orig/kernel/time/tick-sched.c
> > +++ tip/kernel/time/tick-sched.c
> > @@ -565,6 +565,20 @@ u64 get_cpu_iowait_time_us(int cpu, u64
> >  }
> >  EXPORT_SYMBOL_GPL(get_cpu_iowait_time_us);
> >  
> > +static void tick_nohz_restart(struct tick_sched *ts, ktime_t now)
> > +{
> > +	hrtimer_cancel(&ts->sched_timer);
> > +	hrtimer_set_expires(&ts->sched_timer, ts->last_tick);
> > +
> > +	/* Forward the time to expire in the future */
> > +	hrtimer_forward(&ts->sched_timer, now, tick_period);
> > +
> > +	if (ts->nohz_mode == NOHZ_MODE_HIGHRES)
> > +		hrtimer_start_expires(&ts->sched_timer, HRTIMER_MODE_ABS_PINNED);
> > +	else
> > +		tick_program_event(hrtimer_get_expires(&ts->sched_timer), 1);
> > +}
> > +
> >  static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts,
> >  					 ktime_t now, int cpu)
> >  {
> > @@ -691,22 +705,18 @@ static ktime_t tick_nohz_stop_sched_tick
> >  			if (ts->nohz_mode == NOHZ_MODE_HIGHRES)
> >  				hrtimer_cancel(&ts->sched_timer);
> >  			goto out;
> > -		}
> > +		 }
> >  
> > -		if (ts->nohz_mode == NOHZ_MODE_HIGHRES) {
> > -			hrtimer_start(&ts->sched_timer, expires,
> > -				      HRTIMER_MODE_ABS_PINNED);
> > -			goto out;
> > -		} else if (!tick_program_event(expires, 0))
> > -			goto out;
> > -		/*
> > -		 * We are past the event already. So we crossed a
> > -		 * jiffie boundary. Update jiffies and raise the
> > -		 * softirq.
> > -		 */
> > -		tick_do_update_jiffies64(ktime_get());
> > +		 if (ts->nohz_mode == NOHZ_MODE_HIGHRES)
> > +			 hrtimer_start(&ts->sched_timer, expires,
> > +				       HRTIMER_MODE_ABS_PINNED);
> > +		 else
> > +			 tick_program_event(expires, 1);
> > +	} else {
> > +		/* Tick is stopped, but required now. Enforce it */
> > +		tick_nohz_restart(ts, now);
> >  	}
> > -	raise_softirq_irqoff(TIMER_SOFTIRQ);
> > +
> 
> But the reprogramming happens only under "if ((long)delta_jiffies >= 1)".
> Probably this condition should go away as well.

Errm.

	if (!ts->tick_stopped && delta_jiffies <= 1)
	     goto out;

So if the tick is NOT stopped and delta_jiffies <= 1 we let it tick
and do nothing.

        if (delta_jiffies >= 1)
	     Do the magic nohz stuff
	else
	     tick_nohz_restart()

We want the distinction here because if the tick IS stopped and the
next event is due we need to kick it into gear again. So the condition
needs to stay. It probably should be if (delta > 1), but that's a
different story.
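
The control flow described above can be written out as a small,
self-contained sketch. This is only an illustration of the two checks
quoted in this mail, not the code in tick-sched.c; the enum and the
function name tick_decide() are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical labels for the three outcomes; not kernel identifiers. */
enum tick_action {
	TICK_KEEP_RUNNING,	/* tick not stopped, next event is close: do nothing */
	TICK_STOP_OR_DEFER,	/* the "magic nohz stuff": program a later expiry */
	TICK_RESTART,		/* tick stopped but required now: kick it again */
};

/* Mirrors the order of the two conditions quoted in the mail. */
static enum tick_action tick_decide(bool tick_stopped, long delta_jiffies)
{
	if (!tick_stopped && delta_jiffies <= 1)
		return TICK_KEEP_RUNNING;

	if (delta_jiffies >= 1)
		return TICK_STOP_OR_DEFER;

	/* Only reachable when the tick is stopped and delta_jiffies < 1. */
	return TICK_RESTART;
}
```

The last branch is the case the condition has to preserve: it is only
reached when the tick is already stopped and the next event is due.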
 
> In the end, the possible side effect, at least on low-res, is that timers
> which are already expired will be handled on the next tick instead of now.
> But probably it doesn't matter much to have a one-tick delay.

We really don't care. That stuff has no guarantees aside from the
guarantee that it does not expire early :)

Thanks,

	tglx



* [tip:timers/core] hrtimer: Update active_bases before calling hrtimer_force_reprogram()
  2015-04-07  2:10 ` [PATCH V2 1/2] hrtimer: update '->active_bases' before calling hrtimer_force_reprogram() Viresh Kumar
@ 2015-04-22 19:04   ` tip-bot for Viresh Kumar
  0 siblings, 0 replies; 123+ messages in thread
From: tip-bot for Viresh Kumar @ 2015-04-22 19:04 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: hpa, peterz, fweisbec, preeti, tglx, viresh.kumar, mtosatti,
	linux-kernel, mingo

Commit-ID:  d9f0acdeef48570c4e6159d3108f12b64571392e
Gitweb:     http://git.kernel.org/tip/d9f0acdeef48570c4e6159d3108f12b64571392e
Author:     Viresh Kumar <viresh.kumar@linaro.org>
AuthorDate: Tue, 14 Apr 2015 21:08:25 +0000
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Wed, 22 Apr 2015 17:06:48 +0200

hrtimer: Update active_bases before calling hrtimer_force_reprogram()

'active_bases' indicates which clock bases have active timers. The
intention of this bit field was to avoid evaluating inactive bases. It
was introduced with the introduction of the BOOTTIME and TAI clock
bases, but it was never brought into full use.

We want to use it now, but in __remove_hrtimer() the update happens
after the call to hrtimer_force_reprogram(), which has to evaluate all
clock bases for the next expiring timer. So in case the last timer of
a clock base got removed we still see the active bit and therefore
evaluate the clock base for no value. There are further optimizations
possible when active_bases is updated in the right place.

Move the update before the call to hrtimer_force_reprogram().

[ tglx: Massaged changelog ]
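
The effect of the ordering can be seen in a small userspace model.
This is a sketch under stated assumptions, not kernel code: struct
model_cpu_base, bases_to_scan() and remove_then_scan() are hypothetical
names, and a plain counter stands in for the per-base timerqueue.

```c
#define MAX_BASES 4

struct model_cpu_base {
	unsigned int active_bases;		/* bit n set: base n has queued timers */
	unsigned int nr_timers[MAX_BASES];	/* stand-in for the per-base timerqueue */
};

/* Number of bases a reprogram pass would visit if it honours the bitmask. */
static unsigned int bases_to_scan(const struct model_cpu_base *cb)
{
	unsigned int i, n = 0;

	for (i = 0; i < MAX_BASES; i++)
		if (cb->active_bases & (1U << i))
			n++;
	return n;
}

/*
 * Remove one timer from base idx. The active bit is cleared first, so the
 * scan that follows no longer sees a base that has just become empty.
 */
static unsigned int remove_then_scan(struct model_cpu_base *cb, unsigned int idx)
{
	if (cb->nr_timers[idx])
		cb->nr_timers[idx]--;
	if (!cb->nr_timers[idx])
		cb->active_bases &= ~(1U << idx);

	return bases_to_scan(cb);
}
```

With the bit cleared after the scan instead, the scan would still count
the base whose last timer was just removed.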

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: linaro-kernel@lists.linaro.org
Link: http://lkml.kernel.org/r/20150414203500.533438642@linutronix.de
Link: http://lkml.kernel.org/r/c7c8ebcd9ed88bb09d76059c745a1fafb48314e7.1428039899.git.viresh.kumar@linaro.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/time/hrtimer.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
index b1a74ee..9abd50b 100644
--- a/kernel/time/hrtimer.c
+++ b/kernel/time/hrtimer.c
@@ -887,6 +887,9 @@ static void __remove_hrtimer(struct hrtimer *timer,
 
 	next_timer = timerqueue_getnext(&base->active);
 	timerqueue_del(&base->active, &timer->node);
+	if (!timerqueue_getnext(&base->active))
+		base->cpu_base->active_bases &= ~(1 << base->index);
+
 	if (&timer->node == next_timer) {
 #ifdef CONFIG_HIGH_RES_TIMERS
 		/* Reprogram the clock event device. if enabled */
@@ -900,8 +903,6 @@ static void __remove_hrtimer(struct hrtimer *timer,
 		}
 #endif
 	}
-	if (!timerqueue_getnext(&base->active))
-		base->cpu_base->active_bases &= ~(1 << base->index);
 out:
 	timer->state = newstate;
 }


* [tip:timers/core] hrtimer: Get rid of the resolution field in hrtimer_clock_base
  2015-04-14 21:08 ` [patch 02/39] hrtimer: Get rid of the resolution field in hrtimer_clock_base Thomas Gleixner
  2015-04-15  6:29   ` Frans Klaver
  2015-04-20  8:34   ` Preeti U Murthy
@ 2015-04-22 19:05   ` tip-bot for Thomas Gleixner
  2 siblings, 0 replies; 123+ messages in thread
From: tip-bot for Thomas Gleixner @ 2015-04-22 19:05 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: fweisbec, viresh.kumar, mingo, linux-kernel, mtosatti, tglx, hpa,
	peterz, preeti

Commit-ID:  398ca17fb54b212cdc9da7ff4a17a35c48dd2103
Gitweb:     http://git.kernel.org/tip/398ca17fb54b212cdc9da7ff4a17a35c48dd2103
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 14 Apr 2015 21:08:27 +0000
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Wed, 22 Apr 2015 17:06:48 +0200

hrtimer: Get rid of the resolution field in hrtimer_clock_base

The field has no value because all clock bases have the same
resolution. The resolution only changes when we switch to high
resolution timer mode. We can evaluate that from a single static
variable as well. In the !HIGHRES case it's simply a constant.

Export the variable, so we can simplify the usage sites.
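
As an illustration of a usage site, the interval clamp in
hrtimer_forward() only needs one resolution value once the per-base
field is gone. The sketch below models that clamp in plain C;
clamp_interval() is a hypothetical helper, and the resolution is passed
as a parameter here so the example is self-contained (the patch reads
the global hrtimer_resolution instead).

```c
#include <stdint.h>

/*
 * Model of the clamp in hrtimer_forward(): an interval shorter than the
 * current timer resolution is raised to the resolution. resolution_ns is
 * promoted to a signed 64-bit value for the comparison, so a negative
 * interval is clamped as well.
 */
static int64_t clamp_interval(int64_t interval_ns, unsigned int resolution_ns)
{
	if (interval_ns < resolution_ns)
		interval_ns = resolution_ns;
	return interval_ns;
}
```

For example, assuming a low resolution of 4000000 ns (one tick at
HZ=250, an assumption made only for this example), a 500 ns interval is
raised to 4000000 ns, while with a resolution of 1 ns it stays at 500 ns.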

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/20150414203500.645454122@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/hrtimer.h  |  6 ++++--
 kernel/time/hrtimer.c    | 26 +++++++++-----------------
 kernel/time/timer_list.c |  8 ++++----
 3 files changed, 17 insertions(+), 23 deletions(-)

diff --git a/include/linux/hrtimer.h b/include/linux/hrtimer.h
index 7770676..bc6f91b 100644
--- a/include/linux/hrtimer.h
+++ b/include/linux/hrtimer.h
@@ -137,7 +137,6 @@ struct hrtimer_sleeper {
  *			timer to a base on another cpu.
  * @clockid:		clock id for per_cpu support
  * @active:		red black tree root node for the active timers
- * @resolution:		the resolution of the clock, in nanoseconds
  * @get_time:		function to retrieve the current time of the clock
  * @softirq_time:	the time when running the hrtimer queue in the softirq
  * @offset:		offset of this clock to the monotonic base
@@ -147,7 +146,6 @@ struct hrtimer_clock_base {
 	int			index;
 	clockid_t		clockid;
 	struct timerqueue_head	active;
-	ktime_t			resolution;
 	ktime_t			(*get_time)(void);
 	ktime_t			softirq_time;
 	ktime_t			offset;
@@ -295,11 +293,15 @@ extern void hrtimer_peek_ahead_timers(void);
 
 extern void clock_was_set_delayed(void);
 
+extern unsigned int hrtimer_resolution;
+
 #else
 
 # define MONOTONIC_RES_NSEC	LOW_RES_NSEC
 # define KTIME_MONOTONIC_RES	KTIME_LOW_RES
 
+#define hrtimer_resolution	LOW_RES_NSEC
+
 static inline void hrtimer_peek_ahead_timers(void) { }
 
 /*
diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
index 9abd50b..965687a 100644
--- a/kernel/time/hrtimer.c
+++ b/kernel/time/hrtimer.c
@@ -66,7 +66,6 @@
  */
 DEFINE_PER_CPU(struct hrtimer_cpu_base, hrtimer_bases) =
 {
-
 	.lock = __RAW_SPIN_LOCK_UNLOCKED(hrtimer_bases.lock),
 	.clock_base =
 	{
@@ -74,25 +73,21 @@ DEFINE_PER_CPU(struct hrtimer_cpu_base, hrtimer_bases) =
 			.index = HRTIMER_BASE_MONOTONIC,
 			.clockid = CLOCK_MONOTONIC,
 			.get_time = &ktime_get,
-			.resolution = KTIME_LOW_RES,
 		},
 		{
 			.index = HRTIMER_BASE_REALTIME,
 			.clockid = CLOCK_REALTIME,
 			.get_time = &ktime_get_real,
-			.resolution = KTIME_LOW_RES,
 		},
 		{
 			.index = HRTIMER_BASE_BOOTTIME,
 			.clockid = CLOCK_BOOTTIME,
 			.get_time = &ktime_get_boottime,
-			.resolution = KTIME_LOW_RES,
 		},
 		{
 			.index = HRTIMER_BASE_TAI,
 			.clockid = CLOCK_TAI,
 			.get_time = &ktime_get_clocktai,
-			.resolution = KTIME_LOW_RES,
 		},
 	}
 };
@@ -478,6 +473,8 @@ static ktime_t __hrtimer_get_next_event(struct hrtimer_cpu_base *cpu_base)
  * High resolution timer enabled ?
  */
 static int hrtimer_hres_enabled __read_mostly  = 1;
+unsigned int hrtimer_resolution __read_mostly = LOW_RES_NSEC;
+EXPORT_SYMBOL_GPL(hrtimer_resolution);
 
 /*
  * Enable / Disable high resolution mode
@@ -660,7 +657,7 @@ static void retrigger_next_event(void *arg)
  */
 static int hrtimer_switch_to_hres(void)
 {
-	int i, cpu = smp_processor_id();
+	int cpu = smp_processor_id();
 	struct hrtimer_cpu_base *base = &per_cpu(hrtimer_bases, cpu);
 	unsigned long flags;
 
@@ -676,8 +673,7 @@ static int hrtimer_switch_to_hres(void)
 		return 0;
 	}
 	base->hres_active = 1;
-	for (i = 0; i < HRTIMER_MAX_CLOCK_BASES; i++)
-		base->clock_base[i].resolution = KTIME_HIGH_RES;
+	hrtimer_resolution = HIGH_RES_NSEC;
 
 	tick_setup_sched_timer();
 	/* "Retrigger" the interrupt to get things going */
@@ -820,8 +816,8 @@ u64 hrtimer_forward(struct hrtimer *timer, ktime_t now, ktime_t interval)
 	if (delta.tv64 < 0)
 		return 0;
 
-	if (interval.tv64 < timer->base->resolution.tv64)
-		interval.tv64 = timer->base->resolution.tv64;
+	if (interval.tv64 < hrtimer_resolution)
+		interval.tv64 = hrtimer_resolution;
 
 	if (unlikely(delta.tv64 >= interval.tv64)) {
 		s64 incr = ktime_to_ns(interval);
@@ -963,7 +959,7 @@ int __hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim,
 		 * timeouts. This will go away with the GTOD framework.
 		 */
 #ifdef CONFIG_TIME_LOW_RES
-		tim = ktime_add_safe(tim, base->resolution);
+		tim = ktime_add_safe(tim, ktime_set(0, hrtimer_resolution));
 #endif
 	}
 
@@ -1193,12 +1189,8 @@ EXPORT_SYMBOL_GPL(hrtimer_init);
  */
 int hrtimer_get_res(const clockid_t which_clock, struct timespec *tp)
 {
-	struct hrtimer_cpu_base *cpu_base;
-	int base = hrtimer_clockid_to_base(which_clock);
-
-	cpu_base = raw_cpu_ptr(&hrtimer_bases);
-	*tp = ktime_to_timespec(cpu_base->clock_base[base].resolution);
-
+	tp->tv_sec = 0;
+	tp->tv_nsec = hrtimer_resolution;
 	return 0;
 }
 EXPORT_SYMBOL_GPL(hrtimer_get_res);
diff --git a/kernel/time/timer_list.c b/kernel/time/timer_list.c
index 5960af21..bdd5e98 100644
--- a/kernel/time/timer_list.c
+++ b/kernel/time/timer_list.c
@@ -127,10 +127,10 @@ static void
 print_base(struct seq_file *m, struct hrtimer_clock_base *base, u64 now)
 {
 	SEQ_printf(m, "  .base:       %pK\n", base);
-	SEQ_printf(m, "  .index:      %d\n",
-			base->index);
-	SEQ_printf(m, "  .resolution: %Lu nsecs\n",
-			(unsigned long long)ktime_to_ns(base->resolution));
+	SEQ_printf(m, "  .index:      %d\n", base->index);
+
+	SEQ_printf(m, "  .resolution: %u nsecs\n", (unsigned) hrtimer_resolution);
+
 	SEQ_printf(m,   "  .get_time:   ");
 	print_name_offset(m, base->get_time);
 	SEQ_printf(m,   "\n");


* [tip:timers/core] net: sched: Use hrtimer_resolution instead of hrtimer_get_res()
  2015-04-14 21:08 ` [patch 03/39] net: sched: Use hrtimer_resolution instead of hrtimer_get_res() Thomas Gleixner
  2015-04-16 16:04   ` David Miller
@ 2015-04-22 19:05   ` tip-bot for Thomas Gleixner
  1 sibling, 0 replies; 123+ messages in thread
From: tip-bot for Thomas Gleixner @ 2015-04-22 19:05 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: preeti, fweisbec, tglx, viresh.kumar, peterz, linux-kernel,
	davem, mingo, mtosatti, jhs, hpa

Commit-ID:  1e3176885cce8e0137d6f4072c5910bfa00901ed
Gitweb:     http://git.kernel.org/tip/1e3176885cce8e0137d6f4072c5910bfa00901ed
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 14 Apr 2015 21:08:28 +0000
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Wed, 22 Apr 2015 17:06:48 +0200

net: sched: Use hrtimer_resolution instead of hrtimer_get_res()

No point in converting a timespec now that the value is directly
accessible.
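
A small userspace sketch of why the round trip is unnecessary: going
through a timespec and back yields the same nanosecond value that is now
available directly. The two helper names are hypothetical and the
resolution is passed in as a parameter; the sketch assumes a resolution
between 1 ns and just under one second.

```c
#include <stdint.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/* Old shape: fill a timespec, convert it back to nanoseconds, then divide. */
static uint32_t per_sec_via_timespec(unsigned int res_ns)
{
	struct timespec ts = { .tv_sec = 0, .tv_nsec = (long)res_ns };
	int64_t ns = (int64_t)ts.tv_sec * NSEC_PER_SEC + ts.tv_nsec;

	return (uint32_t)NSEC_PER_SEC / (uint32_t)ns;
}

/* New shape: divide by the resolution value directly. */
static uint32_t per_sec_direct(unsigned int res_ns)
{
	return (uint32_t)NSEC_PER_SEC / res_ns;
}
```

Both helpers return the same result for any resolution in that range.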

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Link: http://lkml.kernel.org/r/20150414203500.720623028@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 net/sched/sch_api.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
index ad9eed7..45bc63a 100644
--- a/net/sched/sch_api.c
+++ b/net/sched/sch_api.c
@@ -1883,13 +1883,10 @@ EXPORT_SYMBOL(tcf_destroy_chain);
 #ifdef CONFIG_PROC_FS
 static int psched_show(struct seq_file *seq, void *v)
 {
-	struct timespec ts;
-
-	hrtimer_get_res(CLOCK_MONOTONIC, &ts);
 	seq_printf(seq, "%08x %08x %08x %08x\n",
 		   (u32)NSEC_PER_USEC, (u32)PSCHED_TICKS2NS(1),
 		   1000000,
-		   (u32)NSEC_PER_SEC/(u32)ktime_to_ns(timespec_to_ktime(ts)));
+		   (u32)NSEC_PER_SEC / hrtimer_resolution);
 
 	return 0;
 }


* [tip:timers/core] sound: Use hrtimer_resolution instead of hrtimer_get_res()
  2015-04-14 21:08 ` [patch 04/39] sound: " Thomas Gleixner
  2015-04-16  8:07   ` Takashi Iwai
@ 2015-04-22 19:05   ` tip-bot for Thomas Gleixner
  1 sibling, 0 replies; 123+ messages in thread
From: tip-bot for Thomas Gleixner @ 2015-04-22 19:05 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: hpa, linux-kernel, tglx, preeti, mingo, mtosatti, fweisbec,
	perex, viresh.kumar, tiwai, peterz

Commit-ID:  447fbbdc2cd58cdaf410fefef365a9ce38833157
Gitweb:     http://git.kernel.org/tip/447fbbdc2cd58cdaf410fefef365a9ce38833157
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 14 Apr 2015 21:08:30 +0000
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Wed, 22 Apr 2015 17:06:48 +0200

sound: Use hrtimer_resolution instead of hrtimer_get_res()

No point in converting a timespec now that the value is directly
accessible. Get rid of the null check while at it. Resolution is
guaranteed to be > 0.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Takashi Iwai <tiwai@suse.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jaroslav Kysela <perex@perex.cz>
Cc: alsa-devel@alsa-project.org
Link: http://lkml.kernel.org/r/20150414203500.799133359@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 sound/core/hrtimer.c      |  9 +--------
 sound/drivers/pcsp/pcsp.c | 15 ++++++---------
 2 files changed, 7 insertions(+), 17 deletions(-)

diff --git a/sound/core/hrtimer.c b/sound/core/hrtimer.c
index 886be7d..f845ecf 100644
--- a/sound/core/hrtimer.c
+++ b/sound/core/hrtimer.c
@@ -121,16 +121,9 @@ static struct snd_timer *mytimer;
 static int __init snd_hrtimer_init(void)
 {
 	struct snd_timer *timer;
-	struct timespec tp;
 	int err;
 
-	hrtimer_get_res(CLOCK_MONOTONIC, &tp);
-	if (tp.tv_sec > 0 || !tp.tv_nsec) {
-		pr_err("snd-hrtimer: Invalid resolution %u.%09u",
-			   (unsigned)tp.tv_sec, (unsigned)tp.tv_nsec);
-		return -EINVAL;
-	}
-	resolution = tp.tv_nsec;
+	resolution = hrtimer_resolution;
 
 	/* Create a new timer and set up the fields */
 	err = snd_timer_global_new("hrtimer", SNDRV_TIMER_GLOBAL_HRTIMER,
diff --git a/sound/drivers/pcsp/pcsp.c b/sound/drivers/pcsp/pcsp.c
index d9647bd..eb54702 100644
--- a/sound/drivers/pcsp/pcsp.c
+++ b/sound/drivers/pcsp/pcsp.c
@@ -42,16 +42,13 @@ struct snd_pcsp pcsp_chip;
 static int snd_pcsp_create(struct snd_card *card)
 {
 	static struct snd_device_ops ops = { };
-	struct timespec tp;
-	int err;
-	int div, min_div, order;
-
-	hrtimer_get_res(CLOCK_MONOTONIC, &tp);
+	unsigned int resolution = hrtimer_resolution;
+	int err, div, min_div, order;
 
 	if (!nopcm) {
-		if (tp.tv_sec || tp.tv_nsec > PCSP_MAX_PERIOD_NS) {
+		if (resolution > PCSP_MAX_PERIOD_NS) {
 			printk(KERN_ERR "PCSP: Timer resolution is not sufficient "
-				"(%linS)\n", tp.tv_nsec);
+				"(%linS)\n", resolution);
 			printk(KERN_ERR "PCSP: Make sure you have HPET and ACPI "
 				"enabled.\n");
 			printk(KERN_ERR "PCSP: Turned into nopcm mode.\n");
@@ -59,13 +56,13 @@ static int snd_pcsp_create(struct snd_card *card)
 		}
 	}
 
-	if (loops_per_jiffy >= PCSP_MIN_LPJ && tp.tv_nsec <= PCSP_MIN_PERIOD_NS)
+	if (loops_per_jiffy >= PCSP_MIN_LPJ && resolution <= PCSP_MIN_PERIOD_NS)
 		min_div = MIN_DIV;
 	else
 		min_div = MAX_DIV;
 #if PCSP_DEBUG
 	printk(KERN_DEBUG "PCSP: lpj=%li, min_div=%i, res=%li\n",
-	       loops_per_jiffy, min_div, tp.tv_nsec);
+	       loops_per_jiffy, min_div, resolution);
 #endif
 
 	div = MAX_DIV / min_div;


* [tip:timers/core] hrtimer: Get rid of hrtimer_get_res()
  2015-04-14 21:08 ` [patch 05/39] hrtimer: Get rid " Thomas Gleixner
@ 2015-04-22 19:05   ` tip-bot for Thomas Gleixner
  0 siblings, 0 replies; 123+ messages in thread
From: tip-bot for Thomas Gleixner @ 2015-04-22 19:05 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: viresh.kumar, preeti, fweisbec, peterz, hpa, mtosatti, tglx,
	mingo, linux-kernel

Commit-ID:  056a3cacbc46e5aca27b350ce4ecb3b33ebb0700
Gitweb:     http://git.kernel.org/tip/056a3cacbc46e5aca27b350ce4ecb3b33ebb0700
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 14 Apr 2015 21:08:32 +0000
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Wed, 22 Apr 2015 17:06:49 +0200

hrtimer: Get rid of hrtimer_get_res()

The resolution is directly accessible now. So it's simpler just to fill
in the values of the timespec and be done with it.

Text size reduction (combined with "hrtimer: Get rid of the resolution
field in hrtimer_clock_base"):
       x86_64 -61, i386 -221, ARM -60, power64 -48

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/20150414203500.879888080@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/hrtimer.h    |  1 -
 kernel/time/alarmtimer.c   |  6 +++---
 kernel/time/hrtimer.c      | 16 ----------------
 kernel/time/posix-timers.c | 17 ++++++++++++-----
 4 files changed, 15 insertions(+), 25 deletions(-)

diff --git a/include/linux/hrtimer.h b/include/linux/hrtimer.h
index bc6f91b..8025156 100644
--- a/include/linux/hrtimer.h
+++ b/include/linux/hrtimer.h
@@ -385,7 +385,6 @@ static inline int hrtimer_restart(struct hrtimer *timer)
 
 /* Query timers: */
 extern ktime_t hrtimer_get_remaining(const struct hrtimer *timer);
-extern int hrtimer_get_res(const clockid_t which_clock, struct timespec *tp);
 
 extern ktime_t hrtimer_get_next_event(void);
 
diff --git a/kernel/time/alarmtimer.c b/kernel/time/alarmtimer.c
index 1b001ed..0b55a75 100644
--- a/kernel/time/alarmtimer.c
+++ b/kernel/time/alarmtimer.c
@@ -495,12 +495,12 @@ static enum alarmtimer_restart alarm_handle_timer(struct alarm *alarm,
  */
 static int alarm_clock_getres(const clockid_t which_clock, struct timespec *tp)
 {
-	clockid_t baseid = alarm_bases[clock2alarm(which_clock)].base_clockid;
-
 	if (!alarmtimer_get_rtcdev())
 		return -EINVAL;
 
-	return hrtimer_get_res(baseid, tp);
+	tp->tv_sec = 0;
+	tp->tv_nsec = hrtimer_resolution;
+	return 0;
 }
 
 /**
diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
index 965687a..73131da 100644
--- a/kernel/time/hrtimer.c
+++ b/kernel/time/hrtimer.c
@@ -1179,22 +1179,6 @@ void hrtimer_init(struct hrtimer *timer, clockid_t clock_id,
 }
 EXPORT_SYMBOL_GPL(hrtimer_init);
 
-/**
- * hrtimer_get_res - get the timer resolution for a clock
- * @which_clock: which clock to query
- * @tp:		 pointer to timespec variable to store the resolution
- *
- * Store the resolution of the clock selected by @which_clock in the
- * variable pointed to by @tp.
- */
-int hrtimer_get_res(const clockid_t which_clock, struct timespec *tp)
-{
-	tp->tv_sec = 0;
-	tp->tv_nsec = hrtimer_resolution;
-	return 0;
-}
-EXPORT_SYMBOL_GPL(hrtimer_get_res);
-
 static void __run_hrtimer(struct hrtimer *timer, ktime_t *now)
 {
 	struct hrtimer_clock_base *base = timer->base;
diff --git a/kernel/time/posix-timers.c b/kernel/time/posix-timers.c
index 31ea01f..31d11ac 100644
--- a/kernel/time/posix-timers.c
+++ b/kernel/time/posix-timers.c
@@ -272,13 +272,20 @@ static int posix_get_tai(clockid_t which_clock, struct timespec *tp)
 	return 0;
 }
 
+static int posix_get_hrtimer_res(clockid_t which_clock, struct timespec *tp)
+{
+	tp->tv_sec = 0;
+	tp->tv_nsec = hrtimer_resolution;
+	return 0;
+}
+
 /*
  * Initialize everything, well, just everything in Posix clocks/timers ;)
  */
 static __init int init_posix_timers(void)
 {
 	struct k_clock clock_realtime = {
-		.clock_getres	= hrtimer_get_res,
+		.clock_getres	= posix_get_hrtimer_res,
 		.clock_get	= posix_clock_realtime_get,
 		.clock_set	= posix_clock_realtime_set,
 		.clock_adj	= posix_clock_realtime_adj,
@@ -290,7 +297,7 @@ static __init int init_posix_timers(void)
 		.timer_del	= common_timer_del,
 	};
 	struct k_clock clock_monotonic = {
-		.clock_getres	= hrtimer_get_res,
+		.clock_getres	= posix_get_hrtimer_res,
 		.clock_get	= posix_ktime_get_ts,
 		.nsleep		= common_nsleep,
 		.nsleep_restart	= hrtimer_nanosleep_restart,
@@ -300,7 +307,7 @@ static __init int init_posix_timers(void)
 		.timer_del	= common_timer_del,
 	};
 	struct k_clock clock_monotonic_raw = {
-		.clock_getres	= hrtimer_get_res,
+		.clock_getres	= posix_get_hrtimer_res,
 		.clock_get	= posix_get_monotonic_raw,
 	};
 	struct k_clock clock_realtime_coarse = {
@@ -312,7 +319,7 @@ static __init int init_posix_timers(void)
 		.clock_get	= posix_get_monotonic_coarse,
 	};
 	struct k_clock clock_tai = {
-		.clock_getres	= hrtimer_get_res,
+		.clock_getres	= posix_get_hrtimer_res,
 		.clock_get	= posix_get_tai,
 		.nsleep		= common_nsleep,
 		.nsleep_restart	= hrtimer_nanosleep_restart,
@@ -322,7 +329,7 @@ static __init int init_posix_timers(void)
 		.timer_del	= common_timer_del,
 	};
 	struct k_clock clock_boottime = {
-		.clock_getres	= hrtimer_get_res,
+		.clock_getres	= posix_get_hrtimer_res,
 		.clock_get	= posix_get_boottime,
 		.nsleep		= common_nsleep,
 		.nsleep_restart	= hrtimer_nanosleep_restart,


* [tip:timers/core] hrtimer: Make the statistics fields smaller
  2015-04-14 21:08 ` [patch 06/39] hrtimer: Make the statistics fields smaller Thomas Gleixner
@ 2015-04-22 19:06   ` tip-bot for Thomas Gleixner
  0 siblings, 0 replies; 123+ messages in thread
From: tip-bot for Thomas Gleixner @ 2015-04-22 19:06 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: hpa, viresh.kumar, tglx, mingo, linux-kernel, preeti, fweisbec,
	peterz, mtosatti

Commit-ID:  a6ffebce7f89f6f97cc22838a5d4383b15d6774f
Gitweb:     http://git.kernel.org/tip/a6ffebce7f89f6f97cc22838a5d4383b15d6774f
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 14 Apr 2015 21:08:34 +0000
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Wed, 22 Apr 2015 17:06:49 +0200

hrtimer: Make the statistics fields smaller

No point in having unsigned long for /proc/timer_list statistics. Make
them unsigned int.
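
One consequence of the narrower max_hang_time is visible in the cast
used in the patch: a 64-bit nanosecond delta keeps only its low 32 bits.
The sketch below models that update in userspace. update_max_hang() is a
hypothetical helper, and uint32_t is used in place of unsigned int so
that the width is explicit.

```c
#include <stdint.h>

/*
 * Model of the max_hang_time update with a 32-bit field. The conversion
 * keeps delta_ns modulo 2^32, so a delta of 2^32 ns (about 4.29 s) or
 * more wraps around instead of saturating.
 */
static uint32_t update_max_hang(uint32_t max_hang_ns, int64_t delta_ns)
{
	uint32_t d = (uint32_t)delta_ns;

	return d > max_hang_ns ? d : max_hang_ns;
}
```

For a statistics field that records hangs far below that bound the
wrap does not matter, which appears to be the trade-off taken here.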

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/20150414203500.959773467@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/hrtimer.h  | 8 ++++----
 kernel/time/hrtimer.c    | 4 ++--
 kernel/time/timer_list.c | 2 +-
 3 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/include/linux/hrtimer.h b/include/linux/hrtimer.h
index 8025156..d39f284 100644
--- a/include/linux/hrtimer.h
+++ b/include/linux/hrtimer.h
@@ -187,10 +187,10 @@ struct hrtimer_cpu_base {
 	int				in_hrtirq;
 	int				hres_active;
 	int				hang_detected;
-	unsigned long			nr_events;
-	unsigned long			nr_retries;
-	unsigned long			nr_hangs;
-	ktime_t				max_hang_time;
+	unsigned int			nr_events;
+	unsigned int			nr_retries;
+	unsigned int			nr_hangs;
+	unsigned int			max_hang_time;
 #endif
 	struct hrtimer_clock_base	clock_base[HRTIMER_MAX_CLOCK_BASES];
 };
diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
index 73131da..874e091 100644
--- a/kernel/time/hrtimer.c
+++ b/kernel/time/hrtimer.c
@@ -1327,8 +1327,8 @@ retry:
 	cpu_base->hang_detected = 1;
 	raw_spin_unlock(&cpu_base->lock);
 	delta = ktime_sub(now, entry_time);
-	if (delta.tv64 > cpu_base->max_hang_time.tv64)
-		cpu_base->max_hang_time = delta;
+	if ((unsigned int)delta.tv64 > cpu_base->max_hang_time)
+		cpu_base->max_hang_time = (unsigned int) delta.tv64;
 	/*
 	 * Limit it to a sensible value as we enforce a longer
 	 * delay. Give the CPU at least 100ms to catch up.
diff --git a/kernel/time/timer_list.c b/kernel/time/timer_list.c
index bdd5e98..6232fc5 100644
--- a/kernel/time/timer_list.c
+++ b/kernel/time/timer_list.c
@@ -165,7 +165,7 @@ static void print_cpu(struct seq_file *m, int cpu, u64 now)
 	P(nr_events);
 	P(nr_retries);
 	P(nr_hangs);
-	P_ns(max_hang_time);
+	P(max_hang_time);
 #endif
 #undef P
 #undef P_ns


* [tip:timers/core] hrtimer: Get rid of softirq time
  2015-04-14 21:08 ` [patch 07/39] hrtimer: Get rid of softirq time Thomas Gleixner
@ 2015-04-22 19:06   ` tip-bot for Thomas Gleixner
  0 siblings, 0 replies; 123+ messages in thread
From: tip-bot for Thomas Gleixner @ 2015-04-22 19:06 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: mtosatti, hpa, linux-kernel, tglx, viresh.kumar, fweisbec,
	preeti, peterz, mingo

Commit-ID:  21d6d52a1b7028e6a6840bd82e354aefa9a5e203
Gitweb:     http://git.kernel.org/tip/21d6d52a1b7028e6a6840bd82e354aefa9a5e203
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 14 Apr 2015 21:08:35 +0000
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Wed, 22 Apr 2015 17:06:49 +0200

hrtimer: Get rid of softirq time

The softirq time field in the clock bases is an optimization from the
early days of hrtimers. It provides a coarse "jiffies" like time
mostly for self rearming timers.

But that comes with a price:
    - Larger code size
    - Extra storage space
    - Duplicated functions with really small differences
   
The benefit of this optimization is marginal for contemporary
systems.

Consolidate everything on the high resolution timer
implementation. This makes further optimizations possible.

Text size reduction:
       x86_64 -95, i386 -356, ARM -148, ARM64 -40, power64 -16

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/20150414203501.039977424@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/hrtimer.h   |  24 ++------
 kernel/time/hrtimer.c     | 148 ++++++++++++++++++----------------------------
 kernel/time/timekeeping.c |  32 ----------
 kernel/time/timekeeping.h |   3 -
 4 files changed, 64 insertions(+), 143 deletions(-)

diff --git a/include/linux/hrtimer.h b/include/linux/hrtimer.h
index d39f284..e292830 100644
--- a/include/linux/hrtimer.h
+++ b/include/linux/hrtimer.h
@@ -138,7 +138,6 @@ struct hrtimer_sleeper {
  * @clockid:		clock id for per_cpu support
  * @active:		red black tree root node for the active timers
  * @get_time:		function to retrieve the current time of the clock
- * @softirq_time:	the time when running the hrtimer queue in the softirq
  * @offset:		offset of this clock to the monotonic base
  */
 struct hrtimer_clock_base {
@@ -147,7 +146,6 @@ struct hrtimer_clock_base {
 	clockid_t		clockid;
 	struct timerqueue_head	active;
 	ktime_t			(*get_time)(void);
-	ktime_t			softirq_time;
 	ktime_t			offset;
 };
 
@@ -260,19 +258,16 @@ static inline ktime_t hrtimer_expires_remaining(const struct hrtimer *timer)
 	return ktime_sub(timer->node.expires, timer->base->get_time());
 }
 
-#ifdef CONFIG_HIGH_RES_TIMERS
-struct clock_event_device;
-
-extern void hrtimer_interrupt(struct clock_event_device *dev);
-
-/*
- * In high resolution mode the time reference must be read accurate
- */
 static inline ktime_t hrtimer_cb_get_time(struct hrtimer *timer)
 {
 	return timer->base->get_time();
 }
 
+#ifdef CONFIG_HIGH_RES_TIMERS
+struct clock_event_device;
+
+extern void hrtimer_interrupt(struct clock_event_device *dev);
+
 static inline int hrtimer_is_hres_active(struct hrtimer *timer)
 {
 	return timer->base->cpu_base->hres_active;
@@ -304,15 +299,6 @@ extern unsigned int hrtimer_resolution;
 
 static inline void hrtimer_peek_ahead_timers(void) { }
 
-/*
- * In non high resolution mode the time reference is taken from
- * the base softirq time variable.
- */
-static inline ktime_t hrtimer_cb_get_time(struct hrtimer *timer)
-{
-	return timer->base->softirq_time;
-}
-
 static inline int hrtimer_is_hres_active(struct hrtimer *timer)
 {
 	return 0;
diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
index 874e091..9e111dd 100644
--- a/kernel/time/hrtimer.c
+++ b/kernel/time/hrtimer.c
@@ -104,27 +104,6 @@ static inline int hrtimer_clockid_to_base(clockid_t clock_id)
 	return hrtimer_clock_to_base_table[clock_id];
 }
 
-
-/*
- * Get the coarse grained time at the softirq based on xtime and
- * wall_to_monotonic.
- */
-static void hrtimer_get_softirq_time(struct hrtimer_cpu_base *base)
-{
-	ktime_t xtim, mono, boot, tai;
-	ktime_t off_real, off_boot, off_tai;
-
-	mono = ktime_get_update_offsets_tick(&off_real, &off_boot, &off_tai);
-	boot = ktime_add(mono, off_boot);
-	xtim = ktime_add(mono, off_real);
-	tai = ktime_add(mono, off_tai);
-
-	base->clock_base[HRTIMER_BASE_REALTIME].softirq_time = xtim;
-	base->clock_base[HRTIMER_BASE_MONOTONIC].softirq_time = mono;
-	base->clock_base[HRTIMER_BASE_BOOTTIME].softirq_time = boot;
-	base->clock_base[HRTIMER_BASE_TAI].softirq_time = tai;
-}
-
 /*
  * Functions and macros which are different for UP/SMP systems are kept in a
  * single place
@@ -466,6 +445,15 @@ static ktime_t __hrtimer_get_next_event(struct hrtimer_cpu_base *cpu_base)
 }
 #endif
 
+static inline ktime_t hrtimer_update_base(struct hrtimer_cpu_base *base)
+{
+	ktime_t *offs_real = &base->clock_base[HRTIMER_BASE_REALTIME].offset;
+	ktime_t *offs_boot = &base->clock_base[HRTIMER_BASE_BOOTTIME].offset;
+	ktime_t *offs_tai = &base->clock_base[HRTIMER_BASE_TAI].offset;
+
+	return ktime_get_update_offsets_now(offs_real, offs_boot, offs_tai);
+}
+
 /* High resolution timer related functions */
 #ifdef CONFIG_HIGH_RES_TIMERS
 
@@ -516,7 +504,12 @@ static inline int hrtimer_hres_active(void)
 static void
 hrtimer_force_reprogram(struct hrtimer_cpu_base *cpu_base, int skip_equal)
 {
-	ktime_t expires_next = __hrtimer_get_next_event(cpu_base);
+	ktime_t expires_next;
+
+	if (!cpu_base->hres_active)
+		return;
+
+	expires_next = __hrtimer_get_next_event(cpu_base);
 
 	if (skip_equal && expires_next.tv64 == cpu_base->expires_next.tv64)
 		return;
@@ -625,15 +618,6 @@ static inline void hrtimer_init_hres(struct hrtimer_cpu_base *base)
 	base->hres_active = 0;
 }
 
-static inline ktime_t hrtimer_update_base(struct hrtimer_cpu_base *base)
-{
-	ktime_t *offs_real = &base->clock_base[HRTIMER_BASE_REALTIME].offset;
-	ktime_t *offs_boot = &base->clock_base[HRTIMER_BASE_BOOTTIME].offset;
-	ktime_t *offs_tai = &base->clock_base[HRTIMER_BASE_TAI].offset;
-
-	return ktime_get_update_offsets_now(offs_real, offs_boot, offs_tai);
-}
-
 /*
  * Retrigger next event is called after clock was set
  *
@@ -1179,10 +1163,10 @@ void hrtimer_init(struct hrtimer *timer, clockid_t clock_id,
 }
 EXPORT_SYMBOL_GPL(hrtimer_init);
 
-static void __run_hrtimer(struct hrtimer *timer, ktime_t *now)
+static void __run_hrtimer(struct hrtimer_cpu_base *cpu_base,
+			  struct hrtimer_clock_base *base,
+			  struct hrtimer *timer, ktime_t *now)
 {
-	struct hrtimer_clock_base *base = timer->base;
-	struct hrtimer_cpu_base *cpu_base = base->cpu_base;
 	enum hrtimer_restart (*fn)(struct hrtimer *);
 	int restart;
 
@@ -1219,34 +1203,9 @@ static void __run_hrtimer(struct hrtimer *timer, ktime_t *now)
 	timer->state &= ~HRTIMER_STATE_CALLBACK;
 }
 
-#ifdef CONFIG_HIGH_RES_TIMERS
-
-/*
- * High resolution timer interrupt
- * Called with interrupts disabled
- */
-void hrtimer_interrupt(struct clock_event_device *dev)
+static void __hrtimer_run_queues(struct hrtimer_cpu_base *cpu_base, ktime_t now)
 {
-	struct hrtimer_cpu_base *cpu_base = this_cpu_ptr(&hrtimer_bases);
-	ktime_t expires_next, now, entry_time, delta;
-	int i, retries = 0;
-
-	BUG_ON(!cpu_base->hres_active);
-	cpu_base->nr_events++;
-	dev->next_event.tv64 = KTIME_MAX;
-
-	raw_spin_lock(&cpu_base->lock);
-	entry_time = now = hrtimer_update_base(cpu_base);
-retry:
-	cpu_base->in_hrtirq = 1;
-	/*
-	 * We set expires_next to KTIME_MAX here with cpu_base->lock
-	 * held to prevent that a timer is enqueued in our queue via
-	 * the migration code. This does not affect enqueueing of
-	 * timers which run their callback and need to be requeued on
-	 * this CPU.
-	 */
-	cpu_base->expires_next.tv64 = KTIME_MAX;
+	int i;
 
 	for (i = 0; i < HRTIMER_MAX_CLOCK_BASES; i++) {
 		struct hrtimer_clock_base *base;
@@ -1279,9 +1238,42 @@ retry:
 			if (basenow.tv64 < hrtimer_get_softexpires_tv64(timer))
 				break;
 
-			__run_hrtimer(timer, &basenow);
+			__run_hrtimer(cpu_base, base, timer, &basenow);
 		}
 	}
+}
+
+#ifdef CONFIG_HIGH_RES_TIMERS
+
+/*
+ * High resolution timer interrupt
+ * Called with interrupts disabled
+ */
+void hrtimer_interrupt(struct clock_event_device *dev)
+{
+	struct hrtimer_cpu_base *cpu_base = this_cpu_ptr(&hrtimer_bases);
+	ktime_t expires_next, now, entry_time, delta;
+	int retries = 0;
+
+	BUG_ON(!cpu_base->hres_active);
+	cpu_base->nr_events++;
+	dev->next_event.tv64 = KTIME_MAX;
+
+	raw_spin_lock(&cpu_base->lock);
+	entry_time = now = hrtimer_update_base(cpu_base);
+retry:
+	cpu_base->in_hrtirq = 1;
+	/*
+	 * We set expires_next to KTIME_MAX here with cpu_base->lock
+	 * held to prevent that a timer is enqueued in our queue via
+	 * the migration code. This does not affect enqueueing of
+	 * timers which run their callback and need to be requeued on
+	 * this CPU.
+	 */
+	cpu_base->expires_next.tv64 = KTIME_MAX;
+
+	__hrtimer_run_queues(cpu_base, now);
+
 	/* Reevaluate the clock bases for the next expiry */
 	expires_next = __hrtimer_get_next_event(cpu_base);
 	/*
@@ -1416,38 +1408,16 @@ void hrtimer_run_pending(void)
  */
 void hrtimer_run_queues(void)
 {
-	struct timerqueue_node *node;
 	struct hrtimer_cpu_base *cpu_base = this_cpu_ptr(&hrtimer_bases);
-	struct hrtimer_clock_base *base;
-	int index, gettime = 1;
+	ktime_t now;
 
 	if (hrtimer_hres_active())
 		return;
 
-	for (index = 0; index < HRTIMER_MAX_CLOCK_BASES; index++) {
-		base = &cpu_base->clock_base[index];
-		if (!timerqueue_getnext(&base->active))
-			continue;
-
-		if (gettime) {
-			hrtimer_get_softirq_time(cpu_base);
-			gettime = 0;
-		}
-
-		raw_spin_lock(&cpu_base->lock);
-
-		while ((node = timerqueue_getnext(&base->active))) {
-			struct hrtimer *timer;
-
-			timer = container_of(node, struct hrtimer, node);
-			if (base->softirq_time.tv64 <=
-					hrtimer_get_expires_tv64(timer))
-				break;
-
-			__run_hrtimer(timer, &base->softirq_time);
-		}
-		raw_spin_unlock(&cpu_base->lock);
-	}
+	raw_spin_lock(&cpu_base->lock);
+	now = hrtimer_update_base(cpu_base);
+	__hrtimer_run_queues(cpu_base, now);
+	raw_spin_unlock(&cpu_base->lock);
 }
 
 /*
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 946acb7..dd1efa6 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -1926,37 +1926,6 @@ void do_timer(unsigned long ticks)
 }
 
 /**
- * ktime_get_update_offsets_tick - hrtimer helper
- * @offs_real:	pointer to storage for monotonic -> realtime offset
- * @offs_boot:	pointer to storage for monotonic -> boottime offset
- * @offs_tai:	pointer to storage for monotonic -> clock tai offset
- *
- * Returns monotonic time at last tick and various offsets
- */
-ktime_t ktime_get_update_offsets_tick(ktime_t *offs_real, ktime_t *offs_boot,
-							ktime_t *offs_tai)
-{
-	struct timekeeper *tk = &tk_core.timekeeper;
-	unsigned int seq;
-	ktime_t base;
-	u64 nsecs;
-
-	do {
-		seq = read_seqcount_begin(&tk_core.seq);
-
-		base = tk->tkr_mono.base;
-		nsecs = tk->tkr_mono.xtime_nsec >> tk->tkr_mono.shift;
-
-		*offs_real = tk->offs_real;
-		*offs_boot = tk->offs_boot;
-		*offs_tai = tk->offs_tai;
-	} while (read_seqcount_retry(&tk_core.seq, seq));
-
-	return ktime_add_ns(base, nsecs);
-}
-
-#ifdef CONFIG_HIGH_RES_TIMERS
-/**
  * ktime_get_update_offsets_now - hrtimer helper
  * @offs_real:	pointer to storage for monotonic -> realtime offset
  * @offs_boot:	pointer to storage for monotonic -> boottime offset
@@ -1986,7 +1955,6 @@ ktime_t ktime_get_update_offsets_now(ktime_t *offs_real, ktime_t *offs_boot,
 
 	return ktime_add_ns(base, nsecs);
 }
-#endif
 
 /**
  * do_adjtimex() - Accessor function to NTP __do_adjtimex function
diff --git a/kernel/time/timekeeping.h b/kernel/time/timekeeping.h
index 5b57f6c..4d177fc 100644
--- a/kernel/time/timekeeping.h
+++ b/kernel/time/timekeeping.h
@@ -3,9 +3,6 @@
 /*
  * Internal interfaces for kernel/time/
  */
-extern ktime_t ktime_get_update_offsets_tick(ktime_t *offs_real,
-						ktime_t *offs_boot,
-						ktime_t *offs_tai);
 extern ktime_t ktime_get_update_offsets_now(ktime_t *offs_real,
 						ktime_t *offs_boot,
 						ktime_t *offs_tai);

^ permalink raw reply related	[flat|nested] 123+ messages in thread

* [tip:timers/core] hrtimer: Make offset update smarter
  2015-04-14 21:08 ` [patch 08/39] hrtimer: Make offset update smarter Thomas Gleixner
  2015-04-20  9:30   ` Preeti U Murthy
@ 2015-04-22 19:06   ` tip-bot for Thomas Gleixner
  1 sibling, 0 replies; 123+ messages in thread
From: tip-bot for Thomas Gleixner @ 2015-04-22 19:06 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: mingo, peterz, preeti, fweisbec, mtosatti, hpa, viresh.kumar,
	linux-kernel, tglx, john.stultz

Commit-ID:  868a3e915f7f5eba8f8cb4f7da2276760807c51c
Gitweb:     http://git.kernel.org/tip/868a3e915f7f5eba8f8cb4f7da2276760807c51c
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 14 Apr 2015 21:08:37 +0000
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Wed, 22 Apr 2015 17:06:49 +0200

hrtimer: Make offset update smarter

On every tick/hrtimer interrupt we update the offset variables of the
clock bases. That's silly because these offsets change very rarely.

Add a sequence counter to the time keeping code which keeps track of
the offset updates (clock_was_set()). Have a sequence cache in the
hrtimer cpu bases to evaluate whether the offsets must be updated or
not. This allows us later to avoid pointless cacheline pollution.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: John Stultz <john.stultz@linaro.org>
Link: http://lkml.kernel.org/r/20150414203501.132820245@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: John Stultz <john.stultz@linaro.org>
---
 include/linux/hrtimer.h             |  4 ++--
 include/linux/timekeeper_internal.h |  2 ++
 kernel/time/hrtimer.c               |  3 ++-
 kernel/time/timekeeping.c           | 23 ++++++++++++++++-------
 kernel/time/timekeeping.h           |  7 ++++---
 5 files changed, 26 insertions(+), 13 deletions(-)

diff --git a/include/linux/hrtimer.h b/include/linux/hrtimer.h
index e292830..5e04f8f 100644
--- a/include/linux/hrtimer.h
+++ b/include/linux/hrtimer.h
@@ -163,7 +163,7 @@ enum  hrtimer_base_type {
  *			and timers
  * @cpu:		cpu number
  * @active_bases:	Bitfield to mark bases with active timers
- * @clock_was_set:	Indicates that clock was set from irq context.
+ * @clock_was_set_seq:	Sequence counter of clock was set events
  * @expires_next:	absolute time of the next event which was scheduled
  *			via clock_set_next_event()
  * @in_hrtirq:		hrtimer_interrupt() is currently executing
@@ -179,7 +179,7 @@ struct hrtimer_cpu_base {
 	raw_spinlock_t			lock;
 	unsigned int			cpu;
 	unsigned int			active_bases;
-	unsigned int			clock_was_set;
+	unsigned int			clock_was_set_seq;
 #ifdef CONFIG_HIGH_RES_TIMERS
 	ktime_t				expires_next;
 	int				in_hrtirq;
diff --git a/include/linux/timekeeper_internal.h b/include/linux/timekeeper_internal.h
index fb86963..6f8276a 100644
--- a/include/linux/timekeeper_internal.h
+++ b/include/linux/timekeeper_internal.h
@@ -49,6 +49,7 @@ struct tk_read_base {
  * @offs_boot:		Offset clock monotonic -> clock boottime
  * @offs_tai:		Offset clock monotonic -> clock tai
  * @tai_offset:		The current UTC to TAI offset in seconds
+ * @clock_was_set_seq:	The sequence number of clock was set events
  * @raw_time:		Monotonic raw base time in timespec64 format
  * @cycle_interval:	Number of clock cycles in one NTP interval
  * @xtime_interval:	Number of clock shifted nano seconds in one NTP
@@ -85,6 +86,7 @@ struct timekeeper {
 	ktime_t			offs_boot;
 	ktime_t			offs_tai;
 	s32			tai_offset;
+	unsigned int		clock_was_set_seq;
 	struct timespec64	raw_time;
 
 	/* The following members are for timekeeping internal use */
diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
index 9e111dd..8ce9b31 100644
--- a/kernel/time/hrtimer.c
+++ b/kernel/time/hrtimer.c
@@ -451,7 +451,8 @@ static inline ktime_t hrtimer_update_base(struct hrtimer_cpu_base *base)
 	ktime_t *offs_boot = &base->clock_base[HRTIMER_BASE_BOOTTIME].offset;
 	ktime_t *offs_tai = &base->clock_base[HRTIMER_BASE_TAI].offset;
 
-	return ktime_get_update_offsets_now(offs_real, offs_boot, offs_tai);
+	return ktime_get_update_offsets_now(&base->clock_was_set_seq,
+					    offs_real, offs_boot, offs_tai);
 }
 
 /* High resolution timer related functions */
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index dd1efa6..3365e32 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -602,6 +602,9 @@ static void timekeeping_update(struct timekeeper *tk, unsigned int action)
 
 	update_fast_timekeeper(&tk->tkr_mono, &tk_fast_mono);
 	update_fast_timekeeper(&tk->tkr_raw,  &tk_fast_raw);
+
+	if (action & TK_CLOCK_WAS_SET)
+		tk->clock_was_set_seq++;
 }
 
 /**
@@ -1927,15 +1930,19 @@ void do_timer(unsigned long ticks)
 
 /**
  * ktime_get_update_offsets_now - hrtimer helper
+ * @cwsseq:	pointer to check and store the clock was set sequence number
  * @offs_real:	pointer to storage for monotonic -> realtime offset
  * @offs_boot:	pointer to storage for monotonic -> boottime offset
  * @offs_tai:	pointer to storage for monotonic -> clock tai offset
  *
- * Returns current monotonic time and updates the offsets
+ * Returns current monotonic time and updates the offsets if the
+ * sequence number in @cwsseq and timekeeper.clock_was_set_seq are
+ * different.
+ *
  * Called from hrtimer_interrupt() or retrigger_next_event()
  */
-ktime_t ktime_get_update_offsets_now(ktime_t *offs_real, ktime_t *offs_boot,
-							ktime_t *offs_tai)
+ktime_t ktime_get_update_offsets_now(unsigned int *cwsseq, ktime_t *offs_real,
+				     ktime_t *offs_boot, ktime_t *offs_tai)
 {
 	struct timekeeper *tk = &tk_core.timekeeper;
 	unsigned int seq;
@@ -1947,10 +1954,12 @@ ktime_t ktime_get_update_offsets_now(ktime_t *offs_real, ktime_t *offs_boot,
 
 		base = tk->tkr_mono.base;
 		nsecs = timekeeping_get_ns(&tk->tkr_mono);
-
-		*offs_real = tk->offs_real;
-		*offs_boot = tk->offs_boot;
-		*offs_tai = tk->offs_tai;
+		if (*cwsseq != tk->clock_was_set_seq) {
+			*cwsseq = tk->clock_was_set_seq;
+			*offs_real = tk->offs_real;
+			*offs_boot = tk->offs_boot;
+			*offs_tai = tk->offs_tai;
+		}
 	} while (read_seqcount_retry(&tk_core.seq, seq));
 
 	return ktime_add_ns(base, nsecs);
diff --git a/kernel/time/timekeeping.h b/kernel/time/timekeeping.h
index 4d177fc..704f595 100644
--- a/kernel/time/timekeeping.h
+++ b/kernel/time/timekeeping.h
@@ -3,9 +3,10 @@
 /*
  * Internal interfaces for kernel/time/
  */
-extern ktime_t ktime_get_update_offsets_now(ktime_t *offs_real,
-						ktime_t *offs_boot,
-						ktime_t *offs_tai);
+extern ktime_t ktime_get_update_offsets_now(unsigned int *cwsseq,
+					    ktime_t *offs_real,
+					    ktime_t *offs_boot,
+					    ktime_t *offs_tai);
 
 extern int timekeeping_valid_for_hres(void);
 extern u64 timekeeping_max_deferment(void);

^ permalink raw reply related	[flat|nested] 123+ messages in thread
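
The sequence cache described in "hrtimer: Make offset update smarter" can be illustrated outside the kernel. What follows is only a minimal userspace sketch: all demo_* names are hypothetical, plain int64_t stands in for ktime_t, and the seqcount retry loop that protects the real reader is left out on purpose. It shows the reader copying the offsets only when its cached sequence number differs from the writer's.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the timekeeper: offsets plus a change counter. */
struct demo_timekeeper {
	int64_t offs_real;
	int64_t offs_boot;
	int64_t offs_tai;
	unsigned int clock_was_set_seq;	/* bumped whenever an offset changes */
};

/* Hypothetical stand-in for the per-CPU base with its cached copies. */
struct demo_cpu_base {
	unsigned int clock_was_set_seq;	/* sequence seen at the last copy */
	int64_t offs_real;
	int64_t offs_boot;
	int64_t offs_tai;
};

/* Writer side: change the realtime offset and advance the sequence. */
static void demo_clock_was_set(struct demo_timekeeper *tk, int64_t new_offs_real)
{
	tk->offs_real = new_offs_real;
	tk->clock_was_set_seq++;
}

/*
 * Reader side: copy the offsets only if the cached sequence is stale.
 * Returns 1 if the offsets were copied, 0 if the cache was still valid.
 */
static int demo_update_offsets(struct demo_cpu_base *base,
			       const struct demo_timekeeper *tk)
{
	assert(base && tk);

	if (base->clock_was_set_seq == tk->clock_was_set_seq)
		return 0;

	base->clock_was_set_seq = tk->clock_was_set_seq;
	base->offs_real = tk->offs_real;
	base->offs_boot = tk->offs_boot;
	base->offs_tai = tk->offs_tai;
	return 1;
}
```

In the common case the reader touches one counter and returns, which is the cacheline saving the changelog refers to. The sketch assumes the cached sequence starts out different from the writer's, or that the cached offsets are otherwise initialised before first use.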

* [tip:timers/core] hrtimer: Use bits for various boolean indicators
  2015-04-14 21:08 ` [patch 09/39] hrtimer: Use a bits for various boolean indicators Thomas Gleixner
@ 2015-04-22 19:07   ` tip-bot for Thomas Gleixner
  0 siblings, 0 replies; 123+ messages in thread
From: tip-bot for Thomas Gleixner @ 2015-04-22 19:07 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: mingo, fweisbec, tglx, mtosatti, preeti, hpa, linux-kernel,
	viresh.kumar, peterz

Commit-ID:  e19ffe8be2cd0a1f726b235443eba21e64f6be5e
Gitweb:     http://git.kernel.org/tip/e19ffe8be2cd0a1f726b235443eba21e64f6be5e
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 14 Apr 2015 21:08:39 +0000
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Wed, 22 Apr 2015 17:06:49 +0200

hrtimer: Use bits for various boolean indicators

No point in wasting 12 bytes of storage space. This generates better code as well.

Text size reduction:
       x86_64 -64, i386 -16, ARM -132, ARM64 -0, power64 -48

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/20150414203501.227955358@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/hrtimer.h |  6 +++---
 kernel/time/hrtimer.c   | 24 ++++++++++++++++--------
 2 files changed, 19 insertions(+), 11 deletions(-)

diff --git a/include/linux/hrtimer.h b/include/linux/hrtimer.h
index 5e04f8f..17a59dd 100644
--- a/include/linux/hrtimer.h
+++ b/include/linux/hrtimer.h
@@ -181,10 +181,10 @@ struct hrtimer_cpu_base {
 	unsigned int			active_bases;
 	unsigned int			clock_was_set_seq;
 #ifdef CONFIG_HIGH_RES_TIMERS
+	unsigned int			in_hrtirq	: 1,
+					hres_active	: 1,
+					hang_detected	: 1;
 	ktime_t				expires_next;
-	int				in_hrtirq;
-	int				hres_active;
-	int				hang_detected;
 	unsigned int			nr_events;
 	unsigned int			nr_retries;
 	unsigned int			nr_hangs;
diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
index 8ce9b31..9bbfe33 100644
--- a/kernel/time/hrtimer.c
+++ b/kernel/time/hrtimer.c
@@ -492,9 +492,14 @@ static inline int hrtimer_is_hres_enabled(void)
 /*
  * Is the high resolution mode active ?
  */
+static inline int __hrtimer_hres_active(struct hrtimer_cpu_base *cpu_base)
+{
+	return cpu_base->hres_active;
+}
+
 static inline int hrtimer_hres_active(void)
 {
-	return __this_cpu_read(hrtimer_bases.hres_active);
+	return __hrtimer_hres_active(this_cpu_ptr(&hrtimer_bases));
 }
 
 /*
@@ -628,7 +633,7 @@ static void retrigger_next_event(void *arg)
 {
 	struct hrtimer_cpu_base *base = this_cpu_ptr(&hrtimer_bases);
 
-	if (!hrtimer_hres_active())
+	if (!base->hres_active)
 		return;
 
 	raw_spin_lock(&base->lock);
@@ -685,6 +690,7 @@ void clock_was_set_delayed(void)
 
 #else
 
+static inline int __hrtimer_hres_active(struct hrtimer_cpu_base *b) { return 0; }
 static inline int hrtimer_hres_active(void) { return 0; }
 static inline int hrtimer_is_hres_enabled(void) { return 0; }
 static inline int hrtimer_switch_to_hres(void) { return 0; }
@@ -862,25 +868,27 @@ static void __remove_hrtimer(struct hrtimer *timer,
 			     struct hrtimer_clock_base *base,
 			     unsigned long newstate, int reprogram)
 {
+	struct hrtimer_cpu_base *cpu_base = base->cpu_base;
 	struct timerqueue_node *next_timer;
+
 	if (!(timer->state & HRTIMER_STATE_ENQUEUED))
 		goto out;
 
 	next_timer = timerqueue_getnext(&base->active);
 	timerqueue_del(&base->active, &timer->node);
 	if (!timerqueue_getnext(&base->active))
-		base->cpu_base->active_bases &= ~(1 << base->index);
+		cpu_base->active_bases &= ~(1 << base->index);
 
 	if (&timer->node == next_timer) {
 #ifdef CONFIG_HIGH_RES_TIMERS
 		/* Reprogram the clock event device. if enabled */
-		if (reprogram && hrtimer_hres_active()) {
+		if (reprogram && cpu_base->hres_active) {
 			ktime_t expires;
 
 			expires = ktime_sub(hrtimer_get_expires(timer),
 					    base->offset);
-			if (base->cpu_base->expires_next.tv64 == expires.tv64)
-				hrtimer_force_reprogram(base->cpu_base, 1);
+			if (cpu_base->expires_next.tv64 == expires.tv64)
+				hrtimer_force_reprogram(cpu_base, 1);
 		}
 #endif
 	}
@@ -1114,7 +1122,7 @@ ktime_t hrtimer_get_next_event(void)
 
 	raw_spin_lock_irqsave(&cpu_base->lock, flags);
 
-	if (!hrtimer_hres_active())
+	if (!__hrtimer_hres_active(cpu_base))
 		mindelta = ktime_sub(__hrtimer_get_next_event(cpu_base),
 				     ktime_get());
 
@@ -1412,7 +1420,7 @@ void hrtimer_run_queues(void)
 	struct hrtimer_cpu_base *cpu_base = this_cpu_ptr(&hrtimer_bases);
 	ktime_t now;
 
-	if (hrtimer_hres_active())
+	if (__hrtimer_hres_active(cpu_base))
 		return;
 
 	raw_spin_lock(&cpu_base->lock);

^ permalink raw reply related	[flat|nested] 123+ messages in thread
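
The storage argument in "hrtimer: Use bits for various boolean indicators" can be checked with a small userspace sketch. The two demo_* structs below are hypothetical and only mirror the three flags touched by the patch; the exact sizes depend on the ABI, so the sketch deliberately makes no claim about specific byte counts.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical "before" layout: one full int per boolean indicator. */
struct demo_flags_int {
	int in_hrtirq;
	int hres_active;
	int hang_detected;
};

/* Hypothetical "after" layout: the same indicators as single-bit fields. */
struct demo_flags_bits {
	unsigned int in_hrtirq		: 1,
		     hres_active	: 1,
		     hang_detected	: 1;
};

/* Bytes saved by packing the flags; the value is ABI dependent. */
static size_t demo_bytes_saved(void)
{
	return sizeof(struct demo_flags_int) - sizeof(struct demo_flags_bits);
}

/*
 * A one-bit field can only hold 0 or 1, so normalise the value first
 * instead of relying on truncation of larger integers.
 */
static void demo_set_hres_active(struct demo_flags_bits *flags, int value)
{
	assert(flags);
	flags->hres_active = !!value;
}
```

One consequence worth keeping in mind: the address of a bit field cannot be taken, so any code that used a pointer to one of the old int members would need a different approach.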

* [tip:timers/core] hrtimer: Use cpu_base->active_base for hotpath iterators
  2015-04-14 21:08 ` [patch 10/39] hrtimer: Use cpu_base->active_base for hotpath iterators Thomas Gleixner
  2015-04-20 11:16   ` Preeti U Murthy
@ 2015-04-22 19:07   ` tip-bot for Thomas Gleixner
  1 sibling, 0 replies; 123+ messages in thread
From: tip-bot for Thomas Gleixner @ 2015-04-22 19:07 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, viresh.kumar, mtosatti, tglx, fweisbec, mingo, hpa,
	peterz, preeti

Commit-ID:  34aee88a02ba296a8e8c9523cdf77147731903f1
Gitweb:     http://git.kernel.org/tip/34aee88a02ba296a8e8c9523cdf77147731903f1
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 14 Apr 2015 21:08:41 +0000
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Wed, 22 Apr 2015 17:06:49 +0200

hrtimer: Use cpu_base->active_base for hotpath iterators

The active_bases field is guaranteed to be in sync with the timerqueue
of the corresponding clock base, so we can use it for iterating over
the clock bases. This allows the loop to break out early if no more
active clock bases are available and avoids touching the cache lines
of inactive clock bases.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/20150414203501.322887675@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/time/hrtimer.c | 17 ++++++++---------
 1 file changed, 8 insertions(+), 9 deletions(-)

diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
index 9bbfe33..fce0ccf 100644
--- a/kernel/time/hrtimer.c
+++ b/kernel/time/hrtimer.c
@@ -419,16 +419,16 @@ static ktime_t __hrtimer_get_next_event(struct hrtimer_cpu_base *cpu_base)
 {
 	struct hrtimer_clock_base *base = cpu_base->clock_base;
 	ktime_t expires, expires_next = { .tv64 = KTIME_MAX };
-	int i;
+	unsigned int active = cpu_base->active_bases;
 
-	for (i = 0; i < HRTIMER_MAX_CLOCK_BASES; i++, base++) {
+	for (; active; base++, active >>= 1) {
 		struct timerqueue_node *next;
 		struct hrtimer *timer;
 
-		next = timerqueue_getnext(&base->active);
-		if (!next)
+		if (!(active & 0x01))
 			continue;
 
+		next = timerqueue_getnext(&base->active);
 		timer = container_of(next, struct hrtimer, node);
 		expires = ktime_sub(hrtimer_get_expires(timer), base->offset);
 		if (expires.tv64 < expires_next.tv64)
@@ -1214,17 +1214,16 @@ static void __run_hrtimer(struct hrtimer_cpu_base *cpu_base,
 
 static void __hrtimer_run_queues(struct hrtimer_cpu_base *cpu_base, ktime_t now)
 {
-	int i;
+	struct hrtimer_clock_base *base = cpu_base->clock_base;
+	unsigned int active = cpu_base->active_bases;
 
-	for (i = 0; i < HRTIMER_MAX_CLOCK_BASES; i++) {
-		struct hrtimer_clock_base *base;
+	for (; active; base++, active >>= 1) {
 		struct timerqueue_node *node;
 		ktime_t basenow;
 
-		if (!(cpu_base->active_bases & (1 << i)))
+		if (!(active & 0x01))
 			continue;
 
-		base = cpu_base->clock_base + i;
 		basenow = ktime_add(now, base->offset);
 
 		while ((node = timerqueue_getnext(&base->active))) {

^ permalink raw reply related	[flat|nested] 123+ messages in thread
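
The shift-based loop from "hrtimer: Use cpu_base->active_base for hotpath iterators" can be tried in userspace. This is a rough sketch under stated assumptions: the demo_* types are hypothetical, each base stores a single precomputed expiry instead of a timerqueue, and the extra iterations counter exists only to make the early exit visible.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_MAX_BASES	4

/* Hypothetical clock base; next_expiry is meaningful only while active. */
struct demo_base {
	int64_t next_expiry;
};

struct demo_cpu_base {
	unsigned int active_bases;	/* bit N set: clock_base[N] has timers */
	struct demo_base clock_base[DEMO_MAX_BASES];
};

/*
 * Return the earliest expiry over all active bases, or INT64_MAX if none
 * is active. *iterations reports how many loop passes were needed.
 */
static int64_t demo_next_event(const struct demo_cpu_base *cpu_base,
			       unsigned int *iterations)
{
	const struct demo_base *base = cpu_base->clock_base;
	unsigned int active = cpu_base->active_bases;
	int64_t expires_next = INT64_MAX;

	assert(active < (1u << DEMO_MAX_BASES));

	*iterations = 0;
	/* The loop ends as soon as no higher bit is set. */
	for (; active; base++, active >>= 1) {
		(*iterations)++;

		/* Inactive base: skip it without reading its data. */
		if (!(active & 0x01))
			continue;

		if (base->next_expiry < expires_next)
			expires_next = base->next_expiry;
	}
	return expires_next;
}
```

With bits 0 and 2 set, the loop runs three passes and never reaches the fourth base, while the data of base 1 is skipped without being read. The fixed-count loop it replaces would have visited all four.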

* [tip:timers/core] hrtimer: Cache line align the hrtimer cpu base
  2015-04-14 21:08 ` [patch 11/39] hrtimer: Cache line align the hrtimer cpu base Thomas Gleixner
@ 2015-04-22 19:07   ` tip-bot for Thomas Gleixner
  0 siblings, 0 replies; 123+ messages in thread
From: tip-bot for Thomas Gleixner @ 2015-04-22 19:07 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: tglx, preeti, mtosatti, mingo, peterz, hpa, viresh.kumar,
	linux-kernel, fweisbec

Commit-ID:  6d9a1411393d51f17bee3fe163430b21b2cb2de9
Gitweb:     http://git.kernel.org/tip/6d9a1411393d51f17bee3fe163430b21b2cb2de9
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 14 Apr 2015 21:08:42 +0000
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Wed, 22 Apr 2015 17:06:49 +0200

hrtimer: Cache line align the hrtimer cpu base

We really want that data structure to start at a cache line boundary.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/20150414203501.417597627@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/hrtimer.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/hrtimer.h b/include/linux/hrtimer.h
index 17a59dd..0853f52 100644
--- a/include/linux/hrtimer.h
+++ b/include/linux/hrtimer.h
@@ -191,7 +191,7 @@ struct hrtimer_cpu_base {
 	unsigned int			max_hang_time;
 #endif
 	struct hrtimer_clock_base	clock_base[HRTIMER_MAX_CLOCK_BASES];
-};
+} ____cacheline_aligned;
 
 static inline void hrtimer_set_expires(struct hrtimer *timer, ktime_t time)
 {

^ permalink raw reply related	[flat|nested] 123+ messages in thread

* [tip:timers/core] hrtimer: Align the hrtimer clock bases as well
  2015-04-14 21:08 ` [patch 12/39] hrtimer: Align the hrtimer clock bases as well Thomas Gleixner
@ 2015-04-22 19:07   ` tip-bot for Thomas Gleixner
  0 siblings, 0 replies; 123+ messages in thread
From: tip-bot for Thomas Gleixner @ 2015-04-22 19:07 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: tglx, viresh.kumar, mtosatti, fweisbec, linux-kernel, mingo, hpa,
	preeti, peterz

Commit-ID:  b8e38413ac2c33c497e72895fcd5da709fd1b908
Gitweb:     http://git.kernel.org/tip/b8e38413ac2c33c497e72895fcd5da709fd1b908
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 14 Apr 2015 21:08:44 +0000
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Wed, 22 Apr 2015 17:06:49 +0200

hrtimer: Align the hrtimer clock bases as well

We don't use cacheline_align here because that might waste a lot of
space on 32bit machines with 64 byte cachelines and on 64bit machines
with 128 byte cachelines.

The size of struct hrtimer_clock_base is 64 bytes on 64bit and 32 bytes
on 32bit machines, so we utilize the cache lines properly.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/20150414203501.498165771@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/hrtimer.h | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/include/linux/hrtimer.h b/include/linux/hrtimer.h
index 0853f52..e5c22d6 100644
--- a/include/linux/hrtimer.h
+++ b/include/linux/hrtimer.h
@@ -130,6 +130,12 @@ struct hrtimer_sleeper {
 	struct task_struct *task;
 };
 
+#ifdef CONFIG_64BIT
+# define HRTIMER_CLOCK_BASE_ALIGN	64
+#else
+# define HRTIMER_CLOCK_BASE_ALIGN	32
+#endif
+
 /**
  * struct hrtimer_clock_base - the timer base for a specific clock
  * @cpu_base:		per cpu clock base
@@ -147,7 +153,7 @@ struct hrtimer_clock_base {
 	struct timerqueue_head	active;
 	ktime_t			(*get_time)(void);
 	ktime_t			offset;
-};
+} __attribute__((__aligned__(HRTIMER_CLOCK_BASE_ALIGN)));
 
 enum  hrtimer_base_type {
 	HRTIMER_BASE_MONOTONIC,
@@ -195,6 +201,8 @@ struct hrtimer_cpu_base {
 
 static inline void hrtimer_set_expires(struct hrtimer *timer, ktime_t time)
 {
+	BUILD_BUG_ON(sizeof(struct hrtimer_clock_base) > HRTIMER_CLOCK_BASE_ALIGN);
+
 	timer->node.expires = time;
 	timer->_softexpires = time;
 }

^ permalink raw reply related	[flat|nested] 123+ messages in thread
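
The alignment trick from "hrtimer: Align the hrtimer clock bases as well" can be reproduced in userspace with GCC or Clang. The sketch below is an illustration only: the demo_* names and the member list are hypothetical, the alignment is fixed at 64 rather than selected by CONFIG_64BIT, and C11 _Static_assert plays the role of BUILD_BUG_ON.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_CLOCK_BASE_ALIGN	64

/* Hypothetical clock base, roughly shaped like struct hrtimer_clock_base. */
struct demo_clock_base {
	void	*cpu_base;
	int	index;
	int	clockid;
	void	*active_head;
	void	*active_next;
	int64_t	(*get_time)(void);
	int64_t	offset;
} __attribute__((__aligned__(DEMO_CLOCK_BASE_ALIGN)));

/*
 * sizeof is rounded up to a multiple of the alignment, so this fires as
 * soon as the members no longer fit into one aligned slot.
 */
_Static_assert(sizeof(struct demo_clock_base) <= DEMO_CLOCK_BASE_ALIGN,
	       "demo_clock_base grew beyond its alignment slot");

/* Return 1 if every array element starts on an alignment boundary. */
static int demo_all_aligned(const struct demo_clock_base *bases, size_t n)
{
	assert(bases);

	for (size_t i = 0; i < n; i++) {
		if ((uintptr_t)&bases[i] % DEMO_CLOCK_BASE_ALIGN)
			return 0;
	}
	return 1;
}
```

Because the attribute sits on the type, every element of an array of these structs starts on its own boundary, which is the property the per-CPU clock_base[] array relies on.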

* [tip:timers/core] timerqueue: Let timerqueue_add/del return information
  2015-04-14 21:08 ` [patch 13/39] timerqueue: Let timerqueue_add/del return information Thomas Gleixner
@ 2015-04-22 19:08   ` tip-bot for Thomas Gleixner
  0 siblings, 0 replies; 123+ messages in thread
From: tip-bot for Thomas Gleixner @ 2015-04-22 19:08 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, mtosatti, hpa, viresh.kumar, preeti, john.stultz,
	fweisbec, tglx, peterz, mingo

Commit-ID:  c320642e1ced3b81592610e374894fea995f475b
Gitweb:     http://git.kernel.org/tip/c320642e1ced3b81592610e374894fea995f475b
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 14 Apr 2015 21:08:46 +0000
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Wed, 22 Apr 2015 17:06:49 +0200

timerqueue: Let timerqueue_add/del return information

The hrtimer code is interested in whether the added timer is the first
one to expire and whether the removed timer was the last one in the
tree. The add/del routines have that information already, so we can
return it right away instead of reevaluating it at the call site.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: John Stultz <john.stultz@linaro.org>
Link: http://lkml.kernel.org/r/20150414203501.579063647@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/timerqueue.h |  8 ++++----
 lib/timerqueue.c           | 10 +++++++---
 2 files changed, 11 insertions(+), 7 deletions(-)

diff --git a/include/linux/timerqueue.h b/include/linux/timerqueue.h
index a520fd7..7eec17a 100644
--- a/include/linux/timerqueue.h
+++ b/include/linux/timerqueue.h
@@ -16,10 +16,10 @@ struct timerqueue_head {
 };
 
 
-extern void timerqueue_add(struct timerqueue_head *head,
-				struct timerqueue_node *node);
-extern void timerqueue_del(struct timerqueue_head *head,
-				struct timerqueue_node *node);
+extern bool timerqueue_add(struct timerqueue_head *head,
+			   struct timerqueue_node *node);
+extern bool timerqueue_del(struct timerqueue_head *head,
+			   struct timerqueue_node *node);
 extern struct timerqueue_node *timerqueue_iterate_next(
 						struct timerqueue_node *node);
 
diff --git a/lib/timerqueue.c b/lib/timerqueue.c
index a382e4a..782ae8c 100644
--- a/lib/timerqueue.c
+++ b/lib/timerqueue.c
@@ -36,7 +36,7 @@
  * Adds the timer node to the timerqueue, sorted by the
  * node's expires value.
  */
-void timerqueue_add(struct timerqueue_head *head, struct timerqueue_node *node)
+bool timerqueue_add(struct timerqueue_head *head, struct timerqueue_node *node)
 {
 	struct rb_node **p = &head->head.rb_node;
 	struct rb_node *parent = NULL;
@@ -56,8 +56,11 @@ void timerqueue_add(struct timerqueue_head *head, struct timerqueue_node *node)
 	rb_link_node(&node->node, parent, p);
 	rb_insert_color(&node->node, &head->head);
 
-	if (!head->next || node->expires.tv64 < head->next->expires.tv64)
+	if (!head->next || node->expires.tv64 < head->next->expires.tv64) {
 		head->next = node;
+		return true;
+	}
+	return false;
 }
 EXPORT_SYMBOL_GPL(timerqueue_add);
 
@@ -69,7 +72,7 @@ EXPORT_SYMBOL_GPL(timerqueue_add);
  *
  * Removes the timer node from the timerqueue.
  */
-void timerqueue_del(struct timerqueue_head *head, struct timerqueue_node *node)
+bool timerqueue_del(struct timerqueue_head *head, struct timerqueue_node *node)
 {
 	WARN_ON_ONCE(RB_EMPTY_NODE(&node->node));
 
@@ -82,6 +85,7 @@ void timerqueue_del(struct timerqueue_head *head, struct timerqueue_node *node)
 	}
 	rb_erase(&node->node, &head->head);
 	RB_CLEAR_NODE(&node->node);
+	return head->next != NULL;
 }
 EXPORT_SYMBOL_GPL(timerqueue_del);
 

^ permalink raw reply related	[flat|nested] 123+ messages in thread
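
The return-value convention introduced by "timerqueue: Let timerqueue_add/del return information" is easy to model in userspace. The sketch below swaps in a sorted singly linked list for the kernel's rbtree, and all demo_* names are hypothetical. Only the contract is the same: add reports whether the new node became the first to expire, del reports whether any node is left.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct demo_node {
	int64_t expires;
	struct demo_node *next;
};

struct demo_queue {
	struct demo_node *first;	/* earliest-expiring node, or NULL */
};

/* Insert sorted by expiry; returns true if @node is now the first node. */
static bool demo_queue_add(struct demo_queue *q, struct demo_node *node)
{
	struct demo_node **p = &q->first;

	/* Nodes with an equal expiry stay behind those already queued. */
	while (*p && (*p)->expires <= node->expires)
		p = &(*p)->next;

	node->next = *p;
	*p = node;
	return q->first == node;
}

/* Unlink @node; returns true if the queue still holds nodes afterwards. */
static bool demo_queue_del(struct demo_queue *q, struct demo_node *node)
{
	struct demo_node **p = &q->first;

	while (*p && *p != node)
		p = &(*p)->next;
	assert(*p == node);	/* the caller must pass a queued node */

	*p = node->next;
	node->next = NULL;
	return q->first != NULL;
}
```

A caller can then act on the result directly, for example clearing a "has timers" bit when demo_queue_del() returns false, instead of querying the queue a second time.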

* [tip:timers/core] hrtimer: Make use of timerqueue_add/del return values
  2015-04-14 21:08 ` [patch 14/39] hrtimer: Make use of timerqueue_add/del return values Thomas Gleixner
@ 2015-04-22 19:08   ` tip-bot for Thomas Gleixner
  0 siblings, 0 replies; 123+ messages in thread
From: tip-bot for Thomas Gleixner @ 2015-04-22 19:08 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: fweisbec, hpa, preeti, peterz, viresh.kumar, linux-kernel, tglx,
	mtosatti, mingo

Commit-ID:  b97f44c9b658d52e0139c947ea5519e51ba38d81
Gitweb:     http://git.kernel.org/tip/b97f44c9b658d52e0139c947ea5519e51ba38d81
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 14 Apr 2015 21:08:47 +0000
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Wed, 22 Apr 2015 17:06:50 +0200

hrtimer: Make use of timerqueue_add/del return values

Use the return value instead of reevaluating the information.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/20150414203501.658152945@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/time/hrtimer.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
index fce0ccf..0cd1e0b 100644
--- a/kernel/time/hrtimer.c
+++ b/kernel/time/hrtimer.c
@@ -842,7 +842,6 @@ static int enqueue_hrtimer(struct hrtimer *timer,
 {
 	debug_activate(timer);
 
-	timerqueue_add(&base->active, &timer->node);
 	base->cpu_base->active_bases |= 1 << base->index;
 
 	/*
@@ -851,7 +850,7 @@ static int enqueue_hrtimer(struct hrtimer *timer,
 	 */
 	timer->state |= HRTIMER_STATE_ENQUEUED;
 
-	return (&timer->node == base->active.next);
+	return timerqueue_add(&base->active, &timer->node);
 }
 
 /*
@@ -875,8 +874,7 @@ static void __remove_hrtimer(struct hrtimer *timer,
 		goto out;
 
 	next_timer = timerqueue_getnext(&base->active);
-	timerqueue_del(&base->active, &timer->node);
-	if (!timerqueue_getnext(&base->active))
+	if (!timerqueue_del(&base->active, &timer->node))
 		cpu_base->active_bases &= ~(1 << base->index);
 
 	if (&timer->node == next_timer) {

^ permalink raw reply related	[flat|nested] 123+ messages in thread
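
The idea of the patch above, letting the queue's add/del helpers report what the caller would otherwise have to look up again, can be sketched in user space. This is a minimal sketch under stated assumptions, not the kernel implementation: a sorted singly linked list stands in for the rbtree-backed timerqueue, and every toy_* name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_BASES 4

/* Hypothetical stand-in for a timerqueue node (the kernel uses an rbtree). */
struct toy_node {
	long expires;
	struct toy_node *next;
};

struct toy_queue {
	struct toy_node *first;	/* earliest expiring node, NULL if empty */
};

struct toy_cpu_base {
	unsigned int active_bases;	/* bit n set: queue[n] may hold nodes */
	struct toy_queue queue[NR_BASES];
};

/* Insert sorted by expiry; returns true if @node became the new head. */
static bool toy_queue_add(struct toy_queue *q, struct toy_node *node)
{
	struct toy_node **link = &q->first;

	while (*link && (*link)->expires <= node->expires)
		link = &(*link)->next;
	node->next = *link;
	*link = node;
	return q->first == node;
}

/* Unlink @node; returns true if the queue still holds nodes afterwards. */
static bool toy_queue_del(struct toy_queue *q, struct toy_node *node)
{
	struct toy_node **link = &q->first;

	while (*link && *link != node)
		link = &(*link)->next;
	assert(*link == node);	/* the caller must pass a queued node */
	*link = node->next;
	node->next = NULL;
	return q->first != NULL;
}

/* Enqueue and report whether the new node is now the first to expire. */
static bool toy_enqueue(struct toy_cpu_base *cpu, unsigned int idx,
			struct toy_node *node)
{
	cpu->active_bases |= 1U << idx;
	return toy_queue_add(&cpu->queue[idx], node);
}

/* Remove and clear the active bit when the queue became empty. */
static void toy_remove(struct toy_cpu_base *cpu, unsigned int idx,
		       struct toy_node *node)
{
	if (!toy_queue_del(&cpu->queue[idx], node))
		cpu->active_bases &= ~(1U << idx);
}
```

The callers never inspect the queue a second time: the add result answers "is this the new first timer?" and the del result answers "is the base still active?".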

* [tip:timers/core] hrtimer: Keep pointer to first timer and simplify __remove_hrtimer()
  2015-04-14 21:08 ` [patch 15/39] hrtimer: Keep pointer to first timer and simplify __remove_hrtimer() Thomas Gleixner
@ 2015-04-22 19:08   ` tip-bot for Thomas Gleixner
  0 siblings, 0 replies; 123+ messages in thread
From: tip-bot for Thomas Gleixner @ 2015-04-22 19:08 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: peterz, linux-kernel, mtosatti, preeti, hpa, viresh.kumar,
	fweisbec, mingo, tglx

Commit-ID:  895bdfa793f6e912d1a58fc445b3dd4d686f7bd3
Gitweb:     http://git.kernel.org/tip/895bdfa793f6e912d1a58fc445b3dd4d686f7bd3
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 14 Apr 2015 21:08:49 +0000
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Wed, 22 Apr 2015 17:06:50 +0200

hrtimer: Keep pointer to first timer and simplify __remove_hrtimer()

__remove_hrtimer() needs to evaluate the expiry time to figure out
whether the timer being removed is the first expiring timer on the
cpu. Keep a lazily updated pointer to that timer, so we can avoid the
evaluation dance and retrieve the information from there.

Generates slightly better code.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/20150414203501.752838019@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/hrtimer.h |  6 ++++++
 kernel/time/hrtimer.c   | 46 ++++++++++++++++++++++++++++------------------
 2 files changed, 34 insertions(+), 18 deletions(-)

diff --git a/include/linux/hrtimer.h b/include/linux/hrtimer.h
index e5c22d6..d194c1d 100644
--- a/include/linux/hrtimer.h
+++ b/include/linux/hrtimer.h
@@ -172,6 +172,7 @@ enum  hrtimer_base_type {
  * @clock_was_set_seq:	Sequence counter of clock was set events
  * @expires_next:	absolute time of the next event which was scheduled
  *			via clock_set_next_event()
+ * @next_timer:		Pointer to the first expiring timer
  * @in_hrtirq:		hrtimer_interrupt() is currently executing
  * @hres_active:	State of high resolution mode
  * @hang_detected:	The last hrtimer interrupt detected a hang
@@ -180,6 +181,10 @@ enum  hrtimer_base_type {
  * @nr_hangs:		Total number of hrtimer interrupt hangs
  * @max_hang_time:	Maximum time spent in hrtimer_interrupt
  * @clock_base:		array of clock bases for this cpu
+ *
+ * Note: next_timer is just an optimization for __remove_hrtimer().
+ *	 Do not dereference the pointer because it is not reliable on
+ *	 cross cpu removals.
  */
 struct hrtimer_cpu_base {
 	raw_spinlock_t			lock;
@@ -191,6 +196,7 @@ struct hrtimer_cpu_base {
 					hres_active	: 1,
 					hang_detected	: 1;
 	ktime_t				expires_next;
+	struct hrtimer			*next_timer;
 	unsigned int			nr_events;
 	unsigned int			nr_retries;
 	unsigned int			nr_hangs;
diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
index 0cd1e0b..30178d0 100644
--- a/kernel/time/hrtimer.c
+++ b/kernel/time/hrtimer.c
@@ -415,12 +415,21 @@ static inline void debug_deactivate(struct hrtimer *timer)
 }
 
 #if defined(CONFIG_NO_HZ_COMMON) || defined(CONFIG_HIGH_RES_TIMERS)
+static inline void hrtimer_update_next_timer(struct hrtimer_cpu_base *cpu_base,
+					     struct hrtimer *timer)
+{
+#ifdef CONFIG_HIGH_RES_TIMERS
+	cpu_base->next_timer = timer;
+#endif
+}
+
 static ktime_t __hrtimer_get_next_event(struct hrtimer_cpu_base *cpu_base)
 {
 	struct hrtimer_clock_base *base = cpu_base->clock_base;
 	ktime_t expires, expires_next = { .tv64 = KTIME_MAX };
 	unsigned int active = cpu_base->active_bases;
 
+	hrtimer_update_next_timer(cpu_base, NULL);
 	for (; active; base++, active >>= 1) {
 		struct timerqueue_node *next;
 		struct hrtimer *timer;
@@ -431,8 +440,10 @@ static ktime_t __hrtimer_get_next_event(struct hrtimer_cpu_base *cpu_base)
 		next = timerqueue_getnext(&base->active);
 		timer = container_of(next, struct hrtimer, node);
 		expires = ktime_sub(hrtimer_get_expires(timer), base->offset);
-		if (expires.tv64 < expires_next.tv64)
+		if (expires.tv64 < expires_next.tv64) {
 			expires_next = expires;
+			hrtimer_update_next_timer(cpu_base, timer);
+		}
 	}
 	/*
 	 * clock_was_set() might have changed base->offset of any of
@@ -597,6 +608,8 @@ static int hrtimer_reprogram(struct hrtimer *timer,
 	if (cpu_base->in_hrtirq)
 		return 0;
 
+	cpu_base->next_timer = timer;
+
 	/*
 	 * If a hang was detected in the last timer interrupt then we
 	 * do not schedule a timer which is earlier than the expiry
@@ -868,30 +881,27 @@ static void __remove_hrtimer(struct hrtimer *timer,
 			     unsigned long newstate, int reprogram)
 {
 	struct hrtimer_cpu_base *cpu_base = base->cpu_base;
-	struct timerqueue_node *next_timer;
+	unsigned int state = timer->state;
 
-	if (!(timer->state & HRTIMER_STATE_ENQUEUED))
-		goto out;
+	timer->state = newstate;
+	if (!(state & HRTIMER_STATE_ENQUEUED))
+		return;
 
-	next_timer = timerqueue_getnext(&base->active);
 	if (!timerqueue_del(&base->active, &timer->node))
 		cpu_base->active_bases &= ~(1 << base->index);
 
-	if (&timer->node == next_timer) {
 #ifdef CONFIG_HIGH_RES_TIMERS
-		/* Reprogram the clock event device. if enabled */
-		if (reprogram && cpu_base->hres_active) {
-			ktime_t expires;
-
-			expires = ktime_sub(hrtimer_get_expires(timer),
-					    base->offset);
-			if (cpu_base->expires_next.tv64 == expires.tv64)
-				hrtimer_force_reprogram(cpu_base, 1);
-		}
+	/*
+	 * Note: If reprogram is false we do not update
+	 * cpu_base->next_timer. This happens when we remove the first
+	 * timer on a remote cpu. No harm as we never dereference
+	 * cpu_base->next_timer. So the worst thing that can happen is
+	 * a superfluous call to hrtimer_force_reprogram() on the
+	 * remote cpu later on if the same timer gets enqueued again.
+	 */
+	if (reprogram && timer == cpu_base->next_timer)
+		hrtimer_force_reprogram(cpu_base, 1);
 #endif
-	}
-out:
-	timer->state = newstate;
 }
 
 /*

^ permalink raw reply related	[flat|nested] 123+ messages in thread
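
The lazily updated next_timer pointer from the patch above is used purely as an identity hint. The following user-space sketch illustrates that pattern under simplifying assumptions: timers live in a flat array instead of per-clock rbtrees, there is no locking, and all toy_* names are hypothetical rather than kernel API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_timer {
	long expires;
	bool queued;
};

struct toy_cpu_base {
	/* Identity hint only: compared against, never dereferenced. */
	const struct toy_timer *next_timer;
	/* Counts simulated reprogramming of the event device. */
	unsigned int reprogram_count;
};

/* Rescan all timers, cache the address of the earliest queued one. */
static void toy_force_reprogram(struct toy_cpu_base *cpu,
				const struct toy_timer *timers, size_t n)
{
	const struct toy_timer *first = NULL;
	size_t i;

	for (i = 0; i < n; i++) {
		if (!timers[i].queued)
			continue;
		if (!first || timers[i].expires < first->expires)
			first = &timers[i];
	}
	cpu->next_timer = first;
	cpu->reprogram_count++;
}

/*
 * Dequeue @timer. Only a pointer comparison decides whether the
 * expensive rescan is needed. With @reprogram false the cached
 * pointer is left stale, which is harmless because it is never
 * dereferenced.
 */
static void toy_remove(struct toy_cpu_base *cpu, struct toy_timer *timers,
		       size_t n, struct toy_timer *timer, bool reprogram)
{
	assert(timer->queued);
	timer->queued = false;
	if (reprogram && timer == cpu->next_timer)
		toy_force_reprogram(cpu, timers, n);
}
```

Removing a timer that is not the cached first one costs a single comparison. A stale pointer can at worst cause one unnecessary rescan later, which matches the trade-off described in the patch comment.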

* [tip:timers/core] hrtimer: Get rid of hrtimer softirq
  2015-04-14 21:08 ` [patch 16/39] hrtimer: Get rid of hrtimer softirq Thomas Gleixner
@ 2015-04-22 19:09   ` tip-bot for Thomas Gleixner
  0 siblings, 0 replies; 123+ messages in thread
From: tip-bot for Thomas Gleixner @ 2015-04-22 19:09 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: mtosatti, preeti, peterz, fweisbec, viresh.kumar, linux-kernel,
	mingo, hpa, tglx

Commit-ID:  c6eb3f70d4482806dc2d3e1e3c7736f497b1d418
Gitweb:     http://git.kernel.org/tip/c6eb3f70d4482806dc2d3e1e3c7736f497b1d418
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 14 Apr 2015 21:08:51 +0000
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Wed, 22 Apr 2015 17:06:50 +0200

hrtimer: Get rid of hrtimer softirq

hrtimer softirq is a leftover from the initial implementation and
serves only to handle the enqueueing of already expired timers in
high resolution timer mode. We discussed whether we should change the
return value and force all start sites to handle an already expired
timer, but that would be a Herculean task and I'm not sure whether
it's a good idea to enforce that handling on everyone.

A simpler solution is to enforce a timer interrupt instead of raising
and scheduling a softirq. Just use the existing infrastructure to do
so and remove all the softirq leftovers.

The HRTIMER softirq enum is now unused, but kept around because trace
parsers rely on the existing numbering.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/20150414203501.840834708@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/hrtimer.h   |   1 -
 include/linux/interrupt.h |   3 +-
 kernel/time/hrtimer.c     | 163 ++++++++++++----------------------------------
 kernel/time/tick-common.c |  10 +++
 kernel/time/timer.c       |   2 -
 5 files changed, 55 insertions(+), 124 deletions(-)

diff --git a/include/linux/hrtimer.h b/include/linux/hrtimer.h
index d194c1d..048270a 100644
--- a/include/linux/hrtimer.h
+++ b/include/linux/hrtimer.h
@@ -459,7 +459,6 @@ extern int schedule_hrtimeout(ktime_t *expires, const enum hrtimer_mode mode);
 
 /* Soft interrupt function to run the hrtimer queues: */
 extern void hrtimer_run_queues(void);
-extern void hrtimer_run_pending(void);
 
 /* Bootup initialization: */
 extern void __init hrtimers_init(void);
diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
index 950ae45..6bf15a6 100644
--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -413,7 +413,8 @@ enum
 	BLOCK_IOPOLL_SOFTIRQ,
 	TASKLET_SOFTIRQ,
 	SCHED_SOFTIRQ,
-	HRTIMER_SOFTIRQ,
+	HRTIMER_SOFTIRQ, /* Unused, but kept as tools rely on the
+			    numbering. Sigh! */
 	RCU_SOFTIRQ,    /* Preferable RCU should always be the last softirq */
 
 	NR_SOFTIRQS
diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
index 30178d0..fc6b6d2 100644
--- a/kernel/time/hrtimer.c
+++ b/kernel/time/hrtimer.c
@@ -555,59 +555,48 @@ hrtimer_force_reprogram(struct hrtimer_cpu_base *cpu_base, int skip_equal)
 }
 
 /*
- * Shared reprogramming for clock_realtime and clock_monotonic
- *
  * When a timer is enqueued and expires earlier than the already enqueued
  * timers, we have to check, whether it expires earlier than the timer for
  * which the clock event device was armed.
  *
- * Note, that in case the state has HRTIMER_STATE_CALLBACK set, no reprogramming
- * and no expiry check happens. The timer gets enqueued into the rbtree. The
- * reprogramming and expiry check is done in the hrtimer_interrupt or in the
- * softirq.
- *
  * Called with interrupts disabled and base->cpu_base.lock held
  */
-static int hrtimer_reprogram(struct hrtimer *timer,
-			     struct hrtimer_clock_base *base)
+static void hrtimer_reprogram(struct hrtimer *timer,
+			      struct hrtimer_clock_base *base)
 {
 	struct hrtimer_cpu_base *cpu_base = this_cpu_ptr(&hrtimer_bases);
 	ktime_t expires = ktime_sub(hrtimer_get_expires(timer), base->offset);
-	int res;
 
 	WARN_ON_ONCE(hrtimer_get_expires_tv64(timer) < 0);
 
 	/*
-	 * When the callback is running, we do not reprogram the clock event
-	 * device. The timer callback is either running on a different CPU or
-	 * the callback is executed in the hrtimer_interrupt context. The
-	 * reprogramming is handled either by the softirq, which called the
-	 * callback or at the end of the hrtimer_interrupt.
+	 * If the timer is not on the current cpu, we cannot reprogram
+	 * the other cpu's clock event device.
 	 */
-	if (hrtimer_callback_running(timer))
-		return 0;
+	if (base->cpu_base != cpu_base)
+		return;
+
+	/*
+	 * If the hrtimer interrupt is running, then it will
+	 * reevaluate the clock bases and reprogram the clock event
+	 * device. The callbacks are always executed in hard interrupt
+	 * context so we don't need an extra check for a running
+	 * callback.
+	 */
+	if (cpu_base->in_hrtirq)
+		return;
 
 	/*
 	 * CLOCK_REALTIME timer might be requested with an absolute
-	 * expiry time which is less than base->offset. Nothing wrong
-	 * about that, just avoid to call into the tick code, which
-	 * has now objections against negative expiry values.
+	 * expiry time which is less than base->offset. Set it to 0.
 	 */
 	if (expires.tv64 < 0)
-		return -ETIME;
+		expires.tv64 = 0;
 
 	if (expires.tv64 >= cpu_base->expires_next.tv64)
-		return 0;
-
-	/*
-	 * When the target cpu of the timer is currently executing
-	 * hrtimer_interrupt(), then we do not touch the clock event
-	 * device. hrtimer_interrupt() will reevaluate all clock bases
-	 * before reprogramming the device.
-	 */
-	if (cpu_base->in_hrtirq)
-		return 0;
+		return;
 
+	/* Update the pointer to the next expiring timer */
 	cpu_base->next_timer = timer;
 
 	/*
@@ -617,15 +606,14 @@ static int hrtimer_reprogram(struct hrtimer *timer,
 	 * to make progress.
 	 */
 	if (cpu_base->hang_detected)
-		return 0;
+		return;
 
 	/*
-	 * Clockevents returns -ETIME, when the event was in the past.
+	 * Program the timer hardware. We enforce the expiry for
+	 * events which are already in the past.
 	 */
-	res = tick_program_event(expires, 0);
-	if (!IS_ERR_VALUE(res))
-		cpu_base->expires_next = expires;
-	return res;
+	cpu_base->expires_next = expires;
+	tick_program_event(expires, 1);
 }
 
 /*
@@ -660,19 +648,11 @@ static void retrigger_next_event(void *arg)
  */
 static int hrtimer_switch_to_hres(void)
 {
-	int cpu = smp_processor_id();
-	struct hrtimer_cpu_base *base = &per_cpu(hrtimer_bases, cpu);
-	unsigned long flags;
-
-	if (base->hres_active)
-		return 1;
-
-	local_irq_save(flags);
+	struct hrtimer_cpu_base *base = this_cpu_ptr(&hrtimer_bases);
 
 	if (tick_init_highres()) {
-		local_irq_restore(flags);
 		printk(KERN_WARNING "Could not switch to high resolution "
-				    "mode on CPU %d\n", cpu);
+				    "mode on CPU %d\n", base->cpu);
 		return 0;
 	}
 	base->hres_active = 1;
@@ -681,7 +661,6 @@ static int hrtimer_switch_to_hres(void)
 	tick_setup_sched_timer();
 	/* "Retrigger" the interrupt to get things going */
 	retrigger_next_event(NULL);
-	local_irq_restore(flags);
 	return 1;
 }
 
@@ -984,26 +963,8 @@ int __hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim,
 		 * on dynticks target.
 		 */
 		wake_up_nohz_cpu(new_base->cpu_base->cpu);
-	} else if (new_base->cpu_base == this_cpu_ptr(&hrtimer_bases) &&
-			hrtimer_reprogram(timer, new_base)) {
-		/*
-		 * Only allow reprogramming if the new base is on this CPU.
-		 * (it might still be on another CPU if the timer was pending)
-		 *
-		 * XXX send_remote_softirq() ?
-		 */
-		if (wakeup) {
-			/*
-			 * We need to drop cpu_base->lock to avoid a
-			 * lock ordering issue vs. rq->lock.
-			 */
-			raw_spin_unlock(&new_base->cpu_base->lock);
-			raise_softirq_irqoff(HRTIMER_SOFTIRQ);
-			local_irq_restore(flags);
-			return ret;
-		} else {
-			__raise_softirq_irqoff(HRTIMER_SOFTIRQ);
-		}
+	} else {
+		hrtimer_reprogram(timer, new_base);
 	}
 
 	unlock_hrtimer_base(timer, &flags);
@@ -1354,7 +1315,7 @@ retry:
  * local version of hrtimer_peek_ahead_timers() called with interrupts
  * disabled.
  */
-static void __hrtimer_peek_ahead_timers(void)
+static inline void __hrtimer_peek_ahead_timers(void)
 {
 	struct tick_device *td;
 
@@ -1366,29 +1327,6 @@ static void __hrtimer_peek_ahead_timers(void)
 		hrtimer_interrupt(td->evtdev);
 }
 
-/**
- * hrtimer_peek_ahead_timers -- run soft-expired timers now
- *
- * hrtimer_peek_ahead_timers will peek at the timer queue of
- * the current cpu and check if there are any timers for which
- * the soft expires time has passed. If any such timers exist,
- * they are run immediately and then removed from the timer queue.
- *
- */
-void hrtimer_peek_ahead_timers(void)
-{
-	unsigned long flags;
-
-	local_irq_save(flags);
-	__hrtimer_peek_ahead_timers();
-	local_irq_restore(flags);
-}
-
-static void run_hrtimer_softirq(struct softirq_action *h)
-{
-	hrtimer_peek_ahead_timers();
-}
-
 #else /* CONFIG_HIGH_RES_TIMERS */
 
 static inline void __hrtimer_peek_ahead_timers(void) { }
@@ -1396,31 +1334,7 @@ static inline void __hrtimer_peek_ahead_timers(void) { }
 #endif	/* !CONFIG_HIGH_RES_TIMERS */
 
 /*
- * Called from timer softirq every jiffy, expire hrtimers:
- *
- * For HRT its the fall back code to run the softirq in the timer
- * softirq context in case the hrtimer initialization failed or has
- * not been done yet.
- */
-void hrtimer_run_pending(void)
-{
-	if (hrtimer_hres_active())
-		return;
-
-	/*
-	 * This _is_ ugly: We have to check in the softirq context,
-	 * whether we can switch to highres and / or nohz mode. The
-	 * clocksource switch happens in the timer interrupt with
-	 * xtime_lock held. Notification from there only sets the
-	 * check bit in the tick_oneshot code, otherwise we might
-	 * deadlock vs. xtime_lock.
-	 */
-	if (tick_check_oneshot_change(!hrtimer_is_hres_enabled()))
-		hrtimer_switch_to_hres();
-}
-
-/*
- * Called from hardirq context every jiffy
+ * Called from run_local_timers in hardirq context every jiffy
  */
 void hrtimer_run_queues(void)
 {
@@ -1430,6 +1344,18 @@ void hrtimer_run_queues(void)
 	if (__hrtimer_hres_active(cpu_base))
 		return;
 
+	/*
+	 * This _is_ ugly: We have to check periodically, whether we
+	 * can switch to highres and / or nohz mode. The clocksource
+	 * switch happens with xtime_lock held. Notification from
+	 * there only sets the check bit in the tick_oneshot code,
+	 * otherwise we might deadlock vs. xtime_lock.
+	 */
+	if (tick_check_oneshot_change(!hrtimer_is_hres_enabled())) {
+		hrtimer_switch_to_hres();
+		return;
+	}
+
 	raw_spin_lock(&cpu_base->lock);
 	now = hrtimer_update_base(cpu_base);
 	__hrtimer_run_queues(cpu_base, now);
@@ -1700,9 +1626,6 @@ void __init hrtimers_init(void)
 	hrtimer_cpu_notify(&hrtimers_nb, (unsigned long)CPU_UP_PREPARE,
 			  (void *)(long)smp_processor_id());
 	register_cpu_notifier(&hrtimers_nb);
-#ifdef CONFIG_HIGH_RES_TIMERS
-	open_softirq(HRTIMER_SOFTIRQ, run_hrtimer_softirq);
-#endif
 }
 
 /**
diff --git a/kernel/time/tick-common.c b/kernel/time/tick-common.c
index 3ae6afa..ea5f9ea 100644
--- a/kernel/time/tick-common.c
+++ b/kernel/time/tick-common.c
@@ -102,6 +102,16 @@ void tick_handle_periodic(struct clock_event_device *dev)
 
 	tick_periodic(cpu);
 
+#if defined(CONFIG_HIGH_RES_TIMERS) || defined(CONFIG_NO_HZ_COMMON)
+	/*
+	 * The cpu might have transitioned to HIGHRES or NOHZ mode via
+	 * update_process_times() -> run_local_timers() ->
+	 * hrtimer_run_queues().
+	 */
+	if (dev->event_handler != tick_handle_periodic)
+		return;
+#endif
+
 	if (dev->state != CLOCK_EVT_STATE_ONESHOT)
 		return;
 	for (;;) {
diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index 2ece3aa..b31f13f 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -1409,8 +1409,6 @@ static void run_timer_softirq(struct softirq_action *h)
 {
 	struct tvec_base *base = __this_cpu_read(tvec_bases);
 
-	hrtimer_run_pending();
-
 	if (time_after_eq(jiffies, base->timer_jiffies))
 		__run_timers(base);
 }

^ permalink raw reply related	[flat|nested] 123+ messages in thread
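
The patch above switches hrtimer_reprogram() from tick_program_event(expires, 0) to tick_program_event(expires, 1), so an expiry in the past leads to an enforced interrupt instead of an -ETIME return. The difference between the two modes can be modelled as follows. This is a simplified sketch: the toy_clock_event structure, its fields and the min_delta clamping are illustrative assumptions, not the clockevents code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a one-shot clock event device. */
struct toy_clock_event {
	int64_t now;		/* current time in ns */
	int64_t min_delta;	/* smallest programmable distance in ns */
	int64_t programmed;	/* absolute expiry currently armed */
};

/*
 * Arm the device for @expires (absolute time).
 *
 * force == false: an expiry that is not in the future is rejected
 *                 with -ETIME and the caller has to cope with it.
 * force == true:  such an expiry is armed at now + min_delta, so the
 *                 interrupt fires as soon as the device allows.
 */
static int toy_program_event(struct toy_clock_event *dev, int64_t expires,
			     bool force)
{
	int64_t delta = expires - dev->now;

	assert(dev->min_delta > 0);
	if (delta <= 0) {
		if (!force)
			return -ETIME;
		delta = dev->min_delta;
	} else if (delta < dev->min_delta) {
		delta = dev->min_delta;
	}
	dev->programmed = dev->now + delta;
	return 0;
}
```

With forcing, the "already expired" case needs no special handling at the call site: the expiry is delivered through the normal interrupt path, which is what makes the softirq fallback unnecessary.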

* [tip:timers/core] tick: sched: Remove hrtimer_active() checks
  2015-04-14 21:08 ` [patch 17/39] tick: sched: Remove hrtimer_active() checks Thomas Gleixner
  2015-04-16 13:37   ` Frederic Weisbecker
@ 2015-04-22 19:09   ` tip-bot for Thomas Gleixner
  1 sibling, 0 replies; 123+ messages in thread
From: tip-bot for Thomas Gleixner @ 2015-04-22 19:09 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: tglx, preeti, peterz, hpa, mtosatti, fweisbec, mingo,
	john.stultz, linux-kernel, viresh.kumar

Commit-ID:  afc08b15cc2a3d2c48cbd427be8e0eea05698363
Gitweb:     http://git.kernel.org/tip/afc08b15cc2a3d2c48cbd427be8e0eea05698363
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 14 Apr 2015 21:08:52 +0000
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Wed, 22 Apr 2015 17:06:50 +0200

tick: sched: Remove hrtimer_active() checks

hrtimer_start() enforces a timer interrupt if the timer is already
expired. Get rid of the checks and the forward loop.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Link: http://lkml.kernel.org/r/20150414203501.943658239@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

---
 kernel/time/tick-sched.c | 19 ++++---------------
 1 file changed, 4 insertions(+), 15 deletions(-)

diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 9142591..dc586c3 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -696,11 +696,9 @@ static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts,
 		if (ts->nohz_mode == NOHZ_MODE_HIGHRES) {
 			hrtimer_start(&ts->sched_timer, expires,
 				      HRTIMER_MODE_ABS_PINNED);
-			/* Check, if the timer was already in the past */
-			if (hrtimer_active(&ts->sched_timer))
-				goto out;
+			goto out;
 		} else if (!tick_program_event(expires, 0))
-				goto out;
+			goto out;
 		/*
 		 * We are past the event already. So we crossed a
 		 * jiffie boundary. Update jiffies and raise the
@@ -888,8 +886,6 @@ static void tick_nohz_restart(struct tick_sched *ts, ktime_t now)
 		if (ts->nohz_mode == NOHZ_MODE_HIGHRES) {
 			hrtimer_start_expires(&ts->sched_timer,
 					      HRTIMER_MODE_ABS_PINNED);
-			/* Check, if the timer was already in the past */
-			if (hrtimer_active(&ts->sched_timer))
 				break;
 		} else {
 			if (!tick_program_event(
@@ -1167,15 +1163,8 @@ void tick_setup_sched_timer(void)
 		hrtimer_add_expires_ns(&ts->sched_timer, offset);
 	}
 
-	for (;;) {
-		hrtimer_forward(&ts->sched_timer, now, tick_period);
-		hrtimer_start_expires(&ts->sched_timer,
-				      HRTIMER_MODE_ABS_PINNED);
-		/* Check, if the timer was already in the past */
-		if (hrtimer_active(&ts->sched_timer))
-			break;
-		now = ktime_get();
-	}
+	hrtimer_forward(&ts->sched_timer, now, tick_period);
+	hrtimer_start_expires(&ts->sched_timer, HRTIMER_MODE_ABS_PINNED);
 
 #ifdef CONFIG_NO_HZ_COMMON
 	if (tick_nohz_enabled) {

^ permalink raw reply related	[flat|nested] 123+ messages in thread

* [tip:timers/core] tick: sched: Force tick interrupt and get rid of softirq magic
  2015-04-14 21:08 ` [patch 18/39] tick: sched: Force tick interrupt and get rid of softirq magic Thomas Gleixner
  2015-04-22 14:22   ` Frederic Weisbecker
@ 2015-04-22 19:09   ` tip-bot for Thomas Gleixner
  1 sibling, 0 replies; 123+ messages in thread
From: tip-bot for Thomas Gleixner @ 2015-04-22 19:09 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: fweisbec, viresh.kumar, john.stultz, mingo, mtosatti,
	linux-kernel, peterz, hpa, tglx, preeti

Commit-ID:  0ff53d09642204c648424def0caa9117e7a3caaf
Gitweb:     http://git.kernel.org/tip/0ff53d09642204c648424def0caa9117e7a3caaf
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 14 Apr 2015 21:08:54 +0000
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Wed, 22 Apr 2015 17:06:50 +0200

tick: sched: Force tick interrupt and get rid of softirq magic

We already got rid of the hrtimer reprogramming loops and hoops as
hrtimer now enforces an interrupt if the enqueued time is in the past.

Do the same for the nohz non highres mode. That gets rid of the need
to raise the softirq which only serves the purpose of getting the
machine out of the inner idle loop.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Link: http://lkml.kernel.org/r/20150414203502.023464878@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

---
 kernel/time/tick-sched.c | 83 +++++++++++++++++-------------------------------
 1 file changed, 29 insertions(+), 54 deletions(-)

diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index dc586c3..0f07ff2 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -565,6 +565,20 @@ u64 get_cpu_iowait_time_us(int cpu, u64 *last_update_time)
 }
 EXPORT_SYMBOL_GPL(get_cpu_iowait_time_us);
 
+static void tick_nohz_restart(struct tick_sched *ts, ktime_t now)
+{
+	hrtimer_cancel(&ts->sched_timer);
+	hrtimer_set_expires(&ts->sched_timer, ts->last_tick);
+
+	/* Forward the time to expire in the future */
+	hrtimer_forward(&ts->sched_timer, now, tick_period);
+
+	if (ts->nohz_mode == NOHZ_MODE_HIGHRES)
+		hrtimer_start_expires(&ts->sched_timer, HRTIMER_MODE_ABS_PINNED);
+	else
+		tick_program_event(hrtimer_get_expires(&ts->sched_timer), 1);
+}
+
 static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts,
 					 ktime_t now, int cpu)
 {
@@ -691,22 +705,18 @@ static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts,
 			if (ts->nohz_mode == NOHZ_MODE_HIGHRES)
 				hrtimer_cancel(&ts->sched_timer);
 			goto out;
-		}
+		 }
 
-		if (ts->nohz_mode == NOHZ_MODE_HIGHRES) {
-			hrtimer_start(&ts->sched_timer, expires,
-				      HRTIMER_MODE_ABS_PINNED);
-			goto out;
-		} else if (!tick_program_event(expires, 0))
-			goto out;
-		/*
-		 * We are past the event already. So we crossed a
-		 * jiffie boundary. Update jiffies and raise the
-		 * softirq.
-		 */
-		tick_do_update_jiffies64(ktime_get());
+		 if (ts->nohz_mode == NOHZ_MODE_HIGHRES)
+			 hrtimer_start(&ts->sched_timer, expires,
+				       HRTIMER_MODE_ABS_PINNED);
+		 else
+			 tick_program_event(expires, 1);
+	} else {
+		/* Tick is stopped, but required now. Enforce it */
+		tick_nohz_restart(ts, now);
 	}
-	raise_softirq_irqoff(TIMER_SOFTIRQ);
+
 out:
 	ts->next_jiffies = next_jiffies;
 	ts->last_jiffies = last_jiffies;
@@ -874,30 +884,6 @@ ktime_t tick_nohz_get_sleep_length(void)
 	return ts->sleep_length;
 }
 
-static void tick_nohz_restart(struct tick_sched *ts, ktime_t now)
-{
-	hrtimer_cancel(&ts->sched_timer);
-	hrtimer_set_expires(&ts->sched_timer, ts->last_tick);
-
-	while (1) {
-		/* Forward the time to expire in the future */
-		hrtimer_forward(&ts->sched_timer, now, tick_period);
-
-		if (ts->nohz_mode == NOHZ_MODE_HIGHRES) {
-			hrtimer_start_expires(&ts->sched_timer,
-					      HRTIMER_MODE_ABS_PINNED);
-				break;
-		} else {
-			if (!tick_program_event(
-				hrtimer_get_expires(&ts->sched_timer), 0))
-				break;
-		}
-		/* Reread time and update jiffies */
-		now = ktime_get();
-		tick_do_update_jiffies64(now);
-	}
-}
-
 static void tick_nohz_restart_sched_tick(struct tick_sched *ts, ktime_t now)
 {
 	/* Update jiffies first */
@@ -968,12 +954,6 @@ void tick_nohz_idle_exit(void)
 	local_irq_enable();
 }
 
-static int tick_nohz_reprogram(struct tick_sched *ts, ktime_t now)
-{
-	hrtimer_forward(&ts->sched_timer, now, tick_period);
-	return tick_program_event(hrtimer_get_expires(&ts->sched_timer), 0);
-}
-
 /*
  * The nohz low res interrupt handler
  */
@@ -992,10 +972,8 @@ static void tick_nohz_handler(struct clock_event_device *dev)
 	if (unlikely(ts->tick_stopped))
 		return;
 
-	while (tick_nohz_reprogram(ts, now)) {
-		now = ktime_get();
-		tick_do_update_jiffies64(now);
-	}
+	hrtimer_forward(&ts->sched_timer, now, tick_period);
+	tick_program_event(hrtimer_get_expires(&ts->sched_timer), 1);
 }
 
 /**
@@ -1025,12 +1003,9 @@ static void tick_nohz_switch_to_nohz(void)
 	/* Get the next period */
 	next = tick_init_jiffy_update();
 
-	for (;;) {
-		hrtimer_set_expires(&ts->sched_timer, next);
-		if (!tick_program_event(next, 0))
-			break;
-		next = ktime_add(next, tick_period);
-	}
+	hrtimer_forward_now(&ts->sched_timer, tick_period);
+	hrtimer_set_expires(&ts->sched_timer, next);
+	tick_program_event(next, 1);
 	local_irq_enable();
 }
 

^ permalink raw reply related	[flat|nested] 123+ messages in thread
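
The retry loops removed above existed because a forwarded expiry could already be in the past again by the time it was programmed. With forced programming, a single forward step is enough. The forward step itself can be sketched as below. This is a simplified model of what hrtimer_forward() does, using plain 64-bit nanosecond values; the name toy_forward and the omission of the kernel's minimum-interval and overflow handling are assumptions of this sketch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Advance *expires by whole multiples of @interval until it lies
 * strictly after @now. Returns the number of intervals added, or 0
 * if *expires was already in the future and nothing was changed.
 */
static uint64_t toy_forward(int64_t *expires, int64_t now, int64_t interval)
{
	int64_t delta = now - *expires;
	uint64_t overruns;

	assert(interval > 0);
	if (delta < 0)
		return 0;	/* still in the future, nothing to do */

	/* Skip all fully elapsed periods plus one to get past @now. */
	overruns = (uint64_t)(delta / interval) + 1;
	*expires += (int64_t)overruns * interval;
	return overruns;
}
```

After this step the expiry is later than the time that was sampled as now. If the clock has moved past it anyway before the device is armed, the forced programming path covers that case, so no loop around the two calls is required.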

* [tip:timers/core] tick: Sched: Restructure code
  2015-04-14 21:08 ` [patch 19/39] tick: sched: Restructure code Thomas Gleixner
@ 2015-04-22 19:09   ` tip-bot for Thomas Gleixner
  0 siblings, 0 replies; 123+ messages in thread
From: tip-bot for Thomas Gleixner @ 2015-04-22 19:09 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: viresh.kumar, tglx, mingo, john.stultz, preeti, fweisbec, peterz,
	mtosatti, linux-kernel, hpa

Commit-ID:  157d29e101c7d032e886df067aeea1b21a366cc5
Gitweb:     http://git.kernel.org/tip/157d29e101c7d032e886df067aeea1b21a366cc5
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 14 Apr 2015 21:08:56 +0000
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Wed, 22 Apr 2015 17:06:50 +0200

tick: Sched: Restructure code

Get rid of one indentation level. Preparatory patch for a major
rework. No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Link: http://lkml.kernel.org/r/20150414203502.101563235@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

---
 kernel/time/tick-sched.c | 171 ++++++++++++++++++++++-------------------------
 1 file changed, 81 insertions(+), 90 deletions(-)

diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 0f07ff2..4c5f4a9 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -611,112 +611,103 @@ static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts,
 		}
 	}
 
+	if ((long)delta_jiffies <= 1) {
+		if (!ts->tick_stopped)
+			goto out;
+		if (delta_jiffies == 0) {
+			/* Tick is stopped, but required now. Enforce it */
+			tick_nohz_restart(ts, now);
+			goto out;
+		}
+	}
+
 	/*
-	 * Do not stop the tick, if we are only one off (or less)
-	 * or if the cpu is required for RCU:
+	 * If this cpu is the one which updates jiffies, then give up
+	 * the assignment and let it be taken by the cpu which runs
+	 * the tick timer next, which might be this cpu as well. If we
+	 * don't drop this here the jiffies might be stale and
+	 * do_timer() never invoked. Keep track of the fact that it
+	 * was the one which had the do_timer() duty last. If this cpu
+	 * is the one which had the do_timer() duty last, we limit the
+	 * sleep time to the timekeeping max_deferement value which we
+	 * retrieved above. Otherwise we can sleep as long as we want.
 	 */
-	if (!ts->tick_stopped && delta_jiffies <= 1)
-		goto out;
-
-	/* Schedule the tick, if we are at least one jiffie off */
-	if ((long)delta_jiffies >= 1) {
-
-		/*
-		 * If this cpu is the one which updates jiffies, then
-		 * give up the assignment and let it be taken by the
-		 * cpu which runs the tick timer next, which might be
-		 * this cpu as well. If we don't drop this here the
-		 * jiffies might be stale and do_timer() never
-		 * invoked. Keep track of the fact that it was the one
-		 * which had the do_timer() duty last. If this cpu is
-		 * the one which had the do_timer() duty last, we
-		 * limit the sleep time to the timekeeping
-		 * max_deferement value which we retrieved
-		 * above. Otherwise we can sleep as long as we want.
-		 */
-		if (cpu == tick_do_timer_cpu) {
-			tick_do_timer_cpu = TICK_DO_TIMER_NONE;
-			ts->do_timer_last = 1;
-		} else if (tick_do_timer_cpu != TICK_DO_TIMER_NONE) {
-			time_delta = KTIME_MAX;
-			ts->do_timer_last = 0;
-		} else if (!ts->do_timer_last) {
-			time_delta = KTIME_MAX;
-		}
+	if (cpu == tick_do_timer_cpu) {
+		tick_do_timer_cpu = TICK_DO_TIMER_NONE;
+		ts->do_timer_last = 1;
+	} else if (tick_do_timer_cpu != TICK_DO_TIMER_NONE) {
+		time_delta = KTIME_MAX;
+		ts->do_timer_last = 0;
+	} else if (!ts->do_timer_last) {
+		time_delta = KTIME_MAX;
+	}
 
 #ifdef CONFIG_NO_HZ_FULL
-		if (!ts->inidle) {
-			time_delta = min(time_delta,
-					 scheduler_tick_max_deferment());
-		}
+	if (!ts->inidle)
+		time_delta = min(time_delta, scheduler_tick_max_deferment());
 #endif
 
+	/*
+	 * calculate the expiry time for the next timer wheel
+	 * timer. delta_jiffies >= NEXT_TIMER_MAX_DELTA signals that
+	 * there is no timer pending or at least extremely far into
+	 * the future (12 days for HZ=1000). In this case we set the
+	 * expiry to the end of time.
+	 */
+	if (likely(delta_jiffies < NEXT_TIMER_MAX_DELTA)) {
 		/*
-		 * calculate the expiry time for the next timer wheel
-		 * timer. delta_jiffies >= NEXT_TIMER_MAX_DELTA signals
-		 * that there is no timer pending or at least extremely
-		 * far into the future (12 days for HZ=1000). In this
-		 * case we set the expiry to the end of time.
+		 * Calculate the time delta for the next timer event.
+		 * If the time delta exceeds the maximum time delta
+		 * permitted by the current clocksource then adjust
+		 * the time delta accordingly to ensure the
+		 * clocksource does not wrap.
 		 */
-		if (likely(delta_jiffies < NEXT_TIMER_MAX_DELTA)) {
-			/*
-			 * Calculate the time delta for the next timer event.
-			 * If the time delta exceeds the maximum time delta
-			 * permitted by the current clocksource then adjust
-			 * the time delta accordingly to ensure the
-			 * clocksource does not wrap.
-			 */
-			time_delta = min_t(u64, time_delta,
-					   tick_period.tv64 * delta_jiffies);
-		}
-
-		if (time_delta < KTIME_MAX)
-			expires = ktime_add_ns(last_update, time_delta);
-		else
-			expires.tv64 = KTIME_MAX;
+		time_delta = min_t(u64, time_delta,
+				   tick_period.tv64 * delta_jiffies);
+	}
 
-		/* Skip reprogram of event if its not changed */
-		if (ts->tick_stopped && ktime_equal(expires, dev->next_event))
-			goto out;
+	if (time_delta < KTIME_MAX)
+		expires = ktime_add_ns(last_update, time_delta);
+	else
+		expires.tv64 = KTIME_MAX;
 
-		ret = expires;
+	/* Skip reprogram of event if its not changed */
+	if (ts->tick_stopped && ktime_equal(expires, dev->next_event))
+		goto out;
 
-		/*
-		 * nohz_stop_sched_tick can be called several times before
-		 * the nohz_restart_sched_tick is called. This happens when
-		 * interrupts arrive which do not cause a reschedule. In the
-		 * first call we save the current tick time, so we can restart
-		 * the scheduler tick in nohz_restart_sched_tick.
-		 */
-		if (!ts->tick_stopped) {
-			nohz_balance_enter_idle(cpu);
-			calc_load_enter_idle();
+	ret = expires;
 
-			ts->last_tick = hrtimer_get_expires(&ts->sched_timer);
-			ts->tick_stopped = 1;
-			trace_tick_stop(1, " ");
-		}
+	/*
+	 * nohz_stop_sched_tick can be called several times before
+	 * the nohz_restart_sched_tick is called. This happens when
+	 * interrupts arrive which do not cause a reschedule. In the
+	 * first call we save the current tick time, so we can restart
+	 * the scheduler tick in nohz_restart_sched_tick.
+	 */
+	if (!ts->tick_stopped) {
+		nohz_balance_enter_idle(cpu);
+		calc_load_enter_idle();
 
-		/*
-		 * If the expiration time == KTIME_MAX, then
-		 * in this case we simply stop the tick timer.
-		 */
-		 if (unlikely(expires.tv64 == KTIME_MAX)) {
-			if (ts->nohz_mode == NOHZ_MODE_HIGHRES)
-				hrtimer_cancel(&ts->sched_timer);
-			goto out;
-		 }
+		ts->last_tick = hrtimer_get_expires(&ts->sched_timer);
+		ts->tick_stopped = 1;
+		trace_tick_stop(1, " ");
+	}
 
-		 if (ts->nohz_mode == NOHZ_MODE_HIGHRES)
-			 hrtimer_start(&ts->sched_timer, expires,
-				       HRTIMER_MODE_ABS_PINNED);
-		 else
-			 tick_program_event(expires, 1);
-	} else {
-		/* Tick is stopped, but required now. Enforce it */
-		tick_nohz_restart(ts, now);
+	/*
+	 * If the expiration time == KTIME_MAX, then
+	 * in this case we simply stop the tick timer.
+	 */
+	if (unlikely(expires.tv64 == KTIME_MAX)) {
+		if (ts->nohz_mode == NOHZ_MODE_HIGHRES)
+			hrtimer_cancel(&ts->sched_timer);
+		goto out;
 	}
 
+	if (ts->nohz_mode == NOHZ_MODE_HIGHRES)
+		hrtimer_start(&ts->sched_timer, expires,
+			      HRTIMER_MODE_ABS_PINNED);
+	else
+		tick_program_event(expires, 1);
 out:
 	ts->next_jiffies = next_jiffies;
 	ts->last_jiffies = last_jiffies;

^ permalink raw reply related	[flat|nested] 123+ messages in thread

* [tip:timers/core] tick: Nohz: Rework next timer evaluation
  2015-04-14 21:08 ` [patch 20/39] tick: nohz: Rework next timer evaluation Thomas Gleixner
  2015-04-16 16:42   ` Paul E. McKenney
@ 2015-04-22 19:10   ` tip-bot for Thomas Gleixner
  1 sibling, 0 replies; 123+ messages in thread
From: tip-bot for Thomas Gleixner @ 2015-04-22 19:10 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: hpa, laijs, peterz, mingo, john.stultz, linux-kernel, paulmck,
	mtosatti, viresh.kumar, tglx, fweisbec, preeti, josh

Commit-ID:  c1ad348b452aacd784fb97403d03d71723c72ee1
Gitweb:     http://git.kernel.org/tip/c1ad348b452aacd784fb97403d03d71723c72ee1
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 14 Apr 2015 21:08:58 +0000
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Wed, 22 Apr 2015 17:06:50 +0200

tick: Nohz: Rework next timer evaluation

The evaluation of the next timer in the nohz code is based on jiffies
while all the tick internals are nanosecond based. We also have to
convert hrtimer nanoseconds to jiffies in the !highres case. That's
just wrong and introduces interesting corner cases.

Turn it around and convert the next timer wheel timer expiry and the
rcu event to clock monotonic and base all calculations on
nanoseconds. That clearly identifies the case where no timer is
pending with an absolute expiry value of KTIME_MAX.

This makes the code more readable and gets rid of the jiffies magic in
the nohz code.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: John Stultz <john.stultz@linaro.org>
Link: http://lkml.kernel.org/r/20150414203502.184198593@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

---
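
Note for readers: the jiffies to nanoseconds conversion described in the
changelog can be sketched in plain userspace C. This is a minimal model
under stated assumptions, not the kernel code: next_wheel_expiry_ns(),
round_up_to_tick() and clamp_expiry() are hypothetical names made up for
illustration, KTIME_MAX_NS stands in for KTIME_MAX, and the tick length
(TICK_NSEC in the kernel) is passed in as a parameter. Jiffies wraparound,
which the kernel handles with time_before_eq(), is ignored.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the kernel's KTIME_MAX (largest signed 64-bit value). */
#define KTIME_MAX_NS ((uint64_t)INT64_MAX)

/*
 * Model of the timer wheel part of get_next_timer_interrupt():
 * translate an absolute jiffies expiry into clock monotonic
 * nanoseconds relative to the (basej, basem) snapshot. An already
 * expired timer maps to basem. Jiffies wraparound is ignored here.
 */
static uint64_t next_wheel_expiry_ns(uint64_t basej, uint64_t basem,
				     uint64_t nextevt_j, uint64_t tick_nsec)
{
	assert(tick_nsec > 0);
	if (nextevt_j <= basej)
		return basem;
	return basem + (nextevt_j - basej) * tick_nsec;
}

/* Model of the round-up step in cmp_next_hrtimer_event(). */
static uint64_t round_up_to_tick(uint64_t nextevt, uint64_t tick_nsec)
{
	assert(tick_nsec > 0);
	return (nextevt + tick_nsec - 1) / tick_nsec * tick_nsec;
}

/*
 * Model of the expiry calculation in tick_nohz_stop_sched_tick():
 * add delta to basemono without overflowing past KTIME_MAX_NS, then
 * take the earlier of that value and next_tick.
 */
static uint64_t clamp_expiry(uint64_t basemono, uint64_t delta,
			     uint64_t next_tick)
{
	uint64_t expires;

	assert(basemono <= KTIME_MAX_NS);
	if (delta < KTIME_MAX_NS - basemono)
		expires = basemono + delta;
	else
		expires = KTIME_MAX_NS;
	return expires < next_tick ? expires : next_tick;
}
```

With a tick length of 1000000 ns (what HZ=1000 would give), a wheel timer
three jiffies after basej = 100 with basem = 5000000 maps to 8000000 ns.
A "no timer pending" result stays at KTIME_MAX_NS through clamp_expiry(),
which is the case the changelog says is now identified clearly.
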
 include/linux/hrtimer.h     |   2 +-
 include/linux/rcupdate.h    |   6 ++-
 include/linux/rcutree.h     |   2 +-
 include/linux/timer.h       |   7 ---
 kernel/rcu/tree_plugin.h    |  14 +++---
 kernel/time/hrtimer.c       |  14 ++----
 kernel/time/tick-internal.h |   2 +
 kernel/time/tick-sched.c    | 109 ++++++++++++++++++++------------------------
 kernel/time/tick-sched.h    |   2 +-
 kernel/time/timer.c         |  71 ++++++++++++++---------------
 kernel/time/timer_list.c    |   4 +-
 11 files changed, 107 insertions(+), 126 deletions(-)

diff --git a/include/linux/hrtimer.h b/include/linux/hrtimer.h
index 048270a..2c68f71 100644
--- a/include/linux/hrtimer.h
+++ b/include/linux/hrtimer.h
@@ -386,7 +386,7 @@ static inline int hrtimer_restart(struct hrtimer *timer)
 /* Query timers: */
 extern ktime_t hrtimer_get_remaining(const struct hrtimer *timer);
 
-extern ktime_t hrtimer_get_next_event(void);
+extern u64 hrtimer_get_next_event(void);
 
 /*
  * A timer is active, when it is enqueued into the rbtree or the
diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 573a5af..0627a44 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -44,6 +44,8 @@
 #include <linux/debugobjects.h>
 #include <linux/bug.h>
 #include <linux/compiler.h>
+#include <linux/ktime.h>
+
 #include <asm/barrier.h>
 
 extern int rcu_expedited; /* for sysctl */
@@ -1154,9 +1156,9 @@ static inline notrace void rcu_read_unlock_sched_notrace(void)
 	__kfree_rcu(&((ptr)->rcu_head), offsetof(typeof(*(ptr)), rcu_head))
 
 #if defined(CONFIG_TINY_RCU) || defined(CONFIG_RCU_NOCB_CPU_ALL)
-static inline int rcu_needs_cpu(unsigned long *delta_jiffies)
+static inline int rcu_needs_cpu(u64 basemono, u64 *nextevt)
 {
-	*delta_jiffies = ULONG_MAX;
+	*nextevt = KTIME_MAX;
 	return 0;
 }
 #endif /* #if defined(CONFIG_TINY_RCU) || defined(CONFIG_RCU_NOCB_CPU_ALL) */
diff --git a/include/linux/rcutree.h b/include/linux/rcutree.h
index d2e583a..db2e31b 100644
--- a/include/linux/rcutree.h
+++ b/include/linux/rcutree.h
@@ -32,7 +32,7 @@
 
 void rcu_note_context_switch(void);
 #ifndef CONFIG_RCU_NOCB_CPU_ALL
-int rcu_needs_cpu(unsigned long *delta_jiffies);
+int rcu_needs_cpu(u64 basem, u64 *nextevt);
 #endif /* #ifndef CONFIG_RCU_NOCB_CPU_ALL */
 void rcu_cpu_stall_reset(void);
 
diff --git a/include/linux/timer.h b/include/linux/timer.h
index 8c5a197..fbb80e0 100644
--- a/include/linux/timer.h
+++ b/include/linux/timer.h
@@ -188,13 +188,6 @@ extern void set_timer_slack(struct timer_list *time, int slack_hz);
 #define NEXT_TIMER_MAX_DELTA	((1UL << 30) - 1)
 
 /*
- * Return when the next timer-wheel timeout occurs (in absolute jiffies),
- * locks the timer base and does the comparison against the given
- * jiffie.
- */
-extern unsigned long get_next_timer_interrupt(unsigned long now);
-
-/*
  * Timer-statistics info:
  */
 #ifdef CONFIG_TIMER_STATS
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 8c0ec0f..0ef80a0 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -1368,9 +1368,9 @@ static void rcu_prepare_kthreads(int cpu)
  * any flavor of RCU.
  */
 #ifndef CONFIG_RCU_NOCB_CPU_ALL
-int rcu_needs_cpu(unsigned long *delta_jiffies)
+int rcu_needs_cpu(u64 basemono, u64 *nextevt)
 {
-	*delta_jiffies = ULONG_MAX;
+	*nextevt = KTIME_MAX;
 	return rcu_cpu_has_callbacks(NULL);
 }
 #endif /* #ifndef CONFIG_RCU_NOCB_CPU_ALL */
@@ -1481,16 +1481,17 @@ static bool __maybe_unused rcu_try_advance_all_cbs(void)
  * The caller must have disabled interrupts.
  */
 #ifndef CONFIG_RCU_NOCB_CPU_ALL
-int rcu_needs_cpu(unsigned long *dj)
+int rcu_needs_cpu(u64 basemono, u64 *nextevt)
 {
 	struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks);
+	unsigned long dj;
 
 	/* Snapshot to detect later posting of non-lazy callback. */
 	rdtp->nonlazy_posted_snap = rdtp->nonlazy_posted;
 
 	/* If no callbacks, RCU doesn't need the CPU. */
 	if (!rcu_cpu_has_callbacks(&rdtp->all_lazy)) {
-		*dj = ULONG_MAX;
+		*nextevt = KTIME_MAX;
 		return 0;
 	}
 
@@ -1504,11 +1505,12 @@ int rcu_needs_cpu(unsigned long *dj)
 
 	/* Request timer delay depending on laziness, and round. */
 	if (!rdtp->all_lazy) {
-		*dj = round_up(rcu_idle_gp_delay + jiffies,
+		dj = round_up(rcu_idle_gp_delay + jiffies,
 			       rcu_idle_gp_delay) - jiffies;
 	} else {
-		*dj = round_jiffies(rcu_idle_lazy_gp_delay + jiffies) - jiffies;
+		dj = round_jiffies(rcu_idle_lazy_gp_delay + jiffies) - jiffies;
 	}
+	*nextevt = basemono + dj * TICK_NSEC;
 	return 0;
 }
 #endif /* #ifndef CONFIG_RCU_NOCB_CPU_ALL */
diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
index fc6b6d2..179b991 100644
--- a/kernel/time/hrtimer.c
+++ b/kernel/time/hrtimer.c
@@ -1080,26 +1080,22 @@ EXPORT_SYMBOL_GPL(hrtimer_get_remaining);
 /**
  * hrtimer_get_next_event - get the time until next expiry event
  *
- * Returns the delta to the next expiry event or KTIME_MAX if no timer
- * is pending.
+ * Returns the next expiry time or KTIME_MAX if no timer is pending.
  */
-ktime_t hrtimer_get_next_event(void)
+u64 hrtimer_get_next_event(void)
 {
 	struct hrtimer_cpu_base *cpu_base = this_cpu_ptr(&hrtimer_bases);
-	ktime_t mindelta = { .tv64 = KTIME_MAX };
+	u64 expires = KTIME_MAX;
 	unsigned long flags;
 
 	raw_spin_lock_irqsave(&cpu_base->lock, flags);
 
 	if (!__hrtimer_hres_active(cpu_base))
-		mindelta = ktime_sub(__hrtimer_get_next_event(cpu_base),
-				     ktime_get());
+		expires = __hrtimer_get_next_event(cpu_base).tv64;
 
 	raw_spin_unlock_irqrestore(&cpu_base->lock, flags);
 
-	if (mindelta.tv64 < 0)
-		mindelta.tv64 = 0;
-	return mindelta;
+	return expires;
 }
 #endif
 
diff --git a/kernel/time/tick-internal.h b/kernel/time/tick-internal.h
index b64fdd8..65273f0 100644
--- a/kernel/time/tick-internal.h
+++ b/kernel/time/tick-internal.h
@@ -137,3 +137,5 @@ extern void tick_nohz_init(void);
 # else
 static inline void tick_nohz_init(void) { }
 #endif
+
+extern u64 get_next_timer_interrupt(unsigned long basej, u64 basem);
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 4c5f4a9..753c211 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -582,39 +582,46 @@ static void tick_nohz_restart(struct tick_sched *ts, ktime_t now)
 static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts,
 					 ktime_t now, int cpu)
 {
-	unsigned long seq, last_jiffies, next_jiffies, delta_jiffies;
-	ktime_t last_update, expires, ret = { .tv64 = 0 };
-	unsigned long rcu_delta_jiffies;
 	struct clock_event_device *dev = __this_cpu_read(tick_cpu_device.evtdev);
-	u64 time_delta;
-
-	time_delta = timekeeping_max_deferment();
+	u64 basemono, next_tick, next_tmr, next_rcu, delta, expires;
+	unsigned long seq, basejiff;
+	ktime_t	tick;
 
 	/* Read jiffies and the time when jiffies were updated last */
 	do {
 		seq = read_seqbegin(&jiffies_lock);
-		last_update = last_jiffies_update;
-		last_jiffies = jiffies;
+		basemono = last_jiffies_update.tv64;
+		basejiff = jiffies;
 	} while (read_seqretry(&jiffies_lock, seq));
+	ts->last_jiffies = basejiff;
 
-	if (rcu_needs_cpu(&rcu_delta_jiffies) ||
+	if (rcu_needs_cpu(basemono, &next_rcu) ||
 	    arch_needs_cpu() || irq_work_needs_cpu()) {
-		next_jiffies = last_jiffies + 1;
-		delta_jiffies = 1;
+		next_tick = basemono + TICK_NSEC;
 	} else {
-		/* Get the next timer wheel timer */
-		next_jiffies = get_next_timer_interrupt(last_jiffies);
-		delta_jiffies = next_jiffies - last_jiffies;
-		if (rcu_delta_jiffies < delta_jiffies) {
-			next_jiffies = last_jiffies + rcu_delta_jiffies;
-			delta_jiffies = rcu_delta_jiffies;
-		}
+		/*
+		 * Get the next pending timer. If high resolution
+		 * timers are enabled this only takes the timer wheel
+		 * timers into account. If high resolution timers are
+		 * disabled this also looks at the next expiring
+		 * hrtimer.
+		 */
+		next_tmr = get_next_timer_interrupt(basejiff, basemono);
+		ts->next_timer = next_tmr;
+		/* Take the next rcu event into account */
+		next_tick = next_rcu < next_tmr ? next_rcu : next_tmr;
 	}
 
-	if ((long)delta_jiffies <= 1) {
+	/*
+	 * If the tick is due in the next period, keep it ticking or
+	 * restart it proper.
+	 */
+	delta = next_tick - basemono;
+	if (delta <= (u64)TICK_NSEC) {
+		tick.tv64 = 0;
 		if (!ts->tick_stopped)
 			goto out;
-		if (delta_jiffies == 0) {
+		if (delta == 0) {
 			/* Tick is stopped, but required now. Enforce it */
 			tick_nohz_restart(ts, now);
 			goto out;
@@ -629,54 +636,39 @@ static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts,
 	 * do_timer() never invoked. Keep track of the fact that it
 	 * was the one which had the do_timer() duty last. If this cpu
 	 * is the one which had the do_timer() duty last, we limit the
-	 * sleep time to the timekeeping max_deferement value which we
-	 * retrieved above. Otherwise we can sleep as long as we want.
+	 * sleep time to the timekeeping max_deferement value.
+	 * Otherwise we can sleep as long as we want.
 	 */
+	delta = timekeeping_max_deferment();
 	if (cpu == tick_do_timer_cpu) {
 		tick_do_timer_cpu = TICK_DO_TIMER_NONE;
 		ts->do_timer_last = 1;
 	} else if (tick_do_timer_cpu != TICK_DO_TIMER_NONE) {
-		time_delta = KTIME_MAX;
+		delta = KTIME_MAX;
 		ts->do_timer_last = 0;
 	} else if (!ts->do_timer_last) {
-		time_delta = KTIME_MAX;
+		delta = KTIME_MAX;
 	}
 
 #ifdef CONFIG_NO_HZ_FULL
+	/* Limit the tick delta to the maximum scheduler deferment */
 	if (!ts->inidle)
-		time_delta = min(time_delta, scheduler_tick_max_deferment());
+		delta = min(delta, scheduler_tick_max_deferment());
 #endif
 
-	/*
-	 * calculate the expiry time for the next timer wheel
-	 * timer. delta_jiffies >= NEXT_TIMER_MAX_DELTA signals that
-	 * there is no timer pending or at least extremely far into
-	 * the future (12 days for HZ=1000). In this case we set the
-	 * expiry to the end of time.
-	 */
-	if (likely(delta_jiffies < NEXT_TIMER_MAX_DELTA)) {
-		/*
-		 * Calculate the time delta for the next timer event.
-		 * If the time delta exceeds the maximum time delta
-		 * permitted by the current clocksource then adjust
-		 * the time delta accordingly to ensure the
-		 * clocksource does not wrap.
-		 */
-		time_delta = min_t(u64, time_delta,
-				   tick_period.tv64 * delta_jiffies);
-	}
-
-	if (time_delta < KTIME_MAX)
-		expires = ktime_add_ns(last_update, time_delta);
+	/* Calculate the next expiry time */
+	if (delta < (KTIME_MAX - basemono))
+		expires = basemono + delta;
 	else
-		expires.tv64 = KTIME_MAX;
+		expires = KTIME_MAX;
+
+	expires = min_t(u64, expires, next_tick);
+	tick.tv64 = expires;
 
 	/* Skip reprogram of event if its not changed */
-	if (ts->tick_stopped && ktime_equal(expires, dev->next_event))
+	if (ts->tick_stopped && (expires == dev->next_event.tv64))
 		goto out;
 
-	ret = expires;
-
 	/*
 	 * nohz_stop_sched_tick can be called several times before
 	 * the nohz_restart_sched_tick is called. This happens when
@@ -694,26 +686,23 @@ static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts,
 	}
 
 	/*
-	 * If the expiration time == KTIME_MAX, then
-	 * in this case we simply stop the tick timer.
+	 * If the expiration time == KTIME_MAX, then we simply stop
+	 * the tick timer.
 	 */
-	if (unlikely(expires.tv64 == KTIME_MAX)) {
+	if (unlikely(expires == KTIME_MAX)) {
 		if (ts->nohz_mode == NOHZ_MODE_HIGHRES)
 			hrtimer_cancel(&ts->sched_timer);
 		goto out;
 	}
 
 	if (ts->nohz_mode == NOHZ_MODE_HIGHRES)
-		hrtimer_start(&ts->sched_timer, expires,
-			      HRTIMER_MODE_ABS_PINNED);
+		hrtimer_start(&ts->sched_timer, tick, HRTIMER_MODE_ABS_PINNED);
 	else
-		tick_program_event(expires, 1);
+		tick_program_event(tick, 1);
 out:
-	ts->next_jiffies = next_jiffies;
-	ts->last_jiffies = last_jiffies;
+	/* Update the estimated sleep length */
 	ts->sleep_length = ktime_sub(dev->next_event, now);
-
-	return ret;
+	return tick;
 }
 
 static void tick_nohz_full_stop_tick(struct tick_sched *ts)
diff --git a/kernel/time/tick-sched.h b/kernel/time/tick-sched.h
index 28b5da3..42fdf49 100644
--- a/kernel/time/tick-sched.h
+++ b/kernel/time/tick-sched.h
@@ -57,7 +57,7 @@ struct tick_sched {
 	ktime_t				iowait_sleeptime;
 	ktime_t				sleep_length;
 	unsigned long			last_jiffies;
-	unsigned long			next_jiffies;
+	u64				next_timer;
 	ktime_t				idle_expires;
 	int				do_timer_last;
 };
diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index b31f13f..172b83c 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -49,6 +49,8 @@
 #include <asm/timex.h>
 #include <asm/io.h>
 
+#include "tick-internal.h"
+
 #define CREATE_TRACE_POINTS
 #include <trace/events/timer.h>
 
@@ -1311,54 +1313,48 @@ cascade:
  * Check, if the next hrtimer event is before the next timer wheel
  * event:
  */
-static unsigned long cmp_next_hrtimer_event(unsigned long now,
-					    unsigned long expires)
+static u64 cmp_next_hrtimer_event(u64 basem, u64 expires)
 {
-	ktime_t hr_delta = hrtimer_get_next_event();
-	struct timespec tsdelta;
-	unsigned long delta;
-
-	if (hr_delta.tv64 == KTIME_MAX)
-		return expires;
+	u64 nextevt = hrtimer_get_next_event();
 
 	/*
-	 * Expired timer available, let it expire in the next tick
+	 * If high resolution timers are enabled
+	 * hrtimer_get_next_event() returns KTIME_MAX.
 	 */
-	if (hr_delta.tv64 <= 0)
-		return now + 1;
-
-	tsdelta = ktime_to_timespec(hr_delta);
-	delta = timespec_to_jiffies(&tsdelta);
+	if (expires <= nextevt)
+		return expires;
 
 	/*
-	 * Limit the delta to the max value, which is checked in
-	 * tick_nohz_stop_sched_tick():
+	 * If the next timer is already expired, return the tick base
+	 * time so the tick is fired immediately.
 	 */
-	if (delta > NEXT_TIMER_MAX_DELTA)
-		delta = NEXT_TIMER_MAX_DELTA;
+	if (nextevt <= basem)
+		return basem;
 
 	/*
-	 * Take rounding errors in to account and make sure, that it
-	 * expires in the next tick. Otherwise we go into an endless
-	 * ping pong due to tick_nohz_stop_sched_tick() retriggering
-	 * the timer softirq
+	 * Round up to the next jiffie. High resolution timers are
+	 * off, so the hrtimers are expired in the tick and we need to
+	 * make sure that this tick really expires the timer to avoid
+	 * a ping pong of the nohz stop code.
+	 *
+	 * Use DIV_ROUND_UP_ULL to prevent gcc calling __divdi3
 	 */
-	if (delta < 1)
-		delta = 1;
-	now += delta;
-	if (time_before(now, expires))
-		return now;
-	return expires;
+	return DIV_ROUND_UP_ULL(nextevt, TICK_NSEC) * TICK_NSEC;
 }
 
 /**
- * get_next_timer_interrupt - return the jiffy of the next pending timer
- * @now: current time (in jiffies)
+ * get_next_timer_interrupt - return the time (clock mono) of the next timer
+ * @basej:	base time jiffies
+ * @basem:	base time clock monotonic
+ *
+ * Returns the tick aligned clock monotonic time of the next pending
+ * timer or KTIME_MAX if no timer is pending.
  */
-unsigned long get_next_timer_interrupt(unsigned long now)
+u64 get_next_timer_interrupt(unsigned long basej, u64 basem)
 {
 	struct tvec_base *base = __this_cpu_read(tvec_bases);
-	unsigned long expires = now + NEXT_TIMER_MAX_DELTA;
+	u64 expires = KTIME_MAX;
+	unsigned long nextevt;
 
 	/*
 	 * Pretend that there is no timer pending if the cpu is offline.
@@ -1371,14 +1367,15 @@ unsigned long get_next_timer_interrupt(unsigned long now)
 	if (base->active_timers) {
 		if (time_before_eq(base->next_timer, base->timer_jiffies))
 			base->next_timer = __next_timer_interrupt(base);
-		expires = base->next_timer;
+		nextevt = base->next_timer;
+		if (time_before_eq(nextevt, basej))
+			expires = basem;
+		else
+			expires = basem + (nextevt - basej) * TICK_NSEC;
 	}
 	spin_unlock(&base->lock);
 
-	if (time_before_eq(expires, now))
-		return now;
-
-	return cmp_next_hrtimer_event(now, expires);
+	return cmp_next_hrtimer_event(basem, expires);
 }
 #endif
 
diff --git a/kernel/time/timer_list.c b/kernel/time/timer_list.c
index 6232fc5..66f39bb 100644
--- a/kernel/time/timer_list.c
+++ b/kernel/time/timer_list.c
@@ -191,7 +191,7 @@ static void print_cpu(struct seq_file *m, int cpu, u64 now)
 		P_ns(idle_sleeptime);
 		P_ns(iowait_sleeptime);
 		P(last_jiffies);
-		P(next_jiffies);
+		P(next_timer);
 		P_ns(idle_expires);
 		SEQ_printf(m, "jiffies: %Lu\n",
 			   (unsigned long long)jiffies);
@@ -289,7 +289,7 @@ static void timer_list_show_tickdevices_header(struct seq_file *m)
 
 static inline void timer_list_header(struct seq_file *m, u64 now)
 {
-	SEQ_printf(m, "Timer List Version: v0.7\n");
+	SEQ_printf(m, "Timer List Version: v0.8\n");
 	SEQ_printf(m, "HRTIMER_MAX_CLOCK_BASES: %d\n", HRTIMER_MAX_CLOCK_BASES);
 	SEQ_printf(m, "now at %Ld nsecs\n", (unsigned long long)now);
 	SEQ_printf(m, "\n");

^ permalink raw reply related	[flat|nested] 123+ messages in thread

* [tip:timers/core] x86: perf: Use hrtimer_start()
  2015-04-14 21:09 ` [patch 21/39] x86: perf: Use hrtimer_start() Thomas Gleixner
@ 2015-04-22 19:10   ` tip-bot for Thomas Gleixner
  0 siblings, 0 replies; 123+ messages in thread
From: tip-bot for Thomas Gleixner @ 2015-04-22 19:10 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: peterz, fweisbec, tglx, viresh.kumar, mingo, preeti, mtosatti,
	linux-kernel, hpa

Commit-ID:  514c2304b4574dae28c3d7c0ad6b9cf296994140
Gitweb:     http://git.kernel.org/tip/514c2304b4574dae28c3d7c0ad6b9cf296994140
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 14 Apr 2015 21:09:00 +0000
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Wed, 22 Apr 2015 17:06:50 +0200

x86: perf: Use hrtimer_start()

hrtimer_start() no longer defers already expired timers to the
softirq. Get rid of the __hrtimer_start_range_ns() invocation.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: x86@kernel.org
Link: http://lkml.kernel.org/r/20150414203502.260487331@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/cpu/perf_event_intel_rapl.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_intel_rapl.c b/arch/x86/kernel/cpu/perf_event_intel_rapl.c
index 999289b9..10190c0 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_rapl.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_rapl.c
@@ -204,9 +204,8 @@ again:
 
 static void rapl_start_hrtimer(struct rapl_pmu *pmu)
 {
-	__hrtimer_start_range_ns(&pmu->hrtimer,
-			pmu->timer_interval, 0,
-			HRTIMER_MODE_REL_PINNED, 0);
+       hrtimer_start(&pmu->hrtimer, pmu->timer_interval,
+		     HRTIMER_MODE_REL_PINNED);
 }
 
 static void rapl_stop_hrtimer(struct rapl_pmu *pmu)

^ permalink raw reply related	[flat|nested] 123+ messages in thread

* [tip:timers/core] x86: perf: uncore: Use hrtimer_start()
  2015-04-14 21:09 ` [patch 22/39] x86: perf: uncore: " Thomas Gleixner
@ 2015-04-22 19:10   ` tip-bot for Thomas Gleixner
  0 siblings, 0 replies; 123+ messages in thread
From: tip-bot for Thomas Gleixner @ 2015-04-22 19:10 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: fweisbec, mtosatti, mingo, linux-kernel, preeti, viresh.kumar,
	tglx, peterz, hpa

Commit-ID:  576b0704c9def6d54b3ae9e13b0b7567c713f568
Gitweb:     http://git.kernel.org/tip/576b0704c9def6d54b3ae9e13b0b7567c713f568
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 14 Apr 2015 21:09:01 +0000
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Wed, 22 Apr 2015 17:06:50 +0200

x86: perf: uncore: Use hrtimer_start()

hrtimer_start() no longer defers already expired timers to the
softirq. Get rid of the __hrtimer_start_range_ns() invocation.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: x86@kernel.org
Link: http://lkml.kernel.org/r/20150414203502.360555157@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/cpu/perf_event_intel_uncore.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_intel_uncore.c b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
index c635b8b..7c411f0 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_uncore.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
@@ -233,9 +233,8 @@ static enum hrtimer_restart uncore_pmu_hrtimer(struct hrtimer *hrtimer)
 
 void uncore_pmu_start_hrtimer(struct intel_uncore_box *box)
 {
-	__hrtimer_start_range_ns(&box->hrtimer,
-			ns_to_ktime(box->hrtimer_duration), 0,
-			HRTIMER_MODE_REL_PINNED, 0);
+	hrtimer_start(&box->hrtimer, ns_to_ktime(box->hrtimer_duration),
+		      HRTIMER_MODE_REL_PINNED);
 }
 
 void uncore_pmu_cancel_hrtimer(struct intel_uncore_box *box)

^ permalink raw reply related	[flat|nested] 123+ messages in thread

* [tip:timers/core] perf: core: Use hrtimer_start()
  2015-04-14 21:09 ` [patch 23/39] perf: core: " Thomas Gleixner
@ 2015-04-22 19:11   ` tip-bot for Thomas Gleixner
  0 siblings, 0 replies; 123+ messages in thread
From: tip-bot for Thomas Gleixner @ 2015-04-22 19:11 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: fweisbec, preeti, peterz, tglx, viresh.kumar, mingo, hpa,
	linux-kernel, mtosatti

Commit-ID:  3497d206c4d9b266d2e56c8b20e51b2f0e6a3c72
Gitweb:     http://git.kernel.org/tip/3497d206c4d9b266d2e56c8b20e51b2f0e6a3c72
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 14 Apr 2015 21:09:03 +0000
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Wed, 22 Apr 2015 17:06:51 +0200

perf: core: Use hrtimer_start()

hrtimer_start() no longer defers already expired timers to the
softirq. Get rid of the __hrtimer_start_range_ns() invocation.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/20150414203502.452104213@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/events/core.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 81aa3a4..05309fd 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -834,9 +834,7 @@ static void perf_cpu_hrtimer_restart(struct perf_cpu_context *cpuctx)
 	if (hrtimer_active(hr))
 		return;
 
-	if (!hrtimer_callback_running(hr))
-		__hrtimer_start_range_ns(hr, cpuctx->hrtimer_interval,
-					 0, HRTIMER_MODE_REL_PINNED, 0);
+	hrtimer_start(hr, cpuctx->hrtimer_interval, HRTIMER_MODE_REL_PINNED);
 }
 
 void perf_pmu_disable(struct pmu *pmu)
@@ -6843,9 +6841,8 @@ static void perf_swevent_start_hrtimer(struct perf_event *event)
 	} else {
 		period = max_t(u64, 10000, hwc->sample_period);
 	}
-	__hrtimer_start_range_ns(&hwc->hrtimer,
-				ns_to_ktime(period), 0,
-				HRTIMER_MODE_REL_PINNED, 0);
+	hrtimer_start(&hwc->hrtimer, ns_to_ktime(period),
+		      HRTIMER_MODE_REL_PINNED);
 }
 
 static void perf_swevent_cancel_hrtimer(struct perf_event *event)

^ permalink raw reply related	[flat|nested] 123+ messages in thread

* [tip:timers/core] sched: core: Use hrtimer_start[_expires]()
  2015-04-14 21:09 ` [patch 24/39] sched: core: Use hrtimer_start[_expires]() Thomas Gleixner
@ 2015-04-22 19:11   ` tip-bot for Thomas Gleixner
  0 siblings, 0 replies; 123+ messages in thread
From: tip-bot for Thomas Gleixner @ 2015-04-22 19:11 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, preeti, hpa, peterz, viresh.kumar, mingo, fweisbec,
	tglx, mtosatti

Commit-ID:  4961b6e11825c2b05b516374b1800fc5dfc2cb78
Gitweb:     http://git.kernel.org/tip/4961b6e11825c2b05b516374b1800fc5dfc2cb78
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 14 Apr 2015 21:09:05 +0000
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Wed, 22 Apr 2015 17:06:51 +0200

sched: core: Use hrtimer_start[_expires]()

hrtimer_start() now enforces a timer interrupt when an already expired
timer is enqueued.

Get rid of the __hrtimer_start_range_ns() invocations and the loops
around them.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/20150414203502.531131739@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
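
Note for readers: start_bandwidth_timer() now forwards the period timer
past the current time once and then starts it, instead of looping. The
forwarding arithmetic can be sketched in userspace C as below. This is a
simplified model, not the kernel's hrtimer_forward(): forward_expiry() is
a hypothetical name, plain int64_t nanoseconds replace ktime_t, and any
minimum-interval handling the kernel may apply is left out.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Advance *expires by whole multiples of interval until it lies
 * strictly after now. Returns the number of intervals added (the
 * overrun count), or 0 if *expires was already in the future.
 */
static uint64_t forward_expiry(int64_t *expires, int64_t now,
			       int64_t interval)
{
	int64_t delta = now - *expires;
	uint64_t orun;

	assert(interval > 0);
	if (delta < 0)
		return 0;

	/* Skip every period that has fully elapsed, plus one more. */
	orun = (uint64_t)(delta / interval) + 1;
	*expires += (int64_t)orun * interval;
	return orun;
}
```

For example, an expiry of 100 with now = 350 and interval = 100 is moved
to 400 with three intervals added. Because the new expiry is always after
now in this model, a single forward followed by a single start is enough.
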
 kernel/sched/core.c | 28 ++++++++--------------------
 kernel/sched/fair.c |  2 +-
 2 files changed, 9 insertions(+), 21 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index f9123a8..3026678 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -92,22 +92,11 @@
 
 void start_bandwidth_timer(struct hrtimer *period_timer, ktime_t period)
 {
-	unsigned long delta;
-	ktime_t soft, hard, now;
-
-	for (;;) {
-		if (hrtimer_active(period_timer))
-			break;
-
-		now = hrtimer_cb_get_time(period_timer);
-		hrtimer_forward(period_timer, now, period);
+	if (hrtimer_active(period_timer))
+		return;
 
-		soft = hrtimer_get_softexpires(period_timer);
-		hard = hrtimer_get_expires(period_timer);
-		delta = ktime_to_ns(ktime_sub(hard, soft));
-		__hrtimer_start_range_ns(period_timer, soft, delta,
-					 HRTIMER_MODE_ABS_PINNED, 0);
-	}
+	hrtimer_forward_now(period_timer, period);
+	hrtimer_start_expires(period_timer, HRTIMER_MODE_ABS_PINNED);
 }
 
 DEFINE_MUTEX(sched_domains_mutex);
@@ -355,12 +344,11 @@ static enum hrtimer_restart hrtick(struct hrtimer *timer)
 
 #ifdef CONFIG_SMP
 
-static int __hrtick_restart(struct rq *rq)
+static void __hrtick_restart(struct rq *rq)
 {
 	struct hrtimer *timer = &rq->hrtick_timer;
-	ktime_t time = hrtimer_get_softexpires(timer);
 
-	return __hrtimer_start_range_ns(timer, time, 0, HRTIMER_MODE_ABS_PINNED, 0);
+	hrtimer_start_expires(timer, HRTIMER_MODE_ABS_PINNED);
 }
 
 /*
@@ -440,8 +428,8 @@ void hrtick_start(struct rq *rq, u64 delay)
 	 * doesn't make sense. Rely on vruntime for fairness.
 	 */
 	delay = max_t(u64, delay, 10000LL);
-	__hrtimer_start_range_ns(&rq->hrtick_timer, ns_to_ktime(delay), 0,
-			HRTIMER_MODE_REL_PINNED, 0);
+	hrtimer_start(&rq->hrtick_timer, ns_to_ktime(delay),
+		      HRTIMER_MODE_REL_PINNED);
 }
 
 static inline void init_hrtick(void)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index ffeaa41..854881b 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3850,7 +3850,7 @@ static const u64 cfs_bandwidth_slack_period = 5 * NSEC_PER_MSEC;
  * Are we near the end of the current quota period?
  *
  * Requires cfs_b->lock for hrtimer_expires_remaining to be safe against the
- * hrtimer base being cleared by __hrtimer_start_range_ns. In the case of
+ * hrtimer base being cleared by hrtimer_start. In the case of
  * migrate_hrtimers, base is never cleared, so we are fine.
  */
 static int runtime_refresh_within(struct cfs_bandwidth *cfs_b, u64 min_expire)


* [tip:timers/core] sched: deadline: Use hrtimer_start()
  2015-04-14 21:09 ` [patch 25/39] sched: deadline: Use hrtimer_start() Thomas Gleixner
@ 2015-04-22 19:11   ` tip-bot for Thomas Gleixner
  0 siblings, 0 replies; 123+ messages in thread
From: tip-bot for Thomas Gleixner @ 2015-04-22 19:11 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: viresh.kumar, preeti, mingo, mtosatti, fweisbec, linux-kernel,
	hpa, peterz, tglx

Commit-ID:  cc9684d3c1188ac5f1cf0ee9f8be7ba456099d7b
Gitweb:     http://git.kernel.org/tip/cc9684d3c1188ac5f1cf0ee9f8be7ba456099d7b
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 14 Apr 2015 21:09:06 +0000
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Wed, 22 Apr 2015 17:06:51 +0200

sched: deadline: Use hrtimer_start()

hrtimer_start() no longer defers already expired timers to the
softirq. Get rid of the __hrtimer_start_range_ns() invocation.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/20150414203502.627353666@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/sched/deadline.c | 12 ++----------
 1 file changed, 2 insertions(+), 10 deletions(-)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 5e95145..21d6907 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -503,8 +503,6 @@ static int start_dl_timer(struct sched_dl_entity *dl_se, bool boosted)
 	struct dl_rq *dl_rq = dl_rq_of_se(dl_se);
 	struct rq *rq = rq_of_dl_rq(dl_rq);
 	ktime_t now, act;
-	ktime_t soft, hard;
-	unsigned long range;
 	s64 delta;
 
 	if (boosted)
@@ -527,15 +525,9 @@ static int start_dl_timer(struct sched_dl_entity *dl_se, bool boosted)
 	if (ktime_us_delta(act, now) < 0)
 		return 0;
 
-	hrtimer_set_expires(&dl_se->dl_timer, act);
+	hrtimer_start(&dl_se->dl_timer, act, HRTIMER_MODE_ABS);
 
-	soft = hrtimer_get_softexpires(&dl_se->dl_timer);
-	hard = hrtimer_get_expires(&dl_se->dl_timer);
-	range = ktime_to_ns(ktime_sub(hard, soft));
-	__hrtimer_start_range_ns(&dl_se->dl_timer, soft,
-				 range, HRTIMER_MODE_ABS, 0);
-
-	return hrtimer_active(&dl_se->dl_timer);
+	return 1;
 }
 
 /*


* [tip:timers/core] hrtimer: Get rid of __hrtimer_start_range_ns()
  2015-04-14 21:09 ` [patch 26/39] hrtimer: Get rid of __hrtimer_start_range_ns() Thomas Gleixner
@ 2015-04-22 19:11   ` tip-bot for Thomas Gleixner
  0 siblings, 0 replies; 123+ messages in thread
From: tip-bot for Thomas Gleixner @ 2015-04-22 19:11 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: hpa, fweisbec, viresh.kumar, peterz, linux-kernel, mtosatti,
	preeti, tglx, mingo

Commit-ID:  58f1f803f1d6ef9ab280de13246d65970a09cb95
Gitweb:     http://git.kernel.org/tip/58f1f803f1d6ef9ab280de13246d65970a09cb95
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 14 Apr 2015 21:09:08 +0000
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Wed, 22 Apr 2015 17:06:51 +0200

hrtimer: Get rid of __hrtimer_start_range_ns()

No more callers. Remove the leftovers.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/20150414203502.707871492@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/hrtimer.h |  4 ----
 kernel/time/hrtimer.c   | 38 +++++++++++++++-----------------------
 2 files changed, 15 insertions(+), 27 deletions(-)

diff --git a/include/linux/hrtimer.h b/include/linux/hrtimer.h
index 2c68f71..a80baa8 100644
--- a/include/linux/hrtimer.h
+++ b/include/linux/hrtimer.h
@@ -359,10 +359,6 @@ extern int hrtimer_start(struct hrtimer *timer, ktime_t tim,
 			 const enum hrtimer_mode mode);
 extern int hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim,
 			unsigned long range_ns, const enum hrtimer_mode mode);
-extern int
-__hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim,
-			 unsigned long delta_ns,
-			 const enum hrtimer_mode mode, int wakeup);
 
 extern int hrtimer_cancel(struct hrtimer *timer);
 extern int hrtimer_try_to_cancel(struct hrtimer *timer);
diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
index 179b991..88d6ea2 100644
--- a/kernel/time/hrtimer.c
+++ b/kernel/time/hrtimer.c
@@ -916,9 +916,20 @@ remove_hrtimer(struct hrtimer *timer, struct hrtimer_clock_base *base)
 	return 0;
 }
 
-int __hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim,
-		unsigned long delta_ns, const enum hrtimer_mode mode,
-		int wakeup)
+/**
+ * hrtimer_start_range_ns - (re)start an hrtimer on the current CPU
+ * @timer:	the timer to be added
+ * @tim:	expiry time
+ * @delta_ns:	"slack" range for the timer
+ * @mode:	expiry mode: absolute (HRTIMER_MODE_ABS) or
+ *		relative (HRTIMER_MODE_REL)
+ *
+ * Returns:
+ *  0 on success
+ *  1 when the timer was active
+ */
+int hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim,
+			   unsigned long delta_ns, const enum hrtimer_mode mode)
 {
 	struct hrtimer_clock_base *base, *new_base;
 	unsigned long flags;
@@ -971,25 +982,6 @@ int __hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim,
 
 	return ret;
 }
-EXPORT_SYMBOL_GPL(__hrtimer_start_range_ns);
-
-/**
- * hrtimer_start_range_ns - (re)start an hrtimer on the current CPU
- * @timer:	the timer to be added
- * @tim:	expiry time
- * @delta_ns:	"slack" range for the timer
- * @mode:	expiry mode: absolute (HRTIMER_MODE_ABS) or
- *		relative (HRTIMER_MODE_REL)
- *
- * Returns:
- *  0 on success
- *  1 when the timer was active
- */
-int hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim,
-		unsigned long delta_ns, const enum hrtimer_mode mode)
-{
-	return __hrtimer_start_range_ns(timer, tim, delta_ns, mode, 1);
-}
 EXPORT_SYMBOL_GPL(hrtimer_start_range_ns);
 
 /**
@@ -1006,7 +998,7 @@ EXPORT_SYMBOL_GPL(hrtimer_start_range_ns);
 int
 hrtimer_start(struct hrtimer *timer, ktime_t tim, const enum hrtimer_mode mode)
 {
-	return __hrtimer_start_range_ns(timer, tim, 0, mode, 1);
+	return hrtimer_start_range_ns(timer, tim, 0, mode);
 }
 EXPORT_SYMBOL_GPL(hrtimer_start);
 


* [tip:timers/core] hrtimer: Make hrtimer_start() a inline wrapper
  2015-04-14 21:09 ` [patch 27/39] hrtimer: Make hrtimer_start() a inline wrapper Thomas Gleixner
@ 2015-04-22 19:12   ` tip-bot for Thomas Gleixner
  0 siblings, 0 replies; 123+ messages in thread
From: tip-bot for Thomas Gleixner @ 2015-04-22 19:12 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: peterz, fweisbec, mtosatti, mingo, tglx, viresh.kumar,
	linux-kernel, preeti, hpa

Commit-ID:  02a171af1a46966dcdb5b38cdc33e4f43e92c778
Gitweb:     http://git.kernel.org/tip/02a171af1a46966dcdb5b38cdc33e4f43e92c778
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 14 Apr 2015 21:09:10 +0000
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Wed, 22 Apr 2015 17:06:51 +0200

hrtimer: Make hrtimer_start() a inline wrapper

No point in an extra export just to set the extra argument of
hrtimer_start_range_ns() to 0.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/20150414203502.808544539@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/hrtimer.h | 19 +++++++++++++++++--
 kernel/time/hrtimer.c   | 19 -------------------
 2 files changed, 17 insertions(+), 21 deletions(-)

diff --git a/include/linux/hrtimer.h b/include/linux/hrtimer.h
index a80baa8..42074ab 100644
--- a/include/linux/hrtimer.h
+++ b/include/linux/hrtimer.h
@@ -355,11 +355,26 @@ static inline void destroy_hrtimer_on_stack(struct hrtimer *timer) { }
 #endif
 
 /* Basic timer operations: */
-extern int hrtimer_start(struct hrtimer *timer, ktime_t tim,
-			 const enum hrtimer_mode mode);
 extern int hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim,
 			unsigned long range_ns, const enum hrtimer_mode mode);
 
+/**
+ * hrtimer_start - (re)start an hrtimer on the current CPU
+ * @timer:	the timer to be added
+ * @tim:	expiry time
+ * @mode:	expiry mode: absolute (HRTIMER_MODE_ABS) or
+ *		relative (HRTIMER_MODE_REL)
+ *
+ * Returns:
+ *  0 on success
+ *  1 when the timer was active
+ */
+static inline int hrtimer_start(struct hrtimer *timer, ktime_t tim,
+				const enum hrtimer_mode mode)
+{
+	return hrtimer_start_range_ns(timer, tim, 0, mode);
+}
+
 extern int hrtimer_cancel(struct hrtimer *timer);
 extern int hrtimer_try_to_cancel(struct hrtimer *timer);
 
diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
index 88d6ea2..e5cf71aa 100644
--- a/kernel/time/hrtimer.c
+++ b/kernel/time/hrtimer.c
@@ -985,25 +985,6 @@ int hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim,
 EXPORT_SYMBOL_GPL(hrtimer_start_range_ns);
 
 /**
- * hrtimer_start - (re)start an hrtimer on the current CPU
- * @timer:	the timer to be added
- * @tim:	expiry time
- * @mode:	expiry mode: absolute (HRTIMER_MODE_ABS) or
- *		relative (HRTIMER_MODE_REL)
- *
- * Returns:
- *  0 on success
- *  1 when the timer was active
- */
-int
-hrtimer_start(struct hrtimer *timer, ktime_t tim, const enum hrtimer_mode mode)
-{
-	return hrtimer_start_range_ns(timer, tim, 0, mode);
-}
-EXPORT_SYMBOL_GPL(hrtimer_start);
-
-
-/**
  * hrtimer_try_to_cancel - try to deactivate a timer
  * @timer:	hrtimer to stop
  *


* [tip:timers/core] hrtimer: Remove bogus hrtimer_active() check
  2015-04-14 21:09 ` [patch 28/39] hrtimer: Remove bogus hrtimer_active() check Thomas Gleixner
@ 2015-04-22 19:12   ` tip-bot for Thomas Gleixner
  0 siblings, 0 replies; 123+ messages in thread
From: tip-bot for Thomas Gleixner @ 2015-04-22 19:12 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: peterz, preeti, mtosatti, tglx, mingo, linux-kernel,
	viresh.kumar, hpa, fweisbec

Commit-ID:  3f7b349ac14885472a19c46840235114e5ad5e52
Gitweb:     http://git.kernel.org/tip/3f7b349ac14885472a19c46840235114e5ad5e52
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 14 Apr 2015 21:09:11 +0000
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Wed, 22 Apr 2015 17:06:51 +0200

hrtimer: Remove bogus hrtimer_active() check

The check for hrtimer_active() after starting the timer is
pointless. If the timer is inactive it has expired already and
therefore the task pointer is already NULL.
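
The argument above can be illustrated with a small userspace model. This
is only a sketch under stated assumptions, not kernel code: struct
sleeper_model, model_wakeup() and model_start_expires() are hypothetical
names, and the model collapses "an expired timer gets its callback run"
into a direct call, whereas the kernel runs the callback from the timer
interrupt.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct hrtimer_sleeper. */
struct sleeper_model {
	bool active;		/* timer is still enqueued */
	const char *task;	/* non-NULL while the sleeper still waits */
	long expires;		/* absolute expiry time */
};

/* Models the wakeup callback: it clears the task pointer. */
static void model_wakeup(struct sleeper_model *s)
{
	s->task = NULL;
}

/*
 * Models starting the timer. Assumption of this sketch: a timer that
 * is already expired has its callback run, after which it is inactive.
 */
static void model_start_expires(struct sleeper_model *s, long now)
{
	if (s->expires <= now) {
		s->active = false;
		model_wakeup(s);
	} else {
		s->active = true;
	}
}

/*
 * The removed code was: if (!active) task = NULL;
 * It is a no-op whenever "inactive" already implies "task == NULL".
 */
static bool removed_check_is_noop(long expires, long now)
{
	struct sleeper_model s = {
		.active = false, .task = "current", .expires = expires,
	};

	model_start_expires(&s, now);
	return s.active || s.task == NULL;
}

static void check_active_test_is_redundant(void)
{
	assert(removed_check_is_noop(5, 10));	/* already expired */
	assert(removed_check_is_noop(10, 10));	/* expires right now */
	assert(removed_check_is_noop(20, 10));	/* expires in the future */
}
```

In this model the inactive case is only reachable through the callback,
so the task pointer has always been cleared by the time the old check
would have run.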

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/20150414203502.907149271@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/time/hrtimer.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
index e5cf71aa..c38f0b6 100644
--- a/kernel/time/hrtimer.c
+++ b/kernel/time/hrtimer.c
@@ -1361,8 +1361,6 @@ static int __sched do_nanosleep(struct hrtimer_sleeper *t, enum hrtimer_mode mod
 	do {
 		set_current_state(TASK_INTERRUPTIBLE);
 		hrtimer_start_expires(&t->timer, mode);
-		if (!hrtimer_active(&t->timer))
-			t->task = NULL;
 
 		if (likely(t->task))
 			freezable_schedule();
@@ -1633,8 +1631,6 @@ schedule_hrtimeout_range_clock(ktime_t *expires, unsigned long delta,
 	hrtimer_init_sleeper(&t, current);
 
 	hrtimer_start_expires(&t.timer, mode);
-	if (!hrtimer_active(&t.timer))
-		t.task = NULL;
 
 	if (likely(t.task))
 		schedule();


* [tip:timers/core] futex: Remove bogus hrtimer_active() check
  2015-04-14 21:09 ` [patch 29/39] hrtimer: Rmove " Thomas Gleixner
@ 2015-04-22 19:12   ` tip-bot for Thomas Gleixner
  0 siblings, 0 replies; 123+ messages in thread
From: tip-bot for Thomas Gleixner @ 2015-04-22 19:12 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: viresh.kumar, fweisbec, hpa, mtosatti, preeti, linux-kernel,
	mingo, peterz, tglx

Commit-ID:  2e4b0d3fe88bc2618fd5d081ace338a70f8c23da
Gitweb:     http://git.kernel.org/tip/2e4b0d3fe88bc2618fd5d081ace338a70f8c23da
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 14 Apr 2015 21:09:13 +0000
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Wed, 22 Apr 2015 17:06:51 +0200

futex: Remove bogus hrtimer_active() check

The check for hrtimer_active() after starting the timer is
pointless. If the timer is inactive it has expired already and
therefore the task pointer is already NULL.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/20150414203502.985825453@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/futex.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/kernel/futex.c b/kernel/futex.c
index 2579e40..720eacf 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -2063,11 +2063,8 @@ static void futex_wait_queue_me(struct futex_hash_bucket *hb, struct futex_q *q,
 	queue_me(q, hb);
 
 	/* Arm the timer */
-	if (timeout) {
+	if (timeout)
 		hrtimer_start_expires(&timeout->timer, HRTIMER_MODE_ABS);
-		if (!hrtimer_active(&timeout->timer))
-			timeout->task = NULL;
-	}
 
 	/*
 	 * If we have been removed from the hash list, then another task


* [tip:timers/core] rtmutex: Remove bogus hrtimer_active() check
  2015-04-14 21:09 ` [patch 30/39] rtmutex: " Thomas Gleixner
@ 2015-04-22 19:13   ` tip-bot for Thomas Gleixner
  0 siblings, 0 replies; 123+ messages in thread
From: tip-bot for Thomas Gleixner @ 2015-04-22 19:13 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, mingo, viresh.kumar, tglx, fweisbec, mtosatti, hpa,
	peterz, preeti

Commit-ID:  ccdd92c17e144c8494f4c94ab85b48d297545cec
Gitweb:     http://git.kernel.org/tip/ccdd92c17e144c8494f4c94ab85b48d297545cec
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 14 Apr 2015 21:09:15 +0000
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Wed, 22 Apr 2015 17:06:51 +0200

rtmutex: Remove bogus hrtimer_active() check

The check for hrtimer_active() after starting the timer is
pointless. If the timer is inactive it has expired already and
therefore the task pointer is already NULL.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/20150414203503.081830481@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/locking/rtmutex.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
index b732793..8626437 100644
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -1180,11 +1180,8 @@ rt_mutex_slowlock(struct rt_mutex *lock, int state,
 	set_current_state(state);
 
 	/* Setup the timer, when timeout != NULL */
-	if (unlikely(timeout)) {
+	if (unlikely(timeout))
 		hrtimer_start_expires(&timeout->timer, HRTIMER_MODE_ABS);
-		if (!hrtimer_active(&timeout->timer))
-			timeout->task = NULL;
-	}
 
 	ret = task_blocks_on_rt_mutex(lock, &waiter, current, chwalk);
 


* [tip:timers/core] net: core: pktgen: Remove bogus hrtimer_active() check
  2015-04-14 21:09 ` [patch 31/39] net: core: pktgen: " Thomas Gleixner
  2015-04-16 16:04   ` David Miller
@ 2015-04-22 19:13   ` tip-bot for Thomas Gleixner
  1 sibling, 0 replies; 123+ messages in thread
From: tip-bot for Thomas Gleixner @ 2015-04-22 19:13 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: tglx, davem, hpa, preeti, linux-kernel, fweisbec, viresh.kumar,
	mtosatti, mingo, peterz

Commit-ID:  46ac2f53dfe93a25b251696b946af7b673978ccb
Gitweb:     http://git.kernel.org/tip/46ac2f53dfe93a25b251696b946af7b673978ccb
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 14 Apr 2015 21:09:16 +0000
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Wed, 22 Apr 2015 17:06:52 +0200

net: core: pktgen: Remove bogus hrtimer_active() check

The check for hrtimer_active() after starting the timer is
pointless. If the timer is inactive it has expired already and
therefore the task pointer is already NULL.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Link: http://lkml.kernel.org/r/20150414203503.165258315@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 net/core/pktgen.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/net/core/pktgen.c b/net/core/pktgen.c
index 508155b..54817d3 100644
--- a/net/core/pktgen.c
+++ b/net/core/pktgen.c
@@ -2212,8 +2212,6 @@ static void spin(struct pktgen_dev *pkt_dev, ktime_t spin_until)
 		do {
 			set_current_state(TASK_INTERRUPTIBLE);
 			hrtimer_start_expires(&t.timer, HRTIMER_MODE_ABS);
-			if (!hrtimer_active(&t.timer))
-				t.task = NULL;
 
 			if (likely(t.task))
 				schedule();


* [tip:timers/core] alarmtimer: Get rid of unused return value
  2015-04-14 21:09 ` [patch 32/39] alarmtimer: Get rid of unused return value Thomas Gleixner
@ 2015-04-22 19:13   ` tip-bot for Thomas Gleixner
  0 siblings, 0 replies; 123+ messages in thread
From: tip-bot for Thomas Gleixner @ 2015-04-22 19:13 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: tglx, viresh.kumar, hpa, linux-kernel, fweisbec, peterz,
	john.stultz, mtosatti, preeti, mingo

Commit-ID:  b193217e6dc3f88b599b573b53e0e0f6671d969a
Gitweb:     http://git.kernel.org/tip/b193217e6dc3f88b599b573b53e0e0f6671d969a
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 14 Apr 2015 21:09:18 +0000
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Wed, 22 Apr 2015 17:06:52 +0200

alarmtimer: Get rid of unused return value

We want to get rid of the hrtimer_start() return value and the alarm
timer return value is not used anywhere. Remove it.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: John Stultz <john.stultz@linaro.org>
Link: http://lkml.kernel.org/r/20150414203503.243910615@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/alarmtimer.h |  4 ++--
 kernel/time/alarmtimer.c   | 11 ++++-------
 2 files changed, 6 insertions(+), 9 deletions(-)

diff --git a/include/linux/alarmtimer.h b/include/linux/alarmtimer.h
index a899402..52f3b7d 100644
--- a/include/linux/alarmtimer.h
+++ b/include/linux/alarmtimer.h
@@ -43,8 +43,8 @@ struct alarm {
 
 void alarm_init(struct alarm *alarm, enum alarmtimer_type type,
 		enum alarmtimer_restart (*function)(struct alarm *, ktime_t));
-int alarm_start(struct alarm *alarm, ktime_t start);
-int alarm_start_relative(struct alarm *alarm, ktime_t start);
+void alarm_start(struct alarm *alarm, ktime_t start);
+void alarm_start_relative(struct alarm *alarm, ktime_t start);
 void alarm_restart(struct alarm *alarm);
 int alarm_try_to_cancel(struct alarm *alarm);
 int alarm_cancel(struct alarm *alarm);
diff --git a/kernel/time/alarmtimer.c b/kernel/time/alarmtimer.c
index 0b55a75..7fbba63 100644
--- a/kernel/time/alarmtimer.c
+++ b/kernel/time/alarmtimer.c
@@ -317,19 +317,16 @@ EXPORT_SYMBOL_GPL(alarm_init);
  * @alarm: ptr to alarm to set
  * @start: time to run the alarm
  */
-int alarm_start(struct alarm *alarm, ktime_t start)
+void alarm_start(struct alarm *alarm, ktime_t start)
 {
 	struct alarm_base *base = &alarm_bases[alarm->type];
 	unsigned long flags;
-	int ret;
 
 	spin_lock_irqsave(&base->lock, flags);
 	alarm->node.expires = start;
 	alarmtimer_enqueue(base, alarm);
-	ret = hrtimer_start(&alarm->timer, alarm->node.expires,
-				HRTIMER_MODE_ABS);
+	hrtimer_start(&alarm->timer, alarm->node.expires, HRTIMER_MODE_ABS);
 	spin_unlock_irqrestore(&base->lock, flags);
-	return ret;
 }
 EXPORT_SYMBOL_GPL(alarm_start);
 
@@ -338,12 +335,12 @@ EXPORT_SYMBOL_GPL(alarm_start);
  * @alarm: ptr to alarm to set
  * @start: time relative to now to run the alarm
  */
-int alarm_start_relative(struct alarm *alarm, ktime_t start)
+void alarm_start_relative(struct alarm *alarm, ktime_t start)
 {
 	struct alarm_base *base = &alarm_bases[alarm->type];
 
 	start = ktime_add(start, base->gettime());
-	return alarm_start(alarm, start);
+	alarm_start(alarm, start);
 }
 EXPORT_SYMBOL_GPL(alarm_start_relative);
 


* [tip:timers/core] tick: broadcast-hrtimer: Remove overly clever return value abuse
  2015-04-14 21:09 ` [patch 34/39] tick: broadcast-hrtimer: Remove overly clever return value abuse Thomas Gleixner
  2015-04-17 10:33   ` Preeti U Murthy
@ 2015-04-22 19:13   ` tip-bot for Thomas Gleixner
  1 sibling, 0 replies; 123+ messages in thread
From: tip-bot for Thomas Gleixner @ 2015-04-22 19:13 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: hpa, linux-kernel, peterz, preeti, mingo, fweisbec, tglx,
	mtosatti, viresh.kumar

Commit-ID:  b8a62f1ff0ccb18fdc25c6150d1cd394610f4753
Gitweb:     http://git.kernel.org/tip/b8a62f1ff0ccb18fdc25c6150d1cd394610f4753
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 14 Apr 2015 21:09:22 +0000
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Wed, 22 Apr 2015 17:06:52 +0200

tick: broadcast-hrtimer: Remove overly clever return value abuse

The assignment of bc_moved in the conditional construct relies on the
fact that in the case of hrtimer_start() invocation the return value
is always 0. It took me a while to understand it.

We want to get rid of the hrtimer_start() return value. Open code the
logic, which makes it readable as well.
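
The equivalence of the two forms can be checked with a small userspace
sketch. It is illustrative only: stub_hrtimer_start() is a hypothetical
stand-in for the old int-returning hrtimer_start(), hard-wired to return
0 as the text above describes, and the cancel result is passed in as a
plain integer instead of calling hrtimer_try_to_cancel().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stub: on this path the old hrtimer_start() returned 0. */
static int stub_hrtimer_start(void)
{
	return 0;
}

/* Old form: bc_moved depends on hrtimer_start() returning 0. */
static bool bc_moved_old(int cancel_ret)
{
	return (cancel_ret >= 0) ? !stub_hrtimer_start() : 0;
}

/* New form: bc_moved is derived from the cancel result alone. */
static bool bc_moved_new(int cancel_ret)
{
	bool bc_moved = cancel_ret >= 0;

	if (bc_moved)
		stub_hrtimer_start();
	return bc_moved;
}

static void check_bc_moved_equivalence(void)
{
	/*
	 * hrtimer_try_to_cancel() results: -1 callback is running,
	 * 0 timer was not active, 1 timer was active and got removed.
	 */
	static const int rets[] = { -1, 0, 1 };
	unsigned int i;

	for (i = 0; i < sizeof(rets) / sizeof(rets[0]); i++)
		assert(bc_moved_old(rets[i]) == bc_moved_new(rets[i]));
}
```

Both forms agree for all three cancel results, but only the second one
keeps working once hrtimer_start() returns void.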

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/20150414203503.404751457@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/time/tick-broadcast-hrtimer.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/kernel/time/tick-broadcast-hrtimer.c b/kernel/time/tick-broadcast-hrtimer.c
index 6aac4be..96428d7 100644
--- a/kernel/time/tick-broadcast-hrtimer.c
+++ b/kernel/time/tick-broadcast-hrtimer.c
@@ -66,9 +66,11 @@ static int bc_set_next(ktime_t expires, struct clock_event_device *bc)
 	 * hrtimer_{start/cancel} functions call into tracing,
 	 * calls to these functions must be bound within RCU_NONIDLE.
 	 */
-	RCU_NONIDLE(bc_moved = (hrtimer_try_to_cancel(&bctimer) >= 0) ?
-		!hrtimer_start(&bctimer, expires, HRTIMER_MODE_ABS_PINNED) :
-			0);
+	RCU_NONIDLE({
+			bc_moved = hrtimer_try_to_cancel(&bctimer) >= 0;
+			if (bc_moved)
+				hrtimer_start(&bctimer, expires,
+					      HRTIMER_MODE_ABS_PINNED);});
 	if (bc_moved) {
 		/* Bind the "device" to the cpu */
 		bc->bound_on = smp_processor_id();


* [tip:timers/core] hrtimer: Remove hrtimer_start() return value
  2015-04-14 21:09 ` [patch 35/39] hrtimer: Remove hrtimer_start() return value Thomas Gleixner
@ 2015-04-22 19:14   ` tip-bot for Thomas Gleixner
  0 siblings, 0 replies; 123+ messages in thread
From: tip-bot for Thomas Gleixner @ 2015-04-22 19:14 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, tglx, peterz, hpa, mingo, preeti, fweisbec,
	viresh.kumar, mtosatti

Commit-ID:  61699e13072a89880aa584dcc64c6da465fb2ccc
Gitweb:     http://git.kernel.org/tip/61699e13072a89880aa584dcc64c6da465fb2ccc
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 14 Apr 2015 21:09:23 +0000
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Wed, 22 Apr 2015 17:06:52 +0200

hrtimer: Remove hrtimer_start() return value

No user was ever interested whether the timer was active or not when
it was started. All abusers of the return value are gone, so get rid
of it.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/20150414203503.483556394@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/hrtimer.h   | 22 +++++++++-------------
 include/linux/interrupt.h |  6 +++---
 kernel/time/hrtimer.c     | 23 +++++++----------------
 3 files changed, 19 insertions(+), 32 deletions(-)

diff --git a/include/linux/hrtimer.h b/include/linux/hrtimer.h
index 42074ab..470d876 100644
--- a/include/linux/hrtimer.h
+++ b/include/linux/hrtimer.h
@@ -355,7 +355,7 @@ static inline void destroy_hrtimer_on_stack(struct hrtimer *timer) { }
 #endif
 
 /* Basic timer operations: */
-extern int hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim,
+extern void hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim,
 			unsigned long range_ns, const enum hrtimer_mode mode);
 
 /**
@@ -364,34 +364,30 @@ extern int hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim,
  * @tim:	expiry time
  * @mode:	expiry mode: absolute (HRTIMER_MODE_ABS) or
  *		relative (HRTIMER_MODE_REL)
- *
- * Returns:
- *  0 on success
- *  1 when the timer was active
  */
-static inline int hrtimer_start(struct hrtimer *timer, ktime_t tim,
-				const enum hrtimer_mode mode)
+static inline void hrtimer_start(struct hrtimer *timer, ktime_t tim,
+				 const enum hrtimer_mode mode)
 {
-	return hrtimer_start_range_ns(timer, tim, 0, mode);
+	hrtimer_start_range_ns(timer, tim, 0, mode);
 }
 
 extern int hrtimer_cancel(struct hrtimer *timer);
 extern int hrtimer_try_to_cancel(struct hrtimer *timer);
 
-static inline int hrtimer_start_expires(struct hrtimer *timer,
-						enum hrtimer_mode mode)
+static inline void hrtimer_start_expires(struct hrtimer *timer,
+					 enum hrtimer_mode mode)
 {
 	unsigned long delta;
 	ktime_t soft, hard;
 	soft = hrtimer_get_softexpires(timer);
 	hard = hrtimer_get_expires(timer);
 	delta = ktime_to_ns(ktime_sub(hard, soft));
-	return hrtimer_start_range_ns(timer, soft, delta, mode);
+	hrtimer_start_range_ns(timer, soft, delta, mode);
 }
 
-static inline int hrtimer_restart(struct hrtimer *timer)
+static inline void hrtimer_restart(struct hrtimer *timer)
 {
-	return hrtimer_start_expires(timer, HRTIMER_MODE_ABS);
+	hrtimer_start_expires(timer, HRTIMER_MODE_ABS);
 }
 
 /* Query timers: */
diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
index 6bf15a6..be7e75c 100644
--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -593,10 +593,10 @@ tasklet_hrtimer_init(struct tasklet_hrtimer *ttimer,
 		     clockid_t which_clock, enum hrtimer_mode mode);
 
 static inline
-int tasklet_hrtimer_start(struct tasklet_hrtimer *ttimer, ktime_t time,
-			  const enum hrtimer_mode mode)
+void tasklet_hrtimer_start(struct tasklet_hrtimer *ttimer, ktime_t time,
+			   const enum hrtimer_mode mode)
 {
-	return hrtimer_start(&ttimer->timer, time, mode);
+	hrtimer_start(&ttimer->timer, time, mode);
 }
 
 static inline
diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
index c38f0b6..beab02d 100644
--- a/kernel/time/hrtimer.c
+++ b/kernel/time/hrtimer.c
@@ -923,22 +923,18 @@ remove_hrtimer(struct hrtimer *timer, struct hrtimer_clock_base *base)
  * @delta_ns:	"slack" range for the timer
  * @mode:	expiry mode: absolute (HRTIMER_MODE_ABS) or
  *		relative (HRTIMER_MODE_REL)
- *
- * Returns:
- *  0 on success
- *  1 when the timer was active
  */
-int hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim,
-			   unsigned long delta_ns, const enum hrtimer_mode mode)
+void hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim,
+			    unsigned long delta_ns, const enum hrtimer_mode mode)
 {
 	struct hrtimer_clock_base *base, *new_base;
 	unsigned long flags;
-	int ret, leftmost;
+	int leftmost;
 
 	base = lock_hrtimer_base(timer, &flags);
 
 	/* Remove an active timer from the queue: */
-	ret = remove_hrtimer(timer, base);
+	remove_hrtimer(timer, base);
 
 	if (mode & HRTIMER_MODE_REL) {
 		tim = ktime_add_safe(tim, base->get_time());
@@ -962,11 +958,8 @@ int hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim,
 	timer_stats_hrtimer_set_start_info(timer);
 
 	leftmost = enqueue_hrtimer(timer, new_base);
-
-	if (!leftmost) {
-		unlock_hrtimer_base(timer, &flags);
-		return ret;
-	}
+	if (!leftmost)
+		goto unlock;
 
 	if (!hrtimer_is_hres_active(timer)) {
 		/*
@@ -977,10 +970,8 @@ int hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim,
 	} else {
 		hrtimer_reprogram(timer, new_base);
 	}
-
+unlock:
 	unlock_hrtimer_base(timer, &flags);
-
-	return ret;
 }
 EXPORT_SYMBOL_GPL(hrtimer_start_range_ns);
 

^ permalink raw reply related	[flat|nested] 123+ messages in thread

* [tip:timers/core] hrtimer: Avoid locking in hrtimer_cancel() if timer not active
  2015-04-14 21:09 ` [patch 36/39] hrtimer: Avoid locking in hrtimer_cancel() if timer not active Thomas Gleixner
@ 2015-04-22 19:14   ` tip-bot for Thomas Gleixner
  0 siblings, 0 replies; 123+ messages in thread
From: tip-bot for Thomas Gleixner @ 2015-04-22 19:14 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: viresh.kumar, preeti, fweisbec, tglx, mingo, hpa, peterz,
	mtosatti, linux-kernel

Commit-ID:  19d9f4225dd6a47fca430f15eeae345ceb95c301
Gitweb:     http://git.kernel.org/tip/19d9f4225dd6a47fca430f15eeae345ceb95c301
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 14 Apr 2015 21:09:25 +0000
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Wed, 22 Apr 2015 17:06:52 +0200

hrtimer: Avoid locking in hrtimer_cancel() if timer not active

We can do a lockless check for hrtimer_active before actually taking
the lock in hrtimer[_try_to]_cancel. This is useful for hotpath users
like nanosleep as they avoid the lock dance when the timer has
expired.

This is safe because active is true when the timer is enqueued or the
callback is running. Taking the hrtimer base lock does not protect
against concurrent hrtimer_start calls, the callsite has to do the
proper serialization itself.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/20150414203503.580273114@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/time/hrtimer.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
index beab02d..3bac942 100644
--- a/kernel/time/hrtimer.c
+++ b/kernel/time/hrtimer.c
@@ -991,6 +991,15 @@ int hrtimer_try_to_cancel(struct hrtimer *timer)
 	unsigned long flags;
 	int ret = -1;
 
+	/*
+	 * Check lockless first. If the timer is not active (neither
+	 * enqueued nor running the callback, nothing to do here.  The
+	 * base lock does not serialize against a concurrent enqueue,
+	 * so we can avoid taking it.
+	 */
+	if (!hrtimer_active(timer))
+		return 0;
+
 	base = lock_hrtimer_base(timer, &flags);
 
 	if (!hrtimer_callback_running(timer))
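
The fast path added above can be illustrated outside the kernel. What follows is only a user-space sketch under stated assumptions: struct toy_timer, toy_timer_active() and toy_try_to_cancel() are hypothetical names, a C11 atomic_flag spinlock stands in for the hrtimer base lock, and lock_count exists purely so the effect of the early return can be observed. The return convention (0 = not active, 1 = dequeued, -1 = callback running) follows the one documented for hrtimer_try_to_cancel().

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for struct hrtimer; not the kernel type. */
struct toy_timer {
	atomic_flag lock;        /* plays the role of the hrtimer base lock */
	atomic_bool enqueued;    /* timer is sitting in the queue */
	atomic_bool running;     /* callback is currently executing */
	unsigned int lock_count; /* instrumentation: slow-path lock acquisitions */
};

static void toy_timer_init(struct toy_timer *t)
{
	atomic_flag_clear(&t->lock);
	atomic_init(&t->enqueued, false);
	atomic_init(&t->running, false);
	t->lock_count = 0;
}

/* "Active" means enqueued or running the callback, as in the commit message. */
static bool toy_timer_active(struct toy_timer *t)
{
	return atomic_load(&t->enqueued) || atomic_load(&t->running);
}

static int toy_try_to_cancel(struct toy_timer *t)
{
	int ret = -1;

	/* Lockless check first: an inactive timer needs no lock at all. */
	if (!toy_timer_active(t))
		return 0;

	/* Slow path: take the lock and dequeue unless the callback runs. */
	while (atomic_flag_test_and_set(&t->lock))
		;
	t->lock_count++;
	if (!atomic_load(&t->running))
		ret = atomic_exchange(&t->enqueued, false) ? 1 : 0;
	atomic_flag_clear(&t->lock);

	return ret;
}
```

As in the patch, the sketch does not serialize against a concurrent enqueue; a caller that can race with a restart of the timer still has to provide that serialization itself.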

^ permalink raw reply related	[flat|nested] 123+ messages in thread

* [tip:timers/core] timer: Remove pointless return value of do_usleep_range()
  2015-04-14 21:09 ` [patch 38/39] timer: Remove pointless return value of do_usleep_range() Thomas Gleixner
@ 2015-04-22 19:14   ` tip-bot for Thomas Gleixner
  0 siblings, 0 replies; 123+ messages in thread
From: tip-bot for Thomas Gleixner @ 2015-04-22 19:14 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: preeti, viresh.kumar, mtosatti, hpa, tglx, mingo, fweisbec,
	peterz, linux-kernel

Commit-ID:  6deba083e1de3f92f65c9849254e92a1ef001b73
Gitweb:     http://git.kernel.org/tip/6deba083e1de3f92f65c9849254e92a1ef001b73
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 14 Apr 2015 21:09:28 +0000
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Wed, 22 Apr 2015 17:06:52 +0200

timer: Remove pointless return value of do_usleep_range()

The only user ignores it anyway and rightfully so.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/20150414203503.756060258@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/time/timer.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index 172b83c..e9cc7e0 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -1692,14 +1692,14 @@ unsigned long msleep_interruptible(unsigned int msecs)
 
 EXPORT_SYMBOL(msleep_interruptible);
 
-static int __sched do_usleep_range(unsigned long min, unsigned long max)
+static void __sched do_usleep_range(unsigned long min, unsigned long max)
 {
 	ktime_t kmin;
 	unsigned long delta;
 
 	kmin = ktime_set(0, min * NSEC_PER_USEC);
 	delta = (max - min) * NSEC_PER_USEC;
-	return schedule_hrtimeout_range(&kmin, delta, HRTIMER_MODE_REL);
+	schedule_hrtimeout_range(&kmin, delta, HRTIMER_MODE_REL);
 }
 
 /**
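
For reference, the arithmetic in do_usleep_range() converts the microsecond bounds into a minimum expiry and a slack, both in nanoseconds. The following is a user-space sketch only: struct toy_sleep_range, toy_usleep_range_to_ns() and TOY_NSEC_PER_USEC are hypothetical names, and plain 64-bit integers stand in for ktime_t.

```c
#include <stdint.h>

/* Assumed to match the kernel's NSEC_PER_USEC (1000 ns per microsecond). */
#define TOY_NSEC_PER_USEC 1000ULL

/* Hypothetical result type: earliest expiry plus the allowed slack. */
struct toy_sleep_range {
	uint64_t min_ns;   /* corresponds to kmin in do_usleep_range() */
	uint64_t slack_ns; /* corresponds to delta in do_usleep_range() */
};

/* Callers are expected to pass max_us >= min_us, as with usleep_range(). */
static struct toy_sleep_range
toy_usleep_range_to_ns(unsigned long min_us, unsigned long max_us)
{
	struct toy_sleep_range r;

	r.min_ns = min_us * TOY_NSEC_PER_USEC;
	r.slack_ns = (max_us - min_us) * TOY_NSEC_PER_USEC;
	return r;
}
```

With min = 100 and max = 200 this gives a minimum of 100000 ns and a slack of 100000 ns, i.e. the timer may expire anywhere between 100 us and 200 us from now.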

^ permalink raw reply related	[flat|nested] 123+ messages in thread

* [tip:timers/core] timer: Put usleep_range into the __sched section
  2015-04-14 21:09 ` [patch 39/39] timer: Put usleep_range into the __sched section Thomas Gleixner
@ 2015-04-22 19:14   ` tip-bot for Thomas Gleixner
  0 siblings, 0 replies; 123+ messages in thread
From: tip-bot for Thomas Gleixner @ 2015-04-22 19:14 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, mingo, tglx, preeti, mtosatti, peterz,
	viresh.kumar, hpa, fweisbec

Commit-ID:  2ad5d3272d8e20e24d8242ebac9f3007f1ea56bc
Gitweb:     http://git.kernel.org/tip/2ad5d3272d8e20e24d8242ebac9f3007f1ea56bc
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 14 Apr 2015 21:09:30 +0000
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Wed, 22 Apr 2015 17:06:52 +0200

timer: Put usleep_range into the __sched section

do_usleep_range() and schedule_hrtimeout_range() are __sched as
well. So it makes no sense to have the exported function in a
different section.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/20150414203503.833709502@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/time/timer.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index e9cc7e0..03f926c 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -1707,7 +1707,7 @@ static void __sched do_usleep_range(unsigned long min, unsigned long max)
  * @min: Minimum time in usecs to sleep
  * @max: Maximum time in usecs to sleep
  */
-void usleep_range(unsigned long min, unsigned long max)
+void __sched usleep_range(unsigned long min, unsigned long max)
 {
 	__set_current_state(TASK_UNINTERRUPTIBLE);
 	do_usleep_range(min, max);

^ permalink raw reply related	[flat|nested] 123+ messages in thread

* Re: [patch 18/39] tick: sched: Force tick interrupt and get rid of softirq magic
  2015-04-22 14:32     ` Thomas Gleixner
@ 2015-04-23 11:47       ` Frederic Weisbecker
  2015-04-23 13:07         ` Thomas Gleixner
  0 siblings, 1 reply; 123+ messages in thread
From: Frederic Weisbecker @ 2015-04-23 11:47 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Peter Zijlstra, Ingo Molnar, Preeti U Murthy, Viresh Kumar,
	Marcelo Tosatti, John Stultz, Marcelo Tosatti

On Wed, Apr 22, 2015 at 04:32:11PM +0200, Thomas Gleixner wrote:
> On Wed, 22 Apr 2015, Frederic Weisbecker wrote:
> > But the reprogramming happens only under "if ((long)delta_jiffies >= 1)".
> > Probably this condition should go away as well.
> 
> Errm.
> 
> 	if (!ts->tick_stopped && delta_jiffies <= 1)
> 	     goto out;
> 
> So if the tick is NOT stopped and delta_jiffies <= 1 we let it tick
> and do nothing.
> 
>         if (delta_jiffies >= 1)
> 	     Do the magic nohz stuff
> 	else
> 	     tick_nohz_restart()
> 
> We want the distinction here because if the tick IS stopped and the
> next event is due we need to kick it into gear again. So the condition
> needs to stay. It probably should be if (delta > 1), but that's a
> different story.

Yes but what if the tick is stopped already and delta_jiffies < 1? Say the
tick was last programmed to fire in 5 seconds. An irq fires and enqueues
a timer to fire now. If it's soon enough that delta_jiffies < 1, it seems
we are missing the clock reprogramming and even the softirq from that irq exit.

Because we have:

      if (!ts->tick_stopped && delta_jiffies <= 1)
          goto out;

      if ((long)delta_jiffies >= 1) {
          //do clock reprogramming or restart
          ...
      }

-     raise_softirq(TIMER_SOFTIRQ);

 out:
    ...

>  
> > In the end, the possible side effect, at least on low-res, is that timers
> > which are already expired will be handled on the next tick instead of now.
> > But probably it doesn't matter much to have a one-tick delay.
> 
> We really don't care. That stuff has no guarantees aside of the
> guarantee that it does not expire early :)

Good :)

> 
> Thanks,
> 
> 	tglx
> 

^ permalink raw reply	[flat|nested] 123+ messages in thread

* Re: [patch 18/39] tick: sched: Force tick interrupt and get rid of softirq magic
  2015-04-23 11:47       ` Frederic Weisbecker
@ 2015-04-23 13:07         ` Thomas Gleixner
  2015-04-23 16:14           ` Frederic Weisbecker
  0 siblings, 1 reply; 123+ messages in thread
From: Thomas Gleixner @ 2015-04-23 13:07 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Peter Zijlstra, Ingo Molnar, Preeti U Murthy, Viresh Kumar,
	Marcelo Tosatti, John Stultz, Marcelo Tosatti

On Thu, 23 Apr 2015, Frederic Weisbecker wrote:

> On Wed, Apr 22, 2015 at 04:32:11PM +0200, Thomas Gleixner wrote:
> > On Wed, 22 Apr 2015, Frederic Weisbecker wrote:
> > > But the reprogramming happens only under "if ((long)delta_jiffies >= 1)".
> > > Probably this condition should go away as well.
> > 
> > Errm.
> > 
> > 	if (!ts->tick_stopped && delta_jiffies <= 1)
> > 	     goto out;
> > 
> > So if the tick is NOT stopped and delta_jiffies <= 1 we let it tick
> > and do nothing.
> > 
> >         if (delta_jiffies >= 1)
> > 	     Do the magic nohz stuff
> > 	else
> > 	     tick_nohz_restart()
> > 
> > We want the distinction here because if the tick IS stopped and the
> > next event is due we need to kick it into gear again. So the condition
> > needs to stay. It probably should be if (delta > 1), but that's a
> > different story.
> 
> Yes but what if the tick is stopped already and delta_jiffies < 1? Say the
> tick was last programmed to fire in 5 seconds. An irq fires and enqueues
> a timer to fire now. If it's soon enough that delta_jiffies < 1, it seems
> we are missing the clock reprogramming and even the softirq from that irq exit.
> 
> Because we have:
> 
>       if (!ts->tick_stopped && delta_jiffies <= 1)
>           goto out;
> 
>       if ((long)delta_jiffies >= 1) {
>           //do clock reprogramming or restart
>           ...
>       }
> 
> -     raise_softirq(TIMER_SOFTIRQ);

Did you actually read what I wrote?

> >     if (delta_jiffies >= 1)
> > 	     Do the magic nohz stuff
> > 	else
> > 	     tick_nohz_restart()

So if the tick is stopped AND delta < 1 we call
tick_nohz_restart(), which will program the tick for the next
possible period to expire. That will kick the timer softirq and
expire the timer.

Thanks,

	tglx



^ permalink raw reply	[flat|nested] 123+ messages in thread

* Re: [patch 18/39] tick: sched: Force tick interrupt and get rid of softirq magic
  2015-04-23 13:07         ` Thomas Gleixner
@ 2015-04-23 16:14           ` Frederic Weisbecker
  0 siblings, 0 replies; 123+ messages in thread
From: Frederic Weisbecker @ 2015-04-23 16:14 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Peter Zijlstra, Ingo Molnar, Preeti U Murthy, Viresh Kumar,
	Marcelo Tosatti, John Stultz, Marcelo Tosatti

On Thu, Apr 23, 2015 at 03:07:59PM +0200, Thomas Gleixner wrote:
> On Thu, 23 Apr 2015, Frederic Weisbecker wrote: 
> > On Wed, Apr 22, 2015 at 04:32:11PM +0200, Thomas Gleixner wrote:
> > > On Wed, 22 Apr 2015, Frederic Weisbecker wrote:
> > > > But the reprogramming happens only under "if ((long)delta_jiffies >= 1)".
> > > > Probably this condition should go away as well.
> > > 
> > > Errm.
> > > 
> > > 	if (!ts->tick_stopped && delta_jiffies <= 1)
> > > 	     goto out;
> > > 
> > > So if the tick is NOT stopped and delta_jiffies <= 1 we let it tick
> > > and do nothing.
> > > 
> > >         if (delta_jiffies >= 1)
> > > 	     Do the magic nohz stuff
> > > 	else
> > > 	     tick_nohz_restart()
> > > 
> > > We want the distinction here because if the tick IS stopped and the
> > > next event is due we need to kick it into gear again. So the condition
> > > needs to stay. It probably should be if (delta > 1), but that's a
> > > different story.
> > 
> > Yes but what if the tick is stopped already and delta_jiffies < 1? Say the
> > tick was last programmed to fire in 5 seconds. An irq fires and enqueues
> > a timer to fire now. If it's soon enough that delta_jiffies < 1, it seems
> > we are missing the clock reprogramming and even the softirq from that irq exit.
> > 
> > Because we have:
> > 
> >       if (!ts->tick_stopped && delta_jiffies <= 1)
> >           goto out;
> > 
> >       if ((long)delta_jiffies >= 1) {
> >           //do clock reprogramming or restart
> >           ...
> >       }
> > 
> > -     raise_softirq(TIMER_SOFTIRQ);
> 
> Did you actually read what I wrote?
> 
> > >     if (delta_jiffies >= 1)
> > > 	     Do the magic nohz stuff
> > > 	else
> > > 	     tick_nohz_restart()
> 
> So if the tick is stopped AND delta < 1 we call
> tick_nohz_restart(), which will program the tick for the next
> possible period to expire. That will kick the timer softirq and
> expire the timer.

Right, I missed the "else" that moved.
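
The branch structure discussed in this sub-thread can be written down as a small decision function. This is a sketch of the logic as described in the messages above, not the code in kernel/time/tick-sched.c: enum toy_tick_action and toy_tick_decide() are hypothetical names, and each action is only labelled, not performed.

```c
#include <stdbool.h>

/* Hypothetical labels for the three outcomes described in the thread. */
enum toy_tick_action {
	TOY_TICK_KEEP,    /* tick still running and next event close: do nothing */
	TOY_TICK_NOHZ,    /* "the magic nohz stuff": stop or reprogram the tick */
	TOY_TICK_RESTART, /* tick stopped and event due: tick_nohz_restart() */
};

static enum toy_tick_action
toy_tick_decide(bool tick_stopped, long delta_jiffies)
{
	/* Tick is NOT stopped and the next event is at most one jiffy away. */
	if (!tick_stopped && delta_jiffies <= 1)
		return TOY_TICK_KEEP;

	/* Next event is at least one jiffy away: program the nohz sleep. */
	if (delta_jiffies >= 1)
		return TOY_TICK_NOHZ;

	/* Tick is stopped and delta < 1: kick the tick back into gear. */
	return TOY_TICK_RESTART;
}
```

The case raised earlier (tick already stopped, delta_jiffies < 1) falls through to the last return, which is the "else" branch that the restructured code relies on.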

^ permalink raw reply	[flat|nested] 123+ messages in thread

* Re: [patch 33/39] power: reset: ltc2952: Remove bogus hrtimer_start() return value checks
       [not found] ` <20150414203503.322172417@linutronix.de>
  2015-04-14 21:38   ` [patch 33/39] power: reset: ltc2952: Remove bogus hrtimer_start() return value checks Frans Klaver
@ 2015-04-30 15:49   ` Sebastian Reichel
  1 sibling, 0 replies; 123+ messages in thread
From: Sebastian Reichel @ 2015-04-30 15:49 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Peter Zijlstra, Ingo Molnar, Preeti U Murthy, Viresh Kumar,
	Marcelo Tosatti, Frederic Weisbecker, Dmitry Eremin-Solenikov,
	David Woodhouse, Frans Klaver, René Moll,
	Wolfram Sang, linux-pm

[-- Attachment #1: Type: text/plain, Size: 510 bytes --]

Hi,

On Tue, Apr 14, 2015 at 09:09:20PM -0000, Thomas Gleixner wrote:
> The return value of hrtimer_start() tells whether the timer was
> inactive or active already when hrtimer_start() was called.
> 
> The code emits a bogus warning if the timer was active already
> claiming that the timer could not be started.
> 
> Remove it.

Thanks, queued for 4.1-rc with Acked-By from Frans Klaver:

http://git.infradead.org/battery-2.6.git/commit/d8818257d3befce6ce7da4c09112654914c3fd58

-- Sebastian


^ permalink raw reply	[flat|nested] 123+ messages in thread

end of thread, other threads:[~2015-04-30 15:49 UTC | newest]

Thread overview: 123+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-04-14 21:08 [patch 00/39] hrtimer/tick: Optimizations, cleanups and solutions for various issues Thomas Gleixner
2015-04-14 21:08 ` [patch 01/39] hrtimer: Update active_bases before calling hrtimer_force_reprogram() Thomas Gleixner
2015-04-14 21:08 ` [patch 02/39] hrtimer: Get rid of the resolution field in hrtimer_clock_base Thomas Gleixner
2015-04-15  6:29   ` Frans Klaver
2015-04-15  6:32     ` Frans Klaver
2015-04-20  8:34   ` Preeti U Murthy
2015-04-22 19:05   ` [tip:timers/core] " tip-bot for Thomas Gleixner
2015-04-14 21:08 ` [patch 03/39] net: sched: Use hrtimer_resolution instead of hrtimer_get_res() Thomas Gleixner
2015-04-16 16:04   ` David Miller
2015-04-22 19:05   ` [tip:timers/core] " tip-bot for Thomas Gleixner
2015-04-14 21:08 ` [patch 04/39] sound: " Thomas Gleixner
2015-04-16  8:07   ` Takashi Iwai
2015-04-16  9:08     ` Thomas Gleixner
2015-04-22 19:05   ` [tip:timers/core] " tip-bot for Thomas Gleixner
2015-04-14 21:08 ` [patch 05/39] hrtimer: Get rid " Thomas Gleixner
2015-04-22 19:05   ` [tip:timers/core] " tip-bot for Thomas Gleixner
2015-04-14 21:08 ` [patch 06/39] hrtimer: Make the statistics fields smaller Thomas Gleixner
2015-04-22 19:06   ` [tip:timers/core] " tip-bot for Thomas Gleixner
2015-04-14 21:08 ` [patch 07/39] hrtimer: Get rid of softirq time Thomas Gleixner
2015-04-22 19:06   ` [tip:timers/core] " tip-bot for Thomas Gleixner
2015-04-14 21:08 ` [patch 08/39] hrtimer: Make offset update smarter Thomas Gleixner
2015-04-20  9:30   ` Preeti U Murthy
2015-04-22 19:06   ` [tip:timers/core] " tip-bot for Thomas Gleixner
2015-04-14 21:08 ` [patch 09/39] hrtimer: Use a bits for various boolean indicators Thomas Gleixner
2015-04-22 19:07   ` [tip:timers/core] hrtimer: Use " tip-bot for Thomas Gleixner
2015-04-14 21:08 ` [patch 10/39] hrtimer: Use cpu_base->active_base for hotpath iterators Thomas Gleixner
2015-04-20 11:16   ` Preeti U Murthy
2015-04-21 11:53     ` Thomas Gleixner
2015-04-22  3:13       ` Preeti U Murthy
2015-04-22 19:07   ` [tip:timers/core] hrtimer: Use cpu_base-> active_base " tip-bot for Thomas Gleixner
2015-04-14 21:08 ` [patch 11/39] hrtimer: Cache line align the hrtimer cpu base Thomas Gleixner
2015-04-22 19:07   ` [tip:timers/core] " tip-bot for Thomas Gleixner
2015-04-14 21:08 ` [patch 12/39] hrtimer: Align the hrtimer clock bases as well Thomas Gleixner
2015-04-22 19:07   ` [tip:timers/core] " tip-bot for Thomas Gleixner
2015-04-14 21:08 ` [patch 13/39] timerqueue: Let timerqueue_add/del return information Thomas Gleixner
2015-04-22 19:08   ` [tip:timers/core] timerqueue: Let timerqueue_add/ del " tip-bot for Thomas Gleixner
2015-04-14 21:08 ` [patch 14/39] hrtimer: Make use of timerqueue_add/del return values Thomas Gleixner
2015-04-22 19:08   ` [tip:timers/core] hrtimer: Make use of timerqueue_add/ del " tip-bot for Thomas Gleixner
2015-04-14 21:08 ` [patch 15/39] hrtimer: Keep pointer to first timer and simplify __remove_hrtimer() Thomas Gleixner
2015-04-22 19:08   ` [tip:timers/core] " tip-bot for Thomas Gleixner
2015-04-14 21:08 ` [patch 16/39] hrtimer: Get rid of hrtimer softirq Thomas Gleixner
2015-04-22 19:09   ` [tip:timers/core] " tip-bot for Thomas Gleixner
2015-04-14 21:08 ` [patch 17/39] tick: sched: Remove hrtimer_active() checks Thomas Gleixner
2015-04-16 13:37   ` Frederic Weisbecker
2015-04-22 19:09   ` [tip:timers/core] " tip-bot for Thomas Gleixner
2015-04-14 21:08 ` [patch 18/39] tick: sched: Force tick interrupt and get rid of softirq magic Thomas Gleixner
2015-04-22 14:22   ` Frederic Weisbecker
2015-04-22 14:32     ` Thomas Gleixner
2015-04-23 11:47       ` Frederic Weisbecker
2015-04-23 13:07         ` Thomas Gleixner
2015-04-23 16:14           ` Frederic Weisbecker
2015-04-22 19:09   ` [tip:timers/core] " tip-bot for Thomas Gleixner
2015-04-14 21:08 ` [patch 19/39] tick: sched: Restructure code Thomas Gleixner
2015-04-22 19:09   ` [tip:timers/core] tick: Sched: " tip-bot for Thomas Gleixner
2015-04-14 21:08 ` [patch 20/39] tick: nohz: Rework next timer evaluation Thomas Gleixner
2015-04-16 16:42   ` Paul E. McKenney
2015-04-21 12:04     ` Thomas Gleixner
2015-04-22 19:10   ` [tip:timers/core] tick: Nohz: " tip-bot for Thomas Gleixner
2015-04-14 21:09 ` [patch 21/39] x86: perf: Use hrtimer_start() Thomas Gleixner
2015-04-22 19:10   ` [tip:timers/core] " tip-bot for Thomas Gleixner
2015-04-14 21:09 ` [patch 22/39] x86: perf: uncore: " Thomas Gleixner
2015-04-22 19:10   ` [tip:timers/core] " tip-bot for Thomas Gleixner
2015-04-14 21:09 ` [patch 23/39] perf: core: " Thomas Gleixner
2015-04-22 19:11   ` [tip:timers/core] " tip-bot for Thomas Gleixner
2015-04-14 21:09 ` [patch 24/39] sched: core: Use hrtimer_start[_expires]() Thomas Gleixner
2015-04-22 19:11   ` [tip:timers/core] " tip-bot for Thomas Gleixner
2015-04-14 21:09 ` [patch 25/39] sched: deadline: Use hrtimer_start() Thomas Gleixner
2015-04-22 19:11   ` [tip:timers/core] " tip-bot for Thomas Gleixner
2015-04-14 21:09 ` [patch 26/39] hrtimer: Get rid of __hrtimer_start_range_ns() Thomas Gleixner
2015-04-22 19:11   ` [tip:timers/core] " tip-bot for Thomas Gleixner
2015-04-14 21:09 ` [patch 27/39] hrtimer: Make hrtimer_start() a inline wrapper Thomas Gleixner
2015-04-22 19:12   ` [tip:timers/core] " tip-bot for Thomas Gleixner
2015-04-14 21:09 ` [patch 28/39] hrtimer: Remove bogus hrtimer_active() check Thomas Gleixner
2015-04-22 19:12   ` [tip:timers/core] " tip-bot for Thomas Gleixner
2015-04-14 21:09 ` [patch 29/39] hrtimer: Rmove " Thomas Gleixner
2015-04-22 19:12   ` [tip:timers/core] futex: Remove " tip-bot for Thomas Gleixner
2015-04-14 21:09 ` [patch 30/39] rtmutex: " Thomas Gleixner
2015-04-22 19:13   ` [tip:timers/core] " tip-bot for Thomas Gleixner
2015-04-14 21:09 ` [patch 31/39] net: core: pktgen: " Thomas Gleixner
2015-04-16 16:04   ` David Miller
2015-04-22 19:13   ` [tip:timers/core] net: core: pktgen: Remove bogus hrtimer_active( ) check tip-bot for Thomas Gleixner
2015-04-14 21:09 ` [patch 32/39] alarmtimer: Get rid of unused return value Thomas Gleixner
2015-04-22 19:13   ` [tip:timers/core] " tip-bot for Thomas Gleixner
2015-04-14 21:09 ` [patch 34/39] tick: broadcast-hrtimer: Remove overly clever return value abuse Thomas Gleixner
2015-04-17 10:33   ` Preeti U Murthy
2015-04-22 19:13   ` [tip:timers/core] " tip-bot for Thomas Gleixner
2015-04-14 21:09 ` [patch 35/39] hrtimer: Remove hrtimer_start() return value Thomas Gleixner
2015-04-22 19:14   ` [tip:timers/core] " tip-bot for Thomas Gleixner
2015-04-14 21:09 ` [patch 36/39] hrtimer: Avoid locking in hrtimer_cancel() if timer not active Thomas Gleixner
2015-04-22 19:14   ` [tip:timers/core] " tip-bot for Thomas Gleixner
2015-04-14 21:09 ` [patch 37/39] staging: ozwpan: Remove hrtimer_active() check Thomas Gleixner
2015-04-14 21:09 ` [patch 38/39] timer: Remove pointless return value of do_usleep_range() Thomas Gleixner
2015-04-22 19:14   ` [tip:timers/core] " tip-bot for Thomas Gleixner
2015-04-14 21:09 ` [patch 39/39] timer: Put usleep_range into the __sched section Thomas Gleixner
2015-04-22 19:14   ` [tip:timers/core] " tip-bot for Thomas Gleixner
     [not found] ` <20150414203503.322172417@linutronix.de>
2015-04-14 21:38   ` [patch 33/39] power: reset: ltc2952: Remove bogus hrtimer_start() return value checks Frans Klaver
2015-04-30 15:49   ` Sebastian Reichel
  -- strict thread matches above, loose matches on Subject: below --
2015-04-07  2:10 [PATCH V2 0/2] hrtimer: Iterate only over active clock-bases Viresh Kumar
2015-04-07  2:10 ` [PATCH V2 1/2] hrtimer: update '->active_bases' before calling hrtimer_force_reprogram() Viresh Kumar
2015-04-22 19:04   ` [tip:timers/core] hrtimer: Update active_bases " tip-bot for Viresh Kumar
2015-04-07  2:10 ` [PATCH V2 2/2] hrtimer: Iterate only over active clock-bases Viresh Kumar
2015-04-08 12:10   ` Peter Zijlstra
2015-04-08 20:11   ` Thomas Gleixner
2015-04-09  2:42     ` Viresh Kumar
2015-04-09  6:28     ` [PATCH] hrtimer: Replace cpu_base->active_bases with a direct check of the active list Ingo Molnar
2015-04-09  6:38       ` Ingo Molnar
2015-04-09  6:39         ` [PATCH] hrtimer: Only iterate over active bases in migrate_hrtimers() Ingo Molnar
2015-04-09  6:53         ` [PATCH] hrtimer: Replace timerqueue_getnext() uses with direct access to 'active.next' Ingo Molnar
2015-04-09  7:10         ` [PATCH] hrtimers: Use consistent variable names for timerqueue_node iterations Ingo Molnar
2015-04-09  6:57       ` [PATCH] hrtimer: Replace cpu_base->active_bases with a direct check of the active list Peter Zijlstra
2015-04-09  7:09         ` Ingo Molnar
2015-04-09  7:20           ` Ingo Molnar
2015-04-09  8:58             ` Thomas Gleixner
2015-04-09  8:58             ` Peter Zijlstra
2015-04-09  9:18               ` Thomas Gleixner
2015-04-09  9:31                 ` Peter Zijlstra
2015-04-09  9:56                   ` Thomas Gleixner
2015-04-13  5:53                 ` Preeti U Murthy
2015-04-13  7:53                   ` Thomas Gleixner
2015-04-09  8:03           ` Peter Zijlstra
2015-04-09  8:10             ` Ingo Molnar
2015-04-09  8:53       ` Thomas Gleixner
2015-04-09  9:18         ` Ingo Molnar
