All of lore.kernel.org
 help / color / mirror / Atom feed
* Re: hrtimer (rdmavt RNR timer) was lost
       [not found] <3F128C9216C9B84BB6ED23EF16290AFB634C8C87@CRSMSX101.amr.corp.intel.com>
@ 2018-04-23 12:53 ` Thomas Gleixner
  2018-04-23 13:22   ` Wan, Kaike
  0 siblings, 1 reply; 10+ messages in thread
From: Thomas Gleixner @ 2018-04-23 12:53 UTC (permalink / raw)
  To: Wan, Kaike
  Cc: Marciniszyn, Mike, Dalessandro, Dennis, Weiny, Ira, Fleck, John,
	linux-kernel, linux-rdma

On Fri, 20 Apr 2018, Wan, Kaike wrote:

Please do not send html mail. It's rejected by the list server.

>           <idle>-0     [001] d.h. 3598473310671: hrtimer_start: hrtimer=00000000708914d7 function=rvt_rc_rnr_retry [rdmavt] expires=712172295802 softexpires=712172295802mode=REL base=0000000066dda1ea next=0000000039cae92b
>           <idle>-0     [001] d.h. 3598473311561: hrtimer_post_add: hrtimer=00000000708914d7 function=rvt_rc_rnr_retry [rdmavt] base=0000000066dda1ea next=0000000039cae92b
> 

> --> The target was first added. The expires was 712172295802. The call
>     was made from cpu 1 while the timer base was on cpu 3.

>           <idle>-0     [003] dn.. 3598605305357: hrtimer_start: hrtimer=0000000004346740 function=tick_sched_timer expires=716767000000 softexpires=716767000000mode=ABS|PINNED base=0000000066dda1ea next=00000000708914d7
>           <idle>-0     [003] dn.. 3598605305781: hrtimer_post_add: hrtimer=0000000004346740 function=tick_sched_timer base=0000000066dda1ea next=00000000708914d7
> 
> --> the target timer was the next one to trigger. The call was from cpu 3.
> 
>           <idle>-0     [003] dN.. 3598605306604: hrtimer_pre_remove: hrtimer=0000000004346740 function=tick_sched_timer base=0000000066dda1ea next=00000000708914d7
>           <idle>-0     [003] dN.. 3598605306802: hrtimer_post_remove: hrtimer=0000000004346740 function=tick_sched_timer base=0000000066dda1ea next=00000000708914d7
> 
> --> The tick_sched_timer was removed
> 
>           <idle>-0     [003] dN.. 3598605307236: hrtimer_start: hrtimer=0000000004346740 function=tick_sched_timer expires=712171000000 softexpires=712171000000mode=ABS|PINNED base=0000000066dda1ea next=00000000708914d7
>           <idle>-0     [003] dN.. 3598605307601: hrtimer_post_add: hrtimer=0000000004346740 function=tick_sched_timer base=0000000066dda1ea next=0000000004346740
> 
> --> The tick_sched_timer was added again with a new expires (712171000000
>     < 716767000000), which replaced the target (712172295802) as the
>     head->next.
> 
>           <idle>-0     [002] d.h. 3598605313255: hrtimer_start: hrtimer=00000000662d2dd8 function=rvt_rc_rnr_retry [rdmavt] expires=712172915662 softexpires=712172915662mode=REL base=0000000066dda1ea next=0000000004346740
>           <idle>-0     [002] d.h. 3599538740786: hrtimer_post_add: hrtimer=00000000662d2dd8 function=rvt_rc_rnr_retry [rdmavt] base=0000000066dda1ea next=00000000662d2dd8
>
> 
> --> A new hrtimer (00000000662d2dd8) was added (from cpu 2). Its expires
>     (712172915662) was larger than the target's (712172295802). How could
>     it become the next hrtimer to trigger??? The next pointer was not set
>     properly here.

I have no idea how that can happen assumed that you did not strip anything
from the trace. If you stripped stuff, please send me the full trace or
upload it somewhere.

Can you apply the following debug patch and enable the hrtimer_start trace
point and send me the full trace or upload it somewhere?

Thanks,

	tglx

8<-------------------------
--- a/lib/timerqueue.c
+++ b/lib/timerqueue.c
@@ -57,6 +57,10 @@ bool timerqueue_add(struct timerqueue_he
 	rb_link_node(&node->node, parent, p);
 	rb_insert_color(&node->node, &head->head);
 
+	trace_printk("head %p next %p next->exp %llu node %p node->exp %llu\n",
+		     head, head->next, head->next ? head->next->expires : 0ULL,
+		     node, node->expires);
+
 	if (!head->next || node->expires < head->next->expires) {
 		head->next = node;
 		return true;
@@ -84,6 +88,11 @@ bool timerqueue_del(struct timerqueue_he
 
 		head->next = rb_entry_safe(rbn, struct timerqueue_node, node);
 	}
+
+	trace_printk("head %p next %p next->exp %llu node %p node->exp %llu\n",
+		     head, head->next, head->next ? head->next->expires : 0ULL,
+		     node, node->expires);
+
 	rb_erase(&node->node, &head->head);
 	RB_CLEAR_NODE(&node->node);
 	return head->next != NULL;

^ permalink raw reply	[flat|nested] 10+ messages in thread

* RE: hrtimer (rdmavt RNR timer) was lost
  2018-04-23 12:53 ` hrtimer (rdmavt RNR timer) was lost Thomas Gleixner
@ 2018-04-23 13:22   ` Wan, Kaike
  2018-04-23 13:35     ` Thomas Gleixner
  0 siblings, 1 reply; 10+ messages in thread
From: Wan, Kaike @ 2018-04-23 13:22 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Marciniszyn, Mike, Dalessandro, Dennis, Weiny, Ira, Fleck, John,
	linux-kernel, linux-rdma



> -----Original Message-----
> From: Thomas Gleixner [mailto:tglx@linutronix.de]
> Sent: Monday, April 23, 2018 8:53 AM
> To: Wan, Kaike <kaike.wan@intel.com>
> Cc: Marciniszyn, Mike <mike.marciniszyn@intel.com>; Dalessandro, Dennis
> <dennis.dalessandro@intel.com>; Weiny, Ira <ira.weiny@intel.com>; Fleck,
> John <john.fleck@intel.com>; linux-kernel@vger.kernel.org; linux-
> rdma@vger.kernel.org
> Subject: Re: hrtimer (rdmavt RNR timer) was lost
> 
> On Fri, 20 Apr 2018, Wan, Kaike wrote:
> 
> Please do not send html mail. It's rejected by the list server.
> 
> >           <idle>-0     [001] d.h. 3598473310671: hrtimer_start:
> hrtimer=00000000708914d7 function=rvt_rc_rnr_retry [rdmavt]
> expires=712172295802 softexpires=712172295802mode=REL
> base=0000000066dda1ea next=0000000039cae92b
> >           <idle>-0     [001] d.h. 3598473311561: hrtimer_post_add:
> hrtimer=00000000708914d7 function=rvt_rc_rnr_retry [rdmavt]
> base=0000000066dda1ea next=0000000039cae92b
> >
> 
> > --> The target was first added. The expires was 712172295802. The call
> >     was made from cpu 1 while the timer base was on cpu 3.
> 
> >           <idle>-0     [003] dn.. 3598605305357: hrtimer_start:
> hrtimer=0000000004346740 function=tick_sched_timer
> expires=716767000000 softexpires=716767000000mode=ABS|PINNED
> base=0000000066dda1ea next=00000000708914d7
> >           <idle>-0     [003] dn.. 3598605305781: hrtimer_post_add:
> hrtimer=0000000004346740 function=tick_sched_timer
> base=0000000066dda1ea next=00000000708914d7
> >
> > --> the target timer was the next one to trigger. The call was from cpu 3.
> >
> >           <idle>-0     [003] dN.. 3598605306604: hrtimer_pre_remove:
> hrtimer=0000000004346740 function=tick_sched_timer
> base=0000000066dda1ea next=00000000708914d7
> >           <idle>-0     [003] dN.. 3598605306802: hrtimer_post_remove:
> hrtimer=0000000004346740 function=tick_sched_timer
> base=0000000066dda1ea next=00000000708914d7
> >
> > --> The tick_sched_timer was removed
> >
> >           <idle>-0     [003] dN.. 3598605307236: hrtimer_start:
> hrtimer=0000000004346740 function=tick_sched_timer
> expires=712171000000 softexpires=712171000000mode=ABS|PINNED
> base=0000000066dda1ea next=00000000708914d7
> >           <idle>-0     [003] dN.. 3598605307601: hrtimer_post_add:
> hrtimer=0000000004346740 function=tick_sched_timer
> base=0000000066dda1ea next=0000000004346740
> >
> > --> The tick_sched_timer was added again with a new expires
> > --> (712171000000
> >     < 716767000000), which replaced the target (712172295802) as the
> >     head->next.
> >
> >           <idle>-0     [002] d.h. 3598605313255: hrtimer_start:
> hrtimer=00000000662d2dd8 function=rvt_rc_rnr_retry [rdmavt]
> expires=712172915662 softexpires=712172915662mode=REL
> base=0000000066dda1ea next=0000000004346740
> >           <idle>-0     [002] d.h. 3599538740786: hrtimer_post_add:
> hrtimer=00000000662d2dd8 function=rvt_rc_rnr_retry [rdmavt]
> base=0000000066dda1ea next=00000000662d2dd8
> >
> >
> > --> A new hrtimer (00000000662d2dd8) was added (from cpu 2). Its
> > --> expires
> >     (712172915662) was larger than the target's (712172295802). How could
> >     it become the next hrtimer to trigger??? The next pointer was not set
> >     properly here.
> 
> I have no idea how that can happen assumed that you did not strip anything
> from the trace. If you stripped stuff, please send me the full trace or upload
> it somewhere.
> 
> Can you apply the following debug patch and enable the hrtimer_start trace
> point and send me the full trace or upload it somewhere?

The original trace was about 29GB and I filtered it with  "0000000066dda1ea" (the offending base) to generate a 1.4GB  file that I could open and investigate.  I am not sure how I can send them to you. Do you have somewhere I can upload to?

I can try your debug patch and again I am anticipating a big trace file.

Kaike

> 
> Thanks,
> 
> 	tglx
> 
> 8<-------------------------
> --- a/lib/timerqueue.c
> +++ b/lib/timerqueue.c
> @@ -57,6 +57,10 @@ bool timerqueue_add(struct timerqueue_he
>  	rb_link_node(&node->node, parent, p);
>  	rb_insert_color(&node->node, &head->head);
> 
> +	trace_printk("head %p next %p next->exp %llu node %p node-
> >exp %llu\n",
> +		     head, head->next, head->next ? head->next->expires :
> 0ULL,
> +		     node, node->expires);
> +
>  	if (!head->next || node->expires < head->next->expires) {
>  		head->next = node;
>  		return true;
> @@ -84,6 +88,11 @@ bool timerqueue_del(struct timerqueue_he
> 
>  		head->next = rb_entry_safe(rbn, struct timerqueue_node,
> node);
>  	}
> +
> +	trace_printk("head %p next %p next->exp %llu node %p node-
> >exp %llu\n",
> +		     head, head->next, head->next ? head->next->expires :
> 0ULL,
> +		     node, node->expires);
> +
>  	rb_erase(&node->node, &head->head);
>  	RB_CLEAR_NODE(&node->node);
>  	return head->next != NULL;

^ permalink raw reply	[flat|nested] 10+ messages in thread

* RE: hrtimer (rdmavt RNR timer) was lost
  2018-04-23 13:22   ` Wan, Kaike
@ 2018-04-23 13:35     ` Thomas Gleixner
  2018-04-24 14:54       ` Thomas Gleixner
  0 siblings, 1 reply; 10+ messages in thread
From: Thomas Gleixner @ 2018-04-23 13:35 UTC (permalink / raw)
  To: Wan, Kaike
  Cc: Marciniszyn, Mike, Dalessandro, Dennis, Weiny, Ira, Fleck, John,
	linux-kernel, linux-rdma

On Mon, 23 Apr 2018, Wan, Kaike wrote:
> > Can you apply the following debug patch and enable the hrtimer_start trace
> > point and send me the full trace or upload it somewhere?
> 
> The original trace was about 29GB and I filtered it with
> "0000000066dda1ea" (the offending base) to generate a 1.4GB file that I
> could open and investigate.  I am not sure how I can send them to you. Do
> you have somewhere I can upload to?
>
> I can try your debug patch and again I am anticipating a big trace file.

Well, you can find the spot where the fail happens and then extract the
full thing from 2s before that point to 1s after. That should be reasonably
small and good enough. Let me know when you have it and how big it is
(compressed) and we'll figure something out how to transport it.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 10+ messages in thread

* RE: hrtimer (rdmavt RNR timer) was lost
  2018-04-23 13:35     ` Thomas Gleixner
@ 2018-04-24 14:54       ` Thomas Gleixner
  2018-04-24 16:48         ` Frederic Weisbecker
  2018-04-24 19:22         ` [PATCH] tick/sched: Do not mess with an enqueued hrtimer Thomas Gleixner
  0 siblings, 2 replies; 10+ messages in thread
From: Thomas Gleixner @ 2018-04-24 14:54 UTC (permalink / raw)
  To: Wan, Kaike
  Cc: Marciniszyn, Mike, Dalessandro, Dennis, Weiny, Ira, Fleck, John,
	linux-kernel, linux-rdma, Peter Zijlstra, Anna-Maria Gleixner,
	Frederic Weisbecker, Ingo Molnar

On Mon, 23 Apr 2018, Thomas Gleixner wrote:
> On Mon, 23 Apr 2018, Wan, Kaike wrote:
> > > Can you apply the following debug patch and enable the hrtimer_start trace
> > > point and send me the full trace or upload it somewhere?
> > 
> > The original trace was about 29GB and I filtered it with
> > "0000000066dda1ea" (the offending base) to generate a 1.4GB file that I
> > could open and investigate.  I am not sure how I can send them to you. Do
> > you have somewhere I can upload to?
> >
> > I can try your debug patch and again I am anticipating a big trace file.
> 
> Well, you can find the spot where the fail happens and then extract the
> full thing from 2s before that point to 1s after. That should be reasonably
> small and good enough. Let me know when you have it and how big it is
> (compressed) and we'll figure something out how to transport it.

Thanks for the more complete data which actually let me decode the wreckage.

So you have NO_HZ enabled and looking at the trace then this is related:

          <idle>-0     [003] dN.. 3598605307236: hrtimer_start: hrtimer=0000000004346740 function=tick_sched_timer expires=712171000000 softexpires=712171000000mode=ABS|PINNED base=0000000066dda1ea next=00000000708914d7
          <idle>-0     [003] dN.. 3598605307601: hrtimer_post_add: hrtimer=0000000004346740 function=tick_sched_timer base=0000000066dda1ea next=0000000004346740
	  
          <idle>-0     [002] d.h. 3598605313255: hrtimer_start: hrtimer=00000000662d2dd8 function=rvt_rc_rnr_retry [rdmavt] expires=712172915662 softexpires=712172915662mode=REL base=0000000066dda1ea next=0000000004346740
          <idle>-0     [002] d.h. 3599538740786: hrtimer_post_add: hrtimer=00000000662d2dd8 function=rvt_rc_rnr_retry [rdmavt] base=0000000066dda1ea next=00000000662d2dd8

          <idle>-0     [003] dn.. 3599538742242: hrtimer_pre_remove: hrtimer=0000000004346740 function=tick_sched_timer base=0000000066dda1ea next=00000000662d2dd8
          <idle>-0     [003] dn.. 3599538743084: hrtimer_post_remove: hrtimer=0000000004346740 function=tick_sched_timer base=0000000066dda1ea next=00000000662d2dd8
  
          <idle>-0     [003] dn.. 3599538743830: hrtimer_start: hrtimer=0000000004346740 function=tick_sched_timer expires=716767000000 softexpires=716767000000mode=ABS|PINNED base=0000000066dda1ea next=00000000662d2dd8
          <idle>-0     [003] dn.. 3599538744560: hrtimer_post_add: hrtimer=0000000004346740 function=tick_sched_timer base=0000000066dda1ea next=00000000662d2dd8

And staring at the NOHZ code I'm pretty sure, I know what happens.

CPU 3						CPU 2

idle
start tick_sched_timer 712171000000
						start rdmavt timer 712172915662
						lock(base)
tick_nohz_stop_tick()
tick = 716767000000				timerqueue_add(tmr)

hrtimer_set_expires(&ts->sched_timer, tick); <---- FAIL

						    if (tmr->exp < queue->next->exp)
hrtimer_start(sched_timer)			        queue->next = tmr;
lock(base)
						unlock(base)
timerqueue_remove()
timerqueue_add()
....

So ts->sched_timer is still queued and queue->next is pointing to it, but
then expires is modified. So the other side sees the new expiry time and
makes the rdmavt timer the new queue->next. All f*cked from there.

The problem was introduced with:

  d4af6d933ccf ("nohz: Fix spurious warning when hrtimer and clockevent get out of sync")

The changelog is not really helpful, but I can't see why the expiry time of
ts->sched_timer should be updated before the timer is [re]started in case
of HIGHRES + NOHZ. hrtimer_start() sets the expiry time, so that should be
sufficient. Of course the tick needs to be stored in ts->sched_timer for
the !HIGHRES + HOHZ case, but that's trivial enough to do. Patch against
Linus tree below. It's easy to backport in case you want to run it against
the kernel which made the observation simpler.

I need to come up with integration of hrtimer_set_expires() into debug
objects to catch this kind of horrors. Groan....

Thanks,

	tglx

8<-----------
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -804,12 +804,12 @@ static void tick_nohz_stop_tick(struct t
 		return;
 	}
 
-	hrtimer_set_expires(&ts->sched_timer, tick);
-
-	if (ts->nohz_mode == NOHZ_MODE_HIGHRES)
-		hrtimer_start_expires(&ts->sched_timer, HRTIMER_MODE_ABS_PINNED);
-	else
+	if (ts->nohz_mode == NOHZ_MODE_HIGHRES) {
+		hrtimer_start(&ts->sched_timer, tick, HRTIMER_MODE_ABS_PINNED);
+	} else {
+		hrtimer_set_expires(&ts->sched_timer, tick);
 		tick_program_event(tick, 1);
+	}
 }
 
 static void tick_nohz_retain_tick(struct tick_sched *ts)

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: hrtimer (rdmavt RNR timer) was lost
  2018-04-24 14:54       ` Thomas Gleixner
@ 2018-04-24 16:48         ` Frederic Weisbecker
  2018-04-24 19:22         ` [PATCH] tick/sched: Do not mess with an enqueued hrtimer Thomas Gleixner
  1 sibling, 0 replies; 10+ messages in thread
From: Frederic Weisbecker @ 2018-04-24 16:48 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Wan, Kaike, Marciniszyn, Mike, Dalessandro, Dennis, Weiny, Ira,
	Fleck, John, linux-kernel, linux-rdma, Peter Zijlstra,
	Anna-Maria Gleixner, Frederic Weisbecker, Ingo Molnar

On Tue, Apr 24, 2018 at 04:54:58PM +0200, Thomas Gleixner wrote:
> On Mon, 23 Apr 2018, Thomas Gleixner wrote:
> > On Mon, 23 Apr 2018, Wan, Kaike wrote:
> > > > Can you apply the following debug patch and enable the hrtimer_start trace
> > > > point and send me the full trace or upload it somewhere?
> > > 
> > > The original trace was about 29GB and I filtered it with
> > > "0000000066dda1ea" (the offending base) to generate a 1.4GB file that I
> > > could open and investigate.  I am not sure how I can send them to you. Do
> > > you have somewhere I can upload to?
> > >
> > > I can try your debug patch and again I am anticipating a big trace file.
> > 
> > Well, you can find the spot where the fail happens and then extract the
> > full thing from 2s before that point to 1s after. That should be reasonably
> > small and good enough. Let me know when you have it and how big it is
> > (compressed) and we'll figure something out how to transport it.
> 
> Thanks for the more complete data which actually let me decode the wreckage.
> 
> So you have NO_HZ enabled and looking at the trace then this is related:
> 
>           <idle>-0     [003] dN.. 3598605307236: hrtimer_start: hrtimer=0000000004346740 function=tick_sched_timer expires=712171000000 softexpires=712171000000mode=ABS|PINNED base=0000000066dda1ea next=00000000708914d7
>           <idle>-0     [003] dN.. 3598605307601: hrtimer_post_add: hrtimer=0000000004346740 function=tick_sched_timer base=0000000066dda1ea next=0000000004346740
> 	  
>           <idle>-0     [002] d.h. 3598605313255: hrtimer_start: hrtimer=00000000662d2dd8 function=rvt_rc_rnr_retry [rdmavt] expires=712172915662 softexpires=712172915662mode=REL base=0000000066dda1ea next=0000000004346740
>           <idle>-0     [002] d.h. 3599538740786: hrtimer_post_add: hrtimer=00000000662d2dd8 function=rvt_rc_rnr_retry [rdmavt] base=0000000066dda1ea next=00000000662d2dd8
> 
>           <idle>-0     [003] dn.. 3599538742242: hrtimer_pre_remove: hrtimer=0000000004346740 function=tick_sched_timer base=0000000066dda1ea next=00000000662d2dd8
>           <idle>-0     [003] dn.. 3599538743084: hrtimer_post_remove: hrtimer=0000000004346740 function=tick_sched_timer base=0000000066dda1ea next=00000000662d2dd8
>   
>           <idle>-0     [003] dn.. 3599538743830: hrtimer_start: hrtimer=0000000004346740 function=tick_sched_timer expires=716767000000 softexpires=716767000000mode=ABS|PINNED base=0000000066dda1ea next=00000000662d2dd8
>           <idle>-0     [003] dn.. 3599538744560: hrtimer_post_add: hrtimer=0000000004346740 function=tick_sched_timer base=0000000066dda1ea next=00000000662d2dd8
> 
> And staring at the NOHZ code I'm pretty sure, I know what happens.
> 
> CPU 3						CPU 2
> 
> idle
> start tick_sched_timer 712171000000
> 						start rdmavt timer 712172915662
> 						lock(base)
> tick_nohz_stop_tick()
> tick = 716767000000				timerqueue_add(tmr)
> 
> hrtimer_set_expires(&ts->sched_timer, tick); <---- FAIL
> 
> 						    if (tmr->exp < queue->next->exp)
> hrtimer_start(sched_timer)			        queue->next = tmr;
> lock(base)
> 						unlock(base)
> timerqueue_remove()
> timerqueue_add()
> ....
> 
> So ts->sched_timer is still queued and queue->next is pointing to it, but
> then expires is modified. So the other side sees the new expiry time and
> makes the rdmavt timer the new queue->next. All f*cked from there.
> 
> The problem was introduced with:
> 
>   d4af6d933ccf ("nohz: Fix spurious warning when hrtimer and clockevent get out of sync")
> 
> The changelog is not really helpful, but I can't see why the expiry time of
> ts->sched_timer should be updated before the timer is [re]started in case
> of HIGHRES + NOHZ. hrtimer_start() sets the expiry time, so that should be
> sufficient. Of course the tick needs to be stored in ts->sched_timer for
> the !HIGHRES + HOHZ case, but that's trivial enough to do. Patch against
> Linus tree below. It's easy to backport in case you want to run it against
> the kernel which made the observation simpler.
> 
> I need to come up with integration of hrtimer_set_expires() into debug
> objects to catch this kind of horrors. Groan....

Sorry for all that, that must have been hairy enough to debug :-(

I thought that I could fiddle with the sched_timer because nothing else
is supposed to touch it concurrently, but that forgot about the whole
queue locked under cpu_base. My fault.

The fix looks all good to me, thanks!

ACK.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH] tick/sched: Do not mess with an enqueued hrtimer
@ 2018-04-24 19:22         ` Thomas Gleixner
  2018-04-25 13:22           ` Frederic Weisbecker
                             ` (3 more replies)
  0 siblings, 4 replies; 10+ messages in thread
From: Thomas Gleixner @ 2018-04-24 19:22 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Wan, Kaike, Marciniszyn, Mike, Dalessandro, Dennis, Weiny, Ira,
	Fleck, John, linux-kernel, linux-rdma, Peter Zijlstra,
	Anna-Maria Gleixner, Frederic Weisbecker, Ingo Molnar

Kaike reported that in tests rdma hrtimers occasionaly stopped working. He
did great debugging, which provided enough context to decode the problem.

CPU 3			     	      	     CPU 2

idle
start sched_timer expires = 712171000000
 queue->next = sched_timer
					    start rdmavt timer. expires = 712172915662
					    lock(baseof(CPU3))
tick_nohz_stop_tick()
tick = 716767000000			    timerqueue_add(tmr)

hrtimer_set_expires(sched_timer, tick);
  sched_timer->expires = 716767000000  <---- FAIL
					     if (tmr->expires < queue->next->expires)
hrtimer_start(sched_timer)		          queue->next = tmr;
lock(baseof(CPU3))
					     unlock(baseof(CPU3))
timerqueue_remove()
timerqueue_add()

ts->sched_timer is queued and queue->next is pointing to it, but then
ts->sched_timer.expires is modified.

This not only corrupts the ordering of the timerqueue RB tree, it also
makes CPU2 see the new expiry time of timerqueue->next->expires when
checking whether timerqueue->next needs to be updated. So CPU2 sees that
the rdma timer is earlier than timerqueue->next and sets the rdma timer as
new next.

Depending on whether it had also seen the new time at RB tree enqueue, it
might have queued the rdma timer at the wrong place and then after removing
the sched_timer the RB tree is completely hosed.

The problem was introduced with a commit which tried to solve inconsistency
between the hrtimer in the tick_sched data and the underlying hardware
clockevent. It split out hrtimer_set_expires() to store the new tick time
in both the NOHZ and the NOHZ + HIGHRES case, but missed the fact that in
the NOHZ + HIGHRES case the hrtimer might still be queued.

Use hrtimer_start(timer, tick...) for the NOHZ + HIGHRES case which sets
timer->expires after canceling the timer and move the hrtimer_set_expires()
invocation into the NOHZ only code path which is not affected as it merily
uses the hrtimer as next event storage so code pathes can be shared with
the NOHZ + HIGHRES case.

Fixes: d4af6d933ccf ("nohz: Fix spurious warning when hrtimer and clockevent get out of sync")
Reported-by: "Wan Kaike" <kaike.wan@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: "Marciniszyn Mike" <mike.marciniszyn@intel.com>
Cc: Anna-Maria Gleixner <anna-maria@linutronix.de>
Cc: linux-rdma@vger.kernel.org
Cc: "Dalessandro Dennis" <dennis.dalessandro@intel.com>
Cc: "Fleck John" <john.fleck@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: "Weiny Ira" <ira.weiny@intel.com>
Cc: "linux-rdma@vger.kernel.org"
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1804241637390.1679@nanos.tec.linutronix.de

---
 kernel/time/tick-sched.c |   10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -804,12 +804,12 @@ static void tick_nohz_stop_tick(struct t
 		return;
 	}
 
-	hrtimer_set_expires(&ts->sched_timer, tick);
-
-	if (ts->nohz_mode == NOHZ_MODE_HIGHRES)
-		hrtimer_start_expires(&ts->sched_timer, HRTIMER_MODE_ABS_PINNED);
-	else
+	if (ts->nohz_mode == NOHZ_MODE_HIGHRES) {
+		hrtimer_start(&ts->sched_timer, tick, HRTIMER_MODE_ABS_PINNED);
+	} else {
+		hrtimer_set_expires(&ts->sched_timer, tick);
 		tick_program_event(tick, 1);
+	}
 }
 
 static void tick_nohz_retain_tick(struct tick_sched *ts)

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] tick/sched: Do not mess with an enqueued hrtimer
  2018-04-24 19:22         ` [PATCH] tick/sched: Do not mess with an enqueued hrtimer Thomas Gleixner
@ 2018-04-25 13:22           ` Frederic Weisbecker
  2018-04-25 14:21           ` [tip:timers/urgent] " tip-bot for Thomas Gleixner
                             ` (2 subsequent siblings)
  3 siblings, 0 replies; 10+ messages in thread
From: Frederic Weisbecker @ 2018-04-25 13:22 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Wan, Kaike, Marciniszyn, Mike, Dalessandro, Dennis, Weiny, Ira,
	Fleck, John, linux-kernel, linux-rdma, Peter Zijlstra,
	Anna-Maria Gleixner, Frederic Weisbecker, Ingo Molnar

On Tue, Apr 24, 2018 at 09:22:18PM +0200, Thomas Gleixner wrote:
> Kaike reported that in tests rdma hrtimers occasionaly stopped working. He
> did great debugging, which provided enough context to decode the problem.
> 
> CPU 3			     	      	     CPU 2
> 
> idle
> start sched_timer expires = 712171000000
>  queue->next = sched_timer
> 					    start rdmavt timer. expires = 712172915662
> 					    lock(baseof(CPU3))
> tick_nohz_stop_tick()
> tick = 716767000000			    timerqueue_add(tmr)
> 
> hrtimer_set_expires(sched_timer, tick);
>   sched_timer->expires = 716767000000  <---- FAIL
> 					     if (tmr->expires < queue->next->expires)
> hrtimer_start(sched_timer)		          queue->next = tmr;
> lock(baseof(CPU3))
> 					     unlock(baseof(CPU3))
> timerqueue_remove()
> timerqueue_add()
> 
> ts->sched_timer is queued and queue->next is pointing to it, but then
> ts->sched_timer.expires is modified.
> 
> This not only corrupts the ordering of the timerqueue RB tree, it also
> makes CPU2 see the new expiry time of timerqueue->next->expires when
> checking whether timerqueue->next needs to be updated. So CPU2 sees that
> the rdma timer is earlier than timerqueue->next and sets the rdma timer as
> new next.
> 
> Depending on whether it had also seen the new time at RB tree enqueue, it
> might have queued the rdma timer at the wrong place and then after removing
> the sched_timer the RB tree is completely hosed.
> 
> The problem was introduced with a commit which tried to solve inconsistency
> between the hrtimer in the tick_sched data and the underlying hardware
> clockevent. It split out hrtimer_set_expires() to store the new tick time
> in both the NOHZ and the NOHZ + HIGHRES case, but missed the fact that in
> the NOHZ + HIGHRES case the hrtimer might still be queued.
> 
> Use hrtimer_start(timer, tick...) for the NOHZ + HIGHRES case which sets
> timer->expires after canceling the timer and move the hrtimer_set_expires()
> invocation into the NOHZ only code path which is not affected as it merily
> uses the hrtimer as next event storage so code pathes can be shared with
> the NOHZ + HIGHRES case.
> 
> Fixes: d4af6d933ccf ("nohz: Fix spurious warning when hrtimer and clockevent get out of sync")
> Reported-by: "Wan Kaike" <kaike.wan@intel.com>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

Acked-by: Frederic Weisbecker <frederic@kernel.org>

Thanks!

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [tip:timers/urgent] tick/sched: Do not mess with an enqueued hrtimer
  2018-04-24 19:22         ` [PATCH] tick/sched: Do not mess with an enqueued hrtimer Thomas Gleixner
  2018-04-25 13:22           ` Frederic Weisbecker
@ 2018-04-25 14:21           ` tip-bot for Thomas Gleixner
  2018-04-26 12:56           ` [PATCH] " Wan, Kaike
  2018-04-26 12:57           ` [tip:timers/urgent] " tip-bot for Thomas Gleixner
  3 siblings, 0 replies; 10+ messages in thread
From: tip-bot for Thomas Gleixner @ 2018-04-25 14:21 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: fweisbec, linux-kernel, ira.weiny, peterz, john.fleck,
	mike.marciniszyn, tglx, frederic, hpa, anna-maria, mingo,
	dennis.dalessandro, kaike.wan

Commit-ID:  51fc4be41c7ea7a00f6a17ad15ac9ea684d07921
Gitweb:     https://git.kernel.org/tip/51fc4be41c7ea7a00f6a17ad15ac9ea684d07921
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 24 Apr 2018 21:22:18 +0200
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Wed, 25 Apr 2018 16:11:58 +0200

tick/sched: Do not mess with an enqueued hrtimer

Kaike reported that in tests rdma hrtimers occasionaly stopped working. He
did great debugging, which provided enough context to decode the problem.

CPU 3			     	      	     CPU 2

idle
start sched_timer expires = 712171000000
 queue->next = sched_timer
					    start rdmavt timer. expires = 712172915662
					    lock(baseof(CPU3))
tick_nohz_stop_tick()
tick = 716767000000			    timerqueue_add(tmr)

hrtimer_set_expires(sched_timer, tick);
  sched_timer->expires = 716767000000  <---- FAIL
					     if (tmr->expires < queue->next->expires)
hrtimer_start(sched_timer)		          queue->next = tmr;
lock(baseof(CPU3))
					     unlock(baseof(CPU3))
timerqueue_remove()
timerqueue_add()

ts->sched_timer is queued and queue->next is pointing to it, but then
ts->sched_timer.expires is modified.

This not only corrupts the ordering of the timerqueue RB tree, it also
makes CPU2 see the new expiry time of timerqueue->next->expires when
checking whether timerqueue->next needs to be updated. So CPU2 sees that
the rdma timer is earlier than timerqueue->next and sets the rdma timer as
new next.

Depending on whether it had also seen the new time at RB tree enqueue, it
might have queued the rdma timer at the wrong place and then after removing
the sched_timer the RB tree is completely hosed.

The problem was introduced with a commit which tried to solve inconsistency
between the hrtimer in the tick_sched data and the underlying hardware
clockevent. It split out hrtimer_set_expires() to store the new tick time
in both the NOHZ and the NOHZ + HIGHRES case, but missed the fact that in
the NOHZ + HIGHRES case the hrtimer might still be queued.

Use hrtimer_start(timer, tick...) for the NOHZ + HIGHRES case which sets
timer->expires after canceling the timer and move the hrtimer_set_expires()
invocation into the NOHZ only code path which is not affected as it merily
uses the hrtimer as next event storage so code pathes can be shared with
the NOHZ + HIGHRES case.

Fixes: d4af6d933ccf ("nohz: Fix spurious warning when hrtimer and clockevent get out of sync")
Reported-by: "Wan Kaike" <kaike.wan@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Frederic Weisbecker <frederic@kernel.org>
Cc: "Marciniszyn Mike" <mike.marciniszyn@intel.com>
Cc: Anna-Maria Gleixner <anna-maria@linutronix.de>
Cc: linux-rdma@vger.kernel.org
Cc: "Dalessandro Dennis" <dennis.dalessandro@intel.com>
Cc: "Fleck John" <john.fleck@intel.com>
Cc: stable@vger.kernel.org
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: "Weiny Ira" <ira.weiny@intel.com>
Cc: "linux-rdma@vger.kernel.org"
Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1804241637390.1679@nanos.tec.linutronix.de
Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1804242119210.1597@nanos.tec.linutronix.de

---
 kernel/time/tick-sched.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index e35a6fced00c..da9455a6b42b 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -795,12 +795,12 @@ static void tick_nohz_stop_tick(struct tick_sched *ts, int cpu)
 		return;
 	}
 
-	hrtimer_set_expires(&ts->sched_timer, tick);
-
-	if (ts->nohz_mode == NOHZ_MODE_HIGHRES)
-		hrtimer_start_expires(&ts->sched_timer, HRTIMER_MODE_ABS_PINNED);
-	else
+	if (ts->nohz_mode == NOHZ_MODE_HIGHRES) {
+		hrtimer_start(&ts->sched_timer, tick, HRTIMER_MODE_ABS_PINNED);
+	} else {
+		hrtimer_set_expires(&ts->sched_timer, tick);
 		tick_program_event(tick, 1);
+	}
 }
 
 static void tick_nohz_retain_tick(struct tick_sched *ts)

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* RE: [PATCH] tick/sched: Do not mess with an enqueued hrtimer
  2018-04-24 19:22         ` [PATCH] tick/sched: Do not mess with an enqueued hrtimer Thomas Gleixner
  2018-04-25 13:22           ` Frederic Weisbecker
  2018-04-25 14:21           ` [tip:timers/urgent] " tip-bot for Thomas Gleixner
@ 2018-04-26 12:56           ` Wan, Kaike
  2018-04-26 12:57           ` [tip:timers/urgent] " tip-bot for Thomas Gleixner
  3 siblings, 0 replies; 10+ messages in thread
From: Wan, Kaike @ 2018-04-26 12:56 UTC (permalink / raw)
  To: Thomas Gleixner, Frederic Weisbecker
  Cc: Marciniszyn, Mike, Dalessandro, Dennis, Weiny, Ira, Fleck, John,
	linux-kernel, linux-rdma, Peter Zijlstra, Anna-Maria Gleixner,
	Frederic Weisbecker, Ingo Molnar

> From: linux-rdma-owner@vger.kernel.org [mailto:linux-rdma-
> owner@vger.kernel.org] On Behalf Of Thomas Gleixner
> Sent: Tuesday, April 24, 2018 3:22 PM
> To: Frederic Weisbecker <frederic@kernel.org>
> Cc: Wan, Kaike <kaike.wan@intel.com>; Marciniszyn, Mike
> <mike.marciniszyn@intel.com>; Dalessandro, Dennis
> <dennis.dalessandro@intel.com>; Weiny, Ira <ira.weiny@intel.com>; Fleck,
> John <john.fleck@intel.com>; linux-kernel@vger.kernel.org; linux-
> rdma@vger.kernel.org; Peter Zijlstra <peterz@infradead.org>; Anna-Maria
> Gleixner <anna-maria@linutronix.de>; Frederic Weisbecker
> <fweisbec@gmail.com>; Ingo Molnar <mingo@kernel.org>
> Subject: [PATCH] tick/sched: Do not mess with an enqueued hrtimer
> 
> Kaike reported that in tests rdma hrtimers occasionaly stopped working. He
> did great debugging, which provided enough context to decode the problem.
> 
> CPU 3			     	      	     CPU 2
> 
> idle
> start sched_timer expires = 712171000000  queue->next = sched_timer
> 					    start rdmavt timer. expires =
> 712172915662
> 					    lock(baseof(CPU3))
> tick_nohz_stop_tick()
> tick = 716767000000			    timerqueue_add(tmr)
> 
> hrtimer_set_expires(sched_timer, tick);
>   sched_timer->expires = 716767000000  <---- FAIL
> 					     if (tmr->expires < queue->next-
> >expires)
> hrtimer_start(sched_timer)		          queue->next = tmr;
> lock(baseof(CPU3))
> 					     unlock(baseof(CPU3))
> timerqueue_remove()
> timerqueue_add()
> 
> ts->sched_timer is queued and queue->next is pointing to it, but then
> ts->sched_timer.expires is modified.
> 
> This not only corrupts the ordering of the timerqueue RB tree, it also makes
> CPU2 see the new expiry time of timerqueue->next->expires when checking
> whether timerqueue->next needs to be updated. So CPU2 sees that the
> rdma timer is earlier than timerqueue->next and sets the rdma timer as new
> next.
> 
> Depending on whether it had also seen the new time at RB tree enqueue, it
> might have queued the rdma timer at the wrong place and then after
> removing the sched_timer the RB tree is completely hosed.
> 
> The problem was introduced with a commit which tried to solve
> inconsistency between the hrtimer in the tick_sched data and the underlying
> hardware clockevent. It split out hrtimer_set_expires() to store the new tick
> time in both the NOHZ and the NOHZ + HIGHRES case, but missed the fact
> that in the NOHZ + HIGHRES case the hrtimer might still be queued.
> 
> Use hrtimer_start(timer, tick...) for the NOHZ + HIGHRES case which sets
> timer->expires after canceling the timer and move the
> timer->hrtimer_set_expires()
> invocation into the NOHZ only code path which is not affected as it merily
> uses the hrtimer as next event storage so code pathes can be shared with
> the NOHZ + HIGHRES case.
> 
> Fixes: d4af6d933ccf ("nohz: Fix spurious warning when hrtimer and
> clockevent get out of sync")
> Reported-by: "Wan Kaike" <kaike.wan@intel.com>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: "Marciniszyn Mike" <mike.marciniszyn@intel.com>
> Cc: Anna-Maria Gleixner <anna-maria@linutronix.de>
> Cc: linux-rdma@vger.kernel.org
> Cc: "Dalessandro Dennis" <dennis.dalessandro@intel.com>
> Cc: "Fleck John" <john.fleck@intel.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Frederic Weisbecker <fweisbec@gmail.com>
> Cc: "Weiny Ira" <ira.weiny@intel.com>
> Cc: "linux-rdma@vger.kernel.org"
> Cc: stable@vger.kernel.org
> Link:
> https://lkml.kernel.org/r/alpine.DEB.2.21.1804241637390.1679@nanos.tec.lin
> utronix.de
> 
> ---
>  kernel/time/tick-sched.c |   10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> --- a/kernel/time/tick-sched.c
> +++ b/kernel/time/tick-sched.c
> @@ -804,12 +804,12 @@ static void tick_nohz_stop_tick(struct t
>  		return;
>  	}
> 
> -	hrtimer_set_expires(&ts->sched_timer, tick);
> -
> -	if (ts->nohz_mode == NOHZ_MODE_HIGHRES)
> -		hrtimer_start_expires(&ts->sched_timer,
> HRTIMER_MODE_ABS_PINNED);
> -	else
> +	if (ts->nohz_mode == NOHZ_MODE_HIGHRES) {
> +		hrtimer_start(&ts->sched_timer, tick,
> HRTIMER_MODE_ABS_PINNED);
> +	} else {
> +		hrtimer_set_expires(&ts->sched_timer, tick);
>  		tick_program_event(tick, 1);
> +	}
>  }
> 
>  static void tick_nohz_retain_tick(struct tick_sched *ts)
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the
> body of a message to majordomo@vger.kernel.org More majordomo info at
> http://vger.kernel.org/majordomo-info.html


Tested-by: Kaike Wan <Kaike.wan@intel.com>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [tip:timers/urgent] tick/sched: Do not mess with an enqueued hrtimer
  2018-04-24 19:22         ` [PATCH] tick/sched: Do not mess with an enqueued hrtimer Thomas Gleixner
                             ` (2 preceding siblings ...)
  2018-04-26 12:56           ` [PATCH] " Wan, Kaike
@ 2018-04-26 12:57           ` tip-bot for Thomas Gleixner
  3 siblings, 0 replies; 10+ messages in thread
From: tip-bot for Thomas Gleixner @ 2018-04-26 12:57 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: frederic, mike.marciniszyn, linux-kernel, hpa, kaike.wan, tglx,
	dennis.dalessandro, fweisbec, anna-maria, john.fleck, ira.weiny,
	mingo, peterz

Commit-ID:  1f71addd34f4c442bec7d7c749acc1beb58126f2
Gitweb:     https://git.kernel.org/tip/1f71addd34f4c442bec7d7c749acc1beb58126f2
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 24 Apr 2018 21:22:18 +0200
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Thu, 26 Apr 2018 14:53:32 +0200

tick/sched: Do not mess with an enqueued hrtimer

Kaike reported that in tests rdma hrtimers occasionaly stopped working. He
did great debugging, which provided enough context to decode the problem.

CPU 3			     	      	     CPU 2

idle
start sched_timer expires = 712171000000
 queue->next = sched_timer
					    start rdmavt timer. expires = 712172915662
					    lock(baseof(CPU3))
tick_nohz_stop_tick()
tick = 716767000000			    timerqueue_add(tmr)

hrtimer_set_expires(sched_timer, tick);
  sched_timer->expires = 716767000000  <---- FAIL
					     if (tmr->expires < queue->next->expires)
hrtimer_start(sched_timer)		          queue->next = tmr;
lock(baseof(CPU3))
					     unlock(baseof(CPU3))
timerqueue_remove()
timerqueue_add()

ts->sched_timer is queued and queue->next is pointing to it, but then
ts->sched_timer.expires is modified.

This not only corrupts the ordering of the timerqueue RB tree, it also
makes CPU2 see the new expiry time of timerqueue->next->expires when
checking whether timerqueue->next needs to be updated. So CPU2 sees that
the rdma timer is earlier than timerqueue->next and sets the rdma timer as
new next.

Depending on whether it had also seen the new time at RB tree enqueue, it
might have queued the rdma timer at the wrong place and then after removing
the sched_timer the RB tree is completely hosed.

The problem was introduced with a commit which tried to solve inconsistency
between the hrtimer in the tick_sched data and the underlying hardware
clockevent. It split out hrtimer_set_expires() to store the new tick time
in both the NOHZ and the NOHZ + HIGHRES case, but missed the fact that in
the NOHZ + HIGHRES case the hrtimer might still be queued.

Use hrtimer_start(timer, tick...) for the NOHZ + HIGHRES case which sets
timer->expires after canceling the timer and move the hrtimer_set_expires()
invocation into the NOHZ only code path which is not affected as it merily
uses the hrtimer as next event storage so code pathes can be shared with
the NOHZ + HIGHRES case.

Fixes: d4af6d933ccf ("nohz: Fix spurious warning when hrtimer and clockevent get out of sync")
Reported-by: "Wan Kaike" <kaike.wan@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Frederic Weisbecker <frederic@kernel.org>
Cc: "Marciniszyn Mike" <mike.marciniszyn@intel.com>
Cc: Anna-Maria Gleixner <anna-maria@linutronix.de>
Cc: linux-rdma@vger.kernel.org
Cc: "Dalessandro Dennis" <dennis.dalessandro@intel.com>
Cc: "Fleck John" <john.fleck@intel.com>
Cc: stable@vger.kernel.org
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: "Weiny Ira" <ira.weiny@intel.com>
Cc: "linux-rdma@vger.kernel.org"
Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1804241637390.1679@nanos.tec.linutronix.de
Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1804242119210.1597@nanos.tec.linutronix.de


---
 kernel/time/tick-sched.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 646645e981f9..d31bec177fa5 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -804,12 +804,12 @@ static void tick_nohz_stop_tick(struct tick_sched *ts, int cpu)
 		return;
 	}
 
-	hrtimer_set_expires(&ts->sched_timer, tick);
-
-	if (ts->nohz_mode == NOHZ_MODE_HIGHRES)
-		hrtimer_start_expires(&ts->sched_timer, HRTIMER_MODE_ABS_PINNED);
-	else
+	if (ts->nohz_mode == NOHZ_MODE_HIGHRES) {
+		hrtimer_start(&ts->sched_timer, tick, HRTIMER_MODE_ABS_PINNED);
+	} else {
+		hrtimer_set_expires(&ts->sched_timer, tick);
 		tick_program_event(tick, 1);
+	}
 }
 
 static void tick_nohz_retain_tick(struct tick_sched *ts)

^ permalink raw reply related	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2018-04-26 12:59 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <3F128C9216C9B84BB6ED23EF16290AFB634C8C87@CRSMSX101.amr.corp.intel.com>
2018-04-23 12:53 ` hrtimer (rdmavt RNR timer) was lost Thomas Gleixner
2018-04-23 13:22   ` Wan, Kaike
2018-04-23 13:35     ` Thomas Gleixner
2018-04-24 14:54       ` Thomas Gleixner
2018-04-24 16:48         ` Frederic Weisbecker
2018-04-24 19:22         ` [PATCH] tick/sched: Do not mess with an enqueued hrtimer Thomas Gleixner
2018-04-25 13:22           ` Frederic Weisbecker
2018-04-25 14:21           ` [tip:timers/urgent] " tip-bot for Thomas Gleixner
2018-04-26 12:56           ` [PATCH] " Wan, Kaike
2018-04-26 12:57           ` [tip:timers/urgent] " tip-bot for Thomas Gleixner

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.