From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1036915AbdEZCK7 (ORCPT <rfc822;w@1wt.eu>);
        Thu, 25 May 2017 22:10:59 -0400
Received: from mail-wm0-f67.google.com ([74.125.82.67]:33931 "EHLO
        mail-wm0-f67.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S932931AbdEZCK4 (ORCPT
        <rfc822;linux-kernel@vger.kernel.org>);
        Thu, 25 May 2017 22:10:56 -0400
Date: Fri, 26 May 2017 04:10:53 +0200
From: Frederic Weisbecker <fweisbec@gmail.com>
To: Ingo Molnar <mingo@kernel.org>
Cc: LKML <linux-kernel@vger.kernel.org>,
        Peter Zijlstra <peterz@infradead.org>, Rik van Riel <riel@redhat.com>,
        James Hartsock <hartsjc@redhat.com>,
        Thomas Gleixner <tglx@linutronix.de>, stable@vger.kernel.org,
        Tim Wright <tim@binbash.co.uk>, Pavel Machek <pavel@ucw.cz>
Subject: Re: [PATCH 0/2] nohz: Deal with clock reprogram skipping issues v3
Message-ID: <20170526021051.GA27061@lerouge>
References: <1495201910-12466-1-git-send-email-fweisbec@gmail.com>
 <20170523072508.3bzyipgwnyjcbef6@gmail.com>
 <20170523131042.GA26103@lerouge>
 <20170524071628.ftwjuh27jpdo6qkm@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20170524071628.ftwjuh27jpdo6qkm@gmail.com>
User-Agent: Mutt/1.5.24 (2015-08-30)
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, May 24, 2017 at 09:16:28AM +0200, Ingo Molnar wrote:
> So the interdiff between your two patches and the 3 commits already queued up is:
> 
> diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> index e3043873fcdc..30253ed0380b 100644
> --- a/kernel/time/tick-sched.c
> +++ b/kernel/time/tick-sched.c
> @@ -150,12 +150,6 @@ static void tick_sched_handle(struct tick_sched *ts, struct pt_regs *regs)
>  		touch_softlockup_watchdog_sched();
>  		if (is_idle_task(current))
>  			ts->idle_jiffies++;
> -		/*
> -		 * In case the current tick fired too early past its expected
> -		 * expiration, make sure we don't bypass the next clock reprogramming
> -		 * to the same deadline.
> -		 */
> -		ts->next_tick = 0;
>  	}
>  #endif
>  	update_process_times(user_mode(regs));
> @@ -1103,8 +1097,15 @@ static void tick_nohz_handler(struct clock_event_device *dev)
>  	tick_sched_handle(ts, regs);
>  
>  	/* No need to reprogram if we are running tickless  */
> -	if (unlikely(ts->tick_stopped))
> +	if (unlikely(ts->tick_stopped)) {
> +		/*
> +		 * In case the current tick fired too early past its expected
> +		 * expiration, make sure we don't bypass the next clock reprogramming
> +		 * to the same deadline.
> +		 */
> +		ts->next_tick = 0;
>  		return;
> +	}
>  
>  	hrtimer_forward(&ts->sched_timer, now, tick_period);
>  	tick_program_event(hrtimer_get_expires(&ts->sched_timer), 1);
> @@ -1202,12 +1203,17 @@ static enum hrtimer_restart tick_sched_timer(struct hrtimer *timer)
>  	 */
>  	if (regs)
>  		tick_sched_handle(ts, regs);
> -	else
> -		ts->next_tick = 0;
>  
>  	/* No need to reprogram if we are in idle or full dynticks mode */
> -	if (unlikely(ts->tick_stopped))
> +	if (unlikely(ts->tick_stopped)) {
> +		/*
> +		 * In case the current tick fired too early past its expected
> +		 * expiration, make sure we don't bypass the next clock reprogramming
> +		 * to the same deadline.
> +		 */
> +		ts->next_tick = 0;
>  		return HRTIMER_NORESTART;
> +	}
>  
>  	hrtimer_forward(timer, now, tick_period);
>  
> 
> ... so the two are not the same - I'd rather not rebase it, I'd like to keep what 
> is working, we had problems with these changes before ...
> 
> If you'd like the changes in this interdiff to be applied as well, please add a 
> changelog to it and post it as a fourth patch.
> 
> Thanks,
> 
> 	Ingo

So if you like, you can replace the top patch with the following. It's exactly
the same code, I've only added a comment and a changelog:

---
>>From 72956bf08c3b2e506a5ce5ec4faac9fd6b097307 Mon Sep 17 00:00:00 2001
From: Frederic Weisbecker <fweisbec@gmail.com>
Date: Mon, 15 May 2017 14:56:50 +0200
Subject: [PATCH] nohz: Reset next_tick cache even when the timer has no regs

The tick IRQ regs can be NULL if hrtimer_interrupt() is called from
non-interrupt contexts (ex: hotplug CPU down). For such very special
path we forget to clean the cached next tick deadline. If we are in
dynticks mode and the actual timer deadline is ahead of us, we might
perform a buggy bypass of the next clock reprogramming.

In fact since CPU down is the only user I'm aware of, this fix is likely
unnecessary as dying CPUs already clean their tick deadline cache. But
given how hard it is to debug such timer cache related issue, we should
never be short on paranoid measures.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/time/tick-sched.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 764d290..ed18ca5 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -1200,8 +1200,17 @@ static enum hrtimer_restart tick_sched_timer(struct hrtimer *timer)
 	 * Do not call, when we are not in irq context and have
 	 * no valid regs pointer
 	 */
-	if (regs)
+	if (regs) {
 		tick_sched_handle(ts, regs);
+	} else {
+		/*
+		 * IRQ regs are NULL if hrtimer_interrupt() is called from
+		 * non-interrupt contexts (ex: hotplug cpu down). Make sure to
+		 * clean the cached next tick deadline to avoid buggy bypass of
+		 * clock reprog.
+		 */
+		ts->next_tick = 0;
+	}
 
 	/* No need to reprogram if we are in idle or full dynticks mode */
 	if (unlikely(ts->tick_stopped))
-- 
2.7.4