From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752720AbbFDLXE (ORCPT <rfc822;w@1wt.eu>);
	Thu, 4 Jun 2015 07:23:04 -0400
Received: from www.linutronix.de ([62.245.132.108]:59756 "EHLO
	Galois.linutronix.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751373AbbFDLWy (ORCPT
	<rfc822;linux-kernel@vger.kernel.org>);
	Thu, 4 Jun 2015 07:22:54 -0400
Date: Thu, 4 Jun 2015 13:22:45 +0200 (CEST)
From: Thomas Gleixner <tglx@linutronix.de>
To: Jeremiah Mahler <jmmahler@gmail.com>
cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>,
        Peter Zijlstra <peterz@infradead.org>,
        Viresh Kumar <viresh.kumar@linaro.org>,
        Marcelo Tosatti <mtosatti@redhat.com>,
        Frederic Weisbecker <fweisbec@gmail.com>,
        John Stultz <john.stultz@linaro.org>, linux-kernel@vger.kernel.org
Subject: Re: [BUG, bisect] hrtimer: severe lag after suspend & resume
In-Reply-To: <20150604005624.GA1789@hudson.localdomain>
Message-ID: <alpine.DEB.2.11.1506041319580.7118@nanos>
References: <20150604005624.GA1789@hudson.localdomain>
User-Agent: Alpine 2.11 (DEB 23 2013-08-11)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Linutronix-Spam-Score: -1.0
X-Linutronix-Spam-Level: -
X-Linutronix-Spam-Status: No , -1.0 points, 5.0 required,  ALL_TRUSTED=-1,SHORTCIRCUIT=-0.0001
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, 3 Jun 2015, Jeremiah Mahler wrote:
> After a fresh boot, the Chrome web browser behaves normally.  Pages
> load quickly and scroll fast.  Even image heavy sites such as
> images.google.com work fine.  However, after a suspend and resume
> cycle, Chrome becomes very slow.  Pages take ten seconds or more to
> load.  The scroll bars and buttons are almost completely
> unresponsive.  Interestingly, I can run Firefox on the same sites
> and it has no issue whatsoever.

Weird.
 
> I have bisected the kernel and found that the following commit
> introduced the bug.  It is present in the latest linux-next (20150602).
> 
>   From 868a3e915f7f5eba8f8cb4f7da2276760807c51c Mon Sep 17 00:00:00 2001
>   From: Thomas Gleixner <tglx@linutronix.de>
>   Date: Tue, 14 Apr 2015 21:08:37 +0000
>   Subject: [PATCH] hrtimer: Make offset update smarter
>   
>   On every tick/hrtimer interrupt we update the offset variables of the
>   clock bases. That's silly because these offsets change very seldom.
>   
>   Add a sequence counter to the time keeping code which keeps track of
>   the offset updates (clock_was_set()). Have a sequence cache in the
>   hrtimer cpu bases to evaluate whether the offsets must be updated or
>   not. This allows us later to avoid pointless cacheline pollution.

It took me quite a while to wrap my head around that, but I think I
have decoded the issue. Can you please test whether the patch below
solves your problem?
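
To illustrate the idea behind the patch, here is a rough user-space
sketch. It assumes the per-CPU base refreshes its cached offsets only
when its cached sequence differs from the timekeeper's sequence. All
struct and function names below are made up for the illustration and
are not the kernel's; only the "+= 2" and the initial value of 1 are
taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified model, NOT kernel code: names are hypothetical. */
struct tk_model {
	unsigned int clock_was_set_seq;	/* global sequence, always even */
	long long offs_real;		/* stand-in for one clock offset */
};

struct cpu_base_model {
	unsigned int clock_was_set_seq;	/* per-CPU cached sequence */
	long long offs_real;		/* per-CPU cached offset */
};

/* Models "tk->clock_was_set_seq += 2": the global value stays even. */
static void tk_clock_was_set(struct tk_model *tk, long long new_offs)
{
	tk->offs_real = new_offs;
	tk->clock_was_set_seq += 2;
}

/*
 * Copy the offset only when the cached sequence differs from the
 * global one. Returns true if the cache was refreshed.
 */
static bool base_update(struct cpu_base_model *base, const struct tk_model *tk)
{
	if (base->clock_was_set_seq == tk->clock_was_set_seq)
		return false;

	base->clock_was_set_seq = tk->clock_was_set_seq;
	base->offs_real = tk->offs_real;
	return true;
}

/*
 * Models the init hunk: an odd cached value can never equal the even
 * global value, so the first update always copies the offsets.
 */
static void base_init(struct cpu_base_model *base, const struct tk_model *tk)
{
	base->offs_real = 0;
	base->clock_was_set_seq = 1;
	base_update(base, tk);
}

/* Walk through the model; returns 0 if it behaves as described. */
static int model_walkthrough(void)
{
	struct tk_model tk = { .clock_was_set_seq = 0, .offs_real = 100 };
	struct cpu_base_model base;

	base_init(&base, &tk);			/* odd vs. even: copied */
	assert(base.offs_real == 100);
	assert(!base_update(&base, &tk));	/* in sync: nothing to do */

	tk_clock_was_set(&tk, 250);		/* global sequence moves on */
	assert(base_update(&base, &tk));	/* mismatch: refreshed */
	assert(base.offs_real == 250);
	return 0;
}
```

In this model, initializing the cached sequence to 0 instead of 1
would make the first comparison match a global sequence of 0 and skip
the copy, which is the situation the odd initial value is meant to
rule out.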

Thanks,

	tglx

------------------------>

diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
index 278d4b36fd94..e9dfcd0b8c41 100644
--- a/kernel/time/hrtimer.c
+++ b/kernel/time/hrtimer.c
@@ -1492,6 +1492,12 @@ static void init_hrtimers_cpu(int cpu)
 
 	cpu_base->cpu = cpu;
 	hrtimer_init_hres(cpu_base);
+	/*
+	 * Force an update by setting the clock was set sequence to an
+	 * odd value.
+	 */
+	cpu_base->clock_was_set_seq = 1;
+	hrtimer_update_base(cpu_base);
 }
 
 #ifdef CONFIG_HOTPLUG_CPU
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 90ed5db67c1d..c97710137a9e 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -593,7 +593,7 @@ static void timekeeping_update(struct timekeeper *tk, unsigned int action)
 	update_fast_timekeeper(&tk->tkr_raw,  &tk_fast_raw);
 
 	if (action & TK_CLOCK_WAS_SET)
-		tk->clock_was_set_seq++;
+		tk->clock_was_set_seq += 2;
 }
 
 /**