From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1753923AbdDCPXV (ORCPT ); Mon, 3 Apr 2017 11:23:21 -0400
Received: from mail-wr0-f194.google.com ([209.85.128.194]:36225 "EHLO
	mail-wr0-f194.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1752377AbdDCPXU (ORCPT ); Mon, 3 Apr 2017 11:23:20 -0400
Date: Mon, 3 Apr 2017 17:23:17 +0200
From: Frederic Weisbecker
To: Luiz Capitulino
Cc: Wanpeng Li, Mike Galbraith, Rik van Riel,
	"linux-kernel@vger.kernel.org", Peter Zijlstra, Thomas Gleixner
Subject: Re: [BUG nohz]: wrong user and system time accounting
Message-ID: <20170403152315.GA4221@lerouge>
References: <1490818125.28917.11.camel@redhat.com>
	<1490848051.4167.57.camel@gmx.de>
	<20170330133802.GC3626@lerouge>
	<20170330141816.GE3626@lerouge>
	<20170330172546.4e8e1a6a@redhat.com>
	<20170331160910.0dda403e@redhat.com>
	<20170331232452.GA10607@lerouge>
	<20170331231119.2b78eb1f@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20170331231119.2b78eb1f@redhat.com>
User-Agent: Mutt/1.5.24 (2015-08-30)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Mar 31, 2017 at 11:11:19PM -0400, Luiz Capitulino wrote:
> On Sat, 1 Apr 2017 01:24:54 +0200
> Frederic Weisbecker wrote:
>
> > On Fri, Mar 31, 2017 at 04:09:10PM -0400, Luiz Capitulino wrote:
> > > On Thu, 30 Mar 2017 17:25:46 -0400
> > > Luiz Capitulino wrote:
> > >
> > > > On Thu, 30 Mar 2017 16:18:17 +0200
> > > > Frederic Weisbecker wrote:
> > > >
> > > > > On Thu, Mar 30, 2017 at 09:59:54PM +0800, Wanpeng Li wrote:
> > > > > > 2017-03-30 21:38 GMT+08:00 Frederic Weisbecker :
> > > > > > > If it works, we may want to take that solution, likely less performance sensitive
> > > > > > > than using sched_clock(). In fact sched_clock() is fast, especially as we require it to
> > > > > > > be stable for nohz_full, but using it involves costly conversion back and forth to jiffies.
> > > > > >
> > > > > > So both Rik and you agree with the skew tick solution, I will try it
> > > > > > tomorrow. Btw, should we just add a random offset to the CPUs in
> > > > > > nohz_full mode, or add a random offset to all CPUs like the code above?
> > > > >
> > > > > Let's just keep it to all CPUs for simplicity.
> > > > > Also please add a comment that explains why we need that skew_tick on nohz_full.
> > > >
> > > > I've tried all the test-cases we discussed in this thread with skew_tick=1
> > > > and it worked as expected in bare-metal and KVM guests.
> > > >
> > > > However, I found a test-case that works in bare-metal but shows problems
> > > > in KVM guests. It could be something that's KVM specific, or it could be
> > > > something that's harder to reproduce in bare-metal.
> > >
> > > After discussing some findings on this issue with Rik, I realized that
> > > we don't add the skew when restarting the tick in tick_nohz_restart().
> > > Adding the offset there seems to solve this problem.
> >
> > Are you sure? tick_nohz_restart() doesn't seem to override the initial skew. It
> > always forwards the expiration time on top of the last tick.
>
> OK, I'll double check. Without my change the bug triggers almost
> instantly with the described reproducer. With my change it didn't
> trigger for several minutes (but it does look wrong looking at it now).

Do you observe aligned ticks with trace events (hrtimer_expire_entry)?

You might want to enforce the global clock to trace that:

	echo "global" > /sys/kernel/debug/tracing/trace_clock
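
To check for aligned ticks in the resulting trace, something along these lines could help. It is only a rough sketch, not a tested tool: it assumes the usual text layout of the trace file for hrtimer_expire_entry, that the tick handler appears as function=tick_sched_timer, and a 1000 us tick period (HZ=1000). The helper names, the tolerance and the sample lines are all made up for illustration; adjust them to your kernel and config.

```python
import re
from statistics import median_low

# Assumed trace line layout: "... [cpu] flags sec.usec: hrtimer_expire_entry: ... function=..."
LINE_RE = re.compile(
    r"\[(?P<cpu>\d+)\]\s+(?:\S+\s+)?(?P<sec>\d+)\.(?P<usec>\d{6}):\s+"
    r"hrtimer_expire_entry:.*function=(?P<func>\S+)"
)


def tick_offsets(lines, tick_period_us=1000, func="tick_sched_timer"):
    """Map each CPU to the offsets (in us) of its tick expiries within the tick period."""
    offsets = {}
    for line in lines:
        m = LINE_RE.search(line)
        if not m or m.group("func") != func:
            continue  # not a tick expiry
        ts_us = int(m.group("sec")) * 1_000_000 + int(m.group("usec"))
        offsets.setdefault(int(m.group("cpu")), []).append(ts_us % tick_period_us)
    return offsets


def aligned_pairs(offsets, tick_period_us=1000, tolerance_us=20):
    """Return CPU pairs whose typical tick offsets are within tolerance_us of each other."""
    # median_low is a crude summary; it ignores wrap-around within one CPU's samples.
    typical = {cpu: median_low(vals) for cpu, vals in offsets.items()}
    cpus = sorted(typical)
    pairs = []
    for i, a in enumerate(cpus):
        for b in cpus[i + 1:]:
            d = abs(typical[a] - typical[b]) % tick_period_us
            d = min(d, tick_period_us - d)  # circular distance within the period
            if d < tolerance_us:
                pairs.append((a, b))
    return pairs


# Made-up sample lines, for illustration only (not real trace output).
SAMPLE = """\
  <idle>-0  [000] d.h1  100.001003: hrtimer_expire_entry: hrtimer=00000000aaaa function=tick_sched_timer now=100001000000
  <idle>-0  [001] d.h1  100.001005: hrtimer_expire_entry: hrtimer=00000000bbbb function=tick_sched_timer now=100001000000
  <idle>-0  [002] d.h1  100.001503: hrtimer_expire_entry: hrtimer=00000000cccc function=tick_sched_timer now=100001500000
  <idle>-0  [000] d.h1  100.002004: hrtimer_expire_entry: hrtimer=00000000aaaa function=tick_sched_timer now=100002000000
  <idle>-0  [001] d.h1  100.002006: hrtimer_expire_entry: hrtimer=00000000bbbb function=tick_sched_timer now=100002000000
  <idle>-0  [002] d.h1  100.002502: hrtimer_expire_entry: hrtimer=00000000cccc function=tick_sched_timer now=100002500000
"""

if __name__ == "__main__":
    print(aligned_pairs(tick_offsets(SAMPLE.splitlines())))
```

With skew_tick=1 working as intended, no pair of CPUs should be reported. The timestamps are only comparable across CPUs when the global trace clock is in use, hence the echo above.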