From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752290AbdDCTGR (ORCPT ); Mon, 3 Apr 2017 15:06:17 -0400 Received: from mx1.redhat.com ([209.132.183.28]:54296 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751136AbdDCTGQ (ORCPT ); Mon, 3 Apr 2017 15:06:16 -0400 DMARC-Filter: OpenDMARC Filter v1.3.2 mx1.redhat.com 6BFD9680FC Authentication-Results: ext-mx09.extmail.prod.ext.phx2.redhat.com; dmarc=none (p=none dis=none) header.from=redhat.com Authentication-Results: ext-mx09.extmail.prod.ext.phx2.redhat.com; spf=pass smtp.mailfrom=lcapitulino@redhat.com DKIM-Filter: OpenDKIM Filter v2.11.0 mx1.redhat.com 6BFD9680FC Date: Mon, 3 Apr 2017 15:06:13 -0400 From: Luiz Capitulino To: Frederic Weisbecker Cc: Wanpeng Li , Mike Galbraith , Rik van Riel , "linux-kernel@vger.kernel.org" , Peter Zijlstra , Thomas Gleixner Subject: Re: [BUG nohz]: wrong user and system time accounting Message-ID: <20170403150613.77541109@redhat.com> In-Reply-To: <20170403152315.GA4221@lerouge> References: <1490818125.28917.11.camel@redhat.com> <1490848051.4167.57.camel@gmx.de> <20170330133802.GC3626@lerouge> <20170330141816.GE3626@lerouge> <20170330172546.4e8e1a6a@redhat.com> <20170331160910.0dda403e@redhat.com> <20170331232452.GA10607@lerouge> <20170331231119.2b78eb1f@redhat.com> <20170403152315.GA4221@lerouge> Organization: Red Hat MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.38]); Mon, 03 Apr 2017 19:06:15 +0000 (UTC) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, 3 Apr 2017 17:23:17 +0200 Frederic Weisbecker wrote: > Do you observe aligned ticks with trace events (hrtimer_expire_entry)? > > You might want to enforce the global clock to trace that: > > echo "global" > /sys/kernel/debug/tracing/trace_clock I've used the same trace points & debugging code I've been using to debug this issue, and this what I'm seeing: stress-25757 [002] 2742.717507: function: enter_from_user_mode <-- apic_timer_interrupt stress-25757 [002] 2742.717508: function: __context_tracking_exit <-- enter_from_user_mode stress-25757 [002] 2742.717508: bprint: vtime_delta: diff=0 (now=4297409970 vtime_snap=4297409970) stress-25757 [002] 2742.717509: function: smp_apic_timer_interrupt <-- apic_timer_interrupt stress-25757 [002] 2742.717509: function: irq_enter <-- smp_apic_timer_interrupt stress-25757 [002] 2742.717510: hrtimer_expire_entry: hrtimer=0xffffc900039fbe58 function=hrtimer_wakeup now=2742674000776 stress-25757 [002] 2742.717514: function: irq_exit <-- smp_apic_timer_interrupt cyclictest-25760 [002] 2742.717518: function: vtime_account_system <-- vtime_common_task_switch cyclictest-25760 [002] 2742.717518: bprint: vtime_delta: diff=1000000 (now=4297409971 vtime_snap=4297409970) cyclictest-25760 [002] 2742.717519: function: __vtime_account_system <-- vtime_account_system cyclictest-25760 [002] 2742.717519: bprint: get_vtime_delta: vtime_snap=4297409970 now=4297409971 cyclictest-25760 [002] 2742.717520: function: account_system_time <-- __vtime_account_system cyclictest-25760 [002] 2742.717520: bprint: account_system_time: cputime=961981 cyclictest-25760 [002] 2742.717521: function: __context_tracking_enter <-- do_syscall_64 cyclictest-25760 [002] 2742.717522: function: vtime_user_enter <-- __context_tracking_enter cyclictest-25760 [002] 2742.717522: bprint: vtime_delta: diff=0 (now=4297409971 vtime_snap=4297409971) CPU2 shows 98% system time while the other CPUs (from CPU3 to CPU7) show 98% user time (they're all running the same workload). What's happening here is: 1. Timer interrupt 2. Transition from user-space to kernel-space, vtimer_delta() returns zero 3. Context switch from hog application to cyclictest 4. This time vtime_delta() returns != zero, which implies jiffies was updated between steps 2 and 3 This seems to be the pattern that accounts incorrectly, and seem to suggest that the ticks are aligned because this repeats over and over. Please, let me know if you want me to run a different trace-cmd command-line.