linux-kernel.vger.kernel.org archive mirror
From: Frederic Weisbecker <fweisbec@gmail.com>
To: Christoph Lameter <cl@linux.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com>,
	Gilad Ben Yossef <giladb@mellanox.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Ingo Molnar <mingo@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Rik van Riel <riel@redhat.com>, Tejun Heo <tj@kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	Viresh Kumar <viresh.kumar@linaro.org>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will.deacon@arm.com>,
	Andy Lutomirski <luto@amacapital.net>,
	Daniel Lezcano <daniel.lezcano@linaro.org>,
	Francis Giraldeau <francis.giraldeau@gmail.com>,
	linux-doc@vger.kernel.org, linux-api@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] Fix /proc/stat freezes (was [PATCH v15] "task_isolation" mode)
Date: Wed, 28 Sep 2016 15:16:14 +0200
Message-ID: <20160928131613.GA5366@lerouge>
In-Reply-To: <alpine.DEB.2.20.1608171432200.30568@east.gentwo.org>

On Wed, Aug 17, 2016 at 02:37:46PM -0500, Christoph Lameter wrote:
> On Tue, 16 Aug 2016, Chris Metcalf wrote:
> Subject: NOHZ: Correctly display increasing cputime when processor is busy
> 
> The tick may be switched off when the processor gets busy with nohz full.
> The user time fields in /proc/stat will then no longer increase because
> the tick is not run to update the cpustat values anymore.
> 
> Compensate for the missing ticks by checking if a processor is in
> such a mode. If so then add the ticks that have passed since
> the tick was switched off to the usertime.
> 
> Note that this introduces a slight inaccuracy. The process may
> actually do syscalls without triggering a tick again but the
> processing time in those calls is negligible. Any wait or sleep
> occurrence during syscalls would activate the tick again.
> 
> Any inaccuracy is corrected once the tick is switched on again
> since the actual value where cputime aggregates is not changed.
> 
> Signed-off-by: Christoph Lameter <cl@linux.com>
> 
> Index: linux/fs/proc/stat.c
> ===================================================================
> --- linux.orig/fs/proc/stat.c	2016-08-04 09:04:57.681480937 -0500
> +++ linux/fs/proc/stat.c	2016-08-17 14:27:37.813445675 -0500
> @@ -77,6 +77,12 @@ static u64 get_iowait_time(int cpu)
> 
>  #endif
> 
> +static unsigned long inline get_cputime_user(int cpu)
> +{
> +	return kcpustat_cpu(cpu).cpustat[CPUTIME_USER] +
> +			tick_stopped_busy_ticks(cpu);
> +}
> +
>  static int show_stat(struct seq_file *p, void *v)
>  {
>  	int i, j;
> @@ -93,7 +99,7 @@ static int show_stat(struct seq_file *p,
>  	getboottime64(&boottime);
> 
>  	for_each_possible_cpu(i) {
> -		user += kcpustat_cpu(i).cpustat[CPUTIME_USER];
> +		user += get_cputime_user(i);
>  		nice += kcpustat_cpu(i).cpustat[CPUTIME_NICE];
>  		system += kcpustat_cpu(i).cpustat[CPUTIME_SYSTEM];
>  		idle += get_idle_time(i);
> @@ -130,7 +136,7 @@ static int show_stat(struct seq_file *p,
> 
>  	for_each_online_cpu(i) {
>  		/* Copy values here to work around gcc-2.95.3, gcc-2.96 */
> -		user = kcpustat_cpu(i).cpustat[CPUTIME_USER];
> +		user = get_cputime_user(i);
>  		nice = kcpustat_cpu(i).cpustat[CPUTIME_NICE];
>  		system = kcpustat_cpu(i).cpustat[CPUTIME_SYSTEM];
>  		idle = get_idle_time(i);
> Index: linux/kernel/time/tick-sched.c
> ===================================================================
> --- linux.orig/kernel/time/tick-sched.c	2016-07-27 08:41:17.109862517 -0500
> +++ linux/kernel/time/tick-sched.c	2016-08-17 14:16:42.073835333 -0500
> @@ -990,6 +990,24 @@ ktime_t tick_nohz_get_sleep_length(void)
>  	return ts->sleep_length;
>  }
> 
> +/**
> + * tick_stopped_busy_ticks - return the ticks that did not occur while the
> + *				processor was busy and the tick was off
> + *
> + * Called from sysfs to correctly calculate cputime of nohz full processors
> + */
> +unsigned long tick_stopped_busy_ticks(int cpu)
> +{
> +#ifdef CONFIG_NOHZ_FULL
> +	struct tick_sched *ts = per_cpu_ptr(&tick_cpu_sched, cpu);
> +
> +	if (!ts->inidle && ts->tick_stopped)
> +		return jiffies - ts->idle_jiffies;


It won't work: ts->idle_jiffies only tracks idle time.
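
To illustrate the point, here is a simplified userspace model, not kernel code.
It assumes idle_jiffies is written only on idle entry, as the remark above
implies. struct tick_model, the model_*() helpers, and busy_stop_jiffies are
hypothetical names made up for this sketch; only inidle, tick_stopped and
idle_jiffies come from the quoted patch.

```c
#include <assert.h>

/* Hypothetical stand-in for the struct tick_sched fields used by the patch. */
struct tick_model {
	int inidle;
	int tick_stopped;
	unsigned long idle_jiffies;      /* assumed: written on idle entry only */
	unsigned long busy_stop_jiffies; /* hypothetical: tick stopped while busy */
};

/* CPU enters idle: this is the only place idle_jiffies is refreshed here. */
static void model_idle_enter(struct tick_model *ts, unsigned long now)
{
	ts->inidle = 1;
	ts->tick_stopped = 1;
	ts->idle_jiffies = now;
}

/* CPU leaves idle and the tick is restarted. */
static void model_idle_exit(struct tick_model *ts)
{
	ts->inidle = 0;
	ts->tick_stopped = 0;
}

/* nohz_full stops the tick on a busy CPU; idle_jiffies is left untouched. */
static void model_busy_tick_stop(struct tick_model *ts, unsigned long now)
{
	assert(!ts->inidle);
	ts->tick_stopped = 1;
	ts->busy_stop_jiffies = now;
}

/* Same arithmetic as the quoted tick_stopped_busy_ticks(). */
static unsigned long patch_busy_ticks(const struct tick_model *ts,
				      unsigned long now)
{
	if (!ts->inidle && ts->tick_stopped)
		return now - ts->idle_jiffies;
	return 0;
}

/* What the caller wanted: ticks missed since the tick stopped while busy. */
static unsigned long wanted_busy_ticks(const struct tick_model *ts,
				       unsigned long now)
{
	if (!ts->inidle && ts->tick_stopped)
		return now - ts->busy_stop_jiffies;
	return 0;
}
```

In this model, if the CPU last entered idle at jiffy 100 and the tick was
stopped while busy at jiffy 200, then at jiffy 300 patch_busy_ticks() reports
200 ticks where only 100 were actually missed.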

That said, the tick is supposed to fire once per second, so the reason for the
freeze is still unknown. Now, in order to get rid of the 1Hz tick, we'll need to
force updates on cpustat like that patch intended to.

But I see only two sane ways to do so:

_ Fetch the task running on CPU X and deduce, on top of its vtime values, where it is
  executing and how much delta should be added to cpustat. The problem here is that we
  may need to do that under the rq lock to make sure the task is really on CPU X and
  stays there. Perhaps we could cheat though and add the CPU number to the vtime fields;
  then vtime_seqcount would be enough to get stable results.

_ Have housekeeping update the cpustat of all those CPUs periodically. But that means we
  need to turn vtime_seqcount back into a seqlock, and that would be a shame for
  nohz_full performance.
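
The "cheat" in the first option could look roughly like the following userspace
sketch, written with C11 atomics rather than the kernel's seqcount API. All
names here (struct vtime_model, vtime_model_*(), VT_*) are hypothetical and the
field layout is an assumption; the only idea taken from the text above is that
the CPU number sits under the same sequence counter as the vtime fields, so a
remote reader gets a consistent snapshot without the rq lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

enum { VT_INACTIVE, VT_USER, VT_SYS };	/* hypothetical vtime states */

/* Hypothetical per-task vtime record; a single writer is assumed. */
struct vtime_model {
	atomic_uint seq;		/* even: stable, odd: update in progress */
	_Atomic int cpu;		/* CPU the task is running on */
	_Atomic int state;		/* one of VT_* */
	_Atomic uint64_t starttime;	/* when the current state began */
};

static void vtime_model_init(struct vtime_model *vt)
{
	atomic_init(&vt->seq, 0);
	atomic_init(&vt->cpu, -1);
	atomic_init(&vt->state, VT_INACTIVE);
	atomic_init(&vt->starttime, 0);
}

/* Writer side: publish cpu, state and starttime as one unit. */
static void vtime_model_update(struct vtime_model *vt, int cpu, int state,
			       uint64_t now)
{
	unsigned int s = atomic_load_explicit(&vt->seq, memory_order_relaxed);

	atomic_store_explicit(&vt->seq, s + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&vt->cpu, cpu, memory_order_relaxed);
	atomic_store_explicit(&vt->state, state, memory_order_relaxed);
	atomic_store_explicit(&vt->starttime, now, memory_order_relaxed);
	atomic_store_explicit(&vt->seq, s + 2, memory_order_release);
}

/*
 * Reader side: user time not yet folded into cpustat for @cpu.
 * Retries until a stable snapshot is seen, then trusts the delta only
 * if the snapshot says the task is on @cpu and in user mode.
 */
static uint64_t vtime_model_user_delta(struct vtime_model *vt, int cpu,
				       uint64_t now)
{
	unsigned int seq;
	int snap_cpu, snap_state;
	uint64_t snap_start;

	do {
		seq = atomic_load_explicit(&vt->seq, memory_order_acquire);
		snap_cpu = atomic_load_explicit(&vt->cpu, memory_order_relaxed);
		snap_state = atomic_load_explicit(&vt->state, memory_order_relaxed);
		snap_start = atomic_load_explicit(&vt->starttime, memory_order_relaxed);
		atomic_thread_fence(memory_order_acquire);
	} while ((seq & 1) ||
		 seq != atomic_load_explicit(&vt->seq, memory_order_relaxed));

	if (snap_cpu != cpu || snap_state != VT_USER)
		return 0;
	return now - snap_start;
}
```

A /proc/stat reader would then add vtime_model_user_delta() for the task it
found on CPU X to the stored CPUTIME_USER value. If the task has migrated or
left user mode in the meantime, the snapshot shows it and the delta is 0.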
