From: Masayoshi Mizuma <msys.mizuma@gmail.com>
To: Frederic Weisbecker <frederic@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>,
	LKML <linux-kernel@vger.kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Hasegawa Hitomi <hasegawa-hitomi@fujitsu.com>,
	Mel Gorman <mgorman@suse.de>
Subject: Re: [PATCH 2/2] sched/cputime: Fix getrusage(RUSAGE_THREAD) with nohz_full
Date: Tue, 26 Oct 2021 13:40:37 -0400
Message-ID: <YXg9lZv6u+zvRco5@gabell>
In-Reply-To: <20211026141055.57358-3-frederic@kernel.org>

On Tue, Oct 26, 2021 at 04:10:55PM +0200, Frederic Weisbecker wrote:
> getrusage(RUSAGE_THREAD) with nohz_full may return shorter utime/stime
> than the actual time.
> 
> task_cputime_adjusted() snapshots utime and stime and then adjusts their
> sum to match the scheduler-maintained cputime.sum_exec_runtime.
> Unfortunately, in nohz_full, sum_exec_runtime is only updated once per
> second in the worst case, causing a discrepancy against utime and stime,
> which the reader can update at any time using vtime.
> 
> To fix this situation, perform an update of cputime.sum_exec_runtime
> when the cputime snapshot reports the task as actually running while
> the tick is disabled. The related overhead is then contained within the
> relevant situations.
> 
> Reported-by: Hasegawa Hitomi <hasegawa-hitomi@fujitsu.com>
> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
> Cc: Mel Gorman <mgorman@suse.de>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Signed-off-by: Hasegawa Hitomi <hasegawa-hitomi@fujitsu.com>

Thank you for this patch. getrusage(RUSAGE_THREAD) with nohz_full works well!
Please feel free to add:

Tested-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>

Thanks,
Masa
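
For readers who want to try a similar check, here is a minimal user-space
sketch. It is not the test used in this thread; the helper names wall_ns(),
thread_cpu_ns() and cputime_gap_ns() are hypothetical. It assumes the calling
thread is pinned to a nohz_full CPU and otherwise uninterrupted, so that
almost all wall-clock time spent in the busy loop should be reported as
utime + stime by getrusage(RUSAGE_THREAD).

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* exposes RUSAGE_THREAD in <sys/resource.h> */
#endif
#include <stdint.h>
#include <sys/resource.h>
#include <time.h>

#ifndef RUSAGE_THREAD
#define RUSAGE_THREAD 1        /* Linux value; fallback if the macro is hidden */
#endif

#define NSEC_PER_SEC 1000000000ULL

/* Monotonic wall-clock time in nanoseconds. */
static uint64_t wall_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * NSEC_PER_SEC + (uint64_t)ts.tv_nsec;
}

/* utime + stime of the calling thread in nanoseconds (0 on error). */
static uint64_t thread_cpu_ns(void)
{
	struct rusage ru;

	if (getrusage(RUSAGE_THREAD, &ru) != 0)
		return 0;
	return ((uint64_t)ru.ru_utime.tv_sec + (uint64_t)ru.ru_stime.tv_sec) * NSEC_PER_SEC
	     + ((uint64_t)ru.ru_utime.tv_usec + (uint64_t)ru.ru_stime.tv_usec) * 1000ULL;
}

/* Busy-loop for about spin_ns; return wall time minus reported CPU time. */
static int64_t cputime_gap_ns(uint64_t spin_ns)
{
	uint64_t wall_start = wall_ns();
	uint64_t cpu_start = thread_cpu_ns();
	uint64_t wall_end;

	do {
		wall_end = wall_ns();
	} while (wall_end - wall_start < spin_ns);

	return (int64_t)(wall_end - wall_start) -
	       (int64_t)(thread_cpu_ns() - cpu_start);
}
```

Calling cputime_gap_ns(500000000ULL) from a thread pinned to a nohz_full CPU
gives a rough gap in nanoseconds. Under the behaviour described in the commit
message, an unpatched kernel may show a gap approaching the length of the
spin, while a small gap is expected once sum_exec_runtime is refreshed on
read. On a busy or non-isolated CPU the gap also includes time spent
preempted, so the number is only indicative.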

> ---
>  include/linux/sched/cputime.h |  5 +++--
>  kernel/sched/cputime.c        | 12 +++++++++---
>  2 files changed, 12 insertions(+), 5 deletions(-)
> 
> diff --git a/include/linux/sched/cputime.h b/include/linux/sched/cputime.h
> index 6c9f19a33865..ce3c58286062 100644
> --- a/include/linux/sched/cputime.h
> +++ b/include/linux/sched/cputime.h
> @@ -18,15 +18,16 @@
>  #endif /* CONFIG_VIRT_CPU_ACCOUNTING_NATIVE */
>  
>  #ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
> -extern void task_cputime(struct task_struct *t,
> +extern bool task_cputime(struct task_struct *t,
>  			 u64 *utime, u64 *stime);
>  extern u64 task_gtime(struct task_struct *t);
>  #else
> -static inline void task_cputime(struct task_struct *t,
> +static inline bool task_cputime(struct task_struct *t,
>  				u64 *utime, u64 *stime)
>  {
>  	*utime = t->utime;
>  	*stime = t->stime;
> +	return false;
>  }
>  
>  static inline u64 task_gtime(struct task_struct *t)
> diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
> index 872e481d5098..9392aea1804e 100644
> --- a/kernel/sched/cputime.c
> +++ b/kernel/sched/cputime.c
> @@ -615,7 +615,8 @@ void task_cputime_adjusted(struct task_struct *p, u64 *ut, u64 *st)
>  		.sum_exec_runtime = p->se.sum_exec_runtime,
>  	};
>  
> -	task_cputime(p, &cputime.utime, &cputime.stime);
> +	if (task_cputime(p, &cputime.utime, &cputime.stime))
> +		cputime.sum_exec_runtime = task_sched_runtime(p);
>  	cputime_adjust(&cputime, &p->prev_cputime, ut, st);
>  }
>  EXPORT_SYMBOL_GPL(task_cputime_adjusted);
> @@ -828,19 +829,21 @@ u64 task_gtime(struct task_struct *t)
>   * add up the pending nohz execution time since the last
>   * cputime snapshot.
>   */
> -void task_cputime(struct task_struct *t, u64 *utime, u64 *stime)
> +bool task_cputime(struct task_struct *t, u64 *utime, u64 *stime)
>  {
>  	struct vtime *vtime = &t->vtime;
>  	unsigned int seq;
>  	u64 delta;
> +	int ret;
>  
>  	if (!vtime_accounting_enabled()) {
>  		*utime = t->utime;
>  		*stime = t->stime;
> -		return;
> +		return false;
>  	}
>  
>  	do {
> +		ret = false;
>  		seq = read_seqcount_begin(&vtime->seqcount);
>  
>  		*utime = t->utime;
> @@ -850,6 +853,7 @@ void task_cputime(struct task_struct *t, u64 *utime, u64 *stime)
>  		if (vtime->state < VTIME_SYS)
>  			continue;
>  
> +		ret = true;
>  		delta = vtime_delta(vtime);
>  
>  		/*
> @@ -861,6 +865,8 @@ void task_cputime(struct task_struct *t, u64 *utime, u64 *stime)
>  		else
>  			*utime += vtime->utime + delta;
>  	} while (read_seqcount_retry(&vtime->seqcount, seq));
> +
> +	return ret;
>  }
>  
>  static int vtime_state_fetch(struct vtime *vtime, int cpu)
> -- 
> 2.25.1
> 
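
To illustrate why a stale sum_exec_runtime shortens the reported times, here
is a simplified user-space sketch of the proportional adjustment the commit
message describes. It is an approximation, not the kernel's cputime_adjust():
it leaves out the prev_cputime monotonicity handling, and struct
cputime_sample and scale_cputime() are hypothetical names. It relies on the
unsigned __int128 extension of GCC/Clang on 64-bit targets.

```c
#include <stdint.h>

/* Hypothetical snapshot; all values are in nanoseconds. */
struct cputime_sample {
	uint64_t utime;            /* user time from the vtime snapshot */
	uint64_t stime;            /* system time from the vtime snapshot */
	uint64_t sum_exec_runtime; /* scheduler-maintained total runtime */
};

/*
 * Scale utime/stime so that their sum equals sum_exec_runtime while
 * keeping the stime:utime ratio of the snapshot.
 */
static void scale_cputime(const struct cputime_sample *s,
			  uint64_t *ut, uint64_t *st)
{
	uint64_t rtime = s->sum_exec_runtime;
	uint64_t total = s->utime + s->stime;

	if (s->stime == 0) {       /* everything is accounted as user time */
		*st = 0;
		*ut = rtime;
		return;
	}
	if (s->utime == 0) {       /* everything is accounted as system time */
		*ut = 0;
		*st = rtime;
		return;
	}

	/* 128-bit intermediate avoids overflow of stime * rtime. */
	*st = (uint64_t)(((unsigned __int128)s->stime * rtime) / total);
	*ut = rtime - *st;
}
```

With a snapshot of utime = 3000 and stime = 1000 but a sum_exec_runtime that
was last updated at 2000, the sketch reports 1500 and 500, half of the real
values. Once sum_exec_runtime is refreshed to 4000, which is what the patch
does via task_sched_runtime() when task_cputime() returns true, the reported
values match the snapshot.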
