All of lore.kernel.org
 help / color / mirror / Atom feed
From: Peter Zijlstra <peterz@infradead.org>
To: Zucheng Zheng <zhengzucheng@huawei.com>
Cc: mingo@redhat.com, juri.lelli@redhat.com,
	vincent.guittot@linaro.org, dietmar.eggemann@arm.com,
	rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de,
	bristot@redhat.com, vschneid@redhat.com, frederic@kernel.org,
	hucool.lihua@huawei.com, linux-kernel@vger.kernel.org
Subject: Re: [PATCH -next] sched/cputime: Fix the time backward issue about /proc/stat
Date: Wed, 28 Sep 2022 10:12:33 +0200	[thread overview]
Message-ID: <YzQB8afi2rCPvuC1@hirez.programming.kicks-ass.net> (raw)
In-Reply-To: <20220928033402.181530-1-zhengzucheng@huawei.com>

On Wed, Sep 28, 2022 at 11:34:02AM +0800, Zucheng Zheng wrote:
> From: Zheng Zucheng <zhengzucheng@huawei.com>
> 
> The cputime of cpuN read from /proc/stat has an issue of cputime descent.
> For example, the phenomenon of cputime descent of user is as followed:
> 
> The value read first is 319, and the value read again is 318. As follows:
> first:
>  cat /proc/stat |  grep cpu1
>  cpu1    319    0    496    41665    0    0    0    0    0    0
> again:
>  cat /proc/stat |  grep cpu1
>  cpu1    318    0    497    41674    0    0    0    0    0    0
> 
> The value read from /proc/stat should be monotonically increasing. Otherwise
> user may get incorrect CPU usage.
> 
> The root cause of this problem is that, in the implementation of
> kcpustat_cpu_fetch_vtime, vtime->utime + delta is added to the stack variable
> cpustat instantaneously. If the task is switched between two times, the value
> added to cpustat for the second time may be smaller than that for the first time.
> 
> 				CPU0						CPU1
> First:
> show_stat()
>  ->kcpustat_cpu_fetch()
>   ->kcpustat_cpu_fetch_vtime()
>    ->cpustat[CPUTIME_USER] = kcpustat_cpu(cpu) + vtime->utime + delta       rq->curr is task A
>                                                                             A switch to B,and A->vtime->utime is less than 1 tick
> Then:
> show_stat()
>  ->kcpustat_cpu_fetch()
>   ->kcpustat_cpu_fetch_vtime()
>    ->cpustat[CPUTIME_USER] = kcpustat_cpu(cpu) + vtime->utime + delta;     rq->curr is task B

You're still not explaining where the time gets lost. And the patch is a
horrible band-aid.

What I think you're saying; after staring at this for a while, is that:

  vtime_task_switch_generic()
    __vtime_account_kernel(prev, vtime)
      vtime_account_{guest,system}(tsk, vtime)
        vtime->*time += get_vtime_delta()
	if (vtime->*time >= TICK_NSEC)
	  account_*_time()
	    account_system_index_time()
	      task_group_account_field()
	        __this_cpu_add(kernel_cpustat.cpustat[index], tmp);        <---- here

is not folding time into kernel_cpustat when the task vtime isn't at
least a tick's worth. And then when we switch to another task, we leak
time.

There's another problem here, vtime_task_switch_generic() should use a
single call to sched_clock() to compute the old vtime_delta and set the
new vtime->starttime, otherwise there's a time hole there as well.

This is all quite the maze and it really wants cleaning up, not be made
worse.

So I think you want to do two things:

 - pull kernel_cpustat updates out of task_group_account_field()
   and put them into vtime_task_switch_generic() to be purely
   vtime->starttime based.

 - make vtime_task_switch_generic() use a single sched_clock() call.

I did not audit all the flavours of cputime; there might be fallout, be
sure to cross compile a lot.

Frederic, you agree?

  reply	other threads:[~2022-09-28  8:13 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-28  3:34 [PATCH -next] sched/cputime: Fix the time backward issue about /proc/stat Zucheng Zheng
2022-09-28  8:12 ` Peter Zijlstra [this message]
2022-09-28 12:11   ` Frederic Weisbecker
2022-09-30  2:43     ` zhengzucheng
2022-09-30 12:16       ` Frederic Weisbecker
2022-10-09  2:28         ` zhengzucheng
2022-09-30 12:14 ` [sched/cputime] 131c995687: BUG:spinlock_trylock_failure_on_UP_on_CPU kernel test robot
2022-09-30 12:14   ` kernel test robot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YzQB8afi2rCPvuC1@hirez.programming.kicks-ass.net \
    --to=peterz@infradead.org \
    --cc=bristot@redhat.com \
    --cc=bsegall@google.com \
    --cc=dietmar.eggemann@arm.com \
    --cc=frederic@kernel.org \
    --cc=hucool.lihua@huawei.com \
    --cc=juri.lelli@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mgorman@suse.de \
    --cc=mingo@redhat.com \
    --cc=rostedt@goodmis.org \
    --cc=vincent.guittot@linaro.org \
    --cc=vschneid@redhat.com \
    --cc=zhengzucheng@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.