All of lore.kernel.org
 help / color / mirror / Atom feed
From: Frederic Weisbecker <frederic@kernel.org>
To: Hillf Danton <hdanton@sina.com>
Cc: Thomas Gleixner <tglx@linutronix.de>,
	Yu Liao <liaoyu15@huawei.com>,
	fweisbec@gmail.com, mingo@kernel.org, liwei391@huawei.com,
	adobriyan@gmail.com, mirsad.todorovac@alu.unizg.hr,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	Peter Zijlstra <peterz@infradead.org>
Subject: Re: [PATCH RFC] tick/nohz: fix data races in get_cpu_idle_time_us()
Date: Wed, 1 Feb 2023 15:28:46 +0100	[thread overview]
Message-ID: <Y9p3HjW3yzn0UYrZ@lothringen> (raw)
In-Reply-To: <20230201140117.539-1-hdanton@sina.com>

On Wed, Feb 01, 2023 at 10:01:17PM +0800, Hillf Danton wrote:
> > > +++ b/kernel/time/tick-sched.c
> > > @@ -640,13 +640,26 @@ static void tick_nohz_update_jiffies(kti
> > >  /*
> > >   * Updates the per-CPU time idle statistics counters
> > >   */
> > > -static void
> > > -update_ts_time_stats(int cpu, struct tick_sched *ts, ktime_t now, u64 *last_update_time)
> > > +static u64 update_ts_time_stats(int cpu, struct tick_sched *ts, ktime_t now,
> > > +				int io, u64 *last_update_time)
> > >  {
> > >  	ktime_t delta;
> > >  
> > > +	if (last_update_time)
> > > +		*last_update_time = ktime_to_us(now);
> > > +
> > >  	if (ts->idle_active) {
> > >  		delta = ktime_sub(now, ts->idle_entrytime);
> > > +
> > > +		/* update is only expected on the local CPU */
> > > +		if (cpu != smp_processor_id()) {
> > 
> > Why not just updating it only on idle exit then?
> 
> This aligns to idle exit as much as it can by disallowing remote update.

I mean why bother updating if idle does it for us already?

One possibility is that we get some more precise values if we read during
long idle periods with nr_iowait_cpu() changes in the middle.

> > 
> > > +			if (io)
> > 
> > I fear it's not up to the caller to decides if the idle time is IO or not.
> 
> Could you specify a bit on your concern, given the callers of this function?

You are randomly stating if the elapsing idle time is IO or not depending on
the caller, without verifying nr_iowait_cpu(). Or am I missing something?

> > 
> > > +				delta = ktime_add(ts->iowait_sleeptime, delta);
> > > +			else
> > > +				delta = ktime_add(ts->idle_sleeptime, delta);
> > > +			return ktime_to_us(delta);
> 
> Based on the above comments, I guest you missed this line which prevents
> get_cpu_idle_time_us() and get_cpu_iowait_time_us() from updating ts.

Right...

> > But then you may race with the local updater, risking to return
> > the delta added twice. So you need at least a seqcount.
> 
> Add seqcount if needed. No problem.
> > 
> > But in the end, nr_iowait_cpu() is broken because that counter can be
> > decremented remotely and so the whole thing is beyond repair:
> > 
> > CPU 0                       CPU  1                    CPU 2
> > -----                       -----                     ------
> > //io_schedule() TASK A
> > current->in_iowait = 1
> > rq(0)->nr_iowait++
> > //switch to idle
> >                     // READ /proc/stat
> >                     // See nr_iowait_cpu(0) == 1
> >                     return ts->iowait_sleeptime + ktime_sub(ktime_get(), ts->idle_entrytime)
> > 
> >                                                       //try_to_wake_up(TASK A)
> >                                                       rq(0)->nr_iowait--
> > //idle exit
> > // See nr_iowait_cpu(0) == 0
> > ts->idle_sleeptime += ktime_sub(ktime_get(), ts->idle_entrytime)
> 
> Ah see your point.
> 
> The diff disallows remotely updating ts, and it is updated in idle exit
> after my proposal, so what nr_iowait_cpu() breaks is mitigated.

Only halfway mitigated. This doesn't prevent from backward or forward jumps
when non-updating readers are involved at all.

Thanks.

> 
> Thanks for taking a look, particularly the race linked to nr_iowait_cpu().
> 
> Hillf

  reply	other threads:[~2023-02-01 14:28 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-01-28  2:00 [PATCH RFC] tick/nohz: fix data races in get_cpu_idle_time_us() Yu Liao
2023-01-31 14:44 ` Thomas Gleixner
2023-01-31 18:35   ` Alexey Dobriyan
2023-01-31 19:59     ` Peter Zijlstra
2023-01-31 19:57   ` Peter Zijlstra
2023-01-31 21:11     ` Frederic Weisbecker
2023-02-01  9:03       ` Peter Zijlstra
2023-02-01  4:53   ` Hillf Danton
2023-02-01 12:02     ` Frederic Weisbecker
2023-02-01 14:01       ` Hillf Danton
2023-02-01 14:28         ` Frederic Weisbecker [this message]
2023-02-08 15:19   ` [PATCH] timers/nohz: Restructure and reshuffle struct tick_sched Frederic Weisbecker
2023-02-06  7:03 ` [PATCH RFC] tick/nohz: fix data races in get_cpu_idle_time_us() kernel test robot
2023-02-07  5:25 ` Mirsad Goran Todorovac
2023-02-07  8:03   ` Mirsad Goran Todorovac

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Y9p3HjW3yzn0UYrZ@lothringen \
    --to=frederic@kernel.org \
    --cc=adobriyan@gmail.com \
    --cc=fweisbec@gmail.com \
    --cc=hdanton@sina.com \
    --cc=liaoyu15@huawei.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=liwei391@huawei.com \
    --cc=mingo@kernel.org \
    --cc=mirsad.todorovac@alu.unizg.hr \
    --cc=peterz@infradead.org \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.