From: Scott Wood <swood@redhat.com>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>,
Frederic Weisbecker <frederic@kernel.org>,
Ingo Molnar <mingo@kernel.org>,
LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] timers/nohz: Update nohz load even if tick already stopped
Date: Tue, 05 Nov 2019 01:30:58 -0600
Message-ID: <7b782bc880a29eb7d37f2c2aff73c43e7f7d032f.camel@redhat.com>
In-Reply-To: <alpine.DEB.2.21.1911050042250.17054@nanos.tec.linutronix.de>

On Tue, 2019-11-05 at 00:43 +0100, Thomas Gleixner wrote:
> On Mon, 4 Nov 2019, Thomas Gleixner wrote:
> > On Fri, 1 Nov 2019, Scott Wood wrote:
> > > On Wed, 2019-10-30 at 14:31 +0100, Peter Zijlstra wrote:
> > > > Oh argh! that's a bit radical of the remote tick. The normal tick runs
> > > > just fine on idle CPUs, so lets mirror that.
> > > >
> > > > How's this then?
> >
> > ....
> >
> > > Needs to be tick_nohz_tick_stopped_cpu(cpu)
> > >
> > > After fixing that, I get:
> > >
> > > [ 7.439068] WARNING: CPU: 20 PID: 7 at
> > > /home/root/linux/kernel/sched/core.c:3681
> > > sched_tick_remote+0x132/0x150
> >
> > So I'm going to apply Scotts patch if nobody comes up with a better idea
> > until tomorrow.
>
> As Peter pointed out to me privately we should rather go and analyze the
> real thing instead of just applying duct tape.
>
> /me drops the patch again.

The warning is due to kernel/sched/idle.c not updating curr->se.exec_start.

While debugging, I noticed an issue with a particular load pattern. The CPU
goes non-nohz for a brief time at an interval very close to twice
tick_period. When the tick is restarted, the timer expiration is more than
tick_period in the past, so hrtimer_forward() catches up by adding
2*tick_period to the expiration. The tick is then stopped before that new
expiration, and the next time the tick is restarted the expiry is again
advanced by 2*tick_period, so the timer never actually runs.
sched_tick_remote() does fire every second, but there are streaks of several
seconds where it keeps catching the CPU in a non-nohz state, so neither the
normal nor the remote tick calls calc_load_nohz_remote().

Is there a reason not to just remove the hrtimer_forward() from
tick_nohz_restart(), letting the timer fire if it's in the past, which will
take care of doing the hrtimer_forward()?

As for the warning in sched_tick_remote(), it seems like a test for time
since the last tick on this CPU (remote or otherwise) would be better than
relying on curr->se.exec_start, in order to detect things like this.

-Scott