From: "Paul E. McKenney" <paulmck@kernel.org>
To: Joel Fernandes <joel@joelfernandes.org>
Cc: rcu@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: Energy-efficiency options within RCU
Date: Mon, 14 Dec 2020 11:02:19 -0800
Message-ID: <20201214190219.GV2657@paulmck-ThinkPad-P72>
In-Reply-To: <X9erIC8Sbf3ybvHC@google.com>

On Mon, Dec 14, 2020 at 01:12:48PM -0500, Joel Fernandes wrote:
> On Thu, Dec 10, 2020 at 10:37:37AM -0800, Paul E. McKenney wrote:
> > Hello, Joel,
> >
> > In case you are -seriously- interested... ;-)
>
> I am always seriously interested :-). The trouble starts when life throws me a
> curveball. This was the year of curveballs :-)
>
> Thank you for your reply and I have added it to my list to investigate how we
> are configuring nocb on our systems. I don't think anyone over here has given
> these RCU issues a serious look.

In your defense, I would guess that many of these options don't have
that much effect. But true, you never know until you try.

Thanx, Paul

> thanks,
>
> - Joel
>
>
>
> > Thanx, Paul
> >
> > rcu_nocbs=
> >
> > Adding a CPU to this list offloads RCU callback invocation from
> > that CPU's softirq handler to a kthread. In big.LITTLE systems,
> > this kthread can be placed on a LITTLE CPU, which has been
> > demonstrated to save significant energy in benchmarks.
> > http://www.rdrop.com/users/paulmck/realtime/paper/AMPenergy.2013.04.19a.pdf
> >
> > nohz_full=
> >
> > Any CPU specified by this boot parameter is handled as if it was
> > specified by rcu_nocbs=.
> >
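
To make the relationship between these two parameters concrete, here
is a minimal Python sketch (not kernel code) that collects the CPUs
named by rcu_nocbs= or nohz_full= on a kernel command line. The helper
names parse_cpulist() and offloaded_cpus() and the sample command line
are made up for illustration, and only the simple "0-3,8" list form is
handled, so any other CPU-list syntax the kernel accepts is out of scope.

```python
def parse_cpulist(spec):
    """Expand a simple CPU list such as "0-3,8" into a set of ints."""
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-", 1)
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def offloaded_cpus(cmdline):
    """Return the CPUs named by rcu_nocbs= or nohz_full= in cmdline."""
    cpus = set()
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if sep and key in ("rcu_nocbs", "nohz_full"):
            cpus |= parse_cpulist(value)
    return cpus


# Hypothetical command line, for illustration only.
example = "root=/dev/sda1 rcu_nocbs=0-3 nohz_full=4,6-7 quiet"
print(sorted(offloaded_cpus(example)))
```

On a live system, the contents of /proc/cmdline could be passed in
place of the sample string.
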
> > rcutree.jiffies_till_first_fqs=
> >
> > Increasing this will reduce how often the grace-period kthread is
> > awakened for the first FQS (force-quiescent-state) scan, at the
> > cost of increased grace-period latency.
> >
> > rcutree.jiffies_till_next_fqs=
> >
> > Ditto, but for the second and subsequent FQS scans.
> >
> > My guess is that neither of these makes much difference. But if
> > they do, maybe some sort of backoff scheme for FQS scans?
> >
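
For what a backoff scheme might look like, here is a purely
hypothetical Python sketch. Nothing like this exists in the kernel as
far as this thread is concerned. The function name, the doubling
policy, the cap, and the sample numbers are all assumptions chosen
only to illustrate the idea: wait one interval before the first FQS
scan, another before the second, then double the wait after each later
scan up to a limit.

```python
def fqs_backoff_delays(first, nxt, cap, scans):
    """Return a hypothetical list of per-scan delays in jiffies.

    first: delay before the first FQS scan
    nxt:   delay before the second FQS scan
    cap:   upper bound on any later delay
    scans: number of scans to schedule
    """
    delays = []
    delay = first
    for i in range(scans):
        delays.append(delay)
        if i == 0:
            # Second scan uses the "next" interval, bounded by the cap.
            delay = min(nxt, cap)
        else:
            # Later scans back off exponentially, bounded by the cap.
            delay = min(delay * 2, cap)
    return delays


# Illustrative values only, not the kernel's defaults.
print(fqs_backoff_delays(first=3, nxt=3, cap=24, scans=6))
```

Whether the reduced wakeups would be worth the longer grace periods
is exactly the open question, so this is a thought experiment rather
than a proposal.
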
> > rcutree.jiffies_till_sched_qs=
> >
> > Increasing this will delay RCU's getting excited about CPUs and
> > tasks not responding with quiescent states. This excitement
> > can cause extra overhead.
> >
> > No idea whether adjusting this would help. But if you increase
> > rcutree.jiffies_till_first_fqs or rcutree.jiffies_till_next_fqs,
> > you might need to increase this one accordingly.
> >
> > rcutree.qovld=
> >
> > Increasing this will increase the grace-period duration at which
> > RCU starts sending IPIs, thus perhaps reducing the total number
> > of IPIs that RCU sends. The destination CPUs are unlikely to be
> > idle, so it is not clear to me that this would help much. But
> > perhaps I am wrong about them being mostly non-idle, who knows?
> >
> > rcupdate.rcu_cpu_stall_timeout=
> >
> > If you get overly zealous about the earlier kernel boot parameters,
> > you might need to increase this one as well. Or instead use the
> > rcupdate.rcu_cpu_stall_suppress= kernel boot parameter to suppress
> > RCU CPU stall warnings entirely.
> >
> > rcutree.rcu_nocb_gp_stride=
> >
> > Increasing this might reduce grace-period work somewhat. I don't
> > see why a (say) 16-CPU system really needs to have more than one
> > rcuog kthread, so if this does help it might be worthwhile setting
> > a lower limit to this kernel parameter.
> >
> > rcutree.rcu_idle_gp_delay= (Only CONFIG_RCU_FAST_NO_HZ=y kernels.)
> >
> > This defaults to four jiffies on the theory that grace periods
> > tend to last about that long. If grace periods tend to take
> > longer, then it makes a lot of sense to increase this. And maybe
> > battery-powered devices would rather have it be about 2x or 3x
> > the expected grace-period duration, who knows?
> >
> > I would keep it to a power of two, but the code should work with
> > other numbers. Except that I don't know that this has ever been
> > tested. ;-)
> >
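
As a rough illustration of the arithmetic, the following Python sketch
turns an expected grace-period duration into a candidate value in
jiffies, scaled by a multiple and rounded up to a power of two. The
function name, the default multiple of 2, and the sample inputs are
assumptions for illustration. The expected grace-period duration would
have to be measured on the system in question.

```python
import math


def suggest_idle_gp_delay(expected_gp_ms, hz, multiple=2):
    """Return a power-of-two jiffies count covering multiple * expected_gp_ms.

    expected_gp_ms: measured or estimated grace-period duration in ms
    hz:             the kernel's CONFIG_HZ value
    multiple:       how many expected grace periods the delay should span
    """
    jiffies = math.ceil(expected_gp_ms * multiple * hz / 1000)
    jiffies = max(1, jiffies)
    # Round up to the next power of two (values already a power of two stay put).
    return 1 << (jiffies - 1).bit_length()


# Illustrative inputs: 20 ms grace periods on an HZ=250 kernel.
print(suggest_idle_gp_delay(expected_gp_ms=20, hz=250))
```

With these sample inputs, twice the expected duration is 10 jiffies,
which rounds up to 16.
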
> > srcutree.exp_holdoff=
> >
> > Increasing this decreases the number of SRCU grace periods that
> > are treated as expedited. But you have to have closely-spaced
> > SRCU grace periods for this to matter. (These do happen at least
> > sometimes because I added this only because someone complained
> > about the performance regression from the earlier non-tree SRCU.)
> >
> > rcupdate.rcu_task_ipi_delay=
> >
> > This kernel parameter delays sending IPIs for RCU Tasks Trace,
> > which is used by sleepable BPF programs. Increasing it can
> > reduce overhead, but can also increase the latency of removing
> > sleepable BPF programs.
> >
> > rcupdate.rcu_task_stall_timeout=
> >
> > If you slow down RCU Tasks Trace too much, you may need this.
> > But then again, the default 10-minute value should suffice.
> >
> > CONFIG_RCU_FAST_NO_HZ=y
> >
> > This only has effect on CPUs not specified by rcu_nocbs, and thus
> > might be useful on systems that offload RCU callbacks only on
> > some of the CPUs. For example, a big.LITTLE system might offload
> > only the big CPUs. This Kconfig option reduces the frequency of
> > timer interrupts (and thus of RCU-related softirq processing)
> > on idle CPUs. This has been shown to save significant energy
> > in benchmarks:
> > http://www.rdrop.com/users/paulmck/realtime/paper/AMPenergy.2013.04.19a.pdf
> >
> > CONFIG_RCU_STRICT_GRACE_PERIOD=y
> >
> > This works hard (as in burns CPU) to sharply reduce grace-period
> > latency. The effect is probably to greatly increase power
> > consumption, but there might well be workloads where the shorter
> > grace periods more than make up for the extra CPU time. Or not.
> >
> > CONFIG_HZ=
> >
> > Reducing the scheduler-clock interrupt frequency has the opposite
> > effect, namely of increasing RCU grace-period latency, but while
> > also reducing RCU's CPU utilization.
> >
> > CONFIG_TASKS_TRACE_RCU_READ_MB=y
> >
> > Reduces the need to IPI RCU Tasks Trace holdout tasks, but at the
> > expense of an increase in to/from idle overhead. This Kconfig
> > option also slows down the rate at which RCU Tasks Trace polls
> > for holdout tasks. This polling rate cannot be separately
> > specified, but if changing the initial source-code values of
> > either rcu_tasks_trace.gp_sleep or rcu_tasks_trace.init_fract
> > proves useful, kernel boot parameters could be created.
> >
> > That said, automatic initialization heuristics are more
> > convenient. When they work, anyway.
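
Before changing any of the boot parameters above, it may help to see
what a given system is currently running with. The Python sketch below
assumes that the running kernel exposes these parameters as files
under /sys/module/<module>/parameters/, and it reports None for any
that are missing or unreadable. The PARAMS table and the function name
are illustrative choices. rcutree.rcu_idle_gp_delay is left out because
it exists only on CONFIG_RCU_FAST_NO_HZ=y kernels.

```python
import os

# Module name -> parameter names discussed in this thread.
PARAMS = {
    "rcutree": [
        "jiffies_till_first_fqs",
        "jiffies_till_next_fqs",
        "jiffies_till_sched_qs",
        "qovld",
        "rcu_nocb_gp_stride",
    ],
    "rcupdate": [
        "rcu_cpu_stall_timeout",
        "rcu_task_ipi_delay",
        "rcu_task_stall_timeout",
    ],
    "srcutree": ["exp_holdoff"],
}


def read_rcu_params(base="/sys/module"):
    """Return {"module.param": value-or-None} for the parameters above."""
    values = {}
    for module, names in PARAMS.items():
        for name in names:
            path = os.path.join(base, module, "parameters", name)
            try:
                with open(path) as f:
                    values[f"{module}.{name}"] = f.read().strip()
            except OSError:
                # Not present on this kernel, or not readable.
                values[f"{module}.{name}"] = None
    return values


for key, value in read_rcu_params().items():
    print(f"{key}={value}")
```

The output uses the same module.parameter=value spelling as the kernel
command line, which makes it easy to compare against /proc/cmdline.
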