rcu.vger.kernel.org archive mirror
* Energy-efficiency options within RCU
@ 2020-12-10 18:37 Paul E. McKenney
  2020-12-10 19:23 ` Uladzislau Rezki
  2020-12-14 18:12 ` Joel Fernandes
  0 siblings, 2 replies; 4+ messages in thread
From: Paul E. McKenney @ 2020-12-10 18:37 UTC (permalink / raw)
  To: joel; +Cc: rcu, linux-kernel

Hello, Joel,

In case you are -seriously- interested...  ;-)

						Thanx, Paul

rcu_nocbs=

	Adding a CPU to this list offloads RCU callback invocation from
	that CPU's softirq handler to a kthread.  In big.LITTLE systems,
	this kthread can be placed on a LITTLE CPU, which has been
	demonstrated to save significant energy in benchmarks.
	http://www.rdrop.com/users/paulmck/realtime/paper/AMPenergy.2013.04.19a.pdf
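
	The placement step can be sketched from userspace as follows.
	This is only a sketch under stated assumptions: the kthread names
	(rcuog/N for the grace-period kthreads, rcuop/N or rcuos/N for the
	callback kthreads) are assumed from kernels of this era, the helper
	names are made up for illustration, and which CPUs form the LITTLE
	cluster is board-specific.

```python
import os
import re

# Assumption: RCU offload kthreads are named rcuog/N, rcuop/N, or rcuos/N.
RCU_OFFLOAD_COMM = re.compile(r"^rcuo[gps]/\d+$")


def is_rcu_offload_kthread(comm):
    """Return True if a task name looks like an RCU offload kthread."""
    return RCU_OFFLOAD_COMM.match(comm.strip()) is not None


def pin_rcu_offload_kthreads(little_cpus, proc_root="/proc"):
    """Restrict every RCU offload kthread to the CPUs in little_cpus.

    Needs root.  Returns the (pid, comm) pairs whose affinity was set.
    """
    pinned = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # Not a task directory.
        try:
            with open(os.path.join(proc_root, entry, "comm")) as f:
                comm = f.read().strip()
        except OSError:
            continue  # The task exited while we were scanning.
        if is_rcu_offload_kthread(comm):
            os.sched_setaffinity(int(entry), little_cpus)
            pinned.append((int(entry), comm))
    return pinned
```

	On a hypothetical system whose LITTLE cluster is CPUs 0-3, running
	pin_rcu_offload_kthreads({0, 1, 2, 3}) as root would confine the
	kthreads to that cluster.  Whether this saves energy on a given
	workload still needs to be measured.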

nohz_full=

	Any CPU specified by this boot parameter is handled as if it were
	specified by rcu_nocbs=.

rcutree.jiffies_till_first_fqs=

	Increasing this will reduce how often the grace-period kthread
	is awakened for the first force-quiescent-state (FQS) scan, but
	will also increase grace-period latency.

rcutree.jiffies_till_next_fqs=

	Ditto, but for the second and subsequent FQS scans.

	My guess is that neither of these makes much difference.  But if
	they do, maybe some sort of backoff scheme for FQS scans?
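
	To make the backoff idea concrete, here is a purely hypothetical
	sketch.  Nothing in it comes from the kernel, which today waits a
	fixed rcutree.jiffies_till_next_fqs between scans.  The function
	name, the doubling factor, and the cap are all assumptions chosen
	for illustration.

```python
def fqs_backoff_delays(first_fqs, next_fqs, scans, factor=2, cap=None):
    """Return the wait, in jiffies, before each of `scans` FQS scans.

    The first wait is first_fqs.  Later waits start at next_fqs and are
    multiplied by `factor` after each scan, limited to `cap` if given.
    """
    if scans <= 0:
        return []
    delays = [first_fqs]
    delay = next_fqs
    for _ in range(scans - 1):
        delays.append(delay if cap is None else min(delay, cap))
        delay *= factor
    return delays
```

	With illustrative values of three jiffies for both parameters and
	a cap of 16, fqs_backoff_delays(3, 3, 5, cap=16) yields the waits
	3, 3, 6, 12, and 16, so a long-running grace period would wake the
	grace-period kthread progressively less often.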

rcutree.jiffies_till_sched_qs=

	Increasing this will delay RCU's getting excited about CPUs and
	tasks not responding with quiescent states.  This excitement
	can cause extra overhead.

	No idea whether adjusting this would help.  But if you increase
	rcutree.jiffies_till_first_fqs or rcutree.jiffies_till_next_fqs,
	you might need to increase this one accordingly.

rcutree.qovld=

	Increasing this will increase the grace-period duration at which
	RCU starts sending IPIs, thus perhaps reducing the total number
	of IPIs that RCU sends.  The destination CPUs are unlikely to be
	idle, so it is not clear to me that this would help much.  But
	perhaps I am wrong about them being mostly non-idle, who knows?

rcupdate.rcu_cpu_stall_timeout=

	If you get overly zealous about the earlier kernel boot parameters,
	you might need to increase this one as well.  Or instead use the
	rcupdate.rcu_cpu_stall_suppress= kernel boot parameter to suppress
	RCU CPU stall warnings entirely.

rcutree.rcu_nocb_gp_stride=

	Increasing this might reduce grace-period work somewhat.  I don't
	see why a (say) 16-CPU system really needs to have more than one
	rcuog kthread, so if this does help it might be worthwhile setting
	a lower limit to this kernel parameter.
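
	The relationship between the stride and the number of rcuog
	kthreads can be sketched as below.  The assumptions are loud
	ones: the default of nr_cpu_ids divided by its integer square
	root, and the grouping of offloaded CPUs into stride-aligned
	blocks, reflect one reading of the kernel source of this era and
	should be checked against the kernel actually in use.  The
	function names are illustrative.

```python
import math


def default_gp_stride(nr_cpu_ids):
    """Assumed default stride: nr_cpu_ids / int_sqrt(nr_cpu_ids)."""
    return nr_cpu_ids // math.isqrt(nr_cpu_ids)


def rcuog_groups(offloaded_cpus, stride):
    """Group offloaded CPUs into stride-aligned blocks.

    Each returned group is assumed to share one rcuog kthread.
    """
    groups = {}
    for cpu in sorted(offloaded_cpus):
        groups.setdefault(cpu // stride, []).append(cpu)
    return list(groups.values())
```

	Under these assumptions a fully offloaded 16-CPU system gets a
	default stride of 4 and thus four groups, while a stride of 16
	collapses them into a single group.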

rcutree.rcu_idle_gp_delay=  (Only CONFIG_RCU_FAST_NO_HZ=y kernels.)

	This defaults to four jiffies on the theory that grace periods
	tend to last about that long.  If grace periods tend to take
	longer, then it makes a lot of sense to increase this.  And maybe
	battery-powered devices would rather have it be about 2x or 3x
	the expected grace-period duration, who knows?

	I would keep it to a power of two, but the code should work with
	other numbers.  Except that I don't know that this has ever been
	tested.  ;-)
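
	One way to turn that rule of thumb into a number is sketched
	below.  The helper name is hypothetical, and the expected
	grace-period duration must come from measurements on the target
	system.

```python
def idle_gp_delay_for(expected_gp_jiffies, multiple=2):
    """Smallest power of two that is >= multiple * expected_gp_jiffies."""
    target = max(1, multiple * expected_gp_jiffies)
    return 1 << (target - 1).bit_length()
```

	For grace periods of about four jiffies, a 2x target gives 8 and
	a 3x target rounds up from 12 to 16.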

srcutree.exp_holdoff=

	Increasing this decreases the number of SRCU grace periods that
	are treated as expedited.  But you have to have closely spaced
	SRCU grace periods for this to matter.  (These do happen at least
	sometimes:  I added this parameter only because someone complained
	about the performance regression from the earlier non-tree SRCU.)

rcupdate.rcu_task_ipi_delay=

	This kernel parameter delays sending IPIs for RCU Tasks Trace,
	which is used by sleepable BPF programs.  Increasing it can
	reduce overhead, but can also increase the latency of removing
	sleepable BPF programs.

rcupdate.rcu_task_stall_timeout=

	If you slow down RCU Tasks Trace too much, you may need this.
	But then again, the default 10-minute value should suffice.

CONFIG_RCU_FAST_NO_HZ=y

	This affects only those CPUs not specified by rcu_nocbs=, and thus
	might be useful on systems that offload RCU callbacks from only
	some of the CPUs.  For example, a big.LITTLE system might offload
	only the big CPUs.  This Kconfig option reduces the frequency of
	timer interrupts (and thus of RCU-related softirq processing)
	on idle CPUs.  This has been shown to save significant energy
	in benchmarks:
	http://www.rdrop.com/users/paulmck/realtime/paper/AMPenergy.2013.04.19a.pdf

CONFIG_RCU_STRICT_GRACE_PERIOD=y

	This works hard (as in burns CPU) to sharply reduce grace-period
	latency.  The effect is probably to greatly increase power
	consumption, but there might well be workloads where the shorter
	grace periods more than make up for the extra CPU time.  Or not.

CONFIG_HZ=

	Reducing the scheduler-clock interrupt frequency has the opposite
	effect, namely increasing RCU grace-period latency while also
	reducing RCU's CPU utilization.
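
	Because the jiffies-based parameters above are counted in
	scheduler-clock ticks, the same setting covers a different amount
	of wall-clock time at each HZ.  A small sketch of the arithmetic
	(the helper name is illustrative):

```python
def jiffies_to_ms(jiffies, hz):
    """Convert a jiffies count to milliseconds for a given CONFIG_HZ."""
    return jiffies * 1000 / hz


# The same four-jiffy setting at three common CONFIG_HZ values.
for hz in (1000, 250, 100):
    print(f"HZ={hz}: 4 jiffies = {jiffies_to_ms(4, hz)} ms")
```

	Four jiffies is 4 ms at HZ=1000 but 40 ms at HZ=100, so any
	jiffies-based tuning needs to be revisited when CONFIG_HZ changes.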

CONFIG_TASKS_TRACE_RCU_READ_MB=y

	Reduces the need to IPI RCU Tasks Trace holdout tasks, but at the
	expense of an increase in to/from-idle overhead.  This Kconfig
	option also slows down the rate at which RCU Tasks Trace polls
	for holdout tasks.  This polling rate cannot be separately
	specified, but if changing the initial source-code values of
	either rcu_tasks_trace.gp_sleep or rcu_tasks_trace.init_fract
	proves useful, kernel boot parameters could be created.

	That said, automatic initialization heuristics are more
	convenient.  When they work, anyway.
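
	Before changing any of the boot parameters listed above, it can
	help to record what the running kernel is using.  The sketch
	below assumes that these module parameters are exported under
	/sys/module/<module>/parameters/, which depends on the kernel
	version and configuration.  Anything that cannot be read is
	reported as None rather than guessed at.

```python
import os

# Parameters named in this message, grouped by module-name prefix.
PARAMS = {
    "rcutree": [
        "jiffies_till_first_fqs",
        "jiffies_till_next_fqs",
        "jiffies_till_sched_qs",
        "qovld",
        "rcu_nocb_gp_stride",
    ],
    "rcupdate": [
        "rcu_cpu_stall_timeout",
        "rcu_task_ipi_delay",
        "rcu_task_stall_timeout",
    ],
    "srcutree": ["exp_holdoff"],
}


def read_rcu_params(sys_module="/sys/module"):
    """Return {"module.param": value-or-None} for the parameters above."""
    values = {}
    for module, names in PARAMS.items():
        for name in names:
            path = os.path.join(sys_module, module, "parameters", name)
            try:
                with open(path) as f:
                    values[f"{module}.{name}"] = f.read().strip()
            except OSError:
                values[f"{module}.{name}"] = None  # Absent or unreadable.
    return values
```

	Printing the result of read_rcu_params() before and after a
	change gives a simple record to go with any power measurements.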


* Re: Energy-efficiency options within RCU
  2020-12-10 18:37 Energy-efficiency options within RCU Paul E. McKenney
@ 2020-12-10 19:23 ` Uladzislau Rezki
  2020-12-14 18:12 ` Joel Fernandes
  1 sibling, 0 replies; 4+ messages in thread
From: Uladzislau Rezki @ 2020-12-10 19:23 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: joel, rcu, linux-kernel

Hello, Paul.

[Dropping CC]

> Hello, Joel,
> 
> In case you are -seriously- interested...  ;-)
> 
> 						Thanx, Paul
> 
> rcu_nocbs=
> 
> 	Adding a CPU to this list offloads RCU callback invocation from
> 	that CPU's softirq handler to a kthread.  In big.LITTLE systems,
> 	this kthread can be placed on a LITTLE CPU, which has been
> 	demonstrated to save significant energy in benchmarks.
> 	http://www.rdrop.com/users/paulmck/realtime/paper/AMPenergy.2013.04.19a.pdf
> 
I have checked our config.  We do use rcu_nocbs=0-7, so the callbacks run
in kthreads, but from what I can see those kthreads are not bound to CPUs
0-3, which in our case is the little cluster.  I think I should run some
test cases to check the power savings from pinning all of those kthreads
to the little cluster.

> rcutree.rcu_idle_gp_delay=  (Only CONFIG_RCU_FAST_NO_HZ=y kernels.)
> 
> 	This defaults to four jiffies on the theory that grace periods
> 	tend to last about that long.  If grace periods tend to take
> 	longer, then it makes a lot of sense to increase this.	And maybe
> 	battery-powered devices would rather have it be about 2x or 3x
> 	the expected grace-period duration, who knows?
> 
> 	I would keep it to a power of two, but the code should work with
> 	other numbers.  Except that I don't know that this has ever been
> 	tested.  ;-)
> 
Same here. We do use it.

--
Vlad Rezki


* Re: Energy-efficiency options within RCU
  2020-12-10 18:37 Energy-efficiency options within RCU Paul E. McKenney
  2020-12-10 19:23 ` Uladzislau Rezki
@ 2020-12-14 18:12 ` Joel Fernandes
  2020-12-14 19:02   ` Paul E. McKenney
  1 sibling, 1 reply; 4+ messages in thread
From: Joel Fernandes @ 2020-12-14 18:12 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: rcu, linux-kernel

On Thu, Dec 10, 2020 at 10:37:37AM -0800, Paul E. McKenney wrote:
> Hello, Joel,
> 
> In case you are -seriously- interested...  ;-)

I am always seriously interested :-). The trouble comes when life throws me a
curveball. This was the year of curveballs :-)

Thank you for your reply. I have added it to my list to investigate how we
are configuring nocb on our systems. I don't think anyone over here has given
these RCU issues a serious look.

thanks,

 - Joel





* Re: Energy-efficiency options within RCU
  2020-12-14 18:12 ` Joel Fernandes
@ 2020-12-14 19:02   ` Paul E. McKenney
  0 siblings, 0 replies; 4+ messages in thread
From: Paul E. McKenney @ 2020-12-14 19:02 UTC (permalink / raw)
  To: Joel Fernandes; +Cc: rcu, linux-kernel

On Mon, Dec 14, 2020 at 01:12:48PM -0500, Joel Fernandes wrote:
> On Thu, Dec 10, 2020 at 10:37:37AM -0800, Paul E. McKenney wrote:
> > Hello, Joel,
> > 
> > In case you are -seriously- interested...  ;-)
> 
> I am always seriously interested :-). The trouble comes when life throws me a
> curveball. This was the year of curveballs :-)
> 
> Thank you for your reply. I have added it to my list to investigate how we
> are configuring nocb on our systems. I don't think anyone over here has given
> these RCU issues a serious look.

In your defense, I would guess that many of them don't have that much
effect.  But true, you never know until you try.

							Thanx, Paul

> thanks,
> 
>  - Joel
> 
