* [PATCH tip/core/rcu 0/2] Documentation changes for 3.11
@ 2013-05-20 14:49 Paul E. McKenney
  2013-05-20 14:50 ` [PATCH tip/core/rcu 1/2] nohz_full: Update based on Sedat Dilek review Paul E. McKenney
  2013-05-21 17:33 ` [PATCH tip/core/rcu 0/2] Documentation changes for 3.11 Josh Triplett
  0 siblings, 2 replies; 6+ messages in thread
From: Paul E. McKenney @ 2013-05-20 14:49 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, edumazet, darren,
	fweisbec, sbw

Hello!

A pair of documentation changes.  These are not directly related to RCU,
but if no one adopts them, I will send them up with the RCU commits.  The
patches are as follows:

1.	Changes to the new Documentation/timers/NO_HZ.txt documentation
	based on review comments from Sedat Dilek.

2.	Document ways of turning off the scheduling-clock tick even when
	there is more than one runnable task on a given CPU.

3.	Document ways of avoiding OS jitter from the kworker workqueue
	kthreads.

							Thanx, Paul

------------------------------------------------------------------------

 b/Documentation/kernel-per-CPU-kthreads.txt |   47 ++++++++++++++++
 b/Documentation/timers/NO_HZ.txt            |   79 +++++++++++++++++++++++-----
 2 files changed, 114 insertions(+), 12 deletions(-)



* [PATCH tip/core/rcu 1/2] nohz_full: Update based on Sedat Dilek review
  2013-05-20 14:49 [PATCH tip/core/rcu 0/2] Documentation changes for 3.11 Paul E. McKenney
@ 2013-05-20 14:50 ` Paul E. McKenney
  2013-05-20 14:50   ` [PATCH tip/core/rcu 2/2] nohz_full: Document additional restrictions Paul E. McKenney
  2013-05-20 14:50   ` [PATCH tip/core/rcu 1/1] kthread: Add kworker kthreads to OS-jitter documentation Paul E. McKenney
  2013-05-21 17:33 ` [PATCH tip/core/rcu 0/2] Documentation changes for 3.11 Josh Triplett
  1 sibling, 2 replies; 6+ messages in thread
From: Paul E. McKenney @ 2013-05-20 14:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, edumazet, darren,
	fweisbec, sbw, Paul E. McKenney

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>

Make it more clear that there are three options, and give hints as
to which of the three is most likely to be useful in different
situations.

Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 Documentation/timers/NO_HZ.txt | 58 +++++++++++++++++++++++++++++++++++-------
 1 file changed, 49 insertions(+), 9 deletions(-)

diff --git a/Documentation/timers/NO_HZ.txt b/Documentation/timers/NO_HZ.txt
index 5b53220..d5323e0 100644
--- a/Documentation/timers/NO_HZ.txt
+++ b/Documentation/timers/NO_HZ.txt
@@ -7,21 +7,59 @@ efficiency and reducing OS jitter.  Reducing OS jitter is important for
 some types of computationally intensive high-performance computing (HPC)
 applications and for real-time applications.
 
-There are two main contexts in which the number of scheduling-clock
-interrupts can be reduced compared to the old-school approach of sending
-a scheduling-clock interrupt to all CPUs every jiffy whether they need
-it or not (CONFIG_HZ_PERIODIC=y or CONFIG_NO_HZ=n for older kernels):
+There are three main ways of managing scheduling-clock interrupts
+(also known as "scheduling-clock ticks" or simply "ticks"):
 
-1.	Idle CPUs (CONFIG_NO_HZ_IDLE=y or CONFIG_NO_HZ=y for older kernels).
+1.	Never omit scheduling-clock ticks (CONFIG_HZ_PERIODIC=y or
+	CONFIG_NO_HZ=n for older kernels).  You normally will -not-
+	want to choose this option.
 
-2.	CPUs having only one runnable task (CONFIG_NO_HZ_FULL=y).
+2.	Omit scheduling-clock ticks on idle CPUs (CONFIG_NO_HZ_IDLE=y or
+	CONFIG_NO_HZ=y for older kernels).  This is the most common
+	approach, and should be the default.
 
-These two cases are described in the following two sections, followed
+3.	Omit scheduling-clock ticks on CPUs that are either idle or that
+	have only one runnable task (CONFIG_NO_HZ_FULL=y).  Unless you
+	are running realtime applications or certain types of HPC
+	workloads, you will normally -not- want this option.
+
+These three cases are described in the following three sections, followed
 by a third section on RCU-specific considerations and a fourth and final
 section listing known issues.
 
 
-IDLE CPUs
+NEVER OMIT SCHEDULING-CLOCK TICKS
+
+Very old versions of Linux from the 1990s and the very early 2000s
+are incapable of omitting scheduling-clock ticks.  It turns out that
+there are some situations where this old-school approach is still the
+right approach, for example, in heavy workloads with lots of tasks
+that use short bursts of CPU, where there are very frequent idle
+periods, but where these idle periods are also quite short (tens or
+hundreds of microseconds).  For these types of workloads, scheduling
+clock interrupts will normally be delivered anyway because there
+will frequently be multiple runnable tasks per CPU.  In these cases,
+attempting to turn off the scheduling clock interrupt will have no effect
+other than increasing the overhead of switching to and from idle and
+transitioning between user and kernel execution.
+
+This mode of operation can be selected using CONFIG_HZ_PERIODIC=y (or
+CONFIG_NO_HZ=n for older kernels).
+
+However, if you are instead running a light workload with long idle
+periods, failing to omit scheduling-clock interrupts will result in
+excessive power consumption.  This is especially bad on battery-powered
+devices, where it results in extremely short battery lifetimes.  If you
+are running light workloads, you should therefore read the following
+section.
+
+In addition, if you are running either a real-time workload or an HPC
+workload with short iterations, the scheduling-clock interrupts can
+degrade your application's performance.  If this describes your workload,
+you should read the following two sections.
+
+
+OMIT SCHEDULING-CLOCK TICKS FOR IDLE CPUs
 
 If a CPU is idle, there is little point in sending it a scheduling-clock
 interrupt.  After all, the primary purpose of a scheduling-clock interrupt
@@ -59,10 +97,12 @@ By default, CONFIG_NO_HZ_IDLE=y kernels boot with "nohz=on", enabling
 dyntick-idle mode.
 
 
-CPUs WITH ONLY ONE RUNNABLE TASK
+OMIT SCHEDULING-CLOCK TICKS FOR CPUs WITH ONLY ONE RUNNABLE TASK
 
 If a CPU has only one runnable task, there is little point in sending it
 a scheduling-clock interrupt because there is no other task to switch to.
+Note that omitting scheduling-clock ticks for CPUs with only one runnable
+task implies also omitting them for idle CPUs.
 
 The CONFIG_NO_HZ_FULL=y Kconfig option causes the kernel to avoid
 sending scheduling-clock interrupts to CPUs with a single runnable task,
-- 
1.8.1.5
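
The three tick-management options described in the patch above can be told apart by inspecting a kernel .config. The following is a minimal sketch, not part of the patch: the function name tick_mode() and its return labels are hypothetical, and it assumes the usual .config conventions of "CONFIG_FOO=y" for enabled options and "# CONFIG_FOO is not set" for disabled ones. Only the option names are taken from the patch text.

```python
def tick_mode(config_text: str) -> str:
    """Classify a kernel .config by its scheduling-clock tick option.

    Returns one of the hypothetical labels "full", "idle", "periodic",
    or "unknown".
    """
    enabled, disabled = set(), set()
    for line in config_text.splitlines():
        line = line.strip()
        if line.endswith("=y"):
            enabled.add(line[:-2])
        elif line.startswith("# ") and line.endswith(" is not set"):
            disabled.add(line[2:-len(" is not set")])

    # Option 3: omit ticks on idle CPUs and on CPUs with one runnable task.
    if "CONFIG_NO_HZ_FULL" in enabled:
        return "full"
    # Option 2: omit ticks on idle CPUs (CONFIG_NO_HZ=y on older kernels).
    if "CONFIG_NO_HZ_IDLE" in enabled or "CONFIG_NO_HZ" in enabled:
        return "idle"
    # Option 1: never omit ticks (CONFIG_NO_HZ=n on older kernels).
    if "CONFIG_HZ_PERIODIC" in enabled or "CONFIG_NO_HZ" in disabled:
        return "periodic"
    return "unknown"


if __name__ == "__main__":
    print(tick_mode("CONFIG_NO_HZ_IDLE=y\nCONFIG_HZ=1000\n"))
```

CONFIG_NO_HZ_FULL is checked first because, as the patch notes, omitting ticks for single-task CPUs implies omitting them for idle CPUs as well.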



* [PATCH tip/core/rcu 2/2] nohz_full: Document additional restrictions
  2013-05-20 14:50 ` [PATCH tip/core/rcu 1/2] nohz_full: Update based on Sedat Dilek review Paul E. McKenney
@ 2013-05-20 14:50   ` Paul E. McKenney
  2013-05-20 14:50   ` [PATCH tip/core/rcu 1/1] kthread: Add kworker kthreads to OS-jitter documentation Paul E. McKenney
  1 sibling, 0 replies; 6+ messages in thread
From: Paul E. McKenney @ 2013-05-20 14:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, edumazet, darren,
	fweisbec, sbw, Paul E. McKenney

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>

This commit calls out the potential for slowing the tick even when there
are multiple runnable processes per CPU.  It also points out that the current
mainline version keeps the tick going on at least one CPU even when all
CPUs are otherwise idle.  Finally, it notes the need for a 1-HZ tick in
order to calculate CPU load, maintain sched average, compute CFS entity
vruntime, compute avenrun, and carry out load balancing.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 Documentation/timers/NO_HZ.txt | 21 ++++++++++++++++++---
 1 file changed, 18 insertions(+), 3 deletions(-)

diff --git a/Documentation/timers/NO_HZ.txt b/Documentation/timers/NO_HZ.txt
index d5323e0..8869758 100644
--- a/Documentation/timers/NO_HZ.txt
+++ b/Documentation/timers/NO_HZ.txt
@@ -278,6 +278,11 @@ o	Adaptive-ticks does not do anything unless there is only one
 	single runnable SCHED_FIFO task and multiple runnable SCHED_OTHER
 	tasks, even though these interrupts are unnecessary.
 
+	And even when there are multiple runnable tasks on a given CPU,
+	there is little point in interrupting that CPU until the current
+	running task's timeslice expires, which is almost always way
+	longer than the time of the next scheduling-clock interrupt.
+
 	Better handling of these sorts of situations is future work.
 
 o	A reboot is required to reconfigure both adaptive idle and RCU
@@ -308,6 +313,16 @@ o	Unless all CPUs are idle, at least one CPU must keep the
 	scheduling-clock interrupt going in order to support accurate
 	timekeeping.
 
-o	If there are adaptive-ticks CPUs, there will be at least one
-	CPU keeping the scheduling-clock interrupt going, even if all
-	CPUs are otherwise idle.
+o	If there might potentially be some adaptive-ticks CPUs, there
+	will be at least one CPU keeping the scheduling-clock interrupt
+	going, even if all CPUs are otherwise idle.
+
+	Better handling of this situation is ongoing work.
+
+o	Some process-handling operations still require the occasional
+	scheduling-clock tick.	These operations include calculating CPU
+	load, maintaining sched average, computing CFS entity vruntime,
+	computing avenrun, and carrying out load balancing.  They are
+	currently accommodated by a scheduling-clock tick every second
+	or so.	On-going work will eliminate the need even for these
+	infrequent scheduling-clock ticks.
-- 
1.8.1.5
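
To put the residual tick of "every second or so" in perspective, here is a rough back-of-the-envelope sketch, not part of the patch. It assumes HZ=1000 (the patch does not state a value), treats the residual tick as exactly 1 Hz, and ignores the CPU that keeps ticking for timekeeping. The function name ticks_delivered() is hypothetical.

```python
def ticks_delivered(seconds: int, hz: int, adaptive: bool) -> int:
    """Approximate scheduling-clock ticks seen by one CPU over an interval.

    With adaptive=False the CPU takes a tick every jiffy (hz per second).
    With adaptive=True the CPU is assumed to run a single task under
    adaptive-ticks and to take only the residual tick, modeled as 1 Hz.
    """
    rate = 1 if adaptive else hz
    return seconds * rate


if __name__ == "__main__":
    hz = 1000  # assumed value of CONFIG_HZ
    periodic = ticks_delivered(60, hz, adaptive=False)
    residual = ticks_delivered(60, hz, adaptive=True)
    print(f"periodic: {periodic} ticks/min, adaptive: {residual} ticks/min")
```

Under these assumptions a single-task CPU goes from 60000 interrupts per minute to 60, which is why the remaining 1-HZ tick is described as infrequent even though it is not yet eliminated.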



* [PATCH tip/core/rcu 1/1] kthread: Add kworker kthreads to OS-jitter documentation
  2013-05-20 14:50 ` [PATCH tip/core/rcu 1/2] nohz_full: Update based on Sedat Dilek review Paul E. McKenney
  2013-05-20 14:50   ` [PATCH tip/core/rcu 2/2] nohz_full: Document additional restrictions Paul E. McKenney
@ 2013-05-20 14:50   ` Paul E. McKenney
  1 sibling, 0 replies; 6+ messages in thread
From: Paul E. McKenney @ 2013-05-20 14:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, edumazet, darren,
	fweisbec, sbw, Paul E. McKenney

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>

The kworker workqueue kthreads can also contribute to OS jitter.
The amount of jitter depends on their use, so this commit adds
documentation on avoiding OS jitter due to workqueue use.

Reported-by: Jonathan Clairembault <jonathan.clairembault@novasparks.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 Documentation/kernel-per-CPU-kthreads.txt | 47 +++++++++++++++++++++++++++++++
 1 file changed, 47 insertions(+)

diff --git a/Documentation/kernel-per-CPU-kthreads.txt b/Documentation/kernel-per-CPU-kthreads.txt
index cbf7ae4..5f39ef5 100644
--- a/Documentation/kernel-per-CPU-kthreads.txt
+++ b/Documentation/kernel-per-CPU-kthreads.txt
@@ -157,6 +157,53 @@ RCU_SOFTIRQ:  Do at least one of the following:
 		calls and by forcing both kernel threads and interrupts
 		to execute elsewhere.
 
+Name: kworker/%u:%d%s (cpu, id, priority)
+Purpose: Execute workqueue requests
+To reduce its OS jitter, do any of the following:
+1.	Run your workload at a real-time priority, which will allow
+	preempting the kworker daemons.
+2.	Do any of the following needed to avoid jitter that your
+	application cannot tolerate:
+	a.	Build your kernel with CONFIG_SLUB=y rather than
+		CONFIG_SLAB=y, thus avoiding the slab allocator's periodic
+		use of each CPU's workqueues to run its cache_reap()
+		function.
+	b.	Avoid using oprofile, thus avoiding OS jitter from
+		wq_sync_buffer().
+	c.	Limit your CPU frequency so that a CPU-frequency
+		governor is not required, possibly enlisting the aid of
+		special heatsinks or other cooling technologies.  If done
+		correctly, and if your CPU architecture permits, you should
+		be able to build your kernel with CONFIG_CPU_FREQ=n to
+		avoid the CPU-frequency governor periodically running
+		on each CPU, including cs_dbs_timer() and od_dbs_timer().
+		WARNING:  Please check your CPU specifications to
+		make sure that this is safe on your particular system.
+	d.	It is not possible to entirely get rid of OS jitter
+		from vmstat_update() on CONFIG_SMP=y systems, but you
+		can decrease its frequency by writing a large value to
+		/proc/sys/vm/stat_interval.  The default value is HZ,
+		for an interval of one second.  Of course, larger values
+		will make your virtual-memory statistics update more
+		slowly.  Alternatively, you can run your workload at
+		a real-time priority, thus preempting vmstat_update().
+	e.	If running on high-end powerpc servers, build with
+		CONFIG_PPC_RTAS_DAEMON=n.  This prevents the RTAS
+		daemon from running on each CPU every second or so.
+		(This will require editing Kconfig files and will defeat
+		this platform's RAS functionality.)  This avoids jitter
+		due to the rtas_event_scan() function.
+		WARNING:  Please check your CPU specifications to
+		make sure that this is safe on your particular system.
+	f.	If running on Cell Processor, build your kernel with
+		CONFIG_CBE_CPUFREQ_SPU_GOVERNOR=n to avoid OS jitter from
+		spu_gov_work().
+		WARNING:  Please check your CPU specifications to
+		make sure that this is safe on your particular system.
+	g.	If running on PowerMAC, build your kernel with
+		CONFIG_PMAC_RACKMETER=n to disable the CPU-meter,
+		avoiding OS jitter from rackmeter_do_timer().
+
 Name: rcuc/%u
 Purpose: Execute RCU callbacks in CONFIG_RCU_BOOST=y kernels.
 To reduce its OS jitter, do at least one of the following:
-- 
1.8.1.5
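
Item 2d in the patch above, lowering the frequency of vmstat_update() by writing a larger value to /proc/sys/vm/stat_interval, could be scripted roughly as follows. This is a sketch under stated assumptions rather than a recommended tool: the helper name set_stat_interval() is hypothetical, the value is assumed to be interpreted in seconds, and writing to the real file requires root. The path is a parameter so that the helper can be tried against an ordinary file first.

```python
from pathlib import Path

# Path named in the patch; writing to it normally requires root.
STAT_INTERVAL = Path("/proc/sys/vm/stat_interval")


def set_stat_interval(seconds: int, path: Path = STAT_INTERVAL) -> int:
    """Write a new vmstat update interval and return the previous value.

    The unit is assumed to be seconds.  Larger values reduce how often
    vmstat_update() runs, at the cost of staler VM statistics.
    """
    if seconds < 1:
        raise ValueError("interval must be at least 1")
    previous = int(path.read_text().strip())
    path.write_text(f"{seconds}\n")
    return previous


if __name__ == "__main__":
    import tempfile

    # Demonstrate against a scratch file instead of the live sysctl.
    scratch = Path(tempfile.mkdtemp()) / "stat_interval"
    scratch.write_text("1\n")
    old = set_stat_interval(120, scratch)
    print(f"was {old}, now {scratch.read_text().strip()}")
```

Returning the previous value makes it easy to restore the original interval once the jitter-sensitive workload has finished.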



* Re: [PATCH tip/core/rcu 0/2] Documentation changes for 3.11
  2013-05-20 14:49 [PATCH tip/core/rcu 0/2] Documentation changes for 3.11 Paul E. McKenney
  2013-05-20 14:50 ` [PATCH tip/core/rcu 1/2] nohz_full: Update based on Sedat Dilek review Paul E. McKenney
@ 2013-05-21 17:33 ` Josh Triplett
  2013-05-21 21:12   ` Paul E. McKenney
  1 sibling, 1 reply; 6+ messages in thread
From: Josh Triplett @ 2013-05-21 17:33 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells, edumazet,
	darren, fweisbec, sbw

On Mon, May 20, 2013 at 07:49:34AM -0700, Paul E. McKenney wrote:
> Hello!
> 
> A pair of documentation changes.  These are not directly related to RCU,
> but if no one adopts them, I will send them up with the RCU commits.  The
> patches are as follows:
> 
> 1.	Changes to the new Documentation/timers/NO_HZ.txt documentation
> 	based on review comments from Sedat Dilek.
> 
> 2.	Document ways of turning off the scheduling-clock tick even when
> 	there is more than one runnable task on a given CPU.
> 
> 3.	Document ways of avoiding OS jitter from the kworker workqueue
> 	kthreads.

For some reason these are numbered 1/2, 2/2, and 1/1.

In any case, for all three:

Reviewed-by: Josh Triplett <josh@joshtriplett.org>


* Re: [PATCH tip/core/rcu 0/2] Documentation changes for 3.11
  2013-05-21 17:33 ` [PATCH tip/core/rcu 0/2] Documentation changes for 3.11 Josh Triplett
@ 2013-05-21 21:12   ` Paul E. McKenney
  0 siblings, 0 replies; 6+ messages in thread
From: Paul E. McKenney @ 2013-05-21 21:12 UTC (permalink / raw)
  To: Josh Triplett
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells, edumazet,
	darren, fweisbec, sbw

On Tue, May 21, 2013 at 10:33:44AM -0700, Josh Triplett wrote:
> On Mon, May 20, 2013 at 07:49:34AM -0700, Paul E. McKenney wrote:
> > Hello!
> > 
> > A pair of documentation changes.  These are not directly related to RCU,
> > but if no one adopts them, I will send them up with the RCU commits.  The
> > patches are as follows:
> > 
> > 1.	Changes to the new Documentation/timers/NO_HZ.txt documentation
> > 	based on review comments from Sedat Dilek.
> > 
> > 2.	Document ways of turning off the scheduling-clock tick even when
> > 	there is more than one runnable task on a given CPU.
> > 
> > 3.	Document ways of avoiding OS jitter from the kworker workqueue
> > 	kthreads.
> 
> For some reason these are numbered 1/2, 2/2, and 1/1.

That was my unsuccessful attempt to fool git-send-email.  Now that
3.10-rc2 is out, there is no longer any need to fool it.  (Patch 3
depended on a patch that was on a different branch that I didn't want
to merge into the main RCU branch, but now all of that is in mainline,
so I just rebased onto 3.10-rc2.)

> In any case, for all three:
> 
> Reviewed-by: Josh Triplett <josh@joshtriplett.org>

Thank you for the review!

							Thanx, Paul

