All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH tip/core/rcu 0/9] Stall-warning changes for 3.20
@ 2015-01-07 17:52 Paul E. McKenney
  2015-01-07 17:52 ` [PATCH tip/core/rcu 01/10] rcu: Set default to RCU_CPU_STALL_INFO=y Paul E. McKenney
  0 siblings, 1 reply; 11+ messages in thread
From: Paul E. McKenney @ 2015-01-07 17:52 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, tglx,
	peterz, rostedt, dhowells, edumazet, dvhart, fweisbec, oleg,
	bobby.prani

Hello!

This series contains stall-warning changes, mostly inspired by the
long-running "frequent lockups in 3.18rc4" LKML thread, which may
be found at https://lkml.org/lkml/2014/11/14/656.  Those who denigrate
testing clearly have not been up against trinity.  ;-)

1.	Set default to RCU_CPU_STALL_INFO=y.  This is in preparation for
	eliminating this Kconfig variable entirely in a release or two.

2.	Add rcutorture scripting checks for stall ending before dump start.

3.	Make RCU_CPU_STALL_INFO include number of attempts to force
	quiescent states (which mostly identifies idle CPUs).

4.	Improve diagnostics for spurious RCU CPU stall warnings, for
	example, identifying when RCU's grace-period kthreads have
	been prevented from running.

5.	Handle gpnum/completed wrap while dyntick idle.  This is a
	long-standing known bug, but one that requires a CPU to remain
	idle for 4,294,967,296 grace periods on 32-bit systems, and
	lots more on 64-bit systems.  However, with the advent of
	NO_HZ_FULL, this sort of thing becomes more likely, so time
	to fix it.

6.	Run grace-period kthreads at real-time priority.  Not yet ready
	for mainline, but the 0day test robot reports that it greatly
	reduces the number of hangs seen in testing, so something sort
	of like it is needed.  This patch also reduced the number of
	stalls in the 3.18-rc4 efforts.

7.	Make cond_resched_rcu_qs() apply to normal RCU flavors, allowing
	it to be placed in long-running loops that would otherwise result
	in RCU CPU stall warnings.

8.	Add GP-kthread-starvation checks to CPU stall warnings, making
	previous encoded warnings more human-readable.

9.	Fix RCU CPU stall detection in tiny implementation, courtesy of
	Miroslav Benes.

10.	Initialize tiny RCU stall-warning timeouts at boot.

							Thanx, Paul

------------------------------------------------------------------------

 b/Documentation/RCU/stallwarn.txt                         |   14 +
 b/Documentation/RCU/trace.txt                             |   32 +--
 b/include/linux/rcupdate.h                                |    3 
 b/include/linux/rcutiny.h                                 |    5 
 b/include/linux/rcutree.h                                 |    2 
 b/init/Kconfig                                            |    1 
 b/kernel/rcu/tiny.c                                       |    2 
 b/kernel/rcu/tiny_plugin.h                                |    7 
 b/kernel/rcu/tree.c                                       |  119 ++++++++++++--
 b/kernel/rcu/tree.h                                       |    7 
 b/kernel/rcu/tree_plugin.h                                |   10 -
 b/kernel/rcu/tree_trace.c                                 |    8 
 b/lib/Kconfig.debug                                       |    2 
 b/tools/testing/selftests/rcutorture/bin/parse-console.sh |    2 
 14 files changed, 164 insertions(+), 50 deletions(-)


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH tip/core/rcu 01/10] rcu: Set default to RCU_CPU_STALL_INFO=y
  2015-01-07 17:52 [PATCH tip/core/rcu 0/9] Stall-warning changes for 3.20 Paul E. McKenney
@ 2015-01-07 17:52 ` Paul E. McKenney
  2015-01-07 17:52   ` [PATCH tip/core/rcu 02/10] rcutorture: Add checks for stall ending before dump start Paul E. McKenney
                     ` (8 more replies)
  0 siblings, 9 replies; 11+ messages in thread
From: Paul E. McKenney @ 2015-01-07 17:52 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, tglx,
	peterz, rostedt, dhowells, edumazet, dvhart, fweisbec, oleg,
	bobby.prani, Paul E. McKenney

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>

The RCU_CPU_STALL_INFO code has been in for quite some time, and has
proven reliable.  This commit therefore enables it by default.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 lib/Kconfig.debug | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 5f2ce616c046..b47d9e474cbc 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1257,7 +1257,7 @@ config RCU_CPU_STALL_TIMEOUT
 config RCU_CPU_STALL_INFO
 	bool "Print additional diagnostics on RCU CPU stall"
 	depends on (TREE_RCU || PREEMPT_RCU) && DEBUG_KERNEL
-	default n
+	default y
 	help
 	  For each stalled CPU that is aware of the current RCU grace
 	  period, print out additional per-CPU diagnostic information
-- 
1.8.1.5


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH tip/core/rcu 02/10] rcutorture: Add checks for stall ending before dump start
  2015-01-07 17:52 ` [PATCH tip/core/rcu 01/10] rcu: Set default to RCU_CPU_STALL_INFO=y Paul E. McKenney
@ 2015-01-07 17:52   ` Paul E. McKenney
  2015-01-07 17:52   ` [PATCH tip/core/rcu 03/10] rcu: Make RCU_CPU_STALL_INFO include number of fqs attempts Paul E. McKenney
                     ` (7 subsequent siblings)
  8 siblings, 0 replies; 11+ messages in thread
From: Paul E. McKenney @ 2015-01-07 17:52 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, tglx,
	peterz, rostedt, dhowells, edumazet, dvhart, fweisbec, oleg,
	bobby.prani, Paul E. McKenney

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>

The current rcutorture scripting checks for actual stalls (via the "Call
Trace:" check), but fails to spot the case where a stall ends just as it
is being detected.  This commit therefore adds a check for this case.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 tools/testing/selftests/rcutorture/bin/parse-console.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/testing/selftests/rcutorture/bin/parse-console.sh b/tools/testing/selftests/rcutorture/bin/parse-console.sh
index f962ba4cf68b..d8f35cf116be 100755
--- a/tools/testing/selftests/rcutorture/bin/parse-console.sh
+++ b/tools/testing/selftests/rcutorture/bin/parse-console.sh
@@ -36,7 +36,7 @@ if grep -Pq '\x00' < $file
 then
 	print_warning Console output contains nul bytes, old qemu still running?
 fi
-egrep 'Badness|WARNING:|Warn|BUG|===========|Call Trace:|Oops:' < $file | grep -v 'ODEBUG: ' | grep -v 'Warning: unable to open an initial console' > $T
+egrep 'Badness|WARNING:|Warn|BUG|===========|Call Trace:|Oops:|Stall ended before state dump start' < $file | grep -v 'ODEBUG: ' | grep -v 'Warning: unable to open an initial console' > $T
 if test -s $T
 then
 	print_warning Assertion failure in $file $title
-- 
1.8.1.5


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH tip/core/rcu 03/10] rcu: Make RCU_CPU_STALL_INFO include number of fqs attempts
  2015-01-07 17:52 ` [PATCH tip/core/rcu 01/10] rcu: Set default to RCU_CPU_STALL_INFO=y Paul E. McKenney
  2015-01-07 17:52   ` [PATCH tip/core/rcu 02/10] rcutorture: Add checks for stall ending before dump start Paul E. McKenney
@ 2015-01-07 17:52   ` Paul E. McKenney
  2015-01-07 17:52   ` [PATCH tip/core/rcu 04/10] rcu: Improve diagnostics for spurious RCU CPU stall warnings Paul E. McKenney
                     ` (6 subsequent siblings)
  8 siblings, 0 replies; 11+ messages in thread
From: Paul E. McKenney @ 2015-01-07 17:52 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, tglx,
	peterz, rostedt, dhowells, edumazet, dvhart, fweisbec, oleg,
	bobby.prani, Paul E. McKenney

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>

One way that an RCU CPU stall warning can happen is if the grace-period
kthread is not allowed to execute.  One proxy for this kthread's
forward progress is the number of force-quiescent-state (fqs) scans.
This commit therefore adds the number of fqs scans to the RCU CPU stall
warning printouts when CONFIG_RCU_CPU_STALL_INFO=y.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcu/tree.c        | 1 +
 kernel/rcu/tree.h        | 2 ++
 kernel/rcu/tree_plugin.h | 3 ++-
 3 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 4c106fcc0d54..654b15be1e36 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -1043,6 +1043,7 @@ static void record_gp_stall_check_time(struct rcu_state *rsp)
 	j1 = rcu_jiffies_till_stall_check();
 	ACCESS_ONCE(rsp->jiffies_stall) = j + j1;
 	rsp->jiffies_resched = j + j1 / 2;
+	rsp->n_force_qs_gpstart = ACCESS_ONCE(rsp->n_force_qs);
 }
 
 /*
diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
index 8e7b1843896e..e300848cc0cf 100644
--- a/kernel/rcu/tree.h
+++ b/kernel/rcu/tree.h
@@ -492,6 +492,8 @@ struct rcu_state {
 						/*  for CPU stalls. */
 	unsigned long jiffies_resched;		/* Time at which to resched */
 						/*  a reluctant CPU. */
+	unsigned long n_force_qs_gpstart;	/* Snapshot of n_force_qs at */
+						/*  GP start. */
 	unsigned long gp_max;			/* Maximum GP duration in */
 						/*  jiffies. */
 	const char *name;			/* Name of structure. */
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 3ec85cb5d544..769384d77437 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -1898,11 +1898,12 @@ static void print_cpu_stall_info(struct rcu_state *rsp, int cpu)
 		ticks_value = rsp->gpnum - rdp->gpnum;
 	}
 	print_cpu_stall_fast_no_hz(fast_no_hz, cpu);
-	pr_err("\t%d: (%lu %s) idle=%03x/%llx/%d softirq=%u/%u %s\n",
+	pr_err("\t%d: (%lu %s) idle=%03x/%llx/%d softirq=%u/%u fqs=%ld %s\n",
 	       cpu, ticks_value, ticks_title,
 	       atomic_read(&rdtp->dynticks) & 0xfff,
 	       rdtp->dynticks_nesting, rdtp->dynticks_nmi_nesting,
 	       rdp->softirq_snap, kstat_softirqs_cpu(RCU_SOFTIRQ, cpu),
+	       ACCESS_ONCE(rsp->n_force_qs) - rsp->n_force_qs_gpstart,
 	       fast_no_hz);
 }
 
-- 
1.8.1.5


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH tip/core/rcu 04/10] rcu: Improve diagnostics for spurious RCU CPU stall warnings
  2015-01-07 17:52 ` [PATCH tip/core/rcu 01/10] rcu: Set default to RCU_CPU_STALL_INFO=y Paul E. McKenney
  2015-01-07 17:52   ` [PATCH tip/core/rcu 02/10] rcutorture: Add checks for stall ending before dump start Paul E. McKenney
  2015-01-07 17:52   ` [PATCH tip/core/rcu 03/10] rcu: Make RCU_CPU_STALL_INFO include number of fqs attempts Paul E. McKenney
@ 2015-01-07 17:52   ` Paul E. McKenney
  2015-01-07 17:52   ` [PATCH tip/core/rcu 05/10] rcu: Handle gpnum/completed wrap while dyntick idle Paul E. McKenney
                     ` (5 subsequent siblings)
  8 siblings, 0 replies; 11+ messages in thread
From: Paul E. McKenney @ 2015-01-07 17:52 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, tglx,
	peterz, rostedt, dhowells, edumazet, dvhart, fweisbec, oleg,
	bobby.prani, Paul E. McKenney

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>

The current RCU CPU stall warning code will print "Stall ended before
state dump start" any time that the stall-warning code is triggered on
a CPU that has already reported a quiescent state for the current grace
period and if all quiescent states have been reported for the current
grace period.  However, a true stall can result in these symptoms, for
example, by preventing RCU's grace-period kthreads from ever running

This commit therefore checks for this condition, reporting the end of
the stall only if one of the grace-period counters has actually advanced.
Otherwise, it reports the last time that the grace-period kthread made
meaningful progress.  (In normal situations, the grace-period kthread
should make meaningful progress at least every jiffies_till_next_fqs
jiffies.)

Reported-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Miroslav Benes <mbenes@suse.cz>
---
 Documentation/RCU/stallwarn.txt |  5 +++++
 kernel/rcu/tree.c               | 32 +++++++++++++++++++++++++++-----
 kernel/rcu/tree.h               |  2 ++
 3 files changed, 34 insertions(+), 5 deletions(-)

diff --git a/Documentation/RCU/stallwarn.txt b/Documentation/RCU/stallwarn.txt
index ed186a902d31..55f9707fe60a 100644
--- a/Documentation/RCU/stallwarn.txt
+++ b/Documentation/RCU/stallwarn.txt
@@ -187,6 +187,11 @@ o	For !CONFIG_PREEMPT kernels, a CPU looping anywhere in the
 	behavior, you might need to replace some of the cond_resched()
 	calls with calls to cond_resched_rcu_qs().
 
+o	Anything that prevents RCU's grace-period kthreads from running.
+	This can result in the "All QSes seen" console-log message.
+	This message will include information on when the kthread last
+	ran and how often it should be expected to run.
+
 o	A CPU-bound real-time task in a CONFIG_PREEMPT kernel, which might
 	happen to preempt a low-priority task in the middle of an RCU
 	read-side critical section.   This is especially damaging if
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 654b15be1e36..a2ceb66bcd67 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -1066,11 +1066,13 @@ static void rcu_dump_cpu_stacks(struct rcu_state *rsp)
 	}
 }
 
-static void print_other_cpu_stall(struct rcu_state *rsp)
+static void print_other_cpu_stall(struct rcu_state *rsp, unsigned long gpnum)
 {
 	int cpu;
 	long delta;
 	unsigned long flags;
+	unsigned long gpa;
+	unsigned long j;
 	int ndetected = 0;
 	struct rcu_node *rnp = rcu_get_root(rsp);
 	long totqlen = 0;
@@ -1123,10 +1125,22 @@ static void print_other_cpu_stall(struct rcu_state *rsp)
 	pr_cont("(detected by %d, t=%ld jiffies, g=%ld, c=%ld, q=%lu)\n",
 	       smp_processor_id(), (long)(jiffies - rsp->gp_start),
 	       (long)rsp->gpnum, (long)rsp->completed, totqlen);
-	if (ndetected == 0)
-		pr_err("INFO: Stall ended before state dump start\n");
-	else
+	if (ndetected) {
 		rcu_dump_cpu_stacks(rsp);
+	} else {
+		if (ACCESS_ONCE(rsp->gpnum) != gpnum ||
+		    ACCESS_ONCE(rsp->completed) == gpnum) {
+			pr_err("INFO: Stall ended before state dump start\n");
+		} else {
+			j = jiffies;
+			gpa = ACCESS_ONCE(rsp->gp_activity);
+			pr_err("All QSes seen, last %s kthread activity %ld (%ld-%ld), jiffies_till_next_fqs=%ld\n",
+			       rsp->name, j - gpa, j, gpa,
+			       jiffies_till_next_fqs);
+			/* In this case, the current CPU might be at fault. */
+			sched_show_task(current);
+		}
+	}
 
 	/* Complain about tasks blocking the grace period. */
 
@@ -1226,7 +1240,7 @@ static void check_cpu_stall(struct rcu_state *rsp, struct rcu_data *rdp)
 		   ULONG_CMP_GE(j, js + RCU_STALL_RAT_DELAY)) {
 
 		/* They had a few time units to dump stack, so complain. */
-		print_other_cpu_stall(rsp);
+		print_other_cpu_stall(rsp, gpnum);
 	}
 }
 
@@ -1622,6 +1636,7 @@ static int rcu_gp_init(struct rcu_state *rsp)
 	struct rcu_data *rdp;
 	struct rcu_node *rnp = rcu_get_root(rsp);
 
+	ACCESS_ONCE(rsp->gp_activity) = jiffies;
 	rcu_bind_gp_kthread();
 	raw_spin_lock_irq(&rnp->lock);
 	smp_mb__after_unlock_lock();
@@ -1682,6 +1697,7 @@ static int rcu_gp_init(struct rcu_state *rsp)
 					    rnp->grphi, rnp->qsmask);
 		raw_spin_unlock_irq(&rnp->lock);
 		cond_resched_rcu_qs();
+		ACCESS_ONCE(rsp->gp_activity) = jiffies;
 	}
 
 	mutex_unlock(&rsp->onoff_mutex);
@@ -1698,6 +1714,7 @@ static int rcu_gp_fqs(struct rcu_state *rsp, int fqs_state_in)
 	unsigned long maxj;
 	struct rcu_node *rnp = rcu_get_root(rsp);
 
+	ACCESS_ONCE(rsp->gp_activity) = jiffies;
 	rsp->n_force_qs++;
 	if (fqs_state == RCU_SAVE_DYNTICK) {
 		/* Collect dyntick-idle snapshots. */
@@ -1736,6 +1753,7 @@ static void rcu_gp_cleanup(struct rcu_state *rsp)
 	struct rcu_data *rdp;
 	struct rcu_node *rnp = rcu_get_root(rsp);
 
+	ACCESS_ONCE(rsp->gp_activity) = jiffies;
 	raw_spin_lock_irq(&rnp->lock);
 	smp_mb__after_unlock_lock();
 	gp_duration = jiffies - rsp->gp_start;
@@ -1772,6 +1790,7 @@ static void rcu_gp_cleanup(struct rcu_state *rsp)
 		nocb += rcu_future_gp_cleanup(rsp, rnp);
 		raw_spin_unlock_irq(&rnp->lock);
 		cond_resched_rcu_qs();
+		ACCESS_ONCE(rsp->gp_activity) = jiffies;
 	}
 	rnp = rcu_get_root(rsp);
 	raw_spin_lock_irq(&rnp->lock);
@@ -1821,6 +1840,7 @@ static int __noreturn rcu_gp_kthread(void *arg)
 			if (rcu_gp_init(rsp))
 				break;
 			cond_resched_rcu_qs();
+			ACCESS_ONCE(rsp->gp_activity) = jiffies;
 			WARN_ON(signal_pending(current));
 			trace_rcu_grace_period(rsp->name,
 					       ACCESS_ONCE(rsp->gpnum),
@@ -1864,9 +1884,11 @@ static int __noreturn rcu_gp_kthread(void *arg)
 						       ACCESS_ONCE(rsp->gpnum),
 						       TPS("fqsend"));
 				cond_resched_rcu_qs();
+				ACCESS_ONCE(rsp->gp_activity) = jiffies;
 			} else {
 				/* Deal with stray signal. */
 				cond_resched_rcu_qs();
+				ACCESS_ONCE(rsp->gp_activity) = jiffies;
 				WARN_ON(signal_pending(current));
 				trace_rcu_grace_period(rsp->name,
 						       ACCESS_ONCE(rsp->gpnum),
diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
index e300848cc0cf..5ec81cf938fd 100644
--- a/kernel/rcu/tree.h
+++ b/kernel/rcu/tree.h
@@ -488,6 +488,8 @@ struct rcu_state {
 						/*  due to no GP active. */
 	unsigned long gp_start;			/* Time at which GP started, */
 						/*  but in jiffies. */
+	unsigned long gp_activity;		/* Time of last GP kthread */
+						/*  activity in jiffies. */
 	unsigned long jiffies_stall;		/* Time at which to check */
 						/*  for CPU stalls. */
 	unsigned long jiffies_resched;		/* Time at which to resched */
-- 
1.8.1.5


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH tip/core/rcu 05/10] rcu: Handle gpnum/completed wrap while dyntick idle
  2015-01-07 17:52 ` [PATCH tip/core/rcu 01/10] rcu: Set default to RCU_CPU_STALL_INFO=y Paul E. McKenney
                     ` (2 preceding siblings ...)
  2015-01-07 17:52   ` [PATCH tip/core/rcu 04/10] rcu: Improve diagnostics for spurious RCU CPU stall warnings Paul E. McKenney
@ 2015-01-07 17:52   ` Paul E. McKenney
  2015-01-07 17:52   ` [PATCH tip/core/rcu 06/10] rcu: Run grace-period kthreads at real-time priority Paul E. McKenney
                     ` (4 subsequent siblings)
  8 siblings, 0 replies; 11+ messages in thread
From: Paul E. McKenney @ 2015-01-07 17:52 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, tglx,
	peterz, rostedt, dhowells, edumazet, dvhart, fweisbec, oleg,
	bobby.prani, Paul E. McKenney

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>

Subtle race conditions can result if a CPU stays in dyntick-idle mode
long enough for the ->gpnum and ->completed fields to wrap.  For
example, consider the following sequence of events:

o	CPU 1 encounters a quiescent state while waiting for grace period
	5 to complete, but then enters dyntick-idle mode.

o	While CPU 1 is in dyntick-idle mode, the grace-period counters
	wrap around so that the grace period number is now 4.

o	Just as CPU 1 exits dyntick-idle mode, grace period 4 completes
	and grace period 5 begins.

o	The quiescent state that CPU 1 passed through during the old
	grace period 5 looks like it applies to the new grace period
	5.  Therefore, the new grace period 5 completes without CPU 1
	having passed through a quiescent state.

This could clearly be a fatal surprise to any long-running RCU read-side
critical section that happened to be running on CPU 1 at the time.  At one
time, this was not a problem, given that it takes significant time for
the grace-period counters to overflow even on 32-bit systems.  However,
with the advent of NO_HZ_FULL and SMP embedded systems, arbitrarily long
idle periods are now becoming quite feasible.  It is therefore time to
close this race.

This commit therefore avoids this race condition by having the
quiescent-state forcing code detect when a CPU is falling too far
behind, and setting a new rcu_data field ->gpwrap when this happens.
Whenever this new ->gpwrap field is set, the CPU's ->gpnum and ->completed
fields are known to be untrustworthy, and can be ignored, along with
any associated quiescent states.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcu/tree.c        | 17 ++++++++++++-----
 kernel/rcu/tree.h        |  1 +
 kernel/rcu/tree_plugin.h |  3 ++-
 3 files changed, 15 insertions(+), 6 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index a2ceb66bcd67..5987fdc85fc4 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -930,6 +930,9 @@ static int dyntick_save_progress_counter(struct rcu_data *rdp,
 		trace_rcu_fqs(rdp->rsp->name, rdp->gpnum, rdp->cpu, TPS("dti"));
 		return 1;
 	} else {
+		if (ULONG_CMP_LT(ACCESS_ONCE(rdp->gpnum) + ULONG_MAX / 4,
+				 rdp->mynode->gpnum))
+			ACCESS_ONCE(rdp->gpwrap) = true;
 		return 0;
 	}
 }
@@ -1577,7 +1580,8 @@ static bool __note_gp_changes(struct rcu_state *rsp, struct rcu_node *rnp,
 	bool ret;
 
 	/* Handle the ends of any preceding grace periods first. */
-	if (rdp->completed == rnp->completed) {
+	if (rdp->completed == rnp->completed &&
+	    !unlikely(ACCESS_ONCE(rdp->gpwrap))) {
 
 		/* No grace period end, so just accelerate recent callbacks. */
 		ret = rcu_accelerate_cbs(rsp, rnp, rdp);
@@ -1592,7 +1596,7 @@ static bool __note_gp_changes(struct rcu_state *rsp, struct rcu_node *rnp,
 		trace_rcu_grace_period(rsp->name, rdp->gpnum, TPS("cpuend"));
 	}
 
-	if (rdp->gpnum != rnp->gpnum) {
+	if (rdp->gpnum != rnp->gpnum || unlikely(ACCESS_ONCE(rdp->gpwrap))) {
 		/*
 		 * If the current grace period is waiting for this CPU,
 		 * set up to detect a quiescent state, otherwise don't
@@ -1603,6 +1607,7 @@ static bool __note_gp_changes(struct rcu_state *rsp, struct rcu_node *rnp,
 		rdp->passed_quiesce = 0;
 		rdp->qs_pending = !!(rnp->qsmask & rdp->grpmask);
 		zero_cpu_stall_ticks(rdp);
+		ACCESS_ONCE(rdp->gpwrap) = false;
 	}
 	return ret;
 }
@@ -1616,7 +1621,8 @@ static void note_gp_changes(struct rcu_state *rsp, struct rcu_data *rdp)
 	local_irq_save(flags);
 	rnp = rdp->mynode;
 	if ((rdp->gpnum == ACCESS_ONCE(rnp->gpnum) &&
-	     rdp->completed == ACCESS_ONCE(rnp->completed)) || /* w/out lock. */
+	     rdp->completed == ACCESS_ONCE(rnp->completed) &&
+	     !unlikely(ACCESS_ONCE(rdp->gpwrap))) || /* w/out lock. */
 	    !raw_spin_trylock(&rnp->lock)) { /* irqs already off, so later. */
 		local_irq_restore(flags);
 		return;
@@ -2066,7 +2072,7 @@ rcu_report_qs_rdp(int cpu, struct rcu_state *rsp, struct rcu_data *rdp)
 	raw_spin_lock_irqsave(&rnp->lock, flags);
 	smp_mb__after_unlock_lock();
 	if (rdp->passed_quiesce == 0 || rdp->gpnum != rnp->gpnum ||
-	    rnp->completed == rnp->gpnum) {
+	    rnp->completed == rnp->gpnum || rdp->gpwrap) {
 
 		/*
 		 * The grace period in which this quiescent state was
@@ -3190,7 +3196,8 @@ static int __rcu_pending(struct rcu_state *rsp, struct rcu_data *rdp)
 	}
 
 	/* Has a new RCU grace period started? */
-	if (ACCESS_ONCE(rnp->gpnum) != rdp->gpnum) { /* outside lock */
+	if (ACCESS_ONCE(rnp->gpnum) != rdp->gpnum ||
+	    unlikely(ACCESS_ONCE(rdp->gpwrap))) { /* outside lock */
 		rdp->n_rp_gp_started++;
 		return 1;
 	}
diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
index 5ec81cf938fd..7472ff388d55 100644
--- a/kernel/rcu/tree.h
+++ b/kernel/rcu/tree.h
@@ -260,6 +260,7 @@ struct rcu_data {
 	bool		passed_quiesce;	/* User-mode/idle loop etc. */
 	bool		qs_pending;	/* Core waits for quiesc state. */
 	bool		beenonline;	/* CPU online at least once. */
+	bool		gpwrap;		/* Possible gpnum/completed wrap. */
 	struct rcu_node *mynode;	/* This CPU's leaf of hierarchy */
 	unsigned long grpmask;		/* Mask to apply to leaf qsmask. */
 #ifdef CONFIG_RCU_CPU_STALL_INFO
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 769384d77437..81ff8b9a5a39 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -1605,7 +1605,8 @@ static bool __maybe_unused rcu_try_advance_all_cbs(void)
 		 * completed since we last checked and there are
 		 * callbacks not yet ready to invoke.
 		 */
-		if (rdp->completed != rnp->completed &&
+		if ((rdp->completed != rnp->completed ||
+		     unlikely(ACCESS_ONCE(rdp->gpwrap))) &&
 		    rdp->nxttail[RCU_DONE_TAIL] != rdp->nxttail[RCU_NEXT_TAIL])
 			note_gp_changes(rsp, rdp);
 
-- 
1.8.1.5


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH tip/core/rcu 06/10] rcu: Run grace-period kthreads at real-time priority
  2015-01-07 17:52 ` [PATCH tip/core/rcu 01/10] rcu: Set default to RCU_CPU_STALL_INFO=y Paul E. McKenney
                     ` (3 preceding siblings ...)
  2015-01-07 17:52   ` [PATCH tip/core/rcu 05/10] rcu: Handle gpnum/completed wrap while dyntick idle Paul E. McKenney
@ 2015-01-07 17:52   ` Paul E. McKenney
  2015-01-07 17:52   ` [PATCH tip/core/rcu 07/10] rcu: Make cond_resched_rcu_qs() apply to normal RCU flavors Paul E. McKenney
                     ` (3 subsequent siblings)
  8 siblings, 0 replies; 11+ messages in thread
From: Paul E. McKenney @ 2015-01-07 17:52 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, tglx,
	peterz, rostedt, dhowells, edumazet, dvhart, fweisbec, oleg,
	bobby.prani, Paul E. McKenney

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>

This is a experimental commit that attempts to better handle high-load
situations.

Not-yet-signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 init/Kconfig             |  1 -
 kernel/rcu/tree.c        | 10 +++++++++-
 kernel/rcu/tree_plugin.h |  4 ----
 3 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/init/Kconfig b/init/Kconfig
index 9afb971497f4..8787131d0f5c 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -669,7 +669,6 @@ config RCU_BOOST
 config RCU_KTHREAD_PRIO
 	int "Real-time priority to use for RCU worker threads"
 	range 1 99
-	depends on RCU_BOOST
 	default 1
 	help
 	  This option specifies the SCHED_FIFO priority value that will be
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 5987fdc85fc4..7a8d7e0d23f6 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -156,6 +156,10 @@ static void rcu_boost_kthread_setaffinity(struct rcu_node *rnp, int outgoingcpu)
 static void invoke_rcu_core(void);
 static void invoke_rcu_callbacks(struct rcu_state *rsp, struct rcu_data *rdp);
 
+/* rcuc/rcub kthread realtime priority */
+static int kthread_prio = CONFIG_RCU_KTHREAD_PRIO;
+module_param(kthread_prio, int, 0644);
+
 /*
  * Track the rcutorture test sequence number and the update version
  * number within a given test.  The rcutorture_testseq is incremented
@@ -3599,15 +3603,19 @@ static int __init rcu_spawn_gp_kthread(void)
 	unsigned long flags;
 	struct rcu_node *rnp;
 	struct rcu_state *rsp;
+	struct sched_param sp;
 	struct task_struct *t;
 
 	rcu_scheduler_fully_active = 1;
 	for_each_rcu_flavor(rsp) {
-		t = kthread_run(rcu_gp_kthread, rsp, "%s", rsp->name);
+		t = kthread_create(rcu_gp_kthread, rsp, "%s", rsp->name);
 		BUG_ON(IS_ERR(t));
 		rnp = rcu_get_root(rsp);
 		raw_spin_lock_irqsave(&rnp->lock, flags);
 		rsp->gp_kthread = t;
+		sp.sched_priority = kthread_prio;
+		sched_setscheduler_nocheck(t, SCHED_FIFO, &sp);
+		wake_up_process(t);
 		raw_spin_unlock_irqrestore(&rnp->lock, flags);
 	}
 	rcu_spawn_nocb_kthreads();
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 81ff8b9a5a39..dcb32638f88b 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -34,10 +34,6 @@
 
 #include "../locking/rtmutex_common.h"
 
-/* rcuc/rcub kthread realtime priority */
-static int kthread_prio = CONFIG_RCU_KTHREAD_PRIO;
-module_param(kthread_prio, int, 0644);
-
 /*
  * Control variables for per-CPU and per-rcu_node kthreads.  These
  * handle all flavors of RCU.
-- 
1.8.1.5


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH tip/core/rcu 07/10] rcu: Make cond_resched_rcu_qs() apply to normal RCU flavors
  2015-01-07 17:52 ` [PATCH tip/core/rcu 01/10] rcu: Set default to RCU_CPU_STALL_INFO=y Paul E. McKenney
                     ` (4 preceding siblings ...)
  2015-01-07 17:52   ` [PATCH tip/core/rcu 06/10] rcu: Run grace-period kthreads at real-time priority Paul E. McKenney
@ 2015-01-07 17:52   ` Paul E. McKenney
  2015-01-07 17:52   ` [PATCH tip/core/rcu 08/10] rcu: Add GP-kthread-starvation checks to CPU stall warnings Paul E. McKenney
                     ` (2 subsequent siblings)
  8 siblings, 0 replies; 11+ messages in thread
From: Paul E. McKenney @ 2015-01-07 17:52 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, tglx,
	peterz, rostedt, dhowells, edumazet, dvhart, fweisbec, oleg,
	bobby.prani, Paul E. McKenney

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>

Although cond_resched_rcu_qs() only applies to TASKS_RCU, it is used
in places where it would be useful for it to apply to the normal RCU
flavors, rcu_preempt, rcu_sched, and rcu_bh.  This is especially the
case for workloads that aggressively overload the system, particularly
those that generate large numbers of RCU updates on systems running
NO_HZ_FULL CPUs.  This commit therefore communicates quiescent states
from cond_resched_rcu_qs() to the normal RCU flavors.

Note that it is unfortunately necessary to leave the old ->passed_quiesce
mechanism in place to allow quiescent states that apply to only one
flavor to be recorded.  (Yes, we could decrement ->rcu_qs_ctr_snap in
that case, but that is not so good for debugging of RCU internals.)
In addition, if one of the RCU flavor's grace period has stalled, this
will invoke rcu_momentary_dyntick_idle(), resulting in a heavy-weight
quiescent state visible from other CPUs.

Reported-by: Sasha Levin <sasha.levin@oracle.com>
Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
[ paulmck: Merge commit from Sasha Levin fixing a bug where __this_cpu()
  was used in preemptible code. ]
---
 Documentation/RCU/trace.txt | 32 ++++++++++++++++----------------
 include/linux/rcupdate.h    |  3 ++-
 include/linux/rcutiny.h     |  5 ++++-
 include/linux/rcutree.h     |  2 ++
 kernel/rcu/tree.c           | 38 +++++++++++++++++++++++++++++++++-----
 kernel/rcu/tree.h           |  2 ++
 kernel/rcu/tree_trace.c     |  8 ++++++--
 7 files changed, 65 insertions(+), 25 deletions(-)

diff --git a/Documentation/RCU/trace.txt b/Documentation/RCU/trace.txt
index b63b9bb3bc0c..08651da15448 100644
--- a/Documentation/RCU/trace.txt
+++ b/Documentation/RCU/trace.txt
@@ -56,14 +56,14 @@ rcuboost:
 
 The output of "cat rcu/rcu_preempt/rcudata" looks as follows:
 
-  0!c=30455 g=30456 pq=1 qp=1 dt=126535/140000000000000/0 df=2002 of=4 ql=0/0 qs=N... b=10 ci=74572 nci=0 co=1131 ca=716
-  1!c=30719 g=30720 pq=1 qp=0 dt=132007/140000000000000/0 df=1874 of=10 ql=0/0 qs=N... b=10 ci=123209 nci=0 co=685 ca=982
-  2!c=30150 g=30151 pq=1 qp=1 dt=138537/140000000000000/0 df=1707 of=8 ql=0/0 qs=N... b=10 ci=80132 nci=0 co=1328 ca=1458
-  3 c=31249 g=31250 pq=1 qp=0 dt=107255/140000000000000/0 df=1749 of=6 ql=0/450 qs=NRW. b=10 ci=151700 nci=0 co=509 ca=622
-  4!c=29502 g=29503 pq=1 qp=1 dt=83647/140000000000000/0 df=965 of=5 ql=0/0 qs=N... b=10 ci=65643 nci=0 co=1373 ca=1521
-  5 c=31201 g=31202 pq=1 qp=1 dt=70422/0/0 df=535 of=7 ql=0/0 qs=.... b=10 ci=58500 nci=0 co=764 ca=698
-  6!c=30253 g=30254 pq=1 qp=1 dt=95363/140000000000000/0 df=780 of=5 ql=0/0 qs=N... b=10 ci=100607 nci=0 co=1414 ca=1353
-  7 c=31178 g=31178 pq=1 qp=0 dt=91536/0/0 df=547 of=4 ql=0/0 qs=.... b=10 ci=109819 nci=0 co=1115 ca=969
+  0!c=30455 g=30456 pq=1/0 qp=1 dt=126535/140000000000000/0 df=2002 of=4 ql=0/0 qs=N... b=10 ci=74572 nci=0 co=1131 ca=716
+  1!c=30719 g=30720 pq=1/0 qp=0 dt=132007/140000000000000/0 df=1874 of=10 ql=0/0 qs=N... b=10 ci=123209 nci=0 co=685 ca=982
+  2!c=30150 g=30151 pq=1/1 qp=1 dt=138537/140000000000000/0 df=1707 of=8 ql=0/0 qs=N... b=10 ci=80132 nci=0 co=1328 ca=1458
+  3 c=31249 g=31250 pq=1/1 qp=0 dt=107255/140000000000000/0 df=1749 of=6 ql=0/450 qs=NRW. b=10 ci=151700 nci=0 co=509 ca=622
+  4!c=29502 g=29503 pq=1/0 qp=1 dt=83647/140000000000000/0 df=965 of=5 ql=0/0 qs=N... b=10 ci=65643 nci=0 co=1373 ca=1521
+  5 c=31201 g=31202 pq=1/0 qp=1 dt=70422/0/0 df=535 of=7 ql=0/0 qs=.... b=10 ci=58500 nci=0 co=764 ca=698
+  6!c=30253 g=30254 pq=1/0 qp=1 dt=95363/140000000000000/0 df=780 of=5 ql=0/0 qs=N... b=10 ci=100607 nci=0 co=1414 ca=1353
+  7 c=31178 g=31178 pq=1/0 qp=0 dt=91536/0/0 df=547 of=4 ql=0/0 qs=.... b=10 ci=109819 nci=0 co=1115 ca=969
 
 This file has one line per CPU, or eight for this 8-CPU system.
 The fields are as follows:
@@ -188,14 +188,14 @@ o	"ca" is the number of RCU callbacks that have been adopted by this
 Kernels compiled with CONFIG_RCU_BOOST=y display the following from
 /debug/rcu/rcu_preempt/rcudata:
 
-  0!c=12865 g=12866 pq=1 qp=1 dt=83113/140000000000000/0 df=288 of=11 ql=0/0 qs=N... kt=0/O ktl=944 b=10 ci=60709 nci=0 co=748 ca=871
-  1 c=14407 g=14408 pq=1 qp=0 dt=100679/140000000000000/0 df=378 of=7 ql=0/119 qs=NRW. kt=0/W ktl=9b6 b=10 ci=109740 nci=0 co=589 ca=485
-  2 c=14407 g=14408 pq=1 qp=0 dt=105486/0/0 df=90 of=9 ql=0/89 qs=NRW. kt=0/W ktl=c0c b=10 ci=83113 nci=0 co=533 ca=490
-  3 c=14407 g=14408 pq=1 qp=0 dt=107138/0/0 df=142 of=8 ql=0/188 qs=NRW. kt=0/W ktl=b96 b=10 ci=121114 nci=0 co=426 ca=290
-  4 c=14405 g=14406 pq=1 qp=1 dt=50238/0/0 df=706 of=7 ql=0/0 qs=.... kt=0/W ktl=812 b=10 ci=34929 nci=0 co=643 ca=114
-  5!c=14168 g=14169 pq=1 qp=0 dt=45465/140000000000000/0 df=161 of=11 ql=0/0 qs=N... kt=0/O ktl=b4d b=10 ci=47712 nci=0 co=677 ca=722
-  6 c=14404 g=14405 pq=1 qp=0 dt=59454/0/0 df=94 of=6 ql=0/0 qs=.... kt=0/W ktl=e57 b=10 ci=55597 nci=0 co=701 ca=811
-  7 c=14407 g=14408 pq=1 qp=1 dt=68850/0/0 df=31 of=8 ql=0/0 qs=.... kt=0/W ktl=14bd b=10 ci=77475 nci=0 co=508 ca=1042
+  0!c=12865 g=12866 pq=1/0 qp=1 dt=83113/140000000000000/0 df=288 of=11 ql=0/0 qs=N... kt=0/O ktl=944 b=10 ci=60709 nci=0 co=748 ca=871
+  1 c=14407 g=14408 pq=1/0 qp=0 dt=100679/140000000000000/0 df=378 of=7 ql=0/119 qs=NRW. kt=0/W ktl=9b6 b=10 ci=109740 nci=0 co=589 ca=485
+  2 c=14407 g=14408 pq=1/0 qp=0 dt=105486/0/0 df=90 of=9 ql=0/89 qs=NRW. kt=0/W ktl=c0c b=10 ci=83113 nci=0 co=533 ca=490
+  3 c=14407 g=14408 pq=1/0 qp=0 dt=107138/0/0 df=142 of=8 ql=0/188 qs=NRW. kt=0/W ktl=b96 b=10 ci=121114 nci=0 co=426 ca=290
+  4 c=14405 g=14406 pq=1/0 qp=1 dt=50238/0/0 df=706 of=7 ql=0/0 qs=.... kt=0/W ktl=812 b=10 ci=34929 nci=0 co=643 ca=114
+  5!c=14168 g=14169 pq=1/0 qp=0 dt=45465/140000000000000/0 df=161 of=11 ql=0/0 qs=N... kt=0/O ktl=b4d b=10 ci=47712 nci=0 co=677 ca=722
+  6 c=14404 g=14405 pq=1/0 qp=0 dt=59454/0/0 df=94 of=6 ql=0/0 qs=.... kt=0/W ktl=e57 b=10 ci=55597 nci=0 co=701 ca=811
+  7 c=14407 g=14408 pq=1/0 qp=1 dt=68850/0/0 df=31 of=8 ql=0/0 qs=.... kt=0/W ktl=14bd b=10 ci=77475 nci=0 co=508 ca=1042
 
 This is similar to the output discussed above, but contains the following
 additional fields:
diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index ed4f5939a452..468228750299 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -331,12 +331,13 @@ static inline void rcu_init_nohz(void)
 extern struct srcu_struct tasks_rcu_exit_srcu;
 #define rcu_note_voluntary_context_switch(t) \
 	do { \
+		rcu_all_qs(); \
 		if (ACCESS_ONCE((t)->rcu_tasks_holdout)) \
 			ACCESS_ONCE((t)->rcu_tasks_holdout) = false; \
 	} while (0)
 #else /* #ifdef CONFIG_TASKS_RCU */
 #define TASKS_RCU(x) do { } while (0)
-#define rcu_note_voluntary_context_switch(t)	do { } while (0)
+#define rcu_note_voluntary_context_switch(t)	rcu_all_qs()
 #endif /* #else #ifdef CONFIG_TASKS_RCU */
 
 /**
diff --git a/include/linux/rcutiny.h b/include/linux/rcutiny.h
index 0e5366200154..fabd3fad8516 100644
--- a/include/linux/rcutiny.h
+++ b/include/linux/rcutiny.h
@@ -154,7 +154,10 @@ static inline bool rcu_is_watching(void)
 	return true;
 }
 
-
 #endif /* #else defined(CONFIG_DEBUG_LOCK_ALLOC) || defined(CONFIG_RCU_TRACE) */
 
+static inline void rcu_all_qs(void)
+{
+}
+
 #endif /* __LINUX_RCUTINY_H */
diff --git a/include/linux/rcutree.h b/include/linux/rcutree.h
index 52953790dcca..ddba927f7316 100644
--- a/include/linux/rcutree.h
+++ b/include/linux/rcutree.h
@@ -97,4 +97,6 @@ extern int rcu_scheduler_active __read_mostly;
 
 bool rcu_is_watching(void);
 
+void rcu_all_qs(void);
+
 #endif /* __LINUX_RCUTREE_H */
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 7a8d7e0d23f6..6284c19b39ba 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -219,6 +219,9 @@ static DEFINE_PER_CPU(struct rcu_dynticks, rcu_dynticks) = {
 #endif /* #ifdef CONFIG_NO_HZ_FULL_SYSIDLE */
 };
 
+DEFINE_PER_CPU_SHARED_ALIGNED(unsigned long, rcu_qs_ctr);
+EXPORT_PER_CPU_SYMBOL_GPL(rcu_qs_ctr);
+
 /*
  * Let the RCU core know that this CPU has gone through the scheduler,
  * which is a quiescent state.  This is called when the need for a
@@ -288,6 +291,22 @@ void rcu_note_context_switch(void)
 }
 EXPORT_SYMBOL_GPL(rcu_note_context_switch);
 
+/*
+ * Register a quiesecent state for all RCU flavors.  If there is an
+ * emergency, invoke rcu_momentary_dyntick_idle() to do a heavy-weight
+ * dyntick-idle quiescent state visible to other CPUs (but only for those
+ * RCU flavors in desparate need of a quiescent state, which will normally
+ * be none of them).  Either way, do a lightweight quiescent state for
+ * all RCU flavors.
+ */
+void rcu_all_qs(void)
+{
+	if (unlikely(raw_cpu_read(rcu_sched_qs_mask)))
+		rcu_momentary_dyntick_idle();
+	this_cpu_inc(rcu_qs_ctr);
+}
+EXPORT_SYMBOL_GPL(rcu_all_qs);
+
 static long blimit = 10;	/* Maximum callbacks per rcu_do_batch. */
 static long qhimark = 10000;	/* If this many pending, ignore blimit. */
 static long qlowmark = 100;	/* Once only this many pending, use blimit. */
@@ -1609,6 +1628,7 @@ static bool __note_gp_changes(struct rcu_state *rsp, struct rcu_node *rnp,
 		rdp->gpnum = rnp->gpnum;
 		trace_rcu_grace_period(rsp->name, rdp->gpnum, TPS("cpustart"));
 		rdp->passed_quiesce = 0;
+		rdp->rcu_qs_ctr_snap = __this_cpu_read(rcu_qs_ctr);
 		rdp->qs_pending = !!(rnp->qsmask & rdp->grpmask);
 		zero_cpu_stall_ticks(rdp);
 		ACCESS_ONCE(rdp->gpwrap) = false;
@@ -2075,8 +2095,10 @@ rcu_report_qs_rdp(int cpu, struct rcu_state *rsp, struct rcu_data *rdp)
 	rnp = rdp->mynode;
 	raw_spin_lock_irqsave(&rnp->lock, flags);
 	smp_mb__after_unlock_lock();
-	if (rdp->passed_quiesce == 0 || rdp->gpnum != rnp->gpnum ||
-	    rnp->completed == rnp->gpnum || rdp->gpwrap) {
+	if ((rdp->passed_quiesce == 0 &&
+	     rdp->rcu_qs_ctr_snap == __this_cpu_read(rcu_qs_ctr)) ||
+	    rdp->gpnum != rnp->gpnum || rnp->completed == rnp->gpnum ||
+	    rdp->gpwrap) {
 
 		/*
 		 * The grace period in which this quiescent state was
@@ -2085,6 +2107,7 @@ rcu_report_qs_rdp(int cpu, struct rcu_state *rsp, struct rcu_data *rdp)
 		 * within the current grace period.
 		 */
 		rdp->passed_quiesce = 0;	/* need qs for new gp. */
+		rdp->rcu_qs_ctr_snap = __this_cpu_read(rcu_qs_ctr);
 		raw_spin_unlock_irqrestore(&rnp->lock, flags);
 		return;
 	}
@@ -2129,7 +2152,8 @@ rcu_check_quiescent_state(struct rcu_state *rsp, struct rcu_data *rdp)
 	 * Was there a quiescent state since the beginning of the grace
 	 * period? If no, then exit and wait for the next call.
 	 */
-	if (!rdp->passed_quiesce)
+	if (!rdp->passed_quiesce &&
+	    rdp->rcu_qs_ctr_snap == __this_cpu_read(rcu_qs_ctr))
 		return;
 
 	/*
@@ -3174,9 +3198,12 @@ static int __rcu_pending(struct rcu_state *rsp, struct rcu_data *rdp)
 
 	/* Is the RCU core waiting for a quiescent state from this CPU? */
 	if (rcu_scheduler_fully_active &&
-	    rdp->qs_pending && !rdp->passed_quiesce) {
+	    rdp->qs_pending && !rdp->passed_quiesce &&
+	    rdp->rcu_qs_ctr_snap == __this_cpu_read(rcu_qs_ctr)) {
 		rdp->n_rp_qs_pending++;
-	} else if (rdp->qs_pending && rdp->passed_quiesce) {
+	} else if (rdp->qs_pending &&
+		   (rdp->passed_quiesce ||
+		    rdp->rcu_qs_ctr_snap != __this_cpu_read(rcu_qs_ctr))) {
 		rdp->n_rp_report_qs++;
 		return 1;
 	}
@@ -3510,6 +3537,7 @@ rcu_init_percpu_data(int cpu, struct rcu_state *rsp)
 			rdp->gpnum = rnp->completed;
 			rdp->completed = rnp->completed;
 			rdp->passed_quiesce = 0;
+			rdp->rcu_qs_ctr_snap = __this_cpu_read(rcu_qs_ctr);
 			rdp->qs_pending = 0;
 			trace_rcu_grace_period(rsp->name, rdp->gpnum, TPS("cpuonl"));
 		}
diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
index 7472ff388d55..1e7f8b05714e 100644
--- a/kernel/rcu/tree.h
+++ b/kernel/rcu/tree.h
@@ -257,6 +257,8 @@ struct rcu_data {
 					/*  in order to detect GP end. */
 	unsigned long	gpnum;		/* Highest gp number that this CPU */
 					/*  is aware of having started. */
+	unsigned long	rcu_qs_ctr_snap;/* Snapshot of rcu_qs_ctr to check */
+					/*  for rcu_all_qs() invocations. */
 	bool		passed_quiesce;	/* User-mode/idle loop etc. */
 	bool		qs_pending;	/* Core waits for quiesc state. */
 	bool		beenonline;	/* CPU online at least once. */
diff --git a/kernel/rcu/tree_trace.c b/kernel/rcu/tree_trace.c
index 5cdc62e1beeb..fbb6240509ea 100644
--- a/kernel/rcu/tree_trace.c
+++ b/kernel/rcu/tree_trace.c
@@ -46,6 +46,8 @@
 #define RCU_TREE_NONCORE
 #include "tree.h"
 
+DECLARE_PER_CPU_SHARED_ALIGNED(unsigned long, rcu_qs_ctr);
+
 static int r_open(struct inode *inode, struct file *file,
 					const struct seq_operations *op)
 {
@@ -115,11 +117,13 @@ static void print_one_rcu_data(struct seq_file *m, struct rcu_data *rdp)
 
 	if (!rdp->beenonline)
 		return;
-	seq_printf(m, "%3d%cc=%ld g=%ld pq=%d qp=%d",
+	seq_printf(m, "%3d%cc=%ld g=%ld pq=%d/%d qp=%d",
 		   rdp->cpu,
 		   cpu_is_offline(rdp->cpu) ? '!' : ' ',
 		   ulong2long(rdp->completed), ulong2long(rdp->gpnum),
-		   rdp->passed_quiesce, rdp->qs_pending);
+		   rdp->passed_quiesce,
+		   rdp->rcu_qs_ctr_snap == per_cpu(rcu_qs_ctr, rdp->cpu),
+		   rdp->qs_pending);
 	seq_printf(m, " dt=%d/%llx/%d df=%lu",
 		   atomic_read(&rdp->dynticks->dynticks),
 		   rdp->dynticks->dynticks_nesting,
-- 
1.8.1.5


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH tip/core/rcu 08/10] rcu: Add GP-kthread-starvation checks to CPU stall warnings
  2015-01-07 17:52 ` [PATCH tip/core/rcu 01/10] rcu: Set default to RCU_CPU_STALL_INFO=y Paul E. McKenney
                     ` (5 preceding siblings ...)
  2015-01-07 17:52   ` [PATCH tip/core/rcu 07/10] rcu: Make cond_resched_rcu_qs() apply to normal RCU flavors Paul E. McKenney
@ 2015-01-07 17:52   ` Paul E. McKenney
  2015-01-07 17:52   ` [PATCH tip/core/rcu 09/10] rcu: Fix RCU CPU stall detection in tiny implementation Paul E. McKenney
  2015-01-07 17:52   ` [PATCH tip/core/rcu 10/10] rcu: Initialize tiny RCU stall-warning timeouts at boot Paul E. McKenney
  8 siblings, 0 replies; 11+ messages in thread
From: Paul E. McKenney @ 2015-01-07 17:52 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, tglx,
	peterz, rostedt, dhowells, edumazet, dvhart, fweisbec, oleg,
	bobby.prani, Paul E. McKenney

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>

This commit adds a message that is printed if the relevant grace-period
kthread has not been able to run for the two seconds preceding the
stall warning.  (The two seconds is double the maximum interval between
successive bouts of quiescent-state forcing.)

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 Documentation/RCU/stallwarn.txt |  9 +++++++++
 kernel/rcu/tree.c               | 21 ++++++++++++++++++++-
 2 files changed, 29 insertions(+), 1 deletion(-)

diff --git a/Documentation/RCU/stallwarn.txt b/Documentation/RCU/stallwarn.txt
index 55f9707fe60a..53e7d2856db6 100644
--- a/Documentation/RCU/stallwarn.txt
+++ b/Documentation/RCU/stallwarn.txt
@@ -152,6 +152,15 @@ no non-lazy callbacks ("." is printed otherwise, as shown above) and
 "D" indicates that dyntick-idle processing is enabled ("." is printed
 otherwise, for example, if disabled via the "nohz=" kernel boot parameter).
 
+If the relevant grace-period kthread has been unable to run prior to
+the stall warning, the following additional line is printed:
+
+	rcu_preempt kthread starved for 2023 jiffies!
+
+Starving the grace-period kthreads of CPU time can of course result in
+RCU CPU stall warnings even when all CPUs and tasks have passed through
+the required quiescent states.
+
 
 Multiple Warnings From One Stall
 
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 6284c19b39ba..5110a1767c22 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -1073,6 +1073,21 @@ static void record_gp_stall_check_time(struct rcu_state *rsp)
 }
 
 /*
+ * Complain about starvation of grace-period kthread.
+ */
+static void rcu_check_gp_kthread_starvation(struct rcu_state *rsp)
+{
+	unsigned long gpa;
+	unsigned long j;
+
+	j = jiffies;
+	gpa = ACCESS_ONCE(rsp->gp_activity);
+	if (j - gpa > 2 * HZ)
+		pr_err("%s kthread starved for %ld jiffies!\n",
+		       rsp->name, j - gpa);
+}
+
+/*
  * Dump stacks of all tasks running on stalled CPUs.
  */
 static void rcu_dump_cpu_stacks(struct rcu_state *rsp)
@@ -1169,9 +1184,10 @@ static void print_other_cpu_stall(struct rcu_state *rsp, unsigned long gpnum)
 	}
 
 	/* Complain about tasks blocking the grace period. */
-
 	rcu_print_detail_task_stall(rsp);
 
+	rcu_check_gp_kthread_starvation(rsp);
+
 	force_quiescent_state(rsp);  /* Kick them all. */
 }
 
@@ -1196,6 +1212,9 @@ static void print_cpu_stall(struct rcu_state *rsp)
 	pr_cont(" (t=%lu jiffies g=%ld c=%ld q=%lu)\n",
 		jiffies - rsp->gp_start,
 		(long)rsp->gpnum, (long)rsp->completed, totqlen);
+
+	rcu_check_gp_kthread_starvation(rsp);
+
 	rcu_dump_cpu_stacks(rsp);
 
 	raw_spin_lock_irqsave(&rnp->lock, flags);
-- 
1.8.1.5


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH tip/core/rcu 09/10] rcu: Fix RCU CPU stall detection in tiny implementation
  2015-01-07 17:52 ` [PATCH tip/core/rcu 01/10] rcu: Set default to RCU_CPU_STALL_INFO=y Paul E. McKenney
                     ` (6 preceding siblings ...)
  2015-01-07 17:52   ` [PATCH tip/core/rcu 08/10] rcu: Add GP-kthread-starvation checks to CPU stall warnings Paul E. McKenney
@ 2015-01-07 17:52   ` Paul E. McKenney
  2015-01-07 17:52   ` [PATCH tip/core/rcu 10/10] rcu: Initialize tiny RCU stall-warning timeouts at boot Paul E. McKenney
  8 siblings, 0 replies; 11+ messages in thread
From: Paul E. McKenney @ 2015-01-07 17:52 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, tglx,
	peterz, rostedt, dhowells, edumazet, dvhart, fweisbec, oleg,
	bobby.prani, Miroslav Benes, Paul E. McKenney

From: Miroslav Benes <mbenes@suse.cz>

The tiny RCU CPU stall detection depends on *rcp->curtail not being
NULL. It is however a tail pointer and thus NULL by definition. Instead we
should check rcp->rcucblist for the presence of pending callbacks which
need to be processed. With this fix INFO about the stall is printed and
jiffies_stall (jiffies at next stall) correctly updated.

Note that the check for pending callback is necessary to avoid spurious
warnings if there are no pendings callbacks.

Signed-off-by: Miroslav Benes <mbenes@suse.cz>
[ paulmck: Fused identical "if" statements, ported to -rcu. ]
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcu/tiny_plugin.h | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/kernel/rcu/tiny_plugin.h b/kernel/rcu/tiny_plugin.h
index 858c56569127..80f908afe348 100644
--- a/kernel/rcu/tiny_plugin.h
+++ b/kernel/rcu/tiny_plugin.h
@@ -145,17 +145,16 @@ static void check_cpu_stall(struct rcu_ctrlblk *rcp)
 	rcp->ticks_this_gp++;
 	j = jiffies;
 	js = ACCESS_ONCE(rcp->jiffies_stall);
-	if (*rcp->curtail && ULONG_CMP_GE(j, js)) {
+	if (rcp->rcucblist && ULONG_CMP_GE(j, js)) {
 		pr_err("INFO: %s stall on CPU (%lu ticks this GP) idle=%llx (t=%lu jiffies q=%ld)\n",
 		       rcp->name, rcp->ticks_this_gp, rcu_dynticks_nesting,
 		       jiffies - rcp->gp_start, rcp->qlen);
 		dump_stack();
-	}
-	if (*rcp->curtail && ULONG_CMP_GE(j, js))
 		ACCESS_ONCE(rcp->jiffies_stall) = jiffies +
 			3 * rcu_jiffies_till_stall_check() + 3;
-	else if (ULONG_CMP_GE(j, js))
+	} else if (ULONG_CMP_GE(j, js)) {
 		ACCESS_ONCE(rcp->jiffies_stall) = jiffies + rcu_jiffies_till_stall_check();
+	}
 }
 
 static void reset_cpu_stall_ticks(struct rcu_ctrlblk *rcp)
-- 
1.8.1.5


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH tip/core/rcu 10/10] rcu: Initialize tiny RCU stall-warning timeouts at boot
  2015-01-07 17:52 ` [PATCH tip/core/rcu 01/10] rcu: Set default to RCU_CPU_STALL_INFO=y Paul E. McKenney
                     ` (7 preceding siblings ...)
  2015-01-07 17:52   ` [PATCH tip/core/rcu 09/10] rcu: Fix RCU CPU stall detection in tiny implementation Paul E. McKenney
@ 2015-01-07 17:52   ` Paul E. McKenney
  8 siblings, 0 replies; 11+ messages in thread
From: Paul E. McKenney @ 2015-01-07 17:52 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, tglx,
	peterz, rostedt, dhowells, edumazet, dvhart, fweisbec, oleg,
	bobby.prani, Paul E. McKenney

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>

The current tiny RCU stall-warning code assumes that the jiffies counter
starts at zero, however, it is sometimes initialized to other values,
for example, -30,000.  This commit therefore changes rcu_init() to
invoke reset_cpu_stall_ticks() for both flavors of RCU to initialize
the stall-warning times properly at boot.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcu/tiny.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/rcu/tiny.c b/kernel/rcu/tiny.c
index 0db5649f8817..fc0f2c07af3a 100644
--- a/kernel/rcu/tiny.c
+++ b/kernel/rcu/tiny.c
@@ -383,6 +383,8 @@ EXPORT_SYMBOL_GPL(call_rcu_bh);
 void __init rcu_init(void)
 {
 	open_softirq(RCU_SOFTIRQ, rcu_process_callbacks);
+	RCU_TRACE(reset_cpu_stall_ticks(&rcu_sched_ctrlblk));
+	RCU_TRACE(reset_cpu_stall_ticks(&rcu_bh_ctrlblk));
 
 	rcu_early_boot_tests();
 }
-- 
1.8.1.5


^ permalink raw reply related	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2015-01-07 17:56 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-01-07 17:52 [PATCH tip/core/rcu 0/9] Stall-warning changes for 3.20 Paul E. McKenney
2015-01-07 17:52 ` [PATCH tip/core/rcu 01/10] rcu: Set default to RCU_CPU_STALL_INFO=y Paul E. McKenney
2015-01-07 17:52   ` [PATCH tip/core/rcu 02/10] rcutorture: Add checks for stall ending before dump start Paul E. McKenney
2015-01-07 17:52   ` [PATCH tip/core/rcu 03/10] rcu: Make RCU_CPU_STALL_INFO include number of fqs attempts Paul E. McKenney
2015-01-07 17:52   ` [PATCH tip/core/rcu 04/10] rcu: Improve diagnostics for spurious RCU CPU stall warnings Paul E. McKenney
2015-01-07 17:52   ` [PATCH tip/core/rcu 05/10] rcu: Handle gpnum/completed wrap while dyntick idle Paul E. McKenney
2015-01-07 17:52   ` [PATCH tip/core/rcu 06/10] rcu: Run grace-period kthreads at real-time priority Paul E. McKenney
2015-01-07 17:52   ` [PATCH tip/core/rcu 07/10] rcu: Make cond_resched_rcu_qs() apply to normal RCU flavors Paul E. McKenney
2015-01-07 17:52   ` [PATCH tip/core/rcu 08/10] rcu: Add GP-kthread-starvation checks to CPU stall warnings Paul E. McKenney
2015-01-07 17:52   ` [PATCH tip/core/rcu 09/10] rcu: Fix RCU CPU stall detection in tiny implementation Paul E. McKenney
2015-01-07 17:52   ` [PATCH tip/core/rcu 10/10] rcu: Initialize tiny RCU stall-warning timeouts at boot Paul E. McKenney

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.