* [PATCH tip/core/rcu 01/55] rcu: Use kthread_create_on_node()
From: Paul E. McKenney @ 2011-09-06 17:59 UTC
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, patches, Tejun Heo, Rusty Russell, Andi Kleen,
	Paul E. McKenney

From: Eric Dumazet <eric.dumazet@gmail.com>

Commit a26ac2455ffc (move TREE_RCU from softirq to kthread) added
per-CPU kthreads, but created them with kthread_create(), which can
put a kthread's stack and task struct on the wrong NUMA node.
Therefore, use kthread_create_on_node() instead so that the stacks
and task structs are placed on the correct NUMA node.

A similar change was carried out in commit 94dcf29a11b3 (kthread:
use kthread_create_on_node()).

Also change rcutorture's priority-boost-test kthread creation.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Tejun Heo <tj@kernel.org>
CC: Rusty Russell <rusty@rustcorp.com.au>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Andi Kleen <ak@linux.intel.com>
CC: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcutorture.c     |    5 +++--
 kernel/rcutree_plugin.h |    5 ++++-
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/kernel/rcutorture.c b/kernel/rcutorture.c
index 2e138db..920eb50 100644
--- a/kernel/rcutorture.c
+++ b/kernel/rcutorture.c
@@ -1282,8 +1282,9 @@ static int rcutorture_booster_init(int cpu)
 	/* Don't allow time recalculation while creating a new task. */
 	mutex_lock(&boost_mutex);
 	VERBOSE_PRINTK_STRING("Creating rcu_torture_boost task");
-	boost_tasks[cpu] = kthread_create(rcu_torture_boost, NULL,
-					  "rcu_torture_boost");
+	boost_tasks[cpu] = kthread_create_on_node(rcu_torture_boost, NULL,
+						  cpu_to_node(cpu),
+						  "rcu_torture_boost");
 	if (IS_ERR(boost_tasks[cpu])) {
 		retval = PTR_ERR(boost_tasks[cpu]);
 		VERBOSE_PRINTK_STRING("rcu_torture_boost task create failed");
diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
index 8aafbb8..7b850cd 100644
--- a/kernel/rcutree_plugin.h
+++ b/kernel/rcutree_plugin.h
@@ -1560,7 +1560,10 @@ static int __cpuinit rcu_spawn_one_cpu_kthread(int cpu)
 	if (!rcu_scheduler_fully_active ||
 	    per_cpu(rcu_cpu_kthread_task, cpu) != NULL)
 		return 0;
-	t = kthread_create(rcu_cpu_kthread, (void *)(long)cpu, "rcuc%d", cpu);
+	t = kthread_create_on_node(rcu_cpu_kthread,
+				   (void *)(long)cpu,
+				   cpu_to_node(cpu),
+				   "rcuc%d", cpu);
 	if (IS_ERR(t))
 		return PTR_ERR(t);
 	if (cpu_online(cpu))
-- 
1.7.3.2


* [PATCH tip/core/rcu 02/55] rcu: Avoid unnecessary self-wakeup of per-CPU kthreads
From: Paul E. McKenney @ 2011-09-06 17:59 UTC
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, patches, Shaohua Li, Paul E. McKenney, Paul E. McKenney

From: Shaohua Li <shaohua.li@intel.com>

There are a number of cases where RCU can find additional work for
the per-CPU kthread from within the context of that same kthread.
In such cases, the kthread is already running, so attempting to wake
it from within itself does nothing except waste CPU cycles.  This
commit therefore checks whether the current task is the per-CPU
kthread, omitting the wakeup in that case.

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcutree_plugin.h |    8 +++-----
 1 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
index 7b850cd..9703298 100644
--- a/kernel/rcutree_plugin.h
+++ b/kernel/rcutree_plugin.h
@@ -1291,11 +1291,9 @@ static void invoke_rcu_callbacks_kthread(void)
 
 	local_irq_save(flags);
 	__this_cpu_write(rcu_cpu_has_work, 1);
-	if (__this_cpu_read(rcu_cpu_kthread_task) == NULL) {
-		local_irq_restore(flags);
-		return;
-	}
-	wake_up_process(__this_cpu_read(rcu_cpu_kthread_task));
+	if (__this_cpu_read(rcu_cpu_kthread_task) != NULL &&
+	    current != __this_cpu_read(rcu_cpu_kthread_task))
+		wake_up_process(__this_cpu_read(rcu_cpu_kthread_task));
 	local_irq_restore(flags);
 }
 
-- 
1.7.3.2


* [PATCH tip/core/rcu 03/55] rcu: Update documentation to flag RCU_BOOST trace information
From: Paul E. McKenney @ 2011-09-06 17:59 UTC
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, patches, Paul E. McKenney, Paul E. McKenney

From: Paul E. McKenney <paul.mckenney@linaro.org>

Call out the RCU_TRACE information that is provided only in kernels
built with RCU_BOOST.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 Documentation/RCU/trace.txt |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/Documentation/RCU/trace.txt b/Documentation/RCU/trace.txt
index 8173cec..a67af0a 100644
--- a/Documentation/RCU/trace.txt
+++ b/Documentation/RCU/trace.txt
@@ -184,10 +184,14 @@ o	"kt" is the per-CPU kernel-thread state.  The digit preceding
 	The number after the final slash is the CPU that the kthread
 	is actually running on.
 
+	This field is displayed only for CONFIG_RCU_BOOST kernels.
+
 o	"ktl" is the low-order 16 bits (in hexadecimal) of the count of
 	the number of times that this CPU's per-CPU kthread has gone
 	through its loop servicing invoke_rcu_cpu_kthread() requests.
 
+	This field is displayed only for CONFIG_RCU_BOOST kernels.
+
 o	"b" is the batch limit for this CPU.  If more than this number
 	of RCU callbacks is ready to invoke, then the remainder will
 	be deferred.
-- 
1.7.3.2


* [PATCH tip/core/rcu 04/55] rcu: Restore checks for blocking in RCU read-side critical sections
From: Paul E. McKenney @ 2011-09-06 17:59 UTC
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, patches, Paul E. McKenney

Long ago, using TREE_RCU with PREEMPT would result in "scheduling
while atomic" diagnostics if you blocked in an RCU read-side critical
section.  However, PREEMPT now implies TREE_PREEMPT_RCU, which defeats
this diagnostic.  This commit therefore adds a replacement diagnostic
based on PROVE_RCU.

Because rcu_lockdep_assert() and lockdep_rcu_dereference() are now being
used for things that have nothing to do with rcu_dereference(), rename
lockdep_rcu_dereference() to lockdep_rcu_suspicious() and add a third
argument that is a string indicating what is suspicious.  This third
argument is passed in from a new second argument to rcu_lockdep_assert().
Update all calls to rcu_lockdep_assert() to add an informative second
argument.

Also, add a pair of rcu_lockdep_assert() calls from within
rcu_note_context_switch(), one complaining if a context switch occurs
in an RCU-bh read-side critical section and another complaining if a
context switch occurs in an RCU-sched read-side critical section.
These are present only in kernels built with the PROVE_RCU Kconfig option.
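
For example, the following hypothetical code sketch (not code from
this patch; "gp" and "struct foo" are made-up names) now provokes a
lockdep splat in PROVE_RCU kernels, where previously it went unnoticed:

	struct foo *p;

	rcu_read_lock_bh();
	p = rcu_dereference_bh(gp);	/* gp: made-up RCU-bh-protected
					 * pointer */
	msleep(1);			/* BUG: context switch inside an
					 * RCU-bh read-side critical
					 * section -- rcu_sleep_check()
					 * now complains */
	pr_info("data: %d\n", p->data);
	rcu_read_unlock_bh();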

Finally, fix some checkpatch whitespace complaints in lockdep.c.

Again, you must enable PROVE_RCU to see these new diagnostics.  But you
are enabling PROVE_RCU to check out new RCU uses in any case, aren't you?

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 include/linux/lockdep.h  |    2 +-
 include/linux/rcupdate.h |   28 ++++++++++++---
 kernel/lockdep.c         |   84 +++++++++++++++++++++++++--------------------
 kernel/pid.c             |    4 ++-
 kernel/sched.c           |    2 +
 5 files changed, 75 insertions(+), 45 deletions(-)

diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
index ef820a3..b6a56e3 100644
--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -548,7 +548,7 @@ do {									\
 #endif
 
 #ifdef CONFIG_PROVE_RCU
-extern void lockdep_rcu_dereference(const char *file, const int line);
+void lockdep_rcu_suspicious(const char *file, const int line, const char *s);
 #endif
 
 #endif /* __LINUX_LOCKDEP_H */
diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 99f9aa7..8be0433 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -297,19 +297,31 @@ extern int rcu_my_thread_group_empty(void);
 /**
  * rcu_lockdep_assert - emit lockdep splat if specified condition not met
  * @c: condition to check
+ * @s: informative message
  */
-#define rcu_lockdep_assert(c)						\
+#define rcu_lockdep_assert(c, s)					\
 	do {								\
 		static bool __warned;					\
 		if (debug_lockdep_rcu_enabled() && !__warned && !(c)) {	\
 			__warned = true;				\
-			lockdep_rcu_dereference(__FILE__, __LINE__);	\
+			lockdep_rcu_suspicious(__FILE__, __LINE__, s);	\
 		}							\
 	} while (0)
 
+#define rcu_sleep_check()						\
+	do {								\
+		rcu_lockdep_assert(!lock_is_held(&rcu_bh_lock_map),	\
+				   "Illegal context switch in RCU-bh"	\
+				   " read-side critical section");	\
+		rcu_lockdep_assert(!lock_is_held(&rcu_sched_lock_map),	\
+				   "Illegal context switch in RCU-sched"\
+				   " read-side critical section");	\
+	} while (0)
+
 #else /* #ifdef CONFIG_PROVE_RCU */
 
-#define rcu_lockdep_assert(c) do { } while (0)
+#define rcu_lockdep_assert(c, s) do { } while (0)
+#define rcu_sleep_check() do { } while (0)
 
 #endif /* #else #ifdef CONFIG_PROVE_RCU */
 
@@ -338,14 +350,16 @@ extern int rcu_my_thread_group_empty(void);
 #define __rcu_dereference_check(p, c, space) \
 	({ \
 		typeof(*p) *_________p1 = (typeof(*p)*__force )ACCESS_ONCE(p); \
-		rcu_lockdep_assert(c); \
+		rcu_lockdep_assert(c, "suspicious rcu_dereference_check()" \
+				      " usage"); \
 		rcu_dereference_sparse(p, space); \
 		smp_read_barrier_depends(); \
 		((typeof(*p) __force __kernel *)(_________p1)); \
 	})
 #define __rcu_dereference_protected(p, c, space) \
 	({ \
-		rcu_lockdep_assert(c); \
+		rcu_lockdep_assert(c, "suspicious rcu_dereference_protected()" \
+				      " usage"); \
 		rcu_dereference_sparse(p, space); \
 		((typeof(*p) __force __kernel *)(p)); \
 	})
@@ -359,7 +373,9 @@ extern int rcu_my_thread_group_empty(void);
 #define __rcu_dereference_index_check(p, c) \
 	({ \
 		typeof(p) _________p1 = ACCESS_ONCE(p); \
-		rcu_lockdep_assert(c); \
+		rcu_lockdep_assert(c, \
+				   "suspicious rcu_dereference_index_check()" \
+				   " usage"); \
 		smp_read_barrier_depends(); \
 		(_________p1); \
 	})
diff --git a/kernel/lockdep.c b/kernel/lockdep.c
index 298c927..df2ad37 100644
--- a/kernel/lockdep.c
+++ b/kernel/lockdep.c
@@ -1129,10 +1129,11 @@ print_circular_bug_header(struct lock_list *entry, unsigned int depth,
 	if (debug_locks_silent)
 		return 0;
 
-	printk("\n=======================================================\n");
-	printk(  "[ INFO: possible circular locking dependency detected ]\n");
+	printk("\n");
+	printk("======================================================\n");
+	printk("[ INFO: possible circular locking dependency detected ]\n");
 	print_kernel_version();
-	printk(  "-------------------------------------------------------\n");
+	printk("-------------------------------------------------------\n");
 	printk("%s/%d is trying to acquire lock:\n",
 		curr->comm, task_pid_nr(curr));
 	print_lock(check_src);
@@ -1463,11 +1464,12 @@ print_bad_irq_dependency(struct task_struct *curr,
 	if (!debug_locks_off_graph_unlock() || debug_locks_silent)
 		return 0;
 
-	printk("\n======================================================\n");
-	printk(  "[ INFO: %s-safe -> %s-unsafe lock order detected ]\n",
+	printk("\n");
+	printk("======================================================\n");
+	printk("[ INFO: %s-safe -> %s-unsafe lock order detected ]\n",
 		irqclass, irqclass);
 	print_kernel_version();
-	printk(  "------------------------------------------------------\n");
+	printk("------------------------------------------------------\n");
 	printk("%s/%d [HC%u[%lu]:SC%u[%lu]:HE%u:SE%u] is trying to acquire:\n",
 		curr->comm, task_pid_nr(curr),
 		curr->hardirq_context, hardirq_count() >> HARDIRQ_SHIFT,
@@ -1692,10 +1694,11 @@ print_deadlock_bug(struct task_struct *curr, struct held_lock *prev,
 	if (!debug_locks_off_graph_unlock() || debug_locks_silent)
 		return 0;
 
-	printk("\n=============================================\n");
-	printk(  "[ INFO: possible recursive locking detected ]\n");
+	printk("\n");
+	printk("=============================================\n");
+	printk("[ INFO: possible recursive locking detected ]\n");
 	print_kernel_version();
-	printk(  "---------------------------------------------\n");
+	printk("---------------------------------------------\n");
 	printk("%s/%d is trying to acquire lock:\n",
 		curr->comm, task_pid_nr(curr));
 	print_lock(next);
@@ -2177,10 +2180,11 @@ print_usage_bug(struct task_struct *curr, struct held_lock *this,
 	if (!debug_locks_off_graph_unlock() || debug_locks_silent)
 		return 0;
 
-	printk("\n=================================\n");
-	printk(  "[ INFO: inconsistent lock state ]\n");
+	printk("\n");
+	printk("=================================\n");
+	printk("[ INFO: inconsistent lock state ]\n");
 	print_kernel_version();
-	printk(  "---------------------------------\n");
+	printk("---------------------------------\n");
 
 	printk("inconsistent {%s} -> {%s} usage.\n",
 		usage_str[prev_bit], usage_str[new_bit]);
@@ -2241,10 +2245,11 @@ print_irq_inversion_bug(struct task_struct *curr,
 	if (!debug_locks_off_graph_unlock() || debug_locks_silent)
 		return 0;
 
-	printk("\n=========================================================\n");
-	printk(  "[ INFO: possible irq lock inversion dependency detected ]\n");
+	printk("\n");
+	printk("=========================================================\n");
+	printk("[ INFO: possible irq lock inversion dependency detected ]\n");
 	print_kernel_version();
-	printk(  "---------------------------------------------------------\n");
+	printk("---------------------------------------------------------\n");
 	printk("%s/%d just changed the state of lock:\n",
 		curr->comm, task_pid_nr(curr));
 	print_lock(this);
@@ -3053,9 +3058,10 @@ print_unlock_inbalance_bug(struct task_struct *curr, struct lockdep_map *lock,
 	if (debug_locks_silent)
 		return 0;
 
-	printk("\n=====================================\n");
-	printk(  "[ BUG: bad unlock balance detected! ]\n");
-	printk(  "-------------------------------------\n");
+	printk("\n");
+	printk("=====================================\n");
+	printk("[ BUG: bad unlock balance detected! ]\n");
+	printk("-------------------------------------\n");
 	printk("%s/%d is trying to release lock (",
 		curr->comm, task_pid_nr(curr));
 	print_lockdep_cache(lock);
@@ -3460,9 +3466,10 @@ print_lock_contention_bug(struct task_struct *curr, struct lockdep_map *lock,
 	if (debug_locks_silent)
 		return 0;
 
-	printk("\n=================================\n");
-	printk(  "[ BUG: bad contention detected! ]\n");
-	printk(  "---------------------------------\n");
+	printk("\n");
+	printk("=================================\n");
+	printk("[ BUG: bad contention detected! ]\n");
+	printk("---------------------------------\n");
 	printk("%s/%d is trying to contend lock (",
 		curr->comm, task_pid_nr(curr));
 	print_lockdep_cache(lock);
@@ -3821,9 +3828,10 @@ print_freed_lock_bug(struct task_struct *curr, const void *mem_from,
 	if (debug_locks_silent)
 		return;
 
-	printk("\n=========================\n");
-	printk(  "[ BUG: held lock freed! ]\n");
-	printk(  "-------------------------\n");
+	printk("\n");
+	printk("=========================\n");
+	printk("[ BUG: held lock freed! ]\n");
+	printk("-------------------------\n");
 	printk("%s/%d is freeing memory %p-%p, with a lock still held there!\n",
 		curr->comm, task_pid_nr(curr), mem_from, mem_to-1);
 	print_lock(hlock);
@@ -3877,9 +3885,10 @@ static void print_held_locks_bug(struct task_struct *curr)
 	if (debug_locks_silent)
 		return;
 
-	printk("\n=====================================\n");
-	printk(  "[ BUG: lock held at task exit time! ]\n");
-	printk(  "-------------------------------------\n");
+	printk("\n");
+	printk("=====================================\n");
+	printk("[ BUG: lock held at task exit time! ]\n");
+	printk("-------------------------------------\n");
 	printk("%s/%d is exiting with locks still held!\n",
 		curr->comm, task_pid_nr(curr));
 	lockdep_print_held_locks(curr);
@@ -3973,16 +3982,17 @@ void lockdep_sys_exit(void)
 	if (unlikely(curr->lockdep_depth)) {
 		if (!debug_locks_off())
 			return;
-		printk("\n================================================\n");
-		printk(  "[ BUG: lock held when returning to user space! ]\n");
-		printk(  "------------------------------------------------\n");
+		printk("\n");
+		printk("================================================\n");
+		printk("[ BUG: lock held when returning to user space! ]\n");
+		printk("------------------------------------------------\n");
 		printk("%s/%d is leaving the kernel with locks still held!\n",
 				curr->comm, curr->pid);
 		lockdep_print_held_locks(curr);
 	}
 }
 
-void lockdep_rcu_dereference(const char *file, const int line)
+void lockdep_rcu_suspicious(const char *file, const int line, const char *s)
 {
 	struct task_struct *curr = current;
 
@@ -3991,15 +4001,15 @@ void lockdep_rcu_dereference(const char *file, const int line)
 		return;
 #endif /* #ifdef CONFIG_PROVE_RCU_REPEATEDLY */
 	/* Note: the following can be executed concurrently, so be careful. */
-	printk("\n===================================================\n");
-	printk(  "[ INFO: suspicious rcu_dereference_check() usage. ]\n");
-	printk(  "---------------------------------------------------\n");
-	printk("%s:%d invoked rcu_dereference_check() without protection!\n",
-			file, line);
+	printk("\n");
+	printk("===============================\n");
+	printk("[ INFO: suspicious RCU usage. ]\n");
+	printk("-------------------------------\n");
+	printk("%s:%d %s!\n", file, line, s);
 	printk("\nother info that might help us debug this:\n\n");
 	printk("\nrcu_scheduler_active = %d, debug_locks = %d\n", rcu_scheduler_active, debug_locks);
 	lockdep_print_held_locks(curr);
 	printk("\nstack backtrace:\n");
 	dump_stack();
 }
-EXPORT_SYMBOL_GPL(lockdep_rcu_dereference);
+EXPORT_SYMBOL_GPL(lockdep_rcu_suspicious);
diff --git a/kernel/pid.c b/kernel/pid.c
index 57a8346..a7577b3 100644
--- a/kernel/pid.c
+++ b/kernel/pid.c
@@ -419,7 +419,9 @@ EXPORT_SYMBOL(pid_task);
  */
 struct task_struct *find_task_by_pid_ns(pid_t nr, struct pid_namespace *ns)
 {
-	rcu_lockdep_assert(rcu_read_lock_held());
+	rcu_lockdep_assert(rcu_read_lock_held(),
+			   "find_task_by_pid_ns() needs rcu_read_lock()"
+			   " protection");
 	return pid_task(find_pid_ns(nr, ns), PIDTYPE_PID);
 }
 
diff --git a/kernel/sched.c b/kernel/sched.c
index fde6ff9..1c87917 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -4200,6 +4200,7 @@ static inline void schedule_debug(struct task_struct *prev)
 	 */
 	if (unlikely(in_atomic_preempt_off() && !prev->exit_state))
 		__schedule_bug(prev);
+	rcu_sleep_check();
 
 	profile_hit(SCHED_PROFILING, __builtin_return_address(0));
 
@@ -8198,6 +8199,7 @@ void __might_sleep(const char *file, int line, int preempt_offset)
 #ifdef in_atomic
 	static unsigned long prev_jiffy;	/* ratelimiting */
 
+	rcu_sleep_check(); /* WARN_ON_ONCE() by default, no rate limit reqd. */
 	if ((preempt_count_equals(preempt_offset) && !irqs_disabled()) ||
 	    system_state != SYSTEM_RUNNING || oops_in_progress)
 		return;
-- 
1.7.3.2


* [PATCH tip/core/rcu 05/55] rcu: Move rcu_head definition to types.h
From: Paul E. McKenney @ 2011-09-06 17:59 UTC
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, patches, Paul E. McKenney, Paul Gortmaker

Take a first step towards untangling Linux kernel header files by
placing the struct rcu_head definition into include/linux/types.h
and including include/linux/types.h in include/linux/rcupdate.h
where struct rcu_head used to be defined.  The actual inclusion point
for include/linux/types.h is with the rest of the #include directives
rather than at the point where struct rcu_head used to be defined,
as suggested by Mathieu Desnoyers.

Once this is in place, then header files that need only rcu_head
can include types.h rather than rcupdate.h.
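
For instance, a header declaring a structure that merely embeds an
rcu_head (a hypothetical example, not part of this patch) can now use
the lighter include; only code actually invoking call_rcu() and
friends still needs rcupdate.h:

	#include <linux/types.h>	/* now provides struct rcu_head */

	struct foo {			/* hypothetical structure */
		int data;
		struct rcu_head rcu;	/* rcupdate.h not required here */
	};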

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
---
 include/linux/rcupdate.h |   11 +----------
 include/linux/types.h    |   10 ++++++++++
 2 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 8be0433..2516555 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -33,6 +33,7 @@
 #ifndef __LINUX_RCUPDATE_H
 #define __LINUX_RCUPDATE_H
 
+#include <linux/types.h>
 #include <linux/cache.h>
 #include <linux/spinlock.h>
 #include <linux/threads.h>
@@ -64,16 +65,6 @@ static inline void rcutorture_record_progress(unsigned long vernum)
 #define ULONG_CMP_GE(a, b)	(ULONG_MAX / 2 >= (a) - (b))
 #define ULONG_CMP_LT(a, b)	(ULONG_MAX / 2 < (a) - (b))
 
-/**
- * struct rcu_head - callback structure for use with RCU
- * @next: next update requests in a list
- * @func: actual update function to call after the grace period.
- */
-struct rcu_head {
-	struct rcu_head *next;
-	void (*func)(struct rcu_head *head);
-};
-
 /* Exported common interfaces */
 extern void call_rcu_sched(struct rcu_head *head,
 			   void (*func)(struct rcu_head *rcu));
diff --git a/include/linux/types.h b/include/linux/types.h
index 176da8c..57a9723 100644
--- a/include/linux/types.h
+++ b/include/linux/types.h
@@ -238,6 +238,16 @@ struct ustat {
 	char			f_fpack[6];
 };
 
+/**
+ * struct rcu_head - callback structure for use with RCU
+ * @next: next update requests in a list
+ * @func: actual update function to call after the grace period.
+ */
+struct rcu_head {
+	struct rcu_head *next;
+	void (*func)(struct rcu_head *head);
+};
+
 #endif	/* __KERNEL__ */
 #endif /*  __ASSEMBLY__ */
 #endif /* _LINUX_TYPES_H */
-- 
1.7.3.2


* [PATCH tip/core/rcu 06/55] rcu: Update rcutorture documentation
From: Paul E. McKenney @ 2011-09-06 18:00 UTC
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, patches, Paul E. McKenney, Paul E. McKenney

From: Paul E. McKenney <paul.mckenney@linaro.org>

Update rcutorture documentation to account for boosting, new types of
RCU torture testing that have been added over the past few years, and
the memory-barrier testing that was added an embarrassingly long time
ago.
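
For example, assuming rcutorture is built as a module, a run
exercising the newly documented boost-testing parameters might be
started as follows (a sketch; adjust parameter values to taste):

	modprobe rcutorture torture_type=rcu test_boost=2 \
		 test_boost_interval=7 test_boost_duration=4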

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 Documentation/RCU/torture.txt |  134 ++++++++++++++++++++++++++++++-----------
 1 files changed, 99 insertions(+), 35 deletions(-)

diff --git a/Documentation/RCU/torture.txt b/Documentation/RCU/torture.txt
index 5d90167..4205ed1 100644
--- a/Documentation/RCU/torture.txt
+++ b/Documentation/RCU/torture.txt
@@ -42,7 +42,7 @@ fqs_holdoff	Holdoff time (in microseconds) between consecutive calls
 fqs_stutter	Wait time (in seconds) between consecutive bursts
 		of calls to force_quiescent_state().
 
-irqreaders	Says to invoke RCU readers from irq level.  This is currently
+irqreader	Says to invoke RCU readers from irq level.  This is currently
 		done via timers.  Defaults to "1" for variants of RCU that
 		permit this.  (Or, more accurately, variants of RCU that do
 		-not- permit this know to ignore this variable.)
@@ -79,19 +79,65 @@ stutter		The length of time to run the test before pausing for this
 		Specifying "stutter=0" causes the test to run continuously
 		without pausing, which is the old default behavior.
 
+test_boost	Whether or not to test the ability of RCU to do priority
+		boosting.  Defaults to "test_boost=1", which performs
+		RCU priority-inversion testing only if the selected
+		RCU implementation supports priority boosting.  Specifying
+		"test_boost=0" never performs RCU priority-inversion
+		testing.  Specifying "test_boost=2" performs RCU
+		priority-inversion testing even if the selected RCU
+		implementation does not support RCU priority boosting,
+		which can be used to test rcutorture's ability to
+		carry out RCU priority-inversion testing.
+
+test_boost_interval
+		The number of seconds in an RCU priority-inversion test
+		cycle.	Defaults to "test_boost_interval=7".  It is
+		usually wise for this value to be relatively prime to
+		the value selected for "stutter".
+
+test_boost_duration
+		The number of seconds to do RCU priority-inversion testing
+		within any given "test_boost_interval".  Defaults to
+		"test_boost_duration=4".
+
 test_no_idle_hz	Whether or not to test the ability of RCU to operate in
 		a kernel that disables the scheduling-clock interrupt to
 		idle CPUs.  Boolean parameter, "1" to test, "0" otherwise.
 		Defaults to omitting this test.
 
-torture_type	The type of RCU to test: "rcu" for the rcu_read_lock() API,
-		"rcu_sync" for rcu_read_lock() with synchronous reclamation,
-		"rcu_bh" for the rcu_read_lock_bh() API, "rcu_bh_sync" for
-		rcu_read_lock_bh() with synchronous reclamation, "srcu" for
-		the "srcu_read_lock()" API, "sched" for the use of
-		preempt_disable() together with synchronize_sched(),
-		and "sched_expedited" for the use of preempt_disable()
-		with synchronize_sched_expedited().
+torture_type	The type of RCU to test, with string values as follows:
+
+		"rcu":  rcu_read_lock(), rcu_read_unlock() and call_rcu().
+
+		"rcu_sync":  rcu_read_lock(), rcu_read_unlock(), and
+			synchronize_rcu().
+
+		"rcu_expedited": rcu_read_lock(), rcu_read_unlock(), and
+			synchronize_rcu_expedited().
+
+		"rcu_bh": rcu_read_lock_bh(), rcu_read_unlock_bh(), and
+			call_rcu_bh().
+
+		"rcu_bh_sync": rcu_read_lock_bh(), rcu_read_unlock_bh(),
+			and synchronize_rcu_bh().
+
+		"srcu": srcu_read_lock(), srcu_read_unlock() and
+			synchronize_srcu().
+
+		"srcu_expedited": srcu_read_lock(), srcu_read_unlock() and
+			synchronize_srcu_expedited().
+
+		"sched": preempt_disable(), preempt_enable(), and
+			call_rcu_sched().
+
+		"sched_sync": preempt_disable(), preempt_enable(), and
+			synchronize_sched().
+
+		"sched_expedited": preempt_disable(), preempt_enable(), and
+			synchronize_sched_expedited().
+
+		Defaults to "rcu".
 
 verbose		Enable debug printk()s.  Default is disabled.
 
@@ -100,12 +146,12 @@ OUTPUT
 
 The statistics output is as follows:
 
-	rcu-torture: --- Start of test: nreaders=16 stat_interval=0 verbose=0
-	rcu-torture: rtc: 0000000000000000 ver: 1916 tfle: 0 rta: 1916 rtaf: 0 rtf: 1915
-	rcu-torture: Reader Pipe:  1466408 9747 0 0 0 0 0 0 0 0 0
-	rcu-torture: Reader Batch:  1464477 11678 0 0 0 0 0 0 0 0
-	rcu-torture: Free-Block Circulation:  1915 1915 1915 1915 1915 1915 1915 1915 1915 1915 0
-	rcu-torture: --- End of test
+	rcu-torture:--- Start of test: nreaders=16 nfakewriters=4 stat_interval=30 verbose=0 test_no_idle_hz=1 shuffle_interval=3 stutter=5 irqreader=1 fqs_duration=0 fqs_holdoff=0 fqs_stutter=3 test_boost=1/0 test_boost_interval=7 test_boost_duration=4
+	rcu-torture: rtc:           (null) ver: 155441 tfle: 0 rta: 155441 rtaf: 8884 rtf: 155440 rtmbe: 0 rtbke: 0 rtbre: 0 rtbf: 0 rtb: 0 nt: 3055767
+	rcu-torture: Reader Pipe:  727860534 34213 0 0 0 0 0 0 0 0 0
+	rcu-torture: Reader Batch:  727877838 17003 0 0 0 0 0 0 0 0 0
+	rcu-torture: Free-Block Circulation:  155440 155440 155440 155440 155440 155440 155440 155440 155440 155440 0
+	rcu-torture:--- End of test: SUCCESS: nreaders=16 nfakewriters=4 stat_interval=30 verbose=0 test_no_idle_hz=1 shuffle_interval=3 stutter=5 irqreader=1 fqs_duration=0 fqs_holdoff=0 fqs_stutter=3 test_boost=1/0 test_boost_interval=7 test_boost_duration=4
 
 The command "dmesg | grep torture:" will extract this information on
 most systems.  On more esoteric configurations, it may be necessary to
@@ -113,26 +159,55 @@ use other commands to access the output of the printk()s used by
 the RCU torture test.  The printk()s use KERN_ALERT, so they should
 be evident.  ;-)
 
+The first and last lines show the rcutorture module parameters, and the
+last line shows either "SUCCESS" or "FAILURE", based on rcutorture's
+automatic determination as to whether RCU operated correctly.
+
 The entries are as follows:
 
 o	"rtc": The hexadecimal address of the structure currently visible
 	to readers.
 
-o	"ver": The number of times since boot that the rcutw writer task
+o	"ver": The number of times since boot that the RCU writer task
 	has changed the structure visible to readers.
 
 o	"tfle": If non-zero, indicates that the "torture freelist"
-	containing structure to be placed into the "rtc" area is empty.
+	containing structures to be placed into the "rtc" area is empty.
 	This condition is important, since it can fool you into thinking
 	that RCU is working when it is not.  :-/
 
 o	"rta": Number of structures allocated from the torture freelist.
 
 o	"rtaf": Number of allocations from the torture freelist that have
-	failed due to the list being empty.
+	failed due to the list being empty.  It is not unusual for this
+	to be non-zero, but it is bad for it to be a large fraction of
+	the value indicated by "rta".
 
 o	"rtf": Number of frees into the torture freelist.
 
+o	"rtmbe": A non-zero value indicates that rcutorture believes that
+	rcu_assign_pointer() and rcu_dereference() are not working
+	correctly.  This value should be zero.
+
+o	"rtbke": rcutorture was unable to create the real-time kthreads
+	used to force RCU priority inversion.  This value should be zero.
+
+o	"rtbre": Although rcutorture successfully created the kthreads
+	used to force RCU priority inversion, it was unable to set them
+	to the real-time priority level of 1.  This value should be zero.
+
+o	"rtbf": The number of times that RCU priority boosting failed
+	to resolve RCU priority inversion.
+
+o	"rtb": The number of times that rcutorture attempted to force
+	an RCU priority inversion condition.  If you are testing RCU
+	priority boosting via the "test_boost" module parameter, this
+	value should be non-zero.
+
+o	"nt": The number of times rcutorture ran RCU read-side code from
+	within a timer handler.  This value should be non-zero only
+	if you specified the "irqreader" module parameter.
+
 o	"Reader Pipe": Histogram of "ages" of structures seen by readers.
 	If any entries past the first two are non-zero, RCU is broken.
 	And rcutorture prints the error flag string "!!!" to make sure
@@ -162,26 +237,15 @@ o	"Free-Block Circulation": Shows the number of torture structures
 	somehow gets incremented farther than it should.
 
 Different implementations of RCU can provide implementation-specific
-additional information.  For example, SRCU provides the following:
+additional information.  For example, SRCU provides the following
+additional line:
 
-	srcu-torture: rtc: f8cf46a8 ver: 355 tfle: 0 rta: 356 rtaf: 0 rtf: 346 rtmbe: 0
-	srcu-torture: Reader Pipe:  559738 939 0 0 0 0 0 0 0 0 0
-	srcu-torture: Reader Batch:  560434 243 0 0 0 0 0 0 0 0
-	srcu-torture: Free-Block Circulation:  355 354 353 352 351 350 349 348 347 346 0
 	srcu-torture: per-CPU(idx=1): 0(0,1) 1(0,1) 2(0,0) 3(0,1)
 
-The first four lines are similar to those for RCU.  The last line shows
-the per-CPU counter state.  The numbers in parentheses are the values
-of the "old" and "current" counters for the corresponding CPU.  The
-"idx" value maps the "old" and "current" values to the underlying array,
-and is useful for debugging.
-
-Similarly, sched_expedited RCU provides the following:
-
-	sched_expedited-torture: rtc: d0000000016c1880 ver: 1090796 tfle: 0 rta: 1090796 rtaf: 0 rtf: 1090787 rtmbe: 0 nt: 27713319
-	sched_expedited-torture: Reader Pipe:  12660320201 95875 0 0 0 0 0 0 0 0 0
-	sched_expedited-torture: Reader Batch:  12660424885 0 0 0 0 0 0 0 0 0 0
-	sched_expedited-torture: Free-Block Circulation:  1090795 1090795 1090794 1090793 1090792 1090791 1090790 1090789 1090788 1090787 0
+This line shows the per-CPU counter state.  The numbers in parentheses are
+the values of the "old" and "current" counters for the corresponding CPU.
+The "idx" value maps the "old" and "current" values to the underlying
+array, and is useful for debugging.
 
 
 USAGE
-- 
1.7.3.2


* [PATCH tip/core/rcu 07/55] rcu: Fix mismatched variable in rcutree_trace.c
From: Paul E. McKenney @ 2011-09-06 18:00 UTC
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, patches, Andi Kleen, Paul E. McKenney

From: Andi Kleen <ak@linux.intel.com>

rcutree.c defines rcu_cpu_kthread_cpu as int, not unsigned int, so
the DECLARE_PER_CPU() declaration in rcutree_trace.c has to follow
suit.
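
To illustrate the rule with a sketch (not the patched code itself):
the type in a DECLARE_PER_CPU() must exactly match that of the
corresponding DEFINE_PER_CPU(), because both refer to the same
per-CPU symbol:

	/* Definition, e.g. in rcutree.c: */
	DEFINE_PER_CPU(int, rcu_cpu_kthread_cpu);

	/* Any declaration elsewhere must use the identical type: */
	DECLARE_PER_CPU(int, rcu_cpu_kthread_cpu);	/* not unsigned int */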

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcutree_trace.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/kernel/rcutree_trace.c b/kernel/rcutree_trace.c
index 4e14487..8827b34 100644
--- a/kernel/rcutree_trace.c
+++ b/kernel/rcutree_trace.c
@@ -49,7 +49,7 @@
 #ifdef CONFIG_RCU_BOOST
 
 DECLARE_PER_CPU(unsigned int, rcu_cpu_kthread_status);
-DECLARE_PER_CPU(unsigned int, rcu_cpu_kthread_cpu);
+DECLARE_PER_CPU(int, rcu_cpu_kthread_cpu);
 DECLARE_PER_CPU(unsigned int, rcu_cpu_kthread_loops);
 DECLARE_PER_CPU(char, rcu_cpu_has_work);
 
-- 
1.7.3.2


* [PATCH tip/core/rcu 08/55] rcu: Abstract common code for RCU grace-period-wait primitives
From: Paul E. McKenney @ 2011-09-06 18:00 UTC
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, patches, Paul E. McKenney

Pull the code that waits for an RCU grace period into a single function,
which is then called by synchronize_rcu() and friends in the case of
TREE_RCU and TREE_PREEMPT_RCU, and from rcu_barrier() and friends in
the case of TINY_RCU and TINY_PREEMPT_RCU.
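
With the common helper in place, each grace-period-wait primitive
reduces to a one-line call, as the hunks below show.  The pattern,
sketched here with a hypothetical RCU flavor, is:

	void synchronize_myflavor(void)
	{
		/* Queue a callback and block until it is invoked. */
		wait_rcu_gp(call_myflavor);
	}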

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 include/linux/rcupdate.h |  130 ++++++++++++++++++++++++++--------------------
 include/linux/rcutiny.h  |   16 +++++-
 include/linux/rcutree.h  |    2 +
 kernel/rcupdate.c        |   21 +++++++-
 kernel/rcutiny.c         |   28 ----------
 kernel/rcutiny_plugin.h  |   14 -----
 kernel/rcutree.c         |   22 +-------
 kernel/rcutree_plugin.h  |   11 +----
 8 files changed, 113 insertions(+), 131 deletions(-)

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 2516555..6433a6f 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -66,11 +66,73 @@ static inline void rcutorture_record_progress(unsigned long vernum)
 #define ULONG_CMP_LT(a, b)	(ULONG_MAX / 2 < (a) - (b))
 
 /* Exported common interfaces */
+
+#ifdef CONFIG_PREEMPT_RCU
+
+/**
+ * call_rcu() - Queue an RCU callback for invocation after a grace period.
+ * @head: structure to be used for queueing the RCU updates.
+ * @func: actual callback function to be invoked after the grace period
+ *
+ * The callback function will be invoked some time after a full grace
+ * period elapses, in other words after all pre-existing RCU read-side
+ * critical sections have completed.  However, the callback function
+ * might well execute concurrently with RCU read-side critical sections
+ * that started after call_rcu() was invoked.  RCU read-side critical
+ * sections are delimited by rcu_read_lock() and rcu_read_unlock(),
+ * and may be nested.
+ */
+extern void call_rcu(struct rcu_head *head,
+			      void (*func)(struct rcu_head *head));
+
+#else /* #ifdef CONFIG_PREEMPT_RCU */
+
+/* In classic RCU, call_rcu() is just call_rcu_sched(). */
+#define	call_rcu	call_rcu_sched
+
+#endif /* #else #ifdef CONFIG_PREEMPT_RCU */
+
+/**
+ * call_rcu_bh() - Queue an RCU for invocation after a quicker grace period.
+ * @head: structure to be used for queueing the RCU updates.
+ * @func: actual callback function to be invoked after the grace period
+ *
+ * The callback function will be invoked some time after a full grace
+ * period elapses, in other words after all currently executing RCU
+ * read-side critical sections have completed. call_rcu_bh() assumes
+ * that the read-side critical sections end on completion of a softirq
+ * handler. This means that read-side critical sections in process
+ * context must not be interrupted by softirqs. This interface is to be
+ * used when most of the read-side critical sections are in softirq context.
+ * RCU read-side critical sections are delimited by :
+ *  - rcu_read_lock() and  rcu_read_unlock(), if in interrupt context.
+ *  OR
+ *  - rcu_read_lock_bh() and rcu_read_unlock_bh(), if in process context.
+ *  These may be nested.
+ */
+extern void call_rcu_bh(struct rcu_head *head,
+			void (*func)(struct rcu_head *head));
+
+/**
+ * call_rcu_sched() - Queue an RCU for invocation after sched grace period.
+ * @head: structure to be used for queueing the RCU updates.
+ * @func: actual callback function to be invoked after the grace period
+ *
+ * The callback function will be invoked some time after a full grace
+ * period elapses, in other words after all currently executing RCU
+ * read-side critical sections have completed. call_rcu_sched() assumes
+ * that the read-side critical sections end on enabling of preemption
+ * or on voluntary preemption.
+ * RCU read-side critical sections are delimited by :
+ *  - rcu_read_lock_sched() and  rcu_read_unlock_sched(),
+ *  OR
+ *  anything that disables preemption.
+ *  These may be nested.
+ */
 extern void call_rcu_sched(struct rcu_head *head,
 			   void (*func)(struct rcu_head *rcu));
+
 extern void synchronize_sched(void);
-extern void rcu_barrier_bh(void);
-extern void rcu_barrier_sched(void);
 
 static inline void __rcu_read_lock_bh(void)
 {
@@ -143,6 +205,15 @@ static inline void rcu_exit_nohz(void)
 
 #endif /* #else #ifdef CONFIG_NO_HZ */
 
+/*
+ * Infrastructure to implement the synchronize_() primitives in
+ * TREE_RCU and rcu_barrier_() primitives in TINY_RCU.
+ */
+
+typedef void call_rcu_func_t(struct rcu_head *head,
+			     void (*func)(struct rcu_head *head));
+void wait_rcu_gp(call_rcu_func_t crf);
+
 #if defined(CONFIG_TREE_RCU) || defined(CONFIG_TREE_PREEMPT_RCU)
 #include <linux/rcutree.h>
 #elif defined(CONFIG_TINY_RCU) || defined(CONFIG_TINY_PREEMPT_RCU)
@@ -723,61 +794,6 @@ static inline notrace void rcu_read_unlock_sched_notrace(void)
 #define RCU_INIT_POINTER(p, v) \
 		p = (typeof(*v) __force __rcu *)(v)
 
-/* Infrastructure to implement the synchronize_() primitives. */
-
-struct rcu_synchronize {
-	struct rcu_head head;
-	struct completion completion;
-};
-
-extern void wakeme_after_rcu(struct rcu_head  *head);
-
-#ifdef CONFIG_PREEMPT_RCU
-
-/**
- * call_rcu() - Queue an RCU callback for invocation after a grace period.
- * @head: structure to be used for queueing the RCU updates.
- * @func: actual callback function to be invoked after the grace period
- *
- * The callback function will be invoked some time after a full grace
- * period elapses, in other words after all pre-existing RCU read-side
- * critical sections have completed.  However, the callback function
- * might well execute concurrently with RCU read-side critical sections
- * that started after call_rcu() was invoked.  RCU read-side critical
- * sections are delimited by rcu_read_lock() and rcu_read_unlock(),
- * and may be nested.
- */
-extern void call_rcu(struct rcu_head *head,
-			      void (*func)(struct rcu_head *head));
-
-#else /* #ifdef CONFIG_PREEMPT_RCU */
-
-/* In classic RCU, call_rcu() is just call_rcu_sched(). */
-#define	call_rcu	call_rcu_sched
-
-#endif /* #else #ifdef CONFIG_PREEMPT_RCU */
-
-/**
- * call_rcu_bh() - Queue an RCU for invocation after a quicker grace period.
- * @head: structure to be used for queueing the RCU updates.
- * @func: actual callback function to be invoked after the grace period
- *
- * The callback function will be invoked some time after a full grace
- * period elapses, in other words after all currently executing RCU
- * read-side critical sections have completed. call_rcu_bh() assumes
- * that the read-side critical sections end on completion of a softirq
- * handler. This means that read-side critical sections in process
- * context must not be interrupted by softirqs. This interface is to be
- * used when most of the read-side critical sections are in softirq context.
- * RCU read-side critical sections are delimited by :
- *  - rcu_read_lock() and  rcu_read_unlock(), if in interrupt context.
- *  OR
- *  - rcu_read_lock_bh() and rcu_read_unlock_bh(), if in process context.
- *  These may be nested.
- */
-extern void call_rcu_bh(struct rcu_head *head,
-			void (*func)(struct rcu_head *head));
-
 /*
  * debug_rcu_head_queue()/debug_rcu_head_unqueue() are used internally
  * by call_rcu() and rcu callback execution, and are therefore not part of the
diff --git a/include/linux/rcutiny.h b/include/linux/rcutiny.h
index 52b3e02..4eab233 100644
--- a/include/linux/rcutiny.h
+++ b/include/linux/rcutiny.h
@@ -31,6 +31,16 @@ static inline void rcu_init(void)
 {
 }
 
+static inline void rcu_barrier_bh(void)
+{
+	wait_rcu_gp(call_rcu_bh);
+}
+
+static inline void rcu_barrier_sched(void)
+{
+	wait_rcu_gp(call_rcu_sched);
+}
+
 #ifdef CONFIG_TINY_RCU
 
 static inline void synchronize_rcu_expedited(void)
@@ -45,9 +55,13 @@ static inline void rcu_barrier(void)
 
 #else /* #ifdef CONFIG_TINY_RCU */
 
-void rcu_barrier(void);
 void synchronize_rcu_expedited(void);
 
+static inline void rcu_barrier(void)
+{
+	wait_rcu_gp(call_rcu);
+}
+
 #endif /* #else #ifdef CONFIG_TINY_RCU */
 
 static inline void synchronize_rcu_bh(void)
diff --git a/include/linux/rcutree.h b/include/linux/rcutree.h
index e65d066..6745846 100644
--- a/include/linux/rcutree.h
+++ b/include/linux/rcutree.h
@@ -67,6 +67,8 @@ static inline void synchronize_rcu_bh_expedited(void)
 }
 
 extern void rcu_barrier(void);
+extern void rcu_barrier_bh(void);
+extern void rcu_barrier_sched(void);
 
 extern unsigned long rcutorture_testseq;
 extern unsigned long rcutorture_vernum;
diff --git a/kernel/rcupdate.c b/kernel/rcupdate.c
index 7784bd2..a088c90 100644
--- a/kernel/rcupdate.c
+++ b/kernel/rcupdate.c
@@ -94,11 +94,16 @@ EXPORT_SYMBOL_GPL(rcu_read_lock_bh_held);
 
 #endif /* #ifdef CONFIG_DEBUG_LOCK_ALLOC */
 
+struct rcu_synchronize {
+	struct rcu_head head;
+	struct completion completion;
+};
+
 /*
  * Awaken the corresponding synchronize_rcu() instance now that a
  * grace period has elapsed.
  */
-void wakeme_after_rcu(struct rcu_head  *head)
+static void wakeme_after_rcu(struct rcu_head  *head)
 {
 	struct rcu_synchronize *rcu;
 
@@ -106,6 +111,20 @@ void wakeme_after_rcu(struct rcu_head  *head)
 	complete(&rcu->completion);
 }
 
+void wait_rcu_gp(call_rcu_func_t crf)
+{
+	struct rcu_synchronize rcu;
+
+	init_rcu_head_on_stack(&rcu.head);
+	init_completion(&rcu.completion);
+	/* Will wake me after RCU finished. */
+	crf(&rcu.head, wakeme_after_rcu);
+	/* Wait for it. */
+	wait_for_completion(&rcu.completion);
+	destroy_rcu_head_on_stack(&rcu.head);
+}
+EXPORT_SYMBOL_GPL(wait_rcu_gp);
+
 #ifdef CONFIG_PROVE_RCU
 /*
  * wrapper function to avoid #include problems.
diff --git a/kernel/rcutiny.c b/kernel/rcutiny.c
index 7bbac7d..f544e34 100644
--- a/kernel/rcutiny.c
+++ b/kernel/rcutiny.c
@@ -281,34 +281,6 @@ void call_rcu_bh(struct rcu_head *head, void (*func)(struct rcu_head *rcu))
 }
 EXPORT_SYMBOL_GPL(call_rcu_bh);
 
-void rcu_barrier_bh(void)
-{
-	struct rcu_synchronize rcu;
-
-	init_rcu_head_on_stack(&rcu.head);
-	init_completion(&rcu.completion);
-	/* Will wake me after RCU finished. */
-	call_rcu_bh(&rcu.head, wakeme_after_rcu);
-	/* Wait for it. */
-	wait_for_completion(&rcu.completion);
-	destroy_rcu_head_on_stack(&rcu.head);
-}
-EXPORT_SYMBOL_GPL(rcu_barrier_bh);
-
-void rcu_barrier_sched(void)
-{
-	struct rcu_synchronize rcu;
-
-	init_rcu_head_on_stack(&rcu.head);
-	init_completion(&rcu.completion);
-	/* Will wake me after RCU finished. */
-	call_rcu_sched(&rcu.head, wakeme_after_rcu);
-	/* Wait for it. */
-	wait_for_completion(&rcu.completion);
-	destroy_rcu_head_on_stack(&rcu.head);
-}
-EXPORT_SYMBOL_GPL(rcu_barrier_sched);
-
 /*
  * Spawn the kthread that invokes RCU callbacks.
  */
diff --git a/kernel/rcutiny_plugin.h b/kernel/rcutiny_plugin.h
index f259c67..6b0cedb 100644
--- a/kernel/rcutiny_plugin.h
+++ b/kernel/rcutiny_plugin.h
@@ -697,20 +697,6 @@ void call_rcu(struct rcu_head *head, void (*func)(struct rcu_head *rcu))
 }
 EXPORT_SYMBOL_GPL(call_rcu);
 
-void rcu_barrier(void)
-{
-	struct rcu_synchronize rcu;
-
-	init_rcu_head_on_stack(&rcu.head);
-	init_completion(&rcu.completion);
-	/* Will wake me after RCU finished. */
-	call_rcu(&rcu.head, wakeme_after_rcu);
-	/* Wait for it. */
-	wait_for_completion(&rcu.completion);
-	destroy_rcu_head_on_stack(&rcu.head);
-}
-EXPORT_SYMBOL_GPL(rcu_barrier);
-
 /*
  * synchronize_rcu - wait until a grace period has elapsed.
  *
diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index ba06207..a7c6bce 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -1613,18 +1613,9 @@ EXPORT_SYMBOL_GPL(call_rcu_bh);
  */
 void synchronize_sched(void)
 {
-	struct rcu_synchronize rcu;
-
 	if (rcu_blocking_is_gp())
 		return;
-
-	init_rcu_head_on_stack(&rcu.head);
-	init_completion(&rcu.completion);
-	/* Will wake me after RCU finished. */
-	call_rcu_sched(&rcu.head, wakeme_after_rcu);
-	/* Wait for it. */
-	wait_for_completion(&rcu.completion);
-	destroy_rcu_head_on_stack(&rcu.head);
+	wait_rcu_gp(call_rcu_sched);
 }
 EXPORT_SYMBOL_GPL(synchronize_sched);
 
@@ -1639,18 +1630,9 @@ EXPORT_SYMBOL_GPL(synchronize_sched);
  */
 void synchronize_rcu_bh(void)
 {
-	struct rcu_synchronize rcu;
-
 	if (rcu_blocking_is_gp())
 		return;
-
-	init_rcu_head_on_stack(&rcu.head);
-	init_completion(&rcu.completion);
-	/* Will wake me after RCU finished. */
-	call_rcu_bh(&rcu.head, wakeme_after_rcu);
-	/* Wait for it. */
-	wait_for_completion(&rcu.completion);
-	destroy_rcu_head_on_stack(&rcu.head);
+	wait_rcu_gp(call_rcu_bh);
 }
 EXPORT_SYMBOL_GPL(synchronize_rcu_bh);
 
diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
index 9703298..43daa46 100644
--- a/kernel/rcutree_plugin.h
+++ b/kernel/rcutree_plugin.h
@@ -656,18 +656,9 @@ EXPORT_SYMBOL_GPL(call_rcu);
  */
 void synchronize_rcu(void)
 {
-	struct rcu_synchronize rcu;
-
 	if (!rcu_scheduler_active)
 		return;
-
-	init_rcu_head_on_stack(&rcu.head);
-	init_completion(&rcu.completion);
-	/* Will wake me after RCU finished. */
-	call_rcu(&rcu.head, wakeme_after_rcu);
-	/* Wait for it. */
-	wait_for_completion(&rcu.completion);
-	destroy_rcu_head_on_stack(&rcu.head);
+	wait_rcu_gp(call_rcu);
 }
 EXPORT_SYMBOL_GPL(synchronize_rcu);
 
-- 
1.7.3.2


* [PATCH tip/core/rcu 09/55] rcu: Catch rcutorture up to new RCU API additions
From: Paul E. McKenney @ 2011-09-06 18:00 UTC
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, patches, Paul E. McKenney

Now that the RCU API contains synchronize_rcu_bh(), synchronize_sched(),
call_rcu_sched(), and synchronize_rcu_bh_expedited(), rcutorture can
exercise them directly.

Make rcutorture test synchronize_rcu_bh(), getting rid of the old
rcu_bh_torture_synchronize() workaround.  Similarly, make rcutorture test
synchronize_sched(), getting rid of the old sched_torture_synchronize()
workaround.  Make rcutorture test call_rcu_sched() instead of wrapping
synchronize_sched().  Also add testing of synchronize_rcu_bh_expedited().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 Documentation/RCU/torture.txt |    3 ++
 kernel/rcutorture.c           |   55 +++++++++++++++-------------------------
 2 files changed, 24 insertions(+), 34 deletions(-)

diff --git a/Documentation/RCU/torture.txt b/Documentation/RCU/torture.txt
index 4205ed1..783d6c1 100644
--- a/Documentation/RCU/torture.txt
+++ b/Documentation/RCU/torture.txt
@@ -122,6 +122,9 @@ torture_type	The type of RCU to test, with string values as follows:
 		"rcu_bh_sync": rcu_read_lock_bh(), rcu_read_unlock_bh(),
 			and synchronize_rcu_bh().
 
+		"rcu_bh_expedited": rcu_read_lock_bh(), rcu_read_unlock_bh(),
+			and synchronize_rcu_bh_expedited().
+
 		"srcu": srcu_read_lock(), srcu_read_unlock() and
 			synchronize_srcu().
 
diff --git a/kernel/rcutorture.c b/kernel/rcutorture.c
index 920eb50..e0f39ea 100644
--- a/kernel/rcutorture.c
+++ b/kernel/rcutorture.c
@@ -480,30 +480,6 @@ static void rcu_bh_torture_deferred_free(struct rcu_torture *p)
 	call_rcu_bh(&p->rtort_rcu, rcu_torture_cb);
 }
 
-struct rcu_bh_torture_synchronize {
-	struct rcu_head head;
-	struct completion completion;
-};
-
-static void rcu_bh_torture_wakeme_after_cb(struct rcu_head *head)
-{
-	struct rcu_bh_torture_synchronize *rcu;
-
-	rcu = container_of(head, struct rcu_bh_torture_synchronize, head);
-	complete(&rcu->completion);
-}
-
-static void rcu_bh_torture_synchronize(void)
-{
-	struct rcu_bh_torture_synchronize rcu;
-
-	init_rcu_head_on_stack(&rcu.head);
-	init_completion(&rcu.completion);
-	call_rcu_bh(&rcu.head, rcu_bh_torture_wakeme_after_cb);
-	wait_for_completion(&rcu.completion);
-	destroy_rcu_head_on_stack(&rcu.head);
-}
-
 static struct rcu_torture_ops rcu_bh_ops = {
 	.init		= NULL,
 	.cleanup	= NULL,
@@ -512,7 +488,7 @@ static struct rcu_torture_ops rcu_bh_ops = {
 	.readunlock	= rcu_bh_torture_read_unlock,
 	.completed	= rcu_bh_torture_completed,
 	.deferred_free	= rcu_bh_torture_deferred_free,
-	.sync		= rcu_bh_torture_synchronize,
+	.sync		= synchronize_rcu_bh,
 	.cb_barrier	= rcu_barrier_bh,
 	.fqs		= rcu_bh_force_quiescent_state,
 	.stats		= NULL,
@@ -528,7 +504,7 @@ static struct rcu_torture_ops rcu_bh_sync_ops = {
 	.readunlock	= rcu_bh_torture_read_unlock,
 	.completed	= rcu_bh_torture_completed,
 	.deferred_free	= rcu_sync_torture_deferred_free,
-	.sync		= rcu_bh_torture_synchronize,
+	.sync		= synchronize_rcu_bh,
 	.cb_barrier	= NULL,
 	.fqs		= rcu_bh_force_quiescent_state,
 	.stats		= NULL,
@@ -536,6 +512,22 @@ static struct rcu_torture_ops rcu_bh_sync_ops = {
 	.name		= "rcu_bh_sync"
 };
 
+static struct rcu_torture_ops rcu_bh_expedited_ops = {
+	.init		= rcu_sync_torture_init,
+	.cleanup	= NULL,
+	.readlock	= rcu_bh_torture_read_lock,
+	.read_delay	= rcu_read_delay,  /* just reuse rcu's version. */
+	.readunlock	= rcu_bh_torture_read_unlock,
+	.completed	= rcu_bh_torture_completed,
+	.deferred_free	= rcu_sync_torture_deferred_free,
+	.sync		= synchronize_rcu_bh_expedited,
+	.cb_barrier	= NULL,
+	.fqs		= rcu_bh_force_quiescent_state,
+	.stats		= NULL,
+	.irq_capable	= 1,
+	.name		= "rcu_bh_expedited"
+};
+
 /*
  * Definitions for srcu torture testing.
  */
@@ -659,11 +651,6 @@ static void rcu_sched_torture_deferred_free(struct rcu_torture *p)
 	call_rcu_sched(&p->rtort_rcu, rcu_torture_cb);
 }
 
-static void sched_torture_synchronize(void)
-{
-	synchronize_sched();
-}
-
 static struct rcu_torture_ops sched_ops = {
 	.init		= rcu_sync_torture_init,
 	.cleanup	= NULL,
@@ -672,7 +659,7 @@ static struct rcu_torture_ops sched_ops = {
 	.readunlock	= sched_torture_read_unlock,
 	.completed	= rcu_no_completed,
 	.deferred_free	= rcu_sched_torture_deferred_free,
-	.sync		= sched_torture_synchronize,
+	.sync		= synchronize_sched,
 	.cb_barrier	= rcu_barrier_sched,
 	.fqs		= rcu_sched_force_quiescent_state,
 	.stats		= NULL,
@@ -688,7 +675,7 @@ static struct rcu_torture_ops sched_sync_ops = {
 	.readunlock	= sched_torture_read_unlock,
 	.completed	= rcu_no_completed,
 	.deferred_free	= rcu_sync_torture_deferred_free,
-	.sync		= sched_torture_synchronize,
+	.sync		= synchronize_sched,
 	.cb_barrier	= NULL,
 	.fqs		= rcu_sched_force_quiescent_state,
 	.stats		= NULL,
@@ -1427,7 +1414,7 @@ rcu_torture_init(void)
 	int firsterr = 0;
 	static struct rcu_torture_ops *torture_ops[] =
 		{ &rcu_ops, &rcu_sync_ops, &rcu_expedited_ops,
-		  &rcu_bh_ops, &rcu_bh_sync_ops,
+		  &rcu_bh_ops, &rcu_bh_sync_ops, &rcu_bh_expedited_ops,
 		  &srcu_ops, &srcu_expedited_ops,
 		  &sched_ops, &sched_sync_ops, &sched_expedited_ops, };
 
-- 
1.7.3.2


^ permalink raw reply related	[flat|nested] 101+ messages in thread

* [PATCH tip/core/rcu 10/55] rcu: Fix RCU's NMI documentation
  2011-09-06 18:00 [PATCH tip/core/rcu 0/55] Preview of RCU changes for 3.2 Paul E. McKenney
                   ` (8 preceding siblings ...)
  2011-09-06 18:00 ` [PATCH tip/core/rcu 09/55] rcu: Catch rcutorture up to new RCU API additions Paul E. McKenney
@ 2011-09-06 18:00 ` Paul E. McKenney
  2011-09-06 18:00 ` [PATCH tip/core/rcu 11/55] rcu: Drive configuration directly from SMP and PREEMPT Paul E. McKenney
                   ` (46 subsequent siblings)
  56 siblings, 0 replies; 101+ messages in thread
From: Paul E. McKenney @ 2011-09-06 18:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, patches, Paul E. McKenney

It has long been the case that the architecture must call nmi_enter()
and nmi_exit() rather than irq_enter() and irq_exit() in order to
permit RCU read-side critical sections in NMIs.  Catch the documentation
up with reality.
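
For illustration, a sketch of why this matters (my_nmi_handler() and its
data structure are hypothetical, not taken from NMI-RCU.txt): an NMI
handler may legitimately use RCU read-side primitives only because the
architecture's NMI entry path has already invoked nmi_enter():

	struct nmi_data {
		void (*handler)(struct nmi_data *);	/* hypothetical hook */
	};
	static struct nmi_data __rcu *nmi_datap;

	/* Runs between the architecture's nmi_enter() and nmi_exit(). */
	static void my_nmi_handler(void)
	{
		struct nmi_data *p;

		rcu_read_lock();
		p = rcu_dereference(nmi_datap);
		if (p)
			p->handler(p);
		rcu_read_unlock();
	}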

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
---
 Documentation/RCU/NMI-RCU.txt |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/Documentation/RCU/NMI-RCU.txt b/Documentation/RCU/NMI-RCU.txt
index a8536cb..84e4f9c 100644
--- a/Documentation/RCU/NMI-RCU.txt
+++ b/Documentation/RCU/NMI-RCU.txt
@@ -95,7 +95,7 @@ not to return until all ongoing NMI handlers exit.  It is therefore safe
 to free up the handler's data as soon as synchronize_sched() returns.
 
 Important note: for this to work, the architecture in question must
-invoke irq_enter() and irq_exit() on NMI entry and exit, respectively.
+invoke nmi_enter() and nmi_exit() on NMI entry and exit, respectively.
 
 
 Answer to Quick Quiz
-- 
1.7.3.2


^ permalink raw reply related	[flat|nested] 101+ messages in thread

* [PATCH tip/core/rcu 11/55] rcu: Drive configuration directly from SMP and PREEMPT
  2011-09-06 18:00 [PATCH tip/core/rcu 0/55] Preview of RCU changes for 3.2 Paul E. McKenney
                   ` (9 preceding siblings ...)
  2011-09-06 18:00 ` [PATCH tip/core/rcu 10/55] rcu: Fix RCU's NMI documentation Paul E. McKenney
@ 2011-09-06 18:00 ` Paul E. McKenney
  2011-09-06 18:00 ` [PATCH tip/core/rcu 12/55] rcu: Fix pathnames in documentation Paul E. McKenney
                   ` (45 subsequent siblings)
  56 siblings, 0 replies; 101+ messages in thread
From: Paul E. McKenney @ 2011-09-06 18:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, patches, Paul E. McKenney

This commit eliminates the possibility of running TREE_PREEMPT_RCU
when SMP=n and of running TINY_RCU when PREEMPT=y.  People who really
want these combinations can hand-edit init/Kconfig, but eliminating
them as choices for production systems reduces the amount of testing
required.  It will also allow cutting out a few #ifdefs.

Note that running TREE_RCU and TINY_RCU on single-CPU systems using
SMP-built kernels is still supported.
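
The resulting mapping from SMP and PREEMPT to RCU implementation is then
(a summary of the new "depends on" lines, not itself part of the patch):

	SMP=y PREEMPT=n  ->  TREE_RCU
	SMP=y PREEMPT=y  ->  TREE_PREEMPT_RCU
	SMP=n PREEMPT=n  ->  TINY_RCU
	SMP=n PREEMPT=y  ->  TINY_PREEMPT_RCU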

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 init/Kconfig |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/init/Kconfig b/init/Kconfig
index 412c21b..e22a691 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -391,7 +391,7 @@ config TREE_RCU
 
 config TREE_PREEMPT_RCU
 	bool "Preemptible tree-based hierarchical RCU"
-	depends on PREEMPT
+	depends on PREEMPT && SMP
 	help
 	  This option selects the RCU implementation that is
 	  designed for very large SMP systems with hundreds or
@@ -401,7 +401,7 @@ config TREE_PREEMPT_RCU
 
 config TINY_RCU
 	bool "UP-only small-memory-footprint RCU"
-	depends on !SMP
+	depends on !PREEMPT && !SMP
 	help
 	  This option selects the RCU implementation that is
 	  designed for UP systems from which real-time response
@@ -410,7 +410,7 @@ config TINY_RCU
 
 config TINY_PREEMPT_RCU
 	bool "Preemptible UP-only small-memory-footprint RCU"
-	depends on !SMP && PREEMPT
+	depends on PREEMPT && !SMP
 	help
 	  This option selects the RCU implementation that is designed
 	  for real-time UP systems.  This option greatly reduces the
-- 
1.7.3.2


^ permalink raw reply related	[flat|nested] 101+ messages in thread

* [PATCH tip/core/rcu 12/55] rcu: Fix pathnames in documentation
  2011-09-06 18:00 [PATCH tip/core/rcu 0/55] Preview of RCU changes for 3.2 Paul E. McKenney
                   ` (10 preceding siblings ...)
  2011-09-06 18:00 ` [PATCH tip/core/rcu 11/55] rcu: Drive configuration directly from SMP and PREEMPT Paul E. McKenney
@ 2011-09-06 18:00 ` Paul E. McKenney
  2011-09-06 18:00 ` [PATCH tip/core/rcu 13/55] rcu: Don't destroy rcu_torture_boost() callback until it is done Paul E. McKenney
                   ` (44 subsequent siblings)
  56 siblings, 0 replies; 101+ messages in thread
From: Paul E. McKenney @ 2011-09-06 18:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, patches, Wanlong Gao, Paul E. McKenney

From: Wanlong Gao <wanlong.gao@gmail.com>

The old "arch/i386" has long since become "arch/x86", so fix the
RCU documentation accordingly.

Signed-off-by: Wanlong Gao <wanlong.gao@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 Documentation/RCU/NMI-RCU.txt |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/RCU/NMI-RCU.txt b/Documentation/RCU/NMI-RCU.txt
index 84e4f9c..687777f 100644
--- a/Documentation/RCU/NMI-RCU.txt
+++ b/Documentation/RCU/NMI-RCU.txt
@@ -5,8 +5,8 @@ Although RCU is usually used to protect read-mostly data structures,
 it is possible to use RCU to provide dynamic non-maskable interrupt
 handlers, as well as dynamic irq handlers.  This document describes
 how to do this, drawing loosely from Zwane Mwaikambo's NMI-timer
-work in "arch/i386/oprofile/nmi_timer_int.c" and in
-"arch/i386/kernel/traps.c".
+work in "arch/x86/oprofile/nmi_timer_int.c" and in
+"arch/x86/kernel/traps.c".
 
 The relevant pieces of code are listed below, each followed by a
 brief explanation.
-- 
1.7.3.2


^ permalink raw reply related	[flat|nested] 101+ messages in thread

* [PATCH tip/core/rcu 13/55] rcu: Don't destroy rcu_torture_boost() callback until it is done
  2011-09-06 18:00 [PATCH tip/core/rcu 0/55] Preview of RCU changes for 3.2 Paul E. McKenney
                   ` (11 preceding siblings ...)
  2011-09-06 18:00 ` [PATCH tip/core/rcu 12/55] rcu: Fix pathnames in documentation Paul E. McKenney
@ 2011-09-06 18:00 ` Paul E. McKenney
  2011-09-06 18:00 ` [PATCH tip/core/rcu 14/55] rcu: Add event-tracing for RCU callback invocation Paul E. McKenney
                   ` (43 subsequent siblings)
  56 siblings, 0 replies; 101+ messages in thread
From: Paul E. McKenney @ 2011-09-06 18:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, patches, Paul E. McKenney, Paul E. McKenney

From: Paul E. McKenney <paul.mckenney@linaro.org>

The rcu_torture_boost() cleanup code destroyed debug-objects state before
waiting for the last RCU callback to be invoked, resulting in rare but
very real debug-objects warnings.  Move the destruction to after the
waiting to fix this problem.
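
The required lifetime rule for an on-stack rcu_head, in sketch form
(my_cb() stands in for the real callback):

	struct rcu_head rh;		/* on this task's stack */

	init_rcu_head_on_stack(&rh);
	call_rcu(&rh, my_cb);
	/* ... wait until my_cb() is known to have completed ... */
	destroy_rcu_head_on_stack(&rh);	/* only now is this safe */
	/* ... and only then may the stack frame go away. */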

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcutorture.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/kernel/rcutorture.c b/kernel/rcutorture.c
index e0f39ea..1194d6c 100644
--- a/kernel/rcutorture.c
+++ b/kernel/rcutorture.c
@@ -796,11 +796,11 @@ checkwait:	rcu_stutter_wait("rcu_torture_boost");
 
 	/* Clean up and exit. */
 	VERBOSE_PRINTK_STRING("rcu_torture_boost task stopping");
-	destroy_rcu_head_on_stack(&rbi.rcu);
 	rcutorture_shutdown_absorb("rcu_torture_boost");
 	while (!kthread_should_stop() || rbi.inflight)
 		schedule_timeout_uninterruptible(1);
 	smp_mb(); /* order accesses to ->inflight before stack-frame death. */
+	destroy_rcu_head_on_stack(&rbi.rcu);
 	return 0;
 }
 
-- 
1.7.3.2


^ permalink raw reply related	[flat|nested] 101+ messages in thread

* [PATCH tip/core/rcu 14/55] rcu: Add event-tracing for RCU callback invocation
  2011-09-06 18:00 [PATCH tip/core/rcu 0/55] Preview of RCU changes for 3.2 Paul E. McKenney
                   ` (12 preceding siblings ...)
  2011-09-06 18:00 ` [PATCH tip/core/rcu 13/55] rcu: Don't destroy rcu_torture_boost() callback until it is done Paul E. McKenney
@ 2011-09-06 18:00 ` Paul E. McKenney
  2011-09-06 18:00 ` [PATCH tip/core/rcu 15/55] rcu: Event-trace markers for computing RCU CPU utilization Paul E. McKenney
                   ` (42 subsequent siblings)
  56 siblings, 0 replies; 101+ messages in thread
From: Paul E. McKenney @ 2011-09-06 18:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, patches, Paul E. McKenney, Paul E. McKenney

From: Paul E. McKenney <paul.mckenney@linaro.org>

There was recently some controversy about the overhead of invoking RCU
callbacks.  Add TRACE_EVENT()s to obtain fine-grained timings for the
start and stop of a batch of callbacks and also for each callback invoked.
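
The reason for a separate rcu_invoke_kfree_callback event is kfree_rcu()'s
encoding trick, visible in the __rcu_reclaim() code moved below: a
"function pointer" numerically below 4096 is really the offset of the
rcu_head within its enclosing structure.  In sketch form:

	/* Enqueue side: __kfree_rcu() smuggles the offset as the callback. */
	call_rcu(head, (rcu_callback)offset);		/* offset < 4096 */

	/* Invoke side: __rcu_reclaim() decodes it. */
	if (__is_kfree_rcu_offset((unsigned long)head->func))
		kfree((void *)head - (unsigned long)head->func);
	else
		head->func(head);			/* ordinary callback */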

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 include/linux/rcupdate.h   |   50 ----------------------
 include/trace/events/rcu.h |   98 ++++++++++++++++++++++++++++++++++++++++++++
 kernel/rcu.h               |   79 +++++++++++++++++++++++++++++++++++
 kernel/rcupdate.c          |    5 ++
 kernel/rcutiny.c           |   26 +++++++++++-
 kernel/rcutree.c           |   15 +++++-
 6 files changed, 219 insertions(+), 54 deletions(-)
 create mode 100644 include/trace/events/rcu.h
 create mode 100644 kernel/rcu.h

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 6433a6f..c61a535 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -794,44 +794,6 @@ static inline notrace void rcu_read_unlock_sched_notrace(void)
 #define RCU_INIT_POINTER(p, v) \
 		p = (typeof(*v) __force __rcu *)(v)
 
-/*
- * debug_rcu_head_queue()/debug_rcu_head_unqueue() are used internally
- * by call_rcu() and rcu callback execution, and are therefore not part of the
- * RCU API. Leaving in rcupdate.h because they are used by all RCU flavors.
- */
-
-#ifdef CONFIG_DEBUG_OBJECTS_RCU_HEAD
-# define STATE_RCU_HEAD_READY	0
-# define STATE_RCU_HEAD_QUEUED	1
-
-extern struct debug_obj_descr rcuhead_debug_descr;
-
-static inline void debug_rcu_head_queue(struct rcu_head *head)
-{
-	WARN_ON_ONCE((unsigned long)head & 0x3);
-	debug_object_activate(head, &rcuhead_debug_descr);
-	debug_object_active_state(head, &rcuhead_debug_descr,
-				  STATE_RCU_HEAD_READY,
-				  STATE_RCU_HEAD_QUEUED);
-}
-
-static inline void debug_rcu_head_unqueue(struct rcu_head *head)
-{
-	debug_object_active_state(head, &rcuhead_debug_descr,
-				  STATE_RCU_HEAD_QUEUED,
-				  STATE_RCU_HEAD_READY);
-	debug_object_deactivate(head, &rcuhead_debug_descr);
-}
-#else	/* !CONFIG_DEBUG_OBJECTS_RCU_HEAD */
-static inline void debug_rcu_head_queue(struct rcu_head *head)
-{
-}
-
-static inline void debug_rcu_head_unqueue(struct rcu_head *head)
-{
-}
-#endif	/* #else !CONFIG_DEBUG_OBJECTS_RCU_HEAD */
-
 static __always_inline bool __is_kfree_rcu_offset(unsigned long offset)
 {
 	return offset < 4096;
@@ -850,18 +812,6 @@ void __kfree_rcu(struct rcu_head *head, unsigned long offset)
 	call_rcu(head, (rcu_callback)offset);
 }
 
-extern void kfree(const void *);
-
-static inline void __rcu_reclaim(struct rcu_head *head)
-{
-	unsigned long offset = (unsigned long)head->func;
-
-	if (__is_kfree_rcu_offset(offset))
-		kfree((void *)head - offset);
-	else
-		head->func(head);
-}
-
 /**
  * kfree_rcu() - kfree an object after a grace period.
  * @ptr:	pointer to kfree
diff --git a/include/trace/events/rcu.h b/include/trace/events/rcu.h
new file mode 100644
index 0000000..db3f6e9
--- /dev/null
+++ b/include/trace/events/rcu.h
@@ -0,0 +1,98 @@
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM rcu
+
+#if !defined(_TRACE_RCU_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_RCU_H
+
+#include <linux/tracepoint.h>
+
+/*
+ * Tracepoint for calling rcu_do_batch, performed to start callback invocation:
+ */
+TRACE_EVENT(rcu_batch_start,
+
+	TP_PROTO(long callbacks_ready, int blimit),
+
+	TP_ARGS(callbacks_ready, blimit),
+
+	TP_STRUCT__entry(
+		__field(	long,	callbacks_ready		)
+		__field(	int,	blimit			)
+	),
+
+	TP_fast_assign(
+		__entry->callbacks_ready	= callbacks_ready;
+		__entry->blimit			= blimit;
+	),
+
+	TP_printk("CBs=%ld bl=%d", __entry->callbacks_ready, __entry->blimit)
+);
+
+/*
+ * Tracepoint for the invocation of a single RCU callback
+ */
+TRACE_EVENT(rcu_invoke_callback,
+
+	TP_PROTO(struct rcu_head *rhp),
+
+	TP_ARGS(rhp),
+
+	TP_STRUCT__entry(
+		__field(	void *,	rhp	)
+		__field(	void *,	func	)
+	),
+
+	TP_fast_assign(
+		__entry->rhp	= rhp;
+		__entry->func	= rhp->func;
+	),
+
+	TP_printk("rhp=%p func=%pf", __entry->rhp, __entry->func)
+);
+
+/*
+ * Tracepoint for the invocation of a single RCU kfree callback
+ */
+TRACE_EVENT(rcu_invoke_kfree_callback,
+
+	TP_PROTO(struct rcu_head *rhp, unsigned long offset),
+
+	TP_ARGS(rhp, offset),
+
+	TP_STRUCT__entry(
+		__field(void *,	rhp	)
+		__field(unsigned long,	offset	)
+	),
+
+	TP_fast_assign(
+		__entry->rhp	= rhp;
+		__entry->offset	= offset;
+	),
+
+	TP_printk("rhp=%p func=%ld", __entry->rhp, __entry->offset)
+);
+
+/*
+ * Tracepoint for leaving rcu_do_batch, performed after callback invocation:
+ */
+TRACE_EVENT(rcu_batch_end,
+
+	TP_PROTO(int callbacks_invoked),
+
+	TP_ARGS(callbacks_invoked),
+
+	TP_STRUCT__entry(
+		__field(	int,	callbacks_invoked		)
+	),
+
+	TP_fast_assign(
+		__entry->callbacks_invoked	= callbacks_invoked;
+	),
+
+	TP_printk("CBs-invoked=%d", __entry->callbacks_invoked)
+);
+
+#endif /* _TRACE_RCU_H */
+
+/* This part must be outside protection */
+#include <trace/define_trace.h>
diff --git a/kernel/rcu.h b/kernel/rcu.h
new file mode 100644
index 0000000..7bc1643
--- /dev/null
+++ b/kernel/rcu.h
@@ -0,0 +1,79 @@
+/*
+ * Read-Copy Update definitions shared among RCU implementations.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Copyright IBM Corporation, 2011
+ *
+ * Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
+ */
+
+#ifndef __LINUX_RCU_H
+#define __LINUX_RCU_H
+
+/*
+ * debug_rcu_head_queue()/debug_rcu_head_unqueue() are used internally
+ * by call_rcu() and rcu callback execution, and are therefore not part of the
+ * RCU API. Leaving in rcupdate.h because they are used by all RCU flavors.
+ */
+
+#ifdef CONFIG_DEBUG_OBJECTS_RCU_HEAD
+# define STATE_RCU_HEAD_READY	0
+# define STATE_RCU_HEAD_QUEUED	1
+
+extern struct debug_obj_descr rcuhead_debug_descr;
+
+static inline void debug_rcu_head_queue(struct rcu_head *head)
+{
+	WARN_ON_ONCE((unsigned long)head & 0x3);
+	debug_object_activate(head, &rcuhead_debug_descr);
+	debug_object_active_state(head, &rcuhead_debug_descr,
+				  STATE_RCU_HEAD_READY,
+				  STATE_RCU_HEAD_QUEUED);
+}
+
+static inline void debug_rcu_head_unqueue(struct rcu_head *head)
+{
+	debug_object_active_state(head, &rcuhead_debug_descr,
+				  STATE_RCU_HEAD_QUEUED,
+				  STATE_RCU_HEAD_READY);
+	debug_object_deactivate(head, &rcuhead_debug_descr);
+}
+#else	/* !CONFIG_DEBUG_OBJECTS_RCU_HEAD */
+static inline void debug_rcu_head_queue(struct rcu_head *head)
+{
+}
+
+static inline void debug_rcu_head_unqueue(struct rcu_head *head)
+{
+}
+#endif	/* #else !CONFIG_DEBUG_OBJECTS_RCU_HEAD */
+
+extern void kfree(const void *);
+
+static inline void __rcu_reclaim(struct rcu_head *head)
+{
+	unsigned long offset = (unsigned long)head->func;
+
+	if (__is_kfree_rcu_offset(offset)) {
+		trace_rcu_invoke_kfree_callback(head, offset);
+		kfree((void *)head - offset);
+	} else {
+		trace_rcu_invoke_callback(head);
+		head->func(head);
+	}
+}
+
+#endif /* __LINUX_RCU_H */
diff --git a/kernel/rcupdate.c b/kernel/rcupdate.c
index a088c90..5031caf 100644
--- a/kernel/rcupdate.c
+++ b/kernel/rcupdate.c
@@ -46,6 +46,11 @@
 #include <linux/module.h>
 #include <linux/hardirq.h>
 
+#define CREATE_TRACE_POINTS
+#include <trace/events/rcu.h>
+
+#include "rcu.h"
+
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
 static struct lock_class_key rcu_lock_key;
 struct lockdep_map rcu_lock_map =
diff --git a/kernel/rcutiny.c b/kernel/rcutiny.c
index f544e34..19453ba 100644
--- a/kernel/rcutiny.c
+++ b/kernel/rcutiny.c
@@ -37,6 +37,25 @@
 #include <linux/cpu.h>
 #include <linux/prefetch.h>
 
+#ifdef CONFIG_RCU_TRACE
+
+#include <trace/events/rcu.h>
+
+#else /* #ifdef CONFIG_RCU_TRACE */
+
+/* No by-default tracing in TINY_RCU: Keep TINY_RCU tiny! */
+static void trace_rcu_invoke_kfree_callback(struct rcu_head *rhp,
+					    unsigned long offset)
+{
+}
+static void trace_rcu_invoke_callback(struct rcu_head *head)
+{
+}
+
+#endif /* #else #ifdef CONFIG_RCU_TRACE */
+
+#include "rcu.h"
+
 /* Controls for rcu_kthread() kthread, replacing RCU_SOFTIRQ used previously. */
 static struct task_struct *rcu_kthread_task;
 static DECLARE_WAIT_QUEUE_HEAD(rcu_kthread_wq);
@@ -161,11 +180,15 @@ static void rcu_process_callbacks(struct rcu_ctrlblk *rcp)
 	RCU_TRACE(int cb_count = 0);
 
 	/* If no RCU callbacks ready to invoke, just return. */
-	if (&rcp->rcucblist == rcp->donetail)
+	if (&rcp->rcucblist == rcp->donetail) {
+		RCU_TRACE(trace_rcu_batch_start(0, -1));
+		RCU_TRACE(trace_rcu_batch_end(0));
 		return;
+	}
 
 	/* Move the ready-to-invoke callbacks to a local list. */
 	local_irq_save(flags);
+	RCU_TRACE(trace_rcu_batch_start(0, -1));
 	list = rcp->rcucblist;
 	rcp->rcucblist = *rcp->donetail;
 	*rcp->donetail = NULL;
@@ -187,6 +210,7 @@ static void rcu_process_callbacks(struct rcu_ctrlblk *rcp)
 		RCU_TRACE(cb_count++);
 	}
 	RCU_TRACE(rcu_trace_sub_qlen(rcp, cb_count));
+	RCU_TRACE(trace_rcu_batch_end(cb_count));
 }
 
 /*
diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index a7c6bce..45dcc20 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -52,6 +52,9 @@
 #include <linux/prefetch.h>
 
 #include "rcutree.h"
+#include <trace/events/rcu.h>
+
+#include "rcu.h"
 
 /* Data structures. */
 
@@ -1190,17 +1193,22 @@ static void rcu_do_batch(struct rcu_state *rsp, struct rcu_data *rdp)
 {
 	unsigned long flags;
 	struct rcu_head *next, *list, **tail;
-	int count;
+	int bl, count;
 
 	/* If no callbacks are ready, just return.*/
-	if (!cpu_has_callbacks_ready_to_invoke(rdp))
+	if (!cpu_has_callbacks_ready_to_invoke(rdp)) {
+		trace_rcu_batch_start(0, 0);
+		trace_rcu_batch_end(0);
 		return;
+	}
 
 	/*
 	 * Extract the list of ready callbacks, disabling to prevent
 	 * races with call_rcu() from interrupt handlers.
 	 */
 	local_irq_save(flags);
+	bl = rdp->blimit;
+	trace_rcu_batch_start(rdp->qlen, bl);
 	list = rdp->nxtlist;
 	rdp->nxtlist = *rdp->nxttail[RCU_DONE_TAIL];
 	*rdp->nxttail[RCU_DONE_TAIL] = NULL;
@@ -1218,11 +1226,12 @@ static void rcu_do_batch(struct rcu_state *rsp, struct rcu_data *rdp)
 		debug_rcu_head_unqueue(list);
 		__rcu_reclaim(list);
 		list = next;
-		if (++count >= rdp->blimit)
+		if (++count >= bl)
 			break;
 	}
 
 	local_irq_save(flags);
+	trace_rcu_batch_end(count);
 
 	/* Update count, and requeue any remaining callbacks. */
 	rdp->qlen -= count;
-- 
1.7.3.2


^ permalink raw reply related	[flat|nested] 101+ messages in thread

* [PATCH tip/core/rcu 15/55] rcu: Event-trace markers for computing RCU CPU utilization
  2011-09-06 18:00 [PATCH tip/core/rcu 0/55] Preview of RCU changes for 3.2 Paul E. McKenney
                   ` (13 preceding siblings ...)
  2011-09-06 18:00 ` [PATCH tip/core/rcu 14/55] rcu: Add event-tracing for RCU callback invocation Paul E. McKenney
@ 2011-09-06 18:00 ` Paul E. McKenney
  2011-09-06 18:00 ` [PATCH tip/core/rcu 16/55] rcu: Put names into TINY_RCU structures under RCU_TRACE Paul E. McKenney
                   ` (41 subsequent siblings)
  56 siblings, 0 replies; 101+ messages in thread
From: Paul E. McKenney @ 2011-09-06 18:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, patches, Paul E. McKenney, Paul E. McKenney

From: Paul E. McKenney <paul.mckenney@linaro.org>

This commit adds the trace_rcu_utilization() marker that is to be
used to allow postprocessing scripts to compute RCU's CPU utilization,
give or take event-trace overhead.  Note that we do not include RCU's
dyntick-idle interface because event tracing requires RCU protection,
which is not available in dyntick-idle mode.
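
Because nesting is permitted, a postprocessing script must treat the
markers as a bracket stack rather than as simple on/off toggles;
schematically:

	trace_rcu_utilization("Start outer activity");
	trace_rcu_utilization("Start inner activity");	/* nested */
	trace_rcu_utilization("End inner activity");
	trace_rcu_utilization("End outer activity");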

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 include/trace/events/rcu.h |   73 ++++++++++++++++++++++++++++++++------------
 kernel/rcutree.c           |   16 +++++++++-
 2 files changed, 68 insertions(+), 21 deletions(-)

diff --git a/include/trace/events/rcu.h b/include/trace/events/rcu.h
index db3f6e9..ab458eb 100644
--- a/include/trace/events/rcu.h
+++ b/include/trace/events/rcu.h
@@ -7,29 +7,58 @@
 #include <linux/tracepoint.h>
 
 /*
- * Tracepoint for calling rcu_do_batch, performed to start callback invocation:
+ * Tracepoint for start/end markers used for utilization calculations.
+ * By convention, the string is of the following forms:
+ *
+ * "Start <activity>" -- Mark the start of the specified activity,
+ *			 such as "context switch".  Nesting is permitted.
+ * "End <activity>" -- Mark the end of the specified activity.
+ */
+TRACE_EVENT(rcu_utilization,
+
+	TP_PROTO(char *s),
+
+	TP_ARGS(s),
+
+	TP_STRUCT__entry(
+		__field(char *,	s)
+	),
+
+	TP_fast_assign(
+		__entry->s = s;
+	),
+
+	TP_printk("%s", __entry->s)
+);
+
+/*
+ * Tracepoint for marking the beginning rcu_do_batch, performed to start
+ * RCU callback invocation.  The first argument is the total number of
+ * callbacks (including those that are not yet ready to be invoked),
+ * and the second argument is the current RCU-callback batch limit.
  */
 TRACE_EVENT(rcu_batch_start,
 
-	TP_PROTO(long callbacks_ready, int blimit),
+	TP_PROTO(long qlen, int blimit),
 
-	TP_ARGS(callbacks_ready, blimit),
+	TP_ARGS(qlen, blimit),
 
 	TP_STRUCT__entry(
-		__field(	long,	callbacks_ready		)
-		__field(	int,	blimit			)
+		__field(long, qlen)
+		__field(int, blimit)
 	),
 
 	TP_fast_assign(
-		__entry->callbacks_ready	= callbacks_ready;
-		__entry->blimit			= blimit;
+		__entry->qlen = qlen;
+		__entry->blimit = blimit;
 	),
 
-	TP_printk("CBs=%ld bl=%d", __entry->callbacks_ready, __entry->blimit)
+	TP_printk("CBs=%ld bl=%d", __entry->qlen, __entry->blimit)
 );
 
 /*
- * Tracepoint for the invocation of a single RCU callback
+ * Tracepoint for the invocation of a single RCU callback function.
+ * The argument is a pointer to the RCU callback itself.
  */
 TRACE_EVENT(rcu_invoke_callback,
 
@@ -38,20 +67,23 @@ TRACE_EVENT(rcu_invoke_callback,
 	TP_ARGS(rhp),
 
 	TP_STRUCT__entry(
-		__field(	void *,	rhp	)
-		__field(	void *,	func	)
+		__field(void *,	rhp)
+		__field(void *,	func)
 	),
 
 	TP_fast_assign(
-		__entry->rhp	= rhp;
-		__entry->func	= rhp->func;
+		__entry->rhp = rhp;
+		__entry->func = rhp->func;
 	),
 
 	TP_printk("rhp=%p func=%pf", __entry->rhp, __entry->func)
 );
 
 /*
- * Tracepoint for the invocation of a single RCU kfree callback
+ * Tracepoint for the invocation of a single RCU callback of the special
+ * kfree() form.  The first argument is a pointer to the RCU callback
+ * and the second argument is the offset of the callback within the
+ * enclosing RCU-protected data structure.
  */
 TRACE_EVENT(rcu_invoke_kfree_callback,
 
@@ -60,12 +92,12 @@ TRACE_EVENT(rcu_invoke_kfree_callback,
 	TP_ARGS(rhp, offset),
 
 	TP_STRUCT__entry(
-		__field(void *,	rhp	)
-		__field(unsigned long,	offset	)
+		__field(void *,	rhp)
+		__field(unsigned long, offset)
 	),
 
 	TP_fast_assign(
-		__entry->rhp	= rhp;
+		__entry->rhp = rhp;
 		__entry->offset	= offset;
 	),
 
@@ -73,7 +105,8 @@ TRACE_EVENT(rcu_invoke_kfree_callback,
 );
 
 /*
- * Tracepoint for leaving rcu_do_batch, performed after callback invocation:
+ * Tracepoint for exiting rcu_do_batch after RCU callbacks have been
+ * invoked.  The first argument is the number of callbacks actually invoked.
  */
 TRACE_EVENT(rcu_batch_end,
 
@@ -82,11 +115,11 @@ TRACE_EVENT(rcu_batch_end,
 	TP_ARGS(callbacks_invoked),
 
 	TP_STRUCT__entry(
-		__field(	int,	callbacks_invoked		)
+		__field(int, callbacks_invoked)
 	),
 
 	TP_fast_assign(
-		__entry->callbacks_invoked	= callbacks_invoked;
+		__entry->callbacks_invoked = callbacks_invoked;
 	),
 
 	TP_printk("CBs-invoked=%d", __entry->callbacks_invoked)
diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index 45dcc20..2a9643b 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -184,8 +184,10 @@ void rcu_bh_qs(int cpu)
  */
 void rcu_note_context_switch(int cpu)
 {
+	trace_rcu_utilization("Start context switch");
 	rcu_sched_qs(cpu);
 	rcu_preempt_note_context_switch(cpu);
+	trace_rcu_utilization("End context switch");
 }
 EXPORT_SYMBOL_GPL(rcu_note_context_switch);
 
@@ -1275,6 +1277,7 @@ static void rcu_do_batch(struct rcu_state *rsp, struct rcu_data *rdp)
  */
 void rcu_check_callbacks(int cpu, int user)
 {
+	trace_rcu_utilization("Start scheduler-tick");
 	if (user ||
 	    (idle_cpu(cpu) && rcu_scheduler_active &&
 	     !in_softirq() && hardirq_count() <= (1 << HARDIRQ_SHIFT))) {
@@ -1308,6 +1311,7 @@ void rcu_check_callbacks(int cpu, int user)
 	rcu_preempt_check_callbacks(cpu);
 	if (rcu_pending(cpu))
 		invoke_rcu_core();
+	trace_rcu_utilization("End scheduler-tick");
 }
 
 #ifdef CONFIG_SMP
@@ -1369,10 +1373,14 @@ static void force_quiescent_state(struct rcu_state *rsp, int relaxed)
 	unsigned long flags;
 	struct rcu_node *rnp = rcu_get_root(rsp);
 
-	if (!rcu_gp_in_progress(rsp))
+	trace_rcu_utilization("Start fqs");
+	if (!rcu_gp_in_progress(rsp)) {
+		trace_rcu_utilization("End fqs");
 		return;  /* No grace period in progress, nothing to force. */
+	}
 	if (!raw_spin_trylock_irqsave(&rsp->fqslock, flags)) {
 		rsp->n_force_qs_lh++; /* Inexact, can lose counts.  Tough! */
+		trace_rcu_utilization("End fqs");
 		return;	/* Someone else is already on the job. */
 	}
 	if (relaxed && ULONG_CMP_GE(rsp->jiffies_force_qs, jiffies))
@@ -1421,11 +1429,13 @@ static void force_quiescent_state(struct rcu_state *rsp, int relaxed)
 		raw_spin_unlock(&rsp->fqslock); /* irqs remain disabled */
 		rsp->fqs_need_gp = 0;
 		rcu_start_gp(rsp, flags); /* releases rnp->lock */
+		trace_rcu_utilization("End fqs");
 		return;
 	}
 	raw_spin_unlock(&rnp->lock);  /* irqs remain disabled */
 unlock_fqs_ret:
 	raw_spin_unlock_irqrestore(&rsp->fqslock, flags);
+	trace_rcu_utilization("End fqs");
 }
 
 #else /* #ifdef CONFIG_SMP */
@@ -1481,6 +1491,7 @@ __rcu_process_callbacks(struct rcu_state *rsp, struct rcu_data *rdp)
  */
 static void rcu_process_callbacks(struct softirq_action *unused)
 {
+	trace_rcu_utilization("Start RCU core");
 	__rcu_process_callbacks(&rcu_sched_state,
 				&__get_cpu_var(rcu_sched_data));
 	__rcu_process_callbacks(&rcu_bh_state, &__get_cpu_var(rcu_bh_data));
@@ -1488,6 +1499,7 @@ static void rcu_process_callbacks(struct softirq_action *unused)
 
 	/* If we are last CPU on way to dyntick-idle mode, accelerate it. */
 	rcu_needs_cpu_flush();
+	trace_rcu_utilization("End RCU core");
 }
 
 /*
@@ -1910,6 +1922,7 @@ static int __cpuinit rcu_cpu_notify(struct notifier_block *self,
 	struct rcu_data *rdp = per_cpu_ptr(rcu_state->rda, cpu);
 	struct rcu_node *rnp = rdp->mynode;
 
+	trace_rcu_utilization("Start CPU hotplug");
 	switch (action) {
 	case CPU_UP_PREPARE:
 	case CPU_UP_PREPARE_FROZEN:
@@ -1945,6 +1958,7 @@ static int __cpuinit rcu_cpu_notify(struct notifier_block *self,
 	default:
 		break;
 	}
+	trace_rcu_utilization("End CPU hotplug");
 	return NOTIFY_OK;
 }
 
-- 
1.7.3.2


^ permalink raw reply related	[flat|nested] 101+ messages in thread

* [PATCH tip/core/rcu 16/55] rcu: Put names into TINY_RCU structures under RCU_TRACE
  2011-09-06 18:00 [PATCH tip/core/rcu 0/55] Preview of RCU changes for 3.2 Paul E. McKenney
                   ` (14 preceding siblings ...)
  2011-09-06 18:00 ` [PATCH tip/core/rcu 15/55] rcu: Event-trace markers for computing RCU CPU utilization Paul E. McKenney
@ 2011-09-06 18:00 ` Paul E. McKenney
  2011-09-06 18:00 ` [PATCH tip/core/rcu 17/55] rcu: Add RCU type to callback-invocation tracing Paul E. McKenney
                   ` (40 subsequent siblings)
  56 siblings, 0 replies; 101+ messages in thread
From: Paul E. McKenney @ 2011-09-06 18:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, patches, Paul E. McKenney, Paul E. McKenney

From: Paul E. McKenney <paul.mckenney@linaro.org>

In order to allow event tracing to distinguish between flavors of
RCU, we need those names in the relevant RCU data structures.  TINY_RCU
has avoided them for memory-footprint reasons, so add them only if
CONFIG_RCU_TRACE=y.
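
The names cost nothing for CONFIG_RCU_TRACE=n because RCU_TRACE()
discards its argument; a toy control block showing the pattern used
below (toy_ctrlblk is illustrative only):

	#ifdef CONFIG_RCU_TRACE
	#define RCU_TRACE(stmt) stmt
	#else /* #ifdef CONFIG_RCU_TRACE */
	#define RCU_TRACE(stmt)
	#endif /* #else #ifdef CONFIG_RCU_TRACE */

	struct toy_ctrlblk {
		struct rcu_head *rcucblist;
		RCU_TRACE(char *name);	/* field vanishes if tracing is off */
	};

	static struct toy_ctrlblk toy_ctrlblk = {
		.rcucblist	= NULL,
		RCU_TRACE(.name = "toy")	/* initializer vanishes too */
	};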

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcu.h            |   10 ++++++++--
 kernel/rcutiny.c        |   13 -------------
 kernel/rcutiny_plugin.h |   10 ++++------
 kernel/rcutree.c        |   10 +++++-----
 kernel/rcutree_plugin.h |    2 +-
 5 files changed, 18 insertions(+), 27 deletions(-)

diff --git a/kernel/rcu.h b/kernel/rcu.h
index 7bc1643..d7f00ec 100644
--- a/kernel/rcu.h
+++ b/kernel/rcu.h
@@ -23,6 +23,12 @@
 #ifndef __LINUX_RCU_H
 #define __LINUX_RCU_H
 
+#ifdef CONFIG_RCU_TRACE
+#define RCU_TRACE(stmt) stmt
+#else /* #ifdef CONFIG_RCU_TRACE */
+#define RCU_TRACE(stmt)
+#endif /* #else #ifdef CONFIG_RCU_TRACE */
+
 /*
  * debug_rcu_head_queue()/debug_rcu_head_unqueue() are used internally
  * by call_rcu() and rcu callback execution, and are therefore not part of the
@@ -68,10 +74,10 @@ static inline void __rcu_reclaim(struct rcu_head *head)
 	unsigned long offset = (unsigned long)head->func;
 
 	if (__is_kfree_rcu_offset(offset)) {
-		trace_rcu_invoke_kfree_callback(head, offset);
+		RCU_TRACE(trace_rcu_invoke_kfree_callback(head, offset));
 		kfree((void *)head - offset);
 	} else {
-		trace_rcu_invoke_callback(head);
+		RCU_TRACE(trace_rcu_invoke_callback(head));
 		head->func(head);
 	}
 }
diff --git a/kernel/rcutiny.c b/kernel/rcutiny.c
index 19453ba..0d28974 100644
--- a/kernel/rcutiny.c
+++ b/kernel/rcutiny.c
@@ -38,20 +38,7 @@
 #include <linux/prefetch.h>
 
 #ifdef CONFIG_RCU_TRACE
-
 #include <trace/events/rcu.h>
-
-#else /* #ifdef CONFIG_RCU_TRACE */
-
-/* No by-default tracing in TINY_RCU: Keep TINY_RCU tiny! */
-static void trace_rcu_invoke_kfree_callback(struct rcu_head *rhp,
-					    unsigned long offset)
-{
-}
-static void trace_rcu_invoke_callback(struct rcu_head *head)
-{
-}
-
 #endif /* #else #ifdef CONFIG_RCU_TRACE */
 
 #include "rcu.h"
diff --git a/kernel/rcutiny_plugin.h b/kernel/rcutiny_plugin.h
index 6b0cedb..791ddf7 100644
--- a/kernel/rcutiny_plugin.h
+++ b/kernel/rcutiny_plugin.h
@@ -26,29 +26,26 @@
 #include <linux/debugfs.h>
 #include <linux/seq_file.h>
 
-#ifdef CONFIG_RCU_TRACE
-#define RCU_TRACE(stmt)	stmt
-#else /* #ifdef CONFIG_RCU_TRACE */
-#define RCU_TRACE(stmt)
-#endif /* #else #ifdef CONFIG_RCU_TRACE */
-
 /* Global control variables for rcupdate callback mechanism. */
 struct rcu_ctrlblk {
 	struct rcu_head *rcucblist;	/* List of pending callbacks (CBs). */
 	struct rcu_head **donetail;	/* ->next pointer of last "done" CB. */
 	struct rcu_head **curtail;	/* ->next pointer of last CB. */
 	RCU_TRACE(long qlen);		/* Number of pending CBs. */
+	RCU_TRACE(char *name);		/* Name of RCU type. */
 };
 
 /* Definition for rcupdate control block. */
 static struct rcu_ctrlblk rcu_sched_ctrlblk = {
 	.donetail	= &rcu_sched_ctrlblk.rcucblist,
 	.curtail	= &rcu_sched_ctrlblk.rcucblist,
+	RCU_TRACE(.name = "rcu_sched")
 };
 
 static struct rcu_ctrlblk rcu_bh_ctrlblk = {
 	.donetail	= &rcu_bh_ctrlblk.rcucblist,
 	.curtail	= &rcu_bh_ctrlblk.rcucblist,
+	RCU_TRACE(.name = "rcu_bh")
 };
 
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
@@ -131,6 +128,7 @@ static struct rcu_preempt_ctrlblk rcu_preempt_ctrlblk = {
 	.rcb.curtail = &rcu_preempt_ctrlblk.rcb.rcucblist,
 	.nexttail = &rcu_preempt_ctrlblk.rcb.rcucblist,
 	.blkd_tasks = LIST_HEAD_INIT(rcu_preempt_ctrlblk.blkd_tasks),
+	RCU_TRACE(.rcb.name = "rcu_preempt")
 };
 
 static int rcu_preempted_readers_exp(void);
diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index 2a9643b..b953e2c 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -61,7 +61,7 @@
 static struct lock_class_key rcu_node_class[NUM_RCU_LVLS];
 
 #define RCU_STATE_INITIALIZER(structname) { \
-	.level = { &structname.node[0] }, \
+	.level = { &structname##_state.node[0] }, \
 	.levelcnt = { \
 		NUM_RCU_LVL_0,  /* root of hierarchy. */ \
 		NUM_RCU_LVL_1, \
@@ -72,17 +72,17 @@ static struct lock_class_key rcu_node_class[NUM_RCU_LVLS];
 	.signaled = RCU_GP_IDLE, \
 	.gpnum = -300, \
 	.completed = -300, \
-	.onofflock = __RAW_SPIN_LOCK_UNLOCKED(&structname.onofflock), \
-	.fqslock = __RAW_SPIN_LOCK_UNLOCKED(&structname.fqslock), \
+	.onofflock = __RAW_SPIN_LOCK_UNLOCKED(&structname##_state.onofflock), \
+	.fqslock = __RAW_SPIN_LOCK_UNLOCKED(&structname##_state.fqslock), \
 	.n_force_qs = 0, \
 	.n_force_qs_ngp = 0, \
 	.name = #structname, \
 }
 
-struct rcu_state rcu_sched_state = RCU_STATE_INITIALIZER(rcu_sched_state);
+struct rcu_state rcu_sched_state = RCU_STATE_INITIALIZER(rcu_sched);
 DEFINE_PER_CPU(struct rcu_data, rcu_sched_data);
 
-struct rcu_state rcu_bh_state = RCU_STATE_INITIALIZER(rcu_bh_state);
+struct rcu_state rcu_bh_state = RCU_STATE_INITIALIZER(rcu_bh);
 DEFINE_PER_CPU(struct rcu_data, rcu_bh_data);
 
 static struct rcu_state *rcu_state;
diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
index 43daa46..a90bf3c 100644
--- a/kernel/rcutree_plugin.h
+++ b/kernel/rcutree_plugin.h
@@ -64,7 +64,7 @@ static void __init rcu_bootup_announce_oddness(void)
 
 #ifdef CONFIG_TREE_PREEMPT_RCU
 
-struct rcu_state rcu_preempt_state = RCU_STATE_INITIALIZER(rcu_preempt_state);
+struct rcu_state rcu_preempt_state = RCU_STATE_INITIALIZER(rcu_preempt);
 DEFINE_PER_CPU(struct rcu_data, rcu_preempt_data);
 static struct rcu_state *rcu_state = &rcu_preempt_state;
 
-- 
1.7.3.2


^ permalink raw reply related	[flat|nested] 101+ messages in thread

* [PATCH tip/core/rcu 17/55] rcu: Add RCU type to callback-invocation tracing
  2011-09-06 18:00 [PATCH tip/core/rcu 0/55] Preview of RCU changes for 3.2 Paul E. McKenney
                   ` (15 preceding siblings ...)
  2011-09-06 18:00 ` [PATCH tip/core/rcu 16/55] rcu: Put names into TINY_RCU structures under RCU_TRACE Paul E. McKenney
@ 2011-09-06 18:00 ` Paul E. McKenney
  2011-09-06 18:00 ` [PATCH tip/core/rcu 18/55] rcu: Update comments to reflect softirqs vs. kthreads Paul E. McKenney
                   ` (39 subsequent siblings)
  56 siblings, 0 replies; 101+ messages in thread
From: Paul E. McKenney @ 2011-09-06 18:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, patches, Paul E. McKenney, Paul E. McKenney

From: Paul E. McKenney <paul.mckenney@linaro.org>

Add a string to the rcu_batch_start() and rcu_batch_end() trace
messages that indicates the RCU type ("rcu_sched", "rcu_bh", or
"rcu_preempt").  The trace messages for the actual invocations
themselves are not marked, as it should be clear from the
rcu_batch_start() and rcu_batch_end() events before and after.
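
With this change, the batch events in the trace buffer look roughly like
the following (illustrative output reconstructed from the TP_printk()
format strings, not captured from a real trace):

	rcu_batch_start: rcu_sched CBs=42 bl=10
	rcu_batch_end:   rcu_sched CBs-invoked=10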

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 include/trace/events/rcu.h |   28 ++++++++++++++++++----------
 kernel/rcutiny.c           |    8 ++++----
 kernel/rcutree.c           |    8 ++++----
 3 files changed, 26 insertions(+), 18 deletions(-)

diff --git a/include/trace/events/rcu.h b/include/trace/events/rcu.h
index ab458eb..508824e5 100644
--- a/include/trace/events/rcu.h
+++ b/include/trace/events/rcu.h
@@ -33,27 +33,31 @@ TRACE_EVENT(rcu_utilization,
 
 /*
  * Tracepoint for marking the beginning rcu_do_batch, performed to start
- * RCU callback invocation.  The first argument is the total number of
- * callbacks (including those that are not yet ready to be invoked),
- * and the second argument is the current RCU-callback batch limit.
+ * RCU callback invocation.  The first argument is the RCU flavor,
+ * the second is the total number of callbacks (including those that
+ * are not yet ready to be invoked), and the third argument is the
+ * current RCU-callback batch limit.
  */
 TRACE_EVENT(rcu_batch_start,
 
-	TP_PROTO(long qlen, int blimit),
+	TP_PROTO(char *rcuname, long qlen, int blimit),
 
-	TP_ARGS(qlen, blimit),
+	TP_ARGS(rcuname, qlen, blimit),
 
 	TP_STRUCT__entry(
+		__field(char *, rcuname)
 		__field(long, qlen)
 		__field(int, blimit)
 	),
 
 	TP_fast_assign(
+		__entry->rcuname = rcuname;
 		__entry->qlen = qlen;
 		__entry->blimit = blimit;
 	),
 
-	TP_printk("CBs=%ld bl=%d", __entry->qlen, __entry->blimit)
+	TP_printk("%s CBs=%ld bl=%d",
+		  __entry->rcuname, __entry->qlen, __entry->blimit)
 );
 
 /*
@@ -106,23 +110,27 @@ TRACE_EVENT(rcu_invoke_kfree_callback,
 
 /*
  * Tracepoint for exiting rcu_do_batch after RCU callbacks have been
- * invoked.  The first argument is the number of callbacks actually invoked.
+ * invoked.  The first argument is the name of the RCU flavor and
+ * the second argument is number of callbacks actually invoked.
  */
 TRACE_EVENT(rcu_batch_end,
 
-	TP_PROTO(int callbacks_invoked),
+	TP_PROTO(char *rcuname, int callbacks_invoked),
 
-	TP_ARGS(callbacks_invoked),
+	TP_ARGS(rcuname, callbacks_invoked),
 
 	TP_STRUCT__entry(
+		__field(char *, rcuname)
 		__field(int, callbacks_invoked)
 	),
 
 	TP_fast_assign(
+		__entry->rcuname = rcuname;
 		__entry->callbacks_invoked = callbacks_invoked;
 	),
 
-	TP_printk("CBs-invoked=%d", __entry->callbacks_invoked)
+	TP_printk("%s CBs-invoked=%d",
+		  __entry->rcuname, __entry->callbacks_invoked)
 );
 
 #endif /* _TRACE_RCU_H */
diff --git a/kernel/rcutiny.c b/kernel/rcutiny.c
index 0d28974..1c37bdd 100644
--- a/kernel/rcutiny.c
+++ b/kernel/rcutiny.c
@@ -168,14 +168,14 @@ static void rcu_process_callbacks(struct rcu_ctrlblk *rcp)
 
 	/* If no RCU callbacks ready to invoke, just return. */
 	if (&rcp->rcucblist == rcp->donetail) {
-		RCU_TRACE(trace_rcu_batch_start(0, -1));
-		RCU_TRACE(trace_rcu_batch_end(0));
+		RCU_TRACE(trace_rcu_batch_start(rcp->name, 0, -1));
+		RCU_TRACE(trace_rcu_batch_end(rcp->name, 0));
 		return;
 	}
 
 	/* Move the ready-to-invoke callbacks to a local list. */
 	local_irq_save(flags);
-	RCU_TRACE(trace_rcu_batch_start(0, -1));
+	RCU_TRACE(trace_rcu_batch_start(rcp->name, 0, -1));
 	list = rcp->rcucblist;
 	rcp->rcucblist = *rcp->donetail;
 	*rcp->donetail = NULL;
@@ -197,7 +197,7 @@ static void rcu_process_callbacks(struct rcu_ctrlblk *rcp)
 		RCU_TRACE(cb_count++);
 	}
 	RCU_TRACE(rcu_trace_sub_qlen(rcp, cb_count));
-	RCU_TRACE(trace_rcu_batch_end(cb_count));
+	RCU_TRACE(trace_rcu_batch_end(rcp->name, cb_count));
 }
 
 /*
diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index b953e2c..eb6e731 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -1199,8 +1199,8 @@ static void rcu_do_batch(struct rcu_state *rsp, struct rcu_data *rdp)
 
 	/* If no callbacks are ready, just return.*/
 	if (!cpu_has_callbacks_ready_to_invoke(rdp)) {
-		trace_rcu_batch_start(0, 0);
-		trace_rcu_batch_end(0);
+		trace_rcu_batch_start(rsp->name, 0, 0);
+		trace_rcu_batch_end(rsp->name, 0);
 		return;
 	}
 
@@ -1210,7 +1210,7 @@ static void rcu_do_batch(struct rcu_state *rsp, struct rcu_data *rdp)
 	 */
 	local_irq_save(flags);
 	bl = rdp->blimit;
-	trace_rcu_batch_start(rdp->qlen, bl);
+	trace_rcu_batch_start(rsp->name, rdp->qlen, bl);
 	list = rdp->nxtlist;
 	rdp->nxtlist = *rdp->nxttail[RCU_DONE_TAIL];
 	*rdp->nxttail[RCU_DONE_TAIL] = NULL;
@@ -1233,7 +1233,7 @@ static void rcu_do_batch(struct rcu_state *rsp, struct rcu_data *rdp)
 	}
 
 	local_irq_save(flags);
-	trace_rcu_batch_end(count);
+	trace_rcu_batch_end(rsp->name, count);
 
 	/* Update count, and requeue any remaining callbacks. */
 	rdp->qlen -= count;
-- 
1.7.3.2


^ permalink raw reply related	[flat|nested] 101+ messages in thread

* [PATCH tip/core/rcu 18/55] rcu: Update comments to reflect softirqs vs. kthreads
  2011-09-06 18:00 [PATCH tip/core/rcu 0/55] Preview of RCU changes for 3.2 Paul E. McKenney
                   ` (16 preceding siblings ...)
  2011-09-06 18:00 ` [PATCH tip/core/rcu 17/55] rcu: Add RCU type to callback-invocation tracing Paul E. McKenney
@ 2011-09-06 18:00 ` Paul E. McKenney
  2011-09-06 18:00 ` [PATCH tip/core/rcu 19/55] rcu: Move RCU_BOOST declarations to allow compiler checking Paul E. McKenney
                   ` (38 subsequent siblings)
  56 siblings, 0 replies; 101+ messages in thread
From: Paul E. McKenney @ 2011-09-06 18:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, patches, Paul E. McKenney, Paul E. McKenney

From: Paul E. McKenney <paul.mckenney@linaro.org>

We now have kthreads only for flavors of RCU that support boosting,
so update the now-misleading comments accordingly.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcutree.c        |   23 ++++++++++++-----------
 kernel/rcutree_plugin.h |    3 ++-
 2 files changed, 14 insertions(+), 12 deletions(-)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index eb6e731..4e24399 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -198,7 +198,7 @@ DEFINE_PER_CPU(struct rcu_dynticks, rcu_dynticks) = {
 };
 #endif /* #ifdef CONFIG_NO_HZ */
 
-static int blimit = 10;		/* Maximum callbacks per softirq. */
+static int blimit = 10;		/* Maximum callbacks per rcu_do_batch. */
 static int qhimark = 10000;	/* If this many pending, ignore blimit. */
 static int qlowmark = 100;	/* Once only this many pending, use blimit. */
 
@@ -1261,7 +1261,7 @@ static void rcu_do_batch(struct rcu_state *rsp, struct rcu_data *rdp)
 
 	local_irq_restore(flags);
 
-	/* Re-raise the RCU softirq if there are callbacks remaining. */
+	/* Re-invoke RCU core processing if there are callbacks remaining. */
 	if (cpu_has_callbacks_ready_to_invoke(rdp))
 		invoke_rcu_core();
 }
@@ -1269,7 +1269,7 @@ static void rcu_do_batch(struct rcu_state *rsp, struct rcu_data *rdp)
 /*
  * Check to see if this CPU is in a non-context-switch quiescent state
  * (user mode or idle loop for rcu, non-softirq execution for rcu_bh).
- * Also schedule the RCU softirq handler.
+ * Also schedule RCU core processing.
  *
  * This function must be called with hardirqs disabled.  It is normally
  * invoked from the scheduling-clock interrupt.  If rcu_pending returns
@@ -1448,9 +1448,9 @@ static void force_quiescent_state(struct rcu_state *rsp, int relaxed)
 #endif /* #else #ifdef CONFIG_SMP */
 
 /*
- * This does the RCU processing work from softirq context for the
- * specified rcu_state and rcu_data structures.  This may be called
- * only from the CPU to whom the rdp belongs.
+ * This does the RCU core processing work for the specified rcu_state
+ * and rcu_data structures.  This may be called only from the CPU to
+ * whom the rdp belongs.
  */
 static void
 __rcu_process_callbacks(struct rcu_state *rsp, struct rcu_data *rdp)
@@ -1487,7 +1487,7 @@ __rcu_process_callbacks(struct rcu_state *rsp, struct rcu_data *rdp)
 }
 
 /*
- * Do softirq processing for the current CPU.
+ * Do RCU core processing for the current CPU.
  */
 static void rcu_process_callbacks(struct softirq_action *unused)
 {
@@ -1503,10 +1503,11 @@ static void rcu_process_callbacks(struct softirq_action *unused)
 }
 
 /*
- * Wake up the current CPU's kthread.  This replaces raise_softirq()
- * in earlier versions of RCU.  Note that because we are running on
- * the current CPU with interrupts disabled, the rcu_cpu_kthread_task
- * cannot disappear out from under us.
+ * Schedule RCU callback invocation.  If the specified type of RCU
+ * does not support RCU priority boosting, just do a direct call,
+ * otherwise wake up the per-CPU kernel kthread.  Note that because we
+ * are running on the current CPU with interrupts disabled, the
+ * rcu_cpu_kthread_task cannot disappear out from under us.
  */
 static void invoke_rcu_callbacks(struct rcu_state *rsp, struct rcu_data *rdp)
 {
diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
index a90bf3c..ecd48a2 100644
--- a/kernel/rcutree_plugin.h
+++ b/kernel/rcutree_plugin.h
@@ -1478,7 +1478,8 @@ static int rcu_cpu_kthread_should_stop(int cpu)
 
 /*
  * Per-CPU kernel thread that invokes RCU callbacks.  This replaces the
- * earlier RCU softirq.
+ * RCU softirq used in flavors and configurations of RCU that do not
+ * support RCU priority boosting.
  */
 static int rcu_cpu_kthread(void *arg)
 {
-- 
1.7.3.2


^ permalink raw reply related	[flat|nested] 101+ messages in thread

* [PATCH tip/core/rcu 19/55] rcu: Move RCU_BOOST declarations to allow compiler checking
  2011-09-06 18:00 [PATCH tip/core/rcu 0/55] Preview of RCU changes for 3.2 Paul E. McKenney
                   ` (17 preceding siblings ...)
  2011-09-06 18:00 ` [PATCH tip/core/rcu 18/55] rcu: Update comments to reflect softirqs vs. kthreads Paul E. McKenney
@ 2011-09-06 18:00 ` Paul E. McKenney
  2011-09-06 18:00 ` [PATCH tip/core/rcu 20/55] rcu: Add event-trace markers to TREE_RCU kthreads Paul E. McKenney
                   ` (37 subsequent siblings)
  56 siblings, 0 replies; 101+ messages in thread
From: Paul E. McKenney @ 2011-09-06 18:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, patches, Paul E. McKenney, Paul E. McKenney

From: Paul E. McKenney <paul.mckenney@linaro.org>

Andi Kleen noticed that one of the RCU_BOOST data declarations was
out of sync with the definition.  Move the declarations so that the
compiler can do the checking in the future.
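
The underlying principle, shown with plain externs and hypothetical names:

	/* status.h */
	extern unsigned int status;

	/* status.c -- includes status.h, so a mismatch is caught here: */
	#include "status.h"
	char status;	/* error: conflicting types for 'status' */

	/* Had status.c never seen the declaration, the mismatch could
	 * silently survive until link time or even run time. */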

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcutree.h       |    7 +++++++
 kernel/rcutree_trace.c |    5 -----
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/kernel/rcutree.h b/kernel/rcutree.h
index 01b2ccd..eee6c94 100644
--- a/kernel/rcutree.h
+++ b/kernel/rcutree.h
@@ -417,6 +417,13 @@ extern struct rcu_state rcu_preempt_state;
 DECLARE_PER_CPU(struct rcu_data, rcu_preempt_data);
 #endif /* #ifdef CONFIG_TREE_PREEMPT_RCU */
 
+#ifdef CONFIG_RCU_BOOST
+DECLARE_PER_CPU(unsigned int, rcu_cpu_kthread_status);
+DECLARE_PER_CPU(int, rcu_cpu_kthread_cpu);
+DECLARE_PER_CPU(unsigned int, rcu_cpu_kthread_loops);
+DECLARE_PER_CPU(char, rcu_cpu_has_work);
+#endif /* #ifdef CONFIG_RCU_BOOST */
+
 #ifndef RCU_TREE_NONCORE
 
 /* Forward declarations for rcutree_plugin.h */
diff --git a/kernel/rcutree_trace.c b/kernel/rcutree_trace.c
index 8827b34..e623564 100644
--- a/kernel/rcutree_trace.c
+++ b/kernel/rcutree_trace.c
@@ -48,11 +48,6 @@
 
 #ifdef CONFIG_RCU_BOOST
 
-DECLARE_PER_CPU(unsigned int, rcu_cpu_kthread_status);
-DECLARE_PER_CPU(int, rcu_cpu_kthread_cpu);
-DECLARE_PER_CPU(unsigned int, rcu_cpu_kthread_loops);
-DECLARE_PER_CPU(char, rcu_cpu_has_work);
-
 static char convert_kthread_status(unsigned int kthread_status)
 {
 	if (kthread_status > RCU_KTHREAD_MAX)
-- 
1.7.3.2


^ permalink raw reply related	[flat|nested] 101+ messages in thread

* [PATCH tip/core/rcu 20/55] rcu: Add event-trace markers to TREE_RCU kthreads
  2011-09-06 18:00 [PATCH tip/core/rcu 0/55] Preview of RCU changes for 3.2 Paul E. McKenney
                   ` (18 preceding siblings ...)
  2011-09-06 18:00 ` [PATCH tip/core/rcu 19/55] rcu: Move RCU_BOOST declarations to allow compiler checking Paul E. McKenney
@ 2011-09-06 18:00 ` Paul E. McKenney
  2011-09-06 18:00 ` [PATCH tip/core/rcu 21/55] rcu: Make TINY_RCU also use softirq for RCU_BOOST=n Paul E. McKenney
                   ` (36 subsequent siblings)
  56 siblings, 0 replies; 101+ messages in thread
From: Paul E. McKenney @ 2011-09-06 18:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, patches, Paul E. McKenney, Paul E. McKenney

From: Paul E. McKenney <paul.mckenney@linaro.org>

Add event-trace markers to TREE_RCU kthreads to allow these kthreads'
CPU time to be included in the utilization calculations.
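
Note the inversion around blocking: each kthread emits "End ...@rcu_wait"
just before sleeping and "Start ...@rcu_wait" just after waking, so that
time spent blocked is excluded from the utilization figures.  In outline:

	for (;;) {
		trace_rcu_utilization("End CPU kthread@rcu_wait");
		rcu_wait(*workp != 0 || kthread_should_stop());
		trace_rcu_utilization("Start CPU kthread@rcu_wait");
		/* ... do the actual callback-invocation work ... */
	}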

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 include/trace/events/rcu.h |    3 +++
 kernel/rcutree_plugin.h    |   12 ++++++++++++
 2 files changed, 15 insertions(+), 0 deletions(-)

diff --git a/include/trace/events/rcu.h b/include/trace/events/rcu.h
index 508824e5..ac52aba 100644
--- a/include/trace/events/rcu.h
+++ b/include/trace/events/rcu.h
@@ -13,6 +13,9 @@
  * "Start <activity>" -- Mark the start of the specified activity,
  *			 such as "context switch".  Nesting is permitted.
  * "End <activity>" -- Mark the end of the specified activity.
+ *
+ * An "@" character within "<activity>" is a comment character: Data
+ * reduction scripts will ignore the "@" and the remainder of the line.
  */
 TRACE_EVENT(rcu_utilization,
 
diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
index ecd48a2..94d9ca1 100644
--- a/kernel/rcutree_plugin.h
+++ b/kernel/rcutree_plugin.h
@@ -1219,9 +1219,12 @@ static int rcu_boost_kthread(void *arg)
 	int spincnt = 0;
 	int more2boost;
 
+	trace_rcu_utilization("Start boost kthread@init");
 	for (;;) {
 		rnp->boost_kthread_status = RCU_KTHREAD_WAITING;
+		trace_rcu_utilization("End boost kthread@rcu_wait");
 		rcu_wait(rnp->boost_tasks || rnp->exp_tasks);
+		trace_rcu_utilization("Start boost kthread@rcu_wait");
 		rnp->boost_kthread_status = RCU_KTHREAD_RUNNING;
 		more2boost = rcu_boost(rnp);
 		if (more2boost)
@@ -1229,11 +1232,14 @@ static int rcu_boost_kthread(void *arg)
 		else
 			spincnt = 0;
 		if (spincnt > 10) {
+			trace_rcu_utilization("End boost kthread@rcu_yield");
 			rcu_yield(rcu_boost_kthread_timer, (unsigned long)rnp);
+			trace_rcu_utilization("Start boost kthread@rcu_yield");
 			spincnt = 0;
 		}
 	}
 	/* NOTREACHED */
+	trace_rcu_utilization("End boost kthread@notreached");
 	return 0;
 }
 
@@ -1490,9 +1496,12 @@ static int rcu_cpu_kthread(void *arg)
 	char work;
 	char *workp = &per_cpu(rcu_cpu_has_work, cpu);
 
+	trace_rcu_utilization("Start CPU kthread@init");
 	for (;;) {
 		*statusp = RCU_KTHREAD_WAITING;
+		trace_rcu_utilization("End CPU kthread@rcu_wait");
 		rcu_wait(*workp != 0 || kthread_should_stop());
+		trace_rcu_utilization("Start CPU kthread@rcu_wait");
 		local_bh_disable();
 		if (rcu_cpu_kthread_should_stop(cpu)) {
 			local_bh_enable();
@@ -1513,11 +1522,14 @@ static int rcu_cpu_kthread(void *arg)
 			spincnt = 0;
 		if (spincnt > 10) {
 			*statusp = RCU_KTHREAD_YIELDING;
+			trace_rcu_utilization("End CPU kthread@rcu_yield");
 			rcu_yield(rcu_cpu_kthread_timer, (unsigned long)cpu);
+			trace_rcu_utilization("Start CPU kthread@rcu_yield");
 			spincnt = 0;
 		}
 	}
 	*statusp = RCU_KTHREAD_STOPPED;
+	trace_rcu_utilization("End CPU kthread@term");
 	return 0;
 }
 
-- 
1.7.3.2


^ permalink raw reply related	[flat|nested] 101+ messages in thread

* [PATCH tip/core/rcu 21/55] rcu: Make TINY_RCU also use softirq for RCU_BOOST=n
  2011-09-06 18:00 [PATCH tip/core/rcu 0/55] Preview of RCU changes for 3.2 Paul E. McKenney
                   ` (19 preceding siblings ...)
  2011-09-06 18:00 ` [PATCH tip/core/rcu 20/55] rcu: Add event-trace markers to TREE_RCU kthreads Paul E. McKenney
@ 2011-09-06 18:00 ` Paul E. McKenney
  2011-09-06 18:00 ` [PATCH tip/core/rcu 22/55] rcu: Add grace-period, quiescent-state, and call_rcu trace events Paul E. McKenney
                   ` (35 subsequent siblings)
  56 siblings, 0 replies; 101+ messages in thread
From: Paul E. McKenney @ 2011-09-06 18:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, patches, Paul E. McKenney, Paul E. McKenney

From: Paul E. McKenney <paul.mckenney@linaro.org>

This patch #ifdefs TINY_RCU kthreads out of the kernel unless RCU_BOOST=y,
thus eliminating context-switch overhead if RCU priority boosting has
not been configured.
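
For RCU_BOOST=n, invoke_rcu_callbacks() can then presumably reduce to a
bare softirq raise (a sketch; the corresponding rcutiny_plugin.h hunk is
not shown in this excerpt):

	static void invoke_rcu_callbacks(void)
	{
		raise_softirq(RCU_SOFTIRQ);
	}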

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 include/linux/rcutiny.h |    4 ++
 kernel/rcutiny.c        |   74 ++++---------------------------
 kernel/rcutiny_plugin.h |  110 +++++++++++++++++++++++++++++++++++-----------
 3 files changed, 97 insertions(+), 91 deletions(-)

diff --git a/include/linux/rcutiny.h b/include/linux/rcutiny.h
index 4eab233..00b7a5e 100644
--- a/include/linux/rcutiny.h
+++ b/include/linux/rcutiny.h
@@ -27,9 +27,13 @@
 
 #include <linux/cache.h>
 
+#ifdef CONFIG_RCU_BOOST
 static inline void rcu_init(void)
 {
 }
+#else /* #ifdef CONFIG_RCU_BOOST */
+void rcu_init(void);
+#endif /* #else #ifdef CONFIG_RCU_BOOST */
 
 static inline void rcu_barrier_bh(void)
 {
diff --git a/kernel/rcutiny.c b/kernel/rcutiny.c
index 1c37bdd..c9321d8 100644
--- a/kernel/rcutiny.c
+++ b/kernel/rcutiny.c
@@ -43,16 +43,11 @@
 
 #include "rcu.h"
 
-/* Controls for rcu_kthread() kthread, replacing RCU_SOFTIRQ used previously. */
-static struct task_struct *rcu_kthread_task;
-static DECLARE_WAIT_QUEUE_HEAD(rcu_kthread_wq);
-static unsigned long have_rcu_kthread_work;
-
 /* Forward declarations for rcutiny_plugin.h. */
 struct rcu_ctrlblk;
-static void invoke_rcu_kthread(void);
-static void rcu_process_callbacks(struct rcu_ctrlblk *rcp);
-static int rcu_kthread(void *arg);
+static void invoke_rcu_callbacks(void);
+static void __rcu_process_callbacks(struct rcu_ctrlblk *rcp);
+static void rcu_process_callbacks(struct softirq_action *unused);
 static void __call_rcu(struct rcu_head *head,
 		       void (*func)(struct rcu_head *rcu),
 		       struct rcu_ctrlblk *rcp);
@@ -102,16 +97,6 @@ static int rcu_qsctr_help(struct rcu_ctrlblk *rcp)
 }
 
 /*
- * Wake up rcu_kthread() to process callbacks now eligible for invocation
- * or to boost readers.
- */
-static void invoke_rcu_kthread(void)
-{
-	have_rcu_kthread_work = 1;
-	wake_up(&rcu_kthread_wq);
-}
-
-/*
  * Record an rcu quiescent state.  And an rcu_bh quiescent state while we
  * are at it, given that any rcu quiescent state is also an rcu_bh
  * quiescent state.  Use "+" instead of "||" to defeat short circuiting.
@@ -123,7 +108,7 @@ void rcu_sched_qs(int cpu)
 	local_irq_save(flags);
 	if (rcu_qsctr_help(&rcu_sched_ctrlblk) +
 	    rcu_qsctr_help(&rcu_bh_ctrlblk))
-		invoke_rcu_kthread();
+		invoke_rcu_callbacks();
 	local_irq_restore(flags);
 }
 
@@ -136,7 +121,7 @@ void rcu_bh_qs(int cpu)
 
 	local_irq_save(flags);
 	if (rcu_qsctr_help(&rcu_bh_ctrlblk))
-		invoke_rcu_kthread();
+		invoke_rcu_callbacks();
 	local_irq_restore(flags);
 }
 
@@ -160,7 +145,7 @@ void rcu_check_callbacks(int cpu, int user)
  * Invoke the RCU callbacks on the specified rcu_ctrlkblk structure
  * whose grace period has elapsed.
  */
-static void rcu_process_callbacks(struct rcu_ctrlblk *rcp)
+static void __rcu_process_callbacks(struct rcu_ctrlblk *rcp)
 {
 	struct rcu_head *next, *list;
 	unsigned long flags;
@@ -200,36 +185,11 @@ static void rcu_process_callbacks(struct rcu_ctrlblk *rcp)
 	RCU_TRACE(trace_rcu_batch_end(rcp->name, cb_count));
 }
 
-/*
- * This kthread invokes RCU callbacks whose grace periods have
- * elapsed.  It is awakened as needed, and takes the place of the
- * RCU_SOFTIRQ that was used previously for this purpose.
- * This is a kthread, but it is never stopped, at least not until
- * the system goes down.
- */
-static int rcu_kthread(void *arg)
+static void rcu_process_callbacks(struct softirq_action *unused)
 {
-	unsigned long work;
-	unsigned long morework;
-	unsigned long flags;
-
-	for (;;) {
-		wait_event_interruptible(rcu_kthread_wq,
-					 have_rcu_kthread_work != 0);
-		morework = rcu_boost();
-		local_irq_save(flags);
-		work = have_rcu_kthread_work;
-		have_rcu_kthread_work = morework;
-		local_irq_restore(flags);
-		if (work) {
-			rcu_process_callbacks(&rcu_sched_ctrlblk);
-			rcu_process_callbacks(&rcu_bh_ctrlblk);
-			rcu_preempt_process_callbacks();
-		}
-		schedule_timeout_interruptible(1); /* Leave CPU for others. */
-	}
-
-	return 0;  /* Not reached, but needed to shut gcc up. */
+	__rcu_process_callbacks(&rcu_sched_ctrlblk);
+	__rcu_process_callbacks(&rcu_bh_ctrlblk);
+	rcu_preempt_process_callbacks();
 }
 
 /*
@@ -291,17 +251,3 @@ void call_rcu_bh(struct rcu_head *head, void (*func)(struct rcu_head *rcu))
 	__call_rcu(head, func, &rcu_bh_ctrlblk);
 }
 EXPORT_SYMBOL_GPL(call_rcu_bh);
-
-/*
- * Spawn the kthread that invokes RCU callbacks.
- */
-static int __init rcu_spawn_kthreads(void)
-{
-	struct sched_param sp;
-
-	rcu_kthread_task = kthread_run(rcu_kthread, NULL, "rcu_kthread");
-	sp.sched_priority = RCU_BOOST_PRIO;
-	sched_setscheduler_nocheck(rcu_kthread_task, SCHED_FIFO, &sp);
-	return 0;
-}
-early_initcall(rcu_spawn_kthreads);
diff --git a/kernel/rcutiny_plugin.h b/kernel/rcutiny_plugin.h
index 791ddf7..02aa713 100644
--- a/kernel/rcutiny_plugin.h
+++ b/kernel/rcutiny_plugin.h
@@ -245,6 +245,13 @@ static void show_tiny_preempt_stats(struct seq_file *m)
 
 #include "rtmutex_common.h"
 
+#define RCU_BOOST_PRIO CONFIG_RCU_BOOST_PRIO
+
+/* Controls for rcu_kthread() kthread. */
+static struct task_struct *rcu_kthread_task;
+static DECLARE_WAIT_QUEUE_HEAD(rcu_kthread_wq);
+static unsigned long have_rcu_kthread_work;
+
 /*
  * Carry out RCU priority boosting on the task indicated by ->boost_tasks,
  * and advance ->boost_tasks to the next task in the ->blkd_tasks list.
@@ -332,7 +339,7 @@ static int rcu_initiate_boost(void)
 		if (rcu_preempt_ctrlblk.exp_tasks == NULL)
 			rcu_preempt_ctrlblk.boost_tasks =
 				rcu_preempt_ctrlblk.gp_tasks;
-		invoke_rcu_kthread();
+		invoke_rcu_callbacks();
 	} else
 		RCU_TRACE(rcu_initiate_boost_trace());
 	return 1;
@@ -351,14 +358,6 @@ static void rcu_preempt_boost_start_gp(void)
 #else /* #ifdef CONFIG_RCU_BOOST */
 
 /*
- * If there is no RCU priority boosting, we don't boost.
- */
-static int rcu_boost(void)
-{
-	return 0;
-}
-
-/*
  * If there is no RCU priority boosting, we don't initiate boosting,
  * but we do indicate whether there are blocked readers blocking the
  * current grace period.
@@ -425,7 +424,7 @@ static void rcu_preempt_cpu_qs(void)
 
 	/* If there are done callbacks, cause them to be invoked. */
 	if (*rcu_preempt_ctrlblk.rcb.donetail != NULL)
-		invoke_rcu_kthread();
+		invoke_rcu_callbacks();
 }
 
 /*
@@ -646,7 +645,7 @@ static void rcu_preempt_check_callbacks(void)
 		rcu_preempt_cpu_qs();
 	if (&rcu_preempt_ctrlblk.rcb.rcucblist !=
 	    rcu_preempt_ctrlblk.rcb.donetail)
-		invoke_rcu_kthread();
+		invoke_rcu_callbacks();
 	if (rcu_preempt_gp_in_progress() &&
 	    rcu_cpu_blocking_cur_gp() &&
 	    rcu_preempt_running_reader())
@@ -672,7 +671,7 @@ static void rcu_preempt_remove_callbacks(struct rcu_ctrlblk *rcp)
  */
 static void rcu_preempt_process_callbacks(void)
 {
-	rcu_process_callbacks(&rcu_preempt_ctrlblk.rcb);
+	__rcu_process_callbacks(&rcu_preempt_ctrlblk.rcb);
 }
 
 /*
@@ -848,15 +847,6 @@ static void show_tiny_preempt_stats(struct seq_file *m)
 #endif /* #ifdef CONFIG_RCU_TRACE */
 
 /*
- * Because preemptible RCU does not exist, it is never necessary to
- * boost preempted RCU readers.
- */
-static int rcu_boost(void)
-{
-	return 0;
-}
-
-/*
  * Because preemptible RCU does not exist, it never has any callbacks
  * to check.
  */
@@ -882,6 +872,78 @@ static void rcu_preempt_process_callbacks(void)
 
 #endif /* #else #ifdef CONFIG_TINY_PREEMPT_RCU */
 
+#ifdef CONFIG_RCU_BOOST
+
+/*
+ * Wake up rcu_kthread() to process callbacks now eligible for invocation
+ * or to boost readers.
+ */
+static void invoke_rcu_callbacks(void)
+{
+	have_rcu_kthread_work = 1;
+	wake_up(&rcu_kthread_wq);
+}
+
+/*
+ * This kthread invokes RCU callbacks whose grace periods have
+ * elapsed.  It is awakened as needed, and takes the place of the
+ * RCU_SOFTIRQ that is used for this purpose when boosting is disabled.
+ * This is a kthread, but it is never stopped, at least not until
+ * the system goes down.
+ */
+static int rcu_kthread(void *arg)
+{
+	unsigned long work;
+	unsigned long morework;
+	unsigned long flags;
+
+	for (;;) {
+		wait_event_interruptible(rcu_kthread_wq,
+					 have_rcu_kthread_work != 0);
+		morework = rcu_boost();
+		local_irq_save(flags);
+		work = have_rcu_kthread_work;
+		have_rcu_kthread_work = morework;
+		local_irq_restore(flags);
+		if (work)
+			rcu_process_callbacks(NULL);
+		schedule_timeout_interruptible(1); /* Leave CPU for others. */
+	}
+
+	return 0;  /* Not reached, but needed to shut gcc up. */
+}
+
+/*
+ * Spawn the kthread that invokes RCU callbacks.
+ */
+static int __init rcu_spawn_kthreads(void)
+{
+	struct sched_param sp;
+
+	rcu_kthread_task = kthread_run(rcu_kthread, NULL, "rcu_kthread");
+	sp.sched_priority = RCU_BOOST_PRIO;
+	sched_setscheduler_nocheck(rcu_kthread_task, SCHED_FIFO, &sp);
+	return 0;
+}
+early_initcall(rcu_spawn_kthreads);
+
+#else /* #ifdef CONFIG_RCU_BOOST */
+
+/*
+ * Start up softirq processing of callbacks.
+ */
+void invoke_rcu_callbacks(void)
+{
+	raise_softirq(RCU_SOFTIRQ);
+}
+
+void rcu_init(void)
+{
+	open_softirq(RCU_SOFTIRQ, rcu_process_callbacks);
+}
+
+#endif /* #else #ifdef CONFIG_RCU_BOOST */
+
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
 #include <linux/kernel_stat.h>
 
@@ -897,12 +959,6 @@ void __init rcu_scheduler_starting(void)
 
 #endif /* #ifdef CONFIG_DEBUG_LOCK_ALLOC */
 
-#ifdef CONFIG_RCU_BOOST
-#define RCU_BOOST_PRIO CONFIG_RCU_BOOST_PRIO
-#else /* #ifdef CONFIG_RCU_BOOST */
-#define RCU_BOOST_PRIO 1
-#endif /* #else #ifdef CONFIG_RCU_BOOST */
-
 #ifdef CONFIG_RCU_TRACE
 
 #ifdef CONFIG_RCU_BOOST
-- 
1.7.3.2


^ permalink raw reply related	[flat|nested] 101+ messages in thread

* [PATCH tip/core/rcu 0/55] Preview of RCU changes for 3.2
@ 2011-09-06 18:00 Paul E. McKenney
  2011-09-06 17:59 ` [PATCH tip/core/rcu 01/55] rcu: Use kthread_create_on_node() Paul E. McKenney
                   ` (56 more replies)
  0 siblings, 57 replies; 101+ messages in thread
From: Paul E. McKenney @ 2011-09-06 18:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, patches

Hello!

This patchset adds RCU event tracing, improved diagnostics and
documentation, and fixes a number of bugs, including several from an
ongoing top-to-bottom inspection of RCU.  The patches are as follows:

1.	Place per-CPU kthreads' stack and task struct on the corresponding
	node on NUMA systems (courtesy of Eric Dumazet).
2.	Avoid unnecessary self-wakeups for per-CPU kthreads
	(courtesy of Shaohua Li).
3,6,10,12,25,28,33.
	Documentation updates (some courtesy of Wanlong Gao).
4.	Add replacement checks for blocking within an RCU read-side
	critical section.
5.	Header-file untangling part 1 of N: move rcu_head to types.h.
7.	Fix mismatched variable declaration (courtesy of Andi Kleen).
8.	Abstract out common grace-period-primitive code.
9.	Update rcutorture to test newish RCU API members.
11.	Drive RCU algorithm selection directly from SMP and PREEMPT.
13.	Make rcu_torture_boost() wait for callbacks before telling
	debug-objects that they are done.
14-17,20,22.
	Add event tracing for RCU.
18.	Update comments to reflect kthreads being used only when
	RCU priority boosting is enabled.
19.	Move RCU_BOOST data declarations to allow the compiler to detect
	mismatches.
21.	Make TINY_RCU use softirqs for RCU_BOOST=n.
23.	Simplify quiescent-state accounting.
24.	Stop passing rcu_read_lock_held() to rcu_dereference_protected()
	(courtesy of Michal Hocko).
26.	Remove unused and redundant RCU API members.
27.	Allow rcutorture's stat_interval parameter to be changed at runtime
	to make it easier to test RCU in guest OSes.
29.	Remove unused nohz_cpu_mask (courtesy of Alex Shi).
30.	Eliminate in_irq() checks in rcu_enter_nohz().
31.	Fix rcu_implicit_dynticks_qs() local-variable size mismatches.
32.	Make rcu_assign_pointer() unconditionally emit a memory barrier
	to silence new gcc warnings (courtesy of Eric Dumazet); see the
	sketch after this list.
34.	Move __rcu_read_lock()'s barrier within if-statement.
35.	Dump the local stack for CPU stall warnings when unable to dump
	all stacks.
36.	Prevent early-boot set_need_resched() from __rcu_pending().
37.	Simplify unboosting checks.
38.	Prohibit RCU grace periods during early boot.
39.	Suppress NMI backtraces when CPU stall ends before dump.
40.	Avoid a just-online CPU needlessly rescheduling itself.
41.	Permit rt_mutex_unlock() with irqs disabled.
42-43.	Prevent end-of-test rcutorture hangs.
44.	Wire up RCU_BOOST_PRIO, use conventional kthread naming scheme
	(courtesy of Mike Galbraith).
45.	Check for entering dyntick-idle in RCU read-side critical section.
46.	Adjust RCU_FAST_NO_HZ to avoid false quiescent states.
47.	Avoid concurrent end of old GP with start of new GP.
48.	Strengthen powerpc value-returning atomic memory ordering.
49-51.	Detect illegal RCU use from dyntick-idle mode (courtesy of
	Frederic Weisbecker).
52.	Remove an unnecessary layer of abstraction from PROVE_RCU checking.
53.	Detect illegal SRCU use from dyntick-idle mode.
54.	Make SRCU use common lockdep-splat code.
55.	Placeholder patch that disables illegal tracing from dyntick-idle
	mode (illegal because tracing uses RCU).

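As promised in item 32, here is a minimal sketch of the publish side
that rcu_assign_pointer() must order.  The structure and variable
names are hypothetical, not taken from this series:

	#include <linux/rcupdate.h>

	struct foo {
		int a;
	};
	static struct foo __rcu *gp;	/* RCU-protected global pointer. */

	void publish_foo(struct foo *p)
	{
		p->a = 1;			/* Initialize first... */
		rcu_assign_pointer(gp, p);	/* ...then publish.  The
						   memory barrier orders the
						   initialization before the
						   pointer update. */
	}
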
For a testing-only version of this patchset from git, please see the
following subject-to-rebase (and subject-to-Hera-availability) branch:

git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-2.6-rcu.git rcu/testing

							Thanx, Paul

------------------------------------------------------------------------

 Documentation/RCU/NMI-RCU.txt           |    4 
 Documentation/RCU/lockdep.txt           |   24 +
 Documentation/RCU/torture.txt           |    3 
 Documentation/RCU/trace.txt             |   34 +-
 b/Documentation/RCU/NMI-RCU.txt         |    2 
 b/Documentation/RCU/lockdep-splat.txt   |  110 +++++++
 b/Documentation/RCU/lockdep.txt         |   10 
 b/Documentation/RCU/torture.txt         |  134 +++++++--
 b/Documentation/RCU/trace.txt           |    4 
 b/arch/powerpc/include/asm/synch.h      |    6 
 b/arch/powerpc/platforms/pseries/lpar.c |    6 
 b/include/linux/lockdep.h               |    2 
 b/include/linux/rcupdate.h              |   28 +
 b/include/linux/rcutiny.h               |   16 +
 b/include/linux/rcutree.h               |    2 
 b/include/linux/sched.h                 |    1 
 b/include/linux/srcu.h                  |   25 +
 b/include/linux/types.h                 |   10 
 b/include/trace/events/rcu.h            |   98 ++++++
 b/init/Kconfig                          |    6 
 b/kernel/lockdep.c                      |   84 +++--
 b/kernel/pid.c                          |    4 
 b/kernel/rcu.h                          |   79 +++++
 b/kernel/rcupdate.c                     |   21 +
 b/kernel/rcutiny.c                      |   28 -
 b/kernel/rcutiny_plugin.h               |   14 
 b/kernel/rcutorture.c                   |    5 
 b/kernel/rcutree.c                      |   22 -
 b/kernel/rcutree.h                      |    7 
 b/kernel/rcutree_plugin.h               |    5 
 b/kernel/rcutree_trace.c                |    2 
 b/kernel/rtmutex.c                      |    8 
 b/kernel/sched.c                        |    2 
 b/kernel/time/tick-sched.c              |    6 
 include/linux/rcupdate.h                |  420 ++++++++++++++---------------
 include/linux/rcutiny.h                 |    4 
 include/linux/sched.h                   |    3 
 include/linux/srcu.h                    |    5 
 include/trace/events/rcu.h              |  449 ++++++++++++++++++++++++++++----
 kernel/lockdep.c                        |   20 +
 kernel/rcu.h                            |   16 -
 kernel/rcupdate.c                       |   22 +
 kernel/rcutiny.c                        |  139 +++------
 kernel/rcutiny_plugin.h                 |  120 ++++++--
 kernel/rcutorture.c                     |   72 ++---
 kernel/rcutree.c                        |  313 +++++++++++++++-------
 kernel/rcutree.h                        |   10 
 kernel/rcutree_plugin.h                 |  150 +++++-----
 kernel/rcutree_trace.c                  |   13 
 kernel/sched.c                          |   11 
 50 files changed, 1761 insertions(+), 818 deletions(-)

^ permalink raw reply	[flat|nested] 101+ messages in thread

* [PATCH tip/core/rcu 22/55] rcu: Add grace-period, quiescent-state, and call_rcu trace events
  2011-09-06 18:00 [PATCH tip/core/rcu 0/55] Preview of RCU changes for 3.2 Paul E. McKenney
                   ` (20 preceding siblings ...)
  2011-09-06 18:00 ` [PATCH tip/core/rcu 21/55] rcu: Make TINY_RCU also use softirq for RCU_BOOST=n Paul E. McKenney
@ 2011-09-06 18:00 ` Paul E. McKenney
  2011-10-17  1:33   ` Josh Triplett
  2011-09-06 18:00 ` [PATCH tip/core/rcu 23/55] rcu: Simplify quiescent-state accounting Paul E. McKenney
                   ` (34 subsequent siblings)
  56 siblings, 1 reply; 101+ messages in thread
From: Paul E. McKenney @ 2011-09-06 18:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, patches, Paul E. McKenney, Paul E. McKenney

From: Paul E. McKenney <paul.mckenney@linaro.org>

Add trace events to record grace-period start and end, quiescent states,
CPUs noticing grace-period start and end, grace-period initialization,
call_rcu() invocation, tasks blocking in RCU read-side critical sections,
tasks exiting those same critical sections, force_quiescent_state()
detection of dyntick-idle and offline CPUs, CPUs entering and leaving
dyntick-idle mode (except from NMIs), CPUs coming online and going
offline, and CPUs being kicked for staying in dyntick-idle mode for too
long (as in many weeks, even on 32-bit systems).

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

rcu: Add the rcu flavor to callback trace events

The earlier trace events for registering RCU callbacks and for invoking
them did not include the RCU flavor (rcu_bh, rcu_preempt, or rcu_sched).
This commit adds the RCU flavor to those trace events.
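
As a usage sketch, emitting one of these events from the grace-period
code is a single call at the point of interest.  This is condensed
from the rcutree.c hunks below:

	/* When starting a grace period: */
	rsp->gpnum++;
	trace_rcu_grace_period(rsp->name, rsp->gpnum, "start");

	/* And when the grace period completes: */
	rsp->completed = rsp->gpnum;
	trace_rcu_grace_period(rsp->name, rsp->completed, "end");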

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 include/trace/events/rcu.h |  345 ++++++++++++++++++++++++++++++++++++++++++--
 kernel/rcu.h               |    6 +-
 kernel/rcutiny.c           |    4 +-
 kernel/rcutree.c           |   45 ++++++-
 kernel/rcutree.h           |    1 +
 kernel/rcutree_plugin.h    |   22 +++-
 6 files changed, 399 insertions(+), 24 deletions(-)

diff --git a/include/trace/events/rcu.h b/include/trace/events/rcu.h
index ac52aba..669fbd6 100644
--- a/include/trace/events/rcu.h
+++ b/include/trace/events/rcu.h
@@ -24,7 +24,7 @@ TRACE_EVENT(rcu_utilization,
 	TP_ARGS(s),
 
 	TP_STRUCT__entry(
-		__field(char *,	s)
+		__field(char *, s)
 	),
 
 	TP_fast_assign(
@@ -34,6 +34,297 @@ TRACE_EVENT(rcu_utilization,
 	TP_printk("%s", __entry->s)
 );
 
+#ifdef CONFIG_RCU_TRACE
+
+#if defined(CONFIG_TREE_RCU) || defined(CONFIG_TREE_PREEMPT_RCU)
+
+/*
+ * Tracepoint for grace-period events: starting and ending a grace
+ * period ("start" and "end", respectively), a CPU noting the start
+ * of a new grace period or the end of an old grace period ("cpustart"
+ * and "cpuend", respectively), a CPU passing through a quiescent
+ * state ("cpuqs"), a CPU coming online or going offline ("cpuonl"
+ * and "cpuofl", respectively), and a CPU being kicked for being too
+ * long in dyntick-idle mode ("kick").
+ */
+TRACE_EVENT(rcu_grace_period,
+
+	TP_PROTO(char *rcuname, unsigned long gpnum, char *gpevent),
+
+	TP_ARGS(rcuname, gpnum, gpevent),
+
+	TP_STRUCT__entry(
+		__field(char *, rcuname)
+		__field(unsigned long, gpnum)
+		__field(char *, gpevent)
+	),
+
+	TP_fast_assign(
+		__entry->rcuname = rcuname;
+		__entry->gpnum = gpnum;
+		__entry->gpevent = gpevent;
+	),
+
+	TP_printk("%s %lu %s",
+		  __entry->rcuname, __entry->gpnum, __entry->gpevent)
+);
+
+/*
+ * Tracepoint for grace-period-initialization events.  These are
+ * distinguished by the type of RCU, the new grace-period number, the
+ * rcu_node structure level, the starting and ending CPU covered by the
+ * rcu_node structure, and the mask of CPUs that will be waited for.
+ * All but the type of RCU are extracted from the rcu_node structure.
+ */
+TRACE_EVENT(rcu_grace_period_init,
+
+	TP_PROTO(char *rcuname, unsigned long gpnum, u8 level,
+		 int grplo, int grphi, unsigned long qsmask),
+
+	TP_ARGS(rcuname, gpnum, level, grplo, grphi, qsmask),
+
+	TP_STRUCT__entry(
+		__field(char *, rcuname)
+		__field(unsigned long, gpnum)
+		__field(u8, level)
+		__field(int, grplo)
+		__field(int, grphi)
+		__field(unsigned long, qsmask)
+	),
+
+	TP_fast_assign(
+		__entry->rcuname = rcuname;
+		__entry->gpnum = gpnum;
+		__entry->level = level;
+		__entry->grplo = grplo;
+		__entry->grphi = grphi;
+		__entry->qsmask = qsmask;
+	),
+
+	TP_printk("%s %lu %u %d %d %lx",
+		  __entry->rcuname, __entry->gpnum, __entry->level,
+		  __entry->grplo, __entry->grphi, __entry->qsmask)
+);
+
+/*
+ * Tracepoint for tasks blocking within preemptible-RCU read-side
+ * critical sections.  Track the type of RCU (which one day might
+ * include SRCU), the grace-period number that the task is blocking
+ * (the current or the next), and the task's PID.
+ */
+TRACE_EVENT(rcu_preempt_task,
+
+	TP_PROTO(char *rcuname, int pid, unsigned long gpnum),
+
+	TP_ARGS(rcuname, pid, gpnum),
+
+	TP_STRUCT__entry(
+		__field(char *, rcuname)
+		__field(unsigned long, gpnum)
+		__field(int, pid)
+	),
+
+	TP_fast_assign(
+		__entry->rcuname = rcuname;
+		__entry->gpnum = gpnum;
+		__entry->pid = pid;
+	),
+
+	TP_printk("%s %lu %d",
+		  __entry->rcuname, __entry->gpnum, __entry->pid)
+);
+
+/*
+ * Tracepoint for tasks that blocked within a given preemptible-RCU
+ * read-side critical section exiting that critical section.  Track the
+ * type of RCU (which one day might include SRCU) and the task's PID.
+ */
+TRACE_EVENT(rcu_unlock_preempted_task,
+
+	TP_PROTO(char *rcuname, unsigned long gpnum, int pid),
+
+	TP_ARGS(rcuname, gpnum, pid),
+
+	TP_STRUCT__entry(
+		__field(char *, rcuname)
+		__field(unsigned long, gpnum)
+		__field(int, pid)
+	),
+
+	TP_fast_assign(
+		__entry->rcuname = rcuname;
+		__entry->gpnum = gpnum;
+		__entry->pid = pid;
+	),
+
+	TP_printk("%s %lu %d", __entry->rcuname, __entry->gpnum, __entry->pid)
+);
+
+/*
+ * Tracepoint for quiescent-state-reporting events.  These are
+ * distinguished by the type of RCU, the grace-period number, the
+ * mask of quiescent lower-level entities, the rcu_node structure level,
+ * the starting and ending CPU covered by the rcu_node structure, and
+ * whether there are any blocked tasks blocking the current grace period.
+ * All but the type of RCU are extracted from the rcu_node structure.
+ */
+TRACE_EVENT(rcu_quiescent_state_report,
+
+	TP_PROTO(char *rcuname, unsigned long gpnum,
+		 unsigned long mask, unsigned long qsmask,
+		 u8 level, int grplo, int grphi, int gp_tasks),
+
+	TP_ARGS(rcuname, gpnum, mask, qsmask, level, grplo, grphi, gp_tasks),
+
+	TP_STRUCT__entry(
+		__field(char *, rcuname)
+		__field(unsigned long, gpnum)
+		__field(unsigned long, mask)
+		__field(unsigned long, qsmask)
+		__field(u8, level)
+		__field(int, grplo)
+		__field(int, grphi)
+		__field(u8, gp_tasks)
+	),
+
+	TP_fast_assign(
+		__entry->rcuname = rcuname;
+		__entry->gpnum = gpnum;
+		__entry->mask = mask;
+		__entry->qsmask = qsmask;
+		__entry->level = level;
+		__entry->grplo = grplo;
+		__entry->grphi = grphi;
+		__entry->gp_tasks = gp_tasks;
+	),
+
+	TP_printk("%s %lu %lx>%lx %u %d %d %u",
+		  __entry->rcuname, __entry->gpnum,
+		  __entry->mask, __entry->qsmask, __entry->level,
+		  __entry->grplo, __entry->grphi, __entry->gp_tasks)
+);
+
+/*
+ * Tracepoint for quiescent states detected by force_quiescent_state().
+ * These trace events include the type of RCU, the grace-period number
+ * that was blocked by the CPU, the CPU itself, and the type of quiescent
+ * state, which can be "dti" for dyntick-idle mode, "ofl" for CPU offline,
+ * or "kick" when kicking a CPU that has been in dyntick-idle mode for
+ * too long.
+ */
+TRACE_EVENT(rcu_fqs,
+
+	TP_PROTO(char *rcuname, unsigned long gpnum, int cpu, char *qsevent),
+
+	TP_ARGS(rcuname, gpnum, cpu, qsevent),
+
+	TP_STRUCT__entry(
+		__field(char *, rcuname)
+		__field(unsigned long, gpnum)
+		__field(int, cpu)
+		__field(char *, qsevent)
+	),
+
+	TP_fast_assign(
+		__entry->rcuname = rcuname;
+		__entry->gpnum = gpnum;
+		__entry->cpu = cpu;
+		__entry->qsevent = qsevent;
+	),
+
+	TP_printk("%s %lu %d %s",
+		  __entry->rcuname, __entry->gpnum,
+		  __entry->cpu, __entry->qsevent)
+);
+
+#endif /* #if defined(CONFIG_TREE_RCU) || defined(CONFIG_TREE_PREEMPT_RCU) */
+
+/*
+ * Tracepoint for dyntick-idle entry/exit events.  These take a string
+ * as argument: "Start" for entering dyntick-idle mode and "End" for
+ * leaving it.
+ */
+TRACE_EVENT(rcu_dyntick,
+
+	TP_PROTO(char *polarity),
+
+	TP_ARGS(polarity),
+
+	TP_STRUCT__entry(
+		__field(char *, polarity)
+	),
+
+	TP_fast_assign(
+		__entry->polarity = polarity;
+	),
+
+	TP_printk("%s", __entry->polarity)
+);
+
+/*
+ * Tracepoint for the registration of a single RCU callback function.
+ * The first argument is the type of RCU, the second argument is
+ * a pointer to the RCU callback itself, and the third element is the
+ * new RCU callback queue length for the current CPU.
+ */
+TRACE_EVENT(rcu_callback,
+
+	TP_PROTO(char *rcuname, struct rcu_head *rhp, long qlen),
+
+	TP_ARGS(rcuname, rhp, qlen),
+
+	TP_STRUCT__entry(
+		__field(char *, rcuname)
+		__field(void *, rhp)
+		__field(void *, func)
+		__field(long, qlen)
+	),
+
+	TP_fast_assign(
+		__entry->rcuname = rcuname;
+		__entry->rhp = rhp;
+		__entry->func = rhp->func;
+		__entry->qlen = qlen;
+	),
+
+	TP_printk("%s rhp=%p func=%pf %ld",
+		  __entry->rcuname, __entry->rhp, __entry->func, __entry->qlen)
+);
+
+/*
+ * Tracepoint for the registration of a single RCU callback of the special
+ * kfree() form.  The first argument is the RCU type, the second argument
+ * is a pointer to the RCU callback, the third argument is the offset
+ * of the callback within the enclosing RCU-protected data structure,
+ * and the fourth argument is the new RCU callback queue length for the
+ * current CPU.
+ */
+TRACE_EVENT(rcu_kfree_callback,
+
+	TP_PROTO(char *rcuname, struct rcu_head *rhp, unsigned long offset,
+		 long qlen),
+
+	TP_ARGS(rcuname, rhp, offset, qlen),
+
+	TP_STRUCT__entry(
+		__field(char *, rcuname)
+		__field(void *, rhp)
+		__field(unsigned long, offset)
+		__field(long, qlen)
+	),
+
+	TP_fast_assign(
+		__entry->rcuname = rcuname;
+		__entry->rhp = rhp;
+		__entry->offset = offset;
+		__entry->qlen = qlen;
+	),
+
+	TP_printk("%s rhp=%p func=%ld %ld",
+		  __entry->rcuname, __entry->rhp, __entry->offset,
+		  __entry->qlen)
+);
+
 /*
  * Tracepoint for marking the beginning rcu_do_batch, performed to start
  * RCU callback invocation.  The first argument is the RCU flavor,
@@ -65,50 +356,58 @@ TRACE_EVENT(rcu_batch_start,
 
 /*
  * Tracepoint for the invocation of a single RCU callback function.
- * The argument is a pointer to the RCU callback itself.
+ * The first argument is the type of RCU, and the second argument is
+ * a pointer to the RCU callback itself.
  */
 TRACE_EVENT(rcu_invoke_callback,
 
-	TP_PROTO(struct rcu_head *rhp),
+	TP_PROTO(char *rcuname, struct rcu_head *rhp),
 
-	TP_ARGS(rhp),
+	TP_ARGS(rcuname, rhp),
 
 	TP_STRUCT__entry(
-		__field(void *,	rhp)
-		__field(void *,	func)
+		__field(char *, rcuname)
+		__field(void *, rhp)
+		__field(void *, func)
 	),
 
 	TP_fast_assign(
+		__entry->rcuname = rcuname;
 		__entry->rhp = rhp;
 		__entry->func = rhp->func;
 	),
 
-	TP_printk("rhp=%p func=%pf", __entry->rhp, __entry->func)
+	TP_printk("%s rhp=%p func=%pf",
+		  __entry->rcuname, __entry->rhp, __entry->func)
 );
 
 /*
  * Tracepoint for the invocation of a single RCU callback of the special
- * kfree() form.  The first argument is a pointer to the RCU callback
- * and the second argument is the offset of the callback within the
- * enclosing RCU-protected data structure.
+ * kfree() form.  The first argument is the RCU flavor, the second
+ * argument is a pointer to the RCU callback, and the third argument
+ * is the offset of the callback within the enclosing RCU-protected
+ * data structure.
  */
 TRACE_EVENT(rcu_invoke_kfree_callback,
 
-	TP_PROTO(struct rcu_head *rhp, unsigned long offset),
+	TP_PROTO(char *rcuname, struct rcu_head *rhp, unsigned long offset),
 
-	TP_ARGS(rhp, offset),
+	TP_ARGS(rcuname, rhp, offset),
 
 	TP_STRUCT__entry(
-		__field(void *,	rhp)
+		__field(char *, rcuname)
+		__field(void *, rhp)
 		__field(unsigned long, offset)
 	),
 
 	TP_fast_assign(
+		__entry->rcuname = rcuname;
 		__entry->rhp = rhp;
 		__entry->offset	= offset;
 	),
 
-	TP_printk("rhp=%p func=%ld", __entry->rhp, __entry->offset)
+	TP_printk("%s rhp=%p func=%ld",
+		  __entry->rcuname, __entry->rhp, __entry->offset)
 );
 
 /*
@@ -136,6 +435,24 @@ TRACE_EVENT(rcu_batch_end,
 		  __entry->rcuname, __entry->callbacks_invoked)
 );
 
+#else /* #ifdef CONFIG_RCU_TRACE */
+
+#define trace_rcu_grace_period(rcuname, gpnum, gpevent) do { } while (0)
+#define trace_rcu_grace_period_init(rcuname, gpnum, level, grplo, grphi, qsmask) do { } while (0)
+#define trace_rcu_preempt_task(rcuname, pid, gpnum) do { } while (0)
+#define trace_rcu_unlock_preempted_task(rcuname, gpnum, pid) do { } while (0)
+#define trace_rcu_quiescent_state_report(rcuname, gpnum, mask, qsmask, level, grplo, grphi, gp_tasks) do { } while (0)
+#define trace_rcu_fqs(rcuname, gpnum, cpu, qsevent) do { } while (0)
+#define trace_rcu_dyntick(polarity) do { } while (0)
+#define trace_rcu_callback(rcuname, rhp, qlen) do { } while (0)
+#define trace_rcu_kfree_callback(rcuname, rhp, offset, qlen) do { } while (0)
+#define trace_rcu_batch_start(rcuname, qlen, blimit) do { } while (0)
+#define trace_rcu_invoke_callback(rcuname, rhp) do { } while (0)
+#define trace_rcu_invoke_kfree_callback(rcuname, rhp, offset) do { } while (0)
+#define trace_rcu_batch_end(rcuname, callbacks_invoked) do { } while (0)
+
+#endif /* #else #ifdef CONFIG_RCU_TRACE */
+
 #endif /* _TRACE_RCU_H */
 
 /* This part must be outside protection */
diff --git a/kernel/rcu.h b/kernel/rcu.h
index d7f00ec..f600868 100644
--- a/kernel/rcu.h
+++ b/kernel/rcu.h
@@ -69,15 +69,15 @@ static inline void debug_rcu_head_unqueue(struct rcu_head *head)
 
 extern void kfree(const void *);
 
-static inline void __rcu_reclaim(struct rcu_head *head)
+static inline void __rcu_reclaim(char *rn, struct rcu_head *head)
 {
 	unsigned long offset = (unsigned long)head->func;
 
 	if (__is_kfree_rcu_offset(offset)) {
-		RCU_TRACE(trace_rcu_invoke_kfree_callback(head, offset));
+		RCU_TRACE(trace_rcu_invoke_kfree_callback(rn, head, offset));
 		kfree((void *)head - offset);
 	} else {
-		RCU_TRACE(trace_rcu_invoke_callback(head));
+		RCU_TRACE(trace_rcu_invoke_callback(rn, head));
 		head->func(head);
 	}
 }
diff --git a/kernel/rcutiny.c b/kernel/rcutiny.c
index c9321d8..da775c8 100644
--- a/kernel/rcutiny.c
+++ b/kernel/rcutiny.c
@@ -147,6 +147,7 @@ void rcu_check_callbacks(int cpu, int user)
  */
 static void __rcu_process_callbacks(struct rcu_ctrlblk *rcp)
 {
+	char *rn = NULL;
 	struct rcu_head *next, *list;
 	unsigned long flags;
 	RCU_TRACE(int cb_count = 0);
@@ -171,12 +172,13 @@ static void __rcu_process_callbacks(struct rcu_ctrlblk *rcp)
 	local_irq_restore(flags);
 
 	/* Invoke the callbacks on the local list. */
+	RCU_TRACE(rn = rcp->name);
 	while (list) {
 		next = list->next;
 		prefetch(next);
 		debug_rcu_head_unqueue(list);
 		local_bh_disable();
-		__rcu_reclaim(list);
+		__rcu_reclaim(rn, list);
 		local_bh_enable();
 		list = next;
 		RCU_TRACE(cb_count++);
diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index 4e24399..7e02829 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -166,6 +166,8 @@ void rcu_sched_qs(int cpu)
 
 	rdp->passed_quiesc_completed = rdp->gpnum - 1;
 	barrier();
+	if (rdp->passed_quiesc == 0)
+		trace_rcu_grace_period("rcu_sched", rdp->gpnum, "cpuqs");
 	rdp->passed_quiesc = 1;
 }
 
@@ -175,6 +177,8 @@ void rcu_bh_qs(int cpu)
 
 	rdp->passed_quiesc_completed = rdp->gpnum - 1;
 	barrier();
+	if (rdp->passed_quiesc == 0)
+		trace_rcu_grace_period("rcu_bh", rdp->gpnum, "cpuqs");
 	rdp->passed_quiesc = 1;
 }
 
@@ -319,6 +323,7 @@ static int rcu_implicit_offline_qs(struct rcu_data *rdp)
 	 * trust its state not to change because interrupts are disabled.
 	 */
 	if (cpu_is_offline(rdp->cpu)) {
+		trace_rcu_fqs(rdp->rsp->name, rdp->gpnum, rdp->cpu, "ofl");
 		rdp->offline_fqs++;
 		return 1;
 	}
@@ -359,6 +364,7 @@ void rcu_enter_nohz(void)
 		local_irq_restore(flags);
 		return;
 	}
+	trace_rcu_dyntick("Start");
 	/* CPUs seeing atomic_inc() must see prior RCU read-side crit sects */
 	smp_mb__before_atomic_inc();  /* See above. */
 	atomic_inc(&rdtp->dynticks);
@@ -396,6 +402,7 @@ void rcu_exit_nohz(void)
 	/* CPUs seeing atomic_inc() must see later RCU read-side crit sects */
 	smp_mb__after_atomic_inc();  /* See above. */
 	WARN_ON_ONCE(!(atomic_read(&rdtp->dynticks) & 0x1));
+	trace_rcu_dyntick("End");
 	local_irq_restore(flags);
 }
 
@@ -501,6 +508,7 @@ static int rcu_implicit_dynticks_qs(struct rcu_data *rdp)
 	 * of the current RCU grace period.
 	 */
 	if ((curr & 0x1) == 0 || ULONG_CMP_GE(curr, snap + 2)) {
+		trace_rcu_fqs(rdp->rsp->name, rdp->gpnum, rdp->cpu, "dti");
 		rdp->dynticks_fqs++;
 		return 1;
 	}
@@ -683,6 +691,7 @@ static void __note_new_gpnum(struct rcu_state *rsp, struct rcu_node *rnp, struct
 		 * go looking for one.
 		 */
 		rdp->gpnum = rnp->gpnum;
+		trace_rcu_grace_period(rsp->name, rdp->gpnum, "cpustart");
 		if (rnp->qsmask & rdp->grpmask) {
 			rdp->qs_pending = 1;
 			rdp->passed_quiesc = 0;
@@ -746,6 +755,7 @@ __rcu_process_gp_end(struct rcu_state *rsp, struct rcu_node *rnp, struct rcu_dat
 
 		/* Remember that we saw this grace-period completion. */
 		rdp->completed = rnp->completed;
+		trace_rcu_grace_period(rsp->name, rdp->gpnum, "cpuend");
 
 		/*
 		 * If we were in an extended quiescent state, we may have
@@ -856,6 +866,7 @@ rcu_start_gp(struct rcu_state *rsp, unsigned long flags)
 
 	/* Advance to a new grace period and initialize state. */
 	rsp->gpnum++;
+	trace_rcu_grace_period(rsp->name, rsp->gpnum, "start");
 	WARN_ON_ONCE(rsp->signaled == RCU_GP_INIT);
 	rsp->signaled = RCU_GP_INIT; /* Hold off force_quiescent_state. */
 	rsp->jiffies_force_qs = jiffies + RCU_JIFFIES_TILL_FORCE_QS;
@@ -870,6 +881,9 @@ rcu_start_gp(struct rcu_state *rsp, unsigned long flags)
 		rsp->signaled = RCU_SIGNAL_INIT; /* force_quiescent_state OK. */
 		rcu_start_gp_per_cpu(rsp, rnp, rdp);
 		rcu_preempt_boost_start_gp(rnp);
+		trace_rcu_grace_period_init(rsp->name, rnp->gpnum,
+					    rnp->level, rnp->grplo,
+					    rnp->grphi, rnp->qsmask);
 		raw_spin_unlock_irqrestore(&rnp->lock, flags);
 		return;
 	}
@@ -906,6 +920,9 @@ rcu_start_gp(struct rcu_state *rsp, unsigned long flags)
 		if (rnp == rdp->mynode)
 			rcu_start_gp_per_cpu(rsp, rnp, rdp);
 		rcu_preempt_boost_start_gp(rnp);
+		trace_rcu_grace_period_init(rsp->name, rnp->gpnum,
+					    rnp->level, rnp->grplo,
+					    rnp->grphi, rnp->qsmask);
 		raw_spin_unlock(&rnp->lock);	/* irqs remain disabled. */
 	}
 
@@ -939,6 +956,7 @@ static void rcu_report_qs_rsp(struct rcu_state *rsp, unsigned long flags)
 	if (gp_duration > rsp->gp_max)
 		rsp->gp_max = gp_duration;
 	rsp->completed = rsp->gpnum;
+	trace_rcu_grace_period(rsp->name, rsp->completed, "end");
 	rsp->signaled = RCU_GP_IDLE;
 	rcu_start_gp(rsp, flags);  /* releases root node's rnp->lock. */
 }
@@ -967,6 +985,10 @@ rcu_report_qs_rnp(unsigned long mask, struct rcu_state *rsp,
 			return;
 		}
 		rnp->qsmask &= ~mask;
+		trace_rcu_quiescent_state_report(rsp->name, rnp->gpnum,
+						 mask, rnp->qsmask, rnp->level,
+						 rnp->grplo, rnp->grphi,
+						 !!rnp->gp_tasks);
 		if (rnp->qsmask != 0 || rcu_preempt_blocked_readers_cgp(rnp)) {
 
 			/* Other bits still set at this level, so done. */
@@ -1135,11 +1157,20 @@ static void __rcu_offline_cpu(int cpu, struct rcu_state *rsp)
 		if (rnp->qsmaskinit != 0) {
 			if (rnp != rdp->mynode)
 				raw_spin_unlock(&rnp->lock); /* irqs remain disabled. */
+			else
+				trace_rcu_grace_period(rsp->name,
+						       rnp->gpnum + 1 -
+						       !!(rnp->qsmask & mask),
+						       "cpuofl");
 			break;
 		}
-		if (rnp == rdp->mynode)
+		if (rnp == rdp->mynode) {
+			trace_rcu_grace_period(rsp->name,
+					       rnp->gpnum + 1 -
+					       !!(rnp->qsmask & mask),
+					       "cpuofl");
 			need_report = rcu_preempt_offline_tasks(rsp, rnp, rdp);
-		else
+		} else
 			raw_spin_unlock(&rnp->lock); /* irqs remain disabled. */
 		mask = rnp->grpmask;
 		rnp = rnp->parent;
@@ -1226,7 +1257,7 @@ static void rcu_do_batch(struct rcu_state *rsp, struct rcu_data *rdp)
 		next = list->next;
 		prefetch(next);
 		debug_rcu_head_unqueue(list);
-		__rcu_reclaim(list);
+		__rcu_reclaim(rsp->name, list);
 		list = next;
 		if (++count >= bl)
 			break;
@@ -1552,6 +1583,12 @@ __call_rcu(struct rcu_head *head, void (*func)(struct rcu_head *rcu),
 	rdp->nxttail[RCU_NEXT_TAIL] = &head->next;
 	rdp->qlen++;
 
+	if (__is_kfree_rcu_offset((unsigned long)func))
+		trace_rcu_kfree_callback(rsp->name, head, (unsigned long)func,
+					 rdp->qlen);
+	else
+		trace_rcu_callback(rsp->name, head, rdp->qlen);
+
 	/* If interrupts were disabled, don't dive into RCU core. */
 	if (irqs_disabled_flags(flags)) {
 		local_irq_restore(flags);
@@ -1850,6 +1887,7 @@ rcu_boot_init_percpu_data(int cpu, struct rcu_state *rsp)
 	rdp->dynticks = &per_cpu(rcu_dynticks, cpu);
 #endif /* #ifdef CONFIG_NO_HZ */
 	rdp->cpu = cpu;
+	rdp->rsp = rsp;
 	raw_spin_unlock_irqrestore(&rnp->lock, flags);
 }
 
@@ -1898,6 +1936,7 @@ rcu_init_percpu_data(int cpu, struct rcu_state *rsp, int preemptible)
 			rdp->gpnum = rnp->completed; /* if GP in progress... */
 			rdp->completed = rnp->completed;
 			rdp->passed_quiesc_completed = rnp->completed - 1;
+			trace_rcu_grace_period(rsp->name, rdp->gpnum, "cpuonl");
 		}
 		raw_spin_unlock(&rnp->lock); /* irqs already disabled. */
 		rnp = rnp->parent;
diff --git a/kernel/rcutree.h b/kernel/rcutree.h
index eee6c94..d11a006 100644
--- a/kernel/rcutree.h
+++ b/kernel/rcutree.h
@@ -299,6 +299,7 @@ struct rcu_data {
 	unsigned long n_rp_need_nothing;
 
 	int cpu;
+	struct rcu_state *rsp;
 };
 
 /* Values for signaled field in struct rcu_state. */
diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
index 94d9ca1..bdb2e82 100644
--- a/kernel/rcutree_plugin.h
+++ b/kernel/rcutree_plugin.h
@@ -124,6 +124,8 @@ static void rcu_preempt_qs(int cpu)
 
 	rdp->passed_quiesc_completed = rdp->gpnum - 1;
 	barrier();
+	if (rdp->passed_quiesc == 0)
+		trace_rcu_grace_period("rcu_preempt", rdp->gpnum, "cpuqs");
 	rdp->passed_quiesc = 1;
 	current->rcu_read_unlock_special &= ~RCU_READ_UNLOCK_NEED_QS;
 }
@@ -190,6 +192,11 @@ static void rcu_preempt_note_context_switch(int cpu)
 			if (rnp->qsmask & rdp->grpmask)
 				rnp->gp_tasks = &t->rcu_node_entry;
 		}
+		trace_rcu_preempt_task(rdp->rsp->name,
+				       t->pid,
+				       (rnp->qsmask & rdp->grpmask)
+				       ? rnp->gpnum
+				       : rnp->gpnum + 1);
 		raw_spin_unlock_irqrestore(&rnp->lock, flags);
 	} else if (t->rcu_read_lock_nesting < 0 &&
 		   t->rcu_read_unlock_special) {
@@ -344,6 +351,8 @@ static noinline void rcu_read_unlock_special(struct task_struct *t)
 		smp_mb(); /* ensure expedited fastpath sees end of RCU c-s. */
 		np = rcu_next_node_entry(t, rnp);
 		list_del_init(&t->rcu_node_entry);
+		trace_rcu_unlock_preempted_task("rcu_preempt",
+						rnp->gpnum, t->pid);
 		if (&t->rcu_node_entry == rnp->gp_tasks)
 			rnp->gp_tasks = np;
 		if (&t->rcu_node_entry == rnp->exp_tasks)
@@ -364,10 +373,17 @@ static noinline void rcu_read_unlock_special(struct task_struct *t)
 		 * we aren't waiting on any CPUs, report the quiescent state.
 		 * Note that rcu_report_unblock_qs_rnp() releases rnp->lock.
 		 */
-		if (empty)
-			raw_spin_unlock_irqrestore(&rnp->lock, flags);
-		else
+		if (!empty && !rcu_preempt_blocked_readers_cgp(rnp)) {
+			trace_rcu_quiescent_state_report("preempt_rcu",
+							 rnp->gpnum,
+							 0, rnp->qsmask,
+							 rnp->level,
+							 rnp->grplo,
+							 rnp->grphi,
+							 !!rnp->gp_tasks);
 			rcu_report_unblock_qs_rnp(rnp, flags);
+		} else
+			raw_spin_unlock_irqrestore(&rnp->lock, flags);
 
 #ifdef CONFIG_RCU_BOOST
 		/* Unboost if we were boosted. */
-- 
1.7.3.2


^ permalink raw reply related	[flat|nested] 101+ messages in thread

* [PATCH tip/core/rcu 23/55] rcu: Simplify quiescent-state accounting
  2011-09-06 18:00 [PATCH tip/core/rcu 0/55] Preview of RCU changes for 3.2 Paul E. McKenney
                   ` (21 preceding siblings ...)
  2011-09-06 18:00 ` [PATCH tip/core/rcu 22/55] rcu: Add grace-period, quiescent-state, and call_rcu trace events Paul E. McKenney
@ 2011-09-06 18:00 ` Paul E. McKenney
  2011-09-06 18:00 ` [PATCH tip/core/rcu 24/55] rcu: Not necessary to pass rcu_read_lock_held() to rcu_dereference_protected() Paul E. McKenney
                   ` (33 subsequent siblings)
  56 siblings, 0 replies; 101+ messages in thread
From: Paul E. McKenney @ 2011-09-06 18:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, patches, Paul E. McKenney, Paul E. McKenney

From: Paul E. McKenney <paul.mckenney@linaro.org>

There is often a delay between the time that a CPU passes through a
quiescent state and the time that this quiescent state is reported to the
RCU core.  It is quite possible that the grace period ended before the
quiescent state could be reported, for example, some other CPU might have
deduced that this CPU passed through dyntick-idle mode.  It is critically
important that a quiescent state be counted only against the grace period
that was in effect at the time that the quiescent state was detected.

Previously, this was handled by recording the number of the last grace
period to complete when passing through a quiescent state.  The RCU
core then checks this number against the current value, and rejects
the quiescent state if there is a mismatch.  However, one additional
possibility must be accounted for, namely that the quiescent state was
recorded after the prior grace period completed but before the current
grace period started.  In this case, the RCU core must reject the
quiescent state, but the recorded number will match.  This is handled
when the CPU becomes aware of a new grace period -- at that point,
it invalidates any prior quiescent state.

This works, but is a bit indirect.  The new approach records the current
grace period, and the RCU core checks to see (1) that this is still the
current grace period and (2) that this grace period has not yet ended.
This approach simplifies reasoning about correctness, and this commit
changes over to this new approach.
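
In outline, the two validity checks compare as follows.  This is a
condensed sketch; the real version is in rcu_report_qs_rdp() below:

	/* Old approach: compare against the grace period that had
	 * completed when the quiescent state was recorded. */
	if (rdp->passed_quiesc_completed != rnp->completed)
		return;		/* Reject stale quiescent state. */

	/* New approach: the recorded grace period must still be
	 * current and must not have already ended. */
	if (rdp->passed_quiesce_gpnum != rnp->gpnum ||
	    rnp->completed == rnp->gpnum)
		return;		/* Need a new qs for the current GP. */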

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 Documentation/RCU/trace.txt |   34 ++++++++++++++++----------------
 kernel/rcutree.c            |   44 +++++++++++++++++++++---------------------
 kernel/rcutree.h            |    6 ++--
 kernel/rcutree_plugin.h     |    6 ++--
 kernel/rcutree_trace.c      |    8 +++---
 5 files changed, 49 insertions(+), 49 deletions(-)

diff --git a/Documentation/RCU/trace.txt b/Documentation/RCU/trace.txt
index a67af0a..aaf65f6 100644
--- a/Documentation/RCU/trace.txt
+++ b/Documentation/RCU/trace.txt
@@ -33,23 +33,23 @@ rcu/rcuboost:
 The output of "cat rcu/rcudata" looks as follows:
 
 rcu_sched:
-  0 c=20972 g=20973 pq=1 pqc=20972 qp=0 dt=545/1/0 df=50 of=0 ri=0 ql=163 qs=NRW. kt=0/W/0 ktl=ebc3 b=10 ci=153737 co=0 ca=0
-  1 c=20972 g=20973 pq=1 pqc=20972 qp=0 dt=967/1/0 df=58 of=0 ri=0 ql=634 qs=NRW. kt=0/W/1 ktl=58c b=10 ci=191037 co=0 ca=0
-  2 c=20972 g=20973 pq=1 pqc=20972 qp=0 dt=1081/1/0 df=175 of=0 ri=0 ql=74 qs=N.W. kt=0/W/2 ktl=da94 b=10 ci=75991 co=0 ca=0
-  3 c=20942 g=20943 pq=1 pqc=20942 qp=1 dt=1846/0/0 df=404 of=0 ri=0 ql=0 qs=.... kt=0/W/3 ktl=d1cd b=10 ci=72261 co=0 ca=0
-  4 c=20972 g=20973 pq=1 pqc=20972 qp=0 dt=369/1/0 df=83 of=0 ri=0 ql=48 qs=N.W. kt=0/W/4 ktl=e0e7 b=10 ci=128365 co=0 ca=0
-  5 c=20972 g=20973 pq=1 pqc=20972 qp=0 dt=381/1/0 df=64 of=0 ri=0 ql=169 qs=NRW. kt=0/W/5 ktl=fb2f b=10 ci=164360 co=0 ca=0
-  6 c=20972 g=20973 pq=1 pqc=20972 qp=0 dt=1037/1/0 df=183 of=0 ri=0 ql=62 qs=N.W. kt=0/W/6 ktl=d2ad b=10 ci=65663 co=0 ca=0
-  7 c=20897 g=20897 pq=1 pqc=20896 qp=0 dt=1572/0/0 df=382 of=0 ri=0 ql=0 qs=.... kt=0/W/7 ktl=cf15 b=10 ci=75006 co=0 ca=0
+  0 c=20972 g=20973 pq=1 pgp=20973 qp=0 dt=545/1/0 df=50 of=0 ri=0 ql=163 qs=NRW. kt=0/W/0 ktl=ebc3 b=10 ci=153737 co=0 ca=0
+  1 c=20972 g=20973 pq=1 pgp=20973 qp=0 dt=967/1/0 df=58 of=0 ri=0 ql=634 qs=NRW. kt=0/W/1 ktl=58c b=10 ci=191037 co=0 ca=0
+  2 c=20972 g=20973 pq=1 pgp=20973 qp=0 dt=1081/1/0 df=175 of=0 ri=0 ql=74 qs=N.W. kt=0/W/2 ktl=da94 b=10 ci=75991 co=0 ca=0
+  3 c=20942 g=20943 pq=1 pgp=20942 qp=1 dt=1846/0/0 df=404 of=0 ri=0 ql=0 qs=.... kt=0/W/3 ktl=d1cd b=10 ci=72261 co=0 ca=0
+  4 c=20972 g=20973 pq=1 pgp=20973 qp=0 dt=369/1/0 df=83 of=0 ri=0 ql=48 qs=N.W. kt=0/W/4 ktl=e0e7 b=10 ci=128365 co=0 ca=0
+  5 c=20972 g=20973 pq=1 pgp=20973 qp=0 dt=381/1/0 df=64 of=0 ri=0 ql=169 qs=NRW. kt=0/W/5 ktl=fb2f b=10 ci=164360 co=0 ca=0
+  6 c=20972 g=20973 pq=1 pgp=20973 qp=0 dt=1037/1/0 df=183 of=0 ri=0 ql=62 qs=N.W. kt=0/W/6 ktl=d2ad b=10 ci=65663 co=0 ca=0
+  7 c=20897 g=20897 pq=1 pgp=20896 qp=0 dt=1572/0/0 df=382 of=0 ri=0 ql=0 qs=.... kt=0/W/7 ktl=cf15 b=10 ci=75006 co=0 ca=0
 rcu_bh:
-  0 c=1480 g=1480 pq=1 pqc=1479 qp=0 dt=545/1/0 df=6 of=0 ri=1 ql=0 qs=.... kt=0/W/0 ktl=ebc3 b=10 ci=0 co=0 ca=0
-  1 c=1480 g=1480 pq=1 pqc=1479 qp=0 dt=967/1/0 df=3 of=0 ri=1 ql=0 qs=.... kt=0/W/1 ktl=58c b=10 ci=151 co=0 ca=0
-  2 c=1480 g=1480 pq=1 pqc=1479 qp=0 dt=1081/1/0 df=6 of=0 ri=1 ql=0 qs=.... kt=0/W/2 ktl=da94 b=10 ci=0 co=0 ca=0
-  3 c=1480 g=1480 pq=1 pqc=1479 qp=0 dt=1846/0/0 df=8 of=0 ri=1 ql=0 qs=.... kt=0/W/3 ktl=d1cd b=10 ci=0 co=0 ca=0
-  4 c=1480 g=1480 pq=1 pqc=1479 qp=0 dt=369/1/0 df=6 of=0 ri=1 ql=0 qs=.... kt=0/W/4 ktl=e0e7 b=10 ci=0 co=0 ca=0
-  5 c=1480 g=1480 pq=1 pqc=1479 qp=0 dt=381/1/0 df=4 of=0 ri=1 ql=0 qs=.... kt=0/W/5 ktl=fb2f b=10 ci=0 co=0 ca=0
-  6 c=1480 g=1480 pq=1 pqc=1479 qp=0 dt=1037/1/0 df=6 of=0 ri=1 ql=0 qs=.... kt=0/W/6 ktl=d2ad b=10 ci=0 co=0 ca=0
-  7 c=1474 g=1474 pq=1 pqc=1473 qp=0 dt=1572/0/0 df=8 of=0 ri=1 ql=0 qs=.... kt=0/W/7 ktl=cf15 b=10 ci=0 co=0 ca=0
+  0 c=1480 g=1480 pq=1 pgp=1480 qp=0 dt=545/1/0 df=6 of=0 ri=1 ql=0 qs=.... kt=0/W/0 ktl=ebc3 b=10 ci=0 co=0 ca=0
+  1 c=1480 g=1480 pq=1 pgp=1480 qp=0 dt=967/1/0 df=3 of=0 ri=1 ql=0 qs=.... kt=0/W/1 ktl=58c b=10 ci=151 co=0 ca=0
+  2 c=1480 g=1480 pq=1 pgp=1480 qp=0 dt=1081/1/0 df=6 of=0 ri=1 ql=0 qs=.... kt=0/W/2 ktl=da94 b=10 ci=0 co=0 ca=0
+  3 c=1480 g=1480 pq=1 pgp=1480 qp=0 dt=1846/0/0 df=8 of=0 ri=1 ql=0 qs=.... kt=0/W/3 ktl=d1cd b=10 ci=0 co=0 ca=0
+  4 c=1480 g=1480 pq=1 pgp=1480 qp=0 dt=369/1/0 df=6 of=0 ri=1 ql=0 qs=.... kt=0/W/4 ktl=e0e7 b=10 ci=0 co=0 ca=0
+  5 c=1480 g=1480 pq=1 pgp=1480 qp=0 dt=381/1/0 df=4 of=0 ri=1 ql=0 qs=.... kt=0/W/5 ktl=fb2f b=10 ci=0 co=0 ca=0
+  6 c=1480 g=1480 pq=1 pgp=1480 qp=0 dt=1037/1/0 df=6 of=0 ri=1 ql=0 qs=.... kt=0/W/6 ktl=d2ad b=10 ci=0 co=0 ca=0
+  7 c=1474 g=1474 pq=1 pgp=1473 qp=0 dt=1572/0/0 df=8 of=0 ri=1 ql=0 qs=.... kt=0/W/7 ktl=cf15 b=10 ci=0 co=0 ca=0
 
 The first section lists the rcu_data structures for rcu_sched, the second
 for rcu_bh.  Note that CONFIG_TREE_PREEMPT_RCU kernels will have an
@@ -84,7 +84,7 @@ o	"pq" indicates that this CPU has passed through a quiescent state
 	CPU has not yet reported that fact, (2) some other CPU has not
 	yet reported for this grace period, or (3) both.
 
-o	"pqc" indicates which grace period the last-observed quiescent
+o	"pgp" indicates which grace period the last-observed quiescent
 	state for this CPU corresponds to.  This is important for handling
 	the race between CPU 0 reporting an extended dynticks-idle
 	quiescent state for CPU 1 and CPU 1 suddenly waking up and
diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index 7e02829..7e2f297 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -159,32 +159,34 @@ static int rcu_gp_in_progress(struct rcu_state *rsp)
  * Note a quiescent state.  Because we do not need to know
  * how many quiescent states passed, just if there was at least
  * one since the start of the grace period, this just sets a flag.
+ * The caller must have disabled preemption.
  */
 void rcu_sched_qs(int cpu)
 {
 	struct rcu_data *rdp = &per_cpu(rcu_sched_data, cpu);
 
-	rdp->passed_quiesc_completed = rdp->gpnum - 1;
+	rdp->passed_quiesce_gpnum = rdp->gpnum;
 	barrier();
-	if (rdp->passed_quiesc == 0)
+	if (rdp->passed_quiesce == 0)
 		trace_rcu_grace_period("rcu_sched", rdp->gpnum, "cpuqs");
-	rdp->passed_quiesc = 1;
+	rdp->passed_quiesce = 1;
 }
 
 void rcu_bh_qs(int cpu)
 {
 	struct rcu_data *rdp = &per_cpu(rcu_bh_data, cpu);
 
-	rdp->passed_quiesc_completed = rdp->gpnum - 1;
+	rdp->passed_quiesce_gpnum = rdp->gpnum;
 	barrier();
-	if (rdp->passed_quiesc == 0)
+	if (rdp->passed_quiesce == 0)
 		trace_rcu_grace_period("rcu_bh", rdp->gpnum, "cpuqs");
-	rdp->passed_quiesc = 1;
+	rdp->passed_quiesce = 1;
 }
 
 /*
  * Note a context switch.  This is a quiescent state for RCU-sched,
  * and requires special handling for preemptible RCU.
+ * The caller must have disabled preemption.
  */
 void rcu_note_context_switch(int cpu)
 {
@@ -694,7 +696,7 @@ static void __note_new_gpnum(struct rcu_state *rsp, struct rcu_node *rnp, struct
 		trace_rcu_grace_period(rsp->name, rdp->gpnum, "cpustart");
 		if (rnp->qsmask & rdp->grpmask) {
 			rdp->qs_pending = 1;
-			rdp->passed_quiesc = 0;
+			rdp->passed_quiesce = 0;
 		} else
 			rdp->qs_pending = 0;
 	}
@@ -1027,7 +1029,7 @@ rcu_report_qs_rnp(unsigned long mask, struct rcu_state *rsp,
  * based on quiescent states detected in an earlier grace period!
  */
 static void
-rcu_report_qs_rdp(int cpu, struct rcu_state *rsp, struct rcu_data *rdp, long lastcomp)
+rcu_report_qs_rdp(int cpu, struct rcu_state *rsp, struct rcu_data *rdp, long lastgp)
 {
 	unsigned long flags;
 	unsigned long mask;
@@ -1035,17 +1037,15 @@ rcu_report_qs_rdp(int cpu, struct rcu_state *rsp, struct rcu_data *rdp, long las
 
 	rnp = rdp->mynode;
 	raw_spin_lock_irqsave(&rnp->lock, flags);
-	if (lastcomp != rnp->completed) {
+	if (lastgp != rnp->gpnum || rnp->completed == rnp->gpnum) {
 
 		/*
-		 * Someone beat us to it for this grace period, so leave.
-		 * The race with GP start is resolved by the fact that we
-		 * hold the leaf rcu_node lock, so that the per-CPU bits
-		 * cannot yet be initialized -- so we would simply find our
-		 * CPU's bit already cleared in rcu_report_qs_rnp() if this
-		 * race occurred.
+		 * The grace period in which this quiescent state was
+		 * recorded has ended, so don't report it upwards.
+		 * We will instead need a new quiescent state that lies
+		 * within the current grace period.
 		 */
-		rdp->passed_quiesc = 0;	/* try again later! */
+		rdp->passed_quiesce = 0;	/* need qs for new gp. */
 		raw_spin_unlock_irqrestore(&rnp->lock, flags);
 		return;
 	}
@@ -1089,14 +1089,14 @@ rcu_check_quiescent_state(struct rcu_state *rsp, struct rcu_data *rdp)
 	 * Was there a quiescent state since the beginning of the grace
 	 * period? If no, then exit and wait for the next call.
 	 */
-	if (!rdp->passed_quiesc)
+	if (!rdp->passed_quiesce)
 		return;
 
 	/*
 	 * Tell RCU we are done (but rcu_report_qs_rdp() will be the
 	 * judge of that).
 	 */
-	rcu_report_qs_rdp(rdp->cpu, rsp, rdp, rdp->passed_quiesc_completed);
+	rcu_report_qs_rdp(rdp->cpu, rsp, rdp, rdp->passed_quiesce_gpnum);
 }
 
 #ifdef CONFIG_HOTPLUG_CPU
@@ -1712,7 +1712,7 @@ static int __rcu_pending(struct rcu_state *rsp, struct rcu_data *rdp)
 	check_cpu_stall(rsp, rdp);
 
 	/* Is the RCU core waiting for a quiescent state from this CPU? */
-	if (rdp->qs_pending && !rdp->passed_quiesc) {
+	if (rdp->qs_pending && !rdp->passed_quiesce) {
 
 		/*
 		 * If force_quiescent_state() coming soon and this CPU
@@ -1724,7 +1724,7 @@ static int __rcu_pending(struct rcu_state *rsp, struct rcu_data *rdp)
 		    ULONG_CMP_LT(ACCESS_ONCE(rsp->jiffies_force_qs) - 1,
 				 jiffies))
 			set_need_resched();
-	} else if (rdp->qs_pending && rdp->passed_quiesc) {
+	} else if (rdp->qs_pending && rdp->passed_quiesce) {
 		rdp->n_rp_report_qs++;
 		return 1;
 	}
@@ -1907,7 +1907,7 @@ rcu_init_percpu_data(int cpu, struct rcu_state *rsp, int preemptible)
 
 	/* Set up local state, ensuring consistent view of global state. */
 	raw_spin_lock_irqsave(&rnp->lock, flags);
-	rdp->passed_quiesc = 0;  /* We could be racing with new GP, */
+	rdp->passed_quiesce = 0;  /* We could be racing with new GP, */
 	rdp->qs_pending = 1;	 /*  so set up to respond to current GP. */
 	rdp->beenonline = 1;	 /* We have now been online. */
 	rdp->preemptible = preemptible;
@@ -1935,7 +1935,7 @@ rcu_init_percpu_data(int cpu, struct rcu_state *rsp, int preemptible)
 		if (rnp == rdp->mynode) {
 			rdp->gpnum = rnp->completed; /* if GP in progress... */
 			rdp->completed = rnp->completed;
-			rdp->passed_quiesc_completed = rnp->completed - 1;
+			rdp->passed_quiesce_gpnum = rnp->gpnum - 1;
 			trace_rcu_grace_period(rsp->name, rdp->gpnum, "cpuonl");
 		}
 		raw_spin_unlock(&rnp->lock); /* irqs already disabled. */
diff --git a/kernel/rcutree.h b/kernel/rcutree.h
index d11a006..51638b6 100644
--- a/kernel/rcutree.h
+++ b/kernel/rcutree.h
@@ -230,9 +230,9 @@ struct rcu_data {
 					/*  in order to detect GP end. */
 	unsigned long	gpnum;		/* Highest gp number that this CPU */
 					/*  is aware of having started. */
-	unsigned long	passed_quiesc_completed;
-					/* Value of completed at time of qs. */
-	bool		passed_quiesc;	/* User-mode/idle loop etc. */
+	unsigned long	passed_quiesce_gpnum;
+					/* gpnum at time of quiescent state. */
+	bool		passed_quiesce;	/* User-mode/idle loop etc. */
 	bool		qs_pending;	/* Core waits for quiesc state. */
 	bool		beenonline;	/* CPU online at least once. */
 	bool		preemptible;	/* Preemptible RCU? */
diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
index bdb2e82..4bac5a2 100644
--- a/kernel/rcutree_plugin.h
+++ b/kernel/rcutree_plugin.h
@@ -122,11 +122,11 @@ static void rcu_preempt_qs(int cpu)
 {
 	struct rcu_data *rdp = &per_cpu(rcu_preempt_data, cpu);
 
-	rdp->passed_quiesc_completed = rdp->gpnum - 1;
+	rdp->passed_quiesce_gpnum = rdp->gpnum;
 	barrier();
-	if (rdp->passed_quiesc == 0)
+	if (rdp->passed_quiesce == 0)
 		trace_rcu_grace_period("rcu_preempt", rdp->gpnum, "cpuqs");
-	rdp->passed_quiesc = 1;
+	rdp->passed_quiesce = 1;
 	current->rcu_read_unlock_special &= ~RCU_READ_UNLOCK_NEED_QS;
 }
 
diff --git a/kernel/rcutree_trace.c b/kernel/rcutree_trace.c
index e623564..3f739cf3 100644
--- a/kernel/rcutree_trace.c
+++ b/kernel/rcutree_trace.c
@@ -61,11 +61,11 @@ static void print_one_rcu_data(struct seq_file *m, struct rcu_data *rdp)
 {
 	if (!rdp->beenonline)
 		return;
-	seq_printf(m, "%3d%cc=%lu g=%lu pq=%d pqc=%lu qp=%d",
+	seq_printf(m, "%3d%cc=%lu g=%lu pq=%d pgp=%lu qp=%d",
 		   rdp->cpu,
 		   cpu_is_offline(rdp->cpu) ? '!' : ' ',
 		   rdp->completed, rdp->gpnum,
-		   rdp->passed_quiesc, rdp->passed_quiesc_completed,
+		   rdp->passed_quiesce, rdp->passed_quiesce_gpnum,
 		   rdp->qs_pending);
 #ifdef CONFIG_NO_HZ
 	seq_printf(m, " dt=%d/%d/%d df=%lu",
@@ -139,7 +139,7 @@ static void print_one_rcu_data_csv(struct seq_file *m, struct rcu_data *rdp)
 		   rdp->cpu,
 		   cpu_is_offline(rdp->cpu) ? "\"N\"" : "\"Y\"",
 		   rdp->completed, rdp->gpnum,
-		   rdp->passed_quiesc, rdp->passed_quiesc_completed,
+		   rdp->passed_quiesce, rdp->passed_quiesce_gpnum,
 		   rdp->qs_pending);
 #ifdef CONFIG_NO_HZ
 	seq_printf(m, ",%d,%d,%d,%lu",
@@ -170,7 +170,7 @@ static void print_one_rcu_data_csv(struct seq_file *m, struct rcu_data *rdp)
 
 static int show_rcudata_csv(struct seq_file *m, void *unused)
 {
-	seq_puts(m, "\"CPU\",\"Online?\",\"c\",\"g\",\"pq\",\"pqc\",\"pq\",");
+	seq_puts(m, "\"CPU\",\"Online?\",\"c\",\"g\",\"pq\",\"pgp\",\"pq\",");
 #ifdef CONFIG_NO_HZ
 	seq_puts(m, "\"dt\",\"dt nesting\",\"dt NMI nesting\",\"df\",");
 #endif /* #ifdef CONFIG_NO_HZ */
-- 
1.7.3.2


^ permalink raw reply related	[flat|nested] 101+ messages in thread

* [PATCH tip/core/rcu 24/55] rcu: Not necessary to pass rcu_read_lock_held() to rcu_dereference_protected()
  2011-09-06 18:00 [PATCH tip/core/rcu 0/55] Preview of RCU changes for 3.2 Paul E. McKenney
                   ` (22 preceding siblings ...)
  2011-09-06 18:00 ` [PATCH tip/core/rcu 23/55] rcu: Simplify quiescent-state accounting Paul E. McKenney
@ 2011-09-06 18:00 ` Paul E. McKenney
  2011-09-06 18:00 ` [PATCH tip/core/rcu 25/55] rcu: Update documentation for additional RCU lockdep functions Paul E. McKenney
                   ` (32 subsequent siblings)
  56 siblings, 0 replies; 101+ messages in thread
From: Paul E. McKenney @ 2011-09-06 18:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, patches, Michal Hocko, Paul E. McKenney

From: Michal Hocko <mhocko@suse.cz>

Since ca5ecddf (rcu: define __rcu address space modifier for sparse),
rcu_dereference_check() has used rcu_read_lock_held() as part of its
check condition automatically.  Therefore, callers of rcu_dereference_check() no longer
need to pass rcu_read_lock_held() to rcu_dereference_check().
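
A condensed before/after view of the documentation example updated
below:

	/* Before: explicit, but now redundant. */
	file = rcu_dereference_check(fdt->fd[fd],
				     rcu_read_lock_held() ||
				     lockdep_is_held(&files->file_lock));

	/* After: rcu_read_lock_held() is implied by the primitive. */
	file = rcu_dereference_check(fdt->fd[fd],
				     lockdep_is_held(&files->file_lock));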

Signed-off-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 Documentation/RCU/lockdep.txt |   10 ++++------
 1 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/Documentation/RCU/lockdep.txt b/Documentation/RCU/lockdep.txt
index d7a49b2..4998c63 100644
--- a/Documentation/RCU/lockdep.txt
+++ b/Documentation/RCU/lockdep.txt
@@ -48,13 +48,11 @@ checking of rcu_dereference() primitives:
 		value of the pointer itself, for example, against NULL.
 
 The rcu_dereference_check() check expression can be any boolean
-expression, but would normally include one of the rcu_read_lock_held()
-family of functions and a lockdep expression.  However, any boolean
-expression can be used.  For a moderately ornate example, consider
-the following:
+expression, but would normally include a lockdep expression.  However,
+any boolean expression can be used.  For a moderately ornate example,
+consider the following:
 
 	file = rcu_dereference_check(fdt->fd[fd],
-				     rcu_read_lock_held() ||
 				     lockdep_is_held(&files->file_lock) ||
 				     atomic_read(&files->count) == 1);
 
@@ -62,7 +60,7 @@ This expression picks up the pointer "fdt->fd[fd]" in an RCU-safe manner,
 and, if CONFIG_PROVE_RCU is configured, verifies that this expression
 is used in:
 
-1.	An RCU read-side critical section, or
+1.	An RCU read-side critical section (implicit), or
 2.	with files->file_lock held, or
 3.	on an unshared files_struct.
 
-- 
1.7.3.2


^ permalink raw reply related	[flat|nested] 101+ messages in thread

* [PATCH tip/core/rcu 25/55] rcu: Update documentation for additional RCU lockdep functions
  2011-09-06 18:00 [PATCH tip/core/rcu 0/55] Preview of RCU changes for 3.2 Paul E. McKenney
                   ` (23 preceding siblings ...)
  2011-09-06 18:00 ` [PATCH tip/core/rcu 24/55] rcu: Not necessary to pass rcu_read_lock_held() to rcu_dereference_protected() Paul E. McKenney
@ 2011-09-06 18:00 ` Paul E. McKenney
  2011-09-06 18:00 ` [PATCH tip/core/rcu 26/55] rcu: Remove unused and redundant interfaces Paul E. McKenney
                   ` (31 subsequent siblings)
  56 siblings, 0 replies; 101+ messages in thread
From: Paul E. McKenney @ 2011-09-06 18:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, patches, Paul E. McKenney

Add documentation for rcu_dereference_bh_check(),
rcu_dereference_sched_check(), srcu_dereference_check(), and
rcu_dereference_index_check().
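
As a quick illustration (a sketch only; the pointer "gp" and the lock
"my_lock" are hypothetical, not from this patch), code shared between
RCU-bh readers and updaters might use:

	p = rcu_dereference_bh_check(gp, lockdep_is_held(&my_lock));

This satisfies lockdep if the caller is either within an RCU-bh
read-side critical section or holding my_lock.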

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 Documentation/RCU/lockdep.txt |   24 +++++++++++++++++++++---
 1 files changed, 21 insertions(+), 3 deletions(-)

diff --git a/Documentation/RCU/lockdep.txt b/Documentation/RCU/lockdep.txt
index 4998c63..a102d4b 100644
--- a/Documentation/RCU/lockdep.txt
+++ b/Documentation/RCU/lockdep.txt
@@ -32,9 +32,27 @@ checking of rcu_dereference() primitives:
 	srcu_dereference(p, sp):
 		Check for SRCU read-side critical section.
 	rcu_dereference_check(p, c):
-		Use explicit check expression "c".  This is useful in
-		code that is invoked by both readers and updaters.
-	rcu_dereference_raw(p)
+		Use explicit check expression "c" along with
+		rcu_read_lock_held().  This is useful in code that is
+		invoked by both RCU readers and updaters.
+	rcu_dereference_bh_check(p, c):
+		Use explicit check expression "c" along with
+		rcu_read_lock_bh_held().  This is useful in code that
+		is invoked by both RCU-bh readers and updaters.
+	rcu_dereference_sched_check(p, c):
+		Use explicit check expression "c" along with
+		rcu_read_lock_sched_held().  This is useful in code that
+		is invoked by both RCU-sched readers and updaters.
+	srcu_dereference_check(p, c):
+		Use explicit check expression "c" along with
+		srcu_read_lock_held().  This is useful in code that
+		is invoked by both SRCU readers and updaters.
+	rcu_dereference_index_check(p, c):
+		Use explicit check expression "c", but the caller
+		must supply one of the rcu_read_lock_held() functions.
+		This is useful in code that uses RCU-protected arrays
+		and is invoked by both RCU readers and updaters.
+	rcu_dereference_raw(p):
 		Don't check.  (Use sparingly, if at all.)
 	rcu_dereference_protected(p, c):
 		Use explicit check expression "c", and omit all barriers
-- 
1.7.3.2


^ permalink raw reply related	[flat|nested] 101+ messages in thread

* [PATCH tip/core/rcu 26/55] rcu: Remove unused and redundant interfaces
  2011-09-06 18:00 [PATCH tip/core/rcu 0/55] Preview of RCU changes for 3.2 Paul E. McKenney
                   ` (24 preceding siblings ...)
  2011-09-06 18:00 ` [PATCH tip/core/rcu 25/55] rcu: Update documentation for additional RCU lockdep functions Paul E. McKenney
@ 2011-09-06 18:00 ` Paul E. McKenney
  2011-09-06 18:00 ` [PATCH tip/core/rcu 27/55] rcu: Allow rcutorture's stat_interval parameter to be changed at runtime Paul E. McKenney
                   ` (30 subsequent siblings)
  56 siblings, 0 replies; 101+ messages in thread
From: Paul E. McKenney @ 2011-09-06 18:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, patches, Paul E. McKenney

The rcu_dereference_bh_protected() and rcu_dereference_sched_protected()
macros are synonyms for rcu_dereference_protected() and are not used
anywhere in mainline.  This commit therefore removes them.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 include/linux/rcupdate.h |   20 --------------------
 1 files changed, 0 insertions(+), 20 deletions(-)

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index c61a535..ea80396 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -578,26 +578,6 @@ extern int rcu_my_thread_group_empty(void);
 #define rcu_dereference_protected(p, c) \
 	__rcu_dereference_protected((p), (c), __rcu)
 
-/**
- * rcu_dereference_bh_protected() - fetch RCU-bh pointer when updates prevented
- * @p: The pointer to read, prior to dereferencing
- * @c: The conditions under which the dereference will take place
- *
- * This is the RCU-bh counterpart to rcu_dereference_protected().
- */
-#define rcu_dereference_bh_protected(p, c) \
-	__rcu_dereference_protected((p), (c), __rcu)
-
-/**
- * rcu_dereference_sched_protected() - fetch RCU-sched pointer when updates prevented
- * @p: The pointer to read, prior to dereferencing
- * @c: The conditions under which the dereference will take place
- *
- * This is the RCU-sched counterpart to rcu_dereference_protected().
- */
-#define rcu_dereference_sched_protected(p, c) \
-	__rcu_dereference_protected((p), (c), __rcu)
-
 
 /**
  * rcu_dereference() - fetch RCU-protected pointer for dereferencing
-- 
1.7.3.2


^ permalink raw reply related	[flat|nested] 101+ messages in thread

* [PATCH tip/core/rcu 27/55] rcu: Allow rcutorture's stat_interval parameter to be changed at runtime
  2011-09-06 18:00 [PATCH tip/core/rcu 0/55] Preview of RCU changes for 3.2 Paul E. McKenney
                   ` (25 preceding siblings ...)
  2011-09-06 18:00 ` [PATCH tip/core/rcu 26/55] rcu: Remove unused and redundant interfaces Paul E. McKenney
@ 2011-09-06 18:00 ` Paul E. McKenney
  2011-09-06 18:00 ` [PATCH tip/core/rcu 28/55] rcu: Document interpretation of RCU-lockdep splats Paul E. McKenney
                   ` (29 subsequent siblings)
  56 siblings, 0 replies; 101+ messages in thread
From: Paul E. McKenney @ 2011-09-06 18:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, patches, Paul E. McKenney

When rcutorture is compiled directly into the kernel
(instead of separately as a module), it is necessary to specify
rcutorture.stat_interval as a kernel command-line parameter; otherwise,
the rcu_torture_stats kthread is never started.  However, when working
with the system after it has booted, it is convenient to be able to
change the time between statistic printing, particularly when logged
into the console.

This commit therefore allows the stat_interval parameter to be changed
at runtime.
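
With the mode changed to 0644, the parameter becomes writable through
sysfs, so (assuming rcutorture is loaded or built in) something like
the following adjusts the interval at runtime:

	echo 30 > /sys/module/rcutorture/parameters/stat_interval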

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcutorture.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/kernel/rcutorture.c b/kernel/rcutorture.c
index 1194d6c..6648e00 100644
--- a/kernel/rcutorture.c
+++ b/kernel/rcutorture.c
@@ -73,7 +73,7 @@ module_param(nreaders, int, 0444);
 MODULE_PARM_DESC(nreaders, "Number of RCU reader threads");
 module_param(nfakewriters, int, 0444);
 MODULE_PARM_DESC(nfakewriters, "Number of RCU fake writer threads");
-module_param(stat_interval, int, 0444);
+module_param(stat_interval, int, 0644);
 MODULE_PARM_DESC(stat_interval, "Number of seconds between stats printk()s");
 module_param(verbose, bool, 0444);
 MODULE_PARM_DESC(verbose, "Enable verbose debugging printk()s");
-- 
1.7.3.2


^ permalink raw reply related	[flat|nested] 101+ messages in thread

* [PATCH tip/core/rcu 28/55] rcu: Document interpretation of RCU-lockdep splats
  2011-09-06 18:00 [PATCH tip/core/rcu 0/55] Preview of RCU changes for 3.2 Paul E. McKenney
                   ` (26 preceding siblings ...)
  2011-09-06 18:00 ` [PATCH tip/core/rcu 27/55] rcu: Allow rcutorture's stat_interval parameter to be changed at runtime Paul E. McKenney
@ 2011-09-06 18:00 ` Paul E. McKenney
  2011-09-06 18:00 ` [PATCH tip/core/rcu 29/55] nohz: Remove nohz_cpu_mask Paul E. McKenney
                   ` (28 subsequent siblings)
  56 siblings, 0 replies; 101+ messages in thread
From: Paul E. McKenney @ 2011-09-06 18:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, patches, Paul E. McKenney

There has been quite a bit of confusion about what RCU-lockdep splats
mean, so this commit adds some documentation describing how to
interpret them.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 Documentation/RCU/lockdep-splat.txt |  110 +++++++++++++++++++++++++++++++++++
 1 files changed, 110 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/RCU/lockdep-splat.txt

diff --git a/Documentation/RCU/lockdep-splat.txt b/Documentation/RCU/lockdep-splat.txt
new file mode 100644
index 0000000..bf90611
--- /dev/null
+++ b/Documentation/RCU/lockdep-splat.txt
@@ -0,0 +1,110 @@
+Lockdep-RCU was added to the Linux kernel in early 2010
+(http://lwn.net/Articles/371986/).  This facility checks for some common
+misuses of the RCU API, most notably using one of the rcu_dereference()
+family to access an RCU-protected pointer without the proper protection.
+When such misuse is detected, a lockdep-RCU splat is emitted.
+
+The usual cause of a lockdep-RCU splat is someone accessing an
+RCU-protected data structure without either (1) being in the right kind of
+RCU read-side critical section or (2) holding the right update-side lock.
+This problem can therefore be serious: it might result in random memory
+overwriting or worse.  There can of course be false positives, this
+being the real world and all that.
+
+So let's look at an example RCU lockdep splat from 3.0-rc5, one that
+has long since been fixed:
+
+===============================
+[ INFO: suspicious RCU usage. ]
+-------------------------------
+block/cfq-iosched.c:2776 suspicious rcu_dereference_protected() usage!
+
+other info that might help us debug this:
+
+
+rcu_scheduler_active = 1, debug_locks = 0
+3 locks held by scsi_scan_6/1552:
+ #0:  (&shost->scan_mutex){+.+.+.}, at: [<ffffffff8145efca>]
+scsi_scan_host_selected+0x5a/0x150
+ #1:  (&eq->sysfs_lock){+.+...}, at: [<ffffffff812a5032>]
+elevator_exit+0x22/0x60
+ #2:  (&(&q->__queue_lock)->rlock){-.-...}, at: [<ffffffff812b6233>]
+cfq_exit_queue+0x43/0x190
+
+stack backtrace:
+Pid: 1552, comm: scsi_scan_6 Not tainted 3.0.0-rc5 #17
+Call Trace:
+ [<ffffffff810abb9b>] lockdep_rcu_dereference+0xbb/0xc0
+ [<ffffffff812b6139>] __cfq_exit_single_io_context+0xe9/0x120
+ [<ffffffff812b626c>] cfq_exit_queue+0x7c/0x190
+ [<ffffffff812a5046>] elevator_exit+0x36/0x60
+ [<ffffffff812a802a>] blk_cleanup_queue+0x4a/0x60
+ [<ffffffff8145cc09>] scsi_free_queue+0x9/0x10
+ [<ffffffff81460944>] __scsi_remove_device+0x84/0xd0
+ [<ffffffff8145dca3>] scsi_probe_and_add_lun+0x353/0xb10
+ [<ffffffff817da069>] ? error_exit+0x29/0xb0
+ [<ffffffff817d98ed>] ? _raw_spin_unlock_irqrestore+0x3d/0x80
+ [<ffffffff8145e722>] __scsi_scan_target+0x112/0x680
+ [<ffffffff812c690d>] ? trace_hardirqs_off_thunk+0x3a/0x3c
+ [<ffffffff817da069>] ? error_exit+0x29/0xb0
+ [<ffffffff812bcc60>] ? kobject_del+0x40/0x40
+ [<ffffffff8145ed16>] scsi_scan_channel+0x86/0xb0
+ [<ffffffff8145f0b0>] scsi_scan_host_selected+0x140/0x150
+ [<ffffffff8145f149>] do_scsi_scan_host+0x89/0x90
+ [<ffffffff8145f170>] do_scan_async+0x20/0x160
+ [<ffffffff8145f150>] ? do_scsi_scan_host+0x90/0x90
+ [<ffffffff810975b6>] kthread+0xa6/0xb0
+ [<ffffffff817db154>] kernel_thread_helper+0x4/0x10
+ [<ffffffff81066430>] ? finish_task_switch+0x80/0x110
+ [<ffffffff817d9c04>] ? retint_restore_args+0xe/0xe
+ [<ffffffff81097510>] ? __init_kthread_worker+0x70/0x70
+ [<ffffffff817db150>] ? gs_change+0xb/0xb
+
+Line 2776 of block/cfq-iosched.c in v3.0-rc5 is as follows:
+
+	if (rcu_dereference(ioc->ioc_data) == cic) {
+
+This form says that it must be in a plain vanilla RCU read-side critical
+section, but the "other info" list above shows that this is not the
+case.  Instead, we hold three locks, one of which might be RCU related.
+And maybe that lock really does protect this reference.  If so, the fix
+is to inform RCU, perhaps by changing __cfq_exit_single_io_context() to
+take the struct request_queue "q" from cfq_exit_queue() as an argument,
+which would permit us to invoke rcu_dereference_protected() as follows:
+
+	if (rcu_dereference_protected(ioc->ioc_data,
+				      lockdep_is_held(&q->queue_lock)) == cic) {
+
+With this change, there would be no lockdep-RCU splat emitted if this
+code was invoked either from within an RCU read-side critical section
+or with the ->queue_lock held.  In particular, this would have suppressed
+the above lockdep-RCU splat because ->queue_lock is held (see #2 in the
+list above).
+
+On the other hand, perhaps we really do need an RCU read-side critical
+section.  In this case, the critical section must span the use of the
+return value from rcu_dereference(), or at least until there is some
+reference count incremented or some such.  One way to handle this is to
+add rcu_read_lock() and rcu_read_unlock() as follows:
+
+	rcu_read_lock();
+	if (rcu_dereference(ioc->ioc_data) == cic) {
+		spin_lock(&ioc->lock);
+		rcu_assign_pointer(ioc->ioc_data, NULL);
+		spin_unlock(&ioc->lock);
+	}
+	rcu_read_unlock();
+
+With this change, the rcu_dereference() is always within an RCU
+read-side critical section, which again would have suppressed the
+above lockdep-RCU splat.
+
+But in this particular case, we don't actually dereference the pointer
+returned from rcu_dereference().  Instead, that pointer is just compared
+to the cic pointer, which means that the rcu_dereference() can be replaced
+by rcu_access_pointer() as follows:
+
+	if (rcu_access_pointer(ioc->ioc_data) == cic) {
+
+Because it is legal to invoke rcu_access_pointer() without protection,
+this change would also suppress the above lockdep-RCU splat.
-- 
1.7.3.2


^ permalink raw reply related	[flat|nested] 101+ messages in thread

* [PATCH tip/core/rcu 29/55] nohz: Remove nohz_cpu_mask
  2011-09-06 18:00 [PATCH tip/core/rcu 0/55] Preview of RCU changes for 3.2 Paul E. McKenney
                   ` (27 preceding siblings ...)
  2011-09-06 18:00 ` [PATCH tip/core/rcu 28/55] rcu: Document interpretation of RCU-lockdep splats Paul E. McKenney
@ 2011-09-06 18:00 ` Paul E. McKenney
  2011-09-06 18:00 ` [PATCH tip/core/rcu 30/55] rcu: Eliminate in_irq() checks in rcu_enter_nohz() Paul E. McKenney
                   ` (27 subsequent siblings)
  56 siblings, 0 replies; 101+ messages in thread
From: Paul E. McKenney @ 2011-09-06 18:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, patches, Shi, Alex, Paul E. McKenney

From: Shi, Alex <alex.shi@intel.com>

RCU no longer uses this global variable, nor does anyone else.  This
commit therefore removes this variable.  This reduces memory footprint
and also removes some atomic instructions and memory barriers from
the dyntick-idle path.

Signed-off-by: Alex Shi <alex.shi@intel.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 include/linux/sched.h    |    1 -
 kernel/sched.c           |   11 -----------
 kernel/time/tick-sched.c |    6 ------
 3 files changed, 0 insertions(+), 18 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 14a6c7b..d1a1533 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -270,7 +270,6 @@ extern void init_idle_bootup_task(struct task_struct *idle);
 
 extern int runqueue_is_locked(int cpu);
 
-extern cpumask_var_t nohz_cpu_mask;
 #if defined(CONFIG_SMP) && defined(CONFIG_NO_HZ)
 extern void select_nohz_load_balancer(int stop_tick);
 extern int get_nohz_timer_target(void);
diff --git a/kernel/sched.c b/kernel/sched.c
index 1c87917..313c0f6 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -5940,15 +5940,6 @@ void __cpuinit init_idle(struct task_struct *idle, int cpu)
 }
 
 /*
- * In a system that switches off the HZ timer nohz_cpu_mask
- * indicates which cpus entered this state. This is used
- * in the rcu update to wait only for active cpus. For system
- * which do not switch off the HZ timer nohz_cpu_mask should
- * always be CPU_BITS_NONE.
- */
-cpumask_var_t nohz_cpu_mask;
-
-/*
  * Increase the granularity value when there are more CPUs,
  * because with more CPUs the 'effective latency' as visible
  * to users decreases. But the relationship is not linear,
@@ -8167,8 +8158,6 @@ void __init sched_init(void)
 	 */
 	current->sched_class = &fair_sched_class;
 
-	/* Allocate the nohz_cpu_mask if CONFIG_CPUMASK_OFFSTACK */
-	zalloc_cpumask_var(&nohz_cpu_mask, GFP_NOWAIT);
 #ifdef CONFIG_SMP
 	zalloc_cpumask_var(&sched_domains_tmpmask, GFP_NOWAIT);
 #ifdef CONFIG_NO_HZ
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index d5097c4..eb98e55 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -139,7 +139,6 @@ static void tick_nohz_update_jiffies(ktime_t now)
 	struct tick_sched *ts = &per_cpu(tick_cpu_sched, cpu);
 	unsigned long flags;
 
-	cpumask_clear_cpu(cpu, nohz_cpu_mask);
 	ts->idle_waketime = now;
 
 	local_irq_save(flags);
@@ -389,9 +388,6 @@ void tick_nohz_stop_sched_tick(int inidle)
 		else
 			expires.tv64 = KTIME_MAX;
 
-		if (delta_jiffies > 1)
-			cpumask_set_cpu(cpu, nohz_cpu_mask);
-
 		/* Skip reprogram of event if its not changed */
 		if (ts->tick_stopped && ktime_equal(expires, dev->next_event))
 			goto out;
@@ -441,7 +437,6 @@ void tick_nohz_stop_sched_tick(int inidle)
 		 * softirq.
 		 */
 		tick_do_update_jiffies64(ktime_get());
-		cpumask_clear_cpu(cpu, nohz_cpu_mask);
 	}
 	raise_softirq_irqoff(TIMER_SOFTIRQ);
 out:
@@ -524,7 +519,6 @@ void tick_nohz_restart_sched_tick(void)
 	/* Update jiffies first */
 	select_nohz_load_balancer(0);
 	tick_do_update_jiffies64(now);
-	cpumask_clear_cpu(cpu, nohz_cpu_mask);
 
 #ifndef CONFIG_VIRT_CPU_ACCOUNTING
 	/*
-- 
1.7.3.2


^ permalink raw reply related	[flat|nested] 101+ messages in thread

* [PATCH tip/core/rcu 30/55] rcu: Eliminate in_irq() checks in rcu_enter_nohz()
  2011-09-06 18:00 [PATCH tip/core/rcu 0/55] Preview of RCU changes for 3.2 Paul E. McKenney
                   ` (28 preceding siblings ...)
  2011-09-06 18:00 ` [PATCH tip/core/rcu 29/55] nohz: Remove nohz_cpu_mask Paul E. McKenney
@ 2011-09-06 18:00 ` Paul E. McKenney
  2011-09-06 18:00 ` [PATCH tip/core/rcu 31/55] rcu: Make rcu_implicit_dynticks_qs() locals be correct size Paul E. McKenney
                   ` (26 subsequent siblings)
  56 siblings, 0 replies; 101+ messages in thread
From: Paul E. McKenney @ 2011-09-06 18:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, patches, Paul E. McKenney

The in_irq() check in rcu_enter_nohz() is redundant because if we really
are in an interrupt, the attempt to re-enter dyntick-idle mode will invoke
rcu_needs_cpu() in any case, which will force the check for RCU callbacks.
So this commit removes the check along with the set_need_resched().

Suggested-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcutree.c |    7 -------
 1 files changed, 0 insertions(+), 7 deletions(-)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index 7e2f297..0c6c30d 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -373,13 +373,6 @@ void rcu_enter_nohz(void)
 	smp_mb__after_atomic_inc();  /* Force ordering with next sojourn. */
 	WARN_ON_ONCE(atomic_read(&rdtp->dynticks) & 0x1);
 	local_irq_restore(flags);
-
-	/* If the interrupt queued a callback, get out of dyntick mode. */
-	if (in_irq() &&
-	    (__get_cpu_var(rcu_sched_data).nxtlist ||
-	     __get_cpu_var(rcu_bh_data).nxtlist ||
-	     rcu_preempt_needs_cpu(smp_processor_id())))
-		set_need_resched();
 }
 
 /*
-- 
1.7.3.2


^ permalink raw reply related	[flat|nested] 101+ messages in thread

* [PATCH tip/core/rcu 31/55] rcu: Make rcu_implicit_dynticks_qs() locals be correct size
  2011-09-06 18:00 [PATCH tip/core/rcu 0/55] Preview of RCU changes for 3.2 Paul E. McKenney
                   ` (29 preceding siblings ...)
  2011-09-06 18:00 ` [PATCH tip/core/rcu 30/55] rcu: Eliminate in_irq() checks in rcu_enter_nohz() Paul E. McKenney
@ 2011-09-06 18:00 ` Paul E. McKenney
  2011-10-17  1:43   ` Josh Triplett
  2011-09-06 18:00 ` [PATCH tip/core/rcu 32/55] rcu: Make rcu_assign_pointer() unconditionally insert a memory barrier Paul E. McKenney
                   ` (25 subsequent siblings)
  56 siblings, 1 reply; 101+ messages in thread
From: Paul E. McKenney @ 2011-09-06 18:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, patches, Paul E. McKenney

When the ->dynticks field in the rcu_dynticks structure changed to an
atomic_t, its size on 64-bit systems changed from 64 bits to 32 bits.
The local variables in rcu_implicit_dynticks_qs() need to change as
well, hence this commit.
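
For reference, the wraparound-safe comparison macros involved are
defined along the following lines in include/linux/rcupdate.h (a
paraphrased sketch):

	#define UINT_CMP_GE(a, b)	(UINT_MAX / 2 >= (a) - (b))
	#define ULONG_CMP_GE(a, b)	(ULONG_MAX / 2 >= (a) - (b))

The unsigned subtraction handles wraparound correctly only when the
operands match the width of the counter being compared, hence the
change of the local variables to unsigned int.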

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcutree.c |   10 +++++-----
 1 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index 0c6c30d..ebd18e5 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -488,11 +488,11 @@ static int dyntick_save_progress_counter(struct rcu_data *rdp)
  */
 static int rcu_implicit_dynticks_qs(struct rcu_data *rdp)
 {
-	unsigned long curr;
-	unsigned long snap;
+	unsigned int curr;
+	unsigned int snap;
 
-	curr = (unsigned long)atomic_add_return(0, &rdp->dynticks->dynticks);
-	snap = (unsigned long)rdp->dynticks_snap;
+	curr = (unsigned int)atomic_add_return(0, &rdp->dynticks->dynticks);
+	snap = (unsigned int)rdp->dynticks_snap;
 
 	/*
 	 * If the CPU passed through or entered a dynticks idle phase with
@@ -502,7 +502,7 @@ static int rcu_implicit_dynticks_qs(struct rcu_data *rdp)
 	 * read-side critical section that started before the beginning
 	 * of the current RCU grace period.
 	 */
-	if ((curr & 0x1) == 0 || ULONG_CMP_GE(curr, snap + 2)) {
+	if ((curr & 0x1) == 0 || UINT_CMP_GE(curr, snap + 2)) {
 		trace_rcu_fqs(rdp->rsp->name, rdp->gpnum, rdp->cpu, "dti");
 		rdp->dynticks_fqs++;
 		return 1;
-- 
1.7.3.2


^ permalink raw reply related	[flat|nested] 101+ messages in thread

* [PATCH tip/core/rcu 32/55] rcu: Make rcu_assign_pointer() unconditionally insert a memory barrier
  2011-09-06 18:00 [PATCH tip/core/rcu 0/55] Preview of RCU changes for 3.2 Paul E. McKenney
                   ` (30 preceding siblings ...)
  2011-09-06 18:00 ` [PATCH tip/core/rcu 31/55] rcu: Make rcu_implicit_dynticks_qs() locals be correct size Paul E. McKenney
@ 2011-09-06 18:00 ` Paul E. McKenney
  2011-09-06 18:00 ` [PATCH tip/core/rcu 33/55] rcu: Improve rcu_assign_pointer() and RCU_INIT_POINTER() documentation Paul E. McKenney
                   ` (24 subsequent siblings)
  56 siblings, 0 replies; 101+ messages in thread
From: Paul E. McKenney @ 2011-09-06 18:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, patches, Paul E. McKenney

From: Eric Dumazet <eric.dumazet@gmail.com>

Recent changes to gcc give warning messages on rcu_assign_pointer()'s
checks that allow it to determine when it is OK to omit the memory
barrier.  Stephen Hemminger tried a number of gcc tricks to silence
this warning, but #pragmas and CPP macros do not work together in the
way that would be required to make this work.

However, we now have RCU_INIT_POINTER(), which already omits this
memory barrier, and which therefore may be used when assigning NULL to
an RCU-protected pointer that is accessible to readers.  This commit
therefore makes rcu_assign_pointer() unconditionally emit the memory
barrier.
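
As a usage sketch (the pointer "gp" here is hypothetical), the two
cases now look as follows:

	rcu_assign_pointer(gp, p);	/* smp_wmb() orders init before publish. */
	RCU_INIT_POINTER(gp, NULL);	/* NULLing out needs no barrier. */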

Reported-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 include/linux/rcupdate.h |    4 +---
 1 files changed, 1 insertions(+), 3 deletions(-)

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index ea80396..b2e5fe8 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -443,9 +443,7 @@ extern int rcu_my_thread_group_empty(void);
 	})
 #define __rcu_assign_pointer(p, v, space) \
 	({ \
-		if (!__builtin_constant_p(v) || \
-		    ((v) != NULL)) \
-			smp_wmb(); \
+		smp_wmb(); \
 		(p) = (typeof(*v) __force space *)(v); \
 	})
 
-- 
1.7.3.2


^ permalink raw reply related	[flat|nested] 101+ messages in thread

* [PATCH tip/core/rcu 33/55] rcu: Improve rcu_assign_pointer() and RCU_INIT_POINTER() documentation
  2011-09-06 18:00 [PATCH tip/core/rcu 0/55] Preview of RCU changes for 3.2 Paul E. McKenney
                   ` (31 preceding siblings ...)
  2011-09-06 18:00 ` [PATCH tip/core/rcu 32/55] rcu: Make rcu_assign_pointer() unconditionally insert a memory barrier Paul E. McKenney
@ 2011-09-06 18:00 ` Paul E. McKenney
  2011-09-06 18:00 ` [PATCH tip/core/rcu 34/55] rcu: Move __rcu_read_unlock()'s barrier() within if-statement Paul E. McKenney
                   ` (23 subsequent siblings)
  56 siblings, 0 replies; 101+ messages in thread
From: Paul E. McKenney @ 2011-09-06 18:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, patches, Paul E. McKenney

The differences between rcu_assign_pointer() and RCU_INIT_POINTER() are
subtle, and it is easy to use the cheaper RCU_INIT_POINTER() when
the more-expensive rcu_assign_pointer() should have been used instead.
The consequences of this mistake are quite severe.

This commit therefore carefully lays out the situations in which it is
permissible to use RCU_INIT_POINTER().
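
A minimal sketch of the linked-structure rule laid out in the comment
header below (all names here are hypothetical):

	p = kmalloc(sizeof(*p), GFP_KERNEL);
	p->a = 1;
	RCU_INIT_POINTER(p->next, q);	/* Not yet visible to readers. */
	rcu_assign_pointer(gp, p);	/* Publish with full ordering. */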

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 include/linux/rcupdate.h |   47 +++++++++++++++++++++++++++++++++++++++------
 1 files changed, 40 insertions(+), 7 deletions(-)

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index b2e5fe8..9873040 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -754,11 +754,18 @@ static inline notrace void rcu_read_unlock_sched_notrace(void)
  * any prior initialization.  Returns the value assigned.
  *
  * Inserts memory barriers on architectures that require them
- * (pretty much all of them other than x86), and also prevents
- * the compiler from reordering the code that initializes the
- * structure after the pointer assignment.  More importantly, this
- * call documents which pointers will be dereferenced by RCU read-side
- * code.
+ * (which is most of them), and also prevents the compiler from
+ * reordering the code that initializes the structure after the pointer
+ * assignment.  More importantly, this call documents which pointers
+ * will be dereferenced by RCU read-side code.
+ *
+ * In some special cases, you may use RCU_INIT_POINTER() instead
+ * of rcu_assign_pointer().  RCU_INIT_POINTER() is a bit faster due
+ * to the fact that it does not constrain either the CPU or the compiler.
+ * That said, using RCU_INIT_POINTER() when you should have used
+ * rcu_assign_pointer() is a very bad thing that results in
+ * impossible-to-diagnose memory corruption.  So please be careful.
+ * See the RCU_INIT_POINTER() comment header for details.
  */
 #define rcu_assign_pointer(p, v) \
 	__rcu_assign_pointer((p), (v), __rcu)
@@ -766,8 +773,34 @@ static inline notrace void rcu_read_unlock_sched_notrace(void)
 /**
  * RCU_INIT_POINTER() - initialize an RCU protected pointer
  *
- * Initialize an RCU-protected pointer in such a way to avoid RCU-lockdep
- * splats.
+ * Initialize an RCU-protected pointer in special cases where readers
+ * do not need ordering constraints on the CPU or the compiler.  These
+ * special cases are:
+ *
+ * 1.	This use of RCU_INIT_POINTER() is NULLing out the pointer -or-
+ * 2.	The caller has taken whatever steps are required to prevent
+ *	RCU readers from concurrently accessing this pointer -or-
+ * 3.	The referenced data structure has already been exposed to
+ *	readers either at compile time or via rcu_assign_pointer() -and-
+ *	a.	You have not made -any- reader-visible changes to
+ *		this structure since then -or-
+ *	b.	It is OK for readers accessing this structure from its
+ *		new location to see the old state of the structure.  (For
+ *		example, the changes were to statistical counters or to
+ *		other state where exact synchronization is not required.)
+ *
+ * Failure to follow these rules governing use of RCU_INIT_POINTER() will
+ * result in impossible-to-diagnose memory corruption.  As in the structures
+ * will look OK in crash dumps, but any concurrent RCU readers might
+ * see pre-initialized values of the referenced data structure.  So
+ * please be very careful how you use RCU_INIT_POINTER()!!!
+ *
+ * If you are creating an RCU-protected linked structure that is accessed
+ * by a single external-to-structure RCU-protected pointer, then you may
+ * use RCU_INIT_POINTER() to initialize the internal RCU-protected
+ * pointers, but you must use rcu_assign_pointer() to initialize the
+ * external-to-structure pointer -after- you have completely initialized
+ * the reader-accessible portions of the linked structure.
  */
 #define RCU_INIT_POINTER(p, v) \
 		p = (typeof(*v) __force __rcu *)(v)
-- 
1.7.3.2


^ permalink raw reply related	[flat|nested] 101+ messages in thread

* [PATCH tip/core/rcu 34/55] rcu: Move __rcu_read_unlock()'s barrier() within if-statement
  2011-09-06 18:00 [PATCH tip/core/rcu 0/55] Preview of RCU changes for 3.2 Paul E. McKenney
                   ` (32 preceding siblings ...)
  2011-09-06 18:00 ` [PATCH tip/core/rcu 33/55] rcu: Improve rcu_assign_pointer() and RCU_INIT_POINTER() documentation Paul E. McKenney
@ 2011-09-06 18:00 ` Paul E. McKenney
  2011-09-06 18:00 ` [PATCH tip/core/rcu 35/55] rcu: Dump local stack if cannot dump all CPUs' stacks Paul E. McKenney
                   ` (22 subsequent siblings)
  56 siblings, 0 replies; 101+ messages in thread
From: Paul E. McKenney @ 2011-09-06 18:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, patches, Paul E. McKenney, Paul E. McKenney

From: Paul E. McKenney <paul.mckenney@linaro.org>

We only need to constrain the compiler if we are actually exiting
the top-level RCU read-side critical section.  This commit therefore
moves the first barrier() call in __rcu_read_unlock() to inside the
"if" statement, thus avoiding needless register flushes for inner
rcu_read_unlock() calls.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 include/linux/rcupdate.h |   14 ++------------
 kernel/rcutree_plugin.h  |    2 +-
 2 files changed, 3 insertions(+), 13 deletions(-)

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 9873040..e86bc28 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -134,16 +134,6 @@ extern void call_rcu_sched(struct rcu_head *head,
 
 extern void synchronize_sched(void);
 
-static inline void __rcu_read_lock_bh(void)
-{
-	local_bh_disable();
-}
-
-static inline void __rcu_read_unlock_bh(void)
-{
-	local_bh_enable();
-}
-
 #ifdef CONFIG_PREEMPT_RCU
 
 extern void __rcu_read_lock(void);
@@ -686,7 +676,7 @@ static inline void rcu_read_unlock(void)
  */
 static inline void rcu_read_lock_bh(void)
 {
-	__rcu_read_lock_bh();
+	local_bh_disable();
 	__acquire(RCU_BH);
 	rcu_read_acquire_bh();
 }
@@ -700,7 +690,7 @@ static inline void rcu_read_unlock_bh(void)
 {
 	rcu_read_release_bh();
 	__release(RCU_BH);
-	__rcu_read_unlock_bh();
+	local_bh_enable();
 }
 
 /**
diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
index 4bac5a2..ed70f6b 100644
--- a/kernel/rcutree_plugin.h
+++ b/kernel/rcutree_plugin.h
@@ -415,10 +415,10 @@ void __rcu_read_unlock(void)
 {
 	struct task_struct *t = current;
 
-	barrier();  /* needed if we ever invoke rcu_read_unlock in rcutree.c */
 	if (t->rcu_read_lock_nesting != 1)
 		--t->rcu_read_lock_nesting;
 	else {
+		barrier();  /* critical section before exit code. */
 		t->rcu_read_lock_nesting = INT_MIN;
 		barrier();  /* assign before ->rcu_read_unlock_special load */
 		if (unlikely(ACCESS_ONCE(t->rcu_read_unlock_special)))
-- 
1.7.3.2


^ permalink raw reply related	[flat|nested] 101+ messages in thread

* [PATCH tip/core/rcu 35/55] rcu: Dump local stack if cannot dump all CPUs' stacks
  2011-09-06 18:00 [PATCH tip/core/rcu 0/55] Preview of RCU changes for 3.2 Paul E. McKenney
                   ` (33 preceding siblings ...)
  2011-09-06 18:00 ` [PATCH tip/core/rcu 34/55] rcu: Move __rcu_read_unlock()'s barrier() within if-statement Paul E. McKenney
@ 2011-09-06 18:00 ` Paul E. McKenney
  2011-09-06 18:00 ` [PATCH tip/core/rcu 36/55] rcu: Prevent early boot set_need_resched() from __rcu_pending() Paul E. McKenney
                   ` (21 subsequent siblings)
  56 siblings, 0 replies; 101+ messages in thread
From: Paul E. McKenney @ 2011-09-06 18:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, patches, Paul E. McKenney

The trigger_all_cpu_backtrace() function is a no-op in architectures that
do not define arch_trigger_all_cpu_backtrace.  On such architectures, RCU
CPU stall warning messages contain no stack trace information, which makes
debugging quite difficult.  This commit therefore substitutes dump_stack()
for architectures that do not define arch_trigger_all_cpu_backtrace,
so that at least the local CPU's stack is dumped as part of the RCU CPU
stall warning message.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcutree.c |    6 ++++--
 1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index ebd18e5..a07bf55 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -583,7 +583,8 @@ static void print_other_cpu_stall(struct rcu_state *rsp)
 	}
 	printk("} (detected by %d, t=%ld jiffies)\n",
 	       smp_processor_id(), (long)(jiffies - rsp->gp_start));
-	trigger_all_cpu_backtrace();
+	if (!trigger_all_cpu_backtrace())
+		dump_stack();
 
 	/* If so configured, complain about tasks blocking the grace period. */
 
@@ -604,7 +605,8 @@ static void print_cpu_stall(struct rcu_state *rsp)
 	 */
 	printk(KERN_ERR "INFO: %s detected stall on CPU %d (t=%lu jiffies)\n",
 	       rsp->name, smp_processor_id(), jiffies - rsp->gp_start);
-	trigger_all_cpu_backtrace();
+	if (!trigger_all_cpu_backtrace())
+		dump_stack();
 
 	raw_spin_lock_irqsave(&rnp->lock, flags);
 	if (ULONG_CMP_GE(jiffies, rsp->jiffies_stall))
-- 
1.7.3.2


^ permalink raw reply related	[flat|nested] 101+ messages in thread

* [PATCH tip/core/rcu 36/55] rcu: Prevent early boot set_need_resched() from __rcu_pending()
  2011-09-06 18:00 [PATCH tip/core/rcu 0/55] Preview of RCU changes for 3.2 Paul E. McKenney
                   ` (34 preceding siblings ...)
  2011-09-06 18:00 ` [PATCH tip/core/rcu 35/55] rcu: Dump local stack if cannot dump all CPUs' stacks Paul E. McKenney
@ 2011-09-06 18:00 ` Paul E. McKenney
  2011-10-17  1:49   ` Josh Triplett
  2011-09-06 18:00 ` [PATCH tip/core/rcu 37/55] rcu: Simplify unboosting checks Paul E. McKenney
                   ` (20 subsequent siblings)
  56 siblings, 1 reply; 101+ messages in thread
From: Paul E. McKenney @ 2011-09-06 18:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, patches, Paul E. McKenney, Paul E. McKenney

From: Paul E. McKenney <paul.mckenney@linaro.org>

There isn't a whole lot of point in poking the scheduler before there
are other tasks to switch to.  This commit therefore adds a check
for rcu_scheduler_fully_active in __rcu_pending() to suppress any
pre-scheduler calls to set_need_resched().  The downside of this approach
is additional runtime overhead in a reasonably hot code path.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcutree.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index a07bf55..0051dbf 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -1707,7 +1707,8 @@ static int __rcu_pending(struct rcu_state *rsp, struct rcu_data *rdp)
 	check_cpu_stall(rsp, rdp);
 
 	/* Is the RCU core waiting for a quiescent state from this CPU? */
-	if (rdp->qs_pending && !rdp->passed_quiesce) {
+	if (rcu_scheduler_fully_active &&
+	    rdp->qs_pending && !rdp->passed_quiesce) {
 
 		/*
 		 * If force_quiescent_state() coming soon and this CPU
-- 
1.7.3.2


^ permalink raw reply related	[flat|nested] 101+ messages in thread

* [PATCH tip/core/rcu 37/55] rcu: Simplify unboosting checks
  2011-09-06 18:00 [PATCH tip/core/rcu 0/55] Preview of RCU changes for 3.2 Paul E. McKenney
                   ` (35 preceding siblings ...)
  2011-09-06 18:00 ` [PATCH tip/core/rcu 36/55] rcu: Prevent early boot set_need_resched() from __rcu_pending() Paul E. McKenney
@ 2011-09-06 18:00 ` Paul E. McKenney
  2011-09-06 18:00 ` [PATCH tip/core/rcu 38/55] rcu: Prohibit grace periods during early boot Paul E. McKenney
                   ` (19 subsequent siblings)
  56 siblings, 0 replies; 101+ messages in thread
From: Paul E. McKenney @ 2011-09-06 18:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, patches, Paul E. McKenney, Paul E. McKenney

From: Paul E. McKenney <paul.mckenney@linaro.org>

Commit 7765be (Fix RCU_BOOST race handling current->rcu_read_unlock_special)
introduced a new ->rcu_boosted field in the task structure.  This is
redundant because the existing ->rcu_boost_mutex will be non-NULL at
any time that ->rcu_boosted is nonzero.  Therefore, this commit removes
->rcu_boosted and tests ->rcu_boost_mutex instead.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 include/linux/sched.h   |    3 ---
 kernel/rcutree_plugin.h |   20 ++++++++++----------
 2 files changed, 10 insertions(+), 13 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index d1a1533..90b02be 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1259,9 +1259,6 @@ struct task_struct {
 #ifdef CONFIG_PREEMPT_RCU
 	int rcu_read_lock_nesting;
 	char rcu_read_unlock_special;
-#if defined(CONFIG_RCU_BOOST) && defined(CONFIG_TREE_PREEMPT_RCU)
-	int rcu_boosted;
-#endif /* #if defined(CONFIG_RCU_BOOST) && defined(CONFIG_TREE_PREEMPT_RCU) */
 	struct list_head rcu_node_entry;
 #endif /* #ifdef CONFIG_PREEMPT_RCU */
 #ifdef CONFIG_TREE_PREEMPT_RCU
diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
index ed70f6b..eeb38ee 100644
--- a/kernel/rcutree_plugin.h
+++ b/kernel/rcutree_plugin.h
@@ -306,6 +306,9 @@ static noinline void rcu_read_unlock_special(struct task_struct *t)
 	int empty_exp;
 	unsigned long flags;
 	struct list_head *np;
+#ifdef CONFIG_RCU_BOOST
+	struct rt_mutex *rbmp = NULL;
+#endif /* #ifdef CONFIG_RCU_BOOST */
 	struct rcu_node *rnp;
 	int special;
 
@@ -351,6 +354,7 @@ static noinline void rcu_read_unlock_special(struct task_struct *t)
 		smp_mb(); /* ensure expedited fastpath sees end of RCU c-s. */
 		np = rcu_next_node_entry(t, rnp);
 		list_del_init(&t->rcu_node_entry);
+		t->rcu_blocked_node = NULL;
 		trace_rcu_unlock_preempted_task("rcu_preempt",
 						rnp->gpnum, t->pid);
 		if (&t->rcu_node_entry == rnp->gp_tasks)
@@ -360,13 +364,12 @@ static noinline void rcu_read_unlock_special(struct task_struct *t)
 #ifdef CONFIG_RCU_BOOST
 		if (&t->rcu_node_entry == rnp->boost_tasks)
 			rnp->boost_tasks = np;
-		/* Snapshot and clear ->rcu_boosted with rcu_node lock held. */
-		if (t->rcu_boosted) {
-			special |= RCU_READ_UNLOCK_BOOSTED;
-			t->rcu_boosted = 0;
+		/* Snapshot/clear ->rcu_boost_mutex with rcu_node lock held. */
+		if (t->rcu_boost_mutex) {
+			rbmp = t->rcu_boost_mutex;
+			t->rcu_boost_mutex = NULL;
 		}
 #endif /* #ifdef CONFIG_RCU_BOOST */
-		t->rcu_blocked_node = NULL;
 
 		/*
 		 * If this was the last task on the current list, and if
@@ -387,10 +390,8 @@ static noinline void rcu_read_unlock_special(struct task_struct *t)
 
 #ifdef CONFIG_RCU_BOOST
 		/* Unboost if we were boosted. */
-		if (special & RCU_READ_UNLOCK_BOOSTED) {
-			rt_mutex_unlock(t->rcu_boost_mutex);
-			t->rcu_boost_mutex = NULL;
-		}
+		if (rbmp)
+			rt_mutex_unlock(rbmp);
 #endif /* #ifdef CONFIG_RCU_BOOST */
 
 		/*
@@ -1206,7 +1207,6 @@ static int rcu_boost(struct rcu_node *rnp)
 	t = container_of(tb, struct task_struct, rcu_node_entry);
 	rt_mutex_init_proxy_locked(&mtx, t);
 	t->rcu_boost_mutex = &mtx;
-	t->rcu_boosted = 1;
 	raw_spin_unlock_irqrestore(&rnp->lock, flags);
 	rt_mutex_lock(&mtx);  /* Side effect: boosts task t's priority. */
 	rt_mutex_unlock(&mtx);  /* Keep lockdep happy. */
-- 
1.7.3.2


^ permalink raw reply related	[flat|nested] 101+ messages in thread

* [PATCH tip/core/rcu 38/55] rcu: Prohibit grace periods during early boot
  2011-09-06 18:00 [PATCH tip/core/rcu 0/55] Preview of RCU changes for 3.2 Paul E. McKenney
                   ` (36 preceding siblings ...)
  2011-09-06 18:00 ` [PATCH tip/core/rcu 37/55] rcu: Simplify unboosting checks Paul E. McKenney
@ 2011-09-06 18:00 ` Paul E. McKenney
  2011-10-17  1:51   ` Josh Triplett
  2011-09-06 18:00 ` [PATCH tip/core/rcu 39/55] rcu: Suppress NMI backtraces when stall ends before dump Paul E. McKenney
                   ` (18 subsequent siblings)
  56 siblings, 1 reply; 101+ messages in thread
From: Paul E. McKenney @ 2011-09-06 18:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, patches, Paul E. McKenney, Paul E. McKenney

From: Paul E. McKenney <paul.mckenney@linaro.org>

Greater use of RCU during early boot (before the scheduler is operating)
is causing RCU to attempt to start grace periods during that time, which
in turn is resulting in both RCU and the callback functions attempting
to use the scheduler before it is ready.

This commit prevents these problems by prohibiting RCU grace periods
until after the scheduler has spawned the first non-idle task.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcutree.c |    7 +++++--
 1 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index 0051dbf..9970116 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -838,8 +838,11 @@ rcu_start_gp(struct rcu_state *rsp, unsigned long flags)
 	struct rcu_data *rdp = this_cpu_ptr(rsp->rda);
 	struct rcu_node *rnp = rcu_get_root(rsp);
 
-	if (!cpu_needs_another_gp(rsp, rdp) || rsp->fqs_active) {
-		if (cpu_needs_another_gp(rsp, rdp))
+	if (!rcu_scheduler_fully_active ||
+	    !cpu_needs_another_gp(rsp, rdp) ||
+	    rsp->fqs_active) {
+		if (rcu_scheduler_fully_active &&
+		    cpu_needs_another_gp(rsp, rdp))
 			rsp->fqs_need_gp = 1;
 		if (rnp->completed == rsp->completed) {
 			raw_spin_unlock_irqrestore(&rnp->lock, flags);
-- 
1.7.3.2


^ permalink raw reply related	[flat|nested] 101+ messages in thread

* [PATCH tip/core/rcu 39/55] rcu: Suppress NMI backtraces when stall ends before dump
  2011-09-06 18:00 [PATCH tip/core/rcu 0/55] Preview of RCU changes for 3.2 Paul E. McKenney
                   ` (37 preceding siblings ...)
  2011-09-06 18:00 ` [PATCH tip/core/rcu 38/55] rcu: Prohibit grace periods during early boot Paul E. McKenney
@ 2011-09-06 18:00 ` Paul E. McKenney
  2011-09-06 18:00 ` [PATCH tip/core/rcu 40/55] rcu: Avoid having just-onlined CPU resched itself when RCU is idle Paul E. McKenney
                   ` (17 subsequent siblings)
  56 siblings, 0 replies; 101+ messages in thread
From: Paul E. McKenney @ 2011-09-06 18:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, patches, Paul E. McKenney

It is possible for an RCU CPU stall to end just as it is detected, in
which case the current code will uselessly dump all CPUs' stacks.
This commit therefore checks for this condition and refrains from
sending needless NMIs.

And yes, the stall might also end just after we checked all CPUs and
tasks, but in that case we would at least have given some clue as
to which CPU/task was at fault.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcutree.c        |   13 +++++++++----
 kernel/rcutree.h        |    2 +-
 kernel/rcutree_plugin.h |   13 +++++++++----
 3 files changed, 19 insertions(+), 9 deletions(-)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index 9970116..ade7883 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -545,6 +545,7 @@ static void print_other_cpu_stall(struct rcu_state *rsp)
 	int cpu;
 	long delta;
 	unsigned long flags;
+	int ndetected;
 	struct rcu_node *rnp = rcu_get_root(rsp);
 
 	/* Only let one CPU complain about others per time interval. */
@@ -561,7 +562,7 @@ static void print_other_cpu_stall(struct rcu_state *rsp)
 	 * Now rat on any tasks that got kicked up to the root rcu_node
 	 * due to CPU offlining.
 	 */
-	rcu_print_task_stall(rnp);
+	ndetected = rcu_print_task_stall(rnp);
 	raw_spin_unlock_irqrestore(&rnp->lock, flags);
 
 	/*
@@ -573,17 +574,21 @@ static void print_other_cpu_stall(struct rcu_state *rsp)
 	       rsp->name);
 	rcu_for_each_leaf_node(rsp, rnp) {
 		raw_spin_lock_irqsave(&rnp->lock, flags);
-		rcu_print_task_stall(rnp);
+		ndetected += rcu_print_task_stall(rnp);
 		raw_spin_unlock_irqrestore(&rnp->lock, flags);
 		if (rnp->qsmask == 0)
 			continue;
 		for (cpu = 0; cpu <= rnp->grphi - rnp->grplo; cpu++)
-			if (rnp->qsmask & (1UL << cpu))
+			if (rnp->qsmask & (1UL << cpu)) {
 				printk(" %d", rnp->grplo + cpu);
+				ndetected++;
+			}
 	}
 	printk("} (detected by %d, t=%ld jiffies)\n",
 	       smp_processor_id(), (long)(jiffies - rsp->gp_start));
-	if (!trigger_all_cpu_backtrace())
+	if (ndetected == 0)
+		printk(KERN_ERR "INFO: Stall ended before state dump start\n");
+	else if (!trigger_all_cpu_backtrace())
 		dump_stack();
 
 	/* If so configured, complain about tasks blocking the grace period. */
diff --git a/kernel/rcutree.h b/kernel/rcutree.h
index 51638b6..f509f72 100644
--- a/kernel/rcutree.h
+++ b/kernel/rcutree.h
@@ -438,7 +438,7 @@ static void rcu_report_unblock_qs_rnp(struct rcu_node *rnp,
 static void rcu_stop_cpu_kthread(int cpu);
 #endif /* #ifdef CONFIG_HOTPLUG_CPU */
 static void rcu_print_detail_task_stall(struct rcu_state *rsp);
-static void rcu_print_task_stall(struct rcu_node *rnp);
+static int rcu_print_task_stall(struct rcu_node *rnp);
 static void rcu_preempt_stall_reset(void);
 static void rcu_preempt_check_blocked_tasks(struct rcu_node *rnp);
 #ifdef CONFIG_HOTPLUG_CPU
diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
index eeb38ee..d3127e8 100644
--- a/kernel/rcutree_plugin.h
+++ b/kernel/rcutree_plugin.h
@@ -483,16 +483,20 @@ static void rcu_print_detail_task_stall(struct rcu_state *rsp)
  * Scan the current list of tasks blocked within RCU read-side critical
  * sections, printing out the tid of each.
  */
-static void rcu_print_task_stall(struct rcu_node *rnp)
+static int rcu_print_task_stall(struct rcu_node *rnp)
 {
 	struct task_struct *t;
+	int ndetected = 0;
 
 	if (!rcu_preempt_blocked_readers_cgp(rnp))
-		return;
+		return 0;
 	t = list_entry(rnp->gp_tasks,
 		       struct task_struct, rcu_node_entry);
-	list_for_each_entry_continue(t, &rnp->blkd_tasks, rcu_node_entry)
+	list_for_each_entry_continue(t, &rnp->blkd_tasks, rcu_node_entry) {
 		printk(" P%d", t->pid);
+		ndetected++;
+	}
+	return ndetected;
 }
 
 /*
@@ -976,8 +980,9 @@ static void rcu_print_detail_task_stall(struct rcu_state *rsp)
  * Because preemptible RCU does not exist, we never have to check for
  * tasks blocked within RCU read-side critical sections.
  */
-static void rcu_print_task_stall(struct rcu_node *rnp)
+static int rcu_print_task_stall(struct rcu_node *rnp)
 {
+	return 0;
 }
 
 /*
-- 
1.7.3.2


^ permalink raw reply related	[flat|nested] 101+ messages in thread

* [PATCH tip/core/rcu 40/55] rcu: Avoid having just-onlined CPU resched itself when RCU is idle
  2011-09-06 18:00 [PATCH tip/core/rcu 0/55] Preview of RCU changes for 3.2 Paul E. McKenney
                   ` (38 preceding siblings ...)
  2011-09-06 18:00 ` [PATCH tip/core/rcu 39/55] rcu: Suppress NMI backtraces when stall ends before dump Paul E. McKenney
@ 2011-09-06 18:00 ` Paul E. McKenney
  2011-09-06 18:00 ` [PATCH tip/core/rcu 41/55] rcu: Permit rt_mutex_unlock() with irqs disabled Paul E. McKenney
                   ` (16 subsequent siblings)
  56 siblings, 0 replies; 101+ messages in thread
From: Paul E. McKenney @ 2011-09-06 18:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, patches, Paul E. McKenney

CPUs set rdp->qs_pending when coming online to resolve races with
grace-period start.  However, this means that if RCU is idle, the
just-onlined CPU might needlessly send itself resched IPIs.  Adjust
the online-CPU initialization to avoid this, and also to correctly
cause the CPU to respond to the current grace period if needed.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcutree.c |   11 ++++++++---
 1 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index ade7883..c95fa89 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -1911,8 +1911,6 @@ rcu_init_percpu_data(int cpu, struct rcu_state *rsp, int preemptible)
 
 	/* Set up local state, ensuring consistent view of global state. */
 	raw_spin_lock_irqsave(&rnp->lock, flags);
-	rdp->passed_quiesce = 0;  /* We could be racing with new GP, */
-	rdp->qs_pending = 1;	 /*  so set up to respond to current GP. */
 	rdp->beenonline = 1;	 /* We have now been online. */
 	rdp->preemptible = preemptible;
 	rdp->qlen_last_fqs_check = 0;
@@ -1937,8 +1935,15 @@ rcu_init_percpu_data(int cpu, struct rcu_state *rsp, int preemptible)
 		rnp->qsmaskinit |= mask;
 		mask = rnp->grpmask;
 		if (rnp == rdp->mynode) {
-			rdp->gpnum = rnp->completed; /* if GP in progress... */
+			/*
+			 * If there is a grace period in progress, we will
+			 * set up to wait for it next time we run the
+			 * RCU core code.
+			 */
+			rdp->gpnum = rnp->completed;
 			rdp->completed = rnp->completed;
+			rdp->passed_quiesce = 0;
+			rdp->qs_pending = 0;
 			rdp->passed_quiesce_gpnum = rnp->gpnum - 1;
 			trace_rcu_grace_period(rsp->name, rdp->gpnum, "cpuonl");
 		}
-- 
1.7.3.2


^ permalink raw reply related	[flat|nested] 101+ messages in thread

* [PATCH tip/core/rcu 41/55] rcu: Permit rt_mutex_unlock() with irqs disabled
  2011-09-06 18:00 [PATCH tip/core/rcu 0/55] Preview of RCU changes for 3.2 Paul E. McKenney
                   ` (39 preceding siblings ...)
  2011-09-06 18:00 ` [PATCH tip/core/rcu 40/55] rcu: Avoid having just-onlined CPU resched itself when RCU is idle Paul E. McKenney
@ 2011-09-06 18:00 ` Paul E. McKenney
  2011-09-18  4:09   ` Yong Zhang
  2011-09-06 18:00 ` [PATCH tip/core/rcu 42/55] rcu: Make rcu_torture_fqs() exit loops at end of test Paul E. McKenney
                   ` (15 subsequent siblings)
  56 siblings, 1 reply; 101+ messages in thread
From: Paul E. McKenney @ 2011-09-06 18:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, patches, Paul E. McKenney, Paul E. McKenney

From: Paul E. McKenney <paul.mckenney@linaro.org>

Create a separate lockdep class for the rt_mutex used for RCU priority
boosting and enable use of rt_mutex_lock() with irqs disabled.  This
prevents RCU priority boosting from falling prey to deadlocks when
someone begins an RCU read-side critical section in preemptible state,
but releases it with an irq-disabled lock held.

Unfortunately, the scheduler's runqueue and priority-inheritance locks
still must either completely enclose or be completely enclosed by any
overlapping RCU read-side critical section.
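
The deadlock-prone usage pattern that this change permits looks
roughly like the following (the lock is hypothetical; this is a
sketch, not code from this patch):

	rcu_read_lock();	/* Entered while preemptible... */
	...
	spin_lock_irqsave(&my_lock, flags);
	rcu_read_unlock();	/* ...exited with irqs off, possibly
				   calling rt_mutex_unlock() to unboost. */
	spin_unlock_irqrestore(&my_lock, flags);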

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcutree_plugin.h |    6 ++++++
 kernel/rtmutex.c        |    8 ++++++++
 2 files changed, 14 insertions(+), 0 deletions(-)

diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
index d3127e8..f6c63ea 100644
--- a/kernel/rcutree_plugin.h
+++ b/kernel/rcutree_plugin.h
@@ -1149,6 +1149,8 @@ static void rcu_initiate_boost_trace(struct rcu_node *rnp)
 
 #endif /* #else #ifdef CONFIG_RCU_TRACE */
 
+static struct lock_class_key rcu_boost_class;
+
 /*
  * Carry out RCU priority boosting on the task indicated by ->exp_tasks
  * or ->boost_tasks, advancing the pointer to the next task in the
@@ -1211,10 +1213,14 @@ static int rcu_boost(struct rcu_node *rnp)
 	 */
 	t = container_of(tb, struct task_struct, rcu_node_entry);
 	rt_mutex_init_proxy_locked(&mtx, t);
+	/* Avoid lockdep false positives.  This rt_mutex is its own thing. */
+	lockdep_set_class_and_name(&mtx.wait_lock, &rcu_boost_class,
+				   "rcu_boost_mutex");
 	t->rcu_boost_mutex = &mtx;
 	raw_spin_unlock_irqrestore(&rnp->lock, flags);
 	rt_mutex_lock(&mtx);  /* Side effect: boosts task t's priority. */
 	rt_mutex_unlock(&mtx);  /* Keep lockdep happy. */
+	local_irq_restore(flags);
 
 	return rnp->exp_tasks != NULL || rnp->boost_tasks != NULL;
 }
diff --git a/kernel/rtmutex.c b/kernel/rtmutex.c
index ab44911..2548f44 100644
--- a/kernel/rtmutex.c
+++ b/kernel/rtmutex.c
@@ -579,6 +579,7 @@ __rt_mutex_slowlock(struct rt_mutex *lock, int state,
 		    struct rt_mutex_waiter *waiter)
 {
 	int ret = 0;
+	int was_disabled;
 
 	for (;;) {
 		/* Try to acquire the lock: */
@@ -601,10 +602,17 @@ __rt_mutex_slowlock(struct rt_mutex *lock, int state,
 
 		raw_spin_unlock(&lock->wait_lock);
 
+		was_disabled = irqs_disabled();
+		if (was_disabled)
+			local_irq_enable();
+
 		debug_rt_mutex_print_deadlock(waiter);
 
 		schedule_rt_mutex(lock);
 
+		if (was_disabled)
+			local_irq_disable();
+
 		raw_spin_lock(&lock->wait_lock);
 		set_current_state(state);
 	}
-- 
1.7.3.2


^ permalink raw reply related	[flat|nested] 101+ messages in thread

* [PATCH tip/core/rcu 42/55] rcu: Make rcu_torture_fqs() exit loops at end of test
  2011-09-06 18:00 [PATCH tip/core/rcu 0/55] Preview of RCU changes for 3.2 Paul E. McKenney
                   ` (40 preceding siblings ...)
  2011-09-06 18:00 ` [PATCH tip/core/rcu 41/55] rcu: Permit rt_mutex_unlock() with irqs disabled Paul E. McKenney
@ 2011-09-06 18:00 ` Paul E. McKenney
  2011-10-17  1:53   ` Josh Triplett
  2011-09-06 18:00 ` [PATCH tip/core/rcu 43/55] rcu: Make rcu_torture_boost() " Paul E. McKenney
                   ` (14 subsequent siblings)
  56 siblings, 1 reply; 101+ messages in thread
From: Paul E. McKenney @ 2011-09-06 18:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, patches, Paul E. McKenney

The rcu_torture_fqs() function can prevent the rcutorture tests from
completing, resulting in a hang.  This commit therefore ensures that
rcu_torture_fqs() will exit its inner loops at the end of the test,
and also applies the newish ULONG_CMP_LT() macro to time comparisons.
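
For reference, ULONG_CMP_LT() is the wraparound-safe ordering test
from include/linux/rcupdate.h, defined along these lines:

	#define ULONG_CMP_LT(a, b)	(ULONG_MAX / 2 < (a) - (b))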

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcutorture.c |   10 ++++++----
 1 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/kernel/rcutorture.c b/kernel/rcutorture.c
index 6648e00..eb739c0 100644
--- a/kernel/rcutorture.c
+++ b/kernel/rcutorture.c
@@ -741,7 +741,7 @@ static int rcu_torture_boost(void *arg)
 	do {
 		/* Wait for the next test interval. */
 		oldstarttime = boost_starttime;
-		while (jiffies - oldstarttime > ULONG_MAX / 2) {
+		while (ULONG_CMP_LT(jiffies, oldstarttime)) {
 			schedule_timeout_uninterruptible(1);
 			rcu_stutter_wait("rcu_torture_boost");
 			if (kthread_should_stop() ||
@@ -752,7 +752,7 @@ static int rcu_torture_boost(void *arg)
 		/* Do one boost-test interval. */
 		endtime = oldstarttime + test_boost_duration * HZ;
 		call_rcu_time = jiffies;
-		while (jiffies - endtime > ULONG_MAX / 2) {
+		while (ULONG_CMP_LT(jiffies, endtime)) {
 			/* If we don't have a callback in flight, post one. */
 			if (!rbi.inflight) {
 				smp_mb(); /* RCU core before ->inflight = 1. */
@@ -818,11 +818,13 @@ rcu_torture_fqs(void *arg)
 	VERBOSE_PRINTK_STRING("rcu_torture_fqs task started");
 	do {
 		fqs_resume_time = jiffies + fqs_stutter * HZ;
-		while (jiffies - fqs_resume_time > LONG_MAX) {
+		while (ULONG_CMP_LT(jiffies, fqs_resume_time) &&
+		       !kthread_should_stop()) {
 			schedule_timeout_interruptible(1);
 		}
 		fqs_burst_remaining = fqs_duration;
-		while (fqs_burst_remaining > 0) {
+		while (fqs_burst_remaining > 0 &&
+		       !kthread_should_stop()) {
 			cur_ops->fqs();
 			udelay(fqs_holdoff);
 			fqs_burst_remaining -= fqs_holdoff;
-- 
1.7.3.2


^ permalink raw reply related	[flat|nested] 101+ messages in thread

* [PATCH tip/core/rcu 43/55] rcu: Make rcu_torture_boost() exit loops at end of test
  2011-09-06 18:00 [PATCH tip/core/rcu 0/55] Preview of RCU changes for 3.2 Paul E. McKenney
                   ` (41 preceding siblings ...)
  2011-09-06 18:00 ` [PATCH tip/core/rcu 42/55] rcu: Make rcu_torture_fqs() exit loops at end of test Paul E. McKenney
@ 2011-09-06 18:00 ` Paul E. McKenney
  2011-09-06 18:00 ` [PATCH tip/core/rcu 44/55] rcu: wire up RCU_BOOST_PRIO for rcutree Paul E. McKenney
                   ` (13 subsequent siblings)
  56 siblings, 0 replies; 101+ messages in thread
From: Paul E. McKenney @ 2011-09-06 18:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, patches, Paul E. McKenney, Paul E. McKenney

From: Paul E. McKenney <paul.mckenney@linaro.org>

One of the loops in rcu_torture_boost() fails to check kthread_should_stop(),
and thus might delay or even prevent completion of rcutorture tests
at rmmod time.  This commit adds the kthread_should_stop() check to the
offending loop.
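
The underlying rule, sketched generically (other_condition() is a
hypothetical placeholder): any loop in which a kthread can spin or
sleep must poll kthread_should_stop(), because kthread_stop() blocks
until the thread notices the stop request and exits.

	while (other_condition() && !kthread_should_stop()) {
		/* Wait politely rather than spinning. */
		schedule_timeout_interruptible(1);
	}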

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcutorture.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/kernel/rcutorture.c b/kernel/rcutorture.c
index eb739c0..76fe893 100644
--- a/kernel/rcutorture.c
+++ b/kernel/rcutorture.c
@@ -779,7 +779,8 @@ static int rcu_torture_boost(void *arg)
 		 * interval.  Besides, we are running at RT priority,
 		 * so delays should be relatively rare.
 		 */
-		while (oldstarttime == boost_starttime) {
+		while (oldstarttime == boost_starttime &&
+		       !kthread_should_stop()) {
 			if (mutex_trylock(&boost_mutex)) {
 				boost_starttime = jiffies +
 						  test_boost_interval * HZ;
-- 
1.7.3.2


^ permalink raw reply related	[flat|nested] 101+ messages in thread

* [PATCH tip/core/rcu 44/55] rcu: wire up RCU_BOOST_PRIO for rcutree
  2011-09-06 18:00 [PATCH tip/core/rcu 0/55] Preview of RCU changes for 3.2 Paul E. McKenney
                   ` (42 preceding siblings ...)
  2011-09-06 18:00 ` [PATCH tip/core/rcu 43/55] rcu: Make rcu_torture_boost() " Paul E. McKenney
@ 2011-09-06 18:00 ` Paul E. McKenney
  2011-09-13 12:02   ` Mike Galbraith
  2011-10-17  1:55   ` Josh Triplett
  2011-09-06 18:00 ` [PATCH tip/core/rcu 45/55] rcu: check for entering dyntick-idle mode while in read-side critical section Paul E. McKenney
                   ` (12 subsequent siblings)
  56 siblings, 2 replies; 101+ messages in thread
From: Paul E. McKenney @ 2011-09-06 18:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, patches, Paul E. McKenney, Mike Galbraith

RCU boost threads start life at RCU_BOOST_PRIO, while others remain
at RCU_KTHREAD_PRIO.  While here, change thread names to match other
kthreads, and adjust rcu_yield() to not override the priority set by
the user.  This last change sets the stage for runtime changes to
priority in the -rt tree.
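
A sketch of the resulting policy, assuming the Kconfig default of
CONFIG_RCU_BOOST_PRIO=1 (t and boost_task are placeholder task
pointers):

	struct sched_param sp;

	/* Per-CPU and per-node kthreads stay at the fixed priority. */
	sp.sched_priority = RCU_KTHREAD_PRIO;		/* always 1 */
	sched_setscheduler_nocheck(t, SCHED_FIFO, &sp);

	/* Boost kthreads instead start at the user-configurable priority. */
	sp.sched_priority = RCU_BOOST_PRIO;	/* CONFIG_RCU_BOOST_PRIO */
	sched_setscheduler_nocheck(boost_task, SCHED_FIFO, &sp);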

Signed-off-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcutree.c        |    2 --
 kernel/rcutree_plugin.h |   20 +++++++++++++++-----
 2 files changed, 15 insertions(+), 7 deletions(-)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index c95fa89..8455043 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -131,8 +131,6 @@ static void rcu_node_kthread_setaffinity(struct rcu_node *rnp, int outgoingcpu);
 static void invoke_rcu_core(void);
 static void invoke_rcu_callbacks(struct rcu_state *rsp, struct rcu_data *rdp);
 
-#define RCU_KTHREAD_PRIO 1	/* RT priority for per-CPU kthreads. */
-
 /*
  * Track the rcutorture test sequence number and the update version
  * number within a given test.  The rcutorture_testseq is incremented
diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
index f6c63ea..f751a2d 100644
--- a/kernel/rcutree_plugin.h
+++ b/kernel/rcutree_plugin.h
@@ -27,6 +27,14 @@
 #include <linux/delay.h>
 #include <linux/stop_machine.h>
 
+#define RCU_KTHREAD_PRIO 1
+
+#ifdef CONFIG_RCU_BOOST
+#define RCU_BOOST_PRIO CONFIG_RCU_BOOST_PRIO
+#else
+#define RCU_BOOST_PRIO RCU_KTHREAD_PRIO
+#endif
+
 /*
  * Check the RCU kernel configuration parameters and print informative
  * messages about anything out of the ordinary.  If you like #ifdef, you
@@ -1365,13 +1373,13 @@ static int __cpuinit rcu_spawn_one_boost_kthread(struct rcu_state *rsp,
 	if (rnp->boost_kthread_task != NULL)
 		return 0;
 	t = kthread_create(rcu_boost_kthread, (void *)rnp,
-			   "rcub%d", rnp_index);
+			   "rcub/%d", rnp_index);
 	if (IS_ERR(t))
 		return PTR_ERR(t);
 	raw_spin_lock_irqsave(&rnp->lock, flags);
 	rnp->boost_kthread_task = t;
 	raw_spin_unlock_irqrestore(&rnp->lock, flags);
-	sp.sched_priority = RCU_KTHREAD_PRIO;
+	sp.sched_priority = RCU_BOOST_PRIO;
 	sched_setscheduler_nocheck(t, SCHED_FIFO, &sp);
 	wake_up_process(t); /* get to TASK_INTERRUPTIBLE quickly. */
 	return 0;
@@ -1466,6 +1474,7 @@ static void rcu_yield(void (*f)(unsigned long), unsigned long arg)
 {
 	struct sched_param sp;
 	struct timer_list yield_timer;
+	int prio = current->normal_prio;
 
 	setup_timer_on_stack(&yield_timer, f, arg);
 	mod_timer(&yield_timer, jiffies + 2);
@@ -1473,7 +1482,8 @@ static void rcu_yield(void (*f)(unsigned long), unsigned long arg)
 	sched_setscheduler_nocheck(current, SCHED_NORMAL, &sp);
 	set_user_nice(current, 19);
 	schedule();
-	sp.sched_priority = RCU_KTHREAD_PRIO;
+	set_user_nice(current, 0);
+	sp.sched_priority = prio;
 	sched_setscheduler_nocheck(current, SCHED_FIFO, &sp);
 	del_timer(&yield_timer);
 }
@@ -1592,7 +1602,7 @@ static int __cpuinit rcu_spawn_one_cpu_kthread(int cpu)
 	t = kthread_create_on_node(rcu_cpu_kthread,
 				   (void *)(long)cpu,
 				   cpu_to_node(cpu),
-				   "rcuc%d", cpu);
+				   "rcuc/%d", cpu);
 	if (IS_ERR(t))
 		return PTR_ERR(t);
 	if (cpu_online(cpu))
@@ -1701,7 +1711,7 @@ static int __cpuinit rcu_spawn_one_node_kthread(struct rcu_state *rsp,
 		return 0;
 	if (rnp->node_kthread_task == NULL) {
 		t = kthread_create(rcu_node_kthread, (void *)rnp,
-				   "rcun%d", rnp_index);
+				   "rcun/%d", rnp_index);
 		if (IS_ERR(t))
 			return PTR_ERR(t);
 		raw_spin_lock_irqsave(&rnp->lock, flags);
-- 
1.7.3.2


^ permalink raw reply related	[flat|nested] 101+ messages in thread

* [PATCH tip/core/rcu 45/55] rcu: check for entering dyntick-idle mode while in read-side critical section
  2011-09-06 18:00 [PATCH tip/core/rcu 0/55] Preview of RCU changes for 3.2 Paul E. McKenney
                   ` (43 preceding siblings ...)
  2011-09-06 18:00 ` [PATCH tip/core/rcu 44/55] rcu: wire up RCU_BOOST_PRIO for rcutree Paul E. McKenney
@ 2011-09-06 18:00 ` Paul E. McKenney
  2011-09-06 18:00 ` [PATCH tip/core/rcu 46/55] rcu: Remove rcu_needs_cpu_flush() to avoid false quiescent states Paul E. McKenney
                   ` (11 subsequent siblings)
  56 siblings, 0 replies; 101+ messages in thread
From: Paul E. McKenney @ 2011-09-06 18:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, patches, Paul E. McKenney

CONFIG_RCU_FAST_NO_HZ assumes that all calls to rcu_needs_cpu() are
outside of all RCU read-side critical sections.  This patch adds diagnostic
checks to verify this assumption.
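
For example, with CONFIG_DEBUG_LOCK_ALLOC=y the new checks would fire
on a hypothetical caller like this one:

	rcu_read_lock();
	/* ... */
	rcu_needs_cpu(cpu);	/* WARN_ON_ONCE() fires: rcu_lock_map is held. */
	/* ... */
	rcu_read_unlock();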

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 include/linux/rcupdate.h |    3 +++
 kernel/rcutree.c         |    3 +++
 2 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index e86bc28..8d7efc8 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -233,6 +233,8 @@ static inline void destroy_rcu_head_on_stack(struct rcu_head *head)
 
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
 
+#define PROVE_RCU(a) a
+
 extern struct lockdep_map rcu_lock_map;
 # define rcu_read_acquire() \
 		lock_acquire(&rcu_lock_map, 0, 0, 2, 1, NULL, _THIS_IP_)
@@ -311,6 +313,7 @@ static inline int rcu_read_lock_sched_held(void)
 
 #else /* #ifdef CONFIG_DEBUG_LOCK_ALLOC */
 
+# define PROVE_RCU(a)			do { } while (0)
 # define rcu_read_acquire()		do { } while (0)
 # define rcu_read_release()		do { } while (0)
 # define rcu_read_acquire_bh()		do { } while (0)
diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index 8455043..743a658 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -1786,6 +1786,9 @@ static int rcu_pending(int cpu)
  */
 static int rcu_needs_cpu_quick_check(int cpu)
 {
+	PROVE_RCU(WARN_ON_ONCE(lock_is_held(&rcu_lock_map)));
+	PROVE_RCU(WARN_ON_ONCE(lock_is_held(&rcu_bh_lock_map)));
+	PROVE_RCU(WARN_ON_ONCE(lock_is_held(&rcu_sched_lock_map)));
 	/* RCU callbacks either ready or pending? */
 	return per_cpu(rcu_sched_data, cpu).nxtlist ||
 	       per_cpu(rcu_bh_data, cpu).nxtlist ||
-- 
1.7.3.2


^ permalink raw reply related	[flat|nested] 101+ messages in thread

* [PATCH tip/core/rcu 46/55] rcu: Remove rcu_needs_cpu_flush() to avoid false quiescent states
  2011-09-06 18:00 [PATCH tip/core/rcu 0/55] Preview of RCU changes for 3.2 Paul E. McKenney
                   ` (44 preceding siblings ...)
  2011-09-06 18:00 ` [PATCH tip/core/rcu 45/55] rcu: check for entering dyntick-idle mode while in read-side critical section Paul E. McKenney
@ 2011-09-06 18:00 ` Paul E. McKenney
  2011-09-06 18:00 ` [PATCH tip/core/rcu 47/55] rcu: Move propagation of ->completed from rcu_start_gp() to rcu_report_qs_rsp() Paul E. McKenney
                   ` (10 subsequent siblings)
  56 siblings, 0 replies; 101+ messages in thread
From: Paul E. McKenney @ 2011-09-06 18:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, patches, Paul E. McKenney

The purpose of rcu_needs_cpu_flush() was to iterate on pushing the
current grace period in order to help the current CPU enter dyntick-idle
mode.  However, this can result in failures if the CPU starts entering
dyntick-idle mode, but then backs out.  In this case, the call to
rcu_pending() from rcu_needs_cpu_flush() might end up announcing a
nonexistent quiescent state.

This commit therefore removes rcu_needs_cpu_flush() in favor of letting
the dyntick-idle machinery at the end of the softirq handler push the
loop along via its call to rcu_pending().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcutree.c        |    3 ---
 kernel/rcutree.h        |    1 -
 kernel/rcutree_plugin.h |   25 -------------------------
 3 files changed, 0 insertions(+), 29 deletions(-)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index 743a658..ab37c19 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -1528,9 +1528,6 @@ static void rcu_process_callbacks(struct softirq_action *unused)
 				&__get_cpu_var(rcu_sched_data));
 	__rcu_process_callbacks(&rcu_bh_state, &__get_cpu_var(rcu_bh_data));
 	rcu_preempt_process_callbacks();
-
-	/* If we are last CPU on way to dyntick-idle mode, accelerate it. */
-	rcu_needs_cpu_flush();
 	trace_rcu_utilization("End RCU core");
 }
 
diff --git a/kernel/rcutree.h b/kernel/rcutree.h
index f509f72..849ce9e 100644
--- a/kernel/rcutree.h
+++ b/kernel/rcutree.h
@@ -458,7 +458,6 @@ static int rcu_preempt_needs_cpu(int cpu);
 static void __cpuinit rcu_preempt_init_percpu_data(int cpu);
 static void rcu_preempt_send_cbs_to_online(void);
 static void __init __rcu_init_preempt(void);
-static void rcu_needs_cpu_flush(void);
 static void rcu_initiate_boost(struct rcu_node *rnp, unsigned long flags);
 static void rcu_preempt_boost_start_gp(struct rcu_node *rnp);
 static void invoke_rcu_callbacks_kthread(void);
diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
index f751a2d..e4fb1ff 100644
--- a/kernel/rcutree_plugin.h
+++ b/kernel/rcutree_plugin.h
@@ -1949,15 +1949,6 @@ int rcu_needs_cpu(int cpu)
 	return rcu_needs_cpu_quick_check(cpu);
 }
 
-/*
- * Check to see if we need to continue a callback-flush operations to
- * allow the last CPU to enter dyntick-idle mode.  But fast dyntick-idle
- * entry is not configured, so we never do need to.
- */
-static void rcu_needs_cpu_flush(void)
-{
-}
-
 #else /* #if !defined(CONFIG_RCU_FAST_NO_HZ) */
 
 #define RCU_NEEDS_CPU_FLUSHES 5
@@ -2033,20 +2024,4 @@ int rcu_needs_cpu(int cpu)
 	return c;
 }
 
-/*
- * Check to see if we need to continue a callback-flush operations to
- * allow the last CPU to enter dyntick-idle mode.
- */
-static void rcu_needs_cpu_flush(void)
-{
-	int cpu = smp_processor_id();
-	unsigned long flags;
-
-	if (per_cpu(rcu_dyntick_drain, cpu) <= 0)
-		return;
-	local_irq_save(flags);
-	(void)rcu_needs_cpu(cpu);
-	local_irq_restore(flags);
-}
-
 #endif /* #else #if !defined(CONFIG_RCU_FAST_NO_HZ) */
-- 
1.7.3.2


^ permalink raw reply related	[flat|nested] 101+ messages in thread

* [PATCH tip/core/rcu 47/55] rcu: Move propagation of ->completed from rcu_start_gp() to rcu_report_qs_rsp()
  2011-09-06 18:00 [PATCH tip/core/rcu 0/55] Preview of RCU changes for 3.2 Paul E. McKenney
                   ` (45 preceding siblings ...)
  2011-09-06 18:00 ` [PATCH tip/core/rcu 46/55] rcu: Remove rcu_needs_cpu_flush() to avoid false quiescent states Paul E. McKenney
@ 2011-09-06 18:00 ` Paul E. McKenney
  2011-09-06 18:00 ` [PATCH tip/core/rcu 48/55] powerpc: strengthen value-returning-atomics memory barriers Paul E. McKenney
                   ` (9 subsequent siblings)
  56 siblings, 0 replies; 101+ messages in thread
From: Paul E. McKenney @ 2011-09-06 18:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, patches, Paul E. McKenney

It is possible for the CPU that noted the end of the prior grace period
to not need a new one, and therefore to decide to propagate ->completed
throughout the rcu_node tree without starting another grace period.
However, in so doing, it releases the root rcu_node structure's lock,
which can allow some other CPU to start another grace period.  The first
CPU will be propagating ->completed in parallel with the second CPU
initializing the rcu_node tree for the new grace period.  In theory
this is harmless, but in practice we need to keep things simple.

This commit therefore moves the propagation of ->completed to
rcu_report_qs_rsp(), and refrains from marking the old grace period
as having been completed until it has finished doing this.  This
prevents anyone from starting a new grace period concurrently with
marking the old grace period as having been completed.

Of course, the optimization where a CPU needing a new grace period
doesn't bother marking the old one completed is still in effect:
In that case, the marking happens implicitly as part of initializing
the new grace period.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcutree.c |   71 ++++++++++++++++++++++++++++++++++++++---------------
 1 files changed, 51 insertions(+), 20 deletions(-)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index ab37c19..f0a9432 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -842,28 +842,24 @@ rcu_start_gp(struct rcu_state *rsp, unsigned long flags)
 	struct rcu_node *rnp = rcu_get_root(rsp);
 
 	if (!rcu_scheduler_fully_active ||
-	    !cpu_needs_another_gp(rsp, rdp) ||
-	    rsp->fqs_active) {
-		if (rcu_scheduler_fully_active &&
-		    cpu_needs_another_gp(rsp, rdp))
-			rsp->fqs_need_gp = 1;
-		if (rnp->completed == rsp->completed) {
-			raw_spin_unlock_irqrestore(&rnp->lock, flags);
-			return;
-		}
-		raw_spin_unlock(&rnp->lock);	 /* irqs remain disabled. */
+	    !cpu_needs_another_gp(rsp, rdp)) {
+		/*
+		 * Either the scheduler hasn't yet spawned the first
+		 * non-idle task or this CPU does not need another
+		 * grace period.  Either way, don't start a new grace
+		 * period.
+		 */
+		raw_spin_unlock_irqrestore(&rnp->lock, flags);
+		return;
+	}
 
+	if (rsp->fqs_active) {
 		/*
-		 * Propagate new ->completed value to rcu_node structures
-		 * so that other CPUs don't have to wait until the start
-		 * of the next grace period to process their callbacks.
+		 * This CPU needs a grace period, but force_quiescent_state()
+		 * is running.  Tell it to start one on this CPU's behalf.
 		 */
-		rcu_for_each_node_breadth_first(rsp, rnp) {
-			raw_spin_lock(&rnp->lock); /* irqs already disabled. */
-			rnp->completed = rsp->completed;
-			raw_spin_unlock(&rnp->lock); /* irqs remain disabled. */
-		}
-		local_irq_restore(flags);
+		rsp->fqs_need_gp = 1;
+		raw_spin_unlock_irqrestore(&rnp->lock, flags);
 		return;
 	}
 
@@ -947,6 +943,8 @@ static void rcu_report_qs_rsp(struct rcu_state *rsp, unsigned long flags)
 	__releases(rcu_get_root(rsp)->lock)
 {
 	unsigned long gp_duration;
+	struct rcu_node *rnp = rcu_get_root(rsp);
+	struct rcu_data *rdp = this_cpu_ptr(rsp->rda);
 
 	WARN_ON_ONCE(!rcu_gp_in_progress(rsp));
 
@@ -958,7 +956,40 @@ static void rcu_report_qs_rsp(struct rcu_state *rsp, unsigned long flags)
 	gp_duration = jiffies - rsp->gp_start;
 	if (gp_duration > rsp->gp_max)
 		rsp->gp_max = gp_duration;
-	rsp->completed = rsp->gpnum;
+
+	/*
+	 * We know the grace period is complete, but to everyone else
+	 * it appears to still be ongoing.  But it is also the case
+	 * that to everyone else it looks like there is nothing that
+	 * they can do to advance the grace period.  It is therefore
+	 * safe for us to drop the lock in order to mark the grace
+	 * period as completed in all of the rcu_node structures.
+	 *
+	 * But if this CPU needs another grace period, it will take
+	 * care of this while initializing the next grace period.
+	 * We use RCU_WAIT_TAIL instead of the usual RCU_DONE_TAIL
+	 * because the callbacks have not yet been advanced: Those
+	 * callbacks are waiting on the grace period that just now
+	 * completed.
+	 */
+	if (*rdp->nxttail[RCU_WAIT_TAIL] == NULL) {
+		raw_spin_unlock(&rnp->lock);	 /* irqs remain disabled. */
+
+		/*
+		 * Propagate new ->completed value to rcu_node structures
+		 * so that other CPUs don't have to wait until the start
+		 * of the next grace period to process their callbacks.
+		 */
+		rcu_for_each_node_breadth_first(rsp, rnp) {
+			raw_spin_lock(&rnp->lock); /* irqs already disabled. */
+			rnp->completed = rsp->gpnum;
+			raw_spin_unlock(&rnp->lock); /* irqs remain disabled. */
+		}
+		rnp = rcu_get_root(rsp);
+		raw_spin_lock(&rnp->lock); /* irqs already disabled. */
+	}
+
+	rsp->completed = rsp->gpnum;  /* Declare the grace period complete. */
 	trace_rcu_grace_period(rsp->name, rsp->completed, "end");
 	rsp->signaled = RCU_GP_IDLE;
 	rcu_start_gp(rsp, flags);  /* releases root node's rnp->lock. */
-- 
1.7.3.2


^ permalink raw reply related	[flat|nested] 101+ messages in thread

* [PATCH tip/core/rcu 48/55] powerpc: strengthen value-returning-atomics memory barriers
  2011-09-06 18:00 [PATCH tip/core/rcu 0/55] Preview of RCU changes for 3.2 Paul E. McKenney
                   ` (46 preceding siblings ...)
  2011-09-06 18:00 ` [PATCH tip/core/rcu 47/55] rcu: Move propagation of ->completed from rcu_start_gp() to rcu_report_qs_rsp() Paul E. McKenney
@ 2011-09-06 18:00 ` Paul E. McKenney
  2011-09-09 17:23     ` Olof Johansson
  2011-09-06 18:00 ` [PATCH tip/core/rcu 49/55] rcu: Detect illegal rcu dereference in extended quiescent state Paul E. McKenney
                   ` (8 subsequent siblings)
  56 siblings, 1 reply; 101+ messages in thread
From: Paul E. McKenney @ 2011-09-06 18:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, patches, Paul E. McKenney, anton, benh, paulus

The trailing isync/lwsync in PowerPC value-returning atomics needs
to be a sync in order to provide the required ordering properties.
The leading lwsync/eieio can remain, as the remainder of the required
ordering guarantees are provided by the atomic instructions: Any
reordering will cause the stwcx to fail, which will result in a retry.

This commit provides the needed adjustment.
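
For context, here is a simplified sketch of a PowerPC value-returning
atomic showing where the two barriers sit (errata workarounds omitted):

	static inline int atomic_add_return(int a, atomic_t *v)
	{
		int t;

		__asm__ __volatile__(
		PPC_RELEASE_BARRIER		/* lwsync: can remain */
	"1:	lwarx	%0,0,%2\n"		/* load-reserve v->counter */
	"	add	%0,%1,%0\n"
	"	stwcx.	%0,0,%2\n"		/* store-conditional */
	"	bne-	1b\n"			/* lost reservation: retry */
		PPC_ACQUIRE_BARRIER		/* now sync rather than isync */
		: "=&r" (t)
		: "r" (a), "r" (&v->counter)
		: "cc", "memory");

		return t;
	}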

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: anton@samba.org
Cc: benh@kernel.crashing.org
Cc: paulus@samba.org
---
 arch/powerpc/include/asm/synch.h |    6 +-----
 1 files changed, 1 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/include/asm/synch.h b/arch/powerpc/include/asm/synch.h
index d7cab44..4d97fbe 100644
--- a/arch/powerpc/include/asm/synch.h
+++ b/arch/powerpc/include/asm/synch.h
@@ -37,11 +37,7 @@ static inline void isync(void)
 #endif
 
 #ifdef CONFIG_SMP
-#define __PPC_ACQUIRE_BARRIER				\
-	START_LWSYNC_SECTION(97);			\
-	isync;						\
-	MAKE_LWSYNC_SECTION_ENTRY(97, __lwsync_fixup);
-#define PPC_ACQUIRE_BARRIER	"\n" stringify_in_c(__PPC_ACQUIRE_BARRIER)
+#define PPC_ACQUIRE_BARRIER	"\n" stringify_in_c(sync;)
 #define PPC_RELEASE_BARRIER	stringify_in_c(LWSYNC) "\n"
 #else
 #define PPC_ACQUIRE_BARRIER
-- 
1.7.3.2


^ permalink raw reply related	[flat|nested] 101+ messages in thread

* [PATCH tip/core/rcu 49/55] rcu: Detect illegal rcu dereference in extended quiescent state
  2011-09-06 18:00 [PATCH tip/core/rcu 0/55] Preview of RCU changes for 3.2 Paul E. McKenney
                   ` (47 preceding siblings ...)
  2011-09-06 18:00 ` [PATCH tip/core/rcu 48/55] powerpc: strengthen value-returning-atomics memory barriers Paul E. McKenney
@ 2011-09-06 18:00 ` Paul E. McKenney
  2011-09-06 18:00 ` [PATCH tip/core/rcu 50/55] rcu: Inform the user about dynticks-idle mode on PROVE_RCU warning Paul E. McKenney
                   ` (7 subsequent siblings)
  56 siblings, 0 replies; 101+ messages in thread
From: Paul E. McKenney @ 2011-09-06 18:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, patches, Frederic Weisbecker, Paul E. McKenney,
	Peter Zijlstra

From: Frederic Weisbecker <fweisbec@gmail.com>

Report that none of the rcu read lock maps are held while in an RCU
extended quiescent state (in this case, the RCU extended quiescent state
is dyntick-idle mode). This helps detect any use of rcu_dereference()
and friends from within dyntick-idle mode.
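
A hypothetical example of the sort of bug this catches, in code that
runs after the CPU has told RCU it is idle (gp, p, and
do_something_with() are placeholders):

	/* Called from the idle loop after rcu_enter_nohz(). */
	rcu_read_lock();		/* RCU is ignoring this CPU... */
	p = rcu_dereference(gp);	/* ...so PROVE_RCU now flags this, */
	do_something_with(p);		/* and *p may already be freed. */
	rcu_read_unlock();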

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 include/linux/rcupdate.h |   36 ++++++++++++++++++++++++++++++++++++
 kernel/rcupdate.c        |   17 ++++++++++++++++-
 kernel/rcutiny.c         |   14 ++++++++++++++
 kernel/rcutree.c         |   16 ++++++++++++++++
 4 files changed, 82 insertions(+), 1 deletions(-)

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 8d7efc8..7d8fa7c 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -231,6 +231,14 @@ static inline void destroy_rcu_head_on_stack(struct rcu_head *head)
 }
 #endif	/* #else !CONFIG_DEBUG_OBJECTS_RCU_HEAD */
 
+
+#if defined(CONFIG_PROVE_RCU) && defined(CONFIG_NO_HZ)
+extern bool rcu_check_extended_qs(void);
+#else
+static inline bool rcu_check_extended_qs(void) { return false; }
+#endif
+
+
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
 
 #define PROVE_RCU(a) a
@@ -264,11 +272,25 @@ extern int debug_lockdep_rcu_enabled(void);
  *
  * Checks debug_lockdep_rcu_enabled() to prevent false positives during boot
  * and while lockdep is disabled.
+ *
+ * Note that if the CPU is in an extended quiescent state, for example,
+ * if the CPU is in dyntick-idle mode, then rcu_read_lock_held() returns
+ * false even if the CPU did an rcu_read_lock().  The reason for this is
+ * that RCU ignores CPUs that are in extended quiescent states, so such
+ * a CPU is effectively never in an RCU read-side critical section
+ * regardless of what RCU primitives it invokes.  This state of affairs
+ * is required -- RCU would otherwise need to periodically wake up
+ * dyntick-idle CPUs, which would defeat the whole purpose of dyntick-idle
+ * mode.
  */
 static inline int rcu_read_lock_held(void)
 {
 	if (!debug_lockdep_rcu_enabled())
 		return 1;
+
+	if (rcu_check_extended_qs())
+		return 0;
+
 	return lock_is_held(&rcu_lock_map);
 }
 
@@ -292,6 +314,16 @@ extern int rcu_read_lock_bh_held(void);
  *
  * Check debug_lockdep_rcu_enabled() to prevent false positives during boot
  * and while lockdep is disabled.
+ *
+ * Note that if the CPU is in an extended quiescent state, for example,
+ * if the CPU is in dyntick-idle mode, then rcu_read_lock_held() returns
+ * false even if the CPU did an rcu_read_lock().  The reason for this is
+ * that RCU ignores CPUs that are in extended quiescent states, so such
+ * a CPU is effectively never in an RCU read-side critical section
+ * regardless of what RCU primitives it invokes.  This state of affairs
+ * is required -- RCU would otherwise need to periodically wake up
+ * dyntick-idle CPUs, which would defeat the whole purpose of dyntick-idle
+ * mode.
  */
 #ifdef CONFIG_PREEMPT
 static inline int rcu_read_lock_sched_held(void)
@@ -300,6 +332,10 @@ static inline int rcu_read_lock_sched_held(void)
 
 	if (!debug_lockdep_rcu_enabled())
 		return 1;
+
+	if (rcu_check_extended_qs())
+		return 0;
+
 	if (debug_locks)
 		lockdep_opinion = lock_is_held(&rcu_sched_lock_map);
 	return lockdep_opinion || preempt_count() != 0 || irqs_disabled();
diff --git a/kernel/rcupdate.c b/kernel/rcupdate.c
index 5031caf..e4d8a98 100644
--- a/kernel/rcupdate.c
+++ b/kernel/rcupdate.c
@@ -87,12 +87,27 @@ EXPORT_SYMBOL_GPL(debug_lockdep_rcu_enabled);
  * that require that they be called within an RCU read-side critical
  * section.
  *
- * Check debug_lockdep_rcu_enabled() to prevent false positives during boot.
+ * Check debug_lockdep_rcu_enabled() to prevent false positives during boot
+ * and while lockdep is disabled.
+ *
+ * Note that if the CPU is in an extended quiescent state, for example,
+ * if the CPU is in dyntick-idle mode, then rcu_read_lock_held() returns
+ * false even if the CPU did an rcu_read_lock().  The reason for this is
+ * that RCU ignores CPUs that are in extended quiescent states, so such
+ * a CPU is effectively never in an RCU read-side critical section
+ * regardless of what RCU primitives it invokes.  This state of affairs
+ * is required -- RCU would otherwise need to periodically wake up
+ * dyntick-idle CPUs, which would defeat the whole purpose of dyntick-idle
+ * mode.
  */
 int rcu_read_lock_bh_held(void)
 {
 	if (!debug_lockdep_rcu_enabled())
 		return 1;
+
+	if (rcu_check_extended_qs())
+		return 0;
+
 	return in_softirq() || irqs_disabled();
 }
 EXPORT_SYMBOL_GPL(rcu_read_lock_bh_held);
diff --git a/kernel/rcutiny.c b/kernel/rcutiny.c
index da775c8..9e493b9 100644
--- a/kernel/rcutiny.c
+++ b/kernel/rcutiny.c
@@ -78,6 +78,20 @@ void rcu_exit_nohz(void)
 	rcu_dynticks_nesting++;
 }
 
+
+#ifdef CONFIG_PROVE_RCU
+
+bool rcu_check_extended_qs(void)
+{
+	if (!rcu_dynticks_nesting)
+		return true;
+
+	return false;
+}
+EXPORT_SYMBOL_GPL(rcu_check_extended_qs);
+
+#endif
+
 #endif /* #ifdef CONFIG_NO_HZ */
 
 /*
diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index f0a9432..c9b4adf 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -465,6 +465,22 @@ void rcu_irq_exit(void)
 	rcu_enter_nohz();
 }
 
+#ifdef CONFIG_PROVE_RCU
+
+bool rcu_check_extended_qs(void)
+{
+	struct rcu_dynticks *rdtp;
+
+	rdtp = &per_cpu(rcu_dynticks, raw_smp_processor_id());
+	if (atomic_read(&rdtp->dynticks) & 0x1)
+		return false;
+
+	return true;
+}
+EXPORT_SYMBOL_GPL(rcu_check_extended_qs);
+
+#endif /* CONFIG_PROVE_RCU */
+
 #ifdef CONFIG_SMP
 
 /*
-- 
1.7.3.2


^ permalink raw reply related	[flat|nested] 101+ messages in thread

* [PATCH tip/core/rcu 50/55] rcu: Inform the user about dynticks-idle mode on PROVE_RCU warning
  2011-09-06 18:00 [PATCH tip/core/rcu 0/55] Preview of RCU changes for 3.2 Paul E. McKenney
                   ` (48 preceding siblings ...)
  2011-09-06 18:00 ` [PATCH tip/core/rcu 49/55] rcu: Detect illegal rcu dereference in extended quiescent state Paul E. McKenney
@ 2011-09-06 18:00 ` Paul E. McKenney
  2011-09-06 18:00 ` [PATCH tip/core/rcu 51/55] rcu: Warn when rcu_read_lock() is used in extended quiescent state Paul E. McKenney
                   ` (6 subsequent siblings)
  56 siblings, 0 replies; 101+ messages in thread
From: Paul E. McKenney @ 2011-09-06 18:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, patches, Frederic Weisbecker, Paul E. McKenney,
	Peter Zijlstra

From: Frederic Weisbecker <fweisbec@gmail.com>

Inform the user if an RCU usage error is detected by lockdep while in
an extended quiescent state (in this case, dyntick-idle mode).  This
is accomplished by adding a line to the RCU lockdep splat indicating
whether or not the splat occurred in dyntick-idle mode.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/lockdep.c |   20 ++++++++++++++++++++
 1 files changed, 20 insertions(+), 0 deletions(-)

diff --git a/kernel/lockdep.c b/kernel/lockdep.c
index df2ad37..ef5dd69 100644
--- a/kernel/lockdep.c
+++ b/kernel/lockdep.c
@@ -4008,6 +4008,26 @@ void lockdep_rcu_suspicious(const char *file, const int line, const char *s)
 	printk("%s:%d %s!\n", file, line, s);
 	printk("\nother info that might help us debug this:\n\n");
 	printk("\nrcu_scheduler_active = %d, debug_locks = %d\n", rcu_scheduler_active, debug_locks);
+
+	/*
+	 * If a CPU is in dyntick-idle mode (CONFIG_NO_HZ), then RCU
+	 * considers that CPU to be in an "extended quiescent state",
+	 * which means that RCU will be completely ignoring that CPU.
+	 * Therefore, rcu_read_lock() and friends have absolutely no
+	 * effect on a dyntick-idle CPU.  In other words, even if a
+	 * dyntick-idle CPU has called rcu_read_lock(), RCU might well
+	 * delete data structures out from under it.  RCU really has no
+	 * choice here: if it were to consult the CPU, that would wake
+	 * the CPU up, and the whole point of dyntick-idle mode is to
+	 * allow CPUs to enter extremely deep sleep states.
+	 *
+	 * So complain bitterly if someone does call rcu_read_lock(),
+	 * rcu_read_lock_bh() and so on from extended quiescent states
+	 * such as dyntick-idle mode.
+	 */
+	if (rcu_check_extended_qs())
+		printk("RCU used illegally from extended quiescent state!\n");
+
 	lockdep_print_held_locks(curr);
 	printk("\nstack backtrace:\n");
 	dump_stack();
-- 
1.7.3.2


^ permalink raw reply related	[flat|nested] 101+ messages in thread

* [PATCH tip/core/rcu 51/55] rcu: Warn when rcu_read_lock() is used in extended quiescent state
  2011-09-06 18:00 [PATCH tip/core/rcu 0/55] Preview of RCU changes for 3.2 Paul E. McKenney
                   ` (49 preceding siblings ...)
  2011-09-06 18:00 ` [PATCH tip/core/rcu 50/55] rcu: Inform the user about dynticks-idle mode on PROVE_RCU warning Paul E. McKenney
@ 2011-09-06 18:00 ` Paul E. McKenney
  2011-09-06 18:00 ` [PATCH tip/core/rcu 52/55] rcu: Remove one layer of abstraction from PROVE_RCU checking Paul E. McKenney
                   ` (5 subsequent siblings)
  56 siblings, 0 replies; 101+ messages in thread
From: Paul E. McKenney @ 2011-09-06 18:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, patches, Frederic Weisbecker, Paul E. McKenney,
	Peter Zijlstra

From: Frederic Weisbecker <fweisbec@gmail.com>

We are currently able to detect uses of rcu_dereference_check() inside
extended quiescent states (such as dyntick-idle mode). But rcu_read_lock()
and friends can be used without rcu_dereference(), so that the earlier
commit checking for use of rcu_dereference() and friends while in
dyntick-idle mode misses some error conditions.  This commit therefore adds
dyntick-idle checking to rcu_read_lock() and friends.
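
A hypothetical example of a case that previously escaped detection,
namely a read-side critical section containing no rcu_dereference() at
all, entered from dyntick-idle mode (poll_some_flag() is a placeholder):

	rcu_read_lock();	/* Now WARN_ON_ONCE()s in dyntick-idle mode. */
	poll_some_flag();	/* No rcu_dereference(), so the earlier */
	rcu_read_unlock();	/* checks had nothing to complain about. */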

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 include/linux/rcupdate.h |   52 +++++++++++++++++++++++++++++++++++++--------
 1 files changed, 42 insertions(+), 10 deletions(-)

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 7d8fa7c..a9bc36d 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -243,21 +243,53 @@ static inline bool rcu_check_extended_qs(void) { return false; }
 
 #define PROVE_RCU(a) a
 
+static inline void rcu_lock_acquire(struct lockdep_map *map)
+{
+	WARN_ON_ONCE(rcu_check_extended_qs());
+	lock_acquire(map, 0, 0, 2, 1, NULL, _THIS_IP_);
+}
+
+static inline void rcu_lock_release(struct lockdep_map *map)
+{
+	WARN_ON_ONCE(rcu_check_extended_qs());
+	lock_release(map, 1, _THIS_IP_);
+}
+
 extern struct lockdep_map rcu_lock_map;
-# define rcu_read_acquire() \
-		lock_acquire(&rcu_lock_map, 0, 0, 2, 1, NULL, _THIS_IP_)
-# define rcu_read_release()	lock_release(&rcu_lock_map, 1, _THIS_IP_)
+
+static inline void rcu_read_acquire(void)
+{
+	rcu_lock_acquire(&rcu_lock_map);
+}
+
+static inline void rcu_read_release(void)
+{
+	rcu_lock_release(&rcu_lock_map);
+}
 
 extern struct lockdep_map rcu_bh_lock_map;
-# define rcu_read_acquire_bh() \
-		lock_acquire(&rcu_bh_lock_map, 0, 0, 2, 1, NULL, _THIS_IP_)
-# define rcu_read_release_bh()	lock_release(&rcu_bh_lock_map, 1, _THIS_IP_)
+
+static inline void rcu_read_acquire_bh(void)
+{
+	rcu_lock_acquire(&rcu_bh_lock_map);
+}
+
+static inline void rcu_read_release_bh(void)
+{
+	rcu_lock_release(&rcu_bh_lock_map);
+}
 
 extern struct lockdep_map rcu_sched_lock_map;
-# define rcu_read_acquire_sched() \
-		lock_acquire(&rcu_sched_lock_map, 0, 0, 2, 1, NULL, _THIS_IP_)
-# define rcu_read_release_sched() \
-		lock_release(&rcu_sched_lock_map, 1, _THIS_IP_)
+
+static inline void rcu_read_acquire_sched(void)
+{
+	rcu_lock_acquire(&rcu_sched_lock_map);
+}
+
+static inline void rcu_read_release_sched(void)
+{
+	rcu_lock_release(&rcu_sched_lock_map);
+}
 
 extern int debug_lockdep_rcu_enabled(void);
 
-- 
1.7.3.2


^ permalink raw reply related	[flat|nested] 101+ messages in thread

* [PATCH tip/core/rcu 52/55] rcu: Remove one layer of abstraction from PROVE_RCU checking
  2011-09-06 18:00 [PATCH tip/core/rcu 0/55] Preview of RCU changes for 3.2 Paul E. McKenney
                   ` (50 preceding siblings ...)
  2011-09-06 18:00 ` [PATCH tip/core/rcu 51/55] rcu: Warn when rcu_read_lock() is used in extended quiescent state Paul E. McKenney
@ 2011-09-06 18:00 ` Paul E. McKenney
  2011-09-06 18:00 ` [PATCH tip/core/rcu 53/55] rcu: Warn when srcu_read_lock() is used in an extended quiescent state Paul E. McKenney
                   ` (4 subsequent siblings)
  56 siblings, 0 replies; 101+ messages in thread
From: Paul E. McKenney @ 2011-09-06 18:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, patches, Paul E. McKenney

Simplify things a bit by substituting the definitions of the single-line
rcu_read_acquire(), rcu_read_release(), rcu_read_acquire_bh(),
rcu_read_release_bh(), rcu_read_acquire_sched(), and
rcu_read_release_sched() functions at their call points.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 include/linux/rcupdate.h |   53 +++++++---------------------------------------
 1 files changed, 8 insertions(+), 45 deletions(-)

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index a9bc36d..9d40e42 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -256,41 +256,8 @@ static inline void rcu_lock_release(struct lockdep_map *map)
 }
 
 extern struct lockdep_map rcu_lock_map;
-
-static inline void rcu_read_acquire(void)
-{
-	rcu_lock_acquire(&rcu_lock_map);
-}
-
-static inline void rcu_read_release(void)
-{
-	rcu_lock_release(&rcu_lock_map);
-}
-
 extern struct lockdep_map rcu_bh_lock_map;
-
-static inline void rcu_read_acquire_bh(void)
-{
-	rcu_lock_acquire(&rcu_bh_lock_map);
-}
-
-static inline void rcu_read_release_bh(void)
-{
-	rcu_lock_release(&rcu_bh_lock_map);
-}
-
 extern struct lockdep_map rcu_sched_lock_map;
-
-static inline void rcu_read_acquire_sched(void)
-{
-	rcu_lock_acquire(&rcu_sched_lock_map);
-}
-
-static inline void rcu_read_release_sched(void)
-{
-	rcu_lock_release(&rcu_sched_lock_map);
-}
-
 extern int debug_lockdep_rcu_enabled(void);
 
 /**
@@ -382,12 +349,8 @@ static inline int rcu_read_lock_sched_held(void)
 #else /* #ifdef CONFIG_DEBUG_LOCK_ALLOC */
 
 # define PROVE_RCU(a)			do { } while (0)
-# define rcu_read_acquire()		do { } while (0)
-# define rcu_read_release()		do { } while (0)
-# define rcu_read_acquire_bh()		do { } while (0)
-# define rcu_read_release_bh()		do { } while (0)
-# define rcu_read_acquire_sched()	do { } while (0)
-# define rcu_read_release_sched()	do { } while (0)
+# define rcu_lock_acquire(a)		do { } while (0)
+# define rcu_lock_release(a)		do { } while (0)
 
 static inline int rcu_read_lock_held(void)
 {
@@ -708,7 +671,7 @@ static inline void rcu_read_lock(void)
 {
 	__rcu_read_lock();
 	__acquire(RCU);
-	rcu_read_acquire();
+	rcu_lock_acquire(&rcu_lock_map);
 }
 
 /*
@@ -728,7 +691,7 @@ static inline void rcu_read_lock(void)
  */
 static inline void rcu_read_unlock(void)
 {
-	rcu_read_release();
+	rcu_lock_release(&rcu_lock_map);
 	__release(RCU);
 	__rcu_read_unlock();
 }
@@ -749,7 +712,7 @@ static inline void rcu_read_lock_bh(void)
 {
 	local_bh_disable();
 	__acquire(RCU_BH);
-	rcu_read_acquire_bh();
+	rcu_lock_acquire(&rcu_bh_lock_map);
 }
 
 /*
@@ -759,7 +722,7 @@ static inline void rcu_read_lock_bh(void)
  */
 static inline void rcu_read_unlock_bh(void)
 {
-	rcu_read_release_bh();
+	rcu_lock_release(&rcu_bh_lock_map);
 	__release(RCU_BH);
 	local_bh_enable();
 }
@@ -776,7 +739,7 @@ static inline void rcu_read_lock_sched(void)
 {
 	preempt_disable();
 	__acquire(RCU_SCHED);
-	rcu_read_acquire_sched();
+	rcu_lock_acquire(&rcu_sched_lock_map);
 }
 
 /* Used by lockdep and tracing: cannot be traced, cannot call lockdep. */
@@ -793,7 +756,7 @@ static inline notrace void rcu_read_lock_sched_notrace(void)
  */
 static inline void rcu_read_unlock_sched(void)
 {
-	rcu_read_release_sched();
+	rcu_lock_release(&rcu_sched_lock_map);
 	__release(RCU_SCHED);
 	preempt_enable();
 }
-- 
1.7.3.2


^ permalink raw reply related	[flat|nested] 101+ messages in thread

* [PATCH tip/core/rcu 53/55] rcu: Warn when srcu_read_lock() is used in an extended quiescent state
  2011-09-06 18:00 [PATCH tip/core/rcu 0/55] Preview of RCU changes for 3.2 Paul E. McKenney
                   ` (51 preceding siblings ...)
  2011-09-06 18:00 ` [PATCH tip/core/rcu 52/55] rcu: Remove one layer of abstraction from PROVE_RCU checking Paul E. McKenney
@ 2011-09-06 18:00 ` Paul E. McKenney
  2011-10-04 21:03   ` Frederic Weisbecker
  2011-09-06 18:00 ` [PATCH tip/core/rcu 54/55] rcu: Make srcu_read_lock_held() call common lockdep-enabled function Paul E. McKenney
                   ` (3 subsequent siblings)
  56 siblings, 1 reply; 101+ messages in thread
From: Paul E. McKenney @ 2011-09-06 18:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, patches, Paul E. McKenney

Catch SRCU up to the other variants of RCU by making PROVE_RCU
complain if either srcu_read_lock() or srcu_read_lock_held() is
used from within dyntick-idle mode.
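
So, for example, a hypothetical SRCU reader invoked from dyntick-idle
mode now draws a warning (my_srcu, gp, and p are placeholders):

	int idx;

	idx = srcu_read_lock(&my_srcu);	/* WARN_ON_ONCE() in dyntick-idle. */
	p = srcu_dereference(gp, &my_srcu); /* srcu_read_lock_held() now 0. */
	srcu_read_unlock(&my_srcu, idx);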

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 include/linux/srcu.h |   25 +++++++++++++++----------
 1 files changed, 15 insertions(+), 10 deletions(-)

diff --git a/include/linux/srcu.h b/include/linux/srcu.h
index 58971e8..fcbaee7 100644
--- a/include/linux/srcu.h
+++ b/include/linux/srcu.h
@@ -28,6 +28,7 @@
 #define _LINUX_SRCU_H
 
 #include <linux/mutex.h>
+#include <linux/rcupdate.h>
 
 struct srcu_struct_array {
 	int c[2];
@@ -60,18 +61,10 @@ int __init_srcu_struct(struct srcu_struct *sp, const char *name,
 	__init_srcu_struct((sp), #sp, &__srcu_key); \
 })
 
-# define srcu_read_acquire(sp) \
-		lock_acquire(&(sp)->dep_map, 0, 0, 2, 1, NULL, _THIS_IP_)
-# define srcu_read_release(sp) \
-		lock_release(&(sp)->dep_map, 1, _THIS_IP_)
-
 #else /* #ifdef CONFIG_DEBUG_LOCK_ALLOC */
 
 int init_srcu_struct(struct srcu_struct *sp);
 
-# define srcu_read_acquire(sp)  do { } while (0)
-# define srcu_read_release(sp)  do { } while (0)
-
 #endif /* #else #ifdef CONFIG_DEBUG_LOCK_ALLOC */
 
 void cleanup_srcu_struct(struct srcu_struct *sp);
@@ -90,11 +83,23 @@ long srcu_batches_completed(struct srcu_struct *sp);
  * read-side critical section.  In absence of CONFIG_DEBUG_LOCK_ALLOC,
  * this assumes we are in an SRCU read-side critical section unless it can
  * prove otherwise.
+ *
+ * Note that if the CPU is in an extended quiescent state, for example,
+ * if the CPU is in dyntick-idle mode, then rcu_read_lock_held() returns
+ * false even if the CPU did an rcu_read_lock().  The reason for this is
+ * that RCU ignores CPUs that are in extended quiescent states, so such
+ * a CPU is effectively never in an RCU read-side critical section
+ * regardless of what RCU primitives it invokes.  This state of affairs
+ * is required -- RCU would otherwise need to periodically wake up
+ * dyntick-idle CPUs, which would defeat the whole purpose of dyntick-idle
+ * mode.
  */
 static inline int srcu_read_lock_held(struct srcu_struct *sp)
 {
 	if (debug_locks)
 		return lock_is_held(&sp->dep_map);
+	if (rcu_check_extended_qs())
+		return 0;
 	return 1;
 }
 
@@ -150,7 +155,7 @@ static inline int srcu_read_lock(struct srcu_struct *sp) __acquires(sp)
 {
 	int retval = __srcu_read_lock(sp);
 
-	srcu_read_acquire(sp);
+	rcu_lock_acquire(&(sp)->dep_map);
 	return retval;
 }
 
@@ -164,7 +169,7 @@ static inline int srcu_read_lock(struct srcu_struct *sp) __acquires(sp)
 static inline void srcu_read_unlock(struct srcu_struct *sp, int idx)
 	__releases(sp)
 {
-	srcu_read_release(sp);
+	rcu_lock_release(&(sp)->dep_map);
 	__srcu_read_unlock(sp, idx);
 }
 
-- 
1.7.3.2


^ permalink raw reply related	[flat|nested] 101+ messages in thread

* [PATCH tip/core/rcu 54/55] rcu: Make srcu_read_lock_held() call common lockdep-enabled function
  2011-09-06 18:00 [PATCH tip/core/rcu 0/55] Preview of RCU changes for 3.2 Paul E. McKenney
                   ` (52 preceding siblings ...)
  2011-09-06 18:00 ` [PATCH tip/core/rcu 53/55] rcu: Warn when srcu_read_lock() is used in an extended quiescent state Paul E. McKenney
@ 2011-09-06 18:00 ` Paul E. McKenney
  2011-10-17  2:03   ` Josh Triplett
  2011-09-06 18:00 ` [PATCH tip/core/rcu 55/55] powerpc: Work around tracing from dyntick-idle mode Paul E. McKenney
                   ` (2 subsequent siblings)
  56 siblings, 1 reply; 101+ messages in thread
From: Paul E. McKenney @ 2011-09-06 18:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, patches, Paul E. McKenney

A common debug_lockdep_rcu_enabled() function is used to check whether
RCU lockdep splats should be reported, but srcu_read_lock_held() does not
use it.  This commit therefore brings srcu_read_lock_held() up to date.
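
For reference, debug_lockdep_rcu_enabled() looks roughly like the
following, which is what suppresses splats both during early boot and
after lockdep has shut itself off:

	int debug_lockdep_rcu_enabled(void)
	{
		return rcu_scheduler_active && debug_locks &&
		       current->lockdep_recursion == 0;
	}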

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 include/linux/srcu.h |    5 ++++-
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/include/linux/srcu.h b/include/linux/srcu.h
index fcbaee7..54a70b7 100644
--- a/include/linux/srcu.h
+++ b/include/linux/srcu.h
@@ -84,6 +84,9 @@ long srcu_batches_completed(struct srcu_struct *sp);
  * this assumes we are in an SRCU read-side critical section unless it can
  * prove otherwise.
  *
+ * Checks debug_lockdep_rcu_enabled() to prevent false positives during boot
+ * and while lockdep is disabled.
+ *
  * Note that if the CPU is in an extended quiescent state, for example,
  * if the CPU is in dyntick-idle mode, then rcu_read_lock_held() returns
  * false even if the CPU did an rcu_read_lock().  The reason for this is
@@ -96,7 +99,7 @@ long srcu_batches_completed(struct srcu_struct *sp);
  */
 static inline int srcu_read_lock_held(struct srcu_struct *sp)
 {
-	if (debug_locks)
+	if (!debug_lockdep_rcu_enabled())
 		return lock_is_held(&sp->dep_map);
 	if (rcu_check_extended_qs())
 		return 0;
-- 
1.7.3.2


^ permalink raw reply related	[flat|nested] 101+ messages in thread

* [PATCH tip/core/rcu 55/55] powerpc: Work around tracing from dyntick-idle mode
  2011-09-06 18:00 [PATCH tip/core/rcu 0/55] Preview of RCU changes for 3.2 Paul E. McKenney
                   ` (53 preceding siblings ...)
  2011-09-06 18:00 ` [PATCH tip/core/rcu 54/55] rcu: Make srcu_read_lock_held() call common lockdep-enabled function Paul E. McKenney
@ 2011-09-06 18:00 ` Paul E. McKenney
  2011-09-07 10:00   ` Benjamin Herrenschmidt
  2011-09-07 14:39 ` [PATCH tip/core/rcu 0/55] Preview of RCU changes for 3.2 Lin Ming
  2011-10-17  2:06 ` Josh Triplett
  56 siblings, 1 reply; 101+ messages in thread
From: Paul E. McKenney @ 2011-09-06 18:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, patches, Paul E. McKenney, anton, benh, paulus

PowerPC LPAR's __trace_hcall_exit() can invoke event tracing at a
point where RCU has been told that the CPU is in dyntick-idle mode.
Because event tracing uses RCU, this can result in failures.

A correct fix would arrange for RCU to be told about dyntick-idle
mode after tracing had completed; however, this will require some care
because it appears that __trace_hcall_exit() can also be called from
non-dyntick-idle mode.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: anton@samba.org
Cc: benh@kernel.crashing.org
Cc: paulus@samba.org
---
 arch/powerpc/platforms/pseries/lpar.c |    6 ++++++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/lpar.c b/arch/powerpc/platforms/pseries/lpar.c
index 39e6e0a..668f300 100644
--- a/arch/powerpc/platforms/pseries/lpar.c
+++ b/arch/powerpc/platforms/pseries/lpar.c
@@ -715,12 +715,14 @@ EXPORT_SYMBOL(arch_free_page);
 /* NB: reg/unreg are called while guarded with the tracepoints_mutex */
 extern long hcall_tracepoint_refcount;
 
+#if 0 /* work around buggy use of RCU from dyntick-idle mode */
 /* 
  * Since the tracing code might execute hcalls we need to guard against
  * recursion. One example of this are spinlocks calling H_YIELD on
  * shared processor partitions.
  */
 static DEFINE_PER_CPU(unsigned int, hcall_trace_depth);
+#endif /* #if 0 work around buggy use of RCU from dyntick-idle mode */
 
 void hcall_tracepoint_regfunc(void)
 {
@@ -734,6 +736,7 @@ void hcall_tracepoint_unregfunc(void)
 
 void __trace_hcall_entry(unsigned long opcode, unsigned long *args)
 {
+#if 0 /* work around buggy use of RCU from dyntick-idle mode */
 	unsigned long flags;
 	unsigned int *depth;
 
@@ -750,11 +753,13 @@ void __trace_hcall_entry(unsigned long opcode, unsigned long *args)
 
 out:
 	local_irq_restore(flags);
+#endif /* #if 0 work around buggy use of RCU from dyntick-idle mode */
 }
 
 void __trace_hcall_exit(long opcode, unsigned long retval,
 			unsigned long *retbuf)
 {
+#if 0 /* work around buggy use of RCU from dyntick-idle mode */
 	unsigned long flags;
 	unsigned int *depth;
 
@@ -771,6 +776,7 @@ void __trace_hcall_exit(long opcode, unsigned long retval,
 
 out:
 	local_irq_restore(flags);
+#endif /* #if 0 work around buggy use of RCU from dyntick-idle mode */
 }
 #endif
 
-- 
1.7.3.2


^ permalink raw reply related	[flat|nested] 101+ messages in thread

* Re: [PATCH tip/core/rcu 55/55] powerpc: Work around tracing from dyntick-idle mode
  2011-09-06 18:00 ` [PATCH tip/core/rcu 55/55] powerpc: Work around tracing from dyntick-idle mode Paul E. McKenney
@ 2011-09-07 10:00   ` Benjamin Herrenschmidt
  2011-09-07 13:44     ` Paul E. McKenney
  0 siblings, 1 reply; 101+ messages in thread
From: Benjamin Herrenschmidt @ 2011-09-07 10:00 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	josh, niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
	eric.dumazet, darren, patches, anton, paulus

On Tue, 2011-09-06 at 11:00 -0700, Paul E. McKenney wrote:
> PowerPC LPAR's __trace_hcall_exit() can invoke event tracing at a
> point where RCU has been told that the CPU is in dyntick-idle mode.
> Because event tracing uses RCU, this can result in failures.
> 
> A correct fix would arrange for RCU to be told about dyntick-idle
> mode after tracing had completed; however, this will require some care
> because it appears that __trace_hcall_exit() can also be called from
> non-dyntick-idle mode.

This obviously needs to be fixed properly. hcall tracing is very useful
and if I understand your patch properly, it just comments it out :-)

I'm not sure what the best approach is, maybe have the hcall tracing
test for the dyntick-idle mode and skip tracing in that case ?

Cheers,
Ben.

> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> Cc: anton@samba.org
> Cc: benh@kernel.crashing.org
> Cc: paulus@samba.org
> ---
>  arch/powerpc/platforms/pseries/lpar.c |    6 ++++++
>  1 files changed, 6 insertions(+), 0 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/pseries/lpar.c b/arch/powerpc/platforms/pseries/lpar.c
> index 39e6e0a..668f300 100644
> --- a/arch/powerpc/platforms/pseries/lpar.c
> +++ b/arch/powerpc/platforms/pseries/lpar.c
> @@ -715,12 +715,14 @@ EXPORT_SYMBOL(arch_free_page);
>  /* NB: reg/unreg are called while guarded with the tracepoints_mutex */
>  extern long hcall_tracepoint_refcount;
>  
> +#if 0 /* work around buggy use of RCU from dyntick-idle mode */
>  /* 
>   * Since the tracing code might execute hcalls we need to guard against
>   * recursion. One example of this are spinlocks calling H_YIELD on
>   * shared processor partitions.
>   */
>  static DEFINE_PER_CPU(unsigned int, hcall_trace_depth);
> +#endif /* #if 0 work around buggy use of RCU from dyntick-idle mode */
>  
>  void hcall_tracepoint_regfunc(void)
>  {
> @@ -734,6 +736,7 @@ void hcall_tracepoint_unregfunc(void)
>  
>  void __trace_hcall_entry(unsigned long opcode, unsigned long *args)
>  {
> +#if 0 /* work around buggy use of RCU from dyntick-idle mode */
>  	unsigned long flags;
>  	unsigned int *depth;
>  
> @@ -750,11 +753,13 @@ void __trace_hcall_entry(unsigned long opcode, unsigned long *args)
>  
>  out:
>  	local_irq_restore(flags);
> +#endif /* #if 0 work around buggy use of RCU from dyntick-idle mode */
>  }
>  
>  void __trace_hcall_exit(long opcode, unsigned long retval,
>  			unsigned long *retbuf)
>  {
> +#if 0 /* work around buggy use of RCU from dyntick-idle mode */
>  	unsigned long flags;
>  	unsigned int *depth;
>  
> @@ -771,6 +776,7 @@ void __trace_hcall_exit(long opcode, unsigned long retval,
>  
>  out:
>  	local_irq_restore(flags);
> +#endif /* #if 0 work around buggy use of RCU from dyntick-idle mode */
>  }
>  #endif
>  



^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH tip/core/rcu 55/55] powerpc: Work around tracing from dyntick-idle mode
  2011-09-07 10:00   ` Benjamin Herrenschmidt
@ 2011-09-07 13:44     ` Paul E. McKenney
  2011-09-13 19:13       ` Frederic Weisbecker
  0 siblings, 1 reply; 101+ messages in thread
From: Paul E. McKenney @ 2011-09-07 13:44 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	josh, niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
	eric.dumazet, darren, patches, anton, paulus

On Wed, Sep 07, 2011 at 07:00:22AM -0300, Benjamin Herrenschmidt wrote:
> On Tue, 2011-09-06 at 11:00 -0700, Paul E. McKenney wrote:
> > PowerPC LPAR's __trace_hcall_exit() can invoke event tracing at a
> > point where RCU has been told that the CPU is in dyntick-idle mode.
> > Because event tracing uses RCU, this can result in failures.
> > 
> > A correct fix would arrange for RCU to be told about dyntick-idle
> > mode after tracing had completed; however, this will require some care
> > because it appears that __trace_hcall_exit() can also be called from
> > non-dyntick-idle mode.
> 
> This obviously needs to be fixed properly. hcall tracing is very useful
> and if I understand your patch properly, it just comments it out :-)

That is exactly what it does, and I completely agree that this patch
is nothing but a short-term work-around to allow my RCU tests to find
other bugs.

> I'm not sure what the best approach is, maybe have the hcall tracing
> test for the dyntick-idle mode and skip tracing in that case ?

Another approach would be to update Frederic Weisbecker's patch at:

	https://lkml.org/lkml/2011/8/20/83

so that powerpc does tick_nohz_enter_idle(false), and then uses
rcu_enter_nohz() explicitly just after doing the hcall tracing.
If pseries is the only powerpc architecture requiring this, then
the argument to tick_nohz_enter_idle() could depend on the powerpc
sub-architecture.

The same thing would be needed for tick_nohz_exit_idle() and
rcu_exit_nohz(): powerpc would need to invoke rcu_exit_nohz() after
gaining control from the hypervisor but before doing its first tracing,
and then it would need the idle loop to to tick_nohz_exit_idle(false).
Again, if pseries is the only powerpc architecture requiring this,
the argument to tick_nohz_exit_idle() could depend on the architecture.
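
In other words, something like the following ordering, sketched with a
hypothetical flag argument where "false" means "do not tell RCU":

	/* pseries idle entry/exit (sketch): */
	tick_nohz_enter_idle(false);	/* stop the tick, leave RCU alone */
	__trace_hcall_entry(...);	/* last tracing: RCU still watching */
	rcu_enter_nohz();		/* now tell RCU this CPU is idle */
	/* ...H_CEDE: CPU sleeps in the hypervisor... */
	rcu_exit_nohz();		/* back: tell RCU first... */
	__trace_hcall_exit(...);	/* ...so this tracing is again legal */
	tick_nohz_exit_idle(false);	/* restart the tick in the idle loop */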

Would this approach work?

							Thanx, Paul

> Cheers,
> Ben.
> 
> > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > Cc: anton@samba.org
> > Cc: benh@kernel.crashing.org
> > Cc: paulus@samba.org
> > ---
> >  arch/powerpc/platforms/pseries/lpar.c |    6 ++++++
> >  1 files changed, 6 insertions(+), 0 deletions(-)
> > 
> > diff --git a/arch/powerpc/platforms/pseries/lpar.c b/arch/powerpc/platforms/pseries/lpar.c
> > index 39e6e0a..668f300 100644
> > --- a/arch/powerpc/platforms/pseries/lpar.c
> > +++ b/arch/powerpc/platforms/pseries/lpar.c
> > @@ -715,12 +715,14 @@ EXPORT_SYMBOL(arch_free_page);
> >  /* NB: reg/unreg are called while guarded with the tracepoints_mutex */
> >  extern long hcall_tracepoint_refcount;
> >  
> > +#if 0 /* work around buggy use of RCU from dyntick-idle mode */
> >  /* 
> >   * Since the tracing code might execute hcalls we need to guard against
> >   * recursion. One example of this are spinlocks calling H_YIELD on
> >   * shared processor partitions.
> >   */
> >  static DEFINE_PER_CPU(unsigned int, hcall_trace_depth);
> > +#endif /* #if 0 work around buggy use of RCU from dyntick-idle mode */
> >  
> >  void hcall_tracepoint_regfunc(void)
> >  {
> > @@ -734,6 +736,7 @@ void hcall_tracepoint_unregfunc(void)
> >  
> >  void __trace_hcall_entry(unsigned long opcode, unsigned long *args)
> >  {
> > +#if 0 /* work around buggy use of RCU from dyntick-idle mode */
> >  	unsigned long flags;
> >  	unsigned int *depth;
> >  
> > @@ -750,11 +753,13 @@ void __trace_hcall_entry(unsigned long opcode, unsigned long *args)
> >  
> >  out:
> >  	local_irq_restore(flags);
> > +#endif /* #if 0 work around buggy use of RCU from dyntick-idle mode */
> >  }
> >  
> >  void __trace_hcall_exit(long opcode, unsigned long retval,
> >  			unsigned long *retbuf)
> >  {
> > +#if 0 /* work around buggy use of RCU from dyntick-idle mode */
> >  	unsigned long flags;
> >  	unsigned int *depth;
> >  
> > @@ -771,6 +776,7 @@ void __trace_hcall_exit(long opcode, unsigned long retval,
> >  
> >  out:
> >  	local_irq_restore(flags);
> > +#endif /* #if 0 work around buggy use of RCU from dyntick-idle mode */
> >  }
> >  #endif
> >  
> 
> 

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH tip/core/rcu 0/55] Preview of RCU changes for 3.2
  2011-09-06 18:00 [PATCH tip/core/rcu 0/55] Preview of RCU changes for 3.2 Paul E. McKenney
                   ` (54 preceding siblings ...)
  2011-09-06 18:00 ` [PATCH tip/core/rcu 55/55] powerpc: Work around tracing from dyntick-idle mode Paul E. McKenney
@ 2011-09-07 14:39 ` Lin Ming
  2011-09-08 17:41   ` Paul E. McKenney
  2011-10-17  2:06 ` Josh Triplett
  56 siblings, 1 reply; 101+ messages in thread
From: Lin Ming @ 2011-09-07 14:39 UTC (permalink / raw)
  To: paulmck
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	josh, niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
	eric.dumazet, darren, patches

On Wed, Sep 7, 2011 at 2:00 AM, Paul E. McKenney
<paulmck@linux.vnet.ibm.com> wrote:
>
> For a testing-only version of this patchset from git, please see the
> following subject-to-rebase (and subject-to-Hera-availability) branch:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-2.6-rcu.git rcu/testing

kernel.org is still down.
Would you put the tree somewhere else (maybe github.com)?

Thanks,
Lin Ming

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH tip/core/rcu 05/55] rcu: Move rcu_head definition to types.h
  2011-09-06 17:59 ` [PATCH tip/core/rcu 05/55] rcu: Move rcu_head definition to types.h Paul E. McKenney
@ 2011-09-07 18:31   ` Paul Gortmaker
  2011-09-07 22:11     ` Paul E. McKenney
  0 siblings, 1 reply; 101+ messages in thread
From: Paul Gortmaker @ 2011-09-07 18:31 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	josh, niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
	eric.dumazet, darren, patches

On 11-09-06 01:59 PM, Paul E. McKenney wrote:
> Take a first step towards untangling Linux kernel header files by
> placing the struct rcu_head definition into include/linux/types.h
> and including include/linux/types.h in include/linux/rcupdate.h
> where struct rcu_head used to be defined.  The actual inclusion point
> for include/linux/types.h is with the rest of the #include directives
> rather than at the point where struct rcu_head used to be defined,
> as suggested by Mathieu Desnoyers.
> 
> Once this is in place, then header files that need only rcu_head
> can include types.h rather than rcupdate.h.

Good to see more of this untangle work taking place.

The only comment I have is whether there is any sort of implicit
categorization in place for what is appropriate for types.h.

At the moment it seems to only contain really core primary types
(and the list/hlist structs which are almost core types...)

Are there any other alternative places - does linux/kernel.h make
more sense?

Paul.

> 
> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
> Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
> ---
>  include/linux/rcupdate.h |   11 +----------
>  include/linux/types.h    |   10 ++++++++++
>  2 files changed, 11 insertions(+), 10 deletions(-)
> 
> diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> index 8be0433..2516555 100644
> --- a/include/linux/rcupdate.h
> +++ b/include/linux/rcupdate.h
> @@ -33,6 +33,7 @@
>  #ifndef __LINUX_RCUPDATE_H
>  #define __LINUX_RCUPDATE_H
>  
> +#include <linux/types.h>
>  #include <linux/cache.h>
>  #include <linux/spinlock.h>
>  #include <linux/threads.h>
> @@ -64,16 +65,6 @@ static inline void rcutorture_record_progress(unsigned long vernum)
>  #define ULONG_CMP_GE(a, b)	(ULONG_MAX / 2 >= (a) - (b))
>  #define ULONG_CMP_LT(a, b)	(ULONG_MAX / 2 < (a) - (b))
>  
> -/**
> - * struct rcu_head - callback structure for use with RCU
> - * @next: next update requests in a list
> - * @func: actual update function to call after the grace period.
> - */
> -struct rcu_head {
> -	struct rcu_head *next;
> -	void (*func)(struct rcu_head *head);
> -};
> -
>  /* Exported common interfaces */
>  extern void call_rcu_sched(struct rcu_head *head,
>  			   void (*func)(struct rcu_head *rcu));
> diff --git a/include/linux/types.h b/include/linux/types.h
> index 176da8c..57a9723 100644
> --- a/include/linux/types.h
> +++ b/include/linux/types.h
> @@ -238,6 +238,16 @@ struct ustat {
>  	char			f_fpack[6];
>  };
>  
> +/**
> + * struct rcu_head - callback structure for use with RCU
> + * @next: next update requests in a list
> + * @func: actual update function to call after the grace period.
> + */
> +struct rcu_head {
> +	struct rcu_head *next;
> +	void (*func)(struct rcu_head *head);
> +};
> +
>  #endif	/* __KERNEL__ */
>  #endif /*  __ASSEMBLY__ */
>  #endif /* _LINUX_TYPES_H */

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH tip/core/rcu 05/55] rcu: Move rcu_head definition to types.h
  2011-09-07 18:31   ` Paul Gortmaker
@ 2011-09-07 22:11     ` Paul E. McKenney
  0 siblings, 0 replies; 101+ messages in thread
From: Paul E. McKenney @ 2011-09-07 22:11 UTC (permalink / raw)
  To: Paul Gortmaker
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	josh, niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
	eric.dumazet, darren, patches

On Wed, Sep 07, 2011 at 02:31:51PM -0400, Paul Gortmaker wrote:
> On 11-09-06 01:59 PM, Paul E. McKenney wrote:
> > Take a first step towards untangling Linux kernel header files by
> > placing the struct rcu_head definition into include/linux/types.h
> > and including include/linux/types.h in include/linux/rcupdate.h
> > where struct rcu_head used to be defined.  The actual inclusion point
> > for include/linux/types.h is with the rest of the #include directives
> > rather than at the point where struct rcu_head used to be defined,
> > as suggested by Mathieu Desnoyers.
> > 
> > Once this is in place, then header files that need only rcu_head
> > can include types.h rather than rcupdate.h.
> 
> Good to see more of this untangle work taking place.
> 
> The only comment I have is whether there is any sort of implicit
> categorization in place for what is appropriate for types.h.
> 
> At the moment it seems to only contain really core primary types
> (and the list/hlist structs which are almost core types...)

Actually, it was the presence of the list/hlist structs that convinced
me that types.h was a good place for rcu_head.
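
That is, include/linux/types.h already carries (give or take):

	struct list_head {
		struct list_head *next, *prev;
	};

	struct hlist_head {
		struct hlist_node *first;
	};

so a two-field struct rcu_head seemed to be in good company there.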

							Thanx, Paul

> Are there any other alternative places - does linux/kernel.h make
> more sense?
> 
> Paul.
> 
> > 
> > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
> > Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
> > ---
> >  include/linux/rcupdate.h |   11 +----------
> >  include/linux/types.h    |   10 ++++++++++
> >  2 files changed, 11 insertions(+), 10 deletions(-)
> > 
> > diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> > index 8be0433..2516555 100644
> > --- a/include/linux/rcupdate.h
> > +++ b/include/linux/rcupdate.h
> > @@ -33,6 +33,7 @@
> >  #ifndef __LINUX_RCUPDATE_H
> >  #define __LINUX_RCUPDATE_H
> >  
> > +#include <linux/types.h>
> >  #include <linux/cache.h>
> >  #include <linux/spinlock.h>
> >  #include <linux/threads.h>
> > @@ -64,16 +65,6 @@ static inline void rcutorture_record_progress(unsigned long vernum)
> >  #define ULONG_CMP_GE(a, b)	(ULONG_MAX / 2 >= (a) - (b))
> >  #define ULONG_CMP_LT(a, b)	(ULONG_MAX / 2 < (a) - (b))
> >  
> > -/**
> > - * struct rcu_head - callback structure for use with RCU
> > - * @next: next update requests in a list
> > - * @func: actual update function to call after the grace period.
> > - */
> > -struct rcu_head {
> > -	struct rcu_head *next;
> > -	void (*func)(struct rcu_head *head);
> > -};
> > -
> >  /* Exported common interfaces */
> >  extern void call_rcu_sched(struct rcu_head *head,
> >  			   void (*func)(struct rcu_head *rcu));
> > diff --git a/include/linux/types.h b/include/linux/types.h
> > index 176da8c..57a9723 100644
> > --- a/include/linux/types.h
> > +++ b/include/linux/types.h
> > @@ -238,6 +238,16 @@ struct ustat {
> >  	char			f_fpack[6];
> >  };
> >  
> > +/**
> > + * struct rcu_head - callback structure for use with RCU
> > + * @next: next update requests in a list
> > + * @func: actual update function to call after the grace period.
> > + */
> > +struct rcu_head {
> > +	struct rcu_head *next;
> > +	void (*func)(struct rcu_head *head);
> > +};
> > +
> >  #endif	/* __KERNEL__ */
> >  #endif /*  __ASSEMBLY__ */
> >  #endif /* _LINUX_TYPES_H */

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH tip/core/rcu 0/55] Preview of RCU changes for 3.2
  2011-09-07 14:39 ` [PATCH tip/core/rcu 0/55] Preview of RCU changes for 3.2 Lin Ming
@ 2011-09-08 17:41   ` Paul E. McKenney
  2011-09-08 19:23     ` Thomas Gleixner
  0 siblings, 1 reply; 101+ messages in thread
From: Paul E. McKenney @ 2011-09-08 17:41 UTC (permalink / raw)
  To: Lin Ming
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	josh, niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
	eric.dumazet, darren, patches

On Wed, Sep 07, 2011 at 10:39:57PM +0800, Lin Ming wrote:
> On Wed, Sep 7, 2011 at 2:00 AM, Paul E. McKenney
> <paulmck@linux.vnet.ibm.com> wrote:
> >
> > For a testing-only version of this patchset from git, please see the
> > following subject-to-rebase (and subject-to-Hera-availability) branch:
> >
> > git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-2.6-rcu.git rcu/testing
> 
> kernel.org is still down.
> Would you put the tree somewhere else(maybe github.com)?

Hello, Lin,

I am unlikely to figure out why github doesn't like me by the end of
the week, so please feel free to post the patches somewhere.  If you
have a -tip tree handy, you could apply them on top of tip/master.
If you don't have a -tip tree handy, they should also apply cleanly
onto v3.0.

							Thanx, Paul

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH tip/core/rcu 0/55] Preview of RCU changes for 3.2
  2011-09-08 17:41   ` Paul E. McKenney
@ 2011-09-08 19:23     ` Thomas Gleixner
  2011-09-08 20:48       ` Paul E. McKenney
  0 siblings, 1 reply; 101+ messages in thread
From: Thomas Gleixner @ 2011-09-08 19:23 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Lin Ming, linux-kernel, mingo, laijs, dipankar, akpm,
	mathieu.desnoyers, josh, niv, peterz, rostedt, Valdis.Kletnieks,
	dhowells, eric.dumazet, darren, patches

On Thu, 8 Sep 2011, Paul E. McKenney wrote:

> On Wed, Sep 07, 2011 at 10:39:57PM +0800, Lin Ming wrote:
> > On Wed, Sep 7, 2011 at 2:00 AM, Paul E. McKenney
> > <paulmck@linux.vnet.ibm.com> wrote:
> > >
> > > For a testing-only version of this patchset from git, please see the
> > > following subject-to-rebase (and subject-to-Hera-availability) branch:
> > >
> > > git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-2.6-rcu.git rcu/testing
> > 
> > kernel.org is still down.
> > Would you put the tree somewhere else(maybe github.com)?
> 
> Hello, Lin,
> 
> I am unlikely to figure out why github doesn't like me by the end of
> the week, so please feel free to post the patches somewhere.  If you
> have a -tip tree handy, you could apply them on top of tip/master.
> If you don't have a -tip tree handy, they should also apply cleanly
> onto v3.0.

  git://tesla.tglx.de/git/linux-2.6-tip

Just in case.

     tglx

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH tip/core/rcu 0/55] Preview of RCU changes for 3.2
  2011-09-08 19:23     ` Thomas Gleixner
@ 2011-09-08 20:48       ` Paul E. McKenney
  2011-09-12 16:24         ` Paul E. McKenney
  0 siblings, 1 reply; 101+ messages in thread
From: Paul E. McKenney @ 2011-09-08 20:48 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Lin Ming, linux-kernel, mingo, laijs, dipankar, akpm,
	mathieu.desnoyers, josh, niv, peterz, rostedt, Valdis.Kletnieks,
	dhowells, eric.dumazet, darren, patches

On Thu, Sep 08, 2011 at 09:23:05PM +0200, Thomas Gleixner wrote:
> On Thu, 8 Sep 2011, Paul E. McKenney wrote:
> 
> > On Wed, Sep 07, 2011 at 10:39:57PM +0800, Lin Ming wrote:
> > > On Wed, Sep 7, 2011 at 2:00 AM, Paul E. McKenney
> > > <paulmck@linux.vnet.ibm.com> wrote:
> > > >
> > > > For a testing-only version of this patchset from git, please see the
> > > > following subject-to-rebase (and subject-to-Hera-availability) branch:
> > > >
> > > > git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-2.6-rcu.git rcu/testing
> > > 
> > > kernel.org is still down.
> > > Would you put the tree somewhere else(maybe github.com)?
> > 
> > Hello, Lin,
> > 
> > I am unlikely to figure out why github doesn't like me by the end of
> > the week, so please feel free to post the patches somewhere.  If you
> > have a -tip tree handy, you could apply them on top of tip/master.
> > If you don't have a -tip tree handy, they should also apply cleanly
> > onto v3.0.
> 
>   git://tesla.tglx.de/git/linux-2.6-tip
> 
> Just in case.

Thank you very much, Thomas!!!

							Thanx, Paul

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH tip/core/rcu 48/55] powerpc: strengthen value-returning-atomics memory barriers
  2011-09-06 18:00 ` [PATCH tip/core/rcu 48/55] powerpc: strengthen value-returning-atomics memory barriers Paul E. McKenney
@ 2011-09-09 17:23     ` Olof Johansson
  0 siblings, 0 replies; 101+ messages in thread
From: Olof Johansson @ 2011-09-09 17:23 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	josh, niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
	eric.dumazet, darren, patches, anton, benh, paulus, linuxppc-dev

[+linuxppc-dev]

On Tue, Sep 6, 2011 at 11:00 AM, Paul E. McKenney
<paulmck@linux.vnet.ibm.com> wrote:
> The trailing isync/lwsync in PowerPC value-returning atomics needs
> to be a sync in order to provide the required ordering properties.
> The leading lwsync/eieio can remain, as the remainder of the required
> ordering guarantees are provided by the atomic instructions: Any
> reordering will cause the stwcx to fail, which will result in a retry.

Admittedly, my powerpc barrier memory is starting to fade, but isn't
isync sufficient here? It will make sure all instructions before it
have retired, and will restart any speculative/issued instructions
beyond it.

lwsync not being sufficient makes sense since a load can overtake it.

> diff --git a/arch/powerpc/include/asm/synch.h b/arch/powerpc/include/asm/synch.h
> index d7cab44..4d97fbe 100644
> --- a/arch/powerpc/include/asm/synch.h
> +++ b/arch/powerpc/include/asm/synch.h
> @@ -37,11 +37,7 @@ static inline void isync(void)
>  #endif
>
>  #ifdef CONFIG_SMP
> -#define __PPC_ACQUIRE_BARRIER                          \
> -       START_LWSYNC_SECTION(97);                       \
> -       isync;                                          \
> -       MAKE_LWSYNC_SECTION_ENTRY(97, __lwsync_fixup);
> -#define PPC_ACQUIRE_BARRIER    "\n" stringify_in_c(__PPC_ACQUIRE_BARRIER)
> +#define PPC_ACQUIRE_BARRIER    "\n" stringify_in_c(sync;)

This can just be done as "\n\tsync\n" instead of the stringify stuff.


-Olof

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH tip/core/rcu 48/55] powerpc: strengthen value-returning-atomics memory barriers
  2011-09-09 17:23     ` Olof Johansson
@ 2011-09-09 17:34       ` Paul E. McKenney
  -1 siblings, 0 replies; 101+ messages in thread
From: Paul E. McKenney @ 2011-09-09 17:34 UTC (permalink / raw)
  To: Olof Johansson
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	josh, niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
	eric.dumazet, darren, patches, anton, benh, paulus, linuxppc-dev

On Fri, Sep 09, 2011 at 10:23:33AM -0700, Olof Johansson wrote:
> [+linuxppc-dev]
> 
> On Tue, Sep 6, 2011 at 11:00 AM, Paul E. McKenney
> <paulmck@linux.vnet.ibm.com> wrote:
> > The trailing isync/lwsync in PowerPC value-returning atomics needs
> > to be a sync in order to provide the required ordering properties.
> > The leading lwsync/eieio can remain, as the remainder of the required
> > ordering guarantees are provided by the atomic instructions: Any
> > reordering will cause the stwcx to fail, which will result in a retry.
> 
> Admittedly, my powerpc barrier memory is starting to fade, but isn't
> isync sufficient here? It will make sure all instructions before it
> have retired, and will restart any speculative/issued instructions
> beyond it.
> 
> lwsync not being sufficient makes sense since a load can overtake it.

As I understand it, although isync waits for the prior stwcx to execute,
it does not guarantee that the corresponding store is visible to all
processors before any following loads.
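
For concreteness, the sequence under discussion is roughly as follows
(a sketch of a value-returning atomic along the lines of
atomic_add_return(); declarations elided):

	__asm__ __volatile__(
		PPC_RELEASE_BARRIER	/* leading lwsync/eieio, unchanged */
	"1:	lwarx	%0,0,%2\n"	/* load-reserve v->counter */
	"	add	%0,%1,%0\n"
	"	stwcx.	%0,0,%2\n"	/* store-conditional */
	"	bne-	1b\n"		/* reservation lost: retry */
		PPC_ACQUIRE_BARRIER	/* was isync-based, now a full sync */
	: "=&r" (t) : "r" (a), "r" (&v->counter) : "cc", "memory");

The trailing barrier must order the stwcx. store against later loads,
which isync alone does not guarantee.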

> > diff --git a/arch/powerpc/include/asm/synch.h b/arch/powerpc/include/asm/synch.h
> > index d7cab44..4d97fbe 100644
> > --- a/arch/powerpc/include/asm/synch.h
> > +++ b/arch/powerpc/include/asm/synch.h
> > @@ -37,11 +37,7 @@ static inline void isync(void)
> >  #endif
> >
> >  #ifdef CONFIG_SMP
> > -#define __PPC_ACQUIRE_BARRIER                          \
> > -       START_LWSYNC_SECTION(97);                       \
> > -       isync;                                          \
> > -       MAKE_LWSYNC_SECTION_ENTRY(97, __lwsync_fixup);
> > -#define PPC_ACQUIRE_BARRIER    "\n" stringify_in_c(__PPC_ACQUIRE_BARRIER)
> > +#define PPC_ACQUIRE_BARRIER    "\n" stringify_in_c(sync;)
> 
> This can just be done as "\n\tsync\n" instead of the stringify stuff.

That does sound a bit more straightforward, now that you mention it.  ;-)

							Thanx, Paul


^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH tip/core/rcu 48/55] powerpc: strengthen value-returning-atomics memory barriers
  2011-09-09 17:34       ` Paul E. McKenney
@ 2011-09-09 18:43         ` Olof Johansson
  -1 siblings, 0 replies; 101+ messages in thread
From: Olof Johansson @ 2011-09-09 18:43 UTC (permalink / raw)
  To: paulmck
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	josh, niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
	eric.dumazet, darren, patches, anton, benh, paulus, linuxppc-dev

On Fri, Sep 9, 2011 at 10:34 AM, Paul E. McKenney
<paulmck@linux.vnet.ibm.com> wrote:
> On Fri, Sep 09, 2011 at 10:23:33AM -0700, Olof Johansson wrote:
>> [+linuxppc-dev]
>>
>> On Tue, Sep 6, 2011 at 11:00 AM, Paul E. McKenney
>> <paulmck@linux.vnet.ibm.com> wrote:
>> > The trailing isync/lwsync in PowerPC value-returning atomics needs
>> > to be a sync in order to provide the required ordering properties.
>> > The leading lwsync/eieio can remain, as the remainder of the required
>> > ordering guarantees are provided by the atomic instructions: Any
>> > reordering will cause the stwcx to fail, which will result in a retry.
>>
>> Admittedly, my powerpc barrier memory is starting to fade, but isn't
>> isync sufficient here? It will make sure all instructions before it
>> have retired, and will restart any speculative/issued instructions
>> beyond it.
>>
>> lwsync not being sufficient makes sense since a load can overtake it.
>
> As I understand it, although isync waits for the prior stwcx to execute,
> it does not guarantee that the corresponding store is visible to all
> processors before any following loads.

Ah yes, combined with brushing up on the semantics in
memory-barriers.txt, this sounds reasonable to me.

>> > diff --git a/arch/powerpc/include/asm/synch.h b/arch/powerpc/include/asm/synch.h
>> > index d7cab44..4d97fbe 100644
>> > --- a/arch/powerpc/include/asm/synch.h
>> > +++ b/arch/powerpc/include/asm/synch.h
>> > @@ -37,11 +37,7 @@ static inline void isync(void)
>> >  #endif
>> >
>> >  #ifdef CONFIG_SMP
>> > -#define __PPC_ACQUIRE_BARRIER                          \
>> > -       START_LWSYNC_SECTION(97);                       \
>> > -       isync;                                          \
>> > -       MAKE_LWSYNC_SECTION_ENTRY(97, __lwsync_fixup);
>> > -#define PPC_ACQUIRE_BARRIER    "\n" stringify_in_c(__PPC_ACQUIRE_BARRIER)
>> > +#define PPC_ACQUIRE_BARRIER    "\n" stringify_in_c(sync;)
>>
>> This can just be done as "\n\tsync\n" instead of the stringify stuff.
>
> That does sound a bit more straightforward, now that you mention it.  ;-)

With that change, I'm:

Acked-by: Olof Johansson <olof@lixom.net>

But at least Ben or Anton should sign off on it too.


-Olof

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH tip/core/rcu 0/55] Preview of RCU changes for 3.2
  2011-09-08 20:48       ` Paul E. McKenney
@ 2011-09-12 16:24         ` Paul E. McKenney
  0 siblings, 0 replies; 101+ messages in thread
From: Paul E. McKenney @ 2011-09-12 16:24 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Lin Ming, linux-kernel, mingo, laijs, dipankar, akpm,
	mathieu.desnoyers, josh, niv, peterz, rostedt, Valdis.Kletnieks,
	dhowells, eric.dumazet, darren, patches

On Thu, Sep 08, 2011 at 01:48:48PM -0700, Paul E. McKenney wrote:
> On Thu, Sep 08, 2011 at 09:23:05PM +0200, Thomas Gleixner wrote:
> > On Thu, 8 Sep 2011, Paul E. McKenney wrote:

[ . . . ]

> > > Hello, Lin,
> > > 
> > > I am unlikely to figure out why github doesn't like me by the end of
> > > the week, so please feel free to post the patches somewhere.  If you
> > > have a -tip tree handy, you could apply them on top of tip/master.
> > > If you don't have a -tip tree handy, they should also apply cleanly
> > > onto v3.0.
> > 
> >   git://tesla.tglx.de/git/linux-2.6-tip
> > 
> > Just in case.
> 
> Thank you very much, Thomas!!!

And now that I have finished herding the cats at Linux Plumbers Conference,
I did figure out how to make github accept my push requests.  Please see
https://github.com/paulmckrcu/linux.

							Thanx, Paul

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH tip/core/rcu 44/55] rcu: wire up RCU_BOOST_PRIO for rcutree
  2011-09-06 18:00 ` [PATCH tip/core/rcu 44/55] rcu: wire up RCU_BOOST_PRIO for rcutree Paul E. McKenney
@ 2011-09-13 12:02   ` Mike Galbraith
  2011-09-13 15:34     ` Paul E. McKenney
  2011-10-17  1:55   ` Josh Triplett
  1 sibling, 1 reply; 101+ messages in thread
From: Mike Galbraith @ 2011-09-13 12:02 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	josh, niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
	eric.dumazet, darren, patches

Hi Paul,

This patch causes RCU thread priority funnies, with some help from rcun.

On Tue, 2011-09-06 at 11:00 -0700, Paul E. McKenney wrote:
> 	return 0;
> @@ -1466,6 +1474,7 @@ static void rcu_yield(void (*f)(unsigned long), unsigned long arg)
>  {
>  	struct sched_param sp;
>  	struct timer_list yield_timer;
> +	int prio = current->normal_prio;
>  
>  	setup_timer_on_stack(&yield_timer, f, arg);
>  	mod_timer(&yield_timer, jiffies + 2);

There's a thinko there: prio either needs to be inverted before feeding
it to __setscheduler(), or just use ->rt_priority.  I did the latter,
and twiddled rcun to restore its priority instead of RCU_KTHREAD_PRIO.
RCU threads now stay put.
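
That is, roughly (a sketch -- for an RT task the kernel's prio math
gives normal_prio == MAX_RT_PRIO-1 - rt_priority):

	int prio = current->normal_prio;	/* 98 for an rt_priority-1 task */
	...
	sp.sched_priority = prio;		/* asks for FIFO 98, not FIFO 1 */
	sched_setscheduler_nocheck(current, SCHED_FIFO, &sp);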

rcu: wire up RCU_BOOST_PRIO for rcutree

RCU boost threads start life at RCU_BOOST_PRIO, while others remain at
RCU_KTHREAD_PRIO.  Adjust rcu_yield() to preserve priority across the
yield, and if the node thread restores RT policy for a yielding thread,
it sets priority to its own priority.  This sets the stage for
user-controlled runtime changes to priority in the -rt tree.

While here, change thread names to match other kthreads.

Signed-off-by: Mike Galbraith <efault@gmx.de>

---
 kernel/rcutree.c        |    2 --
 kernel/rcutree_plugin.h |   22 ++++++++++++++++------
 2 files changed, 16 insertions(+), 8 deletions(-)

Index: linux-3.0-tip/kernel/rcutree.c
===================================================================
--- linux-3.0-tip.orig/kernel/rcutree.c
+++ linux-3.0-tip/kernel/rcutree.c
@@ -128,8 +128,6 @@ static void rcu_node_kthread_setaffinity
 static void invoke_rcu_core(void);
 static void invoke_rcu_callbacks(struct rcu_state *rsp, struct rcu_data *rdp);
 
-#define RCU_KTHREAD_PRIO 1	/* RT priority for per-CPU kthreads. */
-
 /*
  * Track the rcutorture test sequence number and the update version
  * number within a given test.  The rcutorture_testseq is incremented
Index: linux-3.0-tip/kernel/rcutree_plugin.h
===================================================================
--- linux-3.0-tip.orig/kernel/rcutree_plugin.h
+++ linux-3.0-tip/kernel/rcutree_plugin.h
@@ -27,6 +27,14 @@
 #include <linux/delay.h>
 #include <linux/stop_machine.h>
 
+#define RCU_KTHREAD_PRIO 1
+
+#ifdef CONFIG_RCU_BOOST
+#define RCU_BOOST_PRIO CONFIG_RCU_BOOST_PRIO
+#else
+#define RCU_BOOST_PRIO RCU_KTHREAD_PRIO
+#endif
+
 /*
  * Check the RCU kernel configuration parameters and print informative
  * messages about anything out of the ordinary.  If you like #ifdef, you
@@ -1345,13 +1353,13 @@ static int __cpuinit rcu_spawn_one_boost
 	if (rnp->boost_kthread_task != NULL)
 		return 0;
 	t = kthread_create(rcu_boost_kthread, (void *)rnp,
-			   "rcub%d", rnp_index);
+			   "rcub/%d", rnp_index);
 	if (IS_ERR(t))
 		return PTR_ERR(t);
 	raw_spin_lock_irqsave(&rnp->lock, flags);
 	rnp->boost_kthread_task = t;
 	raw_spin_unlock_irqrestore(&rnp->lock, flags);
-	sp.sched_priority = RCU_KTHREAD_PRIO;
+	sp.sched_priority = RCU_BOOST_PRIO;
 	sched_setscheduler_nocheck(t, SCHED_FIFO, &sp);
 	wake_up_process(t); /* get to TASK_INTERRUPTIBLE quickly. */
 	return 0;
@@ -1446,6 +1454,7 @@ static void rcu_yield(void (*f)(unsigned
 {
 	struct sched_param sp;
 	struct timer_list yield_timer;
+	int prio = current->rt_priority;
 
 	setup_timer_on_stack(&yield_timer, f, arg);
 	mod_timer(&yield_timer, jiffies + 2);
@@ -1453,7 +1462,8 @@ static void rcu_yield(void (*f)(unsigned
 	sched_setscheduler_nocheck(current, SCHED_NORMAL, &sp);
 	set_user_nice(current, 19);
 	schedule();
-	sp.sched_priority = RCU_KTHREAD_PRIO;
+	set_user_nice(current, 0);
+	sp.sched_priority = prio;
 	sched_setscheduler_nocheck(current, SCHED_FIFO, &sp);
 	del_timer(&yield_timer);
 }
@@ -1562,7 +1572,7 @@ static int __cpuinit rcu_spawn_one_cpu_k
 	if (!rcu_scheduler_fully_active ||
 	    per_cpu(rcu_cpu_kthread_task, cpu) != NULL)
 		return 0;
-	t = kthread_create(rcu_cpu_kthread, (void *)(long)cpu, "rcuc%d", cpu);
+	t = kthread_create(rcu_cpu_kthread, (void *)(long)cpu, "rcuc/%d", cpu);
 	if (IS_ERR(t))
 		return PTR_ERR(t);
 	if (cpu_online(cpu))
@@ -1608,7 +1618,7 @@ static int rcu_node_kthread(void *arg)
 				continue;
 			}
 			per_cpu(rcu_cpu_has_work, cpu) = 1;
-			sp.sched_priority = RCU_KTHREAD_PRIO;
+			sp.sched_priority = current->rt_priority;
 			sched_setscheduler_nocheck(t, SCHED_FIFO, &sp);
 			preempt_enable();
 		}
@@ -1671,7 +1681,7 @@ static int __cpuinit rcu_spawn_one_node_
 		return 0;
 	if (rnp->node_kthread_task == NULL) {
 		t = kthread_create(rcu_node_kthread, (void *)rnp,
-				   "rcun%d", rnp_index);
+				   "rcun/%d", rnp_index);
 		if (IS_ERR(t))
 			return PTR_ERR(t);
 		raw_spin_lock_irqsave(&rnp->lock, flags);



^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH tip/core/rcu 44/55] rcu: wire up RCU_BOOST_PRIO for rcutree
  2011-09-13 12:02   ` Mike Galbraith
@ 2011-09-13 15:34     ` Paul E. McKenney
  2011-09-13 16:04       ` Mike Galbraith
  0 siblings, 1 reply; 101+ messages in thread
From: Paul E. McKenney @ 2011-09-13 15:34 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	josh, niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
	eric.dumazet, darren, patches

On Tue, Sep 13, 2011 at 02:02:14PM +0200, Mike Galbraith wrote:
> Hi Paul,
> 
> This patch causes RCU thread priority funnies, with some help from rcun.
> 
> On Tue, 2011-09-06 at 11:00 -0700, Paul E. McKenney wrote:
> > 	return 0;
> > @@ -1466,6 +1474,7 @@ static void rcu_yield(void (*f)(unsigned long), unsigned long arg)
> >  {
> >  	struct sched_param sp;
> >  	struct timer_list yield_timer;
> > +	int prio = current->normal_prio;
> >  
> >  	setup_timer_on_stack(&yield_timer, f, arg);
> >  	mod_timer(&yield_timer, jiffies + 2);
> 
> There's a thinko there: prio either needs to be inverted before feeding
> it to __setscheduler(), or just use ->rt_priority.  I did the latter,
> and twiddled rcun to restore its priority instead of RCU_KTHREAD_PRIO.
> RCU threads now stay put.

Very good -- some comments below.  I had to hand-apply this due to
conflicts with my current patch stack: https://github.com/paulmckrcu/linux

> rcu: wire up RCU_BOOST_PRIO for rcutree
> 
> RCU boost threads start life at RCU_BOOST_PRIO, while others remain at
> RCU_KTHREAD_PRIO.  Adjust rcu_yield() to preserve priority across the
> yield, and if the node thread restores RT policy for a yielding thread,
> it sets priority to its own priority.  This sets the stage for
> user-controlled runtime changes to priority in the -rt tree.
> 
> While here, change thread names to match other kthreads.
> 
> Signed-off-by: Mike Galbraith <efault@gmx.de>
> 
> ---
>  kernel/rcutree.c        |    2 --
>  kernel/rcutree_plugin.h |   22 ++++++++++++++++------
>  2 files changed, 16 insertions(+), 8 deletions(-)
> 
> Index: linux-3.0-tip/kernel/rcutree.c
> ===================================================================
> --- linux-3.0-tip.orig/kernel/rcutree.c
> +++ linux-3.0-tip/kernel/rcutree.c
> @@ -128,8 +128,6 @@ static void rcu_node_kthread_setaffinity
>  static void invoke_rcu_core(void);
>  static void invoke_rcu_callbacks(struct rcu_state *rsp, struct rcu_data *rdp);
> 
> -#define RCU_KTHREAD_PRIO 1	/* RT priority for per-CPU kthreads. */
> -
>  /*
>   * Track the rcutorture test sequence number and the update version
>   * number within a given test.  The rcutorture_testseq is incremented
> Index: linux-3.0-tip/kernel/rcutree_plugin.h
> ===================================================================
> --- linux-3.0-tip.orig/kernel/rcutree_plugin.h
> +++ linux-3.0-tip/kernel/rcutree_plugin.h
> @@ -27,6 +27,14 @@
>  #include <linux/delay.h>
>  #include <linux/stop_machine.h>
> 
> +#define RCU_KTHREAD_PRIO 1
> +
> +#ifdef CONFIG_RCU_BOOST
> +#define RCU_BOOST_PRIO CONFIG_RCU_BOOST_PRIO
> +#else
> +#define RCU_BOOST_PRIO RCU_KTHREAD_PRIO
> +#endif
> +
>  /*
>   * Check the RCU kernel configuration parameters and print informative
>   * messages about anything out of the ordinary.  If you like #ifdef, you
> @@ -1345,13 +1353,13 @@ static int __cpuinit rcu_spawn_one_boost
>  	if (rnp->boost_kthread_task != NULL)
>  		return 0;
>  	t = kthread_create(rcu_boost_kthread, (void *)rnp,
> -			   "rcub%d", rnp_index);
> +			   "rcub/%d", rnp_index);
>  	if (IS_ERR(t))
>  		return PTR_ERR(t);
>  	raw_spin_lock_irqsave(&rnp->lock, flags);
>  	rnp->boost_kthread_task = t;
>  	raw_spin_unlock_irqrestore(&rnp->lock, flags);
> -	sp.sched_priority = RCU_KTHREAD_PRIO;
> +	sp.sched_priority = RCU_BOOST_PRIO;
>  	sched_setscheduler_nocheck(t, SCHED_FIFO, &sp);
>  	wake_up_process(t); /* get to TASK_INTERRUPTIBLE quickly. */
>  	return 0;
> @@ -1446,6 +1454,7 @@ static void rcu_yield(void (*f)(unsigned
>  {
>  	struct sched_param sp;
>  	struct timer_list yield_timer;
> +	int prio = current->rt_priority;

This makes sense, and I have merged it into your previous patch.

>  	setup_timer_on_stack(&yield_timer, f, arg);
>  	mod_timer(&yield_timer, jiffies + 2);
> @@ -1453,7 +1462,8 @@ static void rcu_yield(void (*f)(unsigned
>  	sched_setscheduler_nocheck(current, SCHED_NORMAL, &sp);
>  	set_user_nice(current, 19);
>  	schedule();
> -	sp.sched_priority = RCU_KTHREAD_PRIO;
> +	set_user_nice(current, 0);
> +	sp.sched_priority = prio;
>  	sched_setscheduler_nocheck(current, SCHED_FIFO, &sp);
>  	del_timer(&yield_timer);
>  }
> @@ -1562,7 +1572,7 @@ static int __cpuinit rcu_spawn_one_cpu_k
>  	if (!rcu_scheduler_fully_active ||
>  	    per_cpu(rcu_cpu_kthread_task, cpu) != NULL)
>  		return 0;
> -	t = kthread_create(rcu_cpu_kthread, (void *)(long)cpu, "rcuc%d", cpu);
> +	t = kthread_create(rcu_cpu_kthread, (void *)(long)cpu, "rcuc/%d", cpu);
>  	if (IS_ERR(t))
>  		return PTR_ERR(t);
>  	if (cpu_online(cpu))
> @@ -1608,7 +1618,7 @@ static int rcu_node_kthread(void *arg)
>  				continue;
>  			}
>  			per_cpu(rcu_cpu_has_work, cpu) = 1;
> -			sp.sched_priority = RCU_KTHREAD_PRIO;
> +			sp.sched_priority = current->rt_priority;

This is broken -- the per-node kthread runs at RT prio 99, but we usually
would not want to boost that high.

Seems like we should have a global variable that tracks the current
priority.  This global variable could then be set in a manner similar
to the softirq priorities -- or, perhaps better, simply set whenever
the softirq priority is changed.
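
Something like the following, perhaps (a sketch only -- the variable
name and its update points are assumptions):

	/* Updated whenever user space changes the RCU-kthread priority. */
	int rcu_kthread_prio = RCU_KTHREAD_PRIO;

	...
			per_cpu(rcu_cpu_has_work, cpu) = 1;
			sp.sched_priority = ACCESS_ONCE(rcu_kthread_prio);
			sched_setscheduler_nocheck(t, SCHED_FIFO, &sp);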

Thoughts?

>  			sched_setscheduler_nocheck(t, SCHED_FIFO, &sp);
>  			preempt_enable();
>  		}
> @@ -1671,7 +1681,7 @@ static int __cpuinit rcu_spawn_one_node_
>  		return 0;
>  	if (rnp->node_kthread_task == NULL) {
>  		t = kthread_create(rcu_node_kthread, (void *)rnp,
> -				   "rcun%d", rnp_index);
> +				   "rcun/%d", rnp_index);
>  		if (IS_ERR(t))
>  			return PTR_ERR(t);
>  		raw_spin_lock_irqsave(&rnp->lock, flags);
> 
> 


^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH tip/core/rcu 44/55] rcu: wire up RCU_BOOST_PRIO for rcutree
  2011-09-13 15:34     ` Paul E. McKenney
@ 2011-09-13 16:04       ` Mike Galbraith
  2011-09-13 20:50         ` Paul E. McKenney
  0 siblings, 1 reply; 101+ messages in thread
From: Mike Galbraith @ 2011-09-13 16:04 UTC (permalink / raw)
  To: paulmck
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	josh, niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
	eric.dumazet, darren, patches

On Tue, 2011-09-13 at 08:34 -0700, Paul E. McKenney wrote:
> On Tue, Sep 13, 2011 at 02:02:14PM +0200, Mike Galbraith wrote:

> > @@ -1608,7 +1618,7 @@ static int rcu_node_kthread(void *arg)
> >  				continue;
> >  			}
> >  			per_cpu(rcu_cpu_has_work, cpu) = 1;
> > -			sp.sched_priority = RCU_KTHREAD_PRIO;
> > +			sp.sched_priority = current->rt_priority;
> 
> This is broken -- the per-node kthread runs at RT prio 99, but we usually
> would not want to boost that high.

Ouch, right.  My userland sets things on boot, so it works.

> Seems like we should have a global variable that tracks the current
> priority.  This global variable could then be set in a manner similar
> to the softirq priorities -- or, perhaps better, simply set whenever
> the softirq priority is changed.
> 
> Thoughts?

RCU threads would have to constantly watch for user priority changes on
their own, and update private data methinks.

	-Mike


^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH tip/core/rcu 55/55] powerpc: Work around tracing from dyntick-idle mode
  2011-09-07 13:44     ` Paul E. McKenney
@ 2011-09-13 19:13       ` Frederic Weisbecker
  2011-09-13 19:50         ` Paul E. McKenney
  0 siblings, 1 reply; 101+ messages in thread
From: Frederic Weisbecker @ 2011-09-13 19:13 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Benjamin Herrenschmidt, linux-kernel, mingo, laijs, dipankar,
	akpm, mathieu.desnoyers, josh, niv, tglx, peterz, rostedt,
	Valdis.Kletnieks, dhowells, eric.dumazet, darren, patches, anton,
	paulus

On Wed, Sep 07, 2011 at 06:44:00AM -0700, Paul E. McKenney wrote:
> On Wed, Sep 07, 2011 at 07:00:22AM -0300, Benjamin Herrenschmidt wrote:
> > On Tue, 2011-09-06 at 11:00 -0700, Paul E. McKenney wrote:
> > > PowerPC LPAR's __trace_hcall_exit() can invoke event tracing at a
> > > point where RCU has been told that the CPU is in dyntick-idle mode.
> > > Because event tracing uses RCU, this can result in failures.
> > > 
> > > A correct fix would arrange for RCU to be told about dyntick-idle
> > > mode after tracing had completed; however, this will require some care
> > > because it appears that __trace_hcall_exit() can also be called from
> > > non-dyntick-idle mode.
> > 
> > This obviously needs to be fixed properly. hcall tracing is very useful
> > and if I understand your patch properly, it just comments it out :-)
> 
> That is exactly what it does, and I completely agree that this patch
> is nothing but a short-term work-around to allow my RCU tests to find
> other bugs.
> 
> > I'm not sure what the best approach is, maybe have the hcall tracing
> > test for the dyntick-idle mode and skip tracing in that case?
> 
> Another approach would be to update Frederic Weisbecker's patch at:
> 
> 	https://lkml.org/lkml/2011/8/20/83
> 
> so that powerpc does tick_nohz_enter_idle(false), and then uses
> rcu_enter_nohz() explicitly just after doing the hcall tracing.
> If pseries is the only powerpc architecture requiring this, then
> the argument to tick_nohz_enter_idle() could depend on the powerpc
> sub-architecture.

I'm trying to fix this but I need a bit of help to understand the
pseries cpu sleeping.

In pseries_dedicated_idle_sleep(), what is the function that does
the real sleeping? Is it cede_processor()?

> 
> The same thing would be needed for tick_nohz_exit_idle() and
> rcu_exit_nohz(): powerpc would need to invoke rcu_exit_nohz() after
> gaining control from the hypervisor but before doing its first tracing,
> and then it would need the idle loop to do tick_nohz_exit_idle(false).
> Again, if pseries is the only powerpc architecture requiring this,
> the argument to tick_nohz_exit_idle() could depend on the architecture.
> 
> Would this approach work?

Sounds like we really need that.

Thanks.

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH tip/core/rcu 55/55] powerpc: Work around tracing from dyntick-idle mode
  2011-09-13 19:13       ` Frederic Weisbecker
@ 2011-09-13 19:50         ` Paul E. McKenney
  2011-09-13 20:49           ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 101+ messages in thread
From: Paul E. McKenney @ 2011-09-13 19:50 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Benjamin Herrenschmidt, linux-kernel, mingo, laijs, dipankar,
	akpm, mathieu.desnoyers, josh, niv, tglx, peterz, rostedt,
	Valdis.Kletnieks, dhowells, eric.dumazet, darren, patches, anton,
	paulus

On Tue, Sep 13, 2011 at 09:13:21PM +0200, Frederic Weisbecker wrote:
> On Wed, Sep 07, 2011 at 06:44:00AM -0700, Paul E. McKenney wrote:
> > On Wed, Sep 07, 2011 at 07:00:22AM -0300, Benjamin Herrenschmidt wrote:
> > > On Tue, 2011-09-06 at 11:00 -0700, Paul E. McKenney wrote:
> > > > PowerPC LPAR's __trace_hcall_exit() can invoke event tracing at a
> > > > point where RCU has been told that the CPU is in dyntick-idle mode.
> > > > Because event tracing uses RCU, this can result in failures.
> > > > 
> > > > A correct fix would arrange for RCU to be told about dyntick-idle
> > > > mode after tracing had completed; however, this will require some care
> > > > because it appears that __trace_hcall_exit() can also be called from
> > > > non-dyntick-idle mode.
> > > 
> > > This obviously needs to be fixed properly. hcall tracing is very useful
> > > and if I understand your patch properly, it just comments it out :-)
> > 
> > That is exactly what it does, and I completely agree that this patch
> > is nothing but a short-term work-around to allow my RCU tests to find
> > other bugs.
> > 
> > > I'm not sure what the best approach is, maybe have the hcall tracing
> > > test for the dyntick-idle mode and skip tracing in that case?
> > 
> > Another approach would be to update Frederic Weisbecker's patch at:
> > 
> > 	https://lkml.org/lkml/2011/8/20/83
> > 
> > so that powerpc does tick_nohz_enter_idle(false), and then uses
> > rcu_enter_nohz() explicitly just after doing the hcall tracing.
> > If pseries is the only powerpc architecture requiring this, then
> > the argument to tick_nohz_enter_idle() could depend on the powerpc
> > sub-architecture.
> 
> I'm trying to fix this but I need a bit of help to understand the
> pseries cpu sleeping.
> 
> In pseries_dedicated_idle_sleep(), what is the function that does
> the real sleeping? Is it cede_processor()?

As I understand it, cede_processor()'s call to plpar_hcall_norets()
results in the hypervisor being invoked, and could give up the CPU.
And yes, in this case, RCU needs to stop paying attention to this CPU.
And pseries_shared_idle_sleep() also invokes cede_processor().
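
(For reference, cede_processor() is a thin wrapper -- quoting from
memory, so worth double-checking:

	static inline void cede_processor(void)
	{
		plpar_hcall_norets(H_CEDE);
	}

so the hcall tracing fires inside plpar_hcall_norets() itself.)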

Gah...  And there also appear to be some assembly-language functions
that can be invoked via the ppc_md.power_save() call from cpu_idle():
ppc6xx_idle(), power4_idle(), idle_spin(), idle_doze(), and book3e_idle().
There is also a power7_idle(), but it does not appear to be used anywhere.

Plus there are the C-language ppc44x_idle(), beat_power_save(),
cbe_power_save(), ps3_power_save(), and cpm_idle().

> > The same thing would be needed for tick_nohz_exit_idle() and
> > rcu_exit_nohz(): powerpc would need to invoke rcu_exit_nohz() after
> > gaining control from the hypervisor but before doing its first tracing,
> > and then it would need the idle loop to do tick_nohz_exit_idle(false).
> > Again, if pseries is the only powerpc architecture requiring this,
> > the argument to tick_nohz_exit_idle() could depend on the architecture.
> > 
> > Would this approach work?
> 
> Sounds like we really need that.

Sounds like an arch-dependent config symbol that is defined for the
pseries targets, but not for the other powerpc architectures.
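
The idle loop could then do something like this (a sketch; the config
symbol below is a placeholder, not an existing one):

#ifdef CONFIG_PPC_PSERIES_IDLE_TRACING	/* hypothetical symbol */
	tick_nohz_enter_idle(false);	/* arch invokes rcu_enter_nohz() itself */
#else
	tick_nohz_enter_idle(true);	/* generic path handles RCU as well */
#endif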

Not clear to me what to do about power4_idle(), though.

							Thanx, Paul

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH tip/core/rcu 55/55] powerpc: Work around tracing from dyntick-idle mode
  2011-09-13 19:50         ` Paul E. McKenney
@ 2011-09-13 20:49           ` Benjamin Herrenschmidt
  2011-09-15 14:53             ` Frederic Weisbecker
  2011-09-16 12:24             ` Frederic Weisbecker
  0 siblings, 2 replies; 101+ messages in thread
From: Benjamin Herrenschmidt @ 2011-09-13 20:49 UTC (permalink / raw)
  To: paulmck
  Cc: Frederic Weisbecker, linux-kernel, mingo, laijs, dipankar, akpm,
	mathieu.desnoyers, josh, niv, tglx, peterz, rostedt,
	Valdis.Kletnieks, dhowells, eric.dumazet, darren, patches, anton,
	paulus


> As I understand it, cede_processor()'s call to plpar_hcall_norets()
> results in the hypervisor being invoked, and could give up the CPU.
> And yes, in this case, RCU needs to stop paying attention to this CPU.
> And pseries_shared_idle_sleep() also invokes cede_processor().
> 
> Gah...  And there also appear to be some assembly-language functions
> that can be invoked via the ppc_md.power_save() call from cpu_idle():
> ppc6xx_idle(), power4_idle(), idle_spin(), idle_doze(), and book3e_idle().
> There is also a power7_idle(), but it does not appear to be used anywhere.
> 
> Plus there are the C-language ppc44x_idle(), beat_power_save(),
> cbe_power_save(), ps3_power_save(), and cpm_idle().
> 
> > > The same thing would be needed for tick_nohz_exit_idle() and
> > > rcu_exit_nohz(): powerpc would need to invoke rcu_exit_nohz() after
> > > gaining control from the hypervisor but before doing its first tracing,
> > > and then it would need the idle loop to do tick_nohz_exit_idle(false).
> > > Again, if pseries is the only powerpc architecture requiring this,
> > > the argument to tick_nohz_exit_idle() could depend on the architecture.
> > > 
> > > Would this approach work?
> > 
> > Sounds like we really need that.
> 
> Sounds like an arch-dependent config symbol that is defined for the
> pseries targets, but not for the other powerpc architectures.
> 
> Not clear to me what to do about power4_idle(), though.

I don't totally follow, too many things to deal with right now, but keep
in mind that we build multiplatform kernels, so you can have powermac,
cell, pseries, etc... all in one kernel binary (including power7 idle).

Shouldn't we instead change the plpar trace call to skip the tracing
when not safe to do so?
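
Something like this, say (a sketch -- the idle-test predicate is a
placeholder for whatever RCU can export):

	void __trace_hcall_entry(unsigned long opcode, unsigned long *args)
	{
		if (rcu_cpu_is_in_nohz())	/* hypothetical predicate */
			return;			/* not safe: skip tracing */
		...
	}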

Cheers,
Ben.



^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH tip/core/rcu 44/55] rcu: wire up RCU_BOOST_PRIO for rcutree
  2011-09-13 16:04       ` Mike Galbraith
@ 2011-09-13 20:50         ` Paul E. McKenney
  0 siblings, 0 replies; 101+ messages in thread
From: Paul E. McKenney @ 2011-09-13 20:50 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	josh, niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
	eric.dumazet, darren, patches

On Tue, Sep 13, 2011 at 06:04:18PM +0200, Mike Galbraith wrote:
> On Tue, 2011-09-13 at 08:34 -0700, Paul E. McKenney wrote:
> > On Tue, Sep 13, 2011 at 02:02:14PM +0200, Mike Galbraith wrote:
> 
> > > @@ -1608,7 +1618,7 @@ static int rcu_node_kthread(void *arg)
> > >  				continue;
> > >  			}
> > >  			per_cpu(rcu_cpu_has_work, cpu) = 1;
> > > -			sp.sched_priority = RCU_KTHREAD_PRIO;
> > > +			sp.sched_priority = current->rt_priority;
> > 
> > This is broken -- the per-node kthread runs at RT prio 99, but we usually
> > would not want to boost that high.
> 
> Ouch, right.  My userland sets things on boot, so it works.

;-)

> > Seems like we should have a global variable that tracks the current
> > priority.  This global variable could then be set in a manner similar
> > to the softirq priorities -- or, perhaps better, simply set whenever
> > the softirq priority is changed.
> > 
> > Thoughts?
> 
> RCU threads would have to constantly watch for user priority changes on
> their own, and update private data methinks.

I believe that we are going to need some sort of -rt-specific handling
of the RCU boost priority in the short term.  Though maybe I could think
about getting runtime modification into mainline as well -- but it would
be different than -rt for a bit.

							Thanx, Paul

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH tip/core/rcu 55/55] powerpc: Work around tracing from dyntick-idle mode
  2011-09-13 20:49           ` Benjamin Herrenschmidt
@ 2011-09-15 14:53             ` Frederic Weisbecker
  2011-09-16 12:24             ` Frederic Weisbecker
  1 sibling, 0 replies; 101+ messages in thread
From: Frederic Weisbecker @ 2011-09-15 14:53 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: paulmck, linux-kernel, mingo, laijs, dipankar, akpm,
	mathieu.desnoyers, josh, niv, tglx, peterz, rostedt,
	Valdis.Kletnieks, dhowells, eric.dumazet, darren, patches, anton,
	paulus

On Tue, Sep 13, 2011 at 05:49:53PM -0300, Benjamin Herrenschmidt wrote:
> 
> > As I understand it, cede_processor()'s call to plpar_hcall_norets()
> > results in the hypervisor being invoked, and could give up the CPU.
> > And yes, in this case, RCU needs to stop paying attention to this CPU.
> > And pseries_shared_idle_sleep() also invokes cede_processor().
> > 
> > Gah...  And there also appear to be some assembly-language functions
> > that can be invoked via the ppc_md.power_save() call from cpu_idle():
> > ppc6xx_idle(), power4_idle(), idle_spin(), idle_doze(), and book3e_idle().
> > There is also a power7_idle(), but it does not appear to be used anywhere.
> > 
> > Plus there are the C-language ppc44x_idle(), beat_power_save(),
> > cbe_power_save(), ps3_power_save(), and cpm_idle().
> > 
> > > > The same thing would be needed for tick_nohz_exit_idle() and
> > > > rcu_exit_nohz(): powerpc would need to invoke rcu_exit_nohz() after
> > > > gaining control from the hypervisor but before doing its first tracing,
> > > > and then it would need the idle loop to do tick_nohz_exit_idle(false).
> > > > Again, if pseries is the only powerpc architecture requiring this,
> > > > the argument to tick_nohz_exit_idle() could depend on the architecture.
> > > > 
> > > > Would this approach work?
> > > 
> > > Sounds like we really need that.
> > 
> > Sounds like an arch-dependent config symbol that is defined for the
> > pseries targets, but not for the other powerpc architectures.
> > 
> > Not clear to me what to do about power4_idle(), though.
> 
> I don't totally follow, too many things to deal with right now, but keep
> in mind that we build multiplatform kernels, so you can have powermac,
> cell, pseries, etc... all in one kernel binary (including power7 idle).
> 
> Shouldn't we instead change the plpar trace call to skip the tracing
> when not safe to do so ?

Or maybe we can have a plpar_hcall_norets_notrace() for this specific case?
Lemme try something.

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH tip/core/rcu 55/55] powerpc: Work around tracing from dyntick-idle mode
  2011-09-13 20:49           ` Benjamin Herrenschmidt
  2011-09-15 14:53             ` Frederic Weisbecker
@ 2011-09-16 12:24             ` Frederic Weisbecker
  1 sibling, 0 replies; 101+ messages in thread
From: Frederic Weisbecker @ 2011-09-16 12:24 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul E. McKenney
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	josh, niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
	eric.dumazet, darren, patches, anton, paulus

On Tue, Sep 13, 2011 at 05:49:53PM -0300, Benjamin Herrenschmidt wrote:
> 
> > As I understand it, cede_processor()'s call to plpar_hcall_norets()
> > results in the hypervisor being invoked, and could give up the CPU.
> > And yes, in this case, RCU needs to stop paying attention to this CPU.
> > And pseries_shared_idle_sleep() also invokes cede_processor().
> > 
> > Gah...  And there also appear to be some assembly-language functions
> > that can be invoked via the ppc_md.power_save() call from cpu_idle():
> > ppc6xx_idle(), power4_idle(), idle_spin(), idle_doze(), and book3e_idle().
> > There is also a power7_idle(), but it does not appear to be used anywhere.
> > 
> > Plus there are the C-language ppc44x_idle(), beat_power_save(),
> > cbe_power_save(), ps3_power_save(), and cpm_idle().
> > 
> > > > The same thing would be needed for tick_nohz_exit_idle() and
> > > > rcu_exit_nohz(): powerpc would need to invoke rcu_exit_nohz() after
> > > > gaining control from the hypervisor but before doing its first tracing,
> > > > and then it would need the idle loop to call tick_nohz_exit_idle(false).
> > > > Again, if pseries is the only powerpc architecture requiring this,
> > > > the argument to tick_nohz_exit_idle() could depend on the architecture.
> > > > 
> > > > Would this approach work?
> > > 
> > > Sounds like we really need that.
> > 
> > Sounds like an arch-dependent config symbol that is defined for the
> > pseries targets, but not for the other powerpc architectures.
> > 
> > Not clear to me what to do about power4_idle(), though.
> 
> I don't totally follow, too many things to deal with right now, but keep
> in mind that we build multiplatform kernels, so you can have powermac,
> cell, pseries, etc... all in one kernel binary (including power7 idle).
> 
> Shouldn't we instead change the plpar trace call to skip the tracing
> when not safe to do so ?

So perhaps something like this could help? AFAIK the only place
where the calls to trace_hcall_entry/exit are unsafe is in
cede_processor().

(I don't know powerpc asm, so that patch is based only on guesses
from its similarities with ARM asm, which I know better.)

Only compile tested.


Not-yet-signed-off-by: Me
---
diff --git a/arch/powerpc/include/asm/hvcall.h b/arch/powerpc/include/asm/hvcall.h
index fd8201d..37818d3 100644
--- a/arch/powerpc/include/asm/hvcall.h
+++ b/arch/powerpc/include/asm/hvcall.h
@@ -250,6 +250,8 @@
  */
 long plpar_hcall_norets(unsigned long opcode, ...);
 
+long plpar_hcall_norets_notrace(unsigned long opcode, ...);
+
 /**
  * plpar_hcall: - Make a pseries hypervisor call
  * @opcode: The hypervisor call to make.
diff --git a/arch/powerpc/platforms/pseries/hvCall.S b/arch/powerpc/platforms/pseries/hvCall.S
index fd05fde..302fc7a 100644
--- a/arch/powerpc/platforms/pseries/hvCall.S
+++ b/arch/powerpc/platforms/pseries/hvCall.S
@@ -107,6 +107,18 @@ END_FTR_SECTION(0, 1);						\
 
 	.text
 
+_GLOBAL(plpar_hcall_norets_notrace)
+	HMT_MEDIUM
+
+	mfcr	r0
+	stw	r0,8(r1)
+
+	HVSC				/* invoke the hypervisor */
+
+	lwz	r0,8(r1)
+	mtcrf	0xff,r0
+	blr				/* return r3 = status */
+
 _GLOBAL(plpar_hcall_norets)
 	HMT_MEDIUM
 
diff --git a/arch/powerpc/platforms/pseries/plpar_wrappers.h b/arch/powerpc/platforms/pseries/plpar_wrappers.h
index 4bf2120..30f3d64 100644
--- a/arch/powerpc/platforms/pseries/plpar_wrappers.h
+++ b/arch/powerpc/platforms/pseries/plpar_wrappers.h
@@ -29,7 +29,7 @@ static inline void set_cede_latency_hint(u8 latency_hint)
 
 static inline long cede_processor(void)
 {
-	return plpar_hcall_norets(H_CEDE);
+	return plpar_hcall_norets_notrace(H_CEDE);
 }
 
 static inline long extended_cede_processor(unsigned long latency_hint)


^ permalink raw reply related	[flat|nested] 101+ messages in thread

* Re: [PATCH tip/core/rcu 41/55] rcu: Permit rt_mutex_unlock() with irqs disabled
  2011-09-06 18:00 ` [PATCH tip/core/rcu 41/55] rcu: Permit rt_mutex_unlock() with irqs disabled Paul E. McKenney
@ 2011-09-18  4:09   ` Yong Zhang
  2011-09-19  4:14     ` Paul E. McKenney
  0 siblings, 1 reply; 101+ messages in thread
From: Yong Zhang @ 2011-09-18  4:09 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	josh, niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
	eric.dumazet, darren, patches, Paul E. McKenney

On Tue, Sep 06, 2011 at 11:00:35AM -0700, Paul E. McKenney wrote:
> From: Paul E. McKenney <paul.mckenney@linaro.org>
> 
> Create a separate lockdep class for the rt_mutex used for RCU priority
> boosting and enable use of rt_mutex_lock() with irqs disabled.  This
> prevents RCU priority boosting from falling prey to deadlocks when
> someone begins an RCU read-side critical section in preemptible state,
> but releases it with an irq-disabled lock held.
> 
> Unfortunately, the scheduler's runqueue and priority-inheritance locks
> still must either completely enclose or be completely enclosed by any
> overlapping RCU read-side critical section.
> 
> Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> ---
>  kernel/rcutree_plugin.h |    6 ++++++
>  kernel/rtmutex.c        |    8 ++++++++
>  2 files changed, 14 insertions(+), 0 deletions(-)
> 
> diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
> index d3127e8..f6c63ea 100644
> --- a/kernel/rcutree_plugin.h
> +++ b/kernel/rcutree_plugin.h
> @@ -1149,6 +1149,8 @@ static void rcu_initiate_boost_trace(struct rcu_node *rnp)
>  
>  #endif /* #else #ifdef CONFIG_RCU_TRACE */
>  
> +static struct lock_class_key rcu_boost_class;
> +
>  /*
>   * Carry out RCU priority boosting on the task indicated by ->exp_tasks
>   * or ->boost_tasks, advancing the pointer to the next task in the
> @@ -1211,10 +1213,14 @@ static int rcu_boost(struct rcu_node *rnp)
>  	 */
>  	t = container_of(tb, struct task_struct, rcu_node_entry);
>  	rt_mutex_init_proxy_locked(&mtx, t);
> +	/* Avoid lockdep false positives.  This rt_mutex is its own thing. */
> +	lockdep_set_class_and_name(&mtx.wait_lock, &rcu_boost_class,
> +				   "rcu_boost_mutex");
>  	t->rcu_boost_mutex = &mtx;

  	raw_spin_unlock_irqrestore(&rnp->lock, flags);  <====A

>  	rt_mutex_lock(&mtx);  /* Side effect: boosts task t's priority. */
>  	rt_mutex_unlock(&mtx);  /* Keep lockdep happy. */
> +	local_irq_restore(flags);

Does it help here?
irqs are enabled at A, so we still call rt_mutex_lock() with irqs enabled.

Seems we should s/raw_spin_unlock_irqrestore/raw_spin_unlock ?

BTW, since we are in process context, 'flags' does not need to be
saved, no?

Thanks,
Yong


>  
>  	return rnp->exp_tasks != NULL || rnp->boost_tasks != NULL;
>  }
> diff --git a/kernel/rtmutex.c b/kernel/rtmutex.c
> index ab44911..2548f44 100644
> --- a/kernel/rtmutex.c
> +++ b/kernel/rtmutex.c
> @@ -579,6 +579,7 @@ __rt_mutex_slowlock(struct rt_mutex *lock, int state,
>  		    struct rt_mutex_waiter *waiter)
>  {
>  	int ret = 0;
> +	int was_disabled;
>  
>  	for (;;) {
>  		/* Try to acquire the lock: */
> @@ -601,10 +602,17 @@ __rt_mutex_slowlock(struct rt_mutex *lock, int state,
>  
>  		raw_spin_unlock(&lock->wait_lock);
>  
> +		was_disabled = irqs_disabled();
> +		if (was_disabled)
> +			local_irq_enable();
> +
>  		debug_rt_mutex_print_deadlock(waiter);
>  
>  		schedule_rt_mutex(lock);
>  
> +		if (was_disabled)
> +			local_irq_disable();
> +
>  		raw_spin_lock(&lock->wait_lock);
>  		set_current_state(state);
>  	}
> -- 
> 1.7.3.2
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH tip/core/rcu 41/55] rcu: Permit rt_mutex_unlock() with irqs disabled
  2011-09-18  4:09   ` Yong Zhang
@ 2011-09-19  4:14     ` Paul E. McKenney
  2011-09-19  5:49       ` Yong Zhang
  0 siblings, 1 reply; 101+ messages in thread
From: Paul E. McKenney @ 2011-09-19  4:14 UTC (permalink / raw)
  To: Yong Zhang
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	josh, niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
	eric.dumazet, darren, patches, Paul E. McKenney

On Sun, Sep 18, 2011 at 12:09:23PM +0800, Yong Zhang wrote:
> On Tue, Sep 06, 2011 at 11:00:35AM -0700, Paul E. McKenney wrote:
> > From: Paul E. McKenney <paul.mckenney@linaro.org>
> > 
> > Create a separate lockdep class for the rt_mutex used for RCU priority
> > boosting and enable use of rt_mutex_lock() with irqs disabled.  This
> > prevents RCU priority boosting from falling prey to deadlocks when
> > someone begins an RCU read-side critical section in preemptible state,
> > but releases it with an irq-disabled lock held.
> > 
> > Unfortunately, the scheduler's runqueue and priority-inheritance locks
> > still must either completely enclose or be completely enclosed by any
> > overlapping RCU read-side critical section.
> > 
> > Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
> > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > ---
> >  kernel/rcutree_plugin.h |    6 ++++++
> >  kernel/rtmutex.c        |    8 ++++++++
> >  2 files changed, 14 insertions(+), 0 deletions(-)
> > 
> > diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
> > index d3127e8..f6c63ea 100644
> > --- a/kernel/rcutree_plugin.h
> > +++ b/kernel/rcutree_plugin.h
> > @@ -1149,6 +1149,8 @@ static void rcu_initiate_boost_trace(struct rcu_node *rnp)
> >  
> >  #endif /* #else #ifdef CONFIG_RCU_TRACE */
> >  
> > +static struct lock_class_key rcu_boost_class;
> > +
> >  /*
> >   * Carry out RCU priority boosting on the task indicated by ->exp_tasks
> >   * or ->boost_tasks, advancing the pointer to the next task in the
> > @@ -1211,10 +1213,14 @@ static int rcu_boost(struct rcu_node *rnp)
> >  	 */
> >  	t = container_of(tb, struct task_struct, rcu_node_entry);
> >  	rt_mutex_init_proxy_locked(&mtx, t);
> > +	/* Avoid lockdep false positives.  This rt_mutex is its own thing. */
> > +	lockdep_set_class_and_name(&mtx.wait_lock, &rcu_boost_class,
> > +				   "rcu_boost_mutex");
> >  	t->rcu_boost_mutex = &mtx;
> 
>   	raw_spin_unlock_irqrestore(&rnp->lock, flags);  <====A
> 
> >  	rt_mutex_lock(&mtx);  /* Side effect: boosts task t's priority. */
> >  	rt_mutex_unlock(&mtx);  /* Keep lockdep happy. */
> > +	local_irq_restore(flags);
> 
> Does it help here?
> irqs are enabled at A, so we still call rt_mutex_lock() with irqs enabled.
>
> Seems we should s/raw_spin_unlock_irqrestore/raw_spin_unlock ?

Hmmm...  The above works at least by accident, but I am clearly not
testing calling rt_mutex_lock(&mtx) and rt_mutex_unlock(&mtx) with
interrupts disabled anywhere near as heavily as I thought I was.

I will fix this one way or the other.

> BTW, since we are in process context, 'flags' does not need to be
> saved, no?

Only until the code gets moved/reused...

							Thanx, Paul

> Thanks,
> Yong
> 
> 
> >  
> >  	return rnp->exp_tasks != NULL || rnp->boost_tasks != NULL;
> >  }
> > diff --git a/kernel/rtmutex.c b/kernel/rtmutex.c
> > index ab44911..2548f44 100644
> > --- a/kernel/rtmutex.c
> > +++ b/kernel/rtmutex.c
> > @@ -579,6 +579,7 @@ __rt_mutex_slowlock(struct rt_mutex *lock, int state,
> >  		    struct rt_mutex_waiter *waiter)
> >  {
> >  	int ret = 0;
> > +	int was_disabled;
> >  
> >  	for (;;) {
> >  		/* Try to acquire the lock: */
> > @@ -601,10 +602,17 @@ __rt_mutex_slowlock(struct rt_mutex *lock, int state,
> >  
> >  		raw_spin_unlock(&lock->wait_lock);
> >  
> > +		was_disabled = irqs_disabled();
> > +		if (was_disabled)
> > +			local_irq_enable();
> > +
> >  		debug_rt_mutex_print_deadlock(waiter);
> >  
> >  		schedule_rt_mutex(lock);
> >  
> > +		if (was_disabled)
> > +			local_irq_disable();
> > +
> >  		raw_spin_lock(&lock->wait_lock);
> >  		set_current_state(state);
> >  	}
> > -- 
> > 1.7.3.2
> > 
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > Please read the FAQ at  http://www.tux.org/lkml/
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/


^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH tip/core/rcu 41/55] rcu: Permit rt_mutex_unlock() with irqs disabled
  2011-09-19  4:14     ` Paul E. McKenney
@ 2011-09-19  5:49       ` Yong Zhang
  2011-09-20 14:57         ` Paul E. McKenney
  0 siblings, 1 reply; 101+ messages in thread
From: Yong Zhang @ 2011-09-19  5:49 UTC (permalink / raw)
  To: paulmck
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	josh, niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
	eric.dumazet, darren, patches, Paul E. McKenney

[-- Attachment #1: Type: text/plain, Size: 3589 bytes --]

On Mon, Sep 19, 2011 at 12:14 PM, Paul E. McKenney
<paulmck@linux.vnet.ibm.com> wrote:
> On Sun, Sep 18, 2011 at 12:09:23PM +0800, Yong Zhang wrote:
>> > diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
>> > index d3127e8..f6c63ea 100644
>> > --- a/kernel/rcutree_plugin.h
>> > +++ b/kernel/rcutree_plugin.h
>> > @@ -1149,6 +1149,8 @@ static void rcu_initiate_boost_trace(struct rcu_node *rnp)
>> >
>> >  #endif /* #else #ifdef CONFIG_RCU_TRACE */
>> >
>> > +static struct lock_class_key rcu_boost_class;
>> > +
>> >  /*
>> >   * Carry out RCU priority boosting on the task indicated by ->exp_tasks
>> >   * or ->boost_tasks, advancing the pointer to the next task in the
>> > @@ -1211,10 +1213,14 @@ static int rcu_boost(struct rcu_node *rnp)
>> >      */
>> >     t = container_of(tb, struct task_struct, rcu_node_entry);
>> >     rt_mutex_init_proxy_locked(&mtx, t);
>> > +   /* Avoid lockdep false positives.  This rt_mutex is its own thing. */
>> > +   lockdep_set_class_and_name(&mtx.wait_lock, &rcu_boost_class,
>> > +                              "rcu_boost_mutex");
>> >     t->rcu_boost_mutex = &mtx;
>>
>>       raw_spin_unlock_irqrestore(&rnp->lock, flags);  <====A
>>
>> >     rt_mutex_lock(&mtx);  /* Side effect: boosts task t's priority. */
>> >     rt_mutex_unlock(&mtx);  /* Keep lockdep happy. */
>> > +   local_irq_restore(flags);
>>
>> Does it help here?
>> irqs are enabled at A, so we still call rt_mutex_lock() with irqs enabled.
>>
>> Seems we should s/raw_spin_unlock_irqrestore/raw_spin_unlock ?
>
> Hmmm...  The above works at least by accident, but I am clearly not
> testing calling rt_mutex_lock(&mtx) and rt_mutex_unlock(&mtx) with
> interrupts disabled anywhere near as heavily as I thought I was.
>
> I will fix this one way or the other.

Forgot to mention: if we want to suppress the lockdep warning on
overlapping usage of rcu_read_*()/local_irq_*() like below:

rcu_read_lock();
...
local_irq_disable();
...
rcu_read_unlock();
...
local_irq_enable();

'rt_mutex_unlock(rbmp);' must also be surrounded by
local_irq_save()/restore().

Untested patch is attached.

Thanks,
Yong

---
From: Yong Zhang <yong.zhang0@gmail.com>
Subject: [PATCH] rcu: Permit rt_mutex_unlock() with irqs disabled take#2

This makes the below RCU usage really valid (AKA, lockdep
will not warn on it):

rcu_read_lock();
local_irq_disable();
rcu_read_unlock();
local_irq_enable();

Signed-off-by: Yong Zhang <yong.zhang0@gmail.com>
---
 kernel/rcutree_plugin.h |    7 +++++--
 1 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
index e7eea74..d41a9b0 100644
--- a/kernel/rcutree_plugin.h
+++ b/kernel/rcutree_plugin.h
@@ -398,8 +398,11 @@ static noinline void rcu_read_unlock_special(struct task_struct *t)

 #ifdef CONFIG_RCU_BOOST
 		/* Unboost if we were boosted. */
-		if (rbmp)
+		if (rbmp) {
+			local_irq_save(flags);
 			rt_mutex_unlock(rbmp);
+			local_irq_restore(flags);
+		}
 #endif /* #ifdef CONFIG_RCU_BOOST */

 		/*
@@ -1225,7 +1228,7 @@ static int rcu_boost(struct rcu_node *rnp)
 	lockdep_set_class_and_name(&mtx.wait_lock, &rcu_boost_class,
 				   "rcu_boost_mutex");
 	t->rcu_boost_mutex = &mtx;
-	raw_spin_unlock_irqrestore(&rnp->lock, flags);
+	raw_spin_unlock(&rnp->lock);
 	rt_mutex_lock(&mtx);  /* Side effect: boosts task t's priority. */
 	rt_mutex_unlock(&mtx);  /* Keep lockdep happy. */
 	local_irq_restore(flags);
-- 
1.7.4.1

[-- Attachment #2: 0001-rcu-Permit-rt_mutex_unlock-with-irqs-disabled-take-2.patch --]
[-- Type: text/x-patch, Size: 1422 bytes --]

From 7d74d1b89a4cd4c03b30e47044b716913f68bd1d Mon Sep 17 00:00:00 2001
From: Yong Zhang <yong.zhang0@gmail.com>
Date: Mon, 19 Sep 2011 13:42:32 +0800
Subject: [PATCH] rcu: Permit rt_mutex_unlock() with irqs disabled take#2

This makes the below RCU usage really valid (AKA, lockdep
will not warn on it):

rcu_read_lock();
local_irq_disable();
rcu_read_unlock();
local_irq_enable();

Signed-off-by: Yong Zhang <yong.zhang0@gmail.com>
---
 kernel/rcutree_plugin.h |    7 +++++--
 1 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
index e7eea74..d41a9b0 100644
--- a/kernel/rcutree_plugin.h
+++ b/kernel/rcutree_plugin.h
@@ -398,8 +398,11 @@ static noinline void rcu_read_unlock_special(struct task_struct *t)
 
 #ifdef CONFIG_RCU_BOOST
 		/* Unboost if we were boosted. */
-		if (rbmp)
+		if (rbmp) {
+			local_irq_save(flags);
 			rt_mutex_unlock(rbmp);
+			local_irq_restore(flags);
+		}
 #endif /* #ifdef CONFIG_RCU_BOOST */
 
 		/*
@@ -1225,7 +1228,7 @@ static int rcu_boost(struct rcu_node *rnp)
 	lockdep_set_class_and_name(&mtx.wait_lock, &rcu_boost_class,
 				   "rcu_boost_mutex");
 	t->rcu_boost_mutex = &mtx;
-	raw_spin_unlock_irqrestore(&rnp->lock, flags);
+	raw_spin_unlock(&rnp->lock);
 	rt_mutex_lock(&mtx);  /* Side effect: boosts task t's priority. */
 	rt_mutex_unlock(&mtx);  /* Keep lockdep happy. */
 	local_irq_restore(flags);
-- 
1.7.4.1


^ permalink raw reply related	[flat|nested] 101+ messages in thread

* Re: [PATCH tip/core/rcu 41/55] rcu: Permit rt_mutex_unlock() with irqs disabled
  2011-09-19  5:49       ` Yong Zhang
@ 2011-09-20 14:57         ` Paul E. McKenney
  0 siblings, 0 replies; 101+ messages in thread
From: Paul E. McKenney @ 2011-09-20 14:57 UTC (permalink / raw)
  To: Yong Zhang
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	josh, niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
	eric.dumazet, darren, patches, Paul E. McKenney

On Mon, Sep 19, 2011 at 01:49:33PM +0800, Yong Zhang wrote:
> On Mon, Sep 19, 2011 at 12:14 PM, Paul E. McKenney
> <paulmck@linux.vnet.ibm.com> wrote:
> > On Sun, Sep 18, 2011 at 12:09:23PM +0800, Yong Zhang wrote:
> >> > diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
> >> > index d3127e8..f6c63ea 100644
> >> > --- a/kernel/rcutree_plugin.h
> >> > +++ b/kernel/rcutree_plugin.h
> >> > @@ -1149,6 +1149,8 @@ static void rcu_initiate_boost_trace(struct rcu_node *rnp)
> >> >
> >> >  #endif /* #else #ifdef CONFIG_RCU_TRACE */
> >> >
> >> > +static struct lock_class_key rcu_boost_class;
> >> > +
> >> >  /*
> >> >   * Carry out RCU priority boosting on the task indicated by ->exp_tasks
> >> >   * or ->boost_tasks, advancing the pointer to the next task in the
> >> > @@ -1211,10 +1213,14 @@ static int rcu_boost(struct rcu_node *rnp)
> >> >      */
> >> >     t = container_of(tb, struct task_struct, rcu_node_entry);
> >> >     rt_mutex_init_proxy_locked(&mtx, t);
> >> > +   /* Avoid lockdep false positives.  This rt_mutex is its own thing. */
> >> > +   lockdep_set_class_and_name(&mtx.wait_lock, &rcu_boost_class,
> >> > +                              "rcu_boost_mutex");
> >> >     t->rcu_boost_mutex = &mtx;
> >>
> >>       raw_spin_unlock_irqrestore(&rnp->lock, flags);  <====A
> >>
> >> >     rt_mutex_lock(&mtx);  /* Side effect: boosts task t's priority. */
> >> >     rt_mutex_unlock(&mtx);  /* Keep lockdep happy. */
> >> > +   local_irq_restore(flags);
> >>
> >> Does it help here?
> >> irqs are enabled at A, so we still call rt_mutex_lock() with irqs enabled.
> >>
> >> Seems we should s/raw_spin_unlock_irqrestore/raw_spin_unlock ?
> >
> > Hmmm...  The above works at least by accident, but I am clearly not
> > testing calling rt_mutex_lock(&mtx) and rt_mutex_unlock(&mtx) with
> > interrupts disabled anywhere near as heavily as I thought I was.
> >
> > I will fix this one way or the other.
> 
> Forgot to mention: if we want to suppress the lockdep warning on
> overlapping usage of rcu_read_*()/local_irq_*() like below:
> 
> rcu_read_lock();
> ...
> local_irq_disable();
> ...
> rcu_read_unlock();
> ...
> local_irq_enable();
> 
> 'rt_mutex_unlock(rbmp);' must also be surrounded by
> local_irq_save()/restore().
> 
> Untested patch is attached.

What I am doing for 3.2 (given that the merge window is likely very soon)
is removing the redundant local_irq_restore().  For 3.3, I will apply
something like your patch below to rcu_boost(), and will think about
what (if anything) to do about rcu_read_unlock_special().  With your
Signed-off-by either way, of course.
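
For reference, rcu_boost()'s tail would then look roughly like this
again for 3.2 (sketch only, not the actual commit):

	t->rcu_boost_mutex = &mtx;
	raw_spin_unlock_irqrestore(&rnp->lock, flags);  /* irqs on again */
	rt_mutex_lock(&mtx);  /* Side effect: boosts task t's priority. */
	rt_mutex_unlock(&mtx);  /* Keep lockdep happy. */
	/* No trailing local_irq_restore() -- flags were restored above. */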

							Thanx, Paul

> Thanks,
> Yong
> 
> ---
> From: Yong Zhang <yong.zhang0@gmail.com>
> Subject: [PATCH] rcu: Permit rt_mutex_unlock() with irqs disabled take#2
> 
> This makes the below RCU usage really valid (AKA, lockdep
> will not warn on it):
> 
> rcu_read_lock();
> local_irq_disable();
> rcu_read_unlock();
> local_irq_enable();
> 
> Signed-off-by: Yong Zhang <yong.zhang0@gmail.com>
> ---
>  kernel/rcutree_plugin.h |    7 +++++--
>  1 files changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
> index e7eea74..d41a9b0 100644
> --- a/kernel/rcutree_plugin.h
> +++ b/kernel/rcutree_plugin.h
> @@ -398,8 +398,11 @@ static noinline void rcu_read_unlock_special(struct task_struct *t)
> 
>  #ifdef CONFIG_RCU_BOOST
>  		/* Unboost if we were boosted. */
> -		if (rbmp)
> +		if (rbmp) {
> +			local_irq_save(flags);
>  			rt_mutex_unlock(rbmp);
> +			local_irq_restore(flags);
> +		}
>  #endif /* #ifdef CONFIG_RCU_BOOST */
> 
>  		/*
> @@ -1225,7 +1228,7 @@ static int rcu_boost(struct rcu_node *rnp)
>  	lockdep_set_class_and_name(&mtx.wait_lock, &rcu_boost_class,
>  				   "rcu_boost_mutex");
>  	t->rcu_boost_mutex = &mtx;
> -	raw_spin_unlock_irqrestore(&rnp->lock, flags);
> +	raw_spin_unlock(&rnp->lock);
>  	rt_mutex_lock(&mtx);  /* Side effect: boosts task t's priority. */
>  	rt_mutex_unlock(&mtx);  /* Keep lockdep happy. */
>  	local_irq_restore(flags);
> -- 
> 1.7.4.1

> From 7d74d1b89a4cd4c03b30e47044b716913f68bd1d Mon Sep 17 00:00:00 2001
> From: Yong Zhang <yong.zhang0@gmail.com>
> Date: Mon, 19 Sep 2011 13:42:32 +0800
> Subject: [PATCH] rcu: Permit rt_mutex_unlock() with irqs disabled take#2
> 
> This makes the below RCU usage really valid (AKA, lockdep
> will not warn on it):
> 
> rcu_read_lock();
> local_irq_disable();
> rcu_read_unlock();
> local_irq_enable();
> 
> Signed-off-by: Yong Zhang <yong.zhang0@gmail.com>
> ---
>  kernel/rcutree_plugin.h |    7 +++++--
>  1 files changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
> index e7eea74..d41a9b0 100644
> --- a/kernel/rcutree_plugin.h
> +++ b/kernel/rcutree_plugin.h
> @@ -398,8 +398,11 @@ static noinline void rcu_read_unlock_special(struct task_struct *t)
>  
>  #ifdef CONFIG_RCU_BOOST
>  		/* Unboost if we were boosted. */
> -		if (rbmp)
> +		if (rbmp) {
> +			local_irq_save(flags);
>  			rt_mutex_unlock(rbmp);
> +			local_irq_restore(flags);
> +		}
>  #endif /* #ifdef CONFIG_RCU_BOOST */
>  
>  		/*
> @@ -1225,7 +1228,7 @@ static int rcu_boost(struct rcu_node *rnp)
>  	lockdep_set_class_and_name(&mtx.wait_lock, &rcu_boost_class,
>  				   "rcu_boost_mutex");
>  	t->rcu_boost_mutex = &mtx;
> -	raw_spin_unlock_irqrestore(&rnp->lock, flags);
> +	raw_spin_unlock(&rnp->lock);
>  	rt_mutex_lock(&mtx);  /* Side effect: boosts task t's priority. */
>  	rt_mutex_unlock(&mtx);  /* Keep lockdep happy. */
>  	local_irq_restore(flags);
> -- 
> 1.7.4.1
> 


^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH tip/core/rcu 53/55] rcu: Warn when srcu_read_lock() is used in an extended quiescent state
  2011-09-06 18:00 ` [PATCH tip/core/rcu 53/55] rcu: Warn when srcu_read_lock() is used in an extended quiescent state Paul E. McKenney
@ 2011-10-04 21:03   ` Frederic Weisbecker
  2011-10-04 23:40     ` Paul E. McKenney
  0 siblings, 1 reply; 101+ messages in thread
From: Frederic Weisbecker @ 2011-10-04 21:03 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	josh, niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
	eric.dumazet, darren, patches

On Tue, Sep 06, 2011 at 11:00:47AM -0700, Paul E. McKenney wrote:
> Catch SRCU up to the other variants of RCU by making PROVE_RCU
> complain if either srcu_read_lock() or srcu_read_lock_held() are
> used from within dyntick-idle mode.
> 
> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> ---
>  include/linux/srcu.h |   25 +++++++++++++++----------
>  1 files changed, 15 insertions(+), 10 deletions(-)
> 
> diff --git a/include/linux/srcu.h b/include/linux/srcu.h
> index 58971e8..fcbaee7 100644
> --- a/include/linux/srcu.h
> +++ b/include/linux/srcu.h
> @@ -28,6 +28,7 @@
>  #define _LINUX_SRCU_H
>  
>  #include <linux/mutex.h>
> +#include <linux/rcupdate.h>
>  
>  struct srcu_struct_array {
>  	int c[2];
> @@ -60,18 +61,10 @@ int __init_srcu_struct(struct srcu_struct *sp, const char *name,
>  	__init_srcu_struct((sp), #sp, &__srcu_key); \
>  })
>  
> -# define srcu_read_acquire(sp) \
> -		lock_acquire(&(sp)->dep_map, 0, 0, 2, 1, NULL, _THIS_IP_)
> -# define srcu_read_release(sp) \
> -		lock_release(&(sp)->dep_map, 1, _THIS_IP_)
> -
>  #else /* #ifdef CONFIG_DEBUG_LOCK_ALLOC */
>  
>  int init_srcu_struct(struct srcu_struct *sp);
>  
> -# define srcu_read_acquire(sp)  do { } while (0)
> -# define srcu_read_release(sp)  do { } while (0)
> -
>  #endif /* #else #ifdef CONFIG_DEBUG_LOCK_ALLOC */
>  
>  void cleanup_srcu_struct(struct srcu_struct *sp);
> @@ -90,11 +83,23 @@ long srcu_batches_completed(struct srcu_struct *sp);
>   * read-side critical section.  In absence of CONFIG_DEBUG_LOCK_ALLOC,
>   * this assumes we are in an SRCU read-side critical section unless it can
>   * prove otherwise.
> + *
> + * Note that if the CPU is in an extended quiescent state, for example,
> + * if the CPU is in dyntick-idle mode, then rcu_read_lock_held() returns
> + * false even if the CPU did an rcu_read_lock().  The reason for this is
> + * that RCU ignores CPUs that are in extended quiescent states, so such
> + * a CPU is effectively never in an RCU read-side critical section
> + * regardless of what RCU primitives it invokes.  This state of affairs
> + * is required -- RCU would otherwise need to periodically wake up
> + * dyntick-idle CPUs, which would defeat the whole purpose of dyntick-idle
> + * mode.
>   */
>  static inline int srcu_read_lock_held(struct srcu_struct *sp)
>  {
>  	if (debug_locks)
>  		return lock_is_held(&sp->dep_map);
> +	if (rcu_check_extended_qs())
> +		return 0;

Just to warn you, While rebasing this, I'm also moving things around:

	if (!debug_locks)
		return 1;

	if (rcu_is_cpu_idle())
		return 0;

	return lock_is_held(&sp->dep_map);

Otherwise we only do the check if lock debugging is disabled,
which is not what we want I think.

>  	return 1;
>  }
>  
> @@ -150,7 +155,7 @@ static inline int srcu_read_lock(struct srcu_struct *sp) __acquires(sp)
>  {
>  	int retval = __srcu_read_lock(sp);
>  
> -	srcu_read_acquire(sp);
> +	rcu_lock_acquire(&(sp)->dep_map);
>  	return retval;
>  }
>  
> @@ -164,7 +169,7 @@ static inline int srcu_read_lock(struct srcu_struct *sp) __acquires(sp)
>  static inline void srcu_read_unlock(struct srcu_struct *sp, int idx)
>  	__releases(sp)
>  {
> -	srcu_read_release(sp);
> +	rcu_lock_release(&(sp)->dep_map);
>  	__srcu_read_unlock(sp, idx);
>  }
>  
> -- 
> 1.7.3.2
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH tip/core/rcu 53/55] rcu: Warn when srcu_read_lock() is used in an extended quiescent state
  2011-10-04 21:03   ` Frederic Weisbecker
@ 2011-10-04 23:40     ` Paul E. McKenney
  2011-10-04 23:42       ` Frederic Weisbecker
  0 siblings, 1 reply; 101+ messages in thread
From: Paul E. McKenney @ 2011-10-04 23:40 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	josh, niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
	eric.dumazet, darren, patches

On Tue, Oct 04, 2011 at 11:03:29PM +0200, Frederic Weisbecker wrote:
> On Tue, Sep 06, 2011 at 11:00:47AM -0700, Paul E. McKenney wrote:
> > Catch SRCU up to the other variants of RCU by making PROVE_RCU
> > complain if either srcu_read_lock() or srcu_read_lock_held() are
> > used from within dyntick-idle mode.
> > 
> > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > ---
> >  include/linux/srcu.h |   25 +++++++++++++++----------
> >  1 files changed, 15 insertions(+), 10 deletions(-)
> > 
> > diff --git a/include/linux/srcu.h b/include/linux/srcu.h
> > index 58971e8..fcbaee7 100644
> > --- a/include/linux/srcu.h
> > +++ b/include/linux/srcu.h
> > @@ -28,6 +28,7 @@
> >  #define _LINUX_SRCU_H
> >  
> >  #include <linux/mutex.h>
> > +#include <linux/rcupdate.h>
> >  
> >  struct srcu_struct_array {
> >  	int c[2];
> > @@ -60,18 +61,10 @@ int __init_srcu_struct(struct srcu_struct *sp, const char *name,
> >  	__init_srcu_struct((sp), #sp, &__srcu_key); \
> >  })
> >  
> > -# define srcu_read_acquire(sp) \
> > -		lock_acquire(&(sp)->dep_map, 0, 0, 2, 1, NULL, _THIS_IP_)
> > -# define srcu_read_release(sp) \
> > -		lock_release(&(sp)->dep_map, 1, _THIS_IP_)
> > -
> >  #else /* #ifdef CONFIG_DEBUG_LOCK_ALLOC */
> >  
> >  int init_srcu_struct(struct srcu_struct *sp);
> >  
> > -# define srcu_read_acquire(sp)  do { } while (0)
> > -# define srcu_read_release(sp)  do { } while (0)
> > -
> >  #endif /* #else #ifdef CONFIG_DEBUG_LOCK_ALLOC */
> >  
> >  void cleanup_srcu_struct(struct srcu_struct *sp);
> > @@ -90,11 +83,23 @@ long srcu_batches_completed(struct srcu_struct *sp);
> >   * read-side critical section.  In absence of CONFIG_DEBUG_LOCK_ALLOC,
> >   * this assumes we are in an SRCU read-side critical section unless it can
> >   * prove otherwise.
> > + *
> > + * Note that if the CPU is in an extended quiescent state, for example,
> > + * if the CPU is in dyntick-idle mode, then rcu_read_lock_held() returns
> > + * false even if the CPU did an rcu_read_lock().  The reason for this is
> > + * that RCU ignores CPUs that are in extended quiescent states, so such
> > + * a CPU is effectively never in an RCU read-side critical section
> > + * regardless of what RCU primitives it invokes.  This state of affairs
> > + * is required -- RCU would otherwise need to periodically wake up
> > + * dyntick-idle CPUs, which would defeat the whole purpose of dyntick-idle
> > + * mode.
> >   */
> >  static inline int srcu_read_lock_held(struct srcu_struct *sp)
> >  {
> >  	if (debug_locks)
> >  		return lock_is_held(&sp->dep_map);
> > +	if (rcu_check_extended_qs())
> > +		return 0;
> 
> Just to warn you, While rebasing this, I'm also moving things around:

Thank you for letting me know, should not be a problem.

> 	if (!debug_locks)
> 		return 1;
> 
> 	if (rcu_is_cpu_idle())
> 		return 0;
> 
> 	return lock_is_held(&sp->dep_map);
> 
> Otherwise we only do the check if lock debugging is disabled,
> which is not what we want I think.

Would it make sense to use this order?

	if (rcu_is_cpu_idle())
		return 0;

	if (!debug_locks)
		return 1;

	return lock_is_held(&sp->dep_map);

Given the new approach, rcu_is_cpu_idle() works whether or not debug_locks
is enabled.
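
In other words, something like the following (untested sketch):

	static inline int srcu_read_lock_held(struct srcu_struct *sp)
	{
		if (rcu_is_cpu_idle())
			return 0;  /* Extended QS: never in a read-side section. */
		if (!debug_locks)
			return 1;  /* Lockdep inactive: assume held. */
		return lock_is_held(&sp->dep_map);
	}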

							Thanx, Paul

> >  	return 1;
> >  }
> >  
> > @@ -150,7 +155,7 @@ static inline int srcu_read_lock(struct srcu_struct *sp) __acquires(sp)
> >  {
> >  	int retval = __srcu_read_lock(sp);
> >  
> > -	srcu_read_acquire(sp);
> > +	rcu_lock_acquire(&(sp)->dep_map);
> >  	return retval;
> >  }
> >  
> > @@ -164,7 +169,7 @@ static inline int srcu_read_lock(struct srcu_struct *sp) __acquires(sp)
> >  static inline void srcu_read_unlock(struct srcu_struct *sp, int idx)
> >  	__releases(sp)
> >  {
> > -	srcu_read_release(sp);
> > +	rcu_lock_release(&(sp)->dep_map);
> >  	__srcu_read_unlock(sp, idx);
> >  }
> >  
> > -- 
> > 1.7.3.2
> > 
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > Please read the FAQ at  http://www.tux.org/lkml/


^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH tip/core/rcu 53/55] rcu: Warn when srcu_read_lock() is used in an extended quiescent state
  2011-10-04 23:40     ` Paul E. McKenney
@ 2011-10-04 23:42       ` Frederic Weisbecker
  0 siblings, 0 replies; 101+ messages in thread
From: Frederic Weisbecker @ 2011-10-04 23:42 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	josh, niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
	eric.dumazet, darren, patches

On Tue, Oct 04, 2011 at 04:40:28PM -0700, Paul E. McKenney wrote:
> On Tue, Oct 04, 2011 at 11:03:29PM +0200, Frederic Weisbecker wrote:
> > On Tue, Sep 06, 2011 at 11:00:47AM -0700, Paul E. McKenney wrote:
> > > Catch SRCU up to the other variants of RCU by making PROVE_RCU
> > > complain if either srcu_read_lock() or srcu_read_lock_held() are
> > > used from within dyntick-idle mode.
> > > 
> > > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > > ---
> > >  include/linux/srcu.h |   25 +++++++++++++++----------
> > >  1 files changed, 15 insertions(+), 10 deletions(-)
> > > 
> > > diff --git a/include/linux/srcu.h b/include/linux/srcu.h
> > > index 58971e8..fcbaee7 100644
> > > --- a/include/linux/srcu.h
> > > +++ b/include/linux/srcu.h
> > > @@ -28,6 +28,7 @@
> > >  #define _LINUX_SRCU_H
> > >  
> > >  #include <linux/mutex.h>
> > > +#include <linux/rcupdate.h>
> > >  
> > >  struct srcu_struct_array {
> > >  	int c[2];
> > > @@ -60,18 +61,10 @@ int __init_srcu_struct(struct srcu_struct *sp, const char *name,
> > >  	__init_srcu_struct((sp), #sp, &__srcu_key); \
> > >  })
> > >  
> > > -# define srcu_read_acquire(sp) \
> > > -		lock_acquire(&(sp)->dep_map, 0, 0, 2, 1, NULL, _THIS_IP_)
> > > -# define srcu_read_release(sp) \
> > > -		lock_release(&(sp)->dep_map, 1, _THIS_IP_)
> > > -
> > >  #else /* #ifdef CONFIG_DEBUG_LOCK_ALLOC */
> > >  
> > >  int init_srcu_struct(struct srcu_struct *sp);
> > >  
> > > -# define srcu_read_acquire(sp)  do { } while (0)
> > > -# define srcu_read_release(sp)  do { } while (0)
> > > -
> > >  #endif /* #else #ifdef CONFIG_DEBUG_LOCK_ALLOC */
> > >  
> > >  void cleanup_srcu_struct(struct srcu_struct *sp);
> > > @@ -90,11 +83,23 @@ long srcu_batches_completed(struct srcu_struct *sp);
> > >   * read-side critical section.  In absence of CONFIG_DEBUG_LOCK_ALLOC,
> > >   * this assumes we are in an SRCU read-side critical section unless it can
> > >   * prove otherwise.
> > > + *
> > > + * Note that if the CPU is in an extended quiescent state, for example,
> > > + * if the CPU is in dyntick-idle mode, then rcu_read_lock_held() returns
> > > + * false even if the CPU did an rcu_read_lock().  The reason for this is
> > > + * that RCU ignores CPUs that are in extended quiescent states, so such
> > > + * a CPU is effectively never in an RCU read-side critical section
> > > + * regardless of what RCU primitives it invokes.  This state of affairs
> > > + * is required -- RCU would otherwise need to periodically wake up
> > > + * dyntick-idle CPUs, which would defeat the whole purpose of dyntick-idle
> > > + * mode.
> > >   */
> > >  static inline int srcu_read_lock_held(struct srcu_struct *sp)
> > >  {
> > >  	if (debug_locks)
> > >  		return lock_is_held(&sp->dep_map);
> > > +	if (rcu_check_extended_qs())
> > > +		return 0;
> > 
> > Just to warn you, While rebasing this, I'm also moving things around:
> 
> Thank you for letting me know, should not be a problem.
> 
> > 	if (!debug_locks)
> > 		return 1;
> > 
> > 	if (rcu_is_cpu_idle())
> > 		return 0;
> > 
> > 	return lock_is_held(&sp->dep_map);
> > 
> > Otherwise we only do the check if lock debugging is disabled,
> > which is not what we want I think.
> 
> Would it make sense to use this order?
> 
> 	if (rcu_is_cpu_idle())
> 		return 0;
> 
> 	if (!debug_locks)
> 		return 1;
> 
> 	return lock_is_held(&sp->dep_map);
> 
> Given the new approach, rcu_is_cpu_idle() works whether or not debug_locks
> is enabled.

Yeah why not.

I'm taking that approach.

Thanks.

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH tip/core/rcu 22/55] rcu: Add grace-period, quiescent-state, and call_rcu trace events
  2011-09-06 18:00 ` [PATCH tip/core/rcu 22/55] rcu: Add grace-period, quiescent-state, and call_rcu trace events Paul E. McKenney
@ 2011-10-17  1:33   ` Josh Triplett
  2011-10-24 12:02     ` Paul E. McKenney
  0 siblings, 1 reply; 101+ messages in thread
From: Josh Triplett @ 2011-10-17  1:33 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
	eric.dumazet, darren, patches, Paul E. McKenney

On Tue, Sep 06, 2011 at 11:00:16AM -0700, Paul E. McKenney wrote:
> From: Paul E. McKenney <paul.mckenney@linaro.org>
> 
> Add trace events to record grace-period start and end, quiescent states,
> CPUs noticing grace-period start and end, grace-period initialization,
> call_rcu() invocation, tasks blocking in RCU read-side critical sections,
> tasks exiting those same critical sections, force_quiescent_state()
> detection of dyntick-idle and offline CPUs, CPUs entering and leaving
> dyntick-idle mode (except from NMIs), CPUs coming online and going
> offline, and CPUs being kicked for staying in dyntick-idle mode for too
> long (as in many weeks, even on 32-bit systems).
> 
> Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> 
> rcu: Add the rcu flavor to callback trace events
> 
> The earlier trace events for registering RCU callbacks and for invoking
> them did not include the RCU flavor (rcu_bh, rcu_preempt, or rcu_sched).
> This commit adds the RCU flavor to those trace events.
> 
> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

Did you intend for this commit message to contain two full commit
messages, and three signoffs from you?

Also, the subject doesn't seem to cover the second half of the commit
message.

- Josh Triplett

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH tip/core/rcu 31/55] rcu: Make rcu_implicit_dynticks_qs() locals be correct size
  2011-09-06 18:00 ` [PATCH tip/core/rcu 31/55] rcu: Make rcu_implicit_dynticks_qs() locals be correct size Paul E. McKenney
@ 2011-10-17  1:43   ` Josh Triplett
  2011-10-24 12:00     ` Paul E. McKenney
  0 siblings, 1 reply; 101+ messages in thread
From: Josh Triplett @ 2011-10-17  1:43 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
	eric.dumazet, darren, patches

On Tue, Sep 06, 2011 at 11:00:25AM -0700, Paul E. McKenney wrote:
> When the ->dynticks field in the rcu_dynticks structure changed to an
> atomic_t, its size on 64-bit systems changed from 64 bits to 32 bits.
> The local variables in rcu_implicit_dynticks_qs() need to change as
> well, hence this commit.

If an atomic_t always holds 32 bits, which it appears to, then shouldn't
this use u32 rather than unsigned int?

- Josh Triplett

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH tip/core/rcu 36/55] rcu: Prevent early boot set_need_resched() from __rcu_pending()
  2011-09-06 18:00 ` [PATCH tip/core/rcu 36/55] rcu: Prevent early boot set_need_resched() from __rcu_pending() Paul E. McKenney
@ 2011-10-17  1:49   ` Josh Triplett
  2011-10-24 12:07     ` Paul E. McKenney
  0 siblings, 1 reply; 101+ messages in thread
From: Josh Triplett @ 2011-10-17  1:49 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
	eric.dumazet, darren, patches, Paul E. McKenney

On Tue, Sep 06, 2011 at 11:00:30AM -0700, Paul E. McKenney wrote:
> From: Paul E. McKenney <paul.mckenney@linaro.org>
> 
> There isn't a whole lot of point in poking the scheduler before there
> are other tasks to switch to.  This commit therefore adds a check
> for rcu_scheduler_fully_active in __rcu_pending() to suppress any
> pre-scheduler calls to set_need_resched().  The downside of this approach
> is additional runtime overhead in a reasonably hot code path.

If you're concerned about the runtime overhead, this does seem like a
perfect candidate for jump labels.

- Josh Triplett

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH tip/core/rcu 38/55] rcu: Prohibit grace periods during early boot
  2011-09-06 18:00 ` [PATCH tip/core/rcu 38/55] rcu: Prohibit grace periods during early boot Paul E. McKenney
@ 2011-10-17  1:51   ` Josh Triplett
  0 siblings, 0 replies; 101+ messages in thread
From: Josh Triplett @ 2011-10-17  1:51 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
	eric.dumazet, darren, patches, Paul E. McKenney

On Tue, Sep 06, 2011 at 11:00:32AM -0700, Paul E. McKenney wrote:
> From: Paul E. McKenney <paul.mckenney@linaro.org>
> 
> Greater use of RCU during early boot (before the scheduler is operating)
> is causing RCU to attempt to start grace periods during that time, which
> in turn is resulting in both RCU and the callback functions attempting
> to use the scheduler before it is ready.
> 
> This commit prevents these problems by prohibiting RCU grace periods
> until after the scheduler has spawned the first non-idle task.

As with patch 36, this seems like a good candidate for jump labels.

- Josh Triplett

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH tip/core/rcu 42/55] rcu: Make rcu_torture_fqs() exit loops at end of test
  2011-09-06 18:00 ` [PATCH tip/core/rcu 42/55] rcu: Make rcu_torture_fqs() exit loops at end of test Paul E. McKenney
@ 2011-10-17  1:53   ` Josh Triplett
  2011-10-24 12:10     ` Paul E. McKenney
  0 siblings, 1 reply; 101+ messages in thread
From: Josh Triplett @ 2011-10-17  1:53 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
	eric.dumazet, darren, patches

On Tue, Sep 06, 2011 at 11:00:36AM -0700, Paul E. McKenney wrote:
> The rcu_torture_fqs() function can prevent the rcutorture tests from
> completing, resulting in a hang.  This commit therefore ensures that
> rcu_torture_fqs() will exit its inner loops at the end of the test,
> and also applies the newish ULONG_CMP_LT() macro to time comparisons.
> 
> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

This seems like two entirely separate changes, which should go in
separate commits.
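
(For reference, since the commit message calls ULONG_CMP_LT() "newish":
it is the wraparound-safe comparison from rcupdate.h.  Quoting from
memory, so treat as approximate:

	#define ULONG_CMP_GE(a, b)	(ULONG_MAX / 2 >= (a) - (b))
	#define ULONG_CMP_LT(a, b)	(ULONG_MAX / 2 < (a) - (b))

	/* e.g., a jiffies check that survives counter wrap; "end_time"
	 * is a hypothetical name: */
	while (!kthread_should_stop() && ULONG_CMP_LT(jiffies, end_time))
		schedule_timeout_interruptible(1);

so plain "jiffies < end_time" comparisons get replaced by ones that
remain correct when the counter wraps.)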

- Josh Triplett

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH tip/core/rcu 44/55] rcu: wire up RCU_BOOST_PRIO for rcutree
  2011-09-06 18:00 ` [PATCH tip/core/rcu 44/55] rcu: wire up RCU_BOOST_PRIO for rcutree Paul E. McKenney
  2011-09-13 12:02   ` Mike Galbraith
@ 2011-10-17  1:55   ` Josh Triplett
  1 sibling, 0 replies; 101+ messages in thread
From: Josh Triplett @ 2011-10-17  1:55 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
	eric.dumazet, darren, patches, Mike Galbraith

On Tue, Sep 06, 2011 at 11:00:38AM -0700, Paul E. McKenney wrote:
> RCU boost threads start life at RCU_BOOST_PRIO, while others remain
> at RCU_KTHREAD_PRIO.  While here, change thread names to match other
> kthreads, and adjust rcu_yield() to not override the priority set by
> the user.  This last change sets the stage for runtime changes to
> priority in the -rt tree.

This seems like either two or three independent changes, which should go
in separate commits.  You could plausibly group the rcu_yield changes
together with the rest, but the thread naming seems like a separate
commit.

- Josh Triplett

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH tip/core/rcu 54/55] rcu: Make srcu_read_lock_held() call common lockdep-enabled function
  2011-09-06 18:00 ` [PATCH tip/core/rcu 54/55] rcu: Make srcu_read_lock_held() call common lockdep-enabled function Paul E. McKenney
@ 2011-10-17  2:03   ` Josh Triplett
  2011-10-24 12:34     ` Paul E. McKenney
  0 siblings, 1 reply; 101+ messages in thread
From: Josh Triplett @ 2011-10-17  2:03 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
	eric.dumazet, darren, patches

On Tue, Sep 06, 2011 at 11:00:48AM -0700, Paul E. McKenney wrote:
> A common debug_lockdep_rcu_enabled() function is used to check whether
> RCU lockdep splats should be reported, but srcu_read_lock() does not
> use it.  This commit therefore brings srcu_read_lock_held() up to date.

Should this patch go before 53, or merge into 53, to prevent the issues
you describe from occurring in a tree which has 53 applied but not 54?

- Josh Triplett

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH tip/core/rcu 0/55] Preview of RCU changes for 3.2
  2011-09-06 18:00 [PATCH tip/core/rcu 0/55] Preview of RCU changes for 3.2 Paul E. McKenney
                   ` (55 preceding siblings ...)
  2011-09-07 14:39 ` [PATCH tip/core/rcu 0/55] Preview of RCU changes for 3.2 Lin Ming
@ 2011-10-17  2:06 ` Josh Triplett
  2011-10-24 12:35   ` Paul E. McKenney
  56 siblings, 1 reply; 101+ messages in thread
From: Josh Triplett @ 2011-10-17  2:06 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
	eric.dumazet, darren, patches

On Tue, Sep 06, 2011 at 11:00:15AM -0700, Paul E. McKenney wrote:
> Hello!
> 
> This patchset adds RCU event tracing, improved diagnostics and
> documentation, and fixes a number of bugs, including several from an
> ongoing top-to-bottom inspection of RCU.  The patches are as follows:
> 
> 1.	Place per-CPU kthreads' stack and task struct on the corresponding
> 	node on NUMA systems (courtesy of Eric Dumazet).
> 2.	Avoid unnecessary self-wakeups for per-CPU kthreads
> 	(courtesy of Shaohua Li).
> 3,6,10,12,25,28,33.
> 	Documentations updates (some courtesy Wanlong Gao).
> 4.	Add replacement checks for blocking within an RCU read-side
> 	critical section.
> 5.	Header-file untangling part 1 of N: move rcu_head to types.h.
> 7.	Fix mismatched variable declaration (courtesy of Andi Kleen).
> 8.	Abstract out common grace-period-primitive code.
> 9.	Update rcutorture to test newish RCU API members.
> 11.	Drive RCU algorithm selection directly from SMP and PREEMPT.
> 13.	Make rcu_torture_boost() wait for callbacks before telling
> 	debug-objects that they are done.
> 14-17,20,22.
> 	Add event tracing for RCU.
> 18.	Update comments to reflect kthreads being used only when
> 	RCU priority boosting is enabled.
> 19.	Move RCU_BOOSt data declarations to alow compiler to detect
> 	mismatches.
> 20.	Make TINY_RCU use softirqs for RCU_BOOST=n.
> 23.	Simplify quiescent-state accounting.
> 24.	Stop passing rcu_read_lock_held() to rcu_dereference_protected()
> 	(courtesy of Michal Hocko).
> 26.	Remove unused and redundant RCU API members.
> 27.	Allow rcutorture's stat_interval parameter to be changed at runtime
> 	to make it easier to test RCU in guest OSes.
> 28.	Removed unused nohz_cpu_mask (courtesy of Alex Shi).
> 30.	Eliminate in_irq() checks in rcu_enter_nohz().
> 31.	Fix rcu_implicit_dynticks_qs() local-variable size mismatches.
> 32.	Make rcu_assign_pointer() unconditionally emit memory barrier
> 	to silence new gcc warnings (courtesy of Eric Dumazet).
> 34.	Move __rcu_read_lock()'s barrier within if-statement.
> 35.	Dump local stack for CPU stall warnings if cannot dump all stacks.
> 36.	Prevent early-boot set_need_resched() from __rcu_pending().
> 37.	Simplify unboosting checks.
> 38.	Prohibit RCU grace periods during early boot.
> 39.	Suppress NMI backtraces when CPU stall ends before dump.
> 40.	Avoid just-online CPU needlessly rescheding itself.
> 41.	Permit rt_mutex_unlock() with irqs disabled.
> 42-43.	Prevent end-of-test rcutorture hangs.
> 44.	Wire up RCU_BOOST_PRIO, use conventional kthread naming scheme
> 	(courtesy of Mike Galbraith).
> 45.	Check for entering dyntick-idle in RCU read-side critical section.
> 46.	Adjust RCU_FAST_NO_HZ to avoid false quiescent states.
> 47.	Avoid concurrent end of old GP with start of new GP.
> 48.	Strengthen powerpc value-returning atomic memory ordering.
> 49-51.	Detect illegal RCU use from dyntick-idle mode (courtesy of
> 	Frederic Weisbecker).
> 52.	Remove an unnecessary layer of abstraction from PROVE_RCU checking.
> 53.	Detect illegal SRCU use from dyntick-idle mode.
> 54.	Make SRCU use common lockdep-splat code.
> 55.	Placeholder patch that disables illegal tracing from dyntick-idle
> 	mode (illegal because tracing uses RCU).

I responded to a few of the patches with comments and potential issues.
I also don't consider myself qualified to review patch 48/55, "powerpc:
strengthen value-returning-atomics memory barriers".  For the rest:

Reviewed-by: Josh Triplett <josh@joshtriplett.org>

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH tip/core/rcu 31/55] rcu: Make rcu_implicit_dynticks_qs() locals be correct size
  2011-10-17  1:43   ` Josh Triplett
@ 2011-10-24 12:00     ` Paul E. McKenney
  0 siblings, 0 replies; 101+ messages in thread
From: Paul E. McKenney @ 2011-10-24 12:00 UTC (permalink / raw)
  To: Josh Triplett
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
	eric.dumazet, darren, patches

On Sun, Oct 16, 2011 at 06:43:50PM -0700, Josh Triplett wrote:
> On Tue, Sep 06, 2011 at 11:00:25AM -0700, Paul E. McKenney wrote:
> > When the ->dynticks field in the rcu_dynticks structure changed to an
> > atomic_t, its size on 64-bit systems changed from 64 bits to 32 bits.
> > The local variables in rcu_implicit_dynticks_qs() need to change as
> > well, hence this commit.
> 
> If an atomic_t always holds 32 bits, which it appears to, then shouldn't
> this use u32 rather than unsigned int?

The atomic_t definition is "int", and I need "unsigned int" to avoid
integer overflow.

But it might make sense to make atomic_t s32, in which case I would
use u32.  I don't feel very strongly about it either way, because "int"
is also defined to be 32 bits.  Maybe I should put such a patch forward
later on.
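
For the record, the overflow concern is the usual counter-wrap one.
An illustrative sketch (not the actual code; field names from memory):

	/* In rcu_implicit_dynticks_qs():
	 * ->dynticks is an atomic_t, hence 32 bits even on 64-bit systems,
	 * so snapshots of it must be 32 bits as well.  Unsigned arithmetic
	 * keeps the wrapped difference well defined. */
	unsigned int curr = (unsigned int)atomic_read(&rdp->dynticks->dynticks);
	unsigned int snap = (unsigned int)rdp->dynticks_snap;

	if ((curr & 0x1) == 0 || (curr - snap) >= 2)
		return 1;  /* CPU is in, or passed through, dyntick-idle mode. */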

							Thanx, Paul


^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH tip/core/rcu 22/55] rcu: Add grace-period, quiescent-state, and call_rcu trace events
  2011-10-17  1:33   ` Josh Triplett
@ 2011-10-24 12:02     ` Paul E. McKenney
  0 siblings, 0 replies; 101+ messages in thread
From: Paul E. McKenney @ 2011-10-24 12:02 UTC (permalink / raw)
  To: Josh Triplett
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
	eric.dumazet, darren, patches, Paul E. McKenney

On Sun, Oct 16, 2011 at 06:33:53PM -0700, Josh Triplett wrote:
> On Tue, Sep 06, 2011 at 11:00:16AM -0700, Paul E. McKenney wrote:
> > From: Paul E. McKenney <paul.mckenney@linaro.org>
> > 
> > Add trace events to record grace-period start and end, quiescent states,
> > CPUs noticing grace-period start and end, grace-period initialization,
> > call_rcu() invocation, tasks blocking in RCU read-side critical sections,
> > tasks exiting those same critical sections, force_quiescent_state()
> > detection of dyntick-idle and offline CPUs, CPUs entering and leaving
> > dyntick-idle mode (except from NMIs), CPUs coming online and going
> > offline, and CPUs being kicked for staying in dyntick-idle mode for too
> > long (as in many weeks, even on 32-bit systems).
> > 
> > Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
> > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > 
> > rcu: Add the rcu flavor to callback trace events
> > 
> > The earlier trace events for registering RCU callbacks and for invoking
> > them did not include the RCU flavor (rcu_bh, rcu_preempt, or rcu_sched).
> > This commit adds the RCU flavor to those trace events.
> > 
> > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> 
> Did you intend for this commit message to contain two full commit
> messages, and three signoffs from you?

No, messed up a rebase.  <red face>

> Also, the subject doesn't seem to cover the second half of the commit
> message.

This is indeed the first set of trace events where the name of the RCU
flavor was important.

							Thanx, Paul


^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH tip/core/rcu 36/55] rcu: Prevent early boot set_need_resched() from __rcu_pending()
  2011-10-17  1:49   ` Josh Triplett
@ 2011-10-24 12:07     ` Paul E. McKenney
  0 siblings, 0 replies; 101+ messages in thread
From: Paul E. McKenney @ 2011-10-24 12:07 UTC (permalink / raw)
  To: Josh Triplett
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
	eric.dumazet, darren, patches, Paul E. McKenney

On Sun, Oct 16, 2011 at 06:49:15PM -0700, Josh Triplett wrote:
> On Tue, Sep 06, 2011 at 11:00:30AM -0700, Paul E. McKenney wrote:
> > From: Paul E. McKenney <paul.mckenney@linaro.org>
> > 
> > There isn't a whole lot of point in poking the scheduler before there
> > are other tasks to switch to.  This commit therefore adds a check
> > for rcu_scheduler_fully_active in __rcu_pending() to suppress any
> > pre-scheduler calls to set_need_resched().  The downside of this approach
> > is additional runtime overhead in a reasonably hot code path.
> 
> If you're concerned about the runtime overhead, this does seem like a
> perfect candidate for jump labels.

I have added this to my todo list, but would welcome a patch.
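
If someone does beat me to it, the shape would presumably be something
like the following (jump-label API of this era, quoted from memory, so
treat the names as approximate):

	static struct jump_label_key rcu_sched_active_key;

	/* In __rcu_pending()'s hot path: compiles down to a patched
	 * no-op branch, so the fully-booted case pays nothing. */
	if (!static_branch(&rcu_sched_active_key))
		return 0;	/* Too early in boot: skip set_need_resched(). */

	/* And once, when the scheduler spawns the first non-idle task: */
	jump_label_inc(&rcu_sched_active_key);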

							Thanx, Paul


^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH tip/core/rcu 42/55] rcu: Make rcu_torture_fqs() exit loops at end of test
  2011-10-17  1:53   ` Josh Triplett
@ 2011-10-24 12:10     ` Paul E. McKenney
  0 siblings, 0 replies; 101+ messages in thread
From: Paul E. McKenney @ 2011-10-24 12:10 UTC (permalink / raw)
  To: Josh Triplett
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
	eric.dumazet, darren, patches

On Sun, Oct 16, 2011 at 06:53:29PM -0700, Josh Triplett wrote:
> On Tue, Sep 06, 2011 at 11:00:36AM -0700, Paul E. McKenney wrote:
> > The rcu_torture_fqs() function can prevent the rcutorture tests from
> > completing, resulting in a hang.  This commit therefore ensures that
> > rcu_torture_fqs() will exit its inner loops at the end of the test,
> > and also applies the newish ULONG_CMP_LT() macro to time comparisons.
> > 
> > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> 
> This seems like two entirely separate changes, which should go in
> separate commits.

Interesting.  I was thinking of this as fixing hangs in rcutorture,
and was feeling somewhat bad about having the separate commit for the
part of the fix that I missed.  ;-)
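For reference, the newish macro mentioned above is the wrap-safe
unsigned comparison from include/linux/rcupdate.h (quoted from memory,
so check the tree):

/* (a) is "before" (b) when the unsigned difference a - b wraps into
 * the upper half of the unsigned long range. */
#define ULONG_CMP_GE(a, b)	(ULONG_MAX / 2 >= (a) - (b))
#define ULONG_CMP_LT(a, b)	(ULONG_MAX / 2 < (a) - (b))

/* This keeps a loop bound correct even if jiffies wraps mid-test
 * (end_time here is illustrative, not the in-tree variable name): */
while (ULONG_CMP_LT(jiffies, end_time) && !kthread_should_stop())
	schedule_timeout_interruptible(1);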

							Thanx, Paul


^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH tip/core/rcu 54/55] rcu: Make srcu_read_lock_held() call common lockdep-enabled function
  2011-10-17  2:03   ` Josh Triplett
@ 2011-10-24 12:34     ` Paul E. McKenney
  0 siblings, 0 replies; 101+ messages in thread
From: Paul E. McKenney @ 2011-10-24 12:34 UTC (permalink / raw)
  To: Josh Triplett
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
	eric.dumazet, darren, patches

On Sun, Oct 16, 2011 at 07:03:10PM -0700, Josh Triplett wrote:
> On Tue, Sep 06, 2011 at 11:00:48AM -0700, Paul E. McKenney wrote:
> > A common debug_lockdep_rcu_enabled() function is used to check whether
> > RCU lockdep splats should be reported, but srcu_read_lock() does not
> > use it.  This commit therefore brings srcu_read_lock_held() up to date.
> 
> Should this patch go before 53, or merge into 53, to prevent the issues
> you describe from occurring in a tree which has 53 applied but not 54?

It is OK as is because the SRCU code already does something reasonably
sane.  This is more about code cleanliness than about functionality.
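For context, the cleaned-up srcu_read_lock_held() ends up along these
lines (a sketch from memory rather than a verbatim quote of the patch):

static inline int srcu_read_lock_held(struct srcu_struct *sp)
{
	if (!debug_lockdep_rcu_enabled())
		return 1;	/* Lockdep not checking, so never splat. */
	return lock_is_held(&sp->dep_map);
}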

							Thanx, Paul


^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH tip/core/rcu 0/55] Preview of RCU changes for 3.2
  2011-10-17  2:06 ` Josh Triplett
@ 2011-10-24 12:35   ` Paul E. McKenney
  0 siblings, 0 replies; 101+ messages in thread
From: Paul E. McKenney @ 2011-10-24 12:35 UTC (permalink / raw)
  To: Josh Triplett
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
	eric.dumazet, darren, patches

On Sun, Oct 16, 2011 at 07:06:13PM -0700, Josh Triplett wrote:
> On Tue, Sep 06, 2011 at 11:00:15AM -0700, Paul E. McKenney wrote:
> > Hello!
> > 
> > This patchset adds RCU event tracing, improved diagnostics and
> > documentation, and fixes a number of bugs, including several from an
> > ongoing top-to-bottom inspection of RCU.  The patches are as follows:
> > 
> > 1.	Place per-CPU kthreads' stack and task struct on the corresponding
> > 	node on NUMA systems (courtesy of Eric Dumazet).
> > 2.	Avoid unnecessary self-wakeups for per-CPU kthreads
> > 	(courtesy of Shaohua Li).
> > 3,6,10,12,25,28,33.
> > 	Documentation updates (some courtesy of Wanlong Gao).
> > 4.	Add replacement checks for blocking within an RCU read-side
> > 	critical section.
> > 5.	Header-file untangling part 1 of N: move rcu_head to types.h.
> > 7.	Fix mismatched variable declaration (courtesy of Andi Kleen).
> > 8.	Abstract out common grace-period-primitive code.
> > 9.	Update rcutorture to test newish RCU API members.
> > 11.	Drive RCU algorithm selection directly from SMP and PREEMPT.
> > 13.	Make rcu_torture_boost() wait for callbacks before telling
> > 	debug-objects that they are done.
> > 14-17,20,22.
> > 	Add event tracing for RCU.
> > 18.	Update comments to reflect kthreads being used only when
> > 	RCU priority boosting is enabled.
> > 19.	Move RCU_BOOST data declarations to allow the compiler to detect
> > 	mismatches.
> > 21.	Make TINY_RCU also use softirq for RCU_BOOST=n.
> > 23.	Simplify quiescent-state accounting.
> > 24.	Stop passing rcu_read_lock_held() to rcu_dereference_protected()
> > 	(courtesy of Michal Hocko).
> > 26.	Remove unused and redundant RCU API members.
> > 27.	Allow rcutorture's stat_interval parameter to be changed at runtime
> > 	to make it easier to test RCU in guest OSes.
> > 29.	Remove unused nohz_cpu_mask (courtesy of Alex Shi).
> > 30.	Eliminate in_irq() checks in rcu_enter_nohz().
> > 31.	Fix rcu_implicit_dynticks_qs() local-variable size mismatches.
> > 32.	Make rcu_assign_pointer() unconditionally emit memory barrier
> > 	to silence new gcc warnings (courtesy of Eric Dumazet).
> > 34.	Move __rcu_read_unlock()'s barrier() within if-statement.
> > 35.	Dump the local stack for CPU stall warnings if all stacks cannot be dumped.
> > 36.	Prevent early-boot set_need_resched() from __rcu_pending().
> > 37.	Simplify unboosting checks.
> > 38.	Prohibit RCU grace periods during early boot.
> > 39.	Suppress NMI backtraces when CPU stall ends before dump.
> > 40.	Avoid having a just-onlined CPU needlessly reschedule itself.
> > 41.	Permit rt_mutex_unlock() with irqs disabled.
> > 42-43.	Prevent end-of-test rcutorture hangs.
> > 44.	Wire up RCU_BOOST_PRIO, use conventional kthread naming scheme
> > 	(courtesy of Mike Galbraith).
> > 45.	Check for entering dyntick-idle in RCU read-side critical section.
> > 46.	Adjust RCU_FAST_NO_HZ to avoid false quiescent states.
> > 47.	Avoid concurrent end of old GP with start of new GP.
> > 48.	Strengthen powerpc value-returning atomic memory ordering.
> > 49-51.	Detect illegal RCU use from dyntick-idle mode (courtesy of
> > 	Frederic Weisbecker).
> > 52.	Remove an unnecessary layer of abstraction from PROVE_RCU checking.
> > 53.	Detect illegal SRCU use from dyntick-idle mode.
> > 54.	Make SRCU use common lockdep-splat code.
> > 55.	Placeholder patch that disables illegal tracing from dyntick-idle
> > 	mode (illegal because tracing uses RCU).
> 
> I responded to a few of the patches with comments and potential issues.
> I also don't consider myself qualified to review patch 48/55, "powerpc:
> strengthen value-returning-atomics memory barriers".  For the rest:
> 
> Reviewed-by: Josh Triplett <josh@joshtriplett.org>

Thank you very much for your review and comments!!!

							Thanx, Paul


^ permalink raw reply	[flat|nested] 101+ messages in thread

end of thread, other threads:[~2011-10-24 12:36 UTC | newest]

Thread overview: 101+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-09-06 18:00 [PATCH tip/core/rcu 0/55] Preview of RCU changes for 3.2 Paul E. McKenney
2011-09-06 17:59 ` [PATCH tip/core/rcu 01/55] rcu: Use kthread_create_on_node() Paul E. McKenney
2011-09-06 17:59 ` [PATCH tip/core/rcu 02/55] rcu: Avoid unnecessary self-wakeup of per-CPU kthreads Paul E. McKenney
2011-09-06 17:59 ` [PATCH tip/core/rcu 03/55] rcu: Update documentation to flag RCU_BOOST trace information Paul E. McKenney
2011-09-06 17:59 ` [PATCH tip/core/rcu 04/55] rcu: Restore checks for blocking in RCU read-side critical sections Paul E. McKenney
2011-09-06 17:59 ` [PATCH tip/core/rcu 05/55] rcu: Move rcu_head definition to types.h Paul E. McKenney
2011-09-07 18:31   ` Paul Gortmaker
2011-09-07 22:11     ` Paul E. McKenney
2011-09-06 18:00 ` [PATCH tip/core/rcu 06/55] rcu: Update rcutorture documentation Paul E. McKenney
2011-09-06 18:00 ` [PATCH tip/core/rcu 07/55] rcu: Fix mismatched variable in rcutree_trace.c Paul E. McKenney
2011-09-06 18:00 ` [PATCH tip/core/rcu 08/55] rcu: Abstract common code for RCU grace-period-wait primitives Paul E. McKenney
2011-09-06 18:00 ` [PATCH tip/core/rcu 09/55] rcu: Catch rcutorture up to new RCU API additions Paul E. McKenney
2011-09-06 18:00 ` [PATCH tip/core/rcu 10/55] rcu: Fix RCU's NMI documentation Paul E. McKenney
2011-09-06 18:00 ` [PATCH tip/core/rcu 11/55] rcu: Drive configuration directly from SMP and PREEMPT Paul E. McKenney
2011-09-06 18:00 ` [PATCH tip/core/rcu 12/55] rcu: Fix pathnames in documentation Paul E. McKenney
2011-09-06 18:00 ` [PATCH tip/core/rcu 13/55] rcu: Don't destroy rcu_torture_boost() callback until it is done Paul E. McKenney
2011-09-06 18:00 ` [PATCH tip/core/rcu 14/55] rcu: Add event-tracing for RCU callback invocation Paul E. McKenney
2011-09-06 18:00 ` [PATCH tip/core/rcu 15/55] rcu: Event-trace markers for computing RCU CPU utilization Paul E. McKenney
2011-09-06 18:00 ` [PATCH tip/core/rcu 16/55] rcu: Put names into TINY_RCU structures under RCU_TRACE Paul E. McKenney
2011-09-06 18:00 ` [PATCH tip/core/rcu 17/55] rcu: Add RCU type to callback-invocation tracing Paul E. McKenney
2011-09-06 18:00 ` [PATCH tip/core/rcu 18/55] rcu: Update comments to reflect softirqs vs. kthreads Paul E. McKenney
2011-09-06 18:00 ` [PATCH tip/core/rcu 19/55] rcu: Move RCU_BOOST declarations to allow compiler checking Paul E. McKenney
2011-09-06 18:00 ` [PATCH tip/core/rcu 20/55] rcu: Add event-trace markers to TREE_RCU kthreads Paul E. McKenney
2011-09-06 18:00 ` [PATCH tip/core/rcu 21/55] rcu: Make TINY_RCU also use softirq for RCU_BOOST=n Paul E. McKenney
2011-09-06 18:00 ` [PATCH tip/core/rcu 22/55] rcu: Add grace-period, quiescent-state, and call_rcu trace events Paul E. McKenney
2011-10-17  1:33   ` Josh Triplett
2011-10-24 12:02     ` Paul E. McKenney
2011-09-06 18:00 ` [PATCH tip/core/rcu 23/55] rcu: Simplify quiescent-state accounting Paul E. McKenney
2011-09-06 18:00 ` [PATCH tip/core/rcu 24/55] rcu: Not necessary to pass rcu_read_lock_held() to rcu_dereference_protected() Paul E. McKenney
2011-09-06 18:00 ` [PATCH tip/core/rcu 25/55] rcu: Update documentation for additional RCU lockdep functions Paul E. McKenney
2011-09-06 18:00 ` [PATCH tip/core/rcu 26/55] rcu: Remove unused and redundant interfaces Paul E. McKenney
2011-09-06 18:00 ` [PATCH tip/core/rcu 27/55] rcu: Allow rcutorture's stat_interval parameter to be changed at runtime Paul E. McKenney
2011-09-06 18:00 ` [PATCH tip/core/rcu 28/55] rcu: Document interpretation of RCU-lockdep splats Paul E. McKenney
2011-09-06 18:00 ` [PATCH tip/core/rcu 29/55] nohz: Remove nohz_cpu_mask Paul E. McKenney
2011-09-06 18:00 ` [PATCH tip/core/rcu 30/55] rcu: Eliminate in_irq() checks in rcu_enter_nohz() Paul E. McKenney
2011-09-06 18:00 ` [PATCH tip/core/rcu 31/55] rcu: Make rcu_implicit_dynticks_qs() locals be correct size Paul E. McKenney
2011-10-17  1:43   ` Josh Triplett
2011-10-24 12:00     ` Paul E. McKenney
2011-09-06 18:00 ` [PATCH tip/core/rcu 32/55] rcu: Make rcu_assign_pointer() unconditionally insert a memory barrier Paul E. McKenney
2011-09-06 18:00 ` [PATCH tip/core/rcu 33/55] rcu: Improve rcu_assign_pointer() and RCU_INIT_POINTER() documentation Paul E. McKenney
2011-09-06 18:00 ` [PATCH tip/core/rcu 34/55] rcu: Move __rcu_read_unlock()'s barrier() within if-statement Paul E. McKenney
2011-09-06 18:00 ` [PATCH tip/core/rcu 35/55] rcu: Dump local stack if cannot dump all CPUs' stacks Paul E. McKenney
2011-09-06 18:00 ` [PATCH tip/core/rcu 36/55] rcu: Prevent early boot set_need_resched() from __rcu_pending() Paul E. McKenney
2011-10-17  1:49   ` Josh Triplett
2011-10-24 12:07     ` Paul E. McKenney
2011-09-06 18:00 ` [PATCH tip/core/rcu 37/55] rcu: Simplify unboosting checks Paul E. McKenney
2011-09-06 18:00 ` [PATCH tip/core/rcu 38/55] rcu: Prohibit grace periods during early boot Paul E. McKenney
2011-10-17  1:51   ` Josh Triplett
2011-09-06 18:00 ` [PATCH tip/core/rcu 39/55] rcu: Suppress NMI backtraces when stall ends before dump Paul E. McKenney
2011-09-06 18:00 ` [PATCH tip/core/rcu 40/55] rcu: Avoid having just-onlined CPU resched itself when RCU is idle Paul E. McKenney
2011-09-06 18:00 ` [PATCH tip/core/rcu 41/55] rcu: Permit rt_mutex_unlock() with irqs disabled Paul E. McKenney
2011-09-18  4:09   ` Yong Zhang
2011-09-19  4:14     ` Paul E. McKenney
2011-09-19  5:49       ` Yong Zhang
2011-09-20 14:57         ` Paul E. McKenney
2011-09-06 18:00 ` [PATCH tip/core/rcu 42/55] rcu: Make rcu_torture_fqs() exit loops at end of test Paul E. McKenney
2011-10-17  1:53   ` Josh Triplett
2011-10-24 12:10     ` Paul E. McKenney
2011-09-06 18:00 ` [PATCH tip/core/rcu 43/55] rcu: Make rcu_torture_boost() " Paul E. McKenney
2011-09-06 18:00 ` [PATCH tip/core/rcu 44/55] rcu: wire up RCU_BOOST_PRIO for rcutree Paul E. McKenney
2011-09-13 12:02   ` Mike Galbraith
2011-09-13 15:34     ` Paul E. McKenney
2011-09-13 16:04       ` Mike Galbraith
2011-09-13 20:50         ` Paul E. McKenney
2011-10-17  1:55   ` Josh Triplett
2011-09-06 18:00 ` [PATCH tip/core/rcu 45/55] rcu: check for entering dyntick-idle mode while in read-side critical section Paul E. McKenney
2011-09-06 18:00 ` [PATCH tip/core/rcu 46/55] rcu: Remove rcu_needs_cpu_flush() to avoid false quiescent states Paul E. McKenney
2011-09-06 18:00 ` [PATCH tip/core/rcu 47/55] rcu: Move propagation of ->completed from rcu_start_gp() to rcu_report_qs_rsp() Paul E. McKenney
2011-09-06 18:00 ` [PATCH tip/core/rcu 48/55] powerpc: strengthen value-returning-atomics memory barriers Paul E. McKenney
2011-09-09 17:23   ` Olof Johansson
2011-09-09 17:23     ` Olof Johansson
2011-09-09 17:34     ` Paul E. McKenney
2011-09-09 17:34       ` Paul E. McKenney
2011-09-09 18:43       ` Olof Johansson
2011-09-09 18:43         ` Olof Johansson
2011-09-06 18:00 ` [PATCH tip/core/rcu 49/55] rcu: Detect illegal rcu dereference in extended quiescent state Paul E. McKenney
2011-09-06 18:00 ` [PATCH tip/core/rcu 50/55] rcu: Inform the user about dynticks-idle mode on PROVE_RCU warning Paul E. McKenney
2011-09-06 18:00 ` [PATCH tip/core/rcu 51/55] rcu: Warn when rcu_read_lock() is used in extended quiescent state Paul E. McKenney
2011-09-06 18:00 ` [PATCH tip/core/rcu 52/55] rcu: Remove one layer of abstraction from PROVE_RCU checking Paul E. McKenney
2011-09-06 18:00 ` [PATCH tip/core/rcu 53/55] rcu: Warn when srcu_read_lock() is used in an extended quiescent state Paul E. McKenney
2011-10-04 21:03   ` Frederic Weisbecker
2011-10-04 23:40     ` Paul E. McKenney
2011-10-04 23:42       ` Frederic Weisbecker
2011-09-06 18:00 ` [PATCH tip/core/rcu 54/55] rcu: Make srcu_read_lock_held() call common lockdep-enabled function Paul E. McKenney
2011-10-17  2:03   ` Josh Triplett
2011-10-24 12:34     ` Paul E. McKenney
2011-09-06 18:00 ` [PATCH tip/core/rcu 55/55] powerpc: Work around tracing from dyntick-idle mode Paul E. McKenney
2011-09-07 10:00   ` Benjamin Herrenschmidt
2011-09-07 13:44     ` Paul E. McKenney
2011-09-13 19:13       ` Frederic Weisbecker
2011-09-13 19:50         ` Paul E. McKenney
2011-09-13 20:49           ` Benjamin Herrenschmidt
2011-09-15 14:53             ` Frederic Weisbecker
2011-09-16 12:24             ` Frederic Weisbecker
2011-09-07 14:39 ` [PATCH tip/core/rcu 0/55] Preview of RCU changes for 3.2 Lin Ming
2011-09-08 17:41   ` Paul E. McKenney
2011-09-08 19:23     ` Thomas Gleixner
2011-09-08 20:48       ` Paul E. McKenney
2011-09-12 16:24         ` Paul E. McKenney
2011-10-17  2:06 ` Josh Triplett
2011-10-24 12:35   ` Paul E. McKenney
