* [PATCH tip/core/rcu 0/15] Improvements to rcu_barrier() and RT response on big systems
@ 2012-06-15 21:05 Paul E. McKenney
  2012-06-15 21:05 ` [PATCH tip/core/rcu 01/15] rcu: Control RCU_FANOUT_LEAF from boot-time parameter Paul E. McKenney
  0 siblings, 1 reply; 50+ messages in thread
From: Paul E. McKenney @ 2012-06-15 21:05 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, fweisbec, patches

Hello!

This patch series contains improvements to the rcu_barrier() family of
primitives and to latency for large systems.  These are in a single
series due to conflicts that would otherwise occur.  The individual
patches are as follows:

1.	Allow the value for RCU_FANOUT_LEAF to be increased (but not
	decreased!) via a boot-time parameter, in turn allowing a
	default kernel build to be adjusted for low RCU grace-period
	initialization latency on a large system.
2.	Work around the new default NR_CPUS=4096 by checking the
	boot-time-computed nr_cpu_ids, allowing this to override
	NR_CPUS.  This again reduces RCU grace-period initialization
	latency for kernels built with large NR_CPUS running on small
	systems.
3.	Shrink a macro argument to keep lines under 80 characters.
4.	Add a pointer in the rcu_state structure to the corresponding
	member of the call_rcu() family of functions in preparation
	for increasing rcu_barrier() concurrency.
5.	Move _rcu_barrier()'s rcu_head structures to the per-CPU
	per-RCU-flavor rcu_data structures so that different flavors
	of rcu_barrier() do not need to contend for the rcu_head
	structures.
6.	Move rcu_barrier()'s rcu_barrier_cpu_count global variable to
	a new ->barrier_cpu_count field in the rcu_state structure, so
	that different flavors of rcu_barrier() do not need to contend
	for this variable.
7.	Move rcu_barrier()'s rcu_barrier_completion global variable to
	a new ->barrier_completion field in the rcu_state structure, so
	that different flavors of rcu_barrier() do not need to contend
	for this variable.
8.	Move rcu_barrier()'s rcu_barrier_mutex global variable to
	a new ->barrier_mutex field in the rcu_state structure, so that
	different flavors of rcu_barrier() do not need to contend for
	this variable.
9.	Introduce counter scheme to allow multiple concurrent executions
	of a given flavor of rcu_barrier() to share work.
10.	Add event tracing for _rcu_barrier().
11.	Add debugfs tracing for _rcu_barrier().
12.	Remove unnecessary per-CPU variable argument from
	__rcu_process_callbacks().
13.	Introduce for_each_rcu_flavor() iterator and use it.  This provides
	a nicer way to iterate through the RCU flavors to do per-flavor
	processing.
14.	Apply the for_each_rcu_flavor() iterator to debugfs tracing.
15.	Remove dead-code gcc helper from code that is no longer ever dead.

							Thanx, Paul

 b/Documentation/kernel-parameters.txt |    4 
 b/include/trace/events/rcu.h          |   45 +++++++
 b/kernel/rcutree.c                    |   97 +++++++++++++--
 b/kernel/rcutree.h                    |   23 ++-
 b/kernel/rcutree_plugin.h             |    4 
 b/kernel/rcutree_trace.c              |    2 
 kernel/rcutree.c                      |  213 +++++++++++++++++++++-------------
 kernel/rcutree.h                      |   22 ++-
 kernel/rcutree_plugin.h               |  126 --------------------
 kernel/rcutree_trace.c                |  134 ++++++++++++---------
 10 files changed, 379 insertions(+), 291 deletions(-)



* [PATCH tip/core/rcu 01/15] rcu: Control RCU_FANOUT_LEAF from boot-time parameter
  2012-06-15 21:05 [PATCH tip/core/rcu 0/15] Improvements to rcu_barrier() and RT response on big systems Paul E. McKenney
@ 2012-06-15 21:05 ` Paul E. McKenney
  2012-06-15 21:05   ` [PATCH tip/core/rcu 02/15] rcu: Size rcu_node tree from nr_cpu_ids rather than NR_CPUS Paul E. McKenney
                     ` (14 more replies)
  0 siblings, 15 replies; 50+ messages in thread
From: Paul E. McKenney @ 2012-06-15 21:05 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, fweisbec, patches, Paul E. McKenney

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>

Although making RCU_FANOUT_LEAF a kernel configuration parameter rather
than a fixed constant makes it easier for people to decrease cache-miss
overhead for large systems, it is of little help for people who must
run a single pre-built kernel binary.

This commit therefore allows the value of RCU_FANOUT_LEAF to be
increased (but not decreased!) via a boot-time parameter named
rcutree.rcu_fanout_leaf.
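
For example, a large system running a distro kernel built with the default
CONFIG_RCU_FANOUT_LEAF could raise the leaf-level fanout by adding something
like the following to the kernel command line (the value 64 is purely
illustrative):

	rcutree.rcu_fanout_leaf=64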

Reported-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 Documentation/kernel-parameters.txt |    4 ++
 kernel/rcutree.c                    |   97 ++++++++++++++++++++++++++++++-----
 kernel/rcutree.h                    |   23 +++++----
 kernel/rcutree_plugin.h             |    4 +-
 kernel/rcutree_trace.c              |    2 +-
 5 files changed, 104 insertions(+), 26 deletions(-)

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index c45513d..88bd3ef 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -2367,6 +2367,10 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 			Set maximum number of finished RCU callbacks to process
 			in one batch.
 
+	rcutree.rcu_fanout_leaf=	[KNL,BOOT]
+			Increase the number of CPUs assigned to each
+			leaf rcu_node structure.
+
 	rcutree.qhimark=	[KNL,BOOT]
 			Set threshold of queued
 			RCU callbacks over which batch limiting is disabled.
diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index 0da7b88..a151184 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -60,17 +60,10 @@
 
 /* Data structures. */
 
-static struct lock_class_key rcu_node_class[NUM_RCU_LVLS];
+static struct lock_class_key rcu_node_class[RCU_NUM_LVLS];
 
 #define RCU_STATE_INITIALIZER(structname) { \
 	.level = { &structname##_state.node[0] }, \
-	.levelcnt = { \
-		NUM_RCU_LVL_0,  /* root of hierarchy. */ \
-		NUM_RCU_LVL_1, \
-		NUM_RCU_LVL_2, \
-		NUM_RCU_LVL_3, \
-		NUM_RCU_LVL_4, /* == MAX_RCU_LVLS */ \
-	}, \
 	.fqs_state = RCU_GP_IDLE, \
 	.gpnum = -300, \
 	.completed = -300, \
@@ -91,6 +84,19 @@ DEFINE_PER_CPU(struct rcu_data, rcu_bh_data);
 
 static struct rcu_state *rcu_state;
 
+/* Increase (but not decrease) the CONFIG_RCU_FANOUT_LEAF at boot time. */
+static int rcu_fanout_leaf = CONFIG_RCU_FANOUT_LEAF;
+module_param(rcu_fanout_leaf, int, 0);
+int rcu_num_lvls = RCU_NUM_LVLS;
+static int num_rcu_lvl[] = {  /* Number of rcu_nodes at specified level. */
+	NUM_RCU_LVL_0,
+	NUM_RCU_LVL_1,
+	NUM_RCU_LVL_2,
+	NUM_RCU_LVL_3,
+	NUM_RCU_LVL_4,
+};
+int rcu_num_nodes = NUM_RCU_NODES; /* Total number of rcu_nodes in use. */
+
 /*
  * The rcu_scheduler_active variable transitions from zero to one just
  * before the first task is spawned.  So when this variable is zero, RCU
@@ -2571,9 +2577,9 @@ static void __init rcu_init_levelspread(struct rcu_state *rsp)
 {
 	int i;
 
-	for (i = NUM_RCU_LVLS - 1; i > 0; i--)
+	for (i = rcu_num_lvls - 1; i > 0; i--)
 		rsp->levelspread[i] = CONFIG_RCU_FANOUT;
-	rsp->levelspread[0] = CONFIG_RCU_FANOUT_LEAF;
+	rsp->levelspread[0] = rcu_fanout_leaf;
 }
 #else /* #ifdef CONFIG_RCU_FANOUT_EXACT */
 static void __init rcu_init_levelspread(struct rcu_state *rsp)
@@ -2583,7 +2589,7 @@ static void __init rcu_init_levelspread(struct rcu_state *rsp)
 	int i;
 
 	cprv = NR_CPUS;
-	for (i = NUM_RCU_LVLS - 1; i >= 0; i--) {
+	for (i = rcu_num_lvls - 1; i >= 0; i--) {
 		ccur = rsp->levelcnt[i];
 		rsp->levelspread[i] = (cprv + ccur - 1) / ccur;
 		cprv = ccur;
@@ -2610,13 +2616,15 @@ static void __init rcu_init_one(struct rcu_state *rsp,
 
 	/* Initialize the level-tracking arrays. */
 
-	for (i = 1; i < NUM_RCU_LVLS; i++)
+	for (i = 0; i < rcu_num_lvls; i++)
+		rsp->levelcnt[i] = num_rcu_lvl[i];
+	for (i = 1; i < rcu_num_lvls; i++)
 		rsp->level[i] = rsp->level[i - 1] + rsp->levelcnt[i - 1];
 	rcu_init_levelspread(rsp);
 
 	/* Initialize the elements themselves, starting from the leaves. */
 
-	for (i = NUM_RCU_LVLS - 1; i >= 0; i--) {
+	for (i = rcu_num_lvls - 1; i >= 0; i--) {
 		cpustride *= rsp->levelspread[i];
 		rnp = rsp->level[i];
 		for (j = 0; j < rsp->levelcnt[i]; j++, rnp++) {
@@ -2646,7 +2654,7 @@ static void __init rcu_init_one(struct rcu_state *rsp,
 	}
 
 	rsp->rda = rda;
-	rnp = rsp->level[NUM_RCU_LVLS - 1];
+	rnp = rsp->level[rcu_num_lvls - 1];
 	for_each_possible_cpu(i) {
 		while (i > rnp->grphi)
 			rnp++;
@@ -2655,11 +2663,72 @@ static void __init rcu_init_one(struct rcu_state *rsp,
 	}
 }
 
+/*
+ * Compute the rcu_node tree geometry from kernel parameters.  This cannot
+ * replace the definitions in rcutree.h because those are needed to size
+ * the ->node array in the rcu_state structure.
+ */
+static void __init rcu_init_geometry(void)
+{
+	int i;
+	int j;
+	int n = NR_CPUS;
+	int rcu_capacity[MAX_RCU_LVLS + 1];
+
+	/* If the compile-time values are accurate, just leave. */
+	if (rcu_fanout_leaf == CONFIG_RCU_FANOUT_LEAF)
+		return;
+
+	/*
+	 * Compute the number of CPUs that can be handled by an rcu_node tree
+	 * with the given number of levels.  Setting rcu_capacity[0] makes
+	 * some of the arithmetic easier.
+	 */
+	rcu_capacity[0] = 1;
+	rcu_capacity[1] = rcu_fanout_leaf;
+	for (i = 2; i <= MAX_RCU_LVLS; i++)
+		rcu_capacity[i] = rcu_capacity[i - 1] * CONFIG_RCU_FANOUT;
+
+	/*
+	 * The boot-time rcu_fanout_leaf parameter is only permitted
+	 * to increase the leaf-level fanout, not decrease it.  Of course,
+	 * the leaf-level fanout cannot exceed the number of bits in
+	 * the rcu_node masks.  Finally, the tree must be able to accommodate
+	 * the configured number of CPUs.  Complain and fall back to the
+	 * compile-time values if these limits are exceeded.
+	 */
+	if (rcu_fanout_leaf < CONFIG_RCU_FANOUT_LEAF ||
+	    rcu_fanout_leaf > sizeof(unsigned long) * 8 ||
+	    n > rcu_capacity[4]) {
+		WARN_ON(1);
+		return;
+	}
+
+	/* Calculate the number of rcu_nodes at each level of the tree. */
+	for (i = 1; i <= MAX_RCU_LVLS; i++)
+		if (n <= rcu_capacity[i]) {
+			for (j = 0; j <= i; j++)
+				num_rcu_lvl[j] =
+					DIV_ROUND_UP(n, rcu_capacity[i - j]);
+			rcu_num_lvls = i;
+			for (j = i + 1; j <= MAX_RCU_LVLS; j++)
+				num_rcu_lvl[j] = 0;
+			break;
+		}
+
+	/* Calculate the total number of rcu_node structures. */
+	rcu_num_nodes = 0;
+	for (i = 0; i <= MAX_RCU_LVLS; i++)
+		rcu_num_nodes += num_rcu_lvl[i];
+	rcu_num_nodes -= n;
+}
+
 void __init rcu_init(void)
 {
 	int cpu;
 
 	rcu_bootup_announce();
+	rcu_init_geometry();
 	rcu_init_one(&rcu_sched_state, &rcu_sched_data);
 	rcu_init_one(&rcu_bh_state, &rcu_bh_data);
 	__rcu_init_preempt();
diff --git a/kernel/rcutree.h b/kernel/rcutree.h
index 7f5d138..df3c2c8 100644
--- a/kernel/rcutree.h
+++ b/kernel/rcutree.h
@@ -42,28 +42,28 @@
 #define RCU_FANOUT_4	      (RCU_FANOUT_3 * CONFIG_RCU_FANOUT)
 
 #if NR_CPUS <= RCU_FANOUT_1
-#  define NUM_RCU_LVLS	      1
+#  define RCU_NUM_LVLS	      1
 #  define NUM_RCU_LVL_0	      1
 #  define NUM_RCU_LVL_1	      (NR_CPUS)
 #  define NUM_RCU_LVL_2	      0
 #  define NUM_RCU_LVL_3	      0
 #  define NUM_RCU_LVL_4	      0
 #elif NR_CPUS <= RCU_FANOUT_2
-#  define NUM_RCU_LVLS	      2
+#  define RCU_NUM_LVLS	      2
 #  define NUM_RCU_LVL_0	      1
 #  define NUM_RCU_LVL_1	      DIV_ROUND_UP(NR_CPUS, RCU_FANOUT_1)
 #  define NUM_RCU_LVL_2	      (NR_CPUS)
 #  define NUM_RCU_LVL_3	      0
 #  define NUM_RCU_LVL_4	      0
 #elif NR_CPUS <= RCU_FANOUT_3
-#  define NUM_RCU_LVLS	      3
+#  define RCU_NUM_LVLS	      3
 #  define NUM_RCU_LVL_0	      1
 #  define NUM_RCU_LVL_1	      DIV_ROUND_UP(NR_CPUS, RCU_FANOUT_2)
 #  define NUM_RCU_LVL_2	      DIV_ROUND_UP(NR_CPUS, RCU_FANOUT_1)
 #  define NUM_RCU_LVL_3	      (NR_CPUS)
 #  define NUM_RCU_LVL_4	      0
 #elif NR_CPUS <= RCU_FANOUT_4
-#  define NUM_RCU_LVLS	      4
+#  define RCU_NUM_LVLS	      4
 #  define NUM_RCU_LVL_0	      1
 #  define NUM_RCU_LVL_1	      DIV_ROUND_UP(NR_CPUS, RCU_FANOUT_3)
 #  define NUM_RCU_LVL_2	      DIV_ROUND_UP(NR_CPUS, RCU_FANOUT_2)
@@ -76,6 +76,9 @@
 #define RCU_SUM (NUM_RCU_LVL_0 + NUM_RCU_LVL_1 + NUM_RCU_LVL_2 + NUM_RCU_LVL_3 + NUM_RCU_LVL_4)
 #define NUM_RCU_NODES (RCU_SUM - NR_CPUS)
 
+extern int rcu_num_lvls;
+extern int rcu_num_nodes;
+
 /*
  * Dynticks per-CPU state.
  */
@@ -192,7 +195,7 @@ struct rcu_node {
  */
 #define rcu_for_each_node_breadth_first(rsp, rnp) \
 	for ((rnp) = &(rsp)->node[0]; \
-	     (rnp) < &(rsp)->node[NUM_RCU_NODES]; (rnp)++)
+	     (rnp) < &(rsp)->node[rcu_num_nodes]; (rnp)++)
 
 /*
  * Do a breadth-first scan of the non-leaf rcu_node structures for the
@@ -201,7 +204,7 @@ struct rcu_node {
  */
 #define rcu_for_each_nonleaf_node_breadth_first(rsp, rnp) \
 	for ((rnp) = &(rsp)->node[0]; \
-	     (rnp) < (rsp)->level[NUM_RCU_LVLS - 1]; (rnp)++)
+	     (rnp) < (rsp)->level[rcu_num_lvls - 1]; (rnp)++)
 
 /*
  * Scan the leaves of the rcu_node hierarchy for the specified rcu_state
@@ -210,8 +213,8 @@ struct rcu_node {
  * It is still a leaf node, even if it is also the root node.
  */
 #define rcu_for_each_leaf_node(rsp, rnp) \
-	for ((rnp) = (rsp)->level[NUM_RCU_LVLS - 1]; \
-	     (rnp) < &(rsp)->node[NUM_RCU_NODES]; (rnp)++)
+	for ((rnp) = (rsp)->level[rcu_num_lvls - 1]; \
+	     (rnp) < &(rsp)->node[rcu_num_nodes]; (rnp)++)
 
 /* Index values for nxttail array in struct rcu_data. */
 #define RCU_DONE_TAIL		0	/* Also RCU_WAIT head. */
@@ -343,9 +346,9 @@ do {									\
  */
 struct rcu_state {
 	struct rcu_node node[NUM_RCU_NODES];	/* Hierarchy. */
-	struct rcu_node *level[NUM_RCU_LVLS];	/* Hierarchy levels. */
+	struct rcu_node *level[RCU_NUM_LVLS];	/* Hierarchy levels. */
 	u32 levelcnt[MAX_RCU_LVLS + 1];		/* # nodes in each level. */
-	u8 levelspread[NUM_RCU_LVLS];		/* kids/node in each level. */
+	u8 levelspread[RCU_NUM_LVLS];		/* kids/node in each level. */
 	struct rcu_data __percpu *rda;		/* pointer of percu rcu_data. */
 
 	/* The following fields are guarded by the root rcu_node's lock. */
diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
index 2411000..e9b44c3 100644
--- a/kernel/rcutree_plugin.h
+++ b/kernel/rcutree_plugin.h
@@ -68,8 +68,10 @@ static void __init rcu_bootup_announce_oddness(void)
 	printk(KERN_INFO "\tAdditional per-CPU info printed with stalls.\n");
 #endif
 #if NUM_RCU_LVL_4 != 0
-	printk(KERN_INFO "\tExperimental four-level hierarchy is enabled.\n");
+	printk(KERN_INFO "\tFour-level hierarchy is enabled.\n");
 #endif
+	if (rcu_fanout_leaf != CONFIG_RCU_FANOUT_LEAF)
+		printk(KERN_INFO "\tExperimental boot-time adjustment of leaf fanout.\n");
 }
 
 #ifdef CONFIG_TREE_PREEMPT_RCU
diff --git a/kernel/rcutree_trace.c b/kernel/rcutree_trace.c
index d4bc16d..a3556a2 100644
--- a/kernel/rcutree_trace.c
+++ b/kernel/rcutree_trace.c
@@ -278,7 +278,7 @@ static void print_one_rcu_state(struct seq_file *m, struct rcu_state *rsp)
 		   rsp->n_force_qs, rsp->n_force_qs_ngp,
 		   rsp->n_force_qs - rsp->n_force_qs_ngp,
 		   rsp->n_force_qs_lh, rsp->qlen_lazy, rsp->qlen);
-	for (rnp = &rsp->node[0]; rnp - &rsp->node[0] < NUM_RCU_NODES; rnp++) {
+	for (rnp = &rsp->node[0]; rnp - &rsp->node[0] < rcu_num_nodes; rnp++) {
 		if (rnp->level != level) {
 			seq_puts(m, "\n");
 			level = rnp->level;
-- 
1.7.8



* [PATCH tip/core/rcu 02/15] rcu: Size rcu_node tree from nr_cpu_ids rather than NR_CPUS
  2012-06-15 21:05 ` [PATCH tip/core/rcu 01/15] rcu: Control RCU_FANOUT_LEAF from boot-time parameter Paul E. McKenney
@ 2012-06-15 21:05   ` Paul E. McKenney
  2012-06-15 21:47     ` Josh Triplett
  2012-06-15 21:05   ` [PATCH tip/core/rcu 03/15] rcu: Prevent excessive line length in RCU_STATE_INITIALIZER() Paul E. McKenney
                     ` (13 subsequent siblings)
  14 siblings, 1 reply; 50+ messages in thread
From: Paul E. McKenney @ 2012-06-15 21:05 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, fweisbec, patches, Paul E. McKenney

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>

The rcu_node tree array is sized based on compile-time constants,
including NR_CPUS.  Although this approach has worked well in the past,
the recent trend by many distros to define NR_CPUS=4096 results in
excessive grace-period-initialization latencies.

This commit therefore substitutes the run-time computed nr_cpu_ids for
the compile-time NR_CPUS when building the tree.  This can result in
much of the compile-time-allocated rcu_node array being unused.  If
this is a major problem, you are in a specialized situation anyway,
so you can manually adjust the NR_CPUS, RCU_FANOUT, and RCU_FANOUT_LEAF
kernel config parameters.
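
To make the motivation concrete, here is the arithmetic assuming the
then-common 64-bit defaults CONFIG_RCU_FANOUT=64 and CONFIG_RCU_FANOUT_LEAF=16
(illustrative values; Kconfig defaults vary by architecture):

	NR_CPUS = 4096:   leaf level   DIV_ROUND_UP(4096, 16)   = 256 nodes
	                  next level   DIV_ROUND_UP(4096, 1024) =   4 nodes
	                  root                                  =   1 node
	                  => 261 rcu_node structures walked during each
	                     grace-period initialization

	nr_cpu_ids = 16:  once the tree is sized from nr_cpu_ids, a single
	                  leaf rcu_node covers every CPU, so grace-period
	                  initialization touches 1 structure instead of 261.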

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcutree.c        |    2 +-
 kernel/rcutree_plugin.h |    2 ++
 2 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index a151184..9098910 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -2672,7 +2672,7 @@ static void __init rcu_init_geometry(void)
 {
 	int i;
 	int j;
-	int n = NR_CPUS;
+	int n = nr_cpu_ids;
 	int rcu_capacity[MAX_RCU_LVLS + 1];
 
 	/* If the compile-time values are accurate, just leave. */
diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
index e9b44c3..7cb86ae 100644
--- a/kernel/rcutree_plugin.h
+++ b/kernel/rcutree_plugin.h
@@ -72,6 +72,8 @@ static void __init rcu_bootup_announce_oddness(void)
 #endif
 	if (rcu_fanout_leaf != CONFIG_RCU_FANOUT_LEAF)
 		printk(KERN_INFO "\tExperimental boot-time adjustment of leaf fanout.\n");
+	if (nr_cpu_ids != NR_CPUS)
+		printk(KERN_INFO "\tRCU restricting CPUs from NR_CPUS=%d to nr_cpu_ids=%d.\n", NR_CPUS, nr_cpu_ids);
 }
 
 #ifdef CONFIG_TREE_PREEMPT_RCU
-- 
1.7.8



* [PATCH tip/core/rcu 03/15] rcu: Prevent excessive line length in RCU_STATE_INITIALIZER()
  2012-06-15 21:05 ` [PATCH tip/core/rcu 01/15] rcu: Control RCU_FANOUT_LEAF from boot-time parameter Paul E. McKenney
  2012-06-15 21:05   ` [PATCH tip/core/rcu 02/15] rcu: Size rcu_node tree from nr_cpu_ids rather than NR_CPUS Paul E. McKenney
@ 2012-06-15 21:05   ` Paul E. McKenney
  2012-06-15 21:48     ` Josh Triplett
  2012-06-15 21:05   ` [PATCH tip/core/rcu 04/15] rcu: Place pointer to call_rcu() in rcu_data structure Paul E. McKenney
                     ` (12 subsequent siblings)
  14 siblings, 1 reply; 50+ messages in thread
From: Paul E. McKenney @ 2012-06-15 21:05 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, fweisbec, patches, Paul E. McKenney, Paul E. McKenney

From: "Paul E. McKenney" <paul.mckenney@linaro.org>

Upcoming rcu_barrier() concurrency commits will result in line lengths
greater than 80 characters in the RCU_STATE_INITIALIZER(), so this commit
shortens the name of the macro's argument to prevent this.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcutree.c |   14 +++++++-------
 1 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index 9098910..8ce1b1d 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -62,18 +62,18 @@
 
 static struct lock_class_key rcu_node_class[RCU_NUM_LVLS];
 
-#define RCU_STATE_INITIALIZER(structname) { \
-	.level = { &structname##_state.node[0] }, \
+#define RCU_STATE_INITIALIZER(sname) { \
+	.level = { &sname##_state.node[0] }, \
 	.fqs_state = RCU_GP_IDLE, \
 	.gpnum = -300, \
 	.completed = -300, \
-	.onofflock = __RAW_SPIN_LOCK_UNLOCKED(&structname##_state.onofflock), \
-	.orphan_nxttail = &structname##_state.orphan_nxtlist, \
-	.orphan_donetail = &structname##_state.orphan_donelist, \
-	.fqslock = __RAW_SPIN_LOCK_UNLOCKED(&structname##_state.fqslock), \
+	.onofflock = __RAW_SPIN_LOCK_UNLOCKED(&sname##_state.onofflock), \
+	.orphan_nxttail = &sname##_state.orphan_nxtlist, \
+	.orphan_donetail = &sname##_state.orphan_donelist, \
+	.fqslock = __RAW_SPIN_LOCK_UNLOCKED(&sname##_state.fqslock), \
 	.n_force_qs = 0, \
 	.n_force_qs_ngp = 0, \
-	.name = #structname, \
+	.name = #sname, \
 }
 
 struct rcu_state rcu_sched_state = RCU_STATE_INITIALIZER(rcu_sched);
-- 
1.7.8



* [PATCH tip/core/rcu 04/15] rcu: Place pointer to call_rcu() in rcu_data structure
  2012-06-15 21:05 ` [PATCH tip/core/rcu 01/15] rcu: Control RCU_FANOUT_LEAF from boot-time parameter Paul E. McKenney
  2012-06-15 21:05   ` [PATCH tip/core/rcu 02/15] rcu: Size rcu_node tree from nr_cpu_ids rather than NR_CPUS Paul E. McKenney
  2012-06-15 21:05   ` [PATCH tip/core/rcu 03/15] rcu: Prevent excessive line length in RCU_STATE_INITIALIZER() Paul E. McKenney
@ 2012-06-15 21:05   ` Paul E. McKenney
  2012-06-15 22:08     ` Josh Triplett
  2012-06-15 21:06   ` [PATCH tip/core/rcu 05/15] rcu: Move _rcu_barrier()'s rcu_head structures to rcu_data structures Paul E. McKenney
                     ` (11 subsequent siblings)
  14 siblings, 1 reply; 50+ messages in thread
From: Paul E. McKenney @ 2012-06-15 21:05 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, fweisbec, patches, Paul E. McKenney, Paul E. McKenney

From: "Paul E. McKenney" <paul.mckenney@linaro.org>

This is a preparatory commit for increasing rcu_barrier()'s concurrency.
It adds a pointer in the rcu_state structure to the corresponding call_rcu()
function.  This allows a pointer to the rcu_data structure to imply the
function pointer, which allows _rcu_barrier() state to be placed in the
rcu_state structure.
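
As a standalone illustration of the pattern (hypothetical, simplified types,
not the kernel's definitions), carrying the flavor's callback-posting
function inside its state structure lets generic code work from the state
pointer alone:

	/* Minimal sketch only -- not kernel code. */
	struct cb {
		void (*func)(struct cb *cb);
	};

	struct flavor_state {
		const char *name;
		void (*call)(struct cb *cb, void (*func)(struct cb *cb));
	};

	/* Generic barrier-style code no longer needs a separate function
	 * argument: the state structure implies which "call" to use. */
	static void post_barrier_cb(struct flavor_state *fsp, struct cb *cb,
				    void (*done)(struct cb *cb))
	{
		fsp->call(cb, done);
	}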

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcutree.c        |   27 ++++++++++++---------------
 kernel/rcutree.h        |    2 ++
 kernel/rcutree_plugin.h |    5 +++--
 3 files changed, 17 insertions(+), 17 deletions(-)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index 8ce1b1d..8b3ab4e 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -62,8 +62,9 @@
 
 static struct lock_class_key rcu_node_class[RCU_NUM_LVLS];
 
-#define RCU_STATE_INITIALIZER(sname) { \
+#define RCU_STATE_INITIALIZER(sname, cr) { \
 	.level = { &sname##_state.node[0] }, \
+	.call = cr, \
 	.fqs_state = RCU_GP_IDLE, \
 	.gpnum = -300, \
 	.completed = -300, \
@@ -76,10 +77,11 @@ static struct lock_class_key rcu_node_class[RCU_NUM_LVLS];
 	.name = #sname, \
 }
 
-struct rcu_state rcu_sched_state = RCU_STATE_INITIALIZER(rcu_sched);
+struct rcu_state rcu_sched_state =
+	RCU_STATE_INITIALIZER(rcu_sched, call_rcu_sched);
 DEFINE_PER_CPU(struct rcu_data, rcu_sched_data);
 
-struct rcu_state rcu_bh_state = RCU_STATE_INITIALIZER(rcu_bh);
+struct rcu_state rcu_bh_state = RCU_STATE_INITIALIZER(rcu_bh, call_rcu_bh);
 DEFINE_PER_CPU(struct rcu_data, rcu_bh_data);
 
 static struct rcu_state *rcu_state;
@@ -2279,21 +2281,17 @@ static void rcu_barrier_func(void *type)
 {
 	int cpu = smp_processor_id();
 	struct rcu_head *head = &per_cpu(rcu_barrier_head, cpu);
-	void (*call_rcu_func)(struct rcu_head *head,
-			      void (*func)(struct rcu_head *head));
+	struct rcu_state *rsp = type;
 
 	atomic_inc(&rcu_barrier_cpu_count);
-	call_rcu_func = type;
-	call_rcu_func(head, rcu_barrier_callback);
+	rsp->call(head, rcu_barrier_callback);
 }
 
 /*
  * Orchestrate the specified type of RCU barrier, waiting for all
  * RCU callbacks of the specified type to complete.
  */
-static void _rcu_barrier(struct rcu_state *rsp,
-			 void (*call_rcu_func)(struct rcu_head *head,
-					       void (*func)(struct rcu_head *head)))
+static void _rcu_barrier(struct rcu_state *rsp)
 {
 	int cpu;
 	unsigned long flags;
@@ -2345,8 +2343,7 @@ static void _rcu_barrier(struct rcu_state *rsp,
 			while (cpu_is_offline(cpu) && ACCESS_ONCE(rdp->qlen))
 				schedule_timeout_interruptible(1);
 		} else if (ACCESS_ONCE(rdp->qlen)) {
-			smp_call_function_single(cpu, rcu_barrier_func,
-						 (void *)call_rcu_func, 1);
+			smp_call_function_single(cpu, rcu_barrier_func, rsp, 1);
 			preempt_enable();
 		} else {
 			preempt_enable();
@@ -2367,7 +2364,7 @@ static void _rcu_barrier(struct rcu_state *rsp,
 	raw_spin_unlock_irqrestore(&rsp->onofflock, flags);
 	atomic_inc(&rcu_barrier_cpu_count);
 	smp_mb__after_atomic_inc(); /* Ensure atomic_inc() before callback. */
-	call_rcu_func(&rh, rcu_barrier_callback);
+	rsp->call(&rh, rcu_barrier_callback);
 
 	/*
 	 * Now that we have an rcu_barrier_callback() callback on each
@@ -2390,7 +2387,7 @@ static void _rcu_barrier(struct rcu_state *rsp,
  */
 void rcu_barrier_bh(void)
 {
-	_rcu_barrier(&rcu_bh_state, call_rcu_bh);
+	_rcu_barrier(&rcu_bh_state);
 }
 EXPORT_SYMBOL_GPL(rcu_barrier_bh);
 
@@ -2399,7 +2396,7 @@ EXPORT_SYMBOL_GPL(rcu_barrier_bh);
  */
 void rcu_barrier_sched(void)
 {
-	_rcu_barrier(&rcu_sched_state, call_rcu_sched);
+	_rcu_barrier(&rcu_sched_state);
 }
 EXPORT_SYMBOL_GPL(rcu_barrier_sched);
 
diff --git a/kernel/rcutree.h b/kernel/rcutree.h
index df3c2c8..15837d7 100644
--- a/kernel/rcutree.h
+++ b/kernel/rcutree.h
@@ -350,6 +350,8 @@ struct rcu_state {
 	u32 levelcnt[MAX_RCU_LVLS + 1];		/* # nodes in each level. */
 	u8 levelspread[RCU_NUM_LVLS];		/* kids/node in each level. */
 	struct rcu_data __percpu *rda;		/* pointer of percu rcu_data. */
+	void (*call)(struct rcu_head *head,	/* call_rcu() flavor. */
+		     void (*func)(struct rcu_head *head));
 
 	/* The following fields are guarded by the root rcu_node's lock. */
 
diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
index 7cb86ae..6888706 100644
--- a/kernel/rcutree_plugin.h
+++ b/kernel/rcutree_plugin.h
@@ -78,7 +78,8 @@ static void __init rcu_bootup_announce_oddness(void)
 
 #ifdef CONFIG_TREE_PREEMPT_RCU
 
-struct rcu_state rcu_preempt_state = RCU_STATE_INITIALIZER(rcu_preempt);
+struct rcu_state rcu_preempt_state =
+	RCU_STATE_INITIALIZER(rcu_preempt, call_rcu);
 DEFINE_PER_CPU(struct rcu_data, rcu_preempt_data);
 static struct rcu_state *rcu_state = &rcu_preempt_state;
 
@@ -944,7 +945,7 @@ static int rcu_preempt_cpu_has_callbacks(int cpu)
  */
 void rcu_barrier(void)
 {
-	_rcu_barrier(&rcu_preempt_state, call_rcu);
+	_rcu_barrier(&rcu_preempt_state);
 }
 EXPORT_SYMBOL_GPL(rcu_barrier);
 
-- 
1.7.8



* [PATCH tip/core/rcu 05/15] rcu: Move _rcu_barrier()'s rcu_head structures to rcu_data structures
  2012-06-15 21:05 ` [PATCH tip/core/rcu 01/15] rcu: Control RCU_FANOUT_LEAF from boot-time parameter Paul E. McKenney
                     ` (2 preceding siblings ...)
  2012-06-15 21:05   ` [PATCH tip/core/rcu 04/15] rcu: Place pointer to call_rcu() in rcu_data structure Paul E. McKenney
@ 2012-06-15 21:06   ` Paul E. McKenney
  2012-06-15 22:19     ` Josh Triplett
  2012-06-15 21:06   ` [PATCH tip/core/rcu 06/15] rcu: Move rcu_barrier_cpu_count to rcu_state structure Paul E. McKenney
                     ` (10 subsequent siblings)
  14 siblings, 1 reply; 50+ messages in thread
From: Paul E. McKenney @ 2012-06-15 21:06 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, fweisbec, patches, Paul E. McKenney, Paul E. McKenney

From: "Paul E. McKenney" <paul.mckenney@linaro.org>

In order for multiple flavors of RCU to each concurrently run one
rcu_barrier(), each flavor needs its own per-CPU set of rcu_head
structures.  This commit therefore moves _rcu_barrier()'s set of
per-CPU rcu_head structures from per-CPU variables to the existing
per-CPU and per-RCU-flavor rcu_data structures.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcutree.c |    6 ++----
 kernel/rcutree.h |    3 +++
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index 8b3ab4e..2cfbdb8 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -157,7 +157,6 @@ unsigned long rcutorture_vernum;
 
 /* State information for rcu_barrier() and friends. */
 
-static DEFINE_PER_CPU(struct rcu_head, rcu_barrier_head) = {NULL};
 static atomic_t rcu_barrier_cpu_count;
 static DEFINE_MUTEX(rcu_barrier_mutex);
 static struct completion rcu_barrier_completion;
@@ -2279,12 +2278,11 @@ static void rcu_barrier_callback(struct rcu_head *notused)
  */
 static void rcu_barrier_func(void *type)
 {
-	int cpu = smp_processor_id();
-	struct rcu_head *head = &per_cpu(rcu_barrier_head, cpu);
 	struct rcu_state *rsp = type;
+	struct rcu_data *rdp = __this_cpu_ptr(rsp->rda);
 
 	atomic_inc(&rcu_barrier_cpu_count);
-	rsp->call(head, rcu_barrier_callback);
+	rsp->call(&rdp->barrier_head, rcu_barrier_callback);
 }
 
 /*
diff --git a/kernel/rcutree.h b/kernel/rcutree.h
index 15837d7..1783eae 100644
--- a/kernel/rcutree.h
+++ b/kernel/rcutree.h
@@ -300,6 +300,9 @@ struct rcu_data {
 	unsigned long n_rp_need_fqs;
 	unsigned long n_rp_need_nothing;
 
+	/* 6) _rcu_barrier() callback. */
+	struct rcu_head barrier_head;
+
 	int cpu;
 	struct rcu_state *rsp;
 };
-- 
1.7.8



* [PATCH tip/core/rcu 06/15] rcu: Move rcu_barrier_cpu_count to rcu_state structure
  2012-06-15 21:05 ` [PATCH tip/core/rcu 01/15] rcu: Control RCU_FANOUT_LEAF from boot-time parameter Paul E. McKenney
                     ` (3 preceding siblings ...)
  2012-06-15 21:06   ` [PATCH tip/core/rcu 05/15] rcu: Move _rcu_barrier()'s rcu_head structures to rcu_data structures Paul E. McKenney
@ 2012-06-15 21:06   ` Paul E. McKenney
  2012-06-15 22:44     ` Josh Triplett
  2012-06-15 21:06   ` [PATCH tip/core/rcu 07/15] rcu: Move rcu_barrier_completion " Paul E. McKenney
                     ` (9 subsequent siblings)
  14 siblings, 1 reply; 50+ messages in thread
From: Paul E. McKenney @ 2012-06-15 21:06 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, fweisbec, patches, Paul E. McKenney, Paul E. McKenney

From: "Paul E. McKenney" <paul.mckenney@linaro.org>

In order to allow each RCU flavor to concurrently execute its rcu_barrier()
function, it is necessary to move the relevant state to the rcu_state
structure.  This commit therefore moves the rcu_barrier_cpu_count global
variable to a new ->barrier_cpu_count field in the rcu_state structure.
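
The updated rcu_barrier_callback() locates its rcu_state by applying
container_of() to the rcu_head embedded in the per-CPU rcu_data structure
and then following its ->rsp pointer.  A self-contained illustration of that
idiom (hypothetical types and a simplified container_of(), not the kernel's
definitions):

	#include <stddef.h>

	/* Simplified form of the kernel's container_of(). */
	#define container_of(ptr, type, member) \
		((type *)((char *)(ptr) - offsetof(type, member)))

	struct head {
		struct head *next;
	};

	struct data {
		int cpu;
		struct head barrier_head;	/* embedded, as in rcu_data */
	};

	static struct data *data_from_head(struct head *hp)
	{
		/* Recover the enclosing structure from its embedded member. */
		return container_of(hp, struct data, barrier_head);
	}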

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcutree.c |   25 ++++++++++++++-----------
 kernel/rcutree.h |    1 +
 2 files changed, 15 insertions(+), 11 deletions(-)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index 2cfbdb8..d363416 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -157,7 +157,6 @@ unsigned long rcutorture_vernum;
 
 /* State information for rcu_barrier() and friends. */
 
-static atomic_t rcu_barrier_cpu_count;
 static DEFINE_MUTEX(rcu_barrier_mutex);
 static struct completion rcu_barrier_completion;
 
@@ -2267,9 +2266,12 @@ static int rcu_cpu_has_callbacks(int cpu)
  * RCU callback function for _rcu_barrier().  If we are last, wake
  * up the task executing _rcu_barrier().
  */
-static void rcu_barrier_callback(struct rcu_head *notused)
+static void rcu_barrier_callback(struct rcu_head *rhp)
 {
-	if (atomic_dec_and_test(&rcu_barrier_cpu_count))
+	struct rcu_data *rdp = container_of(rhp, struct rcu_data, barrier_head);
+	struct rcu_state *rsp = rdp->rsp;
+
+	if (atomic_dec_and_test(&rsp->barrier_cpu_count))
 		complete(&rcu_barrier_completion);
 }
 
@@ -2281,7 +2283,7 @@ static void rcu_barrier_func(void *type)
 	struct rcu_state *rsp = type;
 	struct rcu_data *rdp = __this_cpu_ptr(rsp->rda);
 
-	atomic_inc(&rcu_barrier_cpu_count);
+	atomic_inc(&rsp->barrier_cpu_count);
 	rsp->call(&rdp->barrier_head, rcu_barrier_callback);
 }
 
@@ -2294,9 +2296,9 @@ static void _rcu_barrier(struct rcu_state *rsp)
 	int cpu;
 	unsigned long flags;
 	struct rcu_data *rdp;
-	struct rcu_head rh;
+	struct rcu_data rd;
 
-	init_rcu_head_on_stack(&rh);
+	init_rcu_head_on_stack(&rd.barrier_head);
 
 	/* Take mutex to serialize concurrent rcu_barrier() requests. */
 	mutex_lock(&rcu_barrier_mutex);
@@ -2321,7 +2323,7 @@ static void _rcu_barrier(struct rcu_state *rsp)
 	 *	us -- but before CPU 1's orphaned callbacks are invoked!!!
 	 */
 	init_completion(&rcu_barrier_completion);
-	atomic_set(&rcu_barrier_cpu_count, 1);
+	atomic_set(&rsp->barrier_cpu_count, 1);
 	raw_spin_lock_irqsave(&rsp->onofflock, flags);
 	rsp->rcu_barrier_in_progress = current;
 	raw_spin_unlock_irqrestore(&rsp->onofflock, flags);
@@ -2360,15 +2362,16 @@ static void _rcu_barrier(struct rcu_state *rsp)
 	rcu_adopt_orphan_cbs(rsp);
 	rsp->rcu_barrier_in_progress = NULL;
 	raw_spin_unlock_irqrestore(&rsp->onofflock, flags);
-	atomic_inc(&rcu_barrier_cpu_count);
+	atomic_inc(&rsp->barrier_cpu_count);
 	smp_mb__after_atomic_inc(); /* Ensure atomic_inc() before callback. */
-	rsp->call(&rh, rcu_barrier_callback);
+	rd.rsp = rsp;
+	rsp->call(&rd.barrier_head, rcu_barrier_callback);
 
 	/*
 	 * Now that we have an rcu_barrier_callback() callback on each
 	 * CPU, and thus each counted, remove the initial count.
 	 */
-	if (atomic_dec_and_test(&rcu_barrier_cpu_count))
+	if (atomic_dec_and_test(&rsp->barrier_cpu_count))
 		complete(&rcu_barrier_completion);
 
 	/* Wait for all rcu_barrier_callback() callbacks to be invoked. */
@@ -2377,7 +2380,7 @@ static void _rcu_barrier(struct rcu_state *rsp)
 	/* Other rcu_barrier() invocations can now safely proceed. */
 	mutex_unlock(&rcu_barrier_mutex);
 
-	destroy_rcu_head_on_stack(&rh);
+	destroy_rcu_head_on_stack(&rd.barrier_head);
 }
 
 /**
diff --git a/kernel/rcutree.h b/kernel/rcutree.h
index 1783eae..e7d29b7 100644
--- a/kernel/rcutree.h
+++ b/kernel/rcutree.h
@@ -386,6 +386,7 @@ struct rcu_state {
 	struct task_struct *rcu_barrier_in_progress;
 						/* Task doing rcu_barrier(), */
 						/*  or NULL if no barrier. */
+	atomic_t barrier_cpu_count;		/* # CPUs waiting on. */
 	raw_spinlock_t fqslock;			/* Only one task forcing */
 						/*  quiescent states. */
 	unsigned long jiffies_force_qs;		/* Time at which to invoke */
-- 
1.7.8



* [PATCH tip/core/rcu 07/15] rcu: Move rcu_barrier_completion to rcu_state structure
  2012-06-15 21:05 ` [PATCH tip/core/rcu 01/15] rcu: Control RCU_FANOUT_LEAF from boot-time parameter Paul E. McKenney
                     ` (4 preceding siblings ...)
  2012-06-15 21:06   ` [PATCH tip/core/rcu 06/15] rcu: Move rcu_barrier_cpu_count to rcu_state structure Paul E. McKenney
@ 2012-06-15 21:06   ` Paul E. McKenney
  2012-06-15 22:51     ` Josh Triplett
  2012-06-15 21:06   ` [PATCH tip/core/rcu 08/15] rcu: Move rcu_barrier_mutex " Paul E. McKenney
                     ` (8 subsequent siblings)
  14 siblings, 1 reply; 50+ messages in thread
From: Paul E. McKenney @ 2012-06-15 21:06 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, fweisbec, patches, Paul E. McKenney, Paul E. McKenney

From: "Paul E. McKenney" <paul.mckenney@linaro.org>

In order to allow each RCU flavor to concurrently execute its
rcu_barrier() function, it is necessary to move the relevant
state to the rcu_state structure.  This commit therefore moves the
rcu_barrier_completion global variable to a new ->barrier_completion
field in the rcu_state structure.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcutree.c |    9 ++++-----
 kernel/rcutree.h |    1 +
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index d363416..a946437 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -158,7 +158,6 @@ unsigned long rcutorture_vernum;
 /* State information for rcu_barrier() and friends. */
 
 static DEFINE_MUTEX(rcu_barrier_mutex);
-static struct completion rcu_barrier_completion;
 
 /*
  * Return true if an RCU grace period is in progress.  The ACCESS_ONCE()s
@@ -2272,7 +2271,7 @@ static void rcu_barrier_callback(struct rcu_head *rhp)
 	struct rcu_state *rsp = rdp->rsp;
 
 	if (atomic_dec_and_test(&rsp->barrier_cpu_count))
-		complete(&rcu_barrier_completion);
+		complete(&rsp->barrier_completion);
 }
 
 /*
@@ -2322,7 +2321,7 @@ static void _rcu_barrier(struct rcu_state *rsp)
 	 * 6.	Both rcu_barrier_callback() callbacks are invoked, awakening
 	 *	us -- but before CPU 1's orphaned callbacks are invoked!!!
 	 */
-	init_completion(&rcu_barrier_completion);
+	init_completion(&rsp->barrier_completion);
 	atomic_set(&rsp->barrier_cpu_count, 1);
 	raw_spin_lock_irqsave(&rsp->onofflock, flags);
 	rsp->rcu_barrier_in_progress = current;
@@ -2372,10 +2371,10 @@ static void _rcu_barrier(struct rcu_state *rsp)
 	 * CPU, and thus each counted, remove the initial count.
 	 */
 	if (atomic_dec_and_test(&rsp->barrier_cpu_count))
-		complete(&rcu_barrier_completion);
+		complete(&rsp->barrier_completion);
 
 	/* Wait for all rcu_barrier_callback() callbacks to be invoked. */
-	wait_for_completion(&rcu_barrier_completion);
+	wait_for_completion(&rsp->barrier_completion);
 
 	/* Other rcu_barrier() invocations can now safely proceed. */
 	mutex_unlock(&rcu_barrier_mutex);
diff --git a/kernel/rcutree.h b/kernel/rcutree.h
index e7d29b7..56fb8d4 100644
--- a/kernel/rcutree.h
+++ b/kernel/rcutree.h
@@ -387,6 +387,7 @@ struct rcu_state {
 						/* Task doing rcu_barrier(), */
 						/*  or NULL if no barrier. */
 	atomic_t barrier_cpu_count;		/* # CPUs waiting on. */
+	struct completion barrier_completion;	/* Wake at barrier end. */
 	raw_spinlock_t fqslock;			/* Only one task forcing */
 						/*  quiescent states. */
 	unsigned long jiffies_force_qs;		/* Time at which to invoke */
-- 
1.7.8



* [PATCH tip/core/rcu 08/15] rcu: Move rcu_barrier_mutex to rcu_state structure
  2012-06-15 21:05 ` [PATCH tip/core/rcu 01/15] rcu: Control RCU_FANOUT_LEAF from boot-time parameter Paul E. McKenney
                     ` (5 preceding siblings ...)
  2012-06-15 21:06   ` [PATCH tip/core/rcu 07/15] rcu: Move rcu_barrier_completion " Paul E. McKenney
@ 2012-06-15 21:06   ` Paul E. McKenney
  2012-06-15 22:55     ` Josh Triplett
  2012-06-15 21:06   ` [PATCH tip/core/rcu 09/15] rcu: Increasing rcu_barrier() concurrency Paul E. McKenney
                     ` (7 subsequent siblings)
  14 siblings, 1 reply; 50+ messages in thread
From: Paul E. McKenney @ 2012-06-15 21:06 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, fweisbec, patches, Paul E. McKenney, Paul E. McKenney

From: "Paul E. McKenney" <paul.mckenney@linaro.org>

In order to allow each RCU flavor to concurrently execute its
rcu_barrier() function, it is necessary to move the relevant
state to the rcu_state structure.  This commit therefore moves the
rcu_barrier_mutex global variable to a new ->barrier_mutex field
in the rcu_state structure.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcutree.c |   11 +++--------
 kernel/rcutree.h |    1 +
 2 files changed, 4 insertions(+), 8 deletions(-)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index a946437..93358d4 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -71,9 +71,8 @@ static struct lock_class_key rcu_node_class[RCU_NUM_LVLS];
 	.onofflock = __RAW_SPIN_LOCK_UNLOCKED(&sname##_state.onofflock), \
 	.orphan_nxttail = &sname##_state.orphan_nxtlist, \
 	.orphan_donetail = &sname##_state.orphan_donelist, \
+	.barrier_mutex = __MUTEX_INITIALIZER(sname##_state.barrier_mutex), \
 	.fqslock = __RAW_SPIN_LOCK_UNLOCKED(&sname##_state.fqslock), \
-	.n_force_qs = 0, \
-	.n_force_qs_ngp = 0, \
 	.name = #sname, \
 }
 
@@ -155,10 +154,6 @@ static void invoke_rcu_callbacks(struct rcu_state *rsp, struct rcu_data *rdp);
 unsigned long rcutorture_testseq;
 unsigned long rcutorture_vernum;
 
-/* State information for rcu_barrier() and friends. */
-
-static DEFINE_MUTEX(rcu_barrier_mutex);
-
 /*
  * Return true if an RCU grace period is in progress.  The ACCESS_ONCE()s
  * permit this function to be invoked without holding the root rcu_node
@@ -2300,7 +2295,7 @@ static void _rcu_barrier(struct rcu_state *rsp)
 	init_rcu_head_on_stack(&rd.barrier_head);
 
 	/* Take mutex to serialize concurrent rcu_barrier() requests. */
-	mutex_lock(&rcu_barrier_mutex);
+	mutex_lock(&rsp->barrier_mutex);
 
 	smp_mb();  /* Prevent any prior operations from leaking in. */
 
@@ -2377,7 +2372,7 @@ static void _rcu_barrier(struct rcu_state *rsp)
 	wait_for_completion(&rsp->barrier_completion);
 
 	/* Other rcu_barrier() invocations can now safely proceed. */
-	mutex_unlock(&rcu_barrier_mutex);
+	mutex_unlock(&rsp->barrier_mutex);
 
 	destroy_rcu_head_on_stack(&rd.barrier_head);
 }
diff --git a/kernel/rcutree.h b/kernel/rcutree.h
index 56fb8d4..d9ac82f 100644
--- a/kernel/rcutree.h
+++ b/kernel/rcutree.h
@@ -386,6 +386,7 @@ struct rcu_state {
 	struct task_struct *rcu_barrier_in_progress;
 						/* Task doing rcu_barrier(), */
 						/*  or NULL if no barrier. */
+	struct mutex barrier_mutex;		/* Guards barrier fields. */
 	atomic_t barrier_cpu_count;		/* # CPUs waiting on. */
 	struct completion barrier_completion;	/* Wake at barrier end. */
 	raw_spinlock_t fqslock;			/* Only one task forcing */
-- 
1.7.8



* [PATCH tip/core/rcu 09/15] rcu: Increasing rcu_barrier() concurrency
  2012-06-15 21:05 ` [PATCH tip/core/rcu 01/15] rcu: Control RCU_FANOUT_LEAF from boot-time parameter Paul E. McKenney
                     ` (6 preceding siblings ...)
  2012-06-15 21:06   ` [PATCH tip/core/rcu 08/15] rcu: Move rcu_barrier_mutex " Paul E. McKenney
@ 2012-06-15 21:06   ` Paul E. McKenney
  2012-06-15 23:31     ` Josh Triplett
  2012-06-15 21:06   ` [PATCH tip/core/rcu 10/15] rcu: Add tracing for _rcu_barrier() Paul E. McKenney
                     ` (6 subsequent siblings)
  14 siblings, 1 reply; 50+ messages in thread
From: Paul E. McKenney @ 2012-06-15 21:06 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, fweisbec, patches, Paul E. McKenney, Paul E. McKenney

From: "Paul E. McKenney" <paul.mckenney@linaro.org>

The traditional rcu_barrier() implementation serializes all requests,
regardless of RCU flavor, and does not coalesce concurrent requests.
In the past, this has been good and sufficient.

However, systems are getting larger and use of rcu_barrier() has been
increasing.  This commit therefore introduces a counter-based scheme
that allows _rcu_barrier() calls for the same flavor of RCU to take
advantage of each other's work.
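
To make the counter scheme concrete: ->n_barrier_done is even when no
_rcu_barrier() is in flight and odd while one is executing, being incremented
once at the start and once at the end of each _rcu_barrier().  Each caller
snapshots the counter ("snap") before taking ->barrier_mutex, then rereads
it ("snap_done") after acquiring the mutex and tests
ULONG_CMP_GE(snap_done, ((snap + 1) & ~0x1) + 2).  Worked examples
(values illustrative):

	snap = 4 (even: no barrier in flight at snapshot time):
		((4 + 1) & ~0x1) + 2 = 4 + 2 = 6
		snap_done >= 6 means some other _rcu_barrier() both started
		(5) and completed (6) after our snapshot, so its work covers
		ours and we may return immediately.

	snap = 5 (odd: a barrier was already in flight at snapshot time):
		((5 + 1) & ~0x1) + 2 = 6 + 2 = 8
		That in-flight barrier finishing only brings the counter to 6,
		which is not enough -- it may have begun before our callbacks
		were posted.  Only a later barrier that starts (7) and
		completes (8) lets us piggyback.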

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcutree.c |   27 ++++++++++++++++++++++++++-
 kernel/rcutree.h |    2 ++
 2 files changed, 28 insertions(+), 1 deletions(-)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index 93358d4..7c299d3 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -2291,13 +2291,32 @@ static void _rcu_barrier(struct rcu_state *rsp)
 	unsigned long flags;
 	struct rcu_data *rdp;
 	struct rcu_data rd;
+	unsigned long snap = ACCESS_ONCE(rsp->n_barrier_done);
+	unsigned long snap_done;
 
 	init_rcu_head_on_stack(&rd.barrier_head);
 
 	/* Take mutex to serialize concurrent rcu_barrier() requests. */
 	mutex_lock(&rsp->barrier_mutex);
 
-	smp_mb();  /* Prevent any prior operations from leaking in. */
+	/*
+	 * Ensure that all prior references, including to ->n_barrier_done,
+	 * are ordered before the _rcu_barrier() machinery.
+	 */
+	smp_mb();  /* See above block comment. */
+
+	/* Recheck ->n_barrier_done to see if others did our work for us. */
+	snap_done = ACCESS_ONCE(rsp->n_barrier_done);
+	if (ULONG_CMP_GE(snap_done, ((snap + 1) & ~0x1) + 2)) {
+		smp_mb();
+		mutex_unlock(&rsp->barrier_mutex);
+		return;
+	}
+
+	/* Increment ->n_barrier_done to avoid duplicate work. */
+	ACCESS_ONCE(rsp->n_barrier_done)++;
+	WARN_ON_ONCE((rsp->n_barrier_done & 0x1) != 1);
+	smp_mb(); /* Order ->n_barrier_done increment with below mechanism. */
 
 	/*
 	 * Initialize the count to one rather than to zero in order to
@@ -2368,6 +2387,12 @@ static void _rcu_barrier(struct rcu_state *rsp)
 	if (atomic_dec_and_test(&rsp->barrier_cpu_count))
 		complete(&rsp->barrier_completion);
 
+	/* Increment ->n_barrier_done to prevent duplicate work. */
+	smp_mb(); /* Keep increment after above mechanism. */
+	ACCESS_ONCE(rsp->n_barrier_done)++;
+	WARN_ON_ONCE((rsp->n_barrier_done & 0x1) != 0);
+	smp_mb(); /* Keep increment before caller's subsequent code. */
+
 	/* Wait for all rcu_barrier_callback() callbacks to be invoked. */
 	wait_for_completion(&rsp->barrier_completion);
 
diff --git a/kernel/rcutree.h b/kernel/rcutree.h
index d9ac82f..a294f7f 100644
--- a/kernel/rcutree.h
+++ b/kernel/rcutree.h
@@ -389,6 +389,8 @@ struct rcu_state {
 	struct mutex barrier_mutex;		/* Guards barrier fields. */
 	atomic_t barrier_cpu_count;		/* # CPUs waiting on. */
 	struct completion barrier_completion;	/* Wake at barrier end. */
+	unsigned long n_barrier_done;		/* ++ at start and end of */
+						/*  _rcu_barrier(). */
 	raw_spinlock_t fqslock;			/* Only one task forcing */
 						/*  quiescent states. */
 	unsigned long jiffies_force_qs;		/* Time at which to invoke */
-- 
1.7.8



* [PATCH tip/core/rcu 10/15] rcu: Add tracing for _rcu_barrier()
  2012-06-15 21:05 ` [PATCH tip/core/rcu 01/15] rcu: Control RCU_FANOUT_LEAF from boot-time parameter Paul E. McKenney
                     ` (7 preceding siblings ...)
  2012-06-15 21:06   ` [PATCH tip/core/rcu 09/15] rcu: Increasing rcu_barrier() concurrency Paul E. McKenney
@ 2012-06-15 21:06   ` Paul E. McKenney
  2012-06-15 23:35     ` Josh Triplett
  2012-06-15 21:06   ` [PATCH tip/core/rcu 11/15] rcu: Add rcu_barrier() statistics to debugfs tracing Paul E. McKenney
                     ` (5 subsequent siblings)
  14 siblings, 1 reply; 50+ messages in thread
From: Paul E. McKenney @ 2012-06-15 21:06 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, fweisbec, patches, Paul E. McKenney, Paul E. McKenney

From: "Paul E. McKenney" <paul.mckenney@linaro.org>

This commit adds event tracing for _rcu_barrier() execution.  This
is defined only if RCU_TRACE=y.
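
Once enabled through the standard ftrace interface (the paths below assume
the usual debugfs mount point), each event is emitted in the format given by
the TP_printk() in this patch:

	# echo 1 > /sys/kernel/debug/tracing/events/rcu/rcu_barrier/enable
	# cat /sys/kernel/debug/tracing/trace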

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 include/trace/events/rcu.h |   45 ++++++++++++++++++++++++++++++++++++++++++++
 kernel/rcutree.c           |   29 +++++++++++++++++++++++++++-
 2 files changed, 73 insertions(+), 1 deletions(-)

diff --git a/include/trace/events/rcu.h b/include/trace/events/rcu.h
index 1480900..cd63f79 100644
--- a/include/trace/events/rcu.h
+++ b/include/trace/events/rcu.h
@@ -540,6 +540,50 @@ TRACE_EVENT(rcu_torture_read,
 		  __entry->rcutorturename, __entry->rhp)
 );
 
+/*
+ * Tracepoint for _rcu_barrier() execution.  The string "s" describes
+ * the _rcu_barrier phase:
+ *	"Begin": _rcu_barrier() started.
+ *	"Check": _rcu_barrier() checking for piggybacking.
+ *	"EarlyExit": _rcu_barrier() piggybacked, thus early exit.
+ *	"Inc1": _rcu_barrier() piggyback check counter incremented.
+ *	"Offline": _rcu_barrier() found offline CPU.
+ *	"OnlineQ": _rcu_barrier() found online CPU with callbacks.
+ *	"OnlineNQ": _rcu_barrier() found online CPU, no callbacks.
+ *	"IRQ": An rcu_barrier_callback() callback posted on remote CPU.
+ *	"CB": An rcu_barrier_callback() invoked a callback, not the last.
+ *	"LastCB": An rcu_barrier_callback() invoked the last callback.
+ *	"Inc2": _rcu_barrier() piggyback check counter incremented.
+ * The "cpu" argument is the CPU or -1 if meaningless, the "cnt" argument
+ * is the count of remaining callbacks, and "done" is the piggybacking count.
+ */
+TRACE_EVENT(rcu_barrier,
+
+	TP_PROTO(char *rcuname, char *s, int cpu, int cnt, unsigned long done),
+
+	TP_ARGS(rcuname, s, cpu, cnt, done),
+
+	TP_STRUCT__entry(
+		__field(char *, rcuname)
+		__field(char *, s)
+		__field(int, cpu)
+		__field(int, cnt)
+		__field(unsigned long, done)
+	),
+
+	TP_fast_assign(
+		__entry->rcuname = rcuname;
+		__entry->s = s;
+		__entry->cpu = cpu;
+		__entry->cnt = cnt;
+		__entry->done = done;
+	),
+
+	TP_printk("%s %s cpu %d remaining %d # %lu",
+		  __entry->rcuname, __entry->s, __entry->cpu, __entry->cnt,
+		  __entry->done)
+);
+
 #else /* #ifdef CONFIG_RCU_TRACE */
 
 #define trace_rcu_grace_period(rcuname, gpnum, gpevent) do { } while (0)
@@ -563,6 +607,7 @@ TRACE_EVENT(rcu_torture_read,
 #define trace_rcu_batch_end(rcuname, callbacks_invoked, cb, nr, iit, risk) \
 	do { } while (0)
 #define trace_rcu_torture_read(rcutorturename, rhp) do { } while (0)
+#define trace_rcu_barrier(name, s, cpu, cnt, done) do { } while (0)
 
 #endif /* #else #ifdef CONFIG_RCU_TRACE */
 
diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index 7c299d3..ebd5223 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -2257,6 +2257,17 @@ static int rcu_cpu_has_callbacks(int cpu)
 }
 
 /*
+ * Helper function for _rcu_barrier() tracing.  If tracing is disabled,
+ * the compiler is expected to optimize this away.
+ */
+static void _rcu_barrier_trace(struct rcu_state *rsp, char *s,
+			       int cpu, unsigned long done)
+{
+	trace_rcu_barrier(rsp->name, s, cpu,
+			  atomic_read(&rsp->barrier_cpu_count), done);
+}
+
+/*
  * RCU callback function for _rcu_barrier().  If we are last, wake
  * up the task executing _rcu_barrier().
  */
@@ -2265,8 +2276,12 @@ static void rcu_barrier_callback(struct rcu_head *rhp)
 	struct rcu_data *rdp = container_of(rhp, struct rcu_data, barrier_head);
 	struct rcu_state *rsp = rdp->rsp;
 
-	if (atomic_dec_and_test(&rsp->barrier_cpu_count))
+	if (atomic_dec_and_test(&rsp->barrier_cpu_count)) {
+		_rcu_barrier_trace(rsp, "LastCB", -1, rsp->n_barrier_done);
 		complete(&rsp->barrier_completion);
+	} else {
+		_rcu_barrier_trace(rsp, "CB", -1, rsp->n_barrier_done);
+	}
 }
 
 /*
@@ -2277,6 +2292,7 @@ static void rcu_barrier_func(void *type)
 	struct rcu_state *rsp = type;
 	struct rcu_data *rdp = __this_cpu_ptr(rsp->rda);
 
+	_rcu_barrier_trace(rsp, "IRQ", -1, rsp->n_barrier_done);
 	atomic_inc(&rsp->barrier_cpu_count);
 	rsp->call(&rdp->barrier_head, rcu_barrier_callback);
 }
@@ -2295,6 +2311,7 @@ static void _rcu_barrier(struct rcu_state *rsp)
 	unsigned long snap_done;
 
 	init_rcu_head_on_stack(&rd.barrier_head);
+	_rcu_barrier_trace(rsp, "Begin", -1, snap);
 
 	/* Take mutex to serialize concurrent rcu_barrier() requests. */
 	mutex_lock(&rsp->barrier_mutex);
@@ -2307,7 +2324,9 @@ static void _rcu_barrier(struct rcu_state *rsp)
 
 	/* Recheck ->n_barrier_done to see if others did our work for us. */
 	snap_done = ACCESS_ONCE(rsp->n_barrier_done);
+	_rcu_barrier_trace(rsp, "Check", -1, snap_done);
 	if (ULONG_CMP_GE(snap_done, ((snap + 1) & ~0x1) + 2)) {
+		_rcu_barrier_trace(rsp, "EarlyExit", -1, snap_done);
 		smp_mb();
 		mutex_unlock(&rsp->barrier_mutex);
 		return;
@@ -2316,6 +2335,7 @@ static void _rcu_barrier(struct rcu_state *rsp)
 	/* Increment ->n_barrier_done to avoid duplicate work. */
 	ACCESS_ONCE(rsp->n_barrier_done)++;
 	WARN_ON_ONCE((rsp->n_barrier_done & 0x1) != 1);
+	_rcu_barrier_trace(rsp, "Inc1", -1, rsp->n_barrier_done);
 	smp_mb(); /* Order ->n_barrier_done increment with below mechanism. */
 
 	/*
@@ -2352,13 +2372,19 @@ static void _rcu_barrier(struct rcu_state *rsp)
 		preempt_disable();
 		rdp = per_cpu_ptr(rsp->rda, cpu);
 		if (cpu_is_offline(cpu)) {
+			_rcu_barrier_trace(rsp, "Offline", cpu,
+					   rsp->n_barrier_done);
 			preempt_enable();
 			while (cpu_is_offline(cpu) && ACCESS_ONCE(rdp->qlen))
 				schedule_timeout_interruptible(1);
 		} else if (ACCESS_ONCE(rdp->qlen)) {
+			_rcu_barrier_trace(rsp, "OnlineQ", cpu,
+					   rsp->n_barrier_done);
 			smp_call_function_single(cpu, rcu_barrier_func, rsp, 1);
 			preempt_enable();
 		} else {
+			_rcu_barrier_trace(rsp, "OnlineNQ", cpu,
+					   rsp->n_barrier_done);
 			preempt_enable();
 		}
 	}
@@ -2391,6 +2417,7 @@ static void _rcu_barrier(struct rcu_state *rsp)
 	smp_mb(); /* Keep increment after above mechanism. */
 	ACCESS_ONCE(rsp->n_barrier_done)++;
 	WARN_ON_ONCE((rsp->n_barrier_done & 0x1) != 0);
+	_rcu_barrier_trace(rsp, "Inc2", -1, rsp->n_barrier_done);
 	smp_mb(); /* Keep increment before caller's subsequent code. */
 
 	/* Wait for all rcu_barrier_callback() callbacks to be invoked. */
-- 
1.7.8



* [PATCH tip/core/rcu 11/15] rcu: Add rcu_barrier() statistics to debugfs tracing
  2012-06-15 21:05 ` [PATCH tip/core/rcu 01/15] rcu: Control RCU_FANOUT_LEAF from boot-time parameter Paul E. McKenney
                     ` (8 preceding siblings ...)
  2012-06-15 21:06   ` [PATCH tip/core/rcu 10/15] rcu: Add tracing for _rcu_barrier() Paul E. McKenney
@ 2012-06-15 21:06   ` Paul E. McKenney
  2012-06-15 23:38     ` Josh Triplett
  2012-06-15 21:06   ` [PATCH tip/core/rcu 12/15] rcu: Remove unneeded __rcu_process_callbacks() argument Paul E. McKenney
                     ` (4 subsequent siblings)
  14 siblings, 1 reply; 50+ messages in thread
From: Paul E. McKenney @ 2012-06-15 21:06 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, fweisbec, patches, Paul E. McKenney, Paul E. McKenney

From: "Paul E. McKenney" <paul.mckenney@linaro.org>

This commit adds an rcubarrier file to RCU's debugfs statistical tracing
directory, providing diagnostic information on rcu_barrier().
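
Based on the seq_printf() format strings in this patch, reading the new file
on a TREE_PREEMPT_RCU kernel would produce output along these lines (the
counts shown are illustrative):

	# cat /sys/kernel/debug/rcu/rcubarrier
	rcu_preempt: . bcc: 0 nbd: 12
	rcu_sched: . bcc: 0 nbd: 4
	rcu_bh: . bcc: 0 nbd: 0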

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcutree_trace.c |   39 +++++++++++++++++++++++++++++++++++++++
 1 files changed, 39 insertions(+), 0 deletions(-)

diff --git a/kernel/rcutree_trace.c b/kernel/rcutree_trace.c
index a3556a2..057408b 100644
--- a/kernel/rcutree_trace.c
+++ b/kernel/rcutree_trace.c
@@ -46,6 +46,40 @@
 #define RCU_TREE_NONCORE
 #include "rcutree.h"
 
+static void print_rcubarrier(struct seq_file *m, struct rcu_state *rsp)
+{
+	seq_printf(m, "%c bcc: %d nbd: %lu\n",
+		   rsp->rcu_barrier_in_progress ? 'B' : '.',
+		   atomic_read(&rsp->barrier_cpu_count),
+		   rsp->n_barrier_done);
+}
+
+static int show_rcubarrier(struct seq_file *m, void *unused)
+{
+#ifdef CONFIG_TREE_PREEMPT_RCU
+	seq_puts(m, "rcu_preempt: ");
+	print_rcubarrier(m, &rcu_preempt_state);
+#endif /* #ifdef CONFIG_TREE_PREEMPT_RCU */
+	seq_puts(m, "rcu_sched: ");
+	print_rcubarrier(m, &rcu_sched_state);
+	seq_puts(m, "rcu_bh: ");
+	print_rcubarrier(m, &rcu_bh_state);
+	return 0;
+}
+
+static int rcubarrier_open(struct inode *inode, struct file *file)
+{
+	return single_open(file, show_rcubarrier, NULL);
+}
+
+static const struct file_operations rcubarrier_fops = {
+	.owner = THIS_MODULE,
+	.open = rcubarrier_open,
+	.read = seq_read,
+	.llseek = seq_lseek,
+	.release = single_release,
+};
+
 #ifdef CONFIG_RCU_BOOST
 
 static char convert_kthread_status(unsigned int kthread_status)
@@ -453,6 +487,11 @@ static int __init rcutree_trace_init(void)
 	if (!rcudir)
 		goto free_out;
 
+	retval = debugfs_create_file("rcubarrier", 0444, rcudir,
+						NULL, &rcubarrier_fops);
+	if (!retval)
+		goto free_out;
+
 	retval = debugfs_create_file("rcudata", 0444, rcudir,
 						NULL, &rcudata_fops);
 	if (!retval)
-- 
1.7.8



* [PATCH tip/core/rcu 12/15] rcu: Remove unneeded __rcu_process_callbacks() argument
  2012-06-15 21:05 ` [PATCH tip/core/rcu 01/15] rcu: Control RCU_FANOUT_LEAF from boot-time parameter Paul E. McKenney
                     ` (9 preceding siblings ...)
  2012-06-15 21:06   ` [PATCH tip/core/rcu 11/15] rcu: Add rcu_barrier() statistics to debugfs tracing Paul E. McKenney
@ 2012-06-15 21:06   ` Paul E. McKenney
  2012-06-15 23:37     ` Josh Triplett
  2012-06-15 21:06   ` [PATCH tip/core/rcu 13/15] rcu: Introduce for_each_rcu_flavor() and use it Paul E. McKenney
                     ` (3 subsequent siblings)
  14 siblings, 1 reply; 50+ messages in thread
From: Paul E. McKenney @ 2012-06-15 21:06 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, fweisbec, patches, Paul E. McKenney

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>

With the advent of __this_cpu_ptr(), it is no longer necessary to pass
both the rcu_state and rcu_data structures into __rcu_process_callbacks().
This commit therefore computes the rcu_data pointer from the rcu_state
pointer within __rcu_process_callbacks() so that callers can pass in
only the pointer to the rcu_state structure.  This paves the way for
linking the rcu_state structures together and iterating over them.
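
For reference, and purely as a sketch, the new lookup is conceptually
equivalent to the following open-coded form (the patch itself uses
__this_cpu_ptr(), which avoids the explicit smp_processor_id() call):

	/* Roughly what __this_cpu_ptr(rsp->rda) computes: */
	struct rcu_data *rdp = per_cpu_ptr(rsp->rda, smp_processor_id());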

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcutree.c        |    8 ++++----
 kernel/rcutree_plugin.h |    3 +--
 2 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index ebd5223..bd4e41c 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -1785,9 +1785,10 @@ unlock_fqs_ret:
  * whom the rdp belongs.
  */
 static void
-__rcu_process_callbacks(struct rcu_state *rsp, struct rcu_data *rdp)
+__rcu_process_callbacks(struct rcu_state *rsp)
 {
 	unsigned long flags;
+	struct rcu_data *rdp = __this_cpu_ptr(rsp->rda);
 
 	WARN_ON_ONCE(rdp->beenonline == 0);
 
@@ -1824,9 +1825,8 @@ __rcu_process_callbacks(struct rcu_state *rsp, struct rcu_data *rdp)
 static void rcu_process_callbacks(struct softirq_action *unused)
 {
 	trace_rcu_utilization("Start RCU core");
-	__rcu_process_callbacks(&rcu_sched_state,
-				&__get_cpu_var(rcu_sched_data));
-	__rcu_process_callbacks(&rcu_bh_state, &__get_cpu_var(rcu_bh_data));
+	__rcu_process_callbacks(&rcu_sched_state);
+	__rcu_process_callbacks(&rcu_bh_state);
 	rcu_preempt_process_callbacks();
 	trace_rcu_utilization("End RCU core");
 }
diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
index 6888706..686cb55 100644
--- a/kernel/rcutree_plugin.h
+++ b/kernel/rcutree_plugin.h
@@ -687,8 +687,7 @@ static void rcu_preempt_check_callbacks(int cpu)
  */
 static void rcu_preempt_process_callbacks(void)
 {
-	__rcu_process_callbacks(&rcu_preempt_state,
-				&__get_cpu_var(rcu_preempt_data));
+	__rcu_process_callbacks(&rcu_preempt_state);
 }
 
 #ifdef CONFIG_RCU_BOOST
-- 
1.7.8



* [PATCH tip/core/rcu 13/15] rcu: Introduce for_each_rcu_flavor() and use it
  2012-06-15 21:05 ` [PATCH tip/core/rcu 01/15] rcu: Control RCU_FANOUT_LEAF from boot-time parameter Paul E. McKenney
                     ` (10 preceding siblings ...)
  2012-06-15 21:06   ` [PATCH tip/core/rcu 12/15] rcu: Remove unneeded __rcu_process_callbacks() argument Paul E. McKenney
@ 2012-06-15 21:06   ` Paul E. McKenney
  2012-06-15 23:52     ` Josh Triplett
  2012-06-15 21:06   ` [PATCH tip/core/rcu 14/15] rcu: Use for_each_rcu_flavor() in TREE_RCU tracing Paul E. McKenney
                     ` (2 subsequent siblings)
  14 siblings, 1 reply; 50+ messages in thread
From: Paul E. McKenney @ 2012-06-15 21:06 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, fweisbec, patches, Paul E. McKenney

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>

The arrival of TREE_PREEMPT_RCU some years back included some ugly
code involving either #ifdef or #ifdef'ed wrapper functions to iterate
over all non-SRCU flavors of RCU.  This commit therefore introduces
a for_each_rcu_flavor() iterator over the rcu_state structures for each
flavor of RCU to clean up a bit of the ugliness.
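
A hypothetical use of the new iterator (not taken from this patch)
looks like this:

	struct rcu_state *rsp;

	/* Visit each registered flavor once, e.g. to print its name. */
	for_each_rcu_flavor(rsp)
		pr_info("RCU flavor: %s\n", rsp->name);

Each rcu_state structure is added to the rcu_struct_flavors list by
rcu_init_one(), so the iterator covers exactly the flavors built into
the running kernel.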

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcutree.c        |   53 +++++++++++++---------
 kernel/rcutree.h        |   12 ++---
 kernel/rcutree_plugin.h |  116 -----------------------------------------------
 3 files changed, 36 insertions(+), 145 deletions(-)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index bd4e41c..75ad92a 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -84,6 +84,7 @@ struct rcu_state rcu_bh_state = RCU_STATE_INITIALIZER(rcu_bh, call_rcu_bh);
 DEFINE_PER_CPU(struct rcu_data, rcu_bh_data);
 
 static struct rcu_state *rcu_state;
+LIST_HEAD(rcu_struct_flavors);
 
 /* Increase (but not decrease) the CONFIG_RCU_FANOUT_LEAF at boot time. */
 static int rcu_fanout_leaf = CONFIG_RCU_FANOUT_LEAF;
@@ -859,9 +860,10 @@ static int rcu_panic(struct notifier_block *this, unsigned long ev, void *ptr)
  */
 void rcu_cpu_stall_reset(void)
 {
-	rcu_sched_state.jiffies_stall = jiffies + ULONG_MAX / 2;
-	rcu_bh_state.jiffies_stall = jiffies + ULONG_MAX / 2;
-	rcu_preempt_stall_reset();
+	struct rcu_state *rsp;
+
+	for_each_rcu_flavor(rsp)
+		rsp->jiffies_stall = jiffies + ULONG_MAX / 2;
 }
 
 static struct notifier_block rcu_panic_block = {
@@ -1824,10 +1826,11 @@ __rcu_process_callbacks(struct rcu_state *rsp)
  */
 static void rcu_process_callbacks(struct softirq_action *unused)
 {
+	struct rcu_state *rsp;
+
 	trace_rcu_utilization("Start RCU core");
-	__rcu_process_callbacks(&rcu_sched_state);
-	__rcu_process_callbacks(&rcu_bh_state);
-	rcu_preempt_process_callbacks();
+	for_each_rcu_flavor(rsp)
+		__rcu_process_callbacks(rsp);
 	trace_rcu_utilization("End RCU core");
 }
 
@@ -2238,9 +2241,12 @@ static int __rcu_pending(struct rcu_state *rsp, struct rcu_data *rdp)
  */
 static int rcu_pending(int cpu)
 {
-	return __rcu_pending(&rcu_sched_state, &per_cpu(rcu_sched_data, cpu)) ||
-	       __rcu_pending(&rcu_bh_state, &per_cpu(rcu_bh_data, cpu)) ||
-	       rcu_preempt_pending(cpu);
+	struct rcu_state *rsp;
+
+	for_each_rcu_flavor(rsp)
+		if (__rcu_pending(rsp, per_cpu_ptr(rsp->rda, cpu)))
+			return 1;
+	return 0;
 }
 
 /*
@@ -2250,10 +2256,13 @@ static int rcu_pending(int cpu)
  */
 static int rcu_cpu_has_callbacks(int cpu)
 {
+	struct rcu_state *rsp;
+
 	/* RCU callbacks either ready or pending? */
-	return per_cpu(rcu_sched_data, cpu).nxtlist ||
-	       per_cpu(rcu_bh_data, cpu).nxtlist ||
-	       rcu_preempt_cpu_has_callbacks(cpu);
+	for_each_rcu_flavor(rsp)
+		if (per_cpu_ptr(rsp->rda, cpu)->nxtlist)
+			return 1;
+	return 0;
 }
 
 /*
@@ -2539,9 +2548,10 @@ rcu_init_percpu_data(int cpu, struct rcu_state *rsp, int preemptible)
 
 static void __cpuinit rcu_prepare_cpu(int cpu)
 {
-	rcu_init_percpu_data(cpu, &rcu_sched_state, 0);
-	rcu_init_percpu_data(cpu, &rcu_bh_state, 0);
-	rcu_preempt_init_percpu_data(cpu);
+	struct rcu_state *rsp;
+
+	for_each_rcu_flavor(rsp)
+		rcu_init_percpu_data(cpu, rsp, 0);
 }
 
 /*
@@ -2553,6 +2563,7 @@ static int __cpuinit rcu_cpu_notify(struct notifier_block *self,
 	long cpu = (long)hcpu;
 	struct rcu_data *rdp = per_cpu_ptr(rcu_state->rda, cpu);
 	struct rcu_node *rnp = rdp->mynode;
+	struct rcu_state *rsp;
 
 	trace_rcu_utilization("Start CPU hotplug");
 	switch (action) {
@@ -2577,18 +2588,15 @@ static int __cpuinit rcu_cpu_notify(struct notifier_block *self,
 		 * touch any data without introducing corruption. We send the
 		 * dying CPU's callbacks to an arbitrarily chosen online CPU.
 		 */
-		rcu_cleanup_dying_cpu(&rcu_bh_state);
-		rcu_cleanup_dying_cpu(&rcu_sched_state);
-		rcu_preempt_cleanup_dying_cpu();
-		rcu_cleanup_after_idle(cpu);
+		for_each_rcu_flavor(rsp)
+			rcu_cleanup_dying_cpu(rsp);
 		break;
 	case CPU_DEAD:
 	case CPU_DEAD_FROZEN:
 	case CPU_UP_CANCELED:
 	case CPU_UP_CANCELED_FROZEN:
-		rcu_cleanup_dead_cpu(cpu, &rcu_bh_state);
-		rcu_cleanup_dead_cpu(cpu, &rcu_sched_state);
-		rcu_preempt_cleanup_dead_cpu(cpu);
+		for_each_rcu_flavor(rsp)
+			rcu_cleanup_dead_cpu(cpu, rsp);
 		break;
 	default:
 		break;
@@ -2705,6 +2713,7 @@ static void __init rcu_init_one(struct rcu_state *rsp,
 		per_cpu_ptr(rsp->rda, i)->mynode = rnp;
 		rcu_boot_init_percpu_data(i, rsp);
 	}
+	list_add(&rsp->flavors, &rcu_struct_flavors);
 }
 
 /*
diff --git a/kernel/rcutree.h b/kernel/rcutree.h
index a294f7f..138fb33 100644
--- a/kernel/rcutree.h
+++ b/kernel/rcutree.h
@@ -408,8 +408,13 @@ struct rcu_state {
 	unsigned long gp_max;			/* Maximum GP duration in */
 						/*  jiffies. */
 	char *name;				/* Name of structure. */
+	struct list_head flavors;		/* List of RCU flavors. */
 };
 
+extern struct list_head rcu_struct_flavors;
+#define for_each_rcu_flavor(rsp) \
+	list_for_each_entry((rsp), &rcu_struct_flavors, flavors)
+
 /* Return values for rcu_preempt_offline_tasks(). */
 
 #define RCU_OFL_TASKS_NORM_GP	0x1		/* Tasks blocking normal */
@@ -451,25 +456,18 @@ static void rcu_stop_cpu_kthread(int cpu);
 #endif /* #ifdef CONFIG_HOTPLUG_CPU */
 static void rcu_print_detail_task_stall(struct rcu_state *rsp);
 static int rcu_print_task_stall(struct rcu_node *rnp);
-static void rcu_preempt_stall_reset(void);
 static void rcu_preempt_check_blocked_tasks(struct rcu_node *rnp);
 #ifdef CONFIG_HOTPLUG_CPU
 static int rcu_preempt_offline_tasks(struct rcu_state *rsp,
 				     struct rcu_node *rnp,
 				     struct rcu_data *rdp);
 #endif /* #ifdef CONFIG_HOTPLUG_CPU */
-static void rcu_preempt_cleanup_dead_cpu(int cpu);
 static void rcu_preempt_check_callbacks(int cpu);
-static void rcu_preempt_process_callbacks(void);
 void call_rcu(struct rcu_head *head, void (*func)(struct rcu_head *rcu));
 #if defined(CONFIG_HOTPLUG_CPU) || defined(CONFIG_TREE_PREEMPT_RCU)
 static void rcu_report_exp_rnp(struct rcu_state *rsp, struct rcu_node *rnp,
 			       bool wake);
 #endif /* #if defined(CONFIG_HOTPLUG_CPU) || defined(CONFIG_TREE_PREEMPT_RCU) */
-static int rcu_preempt_pending(int cpu);
-static int rcu_preempt_cpu_has_callbacks(int cpu);
-static void __cpuinit rcu_preempt_init_percpu_data(int cpu);
-static void rcu_preempt_cleanup_dying_cpu(void);
 static void __init __rcu_init_preempt(void);
 static void rcu_initiate_boost(struct rcu_node *rnp, unsigned long flags);
 static void rcu_preempt_boost_start_gp(struct rcu_node *rnp);
diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
index 686cb55..41ce563 100644
--- a/kernel/rcutree_plugin.h
+++ b/kernel/rcutree_plugin.h
@@ -545,16 +545,6 @@ static int rcu_print_task_stall(struct rcu_node *rnp)
 }
 
 /*
- * Suppress preemptible RCU's CPU stall warnings by pushing the
- * time of the next stall-warning message comfortably far into the
- * future.
- */
-static void rcu_preempt_stall_reset(void)
-{
-	rcu_preempt_state.jiffies_stall = jiffies + ULONG_MAX / 2;
-}
-
-/*
  * Check that the list of blocked tasks for the newly completed grace
  * period is in fact empty.  It is a serious bug to complete a grace
  * period that still has RCU readers blocked!  This function must be
@@ -655,14 +645,6 @@ static int rcu_preempt_offline_tasks(struct rcu_state *rsp,
 #endif /* #ifdef CONFIG_HOTPLUG_CPU */
 
 /*
- * Do CPU-offline processing for preemptible RCU.
- */
-static void rcu_preempt_cleanup_dead_cpu(int cpu)
-{
-	rcu_cleanup_dead_cpu(cpu, &rcu_preempt_state);
-}
-
-/*
  * Check for a quiescent state from the current CPU.  When a task blocks,
  * the task is recorded in the corresponding CPU's rcu_node structure,
  * which is checked elsewhere.
@@ -682,14 +664,6 @@ static void rcu_preempt_check_callbacks(int cpu)
 		t->rcu_read_unlock_special |= RCU_READ_UNLOCK_NEED_QS;
 }
 
-/*
- * Process callbacks for preemptible RCU.
- */
-static void rcu_preempt_process_callbacks(void)
-{
-	__rcu_process_callbacks(&rcu_preempt_state);
-}
-
 #ifdef CONFIG_RCU_BOOST
 
 static void rcu_preempt_do_callbacks(void)
@@ -921,24 +895,6 @@ mb_ret:
 }
 EXPORT_SYMBOL_GPL(synchronize_rcu_expedited);
 
-/*
- * Check to see if there is any immediate preemptible-RCU-related work
- * to be done.
- */
-static int rcu_preempt_pending(int cpu)
-{
-	return __rcu_pending(&rcu_preempt_state,
-			     &per_cpu(rcu_preempt_data, cpu));
-}
-
-/*
- * Does preemptible RCU have callbacks on this CPU?
- */
-static int rcu_preempt_cpu_has_callbacks(int cpu)
-{
-	return !!per_cpu(rcu_preempt_data, cpu).nxtlist;
-}
-
 /**
  * rcu_barrier - Wait until all in-flight call_rcu() callbacks complete.
  */
@@ -949,23 +905,6 @@ void rcu_barrier(void)
 EXPORT_SYMBOL_GPL(rcu_barrier);
 
 /*
- * Initialize preemptible RCU's per-CPU data.
- */
-static void __cpuinit rcu_preempt_init_percpu_data(int cpu)
-{
-	rcu_init_percpu_data(cpu, &rcu_preempt_state, 1);
-}
-
-/*
- * Move preemptible RCU's callbacks from dying CPU to other online CPU
- * and record a quiescent state.
- */
-static void rcu_preempt_cleanup_dying_cpu(void)
-{
-	rcu_cleanup_dying_cpu(&rcu_preempt_state);
-}
-
-/*
  * Initialize preemptible RCU's state structures.
  */
 static void __init __rcu_init_preempt(void)
@@ -1042,14 +981,6 @@ static int rcu_print_task_stall(struct rcu_node *rnp)
 }
 
 /*
- * Because preemptible RCU does not exist, there is no need to suppress
- * its CPU stall warnings.
- */
-static void rcu_preempt_stall_reset(void)
-{
-}
-
-/*
  * Because there is no preemptible RCU, there can be no readers blocked,
  * so there is no need to check for blocked tasks.  So check only for
  * bogus qsmask values.
@@ -1077,14 +1008,6 @@ static int rcu_preempt_offline_tasks(struct rcu_state *rsp,
 #endif /* #ifdef CONFIG_HOTPLUG_CPU */
 
 /*
- * Because preemptible RCU does not exist, it never needs CPU-offline
- * processing.
- */
-static void rcu_preempt_cleanup_dead_cpu(int cpu)
-{
-}
-
-/*
  * Because preemptible RCU does not exist, it never has any callbacks
  * to check.
  */
@@ -1093,14 +1016,6 @@ static void rcu_preempt_check_callbacks(int cpu)
 }
 
 /*
- * Because preemptible RCU does not exist, it never has any callbacks
- * to process.
- */
-static void rcu_preempt_process_callbacks(void)
-{
-}
-
-/*
  * Queue an RCU callback for lazy invocation after a grace period.
  * This will likely be later named something like "call_rcu_lazy()",
  * but this change will require some way of tagging the lazy RCU
@@ -1141,22 +1056,6 @@ static void rcu_report_exp_rnp(struct rcu_state *rsp, struct rcu_node *rnp,
 #endif /* #ifdef CONFIG_HOTPLUG_CPU */
 
 /*
- * Because preemptible RCU does not exist, it never has any work to do.
- */
-static int rcu_preempt_pending(int cpu)
-{
-	return 0;
-}
-
-/*
- * Because preemptible RCU does not exist, it never has callbacks
- */
-static int rcu_preempt_cpu_has_callbacks(int cpu)
-{
-	return 0;
-}
-
-/*
  * Because preemptible RCU does not exist, rcu_barrier() is just
  * another name for rcu_barrier_sched().
  */
@@ -1167,21 +1066,6 @@ void rcu_barrier(void)
 EXPORT_SYMBOL_GPL(rcu_barrier);
 
 /*
- * Because preemptible RCU does not exist, there is no per-CPU
- * data to initialize.
- */
-static void __cpuinit rcu_preempt_init_percpu_data(int cpu)
-{
-}
-
-/*
- * Because there is no preemptible RCU, there is no cleanup to do.
- */
-static void rcu_preempt_cleanup_dying_cpu(void)
-{
-}
-
-/*
  * Because preemptible RCU does not exist, it need not be initialized.
  */
 static void __init __rcu_init_preempt(void)
-- 
1.7.8



* [PATCH tip/core/rcu 14/15] rcu: Use for_each_rcu_flavor() in TREE_RCU tracing
  2012-06-15 21:05 ` [PATCH tip/core/rcu 01/15] rcu: Control RCU_FANOUT_LEAF from boot-time parameter Paul E. McKenney
                     ` (11 preceding siblings ...)
  2012-06-15 21:06   ` [PATCH tip/core/rcu 13/15] rcu: Introduce for_each_rcu_flavor() and use it Paul E. McKenney
@ 2012-06-15 21:06   ` Paul E. McKenney
  2012-06-15 23:59     ` Josh Triplett
  2012-06-15 21:06   ` [PATCH tip/core/rcu 15/15] rcu: RCU_SAVE_DYNTICK code no longer ever dead Paul E. McKenney
  2012-06-15 21:43   ` [PATCH tip/core/rcu 01/15] rcu: Control RCU_FANOUT_LEAF from boot-time parameter Josh Triplett
  14 siblings, 1 reply; 50+ messages in thread
From: Paul E. McKenney @ 2012-06-15 21:06 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, fweisbec, patches, Paul E. McKenney

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>

This commit applies the new for_each_rcu_flavor() macro to the
kernel/rcutree_trace.c file.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcutree_trace.c |   95 +++++++++++++++++++-----------------------------
 1 files changed, 38 insertions(+), 57 deletions(-)

diff --git a/kernel/rcutree_trace.c b/kernel/rcutree_trace.c
index 057408b..c618665 100644
--- a/kernel/rcutree_trace.c
+++ b/kernel/rcutree_trace.c
@@ -48,22 +48,18 @@
 
 static void print_rcubarrier(struct seq_file *m, struct rcu_state *rsp)
 {
-	seq_printf(m, "%c bcc: %d nbd: %lu\n",
-		   rsp->rcu_barrier_in_progress ? 'B' : '.',
+	seq_printf(m, "%s: %c bcc: %d nbd: %lu\n",
+		   rsp->name, rsp->rcu_barrier_in_progress ? 'B' : '.',
 		   atomic_read(&rsp->barrier_cpu_count),
 		   rsp->n_barrier_done);
 }
 
 static int show_rcubarrier(struct seq_file *m, void *unused)
 {
-#ifdef CONFIG_TREE_PREEMPT_RCU
-	seq_puts(m, "rcu_preempt: ");
-	print_rcubarrier(m, &rcu_preempt_state);
-#endif /* #ifdef CONFIG_TREE_PREEMPT_RCU */
-	seq_puts(m, "rcu_sched: ");
-	print_rcubarrier(m, &rcu_sched_state);
-	seq_puts(m, "rcu_bh: ");
-	print_rcubarrier(m, &rcu_bh_state);
+	struct rcu_state *rsp;
+
+	for_each_rcu_flavor(rsp)
+		print_rcubarrier(m, rsp);
 	return 0;
 }
 
@@ -129,24 +125,16 @@ static void print_one_rcu_data(struct seq_file *m, struct rcu_data *rdp)
 		   rdp->n_cbs_invoked, rdp->n_cbs_orphaned, rdp->n_cbs_adopted);
 }
 
-#define PRINT_RCU_DATA(name, func, m) \
-	do { \
-		int _p_r_d_i; \
-		\
-		for_each_possible_cpu(_p_r_d_i) \
-			func(m, &per_cpu(name, _p_r_d_i)); \
-	} while (0)
-
 static int show_rcudata(struct seq_file *m, void *unused)
 {
-#ifdef CONFIG_TREE_PREEMPT_RCU
-	seq_puts(m, "rcu_preempt:\n");
-	PRINT_RCU_DATA(rcu_preempt_data, print_one_rcu_data, m);
-#endif /* #ifdef CONFIG_TREE_PREEMPT_RCU */
-	seq_puts(m, "rcu_sched:\n");
-	PRINT_RCU_DATA(rcu_sched_data, print_one_rcu_data, m);
-	seq_puts(m, "rcu_bh:\n");
-	PRINT_RCU_DATA(rcu_bh_data, print_one_rcu_data, m);
+	int cpu;
+	struct rcu_state *rsp;
+
+	for_each_rcu_flavor(rsp) {
+		seq_printf(m, "%s:\n", rsp->name);
+		for_each_possible_cpu(cpu)
+			print_one_rcu_data(m, per_cpu_ptr(rsp->rda, cpu));
+	}
 	return 0;
 }
 
@@ -200,6 +188,9 @@ static void print_one_rcu_data_csv(struct seq_file *m, struct rcu_data *rdp)
 
 static int show_rcudata_csv(struct seq_file *m, void *unused)
 {
+	int cpu;
+	struct rcu_state *rsp;
+
 	seq_puts(m, "\"CPU\",\"Online?\",\"c\",\"g\",\"pq\",\"pgp\",\"pq\",");
 	seq_puts(m, "\"dt\",\"dt nesting\",\"dt NMI nesting\",\"df\",");
 	seq_puts(m, "\"of\",\"qll\",\"ql\",\"qs\"");
@@ -207,14 +198,11 @@ static int show_rcudata_csv(struct seq_file *m, void *unused)
 	seq_puts(m, "\"kt\",\"ktl\"");
 #endif /* #ifdef CONFIG_RCU_BOOST */
 	seq_puts(m, ",\"b\",\"ci\",\"co\",\"ca\"\n");
-#ifdef CONFIG_TREE_PREEMPT_RCU
-	seq_puts(m, "\"rcu_preempt:\"\n");
-	PRINT_RCU_DATA(rcu_preempt_data, print_one_rcu_data_csv, m);
-#endif /* #ifdef CONFIG_TREE_PREEMPT_RCU */
-	seq_puts(m, "\"rcu_sched:\"\n");
-	PRINT_RCU_DATA(rcu_sched_data, print_one_rcu_data_csv, m);
-	seq_puts(m, "\"rcu_bh:\"\n");
-	PRINT_RCU_DATA(rcu_bh_data, print_one_rcu_data_csv, m);
+	for_each_rcu_flavor(rsp) {
+		seq_printf(m, "\"%s:\"\n", rsp->name);
+		for_each_possible_cpu(cpu)
+			print_one_rcu_data_csv(m, per_cpu_ptr(rsp->rda, cpu));
+	}
 	return 0;
 }
 
@@ -304,9 +292,9 @@ static void print_one_rcu_state(struct seq_file *m, struct rcu_state *rsp)
 	struct rcu_node *rnp;
 
 	gpnum = rsp->gpnum;
-	seq_printf(m, "c=%lu g=%lu s=%d jfq=%ld j=%x "
+	seq_printf(m, "%s: c=%lu g=%lu s=%d jfq=%ld j=%x "
 		      "nfqs=%lu/nfqsng=%lu(%lu) fqlh=%lu oqlen=%ld/%ld\n",
-		   rsp->completed, gpnum, rsp->fqs_state,
+		   rsp->name, rsp->completed, gpnum, rsp->fqs_state,
 		   (long)(rsp->jiffies_force_qs - jiffies),
 		   (int)(jiffies & 0xffff),
 		   rsp->n_force_qs, rsp->n_force_qs_ngp,
@@ -329,14 +317,10 @@ static void print_one_rcu_state(struct seq_file *m, struct rcu_state *rsp)
 
 static int show_rcuhier(struct seq_file *m, void *unused)
 {
-#ifdef CONFIG_TREE_PREEMPT_RCU
-	seq_puts(m, "rcu_preempt:\n");
-	print_one_rcu_state(m, &rcu_preempt_state);
-#endif /* #ifdef CONFIG_TREE_PREEMPT_RCU */
-	seq_puts(m, "rcu_sched:\n");
-	print_one_rcu_state(m, &rcu_sched_state);
-	seq_puts(m, "rcu_bh:\n");
-	print_one_rcu_state(m, &rcu_bh_state);
+	struct rcu_state *rsp;
+
+	for_each_rcu_flavor(rsp)
+		print_one_rcu_state(m, rsp);
 	return 0;
 }
 
@@ -377,11 +361,10 @@ static void show_one_rcugp(struct seq_file *m, struct rcu_state *rsp)
 
 static int show_rcugp(struct seq_file *m, void *unused)
 {
-#ifdef CONFIG_TREE_PREEMPT_RCU
-	show_one_rcugp(m, &rcu_preempt_state);
-#endif /* #ifdef CONFIG_TREE_PREEMPT_RCU */
-	show_one_rcugp(m, &rcu_sched_state);
-	show_one_rcugp(m, &rcu_bh_state);
+	struct rcu_state *rsp;
+
+	for_each_rcu_flavor(rsp)
+		show_one_rcugp(m, rsp);
 	return 0;
 }
 
@@ -430,14 +413,12 @@ static void print_rcu_pendings(struct seq_file *m, struct rcu_state *rsp)
 
 static int show_rcu_pending(struct seq_file *m, void *unused)
 {
-#ifdef CONFIG_TREE_PREEMPT_RCU
-	seq_puts(m, "rcu_preempt:\n");
-	print_rcu_pendings(m, &rcu_preempt_state);
-#endif /* #ifdef CONFIG_TREE_PREEMPT_RCU */
-	seq_puts(m, "rcu_sched:\n");
-	print_rcu_pendings(m, &rcu_sched_state);
-	seq_puts(m, "rcu_bh:\n");
-	print_rcu_pendings(m, &rcu_bh_state);
+	struct rcu_state *rsp;
+
+	for_each_rcu_flavor(rsp) {
+		seq_printf(m, "%s:\n", rsp->name);
+		print_rcu_pendings(m, rsp);
+	}
 	return 0;
 }
 
-- 
1.7.8



* [PATCH tip/core/rcu 15/15] rcu: RCU_SAVE_DYNTICK code no longer ever dead
  2012-06-15 21:05 ` [PATCH tip/core/rcu 01/15] rcu: Control RCU_FANOUT_LEAF from boot-time parameter Paul E. McKenney
                     ` (12 preceding siblings ...)
  2012-06-15 21:06   ` [PATCH tip/core/rcu 14/15] rcu: Use for_each_rcu_flavor() in TREE_RCU tracing Paul E. McKenney
@ 2012-06-15 21:06   ` Paul E. McKenney
  2012-06-16  0:02     ` Josh Triplett
  2012-06-15 21:43   ` [PATCH tip/core/rcu 01/15] rcu: Control RCU_FANOUT_LEAF from boot-time parameter Josh Triplett
  14 siblings, 1 reply; 50+ messages in thread
From: Paul E. McKenney @ 2012-06-15 21:06 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, fweisbec, patches, Paul E. McKenney

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>

Before RCU had unified idle, the RCU_SAVE_DYNTICK leg of the switch
statement in force_quiescent_state() was dead code for CONFIG_NO_HZ=n
kernel builds.  With unified idle, the code is never dead.  This commit
therefore removes the "if" statement designed to make gcc aware of when
the code was and was not dead.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcutree.c |    2 --
 1 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index 75ad92a..0b0c9cc 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -1744,8 +1744,6 @@ static void force_quiescent_state(struct rcu_state *rsp, int relaxed)
 		break; /* grace period idle or initializing, ignore. */
 
 	case RCU_SAVE_DYNTICK:
-		if (RCU_SIGNAL_INIT != RCU_SAVE_DYNTICK)
-			break; /* So gcc recognizes the dead code. */
 
 		raw_spin_unlock(&rnp->lock);  /* irqs remain disabled */
 
-- 
1.7.8



* Re: [PATCH tip/core/rcu 01/15] rcu: Control RCU_FANOUT_LEAF from boot-time parameter
  2012-06-15 21:05 ` [PATCH tip/core/rcu 01/15] rcu: Control RCU_FANOUT_LEAF from boot-time parameter Paul E. McKenney
                     ` (13 preceding siblings ...)
  2012-06-15 21:06   ` [PATCH tip/core/rcu 15/15] rcu: RCU_SAVE_DYNTICK code no longer ever dead Paul E. McKenney
@ 2012-06-15 21:43   ` Josh Triplett
  2012-06-15 22:10     ` Paul E. McKenney
  14 siblings, 1 reply; 50+ messages in thread
From: Josh Triplett @ 2012-06-15 21:43 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
	eric.dumazet, darren, fweisbec, patches

On Fri, Jun 15, 2012 at 02:05:56PM -0700, Paul E. McKenney wrote:
> From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> 
> Although making RCU_FANOUT_LEAF a kernel configuration parameter rather
> than a fixed constant makes it easier for people to decrease cache-miss
> overhead for large systems, it is of little help for people who must
> run a single pre-built kernel binary.
> 
> This commit therefore allows the value of RCU_FANOUT_LEAF to be
> increased (but not decreased!) via a boot-time parameter named
> rcutree.rcu_fanout_leaf.
> 
> Reported-by: Mike Galbraith <efault@gmx.de>
> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> ---
>  Documentation/kernel-parameters.txt |    4 ++
>  kernel/rcutree.c                    |   97 ++++++++++++++++++++++++++++++-----
>  kernel/rcutree.h                    |   23 +++++----
>  kernel/rcutree_plugin.h             |    4 +-
>  kernel/rcutree_trace.c              |    2 +-
>  5 files changed, 104 insertions(+), 26 deletions(-)
> 
> diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
> index c45513d..88bd3ef 100644
> --- a/Documentation/kernel-parameters.txt
> +++ b/Documentation/kernel-parameters.txt
> @@ -2367,6 +2367,10 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
>  			Set maximum number of finished RCU callbacks to process
>  			in one batch.
>  
> +	rcutree.fanout_leaf=	[KNL,BOOT]
> +			Set maximum number of finished RCU callbacks to process
> +			in one batch.

Copy-paste problem.

>  	rcutree.qhimark=	[KNL,BOOT]
>  			Set threshold of queued
>  			RCU callbacks over which batch limiting is disabled.
> diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> index 0da7b88..a151184 100644
> --- a/kernel/rcutree.c
> +++ b/kernel/rcutree.c
> @@ -60,17 +60,10 @@
>  
>  /* Data structures. */
>  
> -static struct lock_class_key rcu_node_class[NUM_RCU_LVLS];
> +static struct lock_class_key rcu_node_class[RCU_NUM_LVLS];

I assume that the requirement to only increase the fanout and never
decrease it comes from the desire to not increase the sizes of all of
these arrays to MAX_RCU_LVLS?

> +/*
> + * Compute the rcu_node tree geometry from kernel parameters.  This cannot
> + * replace the definitions in rcutree.h because those are needed to size
> + * the ->node array in the rcu_state structure.
> + */
> +static void __init rcu_init_geometry(void)
> +{
> +	int i;
> +	int j;
> +	int n = NR_CPUS;
> +	int rcu_capacity[MAX_RCU_LVLS + 1];
> +
> +	/* If the compile-time values are accurate, just leave. */
> +	if (rcu_fanout_leaf == CONFIG_RCU_FANOUT_LEAF)
> +		return;
> +
> +	/*
> +	 * Compute number of nodes that can be handled an rcu_node tree
> +	 * with the given number of levels.  Setting rcu_capacity[0] makes
> +	 * some of the arithmetic easier.
> +	 */
> +	rcu_capacity[0] = 1;
> +	rcu_capacity[1] = rcu_fanout_leaf;
> +	for (i = 2; i <= MAX_RCU_LVLS; i++)
> +		rcu_capacity[i] = rcu_capacity[i - 1] * CONFIG_RCU_FANOUT;
> +
> +	/*
> +	 * The boot-time rcu_fanout_leaf parameter is only permitted
> +	 * to increase the leaf-level fanout, not decrease it.  Of course,
> +	 * the leaf-level fanout cannot exceed the number of bits in
> +	 * the rcu_node masks.  Finally, the tree must be able to accommodate
> +	 * the configured number of CPUs.  Complain and fall back to the
> +	 * compile-timer values if these limits are exceeded.

Typo: s/timer/time/

> +	 */
> +	if (rcu_fanout_leaf < CONFIG_RCU_FANOUT_LEAF ||
> +	    rcu_fanout_leaf > sizeof(unsigned long) * 8 ||
> +	    n > rcu_capacity[4]) {

4 seems like a magic number here; did you mean MAX_RCU_LVLS or similar?

Also, why have n as a variable when it never changes?

> --- a/kernel/rcutree.h
> +++ b/kernel/rcutree.h
> @@ -42,28 +42,28 @@
>  #define RCU_FANOUT_4	      (RCU_FANOUT_3 * CONFIG_RCU_FANOUT)
>  
>  #if NR_CPUS <= RCU_FANOUT_1
> -#  define NUM_RCU_LVLS	      1
> +#  define RCU_NUM_LVLS	      1

I assume you made this change to make it easier to track down all the
uses of the macro to change them; however, having now done so, the
change itself seems rather gratuitous, and inconsistent with the other
macros.  Would you consider changing it back?

> +extern int rcu_num_lvls;
> +extern int rcu_num_nodes;

Given the above, you might also want to change these for consistency.

Also, have you checked the various loops using these variables to figure
out if GCC emits less optimal code now that it can't rely on a
compile-time constant?  I don't expect it to make much of a difference,
but it seems worth checking.

You might also consider marking these as __read_mostly, at a minimum.
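
For instance, something along these lines (just a sketch of the
suggested annotation; I am guessing at the initializers):

	int rcu_num_lvls __read_mostly = RCU_NUM_LVLS;
	int rcu_num_nodes __read_mostly = NUM_RCU_NODES;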

> diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
> index 2411000..e9b44c3 100644
> --- a/kernel/rcutree_plugin.h
> +++ b/kernel/rcutree_plugin.h
> @@ -68,8 +68,10 @@ static void __init rcu_bootup_announce_oddness(void)
>  	printk(KERN_INFO "\tAdditional per-CPU info printed with stalls.\n");
>  #endif
>  #if NUM_RCU_LVL_4 != 0
> -	printk(KERN_INFO "\tExperimental four-level hierarchy is enabled.\n");
> +	printk(KERN_INFO "\tFour-level hierarchy is enabled.\n");

This change seems entirely unrelated to this patch.  Seems simple enough
to split it into a separate one-line patch ("Mark four-level hierarchy
as no longer experimental").

>  #endif
> +	if (rcu_fanout_leaf != CONFIG_RCU_FANOUT_LEAF)
> +		printk(KERN_INFO "\tExperimental boot-time adjustment of leaf fanout.\n");

You might consider printing rcu_fanout_leaf in this message.

- Josh Triplett


* Re: [PATCH tip/core/rcu 02/15] rcu: Size rcu_node tree from nr_cpu_ids rather than NR_CPUS
  2012-06-15 21:05   ` [PATCH tip/core/rcu 02/15] rcu: Size rcu_node tree from nr_cpu_ids rather than NR_CPUS Paul E. McKenney
@ 2012-06-15 21:47     ` Josh Triplett
  2012-06-16  0:37       ` Paul E. McKenney
  0 siblings, 1 reply; 50+ messages in thread
From: Josh Triplett @ 2012-06-15 21:47 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
	eric.dumazet, darren, fweisbec, patches

On Fri, Jun 15, 2012 at 02:05:57PM -0700, Paul E. McKenney wrote:
> From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> 
> The rcu_node tree array is sized based on compile-time constants,
> including NR_CPUS.  Although this approach has worked well in the past,
> the recent trend by many distros to define NR_CPUS=4096 results in
> excessive grace-period-initialization latencies.
> 
> This commit therefore substitutes the run-time computed nr_cpu_ids for
> the compile-time NR_CPUS when building the tree.  This can result in
> much of the compile-time-allocated rcu_node array being unused.  If
> this is a major problem, you are in a specialized situation anyway,
> so you can manually adjust the NR_CPUS, RCU_FANOUT, and RCU_FANOUT_LEAF
> kernel config parameters.
> 
> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> ---
>  kernel/rcutree.c        |    2 +-
>  kernel/rcutree_plugin.h |    2 ++
>  2 files changed, 3 insertions(+), 1 deletions(-)
> 
> diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> index a151184..9098910 100644
> --- a/kernel/rcutree.c
> +++ b/kernel/rcutree.c
> @@ -2672,7 +2672,7 @@ static void __init rcu_init_geometry(void)
>  {
>  	int i;
>  	int j;
> -	int n = NR_CPUS;
> +	int n = nr_cpu_ids;

Same question as before: why have this as a variable when it never
changes?

- Josh Triplett


* Re: [PATCH tip/core/rcu 03/15] rcu: Prevent excessive line length in RCU_STATE_INITIALIZER()
  2012-06-15 21:05   ` [PATCH tip/core/rcu 03/15] rcu: Prevent excessive line length in RCU_STATE_INITIALIZER() Paul E. McKenney
@ 2012-06-15 21:48     ` Josh Triplett
  0 siblings, 0 replies; 50+ messages in thread
From: Josh Triplett @ 2012-06-15 21:48 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
	eric.dumazet, darren, fweisbec, patches, Paul E. McKenney

On Fri, Jun 15, 2012 at 02:05:58PM -0700, Paul E. McKenney wrote:
> From: "Paul E. McKenney" <paul.mckenney@linaro.org>
> 
> Upcoming rcu_barrier() concurrency commits will result in line lengths
> greater than 80 characters in the RCU_STATE_INITIALIZER(), so this commit
> shortens the name of the macro's argument to prevent this.
> 
> Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

Reviewed-by: Josh Triplett <josh@joshtriplett.org>

>  kernel/rcutree.c |   14 +++++++-------
>  1 files changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> index 9098910..8ce1b1d 100644
> --- a/kernel/rcutree.c
> +++ b/kernel/rcutree.c
> @@ -62,18 +62,18 @@
>  
>  static struct lock_class_key rcu_node_class[RCU_NUM_LVLS];
>  
> -#define RCU_STATE_INITIALIZER(structname) { \
> -	.level = { &structname##_state.node[0] }, \
> +#define RCU_STATE_INITIALIZER(sname) { \
> +	.level = { &sname##_state.node[0] }, \
>  	.fqs_state = RCU_GP_IDLE, \
>  	.gpnum = -300, \
>  	.completed = -300, \
> -	.onofflock = __RAW_SPIN_LOCK_UNLOCKED(&structname##_state.onofflock), \
> -	.orphan_nxttail = &structname##_state.orphan_nxtlist, \
> -	.orphan_donetail = &structname##_state.orphan_donelist, \
> -	.fqslock = __RAW_SPIN_LOCK_UNLOCKED(&structname##_state.fqslock), \
> +	.onofflock = __RAW_SPIN_LOCK_UNLOCKED(&sname##_state.onofflock), \
> +	.orphan_nxttail = &sname##_state.orphan_nxtlist, \
> +	.orphan_donetail = &sname##_state.orphan_donelist, \
> +	.fqslock = __RAW_SPIN_LOCK_UNLOCKED(&sname##_state.fqslock), \
>  	.n_force_qs = 0, \
>  	.n_force_qs_ngp = 0, \
> -	.name = #structname, \
> +	.name = #sname, \
>  }
>  
>  struct rcu_state rcu_sched_state = RCU_STATE_INITIALIZER(rcu_sched);
> -- 
> 1.7.8
> 


* Re: [PATCH tip/core/rcu 04/15] rcu: Place pointer to call_rcu() in rcu_data structure
  2012-06-15 21:05   ` [PATCH tip/core/rcu 04/15] rcu: Place pointer to call_rcu() in rcu_data structure Paul E. McKenney
@ 2012-06-15 22:08     ` Josh Triplett
  0 siblings, 0 replies; 50+ messages in thread
From: Josh Triplett @ 2012-06-15 22:08 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
	eric.dumazet, darren, fweisbec, patches, Paul E. McKenney

On Fri, Jun 15, 2012 at 02:05:59PM -0700, Paul E. McKenney wrote:
> From: "Paul E. McKenney" <paul.mckenney@linaro.org>
> 
> This is a preparatory commit for increasing rcu_barrier()'s concurrency.
> It adds a pointer in the rcu_data structure to the corresponding call_rcu()
> function.  This allows a pointer to the rcu_data structure to imply the
> function pointer, which allows _rcu_barrier() state to be placed in the
> rcu_state structure.
> 
> Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

Reviewed-by: Josh Triplett <josh@joshtriplett.org>

>  kernel/rcutree.c        |   27 ++++++++++++---------------
>  kernel/rcutree.h        |    2 ++
>  kernel/rcutree_plugin.h |    5 +++--
>  3 files changed, 17 insertions(+), 17 deletions(-)
> 
> diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> index 8ce1b1d..8b3ab4e 100644
> --- a/kernel/rcutree.c
> +++ b/kernel/rcutree.c
> @@ -62,8 +62,9 @@
>  
>  static struct lock_class_key rcu_node_class[RCU_NUM_LVLS];
>  
> -#define RCU_STATE_INITIALIZER(sname) { \
> +#define RCU_STATE_INITIALIZER(sname, cr) { \
>  	.level = { &sname##_state.node[0] }, \
> +	.call = cr, \
>  	.fqs_state = RCU_GP_IDLE, \
>  	.gpnum = -300, \
>  	.completed = -300, \
> @@ -76,10 +77,11 @@ static struct lock_class_key rcu_node_class[RCU_NUM_LVLS];
>  	.name = #sname, \
>  }
>  
> -struct rcu_state rcu_sched_state = RCU_STATE_INITIALIZER(rcu_sched);
> +struct rcu_state rcu_sched_state =
> +	RCU_STATE_INITIALIZER(rcu_sched, call_rcu_sched);
>  DEFINE_PER_CPU(struct rcu_data, rcu_sched_data);
>  
> -struct rcu_state rcu_bh_state = RCU_STATE_INITIALIZER(rcu_bh);
> +struct rcu_state rcu_bh_state = RCU_STATE_INITIALIZER(rcu_bh, call_rcu_bh);
>  DEFINE_PER_CPU(struct rcu_data, rcu_bh_data);
>  
>  static struct rcu_state *rcu_state;
> @@ -2279,21 +2281,17 @@ static void rcu_barrier_func(void *type)
>  {
>  	int cpu = smp_processor_id();
>  	struct rcu_head *head = &per_cpu(rcu_barrier_head, cpu);
> -	void (*call_rcu_func)(struct rcu_head *head,
> -			      void (*func)(struct rcu_head *head));
> +	struct rcu_state *rsp = type;
>  
>  	atomic_inc(&rcu_barrier_cpu_count);
> -	call_rcu_func = type;
> -	call_rcu_func(head, rcu_barrier_callback);
> +	rsp->call(head, rcu_barrier_callback);
>  }
>  
>  /*
>   * Orchestrate the specified type of RCU barrier, waiting for all
>   * RCU callbacks of the specified type to complete.
>   */
> -static void _rcu_barrier(struct rcu_state *rsp,
> -			 void (*call_rcu_func)(struct rcu_head *head,
> -					       void (*func)(struct rcu_head *head)))
> +static void _rcu_barrier(struct rcu_state *rsp)
>  {
>  	int cpu;
>  	unsigned long flags;
> @@ -2345,8 +2343,7 @@ static void _rcu_barrier(struct rcu_state *rsp,
>  			while (cpu_is_offline(cpu) && ACCESS_ONCE(rdp->qlen))
>  				schedule_timeout_interruptible(1);
>  		} else if (ACCESS_ONCE(rdp->qlen)) {
> -			smp_call_function_single(cpu, rcu_barrier_func,
> -						 (void *)call_rcu_func, 1);
> +			smp_call_function_single(cpu, rcu_barrier_func, rsp, 1);
>  			preempt_enable();
>  		} else {
>  			preempt_enable();
> @@ -2367,7 +2364,7 @@ static void _rcu_barrier(struct rcu_state *rsp,
>  	raw_spin_unlock_irqrestore(&rsp->onofflock, flags);
>  	atomic_inc(&rcu_barrier_cpu_count);
>  	smp_mb__after_atomic_inc(); /* Ensure atomic_inc() before callback. */
> -	call_rcu_func(&rh, rcu_barrier_callback);
> +	rsp->call(&rh, rcu_barrier_callback);
>  
>  	/*
>  	 * Now that we have an rcu_barrier_callback() callback on each
> @@ -2390,7 +2387,7 @@ static void _rcu_barrier(struct rcu_state *rsp,
>   */
>  void rcu_barrier_bh(void)
>  {
> -	_rcu_barrier(&rcu_bh_state, call_rcu_bh);
> +	_rcu_barrier(&rcu_bh_state);
>  }
>  EXPORT_SYMBOL_GPL(rcu_barrier_bh);
>  
> @@ -2399,7 +2396,7 @@ EXPORT_SYMBOL_GPL(rcu_barrier_bh);
>   */
>  void rcu_barrier_sched(void)
>  {
> -	_rcu_barrier(&rcu_sched_state, call_rcu_sched);
> +	_rcu_barrier(&rcu_sched_state);
>  }
>  EXPORT_SYMBOL_GPL(rcu_barrier_sched);
>  
> diff --git a/kernel/rcutree.h b/kernel/rcutree.h
> index df3c2c8..15837d7 100644
> --- a/kernel/rcutree.h
> +++ b/kernel/rcutree.h
> @@ -350,6 +350,8 @@ struct rcu_state {
>  	u32 levelcnt[MAX_RCU_LVLS + 1];		/* # nodes in each level. */
>  	u8 levelspread[RCU_NUM_LVLS];		/* kids/node in each level. */
>  	struct rcu_data __percpu *rda;		/* pointer of percu rcu_data. */
> +	void (*call)(struct rcu_head *head,	/* call_rcu() flavor. */
> +		     void (*func)(struct rcu_head *head));
>  
>  	/* The following fields are guarded by the root rcu_node's lock. */
>  
> diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
> index 7cb86ae..6888706 100644
> --- a/kernel/rcutree_plugin.h
> +++ b/kernel/rcutree_plugin.h
> @@ -78,7 +78,8 @@ static void __init rcu_bootup_announce_oddness(void)
>  
>  #ifdef CONFIG_TREE_PREEMPT_RCU
>  
> -struct rcu_state rcu_preempt_state = RCU_STATE_INITIALIZER(rcu_preempt);
> +struct rcu_state rcu_preempt_state =
> +	RCU_STATE_INITIALIZER(rcu_preempt, call_rcu);
>  DEFINE_PER_CPU(struct rcu_data, rcu_preempt_data);
>  static struct rcu_state *rcu_state = &rcu_preempt_state;
>  
> @@ -944,7 +945,7 @@ static int rcu_preempt_cpu_has_callbacks(int cpu)
>   */
>  void rcu_barrier(void)
>  {
> -	_rcu_barrier(&rcu_preempt_state, call_rcu);
> +	_rcu_barrier(&rcu_preempt_state);
>  }
>  EXPORT_SYMBOL_GPL(rcu_barrier);
>  
> -- 
> 1.7.8
> 


* Re: [PATCH tip/core/rcu 01/15] rcu: Control RCU_FANOUT_LEAF from boot-time parameter
  2012-06-15 21:43   ` [PATCH tip/core/rcu 01/15] rcu: Control RCU_FANOUT_LEAF from boot-time parameter Josh Triplett
@ 2012-06-15 22:10     ` Paul E. McKenney
  0 siblings, 0 replies; 50+ messages in thread
From: Paul E. McKenney @ 2012-06-15 22:10 UTC (permalink / raw)
  To: Josh Triplett
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
	eric.dumazet, darren, fweisbec, patches

On Fri, Jun 15, 2012 at 02:43:09PM -0700, Josh Triplett wrote:
> On Fri, Jun 15, 2012 at 02:05:56PM -0700, Paul E. McKenney wrote:
> > From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> > 
> > Although making RCU_FANOUT_LEAF a kernel configuration parameter rather
> > than a fixed constant makes it easier for people to decrease cache-miss
> > overhead for large systems, it is of little help for people who must
> > run a single pre-built kernel binary.
> > 
> > This commit therefore allows the value of RCU_FANOUT_LEAF to be
> > increased (but not decreased!) via a boot-time parameter named
> > rcutree.rcu_fanout_leaf.
> > 
> > Reported-by: Mike Galbraith <efault@gmx.de>
> > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > ---
> >  Documentation/kernel-parameters.txt |    4 ++
> >  kernel/rcutree.c                    |   97 ++++++++++++++++++++++++++++++-----
> >  kernel/rcutree.h                    |   23 +++++----
> >  kernel/rcutree_plugin.h             |    4 +-
> >  kernel/rcutree_trace.c              |    2 +-
> >  5 files changed, 104 insertions(+), 26 deletions(-)
> > 
> > diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
> > index c45513d..88bd3ef 100644
> > --- a/Documentation/kernel-parameters.txt
> > +++ b/Documentation/kernel-parameters.txt
> > @@ -2367,6 +2367,10 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
> >  			Set maximum number of finished RCU callbacks to process
> >  			in one batch.
> >  
> > +	rcutree.fanout_leaf=	[KNL,BOOT]
> > +			Set maximum number of finished RCU callbacks to process
> > +			in one batch.
> 
> Copy-paste problem.

Indeed!  Good catch!

> >  	rcutree.qhimark=	[KNL,BOOT]
> >  			Set threshold of queued
> >  			RCU callbacks over which batch limiting is disabled.
> > diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> > index 0da7b88..a151184 100644
> > --- a/kernel/rcutree.c
> > +++ b/kernel/rcutree.c
> > @@ -60,17 +60,10 @@
> >  
> >  /* Data structures. */
> >  
> > -static struct lock_class_key rcu_node_class[NUM_RCU_LVLS];
> > +static struct lock_class_key rcu_node_class[RCU_NUM_LVLS];
> 
> I assume that the requirement to only increase the fanout and never
> decrease it comes from the desire to not increase the sizes of all of
> these arrays to MAX_RCU_LVLS?

Actually, it is the node[] array in the rcu_state structure that is
of concern.

> > +/*
> > + * Compute the rcu_node tree geometry from kernel parameters.  This cannot
> > + * replace the definitions in rcutree.h because those are needed to size
> > + * the ->node array in the rcu_state structure.
> > + */
> > +static void __init rcu_init_geometry(void)
> > +{
> > +	int i;
> > +	int j;
> > +	int n = NR_CPUS;
> > +	int rcu_capacity[MAX_RCU_LVLS + 1];
> > +
> > +	/* If the compile-time values are accurate, just leave. */
> > +	if (rcu_fanout_leaf == CONFIG_RCU_FANOUT_LEAF)
> > +		return;
> > +
> > +	/*
> > +	 * Compute number of nodes that can be handled an rcu_node tree
> > +	 * with the given number of levels.  Setting rcu_capacity[0] makes
> > +	 * some of the arithmetic easier.
> > +	 */
> > +	rcu_capacity[0] = 1;
> > +	rcu_capacity[1] = rcu_fanout_leaf;
> > +	for (i = 2; i <= MAX_RCU_LVLS; i++)
> > +		rcu_capacity[i] = rcu_capacity[i - 1] * CONFIG_RCU_FANOUT;
> > +
> > +	/*
> > +	 * The boot-time rcu_fanout_leaf parameter is only permitted
> > +	 * to increase the leaf-level fanout, not decrease it.  Of course,
> > +	 * the leaf-level fanout cannot exceed the number of bits in
> > +	 * the rcu_node masks.  Finally, the tree must be able to accommodate
> > +	 * the configured number of CPUs.  Complain and fall back to the
> > +	 * compile-timer values if these limits are exceeded.
> 
> Typo: s/timer/time/

Good catch!

> > +	 */
> > +	if (rcu_fanout_leaf < CONFIG_RCU_FANOUT_LEAF ||
> > +	    rcu_fanout_leaf > sizeof(unsigned long) * 8 ||
> > +	    n > rcu_capacity[4]) {
> 
> 4 seems like a magic number here; did you mean MAX_RCU_LVLS or similar?

I believe so, good catch!  That would have been painful if another
level of the hierarchy were needed...
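
Presumably the check then becomes something like the following, with
the exact form subject to testing:

	if (rcu_fanout_leaf < CONFIG_RCU_FANOUT_LEAF ||
	    rcu_fanout_leaf > sizeof(unsigned long) * 8 ||
	    n > rcu_capacity[MAX_RCU_LVLS]) {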

> Also, why have n as a variable when it never changes?

Will propagate the value unless I can come up with a good reason.  ;-)

> > --- a/kernel/rcutree.h
> > +++ b/kernel/rcutree.h
> > @@ -42,28 +42,28 @@
> >  #define RCU_FANOUT_4	      (RCU_FANOUT_3 * CONFIG_RCU_FANOUT)
> >  
> >  #if NR_CPUS <= RCU_FANOUT_1
> > -#  define NUM_RCU_LVLS	      1
> > +#  define RCU_NUM_LVLS	      1
> 
> I assume you made this change to make it easier to track down all the
> uses of the macro to change them; however, having now done so, the
> change itself seems rather gratuitous, and inconsistent with the other
> macros.  Would you consider changing it back?
> 
> > +extern int rcu_num_lvls;
> > +extern int rcu_num_nodes;
> 
> Given the above, you might also want to change these for consistency.

This might make sense.  If I run into too many conflicts, I may defer
the change to the join of the topic trees.

> Also, have you checked the various loops using these variables to figure
> out if GCC emits less optimal code now that it can't rely on a
> compile-time constant?  I don't expect it to make much of a difference,
> but it seems worth checking.

I am sure that it generates worse code, but the uses are on slowpaths,
so I really am not worried about it.

> You might also consider marking these as __read_mostly, at a minimum.

This sounds quite sensible, will do.

> > diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
> > index 2411000..e9b44c3 100644
> > --- a/kernel/rcutree_plugin.h
> > +++ b/kernel/rcutree_plugin.h
> > @@ -68,8 +68,10 @@ static void __init rcu_bootup_announce_oddness(void)
> >  	printk(KERN_INFO "\tAdditional per-CPU info printed with stalls.\n");
> >  #endif
> >  #if NUM_RCU_LVL_4 != 0
> > -	printk(KERN_INFO "\tExperimental four-level hierarchy is enabled.\n");
> > +	printk(KERN_INFO "\tFour-level hierarchy is enabled.\n");
> 
> This change seems entirely unrelated to this patch.  Seems simple enough
> to split it into a separate one-line patch ("Mark four-level hierarchy
> as no longer experimental").

Can't see why not...

> >  #endif
> > +	if (rcu_fanout_leaf != CONFIG_RCU_FANOUT_LEAF)
> > +		printk(KERN_INFO "\tExperimental boot-time adjustment of leaf fanout.\n");
> 
> You might consider printing rcu_fanout_leaf in this message.

Ouch!  Good point, will fix.
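
Something like this, presumably (exact wording still to be decided):

	if (rcu_fanout_leaf != CONFIG_RCU_FANOUT_LEAF)
		printk(KERN_INFO
		       "\tBoot-time adjustment of leaf fanout to %d.\n",
		       rcu_fanout_leaf);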

							Thanx, Paul



* Re: [PATCH tip/core/rcu 05/15] rcu: Move _rcu_barrier()'s rcu_head structures to rcu_data structures
  2012-06-15 21:06   ` [PATCH tip/core/rcu 05/15] rcu: Move _rcu_barrier()'s rcu_head structures to rcu_data structures Paul E. McKenney
@ 2012-06-15 22:19     ` Josh Triplett
  0 siblings, 0 replies; 50+ messages in thread
From: Josh Triplett @ 2012-06-15 22:19 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
	eric.dumazet, darren, fweisbec, patches, Paul E. McKenney

On Fri, Jun 15, 2012 at 02:06:00PM -0700, Paul E. McKenney wrote:
> From: "Paul E. McKenney" <paul.mckenney@linaro.org>
> 
> In order for multiple flavors of RCU to each concurrently run one
> rcu_barrier(), each flavor needs its own per-CPU set of rcu_head
> structures.  This commit therefore moves _rcu_barrier()'s set of
> per-CPU rcu_head structures from per-CPU variables to the existing
> per-CPU and per-RCU-flavor rcu_data structures.
> 
> Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

Reviewed-by: Josh Triplett <josh@joshtriplett.org>

>  kernel/rcutree.c |    6 ++----
>  kernel/rcutree.h |    3 +++
>  2 files changed, 5 insertions(+), 4 deletions(-)
> 
> diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> index 8b3ab4e..2cfbdb8 100644
> --- a/kernel/rcutree.c
> +++ b/kernel/rcutree.c
> @@ -157,7 +157,6 @@ unsigned long rcutorture_vernum;
>  
>  /* State information for rcu_barrier() and friends. */
>  
> -static DEFINE_PER_CPU(struct rcu_head, rcu_barrier_head) = {NULL};
>  static atomic_t rcu_barrier_cpu_count;
>  static DEFINE_MUTEX(rcu_barrier_mutex);
>  static struct completion rcu_barrier_completion;
> @@ -2279,12 +2278,11 @@ static void rcu_barrier_callback(struct rcu_head *notused)
>   */
>  static void rcu_barrier_func(void *type)
>  {
> -	int cpu = smp_processor_id();
> -	struct rcu_head *head = &per_cpu(rcu_barrier_head, cpu);
>  	struct rcu_state *rsp = type;
> +	struct rcu_data *rdp = __this_cpu_ptr(rsp->rda);
>  
>  	atomic_inc(&rcu_barrier_cpu_count);
> -	rsp->call(head, rcu_barrier_callback);
> +	rsp->call(&rdp->barrier_head, rcu_barrier_callback);
>  }
>  
>  /*
> diff --git a/kernel/rcutree.h b/kernel/rcutree.h
> index 15837d7..1783eae 100644
> --- a/kernel/rcutree.h
> +++ b/kernel/rcutree.h
> @@ -300,6 +300,9 @@ struct rcu_data {
>  	unsigned long n_rp_need_fqs;
>  	unsigned long n_rp_need_nothing;
>  
> +	/* 6) _rcu_barrier() callback. */
> +	struct rcu_head barrier_head;
> +
>  	int cpu;
>  	struct rcu_state *rsp;
>  };
> -- 
> 1.7.8
> 


* Re: [PATCH tip/core/rcu 06/15] rcu: Move rcu_barrier_cpu_count to rcu_state structure
  2012-06-15 21:06   ` [PATCH tip/core/rcu 06/15] rcu: Move rcu_barrier_cpu_count to rcu_state structure Paul E. McKenney
@ 2012-06-15 22:44     ` Josh Triplett
  0 siblings, 0 replies; 50+ messages in thread
From: Josh Triplett @ 2012-06-15 22:44 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
	eric.dumazet, darren, fweisbec, patches, Paul E. McKenney

On Fri, Jun 15, 2012 at 02:06:01PM -0700, Paul E. McKenney wrote:
> From: "Paul E. McKenney" <paul.mckenney@linaro.org>
> 
> In order to allow each RCU flavor to concurrently execute its rcu_barrier()
> function, it is necessary to move the relevant state to the rcu_state
> structure.  This commit therefore moves the rcu_barrier_cpu_count global
> variable to a new ->barrier_cpu_count field in the rcu_state structure.
> 
> Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

Reviewed-by: Josh Triplett <josh@joshtriplett.org>

>  kernel/rcutree.c |   25 ++++++++++++++-----------
>  kernel/rcutree.h |    1 +
>  2 files changed, 15 insertions(+), 11 deletions(-)
> 
> diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> index 2cfbdb8..d363416 100644
> --- a/kernel/rcutree.c
> +++ b/kernel/rcutree.c
> @@ -157,7 +157,6 @@ unsigned long rcutorture_vernum;
>  
>  /* State information for rcu_barrier() and friends. */
>  
> -static atomic_t rcu_barrier_cpu_count;
>  static DEFINE_MUTEX(rcu_barrier_mutex);
>  static struct completion rcu_barrier_completion;
>  
> @@ -2267,9 +2266,12 @@ static int rcu_cpu_has_callbacks(int cpu)
>   * RCU callback function for _rcu_barrier().  If we are last, wake
>   * up the task executing _rcu_barrier().
>   */
> -static void rcu_barrier_callback(struct rcu_head *notused)
> +static void rcu_barrier_callback(struct rcu_head *rhp)
>  {
> -	if (atomic_dec_and_test(&rcu_barrier_cpu_count))
> +	struct rcu_data *rdp = container_of(rhp, struct rcu_data, barrier_head);
> +	struct rcu_state *rsp = rdp->rsp;
> +
> +	if (atomic_dec_and_test(&rsp->barrier_cpu_count))
>  		complete(&rcu_barrier_completion);
>  }
>  
> @@ -2281,7 +2283,7 @@ static void rcu_barrier_func(void *type)
>  	struct rcu_state *rsp = type;
>  	struct rcu_data *rdp = __this_cpu_ptr(rsp->rda);
>  
> -	atomic_inc(&rcu_barrier_cpu_count);
> +	atomic_inc(&rsp->barrier_cpu_count);
>  	rsp->call(&rdp->barrier_head, rcu_barrier_callback);
>  }
>  
> @@ -2294,9 +2296,9 @@ static void _rcu_barrier(struct rcu_state *rsp)
>  	int cpu;
>  	unsigned long flags;
>  	struct rcu_data *rdp;
> -	struct rcu_head rh;
> +	struct rcu_data rd;
>  
> -	init_rcu_head_on_stack(&rh);
> +	init_rcu_head_on_stack(&rd.barrier_head);
>  
>  	/* Take mutex to serialize concurrent rcu_barrier() requests. */
>  	mutex_lock(&rcu_barrier_mutex);
> @@ -2321,7 +2323,7 @@ static void _rcu_barrier(struct rcu_state *rsp)
>  	 *	us -- but before CPU 1's orphaned callbacks are invoked!!!
>  	 */
>  	init_completion(&rcu_barrier_completion);
> -	atomic_set(&rcu_barrier_cpu_count, 1);
> +	atomic_set(&rsp->barrier_cpu_count, 1);
>  	raw_spin_lock_irqsave(&rsp->onofflock, flags);
>  	rsp->rcu_barrier_in_progress = current;
>  	raw_spin_unlock_irqrestore(&rsp->onofflock, flags);
> @@ -2360,15 +2362,16 @@ static void _rcu_barrier(struct rcu_state *rsp)
>  	rcu_adopt_orphan_cbs(rsp);
>  	rsp->rcu_barrier_in_progress = NULL;
>  	raw_spin_unlock_irqrestore(&rsp->onofflock, flags);
> -	atomic_inc(&rcu_barrier_cpu_count);
> +	atomic_inc(&rsp->barrier_cpu_count);
>  	smp_mb__after_atomic_inc(); /* Ensure atomic_inc() before callback. */
> -	rsp->call(&rh, rcu_barrier_callback);
> +	rd.rsp = rsp;
> +	rsp->call(&rd.barrier_head, rcu_barrier_callback);
>  
>  	/*
>  	 * Now that we have an rcu_barrier_callback() callback on each
>  	 * CPU, and thus each counted, remove the initial count.
>  	 */
> -	if (atomic_dec_and_test(&rcu_barrier_cpu_count))
> +	if (atomic_dec_and_test(&rsp->barrier_cpu_count))
>  		complete(&rcu_barrier_completion);
>  
>  	/* Wait for all rcu_barrier_callback() callbacks to be invoked. */
> @@ -2377,7 +2380,7 @@ static void _rcu_barrier(struct rcu_state *rsp)
>  	/* Other rcu_barrier() invocations can now safely proceed. */
>  	mutex_unlock(&rcu_barrier_mutex);
>  
> -	destroy_rcu_head_on_stack(&rh);
> +	destroy_rcu_head_on_stack(&rd.barrier_head);
>  }
>  
>  /**
> diff --git a/kernel/rcutree.h b/kernel/rcutree.h
> index 1783eae..e7d29b7 100644
> --- a/kernel/rcutree.h
> +++ b/kernel/rcutree.h
> @@ -386,6 +386,7 @@ struct rcu_state {
>  	struct task_struct *rcu_barrier_in_progress;
>  						/* Task doing rcu_barrier(), */
>  						/*  or NULL if no barrier. */
> +	atomic_t barrier_cpu_count;		/* # CPUs waiting on. */
>  	raw_spinlock_t fqslock;			/* Only one task forcing */
>  						/*  quiescent states. */
>  	unsigned long jiffies_force_qs;		/* Time at which to invoke */
> -- 
> 1.7.8
> 


* Re: [PATCH tip/core/rcu 07/15] rcu: Move rcu_barrier_completion to rcu_state structure
  2012-06-15 21:06   ` [PATCH tip/core/rcu 07/15] rcu: Move rcu_barrier_completion " Paul E. McKenney
@ 2012-06-15 22:51     ` Josh Triplett
  0 siblings, 0 replies; 50+ messages in thread
From: Josh Triplett @ 2012-06-15 22:51 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
	eric.dumazet, darren, fweisbec, patches, Paul E. McKenney

On Fri, Jun 15, 2012 at 02:06:02PM -0700, Paul E. McKenney wrote:
> From: "Paul E. McKenney" <paul.mckenney@linaro.org>
> 
> In order to allow each RCU flavor to concurrently execute its
> rcu_barrier() function, it is necessary to move the relevant
> state to the rcu_state structure.  This commit therefore moves the
> rcu_barrier_completion global variable to a new ->barrier_completion
> field in the rcu_state structure.
> 
> Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

Reviewed-by: Josh Triplett <josh@joshtriplett.org>

>  kernel/rcutree.c |    9 ++++-----
>  kernel/rcutree.h |    1 +
>  2 files changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> index d363416..a946437 100644
> --- a/kernel/rcutree.c
> +++ b/kernel/rcutree.c
> @@ -158,7 +158,6 @@ unsigned long rcutorture_vernum;
>  /* State information for rcu_barrier() and friends. */
>  
>  static DEFINE_MUTEX(rcu_barrier_mutex);
> -static struct completion rcu_barrier_completion;
>  
>  /*
>   * Return true if an RCU grace period is in progress.  The ACCESS_ONCE()s
> @@ -2272,7 +2271,7 @@ static void rcu_barrier_callback(struct rcu_head *rhp)
>  	struct rcu_state *rsp = rdp->rsp;
>  
>  	if (atomic_dec_and_test(&rsp->barrier_cpu_count))
> -		complete(&rcu_barrier_completion);
> +		complete(&rsp->barrier_completion);
>  }
>  
>  /*
> @@ -2322,7 +2321,7 @@ static void _rcu_barrier(struct rcu_state *rsp)
>  	 * 6.	Both rcu_barrier_callback() callbacks are invoked, awakening
>  	 *	us -- but before CPU 1's orphaned callbacks are invoked!!!
>  	 */
> -	init_completion(&rcu_barrier_completion);
> +	init_completion(&rsp->barrier_completion);
>  	atomic_set(&rsp->barrier_cpu_count, 1);
>  	raw_spin_lock_irqsave(&rsp->onofflock, flags);
>  	rsp->rcu_barrier_in_progress = current;
> @@ -2372,10 +2371,10 @@ static void _rcu_barrier(struct rcu_state *rsp)
>  	 * CPU, and thus each counted, remove the initial count.
>  	 */
>  	if (atomic_dec_and_test(&rsp->barrier_cpu_count))
> -		complete(&rcu_barrier_completion);
> +		complete(&rsp->barrier_completion);
>  
>  	/* Wait for all rcu_barrier_callback() callbacks to be invoked. */
> -	wait_for_completion(&rcu_barrier_completion);
> +	wait_for_completion(&rsp->barrier_completion);
>  
>  	/* Other rcu_barrier() invocations can now safely proceed. */
>  	mutex_unlock(&rcu_barrier_mutex);
> diff --git a/kernel/rcutree.h b/kernel/rcutree.h
> index e7d29b7..56fb8d4 100644
> --- a/kernel/rcutree.h
> +++ b/kernel/rcutree.h
> @@ -387,6 +387,7 @@ struct rcu_state {
>  						/* Task doing rcu_barrier(), */
>  						/*  or NULL if no barrier. */
>  	atomic_t barrier_cpu_count;		/* # CPUs waiting on. */
> +	struct completion barrier_completion;	/* Wake at barrier end. */
>  	raw_spinlock_t fqslock;			/* Only one task forcing */
>  						/*  quiescent states. */
>  	unsigned long jiffies_force_qs;		/* Time at which to invoke */
> -- 
> 1.7.8
> 


* Re: [PATCH tip/core/rcu 08/15] rcu: Move rcu_barrier_mutex to rcu_state structure
  2012-06-15 21:06   ` [PATCH tip/core/rcu 08/15] rcu: Move rcu_barrier_mutex " Paul E. McKenney
@ 2012-06-15 22:55     ` Josh Triplett
  0 siblings, 0 replies; 50+ messages in thread
From: Josh Triplett @ 2012-06-15 22:55 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
	eric.dumazet, darren, fweisbec, patches, Paul E. McKenney

On Fri, Jun 15, 2012 at 02:06:03PM -0700, Paul E. McKenney wrote:
> From: "Paul E. McKenney" <paul.mckenney@linaro.org>
> 
> In order to allow each RCU flavor to concurrently execute its
> rcu_barrier() function, it is necessary to move the relevant
> state to the rcu_state structure.  This commit therefore moves the
> rcu_barrier_mutex global variable to a new ->barrier_mutex field
> in the rcu_state structure.
> 
> Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> ---
>  kernel/rcutree.c |   11 +++--------
>  kernel/rcutree.h |    1 +
>  2 files changed, 4 insertions(+), 8 deletions(-)
> 
> diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> index a946437..93358d4 100644
> --- a/kernel/rcutree.c
> +++ b/kernel/rcutree.c
> @@ -71,9 +71,8 @@ static struct lock_class_key rcu_node_class[RCU_NUM_LVLS];
>  	.onofflock = __RAW_SPIN_LOCK_UNLOCKED(&sname##_state.onofflock), \
>  	.orphan_nxttail = &sname##_state.orphan_nxtlist, \
>  	.orphan_donetail = &sname##_state.orphan_donelist, \
> +	.barrier_mutex = __MUTEX_INITIALIZER(sname##_state.barrier_mutex), \
>  	.fqslock = __RAW_SPIN_LOCK_UNLOCKED(&sname##_state.fqslock), \
> -	.n_force_qs = 0, \
> -	.n_force_qs_ngp = 0, \

The removal of these two fields seems unrelated to the rest of this
commit.

I assume you've removed them because the use of "static" makes
initializations to 0 unnecessary?

The rest of this commit seems fine to me.
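
(For reference, the C rule in question, shown with a throwaway struct rather than the real rcu_state: objects with static storage duration are zero-initialized, and any field omitted from a designated initializer is likewise zeroed, so the explicit "= 0" lines add no information.)

	struct foo {
		int a;
		unsigned long b;
	};

	static struct foo x;			/* static storage: a == 0, b == 0 */
	static struct foo y = { .a = 1 };	/* designated init: .b is still 0 */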

>  	.name = #sname, \
>  }
>  
> @@ -155,10 +154,6 @@ static void invoke_rcu_callbacks(struct rcu_state *rsp, struct rcu_data *rdp);
>  unsigned long rcutorture_testseq;
>  unsigned long rcutorture_vernum;
>  
> -/* State information for rcu_barrier() and friends. */
> -
> -static DEFINE_MUTEX(rcu_barrier_mutex);
> -
>  /*
>   * Return true if an RCU grace period is in progress.  The ACCESS_ONCE()s
>   * permit this function to be invoked without holding the root rcu_node
> @@ -2300,7 +2295,7 @@ static void _rcu_barrier(struct rcu_state *rsp)
>  	init_rcu_head_on_stack(&rd.barrier_head);
>  
>  	/* Take mutex to serialize concurrent rcu_barrier() requests. */
> -	mutex_lock(&rcu_barrier_mutex);
> +	mutex_lock(&rsp->barrier_mutex);
>  
>  	smp_mb();  /* Prevent any prior operations from leaking in. */
>  
> @@ -2377,7 +2372,7 @@ static void _rcu_barrier(struct rcu_state *rsp)
>  	wait_for_completion(&rsp->barrier_completion);
>  
>  	/* Other rcu_barrier() invocations can now safely proceed. */
> -	mutex_unlock(&rcu_barrier_mutex);
> +	mutex_unlock(&rsp->barrier_mutex);
>  
>  	destroy_rcu_head_on_stack(&rd.barrier_head);
>  }
> diff --git a/kernel/rcutree.h b/kernel/rcutree.h
> index 56fb8d4..d9ac82f 100644
> --- a/kernel/rcutree.h
> +++ b/kernel/rcutree.h
> @@ -386,6 +386,7 @@ struct rcu_state {
>  	struct task_struct *rcu_barrier_in_progress;
>  						/* Task doing rcu_barrier(), */
>  						/*  or NULL if no barrier. */
> +	struct mutex barrier_mutex;		/* Guards barrier fields. */
>  	atomic_t barrier_cpu_count;		/* # CPUs waiting on. */
>  	struct completion barrier_completion;	/* Wake at barrier end. */
>  	raw_spinlock_t fqslock;			/* Only one task forcing */
> -- 
> 1.7.8
> 


* Re: [PATCH tip/core/rcu 09/15] rcu: Increasing rcu_barrier() concurrency
  2012-06-15 21:06   ` [PATCH tip/core/rcu 09/15] rcu: Increasing rcu_barrier() concurrency Paul E. McKenney
@ 2012-06-15 23:31     ` Josh Triplett
  2012-06-16  0:21       ` Steven Rostedt
  2012-06-16  0:48       ` Paul E. McKenney
  0 siblings, 2 replies; 50+ messages in thread
From: Josh Triplett @ 2012-06-15 23:31 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
	eric.dumazet, darren, fweisbec, patches, Paul E. McKenney

On Fri, Jun 15, 2012 at 02:06:04PM -0700, Paul E. McKenney wrote:
> From: "Paul E. McKenney" <paul.mckenney@linaro.org>
> 
> The traditional rcu_barrier() implementation has serialized all requests,
> regardless of RCU flavor, and also does not coalesce concurrent requests.
> In the past, this has been good and sufficient.
> 
> However, systems are getting larger and use of rcu_barrier() has been
> increasing.  This commit therefore introduces a counter-based scheme
> that allows _rcu_barrier() calls for the same flavor of RCU to take
> advantage of each others' work.
> 
> Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> ---
>  kernel/rcutree.c |   27 ++++++++++++++++++++++++++-
>  kernel/rcutree.h |    2 ++
>  2 files changed, 28 insertions(+), 1 deletions(-)
> 
> diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> index 93358d4..7c299d3 100644
> --- a/kernel/rcutree.c
> +++ b/kernel/rcutree.c
> @@ -2291,13 +2291,32 @@ static void _rcu_barrier(struct rcu_state *rsp)
>  	unsigned long flags;
>  	struct rcu_data *rdp;
>  	struct rcu_data rd;
> +	unsigned long snap = ACCESS_ONCE(rsp->n_barrier_done);
> +	unsigned long snap_done;
>  
>  	init_rcu_head_on_stack(&rd.barrier_head);
>  
>  	/* Take mutex to serialize concurrent rcu_barrier() requests. */
>  	mutex_lock(&rsp->barrier_mutex);
>  
> -	smp_mb();  /* Prevent any prior operations from leaking in. */
> +	/*
> +	 * Ensure that all prior references, including to ->n_barrier_done,
> +	 * are ordered before the _rcu_barrier() machinery.
> +	 */
> +	smp_mb();  /* See above block comment. */

If checkpatch complains about the lack of a comment to the right of a
barrier even when the barrier has a comment directly above it, that
seems like a bug in checkpatch that needs fixing, to prevent developers
from having to add noise like "See above block comment.". :)

Also: what type of barriers do mutex_lock and mutex_unlock imply?  I
assume they imply some weaker barrier than smp_mb, but I'd still assume
they imply *some* barrier.

> +	/* Recheck ->n_barrier_done to see if others did our work for us. */
> +	snap_done = ACCESS_ONCE(rsp->n_barrier_done);
> +	if (ULONG_CMP_GE(snap_done, ((snap + 1) & ~0x1) + 2)) {

This calculation seems sufficiently clever that it merits an explanatory
comment.

> +		smp_mb();
> +		mutex_unlock(&rsp->barrier_mutex);
> +		return;
> +	}
> +
> +	/* Increment ->n_barrier_done to avoid duplicate work. */
> +	ACCESS_ONCE(rsp->n_barrier_done)++;

Interesting dissonance here: the use of ACCESS_ONCE with ++ implies
exactly two accesses, rather than exactly one.  What makes it safe to
not use atomic_inc here, but not safe to drop the ACCESS_ONCE?
Potential use of a cached value read earlier in the function?

- Josh Triplett


* Re: [PATCH tip/core/rcu 10/15] rcu: Add tracing for _rcu_barrier()
  2012-06-15 21:06   ` [PATCH tip/core/rcu 10/15] rcu: Add tracing for _rcu_barrier() Paul E. McKenney
@ 2012-06-15 23:35     ` Josh Triplett
  0 siblings, 0 replies; 50+ messages in thread
From: Josh Triplett @ 2012-06-15 23:35 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
	eric.dumazet, darren, fweisbec, patches, Paul E. McKenney

On Fri, Jun 15, 2012 at 02:06:05PM -0700, Paul E. McKenney wrote:
> From: "Paul E. McKenney" <paul.mckenney@linaro.org>
> 
> This commit adds event tracing for _rcu_barrier() execution.  This
> is defined only if RCU_TRACE=y.
> 
> Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

Reviewed-by: Josh Triplett <josh@joshtriplett.org>

>  include/trace/events/rcu.h |   45 ++++++++++++++++++++++++++++++++++++++++++++
>  kernel/rcutree.c           |   29 +++++++++++++++++++++++++++-
>  2 files changed, 73 insertions(+), 1 deletions(-)
> 
> diff --git a/include/trace/events/rcu.h b/include/trace/events/rcu.h
> index 1480900..cd63f79 100644
> --- a/include/trace/events/rcu.h
> +++ b/include/trace/events/rcu.h
> @@ -540,6 +540,50 @@ TRACE_EVENT(rcu_torture_read,
>  		  __entry->rcutorturename, __entry->rhp)
>  );
>  
> +/*
> + * Tracepoint for _rcu_barrier() execution.  The string "s" describes
> + * the _rcu_barrier phase:
> + *	"Begin": rcu_barrier_callback() started.
> + *	"Check": rcu_barrier_callback() checking for piggybacking.
> + *	"EarlyExit": rcu_barrier_callback() piggybacked, thus early exit.
> + *	"Inc1": rcu_barrier_callback() piggyback check counter incremented.
> + *	"Offline": rcu_barrier_callback() found offline CPU
> + *	"OnlineQ": rcu_barrier_callback() found online CPU with callbacks.
> + *	"OnlineNQ": rcu_barrier_callback() found online CPU, no callbacks.
> + *	"IRQ": An rcu_barrier_callback() callback posted on remote CPU.
> + *	"CB": An rcu_barrier_callback() invoked a callback, not the last.
> + *	"LastCB": An rcu_barrier_callback() invoked the last callback.
> + *	"Inc2": rcu_barrier_callback() piggyback check counter incremented.
> + * The "cpu" argument is the CPU or -1 if meaningless, the "cnt" argument
> + * is the count of remaining callbacks, and "done" is the piggybacking count.
> + */
> +TRACE_EVENT(rcu_barrier,
> +
> +	TP_PROTO(char *rcuname, char *s, int cpu, int cnt, unsigned long done),
> +
> +	TP_ARGS(rcuname, s, cpu, cnt, done),
> +
> +	TP_STRUCT__entry(
> +		__field(char *, rcuname)
> +		__field(char *, s)
> +		__field(int, cpu)
> +		__field(int, cnt)
> +		__field(unsigned long, done)
> +	),
> +
> +	TP_fast_assign(
> +		__entry->rcuname = rcuname;
> +		__entry->s = s;
> +		__entry->cpu = cpu;
> +		__entry->cnt = cnt;
> +		__entry->done = done;
> +	),
> +
> +	TP_printk("%s %s cpu %d remaining %d # %lu",
> +		  __entry->rcuname, __entry->s, __entry->cpu, __entry->cnt,
> +		  __entry->done)
> +);
> +
>  #else /* #ifdef CONFIG_RCU_TRACE */
>  
>  #define trace_rcu_grace_period(rcuname, gpnum, gpevent) do { } while (0)
> @@ -563,6 +607,7 @@ TRACE_EVENT(rcu_torture_read,
>  #define trace_rcu_batch_end(rcuname, callbacks_invoked, cb, nr, iit, risk) \
>  	do { } while (0)
>  #define trace_rcu_torture_read(rcutorturename, rhp) do { } while (0)
> +#define trace_rcu_barrier(name, s, cpu, cnt, done) do { } while (0)
>  
>  #endif /* #else #ifdef CONFIG_RCU_TRACE */
>  
> diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> index 7c299d3..ebd5223 100644
> --- a/kernel/rcutree.c
> +++ b/kernel/rcutree.c
> @@ -2257,6 +2257,17 @@ static int rcu_cpu_has_callbacks(int cpu)
>  }
>  
>  /*
> + * Helper function for _rcu_barrier() tracing.  If tracing is disabled,
> + * the compiler is expected to optimize this away.
> + */
> +static void _rcu_barrier_trace(struct rcu_state *rsp, char *s,
> +			       int cpu, unsigned long done)
> +{
> +	trace_rcu_barrier(rsp->name, s, cpu,
> +			  atomic_read(&rsp->barrier_cpu_count), done);
> +}
> +
> +/*
>   * RCU callback function for _rcu_barrier().  If we are last, wake
>   * up the task executing _rcu_barrier().
>   */
> @@ -2265,8 +2276,12 @@ static void rcu_barrier_callback(struct rcu_head *rhp)
>  	struct rcu_data *rdp = container_of(rhp, struct rcu_data, barrier_head);
>  	struct rcu_state *rsp = rdp->rsp;
>  
> -	if (atomic_dec_and_test(&rsp->barrier_cpu_count))
> +	if (atomic_dec_and_test(&rsp->barrier_cpu_count)) {
> +		_rcu_barrier_trace(rsp, "LastCB", -1, rsp->n_barrier_done);
>  		complete(&rsp->barrier_completion);
> +	} else {
> +		_rcu_barrier_trace(rsp, "CB", -1, rsp->n_barrier_done);
> +	}
>  }
>  
>  /*
> @@ -2277,6 +2292,7 @@ static void rcu_barrier_func(void *type)
>  	struct rcu_state *rsp = type;
>  	struct rcu_data *rdp = __this_cpu_ptr(rsp->rda);
>  
> +	_rcu_barrier_trace(rsp, "IRQ", -1, rsp->n_barrier_done);
>  	atomic_inc(&rsp->barrier_cpu_count);
>  	rsp->call(&rdp->barrier_head, rcu_barrier_callback);
>  }
> @@ -2295,6 +2311,7 @@ static void _rcu_barrier(struct rcu_state *rsp)
>  	unsigned long snap_done;
>  
>  	init_rcu_head_on_stack(&rd.barrier_head);
> +	_rcu_barrier_trace(rsp, "Begin", -1, snap);
>  
>  	/* Take mutex to serialize concurrent rcu_barrier() requests. */
>  	mutex_lock(&rsp->barrier_mutex);
> @@ -2307,7 +2324,9 @@ static void _rcu_barrier(struct rcu_state *rsp)
>  
>  	/* Recheck ->n_barrier_done to see if others did our work for us. */
>  	snap_done = ACCESS_ONCE(rsp->n_barrier_done);
> +	_rcu_barrier_trace(rsp, "Check", -1, snap_done);
>  	if (ULONG_CMP_GE(snap_done, ((snap + 1) & ~0x1) + 2)) {
> +		_rcu_barrier_trace(rsp, "EarlyExit", -1, snap_done);
>  		smp_mb();
>  		mutex_unlock(&rsp->barrier_mutex);
>  		return;
> @@ -2316,6 +2335,7 @@ static void _rcu_barrier(struct rcu_state *rsp)
>  	/* Increment ->n_barrier_done to avoid duplicate work. */
>  	ACCESS_ONCE(rsp->n_barrier_done)++;
>  	WARN_ON_ONCE((rsp->n_barrier_done & 0x1) != 1);
> +	_rcu_barrier_trace(rsp, "Inc1", -1, rsp->n_barrier_done);
>  	smp_mb(); /* Order ->n_barrier_done increment with below mechanism. */
>  
>  	/*
> @@ -2352,13 +2372,19 @@ static void _rcu_barrier(struct rcu_state *rsp)
>  		preempt_disable();
>  		rdp = per_cpu_ptr(rsp->rda, cpu);
>  		if (cpu_is_offline(cpu)) {
> +			_rcu_barrier_trace(rsp, "Offline", cpu,
> +					   rsp->n_barrier_done);
>  			preempt_enable();
>  			while (cpu_is_offline(cpu) && ACCESS_ONCE(rdp->qlen))
>  				schedule_timeout_interruptible(1);
>  		} else if (ACCESS_ONCE(rdp->qlen)) {
> +			_rcu_barrier_trace(rsp, "OnlineQ", cpu,
> +					   rsp->n_barrier_done);
>  			smp_call_function_single(cpu, rcu_barrier_func, rsp, 1);
>  			preempt_enable();
>  		} else {
> +			_rcu_barrier_trace(rsp, "OnlineNQ", cpu,
> +					   rsp->n_barrier_done);
>  			preempt_enable();
>  		}
>  	}
> @@ -2391,6 +2417,7 @@ static void _rcu_barrier(struct rcu_state *rsp)
>  	smp_mb(); /* Keep increment after above mechanism. */
>  	ACCESS_ONCE(rsp->n_barrier_done)++;
>  	WARN_ON_ONCE((rsp->n_barrier_done & 0x1) != 0);
> +	_rcu_barrier_trace(rsp, "Inc2", -1, rsp->n_barrier_done);
>  	smp_mb(); /* Keep increment before caller's subsequent code. */
>  
>  	/* Wait for all rcu_barrier_callback() callbacks to be invoked. */
> -- 
> 1.7.8
> 


* Re: [PATCH tip/core/rcu 12/15] rcu: Remove unneeded __rcu_process_callbacks() argument
  2012-06-15 21:06   ` [PATCH tip/core/rcu 12/15] rcu: Remove unneeded __rcu_process_callbacks() argument Paul E. McKenney
@ 2012-06-15 23:37     ` Josh Triplett
  0 siblings, 0 replies; 50+ messages in thread
From: Josh Triplett @ 2012-06-15 23:37 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
	eric.dumazet, darren, fweisbec, patches

On Fri, Jun 15, 2012 at 02:06:07PM -0700, Paul E. McKenney wrote:
> From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> 
> With the advent of __this_cpu_ptr(), it is no longer necessary to pass
> both the rcu_state and rcu_data structures into __rcu_process_callbacks().
> This commit therefore computes the rcu_data pointer from the rcu_state
> pointer within __rcu_process_callbacks() so that callers can pass in
> only the pointer to the rcu_state structure.  This paves the way for
> linking the rcu_state structures together and iterating over them.
> 
> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

Reviewed-by: Josh Triplett <josh@joshtriplett.org>

>  kernel/rcutree.c        |    8 ++++----
>  kernel/rcutree_plugin.h |    3 +--
>  2 files changed, 5 insertions(+), 6 deletions(-)
> 
> diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> index ebd5223..bd4e41c 100644
> --- a/kernel/rcutree.c
> +++ b/kernel/rcutree.c
> @@ -1785,9 +1785,10 @@ unlock_fqs_ret:
>   * whom the rdp belongs.
>   */
>  static void
> -__rcu_process_callbacks(struct rcu_state *rsp, struct rcu_data *rdp)
> +__rcu_process_callbacks(struct rcu_state *rsp)
>  {
>  	unsigned long flags;
> +	struct rcu_data *rdp = __this_cpu_ptr(rsp->rda);
>  
>  	WARN_ON_ONCE(rdp->beenonline == 0);
>  
> @@ -1824,9 +1825,8 @@ __rcu_process_callbacks(struct rcu_state *rsp, struct rcu_data *rdp)
>  static void rcu_process_callbacks(struct softirq_action *unused)
>  {
>  	trace_rcu_utilization("Start RCU core");
> -	__rcu_process_callbacks(&rcu_sched_state,
> -				&__get_cpu_var(rcu_sched_data));
> -	__rcu_process_callbacks(&rcu_bh_state, &__get_cpu_var(rcu_bh_data));
> +	__rcu_process_callbacks(&rcu_sched_state);
> +	__rcu_process_callbacks(&rcu_bh_state);
>  	rcu_preempt_process_callbacks();
>  	trace_rcu_utilization("End RCU core");
>  }
> diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
> index 6888706..686cb55 100644
> --- a/kernel/rcutree_plugin.h
> +++ b/kernel/rcutree_plugin.h
> @@ -687,8 +687,7 @@ static void rcu_preempt_check_callbacks(int cpu)
>   */
>  static void rcu_preempt_process_callbacks(void)
>  {
> -	__rcu_process_callbacks(&rcu_preempt_state,
> -				&__get_cpu_var(rcu_preempt_data));
> +	__rcu_process_callbacks(&rcu_preempt_state);
>  }
>  
>  #ifdef CONFIG_RCU_BOOST
> -- 
> 1.7.8
> 
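
(Spelling out the equivalence the commit message relies on: each flavor's ->rda field points at that flavor's per-CPU rcu_data array, so recomputing rdp inside the function yields the same pointer the caller used to pass in.)

	/* Before: the caller names both structures. */
	__rcu_process_callbacks(&rcu_sched_state, &__get_cpu_var(rcu_sched_data));

	/* After: rdp is rederived from rsp on the current CPU. */
	struct rcu_data *rdp = __this_cpu_ptr(rsp->rda);
	/* == &__get_cpu_var(rcu_sched_data) whenever rsp == &rcu_sched_state */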


* Re: [PATCH tip/core/rcu 11/15] rcu: Add rcu_barrier() statistics to debugfs tracing
  2012-06-15 21:06   ` [PATCH tip/core/rcu 11/15] rcu: Add rcu_barrier() statistics to debugfs tracing Paul E. McKenney
@ 2012-06-15 23:38     ` Josh Triplett
  0 siblings, 0 replies; 50+ messages in thread
From: Josh Triplett @ 2012-06-15 23:38 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
	eric.dumazet, darren, fweisbec, patches, Paul E. McKenney

On Fri, Jun 15, 2012 at 02:06:06PM -0700, Paul E. McKenney wrote:
> From: "Paul E. McKenney" <paul.mckenney@linaro.org>
> 
> This commit adds an rcubarrier file to RCU's debugfs statistical tracing
> directory, providing diagnostic information on rcu_barrier().
> 
> Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

Reviewed-by: Josh Triplett <josh@joshtriplett.org>

>  kernel/rcutree_trace.c |   39 +++++++++++++++++++++++++++++++++++++++
>  1 files changed, 39 insertions(+), 0 deletions(-)
> 
> diff --git a/kernel/rcutree_trace.c b/kernel/rcutree_trace.c
> index a3556a2..057408b 100644
> --- a/kernel/rcutree_trace.c
> +++ b/kernel/rcutree_trace.c
> @@ -46,6 +46,40 @@
>  #define RCU_TREE_NONCORE
>  #include "rcutree.h"
>  
> +static void print_rcubarrier(struct seq_file *m, struct rcu_state *rsp)
> +{
> +	seq_printf(m, "%c bcc: %d nbd: %lu\n",
> +		   rsp->rcu_barrier_in_progress ? 'B' : '.',
> +		   atomic_read(&rsp->barrier_cpu_count),
> +		   rsp->n_barrier_done);
> +}
> +
> +static int show_rcubarrier(struct seq_file *m, void *unused)
> +{
> +#ifdef CONFIG_TREE_PREEMPT_RCU
> +	seq_puts(m, "rcu_preempt: ");
> +	print_rcubarrier(m, &rcu_preempt_state);
> +#endif /* #ifdef CONFIG_TREE_PREEMPT_RCU */
> +	seq_puts(m, "rcu_sched: ");
> +	print_rcubarrier(m, &rcu_sched_state);
> +	seq_puts(m, "rcu_bh: ");
> +	print_rcubarrier(m, &rcu_bh_state);
> +	return 0;
> +}
> +
> +static int rcubarrier_open(struct inode *inode, struct file *file)
> +{
> +	return single_open(file, show_rcubarrier, NULL);
> +}
> +
> +static const struct file_operations rcubarrier_fops = {
> +	.owner = THIS_MODULE,
> +	.open = rcubarrier_open,
> +	.read = seq_read,
> +	.llseek = seq_lseek,
> +	.release = single_release,
> +};
> +
>  #ifdef CONFIG_RCU_BOOST
>  
>  static char convert_kthread_status(unsigned int kthread_status)
> @@ -453,6 +487,11 @@ static int __init rcutree_trace_init(void)
>  	if (!rcudir)
>  		goto free_out;
>  
> +	retval = debugfs_create_file("rcubarrier", 0444, rcudir,
> +						NULL, &rcubarrier_fops);
> +	if (!retval)
> +		goto free_out;
> +
>  	retval = debugfs_create_file("rcudata", 0444, rcudir,
>  						NULL, &rcudata_fops);
>  	if (!retval)
> -- 
> 1.7.8
> 
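
(Given those seq_printf() formats, the new rcubarrier file would read roughly as below; the values are invented for illustration, with the in-flight flavor showing the 'B' flag and a nonzero callback count.)

	rcu_preempt: . bcc: 0 nbd: 6
	rcu_sched: B bcc: 3 nbd: 9
	rcu_bh: . bcc: 0 nbd: 2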


* Re: [PATCH tip/core/rcu 13/15] rcu: Introduce for_each_rcu_flavor() and use it
  2012-06-15 21:06   ` [PATCH tip/core/rcu 13/15] rcu: Introduce for_each_rcu_flavor() and use it Paul E. McKenney
@ 2012-06-15 23:52     ` Josh Triplett
  2012-06-16  1:01       ` Paul E. McKenney
  0 siblings, 1 reply; 50+ messages in thread
From: Josh Triplett @ 2012-06-15 23:52 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
	eric.dumazet, darren, fweisbec, patches

On Fri, Jun 15, 2012 at 02:06:08PM -0700, Paul E. McKenney wrote:
> From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> 
> The arrival of TREE_PREEMPT_RCU some years back included some ugly
> code involving either #ifdef or #ifdef'ed wrapper functions to iterate
> over all non-SRCU flavors of RCU.  This commit therefore introduces
> a for_each_rcu_flavor() iterator over the rcu_state structures for each
> flavor of RCU to clean up a bit of the ugliness.
> 
> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

Great cleanup!

A few comments below, though.

>  kernel/rcutree.c        |   53 +++++++++++++---------
>  kernel/rcutree.h        |   12 ++---
>  kernel/rcutree_plugin.h |  116 -----------------------------------------------
>  3 files changed, 36 insertions(+), 145 deletions(-)

Awesome diffstat.

> diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> index bd4e41c..75ad92a 100644
> --- a/kernel/rcutree.c
> +++ b/kernel/rcutree.c
> @@ -84,6 +84,7 @@ struct rcu_state rcu_bh_state = RCU_STATE_INITIALIZER(rcu_bh, call_rcu_bh);
>  DEFINE_PER_CPU(struct rcu_data, rcu_bh_data);
>  
>  static struct rcu_state *rcu_state;
> +LIST_HEAD(rcu_struct_flavors);

Does any means exist to turn this into a constant array known at compile
time rather than a runtime linked list?  Having this as a compile-time
constant may allow the compiler to unroll for_each_rcu_flavor and
potentially inline the calls inside it.

> @@ -2539,9 +2548,10 @@ rcu_init_percpu_data(int cpu, struct rcu_state *rsp, int preemptible)
>  
>  static void __cpuinit rcu_prepare_cpu(int cpu)
>  {
> -	rcu_init_percpu_data(cpu, &rcu_sched_state, 0);
> -	rcu_init_percpu_data(cpu, &rcu_bh_state, 0);
> -	rcu_preempt_init_percpu_data(cpu);
> +	struct rcu_state *rsp;
> +
> +	for_each_rcu_flavor(rsp)
> +		rcu_init_percpu_data(cpu, rsp, 0);

This results in passing 0 as the "preemptible" parameter of
rcu_init_percpu_data, which seems wrong if the preemptible parameter has
any meaning at all. :)

> @@ -2577,18 +2588,15 @@ static int __cpuinit rcu_cpu_notify(struct notifier_block *self,
>  		 * touch any data without introducing corruption. We send the
>  		 * dying CPU's callbacks to an arbitrarily chosen online CPU.
>  		 */
> -		rcu_cleanup_dying_cpu(&rcu_bh_state);
> -		rcu_cleanup_dying_cpu(&rcu_sched_state);
> -		rcu_preempt_cleanup_dying_cpu();
> -		rcu_cleanup_after_idle(cpu);
> +		for_each_rcu_flavor(rsp)
> +			rcu_cleanup_dying_cpu(rsp);

Why did rcu_cleanup_after_idle go away here?

- Josh Triplett


* Re: [PATCH tip/core/rcu 14/15] rcu: Use for_each_rcu_flavor() in TREE_RCU tracing
  2012-06-15 21:06   ` [PATCH tip/core/rcu 14/15] rcu: Use for_each_rcu_flavor() in TREE_RCU tracing Paul E. McKenney
@ 2012-06-15 23:59     ` Josh Triplett
  2012-06-16  0:56       ` Paul E. McKenney
  0 siblings, 1 reply; 50+ messages in thread
From: Josh Triplett @ 2012-06-15 23:59 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
	eric.dumazet, darren, fweisbec, patches

On Fri, Jun 15, 2012 at 02:06:09PM -0700, Paul E. McKenney wrote:
> From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> 
> This commit applies the new for_each_rcu_flavor() macro to the
> kernel/rcutree_trace.c file.
> 
> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> ---
>  kernel/rcutree_trace.c |   95 +++++++++++++++++++-----------------------------
>  1 files changed, 38 insertions(+), 57 deletions(-)
> 
> diff --git a/kernel/rcutree_trace.c b/kernel/rcutree_trace.c
> index 057408b..c618665 100644
> --- a/kernel/rcutree_trace.c
> +++ b/kernel/rcutree_trace.c
> @@ -48,22 +48,18 @@
>  
>  static void print_rcubarrier(struct seq_file *m, struct rcu_state *rsp)
>  {
> -	seq_printf(m, "%c bcc: %d nbd: %lu\n",
> -		   rsp->rcu_barrier_in_progress ? 'B' : '.',
> +	seq_printf(m, "%s: %c bcc: %d nbd: %lu\n",
> +		   rsp->name, rsp->rcu_barrier_in_progress ? 'B' : '.',
>  		   atomic_read(&rsp->barrier_cpu_count),
>  		   rsp->n_barrier_done);
>  }
>  
>  static int show_rcubarrier(struct seq_file *m, void *unused)
>  {
> -#ifdef CONFIG_TREE_PREEMPT_RCU
> -	seq_puts(m, "rcu_preempt: ");
> -	print_rcubarrier(m, &rcu_preempt_state);
> -#endif /* #ifdef CONFIG_TREE_PREEMPT_RCU */
> -	seq_puts(m, "rcu_sched: ");
> -	print_rcubarrier(m, &rcu_sched_state);
> -	seq_puts(m, "rcu_bh: ");
> -	print_rcubarrier(m, &rcu_bh_state);
> +	struct rcu_state *rsp;
> +
> +	for_each_rcu_flavor(rsp)
> +		print_rcubarrier(m, rsp);

Now that you call this function exactly once, I'd suggest inlining it
for clarity; I don't think having it as a separate function makes sense
anymore.

> @@ -129,24 +125,16 @@ static void print_one_rcu_data(struct seq_file *m, struct rcu_data *rdp)
>  		   rdp->n_cbs_invoked, rdp->n_cbs_orphaned, rdp->n_cbs_adopted);
>  }
>  
> -#define PRINT_RCU_DATA(name, func, m) \
> -	do { \
> -		int _p_r_d_i; \
> -		\
> -		for_each_possible_cpu(_p_r_d_i) \
> -			func(m, &per_cpu(name, _p_r_d_i)); \
> -	} while (0)
> -
>  static int show_rcudata(struct seq_file *m, void *unused)
>  {
> -#ifdef CONFIG_TREE_PREEMPT_RCU
> -	seq_puts(m, "rcu_preempt:\n");
> -	PRINT_RCU_DATA(rcu_preempt_data, print_one_rcu_data, m);
> -#endif /* #ifdef CONFIG_TREE_PREEMPT_RCU */
> -	seq_puts(m, "rcu_sched:\n");
> -	PRINT_RCU_DATA(rcu_sched_data, print_one_rcu_data, m);
> -	seq_puts(m, "rcu_bh:\n");
> -	PRINT_RCU_DATA(rcu_bh_data, print_one_rcu_data, m);
> +	int cpu;
> +	struct rcu_state *rsp;
> +
> +	for_each_rcu_flavor(rsp) {
> +		seq_printf(m, "%s:\n", rsp->name);
> +		for_each_possible_cpu(cpu)
> +		print_one_rcu_data(m, per_cpu_ptr(rsp->rda, cpu));
> +	}

As above, I'd suggest inlining print_one_rcu_data.

Also, you need to indent the body of the for_each_possible_cpu loop.

> @@ -200,6 +188,9 @@ static void print_one_rcu_data_csv(struct seq_file *m, struct rcu_data *rdp)
>  
>  static int show_rcudata_csv(struct seq_file *m, void *unused)
>  {
> +	int cpu;
> +	struct rcu_state *rsp;
> +
>  	seq_puts(m, "\"CPU\",\"Online?\",\"c\",\"g\",\"pq\",\"pgp\",\"pq\",");
>  	seq_puts(m, "\"dt\",\"dt nesting\",\"dt NMI nesting\",\"df\",");
>  	seq_puts(m, "\"of\",\"qll\",\"ql\",\"qs\"");
> @@ -207,14 +198,11 @@ static int show_rcudata_csv(struct seq_file *m, void *unused)
>  	seq_puts(m, "\"kt\",\"ktl\"");
>  #endif /* #ifdef CONFIG_RCU_BOOST */
>  	seq_puts(m, ",\"b\",\"ci\",\"co\",\"ca\"\n");
> -#ifdef CONFIG_TREE_PREEMPT_RCU
> -	seq_puts(m, "\"rcu_preempt:\"\n");
> -	PRINT_RCU_DATA(rcu_preempt_data, print_one_rcu_data_csv, m);
> -#endif /* #ifdef CONFIG_TREE_PREEMPT_RCU */
> -	seq_puts(m, "\"rcu_sched:\"\n");
> -	PRINT_RCU_DATA(rcu_sched_data, print_one_rcu_data_csv, m);
> -	seq_puts(m, "\"rcu_bh:\"\n");
> -	PRINT_RCU_DATA(rcu_bh_data, print_one_rcu_data_csv, m);
> +	for_each_rcu_flavor(rsp) {
> +		seq_printf(m, "\"%s:\"\n", rsp->name);
> +		for_each_possible_cpu(cpu)
> +			print_one_rcu_data_csv(m, per_cpu_ptr(rsp->rda, cpu));
> +	}

As above, I'd suggest inlining print_one_rcu_data_csv.

> @@ -304,9 +292,9 @@ static void print_one_rcu_state(struct seq_file *m, struct rcu_state *rsp)
>  	struct rcu_node *rnp;
>  
>  	gpnum = rsp->gpnum;
> -	seq_printf(m, "c=%lu g=%lu s=%d jfq=%ld j=%x "
> +	seq_printf(m, "%s: c=%lu g=%lu s=%d jfq=%ld j=%x "
>  		      "nfqs=%lu/nfqsng=%lu(%lu) fqlh=%lu oqlen=%ld/%ld\n",
> -		   rsp->completed, gpnum, rsp->fqs_state,
> +		   rsp->name, rsp->completed, gpnum, rsp->fqs_state,
>  		   (long)(rsp->jiffies_force_qs - jiffies),
>  		   (int)(jiffies & 0xffff),
>  		   rsp->n_force_qs, rsp->n_force_qs_ngp,
> @@ -329,14 +317,10 @@ static void print_one_rcu_state(struct seq_file *m, struct rcu_state *rsp)
>  
>  static int show_rcuhier(struct seq_file *m, void *unused)
>  {
> -#ifdef CONFIG_TREE_PREEMPT_RCU
> -	seq_puts(m, "rcu_preempt:\n");
> -	print_one_rcu_state(m, &rcu_preempt_state);
> -#endif /* #ifdef CONFIG_TREE_PREEMPT_RCU */
> -	seq_puts(m, "rcu_sched:\n");
> -	print_one_rcu_state(m, &rcu_sched_state);
> -	seq_puts(m, "rcu_bh:\n");
> -	print_one_rcu_state(m, &rcu_bh_state);
> +	struct rcu_state *rsp;
> +
> +	for_each_rcu_flavor(rsp)
> +		print_one_rcu_state(m, rsp);

As above, I'd suggest inlining print_one_rcu_state.

> @@ -377,11 +361,10 @@ static void show_one_rcugp(struct seq_file *m, struct rcu_state *rsp)
>  
>  static int show_rcugp(struct seq_file *m, void *unused)
>  {
> -#ifdef CONFIG_TREE_PREEMPT_RCU
> -	show_one_rcugp(m, &rcu_preempt_state);
> -#endif /* #ifdef CONFIG_TREE_PREEMPT_RCU */
> -	show_one_rcugp(m, &rcu_sched_state);
> -	show_one_rcugp(m, &rcu_bh_state);
> +	struct rcu_state *rsp;
> +
> +	for_each_rcu_flavor(rsp)
> +		show_one_rcugp(m, rsp);

As above, I'd suggest inlining show_one_rcugp.

> @@ -430,14 +413,12 @@ static void print_rcu_pendings(struct seq_file *m, struct rcu_state *rsp)
>  
>  static int show_rcu_pending(struct seq_file *m, void *unused)
>  {
> -#ifdef CONFIG_TREE_PREEMPT_RCU
> -	seq_puts(m, "rcu_preempt:\n");
> -	print_rcu_pendings(m, &rcu_preempt_state);
> -#endif /* #ifdef CONFIG_TREE_PREEMPT_RCU */
> -	seq_puts(m, "rcu_sched:\n");
> -	print_rcu_pendings(m, &rcu_sched_state);
> -	seq_puts(m, "rcu_bh:\n");
> -	print_rcu_pendings(m, &rcu_bh_state);
> +	struct rcu_state *rsp;
> +
> +	for_each_rcu_flavor(rsp) {
> +		seq_printf(m, "%s:\n", rsp->name);
> +		print_rcu_pendings(m, rsp);
> +	}

As above, I'd suggest inlining print_rcu_pendings.

- Josh Triplett


* Re: [PATCH tip/core/rcu 15/15] rcu: RCU_SAVE_DYNTICK code no longer ever dead
  2012-06-15 21:06   ` [PATCH tip/core/rcu 15/15] rcu: RCU_SAVE_DYNTICK code no longer ever dead Paul E. McKenney
@ 2012-06-16  0:02     ` Josh Triplett
  2012-06-16  0:04       ` Josh Triplett
  0 siblings, 1 reply; 50+ messages in thread
From: Josh Triplett @ 2012-06-16  0:02 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
	eric.dumazet, darren, fweisbec, patches

On Fri, Jun 15, 2012 at 02:06:10PM -0700, Paul E. McKenney wrote:
> From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> 
> Before RCU had unified idle, the RCU_SAVE_DYNTICK leg of the switch
> statement in force_quiescent_state() was dead code for CONFIG_NO_HZ=n
> kernel builds.  With unified idle, the code is never dead.  This commit
> therefore removes the "if" statement designed to make gcc aware of when
> the code was and was not dead.
> 
> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

One comment below; with that change:

Reviewed-by: Josh Triplett <josh@joshtriplett.org>

>  kernel/rcutree.c |    2 --
>  1 files changed, 0 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> index 75ad92a..0b0c9cc 100644
> --- a/kernel/rcutree.c
> +++ b/kernel/rcutree.c
> @@ -1744,8 +1744,6 @@ static void force_quiescent_state(struct rcu_state *rsp, int relaxed)
>  		break; /* grace period idle or initializing, ignore. */
>  
>  	case RCU_SAVE_DYNTICK:
> -		if (RCU_SIGNAL_INIT != RCU_SAVE_DYNTICK)
> -			break; /* So gcc recognizes the dead code. */
>  
>  		raw_spin_unlock(&rnp->lock);  /* irqs remain disabled */

Drop the blank line too?


* Re: [PATCH tip/core/rcu 15/15] rcu: RCU_SAVE_DYNTICK code no longer ever dead
  2012-06-16  0:02     ` Josh Triplett
@ 2012-06-16  0:04       ` Josh Triplett
  2012-06-16  1:04         ` Paul E. McKenney
  0 siblings, 1 reply; 50+ messages in thread
From: Josh Triplett @ 2012-06-16  0:04 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
	eric.dumazet, darren, fweisbec, patches

On Fri, Jun 15, 2012 at 05:02:38PM -0700, Josh Triplett wrote:
> On Fri, Jun 15, 2012 at 02:06:10PM -0700, Paul E. McKenney wrote:
> > From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> > 
> > Before RCU had unified idle, the RCU_SAVE_DYNTICK leg of the switch
> > statement in force_quiescent_state() was dead code for CONFIG_NO_HZ=n
> > kernel builds.  With unified idle, the code is never dead.  This commit
> > therefore removes the "if" statement designed to make gcc aware of when
> > the code was and was not dead.
> > 
> > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> 
> One comment below; with that change:
> 
> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
> 
> >  kernel/rcutree.c |    2 --
> >  1 files changed, 0 insertions(+), 2 deletions(-)
> > 
> > diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> > index 75ad92a..0b0c9cc 100644
> > --- a/kernel/rcutree.c
> > +++ b/kernel/rcutree.c
> > @@ -1744,8 +1744,6 @@ static void force_quiescent_state(struct rcu_state *rsp, int relaxed)
> >  		break; /* grace period idle or initializing, ignore. */
> >  
> >  	case RCU_SAVE_DYNTICK:
> > -		if (RCU_SIGNAL_INIT != RCU_SAVE_DYNTICK)
> > -			break; /* So gcc recognizes the dead code. */
> >  
> >  		raw_spin_unlock(&rnp->lock);  /* irqs remain disabled */
> 
> Drop the blank line too?

Actually, I just realized a larger concern with what this change
implies: does this mean that whatever change made this code no longer
dead introduced a major locking bug here?  If so, has that change
already progressed past the point where you could update it to include
this fix?

- Josh Triplett


* Re: [PATCH tip/core/rcu 09/15] rcu: Increasing rcu_barrier() concurrency
  2012-06-15 23:31     ` Josh Triplett
@ 2012-06-16  0:21       ` Steven Rostedt
  2012-06-16  0:49         ` Paul E. McKenney
  2012-06-16  0:48       ` Paul E. McKenney
  1 sibling, 1 reply; 50+ messages in thread
From: Steven Rostedt @ 2012-06-16  0:21 UTC (permalink / raw)
  To: Josh Triplett
  Cc: Paul E. McKenney, linux-kernel, mingo, laijs, dipankar, akpm,
	mathieu.desnoyers, niv, tglx, peterz, Valdis.Kletnieks, dhowells,
	eric.dumazet, darren, fweisbec, patches, Paul E. McKenney

On Fri, 2012-06-15 at 16:31 -0700, Josh Triplett wrote:
>   
> > -	smp_mb();  /* Prevent any prior operations from leaking in. */
> > +	/*
> > +	 * Ensure that all prior references, including to ->n_barrier_done,
> > +	 * are ordered before the _rcu_barrier() machinery.
> > +	 */
> > +	smp_mb();  /* See above block comment. */
> 
> If checkpatch complains about the lack of a comment to the right of a
> barrier even when the barrier has a comment directly above it, that
> seems like a bug in checkpatch that needs fixing, to prevent developers
> from having to add noise like "See above block comment.". :)


Yuck yuck yuck yuck!!!


Really, checkpatch is not the golden rule. I've copied an old checkpatch
from something like 2.6.38 or so and use that today, where it was still
reasonable. I've long abandoned the latest checkpatch, as it causes too
many false positives. Or nazi-like dictation.

My rule of thumb is this. If what checkpatch tells you to do makes the
format either uglier, or look stupid, it's a good idea to ignore the
checkpatch complaint.

I think in this case, you hit the latter.

-- Steve




* Re: [PATCH tip/core/rcu 02/15] rcu: Size rcu_node tree from nr_cpu_ids rather than NR_CPUS
  2012-06-15 21:47     ` Josh Triplett
@ 2012-06-16  0:37       ` Paul E. McKenney
  2012-06-16  5:17         ` Josh Triplett
  0 siblings, 1 reply; 50+ messages in thread
From: Paul E. McKenney @ 2012-06-16  0:37 UTC (permalink / raw)
  To: Josh Triplett
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
	eric.dumazet, darren, fweisbec, patches

On Fri, Jun 15, 2012 at 02:47:26PM -0700, Josh Triplett wrote:
> On Fri, Jun 15, 2012 at 02:05:57PM -0700, Paul E. McKenney wrote:
> > From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> > 
> > The rcu_node tree array is sized based on compile-time constants,
> > including NR_CPUS.  Although this approach has worked well in the past,
> > the recent trend by many distros to define NR_CPUS=4096 results in
> > excessive grace-period-initialization latencies.
> > 
> > This commit therefore substitutes the run-time computed nr_cpu_ids for
> > the compile-time NR_CPUS when building the tree.  This can result in
> > much of the compile-time-allocated rcu_node array being unused.  If
> > this is a major problem, you are in a specialized situation anyway,
> > so you can manually adjust the NR_CPUS, RCU_FANOUT, and RCU_FANOUT_LEAF
> > kernel config parameters.
> > 
> > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > ---
> >  kernel/rcutree.c        |    2 +-
> >  kernel/rcutree_plugin.h |    2 ++
> >  2 files changed, 3 insertions(+), 1 deletions(-)
> > 
> > diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> > index a151184..9098910 100644
> > --- a/kernel/rcutree.c
> > +++ b/kernel/rcutree.c
> > @@ -2672,7 +2672,7 @@ static void __init rcu_init_geometry(void)
> >  {
> >  	int i;
> >  	int j;
> > -	int n = NR_CPUS;
> > +	int n = nr_cpu_ids;
> 
> Same question as before: why have this as a variable when it never
> changes?

Ah, that explains why.  This prevented me from forgetting the random
NR_CPUS.

							Thanx, Paul



* Re: [PATCH tip/core/rcu 09/15] rcu: Increasing rcu_barrier() concurrency
  2012-06-15 23:31     ` Josh Triplett
  2012-06-16  0:21       ` Steven Rostedt
@ 2012-06-16  0:48       ` Paul E. McKenney
  1 sibling, 0 replies; 50+ messages in thread
From: Paul E. McKenney @ 2012-06-16  0:48 UTC (permalink / raw)
  To: Josh Triplett
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
	eric.dumazet, darren, fweisbec, patches, Paul E. McKenney

On Fri, Jun 15, 2012 at 04:31:51PM -0700, Josh Triplett wrote:
> On Fri, Jun 15, 2012 at 02:06:04PM -0700, Paul E. McKenney wrote:
> > From: "Paul E. McKenney" <paul.mckenney@linaro.org>
> > 
> > The traditional rcu_barrier() implementation has serialized all requests,
> > regardless of RCU flavor, and also does not coalesce concurrent requests.
> > In the past, this has been good and sufficient.
> > 
> > However, systems are getting larger and use of rcu_barrier() has been
> > increasing.  This commit therefore introduces a counter-based scheme
> > that allows _rcu_barrier() calls for the same flavor of RCU to take
> > advantage of each others' work.
> > 
> > Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
> > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > ---
> >  kernel/rcutree.c |   27 ++++++++++++++++++++++++++-
> >  kernel/rcutree.h |    2 ++
> >  2 files changed, 28 insertions(+), 1 deletions(-)
> > 
> > diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> > index 93358d4..7c299d3 100644
> > --- a/kernel/rcutree.c
> > +++ b/kernel/rcutree.c
> > @@ -2291,13 +2291,32 @@ static void _rcu_barrier(struct rcu_state *rsp)
> >  	unsigned long flags;
> >  	struct rcu_data *rdp;
> >  	struct rcu_data rd;
> > +	unsigned long snap = ACCESS_ONCE(rsp->n_barrier_done);
> > +	unsigned long snap_done;
> >  
> >  	init_rcu_head_on_stack(&rd.barrier_head);
> >  
> >  	/* Take mutex to serialize concurrent rcu_barrier() requests. */
> >  	mutex_lock(&rsp->barrier_mutex);
> >  
> > -	smp_mb();  /* Prevent any prior operations from leaking in. */
> > +	/*
> > +	 * Ensure that all prior references, including to ->n_barrier_done,
> > +	 * are ordered before the _rcu_barrier() machinery.
> > +	 */
> > +	smp_mb();  /* See above block comment. */
> 
> If checkpatch complains about the lack of a comment to the right of a
> barrier even when the barrier has a comment directly above it, that
> seems like a bug in checkpatch that needs fixing, to prevent developers
> from having to add noise like "See above block comment.". :)

;-)

> Also: what type of barriers do mutex_lock and mutex_unlock imply?  I
> assume they imply some weaker barrier than smp_mb, but I'd still assume
> they imply *some* barrier.

mutex_lock() prevents code from leaving the critical section, but is
not guaranteed to prevent code from entering the critical section.
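
(A minimal sketch of that distinction, using made-up variables: the critical section acts as a one-way membrane, which is why the explicit smp_mb() after mutex_lock() is still needed to keep the earlier read of ->n_barrier_done from sliding inside.)

	a = 1;			/* may legally be reordered INTO the critical section */
	mutex_lock(&m);		/* acquire semantics */
	b = 1;			/* may NOT be reordered up before the lock */
	c = 1;			/* may NOT be reordered down past the unlock */
	mutex_unlock(&m);	/* release semantics */
	d = 1;			/* may legally be reordered INTO the critical section */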

> > +	/* Recheck ->n_barrier_done to see if others did our work for us. */
> > +	snap_done = ACCESS_ONCE(rsp->n_barrier_done);
> > +	if (ULONG_CMP_GE(snap_done, ((snap + 1) & ~0x1) + 2)) {
> 
> This calculation seems sufficiently clever that it merits an explanatory
> comment.

I will see what I can come up with.
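
(One plausible reading of the arithmetic, relying only on the even/odd convention enforced by the WARN_ON_ONCE() calls in this patch, namely that ->n_barrier_done is even when no barrier is in flight and odd while one is running; this is a sketch of such a comment, not Paul's eventual wording.)

	/*
	 * ((snap + 1) & ~0x1) rounds snap up to an even value: snap itself
	 * if no barrier was running when we took the snapshot, or snap + 1
	 * (the completion count of the then-in-flight barrier) if one was.
	 * Adding 2 then demands one further complete barrier (odd start,
	 * even finish) beyond that point.  If ->n_barrier_done has gotten
	 * that far, some other rcu_barrier() of this flavor both started
	 * and completed after our snapshot, so it already covered our
	 * previously queued callbacks and we can piggyback on its work.
	 */
	if (ULONG_CMP_GE(snap_done, ((snap + 1) & ~0x1) + 2)) {
		/* Early exit: piggyback on the already-completed barrier. */
	}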

> > +		smp_mb();
> > +		mutex_unlock(&rsp->barrier_mutex);
> > +		return;
> > +	}
> > +
> > +	/* Increment ->n_barrier_done to avoid duplicate work. */
> > +	ACCESS_ONCE(rsp->n_barrier_done)++;
> 
> Interesting dissonance here: the use of ACCESS_ONCE with ++ implies
> exactly two accesses, rather than exactly one.  What makes it safe to
> not use atomic_inc here, but not safe to drop the ACCESS_ONCE?
> Potential use of a cached value read earlier in the function?

Or, worse yet, the compiler speculating the increment and then backing
it out if the early-exit path is taken.
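
(Roughly what is being relied on here: ACCESS_ONCE(x) is just a volatile-qualified access, so the increment compiles to one forced load plus one forced store, as sketched below; that is enough because ->barrier_mutex guarantees at most one writer at a time, which is why atomic_inc() is not required.)

	/* ACCESS_ONCE(rsp->n_barrier_done)++ amounts to roughly: */
	unsigned long tmp;

	tmp = *(volatile unsigned long *)&rsp->n_barrier_done;		/* forced load */
	*(volatile unsigned long *)&rsp->n_barrier_done = tmp + 1;	/* forced store */

Without the volatile casts the compiler would be free to reuse a value of ->n_barrier_done cached from earlier in the function, or to emit the speculative store-then-undo sequence described above, either of which could confuse a concurrent reader that samples the counter without holding the mutex.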

							Thanx, Paul



* Re: [PATCH tip/core/rcu 09/15] rcu: Increasing rcu_barrier() concurrency
  2012-06-16  0:21       ` Steven Rostedt
@ 2012-06-16  0:49         ` Paul E. McKenney
  0 siblings, 0 replies; 50+ messages in thread
From: Paul E. McKenney @ 2012-06-16  0:49 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Josh Triplett, linux-kernel, mingo, laijs, dipankar, akpm,
	mathieu.desnoyers, niv, tglx, peterz, Valdis.Kletnieks, dhowells,
	eric.dumazet, darren, fweisbec, patches, Paul E. McKenney

On Fri, Jun 15, 2012 at 08:21:20PM -0400, Steven Rostedt wrote:
> On Fri, 2012-06-15 at 16:31 -0700, Josh Triplett wrote:
> >   
> > > -	smp_mb();  /* Prevent any prior operations from leaking in. */
> > > +	/*
> > > +	 * Ensure that all prior references, including to ->n_barrier_done,
> > > +	 * are ordered before the _rcu_barrier() machinery.
> > > +	 */
> > > +	smp_mb();  /* See above block comment. */
> > 
> > If checkpatch complains about the lack of a comment to the right of a
> > barrier even when the barrier has a comment directly above it, that
> > seems like a bug in checkpatch that needs fixing, to prevent developers
> > from having to add noise like "See above block comment.". :)
> 
> 
> Yuck yuck yuck yuck!!!
> 
> 
> Really, checkpatch is not the golden rule. I've copied an old checkpatch
> from something like 2.6.38 or so and use that today, where it was still
> reasonable. I've long abandoned the latest checkpatch, as it causes too
> many false positives. Or nazi-like dictation.
> 
> My rule of thumb is this. If what checkpatch tells you to do makes the
> format either uglier, or look stupid, it's a good idea to ignore the
> checkpatch complaint.
> 
> I think in this case, you hit the latter.

Heh.  I have been doing this "/* See above block comment. */" thing for
quite some time.  ;-)

							Thanx, Paul



* Re: [PATCH tip/core/rcu 14/15] rcu: Use for_each_rcu_flavor() in TREE_RCU tracing
  2012-06-15 23:59     ` Josh Triplett
@ 2012-06-16  0:56       ` Paul E. McKenney
  2012-06-16  5:22         ` Josh Triplett
  0 siblings, 1 reply; 50+ messages in thread
From: Paul E. McKenney @ 2012-06-16  0:56 UTC (permalink / raw)
  To: Josh Triplett
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
	eric.dumazet, darren, fweisbec, patches

On Fri, Jun 15, 2012 at 04:59:57PM -0700, Josh Triplett wrote:
> On Fri, Jun 15, 2012 at 02:06:09PM -0700, Paul E. McKenney wrote:
> > From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> > 
> > This commit applies the new for_each_rcu_flavor() macro to the
> > kernel/rcutree_trace.c file.
> > 
> > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > ---
> >  kernel/rcutree_trace.c |   95 +++++++++++++++++++-----------------------------
> >  1 files changed, 38 insertions(+), 57 deletions(-)
> > 
> > diff --git a/kernel/rcutree_trace.c b/kernel/rcutree_trace.c
> > index 057408b..c618665 100644
> > --- a/kernel/rcutree_trace.c
> > +++ b/kernel/rcutree_trace.c
> > @@ -48,22 +48,18 @@
> >  
> >  static void print_rcubarrier(struct seq_file *m, struct rcu_state *rsp)
> >  {
> > -	seq_printf(m, "%c bcc: %d nbd: %lu\n",
> > -		   rsp->rcu_barrier_in_progress ? 'B' : '.',
> > +	seq_printf(m, "%s: %c bcc: %d nbd: %lu\n",
> > +		   rsp->name, rsp->rcu_barrier_in_progress ? 'B' : '.',
> >  		   atomic_read(&rsp->barrier_cpu_count),
> >  		   rsp->n_barrier_done);
> >  }
> >  
> >  static int show_rcubarrier(struct seq_file *m, void *unused)
> >  {
> > -#ifdef CONFIG_TREE_PREEMPT_RCU
> > -	seq_puts(m, "rcu_preempt: ");
> > -	print_rcubarrier(m, &rcu_preempt_state);
> > -#endif /* #ifdef CONFIG_TREE_PREEMPT_RCU */
> > -	seq_puts(m, "rcu_sched: ");
> > -	print_rcubarrier(m, &rcu_sched_state);
> > -	seq_puts(m, "rcu_bh: ");
> > -	print_rcubarrier(m, &rcu_bh_state);
> > +	struct rcu_state *rsp;
> > +
> > +	for_each_rcu_flavor(rsp)
> > +		print_rcubarrier(m, rsp);
> 
> Now that you call this function exactly once, I'd suggest inlining it
> for clarity; I don't think having it as a separate function makes sense
> anymore.

This one I am OK with.

> > @@ -129,24 +125,16 @@ static void print_one_rcu_data(struct seq_file *m, struct rcu_data *rdp)
> >  		   rdp->n_cbs_invoked, rdp->n_cbs_orphaned, rdp->n_cbs_adopted);
> >  }
> >  
> > -#define PRINT_RCU_DATA(name, func, m) \
> > -	do { \
> > -		int _p_r_d_i; \
> > -		\
> > -		for_each_possible_cpu(_p_r_d_i) \
> > -			func(m, &per_cpu(name, _p_r_d_i)); \
> > -	} while (0)
> > -
> >  static int show_rcudata(struct seq_file *m, void *unused)
> >  {
> > -#ifdef CONFIG_TREE_PREEMPT_RCU
> > -	seq_puts(m, "rcu_preempt:\n");
> > -	PRINT_RCU_DATA(rcu_preempt_data, print_one_rcu_data, m);
> > -#endif /* #ifdef CONFIG_TREE_PREEMPT_RCU */
> > -	seq_puts(m, "rcu_sched:\n");
> > -	PRINT_RCU_DATA(rcu_sched_data, print_one_rcu_data, m);
> > -	seq_puts(m, "rcu_bh:\n");
> > -	PRINT_RCU_DATA(rcu_bh_data, print_one_rcu_data, m);
> > +	int cpu;
> > +	struct rcu_state *rsp;
> > +
> > +	for_each_rcu_flavor(rsp) {
> > +		seq_printf(m, "%s:\n", rsp->name);
> > +		for_each_possible_cpu(cpu)
> > +		print_one_rcu_data(m, per_cpu_ptr(rsp->rda, cpu));
> > +	}
> 
> As above, I'd suggest inlining print_one_rcu_data.

Not this one, too bulky.

> Also, you need to indent the body of the for_each_possible_cpu loop.

Good point.

> > @@ -200,6 +188,9 @@ static void print_one_rcu_data_csv(struct seq_file *m, struct rcu_data *rdp)
> >  
> >  static int show_rcudata_csv(struct seq_file *m, void *unused)
> >  {
> > +	int cpu;
> > +	struct rcu_state *rsp;
> > +
> >  	seq_puts(m, "\"CPU\",\"Online?\",\"c\",\"g\",\"pq\",\"pgp\",\"pq\",");
> >  	seq_puts(m, "\"dt\",\"dt nesting\",\"dt NMI nesting\",\"df\",");
> >  	seq_puts(m, "\"of\",\"qll\",\"ql\",\"qs\"");
> > @@ -207,14 +198,11 @@ static int show_rcudata_csv(struct seq_file *m, void *unused)
> >  	seq_puts(m, "\"kt\",\"ktl\"");
> >  #endif /* #ifdef CONFIG_RCU_BOOST */
> >  	seq_puts(m, ",\"b\",\"ci\",\"co\",\"ca\"\n");
> > -#ifdef CONFIG_TREE_PREEMPT_RCU
> > -	seq_puts(m, "\"rcu_preempt:\"\n");
> > -	PRINT_RCU_DATA(rcu_preempt_data, print_one_rcu_data_csv, m);
> > -#endif /* #ifdef CONFIG_TREE_PREEMPT_RCU */
> > -	seq_puts(m, "\"rcu_sched:\"\n");
> > -	PRINT_RCU_DATA(rcu_sched_data, print_one_rcu_data_csv, m);
> > -	seq_puts(m, "\"rcu_bh:\"\n");
> > -	PRINT_RCU_DATA(rcu_bh_data, print_one_rcu_data_csv, m);
> > +	for_each_rcu_flavor(rsp) {
> > +		seq_printf(m, "\"%s:\"\n", rsp->name);
> > +		for_each_possible_cpu(cpu)
> > +			print_one_rcu_data_csv(m, per_cpu_ptr(rsp->rda, cpu));
> > +	}
> 
> As above, I'd suggest inlining print_one_rcu_data_csv.

Also too bulky.

> > @@ -304,9 +292,9 @@ static void print_one_rcu_state(struct seq_file *m, struct rcu_state *rsp)
> >  	struct rcu_node *rnp;
> >  
> >  	gpnum = rsp->gpnum;
> > -	seq_printf(m, "c=%lu g=%lu s=%d jfq=%ld j=%x "
> > +	seq_printf(m, "%s: c=%lu g=%lu s=%d jfq=%ld j=%x "
> >  		      "nfqs=%lu/nfqsng=%lu(%lu) fqlh=%lu oqlen=%ld/%ld\n",
> > -		   rsp->completed, gpnum, rsp->fqs_state,
> > +		   rsp->name, rsp->completed, gpnum, rsp->fqs_state,
> >  		   (long)(rsp->jiffies_force_qs - jiffies),
> >  		   (int)(jiffies & 0xffff),
> >  		   rsp->n_force_qs, rsp->n_force_qs_ngp,
> > @@ -329,14 +317,10 @@ static void print_one_rcu_state(struct seq_file *m, struct rcu_state *rsp)
> >  
> >  static int show_rcuhier(struct seq_file *m, void *unused)
> >  {
> > -#ifdef CONFIG_TREE_PREEMPT_RCU
> > -	seq_puts(m, "rcu_preempt:\n");
> > -	print_one_rcu_state(m, &rcu_preempt_state);
> > -#endif /* #ifdef CONFIG_TREE_PREEMPT_RCU */
> > -	seq_puts(m, "rcu_sched:\n");
> > -	print_one_rcu_state(m, &rcu_sched_state);
> > -	seq_puts(m, "rcu_bh:\n");
> > -	print_one_rcu_state(m, &rcu_bh_state);
> > +	struct rcu_state *rsp;
> > +
> > +	for_each_rcu_flavor(rsp)
> > +		print_one_rcu_state(m, rsp);
> 
> As above, I'd suggest inlining print_one_rcu_state.

Also too bulky.

> > @@ -377,11 +361,10 @@ static void show_one_rcugp(struct seq_file *m, struct rcu_state *rsp)
> >  
> >  static int show_rcugp(struct seq_file *m, void *unused)
> >  {
> > -#ifdef CONFIG_TREE_PREEMPT_RCU
> > -	show_one_rcugp(m, &rcu_preempt_state);
> > -#endif /* #ifdef CONFIG_TREE_PREEMPT_RCU */
> > -	show_one_rcugp(m, &rcu_sched_state);
> > -	show_one_rcugp(m, &rcu_bh_state);
> > +	struct rcu_state *rsp;
> > +
> > +	for_each_rcu_flavor(rsp)
> > +		show_one_rcugp(m, rsp);
> 
> As above, I'd suggest inlining show_one_rcugp.

Also too bulky.

> > @@ -430,14 +413,12 @@ static void print_rcu_pendings(struct seq_file *m, struct rcu_state *rsp)
> >  
> >  static int show_rcu_pending(struct seq_file *m, void *unused)
> >  {
> > -#ifdef CONFIG_TREE_PREEMPT_RCU
> > -	seq_puts(m, "rcu_preempt:\n");
> > -	print_rcu_pendings(m, &rcu_preempt_state);
> > -#endif /* #ifdef CONFIG_TREE_PREEMPT_RCU */
> > -	seq_puts(m, "rcu_sched:\n");
> > -	print_rcu_pendings(m, &rcu_sched_state);
> > -	seq_puts(m, "rcu_bh:\n");
> > -	print_rcu_pendings(m, &rcu_bh_state);
> > +	struct rcu_state *rsp;
> > +
> > +	for_each_rcu_flavor(rsp) {
> > +		seq_printf(m, "%s:\n", rsp->name);
> > +		print_rcu_pendings(m, rsp);
> > +	}
> 
> As above, I'd suggest inlining print_rcu_pendings.

This one I will inline.

							Thanx, Paul



* Re: [PATCH tip/core/rcu 13/15] rcu: Introduce for_each_rcu_flavor() and use it
  2012-06-15 23:52     ` Josh Triplett
@ 2012-06-16  1:01       ` Paul E. McKenney
  2012-06-16  5:35         ` Josh Triplett
  0 siblings, 1 reply; 50+ messages in thread
From: Paul E. McKenney @ 2012-06-16  1:01 UTC (permalink / raw)
  To: Josh Triplett
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
	eric.dumazet, darren, fweisbec, patches

On Fri, Jun 15, 2012 at 04:52:40PM -0700, Josh Triplett wrote:
> On Fri, Jun 15, 2012 at 02:06:08PM -0700, Paul E. McKenney wrote:
> > From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> > 
> > The arrival of TREE_PREEMPT_RCU some years back included some ugly
> > code involving either #ifdef or #ifdef'ed wrapper functions to iterate
> > over all non-SRCU flavors of RCU.  This commit therefore introduces
> > a for_each_rcu_flavor() iterator over the rcu_state structures for each
> > flavor of RCU to clean up a bit of the ugliness.
> > 
> > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> 
> Great cleanup!
> 
> A few comments below, though.
> 
> >  kernel/rcutree.c        |   53 +++++++++++++---------
> >  kernel/rcutree.h        |   12 ++---
> >  kernel/rcutree_plugin.h |  116 -----------------------------------------------
> >  3 files changed, 36 insertions(+), 145 deletions(-)
> 
> Awesome diffstat.

;-)

> > diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> > index bd4e41c..75ad92a 100644
> > --- a/kernel/rcutree.c
> > +++ b/kernel/rcutree.c
> > @@ -84,6 +84,7 @@ struct rcu_state rcu_bh_state = RCU_STATE_INITIALIZER(rcu_bh, call_rcu_bh);
> >  DEFINE_PER_CPU(struct rcu_data, rcu_bh_data);
> >  
> >  static struct rcu_state *rcu_state;
> > +LIST_HEAD(rcu_struct_flavors);
> 
> Does any means exist to turn this into a constant array known at compile
> time rather than a runtime linked list?  Having this as a compile-time
> constant may allow the compiler to unroll for_each_rcu_flavor and
> potentially inline the calls inside it.

I could do that, but none of the traversals is anywhere near performance
critical, and all the ways I can think of to do this are uglier than
the list.
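
(For concreteness: the list-based iterator is presumably just a thin wrapper around the new rcu_struct_flavors list, along the lines of the first definition below, with the member name guessed.  The second half sketches the compile-time-array alternative under discussion; it either needs each caller to supply an index or some similar contortion, and it drags the CONFIG_TREE_PREEMPT_RCU #ifdef back into view, which is presumably part of the ugliness being referred to.)

	/* List-based form, roughly as added by the patch: */
	#define for_each_rcu_flavor(rsp) \
		list_for_each_entry((rsp), &rcu_struct_flavors, flavors)

	/* Hypothetical compile-time-array form: */
	static struct rcu_state *const rcu_flavors[] = {
	#ifdef CONFIG_TREE_PREEMPT_RCU
		&rcu_preempt_state,
	#endif
		&rcu_sched_state,
		&rcu_bh_state,
	};

	#define for_each_rcu_flavor_idx(rsp, i) \
		for ((i) = 0; \
		     (i) < ARRAY_SIZE(rcu_flavors) && ((rsp) = rcu_flavors[(i)], 1); \
		     (i)++)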

> > @@ -2539,9 +2548,10 @@ rcu_init_percpu_data(int cpu, struct rcu_state *rsp, int preemptible)
> >  
> >  static void __cpuinit rcu_prepare_cpu(int cpu)
> >  {
> > -	rcu_init_percpu_data(cpu, &rcu_sched_state, 0);
> > -	rcu_init_percpu_data(cpu, &rcu_bh_state, 0);
> > -	rcu_preempt_init_percpu_data(cpu);
> > +	struct rcu_state *rsp;
> > +
> > +	for_each_rcu_flavor(rsp)
> > +		rcu_init_percpu_data(cpu, rsp, 0);
> 
> This results in passing 0 as the "preemptible" parameter of
> rcu_init_percpu_data, which seems wrong if the preemptible parameter has
> any meaning at all. :)

Good catch!  Hmmm...  Probably best to move this to the rcu_state
initialization.
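
A minimal sketch of that direction, purely illustrative (the
"preemptible" field in rcu_state below is an assumption, not the final
fix):

	/* Assumed: rcu_state records whether its flavor is preemptible,
	 * set by RCU_STATE_INITIALIZER() for the preemptible flavor only. */
	for_each_rcu_flavor(rsp)
		rcu_init_percpu_data(cpu, rsp, rsp->preemptible);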

> > @@ -2577,18 +2588,15 @@ static int __cpuinit rcu_cpu_notify(struct notifier_block *self,
> >  		 * touch any data without introducing corruption. We send the
> >  		 * dying CPU's callbacks to an arbitrarily chosen online CPU.
> >  		 */
> > -		rcu_cleanup_dying_cpu(&rcu_bh_state);
> > -		rcu_cleanup_dying_cpu(&rcu_sched_state);
> > -		rcu_preempt_cleanup_dying_cpu();
> > -		rcu_cleanup_after_idle(cpu);
> > +		for_each_rcu_flavor(rsp)
> > +			rcu_cleanup_dying_cpu(rsp);
> 
> Why did rcu_cleanup_after_idle go away here?

Because I fat-fingered it.  Thank you very much for spotting this.  It
would have been nasty to find otherwise.

							Thanx, Paul



* Re: [PATCH tip/core/rcu 15/15] rcu: RCU_SAVE_DYNTICK code no longer ever dead
  2012-06-16  0:04       ` Josh Triplett
@ 2012-06-16  1:04         ` Paul E. McKenney
  0 siblings, 0 replies; 50+ messages in thread
From: Paul E. McKenney @ 2012-06-16  1:04 UTC (permalink / raw)
  To: Josh Triplett
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
	eric.dumazet, darren, fweisbec, patches

On Fri, Jun 15, 2012 at 05:04:49PM -0700, Josh Triplett wrote:
> On Fri, Jun 15, 2012 at 05:02:38PM -0700, Josh Triplett wrote:
> > On Fri, Jun 15, 2012 at 02:06:10PM -0700, Paul E. McKenney wrote:
> > > From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> > > 
> > > Before RCU had unified idle, the RCU_SAVE_DYNTICK leg of the switch
> > > statement in force_quiescent_state() was dead code for CONFIG_NO_HZ=n
> > > kernel builds.  With unified idle, the code is never dead.  This commit
> > > therefore removes the "if" statement designed to make gcc aware of when
> > > the code was and was not dead.
> > > 
> > > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > 
> > One comment below; with that change:
> > 
> > Reviewed-by: Josh Triplett <josh@joshtriplett.org>
> > 
> > >  kernel/rcutree.c |    2 --
> > >  1 files changed, 0 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> > > index 75ad92a..0b0c9cc 100644
> > > --- a/kernel/rcutree.c
> > > +++ b/kernel/rcutree.c
> > > @@ -1744,8 +1744,6 @@ static void force_quiescent_state(struct rcu_state *rsp, int relaxed)
> > >  		break; /* grace period idle or initializing, ignore. */
> > >  
> > >  	case RCU_SAVE_DYNTICK:
> > > -		if (RCU_SIGNAL_INIT != RCU_SAVE_DYNTICK)
> > > -			break; /* So gcc recognizes the dead code. */
> > >  
> > >  		raw_spin_unlock(&rnp->lock);  /* irqs remain disabled */
> > 
> > Drop the blank line too?
> 
> Actually, I just realized a larger concern with what this change
> implies: does this mean that whatever change made this code no longer
> dead introduced a major locking bug here?  If so, has that change
> already progressed past the point where you could update it to include
> this fix?

No, the lock is dropped and then reacquired, so the "break" is OK.
This change should have been made back when dyntick-idle mode became
unconditional from RCU's viewpoint.
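
In outline, the case body keeps its locking balanced on its own; the
sketch below illustrates the control flow rather than quoting the code:

	case RCU_SAVE_DYNTICK:
		raw_spin_unlock(&rnp->lock);	/* irqs remain disabled */

		/* ... scan CPUs and record dyntick-idle state ... */

		raw_spin_lock(&rnp->lock);	/* irqs still disabled */
		/* advance ->fqs_state under the lock */
		break;

So executing the case unconditionally drops and reacquires rnp->lock in
matched pairs, and no locking bug is introduced by removing the guard.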

And yes, I probably should change "rcu_dyntick" to "rcu_idle" and
make a bunch of similar changes.  But not particularly high priority.

							Thanx, Paul



* Re: [PATCH tip/core/rcu 02/15] rcu: Size rcu_node tree from nr_cpu_ids rather than NR_CPUS
  2012-06-16  0:37       ` Paul E. McKenney
@ 2012-06-16  5:17         ` Josh Triplett
  2012-06-16  6:38           ` Paul E. McKenney
  0 siblings, 1 reply; 50+ messages in thread
From: Josh Triplett @ 2012-06-16  5:17 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
	eric.dumazet, darren, fweisbec, patches

On Fri, Jun 15, 2012 at 05:37:14PM -0700, Paul E. McKenney wrote:
> On Fri, Jun 15, 2012 at 02:47:26PM -0700, Josh Triplett wrote:
> > On Fri, Jun 15, 2012 at 02:05:57PM -0700, Paul E. McKenney wrote:
> > > From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> > > 
> > > The rcu_node tree array is sized based on compile-time constants,
> > > including NR_CPUS.  Although this approach has worked well in the past,
> > > the recent trend by many distros to define NR_CPUS=4096 results in
> > > excessive grace-period-initialization latencies.
> > > 
> > > This commit therefore substitutes the run-time computed nr_cpu_ids for
> > > the compile-time NR_CPUS when building the tree.  This can result in
> > > much of the compile-time-allocated rcu_node array being unused.  If
> > > this is a major problem, you are in a specialized situation anyway,
> > > so you can manually adjust the NR_CPUS, RCU_FANOUT, and RCU_FANOUT_LEAF
> > > kernel config parameters.
> > > 
> > > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > > ---
> > >  kernel/rcutree.c        |    2 +-
> > >  kernel/rcutree_plugin.h |    2 ++
> > >  2 files changed, 3 insertions(+), 1 deletions(-)
> > > 
> > > diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> > > index a151184..9098910 100644
> > > --- a/kernel/rcutree.c
> > > +++ b/kernel/rcutree.c
> > > @@ -2672,7 +2672,7 @@ static void __init rcu_init_geometry(void)
> > >  {
> > >  	int i;
> > >  	int j;
> > > -	int n = NR_CPUS;
> > > +	int n = nr_cpu_ids;
> > 
> > Same question as before: why have this as a variable when it never
> > changes?
> 
> Ah, that explains why.  This prevented me from forgetting the random
> NR_CPUS.

Does that mean it can go away now that you've written the patches?

- Josh Triplett


* Re: [PATCH tip/core/rcu 14/15] rcu: Use for_each_rcu_flavor() in TREE_RCU tracing
  2012-06-16  0:56       ` Paul E. McKenney
@ 2012-06-16  5:22         ` Josh Triplett
  2012-06-16  6:42           ` Paul E. McKenney
  0 siblings, 1 reply; 50+ messages in thread
From: Josh Triplett @ 2012-06-16  5:22 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
	eric.dumazet, darren, fweisbec, patches

On Fri, Jun 15, 2012 at 05:56:05PM -0700, Paul E. McKenney wrote:
> On Fri, Jun 15, 2012 at 04:59:57PM -0700, Josh Triplett wrote:
> > On Fri, Jun 15, 2012 at 02:06:09PM -0700, Paul E. McKenney wrote:
> > > @@ -129,24 +125,16 @@ static void print_one_rcu_data(struct seq_file *m, struct rcu_data *rdp)
> > >  		   rdp->n_cbs_invoked, rdp->n_cbs_orphaned, rdp->n_cbs_adopted);
> > >  }
> > >  
> > > -#define PRINT_RCU_DATA(name, func, m) \
> > > -	do { \
> > > -		int _p_r_d_i; \
> > > -		\
> > > -		for_each_possible_cpu(_p_r_d_i) \
> > > -			func(m, &per_cpu(name, _p_r_d_i)); \
> > > -	} while (0)
> > > -
> > >  static int show_rcudata(struct seq_file *m, void *unused)
> > >  {
> > > -#ifdef CONFIG_TREE_PREEMPT_RCU
> > > -	seq_puts(m, "rcu_preempt:\n");
> > > -	PRINT_RCU_DATA(rcu_preempt_data, print_one_rcu_data, m);
> > > -#endif /* #ifdef CONFIG_TREE_PREEMPT_RCU */
> > > -	seq_puts(m, "rcu_sched:\n");
> > > -	PRINT_RCU_DATA(rcu_sched_data, print_one_rcu_data, m);
> > > -	seq_puts(m, "rcu_bh:\n");
> > > -	PRINT_RCU_DATA(rcu_bh_data, print_one_rcu_data, m);
> > > +	int cpu;
> > > +	struct rcu_state *rsp;
> > > +
> > > +	for_each_rcu_flavor(rsp) {
> > > +		seq_printf(m, "%s:\n", rsp->name);
> > > +		for_each_possible_cpu(cpu)
> > > +			print_one_rcu_data(m, per_cpu_ptr(rsp->rda, cpu));
> > > +	}
> > 
> > As above, I'd suggest inlining print_one_rcu_data.
> 
> Not this one, too bulky.

I looked at the implementation; it just consists of a pile of calls to
seq_printf.  What about that makes it too bulky to include in the body
of the loop?

> > > @@ -200,6 +188,9 @@ static void print_one_rcu_data_csv(struct seq_file *m, struct rcu_data *rdp)
> > >  
> > >  static int show_rcudata_csv(struct seq_file *m, void *unused)
> > >  {
> > > +	int cpu;
> > > +	struct rcu_state *rsp;
> > > +
> > >  	seq_puts(m, "\"CPU\",\"Online?\",\"c\",\"g\",\"pq\",\"pgp\",\"pq\",");
> > >  	seq_puts(m, "\"dt\",\"dt nesting\",\"dt NMI nesting\",\"df\",");
> > >  	seq_puts(m, "\"of\",\"qll\",\"ql\",\"qs\"");
> > > @@ -207,14 +198,11 @@ static int show_rcudata_csv(struct seq_file *m, void *unused)
> > >  	seq_puts(m, "\"kt\",\"ktl\"");
> > >  #endif /* #ifdef CONFIG_RCU_BOOST */
> > >  	seq_puts(m, ",\"b\",\"ci\",\"co\",\"ca\"\n");
> > > -#ifdef CONFIG_TREE_PREEMPT_RCU
> > > -	seq_puts(m, "\"rcu_preempt:\"\n");
> > > -	PRINT_RCU_DATA(rcu_preempt_data, print_one_rcu_data_csv, m);
> > > -#endif /* #ifdef CONFIG_TREE_PREEMPT_RCU */
> > > -	seq_puts(m, "\"rcu_sched:\"\n");
> > > -	PRINT_RCU_DATA(rcu_sched_data, print_one_rcu_data_csv, m);
> > > -	seq_puts(m, "\"rcu_bh:\"\n");
> > > -	PRINT_RCU_DATA(rcu_bh_data, print_one_rcu_data_csv, m);
> > > +	for_each_rcu_flavor(rsp) {
> > > +		seq_printf(m, "\"%s:\"\n", rsp->name);
> > > +		for_each_possible_cpu(cpu)
> > > +			print_one_rcu_data_csv(m, per_cpu_ptr(rsp->rda, cpu));
> > > +	}
> > 
> > As above, I'd suggest inlining print_one_rcu_data_csv.
> 
> Also too bulky.

Also just a few calls to seq_printf. :)

> > > @@ -304,9 +292,9 @@ static void print_one_rcu_state(struct seq_file *m, struct rcu_state *rsp)
> > >  	struct rcu_node *rnp;
> > >  
> > >  	gpnum = rsp->gpnum;
> > > -	seq_printf(m, "c=%lu g=%lu s=%d jfq=%ld j=%x "
> > > +	seq_printf(m, "%s: c=%lu g=%lu s=%d jfq=%ld j=%x "
> > >  		      "nfqs=%lu/nfqsng=%lu(%lu) fqlh=%lu oqlen=%ld/%ld\n",
> > > -		   rsp->completed, gpnum, rsp->fqs_state,
> > > +		   rsp->name, rsp->completed, gpnum, rsp->fqs_state,
> > >  		   (long)(rsp->jiffies_force_qs - jiffies),
> > >  		   (int)(jiffies & 0xffff),
> > >  		   rsp->n_force_qs, rsp->n_force_qs_ngp,
> > > @@ -329,14 +317,10 @@ static void print_one_rcu_state(struct seq_file *m, struct rcu_state *rsp)
> > >  
> > >  static int show_rcuhier(struct seq_file *m, void *unused)
> > >  {
> > > -#ifdef CONFIG_TREE_PREEMPT_RCU
> > > -	seq_puts(m, "rcu_preempt:\n");
> > > -	print_one_rcu_state(m, &rcu_preempt_state);
> > > -#endif /* #ifdef CONFIG_TREE_PREEMPT_RCU */
> > > -	seq_puts(m, "rcu_sched:\n");
> > > -	print_one_rcu_state(m, &rcu_sched_state);
> > > -	seq_puts(m, "rcu_bh:\n");
> > > -	print_one_rcu_state(m, &rcu_bh_state);
> > > +	struct rcu_state *rsp;
> > > +
> > > +	for_each_rcu_flavor(rsp)
> > > +		print_one_rcu_state(m, rsp);
> > 
> > As above, I'd suggest inlining print_one_rcu_state.
> 
> Also too bulky.

This one I'll grant, since it would introduce an additional level of
nested loop.

> > > @@ -377,11 +361,10 @@ static void show_one_rcugp(struct seq_file *m, struct rcu_state *rsp)
> > >  
> > >  static int show_rcugp(struct seq_file *m, void *unused)
> > >  {
> > > -#ifdef CONFIG_TREE_PREEMPT_RCU
> > > -	show_one_rcugp(m, &rcu_preempt_state);
> > > -#endif /* #ifdef CONFIG_TREE_PREEMPT_RCU */
> > > -	show_one_rcugp(m, &rcu_sched_state);
> > > -	show_one_rcugp(m, &rcu_bh_state);
> > > +	struct rcu_state *rsp;
> > > +
> > > +	for_each_rcu_flavor(rsp)
> > > +		show_one_rcugp(m, rsp);
> > 
> > As above, I'd suggest inlining show_one_rcugp.
> 
> Also too bulky.

show_one_rcugp seems like an extremely simple function; what makes it
unsuitable for the body of this loop?

- Josh Triplett


* Re: [PATCH tip/core/rcu 13/15] rcu: Introduce for_each_rcu_flavor() and use it
  2012-06-16  1:01       ` Paul E. McKenney
@ 2012-06-16  5:35         ` Josh Triplett
  2012-06-16  6:36           ` Paul E. McKenney
  0 siblings, 1 reply; 50+ messages in thread
From: Josh Triplett @ 2012-06-16  5:35 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
	eric.dumazet, darren, fweisbec, patches

On Fri, Jun 15, 2012 at 06:01:49PM -0700, Paul E. McKenney wrote:
> On Fri, Jun 15, 2012 at 04:52:40PM -0700, Josh Triplett wrote:
> > On Fri, Jun 15, 2012 at 02:06:08PM -0700, Paul E. McKenney wrote:
> > > diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> > > index bd4e41c..75ad92a 100644
> > > --- a/kernel/rcutree.c
> > > +++ b/kernel/rcutree.c
> > > @@ -84,6 +84,7 @@ struct rcu_state rcu_bh_state = RCU_STATE_INITIALIZER(rcu_bh, call_rcu_bh);
> > >  DEFINE_PER_CPU(struct rcu_data, rcu_bh_data);
> > >  
> > >  static struct rcu_state *rcu_state;
> > > +LIST_HEAD(rcu_struct_flavors);
> > 
> > Does any means exist to turn this into a constant array known at compile
> > time rather than a runtime linked list?  Having this as a compile-time
> > constant may allow the compiler to unroll for_each_rcu_flavor and
> > potentially inline the calls inside it.
> 
> I could do that, but none of the traversals is anywhere near performance
> critical, and all the ways I can think of to do this are uglier than
> the list.

All of the struct rcu_state instances exist at compile time, so you can
just create an array of pointers to them:

static struct rcu_state *const rcu_struct_flavors[] = {
    &rcu_sched_state,
    &rcu_bh_state,
#ifdef CONFIG_TREE_PREEMPT_RCU
    &rcu_preempt_state,
#endif
};

Then just define for_each_rcu_flavor to iterate over that compile-time
constant array.  Any reason that wouldn't work?
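
One illustrative way to define the iterator over such an array; the
extra index parameter is an assumption made here to keep the sketch
valid C89, and differs from the single-argument form in the patch:

	#define for_each_rcu_flavor(i, rsp) \
		for ((i) = 0; \
		     (i) < ARRAY_SIZE(rcu_struct_flavors) && \
		     ((rsp) = rcu_struct_flavors[(i)], 1); \
		     (i)++)

	/* usage:
	 *	int i;
	 *	struct rcu_state *rsp;
	 *
	 *	for_each_rcu_flavor(i, rsp)
	 *		do_something(rsp);
	 */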

- Josh Triplett


* Re: [PATCH tip/core/rcu 13/15] rcu: Introduce for_each_rcu_flavor() and use it
  2012-06-16  5:35         ` Josh Triplett
@ 2012-06-16  6:36           ` Paul E. McKenney
  0 siblings, 0 replies; 50+ messages in thread
From: Paul E. McKenney @ 2012-06-16  6:36 UTC (permalink / raw)
  To: Josh Triplett
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
	eric.dumazet, darren, fweisbec, patches

On Fri, Jun 15, 2012 at 10:35:00PM -0700, Josh Triplett wrote:
> On Fri, Jun 15, 2012 at 06:01:49PM -0700, Paul E. McKenney wrote:
> > On Fri, Jun 15, 2012 at 04:52:40PM -0700, Josh Triplett wrote:
> > > On Fri, Jun 15, 2012 at 02:06:08PM -0700, Paul E. McKenney wrote:
> > > > diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> > > > index bd4e41c..75ad92a 100644
> > > > --- a/kernel/rcutree.c
> > > > +++ b/kernel/rcutree.c
> > > > @@ -84,6 +84,7 @@ struct rcu_state rcu_bh_state = RCU_STATE_INITIALIZER(rcu_bh, call_rcu_bh);
> > > >  DEFINE_PER_CPU(struct rcu_data, rcu_bh_data);
> > > >  
> > > >  static struct rcu_state *rcu_state;
> > > > +LIST_HEAD(rcu_struct_flavors);
> > > 
> > > Does any means exist to turn this into a constant array known at compile
> > > time rather than a runtime linked list?  Having this as a compile-time
> > > constant may allow the compiler to unroll for_each_rcu_flavor and
> > > potentially inline the calls inside it.
> > 
> > I could do that, but none of the traversals is anywhere near performance
> > critical, and all the ways I can think of to do this are uglier than
> > the list.
> 
> All of the struct rcu_state instances exist at compile time, so you can
> just create an array of pointers to them:
> 
> static struct rcu_state *const rcu_struct_flavors[] = {
>     &rcu_sched_state,
>     &rcu_bh_state,
> #ifdef CONFIG_TREE_PREEMPT_RCU
>     &rcu_preempt_state,
> #endif
> };
> 
> Then just define for_each_rcu_flavor to iterate over that compile-time
> constant array.  Any reason that wouldn't work?

It could work, but I like the automatic self-registration of the current system.
Your array would add one more thing that would need to be manually
kept consistent.  Now, if any of the traversals were on a fastpath,
that would be different.

							Thanx, Paul



* Re: [PATCH tip/core/rcu 02/15] rcu: Size rcu_node tree from nr_cpu_ids rather than NR_CPUS
  2012-06-16  5:17         ` Josh Triplett
@ 2012-06-16  6:38           ` Paul E. McKenney
  2012-06-16  9:17             ` Josh Triplett
  0 siblings, 1 reply; 50+ messages in thread
From: Paul E. McKenney @ 2012-06-16  6:38 UTC (permalink / raw)
  To: Josh Triplett
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
	eric.dumazet, darren, fweisbec, patches

On Fri, Jun 15, 2012 at 10:17:12PM -0700, Josh Triplett wrote:
> On Fri, Jun 15, 2012 at 05:37:14PM -0700, Paul E. McKenney wrote:
> > On Fri, Jun 15, 2012 at 02:47:26PM -0700, Josh Triplett wrote:
> > > On Fri, Jun 15, 2012 at 02:05:57PM -0700, Paul E. McKenney wrote:
> > > > From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> > > > 
> > > > The rcu_node tree array is sized based on compile-time constants,
> > > > including NR_CPUS.  Although this approach has worked well in the past,
> > > > the recent trend by many distros to define NR_CPUS=4096 results in
> > > > excessive grace-period-initialization latencies.
> > > > 
> > > > This commit therefore substitutes the run-time computed nr_cpu_ids for
> > > > the compile-time NR_CPUS when building the tree.  This can result in
> > > > much of the compile-time-allocated rcu_node array being unused.  If
> > > > this is a major problem, you are in a specialized situation anyway,
> > > > so you can manually adjust the NR_CPUS, RCU_FANOUT, and RCU_FANOUT_LEAF
> > > > kernel config parameters.
> > > > 
> > > > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > > > ---
> > > >  kernel/rcutree.c        |    2 +-
> > > >  kernel/rcutree_plugin.h |    2 ++
> > > >  2 files changed, 3 insertions(+), 1 deletions(-)
> > > > 
> > > > diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> > > > index a151184..9098910 100644
> > > > --- a/kernel/rcutree.c
> > > > +++ b/kernel/rcutree.c
> > > > @@ -2672,7 +2672,7 @@ static void __init rcu_init_geometry(void)
> > > >  {
> > > >  	int i;
> > > >  	int j;
> > > > -	int n = NR_CPUS;
> > > > +	int n = nr_cpu_ids;
> > > 
> > > Same question as before: why have this as a variable when it never
> > > changes?
> > 
> > Ah, that explains why.  This prevented me from forgetting the random
> > NR_CPUS.
> 
> Does that mean it can go away now that you've written the patches?

If I don't have to change from nr_cpu_ids to yet another thing over
the next while, then it might be worth changing.

							Thanx, Paul



* Re: [PATCH tip/core/rcu 14/15] rcu: Use for_each_rcu_flavor() in TREE_RCU tracing
  2012-06-16  5:22         ` Josh Triplett
@ 2012-06-16  6:42           ` Paul E. McKenney
  0 siblings, 0 replies; 50+ messages in thread
From: Paul E. McKenney @ 2012-06-16  6:42 UTC (permalink / raw)
  To: Josh Triplett
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
	eric.dumazet, darren, fweisbec, patches

On Fri, Jun 15, 2012 at 10:22:14PM -0700, Josh Triplett wrote:
> On Fri, Jun 15, 2012 at 05:56:05PM -0700, Paul E. McKenney wrote:
> > On Fri, Jun 15, 2012 at 04:59:57PM -0700, Josh Triplett wrote:
> > > On Fri, Jun 15, 2012 at 02:06:09PM -0700, Paul E. McKenney wrote:
> > > > @@ -129,24 +125,16 @@ static void print_one_rcu_data(struct seq_file *m, struct rcu_data *rdp)
> > > >  		   rdp->n_cbs_invoked, rdp->n_cbs_orphaned, rdp->n_cbs_adopted);
> > > >  }
> > > >  
> > > > -#define PRINT_RCU_DATA(name, func, m) \
> > > > -	do { \
> > > > -		int _p_r_d_i; \
> > > > -		\
> > > > -		for_each_possible_cpu(_p_r_d_i) \
> > > > -			func(m, &per_cpu(name, _p_r_d_i)); \
> > > > -	} while (0)
> > > > -
> > > >  static int show_rcudata(struct seq_file *m, void *unused)
> > > >  {
> > > > -#ifdef CONFIG_TREE_PREEMPT_RCU
> > > > -	seq_puts(m, "rcu_preempt:\n");
> > > > -	PRINT_RCU_DATA(rcu_preempt_data, print_one_rcu_data, m);
> > > > -#endif /* #ifdef CONFIG_TREE_PREEMPT_RCU */
> > > > -	seq_puts(m, "rcu_sched:\n");
> > > > -	PRINT_RCU_DATA(rcu_sched_data, print_one_rcu_data, m);
> > > > -	seq_puts(m, "rcu_bh:\n");
> > > > -	PRINT_RCU_DATA(rcu_bh_data, print_one_rcu_data, m);
> > > > +	int cpu;
> > > > +	struct rcu_state *rsp;
> > > > +
> > > > +	for_each_rcu_flavor(rsp) {
> > > > +		seq_printf(m, "%s:\n", rsp->name);
> > > > +		for_each_possible_cpu(cpu)
> > > > +			print_one_rcu_data(m, per_cpu_ptr(rsp->rda, cpu));
> > > > +	}
> > > 
> > > As above, I'd suggest inlining print_one_rcu_data.
> > 
> > Not this one, too bulky.
> 
> I looked at the implementation; it just consists of a pile of calls to
> seq_printf.  What about that makes it too bulky to include in the body
> of the loop?

Your referring to it as "a pile of calls" is strong evidence.  ;-)

> > > > @@ -200,6 +188,9 @@ static void print_one_rcu_data_csv(struct seq_file *m, struct rcu_data *rdp)
> > > >  
> > > >  static int show_rcudata_csv(struct seq_file *m, void *unused)
> > > >  {
> > > > +	int cpu;
> > > > +	struct rcu_state *rsp;
> > > > +
> > > >  	seq_puts(m, "\"CPU\",\"Online?\",\"c\",\"g\",\"pq\",\"pgp\",\"pq\",");
> > > >  	seq_puts(m, "\"dt\",\"dt nesting\",\"dt NMI nesting\",\"df\",");
> > > >  	seq_puts(m, "\"of\",\"qll\",\"ql\",\"qs\"");
> > > > @@ -207,14 +198,11 @@ static int show_rcudata_csv(struct seq_file *m, void *unused)
> > > >  	seq_puts(m, "\"kt\",\"ktl\"");
> > > >  #endif /* #ifdef CONFIG_RCU_BOOST */
> > > >  	seq_puts(m, ",\"b\",\"ci\",\"co\",\"ca\"\n");
> > > > -#ifdef CONFIG_TREE_PREEMPT_RCU
> > > > -	seq_puts(m, "\"rcu_preempt:\"\n");
> > > > -	PRINT_RCU_DATA(rcu_preempt_data, print_one_rcu_data_csv, m);
> > > > -#endif /* #ifdef CONFIG_TREE_PREEMPT_RCU */
> > > > -	seq_puts(m, "\"rcu_sched:\"\n");
> > > > -	PRINT_RCU_DATA(rcu_sched_data, print_one_rcu_data_csv, m);
> > > > -	seq_puts(m, "\"rcu_bh:\"\n");
> > > > -	PRINT_RCU_DATA(rcu_bh_data, print_one_rcu_data_csv, m);
> > > > +	for_each_rcu_flavor(rsp) {
> > > > +		seq_printf(m, "\"%s:\"\n", rsp->name);
> > > > +		for_each_possible_cpu(cpu)
> > > > +			print_one_rcu_data_csv(m, per_cpu_ptr(rsp->rda, cpu));
> > > > +	}
> > > 
> > > As above, I'd suggest inlining print_one_rcu_data_csv.
> > 
> > Also too bulky.
> 
> Also just a few calls to seq_printf. :)

For some definition or another of a few lines of code.  ;-)

> > > > @@ -304,9 +292,9 @@ static void print_one_rcu_state(struct seq_file *m, struct rcu_state *rsp)
> > > >  	struct rcu_node *rnp;
> > > >  
> > > >  	gpnum = rsp->gpnum;
> > > > -	seq_printf(m, "c=%lu g=%lu s=%d jfq=%ld j=%x "
> > > > +	seq_printf(m, "%s: c=%lu g=%lu s=%d jfq=%ld j=%x "
> > > >  		      "nfqs=%lu/nfqsng=%lu(%lu) fqlh=%lu oqlen=%ld/%ld\n",
> > > > -		   rsp->completed, gpnum, rsp->fqs_state,
> > > > +		   rsp->name, rsp->completed, gpnum, rsp->fqs_state,
> > > >  		   (long)(rsp->jiffies_force_qs - jiffies),
> > > >  		   (int)(jiffies & 0xffff),
> > > >  		   rsp->n_force_qs, rsp->n_force_qs_ngp,
> > > > @@ -329,14 +317,10 @@ static void print_one_rcu_state(struct seq_file *m, struct rcu_state *rsp)
> > > >  
> > > >  static int show_rcuhier(struct seq_file *m, void *unused)
> > > >  {
> > > > -#ifdef CONFIG_TREE_PREEMPT_RCU
> > > > -	seq_puts(m, "rcu_preempt:\n");
> > > > -	print_one_rcu_state(m, &rcu_preempt_state);
> > > > -#endif /* #ifdef CONFIG_TREE_PREEMPT_RCU */
> > > > -	seq_puts(m, "rcu_sched:\n");
> > > > -	print_one_rcu_state(m, &rcu_sched_state);
> > > > -	seq_puts(m, "rcu_bh:\n");
> > > > -	print_one_rcu_state(m, &rcu_bh_state);
> > > > +	struct rcu_state *rsp;
> > > > +
> > > > +	for_each_rcu_flavor(rsp)
> > > > +		print_one_rcu_state(m, rsp);
> > > 
> > > As above, I'd suggest inlining print_one_rcu_state.
> > 
> > Also too bulky.
> 
> This one I'll grant, since it would introduce an additional level of
> nested loop.
> 
> > > > @@ -377,11 +361,10 @@ static void show_one_rcugp(struct seq_file *m, struct rcu_state *rsp)
> > > >  
> > > >  static int show_rcugp(struct seq_file *m, void *unused)
> > > >  {
> > > > -#ifdef CONFIG_TREE_PREEMPT_RCU
> > > > -	show_one_rcugp(m, &rcu_preempt_state);
> > > > -#endif /* #ifdef CONFIG_TREE_PREEMPT_RCU */
> > > > -	show_one_rcugp(m, &rcu_sched_state);
> > > > -	show_one_rcugp(m, &rcu_bh_state);
> > > > +	struct rcu_state *rsp;
> > > > +
> > > > +	for_each_rcu_flavor(rsp)
> > > > +		show_one_rcugp(m, rsp);
> > > 
> > > As above, I'd suggest inlining show_one_rcugp.
> > 
> > Also too bulky.
> 
> show_one_rcugp seems like an extremely simple function; what makes it
> unsuitable for the body of this loop?

Ummm...  Nothing.  I actually did inline this one.  I was clearly
confused when documenting my actions!

							Thanx, Paul



* Re: [PATCH tip/core/rcu 02/15] rcu: Size rcu_node tree from nr_cpu_ids rather than NR_CPUS
  2012-06-16  6:38           ` Paul E. McKenney
@ 2012-06-16  9:17             ` Josh Triplett
  2012-06-16 14:44               ` Paul E. McKenney
  0 siblings, 1 reply; 50+ messages in thread
From: Josh Triplett @ 2012-06-16  9:17 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
	eric.dumazet, darren, fweisbec, patches

On Fri, Jun 15, 2012 at 11:38:48PM -0700, Paul E. McKenney wrote:
> On Fri, Jun 15, 2012 at 10:17:12PM -0700, Josh Triplett wrote:
> > On Fri, Jun 15, 2012 at 05:37:14PM -0700, Paul E. McKenney wrote:
> > > On Fri, Jun 15, 2012 at 02:47:26PM -0700, Josh Triplett wrote:
> > > > On Fri, Jun 15, 2012 at 02:05:57PM -0700, Paul E. McKenney wrote:
> > > > > From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> > > > > 
> > > > > The rcu_node tree array is sized based on compile-time constants,
> > > > > including NR_CPUS.  Although this approach has worked well in the past,
> > > > > the recent trend by many distros to define NR_CPUS=4096 results in
> > > > > excessive grace-period-initialization latencies.
> > > > > 
> > > > > This commit therefore substitutes the run-time computed nr_cpu_ids for
> > > > > the compile-time NR_CPUS when building the tree.  This can result in
> > > > > much of the compile-time-allocated rcu_node array being unused.  If
> > > > > this is a major problem, you are in a specialized situation anyway,
> > > > > so you can manually adjust the NR_CPUS, RCU_FANOUT, and RCU_FANOUT_LEAF
> > > > > kernel config parameters.
> > > > > 
> > > > > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > > > > ---
> > > > >  kernel/rcutree.c        |    2 +-
> > > > >  kernel/rcutree_plugin.h |    2 ++
> > > > >  2 files changed, 3 insertions(+), 1 deletions(-)
> > > > > 
> > > > > diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> > > > > index a151184..9098910 100644
> > > > > --- a/kernel/rcutree.c
> > > > > +++ b/kernel/rcutree.c
> > > > > @@ -2672,7 +2672,7 @@ static void __init rcu_init_geometry(void)
> > > > >  {
> > > > >  	int i;
> > > > >  	int j;
> > > > > -	int n = NR_CPUS;
> > > > > +	int n = nr_cpu_ids;
> > > > 
> > > > Same question as before: why have this as a variable when it never
> > > > changes?
> > > 
> > > Ah, that explains why.  This prevented me from forgetting the random
> > > NR_CPUS.
> > 
> > Does that mean it can go away now that you've written the patches?
> 
> If I don't have to change from nr_cpu_ids to yet another thing over
> the next while, then it might be worth changing.

That sounds like an argument for a #define or a static const, rather
than a local variable. :)

- Josh Triplett


* Re: [PATCH tip/core/rcu 02/15] rcu: Size rcu_node tree from nr_cpu_ids rather than NR_CPUS
  2012-06-16  9:17             ` Josh Triplett
@ 2012-06-16 14:44               ` Paul E. McKenney
  2012-06-16 14:51                 ` Paul E. McKenney
  0 siblings, 1 reply; 50+ messages in thread
From: Paul E. McKenney @ 2012-06-16 14:44 UTC (permalink / raw)
  To: Josh Triplett
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
	eric.dumazet, darren, fweisbec, patches

On Sat, Jun 16, 2012 at 02:17:33AM -0700, Josh Triplett wrote:
> On Fri, Jun 15, 2012 at 11:38:48PM -0700, Paul E. McKenney wrote:
> > On Fri, Jun 15, 2012 at 10:17:12PM -0700, Josh Triplett wrote:
> > > On Fri, Jun 15, 2012 at 05:37:14PM -0700, Paul E. McKenney wrote:
> > > > On Fri, Jun 15, 2012 at 02:47:26PM -0700, Josh Triplett wrote:
> > > > > On Fri, Jun 15, 2012 at 02:05:57PM -0700, Paul E. McKenney wrote:
> > > > > > From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> > > > > > 
> > > > > > The rcu_node tree array is sized based on compile-time constants,
> > > > > > including NR_CPUS.  Although this approach has worked well in the past,
> > > > > > the recent trend by many distros to define NR_CPUS=4096 results in
> > > > > > excessive grace-period-initialization latencies.
> > > > > > 
> > > > > > This commit therefore substitutes the run-time computed nr_cpu_ids for
> > > > > > the compile-time NR_CPUS when building the tree.  This can result in
> > > > > > much of the compile-time-allocated rcu_node array being unused.  If
> > > > > > this is a major problem, you are in a specialized situation anyway,
> > > > > > so you can manually adjust the NR_CPUS, RCU_FANOUT, and RCU_FANOUT_LEAF
> > > > > > kernel config parameters.
> > > > > > 
> > > > > > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > > > > > ---
> > > > > >  kernel/rcutree.c        |    2 +-
> > > > > >  kernel/rcutree_plugin.h |    2 ++
> > > > > >  2 files changed, 3 insertions(+), 1 deletions(-)
> > > > > > 
> > > > > > diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> > > > > > index a151184..9098910 100644
> > > > > > --- a/kernel/rcutree.c
> > > > > > +++ b/kernel/rcutree.c
> > > > > > @@ -2672,7 +2672,7 @@ static void __init rcu_init_geometry(void)
> > > > > >  {
> > > > > >  	int i;
> > > > > >  	int j;
> > > > > > -	int n = NR_CPUS;
> > > > > > +	int n = nr_cpu_ids;
> > > > > 
> > > > > Same question as before: why have this as a variable when it never
> > > > > changes?
> > > > 
> > > > Ah, that explains why.  This prevented me from forgetting the random
> > > > NR_CPUS.
> > > 
> > > Does that mean it can go away now that you've written the patches?
> > 
> > If I don't have to change from nr_cpu_ids to yet another thing over
> > the next while, then it might be worth changing.
> 
> That sounds like an argument for a #define or a static const, rather
> than a local variable. :)

OK, static const it is!

							Thanx, Paul



* Re: [PATCH tip/core/rcu 02/15] rcu: Size rcu_node tree from nr_cpu_ids rather than NR_CPUS
  2012-06-16 14:44               ` Paul E. McKenney
@ 2012-06-16 14:51                 ` Paul E. McKenney
  2012-06-16 20:31                   ` Josh Triplett
  0 siblings, 1 reply; 50+ messages in thread
From: Paul E. McKenney @ 2012-06-16 14:51 UTC (permalink / raw)
  To: Josh Triplett
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
	eric.dumazet, darren, fweisbec, patches

On Sat, Jun 16, 2012 at 07:44:53AM -0700, Paul E. McKenney wrote:
> On Sat, Jun 16, 2012 at 02:17:33AM -0700, Josh Triplett wrote:
> > On Fri, Jun 15, 2012 at 11:38:48PM -0700, Paul E. McKenney wrote:
> > > On Fri, Jun 15, 2012 at 10:17:12PM -0700, Josh Triplett wrote:
> > > > On Fri, Jun 15, 2012 at 05:37:14PM -0700, Paul E. McKenney wrote:
> > > > > On Fri, Jun 15, 2012 at 02:47:26PM -0700, Josh Triplett wrote:
> > > > > > On Fri, Jun 15, 2012 at 02:05:57PM -0700, Paul E. McKenney wrote:
> > > > > > > From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> > > > > > > 
> > > > > > > The rcu_node tree array is sized based on compile-time constants,
> > > > > > > including NR_CPUS.  Although this approach has worked well in the past,
> > > > > > > the recent trend by many distros to define NR_CPUS=4096 results in
> > > > > > > excessive grace-period-initialization latencies.
> > > > > > > 
> > > > > > > This commit therefore substitutes the run-time computed nr_cpu_ids for
> > > > > > > the compile-time NR_CPUS when building the tree.  This can result in
> > > > > > > much of the compile-time-allocated rcu_node array being unused.  If
> > > > > > > this is a major problem, you are in a specialized situation anyway,
> > > > > > > so you can manually adjust the NR_CPUS, RCU_FANOUT, and RCU_FANOUT_LEAF
> > > > > > > kernel config parameters.
> > > > > > > 
> > > > > > > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > > > > > > ---
> > > > > > >  kernel/rcutree.c        |    2 +-
> > > > > > >  kernel/rcutree_plugin.h |    2 ++
> > > > > > >  2 files changed, 3 insertions(+), 1 deletions(-)
> > > > > > > 
> > > > > > > diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> > > > > > > index a151184..9098910 100644
> > > > > > > --- a/kernel/rcutree.c
> > > > > > > +++ b/kernel/rcutree.c
> > > > > > > @@ -2672,7 +2672,7 @@ static void __init rcu_init_geometry(void)
> > > > > > >  {
> > > > > > >  	int i;
> > > > > > >  	int j;
> > > > > > > -	int n = NR_CPUS;
> > > > > > > +	int n = nr_cpu_ids;
> > > > > > 
> > > > > > Same question as before: why have this as a variable when it never
> > > > > > changes?
> > > > > 
> > > > > Ah, that explains why.  This prevented me from forgetting the random
> > > > > NR_CPUS.
> > > > 
> > > > Does that mean it can go away now that you've written the patches?
> > > 
> > > If I don't have to change from nr_cpu_ids to yet another thing over
> > > the next while, then it might be worth changing.
> > 
> > That sounds like an argument for a #define or a static const, rather
> > than a local variable. :)
> 
> OK, static const it is!

Except that the compiler doesn't like the run-time initialization of
a static const variable.

Can't have everything, I guess.  ;-)
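
A minimal illustration of the sticking point (hypothetical snippet, not
from the patch):

	static const int n = nr_cpu_ids;  /* does not compile: the initializer
					     of a static object must be a
					     compile-time constant */

	const int n = nr_cpu_ids;	  /* compiles: an automatic const may
					     be initialized at run time */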

							Thanx, Paul



* Re: [PATCH tip/core/rcu 02/15] rcu: Size rcu_node tree from nr_cpu_ids rather than NR_CPUS
  2012-06-16 14:51                 ` Paul E. McKenney
@ 2012-06-16 20:31                   ` Josh Triplett
  0 siblings, 0 replies; 50+ messages in thread
From: Josh Triplett @ 2012-06-16 20:31 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
	eric.dumazet, darren, fweisbec, patches

On Sat, Jun 16, 2012 at 07:51:54AM -0700, Paul E. McKenney wrote:
> On Sat, Jun 16, 2012 at 07:44:53AM -0700, Paul E. McKenney wrote:
> > On Sat, Jun 16, 2012 at 02:17:33AM -0700, Josh Triplett wrote:
> > > On Fri, Jun 15, 2012 at 11:38:48PM -0700, Paul E. McKenney wrote:
> > > > On Fri, Jun 15, 2012 at 10:17:12PM -0700, Josh Triplett wrote:
> > > > > On Fri, Jun 15, 2012 at 05:37:14PM -0700, Paul E. McKenney wrote:
> > > > > > On Fri, Jun 15, 2012 at 02:47:26PM -0700, Josh Triplett wrote:
> > > > > > > On Fri, Jun 15, 2012 at 02:05:57PM -0700, Paul E. McKenney wrote:
> > > > > > > > From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> > > > > > > > 
> > > > > > > > The rcu_node tree array is sized based on compile-time constants,
> > > > > > > > including NR_CPUS.  Although this approach has worked well in the past,
> > > > > > > > the recent trend by many distros to define NR_CPUS=4096 results in
> > > > > > > > excessive grace-period-initialization latencies.
> > > > > > > > 
> > > > > > > > This commit therefore substitutes the run-time computed nr_cpu_ids for
> > > > > > > > the compile-time NR_CPUS when building the tree.  This can result in
> > > > > > > > much of the compile-time-allocated rcu_node array being unused.  If
> > > > > > > > this is a major problem, you are in a specialized situation anyway,
> > > > > > > > so you can manually adjust the NR_CPUS, RCU_FANOUT, and RCU_FANOUT_LEAF
> > > > > > > > kernel config parameters.
> > > > > > > > 
> > > > > > > > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > > > > > > > ---
> > > > > > > >  kernel/rcutree.c        |    2 +-
> > > > > > > >  kernel/rcutree_plugin.h |    2 ++
> > > > > > > >  2 files changed, 3 insertions(+), 1 deletions(-)
> > > > > > > > 
> > > > > > > > diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> > > > > > > > index a151184..9098910 100644
> > > > > > > > --- a/kernel/rcutree.c
> > > > > > > > +++ b/kernel/rcutree.c
> > > > > > > > @@ -2672,7 +2672,7 @@ static void __init rcu_init_geometry(void)
> > > > > > > >  {
> > > > > > > >  	int i;
> > > > > > > >  	int j;
> > > > > > > > -	int n = NR_CPUS;
> > > > > > > > +	int n = nr_cpu_ids;
> > > > > > > 
> > > > > > > Same question as before: why have this as a variable when it never
> > > > > > > changes?
> > > > > > 
> > > > > > Ah, that explains why.  This prevented me from forgetting the random
> > > > > > NR_CPUS.
> > > > > 
> > > > > Does that mean it can go away now that you've written the patches?
> > > > 
> > > > If I don't have to change from nr_cpu_ids to yet another thing over
> > > > the next while, then it might be worth changing.
> > > 
> > > That sounds like an argument for a #define or a static const, rather
> > > than a local variable. :)
> > 
> > OK, static const it is!
> 
> Except that the compiler doesn't like the run-time initialization of
> a static const variable.
> 
> Can't have everything, I guess.  ;-)

Ah, right, nr_cpu_ids is not a static const.

In that case, just waiting until you think this definition won't churn
anymore and substituting nr_cpu_ids for n everywhere seems like the
right solution.

- Josh Triplett



Thread overview: 50+ messages
2012-06-15 21:05 [PATCH tip/core/rcu 0/15] Improvements to rcu_barrier() and RT response on big systems Paul E. McKenney
2012-06-15 21:05 ` [PATCH tip/core/rcu 01/15] rcu: Control RCU_FANOUT_LEAF from boot-time parameter Paul E. McKenney
2012-06-15 21:05   ` [PATCH tip/core/rcu 02/15] rcu: Size rcu_node tree from nr_cpu_ids rather than NR_CPUS Paul E. McKenney
2012-06-15 21:47     ` Josh Triplett
2012-06-16  0:37       ` Paul E. McKenney
2012-06-16  5:17         ` Josh Triplett
2012-06-16  6:38           ` Paul E. McKenney
2012-06-16  9:17             ` Josh Triplett
2012-06-16 14:44               ` Paul E. McKenney
2012-06-16 14:51                 ` Paul E. McKenney
2012-06-16 20:31                   ` Josh Triplett
2012-06-15 21:05   ` [PATCH tip/core/rcu 03/15] rcu: Prevent excessive line length in RCU_STATE_INITIALIZER() Paul E. McKenney
2012-06-15 21:48     ` Josh Triplett
2012-06-15 21:05   ` [PATCH tip/core/rcu 04/15] rcu: Place pointer to call_rcu() in rcu_data structure Paul E. McKenney
2012-06-15 22:08     ` Josh Triplett
2012-06-15 21:06   ` [PATCH tip/core/rcu 05/15] rcu: Move _rcu_barrier()'s rcu_head structures to rcu_data structures Paul E. McKenney
2012-06-15 22:19     ` Josh Triplett
2012-06-15 21:06   ` [PATCH tip/core/rcu 06/15] rcu: Move rcu_barrier_cpu_count to rcu_state structure Paul E. McKenney
2012-06-15 22:44     ` Josh Triplett
2012-06-15 21:06   ` [PATCH tip/core/rcu 07/15] rcu: Move rcu_barrier_completion " Paul E. McKenney
2012-06-15 22:51     ` Josh Triplett
2012-06-15 21:06   ` [PATCH tip/core/rcu 08/15] rcu: Move rcu_barrier_mutex " Paul E. McKenney
2012-06-15 22:55     ` Josh Triplett
2012-06-15 21:06   ` [PATCH tip/core/rcu 09/15] rcu: Increasing rcu_barrier() concurrency Paul E. McKenney
2012-06-15 23:31     ` Josh Triplett
2012-06-16  0:21       ` Steven Rostedt
2012-06-16  0:49         ` Paul E. McKenney
2012-06-16  0:48       ` Paul E. McKenney
2012-06-15 21:06   ` [PATCH tip/core/rcu 10/15] rcu: Add tracing for _rcu_barrier() Paul E. McKenney
2012-06-15 23:35     ` Josh Triplett
2012-06-15 21:06   ` [PATCH tip/core/rcu 11/15] rcu: Add rcu_barrier() statistics to debugfs tracing Paul E. McKenney
2012-06-15 23:38     ` Josh Triplett
2012-06-15 21:06   ` [PATCH tip/core/rcu 12/15] rcu: Remove unneeded __rcu_process_callbacks() argument Paul E. McKenney
2012-06-15 23:37     ` Josh Triplett
2012-06-15 21:06   ` [PATCH tip/core/rcu 13/15] rcu: Introduce for_each_rcu_flavor() and use it Paul E. McKenney
2012-06-15 23:52     ` Josh Triplett
2012-06-16  1:01       ` Paul E. McKenney
2012-06-16  5:35         ` Josh Triplett
2012-06-16  6:36           ` Paul E. McKenney
2012-06-15 21:06   ` [PATCH tip/core/rcu 14/15] rcu: Use for_each_rcu_flavor() in TREE_RCU tracing Paul E. McKenney
2012-06-15 23:59     ` Josh Triplett
2012-06-16  0:56       ` Paul E. McKenney
2012-06-16  5:22         ` Josh Triplett
2012-06-16  6:42           ` Paul E. McKenney
2012-06-15 21:06   ` [PATCH tip/core/rcu 15/15] rcu: RCU_SAVE_DYNTICK code no longer ever dead Paul E. McKenney
2012-06-16  0:02     ` Josh Triplett
2012-06-16  0:04       ` Josh Triplett
2012-06-16  1:04         ` Paul E. McKenney
2012-06-15 21:43   ` [PATCH tip/core/rcu 01/15] rcu: Control RCU_FANOUT_LEAF from boot-time parameter Josh Triplett
2012-06-15 22:10     ` Paul E. McKenney
