* [PATCH rcu 01/19] srcu: Tighten cleanup_srcu_struct() GP checks
  2022-02-04 23:38 [PATCH rcu 0/19] SRCU updates for v5.18 Paul E. McKenney
@ 2022-02-04 23:38 ` Paul E. McKenney
  2022-02-04 23:38 ` [PATCH rcu 02/19] srcu: Fix s/is/if/ typo in srcu_node comment Paul E. McKenney
                   ` (18 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: Paul E. McKenney @ 2022-02-04 23:38 UTC (permalink / raw)
  To: rcu; +Cc: linux-kernel, kernel-team, rostedt, Paul E. McKenney

Currently, cleanup_srcu_struct() checks for a grace period in progress,
but it does not check for a grace period that has not yet started but
which might start at any time.  Such a situation could result in a
use-after-free bug, so this commit adds a check for a grace period that
is needed but not yet started to cleanup_srcu_struct().
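
To keep this new check quiet, call_srcu() users must quiesce their
callbacks before cleanup.  A minimal sketch of the required teardown
ordering (the my_subsys structure and its fields are hypothetical, not
part of this patch):

	static void my_subsys_exit(struct my_subsys *msp)
	{
		/* Ensure that no further call_srcu() invocations can occur. */
		srcu_barrier(&msp->srcu);	  /* Wait out in-flight callbacks. */
		cleanup_srcu_struct(&msp->srcu);  /* No GP needed or in progress. */
	}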

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 kernel/rcu/srcutree.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index 6833d88871816..d30e4db04506a 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -382,9 +382,11 @@ void cleanup_srcu_struct(struct srcu_struct *ssp)
 			return; /* Forgot srcu_barrier(), so just leak it! */
 	}
 	if (WARN_ON(rcu_seq_state(READ_ONCE(ssp->srcu_gp_seq)) != SRCU_STATE_IDLE) ||
+	    WARN_ON(rcu_seq_current(&ssp->srcu_gp_seq) != ssp->srcu_gp_seq_needed) ||
 	    WARN_ON(srcu_readers_active(ssp))) {
-		pr_info("%s: Active srcu_struct %p state: %d\n",
-			__func__, ssp, rcu_seq_state(READ_ONCE(ssp->srcu_gp_seq)));
+		pr_info("%s: Active srcu_struct %p read state: %d gp state: %lu/%lu\n",
+			__func__, ssp, rcu_seq_state(READ_ONCE(ssp->srcu_gp_seq)),
+			rcu_seq_current(&ssp->srcu_gp_seq), ssp->srcu_gp_seq_needed);
 		return; /* Caller forgot to stop doing call_srcu()? */
 	}
 	free_percpu(ssp->sda);
-- 
2.31.1.189.g2e36527f23



* [PATCH rcu 02/19] srcu: Fix s/is/if/ typo in srcu_node comment
  2022-02-04 23:38 [PATCH rcu 0/19] SRCU updates for v5.18 Paul E. McKenney
  2022-02-04 23:38 ` [PATCH rcu 01/19] srcu: Tighten cleanup_srcu_struct() GP checks Paul E. McKenney
@ 2022-02-04 23:38 ` Paul E. McKenney
  2022-02-04 23:38 ` [PATCH rcu 03/19] srcu: Make srcu_funnel_gp_start() cache ->mynode in snp_leaf Paul E. McKenney
                   ` (17 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: Paul E. McKenney @ 2022-02-04 23:38 UTC (permalink / raw)
  To: rcu; +Cc: linux-kernel, kernel-team, rostedt, Paul E. McKenney

This commit fixes a typo in the srcu_node structure's ->srcu_have_cbs
comment.  While in the area, it also rewraps a couple of comments to take
advantage of the 100-character line length.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 include/linux/srcutree.h | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
index cb1f4351e8baa..4025840ba9a38 100644
--- a/include/linux/srcutree.h
+++ b/include/linux/srcutree.h
@@ -47,11 +47,9 @@ struct srcu_data {
  */
 struct srcu_node {
 	spinlock_t __private lock;
-	unsigned long srcu_have_cbs[4];		/* GP seq for children */
-						/*  having CBs, but only */
-						/*  is > ->srcu_gq_seq. */
-	unsigned long srcu_data_have_cbs[4];	/* Which srcu_data structs */
-						/*  have CBs for given GP? */
+	unsigned long srcu_have_cbs[4];		/* GP seq for children having CBs, but only */
+						/*  if greater than ->srcu_gq_seq. */
+	unsigned long srcu_data_have_cbs[4];	/* Which srcu_data structs have CBs for given GP? */
 	unsigned long srcu_gp_seq_needed_exp;	/* Furthest future exp GP. */
 	struct srcu_node *srcu_parent;		/* Next up in tree. */
 	int grplo;				/* Least CPU for node. */
-- 
2.31.1.189.g2e36527f23



* [PATCH rcu 03/19] srcu: Make srcu_funnel_gp_start() cache ->mynode in snp_leaf
  2022-02-04 23:38 [PATCH rcu 0/19] SRCU updates for v5.18 Paul E. McKenney
  2022-02-04 23:38 ` [PATCH rcu 01/19] srcu: Tighten cleanup_srcu_struct() GP checks Paul E. McKenney
  2022-02-04 23:38 ` [PATCH rcu 02/19] srcu: Fix s/is/if/ typo in srcu_node comment Paul E. McKenney
@ 2022-02-04 23:38 ` Paul E. McKenney
  2022-02-04 23:38 ` [PATCH rcu 04/19] srcu: Dynamically allocate srcu_node array Paul E. McKenney
                   ` (16 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: Paul E. McKenney @ 2022-02-04 23:38 UTC (permalink / raw)
  To: rcu; +Cc: linux-kernel, kernel-team, rostedt, Paul E. McKenney

Currently, srcu_funnel_gp_start() walks its local variable snp up the
tree, reloading sdp->mynode whenever it needs to check whether snp is
still at the leaf srcu_node level.  This works, but is a bit more obtuse
than necessary.  In addition, upcoming commits will dynamically size
srcu_struct structures, in which case sdp->mynode will no longer
necessarily be a constant, so this commit caches sdp->mynode in the new
snp_leaf local variable in order to prepare for that dynamic sizing.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 kernel/rcu/srcutree.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index d30e4db04506a..7d13e35e5d277 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -632,20 +632,21 @@ static void srcu_funnel_gp_start(struct srcu_struct *ssp, struct srcu_data *sdp,
 {
 	unsigned long flags;
 	int idx = rcu_seq_ctr(s) % ARRAY_SIZE(sdp->mynode->srcu_have_cbs);
-	struct srcu_node *snp = sdp->mynode;
+	struct srcu_node *snp;
+	struct srcu_node *snp_leaf = sdp->mynode;
 	unsigned long snp_seq;
 
 	/* Each pass through the loop does one level of the srcu_node tree. */
-	for (; snp != NULL; snp = snp->srcu_parent) {
-		if (rcu_seq_done(&ssp->srcu_gp_seq, s) && snp != sdp->mynode)
+	for (snp = snp_leaf; snp != NULL; snp = snp->srcu_parent) {
+		if (rcu_seq_done(&ssp->srcu_gp_seq, s) && snp != snp_leaf)
 			return; /* GP already done and CBs recorded. */
 		spin_lock_irqsave_rcu_node(snp, flags);
 		if (ULONG_CMP_GE(snp->srcu_have_cbs[idx], s)) {
 			snp_seq = snp->srcu_have_cbs[idx];
-			if (snp == sdp->mynode && snp_seq == s)
+			if (snp == snp_leaf && snp_seq == s)
 				snp->srcu_data_have_cbs[idx] |= sdp->grpmask;
 			spin_unlock_irqrestore_rcu_node(snp, flags);
-			if (snp == sdp->mynode && snp_seq != s) {
+			if (snp == snp_leaf && snp_seq != s) {
 				srcu_schedule_cbs_sdp(sdp, do_norm
 							   ? SRCU_INTERVAL
 							   : 0);
@@ -656,7 +657,7 @@ static void srcu_funnel_gp_start(struct srcu_struct *ssp, struct srcu_data *sdp,
 			return;
 		}
 		snp->srcu_have_cbs[idx] = s;
-		if (snp == sdp->mynode)
+		if (snp == snp_leaf)
 			snp->srcu_data_have_cbs[idx] |= sdp->grpmask;
 		if (!do_norm && ULONG_CMP_LT(snp->srcu_gp_seq_needed_exp, s))
 			WRITE_ONCE(snp->srcu_gp_seq_needed_exp, s);
-- 
2.31.1.189.g2e36527f23



* [PATCH rcu 04/19] srcu: Dynamically allocate srcu_node array
  2022-02-04 23:38 [PATCH rcu 0/19] SRCU updates for v5.18 Paul E. McKenney
                   ` (2 preceding siblings ...)
  2022-02-04 23:38 ` [PATCH rcu 03/19] srcu: Make srcu_funnel_gp_start() cache ->mynode in snp_leaf Paul E. McKenney
@ 2022-02-04 23:38 ` Paul E. McKenney
  2022-02-04 23:38 ` [PATCH rcu 05/19] srcu: Make Tree SRCU able to operate without snp_node array Paul E. McKenney
                   ` (15 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: Paul E. McKenney @ 2022-02-04 23:38 UTC (permalink / raw)
  To: rcu; +Cc: linux-kernel, kernel-team, rostedt, Paul E. McKenney

This commit shrinks the srcu_struct structure by converting its ->node
field from a fixed-size compile-time array to a pointer to a dynamically
allocated array.  In kernels built with large values of NR_CPUS that boot
on systems with smaller numbers of CPUs, this can save significant memory.
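
As a rough illustration (the numbers below are assumptions for the sake
of the example, not measurements):  With NR_CPUS=8192 and the default
combining-tree fanout, the srcu_node array runs to several hundred
elements, each on the order of a hundred bytes, so every srcu_struct in
the system carried tens of kilobytes of combining tree even when booted
on a four-CPU box.  After this change, that same four-CPU boot allocates
only the rcu_num_nodes elements (here, a single srcu_node) that the
actual geometry calls for.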

[ paulmck: Apply kernel test robot feedback. ]

Reported-by: A cast of thousands
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 include/linux/srcutree.h |  2 +-
 kernel/rcu/srcutree.c    | 58 ++++++++++++++++++++++++++++++----------
 2 files changed, 45 insertions(+), 15 deletions(-)

diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
index 4025840ba9a38..f7190058fb8ab 100644
--- a/include/linux/srcutree.h
+++ b/include/linux/srcutree.h
@@ -60,7 +60,7 @@ struct srcu_node {
  * Per-SRCU-domain structure, similar in function to rcu_state.
  */
 struct srcu_struct {
-	struct srcu_node node[NUM_RCU_NODES];	/* Combining tree. */
+	struct srcu_node *node;			/* Combining tree. */
 	struct srcu_node *level[RCU_NUM_LVLS + 1];
 						/* First node at each level. */
 	struct mutex srcu_cb_mutex;		/* Serialize CB preparation. */
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index 7d13e35e5d277..eaf00c354a631 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -24,6 +24,7 @@
 #include <linux/smp.h>
 #include <linux/delay.h>
 #include <linux/module.h>
+#include <linux/slab.h>
 #include <linux/srcu.h>
 
 #include "rcu.h"
@@ -75,12 +76,42 @@ do {									\
 	spin_unlock_irqrestore(&ACCESS_PRIVATE(p, lock), flags)	\
 
 /*
- * Initialize SRCU combining tree.  Note that statically allocated
+ * Initialize SRCU per-CPU data.  Note that statically allocated
  * srcu_struct structures might already have srcu_read_lock() and
  * srcu_read_unlock() running against them.  So if the is_static parameter
  * is set, don't initialize ->srcu_lock_count[] and ->srcu_unlock_count[].
+ */
+static void init_srcu_struct_data(struct srcu_struct *ssp)
+{
+	int cpu;
+	struct srcu_data *sdp;
+
+	/*
+	 * Initialize the per-CPU srcu_data array, which feeds into the
+	 * leaves of the srcu_node tree.
+	 */
+	WARN_ON_ONCE(ARRAY_SIZE(sdp->srcu_lock_count) !=
+		     ARRAY_SIZE(sdp->srcu_unlock_count));
+	for_each_possible_cpu(cpu) {
+		sdp = per_cpu_ptr(ssp->sda, cpu);
+		spin_lock_init(&ACCESS_PRIVATE(sdp, lock));
+		rcu_segcblist_init(&sdp->srcu_cblist);
+		sdp->srcu_cblist_invoking = false;
+		sdp->srcu_gp_seq_needed = ssp->srcu_gp_seq;
+		sdp->srcu_gp_seq_needed_exp = ssp->srcu_gp_seq;
+		sdp->mynode = NULL;
+		sdp->cpu = cpu;
+		INIT_WORK(&sdp->work, srcu_invoke_callbacks);
+		timer_setup(&sdp->delay_work, srcu_delay_timer, 0);
+		sdp->ssp = ssp;
+	}
+}
+
+/*
+ * Allocate and initialize the SRCU combining tree.  Returns @true if
+ * allocation succeeded and @false otherwise.
  */
-static void init_srcu_struct_nodes(struct srcu_struct *ssp)
+static bool init_srcu_struct_nodes(struct srcu_struct *ssp)
 {
 	int cpu;
 	int i;
@@ -92,6 +125,9 @@ static void init_srcu_struct_nodes(struct srcu_struct *ssp)
 
 	/* Initialize geometry if it has not already been initialized. */
 	rcu_init_geometry();
+	ssp->node = kcalloc(rcu_num_nodes, sizeof(*ssp->node), GFP_ATOMIC);
+	if (!ssp->node)
+		return false;
 
 	/* Work out the overall tree geometry. */
 	ssp->level[0] = &ssp->node[0];
@@ -129,29 +165,19 @@ static void init_srcu_struct_nodes(struct srcu_struct *ssp)
 	 * Initialize the per-CPU srcu_data array, which feeds into the
 	 * leaves of the srcu_node tree.
 	 */
-	WARN_ON_ONCE(ARRAY_SIZE(sdp->srcu_lock_count) !=
-		     ARRAY_SIZE(sdp->srcu_unlock_count));
 	level = rcu_num_lvls - 1;
 	snp_first = ssp->level[level];
 	for_each_possible_cpu(cpu) {
 		sdp = per_cpu_ptr(ssp->sda, cpu);
-		spin_lock_init(&ACCESS_PRIVATE(sdp, lock));
-		rcu_segcblist_init(&sdp->srcu_cblist);
-		sdp->srcu_cblist_invoking = false;
-		sdp->srcu_gp_seq_needed = ssp->srcu_gp_seq;
-		sdp->srcu_gp_seq_needed_exp = ssp->srcu_gp_seq;
 		sdp->mynode = &snp_first[cpu / levelspread[level]];
 		for (snp = sdp->mynode; snp != NULL; snp = snp->srcu_parent) {
 			if (snp->grplo < 0)
 				snp->grplo = cpu;
 			snp->grphi = cpu;
 		}
-		sdp->cpu = cpu;
-		INIT_WORK(&sdp->work, srcu_invoke_callbacks);
-		timer_setup(&sdp->delay_work, srcu_delay_timer, 0);
-		sdp->ssp = ssp;
 		sdp->grpmask = 1 << (cpu - sdp->mynode->grplo);
 	}
+	return true;
 }
 
 /*
@@ -162,6 +188,7 @@ static void init_srcu_struct_nodes(struct srcu_struct *ssp)
  */
 static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
 {
+	ssp->node = NULL;
 	mutex_init(&ssp->srcu_cb_mutex);
 	mutex_init(&ssp->srcu_gp_mutex);
 	ssp->srcu_idx = 0;
@@ -174,7 +201,8 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
 		ssp->sda = alloc_percpu(struct srcu_data);
 	if (!ssp->sda)
 		return -ENOMEM;
-	init_srcu_struct_nodes(ssp);
+	init_srcu_struct_data(ssp);
+	WARN_ON_ONCE(!init_srcu_struct_nodes(ssp));
 	ssp->srcu_gp_seq_needed_exp = 0;
 	ssp->srcu_last_gp_end = ktime_get_mono_fast_ns();
 	smp_store_release(&ssp->srcu_gp_seq_needed, 0); /* Init done. */
@@ -391,6 +419,8 @@ void cleanup_srcu_struct(struct srcu_struct *ssp)
 	}
 	free_percpu(ssp->sda);
 	ssp->sda = NULL;
+	kfree(ssp->node);
+	ssp->node = NULL;
 }
 EXPORT_SYMBOL_GPL(cleanup_srcu_struct);
 
-- 
2.31.1.189.g2e36527f23



* [PATCH rcu 05/19] srcu: Make Tree SRCU able to operate without snp_node array
  2022-02-04 23:38 [PATCH rcu 0/19] SRCU updates for v5.18 Paul E. McKenney
                   ` (3 preceding siblings ...)
  2022-02-04 23:38 ` [PATCH rcu 04/19] srcu: Dynamically allocate srcu_node array Paul E. McKenney
@ 2022-02-04 23:38 ` Paul E. McKenney
  2022-02-04 23:38 ` [PATCH rcu 06/19] srcu: Add size-state transitioning code Paul E. McKenney
                   ` (14 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: Paul E. McKenney @ 2022-02-04 23:38 UTC (permalink / raw)
  To: rcu; +Cc: linux-kernel, kernel-team, rostedt, Paul E. McKenney

This commit makes Tree SRCU able to operate without an snp_node
array, that is, when the srcu_data structures' ->mynode pointers
are NULL.  This can result in high contention on the srcu_struct
structure's ->lock, but only when there are lots of call_srcu(),
synchronize_srcu(), and synchronize_srcu_expedited() calls.

Note that when there is no snp_node array, all SRCU callbacks use
CPU 0's callback queue.  This is optimal in the common case of low
update-side load because it removes the need to search each CPU
for the single callback that made the grace period happen.
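
The resulting dispatch pattern, condensed from the
srcu_gp_start_if_needed() change in the diff below:

	if (smp_load_acquire(&ssp->srcu_size_state) < SRCU_SIZE_WAIT_CALL)
		sdp = per_cpu_ptr(ssp->sda, 0);	/* Small: all CBs on CPU 0. */
	else
		sdp = raw_cpu_ptr(ssp->sda);	/* Big: per-CPU CB queues. */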

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 include/linux/srcutree.h |  14 ++-
 kernel/rcu/srcutree.c    | 202 +++++++++++++++++++++------------------
 2 files changed, 123 insertions(+), 93 deletions(-)

diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
index f7190058fb8ab..8501b6b459411 100644
--- a/include/linux/srcutree.h
+++ b/include/linux/srcutree.h
@@ -63,8 +63,9 @@ struct srcu_struct {
 	struct srcu_node *node;			/* Combining tree. */
 	struct srcu_node *level[RCU_NUM_LVLS + 1];
 						/* First node at each level. */
+	int srcu_size_state;			/* Small-to-big transition state. */
 	struct mutex srcu_cb_mutex;		/* Serialize CB preparation. */
-	spinlock_t __private lock;		/* Protect counters */
+	spinlock_t __private lock;		/* Protect counters and size state. */
 	struct mutex srcu_gp_mutex;		/* Serialize GP work. */
 	unsigned int srcu_idx;			/* Current rdr array element. */
 	unsigned long srcu_gp_seq;		/* Grace-period seq #. */
@@ -83,6 +84,17 @@ struct srcu_struct {
 	struct lockdep_map dep_map;
 };
 
+/* Values for size state variable (->srcu_size_state). */
+#define SRCU_SIZE_SMALL		0
+#define SRCU_SIZE_ALLOC		1
+#define SRCU_SIZE_WAIT_BARRIER	2
+#define SRCU_SIZE_WAIT_CALL	3
+#define SRCU_SIZE_WAIT_CBS1	4
+#define SRCU_SIZE_WAIT_CBS2	5
+#define SRCU_SIZE_WAIT_CBS3	6
+#define SRCU_SIZE_WAIT_CBS4	7
+#define SRCU_SIZE_BIG		8
+
 /* Values for state variable (bottom bits of ->srcu_gp_seq). */
 #define SRCU_STATE_IDLE		0
 #define SRCU_STATE_SCAN1	1
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index eaf00c354a631..2bbe8a5d9ae86 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -177,14 +177,14 @@ static bool init_srcu_struct_nodes(struct srcu_struct *ssp)
 		}
 		sdp->grpmask = 1 << (cpu - sdp->mynode->grplo);
 	}
+	smp_store_release(&ssp->srcu_size_state, SRCU_SIZE_WAIT_BARRIER);
 	return true;
 }
 
 /*
  * Initialize non-compile-time initialized fields, including the
- * associated srcu_node and srcu_data structures.  The is_static
- * parameter is passed through to init_srcu_struct_nodes(), and
- * also tells us that ->sda has already been wired up to srcu_data.
+ * associated srcu_node and srcu_data structures.  The is_static parameter
+ * tells us that ->sda has already been wired up to srcu_data.
  */
 static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
 {
@@ -421,6 +421,7 @@ void cleanup_srcu_struct(struct srcu_struct *ssp)
 	ssp->sda = NULL;
 	kfree(ssp->node);
 	ssp->node = NULL;
+	ssp->srcu_size_state = SRCU_SIZE_SMALL;
 }
 EXPORT_SYMBOL_GPL(cleanup_srcu_struct);
 
@@ -469,7 +470,11 @@ static void srcu_gp_start(struct srcu_struct *ssp)
-	struct srcu_data *sdp = this_cpu_ptr(ssp->sda);
+	struct srcu_data *sdp;
 	int state;
 
+	if (smp_load_acquire(&ssp->srcu_size_state) < SRCU_SIZE_WAIT_BARRIER)
+		sdp = per_cpu_ptr(ssp->sda, 0);
+	else
+		sdp = this_cpu_ptr(ssp->sda);
 	lockdep_assert_held(&ACCESS_PRIVATE(ssp, lock));
 	WARN_ON_ONCE(ULONG_CMP_GE(ssp->srcu_gp_seq, ssp->srcu_gp_seq_needed));
 	spin_lock_rcu_node(sdp);  /* Interrupts already disabled. */
@@ -569,38 +574,40 @@ static void srcu_gp_end(struct srcu_struct *ssp)
 	/* A new grace period can start at this point.  But only one. */
 
 	/* Initiate callback invocation as needed. */
-	idx = rcu_seq_ctr(gpseq) % ARRAY_SIZE(snp->srcu_have_cbs);
-	srcu_for_each_node_breadth_first(ssp, snp) {
-		spin_lock_irq_rcu_node(snp);
-		cbs = false;
-		last_lvl = snp >= ssp->level[rcu_num_lvls - 1];
-		if (last_lvl)
-			cbs = snp->srcu_have_cbs[idx] == gpseq;
-		snp->srcu_have_cbs[idx] = gpseq;
-		rcu_seq_set_state(&snp->srcu_have_cbs[idx], 1);
-		if (ULONG_CMP_LT(snp->srcu_gp_seq_needed_exp, gpseq))
-			WRITE_ONCE(snp->srcu_gp_seq_needed_exp, gpseq);
-		mask = snp->srcu_data_have_cbs[idx];
-		snp->srcu_data_have_cbs[idx] = 0;
-		spin_unlock_irq_rcu_node(snp);
-		if (cbs)
-			srcu_schedule_cbs_snp(ssp, snp, mask, cbdelay);
-
-		/* Occasionally prevent srcu_data counter wrap. */
-		if (!(gpseq & counter_wrap_check) && last_lvl)
-			for (cpu = snp->grplo; cpu <= snp->grphi; cpu++) {
-				sdp = per_cpu_ptr(ssp->sda, cpu);
-				spin_lock_irqsave_rcu_node(sdp, flags);
-				if (ULONG_CMP_GE(gpseq,
-						 sdp->srcu_gp_seq_needed + 100))
-					sdp->srcu_gp_seq_needed = gpseq;
-				if (ULONG_CMP_GE(gpseq,
-						 sdp->srcu_gp_seq_needed_exp + 100))
-					sdp->srcu_gp_seq_needed_exp = gpseq;
-				spin_unlock_irqrestore_rcu_node(sdp, flags);
-			}
+	if (smp_load_acquire(&ssp->srcu_size_state) < SRCU_SIZE_WAIT_BARRIER) {
+		srcu_schedule_cbs_sdp(per_cpu_ptr(ssp->sda, 0), cbdelay);
+	} else {
+		idx = rcu_seq_ctr(gpseq) % ARRAY_SIZE(snp->srcu_have_cbs);
+		srcu_for_each_node_breadth_first(ssp, snp) {
+			spin_lock_irq_rcu_node(snp);
+			cbs = false;
+			last_lvl = snp >= ssp->level[rcu_num_lvls - 1];
+			if (last_lvl)
+				cbs = snp->srcu_have_cbs[idx] == gpseq;
+			snp->srcu_have_cbs[idx] = gpseq;
+			rcu_seq_set_state(&snp->srcu_have_cbs[idx], 1);
+			if (ULONG_CMP_LT(snp->srcu_gp_seq_needed_exp, gpseq))
+				WRITE_ONCE(snp->srcu_gp_seq_needed_exp, gpseq);
+			mask = snp->srcu_data_have_cbs[idx];
+			snp->srcu_data_have_cbs[idx] = 0;
+			spin_unlock_irq_rcu_node(snp);
+			if (cbs)
+				srcu_schedule_cbs_snp(ssp, snp, mask, cbdelay);
+		}
 	}
 
+	/* Occasionally prevent srcu_data counter wrap. */
+	if (!(gpseq & counter_wrap_check))
+		for_each_possible_cpu(cpu) {
+			sdp = per_cpu_ptr(ssp->sda, cpu);
+			spin_lock_irqsave_rcu_node(sdp, flags);
+			if (ULONG_CMP_GE(gpseq, sdp->srcu_gp_seq_needed + 100))
+				sdp->srcu_gp_seq_needed = gpseq;
+			if (ULONG_CMP_GE(gpseq, sdp->srcu_gp_seq_needed_exp + 100))
+				sdp->srcu_gp_seq_needed_exp = gpseq;
+			spin_unlock_irqrestore_rcu_node(sdp, flags);
+		}
+
 	/* Callback initiation done, allow grace periods after next. */
 	mutex_unlock(&ssp->srcu_cb_mutex);
 
@@ -629,18 +636,19 @@ static void srcu_funnel_exp_start(struct srcu_struct *ssp, struct srcu_node *snp
 {
 	unsigned long flags;
 
-	for (; snp != NULL; snp = snp->srcu_parent) {
-		if (rcu_seq_done(&ssp->srcu_gp_seq, s) ||
-		    ULONG_CMP_GE(READ_ONCE(snp->srcu_gp_seq_needed_exp), s))
-			return;
-		spin_lock_irqsave_rcu_node(snp, flags);
-		if (ULONG_CMP_GE(snp->srcu_gp_seq_needed_exp, s)) {
+	if (snp)
+		for (; snp != NULL; snp = snp->srcu_parent) {
+			if (rcu_seq_done(&ssp->srcu_gp_seq, s) ||
+			    ULONG_CMP_GE(READ_ONCE(snp->srcu_gp_seq_needed_exp), s))
+				return;
+			spin_lock_irqsave_rcu_node(snp, flags);
+			if (ULONG_CMP_GE(snp->srcu_gp_seq_needed_exp, s)) {
+				spin_unlock_irqrestore_rcu_node(snp, flags);
+				return;
+			}
+			WRITE_ONCE(snp->srcu_gp_seq_needed_exp, s);
 			spin_unlock_irqrestore_rcu_node(snp, flags);
-			return;
 		}
-		WRITE_ONCE(snp->srcu_gp_seq_needed_exp, s);
-		spin_unlock_irqrestore_rcu_node(snp, flags);
-	}
 	spin_lock_irqsave_rcu_node(ssp, flags);
 	if (ULONG_CMP_LT(ssp->srcu_gp_seq_needed_exp, s))
 		WRITE_ONCE(ssp->srcu_gp_seq_needed_exp, s);
@@ -663,36 +671,37 @@ static void srcu_funnel_gp_start(struct srcu_struct *ssp, struct srcu_data *sdp,
 	unsigned long flags;
 	int idx = rcu_seq_ctr(s) % ARRAY_SIZE(sdp->mynode->srcu_have_cbs);
 	struct srcu_node *snp;
-	struct srcu_node *snp_leaf = sdp->mynode;
+	struct srcu_node *snp_leaf = smp_load_acquire(&sdp->mynode);
 	unsigned long snp_seq;
 
-	/* Each pass through the loop does one level of the srcu_node tree. */
-	for (snp = snp_leaf; snp != NULL; snp = snp->srcu_parent) {
-		if (rcu_seq_done(&ssp->srcu_gp_seq, s) && snp != snp_leaf)
-			return; /* GP already done and CBs recorded. */
-		spin_lock_irqsave_rcu_node(snp, flags);
-		if (ULONG_CMP_GE(snp->srcu_have_cbs[idx], s)) {
-			snp_seq = snp->srcu_have_cbs[idx];
-			if (snp == snp_leaf && snp_seq == s)
-				snp->srcu_data_have_cbs[idx] |= sdp->grpmask;
-			spin_unlock_irqrestore_rcu_node(snp, flags);
-			if (snp == snp_leaf && snp_seq != s) {
-				srcu_schedule_cbs_sdp(sdp, do_norm
-							   ? SRCU_INTERVAL
-							   : 0);
+	if (snp_leaf)
+		/* Each pass through the loop does one level of the srcu_node tree. */
+		for (snp = snp_leaf; snp != NULL; snp = snp->srcu_parent) {
+			if (rcu_seq_done(&ssp->srcu_gp_seq, s) && snp != snp_leaf)
+				return; /* GP already done and CBs recorded. */
+			spin_lock_irqsave_rcu_node(snp, flags);
+			if (ULONG_CMP_GE(snp->srcu_have_cbs[idx], s)) {
+				snp_seq = snp->srcu_have_cbs[idx];
+				if (snp == snp_leaf && snp_seq == s)
+					snp->srcu_data_have_cbs[idx] |= sdp->grpmask;
+				spin_unlock_irqrestore_rcu_node(snp, flags);
+				if (snp == snp_leaf && snp_seq != s) {
+					srcu_schedule_cbs_sdp(sdp, do_norm
+								   ? SRCU_INTERVAL
+								   : 0);
+					return;
+				}
+				if (!do_norm)
+					srcu_funnel_exp_start(ssp, snp, s);
 				return;
 			}
-			if (!do_norm)
-				srcu_funnel_exp_start(ssp, snp, s);
-			return;
+			snp->srcu_have_cbs[idx] = s;
+			if (snp == snp_leaf)
+				snp->srcu_data_have_cbs[idx] |= sdp->grpmask;
+			if (!do_norm && ULONG_CMP_LT(snp->srcu_gp_seq_needed_exp, s))
+				WRITE_ONCE(snp->srcu_gp_seq_needed_exp, s);
+			spin_unlock_irqrestore_rcu_node(snp, flags);
 		}
-		snp->srcu_have_cbs[idx] = s;
-		if (snp == snp_leaf)
-			snp->srcu_data_have_cbs[idx] |= sdp->grpmask;
-		if (!do_norm && ULONG_CMP_LT(snp->srcu_gp_seq_needed_exp, s))
-			WRITE_ONCE(snp->srcu_gp_seq_needed_exp, s);
-		spin_unlock_irqrestore_rcu_node(snp, flags);
-	}
 
 	/* Top of tree, must ensure the grace period will be started. */
 	spin_lock_irqsave_rcu_node(ssp, flags);
@@ -850,7 +859,11 @@ static unsigned long srcu_gp_start_if_needed(struct srcu_struct *ssp,
 
 	check_init_srcu_struct(ssp);
 	idx = srcu_read_lock(ssp);
-	sdp = raw_cpu_ptr(ssp->sda);
+	if (smp_load_acquire(&ssp->srcu_size_state) < SRCU_SIZE_WAIT_CALL) {
+		sdp = per_cpu_ptr(ssp->sda, 0);
+	} else {
+		sdp = raw_cpu_ptr(ssp->sda);
+	}
 	spin_lock_irqsave_rcu_node(sdp, flags);
 	if (rhp)
 		rcu_segcblist_enqueue(&sdp->srcu_cblist, rhp);
@@ -870,7 +883,7 @@ static unsigned long srcu_gp_start_if_needed(struct srcu_struct *ssp,
 	if (needgp)
 		srcu_funnel_gp_start(ssp, sdp, s, do_norm);
 	else if (needexp)
-		srcu_funnel_exp_start(ssp, sdp->mynode, s);
+		srcu_funnel_exp_start(ssp, smp_load_acquire(&sdp->mynode), s);
 	srcu_read_unlock(ssp, idx);
 	return s;
 }
@@ -1130,6 +1143,28 @@ static void srcu_barrier_cb(struct rcu_head *rhp)
 		complete(&ssp->srcu_barrier_completion);
 }
 
+/*
+ * Enqueue an srcu_barrier() callback on the specified srcu_data
+ * structure's ->cblist.  but only if that ->cblist already has at least one
+ * callback enqueued.  Note that if a CPU already has callbacks enqueue,
+ * it must have already registered the need for a future grace period,
+ * so all we need do is enqueue a callback that will use the same grace
+ * period as the last callback already in the queue.
+ */
+static void srcu_barrier_one_cpu(struct srcu_struct *ssp, struct srcu_data *sdp)
+{
+	spin_lock_irq_rcu_node(sdp);
+	atomic_inc(&ssp->srcu_barrier_cpu_cnt);
+	sdp->srcu_barrier_head.func = srcu_barrier_cb;
+	debug_rcu_head_queue(&sdp->srcu_barrier_head);
+	if (!rcu_segcblist_entrain(&sdp->srcu_cblist,
+				   &sdp->srcu_barrier_head)) {
+		debug_rcu_head_unqueue(&sdp->srcu_barrier_head);
+		atomic_dec(&ssp->srcu_barrier_cpu_cnt);
+	}
+	spin_unlock_irq_rcu_node(sdp);
+}
+
 /**
  * srcu_barrier - Wait until all in-flight call_srcu() callbacks complete.
  * @ssp: srcu_struct on which to wait for in-flight callbacks.
@@ -1137,7 +1172,6 @@ static void srcu_barrier_cb(struct rcu_head *rhp)
 void srcu_barrier(struct srcu_struct *ssp)
 {
 	int cpu;
-	struct srcu_data *sdp;
 	unsigned long s = rcu_seq_snap(&ssp->srcu_barrier_seq);
 
 	check_init_srcu_struct(ssp);
@@ -1153,27 +1187,11 @@ void srcu_barrier(struct srcu_struct *ssp)
 	/* Initial count prevents reaching zero until all CBs are posted. */
 	atomic_set(&ssp->srcu_barrier_cpu_cnt, 1);
 
-	/*
-	 * Each pass through this loop enqueues a callback, but only
-	 * on CPUs already having callbacks enqueued.  Note that if
-	 * a CPU already has callbacks enqueue, it must have already
-	 * registered the need for a future grace period, so all we
-	 * need do is enqueue a callback that will use the same
-	 * grace period as the last callback already in the queue.
-	 */
-	for_each_possible_cpu(cpu) {
-		sdp = per_cpu_ptr(ssp->sda, cpu);
-		spin_lock_irq_rcu_node(sdp);
-		atomic_inc(&ssp->srcu_barrier_cpu_cnt);
-		sdp->srcu_barrier_head.func = srcu_barrier_cb;
-		debug_rcu_head_queue(&sdp->srcu_barrier_head);
-		if (!rcu_segcblist_entrain(&sdp->srcu_cblist,
-					   &sdp->srcu_barrier_head)) {
-			debug_rcu_head_unqueue(&sdp->srcu_barrier_head);
-			atomic_dec(&ssp->srcu_barrier_cpu_cnt);
-		}
-		spin_unlock_irq_rcu_node(sdp);
-	}
+	if (smp_load_acquire(&ssp->srcu_size_state) < SRCU_SIZE_WAIT_BARRIER)
+		srcu_barrier_one_cpu(ssp, per_cpu_ptr(ssp->sda, 0));
+	else
+		for_each_possible_cpu(cpu)
+			srcu_barrier_one_cpu(ssp, per_cpu_ptr(ssp->sda, cpu));
 
 	/* Remove the initial count, at which point reaching zero can happen. */
 	if (atomic_dec_and_test(&ssp->srcu_barrier_cpu_cnt))
-- 
2.31.1.189.g2e36527f23



* [PATCH rcu 06/19] srcu: Add size-state transitioning code
  2022-02-04 23:38 [PATCH rcu 0/19] SRCU updates for v5.18 Paul E. McKenney
                   ` (4 preceding siblings ...)
  2022-02-04 23:38 ` [PATCH rcu 05/19] srcu: Make Tree SRCU able to operate without snp_node array Paul E. McKenney
@ 2022-02-04 23:38 ` Paul E. McKenney
  2022-02-04 23:38 ` [PATCH rcu 07/19] srcu: Make rcutorture dump the SRCU size state Paul E. McKenney
                   ` (13 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: Paul E. McKenney @ 2022-02-04 23:38 UTC (permalink / raw)
  To: rcu; +Cc: linux-kernel, kernel-team, rostedt, Paul E. McKenney

This is just dead code at the moment, but it serves to prevent
spurious compiler warnings about init_srcu_struct_nodes() being unused.
This function will once again be used once the state-transition code
is activated.

Because srcu_barrier() must become aware of the transition before
call_srcu() does, the state machine waits for an SRCU grace period
before callbacks are queued to the non-CPU-0 queues.  This requires that
portions of srcu_barrier() be enclosed in an SRCU read-side critical
section.
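
Condensed from the srcu_barrier() change in the diff below, that
read-side critical section brackets the size-state check, so that a
transition's grace period cannot complete while the single-CPU-0 case
is still in flight:

	idx = srcu_read_lock(ssp);
	if (smp_load_acquire(&ssp->srcu_size_state) < SRCU_SIZE_WAIT_BARRIER)
		srcu_barrier_one_cpu(ssp, per_cpu_ptr(ssp->sda, 0));
	else
		for_each_possible_cpu(cpu)
			srcu_barrier_one_cpu(ssp, per_cpu_ptr(ssp->sda, cpu));
	srcu_read_unlock(ssp, idx);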

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 kernel/rcu/srcutree.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index 2bbe8a5d9ae86..fce041b648ee3 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -202,7 +202,6 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
 	if (!ssp->sda)
 		return -ENOMEM;
 	init_srcu_struct_data(ssp);
-	WARN_ON_ONCE(!init_srcu_struct_nodes(ssp));
 	ssp->srcu_gp_seq_needed_exp = 0;
 	ssp->srcu_last_gp_end = ktime_get_mono_fast_ns();
 	smp_store_release(&ssp->srcu_gp_seq_needed, 0); /* Init done. */
@@ -555,6 +554,7 @@ static void srcu_gp_end(struct srcu_struct *ssp)
 	unsigned long mask;
 	struct srcu_data *sdp;
 	struct srcu_node *snp;
+	int ss_state;
 
 	/* Prevent more than one additional grace period. */
 	mutex_lock(&ssp->srcu_cb_mutex);
@@ -622,6 +622,15 @@ static void srcu_gp_end(struct srcu_struct *ssp)
 	} else {
 		spin_unlock_irq_rcu_node(ssp);
 	}
+
+	/* Transition to big if needed. */
+	ss_state = smp_load_acquire(&ssp->srcu_size_state);
+	if (ss_state && ss_state != SRCU_SIZE_BIG) {
+		if (ss_state == SRCU_SIZE_ALLOC)
+			init_srcu_struct_nodes(ssp);
+		else
+			smp_store_release(&ssp->srcu_size_state, ss_state + 1);
+	}
 }
 
 /*
@@ -1172,6 +1181,7 @@ static void srcu_barrier_one_cpu(struct srcu_struct *ssp, struct srcu_data *sdp)
 void srcu_barrier(struct srcu_struct *ssp)
 {
 	int cpu;
+	int idx;
 	unsigned long s = rcu_seq_snap(&ssp->srcu_barrier_seq);
 
 	check_init_srcu_struct(ssp);
@@ -1187,11 +1197,13 @@ void srcu_barrier(struct srcu_struct *ssp)
 	/* Initial count prevents reaching zero until all CBs are posted. */
 	atomic_set(&ssp->srcu_barrier_cpu_cnt, 1);
 
+	idx = srcu_read_lock(ssp);
 	if (smp_load_acquire(&ssp->srcu_size_state) < SRCU_SIZE_WAIT_BARRIER)
 		srcu_barrier_one_cpu(ssp, per_cpu_ptr(ssp->sda, 0));
 	else
 		for_each_possible_cpu(cpu)
 			srcu_barrier_one_cpu(ssp, per_cpu_ptr(ssp->sda, cpu));
+	srcu_read_unlock(ssp, idx);
 
 	/* Remove the initial count, at which point reaching zero can happen. */
 	if (atomic_dec_and_test(&ssp->srcu_barrier_cpu_cnt))
-- 
2.31.1.189.g2e36527f23



* [PATCH rcu 07/19] srcu: Make rcutorture dump the SRCU size state
  2022-02-04 23:38 [PATCH rcu 0/19] SRCU updates for v5.18 Paul E. McKenney
                   ` (5 preceding siblings ...)
  2022-02-04 23:38 ` [PATCH rcu 06/19] srcu: Add size-state transitioning code Paul E. McKenney
@ 2022-02-04 23:38 ` Paul E. McKenney
  2022-02-04 23:38 ` [PATCH rcu 08/19] srcu: Add boot-time control over srcu_node array allocation Paul E. McKenney
                   ` (12 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: Paul E. McKenney @ 2022-02-04 23:38 UTC (permalink / raw)
  To: rcu; +Cc: linux-kernel, kernel-team, rostedt, Paul E. McKenney

This commit adds the numeric and string versions of ->srcu_size_state to
the Tree-SRCU-specific portion of the rcutorture output.
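
With this change, the stats header line reads something like the
following (grace-period number and state chosen for illustration only):

	srcud-torture: Tree SRCU g1234 state 8 (SRCU_SIZE_BIG) per-CPU(idx=0):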

[ paulmck: Apply feedback from kernel test robot and Dan Carpenter. ]

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 kernel/rcu/srcutree.c | 23 +++++++++++++++++++++--
 1 file changed, 21 insertions(+), 2 deletions(-)

diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index fce041b648ee3..550991cc213d3 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -1400,15 +1400,34 @@ void srcutorture_get_gp_data(enum rcutorture_type test_type,
 }
 EXPORT_SYMBOL_GPL(srcutorture_get_gp_data);
 
+const char * const srcu_size_state_name[] =
+{
+	"SRCU_SIZE_SMALL",
+	"SRCU_SIZE_ALLOC",
+	"SRCU_SIZE_WAIT_BARRIER",
+	"SRCU_SIZE_WAIT_CALL",
+	"SRCU_SIZE_WAIT_CBS1",
+	"SRCU_SIZE_WAIT_CBS2",
+	"SRCU_SIZE_WAIT_CBS3",
+	"SRCU_SIZE_WAIT_CBS4",
+	"SRCU_SIZE_BIG",
+	"SRCU_SIZE_???",
+};
+
 void srcu_torture_stats_print(struct srcu_struct *ssp, char *tt, char *tf)
 {
 	int cpu;
 	int idx;
 	unsigned long s0 = 0, s1 = 0;
+	int ss_state = READ_ONCE(ssp->srcu_size_state);
+	int ss_state_idx = ss_state;
 
 	idx = ssp->srcu_idx & 0x1;
-	pr_alert("%s%s Tree SRCU g%ld per-CPU(idx=%d):",
-		 tt, tf, rcu_seq_current(&ssp->srcu_gp_seq), idx);
+	if (ss_state < 0 || ss_state >= ARRAY_SIZE(srcu_size_state_name))
+		ss_state_idx = ARRAY_SIZE(srcu_size_state_name) - 1;
+	pr_alert("%s%s Tree SRCU g%ld state %d (%s) per-CPU(idx=%d):",
+		 tt, tf, rcu_seq_current(&ssp->srcu_gp_seq), ss_state,
+		 srcu_size_state_name[ss_state_idx], idx);
 	for_each_possible_cpu(cpu) {
 		unsigned long l0, l1;
 		unsigned long u0, u1;
-- 
2.31.1.189.g2e36527f23



* [PATCH rcu 08/19] srcu: Add boot-time control over srcu_node array allocation
  2022-02-04 23:38 [PATCH rcu 0/19] SRCU updates for v5.18 Paul E. McKenney
                   ` (6 preceding siblings ...)
  2022-02-04 23:38 ` [PATCH rcu 07/19] srcu: Make rcutorture dump the SRCU size state Paul E. McKenney
@ 2022-02-04 23:38 ` Paul E. McKenney
  2022-02-04 23:38 ` [PATCH rcu 09/19] srcu: Use export for srcu_struct defined by DEFINE_STATIC_SRCU() Paul E. McKenney
                   ` (11 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: Paul E. McKenney @ 2022-02-04 23:38 UTC (permalink / raw)
  To: rcu; +Cc: linux-kernel, kernel-team, rostedt, Paul E. McKenney

This commit adds an srcutree.convert_to_big kernel parameter that either
refuses to convert at all (0), converts immediately at init_srcu_struct()
time (1), or lets rcutorture convert it (2).  An additional
contention-based dynamic conversion choice will be added in a later
commit, along with documentation.
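
Because convert_to_big is declared via module_param() in
kernel/rcu/srcutree.c, it is set on the kernel command line using the
srcutree. prefix, for example:

	srcutree.convert_to_big=1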

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 kernel/rcu/srcutree.c | 27 ++++++++++++++++++++++++---
 1 file changed, 24 insertions(+), 3 deletions(-)

diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index 550991cc213d3..d1f1ff930acf5 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -39,6 +39,15 @@ module_param(exp_holdoff, ulong, 0444);
 static ulong counter_wrap_check = (ULONG_MAX >> 2);
 module_param(counter_wrap_check, ulong, 0444);
 
+/*
+ * Control conversion to SRCU_SIZE_BIG:
+ * 0: Don't convert at all (default).
+ * 1: Convert at init_srcu_struct() time.
+ * 2: Convert when rcutorture invokes srcu_torture_stats_print().
+ */
+static int convert_to_big;
+module_param(convert_to_big, int, 0444);
+
 /* Early-boot callback-management, so early that no lock is required! */
 static LIST_HEAD(srcu_boot_list);
 static bool __read_mostly srcu_init_done;
@@ -113,7 +122,7 @@ static void init_srcu_struct_data(struct srcu_struct *ssp)
  * Allocated and initialize SRCU combining tree.  Returns @true if
  * allocation succeeded and @false otherwise.
  */
-static bool init_srcu_struct_nodes(struct srcu_struct *ssp)
+static bool init_srcu_struct_nodes(struct srcu_struct *ssp, gfp_t gfp_flags)
 {
 	int cpu;
 	int i;
@@ -125,7 +134,7 @@ static bool init_srcu_struct_nodes(struct srcu_struct *ssp)
 
 	/* Initialize geometry if it has not already been initialized. */
 	rcu_init_geometry();
-	ssp->node = kcalloc(rcu_num_nodes, sizeof(*ssp->node), GFP_ATOMIC);
+	ssp->node = kcalloc(rcu_num_nodes, sizeof(*ssp->node), gfp_flags);
 	if (!ssp->node)
 		return false;
 
@@ -204,6 +213,16 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
 	init_srcu_struct_data(ssp);
 	ssp->srcu_gp_seq_needed_exp = 0;
 	ssp->srcu_last_gp_end = ktime_get_mono_fast_ns();
+	if (READ_ONCE(ssp->srcu_size_state) == SRCU_SIZE_SMALL && convert_to_big == 1) {
+		if (!init_srcu_struct_nodes(ssp, GFP_ATOMIC)) {
+			if (!is_static) {
+				free_percpu(ssp->sda);
+				ssp->sda = NULL;
+			}
+			return -ENOMEM;
+		}
+		WRITE_ONCE(ssp->srcu_size_state, SRCU_SIZE_BIG);
+	}
 	smp_store_release(&ssp->srcu_gp_seq_needed, 0); /* Init done. */
 	return 0;
 }
@@ -627,7 +646,7 @@ static void srcu_gp_end(struct srcu_struct *ssp)
 	ss_state = smp_load_acquire(&ssp->srcu_size_state);
 	if (ss_state && ss_state != SRCU_SIZE_BIG) {
 		if (ss_state == SRCU_SIZE_ALLOC)
-			init_srcu_struct_nodes(ssp);
+			init_srcu_struct_nodes(ssp, GFP_KERNEL);
 		else
 			smp_store_release(&ssp->srcu_size_state, ss_state + 1);
 	}
@@ -1456,6 +1475,8 @@ void srcu_torture_stats_print(struct srcu_struct *ssp, char *tt, char *tf)
 		s1 += c1;
 	}
 	pr_cont(" T(%ld,%ld)\n", s0, s1);
+	if (READ_ONCE(ssp->srcu_size_state) == SRCU_SIZE_SMALL && convert_to_big == 2)
+		WRITE_ONCE(ssp->srcu_size_state, SRCU_SIZE_ALLOC);
 }
 EXPORT_SYMBOL_GPL(srcu_torture_stats_print);
 
-- 
2.31.1.189.g2e36527f23



* [PATCH rcu 09/19] srcu: Use export for srcu_struct defined by DEFINE_STATIC_SRCU()
  2022-02-04 23:38 [PATCH rcu 0/19] SRCU updates for v5.18 Paul E. McKenney
                   ` (7 preceding siblings ...)
  2022-02-04 23:38 ` [PATCH rcu 08/19] srcu: Add boot-time control over srcu_node array allocation Paul E. McKenney
@ 2022-02-04 23:38 ` Paul E. McKenney
  2022-02-04 23:38 ` [PATCH rcu 10/19] srcu: Compute snp_seq earlier in srcu_funnel_gp_start() Paul E. McKenney
                   ` (10 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: Paul E. McKenney @ 2022-02-04 23:38 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, kernel-team, rostedt, Alexander Aring, Paul E. McKenney

From: Alexander Aring <aahringo@redhat.com>

If an srcu_struct structure defined by tree SRCU's DEFINE_STATIC_SRCU()
is used by a module, sparse will give the following diagnostic:

sparse: symbol '__srcu_struct_nodes_srcu' was not declared. Should it be static?

The problem is that a within-module DEFINE_STATIC_SRCU() must define
a non-static srcu_struct because it is exported by referencing it in a
special '__section("___srcu_struct_ptrs")'.  This reference is needed
so that module load and unloading can invoke init_srcu_struct() and
cleanup_srcu_struct(), respectively.  Unfortunately, sparse is unaware of
'__section("___srcu_struct_ptrs")', resulting in the above false-positive
diagnostic.  To avoid this false positive, this commit therefore creates
a prototype of the srcu_struct with an "extern" keyword.
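
The same idiom silences sparse for any deliberately non-static
definition that has no header declaration.  A stand-alone illustration
(symbol name hypothetical):

	/*
	 * Without the extern declaration, sparse warns:
	 * "symbol 'exported_flag' was not declared. Should it be static?"
	 */
	extern int exported_flag;	/* Prototype quiets the warning... */
	int exported_flag;		/* ...for this deliberate definition. */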

Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 include/linux/srcutree.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
index 8501b6b459411..44e998643f483 100644
--- a/include/linux/srcutree.h
+++ b/include/linux/srcutree.h
@@ -131,6 +131,7 @@ struct srcu_struct {
 #ifdef MODULE
 # define __DEFINE_SRCU(name, is_static)					\
 	is_static struct srcu_struct name;				\
+	extern struct srcu_struct * const __srcu_struct_##name;		\
 	struct srcu_struct * const __srcu_struct_##name			\
 		__section("___srcu_struct_ptrs") = &name
 #else
-- 
2.31.1.189.g2e36527f23



* [PATCH rcu 10/19] srcu: Compute snp_seq earlier in srcu_funnel_gp_start()
  2022-02-04 23:38 [PATCH rcu 0/19] SRCU updates for v5.18 Paul E. McKenney
                   ` (8 preceding siblings ...)
  2022-02-04 23:38 ` [PATCH rcu 09/19] srcu: Use export for srcu_struct defined by DEFINE_STATIC_SRCU() Paul E. McKenney
@ 2022-02-04 23:38 ` Paul E. McKenney
  2022-02-04 23:38 ` [PATCH rcu 11/19] srcu: Use invalid initial value for srcu_node GP sequence numbers Paul E. McKenney
                   ` (9 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: Paul E. McKenney @ 2022-02-04 23:38 UTC (permalink / raw)
  To: rcu; +Cc: linux-kernel, kernel-team, rostedt, Paul E. McKenney

Currently, srcu_funnel_gp_start() tests snp->srcu_have_cbs[idx] and then
separately assigns it to the snp_seq local variable.  This commit does
the assignment earlier to simplify the code a bit.  While in the area,
this commit also takes advantage of the 100-character line limit to put
the call to srcu_schedule_cbs_sdp() on a single line.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 kernel/rcu/srcutree.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index d1f1ff930acf5..8f55967b58a74 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -708,15 +708,13 @@ static void srcu_funnel_gp_start(struct srcu_struct *ssp, struct srcu_data *sdp,
 			if (rcu_seq_done(&ssp->srcu_gp_seq, s) && snp != snp_leaf)
 				return; /* GP already done and CBs recorded. */
 			spin_lock_irqsave_rcu_node(snp, flags);
-			if (ULONG_CMP_GE(snp->srcu_have_cbs[idx], s)) {
-				snp_seq = snp->srcu_have_cbs[idx];
+			snp_seq = snp->srcu_have_cbs[idx];
+			if (ULONG_CMP_GE(snp_seq, s)) {
 				if (snp == snp_leaf && snp_seq == s)
 					snp->srcu_data_have_cbs[idx] |= sdp->grpmask;
 				spin_unlock_irqrestore_rcu_node(snp, flags);
 				if (snp == snp_leaf && snp_seq != s) {
-					srcu_schedule_cbs_sdp(sdp, do_norm
-								   ? SRCU_INTERVAL
-								   : 0);
+					srcu_schedule_cbs_sdp(sdp, do_norm ? SRCU_INTERVAL : 0);
 					return;
 				}
 				if (!do_norm)
-- 
2.31.1.189.g2e36527f23



* [PATCH rcu 11/19] srcu: Use invalid initial value for srcu_node GP sequence numbers
  2022-02-04 23:38 [PATCH rcu 0/19] SRCU updates for v5.18 Paul E. McKenney
                   ` (9 preceding siblings ...)
  2022-02-04 23:38 ` [PATCH rcu 10/19] srcu: Compute snp_seq earlier in srcu_funnel_gp_start() Paul E. McKenney
@ 2022-02-04 23:38 ` Paul E. McKenney
  2022-02-04 23:38 ` [PATCH rcu 12/19] srcu: Avoid NULL dereference in srcu_torture_stats_print() Paul E. McKenney
                   ` (8 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: Paul E. McKenney @ 2022-02-04 23:38 UTC (permalink / raw)
  To: rcu; +Cc: linux-kernel, kernel-team, rostedt, Paul E. McKenney

Currently, tree SRCU relies on the srcu_node structures being initialized
at the same time that the srcu_struct itself is initialized, and thus
uses the initial grace-period sequence number as the initial value for
the srcu_node structure's ->srcu_have_cbs[] and ->srcu_gp_seq_needed_exp
fields.  Although this has a high probability of also working when the
srcu_node array is allocated and initialized at some random later time,
it would be better to avoid leaving such things to chance.

This commit therefore initializes these fields with 0x1, which is a
recognizable invalid value.  It then adds the required checks for this
invalid value in order to avoid confusion on long-running kernels
(especially those on 32-bit systems) that allocate and initialize
srcu_node arrays late in life.
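
For reference, the reason that 0x1 is recognizably invalid for these
fields:  rcu_seq-style sequence numbers keep the grace-period phase in
their low-order bits (two bits in current kernels) and the counter above
them, and the values stored into ->srcu_gp_seq_needed_exp during normal
operation always have zero state bits.  A value whose rcu_seq_state() is
nonzero therefore cannot be a legitimate snapshot, which is exactly what
the new rcu_seq_state(sgsne) checks detect.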

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 kernel/rcu/srcutree.c | 22 ++++++++++++++--------
 1 file changed, 14 insertions(+), 8 deletions(-)

diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index 8f55967b58a74..4201815744d85 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -150,10 +150,10 @@ static bool init_srcu_struct_nodes(struct srcu_struct *ssp, gfp_t gfp_flags)
 		WARN_ON_ONCE(ARRAY_SIZE(snp->srcu_have_cbs) !=
 			     ARRAY_SIZE(snp->srcu_data_have_cbs));
 		for (i = 0; i < ARRAY_SIZE(snp->srcu_have_cbs); i++) {
-			snp->srcu_have_cbs[i] = 0;
+			snp->srcu_have_cbs[i] = 0x1;
 			snp->srcu_data_have_cbs[i] = 0;
 		}
-		snp->srcu_gp_seq_needed_exp = 0;
+		snp->srcu_gp_seq_needed_exp = 0x1;
 		snp->grplo = -1;
 		snp->grphi = -1;
 		if (snp == &ssp->node[0]) {
@@ -397,8 +397,7 @@ static bool srcu_readers_active(struct srcu_struct *ssp)
  */
 static unsigned long srcu_get_delay(struct srcu_struct *ssp)
 {
-	if (ULONG_CMP_LT(READ_ONCE(ssp->srcu_gp_seq),
-			 READ_ONCE(ssp->srcu_gp_seq_needed_exp)))
+	if (ULONG_CMP_LT(READ_ONCE(ssp->srcu_gp_seq), READ_ONCE(ssp->srcu_gp_seq_needed_exp)))
 		return 0;
 	return SRCU_INTERVAL;
 }
@@ -572,6 +571,7 @@ static void srcu_gp_end(struct srcu_struct *ssp)
 	int idx;
 	unsigned long mask;
 	struct srcu_data *sdp;
+	unsigned long sgsne;
 	struct srcu_node *snp;
 	int ss_state;
 
@@ -605,7 +605,8 @@ static void srcu_gp_end(struct srcu_struct *ssp)
 				cbs = snp->srcu_have_cbs[idx] == gpseq;
 			snp->srcu_have_cbs[idx] = gpseq;
 			rcu_seq_set_state(&snp->srcu_have_cbs[idx], 1);
-			if (ULONG_CMP_LT(snp->srcu_gp_seq_needed_exp, gpseq))
+			sgsne = snp->srcu_gp_seq_needed_exp;
+			if (rcu_seq_state(sgsne) || ULONG_CMP_LT(sgsne, gpseq))
 				WRITE_ONCE(snp->srcu_gp_seq_needed_exp, gpseq);
 			mask = snp->srcu_data_have_cbs[idx];
 			snp->srcu_data_have_cbs[idx] = 0;
@@ -663,14 +664,17 @@ static void srcu_funnel_exp_start(struct srcu_struct *ssp, struct srcu_node *snp
 				  unsigned long s)
 {
 	unsigned long flags;
+	unsigned long sgsne;
 
 	if (snp)
 		for (; snp != NULL; snp = snp->srcu_parent) {
+			sgsne = READ_ONCE(snp->srcu_gp_seq_needed_exp);
 			if (rcu_seq_done(&ssp->srcu_gp_seq, s) ||
-			    ULONG_CMP_GE(READ_ONCE(snp->srcu_gp_seq_needed_exp), s))
+			    (!rcu_seq_state(sgsne) && ULONG_CMP_GE(sgsne, s)))
 				return;
 			spin_lock_irqsave_rcu_node(snp, flags);
-			if (ULONG_CMP_GE(snp->srcu_gp_seq_needed_exp, s)) {
+			sgsne = snp->srcu_gp_seq_needed_exp;
+			if (!rcu_seq_state(sgsne) && ULONG_CMP_GE(sgsne, s)) {
 				spin_unlock_irqrestore_rcu_node(snp, flags);
 				return;
 			}
@@ -698,6 +702,7 @@ static void srcu_funnel_gp_start(struct srcu_struct *ssp, struct srcu_data *sdp,
 {
 	unsigned long flags;
 	int idx = rcu_seq_ctr(s) % ARRAY_SIZE(sdp->mynode->srcu_have_cbs);
+	unsigned long sgsne;
 	struct srcu_node *snp;
 	struct srcu_node *snp_leaf = smp_load_acquire(&sdp->mynode);
 	unsigned long snp_seq;
@@ -724,7 +729,8 @@ static void srcu_funnel_gp_start(struct srcu_struct *ssp, struct srcu_data *sdp,
 			snp->srcu_have_cbs[idx] = s;
 			if (snp == snp_leaf)
 				snp->srcu_data_have_cbs[idx] |= sdp->grpmask;
-			if (!do_norm && ULONG_CMP_LT(snp->srcu_gp_seq_needed_exp, s))
+			sgsne = snp->srcu_gp_seq_needed_exp;
+			if (!do_norm && (rcu_seq_state(sgsne) || ULONG_CMP_LT(sgsne, s)))
 				WRITE_ONCE(snp->srcu_gp_seq_needed_exp, s);
 			spin_unlock_irqrestore_rcu_node(snp, flags);
 		}
-- 
2.31.1.189.g2e36527f23



* [PATCH rcu 12/19] srcu: Avoid NULL dereference in srcu_torture_stats_print()
  2022-02-04 23:38 [PATCH rcu 0/19] SRCU updates for v5.18 Paul E. McKenney
                   ` (10 preceding siblings ...)
  2022-02-04 23:38 ` [PATCH rcu 11/19] srcu: Use invalid initial value for srcu_node GP sequence numbers Paul E. McKenney
@ 2022-02-04 23:38 ` Paul E. McKenney
  2022-02-04 23:38 ` [PATCH rcu 13/19] srcu: Prevent cleanup_srcu_struct() from freeing non-dynamic ->sda Paul E. McKenney
                   ` (7 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: Paul E. McKenney @ 2022-02-04 23:38 UTC (permalink / raw)
  To: rcu; +Cc: linux-kernel, kernel-team, rostedt, Paul E. McKenney

You really shouldn't invoke srcu_torture_stats_print() after invoking
cleanup_srcu_struct(), but there is really no reason to get a
compiler-obfuscated per-CPU-variable NULL pointer dereference as the
diagnostic.  This commit therefore checks for NULL ->sda and makes a
more polite console-message complaint in that case.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 kernel/rcu/srcutree.c | 62 ++++++++++++++++++++++++-------------------
 1 file changed, 34 insertions(+), 28 deletions(-)

diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index 4201815744d85..767487ad5440a 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -1448,37 +1448,43 @@ void srcu_torture_stats_print(struct srcu_struct *ssp, char *tt, char *tf)
 	idx = ssp->srcu_idx & 0x1;
 	if (ss_state < 0 || ss_state >= ARRAY_SIZE(srcu_size_state_name))
 		ss_state_idx = ARRAY_SIZE(srcu_size_state_name) - 1;
-	pr_alert("%s%s Tree SRCU g%ld state %d (%s) per-CPU(idx=%d):",
+	pr_alert("%s%s Tree SRCU g%ld state %d (%s)",
 		 tt, tf, rcu_seq_current(&ssp->srcu_gp_seq), ss_state,
-		 srcu_size_state_name[ss_state_idx], idx);
-	for_each_possible_cpu(cpu) {
-		unsigned long l0, l1;
-		unsigned long u0, u1;
-		long c0, c1;
-		struct srcu_data *sdp;
-
-		sdp = per_cpu_ptr(ssp->sda, cpu);
-		u0 = data_race(sdp->srcu_unlock_count[!idx]);
-		u1 = data_race(sdp->srcu_unlock_count[idx]);
+		 srcu_size_state_name[ss_state_idx]);
+	if (!ssp->sda) {
+		// Called after cleanup_srcu_struct(), perhaps.
+		pr_cont(" No per-CPU srcu_data structures (->sda == NULL).\n");
+	} else {
+		pr_cont(" per-CPU(idx=%d):", idx);
+		for_each_possible_cpu(cpu) {
+			unsigned long l0, l1;
+			unsigned long u0, u1;
+			long c0, c1;
+			struct srcu_data *sdp;
 
-		/*
-		 * Make sure that a lock is always counted if the corresponding
-		 * unlock is counted.
-		 */
-		smp_rmb();
-
-		l0 = data_race(sdp->srcu_lock_count[!idx]);
-		l1 = data_race(sdp->srcu_lock_count[idx]);
-
-		c0 = l0 - u0;
-		c1 = l1 - u1;
-		pr_cont(" %d(%ld,%ld %c)",
-			cpu, c0, c1,
-			"C."[rcu_segcblist_empty(&sdp->srcu_cblist)]);
-		s0 += c0;
-		s1 += c1;
+			sdp = per_cpu_ptr(ssp->sda, cpu);
+			u0 = data_race(sdp->srcu_unlock_count[!idx]);
+			u1 = data_race(sdp->srcu_unlock_count[idx]);
+
+			/*
+			 * Make sure that a lock is always counted if the corresponding
+			 * unlock is counted.
+			 */
+			smp_rmb();
+
+			l0 = data_race(sdp->srcu_lock_count[!idx]);
+			l1 = data_race(sdp->srcu_lock_count[idx]);
+
+			c0 = l0 - u0;
+			c1 = l1 - u1;
+			pr_cont(" %d(%ld,%ld %c)",
+				cpu, c0, c1,
+				"C."[rcu_segcblist_empty(&sdp->srcu_cblist)]);
+			s0 += c0;
+			s1 += c1;
+		}
+		pr_cont(" T(%ld,%ld)\n", s0, s1);
 	}
-	pr_cont(" T(%ld,%ld)\n", s0, s1);
 	if (READ_ONCE(ssp->srcu_size_state) == SRCU_SIZE_SMALL && convert_to_big == 2)
 		WRITE_ONCE(ssp->srcu_size_state, SRCU_SIZE_ALLOC);
 }
-- 
2.31.1.189.g2e36527f23



* [PATCH rcu 13/19] srcu: Prevent cleanup_srcu_struct() from freeing non-dynamic ->sda
  2022-02-04 23:38 [PATCH rcu 0/19] SRCU updates for v5.18 Paul E. McKenney
                   ` (11 preceding siblings ...)
  2022-02-04 23:38 ` [PATCH rcu 12/19] srcu: Avoid NULL dereference in srcu_torture_stats_print() Paul E. McKenney
@ 2022-02-04 23:38 ` Paul E. McKenney
  2022-02-04 23:38 ` [PATCH rcu 14/19] srcu: Explain srcu_funnel_gp_start() call to list_add() is safe Paul E. McKenney
                   ` (6 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: Paul E. McKenney @ 2022-02-04 23:38 UTC (permalink / raw)
  To: rcu; +Cc: linux-kernel, kernel-team, rostedt, Paul E. McKenney

When an srcu_struct structure is created (but not in a kernel module)
by DEFINE_SRCU() and friends, the per-CPU srcu_data structure is
statically allocated.  In all other cases, that structure is obtained
from alloc_percpu(), in which case cleanup_srcu_struct() must invoke
free_percpu() on the resulting ->sda pointer in the srcu_struct structure.

Which it does.

Except that it also invokes free_percpu() on the ->sda pointer
referencing the statically allocated per-CPU srcu_data structures.
Which free_percpu() is surprisingly OK with.

This commit nevertheless stops cleanup_srcu_struct() from freeing
statically allocated per-CPU srcu_data structures.
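
For reference, a sketch of the two allocation flavors that the new
->sda_is_static flag distinguishes (names hypothetical):

	/* Static: ->sda laid out at build time by DEFINE_SRCU() and
	 * friends, so cleanup_srcu_struct() must not free_percpu() it. */
	DEFINE_SRCU(my_static_srcu);

	/* Dynamic: ->sda obtained from alloc_percpu() inside
	 * init_srcu_struct(), then freed by cleanup_srcu_struct(). */
	struct srcu_struct my_dynamic_srcu;

	init_srcu_struct(&my_dynamic_srcu);
	/* ... use the domain ... */
	cleanup_srcu_struct(&my_dynamic_srcu);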

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 include/linux/srcutree.h |  1 +
 kernel/rcu/srcutree.c    | 13 +++++++++----
 2 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
index 44e998643f483..44bd204498a11 100644
--- a/include/linux/srcutree.h
+++ b/include/linux/srcutree.h
@@ -73,6 +73,7 @@ struct srcu_struct {
 	unsigned long srcu_gp_seq_needed_exp;	/* Furthest future exp GP. */
 	unsigned long srcu_last_gp_end;		/* Last GP end timestamp (ns) */
 	struct srcu_data __percpu *sda;		/* Per-CPU srcu_data array. */
+	bool sda_is_static;			/* Is ->sda statically allocated? */
 	unsigned long srcu_barrier_seq;		/* srcu_barrier seq #. */
 	struct mutex srcu_barrier_mutex;	/* Serialize barrier ops. */
 	struct completion srcu_barrier_completion;
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index 767487ad5440a..39dc3015dfeba 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -206,8 +206,11 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
 	mutex_init(&ssp->srcu_barrier_mutex);
 	atomic_set(&ssp->srcu_barrier_cpu_cnt, 0);
 	INIT_DELAYED_WORK(&ssp->work, process_srcu);
-	if (!is_static)
+	ssp->sda_is_static = true;
+	if (!is_static) {
 		ssp->sda = alloc_percpu(struct srcu_data);
+		ssp->sda_is_static = false;
+	}
 	if (!ssp->sda)
 		return -ENOMEM;
 	init_srcu_struct_data(ssp);
@@ -215,7 +218,7 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
 	ssp->srcu_last_gp_end = ktime_get_mono_fast_ns();
 	if (READ_ONCE(ssp->srcu_size_state) == SRCU_SIZE_SMALL && convert_to_big == 1) {
 		if (!init_srcu_struct_nodes(ssp, GFP_ATOMIC)) {
-			if (!is_static) {
+			if (!ssp->sda_is_static) {
 				free_percpu(ssp->sda);
 				ssp->sda = NULL;
 			}
@@ -434,8 +437,10 @@ void cleanup_srcu_struct(struct srcu_struct *ssp)
 			rcu_seq_current(&ssp->srcu_gp_seq), ssp->srcu_gp_seq_needed);
 		return; /* Caller forgot to stop doing call_srcu()? */
 	}
-	free_percpu(ssp->sda);
-	ssp->sda = NULL;
+	if (!ssp->sda_is_static) {
+		free_percpu(ssp->sda);
+		ssp->sda = NULL;
+	}
 	kfree(ssp->node);
 	ssp->node = NULL;
 	ssp->srcu_size_state = SRCU_SIZE_SMALL;
-- 
2.31.1.189.g2e36527f23



* [PATCH rcu 14/19] srcu: Explain srcu_funnel_gp_start() call to list_add() is safe
  2022-02-04 23:38 [PATCH rcu 0/19] SRCU updates for v5.18 Paul E. McKenney
                   ` (12 preceding siblings ...)
  2022-02-04 23:38 ` [PATCH rcu 13/19] srcu: Prevent cleanup_srcu_struct() from freeing non-dynamic ->sda Paul E. McKenney
@ 2022-02-04 23:38 ` Paul E. McKenney
  2022-02-04 23:38 ` [PATCH rcu 15/19] srcu: Create concurrency-safe helper for initiating size transition Paul E. McKenney
                   ` (5 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: Paul E. McKenney @ 2022-02-04 23:38 UTC (permalink / raw)
  To: rcu; +Cc: linux-kernel, kernel-team, rostedt, Paul E. McKenney

This commit adds a comment explaining why an unprotected call to
list_add() from srcu_funnel_gp_start() can be safe.  TL;DR: It is only
called during very early boot when we don't have no steeking concurrency!

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 kernel/rcu/srcutree.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index 39dc3015dfeba..d16104405c8f9 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -757,6 +757,12 @@ static void srcu_funnel_gp_start(struct srcu_struct *ssp, struct srcu_data *sdp,
 	    rcu_seq_state(ssp->srcu_gp_seq) == SRCU_STATE_IDLE) {
 		WARN_ON_ONCE(ULONG_CMP_GE(ssp->srcu_gp_seq, ssp->srcu_gp_seq_needed));
 		srcu_gp_start(ssp);
+
+		// And how can that list_add() in the "else" clause
+		// possibly be safe for concurrent execution?  Well,
+		// it isn't.  And it does not have to be.  After all, it
+		// can only be executed during early boot when there is only
+		// the one boot CPU running with interrupts still disabled.
 		if (likely(srcu_init_done))
 			queue_delayed_work(rcu_gp_wq, &ssp->work,
 					   srcu_get_delay(ssp));
-- 
2.31.1.189.g2e36527f23



* [PATCH rcu 0/19] SRCU updates for v5.18
@ 2022-02-04 23:38 Paul E. McKenney
  2022-02-04 23:38 ` [PATCH rcu 01/19] srcu: Tighten cleanup_srcu_struct() GP checks Paul E. McKenney
                   ` (19 more replies)
  0 siblings, 20 replies; 21+ messages in thread
From: Paul E. McKenney @ 2022-02-04 23:38 UTC (permalink / raw)
  To: rcu; +Cc: linux-kernel, kernel-team, rostedt

Hello!

This series provides SRCU updates, perhaps most notably some changes
that greatly reduce the size of the srcu_struct structure in kernels
built with large values for the CONFIG_NR_CPUS Kconfig option.

1.	Tighten cleanup_srcu_struct() GP checks.

2.	Fix s/is/if/ typo in srcu_node comment.

3.	Make srcu_funnel_gp_start() cache ->mynode in snp_leaf.

4.	Dynamically allocate srcu_node array.

5.	Make Tree SRCU able to operate without snp_node array.

6.	Add size-state transitioning code.

7.	Make rcutorture dump the SRCU size state.

8.	Add boot-time control over srcu_node array allocation.

9.	Use export for srcu_struct defined by DEFINE_STATIC_SRCU(),
	courtesy of Alexander Aring.

10.	Compute snp_seq earlier in srcu_funnel_gp_start().

11.	Use invalid initial value for srcu_node GP sequence numbers.

12.	Avoid NULL dereference in srcu_torture_stats_print().

13.	Prevent cleanup_srcu_struct() from freeing non-dynamic ->sda.

14.	Explain srcu_funnel_gp_start() call to list_add() is safe.

15.	Create concurrency-safe helper for initiating size transition.

16.	Add contention-triggered addition of srcu_node tree.

17.	Make srcu_size_state_name static, courtesy of Jiapeng Chong.

18.	Automatically determine size-transition strategy at boot.

19.	Add contention check to call_srcu() srcu_data ->lock acquisition.

						Thanx, Paul

------------------------------------------------------------------------

 Documentation/admin-guide/kernel-parameters.txt   |    9 
 b/Documentation/admin-guide/kernel-parameters.txt |   22 
 b/include/linux/srcutree.h                        |    8 
 b/kernel/rcu/srcutree.c                           |    6 
 include/linux/srcutree.h                          |   20 
 kernel/rcu/srcutree.c                             |  644 +++++++++++++++-------
 6 files changed, 500 insertions(+), 209 deletions(-)

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH rcu 15/19] srcu: Create concurrency-safe helper for initiating size transition
  2022-02-04 23:38 [PATCH rcu 0/19] SRCU updates for v5.18 Paul E. McKenney
                   ` (13 preceding siblings ...)
  2022-02-04 23:38 ` [PATCH rcu 14/19] srcu: Explain srcu_funnel_gp_start() call to list_add() is safe Paul E. McKenney
@ 2022-02-04 23:38 ` Paul E. McKenney
  2022-02-04 23:38 ` [PATCH rcu 16/19] srcu: Add contention-triggered addition of srcu_node tree Paul E. McKenney
                   ` (4 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: Paul E. McKenney @ 2022-02-04 23:38 UTC (permalink / raw)
  To: rcu; +Cc: linux-kernel, kernel-team, rostedt, Paul E. McKenney

Once there are contention-initiated size transitions, it will be
possible for rcutorture to initiate a transition at the same time
as a contention-initiated transition.  This commit therefore creates
a concurrency-safe helper function named srcu_transition_to_big() to
safely initiate size transitions.
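
Because the double-checked idiom is easy to get wrong, here is a
minimal standalone sketch of it using pthreads and C11 atomics rather
than the kernel's primitives (invented names; a sketch under those
assumptions, not the kernel implementation):

#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

enum { SIZE_SMALL, SIZE_ALLOC };

static _Atomic int size_state = SIZE_SMALL;
static pthread_mutex_t state_lock = PTHREAD_MUTEX_INITIALIZER;

/* Idempotent: racing callers initiate at most one transition. */
static void transition_to_big(void)
{
	/* Cheap lockless check first... */
	if (atomic_load_explicit(&size_state, memory_order_acquire) != SIZE_SMALL)
		return;
	pthread_mutex_lock(&state_lock);
	/* ...then recheck under the lock in case we lost the race. */
	if (atomic_load_explicit(&size_state, memory_order_acquire) == SIZE_SMALL)
		atomic_store_explicit(&size_state, SIZE_ALLOC, memory_order_release);
	pthread_mutex_unlock(&state_lock);
}

static void *thread_fn(void *arg)
{
	(void)arg;
	transition_to_big();
	return NULL;
}

int main(void)
{
	pthread_t t1, t2;

	pthread_create(&t1, NULL, thread_fn, NULL);
	pthread_create(&t2, NULL, thread_fn, NULL);
	pthread_join(t1, NULL);
	pthread_join(t2, NULL);
	printf("final state: %d\n", atomic_load(&size_state));
	return 0;
}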

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 kernel/rcu/srcutree.c | 23 +++++++++++++++++++++--
 1 file changed, 21 insertions(+), 2 deletions(-)

diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index d16104405c8f9..5eee113531cdc 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -262,6 +262,25 @@ EXPORT_SYMBOL_GPL(init_srcu_struct);
 
 #endif /* #else #ifdef CONFIG_DEBUG_LOCK_ALLOC */
 
+/*
+ * Initiate an idempotent transition to SRCU_SIZE_BIG.
+ */
+static void srcu_transition_to_big(struct srcu_struct *ssp)
+{
+	unsigned long flags;
+
+	/* Double-checked locking on ->srcu_size_state. */
+	if (smp_load_acquire(&ssp->srcu_size_state) != SRCU_SIZE_SMALL)
+		return;
+	spin_lock_irqsave_rcu_node(ssp, flags);
+	if (smp_load_acquire(&ssp->srcu_size_state) != SRCU_SIZE_SMALL) {
+		spin_unlock_irqrestore_rcu_node(ssp, flags);
+		return;
+	}
+	smp_store_release(&ssp->srcu_size_state, SRCU_SIZE_ALLOC);
+	spin_unlock_irqrestore_rcu_node(ssp, flags);
+}
+
 /*
  * First-use initialization of statically allocated srcu_struct
  * structure.  Wiring up the combining tree is more than can be
@@ -1496,8 +1515,8 @@ void srcu_torture_stats_print(struct srcu_struct *ssp, char *tt, char *tf)
 		}
 		pr_cont(" T(%ld,%ld)\n", s0, s1);
 	}
-	if (READ_ONCE(ssp->srcu_size_state) == SRCU_SIZE_SMALL && convert_to_big == 2)
-		WRITE_ONCE(ssp->srcu_size_state, SRCU_SIZE_ALLOC);
+	if (convert_to_big == 2)
+		srcu_transition_to_big(ssp);
 }
 EXPORT_SYMBOL_GPL(srcu_torture_stats_print);
 
-- 
2.31.1.189.g2e36527f23


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH rcu 16/19] srcu: Add contention-triggered addition of srcu_node tree
  2022-02-04 23:38 [PATCH rcu 0/19] SRCU updates for v5.18 Paul E. McKenney
                   ` (14 preceding siblings ...)
  2022-02-04 23:38 ` [PATCH rcu 15/19] srcu: Create concurrency-safe helper for initiating size transition Paul E. McKenney
@ 2022-02-04 23:38 ` Paul E. McKenney
  2022-02-04 23:39 ` [PATCH rcu 17/19] srcu: Make srcu_size_state_name static Paul E. McKenney
                   ` (3 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: Paul E. McKenney @ 2022-02-04 23:38 UTC (permalink / raw)
  To: rcu; +Cc: linux-kernel, kernel-team, rostedt, Paul E. McKenney

This commit instruments the acquisitions of the srcu_struct structure's
->lock, enabling the initiation of a transition from SRCU_SIZE_SMALL
to SRCU_SIZE_BIG when sufficient contention is experienced.  The
instrumentation counts the number of trylock failures within the confines
of a single jiffy.  If that number exceeds the value specified by the
srcutree.small_contention_lim kernel boot parameter (which defaults to
100), and if the value specified by the srcutree.convert_to_big kernel
boot parameter has the 0x10 bit set (which it does not by default),
then a transition will be automatically initiated.

By default, there will never be any transitions, so that none of the
srcu_struct structures ever gains an srcu_node array.

The useful values for srcutree.convert_to_big are:

0x00:  Decide conversion approach at boot given system size.
0x01:  Never convert.
0x02:  Always convert at init_srcu_struct() time.
0x03:  Convert when rcutorture prints its first round of statistics.
0x11:  Convert if contention is encountered.
0x13:  Convert if contention is encountered or when rcutorture prints
        its first round of statistics, whichever comes first.

The value 0x12 acts the same as 0x02 because the conversion happens
before there is any chance of contention.
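
To make the counting scheme concrete, here is a standalone userspace
sketch, with time(NULL) standing in for jiffies and invented names
throughout (a sketch of the idea, not the kernel code):

#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>
#include <time.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static time_t contention_stamp;		/* current measurement interval */
static int n_lock_retries;		/* contention events in that interval */
static int contention_lim = 100;	/* small_contention_lim stand-in */
static bool want_big;

/* Acquire "lock"; a failed trylock counts as one contention event. */
static void lock_check_contention(void)
{
	time_t now;

	if (pthread_mutex_trylock(&lock) == 0)
		return;			/* Uncontended: no bookkeeping. */
	pthread_mutex_lock(&lock);
	now = time(NULL);
	if (contention_stamp != now) {	/* New interval: restart the count. */
		contention_stamp = now;
		n_lock_retries = 0;
	}
	if (++n_lock_retries > contention_lim)
		want_big = true;	/* Burst of failures in one interval. */
}

int main(void)
{
	lock_check_contention();	/* uncontended in this toy main() */
	pthread_mutex_unlock(&lock);
	printf("want_big: %d\n", want_big);
	return 0;
}

As an illustration (hypothetical values, not a recommendation), booting
with srcutree.convert_to_big=0x11 srcutree.small_contention_lim=50
would leave every srcu_struct in small form until some updater suffers
more than fifty failed trylock attempts on a given srcu_struct's ->lock
within a single jiffy.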

[ paulmck: Apply "static" feedback from kernel test robot. ]

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 .../admin-guide/kernel-parameters.txt         |  22 ++++
 include/linux/srcutree.h                      |   2 +
 kernel/rcu/srcutree.c                         | 107 ++++++++++++++----
 3 files changed, 107 insertions(+), 24 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index f5a27f067db9e..d7b64263fefbc 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -5525,6 +5525,20 @@
 			off:    Disable mitigation and remove
 				performance impact to RDRAND and RDSEED
 
+	srcutree.convert_to_big [KNL]
+			Specifies under what conditions an SRCU tree
+			srcu_struct structure will be converted to big
+			form, that is, with an srcu_node tree:
+
+				   1:  Never.
+				   2:  At init_srcu_struct() time.
+				   3:  When rcutorture decides to.
+				0x1X:  Above plus if high contention.
+
+			Either way, the srcu_node tree will be sized based
+			on the actual runtime number of CPUs (nr_cpu_ids)
+			instead of the compile-time CONFIG_NR_CPUS.
+
 	srcutree.counter_wrap_check [KNL]
 			Specifies how frequently to check for
 			grace-period sequence counter wrap for the
@@ -5542,6 +5556,14 @@
 			expediting.  Set to zero to disable automatic
 			expediting.
 
+	srcutree.small_contention_lim [KNL]
+			Specifies the number of update-side contention
+			events per jiffy that will be tolerated before
+			initiating a conversion of an srcu_struct
+			structure to big form.	Note that the value of
+			srcutree.convert_to_big must have the 0x10 bit
+			set for contention-based conversions to occur.
+
 	ssbd=		[ARM64,HW]
 			Speculative Store Bypass Disable control
 
diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
index 44bd204498a11..1b9ff4ed37e45 100644
--- a/include/linux/srcutree.h
+++ b/include/linux/srcutree.h
@@ -72,6 +72,8 @@ struct srcu_struct {
 	unsigned long srcu_gp_seq_needed;	/* Latest gp_seq needed. */
 	unsigned long srcu_gp_seq_needed_exp;	/* Furthest future exp GP. */
 	unsigned long srcu_last_gp_end;		/* Last GP end timestamp (ns) */
+	unsigned long srcu_size_jiffies;	/* Current contention-measurement interval. */
+	unsigned long srcu_n_lock_retries;	/* Contention events in current interval. */
 	struct srcu_data __percpu *sda;		/* Per-CPU srcu_data array. */
 	bool sda_is_static;			/* May ->sda be passed to free_percpu()? */
 	unsigned long srcu_barrier_seq;		/* srcu_barrier seq #. */
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index 5eee113531cdc..1d72c7e3c33b5 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -41,13 +41,29 @@ module_param(counter_wrap_check, ulong, 0444);
 
 /*
  * Control conversion to SRCU_SIZE_BIG:
- * 0: Don't convert at all (default).
- * 1: Convert at init_srcu_struct() time.
- * 2: Convert when rcutorture invokes srcu_torture_stats_print().
+ *    0: Decide at boot time based on system shape.
+ *    1: Don't convert at all (default).
+ *    2: Convert at init_srcu_struct() time.
+ *    3: Convert when rcutorture invokes srcu_torture_stats_print().
+ * 0x1x: Convert when excessive contention encountered.
  */
-static int convert_to_big;
+#define SRCU_SIZING_AUTO	0
+#define SRCU_SIZING_NONE	1
+#define SRCU_SIZING_INIT	2
+#define SRCU_SIZING_TORTURE	3
+#define SRCU_SIZING_CONTEND	0x10
+#define SRCU_SIZING_IS(x) ((convert_to_big & ~SRCU_SIZING_CONTEND) == x)
+#define SRCU_SIZING_IS_NONE() (SRCU_SIZING_IS(SRCU_SIZING_NONE))
+#define SRCU_SIZING_IS_INIT() (SRCU_SIZING_IS(SRCU_SIZING_INIT))
+#define SRCU_SIZING_IS_TORTURE() (SRCU_SIZING_IS(SRCU_SIZING_TORTURE))
+#define SRCU_SIZING_IS_CONTEND() (convert_to_big & SRCU_SIZING_CONTEND)
+static int convert_to_big = SRCU_SIZING_NONE;
 module_param(convert_to_big, int, 0444);
 
+/* Contention events per jiffy to initiate transition to big. */
+static int small_contention_lim __read_mostly = 100;
+module_param(small_contention_lim, int, 0444);
+
 /* Early-boot callback-management, so early that no lock is required! */
 static LIST_HEAD(srcu_boot_list);
 static bool __read_mostly srcu_init_done;
@@ -58,31 +74,40 @@ static void process_srcu(struct work_struct *work);
 static void srcu_delay_timer(struct timer_list *t);
 
 /* Wrappers for lock acquisition and release, see raw_spin_lock_rcu_node(). */
-#define spin_lock_rcu_node(p)					\
-do {									\
-	spin_lock(&ACCESS_PRIVATE(p, lock));			\
-	smp_mb__after_unlock_lock();					\
+#define spin_lock_rcu_node(p)							\
+do {										\
+	spin_lock(&ACCESS_PRIVATE(p, lock));					\
+	smp_mb__after_unlock_lock();						\
 } while (0)
 
 #define spin_unlock_rcu_node(p) spin_unlock(&ACCESS_PRIVATE(p, lock))
 
-#define spin_lock_irq_rcu_node(p)					\
-do {									\
-	spin_lock_irq(&ACCESS_PRIVATE(p, lock));			\
-	smp_mb__after_unlock_lock();					\
+#define spin_lock_irq_rcu_node(p)						\
+do {										\
+	spin_lock_irq(&ACCESS_PRIVATE(p, lock));				\
+	smp_mb__after_unlock_lock();						\
 } while (0)
 
-#define spin_unlock_irq_rcu_node(p)					\
+#define spin_unlock_irq_rcu_node(p)						\
 	spin_unlock_irq(&ACCESS_PRIVATE(p, lock))
 
-#define spin_lock_irqsave_rcu_node(p, flags)			\
-do {									\
-	spin_lock_irqsave(&ACCESS_PRIVATE(p, lock), flags);	\
-	smp_mb__after_unlock_lock();					\
+#define spin_lock_irqsave_rcu_node(p, flags)					\
+do {										\
+	spin_lock_irqsave(&ACCESS_PRIVATE(p, lock), flags);			\
+	smp_mb__after_unlock_lock();						\
 } while (0)
 
-#define spin_unlock_irqrestore_rcu_node(p, flags)			\
-	spin_unlock_irqrestore(&ACCESS_PRIVATE(p, lock), flags)	\
+#define spin_trylock_irqsave_rcu_node(p, flags)					\
+({										\
+	bool ___locked = spin_trylock_irqsave(&ACCESS_PRIVATE(p, lock), flags);	\
+										\
+	if (___locked)								\
+		smp_mb__after_unlock_lock();					\
+	___locked;								\
+})
+
+#define spin_unlock_irqrestore_rcu_node(p, flags)				\
+	spin_unlock_irqrestore(&ACCESS_PRIVATE(p, lock), flags)
 
 /*
  * Initialize SRCU per-CPU data.  Note that statically allocated
@@ -216,7 +241,7 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
 	init_srcu_struct_data(ssp);
 	ssp->srcu_gp_seq_needed_exp = 0;
 	ssp->srcu_last_gp_end = ktime_get_mono_fast_ns();
-	if (READ_ONCE(ssp->srcu_size_state) == SRCU_SIZE_SMALL && convert_to_big == 1) {
+	if (READ_ONCE(ssp->srcu_size_state) == SRCU_SIZE_SMALL && SRCU_SIZING_IS_INIT()) {
 		if (!init_srcu_struct_nodes(ssp, GFP_ATOMIC)) {
 			if (!ssp->sda_is_static) {
 				free_percpu(ssp->sda);
@@ -262,6 +287,15 @@ EXPORT_SYMBOL_GPL(init_srcu_struct);
 
 #endif /* #else #ifdef CONFIG_DEBUG_LOCK_ALLOC */
 
+/*
+ * Initiate a transition to SRCU_SIZE_BIG with lock held.
+ */
+static void __srcu_transition_to_big(struct srcu_struct *ssp)
+{
+	lockdep_assert_held(&ACCESS_PRIVATE(ssp, lock));
+	smp_store_release(&ssp->srcu_size_state, SRCU_SIZE_ALLOC);
+}
+
 /*
  * Initiate an idempotent transition to SRCU_SIZE_BIG.
  */
@@ -277,10 +311,35 @@ static void srcu_transition_to_big(struct srcu_struct *ssp)
 		spin_unlock_irqrestore_rcu_node(ssp, flags);
 		return;
 	}
-	smp_store_release(&ssp->srcu_size_state, SRCU_SIZE_ALLOC);
+	__srcu_transition_to_big(ssp);
 	spin_unlock_irqrestore_rcu_node(ssp, flags);
 }
 
+/*
+ * Acquire the specified srcu_struct structure's ->lock, but check for
+ * excessive contention, which results in initiation of a transition
+ * to SRCU_SIZE_BIG.  But only if the srcutree.convert_to_big module
+ * parameter permits this.
+ */
+static void spin_lock_irqsave_ssp_contention(struct srcu_struct *ssp, unsigned long *flags)
+{
+	unsigned long j;
+
+	if (spin_trylock_irqsave_rcu_node(ssp, *flags))
+		return;
+	spin_lock_irqsave_rcu_node(ssp, *flags);
+	if (!SRCU_SIZING_IS_CONTEND() || ssp->srcu_size_state)
+		return;
+	j = jiffies;
+	if (ssp->srcu_size_jiffies != j) {
+		ssp->srcu_size_jiffies = j;
+		ssp->srcu_n_lock_retries = 0;
+	}
+	if (++ssp->srcu_n_lock_retries <= small_contention_lim)
+		return;
+	__srcu_transition_to_big(ssp);
+}
+
 /*
  * First-use initialization of statically allocated srcu_struct
  * structure.  Wiring up the combining tree is more than can be
@@ -705,7 +764,7 @@ static void srcu_funnel_exp_start(struct srcu_struct *ssp, struct srcu_node *snp
 			WRITE_ONCE(snp->srcu_gp_seq_needed_exp, s);
 			spin_unlock_irqrestore_rcu_node(snp, flags);
 		}
-	spin_lock_irqsave_rcu_node(ssp, flags);
+	spin_lock_irqsave_ssp_contention(ssp, &flags);
 	if (ULONG_CMP_LT(ssp->srcu_gp_seq_needed_exp, s))
 		WRITE_ONCE(ssp->srcu_gp_seq_needed_exp, s);
 	spin_unlock_irqrestore_rcu_node(ssp, flags);
@@ -760,7 +819,7 @@ static void srcu_funnel_gp_start(struct srcu_struct *ssp, struct srcu_data *sdp,
 		}
 
 	/* Top of tree, must ensure the grace period will be started. */
-	spin_lock_irqsave_rcu_node(ssp, flags);
+	spin_lock_irqsave_ssp_contention(ssp, &flags);
 	if (ULONG_CMP_LT(ssp->srcu_gp_seq_needed, s)) {
 		/*
 		 * Record need for grace period s.  Pair with load
@@ -1515,7 +1574,7 @@ void srcu_torture_stats_print(struct srcu_struct *ssp, char *tt, char *tf)
 		}
 		pr_cont(" T(%ld,%ld)\n", s0, s1);
 	}
-	if (convert_to_big == 2)
+	if (SRCU_SIZING_IS_TORTURE())
 		srcu_transition_to_big(ssp);
 }
 EXPORT_SYMBOL_GPL(srcu_torture_stats_print);
-- 
2.31.1.189.g2e36527f23


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH rcu 17/19] srcu: Make srcu_size_state_name static
  2022-02-04 23:38 [PATCH rcu 0/19] SRCU updates for v5.18 Paul E. McKenney
                   ` (15 preceding siblings ...)
  2022-02-04 23:38 ` [PATCH rcu 16/19] srcu: Add contention-triggered addition of srcu_node tree Paul E. McKenney
@ 2022-02-04 23:39 ` Paul E. McKenney
  2022-02-04 23:39 ` [PATCH rcu 18/19] srcu: Automatically determine size-transition strategy at boot Paul E. McKenney
                   ` (2 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: Paul E. McKenney @ 2022-02-04 23:39 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, kernel-team, rostedt, Jiapeng Chong, Abaci Robot,
	Paul E . McKenney

From: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>

This symbol is not used outside of srcutree.c, so this commit marks it static.

Doing so fixes the following sparse warning:

kernel/rcu/srcutree.c:1426:12: warning: symbol 'srcu_size_state_name'
was not declared. Should it be static?

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 kernel/rcu/srcutree.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index 1d72c7e3c33b5..f13e84ecbca6a 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -1512,7 +1512,7 @@ void srcutorture_get_gp_data(enum rcutorture_type test_type,
 }
 EXPORT_SYMBOL_GPL(srcutorture_get_gp_data);
 
-const char * const srcu_size_state_name[] =
+static const char * const srcu_size_state_name[] =
 {
 	"SRCU_SIZE_SMALL",
 	"SRCU_SIZE_ALLOC",
-- 
2.31.1.189.g2e36527f23


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH rcu 18/19] srcu: Automatically determine size-transition strategy at boot
  2022-02-04 23:38 [PATCH rcu 0/19] SRCU updates for v5.18 Paul E. McKenney
                   ` (16 preceding siblings ...)
  2022-02-04 23:39 ` [PATCH rcu 17/19] srcu: Make srcu_size_state_name static Paul E. McKenney
@ 2022-02-04 23:39 ` Paul E. McKenney
  2022-02-04 23:39 ` [PATCH rcu 19/19] srcu: Add contention check to call_srcu() srcu_data ->lock acquisition Paul E. McKenney
       [not found] ` <20220205030303.2408-1-hdanton@sina.com>
  19 siblings, 0 replies; 21+ messages in thread
From: Paul E. McKenney @ 2022-02-04 23:39 UTC (permalink / raw)
  To: rcu; +Cc: linux-kernel, kernel-team, rostedt, Paul E. McKenney

This commit adds a srcutree.convert_to_big option of zero that causes
SRCU to decide at boot whether to wait for contention (small systems)
or to expand immediately to big form (large systems).  A new
srcutree.big_cpu_lim parameter (defaulting to 128) defines how many
CPUs constitute a large system.
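
For example, on a hypothetical 96-CPU system booted with the defaults,
nr_cpu_ids (96) falls below big_cpu_lim (128), so srcu_init() selects
contention-based sizing.  Booting that same system with:

	srcutree.big_cpu_lim=64

would instead cause every srcu_struct to be marked for srcu_node
allocation at init_srcu_struct() time.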

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 .../admin-guide/kernel-parameters.txt         |  9 +++++++++
 kernel/rcu/srcutree.c                         | 19 ++++++++++++++++++-
 2 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index d7b64263fefbc..ad2388a431944 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -5525,11 +5525,20 @@
 			off:    Disable mitigation and remove
 				performance impact to RDRAND and RDSEED
 
+	srcutree.big_cpu_lim [KNL]
+			Specifies the number of CPUs constituting a
+			large system, such that srcu_struct structures
+			should immediately allocate an srcu_node array.
+			This kernel-boot parameter defaults to 128, but
+			takes effect only when srcutree.convert_to_big
+			is equal to zero.
+
 	srcutree.convert_to_big [KNL]
 			Specifies under what conditions an SRCU tree
 			srcu_struct structure will be converted to big
 			form, that is, with an srcu_node tree:
 
+				   0:  Decide at boot time (default).
 				   1:  Never.
 				   2:  At init_srcu_struct() time.
 				   3:  When rcutorture decides to.
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index f13e84ecbca6a..c3968e091d019 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -57,9 +57,13 @@ module_param(counter_wrap_check, ulong, 0444);
 #define SRCU_SIZING_IS_INIT() (SRCU_SIZING_IS(SRCU_SIZING_INIT))
 #define SRCU_SIZING_IS_TORTURE() (SRCU_SIZING_IS(SRCU_SIZING_TORTURE))
 #define SRCU_SIZING_IS_CONTEND() (convert_to_big & SRCU_SIZING_CONTEND)
-static int convert_to_big = SRCU_SIZING_NONE;
+static int convert_to_big = SRCU_SIZING_AUTO;
 module_param(convert_to_big, int, 0444);
 
+/* Number of CPUs to trigger init_srcu_struct()-time transition to big. */
+static int big_cpu_lim __read_mostly = 128;
+module_param(big_cpu_lim, int, 0444);
+
 /* Contention events per jiffy to initiate transition to big. */
 static int small_contention_lim __read_mostly = 100;
 module_param(small_contention_lim, int, 0444);
@@ -1592,6 +1596,17 @@ void __init srcu_init(void)
 {
 	struct srcu_struct *ssp;
 
+	/* Decide on srcu_struct-size strategy. */
+	if (SRCU_SIZING_IS(SRCU_SIZING_AUTO)) {
+		if (nr_cpu_ids >= big_cpu_lim) {
+			convert_to_big = SRCU_SIZING_INIT; // Don't bother waiting for contention.
+			pr_info("%s: Setting srcu_struct sizes to big.\n", __func__);
+		} else {
+			convert_to_big = SRCU_SIZING_NONE | SRCU_SIZING_CONTEND;
+			pr_info("%s: Setting srcu_struct sizes based on contention.\n", __func__);
+		}
+	}
+
 	/*
 	 * Once that is set, call_srcu() can follow the normal path and
 	 * queue delayed work. This must follow RCU workqueues creation
@@ -1602,6 +1617,8 @@ void __init srcu_init(void)
 		ssp = list_first_entry(&srcu_boot_list, struct srcu_struct,
 				      work.work.entry);
 		list_del_init(&ssp->work.work.entry);
+		if (SRCU_SIZING_IS(SRCU_SIZING_INIT) && !ssp->srcu_size_state)
+			ssp->srcu_size_state = SRCU_SIZE_ALLOC;
 		queue_work(rcu_gp_wq, &ssp->work.work);
 	}
 }
-- 
2.31.1.189.g2e36527f23


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH rcu 19/19] srcu: Add contention check to call_srcu() srcu_data ->lock acquisition
  2022-02-04 23:38 [PATCH rcu 0/19] SRCU updates for v5.18 Paul E. McKenney
                   ` (17 preceding siblings ...)
  2022-02-04 23:39 ` [PATCH rcu 18/19] srcu: Automatically determine size-transition strategy at boot Paul E. McKenney
@ 2022-02-04 23:39 ` Paul E. McKenney
       [not found] ` <20220205030303.2408-1-hdanton@sina.com>
  19 siblings, 0 replies; 21+ messages in thread
From: Paul E. McKenney @ 2022-02-04 23:39 UTC (permalink / raw)
  To: rcu; +Cc: linux-kernel, kernel-team, rostedt, Paul E. McKenney

This commit increases the sensitivity of contention detection by adding
checks to the acquisition of the srcu_data structure's lock on the
call_srcu() code path.
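
The shape of the new path is perhaps easiest to see in a standalone
userspace sketch (pthreads and invented types; a sketch of the
pattern, not the kernel code):

#include <pthread.h>

struct parent {
	pthread_mutex_t lock;
	int n_contention_events;
};

struct percpu_data {
	pthread_mutex_t lock;
	struct parent *parent;
};

static void lock_pcpu_check_contention(struct percpu_data *pcp)
{
	if (pthread_mutex_trylock(&pcp->lock) == 0)
		return;			/* Uncontended fast path. */
	/* Charge the contention event to the parent structure... */
	pthread_mutex_lock(&pcp->parent->lock);
	pcp->parent->n_contention_events++;
	pthread_mutex_unlock(&pcp->parent->lock);
	/* ...then take the per-CPU lock the slow way. */
	pthread_mutex_lock(&pcp->lock);
}

int main(void)
{
	struct parent p = { PTHREAD_MUTEX_INITIALIZER, 0 };
	struct percpu_data d = { PTHREAD_MUTEX_INITIALIZER, &p };

	lock_pcpu_check_contention(&d);	/* uncontended here */
	pthread_mutex_unlock(&d.lock);
	return 0;
}

Note that, as in the patch, the parent's lock is dropped before the
per-CPU lock is acquired, so the two locks are never held at the same
time on this path.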

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 kernel/rcu/srcutree.c | 45 ++++++++++++++++++++++++++++++++++---------
 1 file changed, 36 insertions(+), 9 deletions(-)

diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index c3968e091d019..31a7a9e2445da 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -320,18 +320,13 @@ static void srcu_transition_to_big(struct srcu_struct *ssp)
 }
 
 /*
- * Acquire the specified srcu_struct structure's ->lock, but check for
- * excessive contention, which results in initiation of a transition
- * to SRCU_SIZE_BIG.  But only if the srcutree.convert_to_big module
- * parameter permits this.
+ * Check to see if the just-encountered contention event justifies
+ * a transition to SRCU_SIZE_BIG.
  */
-static void spin_lock_irqsave_ssp_contention(struct srcu_struct *ssp, unsigned long *flags)
+static void spin_lock_irqsave_check_contention(struct srcu_struct *ssp)
 {
 	unsigned long j;
 
-	if (spin_trylock_irqsave_rcu_node(ssp, *flags))
-		return;
-	spin_lock_irqsave_rcu_node(ssp, *flags);
 	if (!SRCU_SIZING_IS_CONTEND() || ssp->srcu_size_state)
 		return;
 	j = jiffies;
@@ -344,6 +339,38 @@ static void spin_lock_irqsave_ssp_contention(struct srcu_struct *ssp, unsigned l
 	__srcu_transition_to_big(ssp);
 }
 
+/*
+ * Acquire the specified srcu_data structure's ->lock, but check for
+ * excessive contention, which results in initiation of a transition
+ * to SRCU_SIZE_BIG.  But only if the srcutree.convert_to_big module
+ * parameter permits this.
+ */
+static void spin_lock_irqsave_sdp_contention(struct srcu_data *sdp, unsigned long *flags)
+{
+	struct srcu_struct *ssp = sdp->ssp;
+
+	if (spin_trylock_irqsave_rcu_node(sdp, *flags))
+		return;
+	spin_lock_irqsave_rcu_node(ssp, *flags);
+	spin_lock_irqsave_check_contention(ssp);
+	spin_unlock_irqrestore_rcu_node(ssp, *flags);
+	spin_lock_irqsave_rcu_node(sdp, *flags);
+}
+
+/*
+ * Acquire the specified srcu_struct structure's ->lock, but check for
+ * excessive contention, which results in initiation of a transition
+ * to SRCU_SIZE_BIG.  But only if the srcutree.convert_to_big module
+ * parameter permits this.
+ */
+static void spin_lock_irqsave_ssp_contention(struct srcu_struct *ssp, unsigned long *flags)
+{
+	if (spin_trylock_irqsave_rcu_node(ssp, *flags))
+		return;
+	spin_lock_irqsave_rcu_node(ssp, *flags);
+	spin_lock_irqsave_check_contention(ssp);
+}
+
 /*
  * First-use initialization of statically allocated srcu_struct
  * structure.  Wiring up the combining tree is more than can be
@@ -989,7 +1016,7 @@ static unsigned long srcu_gp_start_if_needed(struct srcu_struct *ssp,
 	} else {
 		sdp = raw_cpu_ptr(ssp->sda);
 	}
-	spin_lock_irqsave_rcu_node(sdp, flags);
+	spin_lock_irqsave_sdp_contention(sdp, &flags);
 	if (rhp)
 		rcu_segcblist_enqueue(&sdp->srcu_cblist, rhp);
 	rcu_segcblist_advance(&sdp->srcu_cblist,
-- 
2.31.1.189.g2e36527f23


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH rcu 19/19] srcu: Add contention check to call_srcu() srcu_data ->lock acquisition
       [not found] ` <20220205030303.2408-1-hdanton@sina.com>
@ 2022-02-05  5:20   ` Paul E. McKenney
  0 siblings, 0 replies; 21+ messages in thread
From: Paul E. McKenney @ 2022-02-05  5:20 UTC (permalink / raw)
  To: Hillf Danton; +Cc: linux-kernel, rostedt

On Sat, Feb 05, 2022 at 11:03:03AM +0800, Hillf Danton wrote:
> On Fri,  4 Feb 2022 15:39:02 -0800
> > This commit increases the sensitivity of contention detection by adding
> > checks to the acquisition of the srcu_data structure's lock on the
> > call_srcu() code path.
> > 
> > Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> > ---
> >  kernel/rcu/srcutree.c | 45 ++++++++++++++++++++++++++++++++++---------
> >  1 file changed, 36 insertions(+), 9 deletions(-)
> > 
> > diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
> > index c3968e091d019..31a7a9e2445da 100644
> > --- a/kernel/rcu/srcutree.c
> > +++ b/kernel/rcu/srcutree.c
> > @@ -320,18 +320,13 @@ static void srcu_transition_to_big(struct srcu_struct *ssp)
> >  }
> >  
> >  /*
> > - * Acquire the specified srcu_struct structure's ->lock, but check for
> > - * excessive contention, which results in initiation of a transition
> > - * to SRCU_SIZE_BIG.  But only if the srcutree.convert_to_big module
> > - * parameter permits this.
> > + * Check to see if the just-encountered contention event justifies
> > + * a transition to SRCU_SIZE_BIG.
> >   */
> > -static void spin_lock_irqsave_ssp_contention(struct srcu_struct *ssp, unsigned long *flags)
> > +static void spin_lock_irqsave_check_contention(struct srcu_struct *ssp)
> >  {
> >  	unsigned long j;
> >  
> > -	if (spin_trylock_irqsave_rcu_node(ssp, *flags))
> > -		return;
> > -	spin_lock_irqsave_rcu_node(ssp, *flags);
> 
> Given neither trylock nor irqsave any longer, a new function name is
> preferred.

This function contains the contention-checking common code for the
spin_lock_irqsave_ssp_contention() and spin_lock_irqsave_sdp_contention()
functions, so the current name is fine.

							Thanx, Paul

> Hillf
> 
> >  	if (!SRCU_SIZING_IS_CONTEND() || ssp->srcu_size_state)
> >  		return;
> >  	j = jiffies;
> > @@ -344,6 +339,38 @@ static void spin_lock_irqsave_ssp_contention(struct srcu_struct *ssp, unsigned l
> >  	__srcu_transition_to_big(ssp);
> >  }
> >  
> > +/*
> > + * Acquire the specified srcu_data structure's ->lock, but check for
> > + * excessive contention, which results in initiation of a transition
> > + * to SRCU_SIZE_BIG.  But only if the srcutree.convert_to_big module
> > + * parameter permits this.
> > + */
> > +static void spin_lock_irqsave_sdp_contention(struct srcu_data *sdp, unsigned long *flags)
> > +{
> > +	struct srcu_struct *ssp = sdp->ssp;
> > +
> > +	if (spin_trylock_irqsave_rcu_node(sdp, *flags))
> > +		return;
> > +	spin_lock_irqsave_rcu_node(ssp, *flags);
> > +	spin_lock_irqsave_check_contention(ssp);
> > +	spin_unlock_irqrestore_rcu_node(ssp, *flags);
> > +	spin_lock_irqsave_rcu_node(sdp, *flags);
> > +}
> > +
> > +/*
> > + * Acquire the specified srcu_struct structure's ->lock, but check for
> > + * excessive contention, which results in initiation of a transition
> > + * to SRCU_SIZE_BIG.  But only if the srcutree.convert_to_big module
> > + * parameter permits this.
> > + */
> > +static void spin_lock_irqsave_ssp_contention(struct srcu_struct *ssp, unsigned long *flags)
> > +{
> > +	if (spin_trylock_irqsave_rcu_node(ssp, *flags))
> > +		return;
> > +	spin_lock_irqsave_rcu_node(ssp, *flags);
> > +	spin_lock_irqsave_check_contention(ssp);
> > +}
> > +
> >  /*
> >   * First-use initialization of statically allocated srcu_struct
> >   * structure.  Wiring up the combining tree is more than can be
> > @@ -989,7 +1016,7 @@ static unsigned long srcu_gp_start_if_needed(struct srcu_struct *ssp,
> >  	} else {
> >  		sdp = raw_cpu_ptr(ssp->sda);
> >  	}
> > -	spin_lock_irqsave_rcu_node(sdp, flags);
> > +	spin_lock_irqsave_sdp_contention(sdp, &flags);
> >  	if (rhp)
> >  		rcu_segcblist_enqueue(&sdp->srcu_cblist, rhp);
> >  	rcu_segcblist_advance(&sdp->srcu_cblist,
> > -- 
> > 2.31.1.189.g2e36527f23

^ permalink raw reply	[flat|nested] 21+ messages in thread
